Do we need app cache busting after we have a CDN cache invalidation solution?
A browser cache busting solution and its under-the-hood mechanism
Who should read this post: Have you ever released a new version of your React app (or any SPA), only to find that your customers keep using the older version because it is cached in their browser? If so, you should read this post along with the last one.
Cache invalidation is a hard problem
There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton
We touched upon one of these hard things at the CDN layer in our last post, but the job is not done yet. In this post, we are going to talk about how to invalidate the browser cache. Both this post and the last one are meant as generic discussions of client application caching, though we use React applications as the specific context.
Internet speed and why caching is a must-have technique for building commercial applications
Internet traffic latency is a real problem, and making your users wait an extra second could cost you millions of dollars.
If an e-commerce site is making $100,000 per day, a 1 second page delay could potentially cost you $2.5 million in lost sales every year.
Someone has drawn an analogy between everyday time scales and the latencies of a computing system's caching layers, and I present it here to illustrate why caching is a must for any commercial application (pay attention to the red circles).
In short, for the perceived user experience, the front-end loading time of a cached app and an uncached app can differ by a factor of thousands.
HTTP cache policy and browser behavior for Single Page Application (SPA)
Because of the huge gap between accessing locally cached content and accessing remote content, the designers of the HTTP protocol foresaw the long-term trend and gave us a caching mechanism to tune: the standard Cache-Control header field.
Cache-Control: no-store
— no cache at all
Cache-Control: max-age=<X seconds>
— cache for X seconds
Essentially, you can either choose not to cache the SPA at all, which guarantees that the browser always gets the latest version from your server, or you can choose to cache it for a certain amount of time, so that the browser does not need to go out to the Internet at all to render the SPA. Certainly we want to cache the content as long as possible, because we don’t want our customers to think of our app as “sluggish” or “slow”.
But if we have already released a new version of our app and the users are still using the cached (old) version, we may run into a lot of problems as well.
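To make the two policies concrete, here is a minimal sketch of a helper that builds the Cache-Control value for either choice. The function name `cacheControlValue` and the one-day duration are assumptions made for illustration; they do not come from any particular server or framework.

```typescript
// Builds a Cache-Control header value for the two policies discussed above.
// A maxAgeSeconds of null means "do not cache at all".
// The function name is hypothetical, chosen only for this sketch.
function cacheControlValue(maxAgeSeconds: number | null): string {
  if (maxAgeSeconds === null) {
    return "no-store";
  }
  if (!Number.isInteger(maxAgeSeconds) || maxAgeSeconds < 0) {
    throw new RangeError("max-age must be a non-negative integer number of seconds");
  }
  return `max-age=${maxAgeSeconds}`;
}

// Example values: never cache, or cache for one day (an arbitrary duration).
const ONE_DAY_SECONDS = 24 * 60 * 60;
const neverCache = cacheControlValue(null);
const cacheForADay = cacheControlValue(ONE_DAY_SECONDS);
```

Whichever server or CDN serves the app would then send one of these strings as the value of the Cache-Control response header.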
So, how can we achieve the following behavior?
Use browser cache as much as we can, and invalidate the browser cache immediately once the new release is rolled out to the server.
Please read the next section…
React app distributions
What does a React build distribution look like? The details can be found in React’s webpack build setup. But long story short, the application contains an HTML file, which links to a bundle.js. The bundle.js in turn renders the client side, which depends on static assets such as CSS and PNG files (see the conceptual dependency diagram below).
The following distribution (a release) reflects the dependency diagram above.
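As a rough text stand-in for that dependency chain, the sketch below writes it down as a small graph and walks it from the HTML file. The file names `main.css` and `logo.png` are hypothetical placeholders, not the actual contents of the release.

```typescript
// Which files each file pulls in. File names are hypothetical placeholders.
const dependencies: Record<string, string[]> = {
  "index.html": ["bundle.js"],
  "bundle.js": ["main.css", "logo.png"],
  "main.css": [],
  "logo.png": [],
};

// Walks the graph from an entry file and returns every file the browser
// ends up loading (the entry itself included).
function filesNeededBy(entry: string, graph: Record<string, string[]>): string[] {
  const seen = new Set<string>();
  const stack: string[] = [entry];
  while (stack.length > 0) {
    const file = stack.pop() as string;
    if (seen.has(file)) {
      continue;
    }
    seen.add(file);
    for (const dep of graph[file] || []) {
      stack.push(dep);
    }
  }
  return Array.from(seen);
}
```

Calling `filesNeededBy("index.html", dependencies)` lists every file that can end up in the browser cache, which is why a stale cached HTML file or bundle keeps the whole old release alive.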
Cache busting for React App
In our previous post about continuous deployment of a React app with GitHub Actions, we referred to a simple React app repo. Please read it if you’d like to become a solo front-end hero in your organization 😀 (releasing a new product with no help from anyone else).
In that repo, we implemented the mechanism described here, introduced by Dinesh Pandiyan. The original article is clear and easy to follow. However, it may not give beginners the best visual representation of what is happening.
As the goal of this newsletter is to give newcomers a quick overview of a new subject, and since I truly believe a picture is worth 1000 words, I have drawn a diagram to illustrate the under-the-hood mechanism of this browser cache busting approach.
What the above mechanism does is always fetch meta.json from the server, and compare the build time of the locally cached app with the build time in the fetched meta.json file.
If the meta.json does not indicate a new release, then the browser cache control is respected: there is no hard refresh to reload the application from the remote server;
If the meta.json indicates a new release, then the application performs a hard refresh and reloads itself from the remote server.
In this way, we achieve our goal: use the browser cache as much as we can, and invalidate the browser cache immediately once the new release is rolled out to the server.
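The check described above can be sketched roughly as follows. This is a simplified illustration, not the code from the repo or from the original article: the `buildTime` field name, the `ReleaseMeta` type, and both function names are assumptions, and the fetching and reloading are passed in as functions so the logic can also run outside a browser.

```typescript
// Shape of meta.json; the `buildTime` field name is an assumption for this sketch.
interface ReleaseMeta {
  buildTime: number; // build timestamp in milliseconds since the epoch
}

// A release is new when the server's build is strictly later than the local one.
function isNewRelease(local: ReleaseMeta, remote: ReleaseMeta): boolean {
  return remote.buildTime > local.buildTime;
}

// Fetches the latest meta.json and triggers a reload only when a newer build exists.
// `fetchMeta` and `reload` are injected so the decision logic stays testable.
async function checkForNewRelease(
  local: ReleaseMeta,
  fetchMeta: () => Promise<ReleaseMeta>,
  reload: () => void,
): Promise<boolean> {
  const remote = await fetchMeta();
  const stale = isNewRelease(local, remote);
  if (stale) {
    reload(); // new release: bypass the cached app
  }
  return stale; // false means the cached app is still current
}
```

In a browser, `fetchMeta` could wrap `fetch("/meta.json", { cache: "no-store" })` followed by `response.json()`, and `reload` could call `window.location.reload()`. The `/meta.json` path is assumed here. The important design point is that meta.json itself must never be served from cache, otherwise the comparison would always see the old build time.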
Summary
While the last post talked about CDN cache invalidation, this post talked about browser cache busting. Hopefully this gives you a clear picture of how to offer your customers the best performance while keeping them up to date whenever you have a new release.
In the next post, we are going to put the end-to-end caching chain into a single diagram…