jmarantz commented on issue #1686: URL: https://github.com/apache/incubator-pagespeed-ngx/issues/1686#issuecomment-624303798
This makes sense, and everything you've mentioned is consistent with my hypothesis for the root cause. If your cache doesn't get into a state where it evicts anything, then it will work for PageSpeed to:

a) not be able to fetch your CSS files via HTTP
b) be able to capture them into the shared cache as they are served optimized, via an output-filter that is installed by mod_pagespeed as part of InPlaceResourceOptimization
c) optimize those cached resources
d) re-optimize them also in another server with a separate metadata cache, having loaded them from the shared HTTP cache
e) save the optimized resources to the shared cache
f) serve them happily from all servers connected to that shared cache.

That all works great until the asset is evicted. At that point, PageSpeed gets a request for an optimized URL and can't find it in the cache. Then it:

g) decodes the URL to find the origin asset
h) attempts to use HTTP to fetch the origin asset, and fails
i) serves a 404.

If you are OK keeping your cache from ever evicting anything, then this bug can be tolerated. But as you observed, new resources are added to the cache all the time, and depending on the eviction algorithm, you may find your valuable (and non-reconstructable) assets evicted, and 404s will result.

One theory as to why new stuff is added to the cache is that PageSpeed attempts to use the cache to store properties of an HTML page, such as which domains are referenced, or which images are above the fold and hence would benefit from inlining and suffer from lazyloading. HTML pages may have URLs with a high-entropy query parameter and never be revisited. That can cause a lot of cache writes and ultimately (depending on the cache eviction setup) cause you to drop critical resources that you can't reconstruct. This is what https://github.com/apache/incubator-pagespeed-mod/issues/1145 is all about.

The best solution to this, I think, is to fix HTTP fetching so it works in your environment.
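The sequence above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not PageSpeed code: the `LruCache` class, the `fetch_origin` and `serve_optimized` helpers, the cache keys, the URLs, and the capacity are all hypothetical, and a simple LRU policy stands in for whatever eviction algorithm your cache actually uses.

```python
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache standing in for the shared cache (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        # Evict the least recently used entries once over capacity.
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)
        return self.items[key]


def fetch_origin(url):
    """Hypothetical HTTP fetcher that always fails, as in the broken setup."""
    return None


def serve_optimized(cache, optimized_url, origin_url):
    """Return the HTTP status for a request to an optimized URL."""
    if cache.get(optimized_url) is not None:
        return 200  # step f: served from the shared cache
    # Steps g/h: fall back to the origin asset, from cache or via HTTP.
    origin = cache.get(origin_url) or fetch_origin(origin_url)
    if origin is None:
        return 404  # step i: nothing left to reconstruct from
    cache.put(optimized_url, "optimized:" + origin)
    return 200


cache = LruCache(capacity=100)
origin_url = "/styles/site.css"                             # illustrative URL
optimized_url = "/styles/A.site.css.pagespeed.cf.HASH.css"  # illustrative URL

cache.put(origin_url, "body { color: red }")  # step b: captured while served
cache.put(optimized_url, "body{color:red}")   # steps c-e: optimized and saved
before = serve_optimized(cache, optimized_url, origin_url)

# Per-page property writes for never-revisited, high-entropy HTML URLs.
for i in range(1000):
    cache.put(f"prop:/page.html?session={i}", "page properties")

after = serve_optimized(cache, optimized_url, origin_url)
print(before, after)  # prints "200 404"
```

In this sketch the CSS entries are never touched during the flood of property writes, so they become the oldest entries and are evicted first. With a working `fetch_origin`, the second request would refetch the origin asset and recover instead of returning 404.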
To check whether HTTP fetching is working, you can look at the pagespeed_admin/statistics page. For example, the stats for modpagespeed.com look like this:

[image: Screen Shot 2020-05-05 at 4.55.44 PM.png]

The name of the HTTP fetcher used by mod_pagespeed is "serf", and this page tells us that modpagespeed.com's installation of mod_pagespeed has done 5425 successful fetches since the server started, with zero failed fetches. What do the stats look like on your system?

On Tue, May 5, 2020 at 4:25 PM Andrew Borg <notificati...@github.com> wrote:

> @jmarantz <https://github.com/jmarantz> @Lofesa <https://github.com/Lofesa>
>
> I have an update for you, been running pagespeed again for few days now with zero issues, so far. From the advice in this thread I've set
> pagespeed DefaultSharedMemoryCacheKB 0;
>
> and this (we have many domains pointing to the same website)
> pagespeed CacheFragment $root_dir;
>
> Just one thing please, take a look at this image: https://ibb.co/1T1KpdX
>
> It's a usage graph of the cache (memcache). Is this possibly normal?? It just keeps growing and we do have maybe a few thousand images on the site but nothing to fill 1.5GB of optimized content. How?
>
> You are receiving this because you were mentioned.
> Reply to this email directly or view it on GitHub <https://github.com/apache/incubator-pagespeed-ngx/issues/1686#issuecomment-624286853>.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org