jmarantz commented on issue #1686:
URL: https://github.com/apache/incubator-pagespeed-ngx/issues/1686#issuecomment-624303798


   This makes sense and everything you've mentioned is consistent with my
   hypothesis for the root cause.
   
   If your cache doesn't get into a state where it evicts anything, then
   things will work, with PageSpeed going through these steps:
     a) fail to fetch your CSS files via HTTP
     b) capture them into the shared cache as they are served
        optimized, via an output filter that mod_pagespeed installs as
        part of InPlaceResourceOptimization
     c) optimize those cached resources
     d) re-optimize them in another server that has a separate metadata
        cache, having loaded them from the shared HTTP cache
     e) save the optimized resources from the shared cache
     f) serve them happily from all servers connected to that shared cache.
   
   That all works great until the asset is evicted. At that point, pagespeed
   gets a request for an optimized URL. It can't find it in the cache. Then it:
     g) decodes the URL to find the origin asset
     h) attempts to use HTTP to fetch the origin asset, and fails
     i) serves a 404.
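
   To make steps g) through i) concrete, here is a minimal sketch of the
   lookup path in Python. It is not PageSpeed's actual code: the names
   decode_origin_url, optimize and serve_optimized are hypothetical, the
   URL decoding is simplified to cutting at ".pagespeed.", and the cache
   is modeled as a plain dict.

```python
from typing import Callable, Dict, Optional, Tuple

Fetcher = Callable[[str], Optional[bytes]]


def decode_origin_url(optimized_url: str) -> str:
    """Hypothetical decoder: assumes 'name.ext.pagespeed.<id>.<hash>.<ext>'."""
    return optimized_url.split(".pagespeed.", 1)[0]


def optimize(source: bytes) -> bytes:
    """Stand-in for a real rewriter: collapses runs of whitespace."""
    return b" ".join(source.split())


def serve_optimized(
    url: str, cache: Dict[str, bytes], fetch: Fetcher
) -> Tuple[int, bytes]:
    """Return (status, body) for a request to an optimized URL."""
    cached = cache.get(url)
    if cached is not None:
        return 200, cached            # normal case: served from the cache

    origin = decode_origin_url(url)   # g) decode the URL to find the origin
    source = fetch(origin)            # h) try to fetch the origin over HTTP
    if source is None:
        return 404, b""               # i) fetch failed, nothing to rebuild from

    rebuilt = optimize(source)
    cache[url] = rebuilt              # a working fetcher repopulates the cache
    return 200, rebuilt
```

   With a fetcher that always returns None, the function serves 200 while
   the entry is cached and 404 once it is gone. With a working fetcher the
   same eviction is harmless, because the entry is rebuilt on demand.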
   
   If you are OK keeping your cache from ever evicting anything, then this bug
   can be tolerated. But as you observed, new resources are added to the cache
   all the time, and depending on the eviction algorithm, you may find your
   valuable (and non-reconstructable) assets evicted and 404s will result.
   
   One theory as to why new stuff is added to the cache is that PageSpeed
   attempts to use the cache to store properties of an HTML page, such as
   which domains are referenced, or which images are above the fold and hence
   would benefit from inlining and suffer from lazyloading. HTML pages may
   have URLs with a high-entropy query-parameter, and never be revisited. That
   can cause a lot of cache writes, and ultimately (depending on cache
   eviction setup), cause you to drop critical resources that you can't
   reconstruct.
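
   A small simulation of that theory, as a sketch only. It assumes a
   simple LRU policy with a fixed number of entries, which may not match
   your cache's actual eviction algorithm, and the key names ("css:...",
   "props:...") are made up for illustration.

```python
from collections import OrderedDict
from typing import Optional


class LruCache:
    """Tiny LRU cache bounded by entry count (an assumed eviction policy)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: "OrderedDict[str, str]" = OrderedDict()

    def put(self, key: str, value: str) -> None:
        self.entries[key] = value
        self.entries.move_to_end(key)          # mark as most recently used
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict least recently used

    def get(self, key: str) -> Optional[str]:
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)
        return self.entries[key]


def asset_survives(capacity: int, page_views: int) -> bool:
    """Store one asset, then write one property record per unique page URL."""
    cache = LruCache(capacity)
    cache.put("css:/site.css", "optimized css")
    for i in range(page_views):
        # Each view has a distinct query parameter, so each gets its own key.
        cache.put(f"props:/page?session={i}", "page properties")
    return cache.get("css:/site.css") is not None
```

   In this model the asset survives only while the number of never-revisited
   page URLs stays below the cache capacity. After that it is the oldest
   entry and gets evicted, even though nothing else can reconstruct it.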
   
   This is what https://github.com/apache/incubator-pagespeed-mod/issues/1145 is
   all about.
   
   
   The best solution to this, I think, is to fix HTTP fetching so it works
   in your environment. To check whether it is working, you can look at the
   pagespeed_admin/statistics page. For example, the stats for
   modpagespeed.com look like this:
   
   [image: Screen Shot 2020-05-05 at 4.55.44 PM.png]
   
   The name of the HTTP fetcher used by mod_pagespeed is "serf", and this page
   tells us that modpagespeed.com's installation of mod_pagespeed has done
   5425 successful fetches since the server started, and zero fetches failed.
   What do the stats look like on your system?
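
   If you would rather check this from a script, here is a rough sketch
   that pulls the serf counters out of statistics text. It assumes you
   have the statistics as plain "name: value" lines, and the helper names
   (parse_stats, serf_stats) are hypothetical. The variable names in the
   usage note are placeholders as well, so check them against what your
   own statistics page shows.

```python
import re
from typing import Dict

# One "name: integer" pair per line; other lines are ignored.
STAT_LINE = re.compile(r"^\s*([A-Za-z0-9_]+)\s*:\s*(-?\d+)\s*$", re.MULTILINE)


def parse_stats(text: str) -> Dict[str, int]:
    """Parse 'name: value' lines into a dict of integer counters."""
    return {name: int(value) for name, value in STAT_LINE.findall(text)}


def serf_stats(stats: Dict[str, int]) -> Dict[str, int]:
    """Keep only the counters whose name starts with 'serf'."""
    return {k: v for k, v in stats.items() if k.startswith("serf")}
```

   For example, given text containing lines such as
   "serf_fetch_request_count: 5425" and "serf_fetch_failure_count: 0"
   (assumed names), serf_stats(parse_stats(text)) returns just those two
   counters, which you can then compare.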
   
   On Tue, May 5, 2020 at 4:25 PM Andrew Borg <notificati...@github.com> wrote:
   
   > @jmarantz <https://github.com/jmarantz> @Lofesa
   > <https://github.com/Lofesa>
   >
   > I have an update for you, been running pagespeed again for few days now
   > with zero issues, so far. From the advice in this thread I've set
   > pagespeed DefaultSharedMemoryCacheKB 0;
   >
   > and this (we have many domains pointing to the same website)
   > pagespeed CacheFragment $root_dir;
   >
   > Just one thing please, take a look at this image: https://ibb.co/1T1KpdX
   >
   > It's a usage graph of the cache (memcache). Is this possibly normal?? It
   > just keeps growing and we do have maybe a few thousand images on the site
   > but nothing to fill 1.5GB of optimized content. How?
   >
   



