jmarantz commented on issue #1686:
URL:
https://github.com/apache/incubator-pagespeed-ngx/issues/1686#issuecomment-624303798
This makes sense and everything you've mentioned is consistent with my
hypothesis for the root cause.
If your cache doesn't get into a state where it evicts anything, then the
setup will work. PageSpeed will:
a) not be able to fetch your CSS files via HTTP
b) be able to capture them into the shared cache as they are served
optimized, via an output-filter that is installed by mod_pagespeed as part
of InPlaceResourceOptimization
c) optimize those cached resources
d) re-optimize them also in another server with a separate metadata
cache, having loaded them from the shared http cache
e) save the optimized resources to the shared cache.
f) serve them happily from all servers connected to that shared cache.
That all works great until the asset is evicted. At that point, pagespeed
gets a request for an optimized URL. It can't find it in the cache. Then it:
g) decodes the URL to find the origin asset
h) attempts to use HTTP to fetch the origin asset, and fails
i) serves a 404.
If you are OK keeping your cache from ever evicting anything, then this bug
can be tolerated. But as you observed, new resources are added to the cache
all the time, and depending on the eviction algorithm, you may find your
valuable (and non-reconstructable) assets evicted and 404s will result.
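To make the failure mode concrete, here is a minimal sketch of steps g) through i) in Python. It is not PageSpeed's actual code: `LruCache`, `decode_origin_url`, `origin_fetch`, and the example URL are all hypothetical stand-ins, and the fetch is hard-wired to fail to mimic an environment where HTTP fetching does not work.

```python
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache standing in for the shared cache (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


def decode_origin_url(optimized_url):
    """Toy decoder: everything before '.pagespeed.' is the origin asset."""
    return optimized_url.split(".pagespeed.")[0]


def origin_fetch(url):
    """Assumption: HTTP fetching is broken here, so every fetch fails."""
    return None


def serve_optimized(cache, optimized_url):
    cached = cache.get(optimized_url)
    if cached is not None:
        return 200, cached
    # Cache miss: the only way to rebuild is to re-fetch the origin asset.
    origin_body = origin_fetch(decode_origin_url(optimized_url))
    if origin_body is None:
        return 404, b""
    cache.put(optimized_url, origin_body)  # real code would re-optimize here
    return 200, origin_body


if __name__ == "__main__":
    cache = LruCache(capacity=2)
    url = "/styles/site.css.pagespeed.cf.abc123.css"
    cache.put(url, b"body{margin:0}")
    print(serve_optimized(cache, url)[0])
    cache.put("prop:/page?id=1", b"x")
    cache.put("prop:/page?id=2", b"y")
    print(serve_optimized(cache, url)[0])
```

With a capacity of two, the two unrelated writes push the CSS entry out, and the second request has nothing to fall back on.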
One theory as to why new stuff is added to the cache is that PageSpeed
attempts to use the cache to store properties of an HTML page, such as
which domains are referenced, or which images are above the fold and hence
would benefit from inlining and suffer from lazyloading. HTML pages may
have URLs with a high-entropy query-parameter, and never be revisited. That
can cause a lot of cache writes, and ultimately (depending on cache
eviction setup), cause you to drop critical resources that you can't
reconstruct.
This is what https://github.com/apache/incubator-pagespeed-mod/issues/1145 is
all about.
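As a rough illustration of the high-entropy query-parameter theory, the sketch below counts how many distinct cache keys a run of page views would produce. The `prop:` key format, the `token` parameter, and the example.com URLs are made up for illustration and do not reflect how PageSpeed actually builds its keys. The query-stripped set is only a baseline for comparison, not a suggested setting.

```python
from urllib.parse import urlsplit, urlunsplit


def property_cache_key(url):
    """Hypothetical key format: one property-cache entry per page URL."""
    return "prop:" + url


def strip_query(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# 1000 views of the same page, each with a one-off query parameter.
page_views = [
    f"https://example.com/landing?token={i:08x}" for i in range(1000)
]

raw_keys = {property_cache_key(u) for u in page_views}
baseline_keys = {property_cache_key(strip_query(u)) for u in page_views}

if __name__ == "__main__":
    print(len(raw_keys), "keys with the query parameter")
    print(len(baseline_keys), "key without it")
```

Every one of those one-off keys is a write that is never read again, yet each still competes for space with the optimized resources.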
The best solution to this, I think, is to fix HTTP fetching so it works in
your environment. To check whether it is working, you can look at the
pagespeed_admin/statistics page. For example, the stats for modpagespeed.com
look like this:
[image: Screen Shot 2020-05-05 at 4.55.44 PM.png]
The name of the HTTP fetcher used by mod_pagespeed is "serf", and this page
tells us that modpagespeed.com's installation of mod_pagespeed has done
5425 successful fetches since the server started, and zero fetches failed.
What do the stats look like on your system?
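If it is easier than a screenshot, something like the following sketch could pull the fetcher counters out of a saved copy of the statistics. It assumes the stats are available as plain text with one `name: value` pair per line. The stat names and the format of `SAMPLE` are assumptions for illustration, so compare them with what your own page actually shows.

```python
import re

# Matches lines such as "some_counter_name: 123".
STAT_LINE = re.compile(r"^\s*([A-Za-z_][\w.-]*)\s*:\s*(-?\d+)\s*$")


def parse_stats(text):
    """Parse 'name: value' lines into a dict, skipping anything else."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def fetcher_stats(stats, prefix="serf"):
    """Keep only the counters whose name starts with the fetcher prefix."""
    return {k: v for k, v in stats.items() if k.startswith(prefix)}


# Sample input; names and layout are assumed, values mirror the example above.
SAMPLE = """\
cache_hits: 120
serf_fetch_request_count: 5425
serf_fetch_failure_count: 0
not a stat line
"""

if __name__ == "__main__":
    for name, value in sorted(fetcher_stats(parse_stats(SAMPLE)).items()):
        print(f"{name} = {value}")
```

A non-zero failure counter next to a near-zero request counter would point to the fetcher being unable to reach the origin.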
On Tue, May 5, 2020 at 4:25 PM Andrew Borg <[email protected]> wrote:
> @jmarantz <https://github.com/jmarantz> @Lofesa
> <https://github.com/Lofesa>
>
> I have an update for you: I've been running pagespeed again for a few days
> now with zero issues so far. From the advice in this thread I've set
> pagespeed DefaultSharedMemoryCacheKB 0;
>
> and this (we have many domains pointing to the same website)
> pagespeed CacheFragment $root_dir;
>
> Just one thing please, take a look at this image: https://ibb.co/1T1KpdX
>
> It's a usage graph of the cache (memcache). Is this possibly normal? It
> just keeps growing, and we do have maybe a few thousand images on the site
> but nothing to fill 1.5GB of optimized content. How?
>
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]