oschaaf commented on issue #1556: Combined css returns truncated, mis-encoded 
body on cache hit
URL: 
https://github.com/apache/incubator-pagespeed-ngx/issues/1556#issuecomment-389066123
 
 
   So I caught the http cache in gdb writing the bad entry. The C-L is 7411 
with C-E gzip, but the body content is uncompressed and a lot larger. I suspect 
that this problem is masked for small response sizes by recovery mechanisms 
both at the server and browser level, which is why it isn't showing up reliably.
   
   ```
   (gdb) p response_headers->ToString()
   $14 = "HTTP/1.1 200 OK\r\nServer: nginx/1.13.4\r\nAccept-Ranges: 
bytes\r\nContent-Type: text/css\r\nDate: Tue, 15 May 2018 06:42:09 
GMT\r\nExpires: Wed, 15 May 2019 06:42:09 GMT\r\nCache-Control: 
max-age=31536000\r\nEtag: W/\"0\"\r\nLast-Modified: Tue, 15 May 2018 06:42:09 
GMT\r\nX-Original-Content-Length: 50878\r\nVary: 
Accept-Encoding\r\nContent-Encoding: gzip\r\nContent-Length: 7411\r\n\r\n"
   (gdb) bt
   #0  net_instaweb::HTTPCache::PutInternal (this=0x188efb8, 
preserve_response_headers=false, 
key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css";, 
fragment="localhost", 
       start_us=1526366529263953, value=0x7ffff2034c50, 
response_headers=0x7ffff2034d50, handler=0x1a22d78) at 
net/instaweb/http/http_cache.cc:457
   #1  0x0000000000706979 in net_instaweb::HTTPCache::Put (this=0x188efb8, 
key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css";, 
fragment="localhost", req_properties=..., 
       http_options=..., value=0x7fffe40079a0, handler=0x1a22d78) at 
net/instaweb/http/http_cache.cc:495
   #2  0x00000000005d4c82 in net_instaweb::RewriteDriver::Write 
(this=0x7fffe4002f98, inputs=std::vector of length 2, capacity 2 = {...}, 
contents=..., 
       type=0x17723c0 <net_instaweb::(anonymous namespace)::kTypes+96>, 
charset=..., output=0x7fffe4007978) at 
net/instaweb/rewriter/rewrite_driver.cc:3362
   #3  0x0000000000b3fb1d in net_instaweb::ResourceCombiner::WriteCombination 
(this=0x7fffe40084b8, combine_resources=std::vector of length 2, capacity 2 = 
{...}, combination=..., handler=0x1a22d78)
       at net/instaweb/rewriter/resource_combiner.cc:287
   #4  0x000000000066e821 in net_instaweb::CssCombineFilter::CssCombiner::Write 
(this=0x7fffe40084b8, in=std::vector of length 2, capacity 2 = {...}, out=...)
       at net/instaweb/rewriter/css_combine_filter.cc:123
   #5  0x000000000066f0d1 in net_instaweb::CssCombineFilter::Context::Rewrite 
(this=0x7fffe4008338, partition_index=0, partition=0x7fffe4007ee8, output=...)
       at net/instaweb/rewriter/css_combine_filter.cc:266
   #6  0x0000000000b568a8 in 
net_instaweb::RewriteContext::InvokeRewriteFunction::Run (this=0x7fffe4000c48) 
at net/instaweb/rewriter/rewrite_context.cc:958
   #7  0x0000000000725d8c in net_instaweb::Function::CallRun 
(this=0x7fffe4000c48) at pagespeed/kernel/base/function.cc:51
   #8  0x00000000008a75c8 in net_instaweb::QueuedWorkerPool::Run 
(this=0x1a06068, sequence=0x7fffe4003ad8, worker=0x1a2e0f8) at 
pagespeed/kernel/thread/queued_worker_pool.cc:157
   #9  0x00000000008ad5cc in 
net_instaweb::MemberFunction2<net_instaweb::QueuedWorkerPool, 
net_instaweb::QueuedWorkerPool::Sequence*, net_instaweb::QueuedWorker*>::Run 
(this=0x7fffdc031ef8)
       at ./pagespeed/kernel/base/function.h:202
   #10 0x0000000000725d8c in net_instaweb::Function::CallRun 
(this=0x7fffdc031ef8) at pagespeed/kernel/base/function.cc:51
   #11 0x00000000008b3b64 in net_instaweb::Worker::WorkThread::Run 
(this=0x1a2e128) at pagespeed/kernel/thread/worker.cc:85
   #12 0x00000000008b6ffc in net_instaweb::PthreadThreadImpl::InvokeRun 
(self_ptr=0x1a2dd08) at pagespeed/kernel/thread/pthread_thread_system.cc:121
   #13 0x00007ffff79bd6ba in start_thread (arg=0x7ffff2036700) at 
pthread_create.c:333
   #14 0x00007ffff638341d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
   ```
   
   At the surface, it looks like `ResourceCombiner::WriteCombination` sanitizes 
the output headers, but not the ones it stores into the cache.
   
   Trying out the following change to see how that turns out:
   
   ```diff
   diff --git a/net/instaweb/rewriter/resource_combiner.cc 
b/net/instaweb/rewriter/resource_combiner.cc
   index e978236..25b971e 100644
   --- a/net/instaweb/rewriter/resource_combiner.cc
   +++ b/net/instaweb/rewriter/resource_combiner.cc
   @@ -278,7 +278,8 @@ bool ResourceCombiner::WriteCombination(
        for (int i = 1, n = combine_resources.size(); i < n; ++i) {
          
output_headers->RemoveIfNotIn(*combine_resources[i]->response_headers());
        }
   -
   +    
combination->response_headers()->RemoveAll(HttpAttributes::kContentEncoding);
   +    
combination->response_headers()->RemoveAll(HttpAttributes::kContentLength);
        // TODO(morlovich): Fix combiners to deal with charsets.
        written =
            rewrite_driver_->Write(
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to