sourabh8Webscale opened a new issue #2039:
URL: https://github.com/apache/incubator-pagespeed-mod/issues/2039


   **What is the issue?**
   
   Requests that arrive on a sharded hostname for non-optimized
   (non-PageSpeed) resources are not being mapped to the origin domain.
   
   Here is a glimpse of my configuration; the detailed configuration is in
   the last section.
   ```
       # CDN Sharding is enabled.
       ModPagespeedShardDomain http://www.example.com 
http://cdn1-cname.example.com/ABC,http://cdn2-cname.example.com/ABC,http://cdn3-cname.example.com/ABC
       ModPagespeedShardDomain http://www.alias-example.com 
http://cdn1-cname.example.com/DEF,http://cdn2-cname.example.com/DEF,http://cdn3-cname.example.com/DEF
       
       # Map origin domain.
       ModPagespeedMapOriginDomain 127.0.0.1 http://www.example.com
       ModPagespeedMapOriginDomain 127.0.0.1 http://www.alias-example.com
   ```
   
   `ABC` and `DEF` are our checksums, used to map a shard request to the
   correct hostname or alias.
   
   **Problem statement:**
   Requests like `http://www.cdn1-cname.example.com/ABC/ex/fonts/myfont.tiff`
   end up at localhost with the path `/ABC/ex/fonts/myfont.tiff` instead of
   `/ex/fonts/myfont.tiff`.
   
   
   **What is the expected output?**
   
   I expect the URL
   `http://www.cdn1-cname.example.com/ABC/ex/fonts/myfont.tiff` to be mapped to
   its master domain, giving something like
   `http://www.example.com/ex/fonts/myfont.tiff`, and then to be requested from
   its origin with a URL something like
   `http://localhost/ex/fonts/myfont.tiff`.
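
To make the expected mapping concrete, here is a minimal Python sketch of the transformation I have in mind. It is not PageSpeed code: the function name `expected_origin_url` and the lookup table are hypothetical, and the table simply mirrors the `ABC`/`DEF` prefixes from the `ModPagespeedShardDomain` lines above.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table for illustration only: shard path prefix -> master domain.
# The prefixes mirror the ModPagespeedShardDomain lines in my configuration.
SHARD_PREFIX_TO_MASTER = {
    "/ABC": "www.example.com",
    "/DEF": "www.alias-example.com",
}


def expected_origin_url(shard_url, origin_host="localhost"):
    """Return (master_url, origin_url) that I expect for a sharded request."""
    parts = urlsplit(shard_url)
    for prefix, master in SHARD_PREFIX_TO_MASTER.items():
        if parts.path == prefix or parts.path.startswith(prefix + "/"):
            # Drop the shard checksum prefix, keep the rest of the path.
            path = parts.path[len(prefix):] or "/"
            master_url = urlunsplit((parts.scheme, master, path, parts.query, ""))
            origin_url = urlunsplit((parts.scheme, origin_host, path, parts.query, ""))
            return master_url, origin_url
    # Not a sharded path: leave the URL untouched.
    return shard_url, shard_url
```

For the font request above, this returns the master URL without `/ABC` and the same path on `localhost`, which is the result I am looking for.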
   
   **What do you see instead?**
   
   _For non-optimized and sharded resource:_
   
   What I can observe is that when a request like
   `http://www.cdn1-cname.example.com/ABC/ex/fonts/myfont.tiff` reaches
   the Instaweb_handler, it first checks whether the request is a PageSpeed
   original URL, whether such a request has already been made, or whether the
   request is part of another request. None of these is the case, so it
   constructs a URL using `ap_construct_url`, which is the same as the current
   URL, and registers the request as the original PageSpeed request using
   `apr_table_setn`.
   It then checks whether this URL is a PageSpeed beacon request or a PageSpeed
   resource, which turns out to be false, and marks the URL as a non-PageSpeed
   URL using `apr_table_set`.
   PageSpeed doesn't alter this URL. `ProxyFixupHost` then changes the host to
   `www.example.com`, so the request becomes
   `http://www.example.com/ABC/ex/fonts/myfont.tiff`, and `ProxyPass` directs it
   to the balancer cluster, which updates the request to
   `http://localhost/ABC/ex/fonts/myfont.tiff`.
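
The flow above can be modelled roughly as follows. This is only a sketch of what I observe from the outside, not the actual handler or proxy code; the function name `observed_flow` and its parameters are hypothetical. The point is that only the host is replaced at each step and the path is never touched, so the `/ABC` prefix survives.

```python
from urllib.parse import urlsplit, urlunsplit


def observed_flow(request_url, fixup_host="www.example.com", backend="localhost"):
    """Model of the observed behaviour: the host changes, the path does not."""
    parts = urlsplit(request_url)
    # Step 1: the host is rewritten to the master domain (ProxyFixupHost).
    after_fixup = urlunsplit((parts.scheme, fixup_host, parts.path, parts.query, ""))
    # Step 2: the request is proxied to the backend with the same path (ProxyPass).
    after_proxy = urlunsplit((parts.scheme, backend, parts.path, parts.query, ""))
    return after_fixup, after_proxy
```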
   
   _For optimized resource:_
   
   For example, a sharded and optimized request like

`http://www.cdn1-cname.example.com/ABC/ex/400x300a1.jpg.pagespeed.ic.hash.webp`
   gets properly mapped to its master domain,
`http://www.example.com/ex/400x300a1.jpg.pagespeed.ic.hash.webp`.
   PageSpeed then looks it up in its cache. If the resource is found, it is
   returned. If not, PageSpeed tries to load the resource asynchronously by
   updating the URL to `http://www.cdn1-cname.example.com/ABC/ex/1.jpg` and then
   fetches the resource from localhost with the request
   `http://127.0.0.1/ex/1.jpg`.
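
The two cases differ only in whether the URL is recognised as a PageSpeed resource. As a rough way to tell them apart when reading logs, I use a heuristic like the one below. It is an assumption based on the `<name>.pagespeed.<filter id>.<hash>.<ext>` shape of the example URL above, not the parser PageSpeed itself uses, and the function name is hypothetical.

```python
import re

# Rough heuristic only: matches names shaped like
# <name>.pagespeed.<two-letter filter id>.<hash>.<ext>.
PAGESPEED_NAME = re.compile(r"\.pagespeed\.[a-z]{2}\.[^./]+\.[^./]+$")


def looks_like_pagespeed_resource(url):
    """Return True if the URL path ends in a PageSpeed-style resource name."""
    path = url.split("?", 1)[0]  # ignore any query string
    return PAGESPEED_NAME.search(path) is not None
```

With this check, the `.webp` URL above is classified as a PageSpeed resource, while the font URL from the problem statement is not.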
   
   **On what operating system?**
   Ubuntu 16.04.7
   
   **Which version of Apache?**
   2.4.43
   
   **Which version of pagespeed?**
   v1.13.35.2
   
   **What steps will reproduce the problem?**
   
   Request a page that contains resources which PageSpeed decides not to
   optimize, such as font files or small images. The file formats that are
   failing for me are .tiff, .woff, .woff2, .png, .svg, and a few others.
   
   This issue occurs in production, where the customer site has a lot of
   resources, but only the resource types mentioned above are not being
   handled properly.
   
   **My configuration:**
   
   ```
   <VirtualHost 127.0.0.1:80 10.12x.x.3x:80>
     ServerName www.example.com
     ServerAlias www.alias-example.com
     ConcurrentLimit xx
     SuspendLimit 4xx
     MaximumQueueTime 6x
     DocumentRoot lg_vhosts/cname.example.com.80/www
     SetEnvIfExpr true proxy-request-timeout=30
     AllowEncodedSlashes NoDecode
     ModPagespeedFileCachePath /var/cache/pagespeed/cname.example.com
     ModPagespeedFileCacheSizeKb 10xxxx
     WSGeoLookUpEnable 1
     RemoteIPInternalProxy 127.0.0.1
     LegacyTrust off
     AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css 
text/javascript application/javascript text/javascript text/x-js 
text/x-javascript
     <IfModule pagespeed_module>
       ModPagespeed On
       ModPagespeedEnableFilters insert_dns_prefetch
       ModPagespeedEnableFilters add_instrumentation
       ModPagespeedUrlValuedAttribute img data-src image
       ModPagespeedFetchHttps enable
       ModPagespeedDomain www.example.com
       ModPagespeedDomain www.alias-example.com
       ModPagespeedDomain https://www.example.com
       ModPagespeedDomain https://www.alias-example.com
   
       # CDN Sharding is enabled.
       ModPagespeedShardDomain http://www.example.com 
http://cdn1-cname.example.com/ABC,http://cdn2-cname.example.com/ABC,http://cdn3-cname.example.com/ABC
       ModPagespeedShardDomain http://www.alias-example.com 
http://cdn1-cname.example.com/DEF,http://cdn2-cname.example.com/DEF,http://cdn3-cname.example.com/DEF
       
       ModPagespeedShardDomain https://www.example.com 
https://cdn1-cname.example.com/ABC,https://cdn2-cname.example.com/ABC,https://www.cdn3-cname.example.com/ABC
       ModPagespeedShardDomain https://www.alias-example.com 
https://cdn1-cname.example.com/DEF,https://cdn2-cname.example.com/DEF,https://cdn3-cname.example.com/DEF
       # Map origin domain.
       ModPagespeedMapOriginDomain 127.0.0.1 http://www.example.com
       ModPagespeedMapOriginDomain 127.0.0.1 http://www.alias-example.com
     </IfModule>
     ProxyPreserveHost On
     ProxySourceAddress 10.12x.0.xx
     SetEnvIfExpr "%{CONN_REMOTE_ADDR} = '127.0.0.1'" !proxy-initial-not-pooled
     <Proxy balancer://cluster_0_0>
       BalancerMember http://10x.1xx.1xx.1xx:80 route=0 timeout=600 retry=0 
connectiontimeout=60 loadfactor=1
       ProxySet lbmethod=bybusyness
     </Proxy>
   
     # Set to default balancer if not already set
     SetEnvIfExpr "-z reqenv('LB')" LB=cluster_0_0
     ProxyPassInterpolateEnv On
     SSLProxyEngine On
     RequestHeader set X-Forwarded-Proto http
     Protocols http/1.1
   </VirtualHost>
   
   <VirtualHost 127.0.0.1:80 10.12x.x.3x:80>
     ServerName cname.example.com
     ConcurrentLimit 40
     SuspendLimit 400
     MaximumQueueTime 60
     Session off
     DocumentRoot lg_vhosts/cname.example.com.80/www
     AllowEncodedSlashes NoDecode
     ServerAlias cdn1-cname.example.com
     ServerAlias cdn2-cname.example.com
     ServerAlias cdn3-cname.example.com
     SecWebAppId xyz
     ModPagespeedFileCachePath /var/cache/pagespeed/cname.example.com
     ModPagespeedFileCacheSizeKb 10xxxxx
   
     WSGeoLookUpEnable 1
   
     RemoteIPInternalProxy 127.0.0.1
     LegacyTrust off
     AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css 
text/javascript application/javascript text/javascript text/x-js 
text/x-javascript
     <IfModule pagespeed_module>
       ModPagespeed On
       ModPagespeedEnableFilters insert_dns_prefetch
       ModPagespeedEnableFilters add_instrumentation
       ModPagespeedUrlValuedAttribute img data-src image
       ModPagespeedFetchHttps enable
       ModPagespeedDomain www.example.com
       ModPagespeedDomain www.alias-example.com
       ModPagespeedDomain https://www.example.com
       ModPagespeedDomain https://www.alias-example.com
   
   
       # CDN Sharding is enabled.
       ModPagespeedShardDomain http://www.example.com 
http://cdn1-cname.example.com/ABC,http://cdn2-cname.example.com/ABC,http://cdn3-cname.example.com/ABC
       ModPagespeedShardDomain http://www.alias-example.com 
http://cdn1-cname.example.com/DEF,http://cdn2-cname.example.com/DEF,http://cdn3-cname.example.com/DEF
       
       # Optimize HTTPS is enabled as well, so we need to define https shards.
       ModPagespeedShardDomain https://www.example.com 
https://www.cdn1-cname.example.com/ABC,https://cdn2-cname.example.com/ABC,https://cdn3-cname.example.com/ABC
       ModPagespeedShardDomain https://www.alias-example.com 
https://www.cdn1-cname.example.com/DEF,https://cdn2-cname.example.com/DEF,https://cdn3-cname.example.com/DEF
       # Map origin domain.
       ModPagespeedMapOriginDomain 127.0.0.1 http://www.example.com
       ModPagespeedMapOriginDomain 127.0.0.1 http://www.alias-example.com
   
     </IfModule>
     ProxyFixupHost www.example.com
     SetEnvIfExpr true !proxy-initial-not-pooled
     <Proxy balancer://cluster_0_0>
       BalancerMember http://127.0.0.1:80
       ProxySet lbmethod=bybusyness
     </Proxy>
     ProxyPass / balancer://cluster_0_0/
     RequestHeader set X-Forwarded-Proto http
     Protocols http/1.1
   </VirtualHost>
   ```
   
   We have created two virtual hosts: one for handling CDN requests and one for
   the actual server. A CDN request arrives at the CDN virtual host first and is
   resolved to the actual hostname; the request is then sent to the load
   balancer, which picks a server based on busyness.
   
   
   **Things I have tried:**
   1. I experimented with the configuration by adding
   `ModPagespeedShardDomain 127.0.0.1 www.cdn1-cname.example.com/ABC
   www.example.com` to the vhost file, but it gave me similar results.
   2. I tried some code changes in `Instawebhandler` to map a sharded request
   to its correct hostname when it is a sharded, non-PageSpeed request, but
   the change did not take effect and the request kept reverting to the
   original request.
   
   
   
   


