BIG PICTURE/SUMMARY:
====================
I need a way to make CacheFu send correct, complete PURGE requests to a Varnish front-end cache when Apache virtual hosting uses the Zope Virtual Host Monster (VHM) "Inside-Out" virtual hosting syntax in its RewriteRule statements.


ENVIRONMENT SNAPSHOT:
=====================

Zope 2.10.5-final
Python 2.4.4
Plone 3.0.5
Archetypes 1.5.5
CacheFu 1.1.1 (SVN/UNRELEASED Rev. 56454)

Red Hat Enterprise Linux AS 4

Setup is a caching/HTTP-server machine in front (running Varnish and Apache), then several backend machines running ZEO clients, then a single ZEO Storage Server:


    Varnish 1.1.2 (port 80) &
       Apache 2.2 (port 81)
               |
               |
               |
         ZEO Client(s)
               |
               |
               |
       ZEO Storage Server


Let's say we're running Apache/Varnish on a server called cache.mycompany.com (IP address 10.5.54.156) and the site we're hosting is http://www.mycompany.com/foo, which maps to a Plone site called "PloneSite" under the ZODB mount point "mount1" (i.e., /mount1/PloneSite) on a ZEO client instance running on port 8080 on a machine called zopebackend1.mycompany.com (IP address 10.5.54.155). That is, http://www.mycompany.com/foo maps to http://zopebackend1.mycompany.com:8080/mount1/PloneSite.

Apache's httpd.conf has rewrite rules such as the following within the www.mycompany.com <VirtualHost> section:

  <VirtualHost 10.5.54.156>
  ServerName www.mycompany.com

  .
  .
  .

RewriteRule ^/foo(.*) http://zopebackend1.mycompany.com:8080/VirtualHostBase/http/www.mycompany.com:80/mount1/PloneSite/VirtualHostRoot/_vh_foo$1 [L,P]

  .
  .
  .

  </VirtualHost>

Note the _vh_foo at the end of the RewriteRule substitution, i.e., we are using the "Inside-Out" virtual hosting feature of the Virtual Host Monster syntax.
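
To make the mapping concrete, here is a rough sketch in Python 3 of how a path in the VHM syntax above relates to the public URL and to the real ZODB path. It is only an illustration of the syntax, not Zope's actual implementation; the function name is hypothetical.

```python
def split_vhm_path(path):
    """Return (public_url, zodb_path) for a VirtualHostBase-style path.

    Illustrative only: mimics the VHM URL syntax, not Zope's real code.
    """
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "VirtualHostBase":
        raise ValueError("not a VirtualHostBase path")
    scheme, host_port = parts[1], parts[2]
    root = parts.index("VirtualHostRoot")
    physical_root = parts[3:root]          # e.g. ['mount1', 'PloneSite']
    rest = parts[root + 1:]

    # Leading _vh_ segments exist only in the public URL ("Inside-Out").
    prefix = []
    while rest and rest[0].startswith("_vh_"):
        prefix.append(rest.pop(0)[len("_vh_"):])

    # Drop the port from the public URL when it is the scheme default.
    host, _, port = host_port.partition(":")
    default_port = {"http": "80", "https": "443"}.get(scheme)
    netloc = host if port in ("", default_port) else host_port

    public_url = scheme + "://" + netloc + "/" + "/".join(prefix + rest)
    zodb_path = "/" + "/".join(physical_root + rest)
    return public_url, zodb_path
```

For the rule above, a request for /foo/office/party-photos/view is public as http://www.mycompany.com/foo/office/party-photos/view but lives at /mount1/PloneSite/office/party-photos/view in the ZODB; the "foo" segment exists only on the public side.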

CacheFu is enabled on the Plone site, and left at default settings except for the following:

Proxy Cache Purge Configuration: Simple Purge (squid/varnish in front)
  Site Domains: http://www.mycompany.com:80


STEPS TO REPRODUCE THE PROBLEM:
===============================
Set up an environment similar to the one described above (Varnish cache and Apache on same server, Varnish listening on port 80, Apache on port 81; backend ZEO client servers that Apache passes non-cache-hit requests to; a ZEO storage server). Could probably use a single Zope instance instead of ZEO, too...the main thing here is what Varnish caches, the virtual hosting on Apache, and the use of VHM "Inside-Out" virtual hosting syntax in the Apache RewriteRule statement.

Start up the site and make sure you can get to it and its contents via the virtual host, and verify that Varnish is caching that site. Then change some items on the Plone site in order to force a "PURGE" request to be sent via CacheFu on the Plone site.
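
For testing by hand, a PURGE can also be sent without going through Plone. The following is a minimal Python 3 sketch using only the standard library; it assumes Varnish is configured to accept PURGE from the machine running the script, and the function names are hypothetical.

```python
import http.client
from urllib.parse import urlsplit


def build_purge(url):
    """Split a public URL into (method, Host header value, request path)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return "PURGE", parts.netloc, path


def send_purge(cache_host, cache_port, url, timeout=5):
    """Send one PURGE to the cache and return the HTTP status code."""
    method, host_header, path = build_purge(url)
    conn = http.client.HTTPConnection(cache_host, cache_port, timeout=timeout)
    try:
        # Explicit Host header so the cache sees the public hostname.
        conn.request(method, path, headers={"Host": host_header})
        return conn.getresponse().status
    finally:
        conn.close()


# Example (needs a reachable cache, so it is left commented out):
# send_purge("10.5.54.156", 80,
#            "http://www.mycompany.com/foo/office/party-photos/view")
```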


EXPECTED RESULT:
================
As far as I can tell, Varnish caches each object under the URL it sees in the incoming request. E.g., if the client GET is for http://www.mycompany.com/foo/news, Varnish caches the object as /foo/news. The cache key may also include the "http" scheme and hostname; I'm not positive.

For example, say at one point a GET request is done and it logs the following as an entry in the Varnish NCSA format log file:

67.164.214.168 - - [22/Jan/2008:11:45:29 -1000] "GET http://www.mycompany.com/foo/office/party-photos/view HTTP/1.1" 404 322 "http://www.mycompany.com/foo/office/index.html" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0)"

Based on this, I would expect the "PURGE" requests that the ZEO client instance sends to Varnish (triggered by CacheFu when the Plone site content changes) to carry the same basic URL, e.g., something like the following (as an entry in the Varnish NCSA format log):

10.5.54.155 - - [22/Jan/2008:12:05:13 -1000] "PURGE http://www.mycompany.com/foo/office/party-photos/view HTTP/1.1" 404 425 "-" "-"



ACTUAL RESULT:
==============
The actual "PURGE" request sent looks like the following:


10.5.54.155 - - [22/Jan/2008:12:05:13 -1000] "PURGE http://www.mycompany.com/office/party-photos/view HTTP/1.1" 404 425 "-" "-"

I.e., it is missing the "foo" in between the "www.mycompany.com/" and "office/party-photos/view". It appears that, because of this, Varnish does not purge the object from the cache because the URL does not match what was originally cached.
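
One way to spot this kind of mismatch is to compare the URLs in the Varnish NCSA log directly. The sketch below (Python 3, standard library only, hypothetical function names) lists PURGE URLs that never appeared in a GET in the same set of log lines. It assumes the request field has the form "METHOD URL HTTP/x.y" as in the log entries quoted above.

```python
import re

# Matches the quoted request field, e.g. "PURGE http://host/path HTTP/1.1"
REQUEST_RE = re.compile(r'"([A-Z]+) (\S+) HTTP/[0-9.]+"')


def requests_in_log(lines):
    """Yield (method, url) for every log line with a request field."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            yield match.group(1), match.group(2)


def unmatched_purges(lines):
    """Return PURGE URLs that were never requested with GET."""
    lines = list(lines)
    fetched = {url for method, url in requests_in_log(lines) if method == "GET"}
    return [url for method, url in requests_in_log(lines)
            if method == "PURGE" and url not in fetched]
```

Run against the two log entries shown above, this reports http://www.mycompany.com/office/party-photos/view as a PURGE with no corresponding GET.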

I think the PURGE request is missing the "foo" because the actual path below the true "/mount1/PloneSite" Plone site is "office/party-photos/view" (i.e., the full path would be "http://zopebackend1.mycompany.com:8080/mount1/PloneSite/office/party-photos/view"). The missing "foo" portion comes from the virtual hosting, not from the actual Plone site path.


NOTES/QUESTIONS/THOUGHTS:
=========================
If there's an easy way to set the settings of CacheFu to do what I want, I guess I've missed it. I've tried using the "Purge with Custom URLs" purge configuration too, but no luck (although I haven't tried modifying the rewritePurgeUrls.py script...it'd take me a bit to figure out what to change).

If I try to put something after the port number within a "Site Domains" section entry, it gets truncated when I hit "Save". E.g., if I put the following in "Site Domains":

  http://www.mycompany.com:80/foo

it gets truncated to

  http://www.mycompany.com:80

(i.e., the "/foo" is taken off) when I hit "Save".

I tried this with "Simple Purge" and "Purge with Custom URLs" both (although I only thought it might work for the custom URLs option). I did so because I was hoping I could force CacheFu to add the "/foo" to its PURGE requests.

Actually, if the functionality isn't there, maybe the following can be a request for a future version of CacheFu (or the current 1.1.1 release due Feb. 3rd, if it's not too hard to add--perhaps I should submit an issue/suggestion via the CacheFu development area?):

Allow the addition of something after the port number within the Site Domains section of CacheFu configs, so that if inside-out virtual hosting is used, the correct PURGE URL is sent to the front-end cache.
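
To illustrate the behavior being requested, here is a small Python 3 sketch. It is not CacheFu code, and the function name and arguments are hypothetical; it only shows a "Site Domains" entry keeping its path portion and that portion being prepended to each purged path.

```python
from urllib.parse import urlsplit


def purge_urls(site_domains, relative_path):
    """Build one PURGE URL per site domain, keeping any path prefix.

    site_domains: entries such as "http://www.mycompany.com:80/foo"
    relative_path: path of the changed object relative to the site root
    """
    urls = []
    rel = "/" + relative_path.lstrip("/")
    for domain in site_domains:
        parts = urlsplit(domain)
        prefix = parts.path.rstrip("/")    # "/foo", or "" if no prefix
        urls.append(parts.scheme + "://" + parts.netloc + prefix + rel)
    return urls
```

With "http://www.mycompany.com:80/foo" as the site domain and "office/party-photos/view" as the changed path, this gives http://www.mycompany.com:80/foo/office/party-photos/view, which includes the "foo" that the current PURGE request lacks.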

Another suggestion along those lines would be the following:

There may be more than one cache/HTTP-server machine (for redundancy and load-balancing)--so there would be a need to send the correct/complete PURGE request to more than one cache proxy...but perhaps that's taken care of by the "Proxy Cache Domains" setting in CacheFu?



Thanks ahead of time for any insight and help anyone can give me on this issue.


Dan K.
[EMAIL PROTECTED]


