Justin Erenkrantz wrote:
MD5 has the possibility for collisions, too. What do squid or other proxies do?
True. I'll see what others do.
On one hand, I think doing MD5 is sort of silly - just use the URL itself. *shrug* -- justin
With mod_disk_cache, how about URLs such as:
http://www.cnn.com/2004/US/West/08/02/missing.woman/index.html?this=that&the%20other=something
or when you need to cache two different versions of the same URL because of Vary?
I actually have somewhat of a solution:
URL-encode the URI and any Vary elements:
www.cnn.com/index.html?this=that Accept-Encoding: gzip Cookie: Special=SomeValue
may become:
www.cnn.com%2Findex.html%3Fthis%3Dthat+Accept-Encoding%3A+gzip+Cookie%3A+Special%3DSomeValue
A very simple hashing function could put this in some directory structure, so the file on disk may be:
/var/cache/apache/00/89/www.cnn.com%2Findex.html%3Fthis%3Dthat+Accept-Encoding%3A+gzip+Cookie%3A+Special%3DSomeValue
Should be pretty fast (?) if the URL encoding is efficient.
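A rough sketch of the idea in Python, just to make the scheme concrete. This is not mod_disk_cache code: the function names are made up, and CRC32 is only an assumed stand-in for the "very simple hashing function", so the directory digits it produces will not necessarily match the 00/89 in the example path.

```python
import posixpath
import zlib
from urllib.parse import quote_plus

CACHE_ROOT = "/var/cache/apache"  # root directory from the example path


def cache_key(uri, vary_headers):
    """URL-encode the URI plus the request headers selected by Vary."""
    parts = [uri] + ["%s: %s" % (name, value) for name, value in vary_headers]
    # quote_plus escapes '/', '?', '=' and ':' and turns spaces into '+'
    return quote_plus(" ".join(parts))


def cache_path(uri, vary_headers, root=CACHE_ROOT):
    """Place the encoded key under two hash-derived directory levels."""
    key = cache_key(uri, vary_headers)
    # Assumption: low 16 bits of CRC32 act as the simple hash (hypothetical choice)
    h = zlib.crc32(key.encode("ascii")) & 0xFFFF
    return posixpath.join(root, "%02x" % (h >> 8), "%02x" % (h & 0xFF), key)


key = cache_key(
    "www.cnn.com/index.html?this=that",
    [("Accept-Encoding", "gzip"), ("Cookie", "Special=SomeValue")],
)
path = cache_path(
    "www.cnn.com/index.html?this=that",
    [("Accept-Encoding", "gzip"), ("Cookie", "Special=SomeValue")],
)
```

Here key comes out as the encoded string shown above, and path is that string placed under two two-digit hex directories below /var/cache/apache. Since the full key is the file name, two different URLs (or two Vary variants of one URL) can share a directory without colliding on disk.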
-- Brian Akins Senior Systems Engineer CNN Internet Technologies