Colm MacCarthaigh wrote:
Currently;

    GET / HTTP/1.1
    Host: ftp.heanet.ie

    GET http://ftp.heanet.ie/ HTTP/1.0

    GET HTTP://Ftp.Heanet.Ie/ HTTP/1.0

are all mapped to different hashes by mod_cache; despite being the same content, this is an inefficient waste of disk space and really awkward for me trying to write a debug/admin tool. The attached patch makes it deterministic, by mapping them all to;

    "http://ftp.heanet.ie:80/?"
The idea of canonicalising the name is sound, but munging it with an added :80 and an added ? is really ugly. These are not the kind of URLs that an end user would understand at a glance if they had to see them listed.
Is it possible to remove the :80 if the scheme is http, and remove the :443 if the scheme is https? What is the significance of the added "?"?
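To make the suggestion concrete, here is a minimal sketch in Python of the kind of canonicalisation I have in mind. It is not the C from the attached patch and not mod_cache's actual key code; the names canonical_key, host_header and default_scheme are made up for illustration. It lowercases the scheme and hostname, drops the port only when it is the default for the scheme, and appends "?" only when there really is a query string.

```python
from urllib.parse import urlsplit

# Default ports that can be dropped from the key without losing information.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonical_key(url, host_header="", default_scheme="http"):
    """Return a readable canonical key for a request target (illustrative only)."""
    parts = urlsplit(url)
    if not parts.netloc:
        # Origin-form target such as "/": take the authority from the Host header.
        parts = urlsplit(f"{default_scheme}://{host_header}{url}")

    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if ":" in host:
        # urlsplit strips the brackets from IPv6 literals; put them back.
        host = f"[{host}]"

    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{port}"  # keep only non-default ports

    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{scheme}://{host}{path}{query}"


# The three requests from the original mail collapse to one key.
keys = {
    canonical_key("/", host_header="ftp.heanet.ie"),
    canonical_key("http://ftp.heanet.ie/"),
    canonical_key("HTTP://Ftp.Heanet.Ie/"),
}
print(keys)
```

With something along these lines, a port that does not match the scheme's default (say https on :80) would still be kept in the key, so distinct origins are not merged by accident.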
Regards,
Graham
