Forwarding, due to a snafu with my accept/dist files (I hate smartlist). Yeah, I know majordomo is better. I'll find the time someday to switch back...
Hi,

I've decided to take a stab at making tclhttpd into a caching proxy server, but I need some details on how to go about it, and I'm hoping someone in the group can point me in the right direction.

I'm guessing that under normal conditions the webserver sees only the path part of a URL like http://somesite.com/index.html, where .com could be any of .com, .edu, .gov etc., and index.html could be anything from blank, to a deep path like /root/branch/twig/ or /root/branch/twig/index.htm(l), and could also include things like :ports and CGI strings. My first question is: what does a proxy server see, and what does it look like? I guess the proxy must see the whole thing from "http..." on, so that it can look in its cache to see whether that site:path:page is there.

Secondly, it's fairly obvious that if the page is not in the cache (I'd use domain/root/branch/twig/page.html as the file directory structure), one should either pass the request out to the web and somehow capture the fetched page before sending it on to the client or, if the computer is offline, return a "document unavailable" page. Would one use the socket command and the http package to fetch the page and its images etc., and then return it to the client?

Steve Ball also mentioned something about checking the freshness of the cached page using a 'head' fetch. How is that done?

There is also the matter of purging the cache of stale pages, but that seems pretty straightforward: use the file dates in the cache and run a utility (thread) to clean it up periodically.

I guess what I'm looking for is a general document about the HTTP protocol. I'm going to be looking at the HTTP recommendation at W3.org, but I'd like to find something that is a more dynamic description of how to use HTTP. Can anyone point me to something like that, whether web, sources or book?

TIA,
Dave LeBlanc
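
For what it's worth, the first few questions above can be sketched in a few lines. This is only an illustration in Python, not tclhttpd code, and every function name and the cache layout are made up for the example. It assumes the usual HTTP behaviour that a client configured to use a proxy sends the full URL in the request line (GET http://somesite.com/index.html HTTP/1.0), whereas an origin server normally sees just the path (GET /index.html HTTP/1.0). The freshness check assumes the origin returns a Last-Modified header, which a HEAD request would fetch without the body.

```python
from email.utils import parsedate_to_datetime
from pathlib import PurePosixPath
from urllib.parse import quote, urlsplit


def parse_request_line(line):
    """Split an HTTP request line into (method, target, version)."""
    method, target, version = line.strip().split(" ", 2)
    return method, target, version


def is_proxy_request(target):
    """True when the target is a full http:// URL rather than a bare path."""
    return urlsplit(target).scheme == "http"


def cache_key(target):
    """Map a full URL to a relative cache path (hypothetical layout).

    The layout follows the domain/root/branch/twig/page.html idea. A real
    proxy would also have to reject '..' segments and similar tricks.
    """
    parts = urlsplit(target)
    host = parts.hostname or ""
    if parts.port and parts.port != 80:
        host = f"{host}_{parts.port}"          # keep :port variants apart
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"                   # assumed name for directory URLs
    key = PurePosixPath(host) / path.lstrip("/")
    if parts.query:
        # Fold the CGI string into the file name, escaping unsafe characters.
        key = key.with_name(key.name + "_" + quote(parts.query, safe=""))
    return str(key)


def is_stale(cached_mtime, last_modified_header):
    """True if the origin's Last-Modified is newer than the cached file.

    cached_mtime is seconds since the epoch (e.g. from os.path.getmtime);
    last_modified_header is the header value from a HEAD response.
    """
    origin_time = parsedate_to_datetime(last_modified_header).timestamp()
    return origin_time > cached_mtime


if __name__ == "__main__":
    method, target, version = parse_request_line(
        "GET http://somesite.com/root/branch/twig/ HTTP/1.0\r\n")
    print(method, is_proxy_request(target), cache_key(target))
```

An alternative to a separate HEAD request is a conditional GET: send If-Modified-Since with the cached file's date, and the origin answers 304 Not Modified if the cached copy is still good, or sends the new page otherwise.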
