Forwarding, due to a snafu with my accept/dist files (I hate smartlist).
Yeah, I know majordomo is better.  I'll find the time someday to
switch back...



Hi;

I've decided to take a stab at making tclhttpd into a caching proxy server,
but I need some details on how to go about it, and I'm hoping someone in the
group can point me in the right direction.

I'm guessing that under normal conditions the web server sees only the path
part of a URL like http://somesite.com/index.html, where .com could be any
of .com, .edu, .gov, etc., and index.html could be anything from blank to a
deep path like /root/branch/twig/ or /root/branch/twig/index.htm(l), and
could also include things like :ports and CGI query strings. My first
question is: what does a proxy server see, and what does it look like? I
guess the proxy must see the whole thing from "http..." on, so that it can
look in its cache to see whether that site:path:page is already there.
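
A minimal sketch of the distinction asked about above, in Python purely for
illustration (not Tcl, and not tclhttpd's own code). It relies on one fact
from the HTTP specification: a client talking to a proxy sends the absolute
URI in the request line, while a client talking directly to an origin server
sends only the path. The function name parse_request_line and the returned
dictionary keys are hypothetical.

```python
from urllib.parse import urlsplit


def parse_request_line(line):
    """Split an HTTP request line and report whether it is in proxy form."""
    method, target, version = line.split()
    parts = urlsplit(target)

    if parts.scheme:
        # Absolute URI ("http://host[:port]/path"): the form sent to a proxy.
        host = parts.hostname
        port = parts.port or 80
    else:
        # Path only: the form sent to an origin server. The host name would
        # have to come from elsewhere (e.g. a Host header), so leave it unset.
        host = None
        port = None

    # A blank path means the site root; keep any CGI query string attached.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    return {
        "method": method,
        "host": host,
        "port": port,
        "path": path,
        "version": version,
        "via_proxy": bool(parts.scheme),
    }
```

For example, "GET http://somesite.com/index.html HTTP/1.0" is reported as a
proxy-style request for host somesite.com, while "GET /index.html HTTP/1.0"
is reported as an ordinary request with no host in the request line.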

Secondly, it's fairly obvious that if the page is not in the cache (I'd use
domain/root/branch/twig/page.html as the file directory structure), one
should either pass the request out to the web and somehow capture the
fetched page before sending it on to the client or, if the computer is
offline, return a "document unavailable" page. Would one use the socket
command and the http package to fetch the page and its images, etc., and
then return it to the proxy?
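
The domain/root/branch/twig/page.html layout could be sketched roughly as
follows, again in Python for illustration only. Everything beyond that
layout is an assumption made for the sketch: the function name cache_path,
the "cache" root directory, the "index.html" default for blank or
directory-style paths, the "_port" suffix for non-default ports, and the
short hash used to keep different CGI query strings apart.

```python
import hashlib
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def cache_path(url, root="cache"):
    """Map a URL to a cache file path of the form root/domain/path/page."""
    parts = urlsplit(url)

    # Keep sites on non-default ports apart from the same host on port 80.
    host = parts.hostname or "unknown-host"
    if parts.port and parts.port != 80:
        host = f"{host}_{parts.port}"

    # Drop empty, "." and ".." segments so a URL cannot escape the cache root.
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]

    # A blank path or one ending in "/" names a directory; pick a file name.
    if not segments or parts.path.endswith("/"):
        segments.append("index.html")

    # Different query strings are different pages; tag the file name.
    if parts.query:
        digest = hashlib.sha1(parts.query.encode("utf-8")).hexdigest()[:8]
        segments[-1] += "_" + digest

    return str(PurePosixPath(root, host, *segments))
```

With these assumptions, http://somesite.com/root/branch/twig/page.html maps
to cache/somesite.com/root/branch/twig/page.html. One case the sketch does
not handle is a URL such as /a being cached as a file when /a/b later needs
/a to be a directory.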

Steve Ball also mentioned something about checking the freshness of a
cached page using a HEAD request. How is that done? There is also the
matter of purging the cache of stale pages, but that seems pretty
straightforward: use the file dates in the cache and run a utility (thread)
to clean it up periodically.
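
Both ideas in the paragraph above can be sketched in a few lines of Python
(illustration only, not tclhttpd code). The sketch assumes the freshness
check compares the Last-Modified header returned by a HEAD request against
the modification time of the cached file, and that purging simply deletes
files older than a chosen age. The function names is_stale and
purge_older_than, and their parameters, are hypothetical; the HEAD request
itself is left out so the sketch runs without a network.

```python
import os
import time
from email.utils import parsedate_to_datetime


def is_stale(last_modified_header, cached_mtime):
    """Return True if the origin copy is newer than the cached file.

    last_modified_header is the value of the Last-Modified header from a
    HEAD response; cached_mtime is the cached file's mtime in epoch seconds.
    """
    origin_time = parsedate_to_datetime(last_modified_header).timestamp()
    return origin_time > cached_mtime


def purge_older_than(cache_root, max_age_seconds, now=None):
    """Delete cached files whose mtime is older than max_age_seconds."""
    now = time.time() if now is None else now
    removed = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) > max_age_seconds:
                os.remove(path)
                removed.append(path)
    return removed
```

A periodic cleanup thread would call purge_older_than on the cache root with
whatever age limit suits the site, and the proxy would call is_stale before
serving a cached page, refetching the page when it returns True.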

I guess what I'm looking for is a general document about the HTTP protocol.
I'm going to be looking at the HTTP recommendation at w3.org, but I'd like
to find something that is a more dynamic description of how to use HTTP.
Can anyone point me to something like that, whether web, sources, or a book?

TIA,

Dave LeBlanc



