In the current sources, you'll have to hack at this point in HttpRead,
right after it parses the request:

    # Strip leading http://server.
    # We check and discard proxy requests here, too.
    if {[regexp {^https?://([^/:]+)} $data(url) x xserv]} {
        set myname [lindex $data(self) 1]
        if {[string compare \
                [string tolower $xserv] \
                [string tolower $myname]] != 0} {
            Httpd_Error $sock 400 $line
            return
        }
        regsub {^https?://([^/]+)} $data(url) {} data(url)
    }

What the code should do (and will shortly) is capture the server and
port information, then call out to a proxy hook instead of Url_Dispatch
when the request is complete.
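
As a rough sketch of the "capture the server and port" step (this is not
the actual tclhttpd change, just an illustration), something like the
following could split an absolute request URL into its parts. The names
ProxySplitUrl and ProxyIsLocal are hypothetical helpers made up for this
example, and the default ports (80 and 443) are an assumption about what
a proxy hook would want when the URL gives none.

```tcl
# Hypothetical helper, not part of tclhttpd.
# Split an absolute URL into {scheme host port path}.
# Returns an empty list if the URL is not absolute (an ordinary request).
proc ProxySplitUrl {url} {
    # scheme://host[:port][/path...]
    if {![regexp -nocase {^(https?)://([^/:]+)(:([0-9]+))?(/.*)?$} $url \
            -> scheme host - port path]} {
        return {}
    }
    set scheme [string tolower $scheme]
    if {$port eq ""} {
        # Assumed defaults when the URL carries no explicit port.
        set port [expr {$scheme eq "https" ? 443 : 80}]
    }
    if {$path eq ""} {
        set path /
    }
    return [list $scheme [string tolower $host] $port $path]
}

# Hypothetical helper: true if the URL refers to this server (or is a
# plain path), false if it names some other host and so would be a
# candidate for the proxy hook.
proc ProxyIsLocal {url myname} {
    set parts [ProxySplitUrl $url]
    if {[llength $parts] == 0} {
        return 1
    }
    return [string equal -nocase [lindex $parts 1] $myname]
}
```

With these, HttpRead could keep the host and port around instead of
throwing them away, and decide at dispatch time whether to call
Url_Dispatch or the proxy hook. Comparing only the host name mirrors the
existing check; a real version would probably want to compare the port
as well.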
>>>David LeBlanc said:
> Hi;
>
> I've decided to take a stab at making tclhttpd into a caching proxy server,
> but need some details on how to go about it and i'm hoping someone in the
> group can point me in the right direction.
>
> I'm guessing that under normal conditions, the webserver will see only the
> path part of the http://somesite.com/index.html - where .com could be any
> of .com, .edu, .gov etc. and index.html could be anything from blank to a
> deep path like /root/branch/twig/ to /root/branch/twig/index.htm(l) and
> also include things like :ports and cgi strings. My first question is: what
> does a proxy server see, and what does it look like? I guess the proxy must
> see the whole thing from "http..." on so that it can look in its cache to
> see if that site:path:page is in the cache.
>
> Secondly, it's fairly obvious that if it's not in the cache (i'd use
> domain/root/branch/twig/page.html in the file directory structure) that one
> should either pass it out to the web and somehow capture the fetched page
> before sending it on to the client, or if the computer is offline, return a
> document unavailable page. Would one use the socket command and the http
> package to fetch the page and its images etc. and then return it to the
> proxy?
>
> Steve Ball also mentioned something about how to check the freshness of the
> cached page using the 'head' fetch - how is that done? There is also the
> matter of purging the cache of stale pages, but that seems pretty
> straightforward using the file dates in the cache and running a utility
> (thread) to clean it up periodically.
>
> I guess what i'm looking for is a general document about the http protocol.
> I'm going to be looking at the http recommendation at W3.org, but i'd like
> to find something that is a more dynamic description of how to use http.
> Can anyone point me to something like that, either web, sources or book?
>
> TIA,
>
> Dave LeBlanc
>
>
-- Brent Welch <[EMAIL PROTECTED]>
http://www.scriptics.com
Scriptics: The Tcl Platform Company