In the current sources, you'll have to hack at this point in HttpRead,
right after it parses the request:

                # Strip leading http://server.
                # We check and discard proxy requests here, too.

                if {[regexp {^https?://([^/:]+)} $data(url) x xserv]} {
                    set myname [lindex $data(self) 1]
                    if {[string compare \
                            [string tolower $xserv] \
                            [string tolower $myname]] != 0} {
                        Httpd_Error $sock 400 $line
                        return
                    }
                    regsub {^https?://([^/]+)} $data(url) {} data(url)
                }

What the code should do (and shortly will) is capture the server and
port information and then call out to a proxy hook instead of Url_Dispatch
when the request is complete.
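As a rough sketch of that change, something like the following could
replace the check above. Note this is only an outline: the
Httpd(proxyHook) slot and the data(proxyHost)/data(proxyPort) fields are
hypothetical names, not current tclhttpd API.

    # Capture the target host/port instead of rejecting the request.
    if {[regexp {^https?://([^/:]+)(:([0-9]+))?} $data(url) x xserv y xport]} {
        set myname [lindex $data(self) 1]
        if {[string compare \
                [string tolower $xserv] \
                [string tolower $myname]] != 0} {
            if {[info exists Httpd(proxyHook)]} {
                # Remember the target so the proxy hook can relay the
                # request once it is complete, instead of dispatching
                # it locally via Url_Dispatch.
                set data(proxyHost) $xserv
                set data(proxyPort) [expr {[string length $xport] ? $xport : 80}]
                return
            }
            # No proxy hook registered - reject as before.
            Httpd_Error $sock 400 $line
            return
        }
        regsub {^https?://[^/]+} $data(url) {} data(url)
    }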

>>>David LeBlanc said:
 > Hi;
 > 
 > I've decided to take a stab at making tclhttpd into a caching proxy server,
 > but need some details on how to go about it and i'm hoping someone in the
 > group can point me in the right direction.
 > 
 > I'm guessing that under normal conditions, the webserver will see only the
 > path part of http://somesite.com/index.html - where .com could be any
 > of .com, .edu, .gov etc., and index.html could be anything from blank to a
 > deep path like /root/branch/twig/ or /root/branch/twig/index.htm(l), and
 > could also include things like :ports and CGI strings. My first question
 > is: what does a proxy server see, and what does it look like? I guess the
 > proxy must see the whole thing from "http..." on so that it can look in
 > its cache to see if that site:path:page is in the cache.
 > 
 > Secondly, it's fairly obvious that if it's not in the cache (I'd use
 > domain/root/branch/twig/page.html in the file directory structure), one
 > should either pass it out to the web and somehow capture the fetched page
 > before sending it on to the client, or, if the computer is offline, return
 > a document-unavailable page. Would one use the socket command and the http
 > package to fetch the page and its images etc., and then return it to the
 > proxy?
 > 
 > Steve Ball also mentioned something about how to check the freshness of the
 > cached page using a HEAD fetch - how is that done? There is also the
 > matter of purging the cache of stale pages, but that seems pretty
 > straightforward using the file dates in the cache and running a utility
 > (thread) to clean it up periodically.
 > 
 > I guess what I'm looking for is a general document about the HTTP protocol.
 > I'm going to be looking at the HTTP recommendation at W3.org, but I'd like
 > to find something that is a more dynamic description of how to use HTTP.
 > Can anyone point me to something like that, either web, sources, or a book?
 > 
 > TIA,
 > 
 > Dave LeBlanc
 > 
 > 
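On the freshness question above: the http package's -validate option
issues a HEAD request, so you can compare the Last-Modified header
against the cached file's timestamp. A minimal sketch, assuming a
CacheFresh helper of your own naming (a real cache should also honor
Expires and similar headers):

    package require http

    proc CacheFresh {url cacheFile} {
        # -validate 1 makes geturl do a HEAD request (headers only).
        set token [http::geturl $url -validate 1]
        set meta  [http::meta $token]
        http::cleanup $token
        foreach {name value} $meta {
            if {[string equal -nocase $name Last-Modified]} {
                # Fresh if our cached copy is at least as new as the
                # server's Last-Modified date.
                return [expr {[file mtime $cacheFile] >= [clock scan $value]}]
            }
        }
        return 0   ;# no Last-Modified header: treat as stale
    }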

--      Brent Welch     <[EMAIL PROTECTED]>
        http://www.scriptics.com
        Scriptics: The Tcl Platform Company

