Hi.

From a recent thread, originally about whether a proxy server can really be "transparent", I'll first quote a summary from "solprovider".

quote

I think the confusion is between a network proxy server and a Web
"reverse" proxy server.

A network proxy server handles NAT (Network Address Translation).  A
company internally uses private IP addresses (e.g. 10.*.*.*).  All
Internet traffic from these internal addresses uses a network proxy
server to reach the Internet.  The proxy server changes the
originating IP Addresses on the outbound packets from the internal
network IP address to the proxy's Internet IP address.  Responses from
the Internet server are received by the proxy server and changed again
to be sent to the originating computer on the internal network.  The
browser uses the Internet domain name so Cookies are not affected.
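
The address translation described above can be pictured with a small Python sketch. It is only an illustration, assuming port-based translation (the paragraph does not mention ports); the class name `Nat`, the addresses and the port numbers are made up for the example.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class Nat:
    """Toy translation table: internal endpoints share one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._next_port = first_port
        self._out: Dict[Endpoint, int] = {}   # internal endpoint -> public port
        self._back: Dict[int, Endpoint] = {}  # public port -> internal endpoint

    def outbound(self, src: Endpoint) -> Endpoint:
        """Rewrite the source of an outgoing packet to the public address."""
        if src not in self._out:
            port = self._next_port
            self._next_port += 1
            self._out[src] = port
            self._back[port] = src
        return (self.public_ip, self._out[src])

    def inbound(self, public_port: int) -> Endpoint:
        """Find the internal endpoint a response must be sent back to."""
        return self._back[public_port]


nat = Nat("203.0.113.5")
print(nat.outbound(("10.0.0.7", 51000)))
print(nat.inbound(40000))
```

Nothing here touches host names or HTTP headers, which is why Cookies are unaffected by this kind of proxy.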

A Web "reverse" proxy server handles multiple software applications
appearing as a single server.  The applications can be found on
multiple ports on one server or on multiple hardware servers.  Visitor
traffic to several applications goes to one IP Address.  The Web
server at that IP Address decides where the request should be sent
distinguishing based on the server name (using Virtual Servers) or the
path (using Rewrites).  If the applications use Cookies, the
application Cookies must be rewritten by the Web proxy server because
the browsers use the server name of the Web proxy server, not the
application servers.
1. The browser requests http://myapp.example.com.
2. The Web proxy server myapp.example.com sends the request to
myInternalApplicationServer.example.org.
3. The myInternalApplicationServer.example.org sends a response with a
Cookie for myInternalApplicationServer.example.org to the Web proxy
server.
4. The Web proxy server changes the Cookie from
myInternalApplicationServer.example.org to myapp.example.com.
5. The browser receives the Cookie for myapp.example.com and sends the
Cookie with future requests to the Web proxy server.
6. The Web proxy server sends the incoming Cookies with the request to
the application server as in #2.  (Depending on security, the incoming
Cookies may need to be changed to match the receiving server.)
7. GOTO #3.
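
Steps 3 to 5 can be sketched in Python as follows. This is an illustration of the idea, not how Apache implements it; the function name `rewrite_cookie_domain` and the sample Cookie are hypothetical. Only an explicit Domain attribute is rewritten, since a Cookie without one is tied by the browser to the host it requested anyway.

```python
def rewrite_cookie_domain(set_cookie: str, internal: str, public: str) -> str:
    """Replace Domain=<internal> with Domain=<public> in a Set-Cookie value."""
    parts = set_cookie.split(";")
    first = parts[0].strip()  # the name=value pair, never rewritten
    attrs = [p.strip() for p in parts[1:] if p.strip()]
    out = [first]
    for attr in attrs:
        name, sep, value = attr.partition("=")
        is_domain = sep and name.strip().lower() == "domain"
        if is_domain and value.strip().lower() == internal.lower():
            attr = f"{name.strip()}={public}"
        out.append(attr)
    return "; ".join(out)


header = "session=abc123; Domain=myInternalApplicationServer.example.org; Path=/"
print(rewrite_cookie_domain(
    header, "myInternalApplicationServer.example.org", "myapp.example.com"))
```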

Deciding the type of proxy server being used may be confusing.  An
Internet request for an internal server can be handled with either
type depending on the gateway server.
- Network proxy: The gateway uses firewall software for NAT -- all
requests for the internal server are sent to the internal server.  The
internal server sends Cookies using its Internet name.
- Web proxy: The gateway is a Web server.  Internal application
servers do not use Internet names so the gateway must translate URLs
and Cookies.

--
The specification in the OP was how to Web proxy requests:
1. Server receives request for http://www.example.com/amazon/...
2. Server passes request to http://www.amazon.com/...
3. Server translates response from amazon so the visitor receives
Cookies from .example.com.
4. Future requests are translated so the Web proxy server
(www.example.com) sends the requests including Cookies to amazon.com.

Read http://httpd.apache.org/docs/2.0/mod/mod_proxy.html
Read the sections applying to "reverse" proxies.  Ignore "forward"
proxying because that process is not transparent -- the client
computer must be configured to use a forward proxy.

I once had difficulty with ProxyPass and switched to using Rewrites so
I would handle this with something like:
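        # Proxy /amazon/... to www.amazon.com; the [P] flag needs mod_proxy loaded.
        RewriteEngine On
        RewriteRule ^/amazon/(.*)$ http://www.amazon.com/$1 [P]
        # Rewrite the Domain of Set-Cookie headers coming back from amazon.com.
        ProxyPassReverseCookieDomain amazon.com example.com
        # Rewrite Location headers on redirects so they point back under /amazon/.
        ProxyPassReverse /amazon/       http://www.amazon.com/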
This should handle Cookies and handle removing/adding "/amazon" in the path.
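
To make the path handling concrete, here is a rough Python sketch of the two mappings involved. It mimics the intent of the RewriteRule and of ProxyPassReverse and is not Apache's code; the helper names `to_backend` and `reverse_location` are invented, and the public prefix assumes the site is served as http://www.example.com.

```python
import re
from typing import Optional

RULE = re.compile(r"^/amazon/(.*)$")            # same pattern as the RewriteRule
BACKEND = "http://www.amazon.com/"
PUBLIC_PREFIX = "http://www.example.com/amazon/"  # assumed public address


def to_backend(path: str) -> Optional[str]:
    """Map an incoming request path to the backend URL, or None if no match."""
    match = RULE.match(path)
    return BACKEND + match.group(1) if match else None


def reverse_location(location: str) -> str:
    """Map a backend Location header back to the public /amazon/ prefix."""
    if location.startswith(BACKEND):
        return PUBLIC_PREFIX + location[len(BACKEND):]
    return location


print(to_backend("/amazon/books/index.html"))
print(reverse_location("http://www.amazon.com/cart"))
```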

We have not discussed changing links in pages from amazon.com to use
example.com.  This simple often-needed functionality has been ignored
by the Apache httpd project.  (This functionality was included in a
servlet I wrote in 1999.) Research "mod_proxy_html".

unquote

Now, I believe that there is still a third type of proxy, as follows:

When I configure my browser to use "ourproxy.ourdomain.com:8000" as its HTTP proxy, something else is going on, independently of whatever NAT may be performed by an internal router that connects my internal network to the Internet. Whenever I type a URL like "http://www.amazon.com" in my browser, the browser will not resolve "www.amazon.com" and send it a request like:
GET / HTTP/1.1
Host: www.amazon.com
...

Instead, my browser will send a request to "ourproxy.ourdomain.com:8000", as follows:
GET http://www.amazon.com/ HTTP/1.1
Host: www.amazon.com
...
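
The difference between the two request lines can be shown with a short Python sketch. The function name `build_request` is made up for illustration, only the request line and Host header are produced, and an empty path is normalized to "/".

```python
from urllib.parse import urlsplit


def build_request(url: str, via_proxy: bool) -> str:
    """Build a minimal GET request, either direct or through a forward proxy."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    if via_proxy:
        # Sent to the proxy: the full URL appears in the request line.
        target = f"{parts.scheme}://{parts.netloc}{path}"
    else:
        # Sent straight to the origin server: only the path appears.
        target = path
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


print(build_request("http://www.amazon.com", via_proxy=False))
print(build_request("http://www.amazon.com", via_proxy=True))
```

In both cases the Host header names "www.amazon.com"; only the connection endpoint and the request line differ.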

The server at "ourproxy.ourdomain.com:8000" will then look in its page cache to see whether it already has this page from a previous access. It will then either return the cached page, or retrieve the page anew from "www.amazon.com", cache it (maybe) and deliver the newly fetched page. (I am skipping a lot of details about freshness, no-cache, etc.)
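
As a very rough Python sketch of that lookup, under the same simplification (no freshness checks, no no-cache handling, everything is cached): the class `ForwardProxyCache` and the stand-in `fetch_from_origin` are hypothetical names, and the stored "page" is treated as an opaque blob.

```python
from typing import Callable, Dict, List


class ForwardProxyCache:
    """Toy cache: return a stored page if present, otherwise fetch and store it."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._pages: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if url in self._pages:
            return self._pages[url]   # served from the cache
        page = self._fetch(url)       # retrieved anew from the origin server
        self._pages[url] = page
        return page


origin_hits: List[str] = []


def fetch_from_origin(url: str) -> bytes:
    """Stand-in for a real network fetch; records which URLs were requested."""
    origin_hits.append(url)
    return b"<html>page</html>"


cache = ForwardProxyCache(fetch_from_origin)
cache.get("http://www.amazon.com/")
cache.get("http://www.amazon.com/")
print(len(origin_hits))  # the origin was contacted once for two requests
```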

The main (original) question, however, was: what happens in this case to cookies that may be set by "www.amazon.com"?

I personally imagine that such a proxy server (which I guess is the "forward" kind) caches only page contents, not the HTTP headers returned with each page, or am I wrong?

And in any case, if a page was returned from "www.amazon.com" along with a "Set-Cookie" HTTP header, it should not be cacheable by the proxy server, or am I wrong again?

And if such a proxy retrieves a new page from an external server, and the page comes back with a "Set-Cookie" header, this cookie header is then passed unchanged to the requesting browser, isn't it?

And the requesting browser should accept this cookie as originating from "www.amazon.com", even if technically the answer comes back from the proxy server, no?
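
To make that last expectation concrete, here is a small Python sketch of what I would expect the browser to do. It is an assumption about the behaviour, not a statement of how any particular browser is coded; the function name `store_cookies` and the dictionary "jar" are invented, and no domain validation is attempted.

```python
from http.cookies import SimpleCookie
from typing import Dict, Tuple
from urllib.parse import urlsplit


def store_cookies(jar: Dict[Tuple[str, str], str],
                  requested_url: str, set_cookie: str) -> None:
    """File each cookie under the host of the requested URL, not the proxy."""
    host = urlsplit(requested_url).hostname or ""
    cookie = SimpleCookie()
    cookie.load(set_cookie)
    for name, morsel in cookie.items():
        # An explicit Domain attribute wins; otherwise use the requested host.
        domain = morsel["domain"].lstrip(".") or host
        jar[(domain, name)] = morsel.value


jar: Dict[Tuple[str, str], str] = {}
# The TCP connection went to ourproxy.ourdomain.com:8000, but the URL
# requested was on www.amazon.com, and that is what the cookie is tied to.
store_cookies(jar, "http://www.amazon.com/", "session=abc123; Path=/")
print(jar)
```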

André

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.