Issue 3772: Distributed reliability test failures with NewHTTP
http://code.google.com/p/chromium/issues/detail?id=3772

Comment #13 by [EMAIL PROTECTED]:
Ok cool, chromium buildbot 4522 has some useful stacks which hold the  
answer:

-------------------
Problem sequence
-------------------

1. Make an HTTPS request through a tunnel (HTTP proxy).

2. The proxy responds with a 404 while establishing the tunnel. Moreover,
the response is HTTP/1.1 and keeps the connection persistent (keep-alive).
*** mini-dump data ***
[see "Bug (B) Data" section]

3. Since the connection is persistent, we recycle it back into the socket
pool. However, as the tunnel had not been established yet, the socket
that we recycle is of type TCPClientSocket (and not SSLClientSocket):

[HttpNetworkTransaction::DoReadBodyComplete()]:

     if (!keep_alive)
       connection_.set_socket(NULL);
     connection_.Reset();

*** mini-dump data ***
The |connection_.Reset()| line above is reached, with state:
- keep_alive = true
- establishing_tunnel_ = true
- connection_.socket() is of type TCPClientSocket

4. Later on, another HTTPS request is made to the same host. We request a
socket for group "proxy/.../https://....", and the pool returns
the socket that was recycled in step (3) -- namely a TCPClientSocket.

5. The code in DoInitConnectionComplete() implicitly expects the reused
socket to be of type SSLClientSocket. Based on this assumption,
|establishing_tunnel_| is left as false.

6. After sending the request, and getting back response headers, we reach  
this location in DidReadResponseHeaders():

   if (using_ssl_ && !establishing_tunnel_) {
     SSLClientSocket* ssl_socket =
         reinterpret_cast<SSLClientSocket*>(connection_.socket());
     ssl_socket->GetSSLInfo(&response_.ssl_info);
   }

Now invoking GetSSLInfo() may cause a crash, since connection_.socket() is  
of type TCPClientSocket and not SSLClientSocket as the
reinterpret_cast expects.
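
The sequence in steps (3) to (6) can be replayed with a small stand-alone
sketch. Everything in it is hypothetical: the classes are empty stand-ins
that only borrow the real class names, SocketPool and the group string are
made up, and dynamic_cast is used purely to make the type mismatch
observable. It is not meant as the actual fix.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the real socket classes; only the shape matters.
class ClientSocket {
 public:
  virtual ~ClientSocket() = default;
};
class TCPClientSocket : public ClientSocket {};
class SSLClientSocket : public ClientSocket {};

// Hypothetical pool: idle sockets are keyed only by group name, so the pool
// cannot tell whether a recycled socket ever completed the tunnel handshake.
class SocketPool {
 public:
  void Recycle(const std::string& group, std::unique_ptr<ClientSocket> s) {
    idle_[group] = std::move(s);
  }
  std::unique_ptr<ClientSocket> Request(const std::string& group) {
    auto it = idle_.find(group);
    if (it == idle_.end()) return nullptr;
    std::unique_ptr<ClientSocket> s = std::move(it->second);
    idle_.erase(it);
    return s;
  }

 private:
  std::map<std::string, std::unique_ptr<ClientSocket>> idle_;
};

// Replays steps 3-6. Returns true only if the reused socket really is an
// SSLClientSocket, i.e. only if calling GetSSLInfo() on it would be safe.
bool ReusedSocketIsSSL() {
  SocketPool pool;
  const std::string group = "proxy/example/https://host.example";  // made up

  // Step 3: the tunnel attempt got a keep-alive 404, so a plain TCP socket
  // is recycled under the HTTPS group.
  pool.Recycle(group, std::make_unique<TCPClientSocket>());

  // Step 4: a later HTTPS request for the same group reuses that socket.
  std::unique_ptr<ClientSocket> reused = pool.Request(group);
  assert(reused != nullptr);

  // Steps 5-6: a checked downcast reports the mismatch instead of crashing.
  return dynamic_cast<SSLClientSocket*>(reused.get()) != nullptr;
}
```

Calling ReusedSocketIsSSL() returns false here, which is the situation
the unchecked reinterpret_cast in step (6) cannot detect.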

-------------------
Bugs (1+)
-------------------

A. The first bug is that in step (3) we recycle a TCPClientSocket for an  
HTTPS address. This should be easy to fix by checking for
|establishing_tunnel_|.

B. There is at least one more bug: how is it that we get a 404 in
step (2)? In our experiments with a similar proxy server
(klee2:12923), it returns a non-keep-alive 501 for CONNECT requests.
The most convenient explanation is that a bug in the proxy server
causes it to return 404.
It is also possible, however, that there is another bug in our code (mixing
up responses, sending a GET instead of a CONNECT, etc. could
account for these symptoms too).

-------------------
Solutions
-------------------

Fixing (A) should stop the crashes; however, bug (B) needs
investigation in case it is another problem in our code.
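
A minimal sketch of what the check for (A) might look like, assuming the
guard is simply added to the existing keep-alive test. FakeConnection and
FinishBody are hypothetical names used for illustration only; the real
change would go in HttpNetworkTransaction::DoReadBodyComplete().

```cpp
#include <cassert>

// Hypothetical, stripped-down stand-in for the connection handle.
struct FakeConnection {
  bool has_socket = true;
  bool recycled = false;

  void ClearSocket() { has_socket = false; }  // like set_socket(NULL)

  // Assumed behavior, per step (3): a socket that is still attached when
  // Reset() runs goes back to the pool.
  void Reset() {
    recycled = has_socket;
    has_socket = false;
  }
};

// Returns true if the socket ended up being recycled.
bool FinishBody(FakeConnection& connection, bool keep_alive,
                bool establishing_tunnel) {
  // Drop the socket unless it is keep-alive AND the tunnel is fully set up,
  // so that a bare TCP socket is never pooled under an HTTPS group.
  if (!keep_alive || establishing_tunnel)
    connection.ClearSocket();
  connection.Reset();
  return connection.recycled;
}
```

With the state from the mini-dump (keep_alive = true,
establishing_tunnel_ = true) this sketch discards the socket instead of
recycling it.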

Huan expects to turn on full-memory dumps early next week. We could simply
postpone investigation of (B) until then: with all the data
(requests/responses) at our fingertips, debugging will go much faster.
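
Once the request bytes are available, one quick check for the "GET instead
of CONNECT" theory is to look at the method on the request line. The
helpers below are a hypothetical sketch for that purpose and are not
existing functions in our code.

```cpp
#include <cassert>
#include <string>

// Returns the method token, i.e. the text before the first space of a raw
// request. If there is no space, the whole string is returned.
std::string RequestMethod(const std::string& raw_request) {
  return raw_request.substr(0, raw_request.find(' '));
}

// A tunnel request to an HTTP proxy starts with "CONNECT host:port".
bool LooksLikeTunnelRequest(const std::string& raw_request) {
  return RequestMethod(raw_request) == "CONNECT";
}
```

For example, "CONNECT www.paypal.com:443 HTTP/1.1" passes the check, while
"GET /images/x-click-but04.gif HTTP/1.1" does not.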

-------------------
Bug (B) Data
-------------------

The limited data I have so far on bug (B) is:

URLs (7 listed, but there are more):

https://ssl.google-analytics.com/urchin.js
https://www.paypal.com/images/x-click-but04.gif
https://www.google.com/accounts/ServiceLogin?service=blogger&continue=https%3A%2F%2Fwww.blogger.com%2Fstart&passive=true&go=true&alinsu=1&aplinsu=1&alwf=true&skipvpage=true&rm=false&showra=1&fpui=2&naui=8
https://ad.yieldmanager.com/pixel?id=81194&t=2
https://oasn03.247realmedia.com/RealMedia/ads/adstream.track/1234?XE&epmAccountKey=1013&epmXTransKey=47&epmXtransStep=999&epmXtransCategory=&epmXtransItem=&epmXtransQuantity=&epmXTransRevenue=&XE
https://sam.t-online.com/directlogin?skinID=d4s_d_wirk2&sif=1&main=http%3A%2F%2Fwww.t-online.de%2F&tbxmaster=https://tbx2.t-online.de
https://www.ibm.com/dynamicnav/Controller?sid=111&sidCb=100:[EMAIL PROTECTED]:[EMAIL PROTECTED]:[EMAIL PROTECTED]:[EMAIL PROTECTED]:ibmCommonDynamicNavShowMI@&dc_subject=zz999&op=view&ts=1225760129468&country=us&langu [truncated]

For each of the URLs above:

- response_code: 404
- http_version: 1.1
- content_length: 1354

The fact that content_length is the same for each URL makes me curious to  
see what the response bytes were...


