[ 
https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Galić updated TS-1049:
---------------------------

    Backport to Version: 3.0.5  (was: 3.0.3)
    
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Igor Galić
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>
> A very reproducible bug where the body of an HTTPS POST request is never 
> forwarded to the origin server.
> Client submits an HTTPS POST request to TS, which is supposed to forward it to 
> the backend/origin server via HTTP.  TS processes the HTTP headers and 
> establishes a connection to the origin server, but the body of the HTTPS POST 
> is never read.  The request hangs until the client times out and shuts down the 
> connection.
> To reproduce:
> # Client connects to TS using HTTPS (works OK if it is just HTTP).
> # It must be a POST request.
> # TS must use at least 2 worker threads.
> # Easier to reproduce when the connection to the origin server is HTTP (not 
> HTTPS).
> # POST body must be large enough so that the HTTP request headers and POST 
> body do *NOT* fit within the same TCP packet. (2000 bytes is a good size)
> # I can consistently reproduce this problem using 2 separate clients each 
> simultaneously submitting 2 requests back to back (i.e., 2 requests from each 
> client, a total of 4 requests).  This gives you a high probability that at 
> least one of the requests would hang.
> Observation:
> # Thread A accepted and processed the HTTP headers, and called 
> "UnixNetProcessor::connect_re" to prepare a new connection to the origin 
> server.
> # Thread A must not have read the body of the POST.  Otherwise, it works fine.
> # Thread B was assigned the task to handle the origin server connection.  If 
> the same thread A was picked, then everything works fine.
> # Apparently, one of the first things that thread B does is to acquire the 
> mutex for reading from the client.  (Why does it do that??)
> # While thread B was holding the mutex, thread A proceeded in 
> "SSLNetVConnection::net_read_io", tried and failed to acquire the mutex.  
> Thread A typically retried "SSLNetVConnection::net_read_io" soon afterwards, 
> but gave up after the second failure.  If thread B released the mutex soon 
> enough, thread A could proceed and everything worked.
> # From this point, the body of the POST is never read from the client, 
> there is nothing to be proxied to the origin server, and neither the consumer 
> nor the producer task is scheduled to run again until the client 
> times out.  I tried setting the client-side timeout to as long as 3-5 
> minutes, and TS did not recover by itself until the client closed the 
> connection.
> This is the first time I have used this bug system.  Please let me know how I 
> can provide the configuration files, trace logs, etc.  Thanks!
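
The failure mode described in the observation can be illustrated with a toy model. This is only a sketch of the reporter's hypothesis, not Traffic Server code: `try_read`, `client_mutex`, and `max_attempts` are hypothetical names, and the two-attempt limit is taken from the reporter's description rather than from the source tree. It shows how a read handler that relies on a non-blocking try-lock, and is not rescheduled after missing it, can leave the POST body unread.

```python
import threading


def try_read(client_mutex, max_attempts=2):
    """Toy model of a read handler that uses a non-blocking try-lock.

    Returns the attempt number on which the lock was acquired, or None if
    every attempt missed and the handler gave up without being rescheduled.
    """
    for attempt in range(1, max_attempts + 1):
        if client_mutex.acquire(blocking=False):
            try:
                return attempt  # the POST body would be read here
            finally:
                client_mutex.release()
    return None  # gave up: nothing triggers another read


client_mutex = threading.Lock()

# "Thread B" holds the client-side mutex while it sets up the origin connection.
client_mutex.acquire()
stalled = try_read(client_mutex)  # "thread A" misses the lock on both attempts
client_mutex.release()

# Once the mutex is free, the same read succeeds on the first attempt.
recovered = try_read(client_mutex)

print(stalled, recovered)  # prints "None 1"
```

In this model the stall is permanent only because nothing retries the read after the second miss, which matches the reporter's note that the request proceeds normally if thread B releases the mutex soon enough.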

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

