[ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Igor Galić updated TS-1049:
---------------------------

    Backport to Version: 3.0.5  (was: 3.0.3)

> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Igor Galić
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>
> A very reproducible bug where the body of an HTTPS POST request is never
> forwarded to the origin server.
> The client submits an HTTPS POST request to TS, which is supposed to
> forward it to the backend/origin server via HTTP. TS processes the HTTP
> headers and establishes a connection to the origin server, but the body of
> the HTTPS POST is never read. The request hangs until the client times out
> and shuts down the connection.
> To reproduce:
> # The client connects to TS using HTTPS (it works OK with plain HTTP).
> # It must be a POST request.
> # TS must use at least 2 worker threads.
> # It is easier to reproduce when the connection to the origin server is
> HTTP (not HTTPS).
> # The POST body must be large enough that the HTTP request headers and the
> POST body do *NOT* fit within the same TCP packet. (2000 bytes is a good
> size.)
> # I can consistently reproduce this problem using 2 separate clients, each
> simultaneously submitting 2 requests back to back (i.e., 2 requests from
> each client, a total of 4 requests). This gives a high probability that at
> least one of the requests will hang.
> Observation:
> # Thread A accepted and processed the HTTP headers, and called
> "UnixNetProcessor::connect_re" to prepare a new connection to the origin
> server.
> # Thread A must not have read the body of the POST yet. If it has, it works
> fine.
> # Thread B was assigned the task of handling the origin server connection.
> If the same thread A was picked, then everything works fine.
> # Apparently, one of the first things that thread B does is to acquire the
> mutex for reading from the client. (Why does it do that?)
> # While thread B was holding the mutex, thread A proceeded into
> "SSLNetVConnection::net_read_io", and tried and failed to acquire the
> mutex. Thread A typically retried "SSLNetVConnection::net_read_io" soon
> afterwards, but gave up after the second failure. If thread B released the
> mutex soon enough, thread A could proceed and everything worked.
> # From this point on, the body of the POST is never read from the client,
> so there is nothing to be proxied to the origin server, and neither the
> consumer nor the producer task is scheduled to run again until the client
> times out. I tried setting the client-side timeout to as long as 3-5
> minutes, and TS does not recover by itself until the client closes the
> connection.
> This is the first time I have used this bug system. Please let me know how
> I can provide the configuration files, trace logs, etc. Thanks!
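
The observation in the report (thread A tries the client read mutex, retries once, then gives up while thread B still holds it) can be illustrated with a small toy model. This is only a sketch in Python, not Traffic Server code: the names simulate, MAX_ATTEMPTS and hold_ticks are hypothetical, and the two-attempt limit is taken from the reporter's description rather than from the source tree.

```python
import threading

# Assumption taken from the report: thread A gives up after the second failure.
MAX_ATTEMPTS = 2


def simulate(hold_ticks):
    """Toy model of the reported race (not Traffic Server code).

    hold_ticks is the number of thread A's lock attempts that happen
    while "thread B" is still holding the client read mutex.
    """
    mutex = threading.Lock()
    mutex.acquire()  # "thread B" takes the client read mutex first
    held_by_b = True

    for attempt in range(MAX_ATTEMPTS):
        # "Thread B" lets go once it has held the mutex for hold_ticks attempts.
        if held_by_b and attempt >= hold_ticks:
            mutex.release()
            held_by_b = False

        # "Thread A" uses a non-blocking try-lock, as described in the report.
        if mutex.acquire(blocking=False):
            mutex.release()
            return "body read"

    # Both attempts failed and nothing reschedules the read.
    return "read never rescheduled"
```

With hold_ticks of 0 or 1 the model returns "body read" (thread B released the mutex soon enough); with 2 or more it returns "read never rescheduled", which corresponds to the hang that lasts until the client times out.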