[ https://issues.apache.org/jira/browse/TS-4838?focusedWorklogId=28904&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-28904 ]
ASF GitHub Bot logged work on TS-4838: -------------------------------------- Author: ASF GitHub Bot Created on: 13/Sep/16 03:12 Start Date: 13/Sep/16 03:12 Worklog Time Spent: 10m Work Description: Github user jpeach closed the pull request at: https://github.com/apache/trafficserver/pull/1002 Issue Time Tracking ------------------- Worklog Id: (was: 28904) Time Spent: 50m (was: 40m) > After TS-3612 restructuring, very slow SSL sessions and > HttpSM::state_raw_http_server_open errors > ------------------------------------------------------------------------------------------------- > > Key: TS-4838 > URL: https://issues.apache.org/jira/browse/TS-4838 > Project: Traffic Server > Issue Type: Bug > Components: Core, SSL > Affects Versions: 6.2.0, 7.0.0 > Environment: CentOS/RHEL 7.2, x86_64 > Reporter: Dimitry Andric > Assignee: James Peach > Fix For: 7.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > We have been using TrafficServer 5.3.2 for quite some time now, for forward > proxying of a number of different HTML5 applications, one of the most > important ones being YouTube's TV interface, e.g. https://youtube.com/tv. > This is all hosted on CentOS 7.2 x86_64 machines. > We recently upgraded to 6.2.0, and then started having problems with the > CONNECT requests for port 443 which are generated by the YouTube app. It > seems like these connections are "stalled" somehow, sometimes for >10 > seconds. Meanwhile, {{diags.log}} is getting spammed lots of the following: > {noformat} > [Sep 9 16:45:47.683] Server {0x2b3e50c0b700} ERROR: > [HttpSM::state_raw_http_server_open] event: EVENT_INTERVAL state: 0 > server_entry: (nil) > {noformat} > Requests that seem to stall are most likely all of the CONNECT kind, e.g.: > {noformat} > 1473432382.474 30405 127.0.0.1 TCP_MISS/200 4916 CONNECT > ad.doubleclick.net:443/ - DIRECT/ad.doubleclick.net - > 1473432382.481 30411 127.0.0.1 TCP_MISS/200 54024 CONNECT i9.ytimg.com:443/ - > DIRECT/i9.ytimg.com - > 1473432382.486 30417 127.0.0.1 TCP_MISS/200 5389 CONNECT > pagead2.googlesyndication.com:443/ - DIRECT/pagead2.googlesyndication.com - > 1473432390.451 42772 127.0.0.1 TCP_MISS/200 5198 CONNECT csi.gstatic.com:443/ > - DIRECT/csi.gstatic.com - > 1473432390.459 43833 127.0.0.1 TCP_MISS/200 11610 CONNECT > www.youtube.com:443/ - DIRECT/www.youtube.com - > 1473432390.483 38414 127.0.0.1 TCP_MISS/200 2870983 CONNECT > r17---sn-5hnednl7.googlevideo.com:443/ - > DIRECT/r17---sn-5hnednl7.googlevideo.com - > {noformat} > As part of figuring out how to diagnose this, I tried a downgrade to > TrafficServer 6.1.1, and this made all the stalling and problems disappear. > Afterwards, I did a {{git bisect}} on master, from the branch point of 6.1 to > the branch point of 6.2, and I ended up at [commit > af76977|https://git-dual.apache.org/repos/asf?p=trafficserver.git;a=commit;h=af76977adb9f3c0296a232688bbcb5a1421a6768]: > {quote} > Author: Susan Hinrichs <shinr...@draggingnagging.corp.ne1.yahoo.com> > Date: Wed Apr 13 19:57:39 2016 +0000 > TS-3612: Restructure client session and transaction processing. This > closes #570. > {quote} > Unfortunately, this is a quite big refactoring commit, so it is not possible > to revert it individually to see whether it improves things. > I read TS-3612 and #570, and I saw there were also a number of follow-up > commits to fix various problems with it, but this particular problem of > stalled SSL connections is still occurring with master as of today, > 2016-09-09. > I realize that this report is still missing reproduction details, since it is > tricky to analyze what the YouTube app is doing, and simple {{curl https://}} > tests appear to go fast, and don't seem to trigger any stalling. But YouTube > itself is pretty easy to try out, I think. -- This message was sent by Atlassian JIRA (v6.3.4#6332)