[ https://issues.apache.org/jira/browse/TS-4717?focusedWorklogId=31574&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-31574 ]
ASF GitHub Bot logged work on TS-4717:
--------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Nov/16 02:21
            Start Date: 04/Nov/16 02:21
    Worklog Time Spent: 10m
      Work Description: GitHub user sekimura opened a pull request:

    https://github.com/apache/trafficserver/pull/1188

    TS-4717: Http2 stack explosion.

    This is a backport PR of https://github.com/apache/trafficserver/pull/842

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sekimura/trafficserver ts-4717

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafficserver/pull/1188.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1188

----
commit 1e6cc95b0b7c3f1c72750289ee37097afc4d782e
Author: Susan Hinrichs <shinr...@ieee.org>
Date:   2016-08-08T18:36:41Z

    TS-4717: Http2 stack explosion.

----

Issue Time Tracking
-------------------

    Worklog Id:     (was: 31574)
    Time Spent: 3.5h  (was: 3h 20m)

> Http2 stack explosion
> ---------------------
>
>                 Key: TS-4717
>                 URL: https://issues.apache.org/jira/browse/TS-4717
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HTTP/2
>            Reporter: Susan Hinrichs
>            Assignee: Susan Hinrichs
>             Fix For: 7.0.0
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> We see this periodically with high traffic loads. ATS crashes with 7000+
> frames on the stack. The bulk of the frames are the following frame
> sequence.
> {code}
> #117 0x00000000005159c8 in Continuation::handleEvent (this=0x2b0bdd101b90, event=100, data=0x2b0bad0c7cf0)
>     at ../iocore/eventsystem/I_Continuation.h:150
> #118 0x000000000064c05d in Http2ClientSession::state_start_frame_read (this=0x2b0bdd101b90, event=100, edata=0x2b0bad0c7cf0)
>     at Http2ClientSession.cc:451
> #119 0x000000000064b0af in Http2ClientSession::main_event_handler (this=0x2b0bdd101b90, event=100, edata=0x2b0bad0c7cf0)
>     at Http2ClientSession.cc:292
> #120 0x00000000005159c8 in Continuation::handleEvent (this=0x2b0bdd101b90, event=100, data=0x2b0bad0c7cf0)
>     at ../iocore/eventsystem/I_Continuation.h:150
> #121 0x000000000064c386 in Http2ClientSession::state_complete_frame_read (this=0x2b0bdd101b90, event=100, edata=0x2b0bad0c7cf0)
>     at Http2ClientSession.cc:483
> #122 0x000000000064b0af in Http2ClientSession::main_event_handler (this=0x2b0bdd101b90, event=100, edata=0x2b0bad0c7cf0)
>     at Http2ClientSession.cc:292
> #123 0x00000000005159c8 in Continuation::handleEvent (this=0x2b0bdd101b90, event=100, data=0x2b0bad0c7cf0)
>     at ../iocore/eventsystem/I_Continuation.h:150
> #124 0x000000000064c05d in Http2ClientSession::state_start_frame_read (this=0x2b0bdd101b90, event=100, edata=0x2b0bad0c7cf0)
>     at Http2ClientSession.cc:451
> {code}
> We had cherry-picked the fix for TS-4209 to correctly enforce the
> concurrent stream limit. But in the latest crash of this type, it looks
> like we are pulling small items from cache, so the stream lives and dies
> on the stack. The concurrent active connection count never reaches the
> limit.
> I am going to try to change the
> state_start_frame_read/state_complete_frame_read logic from recursing
> handlers to a loop.
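
For illustration only, the recursion-to-loop change described above can be sketched in a few lines of C++. This is not the code from the pull request: FrameReaderSketch, its members, and append_frame_sketch are hypothetical names invented here, and the two step functions are simplified stand-ins for the real handlers. The only protocol fact assumed is the 9-byte HTTP/2 frame header with a 24-bit length field. The point is that the driver loop calls each step and returns, so the stack depth stays constant no matter how many complete frames are already buffered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; none of these names exist in Traffic Server.
struct FrameReaderSketch {
  static constexpr std::size_t kHeaderLen = 9; // HTTP/2 frame header size

  std::vector<std::uint8_t> buffer;  // bytes received so far
  std::size_t offset = 0;            // first unconsumed byte in buffer
  std::size_t frames_handled = 0;    // running total of consumed frames

  // Step 1: if a full header is buffered, decode its 24-bit payload length.
  bool start_frame_read(std::size_t &payload_len) const {
    if (buffer.size() - offset < kHeaderLen) {
      return false; // need more data
    }
    payload_len = (static_cast<std::size_t>(buffer[offset]) << 16) |
                  (static_cast<std::size_t>(buffer[offset + 1]) << 8) |
                  static_cast<std::size_t>(buffer[offset + 2]);
    return true;
  }

  // Step 2: if the whole payload is buffered, consume header and payload.
  bool complete_frame_read(std::size_t payload_len) {
    if (buffer.size() - offset < kHeaderLen + payload_len) {
      return false; // need more data
    }
    offset += kHeaderLen + payload_len;
    ++frames_handled;
    return true;
  }

  // Driver: a loop instead of the two steps re-dispatching to each other.
  // Returns the number of frames consumed by this call.
  std::size_t on_read_ready() {
    std::size_t handled = 0;
    std::size_t payload_len = 0;
    while (start_frame_read(payload_len) && complete_frame_read(payload_len)) {
      ++handled;
    }
    assert(offset <= buffer.size());
    return handled;
  }
};

// Hypothetical helper: append one zero-filled frame with the given payload size.
inline void append_frame_sketch(std::vector<std::uint8_t> &out, std::size_t payload_len) {
  out.push_back(static_cast<std::uint8_t>((payload_len >> 16) & 0xff));
  out.push_back(static_cast<std::uint8_t>((payload_len >> 8) & 0xff));
  out.push_back(static_cast<std::uint8_t>(payload_len & 0xff));
  out.insert(out.end(), std::size_t{6}, std::uint8_t{0});       // type, flags, stream id
  out.insert(out.end(), payload_len, std::uint8_t{0});          // payload
}
```

Filling the buffer with thousands of small frames via append_frame_sketch and calling on_read_ready() once consumes all of them in a single call frame, which is the behaviour the looped handlers are meant to provide.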