[ https://issues.apache.org/jira/browse/TS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phil Sorber updated TS-2848:
----------------------------
    Fix Version/s:     (was: 5.3.0)
                   6.0.0

> ATS crash in HttpSM::release_server_session
> -------------------------------------------
>
>                 Key: TS-2848
>                 URL: https://issues.apache.org/jira/browse/TS-2848
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HTTP
>            Reporter: Feifei Cai
>            Assignee: Alan M. Carroll
>              Labels: crash, review, yahoo
>             Fix For: 6.0.0
>
>         Attachments: TS-2848.diff
>
>
> We run ATS on production hosts and have noticed crashes with the following
> stack trace. The crash is infrequent: it occurs about once a week, or even
> less often. It has recurred over the last 2 months, but we have not found the
> root cause and cannot reproduce the crash on demand; we can only wait for it
> to happen.
> {noformat}
> NOTE: Traffic Server received Sig 11: Segmentation fault
> /home/y/bin/traffic_server - STACK TRACE:
> /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500]
> /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
> /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
> /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e]
> /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098]
> /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2]
> /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93]
> /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f]
> /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373]
> /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d]
> /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944]
> /home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
> /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
> /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
> /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
> /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c368888d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
> /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
> /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
> /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
> /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828]
> /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098]
> /home/y/bin/traffic_server[0x68606b]
> /home/y/bin/traffic_server[0x688a14]
> /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x681582]
> /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a89bf]
> /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4a3)[0x6a93a3]
> /home/y/bin/traffic_server[0x6a785a]
> /lib64/libpthread.so.0(+0x321e607851)[0x2b69adf87851]
> /lib64/libc.so.6(clone+0x6d)[0x321e2e890d]
> {noformat}
> gdb back trace:
> {noformat}
> (gdb) bt
> #0 0x0000000000529eb5 in HttpSM::release_server_session (this=0x2b12bc107bd0, serve_from_cache=true) at HttpSM.cc:4892
> #1 0x00000000005362bb in HttpSM::set_next_state (this=0x2b12bc107bd0) at HttpSM.cc:7010
> #2 0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at HttpSM.cc:1557
> #3 0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0, event=0, data=0x0) at HttpSM.cc:1489
> #4 0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at HttpSM.cc:6815
> #5 0x000000000051e422 in HttpSM::do_hostdb_lookup (this=0x2b12bc107bd0) at HttpSM.cc:3919
> #6 0x0000000000536b8d in HttpSM::set_next_state (this=0x2b12bc107bd0) at HttpSM.cc:6914
> #7 0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at HttpSM.cc:1557
> #8 0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0, event=0, data=0x0) at HttpSM.cc:1489
> #9 0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at HttpSM.cc:6815
> #10 0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at HttpSM.cc:1557
> #11 0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0, event=0, data=0x0) at HttpSM.cc:1489
> #12 0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at HttpSM.cc:6815
> #13 0x000000000052ff8e in HttpSM::state_cache_open_read (this=0x2b12bc107bd0, event=1102, data=0x2b13240e2190) at HttpSM.cc:2457
> #14 0x0000000000533098 in HttpSM::main_handler (this=0x2b12bc107bd0, event=1102, data=0x2b13240e2190) at HttpSM.cc:2516
> #15 0x000000000050bef2 in handleEvent (this=0x2b12bc109d88, event=<value optimized out>, data=0x2b13240e2190) at ../../iocore/eventsystem/I_Continuation.h:146
> #16 HttpCacheSM::state_cache_open_read (this=0x2b12bc109d88, event=<value optimized out>, data=0x2b13240e2190) at HttpCacheSM.cc:118
> #17 0x00000000005f0a93 in handleEvent (this=0x2b13240e2190, event=<value optimized out>) at ../../iocore/eventsystem/I_Continuation.h:146
> #18 CacheVC::callcont (this=0x2b13240e2190, event=<value optimized out>) at ../../iocore/cache/P_CacheInternal.h:666
> #19 0x000000000065934f in CacheVC::openReadStartHead (this=0x2b13240e2190, event=3900, e=0x0) at CacheRead.cc:1193
> #20 0x0000000000634b7d in handleEvent (this=0x2b13240e2190, event=<value optimized out>, e=<value optimized out>) at ../../iocore/eventsystem/I_Continuation.h:146
> #21 CacheVC::handleReadDone (this=0x2b13240e2190, event=<value optimized out>, e=<value optimized out>) at Cache.cc:2257
> #22 0x00000000005f0bf5 in handleEvent (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at ../../iocore/eventsystem/I_Continuation.h:146
> #23 AIOCallbackInternal::io_complete (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at ../../iocore/aio/P_AIO.h:123
> #24 0x00000000006a89bf in handleEvent (this=0x2b11e0404010, e=0x2b12cc08afe0, calling_code=1) at I_Continuation.h:146
> #25 EThread::process_event (this=0x2b11e0404010, e=0x2b12cc08afe0, calling_code=1) at UnixEThread.cc:141
> #26 0x00000000006a953b in EThread::execute (this=0x2b11e0404010) at UnixEThread.cc:192
> #27 0x00000000006a785a in spawn_thread_internal (a=0x10a6e20) at Thread.cc:88
> #28 0x00002b11daee9851 in start_thread () from /lib64/libpthread.so.0
> #29 0x00000038174e890d in clone () from /lib64/libc.so.6
> {noformat}
> The code where the crash happens is shown below. The crash comes from
> dereferencing t_state.current.server, which is NULL under some conditions.
> The code does not check for a NULL pointer here, so I think one of the
> following holds:
> # t_state.current.server should never be NULL here, and we can add an assert.
> # OR t_state.current.server can legitimately be NULL, and we should add a
> check here, possibly with some additional handling.
> I'm not familiar with the HTTP state machine code. Could someone point out
> which interpretation is right? Comments on this or on the potential root
> causes would be appreciated. Thank you!
> {code:title=proxy/http/HttpSM.cc}
> // void HttpSM::release_server_session()
> //
> // Called when we are not tunneling a response from the
> // server. If the session is keep alive, release it back to the
> // shared pool, otherwise close it
> //
> void
> HttpSM::release_server_session(bool serve_from_cache)
> {
>   if (server_session != NULL) {
>     if (t_state.current.server->keep_alive == HTTP_KEEPALIVE &&
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
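
To make the two options in the report concrete, here is a minimal sketch of the second one (a NULL check) with the first one (an assert) noted in a comment. It is not the Traffic Server code and not the attached TS-2848.diff: MiniHttpSM, State, ConnectionAttributes, ServerSession, ReleaseResult and HTTP_NO_KEEPALIVE are hypothetical stand-ins that model only the members named in the report. Treating a NULL t_state.current.server as "not keep-alive" and closing the session is an assumption, and the rest of the real condition (truncated after `&&` in the report) is not modeled.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Traffic Server types. Only the members
// named in the report are modeled; none of this is the real ATS code.
enum KeepAlive { HTTP_NO_KEEPALIVE, HTTP_KEEPALIVE };

struct ConnectionAttributes {
  KeepAlive keep_alive = HTTP_NO_KEEPALIVE;
};

struct ServerSession {
  bool released = false; // handed back to the shared pool
  bool closed   = false; // torn down
};

struct State {
  struct Current {
    ConnectionAttributes *server = nullptr; // can be NULL, per the report
  } current;
};

enum class ReleaseResult { NoSession, Released, Closed };

struct MiniHttpSM {
  State t_state;
  ServerSession *server_session = nullptr;

  ReleaseResult release_server_session()
  {
    if (server_session == nullptr) {
      return ReleaseResult::NoSession;
    }

    // Option 1 from the report would instead be:
    //   assert(t_state.current.server != nullptr);
    // Option 2 (assumed policy): a missing server record means the
    // session is not treated as reusable.
    ConnectionAttributes *server = t_state.current.server;
    const bool reusable = server != nullptr && server->keep_alive == HTTP_KEEPALIVE;

    ServerSession *ss = server_session;
    server_session    = nullptr; // the state machine gives up the session
    if (reusable) {
      ss->released = true;
      return ReleaseResult::Released;
    }
    ss->closed = true;
    return ReleaseResult::Closed;
  }
};
```

With this shape, calling release_server_session() while t_state.current.server is NULL closes the session instead of dereferencing the pointer. Whether closing is the right handling, or whether the NULL pointer points to a bug earlier in the state machine (which would favor the assert), is exactly the open question in the report.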