[ 
https://issues.apache.org/jira/browse/TS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susan Hinrichs reassigned TS-2848:
----------------------------------

    Assignee: Susan Hinrichs

> ATS crash in HttpSM::release_server_session
> -------------------------------------------
>
>                 Key: TS-2848
>                 URL: https://issues.apache.org/jira/browse/TS-2848
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HTTP
>            Reporter: Feifei Cai
>            Assignee: Susan Hinrichs
>              Labels: crash, yahoo
>             Fix For: sometime
>
>         Attachments: TS-2848.diff
>
>
> We deploy ATS on production hosts and have noticed crashes with the following 
> stack trace. The crash is infrequent, occurring roughly once a week or at even 
> longer intervals. It has recurred over the last 2 months, but the root cause has 
> not been found and we cannot reproduce it at will; we can only wait for it to happen.
> {noformat}
> NOTE: Traffic Server received Sig 11: Segmentation fault
> /home/y/bin/traffic_server - STACK TRACE:
> /lib64/libpthread.so.0(+0x321e60f500)[0x2b69adf8f500]
> /home/y/bin/traffic_server(_ZN6HttpSM22release_server_sessionEb+0x35)[0x529eb5]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x2db)[0x5362bb]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
> /home/y/bin/traffic_server(_ZN6HttpSM16do_hostdb_lookupEv+0x282)[0x51e422]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0xbad)[0x536b8d]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x1f2)[0x5361d2]
> /home/y/bin/traffic_server(_ZN6HttpSM21state_cache_open_readEiPv+0xfe)[0x52ff8e]
> /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098]
> /home/y/bin/traffic_server(_ZN11HttpCacheSM21state_cache_open_readEiPv+0x1b2)[0x50bef2]
> /home/y/bin/traffic_server(_ZN7CacheVC8callcontEi+0x53)[0x5f0a93]
> /home/y/bin/traffic_server(_ZN7CacheVC17openReadStartHeadEiP5Event+0x7cf)[0x65934f]
> /home/y/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationP7INK_MD5P7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePci+0x383)[0x656373]
> /home/y/bin/traffic_server(_ZN14CacheProcessor9open_readEP12ContinuationP3URLbP7HTTPHdrP21CacheLookupHttpConfigl13CacheFragType+0xad)[0x633a6d]
> /home/y/bin/traffic_server(_ZN11HttpCacheSM9open_readEP3URLP7HTTPHdrP21CacheLookupHttpConfigl+0x94)[0x50b944]
> /home/y/bin/traffic_server(_ZN6HttpSM24do_cache_lookup_and_readEv+0xf3)[0x51d893]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x722)[0x536702]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x49d)[0x53546d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
> /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
> /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
> /home/y/bin/traffic_server(_ZN6HttpSM14set_next_stateEv+0x85d)[0x53683d]
> /home/y/bin/traffic_server(_ZN6HttpSM17handle_api_returnEv+0x3aa)[0x53537a]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x2b0)[0x52dbd0]
> /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
> /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
> /home/y/libexec64/trafficserver/header_rewrite.so(+0x1288d)[0x2b69c368888d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
> /home/y/bin/traffic_server(_ZN6HttpSM18state_api_callbackEiPv+0x8b)[0x53328b]
> /home/y/bin/traffic_server(TSHttpTxnReenable+0x404)[0x4b9b14]
> /home/y/libexec64/trafficserver/header_filter.so(+0x2d5d)[0x2b69c3471d5d]
> /home/y/bin/traffic_server(_ZN6HttpSM17state_api_calloutEiPv+0x114)[0x52da34]
> /home/y/bin/traffic_server(_ZN6HttpSM33state_read_server_response_headerEiPv+0x398)[0x530828]
> /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533098]
> /home/y/bin/traffic_server[0x68606b]
> /home/y/bin/traffic_server[0x688a14]
> /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x681582]
> /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a89bf]
> /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4a3)[0x6a93a3]
> /home/y/bin/traffic_server[0x6a785a]
> /lib64/libpthread.so.0(+0x321e607851)[0x2b69adf87851]
> /lib64/libc.so.6(clone+0x6d)[0x321e2e890d]
> {noformat}
> gdb back trace:
> {noformat}
> (gdb) bt
> #0  0x0000000000529eb5 in HttpSM::release_server_session (this=0x2b12bc107bd0,
> serve_from_cache=true) at HttpSM.cc:4892
> #1  0x00000000005362bb in HttpSM::set_next_state (this=0x2b12bc107bd0) at
> HttpSM.cc:7010
> #2  0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at
> HttpSM.cc:1557
> #3  0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0,
> event=0, data=0x0) at HttpSM.cc:1489
> #4  0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at
> HttpSM.cc:6815
> #5  0x000000000051e422 in HttpSM::do_hostdb_lookup (this=0x2b12bc107bd0) at
> HttpSM.cc:3919
> #6  0x0000000000536b8d in HttpSM::set_next_state (this=0x2b12bc107bd0) at
> HttpSM.cc:6914
> #7  0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at
> HttpSM.cc:1557
> #8  0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0,
> event=0, data=0x0) at HttpSM.cc:1489
> #9  0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at
> HttpSM.cc:6815
> #10 0x000000000053537a in HttpSM::handle_api_return (this=0x2b12bc107bd0) at
> HttpSM.cc:1557
> #11 0x000000000052dbd0 in HttpSM::state_api_callout (this=0x2b12bc107bd0,
> event=0, data=0x0) at HttpSM.cc:1489
> #12 0x00000000005361d2 in HttpSM::set_next_state (this=0x2b12bc107bd0) at
> HttpSM.cc:6815
> #13 0x000000000052ff8e in HttpSM::state_cache_open_read (this=0x2b12bc107bd0,
> event=1102, data=0x2b13240e2190) at HttpSM.cc:2457
> #14 0x0000000000533098 in HttpSM::main_handler (this=0x2b12bc107bd0,
> event=1102, data=0x2b13240e2190) at HttpSM.cc:2516
> #15 0x000000000050bef2 in handleEvent (this=0x2b12bc109d88, event=<value
> optimized out>, data=0x2b13240e2190) at
> ../../iocore/eventsystem/I_Continuation.h:146
> #16 HttpCacheSM::state_cache_open_read (this=0x2b12bc109d88, event=<value
> optimized out>, data=0x2b13240e2190) at HttpCacheSM.cc:118
> #17 0x00000000005f0a93 in handleEvent (this=0x2b13240e2190, event=<value
> optimized out>) at ../../iocore/eventsystem/I_Continuation.h:146
> #18 CacheVC::callcont (this=0x2b13240e2190, event=<value optimized out>) at
> ../../iocore/cache/P_CacheInternal.h:666
> #19 0x000000000065934f in CacheVC::openReadStartHead (this=0x2b13240e2190,
> event=3900, e=0x0) at CacheRead.cc:1193
> #20 0x0000000000634b7d in handleEvent (this=0x2b13240e2190, event=<value
> optimized out>, e=<value optimized out>)
>     at ../../iocore/eventsystem/I_Continuation.h:146
> #21 CacheVC::handleReadDone (this=0x2b13240e2190, event=<value optimized out>,
> e=<value optimized out>) at Cache.cc:2257
> #22 0x00000000005f0bf5 in handleEvent (this=<value optimized out>, 
> event=<value
> optimized out>, data=<value optimized out>)
>     at ../../iocore/eventsystem/I_Continuation.h:146
> #23 AIOCallbackInternal::io_complete (this=<value optimized out>, event=<value
> optimized out>, data=<value optimized out>) at ../../iocore/aio/P_AIO.h:123
> #24 0x00000000006a89bf in handleEvent (this=0x2b11e0404010, e=0x2b12cc08afe0,
> calling_code=1) at I_Continuation.h:146
> #25 EThread::process_event (this=0x2b11e0404010, e=0x2b12cc08afe0,
> calling_code=1) at UnixEThread.cc:141
> #26 0x00000000006a953b in EThread::execute (this=0x2b11e0404010) at
> UnixEThread.cc:192
> #27 0x00000000006a785a in spawn_thread_internal (a=0x10a6e20) at Thread.cc:88
> #28 0x00002b11daee9851 in start_thread () from /lib64/libpthread.so.0
> #29 0x00000038174e890d in clone () from /lib64/libc.so.6
> {noformat}
> The code where the crash happens is shown below. The crash is caused by 
> dereferencing t_state.current.server, which is NULL under some conditions. 
> There is no NULL check here, so I think one of the following applies:
> # t_state.current.server should never be NULL, so we can add an assert here.
> # OR t_state.current.server can legitimately be NULL, so we should add a check 
> here, and maybe some additional handling.
> I'm not familiar with the Http State Machine code. Could someone help point 
> out which interpretation is right? I would appreciate any comments on this 
> or on the potential root causes. Thank you!
> {code:title=proxy/http/HttpSM.cc}
> // void HttpSM::release_server_session()
> //
> //  Called when we are not tunneling a response from the
> //   server.  If the session is keep alive, release it back to the
> //   shared pool, otherwise close it
> //
> void
> HttpSM::release_server_session(bool serve_from_cache)
> {
>   if (server_session != NULL) {
>     if (t_state.current.server->keep_alive == HTTP_KEEPALIVE &&
> {code}
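
As a rough illustration of the second option in the report (add a NULL check), here is a minimal, self-contained sketch. It is not the attached TS-2848.diff and not the real ATS code: MiniSM, ServerSession, ConnectionAttributes, CurrentInfo, State and the released/closed flags are hypothetical stand-ins. Only the names server_session, t_state.current.server, keep_alive and HTTP_KEEPALIVE come from the snippet above. The sketch assumes that closing the session is an acceptable fallback when the server info is missing, which may not match what the state machine actually needs.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real ATS types (assumptions, for illustration only).
enum KeepAlive { HTTP_NO_KEEPALIVE, HTTP_KEEPALIVE };

struct ConnectionAttributes {
  KeepAlive keep_alive = HTTP_NO_KEEPALIVE;
};

struct ServerSession {
  bool released = false; // stand-in for "returned to the shared pool"
  bool closed   = false; // stand-in for "connection closed"
};

struct CurrentInfo {
  ConnectionAttributes *server = nullptr;
};

struct State {
  CurrentInfo current;
};

struct MiniSM {
  ServerSession *server_session = nullptr;
  State t_state;

  // Option 2 from the report: tolerate a NULL t_state.current.server.
  // Option 1 would instead assert(t_state.current.server != nullptr)
  // before the dereference, turning the segfault into an explicit failure.
  void release_server_session()
  {
    if (server_session == nullptr) {
      return;
    }
    ConnectionAttributes *srv = t_state.current.server;
    if (srv != nullptr && srv->keep_alive == HTTP_KEEPALIVE) {
      server_session->released = true; // keep-alive: hand back to the pool
    } else {
      server_session->closed = true; // no server info or no keep-alive: close
    }
    server_session = nullptr;
  }
};
```

With this shape, a state machine that holds a session but has no current server entry closes the session instead of crashing. Whether that is the right behaviour, or whether the NULL pointer signals a deeper state bug that an assert should expose, is the question the report leaves open.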



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
