[ 
https://issues.apache.org/jira/browse/TS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144936#comment-13144936
 ] 

weijin commented on TS-857:
---------------------------

An HttpSM can be called back on different threads. When terminating the state 
machine, we call do_io_close to close the net_vc, but net_vc::do_io_close is 
not thread safe: once net_vc::closed is set, InactivityCop and NetHandler can 
also close the net_vc, and they do so without locking the net_vc's mutex. I 
hope this also explains TS-934.

I recently read amc's patch for TS-934 carefully; he contributed a lot to 
solving the problem. I have two questions: 1) should we lock the net_vc's mutex 
in InactivityCop and NetHandler? 2) should one thread be allowed to close a 
net_vc that belongs to a different thread?

I am inclined to add some code to net_vc::do_io_close and net_vc::mainEvent to 
make them thread safe:
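UnixNetVConnection::do_io_close 
{
   // Called from a thread that does not own this vc: hand the close over
   // to the owning thread instead of tearing the vc down here.
   if (thread != this_ethread()) {
      thread->schedule_imm(this, EVENT_VC_TRY_TO_CLOSE);
      return;
   }
   // From here on we are running on the owning thread.
   disable_read(this);
   disable_write(this);
   .....
   close_UnixNetVConnection(this, t);
}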


UnixNetVConnection::mainEvent(int event, void *e)
{
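   // Deferred close requested by another thread; we are now on the owning
   // thread, so do_io_close takes the direct path.
   if (event == EVENT_VC_TRY_TO_CLOSE) {
     do_io_close();
     return EVENT_DONE;
   }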
   ....
   // check the active and inactivity timeout 
   ....
}
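
To illustrate the idea outside of Traffic Server, here is a minimal standalone 
sketch of the same hand-off pattern in plain C++. Everything in it is 
hypothetical: OwnerLoop stands in for the owning event thread, schedule() is 
only loosely analogous to schedule_imm, and Conn stands in for the net_vc. It 
is not the real ATS API, and it assumes the connection outlives any close 
request that is still queued.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for an event thread: a queue of callbacks that only
// the owning thread drains. Not the ATS EThread API.
class OwnerLoop {
public:
  // Any thread may enqueue work for the owner.
  void schedule(std::function<void()> fn) {
    std::lock_guard<std::mutex> lock(mutex_);
    pending_.push(std::move(fn));
  }

  // Must be called by the owning thread. Runs every queued callback and
  // returns how many were run.
  int drain() {
    int ran = 0;
    for (;;) {
      std::function<void()> fn;
      {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.empty()) return ran;
        fn = std::move(pending_.front());
        pending_.pop();
      }
      fn();  // run outside the lock so callbacks may schedule more work
      ++ran;
    }
  }

private:
  std::mutex mutex_;
  std::queue<std::function<void()>> pending_;
};

// Hypothetical stand-in for the net_vc.
class Conn {
public:
  Conn(OwnerLoop &loop, std::thread::id owner) : loop_(loop), owner_(owner) {}

  void close() {
    // Wrong thread: defer the close to the owner and touch no state here.
    if (std::this_thread::get_id() != owner_) {
      loop_.schedule([this] { close(); });
      return;
    }
    do_close();
  }

  bool closed() const { return closed_; }

private:
  void do_close() {
    assert(std::this_thread::get_id() == owner_);
    if (closed_) return;  // a second close request is a no-op
    closed_ = true;
  }

  OwnerLoop &loop_;
  const std::thread::id owner_;
  bool closed_ = false;  // only read and written on the owning thread
};
```

With this shape, a close() issued from a foreign thread only enqueues a 
request; the state change happens later, when the owner calls drain(). 
Because closed_ is only ever touched on the owning thread, it needs no lock 
of its own in this sketch.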

                
> Crash Report: HttpTunnel::chain_abort_all -> HttpServerSession::do_io_close 
> -> UnixNetVConnection::do_io_close
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: TS-857
>                 URL: https://issues.apache.org/jira/browse/TS-857
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HTTP, Network
>    Affects Versions: 3.1.0
>         Environment: my branch, which is roughly the same as 3.0.x
>            Reporter: Zhao Yongming
>            Assignee: weijin
>             Fix For: 3.1.2
>
>
> Here is the backtrace from the crash. Some of the information is missing 
> because we did not build with the --enable-debug configure option.
> {code}
> [New process 7532]
> #0  ink_stack_trace_get (stack=<value optimized out>, len=<value optimized 
> out>, signalhandler_frame=<value optimized out>)
>     at ink_stack_trace.cc:68
> 68        fp = (void **) (*fp);
> (gdb) bt
> #0  ink_stack_trace_get (stack=<value optimized out>, len=<value optimized 
> out>, signalhandler_frame=<value optimized out>)
>     at ink_stack_trace.cc:68
> #1  0x00002ba641dccef1 in ink_stack_trace_dump (sighandler_frame=<value 
> optimized out>) at ink_stack_trace.cc:114
> #2  0x00000000004df020 in signal_handler (sig=<value optimized out>) at 
> signals.cc:225
> #3  <signal handler called>
> #4  0x00000000006a1ea9 in UnixNetVConnection::do_io_close (this=0x1cc9bd20, 
> alerrno=<value optimized out>)
>     at ../../iocore/eventsystem/I_Lock.h:297
> #5  0x000000000051f1d0 in HttpServerSession::do_io_close 
> (this=0x2aaab0042c80, alerrno=20600) at HttpServerSession.cc:127
> #6  0x000000000056d1e9 in HttpTunnel::chain_abort_all (this=0x2aabeeffdd70, 
> p=0x2aabeeffdf68) at HttpTunnel.cc:1300
> #7  0x00000000005269ca in HttpSM::tunnel_handler_ua (this=0x2aabeeffc070, 
> event=104, c=0x2aabeeffdda8) at HttpSM.cc:2987
> #8  0x0000000000571dfc in HttpTunnel::consumer_handler (this=0x2aabeeffdd70, 
> event=104, c=0x2aabeeffdda8) at HttpTunnel.cc:1232
> #9  0x0000000000572032 in HttpTunnel::main_handler (this=0x2aabeeffdd70, 
> event=1088608784, data=<value optimized out>)
>     at HttpTunnel.cc:1456
> #10 0x00000000006a6307 in write_to_net_io (nh=0x2aaaab12d688, vc=0x1cc876e0, 
> thread=<value optimized out>)
>     at ../../iocore/eventsystem/I_Continuation.h:146
> #11 0x000000000069ce97 in NetHandler::mainNetEvent (this=0x2aaaab12d688, 
> event=<value optimized out>, e=0x171c1ed0) at UnixNet.cc:405
> #12 0x00000000006cddaf in EThread::process_event (this=0x2aaaab12c010, 
> e=0x171c1ed0, calling_code=5) at I_Continuation.h:146
> #13 0x00000000006ce6bc in EThread::execute (this=0x2aaaab12c010) at 
> UnixEThread.cc:262
> #14 0x00000000006cd0ee in spawn_thread_internal (a=0x171b58f0) at Thread.cc:88
> #15 0x0000003c33c064a7 in start_thread () from /lib64/libpthread.so.0
> #16 0x0000003c330d3c2d in clone () from /lib64/libc.so.6
> (gdb) info f
> Stack level 0, frame at 0x40e2b790:
>  rip = 0x2ba641dccdf3 in ink_stack_trace_get(void**, int, int) 
> (ink_stack_trace.cc:68); saved rip 0x2ba641dccef1
>  called by frame at 0x40e2bbe0
>  source language c++.
>  Arglist at 0x40e2b770, args: stack=<value optimized out>, len=<value 
> optimized out>, signalhandler_frame=<value optimized out>
>  Locals at 0x40e2b770, Previous frame's sp is 0x40e2b790
>  Saved registers:
>   rbx at 0x40e2b778, rbp at 0x40e2b780, rip at 0x40e2b788
> (gdb) 
> {code}


        
