[ https://issues.apache.org/jira/browse/TS-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097644#comment-13097644 ]
Brian Geffon commented on TS-937:
---------------------------------

Thanks for the response, weijin. Perhaps I'm missing something, but there is currently a check for the event being cancelled in ProcessEvent for all events that do not have a timeout. If the event has a timeout on it, there won't be a check. Why would it be safe to put the cancel check in ProcessEvent in that situation?

[http://svn.apache.org/viewvc/trafficserver/traffic/trunk/iocore/eventsystem/UnixEThread.cc?view=markup#l234]

> EThread::execute still processing cancelled event
> -------------------------------------------------
>
>                 Key: TS-937
>                 URL: https://issues.apache.org/jira/browse/TS-937
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 3.0.1, 2.1.9
>         Environment: RHEL6
>            Reporter: Brian Geffon
>             Fix For: 3.1.1
>
>         Attachments: UnixEThread.patch
>
>
> The included GDB log will show that ATS is trying to process an event that
> has already been canceled, examining the code of UnixEThread.cc line 232
> shows that EThread::process_event gets called without a check for the event
> being cancelled.
> Brian
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff64fa700 (LWP 28518)]
> 0x00000000006fc663 in EThread::process_event (this=0x7ffff68ff010,
> e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
> 130 MUTEX_TRY_LOCK_FOR(lock, e->mutex.m_ptr, this, e->continuation);
> Missing separate debuginfos, use: debuginfo-install
> expat-2.0.1-9.1.el6.x86_64 glibc-2.12-1.25.el6_1.3.x86_64
> keyutils-libs-1.4-1.el6.x86_64 krb5-libs-1.9-9.el6_1.1.x86_64
> libcom_err-1.41.12-7.el6.x86_64 libgcc-4.4.5-6.el6.x86_64
> libselinux-2.0.94-5.el6.x86_64 libstdc++-4.4.5-6.el6.x86_64
> openssl-1.0.0-10.el6_1.4.x86_64 pcre-7.8-3.1.el6.x86_64
> tcl-8.5.7-6.el6.x86_64 zlib-1.2.3-25.el6.x86_64
> (gdb) bt
> #0 0x00000000006fc663 in EThread::process_event (this=0x7ffff68ff010,
> e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
> #1 0x00000000006fcbaf in EThread::execute (this=0x7ffff68ff010) at
> UnixEThread.cc:232
> #2 0x00000000006fb844 in spawn_thread_internal (a=0xfb7e80) at Thread.cc:88
> #3 0x00000036204077e1 in start_thread () from /lib64/libpthread.so.0
> #4 0x000000361f8e577d in clone () from /lib64/libc.so.6
> (gdb) bt full
> #0 0x00000000006fc663 in EThread::process_event (this=0x7ffff68ff010,
> e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
> lock = {m = {m_ptr = 0x7ffff64f9d20}, lock_acquired = 202}
> #1 0x00000000006fcbaf in EThread::execute (this=0x7ffff68ff010) at
> UnixEThread.cc:232
> done_one = false
> e = 0x1db45c0
> NegativeQueue = {<DLL<Event, Event::Link_link>> = {head = 0xfc75f0},
> tail = 0xfc75f0}
> next_time = 1314647904419648000
> #2 0x00000000006fb844 in spawn_thread_internal (a=0xfb7e80) at Thread.cc:88
> p = 0xfb7e80
> #3 0x00000036204077e1 in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #4 0x000000361f8e577d in clone () from /lib64/libc.so.6
> No symbol table info available.
> (gdb) f 0
> #0 0x00000000006fc663 in EThread::process_event (this=0x7ffff68ff010,
> e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
> 130 MUTEX_TRY_LOCK_FOR(lock, e->mutex.m_ptr, this, e->continuation);
> (gdb) p *e
> $2 = {<Action> = {_vptr.Action = 0x775170, continuation = 0x1f2fc08, mutex =
> {m_ptr = 0x7fffd40fba40}, cancelled = 1}, ethread = 0x7ffff68ff010,
> in_the_prot_queue = 0, in_the_priority_queue = 0,
> immediate = 1, globally_allocated = 1, in_heap = 0, callback_event = 1,
> timeout_at = 0, period = 0, cookie = 0x0, link = {<SLink<Event>> = {next =
> 0x0}, prev = 0x0}}
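
The `p *e` output above shows an event with `cancelled = 1` and `timeout_at = 0` reaching `EThread::process_event`. As a rough illustration of the question in the comment (where the cancel check should live so that both immediate and timed events are covered), here is a minimal sketch. It is not the Traffic Server code: `SketchEvent`, `process_event`, and `run_ready` are hypothetical stand-ins, and the sketch deliberately ignores the mutex handling that the real `process_event` performs at `UnixEThread.cc:130`.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>

// Hypothetical, simplified stand-in for an event. Only the two fields
// from the GDB dump that matter for this discussion are modelled.
struct SketchEvent {
  bool cancelled = false;         // set by whoever cancels the action
  int64_t timeout_at = 0;         // 0 means "immediate", as in the dump
  std::function<void()> handler;  // stands in for the continuation
};

// Single dispatch point. Putting the cancel check here means every
// caller is covered, whether the event was immediate or timed.
// Returns true if the handler actually ran.
bool process_event(SketchEvent &e) {
  if (e.cancelled) {
    return false;  // skip cancelled events before touching anything else
  }
  if (e.handler) {
    e.handler();
  }
  return true;
}

// Dispatches every event whose timeout has been reached and keeps the
// rest queued. Returns the number of handlers that ran.
int run_ready(std::deque<SketchEvent *> &queue, int64_t now) {
  int dispatched = 0;
  std::deque<SketchEvent *> pending;
  while (!queue.empty()) {
    SketchEvent *e = queue.front();
    queue.pop_front();
    if (e->timeout_at > now) {
      pending.push_back(e);  // not due yet, keep it for a later pass
      continue;
    }
    if (process_event(*e)) {
      ++dispatched;
    }
  }
  queue.swap(pending);
  return dispatched;
}
```

In this sketch a timed event that is cancelled while it waits is dropped when it becomes due, because the check sits in the one function that all paths go through. Whether that placement is safe in the real code depends on how the event's mutex and memory are managed after cancellation, which this sketch does not model.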