[jira] [Assigned] (TS-1039) PATCH: use pcre-config to find libpcre
[ https://issues.apache.org/jira/browse/TS-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Galić reassigned TS-1039: -- Assignee: Igor Galić PATCH: use pcre-config to find libpcre -- Key: TS-1039 URL: https://issues.apache.org/jira/browse/TS-1039 Project: Traffic Server Issue Type: Improvement Components: Build Reporter: James Peach Assignee: Igor Galić Priority: Minor Attachments: 0001-Use-pcre-config-to-find-libpcre.patch This patch uses pcre-config to determine the compilation options needed to use libpcre. This is an improvement over the existing configure arguments since it will work without user intervention in more circumstances. The existing configuration option still works as expected for compatibility reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (TS-1039) PATCH: use pcre-config to find libpcre
[ https://issues.apache.org/jira/browse/TS-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166036#comment-13166036 ] Igor Galić commented on TS-1039: Thank you very much, James. I now *really* think I should write a short wiki/doc on how to submit patches. Or just ask Linus to make git's default {{diff}} output *usable*, you know.. for {{patch}}. Putting this in your {{~/.gitconfig}} actually makes the diffs usable:
{noformat}
[diff]
    noprefix = true
{noformat}
[jira] [Assigned] (TS-1042) PATCH: correct debug message in FetchSM
[ https://issues.apache.org/jira/browse/TS-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Galić reassigned TS-1042: -- Assignee: Igor Galić PATCH: correct debug message in FetchSM --- Key: TS-1042 URL: https://issues.apache.org/jira/browse/TS-1042 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: James Peach Assignee: Igor Galić Priority: Minor Attachments: 0004-Fix-FetchSM-debugging-message.patch In the FetchSM module, there is a debug message that can walk off the end of the buffer. This patch corrects that by limiting the printed string to the known length.
[jira] [Commented] (TS-1039) PATCH: use pcre-config to find libpcre
[ https://issues.apache.org/jira/browse/TS-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166114#comment-13166114 ] Igor Galić commented on TS-1039: Tested both scenarios, worked out fine -- thanks. Patch applied! PATCH: use pcre-config to find libpcre -- Key: TS-1039 URL: https://issues.apache.org/jira/browse/TS-1039 Project: Traffic Server Issue Type: Improvement Components: Build Reporter: James Peach Assignee: Igor Galić Priority: Minor Fix For: 3.1.2 Attachments: 0001-Use-pcre-config-to-find-libpcre.patch
[jira] [Commented] (TS-1042) PATCH: correct debug message in FetchSM
[ https://issues.apache.org/jira/browse/TS-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166115#comment-13166115 ] Igor Galić commented on TS-1042: After crawling through {{printf(3)}}, giving up, and asking {{##C}}, I now know what {{printf("%*.*s", length, length, string);}} does! Thank you again for the patch, I applied it in r1212343. PATCH: correct debug message in FetchSM --- Key: TS-1042 URL: https://issues.apache.org/jira/browse/TS-1042 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: James Peach Assignee: Igor Galić Priority: Minor Fix For: 3.1.2 Attachments: 0004-Fix-FetchSM-debugging-message.patch
[jira] [Commented] (TS-1042) PATCH: correct debug message in FetchSM
[ https://issues.apache.org/jira/browse/TS-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166222#comment-13166222 ] Leif Hedstrom commented on TS-1042: --- Hmmm, why {{"%*.*s", length, length}}? Everywhere else we just do {{"%.*s", length, string}}.
[jira] [Commented] (TS-1042) PATCH: correct debug message in FetchSM
[ https://issues.apache.org/jira/browse/TS-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166330#comment-13166330 ] James Peach commented on TS-1042: - To be honest, it's mostly from habit. However, %*.*s prints exactly the given number of bytes (padding if the string is shorter), whereas %.*s prints up to that number of bytes. In this particular case we know the number of bytes and we want to print all of them, so %*.*s seems like the right choice. But %.*s will fix the bug just as well.
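The difference between the two format strings can be checked with a small sketch. This is not code from the attached patch; the helper names and the bracket delimiters are made up for illustration. Both forms stop reading after `len` bytes, which is what fixes the overrun; they differ only in whether a shorter string is padded out to `len`.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: "%.*s" prints at most `len` bytes of `s`. */
static int fmt_precision(char *out, size_t outsz, const char *s, int len)
{
    return snprintf(out, outsz, "[%.*s]", len, s);
}

/* Hypothetical helper: "%*.*s" prints at most `len` bytes of `s`
 * and left-pads with spaces so the field is at least `len` wide. */
static int fmt_width_precision(char *out, size_t outsz, const char *s, int len)
{
    return snprintf(out, outsz, "[%*.*s]", len, len, s);
}
```

When the buffer really holds `len` bytes, as in the FetchSM case described above, both calls produce the same output. With a precision given, the source does not need to be NUL-terminated as long as it holds at least `len` bytes.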
[jira] [Commented] (TS-1041) PATCH: guarantee to populate sockaddr length for TSHostLookupResultAddrGet
[ https://issues.apache.org/jira/browse/TS-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166332#comment-13166332 ] James Peach commented on TS-1041: - I was referring to the sockaddr you get from TSHostLookupResultAddrGet(), where you look at sa_len to figure out how many bytes to copy. PATCH: guarantee to populate sockaddr length for TSHostLookupResultAddrGet -- Key: TS-1041 URL: https://issues.apache.org/jira/browse/TS-1041 Project: Traffic Server Issue Type: Improvement Components: DNS Environment: Mac OS X 10.7 Reporter: James Peach Priority: Minor Attachments: 0003-Ensure-sockaddr-length-is-always-populated.patch The sockaddr returned by TSHostLookupResultAddrGet() does not always get its sa_len field populated correctly. This patch guarantees to populate it to the correct value so that plugin authors can rely on that field when copying the TSHostLookupResultAddrGet() result.
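For context, the copy pattern being discussed might look like the following sketch. It is not taken from the patch or from the Traffic Server API: the helper names are hypothetical, and `HAVE_STRUCT_SOCKADDR_SA_LEN` is assumed to be an autoconf-style macro that a build would define on platforms (such as Mac OS X) whose `struct sockaddr` has an `sa_len` field. Where `sa_len` is missing or zero, the sketch falls back to the address family.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: number of bytes occupied by the address.
 * Returns 0 for address families this sketch does not know about. */
static size_t sockaddr_size(const struct sockaddr *sa)
{
#ifdef HAVE_STRUCT_SOCKADDR_SA_LEN   /* assumed build-time macro */
    if (sa->sa_len != 0)
        return sa->sa_len;
#endif
    switch (sa->sa_family) {
    case AF_INET:
        return sizeof(struct sockaddr_in);
    case AF_INET6:
        return sizeof(struct sockaddr_in6);
    default:
        return 0;
    }
}

/* Hypothetical helper: copy `src` into caller-owned storage.
 * Returns the number of bytes copied, or 0 if the size is unknown. */
static size_t sockaddr_copy(struct sockaddr_storage *dst,
                            const struct sockaddr *src)
{
    size_t n = sockaddr_size(src);
    if (n == 0 || n > sizeof *dst)
        return 0;
    memset(dst, 0, sizeof *dst);
    memcpy(dst, src, n);
    return n;
}
```

A plugin that relies on `sa_len` alone copies zero or too few bytes whenever the field is left unpopulated, which is the situation the patch is meant to rule out.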
[jira] [Commented] (TS-1040) PATCH: teach TSHostLookup to use const
[ https://issues.apache.org/jira/browse/TS-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166334#comment-13166334 ] James Peach commented on TS-1040: - Yep, I meant to click the donate button but forgot. I didn't see a way to toggle it after the fact. Do you need me to attach it again? PATCH: teach TSHostLookup to use const -- Key: TS-1040 URL: https://issues.apache.org/jira/browse/TS-1040 Project: Traffic Server Issue Type: Improvement Components: DNS Reporter: James Peach Priority: Minor Attachments: 0002-TSHostLookup-should-take-const-hostname-argument.patch This patch improves the TSHostLookup() API by specifying its hostname argument as const. This reduces the number of casts required of plugin authors. The new prototype is: tsapi TSAction TSHostLookup(TSCont contp, const char* hostname, size_t namelen)
[jira] [Commented] (TS-857) Crash Report: HttpTunnel::chain_abort_all - HttpServerSession::do_io_close - UnixNetVConnection::do_io_close
[ https://issues.apache.org/jira/browse/TS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166354#comment-13166354 ] John Plevyak commented on TS-857: - So, this patch throws a VC_EVENT_DO_CLOSE event at a NetVC, but it doesn't record the fact that the event is outstanding? What prevents the NetVC from being deallocated before the event is processed? Crash Report: HttpTunnel::chain_abort_all - HttpServerSession::do_io_close - UnixNetVConnection::do_io_close -- Key: TS-857 URL: https://issues.apache.org/jira/browse/TS-857 Project: Traffic Server Issue Type: Bug Components: HTTP, Network Affects Versions: 3.1.0 Environment: in my branch, which is roughly the same as 3.0.x Reporter: Zhao Yongming Assignee: weijin Fix For: 3.1.3 Attachments: ts-857.diff, ts-857.diff Here is the backtrace from the crash. Some of the information is missing because we have not enabled the --enable-debug configure option.
{code}
[New process 7532]
#0  ink_stack_trace_get (stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>) at ink_stack_trace.cc:68
68        fp = (void **) (*fp);
(gdb) bt
#0  ink_stack_trace_get (stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>) at ink_stack_trace.cc:68
#1  0x2ba641dccef1 in ink_stack_trace_dump (sighandler_frame=<value optimized out>) at ink_stack_trace.cc:114
#2  0x004df020 in signal_handler (sig=<value optimized out>) at signals.cc:225
#3  <signal handler called>
#4  0x006a1ea9 in UnixNetVConnection::do_io_close (this=0x1cc9bd20, alerrno=<value optimized out>) at ../../iocore/eventsystem/I_Lock.h:297
#5  0x0051f1d0 in HttpServerSession::do_io_close (this=0x2aaab0042c80, alerrno=20600) at HttpServerSession.cc:127
#6  0x0056d1e9 in HttpTunnel::chain_abort_all (this=0x2aabeeffdd70, p=0x2aabeeffdf68) at HttpTunnel.cc:1300
#7  0x005269ca in HttpSM::tunnel_handler_ua (this=0x2aabeeffc070, event=104, c=0x2aabeeffdda8) at HttpSM.cc:2987
#8  0x00571dfc in HttpTunnel::consumer_handler (this=0x2aabeeffdd70, event=104, c=0x2aabeeffdda8) at HttpTunnel.cc:1232
#9  0x00572032 in HttpTunnel::main_handler (this=0x2aabeeffdd70, event=1088608784, data=<value optimized out>) at HttpTunnel.cc:1456
#10 0x006a6307 in write_to_net_io (nh=0x2b12d688, vc=0x1cc876e0, thread=<value optimized out>) at ../../iocore/eventsystem/I_Continuation.h:146
#11 0x0069ce97 in NetHandler::mainNetEvent (this=0x2b12d688, event=<value optimized out>, e=0x171c1ed0) at UnixNet.cc:405
#12 0x006cddaf in EThread::process_event (this=0x2b12c010, e=0x171c1ed0, calling_code=5) at I_Continuation.h:146
#13 0x006ce6bc in EThread::execute (this=0x2b12c010) at UnixEThread.cc:262
#14 0x006cd0ee in spawn_thread_internal (a=0x171b58f0) at Thread.cc:88
#15 0x003c33c064a7 in start_thread () from /lib64/libpthread.so.0
#16 0x003c330d3c2d in clone () from /lib64/libc.so.6
(gdb) info f
Stack level 0, frame at 0x40e2b790:
 rip = 0x2ba641dccdf3 in ink_stack_trace_get(void**, int, int) (ink_stack_trace.cc:68); saved rip 0x2ba641dccef1
 called by frame at 0x40e2bbe0
 source language c++.
 Arglist at 0x40e2b770, args: stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>
 Locals at 0x40e2b770, Previous frame's sp is 0x40e2b790
 Saved registers:
  rbx at 0x40e2b778, rbp at 0x40e2b780, rip at 0x40e2b788
(gdb)
{code}
[jira] [Updated] (TS-949) key-volume hash table is not consistent when a disk is marked as bad or removed due to failure
[ https://issues.apache.org/jira/browse/TS-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Plevyak updated TS-949: Attachment: TS-949-jp2.patch This patch uses a table of random numbers, selected based on the size of the disk partition, and selects the one closest to the center of each bucket as the bucket owner. This is stable for inserts and removes, and never switches between disks which remain present. This should address the issue. key-volume hash table is not consistent when a disk is marked as bad or removed due to failure --- Key: TS-949 URL: https://issues.apache.org/jira/browse/TS-949 Project: Traffic Server Issue Type: Bug Components: Cache Affects Versions: 3.1.0 Environment: Multi-volume cache with apparently faulty drives Reporter: B Wyatt Assignee: John Plevyak Fix For: 3.1.2 Attachments: TS-949-jp-1.patch, TS-949-jp2.patch, TS949-BW-p1.patch
The method for resolving collisions when distributing hash-table space to volumes for the object_key-volume hash table creates inconsistency when a disk is determined to be bad, or when a failed disk is removed from the volume.config.
Background: The hash space is distributed by a round-robin draft, where each volume drafts a random index in the hash table until the hash space is exhausted. The random order in which a given volume drafts hash table slots is consistent across reboot/crash/disk-failure. However, when a volume attempts to draft a slot which has already been occupied, it skips to its next random pick and attempts to draft that slot, repeating until it finds an open slot. This ensures that the hash is partitioned evenly between volumes.
The issue: Resolving slot contention breaks the consistency, as it depends on the order in which the volumes draft. When rebuilding the hash after disk failure or reboot with fewer drives, a volume may secure an index that was previously occupied by the dead disk. In the old hash, the surviving volume would have selected another random index due to contention. If this index is taken, by the next draft round it will represent an inconsistent key-volume result. The effects of one inconsistency will then cascade, as whichever volume occupies that index after removing a dead disk is now behind on its draft sequence as well.
An Example:
||Disk||Draft Sequence||
|A|1,4,7,5|
|B|4,2,8,1|
|C|3,7,5,2|
Pre-failure Hash Table after 2 rounds of draft:
|A|B|C|B|C|?|A|?|
Post-failure of drive B Hash Table after 3 rounds of draft:
|A|C|C|A|{color:red}A{color}|?|{color:red}C{color}|?|
Two slots have become inconsistent and more will probably follow. These inconsistencies become objects stored in a volume but lost to the top level cache for open/lookup.
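The closest-to-center idea from the comment above can be sketched as follows. This is an illustration only, not the code in TS-949-jp2.patch: the bucket count, the xorshift generator, the seeding from a disk id, and every name here are assumptions. Each disk owns a fixed set of pseudo-random points (more points for a bigger disk), and each bucket goes to the disk with the point nearest the bucket's center. Because a disk's points never depend on which other disks exist, removing a disk can only reassign the buckets that disk owned.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64
#define NO_OWNER UINT32_MAX   /* sentinel; must not be used as a disk id */

typedef struct {
    uint32_t id;      /* stable identifier, used to seed the point sequence */
    unsigned points;  /* number of points, proportional to disk size */
} disk_t;

/* xorshift32: small deterministic generator; state must be non-zero. */
static uint32_t next_rand(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

static uint32_t point_distance(uint32_t a, uint32_t b)
{
    return a > b ? a - b : b - a;
}

/* owner[b] becomes the id of the disk with the point closest to the
 * center of bucket b. Ties go to the smaller disk id, so the result
 * does not depend on the order of `disks`. */
static void build_table(const disk_t *disks, size_t ndisks,
                        uint32_t owner[NBUCKETS])
{
    const uint32_t width = UINT32_MAX / NBUCKETS;

    for (size_t b = 0; b < NBUCKETS; b++) {
        const uint32_t center = (uint32_t)b * width + width / 2;
        uint32_t best = UINT32_MAX;

        owner[b] = NO_OWNER;
        for (size_t d = 0; d < ndisks; d++) {
            uint32_t state = disks[d].id * 2654435761u;
            if (state == 0)
                state = 1;
            for (unsigned p = 0; p < disks[d].points; p++) {
                uint32_t dist = point_distance(next_rand(&state), center);
                if (owner[b] == NO_OWNER || dist < best ||
                    (dist == best && disks[d].id < owner[b])) {
                    best = dist;
                    owner[b] = disks[d].id;
                }
            }
        }
    }
}
```

Rebuilding the table for disks A and C after B fails leaves every bucket that A or C already owned untouched, which is the property the draft scheme in the example loses. The brute-force scan is for clarity; a real table would precompute and sort the points.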
[jira] [Commented] (TS-857) Crash Report: HttpTunnel::chain_abort_all - HttpServerSession::do_io_close - UnixNetVConnection::do_io_close
[ https://issues.apache.org/jira/browse/TS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166513#comment-13166513 ] Alan M. Carroll commented on TS-857: Where would it record the pending event? The presumption is that the only other thread that might close it is the one that will process the event. But the real issue is that sometimes the lock is dropped between the time the lock try is done and the event is scheduled, leading to scheduling on the null thread. If you look at TS-934 you can see how that issue is handled there, but even that's not sufficient because normal VC processing from NetHandler::mainNetEvent does not lock the VCs, so there isn't any way to do anything safely in this case.
[jira] [Commented] (TS-1031) reduce lock in netHandler and reduce the possibility of acquiring expired server sessions
[ https://issues.apache.org/jira/browse/TS-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166654#comment-13166654 ] John Plevyak commented on TS-1031: -- I don't understand why this is necessary. Nobody should call do_io_close() until they have cleared ALL pointers to the NetVC. This seems like a hack to prevent buggy code from crashing in this particular way rather than just doing other bad things (including crashing in some other way). reduce lock in netHandler and reduce the possibility of acquiring expired server sessions --- Key: TS-1031 URL: https://issues.apache.org/jira/browse/TS-1031 Project: Traffic Server Issue Type: Improvement Components: Core Affects Versions: 3.1.1 Reporter: Zhao Yongming Assignee: weijin Priority: Minor Attachments: ts-1031.diff Reduce lock in netHandler and reduce the possibility of acquiring expired server sessions. Put your patch here for review :D
[jira] [Commented] (TS-937) EThread::execute still processing cancelled event
[ https://issues.apache.org/jira/browse/TS-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1312#comment-1312 ] John Plevyak commented on TS-937: - Let's nuke TS_HAS_PURIFY. If we want to make a valgrind target (e.g. something which would enable normal malloc) then that is an idea, but this macro is confusing. EThread::execute still processing cancelled event - Key: TS-937 URL: https://issues.apache.org/jira/browse/TS-937 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 3.0.1, 2.1.9 Environment: RHEL6 Reporter: Brian Geffon Fix For: 3.1.2 Attachments: UnixEThread.patch The included GDB log shows that ATS is trying to process an event that has already been cancelled. Examining UnixEThread.cc line 232 shows that EThread::process_event gets called without a check for the event being cancelled. Brian
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x764fa700 (LWP 28518)]
0x006fc663 in EThread::process_event (this=0x768ff010, e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
130       MUTEX_TRY_LOCK_FOR(lock, e->mutex.m_ptr, this, e->continuation);
Missing separate debuginfos, use: debuginfo-install expat-2.0.1-9.1.el6.x86_64 glibc-2.12-1.25.el6_1.3.x86_64 keyutils-libs-1.4-1.el6.x86_64 krb5-libs-1.9-9.el6_1.1.x86_64 libcom_err-1.41.12-7.el6.x86_64 libgcc-4.4.5-6.el6.x86_64 libselinux-2.0.94-5.el6.x86_64 libstdc++-4.4.5-6.el6.x86_64 openssl-1.0.0-10.el6_1.4.x86_64 pcre-7.8-3.1.el6.x86_64 tcl-8.5.7-6.el6.x86_64 zlib-1.2.3-25.el6.x86_64
(gdb) bt
#0  0x006fc663 in EThread::process_event (this=0x768ff010, e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
#1  0x006fcbaf in EThread::execute (this=0x768ff010) at UnixEThread.cc:232
#2  0x006fb844 in spawn_thread_internal (a=0xfb7e80) at Thread.cc:88
#3  0x0036204077e1 in start_thread () from /lib64/libpthread.so.0
#4  0x00361f8e577d in clone () from /lib64/libc.so.6
(gdb) bt full
#0  0x006fc663 in EThread::process_event (this=0x768ff010, e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
        lock = {m = {m_ptr = 0x764f9d20}, lock_acquired = 202}
#1  0x006fcbaf in EThread::execute (this=0x768ff010) at UnixEThread.cc:232
        done_one = false
        e = 0x1db45c0
        NegativeQueue = {<DLL<Event, Event::Link_link>> = {head = 0xfc75f0}, tail = 0xfc75f0}
        next_time = 1314647904419648000
#2  0x006fb844 in spawn_thread_internal (a=0xfb7e80) at Thread.cc:88
        p = 0xfb7e80
#3  0x0036204077e1 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x00361f8e577d in clone () from /lib64/libc.so.6
No symbol table info available.
(gdb) f 0
#0  0x006fc663 in EThread::process_event (this=0x768ff010, e=0x1db45c0, calling_code=1) at UnixEThread.cc:130
130       MUTEX_TRY_LOCK_FOR(lock, e->mutex.m_ptr, this, e->continuation);
(gdb) p *e
$2 = {<Action> = {_vptr.Action = 0x775170, continuation = 0x1f2fc08, mutex = {m_ptr = 0x7fffd40fba40}, cancelled = 1}, ethread = 0x768ff010, in_the_prot_queue = 0, in_the_priority_queue = 0, immediate = 1, globally_allocated = 1, in_heap = 0, callback_event = 1, timeout_at = 0, period = 0, cookie = 0x0, link = {<SLink<Event>> = {next = 0x0}, prev = 0x0}}
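The missing check the report describes amounts to testing the `cancelled` flag before touching anything else the event points to. A minimal sketch of that guard follows; it uses a simplified, hypothetical `event_t` rather than the real `Event` class, and it is not the attached UnixEThread.patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for an event; the real Event class
 * carries a mutex, a continuation, timing fields and more. */
typedef struct event {
    bool cancelled;
    void (*handler)(struct event *self);
    int calls;
} event_t;

/* Example handler that only counts how often it ran. */
static void count_call(event_t *self)
{
    self->calls++;
}

/* Dispatch one event. A cancelled event is skipped before its handler
 * (or, in the real code, its mutex) is ever dereferenced.
 * Returns true if the handler ran. */
static bool dispatch(event_t *e)
{
    if (e == NULL || e->cancelled)
        return false;
    e->handler(e);
    return true;
}
```

In the GDB dump above, `cancelled = 1` is already set on the event being processed, which is the state such a guard would catch.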
[jira] [Commented] (TS-937) EThread::execute still processing cancelled event
[ https://issues.apache.org/jira/browse/TS-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166718#comment-13166718 ] Leif Hedstrom commented on TS-937: -- Sold. I did add a --disable-freelist configure option a while ago, which turns the freelist into malloc/free calls (I hope at least, unless I fucked it up :). The thought was that we'd use this option for memory debugging either with valgrind, or e.g. tcmalloc.