[ https://issues.apache.org/jira/browse/TS-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leif Hedstrom updated TS-2187:
------------------------------
    Backport to Version:   (was: 4.0.2)

This feature is not in 4.0, so we shouldn't backport it to 4.0.x, right?

> failed assert `nr == sizeof(uint64_t)` in EventNotify::signal()
> ---------------------------------------------------------------
>
>                 Key: TS-2187
>                 URL: https://issues.apache.org/jira/browse/TS-2187
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core
>            Reporter: Yunkai Zhang
>            Assignee: Yunkai Zhang
>             Fix For: 4.1.0
>
>         Attachments: 0001-TS-2187-use-nonblock-eventfd-in-EventNotify.patch
>
>
> I have noticed two bug reports of "failed assert `nr == sizeof(uint64_t)`" in
> EventNotify::signal():
> {code}
> [TrafficServer] using root directory '/usr/local/trafficserver-4.1.0'
> FATAL: EventNotify.cc:73: failed assert `nr == sizeof(uint64_t)`
> /usr/local/trafficserver-4.1.0/bin/traffic_server - STACK TRACE:
> /usr/local/trafficserver-4.1.0/lib/libtsutil.so.4(+0x14b57)[0x2b71a2d84b57]
> /usr/local/trafficserver-4.1.0/lib/libtsutil.so.4(+0x13d7f)[0x2b71a2d83d7f]
> /usr/local/trafficserver-4.1.0/lib/libtsutil.so.4(+0x1c32e)[0x2b71a2d8c32e]
> /usr/local/trafficserver-4.1.0/bin/traffic_server(LogObject::_checkout_write(unsigned long*, unsigned long)+0x1f5)[0x5adf25]
> /usr/local/trafficserver-4.1.0/bin/traffic_server(LogObjectManager::check_buffer_expiration(long)+0x7b)[0x5afbfb]
> /usr/local/trafficserver-4.1.0/bin/traffic_server(Log::periodic_tasks(long)+0xe2)[0x595c92]
> /usr/local/trafficserver-4.1.0/bin/traffic_server(Log::flush_thread_main(void*)+0x28e)[0x596cee]
> /usr/local/trafficserver-4.1.0/bin/traffic_server[0x59abcd]
> /usr/local/trafficserver-4.1.0/bin/traffic_server(EThread::execute()+0x1159)[0x6ae139]
> /usr/local/trafficserver-4.1.0/bin/traffic_server[0x6ab93a]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2b71a479cb50]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2b71a5430a7d]
> [E. Mgmt] log ==> [TrafficManager] using root directory '/usr/local/trafficserver-4.1.0'
> [TrafficServer] using root directory '/usr/local/trafficserver-4.1.0'
> {code}
> To fix this issue, I have prepared a patch:
> 1) Use a nonblocking eventfd, so that we can tolerate write() failing with
> errno EAGAIN, which is acceptable because the signal receiver will be
> notified eventually in this case.
> 2) With a nonblocking eventfd, read() no longer blocks in wait(), so I use
> epoll_wait() to implement the blocking behavior, just like timedwait().
> 3) A nonblocking eventfd also fixes a potential problem: if the receiver
> doesn't read() the data immediately, senders might block in write().
> Please test this patch; any feedback is welcome.
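
For anyone reading along without the attachment open, the three points in the quoted description can be sketched roughly as below. This is not the code from the attached patch: the class name NotifySketch, its member names, and the millisecond timeout parameter are all made up for illustration. Only eventfd(), epoll_create1(), epoll_ctl(), epoll_wait(), read() and write() are real Linux calls. It assumes a Linux build.

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstdint>

// Hypothetical stand-in for EventNotify (illustration only, not the patch).
class NotifySketch
{
public:
  NotifySketch()
  {
    // Point 1: create the eventfd in nonblocking mode.
    m_event_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    m_epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    assert(m_event_fd >= 0 && m_epoll_fd >= 0);

    struct epoll_event ev = {};
    ev.events  = EPOLLIN;
    ev.data.fd = m_event_fd;
    int rc = epoll_ctl(m_epoll_fd, EPOLL_CTL_ADD, m_event_fd, &ev);
    assert(rc == 0);
    (void)rc;
  }

  ~NotifySketch()
  {
    close(m_event_fd);
    close(m_epoll_fd);
  }

  // Returns true if the receiver has been, or will be, notified.
  // EAGAIN means the counter is already at its maximum, so a wakeup is
  // already pending and the failed write() can be tolerated (points 1 and 3).
  bool signal()
  {
    uint64_t value = 1;
    ssize_t nr = write(m_event_fd, &value, sizeof(value));
    if (nr == static_cast<ssize_t>(sizeof(value))) {
      return true;
    }
    return nr < 0 && errno == EAGAIN;
  }

  // Point 2: block in epoll_wait() instead of read().
  // timeout_ms < 0 waits indefinitely. Returns 0 on wakeup, ETIMEDOUT on
  // timeout, otherwise an errno value.
  int wait(int timeout_ms = -1)
  {
    struct epoll_event ev;
    int n;
    do {
      // Note: retrying after EINTR restarts the full timeout in this sketch.
      n = epoll_wait(m_epoll_fd, &ev, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);

    if (n == 0) {
      return ETIMEDOUT;
    }
    if (n < 0) {
      return errno;
    }

    // Drain the counter; this read() cannot block on a nonblocking fd.
    uint64_t value = 0;
    ssize_t nr = read(m_event_fd, &value, sizeof(value));
    return nr == static_cast<ssize_t>(sizeof(value)) ? 0 : errno;
  }

private:
  int m_event_fd;
  int m_epoll_fd;
};
```

In this sketch several signal() calls made before a wait() collapse into a single wakeup, because one read() resets the eventfd counter to zero. Whether the real EventNotify wants that or semaphore-style counting is a question for the patch itself.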