A NOTE has been added to this issue.
======================================================================
http://www.dbmail.org/mantis/view.php?id=940
======================================================================
Reported By:                john
Assigned To:
======================================================================
Project:                    DBMail
Issue ID:                   940
Category:                   IMAP daemon
Reproducibility:            sometimes
Severity:                   crash
Priority:                   normal
Status:                     new
target:
======================================================================
Date Submitted:             11-Nov-11 16:01 CET
Last Modified:              14-Nov-11 17:45 CET
======================================================================
Summary:                    SIGSEGV (segmentation fault) while running imapsync
Description:
dbmail-imapd crashed after about 16 hours of running against multiple imapsyncs.
No errors in the logs, no core dump (and no symbols anyways) available.

I've now compiled a debug build of dbmail and libgmime (-O0 -g, no stripping) and bumped the log level to debug (256). I've resumed the imapsyncs while running "dbmail-imapd -D" from gdb. This way, I hope I can provide more details the next time it crashes (fear the heisenbug). As noted, this might take a while...

dbmail-imapd[18072]: segfault at 0 ip 00007f2de99b70f5 sp 00007f2ddf355ab0 error 6 in libgmime-2.4.so.2.4.14[7f2de9995000+4f000]
======================================================================

----------------------------------------------------------------------
 (0003328) john (reporter) - 11-Nov-11 16:20
 http://www.dbmail.org/mantis/view.php?id=940#c3328
----------------------------------------------------------------------
Just a quick note: The crash occurred with
288b73a79fe20bae7737fb622aefff761bb34c3f

----------------------------------------------------------------------
 (0003329) john (reporter) - 12-Nov-11 01:54
 http://www.dbmail.org/mantis/view.php?id=940#c3329
----------------------------------------------------------------------
Crashed again, nothing interesting in the debug log, but this looks promising:

Starting program: /usr/sbin/dbmail-imapd -D
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff24e1700 (LWP 27355)]
[New Thread 0x7ffff1ad7700 (LWP 27356)]
[New Thread 0x7ffff12d6700 (LWP 27357)]
[New Thread 0x7ffff0ad5700 (LWP 27359)]
[New Thread 0x7ffff02d4700 (LWP 27360)]
[New Thread 0x7fffefad3700 (LWP 27361)]
[New Thread 0x7fffef2d2700 (LWP 27362)]
[New Thread 0x7fffeead1700 (LWP 27363)]
[New Thread 0x7fffee2d0700 (LWP 27364)]
[New Thread 0x7fffedacf700 (LWP 27365)]
[New Thread 0x7fffed2ce700 (LWP 27366)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeead1700 (LWP 27363)]
0x00007ffff5aee52a in memset () from /lib/libc.so.6
(gdb) info threads
  12 Thread 0x7fffed2ce700 (LWP 27366)  0x00007ffff5ddf0bd in read () from /lib/libpthread.so.0
  11 Thread 0x7fffedacf700 (LWP 27365)  0x00007ffff5ddf0bd in read () from /lib/libpthread.so.0
  10 Thread 0x7fffee2d0700 (LWP 27364)  0x00007ffff5ddf0bd in read () from /lib/libpthread.so.0
* 9 Thread 0x7fffeead1700 (LWP 27363)  0x00007ffff5aee52a in memset () from /lib/libc.so.6
  8 Thread 0x7fffef2d2700 (LWP 27362)  0x00007ffff5ddc16c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
  7 Thread 0x7fffefad3700 (LWP 27361)  0x00007ffff5ddf0bd in read () from /lib/libpthread.so.0
  6 Thread 0x7ffff02d4700 (LWP 27360)  0x00007ffff5ddf0bd in read () from /lib/libpthread.so.0
  5 Thread 0x7ffff0ad5700 (LWP 27359)  0x00007ffff5ddebe4 in __lll_lock_wait () from /lib/libpthread.so.0
  4 Thread 0x7ffff12d6700 (LWP 27357)  0x00007ffff5ddc16c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
  3 Thread 0x7ffff1ad7700 (LWP 27356)  0x00007ffff5b32f1d in write () from /lib/libc.so.6
  2 Thread 0x7ffff24e1700 (LWP 27355)  0x00007ffff5ddc4d9 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
  1 Thread 0x7ffff7fe5720 (LWP 27344)  0x00007ffff5b3f623 in epoll_wait () from /lib/libc.so.6
(gdb) bt
http://www.dbmail.org/mantis/view.php?id=0  0x00007ffff5aee52a in memset () from /lib/libc.so.6
http://www.dbmail.org/mantis/view.php?id=1  0x00007ffff7957e89 in g_mime_iconv_strndup (cd=0x798730, str=0x7fffd81e8b10 "Änderungen an der Artikel-Detailseite", n=38) at gmime-iconv-utils.c:161
http://www.dbmail.org/mantis/view.php?id=2  0x00007ffff7957f30 in g_mime_iconv_strdup (cd=0x798730, str=0x7fffd81e8b10 "Änderungen an der Artikel-Detailseite") at gmime-iconv-utils.c:199
http://www.dbmail.org/mantis/view.php?id=3  0x00007ffff6281a79 in dbmail_iconv_str_to_db (str_in=0x7fffd81e8b10 "Änderungen an der Artikel-Detailseite", charset=0x7fffd80cc9c0 "utf-8") at dm_iconv.c:134
http://www.dbmail.org/mantis/view.php?id=4  0x00007ffff6257384 in _header_cache (key=0x7fffd804ede0 "Subject", header=0x7fffd804ede0 "Subject", user_data=0x7fffd831ad50) at dbmail-message.c:1572
http://www.dbmail.org/mantis/view.php?id=5  0x00007ffff706c8a6 in g_tree_foreach () from /lib/libglib-2.0.so.0
http://www.dbmail.org/mantis/view.php?id=6  0x00007ffff6256140 in dbmail_message_cache_headers (self=0x7fffd831ad50) at dbmail-message.c:1298
http://www.dbmail.org/mantis/view.php?id=7  0x00007ffff6255865 in dbmail_message_store (self=0x7fffd831ad50) at dbmail-message.c:1165
http://www.dbmail.org/mantis/view.php?id=8  0x00007ffff6277acf in db_append_msg (
    msgdata=0x7fffd81e4060 "Received: from xxx.xxxxx.xx (xxx.xxx.xxxxx.xx [10.10.1.5])\r\n\tby xxx.xxxxx.xx (xxxxxx) with ESMTP id AEF7940914\r\n\tfor <xxxxxxxxxxxxx@xxxxxxxxxx>; Mon, 31 Oct 2011 17:44:28 +0100 (CET)\r\nReceived: from b"..., mailbox_idnr=569, user_idnr=1639, internal_date=0x7fffeead0d70 "2011-10-31 16:44:28", msg_idnr=0x7fffeead0d98) at dm_db.c:3627
http://www.dbmail.org/mantis/view.php?id=9  0x000000000040e614 in _ic_append_enter (D=0x7fffd81f67f0) at imapcommands.c:1254
http://www.dbmail.org/mantis/view.php?id=10 0x00007ffff6283929 in dm_thread_dispatch (data=0x7fffd81f67f0, user_data=0x0) at server.c:162
http://www.dbmail.org/mantis/view.php?id=11 0x00007ffff706b5cf in ?? () from /lib/libglib-2.0.so.0
http://www.dbmail.org/mantis/view.php?id=12 0x00007ffff7069784 in ?? () from /lib/libglib-2.0.so.0
http://www.dbmail.org/mantis/view.php?id=13 0x00007ffff5dd78ba in start_thread () from /lib/libpthread.so.0
http://www.dbmail.org/mantis/view.php?id=14 0x00007ffff5b3f02d in clone () from /lib/libc.so.6
http://www.dbmail.org/mantis/view.php?id=15 0x0000000000000000 in ?? ()
(gdb) bt full 5
http://www.dbmail.org/mantis/view.php?id=0  0x00007ffff5aee52a in memset () from /lib/libc.so.6
No symbol table info available.
http://www.dbmail.org/mantis/view.php?id=1  0x00007ffff7957e89 in g_mime_iconv_strndup (cd=0x798730, str=0x7fffd81e8b10 "Änderungen an der Artikel-Detailseite", n=38) at gmime-iconv-utils.c:161
        inleft = 0
        outleft = 140736817703276
        converted = 134665270
        out = 0x7fffd806a910 ""
        outbuf = 0x0
        inbuf = 0x7fffd81e8b36 ""
        outlen = 92
        errnosav = 32767
http://www.dbmail.org/mantis/view.php?id=2  0x00007ffff7957f30 in g_mime_iconv_strdup (cd=0x798730, str=0x7fffd81e8b10 "Änderungen an der Artikel-Detailseite") at gmime-iconv-utils.c:199
No locals.
http://www.dbmail.org/mantis/view.php?id=3  0x00007ffff6281a79 in dbmail_iconv_str_to_db (str_in=0x7fffd81e8b10 "Änderungen an der Artikel-Detailseite", charset=0x7fffd80cc9c0 "utf-8") at dm_iconv.c:134
        subj = 0x0
        conv_iconv = 0x2600000026
http://www.dbmail.org/mantis/view.php?id=4  0x00007ffff6257384 in _header_cache (key=0x7fffd804ede0 "Subject", header=0x7fffd804ede0 "Subject", user_data=0x7fffd831ad50) at dbmail-message.c:1572
        t = 0x7fffd81e8b10 "Änderungen an der Artikel-Detailseite"
        value = 0x7fffd81d1310 "[xxxxxxxxxxxxxxxxxx.xx 0000747]: Änderungen an der Artikel-Detailseite"
        headername_id = 7
        headervalue_id = 0
        self = 0x7fffd831ad50
        values = 0x7fffd82e0530
        raw = 0x7fffd819cff0 "=?utf-8?Q?[xxxxxxxxxxxxxxxxxx.xx_0000747]:_=C3=84nderungen_an_der_Artikel?= =?utf-8?Q?-Detailseite?="
        i = 0
        date = 0
        isaddr = 0
        isdate = 0
        issubject = 1
        charset = 0x7fffd80cc9c0 "utf-8"
        sortfield = 0x0
        datefield = 0x0
        emaillist = 0x0
        ia = 0x7fffe01b0300
        __func__ = "_header_cache"
(More stack frames follow...)

On first sight, it looks like gmime's frame 1 g_mime_iconv_strndup() went nuts, notice the locals "outleft" and "converted", they seem way off.

I've not yet looked into dbmail's frame 4, i.e. why "t" is only part of "value" (perhaps this is expected). I've dumped core to inspect this later on. Perhaps you see something obvious.

Here is the exact version of gmime-iconv-utils.c used to compile libgmime on the system.
Notice the line number 161 (memset) as reported above:
http://git.gnome.org/browse/gmime/tree/gmime/gmime-iconv-utils.c?h=gmime-2-4&id=GMIME_2_4_14#n102

I've looked up their history, there were *no* code changes to gmime-iconv-utils.c after that on the whole 2.4 branch.
(on 2.6 they refactored g_mime_iconv_strndup():
http://git.gnome.org/browse/gmime/commit/?id=9ab4d7fc16fe0df5d0da55dbe2f8422c66ee21b6 )

So in the end, on first sight (it's 2 a.m. here...), it looks like a gmime bug :-)

----------------------------------------------------------------------
 (0003330) john (reporter) - 12-Nov-11 01:59
 http://www.dbmail.org/mantis/view.php?id=940#c3330
----------------------------------------------------------------------
(... the bug tracker replaced gdb's frame identifiers with links to issues ....)

----------------------------------------------------------------------
 (0003333) paul (administrator) - 12-Nov-11 21:06
 http://www.dbmail.org/mantis/view.php?id=940#c3333
----------------------------------------------------------------------
I'm assuming this is a gmime bug. Squeeze's version is quite old (January 2010), and Ubuntu's (2.4.24) at least doesn't crash on this Subject.

----------------------------------------------------------------------
 (0003334) paul (administrator) - 12-Nov-11 21:07
 http://www.dbmail.org/mantis/view.php?id=940#c3334
----------------------------------------------------------------------
fyi, the modified Subject value in dbmail's frame is due to base-subject reduction which is needed for THREAD=ORDEREDSUBJECT

----------------------------------------------------------------------
 (0003341) john (reporter) - 13-Nov-11 16:25
 http://www.dbmail.org/mantis/view.php?id=940#c3341
----------------------------------------------------------------------
I'm still not sure about the root of the problem. I can't reproduce this with the gmime testsuite of the same version on the same system (i.e. same libs) and the string in question.
Here is a link to the upstream thread:
http://mail.gnome.org/archives/gmime-devel-list/2011-November/msg00006.html

Please don't close this yet as it could also be related to an unclean state of "cd" (the conversion descriptor), which is initialized in dbmail's code.
(my last message to this thread still awaits moderator approval, I'm not subscribed there).

----------------------------------------------------------------------
 (0003342) john (reporter) - 13-Nov-11 17:12
 http://www.dbmail.org/mantis/view.php?id=940#c3342
----------------------------------------------------------------------
Looks like I'm on the never ending heisenbug hunting session:

Gmime uses glibc's iconv_* internally, hence the conversion descriptor is obtained by calling iconv_open(3). From the man page:

---
A conversion descriptor contains a conversion state. After creation using iconv_open(), the state is in the initial state. Using iconv(3) modifies the descriptor's conversion state. (This implies that a conversion descriptor can not be used in multiple threads simultaneously.) To bring the state back to the initial state, use iconv(3) with NULL as inbuf argument.
---

Is the cd stored as DBMail's ic->to_db thread safe (might also affect other cds)?

As noted in the upstream thread, the locals in g_mime_iconv_strndup make no sense, but this would change if cd would be in an (unexpected) unclean state.

----------------------------------------------------------------------
 (0003343) paul (administrator) - 13-Nov-11 19:54
 http://www.dbmail.org/mantis/view.php?id=940#c3343
----------------------------------------------------------------------
As far as I know the ic struct is passed around between threads in a safe manner, and never shared between them. Also, initialization of the iconv structures is done using g_once which is the documented way of doing this safely.

I haven't seen thread related issues in almost two years. But this sure smells like one.
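For illustration, the usage rule quoted from the iconv_open(3) man page above can be sketched as follows: a conversion descriptor shared between threads is guarded by a mutex, and its state is reset before each conversion. This is only a minimal sketch, not DBMail's or gmime's actual code. The names locked_iconv, locked_iconv_open and locked_iconv_strdup are hypothetical, and the output buffer size is a rough upper bound assumed for the example.

```c
#include <iconv.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: a shared conversion descriptor plus the mutex
 * that serializes every use of it. */
struct locked_iconv {
    iconv_t cd;
    pthread_mutex_t lock;
};

/* Open the descriptor and initialize its mutex; returns 0 on success. */
int locked_iconv_open(struct locked_iconv *li, const char *to, const char *from)
{
    li->cd = iconv_open(to, from);
    if (li->cd == (iconv_t)-1)
        return -1;
    if (pthread_mutex_init(&li->lock, NULL) != 0) {
        iconv_close(li->cd);
        return -1;
    }
    return 0;
}

/* Convert a NUL-terminated string. Returns a malloc'd, NUL-terminated
 * result, or NULL on any error. */
char *locked_iconv_strdup(struct locked_iconv *li, const char *str)
{
    size_t inleft = strlen(str);
    size_t outlen = inleft * 4 + 16;   /* assumed upper bound for this sketch */
    size_t outleft = outlen;
    char *out = malloc(outlen + 4);    /* room for a terminator up to 4 bytes wide */
    char *inbuf = (char *)str;
    char *outbuf = out;
    size_t rc;

    if (out == NULL)
        return NULL;

    pthread_mutex_lock(&li->lock);
    /* Bring the descriptor back to its initial state before use. */
    iconv(li->cd, NULL, NULL, NULL, NULL);
    rc = iconv(li->cd, &inbuf, &inleft, &outbuf, &outleft);
    if (rc != (size_t)-1)
        /* Flush any pending shift sequence into the output buffer. */
        rc = iconv(li->cd, NULL, NULL, &outbuf, &outleft);
    pthread_mutex_unlock(&li->lock);

    if (rc == (size_t)-1) {
        free(out);
        return NULL;
    }
    memset(outbuf, 0, 4);              /* terminate the converted string */
    return out;
}
```

Holding the lock across the reset, the conversion and the flush means no other thread can observe or modify the descriptor's state halfway through a conversion. A per-thread descriptor would avoid the lock entirely, at the cost of one iconv_open() per thread.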

----------------------------------------------------------------------
 (0003344) paul (administrator) - 13-Nov-11 20:25
 http://www.dbmail.org/mantis/view.php?id=940#c3344
----------------------------------------------------------------------
Scratch the first part of my previous message. ic->to_db is not used in a thread safe manner. I've added mutex locks around access to iconv_t.

Plz try
http://git.dbmail.eu/paul/dbmail/commit/?id=71ed85b9041fc93dee25b5c0b63fc16dd40a8238

----------------------------------------------------------------------
 (0003345) john (reporter) - 14-Nov-11 17:45
 http://www.dbmail.org/mantis/view.php?id=940#c3345
----------------------------------------------------------------------
Thanks Paul. That's exactly what I meant (initialization was protected by g_once, but I was unsure about the missing coordination for usage).

I'm attaching some related patches for stuff that I've noticed while looking through the file.

I'll resume the imapsyncs against the latest HEAD this evening and will report back within the next few days.

Issue History
Date Modified    Username       Field                    Change
======================================================================
11-Nov-11 16:01  john           New Issue
11-Nov-11 16:20  john           Note Added: 0003328
12-Nov-11 01:54  john           Note Added: 0003329
12-Nov-11 01:59  john           Note Added: 0003330
12-Nov-11 21:06  paul           Note Added: 0003333
12-Nov-11 21:07  paul           Note Added: 0003334
13-Nov-11 16:25  john           Note Added: 0003341
13-Nov-11 17:12  john           Note Added: 0003342
13-Nov-11 19:54  paul           Note Added: 0003343
13-Nov-11 20:25  paul           Note Added: 0003344
14-Nov-11 17:45  john           Note Added: 0003345
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev