THREAD segfault

2011-07-22 Thread David Carter
The attached Berkeley format mailbox contains three messages from the 
imap5 IETF list (including one from Bron!).


The third message in the sequence contains a broken References: header 
which causes the IMAP THREAD extension to segfault in Cyrus 2.3 and 2.4.


 . THREAD REFERENCES UTF-8 ALL
 Connection closed by foreign host.

gdb tells me that the explosion is in index.c:

 static int index_thread_compare(Thread *t1, Thread *t2,
                                 struct sortcrit *call_data)
 {
     MsgData *md1, *md2;

     /* if the container is empty, use the first child's container */
     md1 = t1->msgdata ? t1->msgdata : t1->child->msgdata;
     md2 = t2->msgdata ? t2->msgdata : t2->child->msgdata;
     return index_sort_compare(md1, md2, call_data);
 }

where t2->msgdata and t2->child are both NULL, so t2->child->msgdata
isn't going to work.
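
For illustration only, a defensive version of that comparison might look like the sketch below. The Thread and MsgData types here are cut-down stand-ins, and thread_sort_key(), compare_msgdata() and thread_compare_safe() are hypothetical names, not functions from imap/index.c. The sketch only shows how a container with neither a message nor a child can be ordered without being dereferenced; it does not address why such a container exists in the first place.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the Cyrus types; the real
 * definitions carry more fields. */
typedef struct msgdata {
    unsigned uid;
} MsgData;

typedef struct thread {
    MsgData *msgdata;
    struct thread *parent, *child, *next;
} Thread;

/* Return the message that represents a container for sorting, or NULL
 * if the container is empty and has no children. */
static MsgData *thread_sort_key(const Thread *t)
{
    if (t == NULL)
        return NULL;
    if (t->msgdata != NULL)
        return t->msgdata;
    return t->child != NULL ? t->child->msgdata : NULL;
}

/* Stand-in for index_sort_compare(): orders by uid only. */
static int compare_msgdata(const MsgData *a, const MsgData *b)
{
    return (a->uid > b->uid) - (a->uid < b->uid);
}

/* Defensive comparison: containers with no usable message sort first
 * instead of triggering a NULL dereference. */
static int thread_compare_safe(const Thread *t1, const Thread *t2)
{
    MsgData *md1 = thread_sort_key(t1);
    MsgData *md2 = thread_sort_key(t2);

    if (md1 == NULL || md2 == NULL)
        return (md1 != NULL) - (md2 != NULL);
    return compare_msgdata(md1, md2);
}
```

A guard like this would only hide the symptom; the empty node would still end up in the sorted output.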


Working up the stack backtrace _index_thread_ref() [imap/index.c] has

/* Step 4: sort the root set */
ref_sort_root(rootset.root);

where rootset.root contains:

 (gdb) p *rootset.root->child->next
 $6 = {msgdata = 0x0, parent = 0x0, child = 0x0, next = 0x0}

I believe that this spurious empty node without any content is the cause 
of the segfault. I'm a little puzzled since the previous step in the same 
function is:


/* Step 3: prune tree of empty containers - get our deposit back :^) */
ref_prune_tree(rootset.root);

which appears to exist precisely to remove such nodes.
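
One way to check whether pruning really leaves such nodes behind is to walk the tree and count containers that have neither a message nor a child. The sketch below is a hypothetical debugging helper, not Cyrus code: the Thread type is a cut-down stand-in and count_empty_leaves() is an invented name.

```c
#include <stddef.h>

/* Cut-down stand-in for the Cyrus Thread container (assumption). */
typedef struct thread {
    void *msgdata;
    struct thread *parent, *child, *next;
} Thread;

/* Count containers in a sibling list (and everything below it) that
 * hold no message and have no children. After a correct prune this
 * should be zero. */
static size_t count_empty_leaves(const Thread *t)
{
    size_t n = 0;

    for (; t != NULL; t = t->next) {
        if (t->msgdata == NULL && t->child == NULL)
            n++;
        n += count_empty_leaves(t->child);
    }
    return n;
}
```

Called on the root's child list between steps 3 and 4, a non-zero result would confirm that the empty node survives pruning rather than being created later.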

I think that this is the same bug as:

  http://bugzilla.cyrusimap.org/show_bug.cgi?id=2772

which has been open since December 2005.

Should I open a new bug in bugzilla? Stripping the spurious whitespace 
from the middle of msgids in the References header (as John Capo suggests 
in Bugzilla #2772) feels like the most elegant solution.
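
As a rough sketch of what that normalisation could look like (not the patch from Bugzilla #2772, and strip_msgid_whitespace() is a hypothetical name), the function below copies a References value while dropping any whitespace found between '<' and '>'. Whitespace between msgids is kept so that they can still be split apart afterwards.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy src to dst, dropping whitespace that occurs inside <...>.
 * dst must have room for strlen(src) + 1 bytes. Returns the number
 * of characters written, not counting the terminating NUL. */
static size_t strip_msgid_whitespace(const char *src, char *dst)
{
    int in_msgid = 0;
    size_t n = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (c == '<')
            in_msgid = 1;
        else if (c == '>')
            in_msgid = 0;

        if (in_msgid && isspace(c))
            continue;   /* whitespace inside a msgid: drop it */
        dst[n++] = (char)c;
    }
    dst[n] = '\0';
    return n;
}
```

Note that an unterminated '<' would cause all later whitespace to be dropped, so real code would probably want to bound how far it looks for the closing '>'.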


--
David Carter                             Email: david.car...@ucs.cam.ac.uk
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

From MAILER-DAEMON Fri Jul 22 16:40:42 2011
Date: 22 Jul 2011 16:40:42 +0100
From: Mail System Internal Data mailer-dae...@magenta.csi.cam.ac.uk
Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
Message-ID: 1311349...@magenta.csi.cam.ac.uk
X-IMAP: 1311349242 03
Status: RO

This text is part of the internal format of your mail folder, and is not
a real message.  It is created automatically by the mail system software.
If deleted, important folder data will be lost, and it will be re-created
with the data reset to initial values.

From dp...@magenta.csi.cam.ac.uk Thu Jul 21 16:29:24 2011 +0100
Return-Path: imap5-boun...@ietf.org
Received: from ppsw-50.csi.cam.ac.uk (ppsw-50-intramail.csi.cam.ac.uk 
[192.168.128.150])
 by cyrus-28.csi.private.cam.ac.uk (Cyrus v2.3.14) with LMTPA;
 Thu, 21 Jul 2011 16:29:24 +0100
X-Sieve: CMU Sieve 2.3
X-Cam-AntiVirus: no malware found
X-Cam-SpamDetails: score -5.3 from SpamAssassin-3.3.2-1148654 
 * -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/,
 *  medium trust
 *  [64.170.98.30 listed in list.dnswl.dnsbl.ja.net]
 *  0.0 BAD_ENC_HEADER Message has bad MIME encoding in the header
 *  0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
 *   (brong[at]fastmail.fm)
 * -1.2 RP_MATCHES_RCVD Envelope sender domain matches handover relay
 *  domain
 * -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
 *  [score: 0.]
 *  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
 *  valid
 *  0.0 T_DKIM_INVALID DKIM-Signature header exists but is not valid
X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/
Received: from mail.ietf.org ([64.170.98.30]:55098)
by ppsw-50.csi.cam.ac.uk (mx.cam.ac.uk [131.111.8.147]:25)
with esmtp (csa=unknown) id 1QjvBj-0004Z9-qR (Exim 4.72) for 
d...@dotat.at
(return-path imap5-boun...@ietf.org); Thu, 21 Jul 2011 16:29:24 +0100
Received: from ietfa.amsl.com (localhost [127.0.0.1])
by ietfa.amsl.com (Postfix) with ESMTP id 7141B21F8AE6;
Thu, 21 Jul 2011 08:29:22 -0700 (PDT)
X-Original-To: im...@ietfa.amsl.com
Delivered-To: im...@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1])
by ietfa.amsl.com (Postfix) with ESMTP id 7354321F8A7B
for im...@ietfa.amsl.com; Thu, 21 Jul 2011 08:29:21 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.394
X-Spam-Level: 
X-Spam-Status: No, score=-2.394 tagged_above=-999 required=5
tests=[AWL=-0.605, BAD_ENC_HEADER=1.81, BAYES_00=-2.599,
RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([64.170.98.30])
by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id u7MDkcPG2Jx0 for im...@ietfa.amsl.com;
Thu, 21 Jul 2011 08:29:17 -0700 (PDT)
Received: from out5.smtp.messagingengine.com (out5.smtp.messagingengine.com
[66.111.4.29]) by ietfa.amsl.com (Postfix) with 

Re: THREAD segfault

2011-07-22 Thread Bron Gondwana
On Fri, Jul 22, 2011 at 05:58:07PM +0100, David Carter wrote:
> The attached Berkeley format mailbox contains three messages from
> the imap5 IETF list (including one from Bron!).

Hehe - wow.  There's another open bug about this too.  There seems to
be various brokenness in the THREAD stuff.

#3463

I'm going to merge the two tickets - feel free to add more information
to the end result if you think it will help, but I think the comment
on #2772 is right, we should be normalising the whitespace.

Bron.