Re: Crashes in recent builds from master

2021-09-08 Thread Aaron Schrab

At 09:39 -0700 08 Sep 2021, "Kevin J. McCarthy" wrote:
> On Wed, Sep 08, 2021 at 09:09:03AM -0700, Kevin J. McCarthy wrote:
> > On Wed, Sep 08, 2021 at 03:56:56AM -0700, Kevin J. McCarthy wrote:
> > > To trigger the QRESYNC failure, delete some messages in the mailbox
> > > using mutt.  Sync and exit the mailbox, wait till there are more new
> > > messages in that mailbox and reopen using mutt.

In the little bit of testing I was able to do after enabling ASAN, it
had seemed that it was actually crashing as I left the mailbox after
deleting rather than needing to return to it. But, I'd wanted to do a
bit more to isolate the problem before reporting that.

> > I was able to trigger the crash, and I've figured out the problem.
> > I'll push a commit to a branch for testing later on today.
>
> I've pushed several commits to branch 'kevin/stable-fixes'.  As the
> branch says, it's based on 'stable' and so doesn't have the thread
> changes in master.
>
> However, I've also pushed up a branch
> 'kevin/master-stable-fixes-rebase-test' that has those commits merged
> into master.
>
> I need to clean things up and test more, but would appreciate it if you
> would test it too.

I've done a bit of testing with that and ASAN already, and haven't had
any crashes even when doing things that seemed to reliably cause crashes
with ASAN before those fixes. I'll continue using that build for a while,
although I'll likely want to disable ASAN at some point.

> Thank you.

Thank you for the prompt fix.


signature.asc
Description: PGP signature


Re: Crashes in recent builds from master

2021-09-08 Thread Eric Blake
On Wed, Sep 08, 2021 at 06:24:51PM +0200, Rene Kita wrote:
> On Wed, Sep 08, 2021 at 03:56:56AM -0700, Kevin J. McCarthy wrote:
> > However, I haven't been able to figure out the memory error leading to a
> > crash yet.  It would be helpful if you could run with ASAN enabled until you
> > get the crash(es).  With ASAN you'll need to arrange it so the tmux window
> > doesn't close when mutt crashes so you can read the ASAN report.
> JFTR, you should be able to catch the output from ASAN with tee:
> % mutt 2> >(tee -a stderr.log >&2)

Or, before you start mutt, add ASAN_OPTIONS=log_path=asan:print_legend=0 to
your environment, and the ASAN output will be written to a file (named
asan.<pid>) in the current working directory instead of the terminal.
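
As a rough sketch of that approach: the variable AddressSanitizer reads is ASAN_OPTIONS, and with log_path set each report goes to a file named after that path with the process ID appended. The log directory below is an arbitrary choice for illustration, not something from the thread.

```shell
# Sketch only: the directory name is an assumption; pick any writable place.
asan_log_dir="${TMPDIR:-/tmp}/mutt-asan-logs"
mkdir -p "$asan_log_dir"

# log_path makes ASAN write each report to "<path>.<pid>" instead of stderr;
# print_legend=0 drops the shadow-byte legend from the report.
ASAN_OPTIONS="log_path=$asan_log_dir/asan:print_legend=0"
export ASAN_OPTIONS

# Then start the ASAN-enabled build from this shell, for example:
#   mutt
# and afterwards look for reports with:
#   ls -l "$asan_log_dir"
```

Using an absolute path keeps the reports in one place even if mutt is started from different directories.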

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: Crashes in recent builds from master

2021-09-08 Thread Kevin J. McCarthy

On Wed, Sep 08, 2021 at 09:09:03AM -0700, Kevin J. McCarthy wrote:
> On Wed, Sep 08, 2021 at 03:56:56AM -0700, Kevin J. McCarthy wrote:
> > However, I haven't been able to figure out the memory error leading
> > to a crash yet.  It would be helpful if you could run with ASAN
> > enabled until you get the crash(es).  With ASAN you'll need to
> > arrange it so the tmux window doesn't close when mutt crashes so you
> > can read the ASAN report.
> >
> > To trigger the QRESYNC failure, delete some messages in the mailbox
> > using mutt.  Sync and exit the mailbox, wait till there are more new
> > messages in that mailbox and reopen using mutt.
>
> I was able to trigger the crash, and I've figured out the problem.
> I'll push a commit to a branch for testing later on today.


I've pushed several commits to branch 'kevin/stable-fixes'.  As the
branch says, it's based on 'stable' and so doesn't have the thread
changes in master.

However, I've also pushed up a branch
'kevin/master-stable-fixes-rebase-test' that has those commits merged
into master.

I need to clean things up and test more, but would appreciate it if you
would test it too.


Thank you.

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA




Re: Crashes in recent builds from master

2021-09-08 Thread Rene Kita
On Wed, Sep 08, 2021 at 03:56:56AM -0700, Kevin J. McCarthy wrote:
> However, I haven't been able to figure out the memory error leading to a
> crash yet.  It would be helpful if you could run with ASAN enabled until you
> get the crash(es).  With ASAN you'll need to arrange it so the tmux window
> doesn't close when mutt crashes so you can read the ASAN report.
JFTR, you should be able to catch the output from ASAN with tee:
% mutt 2> >(tee -a stderr.log >&2)


Re: Crashes in recent builds from master

2021-09-08 Thread Kevin J. McCarthy

On Wed, Sep 08, 2021 at 03:56:56AM -0700, Kevin J. McCarthy wrote:
> However, I haven't been able to figure out the memory error leading to
> a crash yet.  It would be helpful if you could run with ASAN enabled
> until you get the crash(es).  With ASAN you'll need to arrange it so
> the tmux window doesn't close when mutt crashes so you can read the
> ASAN report.
>
> To trigger the QRESYNC failure, delete some messages in the mailbox
> using mutt.  Sync and exit the mailbox, wait till there are more new
> messages in that mailbox and reopen using mutt.


I was able to trigger the crash, and I've figured out the problem.  I'll 
push a commit to a branch for testing later on today.


--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA




Re: Crashes in recent builds from master

2021-09-08 Thread Kevin J. McCarthy

On Tue, Sep 07, 2021 at 09:50:05PM -0700, Kevin J. McCarthy wrote:
> On Tue, Sep 07, 2021 at 11:45:03PM -0400, Aaron Schrab wrote:
> > At 18:00 -0700 07 Sep 2021, "Kevin J. McCarthy" wrote:
> > > Are you using $imap_qresync or $imap_condstore?
> >
> > Yes, I have both of those enabled (using dovecot 2.3.16 from Debian
> > unstable as the IMAP server).  In at least some of the crashes I
> > believe I've seen messages about QRESYNC failing immediately before;
> > but I generally have mutt running in a tmux window that's set to
> > close when mutt exits so the message is generally only visible very
> > briefly.
>
> That gives me some ideas.  I'll take a closer look, but I think my fix
> in commit 74ce032f may have caused some other issues with
> $imap_qresync.


Yes, it looks like commit 74ce032f was incorrect.  I'll need to fix that 
and make another stable release soon.


However, I haven't been able to figure out the memory error leading to a 
crash yet.  It would be helpful if you could run with ASAN enabled until 
you get the crash(es).  With ASAN you'll need to arrange it so the tmux 
window doesn't close when mutt crashes so you can read the ASAN report.
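
One way to keep the report readable, sketched here under the assumption that mutt is started through a small shell wrapper inside the tmux window (the function name is hypothetical), is to wait for Enter after the program exits:

```shell
# Hypothetical wrapper: run a command, then hold the terminal open until
# Enter is pressed, so anything printed on exit (such as an ASAN report)
# stays visible even if the tmux window closes with its last process.
hold_after_exit() {
    "$@"
    exit_status=$?
    printf '\n%s exited with status %d; press Enter to close.\n' "$1" "$exit_status" >&2
    # read fails at end-of-file (no terminal attached); ignore that.
    read -r _unused || true
    return "$exit_status"
}

# Usage inside the tmux window:
#   hold_after_exit mutt
```

tmux's own remain-on-exit window option is an alternative that needs no wrapper.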


To trigger the QRESYNC failure, delete some messages in the mailbox 
using mutt.  Sync and exit the mailbox, wait till there are more new 
messages in that mailbox and reopen using mutt.


--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA




Re: Crashes in recent builds from master

2021-09-07 Thread Kevin J. McCarthy

On Tue, Sep 07, 2021 at 11:45:03PM -0400, Aaron Schrab wrote:
> At 18:00 -0700 07 Sep 2021, "Kevin J. McCarthy" wrote:
> > Are you using $imap_qresync or $imap_condstore?
>
> Yes, I have both of those enabled (using dovecot 2.3.16 from Debian
> unstable as the IMAP server).  In at least some of the crashes I
> believe I've seen messages about QRESYNC failing immediately before;
> but I generally have mutt running in a tmux window that's set to close
> when mutt exits so the message is generally only visible very briefly.


That gives me some ideas.  I'll take a closer look, but I think my fix 
in commit 74ce032f may have caused some other issues with $imap_qresync.


I don't see how it's causing the crash, but it may be that I didn't 
properly reset something if verifying the qresync failed, leading to a 
stray pointer.


I'll try to take a closer look over the next couple of days.

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA




Re: Crashes in recent builds from master

2021-09-07 Thread Aaron Schrab

At 18:00 -0700 07 Sep 2021, "Kevin J. McCarthy" wrote:
> On Tue, Sep 07, 2021 at 07:15:16PM -0400, Aaron Schrab wrote:
> > Since updating from a build based on bcdb61560 (Add %T status format
> > for $sort_thread_groups., 2021-08-05) to one based on 27e61da56
> > (Merge branch 'stable', 2021-08-24) I've been experiencing some
> > sporadic crashes.
>
> Unfortunately both IMAP and thread code *have* been touched recently.

I'd thought that most of the thread changes were in the clear, at least
until I'd looked further into my builds for writing the original
message.

> So it looks like I goofed something up.  :-(

Trying to catch those types of issues is one of the reasons I try to
follow master (or in some cases branches that aren't even that ready)
fairly closely.

> However, bcdb61560 isn't on master.  Were you running off of my
> development branch before, or were you perhaps referring to 5aa75ed2?

Yes, I had been using your branch with the early support for
$sort_thread_groups. I *thought* that I'd updated to the version of that
that got into master, but apparently I hadn't. At least that early
version of the new threading mode had seemed very stable to me. Of
course with the sporadic nature of the crashes it's possible that I just
hadn't hit the problem then.

> You may also want to try enabling ASAN via something like
>   export CFLAGS='-g3 -fno-omit-frame-pointer -fsanitize=address'
> and re-configure/recompile, to see if it can give an earlier warning
> about memory corruption.

I've added that to my configure wrapper script, and I'll be restarting
to use the copy built with that as soon as I send this message.

> > For the more troublesome one I get the following backtrace. Once
> > this comes up it will keep crashing when I attempt to change to the
> > same folder, at least in the short term. Although if I open this
> > folder once with the old build then switch back to the new build the
> > problem will go away for a while.
>
> Are you using $imap_qresync or $imap_condstore?

Yes, I have both of those enabled (using dovecot 2.3.16 from Debian
unstable as the IMAP server).  In at least some of the crashes I believe
I've seen messages about QRESYNC failing immediately before; but I
generally have mutt running in a tmux window that's set to close when
mutt exits so the message is generally only visible very briefly.

> The stack is in a pretty benign section, so it seems like it's a wild
> pointer or something corrupting memory.
>
> > The other problem seems to occur on line 431 of thread.c:
> >
> >       !tmp->fake_thread &&   /* don't match pseudo threads */
>
> I usually test with $strict_threads enabled.  I'll turn that off and see
> if I can trigger the problem.

If I run into a case where the problem seems to be at least briefly
reproducible I'll try turning that off to see if that avoids the
problem.




Re: Crashes in recent builds from master

2021-09-07 Thread Kevin J. McCarthy

On Tue, Sep 07, 2021 at 07:15:16PM -0400, Aaron Schrab wrote:
> Since updating from a build based on bcdb61560 (Add %T status format
> for $sort_thread_groups., 2021-08-05) to one based on 27e61da56 (Merge
> branch 'stable', 2021-08-24) I've been experiencing some sporadic
> crashes.


Unfortunately both IMAP and thread code *have* been touched recently. 
So it looks like I goofed something up.  :-(


However, bcdb61560 isn't on master.  Were you running off of my 
development branch before, or were you perhaps referring to 5aa75ed2?


You may also want to try enabling ASAN via something like
  export CFLAGS='-g3 -fno-omit-frame-pointer -fsanitize=address'
and re-configure/recompile, to see if it can give an earlier warning
about memory corruption.
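
Spelled out as a sketch, and assuming a source tree where ./configure already exists (a git checkout may need the build system generated first), the rebuild could look like the following. The function and variable names are made up for illustration.

```shell
# Flags quoted from the suggestion above.
ASAN_CFLAGS='-g3 -fno-omit-frame-pointer -fsanitize=address'

# Hypothetical helper: reconfigure and rebuild with ASAN enabled.
# The body runs in a subshell, so CFLAGS does not leak into the caller.
# Any arguments are passed through to ./configure unchanged.
build_with_asan() (
    CFLAGS="$ASAN_CFLAGS"
    export CFLAGS
    ./configure "$@" && make
)

# Usage, from the top of the source tree:
#   build_with_asan [your usual configure options]
```

Keeping the flags in one variable makes it easy to drop ASAN again later by rebuilding without the helper.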

> For the more troublesome one I get the following backtrace. Once this
> comes up it will keep crashing when I attempt to change to the same
> folder, at least in the short term. Although if I open this folder
> once with the old build then switch back to the new build the problem
> will go away for a while.


Are you using $imap_qresync or $imap_condstore?

The stack is in a pretty benign section, so it seems like it's a wild
pointer or something corrupting memory.


> The other problem seems to occur on line 431 of thread.c:
>
>       !tmp->fake_thread &&   /* don't match pseudo threads */


I usually test with $strict_threads enabled.  I'll turn that off and see
if I can trigger the problem.

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA




Crashes in recent builds from master

2021-09-07 Thread Aaron Schrab
Since updating from a build based on bcdb61560 (Add %T status format for 
$sort_thread_groups., 2021-08-05) to one based on 27e61da56 (Merge 
branch 'stable', 2021-08-24) I've been experiencing some sporadic 
crashes. After getting coredumps enabled, it seems that there are a
couple of different issues. I've mainly observed both of these when
changing folders; I'm using IMAP folders exclusively, and for the
purposes here it's all on a single, local IMAP server.


For the more troublesome one I get the following backtrace. Once this 
comes up it will keep crashing when I attempt to change to the same 
folder, at least in the short term. Although if I open this folder once 
with the old build then switch back to the new build the problem will go 
away for a while.


#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x7fb66d64a536 in __GI_abort () at abort.c:79
#2  0x7fb66d6a22b8 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fb66d7b03a4 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x7fb66d6a9d0a in malloc_printerr (str=str@entry=0x7fb66d7ae6a2 "realloc(): invalid next size") at malloc.c:5389
#4  0x7fb66d6adf8c in _int_realloc (av=av@entry=0x7fb66d7e2ba0, oldp=oldp@entry=0x55e25f67af30, oldsize=oldsize@entry=569616, nb=569824) at malloc.c:4601
#5  0x7fb66d6af0e6 in __GI___libc_realloc (oldmem=0x55e25f67af40, bytes=569808) at malloc.c:3246
#6  0x55e25c20265f in safe_realloc (ptr=0x55e25e53b408, siz=569808) at lib.c:176
#7  0x55e25c1c2eb6 in mx_alloc_memory (ctx=0x55e25e53b3b0) at mx.c:1461
#8  0x55e25c251854 in imap_read_headers (idata=0x55e25e5f2800, msn_begin=71193, msn_end=71201, initial_download=1) at message.c:409
#9  0x55e25c24cae8 in imap_open_mailbox (ctx=0x55e25e53b3b0) at imap.c:997
#10 0x55e25c1c069b in mx_open_mailbox (path=0x55e25e43f310 "imaps://a...@pug.qqx.org/L/dev/git", flags=0, pctx=0x0) at mx.c:656
#11 0x55e25c18926a in mutt_index_menu () at curs_main.c:1433
#12 0x55e25c1b1dd9 in main (argc=1, argv=0x7fffbf9e51b8, environ=0x7fffbf9e51d0) at main.c:1380

The other problem seems to occur on line 431 of thread.c:

      !tmp->fake_thread &&   /* don't match pseudo threads */

With this one, starting mutt again and immediately opening the same
folder appears to work, although it seems likely to appear again soon
after. I don't currently have a core file for this problem. Since I
wasn't expecting to be looking at two different problems, I was just
using the standard name `core`, so the core file from the first issue
overwrote the ones from this issue.
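
A small guard against that kind of overwrite, as a sketch: move any existing `core` aside under a unique name before starting the next session. The function name and naming scheme are assumptions for illustration. On Linux, setting kernel.core_pattern to include %p is the system-wide way to get per-process core names.

```shell
# Hypothetical helper: rename ./core to a timestamped name so the next
# crash cannot overwrite it. Does nothing if there is no core file.
preserve_core() {
    if [ -f core ]; then
        new_name="core.$(date +%Y%m%d-%H%M%S).$$"
        mv core "$new_name" && printf 'saved %s\n' "$new_name"
    fi
}

# Usage, in the directory where mutt dumps core, before restarting mutt:
#   preserve_core
```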


For all of this I have:

set sort="threads"
set sort_aux="date-received"
set sort_thread_groups="last-date-received"
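
To try reproducing this in isolation, the settings mentioned in the thread can be collected into a throwaway configuration. This is only a sketch: the file locations are arbitrary, and the header_cache line is an assumption (as far as I know the CONDSTORE/QRESYNC code paths rely on the header cache), not something stated in the thread.

```shell
# Sketch only: paths are arbitrary; adjust to taste.
repro_dir="${TMPDIR:-/tmp}/mutt-repro"
mkdir -p "$repro_dir/hcache"

cat > "$repro_dir/muttrc" <<EOF
# Sorting settings from the original report.
set sort="threads"
set sort_aux="date-received"
set sort_thread_groups="last-date-received"
# IMAP options discussed elsewhere in the thread.
set imap_condstore=yes
set imap_qresync=yes
# Assumption: a header cache is configured so the options above take effect.
set header_cache="$repro_dir/hcache"
EOF

# Then start mutt with only this file and open the IMAP folder, for example:
#   mutt -n -F "$repro_dir/muttrc" -f imaps://user@host/folder
```

The `-n` flag skips the system-wide configuration, so only the settings in the generated file apply.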

I plan to continue looking for a fix myself, but if anyone else has 
ideas I'd be glad to hear them.