Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2017-07-09 Thread David Bremner
Mark Walters  writes:

>
> I have looked at this and I think this is not notmuch's fault: I think
> it is a mua doing strange things:
>
> One of the mails has an in-reply-to header which looks like
>
> In-reply-to: Message from Carsten Dominik  of  
>   "Tue, 15 Mar 2011 12:18:51 BST."
> <17242340-a14f-495a-b144-20c96d52b...@gmail.com>
>
> and I think notmuch is taking the carsten.domi...@gmail.com as message
> id.
>
> A similar in-reply-to header appears in the other thread so notmuch
> pairs them up. According to http://www.jwz.org/doc/threading.html this
> form of header is not allowed under RFC2822 but was allowed under the
> earlier RFC822.

I have identified a second, similar problem. Some MUA insert(s|ed) the
From: address into the References field.

For an example see

  id:t2sfa09ca6d1004220920haccb86aam4bb949c77024c...@mail.gmail.com
  https://www.mail-archive.com/emacs-orgmode@gnu.org/msg24266.html
  
This seems to be another throwback to RFC822.

I don't know how notmuch can detect this once those references
propagate to replies.

- we could merge threads only based on messages known to exist (ignoring
  ghost messages), but presumably there's a reason we invented ghost
  messages

- we could, at some point in the future, support blacklisting references
  (using message properties).
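
For illustration only, a rough shell sketch of how such references might
be spotted in a single message file (one message per file, as in a
maildir). The name suspicious_refs is made up for this example and is not
part of notmuch; it simply prints any <...> token from References: that
also occurs in the From: header of the same message:

```shell
# Hypothetical helper (not part of notmuch): print every <...> token of the
# References: header that also appears in the From: header of one message.
suspicious_refs() {
    awk '
        /^$/ { exit }                        # blank line: end of the headers
        /^[ \t]/ {                           # folded continuation line
            if (cur != "") hdr[cur] = hdr[cur] " " $0
            next
        }
        {
            cur = ""
            if (match($0, /^[^:]+:/)) {      # "Name:" at the start of a line
                cur = tolower(substr($0, 1, RLENGTH - 1))
                hdr[cur] = substr($0, RLENGTH + 1)
            }
        }
        END {
            from = hdr["from"]
            refs = hdr["references"]
            while (match(refs, /<[^<>]+>/)) {
                tok = substr(refs, RSTART, RLENGTH)
                if (index(from, tok)) print tok
                refs = substr(refs, RSTART + RLENGTH)
            }
        }
    ' "$1"
}
```

This only looks at the message that introduced the bad reference; it does
nothing about replies that have already copied it.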

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2014-02-16 Thread David Bremner
Gregor Zattler  writes:

>
> With mutt I looked at the collapsed 7 and 34 threads, respectively.
> One then sees the very first e-mail of each thread and, among other
> information, its subject.
> By editing I produced two lists of subjects and then searched
> for each of the 7 in the list of 34.
>

OK, if you have a small mailbox, send it to the list; otherwise send it
to me privately. Also, send a mockup of what you think the output of
notmuch search should be, since I'm still pretty confused about that.

d


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2014-01-27 Thread Eric
On Sun, 26 Jan 2014 22:26:04 +0100, Gregor Zattler  wrote:
> Hi David,
> * David Bremner  [24. Jan. 2014]:
> > Mark Walters  writes:
> >> I have looked at this and I think this is not notmuch's fault: I think
> >> it is a mua doing strange things:
> >>
> >> One of the mails has an in-reply-to header which looks like
> >>
> >> In-reply-to: Message from Carsten Dominik  of   
> >>  "Tue, 15 Mar 2011 12:18:51 BST."
> >> <17242340-a14f-495a-b144-20c96d52b...@gmail.com>
> >>
> >> and I think notmuch is taking the carsten.domi...@gmail.com as message
> >> id.
> >>
> > 
> > Can someone test if this is fixed by cf8aaafbad68 (i.e. does the problem
> > persist in git master or 0.17)?
> 
> The problem is *not* fixed.  

I've never been happy with notmuch's threading, always seem to get too
many threads, so I tend to do

notmuch show --format=mbox \
  $(notmuch search --output=threads -- whatever)  | mhonarc -tlevels 15 -

to work out which threads I want (I am not an emacs user!).

mhonarc, as far as I know, uses something like the jwz algorithm
(http://www.jwz.org/doc/threading.html), the use of which would
solve my problem but possibly not Gregor's. If that is indeed caused by
the header referred to above and there is some email client which does
that, it would need special handling in jwz and probably in any
algorithm, but the maintainers of the mail client should also be told to
fix it! (RFC2822)

Digression I know, but I just wanted to flag the need for more work in
general on threading in notmuch.

Eric
-- 
ms fnd in a lbry


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2014-01-26 Thread Gregor Zattler
Hi David,
* David Bremner  [26. Jan. 2014]:
> Gregor Zattler  writes:
> 
>> Today I produced another mbox with the very same command but with
>> a now larger email corpus freshly indexed with a fresh notmuch.
>> The mbox contains (according to mutt) 507 messages in 34 threads.
>> One of them is the thread I searched for.
>>
>> I grepped for the 7 subjects within the 34 subjects and only 5
>> showed up.
> 
> I don't know what you mean here. Grepped where? in the raw messages?

With mutt I looked at the collapsed 7 and 34 threads, respectively.
One then sees the very first e-mail of each thread and, among other
information, its subject.
By editing I produced two lists of subjects and then searched
for each of the 7 in the list of 34.

>> If somebody wants to dig into this: I can provide the two
>> mboxes.
>>
>> Disclaimer: Many of the emails which arrived before the problem
>> report are not exactly the same as they were then, because I have
>> since mangled them with a script.  This should not have changed the
>> threading but I cannot be 100% sure.  But if it's important for
>> further investigation I can probably reproduce the state of the
>> email corpus at that time from my backups.
> 
> If it's currently not working then I guess your current corpus should be
> fine. It would probably help to restate what exactly is wrong. There was
> a lot of discussion, and the concrete problem I saw identified (in
> id:874nvcekjk@qmul.ac.uk ) was that certain malformed In-reply-to
> headers were causing unrelated threads to merge.

Yes.  From the commit message of the commit you referenced in the
email I replied to, I understood that notmuch now uses the
References: headers to do the threading.  I had a quick look at
the References headers in the mbox file and none looked
suspicious.

Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2014-01-26 Thread David Bremner
Gregor Zattler  writes:

> Today I produced another mbox with the very same command but with
> a now larger email corpus freshly indexed with a fresh notmuch.
> The mbox contains (according to mutt) 507 messages in 34 threads.
> One of them is the thread I searched for.
>
> I grepped for the 7 subjects within the 34 subjects and only 5
> showed up.

I don't know what you mean here. Grepped where? in the raw messages?

> If somebody wants to dig into this: I can provide the two
> mboxes.
>
> Disclaimer: Many of the emails which arrived before the problem
> report are not exactly the same as they were then, because I have
> since mangled them with a script.  This should not have changed the
> threading but I cannot be 100% sure.  But if it's important for
> further investigation I can probably reproduce the state of the
> email corpus at that time from my backups.

If it's currently not working then I guess your current corpus should be
fine. It would probably help to restate what exactly is wrong. There was
a lot of discussion, and the concrete problem I saw identified (in
id:874nvcekjk@qmul.ac.uk ) was that certain malformed In-reply-to
headers were causing unrelated threads to merge.

d


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2014-01-26 Thread Gregor Zattler
Hi David,
* David Bremner  [24. Jan. 2014]:
> Mark Walters  writes:
>> I have looked at this and I think this is not notmuch's fault: I think
>> it is a mua doing strange things:
>>
>> One of the mails has an in-reply-to header which looks like
>>
>> In-reply-to: Message from Carsten Dominik  of 
>>"Tue, 15 Mar 2011 12:18:51 BST."
>> <17242340-a14f-495a-b144-20c96d52b...@gmail.com>
>>
>> and I think notmuch is taking the carsten.domi...@gmail.com as message
>> id.
>>
> 
> Can someone test if this is fixed by cf8aaafbad68 (i.e. does the problem
> persist in git master or 0.17)?

The problem is *not* fixed.  

I was the one who reported this problem two years ago.  I did the
same notmuch search again.  Since I did not know whether it matters
which version indexed the emails, I did a full reindex
with notmuch 0.17+40~gecbb29e.

I still have the mbox produced with notmuch show two years ago.
Viewed with mutt (1) I see 206 messages in 7 threads (number of
lines after collapse-all) (notmuch emacs show showed three
threads then).  One of the threads is the one I searched for.

Today I produced another mbox with the very same command but with
a now larger email corpus freshly indexed with a fresh notmuch.
The mbox contains (according to mutt) 507 messages in 34 threads.
One of them is the thread I searched for.

I grepped for the 7 subjects within the 34 subjects and only 5
showed up.

Only 17 of the 507 messages arrived since the problem report two
years ago.

If somebody wants to dig into this: I can provide the two
mboxes.

Disclaimer: Many of the emails which arrived before the problem
report are not exactly the same as they were then, because I have
since mangled them with a script.  This should not have changed the
threading but I cannot be 100% sure.  But if it's important for
further investigation I can probably reproduce the state of the
email corpus at that time from my backups.

Thanks for your persistence. 

Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2014-01-24 Thread David Bremner
Mark Walters  writes:

> Hi 
>
> I have looked at this and I think this is not notmuch's fault: I think
> it is a mua doing strange things:
>
> One of the mails has an in-reply-to header which looks like
>
> In-reply-to: Message from Carsten Dominik  of  
>   "Tue, 15 Mar 2011 12:18:51 BST."
> <17242340-a14f-495a-b144-20c96d52b...@gmail.com>
>
> and I think notmuch is taking the carsten.domi...@gmail.com as message
> id.
>

Can someone test if this is fixed by cf8aaafbad68 (i.e. does the problem
persist in git master or 0.17)?

d






Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-31 Thread Jameson Graef Rollins
On Tue, 31 Jan 2012 01:18:55 +, Mark Walters  
wrote:
> One of the mails has an in-reply-to header which looks like
> 
> In-reply-to: Message from Carsten Dominik  of  
>   "Tue, 15 Mar 2011 12:18:51 BST."
> <17242340-a14f-495a-b144-20c96d52b...@gmail.com>

!!


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-30 Thread Mark Walters
On Mon, 30 Jan 2012 23:34:16 +0100, Gregor Zattler  wrote:
> Hi Mark,
> * Mark Walters  [30. Jan. 2012]:
> > On Mon, 30 Jan 2012 20:04:25 +0100, Gregor Zattler  
> > wrote:
> >> * Pieter Praet  [30. Jan. 2012]:
> >>> On Mon, 30 Jan 2012 00:42:14 +0100, Gregor Zattler  
> >>> wrote:
> >>>> * Pieter Praet  [26. Jan. 2012]:
> >>>>> Here's another couple of threads squashed into a single one:
> >>>>> - [O] [Use Question] Capture and long lines
> >>>>>   - id:"banlktikof4txunllufrznsd6k2zys7s...@mail.gmail.com"
> >>>>> - [O] Worg update
> >>>>>   - id:"m1wrfiz3ch@tsdye.com"
> >>>>> - [O] Table formula to convert hex to dec
> >>>>>   - id:"20110724080054.GB16388@x201"
> >>>>> - [O] ICS import?
> >>>>>   - id:"20120125173421.GQ3747@x201"
> >>>>>
> >>>>>
> >>>>> AFAICT, none of them share Message-Id's...
> >>>>
> >>>> Do you consider this a bug?
> >>>>
> >>> 
> >>> I do.  No idea what causes it or how to fix it though... :)
> >> 
> >> At first I thought this was not a severe bug, since one sees more, not
> >> fewer, messages in the notmuch show buffer.  But later I realised one
> >> also sees fewer, not more, threads in the notmuch search buffer and
> >> might not read certain threads because of the "wrong" $Subject: in the
> >> notmuch search buffer.
> 
> > I think notmuch links two messages into the same thread if they have an
> > in-reply-to or reference header in common: i.e the messages reference a
> > common parent message.  (See comment in lib/database.cc "Even before a
> > message is added, it's pre-allocated thread ID is useful so that all
> > descendant messages that reference this common parent can be recognized
> > as belonging to the same thread.")
> 
> So in case message a from thread A and message b from B would
> name the same Message c in their In-Reply-To:/References:
> headers, while c is not (for some reason) in A or B, notmuch
> would assume both threads linked?  Makes sense.
>  
> > As far as I can see your grep tests haven't checked for that. 
> 
> True.
> 
> > Also, could you email me the mbox you had (I think you said that it was
> > a mailing list so all public) and I will take a look?
> 
> Sure, I do so off-list because of the size of the attachment.

Hi 

I have looked at this and I think this is not notmuch's fault: I think
it is a mua doing strange things:

One of the mails has an in-reply-to header which looks like

In-reply-to: Message from Carsten Dominik  of
"Tue, 15 Mar 2011 12:18:51 BST."
<17242340-a14f-495a-b144-20c96d52b...@gmail.com>

and I think notmuch is taking the carsten.domi...@gmail.com as message
id.

A similar in-reply-to header appears in the other thread so notmuch
pairs them up. According to http://www.jwz.org/doc/threading.html this
form of header is not allowed under RFC2822 but was allowed under the
earlier RFC822.
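
To make the failure mode concrete, here is a small sketch (an
illustration only, not notmuch's actual parser) of what happens when
every <...> token in such a header is treated as a message id. The
address and message id in the usage note are placeholders, since the
real ones are elided above:

```shell
# Print every <...> token found in a header value, one per line.
# This mimics a naive "anything in angle brackets is a message id" reading.
angle_tokens() {
    printf '%s\n' "$1" | grep -o '<[^<>]*>'
}
```

Called as

  angle_tokens 'Message from Some One <someone@example.com> of "Tue, 15 Mar 2011 12:18:51 BST." <msgid-1@example.com>'

it prints both <someone@example.com> and <msgid-1@example.com>, so two
unrelated threads whose headers mention the same sender address would
appear to share a parent.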

You can see several such messages on the gnu-mailing list site eg

ftp://lists.gnu.org/emacs-orgmode/2011-11 

search for "in-reply-to: M" but they all appear to be from the same
person (running mh-e 8.3 nmh under emacs 24)

In my collection from the Linux kernel mailing list I get some examples
of In-Reply-To containing more than just a bare message-id, but it was
only about 200 out of 100,000 messages in the second half of 2010 (the
most recent archives I have).
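
A count like this can be approximated with a one-line grep. The sketch
below assumes the header value starts on the same line as the header
name, and the function name is invented for this example:

```shell
# Count In-Reply-To: lines whose value starts with something other than "<",
# i.e. headers that carry free text before (or instead of) a message id.
count_odd_in_reply_to() {
    grep -i -c '^in-reply-to:[[:space:]]*[^<[:space:]]' "$1"
}
```

For example, count_odd_in_reply_to some-archive.mbox prints the number
of such lines; it also counts matching lines in message bodies, so it is
only an estimate.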

Best wishes

Mark






Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-30 Thread Gregor Zattler
Hi Mark,
* Mark Walters  [30. Jan. 2012]:
> On Mon, 30 Jan 2012 20:04:25 +0100, Gregor Zattler  wrote:
>> * Pieter Praet  [30. Jan. 2012]:
>>> On Mon, 30 Jan 2012 00:42:14 +0100, Gregor Zattler  
>>> wrote:
>>>> * Pieter Praet  [26. Jan. 2012]:
>>>>> Here's another couple of threads squashed into a single one:
>>>>> - [O] [Use Question] Capture and long lines
>>>>>   - id:"banlktikof4txunllufrznsd6k2zys7s...@mail.gmail.com"
>>>>> - [O] Worg update
>>>>>   - id:"m1wrfiz3ch@tsdye.com"
>>>>> - [O] Table formula to convert hex to dec
>>>>>   - id:"20110724080054.GB16388@x201"
>>>>> - [O] ICS import?
>>>>>   - id:"20120125173421.GQ3747@x201"
>>>>>
>>>>>
>>>>> AFAICT, none of them share Message-Id's...
>>>>
>>>> Do you consider this a bug?
>>>>
>>> 
>>> I do.  No idea what causes it or how to fix it though... :)
>> 
>> At first I thought this was not a severe bug, since one sees more, not
>> fewer, messages in the notmuch show buffer.  But later I realised one
>> also sees fewer, not more, threads in the notmuch search buffer and
>> might not read certain threads because of the "wrong" $Subject: in the
>> notmuch search buffer.

> I think notmuch links two messages into the same thread if they have an
> in-reply-to or reference header in common: i.e the messages reference a
> common parent message.  (See comment in lib/database.cc "Even before a
> message is added, it's pre-allocated thread ID is useful so that all
> descendant messages that reference this common parent can be recognized
> as belonging to the same thread.")

So in case message a from thread A and message b from B would
name the same Message c in their In-Reply-To:/References:
headers, while c is not (for some reason) in A or B, notmuch
would assume both threads linked?  Makes sense.
 
> As far as I can see your grep tests haven't checked for that. 

True.

> Also, could you email me the mbox you had (I think you said that it was
> a mailing list so all public) and I will take a look?

Sure, I do so off-list because of the size of the attachment.


Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-30 Thread Mark Walters

On Mon, 30 Jan 2012 20:04:25 +0100, Gregor Zattler  wrote:
> Hi Pieter,
> * Pieter Praet  [30. Jan. 2012]:
> > On Mon, 30 Jan 2012 00:42:14 +0100, Gregor Zattler  
> > wrote:
> >> * Pieter Praet  [26. Jan. 2012]:
> >>> On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler  
> >>> wrote:
> >>>>|> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed
> >>>> -e "s/Message-Id: $//" >really.mid
> >>>>|> grep -I -F really.mid rest.mbox
> >>>>|> --> no match
> >>> 
> [...]
> >>> Also, the '-F' option expects input on stdin, not a filename.
> >> 
> >> No, this is -F instead of -f and means --fixed-strings.
> >> 
> > And as I said, `-F' requires input on stdin, like this:
> > 
> >   `grep -F "$(cat really.mid)" rest.mbox'
> > 
> > Otherwise [1] you're grepping for the pattern 'really.mid' instead of
> > for the patterns specified *in* 'really.mid', so naturally, you aren't
> > getting any results.
> 
> *blush* you're right and I'm wrong.  I re-re-did the greps
> with the same results (no hits at all).
> 
> [...]
> >>> Here's another couple of threads squashed into a single one:
> >>> - [O] [Use Question] Capture and long lines
> >>>   - id:"banlktikof4txunllufrznsd6k2zys7s...@mail.gmail.com"
> >>> - [O] Worg update
> >>>   - id:"m1wrfiz3ch@tsdye.com"
> >>> - [O] Table formula to convert hex to dec
> >>>   - id:"20110724080054.GB16388@x201"
> >>> - [O] ICS import?
> >>>   - id:"20120125173421.GQ3747@x201"
> >>> 
> >>> 
> >>> AFAICT, none of them share Message-Id's...
> >> 
> >> Do you consider this a bug?
> >> 
> > 
> > I do.  No idea what causes it or how to fix it though... :)
> 
> At first I thought this was not a severe bug, since one sees more, not
> fewer, messages in the notmuch show buffer.  But later I realised one
> also sees fewer, not more, threads in the notmuch search buffer and
> might not read certain threads because of the "wrong" $Subject: in the
> notmuch search buffer.

Hi

I think notmuch links two messages into the same thread if they have an
in-reply-to or reference header in common: i.e the messages reference a
common parent message.  (See comment in lib/database.cc "Even before a
message is added, it's pre-allocated thread ID is useful so that all
descendant messages that reference this common parent can be recognized
as belonging to the same thread.")

As far as I can see your grep tests haven't checked for that. 
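
A sketch of a check that does look for a common parent: collect the ids
referenced by each mbox and intersect the two lists. The function names
are invented for this example, and the awk filter is deliberately crude
(it does not stop at the end of the headers, so quoted headers in bodies
are picked up as well):

```shell
# Print the sorted, unique <...> tokens found on In-Reply-To: and
# References: lines (including folded continuation lines) of an mbox.
referenced_ids() {
    awk '
        { line = tolower($0) }
        line ~ /^(in-reply-to|references):/ { keep = 1; print; next }
        /^[ \t]/ && keep { print; next }     # continuation of a kept header
        { keep = 0 }
    ' "$1" | grep -o '<[^<>]*>' | sort -u
}

# Print the ids that are referenced by messages in both mboxes.
common_references() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    referenced_ids "$1" > "$tmp_a"
    referenced_ids "$2" > "$tmp_b"
    comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Any line printed by common_references thread.mbox rest.mbox is an id
that both sets of messages point at, whether or not a message with that
id exists in either file.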

Also, could you email me the mbox you had (I think you said that it was
a mailing list so all public) and I will take a look?

Thanks

Mark


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-30 Thread Gregor Zattler
Hi Pieter,
* Pieter Praet  [30. Jan. 2012]:
> On Mon, 30 Jan 2012 00:42:14 +0100, Gregor Zattler  wrote:
>> * Pieter Praet  [26. Jan. 2012]:
>>> On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler  
>>> wrote:
>>>>|> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed
>>>> -e "s/Message-Id: $//" >really.mid
>>>>|> grep -I -F really.mid rest.mbox
>>>>|> --> no match
>>> 
[...]
>>> Also, the '-F' option expects input on stdin, not a filename.
>> 
>> No, this is -F instead of -f and means --fixed-strings.
>> 
> And as I said, `-F' requires input on stdin, like this:
> 
>   `grep -F "$(cat really.mid)" rest.mbox'
> 
> Otherwise [1] you're grepping for the pattern 'really.mid' instead of
> for the patterns specified *in* 'really.mid', so naturally, you aren't
> getting any results.

*blush* you're right and I'm wrong.  I re-re-did the greps
with the same results (no hits at all).

[...]
>>> Here's another couple of threads squashed into a single one:
>>> - [O] [Use Question] Capture and long lines
>>>   - id:"banlktikof4txunllufrznsd6k2zys7s...@mail.gmail.com"
>>> - [O] Worg update
>>>   - id:"m1wrfiz3ch@tsdye.com"
>>> - [O] Table formula to convert hex to dec
>>>   - id:"20110724080054.GB16388@x201"
>>> - [O] ICS import?
>>>   - id:"20120125173421.GQ3747@x201"
>>> 
>>> 
>>> AFAICT, none of them share Message-Id's...
>> 
>> Do you consider this a bug?
>> 
> 
> I do.  No idea what causes it or how to fix it though... :)

At first I thought this was not a severe bug, since one sees more, not
fewer, messages in the notmuch show buffer.  But later I realised one
also sees fewer, not more, threads in the notmuch search buffer and
might not read certain threads because of the "wrong" $Subject: in the
notmuch search buffer.

Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-29 Thread Pieter Praet
On Mon, 30 Jan 2012 00:42:14 +0100, Gregor Zattler  wrote:
> Hi Pieter, notmuch developers
> * Pieter Praet  [26. Jan. 2012]:
> > On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler  
> > wrote:
> >>|> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed 
> >> -e "s/Message-Id: $//" >really.mid
> >>|> grep -I -F really.mid rest.mbox
> >>|> --> no match
> >> 
> > 
> > Did you mean to do case-insensitive grep? ('-i' instead of '-I').
> 
> Yes I did mean case-insensitive search and the `-I' is the result
> of a misguided abbrev... Sorry about this.
> 
> > Also, the '-F' option expects input on stdin, not a filename.
> 
> No, this is -F instead of -f and means --fixed-strings.
> 

And as I said, `-F' requires input on stdin, like this:

  `grep -F "$(cat really.mid)" rest.mbox'

Otherwise [1] you're grepping for the pattern 'really.mid' instead of
for the patterns specified *in* 'really.mid', so naturally, you aren't
getting any results.
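
For reference, a short sketch of the distinction: with grep, -F only
means the pattern is a list of fixed strings, and the pattern is still
taken from the command line, while -f FILE reads the patterns from a
file. The wrapper name below is invented for this example:

```shell
# Print the lines of mbox $2 that contain any of the fixed strings listed,
# one per line, in file $1 (case-insensitively).
grep_ids() {
    grep -i -F -f "$1" "$2"
}
```

So grep_ids really.mid rest.mbox searches for the ids stored in
really.mid, whereas grep -F really.mid rest.mbox searches for the
literal text "really.mid".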


> > Try this (with all individual threads split into separate mboxes):
> > 
> >   #+begin_src sh
> > for i in $(ls *.mbox) ; do
> > grep -i '^Message-Id:' "${i}" | \
> > sed -e 's/^.\{13\}//' -e 's/>$//' \
> > > "${i}.mids"
> > done
> > for i in $(ls *.mids) ; do
> > echo "## Grepping for ${i}'s Message-Ids"
> > grep -i -F "$(cat ${i})" *.mbox
> > done
> >   #+end_src
> 
> Thanks I did it "manual".
> 
> > Here's another couple of threads squashed into a single one:
> > - [O] [Use Question] Capture and long lines
> >   - id:"banlktikof4txunllufrznsd6k2zys7s...@mail.gmail.com"
> > - [O] Worg update
> >   - id:"m1wrfiz3ch@tsdye.com"
> > - [O] Table formula to convert hex to dec
> >   - id:"20110724080054.GB16388@x201"
> > - [O] ICS import?
> >   - id:"20120125173421.GQ3747@x201"
> > 
> > 
> > AFAICT, none of them share Message-Id's...
> 
> Do you consider this a bug?
> 

I do.  No idea what causes it or how to fix it though... :)


> Ciao, Gregor
> -- 
>  -... --- .-. . -.. ..--.. ...-.-


Peace

-- 
Pieter

[1] id:"20120126124450.GB30209@shi.workgroup"


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-29 Thread Gregor Zattler
Hi Pieter, notmuch developers
* Pieter Praet  [26. Jan. 2012]:
> On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler  wrote:
>>|> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed -e 
>> "s/Message-Id: $//" >really.mid
>>|> grep -I -F really.mid rest.mbox
>>|> --> no match
>> 
> 
> Did you mean to do case-insensitive grep? ('-i' instead of '-I').

Yes I did mean case-insensitive search and the `-I' is the result
of a misguided abbrev... Sorry about this.

> Also, the '-F' option expects input on stdin, not a filename.

No, this is -F instead of -f and means --fixed-strings.

> Try this (with all individual threads split into separate mboxes):
> 
>   #+begin_src sh
> for i in $(ls *.mbox) ; do
> grep -i '^Message-Id:' "${i}" | \
> sed -e 's/^.\{13\}//' -e 's/>$//' \
> > "${i}.mids"
> done
> for i in $(ls *.mids) ; do
> echo "## Grepping for ${i}'s Message-Ids"
> grep -i -F "$(cat ${i})" *.mbox
> done
>   #+end_src

Thanks I did it "manual".

> Here's another couple of threads squashed into a single one:
> - [O] [Use Question] Capture and long lines
>   - id:"banlktikof4txunllufrznsd6k2zys7s...@mail.gmail.com"
> - [O] Worg update
>   - id:"m1wrfiz3ch@tsdye.com"
> - [O] Table formula to convert hex to dec
>   - id:"20110724080054.GB16388@x201"
> - [O] ICS import?
>   - id:"20120125173421.GQ3747@x201"
> 
> 
> AFAICT, none of them share Message-Id's...

Do you consider this a bug?

Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-29 Thread Gregor Zattler
Hi Jani, notmuch developers,

Executive summary: notmuch amalgamates several e-mail threads
into one notmuch thread; I consider this a bug:

* Jani Nikula  [26. Jan. 2012]:
> On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler  wrote:
>> * Jameson Graef Rollins  [25. Jan. 2012]:
>>> On Wed, 25 Jan 2012 20:19:03 -0500, Austin Clements  
>>> wrote:
>>>> One very common cause of this is someone using "reply" to get an
>>>> initial set of recipients, but then replacing the entire message and
>>>> subject (presumably without realizing that the mail is still tracking
>>>> what it was a reply to).  This can also happen if someone
>>>> intentionally replies to multiple messages (though few mail clients
>>>> support this), or if there was a message ID collision.
>>> 
>>> This is a very common occurrence for me as well.  I would put money down
>>> that this is what you're seeing.
>> 
>> I thought about this too and this is why I checked for any
>> occurrence of Message-IDs in the other emails: 
>> 
>>|> I isolated the thread I was interested in,
>>|> extracted the message ids of its messages and greped the rest of
>>|> the messages for this message ids: no matches.[2] Therefore no of
>>|> the rests messages are part of the thread I was interested in
>> 
>> perhaps there was a logic error in how I did this:
>> 
>>|> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed -e 
>> "s/Message-Id: $//" >really.mid
>>|> grep -I -F really.mid rest.mbox
>>|> --> no match
>> /tmp/thread-I-m-interested-in.mbox is an mbox with the messages
>> I'm interested in, the "real" ones.  really.mid is a list of the
>> Message-IDs of these "real" emails.  rest.mbox is an mbox with the
>> other emails, which Emacs showed in its notmuch show buffer but which
>> belong to other threads.
>> 
>> Since there is no match I concluded that the threads are not linked.
>> Perhaps I made a mistake.  I'll retest it and report again.  But
>> right now I don't have the time to do this.

I re-did it.  This time I used the Emacs interface, searched for
folder:orgmode date 64 bit 32 
and in the notmuch-search buffer I used notmuch-search-stash-thread-id to
get the internal thread-number.  I then did a

notmuch show --format=mbox thread:000108e0 >thread.mbox

opened this mbox with mutt, saved the one thread about dates
before 1970 in one maildir
`date64bit32-I-am-interested-in.mailbox' and the rest in a
maildir `other-e-mails.mailbox'.

I produced a list of all Message-Ids of the interesting thread by
doing

rgrep -E -i "^Message-Id:[[:space:]]" \
  date64bit32-I-am-interested-in.mailbox | egrep -o "[^<]+@[^>]+" \
  >date64bit32-I-am-interested-in.mid

and searched for these strings in the other e-mails:

rgrep -F date64bit32-I-am-interested-in.mid other-e-mails.mailbox

No hits.

I also did it the other way around:

rgrep -E -i "^Message-Id:[[:space:]]" other-e-mails.mailbox | egrep -o \
  "[^<]+@[^>]+" >other-e-mails.mid

rgrep -F other-e-mails.mid date64bit32-I-am-interested-in.mailbox

No hits.

(I spared myself the hassle of searching for the Message-Ids in the
correct headers only; there are simply no hits anywhere in these other
e-mails.)

Thus I conclude that notmuch amalgamates different e-mail-threads
into one as represented by one thread-id.

I consider this a bug.

If anybody is interested I can email her/him the mbox file with
the relevant thread (minus privacy relevant headers / 300 KiB gzipped).

> Do you have an mbox file in the maildir indexed by notmuch? That seems
> like the issue.

I don't think so:  I rgrepped for files with more than 1 line
beginning with "Message-Id".  I got 38 hits.  I looked at all of
them; they are not mbox files (at least not valid ones) but e-mails
with other e-mails attached or cited, or in one case a
multipart/mixed message with a plain text part and an html part.
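
Such a search can be sketched roughly as follows. This is an
illustration rather than the exact command used; the function name is
invented, and -H (print the file name) is a GNU/BSD grep option:

```shell
# List the files below directory $1 that contain more than one line
# starting with "Message-Id" (case-insensitively).
multi_msgid_files() {
    grep -r -H -i -c '^Message-Id' "$1" |
        awk -F: '$NF > 1 { sub(/:[0-9]+$/, ""); print }'
}
```

The count is taken from the last colon-separated field, so maildir file
names that themselves contain a colon are still printed intact.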

Nonetheless I isolated all Message-Ids from these 38 files,
eliminated some html artefacts and greped for this in
date64bit32-I-am-interested-in.mailbox and other-e-mails.mailbox:
No hits with either file.  I also did it the other way around:
Searching for the Message-ids of the two sets in the 38 potential
mbox files: No hit.

Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-26 Thread Pieter Praet
On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler  wrote:
> Hi Jamie, Austin,
> * Jameson Graef Rollins  [25. Jan. 2012]:
> > On Wed, 25 Jan 2012 20:19:03 -0500, Austin Clements  
> > wrote:
> >> One very common cause of this is someone using "reply" to get an
> >> initial set of recipients, but then replacing the entire message and
> >> subject (presumably without realizing that the mail is still tracking
> >> what it was a reply to).  This can also happen if someone
> >> intentionally replies to multiple messages (though few mail clients
> >> support this), or if there was a message ID collision.
> > 
> > This is a very common occurrence for me as well.  I would put money down
> > that this is what you're seeing.
> 
> I thought about this too and this is why I checked for any
> occurrence of Message-IDs in the other emails: 
> 
>|> I isolated the thread I was interested in,
>|> extracted the message ids of its messages and greped the rest of
>|> the messages for this message ids: no matches.[2] Therefore no of
>|> the rests messages are part of the thread I was interested in
> 
> perhaps there was a logic error in how I did this:
> 
>|> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed -e 
> "s/Message-Id: $//" >really.mid
>|> grep -I -F really.mid rest.mbox
>|> --> no match
> 

Did you mean to do case-insensitive grep? ('-i' instead of '-I').

Also, the '-F' option expects input on stdin, not a filename.


Try this (with all individual threads split into separate mboxes):

  #+begin_src sh
    # Collect the Message-Ids of each per-thread mbox into <name>.mids
    for i in *.mbox ; do
        grep -i '^Message-Id:' "${i}" | \
            sed -e 's/^.\{13\}//' -e 's/>$//' \
            > "${i}.mids"
    done
    # Look for each thread's Message-Ids in all of the mboxes
    for i in *.mids ; do
        echo "## Grepping for ${i}'s Message-Ids"
        grep -i -F "$(cat "${i}")" *.mbox
    done
  #+end_src


Here's another couple of threads squashed into a single one:
- [O] [Use Question] Capture and long lines
  - id:"banlktikof4txunllufrznsd6k2zys7s...@mail.gmail.com"
- [O] Worg update
  - id:"m1wrfiz3ch@tsdye.com"
- [O] Table formula to convert hex to dec
  - id:"20110724080054.GB16388@x201"
- [O] ICS import?
  - id:"20120125173421.GQ3747@x201"


AFAICT, none of them share Message-Id's...
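
Message-Ids are not the only possible link, though: an ID that two
threads both mention in In-Reply-To or References can also tie them
together.  A rough sketch for checking that, assuming one mbox per
thread; thread_ids and shared_ids are helper names made up here, and
folded (continued) header lines are not handled:

```sh
# List every <...> token found on the Message-Id, In-Reply-To and
# References header lines of one mbox, sorted and without duplicates.
thread_ids() {
    grep -i -E '^(Message-Id|In-Reply-To|References):' "$1" |
        grep -o '<[^<>]*>' | sort -u
}

# Print the IDs that two mboxes have in common.
shared_ids() {
    a=$(mktemp)
    b=$(mktemp)
    thread_ids "$1" > "$a"
    thread_ids "$2" > "$b"
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

Calling shared_ids with two of the per-thread mboxes prints any ID they
both mention; empty output means this check found no link between them.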


> /tmp/thread-I-m-interested-in.mbox is an mbox with the messages
> I'm interested in, the "real" ones.  really.mid is a list of
> Message-IDs of these "real" emails.  rest.mbox is an mbox with the
> other emails, which Emacs showed in its notmuch show buffer but which
> belong to other threads.
> 
> Since there is no match, I concluded that the threads are not linked.
> Perhaps I made a mistake.  I'll retest it and report again.  But
> right now I don't have the time to do this.
> 
> Ciao, Gregor
> -- 
>  -... --- .-. . -.. ..--.. ...-.-


Peace

-- 
Pieter


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-26 Thread Jani Nikula
On Thu, 26 Jan 2012 13:44:50 +0100, Gregor Zattler  wrote:
> Hi Jamie, Austin,
> * Jameson Graef Rollins  [25. Jan. 2012]:
> > On Wed, 25 Jan 2012 20:19:03 -0500, Austin Clements wrote:
> >> One very common cause of this is someone using "reply" to get an
> >> initial set of recipients, but then replacing the entire message and
> >> subject (presumably without realizing that the mail is still tracking
> >> what it was a reply to).  This can also happen if someone
> >> intentionally replies to multiple messages (though few mail clients
> >> support this), or if there was a message ID collision.
> > 
> > This is a very common occurrence for me as well.  I would put money down
> > that this is what you're seeing.
> 
> I thought about this too and this is why I checked for any
> occurrence of Message-IDs in the other emails: 
> 
>|> I isolated the thread I was interested in,
>|> extracted the message IDs of its messages and grepped the rest of
>|> the messages for these message IDs: no matches.[2] Therefore none of
>|> the remaining messages are part of the thread I was interested in
> 
> perhaps there was a logic error in how I did this:
> 
>|> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed -e "s/Message-Id: $//" >really.mid
>|> grep -I -F really.mid rest.mbox
>|> --> no match
> 
> /tmp/thread-I-m-interested-in.mbox is an mbox with the messages
> I'm interested in, the "real" ones.  really.mid is a list of
> Message-IDs of these "real" emails.  rest.mbox is an mbox with the
> other emails, which Emacs showed in its notmuch show buffer but which
> belong to other threads.
> 
> Since there is no match, I concluded that the threads are not linked.
> Perhaps I made a mistake.  I'll retest it and report again.  But
> right now I don't have the time to do this.

Do you have an mbox file in the maildir indexed by notmuch? That seems
like the issue.
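
A rough way to look for such files, assuming that a file whose first
line starts with "From " is an mbox (find_mbox_files is a name made up
for this sketch, and a single delivered message can also begin with
such a line, so treat the hits only as candidates):

```sh
# Print every regular file below the given directory whose first line
# starts with the mbox "From " separator.
find_mbox_files() {
    find "$1" -type f | while IFS= read -r f; do
        if head -n 1 "$f" | grep -q '^From '; then
            printf '%s\n' "$f"
        fi
    done
}
```

Run it on the top-level mail directory that notmuch indexes, for
example: find_mbox_files ~/mail (the path is only an example).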

Jani.


Re: Bug?: notmuch-search-show-thread shows several threads; only one containing matching messages

2012-01-26 Thread Gregor Zattler
Hi Jamie, Austin,
* Jameson Graef Rollins  [25. Jan. 2012]:
> On Wed, 25 Jan 2012 20:19:03 -0500, Austin Clements  wrote:
>> One very common cause of this is someone using "reply" to get an
>> initial set of recipients, but then replacing the entire message and
>> subject (presumably without realizing that the mail is still tracking
>> what it was a reply to).  This can also happen if someone
>> intentionally replies to multiple messages (though few mail clients
>> support this), or if there was a message ID collision.
> 
> This is a very common occurrence for me as well.  I would put money down
> that this is what you're seeing.

I thought about this too and this is why I checked for any
occurrence of Message-IDs in the other emails: 

   |> I isolated the thread I was interested in,
   |> extracted the message IDs of its messages and grepped the rest of
   |> the messages for these message IDs: no matches.[2] Therefore none of
   |> the remaining messages are part of the thread I was interested in

perhaps there was a logic error in how I did this:

   |> [2] grep -I "^Message-Id:" /tmp/thread-I-m-interested-in.mbox |sed -e "s/Message-Id: $//" >really.mid
   |> grep -I -F really.mid rest.mbox
   |> --> no match

/tmp/thread-I-m-interested-in.mbox is an mbox with the messages
I'm interested in, the "real" ones.  really.mid is a list of
Message-IDs of these "real" emails.  rest.mbox is an mbox with the
other emails, which Emacs showed in its notmuch show buffer but which
belong to other threads.

Since there is no match, I concluded that the threads are not linked.
Perhaps I made a mistake.  I'll retest it and report again.  But
right now I don't have the time to do this.

Ciao, Gregor
-- 
 -... --- .-. . -.. ..--.. ...-.-