Re: [Dovecot] Many messages clustered around the same date.saved value

2012-04-03 Thread Timo Sirainen
On 29.3.2012, at 5.41, Joseph Tam wrote:

 Ah, with mbox there isn't any usable fallback for date.saved.  If it's
 not in dovecot.index.cache, the current time is used.
 
 I'm a little confused as to why it needed a fallback.  In other words,
 why wasn't date.saved put into the index as soon as the IMAP operation
 copied it into Trash?
 
 If this data isn't set at that time, when does it get instantiated?
 When I actually ask for it?


Well..:

 - date.saved is stored only in the dovecot.index.cache file
 - if it doesn't exist and is requested, the current time is returned and
   added to the cache
 - when date.saved has already been fetched once (so it already exists in
   the dovecot.index.cache file) and mail is saved via LDA/IMAP, it gets
   added there immediately when saving
 - dovecot.index.cache has caching decisions, and some old/unused fields may
   get dropped from it once in a while
 - maybe due to some bugs or whatever, the fields or the entire cache may
   get dropped for some other reason

So it probably should have worked, but for some reason didn't.
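
To make the rules above concrete, here is a rough toy model in Python. It is only a sketch of the behaviour described in the list, not Dovecot code: the class name SavedDateCache, the field_wanted flag and the drop_field() method are made up for illustration.

```python
import time
from typing import Callable, Dict


class SavedDateCache:
    """Toy stand-in for the date.saved field in dovecot.index.cache.

    Hypothetical model of the rules described above, not real Dovecot code.
    """

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._saved: Dict[int, float] = {}  # message UID -> date.saved
        # Caching decision: has anyone ever asked for date.saved?
        self.field_wanted = False

    def save_mail(self, uid: int) -> None:
        # On save via LDA/IMAP the value is recorded immediately,
        # but only if the field is already being cached.
        if self.field_wanted:
            self._saved[uid] = self._clock()

    def get_date_saved(self, uid: int) -> float:
        self.field_wanted = True
        if uid not in self._saved:
            # mbox has no usable fallback: return the current time
            # and add it to the cache.
            self._saved[uid] = self._clock()
        return self._saved[uid]

    def drop_field(self) -> None:
        # The field (or the whole cache) gets dropped for whatever reason.
        self._saved.clear()
        self.field_wanted = False
```

In this model, every message whose value is missing at the first fetch after a drop ends up with the time of that fetch, which would look like a cluster of identical date.saved values.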

It would be possible to store date.saved in the dovecot.index file, like mdbox
does, so cache decisions wouldn't matter. But it's probably too much trouble to
be worth it, since very few mbox installations care about it.

Re: [Dovecot] Many messages clustered around the same date.saved value

2012-03-28 Thread Timo Sirainen
On 27.3.2012, at 4.16, Joseph Tam wrote:

 However, I noticed a strange thing: querying what would have been
 deleted
 
   doveadm -ftab fetch -A date.saved mailbox Trash savedbefore 7d
 
 showed many date.saved values are clustered around the same timestamp,
 even among different users' Trash mailboxes.
 ...
 I can't explain why many different users would have messages with the
 same (or closeby) date.saved value.
 
 Which mailbox format? With Maildir the date.saved is taken from the
 dovecot.index.cache file, and in some cases that might get dropped.  If
 it does, then it falls back to using the file's ctime.
 
 mbox.

Ah, with mbox there isn't any usable fallback for date.saved. If it's not in 
dovecot.index.cache, the current time is used.

 These wrong values shouldn't cause problems with expunge queries since
 they err on the side of safety.

Right.

Re: [Dovecot] Many messages clustered around the same date.saved value

2012-03-28 Thread Joseph Tam


Timo Sirainen t...@iki.fi wrote:


Which mailbox format? With Maildir the date.saved is taken from the
dovecot.index.cache file, and in some cases that might get dropped.  If
it does, then it falls back to using the file's ctime.


mbox.


Ah, with mbox there isn't any usable fallback for date.saved.  If it's
not in dovecot.index.cache, the current time is used.


I'm a little confused as to why it needed a fallback.  In other words,
why wasn't date.saved put into the index as soon as the IMAP operation
copied it into Trash?

If this data isn't set at that time, when does it get instantiated?
When I actually ask for it?

Joseph Tam jtam.h...@gmail.com


Re: [Dovecot] Many messages clustered around the same date.saved value

2012-03-26 Thread Timo Sirainen
On Sun, 2012-03-25 at 00:46 -0700, Joseph Tam wrote:
 Subject: Different user messages clustered around the same date.saved value
 
 After updating dovecot to 2.1.3, I can now use doveadm expunge -A ...
 to iterate through all user trash folders and expunge old messages.
 
 However, I noticed a strange thing: querying what would have been deleted
 
   doveadm -ftab fetch -A date.saved mailbox Trash savedbefore 7d
 
 showed many date.saved values are clustered around the same
 timestamp, even among different users' Trash mailboxes.  One user's trash
 mailbox having the same date.saved is explained by a user deleting a
 lot of messages at one time, but I can't explain why many different users
 would have messages with the same (or closeby) date.saved value.

Which mailbox format? With Maildir the date.saved is taken from the
dovecot.index.cache file, and in some cases that might get dropped. If
it does, then it falls back to using the file's ctime.
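
As a minimal sketch of that Maildir fallback, in Python: the function name maildir_date_saved and the idea of passing the cached value in as an optional float are assumptions made for illustration, not how Dovecot is implemented.

```python
import os
from typing import Optional


def maildir_date_saved(cached: Optional[float], message_path: str) -> float:
    """Return the cached date.saved if present, else the file's ctime.

    Hypothetical helper: `cached` stands for whatever dovecot.index.cache
    holds for this message (None if the field was dropped).
    """
    if cached is not None:
        return cached
    # Fallback: the ctime of the message file in the Maildir.
    return os.stat(message_path).st_ctime
```

Anything that changes the ctime of many message files at once would then give those messages the same fallback value.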




Re: [Dovecot] Many messages clustered around the same date.saved value

2012-03-26 Thread Joseph Tam


Timo Sirainen wrote:


However, I noticed a strange thing: querying what would have been
deleted

doveadm -ftab fetch -A date.saved mailbox Trash savedbefore 7d

showed many date.saved values are clustered around the same timestamp,
even among different users' Trash mailboxes.
...
I can't explain why many different users would have messages with the
same (or closeby) date.saved value.


Which mailbox format? With Maildir the date.saved is taken from the
dovecot.index.cache file, and in some cases that might get dropped.  If
it does, then it falls back to using the file's ctime.


mbox.

A further look into this reveals that the clustered date.saved values
are the earliest values for every mailbox in the system.  This timestamp
is close to the time I was testing doveadm ... -A, so the likely
explanation is that I accidentally deleted/updated these values using
some variation of doveadm, even though I remember confining my testing
to query/search/fetch.  This appears to be a case of PEBKAC.

These wrong values shouldn't cause problems with expunge queries since
they err on the side of safety.

Thanks for the insight though.

Joseph Tam jtam.h...@gmail.com


[Dovecot] Many messages clustered around the same date.saved value

2012-03-25 Thread Joseph Tam


Subject: Different user messages clustered around the same date.saved value

After updating dovecot to 2.1.3, I can now use doveadm expunge -A ...
to iterate through all user trash folders and expunge old messages.

However, I noticed a strange thing: querying what would have been deleted

doveadm -ftab fetch -A date.saved mailbox Trash savedbefore 7d

showed many date.saved values are clustered around the same
timestamp, even among different users' Trash mailboxes.  One user's trash
mailbox having the same date.saved is explained by a user deleting a
lot of messages at one time, but I can't explain why many different users
would have messages with the same (or closeby) date.saved value.

For example, the output of the above query on my system showed the 10s
window /2012-03-05 18:08:0[0-9]/ matched 7658 messages among 22 different
user Trash mailboxes, which is statistically unlikely.
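
A count like this could be reproduced roughly as follows. This is a sketch in Python that assumes the query output has already been parsed into (user, "YYYY-MM-DD HH:MM:SS") pairs; the function name cluster_by_window and the sample rows in the usage note are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set, Tuple

EPOCH = datetime(1970, 1, 1)


def cluster_by_window(
    rows: Iterable[Tuple[str, str]], window_seconds: int = 10
) -> List[Tuple[datetime, int, int]]:
    """Group (user, timestamp) rows into fixed-size time windows.

    Returns (window_start, message_count, distinct_user_count) tuples,
    largest message count first.
    """
    counts: Dict[int, int] = defaultdict(int)
    users: Dict[int, Set[str]] = defaultdict(set)
    for user, stamp in rows:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        # Bucket index: whole windows elapsed since a fixed reference point.
        bucket = int((ts - EPOCH).total_seconds()) // window_seconds
        counts[bucket] += 1
        users[bucket].add(user)
    result = [
        (EPOCH + timedelta(seconds=b * window_seconds), n, len(users[b]))
        for b, n in counts.items()
    ]
    # Biggest clusters first; ties broken by earliest window.
    return sorted(result, key=lambda r: (-r[1], r[0]))
```

For example, cluster_by_window([("alice", "2012-03-05 18:08:01"), ("bob", "2012-03-05 18:08:07")]) puts both messages into the window starting at 18:08:00, with two distinct users.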

I didn't see anything special in the dovecot logs at this time to
explain this.  What would cause this?

Joseph Tam jtam.h...@gmail.com