Re: [Dovecot] what to expect from changing index location

2011-06-30 Thread Davide Vaghetti

On 06/29/2011 06:36 PM, William Blunn wrote:
 On 28/06/2011 17:13, Davide Vaghetti wrote:
 I have one thousand virtual users with mdbox mailbox format and 10
 GByte quota. I have noticed some performance problems related to
 I/O (the mailbox disk is a 6TB raid1+0 on iSCSI), so I want to put
 the index files on a different disk. My current mail_location is:
 
 mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox
 
 and I want to switch to
 
 mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox:INDEX=/var/indexes/%-1.1u/%u/
 
 But I cannot figure out a couple of things:
 
 - does the switch trigger the rebuilding of the index files?
 
 ! DANGER, DANGER !!
 
 Index files cannot be re-generated under mdbox
 
 Go away and read http://wiki2.dovecot.org/MailboxFormat/dbox
 
 ... with dbox the Index files actually contain significant data
 which is held nowhere else. Index files for both *single-dbox* and 
 *multi-dbox* contain message flags and keywords. For *multi-dbox*,
 the index file also contains the map_uids which link (via the map
 index) to the actual message data. This data cannot be automatically
 recreated, so it is important that Index files are treated with the
 same care as message data files.
 
 If you don't already know this, then you probably shouldn't even be 
 using mdbox.
 
 - can I get rid of all the old index files?
 
 NO!
 
 - how much can the index files (no fts squat) grow?
 
 First solve your understanding problem with mdbox, then worry about 
 details such as this.
 


Bill, thanks for all the __important__ info. You almost saved my ass ;-)
 (BTW, that is why I was asking)

I'll check the documentation again to better understand indexes in the
mdbox context.

Nonetheless, I still need to know how fast the index files grow, so
if you, or anyone else, can point me to the right documentation, or have
a rule of thumb for estimating it, please share it.
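
Absent a documented rule of thumb, one way to get a baseline is to measure how much space the index files take today relative to the mail data, and to repeat the measurement over time. A minimal sketch, assuming the index files are the ones named dovecot.index* and dovecot.map.index* (the names that appear elsewhere in this thread); the function name is hypothetical and this is not a Dovecot tool:

```python
import os

# File-name prefixes taken from the names mentioned in this thread;
# everything else below the root is counted as "other" (mail data etc.).
INDEX_PREFIXES = ("dovecot.index", "dovecot.map.index")


def index_usage(root):
    """Return (index_bytes, other_bytes) for all regular files below root."""
    index_bytes = 0
    other_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip dangling symlinks and similar entries
            size = os.path.getsize(path)
            if name.startswith(INDEX_PREFIXES):
                index_bytes += size
            else:
                other_bytes += size
    return index_bytes, other_bytes
```

Calling index_usage() on a handful of representative users' mdbox directories and comparing the two numbers gives a measured ratio for this particular installation, which can then be scaled to the quota.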

Regards
davide


-- 
Dott. Davide Vaghetti
Centro Servizi Informatici Facolta' di Ingegneria
Universita' di Pisa
PGP:
http://keys.keysigning.org:11371/pks/lookup?op=get&search=0x7A1B3BA18C4E0A4D


Re: [Dovecot] what to expect from changing index location

2011-06-30 Thread William Blunn
I concede that this is most likely a WIBNI (Wouldn't It Be Nice If...) 
and most likely will end up on the list of WIBNIs, never to be implemented.


But I would like to take the brainstorm forward another step, just to see.

On 30/06/2011 05:35, Timo Sirainen wrote:

  To allow for migration of existing installations, it might be an idea to
  make Dovecot look for both ddb and index when opening, but use
  ddb when creating new files.

 This makes it annoying. It wastes disk I/O..


OK fair enough.

(Though not actually *disk* I/O /per se/. It is not like we would create 
any further sync-to-disk requirement (i.e. requiring to wait for another 
revolution), but rather that it would require more system calls.)


Presumably it's important that it works correctly for existing users 
with minimal risk of problems if people take the path of least 
resistance (and people don't read the release notes). I imagine many 
people will not be bothered about some extra failed open calls. But we 
should still have a way to tune for optimal I/O usage so that systems 
which are up against it for performance can be tuned. OK, how about this:


A configuration directive like this:

filename_word_ddb = ddb index

This specifies a list of words which will be tried in the place where we 
mean to say ddb in a filename.


If the directive is not present, then the default value would be as per 
the example above. This should allow existing installations to work 
correctly using old configuration files.


If a new file needs to be created, then it will use the first entry in 
the list.
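
To make the proposed semantics concrete, here is a minimal sketch in Python rather than Dovecot's C. Note that filename_word_ddb is only the directive proposed in this message, not an existing Dovecot setting, and the function names and the {word} template placeholder are made up for illustration:

```python
import os

# Default value proposed above: try "ddb" first, then fall back to "index".
DEFAULT_WORDS = ("ddb", "index")


def find_existing(directory, template, words=DEFAULT_WORDS):
    """Return the first existing candidate path, or None if none exists.

    template is e.g. "dovecot.{word}.log"; each word is tried in list order,
    so every miss costs one extra lookup.
    """
    for word in words:
        path = os.path.join(directory, template.format(word=word))
        if os.path.exists(path):
            return path
    return None


def path_for_new_file(directory, template, words=DEFAULT_WORDS):
    """New files always use the first word in the list."""
    return os.path.join(directory, template.format(word=words[0]))


def resolve(directory, template, words=DEFAULT_WORDS):
    """Path to open: an existing file if there is one, else the new name."""
    existing = find_existing(directory, template, words)
    if existing is not None:
        return existing
    return path_for_new_file(directory, template, words)
```

With the default list, an old installation keeps opening dovecot.index.log where it exists, while a mailbox with no files yet gets dovecot.ddb.log.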


So new installs will use ddb for all such files, and will be optimal 
where the file exists already, but mildly sub-optimal where the file 
doesn't exist (because Dovecot would have to try opening each possible 
variation before being able to know that the file was not openable). In 
order to tune for I/O, the administrator can reconfigure the list to be 
just ddb.


Old installs will have existing files with index with new files being 
created with ddb. This will work correctly, but with some degree of 
sub-optimality. In order to tune for I/O, the administrator would need to:


1. Configure filename_word_ddb to ddb index ddb (to mitigate the race 
condition where a file is renamed after ddb is tried but before 
index is tried)

2. Re-name existing files (from ...index... to ...ddb...)
3. Check that no files with old names exist
4. Change the list to ddb

This means that things should work correctly by default, and only get 
messed-up when people actively go and try to optimise things without 
paying attention to what they're doing.
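
Steps 2 and 3 could be scripted roughly as follows. This is a sketch that only makes sense if the rename proposal were adopted; it is not something current Dovecot expects, and the function names are hypothetical. It replaces the dot-separated word index in file names that start with "dovecot." and refuses to overwrite an existing target:

```python
import os


def new_name(name, old="index", new="ddb"):
    """Replace the dot-separated word `old` with `new` in a file name."""
    return ".".join(new if part == old else part for part in name.split("."))


def find_old_names(root):
    """Step 3: list files below root that still carry the old word."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("dovecot.") and new_name(name) != name:
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def rename_index_files(root):
    """Step 2: rename ...index... files to ...ddb...; return (src, dst) pairs."""
    renamed = []
    for src in find_old_names(root):
        dst = os.path.join(os.path.dirname(src), new_name(os.path.basename(src)))
        if os.path.exists(dst):
            # Both names exist: leave this for the administrator to resolve.
            raise FileExistsError(dst)
        os.rename(src, dst)
        renamed.append((src, dst))
    return renamed
```

Only once find_old_names() returns an empty list would it be safe to move on to step 4.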


 BTW. Cyrus also has cyrus.index file, which is the only storage for
 message flags. So Dovecot isn't alone with this.


Though two is still a small sample compared to the weight of existing 
terminology usage.


Besides, Cyrus is somewhat in-bred, and we would expect it to be 
quirky :-)


Bill


Re: [Dovecot] what to expect from changing index location

2011-06-29 Thread Davide Vaghetti

On 06/28/2011 07:29 PM, Charles Marcus wrote:
 On 2011-06-28 12:13 PM, Davide Vaghetti wrote:
 mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox
 
 and I want to switch to
 
 mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox:INDEX=/var/indexes/%-1.1u/%u/
 
 But I cannot figure out a couple of things:
 
 - does the switch trigger the rebuilding of the index files?
 
 - can I get rid of all the old index files?
 
 I'm by no means an expert, but with that many users I think if you
 did this in one shot (all indexes being rebuilt simultaneously as
 users logged in) your system would slow to a crawl...
 
 I would first rsync the existing indexes over live, then stop
 dovecot, do another quick rsync of the indexes, then make the change
 and restart dovecot...
 
 That will minimize the impact (rebuilding of indexes)...
 

Good hint! Thank you.

What about the index growth factor? Does anyone have any idea
about that (no fts squat)?

bye
davide
-- 
Dott. Davide Vaghetti
Centro Servizi Informatici Facolta' di Ingegneria
Universita' di Pisa
PGP:
http://keys.keysigning.org:11371/pks/lookup?op=get&search=0x7A1B3BA18C4E0A4D


Re: [Dovecot] what to expect from changing index location

2011-06-29 Thread William Blunn

On 28/06/2011 17:13, Davide Vaghetti wrote:

I have one thousand virtual users with mdbox mailbox format and 10 GByte quota. 
I have noticed some performance problems related to I/O (the mailbox disk is a 
6TB raid1+0 on iSCSI), so I want to put the index files on a different disk. My 
current mail_location is:

mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox

and I want to switch to

mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox:INDEX=/var/indexes/%-1.1u/%u/

But I cannot figure out a couple of things:

- does the switch trigger the rebuilding of the index files?


! DANGER, DANGER !!

Index files cannot be re-generated under mdbox

Go away and read http://wiki2.dovecot.org/MailboxFormat/dbox

... with dbox the Index files actually contain significant data which 
is held nowhere else. Index files for both *single-dbox* and 
*multi-dbox* contain message flags and keywords. For *multi-dbox*, the 
index file also contains the map_uids which link (via the map index) 
to the actual message data. This data cannot be automatically recreated, 
so it is important that Index files are treated with the same care as 
message data files.


If you don't already know this, then you probably shouldn't even be 
using mdbox.



- can I get rid of all the old index files?


NO!


- how much can the index files (no fts squat) grow?


First solve your understanding problem with mdbox, then worry about 
details such as this.


Bill



Re: [Dovecot] what to expect from changing index location

2011-06-29 Thread William Blunn
In fact, under sdbox and mdbox, calling these files index files is 
misleading because it implies that they can be re-created, leading to 
situations like this.


Such situations could result in catastrophic data loss. Whilst we could 
say it is user error, users could argue that it is common knowledge 
that files referred to as index files can be re-created from the data 
files.


In reality, these so-called index files are actually database files 
containing critical data.


They happen to use the same format as Dovecot uses for index files in 
connection with mbox and maildir, but they contain data which is held 
nowhere else and cannot be recreated.


Perhaps the per-mailbox index files for sdbox and mdbox should be 
re-named to message metadata databases, and the map index should be 
renamed to message store database.


Specifically we should avoid the word index. By including the word 
database, we make it clearer that these files contain data.


Timo, what do you reckon?

Regards,

Bill


Re: [Dovecot] what to expect from changing index location

2011-06-29 Thread William Blunn

On 29/06/2011 18:00, William Blunn wrote:
Perhaps the per-mailbox index files for sdbox and mdbox should be 
re-named to message metadata databases, and the map index should 
be renamed to message store database.


Also it might be an idea to change the filenames of the files to avoid 
the word index.


Perhaps use something like ddb instead (means Dovecot database).

So,

${location}/mailboxes/INBOX/dbox-Mails/dovecot.index
${location}/mailboxes/INBOX/dbox-Mails/dovecot.index.cache
${location}/mailboxes/INBOX/dbox-Mails/dovecot.index.log
${location}/storage/dovecot.map.index

becomes

${location}/mailboxes/INBOX/dbox-Mails/dovecot.ddb
${location}/mailboxes/INBOX/dbox-Mails/dovecot.ddb.cache
${location}/mailboxes/INBOX/dbox-Mails/dovecot.ddb.log
${location}/storage/dovecot.map.ddb

To allow for migration of existing installations, it might be an idea to 
make Dovecot look for both ddb and index when opening, but use ddb 
when creating new files.


Regards,

Bill


Re: [Dovecot] what to expect from changing index location

2011-06-29 Thread Timo Sirainen
On Wed, 2011-06-29 at 18:09 +0100, William Blunn wrote:
 On 29/06/2011 18:00, William Blunn wrote:
  Perhaps the per-mailbox index files for sdbox and mdbox should be 
  re-named to message metadata databases, and the map index should 
  be renamed to message store database.
 
 Also it might be an idea to change the filenames of the files to avoid 
 the word index.
 
 Perhaps use something like ddb instead (means Dovecot database).

Or simply db :)

 ${location}/mailboxes/INBOX/dbox-Mails/dovecot.ddb
 ${location}/mailboxes/INBOX/dbox-Mails/dovecot.ddb.cache
 ${location}/mailboxes/INBOX/dbox-Mails/dovecot.ddb.log
 ${location}/storage/dovecot.map.ddb

Yes, this would be nice, but..

 To allow for migration of existing installations, it might be an idea to 
 make Dovecot look for both ddb and index when opening, but use ddb 
 when creating new files.

This makes it annoying. It wastes disk I/O..

BTW. Cyrus also has cyrus.index file, which is the only storage for
message flags. So Dovecot isn't alone with this.



Re: [Dovecot] what to expect from changing index location

2011-06-28 Thread Charles Marcus
On 2011-06-28 12:13 PM, Davide Vaghetti wrote:
 mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox
 
 and I want to switch to
 
 mail_location = mdbox:/var/vmail/%-1.1u/%u/mdbox:INDEX=/var/indexes/%-1.1u/%u/
 
 But I cannot figure out a couple of things:
 
 - does the switch trigger the rebuilding of the index files?
 
 - can I get rid of all the old index files?

I'm by no means an expert, but with that many users I think if you did
this in one shot (all indexes being rebuilt simultaneously as users
logged in) your system would slow to a crawl...

I would first rsync the existing indexes over live, then stop dovecot,
do another quick rsync of the indexes, then make the change and restart
dovecot...

That will minimize the impact (rebuilding of indexes)...
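
The two-pass copy could look roughly like the sketch below, written in Python instead of rsync so that it is self-contained. It is an illustration under assumptions, not a tested migration procedure: the function names are hypothetical, it assumes the new INDEX tree mirrors the relative paths of the per-mailbox dovecot.index* files, and it does not preserve file ownership the way rsync -a does when run as root. Check the target layout against the Dovecot documentation before relying on it.

```python
import os
import shutil


def _unchanged(src, dst):
    """True if dst exists with the same size and (whole-second) mtime as src."""
    if not os.path.exists(dst):
        return False
    a, b = os.stat(src), os.stat(dst)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def sync_index_files(src_root, dst_root, prefix="dovecot.index"):
    """Copy files whose name starts with prefix, keeping relative paths.

    Files that look unchanged since an earlier pass are skipped, so a second
    run after stopping the server only copies what changed in between.
    Returns the relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        for name in filenames:
            if not name.startswith(prefix):
                continue  # mail data and other files stay where they are
            src = os.path.join(dirpath, name)
            dst = os.path.normpath(os.path.join(dst_root, rel_dir, name))
            if _unchanged(src, dst):
                continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 keeps the mtime, not the owner
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Run it once while Dovecot is up, stop Dovecot, run it again to pick up whatever changed in between, then change mail_location and start Dovecot again.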

-- 

Best regards,

Charles