Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Jan-Frode Myklebust
On Tue, Feb 08, 2011 at 08:42:40AM +0100, Javier Miguel Rodríguez wrote:
 
> I am writing to this mailing list to thank Timo for Dovecot 2 and
> mdbox. We have almost 30,000 active users, and our life was sad with
> Maildir backups: 24 hours for a full backup with Bacula (zlib-enabled
> Maildirs, 1.4 TB). After switching to mdbox, the backup time is under
> 12 hours! Instead of backing up 17 million files, with mdbox our backup
> is only about 1 million files, and that speeds up the backup operation
> a lot.

Oh.. I envy you. Will probably need to do the same at some point, but
I'm having problems understanding how we will ever be able to make the
transition. Too many files -- too many users..

How long did it take to convert from Maildir to mdbox, and how much
downtime was there?

Do you have a clustered setup, or a single node? I'm wondering how safe
mdbox will be on a cluster filesystem (GPFS), as we've had a bit of
trouble with the index files when they're accessed from multiple nodes at
the same time (but that was with v1.0.15 -- so we should maybe trust that
such problems have since been fixed :-)


  -jf


Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Javier Miguel Rodríguez



> Oh.. I envy you. Will probably need to do the same at some point, but
> I'm having problems understanding how we will ever be able to make the
> transition. Too many files -- too many users..



We did the transition via imapsync: we had /the old server/ and
a /new server/, and we migrated all mailboxes with imapsync and the
master user feature. The first imapsync run takes a lot of time, but the
following ones are incremental and take much less time. When we were
ready (one night), we switched from the old server to the new server.
Downtime was minimal, and if everything had gone wrong, we could have run
imapsync the other way, from new to old instead of old to new.
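
A minimal sketch of the per-mailbox command used in such a migration, assuming imapsync's `--authuser1`/`--authuser2` options are used for the master user login. The thread does not give the exact options that were used; the host names, the master user name and the password file path below are hypothetical. The function only builds the argument list and does not run anything.

```python
from typing import List

# Hypothetical path to a file holding the master user's password.
MASTER_PASSFILE = "/path/to/master.pass"


def imapsync_command(user: str, old_host: str, new_host: str,
                     master_user: str, reverse: bool = False) -> List[str]:
    """Build (but do not run) an imapsync command line for one mailbox.

    With reverse=True the direction is new -> old, i.e. the fallback
    described above for the case where the switch goes wrong.
    """
    src, dst = (new_host, old_host) if reverse else (old_host, new_host)
    return [
        "imapsync",
        # Source side: log in as the master user on behalf of `user`.
        "--host1", src, "--user1", user,
        "--authuser1", master_user, "--passfile1", MASTER_PASSFILE,
        # Destination side: same mailbox name, same master user.
        "--host2", dst, "--user2", user,
        "--authuser2", master_user, "--passfile2", MASTER_PASSFILE,
    ]


if __name__ == "__main__":
    # Hypothetical user and hosts; repeat the same command for each
    # incremental pass before the final switch.
    print(" ".join(imapsync_command("alice", "old.example.org",
                                    "new.example.org", "master")))
```

Running the same command again later only transfers what changed since the previous pass, which is why the final pass on the switch night is short.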



Our mail servers are virtualized in a VMware vSphere cluster. We
have HA and DRS, and all the data is stored on the iSCSI SAN. In our setup
we have only a virtualized mail server, but if the hardware node fails,
the virtual machine starts automatically on another ESX host.


Regards

Javier






Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Timo Sirainen
On 8.2.2011, at 9.42, Javier Miguel Rodríguez wrote:

> I am writing to this mailing list to thank Timo for Dovecot 2 and mdbox. We
> have almost 30,000 active users, and our life was sad with Maildir backups:
> 24 hours for a full backup with Bacula (zlib-enabled Maildirs, 1.4 TB).
> After switching to mdbox, the backup time is under 12 hours! Instead of
> backing up 17 million files, with mdbox our backup is only about 1 million
> files, and that speeds up the backup operation a lot.

Hmm. I guess if you were doing backups 24h/day, then you can't really say how 
much faster mdbox performs than maildir (outside backups)?



Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Javier de Miguel Rodríguez

Hello

> Hmm. I guess if you were doing backups 24h/day, then you can't really say how
> much faster mdbox performs than maildir (outside backups)?



No, 24 hours is for a FULL backup at the weekend. An incremental
backup takes only 2-3 hours each night.


About performance... I cannot give you real numbers for Maildir vs
mdbox. With Maildir our indexes were stored on a RAM disk, but we cannot
do that with mdbox (we cannot recreate them if power is lost).


Regards

Javier



Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Timo Sirainen
On 9.2.2011, at 0.28, Javier de Miguel Rodríguez wrote:

> Hello
>
>> Hmm. I guess if you were doing backups 24h/day, then you can't really say
>> how much faster mdbox performs than maildir (outside backups)?
>
> No, 24 hours is for a FULL backup at the weekend. An incremental backup
> takes only 2-3 hours each night.
>
> About performance... I cannot give you real numbers for Maildir vs mdbox.
> With Maildir our indexes were stored on a RAM disk, but we cannot do that
> with mdbox (we cannot recreate them if power is lost).

So with mdbox disk I/O usage increased compared to maildir+ramdisk indexes?



Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Javier Miguel Rodríguez



> So with mdbox disk I/O usage increased compared to maildir+ramdisk indexes?



That is a tricky question. It depends on usage. I think
the following:



- LDA delivery: disk load is a bit lower with Maildir than with mdbox.
In both cases the message has to be written and the indexes updated, but
with Maildir the indexes are in RAM, so disk load is lower in this case.


- POP3 access: the same as the previous point.

- IMAP access: this is tricky. In mdbox, a /delete message/
command only lowers the refcount and updates the indexes, and at night a
cron job runs doveadm purge. In Maildir, you really delete the message
when the MUA/webmail /compacts/ the folder, and the indexes are updated. I
think that mdbox has /delayed I/O/ in this case, and puts less load on
the disk during production hours.
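
The delayed I/O idea from the IMAP point can be illustrated with a toy model. This is only a sketch of the principle (expunge drops a reference, a later purge reclaims the space); it does not reflect Dovecot's actual on-disk format or code, and the class and method names are invented for the illustration.

```python
class ToyMdboxStorage:
    """Toy model: expunge only drops a reference; purge reclaims space later."""

    def __init__(self):
        self.sizes = {}      # message id -> bytes held in storage files
        self.refcounts = {}  # message id -> number of mailbox references

    def save(self, msg_id, size):
        self.sizes[msg_id] = size
        self.refcounts[msg_id] = 1

    def expunge(self, msg_id):
        # Cheap during production hours: only the refcount/index changes.
        self.refcounts[msg_id] -= 1

    def disk_usage(self):
        return sum(self.sizes.values())

    def purge(self):
        # Expensive step, deferred to the nightly job: drop unreferenced data.
        dead = [m for m, refs in self.refcounts.items() if refs == 0]
        freed = sum(self.sizes.pop(m) for m in dead)
        for m in dead:
            del self.refcounts[m]
        return freed


if __name__ == "__main__":
    store = ToyMdboxStorage()
    store.save("msg-1", 4000)
    store.save("msg-2", 6000)
    store.expunge("msg-1")
    print("usage after expunge:", store.disk_usage())  # space still in use
    print("bytes freed by purge:", store.purge())
    print("usage after purge:", store.disk_usage())
```

In this model the space is only reclaimed when purge() runs, which corresponds to the nightly doveadm purge cron job mentioned above.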


Am I missing anything? The stats from the SAN after the
Maildir-to-mdbox change do not help: we have zlib enabled in LDA and IMAP
with mdbox, so our number of real IOPS is lower than with Maildir (where
we did not have zlib enabled).


Regards

Javier




Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Timo Sirainen
On 9.2.2011, at 9.21, Javier Miguel Rodríguez wrote:

> That is a tricky question. It depends on usage. I think the following:
>
> - LDA delivery: disk load is a bit lower with Maildir than with mdbox. In
> both cases the message has to be written and the indexes updated, but with
> Maildir the indexes are in RAM, so disk load is lower in this case.
>
> - POP3 access: the same as the previous point.
>
> - IMAP access: this is tricky. In mdbox, a /delete message/ command only
> lowers the refcount and updates the indexes, and at night a cron job runs
> doveadm purge. In Maildir, you really delete the message when the
> MUA/webmail /compacts/ the folder, and the indexes are updated. I think that
> mdbox has /delayed I/O/ in this case, and puts less load on the disk during
> production hours.
>
> Am I missing anything?

Yes, in theory those are right. I'm interested in finding out some real numbers 
:)

> The stats from the SAN after the Maildir-to-mdbox change do not help: we
> have zlib enabled in LDA and IMAP with mdbox, so our number of real IOPS is
> lower than with Maildir (where we did not have zlib enabled).


I wonder how large a write can be before it is split to two iops.. With NFS 
probably smaller I'd guess. Still, I would have thought that even if zlib 
writes only half as much, the disk iops difference wouldn't be nearly as much.



Re: [Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-08 Thread Javier Miguel Rodríguez



>> The stats from the SAN after the Maildir-to-mdbox change do not help: we
>> have zlib enabled in LDA and IMAP with mdbox, so our number of real IOPS is
>> lower than with Maildir (where we did not have zlib enabled).
>
> I wonder how large a write can be before it is split to two iops.. With NFS
> probably smaller I'd guess. Still, I would have thought that even if zlib
> writes only half as much, the disk iops difference wouldn't be nearly as much.



Without zlib our mailstore was 2.1 TB. With zlib enabled it is 1.4 TB.
We use an iSCSI SAN with ext4. I am writing a document with some
benchmarking of Dovecot (with the postal and rabid tools), with some
graphs of the number of IOPS, CPU load, and so on. I am still writing it;
if you are interested, I can post a link to the document on the list.
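
As a quick check of the sizes quoted above against the "writes only half as much" guess, the following small calculation gives the compression ratio. It assumes the two figures describe the same set of mail, which the thread does not state explicitly.

```python
# Mailstore sizes quoted in this message, in TB.
without_zlib_tb = 2.1
with_zlib_tb = 1.4

# Fraction of the original size that remains after compression.
ratio = with_zlib_tb / without_zlib_tb
# Fraction of the original size that is saved.
saved = 1 - ratio

print(f"compressed size is {ratio:.0%} of the original, saving {saved:.0%}")
```

So with these numbers zlib writes roughly two thirds as many bytes, not half.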


Regards

Javier




[Dovecot] Great time savings backing a mdbox versus Maildir

2011-02-07 Thread Javier Miguel Rodríguez


Hello

I am writing to this mailing list to thank Timo for Dovecot 2 and
mdbox. We have almost 30,000 active users, and our life was sad with
Maildir backups: 24 hours for a full backup with Bacula (zlib-enabled
Maildirs, 1.4 TB). After switching to mdbox, the backup time is under 12
hours! Instead of backing up 17 million files, with mdbox our backup is
only about 1 million files, and that speeds up the backup operation a lot.



Timo, here is detailed info about the Bacula backup jobs; you
can use it in the wiki if you wish. If you need additional info
(hardware specs, Dovecot config, etc.), I can share it.


*Maildir:*

  Job:                    Backup_Linux_buzon_us.2011-01-21_19.03.26_38
  Backup Level:           Full
  Client:                 buzon_us 2.0.3 (06Mar07) x86_64-redhat-linux-gnu,redhat,Enterprise release
  FileSet:                Full Buzon 2011-01-21 19:03:26
  Pool:                   Pool_Linux_Buzones_US (From Job resource)
  Catalog:                MyCatalog (From Client resource)
  Storage:                File (From command line)
  Scheduled time:         21-ene-2011 19:03:20
  Start time:             21-ene-2011 19:03:29
  End time:               22-ene-2011 19:46:45
  Elapsed time:           1 day 43 mins 16 secs
  Priority:               10
  FD Files Written:       16,903,801
  SD Files Written:       16,903,801
  FD Bytes Written:       1,445,943,227,706 (1.445 TB)
  SD Bytes Written:       1,448,983,971,450 (1.448 TB)
  Rate:                   16247.3 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               no
  Volume name(s):         Buzones_US_2
  Volume Session Id:      26
  Volume Session Time:    1295511704
  Last Volume Bytes:      1,450,628,892,676 (1.450 TB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK







*mdbox with mdbox_rotate_size=10m:*


  Build OS:               x86_64-redhat-linux-gnu redhat Enterprise release
  JobId:                  3587
  Job:                    Backup_Linux_buzon_us.2011-02-07_08.13.52_53
  Backup Level:           Full (upgraded from Incremental)
  Client:                 buzon_us 2.0.3 (06Mar07) x86_64-redhat-linux-gnu,redhat,Enterprise release
  FileSet:                Full Buzon 2011-01-21 19:03:26
  Pool:                   Pool_Linux_Buzones_US (From Job resource)
  Catalog:                MyCatalog (From Client resource)
  Storage:                File (From command line)
  Scheduled time:         07-feb-2011 08:13:44
  Start time:             07-feb-2011 08:13:54
  End time:               07-feb-2011 19:43:50
  Elapsed time:           11 hours 29 mins 56 secs
  Priority:               10
  FD Files Written:       1,148,780
  SD Files Written:       1,148,780
  FD Bytes Written:       1,537,062,152,773 (1.537 TB)
  SD Bytes Written:       1,537,218,147,402 (1.537 TB)
  Rate:                   37130.7 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               yes
  Volume name(s):         Buzones_US_4|Buzones_US_5
  Volume Session Id:      101
  Volume Session Time:    1296724657
  Last Volume Bytes:      438,873,898,586 (438.8 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK
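
The figures in the two job reports can be cross-checked with a short calculation. This is a sketch that only uses numbers copied from the reports above; the class and field names are invented for the illustration. Dividing the FD bytes by the elapsed seconds and by 1000 gives the same values as the Rate lines of both reports (16247.3 KB/s and 37130.7 KB/s).

```python
from dataclasses import dataclass


@dataclass
class BackupJob:
    name: str
    elapsed_seconds: int
    files: int
    fd_bytes: int

    @property
    def rate_kb_per_s(self) -> float:
        # FD bytes per second, in units of 1000 bytes.
        return self.fd_bytes / self.elapsed_seconds / 1000

    @property
    def mean_file_bytes(self) -> float:
        # Average size of one backed-up file.
        return self.fd_bytes / self.files


# Values copied from the two Bacula reports above.
maildir = BackupJob("Maildir", 1 * 86400 + 43 * 60 + 16,
                    16_903_801, 1_445_943_227_706)
mdbox = BackupJob("mdbox", 11 * 3600 + 29 * 60 + 56,
                  1_148_780, 1_537_062_152_773)

for job in (maildir, mdbox):
    print(f"{job.name}: {job.rate_kb_per_s:.1f} KB/s, "
          f"mean file size {job.mean_file_bytes / 1000:.0f} kB")

print(f"elapsed time ratio: "
      f"{maildir.elapsed_seconds / mdbox.elapsed_seconds:.2f}x")
print(f"file count ratio: {maildir.files / mdbox.files:.1f}x")
```

The mdbox job moves slightly more bytes in less than half the time, with far fewer but much larger files to open.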



Regards

Javier de Miguel
University of Seville