Hello everyone,
I am currently evaluating Dovecot for our new production email servers
(20k+ mailboxes) and found something strange.
I'm using these settings on Dovecot 2.2.4 (x86_64 / Slackware / compiled
from source):
mdbox_rotate_size = 128M
mdbox_rotate_interval = 1d
mdbox_preallocate_space = yes
with virtual users and a mail location like:
mail_location = mdbox:~/mdbox
I don't think the rest of the config is relevant, but ask me if you need
other parts.
After using test accounts for 2 weeks, I've noticed that the 128M of
preallocated space is never "hole punched" (to borrow the term from
"man fallocate" on Linux), even when m.* files are rotated.
From what I understand, those files will never be appended to again
because of mdbox_rotate_interval. And doveadm purge creates new files, so
old ones would never grow again.
Here is an example of an mdbox storage directory listed with ls -ls
(which shows allocated vs. used space):
total 4065176
1884 -rw------- 1 mail mail 1926656 Jul 29 10:55 dovecot.map.index
4 -rw------- 1 mail mail 460 Jul 29 11:26 dovecot.map.index.log
48 -rw------- 1 mail mail 44304 Jul 29 10:55 dovecot.map.index.log.2
131072 -rw------- 1 mail mail 133165066 Jul 19 15:31 m.10
131072 -rw------- 1 mail mail 133507393 Jul 19 15:32 m.13
131072 -rw------- 1 mail mail 134155182 Jul 19 15:33 m.14
131072 -rw------- 1 mail mail 134213403 Jul 19 15:30 m.2
131072 -rw------- 1 mail mail 46464 Jul 21 04:30 m.21
131072 -rw------- 1 mail mail 134215030 Jul 19 15:30 m.3
131072 -rw------- 1 mail mail 25852 Jul 25 01:54 m.32
131072 -rw------- 1 mail mail 2360 Jul 26 00:05 m.34
131072 -rw------- 1 mail mail 169073 Jul 27 23:18 m.35
131072 -rw------- 1 mail mail 31624 Jul 27 01:55 m.36
131072 -rw------- 1 mail mail 134216982 Jul 28 04:30 m.37
131076 -rw------- 1 mail mail 134217804 Jul 28 04:30 m.38
131072 -rw------- 1 mail mail 134217341 Jul 28 04:30 m.39
131072 -rw------- 1 mail mail 134213719 Jul 19 15:30 m.4
131072 -rw------- 1 mail mail 29740970 Jul 28 04:30 m.40
131072 -rw------- 1 mail mail 129175917 Jul 28 04:30 m.41
131072 -rw------- 1 mail mail 133174937 Jul 28 04:30 m.42
131072 -rw------- 1 mail mail 633436 Jul 28 04:30 m.43
131072 -rw------- 1 mail mail 3154623 Jul 28 04:30 m.44
131072 -rw------- 1 mail mail 3676879 Jul 28 04:30 m.45
131072 -rw------- 1 mail mail 468158 Jul 28 04:30 m.46
131072 -rw------- 1 mail mail 26964 Jul 28 04:30 m.47
131072 -rw------- 1 mail mail 3574599 Jul 28 04:30 m.48
131072 -rw------- 1 mail mail 3789133 Jul 28 04:30 m.49
131072 -rw------- 1 mail mail 134215016 Jul 19 15:30 m.5
131072 -rw------- 1 mail mail 1280074 Jul 28 04:30 m.50
131076 -rw------- 1 mail mail 635459 Jul 28 22:47 m.51
131072 -rw------- 1 mail mail 1459418 Jul 29 10:55 m.52
131072 -rw------- 1 mail mail 132941013 Jul 29 11:26 m.53
131072 -rw------- 1 mail mail 134213475 Jul 19 15:30 m.7
131072 -rw------- 1 mail mail 132240074 Jul 19 15:31 m.9
There's a lot of "lost" space since, if I read the Dovecot docs
correctly, preallocated space is only reclaimed when *all* emails in an
m.X file have refcount=0 and doveadm purge has been run.
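
To put a number on that slack, here is a small Python sketch (my own
helper names, nothing from Dovecot). It assumes the first column of
ls -ls is in 1 KiB blocks, which matches the 131072 shown for the 128M
files, and that st_blocks from stat is in 512-byte units:

```python
import os

def slack_bytes(allocated_kib: int, size_bytes: int) -> int:
    """Bytes allocated on disk beyond the file's apparent size."""
    return max(allocated_kib * 1024 - size_bytes, 0)

def file_slack(path: str) -> int:
    """Same figure for a real file; st_blocks counts 512-byte units."""
    st = os.stat(path)
    return max(st.st_blocks * 512 - st.st_size, 0)

# Three rows copied from the listing: (name, allocated KiB, size in bytes)
rows = [
    ("m.34", 131072, 2360),
    ("m.21", 131072, 46464),
    ("m.3", 131072, 134215030),
]
for name, kib, size in rows:
    print(name, slack_bytes(kib, size))
```

file_slack() can be run over the m.* files of a real mdbox directory to
get the same figure without parsing ls output.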
For mailboxes with little incoming mail (< 100 kB / day) this wastes a
lot of space. Of course I could decrease the rotate size a lot, but that
would produce many more files and performance would probably become
similar to sdbox/maildir/...
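
As a rough worked example of that low-traffic case, the sketch below
assumes one new m.* file is opened per day (as the 1d rotate interval
suggests), that the whole rotate size is preallocated for it, and that
"100 kB" means 100,000 bytes. These are my assumptions, not measured
values:

```python
MIB = 1024 * 1024

def daily_slack(rotate_size: int, daily_mail: int) -> int:
    """Preallocated bytes left unused if one file is opened per day."""
    return max(rotate_size - daily_mail, 0)

rotate_size = 128 * MIB   # mdbox_rotate_size = 128M
daily_mail = 100_000      # assumption: "100 kB" read as 100,000 bytes

slack = daily_slack(rotate_size, daily_mail)
used_fraction = daily_mail / rotate_size
print(f"{slack} bytes unused per mailbox per day "
      f"({used_fraction:.4%} of the file used)")
```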
It would certainly be smart to use something like
"FALLOC_FL_PUNCH_HOLE" on rotation (when doing close()?), so that once
we're sure no more data will be appended to a file, its allocated
space == used space.
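
A minimal sketch of the idea in Python, not Dovecot code. To stay within
the standard library it does not call fallocate() with
FALLOC_FL_PUNCH_HOLE; it swaps in a simpler technique, truncating the
file to its current size. I'm assuming the filesystem releases blocks
preallocated past EOF on that call, which should be verified on the
filesystem actually in use. The function name is hypothetical:

```python
import os

def trim_to_used_size(path: str) -> int:
    """Truncate a file to its current size; returns that (unchanged) size."""
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.fstat(fd).st_size
        # The data itself is untouched because the length does not change.
        # Assumption: the filesystem frees blocks preallocated beyond EOF
        # when asked to truncate to the current size.
        os.ftruncate(fd, size)
        return size
    finally:
        os.close(fd)
```

Comparing ls -ls on a rotated m.* file before and after such a call would
show whether the allocated size actually drops.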
I will disable space preallocation for our next tests since it wastes a
lot of storage for us; do you have any feedback on how much that may
affect performance? I found some messages about the implementation in
this list's archives, but didn't see anyone clearly stating how much
better preallocation is.
Thanks, best regards,
Stephane Berthelot.