Thanks Jack. That's good to know. It is definitely something to consider. In a distributed storage scenario we might build a dedicated pool for that and tune the pool as more capacity or performance is needed.
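To make that concrete, a dedicated CephFS data pool for the deduplicated attachments might be set up roughly like this. This is only a sketch: the pool name, PG counts, mount point and directory are placeholders, and "cephfs" is assumed to be the filesystem name.

    # create a dedicated replicated pool and attach it to the filesystem
    ceph osd pool create mail-attachments 64 64 replicated
    ceph fs add_data_pool cephfs mail-attachments

    # pin the attachment directory to that pool via the directory layout
    # (placeholder mount point and directory)
    setfattr -n ceph.dir.layout.pool -v mail-attachments /mnt/cephfs/attachments

    # later, grow the pool if more capacity or performance is needed
    ceph osd pool set mail-attachments pg_num 128
    ceph osd pool set mail-attachments pgp_num 128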
Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


On Wed, May 16, 2018 at 4:45 PM Jack <c...@jack.fr.eu.org> wrote:

> On 05/16/2018 09:35 PM, Webert de Souza Lima wrote:
> > We'll soon do benchmarks of sdbox vs mdbox over cephfs with a bluestore
> > backend.
> > We'll have to do some work on how to simulate user traffic, for writes
> > and reads. That seems troublesome.
> I would appreciate seeing these results!
>
> > Thanks for the plugin recommendations. I'll take the chance and ask you:
> > how is the SIS status? We used it in the past and had some problems
> > with it.
> I have been using it since Dec 2016 with mdbox, with no issues at all (I
> am currently using Dovecot 2.2.27-3 from Debian Stretch).
> The only setting I change is mail_attachment_dir; the rest stays at the
> defaults (mail_attachment_min_size = 128k, mail_attachment_fs = sis posix,
> mail_attachment_hash = %{sha1}).
> The backend storage is a local filesystem, and there is only one Dovecot
> instance.
>
> > Regards,
> >
> > Webert Lima
> > DevOps Engineer at MAV Tecnologia
> > *Belo Horizonte - Brasil*
> > *IRC NICK - WebertRLZ*
> >
> > On Wed, May 16, 2018 at 4:19 PM Jack <c...@jack.fr.eu.org> wrote:
> >
> >> Hi,
> >>
> >> Many (most?) filesystems do not store multiple files in the same block.
> >>
> >> Thus, with sdbox, every single mail (you know, the kind of mail with 10
> >> lines in it) will eat an inode and a block (4k here).
> >> mdbox is more compact in this respect.
> >>
> >> Another difference: sdbox removes the message file, mdbox does not: a
> >> single metadata update is performed, which may be batched with others
> >> if many files are deleted at once.
> >>
> >> That said, I have no experience with dovecot + cephfs, nor have I run
> >> tests of sdbox vs mdbox.
> >>
> >> However, and this is a bit off topic, I recommend you look at the
> >> following dovecot features (if you haven't already), as they are
> >> awesome and will help you a lot:
> >> - Compression (classic, https://wiki.dovecot.org/Plugins/Zlib)
> >> - Single-Instance-Storage (aka SIS, aka "attachment deduplication":
> >>   https://www.dovecot.org/list/dovecot/2013-December/094276.html)
> >>
> >> Regards,
> >>
> >> On 05/16/2018 08:37 PM, Webert de Souza Lima wrote:
> >>> I'm sending this message to both the dovecot and ceph-users MLs, so
> >>> please don't mind if something seems too obvious to you.
> >>>
> >>> Hi,
> >>>
> >>> I have a question for both the dovecot and ceph lists; below I'll
> >>> explain what's going on.
> >>>
> >>> Regarding the dbox format (https://wiki2.dovecot.org/MailboxFormat/dbox):
> >>> when using sdbox, a new file is stored for each email message.
> >>> When using mdbox, multiple messages are appended to a single file
> >>> until it reaches/passes the rotate limit.
> >>>
> >>> I would like to better understand how the mdbox format impacts IO
> >>> performance.
> >>> It's generally expected that fewer, larger files translate to less IO
> >>> and more throughput than many small files, but how does dovecot handle
> >>> that with mdbox?
> >>> If dovecot flushes data to storage as each new email arrives and is
> >>> appended to the corresponding file, would that generate the same
> >>> amount of IO as one file per message?
> >>> Also, with mdbox many messages will be appended to a given file before
> >>> a new file is created.
> >>> That should mean that a file descriptor is kept open for some time by
> >>> the dovecot process.
> >>> Using cephfs as the backend, how would this impact cluster performance
> >>> regarding MDS caps and cached inodes when files from thousands of
> >>> users are opened and appended to all over the place?
> >>>
> >>> I would like to understand this better.
> >>>
> >>> Why?
> >>> We are a small business email hosting provider with bare-metal,
> >>> self-hosted systems, using dovecot to serve mailboxes and cephfs for
> >>> email storage.
> >>>
> >>> We are currently working on a dovecot and storage redesign to go into
> >>> production ASAP. The main objective is to serve more users with better
> >>> performance, high availability and scalability.
> >>> * high availability and load balancing are extremely important to us *
> >>>
> >>> In our current model, we're using the mdbox format with dovecot, with
> >>> dovecot's INDEXes stored in a replicated pool of SSDs and messages
> >>> stored in a replicated pool of HDDs (under a Cache Tier with a pool of
> >>> SSDs).
> >>> All using cephfs with the filestore backend.
> >>>
> >>> Currently there are 3 clusters running dovecot 2.2.34 and ceph Jewel
> >>> (10.2.9-4):
> >>> - ~25K users from a few thousand domains per cluster
> >>> - ~25TB of email data per cluster
> >>> - ~70GB of dovecot INDEX [meta]data per cluster
> >>> - ~100MB of cephfs metadata per cluster
> >>>
> >>> Our goal is to build a single ceph cluster for storage that can expand
> >>> in capacity, be highly available and perform well enough. I know,
> >>> that's what everyone wants.
> >>>
> >>> Cephfs is an important choice because:
> >>> - there can be multiple mountpoints, thus multiple dovecot instances
> >>>   on different hosts
> >>> - the same storage backend is used for all dovecot instances
> >>> - there is no need to shard domains
> >>> - dovecot is easily load balanced (with director sticking users to the
> >>>   same dovecot backend)
> >>>
> >>> In the upcoming upgrade we intend to:
> >>> - upgrade ceph to 12.X (Luminous)
> >>> - drop the SSD Cache Tier (because it's deprecated)
> >>> - use the bluestore engine
> >>>
> >>> I was told on freenode/#dovecot that there are many cases where SDBOX
> >>> would perform better with NFS sharing.
> >>> With cephfs, at first I wouldn't expect that to be true, because more
> >>> files == more generated IO, but given what I said at the beginning
> >>> about sdbox vs mdbox, that could be wrong.
> >>>
> >>> Any thoughts will be highly appreciated.
> >>>
> >>> Regards,
> >>>
> >>> Webert Lima
> >>> DevOps Engineer at MAV Tecnologia
> >>> *Belo Horizonte - Brasil*
> >>> *IRC NICK - WebertRLZ*
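For reference, the compression and SIS settings discussed above, together with the mdbox rotate limit, would look roughly like this in a 2.2.x dovecot.conf. This is only a sketch: the attachment directory and the rotate size are placeholder values, not settings anyone in this thread reported using.

    mail_location = mdbox:~/mdbox
    mdbox_rotate_size = 16M            # placeholder; controls when a new m.* storage file starts

    mail_plugins = $mail_plugins zlib  # transparent compression of stored mail
    plugin {
      zlib_save = gz                   # compress newly saved messages with gzip
      zlib_save_level = 6
    }

    # Single-Instance-Storage: deduplicate large attachments into a shared store
    mail_attachment_dir = /mnt/cephfs/attachments   # placeholder path
    mail_attachment_min_size = 128k
    mail_attachment_fs = sis posix
    mail_attachment_hash = %{sha1}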
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com