Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2018-08-30 Thread Marc Roos
 

How is it going with this? Are we getting close to a state where we can 
store a mailbox on Ceph with librmb?



-Original Message-
From: Wido den Hollander [mailto:w...@42on.com] 
Sent: maandag 25 september 2017 9:20
To: Gregory Farnum; Danny Al-Gaaf
Cc: ceph-users
Subject: Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot


> Op 22 september 2017 om 23:56 schreef Gregory Farnum :
> 
> 
> On Fri, Sep 22, 2017 at 2:49 PM, Danny Al-Gaaf  wrote:
> > Am 22.09.2017 um 22:59 schrieb Gregory Farnum:
> > [..]
> >> This is super cool! Is there anything written down that explains this
> >> for Ceph developers who aren't familiar with the workings of Dovecot?
> >> I've got some questions I see going through it, but they may be very
> >> dumb.
> >>
> >> *) Why are indexes going on CephFS? Is this just about wanting a local
> >> cache, or about the existing Dovecot implementations, or something
> >> else? Almost seems like you could just store the whole thing in a
> >> CephFS filesystem if that's safe. ;)
> >
> > This is, if everything works as expected, only an intermediate step. An
> > idea is
> > (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/status-3)
> > to use omap to store the index/meta data.
> >
> > We chose a step-by-step approach and since we are currently not sure if
> > using omap would work performance wise, we use CephFS (also since this
> > requires no changes in Dovecot). Currently we put our focus on the
> > development of the first version of librmb, but the code to use omap is
> > already there. It needs integration, testing, and performance tuning to
> > verify if it would work with our requirements.
> >
> >> *) It looks like each email is getting its own object in RADOS, and I
> >> assume those are small messages, which leads me to
> >
> > The mail distribution looks like this:
> > https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-dist
> >
> >
> > Yes, the majority of the mails are under 500k, but most objects are
> > around 50k. Not so many very small objects.
> 
> Ah, that slide makes more sense with that context — I was paging
> through it in bed last night and thought it was about the number of
> emails per user or something weird.
> 
> So those mail objects are definitely bigger than I expected; interesting.
> 
> >
> >>   *) is it really cost-acceptable to not use EC pools on email data?
> >
> > We will use EC pools for the mail objects and replication for CephFS.
> >
> > But even without EC there would be a cost case compared to the current
> > system. We will save a large amount of IOPs in the new platform since
> > the (NFS) POSIX layer is removed from the IO path (at least for the mail
> > objects). And we expect with Ceph and commodity hardware we can compete
> > with a traditional enterprise NAS/NFS anyway.
> >
> >>   *) isn't per-object metadata overhead a big cost compared to the
> >> actual stored data?
> >
> > I assume not. The metadata/index is not so much compared to the size of
> > the mails (currently with NFS around 10% I would say). In the classic
> > NFS based dovecot the number of index/cache/metadata files is an issue
> > anyway. With 6.7 billion mails we have 1.2 billion index/cache/metadata
> > files
> > (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-nums).
> 
> I was unclear; I meant the RADOS metadata cost of storing an object. I
> haven't quantified that in a while but it was big enough to make 4KB
> objects pretty expensive, which I was incorrectly assuming would be
> the case for most emails.
> EC pools have the same issue; if you want to erasure-code a 40KB
> object into 5+3 then you pay the metadata overhead for each 8KB
> (40KB/5) of data, but again that's more on the practical side of
> things than my initial assumptions placed it.

Yes, it is. But combining objects isn't easy either. RGW also has this 
limitation where objects are striped in RADOS and the EC overhead can 
become large.

At this moment the price/GB (correct me if needed, Danny!) isn't the 
biggest problem. It could be that all mails will be stored on a 
replicated pool.

There also might be some overhead in BlueStore per object, but the 
numbers from Deutsche Telekom show that mails usually aren't 4kb. Only a 
small portion of e-mails is 4kb.

We will see how this turns out.

Wido

> 
> This is super cool!
> -Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-25 Thread Danny Al-Gaaf
Am 25.09.2017 um 10:00 schrieb Marc Roos:
>  
> 
> But from the looks of this dovecot mailinglist post, you didn’t start 
> your project by talking to the dovecot guys, or have an ongoing 
> communication with them during the development. I would think 
> their experience could be a valuable asset. I am not talking about just 
> giving some files at the end.

That impression may be misleading.

We discussed getting an open source Ceph/RADOS implementation for Dovecot
with the dovecot guys before we started. The outcome of those discussions
was, from our side, to bring this project to life and sponsor it to get a
generic solution.

We also discussed a generic librmb approach with, e.g., Sage and others in
the Ceph community (and there was tracker item #12430), so that this
library can also be used in other email server projects.

Anyway: we invite everybody, especially the dovecot community, to
participate and contribute to make this project successful. The goal is
not to have a new project besides Dovecot. We are more than happy to
contribute the code to the corresponding projects and then close/remove
this repo from GitHub. That is the final goal; what you see now is
hopefully only an intermediate step.

> Ps. Is there some index of these slides? I keep having problems browsing 
> back to a specific one.

With ESC you should get an overview, or you can use 'm' to get to the menu.

Danny
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-25 Thread Marc Roos
 

But from the looks of this dovecot mailinglist post, you didn’t start 
your project by talking to the dovecot guys, or have an ongoing 
communication with them during the development. I would think 
their experience could be a valuable asset. I am not talking about just 
giving some files at the end.

Ps. Is there some index of these slides? I keep having problems browsing 
back to a specific one.


-Original Message-
From: Danny Al-Gaaf [mailto:danny.al-g...@bisect.de] 
Sent: maandag 25 september 2017 9:37
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

Am 25.09.2017 um 09:00 schrieb Marc Roos:
>  
> From the looks of it, too bad the efforts could not be 
> combined/coordinated; that seems to be an issue with many open source 
> initiatives.

That's not right. The plan is to contribute the librmb code to the Ceph 
project and the Dovecot part back to the Dovecot project (as described 
in the slides) as soon as we know that it will work with real-life load.

We simply needed a place to start with it; from there we will split the 
code into parts and move them to the corresponding projects.

Danny


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-25 Thread Danny Al-Gaaf
Am 25.09.2017 um 09:00 schrieb Marc Roos:
>  
> From the looks of it, too bad the efforts could not be 
> combined/coordinated; that seems to be an issue with many open source 
> initiatives.

That's not right. The plan is to contribute the librmb code to the Ceph
project and the Dovecot part back to the Dovecot project (as described
in the slides) as soon as we know that it will work with real-life load.

We simply needed a place to start with it; from there we will split the code
into parts and move them to the corresponding projects.

Danny
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-25 Thread Wido den Hollander

> Op 22 september 2017 om 23:56 schreef Gregory Farnum :
> 
> 
> On Fri, Sep 22, 2017 at 2:49 PM, Danny Al-Gaaf  
> wrote:
> > Am 22.09.2017 um 22:59 schrieb Gregory Farnum:
> > [..]
> >> This is super cool! Is there anything written down that explains this
> >> for Ceph developers who aren't familiar with the workings of Dovecot?
> >> I've got some questions I see going through it, but they may be very
> >> dumb.
> >>
> >> *) Why are indexes going on CephFS? Is this just about wanting a local
> >> cache, or about the existing Dovecot implementations, or something
> >> else? Almost seems like you could just store the whole thing in a
> >> CephFS filesystem if that's safe. ;)
> >
> > This is, if everything works as expected, only an intermediate step. An
> > idea is
> > (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/status-3)
> > to use omap to store the index/meta data.
> >
> > We chose a step-by-step approach and since we are currently not sure if
> > using omap would work performance wise, we use CephFS (also since this
> > requires no changes in Dovecot). Currently we put our focus on the
> > development of the first version of librmb, but the code to use omap is
> > already there. It needs integration, testing, and performance tuning to
> > verify if it would work with our requirements.
> >
> >> *) It looks like each email is getting its own object in RADOS, and I
> >> assume those are small messages, which leads me to
> >
> > The mail distribution looks like this:
> > https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-dist
> >
> >
> > Yes, the majority of the mails are under 500k, but most objects are
> > around 50k. Not so many very small objects.
> 
> Ah, that slide makes more sense with that context — I was paging
> through it in bed last night and thought it was about the number of
> emails per user or something weird.
> 
> So those mail objects are definitely bigger than I expected; interesting.
> 
> >
> >>   *) is it really cost-acceptable to not use EC pools on email data?
> >
> > We will use EC pools for the mail objects and replication for CephFS.
> >
> > But even without EC there would be a cost case compared to the current
> > system. We will save a large amount of IOPs in the new platform since
> > the (NFS) POSIX layer is removed from the IO path (at least for the mail
> > objects). And we expect with Ceph and commodity hardware we can compete
> > with a traditional enterprise NAS/NFS anyway.
> >
> >>   *) isn't per-object metadata overhead a big cost compared to the
> >> actual stored data?
> >
> > I assume not. The metadata/index is not so much compared to the size of
> > the mails (currently with NFS around 10% I would say). In the classic
> > NFS based dovecot the number of index/cache/metadata files is an issue
> > anyway. With 6.7 billion mails we have 1.2 billion index/cache/metadata
> > files
> > (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-nums).
> 
> I was unclear; I meant the RADOS metadata cost of storing an object. I
> haven't quantified that in a while but it was big enough to make 4KB
> objects pretty expensive, which I was incorrectly assuming would be
> the case for most emails.
> EC pools have the same issue; if you want to erasure-code a 40KB
> object into 5+3 then you pay the metadata overhead for each 8KB
> (40KB/5) of data, but again that's more on the practical side of
> things than my initial assumptions placed it.

Yes, it is. But combining objects isn't easy either. RGW also has this 
limitation where objects are striped in RADOS and the EC overhead can become 
large.

At this moment the price/GB (correct me if needed, Danny!) isn't the biggest 
problem. It could be that all mails will be stored on a replicated pool.

There also might be some overhead in BlueStore per object, but the numbers 
from Deutsche Telekom show that mails usually aren't 4kb. Only a small 
portion of e-mails is 4kb.

We will see how this turns out.

Wido

> 
> This is super cool!
> -Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-25 Thread Marc Roos
 
From the looks of it, too bad the efforts could not be 
combined/coordinated; that seems to be an issue with many open source 
initiatives.


-Original Message-
From: mj [mailto:li...@merit.unu.edu] 
Sent: zondag 24 september 2017 16:37
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

Hi,

I forwarded your announcement to the dovecot mailinglist. The following 
reply to it was posted there by Timo Sirainen. I'm forwarding it 
here, as you might not be reading the dovecot mailinglist.

Wido:
> First, the Github link:
> https://github.com/ceph-dovecot/dovecot-ceph-plugin
> 
> I am not going to repeat everything which is on Github, but here is a short summary:
> 
> - CephFS is used for storing Mailbox Indexes
> - E-Mails are stored directly as RADOS objects
> - It's a Dovecot plugin
> 
> We would like everybody to test librmb and report back issues on 
Github so that further development can be done.
> 
> It's not finalized yet, but all the help is welcome to make librmb the 
best solution for storing your e-mails on Ceph with Dovecot.

Timo:
It would have been nicer if RADOS support was implemented as a lib-fs 
driver, and the fs-API had been used all over the place elsewhere. So 1) 
LibRadosMailBox wouldn't have been relying so much on RADOS specifically 
and 2) fs-rados could have been used for other purposes. There are 
already fs-dict and dict-fs drivers, so the RADOS dict driver may not 
have been necessary to implement if fs-rados was implemented instead 
(although I didn't check it closely enough to verify). (We've had 
fs-rados on our TODO list for a while also.)

BTW. We've also been planning on open sourcing some of the obox pieces, 
mainly fs-drivers (e.g. fs-s3). The obox format maybe too, but without 
the "metacache" piece. The current obox code is a bit too much married 
into the metacache though to make open sourcing it easy. (The metacache 
is about storing the Dovecot index files in object storage and 
efficiently caching them on local filesystem, which isn't planned to be 
open sourced in near future. That's pretty much the only difficult piece 
of the obox plugin, with Cassandra integration coming as a good second. 
I wish there had been a better/easier geo-distributed key-value database 
to use - tombstones are annoyingly troublesome.)

And using rmb-mailbox format, my main worries would be:
  * doesn't store index files (= message flags) - not necessarily a 
problem, as long as you don't want geo-replication
  * index corruption means rebuilding them, which means rescanning the list 
of mail files, which means rescanning the whole RADOS namespace, which 
practically means rescanning the RADOS pool. That most likely is a very 
very slow operation, which you want to avoid unless it's absolutely 
necessary. Need to be very careful to avoid that happening, and in 
general to avoid losing mails in case of crashes or other bugs.
  * I think copying/moving mails physically copies the full data on disk
  * Each IMAP/POP3/LMTP/etc process connects to RADOS separately from 
each other - some connection pooling would likely help here



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-24 Thread mj

Hi,

I forwarded your announcement to the dovecot mailinglist. The following 
reply to it was posted there by Timo Sirainen. I'm forwarding it 
here, as you might not be reading the dovecot mailinglist.


Wido:

First, the Github link:
https://github.com/ceph-dovecot/dovecot-ceph-plugin

I am not going to repeat everything which is on Github, but here is a short summary:

- CephFS is used for storing Mailbox Indexes
- E-Mails are stored directly as RADOS objects
- It's a Dovecot plugin

We would like everybody to test librmb and report back issues on Github so that 
further development can be done.

It's not finalized yet, but all the help is welcome to make librmb the best 
solution for storing your e-mails on Ceph with Dovecot.
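
To make the "E-Mails are stored directly as RADOS objects" part concrete, here is a minimal sketch using the python-rados bindings. The pool name, object id scheme and xattr names are illustrative assumptions, not librmb's actual layout:

# Hypothetical sketch: store one mail as its own RADOS object.
# Pool name, object id and xattr keys are assumptions for illustration only.
import uuid
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('mail')              # assumed mail pool
    oid = str(uuid.uuid4())                         # one object per message
    ioctx.write_full(oid, b'From: a@example.com\r\nSubject: hi\r\n\r\nbody\r\n')
    # a few per-mail attributes stored next to the blob (names made up here)
    ioctx.set_xattr(oid, 'mailbox_guid', b'inbox-guid')
    ioctx.set_xattr(oid, 'received_date', b'1506038400')
    ioctx.close()
finally:
    cluster.shutdown()

The index files that tell Dovecot which objects belong to which mailbox still live on CephFS in the current design; replacing them with omap is discussed further down in the thread.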


Timo:
It would have been nicer if RADOS support was implemented as a lib-fs 
driver, and the fs-API had been used all over the place elsewhere. So 1) 
LibRadosMailBox wouldn't have been relying so much on RADOS specifically 
and 2) fs-rados could have been used for other purposes. There are 
already fs-dict and dict-fs drivers, so the RADOS dict driver may not 
have been necessary to implement if fs-rados was implemented instead 
(although I didn't check it closely enough to verify). (We've had 
fs-rados on our TODO list for a while also.)


BTW. We've also been planning on open sourcing some of the obox pieces, 
mainly fs-drivers (e.g. fs-s3). The obox format maybe too, but without 
the "metacache" piece. The current obox code is a bit too much married 
into the metacache though to make open sourcing it easy. (The metacache 
is about storing the Dovecot index files in object storage and 
efficiently caching them on local filesystem, which isn't planned to be 
open sourced in near future. That's pretty much the only difficult piece 
of the obox plugin, with Cassandra integration coming as a good second. 
I wish there had been a better/easier geo-distributed key-value database 
to use - tombstones are annoyingly troublesome.)


And using rmb-mailbox format, my main worries would be:
 * doesn't store index files (= message flags) - not necessarily a 
problem, as long as you don't want geo-replication
 * index corruption means rebuilding them, which means rescanning the list 
of mail files, which means rescanning the whole RADOS namespace, which 
practically means rescanning the RADOS pool (see the sketch after this 
list). That most likely is a very very slow operation, which you want to 
avoid unless it's absolutely necessary. Need to be very careful to avoid 
that happening, and in general to avoid losing mails in case of crashes or 
other bugs.

 * I think copying/moving mails physically copies the full data on disk
 * Each IMAP/POP3/LMTP/etc process connects to RADOS separately from 
each other - some connection pooling would likely help here
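
To illustrate the rescan concern above: rebuilding the index without any other record essentially means enumerating every mail object, which with the python-rados bindings would boil down to something like the following (pool and namespace names are assumptions for illustration):

# Hypothetical sketch: enumerate all mail objects of one user.
# There is no secondary index for object listing, so this effectively
# scans the pool's PG contents and filters by namespace - slow on a
# multi-billion-object pool.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('mail')          # assumed mail pool
ioctx.set_namespace('user-12345')           # assumed: one namespace per user
count = 0
for obj in ioctx.list_objects():            # iterates RADOS objects lazily
    count += 1                              # a real rebuild would also read xattrs
print('objects to rescan:', count)
ioctx.close()
cluster.shutdown()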


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-22 Thread Danny Al-Gaaf
Am 22.09.2017 um 23:56 schrieb Gregory Farnum:
> On Fri, Sep 22, 2017 at 2:49 PM, Danny Al-Gaaf  
> wrote:
>> Am 22.09.2017 um 22:59 schrieb Gregory Farnum:
>> [..]
>>> This is super cool! Is there anything written down that explains this
>>> for Ceph developers who aren't familiar with the workings of Dovecot?
>>> I've got some questions I see going through it, but they may be very
>>> dumb.
>>>
>>> *) Why are indexes going on CephFS? Is this just about wanting a local
>>> cache, or about the existing Dovecot implementations, or something
>>> else? Almost seems like you could just store the whole thing in a
>>> CephFS filesystem if that's safe. ;)
>>
>> This is, if everything works as expected, only an intermediate step. An
>> idea is
>> (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/status-3)
>> to use omap to store the index/meta data.
>>
>> We chose a step-by-step approach and since we are currently not sure if
>> using omap would work performance wise, we use CephFS (also since this
>> requires no changes in Dovecot). Currently we put our focus on the
>> development of the first version of librmb, but the code to use omap is
>> already there. It needs integration, testing, and performance tuning to
>> verify if it would work with our requirements.
>>
>>> *) It looks like each email is getting its own object in RADOS, and I
>>> assume those are small messages, which leads me to
>>
>> The mail distribution looks like this:
>> https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-dist
>>
>>
>> Yes, the majority of the mails are under 500k, but most objects are
>> around 50k. Not so many very small objects.
> 
> Ah, that slide makes more sense with that context — I was paging
> through it in bed last night and thought it was about the number of
> emails per user or something weird.
> 
> So those mail objects are definitely bigger than I expected; interesting.
> 
>>
>>>   *) is it really cost-acceptable to not use EC pools on email data?
>>
>> We will use EC pools for the mail objects and replication for CephFS.
>>
>> But even without EC there would be a cost case compared to the current
>> system. We will save a large amount of IOPs in the new platform since
>> the (NFS) POSIX layer is removed from the IO path (at least for the mail
>> objects). And we expect with Ceph and commodity hardware we can compete
>> with a traditional enterprise NAS/NFS anyway.
>>
>>>   *) isn't per-object metadata overhead a big cost compared to the
>>> actual stored data?
>>
>> I assume not. The metadata/index is not so much compared to the size of
>> the mails (currently with NFS around 10% I would say). In the classic
>> NFS based dovecot the number of index/cache/metadata files is an issue
>> anyway. With 6.7 billion mails we have 1.2 billion index/cache/metadata
>> files
>> (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-nums).
> 
> I was unclear; I meant the RADOS metadata cost of storing an object. I
> haven't quantified that in a while but it was big enough to make 4KB
> objects pretty expensive, which I was incorrectly assuming would be
> the case for most emails.
> EC pools have the same issue; if you want to erasure-code a 40KB
> object into 5+3 then you pay the metadata overhead for each 8KB
> (40KB/5) of data, but again that's more on the practical side of
> things than my initial assumptions placed it.
> 
> This is super cool!

Currently we assume/hope it will work out. But we will keep an eye on
this topic as soon as we go into the PoC phase with the new hardware and
run real load tests.

Danny



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-22 Thread Gregory Farnum
On Fri, Sep 22, 2017 at 2:49 PM, Danny Al-Gaaf  wrote:
> Am 22.09.2017 um 22:59 schrieb Gregory Farnum:
> [..]
>> This is super cool! Is there anything written down that explains this
>> for Ceph developers who aren't familiar with the workings of Dovecot?
>> I've got some questions I see going through it, but they may be very
>> dumb.
>>
>> *) Why are indexes going on CephFS? Is this just about wanting a local
>> cache, or about the existing Dovecot implementations, or something
>> else? Almost seems like you could just store the whole thing in a
>> CephFS filesystem if that's safe. ;)
>
> This is, if everything works as expected, only an intermediate step. An
> idea is
> (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/status-3)
>> to use omap to store the index/meta data.
>
> We chose a step-by-step approach and since we are currently not sure if
> using omap would work performance wise, we use CephFS (also since this
> requires no changes in Dovecot). Currently we put our focus on the
> development of the first version of librmb, but the code to use omap is
> already there. It needs integration, testing, and performance tuning to
> verify if it would work with our requirements.
>
>> *) It looks like each email is getting its own object in RADOS, and I
>> assume those are small messages, which leads me to
>
> The mail distribution looks like this:
> https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-dist
>
>
>> Yes, the majority of the mails are under 500k, but most objects are
> around 50k. Not so many very small objects.

Ah, that slide makes more sense with that context — I was paging
through it in bed last night and thought it was about the number of
emails per user or something weird.

So those mail objects are definitely bigger than I expected; interesting.

>
>>   *) is it really cost-acceptable to not use EC pools on email data?
>
> We will use EC pools for the mail objects and replication for CephFS.
>
> But even without EC there would be a cost case compared to the current
> system. We will save a large amount of IOPs in the new platform since
> the (NFS) POSIX layer is removed from the IO path (at least for the mail
> objects). And we expect with Ceph and commodity hardware we can compete
> with a traditional enterprise NAS/NFS anyway.
>
>>   *) isn't per-object metadata overhead a big cost compared to the
>> actual stored data?
>
> I assume not. The metadata/index is not so much compared to the size of
> the mails (currently with NFS around 10% I would say). In the classic
> NFS based dovecot the number of index/cache/metadata files is an issue
> anyway. With 6.7 billion mails we have 1.2 billion index/cache/metadata
> files
> (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-nums).

I was unclear; I meant the RADOS metadata cost of storing an object. I
haven't quantified that in a while but it was big enough to make 4KB
objects pretty expensive, which I was incorrectly assuming would be
the case for most emails.
EC pools have the same issue; if you want to erasure-code a 40KB
object into 5+3 then you pay the metadata overhead for each 8KB
(40KB/5) of data, but again that's more on the practical side of
things than my initial assumptions placed it.
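
Spelled out, the chunking arithmetic in that 5+3 example is (plain arithmetic, nothing Ceph-specific assumed beyond the k+m stripe layout):

# Back-of-the-envelope numbers for the 5+3 example above.
k, m = 5, 3                                     # 5 data chunks, 3 coding chunks
object_size_kb = 40

chunk_kb = object_size_kb / k                   # 8.0 KB per chunk
raw_on_disk_kb = chunk_kb * (k + m)             # 64.0 KB before any metadata
space_factor = raw_on_disk_kb / object_size_kb  # 1.6x, i.e. (k+m)/k

# Each of the k+m shards is stored as its own on-disk object on a different
# OSD, so per-object metadata is paid (k+m) = 8 times for one 40 KB mail.
print(chunk_kb, raw_on_disk_kb, space_factor, k + m)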

This is super cool!
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-22 Thread Danny Al-Gaaf
Am 22.09.2017 um 22:59 schrieb Gregory Farnum:
[..]
> This is super cool! Is there anything written down that explains this
> for Ceph developers who aren't familiar with the workings of Dovecot?
> I've got some questions I see going through it, but they may be very
> dumb.
> 
> *) Why are indexes going on CephFS? Is this just about wanting a local
> cache, or about the existing Dovecot implementations, or something
> else? Almost seems like you could just store the whole thing in a
> CephFS filesystem if that's safe. ;)

This is, if everything works as expected, only an intermediate step. An
idea is
(https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/status-3)
to use omap to store the index/meta data.

We chose a step-by-step approach and since we are currently not sure if
using omap would work performance wise, we use CephFS (also since this
requires no changes in Dovecot). Currently we put our focus on the
development of the first version of librmb, but the code to use omap is
already there. It needs integration, testing, and performance tuning to
verify if it would work with our requirements.
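
As an illustration of that omap idea, storing index/metadata entries as omap key/value pairs on a per-mailbox RADOS object could look roughly like this with the python-rados bindings (pool, object and key names are assumptions for illustration, not the planned librmb layout):

# Hypothetical sketch: per-mailbox index entries kept as omap key/value
# pairs on one RADOS "index" object instead of CephFS index files.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('mail-index')        # assumed metadata pool
index_oid = 'user-12345.INBOX'                  # assumed naming scheme

# write a few uid -> flags/location entries
with rados.WriteOpCtx() as op:
    ioctx.set_omap(op,
                   ('uid:1', 'uid:2'),
                   (b'flags=Seen;oid=mail-aaaa', b'flags=;oid=mail-bbbb'))
    ioctx.operate_write_op(op, index_oid)

# read them back
with rados.ReadOpCtx() as op:
    entries, ret = ioctx.get_omap_vals(op, "", "", 100)
    ioctx.operate_read_op(op, index_oid)
    for key, value in entries:
        print(key, value)

ioctx.close()
cluster.shutdown()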

> *) It looks like each email is getting its own object in RADOS, and I
> assume those are small messages, which leads me to

The mail distribution looks like this:
https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-dist


Yes, the majority of the mails are under 500k, but most objects are
around 50k. Not so many very small objects.

>   *) is it really cost-acceptable to not use EC pools on email data?

We will use EC pools for the mail objects and replication for CephFS.

But even without EC there would be a cost case compared to the current
system. We will save a large amount of IOPs in the new platform since
the (NFS) POSIX layer is removed from the IO path (at least for the mail
objects). And we expect with Ceph and commodity hardware we can compete
with a traditional enterprise NAS/NFS anyway.

>   *) isn't per-object metadata overhead a big cost compared to the
> actual stored data?

I assume not. The metadata/index is not so much compared to the size of
the mails (currently with NFS around 10% I would say). In the classic
NFS based dovecot the number of index/cache/metadata files is an issue
anyway. With 6.7 billion mails we have 1.2 billion index/cache/metadata
files
(https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-nums).

Danny
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-22 Thread Gregory Farnum
On Thu, Sep 21, 2017 at 1:40 AM, Wido den Hollander  wrote:
> Hi,
>
> A tracker issue has been out there for a while: 
> http://tracker.ceph.com/issues/12430
>
> Storing e-mail in RADOS with Dovecot, the IMAP/POP3/LDA server with a huge 
> marketshare.
>
> It took a while, but last year Deutsche Telekom took on the heavy work and 
> started a project to develop librmb: LibRadosMailBox
>
> Together with Deutsche Telekom and Tallence GmbH (DE) this project came to 
> life.
>
> First, the Github link: https://github.com/ceph-dovecot/dovecot-ceph-plugin
>
> I am not going to repeat everything which is on Github, but here is a short summary:
>
> - CephFS is used for storing Mailbox Indexes
> - E-Mails are stored directly as RADOS objects
> - It's a Dovecot plugin
>
> We would like everybody to test librmb and report back issues on Github so 
> that further development can be done.
>
> It's not finalized yet, but all the help is welcome to make librmb the best 
> solution for storing your e-mails on Ceph with Dovecot.
>
> Danny Al-Gaaf has written a small blogpost about it and a presentation:
>
> - https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/
> - http://blog.bisect.de/2017/09/ceph-meetup-berlin-followup-librmb.html
>
> To get an idea of the scale: 4.7 PB of raw storage over 1,200 OSDs is the final 
> goal (last slide in the presentation). That will provide roughly 1.2 PB of usable 
> storage capacity for storing e-mail, a lot of e-mail.
>
> To see this project finally go into the Open Source world excites me a lot :-)
>
> A very, very big thanks to Deutsche Telekom for funding this awesome project!
>
> A big thanks as well to Tallence as they did an awesome job in developing 
> librmb in such a short time.

This is super cool! Is there anything written down that explains this
for Ceph developers who aren't familiar with the workings of Dovecot?
I've got some questions I see going through it, but they may be very
dumb.

*) Why are indexes going on CephFS? Is this just about wanting a local
cache, or about the existing Dovecot implementations, or something
else? Almost seems like you could just store the whole thing in a
CephFS filesystem if that's safe. ;)

*) It looks like each email is getting its own object in RADOS, and I
assume those are small messages, which leads me to

  *) is it really cost-acceptable to not use EC pools on email data?

  *) isn't per-object metadata overhead a big cost compared to the
actual stored data?

-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-22 Thread Wido den Hollander

> Op 22 september 2017 om 8:03 schreef Adrian Saul 
> :
> 
> 
> 
> Thanks for bringing this to our attention, Wido - it's of interest to us as we are 
> currently looking to migrate mail platforms onto Ceph using NFS, but this 
> seems far more practical.
> 

Great! Keep in mind this is still in a very experimental phase, but we can use 
all the feedback to make librmb awesome.

Issues can be reported on Github.

Thanks!

Wido

> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Wido den Hollander
> > Sent: Thursday, 21 September 2017 6:40 PM
> > To: ceph-us...@ceph.com
> > Subject: [ceph-users] librmb: Mail storage on RADOS with Dovecot
> >
> > Hi,
> >
> > A tracker issue has been out there for a while:
> > http://tracker.ceph.com/issues/12430
> >
> > Storing e-mail in RADOS with Dovecot, the IMAP/POP3/LDA server with a
> > huge marketshare.
> >
> > It took a while, but last year Deutsche Telekom took on the heavy work and
> > started a project to develop librmb: LibRadosMailBox
> >
> > Together with Deutsche Telekom and Tallence GmbH (DE) this project came
> > to life.
> >
> > First, the Github link: https://github.com/ceph-dovecot/dovecot-ceph-
> > plugin
> >
> > I am not going to repeat everything which is on Github, but here is a short summary:
> >
> > - CephFS is used for storing Mailbox Indexes
> > - E-Mails are stored directly as RADOS objects
> > - It's a Dovecot plugin
> >
> > We would like everybody to test librmb and report back issues on Github so
> > that further development can be done.
> >
> > It's not finalized yet, but all the help is welcome to make librmb the best
> > solution for storing your e-mails on Ceph with Dovecot.
> >
> > Danny Al-Gaaf has written a small blogpost about it and a presentation:
> >
> > - https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/
> > - http://blog.bisect.de/2017/09/ceph-meetup-berlin-followup-librmb.html
> >
> > To get an idea of the scale: 4.7 PB of raw storage over 1,200 OSDs is the final
> > goal (last slide in the presentation). That will provide roughly 1.2 PB of usable
> > storage capacity for storing e-mail, a lot of e-mail.
> >
> > To see this project finally go into the Open Source world excites me a lot 
> > :-)
> >
> > A very, very big thanks to Deutsche Telekom for funding this awesome
> > project!
> >
> > A big thanks as well to Tallence as they did an awesome job in developing
> > librmb in such a short time.
> >
> > Wido
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-22 Thread Adrian Saul

Thanks for bringing this to our attention, Wido - it's of interest to us as we are 
currently looking to migrate mail platforms onto Ceph using NFS, but this seems 
far more practical.


> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Wido den Hollander
> Sent: Thursday, 21 September 2017 6:40 PM
> To: ceph-us...@ceph.com
> Subject: [ceph-users] librmb: Mail storage on RADOS with Dovecot
>
> Hi,
>
> A tracker issue has been out there for a while:
> http://tracker.ceph.com/issues/12430
>
> Storing e-mail in RADOS with Dovecot, the IMAP/POP3/LDA server with a
> huge marketshare.
>
> It took a while, but last year Deutsche Telekom took on the heavy work and
> started a project to develop librmb: LibRadosMailBox
>
> Together with Deutsche Telekom and Tallence GmbH (DE) this project came
> to life.
>
> First, the Github link: https://github.com/ceph-dovecot/dovecot-ceph-
> plugin
>
> I am not going to repeat everything which is on Github, but here is a short summary:
>
> - CephFS is used for storing Mailbox Indexes
> - E-Mails are stored directly as RADOS objects
> - It's a Dovecot plugin
>
> We would like everybody to test librmb and report back issues on Github so
> that further development can be done.
>
> It's not finalized yet, but all the help is welcome to make librmb the best
> solution for storing your e-mails on Ceph with Dovecot.
>
> Danny Al-Gaaf has written a small blogpost about it and a presentation:
>
> - https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/
> - http://blog.bisect.de/2017/09/ceph-meetup-berlin-followup-librmb.html
>
> To get an idea of the scale: 4.7 PB of raw storage over 1,200 OSDs is the final
> goal (last slide in the presentation). That will provide roughly 1.2 PB of usable
> storage capacity for storing e-mail, a lot of e-mail.
>
> To see this project finally go into the Open Source world excites me a lot :-)
>
> A very, very big thanks to Deutsche Telekom for funding this awesome
> project!
>
> A big thanks as well to Tallence as they did an awesome job in developing
> librmb in such a short time.
>
> Wido
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-21 Thread Brad Hubbard
This looks great Wido!


Kudos to all involved.

On Thu, Sep 21, 2017 at 6:40 PM, Wido den Hollander  wrote:
> Hi,
>
> A tracker issue has been out there for a while: 
> http://tracker.ceph.com/issues/12430
>
> Storing e-mail in RADOS with Dovecot, the IMAP/POP3/LDA server with a huge 
> marketshare.
>
> It took a while, but last year Deutsche Telekom took on the heavy work and 
> started a project to develop librmb: LibRadosMailBox
>
> Together with Deutsche Telekom and Tallence GmbH (DE) this project came to 
> life.
>
> First, the Github link: https://github.com/ceph-dovecot/dovecot-ceph-plugin
>
> I am not going to repeat everything which is on Github, but here is a short summary:
>
> - CephFS is used for storing Mailbox Indexes
> - E-Mails are stored directly as RADOS objects
> - It's a Dovecot plugin
>
> We would like everybody to test librmb and report back issues on Github so 
> that further development can be done.
>
> It's not finalized yet, but all the help is welcome to make librmb the best 
> solution for storing your e-mails on Ceph with Dovecot.
>
> Danny Al-Gaaf has written a small blogpost about it and a presentation:
>
> - https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/
> - http://blog.bisect.de/2017/09/ceph-meetup-berlin-followup-librmb.html
>
> To get an idea of the scale: 4.7 PB of raw storage over 1,200 OSDs is the final 
> goal (last slide in the presentation). That will provide roughly 1.2 PB of usable 
> storage capacity for storing e-mail, a lot of e-mail.
>
> To see this project finally go into the Open Source world excites me a lot :-)
>
> A very, very big thanks to Deutsche Telekom for funding this awesome project!
>
> A big thanks as well to Tallence as they did an awesome job in developing 
> librmb in such a short time.
>
> Wido
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com