[ceph-users] Designing an application with Ceph

2013-08-13 Thread Nulik Nol
Hi,
I am planning to use Ceph as database storage for a webmail
client/server application, and I am thinking of storing the data as
key/value pairs instead of using an RDBMS, for speed. The webmail
will manage companies, and each company will have many users. Users
will send/receive emails and store them in their inboxes, kind of like
Gmail, but per company. The server will be developed in C, the client
code in HTML/JavaScript, and a binary client (standalone app) in C++.
So, my question is: how would you recommend I design the backend?

I have thought of these choices:

1. Use Ceph as a filesystem and Berkeley DB as the database engine.
Berkeley DB uses 2 files per table, so I will have 1 directory per
company and 2 files per table. I think there will be no more
than 20 tables in my whole app. Ceph will be used here as a remote
filesystem where Berkeley DB will do all the data organization. The
RADOS interface of Ceph (to store key/value pairs) will not be used,
since Berkeley DB is itself a key/value database and will read and
write its files through CephFS. But I have never used a DB
on a remote filesystem, so I am not sure it will work well.
Advantages of this architecture: quick and easy.
Disadvantages: lower performance (overhead in both CephFS and Berkeley DB);
also, I will not be able to write plugins for RADOS in C++ to combine
many data modifications in a single call to the server.

2. Use the librados C API and write all the 'queries' hardcoded in C
specifically for the application. Since the application is pretty
standard and is not supposed to change much, I can do this. I would
create a RADOS object for each application object (for example, a
'user' record, an 'email' record, a 'chat message' record, etc.).
Advantages: high performance. Disadvantages: a bit more to code,
especially the data search functions.
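
To make option 2 a bit more concrete, here is a minimal sketch of how one RADOS object per application record might be named. The naming scheme (company id, record kind, record id) and the function make_object_name() are assumptions made up for this example; nothing here comes from librados itself. The resulting string is the sort of thing that would be passed as the object id (oid) to a librados call such as rados_write_full().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical naming scheme (an assumption, not a librados convention):
 *   c<company id, 8 hex digits>/<kind>/<record id, 16 hex digits>
 * e.g. kind = "user", "email", "chat".
 *
 * Writes the name into buf and returns its length, or -1 if the kind is
 * empty, contains '/', or the buffer is too small.
 */
int make_object_name(char *buf, size_t buflen, uint32_t company_id,
                     const char *kind, uint64_t record_id)
{
    assert(buf != NULL && kind != NULL);

    /* '/' is the field separator, so it must not appear inside kind. */
    if (kind[0] == '\0' || strchr(kind, '/') != NULL)
        return -1;

    int n = snprintf(buf, buflen, "c%08" PRIx32 "/%s/%016" PRIx64,
                     company_id, kind, record_id);

    /* snprintf reports the length it wanted; treat truncation as an error. */
    return (n >= 0 && (size_t)n < buflen) ? n : -1;
}
```

Putting the company id first keeps all records of one company under a common prefix, which is only a convenience for the application; RADOS places objects by hashing the whole name, so the prefix has no effect on where the data lands.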

I am interested in performance, so I am thinking of going with option
2. What do you think? Can RADOS fully replace a database engine? (I
mean a NoSQL engine, like Berkeley DB, for example.)

I would very much appreciate your comments.
TIA
Nulik
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Designing an application with Ceph

2013-08-13 Thread Samuel Just
2 is certainly an intriguing option.  RADOS isn't really a database
engine (even a NoSQL one), but it should be able to serve your needs
here.  Have you seen the omap API available in librados?  It allows
you to efficiently store key/value pairs attached to a librados object
(it uses LevelDB on the OSDs to actually handle the key/value mapping).
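
As a rough illustration of how omap could cover some of the "search" side, one approach is to keep an inbox index as omap keys on a per-user object. Omap keys come back in sorted (bytewise) order, so a fixed-width, zero-padded key makes that order match the order you care about. The sketch below only builds and sorts such keys locally; it does not talk to a cluster. The key layout and the names make_inbox_key() and sort_inbox_keys() are assumptions for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* 20 decimal digits + '.' + 16 hex digits + terminating NUL. */
#define INBOX_KEY_SIZE 38

/*
 * Hypothetical omap key for one message in an inbox index:
 *   <receive time, zero-padded to 20 digits>.<message id, 16 hex digits>
 * Zero padding makes bytewise order equal to chronological order
 * (unpadded, "9" would sort after "10").
 * Returns the key length (37), or -1 if buf is too small.
 */
int make_inbox_key(char *buf, size_t buflen, uint64_t received_at,
                   uint64_t msg_id)
{
    assert(buf != NULL);
    int n = snprintf(buf, buflen, "%020" PRIu64 ".%016" PRIx64,
                     received_at, msg_id);
    return (n >= 0 && (size_t)n < buflen) ? n : -1;
}

/* qsort comparator: plain bytewise comparison of two key buffers. */
static int inbox_key_cmp(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* Sorts keys locally the same way a bytewise-ordered store would. */
void sort_inbox_keys(char (*keys)[INBOX_KEY_SIZE], size_t n)
{
    qsort(keys, n, sizeof keys[0], inbox_key_cmp);
}
```

With keys like these, listing the newest or oldest messages becomes a range read over the omap keys of one object rather than a scan over many objects. The value stored under each key could then be the name of the object holding the message itself.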

One caveat is that the C API is somewhat less complete than the C++
API.  That would be pretty easily remedied if there were demand,
though.
-Sam