Re: app design recommendations

2013-08-22 Thread Nulik Nol
Hi.
Thanks to everybody for the replies; I have figured out what to do now.


> I'm not sure about the answer on this one, but you are aware of the fact
> that you can't search for objects? You can't ask RADOS to give you all
> objects where attr X has value Y.


Well, the main daily operation is to read the last 50 emails, and it
consumes about 95% of the system's resources, so I am going to offload
advanced search functions to an RDBMS and content search to Sphinx.
That also gives me an additional copy of the data in case of a disaster
and saves development time.

Regards
Nulik
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: app design recommendations

2013-08-21 Thread Samuel Just
Hi,

Looks interesting.

On Tue, Aug 20, 2013 at 1:06 PM, Nulik Nol nulik...@gmail.com wrote:
> Hi,
> I am creating an email system which will handle a whole company's email,
> mostly internal mail. There will be thousands of companies and
> hundreds of users per company, so I am planning to use one pool per
> company to store email messages. Can Ceph manage thousands or maybe
> hundreds of thousands of pools? Could there be any slowdown in production
> with such a design after some growth?

You can't have thousands of pools.  Each pool shows up in the OSDMap,
and each pool must have at least 1 pg.  However, we now have RADOS-level
namespaces, which subdivide a pool.  You can have as many
namespaces as objects if you want.  The main caveat is that you can't
efficiently list the contents of a namespace.  We do support user
capabilities on namespaces.
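
An illustrative sketch of the namespace approach, not part of the original
reply: with the python-rados bindings, an open I/O context can be pointed at
a namespace with set_namespace() before writing. The "company-<id>" naming
scheme and both helper names are hypothetical, and ioctx is assumed to be an
already opened rados.Ioctx on a single shared pool.

```python
def company_namespace(company_id):
    """Hypothetical naming scheme: one RADOS namespace per company."""
    return "company-%d" % company_id


def store_email(ioctx, company_id, message_id, body):
    """Write one email into the company's namespace of a shared pool.

    `ioctx` is assumed to be an open rados.Ioctx (or any object that
    exposes the same set_namespace/write_full methods).
    """
    # All following operations on this ioctx go to the chosen namespace.
    ioctx.set_namespace(company_namespace(company_id))
    # write_full replaces the whole object content with `body` (bytes).
    ioctx.write_full(message_id, body)
```

Because a namespace cannot be listed efficiently, a design like this would
still need its own index of message IDs per company.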


> Every email will be stored as an individual Ceph object (emails will
> average 512 bytes and rarely have attachments). Is it OK to store
> them as individual Ceph objects, or will that be less efficient than
> storing multiple emails in one Ceph object? At what object size does
> storing emails individually become preferable to writing them through
> omap with leveldb? (kind of a Ceph object vs omap benchmark question)

I am unsure.  Each rados object will end up as a file on the backing
filesystem, so there is some overhead there.  My first instinct for
objects of 512 bytes might be to hash the key and use part of the
hash to determine the object name and the rest to determine the omap
key.  It might be fine to do 1 email per object though.
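
The hashing idea above could look roughly like this. It is only a sketch:
the choice of SHA-1, the 4-hex-digit split, the "emails-" prefix, and the
function name are all assumptions made for illustration.

```python
import hashlib


def locate_email(email_key, prefix_hex_digits=4):
    """Split a hash of the email key into (object name, omap key).

    The first `prefix_hex_digits` hex digits select the bucket object,
    so 4 digits spread the emails over at most 16**4 = 65536 objects.
    The remaining digits serve as the omap key inside that object.
    """
    digest = hashlib.sha1(email_key.encode("utf-8")).hexdigest()
    object_name = "emails-" + digest[:prefix_hex_digits]
    omap_key = digest[prefix_hex_digits:]
    return object_name, omap_key
```

A longer prefix gives more, smaller bucket objects; a shorter one gives
fewer objects with larger omaps.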


> Also, I will be putting mini-chat sessions between users in a Ceph
> object: each time a user sends a message to another user, I will
> append the text to the object. So my question is, will Ceph
> rewrite the whole object into a new physical location on disk when I
> do an append? Or will it just rewrite the block that was modified?

Appends are performed using filesystem writes, so an append won't rewrite
the entire object.  RADOS supports partial overwrites just fine.
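
For illustration only, appending a chat line through the python-rados
bindings might be sketched as below. The "sender: text" line format and the
helper name are hypothetical, and ioctx is assumed to be an already opened
rados.Ioctx.

```python
def append_chat_line(ioctx, session_object, sender, text):
    """Append one chat line to the object that holds a chat session.

    `ioctx` is assumed to be an open rados.Ioctx (or any object with a
    compatible append(key, data) method). Returns the bytes appended.
    """
    # Hypothetical record format: "sender: text", one line per message.
    line = ("%s: %s\n" % (sender, text)).encode("utf-8")
    # append() adds the data at the end of the object; existing content
    # is not resent by the client.
    ioctx.append(session_object, line)
    return len(line)
```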


> And the last questions: which is faster, storing small key/value pairs
> in omap or in xattrs? Will storing key/value pairs in xattrs waste
> space by allocating a block for a zero-sized object on the OSD? (I
> won't write any data to the object, just use xattrs.)

Depending on the backing filesystem, xattrs are either stored as xattrs
on the backing filesystem or as keys in leveldb (just like omap).  I
suggest you reserve xattrs for small pieces of metadata and omap for
large sets of keys.  Zero-sized objects are fine.

-Sam


> I would very much appreciate your comments.
>
> Best Regards
> Nulik


app design recommendations

2013-08-20 Thread Nulik Nol
Hi,
I am creating an email system which will handle a whole company's email,
mostly internal mail. There will be thousands of companies and
hundreds of users per company, so I am planning to use one pool per
company to store email messages. Can Ceph manage thousands or maybe
hundreds of thousands of pools? Could there be any slowdown in production
with such a design after some growth?

Every email will be stored as an individual Ceph object (emails will
average 512 bytes and rarely have attachments). Is it OK to store
them as individual Ceph objects, or will that be less efficient than
storing multiple emails in one Ceph object? At what object size does
storing emails individually become preferable to writing them through
omap with leveldb? (kind of a Ceph object vs omap benchmark question)

Also, I will be putting mini-chat sessions between users in a Ceph
object: each time a user sends a message to another user, I will
append the text to the object. So my question is, will Ceph
rewrite the whole object into a new physical location on disk when I
do an append? Or will it just rewrite the block that was modified?

And the last questions: which is faster, storing small key/value pairs
in omap or in xattrs? Will storing key/value pairs in xattrs waste
space by allocating a block for a zero-sized object on the OSD? (I
won't write any data to the object, just use xattrs.)

I would very much appreciate your comments.

Best Regards
Nulik