Hello fellow Cloudstack users,

This is my first post to this mailing list, so please excuse me if I'm not
following the proper etiquette.

My name is Zoran, and I'm a developer working for a DDN company.  While
investigating the Cloudstack S3 and its performance, I encountered some
rather odd behavior of the Cloudstack S3 server, so I wanted to verify my
findings with you.

To test the S3 performance I used Python and the boto library to create an
S3 client that adds keys with random names and content to a single
Cloudstack S3 bucket.  To my surprise, the bucket became more and more
unresponsive as it grew.  For example, at about 20'000 keys it took about
10 seconds to "get a bucket" (boto ref
http://boto.s3.amazonaws.com/ref/s3.html#boto.s3.connection.S3Connection.get_bucket
and AWS ref
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html).
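
For anyone who wants to reproduce the measurement, here is a minimal sketch
of a timing helper. It is not the exact client I used; the helper name
time_call and the repeats parameter are made up for illustration, and the
workload in the demo is a stand-in so that the snippet runs without a server.

```python
import time


def time_call(fn, repeats=3):
    """Return the wall-clock duration, in seconds, of each call to fn()."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in workload (an assumption, not an S3 call).
    for seconds in time_call(lambda: sum(range(100000))):
        print("call took %.4f s" % seconds)
```

With boto one would pass something like lambda: conn.get_bucket(name) as
fn, where conn is an S3Connection pointed at the Cloudstack S3 endpoint and
name is the bucket under test (both are placeholders here).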

Has anyone else noticed these significant slow-downs, or is it perhaps
just my environment and a possible misconfiguration?


Not to leave it at this, I tried to locate the delay on the Cloudstack S3
server-side, and I may have found two potential issues with
SObjectDaoImpl.listBucketObjects()  (ref
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=blob;f=awsapi/src/com/cloud/bridge/persist/dao/SObjectDaoImpl.java;h=6d23757b8b57ded9443bfe61aaa3742590b21c49;hb=master#l71
):

1) the maxKeys parameter seems to be ignored, so all 20'000+ keys (i.e. the
full bucket content) were being inspected instead of just the first 1'000
keys, which is the default maxKeys value

2) the object's data seems to be fetched from MySQL with a separate query
per object instead of a JOIN, so something similar to this:
>  objList = SQL("select * from SObject where SBucketID='xxx' ");
>  for (ObjectVO obj : objList) {
>     objItem = SQL("select * from SObject_Item where SObjectID='yyy' ");
>  }

Note that the data can be retrieved "in one go" and a lot more efficiently
if one uses a JOIN on the database side, i.e.
>  select * from sobject so LEFT JOIN sobject_item si
>    on so.ID=si.SObjectID where so.SBucketID='xxx';
However, I am not sure whether the Cloudstack DAO and *VO-object
abstraction supports database JOINs.
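
To illustrate the difference, here is a minimal, self-contained sketch that
uses Python's sqlite3 module as a stand-in for MySQL. The table names and
the ID, SBucketID and SObjectID columns come from the queries above. The
NameKey and StoredSize columns, the function names, and the way maxKeys is
applied through LIMIT are my assumptions for illustration; this is not the
actual Cloudstack code.

```python
import sqlite3


def make_db(num_objects):
    """Build an in-memory stand-in for the two tables (simplified columns)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sobject "
               "(ID INTEGER PRIMARY KEY, SBucketID INTEGER, NameKey TEXT)")
    db.execute("CREATE TABLE sobject_item "
               "(ID INTEGER PRIMARY KEY, SObjectID INTEGER, StoredSize INTEGER)")
    for i in range(num_objects):
        db.execute("INSERT INTO sobject (ID, SBucketID, NameKey) "
                   "VALUES (?, 1, ?)", (i, "key-%06d" % i))
        db.execute("INSERT INTO sobject_item (SObjectID, StoredSize) "
                   "VALUES (?, ?)", (i, 100 + i))
    return db


def list_per_object(db, bucket_id):
    """One query for the objects, then one more query per object."""
    objs = db.execute("SELECT ID, NameKey FROM sobject "
                      "WHERE SBucketID = ? ORDER BY NameKey",
                      (bucket_id,)).fetchall()
    queries = 1
    result = []
    for obj_id, name in objs:
        items = db.execute("SELECT StoredSize FROM sobject_item "
                           "WHERE SObjectID = ?", (obj_id,)).fetchall()
        queries += 1
        result.append((name, [size for (size,) in items]))
    return result, queries


def list_joined(db, bucket_id, max_keys=1000):
    """Single query: cap the objects at max_keys, then LEFT JOIN the items."""
    rows = db.execute(
        "SELECT so.NameKey, si.StoredSize "
        "FROM (SELECT ID, NameKey FROM sobject WHERE SBucketID = ? "
        "      ORDER BY NameKey LIMIT ?) so "
        "LEFT JOIN sobject_item si ON so.ID = si.SObjectID "
        "ORDER BY so.NameKey, si.ID",
        (bucket_id, max_keys)).fetchall()
    grouped = {}
    for name, size in rows:
        grouped.setdefault(name, [])
        if size is not None:  # objects without items yield a NULL row
            grouped[name].append(size)
    return list(grouped.items()), 1


if __name__ == "__main__":
    db = make_db(2000)
    _, n_queries = list_per_object(db, 1)
    page, j_queries = list_joined(db, 1)
    print("per-object queries:", n_queries)
    print("joined queries:", j_queries, "objects returned:", len(page))
```

The LIMIT sits in the inner select so that it caps the number of objects
rather than the number of joined rows, which matters if an object can have
more than one item.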

Can someone confirm if these are actual code issues?

Thank you in advance!


Best regards,
                Zoran
