> -----Original Message-----
> From: John Burwell [mailto:jburw...@basho.com]
> Sent: Monday, May 20, 2013 12:56 PM
> To: dev@cloudstack.apache.org
> Subject: Re: [MERGE]object_store branch into master
>
> All,
>
> Since this change is so large, it makes reviewing and commenting in detail
> extremely difficult. Would it be possible to push this patch through Review
> Board to ease comprehension and promote a conversation about this patch?
We can try to push it into Review Board.

> Reading through the FS, I have the following questions regarding the
> operation of the NFS cache:
>
> What happens if/when the disk space of the NFS cache is exhausted? What
> are the sizing recommendations/guidelines for it?
> What strategy is used to age files out of the NFS cache?

As usual, the admin can have multiple NFS secondary storages, and the admin
can also add multiple NFS cache storages. Capacity planning for NFS cache
storage will be the same as for NFS secondary storage. If there are multiple
NFS cache storages, the current strategy is to randomly choose one of them.
Currently, no clean-up/aging-out strategy is implemented yet, but the
situation can be improved: most cached objects can be deleted after being
accessed once. Take a template as an example: if zone-wide storage is used,
putting the template on cache storage has little value, because once the
template is downloaded into primary storage, all the hypervisor hosts can
access it. I think a simple LRU algorithm to delete cached objects should be
enough (a rough sketch is further down in this mail). It can be added later;
the cache storage has its own pom project, which is the place to add more
intelligence.

> If two processes, process1 and process2, are both using a template,
> templateA, will both processes reference the same file in the NFS cache?

It's possible for one template to be downloaded into cache storage twice, in
case it is accessed by two processes concurrently. In the current
implementation, if two processes want to download the same template from S3
into the same primary storage at the same time, only one copy of the template
will be downloaded into cache storage. However, if two processes want to
download the same template into different primary storages, the template will
be cached twice.

> If they are reading from the same file and process1 finishes before
> process2, will process1 attempt to delete process2?

There is no way to delete an object while it is being read, as each cached
object has its own state machine. If it's being accessed by one process, the
state is changed to "Copying", and you can't delete an object while it's in
the "Copying" state (a simplified illustration is further down in this mail).

> If a file transfer from the NFS cache to the object store fails, what is the
> recovery/retry strategy? What durability guarantees will CloudStack supply
> when a snapshot, template, or ISO is in the cache, but can't be written to
> the object store?

The error handling with cache storage shouldn't be different from the error
handling without it. For example, take directly backing up a snapshot from
primary storage to S3, without cache storage: if the backup fails, the whole
process fails, and the user needs to do it again through the CloudStack API.
So in the cache storage case, if pushing the object from cache storage to S3
fails, then the whole backup process fails.

> What will be the migration strategy for the objects contained in S3
> buckets/Swift containers from pre-4.2.0 instances? Currently, CloudStack
> tracks a mapping between these objects and templates/ISOs in the
> template_swift_ref and template_s3_ref tables.

We need to migrate the DB from the existing template_s3_ref table to
template_store_ref, and put all the S3 information into the image_store and
image_store_details tables.

> Finally, does the S3 implementation use multi-part upload to transfer files
> to the object store? If not, the implementation will be limited to storing
> files no larger than 5GB in size.

Oh, this is something we don't know yet. We haven't tried to upload a template
larger than 5GB, so we haven't hit this issue.
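Going back to the aging-out question above, just to make the idea concrete:
here is a minimal sketch of what an LRU clean-up could look like, assuming we
simply track cached objects by install path and cap the number of entries per
cache storage. The class and method names are made up for illustration, this
is not code from the branch:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical sketch: evict the least-recently-used cached objects once
    // the number of entries on an NFS cache storage exceeds a configured cap.
    public class CacheLruSketch {
        private final int maxEntries;
        private final Map<String, Long> cacheEntries; // install path -> size in bytes

        public CacheLruSketch(int maxEntries) {
            this.maxEntries = maxEntries;
            // accessOrder = true iterates from least- to most-recently used
            this.cacheEntries = new LinkedHashMap<String, Long>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                    if (size() > CacheLruSketch.this.maxEntries) {
                        deleteFromNfs(eldest.getKey()); // drop the file from the NFS cache
                        return true;
                    }
                    return false;
                }
            };
        }

        public void recordAccess(String installPath, long sizeInBytes) {
            cacheEntries.put(installPath, sizeInBytes);
        }

        private void deleteFromNfs(String installPath) {
            // placeholder: issue a delete against the cache storage for this path
        }
    }

A real policy would probably need to be byte-based rather than entry-based,
but the shape would be the same.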
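On the "delete while read" point, a stripped-down illustration of the
per-object state guard described above (again, hypothetical names, not the
actual object_store classes; the real state machine has more states and
transitions):

    // Hypothetical illustration: an object that is currently in the Copying
    // state cannot be deleted by another process.
    public class CachedObjectSketch {
        public enum State { ALLOCATED, COPYING, READY, DESTROYED }

        private State state = State.ALLOCATED;

        public synchronized void startCopy() {
            state = State.COPYING;
        }

        public synchronized void finishCopy() {
            state = State.READY;
        }

        public synchronized boolean tryDelete() {
            if (state == State.COPYING) {
                // another process is still reading/copying this object
                return false;
            }
            state = State.DESTROYED;
            return true;
        }
    }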
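On the multi-part upload question: if we end up using the AWS SDK for Java's
TransferManager, it switches to multi-part uploads automatically for files
above its threshold, which would get us past the 5GB single-PUT limit. A rough
sketch of how it might be wired in (an assumption, not what the branch does
today):

    import java.io.File;

    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.transfer.TransferManager;
    import com.amazonaws.services.s3.transfer.Upload;

    // Sketch: upload a large template to S3 with TransferManager, which
    // splits large files into parts instead of issuing a single PUT.
    public class S3MultipartUploadSketch {
        public static void uploadTemplate(String accessKey, String secretKey,
                String bucket, String key, File templateFile) throws InterruptedException {
            AmazonS3Client s3 = new AmazonS3Client(new BasicAWSCredentials(accessKey, secretKey));
            TransferManager tm = new TransferManager(s3);
            try {
                Upload upload = tm.upload(bucket, key, templateFile);
                upload.waitForCompletion(); // blocks until all parts are uploaded
            } finally {
                tm.shutdownNow();
            }
        }
    }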
Could you help to hack it up? :)

> Thanks,
> -John
>
> On May 20, 2013, at 1:52 PM, Chip Childers <chip.child...@sungard.com>
> wrote:
>
> > On Fri, May 17, 2013 at 08:19:57AM -0400, David Nalley wrote:
> >> On Fri, May 17, 2013 at 4:11 AM, Edison Su <edison...@citrix.com> wrote:
> >>> Hi all,
> >>> Min and I worked on the object_store branch during the last one and a
> >>> half months. We did a lot of refactoring of the storage code, mostly
> >>> related to secondary storage, but also of the general storage framework.
> >>> The following goals were met:
> >>>
> >>> 1. A unified storage framework. Both secondary storages (nfs/s3/swift
> >>> etc.) and primary storages will share the same plugin model and the same
> >>> interface. Adding any other new storage into CloudStack will be much
> >>> easier and more straightforward.
> >>>
> >>> 2. The storage interface between the mgt server and the resource is
> >>> unified; currently there are only 5 commands sent out by the mgt server:
> >>> copycommand/createobjectcommand/deletecommand/attachcommand/dettachcommand,
> >>> and each storage vendor can decode/encode all the entities
> >>> (volume/snapshot/storage pool/template etc.) on its own.
> >>>
> >>> 3. NFS secondary storage is not explicitly depended on by other
> >>> components. For example, when registering a template into S3, the
> >>> template will be written into S3 directly, instead of being stored on nfs
> >>> secondary storage and then pushed to S3. If s3 is used as secondary
> >>> storage, then nfs storage will be used as cache storage, but from other
> >>> components' point of view, the cache storage is invisible. So, it's
> >>> possible to make nfs storage optional if s3 is used for certain
> >>> hypervisors.
> >>> The detailed FS is at
> >>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+Backup+Object+Store+Plugin+Framework
> >>> The tests we did:
> >>>
> >>> 1. We modified marvin to use the new storage API.
> >>>
> >>> 2. test_volume, test_vm_life_cycle, and test_template under the smoke
> >>> test folder were executed against xenserver/kvm/vmware and devcloud. Some
> >>> of them failed, partly due to bugs introduced by our code, and partly
> >>> because the master branch itself has issues (e.g. resizevolume doesn't
> >>> work). We want to fix these issues after merging into master.
> >>>
> >>> The basic flow does work: create user vm, attach/detach volume, register
> >>> template, create template from volume/snapshot, take snapshot, create
> >>> volume from snapshot.
> >>> It's a huge change, around a 60k LOC patch. To review the code, you can
> >>> try: git diff master..object_store, which will show the whole diff.
> >>> Comments/feedback are welcome. Thanks.
> >>>
> >>
> >> Given the amount of change, can we get at least a BVT run against
> >> your branch done before merge?
> >>
> >> --David
> >>
> >
> > +1 to BVT please.