OK, thanks for the detail. I agree that creating volumes would be a great place to start and build from there. I think option #1 is ideal for the more advanced backend features.
On Wed, Oct 31, 2012 at 3:05 PM, Edison Su <[email protected]> wrote:
>
>> -----Original Message-----
>> From: Marcus Sorensen [mailto:[email protected]]
>> Sent: Wednesday, October 31, 2012 12:35 PM
>> To: [email protected]
>> Subject: Re: Requirements of Storage Orchestration
>>
>> I just don't see how this would be easy to implement. Actually creating an
>> iSCSI LUN on a target would be fine: there are a few known parameters, and
>> the plugin would do the work. Allowing someone to configure any arbitrary
>> storage array via a plugin seems tricky. I'm not an expert in the code,
>> nor do I really understand the storage framework, but let me explain why
>> I think it's tricky.
>>
>> The admin wants to create new primary storage, so perhaps there's a plugin
>> call provided to list devices/disks attached to an appliance. It's up to
>> the plugin to do the actual work of collecting that, but it returns a list
>> of objects to CloudStack with a vendor-agnostic string as the disk
>> identifier and an arbitrary set of properties based on the storage
>> appliance's capabilities, like the physical block size of the disks, disk
>> size, controller/backplane location, maybe some SMART attributes. Then we
>> query the storage plugin for the methods that can be used on that
>> appliance to create pools out of those disks (RAID levels, zpools, etc.),
>> which features can be set, and which values those features accept. Then we
>> present all of this to the admin in some way that allows him/her to define
>> a storage pool with compression, dedup, encryption, ashift, and whatever
>> features the admin wants. Then we accept the admin's input and send a
>> create-pool command to the plugin. This command might let us set a feature
>> on the pool that tells us whether we're dealing with a filesystem or
>> whether we need to carve that pool into volumes. Then we can call the
>> plugin to create volumes if necessary, and perhaps format those volumes
>> with some filesystem. Then we call the plugin to export them via NFS, or
>> export those volumes directly as iSCSI or FC or whatever.
>
> Hand the complicated tasks over to the storage provider itself. The life
> cycle of a data store provider is:
> Register -> enable -> disable -> deregister
> Before enabling the data store provider, the admin should call some APIs
> to initialize or configure the storage backend. Each storage provider may
> have its own specific features and configuration. If we can't generalize
> them into one set of standard APIs, there are two ways to deal with it in
> my mind:
> 1. The provider can expose its own APIs to the admin. This is what we are
> doing for network providers.
> 2. Make the API itself extensible. We define one big API, like
> initializeProvider(Map<String, Object> parameters), and the admin passes a
> map of parameters specific to the provider.
> If the provider is configured properly by the admin, the storage backend
> is ready to create storage pools, and the admin can call an API to enable
> it. Once the provider is enabled, the admin can call another API to create
> storage pools on it. Again, this API, like the initialize API above, can
> be specific to each provider, but what it returns is a token or a URI that
> uniquely identifies the storage pool. During the API call, the provider
> can format disks, export an NFS share, or perform any other
> backend-specific operations. After the API call, the storage pool is ready
> to create volumes on.
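A minimal sketch of what that lifecycle plus the extensible option #2 could look like in Java. Every interface and method name below is an illustrative assumption, not the actual CloudStack storage framework:

```java
// Hedged sketch only: provider lifecycle (register -> enable -> disable ->
// deregister) with an extensible, map-based initialization call. Names are
// hypothetical, not CloudStack's real interfaces.
import java.util.Map;

public interface StorageProviderSketch {
    void register();

    // Option #2 from above: one "big" call; each provider documents the
    // keys it understands (e.g. "mgmtIp", "username", "compression").
    void initializeProvider(Map<String, Object> parameters);

    // Backend is configured; storage pools may now be created.
    void enable();

    // Provider-specific pool creation. The return value is an opaque token
    // or URI ("nfs://...", "iscsi://...", or just a UUID) that uniquely
    // identifies the pool to CloudStack.
    String createStoragePool(Map<String, Object> parameters);

    void disable();
    void deregister();
}
```

The map-based signature keeps the orchestrator vendor-agnostic at the cost of compile-time checking, which mirrors the trade-off Edison describes between standard and provider-specific APIs.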
> Once a storage pool is created, the admin can attach it to a specific
> scope (a zone/cluster/host) and then create volumes on it. CreateVolume is
> a standard API; the parameters are volume size and volume format (vhd,
> raw, qcow2, etc.). CloudStack will then call the storage pool's provider
> to create a volume, which returns a URI or token to CloudStack. The volume
> URI or token is opaque to CloudStack; it can be in the form of
> "iscsi://whatever", "nfs://whatever", or just a simple UUID.
> Once we have the volume URI or token, we pass it to the hypervisor to
> launch the VM. On the hypervisor side, there should be code to decode the
> URI or token, which can be specific to each storage provider.
> Right now I am not focusing on how to initialize or configure the storage
> backend, but rather on how to discover storage pools on an existing,
> properly configured backend and then create volumes on them. If a storage
> provider is interested in integrating its storage into CloudStack
> seamlessly, we can definitely work together to make it work.
>
>> That last sentence makes sense to me. OpenStack, for example, allows you
>> to create iSCSI LUNs on various appliances to use for VMs. It's at the
>> point where we're actually configuring and managing the appliance's disks
>> and filesystems that it seems redundant to the vendor tools, difficult to
>> do, and rarely used/useful. Even if it were available, I wonder who would
>> write such a plugin?
>>
>> On Wed, Oct 31, 2012 at 12:39 PM, Edison Su <[email protected]> wrote:
>> >
>> >> -----Original Message-----
>> >> From: Marcus Sorensen [mailto:[email protected]]
>> >> Sent: Wednesday, October 31, 2012 11:13 AM
>> >> To: [email protected]
>> >> Subject: Re: Requirements of Storage Orchestration
>> >>
>> >> It seems to me that for the most part these things are a function of
>> >> managing the storage backend, along with creating the disk pools,
>> >> formatting the filesystems, and the like, that are used as primary
>> >> storage by CloudStack.
>> >>
>> >> Should there be plugins to manage storage backends? Does any competing
>> >> project in the segment do this? It seems extremely complex to add
>> >> functionality to expose disks and arrays from a SAN or NAS, allow the
>> >> admin to configure them into pools, choose filesystems, manage
>> >> NFS/iSCSI/RBD exports, and configure filesystem features, all through
>> >> CloudStack. The root admin would be the only one with access, and they
>> >> would likely find it just as easy to do it with the tools the storage
>> >> vendor provides.
>> >
>> > It should be easy to add storage pool management functionality to the
>> > new storage framework. The primary storage layer looks like this:
>> > Volume service -> primary data store provider -> primary data store -> volume
>> > The lifecycle of a primary data store is:
>> > Create -> attach -> detach -> delete
>> > Whenever the admin wants to create a storage pool, they call an API on
>> > the data store provider; the provider talks to its storage backend and
>> > creates the storage (an iSCSI target, an NFS mount point, etc.).
>> > The admin can then attach the storage to a zone, pod, cluster, or host,
>> > so that CloudStack can use it to create volumes.
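A hedged sketch of the standard CreateVolume call and the opaque URI/token handoff to hypervisor-side decode code described above. The class and method names here are assumptions for illustration only:

```java
// Illustrative sketch: a standard createVolume call whose return value is
// an opaque URI/token that only the matching hypervisor-side code decodes.
// None of these names are CloudStack's real classes.
import java.util.UUID;

interface PrimaryDataStoreSketch {
    // Standard parameters: size and format (vhd/raw/qcow2/...); the
    // returned string is opaque to the orchestrator.
    String createVolume(long sizeInBytes, String format);
}

class ExampleIscsiStore implements PrimaryDataStoreSketch {
    @Override
    public String createVolume(long sizeInBytes, String format) {
        // A real driver would call the array's management API here.
        return "iscsi://array.example.com/" + UUID.randomUUID();
    }
}

class HypervisorSideSketch {
    // Each provider ships matching decode logic for its own URI scheme.
    static void attachForVmLaunch(String volumeUriOrToken) {
        if (volumeUriOrToken.startsWith("iscsi://")) {
            // log in to the target and hand the block device to the VM
        } else if (volumeUriOrToken.startsWith("nfs://")) {
            // mount the share and point the VM at the disk file
        } else {
            // plain UUID: look it up through the provider's own API
        }
    }
}
```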
>> >>
>> >> To me, the only way it makes sense to roll those things in is if
>> >> there's some way to do it at the VM image level. I believe qcow2
>> >> supports encryption, and we can probably do encrypted LVM volumes as
>> >> well. I'd actually like to look into this. We also need to realize
>> >> that encrypting the disk doesn't do much good if someone gets access
>> >> to the VM host or to CloudStack, since they could likely see the
>> >> encryption key as well. But it does help in cases where someone tries
>> >> to download a copy of the disk image, someone takes the physical disk
>> >> array, or something like that.
>> >>
>> >> Dedup will likely always be a function of the filesystem or storage
>> >> array, and I don't see a way for CloudStack to work at that level.
>> >>
>> >> On Wed, Oct 31, 2012 at 11:40 AM, Nguyen Anh Tu <[email protected]> wrote:
>> >> > Love to hear that! Some days ago I posted a mail asking the
>> >> > community about encrypting VM data in CloudStack, but it seemed that
>> >> > not many people took notice. I'm writing an encryption service based
>> >> > on TrueCrypt, running in the background inside the VM; it is
>> >> > separate from CloudStack. Great to hear about the API idea. I think
>> >> > it's a good choice. Some questions about the API scenario: how do we
>> >> > generate the passphrase/key, and how do we keep it?
>> >> >
>> >> > 2012/10/31 Edison Su <[email protected]>
>> >> >
>> >> >> > -----Original Message-----
>> >> >> > From: Umasankar Mukkara [mailto:[email protected]]
>> >> >> > Sent: Tuesday, October 30, 2012 9:20 AM
>> >> >> > To: [email protected]
>> >> >> > Subject: Requirements of Storage Orchestration
>> >> >> >
>> >> >> > Today I had the opportunity to listen to Kevin Kluge at the
>> >> >> > inauguration event of the Bangalore CloudStack Group. Some
>> >> >> > thoughts around new storage requirements popped up after this
>> >> >> > event. I thought I would post to the dev group and check what is
>> >> >> > already in progress. Kevin said Edison Su is already in the
>> >> >> > process of designing and implementing/refactoring some portions
>> >> >> > of the storage orchestrator.
>> >> >> >
>> >> >> > I could think of the following extensions to the current
>> >> >> > CloudStack:
>> >> >> >
>> >> >> > 1. Ability to offload the data protection capabilities to the
>> >> >> > storage array (like dedup/snapshot/backup/encrypt/compress).
>> >> >> > 2. Ability to provide an API at the storage orchestrator so that
>> >> >> > the storage array can write to this API.
>> >> >>
>> >> >> Only snapshot/backup are taken into consideration. Any details
>> >> >> about the scenarios for encrypt/compress/dedup? For example, how
>> >> >> would these functionalities be used, and what should the API look
>> >> >> like? We can expose more capabilities at the API and storage driver
>> >> >> layer (see the sketch at the end of this thread).
>> >> >>
>> >> >> > 3. Extend the current storage offerings to include some of the
>> >> >> > storage array capabilities, such as IOPS guarantee (or throttle)
>> >> >> > and throughput guarantee (or throttle).
>> >> >> >
>> >> >> > Where can I learn about the current development threads around
>> >> >> > these in CloudStack? Edison Su (or someone who is working on
>> >> >> > this), can you please provide some pointers so that I can get
>> >> >> > myself up to speed? I would like to actively participate and hack
>> >> >> > on some parts of it :)
>> >> >>
>> >> >> Oh, great! There is so much code I want to change; I really need
>> >> >> help and feedback from other people. I'll send out the status of my
>> >> >> current work and what I am trying to do in another thread.
>> >> >>
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> > Regards,
>> >> >> > Uma.
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > N.g.U.y.e.N.A.n.H.t.U
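Pulling together Uma's points 1 and 3 and Edison's replies above, here is one hedged sketch of how array-offloaded capabilities and QoS guarantees might surface at the storage driver layer. Everything named below is a hypothetical illustration, not the project's actual API:

```java
// Hedged sketch: dedup/snapshot/backup/encrypt/compress offload (point 1)
// and IOPS/throughput guarantees or throttles (point 3) exposed as
// driver-level capabilities. All names are hypothetical.
import java.util.EnumSet;

enum ArrayCapability {
    SNAPSHOT, BACKUP, DEDUP, COMPRESS, ENCRYPT
}

interface StorageDriverSketch {
    // The orchestrator asks what the array can offload instead of
    // implementing these features itself.
    EnumSet<ArrayCapability> getCapabilities();

    // Point 3: QoS knobs that a disk/storage offering could carry; zero
    // could mean "no guarantee or throttle".
    void setVolumeQos(String volumeToken, long minIops, long maxIops,
                      long maxBytesPerSecond);

    // Snapshot/backup are the two operations Edison says are currently
    // considered; the returned strings are again opaque tokens/URIs.
    String takeSnapshot(String volumeToken);
    String backupSnapshot(String snapshotToken, String destinationUri);
}
```

Capabilities a driver does not advertise would fall back to hypervisor-side implementations or simply be unavailable for that pool.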
