All, I had to duck out of the meeting early to attend to a personal matter. I would like to propose re-convening on Monday, 28 January 2013, @ 2pm EST. Also, I put together a UML class diagram of my thoughts on the notion of storage devices [1]. It is not intended to be complete, but to communicate the basic idea.
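Roughly, the abstractions in the diagram sketch out as the interfaces below. This is illustrative only, not actual CloudStack code: the names and signatures are hypothetical, and the in-memory device exists purely to show the shape of an implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Vendor-supplied, stateless driver: maps a logical URI such as
// /template/2/200 to a physical location and creates streams on demand,
// wherever it happens to execute (mgmt server, SSVM, dom0, ...).
interface StorageDeviceDriver {
    InputStream openRead(URI logicalUri) throws Exception;
    OutputStream openWrite(URI logicalUri) throws Exception;
}

// Logical tie between a driver instance, its configuration, and the
// policy constraining how the device's capabilities may be used.
interface StorageDevice {
    StorageDeviceDriver driver();
    boolean supports(String capability);
}

// Extension for devices whose drivers support device-level snapshots.
interface SnapshotStorageDevice extends StorageDevice {
    URI createSnapshot(URI volumeUri) throws Exception;
}

// Toy in-memory device, just to demonstrate the contract.
class InMemoryDevice implements StorageDevice {
    final Map<URI, ByteArrayOutputStream> blobs = new HashMap<>();
    final Set<String> capabilities;

    InMemoryDevice(Set<String> capabilities) {
        this.capabilities = capabilities;
    }

    public boolean supports(String capability) {
        return capabilities.contains(capability);
    }

    public StorageDeviceDriver driver() {
        return new StorageDeviceDriver() {
            public InputStream openRead(URI uri) {
                return new ByteArrayInputStream(blobs.get(uri).toByteArray());
            }
            public OutputStream openWrite(URI uri) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                blobs.put(uri, out);
                return out;
            }
        };
    }
}
```

The key design point is that nothing above assumes an underlying filesystem: callers see only URIs and streams, and the driver alone knows the physical mapping.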
Thanks,
-John

[1] https://docs.google.com/file/d/0B5bRhyhJCfDgSlBKRDhyRDg0Q2M/edit

On Jan 24, 2013, at 2:51 PM, "ASF IRC Services" <[email protected]> wrote:

> Members present: jburwell, iswc, ke4qqq, edison_cs, topcloud, widodh
>
> ----------------
> Meeting summary:
> ----------------
>
> 1. Preface
>
> IRC log follows:
>
> # 1. Preface #
> 19:03:44 [widodh]: and after those are on the table we can discuss them
> 19:04:14 [jburwell]: item 1 for me is the storage driver model
> 19:04:30 [jburwell]: which edison and I have been discussing for a bit
> 19:04:52 [edison_cs]: yah, let's continue the discussion
> 19:04:54 [widodh]: Yes, I noticed that and I have some questions about that (for later)
> 19:05:09 [widodh]: item 1 for me is the management server having direct access to all APIs
> 19:05:22 [widodh]: do we have anything else?
> 19:05:29 [edison_cs]: I have one
> 19:05:44 [edison_cs]: the current status of what I am doing, and the plan for 4.1
> 19:06:22 [edison_cs]: any new topic?
> 19:06:44 [iswc]: I'm here as a spectator, I have no input so just ignore me
> 19:07:07 [edison_cs]: ok, then let's move on
> 19:07:15 [edison_cs]: who will be the first?
> 19:07:24 [widodh]: You are? Let us know what you are doing
> 19:07:29 [edison_cs]: ok
> 19:07:29 [jburwell]: I think widodh's item 1 and my item 1 are intertwined
> 19:08:07 [edison_cs]: I am writing code based on http://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsystem+2.0
> 19:08:07 [jburwell]: because the driver model we have been discussing is designed to be execution-location agnostic
> 19:08:31 [edison_cs]: finished the orchestration part of the code
> 19:08:44 [edison_cs]: on the javelin branch
> 19:08:54 [jburwell]: do you have the storage subsystem wired into cloudstack?
> 19:08:59 [jburwell]: at any level ..
> 19:09:02 [edison_cs]: basically, it can call the driver to create/copy volume/snapshot/template etc
> 19:09:08 [edison_cs]: not yet
> 19:09:14 [edison_cs]: my plan
> 19:09:37 [edison_cs]: is to plug it into the mgt server at the end of this week
> 19:09:52 [edison_cs]: there will be no change to the current storage code
> 19:10:07 [jburwell]: then why merge it in for 4.1.0?
> 19:10:29 [edison_cs]: so people can start to write drivers for primary storage
> 19:10:52 [topcloud]: edison: is the plan still to treat the current cloudstack code as one big datastore provider?
> 19:10:52 [jburwell]: I would suggest merging in any new storage subsystem post-4.1.0
> 19:10:54 [topcloud]: and merge it in this way?
> 19:10:59 [edison_cs]: old code and new code will co-exist for a while
> 19:11:14 [widodh]: what would the benefit be of doing that in 4.1? Won't people be writing against master anyway?
> 19:11:14 [jburwell]: as a first action in the next release lifecycle
> 19:11:15 [ke4qqq]: but it won't be useful in 4.1 - so why not merge it into master after the 4.1 branch
> 19:11:22 [edison_cs]: topcloud: yes, the current storagemanager will be the default data store provider
> 19:11:44 [jburwell]: ke4qqq +1
> 19:11:59 [edison_cs]: ke4qqq: zone-wide storage will need the new code
> 19:12:02 [topcloud]: edison is saying it will be wired.
> 19:12:14 [topcloud]: by this weekend.
> 19:12:22 [jburwell]: as I understand it, it will be wired into the mgmt server
> 19:12:29 [jburwell]: but not actually doing anything
> 19:12:53 [jburwell]: so why take on such a large commit late in the development cycle?
> 19:13:17 [jburwell]: the difference in merge dates at this point would be early next week
> 19:13:22 [edison_cs]: jburwell: the existing storage code will be the default data storage provider
> 19:13:24 [jburwell]: as opposed to late next week
> 19:13:46 [jburwell]: this feels like a big, big change to take on days before a code freeze
> 19:14:07 [jburwell]: does it add any functionality for 4.1.0?
> 19:14:22 [edison_cs]: to add zone-wide storage
> 19:14:37 [edison_cs]: and possible to add new storage drivers
> 19:14:37 [ke4qqq]: is the zone-wide storage stuff done?
> 19:14:44 [widodh]: Ok, so that's the win for 4.1? zone-wide storage
> 19:14:45 [edison_cs]: after the merge
> 19:14:52 [edison_cs]: it will be doable
> 19:15:07 [jburwell]: it seems unlikely new storage drivers would be added in time for the 4.1.0 code freeze
> 19:15:14 [ke4qqq]: but no new features will hit 4.1 after branch - so new drivers in 4.1 doesn't help
> 19:15:37 [jburwell]: as I have said, I think we need to rework the driver model as well
> 19:15:38 [edison_cs]: widodh: yes, that's the plan, after the merge, I will add zone-wide storage support
> 19:15:44 [jburwell]: to separate logical and physical operations
> 19:16:07 [jburwell]: being a week out from the freeze date, this feels like a lot to take on
> 19:16:31 [topcloud]: why don't we talk about that first.
> 19:16:44 [jburwell]: topcloud which .. the driver model?
> 19:16:44 [topcloud]: the driver model rework.
> 19:17:22 [jburwell]: topcloud so I have two hobby horses I have been flogging wrt storage drivers
> 19:17:39 [topcloud]: flog away
> 19:17:52 [jburwell]: 1) remove all implicit and explicit assumptions regarding a storage device's support of an underlying filesystem
> 19:18:09 [jburwell]: 2) separate logical and physical storage operations to consolidate domain logic and simplify implementation
> 19:18:22 [topcloud]: +1 on 1) is there anything specific that you saw that led you to this?
> 19:18:37 [topcloud]: +1 on 2 as well.
> 19:18:52 [jburwell]: when implementing the S3 capabilities
> 19:19:07 [jburwell]: you run into File as the implicit coin of the realm
> 19:19:14 [topcloud]: i mean these two points are the reason why we're doing the storage refactor.
> 19:19:24 [widodh]: jburwell: Yes, on #1, the problem with RBD is that they are "virtual devices". So never assume a block device is also a kernel block device
> 19:19:29 [jburwell]: basically, storage needs to speak in terms of URIs and Input/OutputStreams
> 19:19:38 [widodh]: Same goes when we implement Sheepdog (which will happen)
> 19:20:09 [jburwell]: basically, CloudStack needs to speak to a storage driver with a logical URI such as "/template/2/200" for template id 200 on account id 2
> 19:20:22 [edison_cs]: In the new storage code, all the objects (volume/template/snapshot) are just URIs
> 19:20:22 [jburwell]: the driver maps that to a physical location
> 19:20:29 [topcloud]: so here's our take on this.
> 19:20:44 [jburwell]: topcloud when you say our who do u mean?
> 19:20:52 [topcloud]: edison and mine.
> 19:21:16 [widodh]: topcloud: I'm bad at IRC nicknames :) What's your real name?
> 19:21:23 [topcloud]: alex huang
> 19:21:37 [topcloud]: i should change my name.
> 19:21:37 [topcloud]: or nickname.
> 19:21:46 [widodh]: Ah, hi Alex!
> 19:21:59 [topcloud]: anyways....the way we looked at this is as follows:
> 19:22:14 [topcloud]: there are three parts to any storage provisioning.
> 19:22:29 [topcloud]: - orchestration: that's cloudstack orchestration code.
> 19:23:00 [topcloud]: - provisioning: that's logic on provisioning different storage pieces (volume, snapshot, etc)
> 19:23:14 [topcloud]: - the actual moving of the bytes
> 19:23:23 [topcloud]: what do you think?
> 19:23:44 [jburwell]: topcloud I think along similar lines but with slightly different terms
> 19:23:47 [jburwell]: so for a quick normalization
> 19:23:53 [topcloud]: like to hear your thoughts on that.
> 19:24:01 [jburwell]: I see two types of operations: logical and physical
> 19:24:22 [jburwell]: logical operations cover orchestration and provisioning and should be common across all devices
> 19:24:29 [topcloud]: physical is the actual byte movement?
> 19:24:39 [jburwell]: physical operations involve the movement of bytes from one point to another
> 19:24:52 [topcloud]: cool...don't think we're that different.
> 19:24:54 [jburwell]: nope
> 19:24:59 [widodh]: Nope, me neither
> 19:25:01 [jburwell]: I think we are on the same page there
> 19:25:14 [jburwell]: from there I see another dimension
> 19:25:14 [edison_cs]: I just finished the "logical operations cover orchestration and provisioning and should be common across all devices" part
> 19:25:14 [jburwell]: when we get to the physical layer
> 19:25:17 [topcloud]: for logical, we divide between orchestration and provisioning so cloudstack and the storage vendors can agree on the right division.
> 19:25:37 [jburwell]: a physical device has capabilities
> 19:25:59 [jburwell]: and at the logical layer, we have policies that constrain how the capabilities of a device can be used in cloudstack
> 19:25:59 [topcloud]: cool...so let's talk about where these three parts fit. i think cloudstack is obvious right....so forget that.
> 19:26:07 [jburwell]: so, for example, S3 is an object store
> 19:26:22 [jburwell]: actually, NFS is a better example of this
> 19:26:22 [edison_cs]: all the operations on snapshot/volume are generalized into 3 or 5 api calls: create/delete/copy etc
> 19:26:23 [jburwell]: NFS is a file system type of store
> 19:26:29 [jburwell]: right ..
> 19:26:31 [jburwell]: it can store anything ..
> 19:26:40 [jburwell]: images, isos, templates, volumes, etc
> 19:26:59 [jburwell]: however, we have designated that nfs://foo-server/primary is only used for volume storage
> 19:27:07 [edison_cs]: the orchestration doesn't care about images/isos or whatever, they will share the same code
> 19:27:07 [topcloud]: so i would like the logical part to rise above that.
> 19:27:15 [edison_cs]: to do create/delete/copy
> 19:27:17 [topcloud]: that it is not tied to NFS or S3....
> 19:27:30 [jburwell]: topcloud agreed
> 19:27:52 [jburwell]: from my view, the snapshot service is passed a target storage device
> 19:27:59 [topcloud]: so in the storage refactor, do you see anything that has that tie in?
> 19:28:07 [jburwell]: for which to create the snapshot
> 19:28:52 [jburwell]: in this case, I see three abstractions
> 19:29:22 [jburwell]: StorageDevice, StorageDeviceDriver, and SnapshotStorageDevice (extends StorageDevice)
> 19:30:24 [topcloud]: ok..i think i see it slightly differently....and it has to do with where we think each piece fits in.
> 19:30:37 [jburwell]: StorageDevice provides the logical tie between a StorageDeviceDriver instance, configuration information, and the policy for its use
> 19:30:44 [edison_cs]: jburwell: I think all the diagrams in https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsystem+2.0 are talking about the upper level (orchestration and logical operations)
> 19:30:53 [edison_cs]: it's not at the physical level
> 19:31:09 [jburwell]: the storagedevicedriver is provided by a "vendor" to actually interact with the physical device
> 19:31:44 [jburwell]: finally, a SnapshotStorageDevice provides some additional operations to perform on device snapshots for those devices that support it
> 19:31:52 [edison_cs]: seems the driver is a little bit confusing in the diagrams
> 19:32:22 [topcloud]: jburwell: i think i understand what you're getting at.
> 19:32:44 [edison_cs]: the driver in the diagrams is a hook, the vendor can plug his own code into the cloudstack mgt server, which will change how the storage provisioning works
> 19:33:07 [jburwell]: topcloud so, one or more StorageDevice instances are passed into each logical service operation
> 19:33:29 [jburwell]: am I making sense?
> 19:33:37 [topcloud]: I think the difference is that we don't think cloudstack management server does not need to understand how the physical operation is performed.
> 19:33:52 [topcloud]: strike the does not
> 19:34:22 [topcloud]: I think like edison says driver was a poor choice of name.
> 19:34:23 [topcloud]: for his design.
> 19:34:29 [jburwell]: topcloud I completely agree that it does not need to know how
> 19:34:37 [jburwell]: however, it can drive a storage device to do something
> 19:35:07 [edison_cs]: but it's at the logical level, not at the physical level
> 19:35:07 [jburwell]: for example, I ask a storage device for an InputStream from a URI
> 19:35:29 [jburwell]: and then copy from that InputStream into an OutputStream provided by another storage device for a second URI
> 19:35:44 [jburwell]: CloudStack has no idea whatsoever how those read and write operations are occurring
> 19:35:55 [jburwell]: that is encapsulated in the InputStream and OutputStream instances
> 19:36:01 [topcloud]: yup...then why should cloudstack even know about streams?
> 19:36:07 [topcloud]: that's my point.
> 19:36:10 [jburwell]: but cloudstack does know that I need template/2/200 to create a VM
> 19:36:14 [topcloud]: it should just know the uri.
> 19:36:29 [jburwell]: because a lot of duplicated logic can be eliminated
> 19:36:37 [jburwell]: and a URI without the context of the stream is useless
> 19:36:44 [topcloud]: the stream is only necessary where the actual physical provisioning is taking place.
> 19:36:59 [widodh]: Wouldn't you otherwise also be pulling that data through your management server?
> 19:37:07 [topcloud]: no no.
> 19:37:22 [jburwell]: widodh topcloud yes, the stream is only necessary where execution occurs
> 19:37:37 [jburwell]: which is why storage device drivers are stateless
> 19:37:44 [topcloud]: cloudstack has always separated the ms from where data flows.
> 19:37:44 [jburwell]: and create streams on demand
> 19:38:01 [jburwell]: so, if a StorageDevice instance is serialized down to the SSVM
> 19:38:07 [jburwell]: and then the SSVM starts using it
> 19:38:14 [topcloud]: so in this case, cloudstack talks in uris and then the provider figures out how to translate a uri to a data stream.
> 19:38:14 [widodh]: get it, inside the SSVM
> 19:38:15 [jburwell]: the stream will not be created until the SSVM asks for it
> 19:38:29 [topcloud]: and it might be done inside the ssvm or inside dom0 of the host.
> 19:38:37 [jburwell]: topcloud exactly
> 19:38:51 [jburwell]: the driver doesn't know or care where it's located
> 19:38:52 [jburwell]: when asked for a stream, it creates it
> 19:39:17 [jburwell]: whether it is on the SSVM, mgmt server, or some other daemon created in a future version of CS
> 19:39:23 [topcloud]: i think we are thinking about the same thing but i still don't understand why the stream needs to be presented to the cs management server.
> 19:40:01 [topcloud]: to me the uri is presented to the management server. and then the management server presents it to a data motion service and that motion service understands how to extract a stream from the uri.
> 19:40:32 [jburwell]: because I believe a lot of logic can be pulled out of the physical layer into the logical layer
> 19:40:37 [jburwell]: to gain consistent algorithms
> 19:40:47 [jburwell]: and greatly simplify device driver implementation
> 19:41:14 [topcloud]: so you want to push that into cloudstack's orchestration?
> 19:41:14 [edison_cs]: jburwell: at the logical level, the volumes/* all have the same base interface: dataobject
> 19:41:39 [edison_cs]: so the mgt server doesn't know any difference between volumes/snapshots/templates during the orchestration
> 19:41:52 [jburwell]: take downloading a template from an http resource as an example
> 19:42:09 [jburwell]: we connect to the http resource and get an input stream from the request
> 19:42:24 [topcloud]: define we?
> 19:42:29 [jburwell]: that template is going to a storage device
> 19:42:45 [jburwell]: we is CS
> 19:42:52 [topcloud]: ok.
> 19:42:54 [jburwell]: the SSVM daemon
> 19:43:01 [jburwell]: downloading a template
> 19:43:02 [topcloud]: ah....
> 19:43:16 [topcloud]: so when we talk about CS for us, we're not considering the ssvm.
> 19:43:22 [jburwell]: then it is simply a matter of grabbing an output stream from the storage device
> 19:43:23 [jburwell]: and copying between the two
> 19:43:38 [jburwell]: this logic really could be executed anywhere
> 19:44:07 [topcloud]: ok...i think i see a difference in understanding here.
> 19:44:14 [jburwell]: but in this case, template download occurs on the SSVM
> 19:44:24 [topcloud]: in terms of the ssvm we agree with what you're saying.
> 19:45:14 [topcloud]: in the refactoring, what we've changed is only the logical part on the management server.
> 19:45:23 [topcloud]: it did not involve the ssvm portion.
> 19:45:40 [topcloud]: we believe the actual provisioning can be done anywhere, including the ssvm but not necessarily always the ssvm.
> 19:45:45 [jburwell]: in the management server, my notion of the storage device bridges logical to physical
> 19:46:07 [jburwell]: in my view, to be clear, I don't see storage devices as caring where they live and execute
> 19:46:09 [jburwell]: they simply do as they are told
> 19:46:14 [topcloud]: agreed.
> 19:46:38 [jburwell]: I hate to do this
> 19:46:40 [topcloud]: how about we talk about this in terms of the interfaces introduced in the refactor.
> 19:46:47 [jburwell]: but an emergency has come up
> 19:46:52 [jburwell]: and I need to get home rickey tick
> 19:46:59 [topcloud]: oops.
> 19:46:59 [topcloud]: hope everything is okay.
> 19:47:07 [jburwell]: so do I
> 19:47:09 [widodh]: Yes, go home
> 19:47:14 [jburwell]: can we adjourn to tomorrow afternoon?
> 19:47:15 [widodh]: more important
> 19:47:16 [topcloud]: let us know when you can do it.
> 19:47:25 [topcloud]: sure...go take care of your emergency first.
> 19:47:44 [jburwell]: we will pick up at topcloud's request to discuss in terms of the interfaces in the refactor
> 19:47:59 [topcloud]: widodh: has what we talked about resolved your concerns?
> 19:48:07 [widodh]: Ok! Thanks jburwell. Good luck!
> 19:48:14 [topcloud]: jburwell: sure...thanks.
> 19:48:14 [widodh]: topcloud: Yes. I have to be honest that I didn't look at the code well enough
> 19:48:15 [jburwell]: widodh thanks
>
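P.S. To make the stream discussion above concrete, the logical-layer copy I have in mind is roughly the following. This is a sketch, not actual code: the interface and class names are hypothetical, and the in-memory driver exists only for demonstration. The point is that the copy algorithm never knows how either device moves its bytes (NFS, S3, RBD, an HTTP download), so it could execute on the mgmt server, the SSVM, or anywhere else.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical driver contract: the only thing the logical layer needs
// is "give me a stream for this logical URI".
interface StorageDeviceDriver {
    InputStream openRead(URI logicalUri) throws IOException;
    OutputStream openWrite(URI logicalUri) throws IOException;
}

final class DataMotion {
    // The one generic copy algorithm the logical layer would own. It works
    // identically for volumes, snapshots, and templates, since all are
    // addressed by URI (e.g. /template/2/200).
    static long copy(StorageDeviceDriver src, URI from,
                     StorageDeviceDriver dst, URI to) throws IOException {
        try (InputStream in = src.openRead(from);
             OutputStream out = dst.openWrite(to)) {
            byte[] buf = new byte[8192];
            long total = 0;
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
                total += n;
            }
            return total;
        }
    }
}

// Toy in-memory driver, purely for demonstration.
class MemDriver implements StorageDeviceDriver {
    final Map<URI, ByteArrayOutputStream> store = new HashMap<>();

    public InputStream openRead(URI uri) {
        return new ByteArrayInputStream(store.get(uri).toByteArray());
    }

    public OutputStream openWrite(URI uri) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        store.put(uri, out);
        return out;
    }
}
```

Because `DataMotion.copy` is expressed purely in terms of streams, the duplicated per-device copy logic collapses into one place, which is exactly the consolidation argued for above.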
