Mike,

Before we can dig into timelines or implementations, I think we need to get consensus on the problem to be solved and the goals. Once we have a proper understanding of the scope, I believe we can chunk the work across a set of development lifecycles. The subject is vast, but it also has a far-reaching impact on both the storage and network layer evolution efforts. As such, I believe we need to start addressing it as part of the next release.
As a separate thread, we need to discuss the timeline for the next release. I think we need to avoid the time compression caused by the overlap of the 4.1 stabilization effort and 4.2 development. Therefore, I don't think we should consider development of the next release started until the first 4.2 RC is released. I will try to open a separate discuss thread for this topic, as well as tie in the discussion of release code names.

Thanks,
-John

On Aug 20, 2013, at 6:22 PM, Mike Tutkowski <[email protected]> wrote:

> Hey John,
>
> I think this is some great stuff. Thanks for the write up.
>
> It looks like you have ideas around what might go into a first release of
> this plug-in framework. Were you thinking we'd have enough time to squeeze
> that first rev into 4.3? I'm just wondering (it's not a huge deal to hit
> that release for this) because we would only have about five weeks.
>
> Thanks
>
>
> On Tue, Aug 20, 2013 at 3:43 PM, John Burwell <[email protected]> wrote:
>
>> All,
>>
>> In capturing my thoughts on storage, my thinking backed into the driver
>> model. While we have the beginnings of such a model today, I see the
>> following deficiencies:
>>
>>
>>    1. *Multiple Models*: The Storage, Hypervisor, and Security layers
>>    each have a slightly different model for allowing system functionality
>>    to be extended/substituted. These differences increase the barrier to
>>    entry for vendors seeking to extend CloudStack and accrete code paths
>>    to be maintained and verified.
>>    2. *Leaky Abstraction*: Plugins are registered through a Spring
>>    configuration file. In addition to being operator-unfriendly (most
>>    sysadmins are not Spring experts nor do they want to be), we expose the
>>    core bootstrapping mechanism to operators. Therefore, a
>>    misconfiguration could negatively impact the injection/configuration of
>>    internal management server components. Essentially, we are handing
>>    them a loaded shotgun pointed at our right foot.
>>    3. *Nondeterministic Load/Unload Model*: Because the core loading
>>    mechanism is Spring, the management server has little control over the
>>    timing and order of component loading/unloading. Changes to the
>>    Management Server's component dependency graph could break a driver by
>>    causing it to be started at an unexpected time.
>>    4. *Lack of Execution Isolation*: As Spring components, plugins are
>>    loaded into the same execution context as core management server
>>    components. Therefore, an errant plugin can corrupt the entire
>>    management server.
>>
>>
>> For the next revision of the plugin/driver mechanism, I would like to see
>> us migrate towards a standard pluggable driver model that supports all of
>> the management server's extension points (e.g. network devices, storage
>> devices, hypervisors, etc.) with the following capabilities:
>>
>>
>>    - *Consolidated Lifecycle and Startup Procedure*: Drivers share a
>>    common state machine and categorization (e.g. network, storage,
>>    hypervisor, etc.) that permits the deterministic calculation of
>>    initialization and destruction order (i.e. network layer drivers ->
>>    storage layer drivers -> hypervisor drivers). Plugin
>>    inter-dependencies would be supported between plugins sharing the same
>>    category.
>>    - *In-process Installation and Upgrade*: Adding or upgrading a driver
>>    does not require the management server to be restarted. This
>>    capability implies a system that supports the simultaneous execution of
>>    multiple driver versions and the ability to suspend execution of work
>>    on a resource while the underlying driver instance is replaced.
>>    - *Execution Isolation*: The deployment packaging and execution
>>    environment support the simultaneous use of different (and potentially
>>    conflicting) versions of dependencies. Additionally, plugins would be
>>    sufficiently sandboxed to protect the management server against driver
>>    instability.
>>    - *Extension Data Model*: Drivers provide a property bag with a
>>    metadata descriptor to validate and render vendor-specific data. The
>>    contents of this property bag will be provided to every driver
>>    operation invocation at runtime. The metadata descriptor would be a
>>    lightweight description that provides a label resource key, a
>>    description resource key, data type (string, date, number, boolean),
>>    required flag, and optional length limit.
>>    - *Introspection*: Administrative APIs/UIs allow operators to
>>    understand the drivers in the system, their configuration, and their
>>    current state.
>>    - *Discoverability*: Optionally, drivers can be discovered via a
>>    project repository definition (similar to Yum) allowing drivers to be
>>    remotely acquired and operators to be notified regarding update
>>    availability. The project would also provide, free of charge,
>>    certificates to sign plugins. This mechanism would support local
>>    mirroring to support air-gapped management networks.
>>
>>
>> Fundamentally, I do not want to turn CloudStack into an erector set with
>> more screws than nuts, which is a risk with highly pluggable
>> architectures. As such, I think we would need to tightly bound the scope
>> of drivers and their behaviors to prevent the loss of system usability
>> and stability. My thinking is that drivers would be packaged into a
>> custom JAR, CAR (CloudStack ARchive), that would be structured as
>> follows:
>>
>>
>>    - META-INF
>>    - MANIFEST.MF
>>    - driver.yaml (driver metadata (e.g. version, name, description,
>>    etc.) serialized in YAML format)
>>    - LICENSE (a text file containing the driver's license)
>>    - lib (driver dependencies)
>>    - classes (driver implementation)
>>    - resources (driver message files and potentially JS resources)
>>
>>
>> The management server would acquire drivers through a simple scan of a
>> URL (e.g. file directory, S3 bucket, etc.). For every CAR object found,
>> the management server would create an execution environment (likely a
>> dedicated ExecutorService and Classloader), and transition the state of
>> the driver to Running (the exact state model would need to be worked
>> out). To be really nice, we could develop a custom Ant task/Maven
>> plugin/Gradle plugin to create CARs. I can also imagine opportunities to
>> add hooks to this model to register instrumentation information with JMX
>> and authorization.
>>
>> To keep the scope of this email confined, we would introduce the general
>> notion of a Resource, and (hand wave hand wave) eventually
>> compartmentalize the execution of work around a resource [1]. This
>> (hand waved) compartmentalization would give us the controls necessary
>> to safely and reliably perform in-place driver upgrades. For an initial
>> release, I would recommend implementing the abstractions, loading
>> mechanism, extension data model, and discovery features. With these
>> capabilities in place, we could attack the in-place upgrade model.
>>
>> If we were to adopt such a pluggable capability, we would have the
>> opportunity to decouple the vendor and CloudStack release schedules. For
>> example, if a vendor were introducing a new product that required a new
>> or updated driver, they would no longer need to wait for a CloudStack
>> release to support it. They would also gain the ability to fix high
>> priority defects in the same manner.
>>
>> I have hand waved a number of issues that would need to be resolved
>> before such an approach could be implemented. However, I think we need
>> to decide, as a community, that it is worth devoting energy and effort to
>> enhancing the plugin/driver model, and agree on the goals of that effort,
>> before diving head first into the deep rabbit hole of
>> design/implementation.
>>
>> Thoughts? (/me ducks)
>> -John
>>
>> [1]: My opinions on the matter from CloudStack Collab 2013 ->
>> http://www.slideshare.net/JohnBurwell1/how-to-run-from-a-zombie-cloud-stack-distributed-process-management
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: [email protected]
> o: 303.746.7302
> Advancing the way the world uses the
> cloud<http://solidfire.com/solution/overview/?video=play>
> *™*
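
To make the "Consolidated Lifecycle and Startup Procedure" goal from the quoted proposal a bit more concrete, here is a minimal sketch of how a deterministic initialization and destruction order could be computed from driver categories. Everything in it is an assumption for illustration only: the DriverStartOrder, Category, and Driver names are hypothetical and do not exist in CloudStack, the tie-break by driver name is invented, and inter-dependencies between drivers in the same category are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these types exist in CloudStack today.
final class DriverStartOrder {

    // Declared in initialization order, following the proposal:
    // network layer -> storage layer -> hypervisor.
    enum Category { NETWORK, STORAGE, HYPERVISOR }

    // Minimal stand-in for the metadata a driver.yaml might carry.
    static final class Driver {
        final String name;
        final Category category;

        Driver(String name, Category category) {
            this.name = name;
            this.category = category;
        }

        Category category() { return category; }
        String name() { return name; }
    }

    // Sorts by category, then by name (an assumed tie-breaker), so the
    // result does not depend on the order in which drivers were discovered.
    static List<Driver> startOrder(List<Driver> discovered) {
        List<Driver> ordered = new ArrayList<>(discovered);
        ordered.sort(Comparator.comparing(Driver::category)
                               .thenComparing(Driver::name));
        return ordered;
    }

    // Destruction order is the exact reverse of initialization order.
    static List<Driver> stopOrder(List<Driver> discovered) {
        List<Driver> ordered = startOrder(discovered);
        Collections.reverse(ordered);
        return ordered;
    }

    public static void main(String[] args) {
        List<Driver> discovered = Arrays.asList(
                new Driver("example-hypervisor", Category.HYPERVISOR),
                new Driver("example-storage", Category.STORAGE),
                new Driver("example-network", Category.NETWORK));
        for (Driver d : startOrder(discovered)) {
            System.out.println(d.category + " " + d.name);
        }
    }
}
```

Using the enum's declaration order as the sort key keeps the layer ordering in one place. A real design would still need the state machine and the intra-category dependency handling that the proposal leaves open.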
