Hi Dave,

Thank you for reviewing the document.  Please see my responses inline.

On 06/09/10 09:09, Dave Miner wrote:
Karen,

A few comments that are mostly pretty high-level since others have commented on the more detailed issues I spotted already.

I think there's an over-emphasis on singletons in this design. We've discussed DOC already and I think resolved that it won't be, but I would prefer not to see singletons other than the InstallEngine. It is a natural singleton, whereas the others seem to be a case of this design dictating behavior where it isn't actually necessary. For example, if some checkpoint wants/needs to do some alternate logging for some reason, or if the application has a need for multiple logs to satisfy some other integration requirement, there's no reason not to allow use of a different logging instance.
As discussed elsewhere, I will remove any mention of the DOC being a singleton, and provide a public interface
in the engine to access the DOC.

As for logging, the engine calls logging.setLoggerClass(InstallLogger) to force all logger instances to use InstallLogger. If a component wants to use a different logger class, all logger instances will be affected. That's a restriction of Python logging.

I think I am perhaps not using the right terminology in the spec. The engine will set the InstallLogger class and then initialize an instance of the Python logger. All other install components should reference this instance initialized by the engine. If they want to initialize another instance, they can do that. That instance
will still use the InstallLogger class, but it
won't inherit any of the properties set on the instance initialized by the engine.

So, I think I will remove any mention of the logger being a singleton, and replace it with something
similar to what I said above.
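
To make that concrete, here is a rough sketch of the behavior I'm describing (the class and logger names are just illustrative):

    import logging

    class InstallLogger(logging.Logger):
        # Stand-in for the install logger class that the engine sets.
        pass

    # The engine forces every logger created after this call to be an
    # InstallLogger.
    logging.setLoggerClass(InstallLogger)

    # Logger instance initialized by the engine; other install components
    # should reference this one.
    engine_log = logging.getLogger("InstallationLogger")
    engine_log.addHandler(logging.StreamHandler())
    engine_log.setLevel(logging.INFO)

    # A separately initialized instance is still an InstallLogger, but it
    # does not pick up the handlers or level configured on the engine's
    # instance.
    other_log = logging.getLogger("SomeOtherLog")
    assert isinstance(other_log, InstallLogger)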

A further comment on the logging: why not allow the application to specify an alternate logger instance vs. the one that you would instantiate automatically? That seems more flexible than merely allowing a log level. Beyond that, I don't quite understand the reason that each checkpoint gets a "sub-logger"; what benefit does this provide?
Python has this hierarchical functionality for logging. Each checkpoint will have its own logger instance, but the checkpoint's logger instance will inherit all the properties of the "parent" logger instance. For example, if the logger instance created by the engine is called "InstallationLogger", and there's a checkpoint called "TargetDiscovery", the logger instance with the name "InstallationLogger.TargetDiscovery" will be a separate logger instance, but it will inherit all the properties of the "InstallationLogger" instance.

Each checkpoint needs a separate logger instance so we can support setting a separate
logging level for each checkpoint.
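
As a small sketch of the hierarchy (using the names from the example above):

    import logging

    parent = logging.getLogger("InstallationLogger")
    parent.addHandler(logging.StreamHandler())
    parent.setLevel(logging.INFO)

    # The checkpoint's logger is a separate instance, but its records
    # propagate to the parent's handlers, so it inherits the parent's
    # configuration by default.
    td_log = logging.getLogger("InstallationLogger.TargetDiscovery")
    td_log.info("goes through the parent's handler")

    # A logging level can be set for this checkpoint alone, without
    # touching the parent or the other checkpoints.
    td_log.setLevel(logging.DEBUG)
    td_log.debug("now emitted at DEBUG for this checkpoint only")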


Most, if not all, applications that use the engine will be privileged apps and as such should not be using /tmp for storage of any data. /var/run, please, perhaps falling back to /tmp if you find that you don't have write access there; and use TMPDIR environment variable if you need to provide flexibility to developers or support personnel.
Sounds good. Will adopt this in the spec, and change all the appropriate places.
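
Something along these lines is what I have in mind (the function name is just a placeholder):

    import os

    def select_work_dir():
        # Prefer /var/run for a privileged process, fall back to /tmp if
        # /var/run is not writable, and let TMPDIR override both.
        tmpdir = os.environ.get("TMPDIR")
        if tmpdir and os.access(tmpdir, os.W_OK):
            return tmpdir
        if os.access("/var/run", os.W_OK):
            return "/var/run"
        return "/tmp"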

In section 7.4, there's reference to an "installation target", which seems to make the engine dependent on a specific checkpoint (Target Instantiation) and elements of its schema, but you don't really say that here or list it as an imported interface. This seems to be an exception (at least in part) to the principle of the engine treating all checkpoints equally. Wouldn't it make more sense to define a method on InstallEngine that the application or a checkpoint could call to set the ZFS dataset that the engine should be using for its storage requirements?
Defining a method in the InstallEngine object sounds like a great idea.
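
Something roughly like this, perhaps (the method name below is a placeholder, not the final interface):

    class InstallEngine(object):

        def set_dataset(self, dataset_name):
            # Record the ZFS dataset the engine should use for its own
            # storage needs; callable by the application or a checkpoint.
            self._dataset = dataset_name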


I'm disappointed that the methodology for determining checkpoint progress weighting is still TBD. I'd thought this was one of the things we were trying to sort out in prototyping, but perhaps I assumed too much. When can we expect this to be specified?

The prototype confirmed that we can have each of the checkpoints report its own weight, and the engine uses these weights to normalize the progress reported by the checkpoints. We also showed in the prototype that we
can use the logger for progress reporting.

I don't have plans to work on the methodology in detail in the short term. It would involve specifying exactly which machine, with what configuration, should be used as the standard, and also providing the mapping from a performance number generated on that machine to the weight. To do this accurately, I think it would take more research and experimentation to determine what works for most cases. If we have code in the engine to accept and interpret the weights provided by checkpoints, then when we
eventually have the methodology in place, we can just change the value
returned by the get_performance_estimate() function in the checkpoints, which should have very minimal impact. In the meantime, we can have the checkpoints return
the "guess" weight like we do now.


In section 11, we seem to be eliminating the ability that DC currently has to continue in spite of errors in a particular checkpoint. Has this been discussed with the DC team?
I forgot about the ability to continue in spite of errors. That functionality should be included in the engine. I will provide an interface in the InstallEngine class for the application to indicate that it wants to continue despite errors. By default, we will not continue if there's an error.
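
One possible shape for that interface (the method and flag names below are placeholders):

    class InstallEngine(object):

        def __init__(self):
            self._checkpoints = []
            self._failed = []

        def execute_checkpoints(self, stop_on_error=True):
            # Run the registered checkpoints in order.  With
            # stop_on_error=False the engine records the failure and
            # moves on to the next checkpoint.
            for checkpoint in self._checkpoints:
                try:
                    checkpoint.execute()
                except Exception as err:
                    if stop_on_error:
                        raise
                    self._failed.append((checkpoint, err))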


Finally, a moderately out-there question that is admittedly not part of the existing requirements list: what if we wanted to make checkpoints executable in parallel at some point in the future? Would we look at a tree model, rather than linear list, for checkpoint specification, or something else? Is there anything else in this design that would hinder that possibility (or things we could easily modify now to help allow it more easily)? An existing case where I might want to do this right now is to generate USB images at the same time as an ISO in DC, rather than generating a USB from an ISO exclusively (the current behavior). It's also the case that many of the existing ICT's could probably be run in parallel since they generally wouldn't conflict.
In order to support this, we would probably need to change how checkpoints are registered, probably by adding more arguments to the register_checkpoint() function to specify which checkpoints can be executed together.

In the InstallEngine object, we currently store checkpoints as a list, in the order they are to be executed. Since Python allows storing a list within a list, for the case of executing checkpoints in parallel we can store all the checkpoints that are meant to run in parallel in a sub-list inside the "execution list". At execution time, we walk down the "execution list" and run either a single checkpoint
or a group of checkpoints.

For example, if our execution list has the following:

A, (B, C, D), E, (F, G)

First, Checkpoint A will be executed by itself.
Then, Checkpoints B, C, and D will execute at the same time.
Then, Checkpoint E will execute.
Finally, Checkpoints F and G will be executed at the same time.

So, I don't think the current design would hinder that possibility.
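
As a rough illustration of how the engine could walk such a list (the checkpoint objects and their run() method are stand-ins for the real interfaces):

    import threading

    def run_execution_list(execution_list):
        for entry in execution_list:
            if isinstance(entry, (list, tuple)):
                # A sub-list marks checkpoints that may run at the same
                # time.
                threads = [threading.Thread(target=cp.run) for cp in entry]
                for t in threads:
                    t.start()
                for t in threads:
                    t.join()
            else:
                entry.run()

    # e.g. run_execution_list([A, (B, C, D), E, (F, G)])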

Thank you again for taking the time to review the install design doc.

--Karen
