Hi Dave,
Thank you for reviewing the document. Please see my responses inline.
On 06/09/10 09:09, Dave Miner wrote:
Karen,
A few comments that are mostly pretty high-level since others have
commented on the more detailed issues I spotted already.
I think there's an over-emphasis on singletons in this design. We've
discussed DOC already and I think resolved that it won't be, but I
would prefer not to see singletons other than the InstallEngine. It
is a natural singleton, whereas the others seem to be a case of this
design dictating behavior where it isn't actually necessary. For
example, if some checkpoint wants/needs to do some alternate logging
for some reason, or if the application has a need for multiple logs to
satisfy some other integration requirement, there's no reason not to
allow use of a different logging instance.
As discussed elsewhere, I will remove any mention of the DOC being a
singleton, and provide a public interface in the engine to access the DOC.
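One possible shape for that accessor (a sketch only; the property name and
stub class are assumptions, not the final interface):

    class DataObjectCache(object):
        """Stand-in for the real DOC class."""
        pass

    class InstallEngine(object):
        def __init__(self):
            # The DOC is owned by the engine rather than being a global singleton.
            self._doc = DataObjectCache()

        @property
        def data_object_cache(self):
            """Public access point for the engine's DOC instance."""
            return self._doc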
As for logging, the engine calls logging.setLoggerClass(InstallLogger)
to force all logger instances to use InstallLogger. If a component wants
to use a different logger class, all logger instances will be affected.
That's a restriction of Python logging.

I think I am perhaps not using the right terminology in the spec. The
engine will set the InstallLogger class, then initialize an instance of
the Python logger. All other install components should reference this
instance that's initialized by the engine. If they want to initialize
another instance, they can do that. That instance will still be using
the InstallLogger class, but it won't inherit any of the properties that
are set for the instance initialized by the engine.

So, I think I will remove any mention of the logger being a singleton,
and replace it with something similar to what I said above.
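In code, this is roughly what the engine would do at startup (a sketch
only; the class and logger names are illustrative):

    import logging

    class InstallLogger(logging.Logger):
        """Logger subclass that all install loggers will use."""
        pass

    # Every logger created after this call will be an InstallLogger.
    logging.setLoggerClass(InstallLogger)

    # The instance initialized by the engine; other components that ask
    # for this name get the same object back.
    engine_logger = logging.getLogger("InstallationLogger")
    engine_logger.setLevel(logging.DEBUG)

    # A component that creates a differently named logger still gets an
    # InstallLogger, but none of the levels/handlers set on the engine's
    # instance.
    other_logger = logging.getLogger("SomeOtherLog")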
A further comment on the logging: why not allow the application to
specify an alternate logger instance vs. the one that you would
instantiate automatically? That seems more flexible than merely
allowing a log level. Beyond that, I don't quite understand the
reason that each checkpoint gets a "sub-logger"; what benefit does
this provide?
Python has this hierarchical functionality for logging. Each checkpoint
will have its own logger instance, but the checkpoint's logger instance
will inherit all the properties of the "parent" logger instance.
For example, if the logger instance created by the engine is called
"InstallationLogger", and there's a checkpoint
called "TargetDiscovery", the logger instance with the name
"InstallationLogger.TargetDiscovery" will
be a separate logger instance, but it will inherit all the properties of
the "InstallationLogger" instance.
Each checkpoint needs a separate logger instance so we can support
setting the logging level separately for each checkpoint.
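To make the hierarchy concrete, here's a small sketch (logger names as
in the example above):

    import logging

    logging.basicConfig()
    parent = logging.getLogger("InstallationLogger")
    parent.setLevel(logging.INFO)

    # The dot in the name makes "InstallationLogger" the parent of this
    # logger, so records logged here propagate to the parent's handlers.
    td_logger = logging.getLogger("InstallationLogger.TargetDiscovery")
    td_logger.info("discovering targets")

    # The checkpoint's level can still be overridden independently.
    td_logger.setLevel(logging.DEBUG)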
Most, if not all, applications that use the engine will be privileged
apps and as such should not be using /tmp for storage of any data.
/var/run, please, perhaps falling back to /tmp if you find that you
don't have write access there; and use TMPDIR environment variable if
you need to provide flexibility to developers or support personnel.
Sounds good. Will adopt this in the spec, and change all the
appropriate places.
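Something along these lines, perhaps (just a sketch; the helper name is
made up):

    import os

    def _select_scratch_dir():
        """Honor TMPDIR if set, prefer /var/run, and fall back to /tmp."""
        for path in (os.environ.get("TMPDIR"), "/var/run", "/tmp"):
            if path and os.path.isdir(path) and os.access(path, os.W_OK):
                return path
        raise RuntimeError("no writable scratch directory found")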
In section 7.4, there's reference to an "installation target", which
seems to make the engine dependent on a specific checkpoint (Target
Instantiation) and elements of its schema, but you don't really say
that here or list it as an imported interface. This seems to be an
exception (at least in part) to the principle of the engine treating
all checkpoints equally. Wouldn't it make more sense to define a
method on InstallEngine that the application or a checkpoint could
call to set the ZFS dataset that the engine should be using for its
storage requirements?
Defining a method in the InstallEngine object sounds like a great idea.
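Something like this, perhaps (method and attribute names are illustrative
only, not the final interface):

    class InstallEngine(object):
        def __init__(self):
            self._dataset = None

        def set_dataset(self, dataset_name):
            """Record the ZFS dataset the engine should use for its storage."""
            self._dataset = dataset_name

    # The application or a checkpoint would then call, for example:
    # engine.set_dataset("rpool/install")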
I'm disappointed that the methodology for determining checkpoint
progress weighting is still TBD. I'd thought this was one of the
things we were trying to sort out in prototyping, but perhaps I
assumed too much. When can we expect this to be specified?
The prototype confirmed that we can have each checkpoint report its own
weight, and that the engine can use these weights to normalize the
progress reported by the checkpoints. We also showed in the prototype
that we can use the logger for progress reporting.

I don't have plans to work on the methodology in detail in the short
term. It would involve specifying exactly which machine, with what
configuration, should be used as the standard, and also providing the
mapping between a performance number generated on that machine and the
weight. In order to do this accurately, I think it would involve more
research and experimentation to determine what would work for most
cases. If we have the code in the engine to accept and interpret weights
provided by checkpoints, then when we eventually have the methodology in
place, we can just change the value returned by the
get_performance_estimate() function in the checkpoints, which should
have a very minimal impact. In the meantime, we can have the checkpoints
return the "guess" weights like we do now.
In section 11, we seem to be eliminating the ability that DC currently
has to continue in spite of errors in a particular checkpoint. Has
this been discussed with the DC team?
I forgot about the ability to continue in spite of errors. That
functionality should be included in the engine. I will provide an
interface in the InstallEngine class for the application to indicate
that it wants to continue despite errors. By default, we will not
continue if there's an error.
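One possible shape for that interface (attribute and method names are
only illustrative):

    class InstallEngine(object):
        def __init__(self, stop_on_error=True):
            # By default, execution stops at the first checkpoint failure.
            self.stop_on_error = stop_on_error

        def execute_checkpoints(self, checkpoints):
            failures = []
            for cp in checkpoints:
                try:
                    cp.execute()
                except Exception as err:
                    failures.append((cp, err))
                    if self.stop_on_error:
                        break
            return failures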
Finally, a moderately out-there question that is admittedly not part
of the existing requirements list: what if we wanted to make
checkpoints executable in parallel at some point in the future? Would
we look at a tree model, rather than linear list, for checkpoint
specification, or something else? Is there anything else in this
design that would hinder that possibility (or things we could easily
modify now to help allow it more easily)? An existing case where I
might want to do this right now is to generate USB images at the same
time as an ISO in DC, rather than generating a USB from an ISO
exclusively (the current behavior). It's also the case that many of
the existing ICT's could probably be run in parallel since they
generally wouldn't conflict.
In order to support this, we would probably need to change how
checkpoints are registered, probably by adding more arguments to the
register_checkpoint() function to specify which checkpoints can be
executed together.

In the InstallEngine object, we currently store checkpoints as a list,
in the order they are to be executed. Since Python allows one to store a
list within a list, for the case of executing checkpoints in parallel we
can store all the checkpoints that are meant to be run in parallel in a
sub-list inside the "execution list". At execution time, we would go
down the "execution list" and run either a single checkpoint or a group
of checkpoints.
For example, if our execution list has the following:
A, (B, C, D), E, (F, G)
First, checkpoint A will be executed by itself.
Then, checkpoints B, C, and D will execute at the same time.
Then, checkpoint E will execute.
Finally, checkpoints F and G will be executed at the same time.
So, I don't think the current design would hinder that possibility.
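A rough sketch of that idea (illustrative only; real checkpoints
wouldn't be plain strings, and threads are just one possible way to run
a group):

    import threading

    execution_list = ["A", ["B", "C", "D"], "E", ["F", "G"]]

    def run_checkpoint(name):
        print("running checkpoint %s" % name)

    for entry in execution_list:
        if isinstance(entry, list):
            # Run every checkpoint in the sub-list at the same time.
            threads = [threading.Thread(target=run_checkpoint, args=(name,))
                       for name in entry]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
        else:
            run_checkpoint(entry)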
Thank you again for taking the time to review the install design doc.
--Karen