Hi Karen,

Thank you for the review. My comments inline...
> Hi Sarah,
>
> Here are my comments on the document.
>
> General:
> ---------------
>
> - Question: throughout the document, you choose "class" to refer to the
>   different pieces of the New Solaris Installation suite. Why is "class"
>   chosen? I generally see "class" being used in the object-oriented
>   programming context. Is it implied that our implementation of the suite
>   must be done in a language that supports OO? Would it be more general
>   to refer to the different pieces as "module" or "component"?

I use 'class' specifically to imply the OO'ness of the architecture. And yes, the language(s) we choose should support OO. Or, if we do decide some of this is better done in C (not likely, but just in case), we build the C data structures and functions in an OO manner.

> - In various parts of the document, there's mention of certain Python
>   features being used, such as using the Python logging module for the
>   engine. Would it make sense to spell out Python as the language that
>   will be used?

Actually, we are not necessarily going to be using Python, and the implementation details don't belong in the architecture document anyway. I think the only reference to Python is the logging; I will remove this reference.

> Specific comments:
> -------------------------------
>
> Section 1:
> - The list of "architecturally significant points" did not include any
>   requirement for being able to create customization images (DC), or the
>   ability to easily upgrade (snap). Is that intentional?

No, it is simply a shortened list. I think these are included in other requirements listed in the document. If not, I can certainly add them here.

> Section 2:
> - This section mentioned that the engine offers functionality to pause or
>   stop the installation at each checkpoint, and to resume at those
>   checkpoints. Currently, DC provides similar functionality if the
>   "installation target" is on a ZFS dataset.
>   In order to provide these functionalities, do we also require the
>   installation target to be a ZFS dataset?

No, we won't require ZFS datasets. I think the difference between DC and the other Caiman installer apps is that it isn't our intent to kill the installer app when a checkpoint is completed; we don't have to. So, in effect, the application is in a paused state, and we have to 'resume', either at the next checkpoint or at a previous one. The data we need to retain will be retained in memory.

> Section 2.1, 2nd paragraph:
> - Question: Can checkpoints interact with the user? If so, does the
>   engine need to coordinate that?

I am not sure what you mean by "interact with users." Certainly, checkpoints will be used in interactive installations, and the installation app will have to reap the data from the data collection once the checkpoint is completed. In this sense, yes, they can interact, and yes, the engine is responsible for managing this interaction.

In the case of the LiveCD GUI, for example, the GUI registers the appropriate checkpoints: target discovery, target instantiation, transfer, and instance configuration. The application requests the engine to run the first checkpoint, target discovery, and this data is put in the data collection cache. The GUI application then has to read the data from the cache and allow users to make choices. After the user has completed their choices and chooses to start the installation, the application requests the engine to execute the remaining checkpoints that were registered.

> If not, please add something in this paragraph to clarify.
>
> - This description mentions that the execution engine automatically
>   computes and reports progress as checkpoints are executed. I wonder how
>   realistic it is to report useful overall progress? I suppose each
>   checkpoint can specify a "weight" for itself, indicating whether it is
>   a long checkpoint or a short one. Then, the engine can average it out
>   somehow, and assign a weight.
>   But all these will be guesses; not sure how accurate the overall
>   progress reporting can be.

The weighting and time of execution have to be supplied by the checkpoint. Certainly, the engine will do some computing of the progress totals and percentages, but it is based on input from the checkpoint. We do something similar today with the GUI, except that the overall percentage (weighting) is actually hardcoded in liborchestrator; it isn't totally accurate either. We haven't designed this yet, but I don't think that having the engine manage this calculation with input from the checkpoints makes the progress reporting less accurate.

> Section 2.1, the list of methods ExecutionEngine needs to provide:
> - I think it is missing list_checkpoints() or get_checkpoints() so
>   applications would know what value to use when they want to
>   pause/stop/resume.

They are doing the registration of the checkpoints up front, at application startup time, so I would think they would know the checkpoints that are available. That said, I am wondering if there is a need for this I am not seeing. Maybe you can describe a use case that might be interesting for this?

> Section 2.1, the list of methods a checkpoint must implement is missing
> the following:
> - stop(): this is needed to support cancel() in the engine
> - percent_completed(): this is useful for progress reporting for
>   long-running checkpoints.

There is a get_progress_estimate() for the checkpoint, which should give us the percentage (or whatever unit of measure we are using) toward completion. stop() should be added; I will do this.

> Section 2.2, second bullet in the beginning:
> - Why just translate installation data into a valid AI manifest? It would
>   be more general to translate it into any manifest. Other installers
>   might be able to accept manifests as input in the future.

Well, I guess I should remove the AI reference, since you are about the 100th person to mention this :-).
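To make the checkpoint discussion above concrete, here is a rough sketch of what the checkpoint interface could look like. execute() and get_progress_estimate() are named in the document; stop() is the addition agreed to in this reply. Python and all signatures are purely illustrative (we have not committed to a language), and the TargetDiscovery body is hypothetical:

```python
from abc import ABC, abstractmethod


class Checkpoint(ABC):
    """Sketch of the checkpoint interface; names per the discussion above,
    everything else illustrative."""

    @abstractmethod
    def execute(self):
        """Do the checkpoint's work."""

    @abstractmethod
    def get_progress_estimate(self):
        """Return this checkpoint's relative weight/duration estimate.
        Supplied by the checkpoint itself, so the engine can compute
        overall progress from the registered checkpoints' estimates."""

    @abstractmethod
    def stop(self):
        """Halt in-flight work, so the engine can support cancel()."""


class TargetDiscovery(Checkpoint):
    """Hypothetical concrete checkpoint, for illustration only."""

    def __init__(self):
        self.stopped = False

    def execute(self):
        # A real implementation would put its results into the data
        # collection cache; a placeholder list stands in here.
        return ["c0t0d0", "c0t1d0"]

    def get_progress_estimate(self):
        return 5  # e.g. 5 "work units" relative to other checkpoints

    def stop(self):
        self.stopped = True
```

The point is only that the weighting originates in the checkpoint and the engine aggregates it, per the reply above.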
In general, the plan is to have the 'manifests' that our products use be the same wherever possible. So, in theory, a manifest from this could translate to another use.

> Section 2.2, the methods provided by the DataCollection class:
> - clear_cache() and create_cache(): there's no description. I couldn't
>   figure out how they will be used based on the names.

I will add descriptions.

> Section 3.1, 2nd paragraph:
> - Do you mean "semantic validation" of user-provided manifests?

No, I mean syntactic validation here. At this point in time, we have no plans to do semantic validation, and if we did, the ManifestParser isn't likely the place it would happen.

> Section 3.2, the list of objects Target Discovery needs to discover:
> - "Installed instances of other operating systems....": what about
>   existing instances of OpenSolaris?

Those are covered by the discovery of Boot Environments.

> Section 3.2.1:
> - This whole section talks about the different classes that target
>   discovery needs to support and the interfaces/methods these classes
>   are required to implement. Target discovery is supposed to be a "read"
>   operation, but many of these classes have methods for "write"
>   operations, such as set_xxxxx(), snapshot(), clone(), or other
>   functions for implementing a functionality, such as the
>   send()/receive() methods in the Dataset class. I think it is very
>   confusing to have the description for all these in the section for
>   target discovery. Why not have a completely new section that just
>   talks about all the classes and all the methods they require, and
>   perhaps talk about what functionality these methods can provide?

I can see your point. I think maybe the better answer is to have one section that addresses both target discovery and target instantiation. The classes are the same for both; the methods, of course, may not be. I will work on getting this clearer.
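While I am adding those DataCollection descriptions, here is the sort of behavior I have in mind for create_cache() and clear_cache(). Only those two method names come from the document; the store/retrieve names, the dict-backed cache, and Python itself are illustrative guesses:

```python
class DataCollection:
    """Sketch of the DataCollection cache. create_cache() and
    clear_cache() are named in the document; everything else here is
    an illustrative guess at how they could behave."""

    def __init__(self):
        self._cache = None

    def create_cache(self):
        # Guess: set up the in-memory store that checkpoints (e.g.
        # target discovery) write their results into.
        self._cache = {}

    def clear_cache(self):
        # Guess: discard all cached checkpoint output, e.g. before
        # re-running discovery.
        if self._cache is not None:
            self._cache.clear()

    def store(self, key, value):
        # Hypothetical accessor: checkpoints deposit results here.
        if self._cache is None:
            raise RuntimeError("create_cache() has not been called")
        self._cache[key] = value

    def retrieve(self, key):
        # Hypothetical accessor: the application (e.g. the LiveCD GUI)
        # reads checkpoint results from here to drive user choices.
        if self._cache is None:
            raise RuntimeError("create_cache() has not been called")
        return self._cache[key]
```

This mirrors the LiveCD GUI flow described earlier: target discovery stores into the cache, and the GUI reads from it before the remaining checkpoints run.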
> Section 3.2.1.1, description of the required methods for the Partition
> class and Slice class:
> - I think you also need get_assoc_disk() and set_assoc_disk() in those 2
>   classes so they can associate themselves back to the disk that
>   contains these partitions/slices.

I agree that I need a method that allows us to build the association back to the parent. Maybe get_assoc_parent()? I have something similar in the LogicalTarget class, get_assoc_container(), but I should likely just use get_assoc_parent(). What do you think?

> Section 4.1, New Solaris Install UML diagram:
> - Question: do ProgressReceiver, DataCollection and Logger all need to
>   talk to the checkpoints, because it is the checkpoints that will
>   generate the "data" for those classes? Or is it true that all the
>   "data" from the checkpoint classes will be reported to the engine, and
>   the engine will dispatch it to the 3 classes as appropriate?

The DataCollection object is open for communication from anyone. To do this, a method, get_data_collection(), is called on the ExecutionEngine to get the DataCollection object handle.

> - InstallApp needs an arrow to the DataCollection class also, because a
>   previous section in the doc says DataCollection is instantiated by the
>   app, and the app can access/modify the data outside of the engine.

It is instantiated by the InstallApp, through the ExecutionEngine; it uses the method set_data_collection() to do this. I see what you mean by the text indicating it is instantiated by the InstallerApp, and it is, just as the ProgressReceiver and Logger are instantiated by the InstallerApp, via the ExecutionEngine. I think the diagram reflects this correctly.

> - Question: will the app be able to use the logger too? To log
>   messages/errors? If so, it needs an arrow.

It can use the logger. It actually gets the logger and writes to it via the ExecutionEngine, though, so it isn't direct.
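A small sketch of the "everything goes through the engine" idea discussed here. get_data_collection() and set_data_collection() are the methods named above; the log() method and the list-backed logger are invented stand-ins (the real design would presumably wrap an actual logging facility), and Python is illustrative only:

```python
class ExecutionEngine:
    """Sketch of the engine as the single access point: the app reaches
    the DataCollection and the logger only through the engine."""

    def __init__(self):
        self._data_collection = None
        self._log = []  # stand-in for the real logger object

    def set_data_collection(self, dc):
        # Called on behalf of the InstallApp at startup; this is how
        # the DataCollection ends up "instantiated via the engine."
        self._data_collection = dc

    def get_data_collection(self):
        # Anyone needing the cache asks the engine for the handle.
        return self._data_collection

    def log(self, message):
        # Hypothetical: apps and checkpoints log through the engine's
        # interface rather than talking to the logger directly.
        self._log.append(message)
```

So the arrows in the UML diagram stay engine-centric: the app holds an ExecutionEngine reference and obtains the DataCollection and logging handles from it.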
Although, I can see that in diagram 4.2 I do show the checkpoints directly logging messages to the logger. That seems broken, perhaps. The app should be able to log messages as well, but only through the ExecutionEngine interfaces to the logger. I will update the diagram to reflect this.

> - Should the arrow from InstallApp to ExecutionEngine be bi-directional,
>   since InstallApp will be getting progress/errors from the engine?

The ProgressReceiver gets the progress reports from the ExecutionEngine. The ProgressReceiver doesn't know about the ExecutionEngine; it simply gets updates from it, so the association is uni-directional. The ProgressReceiver is generally the InstallerApp, but is managed via a separate class. The ProgressReceiver will also get the errors from the ExecutionEngine via the update() method. So, to answer your question, I don't think this should be bi-directional to the InstallerApp.

> - Under "FileSystem", only Dataset is listed; what about UFS file
>   systems?

UFS and other filesystems will be supported for discovery, but not creation. Dataset is called out specifically as a subclass since it has unique methods and we do create Datasets as part of our installations.

> Section 4 and Section 5, general comment:
> - The engine is supposed to provide the ability to pause/stop/resume. It
>   would be useful to have a UML diagram or use case to illustrate how
>   that can be done.

I am not sure how I would show that in a UML diagram, but I will think on this and see what I can do.

> Section 5.1, item 4, instance configuration:
> - You mentioned that this checkpoint will reverse media-specific
>   customizations and apply system configuration options selected by the
>   user. I think that system configuration options, and transferring of
>   the log file, are common things that all installers need to do, and
>   most of the code can probably be shared among installers. So, it might
>   be useful for it to be its own checkpoint class.
>   Reversing of media-specific customization should be in a separate
>   checkpoint class because it is very specific to how the media is
>   constructed.

Configuration options, and how they are applied, are pretty application-specific. For example, the way configuration is specified in the GUI is different than for AI; applying them may be different as well. The transfer of logs, I can see, is common, but having this as its own separate application class seems unnecessary. However, I can see where some of InstanceConfiguration might be broken down into multiple parts. I think we need to keep this in mind when working through the design of the CommonApplication classes and the ApplicationClasses that are specific to installers.

> Section 5.3, first paragraph, 2nd sentence:
> - "..... a manifest is used to drive the installation process....": I
>   think "construction process" sounds better than "installation
>   process".

Sounds better to me as well. I will change it.

> Section 5.3, step 2:
> - ".... create the ZFS datasets required for the construction process."
>   Currently, both UFS and ZFS are supported by DC. Does this imply that
>   we will no longer support UFS?

I am not sure that's what this was intended to imply; I think it was just an example of a possible use case. But that raises a good question: what if we no longer supported UFS? What impact would that have?

Thanks again for the review!

sarah
