Hi Karen,

Thank you for the review. My comments inline...
> Hi Sarah,
>
> Here are my comments on the document.
>
> General:
> ---------------
>
> - Question: throughout the document, you choose "class" to refer to the
>   different pieces of the New Solaris Installation suite. Why is "class"
>   chosen? I generally see "class" being used in the object-oriented
>   programming context. Is it implied that our implementation of the suite
>   must be done in a language that supports OO? Would it be more general
>   to refer to the different pieces as "module" or "component"?

I use 'class' specifically to imply the OO'ness of the architecture. And yes, the language(s) we choose should support OO. Or, if we do decide some of this is better done in C (not likely, but just in case), we build the C data structures and functions in an OO manner.

> - In various parts of the document, there's mention of certain Python
>   features being used, such as using the Python logging module for the
>   engine. Would it make sense to spell out Python as the language that
>   will be used?

Actually, we are not necessarily going to be using Python, and the implementation details don't belong in the architecture document anyway. I think the only reference to Python is the logging; I will remove this reference.

> Specific comments:
> -------------------------------
>
> Section 1:
> - The list of "architecturally significant points" did not include any
>   requirement for being able to create customization images (DC), or the
>   ability to easily upgrade (snap). Is that intentional?

No, it is simply a shortened list. I think these are included in other requirements listed in the document. If not, I can certainly add them here.

> Section 2:
> - This section mentioned that the engine offers functionality to pause or
>   stop the installation at each checkpoint, and to resume at those
>   checkpoints. Currently, DC provides similar functionality if the
>   "installation target" is on a ZFS dataset.
>   In order to provide these functionalities, do we also require the
>   installation target to be a ZFS dataset?

No, we won't require ZFS datasets. I think the difference between DC and the other Caiman installer apps is that it isn't our intent to kill the installer app when a checkpoint is completed; we don't have to. So, in effect, the application is in a paused state, and we have to 'resume', either at the next checkpoint or at a previous one. The data we need to retain will be retained in memory.

> Section 2.1, 2nd paragraph:
> - Question: Can checkpoints interact with the user? If so, does the
>   engine need to coordinate that?

I am not sure what you mean by "interact with users." Certainly, checkpoints will be used in interactive installations, and the installation app will have to reap the data from the data collection once the checkpoint is completed. In this sense, yes, they can interact, and yes, the engine is responsible for managing this interaction.

In the case of the LiveCD GUI, for example, the GUI registers the appropriate checkpoints: target discovery, target instantiation, transfer, and instance configuration. The application requests the engine to run the first checkpoint, target discovery, and this data is put in the data collection cache. The GUI application then has to read the data from the cache and allow users to make choices. After the user has completed their choices and chooses to start the installation, the application requests the engine to execute the remaining checkpoints that were registered.

> If not, please add something in this paragraph to clarify.
>
> - This description mentions that the execution engine automatically
>   computes and reports progress as checkpoints are executed. I wonder how
>   realistic it is to report useful overall progress? I suppose each
>   checkpoint can specify a "weight" for itself, indicating whether it is
>   a long checkpoint or a short one. Then, the engine can average it out
>   somehow, and assign a weight.
>   But all these will be guesses; not sure how accurate the overall
>   progress reporting can be.

The weighting and time of execution have to be supplied by the checkpoint. Certainly, the engine will do some computing of the progress totals and percentages, but it is based on input from the checkpoint. We do something similar today with the GUI, except that the overall percentage (weighting) is actually hardcoded in liborchestrator; it isn't totally accurate either. We haven't designed this yet, but I don't think that having the engine manage this calculation with input from the checkpoints makes the progress reporting less accurate.

> Section 2.1, the list of methods ExecutionEngine needs to provide:
> - I think it is missing list_checkpoints() or get_checkpoints() so
>   applications would know what value to use when they want to
>   pause/stop/resume.

They are doing the registration of the checkpoints up front, at application startup time, so I would think they would know the checkpoints that are available. That said, I am wondering if there is a need for this I am not seeing. Maybe you can describe a use case that might be interesting for this?

> Section 2.1, the list of methods a checkpoint must implement is missing
> the following:
> - stop(): this is needed to support cancel() in the engine
> - percent_completed(): this is useful for progress reporting for
>   long-running checkpoints.

There is a get_progress_estimate() for the checkpoint, which should give us the percentage (or whatever unit of measure we are using) toward completion. stop() should be added; I will do this.

> Section 2.2, second bullet in the beginning:
> - Why just translate installation data into a valid AI manifest? It would
>   be more general to translate it into any manifest. Other installers
>   might be able to accept manifests as input in the future.

Well, I guess I should remove the AI reference, since you are about the 100th person to mention this :-).
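To make the checkpoint discussion above concrete, here is a rough sketch of what the checkpoint interface could look like. execute() and get_progress_estimate() are named in the document; stop() is the addition agreed to in this reply. Python and all signatures are purely illustrative (we have not committed to a language), and the TargetDiscovery body is hypothetical:

```python
from abc import ABC, abstractmethod


class Checkpoint(ABC):
    """Sketch of the checkpoint interface; names per the discussion above,
    everything else illustrative."""

    @abstractmethod
    def execute(self):
        """Do the checkpoint's work."""

    @abstractmethod
    def get_progress_estimate(self):
        """Return this checkpoint's relative weight/duration estimate.
        Supplied by the checkpoint itself, so the engine can compute
        overall progress from the registered checkpoints' estimates."""

    @abstractmethod
    def stop(self):
        """Halt in-flight work, so the engine can support cancel()."""


class TargetDiscovery(Checkpoint):
    """Hypothetical concrete checkpoint, for illustration only."""

    def __init__(self):
        self.stopped = False

    def execute(self):
        # A real implementation would put its results into the data
        # collection cache; a placeholder list stands in here.
        return ["c0t0d0", "c0t1d0"]

    def get_progress_estimate(self):
        return 5  # e.g. 5 "work units" relative to other checkpoints

    def stop(self):
        self.stopped = True
```

The point is only that the weighting originates in the checkpoint and the engine aggregates it, per the reply above.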
In general, the plan is to have the 'manifests' that our products use be the same wherever possible. So, in theory, a manifest from this could translate to another use.

> Section 2.2, the methods provided by the DataCollection class:
> - clear_cache() and create_cache(): there's no description. I couldn't
>   figure out how they will be used based on the names.

I will add descriptions.

> Section 3.1, 2nd paragraph:
> - Do you mean "semantic validation" of user-provided manifests?

No, I mean syntactic validation here. At this point in time, we have no plans to do semantic validation, and if we did, the ManifestParser isn't likely the place it would happen.

> Section 3.2, the list of objects Target Discovery needs to discover:
> - "Installed instances of other operating systems....": what about
>   existing instances of OpenSolaris?

Those are covered by the discovery of Boot Environments.

> Section 3.2.1:
> - This whole section talks about the different classes that target
>   discovery needs to support and the interfaces/methods these classes
>   are required to implement. Target discovery is supposed to be a "read"
>   operation, but many of these classes have methods for "write"
>   operations, such as set_xxxxx(), snapshot(), clone(), or other
>   functions for implementing a functionality, such as the
>   send()/receive() methods in the Dataset class. I think it is very
>   confusing to have the description for all these in the section for
>   target discovery. Why not have a completely new section that just
>   talks about all the classes and all the methods they require, and
>   perhaps talk about what functionality these methods can provide?

I can see your point. I think maybe the better answer is to have one section that addresses both target discovery and target instantiation. The classes are the same for both; the methods, of course, may not be. I will work on getting this clearer.
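While I am adding those DataCollection descriptions, here is the sort of behavior I have in mind for create_cache() and clear_cache(). Only those two method names come from the document; the store/retrieve names, the dict-backed cache, and Python itself are illustrative guesses:

```python
class DataCollection:
    """Sketch of the DataCollection cache. create_cache() and
    clear_cache() are named in the document; everything else here is
    an illustrative guess at how they could behave."""

    def __init__(self):
        self._cache = None

    def create_cache(self):
        # Guess: set up the in-memory store that checkpoints (e.g.
        # target discovery) write their results into.
        self._cache = {}

    def clear_cache(self):
        # Guess: discard all cached checkpoint output, e.g. before
        # re-running discovery.
        if self._cache is not None:
            self._cache.clear()

    def store(self, key, value):
        # Hypothetical accessor: checkpoints deposit results here.
        if self._cache is None:
            raise RuntimeError("create_cache() has not been called")
        self._cache[key] = value

    def retrieve(self, key):
        # Hypothetical accessor: the application (e.g. the LiveCD GUI)
        # reads checkpoint results from here to drive user choices.
        if self._cache is None:
            raise RuntimeError("create_cache() has not been called")
        return self._cache[key]
```

This mirrors the LiveCD GUI flow described earlier: target discovery stores into the cache, and the GUI reads from it before the remaining checkpoints run.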
> Section 3.2.1.1, description of the required methods for the Partition
> class and Slice class:
> - I think you also need get_assoc_disk() and set_assoc_disk() in those 2
>   classes so they can associate themselves back to the disk that
>   contains these partitions/slices.

I agree that I need a method that allows us to build the association back to the parent. Maybe get_assoc_parent()? I have something similar in the LogicalTarget class, get_assoc_container(), but I should likely just use get_assoc_parent(). What do you think?

> Section 4.1, New Solaris Install UML diagram:
> - Question: do ProgressReceiver, DataCollection and Logger all need to
>   talk to the checkpoints, because it is the checkpoints that will
>   generate the "data" for those classes? Or is it true that all the
>   "data" from the checkpoint classes will be reported to the engine, and
>   the engine will dispatch it to the 3 classes as appropriate?

The DataCollection object is open for communication from anyone. To do this, a method, get_data_collection(), is called on the ExecutionEngine to get the DataCollection object handle.

> - InstallApp needs an arrow to the DataCollection class also, because a
>   previous section in the doc says DataCollection is instantiated by the
>   app, and the app can access/modify the data outside of the engine.

It is instantiated by the InstallApp, through the ExecutionEngine; it uses the method set_data_collection() to do this. I see what you mean by the text indicating it is instantiated by the InstallerApp, and it is, just as the ProgressReceiver and Logger are instantiated by the InstallerApp, via the ExecutionEngine. I think the diagram reflects this correctly.

> - Question: will the app be able to use the logger too? To log
>   messages/errors? If so, it needs an arrow.

It can use the logger. It actually gets the logger and writes to it via the ExecutionEngine, though, so it isn't direct.
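A small sketch of the "everything goes through the engine" idea discussed here. get_data_collection() and set_data_collection() are the methods named above; the log() method and the list-backed logger are invented stand-ins (the real design would presumably wrap an actual logging facility), and Python is illustrative only:

```python
class ExecutionEngine:
    """Sketch of the engine as the single access point: the app reaches
    the DataCollection and the logger only through the engine."""

    def __init__(self):
        self._data_collection = None
        self._log = []  # stand-in for the real logger object

    def set_data_collection(self, dc):
        # Called on behalf of the InstallApp at startup; this is how
        # the DataCollection ends up "instantiated via the engine."
        self._data_collection = dc

    def get_data_collection(self):
        # Anyone needing the cache asks the engine for the handle.
        return self._data_collection

    def log(self, message):
        # Hypothetical: apps and checkpoints log through the engine's
        # interface rather than talking to the logger directly.
        self._log.append(message)
```

So the arrows in the UML diagram stay engine-centric: the app holds an ExecutionEngine reference and obtains the DataCollection and logging handles from it.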
Although, I can see that in diagram 4.2 I do show the checkpoints directly logging messages to the logger. That seems broken, perhaps. The app should be able to log messages as well, but only through the ExecutionEngine interfaces to the logger. I will update the diagram to reflect this.

> - Should the arrow from InstallApp to ExecutionEngine be bi-directional,
>   since InstallApp will be getting progress/errors from the engine?

The ProgressReceiver gets the progress reports from the ExecutionEngine. The ProgressReceiver doesn't know about the ExecutionEngine; it simply gets updates from it, so the association is uni-directional. The ProgressReceiver is generally the InstallerApp, but is managed via a separate class. The ProgressReceiver will also get the errors from the ExecutionEngine via the update() method. So, to answer your question, I don't think this should be bi-directional to the InstallerApp.

> - Under "FileSystem", only Dataset is listed; what about UFS file
>   systems?

UFS and other filesystems will be supported for discovery, but not creation. Dataset is called out specifically as a subclass since it has unique methods and we do create Datasets as part of our installations.

> Section 4 and Section 5, general comment:
> - The engine is supposed to provide the ability to pause/stop/resume. It
>   would be useful to have a UML diagram or use case to illustrate how
>   that can be done.

I am not sure how I would show that in a UML diagram, but I will think on this and see what I can do.

> Section 5.1, item 4, instance configuration:
> - You mentioned that this checkpoint will reverse media-specific
>   customizations and apply system configuration options selected by the
>   user. I think that system configuration options, and transferring of
>   the log file, are common things that all installers need to do, and
>   most of the code can probably be shared among installers. So, it might
>   be useful for it to be its own checkpoint class.
>   Reversing of media-specific customization should be in a separate
>   checkpoint class because it is very specific to how the media is
>   constructed.

Configuration options, and how they are applied, are pretty application-specific. For example, the way configuration is specified in the GUI is different than for AI; applying them may be different as well. The transfer of logs, I can see, is common, but having this as its own separate application class seems unnecessary. However, I can see where some of InstanceConfiguration might be broken down into multiple parts. I think we need to keep this in mind when working through the design of the CommonApplication classes and the ApplicationClasses that are specific to installers.

> Section 5.3, first paragraph, 2nd sentence:
> - "..... a manifest is used to drive the installation process....": I
>   think "construction process" sounds better than "installation
>   process".

Sounds better to me as well. I will change it.

> Section 5.3, step 2:
> - ".... create the ZFS datasets required for the construction process."
>   Currently, both UFS and ZFS are supported by DC. Does this imply that
>   we will no longer support UFS?

I am not sure that's what this was intended to imply; I think it was just an example of a possible use case. But that raises a good question: what if we no longer supported UFS? What impact would that have?

Thanks again for the review!

sarah
