Sebastien Roy wrote:
> This architecture is not extensible. Is it inconceivable that other
> subsystems may be interested in these events in the future (or even
> today)? There is no provision here for multiple callbacks, and the
> callbacks have names that are clustering specific. Was any
> consideration given to this?

Yes, we considered adding a callback registration API so that interested parties could register their own suspend callbacks, but decided against it because, at present, Sun Cluster is the only party that needs in-kernel notifications. The callbacks were specifically requested by Sun Cluster and are only to be used by Sun Cluster. This is the model used by Sun Cluster and ON in other kernel subsystems.

We also need to control which callbacks occur. At present, we don't want other kernel subsystems to register callbacks that could cause a suspend operation to fail. Users expect migrations to succeed when all the conditions outlined in the LDom documentation are met. Today, a suspend (initiated by the HV as part of a domain migration) has no Solaris hooks, and Solaris is not aware of a suspend/resume. This process is being changed so that Solaris will be aware of suspend/resume, but we want to continue that model as much as possible, making an exception only for Sun Cluster here.

Lastly, in LDoms, a suspend operation only occurs to permit a domain migration, and this is initiated by the management software on a separate control domain. The LDom management software is not ready to account for an arbitrary number of pre/post callbacks which could take any length of time. Certain operations are blocked while a migration is in progress, so this affects usability.

In the event that the notification scheme needs to expand in the future, we will address the need. This could be with an extensible callback mechanism in the kernel or an API that includes notifications issued by the domain manager on the control domain.

Thanks,
Haik