On Wed, 4 Nov 2009, Haik Aftandilian wrote:
> Sebastien Roy wrote:
> > This architecture is not extensible. Is it inconceivable that other
> > subsystems may be interested in these events in the future (or even
> > today)? There is no provision here for multiple callbacks, and the
> > callbacks have names that are clustering specific. Was any
> > consideration given to this?
>
> Yes, we considered adding a callback registration API so that interested
> parties could register their own suspend callbacks, but decided against that
> because, at present, Sun Cluster is the only interested party that needs
> in-kernel notifications. The callbacks were specifically requested by Sun
> Cluster and are only to be used by Sun Cluster. This is the model used by Sun
> Cluster and ON in other kernel subsystems.
>
> We also need to control which callbacks occur. At present we don't want other
> kernel subsystems to register callbacks that could cause a suspend operation
> to fail. Users expect migrations to succeed when all the conditions outlined
> in the LDom documentation are met. Today, a suspend (initiated by the HV as
> part of a domain migration) has no Solaris hooks and Solaris is not aware of a
> suspend/resume. This process is being changed so that Solaris will be aware of
> suspend/resume, but we want to continue that model as much as possible, only
> making an exception for Sun Cluster here.

Solaris *is* aware of suspend/resume; it is just not obvious to me that
existing work/teams are being considered by the project team. As I
previously mentioned, it appears that the problems and needs of guest
migration are identical to those of suspending a bare-metal machine
(including "this cannot fail").

So why doesn't this proposal/project wish to align itself with that work
(possibly using callbacks that already exist)? How will this project
align with the Solaris core work?

----
Randy

>
> Lastly, in LDoms, a suspend operation only occurs to permit a domain migration
> and this is initiated by the management software on a separate control domain.
> The LDom management software is not ready to account for an arbitrary number
> of pre/post callbacks which could take any length of time. Certain operations
> are blocked when a migration is in progress and so this affects usability.
>
> In the event that the notification scheme needs to expand in the future, we
> will address the need. This could be with an extensible callback mechanism in
> the kernel or an API that includes notifications issued by the domain manager
> on the control domain.
>
> Thanks,
> Haik
>