Randy Fishel wrote:
> 
>> These suspend operations are entirely distinct from
>> CPR suspend operations.
>>
> 
>   Why?  It would seem to me that the problems are the same for both, 
> and that it would be worthwhile to leverage this work, than to roll a 
> private solution.
> 
>       ---- Randy

Re-using the CPR code was considered, but was not done because the 
requirements of CPR and sun4v suspend/resume are different.

The current sun4v suspend occurs without Solaris' knowledge. The suspend 
code in development, which will utilize this case, is extremely minimal 
compared to CPR, because CPR 1) has to deal with physical devices and 
2) has to write the OS memory to disk. The sun4v suspend/resume, and 
therefore domain migration, is only available on domains composed 
entirely of virtual I/O devices (i.e., virtual disks and virtual network 
devices). In sun4v, device drivers do not need to be suspended, devices 
do not need to be quiesced or powered down, user and kernel threads do 
not need to be stopped, and we don't need to flush anything to disk.

Leveraging the CPR code could have been done by providing several empty 
sun4v-specific cpr_ routines as well as modifying common CPR code to 
move more portions into platform-specific areas. Some of the callbacks 
in CPR would have to be skipped on sun4v, or we'd have to change other 
subsystems to not register callbacks on sun4v. And we would want to do 
this in a way that wouldn't make it difficult to eventually support CPR 
suspend on sun4v platforms. This work is targeting a Solaris update 
release, and the amount of new code written and the risk to other 
platforms were smallest with a separate sun4v implementation. All that 
considered, we chose that approach.

Thanks,
Haik
