James Carlson wrote:
> The architectural matters here are that (a) the bits natively on the
> disk [not in the archive] are the real ones and (b) nothing but
> inconsistency or corruption in the *real* bits should stop the system
> from coming up.  The boot archive is just supposed to be a cache.  If
> it's inconsistent, that shouldn't turn the system into a warm brick as
> it does today.  It should cause the system to come up more slowly.

If you accept the idea that the boot archive is just a cache for the 
filesystem contents, that seems to have two consequences:

1. If the boot archive and the filesystem ever disagree, the boot 
archive is wrong. The right (unconditional) action is to rebuild the 
boot archive and boot again. This can only fail if the filesystem 
contents themselves are incorrect. [1]

Does everyone agree with the paragraph above, or are there additional 
complications?
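
To make the question concrete, here is a minimal sketch of the boot-time
decision described in point 1 and footnote [1]. The names (Action,
next_boot_action and its two flags) are hypothetical, chosen for this
illustration only; they do not correspond to any existing Solaris
interface.

```python
from enum import Enum


class Action(Enum):
    BOOT_FROM_ARCHIVE = "boot from archive"
    REBUILD_AND_REBOOT = "rebuild archive and boot again"
    BOOT_FAILSAFE = "boot failsafe archive"


def next_boot_action(archive_matches_fs: bool,
                     boot_failed_after_rebuild: bool) -> Action:
    """Treat the boot archive purely as a cache of the filesystem."""
    if not archive_matches_fs:
        # The filesystem is authoritative, so a mismatch always means
        # the archive is stale: rebuild unconditionally and retry.
        return Action.REBUILD_AND_REBOOT
    if boot_failed_after_rebuild:
        # Archive and filesystem agree, yet booting still failed, so
        # the filesystem contents themselves must be bad (footnote [1]).
        return Action.BOOT_FAILSAFE
    return Action.BOOT_FROM_ARCHIVE
```

The point of the sketch is that an inconsistency never leads directly to
a stopped system; only a failure with a freshly rebuilt archive does.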

2. The mechanism proposed by this case is never needed for consistency 
(since we can always rebuild the boot archive on the next boot), but it 
may be a useful optimization to make the boot-time consistency check 
pass more often.

It seems like we could achieve the same optimization without affecting 
uadmin(2) and without having to disable the optimization for clustered 
systems. A background service could periodically check the boot archive 
for consistency and rebuild if necessary. [2] Nothing special is needed 
in uadmin(2), and the current reboot check would generally succeed 
without stopping to rebuild the archive.
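
A rough sketch of what such a background check might decide, including
the settling time from footnote [2]. Everything here is an assumption
made for illustration: comparing modification times is only a stand-in
for whatever consistency test the real archive check performs, and
should_rebuild and SETTLE_SECONDS are invented names.

```python
SETTLE_SECONDS = 60  # assumed settling time, per footnote [2]


def should_rebuild(archive_mtime: float,
                   source_mtimes: list[float],
                   now: float,
                   settle: float = SETTLE_SECONDS) -> bool:
    """Decide whether a periodic check should rebuild the archive.

    archive_mtime: when the boot archive was last built
    source_mtimes: modification times of the files it caches
    now:           current time, in the same units (seconds)
    """
    newest = max(source_mtimes)
    # Stale: some cached file changed after the archive was built.
    stale = newest > archive_mtime
    # Settled: nothing has changed for at least `settle` seconds, so we
    # are probably not in the middle of a patch/package install.
    settled = (now - newest) >= settle
    return stale and settled
```

A service would call this periodically (for example with time.time() as
`now`) and invoke the normal archive rebuild when it returns True. If it
never gets the chance, the boot-time check still catches the mismatch.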

Perhaps this idea has been considered before and rejected, but I don't 
know why. Can anyone enlighten me?
        
        Scott

[1] If booting fails a second time, after rebuilding the boot archive,
then the archive and filesystem are consistent with each other, so the
fault must lie in the filesystem contents themselves. At that point,
it's reasonable to boot the failsafe archive and leave it to the user
to (somehow) repair the damage.

[2] We'd probably want to allow for settling time (e.g. make sure at 
least a minute has passed since the last file was updated) to avoid 
rebuilding in the middle of a patch/package install.

-- 
Scott Rotondo
Principal Engineer, Solaris Security Technologies
President, Trusted Computing Group
Phone/FAX: +1 408 850 3655 (Internal x68278)
