I've only recently started working with SVM and there's probably a
'better' way to do the following, but it still shouldn't have failed in
the manner in which it did, imho.

OK, the scenario: a host with only two internal disks, mirrored with SVM
and referred to here as disk A and disk B. I needed to make a backup of
the filesystem data onto disk B. The host has an install DVD loaded.

- bring down box to PROM

- boot into single-user mode on the DVD

- relabel disk B and create a whole-disk filesystem in slice 0

- using ufsdump, dump the content of the mirrored filesystems to disk B

- reboot host
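
For reference, the steps above correspond roughly to the sketch below.
The device names (c1t0d0 for disk A, c1t1d0 for disk B), the single root
slice and the /mnt mount point are assumptions for illustration, not the
real values from this host, and only one filesystem is shown. The script
just prints the commands; it does not run them.

```shell
#!/bin/sh
# Sketch only: prints the commands for the backup steps, does not run them.
# DISK_A, DISK_B and MNT are hypothetical placeholders, not the real devices.
DISK_A=${DISK_A:-c1t0d0}   # disk left untouched (assumed name)
DISK_B=${DISK_B:-c1t1d0}   # disk reused to hold the backup (assumed name)
MNT=${MNT:-/mnt}           # temporary mount point (assumed)

backup_plan() {
    # At the PROM: boot single-user from the install DVD.
    echo "ok boot cdrom -s"
    # Relabel disk B interactively so that slice 0 spans the whole disk.
    echo "format ${DISK_B}"
    # Create and mount a fresh UFS filesystem on disk B, slice 0.
    echo "newfs /dev/rdsk/${DISK_B}s0"
    echo "mount /dev/dsk/${DISK_B}s0 ${MNT}"
    # Level-0 dump of disk A's root slice into a file on disk B.
    echo "ufsdump 0f ${MNT}/root.dump /dev/rdsk/${DISK_A}s0"
    echo "umount ${MNT}"
    echo "reboot"
}

backup_plan
```

Note that nothing in this sequence tells SVM that disk B has been taken
away from it, which is the situation described below.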

Now my expectation of what would happen here was that SVM would
discover that disk B's metadb and its halves of the mirrored devices
were now gone/broken and just do the Right Thing and boot the host with
all the mirror devices in maintenance mode.

What actually happened was the following:

- kernel banner is displayed, then the kernel panics:

|Boot device: diska  File and args:
|SunOS Release 5.10 Version Generic_127111-06 64-bit
|Copyright 1983-2007 Sun Microsystems, Inc.  All rights reserved.
|Use is subject to license terms.
|WARNING: promif_ds_init: can't find ds_cap_init
|
|panic[cpu0]/thread=180e000: mod_hold_stub: Couldn't load stub module misc/strplumb
|
|000000000180b890 genunix:mod_hold_stub+1f0 (0, 18a3400, 18f0818, 600008ff910, 182a170, 0)
|  %l0-3: 00000300014b98f8 000006000091e800 000000000182dcd0 0000000000000000
|  %l4-7: 0000000000000000 000000000000003f 0000000001861c00 0000000000000000
|000000000180b940 unix:stubs_common_code+30 (0, 0, 185d000, 18f6000, 1091800, 3)
|  %l0-3: 000000000180b209 000000000180b2e1 00000003fe000000 0000000001047400
|  %l4-7: 0000000000000000 000000000182a180 0000000001877000 00000600008e74f0
|000000000180ba10 genunix:main+f0 (1826ee8, 0, 1855a70, 18f0400, 70002000, 1826c00)
|  %l0-3: 0000000070002000 0000000000000001 000000000180c000 000000000180e000
|  %l4-7: 0000000000000001 000000000180c000 0000000000000060 0000000000000000
|
|syncing file systems... done
|skipping system dump - no dump device configured
|rebooting...
|Resetting...

- on rebooting, the host hangs hard:

|SPARC Enterprise T5120, No Keyboard
|Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
|OpenBoot 4.27.1, 16256 MB memory available, Serial #77054504.
|Ethernet address 0:14:4f:97:c2:28, Host ID: 8497c228.
|
|
|
|Boot device: diska  File and args:
[hang]

I then forced a break and booted from the install DVD into single-user
mode again. It seems that something somewhere had completely trashed the
root filesystem on disk A, and the host was definitely not bootable.

I restored the root filesystem from the backup I'd made earlier and
found that I could boot the host into more-or-less the expected state if
I physically unplugged disk B from the host and let it boot normally.

Once SVM discovered that the disk was missing, it failed the metadb
and submirrors on that device. I was then able to shut down the host,
re-insert disk B and bring it back up again without problems.

Granted, this is probably not the manner in which this should have been
done, but I am very surprised that it caused the number of problems
that it did ... a panic I can understand, but how and why did the root
filesystem become corrupted?

Regards,
Malcolm

-- 
Malcolm Herbert                                This brain intentionally
mjch at mjch.net                                                left blank
