On 5 March 2016 at 10:38, Reginald Beardsley via illumos-discuss
<[email protected]> wrote:
> How does one track from the BIOS level disk identification through to the OS
> presentation? SunOS 4.1 I knew everything, but Solaris changed that and I've
> never found a good guide to decoding the new device tree.
Unfortunately the information the BIOS passes to the operating system
on the subject of disk enumeration is relatively thin on the ground,
and what _is_ available is not necessarily trustworthy if your BIOS
has bugs or incomplete information. This, along with the more
general problem of mapping logical disks to _physical_ disk slots in
the machine itself, is not really solved in the general case.
Where it _has_ been solved, in a few narrow cases, is when a server
manufacturer promises to cable a system in a particular way, and we
have been able to ship a mapping file from HBA ports to physical
slots. If you are fortunate enough to have a SES-capable SCSI/SAS
enclosure, it may contain its own mapping information, for which we do
currently have some software support.
> What would happen if a pair of disk drive cables got swapped on a 3 disk
> RAIDZ1? With a 3 way mirror? That might have happened when I was cleaning
> out the dust. There were no issues booting or using the system until I
> started the scrub.
It is my understanding that ZFS uses the "devid" of a disk to tell
it apart from other disks. There are a number of
different ways that this identifier can be generated, but one of the
core goals is that it uniquely identifies a specific drive --
regardless of the port to which it is currently attached. You can see
the "devid" of a disk with "prtconf -v $diskdevice".
For example, one of my systems has an AHCI SATA controller where each
port on the controller is exposed through a specific hot swap bay.
The first "target" on the controller, "c0t0d0", is whatever is plugged
into the first port. The second port is the second target, i.e.
"c0t1d0", etc. Obviously if I were to move a device from one port to
another, it would have a different device path. Looking at the devid
for one of the disks, we see:
# prtconf -v /dev/dsk/c0t1d0s0 \
| nawk '/name=.devid. t/{q=1;next}q{print;exit}'
value='id1,sd@SATA_____WDC_WD20EARS-00M_____WD-WMAZA4347050'
Note that the identifier includes the bus type (SATA), the product
identifier and the drive serial number. If I move the disk to another
port, the device path will change (as it is now attached at a
different target address) but the "devid" will not.
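For instance, if that same drive turned up at the third port (a
hypothetical target address; I have not actually moved it), checking
at the new device path should yield the same value:
# prtconf -v /dev/dsk/c0t3d0s0 \
| nawk '/name=.devid. t/{q=1;next}q{print;exit}'
value='id1,sd@SATA_____WDC_WD20EARS-00M_____WD-WMAZA4347050'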
Similarly, if we look at a VMware guest, we can see that (in this
guest) VMware has exposed its virtual disk as an emulated parallel
SCSI (SPI) device. SPI controllers generally make their disks
available at a device path where the "target" portion reflects the
SCSI address on the bus -- e.g. the disk at ID 7 will be c0t7d0. The
"devid" in this case comes from an inquiry command sent to the device
to obtain its unique identifier.
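You can read it back with the same prtconf/nawk pipeline as before
(the device name here is a guess, for illustration):
# prtconf -v /dev/dsk/c0t0d0s0 \
| nawk '/name=.devid. t/{q=1;next}q{print;exit}'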
value='id1,sd@n6000c2947deaa1f0abc33300ea120555'
This is a 16-byte WWN, which is unique to the "disk" itself. Because
this is a virtual disk, this information is actually persisted outside
the VMware guest in a file. Digging into the VMDK file (which is
binary but contains some strings), we see:
ddb.uuid = "60 00 C2 94 7d ea a1 f0-ab c3 33 00 ea 12 05 55"
This matches the "devid" of the disk itself. In a physical SAS disk,
which will have a similar identifier, the value is generally stored
within the ROM of the disk controller.
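Incidentally, fishing that ddb.uuid line out of the VMDK needs
nothing more than strings(1) and grep(1); e.g. (the file name here
is invented):
# strings disk.vmdk | grep ddb.uuid
ddb.uuid = "60 00 C2 94 7d ea a1 f0-ab c3 33 00 ea 12 05 55"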
The only situation for which things are potentially a little more
dicey is old-fashioned parallel IDE; cf. the "cmdk" driver. In these
cases it is conceivable that a device might not _have_ a persistent
unique identifier that we can read from the hardware. But even with
IDE disks, I believe that an identifier is generated at format time
and then stored somewhere in the disk label.
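The same prtconf(1M) check works for "cmdk" devices too; drives that
report a model and serial number will typically have a devid of the
"id1,cmdk@A..." form. This example is illustrative, not from a real
machine:
# prtconf -v /dev/dsk/c0d0s0 \
| nawk '/name=.devid. t/{q=1;next}q{print;exit}'
value='id1,cmdk@AWDC_WD800JB-00JJC0=WD-WMAM9XXXXXXX'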
> Thus since a "zpool import -f" from another system is going to ignore the
> ondisk drive identification (???) It seems to me it has to if the pool was
> not exported and that the export operation marks the vdev as exported so that
> there will not be conflicts if it is imported into another system.
A "zpool import" on another system will still make use of the "devid"
read from each volume to find the appropriate disks, regardless of
their device path or attachment point. The thing that _might_ have
been tripping you up is that "/etc/zfs/zpool.cache" contains some
information about where a particular disk was previously attached
which is (I believe) used to speed up the import process for "rpool",
etc. I don't think that incorrect information (say, after you
rearranged the disks) should cause problems during operation, but I
only really use a distribution where we do not _have_ this cache file
(SmartOS has no persistent root).
You can see the contents of your cache file thus: "zdb -C -U
/etc/zfs/zpool.cache". You can see the information in the cache file
_and_ the information within the pool itself with: "zdb -CC -U
/etc/zfs/zpool.cache $POOLNAME".
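To give you an idea of what to look for, each disk vdev in that dump
carries a "devid" alongside its "path". Trimmed, illustrative output
(the pool name and disk are invented) looks something like:
# zdb -C -U /etc/zfs/zpool.cache
tank:
    version: 5000
    name: 'tank'
    ...
    vdev_tree:
        type: 'root'
        children[0]:
            type: 'disk'
            path: '/dev/dsk/c0t1d0s0'
            devid: 'id1,sd@SATA_____WDC_WD20EARS-00M_____WD-WMAZA4347050/a'
            ...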
I realise this is not a direct answer to your question, but hopefully
it gives you some more places to look.
Cheers.
--
Joshua M. Clulow
UNIX Admin/Developer
http://blog.sysmgr.org