On 6 March 2016 2:08:15 CET, "Joshua M. Clulow" <[email protected]> wrote:
>On 5 March 2016 at 10:38, Reginald Beardsley via illumos-discuss
><[email protected]> wrote:
>> How does one track from the BIOS level disk identification through to
>> the OS presentation?  SunOS 4.1 I knew everything, but Solaris changed
>> that and I've never found a good guide to decoding the new device tree.
>
>Unfortunately the information the BIOS passes to the operating system
>on the subject of disk enumeration is relatively thin on the ground,
>and what _is_ available is not necessarily trustworthy if your BIOS
>has bugs or incomplete information.  This problem, as well as the more
>general problem of mapping logical disks to _physical_ disk slots in
>the machine itself, is not really a solved problem in general.
>
>Where it _has_ been solved, in a few narrow cases, is when a server
>manufacturer promises to cable a system in a particular way, and we
>have been able to ship a mapping file from HBA ports to physical
>slots.  If you are fortunate enough to have a SES-capable SCSI/SAS
>enclosure it may contain its own mapping information, for which we do
>currently have some software support.
>
>> What would happen if a pair of disk drive cables got swapped on a 3
>> disk RAIDZ1?  With a 3 way mirror?  That might have happened when I was
>> cleaning out the dust. There were no issues booting or using the system
>> until I started the scrub.
>
>It is my understanding that ZFS uses the "devid" of a disk in order to
>be able to tell it apart from other disks.  There are a number of
>different ways that this identifier can be generated, but one of the
>core goals is that it uniquely identifies a specific drive --
>regardless of the port to which it is currently attached.  You can see
>the "devid" of a disk with "prtconf -v $diskdevice".
>
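A minimal sketch of that idea, assuming you have saved two plain-text
listings of "devid path" pairs (one pair per line, whitespace separated)
taken before and after the cables were touched. The function name and the
listing format are hypothetical; nothing here is an existing illumos tool.

```shell
# Hypothetical helper: compare two "devid path" listings and report every
# disk whose device path changed while its devid stayed the same.
# Assumes neither the devid nor the path contains whitespace.
moved_disks() {
    awk 'NR == FNR { before[$1] = $2; next }      # first file: remember paths
         ($1 in before) && before[$1] != $2 {     # second file: same devid,
             print $1 ": " before[$1] " -> " $2   # different path
         }' "$1" "$2"
}
```

Called as "moved_disks before.txt after.txt", it prints one line per disk
that now sits at a different target address.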
>For example, one of my systems has an AHCI SATA controller where each
>port on the controller is exposed through a specific hot swap bay.
>The first "target" on the controller, "c0t0d0", is whatever is plugged
>into the first port.  The second port is the second target, i.e.
>"c0t1d0", etc.  Obviously if I were to move a device from one port to
>another, it would have a different device path.  Looking at the devid
>for one of the disks, we see:
>
># prtconf -v /dev/dsk/c0t1d0s0 \
>    | nawk '/name=.devid. t/{q=1;next}q{print;exit}'
>
>           value='id1,sd@SATA_____WDC_WD20EARS-00M_____WD-WMAZA4347050'
>
>Note that the identifier includes the bus type (SATA), the product
>identifier and the drive serial number.  If I move the disk to another
>port, the device path will change (as it is now attached at a
>different target address) but the "devid" will not.
>
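To pull the pieces of such an identifier apart, something like the
following may help. It is only a sketch: it assumes the fields are padded
with runs of two or more underscores, which is inferred from the single
example above and not from any documented devid layout. The function name
is made up for illustration.

```shell
# Hypothetical helper: print the underscore-padded fields of a devid,
# one per line. Assumes fields are separated by two or more underscores
# (a single underscore is treated as part of a field).
devid_fields() {
    printf '%s\n' "$1" | awk '{
        sub(/^[^@]*@/, "")            # drop the "id1,sd@" style prefix
        n = split($0, f, /__+/)       # split on runs of 2+ underscores
        for (i = 1; i <= n; i++) print f[i]
    }'
}
```

For the devid shown above, the last line printed is the part that looks
like the drive serial number.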
>Similarly, if we look at a VMware guest, we can see that (in this
>guest) VMware has exposed its virtual disk as an emulated parallel
>SCSI (SPI) device.  SPI controllers generally make their disks
>available at a device path where the "target" portion reflects the
>SCSI address on the bus -- e.g. the disk at ID 7 will be c0t7d0.  The
>"devid" in this case comes from an inquiry command sent to the device
>to obtain its unique identifier:
>
>            value='id1,sd@n6000c2947deaa1f0abc33300ea120555'
>
>This is a 16 byte WWN, which is unique to the "disk" itself.  Because
>this is a virtual disk, this information is actually persisted outside
>the VMware guest in a file.  Digging in to the VMDK file (which is
>binary, but contains some strings) we see:
>
>    ddb.uuid = "60 00 C2 94 7d ea a1 f0-ab c3 33 00 ea 12 05 55"
>
>This matches the "devid" of the disk itself.  In a physical SAS disk,
>which will have a similar identifier, the value is generally stored
>within the ROM of the disk controller.
>
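The comparison between the VMDK value and the devid can be done
mechanically. This is a sketch based only on the two strings quoted above:
it assumes the devid is the "ddb.uuid" hex bytes, lowercased and joined,
behind an "id1,sd@n" prefix. Other hypervisors or disk types may well
differ, and the function name is hypothetical.

```shell
# Hypothetical helper: turn a VMDK ddb.uuid value into the devid string
# it is expected to correspond to (assumption: "id1,sd@n" + lowercase hex).
uuid_to_devid() {
    # Remove spaces and dashes, then fold hex digits to lower case.
    hex=$(printf '%s' "$1" | tr -d ' -' | tr 'A-F' 'a-f')
    printf 'id1,sd@n%s\n' "$hex"
}
```

Feeding it the "ddb.uuid" string from the VMDK and comparing the result
with the "prtconf -v" value is a quick way to confirm which virtual disk
file backs which device.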
>The only situation for which things are potentially a little more
>dicey is old-fashioned parallel IDE; cf. the "cmdk" driver.  In these
>cases it is conceivable that a device might not _have_ a persistent
>unique identifier that we can read from the hardware.  But even with
>IDE disks, I believe that an identifier is generated at format time
>and then stored somewhere in the disk label.
>
>> Thus since a "zpool import -f" from another system is going to ignore
>> the ondisk drive identification (???)  It seems to me it has to if the
>> pool was not exported and that the export operation marks the vdev as
>> exported so that there will not be conflicts if it is imported into
>> another system.
>
>A "zpool import" on another system will still make use of the "devid"
>read from each volume to find the appropriate disks, regardless of
>their device path or attachment point.  The thing that _might_ have
>been tripping you up is that "/etc/zfs/zpool.cache" contains some
>information about where a particular disk was previously attached
>which is (I believe) used to speed up the import process for "rpool",
>etc.  I don't think that incorrect information (say, after you
>rearranged the disks) should cause problems during operation, but I
>only really use a distribution where we do not _have_ this cache file
>(SmartOS has no persistent root).
>
>You can see the contents of your cache file thus: "zdb -C -U
>/etc/zfs/zpool.cache".  You can see the information in the cache file
>_and_ the information within the pool itself with: "zdb -CC -U
>/etc/zfs/zpool.cache $POOLNAME".
>
>I realise this is not a direct answer to your question, but hopefully
>it gives you some more places to look.
>
>
>
>Cheers.

From my reading, rpool is special in that GRUB finds BIOS devices for the
pool, finds the last-used device path (string saved in zpool headers), and
passes this string and the chosen rootfs dataset number(!) to Solaris/illumos
as "here you'd find a copy of rootfs to mount". This is why one device must be
authoritative (so rpools are single disks or mirrors) and why we have problems
changing the controller type (legacy IDE vs. native SATA, VM vs. physical, etc.)

I had an incomplete hack to have GRUB pass pool-component vdev GUIDs, but did
not figure out some critical bits to make it all work on the root-mount side.
Reimplementers welcome ;)

The zpool.cache, if available (and if requested, e.g. no altroot used during
mount), comes into play much later.

HTH,
Jim
--
Typos courtesy of K-9 Mail on my Samsung Android

