Hi Kent,

I'm one of the team that works on Solaris' mpt driver, which we recently enhanced to deliver mpxio support with SAS. I have a bit of knowledge about your issue :-)

Kent Watsen wrote:
> Based on recommendations from this list, I asked the company that built
> my box to use an LSI SAS3081E controller.
>
> The first problem I noticed was that the drive-numbers were ordered
> incorrectly. That is, given that my system has 24 bays (6 rows, 4
> bays/row), the drive numbers from top-to-bottom & left-to-right were 6,
> 1, 0, 2, 4, 5 - even though when the system boots, each drive is scanned
> in perfect order (I can tell by watching the LEDs blink).
>
> I contacted LSI tech support and they explained:
>
> <start response>
> SAS treats device IDs differently than SCSI. LSI SAS controllers
> "remember" devices in the order they were discovered by the controller.
> This memory is persistent across power cycles. It is based on the world
> wide name (WWN) given uniquely to every SAS device. This allows your
> boot device to remain your boot device no matter where it migrates in
> the SAS topology.
>
> In order to clear the memory of existing devices you need at least one
> device that will not be present in your final configuration. Re-boot
> the machine and enter the LSI configuration utility (CTRL-C). Then find
> your way to SAS Topology. To see "more" options, press CTRL-M. Choose
> the option to clear all non-present device IDs. This clears the
> persistent memory of all devices not present at that time. Exchange the
> drives. The system will now remember the order it finds the drives
> after the next boot cycle.
> <end response>

Firstly, yes, the LSI SAS HBAs do use persistent mapping, with a "logical target id" by default. This is where the HBA does the translation between the physical disk device's SAS address (which you'll see in "prtconf -v" as the devid), and an essentially arbitrary target number which gets passed up to the OS - in this case Solaris.

The support person at LSI was correct about deleting all those mappings.

Yes, the controller is being smart and tracking the actual device rather than a particular bay/slot mapping.
This isn't so bad, mostly. The effect for you is that you can't assume that the replaced device is going to have the same target number as the old one (in fact, I'd call that quite unlikely), so you'll have to see what the new device name is by checking your dmesg or "iostat -En" output.

> Sure enough, I was able to physically reorder my drives so they were 0, 1,
> 2, 4, 5, 6 - so, apparently, the company that put my system together
> moved the drives around after they were initially scanned. But where is
> 3? (answer below). Then I tried another test:
>
> 1. make first disk blink
>
> # run dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
> 10+0 records in
> 10+0 records out
>
> 2. pull disk '0' out and replace it with a brand new disk
>
> # run dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
> dd: /dev/dsk/c2t0d0p0: open: No such file or directory
>
> 3. scratch head and try again with '3' (I had previously cleared the
> LSI controller's memory)
>
> # run dd if=/dev/dsk/c2t3d0p0 of=/dev/null count=10
> 10+0 records in
> 10+0 records out
>
> So, it seems my SAS controller is being too smart for its own good - it
> tracks the drives themselves, not the drive-bays. If I hot-swap a brand
> new drive into a bay, Solaris will see it as a new disk, not a
> replacement for the old disk. How can ZFS support this? I asked the
> LSI tech support again and got:
>
> <start quote>
> I don't have the knowledge to answer that, so I'll just say
> this: most vendors, including Sun, set up the SAS HBA to use
> "enclosure/slot" naming, which means that if a drive is
> swapped, it does NOT get a new name (after all, the enclosure
> and slot did not change).
> <end quote>

Now here's where things get murky.

At this point in time at least (it may change!) Solaris' mpt driver uses LSI's logical target id mapping method. This is *NOT* an enclosure/slot naming method - at least, not from the OS' point of view.
Additionally, unless you're using an actual SCSI Enclosure Services (ses) device, there's no enclosure to provide enclosure/slot mapping with either.

Since mpt uses logical target id, the target id which Solaris sees _will definitely change_ if you swap a disk.

(I'm a tad annoyed that the LSI support person appears to have made an assumption based on a total lack of understanding about how Solaris' mpt driver works.)

(My assumption here is that you're using Solaris' mpt(7d) driver rather than LSI's itmpt driver.)

So how do you use your system and its up-to-24 drives with ZFS?

(a) ensure that you note what Solaris's idea of the target id is when you replace a drive, then
(b) use "zpool replace" to tell ZFS what to do with the new device in your enclosure.

I hope the above helps you along the way... but I'm sure you'll have followup questions, so please don't hesitate to ask either directly or to the list.

best regards,
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp	http://www.jmcp.homeunix.com/blog
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
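
Steps (a) and (b) above could be sketched roughly as follows. This is only an illustration under stated assumptions: the pool name "tank" is hypothetical, the helper name replace_cmd is made up for this example, and the device names c2t0d0 and c2t3d0 are taken from the dd test quoted earlier (old disk at target 0, new disk showing up at target 3). The helper only prints the "zpool replace" command so it can be checked before anything is run.

```shell
#!/bin/sh
# Sketch only: build the "zpool replace" command for a swapped disk.
# "tank" is a hypothetical pool name; c2t0d0 / c2t3d0 are the device
# names from the dd test quoted in this thread.

# Print the command instead of running it, so it can be reviewed first.
replace_cmd() {
    pool=$1
    old_dev=$2
    new_dev=$3
    if [ -z "$pool" ] || [ -z "$old_dev" ] || [ -z "$new_dev" ]; then
        echo "usage: replace_cmd <pool> <old-device> <new-device>" >&2
        return 1
    fi
    echo "zpool replace $pool $old_dev $new_dev"
}

# (a) note the new target id first, e.g. from "iostat -En" or dmesg,
# (b) then tell ZFS which new device takes over from the old one:
replace_cmd tank c2t0d0 c2t3d0
```

Once the printed command looks right for your pool and devices, run it as root; "zpool status" should then show the resilver onto the new device.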