On 4/12/15 1:25 PM, [email protected] wrote:
> Michael Tiernan wrote:
>> Normally, what will happen is that the kickstart process will wipe and
>> rebuild on the drive in Slot1 since it is the first drive. This is not
>> the desired outcome.
>>
>> What I want to do is confirm that the drive I'm focusing on is the
>> "correct" one in the physical hardware slot 0.
> Curious why the 'for whatever reason' has gone unremarked.
> What's the use case? (Frankly, could this happen to me and
> should I pay close attention?) One response seemed to assume you
> have data on other disks you want to preserve. Maybe you're
> trying to track a physical disk that contains the root
> filesystem? Fail the install if any disks are inop? Something
> else?

First off, a preface/reminder: we don't always get to choose the entirety of the infrastructure we inherit. :( (I.e., suspend preferred logic and assume we pick one battle at a time.)
So, my use case, as screwy as it may seem, is this: I've got a machine with more than one drive in it. Usually the number is 6, 8, or greater now that we've got some new slot-rich Dells in the racks. The machine runs along with the system on the disk in slot 0 and the other 6, 8, etc. drives configured as individual RAID0 containers or just as raw disks. Don't ask, just go with it. The principal rule: data (on data drives) is sacred and must never be lost.

Now, something happens and "other person with permission" reboots the system "because", and instead of it coming up, we find that the system drive has gone bad. Sadly, this happens much more often than I'd like. It leaves me in a situation where I cannot determine the UUIDs of the existing drives and divine where to build the new root. Sometimes the system drive truly disappears; in other cases it just begins to exhibit signs of total failure. Either way, I replace the system drive with another (SATA) drive and then have to build a new system on it.

What *sometimes* happens is that the new/replacement drive is bad or also going dumb. When this happens, at times, instead of it being counted as "Drive #1", it is ignored, and when the kickstart proceeds the *SECOND* drive in the system, the first data drive, aka "Drive #2", gets wiped out and the system built on it.[2] This is not the desired result.

My preference would be to be able to ask the PCI cards/slots that handle storage, "Do you have a target in slot0?" I *CAN* determine, so far, that in some cases[1] if I pull the drive from slot0 then slot1 becomes the next drive, but I've not done enough tests to confirm this fully. I also have situations where the "Slot0" device is reached via direct hardware "linkage" (PCI-HBA->SATADev), but there are also times when the connection is less "plain" (PCI-HBA->RAIDContr->SATADev), so that the "zeroth drive" is a virtual drive.
(Which I can live with if I can determine it.)

So far, I've found that the "/dev/disk/by-path" information is SOMEWHAT informative at first but fails quickly when you try to parse it. I can post examples, but it /seems/ that the response from inside the HBA, beyond the PCI definitions of responses, is vague enough that you can't reliably be sure of what you're getting unless you determine things like firmware revs and try to keep track of them.

The one thing I've run into is the question: "If you report SCSI target a.b.c.0, are you telling me that this applies SPECIFICALLY to the first *possible* slot, better known as slot0?" So far, I've not found definitions of this information that tell me for sure.

[1] The testing of these cases shows that if I pull the drive in slot 0 and build on the drive in slot 1, the /dev/disk/by-path identification shows as SCSI target a.b.c.1, but I have not proven that this is *always* the case. (I'm still testing.)

[2] As already pointed out, the "correction" to this problem is to specify in the kickstart which device the system should be built on, using anaconda's "--onpart=/dev/disk/by-path/pci...." path. However, I need to construct this path on demand, since it changes between systems and it also *seems* to change with different revs of hardware/firmware! As it is, the kickstart says /dev/sda, but if the device in slot0 goes "away" then /dev/sda is what's in slot1, which isn't a good thing.

There are days when it feels like all this device "standardization" gives us an opaque window on the engine under the hood: we can't touch it, and the response to "where are the spark plugs?" is as firm as "over there". In my very limited experience and view, I am of the opinion that I should be able to ask the system:

"Do you have any devices that are handling storage targets?"

"If so, can you tell me the hard physical correlation between them and how you see them?"
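For what it's worth, the [2] workaround can at least be automated per machine: a hypothetical anaconda %pre sketch (file names and partition layout invented; it assumes the by-path target field really does track the slot, which I haven't proven, and you'd want to check which device-spec forms your anaconda version accepts) that constructs the device path on demand and refuses to proceed rather than eat a data disk:

```
%pre --erroronfail
# Sketch only. Assumes by-path names of the form ...-scsi-H:C:T:L and
# that target 0 == physical slot 0 (unproven across firmware revs).
root_link=""
for link in /dev/disk/by-path/*-scsi-*; do
    [ -e "$link" ] || continue
    tgt=$(echo "$link" | sed -n 's/.*-scsi-[0-9]*:[0-9]*:\([0-9]*\):[0-9]*$/\1/p')
    [ "$tgt" = "0" ] && { root_link="$link"; break; }
done
if [ -n "$root_link" ]; then
    # Only ever mention the slot-0 device, so clearpart can't touch the
    # data drives. (Verify your anaconda accepts a full /dev path here.)
    cat > /tmp/part-include <<EOF
ignoredisk --only-use=$root_link
clearpart --all --initlabel
autopart
EOF
else
    echo "No drive answering as target 0 -- refusing to guess." > /dev/tty3
    exit 1
fi
%end

# ...and in the main kickstart body:
%include /tmp/part-include
```

The point of the exit-on-failure branch is exactly the failure mode described above: better a halted install than a kickstart that quietly builds on the first data drive.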
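Since the by-path names are the only handle I've got, here's a minimal sketch of the kind of parsing I mean. The helper is my own invention (not a standard tool), the example device names are made up, and it *assumes* names of the form pci-...-scsi-H:C:T:L where the target (third) field tracks the physical slot, which is exactly the unproven part:

```shell
# Hypothetical helper: extract the SCSI target (third) field from a
# by-path style name like pci-0000:02:00.0-scsi-0:0:3:0.
# The example names below are invented for illustration.
scsi_target() {
    echo "$1" | sed -n 's/.*-scsi-[0-9]*:[0-9]*:\([0-9]*\):[0-9]*$/\1/p'
}

scsi_target pci-0000:02:00.0-scsi-0:0:0:0   # prints 0 -- candidate "slot 0"
scsi_target pci-0000:02:00.0-scsi-0:0:1:0   # prints 1
```

If the target-tracks-slot assumption held, you'd walk /dev/disk/by-path/*-scsi-* and keep only the link whose extracted target is 0; but as noted above, I haven't proven that it holds across hardware/firmware revs.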
OR

"If I ask you, can you tell me if there's a drive/storage device available in what shows up on the hardware as SlotN?"

It may be that there's no physical correlation, that the drives are all virtual devices hanging off the network, but I should *still* be able to query about them directly and get correlatable data about them.

Of course I may be just digging myself a hole in a religious fervor and should give up.

-- 
<< MCT >> Michael C Tiernan.  http://www.linkedin.com/in/mtiernan
Non Impediti Ratione Cogatationis
Women and cats will do as they please, and men and dogs
should relax and get used to the idea. -Robert A. Heinlein
_______________________________________________
Discuss mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/
