Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-25 Thread Don NetBSD

On 9/25/2018 3:19 AM, David Brownlee wrote:

[attrs elided]


I have no idea whether this would actually map to your real
requirements, but a possible workflow could be:

Bringing up new appliance ("slot mapping")
- Assuming you have "ID" devices digitally and physically labelled 1..n.
- User is directed to insert as many ID devices as they have slots,
then switch on machine
- Appliance boots, detects it has devices attached, checks to see they
are ID devices, updates slots and records its slot mappings


I would just use N different (make/model) drives for that purpose and
examine dmesg on boot:  "OK, the 500G Seagate is located in the
top left slot and that appears to have been probed as sd0.  The 320G WD
is in the slot to its right and that seems to have been probed as sd4.
etc."  As this is only done once, I can just grab any old drives and
stuff them into the machine, knowing their contents won't be altered
(unless I screw up).  Then, put them back  once I've got
the slots marked.


Mmm finding and maintaining N different models of drives might be a
pain.


The point is NOT to be required to have "special disks" (e.g., your disks
with ID's written on the media).  Pick up N disks that differ in some way
from each other (size, manufacturer) and stuff them into slots.  Watch
the dmesg output as those are probed.  Return the disks to their original
homes.
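
A minimal sketch of the "watch dmesg" step, in Python. It assumes the attach
lines look like "sd4 at scsibus1 target 3 lun 0: <vendor, product, rev> ..."
(check against a real boot log before relying on it). The slot labels and the
sample text are made up for illustration.

```python
import re

# Assumed shape of an sd attach line in dmesg, e.g.
#   sd4 at scsibus1 target 3 lun 0: <WDC, WD3200, 1.01> disk fixed
ATTACH_RE = re.compile(
    r"^(sd\d+) at (scsibus\d+) target (\d+) lun (\d+): <([^>]*)>"
)

def parse_attachments(dmesg_text):
    """Return {unit: (bus, target, lun, inquiry string)} for each sd attach line."""
    found = {}
    for line in dmesg_text.splitlines():
        m = ATTACH_RE.match(line.strip())
        if m:
            unit, bus, target, lun, inquiry = m.groups()
            found[unit] = (bus, int(target), int(lun), inquiry.strip())
    return found

def map_slots(dmesg_text, slot_by_model):
    """slot_by_model maps a substring of the inquiry string to a slot label."""
    slots = {}
    for unit, (bus, target, lun, inquiry) in parse_attachments(dmesg_text).items():
        for needle, label in slot_by_model.items():
            if needle in inquiry:
                slots[label] = (unit, bus, target, lun)
    return slots

# Hypothetical boot output with two distinguishable drives installed.
SAMPLE = """\
sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST3500, 0001> disk fixed
sd4 at scsibus1 target 3 lun 0: <WDC, WD3200, 1.01> disk fixed
"""

if __name__ == "__main__":
    print(map_slots(SAMPLE, {"SEAGATE": "top-left", "WDC": "top-second"}))
```

The (bus, target, lun) triple, not the sdN name, is what would be carried over
into a wired-down kernel config.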


I don't expect (nor want!) "them" to be able to bring up new boxes
unsupervised.  There are too many little details that could have
consequences.  E.g., any performance metrics reported for a drive
in appliance A might differ from (that same drive!) in appliance B.


Reasonable, but it's always nice to design what would be the full
robust system, and then decide what corners to cut :-p, plus from past
experience you invariably end up at some point needing to build a box
at the same time as your attention is split fielding something else
urgent.


The "right solution" is to use our existing product as the fixture!
- we already own the hardware design (and know how to troubleshoot it)
- can get replacement parts any time (what if the server shits the bed?)
- can build as many as we like (no hunting for identical/compatible servers)
- own all the sources (and know how they work!)
- don't have to worry about hot swap (just power the device down, remove
  the drive, insert another, power up -- near instantaneous boot)
- don't have to go "exploring" all of these issues

*This* approach is the result of someone with a superficial knowledge of
the issues spouting off to Management ears that were foolishly receptive
to the -- ahem -- "short cut".  I'll be able to prove that when I'm done.

My interest, now, lies in how I could exploit this approach for some
other organizations with which I'm affiliated (that DON'T have an existing
product line that they could repurpose for the task).


Normal use
- When a new sdX or wdX device is detected system determines its slot
mapping and uses it when talking to user
- If it can't determine slot mapping, it suggests a new slot mapping
pass (something strange has happened)

Optional extra credit ("Where is what slot")
- User is instructed to apply sticky number labels next to ID devices
when bringing up the appliance


*I* would be that "user".  I imagine eventually having a "live (remote)
display" that  reports/summarizes the activities and status of each
drive slot.  Presently, that takes the form of a text display that
summarizes a single appliance on a single screen (curses).  That
could evolve into something graphical.


Usually a big fan of html in this case - can start by spitting out a
static html page with a table and 30 second meta refresh, and extend
to some simple javascript which refreshes within page...


With *no* experience in HTML, I've actually become VERY interested in
using it to make "platform independent" interfaces.  E.g., interfaces
that I could serve over a phone (via WiFi) without having to write
applications that run *in* the phone.

[Looking at designing a remote display for a pallet scale, presently, so
a forklift operator can just look at his phone to see how much the
pallet of goods weighs WITHOUT having to exit the vehicle and walk up
to the (small) display located indoors.]


  I do product design/development for a living, not "test fixture
design".


We all have to start somewhere :-p


I did my stint with production test several decades ago.  Considerably
more labor intensive, back then.  Hence my fear of letting this turn
into a "project" that THEY have to maintain.


So, I'm not too keen on embellishing this more than necessary
(and delaying the NEXT product's delivery!)


It sounds like you have all the right ideas - we're fascinated to hear
how it goes! :)


Presently, I'm more interested in what I can do for OTHER folks using
a similar approach (COTS hardware vs. something "owned" but proprietary).
But, let some one else pay me to learn what I want to learn...  


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-25 Thread David Brownlee
On Tue, 25 Sep 2018 at 06:51, Don NetBSD  wrote:
>
> On 9/24/2018 4:14 AM, David Brownlee wrote:
> > On Mon, 24 Sep 2018 at 11:08, Don NetBSD  wrote:
> >
> > I have no idea whether this would actually map to your real
> > requirements, but a possible workflow could be:
> >
> > Bringing up new appliance ("slot mapping")
> > - Assuming you have "ID" devices digitally and physically labelled 1..n.
> > - User is directed to insert as many ID devices as they have slots,
> > then switch on machine
> > - Appliance boots, detects it has devices attached, checks to see they
> > are ID devices, updates slots and records its slot mappings
>
> I would just use N different (make/model) drives for that purpose and
> examine dmesg on boot:  "OK, the 500G Seagate is located in the
> top left slot and that appears to have been probed as sd0.  The 320G WD
> is in the slot to its right and that seems to have been probed as sd4.
> etc."  As this is only done once, I can just grab any old drives and
> stuff them into the machine, knowing their contents won't be altered
> (unless I screw up).  Then, put them back  once I've got
> the slots marked.

Mmm finding and maintaining N different models of drives might be a
pain. If you have hot swap bays you could always script up something
like
"put disk in slot 1 and hit return or Q to quit"
"..."
"OK identified, move disk from slot 1 to slot 2 and hit return or Q to quit"

Also provides an inline test of the hot swap facility :)
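
A rough sketch of that prompt loop, in Python. It assumes a single test disk
is walked from slot to slot, and that some caller-supplied function
(`list_locations`, hypothetical) returns the set of currently attached
locations, e.g. strings built from scsibus/target numbers. How that set is
obtained (dmesg, scsictl, ...) is left open here.

```python
def walk_slots(slot_count, list_locations, ask=input):
    """Walk one test disk through each slot and record where it shows up.

    list_locations() must return an iterable of hashable location ids.
    It is first called BEFORE the test disk is inserted, to get a baseline.
    """
    baseline = set(list_locations())
    mapping = {}
    for slot in range(1, slot_count + 1):
        reply = ask("put disk in slot %d and hit return or Q to quit: " % slot)
        if reply.strip().lower() == "q":
            break
        new = set(list_locations()) - baseline
        # Exactly one new location means the slot was identified;
        # anything else (none, or several) is recorded as unknown.
        mapping[slot] = new.pop() if len(new) == 1 else None
    return mapping
```

Passing `ask` and `list_locations` in as parameters keeps the loop testable
without real hardware.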

> I am expecting this to bear some logical relationship to how the
> manufacturer designed the "drive cage" (the one server that I've
> examined so far has them laid out in the order a casual observer
> would expect -- no surprises, there).
>
> I don't expect (nor want!) "them" to be able to bring up new boxes
> unsupervised.  There are too many little details that could have
> consequences.  E.g., any performance metrics reported for a drive
> in appliance A might differ from (that same drive!) in appliance B.

Reasonable, but it's always nice to design what would be the full
robust system, and then decide what corners to cut :-p, plus from past
experience you invariably end up at some point needing to build a box
at the same time as your attention is split fielding something else
urgent.

> > Normal use
> > - When a new sdX or wdX device is detected system determines its slot
> > mapping and uses it when talking to user
> > - If it can't determine slot mapping, it suggests a new slot mapping
> > pass (something strange has happened)
> >
> > Optional extra credit ("Where is what slot")
> > - User is instructed to apply sticky number labels next to ID devices
> > when bringing up the appliance
>
> *I* would be that "user".  I imagine eventually having a "live (remote)
> display" that  reports/summarizes the activities and status of each
> drive slot.  Presently, that takes the form of a text display that
> summarizes a single appliance on a single screen (curses).  That
> could evolve into something graphical.

Usually a big fan of html in this case - can start by spitting out a
static html page with a table and 30 second meta refresh, and extend
to some simple javascript which refreshes within page...
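
A minimal sketch of the static-page idea, in Python: one table row per slot
and a 30 second meta refresh. The column names and the (slot, device, status)
tuple layout are assumptions for illustration only.

```python
from html import escape

def render_status_page(slots, refresh_seconds=30):
    """Build a static HTML page from (slot, device, status) tuples."""
    rows = "\n".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(str(slot)), escape(device), escape(status))
        for slot, device, status in slots
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        '<meta http-equiv="refresh" content="{}">\n'
        "<title>Drive slots</title>\n"
        "</head><body>\n"
        '<table border="1">\n'
        "<tr><th>Slot</th><th>Device</th><th>Status</th></tr>\n"
        "{}\n"
        "</table>\n"
        "</body></html>\n"
    ).format(refresh_seconds, rows)

if __name__ == "__main__":
    print(render_status_page([(1, "sd0", "running"), (2, "sd4", "idle")]))
```

Writing the result to a file served by any web server is enough to start
with; the browser re-fetches it on its own.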

> > Optional extra credit ("Where is what slot and sticky labels fall off")
> > - User directed to take photo of appliance with ID devices to record
> > where the slots were & upload to web server on appliance
> > - If user is confused on slot mapping web server on appliance can show
> > mapping picture
> >
> > Optional extra credit ("Users mess with hardware/swap disks to other 
> > machines")
> > - At boot time system takes a copy of dmesg and notes the available
> > atabus/scsibus and device names
> > - If this ever changes it forces a new slot mapping pass
>
>   I do product design/development for a living, not "test fixture
> design".

We all have to start somewhere :-p

> So, I'm not too keen on embellishing this more than necessary
> (and delaying the NEXT product's delivery!)

It sounds like you have all the right ideas - we're fascinated to hear
how it goes! :)

David


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-24 Thread Don NetBSD

On 9/24/2018 4:14 AM, David Brownlee wrote:

On Mon, 24 Sep 2018 at 11:08, Don NetBSD  wrote:


On 9/18/2018 3:54 AM, David Brownlee wrote:

Just some musing about handling drive mappings:

For sd devices you could use "scsictl sdX identify" to map back from
sdX to (scsibus, target, lun) numbers and then onto each drive's
physical location.


OK.  That would help me initially identify the "slots" in order to
hard-wire them in the kernel.  I.e., stuff every slot, boot, then
"identify" each disk (having made the contents of each disk unique
enough to map to the probed devices).

Presumably, once each slot is wired down, then it need not be
populated at boot -- yet the device will still exist for it when it
later "appears".


Yes, though if you can identify the slots for hardwiring into the
kernel you could also run the same process at runtime and run a GENERIC
kernel.

I have no idea whether this would actually map to your real
requirements, but a possible workflow could be:

Bringing up new appliance ("slot mapping")
- Assuming you have "ID" devices digitally and physically labelled 1..n.
- User is directed to insert as many ID devices as they have slots,
then switch on machine
- Appliance boots, detects it has devices attached, checks to see they
are ID devices, updates slots and records its slot mappings


I would just use N different (make/model) drives for that purpose and
examine dmesg on boot:  "OK, the 500G Seagate is located in the
top left slot and that appears to have been probed as sd0.  The 320G WD
is in the slot to its right and that seems to have been probed as sd4.
etc."  As this is only done once, I can just grab any old drives and
stuff them into the machine, knowing their contents won't be altered
(unless I screw up).  Then, put them back  once I've got
the slots marked.

I am expecting this to bear some logical relationship to how the
manufacturer designed the "drive cage" (the one server that I've
examined so far has them laid out in the order a casual observer
would expect -- no surprises, there).

I don't expect (nor want!) "them" to be able to bring up new boxes
unsupervised.  There are too many little details that could have
consequences.  E.g., any performance metrics reported for a drive
in appliance A might differ from (that same drive!) in appliance B.


Normal use
- When a new sdX or wdX device is detected system determines its slot
mapping and uses it when talking to user
- If it can't determine slot mapping, it suggests a new slot mapping
pass (something strange has happened)

Optional extra credit ("Where is what slot")
- User is instructed to apply sticky number labels next to ID devices
when bringing up the appliance


*I* would be that "user".  I imagine eventually having a "live (remote)
display" that  reports/summarizes the activities and status of each
drive slot.  Presently, that takes the form of a text display that
summarizes a single appliance on a single screen (curses).  That
could evolve into something graphical.


Optional extra credit ("Where is what slot and sticky labels fall off")
- User directed to take photo of appliance with ID devices to record
where the slots were & upload to web server on appliance
- If user is confused on slot mapping web server on appliance can show
mapping picture

Optional extra credit ("Users mess with hardware/swap disks to other machines")
- At boot time system takes a copy of dmesg and notes the available
atabus/scsibus and device names
- If this ever changes it forces a new slot mapping pass


  I do product design/development for a living, not "test fixture
design".  So, I'm not too keen on embellishing this more than necessary
(and delaying the NEXT product's delivery!)


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-24 Thread David Brownlee
On Mon, 24 Sep 2018 at 11:08, Don NetBSD  wrote:
>
> On 9/18/2018 3:54 AM, David Brownlee wrote:
> > Just some musing about handling drive mappings:
> >
> > For sd devices you could use "scsictl sdX identify" to map back from
> > sdX to (scsibus, target, lun) numbers and then onto each drive's
> > physical location.
>
> OK.  That would help me initially identify the "slots" in order to
> hard-wire them in the kernel.  I.e., stuff every slot, boot, then
> "identify" each disk (having made the contents of each disk unique
> enough to map to the probed devices).
>
> Presumably, once each slot is wired down, then it need not be
> populated at boot -- yet the device will still exist for it when it
> later "appears".

Yes, though if you can identify the slots for hardwiring into the
kernel you could also run the same process at runtime and run a GENERIC
kernel.

I have no idea whether this would actually map to your real
requirements, but a possible workflow could be:

Bringing up new appliance ("slot mapping")
- Assuming you have "ID" devices digitally and physically labelled 1..n.
- User is directed to insert as many ID devices as they have slots,
then switch on machine
- Appliance boots, detects it has devices attached, checks to see they
are ID devices, updates slots and records its slot mappings

Normal use
- When a new sdX or wdX device is detected system determines its slot
mapping and uses it when talking to user
- If it can't determine slot mapping, it suggests a new slot mapping
pass (something strange has happened)

Optional extra credit ("Where is what slot")
- User is instructed to apply sticky number labels next to ID devices
when bringing up the appliance

Optional extra credit ("Where is what slot and sticky labels fall off")
- User directed to take photo of appliance with ID devices to record
where the slots were & upload to web server on appliance
- If user is confused on slot mapping web server on appliance can show
mapping picture

Optional extra credit ("Users mess with hardware/swap disks to other machines")
- At boot time system takes a copy of dmesg and notes the available
atabus/scsibus and device names
- If this ever changes it forces a new slot mapping pass
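
A sketch of the "has the hardware changed?" check, in Python. It assumes bus
attach lines in dmesg look like "scsibus0 at mpt0: ..." or "atabus0 at
ahcisata0 channel 0", and it fingerprints only the bus-to-controller wiring
(not the disks, since slots may legitimately be empty at boot). Whether disk
lines should be included too depends on the appliance.

```python
import hashlib
import re

# Assumed shape of a bus attach line; only the bus name and its parent are kept.
BUS_RE = re.compile(r"^((?:atabus|scsibus)\d+) at (\w+)")

def topology_fingerprint(dmesg_text):
    """Hash the sorted set of 'bus@parent' pairs found in the boot messages."""
    pairs = set()
    for line in dmesg_text.splitlines():
        m = BUS_RE.match(line.strip())
        if m:
            pairs.add("%s@%s" % (m.group(1), m.group(2)))
    return hashlib.sha256("\n".join(sorted(pairs)).encode("ascii")).hexdigest()

def needs_remap(dmesg_text, stored_fingerprint):
    """True when the bus wiring differs from what was recorded at slot mapping."""
    return topology_fingerprint(dmesg_text) != stored_fingerprint
```

The stored fingerprint would be written out at the end of a successful slot
mapping pass and compared on every later boot.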

David


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-24 Thread Don NetBSD

On 9/18/2018 3:54 AM, David Brownlee wrote:

Just some musing about handling drive mappings:

For sd devices you could use "scsictl sdX identify" to map back from
sdX to (scsibus, target, lun) numbers and then onto each drive's
physical location.


OK.  That would help me initially identify the "slots" in order to
hard-wire them in the kernel.  I.e., stuff every slot, boot, then
"identify" each disk (having made the contents of each disk unique
enough to map to the probed devices).

Presumably, once each slot is wired down, then it need not be
populated at boot -- yet the device will still exist for it when it
later "appears".


The drives would need to be labelled via GPT and the software set to
mount via named slices for referencing data on each drive.
It would even mean someone could pull a set of drives from one machine
to another and as long as they get the boot drive right the order of
the others is irrelevant.


There's no "other (NetBSD) machine", here.  The drives "go their separate ways"
once I've finished with them.  Think of it as a "PROM programmer" that can
handle multiple devices at the same time.  The disks are the equivalent of
the PROMs.  "Program" them and then remove them from the fixture and use
them 


For indicating which drive to pull one thought would be to quiesce all
other drives then pulse activity on the drive to pull for 5 seconds.


Quiescing the other drives would be unfortunate.  The goal is to be able
to have a more-or-less continuous process whereby drives are inserted,
processed and extracted regardless of the state/activity of the other
drives in the appliance.

An "operator" will power up the appliance and insert the drives that
need to be "processed" (they will likely NOT be the same nor require the
same processing).  He will "Start" a particular slot and then move on
to some other activity.  When informed that a slot is finished (successfully),
he will "Eject" that disk, slap a label onto it that has been printed
for it (by the same appliance), place the disk in a "completed" pile and
insert another disk in the now vacant slot.

Lather, rinse, repeat.

When the last disk is finished, the appliance will power itself down
(having logged the results of each disk in the event the power-down
occurs after the close of business).  The process will repeat on the
next day/shift.
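
The per-slot cycle described above can be sketched as a small state machine,
here in Python. The state and event names are invented for illustration, and
the power-down rule (no slot still running) is an assumption about what "the
last disk is finished" means.

```python
from enum import Enum

class SlotState(Enum):
    EMPTY = "empty"
    LOADED = "loaded"
    RUNNING = "running"
    FINISHED = "finished"

# Allowed (state, event) -> next state moves for one drive slot.
TRANSITIONS = {
    (SlotState.EMPTY, "insert"): SlotState.LOADED,
    (SlotState.LOADED, "start"): SlotState.RUNNING,
    (SlotState.RUNNING, "done"): SlotState.FINISHED,
    (SlotState.FINISHED, "eject"): SlotState.EMPTY,
}

def step(state, event):
    """Return the next state, or raise ValueError for an out-of-order event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError("event %r not allowed in state %s" % (event, state.name))

def may_power_down(states):
    """Assumed rule: power down only when no slot is still being processed."""
    return all(s is not SlotState.RUNNING for s in states)
```

Each slot advances independently, so one slot finishing or being ejected
never requires touching the others.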


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-18 Thread David Brownlee
Just some musing about handling drive mappings:

For sd devices you could use "scsictl sdX identify" to map back from
sdX to (scsibus, target, lun) numbers and then onto each drive's
physical location.
The drives would need to be labelled via GPT and the software set to
mount via named slices for referencing data on each drive.
It would even mean someone could pull a set of drives from one machine
to another and as long as they get the boot drive right the order of
the others is irrelevant.
For indicating which drive to pull one thought would be to quiesce all
other drives then pulse activity on the drive to pull for 5 seconds.

David

On Sun, 16 Sep 2018 at 20:02, Don NetBSD  wrote:
>
> On 9/16/2018 2:27 AM, Michael van Elst wrote:
> > netbsd-embed...@gmx.com (Don NetBSD) writes:
> >
> >> Ah!  So, the sd(4) driver won't pass "non-scsi" commands and
> >> the sd(4) devices might not accept scsi commands.  (damned if you
> >> do, damned if you don't)?
> >
> > The sd driver just passes some byte sequence, some are intercepted by
> > mfi, some aren't. The mfi firmware might do more before it actually
> > reaches the disk.
>
> The 2950 uses an mpt(4) -- though I suspect your points apply equally.
>
> >> I'd have to make sure I numbered the targets on each scsibus as well.
> >
> > Yes, you need to wire scsibus (if there could be more than one) and
> > you need to wire sd.
>
> So, my kernel config should contain N+1 sd entries (the N+1th for a
> wildcarded "sd?") each with specific unit numbers.  Ditto scsibus(4)'s
> as well as the controller (mpt) to which they attach.  In this way, I
> should be able to rely on the /dev mappings to specific hardware (even
> in the absence of said hardware or portions thereof).
>
> >> Yes, that's what I'm trying to guard against.  I can almost guarantee
> >> that someone will get the "bright idea" that they can hack together a
> >> second appliance -- using DIFFERENT hardware (a computer is a computer,
> >> right?) and just cloning the system disk.  Then, complain when it
> >> doesn't work as expected.
> >
> > You need to wire down scsibus to a specific controller. I.e. adding
> > another controller or replacing it with a different model makes your
> > kernel config void.
>
> That's acceptable -- it's an *appliance*, not a "general purpose computer".
> I just need to ensure that the software either knows that the configuration
> has been changed (and can complain about it) *or* lock the system (hardware
> and software) so it *can't* be changed.
>
> (sigh)  This would be *so* much easier if I just pulled product off the Line
> and tweaked the firmware!  "Want another?  Go get one (and *I* know it will
> be identical to the last!)"


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-16 Thread Don NetBSD

On 9/16/2018 2:27 AM, Michael van Elst wrote:

netbsd-embed...@gmx.com (Don NetBSD) writes:


Ah!  So, the sd(4) driver won't pass "non-scsi" commands and
the sd(4) devices might not accept scsi commands.  (damned if you
do, damned if you don't)?


The sd driver just passes some byte sequence, some are intercepted by
mfi, some aren't. The mfi firmware might do more before it actually
reaches the disk.


The 2950 uses an mpt(4) -- though I suspect your points apply equally.


I'd have to make sure I numbered the targets on each scsibus as well.


Yes, you need to wire scsibus (if there could be more than one) and
you need to wire sd.


So, my kernel config should contain N+1 sd entries (the N+1th for a
wildcarded "sd?") each with specific unit numbers.  Ditto scsibus(4)'s
as well as the controller (mpt) to which they attach.  In this way, I
should be able to rely on the /dev mappings to specific hardware (even
in the absence of said hardware or portions thereof).
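
As a sketch of what those N+1 entries might look like, here is a small Python
helper that prints them. The line syntax follows the wiring examples in the
scsi(4) man page as I understand them; the controller name, bus name and
target list are assumptions, and the controller's own attachment line is left
out.

```python
def wired_config(controller, bus, targets, lun=0):
    """Emit kernel config lines: one wired sd per target, then the wildcard."""
    lines = ["%s at %s" % (bus, controller)]
    for unit, target in enumerate(targets):
        lines.append("sd%d at %s target %d lun %d" % (unit, bus, target, lun))
    # The (N+1)th entry: anything that is not wired down still attaches.
    lines.append("sd* at scsibus? target ? lun ?")
    return "\n".join(lines)

if __name__ == "__main__":
    print(wired_config("mpt0", "scsibus0", [0, 1, 2, 3]))
```

Which target number belongs to which physical slot still has to come from
observation; the helper only saves typing.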


Yes, that's what I'm trying to guard against.  I can almost guarantee
that someone will get the "bright idea" that they can hack together a
second appliance -- using DIFFERENT hardware (a computer is a computer,
right?) and just cloning the system disk.  Then, complain when it
doesn't work as expected.


You need to wire down scsibus to a specific controller. I.e. adding
another controller or replacing it with a different model makes your
kernel config void.


That's acceptable -- it's an *appliance*, not a "general purpose computer".
I just need to ensure that the software either knows that the configuration
has been changed (and can complain about it) *or* lock the system (hardware
and software) so it *can't* be changed.

(sigh)  This would be *so* much easier if I just pulled product off the Line
and tweaked the firmware!  "Want another?  Go get one (and *I* know it will
be identical to the last!)"


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-16 Thread Michael van Elst
netbsd-embed...@gmx.com (Don NetBSD) writes:

>Ah!  So, the sd(4) driver won't pass "non-scsi" commands and
>the sd(4) devices might not accept scsi commands.  (damned if you
>do, damned if you don't)?

The sd driver just passes some byte sequence, some are intercepted by
mfi, some aren't. The mfi firmware might do more before it actually
reaches the disk.

>I'd have to make sure I numbered the targets on each scsibus as well.

Yes, you need to wire scsibus (if there could be more than one) and
you need to wire sd.


>Yes, that's what I'm trying to guard against.  I can almost guarantee
>that someone will get the "bright idea" that they can hack together a
>second appliance -- using DIFFERENT hardware (a computer is a computer,
>right?) and just cloning the system disk.  Then, complain when it
>doesn't work as expected.

You need to wire down scsibus to a specific controller. I.e. adding
another controller or replacing it with a different model makes your
kernel config void.

-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-16 Thread Don NetBSD

On 9/16/2018 12:21 AM, Michael van Elst wrote:

netbsd-embed...@gmx.com (Don NetBSD) writes:


But can't I walk back up the device tree and find the number of leaves
on a particular (physical) controller?


You could find out from the config file how many disks you have wired.
Still unrelated to real hardware.


That gives me a number for the "MAX_DRIVES_SUPPORTED" manifest constant
in my code (I have to put SOME limit on how many devices can be handled).
Said a more practical way, I can make sure I build a kernel that handles
at least as many devices as my code will handle (silly for my code to
handle more than the kernel supports)


SATA would be wd(4), not sd(4).

SATA on a SAS controller appear as sd(4) devices on scsibus's.
Not sure I have the option to attach them to atabus, instead
(nor why I would want to do that)


If the driver presents the disks as sd(4) devices, you get some
virtual unit that just happens to be usable to access the
physical one.

For example, the mfi(4) driver would present the disks to you as
sd(4). However, such an sd device would emulate the basic read/write
commands. It also passes through other commands, but your SATA devices
would have trouble understanding SCSI commands.


Ah!  So, the sd(4) driver won't pass "non-scsi" commands and
the sd(4) devices might not accept scsi commands.  (damned if you
do, damned if you don't)?


But, I would have to rely on empirical observation to know which device
is which?


Yes. Fortunately we still enumerate devices serially, so the numbers don't
change.


I'd have to make sure I numbered the targets on each scsibus as well.
And, size each to handle the largest shelf that might be attached to it.
So, if I have a 15 drive shelf today, I'd number those slots 1-15.
The second shelf, 16-32.  Etc.

If, thereafter, the first shelf was replaced with an 8 drive shelf, then
devices 9-15 would just disappear -- 16 would continue to be the first
device on the second shelf.
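
The numbering scheme can be sketched in Python as a fixed block of slot
numbers reserved per shelf. The block size of 15 is taken from the 15-drive
example; shelves are counted from 0 and bays from 1, both arbitrary choices
for the sketch.

```python
def slot_number(shelf, bay, block_size=15):
    """Global slot number for a bay, with block_size numbers reserved per shelf."""
    if not 1 <= bay <= block_size:
        raise ValueError("bay %d outside reserved block of %d" % (bay, block_size))
    return shelf * block_size + bay

def present_slots(shelf_sizes, block_size=15):
    """Slot numbers that actually exist, given the bay count of each shelf."""
    return [
        slot_number(shelf, bay, block_size)
        for shelf, size in enumerate(shelf_sizes)
        for bay in range(1, size + 1)
    ]
```

With an 8-bay first shelf, `present_slots([8, 15])` simply has a gap after 8;
the first bay of the second shelf is still 16.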


The kernel configuration of course will be specific to your machine then.
If you replace hardware you might need a new configuration.


Yes, that's what I'm trying to guard against.  I can almost guarantee
that someone will get the "bright idea" that they can hack together a
second appliance -- using DIFFERENT hardware (a computer is a computer,
right?) and just cloning the system disk.  Then, complain when it
doesn't work as expected.

The only ways I can think of to hard-wire the software (and kernel config)
to the machine is to examine the MAC(s) in the machine and compare against
hard-coded values.  Or, PXE serve *a* kernel based on the MAC of the
client requesting it.  (this latter lets me painlessly address future
needs assuming the MACs are immutable)
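
A sketch of the MAC comparison, in Python. It assumes the interface listing
(e.g. saved ifconfig output) contains addresses in the usual
aa:bb:cc:dd:ee:ff form, and it requires every hard-coded address to be
present. The sample addresses are made up.

```python
import re

# Six colon-separated hex pairs; may need tightening if other fields look similar.
MAC_RE = re.compile(r"\b([0-9a-f]{2}(?::[0-9a-f]{2}){5})\b", re.IGNORECASE)

def macs_in(ifconfig_text):
    """Return the set of MAC-looking strings in the text, lower-cased."""
    return {m.lower() for m in MAC_RE.findall(ifconfig_text)}

def is_expected_machine(ifconfig_text, expected):
    """True only if every expected MAC is present on this machine."""
    return {e.lower() for e in expected} <= macs_in(ifconfig_text)
```

This only detects a cloned system disk on different hardware; it does nothing
against someone who deliberately changes a MAC.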


How can I configure a kernel to support a very large number of
(wired down) drives even if the hardware to support those drives
isn't present (I'm thinking about the case of having a couple
of disk shelfs which may/may not be present at any given time)?


Disk shelfs are irrelevant, controllers, channels, target and
lun ids are. The scsi and ata manpages give some examples about
possible kernel configurations to wire down disks.



The shelfs are relevant because they can be "removed"  in much
the same way that a drive can be removed.


Simple passive shelfs aren't even visible. It's like the disks in that
shelf are dead if the shelf is removed. If you wire down scsibus to
specific controllers/ports and sd devices to specific scsibusses and
target/lun ids, nothing will change when a shelf is removed.


The key, there, is to wire down the scsibus and ensure the SAS cables
aren't swapped/misplugged.  I'm obviously trying to avoid all the
potential screwups that can happen after I release the fixture to
Manufacturing.

OK, I guess I'll drag out a few shelfs and start poking at them
to see what they reveal.  Or, maybe wait until I have access to
the "real" kit so I don't make assumptions based on MY hardware
that prove to be incompatible with the boss's stuff.

Thanks!


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-16 Thread Michael van Elst
netbsd-embed...@gmx.com (Don NetBSD) writes:

>But can't I walk back up the device tree and find the number of leaves
>on a particular (physical) controller?

You could find out from the config file how many disks you have wired.
Still unrelated to real hardware.


>The backplane on the machine I'm currently using has no ses device
>probed.  Yet, the kernel seems to know that there are 4 drives installed
>(*IF* they are present when the machine is booted!)

It only knows that 4 targets responded.


>> SATA would be wd(4), not sd(4).
>SATA on a SAS controller appear as sd(4) devices on scsibus's.
>Not sure I have the option to attach them to atabus, instead
>(nor why I would want to do that)

If the driver presents the disks as sd(4) devices, you get some
virtual unit that just happens to be usable to access the
physical one.

For example, the mfi(4) driver would present the disks to you as
sd(4). However, such an sd device would emulate the basic read/write
commands. It also passes through other commands, but your SATA devices
would have trouble understanding SCSI commands.


>But, I would have to rely on empirical observation to know which device
>is which?

Yes. Fortunately we still enumerate devices serially, so the numbers don't
change.

The kernel configuration of course will be specific to your machine then.
If you replace hardware you might need a new configuration.


>>> How can I configure a kernel to support a very large number of
>>> (wired down) drives even if the hardware to support those drives
>>> isn't present (I'm thinking about the case of having a couple
>>> of disk shelfs which may/may not be present at any given time)?
>> 
>> Disk shelfs are irrelevant, controllers, channels, target and
>> lun ids are. The scsi and ata manpages give some examples about
>> possible kernel configurations to wire down disks.

>The shelfs are relevant because they can be "removed"  in much
>the same way that a drive can be removed.

Simple passive shelfs aren't even visible. It's like the disks in that
shelf are dead if the shelf is removed. If you wire down scsibus to
specific controllers/ports and sd devices to specific scsibusses and
target/lun ids, nothing will change when a shelf is removed.

>Man pages indicate syntax for wiring down but give no guidance as
>to how (other than empirical) to figure out which is which.  E.g.,
>for PATA, you knew MASTER vs. SLAVE.  Does driver probe the controller
>in the order the motherboard manufacturer has defined "SATA connections"?
>(and, how might that relate to a backplane where there is nothing
>besides OCD to force the slots to appear in a particular order)

There is no information about that and your SAS controller might even
hide the details.

-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-15 Thread Don NetBSD

On 9/15/2018 11:27 PM, Michael van Elst wrote:

netbsd-embed...@gmx.com (Don NetBSD) writes:


How can I determine the number of /potential/ disk devices (sd(4))
that a system MIGHT support -- *if* the drives had been installed
prior to boot?


That would be difficult. sd(4) is used for several different kinds
of disks, including virtual ones and SCSI is a bus. You MIGHT
install hundreds of sd devices but that's unrelated to e.g. how
many drive slots the machine has.


But can't I walk back up the device tree and find the number of leaves
on a particular (physical) controller?  Note that I control the
kernel configuration so devices can't exist where I don't expect
them...  (e.g., remove the USB devices from the configuration
file and the possibility of a mass device being inserted goes away).


E.g., if I have a 15 slot backplane but only have
a drive installed in slot 13, then *that* appears as sd0 and there
is no mention of the potential for the other 14 drives.


A backplane might support a ses(4) enclosure device that could
be queried.


The backplane on the machine I'm currently using has no ses device
probed.  Yet, the kernel seems to know that there are 4 drives installed
(*IF* they are present when the machine is booted!)


A driver for a multiport controller usually knows how many ports
are available. But that's not exposed, and in case of a bus
topology, you still wouldn't know what is possible.


Presumably, I can wire down each sd(4) device to correspond to a
particular "slot" (SATA port) in the machine when I build a kernel
with that in mind.


SATA would be wd(4), not sd(4).


SATA drives on a SAS controller appear as sd(4) devices on scsibus
instances.  I'm not sure I have the option to attach them to atabus
instead (nor why I would want to do that).


[This allows software to KNOW that sd0 is "the drive in the top
left slot" even if there is no drive present there when the machine
boots]


You could create a custom kernel that wires drive units to specific
locations. You may also need to wire the 'scsibus' instances.


But, I would have to rely on empirical observation to know which device
is which?


For SATA that would be wd devices and atabus instances.

USB might be an issue. You may need to remove the umass driver so
that no SCSI or ATA instances can attach.


Exactly.  The issue then becomes one of ensuring a particular slot/bay
in a particular shelf maps to a particular /dev/sd*.  I'd have to
label the slots, label the shelves, label the SAS cables (and which
connectors they attach to).  But, so long as no one swaps cables,
things should stay as intended.

If a shelf is not powered on at boot, then I'd have to wire down the
associated controller/scsibus and make provisions to reprobe it when
it comes online.
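
One possible way to reprobe a bus once a shelf has been powered up is
scsictl(8), whose "scan" command takes a target and a lun and accepts
"any" as a wildcard. The sketch below only builds and, on request, runs
that command; the helper names rescan_command and rescan are hypothetical,
and whether a late scan attaches the drives at their wired-down units
should be verified on the actual hardware.

```python
import subprocess


def rescan_command(bus, target="any", lun="any"):
    """Build the scsictl(8) argument list that rescans one scsibus."""
    return ["scsictl", f"scsibus{bus}", "scan", str(target), str(lun)]


def rescan(bus):
    """Run the rescan; needs root and a NetBSD system with scsictl."""
    return subprocess.run(rescan_command(bus), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe anywhere.
    print(" ".join(rescan_command(1)))
```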


How can I configure a kernel to support a very large number of
(wired down) drives even if the hardware to support those drives
isn't present (I'm thinking about the case of having a couple
of disk shelves which may/may not be present at any given time)?


Disk shelves are irrelevant; controllers, channels, and target and
lun ids are what matter. The scsi and ata manpages give some examples
of possible kernel configurations to wire down disks.


The shelves are relevant because they can be "removed"  in much
the same way that a drive can be removed.  Not planning for that
possibility means an operator may be receiving directions regarding
"remove drive 5" when, in fact, the shelf in which "5" is installed
happens to have labeled it as "20".
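
To make the labeling problem concrete: a small table keyed by the
wired-down unit name can carry the shelf name and the label printed on
the bay, so that directions to an operator use the physical label rather
than the global drive number. The shelf names, bay counts and helper
names below are purely illustrative assumptions.

```python
def build_slot_map(shelves, first_unit=0):
    """Map sd unit names to (shelf, bay label) pairs.

    shelves: list of (shelf_name, [bay labels in unit order]) tuples.
    The ordering of bays within a shelf is assumed, not discovered.
    """
    slot_map = {}
    unit = first_unit
    for shelf_name, bay_labels in shelves:
        for label in bay_labels:
            slot_map[f"sd{unit}"] = (shelf_name, label)
            unit += 1
    return slot_map


def removal_instruction(unit, slot_map):
    """Phrase a removal request in terms of what the operator can see."""
    shelf, label = slot_map[unit]
    return f"remove the drive labeled {label} in {shelf} ({unit})"


if __name__ == "__main__":
    # Hypothetical: a 4-bay chassis plus one shelf whose bays start at 17.
    shelves = [("chassis", ["1", "2", "3", "4"]),
               ("shelf A", ["17", "18", "19", "20"])]
    print(removal_instruction("sd7", build_slot_map(shelves)))
```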

Man pages indicate syntax for wiring down but give no guidance as
to how (other than empirical) to figure out which is which.  E.g.,
for PATA, you knew MASTER vs. SLAVE.  Does the driver probe the controller
in the order the motherboard manufacturer has defined "SATA connections"?
(and, how might that relate to a backplane where there is nothing
besides OCD to force the slots to appear in a particular order)
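
The empirical approach can at least be partly mechanized. The sketch below
pulls the "sdN at scsibusM target T lun L" attachment lines out of dmesg
text, so that after booting with one known drive per slot the
bus/target/lun for each slot can be read off and fed into a wired-down
kernel config. The sample text is invented for illustration (the vendor
and model strings are placeholders, not output from a real machine), and
the line format is assumed to match what the kernel prints when an sd
device attaches.

```python
import re

# Matches e.g. "sd0 at scsibus0 target 3 lun 0: <VENDOR, MODEL, REV> ..."
SD_ATTACH = re.compile(
    r"^(sd\d+) at (scsibus\d+) target (\d+) lun (\d+): <([^>]*)>"
)


def parse_attachments(dmesg_text):
    """Return {unit: (bus, target, lun, inquiry string)} for sd devices."""
    found = {}
    for line in dmesg_text.splitlines():
        match = SD_ATTACH.match(line)
        if match:
            unit, bus, target, lun, inquiry = match.groups()
            found[unit] = (bus, int(target), int(lun), inquiry.strip())
    return found


# Illustrative input only; not captured from real hardware.
SAMPLE = """\
scsibus0 at mpt0: 16 targets, 8 luns per target
sd0 at scsibus0 target 3 lun 0: <VENDOR-A, MODEL-500G, 0001> disk fixed
sd0: placeholder geometry line
sd1 at scsibus0 target 5 lun 0: <VENDOR-B, MODEL-320G, 0002> disk fixed
"""

if __name__ == "__main__":
    for unit, where in sorted(parse_attachments(SAMPLE).items()):
        print(unit, where)
```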


Re: Sizing hardware drive capabilities (in the absence of probed devices)

2018-09-15 Thread Michael van Elst
netbsd-embed...@gmx.com (Don NetBSD) writes:

>How can I determine the number of /potential/ disk devices (sd(4))
>that a system MIGHT support -- *if* the drives had been installed
>prior to boot?

That would be difficult. sd(4) is used for several different kinds
of disks, including virtual ones, and SCSI is a bus. You MIGHT
install hundreds of sd devices but that's unrelated to e.g. how
many drive slots the machine has.


> E.g., if I have a 15 slot backplane but only have
>a drive installed in slot 13, then *that* appears as sd0 and there
>is no mention of the potential for the other 14 drives.

A backplane might support a ses(4) enclosure device that could
be queried.

A driver for a multiport controller usually knows how many ports
are available. But that's not exposed, and in case of a bus
topology, you still wouldn't know what is possible.



>Presumably, I can wire down each sd(4) device to correspond to a
>particular "slot" (SATA port) in the machine when I build a kernel
>with that in mind.

SATA would be wd(4), not sd(4).



>[This allows software to KNOW that sd0 is "the drive in the top
>left slot" even if there is no drive present there when the machine
>boots]

You could create a custom kernel that wires drive units to specific
locations. You may also need to wire the 'scsibus' instances.
For SATA that would be wd devices and atabus instances.

USB might be an issue. You may need to remove the umass driver so
that no SCSI or ATA instances can attach.


>How can I configure a kernel to support a very large number of
>(wired down) drives even if the hardware to support those drives
>isn't present (I'm thinking about the case of having a couple
>of disk shelves which may/may not be present at any given time)?

Disk shelves are irrelevant; controllers, channels, and target and
lun ids are what matter. The scsi and ata manpages give some examples
of possible kernel configurations to wire down disks.

-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."