Hi Dan,

On 31.05.2012 [04:56:58 +0000], DeFolo, Daniel wrote:
> Nish and Lucas,
>
> I and a few members of my team have a lot of experience doing HW
> attrib style test scheduling, so please feel free to bring me and
> Chris Nelson into future detailed discussions for the scheduling
> part of this discussion. We are both pretty new to Linux still, so
> you probably know better than us how to get the attrib info, but we
> might have some ideas in the space of adding some tables and
> implementing queries to schedule jobs in this way.
First off -- thanks a ton for your very in-depth replies!

> Just FYI, we called our prior solution constraints-based scheduling
> (as in HW and SW constraints being options) and had support for
> constraints being both hard and soft. The idea was that a hard
> constraint was similar to the autotest "only if needed" labels in that
> if a job didn't request a hard constraint explicitly it wouldn't be
> considered to match the host. I think I like your terminology (only
> if needed) better, and think it could equally be applied to HW
> attribute scheduling as to label-based scheduling. That said, this
> would need to be a per-host & per-attribute setting, and not applied
> to every host that has the attribute (as opposed to how "only if
> needed" is currently a label-specific attribute that would apply to
> every host that uses that label). Does that make sense?

Just to be clear, "only if needed" is the setting you mean? I've not
actually played much with labels. I'll need to look at the code and
think over what you wrote more.

> See below for more comments.
>
> -Dan DeFolo
>
> > -----Original Message-----
> > From: [email protected] [mailto:autotest-
> > [email protected]] On Behalf Of Nishanth Aravamudan
> > Sent: Wednesday, May 16, 2012 2:51 PM
> > To: [email protected]
> > Cc: [email protected]
> > Subject: [Autotest] [RFC] adding hardware inventory to autotest
> >
> > Hi everyone,
> >
> > Lucas and I were discussing a new feature that I think would be
> > very useful for Autotest: hardware inventory.
> >
> > With the upcoming/ongoing feature to add tighter Cobbler support to
> > Autotest I, at least, have a need to know what kinds of machines are
> > available to run jobs on. I want to know things like: CPU
> > information, amount of memory, PCI devices, etc.
> >
> > - My initial proposal is to use lshw to acquire information from
> >   running systems.
> > - An alternative would be smolt.
> >   I've found some issues with the
> >   distribution versions of smolt on ppc64, at least.
> > - In theory, we could have a pluggable infrastructure, much like the
> >   install_server support itself, and the administrator could specify
> >   how to obtain the information.
>
> That makes sense to me. One thing to consider is subdividing the
> inventory commands by category, as makes sense to the average user,
> and just returning back a dictionary of attributes. For example, if
> the inventory interface command were called hwprobe, I could imagine
> calling something like this to get CPU info:

Yeah, I think returning dictionaries is the most sensible option. I
like the idea of breaking up the information (as it might come from
multiple locations) by type.

> hwprobe -t cpu
>
> { 'cpu_model': 'Intel(R) Xeon(R) CPU X5550',
>   'cpu_speed': '2.27GHz',
>   'cpu_cores': 16,
>   ....
> }
>
> Admins could add new categories of things to probe as well as control
> where the information for each category of probe should come from.
>
> Also, there should be some standard default value to report (e.g.
> Unknown, "", None, ??, etc.) when the architecture-specific probing
> logic isn't able to come up with a value for an expected field.

Agreed.

> Finally, there should be some normalization logic for the size and
> speed attributes so they are all stored using standard units. For
> example, convert all CPU speeds to GHz before storing, all memory
> sizes to either MB or GB (probably MB), all disk sizes to GB, etc.
> This is the one thing that perhaps shouldn't completely be up to
> admins. At most, before recording their values into the DB they
> should be normalized. I'll comment more on this later.

Yeah, I agree on normalization. lshw, for instance, stores the units
in the XML it produces for various fields.
> > - Adding hardware information immediately requires us to extend the
> >   information stored in a Machine object
> >   - My current list includes (sources in parentheses):
> >     Machine model (dmidecode, lsvpd)
> >     CPU version (/proc/cpuinfo)
> >     CPU speed (/proc/cpuinfo)
> >     Memory installed (/sys/devices/system/memory or /proc/meminfo)
> >     Swap size (/proc/meminfo, /proc/swaps?)
> >     Serial number (dmidecode, lsvpd)
> >     CPU topology (# of cores, # of hardware threads; sysfs, /proc/cpuinfo)
> >     PCI devices (lspci)
> >     KVM capable (/proc/cpuinfo)
> >     CPU flags (/proc/cpuinfo)
> >     Timestamp of last inventory
> >     Version of last inventory tool used
> >     BIOS/system firmware level
>
> I think all of the above sounds like a great start for the core HW
> info to gather.

Good.

> I would also add the following to the list of things to gather (I'm
> still trying to map the hp-ux way of doing this to Linux, so if
> anything below sounds a bit off, bring me back in line!):
> * basic disk device attribs (type, size, device, pci/hw path) mapped
>   to a unique ID that won't change with each OS install (lsscsi,
>   /proc/scsi)
> * SAN disk attribs (WWID, transport) (lsscsi -t, ??)

I think these two are good to grab, you're right.

> * admin attribs for disks - is the disk available for anyone to use
>   (scratch) vs. reserved with no specific purpose (don't touch) vs.
>   reserved for a specific purpose (swap, dump, reserved for VM disk
>   images, etc.)

I think this points to an issue outside of inventory altogether. With
the Cobbler integration work, machines can be reinstalled on every job
-- and there is no way to indicate that a given disk should not be
used for the reinstallation -- unless the Cobbler administrator
configures this. It also becomes tricky (I think) to enforce these
attributes w/in a reservation job, for instance. Not to say they
aren't useful, but I don't think they will be part of my first cut.

> * CPU hyperthreading - on/off (??)
CPU HT should get covered by the topology above -- that is, if there
are 0 hardware threads present, then it's off.

> > - Reading through this list, I imagine the implementation becomes
> >   an InventoryInterface, with per-architecture implementations of
> >   how to actually obtain the data.
> > - Additionally, one can imagine end-users may have site-specific,
> >   internal or controlled extra data they would like to be searchable
> >   & storable per-machine. I think it makes sense to have an
> >   additional, admin-interface defined table to store a list of labels
> >   and how to get the data that should be stored with that label. We
> >   would then store a hash of the obtained information in the
> >   inventory job.
>
> This hash would have all the HW data, right? Not just the stuff the
> admin added (so default + admin added).

Yep. Well, what I'm proposing is that we store the "built-in" data in
columns, but the admin-configured extensions (whatever they may be) in
a hash in its own column.

> I think there is value in doing the inventory for each job and
> actually storing that information in the job regardless of whether
> the job was scheduled through classic mechanisms (hostname or label
> matching) or HW attrib comparative scheduling (discussed below).
> Saving the HW info in the job is important, as all we can confirm
> right now is which host the job ran on. If the host is changing over
> time (having memory, IO cards, storage, etc. added/removed), having
> that data stored and available in queryable form would be valuable.
> I recognize much of this info is perhaps available in the job sysinfo
> files, but if DB tables are going to be added to make these attribs
> available for scheduling, I would vote to make them available as job
> attribs as well. I think this is what you were after with "store a
> hash of obtained information in the inventory job", but I wasn't
> clear if there would be an inventory job linked to each test job.
I think it makes sense to have inventory run before (or as part of)
each job, as you mention -- and as a tester, it is a good sanity check
that the machine I'm using was the right one, etc.

> NOTE: one way to handle this would be to potentially revision host
> entries in the inventory part of the DB, so that host +
> inventory_revision would be a unique snapshot of these attribs, and
> that would be all you need to store in each test job.
>
> > - Have a contrib script appropriate for
> >   /var/lib/cobbler/triggers/add/system/ to automatically run an
> >   inventory job on system add.
>
> Would we additionally want to trigger inventory calls each time a job
> runs on a host (during the VERIFY state)? It seems like you are
> proposing this in the next bullet below, but I didn't see it
> explicitly listed.
>
> > - Extend the create job UI to allow searching by hardware
> >   information. This requires seeing what once satisfied the request,
> >   re-running inventory on that host, and if the required data is
> >   still present, running the job.
>
> In general it sounds like your process is in alignment with what I
> was thinking of, but having implemented a similar HW attribute
> scheduling system before, I know this is getting to a potentially
> complicated and performance-sensitive part of this discussion. You
> need scheduling to be fast to make the queuing efficient and to make
> reports like I may someday want to add (estimated completion time
> reports using scheduling simulations + historic test duration data)
> possible. Further, I want to make sure that the HW attribute
> comparison is actually not finalized until the end of the process you
> describe above. For example, if after the first evaluation the
> scheduler picks host 1, and host 1 doesn't still match after the
> re-inventory, the scheduler should just treat that like any other
> failed host verify (repeat the comparison and perhaps come up with
> host 3, and then try that).

I think that makes sense to me.
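Just to check my understanding of that delayed finalization, a toy
sketch of the flow (all of the names here are made up for illustration
-- none of this is existing Autotest or scheduler code):

```python
# Illustrative sketch only: none of these names exist in Autotest.
# The HW attribute match is re-checked at verify time, and a stale
# match is treated like any other failed host verify.

def schedule_hw_job(job, hosts, inventory_of, matches):
    """Try candidate hosts until one still matches after re-inventory.

    job          -- object with a 'constraints' attribute
    hosts        -- candidates, best first (chosen from possibly stale DB data)
    inventory_of -- callable(host) -> fresh attribute dict (re-inventory)
    matches      -- callable(constraints, attrs) -> bool
    """
    for host in hosts:
        fresh = inventory_of(host)          # runs during VERIFY
        if matches(job.constraints, fresh):
            return host                     # finalize the match only now
        # stale match: treat like a failed verify and try the next host
    return None                             # job stays queued (or starves)
```

So the first-pass pick from the DB is only ever a guess, and the
verify-time re-inventory is what actually commits the job to a host.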
I also tried to write my list in order of completion (and likelihood
of completing :)

> In terms of implementation comments...
> The first issue is with units. If a user asks for memory > 3030MB
> and you are storing your inventory info for memory as a string using
> a mish-mash of units (perhaps because different architectures report
> different units), then it becomes increasingly hard to translate the
> requested criteria for the job into a basic DB query. You first have
> to get memory for all machines, then convert it to their units, then
> do the comparison. As per the above, I suggest avoiding this by
> standardizing the units for the inventory interface (so you know what
> unit is in the DB for every field with a numeric value) and then
> doing a conversion from the user's requested unit to the DB-native
> unit before you do the scheduling query.

Agreed on standard units being stored.

> The second issue you have covered above in terms of needing to do a
> re-inventory. Since we can't re-inventory your entire HW pool each
> time a new job request comes in, you are right that doing a
> first-pass guess and then re-checking in the verify stage is the
> right thing to do. If we do the inventory with every job or, at
> most, after every re-install, it should be pretty accurate and there
> will be very few cases of that first guess not being a good one.
>
> In this model of delayed HW attribute scheduling, the
> hw_attribute_scheduler module (think of the metahost scheduler
> module) needs logic to handle at least 4 scenarios during each
> scheduler tick (or, if performance is slow, every X ticks):
>
> 1) queued - the job still matches a host in a normal state. If so,
>    just wait until one of those hosts is Ready.
> 2) ready - a matching host becomes "ready"; queue it up to run just
>    like a normal job (assuming the re-inventory will put the job back
>    in queued if it no longer matches)
> 3) starved - the job doesn't match anything but failed hosts (which
>    shouldn't be touched by HW attrib scheduled jobs in my mind) or
>    reserved hosts (hopefully coming soon as per issue #360). In this
>    case, it might be a long time before the job starts, so an email
>    to admins or the user might be in order
> 4) orphaned - the job doesn't match any hosts at all - probably
>    handle the same as starved, but just call out the difference
>    (starved may eventually run; orphaned is unlikely to ever run w/o
>    intervention).
>
> NOTE: Starved and orphaned jobs don't necessarily require immediate
> action in the scheduler (e.g. aborting the job) as, depending on the
> use case and the attribute that is starved on, it might be something
> an admin is staffed to keep an eye on and fix (perhaps swapping IO
> cards, configuring storage, etc). It might also just clear up after
> the next inventory refresh, as perhaps a system was temporarily
> modified for a special job and will be put back to its normal state
> at the end of it. That said, the job status should clearly show
> starved and orphaned jobs with a new status value (not just queued)
> and somehow bring attention to the fact that there are some jobs
> waiting on HW attribs X, Y, Z.
>
> > This becomes necessary in particular for the case of
> > PCI devices being removed, but also could come about when hostnames
> > get recycled (for instance) or hardware is upgraded.
>
> Also, it is necessary for unplanned things like memory dying, disks
> dying, etc. I'm not proposing there be a sanity check for HW yet
> (e.g.
> during the equivalent of the verify stage, my old automation would
> avoid using hosts whose HW attributes unexpectedly changed and ask an
> admin to review the host), as that can create a lot of overhead
> depending on how many hosts there are and how volatile your HW is,
> but there need to be allowances for HW attribs frequently changing.

Great feedback! So I'm thinking of starting small, getting an RFC out
there and then building up from there:

1) Basic hardware inventory (using the first list from above), stored
   in database columns per-host
   * using standardized units (which will probably be part of the
     column name for clarity)
   * statically display the information in host details
2) Add ability for the admin to extend the inventory
   * define fields and collection mechanism
3) Add ability to search for a host by hardware information
   * but not handle the scheduling side of things, so there is still
     manual work needing to be done by the user to get the machines
     they want
4) Hardware-aware scheduling, essentially meta-host scheduling but
   with hardware constraints

Thanks,
Nish

--
Nishanth Aravamudan <[email protected]>
IBM Linux Technology Center

_______________________________________________
Autotest mailing list
[email protected]
http://test.kernel.org/cgi-bin/mailman/listinfo/autotest
