On Fri, May 8, 2009 at 9:20 AM, Sarah Jelinek <Sarah.Jelinek at sun.com> wrote:
> Hi Mike,
>
> Thank you for this data! I do have some comments inline..
>>
>> I noticed in the AI Client Redesign Meeting Notes[1]:
>>
>> Then there was a discussion about Derived Profiles. The outcome was
>> to gather requirements around the following:
>>
>> | - What does deriving mean? That is, what aspects of the profile may
>> |   be derived? What problems are we trying to solve here?
>> | - Who derives a profile? Some clients or all the clients?
>> | - Should the client support substitution of certain fields in the
>> |   AI manifest? If yes, what problem will that solve?
>> | - How does this impact the criteria selection on the AI server?
>>
>> Currently I use derived profiles to do the following:
>>
>> - Customize partitioning based upon server model, disk size,
>>   memory size, etc.
>>
>
> Can you be specific about the criteria you feel are requirements? What I
> mean by criteria is what things do you believe must be included so that the
> client can probe and effectively create the correct derived profile?

Things that are in my current and/or begin scripts that derive the
profile include:

- Always create / and alt-/ of the same size and on the same disks
- If enough disks are available, mirror everything
- If the disks are big enough, create / and alt-/ as X GB, else X/2 GB
- If running Solaris 10 or later, use leftover space for soft partitions.
- If running Solaris 9 or earlier, mount leftover space at /local
- If running on V240, V440, T2000, etc., root gets mirrored across
disks on the same controller
- If running on a 6800, 15K, 25K, etc., find the two JBODs that are
attached and mirror across them.
- If running on a Thumper, be sure to mirror across the devices that
the BIOS has access to
- If special device aliases (e.g. jsroot1, jsroot2) are found by
probing OBP, find the disks associated with them and install there
instead of using the rules above.
- Determine what site I am in (based on IP) and download the flash
archive from there
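
A minimal sketch of the sizing and mirroring rules above, not the actual
begin script. The function name, the 16 GB value for X, the 72 GB "big
enough" threshold, and treating two disks as "enough" to mirror are all
assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    root_gb: int      # size of / (alt-/ always gets the same size)
    mirrored: bool    # True when there are enough disks to mirror everything
    leftover: str     # what to do with the space left after / and alt-/


def derive_layout(disk_sizes_gb, solaris_minor,
                  full_root_gb=16, big_disk_gb=72):
    """Apply the size/mirror/leftover rules to a list of disk sizes.

    full_root_gb (X) and big_disk_gb are hypothetical site defaults.
    solaris_minor is 9 for Solaris 9, 10 for Solaris 10, and so on.
    """
    if not disk_sizes_gb:
        raise ValueError("no candidate disks")
    # Size against the smallest disk so every disk can hold / and alt-/.
    smallest = min(disk_sizes_gb)
    root_gb = full_root_gb if smallest >= big_disk_gb else full_root_gb // 2
    # Assumption: two or more disks counts as "enough" to mirror.
    mirrored = len(disk_sizes_gb) >= 2
    # Solaris 10 or later: soft partitions; Solaris 9 or earlier: /local.
    leftover = "soft-partitions" if solaris_minor >= 10 else "/local"
    return Layout(root_gb, mirrored, leftover)
```

The model-specific rules (controllers, JBODs, OBP aliases) are left out
here because they depend on probing real hardware.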

Translated into the new way, this probably means:

- Have the ability for the sysadmin - not the tool - to select which
disks to install onto.
- Provide a means that is flexible enough that disk selection can be
done by physical path, as I do above with jsroot*.  This is important
because the controller number can vary based on which PCI or PCIe
cards are installed.  I would hate to install Solaris onto SAN disks
(overwriting application data) when I meant to write to local disk.
- Have the ability to specify the size of rpool, which may be smaller
than a single disk.
- Have the ability to specify where other zpools should reside
- Have the ability to tune the size and possibly location of swap &
dump.  That is, a system with small drives (old or SSD) might put swap
& dump in a separate pool - or may decide to use SVM for swap & dump
because ZFS increases the space requirements for them.
- Specify which mirror(s) or repository(s) to install from, based on
locally defined location rules.
- Specify proxy based on locally defined location rules.

>>
>> - Select the appropriate flash archive based on server model
>>   (primarily sun4u vs. sun4v vs. i86pc)
>>
>> I have a lot of logic in finish scripts (JASS) and third-party system
>> management tools that does various other things based upon location
>> (derived from IP address), OS revision, and other criteria that are very
>> hard or impossible to acquire automatically.  Arguably, the bulk of JASS
>> is legacy baggage with secure by default.
>>
>> As I look forward, I would like to derive profiles that:
>>
>> - Lay out storage properly.  The definition of "properly" will likely
>>   depend on criteria that don't work for everyone.  That is, at
>>   MyCo we may boot from local disk and want two compressed mirrors.  At
>>   YourCo "properly" means to use the lowest numbered LUN presented via
>>   iSCSI from storage array X.
>> - Select software to install based on somewhat arbitrary rules.  That is,
>>   at site X I need the omniback package and at site Y I need netbackup.  If
>>   it's the primary ldom of a sun4v box, install LDoms Manager 2.4.
>>
>
> What types of data would drive the rules for the software choices?

The primary IP address along with a populated netmasks file and some
home-brew logic drives site identification.

Probing OBP for aliases (prtpicl -c aliases -v) is great for
system-specific overrides.

Querying the network for which subnets are available shows some
promise (e.g. snooping for EIGRP packets and seeing "VLAN#50
10.0.50.0 - 224.0.0.10", or eventually LLDP) for making decisions.

>>
>> - Select repository (or mirror) based on location such that I don't
>>   install across the Atlantic if I have a closer copy.
>> - Select repository based on location (lab installs experimental bits)
>> - Require production servers to have packages signed by the OS vendor or
>>   by internal QA.  That is, make it impossible to install experimental
>>   third-party software on production.
>>
>
> How would we be able to determine it is a production server? I assume the
> profile you would derive in this case would have its ips repo set for
> installation such that there wouldn't be experimental software. Is this what
> you are thinking?

This would likely feed off of subnet-based rules.  Arguably, this is
probably more easily dealt with by selecting a different base
installation profile (prod vs. lab) on the AI server.  The derived
profile would probably just tweak this base profile for
hardware-specific items and picking the closest appropriate repo.
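
A rough sketch of that split: the AI server holds the prod and lab base
profiles, and the derivation step only overrides the repository and the
target disks. Every key name, URL, and site label below is hypothetical;
none of them come from the AI manifest format.

```python
import copy

# Hypothetical base profiles as they might be kept on the AI server.
BASE_PROFILES = {
    "prod": {"repo": "http://pkg.example.com/release", "signed_only": True},
    "lab": {"repo": "http://pkg.example.com/dev", "signed_only": False},
}

# Hypothetical table of closer mirrors, keyed by (profile kind, site).
SITE_MIRRORS = {
    ("prod", "SiteA"): "http://pkg.sitea.example.com/release",
}


def derive_profile(kind, site, root_disks):
    """Copy the base profile and tweak only the client-specific items."""
    profile = copy.deepcopy(BASE_PROFILES[kind])  # never edit the base itself
    # Use the closest appropriate repo if one is known for this site.
    profile["repo"] = SITE_MIRRORS.get((kind, site), profile["repo"])
    # Hardware-specific item: the disks chosen on the client.
    profile["root_disks"] = list(root_disks)
    return profile
```

Because the policy settings (such as signed_only here) live only in the
base profile, the derivation step cannot loosen them by accident.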

> A few more questions:
>
> 1. How easy is it for you to use, and configure your current jumpstart
> configuration to enable derived profiles? Are the user interfaces easy to
> use?

The current setup of a jumpstart client involves setting up (as a
non-privileged user) a system-specific wanboot.conf and system.conf
using a fairly simple script.

jumpstartzone$ /jumpstart/<release>/add_client_wanboot -e <mac> -h <hostname> ...
Run this at the OpenBoot prompt:
    ok setenv network-boot-arguments=...

ok setenv network-boot-arguments=...
ok boot net - install

Every client uses the same begin script to derive the profile.  I
tweak the Begin/derive-profile.beg script when I do a new image
release (point it to the next flar) or something comes up that causes
other problems (like Solaris becomes huge and needs more than 8 GB for
/).

The rules for site determination use a netmasks file and a "subnets" file. e.g.:

    10.0.1.0 SiteA
    10.0.2.0 SiteB
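
One way the site lookup could work, sketched in Python instead of the
shell that the begin script actually uses. It assumes the netmasks file
has "network netmask" pairs and the subnets file has "subnet site" pairs
as shown above. The function names and the "longest mask wins" rule are
assumptions.

```python
import ipaddress


def parse_table(text):
    """Parse a two-column, whitespace-separated file into a dict."""
    rows = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            key, value = line.split()[:2]
            rows[key] = value
    return rows


def site_for_ip(ip, netmasks_text, subnets_text):
    """Return the site name for an IP address, or None if nothing matches."""
    masks = set(parse_table(netmasks_text).values())
    sites = parse_table(subnets_text)
    # Try the most specific (numerically largest) netmask first.
    for mask in sorted(masks, key=lambda m: int(ipaddress.IPv4Address(m)),
                       reverse=True):
        network = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        site = sites.get(str(network.network_address))
        if site is not None:
            return site
    return None
```

With a netmasks entry of "10.0.0.0 255.255.255.0" and the subnets file
above, a client at 10.0.1.57 would resolve to SiteA.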

The rules for selecting installation disks require a diskmap file that
looks like:

# 480R- note that they use a qlogic fiber channel chip just like our HBA's do
root1 SUNW,Sun-Fire-480R c.t0d0s2 ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0
root2 SUNW,Sun-Fire-480R c.t1d0s2 ../../devices/pci@9,600000/SUNW,qlc@2/fp@0,0

# T5220
root1 SUNW,SPARC-Enterprise-T5220 c.t0d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
root2 SUNW,SPARC-Enterprise-T5220 c.t1d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
data1 SUNW,SPARC-Enterprise-T5220 c.t2d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0
data2 SUNW,SPARC-Enterprise-T5220 c.t3d0s2 ../../devices/pci@0/pci@0/pci@2/scsi@0

If I just used the first two disks in $SI_DISK_LIST, the 480R may give
the disks I list above or something that is storing an Oracle database
out on the SAN.  Best to avoid overwriting the database.
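
A sketch of how a diskmap like this can be applied, again in Python and
not the real begin script. It assumes each entry is one line with four
fields (alias, model, disk name pattern, physical path prefix), that the
pattern is a regular expression in which "." stands for the controller
digit, and that device paths are written with a literal "@" as they
appear under /devices. The dev_links argument is a hypothetical mapping
of /dev/dsk names to their symlink targets.

```python
import re


def parse_diskmap(text):
    """Return (alias, model, pattern, devpath) tuples from a diskmap file."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        alias, model, pattern, devpath = line.split()
        entries.append((alias, model, pattern, devpath))
    return entries


def select_disks(entries, model, dev_links):
    """Map each alias for this model to the first disk that matches both
    the name pattern and the physical path prefix."""
    chosen = {}
    for alias, entry_model, pattern, devpath in entries:
        if entry_model != model:
            continue
        for name, target in sorted(dev_links.items()):
            # The physical path check is what keeps SAN LUNs out of the result.
            if re.fullmatch(pattern, name) and target.startswith(devpath):
                chosen[alias] = name
                break
    return chosen
```

A disk whose name matches c.t0d0s2 but whose physical path sits behind a
different HBA is skipped, so it does not matter which controller number
the local disks end up with.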

> 2. What do you like about the way it is currently implemented?

- It works.
- I can trust that the just-hired-last-week junior sysadmin armed with
a simple procedure can install Solaris per standards without risk of
breaking the jumpstart environment for everyone else.  That is, since
there is no customization to perform on the jumpstart server there is
no chance that someone that is not tasked with maintaining jumpstart
will break jumpstart.
- Policy enforcement via scripting is much easier, more accurate, and
more cost-effective than policy enforcement via training, audits,
remediation, retraining, etc. (Sysadmins need to know what they are
doing, but need to be focused on value-add, not minutiae.)

> 3. What don't you like?

- It took way too much work to make all of this reliable and workable
for a single process to use on a global basis.
- Making everything work equally well for network-based and DVD-based
installations was difficult.
- When things don't work (which is extremely rare - see 2 above)
jumpstart is hard to debug because of the lack of documented ways to
observe the process, along with the lack of a documented way to restart
the process without enduring POST, slow wanboot download, etc.  I know
many tricks to get around this, but learning them was painful and
oftentimes only possible because of the extensive use of shell
scripts during installation.
- Automated installation has way too much of "every customer must
figure it out for themselves."

It seems as though the last point is supposed to be addressed with
JASS and/or JET.  For me, JASS (now unmaintained and not yet open
source) was a big help for automation of security hardening.  JET came
to my attention long after JASS was already working (including various
custom written modules).
Almost every introduction I had to JET felt like a sales job for
professional services.  That is, the thing that I needed didn't come
with JET, but if I paid for some professional services they would
provide it.  Well, the thing that I needed was typically easier to
bolt onto JASS than it was to go through the requisition process for
professional services.

Striking the balance between everyone having to figure it out for
themselves and lacking required flexibility is extremely difficult.  I
would prefer that we currently err in favor of giving too much
flexibility. Having flexibility will allow sysadmins to come up with
clever ways of accomplishing what they need to do, hopefully leading
to contributions of those clever things back to the community.  I
worry that lack of flexibility will hinder adoption at large sites -
all of which will suddenly sing the praises of jumpstart.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
