On 3/27/06, Dave Miner <Dave.Miner at sun.com> wrote:
> Mike Gerdts wrote:
> > On 3/24/06, Dave Miner <Dave.Miner at sun.com> wrote:
> >> Mike Gerdts wrote:
> >> ...
> >>> I'm *so* glad to see that this is an area of focus.  My comments on
> >>> the document and installation related tasks follow.
> >>>
> >>> Page 6, Bullet 2, sub-item 2: SUNWCXall no longer does the trick...
> >>> when SUNWCXall is installed on a sun4u box (15k domain) the sun4v
> >>> platform support is not added.  This implies that in addition to my
> >>> 15k domain used primarily for image development (that ain't cheap), I
> >>> now need to have a T1000 or T2000 sitting around for the same purpose.
> >>>  In a globally distributed jumpstart environment, I now need to
> >>> distribute three ~2 GB flash archives to get x86-64, sun4u, and sun4v
> >>> support.
> >>>
> >> Thanks for pointing this out, as I hadn't noticed it.
> >
> > Is this a bug or accepted limitation for some reason?  Has pointing it
> > out caused it to be noted in an updated version of the document, a bug
> > filed, or both?  I can file the bug through OpenSolaris if this is not
> > a conscious design decision.
> >
>
> Sorry, I should have said a bit more.  It's a conscious design decision,
> at least in terms of the way installation was designed oh so many years
> ago, when sun4, sun4c, sun4d, and sun4u all walked the earth, disks were
> small, and so on.  sun4v's the first new architecture in SPARC systems
> in 10 years.  I need to look into the issues a bit, and it'll probably
> lead to some more verbiage in the document when I update it in a couple
> of weeks.

Is it safe to simply add the few packages that are sun4v or T2000
specific to a sun4u system (15k) to enable building a single sparc
flar?  I'm not asking for an in-depth analysis here, just wondering if
you have a strong gut feeling one way or another.  I *really* want to
have one sparc flar.
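
If you don't have a strong feeling, the naive approach I'd try first
looks something like this (the sun4v package names are placeholders on
my part; the real list would come from inspecting the media's Product
directory for .v packages):

# On the sun4u master, add the sun4v bits from the install media
# before creating the archive.  Package names are illustrative,
# not a verified list.
cd $osmedia/Solaris_10/Product
pkgadd -d . SUNWcakr.v SUNWcar.v
# Then build a single archive for all sparc platforms.
flarcreate -n sparc-master -c /export/flars/sparc-master.flar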

> >>> Page 8 - Live Upgrade is also hampered by the following in my environment:
> >>>
> >>> 1) It uses a version of cpio which does not support sparse files.
> >>> This causes files like /var/adm/lastlog to balloon in size when large
> >>> UIDs (100,000,000 - 999,999,999) are used.  Similar issues likely
> >>> exist if a quotas file happens to be in a partition used for live
> >>> upgrade.
> >
> > Bug 4480319.  I'm not sure if this is the one that I filed or not, but
> > it's been out there for a while.  I've discussed this a bit on
> > zones-discuss as well because "zoneadm clone" now has the same problem
> > as live upgrade and flash archives.
> >
>
> Yeah, you're listed as one of the customers for it.  Now that we have
> SEEK_HOLE in Nevada, we could probably fix it without too much pain.

That was also mentioned on the zones-discuss list.  One of these days
maybe I'll pick this one up.  For right now, I am usually OK with
truncating lastlog or adding it to an exclude list.
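
In case it helps anyone else reading along, that workaround amounts to
something like the following (the lucreate exclude option is from
memory; check lucreate(1M) on your release):

# Shrink lastlog before the copy; logins rewrite their entries.
cp /dev/null /var/adm/lastlog
# Or keep lucreate from copying it at all.
lucreate -n newbe -m /:d30:ufs -x /var/adm/lastlog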

> >>> 2) It has spotty support for upgrading to metadevices.
> >
> > I am pretty sure that there is a bug on this one, but I am having
> > trouble finding it.  Essentially, it boils down to the following
> > blowing up:
> >
> > lucreate -s - -n newbe -m /:d30:ufs,preserve
> > luupgrade -f -n newbe -s $osmedia -J 'archive_location nfs://somewhere'
> >
> > To work around this, I have done:
> >
> > # cp $osmedia/Solaris_10/Tools/Boot/usr/sbin/install.d/pfinstall \
> >     /var/tmp/pfinstall.orig
> > # mount -F lofs -O /dir/pfinstall-wrapper \
> >     $osmedia/Solaris_10/Tools/Boot/usr/sbin/install.d/pfinstall
> >
> > The wrapper causes the following change in the profile before calling
> > /var/tmp/pfinstall.orig
> >
> > < filesys d30 existing /
> > ---
> >> filesys mirror:d30 c0t0d0s3 c0t1d0s3 existing /
> >> metadb c0t0d0s7
> >> metadb c0t1d0s7
> >
> > Note that this has worked for me on one machine, and I only got it
> > working in the past 24 hours.  By no means am I convinced that it is
> > a robust workaround yet.
> >
>
> Kind of a scary workaround, but it'll be good input to the bug report.

Very much so.  I wish that the source code for live upgrade were available...
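
For the record, the wrapper itself is small.  Reconstructed from
memory, and assuming the profile is the last argument that pfinstall
receives (worth double-checking before anyone relies on this), it
looks roughly like:

#!/bin/sh
# Hypothetical reconstruction of the lofs-mounted wrapper: rewrite
# the generated profile so root is described as an SVM mirror, then
# hand off to the saved original with the original arguments.
for profile; do :; done   # loop leaves the last argument in $profile
ed -s "$profile" <<'EOF'
/^filesys d30 existing \//c
filesys mirror:d30 c0t0d0s3 c0t1d0s3 existing /
metadb c0t0d0s7
metadb c0t1d0s7
.
w
q
EOF
exec /var/tmp/pfinstall.orig "$@"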

> > Beyond that, I found the following problems going from S9 to S10:
> >
> > 1) netgroup entries in /etc/shadow were missing but they were in 
> > /etc/passwd.
> > 2) Solaris 10 ships with more default passwd entries than Solaris 9
> > (gdm, webservd, etc.).  These were lost.
> > 3) swap and other metadevices were commented out of vfstab
> > 4) mount points for lofs file systems were missing
> > 5) Complaints about svc:/system/cvc:default in maintenance mode when
> > it was not appropriate for the platform (should not have been enabled)
> > 6) SVM-related services were not enabled
> > 7) Running JASS against the new boot environment looked kinda scary
> > when it started out complaining about shared library problems while
> > calling zonename.
> >
>
> Was this going to S10 1/06?  That last one looks like something that
> should occur only in that case.

It was S10 1/06.  Running JASS after reboot was just fine, though.  It
just speaks to the point that live upgrade could really stand to run
in its own virtual machine.

> Seems like a number of class-action scripts didn't work right, though.
> Feels like something really basic went wrong in the installation.

By class action, I assume you mean post-install scripts, right?  My
understanding was that these should run only after a pkgadd, not when
a flash archive is applied.  Or are there other scripts that I am not
aware of?

The netgroup thingy seems to be related to a poorly documented feature
that got new behavior with somewhat recent updates to PAM. 
Previously, netgroup entries were not required in /etc/shadow. 
Frankly, it wouldn't surprise me to see this one fall through the
cracks.
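
For anyone following along, the entries in question are the
compat-mode ones.  With a made-up netgroup name, /etc/passwd has a
line like:

+@sysadmins

and with the newer PAM behavior a matching entry is apparently also
required in /etc/shadow:

+@sysadmins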

The fact that passwd was missing some entries looks like the work of
an over-zealous sync task.

The mount point and vfstab problems surprised me.  I haven't seen
them before.

Because the flar was generated on a 15k, having system/cvc enabled was
not terribly surprising.
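
At least the cleanup on hardware other than a Starcat is a one-liner
after first boot:

svcadm disable svc:/system/cvc:default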

SVM-related services not being enabled is a problem for regular
jumpstarts as well.
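
A finish script can paper over that in the meantime.  This sketch is
how I'd try it (the alternate-root repository trick and the FMRIs are
from memory, so verify against a working box first):

#!/bin/sh
# Jumpstart finish-script sketch: enable the SVM services in the
# freshly installed image.  svcadm can't target an alternate root,
# so point svccfg at the repository under /a instead.
SVCCFG_REPOSITORY=/a/etc/svc/repository.db
export SVCCFG_REPOSITORY
for s in system/metainit system/mdmonitor; do
    /usr/sbin/svccfg -s $s setprop general/enabled = true
done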

> >> I expect we'll fix the fragmentation between x86 and SPARC by going to
> >> GRUB on SPARC as well.  Long term, I think the model is better for most
> >> people, but the transition could have been handled better, I agree.
> >
> > This oughta be interesting... Is this part of making zfs bootable
> > (that is, is it easier to write the bootstrap code for grub than it
> > is for openboot)?
> >
>
> Yes, it's very much related to the zfs boot support.  As anyone who
> tried to use WAN installation found, getting new OBP features released
> on all the platforms is very difficult.

I got excited about wanboot when I first read about it in ~2002 and
was disappointed to see no OpenBoot updates to support it.  The first
platform I have seen ship with network-boot-params as a variable in
nvram is a T2000.  In the intervening years I completely gave up on
the technology.  There are some cases where I may start looking at it
again.

> >>> Optimization of network performance is sometimes a matter of
> >>> optimizing the size of installation media.  If something like flash
> >>> archives continues to exist, they should use a better compression tool
> >>> than compress(1).
> >> Sure, providing options here seems reasonable.
> >
> > A key here may be to devise a file format that chunks a data stream
> > into lots of somewhat large pieces that are individually compressed,
> > using the compression algorithm that gives the right mix of speed and
> > size.  When the data stream is being extracted, the various chunks
> > could be individually uncompressed on multiple hardware threads.
> >
>
> Seems like it may be overkill based on likely source and destination
> bandwidth, but an interesting idea.

Within a year, I bet my NFS file servers will be on 10 gigabit.  "Jumpstart
clients" are increasingly getting faster internal disks, using SAN
boot, or possibly using iSCSI to a decent array.  At the same time,
single-threaded CPU performance seems to have hit a brick wall as
designs shift toward more hardware threads.

A simple test of "time gzcat /tmp/stuff.tar.gz > /dev/null" indicates
that gzcat can process about 27 MB/s on a Blade 1500 running at 1062
MHz.   Rather interestingly, zcat can only do about 19 MB/s.  The file
stuff.tar contains about 21 MB of stuff from /sbin and /etc on an S9
box.  These data rates will keep pretty much any internal disk today
saturated, especially with the number of small files typically found
in an OS.  However, if cache-based arrays are used for the OS disk or
the flash archive contains large files, the single-core performance
can start to get in the way.  Obviously, more analysis is needed
before deciding this is the future of decompression.

If you look at the problem from the other direction, however,
compression can benefit greatly from parallel algorithms because even
the fastest cores are slower at gzip or bzip2 than moderately fast disks.
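
You can even fake the chunked approach with stock tools today.  A
rough sketch, leaning on the fact that concatenated gzip members form
a valid gzip stream (the chunk size is pulled out of the air):

#!/bin/sh
# Split the archive into chunks and compress them in parallel.
split -b 64m stuff.tar chunk.
for c in chunk.*; do
    gzip "$c" &
done
wait
# The compressed chunks concatenate back into an ordinary .gz file
# that gzcat or gunzip will happily process end to end.
cat chunk.*.gz > stuff.tar.gz
rm chunk.*.gz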

> >>> Other ramblings...
> >>>
> > One thing that I have found very nice with various Linux distros
> > and Nexenta is that I have virtual consoles (or approximations
> > thereof) that allow me to observe the installation process more than
> > watching a progress bar.  This is very helpful when getting to know a
> > new installer or debugging changes.
>
> Would some ads during the install help? ;^)

Only if you can come up with no fewer than four different versions of
how Sun's founders decided upon a name.  Don't bore me with the
"Stanford University Network".  Talk about their forebears praying to
Helios or somesuch.

> Seriously, I hope we'll be bringing back virtual console support soon.
> It's much missed.
>
> Thanks for all your comments, Mike.

Not a problem.  I'm glad to have an audience that is honestly looking
for feedback to improve the product.  It's kinda like therapy for a
grumpy sysadmin - there are a few years of pent-up frustration that I
can now work through.  :)

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
