Peter Tribble wrote:
>
> ptime pkgchk -l -p /usr/bin/ls
>
> real 0.662
> user 0.557
> sys 0.104
>
> Dropping the requirement to support wildcards allows a different
> algorithm, so we get:
>
> ptime ./pkgchk -l -p /usr/bin/ls
>
> real 0.007
> user 0.002
> sys 0.004
>
Wow. Are you binary-searching the contents file? Given the absurdly
restricted way the "wild card" in pkgchk -p works (it just stops
comparing when it hits the asterisk) you could do that and still support
the "wild card". (Not that there seems to be any point in maintaining
this old undocumented behavior.)
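To make the binary-search point concrete, here is a minimal Python sketch. It assumes the paths have already been pulled out of the contents file into a list sorted in plain code-point order, which may not match how the real file is ordered. find_paths is a hypothetical name, not anything in pkgchk. The "wild card" is handled the way described above: the pattern is cut at the first asterisk and what is left is matched as a prefix.

```python
from bisect import bisect_left


def find_paths(sorted_paths, pattern):
    """Look up pattern in a sorted list of paths (hypothetical helper).

    Assumption: sorted_paths is sorted with Python's default string order.
    Everything from the first '*' onward is ignored, so "/usr/bin/l*"
    behaves as a plain prefix match on "/usr/bin/l".
    """
    star = pattern.find("*")
    if star == -1:
        # Exact lookup: one binary search, then one comparison.
        i = bisect_left(sorted_paths, pattern)
        if i < len(sorted_paths) and sorted_paths[i] == pattern:
            return [pattern]
        return []

    # Prefix lookup: binary-search to the first candidate, then walk
    # forward while entries still share the prefix (they are contiguous
    # because the list is sorted).
    prefix = pattern[:star]
    i = bisect_left(sorted_paths, prefix)
    matches = []
    while i < len(sorted_paths) and sorted_paths[i].startswith(prefix):
        matches.append(sorted_paths[i])
        i += 1
    return matches
```

Since matching entries sit next to each other in a sorted list, the prefix case costs one binary search plus a short linear walk over the hits.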

However, this bumps up against some current efforts to speed up
pkgadd/pkgrm (and thus patchadd/patchrm, which are built on top of them). One of
the ideas being looked at is to stop maintaining
/var/sadm/install/contents as we know it. (Right now every pkgadd and
pkgrm has to read in all of /var/sadm/install/contents and write it all
out at the end.) One idea is to keep each package's contents entry in
that package's directory in /var/sadm/pkg. That way only a small file
needs to be written on pkgadd, and pkgrm is real cheap. (Files that are
owned by multiple packages need some special consideration, which is
still under discussion.) Keeping pkgchk -l -[p|P] working in this new
regime isn't hard, but it will probably be a little slower than the
current scheme (it has to open 1000+ little files instead of one big one).

It does seem worth slowing down pkgchk -l a little to make
pkgadd/pkgrm a lot faster, since pkgchk -l is an occasional thing while
pkgadd/pkgrm are what make install/upgrade/patching slow.
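As a rough picture of what a path lookup might look like under the per-package layout, here is a minimal Python sketch. It assumes each package directory holds a file named "contents" and that the first whitespace-separated field on each line is the path. Both are guesses at a format that has not been settled, and lookup_path is a hypothetical name.

```python
import glob
import os


def lookup_path(pkg_root, path):
    """Return (package, line) pairs for every entry whose path matches.

    Assumptions (not a settled format): each package lives in
    pkg_root/<pkg>/ with a "contents" file, and the path is the first
    whitespace-separated field of each line.
    """
    hits = []
    # One open() per package: this is the "1000+ little files" cost.
    for contents in sorted(glob.glob(os.path.join(pkg_root, "*", "contents"))):
        pkg = os.path.basename(os.path.dirname(contents))
        with open(contents) as f:
            for line in f:
                fields = line.split()
                if fields and fields[0] == path:
                    hits.append((pkg, line.rstrip("\n")))
    return hits
```

A file owned by several packages simply shows up once per package in the result, which sidesteps the shared-file question rather than answering it.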

This does raise the question of what to do about things that read
/var/sadm/install/contents today. We know that there are at least some
programs that read it (despite it never being a documented stable
interface) -- old versions of the Java runtime read it to determine
which fonts were available, for example. But the main use does seem to
be people who've learned to grep through it to find various things (I'd
never have survived the SunOS 4->5 transition without it.) Several
ideas have come to mind:
- Just do basically
"cat /var/sadm/pkg/*/contents | \
sort -u > /var/sadm/install/contents"
at the end of any install or upgrade, and "every so often" in a
cron job. Most usages would go just fine, but the cases where it
failed would be really annoying.
- Get rid of /var/sadm/install/contents, so at least things fail
cleanly. People who learned to grep through it will pretty quickly
learn to either use pkgchk -l or "grep ... /var/sadm/pkg/*/contents".
- Get rid of the contents file, but provide a new command which
dumps the contents in a nice parsable
format, perhaps not exactly like the current contents file but still
easily grep-able. Tell people currently grepping the contents
file to do
pkgcontents | grep ...
instead. Has the advantage that if we change the /var/sadm/pkg format
again we can do so without another finger retraining.
(Eventually we might want to circle back to the "real database"
idea, but do it better.)
- Write a special file system that mounts on /var/sadm/install/contents
and does the cat on the fly. Makes the change very invisible, but
it's really a lot of work and gadgetry to preserve an interface that
was never documented for use.
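For the dump-command idea, the core could be as small as the following Python sketch. It assumes the same hypothetical layout as the "cat ... | sort -u" pipeline above (a "contents" file in each package directory) and removes duplicates by whole line only. merged_contents is a made-up name, and the output format is just whatever the per-package files contain.

```python
import glob
import os


def merged_contents(pkg_root):
    """Merge every pkg_root/*/contents file into one sorted, de-duplicated list.

    Assumption: the per-package files are line-oriented text. Duplicates
    are dropped only when two lines are identical, like "sort -u".
    """
    lines = set()
    for contents in glob.glob(os.path.join(pkg_root, "*", "contents")):
        with open(contents) as f:
            for line in f:
                line = line.rstrip("\n")
                if line:  # skip blank lines
                    lines.add(line)
    return sorted(lines)
```

A command wrapped around this would just print each line of merged_contents("/var/sadm/pkg"), so "pkgcontents | grep ..." keeps working even if the on-disk layout changes again.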

I'd sure love to hear any other ideas.

In any case, given this going on, changing the way pkgchk -l -p searches
the contents file may not be the most interesting thing to work on.

Rich