Garrett D'Amore wrote:
> Alan Coopersmith wrote:
> > I'm sponsoring this fast-track request on behalf of the
> > ksh93-integration and busybox projects.  The timeout is
> > set for Friday, July 31, 2009.
> >
> >       -Alan Coopersmith-           alan.coopersmith at sun.com
> >        Sun Microsystems, Inc. - X Window System Engineering
> >
> > Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
> > This information is Copyright 2009 Sun Microsystems
> > 1. Introduction
> >     1.1. Project/Component Working Name:
> >        AST versions of fold, mktemp, pathchk, & tty
> >     1.2. Name of Document Author/Supplier:
> >        Author:  Roland Mainz
> >     1.3  Date of This Document:
> >       24 July, 2009
> > 4. Technical Description
[snip]
> My main concern here is the integration of manual page functionality
> into the commands themselves.  I see both benefits and costs.  The
> benefit is that the documentation is more likely to match the actual
> command.  But part of the cost is a much higher cost to perform
> localization for these,

Erm... normal manpages won't be discontinued - see below...
... and there is (in theory) nothing which prevents the builtin manpages
from being localised - the matching strings would simply appear in the
l10n catalog file for libcmd (this is however not planned (yet)).

> and (depending on implementation) a potentially
> larger minimum size of the binaries.  (I'm assuming for the moment that
> the documentation is stored in the binary, and the command is doing more
> than just executing some pipeline to access the manual content from
> /usr/share/man or whatever.)
> 
> Personally, I think --man, --html and --nroff and such is a dangerous
> precedent to set.

... which already exists since the ksh93-integration project was
started...

> I'd rather not have them, and instead rely on the
> "man" command to provide this functionality.

Erm... we never proposed to discontinue the use of manpages. The builtin
support for "--man"&co. is actually only a nice side-effect of the AST
getopts function - see below...

> (Also with --html and
> --nroff and --man, how is the content stored -- does the command do
> format conversions on demand?...  It seems like this functionality also
> might add to the total code size, although I guess this functionality is
> already stored in the AST libraries.  Hopefully this is *not* storing 3
> separate copies of the documentation in the binaries, at least!

Erm... some clarifications:
1. Support for "--man"/"--nroff"/"--html" is only a side-effect of the
extensions of the |libast::getopts()| function. This extension
(available via the ksh93 "getopts" interface for scripts and
|libast::getopts()| for binaries) is used to describe the short&&long
command/utility options and allows tagging them with some messages, too.
The libast code then converts this into a manual page on demand.
2. Please don't worry... we don't store 666 copies of the manual page
somewhere in the code - instead the _compact_ getopts string is
dynamically converted at runtime to the requested output format.
3. The actual "extra" space being used is _tiny_ - just looking at the
(completely different) implementations of /usr/bin/fold vs. the "fold.o"
object from one of my development trees:
-- snip --
$ ls -l ./build_sparc_64bit/arch/sol11.sun4/src/lib/libcmd/fold.o
/usr/bin/fold
-rw-r--r--   1 gisburn  gisburn    14136 Jun 28 04:27
./build_sparc_64bit/src/lib/libcmd/fold.o
-r-xr-xr-x   1 root     bin        12752 Sep 13  2006 /usr/bin/fold
-- snip --
This is on SPARC ("fold.o" is SPARCv9, "/usr/bin/fold" is SPARCv8) ...
and the size difference is AFAIK not a problem since UFS uses 8k pages
and both binaries therefore fit happily in two 8k pages and on x86 both
need three 4k pages.
4. The builtin manpages are intentionally only a short/terse version of
the normal manpages (for example "ksh93 --man" only gives the normal
shell usage, not the full syntax of the ksh93 scripting language and all
the details). They are intended as a quick reference and not to replace
the full manual page (for smaller projects like "bldenv" or "webrev" the
builtin version can replace the full manual page but that's more the
exception than the rule).
5. For the long-term (not yet, not now, not this case (please)) it may
be interesting to generate the builtin string for AST "getopts" and the
normal manual pages from a DocBook/XML master file (with some XML
ifdef/else/endif to cut out the non-mandatory parts). That would unify
both systems but requires that I either finish the "shxml" work (e.g.
xmlreader/xmlwriter support for the shell) or write a dedicated
docbook2astgetopts.xsl XSLT stylesheet.
6. We're _not_ interested in creating a precedent here, just in
delivering what upstream does in its code. And we document the "--man"
function in this ARC case as an optional _usability_ enhancement.
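
As a rough cross-check of the numbers in item 3, the following POSIX
shell sketch redoes the arithmetic. The two byte sizes are copied from
the "ls -l" output above; the helper names pages_needed and percent_diff
are made up for this illustration, and the 8k unit is simply the figure
quoted in item 3. The x86 case is not checked because the listing only
shows the SPARC sizes.

```shell
#!/bin/sh
# Sizes in bytes, copied from the "ls -l" output in item 3.
fold_o_size=14136         # AST fold.o (SPARCv9)
usr_bin_fold_size=12752   # current /usr/bin/fold (SPARCv8)

# pages_needed SIZE UNIT: number of UNIT-sized pages needed to hold
# SIZE bytes, rounded up (hypothetical helper for this sketch).
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# percent_diff NEW OLD: growth of NEW relative to OLD in whole percent,
# truncated towards zero (hypothetical helper for this sketch).
percent_diff() {
    echo $(( ($1 - $2) * 100 / $2 ))
}

echo "fold.o:          $(pages_needed "$fold_o_size" 8192) x 8k"
echo "/usr/bin/fold:   $(pages_needed "$usr_bin_fold_size" 8192) x 8k"
echo "size difference: $(percent_diff "$fold_o_size" "$usr_bin_fold_size")%"
```

With the sizes above both files need two 8k units and the AST object is
roughly 10% larger than the current binary.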

> Also, I'd like the case submitter to provide some justification for
> these changes?  Are there functional changes that will help with
> familiarity?

Yes, at least we cover the following goals:
- Familiarity: GNU+BSD command line options (which increases
interoperability, not only with GNU but with BSD and MacOSX, too)
- Performance:
  1. The AST implementations are usually a lot faster than the current
commands (we have seen this with the replacements for /usr/bin/cut,
/usr/bin/paste etc. which are sometimes eight, ten or twelve times
faster (partially because the |libc::stdio| implementation is
_extremely_ slow))
  2. Performance boost for OpenSolaris/Indiana since the tool is a
builtin shell command for /usr/bin/sh, /sbin/sh, /usr/bin/ksh and
/usr/bin/ksh93
- 64bit clean codebase: Right now OS/Net is _not_ 64bit clean (the
tools we are touching in ksh93-integration update2 and this case are in
particular the worst offenders, followed by the CTF tools and some minor
other areas). This situation causes serious problems for ports to other
hardware (e.g. it accounts for 1/5 of the engineering time required to
port Solaris) - for example the Solaris/SystemZ port was forced to
implement a 32bit emulation layer (!!) on pure 64bit hardware because
there was no other easy way to get Solaris ported (and IMO this
situation _sucks_ (<-- sorry... but I really don't like it that the code
was never cleaned-up)).
- Long-term maintenance: A living and _cooperative_ upstream who helps
with bugfixing
- License: CDDL-compatible license (for example "GNU coreutils" are
GPLv2 which prevents the tools from being embedded as a shared library
([1]) in non-GPLv2 code [2]).

[1]=Which would be hopeless anyway since the code would need to be
re-written from scratch to make it re-entrant
[2]=(erm... could we please not have a license flamewar ? I'm only
reciting what the coreutils folks are claiming...)

> (I'm assuming most people use GNU utilities on foreign
> operating systems, and not ksh93 versions.) 

See above - as said, the tools we have upgraded to the AST versions so
far have options which are compatible with _both_ the GNU _and_ BSD
versions (we didn't cover any commands yet where options in different
OSes have a conflicting meaning... we'll save that pain for the
future... ;-/ ).

> Does this have any impact
> on the size of the objects on disk,

See above. The size of the compiled code is a bit different since this
new code is a completely different implementation but the size
difference is usually within +30%/-30% (not caused by the builtin
manpages). And AFAIK the extra size is compensated by the busybox-style
implementation which guarantees that the active code is always shared.

> the performance of the utilities,

The new tools weren't tested for performance yet but for the code
changed in ksh93-integration update2 we know that the AST versions are
significantly faster, for example AST "paste" runs - depending on the
locale - 8-12 times faster than the current version. The exceptions are
_maybe_ "tail" and "tee" (I don't have any results yet): "tail" no
longer uses |mmap()| which may cause performance degradation (but this
change eliminates SIGSEGV/SIGBUS when the file shrinks) and "tee" has a
bit higher startup time (but has superior buffering).

> or
> the number of closed source bits we use to make up ON?

Not with this ARC case but the previous one killed-off the closed-source
"tail" and one of the next will cover the "sed" and "tr" variations.
We're intending to replace the closed-source _commands_+_utilities_
soon with backwards-compatible opensource versions (e.g. "sed" is more
or less done except for compatibility testing).

> Any of those
> would help provide justification for these changes.

See above...

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
