[9fans] Making read(1) an rc(1) builtin?

smiley Sun, 03 Apr 2011 15:32:57 -0700

I'm in the process of writing some filters in rc(1).  One thing that
concerns me about rc(1) is that read(1) is not a "builtin" command.
Consider a loop like:


    while(message=`{read})
      switch($message) {
      case foo
        dofoo
      case bar
        dobar
      case *
        dodefault
      }

Each line the script reads causes it to fork a new process, /bin/read,
whose sole purpose is to read a single line and die.  That means at
least one fork per line read, so an input with many lines means
spawning many processes.  I wonder whether it would make sense to move
read(1) into rc(1) and make it a "builtin" command.  A wrapper script
could then be created at /bin/read to call "rc -c 'eval read $*'" with
the appropriate arguments (or sed $n^q, etc.), for any program that
requires an actual /bin/read to exist.
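
Short of a builtin read, one workaround I can see, when the whole input
fits in memory and blank lines don't matter, is to pay for a single
fork and let rc split the input on newlines.  A rough, untested sketch;
nl and lines are names made up for the example, and dofoo, dobar, and
dodefault are the same placeholders as above:

    nl='
    '
    # one cat for the whole input instead of one read per line;
    # ifs is changed only for the command inside the braces
    ifs=$nl {lines=`{cat}}
    for(message in $lines)
      switch($message) {
      case foo
        dofoo
      case bar
        dobar
      case *
        dodefault
      }

This buffers all of standard input before the first line is handled,
and empty lines are dropped by the splitting, so it is not a drop-in
replacement for the read loop.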

A similar line of thought holds for /bin/test.  The string and numeric
tests (-n, -z, =, !=, <, >, -lt, -eq, -ne, etc.) are used very
frequently, and each use spawns another process.  For the file tests
(-e, -f, -d, -r, -x, -A, -L, -T, etc.), however, this argument isn't as
strong.  Since the file tests have to stat(2) a path, they already
require a call to the underlying file system, and an additional fork
wouldn't be that much more expensive.  I could see the
string and numeric tests being moved into rc(1) as a "test" builtin,
with the file tests residing at "/bin/ftest" (note the "f").  The "test"
builtin could scan its arguments and call "ftest" if needed.  A wrapper
script at /bin/test could provide compatibility for existing programs
which expect an executable named /bin/test to exist.
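
For the plain string tests, the existing ~ builtin already seems to get
part of the way there without running /bin/test.  A minimal sketch,
going by my reading of rc(1) and assuming single-element values; a and
b are placeholder variables:

    a=foo
    b=bar

    # roughly test -z $a: true when $a is the empty string
    if(~ $a '') echo a is empty

    # roughly test -n $a
    if(! ~ $a '') echo a is not empty

    # roughly test $a '=' $b; metacharacters inside the value of $b
    # are not treated as a pattern, so this is a literal comparison
    if(~ $a $b) echo a and b are the same

    # roughly test $a '!=' $b
    if(! ~ $a $b) echo a and b differ

Nothing comparable exists for the numeric comparisons (-lt, -gt, and
so on); ~ only compares text, so those still need test(1) or similar.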

I understand the Unix/Plan 9 philosophy of connecting tools that do one
job and do it well.  But I don't think /bin/read and /bin/test are
places where that philosophy is practical (i.e., efficient).  After all,
reading input lines really is the prerogative of any program that
processes line-oriented data (like rc(1) does).  In addition, /bin/read
represents a simple and fairly stable interface that's not likely to
change appreciably in the future.  Comparison of numeric and string
values is also a fairly stable operation that's not likely to change,
and is not likely to be needed outside of rc(1).  Most programming
languages (C, awk, etc.) have their own mechanisms for integer and
string comparison.  I suspect moving these operations into rc(1) (with
appropriate replacement scripts to ensure compatibility) could
appreciably increase the performance of shell scripts, with very little
cost in modularity or compatibility.

Any thoughts on this?

I'm also a bit stumped by the fact that rc(1) doesn't have anything
analogous to bash(1)'s string parsing operations: ${foo#bar},
${foo##bar}, ${foo%bar}, ${foo%%bar}, or ${foo/bar/baz}.  Is there any
way to extract substrings (or single characters) from a string in rc(1)
without having to fork a dd, awk, or sed?  I've tried setting ifs='' and
using foo=($"bar), but rc(1) always splits bar on spaces.  Perhaps, if
rc(1) used the first character of $ifs to split $"bar, $bar could be
split into individual characters when ifs=''.  Then, the characters of
$bar could be addressed without resort to dd and friends.
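
For what it's worth, my reading of rc(1) is that $ifs only governs how
the output of `{...} is split, and that $"bar joins a list into one
string rather than splitting it.  A sketch of splitting on a chosen
separator under that assumption (s and parts are placeholder names):

    s=usr/glenda/lib
    # split the output of echo on '/' instead of white space;
    # ifs is changed only for the command inside the braces
    ifs=/ {parts=`{echo -n $s}}
    # number of pieces, then the second piece
    echo $#parts $parts(2)

The command substitution itself still forks, so this doesn't solve the
problem; it only shows where $ifs takes effect.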

(As a side note, if anyone goes into rc(1)'s source to implement any of
this, please add a "--" option (or similar) to the "echo" builtin while
you're there.  Having to wrap echo in:

    # make 'myecho -- $foo' work even when $foo starts with '-n'
    fn myecho {
      if(~ $1 --) {
        shift
        if(~ $1 -n) {
          shift
          echo -n -n $*
          echo
        }
        if not echo $*
      }
      if not echo $*
    }

can be rather inconvenient.)

-- 
+---------------------------------------------------------------+
|E-Mail: smi...@zenzebra.mv.com             PGP key ID: BC549F8B|
|Fingerprint: 9329 DB4A 30F5 6EDA D2BA  3489 DAB7 555A BC54 9F8B|
+---------------------------------------------------------------+
