Bugs in unexpand(1) version 6.10

Kevin O'Gorman Thu, 22 Jan 2009 17:36:49 -0800

Three oddities in unexpand have been noted by my students here.

----------------------
Version information:
ke...@treat Test $ unexpand --version
unexpand (GNU coreutils) 6.10
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.


Written by David MacKenzie.
ke...@treat Test $ 


----------------------
1) Replacement of tabs by spaces.  This is backwards, since unexpand is
supposed to convert spaces to tabs, not the other way around.  The attached
bash script tabtest.sh illustrates the behavior.  When I run
it, I get this output:

ke...@treat Test $ bash tabtest.sh
0000000 61 09 62 20 63 0a
          a  \t   b       c  \n
0000006
0000000 61 20 62 20 63 0a
          a       b       c  \n
0000006
ke...@treat Test $
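
The attached script is not reproduced inline.  A minimal sketch of an
equivalent test might look like the following.  It is a hypothetical
stand-in, not tabtest.sh itself; the variable names and the -a option are
assumptions, since the options the original script passes are not shown
above.

```shell
#!/bin/bash
# Hypothetical stand-in for tabtest.sh; the real script is only attached.
# Assumption: unexpand is run with -a; the original options are not shown.
input=$(printf 'a\tb c')

# Dump the bytes before conversion: 61 09 62 20 63 0a.
printf '%s\n' "$input" | od -t x1c

# Dump the bytes after conversion.  A tab (09) that has turned into a
# space (20) in this second dump is the behavior reported above.
output=$(printf '%s\n' "$input" | unexpand -a)
printf '%s\n' "$output" | od -t x1c
```

Comparing the two dumps byte for byte shows whether the tab survived.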

----------------------
2) Infinite output when conflicting options are given.  It seems to me
that a suitable diagnostic message would be better.  The test case is the
attached testcase.sh script.  When I run it, I get this (the run-on error
message is from the original):

ke...@treat Test $ bash testcase.sh
testcase.sh: line 5: 22245 File size limit exceededunexpand -t2 -t5  > 
testoutput <<EOF
         a
EOF

-rw-r--r-- 1 kevin kevin 1024 2009-01-22 10:14 testoutput
0000000 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09
         \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t
*
0002000
ke...@treat Test $
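
A minimal sketch of such a test follows.  It is a hypothetical stand-in for
testcase.sh, with the command line and input taken from the error message
above.  The file-size cap via ulimit and the scratch file from mktemp are
assumptions; the cap is there so that runaway output is cut off at 1024
bytes instead of filling the disk.

```shell
#!/bin/bash
# Hypothetical stand-in for testcase.sh; the real script is only attached.
out=$(mktemp)   # assumption: a scratch file instead of ./testoutput

(
  # Cap file size at one block (1024 bytes in bash) so that a process
  # writing endlessly is killed by SIGXFSZ rather than filling the disk.
  ulimit -f 1
  # Leading blanks followed by "a", as in the report, with both -t options.
  unexpand -t2 -t5 > "$out" <<EOF
         a
EOF
)

# Show how much was written and what it contains.
ls -l "$out"
od -t x1c "$out"
```

Running the command in a subshell keeps the file-size limit from affecting
the rest of the script.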

----------------------
3) Single spaces leading up to a tab stop are treated inconsistently.
Sometimes they are replaced by a tab and sometimes not.  The info page is
vague enough to allow either interpretation, but the variations seem
undesirable.

If there's a good reason for the behavior, it should be documented.

I would note that the POSIX.1 man page is explicit: it allows converting only
an initial sequence of blanks (not at issue here) or two or more blanks leading
up to a tab stop (also not at issue), so a POSIX-compliant implementation would
not convert single blanks at all.  This seems consistent
with the motivation of making the file smaller and avoiding changes that do
not further that end.

The test case is in the attached script testspace.sh.
When I run it, I get:

ke...@treat Test $ bash testspace.sh
0000000 61 62 63 20 64 65 66 20 20 67 0a
          a   b   c       d   e   f           g  \n
0000013
0000000 61 62 63 20 64 65 66 09 20 67 0a
          a   b   c       d   e   f  \t       g  \n
0000013
ke...@treat Test $


The blank between "c" and "d" is not converted, but
the blank after "f" is converted to a tab (^I).
It is not at all clear why, since they both lead up to a tab stop.
One surmises that the following blank is making a difference, but it's
hard to see a motivation for the distinction.
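
A minimal sketch of a test for this case follows.  It is a hypothetical
stand-in for testspace.sh.  It assumes tab stops every 4 columns (-t 4,
which implies -a), because that is the setting under which both single
blanks end exactly at a tab stop; the options the attached script actually
uses are not shown above.

```shell
#!/bin/bash
# Hypothetical stand-in for testspace.sh; the real script is only attached.
# Assumption: tab stops every 4 columns (-t 4, which implies -a).
line='abc def  g'

# Bytes before conversion: a single blank after "c", two blanks after "f".
printf '%s\n' "$line" | od -t x1c

# Bytes after conversion: compare what happened to each single blank
# that ends at a tab stop (columns 4 and 8 under the assumed setting).
converted=$(printf '%s\n' "$line" | unexpand -t 4)
printf '%s\n' "$converted" | od -t x1c
```

Piping the converted line back through expand -t 4 should reproduce the
original line, whichever blanks were turned into tabs.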

I submit that it is just as well not to convert in both cases, as that
is most consistent with POSIX.

In any event, the documentation should be more clear about what cases are
handled and how.

++ kevin


-- 
Kevin O'Gorman, PhD   http://users.csc.calpoly.edu/~kogorman

testspace.sh
Description: application/shellscript

testcase.sh
Description: application/shellscript

tabtest.sh
Description: application/shellscript

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
