I slept on this and here is what I think
*they should have observed a unix-on-windows user, even for just a few minutes*
to see that --text default is wrong wrong wrong
*they should have made binary the default and '*' mark text mode exception case*
and then the minimal fraction of unix users on windows that generate \r\n
*and* don't want to sum the \r will have to explicitly demand --text
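
to make the cost of a --text default concrete, here is a rough sketch in python (picked only for illustration, it is not ast or gnu code). it assumes text mode amounts to translating \r\n to \n on read, which is the usual O_TEXT behavior on windows; md5_hex is a made-up helper name.

```python
import hashlib

def md5_hex(data: bytes, text_mode: bool = False) -> str:
    # Assumption: "text mode" is modeled as CR LF -> LF translation on read.
    if text_mode:
        data = data.replace(b"\r\n", b"\n")
    return hashlib.md5(data).hexdigest()

# Any byte stream that happens to contain CR LF pairs, e.g. inside a .tgz.
payload = b"line one\r\nline two\r\n"

binary_sum = md5_hex(payload)
text_sum = md5_hex(payload, text_mode=True)

# The two sums differ, so a sum taken in text mode cannot be
# compared with one published for the raw bytes.
print(binary_sum != text_sum)  # prints True
```

a file without any \r\n sums the same either way, which is why the problem only shows up on some inputs.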
and here is what I think should happen
(0) ast defaults to --binary for all methods and does O_TEXT only with
explicit --text
(1) anyone who has a file with ' ' or '*' as the first character
*and* calls ast md5sum will be sol
(2) petition gnu coreutils to accept
<checksum><one-ascii-space-char><name-not-starting-with-space-or-asterisk>
as being generated in --binary mode
(3) anyone who uses gnu md5sum to generate a checklist and uses something
other than ast or gnu md5sum --check will be sol
(4) change ast -t, --total => -T, --total and -T, --text => -t, --text
for gnu compatibility, and retain the ast --binary default *in all cases,
no isatty crap*
(5) change ast --header to include --text (but never --binary)
(6) change ast cksum --check to recognize either
<checksum><space><name>
<checksum><space><gnu-text-or-binary-indicator><name>
in _WINIX (uwin, cygwin) make the --text vs --binary distinction
based on <gnu-text-or-binary-indicator>, otherwise ignore
<gnu-text-or-binary-indicator>
if you notice, --method=md5 and --method=sha* are the only ones where ast
prints *exactly*
<checksum><space><name>
so it will be able to faithfully distinguish the ast vs gnu case for --check
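
a sketch of what the --check parsing in (6) could look like, in python for illustration only. the regex, the function name and the winix flag are all hypothetical; the only things taken from the proposal are the two line shapes, the --binary default from (0), and the rule that the indicator only matters on _WINIX.

```python
import re

# Accepts either
#   <checksum><space><name>                        (old ast)
#   <checksum><space><indicator><name>             (gnu, indicator ' ' or '*')
# Per item (1), a name that itself starts with ' ' or '*' is misread.
LINE = re.compile(r"^([0-9a-fA-F]+) ([ *]?)(.*)$")

def parse_check_line(line: str, winix: bool = False):
    """Return (checksum, mode, name), or None if the line is not recognized."""
    m = LINE.match(line.rstrip("\n"))
    if m is None or not m.group(3):
        return None
    checksum, indicator, name = m.groups()
    if winix and indicator == " ":
        mode = "text"      # gnu text indicator, honored only on _WINIX
    else:
        mode = "binary"    # ast default; indicator ignored elsewhere
    return checksum.lower(), mode, name

print(parse_check_line("d41d8cd98f00b204e9800998ecf8427e *foo.tgz", winix=True))
```

the old ast form carries no indicator, so it falls through to the binary default on every platform.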
I will consider this concession:
(0)(1)(3)(4)(5)(6)
(7) ast methods that currently list
<checksum><space><name>
will change to
<checksum><space><gnu-text-or-binary-indicator><name>
this would result in the '*' almost always being printed
ast will then handle old-ast and new-ast (gnu) formats seamlessly
can the unix user who never touches dos handle seeing the '*' indicator in
md5sum output?
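
for reference, a sketch of the output side of (7), again in python and only as an illustration. format_sum_line is a made-up name, and the caller is assumed to have already read the bytes in whichever mode applies; no \r\n translation is modeled here.

```python
import hashlib

def format_sum_line(name: str, data: bytes, text_mode: bool = False) -> str:
    # <checksum><space><gnu-text-or-binary-indicator><name>
    # '*' marks binary (the proposed default), ' ' marks text.
    indicator = " " if text_mode else "*"
    return f"{hashlib.md5(data).hexdigest()} {indicator}{name}"

# md5 of empty input, binary default
print(format_sum_line("foo.tgz", b""))
# prints: d41d8cd98f00b204e9800998ecf8427e *foo.tgz
```

with binary as the default, the '*' shows up on nearly every line, which is the visible change the question above is about.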
there are 2 comments below
this is another example where patches don't just exist in a vacuum
the universe of unintended consequences has to be extended to include unix on
dos and unix on ebcdic
On Wed, 25 Sep 2013 07:21:24 +0200 Roland Mainz wrote:
> On Wed, Sep 25, 2013 at 7:03 AM, Glenn Fowler <[email protected]> wrote:
> > On Wed, 25 Sep 2013 00:39:18 +0200 Roland Mainz wrote:
> >> Attached (as "astksh20130913_md5sum_compat1.diff.txt") is a patch
> >> which fixes an incompatibility between AST md5sum(1)&&co. and GNU
> >> coreutils md5sum(1)&&co.
> >
> >> There are three major differences which caused hiccups for 3rd-party
> >> scripts:
> >> - GNU coreutils md5sum/sha1sum/sha224sum/sha256sum default to text mode
> >> - GNU coreutils use a " *" before the file name to indicate binary
> >> mode and " " to indicate text mode... the AST hash utilities used
> >> only a single blank " " instead.
> >> - "-t" means "text mode" for GNU coreutils while AST used this for "total"
> >
> >> * Notes:
> >> - GNU and AST *sum(1) utilities now have identical output and seem to
> >> be 100% compatible with each other
> >> - On platforms which do not implement |O_BINARY| and |O_TEXT| the
> >> change only affects the separator (" "/" *"(=new) vs. " "(=old)).
> >> Portable applications can use [[:space:]]+ in egrep(1) to make sure
> >> they can match the hashes against both the old and new versions of AST
> >> *sum(1)
> >> - The output *intentionally* changes only for utilities matching the
> >> shell pattern "*@(md5|sha@(1|224|256|384|512))sum". This is done to
> >> maintain compatibility for cksum(1) and sum(1)
> >> - AST does not have a sha224sum(1) utility (yet) ... need to talk to
> >>
> > I'm sorry but making --text the default on a windows systems simply does
> > not make sense
> Well... blame Cygwin and "Windows Services For Unix" for that crazy
> idea. But I was looking at an older version of "md5sum" on Linux...
> but it turns out the situation is a bit more complex:
> -- snip --
> void
> usage (int status)
> {
>   if (status != EXIT_SUCCESS)
>     emit_try_help ();
>   else
>     {
>       printf (_("\
> Usage: %s [OPTION]... [FILE]...\n\
> Print or check %s (%d-bit) checksums.\n\
> With no FILE, or when FILE is -, read standard input.\n\
> \n\
> "),
>               program_name,
>               DIGEST_TYPE_STRING,
>               DIGEST_BITS);
>       if (O_BINARY)
>         fputs (_("\
> -b, --binary read in binary mode (default unless reading tty stdin)\n\
> "), stdout);
>       else
>         fputs (_("\
> -b, --binary read in binary mode\n\
> "), stdout);
>       printf (_("\
> -c, --check read %s sums from the FILEs and check them\n"),
>               DIGEST_TYPE_STRING);
>       fputs (_("\
> --tag create a BSD-style checksum\n\
> "), stdout);
>       if (O_BINARY)
>         fputs (_("\
> -t, --text read in text mode (default if reading tty stdin)\n\
> "), stdout);
>       else
>         fputs (_("\
> -t, --text read in text mode (default)\n\
> "), stdout);
>       fputs (_("\
> \n\
> The following three options are useful only when verifying checksums:\n\
> --quiet don't print OK for each successfully verified file\n\
> --status don't output anything, status code shows success\n\
> -w, --warn warn about improperly formatted checksum lines\n\
> \n\
> "), stdout);
>       fputs (_("\
> --strict with --check, exit non-zero for any invalid input\n\
> "), stdout);
>       fputs (HELP_OPTION_DESCRIPTION, stdout);
>       fputs (VERSION_OPTION_DESCRIPTION, stdout);
>       printf (_("\
> \n\
> The sums are computed as described in %s. When checking, the input\n\
> should be a former output of this program. The default mode is to print\n\
> a line with checksum, a character indicating input mode ('*' for binary,\n\
> space for text), and name for each FILE.\n"),
>               DIGEST_REFERENCE);
>       emit_ancillary_info ();
>     }
>
>   exit (status);
> }
> -- snip --
> So basically the per-platform defaults are governed via the
> availability of |O_BINARY| at build time and whether you're reading
> from a tty stdin.
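
the rule described by that help text can be written down as a small decision function. this is a python sketch of what the usage() strings claim, not of the actual coreutils open/read code; default_read_mode and its parameters are hypothetical names.

```python
import os

def default_read_mode(has_o_binary: bool, reading_tty_stdin: bool) -> str:
    """Default mode as stated by the quoted usage() text."""
    if has_o_binary:
        # "-b ... (default unless reading tty stdin)"
        # "-t ... (default if reading tty stdin)"
        return "text" if reading_tty_stdin else "binary"
    # No O_BINARY on this platform: "-t ... (default)"
    return "text"

# os.O_BINARY is only defined by python on windows, so it can stand in
# for the build-time check in this sketch.
has_o_binary = bool(getattr(os, "O_BINARY", 0))
print(default_read_mode(has_o_binary, reading_tty_stdin=False))
```

so on a platform with O_BINARY, a file argument or piped input is read in binary by default, and only an interactive stdin falls back to text.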
> > it renders tgz md5sum verification useless
> Yes and no. Yes, it's not a good idea... but what should we do for
> compatibility on Windows (Cygwin&&SFU) ? On Unix/Linux the
> --text/--binary options are no-ops but we need to be able to produce
> compatible output (e.g. the " "/" *") and read it back (I forgot
> about that part in my patch).
> > where do you see anywhere "the md5sum --binary value for foo.tgz is
> > hexhhexhexhex"
> >
> > my guess is that because of this weaseling
> > Note: There is no difference between binary and text mode option
> > on GNU system.
> > most gnu weaned users call md5sum with neither --text nor --binary
> >
> > and this note lies anyway -- it *does* make a difference: ' ' is printed
> > for text, '*' is printed for binary
> >
> > and on cygwin guess what -- md5sum defaults to binary
> Erm... see |usage()| function above... are you sure this is correct ?
as opposed to _UWIN, there is little gnu code untouched by _CYGWIN
my guess is there are a few _CYGWIN changes in the code used to build on cygwin
> > if there's any change it will be for the md5sum-specific output to do the
> > ' ' vs '*' based on text vs binary, so on all implementations '*' will be
> > printed by default
> AFAIK that's not necessary - see |usage()| above... there are limits
> to the insanity, governed by whether the platform has |O_BINARY| and
> whether the input is a tty or not.
> ... and please only change the output for utilities which match
> "*@(md5|sha@(1|224|256|384|512))sum" ... otherwise we end-up with a
> lot of trouble for scripts which depend on specific output for
> cksum(1) and sum(1) etc.
> > how many scripts will break with that default?
> A lot of scripts which do md5sum and sha256sum verification choke on
> the " "/" *" vs. " " difference... we have that issue at least since
> 2007 when someone from Sun reported the issue in the Sun bugster bug
> database that libcmd "md5sum" can't replace GNU coreutils
> "md5sum"&&co. until this issue has been fixed.
do those scripts use --check to verify the sum?
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers