On Wed, Sep 25, 2013 at 7:03 AM, Glenn Fowler <[email protected]> wrote:
> On Wed, 25 Sep 2013 00:39:18 +0200 Roland Mainz wrote:
>> Attached (as "astksh20130913_md5sum_compat1.diff.txt") is a patch
>> which fixes an incompatibility between AST md5sum(1)&&co. and GNU
>> coreutils md5sum(1)&&co.
>
>> There are three major differences which caused hiccups for 3rd-party scripts:
>> - GNU coreutils md5sum/sha1sum/sha224sum/sha256sum default to text mode
>> - GNU coreutils use a " *" before the file name to indicate binary
>> mode and "  " to indicate text mode... the AST hash utilities used
>> only a single blank " " instead.
>> - "-t" means "text mode" for GNU coreutils while AST used this for "total"
>
>> * Notes:
>> - GNU and AST *sum(1) utilities now have identical output and seem to
>> be 100% compatible with each other
>> - On platforms which do not implement |O_BINARY| and |O_TEXT| the
>> change only affects the separator ("  "/" *"(=new) vs. " "(=old)).
>> Portable applications can use [[:space:]]+ in egrep(1) to make sure
>> they can match the hashes against both the old and new versions of AST
>> *sum(1)
>> - The output *intentionally* changes only for utilities matching the
>> shell pattern "*@(md5|sha@(1|224|256|384|512))sum". This is done to
>> maintain compatibility for cksum(1) and sum(1)
>> - AST does not have a sha224sum(1) utility (yet) ... need to talk to
>>
> I'm sorry but making --text the default on a windows system simply does not
> make sense
Well... blame Cygwin and "Windows Services For Unix" for that crazy
idea. But I was looking at an older version of "md5sum" on Linux...
but it turns out the situation is a bit more complex:
-- snip --
void
usage (int status)
{
  if (status != EXIT_SUCCESS)
    emit_try_help ();
  else
    {
      printf (_("\
Usage: %s [OPTION]... [FILE]...\n\
Print or check %s (%d-bit) checksums.\n\
With no FILE, or when FILE is -, read standard input.\n\
\n\
"),
              program_name,
              DIGEST_TYPE_STRING,
              DIGEST_BITS);
      if (O_BINARY)
        fputs (_("\
  -b, --binary         read in binary mode (default unless reading tty stdin)\n\
"), stdout);
      else
        fputs (_("\
  -b, --binary         read in binary mode\n\
"), stdout);
      printf (_("\
  -c, --check          read %s sums from the FILEs and check them\n"),
              DIGEST_TYPE_STRING);
      fputs (_("\
      --tag            create a BSD-style checksum\n\
"), stdout);
      if (O_BINARY)
        fputs (_("\
  -t, --text           read in text mode (default if reading tty stdin)\n\
"), stdout);
      else
        fputs (_("\
  -t, --text           read in text mode (default)\n\
"), stdout);
      fputs (_("\
\n\
The following three options are useful only when verifying checksums:\n\
      --quiet          don't print OK for each successfully verified file\n\
      --status         don't output anything, status code shows success\n\
  -w, --warn           warn about improperly formatted checksum lines\n\
\n\
"), stdout);
      fputs (_("\
      --strict         with --check, exit non-zero for any invalid input\n\
"), stdout);
      fputs (HELP_OPTION_DESCRIPTION, stdout);
      fputs (VERSION_OPTION_DESCRIPTION, stdout);
      printf (_("\
\n\
The sums are computed as described in %s.  When checking, the input\n\
should be a former output of this program.  The default mode is to print\n\
a line with checksum, a character indicating input mode ('*' for binary,\n\
space for text), and name for each FILE.\n"),
              DIGEST_REFERENCE);
      emit_ancillary_info ();
    }

  exit (status);
}
-- snip --
So basically the per-platform defaults are governed via the
availability of |O_BINARY| at build time and whether you're reading
from a tty stdin.
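
To spell that rule out, here is a minimal sketch derived only from the
help text quoted above, not from the actual coreutils option handling.
|use_binary_mode()|, |mode_marker()| and the -1/0/1 encoding of the
command line choice are made-up names/conventions for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether input is read in binary mode.
 *   binary_flag:   -1 = neither -b nor -t given, 0 = -t, 1 = -b
 *                  (hypothetical encoding, chosen for this sketch)
 *   have_o_binary: true if the platform defines a non-zero O_BINARY
 *   stdin_is_tty:  true if standard input is a terminal
 */
bool use_binary_mode(int binary_flag, bool have_o_binary, bool stdin_is_tty)
{
    assert(binary_flag >= -1 && binary_flag <= 1);

    if (binary_flag >= 0)
        return binary_flag == 1;   /* an explicit -b or -t always wins */
    if (!have_o_binary)
        return false;              /* "read in text mode (default)" */
    return !stdin_is_tty;          /* binary "unless reading tty stdin" */
}

/* Character printed between checksum and file name for the chosen mode. */
char mode_marker(bool binary)
{
    return binary ? '*' : ' ';
}
```

A caller would presumably pass something like |isatty(STDIN_FILENO)| for
the last argument and a build-time test of |O_BINARY| for the second one.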
> it renders tgz md5sum verification useless
Yes and no. Yes, it's not a good idea... but what should we do for
compatibility on Windows (Cygwin&&SFU) ? On Unix/Linux the
--text/--binary options are no-ops but we need to be able to produce
compatible output (e.g. the "  "/" *") and read it back (I forgot
about that part in my patch).
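
For the "read it back" part, a rough sketch of a tolerant line parser
might look like the code below. It accepts the GNU style (blank plus
mode character, i.e. two blanks for text or " *" for binary) as well as
the old AST style with a single blank. This is not the code from my
patch; |struct sum_line| and |parse_sum_line()| are hypothetical names,
and BSD-style "--tag" lines are not handled:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Result of parsing one checksum line (hypothetical type for this sketch). */
struct sum_line {
    size_t      hex_len;  /* number of hex digits in the checksum */
    bool        binary;   /* true if the '*' mode marker was present */
    const char *name;     /* points into the input line at the file name */
};

/*
 * Accepts "<hex>  <name>" (GNU text), "<hex> *<name>" (GNU binary) and
 * "<hex> <name>" (old AST, single blank).  Returns false on malformed input.
 */
bool parse_sum_line(const char *line, struct sum_line *out)
{
    assert(line != NULL && out != NULL);

    size_t n = 0;
    while (isxdigit((unsigned char)line[n]))
        n++;
    if (n == 0 || line[n] != ' ')
        return false;              /* no checksum or no separator */

    const char *p = line + n + 1;  /* skip the mandatory blank */
    out->binary = false;
    if (*p == '*') {               /* GNU binary marker */
        out->binary = true;
        p++;
    } else if (*p == ' ') {        /* GNU text marker */
        p++;
    }                              /* else: old AST single-blank format */

    if (*p == '\0')
        return false;              /* missing file name */
    out->hex_len = n;
    out->name = p;
    return true;
}
```

One known ambiguity: an old-format line whose file name itself starts
with '*' or a blank is indistinguishable from the new format, so this
sketch treats that first character as the mode marker.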
> where do you see anywhere "the md5sum --binary value for foo.tgz is
> hexhhexhexhex"
>
> my guess is that because of this weaseling
> Note: There is no difference between binary and text mode option on
> GNU system.
> most gnu weaned users call md5sum with neither --text nor --binary
>
> and this note lies anyway -- it *does* make a difference ' ' is printed for
> text,
> '*' is printed for binary
>
> and on cygwin guess what -- md5sum defaults to binary
Erm... see |usage()| function above... are you sure this is correct ?
> if there's any change it will be for the md5sum-specific output to do the ' '
> vs '*'
> based on text vs binary so on all implementations '*' will be printed by
> default
AFAIK that's not necessary - see |usage()| above... there are limits
to the insanity, governed by whether the platform has |O_BINARY| and
whether the input is a tty or not.
... and please only change the output for utilities which match
"*@(md5|sha@(1|224|256|384|512))sum" ... otherwise we end up with a
lot of trouble for scripts which depend on specific output for
cksum(1) and sum(1) etc.
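
A plain C sketch of that check, in case the shell pattern is not handy
inside the utility: it expands the pattern into its six possible
suffixes and compares them against the end of the program name.
|wants_gnu_marker()| is a hypothetical helper name, not something that
exists in libcmd:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true if the program name matches the shell pattern
 * "*@(md5|sha@(1|224|256|384|512))sum", i.e. ends in one of the
 * suffixes below.  cksum(1) and sum(1) deliberately do not match.
 */
bool wants_gnu_marker(const char *prog)
{
    static const char *const suffixes[] = {
        "md5sum", "sha1sum", "sha224sum",
        "sha256sum", "sha384sum", "sha512sum"
    };

    assert(prog != NULL);
    size_t len = strlen(prog);

    for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++) {
        size_t slen = strlen(suffixes[i]);
        /* the leading "*" in the pattern allows any prefix, e.g. a path */
        if (len >= slen && strcmp(prog + len - slen, suffixes[i]) == 0)
            return true;
    }
    return false;
}
```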
> how many scripts will break with that default?
A lot of scripts which do md5sum and sha256sum verification choke on
the "  "/" *" vs. " " difference... we have had that issue at least
since 2007, when someone from Sun reported in the Sun Bugster bug
database that libcmd "md5sum" can't replace GNU coreutils
"md5sum"&&co. until this issue has been fixed.
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) [email protected]
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)