Hi,
I have attached a patch that adds SHA-3 to 'cksum' through the
--algorithm (-a) option, along with some tests based on
tests/cksum/b2sum.sh. There are some details that I would like others'
thoughts on before pushing.
1. Originally I wanted the default digest size to be 256 bits instead of
the 512 bits this patch uses. With SHA-2, I see SHA-256 used the vast
majority of the time. SHA-512 is less common, and SHA-224 and SHA-384
are rarer still. If the world eventually switches to SHA-3, I would
guess that people will favor the same digest sizes.
However, this would require more code since the value in the
'algorithm_bits' array is really treated as a maximum supported by the
algorithm. This also would differ from the default behavior of
'cksum -a blake2b' which defaults to the greatest supported digest size,
512 bits.
Less importantly, my Fedora system has a sha3sum command which is a Perl
script to interface with the Digest::SHA3 module. It defaults to 224
bits, but I care less about diverging from that. We are cksum, not
sha3sum, after all.
2. I did not restrict the '--length' option to 224, 256, 384, and 512
bits. Instead, the argument is rounded up to the next standardized
digest size, and that digest is then truncated to the requested length.
Here is an example of the behavior on a file that does not change across
program invocations:
$ ./src/cksum -a sha3 -l 224 COPYING
SHA3-224 (COPYING) =
0e93a263ef507adafd16b2330ba30384c89f56700198efe7b54588a0
$ ./src/cksum -a sha3 -l 112 COPYING
SHA3-112 (COPYING) = 0e93a263ef507adafd16b2330ba3
$ ./src/cksum -a sha3 -l 256 COPYING
SHA3-256 (COPYING) =
edb0016d9f8bafb54540da34f05a8d510de8114488f23916276bdead05509a53
$ ./src/cksum -a sha3 -l 232 COPYING
SHA3-232 (COPYING) =
edb0016d9f8bafb54540da34f05a8d510de8114488f23916276bdead05
$ ./src/cksum -a sha3 -l 384 COPYING
SHA3-384 (COPYING) =
93b8fc41e79c2445f8d653c56a1265f12d6c51d54f9ba17c015cde6e35bdb0c4a200a656beab782307bb4912dec1f8f0
$ ./src/cksum -a sha3 -l 264 COPYING
SHA3-264 (COPYING) =
93b8fc41e79c2445f8d653c56a1265f12d6c51d54f9ba17c015cde6e35bdb0c4a2
$ ./src/cksum -a sha3 -l 512 COPYING
SHA3-512 (COPYING) =
678655c1f91fb4dbb27e1450fb41bcfd0209339c3493c595ab1fc294dd7a04eb23dc74934aa2229d990b8eb92f8f89528667b7c604548f134c950b0edda374ef
$ ./src/cksum -a sha3 -l 392 COPYING
SHA3-392 (COPYING) =
678655c1f91fb4dbb27e1450fb41bcfd0209339c3493c595ab1fc294dd7a04eb23dc74934aa2229d990b8eb92f8f895286
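To make the rounding concrete, here is a rough standalone sketch of the
selection rule behind the examples above. It is not the code from the
patch: the helper name sha3_base_size and the sha3_sizes table are made
up for illustration, and lengths are in bytes as they are inside
digest_file.

```c
#include <assert.h>
#include <stddef.h>

/* Digest sizes standardized in FIPS 202, in bytes
   (224, 256, 384, and 512 bits).  */
static size_t const sha3_sizes[] = { 28, 32, 48, 64 };

/* Hypothetical helper, not part of the patch: return the size in bytes
   of the smallest SHA-3 digest that can hold LENGTH bytes.  The caller
   is assumed to truncate the computed digest to LENGTH bytes.  */
static size_t
sha3_base_size (size_t length)
{
  /* Assumption: the caller has already validated LENGTH, i.e. it is
     nonzero and no larger than the 64-byte maximum.  */
  assert (0 < length && length <= 64);

  for (size_t i = 0; i < sizeof sha3_sizes / sizeof *sha3_sizes; i++)
    if (length <= sha3_sizes[i])
      return sha3_sizes[i];

  return 0; /* Not reached for validated input.  */
}
```

With this rule, -l 232 (29 bytes) selects SHA3-256 and -l 392 (49 bytes)
selects SHA3-512, which matches the digest prefixes shown above.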
I am conflicted on this behavior since it might give the impression that
SHA3-232, for example, is standardized in FIPS-202 [1] when that is not
the case. This differs from the blake2b specification, which states that
digests can be anywhere from 1 to 64 bytes [2].
My idea was to use a name similar to the truncated SHA-2 algorithms that
NIST standardized, SHA-512/224 and SHA-512/256. NIST never standardized
similar algorithms for SHA-3, for reasons I do not know. Maybe it is
because SHA-3 is not vulnerable to the length extension attack that
SHA-2 is, but I do not think that was the rationale for the truncated
variants in the first place [3][4].
Anyway, I think the notation SHA3-<DIGEST-SIZE>/<TRUNCATED-SIZE> has
less potential to cause problems. NIST acknowledges that there is
sometimes a need to truncate digests [5]. Therefore, I think this
notation would be well understood.
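As a rough sketch of how such a tag could be produced (the helper
sha3_tag is hypothetical and is not in the attached patch, which
currently prints SHA3-<LENGTH> unconditionally):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of the patch: write into BUF the tag
   for a SHA-3 digest of BASE_BITS bits truncated to OUT_BITS bits.
   Untruncated digests keep the plain SHA3-<DIGEST-SIZE> form.
   Return what snprintf returns, i.e. the length of the full tag.  */
static int
sha3_tag (char *buf, size_t buflen, unsigned int base_bits,
          unsigned int out_bits)
{
  /* A digest can only be shortened, never lengthened.  */
  assert (out_bits <= base_bits);

  if (out_bits == base_bits)
    return snprintf (buf, buflen, "SHA3-%u", base_bits);
  return snprintf (buf, buflen, "SHA3-%u/%u", base_bits, out_bits);
}
```

For example, a 232-bit request would be labeled SHA3-256/232, while an
untruncated 256-bit digest would stay SHA3-256.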
Thoughts?
Collin
[1] https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.202.pdf
[2] https://www.blake2.net/blake2.pdf
[3] https://en.wikipedia.org/wiki/Length_extension_attack
[4]
https://csrc.nist.gov/csrc/media/Projects/crypto-publication-review-project/documents/initial-comments/fips180-4-initial-public-comments-2022.pdf
[5]
https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-107r1.pdf
From e028c657cb49a35385d391f28bd5b6400205a460 Mon Sep 17 00:00:00 2001
Message-ID: <e028c657cb49a35385d391f28bd5b6400205a460.1756695368.git.collin.fu...@gmail.com>
From: Collin Funk <[email protected]>
Date: Sun, 31 Aug 2025 16:56:08 -0700
Subject: [PATCH] cksum: add support for SHA-3
* src/digest.c: Include sha3.h.
(BLAKE2B_MAX_LEN): Rename to
DIGEST_MAX_LEN since it is also used for SHA-3.
(sha3_sum_stream): New function.
(enum Algorithm, algorithm_args, algorithm_types, algorithm_tags,
algorithm_bits, cksumfns, cksum_output_fns): Add entries
for SHA-3.
(usage): Mention that SHA-3 is supported. Mention that 'cksum -a sha3'
supports the --length option.
(split_3): Use DIGEST_MAX_LEN instead of BLAKE2B_MAX_LEN. Determine the
length of the digest for SHA-3.
(digest_file): Set the digest length in bytes. Use DIGEST_MAX_LEN
instead of BLAKE2B_MAX_LEN. Always append the digest length to SHA3 in
the output.
(main): Allow the use of --length with 'cksum -a sha3'. Use
DIGEST_MAX_LEN instead of BLAKE2B_MAX_LEN.
* tests/cksum/cksum-base64.pl (@pairs): Add expected sha3 output.
(fmt): Modify the output to use SHA3-512 since that is the default.
* tests/cksum/cksum-sha3.sh: New test, based on tests/cksum/b2sum.sh.
* tests/local.mk (all_tests): Add the test.
* bootstrap.conf: Add crypto/sha3.
* gnulib: Update to latest commit.
* NEWS: Mention the change.
* doc/coreutils.texi (cksum general options): Mention sha3 as a
supported argument to the -a option. Mention that 'cksum -a sha3'
supports the --length option. Mention that SHA-3 is considered secure.
---
NEWS | 4 +++
bootstrap.conf | 1 +
doc/coreutils.texi | 4 ++-
gnulib | 2 +-
src/digest.c | 66 ++++++++++++++++++++++++++----------
tests/cksum/cksum-base64.pl | 3 ++
tests/cksum/cksum-sha3.sh | 67 +++++++++++++++++++++++++++++++++++++
tests/local.mk | 1 +
8 files changed, 128 insertions(+), 20 deletions(-)
create mode 100755 tests/cksum/cksum-sha3.sh
diff --git a/NEWS b/NEWS
index 24430cedb..980d23b86 100644
--- a/NEWS
+++ b/NEWS
@@ -87,6 +87,10 @@ GNU coreutils NEWS -*- outline -*-
basenc supports the --base58 option to encode and decode
the visually unambiguous Base58 encoding.
+ cksum -a now supports the 'sha3' argument, to use the SHA3-224,
+ SHA3-256, SHA3-384, SHA3-512 message digest algorithms depending on
+ the argument passed to the --length (-l) option. SHA3-512 is the default.
+
'date' now outputs dates in the country's native calendar for the
Iranian locale (fa_IR) and for the Ethiopian locale (am_ET), and also
does so more consistently for the Thailand locale (th_TH.UTF-8).
diff --git a/bootstrap.conf b/bootstrap.conf
index 49fcf30f3..6fde4d151 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -68,6 +68,7 @@ gnulib_modules="
crc-x86_64
crypto/md5
crypto/sha1
+ crypto/sha3
crypto/sha256
crypto/sha512
crypto/sm3
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 3f0931e1a..c16278169 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -4151,6 +4151,7 @@ @node cksum general options
@samp{sha256} equivalent to @command{sha256sum}
@samp{sha384} equivalent to @command{sha384sum}
@samp{sha512} equivalent to @command{sha512sum}
+@samp{sha3} only available through @command{cksum}
@samp{blake2b} equivalent to @command{b2sum}
@samp{sm3} only available through @command{cksum}
@end example
@@ -4180,6 +4181,7 @@ @node cksum general options
@opindex -l
@opindex --length
@cindex BLAKE2 hash length
+@cindex SHA-3 hash length
Change (shorten) the default digest length.
This is specified in bits and thus must be a multiple of 8.
This option is ignored when @option{--check} is specified,
@@ -4368,7 +4370,7 @@ @node md5sum invocation
fingerprint is considered infeasible at the moment, it is known how
to modify certain files, including digital certificates, so that they
appear valid when signed with an \hash\ digest. For more secure hashes,
-consider using SHA-2 or @command{b2sum}.
+consider using SHA-2, SHA-3, or @command{b2sum}.
@xref{sha2 utilities}. @xref{b2sum invocation}.
@end macro
@weakHash{MD5}
diff --git a/gnulib b/gnulib
index 9b07115f4..6c5936ddd 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 9b07115f4a344effef1dde8bd0e6e356d4b0e744
+Subproject commit 6c5936dddf2b9685675624313065f2379305598a
diff --git a/src/digest.c b/src/digest.c
index 0e4e62dee..553187b36 100644
--- a/src/digest.c
+++ b/src/digest.c
@@ -55,6 +55,9 @@
# include "sha512.h"
#endif
#if HASH_ALGO_CKSUM
+# include "sha3.h"
+#endif
+#if HASH_ALGO_CKSUM
# include "sm3.h"
#endif
#include "fadvise.h"
@@ -216,10 +219,15 @@ static bool base64_digest = false;
/* If true, print binary digests, not hex. */
static bool raw_digest = false;
+/* blake2b and sha3 allow the -l option. Luckily they both have the same
+ maximum digest size. */
#if HASH_ALGO_BLAKE2 || HASH_ALGO_CKSUM
-# define BLAKE2B_MAX_LEN BLAKE2B_OUTBYTES
+# if HASH_ALGO_CKSUM
+static_assert (BLAKE2B_OUTBYTES == SHA3_512_DIGEST_SIZE);
+# endif
+# define DIGEST_MAX_LEN BLAKE2B_OUTBYTES
static uintmax_t digest_length;
-#endif /* HASH_ALGO_BLAKE2 */
+#endif /* HASH_ALGO_BLAKE2 || HASH_ALGO_CKSUM */
typedef void (*digest_output_fn)(char const *, int, void const *, bool,
bool, unsigned char, bool, uintmax_t);
@@ -279,6 +287,20 @@ sha512_sum_stream (FILE *stream, void *resstream,
return sha512_stream (stream, resstream);
}
static int
+sha3_sum_stream (FILE *stream, void *resstream, uintmax_t *length)
+{
+ if (*length <= SHA3_224_DIGEST_SIZE)
+ return sha3_224_stream (stream, resstream);
+ else if (*length <= SHA3_256_DIGEST_SIZE)
+ return sha3_256_stream (stream, resstream);
+ else if (*length <= SHA3_384_DIGEST_SIZE)
+ return sha3_384_stream (stream, resstream);
+ else if (*length <= SHA3_512_DIGEST_SIZE)
+ return sha3_512_stream (stream, resstream);
+ else
+ unreachable ();
+}
+static int
blake2b_sum_stream (FILE *stream, void *resstream, uintmax_t *length)
{
return blake2b_stream (stream, resstream, *length);
@@ -301,6 +323,7 @@ enum Algorithm
sha256,
sha384,
sha512,
+ sha3,
blake2b,
sm3,
};
@@ -308,24 +331,24 @@ enum Algorithm
static char const *const algorithm_args[] =
{
"bsd", "sysv", "crc", "crc32b", "md5", "sha1", "sha224",
- "sha256", "sha384", "sha512", "blake2b", "sm3", nullptr
+ "sha256", "sha384", "sha512", "sha3", "blake2b", "sm3", nullptr
};
static enum Algorithm const algorithm_types[] =
{
bsd, sysv, crc, crc32b, md5, sha1, sha224,
- sha256, sha384, sha512, blake2b, sm3,
+ sha256, sha384, sha512, sha3, blake2b, sm3,
};
ARGMATCH_VERIFY (algorithm_args, algorithm_types);
static char const *const algorithm_tags[] =
{
"BSD", "SYSV", "CRC", "CRC32B", "MD5", "SHA1", "SHA224",
- "SHA256", "SHA384", "SHA512", "BLAKE2b", "SM3", nullptr
+ "SHA256", "SHA384", "SHA512", "SHA3", "BLAKE2b", "SM3", nullptr
};
static int const algorithm_bits[] =
{
16, 16, 32, 32, 128, 160, 224,
- 256, 384, 512, 512, 256, 0
+ 256, 384, 512, 512, 512, 256, 0
};
static_assert (ARRAY_CARDINALITY (algorithm_bits)
@@ -345,6 +368,7 @@ static sumfn cksumfns[]=
sha256_sum_stream,
sha384_sum_stream,
sha512_sum_stream,
+ sha3_sum_stream,
blake2b_sum_stream,
sm3_sum_stream,
};
@@ -362,6 +386,7 @@ static digest_output_fn cksum_output_fns[]=
output_file,
output_file,
output_file,
+ output_file,
};
bool cksum_debug;
#endif
@@ -479,7 +504,8 @@ Print or check %s (%d-bit) checksums.\n\
# if HASH_ALGO_BLAKE2 || HASH_ALGO_CKSUM
fputs (_("\
-l, --length=BITS digest length in bits; must not exceed the max for\n\
- the blake2 algorithm and must be a multiple of 8\n\
+ the blake2 and sha3 algorithms and must be a\n\
+ multiple of 8\n\
"), stdout);
# endif
# if HASH_ALGO_CKSUM
@@ -544,6 +570,7 @@ DIGEST determines the digest algorithm and default output format:\n\
sha256 (equivalent to sha256sum)\n\
sha384 (equivalent to sha384sum)\n\
sha512 (equivalent to sha512sum)\n\
+ sha3 (only available through cksum)\n\
blake2b (equivalent to b2sum)\n\
sm3 (only available through cksum)\n\
\n"), stdout);
@@ -823,7 +850,7 @@ split_3 (char *s, size_t s_len,
s[--i] = '(';
# if HASH_ALGO_BLAKE2
- digest_length = BLAKE2B_MAX_LEN * 8;
+ digest_length = DIGEST_MAX_LEN * 8;
# else
digest_length = algorithm_bits[cksum_algorithm];
# endif
@@ -865,14 +892,14 @@ split_3 (char *s, size_t s_len,
#if HASH_ALGO_BLAKE2 || HASH_ALGO_CKSUM
/* Auto determine length. */
# if HASH_ALGO_CKSUM
- if (cksum_algorithm == blake2b) {
+ if (cksum_algorithm == blake2b || cksum_algorithm == sha3) {
# endif
unsigned char const *hp = *digest;
digest_hex_bytes = 0;
while (c_isxdigit (*hp++))
digest_hex_bytes++;
if (digest_hex_bytes < 2 || digest_hex_bytes % 2
- || BLAKE2B_MAX_LEN * 2 < digest_hex_bytes)
+ || DIGEST_MAX_LEN * 2 < digest_hex_bytes)
return false;
digest_length = digest_hex_bytes * 4;
# if HASH_ALGO_CKSUM
@@ -1013,7 +1040,7 @@ digest_file (char const *filename, int *binary, unsigned char *bin_result,
fadvise (fp, FADVISE_SEQUENTIAL);
#if HASH_ALGO_CKSUM
- if (cksum_algorithm == blake2b)
+ if (cksum_algorithm == blake2b || cksum_algorithm == sha3)
*length = digest_length / 8;
err = DIGEST_STREAM (fp, bin_result, length);
#elif HASH_ALGO_SUM
@@ -1064,12 +1091,14 @@ output_file (char const *file, int binary_file, void const *digest,
{
fputs (DIGEST_TYPE_STRING, stdout);
# if HASH_ALGO_BLAKE2
- if (digest_length < BLAKE2B_MAX_LEN * 8)
+ if (digest_length < DIGEST_MAX_LEN * 8)
printf ("-%ju", digest_length);
# elif HASH_ALGO_CKSUM
+ if (cksum_algorithm == sha3)
+ printf ("-%ju", digest_length);
if (cksum_algorithm == blake2b)
{
- if (digest_length < BLAKE2B_MAX_LEN * 8)
+ if (digest_length < DIGEST_MAX_LEN * 8)
printf ("-%ju", digest_length);
}
# endif
@@ -1478,17 +1507,18 @@ main (int argc, char **argv)
min_digest_line_length = MIN_DIGEST_LINE_LENGTH;
#if HASH_ALGO_BLAKE2 || HASH_ALGO_CKSUM
# if HASH_ALGO_CKSUM
- if (digest_length && cksum_algorithm != blake2b)
+ if (digest_length && (cksum_algorithm != blake2b && cksum_algorithm != sha3))
error (EXIT_FAILURE, 0,
- _("--length is only supported with --algorithm=blake2b"));
+ _("--length is only supported with --algorithm=blake2b or "
+ "--algorithm=sha3"));
# endif
- if (digest_length > BLAKE2B_MAX_LEN * 8)
+ if (digest_length > DIGEST_MAX_LEN * 8)
{
error (0, 0, _("invalid length: %s"), quote (digest_length_str));
error (EXIT_FAILURE, 0,
_("maximum digest length for %s is %d bits"),
quote (DIGEST_TYPE_STRING),
- BLAKE2B_MAX_LEN * 8);
+ DIGEST_MAX_LEN * 8);
}
if (digest_length % 8 != 0)
{
@@ -1498,7 +1528,7 @@ main (int argc, char **argv)
if (digest_length == 0)
{
# if HASH_ALGO_BLAKE2
- digest_length = BLAKE2B_MAX_LEN * 8;
+ digest_length = DIGEST_MAX_LEN * 8;
# else
digest_length = algorithm_bits[cksum_algorithm];
# endif
diff --git a/tests/cksum/cksum-base64.pl b/tests/cksum/cksum-base64.pl
index a174c0ebe..5f34296a9 100755
--- a/tests/cksum/cksum-base64.pl
+++ b/tests/cksum/cksum-base64.pl
@@ -36,6 +36,7 @@ my @pairs =
['sha256', "47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU="],
['sha384', "OLBgp1GsljhM2TJ+sbHjaiH9txEUvgdDTAzHv2P24donTt6/529l+9Ua0vFImLlb"],
['sha512', "z4PhNX7vuL3xVChQ1m2AB9Yg5AULVxXcg/SpIdNs6c5H0NE8XYXysP+DGNKHfuwvY7kxvUdBeoGlODJ6+SfaPg=="],
+ ['sha3', "pp9zzKI6msXItWfcGFp1bpfJghZP4lhZ4NHcwUdcgKYVshI68fX5TBHj6UAsOsVY9QAZnZW20+MBdYWGKB3NJg=="],
['blake2b', "eGoC90IBWQPGxv2FJVLScpEvR0DhWEdhiobiF/cfVBnSXhAxr+5YUxOJZESTTrBLkDpoWxRIt1XVb3Aa/pvizg=="],
['sm3', "GrIdg1XPoX+OYRlIMegajyK+yMco/vt0ftA161CCqis="],
);
@@ -47,6 +48,8 @@ sub fmt ($$) {
$h !~ m{^(sysv|bsd|crc|crc32b)$} and $v = uc($h). " (f) = $v";
# BLAKE2b is inconsistent:
$v =~ s{BLAKE2B}{BLAKE2b};
+ # 'cksum -a sha3' defaults to SHA3-512.
+ $v =~ s/^SHA3\b/SHA3-512/;
return "$v"
}
diff --git a/tests/cksum/cksum-sha3.sh b/tests/cksum/cksum-sha3.sh
new file mode 100755
index 000000000..9de07af2d
--- /dev/null
+++ b/tests/cksum/cksum-sha3.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+# 'cksum -a sha3' tests.
+
+# Copyright (C) 2025 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ cksum
+getlimits_
+
+# Ensure we can --check the --tag format we produce
+for i in 'a' ' b' '*c' '44' ' '; do
+ echo "$i" > "$i"
+ for l in 0 224 256 384 512; do
+ cksum -a sha3 -l $l "$i" >> check.sha3
+ done
+done
+# Note -l is inferred from the tags in the mixed format file
+cksum -a sha3 --strict -c check.sha3 || fail=1
+
+# Also ensure the openssl tagged variant works
+sed 's/ //; s/ =/=/' < check.sha3 > openssl.sha3 || framework_failure_
+cksum -a sha3 --strict -c openssl.sha3 || fail=1
+
+# Ensure we can check non tagged format
+for l in 0 224 256 384 512; do
+ cksum -a sha3 --untagged --text -l $l /dev/null \
+ | tee -a check.vals > check.sha3
+ cksum -a sha3 -l $l --strict -c check.sha3 || fail=1
+ cksum -a sha3 --strict -c check.sha3 || fail=1
+done
+
+# Ensure the checksum values are correct. The reference
+# check.vals was created using OpenSSL.
+cksum -a sha3 --length=256 check.vals > out.tmp || fail=1
+tr '*' ' ' < out.tmp > out || framework_failure_ # Remove binary tag on cygwin
+printf '%s' 'SHA3-256 (check.vals) = ' > exp
+echo 'b54024fe765df4c0eb166c87a1aad53bd960ddabf62cbd153242ac337c26d70a' >> exp
+compare exp out || fail=1
+
+# This would fail before coreutils-9.4
+# Only validate the last specified, used length
+cksum -a sha3 -l 253 -l 256 /dev/null || fail=1
+
+# This would not flag an error in coreutils 9.6 and 9.7
+for len in 513 1024 $UINTMAX_OFLOW; do
+ returns_ 1 cksum -a sha3 -l $len /dev/null 2>err || fail=1
+ cat <<EOF > exp || framework_failure_
+cksum: invalid length: '$len'
+cksum: maximum digest length for 'SHA3' is 512 bits
+EOF
+ compare exp err || fail=1
+done
+
+Exit $fail
diff --git a/tests/local.mk b/tests/local.mk
index bcb01dd8e..a42a20fbe 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -380,6 +380,7 @@ all_tests = \
tests/cksum/sha256sum.pl \
tests/cksum/sha384sum.pl \
tests/cksum/sha512sum.pl \
+ tests/cksum/cksum-sha3.sh \
tests/shred/shred-exact.sh \
tests/shred/shred-passes.sh \
tests/shred/shred-remove.sh \
--
2.51.0