Pádraig Brady <[email protected]> writes:

> Nice one.
>
> Note the usage() also states that both files being stdin
> is not supported, so we should remove the " (not both)" qualification.

Oh, nice catch. I didn't see that.

I also noticed just now that POSIX says this behavior is undefined [1]:

    If both file1 and file2 refer to standard input or to the same FIFO
    special, block special, or character special file, the results are
    undefined.

I tested the following implementations that behave the same as us (minus
the closing standard input twice):

    * FreeBSD 15.0
    * MacOS 26
    * OpenBSD 7.8
    * NetBSD 10.1
    * Solaris 11.4
    * BusyBox
    * Toybox

The one case I could find that didn't was AIX 7.3:

    $ seq 9 | sed 'n;n;p' | /bin/comm - -   
    comm: 0653-039 File1 and File2 can not both refer to stdin.
    Usage: comm [-123] File1 File2

I'm not too sure why anyone would invoke 'comm' like this, but I figured
the behavior was worth documenting in the manual. I didn't mention other
implementations since they may change in the future, who knows.

>> +# In coreutils 9.11 and earlier, running 'comm - -' would close
>> +# standard input twice, leading to an incorrect exit status.
>> +echo 'y' >exp || framework_failure_
>> +yes | head -n 1 | timeout 10 comm - - >out 2>err || fail=1
>
> I'm a bit confused with the use of yes(1) here.
> It might be better to use a more descriptive example
> showing the alternation of lines consumed from the same input:
>
> $ seq 9 | sed 'n;n;p' | src/comm - -
> 1
>       2
>               3
> 4
>       5
>               6
> 7
>       8
>               9

Yep, that is much better. The was testing something else and the
invocation was mostly copy pasted. I agree it looks a bit silly now.

Anyways, I pushed the attached patch which uses your improved test.

Collin

[1] https://pubs.opengroup.org/onlinepubs/9799919799/utilities/comm.html

>From c5ddd417aa8d8e1cf070445760e1747410fb42be Mon Sep 17 00:00:00 2001
Message-ID: <c5ddd417aa8d8e1cf070445760e1747410fb42be.1776911134.git.collin.fu...@gmail.com>
From: Collin Funk <[email protected]>
Date: Tue, 21 Apr 2026 20:10:55 -0700
Subject: [PATCH v2] comm: don't close standard input twice

* NEWS: Mention the bug fix.
* src/comm.c (usage): Remove mention that FILE1 and FILE2 cannot both be
standard input.
(compare_files): Only close standard input once.
* doc/coreutils.texi (comm invocation): Document the behavior of
'comm - -' which is not portable to all implementations.
* tests/comm/dash-dash.sh: New file.
* tests/misc/comm.pl: Move to tests/comm/comm.pl.
* tests/local.mk (all_tests): Add the new test. Rename the existing
test.
---
 NEWS                         |  6 ++++++
 doc/coreutils.texi           |  6 ++++++
 src/comm.c                   |  6 ++++--
 tests/{misc => comm}/comm.pl |  0
 tests/comm/dash-dash.sh      | 31 +++++++++++++++++++++++++++++++
 tests/local.mk               |  3 ++-
 6 files changed, 49 insertions(+), 3 deletions(-)
 rename tests/{misc => comm}/comm.pl (100%)
 create mode 100755 tests/comm/dash-dash.sh

diff --git a/NEWS b/NEWS
index 038a1b520..a827cf0d9 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,12 @@ GNU coreutils NEWS                                    -*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  'comm - -' no longer closes standard input twice.  Previously it would
+  mistakenly exit with a nonzero status.
+  [This bug was present in "the beginning".]
+
 
 * Noteworthy changes in release 9.11 (2026-04-20) [stable]
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 0f9d3e276..26d405b0b 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -5502,6 +5502,12 @@ @node comm invocation
 
 @end table
 
+POSIX leaves the behavior undefined when @var{file1} and @var{file2}
+both refer to standard input, i.e., when both are @samp{-}.  GNU
+@command{comm} reads the first line as if it came from @var{file1} and
+the second as if it were from @var{file2}, and continues alternating in
+this manner until the end of file is reached.
+
 @node ptx invocation
 @section @command{ptx}: Produce permuted indexes
 
diff --git a/src/comm.c b/src/comm.c
index 8e671f40d..b2ceca382 100644
--- a/src/comm.c
+++ b/src/comm.c
@@ -112,7 +112,7 @@ Compare sorted files FILE1 and FILE2 line by line.\n\
 "), stdout);
       fputs (_("\
 \n\
-When FILE1 or FILE2 (not both) is -, read standard input.\n\
+When FILE1 or FILE2 is -, read standard input.\n\
 "), stdout);
       fputs (_("\
 \n\
@@ -388,7 +388,9 @@ compare_files (char **infiles)
           }
     }
 
-  for (int i = 0; i < 2; i++)
+  /* Avoid closing standard input twice when invoking 'comm - -'.  */
+  const int n_streams = 2 - (streams[0] == streams[1]);
+  for (int i = 0; i < n_streams; i++)
     if (fclose (streams[i]) != 0)
       error (EXIT_FAILURE, errno, "%s", quotef (infiles[i]));
 
diff --git a/tests/misc/comm.pl b/tests/comm/comm.pl
similarity index 100%
rename from tests/misc/comm.pl
rename to tests/comm/comm.pl
diff --git a/tests/comm/dash-dash.sh b/tests/comm/dash-dash.sh
new file mode 100755
index 000000000..9d7dc6afc
--- /dev/null
+++ b/tests/comm/dash-dash.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+# Test that 'comm - -' works.
+
+# Copyright (C) 2026 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ comm
+
+printf '1\n\t2\n\t\t3\n4\n\t5\n\t\t6\n7\n\t8\n\t\t9\n' >exp \
+  || framework_failure_
+
+# In coreutils 9.11 and earlier, running 'comm - -' would close
+# standard input twice, leading to an incorrect exit status.
+seq 9 | sed 'n;n;p' | timeout 10 comm - - >out 2>err || fail=1
+compare exp out || fail=1
+compare /dev/null err || fail=1
+
+Exit $fail
diff --git a/tests/local.mk b/tests/local.mk
index 628e54436..fde3ea989 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -324,7 +324,8 @@ all_tests =					\
   tests/cksum/cksum-base64.pl			\
   tests/cksum/cksum-base64-untagged.sh		\
   tests/cksum/cksum-raw.sh			\
-  tests/misc/comm.pl				\
+  tests/comm/comm.pl				\
+  tests/comm/dash-dash.sh			\
   tests/csplit/csplit.sh			\
   tests/csplit/csplit-1000.sh			\
   tests/csplit/csplit-heap.sh			\
-- 
2.53.0

Reply via email to