A NOTE has been added to this issue. 
====================================================================== 
https://austingroupbugs.net/view.php?id=243 
====================================================================== 
Reported By:                dwheeler
Assigned To:                ajosey
====================================================================== 
Project:                    1003.1(2008)/Issue 7
Issue ID:                   243
Category:                   Shell and Utilities
Type:                       Enhancement Request
Severity:                   Objection
Priority:                   normal
Status:                     Under Review
Name:                       David A. Wheeler 
Organization:               IDA 
User Reference:              
Section:                    find 
Page Number:                2740 
Line Number:                89194 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2010-04-29 19:23 UTC
Last Modified:              2023-01-09 16:20 UTC
====================================================================== 
Summary:                    Add -print0 to "find"
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
has duplicate       0000244 Add -0 to xargs
has duplicate       0000245 Add -0 option to shell's "read"
has duplicate       0000903 Please, add find -print0, xargs -0, rea...
====================================================================== 

---------------------------------------------------------------------- 
 (0006100) geoffclare (manager) - 2023-01-09 16:20
 https://austingroupbugs.net/view.php?id=243#c6100 
---------------------------------------------------------------------- 
Page and line numbers are for Issue 8 draft 2.1.

On page 2763 line 91806 section find (OPERANDS),
change:<blockquote><b>-print</b><blockquote>The primary shall always
evaluate as true; it shall cause the current pathname to be written to
standard
output.</blockquote></blockquote>to:<blockquote><b>-print</b><blockquote>The
primary shall always evaluate as true; it shall cause the current pathname
to be written to standard output, followed by a
<newline>.</blockquote><b>-print0</b><blockquote>The primary shall always
evaluate as true; it shall cause the current pathname to be written to
standard output, followed by a null byte.</blockquote></blockquote>
On page 2765 line 91869 section find (STDOUT), change:<blockquote>current
pathnames to be written</blockquote>to:<blockquote>current pathname to be
written</blockquote>
After page 2765 line 91871 section find (STDOUT), add:<blockquote>The
<b>-print0</b> primary shall cause the current pathname to be written to
standard output, followed by a null byte.</blockquote>
On page 2766 line 91911 section find (EXAMPLES), after:<blockquote>They
both write out the entire directory hierarchy from the current
directory.</blockquote>append:<blockquote>With this output format, if any
pathnames include <newline> characters, it is not possible to tell where
each pathname begins and ends. This problem can be avoided by omitting such
pathnames:<pre>LC_ALL=POSIX find . -name $'*\n*' -prune -o -print</pre>or
by using a sentinel in the pathname that <i>find</i> would never otherwise
produce, such as:<pre>find .//. -print</pre>or by using <b>-print0</b>
instead of <b>-print</b> and processing the output with a utility that can
accept null-terminated pathnames as input, such as <i>xargs</i> with the
<b>-0</b> option or <i>read</i> with <b>-d</b> "", for example:<pre>find . -print0 | while IFS= read -rd "" file
do
    # process "$file"
done</pre>It should be noted that using <i>find</i> with <b>-print0</b> to
pipe input to <i>xargs</i> <b>-0</b> is less safe than using <i>find</i>
with <b>-exec</b> because if <i>find</i> <b>-print0</b> is terminated after
it has written a partial pathname, the partial pathname will be processed
as if it was a complete pathname.</blockquote>
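
For illustration only (not part of the proposed text), here is a minimal sketch of the difference between the two output formats. It assumes a find that already implements -print0 and an xargs that already implements -0, as the GNU and BSD versions do; the scratch directory and file names are made up for the example.

```shell
# Scratch directory holding one ordinary name and one name that
# contains a <newline>.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/two
lines"

# Newline-delimited output: the second pathname spans two lines, so
# counting lines over-counts the files.
lines=$(find "$dir" -type f -print | wc -l | tr -d ' ')

# Null-delimited output: xargs -0 passes exactly one argument per pathname.
args=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "lines=$lines args=$args"
rm -rf "$dir"
```

The line count comes out larger than the argument count because the newline inside the second name is indistinguishable from a separator in the -print output.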
On page 2769 line 92033-92037 section find (RATIONALE),
delete:<blockquote>Other implementations [...] it would now be
reading.</blockquote>
On page 3106 line 105084 section read (SYNOPSIS),
change:<blockquote><pre>read [-r] <i>var</i>...</pre></blockquote>to:<blockquote><pre>read [-r] [-d <i>delim</i>] <i>var</i>...</pre></blockquote>
On page 3106 line 105088 section read (DESCRIPTION), change:<blockquote>By
default, unless the <b>-r</b> option is specified, <backslash> shall act as
an escape character. An unescaped <backslash> shall preserve the literal
value of the following character, with the exception of a <newline>. If a
<newline> follows the <backslash>, the <i>read</i> utility shall interpret
this as line continuation. The <backslash> and <newline> shall be removed
before splitting the input into fields.</blockquote>to:<blockquote>By
default, unless the <b>-r</b> option is specified, <backslash> shall act as
an escape character. An unescaped <backslash> shall preserve the literal
value of the following character, with the exception of either <newline> or
the logical line delimiter specified with the <b>-d</b> <i>delim</i> option
(if it is used and <i>delim</i> is not <newline>); it is unspecified which.
If this excepted character follows the <backslash>, the <i>read</i> utility
shall interpret this as line continuation. The <backslash> and the excepted
character shall be removed before splitting the input into
fields.</blockquote>
On page 3106 line 105097 section read (DESCRIPTION), change:<blockquote>The
terminating <newline> (if any) shall be removed from the
input</blockquote>to:<blockquote>The terminating logical line delimiter (if
any) shall be removed from the input</blockquote>
After page 3106 line 105115 section read (DESCRIPTION), add:
<blockquote>If end-of-file is detected before a terminating logical line
delimiter is encountered, the variables specified by the <i>var</i>
operands shall be set as described above and the exit status shall be 1.</blockquote>
On page 3106 line 105118 section read (OPTIONS), change:<blockquote>The
following option is supported:</blockquote>to:<blockquote>The following
options shall be supported:

<b>-d</b> <i>delim</i><blockquote>If <i>delim</i> consists of one
single-byte character, that byte shall be used as the logical line
delimiter. If <i>delim</i> is the null string, the logical line delimiter
shall be the null byte. Otherwise, the behavior is
unspecified.</blockquote></blockquote>
On page 3107 line 105125 section read (STDIN), change:<blockquote>The
standard input shall be a text file.</blockquote>to:<blockquote>If the
<b>-d</b> <i>delim</i> option is not specified, or if it is specified and
<i>delim</i> consists of one single-byte character, the standard input
shall contain zero or more characters and shall not contain any null
bytes.

If the <b>-d</b> <i>delim</i> option is specified and <i>delim</i> is the
null string, the standard input shall contain zero or more bytes (which
need not form valid characters).</blockquote>
After page 3108 line 105167 section read (APPLICATION USAGE), add two new
paragraphs:<blockquote>The <b>-d</b> <i>delim</i> option enables reading up
to an arbitrary single-byte delimiter. When <i>delim</i> is the null
string, the delimiter is the null byte and this allows <i>read</i> to be
used to process null-terminated lists of pathnames (as produced by the
<i>find</i> <b>-print0</b> primary), with correct handling of pathnames
that contain <newline> characters. Note that in order to specify the null
string as the delimiter, <b>-d</b> and <i>delim</i> need to be specified as
two separate arguments. Implementations differ in their handling of
<backslash> for line continuation when <b>-d</b> <i>delim</i> is specified
(and <i>delim</i> is not <newline>); some treat <backslash><i>delim</i> (or
<backslash><NUL> if <i>delim</i> is the null string) as a line
continuation, whereas others still treat <backslash><newline> as a line
continuation. Consequently, portable applications need to specify <b>-r</b>
whenever they specify <b>-d</b> <i>delim</i> (and <i>delim</i> is not
<newline>).

When the current locale is not the C or POSIX locale, pathnames can contain
bytes that do not form part of a valid character, and therefore portable
applications need to ensure that the current locale is the C or POSIX
locale when using <i>read</i> with arbitrary pathnames as input. (If
<i>IFS</i> is not set to the null string this applies even when using
<b>-d</b> "", because the field splitting performed by <i>read</i> is a
character-based operation.) When reading a pathname it is also inadvisable
to use the contents of the first <i>var</i> operand, if non-empty, when the
exit status of <i>read</i> is 1, as it is likely the result of the command
used to generate the list of pathnames (for example <i>find</i> with
<b>-print</b> or <b>-print0</b>) being terminated after it has written a
partial pathname, and consequently using it could result in the wrong
pathname being processed.</blockquote>
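
As a rough sketch of the advice in these two paragraphs (not part of the proposed text), the loop below treats an item as usable only when read reports success, and sets aside whatever is left when the exit status is non-zero. It assumes a shell whose read already supports -d, such as bash, ksh93 or zsh. The input file and its contents are hypothetical and stand in for the output of a producer that was terminated mid-pathname.

```shell
# Hypothetical input: two complete null-terminated items followed by a
# truncated one with no terminating null byte.
list=$(mktemp)
printf 'dir/a\0dir/b c\0dir/parti' > "$list"

# A real script handling arbitrary pathnames would also run in the
# C or POSIX locale, as the text above recommends.
complete=0
leftover=
while :; do
    if IFS= read -r -d '' file; then
        complete=$((complete + 1))   # delimiter seen: "$file" is a whole item
    else
        leftover=$file               # end-of-file before a delimiter: not trusted
        break
    fi
done < "$list"
rm -f "$list"

echo "complete=$complete leftover=$leftover"
```

Using -r together with -d sidesteps the unspecified line-continuation behavior described above, and splitting -d and '' into two arguments is what makes the null string reach read as the delimiter.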
On page 3108 line 105186 section read (RATIONALE),
change:<blockquote>Although the standard input is required to be a text
file, and therefore will always end with a <newline> (unless it is an empty
file), the processing of continuation lines when the <b>-r</b> option is
not used can result in the input not ending with a <newline>. This occurs
if the last line of the input file ends with a <backslash> <newline>. It is
for this reason that "if any" is used in "The terminating <newline>
(if any) shall be removed from the input" in the description. It is
not a relaxation of the requirement for standard input to be a text
file.</blockquote>to:<blockquote>Earlier versions of this standard required
the standard input to be a text file, and therefore the results were
undefined if the input was not empty and end-of-file was detected before a
<newline> character was encountered. However, all of the most popular shell
implementations have been found to have consistent behavior in this case,
and so the behavior is now specified and the requirement for standard input
to be a text file has been relaxed to allow non-empty input that does not
end with a <newline>.</blockquote>
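
The behavior this rationale refers to can be observed with a small sketch (illustrative only, not part of the proposed text). The variable names are arbitrary. The second command shows a common idiom for not losing an unterminated final line.

```shell
# Non-empty input with no trailing <newline>: read still assigns the
# variable, but its exit status is 1.
result=$(printf 'abc' | { IFS= read -r line; echo "status=$? value=$line"; })
echo "$result"

# Idiom: also accept a final unterminated line by testing the variable
# when read fails.
count=$(printf 'one\ntwo' | {
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
    done
    echo "$n"
})
echo "count=$count"
```

Note that the idiom is appropriate for text lines but, per the application usage advice for pathnames, not for a list whose last item may have been truncated.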
On page 3365 line 114578 section xargs (SYNOPSIS), change:<blockquote>[-E
eofstr]</blockquote>to:<blockquote>[-E eofstr|-0]</blockquote>
On page 3365 line 114593 section xargs (DESCRIPTION),
change:<blockquote>The application shall ensure that arguments in the
standard input are separated by unquoted <blank> characters, unescaped
<blank> characters, or <newline> characters. A string of zero or more
non-double-quote ('"') characters and non-<newline> characters can be
quoted by enclosing them in double-quotes. A string of zero or more
non-<apostrophe> ('\'') characters and non-<newline> characters can be
quoted by enclosing them in <apostrophe> characters. Any unquoted character
can be escaped by preceding it with a <backslash>. The utility named by
<i>utility</i> shall be executed one or more times until the end-of-file is
reached or the logical end-of-file string is found. The results are
unspecified if the utility named by <i>utility</i> attempts to read from
its standard input.</blockquote>to:<blockquote>If the <b>-0</b> option is
not specified, the application shall ensure that arguments in the standard
input are separated by unquoted <blank> characters, unescaped <blank>
characters, or <newline> characters, and quoting characters shall be
interpreted as follows: <ul>
<li>A string of zero or more non-double-quote ('"') non-<newline>
characters can be quoted by enclosing them in double-quotes.</li>
<li>A string of zero or more non-<apostrophe> ('\'') non-<newline>
characters can be quoted by enclosing them in <apostrophe>
characters.</li>
<li>Any unquoted character can be escaped by preceding it with a
<backslash>.</li> </ul>
If the <b>-0</b> option is specified, the application shall ensure that
arguments in the standard input are separated by null bytes.

The utility named by <i>utility</i> shall be executed one or more times
until the end-of-file is reached or the logical end-of-file string is
found. The results are unspecified if the utility named by <i>utility</i>
attempts to read from its standard input.</blockquote>
On page 3365 line 114612 section xargs (OPTIONS -E), change:<blockquote>If
<b>-E</b> is not specified</blockquote>to:<blockquote>If neither <b>-E</b>
nor <b>-0</b> is specified</blockquote>
On page 3365 line 114617 section xargs (OPTIONS -I),
change:<blockquote>Insert mode: <i>utility</i> is executed for each logical
line from standard input. Arguments in the standard input shall be
separated only by unescaped <newline> characters, not by <blank>
characters. Any unquoted unescaped <blank> characters at the beginning of
each line shall be ignored.</blockquote>to:<blockquote>Insert mode: invoke
<i>utility</i> for each argument from standard input. If <b>-0</b> is not
specified, arguments in the standard input shall be separated only by
unescaped <newline> characters, not by <blank> characters, and any unquoted
unescaped <blank> characters at the beginning of each line shall be
ignored.</blockquote>
On page 3366 line 114625 section xargs (OPTIONS -L), change:<blockquote>The
<i>utility</i> shall be executed for each non-empty <i>number</i> lines of
arguments from standard input. The last invocation of <i>utility</i> shall
be with fewer lines of arguments if fewer than <i>number</i> remain. A line
is considered to end with the first <newline> unless the last character of
the line is an unescaped <blank>; a trailing unescaped <blank> signals
continuation to the next non-empty line,
inclusive.</blockquote>to:<blockquote>Invoke <i>utility</i> for each set of
<i>number</i> arguments from standard input. The last invocation of
<i>utility</i> shall be with fewer arguments if fewer than <i>number</i>
remain. If the <b>-0</b> option is not specified, each line in the standard
input shall be treated as containing one argument except that empty lines
shall be ignored and a line ending with a trailing unescaped <blank> shall
signal continuation to the next non-empty line, inclusive; such
continuation shall result in removal of all trailing unescaped <blank>
characters and all <newline> characters that immediately follow them from
the argument.</blockquote>
On page 3366 line 114644 section xargs (OPTIONS -s), change:<blockquote>The
total number of lines exceeds that specified by the <b>-L</b>
option.</blockquote>to:<blockquote>The total number of arguments exceeds
that specified by the <b>-L</b> option.</blockquote>
After page 3366 line 114655 section xargs (OPTIONS),
add:<blockquote><b>-0</b><blockquote>Use a null byte as the input argument
delimiter and do not treat any other input bytes as special.</blockquote>If
the mutually exclusive <b>-0</b> and <b>-E</b> <i>eofstr</i> options are
both specified, the behavior is unspecified, except that if <i>eofstr</i>
is the null string the behavior shall be the same as if <b>-0</b> was
specified without <b>-E</b> <i>eofstr</i>.</blockquote>
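
To make "do not treat any other input bytes as special" concrete, here is a minimal sketch (not part of the proposed text), assuming an xargs that already implements -0 as the GNU and BSD versions do. The sample strings are invented for the example.

```shell
# Without -0, xargs interprets quotes and splits on blanks:
# the input  a "b c" d  becomes three arguments.
plain=$(printf '%s\n' 'a "b c" d' | xargs sh -c 'echo "$#"' sh)

# With -0, only the null byte separates arguments; the apostrophe and
# the run of blanks are passed through unchanged.
nul=$(printf "it's here\0two  spaces\0" |
    xargs -0 sh -c 'printf "<%s>" "$@"' sh)

echo "plain=$plain"
echo "nul=$nul"
```

The same input with the lone apostrophe would be an unterminated quote for xargs without -0, which is one reason -0 is the safer choice for arbitrary pathnames.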
On page 3367 line 114664 section xargs (STDIN), change:<blockquote>The
standard input shall be a text file. The results are unspecified if an
end-of-file condition is detected immediately following an escaped
<newline>.</blockquote>to:<blockquote>If the <b>-0</b> option is not
specified, the standard input shall be a text file and the results are
unspecified if an end-of-file condition is detected immediately following
an escaped <newline>.

If the <b>-0</b> option is specified, the standard input need not be a text
file, and <i>xargs</i> shall process the input as bytes, not
characters.</blockquote>
On page 3368 line 114722 section xargs (APPLICATION USAGE),
change:<blockquote>Note that since input is parsed as lines,
...</blockquote>to:<blockquote>Note that since (if <b>-0</b> is not
specified) input is parsed as lines, ...</blockquote>
On page 3368 line 114726 section xargs (APPLICATION USAGE),
change:<blockquote>This can be solved by
...</blockquote>to:<blockquote>This can be solved by using the
<b>-print0</b> primary of <i>find</i> together with the <i>xargs</i>
<b>-0</b> option, or by ...</blockquote> 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2010-04-29 19:23 dwheeler       New Issue                                    
2010-04-29 19:23 dwheeler       Status                   New => Under Review 
2010-04-29 19:23 dwheeler       Assigned To               => ajosey          
2010-04-29 19:23 dwheeler       Name                      => David A. Wheeler
2010-04-29 19:23 dwheeler       Organization              => IDA             
2010-04-29 19:23 dwheeler       Section                   => find            
2010-04-29 19:23 dwheeler       Page Number               => 2740            
2010-04-29 19:23 dwheeler       Line Number               => 89194           
2011-07-06 23:42 Don Cragun     Relationship added       related to 0000244  
2011-07-06 23:42 Don Cragun     Relationship added       related to 0000245  
2011-07-06 23:54 Don Cragun     Note Added: 0000882                          
2011-11-16 18:22 dwheeler       Note Added: 0001020                          
2015-03-12 16:15 Don Cragun     Relationship added       has duplicate 0000903
2022-12-08 15:39 geoffclare     Note Added: 0006091                          
2022-12-08 15:40 geoffclare     Note Edited: 0006091                         
2022-12-08 16:21 stephane       Note Added: 0006092                          
2022-12-08 16:23 stephane       Note Edited: 0006092                         
2022-12-08 16:32 stephane       Note Added: 0006093                          
2022-12-08 17:02 stephane       Note Edited: 0006093                         
2022-12-09 10:22 geoffclare     Note Edited: 0006091                         
2022-12-09 10:30 geoffclare     Note Edited: 0006091                         
2022-12-09 10:44 geoffclare     Note Edited: 0006091                         
2022-12-09 10:50 geoffclare     Note Added: 0006094                          
2022-12-09 11:21 geoffclare     Note Edited: 0006091                         
2022-12-09 12:09 stephane       Note Added: 0006095                          
2023-01-09 16:13 Don Cragun     Relationship replaced    has duplicate 0000244
2023-01-09 16:17 Don Cragun     Relationship replaced    has duplicate 0000245
2023-01-09 16:20 geoffclare     Note Added: 0006100                          
======================================================================

