A NOTE has been added to this issue. 
====================================================================== 
https://austingroupbugs.net/view.php?id=1872 
====================================================================== 
Reported By:                steffen
Assigned To:                
====================================================================== 
Project:                    1003.1(2024)/Issue8
Issue ID:                   1872
Category:                   Shell and Utilities
Type:                       Clarification Requested
Severity:                   Editorial
Priority:                   normal
Status:                     New
Name:                       steffen 
Organization:                
User Reference:              
Section:                    find 
Page Number:                2946 
Line Number:                98444 ff. 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2024-11-07 21:34 UTC
Last Modified:              2024-11-08 08:31 UTC
====================================================================== 
Summary:                    find: clarify "less safe" statement
====================================================================== 

---------------------------------------------------------------------- 
 (0006953) stephane (reporter) - 2024-11-08 08:31
 https://austingroupbugs.net/view.php?id=1872#c6953 
---------------------------------------------------------------------- 
For the record, I was the one who brought that issue up to POSIX in
https://austingroupbugs.net/view.php?id=243#c6093, which resulted in that text. I
also mention it at
https://unix.stackexchange.com/questions/321697/why-is-looping-over-finds-output-bad-practice/321757#321757
("Interrupted output" section) with a real-life example.

See also
https://unix.stackexchange.com/questions/730873/find-print0-xargs-0-cmd-vs-find-exec-cmd/730874#730874
for some historical background on -exec {} + vs xargs -0.

I do vaguely remember mentioning it to the GNU findutils maintainers, but I
may have imagined it and in any case don't remember the outcome.

Now I think it's too late for POSIX to mandate that implementations discard
non-delimited records, as that behaviour is being relied upon. A new option
would have to be introduced. It could be a -D counterpart to -d, where -D '\0'
requires the delimiter when -d '\0' doesn't (-d is currently badly missing
from POSIX xargs for its ability to deal with lines with -d '\n'), or an
extra -F flag to require Full records.
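
As a rough illustration of the "full records" idea, the check can be
approximated today by spooling the list to a file and refusing to run
anything unless the last byte is NUL. This is only a sketch: the names
complete_records and run_if_complete are hypothetical helpers, not part of
any xargs implementation, and it assumes an xargs that accepts -r and -0.

```shell
# Hypothetical helper (not a standard utility): succeed only if FILE is
# empty or its last byte is NUL, i.e. every record is fully delimited.
complete_records() {
  [ -s "$1" ] || return 0
  last=$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$last" = 00 ]
}

# Hypothetical wrapper: run COMMAND through xargs only if LISTFILE is complete.
# Usage: run_if_complete LISTFILE COMMAND [ARG...]
run_if_complete() {
  list=$1
  shift
  if complete_records "$list"; then
    # Assumption: xargs supports -r and -0 (GNU and Issue 8 xargs do).
    xargs -r -0 "$@" < "$list"
  else
    echo "incomplete final record in $list, nothing run" >&2
    return 1
  fi
}
```

It could be used as: find ... -print0 > list && run_if_complete list rm -f --
The price is that nothing starts until find has finished, which is one
reason a dedicated option would be preferable.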

Generally, GNU utilities do process non-delimited records. For instance all
their text utilities allow non-delimited lines on input (and some add the
delimiter back on output, some don't). Same applies for those that take a
-z or -0 to deal with NUL-delimited records instead of lines.
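
That behaviour is easy to observe by feeding xargs a stream whose last
record lacks its delimiter. A minimal sketch, assuming an xargs that
supports -0 and, as GNU xargs does, accepts the trailing non-delimited
item (truncated_stream is a made-up name standing in for an interrupted
find -print0):

```shell
# Simulate a producer killed mid-record: the last name has no trailing NUL.
truncated_stream() {
  printf 'one\0two\0thr'
}

# Print each item xargs accepts on its own line, between angle brackets.
items=$(truncated_stream | xargs -0 printf '<%s>\n')
printf '%s\n' "$items"
```

With GNU xargs the partial item "thr" is handed to the command just like
the complete ones, which is exactly the safety concern.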

In POSIX, with the exception of awk, behaviour is unspecified for text
utilities if the input doesn't end in a newline. So in any case, regardless
of the behaviour of xargs in this instance, if one does something like:

find ... -print0 |
  awk -v RS='\0' -v ORS='\0' '{print; print $0".back"}' |
  xargs -r0n2 cp -p --

(not that POSIX allows a NUL record separator for awk yet), the fact that find
was interrupted, if it was, is lost by the time the data reaches xargs.

The NUL-delimited mode of GNU text utilities is often used to process text
files as a whole as a poor man's "slurp mode". As in:

sed -z 's/.../.../' file.txt

To have substitution possibly spanning several lines.

xargs -r0a file.txt printf %b

To expand echo-style escape sequences in the contents of file.txt.

Mandating the delimiter would break those.

In any case, that's not something for POSIX to address. For now, it allows
implementations not to discard non-delimited records and warns about the
safety implications of doing so. It's up to implementations to decide what
they want to do now: either ignore the problem, which in practice rarely
happens and falls in the category of rare pathological cases (like memory/fd
exhaustion or random bit flips from solar flares) where all bets are off
anyway, or address it, either by breaking backward compatibility or by
adding extra API. If different implementors agree on that new API, then it
can be specified in POSIX.

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2024-11-07 21:34 steffen        New Issue                                    
2024-11-07 21:34 steffen        Name                      => steffen         
2024-11-07 21:34 steffen        Section                   => find            
2024-11-07 21:34 steffen        Page Number               => 2946            
2024-11-07 21:34 steffen        Line Number               => 98444 ff.       
2024-11-07 21:38 steffen        Note Added: 0006951                          
2024-11-08 01:40 steffen        Note Added: 0006952                          
2024-11-08 01:42 steffen        Note Edited: 0006952                         
2024-11-08 08:31 stephane       Note Added: 0006953                          
======================================================================

