Re: busybox sed, 'r' command
On Thu, 24 Mar 2016, Ron Yorston wrote:
>
> and specifically about 'r':
>
>   If rfile does not exist or cannot be read, it shall be treated as if
>   it were an empty file, causing no error condition.

My observation, looking at the strace from GNU sed, is that it attempts
to open a file with no/empty name and fails, but ignores the error.

Cheers,

--
Cristian
___
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox
Re: *FLAWED* Re: busybox sed, 'r' command
I was curious what POSIX says and how other *nix systems would handle
Cristian's examples.

On the 'r' command POSIX makes a general comment:

   The r and w command verbs, and the w flag to the s command, take an
   rfile (or wfile) parameter...

and specifically about 'r':

   If rfile does not exist or cannot be read, it shall be treated as if
   it were an empty file, causing no error condition.

This offers no guidance on how to handle a missing parameter, unless
you read 'if rfile does not exist' to mean the parameter rather than
the actual file.

In practice only GNU sed ignores an 'r' command with no parameter;
BusyBox, FreeBSD, Solaris and Version 7 UNIX[1] treat it as an error.

On newlines, POSIX only offers:

   In default operation, sed cyclically shall append a line of input,
   less its terminating <newline> character, into the pattern space.

Given Cristian's sample file with no trailing newline and the command
'sed -n p /tmp/bar', GNU sed returns all three lines with no newline on
the last; BusyBox and FreeBSD return all three lines with a newline on
the last; Solaris and Version 7 UNIX only return the first two lines.

So, you pays your money and you takes your choice. Busybox sed's
behaviour is certainly consistent with *nix tradition. We're just lucky
to have so many traditions to choose from.

Ron
---
[1] http://www.nordier.com/v7x86/index.html has a virtual machine with
    UNIX v7 for x86.
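As a rough illustration of the POSIX wording quoted above for an rfile
that does not exist, the sketch below runs an 'r' command against a path
that is assumed to be absent. The variable name `missing` and the path
itself are made up for the example and do not come from the thread. On a
sed that follows the quoted text, the input should pass through
unchanged and sed should report success.

```shell
#!/bin/sh
# Hypothetical path, assumed not to exist on this machine.
missing="/nonexistent-dir/rfile.$$"

# Quote the script so that sed receives "r <path>" as a single argument,
# with exactly one space between the command and the file name.
printf 'foo\nbar\n' | sed "r $missing"

# In a pipeline, $? is the exit status of the last command, here sed.
echo "sed exit status: $?"
```

This only covers the case where a file name is given but the file is
missing. It says nothing about the disputed case of an 'r' with no file
name at all.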
Re: *FLAWED* Re: busybox sed, 'r' command
busybox generally follows a pretty simple flow:
(1) is the behavior you're looking at explicitly documented by POSIX ?
    if yes, then do what POSIX says & you're done
(2) is the behavior described as "implementation defined" by POSIX ?
    if yes, do whatever produces smaller code
(3) is the behavior attempting to replicate another standard (e.g. GNU) ?
    is it behavior the standard explicitly documents ?
    if yes, do what the replicated standard does

otherwise, if it's an edge case no one cares about, stick to small code.
-mike
Re: *FLAWED* Re: busybox sed, 'r' command
On Wed, 23 Mar 2016, Ralf Friedl wrote:
>
> On the other hand, I don't know why busybox sed needs exactly one space
> between command and filename. GNU sed works with zero or more spaces.

Good points, everyone.  Thanks.  Still...

# Note, the input file /tmp/bar lacks the <newline> on the last line

# simplified

GNU sed ignores open failure on not specified/not existing file:

    $ strace sed 'r' /tmp/bar
    open("/tmp/bar", O_RDONLY|O_LARGEFILE) = 3
    read(3, "foo\nbar\nbaz", 4096) = 11
    write(1, "foo\n", 4foo
    )= 4
    open("", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
    write(1, "bar\n", 4bar
    )= 4
    open("", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
    read(3, "", 4096) = 0
    write(1, "baz\n", 4baz
    )= 4
    open("", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
    read(3, "", 4096) = 0
    close(3)= 0

busybox sed reports an error:

    $ strace busybox sed 'r' /tmp/bar
    write(2, "sed: empty filename\n", 20sed: empty filename
    ) = 20

Arguably, this may look like a bug in GNU sed, or intentional behaviour?

# Let's do a more reasonable test.

    $ sed -n '1,$p' /tmp/bar | cat -E
    foo$
    bar$
    baz
       ^
Note the missing <newline> char on the last line.

    $ busybox sed -n '1,$p' /tmp/bar | cat -E
    foo$
    bar$
    baz$
       ^
There's a <newline> char on the last line.

Which is at fault here?  I would say both (with reservations).  But
obviously, non-determinism.

    $ f=/tmp/bar && cat $f && [ -z "$(tail -c1 $f)" ] || echo

and:

    $ f=/tmp/bar && cat $f && tail -c1 $f | read __ || echo

work, but they look more convoluted to me.

Cheers,

--
Cristian
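For what it's worth, the first workaround above can be wrapped in a
small function so that the file name is quoted and a failing cat does
not trigger the extra newline. This is only a sketch: the function name
`cat_with_final_newline` is made up here, and it relies on command
substitution stripping a trailing newline, the same trick the one-liner
uses.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print a file and add a
# final newline only when the last byte is not already a newline.
cat_with_final_newline() {
    # Stop with cat's own status if the file cannot be read.
    cat "$1" || return

    # $(...) strips a trailing newline, so the result is non-empty only
    # when the last byte is something other than a newline.  An empty
    # file also yields an empty result, so nothing is added for it.
    if [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n'
    fi
}

# Usage: a three-line sample whose last line is unterminated.
tmp=$(mktemp)
printf 'foo\nbar\nbaz' > "$tmp"
cat_with_final_newline "$tmp" | cat -E 2>/dev/null || cat_with_final_newline "$tmp"
rm -f "$tmp"
```

The `cat -E` in the usage lines is a GNU option, which is why there is a
plain fallback. The helper itself only uses `cat`, `tail -c` and
`printf`.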
Re: busybox sed, 'r' command
On 3/23/2016 12:12 PM, Ralf Friedl wrote:
> Cristian Ionescu-Idbohrn wrote:
> > sed (GNU sed) 4.2.2 can do this:
> >
> > $ printf 'foo
> > bar
> > baz' | sed r -
> > foo
> > bar
> > baz
> >
> > or, after storing the text in a file:
> >
> > $ printf 'foo
> > bar
> > baz' >/tmp/bar
> > $ sed r /tmp/bar
> > foo
> > bar
> > baz
> >
> > But busybox sed can't:
> >
> > $ printf 'foo
> > bar
> > baz' | busybox sed r -
> > sed: empty filename
> > $ busybox sed r /tmp/bar
> > sed: empty filename
> > $ printf '' | busybox sed 'r /tmp/bar'
> > $ busybox sed 'r /tmp/bar'
> >
> > The 'r' command is documented by GNU sed as a GNU extension.
> > Still, busybox sed documents the 'r' command as supported:
> >
> >   r   [address]r file
> >       Read contents of file and append after the contents of the
> >       pattern space. Exactly one space must be put between r and
> >       the filename.
> >
> > Am I misinterpreting the documentation?
>
> From the documentation:
> > The full format for invoking `sed' is:
> >     sed OPTIONS... [SCRIPT] [INPUTFILE...]
>
> So in your example you invoke sed with the script "r" and the input
> file "-" or "/tmp/bar". The content is not printed because it is the
> argument to the "r" command, but because it is the main input file to
> sed. You can avoid that by using quotes around the command and the
> file name, or by omitting the space between the command and the
> filename.
>
> You should also try the last two examples, where you invoke busybox
> sed with quotes, with GNU sed. The behaviour is the same.
>
> You should note that in your example when reading from a file, sed
> didn't read from stdin, at least you don't mention it, although your
> interpretation would mean that the filename is the argument to the
> "r" command, therefore no argument is given to sed, and sed should
> read stdin.
>
> You should also note that invoking the "r" command with the filename
> causes the content of this file to be inserted after every line. When
> reading from a pipe, the pipe is empty after the first line.
>
> My documentation for GNU sed 4.2.2 says:
> > `r FILENAME'
> >      As a GNU extension, this command accepts two addresses.
> >
> >      Queue the contents of FILENAME to be read and inserted into the
> >      output stream at the end of the current cycle, or when the next
> >      input line is read.  Note that if FILENAME cannot be read, it is
> >      treated as if it were an empty file, without any error indication.
> >
> >      As a GNU `sed' extension, the special value `/dev/stdin' is
> >      supported for the file name, which reads the contents of the
> >      standard input.
>
> So the main difference seems to be that GNU sed doesn't give an error
> message if the file can't be read. I'm not sure why that would be a
> good idea.
> Also note that there is no mention of using "r -" for stdin, instead
> /dev/stdin is mentioned.
>
> On the other hand, I don't know why busybox sed needs exactly one
> space between command and filename. GNU sed works with zero or more
> spaces.

It looks to me that what actually happens when running "sed r" is that
it appends *no lines* to the end of each line read from stdin.

$ printf 'foo
bar
baz' | sed
Does not add a final newline

$ printf 'foo
bar
baz' | sed r
Does add a final newline

$ printf 'foo
bar
baz' | sed 'r /dev/null'
Does add a final newline

$ echo "blah" > -
$ printf 'foo
bar
baz' | sed 'r -'
results in
foo
blah
bar
blah
baz
blah

So it is not a special case for the filename.

I personally don't see much value in preserving the behavior of
appending nothing for a file which doesn't exist. Tools should give
errors if they can't do what you ask them to.

-Mike C.
Re: busybox sed, 'r' command
Cristian Ionescu-Idbohrn wrote:
> sed (GNU sed) 4.2.2 can do this:
>
> $ printf 'foo
> bar
> baz' | sed r -
> foo
> bar
> baz
>
> or, after storing the text in a file:
>
> $ printf 'foo
> bar
> baz' >/tmp/bar
> $ sed r /tmp/bar
> foo
> bar
> baz
>
> But busybox sed can't:
>
> $ printf 'foo
> bar
> baz' | busybox sed r -
> sed: empty filename
> $ busybox sed r /tmp/bar
> sed: empty filename
> $ printf '' | busybox sed 'r /tmp/bar'
> $ busybox sed 'r /tmp/bar'
>
> The 'r' command is documented by GNU sed as a GNU extension.
> Still, busybox sed documents the 'r' command as supported:
>
>   r   [address]r file
>       Read contents of file and append after the contents of the
>       pattern space. Exactly one space must be put between r and
>       the filename.
>
> Am I misinterpreting the documentation?

From the documentation:
> The full format for invoking `sed' is:
>     sed OPTIONS... [SCRIPT] [INPUTFILE...]

So in your example you invoke sed with the script "r" and the input
file "-" or "/tmp/bar". The content is not printed because it is the
argument to the "r" command, but because it is the main input file to
sed. You can avoid that by using quotes around the command and the file
name, or by omitting the space between the command and the filename.

You should also try the last two examples, where you invoke busybox sed
with quotes, with GNU sed. The behaviour is the same.

You should note that in your example when reading from a file, sed
didn't read from stdin, at least you don't mention it, although your
interpretation would mean that the filename is the argument to the "r"
command, therefore no argument is given to sed, and sed should read
stdin.

You should also note that invoking the "r" command with the filename
causes the content of this file to be inserted after every line. When
reading from a pipe, the pipe is empty after the first line.

My documentation for GNU sed 4.2.2 says:
> `r FILENAME'
>      As a GNU extension, this command accepts two addresses.
>
>      Queue the contents of FILENAME to be read and inserted into the
>      output stream at the end of the current cycle, or when the next
>      input line is read.  Note that if FILENAME cannot be read, it is
>      treated as if it were an empty file, without any error indication.
>
>      As a GNU `sed' extension, the special value `/dev/stdin' is
>      supported for the file name, which reads the contents of the
>      standard input.

So the main difference seems to be that GNU sed doesn't give an error
message if the file can't be read. I'm not sure why that would be a
good idea.
Also note that there is no mention of using "r -" for stdin, instead
/dev/stdin is mentioned.

On the other hand, I don't know why busybox sed needs exactly one space
between command and filename. GNU sed works with zero or more spaces.
Re: busybox sed, 'r' command
On Wed, 23 Mar 2016, Ron Yorston wrote:
>
> Since the 'r' command requires a space before the filename it will need
> to be quoted. Some of your examples have quotes and some don't so you
> aren't always comparing the same thing.

Right.  Still.  The different behaviour confused me.

> "sed r -" is an 'r' command with no filename while the "sed 'r -'" is an
> 'r' command with a filename of '-'. It appears that GNU sed and BusyBox
> sed handle an 'r' command with no filename differently.

Yes.  That seems to be it.  The question is whether busybox sed should
mimic GNU sed behaviour or not.  The current GNU sed behaviour might be
seen as a bug.  But it's been like that for ages.  Maybe it's a bug
upstream wants to keep for historical reasons?

> Also note that printf doesn't issue a newline at the end of the string.
> This can affect the results.

Yes, that was intentional.  A file that lacks a <newline> at the end of
the last line, passing through:

    $ sed r

enforces proper line termination on the last line.  I know there are
other kludges that can achieve the same thing.

Cheers,

--
Cristian
Re: busybox sed, 'r' command
Cristian,

Since the 'r' command requires a space before the filename it will need
to be quoted. Some of your examples have quotes and some don't so you
aren't always comparing the same thing.

"sed r -" is an 'r' command with no filename while the "sed 'r -'" is an
'r' command with a filename of '-'. It appears that GNU sed and BusyBox
sed handle an 'r' command with no filename differently.

Also note that printf doesn't issue a newline at the end of the string.
This can affect the results.

Ron
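To make the quoting point above concrete, here is a small sketch of how
the shell splits the two spellings before sed ever sees them. The helper
name `show_args` is invented for the illustration; it just prints the
argument count and each argument in brackets, standing in for whatever
sed would receive.

```shell
#!/bin/sh
# Hypothetical helper (not part of sed or busybox): show how the shell
# split the command line into arguments.
show_args() {
    printf 'argc=%d:' "$#"
    for arg in "$@"; do
        printf ' [%s]' "$arg"
    done
    printf '\n'
}

# Unquoted: two words, so sed would get the script "r" and the input file "-".
show_args r -      # prints: argc=2: [r] [-]

# Quoted: one word, so sed would get the script "r -" (file name "-")
# and no input file argument, which means it reads stdin.
show_args 'r -'    # prints: argc=1: [r -]
```

The same split explains why "sed r /tmp/bar" prints the file: /tmp/bar
is the input file, and the 'r' command is left without a file name.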