On Mon, Jul 09, 2018 at 06:23:17PM -0300, Daniel Bolgheroni wrote:
> Working with some input/commands which dealt with multiline pattern
> spaces, noticed a behaviour I was not expecting. Reducing the test to a
> simple case:
>
>   $ echo "foo\nbar\nbaz" | sed "1d; N"
>   bar
>   baz
>   $
>
> This is for 3 lines of input. However, same command but for 4 lines of
> input:
>
>   $ echo "foo\nbar\nbaz\nqux" | sed "1d; N"
>   bar
>   baz   <--- qux not here
>   $
>
> For 6 lines of input:
>
>   $ echo "foo\nbar\nbaz\nqux\nquux\nquuz" | sed "1d; N"
>   bar
>   baz
>   qux
>   quux  <--- quuz not here
>   $
>
> (use echo -e if on bash)
>
> When the input has an even number of lines, the last line disappears. My
> understanding is this isn't supposed to happen, but I'm unsure if it has
> something to do with POSIX gray areas of undefined behaviours. For
> instance, GNU sed from ports presents a different behaviour (second test
> case but with gsed):
>
>   $ echo "foo\nbar\nbaz\nqux" | gsed "1d; N"
>   bar
>   baz
>   qux   <--- qux here
>   $
>
> Before suggesting anything, anyone with a stronger experience on sed
> historical behaviours can shed some light on if this is acceptable?
>
> Thank you.
>
> --
> db
>

This is really a question about what the "N" command should do when
there is no further input.  GNU sed outputs the current pattern space in
that case, as your gsed test shows.

POSIX says:

        If no next line of input is available, the N command verb shall
        branch to the end of the script and quit without starting a new
        cycle or copying the pattern space to standard output.


Note: "without [...] copying the pattern space to standard output"

So I believe that OpenBSD sed is doing the correct thing with regard to
the POSIX specification for sed.
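
If the goal is to keep the last line regardless of which sed is in use,
one common idiom is to guard N with the address "$!" so that it only runs
when a next line exists.  A minimal sketch, not something taken from the
OpenBSD sources or the spec (the helper name drop_first_then_pair is made
up for illustration, and the input is built with printf so every line is
newline-terminated):

```shell
# Hypothetical helper: the same script as in the report, except that N is
# skipped on the last input line ("$!" selects every line but the last).
drop_first_then_pair() {
    sed '1d; $!N'
}

# Four lines of input, each terminated by a newline.
printf 'foo\nbar\nbaz\nqux\n' | drop_first_then_pair
```

With this guard, N is never reached without a next line, so the
four-line input should give bar, baz and qux on both OpenBSD sed and GNU
sed.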

Also note that the input to sed should be "text files".  In all your
examples, the last line is not properly terminated, which, strictly and
pedantically speaking, violates the "text file" requirement.  This does
not affect the outcome of the commands though.

Also, printf can be used to print strings with C escape sequences
portably, which avoids the difference between echo and echo -e.
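
As a rough illustration of the printf approach (the helper name "lines"
is hypothetical, chosen only for this example), the four-line test case
could be written like this:

```shell
# Hypothetical helper: print each argument on its own newline-terminated
# line.  printf reuses the '%s\n' format for every argument it is given.
lines() {
    printf '%s\n' "$@"
}

# Same four-line test case as in the original report, without relying on
# echo interpreting "\n".
lines foo bar baz qux | sed '1d; N'
```

Whether qux shows up in the output still depends on the sed
implementation, as discussed above; only the way the input is generated
becomes portable.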


Cheers,

--
Andreas Kusalananda Kähäri,
National Bioinformatics Infrastructure Sweden (NBIS),
Uppsala University, Sweden.







