sed: allow d and y functions in { } function list

2014-10-22 Thread Christopher Zimmermann
Hi

$ sed -e '{ y/o/u/ }'
sed: 1: "{ y/o/u/ }": extra text at the end of a transform command

but this is allowed according to the manual:

 Functions can be combined to form a function list, a list of sed
 functions separated by newlines, as follows:

       { function
         function
         ...
         function
       }

 The `{' can be preceded or followed by whitespace.  The function can be
 preceded by whitespace as well.  The terminating `}' must be preceded by
 a newline or optional whitespace.


I tried to fix it by adding special cases for '}' to the relevant parts
of the compile_stream() function. Is this correct? OK?


Christopher


Index: compile.c
===================================================================
RCS file: /cvs/src/usr.bin/sed/compile.c,v
retrieving revision 1.36
diff -u -p -r1.36 compile.c
--- compile.c	8 Oct 2014 04:19:08 -0000	1.36
+++ compile.c	22 Oct 2014 15:32:55 -0000
@@ -234,14 +234,19 @@ nonsel:		/* Now parse the command */
 		case EMPTY:		/* d D g G h H l n N p P q x = \0 */
 			p++;
 			EATSPACE();
-			if (*p == ';') {
+			switch (*p) {
+			case ';':
 				p++;
 				link = &cmd->next;
+				/* FALLTHROUGH */
+			case '}':
 				goto semicolon;
+			case '\0':
+				break;
+			default:
+				err(COMPILE, "extra characters at the end of "
+				    "%c command", cmd->code);
 			}
-			if (*p)
-				err(COMPILE,
-"extra characters at the end of %c command", cmd->code);
 			break;
 		case TEXT:			/* a c i */
 			p++;
@@ -323,14 +328,19 @@ nonsel:		/* Now parse the command */
 			p++;
 			p = compile_tr(p, (char **)&cmd->u.y);
 			EATSPACE();
-			if (*p == ';') {
+			switch (*p) {
+			case ';':
 				p++;
 				link = &cmd->next;
+				/* FALLTHROUGH */
+			case '}':
 				goto semicolon;
-			}
-			if (*p)
+			case '\0':
+				break;
+			default:
 				err(COMPILE, "extra text at the end of a"
 				    " transform command");
+			}
 			break;
 		}
 	}


-- 
http://gmerlin.de
OpenPGP: http://gmerlin.de/christopher.pub
F190 D013 8F01 AA53 E080  3F3C F17F B0A1 D44E 4FEE



Re: sed: allow d and y functions in { } function list

2014-10-22 Thread Philip Guenther
On Wed, Oct 22, 2014 at 8:37 AM, Christopher Zimmermann
chr...@openbsd.org wrote:
> $ sed -e '{ y/o/u/ }'
> sed: 1: "{ y/o/u/ }": extra text at the end of a transform command
>
> but this is allowed according to the manual:
>
>  Functions can be combined to form a function list, a list of sed
>  functions separated by newlines, as follows:
>
>        { function
>          function
>          ...
>          function
>        }
>
>  The `{' can be preceded or followed by whitespace.  The function can be
>  preceded by whitespace as well.  The terminating `}' must be preceded by
>  a newline or optional whitespace.

That looks like a documentation bug to me.  To quote the POSIX spec:
--
[2addr] {editing command
editing command
...
}
    Execute a list of sed editing commands only when the pattern space
    is selected. The list of sed editing commands shall be surrounded by
    braces and separated by newline characters, and conform to the
    following rules. The braces can be preceded or followed by blank
    characters. The editing commands can be preceded by blank characters,
    but shall not be followed by blank characters. The right-brace shall
    be preceded by a newline and can be preceded or followed by blank
    characters.
--

So the newline before the close-brace is required.  Since the code
matches the spec, I think we should change the doc to match both of
them.  Or is there some reason this extension is required?
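
As an illustration of the rule quoted above, here is a minimal sketch of
two spellings of the original command that put a newline before the
closing brace. It is not taken from the thread; the input string `foo`
and the variable names are made up for the example. The second form
relies on separate -e fragments being joined with newlines.

```shell
# Portable form: every function, including the last one, is followed by
# a newline before the closing brace. "foo" is made-up sample input.
out=$(printf 'foo\n' | sed -e '{
y/o/u/
}')
printf '%s\n' "$out"    # prints: fuu

# Same script built from several -e fragments; sed joins them with
# newlines, so the closing brace again starts its own line.
out2=$(printf 'foo\n' | sed -e '{' -e 'y/o/u/' -e '}')
printf '%s\n' "$out2"   # prints: fuu
```

Either spelling avoids the "extra text at the end of a transform
command" error without needing the proposed parser change.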


Philip Guenther



Re: sed: allow d and y functions in { } function list

2014-10-22 Thread Christopher Zimmermann
On Wed, 22 Oct 2014 10:46:43 -0700 Philip Guenther guent...@gmail.com
wrote:

> On Wed, Oct 22, 2014 at 8:37 AM, Christopher Zimmermann
> chr...@openbsd.org wrote:
> > $ sed -e '{ y/o/u/ }'
> > sed: 1: "{ y/o/u/ }": extra text at the end of a transform command
> >
> > but this is allowed according to the manual:
> >
> >  Functions can be combined to form a function list, a list of
> >  sed functions separated by newlines, as follows:
> >
> >        { function
> >          function
> >          ...
> >          function
> >        }
> >
> >  The `{' can be preceded or followed by whitespace.  The
> >  function can be preceded by whitespace as well.  The terminating
> >  `}' must be preceded by a newline or optional whitespace.
>
> That looks like a documentation bug to me.  To quote the POSIX spec:
> --
> [2addr] {editing command
> editing command
> ...
> }
>     Execute a list of sed editing commands only when the pattern space
>     is selected. The list of sed editing commands shall be surrounded by
>     braces and separated by newline characters, and conform to the
>     following rules. The braces can be preceded or followed by blank
>     characters. The editing commands can be preceded by blank characters,
>     but shall not be followed by blank characters. The right-brace shall
>     be preceded by a newline and can be preceded or followed by blank
>     characters.
> --
>
> So the newline before the close-brace is required.  Since the code
> matches the spec, I think we should change the doc to match both of
> them.  Or is there some reason this extension is required?

I just patched sysutils/findlib:

$OpenBSD: patch-src_findlib_Makefile,v 1.5 2014/10/22 14:56:42 chrisz Exp $
--- src/findlib/Makefile.orig	Wed Oct 15 13:07:40 2014
+++ src/findlib/Makefile	Wed Oct 22 16:54:22 2014
@@ -74,7 +74,7 @@ topfind.ml: topfind.ml.in
 	if [ $(ENABLE_TOPFIND_PPXOPT) = true ]; then \
 		cp topfind.ml.in topfind.ml; \
 	else \
-		sed -e '/PPXOPT_BEGIN/,/PPXOPT_END/{d}' topfind.ml.in \
+		sed -e '/PPXOPT_BEGIN/,/PPXOPT_END/ d' topfind.ml.in \
 			> topfind.ml ; \
 	fi
 
GNU sed is more permissive here:

{ commands }
A group of commands may be enclosed between { and } characters.
This is particularly useful when you want a group of commands
to be triggered by a single address (or address-range) match. 

But I don't expect to stumble over non-portable sed code like that
often.
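
For reference, a small sketch of what the portable spelling from the
findlib patch does: a single function after an address range needs no
braces at all. The input lines below are made up for the example and
only borrow the PPXOPT_BEGIN/PPXOPT_END marker names from the patch.

```shell
# Made-up input with a marked region between the two marker lines.
input='keep 1
PPXOPT_BEGIN
drop me
PPXOPT_END
keep 2'

# Address range followed directly by d: deletes the markers and
# everything between them, with no { } function list involved.
out=$(printf '%s\n' "$input" | sed -e '/PPXOPT_BEGIN/,/PPXOPT_END/ d')
printf '%s\n' "$out"
```

This should print only the two "keep" lines.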

Christopher




Re: sed: allow d and y functions in { } function list

2014-10-22 Thread Christopher Zimmermann
On Wed, 22 Oct 2014 21:57:14 +0200 Ingo Schwarze schwa...@usta.de
wrote:

> > So the newline before the close-brace is required.  Since the code
> > matches the spec, I think we should change the doc to match both of
> > them.  Or is there some reason this extension is required?
>
> That would be the following patch:
>   Ingo
>
>
> Index: sed.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/sed/sed.1,v
> retrieving revision 1.43
> diff -u -p -r1.43 sed.1
> --- sed.1	27 May 2014 17:45:02 -0000	1.43
> +++ sed.1	22 Oct 2014 19:36:54 -0000
> @@ -268,7 +268,7 @@ Functions can be combined to form a
>  .Em function list ,
>  a list of
>  .Nm
> -functions separated by newlines, as follows:
> +functions each followed by a newline, as follows:
>  .Bd -literal -offset indent
>  { function
>    function
> @@ -277,13 +277,8 @@ functions separated by newlines, as foll
>  }
>  .Ed
>  .Pp
> -The
> -.Ql {
> -can be preceded or followed by whitespace.
> -The function can be preceded by whitespace as well.
> -The terminating
> -.Ql }
> -must be preceded by a newline or optional whitespace.
> +The braces can be preceded and followed by whitespace.
> +The functions can be preceded by whitespace as well.
>  .Pp
>  Functions and function lists may be preceded by an exclamation mark,
>  in which case they are applied only to lines that are

OK chrisz@
