Re: non portable sed scripts

2006-05-23 Thread Paul Eggert
Ralf Wildenhues <[EMAIL PROTECTED]> writes:

> the 2.59 shell selection algorithm would probably(?) have selected
> /bin/sh as shell, whereas, due to changes we did because of OSF,
> /usr/bin/posix/sh is preferred now.

Ouch.  Good catch.

> I hope we get away with this.

I don't think we will, since the bug occurs every 1024 bytes, and many
defines.sed scripts are longer than that.

I installed this patch, which works around this particular problem by
not using shell expansion at all in the here-documents used to create
defines.sed.  However, other instances of this problem lurk in
AC_LANG_SOURCE(C), _AC_INIT_HELP, _AC_DEFINE_Q, AC_LANG_CONFTEST,
_AC_OUTPUT_FILES_PREPARE, _AC_OUTPUT_FILE, and
_AC_OUTPUT_CONFIG_STATUS, with the last 3 being the most worrisome.
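
For readers unfamiliar with the distinction, here is a minimal sketch of
the two kinds of here-document involved. It is not taken from the patch:
the variable name and the sed command are made up for illustration. With
an unquoted delimiter the shell expands variables while copying the body
(the code path the buggy ksh mishandles); with a quoted delimiter the
body is copied literally.

```shell
# Illustrative value only; not a name used by Autoconf.
ac_word='HAVE_FOO'

# Unquoted delimiter: $ac_word is expanded while the body is copied.
expanded=$(cat <<EOF
s/$ac_word/X/
EOF
)

# Quoted delimiter: the body is copied verbatim, no expansion happens.
literal=$(cat <<\EOF
s/$ac_word/X/
EOF
)

printf '%s\n' "$expanded" "$literal"
# prints:
#   s/HAVE_FOO/X/
#   s/$ac_word/X/
```

Any value that must end up in the literal form therefore has to be
substituted before the here-document is written, for example inside
ordinary double quotes.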

Perhaps Tim could check whether this patch fixes his problem?
If not, other patches are probably also needed.

I just now noticed that this patch removes the undocumented
ac_word_regexp var.  That was a fairly recent addition, though (June
2005), and I couldn't find evidence in Google of other packages using
it.

2006-05-23  Paul Eggert  <[EMAIL PROTECTED]>

* lib/autoconf/status.m4 (_AC_OUTPUT_HEADER): Don't use shell
expansion in the here-documents used by config.status, as that
runs afoul of the Korn shell version M-12/28/93d bug described in
the Autoconf manual, and this in turn causes a Coreutils 5.95 build to
fail as described by Tim Rice and diagnosed by Ralf Wildenhues in
.

--- lib/autoconf/status.m4  23 May 2006 08:27:32 -  1.106
+++ lib/autoconf/status.m4  23 May 2006 23:30:57 -  1.108
@@ -601,27 +601,6 @@ m4_define([_AC_OUTPUT_HEADER],
   #
   # CONFIG_HEADER
   #
-
-  # These sed commands are passed to sed as "A NAME B PARAMS C VALUE D", where
-  # NAME is the cpp macro being defined, VALUE is the value it is being given.
-  # PARAMS is the parameter list in the macro definition--in most cases, it's
-  # just an empty string.
-  #
-dnl Quote, for the `[ ]' and `define'.
-[  ac_dA='s,^\([#]*\)[^ ]*\([   ]*'
-  ac_dB='\)[(].*,\1define\2'
-  ac_dC=' '
-  ac_dD=' ,']
-dnl ac_dD used to contain `;t' at the end, but that was both slow and incorrect.
-dnl 1) Since the script must be broken into chunks containing 100 commands,
-dnl the extra command meant extra calls to sed.
-dnl 2) The code was incorrect: in the unusual case where a symbol has multiple
-dnl different AC_DEFINEs, the last one should be honored.
-dnl
-dnl ac_dB works because every line has a space appended.  ac_dD reinserts
-dnl the space, because some symbol may have been AC_DEFINEd several times.
-
-  [ac_word_regexp=[_$as_cr_Letters][_$as_cr_alnum]*]
 _ACEOF
 
 # Transform confdefs.h into a sed script `conftest.defines', that
@@ -637,6 +616,26 @@ echo 's/$/ /' >conftest.defines
 dnl
 dnl Quote, for `[ ]' and `define'.
 [ac_word_re=[_$as_cr_Letters][_$as_cr_alnum]*
+# These sed commands are passed to sed as "A NAME B PARAMS C VALUE D", where
+# NAME is the cpp macro being defined, VALUE is the value it is being given.
+# PARAMS is the parameter list in the macro definition--in most cases, it's
+# just an empty string.
+ac_dA='s,^\\([  #]*\\)[^]*\\([  ]*'
+ac_dB='\\)[ (].*,\\1define\\2'
+ac_dC=' '
+ac_dD=' ,']
+dnl ac_dD used to contain `;t' at the end, but that was both slow and incorrect.
+dnl 1) Since the script must be broken into chunks containing 100 commands,
+dnl the extra command meant extra calls to sed.
+dnl 2) The code was incorrect: in the unusual case where a symbol has multiple
+dnl different AC_DEFINEs, the last one should be honored.
+dnl
+dnl ac_dB works because every line has a space appended.  ac_dD reinserts
+dnl the space, because some symbol may have been AC_DEFINEd several times.
+dnl
+dnl The first use of ac_dA has a space prepended, so that the second
+dnl use does not match the initial 's' of $ac_dA.
+[
 uniq confdefs.h |
   sed -n '
t rset
@@ -646,9 +645,8 @@ uniq confdefs.h |
d
:ok
s/[\\&,]/\\&/g
-   s/[\\$`]/\\&/g
-   s/^\('"$ac_word_re"'\)\(([^()]*)\)[  ]*\(.*\)/${ac_dA}\1$ac_dB\2${ac_dC}\3$ac_dD/p
-   s/^\('"$ac_word_re"'\)[  ]*\(.*\)/${ac_dA}\1$ac_dB${ac_dC}\2$ac_dD/p
+   s/^\('"$ac_word_re"'\)\(([^()]*)\)[  ]*\(.*\)/ '"$ac_dA"'\1'"$ac_dB"'\2'"${ac_dC}"'\3'"$ac_dD"'/p
+   s/^\('"$ac_word_re"'\)[  ]*\(.*\)/'"$ac_dA"'\1'"$ac_dB$ac_dC"'\2'"$ac_dD"'/p
   ' >>conftest.defines
 ]
 # Remove the space that was appended to ease matching.
@@ -682,12 +680,14 @@ while :
 do
   # Write a here document:
   dnl Quote, for the `[ ]' and `define'.
-  echo ['# First, check the format of the line:
-cat >"$tmp/defines.sed" <>$CONFIG_STATUS <<_ACEOF
+# First, check the format of the line:
+cat >"\$tmp/defines.sed" <<\\CEOF
+/^[ ]*#[]*undef[][  ]*$ac_word_re[  ]*\$/b def
+/^[ ]*#[]*define[ 

Re: non portable sed scripts

2006-05-23 Thread Tim Rice
On Tue, 23 May 2006, Ralf Wildenhues wrote:

> > > Please try again with /usr/bin/posix/sh as shell; that's what the shell
> > > selection algorithm of 2.59c will select.
> > 
> > Yes that fails. /usr/bin/posix/sh is a symbolic link to /u95/bin/sh which
> > is hard linked to /u95/bin/ksh. /usr/bin/ksh is a symbolic link to
> > /u95/bin/ksh.
> > 
[snip]
> So I assume we have an incarnation of a bug similar to this one
> (quoting `info Autoconf "Here-Documents"'):
> 
> |Many older shells (including the Bourne shell) implement
> | here-documents inefficiently.  And some shells mishandle large
> | here-documents: for example, Solaris `dtksh', which is derived from
> | Korn shell version M-12/28/93d, mishandles variable expansion that
> | occurs on 1024-byte buffer boundaries within a here-document.  Users
> | can generally fix these problems by using a faster or more reliable
> | shell, e.g., by using the command `CONFIG_SHELL=/bin/bash /bin/bash
> | ./configure' rather than plain `./configure'.

I'd say the identical bug.
...
$ what /usr/bin/ksh | grep -i version
Version M-12/28/93e-SCO
...

-- 
Tim Rice    Multitalents    (707) 887-1469
[EMAIL PROTECTED]






Re: non portable sed scripts

2006-05-23 Thread Stepan Kasal
Hello,

On Tue, May 23, 2006 at 02:33:42PM +0200, Stepan Kasal wrote:
> you are so bright, Ralf!

this doesn't sound nice, I'm afraid.

I wanted to say that it was really clever to notice that

> > | s,^\([ ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_STPCPY\)[   (].*$,\1define\2 0 ,
> > | HAVE_DECL_STRNDUP\)[   (].*$,\1define\2 0 ,

is actually a malformed sed script.  And even more clever was to
connect it with that bug in the docs.

Well done! Thanks.

Stepan




Re: non portable sed scripts

2006-05-23 Thread Stepan Kasal
Hello,

On Tue, May 23, 2006 at 10:43:22AM +0200, Ralf Wildenhues wrote:
> | s,^\([   ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_NANOSLEEP\)[   (].*$,\1define\2 0 ,
> | s,^\([   ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_REALLOC\)[  (].*$,\1define\2 1 ,
> | s,^\([   ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_STPCPY\)[   (].*$,\1define\2 0 ,
> | HAVE_DECL_STRNDUP\)[ (].*$,\1define\2 0 ,
> | s,^\([   ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_STRNLEN\)[  (].*$,\1define\2 0 ,
> [...]
> 
> So I assume we have an incarnation of a bug similar to this one
> (quoting `info Autoconf "Here-Documents"'):
> 
> |Many older shells (including the Bourne shell) implement
> | here-documents inefficiently.  And some shells mishandle large
> | here-documents: for example, Solaris `dtksh', which is derived from
> | Korn shell version M-12/28/93d, mishandles variable expansion that
> | occurs on 1024-byte buffer boundaries within a here-document.  Users
> | can generally fix these problems by using a faster or more reliable
> | shell, e.g., by using the command `CONFIG_SHELL=/bin/bash /bin/bash
> | ./configure' rather than plain `./configure'.

you are so bright, Ralf!

> [...]  Otherwise, I don't see much
> choice other than to suggest passing a more reliable shell.

Of course there is a general solution: we can actively test the shell
for this problem, in the ``detect better shell'' routine.
But this will enlarge the generated script by a kilobyte :-O
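
A rough sketch of what such a probe could look like follows. It is not
the check Autoconf uses: the function name heredoc_expansion_ok is
hypothetical, `mktemp -d' is assumed to be available, and whether the
generated input actually trips the ksh bug on an affected system has not
been verified. The idea is to generate a here-document several kilobytes
long in which every line contains a variable expansion, run it under the
candidate shell, and compare the output with the text it should have
produced.

```shell
# Hypothetical helper: returns 0 if shell $1 expands a large
# here-document correctly, nonzero otherwise.
heredoc_expansion_ok () {
  probe_dir=$(mktemp -d) || return 2

  # Build a script whose here-document holds 400 lines, each with one
  # variable expansion, so that expansions fall near several
  # 1024-byte boundaries.
  {
    echo 'v=0123456789'
    echo 'cat <<EOF'
    i=0
    while [ "$i" -lt 400 ]; do
      echo 'xy$v'
      i=$((i + 1))
    done
    echo 'EOF'
  } >"$probe_dir/script"

  # The output a correct shell must produce.
  i=0
  while [ "$i" -lt 400 ]; do
    echo 'xy0123456789'
    i=$((i + 1))
  done >"$probe_dir/expected"

  "$1" "$probe_dir/script" >"$probe_dir/actual" 2>/dev/null
  cmp -s "$probe_dir/expected" "$probe_dir/actual"
  probe_status=$?
  rm -rf "$probe_dir"
  return $probe_status
}

if heredoc_expansion_ok /bin/sh; then
  echo "/bin/sh: here-document expansion looks fine"
else
  echo "/bin/sh: here-document expansion is broken"
fi
```

Even in this compact form the probe is a couple of dozen lines, which
gives an idea of the size cost mentioned above.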

Have a nice day,
Stepan




Re: non portable sed scripts

2006-05-23 Thread Ralf Wildenhues
[ Cc:ing bug-autoconf again ]

* Tim Rice wrote on Tue, May 23, 2006 at 04:13:34AM CEST:
> On Mon, 22 May 2006, Ralf Wildenhues wrote:
> 
> > > Next I tried
> > >   CONFIG_SHELL=/bin/sh /bin/sh  \
> > >   /opt/src/gnu/coreutils-5.95/configure \
> > >   CONFIG_SHELL=/bin/sh
> > > Again a valid config.h and no error.
> > > That was all on my UnixWare 7.1.1 box.
> > 
> > > Please try again with /usr/bin/posix/sh as shell; that's what the shell
> > selection algorithm of 2.59c will select.
> 
> Yes that fails. /usr/bin/posix/sh is a symbolic link to /u95/bin/sh which
> is hard linked to /u95/bin/ksh. /usr/bin/ksh is a symbolic link to
> /u95/bin/ksh.
> 
> Testing with /usr/bin/ksh fails too.
> I've attached a snip of the output of a "/usr/bin/posix/sh -x" test.

Thanks.  This snippet shows that it's the shell which actually generates
a broken script:

| + cat
| + 1> ./conf24563-17529/defines.sed 0<<
[...]
| s,^\([ ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_NANOSLEEP\)[   (].*$,\1define\2 0 ,
| s,^\([ ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_REALLOC\)[  (].*$,\1define\2 1 ,
| s,^\([ ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_STPCPY\)[   (].*$,\1define\2 0 ,
| HAVE_DECL_STRNDUP\)[   (].*$,\1define\2 0 ,
| s,^\([ ]*#[]*\)[^  ]*\([   ][  ]*HAVE_DECL_STRNLEN\)[  (].*$,\1define\2 0 ,
[...]
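
One way to confirm that a generated script is malformed, independently
of the build, is to feed it to sed with empty input and look only at the
exit status. The sketch below does this for a two-line script loosely
modelled on the trace above (whitespace inside the brackets is
simplified). The helper name sed_script_ok is made up, and `mktemp' is
assumed to exist. The second line lacks its leading `s,^...' part, so
sed reads it as an `H' command followed by stray characters.

```shell
# Hypothetical helper: returns 0 if sed accepts the script file $1.
sed_script_ok () {
  sed -n -f "$1" </dev/null >/dev/null 2>&1
}

tmp_script=$(mktemp) || exit 1

# One well-formed command followed by a truncated one.
cat >"$tmp_script" <<\EOF
s,^\([ ]*#[ ]*\)[^ ]*\([ ][ ]*HAVE_DECL_STPCPY\)[ (].*$,\1define\2 0 ,
HAVE_DECL_STRNDUP\)[ (].*$,\1define\2 0 ,
EOF

if sed_script_ok "$tmp_script"; then
  result=accepted
else
  result=rejected
fi
rm -f "$tmp_script"
echo "$result"
```

With GNU sed the truncated line should be rejected; other sed
implementations may word the diagnostic differently.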

So I assume we have an incarnation of a bug similar to this one
(quoting `info Autoconf "Here-Documents"'):

|Many older shells (including the Bourne shell) implement
| here-documents inefficiently.  And some shells mishandle large
| here-documents: for example, Solaris `dtksh', which is derived from
| Korn shell version M-12/28/93d, mishandles variable expansion that
| occurs on 1024-byte buffer boundaries within a here-document.  Users
| can generally fix these problems by using a faster or more reliable
| shell, e.g., by using the command `CONFIG_SHELL=/bin/bash /bin/bash
| ./configure' rather than plain `./configure'.

Hmm.  This may actually present a regression on this system: the 2.59
shell selection algorithm would probably(?) have selected /bin/sh as
shell, whereas, due to changes we did because of OSF, /usr/bin/posix/sh
is preferred now.

I hope we get away with this.  The reduction of ac_max_sed_lines Paul
just installed may just save us, hopefully.  Otherwise, I don't see much
choice other than to suggest passing a more reliable shell.

Cheers,
Ralf




Re: Bug#368502: autoconf: breaks existing build systems that use ${datadir}

2006-05-23 Thread Ralf Wildenhues
[ better keep the debian bug address in Cc: ]

* Ben Pfaff wrote on Tue, May 23, 2006 at 01:34:18AM CEST:
> 
> The "configure" script from the CVS autoconf did report warnings
> for the lack of datarootdir.  This appears to be harmless.  The
> two versions of "configure" generated identical config.h, modulo
> a few comments.  They also generated mostly identical
> Makefile.in, except for a few worrisome parts that appear
> unrelated to your bug report:
> 
> -ac_ct_AR = ar
> +ac_ct_AR = @ac_ct_AR@
> 
> +ac_ct_RANLIB = @ac_ct_RANLIB@
> +ac_ct_STRIP = @ac_ct_STRIP@
> 
> (Looks like some substitutions are missing somehow.  Perhaps I
> need to run more than just "autoconf".)

Rerunning automake will eliminate those lines completely.  (These
substitutions are unnecessary, as those variables should be internal
to the configure script only.)

> if test "x${bindir}" = 'x${exec_prefix}/bin'; then
>   if test "x${exec_prefix}" = "xNONE"; then
>     if test "x${prefix}" = "xNONE"; then
>       bindir="${ac_default_prefix}/bin";
>     else
>       bindir="${prefix}/bin";
>     fi
>   else
>     if test "x${prefix}" = "xNONE"; then
>       bindir="${ac_default_prefix}/bin";
>     else
>       bindir="${prefix}/bin";
>     fi
>   fi
> fi

Yes, code similar to this but with ${datadir} is going to have a
problem.  But the Autoconf manual has been warning against such code
for a while now:
  info Autoconf "Defining Directories"
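
To make the problem concrete, here is a small sketch using the default
values that Autoconf 2.60 assigns (datadir='${datarootdir}' and
datarootdir='${prefix}/share'). The prefix value and the helper name
expand_dir are made up for illustration. It shows that a comparison
against the old literal default no longer matches, that a single eval
leaves an unexpanded reference behind, and that repeated expansion is
needed to reach a plain path.

```shell
# Defaults as set by Autoconf 2.60 and later; the prefix is an example.
prefix=/usr/local
datarootdir='${prefix}/share'
datadir='${datarootdir}'

# A test against the old literal default no longer matches.
if test "x${datadir}" = 'x${prefix}/share'; then
  matched=yes
else
  matched=no
fi

# One level of eval is not enough any more.
eval "once=\"$datadir\""

# Hypothetical helper: expand until the value stops changing.
expand_dir () {
  expand_val=$1
  while :; do
    eval "expand_next=\"$expand_val\""
    [ "$expand_next" = "$expand_val" ] && break
    expand_val=$expand_next
  done
  printf '%s\n' "$expand_val"
}

full=$(expand_dir "$datadir")

echo "$matched"   # no
echo "$once"      # ${prefix}/share
echo "$full"      # /usr/local/share
```

This only illustrates why the old idiom breaks. The manual section cited
above advises against expanding these variables in configure at all and
prefers leaving the expansion to make.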

Cheers,
Ralf