Defaults when cross-compiling

2023-11-06 Thread Michael T. Kloos
I was trying to cross-compile bash for musl libc.  The configure script reports:

checking for working sbrk... configure: WARNING: cannot check working sbrk if cross-compiling
yes

However, I don't believe musl libc supports sbrk.  Autoconf nevertheless seems
to default to assuming yes and sets the HAVE_SBRK definition.  Bash then
crashes on xmalloc failure.
Is this intended behavior?




Re: posix command search and execution

2023-11-06 Thread Robert Elz
Date: Mon, 6 Nov 2023 14:28:24 -0500
From: Chet Ramey
Message-ID: <0ab6075e-22bf-43cd-992c-b2476f626...@case.edu>

  | On 11/6/23 10:48 AM, Mike Jonkmans wrote:
  | > According to these docs (what I make of it), resolving is done
  | > in steps, the first applicable step is used:

This is one of the most debated, and stupidest, parts of posix.

  | > 1b) List several names that have unspecified results.
  | This is an ad-hoc list of builtins that shells implement,
  | not necessarily common across all shells.

If it were just builtins it would not be important, the issue
is more that some shells implement some of that list as reserved
words, or aliases, and if that's done what applications can do
alters dramatically.   So avoiding using those words as command
names, except when using the known features of a specific shell,
is the best way to remain portable.

  | > 1c) Use a function, for functions not matching standard utilities.

No, that's not what it says; the exception is for standard utilities
implemented as functions.   More on that below.

  | > 1d) Lists 20 fixed utility names (like alias, cd etc.) that are
  | >  to be invoked at this point. No PATH search yet.
  | >   These are the `regular builtins'.

In the next standard the ones listed are the intrinsic builtins,
and include only those that must be builtin to work.   But
implementations can add more to the list.

  | >   (These need not exist as builtin).
  | These are the historical common builtins.

That is how the existing standard is written.

  | > 1eI) Search is successful.
  | > 1eIa) Check for `regular builtins' and functions
  | >and invoke that regular builtin/function.
  | >Q: Shouldn't this specify an ordering for builtins/functions?
  |
  | The text seems to imply that you can't have both, doesn't it?

While I suppose you could have both, it would be very unusual.
Again, the functions that can get invoked here are only the
standard utilities implemented as functions, all others would
have been invoked earlier, and we would never be here.

That phrasing is meant to apply to a standard utility (which
just means any utility defined by the standard, as distinct
from others added by the implementation, or user) that is
implemented as a function, by the implementation.  It is
not meant to apply to a function that the application happens
to have defined with the same name as a standard utility.

  | My feeling, without testing anything, is that most shells would allow
  | functions to override builtins here.

Since I have never seen any shell implement any standard utility
as a function, it would be very hard to test.   Further, if they
did, also implementing the same thing as a builtin would be
even harder to imagine - why do it twice when one of the two
would never be used?   So not just hard to test - probably
impossible.

It is also unclear to me why anyone would ever implement a standard
builtin as a function - implementing builtins is simpler for the
implementation than functions (in my experience anyway) and in
any case, if the rules in the standard are followed, there is
no way (except possibly by using "command", and even that is not
clear to me) to tell if the implementation used a function or
a builtin (maybe the output from type might make it clear, but
not necessarily).

  | This has been an area of significant disagreement.

It has indeed.

  | > 1eIb) Run the utility.
  | >(This is where ordinary builtins should run).
  | > (It seems logical that a builtin takes precedence over PATH).
  |
  | You'd be surprised.

Yes.   But almost all shells implement it that way, so the
seemingly logical assumption is mostly backed by experience.

  | Note that this seems to require that you can only run
  | a builtin if it exists (or something with that name exists) in $PATH.

A builtin for a standard utility, yes.  Unless the implementation has
defined it as intrinsic (which the forthcoming standard allows, but
discourages).  Applications (which includes users) that invoke
non-standard utilities are stepping outside the standard, so get
unspecified results (so implementations can add new non-standard
builtins without also adding a matching command in PATH, without
issues).

  | So if you have a builtin that doesn't exist in $PATH and isn't listed as
  | one of the regular builtins, what do you do? Even the unspecified list
  | doesn't give much help.

If it is a standard utility it is required to exist in PATH.
If PATH has been changed so that is no longer true, then that
is a non-conforming environment, and anything is OK.  Similarly
if the builtin is not a standard utility (like declare or
enable for example).

  | This is a quality of implementation feature.
  | Why confuse users by allowing them to define a function that
  | will never be executed?

Indeed - but you could also write that as "Why confuse users by
allowing them to define a function that can never be invoked?"
and by 

Re: posix command search and execution

2023-11-06 Thread Chet Ramey

On 11/6/23 10:48 AM, Mike Jonkmans wrote:


POSIX Command Search

According to these docs (what I make of it), resolving is done in steps,
the first applicable step is used:

1) No slash in the name.

1a) Run the builtin if the name matches that of a Special Builtin.

1b) List several names that have unspecified results.


This is an ad-hoc list of builtins that shells implement, not necessarily
common across all shells.


1c) Use a function, for functions not matching standard utilities.

1d) Lists 20 fixed utility names (like alias, cd etc.) that are
 to be invoked at this point. No PATH search yet.
These are the `regular builtins'.
(These need not exist as builtin).


These are the historical common builtins.



1e) Search via PATH.
1eI) Search is successful.
1eIa) Check for `regular builtins' and functions
   and invoke that regular builtin/function.
   Q: Shouldn't this specify an ordering for builtins/functions?


The text seems to imply that you can't have both, doesn't it? My feeling,
without testing anything, is that most shells would allow functions to
override builtins here.

This has been an area of significant disagreement. The two camps basically
break down into giving users more flexibility by prioritizing functions
versus giving script writers more power to determine exactly what is executed.
This covers both functions that override `regular' builtins and those that
override non-builtin utilities. (There is more nuance than in this
description.)

POSIX is pretty definite about which camp the standard is in: "The sequence
selected for the Shell and Utilities volume of POSIX.1-2017 acknowledges
that special built-ins cannot be overridden, but gives the programmer full
control over which versions of other utilities are executed."


  Q: Why check for regular builtins? That was already done in 1d.


Implementations can provide other builtins. The check in 1d is only for
those specific ones.


1eIb) Run the utility.
   (This is where ordinary builtins should run).
  (It seems logical that a builtin takes precedence over PATH).


You'd be surprised. Note that this seems to require that you can only run
a builtin if it exists (or something with that name exists) in $PATH.

Look at https://www.austingroupbugs.net/view.php?id=854 for a discussion
of this issue.


1eII) Nothing in PATH, exit with 127


So if you have a builtin that doesn't exist in $PATH and isn't listed as
one of the regular builtins, what do you do? Even the unspecified list
doesn't give much help.


2) Slash in name? Try to execute; prescribed exits: 127 or 126.

I hope I have understood POSIX correctly on these points.


[...]


In posix mode, it seems that bash:
- 1a) Honors special builtins (but see 1c).
- 1b) With source as a special builtin - which is ok (as unspecified).
- 1c) Doesn't allow you to define a function with the name
   of a special builtin.


This is a quality of implementation feature. Why confuse users by allowing
them to define a function that will never be executed?


   A function defined before `set -o posix' will mask a
  special builtin. (This seems to be ok).


It will not, at least not while posix mode is enabled. If you mean that a
function will be executed before a special builtin when not in posix mode,
you are correct, because there are no special builtins when not in posix
mode.
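A minimal sketch of the two cases described above, assuming a `bash` binary is on PATH. Each case runs in a child shell so that neither the function definition nor `set -o posix` leaks into the calling shell. The names `lookup_demo`, `define_status` and `FOO` are illustrative only.

```shell
# Case 1: a function named like a special builtin, defined before posix mode.
lookup_demo=$(bash -c '
  export() { echo "function export"; }  # accepted outside posix mode
  export FOO=1                          # outside posix mode the function runs
  set -o posix
  export FOO=1                          # in posix mode the special builtin runs
  echo "FOO=${FOO-unset}"
')
printf '%s\n' "$lookup_demo"

# Case 2: defining such a function while already in posix mode is rejected.
define_status=0
bash --posix -c 'export() { :; }' 2>/dev/null || define_status=$?
echo "definition exit status: $define_status"
```

The first `export FOO=1` only prints the function's message; the variable is set by the second call, once the special builtin is found ahead of the function.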


   Utilities are masked by a function of that name.
- 1d) All names in the list, except newgrp, are bash-builtins
   and are used.
- 1eIa)
   Functions don't need a successful path search - per 1c.
- 1eIa & b)
   Builtins are also run regardless of path search.
- *) ok.

Regarding source and 1c in bash, a mail from our beloved maintainer:


Flatterer.


https://lists.gnu.org/archive/html/bug-bash/2014-03/msg00084.html





Thus in posix mode, bash does not follow this part of the standard.


Exactly which part of the standard are you saying bash is not following?
The requirements concerning PATH search and builtins are different in the
next version of POSIX, the result of interp 854. The standard already
says this about functions with the same name as a special builtin:

"The function is named fname; the application shall ensure that it is a name
(see XBD Name) and that it is not the name of a special built-in utility."


But should it?
I would rather have POSIX modified to *also* accept the, more logical,
bash way (i.e. first matching functions, then builtins, then PATH).
Would that be a feasible modification to suggest to the Austingroup?


I think the resolution to interpretation 854 addresses this. Shells
that want this ordering just declare all the builtins they implement as
`intrinsic' so they're not subject to a PATH search. That way there's no
difference between the regular builtins and the ones an implementation
chooses to provide. It still leaves posix special builtins, but I think
those are with us forever.




posix command search and execution

2023-11-06 Thread Mike Jonkmans
Hi,

I have some remarks/questions on POSIX Command Search and Execution,
related to bash and some to POSIX itself.


Introduction

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01
describes what to do when a simple command name needs to be resolved.
A rationale is in:
https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap02.html#tag_23_02_09_02

POSIX Command Search

According to these docs (what I make of it), resolving is done in steps,
the first applicable step is used:

1) No slash in the name.

1a) Run the builtin if the name matches that of a Special Builtin.

1b) List several names that have unspecified results.

1c) Use a function, for functions not matching standard utilities.

1d) Lists 20 fixed utility names (like alias, cd etc.) that are
to be invoked at this point. No PATH search yet.
These are the `regular builtins'.
(These need not exist as builtin).

1e) Search via PATH.
1eI) Search is successful.
1eIa) Check for `regular builtins' and functions
  and invoke that regular builtin/function.
  Q: Shouldn't this specify an ordering for builtins/functions?
  Q: Why check for regular builtins? That was already done in 1d.
1eIb) Run the utility.
  (This is where ordinary builtins should run).
  (It seems logical that a builtin takes precedence over PATH).
1eII) Nothing in PATH, exit with 127

2) Slash in name? Try to execute; prescribed exits: 127 or 126.

I hope I have understood POSIX correctly on these points.
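The exit statuses mentioned in steps 1eII and 2 can be observed with a short sketch like the one below. It assumes an `sh` on PATH and a writable temporary directory; the command and file names are made up for the demonstration.

```shell
demo_dir=$(mktemp -d)
: > "$demo_dir/not-executable"      # regular file without execute permission

# No slash, nothing found via PATH: expected status 127.
status_path_miss=0
sh -c 'no-such-command-for-this-demo' 2>/dev/null || status_path_miss=$?

# Slash in the name, file does not exist: expected status 127.
status_missing=0
sh -c '"$1/missing"' sh "$demo_dir" 2>/dev/null || status_missing=$?

# Slash in the name, file exists but cannot be executed: expected status 126.
status_not_exec=0
sh -c '"$1/not-executable"' sh "$demo_dir" 2>/dev/null || status_not_exec=$?

echo "$status_path_miss $status_missing $status_not_exec"
rm -rf "$demo_dir"
```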


Bash Command Search

When not in posix mode, bash does:
- Ignore 1a, 1b, 1d.
- 1c) Just use the function. (Especially masking standard utilities).
- 1e) With 1eIa & b use builtins even if utility is not found in PATH.
Which has a quite logical order: function, builtin, PATH.
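
A small sketch of that order, run in a child bash that is not in posix mode. It assumes `bash` is on PATH and that a `true` executable is installed somewhere on PATH (the last line depends on that); `order_demo` is just an illustrative variable name.

```shell
order_demo=$(bash -c '
  true() { echo "function"; }   # same name as a builtin and a PATH utility
  true                          # the function is found first
  unset -f true
  type -t true                  # function gone: the builtin is found
  enable -n true                # disable the builtin
  type -t true                  # builtin gone: the PATH search finds a file
')
printf '%s\n' "$order_demo"
```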

In posix mode, it seems that bash:
- 1a) Honors special builtins (but see 1c).
- 1b) With source as a special builtin - which is ok (as unspecified).
- 1c) Doesn't allow you to define a function with the name
  of a special builtin.
  A function defined before `set -o posix' will mask a
  special builtin. (This seems to be ok).
  Utilities are masked by a function of that name.
- 1d) All names in the list, except newgrp, are bash-builtins
  and are used.
- 1eIa)
  Functions don't need a successful path search - per 1c.
- 1eIa & b)
  Builtins are also run regardless of path search.
- *) ok.

Regarding source and 1c in bash, a mail from our beloved maintainer:
https://lists.gnu.org/archive/html/bug-bash/2014-03/msg00084.html

Thus in posix mode, bash does not follow this part of the standard.
But should it?
I would rather have POSIX modified to *also* accept the, more logical,
bash way (i.e. first matching functions, then builtins, then PATH).
Would that be a feasible modification to suggest to the Austingroup?


Remarks

- Regarding 1eIb.
  The shells posh, dash, ksh and zsh
  also run builtins, even when not found in PATH.
  Checked with the `test' builtin (mv /usr/bin/test{,.sav})
  on the versions found on Ubuntu 22.04.

- The 'newgrp' utility (mentioned in 1d) is not a builtin in bash.
  This is ok. The regular builtins from 1d need not be provided. See:
  https://lists.gnu.org/archive/html/bug-bash/2005-02/msg00129.html
  Builtins are defined in:
  https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_83
  Q: Isn't that incorrect in stating where regular builtins are defined?

- In bash's code builtins/builtins.c (made from builtins/mkbuiltins.c),
  the regular builtins are flagged with POSIX_BUILTIN.
  But `hash' is not. Omission or intention?
  Also, I don't see any use of the POSIX_BUILTIN flag. Remove?
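
For the first remark (1eIb), a variant of the check that does not require renaming /usr/bin/test is to point PATH at a directory that does not exist. This is only a sketch for bash; the directory name is hypothetical and `no_path_demo` is an illustrative variable name.

```shell
no_path_demo=$(bash -c '
  PATH=/nonexistent-demo-dir        # hypothetical directory, nothing to find
  type -t test                      # still resolves, as a builtin
  test 1 -eq 1 && echo "test ran"   # the builtin runs without any PATH match
  ls >/dev/null 2>&1 || echo "ls status: $?"   # a non-builtin is not found
')
printf '%s\n' "$no_path_demo"
```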


POSIX Definitions

- A utility is mostly a builtin or executable:
  https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_22

- Utilities:
  https://pubs.opengroup.org/onlinepubs/9699919799/idx/utilities.html
  Q: Where is `standard utilities' defined - as used in 1d.

- The POSIX definition of `regular builtin' is the same as those in 1d.
  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_06
 
-- 
Regards, Mike Jonkmans



Re: nullglob is documented incorrectly

2023-11-06 Thread Greg Wooledge
On Mon, Nov 06, 2023 at 08:56:11AM -0500, Chet Ramey wrote:
> The null string (NULL) and the empty string ("") are not the same thing.

If this is true, then the man page has many inconsistencies that need
to be cleared up.  For example, in the definition of PATH:

   PATH   The search path for commands.  It is a colon-separated  list  of
  directories  in  which the shell looks for commands (see COMMAND
  EXECUTION below).  A zero-length (null) directory  name  in  the
  value of PATH indicates the current directory.  A null directory
  name may appear as two adjacent colons,  or  as  an  initial  or
  trailing  colon.

Obviously you can't have a NULL pointer in the middle of PATH's content,
when it's serialized as a single string full of colons.  So "null" here
clearly refers to an empty (zero-length) string, at least prior to
whatever internal splitting may occur.
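
A rough sketch of that reading of "null" in PATH, assuming /bin/sh exists and the temporary directory allows executing files. The script name `pathdemo-cmd` and the variable names are invented for the demonstration.

```shell
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho found-in-cwd\n' > "$demo_dir/pathdemo-cmd"
chmod +x "$demo_dir/pathdemo-cmd"

# Trailing colon: the empty element stands for the current directory.
with_empty=$(cd "$demo_dir" && PATH="/usr/bin:/bin:" /bin/sh -c 'pathdemo-cmd' 2>/dev/null)

# Same PATH without the empty element: the lookup fails with status 127.
without_status=0
(cd "$demo_dir" && PATH="/usr/bin:/bin" /bin/sh -c 'pathdemo-cmd') >/dev/null 2>&1 || without_status=$?

echo "with empty element: $with_empty; without: status $without_status"
rm -rf "$demo_dir"
```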

The word "null" is used in many places throughout the man page, and in
most of these, the context seems to say that it means the same as the
empty string.  Parameter Expansion is one such place:

   ${parameter:-word}
  Use Default Values.  If parameter is unset or null, the expansion
  of word is substituted.  Otherwise, the value of parameter is
  substituted.

We know that if parameter contains the empty string, the Default Value
is used.  Since the empty string case is never mentioned explicitly, one
is led to believe that "null" covers that case, possibly in addition to
the case where the variable is/contains a NULL pointer internally.
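
A quick sketch of that behaviour, using only POSIX parameter expansion; the variable names are arbitrary. The form without the colon is included for contrast, since it treats only the unset case as missing.

```shell
unset -v x
unset_result=${x:-default}      # x is unset
x=""
empty_result=${x:-default}      # x is set to the empty string
no_colon_result=${x-default}    # without the colon, only unset triggers the default
x="value"
set_result=${x:-default}        # x has a non-empty value
printf '[%s] [%s] [%s] [%s]\n' "$unset_result" "$empty_result" "$no_colon_result" "$set_result"
```

This prints `[default] [default] [] [value]`, so in this context "null" does cover the empty string.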

As far as NULL variables go, I assume this is how it appears:

unicorn:~$ unset -v x; x=""; declare -p x
declare -- x=""
unicorn:~$ unset -v x; declare x; declare -p x
declare -- x

with the first being an empty string, and the second being a NULL pointer.
If that's true, then it wouldn't hurt to spell that out explicitly
somewhere.



Re: nullglob is documented incorrectly

2023-11-06 Thread Andreas Schwab
On Nov 06 2023, Chet Ramey wrote:

> If nullglob is set, the non-matching pattern expands to the null string,
> which is removed by word splitting.

Since filename expansion happens after word splitting, this cannot be
true.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



Re: nullglob is documented incorrectly

2023-11-06 Thread Emanuele Torre
On Mon, Nov 06, 2023 at 08:56:11AM -0500, Chet Ramey wrote:
> If nullglob is set, the non-matching pattern expands to the null string,
> which is removed by word splitting.

Well, I guess that is one way to look at it, but this explanation does
not make much sense in my opinion since the results of pathname
expansion are not field split.

The results of expanding a glob with no matches while nullglob is set
are not one null string, they are no strings.
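
One way to see the difference is to count the resulting arguments. The sketch below assumes `bash` is on PATH and uses a freshly created empty directory, so the pattern cannot match anything; each case runs in a child bash so the `shopt` setting does not affect the calling shell.

```shell
demo_dir=$(mktemp -d)     # empty directory, so the pattern below matches nothing

# nullglob set: the pattern disappears, leaving no arguments at all.
nullglob_count=$(bash -c 'shopt -s nullglob; set -- "$1"/*.txt; echo "$#"' bash "$demo_dir")

# nullglob unset: the pattern is kept as one literal word.
default_count=$(bash -c 'shopt -u nullglob; set -- "$1"/*.txt; echo "$#"' bash "$demo_dir")

# An actual empty string is still one argument.
empty_string_count=$(bash -c 'set -- ""; echo "$#"')

echo "$nullglob_count $default_count $empty_string_count"
rmdir "$demo_dir"
```

The counts come out as `0 1 1`: with nullglob the result is zero words, which is not the same as a single empty word.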

o/
 emanuele6



Re: nullglob is documented incorrectly

2023-11-06 Thread Chet Ramey

On 11/6/23 9:22 AM, Andreas Schwab wrote:

> On Nov 06 2023, Chet Ramey wrote:
>
>> If nullglob is set, the non-matching pattern expands to the null string,
>> which is removed by word splitting.
>
> Since filename expansion happens after word splitting, this cannot be
> true.

Then maybe a better way to put it is "removed, as with word splitting,"
since the word splitting section in the man page describes what happens
to arguments that are unquoted null strings.
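
The removal of unquoted null arguments referred to here can be sketched in plain POSIX shell; `count_args` is a helper defined only for this illustration.

```shell
count_args() { echo "$#"; }

x=""
unquoted_count=$(count_args $x)      # expands to nothing, so no argument remains
quoted_count=$(count_args "$x")      # quoting preserves one empty argument
explicit_count=$(count_args "")      # a literal empty string is also one argument

echo "$unquoted_count $quoted_count $explicit_count"
```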

The other problem is that POSIX uses null string and empty string as
synonyms, so text that inherits what POSIX says inherits that meaning.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: nullglob is documented incorrectly

2023-11-06 Thread Chet Ramey

On 11/5/23 5:15 AM, Emanuele Torre wrote:

> Today, a user in the #bash IRC channel of libera.chat misunderstood how
> nullglob is supposed to work because it is documented incorrectly in the
> man page:
>
>  'nullglob'
>   If set, Bash allows filename patterns which match no files to
>   expand to a null string, rather than themselves.
>
> globs expand to nothing if there are no matches with nullglob set, not
> to the null/empty string.


The null string (NULL) and the empty string ("") are not the same thing.

If nullglob is set, the non-matching pattern expands to the null string,
which is removed by word splitting.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/