Re: "unset var" pops var off variable stack instead of unsetting it

2017-03-17 Thread Dan Douglas
On 03/17/2017 09:16 PM, Dan Douglas wrote:
> Why
> would a subshell just make the call stack go away?

I guess slight correction, it's unset itself, because:

> In fact, mksh prints "global" even without the subshell, despite it 
> using dynamic scope for either function definition syntax.

Another "not-sure-if-bug-or-feature". It is a way to guarantee reaching
the global scope, which is impossible in bash, short of calling unset
${#FUNCNAME[@]} times.

If feature, I'm not instantly in love with it.



Re: "unset var" pops var off variable stack instead of unsetting it

2017-03-17 Thread Dan Douglas
On 03/17/2017 07:21 PM, Stephane Chazelas wrote:
> I don't expect the need to have to add "local var" in
> 
> (
>    unset -v var
>    echo "${var-OK}"
> )

True. I would pretty much never use a subshell command group when I know
that locals are available though. And if I know locals are available then
(except dash) I know arrays are available, in which case I'd almost never
use field splitting. This is only like the millionth screwy gotcha with
IFS. Everybody knows IFS is broken beyond repair :o)
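
One possible reading of "arrays instead of field splitting", as a minimal
sketch rather than code from this thread: let read -a fill an array, with
IFS set for that single command only. The array name "parts" is made up for
illustration, and the snippet is run through bash -c because arrays are not
POSIX sh.

```shell
# Run explicitly under bash, since arrays are not POSIX sh.
out=$(bash -c '
  # IFS applies to this one read only; the shell-wide IFS is untouched.
  IFS=" " read -r -a parts <<< "foo * bar"
  # No unquoted expansion happens, so the * is never globbed.
  printf "<%s>\n" "${parts[@]}"
')
printf '%s\n' "$out"
```

This should print <foo>, <*> and <bar> on separate lines, without a subshell,
a "set -f" or an unset of IFS.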

Then again, you could easily write a similar bug with any other special
variable that has a side-effect.

> would be obvious to many people beside you though.
> 
> People writing function libraries meant to be used by several
> POSIX-like shells need to change their code to:
> 
> split() (
>   [ -z "$BASH_VERSION" ] || local IFS # WA for bash bug^Wmisfeature
>   unset -v IFS
>   set -f 
>   split+glob $1
> )
> 
> if they want them to be reliable in bash.

Even if the inconsistent effect of unset isn't obvious, it should be
obvious that a subshell isn't equivalent to setting a local, because it
doesn't just make dynamic scope go away.

I'm far more surprised by the behavior of mksh and dash, in which it's
the subshell rather than the unset builtin that's inconsistent. Why
would a subshell just make the call stack go away? That makes no sense,
and a subshell isn't supposed to do that. Dash and mksh don't even agree
with one another on how that works:

(cmd) ~ $ mksh /dev/fd/3 3<<\EOF
function f { typeset x=f; g; }
function g { ( unset x; echo "${x-unset}"; ) }
x=global; f
EOF

global

(ins) ~ $ dash /dev/fd/3 3<<\EOF
alias typeset=local function=
function f() { typeset x=f; g; }
function g() { ( unset x; echo "${x-unset}"; ) }
x=global; f
EOF

unset

In fact, mksh prints "global" even without the subshell, despite it
using dynamic scope for either function definition syntax.

At least bash's output in this case (empty) can be fully explained as
a combination of quirks with unset and hidden locals (neither being
documented), plus dynamic scope being what it is.

We're pretty much arguing over which is the less counter-intuitive
inconsistency here. If mksh's subshells worked consistently as in bash,
you'd have written the same bug as bash in your example. And it would
be even easier to do so without the unset quirk since this could happen
within a single function call too:

(cmd) ~ $ mksh /dev/fd/3 3<<\EOF
function f {
  typeset x=f
  ( unset x; echo "${x-unset}"; )
}

x=global; f
EOF

global

This could really bite if x and f are defined in separate files so the
initial state of x isn't necessarily known.

> So what should the documentation be? With my "eval" case in
> mind, it's hard to explain without getting down to how stacking
> variables work. Maybe something like:
> 
> [...]

All that touches on several issues in addition to scope, such as
the various states that a variable can be in, and the exact nature
of references to variables like 'a[@]'. That's some of the least-well
documented stuff, but some of that should also probably be left subject
to change due to the great inconsistency across shells and other issues
just within bash.  (Ugh also have to mention the stupid 'a[0]' with
associative arrays - that's one where "consistency" is itself a bug).

> It might be worth pointing out that "unset -v", contrary to the
> default behaviour, won't unset functions so it's a good idea to
> use "unset -v" instead of "unset" if one can't guarantee that
> the variable was set beforehand (like the common case of using
> unset to remove a variable which was potentially imported from
> the environment).

Yeah. I believe POSIX mentions that as well.



Re: "unset var" pops var off variable stack instead of unsetting it

2017-03-17 Thread Stephane Chazelas
2017-03-17 17:35:36 -0500, Dan Douglas:
> The need to localize IFS is pretty obvious to me - of course that's
> given prior knowledge of how it works.
[...]

I don't expect the need to have to add "local var" in

(
   unset -v var
   echo "${var-OK}"
)

would be obvious to many people beside you though.

People writing function libraries meant to be used by several
POSIX-like shells need to change their code to:

split() (
  [ -z "$BASH_VERSION" ] || local IFS # WA for bash bug^Wmisfeature
  unset -v IFS
  set -f 
  split+glob $1
)

if they want them to be reliable in bash.

> The problem is the non-obvious nature of unset's interaction with scope,

the main problem to me is an unset command that doesn't unset.

As shown in my original post, there's also a POSIX conformance
issue.

> (and the lack of documentation). Not much can be done about the former,
> as it is with so many things.

So what should the documentation be? With my "eval" case in
mind, it's hard to explain without getting down to how stacking
variables work. Maybe something like:

after unset -v var
  - if var had been declared (without -g) in the current
function scope (not the global scope), $var becomes unset in
the current scope (not in parent scopes). Further unset
attempts will not affect the variable in parent scopes.
  - otherwise, the previous var value (and type and attributes)
is popped from a stack. That stack is pushed every time the
variable is declared without -g in a new function scope or
when the "." or "eval" special builtins are invoked as var=x
eval 'code' or var=x . sourced-file. If the stack was empty,
the variable is unset.
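
The first case of that wording can be sketched as follows. This is an
illustration only; the function names "inner" and "outer" are hypothetical.
It checks only the "declared in the current function scope" case, since the
popping case is exactly where the shells discussed in this thread disagree.

```shell
# Function names are hypothetical, chosen for this sketch.
inner() {
  local a=2
  unset -v a              # a was declared in this scope: it becomes unset here
  unset -v a              # a second attempt should leave the caller's a alone
  echo "inner: ${a-unset}"
}
outer() {
  local a=1
  inner
  echo "outer: $a"        # still sees its own value
}
a=0
outer
echo "global: $a"
```

The expected output is "inner: unset", "outer: 1" and "global: 0". Dropping
the "local a=2" line from inner turns this into the popping case, where bash
as described in this thread would expose the value from the scope above.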

There's also missing documentation for:

unset -v 'var[x]' (note the need to quote that glob)
  can only be used if "var" is an array or hash variable and unsets
  the array/hash element of key x. Unsetting the last element
  does not unset the variable. For arrays, negative subscripts
  are relative to the greatest assigned subscript in the array.
  unset "a[-1]" "a[-1]" unsets the 2 elements with the greatest
  subscript, but that's not necessarily the case for unset
  "a[-2]" "a[-1]" if the array was sparse.

  unset "var[@]" or unset "var[*]" can be used to unset all the
  elements at once. For associative arrays, use unset 'a[\*]' or
  unset 'a[\@]' to unset the elements of key * and @. It is not
  possible [AFAICT] to unset the element of key "]" or where the
  key consists only of backslash characters [btw, it also looks
  like bash hashes (contrary to zsh or ksh93 ones) can't have an
  element with an empty key]

It is not an error to attempt to "unset" a variable or array
element that is not set, except when using negative subscripts.
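
A small sketch of two of the points above, assuming a bash recent enough to
accept negative subscripts in unset. The array names are arbitrary, and the
snippet goes through bash -c because arrays are not POSIX sh.

```shell
# Run explicitly under bash (negative subscripts need a reasonably recent version).
out=$(bash -c '
  a=(w x y z)
  unset "a[-1]" "a[-1]"          # drops z, then y
  echo "${a[*]}"
  b=(only)
  unset "b[0]"                   # removes the last remaining element
  declare -p b >/dev/null 2>&1 && echo "b is still declared"
')
printf '%s\n' "$out"
```

With a dense array this should print "w x" followed by "b is still
declared". The sparse-array caveat is deliberately left out of the sketch.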
  
Also, the doc says:

>  The -v flag specifies that NAME refers to parameters.
>  This is the default behaviour.

It might be worth pointing out that "unset -v", contrary to the
default behaviour, won't unset functions so it's a good idea to
use "unset -v" instead of "unset" if one can't guarantee that
the variable was set beforehand (like the common case of using
unset to remove a variable which was potentially imported from
the environment).
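
As a minimal illustration of that point ("greet" is a hypothetical function
name, and no variable of that name is assumed to exist):

```shell
# greet is a hypothetical function name; no variable called greet exists.
greet() { echo hello; }
unset -v greet          # only looks at variables, so the function survives
greet
```

This prints "hello". With a plain "unset greet" instead, bash would fall back
to removing the function since there is no variable by that name; other
shells may behave differently there.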

-- 
Stephane



Re: "unset var" pops var off variable stack instead of unsetting it

2017-03-17 Thread Dan Douglas
The need to localize IFS is pretty obvious to me - of course that's
given prior knowledge of how it works.

The problem is the non-obvious nature of unset's interaction with scope,
(and the lack of documentation). Not much can be done about the former,
as it is with so many things.




Re: "unset var" pops var off variable stack instead of unsetting it

2017-03-17 Thread Grisha Levit
It's probably easiest to find previous discussions here on this topic
by searching for 'upvar' (common name for a function that takes
advantage of this behavior).

Latest I think was this one [1] but a much earlier discussion is here [2].

  [1] https://lists.gnu.org/archive/html//bug-bash/2015-01/msg00018.html
  [2] https://lists.gnu.org/archive/html/bug-bash/2008-10/msg00107.html



"unset var" pops var off variable stack instead of unsetting it

2017-03-17 Thread Stephane Chazelas
Hi,

consider this function:

split() (
  unset -v IFS  # default splitting
  set -o noglob # disable glob

  set -- $1 # split+(no)glob
  [ "$#" -eq 0 ] || printf '<%s>\n' "$@"
)

Note the subshell above for the local scope for $IFS and for
the noglob option. That's a common idiom in POSIX shells when
you want to split something: subshell, set IFS, disable glob,
use the split+glob operator.

split 'foo * bar'

outputs

<foo>
<*>
<bar>

as expected. So far so good.
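
The same idiom with an explicit separator, as a sketch only: "split_on" is a
hypothetical variant that is not part of the original report. It assigns IFS
inside the subshell instead of unsetting it, so it does not depend on what
unset does.

```shell
# split_on is a hypothetical variant of split that takes the separator as $1.
split_on() (
  IFS=$1                 # assignment inside the subshell, gone when it exits
  set -o noglob          # keep * and ? literal
  set -- $2              # split+(no)glob
  [ "$#" -eq 0 ] || printf '<%s>\n' "$@"
)
split_on : '/bin:/usr/bin:*'
```

This should print </bin>, </usr/bin> and <*> on separate lines.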

Now, if that "split" function is called from within a function
that declares $IFS local like:

bar() {
  local IFS=.
  split $1
}

Then, the "unset", instead of unsetting IFS, actually pops a
layer off the stack.

For instance

foo() {
  local IFS=:
  bar $1
}

foo 'a b.c:d'

outputs

<a b>

instead of

<a>
<b>

because after the "unset IFS", $IFS is not unset (which would
result in the default splitting behaviour) but set to ":" as it
was before "bar" ran "local IFS=."

A simpler reproducer:

$ bash -c 'f()(unset a; echo "$a"); g(){ local a=1; f;}; a=0; g'
0

Or even with POSIX syntax:

$ bash -c 'f()(unset a; echo "$a"); a=0; a=1 eval f'
0

A work around is to change the "split" function to:

split() (
  local IFS
  unset -v IFS  # default splitting
  set -o noglob # disable glob

  set -- $1 # split+(no)glob
  [ "$#" -eq 0 ] || printf '<%s>\n' "$@"
)

For some reason, in that case (when "local" and "unset" are
called in the same function context), unset does unset the
variable.
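
Putting the pieces of this message together gives a self-contained check of
the workaround. It is a sketch that reuses the split, bar and foo definitions
from above in condensed form.

```shell
split() (
  local IFS              # declare IFS in this function's own scope first
  unset -v IFS           # now a genuine unset: default splitting
  set -o noglob
  set -- $1
  [ "$#" -eq 0 ] || printf '<%s>\n' "$@"
)
bar() { local IFS=.; split $1; }
foo() { local IFS=:; bar $1; }
foo 'a b.c:d'
```

With the "local IFS" line in place this should print <a> and <b> on separate
lines, whatever IFS the callers have declared local.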

Credits to Dan Douglas
(https://www.mail-archive.com/miros-mksh@mirbsd.org/msg00707.html)
for finding the bug. He did find a use for it though (get the
value of a variable from the caller's scope).

-- 
Stephane



Re: \! and \# in PS1 vs PS2 vs PS4, PS0 and ${var@P}

2017-03-17 Thread Grisha Levit
On Tue, Mar 14, 2017 at 9:07 PM, Chet Ramey wrote:
> when PS1 is expanded the first time, the "current" history entry is the
> one corresponding to the last command entered

> PS2 looks at the current history entry, which is 530 since we've
> started on it.

I think I'm missing something. It seems that when PS1 is expanded, \! *does*
match what will eventually become the history number of the
command-to-be-entered, while PS2 does not. I.e. I can't see how we've started
on the second line of history if the current input will still be stored in
the first.

$ PS1='\!  $ ' PS2=${PS1/$/>}; history -c
1  $ : 1.1 \
2  > : 1.2
2  $ fc -l -1
1: 1.1 : 1.2

> When the first line is entered, the history number and command numbers
> get incremented

There seems to be a mismatch: the history number is incremented and the
command number is not:

$ PS1='\! \#$ ' PS2=${PS1/$/>} HISTFILE= $BASH --norc -i <<<$':\\\n'
1 1$ :\
2 1>

> I'm not sure what the question is.

Fair enough; sorry for the vague report.  I thought it was surprising that:

1. \! and \# increment at different times during command entry
2. The docs refer only to a point in a command's lifecycle at which a prompt
   is displayed and then to a history/command number of "this command":

PS0[...] expanded and displayed after reading a command and before the
   command is executed
PS2[...] expanded as with PS1 [...]
PS4[...] expanded as with PS1 and the value is printed before each
   command bash displays during an execution trace

\! the history number of this command
\# the command number of this command

The history number of a command is its position in the history list [...]
while the command number is the position in the sequence of commands
executed during the current shell session.

Bash expands and displays PS1 before reading the first line of a command
and expands and displays PS2 before reading [...] subsequent lines [...]
Bash displays PS0 after it reads a command but before executing it.

This makes clear that the expansions do happen at different times so e.g. \t
should differ since the time changes between the prompts' expansions.
However, a literal reading of the above suggests that `this command' is the
same in all cases and so it's not clear why \! and \# differ.  Perhaps instead
of saying "number of this command" something like "number of the next command
to be read" would be more correct.



process substitution not correctly parsed inside variable expansion

2017-03-17 Thread D630

There is a parse error in B:

# A

bash$ p=; : "${p:=>(f()(echo "$@") ;f foo)}"; declare -p p
declare -- p=">(f()(echo ) ;f foo)"

bash$ p=; : ${p:=>(f()(echo "$@") ;f foo)}; declare -p p
declare -- p="/dev/fd/63"
foo

bash$ p=; echo "${p:=>(f()(echo "$@") ;f foo)}"
>(f()(echo ) ;f foo)



# B

bash$ p=; : "${p:=>(f() { echo "$@"; };f foo)}"; declare -p p
declare -- p=">(f() { echo ; "

bash$ p=; : ${p:=>(f() { echo "$@"; };f foo)}; declare -p p
bash: syntax error near unexpected token `)'

bash$ p=; echo "${p:=>(f() { echo "$@"; };f foo)}"
>(f() { echo ; ;f foo)}