List of background processes in a command group, in a pipeline, executed sequentially under certain conditions.

2011-10-01 Thread Dan Douglas
-DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' -DSTANDARD_UTILS_PATH='/bin:/usr/bin:/sbin:/usr/sbin' -DSYS_BASHRC='/etc/bash/bashrc' -DSYS_BASH_LOGOUT='/etc/bash/bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS -DSSH_SOURCE_BASHRC -march=native -Ofast -ggdb -mmmx -floop-interchange -floop-strip-mine -floop-block -pipe
uname output: Linux smorgbox 3.0.4-pf+ #36 SMP PREEMPT Sat Sep 24 16:47:49 CDT 2011 x86_64 Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz GenuineIntel GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Thanks for having a look!
-Dan Douglas

signature.asc
Description: This is a digitally signed message part.


[bug] Command substitutions within C-style for loops. Semicolon causes error.

2011-10-28 Thread Dan Douglas
It seems the Bash parser can't distinguish between the semicolon as a command-separator list operator and as the expression delimiter of the C-style for loop when it appears inside a command substitution in the loop header. I realize this construct isn't particularly useful. ksh93 and zsh handle it as expected, but in Bash it causes a syntax error on my system.

Sorry I couldn't think of an example that isn't completely pointless:
$ ( ksh -c 'for (( i = j = k = 1; i % 9 || (j *= -1, $( ((i%9)) || printf " " >&2; echo 0), k++ <= 10); i += j )); do printf "$i"; done' )
12345678 987654321 012345678 987654321 012345678 987654321 012345678 987654321 012345678 987654321 012345678

$ ( for (( i = j = k = 1; i % 9 || (j *= -1, $( ((i%9)) || printf " " >&2; echo 0), k++ <= 10); i += j )); do printf "$i"; done )
-bash: syntax error: `;' unexpected
-bash: syntax error: `(( i = j = k = 1; i % 9 || (j *= -1, $( ((i%9)) || printf " " >&2; echo 0), k++ <= 10); i += j ))'
Segmentation fault
Segmentation fault

The segfault from the subshell is a bit odd if this were merely a syntax error.
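For reference, here is a stripped-down, hypothetical version of the trigger: a `;`-separated list inside a command substitution in the arithmetic for header. Wrapping it in eval contains the parse error, so the sketch runs on both affected and fixed versions; the variable names are made up.

```shell
#!/usr/bin/env bash
# Minimal hypothetical reproducer: a ';' inside $( ) within a C-style
# for header. bash 4.2 rejects this at parse time; eval contains the
# potential syntax error so the enclosing script still runs either way.
if out=$(eval 'for (( i = $(echo 3; :); i > 0; i-- )); do printf %s "$i"; done' 2>&1)
then
    echo "parsed fine: $out"
else
    echo "syntax error: $out"
fi
```

On a shell with the bug the else branch reports the `;'-unexpected error; on a fixed shell the loop runs and prints 321.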

GNU bash, version 4.2.10(1)-release (x86_64-pc-linux-gnu)

-Dan Douglas


Re: lseek with bash

2011-12-11 Thread Dan Douglas
On Friday, December 09, 2011 04:35:11 PM Chet Ramey wrote:
> On 12/9/11 10:12 AM, Jean-Jacques Brucker wrote:
> > Playing with flock to securely access to a file shared by multiple
> > process. I noticed there are no documented way to do an lseek on an
> 
> > opened fd with bash :
>   [...]
> 
> > I have solve my problem by making this small binary (i just needed a
> > rewind) :
> > 
> > int main(int argc,char * argv[]) { return lseek(atoi(argv[1]),0L,0); }
> > 
> > But i ll be glad to use a standard and finished tool.
> > 
> > Of course we could make an "lseek" binary with some options to cover
> > all use cases of lseek function. But I prefer to have such
> > functionality inside bash.
> 
> ksh93 has this functionality with different syntax.  I'm not convinced
> it's of general enough value to add to bash, especially when a separate
> binary (of obviously trivial complexity) does the job.
> 
> Chet

<# and <## are a little silly but fun to play with. I wouldn't call it a top priority, but hypothetically, if you wanted to get a lot of power out of one extra bit of syntax, you could do worse than <#((expr)). It pretty much takes care of the most common seek functionality found in the fancy FD objects of other languages.

I'd agree lseek alone isn't all that useful. More generally, being able to store $CUR and then later jump back to it directly would be a big improvement over the sort of hack you'd have to pull in Bash to do the same. (Maybe track position by either looping with read -N, or piping through tee >(wc -m), then exploit /dev/stdin to get back to the beginning and seek forward again... Yuck.) Basic I/O isn't something I like to repeatedly fork little external processes to achieve. The zsh MULTIOS concept is interesting to that end, but probably a bit over-the-top for Bash.
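A sketch of the sort of hack in question, assuming bash 4+ for read -N; the file name and chunk sizes are made up. Lacking a real seek, "rewinding" means reopening the file and consuming a tracked number of characters again:

```shell
#!/usr/bin/env bash
# Track a read position manually, then "rewind and seek" by reopening
# the file and skipping forward by the saved count. Names hypothetical.
printf 'abcdefghij' > sample.txt

exec 3< sample.txt
pos=0
IFS= read -r -N 4 chunk <&3     # consume 4 chars: "abcd"
(( pos += ${#chunk} ))          # pos is now 4

exec 3< sample.txt              # "rewind": reopen from the start
IFS= read -r -N "$pos" _ <&3    # skip forward to the saved position
IFS= read -r -N 3 next <&3      # next 3 chars: "efg"
echo "$next"

exec 3<&-
rm -f sample.txt
```

This works only for readable regular files and scales poorly, which is the point of the complaint above.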

Dan Douglas



Re: Print non-readonly variables with declare +r -p

2011-12-13 Thread Dan Douglas
On Tuesday, December 13, 2011 12:14:41 PM lhun...@mbillemo.lin-k.net wrote:
> Configuration Information [Automatically generated, do not change]:
> Machine: i386
> OS: darwin11.2.0
> Compiler: /Developer/usr/bin/clang
> Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i386'
> -DCONF_OSTYPE='darwin11.2.0' -DCONF_MACHTYPE='i386-apple-darwin11.2.0'
> -DCONF_VENDOR='apple' -DLOCALEDIR='/opt/local/share/locale'
> -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DMACOSX   -I.  -I. -I./include
> -I./lib  -I/opt/local/include -pipe -O2 -arch x86_64 uname output: Darwin
> mbillemo.lin-k.net 11.2.0 Darwin Kernel Version 11.2.0: Tue Aug  9 20:54:00
> PDT 2011; root:xnu-1699.24.8~1/RELEASE_X86_64 x86_64 Machine Type:
> i386-apple-darwin11.2.0
> 
> Bash Version: 4.2
> Patch Level: 10
> Release Status: release
> 
> Description:
>   The description of declare states that using + instead of - in front of 
> an
> attribute turns it off.  From that, I would expect (and find useful) to be
> able to display (-p) all non-read-only variables using declare +r -p, just
> like I can display all readonly variables using declare -r -p
> 
> Repeat-By:
>   declare +r -p
> 
> Fix:
> Show all variables that do not have the given attribute(s) when included
> in the declare command prefixed with a +.

Would this be consistent with anything else? AFAIK the only options that filter results are -f and -F, and then only when no additional "names" are given. "declare +l -p", for example, doesn't appear to select only names without the -l attribute either.

ksh93 appears mostly consistent with this and prints matches regardless of + 
or -.

I imagine this is ok because Bash's declare -p is intended to be human-
readable only, whereas Ksh guarantees -p produces output in a format reusable 
as input.

$ ( typeset -r x=0; typeset y=1; typeset -p x y )
typeset -r x=0
y=1
$ ( typeset -r x=0; typeset y=1; typeset -r -p x y )
typeset -r x=0
y=1
$ ( typeset -r x=0; typeset y=1; typeset +r -p x y )
typeset -r x
y
$ ( typeset -r x=0; typeset y=1; typeset +r )   
x
$ ( typeset -r x=0; typeset y=1; typeset -r )
x=0

And it has the additional behavior that +p omits values. Bash doesn't do this.



Re: Print non-readonly variables with declare +r -p

2011-12-14 Thread Dan Douglas
On Wednesday, December 14, 2011 10:57:21 AM Chet Ramey wrote:
> On 12/13/11 3:13 PM, Dan Douglas wrote:
> > I imagine this is ok because Bash's declare -p is intended to be human-
> > readable only, whereas Ksh guarantees -p produces output in a format
> > reusable as input.
> 
> Bash's `declare -p' output is intended to be reusable as input, though the
> documentation doesn't make that explicit.

Ah. I had assumed the single quotes added around array compound assignment broke this (intentionally?), but now I see it's valid. I expected declare -a y='([0]="a" [1]="b c" [2]="d")' to set the first element of y to that entire string.
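A quick round-trip check of that point (variable names made up): serialize with declare -p, destroy the variable, then eval the serialized form back.

```shell
#!/usr/bin/env bash
# declare -p output is reusable as input: capture it, unset the
# array, and eval it to restore the original value. Names made up.
a=(x "y z")
rep=$(declare -p a)   # the reusable representation
unset a
eval "$rep"           # restore from the serialized form
echo "${a[1]}"
```

The exact quoting in the representation varies across bash versions, but eval accepts all of them.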



Re: Ill positioned 'until' keyword

2011-12-14 Thread Dan Douglas
On Wednesday, December 14, 2011 05:47:24 PM Peng Yu wrote:
> Hi,
> 
> I looks a little wired why 'until' is the way it is now. According to
> the manual until is before the do-done block.
> 
> until test-commands; do consequent-commands; done
> 
> A common design of until in other language is that it allows the loop
> body be executed at least once and test the condition at the end of
> the run of the first time. It seems that a better of bash should also
> follow the practice. If I understand it correctly, the above is exact
> the same as the following, in which case the do done block can be
> executed zero time. Because of this, I think that the current 'until'
> is not necessary, and probably better to change its definition so that
> it allows the execution of the loop at least once.
> 
> while ! test-commands; do consequent-commands; done
> 
> 
> In short, I'd expect the following code echo 9 (not working with the
> current bash).
> 
> COUNTER=9
> do
>  echo COUNTER $COUNTER
>  let COUNTER-=1
> done
> until [  $COUNTER -lt 10 ];
> 
> 
> I'd like to hear for what reason 'until' is designed in the way it is
> now. Shall we considered to at least allow an option in bash to change
> it meaning to the one I propose (or adding a different command, like
> until2, for what I proposed), which give us time to let the orignal
> until usage dies out.

You're right, most non-shell languages do it that way. The current behavior is 
consistent with most shells and the way it's specified by SUS: 
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_04_11

"until" in Bash is the same as:

while ! { list 1; }; do
list 2
done

Here's one of the many possible workarounds:

declare -i x=5

until
echo $x
(( x-- < 1 ))
do :
done

The echo is guaranteed to execute at least once; it plays the role of list 2, while the arithmetic is the test (list 1). Some additional work is needed to make the exit status of this pattern equivalent to that of the current until, but it's doable.
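One way to do that additional work, sketched under the assumption that the goal is for the construct's overall status to reflect the last run of the body, as with a real until:

```shell
#!/usr/bin/env bash
# Capture the status of the "body" half of the test list so the
# construct can report it, mimicking until's exit status semantics.
declare -i x=5
rc=0
until
    echo "$x"       # the guaranteed-at-least-once body
    rc=$?           # remember its status
    (( x-- < 1 ))   # the real loop test
do :
done
(( rc == 0 ))       # overall status mirrors the last body command
```

Here the loop prints 5 down to 0 and rc ends up 0, the status of the final echo.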

If it were changed there would need to be a separate behavior for POSIX mode.



segfault expanding certain arrays created via read -N directly to an array.

2011-12-19 Thread Dan Douglas
x = 0
temp = 0x0
temp1 = <optimized out>
string = 0x6f66a0 "\"${x[@]}\""
string_size = 9
sindex = 9
quoted_dollar_at = 0
quoted_state = 2
had_quoted_null = 0
has_dollar_at = 0
tflag = <optimized out>
pflags = <optimized out>
assignoff = -1
c = <optimized out>
t_index = 1
twochars = <optimized out>
state = {__count = 0, __value = {__wch = 0, __wchb = "\000\000\000"}}
#8  0x00450788 in shell_expand_word_list (tlist=0x6f6330, eflags=<optimized out>) at subst.c:9215
temp_list = <optimized out>
expanded_something = 0
has_dollar_at = 0
expanded = <optimized out>
new_list = <optimized out>
next = 0x0
temp_string = <optimized out>
#9  expand_word_list_internal (list=<optimized out>, eflags=31) at subst.c:9332
new_list = 0x6f63b0
temp_list = <optimized out>
tint = <optimized out>
#10 0x00431239 in execute_simple_command (fds_to_close=<optimized out>, async=0, pipe_out=-1, pipe_in=-1, simple_command=0x6f5860) at execute_cmd.c:3771
words = <optimized out>
lastword = <optimized out>
command_line = 0x0
temp = <optimized out>
builtin_is_special = 0
already_forked = 0
func = <optimized out>
first_word_quoted = 0
result = 0
dofork = <optimized out>
builtin = <optimized out>
old_builtin = <optimized out>
old_command_builtin = <optimized out>
lastarg = <optimized out>
old_last_async_pid = 24474
#11 execute_command_internal (command=0x6f5830, asynchronous=<optimized out>, pipe_in=-1, pipe_out=<optimized out>, fds_to_close=<optimized out>) at execute_cmd.c:735
exec_result = 0
user_subshell = <optimized out>
invert = <optimized out>
ignore_return = 0
was_error_trap = 0
my_undo_list = 0x0
exec_undo_list = 0x0
last_pid = -1
save_line_number = 0
#12 0x00434893 in execute_connection (command=0x6f58f0, asynchronous=0, pipe_in=-1, pipe_out=-1, fds_to_close=0x6f5920) at execute_cmd.c:2328
tc = <optimized out>
second = <optimized out>
ignore_return = 0
exec_result = <optimized out>
was_error_trap = <optimized out>
invert = <optimized out>
save_line_number = <optimized out>
#13 0x00430486 in execute_command_internal (command=0x6f58f0, asynchronous=<optimized out>, pipe_in=-1, pipe_out=-1, fds_to_close=0x6f5920) at execute_cmd.c:891
exec_result = 0
user_subshell = <optimized out>
invert = 0
ignore_return = 0
was_error_trap = <optimized out>
my_undo_list = 0x0
exec_undo_list = 0x0
last_pid = <optimized out>
save_line_number = <optimized out>
#14 0x0046d7b3 in parse_and_execute (string=<optimized out>, from_file=<optimized out>, flags=<optimized out>) at evalstring.c:319
bitmap = 0x6f5920
code = 0
lreset = <optimized out>
should_jump_to_top_level = 0
last_result = 0
command = 0x6f58f0
#15 0x0041d399 in run_one_command (command=<optimized out>) at shell.c:1315
code = 0
#16 0x0041c2c6 in main (argc=3, argv=0x7fffd448, env=0x7fffd468) at shell.c:688
i = <optimized out>
code = <optimized out>
old_errexit_flag = 0
saverst = 0
locally_skip_execution = 0
arg_index = 3
top_level_arg_index = 3

--
Dan Douglas



[bug] Bash translates >&$var into &>$var for exported functions.

2012-01-22 Thread Dan Douglas
Hello. In the case of exported functions, Bash reinterprets a copy descriptor followed by an expansion as the >& synonym for &>, resulting in the output going to a file named as the value of the FD it's given. This only applies to ">&$var" and not "<&$var". I've tested various quoting; is there some way around this?

Gist over here if it's easier to read: https://gist.github.com/1661392

TESTCASE (Overwrites the file named "3" in CWD):
#!/usr/bin/env bash

set -x

f() {
echo 'hi'
} >&${1}

{ f 3; cat; } <<<'' 3>/dev/stdin

export -f f
export -pf

PS4='* ' BASH_XTRACEFD=4 bash -xc 'f 3; cat' <<<'' 3>/dev/stdin 4>&2

[[ -f 3 ]] && cat ./3
END TESTCASE

OUTPUT:
 ~ $ rm 3; ./exbug
+ f 3
+ echo hi
+ cat
hi
+ export -f f
+ export -pf
f () 
{ 
echo 'hi'
} &>${1}
declare -fx f
* PS4='* '
* BASH_XTRACEFD=4
+ bash -xc 'f 3; cat'
* f 3
* echo hi
* cat
+ [[ -f 3 ]]
+ cat ./3
hi
END OUTPUT

Bash v. 4.2 w/ patchset 20 on Gentoo Linux amd64.
-- 
Dan Douglas



Re: bash blocking on exec assigning an output file descriptor to a fifo

2012-02-14 Thread Dan Douglas
O_NONBLOCK is up there on the list of things I wouldn't mind being able to use - that, and access to errno. I don't see any way of determining the "fullness" of a buffer, even through /proc/self/fdinfo/ on Linux.
-- 
Dan Douglas



Re: excess braces ignored: bug or feature ?

2012-02-17 Thread Dan Douglas
On Friday, February 17, 2012 02:51:27 PM Mike Frysinger wrote:
> can't tell if this is a bug or a feature.
> 
> FOO= BAR=bar
> 
> : ${FOO:=${BAR}
> 
> echo $FOO
> 
> i'd expect an error, or FOO to contain those excess braces.  instead, FOO is
> just "bar".
> -mike

My favorite is probably the parser ignoring any valid redirection syntax with 
the special command substitutions.

 ~ $ { echo "$({}



Re: excess braces ignored: bug or feature ?

2012-02-19 Thread Dan Douglas
On Sunday, February 19, 2012 04:25:46 PM Chet Ramey wrote:
> On 2/17/12 6:22 PM, Dan Douglas wrote:
> > My favorite is probably the parser ignoring any valid redirection syntax
> > with the special command substitutions.
> > 
> >  ~ $ { echo "$({} > 
> > hi
> 
> Bash does the same thing as ksh93 here, though for the wrong reasons.
> I think ksh93 counts an input redirection that saves the file descriptor
> into a variable as sufficient to kick in the `equivalent to cat file'
> clause.
> 
> >  ~ $ { echo "$(11 > 
> > hi
> 
> This is a bug.  It should be a file descriptor out of range error, or
> it should treat the string of digits as a word, since the value
> exceeds the largest intmax_t.
> 
> > That one really is ignored. No variable named xxx... is actually set.
> 
> I assume you mean the first one.  It doesn't matter whether or not the
> variable is set as a side effect of the redirection -- it's in a
> subshell and disappears.
> 
> Chet

Oh, so a subshell is created after all, and that really is a command substitution + redirect! I had just chalked it up to Bash recycling the way redirects are parsed.

I think I ran across that quirk in trying to determine whether $(&2;})"

...only to discover that ${ cmd;} also increments .sh.subshell anyway (like 
BASH_SUBSHELL), and there was no BASHPID equivalent so that was a dead end.
-- 
Dan Douglas



Re: excess braces ignored: bug or feature ?

2012-02-20 Thread Dan Douglas
On Sunday, February 19, 2012 04:25:46 PM Chet Ramey wrote:

> I assume you mean the first one.  It doesn't matter whether or not the
> variable is set as a side effect of the redirection -- it's in a
> subshell and disappears.
> 
> Chet

Forgot to mention, though: it's possible that in ksh no subshell is created, if you consider this:

$ : "$(&2;})"
1
$ : $(: $( echo ${.sh.subshell} >&2))
2

It even works with the subshell-less command substitution, but there's no 
typeset output, so either x is automatically unset, it's never set to begin 
with, or ${ &2 | :
0
 ~ $ : | { echo $BASH_SUBSHELL >&2; } | :
1
 ~ $ : | ( echo $BASH_SUBSHELL >&2; ) | :
1
 ~ $ : | ( ( echo $BASH_SUBSHELL >&2; ) ) | :
2
 ~ $ : | { ( echo $BASH_SUBSHELL >&2; ) } | :
2
 ~ $ : | { { echo $BASH_SUBSHELL >&2; } } | :
1

-- 
Dan Douglas



Re: shopt can't set extglob in a sub-shell?

2012-02-26 Thread Dan Douglas
On Saturday, February 25, 2012 09:42:29 PM Davide Baldini wrote:

> Description:
>   A 'test.sh` script file composed exclusively of the following text
>   fails execution:
>   #!/bin/bash
>   (
>   shopt -s extglob
>   echo !(x)
>   )
>   giving the output:
>   $ ./test.sh
>   ./test.sh: line 4: syntax error near unexpected token `('
>   ./test.sh: line 4: `echo !(x)'
>   Moving the shopt line above the sub-shell parenthesis makes the script
>   work.
> 
>   The debian man pages give no explanation.
> 
>   Thank you.

Non-eval workaround if you're desperate:

#!/usr/bin/env bash
(
shopt -s extglob
declare -a a='( !(x) )'
echo "${a[@]}"
)
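For comparison, the eval-based workaround defers parsing of the extglob pattern until the option is already set; directory and file names below are made up for the demo.

```shell
#!/usr/bin/env bash
# Defer parsing with eval so extglob is enabled before !(x) is ever
# parsed. The pattern lives in a single-quoted string until eval time.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch keep1 keep2 x

# The $( ) is a subshell, so the shopt is contained there too.
out=$(shopt -s extglob; eval 'printf "%s\n" !(x)')
printf '%s\n' "$out"

cd / && rm -rf "$dir"
```

Because eval's argument is just a string at parse time, the surrounding code compiles fine even with extglob off.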

You may be aware extglob is special and affects parsing in other ways. Quoting 
Greg's wiki (http://mywiki.wooledge.org/glob):

> Likewise, you cannot put shopt -s extglob inside a function that uses
> extended globs, because the function as a whole must be parsed when it's
> defined; the shopt command won't take effect until the function is called, at
> which point it's too late.

This appears to be a similar situation. Since parentheses are "metacharacters" 
they act strongly as word boundaries without a special exception for extglobs.

I just tested a bunch of permutations. I was a bit surprised to see this one 
fail:

f()
if [[ $FUNCNAME != ${FUNCNAME[1]} ]]; then
trap 'shopt -u extglob' RETURN
shopt -s extglob
f
else
f()(
shopt -s extglob
echo !(x)
)
f
fi

f

I was thinking there might be a general solution via the RETURN trap, where you could just set "trace" on the functions where you want it, but it looks like even "redefinitions" break recursively, so you're stuck. Fortunately, there aren't a lot of good reasons to have extglob disabled to begin with (if any).
-- 
Dan Douglas



Re: Pathname expansion not performed in Here Documents

2012-02-26 Thread Dan Douglas
On Monday, February 27, 2012 04:03:34 AM Davide Baldini wrote:
>   Is this expected? Standing at the debian's man bash, variables inside
>   'here document' are supposed to expand with no special exceptions
>   and undergo word splitting and pathname expansion.

"If word is unquoted, all lines of the here-document are subjected to 
parameter expansion, command substitution, and arithmetic expansion."

No pathname expansion.
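A quick illustration of that rule (the variable and the glob are made up): the parameter expands, the glob does not.

```shell
#!/usr/bin/env bash
# In an unquoted here-document, $v undergoes parameter expansion but
# the glob pattern is left verbatim - no pathname expansion occurs.
v=expanded
out=$(cat <<EOF
$v
*.no-such-glob
EOF
)
printf '%s\n' "$out"
```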
-- 
Dan Douglas



Re: Pathname expansion not performed in Here Documents

2012-02-27 Thread Dan Douglas
On Monday, February 27, 2012 02:07:25 PM Davide Baldini wrote:
> FROM Davide Baldini
> 
> On 02/27/12 04:11, Dan Douglas wrote:
> > "If word is unquoted, all lines of the here-document are subjected to
> > parameter expansion, command substitution, and arithmetic expansion."
> > 
> > No pathname expansion.
> 
> That section of manual doesn't specifically include word splitting nor
> pathname expansion into the list of performed expansions, but the word
> 
> splitting does include itself unconditionally:
> >  Word Splitting
> > 
> > The  shell  scans the results of parameter expansion, command substitu-
> > tion, and arithmetic expansion that did not occur within double  quotes
> > for word splitting.
> 
> and pathname expansion ties itself to word splitting:
> >  Pathname Expansion
> > 
> > After  word  splitting, [...]

Pathname expansion and word splitting are separate, unrelated steps that normally occur in an ordinary command evaluation context and other places where "words", "arguments", or "parameters" are relevant concepts.

Word splitting only applies to unquoted expansions resulting from steps higher in the expansion hierarchy. Since the body of a here-document is basically just a string, and in shell land "words" are more structured data, word splitting in this context is nonsensical. According to POSIX: "The here-document shall be treated as a single word".

Pathname expansion is a step that generates words by applying constraints using the language of shell pattern matching over the domain of files under a given path. Because pathname expansion can only occur in unquoted contexts, it happens after word splitting so that files containing characters in IFS are not split into separate words. Again, like word splitting, it is only applicable to contexts where arguments/parameters/words are relevant.
 
There are a number of genuine issues with this section of the manual that I haven't gotten around to addressing on this list yet. This isn't one of them.

In situations such as this, where a feature is specified by POSIX, you should also refer to the standard. In modern times it's the lowest common denominator with which many shells, Bash included, attempt to comply. The manpage just briefly reiterates what's in the perfectly good spec.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04

Additionally, you can test on other shells that try to emulate the POSIX shell or some superset, such as Dash, many of the Korn shells, zsh, and (maybe) busybox. They should all mostly agree on fundamental features like this, and each has its own documentation you can cross-reference.

> If intended behaviour is to exclude some expansions from performing word
> splitting or pathname expansion, they should be specifically pointed out

Enumerating which expansions apply in every possible context would be a monumental task. In general, most expansions apply in most places; otherwise the manual lists those that are applicable.

In this case the dead giveaway should be that quote removal doesn't occur in a heredoc, which means quoting expansions to disable pathname expansion would be impossible. Even if literal quotes required escaping, there's no way it could be useful for anything other than breaking things people forgot to quote by dumping the meaningless result of a glob into your heredoc.

The more places word splitting and pathname expansion are implicitly disabled, the better. There are only a handful of places where pathname expansion is apropos, and in Bash, unless you're writing in a restricted subset for portability, word splitting is virtually NEVER desirable. This is why (good) shell programming resources so heavily emphasize proper quoting: in no case do you want to allow the possibility of dangerous, inadvertent word splitting or pathname expansion. When in doubt, quote, because almost everyplace else redundant quotes are removed, with few exceptions where they cause side effects.

If you do for some reason want a glob result in a heredoc, put it into an array or the positional parameters and expand that.
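A sketch of that suggestion; the directory and file names are made up:

```shell
#!/usr/bin/env bash
# Expand the glob into an array first; the heredoc then interpolates
# the already-expanded result instead of a literal pattern.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"

files=( "$dir"/*.txt )          # pathname expansion happens here
cat <<EOF
Matched ${#files[@]} files: ${files[*]##*/}
EOF

rm -rf "$dir"
```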
-- 
Dan Douglas



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread Dan Douglas
On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
> > On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus wrote:
> >> And that means, there isn't way to substitute "something" to ' (single
> >> quote) when you want to not perform word splitting. I would consider
> >> it
> >> as a bug.
> > 
> > imadev:~$ q=\'
> > imadev:~$ input="foosomethingbar"
> > imadev:~$ echo "${input//something/$q}"
> > foo'bar
> 
> I meant without temporary variable.
> 
> RR
ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} )
a'c
-- 
Dan Douglas



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread Dan Douglas
On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
> On 02/28/2012 06:31 PM, Dan Douglas wrote:
> > On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
> >> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
> >>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus wrote:
> >>>> And that means, there isn't way to substitute "something" to
> >>>> ' (single quote) when you want to not perform word splitting.
> >>>> I would consider it as a bug.
> >>> 
> >>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$ echo
> >>> "${input//something/$q}" foo'bar
> >> 
> >> I meant without temporary variable.
> >> 
> >> RR
> > 
> > ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
> 
> ( x=abc; echo "${x/b/$'\''}" )
> -bash: bad substitution: no closing `}' in "${x/b/'}"
> 
> 
> you forgot the double quotes ;)
> 
> 
> I really did spend like an hour or 2 one day trying to figure it out
> and gave up.

Hm good catch. Thought there might be a new quoting context over there.
-- 
Dan Douglas



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread Dan Douglas
On Tuesday, February 28, 2012 06:52:13 PM John Kearney wrote:
> On 02/28/2012 06:43 PM, Dan Douglas wrote:
> > On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
> >> On 02/28/2012 06:31 PM, Dan Douglas wrote:
> >>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
> >>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
> >>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus
> >>>>> 
> >>>>> wrote:
> >>>>>> And that means, there isn't way to substitute "something"
> >>>>>> to ' (single quote) when you want to not perform word
> >>>>>> splitting. I would consider it as a bug.
> >>>>> 
> >>>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$
> >>>>> echo "${input//something/$q}" foo'bar
> >>>> 
> >>>> I meant without temporary variable.
> >>>> 
> >>>> RR
> >>> 
> >>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
> >> 
> >> ( x=abc; echo "${x/b/$'\''}" ) -bash: bad substitution: no
> >> closing `}' in "${x/b/'}"
> >> 
> >> 
> >> you forgot the double quotes ;)
> >> 
> >> 
> >> I really did spend like an hour or 2 one day trying to figure it
> >> out and gave up.
> > 
> > Hm good catch. Thought there might be a new quoting context over
> > there.
> 
> I think we can all agree its inconsistent, just not so sure we care??
> i.e. we know workarounds that aren't so bad variables etc.

Eh, it's sort of consistent. E.g., this doesn't work either:

unset x; echo "${x:-$'\''}"

and likewise a backslash escape alone won't do the trick. I'd assume this applies to just about every expansion.

I didn't think too hard before posting that. :)
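Collecting the forms from this thread that do work inside double quotes (variable names made up): the temporary-variable approach survives quoting in both a substitution and a default expansion.

```shell
#!/usr/bin/env bash
# Substituting a literal single quote via a temporary variable, which
# works in quoted expansions where $'\'' does not. Names made up.
q=\'                  # a single quote
x=abc
s1="${x/b/$q}"        # substitution:      a'c
unset y
s2="${y:-$q}"         # default expansion: '
printf '%s %s\n' "$s1" "$s2"
```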
-- 
Dan Douglas



Brace expansion bug

2012-03-26 Thread Dan Douglas
Hi, hopefully a self-explanatory one today:

~ $ ( set -x -- {a..c}; echo "${*-"{1..3}"}" )
+ echo 'a b c' 'a b c' 'a b c'
a b c a b c a b c

~ $ ( set -x -- {a..c}; echo "${*/"{1..3}"/$*}" )
+ echo 'a b c' 'a b c' 'a b c'
a b c a b c a b c

I'm told similar glitches have been found before. Looks like this applies to 
anything that has this sort of replacement context that contains valid brace 
expansion syntax inside quotes.

Bash 4.2.24 - Gentoo amd64

-- 
Dan Douglas



Re: Brace expansion bug

2012-03-26 Thread Dan Douglas
On Monday, March 26, 2012 01:44:58 PM you wrote:
> Dan Douglas  writes:
> > Hi, hopefully a self-explanatory one today:
> > ~ $ ( set -x -- {a..c}; echo "${*-"{1..3}"}" )
> > + echo 'a b c' 'a b c' 'a b c'
> > a b c a b c a b c
> > 
> > ~ $ ( set -x -- {a..c}; echo "${*/"{1..3}"/$*}" )
> > + echo 'a b c' 'a b c' 'a b c'
> > a b c a b c a b c
> 
> *Note (bash) Brace Expansion::
> 
>Brace expansion is performed before any other expansions, and any
> characters special to other expansions are preserved in the result.  It
> is strictly textual.  Bash does not apply any syntactic interpretation
> to the context of the expansion or the text between the braces.  To
> avoid conflicts with parameter expansion, the string `${' is not
> considered eligible for brace expansion.
> 
> Andreas.

No other shell I have to test with that has brace expansion behaves this way. Is this intentional? There's a trivial workaround (escaping the first brace), but expansion order shouldn't be relevant here; this is about the parsing and token-recognition steps. In order for brace expansion to occur at all, it would first have to be recognized as such.

Ordinarily the inner and outer quotes of the alternate-argument expansions are processed separately, so in the case of an unquoted expansion you can have situations like ${x+"$y" "$z"}, which potentially expands to zero or two words depending on the circumstance. If a brace expansion is detected, the function of the quotes is modified and the brace expansion becomes stronger from a parsing standpoint. This seems counter-intuitive.

I don't know how much I'm allowed to quote here, but a quick read of the POSIX parsing rules and parameter expansion sections suggests to me that the start of the parameter expansion should be the most important factor, and that nested quotes and braces are counted as part of the parameter expansion and escape at least any closing braces (opening braces aren't mentioned). They do seem to be emphasizing the point that there's nesting going on and that the shell should try to consider the parameter expansion as a whole first.

This is hard to interpret; it was obviously not written with brace expansion in mind. The Bash manpage does specifically omit brace expansion from the evaluation of "word", which I suppose should be a clue that it's intentional.

-- 
Dan Douglas



Re: Brace expansion bug

2012-03-26 Thread Dan Douglas
On Monday, March 26, 2012 08:07:00 AM you wrote:
> On 03/26/2012 07:56 AM, Dan Douglas wrote:
> > Don't know how much I'm allowed to quote here, but a quick read of the
> > POSIX parsing rules and parameter expansion sections suggest to me that
> >  the start of the parameter expansion should be the most important
> > factor, and that nested quotes and braces are counted as part of the
> > parameter expansion and escape at least any closing braces (doesn't
> > mention opening brace).
> POSIX doesn't specify brace expansion.  Use of brace expansion is an
> extension, and since there is no standard, it's hard to say whether it's
> right or wrong; you can only say whether it behaves as bash documented
> it to behave.

Fair enough, so long as there's no conflicting language. It is a useful hack if the behavior is intended - slightly better than the sometimes tempting but not-so-practical: cmd "${some-expansion}"{,}

-- 
Dan Douglas



Re: Passing variables by reference conflicts with local

2012-04-30 Thread Dan Douglas
On Monday, April 23, 2012 04:56:09 PM Clark Wang wrote:
> On Wed, May 5, 2010 at 01:57, Freddy Vulto  wrote:
> > It appears that `unset' is capable of traversing down the call-stack and
> > ...

I reverse engineered most of this behavior a few months ago and wrote a detailed explanation and example here:

http://wiki.bash-hackers.org/commands/builtin/unset#scope

Hopefully most of that is correct.
-- 
Dan Douglas



Segfault on compound assignment to a variable whose name is set in the environment of a declaration builtin.

2012-05-18 Thread Dan Douglas
Hi Chet, a segfault occurs during compound array assignment if an attempt is made to modify a variable of the same name from the environment of the declaration builtin. It appears to only occur in the global scope. I imagine the expected result should be either an error, or to evaluate in a manner similar to `x=1 let "x[x++]=x"', for example.

One way to reproduce below:

 ~ $ ( rm core; ulimit -c unlimited; bash -c 'x=1 declare -a x=( [x++]= )'; gdb 
-q "$(type -P bash)" -c core )
Reading symbols from /bin/bash...Reading symbols from 
/usr/lib64/debug/bin/bash.debug...done.
done.
[New LWP 20493]

warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Core was generated by `bash -c x=1 declare -a x=( [x++]= )'.
Program terminated with signal 11, Segmentation fault.
#0  0x004630a0 in array_insert (a=0x8e03c0, i=0, v=0x8dfe50 "") at 
array.c:633
633 for (ae = element_forw(a->head); ae != a->head; ae = 
element_forw(ae)) {
(gdb) bt full
#0  0x004630a0 in array_insert (a=0x8e03c0, i=0, v=0x8dfe50 "") at 
array.c:633
new = 0x8e0240
ae = 0x8e0070
#1  0x004640fe in bind_array_var_internal (entry=0x8dff30, ind=0, 
key=0x0, value=0x8e0306 "", flags=0) at arrayfunc.c:163
dentry = 0x1f333b550
newval = 0x8dfe50 ""
#2  0x00464b86 in assign_compound_array_list (var=0x8dff30, 
nlist=0x8dfe70, flags=0) at arrayfunc.c:529
a = 0x8e01e0
h = 0x0
list = 0x8dfe70
w = 0x8e0300 "[x++]="
val = 0x8e0306 ""
nval = 0x8dfff0 "x=([x++]=)"
len = 4
iflags = 0
ind = 0
last_ind = 0
akey = 0x0
#3  0x00464bfb in assign_array_var_from_string (var=0x8dff30, 
value=0x8dfcd2 "([x++]=)", flags=0) at arrayfunc.c:548
nlist = 0x8dfe70
#4  0x0047cbd7 in declare_internal (list=0x8e0070, local_var=0) at 
./declare.def:509
value = 0x8dfcd2 "([x++]=)"
aflags = 0
compound_array_assign = 1
name = 0x8dfcd0 "x"
offset = 1
making_array_special = 0
simple_array_assign = 0
flags_on = 4
flags_off = 0
flags = 0x7333b654
any_failed = 0
assign_error = 0
pflag = 0
nodefs = 0
opt = -1
mkglobal = 0
t = 0x0
subscript_start = 0x0
var = 0x8dff30
shell_fn = 0x0
#5  0x0047c004 in declare_builtin (list=0x21) at ./declare.def:98
No locals.
#6  0x00433278 in execute_builtin (builtin=0x47bff0 , 
words=0x8dffb0, flags=64, subshell=0) at execute_cmd.c:4113
old_e_flag = 0
result = 0
eval_unwind = 0
isbltinenv = 0
error_trap = 0x0
#7  0x00433eb8 in execute_builtin_or_function (words=0x8dffb0, 
builtin=0x47bff0 , var=0x0, redirects=0x0,
fds_to_close=0x8dfbc0, flags=64) at execute_cmd.c:4538
result = 0
saved_undo_list = 0x0
ofifo = 0
nfifo = 0
osize = 0
ofifo_list = 0x0
#8  0x00432d8f in execute_simple_command (simple_command=0x8ddd30, 
pipe_in=-1, pipe_out=-1, async=0, fds_to_close=0x8dfbc0)
at execute_cmd.c:3948
words = 0x8dffb0
lastword = 0x8e0070
command_line = 0x0
lastarg = 0x8dfff0 "x=([x++]=)"
temp = 0x0
first_word_quoted = 0
result = 0
builtin_is_special = 0
already_forked = 0
dofork = 0
old_last_async_pid = -1
builtin = 0x47bff0 
func = 0x0
old_builtin = 0
old_command_builtin = 0
#9  0x0042d26f in execute_command_internal (command=0x8ddd00, 
asynchronous=0, pipe_in=-1, pipe_out=-1, fds_to_close=0x8dfbc0)
at execute_cmd.c:735
exec_result = 0
user_subshell = 0
invert = 0
ignore_return = 0
was_error_trap = 0
my_undo_list = 0x0
exec_undo_list = 0x0
last_pid = -1
save_line_number = 0
#10 0x0047ec83 in parse_and_execute (string=0x8dd630 "x=1 declare -a 
x=( [x++]= )", from_file=0x49ff30 "-c", flags=4) at evalstring.c:319
bitmap = 0x8dfbc0
code = 0
---Type  to continue, or q  to quit---
lreset = 0
should_jump_to_top_level = 0
last_result = 0
command = 0x8ddd00
#11 0x00417f9b in run_one_command (command=0x7333d820 "x=1 declare 
-a x=( [x++]= )") at shell.c:1315
code = 0
#12 0x004172a9 in main (argc=3, argv=0x7333bb38, 
env=0x7333bb58) at shell.c:688
i = 0
code = 0
old_errexit_flag = 0
        saverst = 0
locally_skip_execution = 0
arg_index = 3
top_level_arg_index = 3
(gdb) q
 $ echo $BASH_VERSION
4.2.28(1)-release

Thanks again. (not overly anxious for a fix.)
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: quoted and concatenated positional parameters

2012-05-23 Thread Dan Douglas
On Wednesday, May 23, 2012 09:47:33 PM gregrwm wrote:
> expansion anomaly with quoted and concatenated positional parameters

Also reproducible in 4.2.28(1)-release

This occurs when any expansion is adjacent to or contained within a word that
is adjacent to an expansion of the form "$@" or "${a[@]}", and within the same
double-quotes. Bash mistakenly treats "${@}${x}" and "${@}""$x" differently.

> echo  '${@:2}c$1 c2 c3 #works as long as quoting omitted'

Because you're applying word-splitting to the result.  You'll see that there is
only one word if IFS is set to null.  It's impossible to test whether the
unquoted case is correct. The manpage says that only a quoted "$@" is split
into words for reasons other than word-splitting. Bash, mksh, and dash all
disagree about the unquoted cases, while Bash is the only shell to take issue
with the adjacent expansion in all cases.

 ~ $ for sh in {{,m}k,{d,b}a,z}sh; do printf '%s\n' "${sh}:" "$("$sh" -c 
"$(cat <<\EOF
args() { printf '<%s> ' "$@"; echo; }
args "${@}${1}"
args "${@}foo"
args ${@}${1}
args ${@}foo
IFS=
args ${@}${1}
args ${@}foo
EOF
)" _ 1 2 3 4 5)" ""; done

ksh:
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 

mksh:
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1 2 3 4 51> 
<1 2 3 4 5foo> 

dash:
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<123451> 
<12345foo> 

bash:
<1 2 3 4 51> 
<1> <2> <3> <4> <5foo> 
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1 2 3 4 51> 
<1> <2> <3> <4> <5foo> 

zsh:
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
<1> <2> <3> <4> <51> 
<1> <2> <3> <4> <5foo> 
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: quoted and concatenated positional parameters

2012-05-23 Thread Dan Douglas
Ugh, sorry, I forgot to strip trailing whitespace. If that wasn't 
comprehensible for anyone, the heredoc in the preceding test case was:
args() { printf '<%s> ' "$@"; echo; }
args "${@}${1}"
args "${@}foo"
args ${@}${1}
args ${@}foo
IFS=
args ${@}${1}
args ${@}foo
EOF

-- 
Dan Douglas



Re: handling of test == by BASH's POSIX mode

2012-05-27 Thread Dan Douglas
On Sunday, May 27, 2012 08:45:46 PM Jon Seymour wrote:
> On 27/05/2012, at 17:39, Geir Hauge  wrote:
> 
> > 2012/5/27 Jon Seymour :
> >> Is there a reason why bash doesn't treat == as an illegal test
> >> operator when running in POSIX mode?
> > 
> > POSIX does not say == is not allowed.
> > 
> > POSIX tells you what the shell should at least be able to do. A POSIX
> > compliant shell can have whatever other features it likes, as long as
> > the POSIX features are covered.
> > 
> 
> I guess the question is better phrased thus: what use case is usefully 
served by having bash's POSIX mode support a superset of test operators than 
other compliant POSIX shells?  As it stands, I can't use bash's POSIX mode to 
verify the validity or otherwise of a POSIX script because bash won't report 
these kinds of errors - even when running in POSIX mode.
> 
> There is an --enable-strict-posix (?) configuration option. Will this do what 
I expect?
> 
> > 
> >> This is problematic because use of test == in scripts that should be
> >> POSIX isn't getting caught when I run them under bash's POSIX mode.
> >> The scripts then fail when run under dash which seems to be stricter
> >> about this.
> > 
> > Don't use non-POSIX features in a POSIX script, and you'll be fine.
> > http://www.opengroup.org/onlinepubs/9699919799/utilities/contents.html
> > 
> 
> Which is the exactly the point. Practically speaking when I write scripts I 
expect an interpreter that claims to be running in POSIX mode to give me some 
help to flag usage of non POSIX idioms. Yes, I can second guess the interpreter 
by reading the spec, but is this really the most efficient way to catch these 
kinds of errors?
> 
> Jon.

There are no shells in existence that can do what you want. All major shells 
claiming to be POSIX compatible include some superset that can't be disabled. 
The only shell I have installed not supporting == in [ is dash, and there are 
so many scripts in the wild using == with [ it would be a miracle if your 
system didn't break because of it. Even the coreutils /usr/bin/[ supports ==.

Performing that kind of checking, rigorously, in a shell, would be impossible 
to do statically anyway. Any such lint tool is limited to lexical analysis, 
which makes it not very useful for testing unless your script is perfectly 
free of side-effects. And who writes side-effect free shell scripts?

How would the shell check for the correctness of:

"$(rm -rf somepath; echo '[')" x == x ]
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: handling of test == by BASH's POSIX mode

2012-05-27 Thread Dan Douglas
> POSIX hasn't provided a way to validate whether a script
> only uses features that are required to be supported by POSIX
> compliant interpreters.

I believe that was someone else's point, but yes that would be a problem for 
anyone who wanted to implement compliance check warnings.

> even if bash is technically compliant with POSIX, what use case is usefully 
> served by having bash support a superset of the POSIX test operators while 
executing in POSIX mode?

It's a matter of practicality and the fact that nobody has written it yet. If 
you wanted to implement conformance checks without making Bash even bigger and 
slower, and harder to maintain, I don't think there would be objections. Bash 
just modifies conflicting features to the minimal extent necessary to bring it 
into compliance, which seems to be the path of least resistance.

This would be a big job, I think, and not quite at the top of my wish-list. 
Right now you can increase the number of things that fail by explicitly 
disabling non-POSIX built-ins using the Bash "enable" builtin.

> I wasn't claiming that static checking would be viable. In fact, the
> impossibility of static checking is precisely why it would be useful
> to have real POSIX "compliant" interpreters that were as conservative
> as possible in the syntax and commands they accepted at least in their
> so-called POSIX mode.

Dash is useful for testing. The Bash answer is [[, which can do a lot of 
special error handling due to being a compound command. I wrote a bit 
about this here:

http://mywiki.wooledge.org/BashFAQ/031/#Theory

In reality, [[ is one of the most portable non-POSIX features available. 
Most people shouldn't have to worry about avoiding it.
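For example, one bit of the special handling [[ gets as a compound command:

```shell
# [[ does not word-split unquoted expansions; [ (a regular command) does,
# so the same comparison that works under [[ is an error under [.
x='a b'
[[ $x == 'a b' ]] && echo '[[ matches'
[ $x = 'a b' ] 2>/dev/null || echo '[ fails: too many arguments'
```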

On Sunday, May 27, 2012 11:09:03 PM Jon Seymour wrote:
> That said, from the point of view of promoting interoperable scripts,
> my view is that it (in an ideal world**) would be better if bash chose
> to fail, while executing in POSIX mode, in this case.

In an ideal world, POSIX would define [[, Dash wouldn't exist, and we would 
have some resource other than POSIX that specifies what's ACTUALLY portable 
between modern shells, so that the only people who have to worry are those 
targeting Busybox and Solaris Heirloom, or stubborn curmudgeons who insist on 
Dash... some of these things feel like supporting IE6 - it's better to just 
not.

> This is exactly my problem: I replaced /bin/sh -> dash with /bin/sh ->
> bash because a 3rd party product installation script failed when dash
> was the "POSIX" shell.

If you need a fast small shell, use mksh. It supports some of the more 
essential features of both Bash and Ksh (arrays, ((, [[), and some of its own, 
without all the draconian restrictions of dash.

(Note this is querying my package manager, including symbols and source files)

# equery s dash
 * app-shells/dash-0.5.7.1
 Total files : 73
 Total size  : 1.14 MiB
 # equery s mksh
 * app-shells/mksh-
 Total files : 39
 Total size  : 1.61 MiB
 # equery s bash
 * app-shells/bash-4.2_p28
 Total files : 464
 Total size  : 6.33 MiB

-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: Compare 2 arrays.

2012-05-30 Thread Dan Douglas
On Wed, May 30, 2012 at 4:57 PM, Greg Wooledge  wrote:
>
> On Wed, May 30, 2012 at 10:14:42AM -0600, Bill Gradwohl wrote:
> > What say you Chet? Bug or feature? There is no middle ground.
>
> That's unrealistic.  There are plenty of things that occupy that middle
> ground -- unexpected program behaviors.  The programmer can never
> anticipate *every* input sequence that users will throw at the software,
> so some of them may cause surprises.

Since variable names treated as strings can cause execution of
arbitrary code in Bash, mixing them with user input isn't usually
possible anyway. However, that's irrelevant to whether this kind of
expansion is a bug.

> The danger of using unexpected program behaviors in your applications
> is that the behaviors may change without warning in a future version of
> the program.  The programmer may not even be aware that this behavior
> exists, let alone that people are using it.  A clean-up of the parser
> (or similar change) may make the behavior go away, or change.

That is true, but I don't think this particular thing is unexpected.
Hopefully it doesn't change unless it's replaced by something better,
because it's extremely useful.

> ...
> I added that on 2011-05-02 after Buglouse described it on IRC.  I'm fairly
> certain that EVERYONE else in the channel at the time was as surprised
> by it as I was.
>

I think they shouldn't have been surprised. Here it is described on
stackoverflow prior to that date:
http://stackoverflow.com/a/4017175/495451

I don't understand why you continue to consider this a possible bug.
The manual only has a couple sentences to say about indirect
expansion:

"If the first character of parameter is an exclamation point (!), a
level of variable indirection is introduced.Bash uses the value of
the variable formed from the rest of parameter as the name of the
variable; this variable is then expanded and that value is used in the
rest of the substitution, rather than the value of parameter itself.
This is known as indirect expansion."

And Bash does exactly that. If indirect array expansion is a bug, then
it's a documentation bug, because it's implied by the manual amongst
other things. The question is whether or not values of the form
"arr[n]" are considered valid "parameters" or "variable names". IMO,
they are. "The value of the variable formed from the rest of the
parameter" must itself be a "variable name" in order for these
sentences to make sense. This is consistent with their usage in
several Bash builtins that accept "variable names" as arguments such
as "read", "unset", and "printf". I consider some of those which don't
as bugs, such as "[[ -v". IMO, this should be made consistent. But at
least I've tested "${!x}" extensively in complex situations and it
works as expected every time. It's also not that difficult to
understand.

Still, it's an advanced feature and probably just as well that it's obfuscated.
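A brief sketch of the "arr[n]" indirection being described:

```shell
# A value of the form 'arr[n]' (or 'arr[@]') works as the target of
# indirect expansion: ${!ref} expands the named element(s).
arr=(zero one two)
ref='arr[2]'
printf '%s\n' "${!ref}"          # the single element arr[2]
ref='arr[@]'
printf '<%s> ' "${!ref}"; echo   # all elements, one word each
```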



Re: Indirect access to variables, including arrays (was Re: Compare 2 arrays.)

2012-06-07 Thread Dan Douglas
On Thursday, June 07, 2012 10:01:51 AM Pierre Gaston wrote:
> On Thu, Jun 7, 2012 at 6:07 AM, Linda Walsh  wrote:
> >(no I haven't made it space/bracket...whatever proof...just a bit
> > more work)
> 
> It's not just "a bit more work", there are many workarounds but it's not
> really possible to make a really robust generic solution for assignment,
> and in the end it just not as simple and pretty as nameref.
> 
> Fwiw here is a robust and simple solution for in_:
> 
> _in () {
>   local e t
>   t="${2:?}[@]";
>   for e in "${!t}"; do [[ $1 = "$e" ]] && return 0;done
>   return 1;
> }
> 

Not robust due to the name conflicts with "e" or "t". There's also no good way 
to generate a list of locals without parsing "local" output (I'd rather live 
with the conflicts). I usually store indirect references in the positional 
parameters for that reason. Proper encapsulation is impossible.

Also, the standard behavior of ${var:?} is nearly useless unless you want to 
either create fatal errors or put the whole function into a subshell. I wish 
there were a way to modify that expansion to something nonstandard but useful. 
For associative arrays (where a null parameter is the only invalid key), the 
best solution is to explicitly check for null parameters and bail out of the 
function. There's no convenient way to do that other than looping.
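A minimal sketch of that explicit check (function name hypothetical):

```shell
# Reject a null associative-array key up front and return an error,
# instead of relying on ${var:?}, which aborts a non-interactive shell.
aget() {
    [[ $2 ]] || { printf 'aget: null key\n' >&2; return 1; }
    local t="$1[$2]"
    printf '%s\n' "${!t}"
}
declare -A m=([x]=1)
aget m x                 # prints the element
aget m '' || echo rejected
```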

I used a convoluted solution in this "multidimensional array" example: 
http://wiki.bash-hackers.org/syntax/arrays#indirection

I'm not very happy with it and will probably change the ${x+${y[z]+...}} 
hieroglyphics to something readable as soon as I can think of whatever sucks 
the least.

Granted, these are atypical issues. I care about this more than most due to 
Gentoo being stuck with Bash and eclasses being among the few valid reasons to 
care about large safe extensible libraries written for a shell.
-- 
Dan Douglas



mapfile -n seeks ahead an extra line

2012-06-20 Thread Dan Douglas
From: Dan Douglas
To: bug-bash@gnu.org
Subject: mapfile -n seeks ahead an extra line

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: x86_64-pc-linux-gnu-gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I. -I./include -I. -I./include -I./lib  
-DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
 -DSTANDARD_UTILS_PATH='/bin:/usr/bin:/sbin:/usr/sbin' 
-DSYS_BASHRC='/etc/bash/bashrc' -DSYS_BASH_LOGOUT='/etc/bash/bash_logout' 
-DNON_INTERACTIVE_LOGIN_SHELLS -DSSH_SOURCE_BASHRC -march=native -Ofast -mmmx 
-pipe -floop-interchange -floop-strip-mine -floop-block -ggdb
uname output: Linux smorgbox 3.4.2-pf+ #62 SMP PREEMPT Mon Jun 11 16:24:00 CDT 
2012 x86_64 Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz GenuineIntel GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 29
Release Status: release

Description:
mapfile -n eats a line too many, discarding the last.

Repeat-By:
$ printf '%s\n' {a..f} | { mapfile -tn 2 a; mapfile -t b; declare -p a b; }
declare -a a='([0]="a" [1]="b")'
declare -a b='([0]="d" [1]="e" [2]="f")'
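Until the over-read is fixed, one possible workaround sketch is to fetch the first N lines with read, which does no lookahead on a pipe:

```shell
# Read the first two lines with 'read' (one byte at a time on a pipe,
# so nothing extra is consumed), then mapfile the remainder.
printf '%s\n' {a..f} | {
    a=()
    for ((i = 0; i < 2; i++)); do IFS= read -r line && a+=("$line"); done
    mapfile -t b
    printf '%s\n' "a: ${a[*]}" "b: ${b[*]}"
}
```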

signature.asc
Description: This is a digitally signed message part.


Re: Doc of "set -e" should mention non-locality

2012-06-29 Thread Dan Douglas
On Thursday, June 28, 2012 02:37:17 PM Rainer Blome wrote:
> The implementation of "set -e" does not respect "lexical nesting".
> This can be very surprising.  

None of the "set" options do, nor does the ERR trap. That would make this the 
exception. Here's a workaround (untested).

sete() {
    [[ $- == *e* ]] && return 1
    trap "$(cat)" RETURN
} <<EOF
if [[ $FUNCNAME != \$FUNCNAME ]]; then
    set +e
    trap - RETURN
else
    set -e
fi
EOF

signature.asc
Description: This is a digitally signed message part.


Re: Doc of "set -e" should mention non-locality

2012-07-05 Thread Dan Douglas
On Wednesday, July 04, 2012 05:37:25 PM Rainer Blome wrote:
>  Original-Nachricht 
> > Datum: Fri, 29 Jun 2012 18:03:13 -0500
> > Von: Dan Douglas 
> > An: bug-bash@gnu.org
> > CC: Rainer Blome 
> 
> Remember that my main suggestion is to clearly document the intended
> behavior (see the subject). This could mean to add a generic
> paragraph to the documentation of "set" that describes the scope
> and extent for all options. 

I'm all for better documentation. Scope in Bash is a complex subject. Almost 
none of it is documented, and there is little standardization around how 
function scope is supposed to work anyway. I'd call set -e a low priority 
relative to documenting what scope behaviors are actually in place.

> > On Thursday, June 28, 2012 02:37:17 PM Rainer Blome wrote:
> > > The implementation of "set -e" does not respect "lexical nesting".
> > > This can be very surprising.  
> > 
> > None of the "set" options do, nor does the ERR trap.
> 
> That may very well be. Is this documented anywhere?

About the one thing you can count on with regards to scope in Bash, is that it 
won't be lexical. But that's true of the majority of shells that have any kind 
of scoping features at all beyond positional parameters and environment 
variables.

> PS: Your suggested "workaround" is, well, hard to understand.
> Reminds me of the way people extend FORTH by massaging the stacks.
> You have to know exactly what is parsed, substituted and evaluated
> when in order to understand this (if it even works, did not try it).
> I would not dare use this in production for fear of
> receiving a beating from colleagues over hard to maintain code. ;-)

That's basically what it is (and this is yet another undocumented scope 
thing). Setting this trap within a function sets a hook on the function and 
all of its callers which essentially runs eval on the given string upon 
returning. To make matters more confusing, also potentially on callees if they 
have the trace attribute set.

Other than that there's not much to understand or maintain. It'll set -e when 
returning from the first function and set +e when returning from the second, 
then unset itself on any further callers.
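A small sketch of those mechanics (hypothetical function names):

```shell
# A RETURN trap set inside f fires when f returns, and -- because the
# trap is global state set mid-call -- again when f's caller g returns.
f() { trap 'echo "leaving $FUNCNAME"' RETURN; echo 'in f'; }
g() { f; echo 'in g'; }
g
trap - RETURN
```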

> > Here's a workaround (untested).
> > 
> > sete() {
> > [[ $- == *e* ]]  && return 1
> > trap "$( > } < > if [[ $FUNCNAME != \$FUNCNAME ]]; then
> > set +e
> > trap - RETURN
> > else
> > set -e
> > fi
> > EOF
> > 
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: why must non-standard $IFS members be treated so differently ?

2012-07-29 Thread Dan Douglas
On Sunday, July 29, 2012 03:23:29 PM Jason Vas Dias wrote:
> Good day Chet, list -
>  I'm concerned about the difference in output of these functions with
> the example input
>  given on the '$' prefixed line below (with 4.2.29(2)-release
> (x86_64-unknown-linux-gnu)):
> 
>  function count_args {v=($@);  echo ${#v[@]}; }
> 
>  function count_colons {  IFS=':' ;v=($@);  echo ${#v[@]}; }
> 
>  $ echo $(count_args 1 2 3\ 4) $(count_colons 1:2:3\:4)
>  3 4
> 
>  It appears to be impossible for an item delimited by 'X' to contain
> an escaped  'X' ('\X')  if 'X' is not
>  a standard delimiter (' ', '') .  Quoting doesn't seem to help either:
> 
>  $ echo $(count_args 1 2 3\ 4) $(count_colons 1:2:3':4')
>  3 4
> 
> To me, this appears to be a bug.
> 
> But I bet you're going to tell me it is a feature ?
> Please explain.

Bash doesn't re-parse the results of expansions for quotes or escaping. When 
you expand something unquoted, the entire result is always subject to 
word-splitting. 

In the case of `count_args 1 2 3\ 4', you are passing 3 arguments. The 
backslash is not the result of an expansion, so it gets treated as escaping a 
space. Note that because bash is pass-by-value, this escaping/expansion is 
processed prior to calling the function.

In the case of `count_colons 1:2:3\:4', you are passing one argument. The shell 
strips away the backslash when the function is called (just as it did in the 
first example), so the argument being passed is actually '1:2:3:4'.

 $ printf '%s\n' 1:2:3\:4
1:2:3:4

If you wanted to pass the backslash, you would have to either quote the 
argument, or use \\.

However, in either case you're going to have 4 arguments, because, as 
previously stated, escape characters resulting from expansions are not treated 
as escapes.

 $ f() { IFS=: local -a 'v=( $@ )'; printf '<%s> ' "${v[@]}"; echo; }; f 
1:2:3\:4
<1> <2> <3> <4>
 $ f() { IFS=: local -a 'v=( $@ )'; printf '<%s> ' "${v[@]}"; echo; }; f 
1:2:3\\:4
<1> <2> <3\> <4>

See also my answer to this recent question: http://superuser.com/a/454564/78905
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: why must non-standard $IFS members be treated so differently ?

2012-07-29 Thread Dan Douglas
On Sunday, July 29, 2012 03:23:29 PM Jason Vas Dias wrote:
> echo $(count_args 1 2 3\ 4)

I should also have mentioned that I couldn't reproduce this case. You should 
be getting 4 here in your example, not 3. I have the same Bash version. Are 
you sure you were echoing `${#v[@]}' and not `${#@}', and also that you did 
not set IFS=: for count_args? If you use exactly the function you sent with 
the default IFS then you should get 4 here.
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: bash does filename expansion when assigning to array member in compound form

2012-08-18 Thread Dan Douglas
This is a feature that all shells with this style of compound assignment have 
in common. If no explicit subscripts are given, the text between the 
parentheses is processed exactly as though it were arguments to a command 
including brace expansion, word-splitting, and pathname expansion (and 
consequently, quoting is just as important). This is an important feature 
because it allows storing the results of a glob in an array easily.
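For instance, a sketch of the glob-storing use (scratch directory, names hypothetical):

```shell
# With no explicit subscripts, the words between the parentheses undergo
# pathname expansion, so a glob fills the array with matching names.
dir=$(mktemp -d) && cd "$dir" || exit
touch a.txt b.txt
files=( *.txt )                  # unquoted word: glob expands
printf '<%s> ' "${files[@]}"; echo
cd / && rm -r "$dir"
```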

If a subscript is given explicitly, then the right-hand side of the assignment 
is treated exactly as an ordinary scalar assignment would be, including all 
analogous behaviors for `+=' and the integer attribute.

 $ set -x; a=( [1]=* )
+ a=([1]=*)
-- 
Dan Douglas



Re: bash does filename expansion when assigning to array member in compound form

2012-08-18 Thread Dan Douglas
On Saturday, August 18, 2012 07:55:17 PM Stephane Chazelas wrote:
> 2012-08-18 10:26:22 -0500, Dan Douglas:
> > This is a feature that all shells with this style of compound assignment 
> > have 
> > in common. If no explicit subscripts are given, the text between the 
> > parentheses is processed exactly as though it were arguments to a command 
> > including brace expansion, word-splitting, and pathname expansion (and 
> > consequently, quoting is just as important). This is an important feature 
> > because it allows storing the results of a glob in an array easily.
> > 
> > If a subscript is given explicitly, then the right-hand side of the 
> > assignment 
> > is treated exactly as an ordinary scalar assignment would be, including all 
> > analagous behaviors for `+=' and the integer attribute.
> > 
> >  $ set -x; a=( [1]=* )
> > + a=([1]=*)
> [...]
> 
> Nope:
> 
> ~/1$ touch '[1]=x'
> ~/1$ bash -c 'a=( [1]=* ); echo "${a[@]}"'
> [1]=x
> ~/1$ bash -c 'a=( [1]=asd ); echo "${a[@]}"'
> asd
> 
> That's a bug though.
> 
> Just do
> 
> a=("*") or a=('*') or a=(\*)
> 
>
 
Eh yeah. At least the left side gets implicit quoting, and it correctly 
disables brace expansion. In mksh compound assignment is just sugar for set -A, 
so Bash isn't unique in this.

  $ touch 1=a; mksh -c 'a=([123]=*); print -r "${a[@]}"'
1=a

-- 
Dan Douglas



Re: bash does filename expansion when assigning to array member in compound form

2012-08-18 Thread Dan Douglas
Bleh, I'm wrong: brace expansion remains too. I should know this... it's hard 
to remember all the quirks even when I write them down.



Re: bash does filename expansion when assigning to array member in compound form

2012-08-20 Thread Dan Douglas
On Monday, August 20, 2012 07:44:51 PM Roman Rakus wrote:
> And how would you achieve to fill array with all file names containing 
> `[1]=' for example.

$ ls
[1]=a  [1]=b
$ ( typeset -a a=( \[1\]=* ); typeset -p a )
typeset -a a=('[1]=a' '[1]=b')
$ ( typeset -a a=( [1]=* ); typeset -p a )
typeset -a a=([1]='*')
$

In ksh93, by escaping. I think this is what most people would expect and 
probably what Bash intended.

Of course, in that shell, in order to use "[n]="-style indexing, each and every 
element needs to be specified that way explicitly. I like that Bash can just 
implicitly start counting at any index.
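A quick sketch of that implicit counting:

```shell
# After an explicit subscript, Bash numbers subsequent unsubscripted
# elements from the last index used.
a=( [5]=x y z )
echo "${!a[@]}"   # the indices in use
echo "${a[7]}"    # the element that implicitly landed at 7
```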
-- 
Dan Douglas



Re: bash does filename expansion when assigning to array member in compound form

2012-08-21 Thread Dan Douglas
On Tuesday, August 21, 2012 07:24:31 AM Stephane Chazelas wrote:
> 2012-08-20 19:44:51 +0200, Roman Rakus:
> [...]
> > And how would you achieve to fill array with all file names
> > containing `[1]=' for example.
> [...]
> 
> Another interesting question is how to fill the array with all
> the file names that start with a digit followed by "=".
> 
...
> a=(@([0-9])=*); typeset -p a'
> declare -a a='([0]="3=foo" [1]="4=foo" [2]="5=foo")'

Another way:
IFS= glob=[123]=* typeset -a 'y=($glob)'

And another:
IFS= typeset -a "a=($(printf '%q ' [123]=*))"

Or set -A if available.

> > Definitely it's good, if you want to be sure, to always quote all
> > characters which means pathname expansion - `*', `?' and `['.
> [...]
> 
> Yes, the problem here is that "[" is overloaded in a conflicting
> manner as a globbing operator and that poorly designed special
> type of array assignment.

Indeed, but you were right in calling it a bug. Matching weird filenames is an 
odd corner case for which there are workarounds. The current behavior doesn't 
make sense.

Still, a variable/function attribute for disabling pathname expansion in a 
controlled manner would be useful, given that having to manage the state of set 
-f mostly precludes its use.
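The set -f bookkeeping alluded to above might look like this (sketch; function name hypothetical):

```shell
# Split a string on IFS without pathname expansion, then restore the
# caller's glob setting -- the state management that makes set -f a chore.
split_unglobbed() {
    local restore=
    [[ $- == *f* ]] || restore='set +f'
    set -f
    words=( $1 )      # word-split only; '*' stays literal under set -f
    $restore          # expands to 'set +f' or to nothing
}
split_unglobbed 'a * b'
printf '<%s> ' "${words[@]}"; echo
```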

-- 
Dan Douglas



Re: bash does filename expansion when assigning to array member in compound form

2012-08-29 Thread Dan Douglas
On Friday, August 24, 2012 09:38:44 AM you wrote:
> On 8/22/12 8:58 PM, Chet Ramey wrote:
> 
> > Then how about this: words inside a compound assignment statement that are
> > recognized as assignment statements ([1]=foo) are expanded like assignment
> > statements (no brace expansion, globbing, or word splitting).  Other words
> > undergo all the expansions.

That's pretty much what I had in mind. I assumed this was how Bash handled 
pathname expansion until seeing Stephane's exception.

> > 
> > That means you can do things like
> > 
> > [{0,1,2,3}]=foo
> > 
> > to set the first four elements to the same value
> 
> Or should these be marked as assignment statements after brace expansion,
> with the appropriate expansions performed?  It can be complicated to
> suppress brace expansion on the RHS after allowing it on the LHS.
> 
> Chet
> 

I can't think of any problems with either offhand. Even disabling brace 
expansion entirely for words recognized as assignments wouldn't be too bad.

But it could be a nice shortcut. Current methods I could think of using brace 
expansion are ugly:

 $ declare -a 'a+=(['{0..3}']=foo)' b[{0..3}]=foo c=( [{0..3}]=foo )
 $ declare -a "c=(${c[*]})"
 $ declare -p a b c
declare -a a='([0]="foo" [1]="foo" [2]="foo" [3]="foo")'
...

-- 
Dan Douglas



Some issues with short-circuiting arithmetic operators

2012-09-05 Thread Dan Douglas
This reorder function is meant to swap values of a two-element array if
unordered. Bash and ksh produce reversed results. mksh and zsh do as expected.

#!/usr/bin/env bash

[[ -n ${ZSH_VERSION+_} ]] && emulate ksh

function reorder {
(( x[1] < x && (x=x[1], x[1]=$x) ))
echo "${x[@]}"
}

x=(123 456)
reorder
x=(456 123)
reorder

eval echo '${'{BA,K,Z}SH_VERSION\}
# vim: ft=sh :

 $ bash ./reorder
456 123
123 456
4.2.37(1)-release
 $ ksh ./reorder
123 456
456 123
Version AJM 93u+ 2012-06-28
 $ mksh ./reorder
123 456
123 456
@(#)MIRBSD KSH R40 2012/09/01
 $ zsh ./reorder
123 456
123 456
5.0.0

The ksh issue seems to be that an explicit x[0] is needed (it's a slightly
outdated dev build), but I can't figure out why Bash is doing this. A version
with no parameter expansion in the arithmetic behaves the same way:

function reorder2 {
_=$x let '(x[1] < x) && (x=x[1], x[1]=_)'
echo "${x[@]}"
}

Some variations crash:

$ bash -c 'function reorder { (( x[1] < x[0] && (x=x[1], x[1]=$x) )); echo 
"${x[@]}"; }; x=(123 456); reorder; x=(456 123); reorder'
Segmentation fault
$ bash -c 'function reorder { (( (x > x[1]) && (x=${x[1]}, x[1]=$x) )); 
echo "${x[@]}"; }; x=(123 456); reorder; x=(456 123); reorder'
123 456
Segmentation fault

The second issue is that Bash tries to resolve arithmetic variables when
evaluation should never reach them. Some methods of short-circuiting do work,
e.g. to populate an array:

 $ n=0 a="a[n]=n=(n+1)%10,a"; ((a)); echo "${a[@]}" # Bash
0 1 2 3 4 5 6 7 8 9

However, in this case, `a' should be protected by `&&'. The same occurs with
`||' and `x?y:z'.

 $ zsh -c 'emulate ksh; typeset -a a; n=0 a="(a[n]=n++)<7&&a"; ((a)); echo 
"${a[@]:1}"' # max depth == 256
0 1 2 3 4 5 6 7
 $ ksh -c 'n=0 a="(a[n]=++n)<7&&a[0]"; ((a[0])); echo "${a[@]:1}"'  
# max depth == 8
1 2 3 4 5 6 7
 $ bash -c 'n=0 a="(a[n]=++n)<7&&a[0]"; ((a[0])); echo "${a[@]:1}"' 
# max depth == 1024
bash: n: expression recursion level exceeded (error token is "n")
 $ bash -c 'n=0 a="(a[n]=n++)<7&&a"; ((a)); echo "${a[@]:1}"'
bash: (a[n]=n++)<7&&a: expression recursion level exceeded (error token is 
"n++)<7&&a")
0 1 2 3 4 5 6 7

The last one both gets the right answer and throws an error... weird.

It appears the mksh method is to keep track of which variables have been
visited and error if any are referenced twice, rather than counting the
arithmetic evaluator stack depth, so this isn't possible in that shell.
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: "break" inside a while-condition

2012-09-11 Thread Dan Douglas
On Tuesday, September 11, 2012 04:40:41 PM Philippe Wang wrote:
> Repeat-By:
>   # 1) should raise a parsing error  (but it doesn't)
>   while break ; true ; do true ; done 

This doesn't cause a parsing error in any shell I have to test with. I don't 
see why it would.

>   # 2) should break the outer loop  (but it doesn't)
>   while true ; do while break ; do whatever-because-never-reached ; done ; 
> echo fail ; done

I don't think so. It should break the inner loop. "break 2" would break the
outer loop. The "list" preceding the "do" keyword is considered part of the
loop. The same applies to "continue".
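A minimal sketch of the difference (placeholder loop bodies), assuming the condition list belongs to the loop it heads:

```shell
#!/usr/bin/env bash

demo() {
    # "break" in the condition list exits the inner loop only,
    # since the list before "do" is part of the loop it heads.
    while true; do
        while break; do :; done
        echo "after inner"
        break                  # an explicit break is needed for the outer loop
    done

    # "break 2" from the same position exits both loops at once.
    while true; do
        while break 2; do :; done
        echo "never printed"
    done
    echo "done"
}
demo
```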
-- 
Dan Douglas



Re: Parsing error when "case" in "for" in $()

2012-09-11 Thread Dan Douglas
On Tuesday, September 11, 2012 05:31:36 PM Steven W. Orr wrote:
> On 09/11/12 17:20, quoth Chris F.A. Johnson:
> > On Tue, 11 Sep 2012, Benoit Vaugon wrote:
> > ...
> >> Description:
> >>  Cannot use "case" construction in a "for" loop in a $() sub shell.
> >>  Should work but produces parsing error.
> >>
> >> Repeat-By:
> >>  echo $(for x in whatever; do case y in *) echo 42;; esac; done)
> >
> >  The closing parentheses in the case statement is being interpreted as the
> > closing for $(
> >
> >> Fix:
> >>  Probably by fixing the bash parser.
> >
> > Balance the parentheses in the case statement:
> >
> > echo $(for x in whatever; do case y in (*) echo 42;; esac; done)
> >
> 
> Thanks. I didn't know that the opening paren was optional and was needed in 
> such a case as a disambiguator. Very nice. And if you really want to match 
> something that starts with an open paren, just backslash it.
> 
> As a style issue, it makes me wonder if I should always use the optional
> open paren as syntactic sugar...

I could only reproduce it with an unquoted command substitution. It may still
not be correct even though there are workarounds. The existence of an
enclosing compound command shouldn't affect the parsing of the inner one.

The workaround might be a good idea anyway. ksh fails this in some 
permutations but not others, while most other shells seem to be able to handle 
it (except zsh which usually fails). CC'd ast-devel in case they consider it a 
bug worth fixing.

: $(case . in .) :; esac)
fails: zsh

: $(case . in (.) :; esac)
fails:

: $({ case . in .) :; esac; })
fails: sh bash zsh

: "$({ case . in .) :; esac; })"
fails: zsh

: $(for x in .; do case . in .) :; esac; done)
fails: sh bash zsh ksh

: "$(for x in .; do case . in .) :; esac; done)"
fails: zsh

bash/ksh-compatible testcase:
---
#!/usr/bin/env bash

shells=( sh {{b,d}a,z,{,m}k}sh bb )

while IFS= read -r testCase; do
printf '%s\nfails: ' "$testCase"
for sh in "${shells[@]}"; do
"$sh" -c "$testCase" 2>/dev/null || printf '%s ' "$sh"
done
echo $'\n'
done <<"EOF"
: $(case . in .) :; esac)
: $(case . in (.) :; esac)
: $({ case . in .) :; esac; })
: "$({ case . in .) :; esac; })"
: $(for x in .; do case . in .) :; esac; done)
: "$(for x in .; do case . in .) :; esac; done)"
EOF

-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: a recursion bug

2012-10-04 Thread Dan Douglas
It's possible to grow the parameter expansion stack forever too.

 $ (x=x[\${!x}]<${!x})
Segmentation fault

One would think there would be no need to keep a stack if there are no
more expansions to the right of the current expansion.

On Wed, Oct 3, 2012 at 3:39 PM, Chet Ramey  wrote:
> On 10/3/12 3:40 PM, Greg Wooledge wrote:
>> On Wed, Oct 03, 2012 at 01:23:58PM -0600, Bob Proulx wrote:
>>> But in any case, is there
>>> anything in there that is about bash?  If so the we need an exact test
>>> case.
>>
>> You could start with this one:
>>
>> imadev:~$ bash-4.2.28 -c 'a() { echo "$1"; a $(($1+1)); }; a 1' 2>&1 | tail
>> Pid 4466 received a SIGSEGV for stack growth failure.
>> Possible causes: insufficient memory or swap space,
>> or stack size exceeded maxssiz.
>
> There's not actually anything you can do about that except use ulimit to
> get as much stack space as you can.

Well, poor-man's TCO:

f()
if [ "$1" -ge 0 ]; then
printf "$1 "
exec dash -c "${2}f "'$(($1-1)) "$2"' -- "$@"
fi

( f 100 "$(typeset -f f)"$'\n' )

Joking of course :o)

FUNCNEST is usually good enough for me. No other shell I'm aware of
even has that.
--
Dan Douglas



Re: different exit codes in $? and ${PIPESTATUS[@]}

2012-10-14 Thread Dan Douglas
On Sunday, October 14, 2012 11:46:17 AM Wladimir Sidorenko wrote:
> To my mind '!' looks pretty much like a unary operator and '|' like a binary 
one.

This isn't as confusing as the associativity and nesting problem.

 $ ( ! time ! : | :; echo $? "( ${PIPESTATUS[@]} )" ) 2>/dev/null
0 ( 0 0 )
 $ ( ! ! : | :; echo $? "( ${PIPESTATUS[@]} )" )
0 ( 0 0 )
 $ ( ! { ! :; } | :; echo $? "( ${PIPESTATUS[@]} )" )
1 ( 1 0 )
 $ ( ! { ! : | : ; } | :; echo $? "( ${PIPESTATUS[@]} )" )
1 ( 1 0 )

I still don't completely understand how all of the above can be pipelines of
two elements if each bang denotes the beginning of a new pipeline. You would
expect at least one of these cases to show up as a one-element pipeline
containing multi-element pipelines, especially with "! !".

Both ksh and mksh additionally allow things like:

 $ mksh -c ': | ! : | :; echo $? "( ${PIPESTATUS[@]} )"'
1 ( 0 1 )
 $ mksh -c ': | ! { ! : | :; }; echo $? "( ${PIPESTATUS[@]} )"'
0 ( 0 0 )

I don't believe this is valid syntax, but these also end up being pipelines of
two elements. Bash, dash, zsh, etc. don't accept this.
-- 
Dan Douglas



bug-bash@gnu.org

2012-10-15 Thread Dan Douglas
On Sunday, October 14, 2012 05:21:05 PM Linda Walsh wrote:
> Seriously -- why not just fix it?
> 
> If you think it is broken -- Fix it.

This should fix most scripts.

case $- in
*e*) exec /bin/sh -c 'cat; rm -f "$1"; exit 1' -- "$BASH_SOURCE" <<-"EOF" >&2
Error: Braindamage detected. Protecting your system.
EOF
esac
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: different exit codes in $? and ${PIPESTATUS[@]}

2012-10-15 Thread Dan Douglas
This makes a lot of sense. Thanks for the nice explanation and link!

On Sunday, October 14, 2012 01:10:17 PM Chet Ramey wrote:
> It's a little easier to see how ! ! is a no-op that way.

The "negated false pipeline" isn't something I'd given much thought to. It
turns out "! !" is an even faster no-op than the previous fastest I'd found
(looping a few million times: "! !" is slightly faster than "<()", which is
slightly faster than ":"). Only bash/ksh seem to accept either a bare "!" or
"! !" though. Bash interestingly accepts "! !cmd" and "{ ! !; }; cmd",
but not "! !; cmd".

Also I may as well overload this mail by reporting that nearly every issue
I've ever sent to this list over the last year or so (that turned out to be
"legitimate") appears to have been addressed in devel (at least, according to
a few minutes of testing). Thanks! :)
-- 
Dan Douglas



Re: [ast-users] [ksh93] Should ~$user be tilde expanded?

2012-10-25 Thread Dan Douglas
For reference, the only current shell I can find that agrees with ksh93 is zsh 
in its default mode. In all emulation modes (including ksh), it reverts to 
being like Bash and others (mksh, dash, busybox, posh).

Regardless of which is correct, I've seen a lot of questions and confusion 
generated by the more common behavior. 
--
Dan Douglas



Re: wait unblocks before signals processed

2012-11-05 Thread Dan Douglas
Hi Elliott. The behavior of wait differs depending upon whether you are in 
POSIX mode. Try this script, which I think does essentially what you're after 
(also here: https://gist.github.com/3911059 ):

#!/usr/bin/env bash

${BASH_VERSION+shopt -s lastpipe extglob}

if [[ -v .sh.version ]]; then
builtin getconf
function BASHPID.get {
read -r .sh.value _ &2
sleep "$2"

printf '%d: returning %d\n' "$1" "$3" >&2
return "$3"
}

function main {
typeset -i n= j= maxj=$(getconf _NPROCESSORS_ONLN)

set -m
trap '((j--))' CHLD

while ((n++<30)); do
f "$BASHPID" $(((RANDOM%5)+1)) $((RANDOM%2)) &
((++j >= maxj)) && POSIXLY_CORRECT= wait
done

echo 'finished, waiting for remaining jobs...' >&2
wait
}

main "$@"
echo

# vim: set fenc=utf-8 ff=unix ts=4 sts=4 sw=4 ft=sh nowrap et:


The remaining issues are making it work in other shells (Bash in non-POSIX 
mode agrees with ksh, but ksh doesn't agree with POSIX), and also I can't 
think of a reasonable way to retrieve the exit statuses. The status of "wait" 
is rather useless here. Otherwise I think this is the best approach, using 
SIGCHLD and relying upon the POSIX wait behavior. See here: 
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_11

An issue to be aware of is that the trap will fire when any child exits 
including command/process substitutions or pipelines etc. If any are located 
within the main loop then monitor mode needs to be toggled off around them.
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: wait unblocks before signals processed

2012-11-05 Thread Dan Douglas
On Monday, November 05, 2012 05:52:41 PM Elliott Forney wrote:
> OK, I see in POSIX mode that a trap on SIGCHLD will cause wait to
> unblock.  We are still maintaining a counter of running jobs though so
> it seems to me that there could race condition in the following line
> 
> trap '((j--))' CHLD
> 
> if two processes quit in rapid succession and one trap gets preempted
> in the middle of ((j--)) then the count may be off.  Is this possible?
> 

I believe that Bash guarantees the trap will run once for every child that 
exits, so it should be impossible for the count to become off. See:
https://lists.gnu.org/archive/html/bug-bash/2012-05/msg00055.html

I think you might be experiencing other known bugs. Chet pushed several 
wait/job related commits within the last few weeks. I haven't tested these 
yet. http://git.savannah.gnu.org/cgit/bash.git/tree/CWRU/CWRU.chlog?h=devel
-- 
Dan Douglas



Re: RFE: printf '%(fmt)T' prints current time by default

2012-11-14 Thread Dan Douglas
On Wednesday, November 14, 2012 11:00:18 AM Clark WANG wrote:
> In ksh:
> 
> $ printf '%(%F %T)T\n'
> 2012-11-14 10:57:26
> $
> 
> In bash:
> 
> $ printf '%(%F %T)T\n'
> 1970-01-01 08:00:00
> $
> 
> I think the ksh behavior is makes more sense so can we use the current time
> as the default?
> 
> -Clark

I agree that a null or empty argument as equivalent to -1 is a better default. 
"0" is identical to the current behavior for empty/unset, so no functionality 
is lost.
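For reference, bash already accepts an explicit -1 meaning "now", so the suggestion only concerns what a missing or empty argument maps to. A quick sketch:

```shell
#!/usr/bin/env bash

# bash 4.2+: %(fmt)T formats a seconds-since-epoch argument; -1 means "now"
printf 'current year: %(%Y)T\n' -1

# an explicit 0 selects the epoch, which is what a missing argument
# currently defaults to
printf 'epoch year:   %(%Y)T\n' 0
```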

Additionally, an empty format in ksh is equivalent to the date(1) default for 
the current locale. So, LC_TIME=C; [[ $(printf '%()T') == "$(date)" ]] is 
true.

I imagine all the functionality ksh gets basically for free with libast tm.h 
functions is totally out of the question, if all there is to work from is libc 
strptime and friends. Having "date -d" essentially built in is handy. :/
--
Dan Douglas



Re: fd leak with {fd}>

2012-11-30 Thread Dan Douglas
On Monday, November 26, 2012 11:57:33 AM Chet Ramey wrote:
> On 11/26/12 8:41 AM, Pierre Gaston wrote:
> > On Mon, Nov 26, 2012 at 3:37 PM, Chet Ramey wrote:
> > 
> > On 11/23/12 2:04 AM, Pierre Gaston wrote:
> > > It seems rather counter intuitive that the fd is not closed after
> > > leaving the block.
> > > With the normal redirection the fd is only available inside the block
> > >
> > > $ { : ;} 3>&1;echo bar >&3
> > > -bash: 3: Bad file descriptor
> > >
> > > if 3 is closed why should I expect {fd} to be still open?
> > 
> > Because that's part of the reason to have {x}: so the user can handle the
> > disposition of the file descriptor himself.
> > 
> > . 
> > I don't see any difference between 3> and {x}> except that the later free
> > me from the hassle of avoid conflicting fd
> 
> That's not really an issue.  Bash goes to great effort to allow users to
> specify fds it is already using internally.
> 
> The user/shell programmer getting a handle to the fd is one benefit.  The
> ability to use those named fds in operators that don't allow words (e.g.
> variable expansions) to replace the explicit file descriptor number is
> another.
> 
> David Korn beat all of us in implementing this feature (we first began
> discussing it in 2005).  I should ask him if he has additional insight.
> 
> Chet
> 

I believe one of the motivations for named FDs other than automatic FD 
allocation, and the reason they remain open, was probably to deal with 
organizing and grouping coprocesses so that you could follow the variable 
names rather than the FDs directly due to the somewhat awkward way they are 
manipulated.

ksh, mksh, and zsh have almost the same system (except the Zsh "coproc" 
pipeline modifier replaces Ksh's "|&" list operator) in which you start with 
an "anonymous coproc" that can only be accessed indirectly through "read -p" 
and "print -p", until moving the current coproc to a named FD with the special 
"&p" redirects. After that, once you have a handle on the real FD, you can open
a new anonymous coproc and begin interacting with the others through "read -u" 
and "print -u" instead. Basically:

Bash:
coproc { ...; }; read -ru "${COPROC}"

Everybody else:
{ ...; } |& { read -ru "$COPROC"; } {COPROC[0]}<&p {COPROC[1]}>&p

I suppose the idea was that you could have collections of coprocs organized 
into arrays held open until you're finished with them. Bash does it without 
adding all that extra syntax, so the FD assignment feature is less important, 
with the downside being there are a bunch of incompatible conflicts like the 
"|&" pipe and "read -p" (unnecessary features IMO, but not a big deal).

One potential gotcha in Bash is that RETURN traps execute while exposed to the 
fds that were redirected during the call to the function, not to the 
definition. I don't know if that's intentional or whether most people are 
aware that both sets of redirects are active until returning even if a 
redirect to the definition hides another made to the call. Since you can only 
access local variables from a RETURN trap, but the visible FDs are a 
combination of redirects held open from execs or named FD assignments, plus
redirects to the function call, it could be easy to be surprised if relying 
upon RETURN to close the right FD especially if variable names from different 
scopes conflict.

$ bash -s <<\EOF
f() {
trap 'trap - RETURN; cat "/dev/fd/$x" -; exec {x}<&-' RETURN
local x
: <<

Re: "And" extended matching operator

2012-12-03 Thread Dan Douglas
On Wednesday, November 28, 2012 07:23:17 PM Nikolai Kondrashov wrote:
> @(a&!(b))

This is the syntax ksh93 already uses. So far nobody else has adopted it, but 
the equivalent as you already mentioned is the transformation to:

!(!(...)|!(...))
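For instance, "starts with a AND ends with z" written through that double negation (extglob required; the helper name is mine):

```shell
#!/usr/bin/env bash
shopt -s extglob

# a* AND *z, via De Morgan: NOT( NOT(a*) OR NOT(*z) )
both() { [[ $1 == !(!(a*)|!(*z)) ]]; }

both abcz && echo "abcz: match"
both abc  || echo "abc: no match (doesn't end in z)"
both bcz  || echo "bcz: no match (doesn't start with a)"
```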

It's just a matter of implementing it. Other handy matching features still 
missing are the non-greedy modifier for pattern-lists, arbitrary quantifiers, 
and grouping for patterns (already supported for ERE only with BASH_REMATCH, 
but there's no .sh.match equivalent.) 

Of these, I think the non-greedy modifier and {n,m} quantifiers for patterns 
would be my priorities because ERE doesn't support either non-greedy matching 
or negative assertions, while only ERE supports custom quantification and 
grouping. There's also no way to use ERE for globbing, only for pattern 
matching, without also adding the ~(...) syntax, which would be a huge 
undertaking. 
-- 
Dan Douglas



Re: Why can't I say "&>&3"? Bug or feature?

2012-12-06 Thread Dan Douglas
On Thursday, December 06, 2012 11:48:09 AM Tim Friske wrote:
> Hi folks,
> 
> why is it that I can't say:
> 
> exec 3>/dev/null
> echo foobar &>&3
> # Error: "-bash: syntax error near unexpected token `&'"
> 
> but the following works:
> 
> echo foobar &>/dev/null
> echo foobar >&3 2>&3
> 
> I think the succinct notation "&>&N" where N is some numbered file
> descriptor should work also. Is this behavior a bug or feature?
> 
> 
> Cheers,
> Tim
> --
> `°<
> C92A E44E CC19 58E2 FA35 4048 2217 3C6E 0338 83FC

dash and ksh interpret that syntax as "background the previous list element 
and apply >&3 to the next command", which I tend to think is most correct. 
mksh appears to do as you suggest. Bash fails to parse it.

I don't like &> to begin with. It makes the already cryptic redirection syntax 
that beginners struggle to understand even more confusing by adding a 
pointless shortcut with a non-obvious meaning instead of just being explicit. 
If you don't understand the copy descriptor and all of a sudden see yet 
another use for the & character to the left of a redirection operator, you're 
going to be even more confused.
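For comparison, the shorthand next to the explicit form it abbreviates (a sketch; temp-file handling is arbitrary):

```shell
#!/usr/bin/env bash

emit() { echo out; echo err >&2; }

f1=$(mktemp) f2=$(mktemp)
emit >"$f1" 2>&1    # the explicit, portable spelling
emit &>"$f2"        # the bash shorthand for the same thing

cmp -s "$f1" "$f2" && echo "identical"
rm -f "$f1" "$f2"
```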
-- 
Dan Douglas



Requesting an alternate nameref feature

2012-12-12 Thread Dan Douglas
Hello. Could we possibly modify or create an additional variant of "typeset -n"
which produces "real" references rather than just dynamic references by name?
In other words, I'd like to be able to create reference variables that always
point to the instance of a variable that was visible at the time the reference
was created, similar to the way ksh93's nameref works.

While the current nameref implementation is tremendously valuable in writing
functions that manipulate non-local arrays, it does very little else that
couldn't already be done with Bash's indirect parameter expansion, or to solve
the encapsulation problem.

 $ bash+ -c 'function f { typeset -n y=$1; typeset x=bar; echo "$y"; }; x=foo; 
f x'
 bar

 $ mksh -c 'function f { typeset -n y=$1; typeset x=bar; echo "$y"; }; x=foo; f 
x'
 bar

 $ ksh -c 'function f { typeset -n y=$1; typeset x=bar; echo "$y"; }; x=foo; f 
x'
 foo
 
I can't think of a reason this couldn't coexist with dynamic scope in
principle, with some modification. For instance, Bash won't require a check
that forces variable names to be passed through the positional parameters as in
ksh.

This feature would have similarities to "declare -g" in its ability to
tunnel around overloaded variable names in outer scopes, except would allow
both reading and writing to any scope from any deeper scope (provided the
reference itself hasn't been covered up). This would be extremely useful for
shell libraries.

--
Dan Douglas



Re: shouldn't /+(??) capture 2 letter files only?

2012-12-13 Thread Dan Douglas
On Thursday, December 13, 2012 07:23:11 PM gregrwm wrote:
> i wanted to move a bunch of files & directories, all except a certain
> few, so i figured i'd use !(this|or|that).  so first i looked to see
> if +(this|or|that) isolated what i expected.  well perhaps i don't
> understand what it's supposed to do..  shouldn't /+(??) capture 2
> letter files only?
> 
> $  echo /+()
> /boot /home /proc /root /sbin
> $  echo /+(???)
> /bin /dev /etc /lib /mnt /opt /run /srv /sys /tmp /usr /var
> $  echo /+(??)
> /b1 /boot /c6 /e1 /home /initrd.img /lost+found /nu /pl /pm /proc /px
> /ql /root /sbin

The +() pattern is equivalent to ()+ in ERE. It means "one or more of any 
member of the pattern list". IMO "?" is more confusing because you'd probably 
guess that it means "zero or one" as in the regex quantifier, but it's 
actually the same as ".", meaning exactly one of anything.

So, +(??) actually matches strings with even length, while +(???) matches 
those with odd length. +(?) matches any string with at least one character, 
and any number of ?'s matches multiples of that length.
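A quick check of the even-length behavior in bash (extglob must be enabled; the helper name is mine):

```shell
#!/usr/bin/env bash
shopt -s extglob

# +(??) means "one or more pairs of characters": even, nonzero length
evenlen() { [[ $1 == +(??) ]]; }

evenlen ab   && echo "ab: matches"
evenlen abcd && echo "abcd: matches"
evenlen abc  || echo "abc: no match (odd length)"
evenlen ''   || echo "empty: no match (+ needs at least one occurrence)"
```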

 $ ksh -c 'printf %R\\n \?'
^.$
 $ ksh -c 'printf %R\\n "+(?)"'
^(.)+$
 $ ksh -c 'printf %R\\n "+(??)"'
^(..)+$

-- 
Dan Douglas



Re: shouldn't /+(??) capture 2 letter files only?

2012-12-13 Thread Dan Douglas
On Thursday, December 13, 2012 09:25:02 PM DJ Mills wrote:
> +(???) matches lengths that are multiples of 3, not all odd-length files.
> ?+(??) would match odd-length files.

My bad :)
-- 
Dan Douglas



Re: Question about the return value of 'local'

2012-12-14 Thread Dan Douglas
On Friday, December 14, 2012 08:37:02 AM Francis Moreau wrote:
> On Thu, Dec 13, 2012 at 3:19 PM, Chet Ramey  wrote:
> > On 12/13/12 3:56 AM, Francis Moreau wrote:
> >
> >> I see thanks.
> >>
> >> Somehow I thought that help(1) would have given nothing more nothing
> >> less than what was described in the manual.
> >
> > `help' is a quick reference -- a handy shortcut.  The authoritative
> > documentation is still the manual page and texinfo document.
> 
> Then maybe an option should be added to 'local' to display the full
> description that one can get from the manual, or maybe change the
> behaviour of '-m' switch ?
> 
> Thanks.

The best you could do (realistically) is manually keep the man document in 
sync with the help text for every individual builtin. Generating help output 
automatically would require completely changing the way builtin options are 
processed, because there aren't just arrays of options that could be mapped to 
descriptions. Bash loops over a condition for all available options for each 
argument. There are also a couple intentionally undocumented options (like 
declare -c), and some which can vary by how bash was built (like echo). Also 
the man document has all the formatting in it and can't be automatically 
generated from individual builtin help text easily, or vice versa.

Zsh is way bigger than Bash and has no help system at all (unless I missed it 
in the dozen or so manpages...). Ksh has an unbelievably stupid way of 
accessing the help, though it tends to be even more comprehensive than the 
manpage. The options are automatically generated and the descriptions 
hardcoded to a central builtins.c file. (user-defined types are self-
documenting).

Most shell manuals follow about the same overall format and obviously borrow 
from one another. Some paragraphs are word-for-word identical between Bash and 
multiple other manuals. Best bet is to learn to navigate it quickly.

-- 
Dan Douglas



Re: RFE: printf '%(fmt)T' prints current time by default

2012-12-14 Thread Dan Douglas
On Friday, December 14, 2012 09:57:11 AM Chet Ramey wrote:
> > > I think the ksh behavior is makes more sense so can we use the current 
> > > time
> > > as the default?
> > > 
> > > -Clark
> > 
> > I agree that a null or empty argument as equivalent to -1 is a better 
> > default. 
> > "0" is identical to the current behavior for empty/unset, so no 
> > functionality 
> > is lost.
> 
> That's not unreasonable.  The current default is what Posix specifies for
> printf:
> 
> Any extra c or s conversion specifiers shall be evaluated as if a null
> string argument were supplied; other extra conversion specifications
> shall be evaluated as if a zero argument were supplied. 

Ooh ok... hrm I didn't consider it's actually consistent with everything else 
this way. 
-- 
Dan Douglas



Some segfaults possible via mapfile callbacks

2013-01-09 Thread Dan Douglas
Hi. These were easy for me to reproduce in various versions.

Export to mapfile the variable to be assigned, then run any callback:
$ printf '%s\n' {a..z} | bash -xc 'a= mapfile -tc1 -C : a'
+ a=
+ mapfile -tc1 -C : a
++ : 0 a
Segmentation fault

Set any variable, then unset mapfile's variable. (takes a few iterations):
$ printf '%s\n' {a..z} | bash -xc 'b=; mapfile -tc1 -C "unset -v a; :" 
a'
+ b=
+ mapfile -tc1 -C 'unset -v a; :' a
++ unset -v a
++ : 0 a
++ unset -v a
++ : 1 b
++ unset -v a
++ : 2 c
Segmentation fault

Or modify its type to anything but an indexed array for a faster failure:
$ printf '%s\n' {a..z} | bash -xc 'b=; mapfile -tc1 -C "unset -v a; a=; 
:" a'
+ b=
+ mapfile -tc1 -C 'unset -v a; a=; :' a
++ unset -v a
++ a=
++ : 0 a
Segmentation fault

Or indirectly via nameref:
$ printf '%s\n' {a..z} | bash -xc 'typeset -n x=a; mapfile -tc1 -C 
"unset -v x; :" a'
+ typeset -n x=a
+ mapfile -tc1 -C 'unset -v x; :' a
    ++ unset -v x
++ : 0 a
Segmentation fault

There were others, mostly to do with modifying the variable being mapped.
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Reverse redirection / assignment order

2013-01-09 Thread Dan Douglas
When expanding simple commands, steps 3 and 4 are reversed unconditionally for
all command types and number of words expanded, even in POSIX mode.
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01

The exceptions allowed by POSIX appear to only apply to ksh93. Other shells
always use the POSIX order, except Bash, which never uses the POSIX order,
though the manpage description is the same as POSIX.

#!/usr/bin/env bash

# 1) no command expanded, 2) special builtin, 3) regular builtin.
tst() {
"$sh" -c 'x=$(printf 2 >&2) ${1+"$1"} <&0$(printf 1 >&2)' _ "$@"
} 2>&1

for sh in {,{b,d}a,po,{,m}k,z}sh bb; do
printf '%-4s: %s %s %s\n' "$sh" "$(tst)" "$(tst :)" "$(tst true)"
done

Out:
sh  : 21 21 21 # bash posix mode
bash: 21 21 21 # normal mode
ksh : 21 21 12 # ksh93 is the other oddball shell
dash: 12 12 12 # ...
...# Everything else same as dash

I don't know why this order was chosen or what the advantages to one over the 
other might be.
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Short list of issues with various expansions and IFS

2013-01-09 Thread Dan Douglas
I'll lump a few issues together here...

 1. Backslash escape adjacent to "$@" expansion separates words.

$ set -- abc def ghi; printf '<%s> ' "123 $@ 456"; echo
<123 abc> <def> <ghi 456>
$ set -- abc def ghi; printf '<%s> ' "123 $@\ 456"; echo
<123 abc> <def> <ghi\> <456>

Other shells don't do this (though it might even be useful if the backslash
were removed from the result). Depends on the previous fix for
4.2-p36

 2. IFS side-effects don't take effect during expansion.

It isn't clear to me which of these are correct. Bash's interpretation is
unique.

for sh in bb {{d,b}a,po,{m,}k,z}sh; do
printf '%-5s ' "${sh}:"
"$sh" /dev/fd/0
done <<\EOF
${ZSH_VERSION+:} false && emulate sh
set -f -- a b c
unset -v IFS
printf '<%s> ' ${*}${IFS=}${*}${IFS:=-}"${*}"
echo
EOF

bb  :  
dash:  
bash: 
posh:
mksh:
ksh :  
zsh :  

 3. Another IFS oddity via "command"

IFS can be given "two values at once" through the environment of a
redirection.

 $ ( set -- 1 2 3; IFS=foo; IFS=- command cat <<<"${IFS} ${*}" )
foo 1-2-3
 $ ( set -- 1 2 3; IFS=foo; IFS=- cat <<<"${IFS} ${*}" )
foo 1f2f3

 4. Command substitution in redirection evaluated twice on error.

Reproduce by:

$ { <$(echo $RANDOM >&3); } 3>&2 2>/dev/null
25449
8520

There is a 1-liner patch to redir.c for this which might help demonstrate
(thanks to Eduardo A. Bustamante López ). Skipping this
branch when expandable_redirection_filename is false bypasses a huge chunk
that I haven't looked at closely.

diff --git a/redir.c b/redir.c
index 921be8c..f248fd7 100644
--- a/redir.c
+++ b/redir.c
@@ -153,7 +153,7 @@ redirection_error (temp, error)
 }
 }
 #endif
-  else if (expandable_redirection_filename (temp))
+  else if (0)
 {
 expandable_filename:
   if (posixly_correct && interactive_shell == 0)

--
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: Segmentation fault in arithmetical expression when mixing array variables.

2013-01-09 Thread Dan Douglas
On Wednesday, January 09, 2013 10:15:31 AM Eduardo A. Bustamante López wrote:
> Hi!
> 
> I found an issue while using array variables in an arithmetical
> context. I tried to determine where the problem was, but I didn't
> understand expr.c. The backtrace points to expr.c's line 556, in
> expassing. I tested both the master and devel branches.
> 
> 
> 
---
> Script
> 
---
> #!/bin/bash
> 
> echo "$BASH_VERSION"
> echo $(( a=(y[0] + y[1]) & 0xff, b=(y[2] + y[3]) & 0xff, a << 8 | b))
> 
---

With 5 minutes of experimenting, it occurs here any time there is more than 
one assignment in an expression where the first refers to an array index.

$ ( y=(1 2); (( _ = y, _ = 1 )) )# No error
$ ( y=(1 2); (( _ = y[0], _ = 1 )) ) # crash
Segmentation fault
$ ( y=(1 2); (( _ = y[0] )) ) # No error

lvalue doesn't matter. It's just any two assignments in which the first 
dereferences an array with an index given.
-- 
Dan Douglas



Assignment errors with no additional words expanded in non-POSIX mode fails to abort

2013-01-11 Thread Dan Douglas
Whether or not this type of error aborts depends upon there being an actual 
newline.

$ bash -c 'echo pre; foo=$((8#9)); echo post' 2>&1
pre
bash: 8#9: value too great for base (error token is "8#9")

$ bash -c $'echo pre\nfoo=$((8#9))\necho post' 2>&1
pre
bash: line 1: 8#9: value too great for base (error token is "8#9")
post

Only applies to non-POSIX mode.
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-11 Thread Dan Douglas
Bash treats the variable as essentially undefined until given at least an 
empty value.

$ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
1,
bash: line 0: typeset: x: not found
$ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
0,
typeset -i x

Zsh implicitly gives integers a zero value if none are specified and the
variable was previously undefined. Either the ksh or zsh ways are fine IMO.

Also I'll throw this in:

$ arr[1]=test; [[ -v arr[1] ]]; echo $?
1

This now works in ksh to test if an individual element is set, though it 
hasn't always. Maybe Bash should do the same? -v is tricky because it adds 
some extra nuances to what it means for something to be defined...

-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


printf %q represents null argument as empty string.

2013-01-11 Thread Dan Douglas
$ set --; printf %q\\n "$@"
''

printf should perhaps only output '' when there is actually a corresponding
empty argument, else eval "$(printf %q ...)" and similar may give different 
results than expected. Other shells don't output '', even mksh's ${var@Q} 
expansion. Zsh's ${(q)var} does.
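The two cases side by side:

```shell
#!/usr/bin/env bash

# With zero arguments the format string is still processed once, with a
# null string supplied for %q -- indistinguishable from one empty argument:
set --
printf '%q\n' "$@"    # prints: ''
set -- ""
printf '%q\n' "$@"    # prints: ''
```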
-- 
Dan Douglas

signature.asc
Description: This is a digitally signed message part.


Re: printf %q represents null argument as empty string.

2013-01-11 Thread Dan Douglas
On Friday, January 11, 2013 09:39:00 PM John Kearney wrote:
> Am 11.01.2013 19:38, schrieb Dan Douglas:
> > $ set --; printf %q\\n "$@"
> > ''
> >
> > printf should perhaps only output '' when there is actually a corresponding
> > empty argument, else eval "$(printf %q ...)" and similar may give different
> > results than expected. Other shells don't output '', even mksh's ${var@Q}
> > expansion. Zsh's ${(q)var} does.
> 
> that is not a bug in printf %q
> 
> it what you expect to happen with "${@}" 
> should that be 0 arguments if $# is 0.
> 
> I however find the behavior irritating, but correct from the description.
> 
> to do what you are suggesting you would need a special case handler for this
> "${@}" as oposed to "${@}j" or any other variation.
> 
> 
> what I tend to do as a workaround is
> 
> printf() {
> if [ $# -eq 2 -a -z "${2}" ];then
> builtin printf "${1}"
> else
> builtin printf "${@}"
> fi
> }
> 
> 
> or not as good but ok in most cases something like
> 
> printf "%q" ${1:+"${@}"}
> 
> 

I don't understand what you mean. The issue I'm speaking of is that printf %q 
produces a quoted empty string both when given no args and when given one 
empty arg. A quoted "$@" with no positional parameters present expands to zero 
words (and correspondingly for "${arr[@]}"). Why do you think "x${@}x" is 
special? (Note that expansion didn't even work correctly a few patchsets ago.)

Also as pointed out, every other shell with a printf %q feature disagrees with 
Bash. Are you saying that something in the manual says that it should do 
otherwise? I'm aware you could write a wrapper, I just don't see any utility 
in the default behavior.
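To make the ambiguity concrete (my own comparison, not from the report above): the zero-argument and one-empty-argument cases are indistinguishable in the output.

```shell
# With no arguments the format is reused once with a null string, so bash
# prints '' -- exactly what it prints for one genuinely empty argument.
bash -c 'printf "%q\n" "$@"' _          # no args   -> ''
bash -c 'printf "%q\n" "$@"' _ ''       # one empty -> ''
bash -c 'printf "%q\n" "$@"' _ foo      # one arg   -> foo
```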
-- 
Dan Douglas



Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-11 Thread Dan Douglas
On Friday, January 11, 2013 09:48:32 PM John Kearney wrote:
> Am 11.01.2013 19:27, schrieb Dan Douglas:
> > Bash treats the variable as essentially undefined until given at least an 
> > empty value.
> >
> > $ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> > 1,
> > bash: line 0: typeset: x: not found
> > $ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> > 0,
> > typeset -i x
> >
> > Zsh implicitly gives integers a zero value if none are specified and the
> > variable was previously undefined. Either the ksh or zsh ways are fine IMO.
> >
> > Also I'll throw this in:
> >
> > $ arr[1]=test; [[ -v arr[1] ]]; echo $?
> > 1
> >
> > This now works in ksh to test if an individual element is set, though it 
> > hasn't always. Maybe Bash should do the same? -v is tricky because it adds 
> > some extra nuances to what it means for something to be defined...
> >
> 
> Personally I like the current behavior, disclaimer I use nounset.
> I see no problem with getting people to initialize variables.

How is this relevant? It's an inconsistency in the way set/unset variables
are normally handled. You don't use variadic functions? Unset variables /
parameters are a normal part of most scripts.

> it is a more robust programming approach.

I strongly disagree. (Same goes for errexit.)

-- 
Dan Douglas



Re: printf %q represents null argument as empty string.

2013-01-11 Thread Dan Douglas
On Friday, January 11, 2013 04:37:56 PM Chet Ramey wrote:
> On 1/11/13 4:05 PM, Dan Douglas wrote:
> 
> > 
> > I don't understand what you mean. The issue I'm speaking of is that printf 
> > %q 
> > produces a quoted empty string both when given no args and when given one 
> > empty arg. A quoted "$@" with no positional parameters present expands to 
> > zero 
> > words (and correspondingly for "${arr[@]}"). Why do you think "x${@}x" is 
> > special? (Note that expansion didn't even work correctly a few patchsets 
> > ago.)
> > 
> > Also as pointed out, every other shell with a printf %q feature disagrees 
> > with 
> > Bash. Are you saying that something in the manual says that it should do 
> > otherwise? I'm aware you could write a wrapper, I just don't see any 
> > utility 
> > in the default behavior.
> 
> This is how bash behaves:
> 
>   The format is reused as necessary to consume all  of  the  argu-
> ments.  If the format requires more arguments than are supplied,
> the extra format specifications behave as if  a  zero  value  or
> null  string,  as  appropriate,  had  been supplied.
> 
> This is how Posix specifies printf to work.  I know it doesn't have %q,
> but bash doesn't really differentiate between %q and %s.
> 
> Chet

Ah, so I'm confusing this with the very same "no argument along with %()T" case 
you pointed out to me earlier... so this would have to be yet another special 
case. Funny that never crossed my mind.
-- 
Dan Douglas



Re: printf %q represents null argument as empty string.

2013-01-11 Thread Dan Douglas
On Saturday, January 12, 2013 02:35:34 AM John Kearney wrote:
> so there is always at least one word or one arg, just because it's "${@}"
> should not affect this behavior.
...
> printf "%q" "${@}"
> becomes
> printf "%q" ""
> 
> which is correct as ''

No, "${@}" doesn't always become at least one word. "$@" can expand to zero or 
more words. It can become nothing, one word, or more, possibly concatenated 
with adjacent words. That's completely beside the point.

BTW, your wrappers won't work. A wrapper would need to implement format string 
parsing in order to determine which argument(s) correspond with %q so that 
they could be removed from the output if not given an argument. It also would 
have to work around handling -v, which can take up either 1 or 2 args, plus 
possibly --. It isn't feasible to wrap printf itself this way.

I just used "$@" as an example of something that can expand to zero words and 
preserves exactly the positional parameters present (possibly none). This is 
important because printf %q is often used to safely create a level of escaping 
that guarantees getting out exactly what you put in.

 $ ksh -c 'f() { cmd=$(printf "%q " "$@"); }; f; eval x=moo "$cmd"; echo "$x"'
moo

Bash yields a "command not found" error for obvious reasons, but it goes 
against the spirit of what %q is supposed to do IMHO. It's a minor detail. 
This isn't even a particularly good example.
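A stripped-down version of the same round-trip in bash alone (my own sketch):

```shell
# Quote zero positional parameters with %q, then eval them back in.
bash -c '
set --                          # zero arguments
quoted=$(printf "%q " "$@")     # bash still emits a quoted empty string
eval "set -- $quoted"
echo "$#"                       # 1, not 0: a spurious empty arg appeared
'
```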

See Chet's previous message for the actual explanation. The reason is to be 
consistent with the defaults for other format specifiers which act like they 
were given a null or zero argument if more formats than arguments are given. 
We already had pretty much this same discussion here:

http://lists.gnu.org/archive/html/bug-bash/2012-12/msg00083.html

It somehow slipped my mind.
-- 
Dan Douglas



Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-12 Thread Dan Douglas
Yes some use -u / -e for debugging apparently. Actual logic relying upon those 
can be fragile of course. I prefer when things return nonzero instead of 
throwing errors usually so that they're handleable.
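By that I mean the usual branch-on-status style (a hypothetical stand-in command, my own sketch):

```shell
# mightfail stands in for any command that can legitimately fail.
mightfail() { return 1; }

if mightfail; then
    echo ok
else
    echo "handled failure, status $?"   # $? still holds mightfail's status
fi
```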
-- 
Dan Douglas



Re: printf %q represents null argument as empty string.

2013-01-12 Thread Dan Douglas
On Friday, January 11, 2013 10:39:19 PM Dan Douglas wrote:
> On Saturday, January 12, 2013 02:35:34 AM John Kearney wrote:
> BTW, your wrappers won't work. A wrapper would need to implement format 

Hrmf, I should have clarified that I only meant a complete printf wrapper would 
be difficult. A single-purpose workaround is perfectly fine, e.g.
printq() { ${1+printf %q "$@"}; }; ... which is probably something like what 
you meant. Sorry for the rant.

-- 
Dan Douglas



Re: Reverse redirection / assignment order

2013-01-13 Thread Dan Douglas
On Sunday, January 13, 2013 04:54:59 PM Chet Ramey wrote:
> On 1/9/13 2:00 PM, Dan Douglas wrote:
> > When expanding simple commands, steps 3 and 4 are reversed unconditionally 
> > for
> > all command types and number of words expanded, even in POSIX mode.
> > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01
> 
> True.  Bash has always behaved like this.  It's not clear what practical
> consequences it has, other than being technically non-standard, since
> redirections do not have access to variables set in the temporary
> environment.

Yeah... I think I noticed this in trying to figure out what goes on with the
environment of a redirection. There's that "command" thing with IFS a few mails
back. And then there are some other leaky bits. I ran a bunch of experiments
months ago while trying to document the expansion process in more detail.

Just one example still with the most recent patch:

$ bash -c 'x=1; x=2 true <&0$(eval echo \$x $x >&2)'
2 1

I think you solved some problems in the last patch (the IFS in a process
substitution in a redirect glitch). And some probably have to do with all this
business:
http://article.gmane.org/gmane.comp.standards.posix.austin.general/1927 (I had
to re-read that discussion several times.)

-- 
Dan Douglas



Re: "$(echo "x'" '{1, 2}')" performs brace expansion, even though it should not

2013-01-14 Thread Dan Douglas
On Monday, January 14, 2013 01:57:14 PM Marco Elver wrote:
> echo "$(echo "x'" '{1,2}')"

This was fixed in git a while back, but not backported to 4.2. I believe this:

   8/21
   
command.h
- W_NOBRACE: new word flag that means to inhibit brace expansion

subst.c
- brace_expand_word_list: suppress brace expansion for words with
      W_NOBRACE flag

-- 
Dan Douglas



Re: about one feature in V4.2 against V4.1

2013-01-15 Thread Dan Douglas
On Tuesday, January 15, 2013 11:51:28 AM Greg Wooledge wrote:
> On Tue, Jan 15, 2013 at 11:30:39AM -0500, DJ Mills wrote:
> > I believe that's referring to var=value command, as in the syntax to export
> > a variable into "command"s environment.
> > 
> > readonly a=3
> > a=2 echo foo
> 
> I thought that was what it meant, too, but I couldn't reproduce the "bug"
> that it was claiming to fix.
> 
> imadev:~$ bash-4.2.37 -posix -c 'readonly a=3; a=2 echo foo; echo survived'
> bash-4.2.37: a: readonly variable
> foo
> survived
> 
> imadev:~$ bash-4.1.9 -posix -c 'readonly a=3; a=2 echo foo; echo survived'
> bash-4.1.9: a: readonly variable
> foo
> survived
> 
> imadev:~$ bash-4.1.9 -posix -c 'readonly a=3; a=2 /bin/echo foo; echo 
> survived'
> bash-4.1.9: a: readonly variable
> foo
> survived

 $ bash --posix -c 'readonly a=3; a=2 true; echo survived'
bash: a: readonly variable
survived
 $ bash --posix -c 'readonly a=3; a=2 :; echo survived'
bash: a: readonly variable
 $ bash --posix -c 'readonly a=3; a=2 command :; echo survived'
bash: a: readonly variable
survived
 $
 $ ( n=10 x= POSIXLY_CORRECT=; while printf "${x:-true} "; ((n--)); do x= 
${x:-true ${x:=:}}; done )
true : true : true : true : true : true
 $  ( n=20 x=: POSIXLY_CORRECT=; while printf %s "${x:--) }"; ((n--)); do x= 
${x:-true ${x:=:}}; done )
:-) :-) :-) :-) :-) :-) :-) :-) :-) :-)
-- 
Dan Douglas



Re: about one feature in V4.2 against V4.1

2013-01-15 Thread Dan Douglas
Oops, never mind -- I see the issue now. Couldn't reproduce here either, 
neither with compat modes nor the real versions.
-- 
Dan Douglas



Re: |& in bash?

2013-01-18 Thread Dan Douglas
On Thursday, January 17, 2013 06:53:26 PM John Caruso wrote:
> In article , Chet Ramey 
wrote:
> > On 1/17/13 1:01 PM, John Caruso wrote:
> >> One feature of other shells (e.g. zsh and tcsh) I'd really love to have
> >> in bash is "|&", which redirects both stdout and stderr--basically just
> >> a shortcut for "2>&1 |".  Has this ever been considered for bash?
> > 
> > That has been in bash since bash-4.0.
> 
> I'm simultaneously happy and chagrined :-) (most of the servers I manage
> are on bash 3.x, so I hadn't encountered it).  Thanks.
> 
> - John

In scripts it breaks POSIX, conflicts with the coproc operator in kshes, and 
applies the redirections in an unintuitive order since the same operator 
redirects stdout first, then applies the stderr redirect after other 
redirections. It isn't very common to dump multiple streams into one pipe. I 
suggest avoiding |&.
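For reference, the two spellings side by side (my own comparison; |& requires bash >= 4.0):

```shell
# |& is shorthand for 2>&1 |; both send stderr into the pipe.
bash -c 'echo out; echo err >&2' 2>&1 | sort
bash -c "bash -c 'echo out; echo err >&2' |& sort"
```

Both pipelines produce the same sorted two lines, which is the whole point: the explicit `2>&1 |` form is portable and unambiguous.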
-- 
Dan Douglas



Re: |& in bash?

2013-01-19 Thread Dan Douglas
On Saturday, January 19, 2013 02:47:38 PM Chet Ramey wrote:
> On 1/18/13 4:10 PM, Dan Douglas wrote:
> 
> > In scripts it breaks POSIX, conflicts with the coproc operator in kshes,
> > applies the redirections in an unintuitive order since the same operator 
> > redirects stdout first, then applies the stderr redirect after other 
> > redirections. It isn't very common to dump multiple streams into one pipe.
> > I suggest avoiding |&.
> 
> It doesn't `break' Posix.  Posix doesn't say anything about it, so it's
> simply undefined.

Not necessarily, but I know what you meant.

An operator consisting of a mash-up of syntax that has another meaning 
elsewhere isn't necessarily well-formed. Perhaps somebody did find that it's 
ok. e.g. the `;|' zsh/mksh case-delimiter is a similar-looking extension.

Syntax in which the only way to write certain compatibility wrappers is to 
shield it from parsers through an eval is the least desirable way to extend a 
language IMO.

> It's syntactic sugar.  A parser macro, if you will.

That makes sense.
-- 
Dan Douglas



Re: Short list of issues with various expansions and IFS

2013-01-29 Thread Dan Douglas
Hey thanks for all the nice explanations. I'm sure I'll reference this in the 
future.

> >  3. Another IFS oddity via "command"
> > 
> > IFS can be given "two values at once" through the environment of a 
> > redirection.
> 
> I have to look at this one.  It's clear that the temporary environment
> given to `command' is like the temp environment supplied to `eval', and
> needs to persist through all of the commands executed by `command'.  I
> have to figure out whether that temporary environment counts as the temp
> environment to `cat' (I don't think so) and how to reconcile the variable
> lookups during redirection expansion.

I think at least the variable should be accessible to builtins or functions 
run by `command' (if not cat). Maybe you meant it doesn't actually get 
exported to non-builtins? In this case, the redirect should be applying to the 
`command' command, so the outer environment is what applies to the redirect 
just like any other normal command (I think).

I forgot to mention also that related to this issue, Bash is the only shell 
that does word-splitting on here-strings in the first place. The others treat 
the words expanded after the redirect as in an unquoted here-document, except 
with quote-removal despite having no wordsplitting or globbing (while in a 
heredoc, there's no quote removal either). So Bash for instance will trim 
whitespace down to 1 space if you don't quote herestring contents, while 
others don't, and treat the herestring almost exactly like the word following 
`in' in a case..esac.

There's also a documentation bug related to this:

"The word following the redirection operator in the following descriptions, 
unless otherwise noted, is subjected to brace expansion, tilde expansion, 
parameter expansion, command substitution, arithmetic expansion, quote  
removal,  pathname expansion, and word splitting. If it expands to more than 
one word, bash reports an error."

Here-strings are of course an exception to the last sentence, and possibly an 
exception to parts of the previous sentence (here-strings are defined in terms 
of the here-document, which might make the issue not so straightforward).
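At least the quoted case is dependable across bash versions (my own check; the unquoted behavior is the version- and shell-dependent part described above):

```shell
# Quoted herestring: contents reach the command intact (bash adds a newline).
bash -c '
s="a   b   *"
out=$(cat <<<"$s")
[ "$out" = "$s" ] && echo preserved
'
```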

-- 
Dan Douglas



More fun with IFS

2013-01-29 Thread Dan Douglas
Hi everyone, and welcome to another edition of IBOTD (IFS-bug-of-the-day), 
featuring everyone's favorite Bourne shell kludge: word-splitting!

On today's episode - inconsistencies within assignments that depend upon 
quoting. Though I can't take credit for discovering this -- it was pointed out 
to me by some guys on IRC after demonstrating some other stuff.

And a quick test:

function expassign {
typeset -a a
a=("$@")
typeset var asn

while IFS= read -r asn; do
IFS=: command eval "$asn"
printf '%-14s... %s\n' "$asn" "$var"
done <<\EOF
var=${a[*]}
var="${a[*]}"
var=$*
var="$*"
var=${a[@]}
var="${a[@]}"
var=$@
var="$@"
EOF
}

${ZSH_VERSION+:} false && emulate ksh
expassign one:::two three:::four

Bash output:  # I think...
var=${a[*]}   ... one   two three   four  # bad
var="${a[*]}" ... one:::two:three:::four  # good
var=$*        ... one:::two:three:::four  # good
var="$*"      ... one:::two:three:::four  # good
var=${a[@]}   ... one   two three   four  # bad
var="${a[@]}" ... one:::two three:::four  # good
var=$@        ... one   two three   four  # bad
var="$@"      ... one:::two three:::four  # good

Zsh and pdkshes produce:

one:::two:three:::four

For all of the above, which I think is wrong for the last 4. ksh93 produces:

one:::two three:::four

for the last 4, which I think is correct.

-- 
Dan Douglas



Re: More fun with IFS

2013-01-29 Thread Dan Douglas
On Wednesday, January 30, 2013 02:00:26 AM Chris F.A. Johnson wrote:
> On Wed, 30 Jan 2013, Dan Douglas wrote:
> 
> > Hi everyone, and welcome to another edition of IBOTD (IFS-bug-of-the-day),
> > featuring everyone's favorite Bourne shell kludge: word-splitting!
> >
> > On today's episode - inconsistencies within assignments that depend upon
> > quoting. I can't take credit for discovering this -- it was pointed out
> > to me by some guys on IRC after demonstrating some other stuff.
> >
> > And a quick test:
> >
> > function expassign {
> > typeset -a a
> > a=("$@")
> > typeset var asn
> >
> > while IFS= read -r asn; do
> > IFS=: command eval "$asn"
> > printf '%-14s... %s\n' "$asn" "$var"
> > done <<\EOF
> > var=${a[*]}
> > var="${a[*]}"
> > var=$*
> > var="$*"
> > var=${a[@]}
> > var="${a[@]}"
> > var=$@
> > var="$@"
> > EOF
> > }
> >
> > ${ZSH_VERSION+:} false && emulate ksh
> > expassign one:::two three:::four
> >
> > Bash output:  # I think...
> > var=${a[*]}   ... one   two three   four  # bad
> 
> Looks good to me. It expands to multiple words, just as an unquoted
> $* would do.

No, $* always expands to a single word. If multiple words result, those are 
the result of field-splitting, not an intrinsic multi-word expansion as in the 
case of $@. Though POSIX says very little about the unquoted cases.

Secondly, at least one of these would almost have to be wrong unless it 
somehow makes sense for $* and ${a[*]} to be treated differently (it doesn't).

Third, field-splitting doesn't occur within assignments so quoting shouldn't 
matter here.

I'll grant that POSIX is extremely unclear about whether and when multiple 
words are actually the result of a multi-word expansion, or a field splitting 
step. It gets much worse when it comes to the alternate-value expansions where 
it's then additionally unclear whether and how word-splitting and/or the 
implicit multi-wordness of $@ occur recursively when nested (or not). Almost 
all shells disagree on that, but I need to do more research for those cases 
because it's hard to test what's going on. ksh93 is especially bizarre in its 
nested expansions.
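The single-word nature of quoted $* is easy to see directly (my own illustration):

```shell
bash -c '
set -- one:::two three:::four
IFS=:
printf "[%s]\n" "$*"    # one word, joined on the first character of IFS
printf "[%s]\n" "$@"    # one word per parameter, contents untouched
'
```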

> > var="${a[*]}" ... one:::two:three:::four  # good
> > var=$*... one:::two:three:::four  # good
> > var="$*"  ... one:::two:three:::four  # good
> > var=${a[@]}   ... one   two three   four  # bad
> 
> As above.

No, the ::: shouldn't be removed here. The * and @ cases are separate issues 
(I should have been clear that there are multiple problems going on).

> > var="${a[@]}" ... one:::two three:::four  # good
> > var=$@        ... one   two three   four  # bad
> 
> Ditto.
> 
> > var="$@"  ... one:::two three:::four  # good
> >
> > Zsh and pdkshes produce:
> >
> > one:::two:three:::four
> >
> > For all of the above, which I think is wrong for the last 4. ksh93 
produces:
> >
> > one:::two three:::four
> >
> > for the last 4, which I think is correct.
> >
> >
> 
> 
-- 
Dan Douglas



Re: More fun with IFS

2013-01-30 Thread Dan Douglas
On Wednesday, January 30, 2013 11:35:55 AM Chet Ramey wrote:
> On 1/30/13 2:47 AM, Dan Douglas wrote:
> 
> > No, $* always expands to a single word. If multiple words result, those
> > are 
> > the result of field-splitting, not an intrinsic multi-word expansion as in
> > the 
> > case of $@. Though POSIX says very little about the unquoted cases.
> 
> I haven't looked at the rest of this, but the situation is clearly not as
> absolute as you've phrased it. 
> 
I know, I agree. I've looked at what happens with empty IFSes on $* before, 
which is pretty much the only way I can think of to even test whether the 
words are due to field splitting or parameter expansion. The way bash does 
that is fine. That's probably the least important of these points anyway and I 
wouldn't really expect it to have any impact on these tests involving 
assignments.
-- 
Dan Douglas



Re: builtin "read -d" behaves differently after "set -e#

2013-02-06 Thread Dan Douglas
On Wednesday, February 06, 2013 01:44:04 PM DJ Mills wrote:
> On Tue, Feb 5, 2013 at 6:39 PM, Tiwo W.  wrote:
> 
> > I have seen "read -d '' var" to read multi-line heredocs into
> > shell variables. An empty argument to -d seemed to mean "read
> > up to the end of input". And this is what it does.
> >
> >
> In addition to all of the "don't use set -e" answers you've gotten (which i
> agree with wholeheartedly), I have this to add:
> 
> read -rd '' is synonymous to read -d $'\0'. It doesn't actually mean "read
> until the end of input", but rather "read until a NUL byte is encountered".
> You see this usage a lot as well when reading NUL-delimited data, say
> filenames from find. For example:
> 
> while IFS= read -rd '' file; do
>   some_command "$file"
> done < <(find . -type f -print0)
> 
> So what's actually happening with your here document usage case is that
> read is looking for a NUL byte, but never finds one, so it stops reading
> when EOF is encountered. As Greg mentioned, this then causes read to exit >
> 0. This is a perfectly acceptable usage, but now you know why that happens.
> 
> And to reiterate, STOP USING set -e!

+1 to all of that.

Note $'\0' was used here for illustration and doesn't actually expand to a nul 
byte. To be completely clear, the command: `read -rd '' x' is literally 
receiving the same arguments as the $'\0' case -- it's not only that `read' is 
treating them the same.

Also, if it means anything, sadly ksh93 didn't perform that termination until 
one of the most recent alphas. I take it that this is sort of an extra special 
feature that most shells with -d happen to share, and not merely a necessary 
consequence of -d with an empty arg. 

Hopefully someday `mapfile' will inherit an analogous feature.
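The pattern from the original question, with the exit status made visible (my own sketch, feeding input on stdin rather than a heredoc):

```shell
# -d "" makes read look for a NUL; with none present it consumes everything
# and returns 1 at EOF, but the variable still holds all of the input.
printf 'line1\nline2\n' | bash -c '
IFS= read -rd "" doc
echo "status=$?"     # -> status=1 (EOF reached, no NUL found)
printf %s "$doc"     # all input captured, newlines included
'
```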
-- 
Dan Douglas



Re: cd -e returns syntax error

2013-02-23 Thread Dan Douglas
On Sunday, February 24, 2013 02:43:03 PM Chris Down wrote:
> Hi all,
> 
> Unless I'm misunderstanding how it should work, `cd -P -e' does not work as
> specified by the documentation. From `help cd':

Yep, see: http://lists.gnu.org/archive/html/bug-bash/2013-01/msg00099.html
-- 
Dan Douglas



Re: More fun with IFS

2013-02-26 Thread Dan Douglas
On Sunday, February 24, 2013 10:26:52 PM Thorsten Glaser wrote:
> Dan Douglas dixit:
> 
> >Zsh and pdkshes produce:
> >
> >one:::two:three:::four
> >
> >For all of the above, which I think is wrong for the last 4. ksh93
> >produces:
> 
> Why is it incorrect?

This test was intended to demonstrate expansions within assignment contexts. 
''one:::two'' and ''three:::four'' are separate arguments to begin with. Word 
splitting and pathname expansion shouldn't occur within (ordinary) assignment 
contexts. I think the mksh (and zsh) results are wrong because I can't think 
of any reason it should be inserting a '':'' between the two arguments, 
especially for the ''$@'' variants, either quoted or unquoted. It certainly 
can't be because of a word splitting step.
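The no-splitting rule for ordinary assignments itself is easy to verify (my own example):

```shell
# Ordinary assignment: no field splitting, no globbing, quotes unneeded.
bash -c '
v="a  *  b"
x=$v                        # unquoted RHS, still copied verbatim
[ "$x" = "$v" ] && echo identical
'
```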

It's already been pointed out that different shells interpret unquoted @ and * 
differently. I think Chet's interpretation of the spec is the most reasonable 
(but you could argue otherwise):

http://lists.gnu.org/archive/html/bug-bash/2013-01/msg00180.html

Most script writers treat assignments as identical whether quoted or not. 
Quotes should only be needed to assign whitespace-containing strings and 
$'...' quotes, but shouldn't affect globbing or word splitting or modify the 
expansion in any other way. You'll notice the same thing in the case of 
herestrings.

 $ mksh -c 'set one:::two three:::four; IFS=:; cat <<<$@'
one:::two:three:::four
 $ mksh -c 'set one:::two three:::four; IFS=:; cat <<<"$@"'
one:::two:three:::four
 $ ksh -c 'set one:::two three:::four; IFS=:; cat <<<"$@"'
one:::two three:::four
 $ ksh -c 'set one:::two three:::four; IFS=:; cat <<<$@'
one:::two three:::four
 $ bash -c 'set one:::two three:::four; IFS=:; cat <<<$@'
one   two three   four
 $ bash -c 'set one:::two three:::four; IFS=:; cat <<<"$@"'
one:::two three:::four

I tend to think AT&T ksh is doing the most reasonable thing here by making the 
concatenated result exactly the same as if expanded as arguments in a quoted 
context, with whitespace separating them.
 
> In other words, “don’t do that then” (rely on this behaviour).

I wouldn't bother with this language if the only non-random behavior was that 
specified by POSIX. "POSIX doesn't specify it" is a horrible reason to do 
anything.

> I think eval is evil anyway ;-)

Eval is frowned upon because it's almost always misused. Until shells add 
first-class closures and higher-order functions I'll continue using it.

> (Thanks to ormaaj for pointing out this posting.)

:)

-- 
Dan Douglas



Re: More fun with IFS

2013-02-28 Thread Dan Douglas
On Wednesday, February 27, 2013 01:31:58 PM Thorsten Glaser wrote:
> Why whitespace? $IFS certainly contains none. And the usual
> insertion rules all specify the first character of $IFS and
> specify what to do if $IFS is empty or unset (which it isn’t
> in these examples).

Well, ok then. I'm just nitpicking here. I think this makes sense because it 
distinguishes between $@ and $* when assigning to a scalar, so that the end 
result of $@ is always space-separated, as spaces delimit words during command 
parsing. Your way would make more sense to me if this were the Bourne shell 
where IFS is in charge of both the initial argument splitting and field 
splitting. In this case though it seems strange to use IFS to represent 
separate words.

Consider for example if you ever implement "${@@Q}". Because of this behavior, 
the integrity of the result can only be guaranteed with a default IFS during 
assignment. This can be demonstrated with zsh which implements the same 
expansion (with different syntax) and uses the same assignment rules as mksh.

 $ zsh -s <<\EOF
emulate ksh
typeset -a cmd
cmd=(echo w:x y:z)
IFS=: x=${(q)cmd[@]} # Now we're in trouble
typeset -p x
unset -v IFS # Different problem whether or not we go back to default.
eval $x
EOF

typeset x=echo:w:x:y:z
zsh:1: command not found: echo:w:x:y:z

> Yeah, of course, it’s the only way to do some things… I personally
> usually abstract everything eval into little functions of their
> own and then just use those.
>

I agree that's an excellent strategy. :)

-- 
Dan Douglas



Re: More fun with IFS

2013-03-01 Thread Dan Douglas
On Friday, March 01, 2013 11:49:37 AM Thorsten Glaser wrote:
> Dan Douglas dixit:
> 
> >Well, ok then. I'm just nitpicking here. I think this makes sense because
> >distinguishes between $@ and $* when assigning to a scalar, so that the end 
> >result of $@ is always space-separated, as spaces delimit words during
> […]
> >Consider for example if you ever implement "${@@Q}". Because of this
> 
> Hrm, you do have a point there. Now, how to do it for consistency…
> re-reading the manual snippet:
[…]
> This means “"$*"” needs to be “one:::two:three:::four”, and POSIX
> doesn’t say anything about “$*” or “$@”… the manpage says “separate
> words (which are subjected to word splitting)”, so I’d say we need
> “one:::two:three:::four” for “$*” and thus also “$@”.
> 
> So basically, only “"$@"” needs to change to implement what comes
> out of your point, and the other three cases need to stay the same.
> 
> Do you agree?

In the case of $* I don't really care. It's probably going to differ from Bash 
no matter what because as pointed out earlier in this thread, Bash expands 
unquoted $* to multiple words even with a null IFS, whereas mksh doesn't (if 
multiple words result it's from word splitting). “one:::two:three:::four” 
makes sense for the way mksh is doing it.

For "$@" that sounds about right. I think it would be preferable if x="$@" and 
x=$@ were the same. If a user wants IFS-delimited they should probably use 
x=$* because that's what it's for. Even if you decide not to do that, at least 
there will be one way to get that outcome, and people will just have to know 
that $@ behaves like $* when unquoted even in contexts without word splitting.
-- 
Dan Douglas



Re: More fun with IFS

2013-03-01 Thread Dan Douglas
On Friday, March 01, 2013 01:06:27 PM Thorsten Glaser wrote:
> Hrm, but the docs, both, specifically say that (unquoted) $@ behaves
> like $* except in the face of no arguments, so I cannot do that.
> 
> But thanks for the feedback. My reading differed, but you have a
> point, and the others can be kept as-is.
> 

I think the root of the problem is trying to force unquoted $@ to be like $* 
instead of the other way around. That's how bash (if not for the bug) and 
ksh93 manage to do this while remaining consistent with the spec.
-- 
Dan Douglas



Re: Bug/limitation in 'time'

2013-03-17 Thread Dan Douglas
On Sunday, March 17, 2013 01:09:47 AM William Park wrote:
> On Sat, Mar 16, 2013 at 10:15:50PM -0400, Chris F.A. Johnson wrote:
> > On Sun, 17 Mar 2013, Chris Down wrote:
> > >   ExprCount() {
> > >   for (( i = $1 ; i > 0 ; i-- )); do
> > >   :
> > >   done
> > >   echo "$1 iterations"
> > >   }
> > 
> >Or, in a POSIX-compliant manner:
> > 
> > ExprCount() {
> >   i=$1
> >   while [ $(( i -= 1 )) -ge 0 ]; do
> > :
> >   done
> >   echo Just did $1 iterations using expr math
> > }
> 
> Are you saying that
> 
> for (( ; ; ))
> 
> is not POSIX?

Not only is it not POSIX, but it's rather uncommon (bash, zsh, ksh93 only), 
which is unfortunate because writing the exact equivalent using ''while'' and 
(()) alone is quite ugly. Usually I put my loops within functions, so the 
variable initialization part is handled by a typeset that's needed anyway. 
Certain other aspects are not so easy to emulate cleanly, for instance, 
preventing a redundant increment on the last iteration, and avoiding an 
increment on the first iteration. All the workarounds kind of suck.

The very best alternative to for ((;;)) is to try and work in a $(()) 
somewhere in the loop body and do an increment at the same time.

function f {
typeset n=$1  # localize + initialize
while (( n )); do # end condition
# cmds...
cmd $((n--))  # Hope that there's a convenient spot for $(())
done
}

This construct at least extends the portability to pdksh and probably a few 
others. I usually draw the line at shells that lack typeset and inform people 
to upgrade to something modern.
-- 
Dan Douglas



"typeset +x var" to a variable exported to a function doesn't remove it from the environment.

2013-03-25 Thread Dan Douglas
Hello,

$ function f { typeset +x x; typeset x=123; echo "$x"; sh -c 'echo "$x"'; 
}; x=abc f
123
abc
$ echo "$BASH_VERSION"
4.2.45(1)-release

This is inconsistent with a variable defined and exported any other way. 
(ksh93/mksh/zsh don't have this issue. Dash doesn't actually export the 
variable to the environment in this case, but just "localizes" it, and requires 
a separate export.)

-- 
Dan Douglas



Assignments preceding "declare" affect brace and pathname expansion.

2013-03-25 Thread Dan Douglas
Hi,

$ set -x; foo=bar declare a=( {1..5} )
+ foo=bar
+ declare 'a=(1)' 'a=(2)' 'a=(3)' 'a=(4)' 'a=(5)'
 
$ touch xy=foo
$ declare x[y]=*
+ declare 'x[y]=*'
$ foo=bar declare x[y]=*
+ foo=bar
+ declare xy=foo

This isn't the same bug as the earlier a=([n]=*) issue. Each word (the entire
assignment) is subject to globbing. "let", "eval", and possibly other builtins
appear to randomly use this same kind of expansion regardless of whether they 
are preceded by an assignment, though I can't think of any uses for it. 
Arguments in this case are treated neither as ordinary assignments
nor ordinary expansions.

-- 
Dan Douglas



A few possible process substitution issues

2013-03-25 Thread Dan Douglas
Hello,

 1. Process substitution within array indices.

The difference between (( 1<(2) )) and (( a[1<(2)] )) might be seen as 
surprising.
Zsh and ksh don't do this in any arithmetic context AFAICT.

Fun stuff:

# print "moo"
dev=fd=1 _[1<(echo moo >&2)]=

# Fork bomb
${dev[${dev='dev[1>(${dev[dev]})]'}]}

 2. EXIT trap doesn't fire when leaving a process substitution.

$ ksh -c '[[ -n $(< <(trap "tee /dev/fd/3" EXIT)) ]] 3>&1'

 3. wait on the pid of a process substitution.

$ ksh -c '{ { : <(sleep 1; printf 1 >&2); } 2>&1; wait $!; printf 2; } 
| cat; echo'
12
$ bash -c '{ { : <(sleep 1; printf 1 >&2); } 2>&1; wait $!; printf 2; } 
| cat; echo'
bash: wait: pid 9027 is not a child of this shell
21

At least, this is a confusing error, because that actually is a direct child
of the shell that runs the wait. Process substitutions do set $! in Bash, not 
in Zsh.
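A quick way to see the Bash side of that last point (a sketch; run under
bash): $! is populated by a process substitution even though other shells
leave it alone:

```shell
# After a process substitution, bash records its pid in $!.
: <(true)
pid=$!
echo "procsub pid: $pid"
```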

-- 
Dan Douglas



Re: Assignments preceding "declare" affect brace and pathname expansion.

2013-03-27 Thread Dan Douglas
On Tuesday, March 26, 2013 08:33:52 PM Chet Ramey wrote:

Thank you. I'm familiar with the declaration commands. It's issue 7, not TC1.

> arguments are expanded as normal and then treated as the command
> does its arguments.

Are you saying here that even when a declaration command is _not_ identified, 
that it's still correct for word expansions to not follow the usual rules for 
regular non-declaration commands?

Hopefully my examples were clear. What I don't understand is this:

   # This is correctly recognized
 $ touch 'a=( x )'
 $ declare a=( * )
 $ echo "$a"
 a=( x )

   # This should either be like above, or fail as below.
 $ _= declare a=( * )
 $ echo "$a"
 ( x )

   # This does what I expect for unrecognized declaration commands.
 $ cmd=declare
 $ "$cmd" a=( * ); echo "$a"
 -bash: syntax error near unexpected token `('

If it's true that an assignment prefix causes bash to not recognize 
declaration commands (which is unfortunate IMO), then you would expect the 2nd 
case above to behave like the 3rd. Instead, it neither word-splits nor fails 
on the () metacharacters, but uses a sort of hybrid of the two.

-- 
Dan Douglas



Re: weird problem -- path interpretted/eval'd as numeric expression

2013-03-28 Thread Dan Douglas
Can you whittle this down to the smallest reproducer and post a stand-alone 
synthetic testcase with sample input data that fails?

If the goal is simulating "exported arrays", there are other methods that 
would probably work out better.
-- 
Dan Douglas



Re: [PATCH] Adding support for '--' in builtin echo's option parsing.

2013-04-01 Thread Dan Douglas
On Monday, April 01, 2013 03:32:16 PM Dave Rutherford wrote:
> On Mon, Apr 01, 2013 at 03:16:07PM +0300, Hemmo Nieminen wrote:
> > > Description:
> > > Currently it seems to be impossible to e.g. print "-n" with the
> > > builtin echo without any extra characters.
> > 
> > You should use printf instead.  The echo command is a historical artifact
> > which cannot be used for general-purpose output.
> > 
> > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html says:
> > 
> > The echo utility shall not recognize the "--" argument in the manner
> > specified by Guideline 10 of XBD Utility Syntax Guidelines ; "--"
> > shall be recognized as a string operand.
> 
> Perhaps this is worth adjusting unless POSIXLY_CORRECT?

print(1) is an option for those that want --. AFAIK, it's consistent 
everywhere it's been implemented. There's a (barebones) example loadable, and 
it's quite easy to define a print shim in shell if needed.

Although, comparing the length:

printf %s
print -rn

It's just as easy to type printf.
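A sketch of such a shim (assuming only ksh's -r and -n flags are wanted;
this is not a complete print(1)):

```shell
# Minimal print(1)-style shim: -r disables escape processing, -n drops the
# trailing newline; arguments are joined with spaces as ksh's print does.
print() {
  local opt fmt='%b' nl='\n' OPTIND=1
  while getopts rn opt; do
    case $opt in
      r) fmt='%s' ;;
      n) nl='' ;;
      *) return 2 ;;
    esac
  done
  shift "$((OPTIND - 1))"
  printf "${fmt}${nl}" "$*"
}

print -rn -- "-n"    # prints -n, no trailing newline
echo
print hello world    # prints: hello world
```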
-- 
Dan Douglas



Re: setvalue builtin command

2013-04-04 Thread Dan Douglas
On Wednesday, April 03, 2013 11:53:48 PM konsolebox wrote:
> Hi. I made a post on this before but I haven't got a reply. I actually want
> to know what people think about the idea as I actually find a command like
> this really helpful. Anyone please?
> 
> On Wed, Feb 6, 2013 at 11:30 AM, konsolebox  wrote:
> 
> > Hi. I was wondering if we could add a builtin where we could use it as an
> > alternative for assigning values to a parameter. And thought of a builtin
> > name called setvalue. With it we could assign values to a normal variable,
> > an array, or an associative array.

This is more or less identical to the ksh88 `set -A'. If anything were to be 
added, it would probably be that. I assume Chet preferred enforcing more 
consistent syntax rather than adding something redundant to ksh93-like 
compound assignment syntax.

The primary advantages to set -A are:

 - It's the most portable way to assign multiple elements to an indexed array 
other than a separate assignment for each element.
 - The combination `set -sA' provides a means of sorting (lexicographically). 
Bash currently has no built-in way to sort an array or the positional 
parameters.
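For the record, the usual bash workaround is to round-trip through sort(1).
A sketch, assuming bash 4.4+ for mapfile -d '' and a sort that accepts -z:

```shell
# Sort an array lexicographically via sort -z, using NUL delimiters so
# arbitrary strings (including ones with newlines) survive intact.
a=(banana apple cherry)
mapfile -d '' -t sorted < <(printf '%s\0' "${a[@]}" | sort -z)
printf '%s\n' "${sorted[@]}"
```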
-- 
Dan Douglas



Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Dan Douglas
I couldn't find anything obvious in POSIX that implies which interpretation is
correct. Assuming it's unspecified.

Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me 
think this test should say "no":

x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi

bash: yes
ksh:  no
mksh: no
zsh:  no

However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
depending on "[[ ]]" or "case..esac" (bug?), but otherwise it looks like a
fairly random spread:

x=\\x; case x in $x) echo yes;; *) echo no; esac

bash: yes
ksh:  yes
mksh: no
posh: no
zsh:  no
dash: yes
bb:   no
jsh:  no

18:42:44 jilles: ormaaj, I'm not sure if that's actually a bug
18:43:15 ormaaj: dunno. Bash seems unique in that respect
18:43:23 jilles: you're asking the shell to check if the string x matches the 
pattern stored in the variable x
19:32:51 jilles: freebsd sh and kmk_ash say no, dash says yes
19:33:40 jilles: Bourne shell says no
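In bash at least (where the answer above is "yes"), quoting the expansion
is what forces a literal comparison. A sketch:

```shell
# Unquoted $x is treated as a pattern (bash also honors the backslash
# escape within it); quoted "$x" is compared literally.
x='\x'
if [[ x == $x ]]; then u=match; else u=no-match; fi
if [[ x == "$x" ]]; then q=match; else q=no-match; fi
echo "unquoted: $u, quoted: $q"    # unquoted: match, quoted: no-match
```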

-- 
Dan Douglas



Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Dan Douglas
On Saturday, April 06, 2013 09:24:52 PM Chris Down wrote:
> On 2013-04-06 07:01, Eric Blake wrote:
> > > bb:   no
> > > jsh:  no
> >
> > I haven't heard of these two, but they are also bugs.
> 
> I assume bb is busybox ash.
> 
> Chris

Yes, it's typically a symlink to busybox, which then invokes the shell 
applet. "jsh" is the default binary name produced by the heirloom build, 
though I've seen other names used. 
-- 
Dan Douglas



Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Dan Douglas
On Saturday, April 06, 2013 09:37:44 PM Chet Ramey wrote:
> On 4/6/13 4:48 AM, Dan Douglas wrote:
> > I couldn't find anything obvious in POSIX that implies which
> > interpretation is
> > correct. Assuming it's unspecified.
> > 
> > Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me 
> > think this test should say "no":
> > 
> > x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
> > 
> > bash: yes
> > ksh:  no
> > mksh: no
> > zsh:  no
> > 
> > However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
> > depending on "[[ ]]" or "case..esac" (bug?), but otherwise it looks like a
> > fairly random spread:
> > 
> > x=\\x; case x in $x) echo yes;; *) echo no; esac
> 
> These two cases should not be different.  They undergo the same expansions,
> except that the conditional command adds quote removal, which doesn't
> matter in this case.  In both cases, you ask the pattern matching code
> whether or not the string `x' matches the pattern `\x'.
> 
> You invoke the same pattern matching code on the same patterns, why would
> you not get the same answer?

I expect they should be the same. I just noticed the discrepancy with ksh93 
and wondered what gives.

The original question I had in mind is: Is the quoting state of any part of a 
pattern determined lexically prior to expansions, or are any quotes/escapes 
within parts of pattern words that were generated by unquoted expansions re-
interpreted as quotes by the pattern matcher? I had always thought the former, 
but now it looks to me like all these shells are saying "no" because they 
interpret the expanded words for quoting to determine which parts of the 
pattern should be literal. This appears to even apply to pathname expansion.

 $ touch '\foo'
 $ ksh -c 'x=\\f* IFS=; printf %s\\n $x'
 \foo
 $ bash -c 'x=\\f* IFS=; printf %s\\n $x'
 \f*

I'm surprised different implementations are all across the board on this.
--
Dan Douglas



Re: Very slow pattern substitution in parameter expansion

2013-04-09 Thread Dan Douglas
Erm, here it is in a less unreadable format:

#!/usr/bin/env bash
typeset -a a=(
curl --header 'Host: v33.veehd.com'
--header 'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0'
--header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
--header 'Accept-Language: en-US,en;q=0.5'
--header 'Accept-Encoding: gzip, deflate'
--header 'Referer: http://veehd.com/vpi?h=NDcwNjgyNXw0ODB8ODQ2Ljh8ZGl2eHwzfDUwMDB8MTM2NTUzNzQ1OHwxNTJ8MXw0NTNmOTA3NDY1Yjg3ZmM5MjI0MTI$'
--header 'Cookie: __utma=163375675.149191626.1365449922.1365465732.1365537650.4;__utmz=163375675.1365465732.3.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided);__utmb=163375675.1.10.1365537650; __utmc=163375675'
--header 'Connection: keep-alive'
'http://v33.veehd.com/dl/45c2f8a516118e29917ff154fee0179e/1365544663/5000.4706825.avi&b=390'
-o '4706825.avi'
-L
)

time : "${a[*]//[0-9]/z}"
time : "${a[*]//+([0-9])/z}"

I can't reproduce using the above code.

 $ bash ./testcases/substperf
real    0m0.000s
user    0m0.000s
sys     0m0.000s

real    0m0.000s
user    0m0.000s
sys     0m0.000s

As an aside, don't store commands in variables.
http://mywiki.wooledge.org/BashFAQ/050
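The FAQ's recommendation, roughly: keep one argument per array element and
expand with "${cmd[@]}". A sketch with a harmless command standing in for
curl:

```shell
# Each element is exactly one argument, so embedded spaces survive.
cmd=(printf '%s\n' one 'two words' three)
"${cmd[@]}"
```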

-- 
Dan Douglas


