Re: heredoc and subshell

2016-02-23 Thread Jilles Tjoelker
On Wed, Feb 24, 2016 at 01:07:44AM +0300, Oleg Bulatov wrote:
> While trying to minimize some shell code, I found a non-obvious corner
> case with heredocs and subshells.

> Does POSIX specify how the following code should be parsed? dash's
> output for this code differs from that of bash and zsh.

> --- code
> prefix() { sed -e "s/^/$1:/"; }
> DASH_CODE() { :; }
> 
> prefix A <<XXX && echo "$(prefix B <<XXX
> echo line 1
> XXX
> echo line 2)" && prefix DASH_CODE <<DASH_CODE
> echo line 3
> XXX
> echo line 4)"
> echo line 5
> DASH_CODE

> --- bash 4.3.42 output:
> A:echo line 3
> B:echo line 1
> line 2
> DASH_CODE:echo line 4)"
> DASH_CODE:echo line 5

> --- dash 0.5.8 output:
> A:echo line 1
> B:echo line 2)" && prefix DASH_CODE <<DASH_CODE
> B:echo line 3
> line 4
> line 5

I think POSIX is clear that the bash/zsh behaviour is correct and the
dash behaviour is wrong. In XCU 2.6.3 Command Substitution, it says:

] With the $(command) form, all characters following the open
] parenthesis to the matching closing parenthesis constitute the
] command.

Therefore, the shell should not start reading the here-document
belonging to  prefix A 

Re: heredoc and subshell

2016-02-23 Thread Eric Blake
On 02/23/2016 03:49 PM, Eric Blake wrote:
> [adding the Austin Group]
> 
> On 02/23/2016 03:07 PM, Oleg Bulatov wrote:
>> Hello,
>>
>> While trying to minimize some shell code, I found a non-obvious corner
>> case with heredocs and subshells.
> 
> Thanks for a cool testcase.
> 
>>
>> Does POSIX specify how the following code should be parsed? dash's output
>> for this code differs from that of bash and zsh.
> 
> XCU 2.3 says:
> 
> When an io_here token has been recognized by the grammar (see Shell
> Grammar), one or more of the subsequent lines immediately following the
> next NEWLINE token form the body of one or more here-documents and shall
> be parsed according to the rules of Here-Document.
> 
> and 2.7.4 says:
> 
> The here-document shall be treated as a single word that begins after
> the next <newline> and continues until there is a line containing only
> the delimiter and a <newline>, with no <blank> characters in between.
> Then the next here-document starts, if there is one.
> 
> but with no mention of what happens if you somehow manage to make the
> next <newline> be part of an incomplete shell word on the line
> containing the here-doc operator.

As it is, all shells I tested have a shorter test case that proves they
don't always start looking for the heredoc body after the first newline:

$ dash -c 'cat <

> Maybe we need a defect against the standard that says behavior is
> unspecified if the next <newline> after a here-doc operator occurs in
> the middle of a shell word.

Or maybe refine the wording to state the first unescaped newline, since
backslash escaping seems to consistently work (and only newlines inside
incomplete command substitution is where the confusion begins).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



Re: heredoc and subshell

2016-02-23 Thread Eric Blake
[adding the Austin Group]

On 02/23/2016 03:07 PM, Oleg Bulatov wrote:
> Hello,
> 
> While trying to minimize some shell code, I found a non-obvious corner
> case with heredocs and subshells.

Thanks for a cool testcase.

> 
> Does POSIX specify how the following code should be parsed? dash's output
> for this code differs from that of bash and zsh.

XCU 2.3 says:

When an io_here token has been recognized by the grammar (see Shell
Grammar), one or more of the subsequent lines immediately following the
next NEWLINE token form the body of one or more here-documents and shall
be parsed according to the rules of Here-Document.

and 2.7.4 says:

The here-document shall be treated as a single word that begins after
the next <newline> and continues until there is a line containing only
the delimiter and a <newline>, with no <blank> characters in between.
Then the next here-document starts, if there is one.

but with no mention of what happens if you somehow manage to make the
next <newline> be part of an incomplete shell word on the line
containing the here-doc operator.

> 
> --- code
> prefix() { sed -e "s/^/$1:/"; }
> DASH_CODE() { :; }
> 
> prefix A <<XXX && echo "$(prefix B <<XXX
> echo line 1
> XXX
> echo line 2)" && prefix DASH_CODE <<DASH_CODE
> echo line 3
> XXX
> echo line 4)"
> echo line 5
> DASH_CODE
> 
> --- bash 4.3.42 output:
> A:echo line 3
> B:echo line 1
> line 2
> DASH_CODE:echo line 4)"
> DASH_CODE:echo line 5

So, it looks like bash is interpreting this as "first newline that is
not in the middle of another shell word": it parses the entire $(...)
construct through line 2 as if there were no newlines, then treats the
newline after DASH_CODE as starting the heredoc, outputting A: with
line 3 as the lone line in that heredoc.  Then it moves on to the
second command in the && sequence, processing the command substitution
(a heredoc outputting line 1, then the output of line 2), and finally
to the third component of the && sequence, a heredoc delimited by
DASH_CODE, with both lines 4 and 5 output with the DASH_CODE: prefix.

> 
> --- dash 0.5.8 output:
> A:echo line 1
> B:echo line 2)" && prefix DASH_CODE <<DASH_CODE
> B:echo line 3
> line 4
> line 5
> 

Meanwhile, dash is taking the literal first newline as the start of the
first heredoc, and outputting A: with line 1; then consuming the next
heredoc as lines 2 and 3 before finding the end of the command
substitution on line 4, then outputting line 5 on its own and doing
nothing else for the DASH_CODE function call.

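ksh 93u+ 2012-08-01 behaves differently again:

B:echo line 1
line 2 && prefix DASH_CODE <

Maybe we need a defect against the standard that says behavior is
unspecified if the next <newline> after a here-doc operator occurs in
the middle of a shell word.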
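
Until the wording is settled, a script can sidestep the ambiguity by
completing the command substitution before the line that carries the outer
here-doc operator. A minimal sketch, assuming the same prefix helper as in
the test case (the variable inner and the function demo are names made up
for this example):

```sh
prefix() { sed -e "s/^/$1:/"; }

# Finish the command substitution, including its own here-document, first.
inner=$(prefix B <<XXX
echo line 1
XXX
echo line 2)

# The line with the outer << operator now contains only complete words,
# so every reading of "the next newline" agrees on where the body starts.
demo() {
    prefix A <<XXX && printf '%s\n' "$inner"
echo line 3
XXX
}
demo
```

Written this way, no newline falls inside an unfinished shell word on a
line that still has a pending here-document, which is the only situation
where the shells above disagree.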



heredoc and subshell

2016-02-23 Thread Oleg Bulatov
Hello,

While trying to minimize some shell code, I found a non-obvious corner
case with heredocs and subshells.

Does POSIX specify how the following code should be parsed? dash's output
for this code differs from that of bash and zsh.

--- code
prefix() { sed -e "s/^/$1:/"; }
DASH_CODE() { :; }

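prefix A <<XXX && echo "$(prefix B <<XXX
echo line 1
XXX
echo line 2)" && prefix DASH_CODE <<DASH_CODE
echo line 3
XXX
echo line 4)"
echo line 5
DASH_CODE

--- bash 4.3.42 output:
A:echo line 3
B:echo line 1
line 2
DASH_CODE:echo line 4)"
DASH_CODE:echo line 5

--- dash 0.5.8 output:
A:echo line 1
B:echo line 2)" && prefix DASH_CODE <<DASH_CODE
B:echo line 3
line 4
line 5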

Re: [BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Eric Blake
On 02/23/2016 02:00 PM, Harald van Dijk wrote:
> 
> I was under the impression that the intent from the dash side was to
> handle all commands the same, and that impression was based on the fact
> that the . command has received additional code to handle -- even though
> there's no requirement for that. However, looking into the original bug
> report that prompted that change in more detail I see that the standard
> will very likely require support for -- in the . command in the future,
> so that doesn't hold up.

Here's the link for dot and exec supporting --:
http://austingroupbugs.net/view.php?id=252

> 
> If that intent isn't there (I'm not saying it's not; I'm unsure now),
> the list of utilities that should be extended is far smaller, if I'm not
> overlooking anything:
> - alias
> - getopts
> - type
> - exec?
> - local?

Weird that unalias already works.  Oh, because of 'unalias -a'.  I
didn't spot any others that you missed (doesn't mean there aren't any,
just that I didn't spot them).

> 
> exec is like .: there's currently no requirement to support --, but that
> requirement is likely to come in the future.

See the above link; exec must support -- if '.' does.  I also found
http://austingroupbugs.net/view.php?id=163 which confirms that 'eval' is
neither required to recognize -- nor prevented from doing so.  There's also
http://austingroupbugs.net/view.php?id=960 which mentioned the exit
status of export and several other special builtins, but added no
requirements related to --.

> 
> local is currently non-standard and it's hard to guess whether it will
> require support for -- if standardised.

If standardized, I expect it to require support for --, on the grounds
that 'local -r' already has meaning in bash, so local is definitely a
candidate for taking options.



Re: [BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Harald van Dijk

On 23/02/2016 20:33, Eric Blake wrote:

> exit - fuzzy.  exit is a special built-in (unlike getopts); and XCU 2.14
> states:
>
>  "Some of the special built-ins are described as conforming to XBD
> Utility Syntax Guidelines. For those that are not, the requirement in
> Utility Description Defaults that "--" be recognized as a first argument
> to be discarded does not apply and a conforming application shall not
> use that argument. "
>
> Conforming apps cannot expect 'exit -1' to work, and therefore, cannot
> also expect 'exit -- -1' to work, since the only standards-defined
> value for an argument to exit is a non-negative decimal integer less
> than 256.  Of course, if you want to fix it along with all the others,
> that's fine; I'm just pointing out that 'exit' isn't broken as-is.


I was under the impression that the intent from the dash side was to 
handle all commands the same, and that impression was based on the fact 
that the . command has received additional code to handle -- even though 
there's no requirement for that. However, looking into the original bug 
report that prompted that change in more detail I see that the standard 
will very likely require support for -- in the . command in the future, 
so that doesn't hold up.


If that intent isn't there (I'm not saying it's not; I'm unsure now), 
the list of utilities that should be extended is far smaller, if I'm not 
overlooking anything:

- alias
- getopts
- type
- exec?
- local?

exec is like .: there's currently no requirement to support --, but that 
requirement is likely to come in the future.


local is currently non-standard and it's hard to guess whether it will 
require support for -- if standardised.


Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Eric Blake
On 02/23/2016 12:21 PM, Harald van Dijk wrote:
> On 23/02/2016 19:58, Eric Blake wrote:
>> On 02/23/2016 11:44 AM, Harald van Dijk wrote:
>>
>>> This matches bash's behaviour, aside from bash requiring -- to prevent
>>> detection of invalid flags to the alias command:
>>>
>>> bash-4.3$ alias -- -=true
>>
>> Then dash DOES have a bug:
> 
> Indeed, I wasn't trying to suggest otherwise, my apologies if it came
> across that way. It's not limited to the alias command though, I spotted
> at least the exit and getopts commands having the same problem, and it
> should probably be fixed for all of them at once.

getopts - definitely needs a fix
exit - fuzzy.  exit is a special built-in (unlike getopts); and XCU 2.14
states:

 "Some of the special built-ins are described as conforming to XBD
Utility Syntax Guidelines. For those that are not, the requirement in
Utility Description Defaults that "--" be recognized as a first argument
to be discarded does not apply and a conforming application shall not
use that argument. "

Conforming apps cannot expect 'exit -1' to work, and therefore, cannot
also expect 'exit -- -1' to work, since the only standards-defined
value for an argument to exit is a non-negative decimal integer less
than 256.  Of course, if you want to fix it along with all the others,
that's fine; I'm just pointing out that 'exit' isn't broken as-is.
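
As a rough illustration of that constraint, a script that wants to stay
within what the standard defines could validate the value before handing
it to exit. This is only a sketch; the function name is_portable_exit_arg
and the scratch variable n are made up for this example:

```sh
# Hypothetical helper: succeed only if $1 is a non-negative decimal
# integer less than 256, the only argument values exit is defined for.
is_portable_exit_arg() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;  # empty, or contains a non-digit such as '-'
    esac
    n=${1#"${1%%[!0]*}"}            # strip leading zeros (n is a global scratch variable)
    [ "${#n}" -le 3 ] && [ "${n:-0}" -lt 256 ]
}

if is_portable_exit_arg 255; then echo "255: ok"; fi
if ! is_portable_exit_arg -1; then echo "-1: not portable"; fi
```

The length check comes before the numeric comparison so that a very long
digit string is rejected without being passed to the test utility.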



Re: [BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Harald van Dijk

On 23/02/2016 19:58, Eric Blake wrote:

> On 02/23/2016 11:44 AM, Harald van Dijk wrote:
>
>> This matches bash's behaviour, aside from bash requiring -- to prevent
>> detection of invalid flags to the alias command:
>>
>> bash-4.3$ alias -- -=true
>
> Then dash DOES have a bug:


Indeed, I wasn't trying to suggest otherwise, my apologies if it came 
across that way. It's not limited to the alias command though, I spotted 
at least the exit and getopts commands having the same problem, and it 
should probably be fixed for all of them at once.


Cheers,
Harald van Dijk


Re: [BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Eric Blake
On 02/23/2016 11:44 AM, Harald van Dijk wrote:

> This matches bash's behaviour, aside from bash requiring -- to prevent
> detection of invalid flags to the alias command:
> 
> bash-4.3$ alias -- -=true

Then dash DOES have a bug:

# dash
$ alias -- -='echo hi'
alias: -- not found
$ echo $?
1
$ -
hi
$

POSIX XCU 1.4 is clear:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html

"Default Behavior: When this section is listed as "None.", it means that
the implementation need not support any options. Standard utilities that
do not accept options, but that do accept operands, shall recognize "--"
as a first argument to be discarded."

and alias takes operands, stating "OPTIONS: None.", which means POSIX
_requires_ 'alias -- -=name' to (attempt to) define only the single
alias '-', and NOT to also attempt to define '--' as an alias.

It's okay if dash allows 'alias -=blah' to define '-' as an alias as an
extension, but it MUST ignore '--' the way bash does.
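
A quick way to probe a given shell for this is sketched below. The alias
name gr_probe and the variable alias_dashdash are arbitrary names chosen
for the example, and the probe runs in a subshell so that nothing stays
defined afterwards:

```sh
# Hypothetical probe: does this shell's alias discard a leading "--"?
# A shell that instead treats "--" as an alias name to look up is
# expected to fail here, as in the dash transcript above.
if (alias -- gr_probe=true) 2>/dev/null; then
    alias_dashdash=supported
else
    alias_dashdash=unsupported
fi
echo "alias --: $alias_dashdash"
```

This relies only on the exit status of alias, so it does not depend on
whether alias expansion is enabled in non-interactive shells.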



Re: [BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Harald van Dijk

On 23/02/2016 19:18, Jan Verbeek wrote:

> Function definitions that use a bad function name (such as "-" and "=")
> are accepted if the function name already exists as an alias. For example:
>
> $ -
> dash: 1: -: not found
> $ - () { echo hello; }
> dash: 2: Syntax error: Bad function name
> $ -
> dash: 2: -: not found
> $ alias -=true
> $ -
> $ - () { echo hello; }
> $ -
> hello
> $


After alias -=true, - () { echo hello; } is treated as a use of that 
alias. It doesn't define a function with a name of -, it defines a 
function with a name of true, which consists only of valid characters.


$ alias -=true
$ -() { echo hello; }
$ type -
- is an alias for true
$ type true
true is a shell function
$ true
hello

This matches bash's behaviour, aside from bash requiring -- to prevent 
detection of invalid flags to the alias command:


bash-4.3$ alias -- -=true
bash-4.3$ -() { echo hello; }
bash-4.3$ type -
- is aliased to `true'
bash-4.3$ type true
true is a function
true ()
{
    echo hello
}
bash-4.3$ true
hello

Cheers,
Harald van Dijk


Re: [BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Eric Blake
On 02/23/2016 11:18 AM, Jan Verbeek wrote:
> Function definitions that use a bad function name (such as "-" and "=")
> are accepted if the function name already exists as an alias. For example:

Not necessarily a bug.

> 
> $ -
> dash: 1: -: not found
> $ - () { echo hello; }
> dash: 2: Syntax error: Bad function name
> $ -
> dash: 2: -: not found
> $ alias -=true
> $ -

This is equivalent to running 'true'.

> $ - () { echo hello; }

This is equivalent to running 'true () { echo hello; }' - the alias
expansion happens BEFORE the function definition is even parsed.  You
are NOT defining a function named '-', but one named 'true'.

> $ -

This is again equivalent to running 'true' - except that now the
function name 'true' exists and bypasses the shell builtin.

> hello
> $

So the only thing remaining is to determine if it is legal to have a
function override the name of a regular shell builtin.  But
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01
under "Command Search and Execution" states that function names have
priority over regular built-ins (so yes, creating a function named
'true' is doable, although stupid).
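
A minimal sketch of that precedence rule, with no alias involved (the
variable out is just a name for this example):

```sh
# Define a function with the same name as the regular built-in 'true'.
true() { echo hello; }

# Command search finds the function before the built-in.
out=$(true)
echo "$out"

# Remove the function again so 'true' goes back to the built-in.
unset -f true
```

This prints "hello", which is the same effect the reporter saw after the
alias had turned the definition of '-' into a definition of 'true'.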




[BUG] Illegal function names are accepted after being used as aliases

2016-02-23 Thread Jan Verbeek
Function definitions that use a bad function name (such as "-" and "=") 
are accepted if the function name already exists as an alias. For example:


$ -
dash: 1: -: not found
$ - () { echo hello; }
dash: 2: Syntax error: Bad function name
$ -
dash: 2: -: not found
$ alias -=true
$ -
$ - () { echo hello; }
$ -
hello
$