Re: [PATCH] Make mktokens accept a random TMPDIR, replace `...` with $(...).
On 11/15/18 9:32 AM, Devin Hussey wrote:
> From b9724fc82eda2b0d164c33ad3e871d38b298d1ad Mon Sep 17 00:00:00 2001
> From: Devin Hussey
> Date: Thu, 15 Nov 2018 10:30:05 -0500
> Subject: [PATCH] Make mktokens accept a random TMPDIR, replace `...` with $(...).
>
> Sorry about the multiple commits at once.
>
> Signed-off-by: Devin Hussey
> ---
>  src/mktokens | 17 ++--
>  1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/src/mktokens b/src/mktokens
> index cd52241..ec801cc 100644
> --- a/src/mktokens
> +++ b/src/mktokens
> @@ -37,7 +37,10 @@
>  # token marks the end of a list. The third column is the name to print in
>  # error messages.
>
> -cat > /tmp/ka$$ <<\!
> +# set TMPDIR if it isn't already
> +[ -z "${TMPDIR}" ] && TMPDIR="/tmp"

Shorter as:

: "${TMPDIR:=/tmp}"

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
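For readers unfamiliar with the idiom suggested in the reply, here is a minimal sketch of how the `:=` expansion behaves. The function name `tmp_root` is hypothetical and is not part of mktokens.

```shell
# Hypothetical helper: print the directory a script would use for temp files.
tmp_root() {
	# ':' is the no-op builtin. The expansion assigns /tmp only when
	# TMPDIR is unset or empty; otherwise TMPDIR is left untouched.
	: "${TMPDIR:=/tmp}"
	printf '%s\n' "$TMPDIR"
}
```

The colon in `:=` is what makes an empty TMPDIR count the same as an unset one, which matches the `[ -z "${TMPDIR}" ]` test in the patch.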
Re: Unexpected behaviour: double backslash in single quotes
On 09/06/2018 04:40 AM, Joshua Phillips wrote:
> Escape sequences don't work in single quotes:
>
> $ echo 'hello\world'
> hello\world
> $ echo 'hello\'

Warning. Use of 'echo' and backslashes is non-portable. There are two classical behaviors:

1. backslashes are not special to echo unless you pass -e, so you also have to have -n to elide a trailing newline (this is the behavior of bash by default)

2. backslashes ARE special by default, so you don't need -e; and \c exists to elide a trailing newline, so you don't need -n (this is the behavior of dash by default, and the behavior required by POSIX; bash can also be configured to run in this mode via 'set -o posix; shopt -s xpg_echo')

> Which makes it surprising that double backslashes get converted to single backslashes:
>
> $ echo 'hello\\world'
> hello\world
>
> Is this intended behaviour?

Yes. dash is obeying the POSIX-mandated behavior, and interpreting \ sequences by default. Since \w is not a known sequence, dash cheats and outputs \ as-is instead of giving you an error (although an error would be friendlier at reminding you that \ is active-by-default in dash). But since \\ is a known sequence, it gets interpreted by echo.

> Bash behaves as I would have expected.

Rather, bash in its default mode does what you are used to, but violates POSIX. Bash in the mode that I mentioned above (set -o posix; shopt -s xpg_echo) behaves like dash.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
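A small sketch of how a script can sidestep both echo behaviors described above by using printf. The helper names `say` and `say_n` are hypothetical.

```shell
# Hypothetical helpers built on printf: only the format string is scanned
# for backslash escapes, never the data passed through %s.
say() {
	# %s copies the argument byte for byte; \n in the format adds the newline.
	printf '%s\n' "$1"
}

say_n() {
	# Same, but without a trailing newline (no need for echo -n or \c).
	printf '%s' "$1"
}
```

With these, `say 'hello\\world'` should print both backslashes in dash and in bash alike, since neither shell's echo is involved.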
Re: test bug?
On 02/19/2018 06:53 AM, Yuriy Vostrikov wrote:
> Hello,
>
> Is this expected behavior?
>
> $ cd /tmp
> $ mkdir foo
> $ cd foo/
> $ touch a
> $ /usr/bin/test a -nt b; echo $?
> 0
> $ /bin/bash -c 'test a -nt b; echo $?'
> 0
> $ /bin/dash -c 'test a -nt b; echo $?'
> 1

Yes. -nt is not specified by POSIX, and the behavior of -nt when one of the two operands does not exist can make sense under multiple interpretations (treat a missing file as a silent error, where both 'a -nt b' and 'b -nt a' fail with status 1 [dash]; treat a missing file as always newer, because once you make it exist it will have a newer timestamp [not sure if anyone does that]; treat a missing file as a hard error with message to stderr and status 2 [not sure if anyone does that]; treat a missing file as always older, perhaps because you use the default timestamp of Jan 1 1970 when interpreting all 0's for any file that fails to stat [bash, coreutils]). The same problem of multiple interpretations also applies to -ot.

At any rate, I don't see it as a bug in dash, so much as your script making non-portable assumptions about non-standardized behavior.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

--
To unsubscribe from this list: send the line "unsubscribe dash" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
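One way a script might avoid depending on any of those interpretations is to check for existence itself and then use find's -newer primary. This is only a sketch: the name `is_newer` is hypothetical, and it assumes the first path does not begin with a dash (which find would take for an option).

```shell
# Hypothetical helper: succeed only when both files exist and $1 is newer.
is_newer() {
	# Decide the missing-file case explicitly instead of leaving it to test.
	[ -e "$1" ] && [ -e "$2" ] || return 1
	# -prune keeps find from descending if $1 is a directory; the path is
	# printed only when its modification time is later than that of $2.
	[ -n "$(find "$1" -prune -newer "$2")" ]
}
```

Here a missing operand always yields a non-zero status, which is the dash interpretation; a script that wants the bash one would change the first line of the function instead of relying on the shell.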
Re: echo "\\1"?
On 07/27/2017 10:10 AM, Bosco wrote:
> That script of zziplib isn't mine, I only had to compile it once
> because it was necessary for compile other program (TeX Live).
>
> I'm not talking about POSIX, and I don't mind what it said. I'm
> talking about the man page of dash, that said:
>
> when \\ is reached is replaced by \.

When \\ is reached AS THE ARGUMENT to echo.

> Then, in the command
> echo
> because \\ is reached first, then it will be replaced by '\'

No, you are demonstrating a gap in your understanding of shell quoting rules.

echo \\\\
echo "\\\\"
echo '\\'
echo '\'"\\"

are all the same way to pass the two-character argument \\ to echo. That two-character argument is a valid escape sequence, which in turn means echo outputs a single \ character then a newline.

> character, immediately after that another \\ is reached, then it will
> be replaced by another '\' character. It turns out the output '\\'.

If you want two \ as output, you have to pass four characters (not two) to echo, so your input has to be one of these (or other) valid quotings:

echo '\\\\'
echo "\\\\\\\\"
echo \\'\'"\\"\\'\'"\\"\\'\'

etc.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
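The quoting point in this reply can be checked without involving echo at all, by printing each argument through printf '%s'. A minimal sketch; `show_arg` and `demo_quoting` are hypothetical names.

```shell
# Hypothetical helper: print the first argument exactly as received.
show_arg() {
	printf '%s\n' "$1"
}

# Four spellings of the same two-character argument (two backslashes):
# unquoted, double-quoted, single-quoted, and a mix of both quote styles.
demo_quoting() {
	show_arg \\\\
	show_arg "\\\\"
	show_arg '\\'
	show_arg '\'"\\"
}
```

Running `demo_quoting` prints the same two backslashes four times. That shows the shell has already reduced every spelling to one argument before the command sees it; whatever echo then does with that argument is a separate step.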
Re: echo "\\1"?
On 07/27/2017 08:13 AM, Bosco wrote: > On 27 July 2017 at 12:54, Eric Blake <ebl...@redhat.com> wrote: >> Which man pages? Echo is one of those programs that varies widely, and >> you are MUCH better off using printf(1) instead of echo(1) if you are >> trying to get newline suppression, trying to print something that might >> begin with -, or trying to print something that might contain \. > > Sorry, maybe I did't explain it correctly, I mean the man pages of the > dash source: > https://git.kernel.org/pub/scm/utils/dash/dash.git/tree/src/dash.1#n1202 > > And because of this, I got an error compiling zziplib, you may see > https://github.com/gdraheim/zziplib/blob/v0.13.67/configure#L17542 Eww - storing generated files in git - that forces everyone that checks out your project to use the EXACT same version of autotools to avoid changing the generated files unintentionally. Looking at those lines: > if test -f $ac_prefix_conf_INP ; then > echo "s/^#undef *\\([ABCDEFGHIJKLMNOPQRSTUVWXYZ_]\\)/#undef > $ac_prefix_conf_UPP""_\\1/" > conftest.prefix ac_prefix_conf_INP is not defined anywhere in autoconf 2.69 sources (and you really shouldn't use the ac_ prefix if you are writing code that is not part of autoconf proper). I couldn't find mention of it at https://github.com/gdraheim/zziplib/blob/v0.13.67/configure.ac, but it may be in one of your other included files. Can you pinpoint which part of your configure.ac results in that part of the generated configure file? In all likelihood, you are using a buggy macro that is using autoconf primitives incorrectly, and thus resulting in non-portable code. But without seeing the true source, I can't help you debug your problem. >> Arguably, since it is not required by POSIX, we don't have to do it. But >> I also can't argue that POSIX forbids us to support \1 as an extension >> (it says nothing about whether implementations can have additional >> escape sequences). So I'll argue that it is intentional as a dash >> extension. 
But if you can make dash smaller by getting rid of the >> extension, that might be an acceptable patch. > > In that case, I think, the man page of dash should be modified with > that extension. Indeed, or the fact that it is NOT documented means that it is an unintentional bug for providing the extension. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: echo "\\1"?
On 07/27/2017 07:23 AM, Bosco wrote: > According the man pages, Which man pages? Echo is one of those programs that varies widely, and you are MUCH better off using printf(1) instead of echo(1) if you are trying to get newline suppression, trying to print something that might begin with -, or trying to print something that might contain \. > for echo command, "\\" should print '\' > character, and \0digits should print the byte in octal base. > But the command > > echo "\\1" This is the same as echo '\1' which is NOT defined by POSIX as being a valid escape sequence that echo must recognize. (Did you mean to test echo '\\1' instead?) Here's the POSIX list of required escape sequences: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html > > outputs the byte 0x01 in hexadecimal (or 001 in octal). > Is this a bad behavior or is intentional? Arguably, since it is not required by POSIX, we don't have to do it. But I also can't argue that POSIX forbids us to support \1 as an extension (it says nothing about whether implementations can have additional escape sequences). So I'll argue that it is intentional as a dash extension. But if you can make dash smaller by getting rid of the extension, that might be an acceptable patch. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: Parameter expansion, patterns and fnmatch
On 09/02/2016 09:46 AM, Herbert Xu wrote:
> On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote:
>>
>> 2.13.1 Patterns Matching a Single Character
>>
>> [
>> If an open bracket introduces a bracket expression as in XBD RE
>> Bracket Expression, except that the <exclamation-mark> character ( '!' )
>> shall replace the <circumflex> character ( '^' ) in its role in a
>> non-matching list in the regular expression notation, it shall introduce
>> a pattern bracket expression. A bracket expression starting with an
>> unquoted <circumflex> character produces unspecified results. Otherwise,
>> '[' shall match the character itself.
>
> BTW, this last sentence is not present in
>
> http://pubs.opengroup.org/onlinepubs/009604499/utilities/xcu_chap02.html#tag_02_13

That's the 2004 edition (TC1 of the 2001 spec, aka Issue 6).

> So I presume it's a newer unreleased revision.

Newer but released (TC2 of the 2008 spec, aka Issue 7):

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13_01

The requirement has been there for 8 years now.

> Seriously, you guys are turning POSIX into a joke by introducing
> all these new requirements. At this point I think we should
> pretty much give up on POSIX compliance the way it's headed.

I hope you're just stating that out of frustration, and not something that you actually intend to follow through with. And if there is a requirement being considered in the Austin Group that you disagree with, please speak up on the Austin Group - membership is free.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: Parameter expansion, patterns and fnmatch
On 09/02/2016 09:29 AM, Herbert Xu wrote: > On Fri, Sep 02, 2016 at 09:25:15AM -0500, Eric Blake wrote: >> >>>> This also affects >>>> >>>> case [a in [?) echo ok ;; *) echo bad ;; esac >>>> >>>> which should print ok. >>> >>> Even ksh prints bad here. >> >> So ksh is also buggy. > > Good luck writing a script with an unquoted [ expecting it to be > portable :) [ '' ] || echo empty There, I just wrote a portable script with unquoted [ portably interpreted as itself and not as a bracket filename expansion pattern. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: Parameter expansion, patterns and fnmatch
On 09/02/2016 09:04 AM, Herbert Xu wrote:
>> Yes, this looks like a bug in dash. With the default --disable-fnmatch
>> code, when dash encounters [ in a pattern, it immediately treats the
>> following characters as part of the set. If it then encounters the end
>> of the pattern without having seen a matching ], it attempts to reset
>> the state and continue as if [ was treated as a literal character right
>> from the start.
>
> POSIX says:
>
> 9.3.3 BRE Special Characters
>
> A BRE special character has special properties in certain contexts. ...
> An expression containing a '[' that is not preceded by a backslash
> and is not part of a bracket expression produces undefined results.

Ah, but POSIX also says:

2.13.1 Patterns Matching a Single Character

[
If an open bracket introduces a bracket expression as in XBD RE Bracket Expression, except that the <exclamation-mark> character ( '!' ) shall replace the <circumflex> character ( '^' ) in its role in a non-matching list in the regular expression notation, it shall introduce a pattern bracket expression. A bracket expression starting with an unquoted <circumflex> character produces unspecified results. Otherwise, '[' shall match the character itself.

So while a lone '[' is undefined in a normal BRE, it is well-defined in a shell filename pattern matching context. Since a lone '[' does not introduce a bracket expression, it MUST be treated as a literal '[', so ${foo#[} MUST strip the leading [ from the contents of foo, without requiring that the [ be quoted.

>> This also affects
>>
>> case [a in [?) echo ok ;; *) echo bad ;; esac
>>
>> which should print ok.
>
> Even ksh prints bad here.

So ksh is also buggy.

> I would however consider a patch that simplifies the code in the
> undefined case.

Except that it is well-defined by POSIX, not undefined.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
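Until every shell agrees on the unquoted case discussed in this message, a script can get the same effect by quoting the bracket. A minimal sketch with hypothetical helper names:

```shell
# Hypothetical helper: strip one leading '[' from its argument.
strip_open_bracket() {
	# The backslash makes '[' an ordinary character in the pattern, so the
	# result does not depend on how a shell treats an unmatched bracket.
	printf '%s\n' "${1#\[}"
}

# Hypothetical helper: succeed when the argument starts with '['.
starts_with_bracket() {
	case $1 in
	'['*) return 0 ;;
	*) return 1 ;;
	esac
}
```

This sidesteps the disagreement and does not settle it: the unquoted forms `${foo#[}` and `case [a in [?)` remain the cases where, per the thread, dash and ksh differ from what POSIX describes.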
Re: heredoc and subshell
[adding the Austin Group]

On 02/23/2016 03:07 PM, Oleg Bulatov wrote:
> Hello,
>
> trying to minimize a shell code I found an unobvious moment with heredocs and
> subshells.

Thanks for a cool testcase.

> Is it specified by POSIX how next code should be parsed? dash output for this
> code differs from bash and zsh.

XCU 2.3 says:

When an io_here token has been recognized by the grammar (see Shell Grammar), one or more of the subsequent lines immediately following the next NEWLINE token form the body of one or more here-documents and shall be parsed according to the rules of Here-Document.

and 2.7.4 says:

The here-document shall be treated as a single word that begins after the next <newline> and continues until there is a line containing only the delimiter and a <newline>, with no <blank> characters in between. Then the next here-document starts, if there is one.

but with no mention of what happens if you somehow manage to make the next <newline> be part of an incomplete shell word on the line containing the here-doc operator.

> --- code
> prefix() { sed -e "s/^/$1:/"; }
> DASH_CODE() { :; }
>
> prefix A <<XXX && echo "$(prefix B <<XXX
> echo line 1
> XXX
> echo line 2)" && prefix DASH_CODE <<DASH_CODE
> echo line 3
> XXX
> echo line 4)"
> echo line 5
> DASH_CODE
>
> --- bash 4.3.42 output:
> A:echo line 3
> B:echo line 1
> line 2
> DASH_CODE:echo line 4)"
> DASH_CODE:echo line 5

So, it looks like bash is interpreting this as "first newline that is not in the middle of another shell word", and parses the entire $(...) construct through line 2 as if there were no newlines, then treats the newline after DASH_CODE as starting the heredoc, for outputting A: while visiting line 3 as the lone line in that heredoc. Then it moves on to the second command in the && sequence, by processing the command substitution (a heredoc outputting line 1, then the output of line 2); then moves on to the third component of the && sequence as a final heredoc delimited by DASH_CODE, with both lines 4 and 5 output with the DASH_CODE: prefix.

> --- dash 0.5.8 output:
> A:echo line 1
> B:echo line 2)" && prefix DASH_CODE <<DASH_CODE
> B:echo line 3
> line 4
> line 5

Meanwhile, dash is taking the literal first newline as the start of the first heredoc, and outputting A: with line 1; then consuming the next heredoc as lines 2 and 3 before finding the end of the command substitution on line 4, then outputting line 5 on its own and doing nothing else for the DASH_CODE function call.

ksh 93u+ 2012-08-01 behaves even differently: B:echo line 1 line 2 && prefix DASH_CODE < after a here-doc operator occurs in the middle of a shell word.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
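Because the shells in this message disagree once a here-doc operator sits inside an unfinished word, a script that needs predictable results can keep every operator and its body adjacent. The following is only a sketch under that assumption; `inner` and `render` are hypothetical names, and `prefix` is the function from the test case.

```shell
prefix() { sed -e "s/^/$1:/"; }

# Capture the inner here-document on its own, so the newline after the
# operator is not part of any word that is still open.
inner=$(prefix B <<'XXX'
echo line 1
XXX
)

# Each remaining here-document body now directly follows its operator.
render() {
	prefix A <<'XXX'
echo line 3
XXX
	printf '%s\n' "$inner"
}
```

The delimiters are quoted here only so that the bodies are taken literally. The design choice that matters is that no `<<` appears on a line that ends inside an unfinished `"$(`.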
Re: [BUG] Illegal function names are accepted after being used as aliases
On 02/23/2016 02:00 PM, Harald van Dijk wrote: > > I was under the impression that the intent from the dash side was to > handle all commands the same, and that impression was based on the fact > that the . command has received additional code to handle -- even though > there's no requirement for that. However, looking into the original bug > report that prompted that change in more detail I see that the standard > will very likely require support for -- in the . command in the future, > so that doesn't hold up. Here's the link for dot and exec supporting --: http://austingroupbugs.net/view.php?id=252 > > If that intent isn't there (I'm not saying it's not; I'm unsure now), > the list of utilities that should be extended is far smaller, if I'm not > overlooking anything: > - alias > - getopts > - type > - exec? > - local? Weird that unalias already works. Oh, because of 'unalias -a'. I didn't spot any others that you missed (doesn't mean there aren't any, just that I didn't spot them). > > exec is like .: there's currently no requirement to support --, but that > requirement is likely to come in the future. See the above link; exec must support -- if '.' does. I also found http://austingroupbugs.net/view.php?id=163 which confirms that 'eval' is not required (nor it is prevented) from recognizing --. There's also http://austingroupbugs.net/view.php?id=960 which mentioned the exit status of export and several other special builtins, but added no requirements related to --. > > local is currently non-standard and it's hard to guess whether it will > require support for -- if standardised. If standardized, I expect it to require support for --, on the grounds that 'local -r' already has meaning in bash, so local is definitely a candidate for taking options. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [BUG] Illegal function names are accepted after being used as aliases
On 02/23/2016 12:21 PM, Harald van Dijk wrote: > On 23/02/2016 19:58, Eric Blake wrote: >> On 02/23/2016 11:44 AM, Harald van Dijk wrote: >> >>> This matches bash's behaviour, aside from bash requiring -- to prevent >>> detection of invalid flags to the alias command: >>> >>> bash-4.3$ alias -- -=true >> >> Then dash DOES have a bug: > > Indeed, I wasn't trying to suggest otherwise, my apologies if it came > across that way. It's not limited to the alias command though, I spotted > at least the exit and getopts commands having the same problem, and it > should probably be fixed for all of them at once. getopts - definitely needs a fix exit - fuzzy. exit is a special built-in (unlike getopts); and XCU 2.14 states: "Some of the special built-ins are described as conforming to XBD Utility Syntax Guidelines. For those that are not, the requirement in Utility Description Defaults that "--" be recognized as a first argument to be discarded does not apply and a conforming application shall not use that argument. " Conforming apps cannot expect 'exit -1' to work, and therefore, cannot also expect 'exit -- -1' to work, since the only standards-defined values for an argument to exit is a non-negative decimal integer less than 256. Of course, if you want to fix it along with all the others, that's fine; I'm just pointing out that 'exit' isn't broken as-is. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [BUG] Illegal function names are accepted after being used as aliases
On 02/23/2016 11:44 AM, Harald van Dijk wrote: > This matches bash's behaviour, aside from bash requiring -- to prevent > detection of invalid flags to the alias command: > > bash-4.3$ alias -- -=true Then dash DOES have a bug: # dash $ alias -- -='echo hi' alias: -- not found $ echo $? 1 $ - hi $ POSIX XCU 1.4 is clear: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html "Default Behavior: When this section is listed as "None.", it means that the implementation need not support any options. Standard utilities that do not accept options, but that do accept operands, shall recognize "--" as a first argument to be discarded." and alias takes operands, stating "OPTIONS: None.", which means POSIX _requires_ 'alias -- -=name' to (attempt to) define only the single alias '-', and NOT to also attempt to define '--' as an alias. It's okay if dash allows 'alias -=blah' to define '-' as an alias as an extension, but it MUST ignore '--' the way bash does. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [BUG] Illegal function names are accepted after being used as aliases
On 02/23/2016 11:18 AM, Jan Verbeek wrote: > Function definitions that use a bad function name (such as "-" and "=") > are accepted if the function name already exists as an alias. For example: Not necessarily a bug. > > $ - > dash: 1: -: not found > $ - () { echo hello; } > dash: 2: Syntax error: Bad function name > $ - > dash: 2: -: not found > $ alias -=true > $ - This is equivalent to running 'true'. > $ - () { echo hello; } This is equivalent to running 'true () { echo hello; }' - the alias expansion happens BEFORE the function definition is even parsed. You are NOT defining a function named '-', but one named 'true'. > $ - This is again equivalent to running 'true' - except that now the function name 'true' exists and bypasses the shell builtin. > hello > $ So the only thing remaining is to determine if it is legal to have a function override the name of a regular shell builtin. But http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01 under "Command Search and Execution" states that function names have priority over regular built-ins (so yes, creating a function named 'true' is doable, although stupid). -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: dash drops exported bash functions
On 02/10/2016 08:18 AM, Joachim Breitner wrote: > Dear dash developers, > > a change in 0.5.8, very likely this one > http://git.kernel.org/cgit/utils/dash/dash.git/commit/?id=46d3c1a614f11f0d40a7e73376359618ff07abcd > broke the exporting of bash shell functions via the environment. Not a bug. POSIX says that on shell startup, the behavior of any inherited environment variables that do not start with a proper shell name is undefined; and allows shells to scrub such items out of the environment on startup. Just because bash does not scrub them (but instead treats them as shell function imports) does not mean dash has to behave the same. That said, preserving any unusable environment variables unchanged, rather than scrubbing them, may be slightly nicer behavior, but I'm not sure it's worth the bloat to dash to do so. > > Exporting bash functions via the environment might be a rarely used > feature, but it is used in practice, unfortunately (otherwise I > wouldn’t have noticed this). Exporting bash functions is only usable if you plan on directly invoking bash. Don't drag dash into the mess. Inserting a dash child in between a bash parent and grandchild means all bets are off for whether the grandparent can export anything to the grandchild. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [PATCH] Set LC_ALL instead LC_COLLATE in mkbuiltins
On 05/21/2015 10:45 PM, Herbert Xu wrote:
>> Setting LC_ALL has the nice property that LC_COLLATE and LC_CTYPE are
>> guaranteed to be compatible; if you just set LC_COLLATE but leave
>> LC_CTYPE unchanged and unset LC_ALL, it is possible to attempt a
>> collation that assumes one character set while still living in a ctype
>> that assumes another, and get garbled results.
>
> Show me an actual pair of values for these two that produce
> incorrect results for mkbuiltins and I'll happily change both.

'sort -b' uses isspace() to determine which characters to strip. There are locales with a larger set of characters where isspace() returns true than for the LC_CTYPE=C locale.

Suppose that I can find a single-byte locale where isblank('\xff') is true. If that is the case, then the input '\xffa\nb\n' will sort differently for 'LC_ALL=C sort -b' (output 'b\n\xffa\n') than for 'LANG=C LC_CTYPE=$locale' (output '\xffa\nb\n') because the change in CTYPE changes whether the \xff is ignored as a blank or included as part of the name being sorted.

However, the man pages for 'locale(1)' and 'localedef(1)' did not make it obvious for me how to perform a search that would easily find such a locale, so I'm open to suggestions on how to prove my point via more than just analysis. And there's still the point that mkbuiltins is being run on controlled input, where you are sticking only to a subset of characters that happen to be portable (that is, you are unlikely to be tripped up by a locale where \xff is a blank, since you are not using \xff in your input).

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [PATCH] Set LC_ALL instead LC_COLLATE in mkbuiltins
On 05/21/2015 10:25 PM, Herbert Xu wrote:
> Fredrik Fornwall fred...@fornwall.net wrote:
>> In mkbuiltins LC_COLLATE is set, but since "The value of the LC_ALL
>> environment variable has precedence over any of the other environment
>> variables starting with LC_"
>> (http://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html), this
>> has no effect when LC_ALL is set.
>>
>> This breaks when having e.g. LC_ALL=en_US.UTF-8 during make, which
>> causes the test case dash -c : to fail, probably due to broken
>> ordering in builtins.c.
>>
>> The patch corrects that by setting LC_ALL instead of LC_COLLATE.
>
> This causes any errors printed by sort to come out in English.

Why do you care whether any errors printed by sort are in the C locale (in English) rather than localized? Ideally, there won't be any sort errors in the first place, because this tool is run on controlled input as part of the build process.

> Please fix this by simply setting LC_ALL to empty alongside
> LC_COLLATE=C.

Setting LC_ALL has the nice property that LC_COLLATE and LC_CTYPE are guaranteed to be compatible; if you just set LC_COLLATE but leave LC_CTYPE unchanged and unset LC_ALL, it is possible to attempt a collation that assumes one character set while still living in a ctype that assumes another, and get garbled results.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
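The two approaches debated in this message can be written side by side as follows. This is a sketch only: the function names are hypothetical and this is not the text of mkbuiltins.

```shell
# Hypothetical wrappers around sort(1); each reads lines on stdin.
sort_bytes() {
	# A non-empty LC_ALL overrides every other LC_* variable and LANG, so
	# collation, character classes and messages all use the C locale.
	LC_ALL=C sort
}

sort_bytes_keep_messages() {
	# An empty LC_ALL is treated like an unset one, so LC_COLLATE=C takes
	# effect while LC_CTYPE and LC_MESSAGES still follow the caller.
	LC_ALL= LC_COLLATE=C sort
}
```

For plain ASCII input such as builtin names both give byte order, with uppercase before lowercase. The disagreement in the thread is about input where LC_CTYPE changes what sort treats as a blank.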
Re: [PATCH] Fix variable assignments in function invocations
On 01/09/2015 10:17 AM, Harald van Dijk wrote:
> Hello all,
>
> A long-standing problem with dash has been how it deals with variable
> assignments in function invocations, and several packages are affected
> by it, two I've come across recently being autogen and pkg-config (only
> their test suites, luckily). A short test script:
>
> f() {
>   echo inside f, VAR is $VAR
>   sh -c 'echo inside sh called from f, VAR is $VAR'
> }
> VAR=value f

This behavior is tricky. Here's the latest POSIX wording:

http://austingroupbugs.net/view.php?id=654#c1559

* If the command name is a function that is not a standard utility implemented as a function, variable assignments shall affect the current execution environment during the execution of the function. It is unspecified:

  - Whether or not the variable assignments persist after the completion of the function
  - Whether or not the variables gain the export attribute during the execution of the function
  - Whether or not export attributes gained as a result of the variable assignments persist after the completion of the function (if variable assignments persist after the completion of the function)

So the existing dash behavior is compliant, even if different from bash.

> Quoting SUSv4 Shell Command Language 2.9.1 Simple Commands:
>
> If no command name results, variable assignments shall affect the
> current execution environment. Otherwise, the variable assignments shall
> be exported for the execution environment of the command and shall not
> affect the current execution environment (except for special built-ins).

This is the text that was rendered obsolete by the above POSIX bug 654.

> Fixing this seems trivial, see the attachment, and the test suites of
> both autogen and pkg-config pass with this change. Does this look correct?

I have no opinion on whether to take the patch in order to behave more like bash, or whether to tell script-writers to fix their script to avoid unspecified behavior because dash is already compliant in providing a different behavior than bash.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
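A script that wants the exported-during-the-call behavior without relying on any of the unspecified points above can do the export itself inside a subshell. A minimal sketch; `call_f_with_var` is a hypothetical name, and `f` is reduced to the part of the test script that shows what a child process sees.

```shell
f() {
	# Report what a child process sees in its environment.
	sh -c 'printf "%s\n" "${VAR-unset}"'
}

# Hypothetical wrapper. The ( ... ) body runs in a subshell, so both the
# assignment and the export attribute disappear when the call returns.
call_f_with_var() (
	VAR=$1
	export VAR
	f
)
```

The cost is that `f` cannot change the caller's variables when run this way. Whether that matters depends on the function.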
Re: if [ s1 s2 ] broken, writing a s2 file
On 12/08/2014 10:32 AM, solsTiCe d'Hiver wrote:
> hello,
>
> following that bug
> https://bugs.launchpad.net/ubuntu/+source/update-notifier/+bug/1400357,
> I follow through to investigate and I found out that whatever I try,
> when comparing 2 strings I always end up with a file written to disk
>
> From the man page
>
> test expression
> [ expression ]
> [...]
> s1 > s2   True if string s1 comes after s2 based on the ASCII value
>           of their characters.

You HAVE to escape the > so that it is interpreted as an argument and not a redirection operator. The bug is not in dash, but in your usage.

> when I try to use it:
>
> a=ert
> b=aze
> if [ $a > $b ] ; then

Wrong. Use:

if [ $a \> $b ]; then

> so this if syntax is broken or I don't know how to use it.

The latter.

> Also it is really dangerous to use a syntax similar to file redirection
> and this is exactly what is happening here.

POSIX is proposing the addition of the shell builtin [[ ]], where because it is a syntactical part of the shell, it would have safe semantics (that is, [[ $a > $b ]] would be perfectly safe and do the right thing). But until the POSIX standardization is complete, dash does not implement [[; and as long as only '[' is portable (with its unfortunate but historically-mandated semantics of operating as if it were NOT a builtin, in that shell parsing happens before test sees its arguments), then you have to quote anything that might otherwise be misinterpreted during parsing.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
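The corrected usage from this reply, wrapped in a hypothetical helper named `str_after`. It relies on the `>` string operator that the dash man page quoted above documents, so treat it as a sketch for shells that provide that operator.

```shell
# Hypothetical helper: succeed when string $1 sorts after string $2.
str_after() {
	# The backslash stops the shell from parsing '>' as a redirection, so
	# test receives it as an operator and no file named "$2" is created.
	# Double quotes keep empty or space-containing values as one argument.
	[ "$1" \> "$2" ]
}
```

With the values from the report, `str_after ert aze` succeeds and leaves no file called aze behind in the current directory.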
Re: POSIX compliant trap signal names
On 10/30/2014 09:23 AM, Sylvain Bertrand wrote:
> Hit the issue while compiling linux 3.16.3 with dash,
> ${linux-src}/scripts/link-vmlinux.sh line 114 .
> The signal names for trap built-in must be prefixed with SIG to be POSIX
> compliant. dash expect trap signal names without a SIG prefix.

Wrong. Per

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#trap

"The condition can be EXIT, 0 (equivalent to EXIT), or a signal specified using a symbolic name, without the SIG prefix, as listed in the tables of signal names in the signal.h header defined in XBD Headers; for example, HUP, INT, QUIT, TERM. Implementations may permit names with the SIG prefix or ignore case in signal names as an extension."

Thus, POSIX requires 'trap ... INT' to work, but says 'trap ... SIGINT' and 'trap ... int' are up to the implementation whether they are supported as an extension.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
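A minimal sketch of a cleanup handler written with the portable spelling of the condition names. The function name and the temp-file path are hypothetical and are not taken from link-vmlinux.sh.

```shell
# Hypothetical example: create a scratch file and remove it on the way out.
with_cleanup() (
	tmp=${TMPDIR:-/tmp}/cleanup-demo.$$
	# Conditions are named without the SIG prefix, as POSIX requires.
	trap 'rm -f "$tmp"' EXIT
	trap 'exit 1' HUP INT TERM
	: > "$tmp"
	printf '%s\n' "$tmp"
)
```

Routing the signals through `exit` means the EXIT trap does the actual cleanup in one place. Writing `SIGINT` instead of `INT` would work only in shells that offer that extension.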
[PATCH] [BUILTIN] cd: support drive letters on Cygwin
The Cygwin platform supports DOS style drive-letter paths such as C:\\dir, even though the preferred form is a POSIX-style /cygdrive/c/dir. This can be seen by doing things such as chdir("c:") (which succeeds) followed by getcwd(NULL, 0) (which returns the normalized "/cygdrive/c"). However, dash was trying to perform local manipulations on the argument to 'cd' prior to calling into libc, in order to update the state of $PWD and friends; these manipulations were assuming that the user meant to change to a relative subdirectory of the current location, as in './c:', instead of honoring the drive letter.

None of the other dash builtins take a filename and manipulate it to affect shell state (some, like 'test', take a file name, but as stat("c:") works just fine, there is no need to normalize).

This patch has no impact outside of cygwin; on cygwin, it takes advantage of a native function call to canonicalize any incoming name into preferred form before updating shell state.

Pre-patch:
$ dash -c 'cd c:
echo $PWD'
dash: 1: cd: can't cd to c:

Post-patch:
$ dash -c 'cd c:
echo $PWD'
/cygdrive/c

Signed-off-by: Eric Blake <ebl...@redhat.com>
---
 ChangeLog |  4 ++++
 src/cd.c  | 14 ++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index a466a7f..a745fe7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2014-10-13  Eric Blake <ebl...@redhat.com>
+
+	* cd: support drive letters on Cygwin.
+
 2014-09-26  Herbert Xu <herb...@gondor.apana.org.au>
 
 	* Small optimisation of command -pv change.
diff --git a/src/cd.c b/src/cd.c
index 2d9d4b5..a4e024d 100644
--- a/src/cd.c
+++ b/src/cd.c
@@ -38,6 +38,9 @@
 #include <string.h>
 #include <unistd.h>
 #include <limits.h>
+#ifdef __CYGWIN__
+#include <sys/cygwin.h>
+#endif
 
 /*
  * The cd and pwd commands.
@@ -194,6 +197,17 @@ updatepwd(const char *dir)
 	char *cdcomppath;
 	const char *lim;
 
+#ifdef __CYGWIN__
+	/* On cygwin, thanks to drive letters, some absolute paths do
+	   not begin with slash; but cygwin includes a function that
+	   forces normalization to the posix form */
+	char pathbuf[PATH_MAX];
+	if (cygwin_conv_path(CCP_WIN_A_TO_POSIX | CCP_RELATIVE, dir, pathbuf,
+			     sizeof(pathbuf)) < 0)
+		sh_error("can't normalize %s", dir);
+	dir = pathbuf;
+#endif
+
 	cdcomppath = sstrdup(dir);
 	STARTSTACKSTR(new);
 	if (*dir != '/') {
--
1.9.3

--
To unsubscribe from this list: send the line "unsubscribe dash" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 'set' leaks garbage from environment
On 09/30/2014 09:14 AM, Olof Johansson wrote:
> On 2014-09-30 09:01 -0600, Eric Blake wrote:
>> $ dash -c 'unset a|b'
>> $ echo $?
>> 0
>
> Works for me (tested on both Debian package versions 0.5.7-3 (wheezy) and 0.5.7-4 (unstable)):

Serves me right for testing on multiple machines :( I mixed up my test results. Fedora 20 using dash 0.5.7 works:

$ dash -c 'unset a|b'
dash: 1: unset: a|b: bad variable name
$ rpm -q dash
dash-0.5.7-8.fc20.x86_64

But RHEL 6 fails:

$ dash -c 'unset a|b'
$ rpm -q dash
dash-0.5.5.1-4.el6.x86_64

so this is at least one bug that has already been fixed upstream.

>> $ env 'a|b=' dash -c 'set | grep a.b'
>> a|b=''
>
> This I can reproduce though.

Meanwhile, I just tested the latest dash.git (commit f21016a12) and this behavior is no longer present:

$ env 'a|b=' ./src/dash -c 'set | grep a.b'

so it has also been fixed in the meantime. Sorry for not doing my homework; nothing to fix here...

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [PATCH] Support DOS paths in dash
[I noticed an old thread in my inbox while packaging dash 0.5.8 for Cygwin]

On 03/28/2013 09:08 AM, Edward Lam wrote [to the cygwin list]:
>>> The problem is that dash tries to convert c:/windows to an absolute path, since it doesn't start with /. I suppose I could teach dash to recognize [letter]:/ as absolute paths, although that makes dash larger, and puts a burden on me (since I can guarantee upstream dash won't accept such a patch). I just don't care enough for DOS paths so I won't fix.
>>
>> Me neither. And since you can use /cygdrive/c, not c:/, I won't bother to fix it.
>
> Hi Folks,
> I finally got down to looking at how to fix this in dash and came up with the attached patch (against dash-0.5.7). It's simple enough and so cd now works. Please consider this for Cygwin.

I'm not interested in burdening the cygwin build of dash with a one-off patch, so I'd like to gauge the upstream thoughts - is it worth including platform-specific patches like this (no penalty to build size of non-cygwin platforms, and on cygwin, it allows 'cd c:/' to behave as shorthand for 'cd /cygdrive/c/')? If the patch lands in dash.git, then I'll rebuild the cygwin port of dash to include a backport (rather than waiting for 0.5.9 to be released). If there is no interest, I'd rather just drop the patch. The cygwin community already states that /cygdrive/c notation is the official way to access drive letters, and that if 'c:/' works it is nice, but it is not a design goal to always have it work.

> --- src/cd.c 2011-03-15 03:18:06.0 -0400
> +++ src/cd.new.c 2013-03-28 11:03:32.649576500 -0400
> @@ -38,6 +38,9 @@
>  #include <string.h>
>  #include <unistd.h>
>  #include <limits.h>
> +#ifdef __CYGWIN__
> +#include <sys/cygwin.h>
> +#endif
>
>  /*
>   * The cd and pwd commands.
> @@ -194,6 +197,11 @@
>  	char *cdcomppath;
>  	const char *lim;
>
> +#ifdef __CYGWIN__
> +    char pathbuf[PATH_MAX + 1];
> +    cygwin_conv_to_full_posix_path (dir, pathbuf);

By the way, cygwin_conv_to_full_posix_path() is deprecated (it suffers from possible buffer overflow); these days, it's preferred to use:

cygwin_conv_path (CCP_WIN_A_TO_POSIX | CCP_RELATIVE, string, pathbuf, sizeof(pathbuf))

So, if there is interest in this patch upstream, I can respin it.

> +    dir = pathbuf;
> +#endif
>  	cdcomppath = sstrdup(dir);
>  	STARTSTACKSTR(new);
>  	if (*dir != '/') {

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: Line continuation and variables
On 08/26/2014 06:15 AM, Oleg Bulatov wrote:
> Hi!
> While playing with sh generators I found that dash and bash have different interpretations for the backslash-newline sequence.
>
> $ dash -c 'EDIT=xxx; echo $EDIT\
> OR'
> xxxOR

Buggy.

> $ bash -c 'EDIT=xxx; echo $EDIT\
> OR'
> /usr/bin/vim

Correct behavior.

> $ dash -c 'echo $\
> (pwd)'
> $(pwd)
>
> Is it undefined behaviour in POSIX?

No, it's well-defined, and dash is buggy. POSIX says:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03

"the shell shall break its input into tokens by applying the first applicable rule below to the next character in its input"

Rule 4 covers backslash handling, while rule 5 covers locating the end of a word to be subject to $ expansion. Therefore, rule 4 should happen first. Rule 4 defers to the section on quoting, with the caveat that newline joining is the only substitution that happens immediately as part of the parsing:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02

"If a newline follows the backslash, the shell shall interpret this as line continuation. The backslash and newline shall be removed before splitting the input into tokens. Since the escaped newline is removed entirely from the input and is not replaced by any white space, it cannot serve as a token separator."

So the fact that dash treats the elided backslash-newline as a token separator, and parses your input as if it were ${EDIT}OR instead of ${EDITOR}, is a bug in dash.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [PATCH dash] [BUILTIN] ensure LC_COLLATE is not overriden
On 08/05/2014 10:40 AM, Chema Gonzalez wrote:
> If the user environment has either LC_ALL or LANG defined, the setting of LC_COLLATE in src/mkbuiltins is overridden. With a non-POSIX locale, the orders of dotcmd (remember that '.' is 0x2e in ascii) and truecmd (':' is 0x3a in ascii) are reversed, which makes the : command fail in the bsearch.
>
> - }}' $temp | LC_COLLATE=C sort -k 1,1 | tee $temp2 | awk '{
> + }}' $temp | LC_ALL= LANG= LC_COLLATE=C sort -k 1,1 | tee $temp2 | awk '{

Setting LC_ALL= to the empty string risks implementation-defined behavior. Also, LC_ALL overrides LANG and LC_COLLATE. It should be sufficient to merely do:

}}' $temp | LC_ALL=C sort -k 1,1 | tee $temp2 | awk '{

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [PATCH dash] [BUILTIN] ensure LC_COLLATE is not overriden
On 08/05/2014 11:12 AM, Chema Gonzalez wrote:
>> Setting LC_ALL= to the empty string risks implementation-defined behavior. Also, LC_ALL overrides LANG and LC_COLLATE. It should be sufficient to merely do:
>>
>> }}' $temp | LC_ALL=C sort -k 1,1 | tee $temp2 | awk '{
>
> Maybe:
>
> }}' $temp | LC_ALL=C LANG=C sort -k 1,1 | tee $temp2 | awk '{

No need to specify LANG=C when LC_ALL is set. I stand by my shorter line.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
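The ordering at stake in this thread can be checked in isolation. A minimal sketch, assuming a POSIX sort; the wrapper name sort_c is hypothetical and only stands in for the pipeline stage in src/mkbuiltins.

```shell
# '.' is 0x2e and ':' is 0x3a in ASCII, so the C locale lists '.' first.
# LC_ALL=C on its own takes precedence over any LANG or LC_COLLATE
# inherited from the environment.
sort_c() {
    LC_ALL=C sort
}

printf '%s\n' ':' '.' | sort_c
```

Because the assignment prefixes only the sort command, the caller's locale settings are left untouched.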
Re: [PATCH] \e in echo and printf builtins
On 06/28/2014 11:33 AM, Paul Gilmartin wrote:
> OTOH, there's a POSIX requirement that builtins be indistinguishable (except in performance) from the corresponding executables.

The POSIX requirement only applies to portable uses of the builtin - i.e. those that are prescribed by POSIX. Since POSIX does not require \e, dash is not failing compliance, even if it differs from extensions provided by corresponding executables.

I do not think dash needs to bloat for \e unless POSIX standardizes it first.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: sed script fails to run in dash
On 11/22/2013 11:11 AM, Tormen wrote:
> sed -e 1$'{w/dev/stdout\n;d}' -i /tmp/x
>
> in a dash script will yield the error message:
>
> sed: -e expression #1, char 2: unknown command: `$'
>
> But why ? :(

Because $'' is not (yet) in POSIX. It will be required in a future release, but dash hasn't implemented it yet.

http://austingroupbugs.net/view.php?id=249

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
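Until $'...' is available, one possible way to get the embedded newline in a shell like dash is to build it with printf. A minimal sketch, assuming only POSIX printf and parameter expansion; the variable names nl and sed_expr are illustrative.

```shell
# Command substitution strips trailing newlines, so append a sentinel
# character and remove it afterwards to keep exactly one newline.
nl=$(printf '\nx')
nl=${nl%x}

# The sed expression from the question, assembled with "$nl" in place of
# the \n that $'...' would have expanded.
sed_expr="1{w/dev/stdout${nl};d}"
printf '%s\n' "$sed_expr"
```

The resulting string could then be passed to sed as `sed -e "$sed_expr"`; whether a given sed accepts the expression itself is a separate matter.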
Re: test incorrectly rejecting valid expression with confusing ! placement
On 09/03/2013 07:56 PM, Herbert Xu wrote:
> Harald van Dijk <har...@gigawatt.nl> wrote:
>> Hi,
>> Now that Herbert fixed the reported crash in test (in a far simpler manner than I had suggested, which I like), I did some more testing, and came across one case that does not currently work, and did not work in the past, but is perfectly valid:
>>
>> $ src/dash -c 'test ! ! = !'
>> src/dash: 1: test: =: unexpected operator
>
> Agreed.
>
>> $ src/dash -c 'test ! -o !'
>> src/dash: 1: test: -o: unexpected operator
>
> Nope, the rule is quite clear that it only applies to binary primaries, not operators. -o is an operator.

Huh? http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html states that there are only two operators, ! and (), and specifically mentions that -a and -o are binary primaries:

expression1 -a expression2
    [OB XSI] True if both expression1 and expression2 are true; otherwise, false. The -a binary primary is left associative. It has a higher precedence than -o.

expression1 -o expression2
    [OB XSI] True if either expression1 or expression2 is true; otherwise, false. The -o binary primary is left associative.

test ! -o ! is a three-argument test, where $2 (-o) is a binary primary, so it is the binary test of $1 and $3, and the end result is an exit status of 0. Bash and ksh get it right, dash fails.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: Crash on valid input
On 04/08/2013 09:12 PM, Dan Kegel wrote:
> Yes, my script was crap, I've fixed it.
>
> Here's the reproducer. Called with foo unset. I think it doesn't crash without -x.
>
> #!/bin/dash
> set -x
> test ! $foo

The 'set -x' was indeed the key to reproducing the problem. In fact, this is the shortest I could make it:

dash -cx 'test !'

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [PATCH] [BUILTIN] Allow SIG* signal names.
On 07/02/2012 12:53 PM, Isaac Jurado wrote:
> On Mon, Jul 2, 2012 at 4:22 PM, Eric Blake <ebl...@redhat.com> wrote:
>> On 07/02/2012 08:11 AM, Paul Gilmartin wrote:
>>> On Jul 2, 2012, at 07:51, Eric Blake wrote:
>>>> ... non-required bloat ...
>>>
>>> The key phrase. And one value of a shell lacking such extensions is that it provides an excellent test bed for code intended to be portable within the POSIX spec.
>>
>> That argues that we should drop our strcasecmp() for the much simpler strcmp(), in order to remove the bloat we already have.
>
> I guess my patch has no chance to be accepted.

I'm not the maintainer, so my decision is not indicative of what the dash maintainer will choose. But my personal preference would be that we change this area of code, either to:

1. be lighter-weight (drop strcasecmp, which is locale-dependent, and replace it with strcmp)
2. be more user-friendly (accept optional case-insensitive SIG prefix)

Both approaches are permitted by POSIX, so it boils down to a judgment call of whether providing useful extensions or providing a minimally compliant shell is more important.

> But I'm still curious about what kind of bloat you are referring to. I'm assuming it's not code bloat in terms of lines of code.

Even one byte larger in the final executable size has been deemed bloat on this list in the past. Dash prides itself on being minimalistic, but you happened to point out an area of code that is not currently minimal.

> If the signal name to number conversion seems too expensive (linear search multiplied by the string lengths, whether it is case sensitive or not), there is a much more elegant solution: perfect hashing.

Indeed, that would provide faster lookup, but it would also cost more executable size (the storage requirements for a perfect hash table are larger than the size of a loop comparison); I don't know whether the preference is for speed, for minimal size, or for a hybrid of the two (where larger size is okay only if it proves to have more speed). So hopefully the dash maintainer will chime in on the topic.

--
Eric Blake ebl...@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [PATCH] var.c: check for valid variable name before printing in export -p
On 02/25/2012 07:31 AM, Herbert Xu wrote:
> On Sat, Feb 25, 2012 at 03:30:04PM +0100, Jilles Tjoelker wrote:
>> Most shells pass the environment variable through, such as bash, zsh, ksh93 and most ash derivatives. However, the original Bourne shell and pdksh/mksh do not.
>
> Do you know of any genuine uses of such environment variables?

POSIX states that applications must not rely on such pass-through:

http://austingroupbugs.net/view.php?id=168

So while it might indeed be useful to pass through invalid names, such an application is broken for expecting it to work, and I'm okay with this patch as-is.

--
Eric Blake ebl...@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: evaluation of env variables in DASH
On 10/19/2011 03:24 PM, Dima Sorkin wrote:
> Hi.
> The following DASH behaviour seems buggy to me

The only bug here is your expectations.

> $ export A='\n'
> $ echo $A

Passing a literal backslash to echo is non-portable. POSIX even says so. And bash can match dash behavior:

$ (shopt -s xpg_echo; A='\n'; echo -$A-)
-
-
$ (shopt -u xpg_echo; A='\n'; echo -$A-)
-\n-

Fix your shell script to use printf instead of echo if the thing you are printing might contain a backslash.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
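The printf advice above might look like this in practice. A minimal sketch; the function name show is hypothetical.

```shell
A='\n'

# With a fixed format string, printf does not interpret backslashes in its
# %s arguments, so the value is printed literally whichever echo flavour
# the shell implements.
show() {
    printf '%s\n' "-$1-"
}

show "$A"
```

Keeping the data out of the format string is the important part: `printf "$A"` would interpret the backslash again.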
Re: [PATCH 1/2] [SHELL] Allow building without LINEO support.
On 08/17/2011 12:04 AM, Harald van Dijk wrote:
> On Tue, 2011-08-16 at 20:12 -0500, Jonathan Nieder wrote:
>> David Miller wrote:
>>> [Subject: [SHELL] Allow building without LINEO support.]
>>
>> Thanks! Debian has been using something like this (but unconditional) to convince autoconf not to use dash as CONFIG_SHELL, to work around bugs in various configure scripts[1]. I imagine other users might want the same thing, so a patch like this seems like a good idea.
>
> If you don't mind me asking, if you want configure scripts to run from bash, why not simply run configure scripts from bash, instead of running them from sh and trusting that sh will call bash if it is really some other shell?

And remember, most configure scripts already support that:

CONFIG_SHELL=path/to/bash path/to/bash ./configure

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
positional argument bug
[originally brought up on the bash list as a NetBSD bug, but dash is also affected]

On 05/05/2011 08:11 AM, Eric Blake wrote:
> I'd call that a pretty serious incompatibility on the part of ash and its descendants (BSD sh, dash, etc.). There's no good reason that
>
> set -- a b c d e f g h i j
> echo $10
>
> should echo `j'.

Also a POSIX violation:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

"The parameter name or symbol can be enclosed in braces, which are optional except for positional parameters with more than one digit or when parameter is followed by a character that could be interpreted as part of the name."

Additionally from POSIX:

"If the parameter name or symbol is not enclosed in braces, the expansion shall use the longest valid name (see XBD Name)"

"In the shell command language, a word consisting solely of underscores, digits, and alphabetics from the portable character set. The first character of a name is not a digit."

Therefore, in $10, 10 is not a name, so the longest name is the empty string, and the single-character symbol is used instead, such that this MUST be parsed as ${1}0, not as ${10}.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
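A script that wants the tenth argument can sidestep the question by always using braces. A minimal sketch; the function name tenth is hypothetical.

```shell
# tenth ARGS...: print the tenth argument.
# Braces are required for positional parameters with more than one digit;
# per the POSIX text quoted above, an unbraced $10 means ${1}0.
tenth() {
    printf '%s\n' "${10}"
}

tenth a b c d e f g h i j
```

The braced form reads the same way in every shell, so it does not depend on whether a given ash descendant has fixed the unbraced case.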
Re: `local' built-in POSIX?
On 03/26/2011 04:50 PM, Michael Witten wrote:
> I can't find POSIX documentation for the `local' built-in, which is available in both dash and bash for the creation of function-local variables.
>
> Is it not standard POSIX? If it is not, should it be removed from dash?

No, it is not standard POSIX (yet). There has been talk on the Austin Group mailing list of standardizing local (perhaps by the name typeset) for the next revision; the biggest issue is that ksh uses typeset only for statically scoped variables, while bash uses it only for dynamically scoped variables, so a consensus has to be reached among shell writers which scoping rules to standardize.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
Re: Dash's web presence
On 03/08/2011 01:08 AM, Dan Muresan wrote:
> Oh, you do have a GIT repository. Kudos for that.

And when you consider that bash lacks even a public repository, and your only recourse is massive inter-version diffs, dash is already worlds ahead in that regard.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
Re: setvar MIA?
On 01/11/2011 08:54 AM, Aragon Gouveia wrote:
> Hi,
> I'm working on making a number of shell scripts cross compatible between FreeBSD and Linux, but one thorn in my side has been dash's lack of a setvar builtin. Does anyone know if this is a work in progress, or a decidedly void feature in dash?

Decidedly missing. POSIX doesn't require it. Neither bash nor ksh provides setvar as a builtin, either.

And what does setvar do anyways? Perhaps it is some alias or shell function that you have inherited from startup files in one of your other shells, but I've never heard of a 'setvar' program. So why bloat dash to include it?

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
Re: setvar MIA?
On 01/11/2011 09:54 AM, Aragon Gouveia wrote:
> I wasn't sure of its status in POSIX. It is useful for declaring variable variables - tidier than eval and I imagine faster, eg.
>
> index=1
> setvar var_${index} value
>
> Will emulate it with a local function - thanks.

Indeed, it looks like FreeBSD introduced it as shorthand for:

setvar() { eval $1=\$2; }

The speed difference between that function doing an eval and a shell builtin would be in the noise. I don't know why FreeBSD even bothered to pollute the namespace with a builtin like that.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
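The emulation mentioned above could be written out as follows. This is a sketch, not FreeBSD's implementation: it quotes the eval argument defensively and assumes the caller passes a trusted, valid variable name as the first argument.

```shell
# Emulation of setvar for shells that lack it.
# $1 is expanded before eval runs, so it must be a valid name from a
# trusted source; \$2 is expanded by eval itself, so the value is assigned
# verbatim and never re-parsed as shell code.
setvar() {
    eval "$1=\$2"
}

index=1
setvar "var_${index}" 'some value'
eval "printf '%s\n' \"\$var_${index}\""
```

Reading the variable back needs an eval as well, which is why the last line looks heavier than the assignment.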
Re: static vs. dynamic scoping
On 11/15/2010 02:11 PM, Cedric Blancher wrote:
> Why is the debate static-vs-dynamic scoping coming up again?

Because before 'typeset' can be standardized in POSIX, we have to get consensus from all the shell implementers that they will agree to implement static scoping. For ksh, the question is moot - ksh93 already does static only. For dash, the question is valid - the current dash implementation is dynamic only, but given that switching to static only could probably be made more efficient, and dash values efficiency, it's a reasonable goal. For bash and zsh, which currently are dynamic only, the problem stems from the fact that there are now a number of shell script libraries for these two shells that have exploited dynamic scoping, and which would break if we aren't careful to standardize something that can still allow dynamic scoping as an extension.

In other words, this was a probe of the various shell implementers to figure out how easily static scoping can be added on after the fact to a dynamic scoping implementation, so that the shell could conform to a future POSIX revision that mandates static and permits dynamic as an extension.

> With this background I doubt any proposal for dynamic scoping will make it into the next POSIX standard.

There's no desire for dynamic scoping in POSIX; David Korn has already made that point clear on the Austin Group mailing list. Rather, there is a desire for minimal effort for complying with a new POSIX requirement of static scoping on shells that currently lack it, as well as backwards compatibility for shells that wish to continue to provide dynamic scoping as an extension to the standard.

My take of the Austin Group list discussion is that the next revision of the standard is most likely to have consensus if it just mandates 'typeset' for static scoping, and leaves 'local' as an implementation extension for dynamic scoping. Please, chime in on the Austin Group conversation if you have something useful to add.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
Re: static vs. dynamic scoping
[redirecting back to the list, so others can benefit]

On 11/10/2010 02:16 AM, Marc Herbert wrote:
> On 09/11/2010 21:52, Eric Blake wrote:
>> I'm trying to standardize the notion of local variables for the next revision of POSIX, but before I can do so, I need some feedback on two general aspects:
>> [...]
>> Here's a sample shell script that illustrates the difference between the two scoping methods.
>
> Hi Eric,
> I found your sample script quite confusing. To make your point, does this script really need to:
>
> - use unquoted language keywords as string values?

No; I could have used other strings.

> - use deprecated typeset instead of declare?

Yes - the current Austin Group thoughts are to standardize 'typeset' and NOT 'local', since 'typeset' can be used with arguments outside of functions, and more existing shells provide 'typeset' than 'local' (dash being the odd one out) or 'declare'. Shells can continue to provide 'local' as a synonym for the most basic use of typeset.

> - use the not (or less?) standard function keyword?

Yes - ksh93 ONLY supports function-local scoping when using the function keyword, rather than when using POSIX functions (although David Korn agreed that if POSIX standardizes function-local scoping, he'd make the next build of ksh support it in POSIX functions).

So, here's the example again, with those points addressed:

# Demonstrate ksh local scoping is static - requires ksh's 'function'
$ ksh -c 'function f1 { typeset a=temp; f2; echo in f1: $a; }; function f2 { echo in f2: $a; a=changed; }; a=global; f1; echo top level: $a'
in f2: global
in f1: temp
top level: changed

# Demonstrate that with POSIX functions, ksh has global scoping
$ ksh -c 'f1 () { typeset a=temp; f2; echo in f1: $a; }; f2 () { echo in f2: $a; a=changed; }; a=global; f1; echo top level: $a'
in f2: temp
in f1: changed
top level: changed

# Demonstrate that dash local scoping is currently dynamic
$ dash -c 'f1 () { local a=temp; f2; echo in f1: $a; }; f2 () { echo in f2: $a; a=changed; }; a=global; f1; echo top level: $a'
in f2: temp
in f1: changed
top level: global

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
static vs. dynamic scoping
On the Austin Group mailing list, David Korn (of ksh93 fame) complained[1] that bash's 'local' uses dynamic scoping, but that ksh's 'typeset' uses static scoping, and argued that static scoping is saner since it matches the behavior of declarative languages like C and Java (dynamic scoping mainly matters in functional languages like lisp):

[1] https://www.opengroup.org/sophocles/show_mail.tpl?CALLER=show_archive.tpl&source=L&listname=austin-group-l&id=14951

I'm trying to standardize the notion of local variables for the next revision of POSIX, but before I can do so, I need some feedback on two general aspects:

1. Implementation aspect: How hard would it be to add static scoping to dash? Is it something that should be added in addition to dynamic scoping, via the use of an option to select the non-default mode (for example, 'local -d' to force dynamic, 'local -s' to force static, and 'local' to go with default scoping)? Or should dash switch entirely to static scoping (my gut feel is that static scoping may be more efficient to implement, which fits in line with dash's desire to be as lean as possible)?

2. User aspect: Is anyone aware of a script that intentionally uses the full power of dynamic scoping available through 'local' which would break if scoping switched to static?

Here's a sample shell script that illustrates the difference between the two scoping methods (note that ksh only provides nested scoping via its typeset builtin, and only when using the function reserved word; dash spells the builtin 'local').

$ ksh -c 'function f1 { typeset a=local; f2; echo $a; }; function f2 { echo $a; a=changed; }; a=global; f1; echo $a'
global
local
changed

$ dash -c 'f1 () { local a=local; f2; echo $a; }; f2 () { echo $a; a=changed; }; a=global; f1; echo $a'
local
changed
global

In static scoping, function f2 does not shadow a declaration of a, so references to $a within f2 refer to the global variable. The local variable a of f1 can only be accessed within f1; the behavior of f2 is the same no matter how it was reached.

In dynamic scoping, function f2 looks up its call stack for the closest enclosing scope of a variable named a, and finds the local one declared in f1. Therefore, the behavior of f2 depends on how f2 is called.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
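The dynamic-scoping case described above can be laid out as a standalone script, which may be easier to follow than the one-liners. A sketch, assuming a shell with a dynamically scoped 'local' builtin (dash and bash both provide one; it is not POSIX).

```shell
f1() {
    local a=temp
    f2
    echo "in f1: $a"
}

f2() {
    echo "in f2: $a"    # dynamic scoping: finds the 'a' declared in f1
    a=changed           # ...and modifies f1's local, not the global
}

a=global
f1
echo "top level: $a"
```

Under static scoping, f2 would instead print and modify the global, so the first and last lines of output would differ.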
Re: [PATCH] [INPUT] Catch attempts to run a directory as a script
On 10/06/2010 04:55 AM, Jonathan Nieder wrote: But POSIX makes it clear enough that in sh command_file, command_file is supposed to be a file, not a directory. So diagnose this with an error message and exit with status 2. [...] Is this required by POSIX? If not this is simply making dash bigger for no good reason. Not clear. I suppose POSIX usually doesn't require anything when the caller screws up. POSIX requires that input files to bash shall be text files; directories do not qualify for this. http://www.opengroup.org/onlinepubs/9699919799/utilities/sh.html The input file shall be a text file, except that line lengths shall be unlimited. However, that is a requirement on the user, not the shell; so running 'sh /' is a constraint violation by the user, and leaves behavior up to the shell. Under OPERANDS[2]: if the path contains a slash, all the standard says is the implementation attempts to read that file. If the path does not contain a slash and the file is not in the working directory, the implementation _may_ perform a search as described in Command Search and Execution. It's more than just MAY; it's a requirement: http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01 If the command name contains at least one slash, the shell shall execute the utility in a separate utility environment with actions equivalent to calling the execve() function... If the execve() function fails due to an error equivalent to the [ENOEXEC] error, the shell shall execute a command equivalent to having a shell invoked with the command name as its first operand During that search, after execve() fails, if the executable file is not a text file, the shell _may_ bypass this command execution. In this case, it shall write an error message, and shall return an exit status of 126. (emphasis mine). 
But yes, that same section is clear that for both command searches along PATH for a word without slash, and for a direct command with a slash, if execve() fails with ENOEXEC (as it does for directories), then it is optional whether the shell bypasses attempts to read the file because it was not a text file. On the other hand, in Linux, execve(.,...) fails with EACCES, as permitted by the standard: http://www.opengroup.org/onlinepubs/9699919799/functions/execve.html [EACCES] ...or the new process image file is not a regular file and the implementation does not support execution of files of its type. And since EACCES is not the same class as ENOEXEC, there is no requirement for the shell to attempt to execute the same file. So, rather than stat()ing the argument in advance and checking for S_ISDIR, it seems like it would be simpler to check after the execve() attempt for EACCES and blindly set $? to 126 in that case (since you already have to check for ENOEXEC). So this behavior is allowed as an optional subset of an optional behavior. That may have guided the bash implementors: $ bash directory directory: directory: is a directory $ echo $? 126 It's probably not required. Additionally, the standard REQUIRES that 'sh -c exec /' shall fail with status 126: http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#exec If command is found, but it is not an executable utility, the exit status shall be 126. Right now, dash gets this wrong: dash -c 'exec .'; echo $? exec: 1: /: Permission denied 2 And since you already have the code in dash to detect failure to 'exec' a directory, you should be able to reuse that code when detecting failure to run a directory as a script, as in 'dash .'. [Hmm, bash also gets it wrong: bash -c 'exec .'; echo $? bash: line 0: exec: .: not found 127 even though . 
should always be found] -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org -- To unsubscribe from this list: send the line unsubscribe dash in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
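The two cases discussed above (running a directory as a script, and exec'ing a directory) can be probed with a minimal sketch like the following. SH_UNDER_TEST is a hypothetical variable introduced only for this illustration, and the sketch deliberately makes no claim about what any given shell version prints; per the reading of POSIX above, 126 is the status one would hope to see for the exec case.

```shell
# SH_UNDER_TEST is a hypothetical variable naming the shell to probe;
# it is not defined by dash or bash. Default to plain sh.
sh_bin=${SH_UNDER_TEST:-sh}

# Case 1: run a directory as a script, as in 'sh .'.
# The '&& ... || ...' form records the status without tripping 'set -e'.
"$sh_bin" . </dev/null >/dev/null 2>&1 && script_status=0 || script_status=$?

# Case 2: exec a directory from inside the shell, as in "sh -c 'exec .'".
"$sh_bin" -c 'exec .' </dev/null >/dev/null 2>&1 && exec_status=0 || exec_status=$?

# Report both statuses so different shells can be compared side by side.
printf 'directory as script: %s, exec of directory: %s\n' \
    "$script_status" "$exec_status"
```

Running it once with SH_UNDER_TEST=dash and once with SH_UNDER_TEST=bash gives a quick side-by-side of how each shell currently handles the two cases.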
Re: [PATCH] [EVAL] with set -e exit the shell if a subshell exits non-zero
On 06/28/2010 01:22 AM, Cristian Ionescu-Idbohrn wrote: Has bash's behaviour changed recently (I'm using an ancient version)? Yes - bash 4.1 tries harder to be compliant with the recent Austin Group interpretations (and more like ksh). bash 3.2.39 and 4.0.37 are behaving as dash without the suggested patch. Still. What is the correct behaviour? That should be the essential matter IMO, not what others do. So, why should this fail: $ dash -c 'set -e; false; echo here' and this succeed? $ dash -c 'set -e; (false); echo here' According to the Austin Group: http://austingroupbugs.net/view.php?id=52 the desired behavior is: Replace the description of -e with: -e When this option is on, when any command fails (for any of the reasons listed in [xref to 2.8.1] or by returning an exit status greater than zero) the shell immediately shall exit with the following exceptions: 1) The failure of any individual command in a multi-command pipeline shall not cause the shell to exit. Only the failure of the pipeline itself shall be considered. 2) The -e setting shall be ignored when executing the compound list following the while, until, if, or elif reserved word, a pipeline beginning with the ! reserved word, or any command of an AND-OR list other than the last. 3) If the exit status of a compound command other than a subshell command was the result of a failure while -e was being ignored, then -e shall not apply to this command. This requirement applies to the shell environment and each subshell environment separately. For example, in set -e; (false; echo one) | cat; echo two the false command causes the subshell to exit without executing echo one; however, echo two is executed because the exit status of the pipeline (false; echo one) | cat is zero. Per these rules, both 'set -e; false; echo here' and 'set -e; (false); echo here' are silent in bash 4.1. The fact that dash is not silent when a subshell exits with non-zero status is at odds with the above Austin Group ruling. 
-- Eric Blake ebl...@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
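The three snippets from this discussion can be compared across shells with a small sketch along these lines. The helper name run_snippet and the variable SH_UNDER_TEST are assumptions made for the illustration; the expectations in the comments come from the Austin Group text quoted above, not from any particular shell release.

```shell
# Hypothetical helper: run a snippet in a fresh shell and capture its stdout.
# SH_UNDER_TEST is an illustrative variable, not something dash or bash define.
run_snippet() {
    "${SH_UNDER_TEST:-sh}" -c "$1" 2>/dev/null || true
}

# A failing simple command under -e: the shell exits before 'echo here'.
plain=$(run_snippet 'set -e; false; echo here')

# A failing subshell under -e: per the interpretation above this should
# also be silent, since rule 3 explicitly excludes subshell commands.
subshell=$(run_snippet 'set -e; (false); echo here')

# The pipeline example from the ruling: the subshell stops at 'false',
# but the pipeline's status is that of 'cat', so 'echo two' still runs.
pipeline=$(run_snippet 'set -e; (false; echo one) | cat; echo two')

printf 'plain=[%s] subshell=[%s] pipeline=[%s]\n' \
    "$plain" "$subshell" "$pipeline"
```

A shell that prints a non-empty value for the subshell case is behaving like dash without the suggested patch.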
Re: test -x should use faccessat, not stat
According to Herbert Xu on 4/2/2010 8:03 AM: After much deliberation (alright, I've simply been busy elsewhere :) I've committed this patch. commit 1d68712ba2e439f36874c4ed1e3d9ffec177a06c Note that faccessat doesn't handle ACLs when euid != uid, as this case is currently implemented by glibc instead of the kernel, using code similar to the existing dash test. That faccessat bug is only true for current Linux kernels. Cygwin faccessat does the correct thing, even when euid != uid. Thanks for applying this. -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net
Re: bugs in cd
According to Eric Blake on 7/14/2009 3:39 PM: For the cd command, POSIX 2008 requires that after all pathnames in CDPATH have been tested and failed in step 5, then step 6 interprets the directory argument relative to PWD. In other words, this demonstrates a bug: $ dash -c 'cd /tmp; mkdir -p foo; CDPATH=oops; cd foo; echo $?; pwd' cd: 1: can't cd to foo 2 /tmp while bash gets it correct: $ bash -c 'cd /tmp; mkdir -p foo; CDPATH=oops; cd foo; echo $?; pwd' 0 /tmp/foo Furthermore, POSIX requires that if the element in CDPATH ends in slash, that no additional slashes are added while forming the candidate curpath. In light of the fact that //home need not be the same directory as /home (and indeed, on cygwin, they are distinct entities), this is also a bug: $ dash -c 'CDPATH=/; cd home' //home $ bash -c 'CDPATH=/; cd home' /home Ping. -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net
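Both CDPATH checks above can be reproduced without touching /tmp/foo or /home by using a scratch directory. This is only a sketch: the variable names are made up for the illustration, and the comments describe what the POSIX reading above calls for, not what a given dash release does.

```shell
# Scratch area so the check does not depend on /tmp/foo or /home existing.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/foo"

# Step 6 fallback: no CDPATH entry yields a usable directory, so 'foo'
# should be resolved relative to PWD and cd should succeed.
fallback_dir=$(cd "$demo_root" && CDPATH=oops && cd foo >/dev/null 2>&1 && pwd) \
    && fallback_status=0 || fallback_status=$?

# Trailing slash in a CDPATH element: the candidate path should be formed
# without adding a second slash, i.e. <root>/foo rather than <root>//foo.
joined_dir=$(CDPATH="$demo_root/" && cd foo >/dev/null 2>&1 && pwd) || joined_dir=

printf 'fallback: status=%s dir=%s\n' "$fallback_status" "$fallback_dir"
printf 'joined:   dir=%s\n' "$joined_dir"

# Remove the scratch directory again.
rm -rf "$demo_root"
```

A shell with the first bug reports a non-zero status for the fallback case; a shell with the second bug shows a doubled slash in the joined path.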
Re: avoid compiler warning
According to Herbert Xu on 8/11/2009 3:56 PM: On Tue, Aug 11, 2009 at 09:33:43AM -0700, H. Peter Anvin wrote: Herbert... the *type* is int, but the *value* has to be in the range [-1,UCHAR_MAX] or the behavior is undefined in both the C and POSIX standards. Good point. I'll apply the patch. I'd be very surprised though if this was the only instance in which we pass a char along. Ping. Or do we want to go with an alternate patch of defining our own version of ISDIGIT that handles the entire range of int and avoids checking the current locale, since POSIX guarantees that isdigit can only return true for the ten bytes '0' through '9'? -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net
Re: avoid compiler warning
According to Herbert Xu on 8/10/2009 10:03 PM: On Thu, Jul 09, 2009 at 12:55:25PM +, Eric Blake wrote: ccache gcc -DHAVE_CONFIG_H -I. -I.. -include ../config.h -DBSD=1 -DSHELL -DIFS_BROKEN -Wall -gdwarf-2 -Wall -Werror -MT mystring.o -MD -MP -MF .deps/mystring.Tpo -c -o mystring.o mystring.c miscbltin.c: In function `umaskcmd': miscbltin.c:201: warning: subscript has type `char' isdigit is only defined over EOF and unsigned char values, so without this patch, you can trigger undefined behavior. What compiler and what libc was this? isdigit is supposed to be a function that takes an int argument according to POSIX. If libc implements it as a macro then it's up to it to cast the parameter to (int). This is with recent newlib (the warning was intentionally added exactly to catch the sort of bugs that my patch fixes), coupled with any version of gcc 3.4 or newer. Additionally, there is a pending bug report against glibc requesting that glibc add the same QoI warning to flag potentially buggy code, since it is quite easy to flag the use of raw char as a bug: http://sources.redhat.com/bugzilla/show_bug.cgi?id=10296 In a particular demonstration of the bug, there are some locales where isspace('\xff') is false but isspace((unsigned char)'\xff') [aka isspace(0xff)] is true, when char is a signed type. And although both glibc and newlib cater to most instances of this bug (as a QoI enhancement, these libraries guarantee that isspace('\xfe') [aka isspace(-2)] returns the same result as isspace(0xfe)), not all platforms have this QoI support, and can actually end up dereferencing outside of array bounds. 
Note that isdigit is a bit unique among the ctype macros: C89 and C99 state that it is locale dependent (with a range still limited to EOF or [0-UCHAR_MAX]), but POSIX adds the additional restriction that it only return true for the ten contiguous characters '0' through '9', meaning that any POSIX-compliant version of isdigit(x) can be as simple as ((unsigned)(x)-'0'<=9) for all x, without regard to locale or out-of-range arguments. But not all locales comply with POSIX, so it is not generally portable to rely on isdigit being this simple or fast. On the other hand, there are a number of packages that #define ISDIGIT, rather than use ctype's isdigit, exactly to get the speedup guaranteed by POSIX; this may be something you want to do in dash. -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net