Bug#759230: shell: Always use explicit large file API
On Thu, May 07, 2020 at 11:42:12PM +1000, Herbert Xu wrote: > There are some remaining stat/readdir calls in dash that may lead > to spurious EOVERFLOW errors on 32-bit platforms. This patch changes > them (as well as open(2)) to use the explicit large file API. > Reported-by: Tatsuki Sugiura > Signed-off-by: Herbert Xu > diff --git a/configure.ac b/configure.ac > index 5dab5aa..dbd97d8 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -144,8 +144,13 @@ AC_CHECK_FUNC(stat64,, [ > AC_DEFINE(fstat64, fstat, [64-bit operations are the same as 32-bit]) > AC_DEFINE(lstat64, lstat, [64-bit operations are the same as 32-bit]) > AC_DEFINE(stat64, stat, [64-bit operations are the same as 32-bit]) > + AC_DEFINE(readdir64, readdir, > + [64-bit operations are the same as 32-bit]) > + AC_DEFINE(dirent64, dirent, > + [64-bit operations are the same as 32-bit]) > ]) > [snip] Is it possible to use AC_SYS_LARGEFILE and the normal fstat/lstat/readdir/dirent/open/etc instead of the non-standard "64" interfaces? I understand that dash formerly used the "64" interfaces selectively, so that the binary could be a bit smaller by avoiding 64-bit numbers where they were not necessary. Now that the feature "supports files with inode numbers that do not fit in 32 bits" is considered essential, this complexity seems unnecessary. I'm sorry for not providing a patch. -- Jilles Tjoelker
Bug#953421: dash: Resident Set Size growth is unbound (memory leak) on an infinite shell loop
On Sun, Mar 29, 2020 at 08:06:31PM +0100, Harald van Dijk wrote: > On 29/03/2020 18:54, Vitaly Zuevsky wrote: > > I have now fixed this bug locally. > > The leak is in jobtab array (jobs.c). I concluded that the most > > logical approach would be eliminating inconsistency between > > makejob() and dowait() functions. My fix in a forked repo: > > https://salsa.debian.org/psvz-guest/dash/-/commit/5e3ea90cb3355d1308c482661a471883d36af5e7 > This change is incorrect. The reason dash keeps on allocating memory is > because dash needs to keep on allocating memory. Consider this script: > set -- $(seq 1 100) > for i > do > : & > sleep .1 > done > for i > do > wait %$i > done > This is a valid script and works fine in dash. Your change breaks this by > not keeping the jobs around long enough, and I hope this test script shows > that there is no way to keep the jobs around long enough but by allocating > ever more memory. I agree that the change is incorrect, but I do not agree that this kind of script must leak memory. Per POSIX.1-2008 XCU 2.9.3.1 Asynchronous Lists, an implementation has additional ways to forget about jobs than just an appropriate completion of the wait utility: if another asynchronous job is started when $! was not referenced or if the number of known process IDs would exceed {CHILD_MAX} (which tends to be rather big, though). POSIX does not seem to expect using % in scripts like this; it seems highly fragile to me anyway (although $! has problems with process ID reuse). FreeBSD sh implements forgetting when $! was not referenced (and the job has terminated), but not the {CHILD_MAX} limit. This avoids the increasing memory usage in the example script. > Your change makes it impossible to keep track of the background process's > status, but if you do not care about that anyway, you can avoid the > increasing memory use without modifying dash by launching a background > process without including it in the current shell's job table, by launching > it from a subshell: > while true > do > (true &) > sleep .1 > done Certainly a good idea. Another option may be to include regular invocations of the wait utility without parameters, although this is not suitable for all scripts. -- Jilles Tjoelker
Bug#788979: dash: "Argument list too long" error on not too long argument
On Sat, Aug 15, 2015 at 02:13:22AM +0200, Vincent Lefevre wrote: > On 2015-08-14 20:16:27 +0200, Jilles Tjoelker wrote: > > This is not a bug in dash but either in the Linux kernel or in glibc. > > The existence of MAX_ARG_STRLEN (see execve(2)) could be considered a > > bug, since POSIX does not provide a hint that a limit on the length of > > an individual argument may exist. > Actually such a limitation seems to be allowed by POSIX: > [E2BIG] > Argument list too long. The sum of the number of bytes used by > the new process image's argument list and environment list is > greater than the system-imposed limit of {ARG_MAX} bytes. > or: > Lack of space in an output buffer. > or: > Argument is greater than the system-imposed maximum. > This is the third case, but then, the error message is incorrect > ("Argument list too long" corresponds to the first case). A glibc > bug in this case? There's the same message with bash and zsh. I think that condition is intended for different functions, since it is mentioned in XSH 2.3 Error Numbers and not in the exec page in XSH 3 System Interfaces. Although POSIX allows implementation-specific error conditions, they may be a bad idea because they hinder portability. > > However, it seems unlikely that this bug will be fixed given that it > > has existed for so long already and was deliberately introduced. > Perhaps it made sense 10 years ago, but nowadays this limit is > rather ridiculous (makes me think of Microsoft's 640 KB limit). I think this was that kind of limit, yes, although it introduced dependency on gmake smarts years ago (in Linux, it's common for UTILITY MANY_ARGS to be acceptable but sh -c 'UTILITY MANY_ARGS' to fail with [E2BIG]). > > Alternatively, the bug is an incorrect return value from > > sysconf(_SC_ARG_MAX). Given the existence of MAX_ARG_STRLEN, the value > > returned should never be more than MAX_ARG_STRLEN (131072 on most > > architectures), since that is the longest total length of arguments that > > is guaranteed to avoid [E2BIG] errors. > Since POSIX allows such a system limit in addition to the total > size, I'd say that the sysconf value is correct, but the error > message is not the correct one (see above). > > The [E2BIG] from "/bin/true" by itself seems strange and does not match > > the bug's descriptive text. > This problem seems to be specific to dash. In interactive mode outside functions, dash sets the variable _ to the last argument after executing a simple command, but normally does not export it. Both bash and zsh export the variable, though, so when dash is started from bash or zsh, each command gets the last argument of the previous command in its environment. This may cause [E2BIG] errors. Bash and zsh set _ before running the command, so the program gets its own last argument and it is unlikely [E2BIG] will occur that would not have occurred without _. -- Jilles Tjoelker
Bug#759230: dash: glob fails on large volume (64bit inodes) with i386 binary.
On Mon, 25 Aug 2014 22:32:00 +0900 Tatsuki Sugiura wrote: > Package: dash > Version: 0.5.7-4 > Severity: normal > When I create i386 chroot environment on filesystem that > have 64bit inode (large xfs/btrfs volume or xfs on 64bit kernel), > dash glob may not work properly by useing 32-bit APIs. > By this reason, pbuilder may fail when building i386 > package on amd64 host with xfs/btrfs volume. > I hope to work i386 chroot on amd64 if it's possible. Dash is careful to use largefile APIs only when needed. However, non-largefile stat()/lstat()/fstat()/readdir() are always wrong because of 64-bit inode numbers. Therefore, the only correct use of non-largefile APIs are the opens of scripts, /dev/tty and /dev/null (and maybe not even the latter, if /dev/null has a file offset). Is it still worth the complexity to use largefile only selectively? -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#753674: [dash] dash aborts a script, when 'shift' tested by 'while' fails
On Fri, 04 Jul 2014 11:07:28 +0200 Andre Naujoks wrote: > I just noticed Bug #627856, which deals with a very similar issue. > Is a shift without an argument also considered a syntax error if it is > called too often? If yes, then this bug can be closed and I am sorry > for the noise. I'd say it's the same thing: "shift" is the same as "shift 1". > IMHO this should not be considered a syntax error, because it is > (again IMHO) a valid way of getting all arguments of a script or shell > function. Your code ] echo $1 ] while shift; do ] echo $1 ] done looks a bit strange. It only works properly if there is at least one argument to start with, and duplicates the handling for the first argument (although you can fix that by placing that piece of code directly after 'while'). It would be normal in Perl where 'shift' returns the value shifted out but the shell's 'shift' does not return that. It looks more normal to write code like: ] while [ "$#" -gt 0 ]; do ] echo $1 ] shift ] done or ] for arg do ] echo $arg ] done and then there is no issue with shift failing. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#755446: dash: Should pass SA_RESTART to sigaction
On Mon, 21 Jul 2014 00:19:33 +0200 Samuel Thibault wrote: > Package: dash > Version: 0.5.7-4 > Severity: normal > User: debian-h...@lists.debian.org > Usertags: hurd > We have been having issues on hurd-i386 with the dash shell, we often > get "cannot create /dev/null: Interrupted system call" errors. With > research from Svante Signell, we have found that this is actually due > to a signal which unfortunately interrupts the open() call used by > dash to redirect some output. dash indeed does not set the SA_RESTART > flag when calling sigaction for SIGINT or SIGCHLD. There can probably > be other calls in dash that could get interrupted too. dash seems to > take care of at least some of them by handling EINTR, but it's much > safer to simply let the system handle all of them by setting > SA_RESTART. The attached patch does this, and fixes the issue on > hurd-i386, could you consider applying it? This breaks interrupting blocking redirection opens with ^C. Although relying on [EINTR] like this leaves a race window (where SIGINT arrives between the last check and the open blocking), if the open has blocked "long enough", SIGINT will make it fail with [EINTR]. In most cases, dash avoids this race by longjmp from the signal handler but for a blocking open that approach may leak the file descriptor. The cause of the regression is the SIGCHLD handler which is always enabled even if there is no SIGCHLD trap. It will probably help to use SA_RESTART for the SIGCHLD handler if there is no trap (therefore, setting a trap should disable SA_RESTART). In FreeBSD sh, another ash derivative, I have been even more careful, only enabling the SIGCHLD handler while it is needed (during a wait builtin). Apart from that, my FreeBSD/Linux-centric view says that opening /dev/null should not fail with [EINTR], just like a read or write from/to a regular file on a "normal" filesystem (an intr NFS mount is not normal). -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#745653: dash does not squash // to / when doing `cd //`
In Debian bug #745653, you wrote: > If I envoke cd in dash via: > cd // > it incorrectly reports my cwd at // instead of / POSIX permits this (for Cygwin and the like that want to have special pathnames starting with //). Three or more slashes are equivalent to one. You might want to use cd's -P option to force resolution of symlinks; this will also resolve // down to / if they are equivalent. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#746186: dash: Using wildcard in redirect doesn't work
In Debian bug #746186, you wrote: > I was trapped by this strange bug. It seems that wildcards ("?" and > "*") are not resolved when used in redirection. This is not a bug, but required by the POSIX specification. Per POSIX, a non-interactive shell shall not generate pathnames in redirections; an interactive shell may, provided exactly one pathname results. Dash chooses the simplest thing that can be compliant here: it never generates pathnames in redirections. You can see the same behaviour in scripts with bash --posix or ksh93. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#733852: dash: spurious "> " output after Ctrl-V Ctrl-J
In Debian bug #733852, Vincent Lefevre wrote: > When one types Ctrl-V Ctrl-J (to make a ^J control character appear > on the command line), dash outputs a spurious "> " string (one by > ^J occurrence) once the command is validated. For instance: > $ echo "ab^Jcd^Jef" > > > ab > cd > ef > $ true "ab^Jcd^Jef" > > > $ > where the ^J has been entered with Ctrl-V Ctrl-J. I think this is a normal consequence of using the kernel line editing ("canonical mode", ICANON) instead of something like libedit or libreadline. In the sequence of bytes returned from read(), Ctrl+V Ctrl+J looks the same as . One difference is that the kernel returns (parts of) one line in a single read() call, which could be used to distinguish between the two kinds of newline. This is not fully reliable because a newline at the last byte of a read call that fills the buffer completely cannot be analyzed this way; both Linux and FreeBSD permit longer lines than their advertised {MAX_CANON} so the problem cannot be avoided with a {MAX_CANON}+1 buffer. I expect that the extra ugliness caused by this incomplete check will not be worth it. It would be better to enable libedit or to use a version of linenoise (link it statically to avoid the impact on fork() performance). By the way, you can also have a "spurious" "$ ": $ echo a^Jecho b a $ b $ -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#726735: dash: redirect stdin from stdin to override asynchronous list redirection fails
Hi Serge, ltns, In Debian bug #726735, you wrote: > Dash redirects stdin from /dev/null when a command is started > asynchronously, as specified by POSIX. However, POSIX also specifies > that this can be overridden by explicitely redirecting. This fails > when redirecting stdin from stdin. > The relevant text from POSIX ( > http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_09_03_02 > ): > ] The standard input for an asynchronous list, before any explicit > ] redirections are performed, shall be considered to be assigned to a file > ] that has the same properties as /dev/null. If it is an interactive > ] shell, this need not happen. In all cases, explicit redirection of > ] standard input shall override this activity. > To test: create the following script: > #!/bin/dash > sleep 100 <&0 & > Run this, and then look at the open file descriptors of the sleep > process: > ls -l /proc/`pgrep sleep`/fd > The output is as follows: > lr-x-- 1 svdb svdb 64 Oct 18 15:58 0 -> /dev/null > lrwx-- 1 svdb svdb 64 Oct 18 15:58 1 -> /dev/pts/2 > lrwx-- 1 svdb svdb 64 Oct 18 15:58 2 -> /dev/pts/2 > fd 0 is assigned to /dev/null instead of /dev/pts/2 In the example script, the redirection happens in the child process. Therefore, the shell cannot, in the general case, know about this redirection when it forks the child process. For some reason, bash still overrides the implicit
Bug#716789: Bug #716789: dash: sometimes misbehaves when backticks are included after an || operator
In Debian bug #716789, you wrote: > [cat <
Bug#713805: Bug #713805: dash: does not pass/recognize parameters to/in sourced script
In Debian bug #713805, you wrote: > [dot ignores parameters after the script pathname] POSIX does not specify what happens if dot is given more than one parameter, and dash simply ignores the excess. I think it is unlikely that dash will add the extension of temporarily setting the excess (if not empty) as the positional parameters. You can set the positional parameters using 'set -- ' before sourcing the script, or you can create a function that sources the script and gets passed the desired parameters. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#700232: Bug #700232: dash: Segmentation fault in here-document and syntax error
In bug #700232, you wrote: > dash -c ': <
Bug#690473: Bug #690473: dash: No diagnostic on echo builtin failure
Hi, Patching bltin/printf.c to generate an error message does not seem to make much sense given that the non-zero exit status is set by eval.c. Why can the error message not be generated there as well? This is also what I do in FreeBSD sh (and I return exit status 2 instead of 1 in this situation). Note that echo and printf are not the only builtins that output things to stdout and can encounter errors because of that. Some shells ignore this fact, pretending it is OK if builtins like jobs, set, times and trap cannot write their output. ksh93 and mksh appear not to print an error message either. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#589103: lsof on kFreeBSD
> [lsof on debian kfreebsd] I don't think it makes much sense to extend linprocfs to be a full Linux /proc, just to be able to run monitoring tools. This would only be useful for Debian/kFreeBSD and not for real FreeBSD. Instead, lsof should obtain its data via sysctl, either directly or via libprocstat (the library is new in 9.0 and the 9.0 kernel also contains additions to the sysctl interface needed by libprocstat). These interfaces are used by 'procstat -f', 'sockstat', 'fstat' and 'fuser' which together provide functionality similar to lsof. (In 8.x and earlier, fstat needs kmem access but not in 9.0 and later.) As with the move on Linux from kmem to /proc many years ago, some very specialized information may have to be given up, but that's worth it. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#651446: dash builtin read has no -s option for silent
> [dash's read does not have -s to turn off echo] You can turn off echoing using a sequence like settings=$(stty -g) stty -echo IFS= read -r password stty "$settings" Some additional code is required to restore the terminal settings even if SIGINT, SIGQUIT, SIGTERM or similar happens. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#590920: dash does not support multi-digit file descriptors
> [possibly allowing fd>9 in redirections] Removing the limit of 10 file descriptors accessible to redirections is not as easy as it seems. An easy change is to raise the limit from 10 to, say, 24, like mksh has done. Although this allows access to more file descriptors, it also increases the 'ulimit -n' necessary to run the shell and the new limit is still pretty low (not high enough for the broken example in the old flock man page, for example). File descriptors above this limit are used (via fcntl(F_DUPFD)) for reading shell scripts, preserving open files that need to be restored after a redirected command and all other internal purposes that last longer than a command. What bash does is to remove the fixed boundary. When the user tries to overwrite a file descriptor the shell uses internally, it will move that descriptor to another number before allowing the operation to proceed. Implementing this will increase code size and complexity. In ksh93, file descriptors before a redirection operators are limited to one digit but a special syntax allows accessing more descriptors. A command like exec {in}
Bug#642922: regression: "sh -c" change causes FTBFS
> > POSIX's Shell and Utilities (XCU) 2.12 [1] does say that "[the] > > environment of the shell process shall not be changed by the utility", > > and that environment includes open files. My understanding is that > > dash's new behaviour (and incidentally, ksh93's one) is incorrect. > As I understand it, the intent of that section was not to prohibit > changes in process state due to a command like "exec foo" replacing > the shell process. The shell execution environment is not changed but > simply no longer exists once execve() has been called. The word "process" seems a bit awkward here, yes. For the most part, POSIX is very careful in writing "shell environment" instead of "shell process", so that creating a new process can often be avoided by clever implementations. Anyway, if the shell has found that it can safely exec a simple command without forking (or run a subshell without forking), there is no possibility of any execution in the original shell environment, so there seems no need to preserve any open files or other resource needed for such execution. Note that 2.7 Redirection does not say anything explicit about preserving the previous content of file descriptors, only that a redirection on a command (except 'exec') shall be in effect during that command only. Therefore, the shell only needs to save file descriptors at high numbers if it may need to execute subsequent commands in the same process ("may" because a trap for EXIT might be set in the command, if it is not external). Also note that 2.9.1 Simple Commands does not explicitly call for creating a new process. If there are subsequent commands, the shell will obviously need to create a new process, but otherwise that is not needed. > More to the point, POSIX is very easy to change[1], so if this > behavior is causing a problem in otherwise reasonable application > code, methinks it would be better to discuss that instead... The behaviour you are complaining about exists in almost all shells (tried: FreeBSD /bin/sh, dash, bash, zsh, mksh, ksh93, Heirloom Bourne shell, kBuild kmk_ash (much like NetBSD /bin/sh)) in some contexts, so it is unlikely to change. To demonstrate, where ${SH} is the shell to be tested: ${SH} -c '/bin/sleep 10 >/dev/null &' | { cat; echo EOF; } The "EOF" line appears immediately in all tested shells, showing that there is nothing holding onto the pipe in the subshell environment created by '&'. On the other hand, if the redirection is 2>/dev/null instead, it takes ten seconds before "EOF" appears, as in this case the pipe file descriptor is passed on. If braces or parentheses are added around the sleep command (including or not including the redirection), behaviour varies per shell. Likewise, behaviour for { ${SH} -c '/bin/sleep 10 >/dev/null' & } | { cat; echo EOF; } varies per shell, and I do not see a reason why this should keep the pipe file descriptor open while shells clearly agree that it should not be kept open in the first example. (Note that I write /bin/sleep explicitly, because some shells have a sleep builtin.) -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#619725: dash: permit special builtin names as function names
> [false || local() { ...; } is a syntax error] Although I don't think special builtins shall be valid function names, I think there is still a bug in dash here: 'local' should not be a special builtin. When 'local' was made a special builtin, I remarked on the dash mailing list that that change may not be a good idea: http://www.mail-archive.com/dash@vger.kernel.org/msg00268.html I think this is a concrete reason why I was right. 'local' should be a regular builtin and should be made to work in another way, such as by: * adding a special case to evalcommand() for 'local' (if (spclbltin > 0 || argc == 0)), or * modifying localcmd() so only local variables are only added to localvars lists belonging to functions. Although I have not tried, the former looks easiest. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#595063: read() builtin doesnt read integer value /proc files (but bashs does)
On Wed, Dec 15, 2010 at 12:55:51PM -0600, Jonathan Nieder wrote: > Cristian Ionescu-Idbohrn wrote: > > On Sun, 28 Nov 2010, Herbert Xu wrote: > > > On Sat, Sep 04, 2010 at 07:35:04PM +0000, Jilles Tjoelker wrote: > >>> This discarding is still bad as it throws away valid data if the open > >>> file description is shared. This happens if stdin is redirected inside a > >> I'm with Jilles on this. I also don't particularly feel like > >> bloating dash just because of the borked /proc interface when > >> there is a perfectly adequate work-around in "cat". > >>value=$(cat /proc/file) > > I wouldn't call that "a perfectly adequate work-around", but a painful and > > unadequate work-around. > For what it's worth, here's what bash does (based on strace): > 1. Determine whether the file is seekable. That is, seek using > SEEK_CUR with offset 0. > 2. If seekable, read a nice big chunk and then seek back to put the > file offset back in the right place. If not seekable, read one byte > at a time. > This works in /proc because files in /proc are seekable. > That said, I don't think borked /proc is a great reason to do this > (it's a better reason to fix /proc). Speeding up the read builtin > might be a good reason. The optimization is of limited benefit (still way more syscalls than strictly necessary) and does not apply to the common use case of reading from a pipe. Generally, if 'read' is too slow, it is better to spend a fork on a tool like grep, sed or awk which processes large amounts of text much more efficiently. As for value=$(cat /proc/file), there are at least two ways to make this faster. The traditional ksh way is the extension value=$(
Bug#436466: dash: Please optimise single command given to -c to exec it
Drake Wilson wrote: > - Point out that full tail execution optimization is not as > localized as the "anything without shell meta-stuff" that some > people think it is, and that partial tail execution optimization > can lead to subtle problems if not done carefully. Cf. a vaguely > similar case where I was attempting to use an XSI shell builtin > (which I was willing to rely on on the target system) as a single > command in a series of commands in a makefile, and GNU Make > decided that it didn't look enough like a shell thing and could be > "optimized" into an exec, which naturally hosed the command. Of > course, the case of doing exec transformation _in_ the shell is > not as bad as that. I think the gmake problem is caused by a non-compliance to POSIX in the operating system, possibly combined with an overzealous skipping of the shell by gmake. It is rather different from the fork/exec problem with sh -c. POSIX requires that all standard utilities except special builtins be available via the exec functions and the utilities that can invoke user-specified other utilities (env, find, nice, nohup, time, xargs). In SUSv4, the special builtins are . (dot), : (colon), break, continue, eval, exec, exit, export, readonly, return, set, shift, times, trap and unset; none of these are under the XSI option. An implementation is not required to provide them as external programs (and there seems no point in doing so). Gmake must know about these and not try to exec them itself. Other builtins that do not exist as external programs yet need to be provided. An easy way is to link a shell script like the below to /usr/bin/alias, /usr/bin/bg, /usr/bin/cd, /usr/bin/command, etc.: #!/bin/sh command -p ${0##*/} "$@" The 'command -p' serves to avoid looping when the PATH is set to a value like /usr/bin:%builtin which is an undocumented way to prefer /usr/bin/FOO to a builtin command FOO. The 'command' utility itself is not subject to this because dash implements the POSIX requirement that various (usually builtin) utilities such as 'command' cannot be overridden via PATH (but unlike the special builtins can be overridden with a function). Regarding sh -c optimization, I am in favour of this. Uselessly waiting 'sh -c' processes annoy me. I made the change for FreeBSD 8.0 sh, which is very similar to dash. The SVN changeset is r194128. The change appears to work for dash as well. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#591321: error: "value too great for base" while comparing numbers more than 7 with leading zeros
Alexey Knyazev wrote: > k...@fish-eye:~$ if (( 06 > 06 )); then echo lazhaa; fi > k...@fish-eye:~$ if (( 07 > 06 )); then echo lazhaa; fi > lazhaa > k...@fish-eye:~$ if (( 07 > 07 )); then echo lazhaa; fi > k...@fish-eye:~$ if (( 08 > 07 )); then echo lazhaa; fi > bash: ((: 08: value too great for base (error token is "08") > k...@fish-eye:~$ if (( 09 > 08 )); then echo lazhaa; fi > bash: ((: 09: value too great for base (error token is "09") > k...@fish-eye:~$ If you accept that the expressions in the non-standard construct ((FOO)) work the same as the ones in the standard construct $((FOO)), which seems reasonable, then this is perfectly correct behaviour. Arithmetic expressions recognize hexadecimal (starting with 0x), octal (starting with 0) or decimal (starting with 1-9) numbers. Another example: the following expression is true: (( 8 == 010 )) The numerical comparisons -lt, -le, -eq, -ne, -ge, -gt in test and [ recognize only decimal numbers. Therefore 09 is valid and nine and 010 is ten and not eight. I will admit that this is messy, but I think changing it now will only create a bigger mess. POSIX says where octal numbers shall and shall not be recognized and software generally complies to that. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#566641: bash: wait $pid exits before $pid has terminated
This looks rather similar to #566639 and can be closed for the same reason. The behaviour is as specified by POSIX and does not seem broken to me. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#482194: bash: PS1 non-empty for non-interactive shells
Some shells indeed seem to not set PS1 for non-interactive shells. However, it seems bad to rely on this to detect non-interactive shells. I don't see language in POSIX that says PS1 should not be set for non-interactive shells. Some people export PS1 in their shell startup files, breaking the check. The correct way to detect interactive vs non-interactive shells is to check if $- contains "i", for example: case $- in *i*) echo I am interactive ;; *) echo I am not interactive ;; esac -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#598239: dash local assignment splits words
Stephen Gildea wrote: > Using the "local" command to do variable assignment should not do word > splitting. Please note that this feature was in dash for some time, but was then removed. See Changelog.O in the dash source tree: On Sun, 23 Nov 1997 12:45:15 +1100, ash 0.3.1-14: * Disabled word-splitting for assignment builtins. On Mon, 24 Nov 1997 16:19:10 +1100: * Fixed incorrect assignment builtin code. On Sun, 15 Jul 2001 17:27:03 +1000, ash 0.3.8-15: * Removed assignment builtins since it is at best undefined by the SuS and also can't be implemented consistently. Unfortunately, neither the dash git repository nor the dash mailing list archives I can find go back that old. When the "treat words that look like assignments specially" behaviour triggers exactly differs between shells that implement it. In ksh93 it is fairly consistent: all builtins it applies to are special, so that when a word looks like such a builtin it is one. (ksh93 doesn't have "local", its "alias" is special, different from what POSIX requires, and its extension "typeset" is also special.) The command word may contain quoting but no expansions to be a candidate for this special treatment. In bash and mksh, it applies to various non-special builtins that can at least be overridden with a function. The special behaviour does not depend on whether the actual builtin is going to be called, or a user-defined function. The command word must be the literal builtin, no quoting or expansions are permitted. Some versions of bash and zsh, configured with certain options, apply the special casing to all words that look like assignments. Another effect of the special casing is that tilde expansion is performed in cases like export PATH=foo:~/bin:bar I agree with the other post that POSIX does not require this behaviour, and might even prohibit it; the prohibition is probably not intended, though, and it seems unlikely to me the standard will be changed to prohibit it -- that would break more than it fixes. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#595063: read() builtin doesn't read integer value /proc files (but bash's does)
On Sat, Sep 04, 2010 at 08:20:33PM +0200, Steve Schnepp wrote: > 2010/9/3 Jilles Tjoelker : > > This patch assumes that the file descriptor is discarded afterwards (its > > position does not matter). Therefore the very common construct > > while read x; do > > ... > > done > > stops working. > Ohh.. thanks for that, I didn't see it. > Actually "while read x" continues to work. > But "reopening the file" doesn't as in : > read a b < datafile > echo ${a} ${b} > read a b < datafile > echo ${a} ${b} You're right, it's even stranger than I expected. > I attached an updated patch that corrects this pb by discarding the > buffer when opening a new file. This discarding is still bad as it throws away valid data if the open file description is shared. This happens if stdin is redirected inside a while read... loop. Furthermore, I think constructions like read x; cat and read x; (read y); read z should keep working. This requires that the input's file position be synced whenever another process may see it (fork/exit). Due to the highly dynamic character of the shell and the common use of fd 0, this probably means that you can't do better than syncing after each read builtin. (For example, 'read' could be overridden with a function after the third line.) Another thought: exec 3<&0; read x; read y <&3 or even sh -c 'read x; read y <&3' 3<&0 Different file descriptors may refer to the same open file description and the shell may not know this. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#595063: read() builtin doesn't read integer value /proc files (but bash's does)
On Thu, Sep 02, 2010 at 05:02:55PM +0200, Steve Schnepp wrote: > 2010/9/1 Steve Schnepp : > > conforming to POSIX isn't a realistic option, would it be possible to > > have a workaround that doesn't involve an external tool like cat(1) ? > Hi, I just hacked & attached a little patch away to be able to solve > this case. > Feel free to reply with your comments. > NB: I just targeted dash-0.5.5.1, but it might apply to any version. This patch assumes that the file descriptor is discarded afterwards (its position does not matter). Therefore the very common construct while read x; do ... done stops working. A possible fix is to check first if the input supports seeking. If it does, use the buffering and at the end of the line seek backwards for the number of bytes remaining in the buffer. If it does not, read one byte at a time. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#595063: read() builtin doesn't read integer value /proc files (but bash's does)
On Wed, Sep 01, 2010 at 10:10:11AM +0200, Steve Schnepp wrote: > Hi, I opened bug 595063 on the debian BTS [1] and I was suggested to > resend the email upstream. > So I copied the body of the bug below : > dash's read() builtin seems to read the underlying file 1 char at a > time. This doesn't work with some files under /proc, since procfs isn't > fully POSIX compliant. > [snip] > After a little digging, it only appears on files that contains just an > integer value. When asked to read with a non-null offset (*ppos != 0), > __do_proc_dointvec() just returns 0 (meaning an EOF) as shown on [2]. > I'm aware that the issue isn't strictly a dash one, since it has the > right to read one character at a time. But since fixing procfs to be > conforming to POSIX isn't a realistic option, would it be possible to > have a workaround that doesn't involve an external tool like cat(1) ? Given that other files in /proc do work, I don't see why the ones that only contain an integer value cannot be fixed. All the necessary state to produce the second and further bytes is available. Choosing a powerful abstraction like a regular file has its implications. Note that a change in the file between the single-byte reads will cause an inconsistent value to be read. This is also the case with regular files on a filesystem, so it is acceptable. If single-byte reads are really unacceptable, then the proper way to read these files needs to be documented, and clear violations that will not work properly should cause an error (in this case, this means that reading one byte from offset 0 should fail like reading one byte from offset 1 does). -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#593426: zsh: Status of background jobs not updated
> [continuing a stopped background job using kill is not reflected in the > output of jobs] I think hooking into the kill builtin is the wrong way to fix this. Instead, use the facilities provided by modern systems which can notify you if a child process continues. These are si_code CLD_CONTINUED for SIGCHLD and the WCONTINUED flag and WIFCONTINUED() macro for waitpid(). Some systems do not provide these, and may not even provide queuing and siginfo for SIGCHLD, so the latter approach seems best. The WCONTINUED stuff can then be #ifdef'ed out for systems that do not support it. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#587741: dash: shell functions seem to make per-command variable assignments permanent
In your example $ dash -c 'noop(){ true;}; x=1; noop; x=broken noop; echo "x is now $x"' x is now broken dash's behaviour is what POSIX requires, and the same can be seen with bash --posix and ksh93. The relevant section is XCU 2.9.5 Function Definition Command. The same thing applies to special builtins such as ':', exec and unset. For example, these two both print 'good': dash -c 'x=bad; x=good :; echo $x' dash -c 'x=good; x=bad true; echo $x' Again bash needs --posix here. The idea behind special-casing some of the builtins is that the special builtins must always be builtins and are treated like how the original Bourne shell treated all builtins, while all other utilities may be implemented as builtins or external programs and are treated like how the original Bourne shell treated external programs (except that some of them affect the shell environment in a way only a builtin can). There is less historical basis for the treatment of assignments on function calls, as such assignments had no effect whatsoever in the original Bourne shell. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#566639: dash: wait $pid exits before $pid has terminated
The early return of 'wait' is described in POSIX XCU 2.11 Signals and Error Handling: the wait builtin shall return immediately with >128 exit status if a trap occurs. It is somewhat hidden as the page for 'wait' makes no mention of this. To avoid this, you can try to 'wait' until the exit status is less than 128. To distinguish between traps and child processes exiting because of signals, the trap handlers will have to do more magic. The early termination of 'sleep' happens because sh's termination causes the process group (formerly containing sh and sleep) to become orphaned, and at least one process (sleep) in that process group is stopped. It is sent a SIGHUP and a SIGCONT signal. This is done because there is noone left to care about the process, and otherwise it would remain stopped forever. (See _Exit() in XSH and orphaned process group in XBD 3 Definitions.) Concludingly, there is no bug here. All the behaviour is as described in POSIX. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#570757: dash: cannot parse multiline parameter expansion
FreeBSD's ash derivative has a similar restriction (newlines inside ${} must be quoted via single-quotes, double-quotes or here document). I think this is a feature as it allows detecting syntax errors where you still have a chance to see the problem easily rather than 100 lines later, and I have not received any other complaints about this. The System V shell (e.g. /bin/sh on Solaris <= 10) behaves similarly. -- Jilles Tjoelker -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org