Re: Re: autocomplete error doesn't look to be in bash-completion so I'm reporting it here.
I got the file by running some code using process substitution with set +o posix. I don't think the directory was empty, but what you say makes sense and I didn't think to check it at the time. Thanks
John

Sent: Sunday, 18 August 2013 at 20:42
From: "Chet Ramey"
To: dethrop...@web.de
Cc: bug-bash@gnu.org, b...@packages.debian.org, chet.ra...@case.edu
Subject: Re: autocomplete error doesn't look to be in bash-completion so I'm reporting it here.

On 8/16/13 5:28 AM, dethrop...@web.de wrote:
> Bash Version: 4.2
> Patch Level: 25
> Release Status: release
>
> Description:
> autocomplete error doesn't look to be in bash-completion so I'm reporting it here.
>
> Repeat-By:
> touch '>(pygmentize -l text -f html )'
> rm >[Press tab]
>
> rm >\>\(pygmentize\ -l\ text\ -f\ html\ \)
>    ^ Note leading >

I'm going to assume you did this in a directory with no other files, so tab-completing nothing results in the filename that strangely resembles process substitution.

If you don't quote the `>', bash interprets it as a redirection operator, as the parser would, and performs filename completion. The tab results in the single filename. If you were to backslash-quote the `>', you'd get the filename as you intended.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    [1]http://cnswww.cns.cwru.edu/~chet/

References
1. http://cnswww.cns.cwru.edu/~chet/
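To see the fix Chet describes, a minimal sketch (same filename as the report; assumes the directory contains nothing else):

    touch '>(pygmentize -l text -f html )'
    # rm >    then Tab: the unquoted '>' is taken as a redirection, so
    #                   completion runs on an empty word and matches the file
    # rm \>   then Tab: the quoted '>' is part of the word, and completion
    #                   produces the escaped filename as intended:
    rm \>\(pygmentize\ -l\ text\ -f\ html\ \)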
Re: Re: Re: Chained command prints password in Clear Text and breaks BASH Session until logout
Typically when a program has this sort of a problem I just save and restore the terminal context myself.

SavedContext="$(stty -g)"
read -sp "Password:" Password
mysqldump -u someuser --password=${Password} somedb | mysql -u someuser --password=${Password} -D someotherdb
# Restore Terminal Context.
stty "${SavedContext}"

And note your original example was wrong. The -p in the following is to specify the password:

mysqldump -u someuser -p somedb | mysql -u someuser -p -D someotherdb

so you are saying the password for someuser is somedb and not giving a database; in the second case you are saying that the password for someuser is -D.

Sent: Thursday, 11 July 2013 at 20:05
From: "Jason Sipula"
To: "John Kearney"
Cc: bug-bash@gnu.org
Subject: Re: Re: Chained command prints password in Clear Text and breaks BASH Session until logout

Bingo.

~]# stty echo

This fixed bash. So it does appear MySQL is disabling echo. Strange that it does not re-enable it after it's finished running. I'll take this up with the mysql folks. Thank you to everyone!

On Thu, Jul 11, 2013 at 11:00 AM, John Kearney wrote:
> sounds like echo is turned off
> try typing
> stty +echo
> when you say you don't see any output.
> And if it's turned off it was probably turned off by mysql.
> *Sent:* Thursday, 11 July 2013 at 19:53
> *From:* "Jason Sipula"
> *To:* No recipient
> *Cc:* bug-bash@gnu.org
> *Subject:* Re: Chained command prints password in Clear Text and breaks
> BASH Session until logout
> I probably should have filed two different reports for this. Sorry for any
> confusion guys.
>
> The password makes sense to me why it allows clear text...
>
> The second issue is once the command terminates, the bash session does not
> behave normally at all. Nothing typed into the terminal over SSH or
> directly on the console displays, however it does receive the keys. Also,
> if you repeatedly hit the ENTER key, instead of skipping to a new line, it
> just repeats the bash prompt over and over in a single line. So far
> restarting the bash session (by logging out then back in) is the only way
> I have found to "fix" the session and return to normal functionality.
>
> On Thu, Jul 11, 2013 at 10:47 AM, John Kearney wrote:
> >
> > This isn't a bug in bash.
> > Firstly, once a program is started it takes over the input, so the fact
> > that your password is echoed to the terminal is because mysql allows it,
> > not bash, and in mysql's defense this is the normal behaviour for command
> > line tools.
> >
> > Secondly, both mysqldump and mysql start at the same time and can
> > potentially be reading the password also at the same time.
> > On some systems and for some apps it could happen that
> >
> > password for mysqldump p1234
> > password for mysql p5678
> >
> > the way you are starting them you could potentially end up with
> >
> > mysqldump getting p5274
> > mysql getting p1638
> >
> > Basically you should give the password on the command line to mysql.
> >
> > Something like
> > read -sp "Password:" Password
> > mysqldump -u someuser --password ${Password} -p somedb | mysql -u someuser
> > --password ${Password} -p -D someotherdb
> >
> > *Sent:* Wednesday, 10 July 2013 at 23:54
> > *From:* "Jason Sipula"
> > *To:* bug-bash@gnu.org
> > *Subject:* Chained command prints password in Clear Text and breaks BASH
> > Session until logout
> > Configuration Information [Automatically generated, do not change]:
> > Machine: x86_64
> > OS: linux-gnu
> > Compiler: gcc
> > Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
> > -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu'
> > -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
> > -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE
> > -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> > -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fwrapv
> > uname output: Linux appsrv01.js.local 2.6.32-358.6.1.el6.x86_64 #1 SMP Tue
> > Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> > Machine Type: x86_64-redhat-linux-gnu
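The save/restore pattern from the top of the thread, as a self-contained sketch (user and database names are the thread's placeholders):

    #!/bin/bash
    # Save the current terminal settings in a form stty can re-apply later.
    SavedContext="$(stty -g)"
    read -rsp "Password:" Password
    echo
    # Hand the password to both tools so neither prompts on the shared terminal.
    mysqldump -u someuser --password="${Password}" somedb |
        mysql -u someuser --password="${Password}" -D someotherdb
    # Restore the terminal even if one of the tools left echo disabled.
    stty "${SavedContext}"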
Re: Re: Chained command prints password in Clear Text and breaks BASH Session until logout
Sorry, I made a typo in the last email. I meant try stty echo.

sounds like echo is turned off
try typing
stty echo
when you say you don't see any output.
And if echoing is turned off it was probably turned off by mysql.

Sent: Thursday, 11 July 2013 at 19:53
From: "Jason Sipula"
To: No recipient
Cc: bug-bash@gnu.org
Subject: Re: Chained command prints password in Clear Text and breaks BASH Session until logout

I probably should have filed two different reports for this. Sorry for any confusion guys.

The password makes sense to me why it allows clear text...

The second issue is once the command terminates, the bash session does not behave normally at all. Nothing typed into the terminal over SSH or directly on the console displays, however it does receive the keys. Also, if you repeatedly hit the ENTER key, instead of skipping to a new line, it just repeats the bash prompt over and over in a single line. So far restarting the bash session (by logging out then back in) is the only way I have found to "fix" the session and return to normal functionality.

On Thu, Jul 11, 2013 at 10:47 AM, John Kearney wrote:
>
> This isn't a bug in bash.
> Firstly, once a program is started it takes over the input, so the fact
> that your password is echoed to the terminal is because mysql allows it,
> not bash, and in mysql's defense this is the normal behaviour for command
> line tools.
>
> Secondly, both mysqldump and mysql start at the same time and can
> potentially be reading the password also at the same time.
> On some systems and for some apps it could happen that
>
> password for mysqldump p1234
> password for mysql p5678
>
> the way you are starting them you could potentially end up with
>
> mysqldump getting p5274
> mysql getting p1638
>
> Basically you should give the password on the command line to mysql.
>
> Something like
> read -sp "Password:" Password
> mysqldump -u someuser --password ${Password} -p somedb | mysql -u someuser
> --password ${Password} -p -D someotherdb
>
> *Sent:* Wednesday, 10 July 2013 at 23:54
> *From:* "Jason Sipula"
> *To:* bug-bash@gnu.org
> *Subject:* Chained command prints password in Clear Text and breaks BASH
> Session until logout
> Configuration Information [Automatically generated, do not change]:
> Machine: x86_64
> OS: linux-gnu
> Compiler: gcc
> Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
> -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu'
> -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
> -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE
> -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fwrapv
> uname output: Linux appsrv01.js.local 2.6.32-358.6.1.el6.x86_64 #1 SMP Tue
> Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> Machine Type: x86_64-redhat-linux-gnu
>
> Bash Version: 4.1
> Patch Level: 2
> Release Status: release
>
> Description:
>
> Reproducible from both an SSH session as well as directly at the console.
>
> On BASH 4.1.x (4.1.2) running under CentOS 6.x (6.4 Final) and MySQL 5.1.x
> (5.1.69). I believe this bug will persist on all distros running BASH 4.x.x
>
> After running the chained command (see the "Repeat-By" section below), BASH
> allows a password field to be seen in Clear Text, and then the BASH session
> breaks until the BASH session is restarted (logout then login).
>
> The purpose of the command is to dump the database "somedb" ... which would
> normally dump to a text file for import later... but instead redirect
> stdout to the stdin of the chained mysql command, which will import all the
> data from "somedb" into "someotherdb" on the same MySQL host. The command
> works, but there are two problems.
>
> MySQL correctly challenges for the password of "someuser" to perform the
> mysqldump part, but once you type in the password and hit ENTER, it skips
> to a new blank line without the shell prompt and just sits. It is waiting
> for you to type in the password for "someuser" as the second part of the
> command (but does not prompt for this and it's not intuitive; it appears
> as if the command is running)... If you type, it's in clear text!
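For reference, either of these can be typed blind at the broken prompt (stty sane is not from the thread, but is the conventional broader reset):

    stty echo    # re-enable echoing only
    stty sane    # reset a wider set of terminal settings to sensible defaults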
Re: Chained command prints password in Clear Text and breaks BASH Session until logout
This isn't a bug in bash.

Firstly, once a program is started it takes over the input, so the fact that your password is echoed to the terminal is because mysql allows it, not bash, and in mysql's defense this is the normal behaviour for command line tools.

Secondly, both mysqldump and mysql start at the same time and can potentially be reading the password also at the same time. On some systems and for some apps it could happen that

password for mysqldump p1234
password for mysql p5678

the way you are starting them you could potentially end up with

mysqldump getting p5274
mysql getting p1638

Basically you should give the password on the command line to mysql. Something like

read -sp "Password:" Password
mysqldump -u someuser --password ${Password} -p somedb | mysql -u someuser --password ${Password} -p -D someotherdb

Sent: Wednesday, 10 July 2013 at 23:54
From: "Jason Sipula"
To: bug-bash@gnu.org
Subject: Chained command prints password in Clear Text and breaks BASH Session until logout

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fwrapv
uname output: Linux appsrv01.js.local 2.6.32-358.6.1.el6.x86_64 #1 SMP Tue Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-redhat-linux-gnu

Bash Version: 4.1
Patch Level: 2
Release Status: release

Description:

Reproducible from both an SSH session as well as directly at the console.

On BASH 4.1.x (4.1.2) running under CentOS 6.x (6.4 Final) and MySQL 5.1.x (5.1.69). I believe this bug will persist on all distros running BASH 4.x.x

After running the chained command (see the "Repeat-By" section below), BASH allows a password field to be seen in Clear Text, and then the BASH session breaks until the BASH session is restarted (logout then login).

The purpose of the command is to dump the database "somedb" ... which would normally dump to a text file for import later... but instead redirect stdout to the stdin of the chained mysql command, which will import all the data from "somedb" into "someotherdb" on the same MySQL host. The command works, but there are two problems.

MySQL correctly challenges for the password of "someuser" to perform the mysqldump part, but once you type in the password and hit ENTER, it skips to a new blank line without the shell prompt and just sits. It is waiting for you to type in the password for "someuser" as the second part of the command (but does not prompt for this and it's not intuitive; it appears as if the command is running)... If you type, it's in clear text! Potentially a major security issue there.

It gets worse... After you hit ENTER a second time, the command will finish, and it will return a fresh line with the shell prompt. Everything looks normal... but try typing. Nothing will show at all, however it is sending the keys to the shell and will execute commands if you type them in and hit ENTER. Each successful command will return you to a fresh shell line, but the same thing happens until you log out and back in (to restart BASH). Also, while this is happening, you can hit the ENTER key over and over and BASH will just keep repeating the shell prompt on the same line.

Repeat-By:

At the shell, issue the command:
~]# mysqldump -u someuser -p somedb | mysql -u someuser -p -D someotherdb

Shouldn't need to run that command as root, but the mysql user must be privileged enough to work with the two databases. To simplify things you can replace "someuser" with root.

Thank you,
Jason Sipula
alup...@gmail.com
Re: How to test if a link exists
check out help test

if you want to test for both you can do

[ -e file -o -h file ] || echo file not present.

AFAIK the current behaviour is intentional and is the most useful.

cheers

Sent: Friday, 21 June 2013 at 15:43
From: "Mark Young"
To: bug-bash@gnu.org
Subject: How to test if a link exists

Hi,

I stumbled into discovering that the -e test for a file does not report the file as existing if the file is a dead symbolic link. This seems wrong to me. Here's some test code:- (WARNING it includes rm -f a b)

#!/bin/bash
bash --version
echo ""
rm -f a b
ln -s b a
[ -a a ] && echo "1. (test -a) File a exists, it's a dead link"
[ -e a ] && echo "1. (test -e) File a exists, it's a dead link"
[ -f a ] && echo "1. (test -f) File a exists, it's a dead link"
touch b
[ -a a ] && echo "2. (test -a) File a exists, it points to b"
[ -e a ] && echo "2. (test -e) File a exists, it points to b"
[ -f a ] && echo "2. (test -f) File a exists, it points to b"

When run on my CentOS v5.9 system I get the following

$ ./test
GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.

2. (test -a) File a exists, it points to b
2. (test -e) File a exists, it points to b
2. (test -f) File a exists, it points to b

When run on Cygwin I also get basically the same

$ ./test
GNU bash, version 4.1.10(4)-release (i686-pc-cygwin)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <[1]http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

2. (test -a) File a exists, it points to b
2. (test -e) File a exists, it points to b
2. (test -f) File a exists, it points to b

My feeling is that this is wrong and that I should be told that a exists even though b doesn't. File 'a' does exist; it is a dead symbolic link. So it prevents me, for instance, creating a symbolic link:-

E.g.
$ ln -s c a
$ ls -l a b c
ls: b: No such file or directory
ls: c: No such file or directory
lrwxrwxrwx 1 marky tools 1 Jun 21 14:41 a -> b

Is this an error in bash? What test should I use to decide if a file exists (including dead symbolic links)?

Cheers,
Mark

References
1. http://gnu.org/licenses/gpl.html
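John's suggestion as a small reusable sketch (function and file names are placeholders; two separate tests are used instead of the old -o form):

    exists_or_link() {
        # -e follows the link, so it is false for a dangling symlink;
        # -h tests the link itself, so the OR covers both cases.
        [ -e "$1" ] || [ -h "$1" ]
    }

    ln -s nosuchtarget a
    exists_or_link a && echo "a is present (regular file or dangling link)"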
Re: Re: currently doable? Indirect notation used w/a hash
Like I said, it's a back-door approach; it circumvents the parser, which doesn't allow this syntax

${${Name}[1]}

I didn't actually find this myself; it was reported on this list a long time ago. I do remember Chet saying he wouldn't break it, but other than that I can't remember the discussion all that well. As always with this topic it was a pretty lively debate.

Yeah, it's a constant fight getting my email clients to stop capitalising various things in code.

Sent: Monday, 17 June 2013 at 13:57
From: "Greg Wooledge"
To: "Linda Walsh"
Cc: "John Kearney", bug-bash
Subject: Re: currently doable? Indirect notation used w/a hash

On Sat, Jun 15, 2013 at 12:36:22PM -0700, Linda Walsh wrote:
> John Kearney wrote:
> >There is also a backdoor approach that I don't really advise.
> >val="${ArrayName}[Index]"
> >echo "${!val}"
> -
> Don't advise? Any particular reason? or stylistic?

I'd shared this advice ("don't use it"), because I cannot for the life of me tell whether this is a bug or a feature. As near as I can tell, it is an unforeseen consequence of the parser implementation, not documented anywhere. As such, I would not rely on it to continue working in future Bash releases.

P.S. you meant printf -v, not -V.
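The back door under discussion, as a minimal runnable sketch (array name and index are placeholders):

    arr=(zero one two)
    ArrayName=arr
    val="${ArrayName}[1]"   # build the literal string "arr[1]"
    echo "${!val}"          # indirect expansion evaluates it; prints: one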
Re: Re: `printf -v foo ""` does not set foo=
That's one of the reasons I suggested the following syntax

printf -v test "%s" ""

It doesn't have this problem, and it avoids other problems as well. Or, if you want to expand backslashes etc.:

printf -v test "%b" ""

Sent: Monday, 17 June 2013 at 08:33
From: "Linda Walsh"
To: bug-bash@gnu.org
Subject: Re: `printf -v foo ""` does not set foo=

Mike Frysinger wrote:
> simple test code:
> unset foo
> printf -v foo ""
> echo ${foo+set}
>
> that does not display "set". seems to have been this way since the feature
> was added in bash-3.1.
> -mike

Indeed:

> set -u
> unset foo
> printf -v foo ""
> echo $foo
bash: foo: unbound variable
> foo=""
> echo $foo

I have a feeling this would be hard to fix, since how can printf tell the difference between

printf -v foo ""

and

printf -v foo

?? (with nothing after it?) it seems the semantic parser would have already removed the quotes by the time the args are passed to printf. Even this:

> set -u
> printf -v foo "$(echo "$'\000'")"
> echo $foo

still leaves foo gutless: without content (even if it were null)
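A quick check that the suggested form sets the variable even for an empty result (variable name is a placeholder):

    unset foo
    printf -v foo "%s" ""   # explicit format plus an explicit empty argument
    echo "${foo+set}"       # prints: set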
Re: currently doable? Indirect notation used w/a hash
Sorry, forgot the bit to retrieve values.

It is possible to retrieve numeric values without eval, i.e.

val=$((${ArrayName}[Index]))

works quite well and is quick; in fact I used to use this quite a lot.

There is also a backdoor approach that I don't really advise.

val="${ArrayName}[Index]"
echo "${!val}"

What I actually tend to do is:

ks_array_ChkName() {
    local LC_COLLATE=C
    case "${1:?Missing Variable Name}" in
        [!a-zA-Z_]* | *[!a-zA-Z_0-9]* ) return 3;;
    esac
}
ks_val_Get() {
    ks_array_ChkName "${1:?Missing Destination Variable Name}" || return $?
    ks_array_ChkName "${2:?Missing Source Variable Name}" || return $?
    eval "${1}=\"\${${2}:-}\""
}
ks_array_GetVal() { ks_val_Get "${1}" "${2}[${3}]" ; }
ks_array_SetVal() { ks_val_Set "${1}[${2}]" "${3:-}" ; }

Cheers

Sent: Saturday, 15 June 2013 at 15:03
From: "John Kearney"
To: "Linda Walsh"
Cc: bug-bash
Subject: Re: currently doable? Indirect notation used w/a hash

In bash there are 2 options that I use.

1.
ArrayName=blah
printf -V "${ArrayName}[Index]" "%s" "Value To Set"

2.
ks_val_ChkName() {
    local LC_COLLATE=C
    case "${1:?Missing Variable Name}" in
        [!a-zA-Z_]* | *[!a-zA-Z_0-9]* | '' ) return 3;;
    esac
}
ks_array_SetVal() {
    ks_val_ChkName "${1:?Missing Array Name}" || return $?
    ks_val_ChkName "a${2:?Missing Array Index}" || return $?
    eval "${1}"["${2}"]'="${3:-}"'
}
ks_array_SetVal "${ArrayName}" "Index" "Value"

Cheers

Sent: Sunday, 09 June 2013 at 23:02
From: "Linda Walsh"
To: bug-bash
Subject: currently doable? Indirect notation used w/a hash

I was wondering if I was missing some syntax somewhere... but I wanted to be able to pass the name of a hash in and store stuff in it and later retrieve it... but it looks like it's only possible with an eval or such? Would be nice(??) *sigh*
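The arithmetic form above, spelled out (names are placeholders; this only works for numeric values):

    nums=(10 20 30)
    ArrayName=nums
    val=$(( ${ArrayName}[2] ))   # expands to $(( nums[2] )) and is evaluated
    echo "$val"                  # prints: 30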
Re: currently doable? Indirect notation used w/a hash
In bash there are 2 options that I use.

1.
ArrayName=blah
printf -V "${ArrayName}[Index]" "%s" "Value To Set"

2.
ks_val_ChkName() {
    local LC_COLLATE=C
    case "${1:?Missing Variable Name}" in
        [!a-zA-Z_]* | *[!a-zA-Z_0-9]* | '' ) return 3;;
    esac
}
ks_array_SetVal() {
    ks_val_ChkName "${1:?Missing Array Name}" || return $?
    ks_val_ChkName "a${2:?Missing Array Index}" || return $?
    eval "${1}"["${2}"]'="${3:-}"'
}
ks_array_SetVal "${ArrayName}" "Index" "Value"

Cheers

Sent: Sunday, 09 June 2013 at 23:02
From: "Linda Walsh"
To: bug-bash
Subject: currently doable? Indirect notation used w/a hash

I was wondering if I was missing some syntax somewhere... but I wanted to be able to pass the name of a hash in and store stuff in it and later retrieve it... but it looks like it's only possible with an eval or such? Would be nice(??) *sigh*
Re: Re: nested while loop doesn't work
as Greg says, this is the wrong list; you need to report this to

" Vim syntax file
" Language: shell (sh) Korn shell (ksh) bash (sh)
" Maintainer: Dr. Charles E. Campbell, Jr.
" Previous Maintainer: Lennart Schultz
" Last Change: Dec 09, 2011
" Version: 121
" URL: [1]http://mysite.verizon.net/astronaut/vim/index.html#vimlinks_syntax

the file that does this is /usr/share/vim/vim73/syntax/sh.vim on an Ubuntu system.

and this is probably a simpler example to work with

#!/bin/bash
while true; do
    while true; do :; done
    until true; do :; done
done
until true; do
    while true; do :; done
    until true; do :; done
done

cheers

Sent: Tuesday, 04 June 2013 at 22:15
From: "Greg Wooledge"
To: kartik...@gmail.com
Cc: bug-bash@gnu.org, b...@packages.debian.org
Subject: Re: nested while loop doesn't work

On Tue, Jun 04, 2013 at 04:39:31PM +0530, kartik...@gmail.com wrote:
> Description:
> A while inside a while loop (nested while) doesn't work and also vim/gvim doesn't highlight the second while loop

For issues with the vim/gvim highlighting, you'd need to report the problem in vim, not in bash.

> example code is given
>
> while [ "ka" = $name ]
> do
> echo "nothing\n"
> while [ "ka" = $name ]   //this while is not highlighted
> do
> echo "everything\n"
> done
> done

You have a quoting mistake here. "$name" should be quoted, or this will fail if the variable contains multiple words separated by spaces.

imadev:~$ name="first last"
imadev:~$ [ "ka" = $name ]
bash-4.3: [: too many arguments

This code should work:

while [ "ka" = "$name" ]
do
    printf "nothing\n\n"
    while [ "ka" = "$name" ]
    do
        printf "everything\n\n"
    done
done

(It goes into an infinite loop when name=ka, but presumably that's what you wanted.)

References
1. http://mysite.verizon.net/astronaut/vim/index.html#vimlinks_syntax
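Greg's quoting point, reduced to three lines (variable value from his example):

    name="first last"
    [ "ka" = $name ]     # unquoted: becomes [ ka = first last ] -> "too many arguments"
    [ "ka" = "$name" ]   # quoted: a single operand; simply returns false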
Re: Bash4: Problem retrieving "$?" when running with "-e"
On 12.04.2013 18:26, Lenga, Yair wrote:
> Chet,
>
> Sorry again for pulling the wrong Bash 4 doc.
>
> Based on the input, I'm assuming that the portable way (bash 3, bash 4 and
> POSIX) to retrieve $? when running under "-e" is to use the pipe:
> CMD_STAT=0 ; GET_MAIN_DATA || CMD_STAT=$?

That isn't a pipe, it's a logical OR; it means: if the first command returns non-zero, execute the next command. As the assignment will not fail, it avoids the problem.

As such you could also do

GET_MAIN_DATA || GET_BACKUP_DATA

or

if ! GET_MAIN_DATA ; then
    GET_BACKUP_DATA
fi
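The status-capture idiom is runnable as-is under -e (false stands in for GET_MAIN_DATA):

    set -e
    CMD_STAT=0
    false || CMD_STAT=$?               # the assignment succeeds, so the line's status is 0
    echo "captured status: $CMD_STAT"  # prints: captured status: 1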
Re: Bash4: Problem retrieving "$?" when running with "-e"
On 12.04.2013 13:44, Lenga, Yair wrote:
> Good Morning,
>
> I've encountered another interesting change in behavior between Bash3 and
> Bash4. I hope that you can help me:
>
> The core question is how to retrieve the status of a command, when running
> with '-e'
>
> For production critical jobs, we run the script with '-e', to ensure that all
> steps are successful. For cases where we allow the command to fail, because
> we can implement a backup, we add explicit error handling. For example:
>
> set -ue
> CHECK_SPACE
> (FETCH_NEW_DATA)
> if [ $? = 11 ] ; then
>     FETCH_BACKUP_DATA
> fi
> REMOVE_OLD_DATA
> COPY_NEW_TO_OLD
>
> In Bash3, the script could retrieve the return code of FETCH_NEW_DATA, by
> placing it into a sub-shell, and then examining the value of "$?".
>
> In Bash4, the FETCH_NEW_DATA failure causes the parent script to fail.
>
> The man page says that '-e' will "exit immediately if a simple command (note
> Simple Command::) exits with non-zero status unless ...".
> The "simple commands" definition is a "sequence of words separated by blanks
> ...". According to this definition, the sequence "( simple command )"
> is NOT a simple command, and should NOT trigger the "immediate exit".
>
> Can anyone comment on my interpretation? Is there an alternative solution that
> will allow retrieval of the status of single commands when running
> with the '-e'?
>
> Thanks
> Yair Lenga

try this approach

set -ue
CHECK_SPACE
RVALUE=0
(FETCH_NEW_DATA) || RVALUE=$?
if [ $RVALUE = 11 ] ; then
    FETCH_BACKUP_DATA
fi
REMOVE_OLD_DATA
COPY_NEW_TO_OLD
Re: weird problem -- path interpreted/eval'd as numeric expression
On 29.03.2013 18:53, Linda Walsh wrote:
>
> Greg Wooledge wrote:
>> On Fri, Mar 29, 2013 at 12:41:46AM -0700, Linda Walsh wrote:
>>> include was designed to search the path for functions that
>>> are relative paths. While the normal sourcepath allows searching for
>>> filenames on the search path, I don't believe (please correct if I am wrong
>>> and this works now, as it would make life much simpler) that the PATH will
>>> be searched if you give it something like:
>>>
>>> source lib/Util/sourcefile.shh
>> Is that all you want? Here:
>>
>> include() {
>>     local paths dir
>>     IFS=: read -ra paths <<< "$PATH"
>>     for dir in "${paths[@]}"; do
>>         if [[ -r $dir/$1 ]]; then
>>             source "$dir/$1"
>>             return
>>         fi
>>     done
>>     echo "could not find '$1' in PATH" >&2
>>     return 1
>> }
> --
> It also doesn't keep track of the previously sourced files so as to
> not 're-source' them if one of the files you 'source' also sources a file.
>
> It also allows one to optionally leave off the extension, but other than
> those additions... yeah... that's close...
>
> The idea is *mainly* to be able to read in functions and aliases..
>
> Vars expected to 'survive' for those funcs or aliases are exported... but
> that may not be enough to get them out of the local context... not sure.

Like this then?

unset INCLUDED ; declare -A INCLUDED
find_file() {
    local dir FOUND_FILE=""
    [ $((INCLUDED[${1%.sh}]+=1)) -eq 1 ] || return 1
    while IFS= read -rd ':' dir ; do
        #echo "trying : ${dir}/${1%.sh}.sh"
        [[ -r ${dir}/${1%.sh}.sh ]] || continue
        FOUND_FILE="${dir}/${1%.sh}.sh"
        echo "found : ${FOUND_FILE}"
    done <<< "${PATH}"
    [ -n "${FOUND_FILE:-}" ] || { echo "could not find '$1' in PATH" >&2 ; return 1 ; }
} &&
echo 'find_file "${1:?Missing File Name }" && source "${FOUND_FILE}"' >/tmp/source_wrapper.sh &&
alias include=source\ "/tmp/source_wrapper.sh"

I actually tested this one and it seems to work ok.
Re: weird problem -- path interpreted/eval'd as numeric expression
On 29.03.2013 16:36, Greg Wooledge wrote:
> On Fri, Mar 29, 2013 at 04:10:22PM +0100, John Kearney wrote:
>> consider
>> dethrophes@dethace ~
>> $ read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
>>
>> dethrophes@dethace ~
>> $ echo ${vals[0]}
>> lkjlksda
> You forgot to set IFS=: for that read.
>
> imadev:~$ IFS=: read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
> imadev:~$ declare -p vals
> declare -a vals='([0]="lkjlksda\
>  adasd\
> " [1]="sdasda" [2]="\
> ")'
>
>> I meant to update your wiki about it but I forgot.
>> I guess read uses gets not fread and that truncates the line anyway.
> No, that's not correct.
>
>>>> cat <<EOF >/source_wrapper.sh
>>>> find_file "${1:?Missing File Name }" || return $?
>>>> source "${FOUND_FILE}"
>>>> EOF
>>>> alias include=source\ "/source_wrapper.sh"
>>> The <<EOF should be <<'EOF'. Also, you need to
>>> include the definition of find_file in the wrapper script.
>> ?? why <<'EOF' ??
> Because if you don't quote any of the characters in the here document
> delimiter, the expansions such as "${FOUND_FILE}" will be done by the
> shell that's processing the redirection. I believe you want the code
> to appear in the output file. Therefore you want to quote some or all
> of the characters in the delimiter.
>
> Compare:
>
> imadev:~$ cat <<EOF
> > echo "$HOME"
> > EOF
> echo "/net/home/wooledg"
>
> imadev:~$ cat <<'EOF'
> > echo "$HOME"
> > EOF
> echo "$HOME"

Didn't know that. Actually I forgot to escape them in my example.

> On Fri, Mar 29, 2013 at 04:18:49PM +0100, John Kearney wrote:
>> Oh and FYI
>> IFS=: read
>> may change the global IFS on some shells I think.
>> Mainly thinking of pdksh right now.
> If those shells have such a bug, then you'd need to bring it up on THEIR
> bug mailing list. This is bug-bash. ;-)

It was just a warning, and I don't think there is a pdksh bug list.

> In any case, I've never seen such a bug, and the pdksh to which I have
> access does not display it:
>
> ...
> Get:1 http://ftp.us.debian.org/debian/ squeeze/main pdksh i386 5.2.14-25 [265 kB]
> ...
> arc3:~$ pdksh
> \h:\w$ echo a:b:c > /tmp/frob
> \h:\w$ IFS=: read a b < /tmp/frob
> \h:\w$ rm /tmp/frob
> \h:\w$ echo "$IFS"
>
>
> \h:\w$
>
> This is a fundamental feature that's commonly used. If it were so
> egregiously broken I think more people would have noticed it.

try this

f(){ echo "ddd${IFS}fff"; } ; f ; IFS=KKK f; f

This just didn't work as I would expect on Ubuntu pdksh. I didn't look into it regarding builtins; I just stopped using that feature, as it seemed to be wonky. The original platform I saw it on was QNX6.5.0/BB10.
Re: weird problem -- path interpreted/eval'd as numeric expression
On 29.03.2013 16:23, Pierre Gaston wrote:
> On Fri, Mar 29, 2013 at 5:10 PM, John Kearney wrote:
>> consider
>> dethrophes@dethace ~
>> $ read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
>>
>> dethrophes@dethace ~
>> $ echo ${vals[0]}
>> lkjlksda
>>
>> I meant to update your wiki about it but I forgot.
>> I guess read uses gets not fread and that truncates the line anyway.
>>
> you missed the IFS part:
> IFS=: read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
> echo "${vals[0]}"
>
> (IFS contains \n by default)

Ok, that works; I must have somehow misunderstood the description. Oh well, thanks, that makes the world a little more sane.
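Side by side, with the input from the thread; the first read splits on the default IFS (which includes newline), the second only on ':':

    read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'        # vals[0] is just "lkjlksda"
    IFS=: read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'  # vals[0] keeps both lines
    declare -p vals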
Re: weird problem -- path interpreted/eval'd as numeric expression
Oh and FYI

IFS=: read

may change the global IFS on some shells I think. Mainly thinking of pdksh right now.

IFS=: ls       # local
ls_wrap(){ ls ; }
IFS=: ls_wrap  # Changes global IFS

I think it was the same with builtins, but not sure right now. That's why I always use wrapper functions and local to do that sort of thing now.

On 29.03.2013 15:30, Greg Wooledge wrote:
> On Fri, Mar 29, 2013 at 03:11:07PM +0100, John Kearney wrote:
>> Actually I've had trouble with
>>
>> IFS=: read -ra paths <<< "$PATH"
>>
>> and embedded newlines.
> A directory with a newline in its name, in your PATH? Terrifying.
>
>> I think this is better
>> find_file() {
>>     local IFS=:
>>     for dir in $PATH; do
> But that one's vulnerable to globbing issues if a directory has a
> wildcard character in its name. If you're concerned about newlines
> then you should be just as concerned with ? or *, I should think.
>
> Workarounds:
>
> 1) In yours, use set -f and set +f around unquoted $PATH to suppress
> globbing.
>
> 2) In mine, use -d '' on the read command, and manually strip the
> trailing newline that <<< adds to the final element.
>
> 3) In mine, use -d '' on the read command, and use < <(printf %s "$PATH")
> so there isn't an added trailing newline to strip.
>
>> Ideally what I want to do is
>> alias include=source\ "$(find_file "${1}")"
>> but that doesn't work in bash and I still haven't found a way around the
>> problem.
> I can't think of an alias workaround off the top of my head either.
> Even Simon Tatham's "magic aliases" require a helper function, which leads
> back to the variable scope issue, the avoidance of which was the whole
> reason to attempt an alias (instead of a function) in the first place.
>
>> The only way I can think to do it is to use a second file.
>>
>> cat <<EOF >/source_wrapper.sh
>> find_file "${1:?Missing File Name }" || return $?
>> source "${FOUND_FILE}"
>> EOF
>> alias include=source\ "/source_wrapper.sh"
> The <<EOF should be <<'EOF'. Also, you need to
> include the definition of find_file in the wrapper script.
Re: weird problem -- path interpreted/eval'd as numeric expression
On 29.03.2013 15:30, Greg Wooledge wrote:
> On Fri, Mar 29, 2013 at 03:11:07PM +0100, John Kearney wrote:
>> Actually I've had trouble with
>>
>> IFS=: read -ra paths <<< "$PATH"
>>
>> and embedded newlines.
> A directory with a newline in its name, in your PATH? Terrifying.

why not :) it's a great way to make sure only my scripts work on my system ;).

>> I think this is better
>> find_file() {
>>     local IFS=:
>>     for dir in $PATH; do
> But that one's vulnerable to globbing issues if a directory has a
> wildcard character in its name. If you're concerned about newlines
> then you should be just as concerned with ? or *, I should think.

Strangely enough that hasn't been as much of a problem. But a good point.

> Workarounds:
>
> 1) In yours, use set -f and set +f around unquoted $PATH to suppress
> globbing.

I actually have that in my code :( coding off the top of your head is always a bit sloppy :).

> 2) In mine, use -d '' on the read command, and manually strip the
> trailing newline that <<< adds to the final element.

consider

dethrophes@dethace ~
$ read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'

dethrophes@dethace ~
$ echo ${vals[0]}
lkjlksda

I meant to update your wiki about it but I forgot. I guess read uses gets not fread and that truncates the line anyway.

> 3) In mine, use -d '' on the read command, and use < <(printf %s "$PATH")
> so there isn't an added trailing newline to strip.
>
>> Ideally what I want to do is
>> alias include=source\ "$(find_file "${1}")"
>> but that doesn't work in bash and I still haven't found a way around the
>> problem.
> I can't think of an alias workaround off the top of my head either.
> Even Simon Tatham's "magic aliases" require a helper function, which leads
> back to the variable scope issue, the avoidance of which was the whole
> reason to attempt an alias (instead of a function) in the first place.

I'm actually almost convinced that it just isn't possible.

>> The only way I can think to do it is to use a second file.
>>
>> cat <<EOF >/source_wrapper.sh
>> find_file "${1:?Missing File Name }" || return $?
>> source "${FOUND_FILE}"
>> EOF
>> alias include=source\ "/source_wrapper.sh"
> The <<EOF should be <<'EOF'. Also, you need to
> include the definition of find_file in the wrapper script.

?? why <<'EOF' ??

No, I don't need to include the function; I would declare it like this.

find_file() {
    local dir IFS=: FOUND_FILE=""
    set -f
    for dir in $PATH; do
        [[ -r ${dir}/$1 ]] || continue
        FOUND_FILE="${dir}/$1"
    done
    set +f
    [ -n "${FOUND_FILE:-}" ] || { echo "could not find '$1' in PATH" >&2 ; return 1 ; }
} &&
echo 'find_file "${1:?Missing File Name }" && source "${FOUND_FILE}"' >/tmp/source_wrapper.sh &&
alias include=source\ "/tmp/source_wrapper.sh"
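Greg's workaround (1), isolated into a sketch (function name is a placeholder):

    list_path() {
        local IFS=:
        set -f      # suppress globbing while $PATH is expanded unquoted
        for dir in $PATH; do printf '%s\n' "$dir"; done
        set +f
    }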
Re: weird problem -- path interpreted/eval'd as numeric expression
On 29.03.2013 12:57, Greg Wooledge wrote:
> On Fri, Mar 29, 2013 at 12:41:46AM -0700, Linda Walsh wrote:
>> include was designed to search the path for functions that
>> are relative paths. While the normal sourcepath allows searching for
>> filenames on the search path, I don't believe (please correct if I am wrong
>> and this works now, as it would make life much simpler) that the PATH will
>> be searched if you give it something like:
>>
>> source lib/Util/sourcefile.shh
> Is that all you want? Here:
>
> include() {
>     local paths dir
>     IFS=: read -ra paths <<< "$PATH"
>     for dir in "${paths[@]}"; do
>         if [[ -r $dir/$1 ]]; then
>             source "$dir/$1"
>             return
>         fi
>     done
>     echo "could not find '$1' in PATH" >&2
>     return 1
> }

Actually I've had trouble with

IFS=: read -ra paths <<< "$PATH"

and embedded newlines. I think this is better:

find_file() {
    local IFS=:
    for dir in $PATH; do
        [[ -r $dir/$1 ]] || continue
        FOUND_FILE="$dir/$1"
        return 0
    done
    echo "could not find '$1' in PATH" >&2
    return 1
}
include() {
    local FOUND_FILE
    find_file "${1:?Missing File Name }" || return $?
    source "${FOUND_FILE}"
}
includeExt() {
    local FOUND_FILE
    local PATH=${1:?Missing Search Path}
    find_file "${2:?Missing File Name}" || return $?
    source "${FOUND_FILE}"
}

Ideally what I want to do is

alias include=source\ "$(find_file "${1}")"

but that doesn't work in bash and I still haven't found a way around the problem. The only way I can think to do it is to use a second file.

cat <<EOF >/source_wrapper.sh
find_file "${1:?Missing File Name }" || return $?
source "${FOUND_FILE}"
EOF
alias include=source\ "/source_wrapper.sh"
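A quick check that the function-local IFS above does not leak (function name is a placeholder):

    f() { local IFS=:; printf '%s\n' $PATH; }   # ':' splitting applies only inside f
    f >/dev/null
    printf '%q\n' "$IFS"                        # global IFS unchanged: $' \t\n'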
Re: gnu parallel in the bash manual
On 06.03.2013 01:03, Linda Walsh wrote:
>
> John Kearney wrote:
>> The example is bad anyway, as you normally don't want to parallelize disk
>> IO, due to seek overhead and IO bottleneck congestion. This example
>> will be slower and more likely to damage your disk than simply using mv
>> on its own. But that's another discussion.
> ---
> That depends on how many IOPS your disk subsystem can
> handle and how much cpu is between each of the IO calls.
> Generally, unless you have a really old, non-queuing disk,
> >1 procs will be of help. If you have a RAID, it can go
> up with # of data spindles (as a max, though if all are reading
> from the same area, not so much... ;-))...
>
> Case in point, I wanted to compare rpm versions of files
> on disk in a dir to see if there were duplicate versions, and if so,
> only keep the newest (highest numbered) version (with the rest
> going into a per-disk recycling bin (a fall-out of sharing
> those disks to windows and implementing undo abilities on
> the shares (samba, vfs_recycle)).
>
> I was working directories with 1000's of files -- (1 dir,
> after pruning, has 10,312 entries). Sequential reading of those files
> was DOG slow.
>
> I parallelized it (using perl) first by sorting all the names,
> then breaking it into 'N' lists -- doing those in parallel, then
> merging the results (and comparing end-points -- like the end of one list
> might have been a diff-ver from the beginning of the next). I found a dynamic
> 'N' based on max cpu load v. disk (i.e. no matter how many procs I
> threw at it, it still used about 75% cpu).
>
> So I chose 9:
>
> Hot cache:
> Read 12161 rpm names.
> Use 1 procs w/12162 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
> Cumulative   This Phase   ID
>   0.000s       0.000s     Init
>   0.000s       0.000s     start_program
>   0.038s       0.038s     starting_children
>   0.038s       0.001s     end_starting_children
>   8.653s       8.615s     endRdFrmChldrn_n_start_re_sort
>  10.733s       2.079s     afterFinalSort
> 17.94sec 3.71usr 6.21sys (55.29% cpu)
> ---
> Read 12161 rpm names.
> Use 9 procs w/1353 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
> Cumulative   This Phase   ID
>   0.000s       0.000s     Init
>   0.000s       0.000s     start_program
>   0.032s       0.032s     starting_children
>   0.036s       0.004s     end_starting_children
>   1.535s       1.500s     endRdFrmChldrn_n_start_re_sort
>   3.722s       2.187s     afterFinalSort
> 10.36sec 3.31usr 4.47sys (75.09% cpu)
>
> Cold Cache:
>
> Read 12161 rpm names.
> Use 1 procs w/12162 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
> Cumulative   This Phase   ID
>   0.000s       0.000s     Init
>   0.000s       0.000s     start_program
>   0.095s       0.095s     starting_children
>   0.096s       0.001s     end_starting_children
>  75.067s      74.971s     endRdFrmChldrn_n_start_re_sort
>  77.140s       2.073s     afterFinalSort
> 84.52sec 3.62usr 6.26sys (11.70% cpu)
>
> Read 12161 rpm names.
> Use 9 procs w/1353 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
> Cumulative   This Phase   ID
>   0.000s       0.000s     Init
>   0.000s       0.000s     start_program
>   0.107s       0.107s     starting_children
>   0.112s       0.005s     end_starting_children
>  29.350s      29.238s     endRdFrmChldrn_n_start_re_sort
>  31.497s       2.147s     afterFinalSort
> 38.27sec 3.35usr 4.47sys (20.47% cpu)
>
> ---
> hot cache savings: 42%
> cold cache savings: 55%

Different use case; you can't really compare mv to data processing. And generally it is a bad idea unless you know what you are doing. Trying to parallelize mv /* is a bad idea unless you are on some expensive hardware. This is because of the sequential nature of the access model. Your use case was a sparse access model, and there is normally no performance penalty to interleaving sparse access methods. Depending on the underlying hardware it can be very costly to interleave sequential access streams, especially on embedded devices, e.g. eMMC. Not to mention the sync-object overhead you may be incurring in the fs driver and/or hardware driver. With 13000 files in one directory you must have been taking a dir-list and file-open access penalty. What fs was that?
Re: gnu parallel in the bash manual
On 03.03.2013 01:40, Chet Ramey wrote:
>> this is actually more disturbing:
>>
>> ls | parallel mv {} destdir
>>
>> find . -type f -print0 | xargs -0 -I{} -P 4 mv {} destdir
> If we're really going to pick nits here, those two aren't really identical.
>
> You'd probably want something like
>
> find . -depth 1 \! -name '.*' -print0
>
> to start.
>
> Chet

Sure, you're right; what I showed wasn't a one-to-one functional replacement. But then again, most times I see people using an `ls |` syntax they actually don't intend that functionality; it's a side effect of them not knowing how to use a better syntax.

The example is bad anyway, as you normally don't want to parallelize disk IO, due to seek overhead and IO bottleneck congestion. This example will be slower and more likely to damage your disk than simply using mv on its own. But that's another discussion.

With regard to nitpicking: considering how much effort is made on this mailing list and help-bash to give filename-safe examples, it's hardly nitpicking to expect the examples in the bash manual to be written to the same standard.
Re: export in posix mode
On 27.02.2013 22:39, James Mason wrote:
> On 02/27/2013 04:00 PM, Bob Proulx wrote:
>> Eric Blake wrote:
>>> James Mason wrote:
>>>> I certainly could be doing something wrong, but it looks to me like
>>>> bash - when in Posix mode - does not suppress the "-n" option for
>>>> export. The version of bash that I'm looking at is 3.2.25.
>>> So what? Putting bash in posix mode does not require bash to instantly
>>> prohibit extensions. POSIX intentionally allows for implementations to
>>> provide extensions, and 'export -n' is one of bash's extensions.
>>> There's no bug here, since leaving the extension always enabled does not
>>> conflict with the subset of behavior required by POSIX.
>> If you are looking to try to detect non-portable constructs then you
>> will probably need to test against various shells including ash. (If
>> on Debian then use dash.)
>>
>> https://en.wikipedia.org/wiki/Almquist_shell
>>
>> The posh shell was constructed specifically to be as strictly
>> conforming to posix as possible. (Making it somewhat less than useful
>> in Real Life but it may be what you are looking for.) It is Debian
>> specific in origin but should work on other systems.
>>
>> http://packages.debian.org/sid/posh
>> http://anonscm.debian.org/gitweb/?p=users/clint/posh.git;a=summary
>>
>> Bob
>
> We considered setting up another shell as the implementation of
> "/bin/sh", but that's hazardous in the context of vast amounts of
> boot-time initialization scripting that hasn't been vetted as to
> avoidance of bash-isms.
>
> Changing product script code - just so you can look for these sorts of
> things - isn't practical (or safe) either.
>
> So I guess if you take the view that bash POSIX mode exists only to
> make bash accept POSIX scripts, and not to preclude/warn about
> behavior that isn't going to be acceptable elsewhere, then you're
> right - it's not a bug. If you care about helping people to be able
> to write scripts that work various places and don't exceed the POSIX
> specification, you're unhelpfully wrong (and you might contemplate why
> "bashisms" gives > 50K google hits).
>
> -jrm

Bash POSIX mode just changes bash behaviour that is incompatible with the POSIX spec, nothing more or less. There are other shells for doing what you seem to want, as has already been stated, namely dash and posh.
Re: gnu parallel in the bash manual
On 26.02.2013 03:36, Linda Walsh wrote:
>
> Chet Ramey wrote:
>> On 2/25/13 8:07 PM, Linda Walsh wrote:
>>> Chet Ramey wrote:
>>>> On 2/16/13 3:50 AM, Pierre Gaston wrote:
>>>>> I don't quite see the point of having gnu parallel discussed in the
>>>>> bash reference manual.
>>>> I was asked to add that in May, 2010 by Ole Tange and Richard Stallman.
>>>
>>> Maybe now that it was done, it can be removed?
>> I'm pretty sure that wasn't the intent of the original request. Let's
>> see if we can clean it up instead.
>
> I'm sure, but you edited out the rest of my reasoning.
> Note -- I don't feel strongly about this, one way or the other,
> but at the same time, I don't feel their request, nor your response
> are the best ones to take from an engineering or product perspective --
> in part -- directly because of the confusion about whether or not parallel
> is bundled w/bash or not.
>
> Using it in an example would be fine... but make a section out of it? That's
> a fairly strong implication for it being something that's part of bash's
> official release or product.
>
> I realize this matter is more political than technical, but still, I would
> try to ask those questions of the original requestors and see if they might
> not revisit their requests...? you could even say -- a user asked if
> including parallel in its own section in the manpage meant that parallel was
> going to be part of the bash distribution?
>
> I mean it wouldn't surprise me or seem unreasonable if it was included in
> the bash distribution (from a lay-person perspective). Knowing it's a
> perl-script, I'd be a bit surprised, personally, but hey, I've been wondering
> if you are going to embed the perl interpreter in bash as a dynamically
> loadable .so and allow perl-expressions on the command line as an option...
> *str8-face*...

I upvote the perl integration into bash :|.
Re: gnu parallel in the bash manual
On 16.02.2013 09:50, Pierre Gaston wrote:
> I don't quite see the point of having gnu parallel discussed in the
> bash reference manual.
> http://www.gnu.org/software/bash/manual/bashref.html#GNU-Parallel
> I don't argue that it can be a useful tool, but then you might as well
> discuss sed awk grep make find etc..
> Or even the ones not part of the standard toolset since parallel is
> not installed by default even on the linux distributions I know: flock
> fdupes recode convmv rsync etc...

Actually xargs can do everything listed better, and it is installed by default on most systems.

> On top of that the examples teach incorrect things eg, "the common
> idioms that operate on lines read from a file"(sic)
>
> for x in $(cat list); do
>
> doesn't even read lines!

this is actually more disturbing:

ls | parallel mv {} destdir

versus the filename-safe

find . -type f -print0 | xargs -0 -I{} -P 4 mv {} destdir

> I'd say this should be removed.
> Or the examples should at least be fixed.

there are terrible practices being shown there.
Re: builtin "read -d" behaves differently after "set -e"
On 06.02.2013 14:46, Greg Wooledge wrote:
> On Wed, Feb 06, 2013 at 12:39:45AM +0100, Tiwo W. wrote:
>> When using this in a script of mine, I noticed that this fails
>> when errexit is set ("set -e").
> Most things do. set -e is crap. You should consider not using it.
>
>> * why does it work with "set +e" ?
> Because set +e disables the crap.
>
>> * what is the recommended way to disable splitting with "read"?
> What splitting? You only gave a single variable. There is no field
> splitting when you only give one variable.
>
>> set -e
>> read -d '' var2 <<EOF
>> but
>> this
>> fails
>> EOF
>> echo "$var2"
> Are you actually asking how to force read to slurp in an entire file
> including newlines, all at once? Is that what you meant by "splitting"?
>
> Well, you already found your answer -- stop using set -e. By the way,
> you may also want to set IFS to an empty string to disable the trimming
> of leading and trailing whitespace, and use the -r option to suppress
> special handling of backslashes. Thus:
>
> IFS= read -rd '' var2 <<EOF
> ...
> EOF
>
> In case you're curious why set -e makes it fail:
>
> imadev:~$ IFS= read -rd '' foo <<EOF
> > blah
> > EOF
> imadev:~$ echo $?
> 1
>
> read returns 1 because it reached the end of file for standard input.
> From the manual: "The return code is zero, unless end-of-file is
> encountered, read times out (in which case the return code is greater than
> 128), or an invalid file descriptor is supplied as the argument to -u."
>
> So, if you're reading all the way to EOF (on purpose) then you should
> ignore the exit status. set -e doesn't permit you to ignore the exit
> status on commands where the exit status indicates a nonfatal condition
> (such as read -d '' or let i=0). This is why set -e is crap.
>
> Also see http://mywiki.wooledge.org/BashFAQ/105

set -e
IFS= read -rd '' var2 <<EOF
but
this
fails
EOF
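Putting Greg's pieces together, a slurp-to-EOF that survives set -e (file name is a placeholder; the || true swallows read's expected status 1 at end of file):

    set -e
    IFS= read -rd '' var2 < input.txt || true
    printf '%s' "$var2"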
Re: Q on Bash's self-documented POSIX compliance...
On 27.01.2013 01:37, Clark WANG wrote:
> On Sat, Jan 26, 2013 at 1:27 PM, Linda Walsh wrote:
>
>> I noted on the bash man page that it says it will start in posix
>> compliance mode when started as 'sh' (/bin/sh).
>>
>> What does that mean about bash extensions like arrays and
>> use of [[]]?
>>
>> Those are currently not POSIX (but due to both Bash and Ksh having
>> them, some think that such features are part of POSIX now)...
>>
>> If you operate in POSIX compliance mode, what guarantee is there that
>> you can take a script developed with bash, in POSIX compliance mode,
>> and run it under another POSIX compliant shell?
>>
>> Is it such that Bash can run POSIX compliant scripts, BUT, cannot be
>> (easily) used to develop such, as there is no way to tell it to
>> only use POSIX?
>>
>> If someone runs in POSIX mode, should bash keep arbitrary bash-specific
>> extensions enabled?
>>
>> I am wondering about the rationale, but also note that some people believe
>> they are running a POSIX compatible shell when they use /bin/sh, but would
>> get rudely surprised if another less feature-full shell were dropped in
>> as a replacement.
>
> I think every POSIX compatible shell has its own extensions so there's no
> guarantee that a script which works fine in shell A would still work in
> shell B even if both A and B are POSIX compatible, unless the script writer
> only uses POSIX compatible features. Is there a pure POSIX shell without
> any added extensions?

dash is normally a better gauge of how portable your script is than bash in POSIX mode.
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
On 14.01.2013 21:12, Chet Ramey wrote:
> On 1/14/13 2:57 PM, John Kearney wrote:
>
>> I have no idea why errexit exists; I doubt it was for lazy people though.
>> It's more work to use it.
> I had someone tell me once with a straight (electronic) face that -e
> exists `to allow "make" to work as expected', since historical make invokes
> sh -ce to run recipes. Now, he maintains his own independently-written
> version of `make', so his opinion might be somewhat skewed.
>
> Chet

That actually makes a lot of sense. It explains the two weirdest things about it: (1) no error message explaining what happened, and (2) the weird behavior with functions.
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
On 14.01.2013 22:09, Ken Irving wrote:
> On Mon, Jan 14, 2013 at 08:57:41PM +0100, John Kearney wrote:
>> ...
>> btw
>> || return $?
>>
>> isn't actually error checking, it's error propagation.
> Also btw, I think you can omit the $? in this case; from bash(1):
>
>     return [n]
>     ...
>     If n is omitted, the return status is that of the last command
>     executed in the function body. ...
>
> and similarly for exit:
>
>     exit [n]
>     ... If n is omitted,
>     the exit status is that of the last command executed. ...
>
> Ken

Thanks, yeah, you're right, but I think it's clearer to include it, especially for people with less experience. I try to be as explicit as possible. Perl cured me of my taste for compactness in code. ;)
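The two equivalent spellings side by side (function names are placeholders):

    f() { mkdir "$1" || return $?; }   # explicit: propagate mkdir's status
    g() { mkdir "$1" || return; }      # same effect: a bare return reuses the last status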
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
On 14.01.2013 20:25, Greg Wooledge wrote:
> On Mon, Jan 14, 2013 at 08:08:53PM +0100, John Kearney wrote:
>> this should exit.
>> #!/bin/bash
>>
>> set -e
>> f() { test -d nosuchdir && echo no dir; }
>> echo testings
>> f
>> echo survived
> OK, cool. That gives me more ammunition to use in the war against set -e.
>
> ==
> imadev:~$ cat foo
> #!/bin/bash
>
> set -e
> test -d nosuchdir && echo no dir
> echo survived
> imadev:~$ ./foo
> survived
> ==
> imadev:~$ cat bar
> #!/bin/bash
>
> set -e
> f() { test -d nosuchdir && echo no dir; }
> f
> echo survived
> imadev:~$ ./bar
> imadev:~$
> ==
> imadev:~$ cat baz
> #!/bin/bash
>
> set -e
> f() { if test -d nosuchdir; then echo no dir; fi; }
> f
> echo survived
> imadev:~$ ./baz
> survived
> ==
>
>> All I was pointing out is that it's safer to use the syntax
>>
>> [ cond ] || cmd
>>
>> or
>>
>> [ cond ] && cmd || cmd2
> I don't even know what "safer" means any more. As you can see in my
> code examples above, if you were expecting the "survived" line to appear,
> then you get burned if you wrap the test in a function, but only if the
> test uses the "shorthand" && instead of the "vanilla" if.
>
> But I'm not sure what people expect it to do. It's hard enough just
> documenting what it ACTUALLY does.
>
>> you always need a || on a one-liner to make sure the return value of the
>> line is 0.
> Or stop using set -e. No, really. Just... fucking... stop. :-(
>
>> but let's say you want to do 2 things in a function, you have to do
>> something like:
>> f(){
>>     mkdir "${1%/*}" || return $? # so the line doesn't return an error.
>>     touch "${1}"
>> }
> ... wait, so you're saying that even if you use set -e, you STILL have to
> include manual error checking? The whole point of set -e was to allow
> lazy people to omit it, wasn't it?
>
> So, set -e lets you skip error checking, but you have to add error checking
> to work around the quirks of set -e.
>
> That's hilarious.

I have no idea why errexit exists; I doubt it was for lazy people though. It's more work to use it. I use trap ERR, not errexit, which allows me to log unhandled errors. I actually find trap ERR/errexit pretty straightforward now. I don't really get why people are so against it, except that they seem to have the wrong expectations of it.

btw,

|| return $?

isn't actually error checking, it's error propagation.

f(){
    # not the last command in the function
    mkdir "${1%/*}"              # exit on error.
    mkdir "${1%/*}" || return $? # return an error.
    mkdir "${1%/*}" || true      # ignore error.
    # last command in the function
    touch "${1}"                 # return exit code
}

what is confusing though is

f(){
    touch "${1}" # exit on error
    return $?
}

this will not work as expected with errexit, because the touch isn't the last command in the function; however, just removing the return should fix it.

also need to be careful of stuff like

x=$(false)

need something more like

x=$(false||true)

or

if x=$(false) ; then

basically any situation in which a line returns a non-zero value is probably going to cause the exit, especially in functions. I just do it automatically now. I guess most people aren't used to considering the line return values.
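The command-substitution pitfall from the last paragraph, as a three-line sketch:

    set -e
    x=$(false) || true           # without || true the assignment's status (1) exits the script
    echo "survived with x='$x'"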
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
On 14.01.2013 14:33, Greg Wooledge wrote:
> On Sun, Jan 13, 2013 at 03:31:24AM +0100, John Kearney wrote:
>> set -o errexit
>> test_func() {
>>     [ ! -d test ] && echo test2
>> }
>>
>> echo test3
>> test_func
>> echo test4
>>
>> now so long as test doesn't exist in the cwd it should errexit.
>> at least it did for me just now.
> Cannot reproduce.
>
> imadev:~$ cat bar
> #!/bin/bash
>
> set -e
> f() { test ! -d nosuchdir && echo no dir; }
> f
> echo survived
> imadev:~$ ./bar
> no dir
> survived

the "no dir" above means that the test didn't fail. The exit only happens if the test fails. Sorry, I keep seeming to make typos; I really need more sleep.

This should exit:

#!/bin/bash

set -e
f() { test -d nosuchdir && echo no dir; }
echo testings
f
echo survived

All I was pointing out is that it's safer to use the syntax

[ cond ] || cmd

or

[ cond ] && cmd || cmd2

you always need a || on a one-liner to make sure the return value of the line is 0. This isn't necessary in the script body, I think, but in a function it is; unless it's the last command, then it will be auto-returned.

but let's say you want to do 2 things in a function, you have to do something like:

f(){
    mkdir "${1%/*}" || return $? # so the line doesn't return an error.
    touch "${1}"
}

anyway, it is nearly always something that should be being done anyway. It's only the conditional one-liners that tend to frustrate people a lot, from what I've seen.
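The recommended one-liner shape under set -e, runnable (directory name is a placeholder):

    set -e
    f() { [ -d nosuchdir ] && echo "dir exists" || true; }   # || true keeps the line's status 0
    f
    echo survived   # reached even though the test failed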
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
Am 13.01.2013 00:04, schrieb Chet Ramey:
> On 1/12/13 10:07 AM, John Kearney wrote:
>
>> regarding -e it mainly has a bad name because there is no good guide how
>> to program with it.
>> so for example this causes stress
>> [ ! -d ${dirname} ] && mkdir ${dirname}
>> because if the dir exists it will exit the scripts :)
> I'm not sure this is what you wanted to say. When -e is set, that code
> will not cause an error exit if ${dirname} exists and is a directory. Run
> this script in the bash source directory and see what happens:
>
> set -e
> [ ! -d builtins ] && mkdir builtins
> echo after
>
>
> Chet

:) It's a little more complex. Truthfully, I make rules for how I should do stuff and then just follow them. In this case you actually need to put the code in a function; then it's actually the function return, not the command itself, that causes the exit. At least I think that's what happens; truthfully, sometimes even with the caller trace it can be hard to tell what is actually going on. i.e.

set -o errexit
test_func() {
    [ ! -d test ] && echo test2
}

echo test3
test_func
echo test4

now so long as test doesn't exist in the cwd it should errexit. at least it did for me just now.

Like I say, the only reason I don't like errexit is that it doesn't say why it exited, so I use the ERR trap. Which is great.

Just to clarify, I'm not complaining, just saying why I think people have bad experiences with errexit. Having said that, it might be nice to get an optional backtrace on errors. I do this myself, but it might help others if it was natively supported.

John
Re: printf %q represents null argument as empty string.
Am 12.01.2013 20:40, schrieb Chet Ramey:
> On 1/12/13 9:48 AM, John Kearney wrote:
>
>> anyway now we have a point. I disagree that
>> "${@}"
>>
>> should expand to 0 or more words; from the documentation it should be 1
>> or more. At least that is how I read that paragraph. It says it will
>> split the word, not make the word vanish.
>> so I had to test, and it really does. How weird; is that in the posix spec?
> Yes. Here's the relevant sentence from the man page description of $@:
>
>     When there are no positional parameters, "$@" and $@ expand to
>     nothing (i.e., they are removed).
>
> Posix says something similar:
>
>     If there are no positional parameters, the expansion of '@' shall
>     generate zero fields, even when '@' is double-quoted.
>
> Chet

Thanks, one lives and learns.
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
Am 12.01.2013 14:53, schrieb Dan Douglas:
> Yes some use -u / -e for debugging apparently. Actual logic relying upon those
> can be fragile of course. I prefer when things return nonzero instead of
> throwing errors usually so that they're handleable.

Ah, but you can still do that if you want. You just do

${unsetvar:-0}   # says you want 0 for a null string or unset
${unsetvar-0}    # says you want 0 for unset

I know these aren't the sort of things you want to add retroactively, but if you program from the ground up with this in mind, your code is much more explicit and less reliant on particular interpreter behavior. So again it forces a more explicit programming style, which is always better. Truthfully, most people complain my scripts don't look like scripts any more but more like programs. But once they get used to the style, most see its advantages. At the very least, when they have to figure out what has gone wrong, they understand.

Regarding -e, it mainly has a bad name because there is no good guide on how to program with it. So for example this causes stress

[ ! -d ${dirname} ] && mkdir ${dirname}

because if the dir exists it will exit the script :)

[ -d ${dirname} ] || mkdir ${dirname}

this however is safe. Actually forcing myself to work with the ERR trap taught me a lot about how this sort of thing works. That's why I use, for example (old but simple example),

set -o errtrace
function TraceEvent {
    local LASTERR=$?
    local ETYPE="${1:?Missing Error Type}"
    PrintFunctionStack 1
    cErrorOut 1 "${ETYPE} ${BASH_SOURCE[1]}(${BASH_LINENO[1]}):${FUNCNAME[1]} ELEVEL=${LASTERR} \"${BASH_COMMAND}\""
}
trap 'TraceEvent ERR' ERR

which basically gives you a heads-up every time you haven't handled an error return code. So the following silly example

test_func4() {
    false
}
test_func3() {
    test_func4
}
test_func2() {
    test_func3
}
test_func1() {
    test_func2
}
test_func1

will give me a log that looks like

#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (225 ) : main: "[5]/home/dethrophes/scripts/bash/test.sh(225):test_func1"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (223 ) : test_func1 : "[4]/home/dethrophes/scripts/bash/test.sh(223):test_func2"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (220 ) : test_func2 : "[3]/home/dethrophes/scripts/bash/test.sh(220):test_func3"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (217 ) : test_func3 : "[2]/home/dethrophes/scripts/bash/test.sh(217):test_func4"
#E: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (214 ) : test_func4 : "ERR /home/dethrophes/scripts/bash/test.sh(217):test_func4 ELEVEL=1 \"false\""

which allows me to very quickly root-cause the error and fix it. If you really don't care, you can just stick a ||true on the end to ignore it in the future; so in this case, something like

test_func4() {
    false || true
}

I mean, it would be nice to have an unset trap, but without it nounset is the next best thing. Also, I don't think of this as debugging; it's code verification/analysis. I do this so I don't have to debug my code. This is a big help against typos and scoping errors. Like I say, it's like using lint.
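For reference, a self-contained variant of the same idea: the mail above relies on helpers (PrintFunctionStack, cErrorOut) that are not shown there, so this sketch replaces them with a plain loop over FUNCNAME/BASH_SOURCE/BASH_LINENO:

#!/bin/bash
set -o errtrace   # let functions inherit the ERR trap

print_stack() {
    local i
    for (( i = 1; i < ${#FUNCNAME[@]}; i++ )); do
        echo "  at ${FUNCNAME[$i]} (${BASH_SOURCE[$i]}:${BASH_LINENO[$((i-1))]})" >&2
    done
}
trap 'echo "ELEVEL=$? in: ${BASH_COMMAND}" >&2; print_stack' ERR

test_func4() { false; }
test_func3() { test_func4; }
test_func3    # logs the failing command and a backtrace, then continues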
Re: printf %q represents null argument as empty string.
Am 12.01.2013 15:34, schrieb Dan Douglas:
> On Friday, January 11, 2013 10:39:19 PM Dan Douglas wrote:
>> On Saturday, January 12, 2013 02:35:34 AM John Kearney wrote:
>> BTW, your wrappers won't work. A wrapper would need to implement format
> Hrmf I should have clarified that I only meant A complete printf wrapper would
> be difficult. A single-purpose workaround is perfectly fine. e.g.
> printq() { ${1+printf %q "$@"}; }; ... which is probably something like what
> you meant. Sorry for the rant.
>

Don't worry, I've got a thick skin ;) feel free to rant, you have a different perspective and I like that.

Anyway, now we have a point. I disagree that

"${@}"

should expand to 0 or more words; from the documentation it should be 1 or more. At least that is how I read that paragraph. It says it will split the word, not make the word vanish. So I had to test, and it really does. How weird; is that in the posix spec?

set --
test_func() { echo $#; }
test_func "${@}"     # 0
test_func "1${@}"    # 1
test_func "${@:-}"   # 1
test_func "${@-}"    # 1

Now I'm confused ... oh well, sorry, I had the functionality differently in my head.
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
Am 11.01.2013 22:34, schrieb Dan Douglas:
> On Friday, January 11, 2013 09:48:32 PM John Kearney wrote:
>> Am 11.01.2013 19:27, schrieb Dan Douglas:
>>> Bash treats the variable as essentially undefined until given at least an
>>> empty value.
>>>
>>> $ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
>>> 1,
>>> bash: line 0: typeset: x: not found
>>> $ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
>>> 0,
>>> typeset -i x
>>>
>>> Zsh implicitly gives integers a zero value if none are specified and the
>>> variable was previously undefined. Either the ksh or zsh ways are fine IMO.
>>>
>>> Also I'll throw this in:
>>>
>>> $ arr[1]=test; [[ -v arr[1] ]]; echo $?
>>> 1
>>>
>>> This now works in ksh to test if an individual element is set, though it
>>> hasn't always. Maybe Bash should do the same? -v is tricky because it adds
>>> some extra nuances to what it means for something to be defined...
>>>
>> Personally I like the current behavior, disclaimer I use nounset.
>> I see no problem with getting people to initialize variables.
> How is this relevant? It's an inconsistency in the way set/unset variables
> are normally handled. You don't use variadic functions? Unset variables /
> parameters are a normal part of most scripts.
>
>> it is a more robust programming approach.
> I strongly disagree. (Same goes for errexit.)
>

:) We agree on errexit; the ERR trap, however, is another matter, I quite like that. Note the only reason I don't like errexit is that it doesn't tell you why it exited; nounset does.

nounset is very valuable during the entire testing and validation phase. Admittedly bash is more of a hobby for me, but I still have unit testing for the functions and more complex harness testing for the higher-level stuff. Before I ship code I may turn it off, but normally if it's really critical I won't use bash for it anyway; I mainly use bash for analysis. As such, if bash stops because it finds an unset variable, it is always a bug that bash has helped me track down.

I guess it also depends on how big your scripts are. Up to a couple of thousand lines is OK, I guess, but once you get into the tens of thousands, to keep your sanity and keep high reliability you become more and more strict with what you allow: strict naming conventions and coding styles. Setting nounset is in the same category as setting warnings to all and treating warnings as errors. But then again I do mission-critical designs, so I guess I have a different mindset.
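A small illustration of the nounset style argued for here (the names are invented):

#!/bin/bash
set -o nounset

count=3
echo "${OPTIONAL-}"   # deliberate: expands to empty when OPTIONAL is unset
echo "${counf}"       # typo for "count": bash aborts with "counf: unbound variable"
echo "never reached"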
Re: printf %q represents null argument as empty string.
Am 11.01.2013 22:05, schrieb Dan Douglas:
> On Friday, January 11, 2013 09:39:00 PM John Kearney wrote:
>> Am 11.01.2013 19:38, schrieb Dan Douglas:
>>> $ set --; printf %q\\n "$@"
>>> ''
>>>
>>> printf should perhaps only output '' when there is actually a corresponding
>>> empty argument, else eval "$(printf %q ...)" and similar may give different
>>> results than expected. Other shells don't output '', even mksh's ${var@Q}
>>> expansion. Zsh's ${(q)var} does.
>> that is not a bug in printf %q
>>
>> it is what you expect to happen with "${@}"
>> should that be 0 arguments if $# is 0.
>>
>> I however find the behavior irritating, but correct from the description.
>>
>> to do what you are suggesting you would need a special case handler for this
>> "${@}" as opposed to "${@}j" or any other variation.
>>
>>
>> what I tend to do as a workaround is
>>
>> printf() {
>>     if [ $# -eq 2 -a -z "${2}" ];then
>>         builtin printf "${1}"
>>     else
>>         builtin printf "${@}"
>>     fi
>> }
>>
>>
>> or not as good but ok in most cases something like
>>
>> printf "%q" ${1:+"${@}"}
>>
>>
> I don't understand what you mean. The issue I'm speaking of is that printf %q
> produces a quoted empty string both when given no args and when given one
> empty arg. A quoted "$@" with no positional parameters present expands to zero
> words (and correspondingly for "${arr[@]}"). Why do you think "x${@}x" is
> special? (Note that expansion didn't even work correctly a few patchsets ago.)
>
> Also as pointed out, every other shell with a printf %q feature disagrees with
> Bash. Are you saying that something in the manual says that it should do
> otherwise? I'm aware you could write a wrapper, I just don't see any utility
> in the default behavior.

Um, maybe an example will clarify my attempted point:

set -- arg1 arg2 arg3
set -- "--(${@})--"
printf "<%q> " "${@}"
<--\(arg1> <arg2> <arg3\)-->

set --
set -- "--(${@})--"
printf "<%q> " "${@}"
<--\(\)-->

So there is always at least one word or one arg; just because it's "${@}" should not affect this behavior. Is that clearer?

As such, bash is doing the right thing as far as I'm concerned. Truthfully, it's not normally what I want, but that is beside the point; consistency is more important, especially when it's so easy to work around.

The relevant part of the man page is

    When there are no array members, ${name[@]} expands to nothing.
    [...]
    If the double-quoted expansion occurs within a word, the expansion of
    the first parameter is joined with the beginning part of the original
    word, and the expansion of the last parameter is joined with the last
    part of the original word. This is analogous to the expansion of the
    special parameters * and @ (see Special Parameters above).
    ${#name[subscript]} expands to the length of ${name[subscript]}. If
    subscript is * or @, the expansion is the number of elements in the
    array. Referencing an array variable without a subscript is equivalent
    to referencing the array with a subscript of 0.

so

set --
printf "%q" "${@}"

becomes

printf "%q" ""

which is correct as ''
Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)
Am 11.01.2013 19:27, schrieb Dan Douglas:
> Bash treats the variable as essentially undefined until given at least an
> empty value.
>
> $ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> 1,
> bash: line 0: typeset: x: not found
> $ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> 0,
> typeset -i x
>
> Zsh implicitly gives integers a zero value if none are specified and the
> variable was previously undefined. Either the ksh or zsh ways are fine IMO.
>
> Also I'll throw this in:
>
> $ arr[1]=test; [[ -v arr[1] ]]; echo $?
> 1
>
> This now works in ksh to test if an individual element is set, though it
> hasn't always. Maybe Bash should do the same? -v is tricky because it adds
> some extra nuances to what it means for something to be defined...
>

Personally I like the current behavior (disclaimer: I use nounset). I see no problem with getting people to initialize variables. It is a more robust programming approach.
Re: printf %q represents null argument as empty string.
Am 11.01.2013 19:38, schrieb Dan Douglas:
> $ set --; printf %q\\n "$@"
> ''
>
> printf should perhaps only output '' when there is actually a corresponding
> empty argument, else eval "$(printf %q ...)" and similar may give different
> results than expected. Other shells don't output '', even mksh's ${var@Q}
> expansion. Zsh's ${(q)var} does.

That is not a bug in printf %q.

It is what you expect to happen with "${@}": should that be 0 arguments if $# is 0?

I however find the behavior irritating, but correct from the description.

To do what you are suggesting, you would need a special-case handler for this "${@}" as opposed to "${@}j" or any other variation.

What I tend to do as a workaround is

printf() {
    if [ $# -eq 2 -a -z "${2}" ];then
        builtin printf "${1}"
    else
        builtin printf "${@}"
    fi
}

or, not as good but OK in most cases, something like

printf "%q" ${1:+"${@}"}
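What the wrapper works around, shown as a runnable snippet: with no positional parameters, "$@" expands to zero words, yet printf still processes its format once, and %q with a missing argument prints '':

#!/bin/bash
set --
printf '<%q>\n' "$@"   # prints <''> although zero arguments were passed
printf '<%q>\n' ""     # also prints <''>, for one genuinely empty argument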
Re: output of `export -p' seems misleading
Am 09.11.2012 17:21, schrieb Greg Wooledge:
> On Fri, Nov 09, 2012 at 11:18:24AM -0500, Greg Wooledge wrote:
>> restore_environment() {
>>     set -o posix
>>     eval "$saved_output_of_export_dash_p"
>>     set +o posix
>> }
> Err, what I meant was:
>
> save_environment() {
>     set -o posix
>     saved_env=$(export -p)
>     set +o posix
> }
>
> restore_environment() {
>     eval "$saved_env"
> }
>

Or I guess you could also do something like

save_environment() {
    saved_env=$(export -p)
}
restore_environment() {
    eval "${saved_env//declare -x /declare -g -x }"
}

or

save_environment() {
    saved_env=$(set -o posix; export -p)
}
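For context, the reason the posix toggle (or the declare -g rewrite) is needed at all is the output format of export -p; roughly (illustrative transcript):

$ bash -c 'export FOO=bar; export -p | grep FOO'
declare -x FOO="bar"
$ bash -c 'set -o posix; export FOO=bar; export -p | grep FOO'
export FOO="bar"

Eval'ed inside a function, the `declare -x` form creates function-local variables, hence substituting in `declare -g -x` (bash 4.2 and later) or using the posix-mode `export` form instead.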
Re: Regular expression matching fails with string RE
Am 17.10.2012 03:13, schrieb Clark WANG:
> On Wed, Oct 17, 2012 at 5:18 AM, wrote:
>
>> Bash Version: 4.2
>> Patch Level: 37
>>
>> Description:
>>
>> bash -c 're=".*([0-9])"; if [[ "foo1" =~ ".*([0-9])" ]]; then echo
>> ${BASH_REMATCH[0]}; elif [[ "bar2" =~ $re ]]; then echo ${BASH_REMATCH[0]};
>> fi'
>>
>> This should output foo1. It instead outputs bar2, as the first match fails.
>>
>> From bash's man page:
> [[ expression ]]
> ... ...
> An additional binary operator, =~, is available, with the same
> ... ...
> alphabetic characters. Any part of the pattern may be quoted to
> force it to be matched as a string. Substrings matched by
> ... ...

Drop the quotes on the regex:

bash -c 're=".*([0-9])"; if [[ "foo1" =~ .*([0-9]) ]]; then echo ${BASH_REMATCH[0]}; elif [[ "bar2" =~ $re ]]; then echo ${BASH_REMATCH[0]}; fi'

outputs foo1.
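The rule, as a runnable illustration: inside [[ ]], any quoted part of the right-hand side of =~ is matched literally, so keep the ERE unquoted or put it in a variable:

#!/bin/bash
re='.*([0-9])'
[[ "foo1" =~ .*([0-9]) ]]   && echo "unquoted: ${BASH_REMATCH[0]}"   # foo1
[[ "foo1" =~ $re ]]         && echo "variable: ${BASH_REMATCH[0]}"   # foo1
[[ "foo1" =~ ".*([0-9])" ]] || echo "quoted: no match (literal string)"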
Re: Bash bug interpolating delete characters
Am 07.05.2012 22:46, schrieb Chet Ramey: > On 5/3/12 5:53 AM, Ruediger Kuhlmann wrote: >> Hi, >> >> please try the following bash script: >> >> a=x >> del="$(echo -e "\\x7f")" >> >> echo "$del${a#x}" | od -ta >> echo "$del ${a#x}" | od -ta >> echo " $del${a#x}" | od -ta >> >> Using bash 3.2, the output is: >> >> 000 del nl >> 002 >> 000 del sp nl >> 003 >> 000 sp del nl >> 003 >> >> however with bash 4.1 and bash 4.2.20, the output is only: >> >> 000 del nl >> 002 >> 000 sp nl >> 002 >> 000 sp nl >> 002 >> >> ... so in the second and third line, the delete character magically >> disappears. Neither OS nor locale seem to influence this. Using a delete >> character directly in the script instead of $del also has no impact, either. > It's a case of one part of the code violating assumptions made by (and > conditions imposed by) another. Try the attached patch; it fixes the > issue for me. > > Chet > It also works for me. "$del${a#x}" =[$'\177'] " $del${a%x}"=[$' \177'] " $del""${a:0:0}"=[$' \177'] " ${del}${a:0:0}"=[$' \177'] "${del:0:1}${a#d}" =[$'\177x'] "${del:0:1} ${a#d}" =[$'\177 x'] "${del:0:1} ${a:+}" =[$'\177 '] "$del ${a#x}"=[$'\177 '] " $del${a:0:0}" =[$' \177'] " $del${a}" =[$' \177x'] " ${del:0:1}${a:0:0}"=[$' \177'] "${del:0:1}${a#x}" =[$'\177'] "${del:0:1} ${a#x}" =[$'\177 '] " $del${a#x}"=[$' \177'] " $del"${a:0:0} =[$' \177'] " $del" =[$' \177'] " ${del:0:1}${a}"=[$' \177x'] "${del:0:1} ${a}"=[$'\177 x'] "${del:0:1} ${a:-}" =[$'\177 x']
Re: Parallelism a la make -j / GNU parallel
Am 06.05.2012 08:28, schrieb Mike Frysinger: > On Saturday 05 May 2012 04:28:50 John Kearney wrote: >> Am 05.05.2012 06:35, schrieb Mike Frysinger: >>> On Friday 04 May 2012 15:25:25 John Kearney wrote: >>>> Am 04.05.2012 21:13, schrieb Mike Frysinger: >>>>> On Friday 04 May 2012 15:02:27 John Kearney wrote: >>>>>> Am 04.05.2012 20:53, schrieb Mike Frysinger: >>>>>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote: >>>>>>>> Mike Frysinger writes: >>>>>>>>> i wish there was a way to use `wait` that didn't block until all >>>>>>>>> the pids returned. maybe a dedicated option, or a shopt to enable >>>>>>>>> this, or a new command. >>>>>>>>> >>>>>>>>> for example, if i launched 10 jobs in the background, i usually >>>>>>>>> want to wait for the first one to exit so i can queue up another >>>>>>>>> one, not wait for all of them. >>>>>>>> If you set -m you can trap on SIGCHLD while waiting. >>>>>>> awesome, that's a good mitigation >>>>>>> >>>>>>> #!/bin/bash >>>>>>> set -m >>>>>>> cnt=0 >>>>>>> trap ': $(( --cnt ))' SIGCHLD >>>>>>> for n in {0..20} ; do >>>>>>> >>>>>>> ( >>>>>>> >>>>>>> d=$(( RANDOM % 10 )) >>>>>>> echo $n sleeping $d >>>>>>> sleep $d >>>>>>> >>>>>>> ) & >>>>>>> >>>>>>> : $(( ++cnt )) >>>>>>> >>>>>>> if [[ ${cnt} -ge 10 ]] ; then >>>>>>> >>>>>>> echo going to wait >>>>>>> wait >>>>>>> >>>>>>> fi >>>>>>> >>>>>>> done >>>>>>> trap - SIGCHLD >>>>>>> wait >>>>>>> >>>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a >>>>>>> wait), but this is good enough for some things. it does lose >>>>>>> visibility into which pids are live vs reaped, and their exit status, >>>>>>> but i more often don't care about that ... >>>>>> That won't work I don't think. >>>>> seemed to work fine for me >>>>> >>>>>> I think you meant something more like this? >>>>> no. i want to sleep the parent indefinitely and fork a child asap >>>>> (hence the `wait`), not busy wait with a one second delay. the `set >>>>> -m` + SIGCHLD interrupted the `wait` and allowed it to return. >>>> The functionality of the code doesn't need SIGCHLD, it still waits till >>>> all the 10 processes are finished before starting the next lot. >>> not on my system it doesn't. maybe a difference in bash versions. as >>> soon as one process quits, the `wait` is interrupted, a new one is >>> forked, and the parent goes back to sleep until another child exits. if >>> i don't `set -m`, then i see what you describe -- the wait doesn't >>> return until all 10 children exit. >> Just to clarify what I see with your code, with the extra echos from me >> and less threads so its shorter. > that's not what i was getting. as soon as i saw the echo of SIGCHLD, a new > "sleeping" would get launched. > -mike Ok then, thats weird because it doesn't really make sense to me why a SIGCHLD would interrupt the wait command. Oh well.
Re: Parallelism a la make -j / GNU parallel
Am 06.05.2012 08:28, schrieb Mike Frysinger:
> On Saturday 05 May 2012 23:25:26 John Kearney wrote:
>> Am 05.05.2012 06:28, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 16:17:02 Chet Ramey wrote:
>>>> On 5/4/12 2:53 PM, Mike Frysinger wrote:
>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a
>>>>> wait), but this is good enough for some things. it does lose
>>>>> visibility into which pids are live vs reaped, and their exit status,
>>>>> but i more often don't care about that ...
>>>> What version of bash did you test this on? Bash-4.0 is a little
>>>> different in how it treats the SIGCHLD trap.
>>> bash-4.2_p28. wait returns 145 (which is SIGCHLD).
>>>
>>>> Would it be useful for bash to set a shell variable to the PID of the
>>>> just-reaped process that caused the SIGCHLD trap? That way you could
>>>> keep an array of PIDs and, if you wanted, use that variable to keep
>>>> track of live and dead children.
>>> we've got associative arrays now ... we could have one which contains all
>>> the relevant info:
>>> declare -A BASH_CHILD_STATUS=(
>>>     ["pid"]=1234
>>>     ["status"]=1   # WEXITSTATUS()
>>>     ["signal"]=13  # WTERMSIG()
>>> )
>>>
>>> makes it easy to add any other fields people might care about ...
>> Is there actually a guarantee that there will be 1 SIGCHLD for every
>> exited process.
>> Isn't it actually a race condition?
> when SIGCHLD is delivered doesn't matter. the child stays in a zombie state
> until the parent calls wait() on it and gets its status. so you can have
> `wait` return one child's status at a time.
> -mike

But I think my point still stands:

trap ': $(( cnt-- ))' SIGCHLD

is a bad idea. You actually need to verify how many jobs are running, not just arbitrarily decrement a counter, because you're not guaranteed a trap for each process. I mean, sure, it will normally work, but it's not guaranteed to work.

Also, I think the question would be: is there any point in forcing bash to issue 1 status at a time? It seems to make more sense to issue them in bulk. So bash could populate an array of all reaped processes in one trap rather than having to execute multiple traps. This is what bash does internally anyway?
Re: Parallelism a la make -j / GNU parallel
Am 05.05.2012 06:28, schrieb Mike Frysinger:
> On Friday 04 May 2012 16:17:02 Chet Ramey wrote:
>> On 5/4/12 2:53 PM, Mike Frysinger wrote:
>>> it might be a little racy (wrt checking cnt >= 10 and then doing a wait),
>>> but this is good enough for some things. it does lose visibility into
>>> which pids are live vs reaped, and their exit status, but i more often
>>> don't care about that ...
>> What version of bash did you test this on? Bash-4.0 is a little different
>> in how it treats the SIGCHLD trap.
> bash-4.2_p28. wait returns 145 (which is SIGCHLD).
>
>> Would it be useful for bash to set a shell variable to the PID of the just-
>> reaped process that caused the SIGCHLD trap? That way you could keep an
>> array of PIDs and, if you wanted, use that variable to keep track of live
>> and dead children.
> we've got associative arrays now ... we could have one which contains all the
> relevant info:
> declare -A BASH_CHILD_STATUS=(
>     ["pid"]=1234
>     ["status"]=1   # WEXITSTATUS()
>     ["signal"]=13  # WTERMSIG()
> )
> makes it easy to add any other fields people might care about ...
> -mike

Is there actually a guarantee that there will be 1 SIGCHLD for every exited process? Isn't it actually a race condition? What happens if 2 subprocesses exit simultaneously, or if a process exits while we are already in the SIGCHLD trap?

I mean, my normal interpretation of an interrupt/event/trap is that it is just a notification that I need to check what has happened, or that there was an event, not the extent of the event. I keep feeling that the following is bad practice

trap ': $(( --cnt ))' SIGCHLD

and that something like this would be better

trap 'cnt=$(jobs -p | wc -w)' SIGCHLD

As such you would need something more like:

declare -a BASH_CHILD_STATUS=([1234]=1 [1235]=1 [1236]=1)
declare -a BASH_CHILD_STATUS_SIGNAL=([1234]=13 [1235]=13 [1236]=13)
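A sketch of that "verify instead of count" idea (the loop is invented for illustration; the fractional sleep assumes a sleep that accepts it, e.g. GNU sleep): recompute the number of live jobs from the job table rather than decrementing a counter in the trap:

#!/bin/bash
max=4
for n in {1..20}; do
    while (( $(jobs -pr | wc -l) >= max )); do
        sleep 0.2    # crude polling; a SIGCHLD trap could shorten the wait
    done
    ( sleep $(( RANDOM % 3 )); echo "job $n done" ) &
done
wait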
Re: Parallelism a la make -j / GNU parallel
Am 05.05.2012 06:35, schrieb Mike Frysinger: > On Friday 04 May 2012 15:25:25 John Kearney wrote: >> Am 04.05.2012 21:13, schrieb Mike Frysinger: >>> On Friday 04 May 2012 15:02:27 John Kearney wrote: >>>> Am 04.05.2012 20:53, schrieb Mike Frysinger: >>>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote: >>>>>> Mike Frysinger writes: >>>>>>> i wish there was a way to use `wait` that didn't block until all the >>>>>>> pids returned. maybe a dedicated option, or a shopt to enable this, >>>>>>> or a new command. >>>>>>> >>>>>>> for example, if i launched 10 jobs in the background, i usually want >>>>>>> to wait for the first one to exit so i can queue up another one, not >>>>>>> wait for all of them. >>>>>> If you set -m you can trap on SIGCHLD while waiting. >>>>> awesome, that's a good mitigation >>>>> >>>>> #!/bin/bash >>>>> set -m >>>>> cnt=0 >>>>> trap ': $(( --cnt ))' SIGCHLD >>>>> for n in {0..20} ; do >>>>> ( >>>>> d=$(( RANDOM % 10 )) >>>>> echo $n sleeping $d >>>>> sleep $d >>>>> ) & >>>>> : $(( ++cnt )) >>>>> if [[ ${cnt} -ge 10 ]] ; then >>>>> echo going to wait >>>>> wait >>>>> fi >>>>> done >>>>> trap - SIGCHLD >>>>> wait >>>>> >>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a >>>>> wait), but this is good enough for some things. it does lose >>>>> visibility into which pids are live vs reaped, and their exit status, >>>>> but i more often don't care about that ... >>>> That won't work I don't think. >>> seemed to work fine for me >>> >>>> I think you meant something more like this? >>> no. i want to sleep the parent indefinitely and fork a child asap (hence >>> the `wait`), not busy wait with a one second delay. the `set -m` + >>> SIGCHLD interrupted the `wait` and allowed it to return. >> The functionality of the code doesn't need SIGCHLD, it still waits till >> all the 10 processes are finished before starting the next lot. > not on my system it doesn't. maybe a difference in bash versions. as soon > as > one process quits, the `wait` is interrupted, a new one is forked, and the > parent goes back to sleep until another child exits. if i don't `set -m`, > then i see what you describe -- the wait doesn't return until all 10 children > exit. > -mike Just to clarify what I see with your code, with the extra echos from me and less threads so its shorter. set -m cnt=0 trap ': $(( --cnt )); echo "SIGCHLD"' SIGCHLD for n in {0..10} ; do ( d=$(( RANDOM % 10 )) echo $n sleeping $d sleep $d echo $n exiting $d ) & : $(( ++cnt )) if [[ ${cnt} -ge 5 ]] ; then echo going to wait wait echo Back from wait fi done trap - SIGCHLD wait gives 0 sleeping 9 2 sleeping 4 going to wait 4 sleeping 7 3 sleeping 4 1 sleeping 6 2 exiting 4 SIGCHLD 3 exiting 4 SIGCHLD 1 exiting 6 SIGCHLD 4 exiting 7 SIGCHLD 0 exiting 9 SIGCHLD Back from wait 5 sleeping 5 6 sleeping 5 going to wait 8 sleeping 1 9 sleeping 1 7 sleeping 3 9 exiting 1 8 exiting 1 SIGCHLD SIGCHLD 7 exiting 3 SIGCHLD 6 exiting 5 SIGCHLD 5 exiting 5 now this code function TestProcess_22 { local d=$(( RANDOM % 10 )) echo $1 sleeping $d sleep $d echo $1 exiting $d } function trap_SIGCHLD { echo "SIGCHLD"; if [ $cnt -gt 0 ]; then : $(( --cnt )) TestProcess_22 $cnt & fi } set -m cnt=10 maxJobCnt=5 trap 'trap_SIGCHLD' SIGCHLD for (( x=0; xhttp://gnu.org/licenses/gpl.html> This is free software; you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. uname -a Linux DETH00 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Re: Parallelism a la make -j / GNU parallel
Am 04.05.2012 21:11, schrieb Greg Wooledge: > On Fri, May 04, 2012 at 09:02:27PM +0200, John Kearney wrote: >> set -m >> cnt=0 >> trap ': $(( --cnt ))' SIGCHLD >> set -- {0..20} >> while [ $# -gt 0 ]; do >> if [[ ${cnt} -lt 10 ]] ; then >> >> ( >> d=$(( RANDOM % 10 )) >> echo $n sleeping $d >> sleep $d >> ) & >> : $(( ++cnt )) >> shift >> fi >> echo going to wait >> sleep 1 >> done > You're busy-looping with a 1-second sleep instead of using wait and the > signal handler, which was the whole purpose of the previous example (and > of the set -m that you kept in yours). And $n should probably be $1 there. > see my response to mike. what you are thinking about is either what I suggested or something like this function TestProcess_22 { local d=$(( RANDOM % 10 )) echo $1 sleeping $d sleep $d echo $1 exiting $d } function trap_SIGCHLD { echo "SIGCHLD"; if [ $cnt -gt 0 ]; then : $(( --cnt )) TestProcess_22 $cnt & fi } set -m cnt=20 maxJobCnt=10 trap 'trap_SIGCHLD' SIGCHLD for (( x=0; x
Re: Parallelism a la make -j / GNU parallel
Am 04.05.2012 21:13, schrieb Mike Frysinger: > On Friday 04 May 2012 15:02:27 John Kearney wrote: >> Am 04.05.2012 20:53, schrieb Mike Frysinger: >>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote: >>>> Mike Frysinger writes: >>>>> i wish there was a way to use `wait` that didn't block until all the >>>>> pids returned. maybe a dedicated option, or a shopt to enable this, >>>>> or a new command. >>>>> >>>>> for example, if i launched 10 jobs in the background, i usually want to >>>>> wait for the first one to exit so i can queue up another one, not wait >>>>> for all of them. >>>> If you set -m you can trap on SIGCHLD while waiting. >>> awesome, that's a good mitigation >>> >>> #!/bin/bash >>> set -m >>> cnt=0 >>> trap ': $(( --cnt ))' SIGCHLD >>> for n in {0..20} ; do >>> >>> ( >>> >>> d=$(( RANDOM % 10 )) >>> echo $n sleeping $d >>> sleep $d >>> >>> ) & >>> >>> : $(( ++cnt )) >>> >>> if [[ ${cnt} -ge 10 ]] ; then >>> >>> echo going to wait >>> wait >>> >>> fi >>> >>> done >>> trap - SIGCHLD >>> wait >>> >>> it might be a little racy (wrt checking cnt >= 10 and then doing a wait), >>> but this is good enough for some things. it does lose visibility into >>> which pids are live vs reaped, and their exit status, but i more often >>> don't care about that ... >> That won't work I don't think. > seemed to work fine for me > >> I think you meant something more like this? > no. i want to sleep the parent indefinitely and fork a child asap (hence the > `wait`), not busy wait with a one second delay. the `set -m` + SIGCHLD > interrupted the `wait` and allowed it to return. > -mike The functionality of the code doesn't need SIGCHLD, it still waits till all the 10 processes are finished before starting the next lot. it only interrupts the wait to decrement the counter. to do what your talking about you would have to start the new subprocess in the SIGCHLD trap. try this out it might make it clearer what I mean set -m cnt=0 trap ': $(( --cnt )); echo SIGCHLD' SIGCHLD for n in {0..20} ; do ( d=$(( RANDOM % 10 )) echo $n sleeping $d sleep $d echo $n exiting $d ) & : $(( ++cnt )) if [[ ${cnt} -ge 10 ]] ; then echo going to wait wait fi done trap - SIGCHLD wait
Re: Parallelism a la make -j / GNU parallel
Am 04.05.2012 20:53, schrieb Mike Frysinger:
> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>> Mike Frysinger writes:
>>> i wish there was a way to use `wait` that didn't block until all the pids
>>> returned. maybe a dedicated option, or a shopt to enable this, or a new
>>> command.
>>>
>>> for example, if i launched 10 jobs in the background, i usually want to
>>> wait for the first one to exit so i can queue up another one, not wait
>>> for all of them.
>> If you set -m you can trap on SIGCHLD while waiting.
> awesome, that's a good mitigation
>
> #!/bin/bash
> set -m
> cnt=0
> trap ': $(( --cnt ))' SIGCHLD
> for n in {0..20} ; do
>     (
>     d=$(( RANDOM % 10 ))
>     echo $n sleeping $d
>     sleep $d
>     ) &
>     : $(( ++cnt ))
>     if [[ ${cnt} -ge 10 ]] ; then
>         echo going to wait
>         wait
>     fi
> done
> trap - SIGCHLD
> wait
>
> it might be a little racy (wrt checking cnt >= 10 and then doing a wait), but
> this is good enough for some things. it does lose visibility into which pids
> are live vs reaped, and their exit status, but i more often don't care about
> that ...
> -mike

That won't work, I don't think. I think you meant something more like this?

set -m
cnt=0
trap ': $(( --cnt ))' SIGCHLD
set -- {0..20}
while [ $# -gt 0 ]; do
    if [[ ${cnt} -lt 10 ]] ; then
        (
        d=$(( RANDOM % 10 ))
        echo $n sleeping $d
        sleep $d
        ) &
        : $(( ++cnt ))
        shift
    fi
    echo going to wait
    sleep 1
done

which is basically what I did in my earlier example, except I used USR2 instead of SIGCHLD and put it in a function to make it easier to use.
Re: Parallelism a la make -j / GNU parallel
This version might be easier to follow. The last version was more for being able to issue commands via a fifo to a job queue server. function check_valid_var_name { case "${1:?Missing Variable Name}" in [!a-zA-Z_]* | *[!a-zA-Z_0-9]* ) return 3;; esac } CNiceLevel=$(nice) declare -a JobArray function PushAdvancedCmd { local le="tmp_array${#JobArray[@]}" JobArray+=("${le}") eval "${le}"'=("${@}")' } function PushSimpleCmd { PushAdvancedCmd WrapJob ${CNiceLevel} "${@}" } function PushNiceCmd { PushAdvancedCmd WrapJob "${@}" } function UnpackCmd { check_valid_var_name ${1} || return $? eval _RETURN=('"${'"${1}"'[@]}"') unset "${1}[@]" } function runJobParrell { local mjobCnt=${1} && shift jcnt=0 function WrapJob { [ ${1} -le ${CNiceLevel} ] || renice -n ${1} local Buffer=$("${@:2}") echo "${Buffer}" kill -s USR2 $$ } function JobFinised { jcnt=$((${jcnt}-1)) } trap JobFinised USR2 while [ $# -gt 0 ] ; do while [ ${jcnt} -lt ${mjobCnt} ]; do jcnt=$((${jcnt}+1)) if UnpackCmd "${1}" ; then "${_RETURN[@]}" & else continue fi shift done sleep 1 done } Am 03.05.2012 23:23, schrieb John Kearney: > Am 03.05.2012 22:30, schrieb Greg Wooledge: >> On Thu, May 03, 2012 at 10:12:17PM +0200, John Kearney wrote: >>> function runJobParrell { >>> local mjobCnt=${1} && shift >>> jcnt=0 >>> function WrapJob { >>> "${@}" >>> kill -s USR2 $$ >>> } >>> function JobFinised { >>> jcnt=$((${jcnt}-1)) >>> } >>> trap JobFinised USR2 >>> while [ $# -gt 0 ] ; do >>> while [ ${jcnt} -lt ${mjobCnt} ]; do >>> jcnt=$((${jcnt}+1)) >>> echo WrapJob "${1}" "${2}" >>> WrapJob "${1}" "${2}" & >>> shift 2 >>> done >>> sleep 1 >>> done >>> } >>> function testProcess { >>> echo "${*}" >>> sleep 1 >>> } >>> runJobParrell 2 testProcess "jiji#" testProcess "jiji#" testProcess >>> "jiji#" >>> >>> tends to work well enough. >>> it gets a bit more complex if you want to recover output but not too much. >> The real issue here is that there is no generalizable way to store an >> arbitrary command for later execution. Your example assumes that each >> pair of arguments constitutes one simple command, which is fine if that's >> all you need it to do. But the next guy asking for this will want to >> schedule arbitrarily complex shell pipelines and complex commands with >> here documents and brace expansions and >> > > :) > A more complex/flexible example. More like what I actually use. > > > > > CNiceLevel=$(nice) > declare -a JobArray > function PushAdvancedCmd { > local IFS=$'\v' > JobArray+=("${*}") > } > function PushSimpleCmd { > PushAdvancedCmd WrapJob ${CNiceLevel} "${@}" > } > function PushNiceCmd { > PushAdvancedCmd WrapJob "${@}" > } > function UnpackCmd { > local IFS=$'\v' > set -o noglob > _RETURN=( .${1}. ) > set +o noglob > _RETURN[0]="${_RETURN[0]#.}" > local -i le=${#_RETURN[@]}-1 > _RETURN[${le}]="${_RETURN[${le}]%.}" > } > function runJobParrell { > local mjobCnt=${1} && shift > jcnt=0 > function WrapJob { > [ ${1} -le ${CNiceLevel} ] || renice -n ${1} > local Buffer=$("${@:2}") > echo "${Buffer}" > kill -s USR2 $$ > } > function JobFinised { > jcnt=$((${jcnt}-1)) > } > trap JobFinised USR2 > while [ $# -gt 0 ] ; do > while [ ${jcnt} -lt ${mjobCnt} ]; do > jcnt=$((${jcnt}+1)) &
Re: Parallelism a la make -j / GNU parallel
Am 03.05.2012 22:30, schrieb Greg Wooledge: > On Thu, May 03, 2012 at 10:12:17PM +0200, John Kearney wrote: >> function runJobParrell { >> local mjobCnt=${1} && shift >> jcnt=0 >> function WrapJob { >> "${@}" >> kill -s USR2 $$ >> } >> function JobFinised { >> jcnt=$((${jcnt}-1)) >> } >> trap JobFinised USR2 >> while [ $# -gt 0 ] ; do >> while [ ${jcnt} -lt ${mjobCnt} ]; do >> jcnt=$((${jcnt}+1)) >> echo WrapJob "${1}" "${2}" >> WrapJob "${1}" "${2}" & >> shift 2 >> done >> sleep 1 >> done >> } >> function testProcess { >> echo "${*}" >> sleep 1 >> } >> runJobParrell 2 testProcess "jiji#" testProcess "jiji#" testProcess >> "jiji#" >> >> tends to work well enough. >> it gets a bit more complex if you want to recover output but not too much. > The real issue here is that there is no generalizable way to store an > arbitrary command for later execution. Your example assumes that each > pair of arguments constitutes one simple command, which is fine if that's > all you need it to do. But the next guy asking for this will want to > schedule arbitrarily complex shell pipelines and complex commands with > here documents and brace expansions and > :) A more complex/flexible example. More like what I actually use. CNiceLevel=$(nice) declare -a JobArray function PushAdvancedCmd { local IFS=$'\v' JobArray+=("${*}") } function PushSimpleCmd { PushAdvancedCmd WrapJob ${CNiceLevel} "${@}" } function PushNiceCmd { PushAdvancedCmd WrapJob "${@}" } function UnpackCmd { local IFS=$'\v' set -o noglob _RETURN=( .${1}. ) set +o noglob _RETURN[0]="${_RETURN[0]#.}" local -i le=${#_RETURN[@]}-1 _RETURN[${le}]="${_RETURN[${le}]%.}" } function runJobParrell { local mjobCnt=${1} && shift jcnt=0 function WrapJob { [ ${1} -le ${CNiceLevel} ] || renice -n ${1} local Buffer=$("${@:2}") echo "${Buffer}" kill -s USR2 $$ } function JobFinised { jcnt=$((${jcnt}-1)) } trap JobFinised USR2 while [ $# -gt 0 ] ; do while [ ${jcnt} -lt ${mjobCnt} ]; do jcnt=$((${jcnt}+1)) UnpackCmd "${1}" "${_RETURN[@]}" & shift done sleep 1 done } function testProcess { echo "${*}" sleep 1 } # So standard variable args can be handled in 2 ways 1 # encode them as such PushSimpleCmd testProcess "jiji#" dfds dfds dsfsd PushSimpleCmd testProcess "jiji#" dfds dfds PushNiceCmd 20 testProcess "jiji#" dfds PushSimpleCmd testProcess "jiji#" PushSimpleCmd testProcess "jiji#" "*" s # more complex things just wrap them in a function and call it function DoComplexMagicStuff1 { echo "${@}" >&2 } # Or more normally just do a hybrid of both. PushSimpleCmd DoComplexMagicStuff1 "jiji#" # runJobParrell 1 "${JobArray[@]}" Note there is another level of complexity where I start a JobQueue Process and issues it commands using a fifo.
Re: Parallelism a la make -j / GNU parallel
I tend to do something more like this

function runJobParrell {
    local mjobCnt=${1} && shift
    jcnt=0
    function WrapJob {
        "${@}"
        kill -s USR2 $$
    }
    function JobFinised {
        jcnt=$((${jcnt}-1))
    }
    trap JobFinised USR2
    while [ $# -gt 0 ] ; do
        while [ ${jcnt} -lt ${mjobCnt} ]; do
            jcnt=$((${jcnt}+1))
            echo WrapJob "${1}" "${2}"
            WrapJob "${1}" "${2}" &
            shift 2
        done
        sleep 1
    done
}
function testProcess {
    echo "${*}"
    sleep 1
}
runJobParrell 2 testProcess "jiji#" testProcess "jiji#" testProcess "jiji#"

It tends to work well enough. It gets a bit more complex if you want to recover output, but not too much.

Am 03.05.2012 21:21, schrieb Elliott Forney:
> Here is a construct that I use sometimes... although you might wind up
> waiting for the slowest job in each iteration of the loop:
>
>
> maxiter=100
> ncore=8
>
> for iter in $(seq 1 $maxiter)
> do
>     startjob $iter &
>
>     if (( (iter % $ncore) == 0 ))
>     then
>         wait
>     fi
> done
>
>
> On Thu, May 3, 2012 at 12:49 PM, Colin McEwan wrote:
>> Hi there,
>>
>> I don't know if this is anything that has ever been discussed or
>> considered, but would be interested in any thoughts.
>>
>> I frequently find myself these days writing shell scripts, to run on
>> multi-core machines, which could easily exploit lots of parallelism (eg. a
>> batch of a hundred independent simulations).
>>
>> The basic parallelism construct of '&' for async execution is highly
>> expressive, but it's not useful for this sort of use-case: starting up 100
>> jobs at once will leave them competing, and lead to excessive context
>> switching and paging.
>>
>> So for practical purposes, I find myself reaching for 'make -j' or GNU
>> parallel, both of which destroy the expressiveness of the shell script as I
>> have to redirect commands and parameters to Makefiles or stdout, and
>> wrestle with appropriate levels of quoting.
>>
>> What I would really *like* would be an extension to the shell which
>> implements the same sort of parallelism-limiting / 'process pooling' found
>> in make or 'parallel' via an operator in the shell language, similar to '&'
>> which has semantics of *possibly* continuing asynchronously (like '&') if
>> system resources allow, or waiting for the process to complete (';').
>>
>> Any thoughts, anyone?
>>
>> Thanks!
>>
>> --
>> C.
>>
>> https://plus.google.com/109211294311109803299
>> https://www.facebook.com/mcewanca
Re: Fwd: Bash bug interpolating delete characters
Am 03.05.2012 19:41, schrieb John Kearney: > Am 03.05.2012 15:01, schrieb Greg Wooledge: >>> Yours, Rüdiger. >>> a=x >>> del="$(echo -e "\\x7f")" >>> >>> echo "$del${a#x}" | od -ta >>> echo "$del ${a#x}" | od -ta >>> echo " $del${a#x}" | od -ta >> Yup, confirmed that it breaks here, and only when the # parameter expansion >> is included. >> >> imadev:~$ del=$'\x7f' a=x b= >> imadev:~$ echo " $del$b" | od -ta >> 000 sp del nl >> 003 >> imadev:~$ echo " $del${b}" | od -ta >> 000 sp del nl >> 003 >> imadev:~$ echo " $del${b#x}" | od -ta >> 000 sp del nl >> 003 >> imadev:~$ echo " $del${a#x}" | od -ta >> 000 sp nl >> 002 >> >> Bash 4.2.24. >> > Also Confirmed, but my output is a bit wackier. > printf %q seems to get confused, and do invalid things as well. > > the \x7f becomes a \ disregard the comment about printf its just escaping the space. > > function printTests { > while [ $# -gt 0 ]; do > printf"%-20s=[%q]\n""${1}" "$(eval echo "${1}")" > shift > done > } > > a=x > del=$'\x7f' > printTests '"$del${a#x}"' '"$del ${a#x}"' '" $del${a#x}"' '" $del${a%x}"' > printTests '" $del${a:0:0}"' '" $del"${a:0:0}' '" $del""${a:0:0}"' > printTests '" $del${a}"' '" $del"' '" ${del}${a:0:0}"' '" > ${del:0:1}${a:0:0}"' > printTests '" ${del:0:1}${a}"' '"${del:0:1}${a#d}"' '"${del:0:1}${a#x}"' > printTests '" ${del:0:1} ${a}"' '"${del:0:1} ${a#d}"' '"${del:0:1} ${a#x}"' > > output > "$del${a#x}"=[$'\177'] > "$del ${a#x}" =[\ ] > " $del${a#x}" =[\ ] > " $del${a%x}" =[\ ] > " $del${a:0:0}" =[\ ] > " $del"${a:0:0} =[$' \177'] > " $del""${a:0:0}" =[$' \177'] > " $del${a}" =[$' \177x'] > " $del" =[$' \177'] > " ${del}${a:0:0}" =[\ ] > " ${del:0:1}${a:0:0}"=[\ ] > " ${del:0:1}${a}" =[$' \177x'] > "${del:0:1}${a#d}" =[$'\177x'] > "${del:0:1}${a#x}" =[$'\177'] > " ${del:0:1} ${a}" =[$' \177 x'] > "${del:0:1} ${a#d}" =[$'\177 x'] > "${del:0:1} ${a#x}" =[\ ] > > > > > >
Re: Fwd: Bash bug interpolating delete characters
Am 03.05.2012 15:01, schrieb Greg Wooledge:
>> Yours, Rüdiger.
>> a=x
>> del="$(echo -e "\\x7f")"
>>
>> echo "$del${a#x}" | od -ta
>> echo "$del ${a#x}" | od -ta
>> echo " $del${a#x}" | od -ta
> Yup, confirmed that it breaks here, and only when the # parameter expansion
> is included.
>
> imadev:~$ del=$'\x7f' a=x b=
> imadev:~$ echo " $del$b" | od -ta
> 000 sp del nl
> 003
> imadev:~$ echo " $del${b}" | od -ta
> 000 sp del nl
> 003
> imadev:~$ echo " $del${b#x}" | od -ta
> 000 sp del nl
> 003
> imadev:~$ echo " $del${a#x}" | od -ta
> 000 sp nl
> 002
>
> Bash 4.2.24.
>

Also confirmed, but my output is a bit wackier. printf %q seems to get confused and do invalid things as well: the \x7f becomes a \ .

function printTests {
    while [ $# -gt 0 ]; do
        printf "%-20s=[%q]\n" "${1}" "$(eval echo "${1}")"
        shift
    done
}

a=x
del=$'\x7f'
printTests '"$del${a#x}"' '"$del ${a#x}"' '" $del${a#x}"' '" $del${a%x}"'
printTests '" $del${a:0:0}"' '" $del"${a:0:0} '" $del""${a:0:0}"'
printTests '" $del${a}"' '" $del"' '" ${del}${a:0:0}"' '" ${del:0:1}${a:0:0}"'
printTests '" ${del:0:1}${a}"' '"${del:0:1}${a#d}"' '"${del:0:1}${a#x}"'
printTests '" ${del:0:1} ${a}"' '"${del:0:1} ${a#d}"' '"${del:0:1} ${a#x}"'

output

"$del${a#x}"        =[$'\177']
"$del ${a#x}"       =[\ ]
" $del${a#x}"       =[\ ]
" $del${a%x}"       =[\ ]
" $del${a:0:0}"     =[\ ]
" $del"${a:0:0}     =[$' \177']
" $del""${a:0:0}"   =[$' \177']
" $del${a}"         =[$' \177x']
" $del"             =[$' \177']
" ${del}${a:0:0}"   =[\ ]
" ${del:0:1}${a:0:0}"=[\ ]
" ${del:0:1}${a}"   =[$' \177x']
"${del:0:1}${a#d}"  =[$'\177x']
"${del:0:1}${a#x}"  =[$'\177']
" ${del:0:1} ${a}"  =[$' \177 x']
"${del:0:1} ${a#d}" =[$'\177 x']
"${del:0:1} ${a#x}" =[\ ]
Re: Is it possible or RFE to expand ranges of *arrays*
Am 28.04.2012 05:05, schrieb Linda Walsh: Maarten Billemont wrote: On 26 Apr 2012, at 06:30, John Kearney wrote: Am 26.04.2012 06:26, schrieb Linda Walsh: I know I can get a="abcdef" echo "${a[2:4]}" = cde how do I do: typeset -a a=(apple berry cherry date); then get: echo ${a[1:2]} = "berry" "cherry" ( non-grouped args) I tried to do it in a function and hurt myself. echo ${a[@]:1:2} I see little reason to ask bash to wordsplit the elements after expanding them. You ought to quote that expansion. --- Good point. Since if you do: > a=( 'apple pie' 'berry pie' 'cherry cake' 'dates divine') > b=( ${a[@]:1:2} ) > echo ${#b[*]} 4 #yikes! > b=( "${a[@]:1:2}" ) 2 #woo! I'd guess the original poster probably figured, I'd figure out the correct form pretty quickly in usage. but thanks for your insight. ( (to all)*sigh*) I "always" quote not sure why I didn't that time. Except that it was just a quick response to a simple question. but of course your right.
Re: Is it possible or RFE to expand ranges of *arrays*
Am 26.04.2012 06:26, schrieb Linda Walsh: I know I can get a="abcdef" echo "${a[2:4]}" = cde how do I do: typeset -a a=(apple berry cherry date); then get: echo ${a[1:2]} = "berry" "cherry" ( non-grouped args) I tried to do it in a function and hurt myself. echo ${a[@]:1:2}
Please remove iconv_open (charset, "ASCII"); from unicode.c
Hi Chet, can you please remove the following from the unicode.c file:

localconv = iconv_open (charset, "ASCII");

This is an invalid fallback. It creates a translation config: the primary attempt is utf-8 to the destination codeset, and if that conversion fails, this tries selecting ASCII to the codeset. But the code still feeds utf-8 as input to iconv. This means it is less likely to successfully encode than a simple assignment. Consider: U+80 becomes utf-8 "\xc2\x80", which, because we tell iconv this is ascii, becomes ascii "\xc2\x80". So this line takes a U+80 and turns it into a U+C3 and a U+80.

The way I rewrote the iconv code made it cleaner, safer and quicker; please consider using it. I avoided the need for the strcpy, among other things.

On 02/21/2012 03:42 AM, Chet Ramey wrote:
> On 2/18/12 5:39 AM, John Kearney wrote:
>
>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>>
>> Description: Current u32toutf8 only encodes values below 0x
>> correctly. wchar_t can be an ambiguous size; better in my opinion to
>> use unsigned long, or uint32_t, or something clearer.
>
> Thanks for the patch. It's good to have a complete
> implementation, though as a practical matter you won't see UTF-8
> characters longer than four bytes. I agree with you about the
> unsigned 32-bit int type; wchar_t is signed, even if it's 32 bits,
> on several systems I use.
>
> Chet
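The double-encoding effect described here is easy to reproduce with the iconv command-line tool (a demonstration of the principle, not of bash's internal code path; LATIN1 is used because iconv(1) rejects bytes above 0x7f when told the input is ASCII):

# U+0080 encoded as UTF-8 is the two bytes C2 80. Re-converting those bytes
# as if they were single-byte characters encodes each byte separately:
$ printf '\xc2\x80' | iconv -f LATIN1 -t UTF-8 | od -An -tx1
 c3 82 c2 80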
Re: Can somebody explain to me what u32tochar in /lib/sh/unicode.c is trying to do?
You really should stop using this function. It is just plain wrong, and is not predictable. It may encode BIG5 and SJIS, but that is more by accident than intent. If you want to do something like this, then do it properly.

Basically all of the multibyte systems have to have a detection method for multibyte characters; most of them rely on bit 7 to indicate a multibyte sequence, or use vt100 SS3 escape sequences. You really can't just inject random data into a text buffer. Even returning UTF-8 as a fallback is a bug. The most that should be done is to return ASCII in the error case, and I mean U+0-U+7F only, and ignore or warn about any unsupported characters.

Using this function is dangerous and pointless. I mean, seriously, in what world does it make sense to inject utf-8 into a big5 string? Or indeed into an ascii string?

Code should behave like an adult, not like a frightened kid. By which I mean it shouldn't pretend it knows what it's doing when it doesn't; it should admit the problem so that the problem can be fixed.

On 02/21/2012 04:28 AM, Chet Ramey wrote:
> On 2/19/12 5:07 PM, John Kearney wrote:
>> Can somebody explain to me what u32tochar is trying to do?
>>
>> It seems like dangerous code?
>>
>> from the context i'm guessing it trying to make a hail mary pass at
>> converting utf-32 to mb (not utf-8 mb)
>
> Pretty much. It's a big-endian representation of a 32-bit integer
> as a character string. It's what you get when you don't have iconv
> or iconv fails and the locale isn't UTF-8. It may not be useful,
> but it's predictable. If we have a locale the system doesn't know
> about or can't translate, there's not a lot we can do.
>
> Chet
Re: bash 4.2 breaks source finding libs in lib/filename...
On 03/03/2012 09:43 AM, Stefano Lattarini wrote:
> On 03/03/2012 08:28 AM, Pierre Gaston wrote:
>> On Fri, Mar 2, 2012 at 9:54 AM, Stefano Lattarini wrote:
>>
>>> Or here is what sounds like a marginally better idea to me: Bash could
>>> start supporting a new environment variable like "BASHLIB" (a' la' PERL5LIB)
>>> or "BASHPATH" (a' la' PYTHONPATH) holding a colon separated (or semicolon
>>> separated on Windows) list of directories where bash will look for sourced
>>> non-absolute files (even if they contain a pathname separator) before
>>> (possibly) performing a lookup in $PATH and then in the current directory.
>>> Does this sound sensible, or would it add too much complexity and/or
>>> confusion?
>>
>> It could be even furthermore separated from the traditional "source" and a
>> new keyword introduced like "require"
>>
> This might be a slightly better interface, yes.

Agreed, though include might be a better name than require. And while you're at it, why not include <> and include ""?

>> a la lisp which would be able to do things like:
>>
>> 1) load the file, searching in the BASH_LIB_PATH (or other variables) for a
>> file with optionally the extension .sh or .bash
>> 2) only load the file if the "feature" has not been provided, eg only load
>> the file once
>>
> These sound good :-)

No, I don't like that. If you want something like that, just use inclusion protection like every other language:

if [ -z "${__file_sh__:-}" ]; then
    __file_sh__=1
    # ... file body ...
fi

and my source wrapper function actually checks for that variable before sourcing the file. Off the top of my head, something like this:

guard="__$(basename "${sourceFile}" .sh)_sh__"
[ -n "${!guard:-}" ] || source "${sourceFile}"

>> 3) maybe optionally only load the definition and not execute commands
>> (something I've seen people asking for on several occasions on IRC), for
>> instance that would allow to have test code inside the lib file or maybe
>> print a warning that it's a library not to be executed. (Not so important
>> imo)
>>
> ... and even python doesn't do that! If people care about making the test
> code in the module "automatically executable" when the module is run as
> a script, they could use an idiom similar to the python one:
>
> # For python.
> if __name__ == "__main__":
>     test code ...
>
> i.e.:
>
> # For bash.
> if [[ -n $BASH_SOURCE ]]; then
>     test code ...
> fi

That only works if you source from the command line, not execute. What you actually have to do is something like this:

# For bash.
if [[ "$(basename "${0}")" = scriptname.sh ]]; then
    test code ...
fi

>> I think this would benefit the bash_completion project and help them to
>> split the script so that the completions are only loaded on demand.
>> (one of the goals mentioned at http://bash-completion.alioth.debian.org/ is
>> "make bash-completion dynamically load completions")
>> My understanding is that the
>> http://code.google.com/p/bash-completion-lib/ project did something
>> like this but that it was not working entirely as
>> they wanted.
>> (I hope some of the devs read this list)
>>
>> On the other hand, there is the possibility to add FPATH and autoload like
>> in ksh93 ...
>> I haven't thought too much about it, but my guess is that it would really be
>> easy to implement a module system with that.
>>
>> my 2 cents as I don't have piles of bash lib.
>>
> Same here -- it was more of a "theoretical suggestion", in the category of
> "hey, you know what would be really cool to have?" :-) But I don't deeply
> care about it, personally.
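Written out in full, the guard idiom looks roughly like this (mylib is an invented name for illustration):

# mylib.sh
if [ -z "${__mylib_sh__:-}" ]; then
    __mylib_sh__=1

    mylib_greet() { echo "hello from mylib"; }
fi

# consumer
source ./mylib.sh
source ./mylib.sh    # second source is a no-op: the guard is already set
mylib_greet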
What would be really useful (dreamy eyes) would be namespace support :) something like this

{ # codeblock namespace namespace1
    testvar=s
    { # codeblock namespace namespace2
        testvar=s
    }
}

treated like this

namespace1.testvar=s
namespace1.namespace2.testvar=s

Although non-posix, this is already kinda supported, because you can do

function test1.ert.3 {
}

I mean, all you would do is treat the namespace as a variable preamble, so you'd have something like this (pseudo-code) to find the function etc.:

if   type "${varname}"
elif type "${namespace}${varname}"
else error not found

It wouldn't actually break anything, afaik.

> Regards,
> Stefano
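A loose sketch of that lookup rule as runnable bash (all names invented): try the bare name first, then the namespace-qualified one:

#!/bin/bash
call_ns() {
    local ns=$1 name=$2; shift 2
    if declare -F -- "${name}" >/dev/null; then
        "${name}" "$@"
    elif declare -F -- "${ns}.${name}" >/dev/null; then
        "${ns}.${name}" "$@"
    else
        echo "error: ${name} not found" >&2
        return 127
    fi
}

namespace1.testfunc() { echo "in namespace1: $*"; }
call_ns namespace1 testfunc hello    # prints: in namespace1: hello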
Re: RFE: allow bash to have libraries
https://github.com/dethrophes/Experimental-Bash-Module-System/blob/master/bash/template.sh

So, can't repeat this enough: !play code!!. However, suggestions are welcome. If this sort of thing is of interest, I could maintain it online, I guess. Basically I was kinda thinking of a perl/python module library when I started.

So, what I like:
- trap ERR etc. and print error messages
- set nounset
- try to keep the files in 2 parts, a source part and a run part
- have a common args handler routine
- ridiculously complex log output etc., timestamped, line/file/function etc...
- stack trace on errors
- color output, red for errors etc.
- silly complex user-interface routines :)

I guess just have a look-see and try it out. Also note I think a lot of the files are empty/or silly files that should actually be deleted; I don't have time to go through them now though. I'd also advise using ctags, tagging it and navigating so; it's what I do.

On 03/02/2012 03:54 AM, Clark J. Wang wrote:
> On Fri, Mar 2, 2012 at 08:20, John Kearney wrote:
>
>> :) :)) Personal best: wrote about 1 lines of code which
>> finally became about 200ish to implement a readkey function.
>>
>> Actually ended up with 2 solutions, one based on a full bash
>> script vt100 parser weighing in at about 500 lines including state
>> tables, and a s00 line hack.
>>
>> Check out http://mywiki.wooledge.org/ReadingFunctionKeysInBash
>>
>>
>> Personally I'd have to say using PATH to source a module is a
>> massive security risk, but that's just me. I actually have a pretty
>> complex bash modules hierarchy solution. I guess I could upload
>> it somewhere if anybody's interested,
>
> I just found https://gist.github.com/ a few days ago :)
>
> Gist is a simple way to share snippets and pastes with others. All
> gists are git repositories, so they are automatically versioned,
> forkable and usable as a git repository.
>
>
>> it's just a plaything for me really, but it's a couple 1000 lines of
>> code, probably more like 1+. It's kinda why I started updating
>> Greg's wiki; I noticed I'd found different/better ways of dealing
>> with a lot of problems.
>>
>> Things like secured copy/move functions. Task servers. A generic
>> approach to user interface interactions, i.e. supporting both gui
>> and console input in my scripts. Or I even started a bash-based
>> ncurses-type system :), like I say, some fun; still got some
>> performance issues with that one.
>>
>> Or an improved select function that supports arrow keys and mouse
>> selection, written in bash.
>>
>> Anybody interested in this sort of thing?
>>
>
> I'm interested.
Re: RFE: allow bash to have libraries
:) :)) Personal best: wrote about 1 lines of code which finally became about 200ish to implement a readkey function.

Actually ended up with 2 solutions, one based on a full bash-script vt100 parser weighing in at about 500 lines including state tables, and a s00 line hack.

Check out http://mywiki.wooledge.org/ReadingFunctionKeysInBash

Personally I'd have to say using PATH to source a module is a massive security risk, but that's just me. I actually have a pretty complex bash modules hierarchy solution. I guess I could upload it somewhere if anybody's interested; it's just a plaything for me really, but it's a couple 1000 lines of code, probably more like 1+. It's kinda why I started updating Greg's wiki; I noticed I'd found different/better ways of dealing with a lot of problems.

Things like secured copy/move functions. Task servers. A generic approach to user interface interactions, i.e. supporting both gui and console input in my scripts. Or I even started a bash-based ncurses-type system :), like I say, some fun; still got some performance issues with that one.

Or an improved select function that supports arrow keys and mouse selection, written in bash.

Anybody interested in this sort of thing?

On 03/01/2012 11:48 PM, Linda Walsh wrote:
> John Kearney wrote:
... [large repetitive included text elided...]
>
>> why not just do something like this?
>>
> <26 line suggested 'header' elided...>
>> gives you more control anyway, pretty quick and simple.
>>
>>
> At least 30% of the point of this is to take large amounts of
> common initialization code that ends up at the front of many or
> most of my scripts and have it hidden in a side file where it can
> just be 'included'...
>
> Having to add 26 lines of code just to include 20 common lines
> doesn't sound like a net-gain...
>
>
> I thought of doing something similar until I realized I'd end up
> with some path-search routine written in shell at the beginning of
> each program just to enable bash to have structured & hierarchical
> libraries like any other programming language except maybe BASIC
> (or other shells)
>
> My problem is I keep thinking problems can be solvable in a few
> lines of shell code. Then they grow... *sigh*...
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/29/2012 11:55 PM, Chet Ramey wrote:
> On 2/28/12 4:28 PM, John Kearney wrote:
>>
>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>>
>>>> But that isn't how it behaves.
>>>> "${test//str/""}"
>>>>
>>>> because str is replaced with '""' as such it is treating the double
>>>> quotes as string literals.
>>>>
>>>> however at the same time these literal double quotes escape/quote a
>>>> single quote between them.
>>>> As such they are treated both as literals and as quotes, which is
>>>> inconsistent.
>>>
>>> I don't have a lot of time today, but I'm going to try and answer bits
>>> and pieces of this discussion.
>>>
>>> Yes, bash opens a new `quoting context' (for lack of a better term) inside
>>> ${}. Posix used to require it, though after lively discussion it turned
>>> into "well, we said that but it's clearly not what we meant."
>>>
>>> There are a couple of places in the currently-published version of the
>>> standard, minus any corrigenda, that specify this. The description of
>>> ${parameter} reads, in part,
>>>
>>> "The matching closing brace shall be determined by counting brace levels,
>>> skipping over enclosed quoted strings, and command substitutions."
>>>
>>> The section on double quotes reads, in part:
>>>
>>> "Within the string of characters from an enclosed "${" to the matching
>>> '}', an even number of unescaped double-quotes or single-quotes, if any,
>>> shall occur."
>>>
>>> Chet
>>
>> Yeah, but I think the point is that the current behavior is useless.
>> There is no case where I want a " to be printed and also start a
>> double-quoted string; and that's the current behavior.
>
> Maybe you don't, but there are several cases in the test suite that do
> exactly that, derived from an old bug report.
>
> We don't have to keep the bash-4.2 behavior, but we need to acknowledge
> that it's not backwards-compatible.

Personally I vote for ksh93-like behavior; it was more intuitive for me. Not that I've tested it all that much, but the first impression was a good one. Seriously, try it out and see which behavior you want to use.

As for backward compatibility, to be honest I think that anybody who relied on this behavior should be shot ;) Like someone already said, the only sane way to use it now is with a variable.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 03/01/2012 12:12 AM, Andreas Schwab wrote:
> John Kearney writes:
>
>> It isn't just the quote removal that is confusing.
>>
>> The escape character is also not removed and has its special
>> meaning.
>
> The escape character is also a quote character, thus also subject
> to quote removal.
>
> Andreas.

Oh, I wasn't aware of that distinction. Thx.
Re: RFE: allow bash to have libraries (was bash 4.2 breaks source finding libs in lib/filename...)
On 02/29/2012 11:53 PM, Linda Walsh wrote:
>
> Eric Blake wrote:
>
>> On 02/29/2012 12:26 PM, Linda Walsh wrote:
>>> Any pathname that contains a / should not be subject to PATH searching.
>>
>> Agreed - as this behavior is _mandated_ by POSIX, for both sh(1)
>> and for execlp(2) and friends.
>
> Is it that you don't read English as a first language, or are you
> just trying to be argumentative?
>
> I said:
>   Original Message
>   Subject: bash 4.2 breaks source finding libs in lib/filename...
>   Date: Tue, 28 Feb 2012 17:34:21 -0800
>   From: Linda Walsh
>   To: bug-bash
>
>   Why was this functionality removed in non-posix mode?
>
> So, your arguments are all groundless and pointless, as your entire
> arguments stem from posix... which I specifically said I'm NOT
> specifying. If I want posix behavior, I can flick a switch and
> have such compatibility.
>
> However, Bash was designed to EXceed the limitations and features
> of POSIX, so the fact that posix is restrained in this area is a
> perfect reason to allow it -- as it makes it
>
>>> Pathnames that *start* with '/' are called "absolute" pathnames,
>>> while paths not starting with '/' are relative.
>>
>> And among the set of relative pathnames, there are two further
>> divisions: anchored (contains at least one '/') and unanchored
>> (no '/'). PATH lookup is defined as happening _only_ for
>> unanchored names.
>>
>>> Try 'C': if you include an include file with "/", it scans for
>>> it in each .h root.
>>
>> The C compiler _isn't_ doing a PATH search, so it follows
>> different rules.
>>
>>> Almost all normal utils take their paths to be the 'roots' of
>>> trees that contain files. Why should bash be different?
>>
>> Because that's what POSIX says.
>
> ---
> Posix says to ground paths with "/" in them at the roots of
> their paths? But it says differently for BASH? You aren't
> making sense.
>
> All the utils.
>
> What does man do? ... It looks for a "/" separated hierarchy under
> EACH entry of MANPATH.
>
> What does Perl do? It looks for a "/" separated hierarchy under
> each entry in lib.
>
> What does vim do? It looks for a vim hierarchy under each entry
> of its list of vim runtimes.
>
> What does ld do? What does C do? What does C++ do? They all
> look for "/" separated hierarchies under a PATH-like root.
>
> You claim that behavior is mandated by posix? I didn't know
> posix specified perl standards. Or vim... But say they do;
> then why wouldn't you also look for a "/" separated hierarchy under
> PATH?
>
> What does X do? -- a "/" separated hierarchy?
>
> What does Microsoft do for registry locations? A "\" separated
> hierarchy under 64 or 32-bit registry areas.
>
> Where do daemons look for files? Under a "/" separated hierarchy
> that may be root or a pseudo-root...
>
> All of these utils use "/" separated hierarchies -- none of them
> refuse to do a path lookup when "/" is in the file name. The
> entire concept of libraries would fail -- as they are organized
> hierarchically, but you may not know the library location until
> runtime, so you have a path and a hierarchical lookup.
>
> So why shouldn't Bash be able to look for 'library' functions in a
> hierarchy?
>
> Note -- as we are talking about non-posix mode of BASH, you can't
> use POSIX as a justification.
>
> As for making another switch -- there is already a switch --
> 'posix' for posix behavior.
>
> I'm not asking for a change in posix behavior, so you can continue
> using posix mode ...
>
>>> It goes against 'common sense' and least surprise -- given it's
>>> the norm in so many other applications.
>>
>> About the best we can do is accept a patch (are you willing to
>> write it? if not, quit complaining) that would add a new shopt,
>> off by default,
>
> ---
> I would agree to it being off in posix mode, by default, and on,
> by default, when not in posix mode...
>
>> allow your desired alternate behavior. But I won't write such a
>> patch, and if such a patch is written, I won't use it, because
>> I'm already used to the POSIX behavior.
>
> ---
> How do you use the current behavior, which doesn't do a path
> lookup if you include a / in the path (not at the beginning), that
> you would be able to make use of if you added "." to the beginning
> of your path (either temporarily or permanently...)?
>
> How do you organize your hierarchical libraries with bash so they
> don't have hard coded paths?

why not just do something like this?

# FindInPathVarExt <ResultVar> <PathValue> <FileName> [<test-op> ...]
function FindInPathVarExt {
  local -a PathList
  IFS=":" read -a PathList <<< "${2}"
  for CPath in "${PathList[@]}" ; do
    for CTest in "${@:4}"; do
      test "${CTest}" "${CPath}/${3}" || continue 2
    done
    printf -v "${1}" "${CPath}/${3}"
    return 0
  done
  printf -v "${1}" "Not Found"
  return 1
}

gives you more control anyway, pretty quick and simple.
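A hypothetical example call (MYLIBPATH and the module name are made up for illustration); the trailing arguments become test(1) operators that each candidate path must pass:

    MYLIBPATH=~/lib/bash:/usr/local/lib/bash
    if FindInPathVarExt Found "${MYLIBPATH}" "string/msgs.sh" -f -r; then
      source "${Found}"      # first path holding a readable regular file wins
    else
      echo "string/msgs.sh not found in ${MYLIBPATH}" >&2
    fi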
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
It isn't just the quote removal that is confusing. The escape character is also not removed and keeps its special meaning. And this also confuses me; take the following 2 cases:

echo ${a:-$'\''}
'
echo "${a:-$'\''}"
bash: bad substitution: no closing `}' in "${a:-'}"

and take the following 3 cases:

echo "${a:-$(echo $'\'')}"
bash: command substitution: line 38: unexpected EOF while looking for matching `''
bash: command substitution: line 39: syntax error: unexpected end of file
echo ${a:-$(echo $'\'')}
'
echo "${a:-$(echo \')}"
'

This cannot be logical behavior.

On 02/29/2012 11:26 PM, Chet Ramey wrote:
> On 2/28/12 10:52 AM, John Kearney wrote:
>> Actually this is something that still really confuses me as
>> well.
>
> The key is that bash doesn't do quote removal on the `string' part
> of the "${param/pat/string}" expansion. The double quotes are key;
> quote removal happens when the expansion is unquoted.
>
> Double quotes are supposed to inhibit quote removal, but bash's
> hybrid behavior of allowing quotes to escape characters but not
> removing them is biting us here.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 11:23 PM, Chet Ramey wrote:
> On 2/28/12 5:18 PM, John Kearney wrote:
>> On 02/28/2012 11:07 PM, Chet Ramey wrote:
>>> On 2/28/12 4:28 PM, John Kearney wrote:
>>> [large repetitive included text elided...]
>>>
>>> The real question is whether or not you do quote removal on the
>>> stuff inside the braces when they're enclosed in double quotes.
>>> Double quotes usually inhibit quote removal.
>>>
>>> The Posix "solution" to this is to require quote removal if a
>>> quote character (backslash, single quote, double quote) is used to
>>> escape or quote another character. Somewhere I have the reference
>>> to the Austin group discussion on this.
>>
>> 1${A:-B}2
>>
>> Logically, for consistency, having double quotes at positions 1 and 2
>> should have no effect on how you treat string B.
>
> Maybe, but that's not how things work in practice. Should the following
> expansions output the same thing? What should they output?
>
> bar=abc
> echo ${foo:-'$bar'}
> echo "${foo:-'$bar'}"
>
> Chet

And truthfully, with the current behavior, I'd almost expect this output:

$bar
'$bar'

but to be honest, without trying it out I have no idea, and that is the problem right now.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 11:44 PM, Chet Ramey wrote:
> echo "$(echo '$bar')"

Actually these both output the same thing in bash:

echo "$(echo '$bar')"
echo $(echo '$bar')
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 11:23 PM, Chet Ramey wrote:
> On 2/28/12 5:18 PM, John Kearney wrote:
> [large repetitive included text elided...]
>
> Maybe, but that's not how things work in practice. Should the following
> expansions output the same thing? What should they output?
>
> bar=abc
> echo ${foo:-'$bar'}
> echo "${foo:-'$bar'}"
>
> Chet

My first intuition on this whole thing was $(varname arg1 arg2), i.e. conceptually treat it like a function whose options are arguments. That is then consistent and intuitive. Don't get confused by the syntax. If I want 'as' I'll type \'as\' or some such; the outermost quotes only affect how the final value is handled, same as $(). Having a special behaviour model for that string makes it impossible to work with, really. This should actually make it easier for the parser.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 11:15 PM, Chet Ramey wrote:
> On 2/28/12 5:07 PM, Chet Ramey wrote:
> [large repetitive included text elided...]
>
>> The real question is whether or not you do quote removal on the stuff
>> inside the braces when they're enclosed in double quotes. Double
>> quotes usually inhibit quote removal.
>>
>> The Posix "solution" to this is to require quote removal if a quote
>> character (backslash, single quote, double quote) is used to escape
>> or quote another character. Somewhere I have the reference to the
>> Austin group discussion on this.
>
> http://austingroupbugs.net/view.php?id=221
>
> Chet

This however doesn't make reference to changing that behavior when you enclose the entire thing in double quotes.

${a//a/"a"} should behave the same as "${a//a/"a"}"

I mean, the search and replace should behave the same. Currently they don't.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 11:07 PM, Chet Ramey wrote:
> On 2/28/12 4:28 PM, John Kearney wrote:
> [large repetitive included text elided...]
>
> The real question is whether or not you do quote removal on the
> stuff inside the braces when they're enclosed in double quotes.
> Double quotes usually inhibit quote removal.
>
> The Posix "solution" to this is to require quote removal if a
> quote character (backslash, single quote, double quote) is used to
> escape or quote another character. Somewhere I have the reference
> to the Austin group discussion on this.

1${A:-B}2

Logically, for consistency, having double quotes at positions 1 and 2 should have no effect on how you treat string B.

Or consider this:

1${A/B/C}2

In this case it's even weirder: double quotes at 1 and 2 have no effect on A or B, but modify how string C behaves.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 10:05 PM, Chet Ramey wrote:
> On 2/28/12 12:26 PM, John Kearney wrote:
> [large repetitive included text elided...]
>
> Chet

Yeah, but I think the point is that the current behavior is useless. There is no case where I want a " to be printed and also start a double-quoted string; and that's the current behavior.

It's not so important how you treat it, you just need to pick one; then you can at least work with it. Now you have to use a temp variable.

As a side note, ksh93 is pretty good, intuitive:

ksh93 -c 'test=teststrtest ; echo "${test//str/"dd dd"}"'
testdd ddtest
ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )'
testdd 'ddtest
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 07:00 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 06:52:13 PM John Kearney wrote:
>> On 02/28/2012 06:43 PM, Dan Douglas wrote:
>>> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>>>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>>>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus wrote:
>>>>>>>> And that means, there isn't way to substitute "something"
>>>>>>>> to ' (single quote) when you want to not perform word
>>>>>>>> splitting. I would consider it as a bug.
>>>>>>>
>>>>>>> imadev:~$ q=\'
>>>>>>> imadev:~$ input="foosomethingbar"
>>>>>>> imadev:~$ echo "${input//something/$q}"
>>>>>>> foo'bar
>>>>>>
>>>>>> I meant without temporary variable.
>>>>>>
>>>>>> RR
>>>>>
>>>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} )
>>>>> a'c
>>>>
>>>> ( x=abc; echo "${x/b/$'\''}" )
>>>> -bash: bad substitution: no closing `}' in "${x/b/'}"
>>>>
>>>> you forgot the double quotes ;)
>>>>
>>>> I really did spend like an hour or 2 one day trying to figure
>>>> it out and gave up.
>>>
>>> Hm, good catch. Thought there might be a new quoting context
>>> over there.
>>
>> I think we can all agree it's inconsistent, just not so sure we
>> care? I.e. we know workarounds that aren't so bad: variables
>> etc.
>
> Eh, it's sort of consistent. e.g. this doesn't work either:
>
> unset x; echo "${x:-$'\''}"
>
> and likewise a backslash escape alone won't do the trick. I'd
> assume this applies to just about every expansion.
>
> I didn't think too hard before posting that. :)

My favorite type of bug: one that's consistently inconsistent :) Now that I have a better idea of what's weird, I'll take a look later, after the gym.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 06:52 PM, John Kearney wrote:
> On 02/28/2012 06:43 PM, Dan Douglas wrote:
> [large repetitive included text elided...]
>
> I think we can all agree it's inconsistent, just not so sure we care?
> I.e. we know workarounds that aren't so bad: variables etc.

To sum up: bash treats replacement strings inconsistently in double-quoted variable expansion. For example, the double quote is treated both as a literal and as a quote character:

( test=test123test ; echo "${test/123/"'"}" )
test"'"test

vs

( test=test123test ; echo "${test/123/'}" )

which hangs waiting for '.

It is treated as a literal because it is printed; it is treated as a quote character because otherwise the first form should also hang waiting for '. The single-quote and backslash characters all seem to exhibit this same dual nature in the replacement string. The search string behaves consistently, i.e. it treats characters either as special or as literal, not as both at the same time.

This has got to be a bug, guys.
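For contrast, the temporary-variable workaround already posted in this thread behaves predictably in both quoting contexts; a quick demonstration:

    q="'"
    ( test=test123test ; echo "${test/123/$q}" )   # prints test'test
    ( test=test123test ; echo ${test/123/$q} )     # prints test'test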
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 06:43 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus wrote:
>>>>>> And that means, there isn't way to substitute "something"
>>>>>> to ' (single quote) when you want to not perform word
>>>>>> splitting. I would consider it as a bug.
>>>>>
>>>>> imadev:~$ q=\'
>>>>> imadev:~$ input="foosomethingbar"
>>>>> imadev:~$ echo "${input//something/$q}"
>>>>> foo'bar
>>>>
>>>> I meant without temporary variable.
>>>>
>>>> RR
>>>
>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} )
>>> a'c
>>
>> ( x=abc; echo "${x/b/$'\''}" )
>> -bash: bad substitution: no closing `}' in "${x/b/'}"
>>
>> you forgot the double quotes ;)
>>
>> I really did spend like an hour or 2 one day trying to figure it
>> out and gave up.
>
> Hm, good catch. Thought there might be a new quoting context over
> there.

I think we can all agree it's inconsistent, just not so sure we care? I.e. we know workarounds that aren't so bad: variables etc.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 06:31 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus wrote:
>>>> And that means, there isn't way to substitute "something" to '
>>>> (single quote) when you want to not perform word splitting. I
>>>> would consider it as a bug.
>>>
>>> imadev:~$ q=\'
>>> imadev:~$ input="foosomethingbar"
>>> imadev:~$ echo "${input//something/$q}"
>>> foo'bar
>>
>> I meant without temporary variable.
>>
>> RR
>
> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} )
> a'c

( x=abc; echo "${x/b/$'\''}" )
-bash: bad substitution: no closing `}' in "${x/b/'}"

you forgot the double quotes ;)

I really did spend like an hour or 2 one day trying to figure it out and gave up.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 06:16 PM, Eric Blake wrote:
> On 02/28/2012 09:54 AM, John Kearney wrote:
>> On 02/28/2012 05:22 PM, Roman Rakus wrote:
>>> On 02/28/2012 05:10 PM, John Kearney wrote:
>>>> wrap it with single quotes and globally replace all single
>>>> quotes in the string with '\''
>>> single quote and slash have special meaning so they have to be
>>> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so
>>> it undergoes word splitting. To avoid it quote it in double
>>> quotes, however it changes how slash and single quote is
>>> treated. "'${var//\'/\'}'"
>>>
>>> Wasn't it already discussed on the list?
>>>
>>> RR
>>
>> It was discussed but not answered in a way that helped.
>
> POSIX already says that using " inside ${var+value} is
> non-portable; you've just proven that using " inside the bash
> extension of ${var//pat/sub} is likewise not useful.

I'm just going for understandable/predictable right now.

>> Now I'm not looking for a workaround, I want to understand it.
>> Now you say they are treated special; what does that mean and how
>> can I escape that specialness?
>
> By using temporary variables. That's the only sane approach.

I do; it's just always bugged me.

>> Or show me how without using variables to do this
>> test=test\'string
>>
>> [ "${test}" = "${test//"'"/"'"}" ] || exit 999
>
> exit 999 is pointless. It is the same as exit 231 on some shells,
> and according to POSIX, it is allowed to be a syntax error in other
> shells.

I was going for || exit "Doomsday", i.e. 666 = 999 = Apocalypse.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 06:05 PM, Steven W. Orr wrote:
> On 2/28/2012 11:54 AM, John Kearney wrote:
>> On 02/28/2012 05:22 PM, Roman Rakus wrote:
>>> On 02/28/2012 05:10 PM, John Kearney wrote:
>>>> wrap it with single quotes and globally replace all single
>>>> quotes in the string with '\''
>>> single quote and slash have special meaning so they have to be
>>> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so
>>> it undergoes word splitting. To avoid it quote it in double
>>> quotes, however it changes how slash and single quote is
>>> treated. "'${var//\'/\'}'"
>>>
>>> Wasn't it already discussed on the list?
>>>
>>> RR
>>
>> It was discussed but not answered in a way that helped.
>>
>> Look, consider this:
>>
>> test=teststring
>>
>> echo "${test//str/""}"
>
> This makes no sense.
>
> "${test//str/" is a string. is anudder string. "}" is a 3rd
> string.
>
> echo "${test//str/\"\"}"
>
> is perfectly legal.

But that isn't how it behaves. In

"${test//str/""}"

str is replaced with '""'; as such it is treating the double quotes as string literals. However, at the same time these literal double quotes escape/quote a single quote between them. So they are treated both as literals and as quotes, i.e. inconsistently.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
On 02/28/2012 05:22 PM, Roman Rakus wrote:
> On 02/28/2012 05:10 PM, John Kearney wrote:
>> wrap it with single quotes and globally replace all single quotes
>> in the string with '\''
> single quote and slash have special meaning so they have to be
> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so it
> undergoes word splitting. To avoid it quote it in double quotes,
> however it changes how slash and single quote is treated.
> "'${var//\'/\'}'"
>
> Wasn't it already discussed on the list?
>
> RR

It was discussed but not answered in a way that helped.

Look, consider this:

test=teststring

echo "${test//str/""}"
test""ing
echo ${test//str/""}
testing
echo ${test//str/"'"}
test'ing
echo "${test//str/"'"}"
test"'"ing
echo "${test//str/'}"     # hangs

Now consider this case:

test=test\'string
echo "${test//"'"/"'"}"
test"'"string

The match string and the replace string are exhibiting 2 different behaviors.

Now I'm not looking for a workaround, I want to understand it. You say they are treated as special: what does that mean, and how can I escape that specialness?

Or show me how, without using variables, to do this:

test=test\'string

[ "${test}" = "${test//"'"/"'"}" ] || exit 999

Note this isn't the answer:

[ "${test}" = "${test//'/'}" ] || exit 999
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
This all started with a wish to single-quote a variable. Doesn't matter why; I have multiple solutions to that now. But it is an interesting problem for exploring how escaping works in variable expansion.

So for the test case the goal is to take a string like

kljlksdjflsd'jkjkljl

wrap it with single quotes, and globally replace all single quotes in the string with '\''

It's a workaround because it doesn't work all the time; you would need something more like this:

IFS= echo \'${test//"'"/\'\\\'\'}\'" "
'weferfds'\''dsfsdf'

On 02/28/2012 05:01 PM, Greg Wooledge wrote:
> On Tue, Feb 28, 2012 at 04:52:48PM +0100, John Kearney wrote:
>> The standard work around you see is
>> echo -n \'${1//\'/\'\\\'\'}\'" "
>> but its not the same thing
>
> Workaround for what? Not the same thing as what? What is this pile
> of punctuation attempting to do?
>
>> # why does this work, this list was born of frustration, I tried
>> everything I could think of.
>> echo \'${test//"'"/\'\\\'\'}\'" "
>> 'weferfds'\''dsfsdf'
>
> Are you trying to produce "safely usable" strings that can be fed to
> eval later? Use printf %q for that.
>
> imadev:~$ input="ain't it * a \"pickle\"?"
> imadev:~$ printf '%q\n' "$input"
> ain\'t\ it\ \*\ a\ \"pickle\"\?
>
> printf -v evalable_input %q "$input"
>
> Or, y'know, avoid eval.
>
> Or is this something to do with sed? Feeding strings to sed when you
> can't choose a safe delimiter? That would involve an entirely different
> solution. It would be nice to know what the problem is.
Re: Inconsistent quote and escape handling in substitution part of parameter expansions.
Actually this is something that still really confuses me as well. In the end I gave up and just did this:

local LName="'\\''"
echo -n "'${1//"'"/${LName}}' "

I still don't really understand why this won't work:

echo -n "'${1//"'"/"'\''"}' "
echo -n "'${1//\'/\'\\\'\'}' "

The standard workaround you see is

echo -n \'${1//\'/\'\\\'\'}\'" "

but it's not the same thing.

I guess what I don't understand is why quoting the variable affects the substitution string. I mean, I guess I can see how it could happen, but it does seem inconsistent; in fact it feels like a bug. And even if it does affect it, the effect seems to be weird. I.e. given

test="weferfds'dsfsdf"

# why does this work? this list was born of frustration, I tried
# everything I could think of.
echo \'${test//"'"/\'\\\'\'}\'" "
'weferfds'\''dsfsdf'

# but none of the following
echo "'${test//'/}'"          # hangs waiting for '
echo "'${test//"'"/}'"
'weferfdsdsfsdf'
echo "'${test//"'"/"'\\''"}'"
'weferfds"'\''"dsfsdf'
echo "'${test//"'"/'\\''}'"   # hangs waiting for '
echo "'${test//"'"/\'\\'\'}'"
'weferfds\'\'\'dsfsdf'

leaving me doing something like

local LName="'\\''"
echo -n "'${1//"'"/${LName}}' "

I mean, it's a silly thing, but it confuses me.

On 02/28/2012 03:47 PM, Roman Rakus wrote:
> On 02/28/2012 02:36 PM, Chet Ramey wrote:
>> On 2/28/12 4:17 AM, lhun...@lyndir.com wrote:
>>> Configuration Information [Automatically generated, do not change]:
>>> Machine: i386
>>> OS: darwin11.2.0
>>> Compiler: /Developer/usr/bin/clang
>>> Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='i386'
>>> -DCONF_OSTYPE='darwin11.2.0' -DCONF_MACHTYPE='i386-apple-darwin11.2.0'
>>> -DCONF_VENDOR='apple' -DLOCALEDIR='/opt/local/share/locale'
>>> -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DMACOSX -I. -I.
>>> -I./include -I./lib -I/opt/local/include -pipe -O2 -arch x86_64
>>> uname output: Darwin mbillemo.lin-k.net 11.3.0 Darwin Kernel
>>> Version 11.3.0: Thu Jan 12 18:47:41 PST 2012;
>>> root:xnu-1699.24.23~1/RELEASE_X86_64 x86_64
>>> Machine Type: i386-apple-darwin11.2.0
>>>
>>> Bash Version: 4.2
>>> Patch Level: 20
>>> Release Status: release
>>>
>>> Description: The handling of backslash and quotes is completely
>>> inconsistent, counter-intuitive and in violation of how the
>>> syntax works elsewhere in bash.
>>>
>>> ' appears to introduce a single-quoted context and \ appears
>>> to escape special characters. That's good. A substitution
>>> pattern of ' causes bash to be unable to find the closing
>>> quote. That's good. A substitution pattern of '' SHOULD equal
>>> an empty quoted string. The result, however, is ''. That's
>>> NOT good. Suddenly the quotes are literal? A substitution
>>> pattern of '$var' SHOULD disable expansion inside the quotes.
>>> The result, however, is '[contents-of-var]'. That's NOT good.
>>> In fact, it looks like quoting doesn't work here at all. \\ is
>>> a disabled backslash, and the syntactical backslash is removed.
>>> The result is \. That's good. \' is a disabled single quote,
>>> but the syntactical backslash is NOT removed. The result is
>>> \'. That's NOT good.
>>>
>>> It mostly looks like all the rules for handling quoting and
>>> escaping are out the window and some random and utterly
>>> inconsistent set of rules is being applied instead.
>>>
>>> Fix: Change parsing of the substitution pattern so that it
>>> abides by all the standard documented rules regarding quotes
>>> and escaping.
>> It would go better if you gave some examples of what you
>> consider incorrect behavior. This description isn't helpful as
>> it stands.
> Maybe something like this:
>
> # ttt=ggg
> # ggg="asd'ddd'g"
> # echo "'${!ttt//\'/'\''}'"
> > ^C
> # echo "'${!ttt//\'/\'\\\'\'}'"
> 'asd\'\\'\'ddd\'\\'\'g'
>
> Anyway, I thought that single quote retains its special meaning in
> double quotes.
> $ echo "'a'"
> 'a'
>
> RR
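For what it's worth, here is a minimal sketch of the single-quoting John describes, built on his working temporary-variable form (the function name squote is made up):

    # Wrap $1 in single quotes, turning each embedded ' into '\''
    squote() {
      local q="'\\''"                 # the replacement text:  '\''
      local body="${1//"'"/${q}}"     # temp var sidesteps the quoting problem
      printf "'%s' " "${body}"
    }

    squote "weferfds'dsfsdf"          # prints 'weferfds'\''dsfsdf'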
Re: Initial test code for \U
On 02/22/2012 08:59 PM, Eric Blake wrote:
> On 02/22/2012 12:55 PM, Chet Ramey wrote:
>> On 2/21/12 5:07 PM, John Kearney wrote:
>>>
>>> Initial code for testing \u functionality.
>>
>> Thanks; this is really good work. In the limited testing I've
>> done, ja_JP.SHIFT_JIS is rare and C.UTF-8 doesn't exist anywhere.
>
> C.UTF-8 exists on Cygwin. But you are correct that...
>
>> en_US.UTF-8 seems to perform acceptably for the latter.

Also on Ubuntu. I only really started using it because it is consistent with C, i.e.

LC_CTYPE=C
LC_CTYPE=C.UTF-8

Actually this was the reason I made the comment about not being able to detect setlocale errors in bash; I wanted to use a fallback list of the locale synonyms.

The primary problem with this test is that you need the locales installed. Theoretical plan:

1. compile a list of destination code sets
2. find some method to auto-install code sets
3. get Unicode mappings for said code sets
4. use iconv to generate bash test tables (see the sketch below)
5. start crying at all the error messages ;(

Now, locale -m gives charsets. Any ideas about finding Unicode mappings for said charsets? I've been looking through the iconv code but it all seems a bit laborious. What charsets would make sense to test?
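One hedged sketch of step 4: derive the expected byte string for a single code point from iconv(1) itself, by feeding it the UTF-32BE representation, rather than maintaining the mapping tables by hand (variable names are illustrative):

    cp=0x00FC                 # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
    target=ISO-8859-1
    printf -v be '\\x%02x\\x%02x\\x%02x\\x%02x' \
      $(( cp >> 24 & 0xFF )) $(( cp >> 16 & 0xFF )) \
      $(( cp >>  8 & 0xFF )) $(( cp        & 0xFF ))
    expected=$(printf "$be" | iconv -f UTF-32BE -t "$target")
    printf 'U+%04X in %s: %q\n' "$cp" "$target" "$expected"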
Re: shopt can't set extglob in a sub-shell?
I updated that wiki page; hopefully it's clearer now.
http://mywiki.wooledge.org/glob#extglob

On 02/26/2012 12:06 PM, Dan Douglas wrote:
> On Saturday, February 25, 2012 09:42:29 PM Davide Baldini wrote:
>
>> Description: A 'test.sh` script file composed exclusively of the
>> following text fails execution:
>>
>> #!/bin/bash
>> (
>> shopt -s extglob
>> echo !(x)
>> )
>>
>> giving the output:
>>
>> $ ./test.sh
>> ./test.sh: line 4: syntax error near unexpected token `('
>> ./test.sh: line 4: ` echo !(x)'
>>
>> Moving the shopt line above the sub-shell parenthesis
>> makes the script work.
>>
>> The debian man pages give no explanation.
>>
>> Thank you.
>
> Non-eval workaround if you're desperate:
>
> #!/usr/bin/env bash
> (
> shopt -s extglob
> declare -a a='( !(x) )'
> echo "${a[@]}"
> )
>
> You may be aware extglob is special and affects parsing in other
> ways. Quoting Greg's wiki (http://mywiki.wooledge.org/glob):
>
>> Likewise, you cannot put shopt -s extglob inside a function that
>> uses extended globs, because the function as a whole must be
>> parsed when it's defined; the shopt command won't take effect
>> until the function is called, at which point it's too late.
>
> This appears to be a similar situation. Since parentheses are
> "metacharacters" they act strongly as word boundaries without a
> special exception for extglobs.
>
> I just tested a bunch of permutations. I was a bit surprised to see
> this one fail:
>
> f() if [[ $FUNCNAME != ${FUNCNAME[1]} ]]; then
>     trap 'shopt -u extglob' RETURN
>     shopt -s extglob
>     f
> else
>     f()(
>     shopt -s extglob
>     echo !(x)
>     )
>     f
> fi
>
> f
>
> I was thinking there might be a general solution via the RETURN
> trap where you could just set "trace" on functions where you want
> it, but looks like even "redefinitions" break recursively, so
> you're stuck. Fortunately, there aren't a lot of good reasons to
> have extglob disabled to begin with (if any).
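A minimal illustration of the parse-time rule the wiki page describes: the option has to be on before bash parses any construct that uses an extended glob, so setting it at the top level works where the subshell version fails.

    #!/bin/bash
    shopt -s extglob       # in effect before anything below is parsed
    f() { echo !(x); }     # OK: extglob was on when this body was read
    f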
Re: shopt can't set extglob in a sub-shell?
On 02/25/2012 09:42 PM, Davide Baldini wrote:
> Configuration Information [Automatically generated, do not change]:
> Machine: i486
> OS: linux-gnu
> Compiler: gcc
> Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='i486'
> -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i486-pc-linux-gnu'
> -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
> -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include
> -I../bash/lib -g -O2 -Wall
> uname output: Linux debianBunker 2.6.26-2-686 #1 SMP Wed Sep 21
> 04:35:47 UTC 2011 i686 GNU/Linux
> Machine Type: i486-pc-linux-gnu
>
> Bash Version: 4.1
> Patch Level: 5
> Release Status: release

OK, so I had a play around with it. It's not specific to subshells; it applies to compound commands in general, so the following also doesn't work:

shopt -u extglob
if true; then
  shopt -s extglob
  echo !(x)
fi

This is because bash parses the entire if statement as one command, so the second shopt isn't evaluated before the !(x) is parsed; hence the error message. The error message is a parsing error, not an expansion error, I think. So bash sees the above as

shopt -u extglob
if true; then shopt -s extglob; echo !(x); fi

As a workaround you could try/use this; it delays parsing of the !(x) until after the shopt is evaluated:

( shopt -s extglob
  eval 'echo !(x)' )

Not sure if this is expected behavior.
hth
deth.
Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.
And on the upside, if they do ever give in and allow registration of family-name characters we may get a wchar_t, schar_t, lwchar_t and a llwchar_t :) Just imagine a variable-length 64-bit char system. Everything from Sumerian to Klingon in Unicode; though I think they already are, if not officially, or are being done.

Oh god, what I really want now is bash in Klingon. :)) Just imagine: black background, glaring green text. I know what I'm doing tonight.

Check out (shakes head in disbelief, while chuckling):

Ubuntu Klingon Translators
https://launchpad.net/~ubuntu-l10n-tlh

Expansion: Ubuntu Font should support pIqaD (Klingon)
https://bugs.launchpad.net/ubuntu/+source/ubuntu-font-family-sources/+bug/650729

On 02/23/2012 04:54 AM, Eric Blake wrote:
> On 02/22/2012 07:43 PM, John Kearney wrote:
>> ^ Caveat: you can represent the full 0x10 range in UTF-16, you just
>> need 2 UTF-16 code units. Check out the latest version of unicode.c
>> for an example of how.
>
> Yes, and Cygwin actually does this.
>
> A strict reading of POSIX states that wchar_t must be wide enough
> for all supported characters, technically limiting things to just
> the basic plane if you have 16-bit wchar_t and a POSIX-compliant
> app. But cygwin has exploited a loophole in the POSIX wording -
> POSIX does not require that all bit patterns are valid characters.
> So the actual Cygwin implementation is that on paper, rather than
> representing all 65536 patterns as valid characters, the values
> used in surrogate halves (0xd800 to 0xdfff) are listed as
> non-characters (so the use of them triggers undefined behavior per
> POSIX), but actually using them treats them as surrogate pairs
> (leading to the full Unicode character set, but reintroducing the
> headaches that multibyte characters had with 'char', but now with
> wchar_t, where you are back to dealing with variable-sized
> character elements).
>
> Furthermore, the mess of 16-bit vs. 32-bit wchar_t is one of the
> reasons why C11 has introduced two new character types, 16-bit and
> 32-bit characters, designed to fully map to the full Unicode set,
> regardless of what size wchar_t is. It will be interesting to see
> how the next version of POSIX takes the additions of C11 and
> retrofits the other wide-character functions in POSIX but not C99
> to handle the new character types.
Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.
^ Caveat: you can represent the full 0x10 range in UTF-16, you just need 2 UTF-16 code units. Check out the latest version of unicode.c for an example of how.

On 02/22/2012 11:32 PM, Eric Blake wrote:
> On 02/22/2012 03:01 PM, Linda Walsh wrote:
>> My question had to do with an unqualified wint_t not
>> unsigned wint_t and what platform existed where an 'int' type or
>> wide-int_t, was, without qualifiers, unsigned. I still would like
>> to know -- and posix allows int/wide-ints to be unsigned without
>> the unsigned keyword?
>
> 'int' is signed, and at least 16 bits (these days, it's usually 32).
> It can also be written 'signed int'.
>
> 'unsigned int' is unsigned, and at least 16 bits (these days, it's
> usually 32).
>
> 'wchar_t' is an arbitrary integral type, either signed or unsigned,
> and capable of holding the value of all valid wide characters. It is
> possible to define a system where wchar_t and char are identical
> (limiting yourself to 256 valid characters), but that is not done in
> practice. More common are platforms that use 65536 characters (only
> the basic plane of Unicode) for 16 bits, or full Unicode (0 to 0x10)
> for 32 bits. Platforms that use 65536 characters and 16-bit wchar_t
> must have wchar_t be unsigned; whereas platforms that have wchar_t
> wider than the largest valid character can choose signed or unsigned
> with no impact.
>
> 'wint_t' is an arbitrary integral type, either signed or unsigned, at
> least as wide as wchar_t, and capable of holding the value of all
> valid wide characters and the sentinel WEOF. Like wchar_t, it may
> hold values that are neither WEOF nor valid characters; and in fact,
> it is more likely to do so, since either wchar_t is saturated (all
> bit values are valid characters) and thus wint_t is a wider type, or
> wchar_t is sparse (as is the case with 32-bit wchar_t encoding
> Unicode), and the addition of WEOF to the set does not plug in the
> remaining sparse values; but using such values has unspecified
> results on any interface that takes a wint_t. WEOF only has to be
> distinct, it does not have to be negative.
>
> Don't think of it as 'wide-int', rather, think of it as 'the integral
> type that both contains wchar_t and WEOF'. You cannot write 'signed
> wint_t' nor 'unsigned wint_t'.
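For the record, the surrogate-pair arithmetic itself is tiny; here is a sketch in shell arithmetic using the standard UTF-16 constants (the code point chosen is just an example):

    cp=$(( 0x1D11E ))              # MUSICAL SYMBOL G CLEF, above 0xFFFF
    v=$(( cp - 0x10000 ))          # leaves 20 bits to split
    hi=$(( 0xD800 + (v >> 10) ))   # high (leading) surrogate
    lo=$(( 0xDC00 + (v & 0x3FF) )) # low (trailing) surrogate
    printf 'U+%05X -> 0x%04X 0x%04X\n' "$cp" "$hi" "$lo"
    # U+1D11E -> 0xD834 0xDD1E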
Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.
On 02/22/2012 01:59 PM, Eric Blake wrote: > On 02/22/2012 05:19 AM, Linda Walsh wrote: >> >> >> Eric Blake wrote: >> >> >>> Not only can wchar_t can be either signed or unsigned, you also have to >>> worry about platforms where it is only 16 bits, such as cygwin; on the >>> other hand, wint_t is always 32 bits, but you still have the issue that >>> it can be either signed or unsigned. >> >> >> >> What platform uses unsigned wide ints? Is that even posix compat? > > Yes, it is posix compatible to have wint_t be unsigned. Not only that, > but both glibc (32-bit wchar_t) and cygwin (16-bit wchar_t) use a 32-bit > unsigned int for wint_t. Any code that expects WEOF to be less than 0 > is broken. > But if what you want is a uint32 use a uint32_t ;)
printf "%q" "~" not escaped?
Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
printf "%q" "~" is not escaped, which means that this

eval echo $(printf "%q" "~")

results in your home path, not a ~, unlike

eval echo $(printf "%q" "*")

As far as I can see it's the only character that isn't treated as I expected.
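A hedged workaround sketch (not an official fix): since %q leaves a leading tilde unquoted, backslash it yourself before handing the string to eval.

    s='~/some dir'
    q=$(printf '%q' "$s")
    case $q in \~*) q=\\$q ;; esac   # escape a leading unquoted tilde
    eval echo "$q"                   # prints ~/some dir literally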
Re: Bug? in bash setlocale implementation
On 02/22/2012 01:52 AM, Chet Ramey wrote:
> On 2/21/12 3:51 AM, John Kearney wrote:
>
>> Bash Version: 4.2
>> Patch Level: 10
>> Release Status: release
>>
>> Description: Basically if setting the locale fails the variable
>> should not be changed.
>
> I disagree. The assignment was performed correctly and as the
> user specified. The fact that a side effect of the assignment
> failed should not mean that the assignment should be undone.
>
> I got enough bug reports when I added the warning. I'd get at
> least as many if I undid a perfectly good assignment statement.
>
> I could see setting $? to a non-zero value if the setlocale() call
> fails, but not when the shell is in posix mode.
>
> Chet

OK, I guess that makes sense; the ksh93 behavior also makes sense, though. I guess I can just use some command to check that the charset is present before I assign it (see the sketch below).
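A minimal sketch of that pre-check, assuming nothing beyond the standard locale(1) utility (the variable names are illustrative):

    # only assign the locale if the system reports it as installed
    want=en_US.utf8
    if locale -a 2>/dev/null | grep -qix "$want"; then
      LC_CTYPE=$want
    else
      echo "locale $want not installed" >&2
    fi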
Here is a diff of all the changes to the unicode code
Here is a diff of all the changes to the unicode code. This seems to work OK for me, but it still needs further testing. My major goal was to make the code easier to follow and clearer, but I also generally fixed and improved it.

Added a warning message:

./bash -c 'printf "string 1\\U8fffStromg 2"'
./bash: line 0: printf: warning: U+8fff unsupported in destination charset ".UTF-8"
string 1Stromg 2

Added utf32toutf16 and utf32towchar to allow usage of wcstombs both when wchar_t is 2 or 4 bytes. Generally reworked so it is consistent with the function-argument convention, i.e. destination then source.

diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..77a8159 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,15 +859,9 @@ tescape (estart, cp, lenp, sawc)
 	      *cp = '\\';
 	      return 0;
 	    }
-	  if (uvalue <= UCHAR_MAX)
-	    *cp = uvalue;
-	  else
-	    {
-	      temp = u32cconv (uvalue, cp);
-	      cp[temp] = '\0';
-	      if (lenp)
-	        *lenp = temp;
-	    }
+	  temp = utf32tomb (cp, uvalue);
+	  if (lenp)
+	    *lenp = temp;
 	  break;
 #endif

diff --git a/externs.h b/externs.h
index 09244fa..ff3f344 100644
--- a/externs.h
+++ b/externs.h
@@ -460,7 +460,7 @@
 extern unsigned int falarm __P((unsigned int, unsigned int));
 extern unsigned int fsleep __P((unsigned int, unsigned int));

 /* declarations for functions defined in lib/sh/unicode.c */
-extern int u32cconv __P((unsigned long, char *));
+extern int utf32tomb __P((char *, unsigned long));

 /* declarations for functions defined in lib/sh/winsize.c */
 extern void get_new_window_size __P((int, int *, int *));

diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..e410cff 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -28,6 +28,7 @@
 #include
 #include
+#include

 #include "shell.h"

 #ifdef ESC
@@ -140,21 +141,10 @@ ansicstr (string, len, flags, sawc, rlen)
 	  for (v = 0; ISXDIGIT ((unsigned char)*s) && temp--; s++)
 	    v = (v * 16) + HEXVALUE (*s);
 	  if (temp == ((c == 'u') ? 4 : 8))
-	    {
 	      *r++ = '\\';	/* c remains unchanged */
-	      break;
-	    }
-	  else if (v <= UCHAR_MAX)
-	    {
-	      c = v;
-	      break;
-	    }
 	  else
-	    {
-	      temp = u32cconv (v, r);
-	      r += temp;
-	      continue;
-	    }
+	    r += utf32tomb (r, v);
+	  break;
 #endif
 	case '\\':
 	  break;

diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..5cc96bf 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -36,13 +36,7 @@
 #include

-#ifndef USHORT_MAX
-# ifdef USHRT_MAX
-#define USHORT_MAX USHRT_MAX
-# else
-#define USHORT_MAX ((unsigned short) ~(unsigned short)0)
-# endif
-#endif
+#include "bashintl.h"

 #if !defined (STREQ)
 # define STREQ(a, b) ((a)[0] == (b)[0] && strcmp ((a), (b)) == 0)
@@ -54,13 +48,14 @@
 extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif

-static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
 #endif

 #ifndef HAVE_LOCALE_CHARSET
+static char charset_buffer[40]={0};
 static char *
 stub_charset ()
 {
@@ -68,168 +63,267 @@ stub_charset ()
   locale = get_locale_var ("LC_CTYPE");

   if (locale == 0 || *locale == 0)
-    return "ASCII";
-  s = strrchr (locale, '.');
-  if (s)
     {
-      t = strchr (s, '@');
-      if (t)
-	*t = 0;
-      return ++s;
+      strcpy(charset_buffer, "ASCII");
     }
-  else if (STREQ (locale, "UTF-8"))
-    return "UTF-8";
   else
-    return "ASCII";
+    {
+      s = strrchr (locale, '.');
+      if (s)
+	{
+	  t = strchr (s, '@');
+	  if (t)
+	    *t = 0;
+	  strcpy(charset_buffer, s);
+	}
+      else
+	{
+	  strcpy(charset_buffer, locale);
+	}
+      /* free(locale) If we can Modify the buffer surely we need to free it?*/
+    }
+  return charset_buffer;
 }
 #endif

-/* u32toascii ?  */
+
+#if 0
 int
-u32tochar (wc, s)
-     wchar_t wc;
+utf32tobig5 (s, c)
      char *s;
+     unsigned long c;
 {
-  unsigned long x;
   int l;

-  x = wc;
-  l = (x <= UCHAR_MAX) ? 1 : ((x <= USHORT_MAX) ? 2 : 4);
-
-  if (x <= UCHAR_MAX)
-    s[0] = x & 0xFF;
-  else if (x <= USHORT_MAX)	/* assume unsigned short = 16 bits */
+  if (c <= 0x7F)
     {
-      s[0] = (x >> 8) & 0xFF;
-      s[1] = x & 0xFF;
+      s[0] = (char)c;
+      l = 1;
+    }
+  else if ((c >= 0x8000) && (c <= 0x))
+    {
+      s[0] = (char)(c>>8);
+      s[1] = (char)(c &0xFF);
+      l = 2;
     }
   else
     {
-      s[0] = (x >> 24) & 0xFF;
-      s[1] = (x >> 16) & 0xFF;
-      s[2] = (x >> 8) & 0xFF;
-      s[3] = x & 0xFF;
+      /* Error Invalid UTF-8 */
+      l = 0;
     }
   s[l] = '\0';
-  return l;
+  return l;
 }
-
+#endif
 int
-u32toutf8 (wc, s)
-     wchar_t wc;
+utf32toutf8 (s, c)
      char *s;
+     unsigned long c;
 {
   int l;

-  l = (wc < 0x0080) ? 1 : ((wc < 0x0800) ? 2 : 3);
-
-  if (wc < 0x0080)
-    s[0] = (unsigned char)wc;
-  else if (wc < 0x0800)
+  if (c <= 0x7F)
     {
-      s[0] = (wc >> 6) | 0xc0;
-      s[1] = (
Initial test code for \U
Initial code for testing \u functionality. Basically it uses arrays that look like this:

jp_JP_SHIFT_JIS=(
  #Unicode="expected bmstring"
  [0x0001]=$'\x01' # START OF HEADING
  [0x0002]=$'\x02' # START OF TEXT
  ...
)
TestCodePage ja_JP.SHIFT_JIS jp_JP_SHIFT_JIS

Error output looks like this:

Error Encoding U+00FB to C.UTF-8 [ "$'\303\273'" != "$'\373'" ]
Error Encoding U+00FC to C.UTF-8 [ "$'\303\274'" != "$'\374'" ]
Error Encoding U+00FD to C.UTF-8 [ "$'\303\275'" != "$'\375'" ]
Error Encoding U+00FE to C.UTF-8 [ "$'\303\276'" != "$'\376'" ]
Error Encoding U+00FF to C.UTF-8 [ "$'\303\277'" != "$'\377'" ]
Failed 128 of 1378 Unicode tests

or, if it's all OK, like this:

Passed all 1378 Unicode tests

It should make it relatively easy to verify functionality on different targets etc.

ErrorCnt=0
TestCnt=0

function check_valid_var_name {
  case "${1:?Missing Variable Name}" in
    [!a-zA-Z_]* | *[!a-zA-Z_0-9]* ) return 3;;
  esac
}

# get_array_element VariableName ArrayName ArrayElement
function get_array_element {
  check_valid_var_name "${1:?Missing Variable Name}" || return $?
  check_valid_var_name "${2:?Missing Array Name}" || return $?
  eval "${1}"'="${'"${2}"'["${3:?Missing Array Index}"]}"'
}

# get_array_element_cnt VarName ArrayName
function get_array_element_cnt {
  check_valid_var_name "${1:?Missing Variable Name}" || return $?
  check_valid_var_name "${2:?Missing Array Name}" || return $?
  eval "${1}"'="${#'"${2}"'[@]}"'
}

function TestCodePage {
  local TargetCharset="${1:?Missing Test charset}"
  local EChar RChar TCnt
  get_array_element_cnt TCnt "${2:?Missing Array Name}"
  for (( x=1 ; x<${TCnt} ; x++ )); do
    get_array_element EChar "${2}" ${x}
    if [ -n "${EChar}" ]; then
      let TestCnt+=1
      printf -v UVal '\\U%08x' "${x}"
      LC_CTYPE=${TargetCharset} printf -v RChar "${UVal}"
      if [ "${EChar}" != "${RChar}" ]; then
        let ErrorCnt+=1
        printf "Error Encoding U+%08X to ${TargetCharset} [ \"%q\" != \"%q\" ]\n" "${x}" "${EChar}" "${RChar}"
      fi
    fi
  done
}

#for ((x=1;x<255;x++)); do printf ' [0x%04x]=$'\''\%03o'\' $x $x ; [ $(($x%5)) = 0 ] && echo; done
fr_FR_ISO_8859_1=(
 [0x0001]=$'\001' [0x0002]=$'\002' [0x0003]=$'\003' [0x0004]=$'\004' [0x0005]=$'\005'
 [0x0006]=$'\006' [0x0007]=$'\007' [0x0008]=$'\010' [0x0009]=$'\011' [0x000a]=$'\012'
 [0x000b]=$'\013' [0x000c]=$'\014' [0x000d]=$'\015' [0x000e]=$'\016' [0x000f]=$'\017'
 [0x0010]=$'\020' [0x0011]=$'\021' [0x0012]=$'\022' [0x0013]=$'\023' [0x0014]=$'\024'
 [0x0015]=$'\025' [0x0016]=$'\026' [0x0017]=$'\027' [0x0018]=$'\030' [0x0019]=$'\031'
 [0x001a]=$'\032' [0x001b]=$'\033' [0x001c]=$'\034' [0x001d]=$'\035' [0x001e]=$'\036'
 [0x001f]=$'\037' [0x0020]=$'\040' [0x0021]=$'\041' [0x0022]=$'\042' [0x0023]=$'\043'
 [0x0024]=$'\044' [0x0025]=$'\045' [0x0026]=$'\046' [0x0027]=$'\047' [0x0028]=$'\050'
 [0x0029]=$'\051' [0x002a]=$'\052' [0x002b]=$'\053' [0x002c]=$'\054' [0x002d]=$'\055'
 [0x002e]=$'\056' [0x002f]=$'\057' [0x0030]=$'\060' [0x0031]=$'\061' [0x0032]=$'\062'
 [0x0033]=$'\063' [0x0034]=$'\064' [0x0035]=$'\065' [0x0036]=$'\066' [0x0037]=$'\067'
 [0x0038]=$'\070' [0x0039]=$'\071' [0x003a]=$'\072' [0x003b]=$'\073' [0x003c]=$'\074'
 [0x003d]=$'\075' [0x003e]=$'\076' [0x003f]=$'\077' [0x0040]=$'\100' [0x0041]=$'\101'
 [0x0042]=$'\102' [0x0043]=$'\103' [0x0044]=$'\104' [0x0045]=$'\105' [0x0046]=$'\106'
 [0x0047]=$'\107' [0x0048]=$'\110' [0x0049]=$'\111' [0x004a]=$'\112' [0x004b]=$'\113'
 [0x004c]=$'\114' [0x004d]=$'\115' [0x004e]=$'\116' [0x004f]=$'\117' [0x0050]=$'\120'
 [0x0051]=$'\121' [0x0052]=$'\122' [0x0053]=$'\123' [0x0054]=$'\124' [0x0055]=$'\125'
 [0x0056]=$'\126' [0x0057]=$'\127' [0x0058]=$'\130' [0x0059]=$'\131' [0x005a]=$'\132'
 [0x005b]=$'\133' [0x005c]=$'\134' [0x005d]=$'\135' [0x005e]=$'\136' [0x005f]=$'\137'
 [0x0060]=$'\140' [0x0061]=$'\141' [0x0062]=$'\142' [0x0063]=$'\143' [0x0064]=$'\144'
 [0x0065]=$'\145' [0x0066]=$'\146' [0x0067]=$'\147' [0x0068]=$'\150' [0x0069]=$'\151'
 [0x006a]=$'\152' [0x006b]=$'\153' [0x006c]=$'\154' [0x006d]=$'\155' [0x006e]=$'\156'
 [0x006f]=$'\157' [0x0070]=$'\160' [0x0071]=$'\161' [0x0072]=$'\162' [0x0073]=$'\163'
 [0x0074]=$'\164' [0x0075]=$'\165' [0x0076]=$'\166' [0x0077]=$'\167' [0x0078]=$'\170'
 [0x0079]=$'\171' [0x007a]=$'\172' [0x007b]=$'\173' [0x007c]=$'\174' [0x007d]=$'\175'
 [0x007e]=$'\176' [0x007f]=$'\177' [0x0080]=$'\200' [0x0081]=$'\
Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.
On 02/21/2012 01:34 PM, Eric Blake wrote:
> On 02/20/2012 07:42 PM, Chet Ramey wrote:
>> On 2/18/12 5:39 AM, John Kearney wrote:
>>
>>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>>>
>>> Description: Current u32toutf8 only encodes values up to 0xFFFF
>>> correctly. wchar_t can be of ambiguous size; better in my opinion
>>> to use unsigned long, or uint32_t, or something clearer.
>>
>> Thanks for the patch. It's good to have a complete
>> implementation, though as a practical matter you won't see UTF-8
>> characters longer than four bytes. I agree with you about the
>> unsigned 32-bit int type; wchar_t is signed, even if it's 32
>> bits, on several systems I use.
>
> Not only can wchar_t be either signed or unsigned, you also
> have to worry about platforms where it is only 16 bits, such as
> cygwin; on the other hand, wint_t is always 32 bits, but you still
> have the issue that it can be either signed or unsigned.
>

Signed/unsigned isn't really the problem anyway: UTF-8 only encodes values up to 0x7FFFFFFF, and UTF-16 only encodes values up to 0x0010FFFF. In my latest version I've pretty much removed all reference to wchar_t in unicode.c; it was unnecessary. However, I would be interested in something like utf16_t or uint16_t; currently I'm using unsigned short, which isn't elegant but works.
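For concreteness, the UTF-16 side of that: anything above 0xFFFF needs a surrogate pair, which is what caps UTF-16 at 0x0010FFFF. A utf32toutf16 along those lines looks roughly like this (an illustrative sketch; the uint16_t-based signature is my assumption, not the exact code in the patch):

#include <stdint.h>

/* Encode a UTF-32 code point as one or two UTF-16 code units.
   Returns the number of units written, or 0 if c is not encodable
   (above 0x0010FFFF, or a bare surrogate). */
int
utf32toutf16 (uint32_t c, uint16_t *u)
{
  if (c < 0xD800 || (c >= 0xE000 && c <= 0xFFFF))
    {
      u[0] = (uint16_t) c;
      return 1;
    }
  if (c >= 0x10000 && c <= 0x10FFFF)
    {
      c -= 0x10000;
      u[0] = 0xD800 | (c >> 10);    /* high surrogate */
      u[1] = 0xDC00 | (c & 0x3FF);  /* low surrogate */
      return 2;
    }
  return 0;                         /* not representable */
}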
Bug? in bash setlocale implementation
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/s
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
Basically, if setting the locale fails, the variable should not be changed. Consider:

export LC_CTYPE=
bash -c 'LC_CTYPE=ISO-8859-1 eval printf "\${LC_CTYPE:-unset}"'
bash: warning: setlocale: LC_CTYPE: cannot change locale (ISO-8859-1): No such file or directory
ISO-8859-1

ksh93 -c 'LC_CTYPE=ISO-8859-1 eval printf "\${LC_CTYPE:-unset}"'
ISO-8859-1: unknown locale
unset

ksh93 -c 'LC_CTYPE=C.UTF-8 eval printf "\${LC_CTYPE:-unset}"'
C.UTF-8

The advantage being that you can then check in the script whether the locale change worked, e.g.

LC_CTYPE=ISO-8859-1
[ "${LC_CTYPE:-}" = "ISO-8859-1" ] || error exit
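On the C side the change is small: setlocale(3) returns NULL on failure and leaves the current locale untouched, so the shell only has to reject the variable assignment when that happens. A minimal sketch of the idea (my illustration, not bash's actual internals):

#include <locale.h>
#include <stdio.h>

/* Attempt to change LC_CTYPE; report failure so the caller can
   leave the shell variable at its previous value. */
int
try_lc_ctype (const char *value)
{
  if (setlocale (LC_CTYPE, value) == NULL)
    {
      fprintf (stderr, "setlocale: LC_CTYPE: cannot change locale (%s)\n",
               value);
      return 0;   /* keep the old $LC_CTYPE */
    }
  return 1;       /* the variable and the locale now agree */
}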
bug in stub_charset rollup diff of changes to unicode code.
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/s
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
stub_charset currently behaves like this:

if locale == '\0'                  return ASCII
else if locale =~ m/.*\.(.*)(@.*)/ return $1
else if locale == UTF-8            return UTF-8
else                               return ASCII

It should be:

if locale == '\0'                  return ASCII
else if locale =~ m/.*\.(.*)(@.*)/ return $1
else                               return locale

because its output is only used by iconv, so let iconv decide whether the locale makes sense.

I've attached a diff of all my changes to the unicode code, including:

- renamed u32cconv to utf32tomb
- moved the special handling of ASCII characters to the start of the function and removed the related call-wrapper code
- tried to rationalize the code in utf32tomb so it is easier to read and to understand what is happening
- added utf32toutf16
- used utf32toutf16, together with wctomb, for the case where wchar_t is 2 bytes
- removed dangerous code that was using iconv_open (charset, "ASCII"); as a fallback -- pointless anyway, as we already assign an ASCII value where possible
- added a warning message if encoding fails
- always terminate the mb output string

I haven't started to test these changes yet; first I would like to know whether the changes are acceptable, and to hear any observations. I'm still reviewing them myself for consistency. Also, can somebody tell me how this was tested originally? I've got some ideas myself, but would like to know what has already been done in that direction.

Repeat-By: .
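Written out as a standalone function, the suggested stub_charset behaviour is roughly this (a sketch only; get_locale_var is the bash-internal lookup, and the exact body differs from the attached diff, which stores the copy in a static CType buffer so the result survives the call):

#include <string.h>

extern char *get_locale_var ();   /* bash internal */

static char CType[40];

/* Return the charset part of LC_CTYPE, e.g. "SHIFT_JIS" for
   "ja_JP.SHIFT_JIS"; if there is no '.', return the locale name
   itself and let iconv decide whether it makes sense. */
static char *
stub_charset (void)
{
  char *s, *t, *locale;

  locale = get_locale_var ("LC_CTYPE");
  if (locale == 0 || *locale == '\0')
    return "ASCII";
  strncpy (CType, locale, sizeof (CType) - 1);
  CType[sizeof (CType) - 1] = '\0';
  s = strrchr (CType, '.');
  if (s)
    {
      t = strchr (s, '@');      /* drop any @modifier suffix */
      if (t)
        *t = '\0';
      return ++s;
    }
  return CType;
}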
Fix:
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..3680419 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,15 +859,9 @@ tescape (estart, cp, lenp, sawc)
           *cp = '\\';
           return 0;
         }
-      if (uvalue <= UCHAR_MAX)
-        *cp = uvalue;
-      else
-        {
-          temp = u32cconv (uvalue, cp);
-          cp[temp] = '\0';
-          if (lenp)
-            *lenp = temp;
-        }
+      temp = utf32tomb (uvalue, cp);
+      if (lenp)
+        *lenp = temp;
       break;
 #endif
diff --git a/externs.h b/externs.h
index 09244fa..8868b55 100644
--- a/externs.h
+++ b/externs.h
@@ -460,7 +460,7 @@ extern unsigned int falarm __P((unsigned int, unsigned int));
 extern unsigned int fsleep __P((unsigned int, unsigned int));
 
 /* declarations for functions defined in lib/sh/unicode.c */
-extern int u32cconv __P((unsigned long, char *));
+extern int utf32tomb __P((unsigned long, char *));
 
 /* declarations for functions defined in lib/sh/winsize.c */
 extern void get_new_window_size __P((int, int *, int *));
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..495d9c4 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -144,16 +144,10 @@ ansicstr (string, len, flags, sawc, rlen)
                 *r++ = '\\';    /* c remains unchanged */
                 break;
               }
-            else if (v <= UCHAR_MAX)
-              {
-                c = v;
-                break;
-              }
             else
               {
-                temp = u32cconv (v, r);
-                r += temp;
-                continue;
+                r += utf32tomb (v, r);
+                break;
               }
 #endif
           case '\\':
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..9a557a9 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -36,14 +36,6 @@
 #include
-
-#ifndef USHORT_MAX
-# ifdef USHRT_MAX
-#define USHORT_MAX USHRT_MAX
-# else
-#define USHORT_MAX ((unsigned short) ~(unsigned short)0)
-# endif
-#endif
 
 #if !defined (STREQ)
 # define STREQ(a, b) ((a)[0] == (b)[0] && strcmp ((a), (b)) == 0)
 #endif /* !STREQ */
@@ -54,13 +46,14 @@
 extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif
 
-static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
 #endif
 
 #ifndef HAVE_LOCALE_CHARSET
+static char CType[40]={0};
 static char *
 stub_charset ()
 {
@@ -69,6 +62,7 @@ stub_charset ()
   locale = get_locale_var ("LC_CTYPE");
   if (locale == 0 || *locale == 0)
     return "ASCII";
+  strcpy(CType, locale);
   s = strrchr (locale, '.');
   if (s)
     {
@@ -77,159 +71,230 @@
       *t = 0;
       return ++s;
     }
-  else if (STREQ (locale, "UTF-8"))
-    return "UTF-8";
   else
-    return "ASCII";
+    return CType;
 }
 #endif
 
-/* u32toascii ? */
 int
-u32tochar (wc, s)
-     wchar_t wc;
+utf32_2_utf8 (c, s)
+     unsigned lon
Can somebody explain to me what u32tochar in /lib/sh/unicode.c is trying to do?
Can somebody explain to me what u32tochar is trying to do? It seems like dangerous code. From the context I'm guessing it is trying to make a hail-mary pass at converting UTF-32 to mb (not UTF-8 mb):

int
u32tochar (x, s)
     unsigned long x;
     char *s;
{
  int l;

  l = (x <= UCHAR_MAX) ? 1 : ((x <= USHORT_MAX) ? 2 : 4);

  if (x <= UCHAR_MAX)
    s[0] = x & 0xFF;
  else if (x <= USHORT_MAX)	/* assume unsigned short = 16 bits */
    {
      s[0] = (x >> 8) & 0xFF;
      s[1] = x & 0xFF;
    }
  else
    {
      s[0] = (x >> 24) & 0xFF;
      s[1] = (x >> 16) & 0xFF;
      s[2] = (x >> 8) & 0xFF;
      s[3] = x & 0xFF;
    }
  /* s[l] = '\0';  Overwrite buffer? */
  return l;
}

A couple of problems with that, though. Firstly, UTF-32 doesn't map directly to non-UTF mb locales, so you need a translation mechanism. Secondly, normal CJK systems are state-based, so multibyte sequences need to be escaped; Extended Unix Code would need encoding somewhat like UTF-8. In fact, any variable-length multibyte encoding needs some context to recover the information, so this output is unparsable. What the function is actually doing is taking UTF-32 and, depending on the size, encoding it as UTF-32 big-endian, UTF-16 big-endian, UTF-8, or an EASCII codepage (values between 0x80 and 0xff). Choosing between those, however, depends on LC_CTYPE, not on some arbitrary size check. So this function just seems plain crazy. I think all it can safely do is this:

int
utf32tomb (x, s)
     unsigned long x;
     char *s;
{
  if (x <= 0x7f)	/* x >= 0x80 is locale specific */
    {
      s[0] = x & 0xFF;
      return 1;
    }
  else
    return 0;
}

Regarding naming conventions: u32 reads as "unsigned 32-bit"; it might be a good idea to rename all the u32 functions to utf32, which would, I think, save a lot of confusion in the code as to what is going on.
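The safe translation mechanism for everything above 0x7f is iconv(3): give it the code point in a fixed UTF encoding and let the C library produce the locale's multibyte sequence, shift states and all. A minimal sketch of that approach (my illustration, not the code in bash; a real implementation would cache the descriptor instead of opening one per call):

#include <iconv.h>
#include <stddef.h>

/* Convert one UTF-32 code point to the multibyte encoding named by
   charset, writing into s (at least outlen bytes).  Returns the
   number of bytes written, or -1 on failure. */
int
utf32_to_locale (unsigned long c, const char *charset, char *s, size_t outlen)
{
  iconv_t cd;
  char in[4], *ip = in, *op = s;
  size_t il = sizeof in, ol = outlen;

  /* serialize c as UTF-32LE, least significant byte first */
  in[0] = c & 0xFF;
  in[1] = (c >> 8) & 0xFF;
  in[2] = (c >> 16) & 0xFF;
  in[3] = (c >> 24) & 0xFF;

  cd = iconv_open (charset, "UTF-32LE");
  if (cd == (iconv_t) -1)
    return -1;
  if (iconv (cd, &ip, &il, &op, &ol) == (size_t) -1)
    {
      iconv_close (cd);
      return -1;   /* not representable in this charset */
    }
  iconv_close (cd);
  return op - s;
}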
Questionable code behavior in u32cconv?
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
Now I may be misreading the code, but it looks like the code relating to iconv only checks the destination charset the first time it is executed, which breaks the following usage:

LC_CTYPE=C printf '\uff'
LC_CTYPE=C.UTF-8 printf '\uff'

Repeat-By:
haven't seen the problem.

Fix:
Not so much a fix as a modification that should hopefully clarify my concern.

diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..3f7d378 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -54,7 +54,7 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif
 
-static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
@@ -115,26 +115,61 @@ u32tochar (wc, s)
 }
@@ -150,7 +185,7 @@ u32cconv (c, s)
      wchar_t wc;
      int n;
 #if HAVE_ICONV
-  const char *charset;
+  const char *ncharset;
   char obuf[25], *optr;
   size_t obytesleft;
   const char *iptr;
@@ -171,20 +206,22 @@ u32cconv (c, s)
   codeset = nl_langinfo (CODESET);
   if (STREQ (codeset, "UTF-8"))
     {
       n = u32toutf8 (wc, s);
       return n;
     }
 #endif
 
 #if HAVE_ICONV
-  /* this is mostly from coreutils-8.5/lib/unicodeio.c */
-  if (u32init == 0)
-    {
 # if HAVE_LOCALE_CHARSET
-      charset = locale_charset ();	/* XXX - fix later */
+  ncharset = locale_charset ();	/* XXX - fix later */
 # else
-      charset = stub_charset ();
+  ncharset = stub_charset ();
 # endif
+  /* this is mostly from coreutils-8.5/lib/unicodeio.c */
+  if (STREQ (charset, ncharset))
+    {
+      /* Free old charset str ? */
+      charset = ncharset;
       if (STREQ (charset, "UTF-8"))
 	utf8locale = 1;
       else
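The pattern I would expect for the cache: re-read the charset on every call, and rebuild the iconv descriptor only when it has actually changed. Roughly like this (an illustrative sketch of the idea, not the modification above or the eventual fix):

#include <iconv.h>
#include <string.h>

static char cached_charset[40];
static iconv_t localconv = (iconv_t) -1;

/* Rebuild the cached converter only when the destination charset
   has changed since the last call. */
static void
refresh_converter (const char *charset)
{
  if (strncmp (charset, cached_charset, sizeof cached_charset) != 0)
    {
      if (localconv != (iconv_t) -1)
        iconv_close (localconv);
      localconv = iconv_open (charset, "UTF-8");
      strncpy (cached_charset, charset, sizeof cached_charset - 1);
    }
}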
Fix u32toutf8 so it encodes values > 0xFFFF correctly.
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
Current u32toutf8 only encodes values up to 0xFFFF correctly. wchar_t can be of ambiguous size; better, in my opinion, to use unsigned long, or uint32_t, or something clearer.

Repeat-By:

Fix:
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..3f7d378 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -54,7 +54,7 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif
 
-static int u32init = 0;
+static int u32init = 0;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
@@ -115,26 +115,61 @@ u32tochar (wc, s)
 }
 
 int
-u32toutf8 (wc, s)
-     wchar_t wc;
+u32toutf8 (c, s)
+     unsigned long c;
      char *s;
 {
   int l;
 
-  l = (wc < 0x0080) ? 1 : ((wc < 0x0800) ? 2 : 3);
-
-  if (wc < 0x0080)
-    s[0] = (unsigned char)wc;
-  else if (wc < 0x0800)
+  if (c <= 0x7F)
+    {
+      s[0] = (char)c;
+      l = 1;
+    }
+  else if (c <= 0x7FF)
+    {
+      s[0] = (c >> 6) | 0xc0;           /* 110xxxxx */
+      s[1] = (c & 0x3f) | 0x80;         /* 10xxxxxx */
+      l = 2;
+    }
+  else if (c <= 0xFFFF)
+    {
+      s[0] = (c >> 12) | 0xe0;          /* 1110xxxx */
+      s[1] = ((c >> 6) & 0x3f) | 0x80;  /* 10xxxxxx */
+      s[2] = (c & 0x3f) | 0x80;         /* 10xxxxxx */
+      l = 3;
+    }
+  else if (c <= 0x1FFFFF)
     {
-      s[0] = (wc >> 6) | 0xc0;
-      s[1] = (wc & 0x3f) | 0x80;
+      s[0] = (c >> 18) | 0xf0;          /* 11110xxx */
+      s[1] = ((c >> 12) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[2] = ((c >> 6) & 0x3f) | 0x80;  /* 10xxxxxx */
+      s[3] = (c & 0x3f) | 0x80;         /* 10xxxxxx */
+      l = 4;
+    }
+  else if (c <= 0x3FFFFFF)
+    {
+      s[0] = (c >> 24) | 0xf8;          /* 111110xx */
+      s[1] = ((c >> 18) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[2] = ((c >> 12) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[3] = ((c >> 6) & 0x3f) | 0x80;  /* 10xxxxxx */
+      s[4] = (c & 0x3f) | 0x80;         /* 10xxxxxx */
+      l = 5;
+    }
+  else if (c <= 0x7FFFFFFF)
+    {
+      s[0] = (c >> 30) | 0xfc;          /* 1111110x */
+      s[1] = ((c >> 24) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[2] = ((c >> 18) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[3] = ((c >> 12) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[4] = ((c >> 6) & 0x3f) | 0x80;  /* 10xxxxxx */
+      s[5] = (c & 0x3f) | 0x80;         /* 10xxxxxx */
+      l = 6;
     }
   else
     {
-      s[0] = (wc >> 12) | 0xe0;
-      s[1] = ((wc >> 6) & 0x3f) | 0x80;
-      s[2] = (wc & 0x3f) | 0x80;
+      /* Error: not encodable as UTF-8 */
+      l = 0;
     }
   s[l] = '\0';
   return l;
@@ -150,7 +185,7 @@ u32cconv (c, s)
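The new branches are easy to spot-check by pushing one supplementary-plane code point through the same bit arithmetic; U+1F600, for example, must come out as f0 9f 98 80. A standalone test program (my own, not part of the patch):

#include <stdio.h>

int
main (void)
{
  unsigned long c = 0x1F600;    /* a code point above 0xFFFF */
  unsigned char s[4];

  s[0] = (c >> 18) | 0xF0;              /* 11110xxx */
  s[1] = ((c >> 12) & 0x3F) | 0x80;     /* 10xxxxxx */
  s[2] = ((c >> 6) & 0x3F) | 0x80;      /* 10xxxxxx */
  s[3] = (c & 0x3F) | 0x80;             /* 10xxxxxx */
  printf ("%02x %02x %02x %02x\n", s[0], s[1], s[2], s[3]); /* f0 9f 98 80 */
  return 0;
}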
Re: UTF-8 Encode problems with \u \U
On 02/18/2012 11:29 AM, Andreas Schwab wrote:
> John Kearney writes:
>
>> what I suggest will fix the UTF-8 case
>
> No, it won't.
>
>> and not affect the UTF-2 case.
>
> That is impossible.
>
> Andreas.
>

Current code:

if (uvalue <= UCHAR_MAX)
  *cp = uvalue;
else
  {
    temp = u32cconv (uvalue, cp);
    cp[temp] = '\0';
    if (lenp)
      *lenp = temp;
  }

Robust code:

temp = u32cconv (uvalue, cp);
cp[temp] = '\0';
if (lenp)
  *lenp = temp;

Compromise solution:

if (uvalue <= 0x7f)
  *cp = uvalue;
else
  {
    temp = u32cconv (uvalue, cp);
    cp[temp] = '\0';
    if (lenp)
      *lenp = temp;
  }

How can doing the direct assignment in fewer cases break anything? If it does, u32cconv is broken. And it does work for me, so "impossible" seems to be overstating it.
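The reason 0x7f is the safe cut-off where UCHAR_MAX is not: code points 0x00-0x7f encode to themselves in UTF-8, while 0x80-0xff already need two bytes. A standalone illustration of the boundary (my own example, not code from the thread):

#include <stdio.h>

int
main (void)
{
  unsigned c;

  /* around the boundary: single byte up to 0x7f, two bytes after */
  for (c = 0x7E; c <= 0x82; c++)
    {
      if (c <= 0x7F)
        printf ("U+%04X -> %02x\n", c, c);
      else
        printf ("U+%04X -> %02x %02x\n", c,
                0xC0 | (c >> 6), 0x80 | (c & 0x3F));
    }
  return 0;
}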
Re: UTF-8 Encode problems with \u \U
I know. To be honest I get a bad feeling about that code; I'm guessing it was done for performance reasons. Personally I'd just remove the special handling of any values and always call the encoding function, but I was trying for a minimalist solution. I mean, you could do something like:

#define MAX_SINGLE_BYTE_UTF8 0x7F
if (uvalue <= MAX_SINGLE_BYTE_UTF8)

I'm guessing the code was originally done for UTF-2 encoding. What I suggest will fix the UTF-8 case and not affect the UTF-2 case.

On 02/18/2012 11:11 AM, Andreas Schwab wrote:
> John Kearney writes:
>
>> Fix: diff --git a/builtins/printf.def b/builtins/printf.def index
>> 9eca215..b155160 100644 --- a/builtins/printf.def +++
>> b/builtins/printf.def @@ -859,7 +859,7 @@ tescape (estart, cp,
>> lenp, sawc) *cp = '\\'; return 0; } -if (uvalue <= UCHAR_MAX)
>> +if (uvalue <= CHAR_MAX)
>
> CHAR_MAX has nothing at all to do with UTF-8.
>
> Andreas.
UTF-8 Encode problems with \u \U
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../bash -I../bash/include -I../bash/lib -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
\u and \U incorrectly encode values between \u80 and \uff.

Repeat-By:
printf '%q\n' "$(printf '\uff')"
printf '%q\n' $'\uff'
# outputs $'\377' instead of $'\303\277'

Fix:
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..b155160 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,7 +859,7 @@ tescape (estart, cp, lenp, sawc)
           *cp = '\\';
           return 0;
         }
-      if (uvalue <= UCHAR_MAX)
+      if (uvalue <= CHAR_MAX)
         *cp = uvalue;
       else
         {
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..2e6e37b 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -144,7 +144,7 @@ ansicstr (string, len, flags, sawc, rlen)
                 *r++ = '\\';    /* c remains unchanged */
                 break;
               }
-            else if (v <= UCHAR_MAX)
+            else if (v <= CHAR_MAX)
               {
                 c = v;
                 break;
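For reference, U+00FF in UTF-8 is the two bytes 0xc3 0xbf, i.e. the $'\303\277' the Repeat-By expects. A quick standalone check of that expectation (my illustration, not part of the fix):

#include <stdio.h>

int
main (void)
{
  unsigned c = 0xFF;    /* U+00FF */

  /* two-byte UTF-8: 110xxxxx 10xxxxxx, printed in octal like %q does */
  printf ("\\%03o\\%03o\n", 0xC0 | (c >> 6), 0x80 | (c & 0x3F)); /* \303\277 */
  return 0;
}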