from:"John Kearney"

Aw: Re: autocomplete error doesn't look to be in bash-complete so I'm reporting it here.

2013-08-18 Thread John Kearney

   I got the file by running some code using procees substitution with set
   +o posix.
   I don't think the drectory was empty but what you say make sense and I
   didn't think to checkt it at the time.

   Thanks
   JOhn

   Gesendet: Sonntag, 18. August 2013 um 20:42 Uhr
   Von: "Chet Ramey" 
   An: dethrop...@web.de
   Cc: bug-bash@gnu.org, b...@packages.debian.org, chet.ra...@case.edu
   Betreff: Re: autocomplete error doesn't look to be in bash-complete so
   I'm reporting it here.
   On 8/16/13 5:28 AM, dethrop...@web.de wrote:
   > Bash Version: 4.2
   > Patch Level: 25
   > Release Status: release
   >
   > Description:
   > autocomplete error doesn't look to be in bash-complete so I'm
   reporting it here.
   >
   > Repeat-By:
   > touch '>(pygmentize -l text -f html )'
   > rm >[Press tab]
   >
   > rm >\>\(pygmentize\ -l\ text\ -f\ html\ \)
   > ^ Note leading >
   I'm going to assume you did this in a directory with no other files, so
   tab-completing nothing results in the filename that strangely resembles
   process substitution.
   If you don't quote the `>', bash interprets it as a redirection
   operator,
   as the parser would, and performs filename completion. The tab results
   in
   the single filename. If you were to backslash-quote the `>', you'd get
   the filename as you intended.
   Chet
   --
   ``The lyf so short, the craft so long to lerne.'' - Chaucer
   ``Ars longa, vita brevis'' - Hippocrates
   Chet Ramey, ITS, CWRU c...@case.edu
   [1]http://cnswww.cns.cwru.edu/~chet/

References

   1. http://cnswww.cns.cwru.edu/~chet/

Aw: Re: Re: Chained command prints password in Clear Text and breaks BASH Session until logout

2013-07-11 Thread John Kearney

   Typically when a program has this sort of a problem I just save and
   restore the context myself.

   SavedContext="$(stty -g )"

   read -sp "Password:" Password
   mysqldump -u someuser --password=${Password} somedb | mysql -u someuser
   --password=${Password} -D someotherdb

   # Restore Terminal Context.
   stty "${SavedContext}"


   And note your orginal example was wrong.

   -p in the following is to speciy the password
   mysqldump -u someuser -p somedb | mysql -u someuser -p -D someotherdb

   so you are saying the password to someuser is somedb and not giving a
   database.
   in the second case you are saying that the password to someuser is -D



   Gesendet: Donnerstag, 11. Juli 2013 um 20:05 Uhr
   Von: "Jason Sipula" 
   An: "John Kearney" 
   Cc: bug-bash@gnu.org
   Betreff: Re: Re: Chained command prints password in Clear Text and
   breaks BASH Session until logout
   Bingo.
   ~]# stty echo
   This fixed bash. So it does appear MySQL is disabling echo.Strange that
   it
   does not re-enable it after it's finished running. I'll take this up
   with
   the mysql folks.
   Thank you to everyone!
   On Thu, Jul 11, 2013 at 11:00 AM, John Kearney 
   wrote:
   > sounds like echo is turned off
   > try typing
   > stty +echo
   > when you you say you don't see any output.
   > And if its turned off it was probably turned off my mysql.
   > *Gesendet:* Donnerstag, 11. Juli 2013 um 19:53 Uhr
   > *Von:* "Jason Sipula" 
   > *An:* Kein Empfänger
   > *Cc:* bug-bash@gnu.org
   > *Betreff:* Re: Chained command prints password in Clear Text and
   breaks
   > BASH Session until logout
   > I probably should have filed two different reports for this. Sorry
   for any
   > confusion guys.
   >
   > The password makes sense to me why it allows clear text...
   >
   > The second issue is once the command terminates, bash session does
   not
   > behave normally at all. Nothing typed into the terminal over SSH or
   > directly on the console displays, however it does receive the keys.
   Also,
   > if you repeatedly hit ENTER key, instead of skipping to new line, it
   just
   > repeats the bash prompt over and over in a single line. So far
   restarting
   > bash session (by logging out then back in) is the only way I have
   found to
   > "fix" the session and return to normal functionality.
   >
   >
   > On Thu, Jul 11, 2013 at 10:47 AM, John Kearney 
   wrote:
   >
   > >
   > > This isn't a but in bash.
   > > firstly once a program is started it takes over the input so the
   fact
   > that
   > > your password is echoed to the terminal is because myspl allows it
   not
   > > bash, and in mysql defense this is the normal behaviour for command
   line
   > > tools.
   > >
   > > Secondly both mysqldump and mysql start at the same time and can
   > > potentially be reading the password also at the same time.
   > > on some systems and for some apps it could happen that.
   > >
   > > password for mysqldump p1234
   > > password for mysql p5678
   > >
   > > the way you are staring them you could potentially end up with
   > >
   > > mysqldump getting p5274
   > > mysql getting p1638
   > >
   > > basically you should give the password on the command line to
   mysql.
   > >
   > > something like
   > > read -sp "Password:" Password
   > > mysqldump -u someuser --password ${Password} -p somedb | mysql -u
   > someuser
   > > --password ${Password} -p -D someotherdb
   > >
   > > *Gesendet:* Mittwoch, 10. Juli 2013 um 23:54 Uhr
   > > *Von:* "Jason Sipula" 
   > > *An:* bug-bash@gnu.org
   > > *Betreff:* Chained command prints password in Clear Text and breaks
   BASH
   >
   > > Session until logout
   > > Configuration Information [Automatically generated, do not change]:
   > > Machine: x86_64
   > > OS: linux-gnu
   > > Compiler: gcc
   > > Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
   > > -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu'
   > > -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale'
   -DPACKAGE='bash'
   > > -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE
   > > -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
   -fexceptions
   > > -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
   -fwrapv
   > > uname output: Linux appsrv01.js.local 2.6.32-358.6.1.el6.x86_64 #1
   SMP
   > Tue
   > > Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
   > > Machine Typ

Aw: Re: Chained command prints password in Clear Text and breaks BASH Session until logout

2013-07-11 Thread John Kearney

   Sorry made a typo in the last email  I meant try
   stty echo




   sounds like echo is turned off
   try typing
   stty echo
   when you  you say you don't see any output.
   And if echoing is turned off it was probably turned off my mysql.
   Gesendet: Donnerstag, 11. Juli 2013 um 19:53 Uhr
   Von: "Jason Sipula" 
   An: Kein Empfänger
   Cc: bug-bash@gnu.org
   Betreff: Re: Chained command prints password in Clear Text and breaks
   BASH Session until logout
   I probably should have filed two different reports for this. Sorry for
   any
   confusion guys.
   The password makes sense to me why it allows clear text...
   The second issue is once the command terminates, bash session does not
   behave normally at all. Nothing typed into the terminal over SSH or
   directly on the console displays, however it does receive the keys.
   Also,
   if you repeatedly hit ENTER key, instead of skipping to new line, it
   just
   repeats the bash prompt over and over in a single line. So far
   restarting
   bash session (by logging out then back in) is the only way I have found
   to
   "fix" the session and return to normal functionality.
   On Thu, Jul 11, 2013 at 10:47 AM, John Kearney 
   wrote:
   >
   > This isn't a but in bash.
   > firstly once a program is started it takes over the input so the fact
   that
   > your password is echoed to the terminal is because myspl allows it
   not
   > bash, and in mysql defense this is the normal behaviour for command
   line
   > tools.
   >
   > Secondly both mysqldump and mysql start at the same time and can
   > potentially be reading the password also at the same time.
   > on some systems and for some apps it could happen that.
   >
   > password for mysqldump p1234
   > password for mysql p5678
   >
   > the way you are staring them you could potentially end up with
   >
   > mysqldump getting p5274
   > mysql getting p1638
   >
   > basically you should give the password on the command line to mysql.
   >
   > something like
   > read -sp "Password:" Password
   > mysqldump -u someuser --password ${Password} -p somedb | mysql -u
   someuser
   > --password ${Password} -p -D someotherdb
   >
   > *Gesendet:* Mittwoch, 10. Juli 2013 um 23:54 Uhr
   > *Von:* "Jason Sipula" 
   > *An:* bug-bash@gnu.org
   > *Betreff:* Chained command prints password in Clear Text and breaks
   BASH
   > Session until logout
   > Configuration Information [Automatically generated, do not change]:
   > Machine: x86_64
   > OS: linux-gnu
   > Compiler: gcc
   > Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
   > -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu'
   > -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale'
   -DPACKAGE='bash'
   > -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE
   > -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
   -fexceptions
   > -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
   -fwrapv
   > uname output: Linux appsrv01.js.local 2.6.32-358.6.1.el6.x86_64 #1
   SMP Tue
   > Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
   > Machine Type: x86_64-redhat-linux-gnu
   >
   > Bash Version: 4.1
   > Patch Level: 2
   > Release Status: release
   >
   > Description:
   >
   > Reproducible from both an SSH session as well as directly at the
   console.
   >
   > On BASH 4.1.x (4.1.2) running under CentOS 6.x (6.4 Final) and MySQL
   5.1.x
   > (5.1.69). I believe this bug will persist on all distros running BASH
   4.x.x
   >
   > After running the chained command (see below "Repeat-By" section),
   BASH
   > allows a password field to be seen in Clear Text, and then the BASH
   session
   > breaks until BASH session is restarted (logout then login).
   >
   > The purpose of the command is to dump the database "somedb" ... which
   would
   > normally dump to a text file for import later... but instead redirect
   > stdout to the stdin of the chained mysql command which will import
   all the
   > data from "somedb" into "someotherdb" on the same MySQL host. The
   command
   > works, but there's two problems.
   >
   > MySQL correctly challenges for password of "someuser" to perform the
   > mysqldump part, but once you type in the password and hit ENTER, it
   skips
   > to a new blank line without the shell prompt and just sits. It is
   waiting
   > for you to type in the password for "someuser" as the second part of
   the
   > command (but does not prompt for this and it's not intuitive, it
   appears
   > as-if the command is running)... If you type, it&

Aw: Chained command prints password in Clear Text and breaks BASH Session until logout

2013-07-11 Thread John Kearney


   This isn't a but in bash.
   firstly once a program is started it takes over the input so the fact
   that your password is echoed to the terminal is because myspl allows it
   not bash, and in mysql defense this is the normal behaviour for command
   line tools.

   Secondly both mysqldump  and mysql start at the same time and can
   potentially be reading the password also at the same time.
   on some systems and for some apps it could happen that.

   password for mysqldump p1234
   password for mysql  p5678

   the way you are staring them you could potentially end up with

   mysqldump getting p5274
   mysql getting  p1638

   basically you should give the password on the command line to mysql.
   something like
   read -sp "Password:" Password
   mysqldump -u someuser --password ${Password} -p somedb | mysql -u
   someuser --password ${Password} -p -D someotherdb

   Gesendet: Mittwoch, 10. Juli 2013 um 23:54 Uhr
   Von: "Jason Sipula" 
   An: bug-bash@gnu.org
   Betreff: Chained command prints password in Clear Text and breaks BASH
   Session until logout
   Configuration Information [Automatically generated, do not change]:
   Machine: x86_64
   OS: linux-gnu
   Compiler: gcc
   Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
   -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu'
   -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
   -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE
   -DRECYCLES_PIDS -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
   -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fwrapv
   uname output: Linux appsrv01.js.local 2.6.32-358.6.1.el6.x86_64 #1 SMP
   Tue
   Apr 23 19:29:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
   Machine Type: x86_64-redhat-linux-gnu
   Bash Version: 4.1
   Patch Level: 2
   Release Status: release
   Description:
   Reproducible from both an SSH session as well as directly at the
   console.
   On BASH 4.1.x (4.1.2) running under CentOS 6.x (6.4 Final) and MySQL
   5.1.x
   (5.1.69). I believe this bug will persist on all distros running BASH
   4.x.x
   After running the chained command (see below "Repeat-By" section), BASH
   allows a password field to be seen in Clear Text, and then the BASH
   session
   breaks until BASH session is restarted (logout then login).
   The purpose of the command is to dump the database "somedb" ... which
   would
   normally dump to a text file for import later... but instead redirect
   stdout to the stdin of the chained mysql command which will import all
   the
   data from "somedb" into "someotherdb" on the same MySQL host. The
   command
   works, but there's two problems.
   MySQL correctly challenges for password of "someuser" to perform the
   mysqldump part, but once you type in the password and hit ENTER, it
   skips
   to a new blank line without the shell prompt and just sits. It is
   waiting
   for you to type in the password for "someuser" as the second part of
   the
   command (but does not prompt for this and it's not intuitive, it
   appears
   as-if the command is running)... If you type, it's in clear text!
   Potentially a major security issue there.
   It gets worse...
   After you hit ENTER a second time, the command will finish, and it will
   return a fresh line with the shell prompt. Everything looks normal...
   but
   try typing. Nothing will show at all, however it is sending the keys to
   the
   shell and will execute commands if you type them in and hit ENTER. Each
   successful command will return you to a fresh shell line, but same
   thing
   happens until you log out and back in (to restart BASH). Also, while
   this
   is happening, you can hit the ENTER key over and over and BASH will
   just
   keep repeating the shell prompt on the same line.
   Repeat-By:
   At the shell, issue the command:
   ~]# mysqldump -u someuser -p somedb | mysql -u someuser -p -D
   someotherdb
   Shouldn't need to run that command as root, but the mysql user must be
   privileged enough to work with the two databases. To simplify things
   you
   can replace "someuser" with root.
   Thank you,
   Jason Sipula
   alup...@gmail.com

Aw: How to test if a link exists

2013-06-21 Thread John Kearney

   check out
   help test
   if you want to test fot both you can do

   [ -e file -o -h file ] || echo file not present.

   AFAIK the current behaviour is intentional and is the most useful.

   cheers


   Gesendet: Freitag, 21. Juni 2013 um 15:43 Uhr
   Von: "Mark Young" 
   An: bug-bash@gnu.org
   Betreff: How to test if a link exists
   Hi,
   I stumbled into discovering that the -e test for a file does not report
   the file as existing if the file is a dead symbolic link. This seems
   wrong to me.
   Here's some test code:-
   (WARNING it includes rm -f a b)
   #!/bin/bash
   bash --version
   echo ""
   rm -f a b
   ln -s b a
   [ -a a ] && echo "1. (test -a) File a exists, it's a dead link"
   [ -e a ] && echo "1. (test -e) File a exists, it's a dead link"
   [ -f a ] && echo "1. (test -f) File a exists, it's a dead link"
   touch b
   [ -a a ] && echo "2. (test -a) File a exists, it points to b"
   [ -e a ] && echo "2. (test -e) File a exists, it points to b"
   [ -f a ] && echo "2. (test -f) File a exists, it points to b"
   When run on my CentOS v5.9 system I get the following
   $ ./test
   GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
   Copyright (C) 2005 Free Software Foundation, Inc.
   2. (test -a) File a exists, it points to b
   2. (test -e) File a exists, it points to b
   2. (test -f) File a exists, it points to b
   When run on Cygwin I also get basically the same
   $ ./test
   GNU bash, version 4.1.10(4)-release (i686-pc-cygwin)
   Copyright (C) 2009 Free Software Foundation, Inc.
   License GPLv3+: GNU GPL version 3 or later
   <[1]http://gnu.org/licenses/gpl.html>
   This is free software; you are free to change and redistribute it.
   There is NO WARRANTY, to the extent permitted by law.
   2. (test -a) File a exists, it points to b
   2. (test -e) File a exists, it points to b
   2. (test -f) File a exists, it points to b
   My feeling is that this is wrong and that I should be told that a
   exists even though b doesn't. File 'a' does exist it is a dead symbolic
   link. So it prevents me for instance creating a symbolic link:-
   E.g.
   $ ln -s c a
   $ ls -l a b c
   ls: b: No such file or directory
   ls: c: No such file or directory
   lrwxrwxrwx 1 marky tools 1 Jun 21 14:41 a -> b
   Is this an error in bash?
   What test should I use to decide if a file exists (including dead
   symbolic links)?
   Cheers,
   Mark

References

   1. http://gnu.org/licenses/gpl.html

Aw: Re: currently doable? Indirect notation used w/a hash

2013-06-17 Thread John Kearney

   Like I said its a back door aproach, it circumvents the parser. which
   doesn't allow this syntax
   ${${Name}[1]}
   I didn't actually find this myself it was reproted on this list a long
   time ago. I do remember Chet saying he wouldn't break it. But other
   than that I can't remember the discussion all that well. As always with
   this topic it was a pretty lively debate.

   Yhea its a constant fight getting my email clients to stop
   capitialising various things in code.

   Gesendet: Montag, 17. Juni 2013 um 13:57 Uhr
   Von: "Greg Wooledge" 
   An: "Linda Walsh" 
   Cc: "John Kearney" , bug-bash 
   Betreff: Re: currently doable? Indirect notation used w/a hash
   On Sat, Jun 15, 2013 at 12:36:22PM -0700, Linda Walsh wrote:
   > John Kearney wrote:
   > >There is also a backdoor approach that I don't really advise.
   > >val="${ArrayName}[Index]"
   > >echo "${!val}"
   > -
   > Don't advise? Any particular reason? or stylistic?
   I'd shared this advice ("don't use it"), because I cannot for the
   life of me tell whether this is a bug or a feature. As near as I
   can tell, it is an unforeseen consequence of the parser implementation,
   not documented anywhere. As such, I would not rely on it to continue
   working in future Bash releases.
   P.S. you meant printf -v, not -V.

Aw: Re: `printf -v foo ""` does not set foo=

2013-06-17 Thread John Kearney

   Thats one of the reasons I suggested the following syntax
   printf -v test "%s" ""
   It doesn't have this problem it also saves other problems as well.
   of if you want to expand back slashes etc.
   printf -v test "%b" ""
   Gesendet: Montag, 17. Juni 2013 um 08:33 Uhr
   Von: "Linda Walsh" 
   An: bug-bash@gnu.org
   Betreff: Re: `printf -v foo ""` does not set foo=
   Mike Frysinger wrote:
   > simple test code:
   > unset foo
   > printf -v foo ""
   > echo ${foo+set}
   >
   > that does not display "set". seems to have been this way since the
   feature
   > was added in bash-3.1.
   > -mike
   
   Indeed:
   > set -u
   > unset foo
   > printf -v foo ""
   > echo $foo
   bash: foo: unbound variable
   > foo=""
   > echo $foo
   
   I have a feeling this would be hard to fix, since how can printf
   tell the difference between
   printf -v foo ""
   and
   printf -v foo
   ??
   (with nothing after it?) it seems the semantic parser would have
   already
   removed the quotes by the time the args are passed to printf, even
   this:
   > set -u
   > printf -v foo "$(echo "$'\000'")"
   > echo $foo
   still leaves foo gutless: without content (even if were null)

Aw: currently doable? Indirect notation used w/a hash

2013-06-15 Thread John Kearney


   Sorry forgot the bit to retrive values
   It is possible to retrive numeric values without eval
   i.e.
   val=$((${ArrayName}[Index]))
   works quiet well and is quick, in fact I used to use this quiet a lot.


   There is also a backdoor approach that I don't really advise.
   val="${ArrayName}[Index]"
   echo "${!val}"
   What I actually tend to do is.
 ks_array_ChkName() {
   local LC_COLLATE=C
   case "${1:?Missing Variable Name}" in
 [!a-zA-Z_]* | *[!][a-zA-Z_0-9]* ) return 3;;
   esac
 }
 ks_val_Get() {
   ks_array_ChkName "${1:?Missing Destination Variable Name}" ||
   return $?
   ks_array_ChkName "${2:?Missing Source Variable Name}" || return $?
   eval "${1}=\"\${${2}:-}\""
 }
 ks_array_GetVal() { ks_val_Get "${1}" "${2}[${3}]" ; }
 ks_array_SetVal() { ks_val_Set "${1}[${2}]" "${3:-}" ; }

   Cheers

   Gesendet: Samstag, 15. Juni 2013 um 15:03 Uhr
   Von: "John Kearney" 
   An: "Linda Walsh" 
   Cc: bug-bash 
   Betreff: Aw: currently doable? Indirect notation used w/a hash
   In bash there are 2 options that I use.
   1.
   ArrayName=blah
   printf -V "${ArrayName}[Index]" "%s" "Value To Set"
   2.
   ks_val_ChkName() {
   local LC_COLLATE=C
   case "${1:?Missing Variable Name}" in
   [!a-zA-Z_]* | *[!a-zA-Z_0-9]* | '' ) return 3;;
   esac
   }
   ks_array_SetVal() {
   ks_val_ChkName "${1:?Missing Array Name}" || return $?
   ks_val_ChkName "a${2:?Missing Array Index}" || return $?
   eval "${1}"["${2}"]'="${3:-}"'
   }
   ks_array_SetVal "${ArrayName}" "Index" "Value"
   Cheers
   Gesendet: Sonntag, 09. Juni 2013 um 23:02 Uhr
   Von: "Linda Walsh" 
   An: bug-bash 
   Betreff: currently doable? Indirect notation used w/a hash
   I was wondering if I was missing some syntax somewhere...
   but I wanted to be able to pass the name of a hash in
   and store stuff in it and later retrieve it... but it
   looks like it's only possible with an eval or such?
   Would be nice(??)*sigh*

Aw: currently doable? Indirect notation used w/a hash

2013-06-15 Thread John Kearney

   In bash there are 2 options that I use.

   1.
   ArrayName=blah

   printf -V "${ArrayName}[Index]" "%s" "Value To Set"


   2.
 ks_val_ChkName() {
   local LC_COLLATE=C
   case "${1:?Missing Variable Name}" in
 [!a-zA-Z_]* | *[!a-zA-Z_0-9]* | '' ) return 3;;
   esac
 }
  ks_array_SetVal() {
 ks_val_ChkName "${1:?Missing Array Name}" || return $?
 ks_val_ChkName "a${2:?Missing Array Index}" || return $?
 eval "${1}"["${2}"]'="${3:-}"'
   }
   ks_array_SetVal "${ArrayName}" "Index" "Value"

   Cheers

   Gesendet: Sonntag, 09. Juni 2013 um 23:02 Uhr
   Von: "Linda Walsh" 
   An: bug-bash 
   Betreff: currently doable? Indirect notation used w/a hash
   I was wondering if I was missing some syntax somewhere...
   but I wanted to be able to pass the name of a hash in
   and store stuff in it and later retrieve it... but it
   looks like it's only possible with an eval or such?
   Would be nice(??)*sigh*

Aw: Re: nested while loop doesn't work

2013-06-04 Thread John Kearney

   as greg say this is the wrong list you need to report this to
   " Vim syntax file
   " Language:shell (sh) Korn shell (ksh) bash (sh)
   " Maintainer:Dr. Charles E. Campbell, Jr.
   
   " Previous Maintainer:Lennart Schultz 
   " Last Change:Dec 09, 2011
   " Version:121
   " URL:
   [1]http://mysite.verizon.net/astronaut/vim/index.html#vimlinks_syntax

   the file that does this is /usr/share/vim/vim73/syntax/sh.vim in an
   ubuntu system.


   and this is probably a simpler example to work with
   #!/bin/bash
while true; do
 while true; do
 done
 until true; do
 done
done
until true; do
  while true; do
  done
 until true; do
 done
done
   cheers



   Gesendet: Dienstag, 04. Juni 2013 um 22:15 Uhr
   Von: "Greg Wooledge" 
   An: kartik...@gmail.com
   Cc: bug-bash@gnu.org, b...@packages.debian.org
   Betreff: Re: nested while loop doesn't work
   On Tue, Jun 04, 2013 at 04:39:31PM +0530, kartik...@gmail.com wrote:
   > Description:
   > A while inside a while loop (nested while) doesnt work and also the
   vim /gvim doesnt highlight the second while loop
   For issues with the vim/gvim highlighting, you'd need to report the
   problem in vim, not in bash.
   > example code is given
   >
   > while [ "ka" = $name ]
   > do
   > echo "nothing\n"
   > while [ "ka" = $name ] //this while is not highlighted
   > do
   > echo "everything\n"
   > done
   > done
   You have a quoting mistake here. "$name" should be quoted, or this will
   fail if the variable contains multiple words separate by spaces.
   imadev:~$ name="first last"
   imadev:~$ [ "ka" = $name ]
   bash-4.3: [: too many arguments
   This code should work:
   while [ "ka" = "$name" ]
   do
   printf "nothing\n\n"
   while [ "ka" = "$name" ]
   do
   printf "everything\n\n"
   done
   done
   (It goes into an infinite loop when name=ka, but presumably that's what
   you wanted.)

References

   1. 
https://3c.web.de/mail/client/dereferrer?redirectUrl=http%3A%2F%2Fmysite.verizon.net%2Fastronaut%2Fvim%2Findex.html%23vimlinks_syntax&selection=tfol11a7bad28b16cbfe

Re: Bash4: Problem retrieving "$?" when running with "-e"

2013-04-12 Thread John Kearney

Am 12.04.2013 18:26, schrieb Lenga, Yair:
> Chet,
>
> Sorry again for pulling the wrong Bash 4 doc. 
>
> Based on the input, I'm assuming that the portable way (bash 3, bash 4 and 
> POSIX) to retrieve $? When running under "-e" is to use the PIPEr
> CMD_STAT=0 ; GET_MAIN_DATA || CMD_STAT=$?
That isn't a pipe its a logical or, it means if the first command
returns non 0 execute the next command.
as the assignment will not fail it avoids the problem.
as such you could also do

GET_MAIN_DATA || GET_BACKUP_DATA

or 

if ! GET_MAIN_DATA ; then
  GET_BACKUP_DATA
fi

Re: Bash4: Problem retrieving "$?" when running with "-e"

2013-04-12 Thread John Kearney

Am 12.04.2013 13:44, schrieb Lenga, Yair:
> Good Morning,
>
> I've encountered another interesting change in behavior between Bash3 and 
> Bash4. I hope that you can help me:
>
> The core question is how to retrieve the status of a command, when running 
> with '-e'
>
> For production critical jobs, we run the script in '-e', to ensure that all 
> steps are successful. For cases where we allow the command to fail, because 
> we can implement backup, we add explicit error handing. For example.
>
> set -ue
> CHECK_SPACE
>  (FETCH_NEW_DATA)
> If [ $?  = 11 ] ; then
>   FETCH_BACKUP_DATA
> fi
> REMOVE_OLD_DATA
> COPY_NEW_TO_OLD
>
> In Bash3, the script could retrieve the return code for FETCH_NEW_DATA, by 
> placing it into a sub-shell, and then examining the value of "$?".
>
> In Bash4, the FETCH_NEW_COMMAND failure cause the parent script to fail.
>
> The man page says that '-e' will "exit immediately if a simple command (note 
> Simple Command::) exits with non-zero status unless ...".
> The "simple commands" definition is a "sequence of words separate by blanks 
> ...". According to this definition, the sequence "( simple command )" 
> Is NOT a simple command, and should NOT  trigger the "immediate exit".
>
> Can anyone comment on my interpretation. Is there alternative solution that 
> will allow retrieval of the status of single commands when running
> With the '-e' ?
>
> Thanks
> Yair Lenga
>
>
>
>
try this approach

set -ue
CHECK_SPACE
RVAUE=0
 (FETCH_NEW_DATA) || RVALUE=$?
If [ $RVALUE  = 11 ] ; then
FETCH_BACKUP_DATA
fi
REMOVE_OLD_DATA
COPY_NEW_TO_OLD

Re: weird problem -- path interpretted/eval'd as numeric expression

2013-03-29 Thread John Kearney

Am 29.03.2013 18:53, schrieb Linda Walsh:
>
> Greg Wooledge wrote:
>> On Fri, Mar 29, 2013 at 12:41:46AM -0700, Linda Walsh wrote:
>>> include was designed to search the path for functions that
>>> are relative paths.  While the normal sourcepath allows searching for
>>> filenames on the search path, I don't believe (please correct if I am wrong
>>> and this works now, as it would make life much simpler) that the PATH will
>>> be searched if you give it something like:
>>>
>>> source lib/Util/sourcefile.shh
>> Is that all you want?  Here:
>>
>> include() {
>> local paths dir
>> IFS=: read -ra paths <<< "$PATH"
>> for dir in "${paths[@]}"; do
>>  if [[ -r $dir/$1 ]]; then
>>  source "$dir/$1"
>>  return
>>  fi
>> done
>> echo "could not find '$1' in PATH" >&2
>> return 1
>> }
> --
> It  also doesn't keep track of the previously sourced files so as to
> not 're-source' them if one of the files you 'source' also sources a file.
>
> It also allows one to optionally leave off the extension, but other than
> those additions... yeah... that's close...
>
> The idea is *mainly* to be able to read in functions and aliases..
>
> Vars expected to 'survive' for those funcs or aliases are exported...but
> that may not be enough to get them out of the local context...not sure.
>

Like this then ?

unset INCLUDED ; 
declare -A INCLUDED
find_file() {
local dir 
FOUND_FILE=""
[ $((INCLUDED[${1%.sh}]+=1)) -eq 1 ] || return 1
while IFS= read -rd ':' dir ;do
#echo "trying : ${dir}/${1%.sh}.sh"
[[ -r ${dir}/${1%.sh}.sh ]] || continue
FOUND_FILE="${dir}/${1%.sh}.sh"
echo "found : ${FOUND_FILE}"
done <<< "${PATH}"
[ -n "${FOUND_FILE:-}" ] || echo "could not find '$1' in PATH" >&2
return ${FOUND_FILE+1}
} && echo 'find_file "${1:?Missing File Name }" && source "${FOUND_FILE}"' 
>/tmp/source_wrapper.sh && alias include=source\ "/tmp/source_wrapper.sh"


I actually tested this one and it seems to work ok.

Re: weird problem -- path interpretted/eval'd as numeric expression

2013-03-29 Thread John Kearney

Am 29.03.2013 16:36, schrieb Greg Wooledge:
> On Fri, Mar 29, 2013 at 04:10:22PM +0100, John Kearney wrote:
>> consider
>> dethrophes@dethace ~
>> $ read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
>>
>> dethrophes@dethace ~
>> $ echo ${vals[0]}
>> lkjlksda
> You forgot to set IFS=: for that read.
>
> imadev:~$ IFS=: read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
> imadev:~$ declare -p vals
> declare -a vals='([0]="lkjlksda\
>  adasd\
> " [1]="sdasda" [2]="\
> ")'
>
>> I meant to update your wiki about it but I forgot.
>> I guess read uses gets not fread and that truncates the line anyway.
> No, that's not correct.
>
>>>> cat </source_wrapper.sh
>>>> find_file "${1:?Missing File Name }" || return $?
>>>> source "${FOUND_FILE}"
>>>> EOF
>>>> alias include=source\ "/source_wrapper.sh"
>>> The <>> include the definition of find_file in the wrapper script.
>> ?? why <<'EOF' ??
> Because if you don't quote any of the characters in the here document
> delimiter, the expansions such as "${FOUND_FILE}" will be done by the
> shell that's processing the redirection.  I believe you want the code
> to appear in the output file.  Therefore you want to quote some or all
> of the characters in the delimiter.
>
> Compare:
>
> imadev:~$ cat <> echo "$HOME"
>> EOF
> echo "/net/home/wooledg"
>
> imadev:~$ cat <<'EOF'
>> echo "$HOME"
>> EOF
> echo "$HOME"
Didn't know that, Actually I forgot to escape them in my example.
>
> On Fri, Mar 29, 2013 at 04:18:49PM +0100, John Kearney wrote:
>> Oh and FYI
>> IFS=: read
>> may change the global IFS on some shells I think.
>> Mainly thinking of pdksh right now.
> If those shells have such a bug, then you'd need to bring it up on THEIR
> bug mailing list.  This is bug-bash. ;-)
It was just a warning, and I don't think there is a pdksh bug list. 
>
> In any case, I've never seen such a bug, and the pdksh to which I have
> access does not display it:
>
> ...
> Get:1 http://ftp.us.debian.org/debian/ squeeze/main pdksh i386 5.2.14-25 [265 
> kB]
> ...
> arc3:~$ pdksh
> \h:\w$ echo a:b:c > /tmp/frob
> \h:\w$ IFS=: read a b < /tmp/frob
> \h:\w$ rm /tmp/frob
> \h:\w$ echo "$IFS"
> 
>
> \h:\w$ 
>
> This is a fundamental feature that's commonly used.  If it were so
> egregiously broken I think more people would have noticed it.

try this
f(){ echo "ddd${IFS}fff"; } ; f ; IFS=KKK f; f
This just didn't work as I would expect on ubuntu pdksh.

I didn't look into it regarding builtins, I just stopped using that
feature, as it seemed to be wonky.

the original platform I saw it on was QNX6.5.0/BB10.

Re: weird problem -- path interpretted/eval'd as numeric expression

2013-03-29 Thread John Kearney

Am 29.03.2013 16:23, schrieb Pierre Gaston:
> On Fri, Mar 29, 2013 at 5:10 PM, John Kearney  wrote:
>> consider
>> dethrophes@dethace ~
>> $ read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
>>
>> dethrophes@dethace ~
>> $ echo ${vals[0]}
>> lkjlksda
>>
>> I meant to update your wiki about it but I forgot.
>> I guess read uses gets not fread and that truncates the line anyway.
>>
> you miss the IFS part:
> IFS=: read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'
> echo "${vals[0]}"
>
> (IFS contains \n by default)
Ok that works, I must have somehow misunderstood the description.
Oh well thanks that makes he world a little more sane.

Re: weird problem -- path interpretted/eval'd as numeric expression

2013-03-29 Thread John Kearney

Oh and FYI
IFS=: read
may change the global IFS on some shells I think.
Mainly thinking of pdksh right now.

IFS=: ls   # local

ls_wrap(){
  ls
}

IFS=: ls_wrap # Changes global IFS

I think it was the same with builtins, but not sure right now.

Thats why I always use wrapper functions and local to do that sort of
thing now.




Am 29.03.2013 15:30, schrieb Greg Wooledge:
> On Fri, Mar 29, 2013 at 03:11:07PM +0100, John Kearney wrote:
>> Actually I've had trouble
>>
>> IFS=: read -ra paths <<< "$PATH"
>>
>> and embedded new lines.
> A directory with a newline in its name, in your PATH?  Terrifying.
>
>> I think this is better
>> find_file() {
>> local IFS=:
>> for dir in $PATH; do
> But that one's vulnerable to globbing issues if a directory has a
> wildcard character in its name.  If you're concerned about newlines
> then you should be just as concerned with ? or *, I should think.
>
> Workarounds:
>
>  1) In yours, use set -f and set +f around unquoted $PATH to suppress
> globbing.
>
>  2) In mine, use -d '' on the read command, and manually strip the
> trailing newline that <<< adds to the final element.
>
>  3) In mine, use -d '' on the read command, and use < <(printf %s "$PATH")
> so there isn't an added trailing newline to strip.
>
>> Ideally what I want to do is
>> alias include=source\ "$(find_file "${1}")"
>> but that doesn't work in bash and I still haven't found a way around the
>> problem.
> I can't think of an alias workaround off the top of my head either.
> Even Simon Tatham's "magic aliases" require a helper function, which leads
> back to the variable scope issue, the avoidance of which was the whole
> reason to attempt an alias (instead of a function) in the first place
>
>> The only way I can think to do it is to use a second file.
>>
>> cat </source_wrapper.sh
>> find_file "${1:?Missing File Name }" || return $?
>> source "${FOUND_FILE}"
>> EOF
>> alias include=source\ "/source_wrapper.sh"
> The < include the definition of find_file in the wrapper script.
>

Re: weird problem -- path interpretted/eval'd as numeric expression

2013-03-29 Thread John Kearney

Am 29.03.2013 15:30, schrieb Greg Wooledge:
> On Fri, Mar 29, 2013 at 03:11:07PM +0100, John Kearney wrote:
>> Actually I've had trouble
>>
>> IFS=: read -ra paths <<< "$PATH"
>>
>> and embedded new lines.
> A directory with a newline in its name, in your PATH?  Terrifying.
why not :) its a great way to make sure only my scripts work on my system;).

>> I think this is better
>> find_file() {
>> local IFS=:
>> for dir in $PATH; do
> But that one's vulnerable to globbing issues if a directory has a
> wildcard character in its name.  If you're concerned about newlines
> then you should be just as concerned with ? or *, I should think.
Strangely enough I that hasn't been as much of a problem. But a good point.
> Workarounds:
>
>  1) In yours, use set -f and set +f around unquoted $PATH to suppress
> globbing.
I actually have that in my code :( coding off the top of your head is
always a bit sloppy :).
>
>  2) In mine, use -d '' on the read command, and manually strip the
> trailing newline that <<< adds to the final element.
consider
dethrophes@dethace ~
$ read -ra vals -d '' <<< $'lkjlksda\n adasd\n:sdasda:'

dethrophes@dethace ~
$ echo ${vals[0]}
lkjlksda

I meant to update your wiki about it but I forgot.
I guess read uses gets not fread and that truncates the line anyway.
>
>  3) In mine, use -d '' on the read command, and use < <(printf %s "$PATH")
> so there isn't an added trailing newline to strip.
>
>> Ideally what I want to do is
>> alias include=source\ "$(find_file "${1}")"
>> but that doesn't work in bash and I still haven't found a way around the
>> problem.
> I can't think of an alias workaround off the top of my head either.
> Even Simon Tatham's "magic aliases" require a helper function, which leads
> back to the variable scope issue, the avoidance of which was the whole
> reason to attempt an alias (instead of a function) in the first place
I'm actually almost convinced that it just isn't possible.
>> The only way I can think to do it is to use a second file.
>>
>> cat </source_wrapper.sh
>> find_file "${1:?Missing File Name }" || return $?
>> source "${FOUND_FILE}"
>> EOF
>> alias include=source\ "/source_wrapper.sh"
> The < include the definition of find_file in the wrapper script.
?? why <<'EOF' ??

No I don't need to include the function I would declare it like this.

find_file() {
local dir IFS=:
FOUND_FILE=""
set -f
for dir in $PATH; do
[[ ! -r ${dir}/$1 ]] || continue
FOUND_FILE="${dir}/$1"
done
set +f
[ -z "${FOUND_FILE:-}" ] || echo "could not find '$1' in PATH" >&2
return ${FOUND_FILE:+1}
}&& echo 'find_file "${1:?Missing File Name }" && source "${FOUND_FILE}"' 
>/tmp/source_wrapper.sh && alias include=source\ "/tmp/source_wrapper.sh"

Re: weird problem -- path interpretted/eval'd as numeric expression

2013-03-29 Thread John Kearney

Am 29.03.2013 12:57, schrieb Greg Wooledge:
> On Fri, Mar 29, 2013 at 12:41:46AM -0700, Linda Walsh wrote:
>>  include was designed to search the path for functions that
>> are relative paths.  While the normal sourcepath allows searching for
>> filenames on the search path, I don't believe (please correct if I am wrong
>> and this works now, as it would make life much simpler) that the PATH will
>> be searched if you give it something like:
>>
>> source lib/Util/sourcefile.shh
> Is that all you want?  Here:
>
> include() {
> local paths dir
> IFS=: read -ra paths <<< "$PATH"
> for dir in "${paths[@]}"; do
>   if [[ -r $dir/$1 ]]; then
>   source "$dir/$1"
>   return
>   fi
> done
> echo "could not find '$1' in PATH" >&2
> return 1
> }
>
Actually I've had trouble

IFS=: read -ra paths <<< "$PATH"

and embedded new lines.

I think this is better
find_file() {
local IFS=:
for dir in $PATH; do
[[ ! -r $dir/$1 ]] || continue
FOUND_FILE="$dir/$1"
return 0
done
echo "could not find '$1' in PATH" >&2
return 1
}
include() {
local FOUND_FILE
find_file "${1:?Missing File Name }" || return $?
source "${FOUND_FILE}"
}
includeExt() {
local FOUND_FILE
local PATH=${1:?Missing Search Path}
find_file "${2:?Missing File Name}" || return $?
source "${FOUND_FILE}"
}

Ideally what I want to do is
alias include=source\ "$(find_file "${1}")"
but that doesn't work in bash and I still haven't found a way around the
problem.


The only way I can think to do it is to use a second file.

cat

Re: gnu parallel in the bash manual

2013-03-05 Thread John Kearney

Am 06.03.2013 01:03, schrieb Linda Walsh:
>
> John Kearney wrote:
>> The example is bad anyway as you normally don't want to parallelize disk
>> io , due to seek overhead and io bottle neck congestion. This example
>> will be slower and more likely to damage your disk than simply using mv
>> on its own. but thats another discussion.
> ---
>   That depends on how many IOPS your disk subsystem can
> handle and how much cpu is between each of the IO calls.
> Generally, unless you have a really old, non-queuing disk,
>> 1 procs will be of help.  If you have a RAID, it can go
> up with # of data spindles (as a max, though if all are reading
> from the same area, not so much...;-))...
>
>
>   Case in point, I wanted to compare rpm versions of files
> on disk in a dir to see if there were duplicate version, and if so,
> only keep the newest (highest numbered) version) (with the rest
> going into a per-disk recycling bin (a fall-out of sharing
> those disks to windows and implementing undo abilities on
> the shares (samba, vfs_recycle).
>
>   I was working directories with 1000's of files -- (1 dir,
> after pruning has 10,312 entries).  Sequential reading of those files
> was DOG slow.
>
>   I parallelized it (using perl) first by sorting all the names,
> then breaking it into 'N' lists -- doing those in parallel, then
> merging the results (and comparing end-points -- like end of one list
> might have been diff-ver from beginning of next).  I found a dynamic
> 'N' based on max cpu load v.disk (i.e. no matter how many procs I
> threw at it, it still used about 75% cpu).
>
> So I chose 9:
>
> Hot cache:
> Read 12161 rpm names.
> Use 1 procs w/12162 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
>  Cumulative  This Phase  ID
>  0.000s  0.000s  Init
>  0.000s  0.000s  start_program
>  0.038s  0.038s  starting_children
>  0.038s  0.001s  end_starting_children
>  8.653s  8.615s  endRdFrmChldrn_n_start_re_sort
>  10.733s 2.079s  afterFinalSort
> 17.94sec 3.71usr 6.21sys (55.29% cpu)
> ---
> Read 12161 rpm names.
> Use 9 procs w/1353 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
>  Cumulative  This Phase  ID
>  0.000s  0.000s  Init
>  0.000s  0.000s  start_program
>  0.032s  0.032s  starting_children
>  0.036s  0.004s  end_starting_children
>  1.535s  1.500s  endRdFrmChldrn_n_start_re_sort
>  3.722s  2.187s  afterFinalSort
> 10.36sec 3.31usr 4.47sys (75.09% cpu)
>
> Cold Cache:
> 
> Read 12161 rpm names.
> Use 1 procs w/12162 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
>  Cumulative  This Phase  ID
>  0.000s  0.000s  Init
>  0.000s  0.000s  start_program
>  0.095s  0.095s  starting_children
>  0.096s  0.001s  end_starting_children
>  75.067s 74.971s endRdFrmChldrn_n_start_re_sort
>  77.140s 2.073s  afterFinalSort
> 84.52sec 3.62usr 6.26sys (11.70% cpu)
> 
> Read 12161 rpm names.
> Use 9 procs w/1353 items/process
> #pkgs=10161, #deletes=2000, total=12161
> Recycling 2000 duplicates...Done
>  Cumulative  This Phase  ID
>  0.000s  0.000s  Init
>  0.000s  0.000s  start_program
>  0.107s  0.107s  starting_children
>  0.112s  0.005s  end_starting_children
>  29.350s 29.238s endRdFrmChldrn_n_start_re_sort
>  31.497s 2.147s  afterFinalSort
> 38.27sec 3.35usr 4.47sys (20.47% cpu)
>
> ---
> hot cache savings: 42%
> cold cache savings: 55%
>
>
>
>
>
Different use case you can't really compare mv to data processing.  And
Generally it is a bad idea, unless yo know what you are doing.
trying to parallelize
mv /*  
Is a bad idea unless you are on some expensive hardware. This is because
of the sequential nature of the access model.

Your use case was a sparse access model and there is normally no
performance penalty to interleaving sparse access methods.

Depending on the underlying hardware it can be very costly to interleave
sequential access streams especially on embedded devices e.g. emmc.
Not to mention the sync object overhead you may be incurring in the fs
driver and or hardware driver.


With 13000 files in one directory you must have been taking a dir list
and file open access penalty. What fs was t

Re: gnu parallel in the bash manual

2013-03-03 Thread John Kearney

Am 03.03.2013 01:40, schrieb Chet Ramey:
>> this is actually more disturbing.
>>
>> ls | parallel mv {} destdir
>>
>> find -type f -print0 | xargs -0 -I{} -P  /bin/mv {} 
> If we're really going to pick nits here, those two aren't really identical.
>
> You'd probably want something like
>
> find . -depth 1 \! -name '.*' -print0
>
> to start.
>
> Chet
>
Sure your right what I showed wasn't a 1 to 1 functional replacement,
but then again most times I see ppl using a ls | syntax they actually
don't intend that functionality, its a side effect of them not knowing
how to use a better syntax.
The example is bad anyway as you normally don't want to parallelize disk
io , due to seek overhead and io bottle neck congestion. This example
will be slower and more likely to damage your disk than simply using mv
on its own. but thats another discussion.

with regards to nit picking, considering how much effort is made on this
mailing list and help-bash to give filename safe examples, its hardly
nitpicking to expect the examples in the bash manual to be written to
the same standard.

Re: export in posix mode

2013-02-27 Thread John Kearney

Am 27.02.2013 22:39, schrieb James Mason:
> On 02/27/2013 04:00 PM, Bob Proulx wrote:
>> Eric Blake wrote:
>>> James Mason wrote:
 I certainly could be doing something wrong, but it looks to me like
 bash
 - when in Posix mode - does not suppress the "-n" option for export.
 The version of bash that I'm looking at is 3.2.25.
>>> So what?  Putting bash in posix mode does not require bash to instantly
>>> prohibit extensions.  POSIX intentionally allows for implementations to
>>> provide extensions, and 'export -n' is one of bash's extensions.
>>> There's no bug here, since leaving the extension always enabled does
>>> not
>>> conflict with subset of behavior required by POSIX.
>> If you are looking to try to detect non-portable constructs then you
>> will probably need to test against various shells including ash.  (If
>> on Debian then use dash.)
>>
>>https://en.wikipedia.org/wiki/Almquist_shell
>>
>> The posh shell was constructed specifically to be as strictly
>> conforming to posix as possible.  (Making it somewhat less than useful
>> in Real Life but it may be what you are looking for.)  It is Debian
>> specific in origin but should work on other systems.
>>
>>http://packages.debian.org/sid/posh
>>http://anonscm.debian.org/gitweb/?p=users/clint/posh.git;a=summary
>>
>> Bob
>
> We considered setting up another shell as the implementation of
> "/bin/sh", but that's hazardous in the context of vast amounts of
> boot-time initialization scripting that hasn't been vetted as to
> avoidance of bash-isms.
>
> Changing product script code - just so you can look for these sorts of
> things - isn't practical (or safe) either.
>
> So I guess if you take the view that bash POSIX mode exists only to
> make bash accept POSIX scripts, and not to preclude/warn about
> behavior that isn't going to be acceptable elsewhere, then you're
> right - it's not a bug.   If you care about helping people to be able
> to write scripts that work various places and don't exceed the POSIX
> specification, you're unhelpfully wrong (and you might contemplate why
> "bashisms" gives > 50K google hits).
>
> -jrm
>
>
>
bash posix more just changes bash behaviour that is incompatible with
the posix spec. Nothing more or less.
There are other shells for doing what you seem to want as has already
been stated.
Namely dash and posh.

Re: gnu parallel in the bash manual

2013-02-25 Thread John Kearney

Am 26.02.2013 03:36, schrieb Linda Walsh:
>
> Chet Ramey wrote:
>> On 2/25/13 8:07 PM, Linda Walsh wrote:
>>> Chet Ramey wrote:
 On 2/16/13 3:50 AM, Pierre Gaston wrote:
> I don't quite see the point of having gnu parallel discussed in the
> bash reference manual.
 I was asked to add that in May, 2010 by Ole Tange and Richard Stallman.
>>> 
>>> Maybe now that it was done, it can be removed?
>> I'm pretty sure that wasn't the intent of the original request.  Let's
>> see if we can clean it up instead.
> 
>   I'm sure, but you edited out the rest of my reasoning.
> Note -- I don't feel strongly about this, one way or the other,
> but at the same time, I don't feel their request, nor your response
> are the best ones to take from an engineering or product perspective --
> in part -- directly because of the confusion about whether or not parallel
> is bundled w/bash or not.
>
> Using it in an example would be fine... but make a section out it?  That's
> a fairly strong implication for it being something that's part of bash's
> official release or product.
>
> I realize this matter is more political than technical, but still, I would
> try to ask those questions of the original requestors and see if they might
> not revisit their requests...?  you could even say -- a user asked if
> including parallel in it's own section in the manpage meant that parallel was
> going to be part of the bash distribution?
>
> I mean it wouldn't surprise me or seem unreasonable if it was included in
> the bash distribution (from a lay-person perspective).  Knowing it's a
> perl-script, I'd be a bit surprised, personally, but hey, I've been wondering
> if you are going to embed the perl interpreter in bash as a dynamically
> loadable .so and allow perl-expressions on the command line as an option...
> *str8-face*...
>
>
>
>
>
I up vote the perl integration into bash :|.

Re: gnu parallel in the bash manual

2013-02-16 Thread John Kearney

Am 16.02.2013 09:50, schrieb Pierre Gaston:
> I don't quite see the point of having gnu parallel discussed in the
> bash reference manual.
> http://www.gnu.org/software/bash/manual/bashref.html#GNU-Parallel
> I don't argue that it can be a useful tool, but then you might as well
> discuss sed awk grep make find etc..
> Or even the ones not part of the standard toolset  since parallel is
> not installed by default even on the linux distribution I know: flock
> fdupes recode convmv rsync etc...
Actually xargs could do everything listed better. and is installed by
default on most systems.
> On top of that the examples teach incorrect things eg, "the common
> idioms that operate on lines read from a file"(sic)
>
> for x in $(cat list); do
>
> doesn't even read lines!
this is actually more disturbing.

ls | parallel mv {} destdir

find -type f -print0 | xargs -0 -I{} -P  /bin/mv {} 
>
> I'd say this should be removed.
>
Or the examples should at least be fixed. there are terrible practices
being shown there.

Re: builtin "read -d" behaves differently after "set -e#

2013-02-06 Thread John Kearney

Am 06.02.2013 14:46, schrieb Greg Wooledge:
> On Wed, Feb 06, 2013 at 12:39:45AM +0100, Tiwo W. wrote:
>> When using this in a script of mine, I noticed that this fails
>> when errexit is set ("set -e").
> Most things do.  set -e is crap.  You should consider not using it.
>
>> * why does it work with "set +e" ?
> Because set +e disables the crap.
>
>> * what is the recommended way to disable splitting with "read"?
> What splitting?  You only gave a single variable.  There is no field
> splitting when you only give one variable.
>
>> set -e
>> read -d '' var2 <>but
>>this
>>fails
>> EOF
>> echo "$var2"
> Are you actually asking how to force read to slurp in an entire file
> including newlines, all at once?  Is that what you meant by "splitting"?
>
> Well, you already found your answer -- stop using set -e.  By the way,
> you may also want to set IFS to an empty string to disable the trimming
> of leading and trailing whitespace, and use the -r option to suppress
> special handling of backslashes.  Thus:
>
>   IFS= read -rd '' var2 <
> In case you're curious why set -e makes it fail:
>
>   imadev:~$ IFS= read -rd '' foo <   > blah
>   > EOF
>   imadev:~$ echo $?
>   1
>
> read returns 1 because it reached the end of file for standard input.
> From the manual: "The return code is zero, unless end-of-file is
> encountered, read times out (in which case the return code is greater than
> 128), or an invalid file descriptor is supplied as the argument to -u."
>
> So, if you're reading all the way to EOF (on purpose) then you should
> ignore the exit status.  set -e doesn't permit you to ignore the exit
> status on commands where the exit status indicates a nonfatal condition
> (such as read -d '' or let i=0).  This is why set -e is crap.
>
> Also see http://mywiki.wooledge.org/BashFAQ/105
>
set -e

 IFS= read -rd '' var2 <

Re: Q on Bash's self-documented POSIX compliance...

2013-01-26 Thread John Kearney

Am 27.01.2013 01:37, schrieb Clark WANG:
> On Sat, Jan 26, 2013 at 1:27 PM, Linda Walsh  wrote:
>
>> I noted on the bash man page that it says it will start in posix
>> compliance mode when started as 'sh' (/bin/sh).
>>
>> What does that mean about bash extensions like arrays and
>> use of [[]]?
>>
>> Those are currently not-POSIX (but due to both Bash and Ksh having
>> them, some think that such features are part of POSIX now)...
>>
>> If you operate in POSIX compliance mode, what guarantee is there that
>> you can take a script developed with bash, in POSIX compliance mode,
>> and run it under another POSIX compliant shell?
>>
>> Is it such that Bash can run POSIX compliant scripts, BUT, cannot be
>> (easily) used to develop such, as there is no way to tell it to
>> only use POSIX?
>>
>> If someone runs in POSIX mode, should bash keep arbitrary bash-specific
>> extensions enabled?
>>
>> I am wondering about the rational, but also note that some people believe
>> they are running a POSIX compatible shell when they use /bin/sh, but would
>> get rudely surprised is another less feature-full shell were dropped in
>> as a replacement.
>>
> I think every POSIX compatible shell has its own extensions so there's no
> guarantee that a script which works fine in shell A would still work in
> shell B even if both A and B are POSIX compatible unless the script writer
> only uses POSIX compatible features. Is there a pure POSIX shell without
> adding any extensions?
dash is normally a better gauge of how portable your script is, than
bash in posix mode.

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-15 Thread John Kearney

Am 14.01.2013 21:12, schrieb Chet Ramey:
> On 1/14/13 2:57 PM, John Kearney wrote:
>
>> I have no idea why errexit exists I doubt it was for lazy people
>> thought. its more work to use it.
> I had someone tell me one with a straight (electronic) face that -e
> exists `to allow "make" to work as expected' since historical make invokes
> sh -ce to run recipes.  Now, he maintains his own independently-written
> version of `make', so his opinion might be somewhat skewed.
>
> Chet
>
That actually makes a lot of sense. it explains the 2 weirdest things
about it, 1 no error message explaining what happened, 2 weird behavior
with functions.

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-14 Thread John Kearney

Am 14.01.2013 22:09, schrieb Ken Irving:
> On Mon, Jan 14, 2013 at 08:57:41PM +0100, John Kearney wrote:
>> ...
>> btw
>> || return $?
>>
>> isn't actually error checking its error propagation.
> Also btw, I think you can omit the $? in this case;  from bash(1):
>
> return [n]
> ...
> If n is omitted, the return status is that of the  last  command
> executed  in the function body.  ...
>
> and similarly for exit:
>
> exit [n]
> ...  If  n  is  omitted,
> the exit status is that of the last command executed.  ...
>
> Ken
>
Thanks yhea your right, but I think its clearer to include it especially
for people with less experience. I try to be as explicit as possible.
Perl cured me of my taste for compactness in code . ;)

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-14 Thread John Kearney

Am 14.01.2013 20:25, schrieb Greg Wooledge:
> On Mon, Jan 14, 2013 at 08:08:53PM +0100, John Kearney wrote:
>> this should exit.
>> #!/bin/bash
>>
>> set -e
>> f() { test -d nosuchdir && echo no dir; }
>> echo testings
>> f
>> echo survived
> OK, cool.  That gives me more ammunition to use in the war against set -e.
>
> ==
> imadev:~$ cat foo
> #!/bin/bash
>
> set -e
> test -d nosuchdir && echo no dir
> echo survived
> imadev:~$ ./foo
> survived
> ==
> imadev:~$ cat bar
> #!/bin/bash
>
> set -e
> f() { test -d nosuchdir && echo no dir; }
> f
> echo survived
> imadev:~$ ./bar
> imadev:~$ 
> ==
> imadev:~$ cat baz
> #!/bin/bash
>
> set -e
> f() { if test -d nosuchdir; then echo no dir; fi; }
> f
> echo survived
> imadev:~$ ./baz
> survived
> ==
>
>> All I was pointing out that its safer to use syntax
>>
>> [] ||
>>
>> or
>>
>> [] && ||
> I don't even know what "safer" means any more.  As you can see in my
> code examples above, if you were expecting the "survived" line to appear,
> then you get burned if you wrap the test in a function, but only if the
> test uses the "shorthand" && instead of the "vanilla" if.
>
> But I'm not sure what people expect it to do.  It's hard enough just
> documenting what it ACTUALLY does.
>
>> you always need a || on a one liner to make sure the return value of the
>> line is a 0.
> Or stop using set -e.  No, really.  Just... fucking... stop. :-(
>
>> but lets say you want to do 2 things in a function you have to do
>> something like.
>> f(){
>> mkdir "${1%/*}" ||return $?  # so the line doesn't return an error.
>> touch "${1}"
>> }
> ... wait, so you're saying that even if you use set -e, you STILL have to
> include manual error checking?  The whole point of set -e was to allow
> lazy people to omit it, wasn't it?
>
> So, set -e lets you skip error checking, but you have to add error checking
> to work around the quirks of set -e.
>
> That's hilarious.
>
I have no idea why errexit exists I doubt it was for lazy people
thought. its more work to use it.
I use trap ERR not errexit, which allows me to protocol unhandled errors.

I actually find trap ERR/errexit pretty straight forward now. I don't
really get why people are so against it. Except that they seem to have
the wrong expectations for it.

btw
|| return $?

isn't actually error checking its error propagation.


f(){
# not last command in function
mkdir "${1%/*}"  # exit on error.
mkdir "${1%/*}" ||return $?  # return an error.
mkdir "${1%/*}" ||true   # ignore error.

# last command in function
touch "${1}"# return exit code
}


what is confusing though is

f(){
touch "${1}"# exit on error
return $?
}


this wll not work as expected with errexit.

because the touch isn't the last command in the function, however just
removing the return should fix it.


also need to be careful of stuff like

x=$(false)
need something more like
x=$(false||true)
 
or
if x=$(false) ; then


basically any situation in which a line returns a non 0 value is
probably going to cause the exit especially in functions.


I just do it automatically now.
 

I guess most people aren't used to considering the line return values.

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-14 Thread John Kearney

Am 14.01.2013 14:33, schrieb Greg Wooledge:
> On Sun, Jan 13, 2013 at 03:31:24AM +0100, John Kearney wrote:
>> set -o errexit
>> test_func() {
>> [ ! -d test ] && echo test2
>> }
>>
>> echo test3
>> test_func
>> echo test4
>>
>> now so long as test doesn't exist in the cwd it should errexit.
>> at least it did for me just now.
> Cannot reproduce.
>
> imadev:~$ cat bar
> #!/bin/bash
>
> set -e
> f() { test ! -d nosuchdir && echo no dir; }
> f
> echo survived
> imadev:~$ ./bar
> no dir
> survived

the "no dir" above means that the test didn't fail. The exit only
happens if the test fails. Sorry I keep seeming to make typos. I really
need more sleep.
this should exit.
#!/bin/bash

set -e
f() { test -d nosuchdir && echo no dir; }
echo testings
f
echo survived

All I was pointing out that its safer to use syntax

[] ||

or

[] && ||

you always need a || on a one liner to make sure the return value of the
line is a 0.
this isn't necessary in the script body I think but in a function it is,
unless its the last command then it will be auto returned..

but lets say you want to do 2 things in a function you have to do
something like.
f(){
mkdir "${1%/*}" ||return $?  # so the line doesn't return an error.
touch "${1}"
}

any way it is nearly always something that should be being done anyway.
It only the conditional one liners that tend to frustrate people a lot
from what I've seen.

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-12 Thread John Kearney

Am 13.01.2013 00:04, schrieb Chet Ramey:
> On 1/12/13 10:07 AM, John Kearney wrote:
>
>> regarding -e it mainly has a bad name because there is no good guide how
>> to program with it.
>> so for example this causes stress
>> [ ! -d ${dirname} ] && mkdir ${dirname}
>> because if the dir exists it will exit the scripts :)
> I'm not sure this is what you wanted to say.  When -e is set, that code
> will not cause an error exit if ${dirname} exists and is a directory.  Run
> this script in the bash source directory and see what happens:
>
> set -e
> [ ! -d builtins ] && mkdir builtins
> echo after
>
>
> Chet
:)
its a little more complex, truthfully I make rules how I should do stuff
and then just follow them.

in this case you actually need to put the code in a function, then its
actually the function return not the command itself that causes the
exit. At least I think thats what happens, truthfully sometimes even
with the caller trace it can be hard to tell what is actually going on.
i.e.

set -o errexit
test_func() {
[ ! -d test ] && echo test2
}

echo test3
test_func
echo test4

now so long as test doesn't exist in the cwd it should errexit.
at least it did for me just now.

Like I say the only reason I don't like errexit is it doesn't say why it
exited, so I use the ERR trap. Which is great.

Just to clarify I'm not complaining just saying why I think ppl have bad
experiences with errexit.

having said that it might be nice to get an optional backtrace on
errors. I do this myself but it might help others if it was natively
supported.

John

Re: printf %q represents null argument as empty string.

2013-01-12 Thread John Kearney

Am 12.01.2013 20:40, schrieb Chet Ramey:
> On 1/12/13 9:48 AM, John Kearney wrote:
>
>> anyway now we have a point I disagree that
>> "${@}"
>>
>> should expand to 0 or more words, from the documentation it should be 1
>> or more. At least that is how I read  that paragragh. IT says it will
>> split the word not make the word vanish.
>> so I had to test and it really does how weird, is that in the posix spec?.
> Yes.  Here's the relevant sentence from the man page description of $@:
>
>   When  there  are no positional parameters, "$@" and $@ expand to
>   nothing (i.e., they are removed).
>
> Posix says something similar:
>
>   If there are no positional parameters, the expansion of '@' shall
>   generate zero fields, even when '@' is double-quoted.
>
> Chet
thanks one lives and learns.

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-12 Thread John Kearney

Am 12.01.2013 14:53, schrieb Dan Douglas:
> Yes some use -u / -e for debugging apparently. Actual logic relying upon 
> those 
> can be fragile of course. I prefer when things return nonzero instead of 
> throwing errors usually so that they're handleable.
ah but you can still do that if you want

you just do

${unsetvar:-0}  says you want 0 for null string or unset

${unsetvar-0}  says you want 0 for unset.

I know these aren't the sort of things you want to add retroactively,
but if you program from the ground up with this in mind your code is
much more explicit, and less reliant on particular interpreter
behavior.  So again it forces a more explicit programming style which is
always better. Truthfully most people complain my scripts don't look
like scripts any more but more like programs. But once they get used to
the style most see its advantages. at teh very least when they have to
figure out what is gone wrong they understand.


regarding -e it mainly has a bad name because there is no good guide how
to program with it.
so for example this causes stress
[ ! -d ${dirname} ] && mkdir ${dirname}
because if the dir exists it will exit the scripts :)
[ -d ${dirname} ] || mkdir ${dirname}
this however is safe.

actually forcing myself to work with SIGERR taught me a lot about how
this sort of thing works.

thats why I do for example use (old but simple example)
set -o errtrace
function TaceEvent {
local LASTERR=$?
local ETYPE="${1:?Missing Error Type}"
PrintFunctionStack 1
cErrorOut 1 "${ETYPE}
${BASH_SOURCE[1]}(${BASH_LINENO[1]}):${FUNCNAME[1]} ELEVEL=${LASTERR}
\"${BASH_COMMAND}\""
}
 trap 'TaceEvent ERR' ERR

which basically gives you a heads up everytime you haven't handled an
error return code.
so the following silly example

  test_func4() {
false
  }
  test_func3() {
test_func4
  }
  test_func2() {
test_func3
  }
  test_func1() {
test_func2
  }
  test_func1
will give me a log that looks like
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (225 ) :
main: "[5]/home/dethrophes/scripts/bash/test.sh(225):test_func1"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (223 ) :
test_func1  : "[4]/home/dethrophes/scripts/bash/test.sh(223):test_func2"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (220 ) :
test_func2  : "[3]/home/dethrophes/scripts/bash/test.sh(220):test_func3"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (217 ) :
test_func3  : "[2]/home/dethrophes/scripts/bash/test.sh(217):test_func4"
#E: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (214 ) :
test_func4  : "ERR
/home/dethrophes/scripts/bash/test.sh(217):test_func4 ELEVEL=1 \"false\""
which allows me to very quickly route cause the error and fix it.

if you really don't care you can just stick a ||true on the end to
ignore it in the future.

so in this case to something like
test_func4() {
false || true
  }
 
I mean it would be nice to have an unset trap, but without it nounset is
the next best thing.

Also I don't think of this as debugging it's code verification/analysis.
I do this so I don't have to debug my code. This is a big help against
typos and scoping errors. like I say its like using lint.

Re: printf %q represents null argument as empty string.

2013-01-12 Thread John Kearney

Am 12.01.2013 15:34, schrieb Dan Douglas:
> On Friday, January 11, 2013 10:39:19 PM Dan Douglas wrote:
>> On Saturday, January 12, 2013 02:35:34 AM John Kearney wrote:
>> BTW, your wrappers won't work. A wrapper would need to implement format 
> Hrmf I should have clarified that I only meant A complete printf wrapper 
> would 
> be difficult. A single-purpose workaround is perfectly fine. e.g.
> printq() { ${1+printf %q "$@"}; }; ... which is probably something like what 
> you meant. Sorry for the rant.
>
Don't worry I've got a thick skin ;) feel free to rant, you have a
different perspective and I like that.

anyway now we have a point I disagree that
"${@}"

should expand to 0 or more words, from the documentation it should be 1
or more. At least that is how I read  that paragragh. IT says it will
split the word not make the word vanish.
so I had to test and it really does how weird, is that in the posix spec?.
set --
test_func() { echo $#; }
test_func "${@}"
0
test_func "1${@}"
1
test_func "${@:-}"
1
test_func "${@-}"
1

Now I'm confused ...

oh well sorry had the functionality differently in my head.

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-11 Thread John Kearney

Am 11.01.2013 22:34, schrieb Dan Douglas:
> On Friday, January 11, 2013 09:48:32 PM John Kearney wrote:
>> Am 11.01.2013 19:27, schrieb Dan Douglas:
>>> Bash treats the variable as essentially undefined until given at least an 
>>> empty value.
>>>
>>> $ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
>>> 1,
>>> bash: line 0: typeset: x: not found
>>> $ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
>>> 0,
>>> typeset -i x
>>>
>>> Zsh implicitly gives integers a zero value if none are specified and the
>>> variable was previously undefined. Either the ksh or zsh ways are fine IMO.
>>>
>>> Also I'll throw this in:
>>>
>>> $ arr[1]=test; [[ -v arr[1] ]]; echo $?
>>> 1
>>>
>>> This now works in ksh to test if an individual element is set, though it 
>>> hasn't always. Maybe Bash should do the same? -v is tricky because it adds 
>>> some extra nuances to what it means for something to be defined...
>>>
>> Personally I like the current behavior, disclaimer I use nounset.
>> I see no problem with getting people to initialize variables.
> How is this relevant? It's an inconsistency in the way set/unset variables
> are normally handled. You don't use variadic functions? Unset variables /
> parameters are a normal part of most scripts.
>
>> it is a more robust programming approach.
> I strongly disagree. (Same goes for errexit.)
>
:)
we agree on errexit however SIGERROR is another matter quite like that.
Note the only reason I don't like errexit is because it doesn't tell you
why it exited, nounset deos.

no unset is very valuable. during the entire testing and validation
phase. Admittedly bash is more of a hobby for me. but I still have unit
testing for the function and more complex harness testing for the higher
level stuff. 

Before I ship ship code I may turn it off but normally if its really
critical I won't use bash for it anyway, I mainly use bash for analysis.

as such if bash stops because it finds a unset variable it is always a
bug that bash has helped me track down.

I guess it also depends on how big your scripts are I guess up to a
couple thousand lines is ok but once you get into the 10s of thousands
to keep your sanity and keep a high reliability you become more and more
strict with what you allow, strict naming conventions and coding styles.

setting nounset is in the same category of setting warnings to all and
and treat warnings as errors.

but then again I do mission critical designs so I guess I have a
different mindset.

Re: printf %q represents null argument as empty string.

2013-01-11 Thread John Kearney

Am 11.01.2013 22:05, schrieb Dan Douglas:
> On Friday, January 11, 2013 09:39:00 PM John Kearney wrote:
>> Am 11.01.2013 19:38, schrieb Dan Douglas:
>>> $ set --; printf %q\\n "$@"
>>> ''
>>>
>>> printf should perhaps only output '' when there is actually a 
> corresponding
>>> empty argument, else eval "$(printf %q ...)" and similar may give 
> different 
>>> results than expected. Other shells don't output '', even mksh's ${var@Q} 
>>> expansion. Zsh's ${(q)var} does.
>> that is not a bug in printf %q
>>
>> it what you expect to happen with "${@}" 
>> should that be 0 arguments if $# is 0.
>>
>> I however find the behavior irritating, but correct from the description.
>>
>> to do what you are suggesting you would need a special case handler for this
>> "${@}" as oposed to "${@}j" or any other variation.
>>
>>
>> what I tend to do as a workaround is
>>
>> printf() {
>> if [ $# -eq 2 -a -z "${2}" ];then
>> builtin printf "${1}"
>> else
>> builtin printf "${@}"
>> fi
>> }
>>
>>
>> or not as good but ok in most cases something like
>>
>> printf "%q" ${1:+"${@}"}
>>
>>
> I don't understand what you mean. The issue I'm speaking of is that printf %q 
> produces a quoted empty string both when given no args and when given one 
> empty arg. A quoted "$@" with no positional parameters present expands to 
> zero 
> words (and correspondingly for "${arr[@]}"). Why do you think "x${@}x" is 
> special? (Note that expansion didn't even work correctly a few patchsets ago.)
>
> Also as pointed out, every other shell with a printf %q feature disagrees 
> with 
> Bash. Are you saying that something in the manual says that it should do 
> otherwise? I'm aware you could write a wrapper, I just don't see any utility 
> in the default behavior.

um maybe an example will calrify my  attempted point

set -- arg1 arg2 arg3
set -- "--(${@})--"
printf "<%q> " "${@}"
<--\(arg1>  

set --
set -- "--(${@})--"
printf "<%q> " "${@}"
<--\(\)-->

so there is always at least one word or one arg, just because its "${@}"
should not  affect this behavior.

is that clearer as such bash is doing the right thing as far as I'm
concerned, truthfully its not normally what I want but that is beside
the point consistency is more important, especially when its so easy to
work around.

the relevant part of the man page is
 When there are no  array  members,  ${name[@]}  expands  to nothing.  

<<>>>
If  the double-quoted expansion occurs within a word, the expansion of
the first parameter is joined with the beginning part of the
   original word, and the expansion of the last parameter is joined
with the last part of the original word.  This is analogous to the 
expansion
   of  the  special parameters * and @ (see Special Parameters
above).  ${#name[subscript]} expands to the length of
${name[subscript]}.  If sub-
   script is * or @, the expansion is the number of elements in the
array.  Referencing an array variable without a subscript  is 
equivalent  to
   referencing the array with a subscript of 0.

so
set --

printf "%q" "${@}"
becomes
printf "%q" ""

which is correct as ''

Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-11 Thread John Kearney

Am 11.01.2013 19:27, schrieb Dan Douglas:
> Bash treats the variable as essentially undefined until given at least an 
> empty value.
>
> $ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> 1,
> bash: line 0: typeset: x: not found
> $ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> 0,
> typeset -i x
>
> Zsh implicitly gives integers a zero value if none are specified and the
> variable was previously undefined. Either the ksh or zsh ways are fine IMO.
>
> Also I'll throw this in:
>
> $ arr[1]=test; [[ -v arr[1] ]]; echo $?
> 1
>
> This now works in ksh to test if an individual element is set, though it 
> hasn't always. Maybe Bash should do the same? -v is tricky because it adds 
> some extra nuances to what it means for something to be defined...
>

Personally I like the current behavior, disclaimer I use nounset.
I see no problem with getting people to initialize variables.

it is a more robust programming approach.

Re: printf %q represents null argument as empty string.

2013-01-11 Thread John Kearney

Am 11.01.2013 19:38, schrieb Dan Douglas:
> $ set --; printf %q\\n "$@"
> ''
>
> printf should perhaps only output '' when there is actually a corresponding
> empty argument, else eval "$(printf %q ...)" and similar may give different 
> results than expected. Other shells don't output '', even mksh's ${var@Q} 
> expansion. Zsh's ${(q)var} does.

that is not a bug in printf %q

it what you expect to happen with "${@}" 
should that be 0 arguments if $# is 0.

I however find the behavior irritating, but correct from the description.

to do what you are suggesting you would need a special case handler for this
"${@}" as oposed to "${@}j" or any other variation.


what I tend to do as a workaround is

printf() {
if [ $# -eq 2 -a -z "${2}" ];then
builtin printf "${1}"
else
builtin printf "${@}"
fi
}


or not as good but ok in most cases something like

printf "%q" ${1:+"${@}"}

Re: output of `export -p' seems misleading

2012-11-10 Thread John Kearney

Am 09.11.2012 17:21, schrieb Greg Wooledge:
> On Fri, Nov 09, 2012 at 11:18:24AM -0500, Greg Wooledge wrote:
>> restore_environment() {
>>   set -o posix
>>   eval "$saved_output_of_export_dash_p"
>>   set +o posix
>> }
> Err, what I meant was:
>
> save_environment() {
>   set -o posix
>   saved_env=$(export -p)
>   set +o posix
> }
>
> restore_environment() {
>   eval "$saved_env"
> }
>
or I guess you could also do something like

save_environment() {
  saved_env=$(export -p)
}

restore_environment() {
  echo "${saved_env//declare -x /declare -g -x }"
}


or

save_environment() {
  saved_env=$(set -o posix; export -p)
}

Re: Regular expression matching fails with string RE

2012-10-16 Thread John Kearney

Am 17.10.2012 03:13, schrieb Clark WANG:
> On Wed, Oct 17, 2012 at 5:18 AM,  wrote:
>
>> Bash Version: 4.2
>> Patch Level: 37
>>
>> Description:
>>
>> bash -c 're=".*([0-9])"; if [[ "foo1" =~ ".*([0-9])" ]]; then echo
>> ${BASH_REMATCH[0]}; elif [[ "bar2" =~ $re ]]; then echo ${BASH_REMATCH[0]};
>> fi'
>>
>> This should output foo1. It instead outputs bar2, as the first match fails.
>>
>>
>> From bash's man page:
>[[ expression ]]
>   ... ...
>   An additional binary operator, =~, is available, with  the
> same
>   ... ...
>   alphabetic characters.  Any part of the pattern may be quoted
> to
>   force  it  to  be  matched  as  a string.  Substrings matched
> by
>   ... ...
Drop the quotes on the regex

bash -c 're=".*([0-9])"; if [[ "foo1" =~ .*([0-9]) ]]; then echo
${BASH_REMATCH[0]}; elif [[ "bar2" =~ $re ]]; then echo ${BASH_REMATCH[0]};
fi'

outputs foo1

Re: Bash bug interpolating delete characters

2012-05-07 Thread John Kearney

Am 07.05.2012 22:46, schrieb Chet Ramey:
> On 5/3/12 5:53 AM, Ruediger Kuhlmann wrote:
>> Hi,
>>
>> please try the following bash script:
>>
>> a=x
>> del="$(echo -e "\\x7f")"
>>
>> echo "$del${a#x}" | od -ta
>> echo "$del ${a#x}" | od -ta
>> echo " $del${a#x}" | od -ta
>>
>> Using bash 3.2, the output is:
>>
>> 000 del  nl
>> 002
>> 000 del  sp  nl
>> 003
>> 000  sp del  nl
>> 003
>>
>> however with bash 4.1 and bash 4.2.20, the output is only:
>>
>> 000 del  nl
>> 002
>> 000  sp  nl
>> 002
>> 000  sp  nl
>> 002
>>
>> ... so in the second and third line, the delete character magically
>> disappears. Neither OS nor locale seem to influence this. Using a delete
>> character directly in the script instead of $del also has no impact, either.
> It's a case of one part of the code violating assumptions made by (and
> conditions imposed by) another.  Try the attached patch; it fixes the
> issue for me.
>
> Chet
>
It also works for me.

  "$del${a#x}" =[$'\177']
  " $del${a%x}"=[$' \177']
  " $del""${a:0:0}"=[$' \177']
  " ${del}${a:0:0}"=[$' \177']
  "${del:0:1}${a#d}"   =[$'\177x']
  "${del:0:1} ${a#d}"  =[$'\177 x']
  "${del:0:1} ${a:+}"  =[$'\177 ']
  "$del ${a#x}"=[$'\177 ']
  " $del${a:0:0}"  =[$' \177']
  " $del${a}"  =[$' \177x']
  " ${del:0:1}${a:0:0}"=[$' \177']
  "${del:0:1}${a#x}"   =[$'\177']
  "${del:0:1} ${a#x}"  =[$'\177 ']
  " $del${a#x}"=[$' \177']
  " $del"${a:0:0}  =[$' \177']
  " $del"  =[$' \177']
  " ${del:0:1}${a}"=[$' \177x']
  "${del:0:1} ${a}"=[$'\177 x']
  "${del:0:1} ${a:-}"  =[$'\177 x']

Re: Parallelism a la make -j / GNU parallel

2012-05-06 Thread John Kearney

Am 06.05.2012 08:28, schrieb Mike Frysinger:
> On Saturday 05 May 2012 04:28:50 John Kearney wrote:
>> Am 05.05.2012 06:35, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 15:25:25 John Kearney wrote:
>>>> Am 04.05.2012 21:13, schrieb Mike Frysinger:
>>>>> On Friday 04 May 2012 15:02:27 John Kearney wrote:
>>>>>> Am 04.05.2012 20:53, schrieb Mike Frysinger:
>>>>>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>>>>>>>> Mike Frysinger writes:
>>>>>>>>> i wish there was a way to use `wait` that didn't block until all
>>>>>>>>> the pids returned.  maybe a dedicated option, or a shopt to enable
>>>>>>>>> this, or a new command.
>>>>>>>>>
>>>>>>>>> for example, if i launched 10 jobs in the background, i usually
>>>>>>>>> want to wait for the first one to exit so i can queue up another
>>>>>>>>> one, not wait for all of them.
>>>>>>>> If you set -m you can trap on SIGCHLD while waiting.
>>>>>>> awesome, that's a good mitigation
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>> set -m
>>>>>>> cnt=0
>>>>>>> trap ': $(( --cnt ))' SIGCHLD
>>>>>>> for n in {0..20} ; do
>>>>>>>
>>>>>>> (
>>>>>>> 
>>>>>>> d=$(( RANDOM % 10 ))
>>>>>>> echo $n sleeping $d
>>>>>>> sleep $d
>>>>>>> 
>>>>>>> ) &
>>>>>>> 
>>>>>>> : $(( ++cnt ))
>>>>>>> 
>>>>>>> if [[ ${cnt} -ge 10 ]] ; then
>>>>>>> 
>>>>>>> echo going to wait
>>>>>>> wait
>>>>>>> 
>>>>>>> fi
>>>>>>>
>>>>>>> done
>>>>>>> trap - SIGCHLD
>>>>>>> wait
>>>>>>>
>>>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a
>>>>>>> wait), but this is good enough for some things.  it does lose
>>>>>>> visibility into which pids are live vs reaped, and their exit status,
>>>>>>> but i more often don't care about that ...
>>>>>> That won't work I don't think.
>>>>> seemed to work fine for me
>>>>>
>>>>>> I think you meant something more like this?
>>>>> no.  i want to sleep the parent indefinitely and fork a child asap
>>>>> (hence the `wait`), not busy wait with a one second delay.  the `set
>>>>> -m` + SIGCHLD interrupted the `wait` and allowed it to return.
>>>> The functionality of the code doesn't need SIGCHLD, it still waits till
>>>> all the 10 processes are finished before starting the next lot.
>>> not on my system it doesn't.  maybe a difference in bash versions.  as
>>> soon as one process quits, the `wait` is interrupted, a new one is
>>> forked, and the parent goes back to sleep until another child exits.  if
>>> i don't `set -m`, then i see what you describe -- the wait doesn't
>>> return until all 10 children exit.
>> Just to clarify what I see with your code, with the extra echos from me
>> and less threads so its shorter.
> that's not what i was getting.  as soon as i saw the echo of SIGCHLD, a new 
> "sleeping" would get launched.
> -mike
Ok then, thats weird because it doesn't really make sense to me why a
SIGCHLD would interrupt the wait command. Oh well.

Re: Parallelism a la make -j / GNU parallel

2012-05-06 Thread John Kearney

Am 06.05.2012 08:28, schrieb Mike Frysinger:
> On Saturday 05 May 2012 23:25:26 John Kearney wrote:
>> Am 05.05.2012 06:28, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 16:17:02 Chet Ramey wrote:
>>>> On 5/4/12 2:53 PM, Mike Frysinger wrote:
>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a
>>>>> wait), but this is good enough for some things.  it does lose
>>>>> visibility into which pids are live vs reaped, and their exit status,
>>>>> but i more often don't care about that ...
>>>> What version of bash did you test this on?  Bash-4.0 is a little
>>>> different in how it treats the SIGCHLD trap.
>>> bash-4.2_p28.  wait returns 145 (which is SIGCHLD).
>>>
>>>> Would it be useful for bash to set a shell variable to the PID of the
>>>> just- reaped process that caused the SIGCHLD trap?  That way you could
>>>> keep an array of PIDs and, if you wanted, use that variable to keep
>>>> track of live and dead children.
>>> we've got associative arrays now ... we could have one which contains all
>>> the relevant info:
>>> declare -A BASH_CHILD_STATUS=(
>>> ["pid"]=1234
>>> ["status"]=1# WEXITSTATUS()
>>> ["signal"]=13   # WTERMSIG()
>>> )
>>>
>>> makes it easy to add any other fields people might care about ...
>> Is there actually a guarantee that there will be 1 SIGCHLD for every
>> exited process.
>> Isn't it actually a race condition?
> when SIGCHLD is delivered doesn't matter.  the child stays in a zombie state 
> until the parent calls wait() on it and gets its status.  so you can have 
> `wait` return one child's status at a time.
> -mike
but I think my point still stands
trap ': $(( cnt-- ))' SIGCHLD
is a bad idea, you actually need to verify how many jobs are running not
just arbitrarily decrement a counter, because your not guaranteed a trap
for each process. I mean sure it will normally work, but its not
guaranteed to work.

Also I think the question would be is there any point in forcing bash to
issue 1 status at a time? It seems to make more sense to issue them in
bulk.
So bash could populate an array of all reaped processes in one trap
rather than having to execute multiple traps. This is what bash does
internally anyway?

Re: Parallelism a la make -j / GNU parallel

2012-05-05 Thread John Kearney

Am 05.05.2012 06:28, schrieb Mike Frysinger:
> On Friday 04 May 2012 16:17:02 Chet Ramey wrote:
>> On 5/4/12 2:53 PM, Mike Frysinger wrote:
>>> it might be a little racy (wrt checking cnt >= 10 and then doing a wait),
>>> but this is good enough for some things.  it does lose visibility into
>>> which pids are live vs reaped, and their exit status, but i more often
>>> don't care about that ...
>> What version of bash did you test this on?  Bash-4.0 is a little different
>> in how it treats the SIGCHLD trap.
> bash-4.2_p28.  wait returns 145 (which is SIGCHLD).
>
>> Would it be useful for bash to set a shell variable to the PID of the just-
>> reaped process that caused the SIGCHLD trap?  That way you could keep an
>> array of PIDs and, if you wanted, use that variable to keep track of live
>> and dead children.
> we've got associative arrays now ... we could have one which contains all the 
> relevant info:
>   declare -A BASH_CHILD_STATUS=(
>   ["pid"]=1234
>   ["status"]=1# WEXITSTATUS()
>   ["signal"]=13   # WTERMSIG()
>   )
> makes it easy to add any other fields people might care about ...
> -mike
Is there actually a guarantee that there will be 1 SIGCHLD for every
exited process.
Isn't it actually a race condition?
what happens if 2 subprocesses exit simultaneously.
or if a process exits while already in the SIGCHLD trap.
I mean my normal interpretation of a interrupt/event/trap is just a
notification that I need to check what has happened. Or that there was
an event not the extent of the event?
I keep feeling that the following is bad practice

trap ': $(( --cnt ))' SIGCHLD

and would be better something like this

trap 'cnt=$(jobs -p | wc -w)' SIGCHLD


as such you would need something more like.
declare -a BASH_CHILD_STATUS=([1234]=1 [1235]=1 [1236]=1)

declare -a BASH_CHILD_STATUS_SIGNAL=([1234]=13 [1235]=13 [1236]=13)

Re: Parallelism a la make -j / GNU parallel

2012-05-05 Thread John Kearney

Am 05.05.2012 06:35, schrieb Mike Frysinger:
> On Friday 04 May 2012 15:25:25 John Kearney wrote:
>> Am 04.05.2012 21:13, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 15:02:27 John Kearney wrote:
>>>> Am 04.05.2012 20:53, schrieb Mike Frysinger:
>>>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>>>>>> Mike Frysinger  writes:
>>>>>>> i wish there was a way to use `wait` that didn't block until all the
>>>>>>> pids returned.  maybe a dedicated option, or a shopt to enable this,
>>>>>>> or a new command.
>>>>>>>
>>>>>>> for example, if i launched 10 jobs in the background, i usually want
>>>>>>> to wait for the first one to exit so i can queue up another one, not
>>>>>>> wait for all of them.
>>>>>> If you set -m you can trap on SIGCHLD while waiting.
>>>>> awesome, that's a good mitigation
>>>>>
>>>>> #!/bin/bash
>>>>> set -m
>>>>> cnt=0
>>>>> trap ': $(( --cnt ))' SIGCHLD
>>>>> for n in {0..20} ; do
>>>>>   (
>>>>>   d=$(( RANDOM % 10 ))
>>>>>   echo $n sleeping $d
>>>>>   sleep $d
>>>>>   ) &
>>>>>   : $(( ++cnt ))
>>>>>   if [[ ${cnt} -ge 10 ]] ; then
>>>>>   echo going to wait
>>>>>   wait
>>>>>   fi
>>>>> done
>>>>> trap - SIGCHLD
>>>>> wait
>>>>>
>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a
>>>>> wait), but this is good enough for some things.  it does lose
>>>>> visibility into which pids are live vs reaped, and their exit status,
>>>>> but i more often don't care about that ...
>>>> That won't work I don't think.
>>> seemed to work fine for me
>>>
>>>> I think you meant something more like this?
>>> no.  i want to sleep the parent indefinitely and fork a child asap (hence
>>> the `wait`), not busy wait with a one second delay.  the `set -m` +
>>> SIGCHLD interrupted the `wait` and allowed it to return.
>> The functionality of the code doesn't need SIGCHLD, it still waits till
>> all the 10 processes are finished before starting the next lot.
> not on my system it doesn't.  maybe a difference in bash versions.  as soon 
> as 
> one process quits, the `wait` is interrupted, a new one is forked, and the 
> parent goes back to sleep until another child exits.  if i don't `set -m`, 
> then i see what you describe -- the wait doesn't return until all 10 children 
> exit.
> -mike
Just to clarify what I see with your code, with the extra echos from me
and less threads so its shorter.
set -m
cnt=0
trap ': $(( --cnt )); echo "SIGCHLD"' SIGCHLD
for n in {0..10} ; do
(
d=$(( RANDOM % 10 ))
echo $n sleeping $d
sleep $d
echo $n exiting $d
) &
: $(( ++cnt ))
if [[ ${cnt} -ge 5 ]] ; then
echo going to wait
wait
echo Back from wait
fi
done
trap - SIGCHLD
wait
   
gives
0 sleeping 9
2 sleeping 4
going to wait
4 sleeping 7
3 sleeping 4
1 sleeping 6
2 exiting 4
SIGCHLD
3 exiting 4
SIGCHLD
1 exiting 6
SIGCHLD
4 exiting 7
SIGCHLD
0 exiting 9
SIGCHLD
Back from wait
5 sleeping 5
6 sleeping 5
going to wait
8 sleeping 1
9 sleeping 1
7 sleeping 3
9 exiting 1
8 exiting 1
SIGCHLD
SIGCHLD
7 exiting 3
SIGCHLD
6 exiting 5
SIGCHLD
5 exiting 5




now
this code
function TestProcess_22 {
local d=$(( RANDOM % 10 ))
echo $1 sleeping $d
sleep $d
echo $1 exiting $d
}
function trap_SIGCHLD {
echo "SIGCHLD";
if [ $cnt -gt 0 ]; then
: $(( --cnt ))
TestProcess_22 $cnt  &
fi
}
set -m
cnt=10
maxJobCnt=5
trap 'trap_SIGCHLD' SIGCHLD
for (( x=0; xhttp://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

uname -a
Linux DETH00 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC
2012 x86_64 x86_64 x86_64 GNU/Linux

Re: Parallelism a la make -j / GNU parallel

2012-05-04 Thread John Kearney

Am 04.05.2012 21:11, schrieb Greg Wooledge:
> On Fri, May 04, 2012 at 09:02:27PM +0200, John Kearney wrote:
>> set -m
>> cnt=0
>> trap ': $(( --cnt ))' SIGCHLD
>> set -- {0..20}
>> while [ $# -gt 0 ]; do
>>  if [[ ${cnt} -lt 10 ]] ; then
>>
>>  (
>>  d=$(( RANDOM % 10 ))
>>  echo $n sleeping $d
>>  sleep $d
>>  ) &
>>  : $(( ++cnt ))
>>  shift
>>  fi
>>  echo going to wait
>>  sleep 1
>> done
> You're busy-looping with a 1-second sleep instead of using wait and the
> signal handler, which was the whole purpose of the previous example (and
> of the set -m that you kept in yours).  And $n should probably be $1 there.
>
see my response to mike.


what you are thinking about is either what I suggested or something like
this

function TestProcess_22 {
local d=$(( RANDOM % 10 ))
echo $1 sleeping $d
sleep $d
echo $1 exiting $d
}
function trap_SIGCHLD {
echo "SIGCHLD";
if [ $cnt -gt 0 ]; then
: $(( --cnt ))
TestProcess_22 $cnt  &
fi
}
set -m
cnt=20
maxJobCnt=10
trap 'trap_SIGCHLD' SIGCHLD
for (( x=0; x

Re: Parallelism a la make -j / GNU parallel

2012-05-04 Thread John Kearney

Am 04.05.2012 21:13, schrieb Mike Frysinger:
> On Friday 04 May 2012 15:02:27 John Kearney wrote:
>> Am 04.05.2012 20:53, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>>>> Mike Frysinger  writes:
>>>>> i wish there was a way to use `wait` that didn't block until all the
>>>>> pids returned.  maybe a dedicated option, or a shopt to enable this,
>>>>> or a new command.
>>>>>
>>>>> for example, if i launched 10 jobs in the background, i usually want to
>>>>> wait for the first one to exit so i can queue up another one, not wait
>>>>> for all of them.
>>>> If you set -m you can trap on SIGCHLD while waiting.
>>> awesome, that's a good mitigation
>>>
>>> #!/bin/bash
>>> set -m
>>> cnt=0
>>> trap ': $(( --cnt ))' SIGCHLD
>>> for n in {0..20} ; do
>>>
>>> (
>>> 
>>> d=$(( RANDOM % 10 ))
>>> echo $n sleeping $d
>>> sleep $d
>>> 
>>> ) &
>>> 
>>> : $(( ++cnt ))
>>> 
>>> if [[ ${cnt} -ge 10 ]] ; then
>>> 
>>> echo going to wait
>>> wait
>>> 
>>> fi
>>>
>>> done
>>> trap - SIGCHLD
>>> wait
>>>
>>> it might be a little racy (wrt checking cnt >= 10 and then doing a wait),
>>> but this is good enough for some things.  it does lose visibility into
>>> which pids are live vs reaped, and their exit status, but i more often
>>> don't care about that ...
>> That won't work I don't think.
> seemed to work fine for me
>
>> I think you meant something more like this?
> no.  i want to sleep the parent indefinitely and fork a child asap (hence the 
> `wait`), not busy wait with a one second delay.  the `set -m` + SIGCHLD 
> interrupted the `wait` and allowed it to return.
> -mike
The functionality of the code doesn't need SIGCHLD, it still waits till
all the 10 processes are finished before starting the next lot.

it only interrupts the wait to decrement the counter.

to do what your talking about you would have to start the new subprocess
in the SIGCHLD trap.


try this out it might make it clearer what I mean

set -m
cnt=0
trap ': $(( --cnt )); echo SIGCHLD' SIGCHLD
for n in {0..20} ; do
(
d=$(( RANDOM % 10 ))
echo $n sleeping $d
sleep $d
echo $n exiting $d
) &
: $(( ++cnt ))
if [[ ${cnt} -ge 10 ]] ; then
echo going to wait
wait
fi
done
trap - SIGCHLD
wait

Re: Parallelism a la make -j / GNU parallel

2012-05-04 Thread John Kearney

Am 04.05.2012 20:53, schrieb Mike Frysinger:
> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>> Mike Frysinger  writes:
>>> i wish there was a way to use `wait` that didn't block until all the pids
>>> returned.  maybe a dedicated option, or a shopt to enable this, or a new
>>> command.
>>>
>>> for example, if i launched 10 jobs in the background, i usually want to
>>> wait for the first one to exit so i can queue up another one, not wait
>>> for all of them.
>> If you set -m you can trap on SIGCHLD while waiting.
> awesome, that's a good mitigation
>
> #!/bin/bash
> set -m
> cnt=0
> trap ': $(( --cnt ))' SIGCHLD
> for n in {0..20} ; do
>   (
>   d=$(( RANDOM % 10 ))
>   echo $n sleeping $d
>   sleep $d
>   ) &
>   : $(( ++cnt ))
>   if [[ ${cnt} -ge 10 ]] ; then
>   echo going to wait
>   wait
>   fi
> done
> trap - SIGCHLD
> wait
>
> it might be a little racy (wrt checking cnt >= 10 and then doing a wait), but 
> this is good enough for some things.  it does lose visibility into which pids 
> are live vs reaped, and their exit status, but i more often don't care about 
> that ...
> -mike
That won't work I don't think.
I think you meant something more like this?

set -m
cnt=0
trap ': $(( --cnt ))' SIGCHLD
set -- {0..20}
while [ $# -gt 0 ]; do
if [[ ${cnt} -lt 10 ]] ; then

(
d=$(( RANDOM % 10 ))
echo $n sleeping $d
sleep $d
) &
: $(( ++cnt ))
shift
fi
echo going to wait
sleep 1
done


which is basically what I did in my earlier example except I used USR2
instead of SIGCHLD and put it in a function to make it easier to use.

Re: Parallelism a la make -j / GNU parallel

2012-05-03 Thread John Kearney

This version might be easier to follow. The last version was more for
being able to issue commands via a fifo to a job queue server.

  function check_valid_var_name {
case "${1:?Missing Variable Name}" in
  [!a-zA-Z_]* | *[!a-zA-Z_0-9]* ) return 3;;
esac
  }


  CNiceLevel=$(nice)
declare -a JobArray
function PushAdvancedCmd {
local le="tmp_array${#JobArray[@]}"
JobArray+=("${le}")
eval "${le}"'=("${@}")'
}
function PushSimpleCmd {
PushAdvancedCmd  WrapJob ${CNiceLevel} "${@}"
}
function PushNiceCmd {
PushAdvancedCmd  WrapJob "${@}"
}
function UnpackCmd {
check_valid_var_name ${1} || return $?
eval _RETURN=('"${'"${1}"'[@]}"')
unset "${1}[@]"
}
function runJobParrell {
local mjobCnt=${1} && shift
jcnt=0
function WrapJob {
[ ${1} -le ${CNiceLevel} ] || renice -n ${1}
local Buffer=$("${@:2}")
echo "${Buffer}"
kill -s USR2 $$
}
function JobFinised {
jcnt=$((${jcnt}-1))
}
trap JobFinised USR2
while [ $# -gt 0 ] ; do
while [ ${jcnt} -lt ${mjobCnt} ]; do
jcnt=$((${jcnt}+1))
if UnpackCmd "${1}" ; then
"${_RETURN[@]}" &
    else
continue
fi
shift
    done
sleep 1
done
}





Am 03.05.2012 23:23, schrieb John Kearney:
> Am 03.05.2012 22:30, schrieb Greg Wooledge:
>> On Thu, May 03, 2012 at 10:12:17PM +0200, John Kearney wrote:
>>> function runJobParrell {
>>> local mjobCnt=${1} && shift
>>> jcnt=0
>>> function WrapJob {
>>> "${@}"
>>> kill -s USR2 $$
>>> }
>>> function JobFinised {
>>> jcnt=$((${jcnt}-1))
>>> }
>>> trap JobFinised USR2
>>> while [ $# -gt 0 ] ; do
>>> while [ ${jcnt} -lt ${mjobCnt} ]; do
>>> jcnt=$((${jcnt}+1))
>>> echo WrapJob "${1}" "${2}"
>>> WrapJob "${1}" "${2}" &
>>> shift 2
>>> done
>>> sleep 1
>>> done
>>> }
>>> function testProcess {
>>> echo "${*}"
>>> sleep 1
>>> }
>>> runJobParrell 2  testProcess "jiji#" testProcess "jiji#" testProcess
>>> "jiji#"
>>>
>>> tends to work well enough.
>>> it gets a bit more complex if you want to recover output but not too much.
>> The real issue here is that there is no generalizable way to store an
>> arbitrary command for later execution.  Your example assumes that each
>> pair of arguments constitutes one simple command, which is fine if that's
>> all you need it to do.  But the next guy asking for this will want to
>> schedule arbitrarily complex shell pipelines and complex commands with
>> here documents and brace expansions and 
>>
>
> :)
> A more complex/flexible example. More like what I actually use.
>
>
>
>
>   CNiceLevel=$(nice)
> declare -a JobArray
> function PushAdvancedCmd {
> local IFS=$'\v'
> JobArray+=("${*}")
> }
> function PushSimpleCmd {
> PushAdvancedCmd  WrapJob ${CNiceLevel} "${@}"
> }
> function PushNiceCmd {
> PushAdvancedCmd  WrapJob "${@}"
> }
> function UnpackCmd {
> local IFS=$'\v'
> set -o noglob
> _RETURN=( .${1}. )  
> set +o noglob
> _RETURN[0]="${_RETURN[0]#.}"
> local -i le=${#_RETURN[@]}-1
> _RETURN[${le}]="${_RETURN[${le}]%.}"
> }
> function runJobParrell {
> local mjobCnt=${1} && shift
> jcnt=0
> function WrapJob {
> [ ${1} -le ${CNiceLevel} ] || renice -n ${1}
> local Buffer=$("${@:2}")
> echo "${Buffer}"
> kill -s USR2 $$
> }
> function JobFinised {
> jcnt=$((${jcnt}-1))
> }
> trap JobFinised USR2
> while [ $# -gt 0 ] ; do
> while [ ${jcnt} -lt ${mjobCnt} ]; do
> jcnt=$((${jcnt}+1))
&

Re: Parallelism a la make -j / GNU parallel

2012-05-03 Thread John Kearney

Am 03.05.2012 22:30, schrieb Greg Wooledge:
> On Thu, May 03, 2012 at 10:12:17PM +0200, John Kearney wrote:
>> function runJobParrell {
>> local mjobCnt=${1} && shift
>> jcnt=0
>> function WrapJob {
>> "${@}"
>> kill -s USR2 $$
>> }
>> function JobFinised {
>> jcnt=$((${jcnt}-1))
>> }
>> trap JobFinised USR2
>> while [ $# -gt 0 ] ; do
>> while [ ${jcnt} -lt ${mjobCnt} ]; do
>> jcnt=$((${jcnt}+1))
>> echo WrapJob "${1}" "${2}"
>> WrapJob "${1}" "${2}" &
>> shift 2
>> done
>> sleep 1
>> done
>> }
>> function testProcess {
>> echo "${*}"
>> sleep 1
>> }
>> runJobParrell 2  testProcess "jiji#" testProcess "jiji#" testProcess
>> "jiji#"
>>
>> tends to work well enough.
>> it gets a bit more complex if you want to recover output but not too much.
> The real issue here is that there is no generalizable way to store an
> arbitrary command for later execution.  Your example assumes that each
> pair of arguments constitutes one simple command, which is fine if that's
> all you need it to do.  But the next guy asking for this will want to
> schedule arbitrarily complex shell pipelines and complex commands with
> here documents and brace expansions and 
>


:)
A more complex/flexible example. More like what I actually use.




  CNiceLevel=$(nice)
declare -a JobArray
function PushAdvancedCmd {
local IFS=$'\v'
JobArray+=("${*}")
}
function PushSimpleCmd {
PushAdvancedCmd  WrapJob ${CNiceLevel} "${@}"
}
function PushNiceCmd {
PushAdvancedCmd  WrapJob "${@}"
}
function UnpackCmd {
local IFS=$'\v'
set -o noglob
_RETURN=( .${1}. )  
set +o noglob
_RETURN[0]="${_RETURN[0]#.}"
local -i le=${#_RETURN[@]}-1
_RETURN[${le}]="${_RETURN[${le}]%.}"
}
function runJobParrell {
local mjobCnt=${1} && shift
jcnt=0
function WrapJob {
[ ${1} -le ${CNiceLevel} ] || renice -n ${1}
local Buffer=$("${@:2}")
echo "${Buffer}"
kill -s USR2 $$
}
function JobFinised {
jcnt=$((${jcnt}-1))
}
trap JobFinised USR2
while [ $# -gt 0 ] ; do
while [ ${jcnt} -lt ${mjobCnt} ]; do
jcnt=$((${jcnt}+1))
UnpackCmd "${1}"
"${_RETURN[@]}" &
shift
done
sleep 1
done
}



function testProcess {
echo "${*}"
sleep 1
}
#  So standard variable args can be handled in 2 ways 1
#  encode them as such
PushSimpleCmd testProcess "jiji#" dfds dfds dsfsd
PushSimpleCmd testProcess "jiji#" dfds dfds
PushNiceCmd 20 testProcess "jiji#" dfds
PushSimpleCmd testProcess "jiji#"
PushSimpleCmd testProcess "jiji#" "*" s
# more complex things just wrap them in a function and call it
function DoComplexMagicStuff1 {
echo "${@}" >&2
}
# Or more normally just do a hybrid of both.
PushSimpleCmd DoComplexMagicStuff1 "jiji#"

#
   
runJobParrell 1 "${JobArray[@]}"



Note there is another level of complexity where I start a JobQueue
Process and issues it commands using a fifo.

Re: Parallelism a la make -j / GNU parallel

2012-05-03 Thread John Kearney

I tend to do something more like this


function runJobParrell {
local mjobCnt=${1} && shift
jcnt=0
function WrapJob {
"${@}"
kill -s USR2 $$
}
function JobFinised {
jcnt=$((${jcnt}-1))
}
trap JobFinised USR2
while [ $# -gt 0 ] ; do
while [ ${jcnt} -lt ${mjobCnt} ]; do
jcnt=$((${jcnt}+1))
echo WrapJob "${1}" "${2}"
WrapJob "${1}" "${2}" &
shift 2
done
sleep 1
done
}
function testProcess {
echo "${*}"
sleep 1
}
runJobParrell 2  testProcess "jiji#" testProcess "jiji#" testProcess
"jiji#"

tends to work well enough.
it gets a bit more complex if you want to recover output but not too much.

Am 03.05.2012 21:21, schrieb Elliott Forney:
> Here is a construct that I use sometimes... although you might wind up
> waiting for the slowest job in each iteration of the loop:
>
>
> maxiter=100
> ncore=8
>
> for iter in $(seq 1 $maxiter)
> do
>   startjob $iter &
>
>   if (( (iter % $ncore) == 0 ))
>   then
> wait
>   fi
> done
>
>
> On Thu, May 3, 2012 at 12:49 PM, Colin McEwan  wrote:
>> Hi there,
>>
>> I don't know if this is anything that has ever been discussed or
>> considered, but would be interested in any thoughts.
>>
>> I frequently find myself these days writing shell scripts, to run on
>> multi-core machines, which could easily exploit lots of parallelism (eg. a
>> batch of a hundred independent simulations).
>>
>> The basic parallelism construct of '&' for async execution is highly
>> expressive, but it's not useful for this sort of use-case: starting up 100
>> jobs at once will leave them competing, and lead to excessive context
>> switching and paging.
>>
>> So for practical purposes, I find myself reaching for 'make -j' or GNU
>> parallel, both of which destroy the expressiveness of the shell script as I
>> have to redirect commands and parameters to Makefiles or stdout, and
>> wrestle with appropriate levels of quoting.
>>
>> What I would really *like* would be an extension to the shell which
>> implements the same sort of parallelism-limiting / 'process pooling' found
>> in make or 'parallel' via an operator in the shell language, similar to '&'
>> which has semantics of *possibly* continuing asynchronously (like '&') if
>> system resources allow, or waiting for the process to complete (';').
>>
>> Any thoughts, anyone?
>>
>> Thanks!
>>
>> --
>> C.
>>
>> https://plus.google.com/109211294311109803299
>> https://www.facebook.com/mcewanca

Re: Fwd: Bash bug interpolating delete characters

2012-05-03 Thread John Kearney

Am 03.05.2012 19:41, schrieb John Kearney:
> Am 03.05.2012 15:01, schrieb Greg Wooledge:
>>> Yours, Rüdiger.
>>> a=x
>>> del="$(echo -e "\\x7f")"
>>>
>>> echo "$del${a#x}" | od -ta
>>> echo "$del ${a#x}" | od -ta
>>> echo " $del${a#x}" | od -ta
>> Yup, confirmed that it breaks here, and only when the # parameter expansion
>> is included.
>>
>> imadev:~$ del=$'\x7f' a=x b=
>> imadev:~$ echo " $del$b" | od -ta
>> 000   sp del  nl
>> 003
>> imadev:~$ echo " $del${b}" | od -ta
>> 000   sp del  nl
>> 003
>> imadev:~$ echo " $del${b#x}" | od -ta
>> 000   sp del  nl
>> 003
>> imadev:~$ echo " $del${a#x}" | od -ta
>> 000   sp  nl
>> 002
>>
>> Bash 4.2.24.
>>
> Also Confirmed, but my output is a bit wackier.
> printf %q seems to get confused, and do invalid things as well.
>
> the \x7f becomes a \
disregard the comment about printf its just escaping the space.
>
> function printTests {
> while [ $# -gt 0 ]; do
> printf"%-20s=[%q]\n""${1}" "$(eval echo "${1}")"
> shift
> done
> }
>
> a=x
> del=$'\x7f'
> printTests '"$del${a#x}"' '"$del ${a#x}"' '" $del${a#x}"' '" $del${a%x}"'
> printTests '" $del${a:0:0}"' '" $del"${a:0:0}' '" $del""${a:0:0}"'
> printTests '" $del${a}"' '" $del"' '" ${del}${a:0:0}"' '"
> ${del:0:1}${a:0:0}"'
> printTests '" ${del:0:1}${a}"' '"${del:0:1}${a#d}"' '"${del:0:1}${a#x}"'
> printTests '" ${del:0:1} ${a}"' '"${del:0:1} ${a#d}"' '"${del:0:1} ${a#x}"'
>
> output
> "$del${a#x}"=[$'\177']
> "$del ${a#x}"   =[\ ]
> " $del${a#x}"   =[\ ]
> " $del${a%x}"   =[\ ]
> " $del${a:0:0}" =[\ ]
> " $del"${a:0:0} =[$' \177']
> " $del""${a:0:0}"   =[$' \177']
> " $del${a}" =[$' \177x']
> " $del" =[$' \177']
> " ${del}${a:0:0}"   =[\ ]
> " ${del:0:1}${a:0:0}"=[\ ]
> " ${del:0:1}${a}"   =[$' \177x']
> "${del:0:1}${a#d}"  =[$'\177x']
> "${del:0:1}${a#x}"  =[$'\177']
> " ${del:0:1} ${a}"  =[$' \177 x']
> "${del:0:1} ${a#d}" =[$'\177 x']
> "${del:0:1} ${a#x}" =[\ ]
>
>
>
>
>
>

Re: Fwd: Bash bug interpolating delete characters

2012-05-03 Thread John Kearney

Am 03.05.2012 15:01, schrieb Greg Wooledge:
>> Yours, Rüdiger.
>> a=x
>> del="$(echo -e "\\x7f")"
>>
>> echo "$del${a#x}" | od -ta
>> echo "$del ${a#x}" | od -ta
>> echo " $del${a#x}" | od -ta
> Yup, confirmed that it breaks here, and only when the # parameter expansion
> is included.
>
> imadev:~$ del=$'\x7f' a=x b=
> imadev:~$ echo " $del$b" | od -ta
> 000   sp del  nl
> 003
> imadev:~$ echo " $del${b}" | od -ta
> 000   sp del  nl
> 003
> imadev:~$ echo " $del${b#x}" | od -ta
> 000   sp del  nl
> 003
> imadev:~$ echo " $del${a#x}" | od -ta
> 000   sp  nl
> 002
>
> Bash 4.2.24.
>
Also Confirmed, but my output is a bit wackier.
printf %q seems to get confused, and do invalid things as well.

the \x7f becomes a \

function printTests {
while [ $# -gt 0 ]; do
printf"%-20s=[%q]\n""${1}" "$(eval echo "${1}")"
shift
done
}

a=x
del=$'\x7f'
printTests '"$del${a#x}"' '"$del ${a#x}"' '" $del${a#x}"' '" $del${a%x}"'
printTests '" $del${a:0:0}"' '" $del"${a:0:0}' '" $del""${a:0:0}"'
printTests '" $del${a}"' '" $del"' '" ${del}${a:0:0}"' '"
${del:0:1}${a:0:0}"'
printTests '" ${del:0:1}${a}"' '"${del:0:1}${a#d}"' '"${del:0:1}${a#x}"'
printTests '" ${del:0:1} ${a}"' '"${del:0:1} ${a#d}"' '"${del:0:1} ${a#x}"'

output
"$del${a#x}"=[$'\177']
"$del ${a#x}"   =[\ ]
" $del${a#x}"   =[\ ]
" $del${a%x}"   =[\ ]
" $del${a:0:0}" =[\ ]
" $del"${a:0:0} =[$' \177']
" $del""${a:0:0}"   =[$' \177']
" $del${a}" =[$' \177x']
" $del" =[$' \177']
" ${del}${a:0:0}"   =[\ ]
" ${del:0:1}${a:0:0}"=[\ ]
" ${del:0:1}${a}"   =[$' \177x']
"${del:0:1}${a#d}"  =[$'\177x']
"${del:0:1}${a#x}"  =[$'\177']
" ${del:0:1} ${a}"  =[$' \177 x']
"${del:0:1} ${a#d}" =[$'\177 x']
"${del:0:1} ${a#x}" =[\ ]

Re: Is it possible or RFE to expand ranges of arrays

2012-04-28 Thread John Kearney

Am 28.04.2012 05:05, schrieb Linda Walsh:

Maarten Billemont wrote:

On 26 Apr 2012, at 06:30, John Kearney wrote:

Am 26.04.2012 06:26, schrieb Linda Walsh:

I know I can get
a="abcdef" echo "${a[2:4]}" = cde

how do I do:
typeset -a a=(apple berry cherry date); then get:

echo ${a[1:2]} = "berry" "cherry"  ( non-grouped args)

I tried to do it in a function and hurt myself.

echo ${a[@]:1:2}

I see little reason to ask bash to wordsplit the elements after 
expanding them. 

You ought to quote that expansion.

---
Good point.

Since if you do:

> a=( 'apple pie' 'berry pie' 'cherry cake' 'dates divine')
> b=( ${a[@]:1:2} )
> echo ${#b[*]}
4
#yikes!
> b=( "${a[@]:1:2}" )
2
#woo!

I'd guess the original poster probably figured, I'd figure out the 
correct

form pretty quickly in usage.  but thanks for your insight.

( (to all)*sigh*)

I "always" quote not sure why I didn't that time.
Except that it was just a quick response to a simple question.
but of course your right.

Re: Is it possible or RFE to expand ranges of arrays

2012-04-25 Thread John Kearney


Am 26.04.2012 06:26, schrieb Linda Walsh:

I know I can get
a="abcdef" echo "${a[2:4]}" = cde

 how do I do:
typeset -a a=(apple berry cherry date); then get:

echo ${a[1:2]} = "berry" "cherry"  ( non-grouped args)

I tried to do it in a function and hurt myself.




echo ${a[@]:1:2}

Please remove iconv_open (charset, "ASCII"); from unicode.c

2012-03-07 Thread John Kearney

Hi chet can you please remove the following from the unicode.c file

localconv = iconv_open (charset, "ASCII");

This is invalid fall back. zhis creates a translation config. The
primary attempt is utf-8 to destination codeset. If that conversion
fails this tries selecting ASCII to codeset. ! But the code still
inputs utf-8 as input to the icconv. this means that this is less
likely to successfully encode than a simple assignment. consider
U+80 becomes utf-8 "\xc2\x80" which because we tell iconv this is
ascii becomes ascii "\xc2\x80".

do this line takes a U+80 and turns it into a U+c3 and a U+80.

The way i rewrote the icconv code made it cleaner, safer and quicker,
please consider using it. I avoided the need for the strcpy among
other things.

On 02/21/2012 03:42 AM, Chet Ramey wrote:
> On 2/18/12 5:39 AM, John Kearney wrote:
> 
>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>> 
>> Description: Current u32toutf8 only encode values below 0x
>> correctly. wchar_t can be ambiguous size better in my opinion to
>> use unsigned long, or uint32_t, or something clearer.
> 
> Thanks for the patch.  It's good to have a complete
> implementation, though as a practical matter you won't see UTF-8
> characters longer than four bytes.  I agree with you about the
> unsigned 32-bit int type; wchar_t is signed, even if it's 32 bits,
> on several systems I use.
> 
> Chet
>

Re: Can somebody explain to me what u32tochar in /lib/sh/unicode.c is trying to do?

2012-03-06 Thread John Kearney

You really should stop using this function. It is just plain wrong, and
is not predictable.

It may enocde BIG5 and SJIS but is is more by accident that intent.

If you want to do something like this then do it properly.

basically all of the multibyte system have to have a detection method
for multibyte characters, most of them rely on bit7 to indicate a
multibyte sequence or use vt100 SS3 escape sequences. You really can't
just inject random data into a txt buffer. even returning UTF-8 as a
fallback is a bug. The most that should be done is return ASCII in error
case and I mean U+0-U+7f only and ignore or warn about any unsupported
characters.

Using this function is dangerous and pointless.

I mean seriously in what world does it make sense to inject utf-8 into a
big5 string? Or indead into a ascii string. Code should behave like an
adult, not like a frightened kid. By which I mean it shouldn't pretend
it knows what its doing when it doesn't, it should admit the problem so
that the problem can be fixed.

On 02/21/2012 04:28 AM, Chet Ramey wrote:
> On 2/19/12 5:07 PM, John Kearney wrote:
>> Can somebody explain to me what u32tochar is trying to do?
>>
>> It seems like dangerous code?
>>
>> from the context i'm guessing it trying to make a hail mary pass at
>> converting utf-32 to mb (not utf-8 mb)
> 
> Pretty much.  It's a big-endian representation of a 32-bit integer
> as a character string.  It's what you get when you don't have iconv
> or iconv fails and the locale isn't UTF-8.  It may not be useful,
> but it's predictable.  If we have a locale the system doesn't know
> about or can't translate, there's not a lot we can do.
> 
> Chet

Re: bash 4.2 breaks source finding libs in lib/filename...

2012-03-03 Thread John Kearney

On 03/03/2012 09:43 AM, Stefano Lattarini wrote:
> On 03/03/2012 08:28 AM, Pierre Gaston wrote:
>> On Fri, Mar 2, 2012 at 9:54 AM, Stefano Lattarini wrote:
>>
>>> Or here is a what it sounds as a marginally better idea to me: Bash could
>>> start supporting a new environment variable like "BASHLIB" (a' la'
>>> PERL5LIB)
>>> or "BASHPATH" (a' la' PYTHONPATH) holding a colon separated (or semicolon
>>> separated on Windows) list of directories where bash will look for sourced
>>> non-absolute files (even if they contain a pathname separator) before
>>> (possibly) performing a lookup in $PATH and then in the current directory.
>>> Does this sounds sensible, or would it add too much complexity and/or
>>> confusion?
>>
>> It could be even furthermore separated from the traditional "source" and a
>> new keyword introduced like "require"
>>
> This might be a slightly better interface, yes.
Agreed though include might be a better name than require. and if your
at it why not include <> and include ""

> 
>> a la lisp which would be able to do things like:
>>
>> 1) load the file, searching in the BASH_LIB_PATH (or other variables) for a
>> file with optionally the extension .sh or .bash
>> 2) only load the file if the "feature" as not been provided, eg only load
>> the file once
>>
> These sound good :-)
No I don't like that. if you want something like that just use inclusion
protection like every other language.
if [ -z "${__file_sh__:-}" ]; then
  __file_sh__=1


fi


and my source wrapper function actually checks for that variable b4
sourcing the file.
off the top of my head something like this.
[ -n "${!__$(basename "${sourceFile}" .sh)_sh__}" ] || source
"${sourceFile}"

> 
>> 3) maybe optionally only load the definition and not execute commands
>> (something I've seen people asking for on several occasions on IRC), for
>> instance that would allow to have test code inside the lib file or maybe
>> print a warning that it's a library not to be executed. (No so important
>> imo)
>>
> ... and even python don't do that!  If people care about making the test
> code in the module "automatically executable" when the module is run as
> a script, they could use an idiom similar to the python one:
> 
>   # For python.
>   if __name__ == "__main__":
> test code ...
> 
> i.e.:
> 
>   # For bash.
>   if [[ -n $BASH_SOURCE ]]; then
> test code ...
>   fi
> 
Only works if you source from thh command line, not execute.
what you actually have to do is something like this
  # For bash.
  if [[ "$(basename "${0}")" = scriptname.sh ]]; then
test code ...
  fi


>> I think this would benefit the bash_completion project and help them to
>> split the script so that the completion are only loaded on demand.
>> (one of the goal mentionned at http://bash-completion.alioth.debian.org/ is
>> "make bash-completion dynamically load completions")
>> My understanding is that the
>> http://code.google.com/p/bash-completion-lib/project did something
>> like this but that it was not  working entirely as
>> they wanted.
>> (I hope some of the devs reads this list)
>>
>> On the other hand, there is the possibility to add FPATH and autoload like
>> in ksh93 ...
>> I haven't think to much about it but my guess is that it would really be
>> easy to implement a module system with that.
>>
>> my 2 centsas I don't have piles of bash lib.
>>
> Same here -- it was more of a "theoretical suggestion", in the category of
> "hey, you know what would be really cool to have?" :-)  But I don't deeply
> care about it, personally.

What would be really useful (dreamy eyes) would be namespace support :)

something like this
 { # codeblock
   namespace namespace1
   testvar=s

   { # codeblock
 namespace namespace2
 testvar=s

   }
 }

 treated like this
 namespace1.testvar=s
 namespace1.namespace2.testvar=s



 although non posix this is already kinda supported because you can do
 function test1.ert.3 {
 }

 I mean all you would do is treat the namespace as a variable preamble
 so you'd have something like this to find the function etc
 if [ type "${varname}" ]
 elif [ type "${namespace}${varname}" ]
 else error not found

 wouldn't actually break anything afaik.


> 
> Regards,
>   Stefano
>

Re: RFE: allow bash to have libraries

2012-03-01 Thread John Kearney

https://github.com/dethrophes/Experimental-Bash-Module-System/blob/master/bash/template.sh
So can't repeat this enough !play code!!.
However suggestions are welcome. If this sort of thing is of
interesting I could maintain it online I guess.

basically I wan kinda thinking perl/python module libary when I started


So what I like

trap error etc and print error mesages
set nounset
Try to keep the files in 2 parts source part and run part.

Have a common args handler routine.

rediculously comples log output etc timestamped, line file function etc...

stack trace on errors

color output red for errors etc.

silly comples userinterface routines :)

I guess just have a look see and try it out.


Also note I think a lot of the files are empty/or silly files that
should actually be deleted don't have time to go through them now though.

I'd also advise using ctags, tagging it and navigating so, its what I do.


On 03/02/2012 03:54 AM, Clark J. Wang wrote:
> On Fri, Mar 2, 2012 at 08:20, John Kearney 
> wrote:
> 
>> :) :)) Personal best wrote about 1 lines of code  which
>> finally became about 200ish to implement a readkey function.
>> 
>> Actually ended up with 2 solutions 1 basted on a full bash
>> script vt100 parser weighing in a about 500 lines including state
>> tables and a s00 line hack.
>> 
>> Check out http://mywiki.wooledge.org/ReadingFunctionKeysInBash
>> 
>> 
>> Personally I'd have to say using path to source a moduel is a
>> massive securtiy risk but thats just me. I actually have a pretty
>> complex bash modules hierarchy solution. If anybodys interested I
>> guess I could upload it somewhere if anybodys interested,
> 
> 
> I just found https://gist.github.com/ a few days ago :)
> 
> Gist is a simple way to share snippets and pastes with others. All
> gists are git repositories, so they are automatically versioned,
> forkable and usable as a git repository.
> 
> 
>> its just a play thing for me really but its couple 1000 lines of
>> code proabely more like 1+. Its kinda why I started updating
>> Gregs wiwi I noticed I'd found different/better ways of dealing
>> with a lot of problems.
>> 
>> Thiing like secured copy/move funtions. Task Servers. Generic
>> approach to user interface interactions. i.e. supports both gui
>> and console input in my scripts. Or I even started a bash based
>> ncurses type system :), like I say some fune still got some
>> performance issues with that one.
>> 
>> Or improves select function that supports arrow keys and mouse 
>> selection, written in bash.
>> 
>> Anybody interested in this sort of thing?
>> 
> 
> I'm interested.

Re: RFE: allow bash to have libraries

2012-03-01 Thread John Kearney

:) :))
Personal best wrote about 1 lines of code  which finally became
about 200ish to implement a readkey function.

Actually ended up with 2 solutions 1 basted on a full bash script
vt100 parser weighing in a about 500 lines including state tables and
a s00 line hack.

Check out http://mywiki.wooledge.org/ReadingFunctionKeysInBash

Personally I'd have to say using path to source a moduel is a massive
securtiy risk but thats just me.
I actually have a pretty complex bash modules hierarchy solution.
If anybodys interested I guess I could upload it somewhere if anybodys
interested, its just a play thing for me really but its couple 1000
lines of code proabely more like 1+.
Its kinda why I started updating Gregs wiwi I noticed I'd found
different/better ways of dealing with a lot of problems.

Thiing like secured copy/move funtions. Task Servers.
Generic approach to user interface interactions. i.e. supports both
gui and console input in my scripts.
Or I even started a bash based ncurses type system :), like I say some
fune still got some performance issues with that one.

Or improves select function that supports arrow keys and mouse
selection, written in bash.

Anybody interested in this sort of thing?

On 03/01/2012 11:48 PM, Linda Walsh wrote:
> John Kearney wrote: ... [large repetitive included text elided...]
> 
>> why not just do something like this?
>> 
>  <26 line suggested 'header' elided...>
>> gives you more control anyway, pretty quick and simple.
>> 
>> 
> At least 30% of the point of this is to take large amounts of
> common initialization code that ends up at the front of  many or
> most of my scripts and have it hidden in a side file where it can 
> just be 'included'...
> 
> Having to add 26 lines of code just to include 20 common lines
> doesn't sound like a net-gain...
> 
> 
> I thought of doing something similar until I realized I'd end up 
> with some path-search routine written in shell at the beginning of
> each program just to enable bash to have structured & hierarchical
> libraries like any other programming language except maybe BASIC
> (or other shells)
> 
> My problem is I keep thinking problems can be solvable in a few
> lines of shell code.   Then they grow...   *sigh*...
> 
>

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-29 Thread John Kearney

On 02/29/2012 11:55 PM, Chet Ramey wrote:
> On 2/28/12 4:28 PM, John Kearney wrote:
>>
>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>>
>>>> But that isn't how it behaves.
>>>> "${test//str/""}"
>>>>
>>>> because str is replaced with '""' as such it is treating the double
>>>> quotes as string literals.
>>>>
>>>> however at the same time these literal double quotes escape/quote a
>>>> single quote between them.
>>>> As such they are treated both as literals and as quotes as such
>>>> inconsistently.
>>>
>>> I don't have a lot of time today, but I'm going to try and answer bits
>>> and pieces of this discussion.
>>>
>>> Yes, bash opens a new `quoting context' (for lack of a better term) inside
>>> ${}.  Posix used to require it, though after lively discussion it turned
>>> into "well, we said that but it's clearly not what we meant."
>>>
>>> There are a couple of places in the currently-published version of the
>>> standard, minus any corregendia, that specify this.  The description of
>>> ${parameter} reads, in part,
>>>
>>> "The matching closing brace shall be determined by counting brace levels,
>>> skipping over enclosed quoted strings, and command substitutions."
>>>
>>> The section on double quotes reads, in part:
>>>
>>> "Within the string of characters from an enclosed "${" to the matching
>>> '}', an even number of unescaped double-quotes or single-quotes, if any,
>>> shall occur."
>>>
>>> Chet
>>
>> yhea but I think the point is that the current behavior is useless.
>> there is no case where I want a " to be printed and start a double
>> quoted string? and thats the current behavior.
> 
> Maybe you don't, but there are several cases in the test suite that do
> exactly that, derived from an old bug report.
> 
> We don't have to keep the bash-4.2 behavior, but we need to acknowledge
> that it's not backwards-compatible.
> 
> 

Personally vote for ksf93 like behavior, was more intuitive for me, not
that I've tested it all that much but the first impression was a good
one. seriously try it out an see which behavior you want to use.

As for backward compatibility. to be honest I think that anybody who
relied on this behavior should be shot ;) Like someone already said the
only sane way to use it now is with a variable.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-29 Thread John Kearney

On 03/01/2012 12:12 AM, Andreas Schwab wrote:
> John Kearney  writes:
> 
>> It isn't just the quote removal that is confusing.
>> 
>> The escape character is also not removed and has its special
>> meaning.
> 
> The esacape character is also a quote character, thus also subject
> to quote removal.
> 
> Andreas.
> 
oh wasn't aware of that distinction thx.

Re: RFE: allow bash to have libraries (was bash 4.2 breaks source finding libs in lib/filename...)

2012-02-29 Thread John Kearney

On 02/29/2012 11:53 PM, Linda Walsh wrote:
> 
> 
> Eric Blake wrote:
> 
>> On 02/29/2012 12:26 PM, Linda Walsh wrote:
>> 
 Any pathname that contains a / should not be subject to PATH
 searching.
>> 
>> Agreed - as this behavior is _mandated_ by POSIX, for both sh(1)
>> and for execlp(2) and friends.
> 
> 
> Is it that you don't read english as a first language, or are you
> just trying to be argumentative?'
> 
> I said:  Original Message  Subject: bash 4.2 breaks
> source finding libs in lib/filename... Date: Tue, 28 Feb 2012
> 17:34:21 -0800 From: Linda Walsh To: bug-bash
> 
> Why was this functionality removed in non-posix mode?
> 
> So, your arguments are all groundless and pointless, as your entire
> arguments stem from posix .. which I specifically said I'm NOT
> specifying.   If I want posix behavior, I can flick a switch and
> have such compatibility.
> 
> however, Bash was designed to EXceeed the limitations and features
> of POSIX, so the fact that posix is restrained in this area, is a
> perfect reason to allow it -- as it makes it
> 
> 
>> 
>>> Pathnames that *start* with '/' are called an "absolute"
>>> pathnames,
>>> 
>>> while paths not starting with '/' are relative.
>> 
>> And among the set of relative pathnames, there are two further 
>> divisions: anchored (contains at least one '/') and unanchored
>> (no '/'). PATH lookup is defined as happening _only_ for
>> unanchored names.
>> 
>>> Try 'C', if you include a include file with "/", it scans for
>>> it in each .h root.
>> 
>> The C compiler _isn't_ doing a PATH search, so it follows
>> different rules.
>> 
>>> Almost all normal utils take their 'paths to be the 'roots' of 
>>> trees that contain files.  Why should bash be different?
>> 
>> Because that's what POSIX says.
> 
> --- Posix says to ground paths with "/" in them at the root's of
> their paths?   But it says differently for BASH?   you aren't
> making sense.
> 
> All the utils.
> 
> What does man do?... it looks for a "/" separated hierarchy under 
> EACH entry of MANPATH.
> 
> What does Perl do?  It looks for a "/" separated hierarchy under
> each entry in lib.
> 
> What does vim do?  It looks for a vim-hierarchy under each entry
> of it's list of vim-runtimes.
> 
> what does ld do?  What does C do?  What does C++ do?   They all
> look for "/" separated hierarchies under a PATH-like root.
> 
> 
> You claim that behavior is mandated by posix?   I didn't know
> posix specified perl standards.  or vim... but say they do 
> then why wouldn't you also look for a "/" separated hierarchy under
> PATH?
> 
> What does X do?   -- a "/" separated hierarchy?
> 
> 
> What does Microsoft do for registry locations?   a "\" separated 
> hierarchy under 64 or 32-bit registry areas.
> 
> Where do demons look for files?  Under a "/" separated hierarchy
> that may be root or a pseudo-root...
> 
> All of these utils use "/" separated hierarchies -- none of them
> refuse to do a path lookup with "/" is in the file name.   The
> entire concept of libraries would fail -- as they are organized
> hierarchically.   but you may not know the library location until
> runtime, so you have a path and a hierarchical lookup.
> 
> So why shouldn't Bash be able to look for 'library' functions in a 
> hierarchy?
> 
> Note -- as we are talking about non-posix mode of BASH, you can't
> use POSIX as a justification.
> 
> 
> As for making another swithc -- there is already a switch --
> 'posix' for posix behavior.
> 
> I'm not asking for a change in posix behavior, so you can continue
> using posix mode ...
> 
> 
> 
> 
>> 
>>> It goes against 'common sense' and least surprise -- given it's
>>> the norm in so many other applications.
>> 
>> About the best we can do is accept a patch (are you willing to
>> write it? if not, quit complaining)
> 
> 
>> that would add a new shopt, off by default,
> 
> 
> ---
> 
> I would agree to it being off in posix mode, by default, and on,
> by default when not in posix mode...
> 
> 
> 
>> allow your desired alternate behavior.  But I won't write such a
>> patch, and if such a patch is written, I won't use it, because
>> I'm already used to the POSIX behavior.
> 
> --- How do you use the current behavior that doesn't do a path
> lookup if you include a / in the path (not at the beginning), that
> you would be able to make use of if you added "." to the beginning
> of your path (either temporarily or permanently...)?
> 
> 
> How do you organize your hierarchical libraries with bash so they
> don't have hard coded paths?
> 
> 
> 
why not just do something like this?
# FindInPathVarExt
 [ [  [  ]]]
function FindInPathVarExt {
  local -a PathList
  IFS=":" read -a PathList <<< "${2}"
  for CPath in "${PathList[@]}" ; do
for CTest in "${@:4}"; do
  test "${CTest}" "${CPath}/${3}" || continue 2
done
printf -v "${1}" "${CPath}/${3}"
return 0
  done
  printf -v "${1}" "Not Found"
  ret

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-29 Thread John Kearney


It isn't just the quote removal that is confusing.

The escape character is also not removed and has its special meaning.


and this also confuses me
take the following 2 cases
echo ${a:-$'\''}
'
echo "${a:-$'\''}"
bash: bad substitution: no closing `}' in "${a:-'}"


and take the following 3 cases
echo "${a:-$(echo $'\'')}"
bash: command substitution: line 38: unexpected EOF while looking for
matching `''
bash: command substitution: line 39: syntax error: unexpected end of file

echo ${a:-$(echo $'\'')}
'
echo "${a:-$(echo \')}"
'

This can not be logical behavior.

On 02/29/2012 11:26 PM, Chet Ramey wrote:
> On 2/28/12 10:52 AM, John Kearney wrote:
>> Actually this is something that still really confuses me as
>> well.
> 
> The key is that bash doesn't do quote removal on the `string' part
> of the "${param/pat/string}" expansion.  The double quotes are key;
> quote removal happens when the expansion is unquoted.
> 
> Double quotes are supposed to inhibit quote removal, but bash's
> hybrid behavior of allowing quotes to escape characters but not
> removing them is biting us here.
>

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 11:23 PM, Chet Ramey wrote:
> On 2/28/12 5:18 PM, John Kearney wrote:
>> On 02/28/2012 11:07 PM, Chet Ramey wrote:
>>> On 2/28/12 4:28 PM, John Kearney wrote:
>>>>
>>>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>>>>
>>>>>> But that isn't how it behaves. "${test//str/""}"
>>>>>>
>>>>>> because str is replaced with '""' as such it is treating
>>>>>> the double quotes as string literals.
>>>>>>
>>>>>> however at the same time these literal double quotes
>>>>>> escape/quote a single quote between them. As such they are
>>>>>> treated both as literals and as quotes as such 
>>>>>> inconsistently.
>>>>>
>>>>> I don't have a lot of time today, but I'm going to try and
>>>>> answer bits and pieces of this discussion.
>>>>>
>>>>> Yes, bash opens a new `quoting context' (for lack of a better
>>>>> term) inside ${}.  Posix used to require it, though after
>>>>> lively discussion it turned into "well, we said that but it's
>>>>> clearly not what we meant."
>>>>>
>>>>> There are a couple of places in the currently-published version
>>>>> of the standard, minus any corregendia, that specify this.  The
>>>>> description of ${parameter} reads, in part,
>>>>>
>>>>> "The matching closing brace shall be determined by counting
>>>>> brace levels, skipping over enclosed quoted strings, and
>>>>> command substitutions."
>>>>>
>>>>> The section on double quotes reads, in part:
>>>>>
>>>>> "Within the string of characters from an enclosed "${" to the
>>>>> matching '}', an even number of unescaped double-quotes or
>>>>> single-quotes, if any, shall occur."
>>>>>
>>>>> Chet
>>>>
>>>> yhea but I think the point is that the current behavior is
>>>> useless. there is no case where I want a " to be printed and
>>>> start a double quoted string? and thats the current behavior.
>>>>
>>>>
>>>> Not so important how you treat it just need to pick 1. then you
>>>> can at least work with it. Now you have to use a temp variable.
>>>>
>>>>
>>>> as a side note ksh93 is pretty good, intuitive ksh93 -c
>>>> 'test=teststrtest ; echo "${test//str/"dd dd"}"' testdd ddtest 
>>>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )' 
>>>> testdd 'ddtest
>>>
>>> The real question is whether or not you do quote removal on the
>>> stuff inside the braces when they're enclosed in double quotes.
>>> Double quotes usually inhibit quote removal.
>>>
>>> The Posix "solution" to this is to require quote removal if a
>>> quote character (backslash, single quote, double quote) is used to
>>> escape or quote another character.  Somewhere I have the reference
>>> to the Austin group discussion on this.
>>>
>>
>> 1${A:-B}2
>>
>> Logically for consistancy having double quotes at position 1 and 2
>> should have no effect on how you treat string B.
> 
> Maybe, but that's not how things work in practice.  Should the following
> expansions output the same thing?  What should they output?
> 
> bar=abc
> echo ${foo:-'$bar'}
> echo "${foo:-'$bar'}"
> 
> Chet


and truthfully with thr current behavior Id' almost expect this  behavior.

$bar
'$bar'
but to be honest without trying it out I have no idea and that is the
problem now.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 11:44 PM, Chet Ramey wrote:
> echo "$(echo '$bar')"

actually these both output the same in bash
echo "$(echo '$bar')"
echo $(echo '$bar')

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 11:23 PM, Chet Ramey wrote:
> On 2/28/12 5:18 PM, John Kearney wrote:
>> On 02/28/2012 11:07 PM, Chet Ramey wrote:
>>> On 2/28/12 4:28 PM, John Kearney wrote:
>>>>
>>>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>>>>
>>>>>> But that isn't how it behaves. "${test//str/""}"
>>>>>>
>>>>>> because str is replaced with '""' as such it is treating
>>>>>> the double quotes as string literals.
>>>>>>
>>>>>> however at the same time these literal double quotes
>>>>>> escape/quote a single quote between them. As such they are
>>>>>> treated both as literals and as quotes as such 
>>>>>> inconsistently.
>>>>>
>>>>> I don't have a lot of time today, but I'm going to try and
>>>>> answer bits and pieces of this discussion.
>>>>>
>>>>> Yes, bash opens a new `quoting context' (for lack of a better
>>>>> term) inside ${}.  Posix used to require it, though after
>>>>> lively discussion it turned into "well, we said that but it's
>>>>> clearly not what we meant."
>>>>>
>>>>> There are a couple of places in the currently-published version
>>>>> of the standard, minus any corregendia, that specify this.  The
>>>>> description of ${parameter} reads, in part,
>>>>>
>>>>> "The matching closing brace shall be determined by counting
>>>>> brace levels, skipping over enclosed quoted strings, and
>>>>> command substitutions."
>>>>>
>>>>> The section on double quotes reads, in part:
>>>>>
>>>>> "Within the string of characters from an enclosed "${" to the
>>>>> matching '}', an even number of unescaped double-quotes or
>>>>> single-quotes, if any, shall occur."
>>>>>
>>>>> Chet
>>>>
>>>> yhea but I think the point is that the current behavior is
>>>> useless. there is no case where I want a " to be printed and
>>>> start a double quoted string? and thats the current behavior.
>>>>
>>>>
>>>> Not so important how you treat it just need to pick 1. then you
>>>> can at least work with it. Now you have to use a temp variable.
>>>>
>>>>
>>>> as a side note ksh93 is pretty good, intuitive ksh93 -c
>>>> 'test=teststrtest ; echo "${test//str/"dd dd"}"' testdd ddtest 
>>>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )' 
>>>> testdd 'ddtest
>>>
>>> The real question is whether or not you do quote removal on the
>>> stuff inside the braces when they're enclosed in double quotes.
>>> Double quotes usually inhibit quote removal.
>>>
>>> The Posix "solution" to this is to require quote removal if a
>>> quote character (backslash, single quote, double quote) is used to
>>> escape or quote another character.  Somewhere I have the reference
>>> to the Austin group discussion on this.
>>>
>>
>> 1${A:-B}2
>>
>> Logically for consistancy having double quotes at position 1 and 2
>> should have no effect on how you treat string B.
> 
> Maybe, but that's not how things work in practice.  Should the following
> expansions output the same thing?  What should they output?
> 
> bar=abc
> echo ${foo:-'$bar'}
> echo "${foo:-'$bar'}"
> 
> Chet
my first intuition on this whole thing was
§(varename arg1 arg2)

I.E. conceptually treat it like a function the options are arguments.
That is then consistant, and intuative. Don'tget confused by the syntax.
If I want 'as' i'll type \'as\' or some such.


the outermost quotes only effect how the final value is handled.
same as §()


having special behaviour model for that string makes it imposible to
work with really.

this should actually make it easier for the parser.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 11:15 PM, Chet Ramey wrote:
> On 2/28/12 5:07 PM, Chet Ramey wrote:
> 
>>> yhea but I think the point is that the current behavior is useless.
>>> there is no case where I want a " to be printed and start a double
>>> quoted string? and thats the current behavior.
>>>
>>>
>>> Not so important how you treat it just need to pick 1. then you can at
>>> least work with it. Now you have to use a temp variable.
>>>
>>>
>>> as a side note ksh93 is pretty good, intuitive
>>> ksh93 -c 'test=teststrtest ; echo "${test//str/"dd dd"}"'
>>> testdd ddtest
>>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )'
>>> testdd 'ddtest
>>
>> The real question is whether or not you do quote removal on the stuff
>> inside the braces when they're enclosed in double quotes.  Double
>> quotes usually inhibit quote removal.
>>
>> The Posix "solution" to this is to require quote removal if a quote
>> character (backslash, single quote, double quote) is used to escape
>> or quote another character.  Somewhere I have the reference to the
>> Austin group discussion on this.
> 
> http://austingroupbugs.net/view.php?id=221
> 
> Chet

This however doesn't make reference to changing that behavior if you
enclose the entire thing in double quotes.

${a//a/"a"} should behave the same as "${a//a/"a"}"

I mean the search and replace should behave the same. Currently they dont

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 11:07 PM, Chet Ramey wrote:
> On 2/28/12 4:28 PM, John Kearney wrote:
>> 
>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>> 
>>>> But that isn't how it behaves. "${test//str/""}"
>>>> 
>>>> because str is replaced with '""' as such it is treating
>>>> the double quotes as string literals.
>>>> 
>>>> however at the same time these literal double quotes
>>>> escape/quote a single quote between them. As such they are
>>>> treated both as literals and as quotes as such 
>>>> inconsistently.
>>> 
>>> I don't have a lot of time today, but I'm going to try and
>>> answer bits and pieces of this discussion.
>>> 
>>> Yes, bash opens a new `quoting context' (for lack of a better
>>> term) inside ${}.  Posix used to require it, though after
>>> lively discussion it turned into "well, we said that but it's
>>> clearly not what we meant."
>>> 
>>> There are a couple of places in the currently-published version
>>> of the standard, minus any corregendia, that specify this.  The
>>> description of ${parameter} reads, in part,
>>> 
>>> "The matching closing brace shall be determined by counting
>>> brace levels, skipping over enclosed quoted strings, and
>>> command substitutions."
>>> 
>>> The section on double quotes reads, in part:
>>> 
>>> "Within the string of characters from an enclosed "${" to the
>>> matching '}', an even number of unescaped double-quotes or
>>> single-quotes, if any, shall occur."
>>> 
>>> Chet
>> 
>> yhea but I think the point is that the current behavior is
>> useless. there is no case where I want a " to be printed and
>> start a double quoted string? and thats the current behavior.
>> 
>> 
>> Not so important how you treat it just need to pick 1. then you
>> can at least work with it. Now you have to use a temp variable.
>> 
>> 
>> as a side note ksh93 is pretty good, intuitive ksh93 -c
>> 'test=teststrtest ; echo "${test//str/"dd dd"}"' testdd ddtest 
>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )' 
>> testdd 'ddtest
> 
> The real question is whether or not you do quote removal on the
> stuff inside the braces when they're enclosed in double quotes.
> Double quotes usually inhibit quote removal.
> 
> The Posix "solution" to this is to require quote removal if a
> quote character (backslash, single quote, double quote) is used to
> escape or quote another character.  Somewhere I have the reference
> to the Austin group discussion on this.
> 

1${A:-B}2

Logically for consistancy having double quotes at position 1 and 2
should have no effect on how you treat string B.


or
consider this
1${A/B/C}2

in this case its even weirder double quotes at 1 and 2 has no effect
on A or B but modifies how string C behaves.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney


On 02/28/2012 10:05 PM, Chet Ramey wrote:
> On 2/28/12 12:26 PM, John Kearney wrote:
> 
>> But that isn't how it behaves.
>> "${test//str/""}"
>>
>> because str is replaced with '""' as such it is treating the double
>> quotes as string literals.
>>
>> however at the same time these literal double quotes escape/quote a
>> single quote between them.
>> As such they are treated both as literals and as quotes as such
>> inconsistently.
> 
> I don't have a lot of time today, but I'm going to try and answer bits
> and pieces of this discussion.
> 
> Yes, bash opens a new `quoting context' (for lack of a better term) inside
> ${}.  Posix used to require it, though after lively discussion it turned
> into "well, we said that but it's clearly not what we meant."
> 
> There are a couple of places in the currently-published version of the
> standard, minus any corregendia, that specify this.  The description of
> ${parameter} reads, in part,
> 
> "The matching closing brace shall be determined by counting brace levels,
> skipping over enclosed quoted strings, and command substitutions."
> 
> The section on double quotes reads, in part:
> 
> "Within the string of characters from an enclosed "${" to the matching
> '}', an even number of unescaped double-quotes or single-quotes, if any,
> shall occur."
> 
> Chet

yhea but I think the point is that the current behavior is useless.
there is no case where I want a " to be printed and start a double
quoted string? and thats the current behavior.


Not so important how you treat it just need to pick 1. then you can at
least work with it. Now you have to use a temp variable.


as a side note ksh93 is pretty good, intuitive
ksh93 -c 'test=teststrtest ; echo "${test//str/"dd dd"}"'
testdd ddtest
ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )'
testdd 'ddtest

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 07:00 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 06:52:13 PM John Kearney wrote:
>> On 02/28/2012 06:43 PM, Dan Douglas wrote:
>>> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>>>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus
>>>>> wrote:
>>>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus
>>>>>>> 
>>>>>>> wrote:
>>>>>>>> And that means, there isn't way to substitute
>>>>>>>> "something" to ' (single quote) when you want to not
>>>>>>>> perform word splitting. I would consider it as a
>>>>>>>> bug.
>>>>>>> 
>>>>>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar"
>>>>>>> imadev:~$ echo "${input//something/$q}" foo'bar
>>>>>> 
>>>>>> I meant without temporary variable.
>>>>>> 
>>>>>> RR
>>>>> 
>>>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
>>>> 
>>>> ( x=abc; echo "${x/b/$'\''}" ) -bash: bad substitution: no 
>>>> closing `}' in "${x/b/'}"
>>>> 
>>>> 
>>>> you forgot the double quotes ;)
>>>> 
>>>> 
>>>> I really did spend like an hour or 2 one day trying to figure
>>>> it out and gave up.
>>> 
>>> Hm good catch. Thought there might be a new quoting context
>>> over there.
>> 
>> I think we can all agree its inconsistent, just not so sure we
>> care?? i.e. we know workarounds that aren't so bad variables
>> etc.
> 
> Eh, it's sort of consistent. e.g. this doesn't work either:
> 
> unset x; echo "${x:-$'\''}"
> 
> and likewise a backslash escape alone won't do the trick. I'd
> assume this applies to just about every expansion.
> 
> I didn't think too hard before posting that. :)


My favorite type of bug one thats consistently inconsistent :)




now that I have a beter idea of what weird I'll take a look later after
the gym.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 06:52 PM, John Kearney wrote:
> On 02/28/2012 06:43 PM, Dan Douglas wrote:
>> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus
>>>>>> wrote:
>>>>>>> And that means, there isn't way to substitute "something"
>>>>>>> to ' (single quote) when you want to not perform word
>>>>>>> splitting. I would consider it as a bug.
>>>>>>
>>>>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$
>>>>>> echo "${input//something/$q}" foo'bar
>>>>>
>>>>> I meant without temporary variable.
>>>>>
>>>>> RR
>>>>
>>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
>>>
>>> ( x=abc; echo "${x/b/$'\''}" ) -bash: bad substitution: no
>>> closing `}' in "${x/b/'}"
>>>
>>>
>>> you forgot the double quotes ;)
>>>
>>>
>>> I really did spend like an hour or 2 one day trying to figure it
>>> out and gave up.
>>
>> Hm good catch. Thought there might be a new quoting context over
>> there.
> I think we can all agree its inconsistent, just not so sure we care??
> i.e. we know workarounds that aren't so bad variables etc.
> 
> 
> 
> 
> 
To sum up


bash treats replacement strings inconsistently in double quoted variable
expansion.

example double quote is treated both as literal and as quote character.
( test=test123test ; echo "${test/123/"'"}" )
test"'"test
vs
( test=test123test ; echo "${test/123/'}" )  which hangs waiting for '

treated as literal because it is printed
treated as quote char because otherwise it should hang waiting for '

now teh single quote and backslash characters all seem to exhibit this
dual nature in the replacement string. search string behaves
consistantly. i.e. treats characters either as special or literal, not
as both at teh same time.

this has got to be a bug guys.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 06:43 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus
>>>>> wrote:
>>>>>> And that means, there isn't way to substitute "something"
>>>>>> to ' (single quote) when you want to not perform word
>>>>>> splitting. I would consider it as a bug.
>>>>> 
>>>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$
>>>>> echo "${input//something/$q}" foo'bar
>>>> 
>>>> I meant without temporary variable.
>>>> 
>>>> RR
>>> 
>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
>> 
>> ( x=abc; echo "${x/b/$'\''}" ) -bash: bad substitution: no
>> closing `}' in "${x/b/'}"
>> 
>> 
>> you forgot the double quotes ;)
>> 
>> 
>> I really did spend like an hour or 2 one day trying to figure it
>> out and gave up.
> 
> Hm good catch. Thought there might be a new quoting context over
> there.
I think we can all agree its inconsistent, just not so sure we care??
i.e. we know workarounds that aren't so bad variables etc.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 06:31 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus wrote:
 And that means, there isn't way to substitute "something" to
 ' (single quote) when you want to not perform word splitting.
 I would consider it as a bug.
>>> 
>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$ echo
>>> "${input//something/$q}" foo'bar
>> 
>> I meant without temporary variable.
>> 
>> RR
> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c

( x=abc; echo "${x/b/$'\''}" )
-bash: bad substitution: no closing `}' in "${x/b/'}"


you forgot the double quotes ;)


I really did spend like an hour or 2 one day trying to figure it out
and gave up.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 06:16 PM, Eric Blake wrote:
> On 02/28/2012 09:54 AM, John Kearney wrote:
>> On 02/28/2012 05:22 PM, Roman Rakus wrote:
>>> On 02/28/2012 05:10 PM, John Kearney wrote:
>>>> wrap it with single quotes and globally replace all single
>>>> quotes in the string with '\''
>>> single quote and slash have special meaning so they have to be 
>>> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so
>>> it undergoes word splitting. To avoid it quote it in double
>>> quotes, however it changes how slash and single quote is
>>> treated. "'${var//\'/\'}'"
>>> 
>>> Wasn't it already discussed on the list?
>>> 
>>> RR
>>> 
>> It was discussed but not answered in a way that helped.
> 
> POSIX already says that using " inside ${var+value} is
> non-portable; you've just proven that using " inside the bash
> extension of ${var//pat/sub} is likewise not useful.
I'm just going for understandable/predictable right now.


> 
>> 
>> Now I'm not looking foe a workaround, I want to understand it. 
>> Now you say they are treated special what does that mean and how
>> can I escape that specialness.
> 
> By using temporary variables.  That's the only sane approach.
I do its just always bugged.

> 
>> 
>> Or show me how without using variables to do this 
>> test=test\'string
>> 
>> [ "${test}" = "${test//"'"/"'"}" ] || exit 999
> 
> exit 999 is pointless.  It is the same as exit 231 on some shells,
> and according to POSIX, it is allowed to be a syntax error in other
> shells.
> 
I was going for || exit "Doomsday" i,e. 666 = 999 = Apocalypse.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 06:05 PM, Steven W. Orr wrote:
> On 2/28/2012 11:54 AM, John Kearney wrote:
>> On 02/28/2012 05:22 PM, Roman Rakus wrote:
>>> On 02/28/2012 05:10 PM, John Kearney wrote:
>>>> wrap it with single quotes and globally replace all single
>>>> quotes in the string with '\''
>>> single quote and slash have special meaning so they have to be 
>>> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so
>>> it undergoes word splitting. To avoid it quote it in double
>>> quotes, however it changes how slash and single quote is
>>> treated. "'${var//\'/\'}'"
>>> 
>>> Wasn't it already discussed on the list?
>>> 
>>> RR
>>> 
>> It was discussed but not answered in a way that helped.
>> 
>> 
>> Look consider this test=teststring
>> 
>> 
>> echo "${test//str/""}"
> 
> This makes no sense.
> 
> "${test//str/" is a string.  is anudder string "}" is a 3rd
> string
> 
> echo "${test//str/\"\"}"
> 
> is perfectly legal.
> 
> 
But that isn't how it behaves.
"${test//str/""}"

because str is replaced with '""' as such it is treating the double
quotes as string literals.

however at the same time these literal double quotes escape/quote a
single quote between them.
As such they are treated both as literals and as quotes as such
inconsistently.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 05:22 PM, Roman Rakus wrote:
> On 02/28/2012 05:10 PM, John Kearney wrote:
>> wrap it with single quotes and globally replace all single quotes
>> in the string with '\''
> single quote and slash have special meaning so they have to be
> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so it
> undergoes word splitting. To avoid it quote it in double quotes,
> however it changes how slash and single quote is treated. 
> "'${var//\'/\'}'"
> 
> Wasn't it already discussed on the list?
> 
> RR
> 
It was discussed but not answered in a way that helped.


Look consider this
test=teststring


echo "${test//str/""}"
test""ing

echo ${test//str/""}
testing


echo ${test//str/"'"}
test'ing

echo "${test//str/"'"}"
test"'"ing

echo "${test//str/'}"   # hangs


now consider this case

test=test\'string

echo "${test//"'"/"'"}"
test"'"string


the match string and the replace string are exhibiting 2 different
behaviors.



Now I'm not looking foe a workaround, I want to understand it.
Now you say they are treated special what does that mean and how can I
escape that specialness.

Or show me how without using variables
to do this
test=test\'string

[ "${test}" = "${test//"'"/"'"}" ] || exit 999




Note this isn't the answer
[ "${test}" = "${test//'/'}" ] || exit 999

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

this all started with a wish to single quote a variable. Doesn't matter
why I have multiple solutions to that now.

But it it an interesting problem for exploring how escaping works in
variable expansion.

so for the test case the goal is to take a string like
kljlksdjflsd'jkjkljl
wrap it with single quotes and globally replace all single quotes in the
string with '\''


its a workaround because it doesn't work all the time you would need
something more like this
IFS= echo \'${test//"'"/\'\\\'\'}\'" "
'weferfds'\''dsfsdf'



On 02/28/2012 05:01 PM, Greg Wooledge wrote:
> On Tue, Feb 28, 2012 at 04:52:48PM +0100, John Kearney wrote:
>> The standard work around you see is
>>  echo -n \'${1//\'/\'\\\'\'}\'" "
>>  but its not the same thing
> 
> Workaround for what?  Not the same thing as what?  What is this pile
> of punctuation attempting to do?
> 
>> # why does this work, this list was born of frustration, I tried
>> everything I could think of.
>> echo \'${test//"'"/\'\\\'\'}\'" "
>> 'weferfds'\''dsfsdf'
> 
> Are you trying to produce "safely usable" strings that can be fed to
> eval later?  Use printf %q for that.
> 
> imadev:~$ input="ain't it * a \"pickle\"?"
> imadev:~$ printf '%q\n' "$input"
> ain\'t\ it\ \*\ a\ \"pickle\"\?
> 
> printf -v evalable_input %q "$input"
> 
> Or, y'know, avoid eval.
> 
> Or is this something to do with sed?  Feeding strings to sed when you
> can't choose a safe delimiter?  That would involve an entirely different
> solution.  It would be nice to know what the problem is.

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

Actually this is something that still really confuses me as well.
In the end I gave up and just did this.

local LName="'\\''"
echo -n "'${1//"'"/${LName}}' "


I still don't really understand why this wont work
echo -n "'${1//"'"/"'\''"}' "
echo -n "'${1//\'/\'\\\'\'}' "


The standard work around you see is
echo -n \'${1//\'/\'\\\'\'}\'" "
 but its not the same thing

I guess what I don't understand is why quoting the variable affects the
substitutions string. I mean I guess I can  see how it could happen but
it does seem inconsistent, in fact it feels like a bug.

And even if it does affect it the effect seems to be weird.
i.e.

given
test="weferfds'dsfsdf"


# why does this work, this list was born of frustration, I tried
everything I could think of.
echo \'${test//"'"/\'\\\'\'}\'" "
'weferfds'\''dsfsdf'

#but none of the following
echo "'${test//'/}'"   # hangs waiting for '

echo "'${test//"'"/}'"
'weferfdsdsfsdf'

echo "'${test//"'"/"'\\''"}'"
'weferfds"'\''"dsfsdf'

echo "'${test//"'"/'\\''}'" # ahngs waiting or '

echo "'${test//"'"/\'\\'\'}'"
'weferfds\'\'\'dsfsdf'


leaving me doing something like
local LName="'\\''"
echo -n "'${1//"'"/${LName}}' "



I mean its a silly thing but it confuses me.


On 02/28/2012 03:47 PM, Roman Rakus wrote:
> On 02/28/2012 02:36 PM, Chet Ramey wrote:
>> On 2/28/12 4:17 AM, lhun...@lyndir.com wrote:
>>> Configuration Information [Automatically generated, do not
>>> change]: Machine: i386 OS: darwin11.2.0 Compiler:
>>> /Developer/usr/bin/clang Compilation CFLAGS:  -DPROGRAM='bash'
>>> -DCONF_HOSTTYPE='i386' -DCONF_OSTYPE='darwin11.2.0' 
>>> -DCONF_MACHTYPE='i386-apple-darwin11.2.0'
>>> -DCONF_VENDOR='apple' -DLOCALEDIR='/opt/local/share/locale'
>>> -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DMACOSX   -I.  -I.
>>> -I./include -I./lib -I/opt/local/include -pipe -O2 -arch
>>> x86_64 uname output: Darwin mbillemo.lin-k.net 11.3.0 Darwin
>>> Kernel Version 11.3.0: Thu Jan 12 18:47:41 PST 2012; 
>>> root:xnu-1699.24.23~1/RELEASE_X86_64 x86_64 Machine Type:
>>> i386-apple-darwin11.2.0
>>> 
>>> Bash Version: 4.2 Patch Level: 20 Release Status: release
>>> 
>>> Description: The handling of backslash and quotes is completely
>>> inconsistent, counter-intuitive and in violation of how the
>>> syntax works elsewhere in bash.
>>> 
>>> ' appears to introduce a single-quoted context and \ appears
>>> to escape special characters.  That's good. A substitution
>>> pattern of ' causes bash to be unable to find the closing
>>> quote.  That's good. A substitution pattern of '' SHOULD equal
>>> an empty quoted string.  The result, however, is ''.  That's
>>> NOT good.  Suddenly the quotes are literal? A substitution
>>> pattern of '$var' SHOULD disable expansion inside the quotes.
>>> The result, however, is '[contents-of-var]'.  That's NOT good.
>>> In fact, it looks like quoting doesn't work here at all. \\ is
>>> a disabled backslash, and the syntactical backslash is removed.
>>> The result is \.  That's good. \' is a disabled single quote,
>>> but the syntactical backslash is NOT removed.  The result is
>>> \'.  That's NOT good.
>>> 
>>> It mostly looks like all the rules for handling quoting and 
>>> escaping are out the window and some random and utterly
>>> inconsistent set of rules is being applied instead.
>>> 
>>> Fix: Change parsing of the substitution pattern so that it
>>> abides by all the standard documented rules regarding quotes
>>> and escaping.
>> It would go better if you gave some examples of what you
>> consider incorrect behavior.  This description isn't helpful as
>> it stands.
>> 
> Maybe something like this:
> 
> # ttt=ggg # ggg="asd'ddd'g" # echo "'${!ttt//\'/'\''}'"
>> ^C
> # echo "'${!ttt//\'/\'\\\'\'}'" 'asd\'\\'\'ddd\'\\'\'g'
> 
> 
> 
> Anyway, I thought that single quote retains its special meaning in 
> double quotes. $ echo "'a'" 'a'
> 
> RR
>

Re: Initial test code for \U

2012-02-26 Thread John Kearney

On 02/22/2012 08:59 PM, Eric Blake wrote:
> On 02/22/2012 12:55 PM, Chet Ramey wrote:
>> On 2/21/12 5:07 PM, John Kearney wrote:
>>> 
>>> Initial code for testing \u functionality.
>> 
>> Thanks; this is really good work.  In the limited testing I've
>> done, ja_JP.SHIFT_JIS is rare and C.UTF-8 doesn't exist
>> anywhere.
> 
> C.UTF-8 exists on Cygwin.  But you are correct that...
> 
>> en_US.UTF-8 seems to perform acceptably for the latter.
> 
Also on Ubuntu. I only really started using it because it is
consistent with C
i.e.
LC_CTYPE=C
LC_CTYPE=C.UTF-8

Actually this was the reason I made the comment about not being able
to detect setlocale error in bash. wanted to use a fallback list of
the locale synonyms.

The primary problem with this test is you need the locales installed.

thoretical plan
1. compile list of destination code sets.
3. some method to auto install codesets.
3. Get Unicode mappings for said code sets.
2. use iconv to generate bash test tables
4. start crying at all the error messages ;(

now locale -m gives charsets.
Any ideas about finding unicode mappings for said charsets?
I've been looking through the iconv code but all seems a bit laborious.
What charsets would make sense to test?

Re: shopt can't set extglob in a sub-shell?

2012-02-26 Thread John Kearney

I updated that wiki page
Hopefully its clearer now.
http://mywiki.wooledge.org/glob#extglob


On 02/26/2012 12:06 PM, Dan Douglas wrote:
> On Saturday, February 25, 2012 09:42:29 PM Davide Baldini wrote:
> 
>> Description: A 'test.sh` script file composed exclusively of the
>> following text fails execution: #!/bin/bash ( shopt -s extglob 
>> echo !(x) ) giving the output: $ ./test.sh ./test.sh: line 4:
>> syntax error near unexpected token `(' ./test.sh: line 4: `
>> echo !(x)' Moving the shopt line above the sub-shell parenthesis
>> makes the script work.
>> 
>> The debian man pages give no explanation.
>> 
>> Thank you.
> 
> Non-eval workaround if you're desperate:
> 
> #!/usr/bin/env bash ( shopt -s extglob declare -a a='( !(x) )' echo
> "${a[@]}" )
> 
> You may be aware extglob is special and affects parsing in other
> ways. Quoting Greg's wiki (http://mywiki.wooledge.org/glob):
> 
>> Likewise, you cannot put shopt -s extglob inside a function that
>> uses extended globs, because the function as a whole must be
>> parsed when it's defined; the shopt command won't take effect
>> until the function is called, at which point it's too late.
> 
> This appears to be a similar situation. Since parentheses are
> "metacharacters" they act strongly as word boundaries without a
> special exception for extglobs.
> 
> I just tested a bunch of permutations. I was a bit surprised to see
> this one fail:
> 
> f() if [[ $FUNCNAME != ${FUNCNAME[1]} ]]; then trap 'shopt -u
> extglob' RETURN shopt -s extglob f else f()( shopt -s extglob echo
> !(x) ) f fi
> 
> f
> 
> I was thinking there might be a general solution via the RETURN
> trap where you could just set "trace" on functions where you want
> it, but looks like even "redefinitions" break recursively, so
> you're stuck. Fortunately, there aren't a lot of good reasons to
> have extglob disabled to begin with (if any).

Re: shopt can't set extglob in a sub-shell?

2012-02-26 Thread John Kearney

On 02/25/2012 09:42 PM, Davide Baldini wrote:
> Configuration Information [Automatically generated, do not
> change]: Machine: i486 OS: linux-gnu Compiler: gcc Compilation
> CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i486'
> -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i486-pc-linux-gnu'
> -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
> -DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
> -I../bash/lib   -g -O2 -Wall uname output: Linux debianBunker
> 2.6.26-2-686 #1 SMP Wed Sep 21 04:35:47 UTC 2011 i686 GNU/Linux 
> Machine Type: i486-pc-linux-gnu
> 
> Bash Version: 4.1 Patch Level: 5 Release Status: release

Ok so had a play around with it. Its not specific to sub shells its
commands.

so the following also doesn't work.

shopt -u extglob
if true; then
  shopt -s extglob
  echo !(x)
fi

this is because bash treats the entire if statement as a command. so
the second shopt isn't evaluate before the !(x) is parsed. therefore
the error message. The error message is a parsing error not an
expansion error I think.

so bash sees the above as
shopt -u extglob
if true; then   shopt -s extglob;   echo !(x); fi

as a workaround you could try/use is this it delays parsing the !(x)
until after the shopt is evaluated.

(
shopt -s extglob
eval 'echo !(x)'
)

Not sure if this is expected behavior.
hth
deth.

Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-22 Thread John Kearney

And on the up side if they do ever give in and allow registration of
family name characters we may get a wchar_t, schar_t lwchar_t and a
llwchar_t
:)
just imagine a variable length 64bit char system.

Everything from Sumerian to Klingon in Unicode, though I think they
already are, though not officially, or are being done,

Oh god what I really want now is bash in klingon.

:))
just imagine black blackround glaring green text.
know what I'm doing tonight.

check out ( shakes head in disbelief, while chuckling )
Ubuntu Klingon Translators https://launchpad.net/~ubuntu-l10n-tlh
Expansion: Ubuntu Font should support pIqaD (Klingon)
https://bugs.launchpad.net/ubuntu/+source/ubuntu-font-family-sources/+bug/650729



On 02/23/2012 04:54 AM, Eric Blake wrote:
> On 02/22/2012 07:43 PM, John Kearney wrote:
>> ^ caviot you can represent the full 0x10 in UTF-16, you just
>> need 2 UTF-16 characters. check out the latest version of
>> unicode.c for an example how.
> 
> Yes, and Cygwin actually does this.
> 
> A strict reading of POSIX states that wchar_t must be wide enough
> for all supported characters, technically limiting things to just
> the basic plane if you have 16-bit wchar_t and a POSIX-compliant
> app.  But cygwin has exploited a loophole in the POSIX wording -
> POSIX does not require that all bit patterns are valid characters.
> So the actual Cygwin implementation is that on paper, rather than
> representing all 65536 patterns as valid characters, the values
> used in surrogate halves (0xd800 to 0xdfff) are listed as
> non-characters (so the use of them triggers undefined behavior per
> POSIX), but actually using them treats them as surrogate pairs
> (leading to the full Unicode character set, but reintroducing the
> headaches that multibyte characters had with 'char', but now with
> wchar_t, where you are back to dealing with variable-sized 
> character elements).
> 
> Furthermore, the mess of 16-bit vs. 32-bit wchar_t is one of the
> reasons why C11 has introduced two new character types, 16-bit and
> 32-bit characters, designed to fully map to the full Unicode set,
> regardless of what size wchar_t is.  It will be interesting to see
> how the next version of POSIX takes the additions of C11 and
> retrofits the other wide-character functions in POSIX but not C99
> to handle the new character types.
>

Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-22 Thread John Kearney

^ caviot you can represent the full 0x10 in UTF-16, you just need 2
UTF-16 characters. check out the latest version of unicode.c for an
example how.

On 02/22/2012 11:32 PM, Eric Blake wrote:
> On 02/22/2012 03:01 PM, Linda Walsh wrote:
>> My question had to do with an unqualified wint_t not
>> unsigned wint_t and what platform existed where an 'int' type or
>> wide-int_t, was, without qualifiers, unsigned.  I still would like
>> to know -- and posix allows int/wide-ints to be unsigned without
>> the unsigned keyword?
> 
> 'int' is signed, and at least 16 bits (these days, it's usually 32).  It
> can also be written 'signed int'.
> 
> 'unsigned int' is unsigned, and at least 16 bits (these days, it's
> usually 32).
> 
> 'wchar_t' is an arbitrary integral type, either signed or unsigned, and
> capable of holding the value of all valid wide characters.   It is
> possible to define a system where wchar_t and char are identical
> (limiting yourself to 256 valid characters), but that is not done in
> practice.  More common are platforms that use 65536 characters (only the
> basic plane of Unicode) for 16 bits, or full Unicode (0 to 0x10) for
> 32 bits.  Platforms that use 65536 characters and 16-bit wchar_t must
> have wchar_t be unsigned; whereas platforms that have wchar_t wider than
> the largest valid character can choose signed or unsigned with no impact.
> 
> 'wint_t' is an arbitrary integral type, either signed or unsigned, at
> least as wide as wchar_t, and capable of holding the value of all valid
> wide characters and the sentinel WEOF.  Like wchar_t, it may hold values
> that are neither WEOF or valid characters; and in fact, it is more
> likely to do so, since either wchar_t is saturated (all bit values are
> valid characters) and thus wint_t is a wider type, or wchar_t is sparse
> (as is the case with 32-bit wchar_t encoding Unicode), and the addition
> of WEOF to the set does not plug in the remaining sparse values; but
> using such values has unspecified results on any interface that takes a
> wint_t.  WEOF only has to be distinct, it does not have to be negative.
> 
> Don't think of it as 'wide-int', rather, think of it as 'the integral
> type that both contains wchar_t and WEOF'.  You cannot write 'signed
> wint_t' nor 'unsigned 'wint_t'.
>

Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-22 Thread John Kearney

On 02/22/2012 01:59 PM, Eric Blake wrote:
> On 02/22/2012 05:19 AM, Linda Walsh wrote:
>>
>>
>> Eric Blake wrote:
>>
>>
>>> Not only can wchar_t can be either signed or unsigned, you also have to
>>> worry about platforms where it is only 16 bits, such as cygwin; on the
>>> other hand, wint_t is always 32 bits, but you still have the issue that
>>> it can be either signed or unsigned.
>>
>>
>>
>> What platform uses unsigned wide ints?  Is that even posix compat?
> 
> Yes, it is posix compatible to have wint_t be unsigned.  Not only that,
> but both glibc (32-bit wchar_t) and cygwin (16-bit wchar_t) use a 32-bit
> unsigned int for wint_t.  Any code that expects WEOF to be less than 0
> is broken.
> 
But if what you want is a uint32  use a uint32_t ;)

printf "%q" "~" not escaped?

2012-02-21 Thread John Kearney

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
printf "%q" "~" not escaped?

which means that this
eval echo $(printf "%q" "~")
results in your home path not a ~
unlike
eval echo $(printf "%q" "*")

as far as I can see its the only character that isn't treated as I
expected.

Re: Bug? in bash setlocale implementation

2012-02-21 Thread John Kearney

On 02/22/2012 01:52 AM, Chet Ramey wrote:
> On 2/21/12 3:51 AM, John Kearney wrote:
> 
>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>> 
>> Description: Basically if setting the locale fails variable
>> should not be changed.
> 
> I disagree.  The assignment was performed correctly and as the
> user specified.  The fact that a side effect of the assignment
> failed should not mean that the assignment should be undone.
> 
> I got enough bug reports when I added the warning.  I'd get at
> least as many if I undid a perfectly good assignment statement.
> 
> I could see setting $? to a non-zero value if the setlocale() call
> fails, but not when the shell is in posix mode.
> 
> Chet
> 
ok I guess that makes sense, just ksh93 behavior also makes sense, I
guess I can just use some command to check the charset is present
before I assign it.

Here is a diff of all the changed to the unicode

2012-02-21 Thread John Kearney



Here is a diff of all the changed to the unicode

This seems to work ok for me. but still needs further testing.

My major goal was to make the code easier to follow and clearer.

but also generally fixed and improved it.

Added warning message
./bash -c 'printf "string 1\\U8fffStromg 2"'
./bash: line 0: printf: warning: U+8fff unsupported in destination
charset ".UTF-8"
string 1Stromg 2


added utf32toutf16 and utf32towchar to allow usage of wcstombs both when
wchar_t=2 or 4

generally reworked so consistent with function argument convention i.e.
destination then source.
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..77a8159 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,15 +859,9 @@ tescape (estart, cp, lenp, sawc)
 	*cp = '\\';
 	return 0;
 	  }
-	if (uvalue <= UCHAR_MAX)
-	  *cp = uvalue;
-	else
-	  {
-	temp = u32cconv (uvalue, cp);
-	cp[temp] = '\0';
-	if (lenp)
-	  *lenp = temp;
-	  }
+	temp = utf32tomb (cp, uvalue);
+	if (lenp)
+	  *lenp = temp;
 	break;
 #endif
 	
diff --git a/externs.h b/externs.h
index 09244fa..ff3f344 100644
--- a/externs.h
+++ b/externs.h
@@ -460,7 +460,7 @@ extern unsigned int falarm __P((unsigned int, unsigned int));
 extern unsigned int fsleep __P((unsigned int, unsigned int));
 
 /* declarations for functions defined in lib/sh/unicode.c */
-extern int u32cconv __P((unsigned long, char *));
+extern int utf32tomb __P((char *, unsigned long));
 
 /* declarations for functions defined in lib/sh/winsize.c */
 extern void get_new_window_size __P((int, int *, int *));
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..e410cff 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 
+#include 
 #include "shell.h"
 
 #ifdef ESC
@@ -140,21 +141,10 @@ ansicstr (string, len, flags, sawc, rlen)
 	  for (v = 0; ISXDIGIT ((unsigned char)*s) && temp--; s++)
 		v = (v * 16) + HEXVALUE (*s);
 	  if (temp == ((c == 'u') ? 4 : 8))
-		{
 		  *r++ = '\\';	/* c remains unchanged */
-		  break;
-		}
-	  else if (v <= UCHAR_MAX)
-		{
-		  c = v;
-		  break;
-		}
 	  else
-		{
-		  temp = u32cconv (v, r);
-		  r += temp;
-		  continue;
-		}
+		  r += utf32tomb (r, v);
+	  break;
 #endif
 	case '\\':
 	  break;
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..5cc96bf 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -36,13 +36,7 @@
 
 #include 
 
-#ifndef USHORT_MAX
-#  ifdef USHRT_MAX
-#define USHORT_MAX USHRT_MAX
-#  else
-#define USHORT_MAX ((unsigned short) ~(unsigned short)0)
-#  endif
-#endif
+#include "bashintl.h"
 
 #if !defined (STREQ)
 #  define STREQ(a, b) ((a)[0] == (b)[0] && strcmp ((a), (b)) == 0)
@@ -54,13 +48,14 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif
 
-static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
 #endif
 
 #ifndef HAVE_LOCALE_CHARSET
+static char charset_buffer[40]={0};
 static char *
 stub_charset ()
 {
@@ -68,168 +63,267 @@ stub_charset ()
 
   locale = get_locale_var ("LC_CTYPE");
   if (locale == 0 || *locale == 0)
-return "ASCII";
-  s = strrchr (locale, '.');
-  if (s)
 {
-  t = strchr (s, '@');
-  if (t)
-	*t = 0;
-  return ++s;
+  strcpy(charset_buffer, "ASCII");
 }
-  else if (STREQ (locale, "UTF-8"))
-return "UTF-8";
   else
-return "ASCII";
+{
+  s = strrchr (locale, '.');
+  if (s)
+	{
+	  t = strchr (s, '@');
+	  if (t)
+	*t = 0;
+	  strcpy(charset_buffer, s);
+	}
+  else
+	{
+	  strcpy(charset_buffer, locale);
+	}
+  /* free(locale)  If we can Modify the buffer surely we need to free it?*/
+}
+  return charset_buffer;
 }
 #endif
 
-/* u32toascii ? */
+
+#if 0
 int
-u32tochar (wc, s)
- wchar_t wc;
+utf32tobig5 (s, c)
  char *s;
+ unsigned long c;
 {
-  unsigned long x;
   int l;
 
-  x = wc;
-  l = (x <= UCHAR_MAX) ? 1 : ((x <= USHORT_MAX) ? 2 : 4);
-
-  if (x <= UCHAR_MAX)
-s[0] = x & 0xFF;
-  else if (x <= USHORT_MAX)	/* assume unsigned short = 16 bits */
+  if (c <= 0x7F)
 {
-  s[0] = (x >> 8) & 0xFF;
-  s[1] = x & 0xFF;
+  s[0] = (char)c;
+  l = 1;
+}
+  else if ((c >= 0x8000) && (c <= 0x))
+{
+  s[0] = (char)(c>>8);
+  s[1] = (char)(c  &0xFF);
+  l = 2;
 }
   else
 {
-  s[0] = (x >> 24) & 0xFF;
-  s[1] = (x >> 16) & 0xFF;
-  s[2] = (x >> 8) & 0xFF;
-  s[3] = x & 0xFF;
+  /* Error Invalid UTF-8 */
+  l = 0;
 }
   s[l] = '\0';
-  return l;  
+  return l;
 }
-
+#endif
 int
-u32toutf8 (wc, s)
- wchar_t wc;
+utf32toutf8 (s, c)
  char *s;
+ unsigned long c;
 {
   int l;
 
-  l = (wc < 0x0080) ? 1 : ((wc < 0x0800) ? 2 : 3);
-
-  if (wc < 0x0080)
-s[0] = (unsigned char)wc;
-  else if (wc < 0x0800)
+  if (c <= 0x7F)
 {
-  s[0] = (wc >> 6) | 0xc0;
-  s[1] = (

Initial test code for \U

2012-02-21 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Initial code for testing \u functionality.
basically uses arrays that look like this
jp_JP_SHIFT_JIS=(
  #Unicode="expected bmstring"
  [0x0001]=$'\x01' #  START OF HEADING
  [0x0002]=$'\x02' #  START OF TEXT
...
)
TestCodePage ja_JP.SHIFT_JIS jp_JP_SHIFT_JIS

in error output looks like this
Error Encoding U+00FB to C.UTF-8 [ "$'\303\273'" != "$'\373'" ]
Error Encoding U+00FC to C.UTF-8 [ "$'\303\274'" != "$'\374'" ]
Error Encoding U+00FD to C.UTF-8 [ "$'\303\275'" != "$'\375'" ]
Error Encoding U+00FE to C.UTF-8 [ "$'\303\276'" != "$'\376'" ]
Error Encoding U+00FF to C.UTF-8 [ "$'\303\277'" != "$'\377'" ]
Failed 128 of 1378 Unicode tests

or if its all ok like this
Passed all 1378 Unicode tests


should make it relatively easy to verify functionality on different
targets etc.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPRBWIAAoJEKUDtR0WmS05WigH/1iXidormw3aj+bBJZDEYv33
BL98n1irF4C9ZNNPc95UfPvDjqVUhpQrWx+/Pa6BH9m9zSd5cSqZ7xmgUH9mzg2p
JkqbiTzg0+lb714BBopTyZMRajqXMrQGx5nJTzOwuMNhs7cgrHPtPdOUrkcB0OJ2
UR5e0T1MWx8RR6lOgkXu0Gt3nQqYtnes+8y8fGGbbHfFrxJMaOjegjdN87+Q6N0U
Cl0uVH9JT8V6IEU1Q4EddjuuqyBr1c8soXd9XjeCPXVdc3XSJ5b/XB8Sdh7uW8FW
x3UbaNrhaReX8XF0xHMoPvIJQFmQE469RpXERWmZzWpGnXrXCvEpxmVQXK2CWhY=
=Cm29
-END PGP SIGNATURE-
ErrorCnt=0
TestCnt=0

  function check_valid_var_name {
case "${1:?Missing Variable Name}" in
  [!a-zA-Z_]* | *[!a-zA-Z_0-9]* ) return 3;;
esac
  }
  # get_array_element VariableName ArrayName ArrayElement
  function get_array_element {
check_valid_var_name "${1:?Missing Variable Name}" || return $?
check_valid_var_name "${2:?Missing Array Name}" || return $?
eval "${1}"'="${'"${2}"'["${3:?Missing Array Index}"]}"'
  }
  # unset_array_element VarName ArrayName
  function get_array_element_cnt {
check_valid_var_name "${1:?Missing Variable Name}" || return $?
check_valid_var_name "${2:?Missing Array Name}" || return $?
eval "${1}"'="${#'"${2}"'[@]}"'
  }


function TestCodePage {
local TargetCharset="${1:?Missing Test charset}"
local EChar RChar TCnt
get_array_element_cnt TCnt "${2:?Missing Array Name}"
for (( x=1 ; x<${TCnt} ; x++ )); do
  get_array_element EChar "${2}"  ${x}
  if [ -n "${EChar}" ]; then
	let TestCnt+=1
	printf -v UVal '\\U%08x' "${x}"
	LC_CTYPE=${TargetCharset} printf -v RChar "${UVal}"
	if [ "${EChar}" != "${RChar}" ]; then
	  let ErrorCnt+=1
	  printf "Error Encoding U+%08X to ${TL} [ \"%q\" != \"%q\" ]\n" "${x}" "${EChar}" "${RChar}"
	fi
  fi
done
}


#for ((x=1;x<255;x++)); do printf ' [0x%04x]=$'\''\%03o'\' $x $x ; [ $(($x%5)) = 0 ] && echo; done
fr_FR_ISO_8859_1=(
 [0x0001]=$'\001' [0x0002]=$'\002' [0x0003]=$'\003' [0x0004]=$'\004' [0x0005]=$'\005'
 [0x0006]=$'\006' [0x0007]=$'\007' [0x0008]=$'\010' [0x0009]=$'\011' [0x000a]=$'\012'
 [0x000b]=$'\013' [0x000c]=$'\014' [0x000d]=$'\015' [0x000e]=$'\016' [0x000f]=$'\017'
 [0x0010]=$'\020' [0x0011]=$'\021' [0x0012]=$'\022' [0x0013]=$'\023' [0x0014]=$'\024'
 [0x0015]=$'\025' [0x0016]=$'\026' [0x0017]=$'\027' [0x0018]=$'\030' [0x0019]=$'\031'
 [0x001a]=$'\032' [0x001b]=$'\033' [0x001c]=$'\034' [0x001d]=$'\035' [0x001e]=$'\036'
 [0x001f]=$'\037' [0x0020]=$'\040' [0x0021]=$'\041' [0x0022]=$'\042' [0x0023]=$'\043'
 [0x0024]=$'\044' [0x0025]=$'\045' [0x0026]=$'\046' [0x0027]=$'\047' [0x0028]=$'\050'
 [0x0029]=$'\051' [0x002a]=$'\052' [0x002b]=$'\053' [0x002c]=$'\054' [0x002d]=$'\055'
 [0x002e]=$'\056' [0x002f]=$'\057' [0x0030]=$'\060' [0x0031]=$'\061' [0x0032]=$'\062'
 [0x0033]=$'\063' [0x0034]=$'\064' [0x0035]=$'\065' [0x0036]=$'\066' [0x0037]=$'\067'
 [0x0038]=$'\070' [0x0039]=$'\071' [0x003a]=$'\072' [0x003b]=$'\073' [0x003c]=$'\074'
 [0x003d]=$'\075' [0x003e]=$'\076' [0x003f]=$'\077' [0x0040]=$'\100' [0x0041]=$'\101'
 [0x0042]=$'\102' [0x0043]=$'\103' [0x0044]=$'\104' [0x0045]=$'\105' [0x0046]=$'\106'
 [0x0047]=$'\107' [0x0048]=$'\110' [0x0049]=$'\111' [0x004a]=$'\112' [0x004b]=$'\113'
 [0x004c]=$'\114' [0x004d]=$'\115' [0x004e]=$'\116' [0x004f]=$'\117' [0x0050]=$'\120'
 [0x0051]=$'\121' [0x0052]=$'\122' [0x0053]=$'\123' [0x0054]=$'\124' [0x0055]=$'\125'
 [0x0056]=$'\126' [0x0057]=$'\127' [0x0058]=$'\130' [0x0059]=$'\131' [0x005a]=$'\132'
 [0x005b]=$'\133' [0x005c]=$'\134' [0x005d]=$'\135' [0x005e]=$'\136' [0x005f]=$'\137'
 [0x0060]=$'\140' [0x0061]=$'\141' [0x0062]=$'\142' [0x0063]=$'\143' [0x0064]=$'\144'
 [0x0065]=$'\145' [0x0066]=$'\146' [0x0067]=$'\147' [0x0068]=$'\150' [0x0069]=$'\151'
 [0x006a]=$'\152' [0x006b]=$'\153' [0x006c]=$'\154' [0x006d]=$'\155' [0x006e]=$'\156'
 [0x006f]=$'\157' [0x0070]=$'\160' [0x0071]=$'\161' [0x0072]=$'\162' [0x0073]=$'\163'
 [0x0074]=$'\164' [0x0075]=$'\165' [0x0076]=$'\166' [0x0077]=$'\167' [0x0078]=$'\170'
 [0x0079]=$'\171' [0x007a]=$'\172' [0x007b]=$'\173' [0x007c]=$'\174' [0x007d]=$'\175'
 [0x007e]=$'\176' [0x007f]=$'\177' [0x0080]=$'\200' [0x0081]=$'\

Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-21 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/21/2012 01:34 PM, Eric Blake wrote:
> On 02/20/2012 07:42 PM, Chet Ramey wrote:
>> On 2/18/12 5:39 AM, John Kearney wrote:
>> 
>>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>>> 
>>> Description: Current u32toutf8 only encode values below 0x
>>> correctly. wchar_t can be ambiguous size better in my opinion
>>> to use unsigned long, or uint32_t, or something clearer.
>> 
>> Thanks for the patch.  It's good to have a complete
>> implementation, though as a practical matter you won't see UTF-8
>> characters longer than four bytes.  I agree with you about the
>> unsigned 32-bit int type; wchar_t is signed, even if it's 32
>> bits, on several systems I use.
> 
> Not only can wchar_t can be either signed or unsigned, you also
> have to worry about platforms where it is only 16 bits, such as
> cygwin; on the other hand, wint_t is always 32 bits, but you still
> have the issue that it can be either signed or unsigned.
> 
signed / unsigend isn't really the problem anyway utf-8 only encodes
up to 0x7fff  and utf-16 only encodes up to 0x0010 .

In my latest version I've pretty much removed all reference to wchar_t
in unicode.c. It was unnecessary.

However I would be interested in something like utf16_t or uint16_t
currently using unsigned short which is intelligent but works.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPQ593AAoJEKUDtR0WmS05g0wH/RPQMl1mfUdJBfzv5QkUtVSG
ibezTe3/b7/9h8SG3LLrv2FiPS+FtcCbE4n8tUror3V1BHomsQHZdlj/Zshi8W/n
YDl5ac5nc0rrOlw+SJxyCAJl9vHeEAXavjGw8m0KUv/vn0tZyWNM0RYXc7tRxJU2
uqY7G5sGLUt8uGuswCmSmucKjoB7guiUbsmTR+OzgDgKxuuSeQBr6/oIImo721pk
nI5TYdqerPGCIMJoYPeZChCBAZ/WhK9i3C3/SxKme4zWnjySaDw3NH0yfqFHl4Ts
IIOT4fYpm0h62U76+NJSPGWfadTd8UL4A/Jy4I3IwUS+mflwdU0Pu2zmwb8I+Xk=
=pkAF
-END PGP SIGNATURE-

Bug? in bash setlocale implementation

2012-02-21 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
- -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
- -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/s
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
  Basically if setting the locale fails variable should not be changed.

 Consider


export LC_CTYPE=

bash -c 'LC_CTYPE=ISO-8859-1 eval printf "\${LC_CTYPE:-unset}"'
bash: warning: setlocale: LC_CTYPE: cannot change locale (ISO-8859-1):
No such file or directory
ISO-8859-1

ksh93 -c 'LC_CTYPE=ISO-8859-1 eval printf "\${LC_CTYPE:-unset}"'
ISO-8859-1: unknown locale
unset
ksh93 -c 'LC_CTYPE=C.UTF-8 eval printf "\${LC_CTYPE:-unset}"'
C.UTF-8

  the advantage being you can check in the script if the local change
worked.
  e.g.
  LC_CTYPE=ISO-8859-1
  [ "${LC_CTYPE:-}" = "ISO-8859-1" ] || error exit
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPQ1sbAAoJEKUDtR0WmS05dDEH+wf+Gix7NnSZ6WvwOt6ZRmlv
/BXr94coQ1I6ODCXXAG0ExgqNs81gJ58N1xw0nBO/qMpJ1CWv+t5Gc+FP37RK9GK
aZbrT6yYAueg/lz58o7hg76oRKVmOpzaYxdquC4dMKa8K1kEdxNyyO4Qxa8a/TNP
qLC79kvBl/23CESRomZdhUpOOjTdzhiEo6njLxDmluhzA+U/WsMD1Zp7TJih30gu
okkJESAwSsEoo8QIeFbzOFa/qEZQH05SwY0CoYO+OPC0qlNR/Jar9cAJhTpHfxjg
bLYXSNlqs5ZCgbmUCypnOWpOktUVPNxpXabNTjWPwAnekEY8Ms4BR6XkG+yuclk=
=+Z4p
-END PGP SIGNATURE-

bug in stub_charset rollup diff of changes to unicode code.

2012-02-20 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
- -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
- -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/s
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
  stub_charset
  if locale == '\0'
return ASCII
  else if locale=~m/.*\.(.*)(@.*)/
   return $1
  else if locale=UTF-8
   return UTF-8
  else
   return ASCII

should be
  if locale == '\0'
return ASCII
  else if locale=~m/.*\.(.*)(@.*)/
   return $1
  else
   return locale
 because its output is only being used in iconv, so let it decide if the
locale makes sense.




   I've attached a diff of all my changes to the unicode code.
   Including
   renamed u2cconv to utf32tomb
   move special handling of ascii charcter to start of function and
remove related call wrapper code.
   tried to reationalize the code in utf32tomb so its easier to read and
understand what is happening.
   added utf32toutf16
   use utf32toutf16 in case wchar_t=2 with wctomb

  removed dangerious code that was using iconv_open (charset, "ASCII");
as fallback. pointless anyway as we already assign a ascii value if posible.

  added warning message if encode fails

always terminate mb output string.


haven't started to test these changes yet firstly would like to know
if these changes are acceptable, any observations, I'm still reviewing
it myself for consistency.

Plus can somebody tell me how this was tested originally? I've got
some ideas myself but would like to know what has already been done in
that direction.

Repeat-By:
  .

Fix:
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..3680419 100644
- --- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,15 +859,9 @@ tescape (estart, cp, lenp, sawc)
*cp = '\\';
return 0;
  }
- - if (uvalue <= UCHAR_MAX)
- -   *cp = uvalue;
- - else
- -   {
- - temp = u32cconv (uvalue, cp);
- - cp[temp] = '\0';
- - if (lenp)
- -   *lenp = temp;
- -   }
+   temp = utf32tomb (uvalue, cp);
+   if (lenp)
+ *lenp = temp;
break;
 #endif

diff --git a/externs.h b/externs.h
index 09244fa..8868b55 100644
- --- a/externs.h
+++ b/externs.h
@@ -460,7 +460,7 @@ extern unsigned int falarm __P((unsigned int,
unsigned int));
 extern unsigned int fsleep __P((unsigned int, unsigned int));

 /* declarations for functions defined in lib/sh/unicode.c */
- -extern int u32cconv __P((unsigned long, char *));
+extern int utf32tomb __P((unsigned long, char *));

 /* declarations for functions defined in lib/sh/winsize.c */
 extern void get_new_window_size __P((int, int *, int *));
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..495d9c4 100644
- --- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -144,16 +144,10 @@ ansicstr (string, len, flags, sawc, rlen)
  *r++ = '\\';  /* c remains unchanged */
  break;
}
- -   else if (v <= UCHAR_MAX)
- - {
- -   c = v;
- -   break;
- - }
  else
{
- -   temp = u32cconv (v, r);
- -   r += temp;
- -   continue;
+ r += utf32tomb (v, r);
+ break;
}
 #endif
case '\\':
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..9a557a9 100644
- --- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -36,14 +36,6 @@

 #include 

- -#ifndef USHORT_MAX
- -#  ifdef USHRT_MAX
- -#define USHORT_MAX USHRT_MAX
- -#  else
- -#define USHORT_MAX ((unsigned short) ~(unsigned short)0)
- -#  endif
- -#endif
- -
 #if !defined (STREQ)
 #  define STREQ(a, b) ((a)[0] == (b)[0] && strcmp ((a), (b)) == 0)
 #endif /* !STREQ */
@@ -54,13 +46,14 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif

- -static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
 #endif

 #ifndef HAVE_LOCALE_CHARSET
+static char CType[40]={0};
 static char *
 stub_charset ()
 {
@@ -69,6 +62,7 @@ stub_charset ()
   locale = get_locale_var ("LC_CTYPE");
   if (locale == 0 || *locale == 0)
 return "ASCII";
+  strcpy(CType, locale);
   s = strrchr (locale, '.');
   if (s)
 {
@@ -77,159 +71,230 @@ stub_charset ()
*t = 0;
   return ++s;
 }
- -  else if (STREQ (locale, "UTF-8"))
- -return "UTF-8";
   else
- -return "ASCII";
+return CType;
 }
 #endif

- -/* u32toascii ? */
 int
- -u32tochar (wc, s)
- - wchar_t wc;
+utf32_2_utf8 (c, s)
+ unsigned lon

Can somebody explain to me what u32tochar in /lib/sh/unicode.c is trying to do?

2012-02-19 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Can somebody explain to me what u32tochar is trying to do?

It seems like dangerous code?

from the context i'm guessing it trying to make a hail mary pass at
converting utf-32 to mb (not utf-8 mb)


int
u32tochar (x, s)
 unsigned long c;
 char *s;
{
  int l;

  l = (x <= UCHAR_MAX) ? 1 : ((x <= USHORT_MAX) ? 2 : 4);

  if (x <= UCHAR_MAX)
s[0] = x & 0xFF;
  else if (x <= USHORT_MAX) /* assume unsigned short = 16 bits */
{
  s[0] = (x >> 8) & 0xFF;
  s[1] = x & 0xFF;
}
  else
{
  s[0] = (x >> 24) & 0xFF;
  s[1] = (x >> 16) & 0xFF;
  s[2] = (x >> 8) & 0xFF;
  s[3] = x & 0xFF;
}
  /* s[l] = '\0';  Overwrite Buffer?*/
  return l;
}

Couple problems with that though
firstly utf-32 doesn't map directly to non utf mb locals. So you need
a translation mechanism.
Secondly Normal CJK system are state based systems so mutibyte
sequences need to be escaped. Extended Unix Code would need encoding
somewhat like utf-8, in fact any variable multi byte encoding system
is going to need some context to recover the info this is unparsable
behavior,

what it is actually doing is taking utf-32 and depending on the size
encoding it as UTF-32 Big Endian , UTF-16 Big Endian, UTF-8, or
American EAscii codepage(values between 0x80 - 0xff). Choosing one of
these is however Dependant on LC_CTYPE not some arbitrary check.

So this function just seems plain crazy?
I  think that all it can safely do is this.
int
utf32tomb (x, s)
 unsigned long c;
 char *s;
{

  if (x <= 0x7f ) /* x>=0x80 = locale specific */
 {
 s[0] = x & 0xFF;
 return 1;
 }
  else
return 0
}



regarding naming convention u32 = unsigned 32 bit
might be a good idea to rename all the utf32 functions to utf32, would
I think save a lot of confusion in the code as to what is going on.

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPQXKxAAoJEKUDtR0WmS054sgH/R+qWtds9MMeN/y4n98wk83l
MAOVBXAn+m8IUf31VtSZ7nqEccJHDPDRMkg21sYNlozsxPVwCYOGZd7LL8wxlwEl
70mRu9cAQOXIAeF9b8ao0/nz6e6nC6FTk03FDhDo+V8RWt9MiQHF4YWRCCmSdmQv
GDM88XyXuQZaBwIHrXeCXRvuXTN8K5BrdbVFJ7OHRUytKNE6OccUDz/iaPCoPy5f
SehHTLJ6AqpYy7NgapyALTvo3/FlVUDc7vtYbCDF5Q0EMIlvjgEQ9Y7vJlKtuAop
9Up32sQSy8red6frOgZmvA5GLeD7Lp/gvfp/U5fQWIZTKKLgBee2mYVqPlLOKw4=
=nHdc
-END PGP SIGNATURE-

Questionable code behavior in u32cconv?

2012-02-18 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
- -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
- -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
- -DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
- -I../bash/lib   -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
Now I may be misreading the code but it looks like the code relating
to iconv is only checking the destination charset the first time, the
code is executed.

as such breaking the following functionality.
LC_CTYPE=C printf '\uff'
LC_CTYPE=C.UTF-8 printf '\uff'

Repeat-By:
haven't seen the problem.

Fix:
  Not so much a fix as a modification that should hopefully clarify my
concern.



diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..3f7d378 100644
- --- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -54,7 +54,7 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif

- -static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
@@ -115,26 +115,61 @@ u32tochar (wc, s)
 }

@@ -150,7 +185,7 @@ u32cconv (c, s)
   wchar_t wc;
   int n;
 #if HAVE_ICONV
- -  const char *charset;
+  const char *ncharset;
   char obuf[25], *optr;
   size_t obytesleft;
   const char *iptr;
@@ -171,20 +206,22 @@ u32cconv (c, s)
   codeset = nl_langinfo (CODESET);
   if (STREQ (codeset, "UTF-8"))
 {
   n = u32toutf8 (wc, s);
   return n;
 }
 #endif

 #if HAVE_ICONV
- -  /* this is mostly from coreutils-8.5/lib/unicodeio.c */
- -  if (u32init == 0)
- -{
 #  if HAVE_LOCALE_CHARSET
- -  charset = locale_charset ();   /* XXX - fix later */
+  ncharset = locale_charset ();/* XXX - fix later */
 #  else
- -  charset = stub_charset ();
+  ncharset = stub_charset ();
 #  endif
+  /* this is mostly from coreutils-8.5/lib/unicodeio.c */
+  if (STREQ (charset, ncharset))
+{
+  /* Free Old charset str ? */
+  charset=ncharset;
   if (STREQ (charset, "UTF-8"))
utf8locale = 1;
   else
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPP5SCAAoJEKUDtR0WmS05L8QH/RUz/X8QZk7HXDIFUTCd0Eah
MkfWpCtib9Jt5jUBcb+/UZKiwTSxYGm7D9X08Tpho+i7c+3kknWUGTkivqg7eVo4
TlRA+N4k3x8PdpbYPFNGxgy9LRSViQjqbbzNfYaX+Pbi2YIbZRuaPBipEdbvBqDG
bN7KaUM/97vZicZn5SOrhcDiq1RfJosdTkr7egEON4P4BBIXIVk4vRcCF/xXCw6M
w2BmvpavV3ra1TXhYN2C678qMyncq5kr8e0tvIl4EY6oCurMlvXhoNkOcz14fOMa
XrYJUu1dDNKXmTsJFjDGZhyzvTejLVezjn91/so2OINinqHW++2IMFim5ED9w28=
=rW+v
-END PGP SIGNATURE-

Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-18 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
- -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
- -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
- -DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
- -I../bash/lib   -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
Current u32toutf8 only encode values below 0x correctly.
wchar_t can be ambiguous size better in my opinion to use
unsigned long, or uint32_t, or something clearer.
Repeat-By:
  ---'

Fix:
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..3f7d378 100644
- --- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -54,7 +54,7 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif

- -static int u32init = 0;
+static int u32init = 0;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
@@ -115,26 +115,61 @@ u32tochar (wc, s)
 }

 int
- -u32toutf8 (wc, s)
- - wchar_t wc;
+u32toutf8 (c, s)
+ unsigned long c;
  char *s;
 {
   int l;

- -  l = (wc < 0x0080) ? 1 : ((wc < 0x0800) ? 2 : 3);
- -
- -  if (wc < 0x0080)
- -s[0] = (unsigned char)wc;
- -  else if (wc < 0x0800)
+  if (c <= 0x7F)
+{
+  s[0] = (char)c;
+  l = 1;
+}
+  else if (c <= 0x7FF)
+{
+  s[0] = (c >>   6)| 0xc0; /* 110x  */
+  s[1] = (c& 0x3f) | 0x80; /* 10xx  */
+  l = 2;
+}
+  else if (c <= 0x)
+{
+  s[0] =  (c >> 12) | 0xe0; /* 1110  */
+  s[1] = ((c >>  6) & 0x3f) | 0x80; /* 10xx  */
+  s[2] =  (c& 0x3f) | 0x80; /* 10xx  */
+  l = 3;
+}
+  else if (c <= 0x1F)
 {
- -  s[0] = (wc >> 6) | 0xc0;
- -  s[1] = (wc & 0x3f) | 0x80;
+  s[0] =  (c >> 18) | 0xf0; /*  0xxx */
+  s[1] = ((c >> 12) & 0x3f) | 0x80; /* 10xx  */
+  s[2] = ((c >>  6) & 0x3f) | 0x80; /* 10xx  */
+  s[3] = ( c& 0x3f) | 0x80; /* 10xx  */
+  l = 4;
+}
+  else if (c <= 0x3FF)
+{
+  s[0] =  (c >> 24) | 0xf8; /*  10xx */
+  s[1] = ((c >> 18) & 0x3f) | 0x80; /* 10xx  */
+  s[2] = ((c >> 12) & 0x3f) | 0x80; /* 10xx  */
+  s[3] = ((c >>  6) & 0x3f) | 0x80; /* 10xx  */
+  s[4] = ( c& 0x3f) | 0x80; /* 10xx  */
+  l = 5;
+}
+  else if (c <= 0x7FFF)
+{
+  s[0] =  (c >> 30) | 0xfc; /*  110x */
+  s[1] = ((c >> 24) & 0x3f) | 0x80; /* 10xx  */
+  s[2] = ((c >> 18) & 0x3f) | 0x80; /* 10xx  */
+  s[3] = ((c >> 12) & 0x3f) | 0x80; /* 10xx  */
+  s[4] = ((c >>  6) & 0x3f) | 0x80; /* 10xx  */
+  s[5] = ( c& 0x3f) | 0x80; /* 10xx  */
+  l = 6;
 }
   else
 {
- -  s[0] = (wc >> 12) | 0xe0;
- -  s[1] = ((wc >> 6) & 0x3f) | 0x80;
- -  s[2] = (wc & 0x3f) | 0x80;
+  /* Error Invalid UTF-8 */
+  l = 0;
 }
   s[l] = '\0';
   return l;
@@ -150,7 +185,7 @@ u32cconv (c, s)
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPP3/tAAoJEKUDtR0WmS059CcH/iIyBOGhf0IgSmnIFyw0YLpA
3ZWSaXWoEZodrDr1fX67hj2424icXm9fTZw70G+rS1YjtCfm86O/Qou4VNROylAv
TbjPUWkHRWVci7IqcDGb1tNWRrulxUvNFA/Uc1xBtKckAO6HHHRTYFa+sCkd5Fnx
dm7e0iMTqMMmL/dUwB+di+hSkGD+ZXS1vY76wizdwG7CteUxAVunse+ffP7TRYbn
K86Whc7p7llG12hruCPGArc9iS7YiBaC/XNIKXmN7fn93dhQTcdzzk/UTGmaZgDk
cQk4R7/NBljP4LtQtKwX4JYAi5XJM5TeSLykL97UFxW/5OGM+SmSVJbKLlHU/mQ=
=EJUb
-END PGP SIGNATURE-

Re: UTF-8 Encode problems with \u \U

2012-02-18 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/18/2012 11:29 AM, Andreas Schwab wrote:
> John Kearney  writes:
> 
>> what I suggest will fix the UTF-8 case
> 
> No, it won't.
> 
>> and not affect the UTF-2 case.
> 
> That is impossible.
> 
> Andreas.
> 

Current code
if (uvalue <= UCHAR_MAX)
  *cp = uvalue;
else
  {
temp = u32cconv (uvalue, cp);
cp[temp] = '\0';
if (lenp)
  *lenp = temp;
  }

Robust Code
temp = u32cconv (uvalue, cp);
cp[temp] = '\0';
if (lenp)
  *lenp = temp;

Compromise solution
if (uvalue <= 0x7f)
  *cp = uvalue;
else
  {
temp = u32cconv (uvalue, cp);
cp[temp] = '\0';
if (lenp)
  *lenp = temp;
  }

How can doing a direct assignment, in less cases break anything, if it
does u32cconv is broken.

And it does work for me, so impossible seems to be overstating it.

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPP39rAAoJEKUDtR0WmS052JIH/09at08oGR16hvj2blL4YxWJ
V1Slbkh9O8pJ4DV9NOwEweIpjAxYUzRFzOEVV0tiYzeqISJ36uKnttewiP5VcRSv
heS6QwOl5R3wnx0ecNkpLMo2nT054Fqd+OHSHFOgkBeAM28PVwjT+GmfFyCp1f4K
hPevpejPLyxHYWaXJwy4+1XN0Wp/YatzEXr21pHgU7CPyMGYLbju4su0kNpYledj
5Zo3tT/cvoBGVysJo5AbQ8D07cG85eoARxz6erJatjKDKCUPl1kKdcikG3nGvnQc
66HdR/lJRShDh344uss6/4sw2R9LFut0QP+ChhJowQ9ZBI1uZo7/fn0gQv7gOdo=
=fXLm
-END PGP SIGNATURE-

Re: UTF-8 Encode problems with \u \U

2012-02-18 Thread John Kearney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I know
To be hones I get a bad feeling with that code, I'm guessing it was
done for performance reasons, Personally I'd just remove the special
handling of any values, and always call the encoding function, but was
trying for a minimalist solution.
I mean you could do something like

#define MAX_SINGLE_BYTE_UTF8 0x7F
if (uvalue <= MAX_SINGLE_BYTE_UTF8)

I'm guessing the code was done originally for UTF-2 encoding.

what I suggest will fix the UTF-8 case and not affect the UTF-2 case.


On 02/18/2012 11:11 AM, Andreas Schwab wrote:
> John Kearney  writes:
> 
>> Fix: iff --git a/builtins/printf.def b/builtins/printf.def index 
>> 9eca215..b155160 100644 --- a/builtins/printf.def +++ 
>> b/builtins/printf.def @@ -859,7 +859,7 @@ tescape (estart, cp, 
>> lenp, sawc) *cp = '\\'; return 0; } -if (uvalue <= UCHAR_MAX)
>> +if (uvalue <= CHAR_MAX)
> 
> CHAR_MAX has nothing at all to do with UTF-8.
> 
> Andreas.
> 

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPP3u8AAoJEKUDtR0WmS056GIH/1TO/A8RmRCfTU3llNG1tMJy
MJiby2gdvz2v/Q+Y83llCU01fcQ1tGpp2iOO7rbfYmfdqiJ8iMfNc1pK302Tb77u
HcZSSVQKnBwNpL6eeAhwLVzrpfdcKWY/diQknsiXLtrm0AcPhsrf5Bu/OgHjeu7m
3uyqlcQAvYVKj5Z4eV75Hn1+lrCp26fkjZSOZPN9AH8yv1chQXrYPB+/Wj82Cp/S
sSgupvpmAv3b4HaZhXsA2DPxEEb2ESj/ZaHMC4/AxyABJoub++erxm/k8r3iUDjc
rud6jWoVJcwt+UkVyqi8V8qIJ/urVG01FVoVXTYIiqA73ZdJ3fkLw0PCmliZMtA=
=pZin
-END PGP SIGNATURE-

UTF-8 Encode problems with \u \U

2012-02-18 Thread John Kearney

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
-I../bash/lib   -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
\u and \U incorrectly encode values between \u80 and \uff

Repeat-By:
  printf '%q\n' "$(printf '\uff')"
  printf '%q\n' $'\uff'
  # outputs $'\377' instead of $'\303\277'

Fix:
iff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..b155160 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,7 +859,7 @@ tescape (estart, cp, lenp, sawc)
 *cp = '\\';
 return 0;
   }
-if (uvalue <= UCHAR_MAX)
+if (uvalue <= CHAR_MAX)
   *cp = uvalue;
 else
   {
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..2e6e37b 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -144,7 +144,7 @@ ansicstr (string, len, flags, sawc, rlen)
   *r++ = '\\';/* c remains unchanged */
   break;
 }
-  else if (v <= UCHAR_MAX)
+  else if (v <= CHAR_MAX)
 {
   c = v;
   break;

97 matches

Mail list logo