Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
On 9/19/11 2:35 PM, Stephane CHAZELAS wrote: > FYI, ksh93 uses pow(3). So does zsh, but only in floating point > mode. > > Probably better and more efficient than reinventing the wheel. Maybe, but since bash doesn't use floating point arithmetic, probably not. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
On 9/19/11 2:35 PM, Stephane CHAZELAS wrote: >> Thanks for the report. This looks like an independent reimplementation of >> the "exponentiation by squaring" method. I did a little looking around, >> and it's the best algorithm out there. I used a slightly different but >> equivalent implementation. > [...] > > FYI, ksh93 uses pow(3). So does zsh, but only in floating point > mode. Bash doesn't use floating point. It does all of its arithmetic in intmax_t. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-17, 13:39(+00), Stephane CHAZELAS: > 2011-09-17, 13:06(+00), Stephane CHAZELAS: >> 2011-09-16, 17:17(-07), William Park: >>> 145557834293068928043467566190278008218249525830565939618481 >>> is awfully big number! :-) >> >> 3**2**62 is 3**(2**62), 3**4611686018427387904, not a number you >> can represent with 64bits, nor any reasonable number of bits, >> not (3**2)**62. > [...] > > Sorry, my bad, > > 3**2**62 is indeed (3**2)**62 in bash and in zsh contrary to > most other places (ksh93, bc, python, gawk, perl, ruby...). Sorry again, I was right in the first place, 3**2**62 is 3**(2**62) in bash and zsh like in other shells. I think I need more sleep... -- Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-19, 09:27(-04), Chet Ramey: > On 9/16/11 4:39 PM, Nicolas ARGYROU wrote: > >> Bash Version: 4.0 >> Patch Level: 33 >> Release Status: release >> >> Description: >> The algorithm used to calculate x to the power of y: x**y >> takes O(y) time which is way too long on systems using 64 bits. >> Calculating for exemple $((3**2**62)) freezes the shell at >> argument parsing time. >> >> Repeat-By: >> bash -c 'echo $((3**2**62))' >> >> Fix: >> This fix uses an alorithm that takes O(log(y)) time, which is way >> faster. But it is still about 30 times slower with random numbers >> than a single multiplication, on 64 bits systems. The fix is written >> as a C++ template working on any unsigned integer type, and doesn't >> need any external resource: > > Thanks for the report. This looks like an independent reimplementation of > the "exponentiation by squaring" method. I did a little looking around, > and it's the best algorithm out there. I used a slightly different but > equivalent implementation. [...] FYI, ksh93 uses pow(3). So does zsh, but only in floating point mode. Probably better and more efficient than reinventing the wheel. -- Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-17, 13:06(+00), Stephane CHAZELAS: > 2011-09-16, 17:17(-07), William Park: >> 145557834293068928043467566190278008218249525830565939618481 >> is awfully big number! :-) > > 3**2**62 is 3**(2**62), 3**4611686018427387904, not a number you > can represent with 64bits, nor any reasonable number of bits, > not (3**2)**62. [...] Sorry, my bad, 3**2**62 is indeed (3**2)**62 in bash and in zsh contrary to most other places (ksh93, bc, python, gawk, perl, ruby...). -- Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
2011-09-16, 17:17(-07), William Park: > 145557834293068928043467566190278008218249525830565939618481 > is awfully big number! :-) 3**2**62 is 3**(2**62), 3**4611686018427387904, not a number you can represent with 64bits, nor any reasonable number of bits, not (3**2)**62. Certainly not a number that bash arithmetic expansion can handle not even in floating mode. Wih zsh: $ echo $((exp((2**62)*log(3 inf. $ echo 'e((2^62)*l(3))' | bc -l Runtime warning (func=e, adr=123): scale too large, set to 2147483647 Fatal error: Out of memory for malloc. -- Stephane
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
Hello, I noticed other shells have the same bug, and perhaps you would like to use this code in other GNU projects (like making a library call or an executable). The best, I think, is to transfer the copyright to bash maintainers. You can now copyright and license it the way you want: template inline T ipow(register /* unsigned */ T x, register /* unsigned */ T y) { if (x == 0 && y != 0) return 0; // 1: ipow(x,y) = x ** y = Product [i=0; i<=log2(y)] (x ** (((y>>i)&1)*2**i)) // 2: x**(2**i) = x**(2**(i-1)) * x**(2**(i-1)) register T xy = 1; for(; y; y>>=1, x *= x) if (y & 1) xy *= x; return xy; } If you don't wish to use this algoritm, then there is another way of getting around the bug: for exemple, on 64 bit systems, when performing x ** y, if y > 64 then the result will be too large anyway to stand in 64 bits. It is true when you suppose x = 2, and (even more) when x > 2. In this case you may then throw an overflow error. In other cases, y < 64 is a manageable number for the loop algorithm. But, I still prefer the O(log(y)) algorithm, which always computes the exact value (modulus 2**64) in less numbers of steps on average. This is, in detail, how the algorithm works: When, for example, computing 3 ** 3 = 3 * 3 * 3 = 27, you can write it also with a binary power: 3 ** (1*2**1 + 1*2**0), and because, x ** (a+b) = x**a * x**b, you may develop it: 3 ** (2**1) * 3 ** (2**0) This gives the first equation in the comments of the code: // 1: ipow(x,y) = x ** y = Product [i=0; i<=log2(y)] (x ** (((y>>i)&1)*2**i)) The second thing to notice is that we need to compute: 3**2**i, but as 2**i = 2*2**(i-1) = 2**(i-1) + 2**(i-1), then 3**2**i = 3**(2**(i-1) + 2**(i-1)), since, x ** (a+b) = x**a * x**b, then 3**2**i = 3**(2**(i-1) + 2**(i-1)) = 3**(2**(i-1)) * 3**(2**(i-1)). This gives the second equation in the comments of the code: // 2: x**(2**i) = x**(2**(i-1)) * x**(2**(i-1)) and a reccurence to compute x**2**(i+1) when we know x**2**i, from x**2**0 = x. We just then have to roll though the bits of y, computing x = x * x each step, and multiplying xy (the temporary result) by one of the squares of x when the corresponding bit of y is set. As we always do multiplications modulus 2**64, it gives the result modulus 2**64. We only need to use 3 registers: one to roll y, one to hold the squares of x, and one to hold the result. Writing the same algoritm with assembler language can save a few instructions per loop, because gcc doesn't catch that it can use the out-going bit of y to directly jump over xy *= x; if not set. But it won't be portable then. Best regards, Nicolas Argyrou
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
On 9/16/11 4:39 PM, Nicolas ARGYROU wrote: > Bash Version: 4.0 > Patch Level: 33 > Release Status: release > > Description: > The algorithm used to calculate x to the power of y: x**y > takes O(y) time which is way too long on systems using 64 bits. > Calculating for exemple $((3**2**62)) freezes the shell at > argument parsing time. > > Repeat-By: > bash -c 'echo $((3**2**62))' > > Fix: > This fix uses an alorithm that takes O(log(y)) time, which is way > faster. But it is still about 30 times slower with random numbers > than a single multiplication, on 64 bits systems. The fix is written > as a C++ template working on any unsigned integer type, and doesn't > need any external resource: Thanks for the report. This looks like an independent reimplementation of the "exponentiation by squaring" method. I did a little looking around, and it's the best algorithm out there. I used a slightly different but equivalent implementation. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
From: Nicolas ARGYROU > > > > You're right. Then, just do what you want with this piece of code, I make > > it > > public domain: no restriction at all on using it. It's a gift to bash as I > > use it a lot. I also accept that you put my name and/or email with the > > piece of > > code in case someone wants to contact the author. :-) On Sun, Sep 18, 2011 at 06:33:23PM -0700, William Park wrote: > No. For example, current Bash is copyrighted and licensed by the copyright > holder. > > To get included in Bash, though, I think the license should be the same as > Bash, > at the least. It is my understanding (IANAL) that if a work is placed into the public domain, anyone may use it and relicense it as he or she sees fit. Thus, with the code being in the public domain, Chet may incorporate it into Bash and release the combined work under Bash's existing license.
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
I'm ok with using Bash's licence. - Original Message - From: William Park To: Nicolas ARGYROU Cc: bashbug Sent: Monday, September 19, 2011 3:33 AM Subject: Re: Bug fix for $((x**y)) algorithm on 64+ bits machines. No. For example, current Bash is copyrighted and licensed by the copyright holder. To get included in Bash, though, I think the license should be the same as Bash, at the least. -- William
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
No. For example, current Bash is copyrighted and licensed by the copyright holder. To get included in Bash, though, I think the license should be the same as Bash, at the least. -- William - Original Message - > From: Nicolas ARGYROU > To: Dave Rutherford > Cc: bashbug > Sent: Saturday, September 17, 2011 11:35:08 PM > Subject: Re: Bug fix for $((x**y)) algorithm on 64+ bits machines. > > You're right. Then, just do what you want with this piece of code, I make it > public domain: no restriction at all on using it. It's a gift to bash as I > use it a lot. I also accept that you put my name and/or email with the piece > of > code in case someone wants to contact the author. :-) > > Regards, > Nicolas Argyrou > > > > - Original Message - > From: Dave Rutherford > To: Nicolas ARGYROU > Cc: > Sent: Saturday, September 17, 2011 10:34 PM > Subject: Re: Bug fix for $((x**y)) algorithm on 64+ bits machines. > > On Sat, Sep 17, 2011 at 07:10, Nicolas ARGYROU wrote: > >> I came up with a version that is slightly more precise in the comments, and > that uses 3 registers instead of 4 >> (though gcc can optimize that): >> >> >> // Copyright 2011: Nicolas Argyrou , public domain. > > My understanding is that something can be either copyrighted or > public domain, but not both at the same time. You might want > to check on this. > > Dave >
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
You're right. Then, just do what you want with this piece of code, I make it public domain: no restriction at all on using it. It's a gift to bash as I use it a lot. I also accept that you put my name and/or email with the piece of code in case someone wants to contact the author. :-) Regards, Nicolas Argyrou - Original Message - From: Dave Rutherford To: Nicolas ARGYROU Cc: Sent: Saturday, September 17, 2011 10:34 PM Subject: Re: Bug fix for $((x**y)) algorithm on 64+ bits machines. On Sat, Sep 17, 2011 at 07:10, Nicolas ARGYROU wrote: > I came up with a version that is slightly more precise in the comments, and > that uses 3 registers instead of 4 > (though gcc can optimize that): > > > // Copyright 2011: Nicolas Argyrou , public domain. My understanding is that something can be either copyrighted or public domain, but not both at the same time. You might want to check on this. Dave
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
I'm glad it pleases you. I'm amazed also how fast it deals with large numbers. Feel free to use it. :) I came up with a version that is slightly more precise in the comments, and that uses 3 registers instead of 4 (though gcc can optimize that): // Copyright 2011: Nicolas Argyrou , public domain. template inline T ipow(register /* unsigned */ T x, register /* unsigned */ T y) { if (x == 0 && y != 0) return 0; // 1: ipow(x,y) = x ** y = Product [i=0; i<=log2(y)] (x ** (((y>>i)&1)*2**i)) // 2: x**(2**i) = x**(2**(i-1)) * x**(2**(i-1)) register T xy = 1; for(; y; y>>=1, x *= x) if (y & 1) xy *= x; return xy; } I doubt I can came up with a more concise and precise version. Thank you, Nicolas Argyrou - Original Message - From: William Park To: Nicolas ARGYROU Cc: "bug-bash@gnu.org" Sent: Saturday, September 17, 2011 2:17 AM Subject: Re: Bug fix for $((x**y)) algorithm on 64+ bits machines. 145557834293068928043467566190278008218249525830565939618481 is awfully big number! :-) -- William - Original Message - > From: Nicolas ARGYROU > To: "bug-bash@gnu.org" > Cc: > Sent: Friday, September 16, 2011 4:39:41 PM > Subject: Bug fix for $((x**y)) algorithm on 64+ bits machines. > > From: na...@yahoo.com > To: bug-bash@gnu.org > Subject: Bug fix for $((x**y)) algorithm on 64+ bits machines. > > Configuration Information [Automatically generated, do not change]: > Machine: x86_64 > OS: linux-gnu > Compiler: gcc > Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' > -DCONF_OSTYPE='linux-gnu' > -DCONF_MACHTYPE='x86_64-mandriva-linux-gnu' > -DCONF_VENDOR='mandriva' -DLOCALEDIR='/usr/share/locale' > -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include > -I./lib -O2 -g -pipe -Wformat -Werror=format-security > -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector --param=ssp-buffer-size=4 > uname output: Linux localhost.localdomain 2.6.31.14-desktop-1mnb #1 SMP Wed > Nov > 24 10:42:07 EST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ > GNU/Linux > Machine Type: x86_64-mandriva-linux-gnu > > Bash Version: 4.0 > Patch Level: 33 > Release Status: release > > Description: > The algorithm used to calculate x to the power of y: x**y > takes O(y) time which is way too long on systems using 64 bits. > Calculating for exemple $((3**2**62)) freezes the shell at > argument parsing time. > > Repeat-By: > bash -c 'echo $((3**2**62))' > > Fix: > This fix uses an alorithm that takes O(log(y)) time, which is way > faster. But it is still about 30 times slower with random numbers > than a single multiplication, on 64 bits systems. The fix is written > as a C++ template working on any unsigned integer type, and doesn't > need any external resource: > > // Copyright 2011: Nicolas Argyrou , public domain. > template > inline T ipow(register T x, register T y) > { > if (x == 0 && y != 0) return 0; > // 1: ipow(x,y) = x ** y = Product [i=0; i (((y>>i)&1)*2**i)) > // 2: x**(2**i) = x**(2**(i-1)) * x**(2**(i-1)) > register T X = x; register T xy = 1; > for(; y; y>>=1, X *= X) > if (y & 1) > xy *= X; > return xy; > } >
Re: Bug fix for $((x**y)) algorithm on 64+ bits machines.
145557834293068928043467566190278008218249525830565939618481 is awfully big number! :-) -- William - Original Message - > From: Nicolas ARGYROU > To: "bug-bash@gnu.org" > Cc: > Sent: Friday, September 16, 2011 4:39:41 PM > Subject: Bug fix for $((x**y)) algorithm on 64+ bits machines. > > From: na...@yahoo.com > To: bug-bash@gnu.org > Subject: Bug fix for $((x**y)) algorithm on 64+ bits machines. > > Configuration Information [Automatically generated, do not change]: > Machine: x86_64 > OS: linux-gnu > Compiler: gcc > Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' > -DCONF_OSTYPE='linux-gnu' > -DCONF_MACHTYPE='x86_64-mandriva-linux-gnu' > -DCONF_VENDOR='mandriva' -DLOCALEDIR='/usr/share/locale' > -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include > -I./lib -O2 -g -pipe -Wformat -Werror=format-security > -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector --param=ssp-buffer-size=4 > uname output: Linux localhost.localdomain 2.6.31.14-desktop-1mnb #1 SMP Wed > Nov > 24 10:42:07 EST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ > GNU/Linux > Machine Type: x86_64-mandriva-linux-gnu > > Bash Version: 4.0 > Patch Level: 33 > Release Status: release > > Description: > The algorithm used to calculate x to the power of y: x**y > takes O(y) time which is way too long on systems using 64 bits. > Calculating for exemple $((3**2**62)) freezes the shell at > argument parsing time. > > Repeat-By: > bash -c 'echo $((3**2**62))' > > Fix: > This fix uses an alorithm that takes O(log(y)) time, which is way > faster. But it is still about 30 times slower with random numbers > than a single multiplication, on 64 bits systems. The fix is written > as a C++ template working on any unsigned integer type, and doesn't > need any external resource: > > // Copyright 2011: Nicolas Argyrou , public domain. > template > inline T ipow(register T x, register T y) > { > if (x == 0 && y != 0) return 0; > // 1: ipow(x,y) = x ** y = Product [i=0; i (((y>>i)&1)*2**i)) > // 2: x**(2**i) = x**(2**(i-1)) * x**(2**(i-1)) > register T X = x; register T xy = 1; > for(; y; y>>=1, X *= X) > if (y & 1) > xy *= X; > return xy; > } >
Bug fix for $((x**y)) algorithm on 64+ bits machines.
From: na...@yahoo.com To: bug-bash@gnu.org Subject: Bug fix for $((x**y)) algorithm on 64+ bits machines. Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-mandriva-linux-gnu' -DCONF_VENDOR='mandriva' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -O2 -g -pipe -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 uname output: Linux localhost.localdomain 2.6.31.14-desktop-1mnb #1 SMP Wed Nov 24 10:42:07 EST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ GNU/Linux Machine Type: x86_64-mandriva-linux-gnu Bash Version: 4.0 Patch Level: 33 Release Status: release Description: The algorithm used to calculate x to the power of y: x**y takes O(y) time which is way too long on systems using 64 bits. Calculating for exemple $((3**2**62)) freezes the shell at argument parsing time. Repeat-By: bash -c 'echo $((3**2**62))' Fix: This fix uses an alorithm that takes O(log(y)) time, which is way faster. But it is still about 30 times slower with random numbers than a single multiplication, on 64 bits systems. The fix is written as a C++ template working on any unsigned integer type, and doesn't need any external resource: // Copyright 2011: Nicolas Argyrou , public domain. template inline T ipow(register T x, register T y) { if (x == 0 && y != 0) return 0; // 1: ipow(x,y) = x ** y = Product [i=0; i>i)&1)*2**i)) // 2: x**(2**i) = x**(2**(i-1)) * x**(2**(i-1)) register T X = x; register T xy = 1; for(; y; y>>=1, X *= X) if (y & 1) xy *= X; return xy; }