Re: asm volatile [Was: [RFC] div64_64 support]

2007-03-08 Thread Sami Farin
On Wed, Mar 07, 2007 at 00:24:35 +0200, Sami Farin wrote: > On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: > ... > > And I found bug in gcc-4.1.2, it gave 0 for ncubic results > > when doing 1000 loops test... gcc-4.0.3 works. > > Found it. > > --- cbrt-test.c~ 2007-03-07

Re: asm volatile [Was: [RFC] div64_64 support]

2007-03-08 Thread Sami Farin
On Wed, Mar 07, 2007 at 00:24:35 +0200, Sami Farin wrote: On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: ... And I found bug in gcc-4.1.2, it gave 0 for ncubic results when doing 1000 loops test... gcc-4.0.3 works. Found it. --- cbrt-test.c~ 2007-03-07

Re: [RFC] div64_64 support

2007-03-07 Thread Sami Farin
On Wed, Mar 07, 2007 at 11:11:49 -0500, Chuck Ebbert wrote: > Sami Farin wrote: > > On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: > > ... > >> And I found bug in gcc-4.1.2, it gave 0 for ncubic results > >> when doing 1000 loops test... gcc-4.0.3 works. > > > > Found it. > > > > ---

Re: [RFC] div64_64 support

2007-03-07 Thread Chuck Ebbert
Sami Farin wrote: > On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: > ... >> And I found bug in gcc-4.1.2, it gave 0 for ncubic results >> when doing 1000 loops test... gcc-4.0.3 works. > > Found it. > > --- cbrt-test.c~ 2007-03-07 00:20:54.735248105 +0200 > +++ cbrt-test.c

Re: [RFC] div64_64 support

2007-03-07 Thread Chuck Ebbert
Sami Farin wrote: On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: ... And I found bug in gcc-4.1.2, it gave 0 for ncubic results when doing 1000 loops test... gcc-4.0.3 works. Found it. --- cbrt-test.c~ 2007-03-07 00:20:54.735248105 +0200 +++ cbrt-test.c 2007-03-07

Re: [RFC] div64_64 support

2007-03-07 Thread Sami Farin
On Wed, Mar 07, 2007 at 11:11:49 -0500, Chuck Ebbert wrote: Sami Farin wrote: On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: ... And I found bug in gcc-4.1.2, it gave 0 for ncubic results when doing 1000 loops test... gcc-4.0.3 works. Found it. --- cbrt-test.c~

Re: [RFC] div64_64 support

2007-03-06 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Tue, 6 Mar 2007 16:00:55 -0800 > On Wed, 7 Mar 2007 00:24:35 +0200 > Sami Farin <[EMAIL PROTECTED]> wrote: > > > On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: > > ... > > > And I found bug in gcc-4.1.2, it gave 0 for ncubic results >

Re: [RFC] div64_64 support

2007-03-06 Thread Sami Farin
On Tue, Mar 06, 2007 at 16:00:55 -0800, Stephen Hemminger wrote: ... > > Now Linux 2.6 does not have "memory" in fls, maybe it causes > > some gcc funnies some people are seeing. > > > > That code was copy-paste from: > include/asm-x86_64/bitops.h > > So shouldn't both fls() and ffs() be

Re: [RFC] div64_64 support

2007-03-06 Thread Stephen Hemminger
On Wed, 7 Mar 2007 00:24:35 +0200 Sami Farin <[EMAIL PROTECTED]> wrote: > On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: > ... > > And I found bug in gcc-4.1.2, it gave 0 for ncubic results > > when doing 1000 loops test... gcc-4.0.3 works. > > Found it. > > --- cbrt-test.c~

Re: [RFC] div64_64 support

2007-03-06 Thread Sami Farin
On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: ... > And I found bug in gcc-4.1.2, it gave 0 for ncubic results > when doing 1000 loops test... gcc-4.0.3 works. Found it. --- cbrt-test.c~2007-03-07 00:20:54.735248105 +0200 +++ cbrt-test.c 2007-03-07 00:21:03.964864343 +0200 @@

Re: [RFC] div64_64 support

2007-03-06 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Tue, 6 Mar 2007 10:29:41 -0800 > /* calculate the cubic root of x using Newton-Raphson */ > static uint32_t ncubic(uint64_t a) > { > uint64_t x; > > /* Initial estimate is based on: >* cbrt(x) = exp(log(x) / 3) >*/ >

Re: [RFC] div64_64 support

2007-03-06 Thread Sami Farin
On Tue, Mar 06, 2007 at 10:29:41 -0800, Stephen Hemminger wrote: > Don't count the existing Newton-Raphson out. It turns out that to get enough > precision for 32 bits, only 4 iterations are needed. By unrolling those, it > gets much better timing. > > Slightly gross test program (with original

Re: [RFC] div64_64 support

2007-03-06 Thread Stephen Hemminger
On Tue, 6 Mar 2007 20:48:41 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > On Tue, Mar 06, 2007 at 10:29:41AM -0800, Stephen Hemminger wrote: > > Don't count the existing Newton-Raphson out. It turns out that to get enough > > precision for 32 bits, only 4 iterations are needed. By unrolling

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
On Tue, Mar 06, 2007 at 10:29:41AM -0800, Stephen Hemminger wrote: > Don't count the existing Newton-Raphson out. It turns out that to get enough > precision for 32 bits, only 4 iterations are needed. By unrolling those, it > gets much better timing. But did you fix the >2^43 bug too? SGI has

Re: [RFC] div64_64 support

2007-03-06 Thread H. Peter Anvin
Andi Kleen wrote: Let me see... You throw code like that and expect someone to actually understand it in one year, and be able to correct a bug ? To be honest I don't expect any bugs in this function. Please add something, an URL or even better a nice explanation, per favor... It's

Re: [RFC] div64_64 support II

2007-03-06 Thread H. Peter Anvin
Andi Kleen wrote: The problem with these algorithms that tradoff one or more multiplies in order to avoid a divide is that they don't give anything and often lose when both multiplies and divides are emulated in software. Actually on rereading this: is there really any Linux port that emulates

Re: [RFC] div64_64 support

2007-03-06 Thread Stephen Hemminger
Don't count the existing Newton-Raphson out. It turns out that to get enough precision for 32 bits, only 4 iterations are needed. By unrolling those, it gets much better timing. Slightly gross test program (with original cubic wraparound bug fixed). --- /* Test and measure perf of cube root

Re: [RFC] div64_64 support II

2007-03-06 Thread David Miller
From: [EMAIL PROTECTED] (Dagfinn Ilmari Mannsåker) Date: Tue, 06 Mar 2007 18:43:14 +0100 > Andi Kleen <[EMAIL PROTECTED]> writes: > > > Actually on rereading this: is there really any Linux port > > that emulates multiplies in software? I thought that was only > > done on really small

Re: [RFC] div64_64 support II

2007-03-06 Thread Dagfinn Ilmari Mannsåker
Andi Kleen <[EMAIL PROTECTED]> writes: > Actually on rereading this: is there really any Linux port > that emulates multiplies in software? I thought that was only > done on really small microcontrollers or smart cards; but anything > 32bit+ that runs Linux should have hardware multiply,

Re: [RFC] div64_64 support

2007-03-06 Thread Roland Kuhn
Hi Andi! On 6 Mar 2007, at 15:45, Andi Kleen wrote: Let me see... You throw code like that and expect someone to actually understand it in one year, and be able to correct a bug ? To be honest I don't expect any bugs in this function. Please add something, an URL or even better a nice

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
> > Let me see... You throw code like that and expect someone to actually > understand it in one year, and be able to correct a bug ? To be honest I don't expect any bugs in this function. > > > Please add something, an URL or even better a nice explanation, per favor... It's straight out

Re: [RFC] div64_64 support

2007-03-06 Thread Eric Dumazet
On Tuesday 06 March 2007 14:34, Andi Kleen wrote: > - return x; > + int s; > + u32 y; > + u64 b; > + u64 bs; > + > + y = 0; > + for (s = 63; s >= 0; s -= 3) { > + y = 2 * y; > + b = 3 * y * (y+1) + 1; > + bs = b << s; > +

Re: [RFC] div64_64 support II

2007-03-06 Thread Andi Kleen
> The problem with these algorithms that tradoff one or more > multiplies in order to avoid a divide is that they don't > give anything and often lose when both multiplies and > divides are emulated in software. Actually on rereading this: is there really any Linux port that emulates multiplies

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
On Mon, Mar 05, 2007 at 04:25:51PM -0800, David Miller wrote: > Another thing is that the non-Hacker's Delight version iterates > differently for different input values, so the input value space is > very important to consider when comparing these two pieces of code. I did some stochastic testing

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
On Mon, Mar 05, 2007 at 03:57:14PM -0800, Stephen Hemminger wrote: > On 03 Mar 2007 03:31:52 +0100 > Andi Kleen <[EMAIL PROTECTED]> wrote: > > > Stephen Hemminger <[EMAIL PROTECTED]> writes: > > > > > Here is another way to handle the 64 bit divide case. > > > It allows full 64 bit divide by

Re: [RFC] div64_64 support

2007-03-06 Thread Stephen Hemminger
On Wed, 7 Mar 2007 00:24:35 +0200 Sami Farin [EMAIL PROTECTED] wrote: On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: ... And I found bug in gcc-4.1.2, it gave 0 for ncubic results when doing 1000 loops test... gcc-4.0.3 works. Found it. --- cbrt-test.c~ 2007-03-07

Re: [RFC] div64_64 support

2007-03-06 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED] Date: Tue, 6 Mar 2007 16:00:55 -0800 On Wed, 7 Mar 2007 00:24:35 +0200 Sami Farin [EMAIL PROTECTED] wrote: On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: ... And I found bug in gcc-4.1.2, it gave 0 for ncubic results when doing

Re: [RFC] div64_64 support

2007-03-06 Thread Sami Farin
On Tue, Mar 06, 2007 at 16:00:55 -0800, Stephen Hemminger wrote: ... Now Linux 2.6 does not have memory in fls, maybe it causes some gcc funnies some people are seeing. That code was copy-paste from: include/asm-x86_64/bitops.h So shouldn't both fls() and ffs() be fixed there

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
On Mon, Mar 05, 2007 at 03:57:14PM -0800, Stephen Hemminger wrote: On 03 Mar 2007 03:31:52 +0100 Andi Kleen [EMAIL PROTECTED] wrote: Stephen Hemminger [EMAIL PROTECTED] writes: Here is another way to handle the 64 bit divide case. It allows full 64 bit divide by adding the support

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
On Mon, Mar 05, 2007 at 04:25:51PM -0800, David Miller wrote: Another thing is that the non-Hacker's Delight version iterates differently for different input values, so the input value space is very important to consider when comparing these two pieces of code. I did some stochastic testing on

Re: [RFC] div64_64 support II

2007-03-06 Thread Andi Kleen
The problem with these algorithms that tradoff one or more multiplies in order to avoid a divide is that they don't give anything and often lose when both multiplies and divides are emulated in software. Actually on rereading this: is there really any Linux port that emulates multiplies in

Re: [RFC] div64_64 support

2007-03-06 Thread Eric Dumazet
On Tuesday 06 March 2007 14:34, Andi Kleen wrote: - return x; + int s; + u32 y; + u64 b; + u64 bs; + + y = 0; + for (s = 63; s = 0; s -= 3) { + y = 2 * y; + b = 3 * y * (y+1) + 1; + bs = b s; + if (x = bs (b

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
rant Let me see... You throw code like that and expect someone to actually understand it in one year, and be able to correct a bug ? To be honest I don't expect any bugs in this function. /rant Please add something, an URL or even better a nice explanation, per favor... It's straight

Re: [RFC] div64_64 support

2007-03-06 Thread Roland Kuhn
Hi Andi! On 6 Mar 2007, at 15:45, Andi Kleen wrote: rant Let me see... You throw code like that and expect someone to actually understand it in one year, and be able to correct a bug ? To be honest I don't expect any bugs in this function. /rant Please add something, an URL or even better

Re: [RFC] div64_64 support II

2007-03-06 Thread Dagfinn Ilmari Mannsåker
Andi Kleen [EMAIL PROTECTED] writes: Actually on rereading this: is there really any Linux port that emulates multiplies in software? I thought that was only done on really small microcontrollers or smart cards; but anything 32bit+ that runs Linux should have hardware multiply, shouldn't it?

Re: [RFC] div64_64 support II

2007-03-06 Thread David Miller
From: [EMAIL PROTECTED] (Dagfinn Ilmari Mannsåker) Date: Tue, 06 Mar 2007 18:43:14 +0100 Andi Kleen [EMAIL PROTECTED] writes: Actually on rereading this: is there really any Linux port that emulates multiplies in software? I thought that was only done on really small microcontrollers or

Re: [RFC] div64_64 support

2007-03-06 Thread Stephen Hemminger
Don't count the existing Newton-Raphson out. It turns out that to get enough precision for 32 bits, only 4 iterations are needed. By unrolling those, it gets much better timing. Slightly gross test program (with original cubic wraparound bug fixed). --- /* Test and measure perf of cube root

Re: [RFC] div64_64 support II

2007-03-06 Thread H. Peter Anvin
Andi Kleen wrote: The problem with these algorithms that tradoff one or more multiplies in order to avoid a divide is that they don't give anything and often lose when both multiplies and divides are emulated in software. Actually on rereading this: is there really any Linux port that emulates

Re: [RFC] div64_64 support

2007-03-06 Thread H. Peter Anvin
Andi Kleen wrote: rant Let me see... You throw code like that and expect someone to actually understand it in one year, and be able to correct a bug ? To be honest I don't expect any bugs in this function. /rant Please add something, an URL or even better a nice explanation, per favor...

Re: [RFC] div64_64 support

2007-03-06 Thread Andi Kleen
On Tue, Mar 06, 2007 at 10:29:41AM -0800, Stephen Hemminger wrote: Don't count the existing Newton-Raphson out. It turns out that to get enough precision for 32 bits, only 4 iterations are needed. By unrolling those, it gets much better timing. But did you fix the 2^43 bug too? SGI has

Re: [RFC] div64_64 support

2007-03-06 Thread Stephen Hemminger
On Tue, 6 Mar 2007 20:48:41 +0100 Andi Kleen [EMAIL PROTECTED] wrote: On Tue, Mar 06, 2007 at 10:29:41AM -0800, Stephen Hemminger wrote: Don't count the existing Newton-Raphson out. It turns out that to get enough precision for 32 bits, only 4 iterations are needed. By unrolling those, it

Re: [RFC] div64_64 support

2007-03-06 Thread Sami Farin
On Tue, Mar 06, 2007 at 10:29:41 -0800, Stephen Hemminger wrote: Don't count the existing Newton-Raphson out. It turns out that to get enough precision for 32 bits, only 4 iterations are needed. By unrolling those, it gets much better timing. Slightly gross test program (with original cubic

Re: [RFC] div64_64 support

2007-03-06 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED] Date: Tue, 6 Mar 2007 10:29:41 -0800 /* calculate the cubic root of x using Newton-Raphson */ static uint32_t ncubic(uint64_t a) { uint64_t x; /* Initial estimate is based on: * cbrt(x) = exp(log(x) / 3) */ x = 1u

Re: [RFC] div64_64 support

2007-03-06 Thread Sami Farin
On Tue, Mar 06, 2007 at 23:53:49 +0200, Sami Farin wrote: ... And I found bug in gcc-4.1.2, it gave 0 for ncubic results when doing 1000 loops test... gcc-4.0.3 works. Found it. --- cbrt-test.c~2007-03-07 00:20:54.735248105 +0200 +++ cbrt-test.c 2007-03-07 00:21:03.964864343 +0200 @@

Re: [RFC] div64_64 support

2007-03-05 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Mon, 5 Mar 2007 15:57:14 -0800 > I tried the code from Hacker's Delight. > It is cool, but performance is CPU (and data) dependent: > > Average # of usecs per operation: Interesting results. The problem with these algorithms that tradoff one or

Re: [RFC] div64_64 support

2007-03-05 Thread Stephen Hemminger
On 03 Mar 2007 03:31:52 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > Stephen Hemminger <[EMAIL PROTECTED]> writes: > > > Here is another way to handle the 64 bit divide case. > > It allows full 64 bit divide by adding the support routine > > GCC needs. > > Not supplying that was intentional by

Re: [RFC] div64_64 support

2007-03-05 Thread Stephen Hemminger
On 03 Mar 2007 03:31:52 +0100 Andi Kleen [EMAIL PROTECTED] wrote: Stephen Hemminger [EMAIL PROTECTED] writes: Here is another way to handle the 64 bit divide case. It allows full 64 bit divide by adding the support routine GCC needs. Not supplying that was intentional by Linus so that

Re: [RFC] div64_64 support

2007-03-05 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED] Date: Mon, 5 Mar 2007 15:57:14 -0800 I tried the code from Hacker's Delight. It is cool, but performance is CPU (and data) dependent: Average # of usecs per operation: Interesting results. The problem with these algorithms that tradoff one or more

Re: [RFC] div64_64 support

2007-03-02 Thread Andi Kleen
Stephen Hemminger <[EMAIL PROTECTED]> writes: > Here is another way to handle the 64 bit divide case. > It allows full 64 bit divide by adding the support routine > GCC needs. Not supplying that was intentional by Linus so that people think twice (or more often) before they using such expensive

Re: [RFC] div64_64 support

2007-03-02 Thread Andi Kleen
Stephen Hemminger [EMAIL PROTECTED] writes: Here is another way to handle the 64 bit divide case. It allows full 64 bit divide by adding the support routine GCC needs. Not supplying that was intentional by Linus so that people think twice (or more often) before they using such expensive

Re: [RFC] div64_64 support

2007-02-26 Thread Dan Williams
On 2/26/07, Stephen Hemminger <[EMAIL PROTECTED]> wrote: Here is another way to handle the 64 bit divide case. It allows full 64 bit divide by adding the support routine GCC needs. I know ARM already went through the process of removing __udivdi3 support:

Re: [RFC] div64_64 support

2007-02-26 Thread Segher Boessenkool
I thought the motivation for div64() was that a 64:32->32 divide could be done a lot faster on a number of platforms (including the important x86) than a generic 64:64->64 divide, but gcc doesn't handle the devolution automatically -- there is no such libgcc function. That there's no such

Re: [RFC] div64_64 support

2007-02-26 Thread H. Peter Anvin
Stephen Hemminger wrote: Hmm. Those are the GCC internal versions, that are picked up but doing divide in place. Do we want to allow general 64 bit in kernel to be easily used? It could cause sloppy slow code, but it would look cleaner. ... and it would handle datatypes which may be

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 26 2007 16:07, Stephen Hemminger wrote: >> On Feb 26 2007 15:44, Stephen Hemminger wrote: >> >> >-x = (2 * x + (uint32_t) div64_64(a, x*x)) / 3; >> >> >+x = (2 * x + (u32) (a / x*x)) / 3; >> >> >> >> Previously there was div64_64(a, x*x) which is equivalent

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
On Tue, 27 Feb 2007 01:05:26 +0100 (MET) Jan Engelhardt <[EMAIL PROTECTED]> wrote: > > On Feb 26 2007 15:44, Stephen Hemminger wrote: > >> >- x = (2 * x + (uint32_t) div64_64(a, x*x)) / 3; > >> >+ x = (2 * x + (u32) (a / x*x)) / 3; > >> > >> Previously there was div64_64(a, x*x)

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 26 2007 15:44, Stephen Hemminger wrote: >> >- x = (2 * x + (uint32_t) div64_64(a, x*x)) / 3; >> >+ x = (2 * x + (u32) (a / x*x)) / 3; >> >> Previously there was div64_64(a, x*x) which is equivalent to >> (a)/(x*x), or just: a/(x^2). But now you do a/x*x, which is >>

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
On Tue, 27 Feb 2007 00:02:50 +0100 (MET) Jan Engelhardt <[EMAIL PROTECTED]> wrote: > > On Feb 26 2007 13:28, Stephen Hemminger wrote: > >> > >> ./arch/arm26/lib/udivdi3.c > >> ./arch/sh/lib/udivdi3.c > >> ./arch/sparc/lib/udivdi3.S > >> > >> should not this be consolidated too? > > > >Hmm.

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 26 2007 13:28, Stephen Hemminger wrote: >> >> ./arch/arm26/lib/udivdi3.c >> ./arch/sh/lib/udivdi3.c >> ./arch/sparc/lib/udivdi3.S >> >> should not this be consolidated too? > >Hmm. Those are the GCC internal versions, that are picked up but >doing divide in place. Do we want to allow

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
Here is another way to handle the 64 bit divide case. It allows full 64 bit divide by adding the support routine GCC needs. --- arch/alpha/Kconfig |4 arch/arm/Kconfig |4 arch/arm26/Kconfig |4 arch/avr32/Kconfig |4

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
On Mon, 26 Feb 2007 21:09:26 +0100 (MET) Jan Engelhardt <[EMAIL PROTECTED]> wrote: > > On Feb 23 2007 17:05, Stephen Hemminger wrote: > > > >Since there already two users of full 64 bit division in the kernel, > >and other places maybe hiding out as well. Add a full 64/64 bit divide. > > > >Yes

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 23 2007 17:05, Stephen Hemminger wrote: > >Since there already two users of full 64 bit division in the kernel, >and other places maybe hiding out as well. Add a full 64/64 bit divide. > >Yes this expensive, but there are places where it is necessary. >It is not clear if doing the scaling

Re: [RFC] div64_64 support

2007-02-26 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Mon, 26 Feb 2007 11:28:18 -0800 > asm-i386.h/div64 and div64.o needs to move in Makefile to get this to work > on i386. This looks great to me: Signed-off-by: David S. Miller <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
asm-i386.h/div64 and div64.o needs to move in Makefile to get this to work on i386. --- include/asm-arm/div64.h |2 ++ include/asm-generic/div64.h |8 include/asm-i386/div64.h |5 + include/asm-m68k/div64.h |2 ++ include/asm-mips/div64.h |8

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
asm-i386.h/div64 and div64.o needs to move in Makefile to get this to work on i386. --- include/asm-arm/div64.h |2 ++ include/asm-generic/div64.h |8 include/asm-i386/div64.h |5 + include/asm-m68k/div64.h |2 ++ include/asm-mips/div64.h |8

Re: [RFC] div64_64 support

2007-02-26 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED] Date: Mon, 26 Feb 2007 11:28:18 -0800 asm-i386.h/div64 and div64.o needs to move in Makefile to get this to work on i386. This looks great to me: Signed-off-by: David S. Miller [EMAIL PROTECTED] - To unsubscribe from this list: send the line

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 23 2007 17:05, Stephen Hemminger wrote: Since there already two users of full 64 bit division in the kernel, and other places maybe hiding out as well. Add a full 64/64 bit divide. Yes this expensive, but there are places where it is necessary. It is not clear if doing the scaling buys

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
On Mon, 26 Feb 2007 21:09:26 +0100 (MET) Jan Engelhardt [EMAIL PROTECTED] wrote: On Feb 23 2007 17:05, Stephen Hemminger wrote: Since there already two users of full 64 bit division in the kernel, and other places maybe hiding out as well. Add a full 64/64 bit divide. Yes this

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
Here is another way to handle the 64 bit divide case. It allows full 64 bit divide by adding the support routine GCC needs. --- arch/alpha/Kconfig |4 arch/arm/Kconfig |4 arch/arm26/Kconfig |4 arch/avr32/Kconfig |4

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 26 2007 13:28, Stephen Hemminger wrote: ./arch/arm26/lib/udivdi3.c ./arch/sh/lib/udivdi3.c ./arch/sparc/lib/udivdi3.S should not this be consolidated too? Hmm. Those are the GCC internal versions, that are picked up but doing divide in place. Do we want to allow general 64 bit in

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
On Tue, 27 Feb 2007 00:02:50 +0100 (MET) Jan Engelhardt [EMAIL PROTECTED] wrote: On Feb 26 2007 13:28, Stephen Hemminger wrote: ./arch/arm26/lib/udivdi3.c ./arch/sh/lib/udivdi3.c ./arch/sparc/lib/udivdi3.S should not this be consolidated too? Hmm. Those are the GCC internal

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 26 2007 15:44, Stephen Hemminger wrote: - x = (2 * x + (uint32_t) div64_64(a, x*x)) / 3; + x = (2 * x + (u32) (a / x*x)) / 3; Previously there was div64_64(a, x*x) which is equivalent to (a)/(x*x), or just: a/(x^2). But now you do a/x*x, which is equivalent to

Re: [RFC] div64_64 support

2007-02-26 Thread Stephen Hemminger
On Tue, 27 Feb 2007 01:05:26 +0100 (MET) Jan Engelhardt [EMAIL PROTECTED] wrote: On Feb 26 2007 15:44, Stephen Hemminger wrote: - x = (2 * x + (uint32_t) div64_64(a, x*x)) / 3; + x = (2 * x + (u32) (a / x*x)) / 3; Previously there was div64_64(a, x*x) which is

Re: [RFC] div64_64 support

2007-02-26 Thread Jan Engelhardt
On Feb 26 2007 16:07, Stephen Hemminger wrote: On Feb 26 2007 15:44, Stephen Hemminger wrote: -x = (2 * x + (uint32_t) div64_64(a, x*x)) / 3; +x = (2 * x + (u32) (a / x*x)) / 3; Previously there was div64_64(a, x*x) which is equivalent to (a)/(x*x), or

Re: [RFC] div64_64 support

2007-02-26 Thread H. Peter Anvin
Stephen Hemminger wrote: Hmm. Those are the GCC internal versions, that are picked up but doing divide in place. Do we want to allow general 64 bit in kernel to be easily used? It could cause sloppy slow code, but it would look cleaner. ... and it would handle datatypes which may be

Re: [RFC] div64_64 support

2007-02-26 Thread Segher Boessenkool
I thought the motivation for div64() was that a 64:32-32 divide could be done a lot faster on a number of platforms (including the important x86) than a generic 64:64-64 divide, but gcc doesn't handle the devolution automatically -- there is no such libgcc function. That there's no such

Re: [RFC] div64_64 support

2007-02-26 Thread Dan Williams
On 2/26/07, Stephen Hemminger [EMAIL PROTECTED] wrote: Here is another way to handle the 64 bit divide case. It allows full 64 bit divide by adding the support routine GCC needs. snip I know ARM already went through the process of removing __udivdi3 support:

Re: [RFC] div64_64 support

2007-02-24 Thread Sami Farin
On Fri, Feb 23, 2007 at 17:05:27 -0800, Stephen Hemminger wrote: > Since there already two users of full 64 bit division in the kernel, > and other places maybe hiding out as well. Add a full 64/64 bit divide. > > Yes this expensive, but there are places where it is necessary. > It is not clear

Re: [RFC] div64_64 support

2007-02-24 Thread Sami Farin
On Fri, Feb 23, 2007 at 17:05:27 -0800, Stephen Hemminger wrote: Since there already two users of full 64 bit division in the kernel, and other places maybe hiding out as well. Add a full 64/64 bit divide. Yes this expensive, but there are places where it is necessary. It is not clear if

[RFC] div64_64 support

2007-02-23 Thread Stephen Hemminger
Since there already two users of full 64 bit division in the kernel, and other places maybe hiding out as well. Add a full 64/64 bit divide. Yes this expensive, but there are places where it is necessary. It is not clear if doing the scaling buys any advantage on 64 bit platforms, so for them a

[RFC] div64_64 support

2007-02-23 Thread Stephen Hemminger
Since there already two users of full 64 bit division in the kernel, and other places maybe hiding out as well. Add a full 64/64 bit divide. Yes this expensive, but there are places where it is necessary. It is not clear if doing the scaling buys any advantage on 64 bit platforms, so for them a