RE: Performance of floating point instructions

2010-10-29 Thread pedro.larroy
Hi, (I resurrected this old thread because there was a comment on the meego-dev mailing list about the possibility of RunFast float mode being enabled by default on MeeGo...) ext Alberto Mardegan wrote: > Eero Tamminen wrote: >> Hamalainen Kimmo (No

Re: Performance of floating point instructions

2010-10-29 Thread Alberto Mardegan
On 10/29/2010 12:14 PM, Eero Tamminen wrote: Hi, (I resurrected this old thread because there was a comment on the meego-dev mailing list about the possibility of RunFast float mode being enabled by default on MeeGo...) ext Alberto Mardegan wrote: Eero Tamminen wrote: Hamalainen Kimmo (Nokia-D/Helsi

Re: Performance of floating point instructions

2010-10-29 Thread Eero Tamminen
Hi, (I resurrected this old thread because there was a comment on the meego-dev mailing list about the possibility of RunFast float mode being enabled by default on MeeGo...) ext Alberto Mardegan wrote: Eero Tamminen wrote: Hamalainen Kimmo (Nokia-D/Helsinki) wrote: On Wed, 2010-03-10 at 12:57 +0100

Re: Performance of floating point instructions

2010-03-12 Thread Eero Tamminen
Hi, ext Alberto Mardegan wrote: Right. Just to complete the picture, here's the same data with -O2: float (fast mode enabled): map_path_calculate_distances: 40 ms for 8250 points map_path_calculate_distances: 2 ms for 430 points double (fast mode enabled): Note that fast mode affects only fl

Re: Performance of floating point instructions

2010-03-11 Thread Alberto Mardegan
Siarhei Siamashka wrote: The output (application compiled with -O0): Using an optimized build (-O2 or -O3) may sometimes change the overall picture quite dramatically. It makes almost no sense benchmarking -O0 code, because in this case all the local variables are kept in memory and are read/wr

Re: Performance of floating point instructions

2010-03-11 Thread Tor
[float/double/vfp/neon etc.] There's some information collected at http://pandorawiki.org/Floating_Point_Optimization There are also several long threads over at the Pandora forum about everything discussed in this thread. -Tor

Re: Performance of floating point instructions

2010-03-10 Thread Laurent GUERBY
On Thu, 2010-03-11 at 00:32 +0200, Siarhei Siamashka wrote: > On Wednesday 10 March 2010, Laurent GUERBY wrote: > > GCC comes with some builtins for neon, they're defined in arm_neon.h > > see below. > > This does not sound like a good idea. If the code has to be modified and > changed into someth
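For reference, the three-instruction sequence quoted elsewhere in this thread (vld1.32 / vadd.f32 / vst1.32 on one lane) can be written with the arm_neon.h intrinsics mentioned here. This is only a minimal sketch: the function name double_in_place and the HAVE_NEON macro are made up for illustration, and the scalar branch is there so that the same file still builds on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1   /* hypothetical macro, local to this sketch */
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: doubles one float in place, *p = *p + *p. */
void double_in_place(float *p)
{
    assert(p != NULL);
#if HAVE_NEON
    float32x2_t v = vdup_n_f32(0.0f);   /* give both lanes a defined value */
    v = vld1_lane_f32(p, v, 0);         /* vld1.32 {d0[0]}, [r0] */
    v = vadd_f32(v, v);                 /* vadd.f32 d0, d0, d0   */
    vst1_lane_f32(p, v, 0);             /* vst1.32 {d0[0]}, [r0] */
#else
    *p = *p + *p;                       /* portable fallback without NEON */
#endif
}
```

Whether the compiler keeps exactly these three instructions is not guaranteed; inspecting the generated assembly is the only way to be sure.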

Re: Performance of floating point instructions

2010-03-10 Thread Siarhei Siamashka
On Wednesday 10 March 2010, Laurent GUERBY wrote: > On Wed, 2010-03-10 at 21:54 +0200, Siarhei Siamashka wrote: > > I wonder why the compiler does not use real NEON instructions with > > -ffast-math option, it should be quite useful even for scalar code. > > > > something like: > > > > vld1.32 {d0

Re: Performance of floating point instructions

2010-03-10 Thread Siarhei Siamashka
On Wednesday 10 March 2010, Laurent Desnogues wrote: > Even if fast-math is known to break some rules, it only > breaks C rules IIRC. OTOH, NEON FP has no support > for NaN and other nice things from IEEE754. And just checked gcc man page to verify this stuff. -ffast-math Sets -fno-math-errno

Re: Performance of floating point instructions

2010-03-10 Thread Siarhei Siamashka
On Wednesday 10 March 2010, Laurent Desnogues wrote: > On Wed, Mar 10, 2010 at 8:54 PM, Siarhei Siamashka > wrote: > [...] > > > I wonder why the compiler does not use real NEON instructions with > > -ffast-math option, it should be quite useful even for scalar code. > > > > something like: > > >

Re: Performance of floating point instructions

2010-03-10 Thread Laurent GUERBY
On Wed, 2010-03-10 at 21:54 +0200, Siarhei Siamashka wrote: > I wonder why the compiler does not use real NEON instructions with > -ffast-math > option, it should be quite useful even for scalar code. > > something like: > > vld1.32 {d0[0]}, [r0] > vadd.f32 d0, d0, d0 > vst1.32 {d0[0]}, [r0]

Re: Performance of floating point instructions

2010-03-10 Thread Laurent Desnogues
On Wed, Mar 10, 2010 at 8:54 PM, Siarhei Siamashka wrote: [...] > I wonder why the compiler does not use real NEON instructions with -ffast-math > option, it should be quite useful even for scalar code. > > something like: > > vld1.32  {d0[0]}, [r0] > vadd.f32 d0, d0, d0 > vst1.32  {d0[0]}, [r0] >

Re: Performance of floating point instructions

2010-03-10 Thread Siarhei Siamashka
On Wednesday 10 March 2010, Laurent Desnogues wrote: > On Wed, Mar 10, 2010 at 7:29 PM, Alberto Mardegan > > So, it seems that there's a huge improvement when switching from doubles > > to floats; although I wonder if it's because of the FPU or just because > > the amount of data passed around is

Re: Performance of floating point instructions

2010-03-10 Thread Siarhei Siamashka
On Wednesday 10 March 2010, Alberto Mardegan wrote: > Alberto Mardegan wrote: > > Does one have any figure about how the performance of the FPU is, > > compared to integer operations? > > I added some profiling to the code, and I measured the time spent by a > function which is operating on an arra

Re: Performance of floating point instructions

2010-03-10 Thread Laurent Desnogues
On Wed, Mar 10, 2010 at 7:29 PM, Alberto Mardegan wrote: > Alberto Mardegan wrote: >> >> Does one have any figure about how the performance of the FPU is, compared >> to integer operations? > > I added some profiling to the code, and I measured the time spent by a > function which is operating on

Re: Performance of floating point instructions

2010-03-10 Thread Bernd Stramm
On Wed, 2010-03-10 at 20:29 +0200, Alberto Mardegan wrote: > Alberto Mardegan wrote: > > Does one have any figure about how the performance of the FPU is, > > compared to integer operations? > > I added some profiling to the code, and I measured the time spent by a > function which is operating

Re: Performance of floating point instructions

2010-03-10 Thread Alberto Mardegan
Alberto Mardegan wrote: Does one have any figure about how the performance of the FPU is, compared to integer operations? I added some profiling to the code, and I measured the time spent by a function which is operating on an array of points (whose coordinates are integers) and transforming e
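A measurement of this kind can be sketched roughly as follows. This is not the maemo-mapper code: point_t, transform_sum_f, transform_sum_d and time_variants are hypothetical names, and the "transformation" is a plain scaling chosen only to keep the example short. As noted elsewhere in the thread, numbers from an -O0 build say little, so such a test is better built with -O2.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef struct { int x, y; } point_t;   /* hypothetical integer point */

/* Scale every coordinate in single precision and return the sum of the
 * results, so the compiler cannot discard the work. */
float transform_sum_f(const point_t *p, size_t n, float scale)
{
    float sum = 0.0f;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += (float)p[i].x * scale + (float)p[i].y * scale;
    return sum;
}

/* Same computation in double precision. */
double transform_sum_d(const point_t *p, size_t n, double scale)
{
    double sum = 0.0;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += (double)p[i].x * scale + (double)p[i].y * scale;
    return sum;
}

/* CPU time in milliseconds spent on 'reps' runs of each variant. */
void time_variants(const point_t *p, size_t n, int reps,
                   double *ms_float, double *ms_double)
{
    volatile double sink = 0.0;          /* keeps the calls from being removed */
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        sink += transform_sum_f(p, n, 0.5f);
    clock_t t1 = clock();
    for (int r = 0; r < reps; r++)
        sink += transform_sum_d(p, n, 0.5);
    clock_t t2 = clock();
    (void)sink;
    *ms_float  = 1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC;
    *ms_double = 1000.0 * (double)(t2 - t1) / CLOCKS_PER_SEC;
}
```

Calling time_variants() on an array of a few thousand points with a large reps value gives two figures that can be compared across compiler flags; clock() resolution is coarse, so short runs will simply report 0 ms.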

Re: Performance of floating point instructions

2010-03-10 Thread Alberto Mardegan
Eero Tamminen wrote: Is there any performance penalty if this switch is done often? Why would you switch it off? Operations on "fast floats" aren't IEEE compatible, but as far as I've understood, they should differ only for numbers that are very close to zero, close enough that repeating your
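To make the "fast floats" idea concrete: on ARM VFP the mode is selected through the FPSCR register, where bit 24 enables flush-to-zero (denormals are treated as zero, which is why only values very close to zero differ) and bit 25 enables default-NaN handling. The sketch below is an illustration under those assumptions, not the hd_fpu_set_mode() code from hildon-desktop; the names fpscr_with_fast_mode and fpu_set_fast_mode are hypothetical, and the register access is compiled only for 32-bit ARM with VFP.

```c
#include <assert.h>
#include <stdint.h>

#define FPSCR_FZ        (UINT32_C(1) << 24)   /* flush denormals to zero */
#define FPSCR_DN        (UINT32_C(1) << 25)   /* default NaN */
#define FPSCR_TRAP_MASK UINT32_C(0x00009F00)  /* exception trap enable bits */

/* Pure helper: FPSCR value with the fast-mode bits applied or cleared. */
uint32_t fpscr_with_fast_mode(uint32_t fpscr, int fast)
{
    if (fast) {
        uint32_t result = (fpscr | FPSCR_FZ | FPSCR_DN) & ~FPSCR_TRAP_MASK;
        assert((result & FPSCR_TRAP_MASK) == 0);  /* no traps left enabled */
        return result;
    }
    return fpscr & ~(FPSCR_FZ | FPSCR_DN);
}

/* Apply the setting on 32-bit ARM with VFP; a no-op on other targets. */
void fpu_set_fast_mode(int fast)
{
#if defined(__arm__) && defined(__VFP_FP__) && !defined(__SOFTFP__)
    uint32_t fpscr;
    __asm__ volatile("vmrs %0, fpscr" : "=r"(fpscr));
    fpscr = fpscr_with_fast_mode(fpscr, fast);
    __asm__ volatile("vmsr fpscr, %0" : : "r"(fpscr));
#else
    (void)fast;   /* nothing to do without a VFP unit */
#endif
}
```

The switch itself is one register read and one register write; how costly that is when done often would have to be measured on the target.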

Re: Performance of floating point instructions

2010-03-10 Thread Alberto Mardegan
Eero Tamminen wrote: Hamalainen Kimmo (Nokia-D/Helsinki) wrote: On Wed, 2010-03-10 at 12:57 +0100, ext Alberto Mardegan wrote: Kimmo Hämäläinen wrote: You can also put the CPU to a "fast floats" mode, see hd_fpu_set_mode() in http://maemo.gitorious.org/fremantle-hildon-desktop/hildon-desktop/b

Re: Performance of floating point instructions

2010-03-10 Thread Eero Tamminen
Hi, Hamalainen Kimmo (Nokia-D/Helsinki) wrote: On Wed, 2010-03-10 at 12:57 +0100, ext Alberto Mardegan wrote: Kimmo Hämäläinen wrote: You can also put the CPU to a "fast floats" mode, see hd_fpu_set_mode() in http://maemo.gitorious.org/fremantle-hildon-desktop/hildon-desktop/blobs/master/src/m

Re: Performance of floating point instructions

2010-03-10 Thread Eero Tamminen
Hi, ext Alberto Mardegan wrote: Kimmo Hämäläinen wrote: You can also put the CPU to a "fast floats" mode, see hd_fpu_set_mode() in http://maemo.gitorious.org/fremantle-hildon-desktop/hildon-desktop/blobs/master/src/main.c N900 has support for NEON instructions also. This sounds interesting!

Re: Performance of floating point instructions

2010-03-10 Thread Kimmo Hämäläinen
On Wed, 2010-03-10 at 12:57 +0100, ext Alberto Mardegan wrote: > Kimmo Hämäläinen wrote: > > You can also put the CPU to a "fast floats" mode, see hd_fpu_set_mode() > > in > > http://maemo.gitorious.org/fremantle-hildon-desktop/hildon-desktop/blobs/master/src/main.c > > > > N900 has support for NE

Re: Performance of floating point instructions

2010-03-10 Thread Alberto Mardegan
Kimmo Hämäläinen wrote: You can also put the CPU to a "fast floats" mode, see hd_fpu_set_mode() in http://maemo.gitorious.org/fremantle-hildon-desktop/hildon-desktop/blobs/master/src/main.c N900 has support for NEON instructions also. This sounds interesting! Is there any performance penalty

Re: Performance of floating point instructions

2010-03-10 Thread Kimmo Hämäläinen
On Wed, 2010-03-10 at 10:46 +0100, ext Ove Kaaven wrote: > Alberto Mardegan wrote: > > Does anyone know any tricks to optimize certain operations on arrays of > > data? > > The answer to that is, obviously, to use the Cortex-A-series SIMD > engine, NEON. > > Supposedly you may be able to make gcc

Re: Performance of floating point instructions

2010-03-10 Thread Marcin Juszkiewicz
On Wednesday, 10 March 2010 at 11:14:14, Laurent Desnogues wrote: > One has to be careful with that approach: Cortex-A9 SoC won't > necessarily come with a NEON SIMD unit, as it's optional. So it'd > be better to also include code that doesn't assume one has a > NEON unit. Or if someone will t

Re: Performance of floating point instructions

2010-03-10 Thread Laurent Desnogues
On Wed, Mar 10, 2010 at 10:46 AM, Ove Kaaven wrote: > Alberto Mardegan wrote: >> Does anyone know any tricks to optimize certain operations on arrays of >> data? > > The answer to that is, obviously, to use the Cortex-A-series SIMD > engine, NEON. > > Supposedly you may be able to make gcc generat

RE: Performance of floating point instructions

2010-03-10 Thread Simon Pickering
> > in maemo-mapper I have a lot of code involved in doing > > transformations from latitude/longitude to Mercator > > coordinates (used in google maps, for example), calculation > > of distances, etc. > > > > I'm trying to use integer arithmetics as much as > > possible, but som

Re: Performance of floating point instructions

2010-03-10 Thread Ove Kaaven
Alberto Mardegan wrote: > Does anyone know any tricks to optimize certain operations on arrays of > data? The answer to that is, obviously, to use the Cortex-A-series SIMD engine, NEON. Supposedly you may be able to make gcc generate NEON instructions with -mfpu=neon -ffast-math -ftree-vectorize
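As a rough illustration of the kind of loop those flags are aimed at, here is a minimal sketch in C. The function name scale_add_f32 is an assumption made for this example and is not taken from maemo-mapper; whether gcc actually emits NEON instructions for it depends on the compiler version and target options, so the generated assembly (gcc -S) should be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: out[i] = a[i] * scale + b[i].
 * 'restrict' promises that the arrays do not overlap, which the vectorizer
 * needs to know before it can process several floats per instruction. */
void scale_add_f32(float *restrict out, const float *restrict a,
                   const float *restrict b, float scale, size_t n)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * scale + b[i];
}
```

A simple counted loop over contiguous float arrays, with no function calls or data-dependent branches in the body, is the easiest case for -ftree-vectorize; the same loop over doubles would not map to NEON, which handles single precision only.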

Re: Performance of floating point instructions

2010-03-10 Thread Sivan Greenberg
Hi Alberto! On Wed, Mar 10, 2010 at 9:55 AM, Alberto Mardegan < ma...@users.sourceforge.net> wrote: > Hi all, > in maemo-mapper I have a lot of code involved in doing transformations > from latitude/longitude to Mercator coordinates (used in google maps, for > example), calculation of distances,