Re: [racket-users] IEEE 754 single precision float support

dkim Tue, 10 Apr 2018 11:31:19 -0700

>
> Then you probably want SIMD vector ops too, which, AFAIK, are not yet 
> supported.  FP math in Racket does use the SIMD unit on most targets, 
> but normal math computes one value at a time, using only one slot per 
> SIMD register, as opposed to the N slots available at the given precision. 
> [This is the same as in C: if you want vector ops, you use SIMD 
> intrinsics instead of the normal C operators.]



We already make heavy use of SIMD instructions in our main codebase, so I 
don't need SIMD support from Racket; I plan to use Racket only for offline 
analysis.

> How long do you want to wait for "truth" calculations.  Done using 
> either rationals (software bigint / bigint fractions), or bigfloats 
> (software adjustable width FP) with results converted to rational for 
> comparison, the truth calculation is going to be many orders of 
> magnitude slower than hardware FP math. 
> Do you have enough memory?  Rationals can expand to fill all available 
> space. 


I can wait a while, but there is a limit. If a single computation involving 
only a handful of adds or multiplies takes hours, that is untenable for me. 
In my experience so far, though, Racket is plenty fast for this simple case. 
Are there cases where a series of multiplies and adds takes a surprising 
amount of extra time?

As for memory, I have 32 GB to spare. Should I be concerned about this when 
my computations typically contain only a few multiplies or adds?
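
A rough sketch of how large an exact result gets for a short chain of
multiplies and adds. This is an illustration only: it is written in Python,
with fractions.Fraction standing in for exact rationals, and the helper names
to_f32 and exact_dot are hypothetical. Every binary32 value is an integer
divided by a power of two, so with only adds and multiplies the denominator
stays a power of two and the result grows by a few dozen bits per multiply.
Division (for example in a matrix inverse) can introduce arbitrary
denominators, so growth there may be faster.

```python
import struct
from fractions import Fraction

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def exact_dot(xs, ys) -> Fraction:
    """Dot product in exact rational arithmetic (no rounding anywhere)."""
    return sum((Fraction(x) * Fraction(y) for x, y in zip(xs, ys)), Fraction(0))

# Made-up inputs, rounded to single precision first.
xs = [to_f32(v) for v in (0.1, 0.2, 0.3, 0.4)]
ys = [to_f32(v) for v in (1.5, -2.25, 3.125, 0.7)]

truth = exact_dot(xs, ys)

# Size of the exact result, in bits of numerator and denominator.
print(truth.numerator.bit_length(), truth.denominator.bit_length())
```
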

(FYI, I may not restrict myself to such simple cases. We perform many 4x4 
matrix operations, some of which involve orthonormalization or matrix 
inverses, and I can definitely see myself looking into those.)

> Perhaps some kind of relative error measurement would be more 
> appropriate?  Without knowing the algorithm in question, nobody can 
> really give better suggestions. 


Yes, for sure, but at the moment I only care about ULPs.
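
A minimal sketch of the kind of ULP measurement meant here, under stated
assumptions. It is written in Python with fractions.Fraction as the exact
"truth" type; the names to_f32, f32_ulp, f32_ulp_error, f32_add and f32_mul
are hypothetical helpers, not any library's API. The error is defined as the
absolute difference divided by the binary32 spacing at the approximate
result, which is one common convention; other definitions measure the
spacing at the true value. Single-precision arithmetic is emulated by
computing in double precision and rounding each result to binary32.

```python
import math
import struct
from fractions import Fraction

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_ulp(x: float) -> Fraction:
    """Spacing between adjacent binary32 values at the magnitude of finite x."""
    if x == 0.0:
        return Fraction(1, 2**149)      # smallest positive subnormal
    _, e = math.frexp(abs(x))           # abs(x) = m * 2**e with 0.5 <= m < 1
    # 24-bit significand gives 2**(e - 24); subnormals bottom out at 2**-149.
    return Fraction(2) ** max(e - 24, -149)

def f32_ulp_error(approx: float, truth: Fraction) -> float:
    """Absolute error of approx against the exact value, in binary32 ulps."""
    return float(abs(Fraction(approx) - truth) / f32_ulp(approx))

def f32_add(a: float, b: float) -> float:
    return to_f32(a + b)                # emulate one single-precision add

def f32_mul(a: float, b: float) -> float:
    return to_f32(a * b)                # emulate one single-precision multiply

# Made-up inputs, rounded to single precision first.
a, b, c = (to_f32(v) for v in (0.1, 0.2, 0.3))

approx = f32_add(f32_mul(a, b), c)                # single-precision version
truth = Fraction(a) * Fraction(b) + Fraction(c)   # exact version
err = f32_ulp_error(approx, truth)
print(err)
```

Note that the truth here is exact only relative to the already-rounded
inputs; error from rounding the inputs themselves is not counted.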

-Dale Kim
 
On Tuesday, April 10, 2018 at 1:48:16 AM UTC-7, gneuner2 wrote:
>
>
> On 4/10/2018 1:36 AM, [email protected] wrote: 
> > For the applications I work on, double precision floats are too costly 
> > to use; although the CPU cycle count to operate on doubles tends to be 
> > the same as single precision floats on modern hardware, the bandwidth 
> > cost is too prohibitive. We really do need single precision floats, 
> > and in many cases, 16 bit half precision floats due to the bandwidth 
> > savings. 
>
> Then you probably want SIMD vector ops too, which, AFAIK, are not yet 
> supported.  FP math in Racket does use the SIMD unit on most targets, 
> but normal math computes one value at a time, using only one slot per 
> SIMD register, as opposed to the N slots available at the given precision. 
> [This is the same as in C: if you want vector ops, you use SIMD 
> intrinsics instead of the normal C operators.] 
>
> In Racket, there are tricks you can play with typed arrays and/or unsafe 
> operations to get more speed from bypassing the language's type 
> safeguards ... but you won't get vector ops AFAIK unless you drop into C 
> code. 
>
> And again, there is no half precision available.  Half precision is 
> available only in GPUs or certain DSPs - no CPU implements it. 
>
>
> > With regard to exactness, I don't need exactness to compare two single 
> > precision floats. I would like to have exactness in the ground truth 
> > that I compute to be able to calculate the error in the single 
> > precision float version of the computation. The idea is that I 
> > implement two versions of an algorithm. One uses the exact numbers 
> > supported by Racket and the other would use single precision floats, 
> > then I would like to compute error with (flulp-error x r) or something 
> > similar. 
>
> How long do you want to wait for "truth" calculations.  Done using 
> either rationals (software bigint / bigint fractions), or bigfloats 
> (software adjustable width FP) with results converted to rational for 
> comparison, the truth calculation is going to be many orders of 
> magnitude slower than hardware FP math. 
>
> Do you have enough memory?  Rationals can expand to fill all available 
> space. 
>
>
> > Is there a better approach to do this kind of analysis? 
>
> You really haven't specified any "analysis" per se.  Thus far you have 
> said only that you want to execute two versions of the same algorithm: 
> one using exact (or maybe high precision float) values, and one using 
> low (single) precision values, and compare the results. 
>
> What you proposed is fine as far as it goes, but I question whether 
> measuring ulps error really is what you want to do.  That more typically 
> would be done to compare answers computed to the same precision using 
> different algorithms.  In your case, the low precision value will likely 
> lead to large errors vs the exact one - think about how intermediate 
> values overflowing or underflowing might affect the end result. 
>
> Perhaps some kind of relative error measurement would be more 
> appropriate?  Without knowing the algorithm in question, nobody can 
> really give better suggestions. 
>
>
> > -Dale Kim 
>
> YMMV, 
> George 
>
>
