> Then you probably want SIMD vector ops too, which, AFAIK, are not yet
> supported. FP math in Racket does use the SIMD unit on most targets,
> but normal math computes one value at a time, using only one slot per
> SIMD register, as opposed to the N slots available at the given precision.
> [This is the same as in C: if you want vector ops, you use SIMD
> intrinsics instead of the normal C operators.]

We already make heavy use of SIMD instructions in our main codebase, so I don't need Racket to do SIMD; I plan to use Racket only for offline analysis.

> How long do you want to wait for "truth" calculations. Done using
> either rationals (software bigint / bigint fractions), or bigfloats
> (software adjustable width FP) with results converted to rational for
> comparison, the truth calculation is going to be many orders of
> magnitude slower than hardware FP math.
>
> Do you have enough memory? Rationals can expand to fill all available
> space.

I can wait a while, but it can't be too slow, of course. If we're talking hours just to get a single computation done that involves just a handful of adds or multiplies, then this is untenable for me. But my experience shows that Racket is plenty fast for this simple case. Are there cases where it takes a surprising amount of extra time to perform a series of multiplies and adds?

As for memory space, I have 32 GB of memory to spare. Should I be concerned with this when my computations typically contain only a few multiplies or adds? (FYI, it's not guaranteed that I'll restrict myself to such simple cases. We perform many 4x4 matrix operations that I can definitely see myself looking into, some of which do orthonormalization or matrix inverses.)

> Perhaps some kind of relative error measurement would be more
> appropriate? Without knowing the algorithm in question, nobody can
> really give better suggestions.

Yes, for sure, but I only care about ULPs at the moment.

-Dale Kim

On Tuesday, April 10, 2018 at 1:48:16 AM UTC-7, gneuner2 wrote:
>
> On 4/10/2018 1:36 AM, [email protected] wrote:
> > For the applications I work on, double precision floats are too costly
> > to use; although the CPU cycle count to operate on doubles tend to be
> > the same as single precision floats on modern hardware, the bandwidth
> > cost is too prohibitive.
> > We really do need single precision floats,
> > and in many cases, 16 bit half precision floats due to the bandwidth
> > savings.
>
> Then you probably want SIMD vector ops too, which, AFAIK, are not yet
> supported. FP math in Racket does use the SIMD unit on most targets,
> but normal math computes one value at a time, using only one slot per
> SIMD register, as opposed to the N slots available at the given precision.
> [This is the same as in C: if you want vector ops, you use SIMD
> intrinsics instead of the normal C operators.]
>
> In Racket, there are tricks you can play with typed arrays and/or unsafe
> operations to get more speed from bypassing the language's type
> safeguards ... but you won't get vector ops AFAIK unless you drop into C
> code.
>
> And again, there is no half precision available. Half precision is
> available only in GPUs or certain DSPs - no CPU implements it.
>
> > With regard to exactness, I don't need exactness to compare two single
> > precision floats. I would like to have exactness in the ground truth
> > that I compute to be able to calculate the error in the single
> > precision float version of the computation. The idea is that I
> > implement two versions of an algorithm. One uses the exact numbers
> > supported by Racket and the other would use single precision floats,
> > then I would like to compute error with (flulp-error x r) or something
> > similar.
>
> How long do you want to wait for "truth" calculations. Done using
> either rationals (software bigint / bigint fractions), or bigfloats
> (software adjustable width FP) with results converted to rational for
> comparison, the truth calculation is going to be many orders of
> magnitude slower than hardware FP math.
>
> Do you have enough memory? Rationals can expand to fill all available
> space.
>
> > Is there a better approach to do this kind of analysis?
>
> You really haven't specified any "analysis" per se.
> Thus far you have
> said only that you want to execute two versions of the same algorithm:
> one using exact (or maybe high precision float) values, and one using
> low (single) precision values, and compare the results.
>
> What you proposed is fine as far as it goes, but I question whether
> measuring ulps error really is what you want to do. That more typically
> would be done to compare answers computed to the same precision using
> different algorithms. In your case, the low precision value will likely
> lead to large errors vs the exact one - think about how intermediate
> values overflowing or underflowing might affect the end result.
>
> Perhaps some kind of relative error measurement would be more
> appropriate? Without knowing the algorithm in question, nobody can
> really give better suggestions.
>
> > -Dale Kim
>
> YMMV,
> George
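
For readers following the thread, here is a minimal sketch of the comparison being discussed: one evaluation with exact rationals as ground truth, one with every intermediate rounded to single precision, and both a ULP and a relative error computed from the two. It is an illustration under stated assumptions, not something posted by either participant. The names `->single`, `dot`, `single+`, `single*`, `single-ulp-error`, and `relative-error` are hypothetical helpers, and the dot product is only a stand-in for the real algorithm. It assumes a Racket release that provides `flsingle` in `racket/flonum` (7.8.0.7 or later, which is newer than this thread), and it measures the error in single-precision ULPs by scaling the double-precision `flulp` by 2^29, which only holds for results in the normal single-precision range.

```racket
#lang racket/base
;; Sketch only: exact-rational ground truth vs. a single-precision run.
(require (only-in racket/flonum flsingle) ; assumes Racket 7.8.0.7 or later
         (only-in math/flonum flulp))

;; Round a real to the nearest single-precision value (stored in a flonum).
(define (->single x) (flsingle (real->double-flonum x)))

;; Single-precision add and multiply: compute in double, then round once.
;; For + and * this matches correctly rounded single-precision arithmetic.
(define (single+ a b) (->single (+ a b)))
(define (single* a b) (->single (* a b)))

;; Hypothetical stand-in algorithm, parameterized over the arithmetic so
;; the same code runs exactly and in single precision.
(define (dot add mul xs ys)
  (for/fold ([acc 0]) ([x (in-list xs)] [y (in-list ys)])
    (add acc (mul x y))))

;; Error of x against the exact rational r, in single-precision ULPs.
;; Assumption: r lies in the normal single range, where a single ULP is
;; 2^29 times the double ULP reported by flulp.
(define (single-ulp-error x r)
  (define ulp (* (inexact->exact (flulp (real->double-flonum r)))
                 (expt 2 29)))
  (real->double-flonum (/ (abs (- (inexact->exact x) r)) ulp)))

;; Relative error of x against the exact rational r (r must be nonzero).
(define (relative-error x r)
  (real->double-flonum (/ (abs (- (inexact->exact x) r)) (abs r))))

;; Inputs are rounded to single first, so both runs see the same data.
(define xs (map ->single '(0.1 0.2 0.3)))
(define ys (map ->single '(0.4 0.5 0.6)))

(define exact-result
  (dot + * (map inexact->exact xs) (map inexact->exact ys)))
(define single-result (dot single+ single* xs ys))

(define ulps (single-ulp-error single-result exact-result))
(define rel (relative-error single-result exact-result))

(printf "exact:  ~a\n" exact-result)
(printf "single: ~a\n" single-result)
(printf "error:  ~a single ULPs, relative ~a\n" ulps rel)
```

For a handful of adds and multiplies the exact run is cheap, but the numerators and denominators grow with every operation, so longer chains (matrix inverses, orthonormalization) are where the time and memory concerns raised above would start to matter.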

