Re: [Beignet] [PATCH 1/3] Benchmark: Evaluate math performance on intervals

Song, Ruiling Wed, 04 May 2016 18:39:07 -0700


> -----Original Message-----
> From: Lupescu, Grigore
> Sent: Wednesday, May 4, 2016 3:06 PM
> To: Song, Ruiling <ruiling.s...@intel.com>; beignet@lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH 1/3] Benchmark: Evaluate math performance on
> intervals
> 
> > I think this may lead to optimizing for a special input range.
> 
> I agree - my idea with the benchmark was just to look at how fast the
> function is on an interval.
> I've looked at a function - say sine - and saw that there are 3 paths the
> code may take based on the input value. To properly evaluate the performance
> of each path I would do x = sin (x + a) then x = x * 0x1p-16, where a is the
> min value of an interval, and x is close to 0. So at the end I would know
> that the performance is T on (a, b) interval, 2T on (b, c) interval etc.
> Now this won't tell me how sine actually performs since I don't know how
> often sine with values (a, b) is called vs (b, c) or other - but it would
> tell me for instance that internal_1 performance (a, b) is 6 times faster
> than internal_2 (b, c) and 9 times faster than internal_3 (c, d) etc.
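
For illustration, the dependent loop described above (x = sin(x + a), then
x = x * 0x1p-16) might look roughly like the sketch below. It is written in
Python against the host math.sin, not Beignet's OpenCL sin, so it only shows
the shape of the measurement. The name bench_interval and the lower bounds in
the loop are hypothetical.

```python
import math
import time

def bench_interval(func, a, iterations=100_000):
    """Time `func` near the lower bound `a` of an interval (hypothetical helper)."""
    scale = float.fromhex("0x1p-16")  # 2**-16, the scale factor from the message
    x = 0.0
    start = time.perf_counter()
    for _ in range(iterations):
        x = func(x + a)  # the argument stays within 2**-16 of a
        x = x * scale    # pull x back toward 0 for the next iteration
    elapsed = time.perf_counter() - start
    return elapsed / iterations, x

# Assumed lower bounds; the real cut-over points depend on the implementation.
for a in (0.0, 1.0e3, 1.0e9):
    per_call, _ = bench_interval(math.sin, a)
    print(f"a = {a:g}: {per_call * 1e9:.1f} ns per call")
```

Because each call consumes the previous result, the calls form one dependency
chain, and the scaling keeps every argument pinned next to a.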


I get your point. I previously thought you would redesign the math function
implementations and implement them one by one.
From your message, it seems that you intend to benchmark them in different
ranges and optimize them.
Assume all math functions will be implemented as range-based algorithms. This
is true.
The biggest difference between CPU and GPU is that on the GPU we need to try
our best to reduce divergent code.
Yes, the sine implementation has an if-else. That is because Payne-Hanek is
far too slow, which is why I use a fast version for small input values.
I would expect "no or very little" input range-checking in the other math
functions.
If there must be some range checks, I would like to keep the if/else checks
in a math function to at most two.
You can continue with your range-based benchmarking and optimization, and see
how much improvement we can get.
But I would also suggest that you check whether we can re-implement these
functions in a new way, such as table lookup or other techniques you can find
in papers.
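
As a rough idea of what a table-lookup approach could look like, here is a
minimal Python sketch of sine via a precomputed table with linear
interpolation. This is not Beignet's implementation and not taken from any
paper; TABLE_SIZE and sin_table are made-up names, and the accuracy (around
5e-6 absolute error with 1024 entries) is likely well short of what a
conforming OpenCL sin needs. The point is only that every input follows the
same instruction sequence, with no range-dependent branch.

```python
import math

TABLE_SIZE = 1024  # hypothetical size; a larger table means a smaller error
# One extra entry so that TABLE[i + 1] is always valid.
TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE + 1)]

def sin_table(x):
    """Approximate sin(x) by linear interpolation in TABLE (illustrative only)."""
    t = (x / (2.0 * math.pi)) % 1.0      # position within one period
    pos = t * TABLE_SIZE
    i = min(int(pos), TABLE_SIZE - 1)    # clamp in case t rounds up to 1.0
    frac = pos - i
    return TABLE[i] + frac * (TABLE[i + 1] - TABLE[i])

worst = max(abs(sin_table(k * 0.01) - math.sin(k * 0.01)) for k in range(-1000, 1001))
print(f"max abs error on [-10, 10]: {worst:.2e}")
```

The clamp is a select rather than a branch, so the lookup stays uniform across
inputs. Range reduction for very large arguments is not addressed here.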

> 
> I believe the only way to evaluate if a change in math code is relevant is
> with real world tests. We thus must have a diverse set of tests that use
> most math functions. Ideally one should document what each test uses and in
> what proportion. I have started doing this but it's taking a lot of time due
> to the complexity of some tests (e.g. Luxmark).
Yes, if we can gather that information, it would be useful.

> -------------------------------------------------------------
> 
> So I see the following flow of optimization for Beignet - but it may apply
> to any other math implementation for OpenCL:
> 
> 1. (done) See performance of each interval for a given function (sin). We
> would know perf1 on (a, b), perf2 on (b, c), perf3 on (c, d)
> 2. (working) Run several relevant math tests (relevant to sine). Try to
> identify in what circumstances sin is called. Maybe all tests call it on
> (a,b) and (b,c). Then we should target (a,b) and (b,c) because that is what
> is being used. This would assume math tests are well chosen and diverse.
> 3. (working) Optimize intervals (a, b) and (b, c). Observe how each one
> improves, since we can test performance on intervals. Re-run real world math
> tests.
> Any thoughts on this ?
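
Steps 1 and 2 above can be combined in a small sketch: bucket the arguments a
test actually passes to sin into the intervals, then weight the per-interval
cost by how often each interval is hit. Everything here is hypothetical: the
helper names, the interval bounds, the sample arguments, and the assumption
that the path taken depends only on |x|. The relative costs 1, 6 and 9 reuse
the ratios mentioned earlier in the thread.

```python
from bisect import bisect_right
from collections import Counter

def interval_shares(args, bounds):
    """Fraction of |x| values in [bounds[i], bounds[i+1]); the last interval is open-ended."""
    counts = Counter(bisect_right(bounds, abs(x)) - 1 for x in args)
    total = len(args)
    if total == 0:
        return {i: 0.0 for i in range(len(bounds))}
    return {i: counts[i] / total for i in range(len(bounds))}

def weighted_cost(shares, cost_per_interval):
    """Expected relative cost per call, given how often each interval is hit."""
    return sum(shares[i] * cost_per_interval[i] for i in shares)

bounds = [0.0, 1.0, 100.0]            # assumed lower bounds of three intervals
recorded = [0.5, -0.2, 3.0, 250.0]    # made-up arguments from a test run
shares = interval_shares(recorded, bounds)
print(shares)                                        # {0: 0.5, 1: 0.25, 2: 0.25}
print(weighted_cost(shares, {0: 1.0, 1: 6.0, 2: 9.0}))  # 4.25
```

With real traces, the shares would indicate which intervals are worth
targeting in step 3.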
I am just worried that you try to optimize sine, and find that benchmark1 can
be optimized by adding an input range check.
Then for benchmark2 you add another input range check to optimize it, and
this may easily lead to much more divergent code,
which may eventually evolve into a good CPU version but not a good GPU version.
Optimizing for benchmarks is OK with me. But I would encourage you to reduce
divergent code instead of introducing more of it. That is my point.
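
To make the divergence concern concrete, here is a deliberately simplified
cost model in Python (not a measurement of any hardware): when the lanes of
one SIMD group take different paths, the group executes every path that any
lane needs. The function name group_cost, the group width of 16 and the path
costs are illustrative assumptions.

```python
def group_cost(lane_paths, path_cost):
    """Cost of one SIMD group: each distinct path taken by any lane runs once."""
    return sum(path_cost[p] for p in set(lane_paths))

PATH_COST = {"fast": 1, "slow": 6}   # assumed relative costs of two code paths

uniform = ["fast"] * 16              # all lanes take the fast path
mixed = ["fast"] * 15 + ["slow"]     # a single lane needs the slow path

print(group_cost(uniform, PATH_COST))  # 1
print(group_cost(mixed, PATH_COST))    # 7
```

Under this model one slow lane makes the whole group pay for both paths, so
each additional range check is another path the group may have to execute.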

> 
> I did some optimizations (call to native and polynomial reduction) and
> obtained an increase of at least 5% in about 8 - 10 math tests from the ones
> provided by Mengmeng. It's quite difficult to target the general case for
> all math functions but I think these changes are relevant to some extent.

_______________________________________________
Beignet mailing list
Beignet@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/beignet
