Hey,

> I am a bit confused, is there any reason for using "reciprocal" and
> "flip_sign", instead of just changing the scalar accordingly?

Yes (with a drawback I'll discuss at the end). Consider the family of 
operations

  x = +- y OP1 a +- z OP2 b

where x, y, and z are vectors, OP1 and OP2 are either multiplication or 
division, and a, b are host scalars. If I did the math correctly, these 
are 16 different kernels when coded explicitly. Hence, if you put all 
these into separate OpenCL kernels, you'll get fairly long compilation 
times. Note, however, that you cannot simply adjust the scalars up front 
if a and b stem from device scalars, because then the manipulation of a 
and b would result in additional buffer allocations and kernel launches, 
which is way too slow.
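
To double-check the count of 16, here is a small host-side C++ sketch that just enumerates the sign/operator combinations as strings. The function name enumerate_variants is made up for this mail and is not part of ViennaCL.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not ViennaCL API): lists every explicit variant of
//   x = +- y OP1 a +- z OP2 b
std::vector<std::string> enumerate_variants()
{
  const char signs[] = {'+', '-'};
  const char ops[]   = {'*', '/'};
  std::vector<std::string> variants;

  for (char s1 : signs)
    for (char op1 : ops)
      for (char s2 : signs)
        for (char op2 : ops)
        {
          std::string v = "x = ";
          v += s1;  v += " y "; v += op1; v += " a ";
          v += s2;  v += " z "; v += op2; v += " b";
          variants.push_back(v);
        }

  // Two signs times two operators per term, for two terms: (2*2)^2 = 16.
  assert(variants.size() == 16);
  return variants;
}
```

Each of the returned strings would correspond to one explicitly coded kernel.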

For floating point operations, one can reduce the number of kernels a 
lot when (+- OP1 a) and (+- OP2 b) are computed once in a preprocessing 
step. Then, only the kernel

  x = y * a' + z * b'

is needed, cutting the number of OpenCL kernels from 16 to 1. Since (-a) 
and (1/a) cannot be computed outside the kernel if a is a GPU scalar, 
a' and b' are always computed in a preprocessing step inside the OpenCL 
kernel for unification purposes. I think we can even apply some more 
cleverness here if we delegate all the work to a suitable implementation 
function.
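
To make the preprocessing idea concrete, here is a minimal host-side C++ sketch that emulates what the single kernel does. It is only an illustration, not the actual ViennaCL code: the names scalar_options, preprocess, and scaled_sum are assumptions for this mail, and the two flags merely mirror the "reciprocal" and "flip_sign" options from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical option flags (assumed names, modeled on "flip_sign" and
// "reciprocal" from this thread).
struct scalar_options
{
  bool flip_sign;
  bool reciprocal;
};

// Preprocessing step: folds (+- OP a) into a single multiplier a'.
template <typename T>
T preprocess(T a, scalar_options opt)
{
  if (opt.flip_sign)  a = -a;
  if (opt.reciprocal) a = T(1) / a;
  return a;
}

// Emulates the one remaining kernel x = y * a' + z * b' on the host.
template <typename T>
void scaled_sum(std::vector<T> & x,
                std::vector<T> const & y, T a, scalar_options opt_a,
                std::vector<T> const & z, T b, scalar_options opt_b)
{
  assert(x.size() == y.size() && x.size() == z.size());

  // Computed once per call, not once per vector entry.
  T const a_prime = preprocess(a, opt_a);
  T const b_prime = preprocess(b, opt_b);

  for (std::size_t i = 0; i < x.size(); ++i)
    x[i] = y[i] * a_prime + z[i] * b_prime;
}
```

For example, calling scaled_sum with a = 2 and flip_sign set, and b = 4 and reciprocal set, evaluates x = -y * 2 + z / 4 through the same code path as every other combination.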

And now for the drawback: When using integers, the operation n/m is no 
longer the same as n * (1/m). Even worse, for unsigned integers it is 
also no longer possible to replace n - m by n + (-m). Thus, we certainly 
have to bite the bullet and generate kernels for all 16 combinations 
when using unsigned integers. However, I'm reluctant to generate all 16 
combinations for floating point arguments if this is not needed...
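
The integer division part of the drawback is easy to see in a tiny C++ sketch (the two function names are made up for illustration): with integer types, 1/m truncates to 0 for any m > 1, so the reciprocal trick throws the result away.

```cpp
#include <cassert>

// Direct integer division: truncates towards zero.
int divide_directly(int n, int m)
{
  assert(m != 0);
  return n / m;
}

// The "reciprocal" trick applied to integers: (1 / m) is evaluated in
// integer arithmetic first, which yields 0 whenever m > 1.
int multiply_by_reciprocal(int n, int m)
{
  assert(m != 0);
  return n * (1 / m);
}
```

With n = 7 and m = 2, the first function gives 3 and the second gives 0, whereas in floating point 7.0 / 2.0 and 7.0 * (1.0 / 2.0) both give 3.5.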

Best regards,
Karli


_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel