Hi Philippe,

> I don't understand why this would go through more than one compilation...
> This kernel is compiled only once, the value of flip_sign and reciprocal
> only changes the dynamic value of the argument, not the source code.
>
> This would eventually result in:
>
> if(alpha_reciprocal)
>   kernel(N,x,y,z,1/alpha,beta)
>
> Am I missing something?

I think so ;-) It's not about a single kernel, it's about the compilation
unit (i.e. OpenCL program). For conjugate gradients we roughly have the
following vector operations (random variable names):

  x = y;
  x += alpha y;
  x = z + alpha z;
  x = y - alpha z;
  x = inner_prod(y,z);

BiCGStab and GMRES add a few more of them. If we use the generator as-is
now, then each of the operations creates a separate OpenCL program the
first time it is encountered, and we pay the jit-compiler launch overhead
multiple times. With the current non-generator model, all vector kernels
are in the same OpenCL program and we pay the jit-overhead only once.

I'd like to stick with the current model of having just one OpenCL program
for all the basic kernels, but get the target-optimized sources from the
generator. Sorry if I wasn't clear enough in my earlier mails.

Best regards,
Karli

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
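
To illustrate the difference between the two models described above, here is a minimal sketch in plain C++. It does not use the ViennaCL generator or the OpenCL runtime: `JitCounter`, `make_elementwise`, `generate_vector_kernels`, `build_separately` and `combine` are hypothetical names made up for this example, and the kernel sources are simplified stand-ins for what a generator might emit. The reduction for inner_prod is left out to keep it short. In real code, each call to `JitCounter::build` would correspond to one `clCreateProgramWithSource` plus `clBuildProgram` pair.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for one generated kernel.
struct KernelSource {
    std::string name;
    std::string source;
};

// Counts simulated JIT launches. Assumption: one build() call stands in
// for one clCreateProgramWithSource + clBuildProgram pair.
struct JitCounter {
    std::size_t launches = 0;
    void build(const std::string& program_source) {
        assert(!program_source.empty());
        ++launches;
    }
};

// Hypothetical stand-in for the generator: emits OpenCL C source for an
// elementwise operation of the form x[i] = <expr>.
KernelSource make_elementwise(const std::string& name, const std::string& expr) {
    std::string src =
        "__kernel void " + name +
        "(__global float* x, __global const float* y,\n"
        "  __global const float* z, float alpha, unsigned int n)\n"
        "{\n"
        "  for (unsigned int i = get_global_id(0); i < n; i += get_global_size(0))\n"
        "    x[i] = " + expr + ";\n"
        "}\n";
    return KernelSource{name, src};
}

// Elementwise vector operations similar to those listed for CG.
std::vector<KernelSource> generate_vector_kernels() {
    return {
        make_elementwise("assign",     "y[i]"),
        make_elementwise("axpy",       "x[i] + alpha * y[i]"),
        make_elementwise("add_scaled", "y[i] + alpha * z[i]"),
        make_elementwise("sub_scaled", "y[i] - alpha * z[i]"),
    };
}

// Generator-as-is model: one program (and one JIT launch) per operation.
std::size_t build_separately(const std::vector<KernelSource>& kernels, JitCounter& jit) {
    for (const auto& k : kernels) {
        jit.build(k.source);
    }
    return jit.launches;
}

// Single-program model: concatenate all generated sources into one
// compilation unit, which is then built once.
std::string combine(const std::vector<KernelSource>& kernels) {
    std::string program;
    for (const auto& k : kernels) {
        program += k.source;
        program += "\n";
    }
    return program;
}
```

Calling `build_separately` on the four kernels triggers four simulated JIT launches, whereas passing the result of `combine` to a single `build` call triggers one, while the combined source still contains every generated kernel.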