Hey hey,

2014/1/25 Karl Rupp <r...@iue.tuwien.ac.at>

> Hi,
>
>
>  I prefer option 3. This would allow for something like:
>>
>> if(size(x)>1e5 && stride==1 && start==0){
>>
>
> Here we also need to check the internal_size to fit the vector width
>
>
>
>>   //The following steps are costly for small vectors
>>   NumericT cpu_alpha = alpha; //copy back to host (when the scalar is in
>>   //global device memory)
>>   if(alpha_flip) cpu_alpha*=-1;
>>   if(reciprocal) cpu_alpha = 1/cpu_alpha;
>>   //... same for beta
>>
>>   //Optimized routines
>>   if(external_blas)
>>     call_axpy_twice(x,cpu_alpha,y,cpu_beta,z);
>>   else{
>>     generate_execute(x = cpu_alpha*y + cpu_beta*z);
>>   }
>> }
>> else{
>>    //fallback
>> }
>>
>> This way, we generate at most two kernels: one for small vectors,
>> designed to optimize latency, and one for big vectors, designed to
>> optimize bandwidth. Are we converging? :)
>>
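
The fast-path test and the host-side scalar preparation quoted above could
be sketched in C++ roughly as follows. This is only a sketch: VectorView,
use_fast_path() and prepare_scalar() are hypothetical names, not ViennaCL
API; the 1e5 threshold is taken from the proposal; a vector width of 4
(float4) is assumed; and "internal_size fits the vector width" is
interpreted here as internal_size being a multiple of that width.

```cpp
#include <cstddef>

// Hypothetical description of a vector (not a ViennaCL type).
struct VectorView {
  std::size_t size;          // logical number of elements
  std::size_t internal_size; // padded allocation size
  std::size_t start;         // offset of the first element
  std::size_t stride;        // distance between consecutive elements
};

// True if the vector qualifies for the bandwidth-optimized kernel.
// min_size and vector_width are assumed defaults (1e5 and float4).
bool use_fast_path(const VectorView& v,
                   std::size_t min_size = 100000,
                   std::size_t vector_width = 4)
{
  return v.size > min_size
      && v.stride == 1
      && v.start == 0
      && v.internal_size % vector_width == 0;
}

// Folds the sign-flip and reciprocal flags into a plain host scalar,
// in the same order as in the proposal (flip first, then reciprocal).
template <typename NumericT>
NumericT prepare_scalar(NumericT value, bool flip_sign, bool reciprocal)
{
  if (flip_sign)  value = -value;
  if (reciprocal) value = NumericT(1) / value;
  return value;
}
```

With the flags folded in on the host, the kernel only ever sees two plain
scalars, so no separate kernel variant per flag combination is needed.
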
>
> Convergence depends on what is inside generate_execute() ;-) How is the
> problem with alpha and beta residing on the GPU addressed? What will the
> batch compilation look like? The important point is that for the default
> axpy kernels we really don't want to go through the JIT compiler for each
> of them individually.
>

;)
In this case, generate_execute() will just trigger the compilation (on the
first call only) of the kernel for
x = cpu_alpha*y + cpu_beta*z;

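__kernel void axpby(unsigned int N, __global float4* x,
                    __global const float4* y, __global const float4* z,
                    float alpha, float beta)
{
  // N is the number of float4 elements; each work item strides over the vector
  for(unsigned int i = get_global_id(0) ; i < N ; i+=get_global_size(0))
    x[i] = alpha*y[i] + beta*z[i];
}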

with, of course, an appropriate compute profile.
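
The "compile on the first call only" behaviour could look roughly like the
C++ sketch below. It is not the actual generator code: compile_axpby(),
get_or_compile() and the cache key are hypothetical, and the "compiled
kernel" is simulated by a host lambda so that the caching logic can be
tried without an OpenCL runtime. The point it illustrates is that alpha and
beta are runtime arguments, so a single compiled kernel serves all scalar
values.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for a compiled kernel computing x = alpha*y + beta*z.
using AxpbyKernel = std::function<void(std::size_t, float*, const float*,
                                       const float*, float, float)>;

// Counts how often the (simulated) JIT compiler is invoked.
static int g_compile_count = 0;

// Simulated compilation step; a real backend would build OpenCL source here.
AxpbyKernel compile_axpby()
{
  ++g_compile_count;
  return [](std::size_t n, float* x, const float* y, const float* z,
            float alpha, float beta) {
    for (std::size_t i = 0; i < n; ++i)
      x[i] = alpha * y[i] + beta * z[i];
  };
}

// Looks the kernel up by key and compiles it on the first request only.
const AxpbyKernel& get_or_compile(const std::string& key)
{
  static std::map<std::string, AxpbyKernel> cache;
  auto it = cache.find(key);
  if (it == cache.end())
    it = cache.emplace(key, compile_axpby()).first;
  return it->second;
}

// Hypothetical generate_execute(): the scalars are passed at call time,
// so changing alpha or beta never triggers a recompilation.
void generate_execute(std::size_t n, float* x, const float* y,
                      const float* z, float alpha, float beta)
{
  assert(x != nullptr && y != nullptr && z != nullptr);
  get_or_compile("axpby_float")(n, x, y, z, alpha, beta);
}
```

Calling generate_execute() twice with different alpha and beta leaves
g_compile_count at 1 in this sketch.
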


> Note to self: Collect some numbers on the costs of jit-compilation for
> different OpenCL SDKs.
>
> Best regards,
> Karli
>
>
>
Best regards,
Philippe
_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
