Ok, thanks!
This sounds reasonable indeed.

Philippe


2014-06-26 23:51 GMT+02:00 Karl Rupp <r...@iue.tuwien.ac.at>:

> Hi,
>
> the cases 5, 6, and 7 are handled by running a kernel for four vectors,
> then subtracting 4 and running a dedicated kernel on the remaining 1, 2,
> or 3 vectors. This could also be handled by a generated kernel, yes, but I
> haven't implemented this for two reasons:
>  1. fewer kernels to compile
>  2. less implementation effort
>
> A single kernel is not possible for an arbitrary number of vectors. Eight
> vectors turned out to be a reasonable upper bound because the overhead is
> already less than 12.5% over the ideal case, while the kernel still works
> on older GPUs with limited amounts of shared memory.
>
> Best regards,
> Karli
>
>
>
> On 06/26/2014 11:09 PM, Philippe Tillet wrote:
>
>> One more thing: I assume that multiple kernels are launched because of
>> current_index. Wouldn't it be better to launch a single kernel? I
>> think that a lot of users would prefer better performance in exchange
>> for a slightly longer JIT overhead (since we'll provide a caching
>> mechanism).
>>
>> Philippe
>>
>>
>> 2014-06-26 23:07 GMT+02:00 Philippe Tillet <phil.til...@gmail.com>:
>>
>>
>>     Hello!
>>
>>     I noticed this in the implementation of multi_inner_prod:
>>
>>                switch (vec_tuple.const_size() - current_index)
>>                {
>>                  case 7:
>>                  case 6:
>>                  case 5:
>>                  case 4:
>>                    //do stuff
>>
>>     However, there is a test for 5, 6, and 7, so I assume that these
>>     have to be implemented somehow. Could I have more details on why
>>     there is no specific kernel for these three cases?
>>
>>     NB: This is the very last thing that has to be done before I can
>>     push the new device-specific OpenCL backend. All the tests pass
>>     except multi_inner_prod for tuple_size >= 5. :)
>>
>>     Philippe
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Open source business process management suite built on Java and Eclipse
>> Turn processes into business applications with Bonita BPM Community Edition
>> Quickly connect people, data, and systems into organized workflows
>> Winner of BOSSIE, CODIE, OW2 and Gartner awards
>> http://p.sf.net/sfu/Bonitasoft
>>
>>
>>
>> _______________________________________________
>> ViennaCL-devel mailing list
>> ViennaCL-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/viennacl-devel
>>
>>
>
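
For reference, the chunking Karl describes above (a four-vector kernel for 4 to 7 remaining vectors, dedicated kernels for 1, 2, or 3, and eight vectors as the upper bound per launch) can be sketched roughly as follows. This is only a sketch of the control flow, not the actual ViennaCL code: plan_chunks is a hypothetical helper that records the chunk sizes instead of launching kernels, and the assumption that eight or more remaining vectors go to an eight-vector kernel is inferred from the "upper bound" remark.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ViennaCL): returns the number of vectors
// each kernel launch would process for a tuple of num_vectors vectors.
std::vector<std::size_t> plan_chunks(std::size_t num_vectors)
{
  std::vector<std::size_t> chunks;
  std::size_t current_index = 0;

  while (current_index < num_vectors)
  {
    std::size_t chunk = 0;
    switch (num_vectors - current_index)
    {
      case 7:
      case 6:
      case 5:
      case 4:
        chunk = 4;  // four-vector kernel; the remaining 1, 2, or 3 follow next
        break;
      case 3: chunk = 3; break;
      case 2: chunk = 2; break;
      case 1: chunk = 1; break;
      default:
        chunk = 8;  // assumed: eight or more vectors remain
        break;
    }
    chunks.push_back(chunk);
    current_index += chunk;  // the "subtract" step from the explanation
  }
  return chunks;
}
```

With this sketch, a tuple of 7 vectors gives the chunks 4 and 3, and a tuple of 13 gives 8, 4, and 1, so the kernels needed are those for 1, 2, 3, 4, and 8 vectors only.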