One more remark: I assume that multiple kernels are launched because of
current_index. Wouldn't it be better to launch a single kernel? I think
many users would prefer better performance in exchange for a slightly
longer JIT overhead, especially since we'll provide a caching mechanism.
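
To make the concern concrete, here is a minimal sketch of how I read the
switch quoted below. It is not the actual ViennaCL code: launch_plan is a
hypothetical helper, and the chunk widths are assumptions on my part (cases
4 to 7 falling through to a 4-vector kernel, dedicated kernels for 1, 2 and
3, and an 8-vector kernel in the default branch).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of ViennaCL: returns how many vectors each
// kernel launch would handle for a tuple of `tuple_size` vectors.
// ASSUMPTIONS: cases 4..7 share a 4-vector kernel, sizes 1..3 have their own
// kernels, and 8 or more remaining vectors go to an 8-vector kernel.
std::vector<std::size_t> launch_plan(std::size_t tuple_size)
{
  std::vector<std::size_t> chunks;
  std::size_t current_index = 0;

  while (current_index < tuple_size)
  {
    std::size_t chunk = 0;
    switch (tuple_size - current_index)
    {
      case 7:
      case 6:
      case 5:
      case 4:
        chunk = 4;   // fall-through: no dedicated kernel for 5, 6, 7
        break;
      case 3:
        chunk = 3;
        break;
      case 2:
        chunk = 2;
        break;
      case 1:
        chunk = 1;
        break;
      default:
        chunk = 8;   // assumed width for 8 or more remaining vectors
        break;
    }
    chunks.push_back(chunk);   // one kernel launch per chunk
    current_index += chunk;    // the remainder is handled by the next launch
  }
  return chunks;
}
```

Under these assumptions a tuple of 5 vectors results in two launches (4, then
1), and a tuple of 7 in two launches (4, then 3). A single generated kernel
per tuple size would avoid the second launch.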

Philippe


2014-06-26 23:07 GMT+02:00 Philippe Tillet <phil.til...@gmail.com>:

> Hello!
>
> I noticed this in the implementation of multi_inner_prod:
>
>           switch (vec_tuple.const_size() - current_index)
>           {
>             case 7:
>             case 6:
>             case 5:
>             case 4:
>               //do stuff
>
> However, there are tests for tuple sizes 5, 6 and 7, so I assume these
> have to be implemented somehow. Could I have more details on why there is
> no specific kernel for these three cases?
>
> NB: This is the very last thing that has to be done before I can push the
> new device-specific OpenCL backend. All the tests pass except
> multi_inner_prod for tuple_size >= 5. :)
>
> Philippe
>
_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
