So the dense benchmark suite got refurbished here:

https://github.com/viennacl/viennacl-dev/commit/73f46e36cfa4104628f831195e4da25a62f9ef66

The same macro-based template can be used for any benchmark. It's pretty
concise and maintainable!

Philippe


2014-08-17 13:50 GMT+02:00 Karl Rupp <r...@iue.tuwien.ac.at>:

> Hi,
>
>
>>     * nmf only implements matrix<T>, but in principle matrix_base<T>
>>     should work (since no custom kernel is called, I believe)
>>
>>
>>     NMF uses a custom kernel and thus only works with OpenCL. A
>>     generalization to matrix_base should be straight-forward, yes. I
>>     should be able to do it for the release.
>>
>>
>> The kernel it uses is:
>>
>>          template <typename StringType>
>>          void generate_nmf_el_wise_mul_div(StringType & source, std::string const & numeric_string)
>>          {
>>            source.append("__kernel void el_wise_mul_div( \n");
>>            source.append("          __global "); source.append(numeric_string); source.append(" * matrix1, \n");
>>            source.append("          __global const "); source.append(numeric_string); source.append(" * matrix2, \n");
>>            source.append("          __global const "); source.append(numeric_string); source.append(" * matrix3, \n");
>>            source.append("          unsigned int size) \n");
>>            source.append("{ \n");
>>            source.append("  for (unsigned int i = get_global_id(0); i < size; i += get_global_size(0)) \n");
>>            source.append("  { \n");
>>            source.append("    "); source.append(numeric_string); source.append(" val = matrix1[i] * matrix2[i]; \n");
>>            source.append("    "); source.append(numeric_string); source.append(" divisor = matrix3[i]; \n");
>>            source.append("    matrix1[i] = (divisor > ("); source.append(numeric_string); source.append(")0.00001) ? (val / divisor) : ("); source.append(numeric_string); source.append(")0; \n");
>>            source.append("  } \n");
>>            source.append("} \n");
>>          }
>>
>> So, the layout of the matrix shouldn't matter, indeed. It would be
>> pretty easy to have this kernel generated by the generator, too, as it
>> can be represented by the expression tree:
>>     matrix1 = select(matrix3 > 0.00001,
>>                      element_div(element_prod(matrix1, matrix2), matrix3),
>>                      cast<T>(0))
>> However, we're running out of time, so I wouldn't port it now. But we
>> should keep in mind that this would be a trivial thing to do.
>>
>
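For reference, the per-element semantics of the kernel quoted above can be sketched on the host in plain C++. This is only an illustration: the function name el_wise_mul_div_reference and the use of std::vector are assumptions made for this sketch, not part of ViennaCL. Since the loop only walks a flat buffer of `size` elements, it shows why the row-major or column-major layout should not matter as long as all three buffers share the same layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for the quoted el_wise_mul_div kernel
// (name and container type are assumptions, not ViennaCL API):
//   m1[i] = (m3[i] > 1e-5) ? (m1[i] * m2[i]) / m3[i] : 0
template <typename T>
void el_wise_mul_div_reference(std::vector<T> & m1,
                               std::vector<T> const & m2,
                               std::vector<T> const & m3)
{
  // All three buffers are treated as flat arrays of equal length.
  assert(m2.size() == m1.size() && m3.size() == m1.size());

  for (std::size_t i = 0; i < m1.size(); ++i)
  {
    T val     = m1[i] * m2[i];
    T divisor = m3[i];
    // Same threshold as in the kernel source: guard against tiny divisors.
    m1[i] = (divisor > T(0.00001)) ? (val / divisor) : T(0);
  }
}
```

A reference like this could serve as the expected result when testing a ported or generated version of the kernel.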
> The same student who ported the FFT-code to multiple backends will take
> care of porting NMF to multiple backends. He's pretty quick already, so it
> should be done by the release.
>
> However, I'd refrain from integrating this into the generator for now
> because it is totally non-critical in terms of overall performance. We can
> port that under perfect control within the OpenCL backend later when we
> have more confidence in the stability of the generator (no pun intended).
>
>
>
>>         - We should definitely have a discussion on matrix padding,
>>         which is no longer required anywhere in ViennaCL, as far as I
>>         know. I am in favor of making size()==internal_size() by
>>         default. That's not the point of the e-mail, but we should
>>         have a discussion on what we should do with it!
>>
>>
>>     Getting rid of the padding would certainly remove the traps of using
>>     fast_copy() on a matrix. Other than that, I don't think it has a
>>     substantial influence on the code because internal_size() is still
>>     needed for dealing with ranges.
>>
>>     There may be an influence on certain bandwidth-limited operations,
>>     though, as for example a matrix addition may lead to bank conflicts
>>     (or channel conflicts, whatever...) when accessing GPU RAM for
>>     certain matrix sizes. Before making a decision on the padding issue,
>>     we should run some benchmarks to see whether there is an impact.
>>
>>
>> Well, one thing I'm sure of is that we should give the possibility to
>> use no padding if needed (for memory constraints), or (probably even
>> better) to choose the padding size.
>>
>
> Apparently it is not an easy choice for us to pick the default because of
> the many things to consider. Thus, making this user-customizable is most
> likely the way to go, so that we only have to worry about choosing the
> 'best' default layout :-)
>
>
>> I completely agree that removing
>> padding will have a harmful influence for ldsize=some_weird_number.
>>
>
> which makes things so complicated... ;-)
>
>
>
>> However, we certainly don't need to pad both size1 and size2. Padding
>> size2() for row-major matrices, and size1() for column-major matrices,
>> will not cause any performance regression.
>>
>
> Indeed.
>
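The idea of padding only the leading dimension can be sketched as follows. This is a minimal illustration under stated assumptions: round_up, padded_layout, row_major_layout and column_major_layout are hypothetical names, and the alignment value is a free parameter (which would also cover the user-selectable padding size discussed above); none of this is the actual ViennaCL implementation.

```cpp
#include <cstddef>

// Round n up to the next multiple of alignment (alignment must be > 0).
inline std::size_t round_up(std::size_t n, std::size_t alignment)
{
  return ((n + alignment - 1) / alignment) * alignment;
}

// Hypothetical container for the padded (internal) dimensions.
struct padded_layout
{
  std::size_t internal_size1;  // number of rows actually allocated
  std::size_t internal_size2;  // number of columns actually allocated
};

// Row-major: consecutive elements of a row are contiguous, so only the
// number of columns (size2) is padded; every row then starts aligned.
inline padded_layout row_major_layout(std::size_t size1, std::size_t size2,
                                      std::size_t alignment)
{
  padded_layout result;
  result.internal_size1 = size1;
  result.internal_size2 = round_up(size2, alignment);
  return result;
}

// Column-major: consecutive elements of a column are contiguous, so only
// the number of rows (size1) is padded.
inline padded_layout column_major_layout(std::size_t size1, std::size_t size2,
                                         std::size_t alignment)
{
  padded_layout result;
  result.internal_size1 = round_up(size1, alignment);
  result.internal_size2 = size2;
  return result;
}
```

With an alignment of 1 both functions return the unpadded sizes, which corresponds to the size()==internal_size() case.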
>
>
>>     There are a couple more things to be completed for the release.
>>     They are essentially all listed in the issue tracker and have the
>>     1.6.0 milestone assigned to them, except for the unification of
>>     coding style. When are you available for tackling that together?
>>     I'm available after Monday.
>>
>>
>> I'm available from today to Friday. I'll be unavailable for quite some
>> time afterwards for any significant work. I will still be available for
>> critical work such as fixing correctness issues in the generated code,
>> but overall I'll be busy designing my PhD course/research plans. What I
>> plan to do before leaving:
>> - Fix the GEMM performance regression of the fallback kernel
>> - Refurbish the benchmark code for dense operations
>> - Rewrite the matrix-vector tests
>>
>> Not much more. This is my last week in France, so I want to spend some
>> time with my family. I've also been having a really hard time lately
>> when adding support for vector types, ranges and strides inside the
>> generated kernels, so I feel like taking a short break before my PhD
>> begins...
>>
>
> Sure, make sure you get to the US sufficiently relaxed; who knows when
> you'll have the next opportunity to relax again ;-) Let's schedule the
> coding style unification for Wednesday? We should be done within a few
> hours, I guess.
>
> Best regards,
> Karli
>
>