Hi,

> The nasty bug on strided GEMV got solved.
thanks! :-)

> I'm available on Wednesday for the code uniformization session. We
> should be on IRC at the same time, though, in case we face a
> situation we had not discussed.

Ok, when do we start?

> I have a couple of questions regarding a standardized way of naming
> the numeric type of a matrix/vector. Sometimes it's NumericT,
> sometimes it's T, sometimes it's TYPE... What about NumericType
> everywhere? Anyway, some similar questions could arise, so it's
> probably better to be able to chat in real time while making the
> code style uniform.

The answer can already be found here:
http://viennastar.iue.tuwien.ac.at/wiki/doku.php?id=codingstyle ;-)

> We must also remember to sort out
> https://github.com/viennacl/viennacl-dev/issues/71
> https://github.com/viennacl/viennacl-dev/issues/77
> https://github.com/viennacl/viennacl-dev/issues/66
> https://github.com/viennacl/viennacl-dev/issues/2

Sure. What is special about them in comparison to the other issues
scheduled for 1.6.0?

Best regards,
Karli


> > 2014-08-17 19:36 GMT+02:00 Philippe Tillet <phil.til...@gmail.com>:
> >
> > > So the dense benchmark suite got refurbished here:
> > > https://github.com/viennacl/viennacl-dev/commit/73f46e36cfa4104628f831195e4da25a62f9ef66
> > > The same template using macros can be used for any benchmark.
> > > It's pretty concise and maintainable!
> > >
> > > Philippe
> > >
> > > 2014-08-17 13:50 GMT+02:00 Karl Rupp <r...@iue.tuwien.ac.at>:
> > >
> > > > Hi,
> > > >
> > > > > * nmf only implements matrix<T>, but in principle
> > > > > matrix_base<T> should work (since no custom kernel is
> > > > > called, I believe)
> > > >
> > > > NMF uses a custom kernel and thus only works with OpenCL. A
> > > > generalization to matrix_base should be straightforward, yes.
> > > > I should be able to do it for the release.
> > > The kernel it uses is:
> > >
> > > template <typename StringType>
> > > void generate_nmf_el_wise_mul_div(StringType & source,
> > >                                   std::string const & numeric_string)
> > > {
> > >   source.append("__kernel void el_wise_mul_div( \n");
> > >   source.append("  __global "); source.append(numeric_string); source.append(" * matrix1, \n");
> > >   source.append("  __global const "); source.append(numeric_string); source.append(" * matrix2, \n");
> > >   source.append("  __global const "); source.append(numeric_string); source.append(" * matrix3, \n");
> > >   source.append("  unsigned int size) \n");
> > >   source.append("{ \n");
> > >   source.append("  for (unsigned int i = get_global_id(0); i < size; i += get_global_size(0)) \n");
> > >   source.append("  { \n");
> > >   source.append("    "); source.append(numeric_string); source.append(" val = matrix1[i] * matrix2[i]; \n");
> > >   source.append("    "); source.append(numeric_string); source.append(" divisor = matrix3[i]; \n");
> > >   source.append("    matrix1[i] = (divisor > ("); source.append(numeric_string); source.append(")0.00001) ? (val / divisor) : ("); source.append(numeric_string); source.append(")0; \n");
> > >   source.append("  } \n");
> > >   source.append("} \n");
> > > }
> > >
> > > So, the layout of the matrix shouldn't matter, indeed. It would
> > > be pretty easy to have this kernel generated by the generator,
> > > too, as it can be represented by the expression tree:
> > >
> > >   matrix1 = select(matrix3 > 0.00001,
> > >                    element_div(element_prod(matrix1, matrix2), matrix3),
> > >                    cast<T>(0))
> > >
> > > However, we're running out of time, so I wouldn't port it. But
> > > we have to keep in mind that this would be a trivial thing to do.
> >
> > The same student who ported the FFT-code to multiple backends will
> > take care of porting NMF to multiple backends. He's pretty quick
> > already, so it should be done by the release.
> >
> > However, I'd refrain from integrating this into the generator for
> > now because it is totally non-critical in terms of overall
> > performance.
> > We can port that under perfect control within the OpenCL backend
> > later when we have more confidence in the stability of the
> > generator (no pun intended).
>
> > > > > - We should definitely have a discussion on matrix padding,
> > > > > which is no longer required anywhere in ViennaCL, as far as
> > > > > I know. I am in favor of making size()==internal_size() by
> > > > > default. That's not the point of the e-mail, but we should
> > > > > have a discussion on what we should do with it!
> > > >
> > > > Getting rid of the padding would certainly remove the traps of
> > > > using fast_copy() on a matrix. Other than that, I don't think
> > > > it has a substantial influence on the code, because
> > > > internal_size() is still needed for dealing with ranges.
> > > >
> > > > There may be an influence on certain bandwidth-limited
> > > > operations, though, as for example a matrix addition may lead
> > > > to bank conflicts (or channel conflicts, whatever...) when
> > > > accessing GPU RAM for certain matrix sizes. Before making a
> > > > decision on the padding issue, we should run some benchmarks
> > > > to see whether there is an impact.
> > >
> > > Well, one thing I'm sure of is that we should give the
> > > possibility to use no padding if needed (for memory
> > > constraints), or (probably even better) to choose the padding
> > > size.
> >
> > Apparently it is not an easy choice for us to pick the default
> > because of the many things to consider. Thus, making this
> > user-customizable is most likely the way to go, so that we only
> > have to worry about choosing the 'best' default layout :-)
>
> > > I completely agree that removing padding will have a harmful
> > > influence for ldsize=some_weird_number.
> >
> > which makes things so complicated... ;-)
>
> > > However, we certainly don't need to pad both size1 and size2.
> > > Padding size2() for row-major matrices, and size1() for
> > > column-major matrices, will not cause any performance
> > > regression.
> >
> > Indeed.
>
> > > > There are a couple of more things for the release to be
> > > > completed.
> > > > They are essentially all listed in the issue tracker and have
> > > > the 1.6.0 milestone assigned to them, except for the
> > > > unification of coding style. When are you available for
> > > > tackling that together? I'm available after Monday.
> > >
> > > I'm available from today to Friday. I'll be unavailable for
> > > quite some time afterwards for any significant work. I will
> > > still be available for critical work such as fixing correctness
> > > issues in the generated code, but overall I'll be busy designing
> > > my PhD course/research plans. What I plan to do before leaving:
> > > - Fix the GEMM performance regression of the fallback kernel
> > > - Refurbish the benchmark code for dense operations
> > > - Rewrite the matrix-vector tests
> > >
> > > Not much more. This is my last week in France, so I want to
> > > spend some time with my family. I've also been having a really
> > > hard time lately when adding support for vector types, ranges
> > > and strides inside the generated kernels, so I feel like taking
> > > a short break before my PhD begins...
> >
> > Sure, make sure you get to the US sufficiently relaxed; who knows
> > when you'll have the next opportunity to relax again ;-)
> > Let's schedule the coding style unification for Wednesday? We
> > should be done within a few hours, I guess.
> >
> > Best regards,
> > Karli

------------------------------------------------------------------------------
_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel