Hi,

2014-05-05 9:18 GMT+02:00 Karl Rupp <r...@iue.tuwien.ac.at>:

> Hi,
>
> (CC-ing viennacl-devel, as this is developer-talk ;-) )
>
>
>  Either way, I want to let you know that the generator/auto-tuner is
>> undergoing significant changes, and that you will, actually, not have to
>> worry about it for your GSoC project. The generator will be used
>> transparently via the viennacl::linalg:: functions, and the auto-tuner
>> will be entirely moved to pyviennacl.
>>
>
> Well, I think this is not entirely unrelated. The purpose of the GUI is
> still to allow a broader community to feed us with benchmark data, so
> somehow the loop over all possible configurations is still essential. With
> an interface to Python I assume that an API to do exactly that will still
> be available ;-)
>
>
Well, looping over all the possible configurations for one particular
problem size is good for benchmarking purposes only; the data generated this
way will not be reusable unless we can make some assumptions about the
input-data size. That is, if the GUI only auto-tunes GEMV/GEMM for large
square matrices, then we will collect a lot of pointless data. Instead, the
GUI should export a model which, given some input-data sizes and a hardware
configuration, is able to predict the optimal kernel. This is why the
auto-tuner is being moved to pyviennacl.
However, the GUI could/should indeed still be able to execute the
corresponding Python scripts.
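To make the idea concrete, here is a minimal sketch of the kind of model the
GUI could export; all names are hypothetical, and a real model would be
trained rather than a nearest-neighbour lookup, but the interface is the
point: map a problem size to the best kernel profile measured on similar
sizes.

```python
# Hypothetical sketch: predict a kernel profile for an unseen problem size
# by looking up the nearest benchmarked size. Names are illustrative only.

def predict_profile(benchmarks, m, n):
    """benchmarks: list of ((rows, cols), profile) pairs from the tuner."""
    def distance(size):
        return abs(size[0] - m) + abs(size[1] - n)
    best_size, best_profile = min(benchmarks, key=lambda rec: distance(rec[0]))
    return best_profile

# Example: tuning data collected for a few representative shapes
data = [((1000, 1000), "profile_small"),
        ((50000, 50000), "profile_large_square"),
        ((100000, 16), "profile_tall_skinny")]

print(predict_profile(data, 48000, 52000))  # -> profile_large_square
```

A trained regressor or classifier over such (size, profile) records would
replace the lookup, but this is the data flow the GUI would need to support.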


>
>  There is, however, one additional point I'd like to discuss. The
>> performance of all the algorithms you'll have to benchmark are highly
>> dependent on the characteristics of the input data. For example, matrix
>> products will behave very differently according to the size/shape of the
>> input matrices. This is very important : this means that a good
>> benchmarking GUI could help the users to design their system.
>> Here's an example. Suppose that someone wants to solve the linear system:
>> A*x* = *y*
>>
>> If, for his particular application, A is a 50,000x50,000 sparse matrix,
>> then he could be greatly interested in knowing how he could pad A to
>> achieve better performance. In that case, the benchmarking-gui could
>> explore randomly R^2 beyond (50,000 ; 50,000), and potentially tell the
>> user that, if he makes A a (50,500; 50,500) matrix, then he could
>> improve his performance by say 10 or 20%.
>>
>
> For sparse matrices I don't believe in random patterns. The user usually
> has a particular application in mind, so I consider it more important to
>  a) Allow users to feed the tuner with their own sparse matrix
>  b) Allow users to select sparse matrices from the Florida matrix market
> The second option is important for benchmark purposes and for comparison
> with data in the literature. We can also add a third option for random
> matrices, but it's certainly far less important.



We could also try to describe a sparse matrix by a few parameters (number
of rows/cols, format, sparsity pattern, etc.) and use machine learning to
predict the optimal kernel for an arbitrary sparse matrix. For the
training data, we could indeed use the Florida matrix market.
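As a rough illustration (all names hypothetical), the feature extraction
could look like the following; the hand-written rule at the end is only a
stand-in for whatever classifier would actually be trained on the Florida
matrices.

```python
# Illustrative sketch: summarise a sparse matrix by a few features that a
# learned model could use to pick a kernel/format. Names are hypothetical.

def sparse_features(rows, cols, row_nnz):
    """row_nnz: list giving the number of nonzeros in each row."""
    nnz = sum(row_nnz)
    avg = nnz / rows
    var = sum((k - avg) ** 2 for k in row_nnz) / rows
    return {"rows": rows, "cols": cols, "nnz": nnz,
            "avg_nnz_per_row": avg, "row_nnz_variance": var}

# Crude stand-in for a trained classifier: regular rows favour ELL-like
# formats, irregular row lengths favour CSR.
def suggest_format(feat):
    return "ELL" if feat["row_nnz_variance"] < feat["avg_nnz_per_row"] else "CSR"

f = sparse_features(4, 4, [5, 5, 5, 5])   # perfectly regular row lengths
print(suggest_format(f))                  # -> ELL
```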


>
>
>
>  In the case of dense matrix
>> products, one may even be able to double his performance by slightly
>> altering the size of the input matrices.
>>
>
> Okay, this is only about adjusting the padding parameter and should be
> transparently included in the tuning process anyway, shouldn't it?
>

This is not exactly what I meant. Suppose that someone wants to compute the
dense matrix product:
A*B
where A is in R^{238, 2031} and B is in R^{2031, 1240}.
Then the auto-tuner should indeed find the optimal padding size, and A and
B would be transparently padded to multiples of 128: {256, 2048} and {2048,
1280}.
However, for some reason, using matrices of size {256, 2176} and {2176,
1280} may be worth it for SGEMM (but not for DGEMM), because a dimension of
2048 could trigger a lot of bank conflicts. Similarly, one might hit a sweet
spot of his GPU for {256, 2560}x{2560, 1408}. I don't think that ViennaCL
should handle this. I can think of some applications in the field of
artificial neural networks, where one may want to resize the layers of his
network so as to land on such sweet spots of his GPU.
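For reference, the padding arithmetic from the example above can be sketched
as follows (assuming a multiple-of-128 granularity, as in the example; the
real alignment requirement is kernel-dependent):

```python
# Round a matrix dimension up to the next multiple of the padding granularity.
def pad(dim, multiple=128):
    return ((dim + multiple - 1) // multiple) * multiple

# A in R^{238, 2031}, B in R^{2031, 1240}
print(pad(238), pad(2031))   # -> 256 2048
print(pad(1240))             # -> 1280
```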

Philippe


> Best regards,
> Karli
>
>