Many thanks for your support, Lev!
It works and looks good!
I have already asked about parallelization; your answer was that it
happens automatically.
To be more specific:
0. Could you please explain this mechanism?
1. How many blocks/threads have been used in my program?
2. How can I obtain these numbers in the program? How can I manipulate them?
3. Could you recommend any literature where I can read about this?
I ask because I am new to this topic. When using SourceModule, I
specify the number of blocks and threads myself, so it is not clear to
me how this works automatically. Does it depend on the matrix size,
and so on?
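For comparison, when launching a SourceModule kernel by hand, the grid is typically sized from the matrix dimensions so that the blocks cover every element. A minimal CPU-only sketch of that calculation (the `launch_dims` helper and the (16, 16) block shape are just illustrative choices, not anything PyCUDA mandates):

```python
import math

def launch_dims(rows, cols, block=(16, 16)):
    """Return (block, grid) tuples that tile a rows x cols matrix.

    block: threads per block (x, y) -- a common but arbitrary choice.
    grid:  enough blocks in each dimension to cover all elements;
           kernels then guard against the out-of-range remainder.
    """
    grid = (math.ceil(cols / block[0]), math.ceil(rows / block[1]))
    return block, grid

# For a 1000 x 1000 matrix with 16 x 16 blocks we need 63 blocks per axis,
# since 62 * 16 = 992 < 1000.
block, grid = launch_dims(1000, 1000)
print(block, grid)  # (16, 16) (63, 63)
```

These are the `block=` and `grid=` arguments one would pass to the compiled kernel; higher-level wrappers compute equivalent values from the array shape so the caller never sees them.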
I think such information will be useful for all new users! I hope you
(or someone else :-) ) can help me understand!
Best regards,
Evgeny
Am 25.02.2014 15:47, schrieb Lev Givon:
Received from Evgeny Lazutkin on Tue, Feb 25, 2014 at 03:18:18AM EST:
Dear Lev, dear all,
I have solved the problem with the DataTypeError: I had not noticed
that I was passing float64 instead of float32.
Attached you will find the code. It works, but the GPU produces the
wrong solution. I print the results in the program, and they don't match.
Could you explain why?
This is because arrays in numpy are row-major by default while CULA assumes the
data is column-major [1]. If you transpose the two input matrices, you should
obtain the correct result (transposed). Corrected code attached.
[1] http://www.culatools.com/cula_dense_programmers_guide/#column-major-ordering
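The effect can be reproduced with NumPy alone (no GPU needed): a row-major buffer reinterpreted in column-major order is the transpose of the original matrix, and the identity (AB)^T = B^T A^T is what makes the transposed-input fix work. A minimal sketch:

```python
import numpy as np

# A row-major (C-order) buffer, read back in column-major (Fortran) order,
# is the transpose of the original matrix -- this is what a column-major
# library "sees" when handed numpy's default layout.
A = np.arange(12, dtype=np.float32).reshape(3, 4)
reinterpreted = A.ravel(order="C").reshape(4, 3, order="F")
assert np.allclose(reinterpreted, A.T)

# The fix: multiply the transposed inputs, then transpose the result back,
# using the identity (A B)^T = B^T A^T.
B = np.arange(8, dtype=np.float32).reshape(4, 2)
C = (B.T @ A.T).T
assert np.allclose(C, A @ B)
```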
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda