Hi Paul,

I haven't tested your matrix on my machine yet; I won't have time until the
weekend. I wouldn't expect cuSOLVER on the GPU to produce exactly the same
result as the CPU. cuSOLVER parallelizes the computation in a sophisticated
way to improve performance, so the floating-point operations run in a
different order, but the error should still be within a threshold.
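
A minimal NumPy sketch of how such a comparison might be done, rather than
testing for bitwise equality. The helper names (relative_residual,
factors_agree), the stand-in matrix, and the tolerance (a safety factor times
cond(A) times machine epsilon) are assumptions for illustration, not anything
cuSOLVER guarantees. A float32 factor stands in for the GPU result here:

```python
import numpy as np

def relative_residual(A, L):
    """Relative reconstruction error ||A - L L^T||_F / ||A||_F, in float64."""
    A64 = np.asarray(A, dtype=np.float64)
    L64 = np.asarray(L, dtype=np.float64)
    return np.linalg.norm(A64 - L64 @ L64.T) / np.linalg.norm(A64)

def factors_agree(A, L_ref, L_test, safety=10.0):
    """Heuristic check: do two lower Cholesky factors differ by no more
    than about safety * cond(A) * eps, with eps taken from L_test's dtype?"""
    eps = np.finfo(np.asarray(L_test).dtype).eps
    cond = np.linalg.cond(np.asarray(A, dtype=np.float64))
    ref = np.asarray(L_ref, dtype=np.float64)
    test = np.asarray(L_test, dtype=np.float64)
    rel_diff = np.linalg.norm(ref - test) / np.linalg.norm(ref)
    return rel_diff <= safety * cond * eps

# Stand-in data (assumption): a well-conditioned SPD matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = B @ B.T + 50.0 * np.eye(50)

L_ref = np.linalg.cholesky(A)                      # float64 reference
L_test = np.linalg.cholesky(A.astype(np.float32))  # float32 stand-in

print("condition number:", np.linalg.cond(A))
print("residual of float32 factor:", relative_residual(A, L_test))
print("factors agree within tolerance:", factors_agree(A, L_ref, L_test))
```

If the residual of a factor is far above float32 epsilon while the condition
number is moderate, that would point to a bug rather than ordinary rounding.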

If the MAGMA Cholesky decomposition is more stable, it is possible to
implement a gradient for it the same way GpuCholesky does: just add support
for float64 and implement the L_op method.
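
For reference, the reverse-mode rule that such an L_op would have to compute
can be sketched in NumPy. This illustrates the standard Cholesky gradient
identity on the CPU; it is not Theano's or MAGMA's actual code, the helper
names phi and cholesky_grad are made up, and it returns the symmetrized
gradient, which may differ from the triangular convention a particular Op
uses:

```python
import numpy as np

def phi(X):
    """Keep the lower triangle of X and halve its diagonal."""
    return np.tril(X) - 0.5 * np.diag(np.diag(X))

def cholesky_grad(L, L_bar):
    """Gradient w.r.t. symmetric A = L L^T, given L_bar = d(loss)/dL
    (lower triangular). Returns a symmetric matrix."""
    P = phi(L.T @ L_bar)
    # S = L^{-T} P L^{-1}, via two solves against L^T
    # (a dedicated triangular solver would be cheaper).
    S = np.linalg.solve(L.T, np.linalg.solve(L.T, P.T).T)
    return 0.5 * (S + S.T)

# Finite-difference check in float64 on a small SPD matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5.0 * np.eye(5)
W = np.tril(rng.standard_normal((5, 5)))  # loss = sum(W * L), so L_bar = W

def loss(M):
    return np.sum(W * np.linalg.cholesky(M))

A_bar = cholesky_grad(np.linalg.cholesky(A), W)

E = rng.standard_normal((5, 5))
E = E + E.T                               # symmetric perturbation direction
h = 1e-6
analytic = np.sum(A_bar * E)
numeric = (loss(A + h * E) - loss(A - h * E)) / (2 * h)
print("analytic:", analytic, "finite difference:", numeric)
```

A GPU version would presumably replace the two solves with triangular solves
on the device, but the structure of the computation stays the same.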

Best regards,
wonghang

On Thu, Feb 6, 2020 at 6:28 PM Paul Baggenstoss <p.m.baggenst...@ieee.org> wrote:

> Simon,
> I did more digging and have some more information. I tested
> theano.gpuarray.linalg.GpuMagmaCholesky() on float32 and it looks good.
> The result is exactly the same as for the CPU.
> So the problem seems to be in cuSOLVER. The catch is that
> theano.gpuarray.linalg.GpuMagmaCholesky()(Cll) does not define a gradient
> and works only for float32. I installed the latest magma-2.5.2, which
> supports double-precision Cholesky (dpotrf), but Theano seems to use its
> own copy of the MAGMA source.
> I'm not sure how that works. Can I force Theano to use magma-2.5.2? If not,
> it seems feasible to borrow the gradient from
> theano.gpuarray.linalg.GpuCholesky()
> and add support for float64 as well. Thoughts?
> Paul
>
>
> On Wednesday, February 5, 2020 at 5:32:43 PM UTC+1, Paul Baggenstoss wrote:
>>
>> Hi Simon, I forgot to mention that I use the gradient of Cholesky, and
>> it has even more error than the Cholesky decomposition itself, but I
>> assume that this is because of a bug in Cholesky itself.
>> Paul
>>
>>
>> On Wednesday, February 5, 2020 at 5:30:10 PM UTC+1, Paul Baggenstoss
>> wrote:
>>>
>>> Hi Simon, I have uploaded a MATLAB-format file with the matrix Cll,
>>> which is the original matrix; R_cpu, which was produced on the CPU by
>>> slinalg.Cholesky(); and R_cuda, which was produced by the same function
>>> but on the GPU (I think it uses theano.gpuarray.linalg.GpuCholesky()).
>>> Both used the same precision (float32), so they should give the same
>>> results. But you can see that at the end of the diagonal, the values go
>>> wild. It appears to be numerical error.
>>> Thanks in advance!
>>> Paul
>>>
>>>
>>>
>>>
>>> On Wednesday, February 5, 2020 at 4:56:14 PM UTC+1, Wong Hang wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> The GPU Cholesky decomposition relies on cuSOLVER or MAGMA. I believe
>>>> NVIDIA knows their hardware well, so cuSOLVER should give the most
>>>> efficient result.
>>>>
>>>> Although Cholesky decomposition is numerically very stable, when I wrote
>>>> the test cases I found that I ran into trouble even for relatively small
>>>> matrices if I used single precision.
>>>>
>>>> Are you using single precision on a big matrix?
>>>> If not, try computing the condition number of the matrix to see whether
>>>> it is too big.
>>>>
>>>> If it is not too big, then it may be a bug. I also need to use the
>>>> Cholesky operator. Please send me the matrix and I am willing to fix it.
>>>>
>>>> Best,
>>>>
>>>> On Thu, Feb 6, 2020 at 0:34, Paul Baggenstoss <p.m.ba...@ieee.org> wrote:
>>>>
>>>>> Hi Simon, I was wondering if you got anywhere with the faster Cholesky
>>>>> for Theano. I also use it a lot and have found it to be unstable on the
>>>>> GPU.
>>>>> Paul
>>>>>
>>>>> On Saturday, March 7, 2015 at 11:45:36 AM UTC+1, Simon Ebner wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I want to do computations where I rely heavily on the Cholesky
>>>>>> decomposition. Writing a small benchmark for tensor.slinalg.Cholesky, I
>>>>>> noticed that the implementation is not as fast as I had hoped. As far as
>>>>>> I can tell, it is not optimized for GPUs yet but relies on the SciPy
>>>>>> implementation?
>>>>>> Doing a bit of a Google search, I found several CUDA implementations
>>>>>> of fast Cholesky decompositions on the GPU. Before I try to include that
>>>>>> code in my Theano environment, could you let me know whether you
>>>>>> decided on purpose not to implement a fast Cholesky decomposition on
>>>>>> the GPU? Furthermore, since I'm fairly new to Theano, I'm not completely
>>>>>> confident how best to incorporate CUDA code into my existing Theano
>>>>>> code. Is it sensible to create a custom Op with optimized C code?
>>>>>>
>>>>>> Best,
>>>>>> Simon
>>>>>>
>>>>> --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "theano-users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to theano...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/theano-users/aca41c35-ec36-4055-bac7-e53aced30ea7%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/theano-users/aca41c35-ec36-4055-bac7-e53aced30ea7%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>

