The LU decomposition in GSL (both real and complex) is now based on a
recursive Level 3 BLAS algorithm. The performance improvement is quite
dramatic when using an optimized multi-threaded BLAS library like ATLAS.
I'd be interested in hearing feedback from anyone who uses Cholesky/LU
factorizations in their work. GSL may out-perform LAPACK in these areas
now, and the recursive algorithms are surprisingly simple to implement
and fit quite nicely with GSL's codebase.

Enjoy,
Patrick

On 5/30/19 9:06 AM, Patrick Alken wrote:
> Hi all,
>
>
>    I have recently learned of a project called ReLAPACK 
> (https://github.com/HPAC/ReLAPACK, paper here: 
> https://arxiv.org/abs/1602.06763) which implements a number of LAPACK 
> algorithms (such as LU, Cholesky, Sylvester equations) using recursive 
> methods which can use Level 3 BLAS calls. The paper shows that most of 
> these algorithms out-perform the block Level 3 algorithms in LAPACK. The 
> main advantage is that LAPACK block algorithms require fixing the block 
> size ahead of time, which may not be optimal for a given architecture, 
> while the recursive methods don't require a block size parameter.
>
> The recursive methods do however require a "base case" - i.e. at what 
> size matrix should it switch to the Level 2 BLAS based algorithms. 
> ReLAPACK fixes this currently at N=24.
>
> Anyway, the recursive Cholesky variant is quite straightforward to 
> implement, and I have already coded it for GSL (both the decomposition 
> and inversion). I did some tests for N=10,000 with ATLAS BLAS and found 
> that it runs faster than DPOTRF from LAPACK. This fast Cholesky 
> decomposition will improve the performance also for the generalized 
> symmetric definite eigensolvers, and the least squares modules (linear 
> and nonlinear).
>
> So GSL now has a competitive Cholesky solver, which I think should make 
> many GSL users happy :)
>
> Work is currently underway to implement the recursive pivoted LU 
> decomposition in GSL.
>
> Unfortunately the ReLAPACK authors state that the QR algorithm is not 
> amenable to recursive methods, so the block QR seems to still be the 
> best choice. It would be nice to implement this for GSL, in case any 
> volunteers are looking for a project ;)
>
> Patrick
>
>

Reply via email to