The LU decomposition in GSL (both real and complex) is now based on a recursive Level 3 BLAS algorithm. The performance improvement is quite dramatic when using an optimized multi-threaded BLAS library like ATLAS. I'd be interested in hearing feedback from anyone who uses Cholesky/LU factorizations in their work. GSL may out-perform LAPACK in these areas now, and the recursive algorithms are surprisingly simple to implement and fit quite nicely with GSL's codebase.
Enjoy, Patrick On 5/30/19 9:06 AM, Patrick Alken wrote: > Hi all, > > > I have recently learned of a project called ReLAPACK > (https://github.com/HPAC/ReLAPACK, paper here: > https://arxiv.org/abs/1602.06763) which implements a number of LAPACK > algorithms (such as LU, Cholesky, Sylvester equations) using recursive > methods which can use Level 3 BLAS calls. The paper shows that most of > these algorithms out-perform the block Level 3 algorithms in LAPACK. The > main advantage is that LAPACK block algorithms require fixing the block > size ahead of time, which may not be optimal for a given architecture, > while the recursive methods don't require a block size parameter. > > The recursive methods do however require a "base case" - i.e. at what > size matrix should it switch to the Level 2 BLAS based algorithms. > ReLAPACK fixes this currently at N=24. > > Anyway, the recursive Cholesky variant is quite straightforward to > implement, and I have already coded it for GSL (both the decomposition > and inversion). I did some tests for N=10,000 with ATLAS BLAS and found > that it runs faster than DPOTRF from LAPACK. This fast Cholesky > decomposition will improve the performance also for the generalized > symmetric definite eigensolvers, and the least squares modules (linear > and nonlinear). > > So GSL now has a competitive Cholesky solver, which I think should make > many GSL users happy :) > > Work is currently underway to implement the recursive pivoted LU > decomposition in GSL. > > Unfortunately the ReLAPACK authors state that the QR algorithm is not > amenable to recursive methods, so the block QR seems to still be the > best choice. It would be nice to implement this for GSL, in case any > volunteers are looking for a project ;) > > Patrick > >
