On Sat, 20 Feb 2021 at 07:40, Kartik Sethi <kartiks31...@gmail.com> wrote: > > Hey, my name is Kartik Sethi. I am interested in taking part in Gsoc this > year. I have been contributing to sympy for the past couple of months.
Hi Kartik, > I have come up with a few ideas for Gsoc. I am focusing primarily on the > sympy matrices module. > > Implementing complete Singular Value decomposition. Sympy currently has a > function for condensed SVD ( I am a contributor on this #20761). > Using the complete SVD, I can implement Polar decomposition > Hermite Normal Form. There is an old un-merged PR (#18534), I could work on > this and complete it. > Sparse-Fraction Free Algorithm another old unfinished PR (#9133) The Matrix class does not necessarily get much benefit from fraction-free algorithms because of the internal representation. DomainMatrix does though (see below). > Jordan Canonical Form This is already implemented as the Matrix.jordan_form method so I'm not sure what you are proposing. The main way it could be improved is by having the method use DomainMatrix internally just like eigenvects does (see below). > Improve time complexity for QR decomposition of upper Hessenberg. There is an > algorithm using Householder reflectors that can cut down time-complexity to > O(n^2) instead of current O(n^3) When I've timed calculations with Matrix they almost never match up to the big-O behaviour that is typically expected for any given algorithm. I think there are too many other symbolic computations going on that it slows down more than it should as n grows. Operations with DomainMatrix though do tend to have the big-O that you would expect if you take into account bit complexity. This is not listed anywhere on the ideas page or really documented anywhere yet but there is a new mostly internal implementation of matrices in the sympy.polys.matrices module. This new implementation is called DomainMatrix and is based on the polys domains. The domains themselves are explained in these recently added docs: https://docs.sympy.org/dev/modules/polys/domainsintro.html There are aren't any docs for DomainMatrix yet and it is not that easy to use but you can find more information in this most recent PR which links to other PRs: https://github.com/sympy/sympy/pull/20780 The easiest way to test out DomainMatrix is by converting from a normal Matrix like this: In [75]: from sympy.polys.matrices import DomainMatrix In [76]: M = Matrix(range(25)).reshape(5, 5) In [77]: M Out[77]: ⎡0 1 2 3 4 ⎤ ⎢ ⎥ ⎢5 6 7 8 9 ⎥ ⎢ ⎥ ⎢10 11 12 13 14⎥ ⎢ ⎥ ⎢15 16 17 18 19⎥ ⎢ ⎥ ⎣20 21 22 23 24⎦ In [78]: dM = DomainMatrix.from_Matrix(M) In [80]: dM Out[80]: DomainMatrix([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]], (5, 5), ZZ) DomainMatrix is much faster than Matrix for many calculations. The difference is observable for a simple matrix of integers like this: In [81]: %time M.det() CPU times: user 2.4 ms, sys: 281 µs, total: 2.68 ms Wall time: 2.73 ms Out[81]: 0 In [82]: %time dM.det() CPU times: user 143 µs, sys: 161 µs, total: 304 µs Wall time: 310 µs Out[82]: mpz(0) The difference becomes really noticeable if the matrix has algebraic elements like I or sqrt(2) or has symbols like x or y in it: In [88]: M = randMatrix(5, 5) + randMatrix(5, 5) * I In [89]: M Out[89]: ⎡26 + 27⋅ⅈ 41 + 76⋅ⅈ 98 + 76⋅ⅈ 64 + 52⋅ⅈ 48 + 2⋅ⅈ ⎤ ⎢ ⎥ ⎢65 + 81⋅ⅈ 49 + 25⋅ⅈ 39 + 82⋅ⅈ 74⋅ⅈ 77 + 3⋅ⅈ ⎥ ⎢ ⎥ ⎢11 + 59⋅ⅈ 49 + 67⋅ⅈ 50 + 25⋅ⅈ 69 + 24⋅ⅈ 79 + 31⋅ⅈ⎥ ⎢ ⎥ ⎢42 + 29⋅ⅈ 33 + 72⋅ⅈ 37 + 59⋅ⅈ 10 + 12⋅ⅈ 27 + 62⋅ⅈ⎥ ⎢ ⎥ ⎣97 + 7⋅ⅈ 4 + 26⋅ⅈ 8 + 30⋅ⅈ 47 + 84⋅ⅈ 81 + 6⋅ⅈ ⎦ In [90]: dM = DomainMatrix.from_Matrix(M) In [91]: %time ok = M.det() CPU times: user 486 ms, sys: 4.15 ms, total: 490 ms Wall time: 492 ms In [92]: %time ok = dM.det() CPU times: user 12 ms, sys: 422 µs, total: 12.4 ms Wall time: 12.7 ms In [93]: %time ok = M.charpoly() CPU times: user 18.9 s, sys: 186 ms, total: 19.1 s Wall time: 19.2 s In [94]: %time ok = dM.charpoly() CPU times: user 10 ms, sys: 170 µs, total: 10.2 ms Wall time: 10.5 ms That's a 50x speed difference for computing the determinant and a 2000x difference for the characteristic polynomial. This is just for a 5x5 matrix: the speed difference will be more significant for larger matrices. Further optimisations are still possible for both det and charpoly: sparse fraction free elimination could be used for det for example. It is not currently clear how to make use of DomainMatrix for users. Currently it is only used as part of linsolve and internally by the Matrix.eigenvects routine. Right now the thing that would make the biggest difference to matrices in sympy would be making more use of DomainMatrix to speed up calculations. For example this PR added the use of DomainMatrix for computing eigenvectors: https://github.com/sympy/sympy/pull/20614 That makes eigenvector calculations extremely fast on master compared to sympy 1.7.1. Given this matrix: B = Matrix([ [ 5, -5, -3, 2], [-2, -5, 0, 2], [-2, -7, -5, -2], [ 7, 10, 3, 9]]) On master the eigenvectors are computed using DomainMatrix and it takes 200 milliseconds: In [2]: %time v = B.eigenvects() CPU times: user 206 ms, sys: 14 ms, total: 220 ms Wall time: 219 ms With sympy 1.7.1 DomainMatrix is not used and the calculation is extremely slow. I waited 10 minutes and then gave up: In [2]: %time v = B.eigenvects() ^C--------------------------------------------------------------------------- KeyboardInterrupt Methods like fraction-free algorithms don't necessarily make much sense with the current Matrix class as it uses Expr for the matrix elements which does not necessarily get much speed up from avoiding division. With DomainMatrix it can make a huge difference because it means that we can perform calculations over a polynomial ring domain rather than a rational function field which eliminates gcd calculations. I think that right now this is the direction to go to improve matrix calculations. Some examples of things that need doing: 1. Add docs explaining DomainMatrix 2. Use DomainMatrix for jordan_form as well as for eigenvects and in particular for the matrix exponential. 3. Use DomainMatrix for e.g. charpoly and many other methods. 4. Add basic matrix convenience methods like slicing to DomainMatrix. 5. Add fraction free and other algorithms to DomainMatrix 6. Figure out how and when to use both the dense and sparse implementations of DomainMatrix. The big thing that needs doing is providing users with a more direct way to use DomainMatrix: a public matrix class that uses DomainMatrix as its internal representation so that it does not need to convert between representations each time it wants to compute something like eigenvects. This needs a bigger discussion though about what that public interface would be. For now I think it would be better to keep building up DomainMatrix with additional docs, algorithms and methods while making use of it internally to speed up particularly slow calculations. Oscar -- You received this message because you are subscribed to the Google Groups "sympy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sympy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/CAHVvXxQQ4zA1qhwTsbtSEAjxZHwGhK4XgHS1k6KQYxTWsOLEMA%40mail.gmail.com.