Thanks, Oscar Benjamin, for the wonderful suggestions. I will start looking into algorithms and methods that have not yet been implemented for the DomainMatrix class. I would be happy to help build up DomainMatrix.
Do you think improving and implementing algorithms for the DomainMatrix class would make a suitable GSoC proposal?

On Saturday, 20 February 2021 at 5:04:13 pm UTC+5:30, Oscar wrote:

> On Sat, 20 Feb 2021 at 07:40, Kartik Sethi <kartik...@gmail.com> wrote:
> >
> > Hey, my name is Kartik Sethi. I am interested in taking part in GSoC
> > this year. I have been contributing to sympy for the past couple of months.
>
> Hi Kartik,
>
> > I have come up with a few ideas for GSoC. I am focusing primarily on
> > the sympy matrices module.
> >
> > Implementing complete singular value decomposition. Sympy currently has
> > a function for condensed SVD (I am a contributor on this, #20761).
> > Using the complete SVD, I can implement polar decomposition.
> > Hermite Normal Form. There is an old unmerged PR (#18534); I could work
> > on this and complete it.
> > Sparse fraction-free algorithm: another old unfinished PR (#9133).
>
> The Matrix class does not necessarily get much benefit from
> fraction-free algorithms because of the internal representation.
> DomainMatrix does though (see below).
>
> > Jordan canonical form.
>
> This is already implemented as the Matrix.jordan_form method, so I'm
> not sure what you are proposing. The main way it could be improved is
> by having the method use DomainMatrix internally, just like eigenvects
> does (see below).
>
> > Improve the time complexity of QR decomposition for upper Hessenberg
> > matrices. There is an algorithm using Householder reflectors that can
> > cut the time complexity down to O(n^2) from the current O(n^3).
>
> When I've timed calculations with Matrix they almost never match up to
> the big-O behaviour that is typically expected for any given algorithm.
> I think there are too many other symbolic computations going on, so it
> slows down more than it should as n grows. Operations with DomainMatrix,
> though, do tend to have the big-O that you would expect if you take
> bit complexity into account.
>
> This is not listed anywhere on the ideas page or really documented
> anywhere yet, but there is a new, mostly internal, implementation of
> matrices in the sympy.polys.matrices module. This new implementation
> is called DomainMatrix and is based on the polys domains. The domains
> themselves are explained in these recently added docs:
> https://docs.sympy.org/dev/modules/polys/domainsintro.html
> There aren't any docs for DomainMatrix yet and it is not that easy to
> use, but you can find more information in this most recent PR, which
> links to other PRs:
> https://github.com/sympy/sympy/pull/20780
>
> The easiest way to test out DomainMatrix is by converting from a
> normal Matrix like this:
>
> In [75]: from sympy.polys.matrices import DomainMatrix
>
> In [76]: M = Matrix(range(25)).reshape(5, 5)
>
> In [77]: M
> Out[77]:
> ⎡0   1   2   3   4 ⎤
> ⎢                  ⎥
> ⎢5   6   7   8   9 ⎥
> ⎢                  ⎥
> ⎢10  11  12  13  14⎥
> ⎢                  ⎥
> ⎢15  16  17  18  19⎥
> ⎢                  ⎥
> ⎣20  21  22  23  24⎦
>
> In [78]: dM = DomainMatrix.from_Matrix(M)
>
> In [80]: dM
> Out[80]: DomainMatrix([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12,
> 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]], (5, 5), ZZ)
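(An unquoted aside: a DomainMatrix can also be constructed directly from domain elements rather than converted from a Matrix. A minimal sketch, assuming the constructor signature and the convert_to/to_Matrix methods in the current sympy.polys.matrices code, so worth double-checking against the PR linked above:)

    from sympy.polys.domains import ZZ, QQ
    from sympy.polys.matrices import DomainMatrix

    # rows of domain elements, then the shape and the domain explicitly
    dM = DomainMatrix([[ZZ(1), ZZ(2)], [ZZ(3), ZZ(4)]], (2, 2), ZZ)

    # change the ground domain, e.g. ZZ -> QQ, before division-based work
    dM_qq = dM.convert_to(QQ)

    # convert back to an ordinary Matrix of Expr for display or further use
    M2 = dM.to_Matrix()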
> DomainMatrix is much faster than Matrix for many calculations. The
> difference is observable even for a simple matrix of integers like this:
>
> In [81]: %time M.det()
> CPU times: user 2.4 ms, sys: 281 µs, total: 2.68 ms
> Wall time: 2.73 ms
> Out[81]: 0
>
> In [82]: %time dM.det()
> CPU times: user 143 µs, sys: 161 µs, total: 304 µs
> Wall time: 310 µs
> Out[82]: mpz(0)
>
> The difference becomes really noticeable if the matrix has algebraic
> elements like I or sqrt(2) or has symbols like x or y in it:
>
> In [88]: M = randMatrix(5, 5) + randMatrix(5, 5) * I
>
> In [89]: M
> Out[89]:
> ⎡26 + 27⋅ⅈ  41 + 76⋅ⅈ  98 + 76⋅ⅈ  64 + 52⋅ⅈ  48 + 2⋅ⅈ ⎤
> ⎢                                                     ⎥
> ⎢65 + 81⋅ⅈ  49 + 25⋅ⅈ  39 + 82⋅ⅈ    74⋅ⅈ     77 + 3⋅ⅈ ⎥
> ⎢                                                     ⎥
> ⎢11 + 59⋅ⅈ  49 + 67⋅ⅈ  50 + 25⋅ⅈ  69 + 24⋅ⅈ  79 + 31⋅ⅈ⎥
> ⎢                                                     ⎥
> ⎢42 + 29⋅ⅈ  33 + 72⋅ⅈ  37 + 59⋅ⅈ  10 + 12⋅ⅈ  27 + 62⋅ⅈ⎥
> ⎢                                                     ⎥
> ⎣97 + 7⋅ⅈ   4 + 26⋅ⅈ   8 + 30⋅ⅈ   47 + 84⋅ⅈ  81 + 6⋅ⅈ ⎦
>
> In [90]: dM = DomainMatrix.from_Matrix(M)
>
> In [91]: %time ok = M.det()
> CPU times: user 486 ms, sys: 4.15 ms, total: 490 ms
> Wall time: 492 ms
>
> In [92]: %time ok = dM.det()
> CPU times: user 12 ms, sys: 422 µs, total: 12.4 ms
> Wall time: 12.7 ms
>
> In [93]: %time ok = M.charpoly()
> CPU times: user 18.9 s, sys: 186 ms, total: 19.1 s
> Wall time: 19.2 s
>
> In [94]: %time ok = dM.charpoly()
> CPU times: user 10 ms, sys: 170 µs, total: 10.2 ms
> Wall time: 10.5 ms
>
> That's roughly a 40x speed difference for computing the determinant and
> almost a 2000x difference for the characteristic polynomial. This is
> just for a 5x5 matrix: the speed difference will be more significant for
> larger matrices. Further optimisations are still possible for both det
> and charpoly: sparse fraction-free elimination could be used for det,
> for example.
>
> It is not currently clear how users should make direct use of
> DomainMatrix. Currently it is only used as part of linsolve and
> internally by the Matrix.eigenvects routine. Right now the thing that
> would make the biggest difference to matrices in sympy would be making
> more use of DomainMatrix to speed up calculations.
>
> For example this PR added the use of DomainMatrix for computing
> eigenvectors:
> https://github.com/sympy/sympy/pull/20614
>
> That makes eigenvector calculations extremely fast on master compared
> to sympy 1.7.1. Given this matrix:
>
> B = Matrix([
>     [ 5,  -5, -3,  2],
>     [-2,  -5,  0,  2],
>     [-2,  -7, -5, -2],
>     [ 7,  10,  3,  9]])
>
> on master the eigenvectors are computed using DomainMatrix and it takes
> about 200 milliseconds:
>
> In [2]: %time v = B.eigenvects()
> CPU times: user 206 ms, sys: 14 ms, total: 220 ms
> Wall time: 219 ms
>
> With sympy 1.7.1 DomainMatrix is not used and the calculation is
> extremely slow. I waited 10 minutes and then gave up:
>
> In [2]: %time v = B.eigenvects()
> ^C---------------------------------------------------------------------------
> KeyboardInterrupt
>
> Fraction-free algorithms don't necessarily make much sense with the
> current Matrix class because it uses Expr for the matrix elements, which
> does not gain much speed from avoiding division. With DomainMatrix they
> can make a huge difference because they allow calculations to be
> performed over a polynomial ring domain rather than a rational function
> field, which eliminates gcd calculations.
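(Another unquoted aside, to make the ring-versus-field point concrete. The method and domain names in the comments are based on the dev code as I understand it, not on anything shown in the thread, so treat them as assumptions:)

    from sympy import Matrix, symbols
    from sympy.polys.matrices import DomainMatrix

    x = symbols('x')

    # a matrix of polynomial entries is interpreted over the ring ZZ[x]
    dM = DomainMatrix.from_Matrix(Matrix([[x, 1], [1, x]]))
    print(dM.domain)        # expected: ZZ[x]

    # division-based routines can lift to the fraction field ZZ(x) explicitly
    dM_field = dM.to_field()
    print(dM_field.domain)  # expected: ZZ(x)

    # det can be computed without leaving the ring ZZ[x]
    print(dM.det())         # the determinant x**2 - 1 as a ZZ[x] element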
> I think that right now this is the direction to go to improve matrix
> calculations. Some examples of things that need doing:
>
> 1. Add docs explaining DomainMatrix.
> 2. Use DomainMatrix for jordan_form as well as for eigenvects, and in
>    particular for the matrix exponential.
> 3. Use DomainMatrix for e.g. charpoly and many other methods.
> 4. Add basic matrix convenience methods like slicing to DomainMatrix.
> 5. Add fraction-free and other algorithms to DomainMatrix.
> 6. Figure out how and when to use both the dense and sparse
>    implementations of DomainMatrix.
>
> The big thing that needs doing is providing users with a more direct
> way to use DomainMatrix: a public matrix class that uses DomainMatrix
> as its internal representation, so that it does not need to convert
> between representations each time it wants to compute something like
> eigenvects. This needs a bigger discussion, though, about what that
> public interface would be. For now I think it would be better to keep
> building up DomainMatrix with additional docs, algorithms and methods
> while making use of it internally to speed up particularly slow
> calculations.
>
> Oscar
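Regarding the public matrix class mentioned at the end: purely as an illustration, here is a hypothetical sketch of a thin wrapper that keeps a DomainMatrix as its internal representation. The class name and methods are invented for discussion and are not an existing or proposed SymPy API:

    from sympy import Matrix
    from sympy.polys.matrices import DomainMatrix

    class DomainBackedMatrix:
        """Hypothetical wrapper: holds a DomainMatrix internally so that
        repeated operations avoid converting between representations."""

        def __init__(self, rows):
            self._dm = DomainMatrix.from_Matrix(Matrix(rows))

        def det(self):
            # compute in the domain, convert the single result back to Expr
            return self._dm.domain.to_sympy(self._dm.det())

        def to_Matrix(self):
            return self._dm.to_Matrix()

For example, DomainBackedMatrix([[1, 2], [3, 4]]).det() would be expected to return -2 as an ordinary SymPy Integer.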