On Sat, 20 Feb 2021 at 07:40, Kartik Sethi <kartiks31...@gmail.com> wrote:
>
> Hey, my name is Kartik Sethi. I am interested in taking part in GSoC this 
> year. I have been contributing to SymPy for the past couple of months.

Hi Kartik,

> I have come up with a few ideas for GSoC. I am focusing primarily on the 
> SymPy matrices module.
>
> Implementing complete Singular Value Decomposition. SymPy currently has a 
> function for condensed SVD (I am a contributor on this, #20761).
> Using the complete SVD, I can implement polar decomposition.
> Hermite Normal Form. There is an old unmerged PR (#18534); I could work on 
> this and complete it.
> Sparse fraction-free algorithm: another old unfinished PR (#9133).

The Matrix class does not necessarily get much benefit from
fraction-free algorithms because of the internal representation.
DomainMatrix does though (see below).

> Jordan Canonical Form

This is already implemented as the Matrix.jordan_form method, so I'm
not sure what you are proposing. The main way it could be improved is
by having the method use DomainMatrix internally just like eigenvects
does (see below).
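
For reference, the existing method can already be used like this (the
matrix here is just an arbitrary example I made up):

from sympy import Matrix

M = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])

# jordan_form returns P and J with M equal to P*J*P**-1 and J in
# Jordan canonical form.
P, J = M.jordan_form()
print(J)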

> Improve time complexity for QR decomposition of upper Hessenberg matrices. There is 
> an algorithm using Householder reflectors that can cut the time complexity down to 
> O(n^2) instead of the current O(n^3).

When I've timed calculations with Matrix they almost never match the
big-O behaviour that is typically expected for a given algorithm. I
think there are so many other symbolic computations going on that it
slows down more than it should as n grows. Operations with
DomainMatrix, on the other hand, do tend to have the big-O cost you
would expect once you take bit complexity into account.

This is not listed anywhere on the ideas page or really documented
anywhere yet, but there is a new, mostly internal, implementation of
matrices in the sympy.polys.matrices module. This new implementation
is called DomainMatrix and is based on the polys domains. The domains
themselves are explained in these recently added docs:
https://docs.sympy.org/dev/modules/polys/domainsintro.html
There aren't any docs for DomainMatrix yet and it is not that easy
to use, but you can find more information in this recent PR, which
links to other PRs:
https://github.com/sympy/sympy/pull/20780

The easiest way to test out DomainMatrix is by converting from a
normal Matrix like this:

In [75]: from sympy.polys.matrices import DomainMatrix

In [76]: M = Matrix(range(25)).reshape(5, 5)

In [77]: M
Out[77]:
⎡0   1   2   3   4 ⎤
⎢                  ⎥
⎢5   6   7   8   9 ⎥
⎢                  ⎥
⎢10  11  12  13  14⎥
⎢                  ⎥
⎢15  16  17  18  19⎥
⎢                  ⎥
⎣20  21  22  23  24⎦

In [78]: dM = DomainMatrix.from_Matrix(M)

In [80]: dM
Out[80]: DomainMatrix([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12,
13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]], (5, 5), ZZ)
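
Continuing that session, the shape and domain are available as
attributes, and you can convert back to an ordinary Matrix when you
need Expr entries again (I believe the method is to_Matrix, but check
the code on master if the name differs):

dM.shape    # (5, 5)
dM.domain   # ZZ, i.e. the integers
M2 = dM.to_Matrix()   # back to an ordinary Matrix (name from memory)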

DomainMatrix is much faster than Matrix for many calculations. The
difference is observable for a simple matrix of integers like this:

In [81]: %time M.det()
CPU times: user 2.4 ms, sys: 281 µs, total: 2.68 ms
Wall time: 2.73 ms
Out[81]: 0

In [82]: %time dM.det()
CPU times: user 143 µs, sys: 161 µs, total: 304 µs
Wall time: 310 µs
Out[82]: mpz(0)

The difference becomes really noticeable if the matrix has algebraic
elements like I or sqrt(2) or has symbols like x or y in it:

In [88]: M = randMatrix(5, 5) + randMatrix(5, 5) * I

In [89]: M
Out[89]:
⎡26 + 27⋅ⅈ  41 + 76⋅ⅈ  98 + 76⋅ⅈ  64 + 52⋅ⅈ  48 + 2⋅ⅈ ⎤
⎢                                                     ⎥
⎢65 + 81⋅ⅈ  49 + 25⋅ⅈ  39 + 82⋅ⅈ    74⋅ⅈ     77 + 3⋅ⅈ ⎥
⎢                                                     ⎥
⎢11 + 59⋅ⅈ  49 + 67⋅ⅈ  50 + 25⋅ⅈ  69 + 24⋅ⅈ  79 + 31⋅ⅈ⎥
⎢                                                     ⎥
⎢42 + 29⋅ⅈ  33 + 72⋅ⅈ  37 + 59⋅ⅈ  10 + 12⋅ⅈ  27 + 62⋅ⅈ⎥
⎢                                                     ⎥
⎣97 + 7⋅ⅈ   4 + 26⋅ⅈ   8 + 30⋅ⅈ   47 + 84⋅ⅈ  81 + 6⋅ⅈ ⎦

In [90]: dM = DomainMatrix.from_Matrix(M)

In [91]: %time ok = M.det()
CPU times: user 486 ms, sys: 4.15 ms, total: 490 ms
Wall time: 492 ms

In [92]: %time ok = dM.det()
CPU times: user 12 ms, sys: 422 µs, total: 12.4 ms
Wall time: 12.7 ms

In [93]: %time ok = M.charpoly()
CPU times: user 18.9 s, sys: 186 ms, total: 19.1 s
Wall time: 19.2 s

In [94]: %time ok = dM.charpoly()
CPU times: user 10 ms, sys: 170 µs, total: 10.2 ms
Wall time: 10.5 ms

That's roughly a 40x speed difference for computing the determinant
and nearly a 2000x difference for the characteristic polynomial. This
is just for a 5x5 matrix: the speed difference will be more
significant for larger matrices. Further optimisations are still
possible for both det and charpoly: sparse fraction-free elimination
could be used for det, for example.
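
To show what fraction-free means here, below is a minimal dense sketch
of a Bareiss-style determinant in plain Python over the integers. It
is not how DomainMatrix would actually implement it (that should be
sparse and work on the internal representation); it just illustrates
that every intermediate value stays an exact integer, so no fractions
or gcd calculations ever appear:

def bareiss_det(rows):
    # Fraction-free (Bareiss) determinant of a square integer matrix.
    # The division by prev below is always exact, so all intermediate
    # values remain integers.
    A = [list(row) for row in rows]
    n = len(A)
    sign = 1
    prev = 1
    for k in range(n - 1):
        if A[k][k] == 0:
            # swap in a nonzero pivot row if one exists
            for i in range(k + 1, n):
                if A[i][k] != 0:
                    A[k], A[i] = A[i], A[k]
                    sign = -sign
                    break
            else:
                return 0  # whole pivot column is zero: singular
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] = (A[i][j] * A[k][k] - A[i][k] * A[k][j]) // prev
        prev = A[k][k]
    return sign * A[n - 1][n - 1]

print(bareiss_det([[2, 3, 1], [4, 1, -1], [0, 5, 2]]))  # prints 10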

It is not currently clear how users should make use of DomainMatrix
directly. At the moment it is only used as part of linsolve and
internally by the Matrix.eigenvects routine. Right now the thing that
would make the biggest difference to matrices in SymPy would be making
more use of DomainMatrix to speed up calculations.
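
As a concrete illustration of that indirect use: an ordinary linsolve
call like the one below (the system itself is just one I made up)
already goes through DomainMatrix under the hood on master, so users
get the speed-up without ever touching DomainMatrix themselves.

from sympy import symbols, linsolve

x, y, z = symbols('x y z')
# On master this solve is routed through DomainMatrix internally.
sol = linsolve([x + y + z - 1, x + 2*y + 3*z - 6, 2*x + y - z], (x, y, z))
print(sol)  # {(-7, 11, -3)}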

For example this PR added the use of DomainMatrix for computing eigenvectors:
https://github.com/sympy/sympy/pull/20614

That makes eigenvector calculations much faster on master than with
SymPy 1.7.1. Given this matrix:

B = Matrix([
[ 5, -5, -3,  2],
[-2, -5,  0,  2],
[-2, -7, -5, -2],
[ 7, 10,  3,  9]])

On master the eigenvectors are computed using DomainMatrix and it
takes about 200 milliseconds:

In [2]: %time v = B.eigenvects()
CPU times: user 206 ms, sys: 14 ms, total: 220 ms
Wall time: 219 ms

With SymPy 1.7.1 DomainMatrix is not used and the calculation is
extremely slow. I waited 10 minutes and then gave up:

In [2]: %time v = B.eigenvects()
^C---------------------------------------------------------------------------
KeyboardInterrupt

Fraction-free algorithms don't necessarily make much sense with the
current Matrix class because it uses Expr for the matrix elements,
and Expr does not necessarily get much of a speed-up from avoiding
division. With DomainMatrix they can make a huge difference because
they let us perform calculations over a polynomial ring domain rather
than a rational function field, which eliminates gcd calculations.
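
To see the polynomial-ring point concretely, here is a small sketch;
the domain I note in the comment is what I would expect to be chosen,
but print dM.domain rather than taking my word for it:

from sympy import Matrix, symbols
from sympy.polys.matrices import DomainMatrix

x = symbols('x')
M = Matrix([[x, 1], [1, x]])

dM = DomainMatrix.from_Matrix(M)
# The entries are polynomials in x, so I would expect the domain to be
# the polynomial ring ZZ[x] rather than the fraction field ZZ(x).
# Elimination over ZZ[x] can be done fraction-free, avoiding the gcd
# calculations that arithmetic in ZZ(x) would need.
print(dM.domain)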

I think that right now this is the direction to go to improve matrix
calculations. Some examples of things that need doing:

1. Add docs explaining DomainMatrix.
2. Use DomainMatrix for jordan_form as well as for eigenvects, and in
particular for the matrix exponential.
3. Use DomainMatrix for e.g. charpoly and many other methods.
4. Add basic matrix convenience methods like slicing to DomainMatrix.
5. Add fraction-free and other algorithms to DomainMatrix.
6. Figure out how and when to use both the dense and sparse
implementations of DomainMatrix.

The big thing that needs doing is providing users with a more direct
way to use DomainMatrix: a public matrix class that uses DomainMatrix
as its internal representation so that it does not need to convert
between representations each time it wants to compute something like
eigenvects. This needs a bigger discussion, though, about what that
public interface would be. For now I think it would be better to keep
building up DomainMatrix with additional docs, algorithms and methods
while making use of it internally to speed up particularly slow
calculations.
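
Just to make the idea concrete (this is purely a hypothetical sketch,
not a proposal for the actual interface, which as I say needs
discussion): such a wrapper could build the DomainMatrix once and
delegate to it, so repeated operations skip the conversion step.

from sympy import Matrix
from sympy.polys.matrices import DomainMatrix

class WrappedMatrix:  # hypothetical name, for illustration only
    """Holds a DomainMatrix internally so repeated operations avoid
    converting between representations each time."""

    def __init__(self, rows):
        self._dM = DomainMatrix.from_Matrix(Matrix(rows))

    def det(self):
        # det is computed in the domain (e.g. an mpz over ZZ);
        # convert the result back to an ordinary SymPy expression.
        return self._dM.domain.to_sympy(self._dM.det())

W = WrappedMatrix([[1, 2], [3, 4]])
print(W.det())  # -2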


Oscar
