Hi Ronan,

On Sun, Oct 6, 2013 at 8:56 PM, Ronan Lamy <ronan.l...@gmail.com> wrote:
> I've been wondering how much of the speed of CSymPy compared to SymPy is due
> to the language (C++ vs Python) and how much is due to more efficient
> algorithms. So I've created a simple proof of concept
> [https://github.com/rlamy/pycsympy] implementing an algorithm similar to
> what CSymPy uses in pure Python.

Very cool, thanks for doing this! We need more people tackling (independently)
these basic algorithms and try to make them as fast as possible. I
think that's the best
way to make sure we don't get stuck with some inferior design.

>
> On the 'expand((x+y+z+w)**60)' benchmark, I get 165 ms on PyPy and 240 ms on
> CPython 2.7 vs 95 ms with CSymPy whereas I get 6.8 s on PyPy and 18.5 s on
> CPython 2.7 with SymPy.

On my slower laptop I got:

csympy + gcc 4.7.2: 171ms
pycsympy+Python 2.7: 441ms
sympyx + Python 2.7: 3270ms
sympyx + Cython: 2000ms

so the speedup is about 2.5x, just like you got.

>
> From this admittedly very limited data, I'd conclude that:
> * SymPy could (and should) be much faster

SymPy is doing a lot of things, like assumptions and so on. Definitely
the algorithms are not optimal.

> * Switching to C++ doesn't really seem worth it, considering that it makes
> it harder to find better algorithms.

However, it's very important to compare the same algorithms. For example,
pycsympy currently does not combine terms with powers:

In [28]: f = (z**2+z)**2

In [29]: f.expand()
Out[29]: 0 + 2*z**1 * z**2**1 + 1*z**2**2 + 1*z**2

while in csympy or sympy:

In [3]: f = (z**2+z)**2

In [4]: f.expand()
Out[4]: z^2 + 2z^3 + z^4

doing this simplification (e.g. z**1 -> z) and combining terms is some
work that has to be done. So this would have to be implemented in
pycsympy in order to have a fair comparison.


Besides that, one has to keep in mind that these benchmarks are highly
synthetic. But it's better than nothing, so I used them too in csympy.
They do seem to provide quite some information about the speed. I
personally like the expand2 benchmark in csympy:

(w + (w + z + y + x)^15) * (w + z + y + x)^15

The powers are expanded first, and then the benchmark is to measure
the time to expand these two long polynomials. Both with symbolic
algorithm (that can handle non-integer exponents, the expand2 in
csympy) and a special polynomial algorithm (expand2b in csympy).

Try to implement this benchmark in pycsympy.

The problem with regular (x+y+z)^100 expansion is that you just need
to calculate the multinomial coefficients, where C++ roughly brings 2x
or so speedup (if I remember correctly), as the Python algorithm is
already highly optimized. Here is the algorithm in C++:

https://github.com/certik/csympy/blob/master/src/pow.cpp#L156

I haven't spent too much time optimizing this particular piece of
code, one should try couple different datastructures in C++ for the
dictionary and vector, different hashing etc.

Once you have the coefficients, you need to quickly construct the
expression. In C++, this is done here:

https://github.com/certik/csympy/blob/master/src/pow.cpp#L200

in sympyx that is done here:

https://github.com/certik/sympyx/blob/master/sympy_pyx.pyx#L546

while in pycsympy this is done here:

https://github.com/rlamy/pycsympy/blob/master/pycsympy/add.py#L85

which is a lot simpler --- but it doesn't handle cases like the one I
reported above. Can you try adding this in? It might affect the speed
quite a bit. Another case is this:

In [6]: ((x+3)**2).expand()
Out[6]: 0 + 6*1**1 * x**1 + 9*1**2 + 1*x**2

There might be more. It takes quite some time to nail these special
cases down, and then the code gets "bloated" a bit (e.g. you can see I
have 3 if statements in the C++ code).

Btw, any ideas why your code is so much faster than sympyx? I can't
see much of a difference between the algorithms in sympyx, csympy and
pycsympy.

So at the moment it's hard to tell whether your approach truly can get
to 2.5x speed difference from the C++ code. If you can keep your
current speed once you implement these fixes, then that would be very
impressive indeed.

Ondrej

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sympy+unsubscr...@googlegroups.com.
To post to this group, send email to sympy@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to