Comment #48 on issue 2607 by asmeurer: as_numer_denom() is too slow
http://code.google.com/p/sympy/issues/detail?id=2607

Here are timings for the many denoms few numers idea from comment 19. All timings (obviously) done in the and branch:

Cache on:

In [1]: numers, denoms = zip(*((Symbol('n%d'%i),Symbol('d%d'%i)) for i in xrange(100)))

In [2]: a = Add(*(n/d for n in numers[:10] for d in denoms))

In [3]: %timeit a.as_numer_denom1()
^C
KeyboardInterrupt: # Took too long

In [4]: %timeit a.as_numer_denom2()
1 loops, best of 3: 1.98 s per loop

In [5]: %timeit a.as_numer_denom3()
1 loops, best of 3: 163 ms per loop

In [6]: %timeit a.as_numer_denom4()
1 loops, best of 3: 120 ms per loop

In [7]: %timeit a.as_numer_denom4b()
10 loops, best of 3: 170 ms per loop

In [8]: %timeit a.as_numer_denom5()
1 loops, best of 3: 82.6 ms per loop

In [9]: %timeit a.as_numer_denom6()
1 loops, best of 3: 234 ms per loop

In [10]: %timeit a.as_numer_denom7()
1 loops, best of 3: 82.1 ms per loop

In [11]: %timeit a.as_numer_denom8()
1 loops, best of 3: 100 ms per loop

In [12]: %timeit together(a)
1 loops, best of 3: 4.73 s per loop

Cache off:

In [1]: numers, denoms = zip(*((Symbol('n%d'%i),Symbol('d%d'%i)) for i in xrange(100)))

In [2]: a = Add(*(n/d for n in numers[:10] for d in denoms))

In [3]: %timeit a.as_numer_denom2()
1 loops, best of 3: 21.5 s per loop

In [4]: %timeit a.as_numer_denom3()
1 loops, best of 3: 16 s per loop

In [5]: %timeit a.as_numer_denom4()
1 loops, best of 3: 28.1 s per loop

In [6]: %timeit a.as_numer_denom4b()
1 loops, best of 3: 22.9 s per loop

In [7]: %timeit a.as_numer_denom5()
1 loops, best of 3: 1.46 s per loop

In [8]: %timeit a.as_numer_denom6()
1 loops, best of 3: 353 ms per loop

In [9]: %timeit a.as_numer_denom7()
1 loops, best of 3: 461 ms per loop

In [10]: %timeit a.as_numer_denom8()
1 loops, best of 3: 3.87 s per loop

In [11]: %timeit together(a)
1 loops, best of 3: 14 s per loop

What can we surmise from this? First, that all implementations are hitting some cache intensive parts of the code---some more than others. Second, your variants are better than any of mine or together in any case.

There's more to the story, though. as_numer_denom7(), which appears to be one of the fastest methods, returns a nested expression, which must be denested in many cases to do anything useful with it (e.g., to make a Poly out of it). This takes more time, and I think it's safe to assume that it would make it slower than most other contending methods if we tacked on that time. However, it's not as nested as the output of as_numer_denom3(). Just how nested is it? The depth of the output of as_numer_denom3() is equal to the number of terms. as_numer_denom7() seems to be more like the log of the number of terms, which is actually reasonable (for 100 terms, the depth is only 7).

So how about the dual test, many numers and few denoms? This will rely on a good common denominator optimization.

Cache on:

In [1]: numers, denoms = zip(*((Symbol('n%d'%i),Symbol('d%d'%i)) for i in xrange(100)))

In [2]: a = Add(*(n/d for n in numers for d in denoms[:10]))

In [3]: %timeit a.as_numer_denom1()
^C
KeyboardInterrupt:

In [4]: %timeit a.as_numer_denom2()
1 loops, best of 3: 992 ms per loop

In [5]: %timeit a.as_numer_denom3()
1 loops, best of 3: 93.1 ms per loop

In [6]: %timeit a.as_numer_denom4()
1 loops, best of 3: 85.7 ms per loop

In [7]: %timeit a.as_numer_denom4b()
10 loops, best of 3: 104 ms per loop

In [8]: %timeit a.as_numer_denom5()
10 loops, best of 3: 55.8 ms per loop

In [9]: %timeit a.as_numer_denom6()
10 loops, best of 3: 56.5 ms per loop

In [10]: %timeit a.as_numer_denom7()
10 loops, best of 3: 55.8 ms per loop

In [11]: %timeit a.as_numer_denom8()
1 loops, best of 3: 53 ms per loop

In [12]: %timeit together(a)
1 loops, best of 3: 398 ms per loop

Cache off:

In [2]: a = Add(*(n/d for n in numers for d in denoms[:10]))

In [3]: %timeit a.as_numer_denom2()
1 loops, best of 3: 11.1 s per loop

In [4]: %timeit a.as_numer_denom3()
1 loops, best of 3: 1.68 s per loop

In [5]: %timeit a.as_numer_denom4()
1 loops, best of 3: 1.59 s per loop

In [6]: %timeit a.as_numer_denom4b()
1 loops, best of 3: 1.58 s per loop

In [7]: %timeit a.as_numer_denom5()
1 loops, best of 3: 189 ms per loop

In [8]: %timeit a.as_numer_denom6()
10 loops, best of 3: 248 ms per loop

In [9]: %timeit a.as_numer_denom7()
1 loops, best of 3: 192 ms per loop

In [10]: %timeit a.as_numer_denom8()
1 loops, best of 3: 197 ms per loop

In [11]: %timeit together(a)
1 loops, best of 3: 1.31 s per loop

I'd say this doesn't tell us anything new.

Now, we should make as_numer_denom9, 10, ... based on 5-8 except it does structural gcd like together() does (see comment 29), so we can see if this makes things faster or slower.

(sorry for the long comments of timings, but I think it's important to document this)

--
You received this message because you are subscribed to the Google Groups 
"sympy-issues" group.
To post to this group, send email to sympy-issues@googlegroups.com.
To unsubscribe from this group, send email to 
sympy-issues+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sympy-issues?hl=en.

Reply via email to