[sympy] Re: Identifying repeated subexpressions in systems of equations

Ondrej Certik Wed, 18 Jun 2008 01:29:58 -0700

> It is certainly correct for that representation. However, my claim is
> that this representation is suboptimal for the purposes to which one
> wants to apply common subexpression elimination. All of the use cases
> hold as important one of these two criteria:
>
>  1. flop count
>  2. readability
>
> CSE on the default sympy representation pessimizes both of these. For
> floating point evaluation, - is an atomic operation. You would not
> generate code that breaks that up into a Mul and an Add.
>
>  x0 = -1*y
>  x1 = x + x0
>
> versus
>
>  x1 = x - y
>
> Readability is much the same.


Yes, I completely agree with this.

>> It does, but it can be easily fixed, see above. I.e. fixing your
>> algorithm. :) The other option, as you said, is fixing sympy. We would
>> have to introduce a Sub class.
>> But I think it would just make things more complex. No?
>
> For all of sympy, yes. However, I think we could keep these
> transformations local to the cse module. You would have to implement
> just enough of Sub to replace Add(x, Mul(-constant, y)) and go back
> again.

That's basically what my patch does. In terms of maintainability, I
think it's easier to maintain the two added lines, rather than a full
Sub class.
(Unless it doesn't cover all the corner cases, in which case we might
refactor it into a regular Sub class).

>
> Similarly, breaking up a power of a product into a product of powers
> in the default sympy representation hides optimization opportunities.
> CSE could work better with a different representation.

This also pops up from time to time. The current implementation does
what you'd do by hand in 90% of the cases, i.e. it does the right
thing.
I think we can patch Pow to do something like this:

e = Pow(a*b, 2, expand=False)
f = Pow(a*b, 2)

then

e = (a*b)**2
f = a**2 * b**2

and

e  != f

We can also add methods like expand(), collect() etc. I think it
wouldn't break anything and it can be useful from time to time.

>
>>> So I think there needs to be a bit of a preprocessing step to optimize
>>> the expression specifically for cse() before it does its magic.
>
> I think this is the correct approach rather than putting special cases
> inside cse() itself. Preprocessing to a more tractable representation,
> CSE, then postprocessing allows us to add optimizations incrementally
> while keeping the core algorithm clean and testable. sympy in general
> doesn't need to change outside of the cse module.

Right. As I said above, if the only special case are those 2 lines in
my patch, then I think that's the best way. If, however, we realize we
need to add more special cases, then I'd suggest to implement the
other representation, for exactly the reasons you said.

Or to put it another way, if someone sends a patch with the things you
describe, it will surely go in (if it's robust + maintainable),
otherwise I suggest to push in your code + my 2 lines patch.

Ondrej

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to sympy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sympy?hl=en
-~----------~----~----~----~------~----~------~--~---

[sympy] Re: Identifying repeated subexpressions in systems of equations

Reply via email to