@Richard,

Thanks for the Jenks paper, that was a good read. I also read your paper on
semantic matching http://dl.acm.org.ezp2.lib.umn.edu/citation.cfm?id=806300,
which only detailed the use of matching in Macsyma, and not its underlying
workings. I failed to find anything on how it actually worked under the
hood.

@Aaron

The Axiom system assumes variables shouldn't be identities, unless noted
that they can be. Otherwise you may match patterns that you'd rather not.
For example, given the set of patterns:

T = {a*x**2 + b*x + c, cos(x)}

where a, b, c, and x are variables. Both patterns will match cos(d) (the
quadratic matches on {a: 0, b: 0, x: whatever, c: cos(d)}). The user
probably only wants the second one to match. Assuming all variables are
allowed to be 1 or 0 by default results in a combinatorially large number
of potential matches.

The way the Axiom system does this is that if a variable is allowed to be
an identity (1 or 0), additional patterns with the variable at these values
are generated, and added to the ruleset. So specifying `cos(x)*a`, with x
allowed to go to identity creates 2 patterns: {cos(x)*a, a}. This wouldn't
be needed for simple cases, but for things like matching cos(x)*a -> a with
{x: 1} this would be necessary.

I'm not certain how this system will be used in SymPy. Will there be
multiple, small rulesets (a set of trig rules, a set of log rules, etc...),
or will there be some giant ruleset? With the net structure I'm using, the
cost of more rules in a ruleset is small (~none for non-AC patterns), but
it does exist.

- Jim Crist

On Sat, Nov 29, 2014 at 10:29 AM, Aaron Meurer <asmeu...@gmail.com> wrote:

>
>
> On Fri, Nov 28, 2014 at 6:40 PM, Richard Fateman <fate...@gmail.com>
> wrote:
>
>>
>>
>> On Thursday, November 27, 2014 7:49:30 PM UTC-8, James Crist wrote:
>>>
>>> Oh boy, this is going to be a big post. Responding to everyone in turn:
>>>
>>> *@Aaron:*
>>>
>>>
>>> > Nonlinear, AC pattern matching is NP complete. Linear AC pattern
>>>> matches
>>>
>>> > can be found in polynomial time.
>>>>
>>>> Interesting. Why is that?
>>>>
>>>
>>> Joachim got it right, having each match constrained by other matches,
>>> and a (potentially) combinatorially large set of available match
>>> combinations makes the nonlinear, associative-commutative problem much
>>> harder. This is a good paper on the subject:
>>> http://www.sciencedirect.com/science/article/pii/S0747717187800275
>>>
>>>
>>> >The question is, do we want the most general match first, or the
>>>> > most specific?
>>>>
>>>> Whichever one is the correct behavior for a dispatch system. I *think*
>>>> that means specific, but I could be getting it backwards.
>>>>
>>>> I'm assuming that the whole point of having a set of patterns rather
>>>> than a single pattern at a time is so you can do this.
>>>>
>>>
>>> That's exactly the purpose. In that case, matching on constants rather
>>> than variables is the desired behavior (more specific patterns first, then
>>> more general). I'll think about ways to make this configurable, but right
>>> now I'm more interested in making it *work*.
>>>
>>>
>>> *@Francesco*
>>>
>>>
>>> That's cool. Is it addressing this issue?
>>>> https://github.com/sympy/sympy/issues/7748
>>>
>>>
>>> Didn't know we had an issue on this, but yes - in part. The main purpose
>>> behind this was originally to get a good matcher for tiling the expression
>>> tree into computations for code generation, as there isn't necessarily a
>>> 1-to-1 match between nodes and C/Fortran functions. But since I had to
>>> write it for that, it made sense to make it general so it can be used
>>> everywhere else it's needed.
>>>
>>> You mean expressions like *Add(x, x, evaluate=False)* ? I think there
>>>> should be a way to check where an expression is in a canonical order.
>>>
>>>
>>> No, I mean anything where a variable occurs in a pattern more than once.
>>> So even sin(x)*cos(x), where x is a variable is nonlinear.
>>>
>>>
>>> Maybe it is preferrable that the first lazily generated substitution
>>>> appears in the same order as Mathematica's pattern matching. It would make
>>>> Mathematica's code more easily portable to SymPy.
>>>
>>>
>>> I think that would be incredibly hard to do. Especially since I have no
>>> access to or experience with mathematica. It all depends on the ordering
>>> algorithm they use for generating the potential matches.
>>>
>>> What kind of AST are you matching?
>>>
>>>
>>> It only depends on a preorder-traversal structure. To use the code right
>>> now, one needs to define a class that has 4 methods - `current`,
>>> `sub_terms`, `next`, and `skip`, which give the current term, its subterms,
>>> advance the preorder-traversal, and skip over a branch, respectively. This
>>> should allow it to be used with any tree-like term. There might be a better
>>> abstraction, but that's what I have right now in my prototype.
>>>
>>> One last complication: have you had any thought of how this could be
>>>> extended to an AST node containing a PermutationGroup instance determining
>>>> how its arguments commute/anticommute? That could have some applications in
>>>> the tensor module, but I guess it's hard or impossible to find a polynomial
>>>> algorithm for that.
>>>>
>>>
>>> I'm not sure what you mean by that. Can you explain?
>>>
>>> By the way, are you using Wild objects, or are you specifying wilds by
>>>> an extra parameter?
>>>> In unify.rewriterule, you just put in two expressions, and specify
>>>> which ones are variables (i.e. wilds) in another parameter. Furthermore,
>>>> there is an option to put a filtering lambda function and assumptions, so
>>>> that matches should be compatible with a test function and/or SymPy's
>>>> new-style assumptions.
>>>
>>>
>>> I'm using a similar method to the unification code. So far only
>>> "match-all" variables are supported - once it works it should be simple to
>>> modify the post-processing step to match on type/assumptions.
>>>
>>>
>>> *@Matt*
>>>
>>> Awesome, thanks for doing this.
>>>
>>>
>>> Well, thanks to you as well. This all started with the flatterms paper
>>> you sent me, along with the (ridiculously) long list of related papers to
>>> it I found.
>>>
>>> Another major use case is selecting the "best" match.  In this case we
>>>> might also want to control the order in which matches come out of the
>>>> sequence so that matches that we think are best come first.
>>>
>>>
>>> In my ideal world I might be able to decide the order in which we
>>>> traverse the tree of possible matches.  I don't know exactly how your
>>>> algorithm works but, in some algorithms, we obtain a set of partial matches
>>>> at each step and then have to decide which branch to pursue first.  I've
>>>> found that accepting a user-provided objective function is useful.  It
>>>> allows the user to specify a greedy search strategy.
>>>>
>>>
>>> The algorithm uses a discrimination net to do a "filtering" of the set
>>> of patterns, down to those that (potentially) match the given query term.
>>> At each node there are only 2 possible choices - follow the current
>>> constant symbol (if such a branch exists), or follow the variable symbol
>>> (all variables are treated as the same variable at this point).
>>> Backtracking is then used to eventually traverse all possible branches and
>>> get the full set of matches. Given that there are only two choices at each
>>> node, a user defined function would either say "follow variables first" or
>>> "follow constants first".
>>>
>>> After a the set of matches for the term is formed, then a matching
>>> substitution is determined. For non AC patterns, determining a matchset
>>> from the pattern is trivial, and is basically "done" after the
>>> discrimination net step. For AC, linear patterns, these can be simplified
>>> slightly based on prior known information from the discrimination net step,
>>> and then fed into a general, less efficient matching/unification routine.
>>> The nonlinear, AC patterns have to go right to this step.
>>>
>>> The thought behind this process is that most patterns will be removed by
>>> the cheap first "filtering" step, leaving only a small subset which require
>>> an expensive calculation to determine a matching substitution on.
>>>
>>> Given the above, a user defined "greedy" algorithm could generate the
>>> full set of potential matches, and then order them based on preference
>>> before diving into the expensive generation of a matching substitution.
>>> This may be a good idea.
>>>
>>>
>>> *@ Joachim*
>>>
>>> Covered some of your points elsewhere in this post.
>>>
>>> Is it a basic building block for other algorithms to use? Then favor
>>>> predictable performance over features. Something that has unpredictable
>>>> performance isn't very attractive to an algorithm author.
>>>>
>>>> Are you aiming for something that can solve as many situations as
>>>> possible? Then aim for more features.
>>>> I'm somewhat sceptical whether just pattern matching will be the right
>>>> approach, but I suspect you aren't after this one anyway :-)
>>>
>>>
>>> Aiming for a building block. I have my use cases for such a thing (see
>>> above), but it should be generally useful elsewhere. From what I
>>> understand, pattern matching and rewrite rules are common features in most
>>> computer algebra systems (Mathematica, maxima, etc...). The main reason for
>>> these questions is that some added space complexity, we can save some notes
>>> on the matches in the filtering step, which can then be used to provide a
>>> slight speedup in the creation of matching substitions. Or in the case of
>>> non-ac patterns, completely solve them. Doing so by default slightly slows
>>> down the determination of the set of patterns, but speeds up the creation
>>> of matching substitions.
>>>
>>> *@Richard*
>>>
>>> I don't know if your AC matcher is intended to be used for (say)
>>>>  arithmetic expressions
>>>> or something else.  But if arithmetic -- your stuff also needs to deal
>>>> with identities.
>>>> In that case if
>>>> you don't handle identities, your matcher becomes far less interesting.
>>>>
>>>
>>> Based on what I've read on the subject, I gather by identities you mean:
>>>
>>> pattern = x*y*z
>>> query =  a*b
>>>
>>> Should match with {x: a, y: b, z: 1}, or a different permutation of
>>> that, where 1 is the identity for the mulitplication function node.
>>> Similarly 0 would be that for addition. Is that correct? I'm a mechanical
>>> engineer, not a computer scientist, so this is all a bit foreign to me.
>>>
>>> That is the idea.
>> Also pattern x*y  can match  say   cos(a),    with x=cos(a) and y=1.
>> or y=cos(a), x=1.
>>
>
> I agree this is important. For instance, if the ODE solver tries to match
> a*f(x).diff(x, x) + b*f(x).diff(x) + c*f(x) + d, where a, b, c, and d are
> wild symbols that should be independent of f(x), it should work even if a,
> b, c, or d are 1 or 0. I suspect getting them to match 1 is not difficult
> (a simple way if you can already make a*b match x*y*z with a = x*y and b =
> z, just treat every Mul as if it had extra 1s in it, one for each wild in
> the pattern).  I'm guessing making a*x match 0 with a = 0 is a bit more
> difficult, unless there is some other "trick" to make it work.
>
> Another question is, how can you teach the pattern matcher that a function
> maps to an identity, like cos(a)*x can match x with a = 0, or x**a*y can
> match y with a = 0?
>
> Getting these "trivial" matching cases to work is very important.
> Otherwise, you have to write code that checks them all manually, which sort
> of defeats the purpose of using the pattern matcher.
>
> Aaron Meurer
>
>
>>
>>
>>> The literature accessible to me on Associative-Commutative-Identity
>>> many-to-one pattern matching is lacking, or I'm not the best at finding it.
>>> After some thinking, the formation of matches should still work mod
>>> identities, if they are as I described above.
>>>
>> Probably not without an exponential increase in cost, worst case.
>>
>>
>>> I'm not certain that the lack of this constitutes "far less
>>> interesting", but I can see how it could be useful.
>>>
>>
>> I venture to say that matching arithmetic/algebraic expressions without
>> identities 0, 1  for plus and times
>> would be nearly useless.
>> How many patterns would you need for "quadratic in x" ?
>>
>>
>>>
>>> There's a long history of pattern matching "fast" including work by
>>>> Richard Jenks, (Scratchpad, predecessor of Axiom).
>>>
>>>
>>> The papers I've read have been almost exclusively from the theorem
>>> proving world. The work of Jim Christian on HIPER and Leo Bachmair on
>>> REVEAL specifically, which only cover matching mod AC. I only found one
>>> relevant paper by Jenks, and it had only 3 paragraphs on pattern matching
>>> (just describing the syntax of it in his system).
>>>
>> wrong paper
>> *A pattern compiler*
>> Full Text:[image: PDF]PDF [image: Buy this Article]Buy this Article
>> <https://dl.acm.org/purchase.cfm?id=806324&CFID=556096752&CFTOKEN=59191841>
>> Author:Richard D. Jenks
>> <http://dl.acm.org/author_page.cfm?id=81100440591&coll=DL&dl=ACM&trk=0&cfid=556096752&cftoken=59191841>Published
>> in:· ProceedingSYMSAC '76 Proceedings of the third ACM symposium on
>> Symbolic and algebraic computationPages 60-65
>> ACM <http://www.acm.org/publications> New York, NY, USA ©1976
>> table of contents
>> <http://dl.acm.org/citation.cfm?id=800205&picked=prox&cfid=556096752&cftoken=59191841>
>>  doi>10.1145/800205.806324 <http://dx.doi.org/10.1145/800205.806324>[image:
>> A pattern compiler][image: Published by ACM] 1976 Article
>> [image: Bibliometrics Data]  Bibliometrics· Downloads (6 Weeks): 0
>> · Downloads (12 Months): 35
>> · Downloads (cumulative): 197
>> · Citation Count: 2
>>
>> *Tools and Resources*
>>
>>    - Buy this Article
>>    
>> <https://dl.acm.org/purchase.cfm?id=806324&CFID=556096752&CFTOKEN=59191841>
>>    - Recommend the ACM DL
>>    to your organization
>>    <https://dl.acm.org/reqdl.cfm?CFID=556096752&CFTOKEN=59191841>
>>    - Request Permissions
>>    
>> <http://dl.acm.org/rightslink.cfm?id=806324&parent_id=800205&CFID=556096752&CFTOKEN=59191841>
>>    - TOC Service: [image: Spacer Image reserves space for checkmark when
>>    TOC Service is updated]
>>       -
>>       [image: Toc Alert via Email]Email
>>       
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>>
>>       - [image: Toc Alert via Email]RSS
>>       
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>>    - Save to Binder
>>    
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>>    - Export Formats:
>>       - BibTeX
>>       - EndNote
>>       - ACM Ref
>>
>> Share:
>> Share on email
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>Share
>> on facebook
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>Share
>> on google_plusone_share
>> <http://www.addthis.com/bookmark.php?v=300&winname=addthis&pub=acm&source=tbx-300&lng=en-US&s=google_plusone_share&url=http%3A%2F%2Fdl.acm.org%2Fcitation.cfm%3Fid%3D806324%26dl%3DACM%26coll%3DDL%26CFID%3D556096752%26CFTOKEN%3D59191841&title=A%20pattern%20compiler&ate=AT-acm/-/-/5479238f027a9e53/2&frommenu=1&uid=5479238f7b571dbb&ct=1&pre=https%3A%2F%2Fwww.google.com%2F&tt=0&captcha_provider=nucaptcha>Share
>> on twitter
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>Share
>> on researchgate
>> <http://www.addthis.com/bookmark.php?v=300&winname=addthis&pub=acm&source=tbx-300&lng=en-US&s=researchgate&url=http%3A%2F%2Fdl.acm.org%2Fcitation.cfm%3Fid%3D806324%26dl%3DACM%26coll%3DDL%26CFID%3D556096752%26CFTOKEN%3D59191841&title=A%20pattern%20compiler&ate=AT-acm/-/-/5479238f027a9e53/3&frommenu=1&uid=5479238f1d616cd0&ct=1&pre=https%3A%2F%2Fwww.google.com%2F&tt=0&captcha_provider=nucaptcha>Share
>> on reddit
>> <http://www.addthis.com/bookmark.php?v=300&winname=addthis&pub=acm&source=tbx-300&lng=en-US&s=reddit&url=http%3A%2F%2Fdl.acm.org%2Fcitation.cfm%3Fid%3D806324%26dl%3DACM%26coll%3DDL%26CFID%3D556096752%26CFTOKEN%3D59191841&title=A%20pattern%20compiler&ate=AT-acm/-/-/5479238f027a9e53/4&frommenu=1&uid=5479238f2b25b215&ct=1&pre=https%3A%2F%2Fwww.google.com%2F&tt=0&captcha_provider=nucaptcha>
>> |More Sharing Services
>> <http://www.addthis.com/bookmark.php?v=250&username=acm>
>> *Tags:* algorithms
>> <http://dl.acm.org/results.cfm?query=%22algorithms%22&querydisp=%22algorithms%22&termshow=matchall&cfid=556096752&cftoken=59191841>
>>  compilers
>> <http://dl.acm.org/results.cfm?query=%22compilers%22&querydisp=%22compilers%22&termshow=matchall&cfid=556096752&cftoken=59191841>
>> scratchpad
>> <http://dl.acm.org/results.cfm?query=%22scratchpad%22&querydisp=%22scratchpad%22&termshow=matchall&cfid=556096752&cftoken=59191841>
>>  theory
>> <http://dl.acm.org/results.cfm?query=%22theory%22&querydisp=%22theory%22&termshow=matchall&cfid=556096752&cftoken=59191841>
>> [image: feedback]
>> <portal-feedb...@hq.acm.org?subject=Comments_on_new_design> *Feedback*
>> <portal-feedb...@hq.acm.org?subject=Comments_on_new_design> | Switch to 
>> single
>> page view
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841&preflayout=flat>
>>  (no
>> tabs)
>> *Abstract*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>> *Authors*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>> *References*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>*Cited
>> By*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>*Index
>> Terms*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>> *Publication*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>> *Reviews*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>> *Comments*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>*Table
>> of Contents*
>> <http://dl.acm.org/citation.cfm?id=806324&dl=ACM&coll=DL&CFID=556096752&CFTOKEN=59191841#>
>> A pattern compiler for the SCRATCHPAD system provides an efficient
>> implementation of sets of user-defined pattern-replacement rules for
>> symbolic mathematical computation such as tables of integrals or summation
>> identities. Rules are compiled together, with common search paths merged
>> and factored out and with the resulting code optimized for efficient
>> recognition over all patterns. Matching principally involves structural
>> comparison of expression trees and evaluation of predicates. Pattern
>> recognizers are “fully compiled” if values of match variables can be
>> determined by solving equations at compile time. Recognition times for
>> several pattern matchers are compared.
>>
>> .....
>>
>>> I'm sure the code in macsyma/maxima would be quite useful, but being as
>>> lisp-illiterate as I am, I can't make much use of it. Thanks for the tip
>>> though.
>>>
>>
>> You could read my paper on semantic patterns instead of the code.
>>
>>
>>>
>>> - Jim
>>>
>>>  --
>> You received this message because you are subscribed to the Google Groups
>> "sympy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to sympy+unsubscr...@googlegroups.com.
>> To post to this group, send email to sympy@googlegroups.com.
>> Visit this group at http://groups.google.com/group/sympy.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/sympy/345a4a58-a44f-48a4-8f8e-64b920c1d65e%40googlegroups.com
>> <https://groups.google.com/d/msgid/sympy/345a4a58-a44f-48a4-8f8e-64b920c1d65e%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "sympy" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/sympy/-7lyMkiB5SY/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> sympy+unsubscr...@googlegroups.com.
> To post to this group, send email to sympy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/CAKgW%3D6LQqOTcs500bg6_RZFJkZEdGAObXmi01QKAHSO7nxs4gg%40mail.gmail.com
> <https://groups.google.com/d/msgid/sympy/CAKgW%3D6LQqOTcs500bg6_RZFJkZEdGAObXmi01QKAHSO7nxs4gg%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sympy+unsubscr...@googlegroups.com.
To post to this group, send email to sympy@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAJ2L7mc6tm3QXL5MYLjPC%3DrEk2ZA3LUosvRjt2X9xBjEbhN-sQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to