[Python-Dev] Re: The semantics of pattern matching for Python

Mark Shannon Sat, 21 Nov 2020 06:15:15 -0800



On 20/11/2020 6:57 pm, Tobias Kohn wrote:

Hi Daniel and Mark,
Sorry for being slightly late to the party, but please let me add a fewremarks of my own to the discussion here.
*1. Must have precisely defined semantics*

Yes, there are some aspects that we left open intentionally.  Most

Why? Daniel has written up a formal semantics, why not use that orsomething like it?

prominently the question of how often the pattern matching engine willcheck whether the subject is an instance of a particular class. Takethe following trivial example::
   match some_data:
       case Pair(12, 34):
           ...
       case Triple(12, 34, z):
           ...
       case Pair(12, y):
           ...
       case Pair(x, y):
           ...
In a perfect world, the compiler discovers that it must check whether``some_data`` is an instance of ``Pair`` exactly once and not threetimes. This, of course, plays right into Mark's second point onefficiency and seems obvious enough. Yet, as soon as we are consideringnested patterns, it turns much less obvious whether the compiler issupposed to cache repeated isinstance-checks. Can we really expect thatthe compiler must discover that in both case clauses the first elementis checked against the same class? Or would it make more sense tosimply expect the compiler to potentially perform this ``Num`` instancecheck twice::
   match some_data:
       case [ Num(), 12 ]:
           ...
       case [ Num(), y, *z ]:
           ...
It is easy to think of cases where we accidentally end up calling anisinstance check more than once because the compiler could not provethat they are equal. Still, whenever possible we want to give thecompiler the freedom to optimise the pattern matching statement by caching.

An optimization (in the CS sense, not the math sense) is aperformance-improving, semantic-preserving transformation.

How can I preserve semantics if you won't tell me what the semantics are?

In a static language, all of this would not be an issue at all, ofcourse. In Python, however, we end up being caught between its dynamicfeatures and the desire to make pattern matching reasonably efficient.So, we ended up leaving the question open as how often the patternmatching engine is allowed or supposed to check instances. Naturally,if you go and write some isinstance-check on a class with side-effects,you can break it.
*2. Users should not have to pay an unnecessary performance penalty touse pattern matching*
To quote Mark [1] here:
/> Users should not have to pay an unnecessary performance penalty touse pattern matching./
Alright, what does this even mean? What is an unnecessary performancepenalty? How should that be measured or compared?

If we shipped Python compiled with -O1, instead of -O3. It would beslower and unnecessarily so. That would be an unnecessary performancepenalty.

In this context, I mean that there is an obvious performance improvementavailable and it hasn't been implemented. Or that the specificationprevents an optimization without any compensating benefit.

Without a well defined semantics, I can't optimize at all. That is anunnecessary performance penalty.

Pattern matching is not just fancy syntax for an if-elif-statement, buta new way of writing and expressing structure. There is currentlynothing in Python that fully compares to pattern matching (which isobviously why we propose to add in the first place). So, do you want tocompare a pattern matching structure to an if-elif-chain or rather animplementation using reflection and/or the visitor pattern? Whenimplementing pattern matching, would we be allowed to trade off a littlespeed handling the first pattern for moving faster to patterns further down?
Do our PEPs really read to you like we went out of our ways to make itslow or inefficient? Sure, we said let's start with an implementationthat is correct and worry about optimising it later. But I thought thisis 101 of software engineering, anyway, and am thus rather surprised tofind this item on the list.
*3. Failed matches should not pollute the enclosing namespace*
This is a slightly wider issue that has obviously sparked an entirediscussion on this mailing list on scopes.
If there is a good solution that only assigns variables once the entirepattern matched, I would be very happy with that. However, I think thatvariables should be assigned in full before evaluating any guards---evenat the risk of the guard failing and variables being assigned that arenot used later on. Anything else would obviously introduce a mini-scopeand lead to shadowing, which hardly improves anything with respect tolegibility.

How would it introduce shadowing? Since `var` is a capture pattern, itcan't read and thus can't be shadowed.



*4. Objects should be able determine which patterns they match*

Short version: no!


Yes! ;)

Class patterns are an extension of instance checks. Leaving out themeta-classes at this point, it is basically the class that isresponsible for determining if an object is an instance of it. Patternmatching follows the same logic, whereas Mark suggests to put thatupside-down. Since you certainly do not want to define the machinery ineach instance, you end up delegating the entire thing to the class, anyway.


"Mark suggests to put that upside-down"
I have no idea what you mean by that, or the the rest of this paragraph.

I find this suggestion also somewhat strange in light of the history ofour PEPs. We started with a more complex protocol that would allow forcustomised patterns, which was then ditched because it was felt as beingtoo complicated. There is still a possibility to add it later on, ofcourse. But here we are with Mark proposing to introduce a complexprotocol again. It would obviously also mean that we could not rely asmuch on Python's existing infrastructure, which makes efficient patternmatching harder, again. I completely fail to see what should be gainedby this.

"But here we are with Mark proposing to introduce a complex protocolagain". Again?Please list the actual faults you see with whatever it is you are seeingfaults with. I really don't know what you are getting at here.

If you are criticizing my proposal inhttps://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst#additions-to-the-object-modelthen address that, please.

*5. It should distinguish between a failed match and an erroneous pattern*
This seems like a reasonable idea. However, I do not think it iscompatible to Python's existing culture. Let's pick up Mark's exampleof an object ``RemoteCount`` with two attributes ``success`` and``total``. You can then execute the following line in Python withoutgetting any issues from the interpreter::

Python has loads of runtime type-checking. If something is clearly anerror, then why not report it?


>>> "" + 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str

   my_remote_count.count = 3
Python does not discover that the attribute should have been ``total``rather than ``count`` here. From a software engineering perspective,this is unfortunate and I would be surprised if there was anyone on thislist who was never bitten by this. But this is one of the prices wehave to pay for Python's elegance in other aspects. Requiring thatpattern matching suddenly solves this is just not realistic.
*6. Syntax and Semantics*
There are some rather strange elements in this, such as the idea thatthe OR-pattern should be avoided. In the matching process, you are alsotalking about matching an expression (under point 2), for instance; youmight not really be aware of the issues of allowing expressions inpatterns in the first place.

Please don't assume that people who disagree with you don't understandsomething or are ignorant.

---
It is most certainly a good idea to start with guiding principles tothen design and build a new feature like pattern matching.Incidentally, this is what we actually did before going into the detailsof our proposal. As evidenced by extensive (!) documentation on ourpart, there is also a vision behind our proposal for pattern matching.
In my view, Mark's proposal completely fails to provide a vision or anyrationale for the guiding principles, other than reference to somemysterious "user" who "should" or "should not" do certain things.Furthermore, there are various obvious holes and imprecisions that wouldhave to be addressed.


You've just addressed my guiding principles, but here they again:

* The semantics of pattern matching must be precisely defined.

* It must be implemented efficiently. That is, it should perform atleast as well as an equivalent sequence of if, elif statements.

* Failed matches should not pollute the enclosing namespace.
* Objects should be able determine which patterns they match.

* It should distinguish, at much as possible, between a failed match andan erroneous pattern.


Where the guiding principles for PEP 634?

Cheers,
Mark.

Kind regards,
Tobias
[1]https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst
Quoting Daniel Moisset <[email protected] <mailto:[email protected]>>:
[sorry for the duplicate, meant to reply-all]
Thank you for this approach, I find it really helpful to put theconversation in these terms (semantics and guiding principles).This is not an answer to the proposal (which I've read and helps mecontextualize) but to your points below and how they apply to PEP-634.I'm also answering personally, with a reasonable guess about what theother authors of 634-636 would agree, but they may correct me if I'mwrong.
On Mon, 16 Nov 2020 at 14:44, Mark Shannon <[email protected]<mailto:[email protected]>> wrote:
    (...)
    I believe that a pattern matching implementation must have the
    following
    properties:

    * The semantics must be precisely defined.
    * It must be implemented efficiently.
    * Failed matches must not pollute the enclosing namespace.
    * Objects should be able determine which patterns they match.
    * It should be able to handle erroneous patterns, beyond just
    syntax errors.

    PEP 634 and PEP 642 don't have *any* of these properties.

Let me answer this one by one:
1. "The semantics must be precisely defined":
If this happens in PEP634 I don't think it was intentional, and I'mpretty sure the authors would be happy to complete any incompletenessthat it has. I would happily have a more accurate description (Idrafted a non-official one for a much earlier version of PEP-622,https://github.com/dmoisset/notebook/blob/master/python/pep622/semantic-specs.md ).Can you clarify where you see these imprecisions?
2. "It must be implemented efficiently":
I don't think "efficient implementation" was a priority in PEP634,although I saw your proposal defines this as "same performance as theequivalent if statement", and I'm quite sure that level of performancecan be achieved (if it isn't already by Brandt's implementation).Finding the best way to optimise wasn't a priority, but I think ifthere was anything in our implementation that would make optimisationsharder we would consider them as a change. Do you think anything likethat has been presented?
3. "Failed matches must not pollute the enclosing namespace":
This for me is one of the less-desirable parts of the proposal, andwas agreed more as a matter of practicality and an engineeringtradeoff. If you have a reasonable way of solving this (like puttingmatched variables in the stack and popping it later) reasonably I'd bemuch happier putting that in.
4. "Objects should be able determine which patterns they match."
This is something that you and I, and most of the authors of 622 agreeon. What we found out when discussing this is that we didn't haveclear what and how to open that customization. Some customizationoptions added a lot of complexity at the cost of performance, someothers were very simple but it wasn't clear that they would beactually useful, or extensible in the future. This has a lot to dowith this being a somewhat new paradigm in Python, and our lack ofknowledge on what the user community may do with it beyond what weimagined. So the decision was "pattern matching as it is presentedwithout extensibility is useful, let's get this in, and once we seehow it is used in the wild we'll understand better what kind ofextensibility is valuable". For a crude analogy, imagine trying to getthe descriptor protocol right when the basic python object model wasbeing implemented. These things happened as different times as the useof the language evolved, and my opinion is that customization of thematch protocol must follow a similar path.5. "It should be able to handle erroneous patterns, beyond just syntaxerrors."I'll be answering this based on the example in your document, matchingRemoteCount(success=True, count=count) where RemoteCount is anamedtuple. The code is likely an error, and I'm in general all forreporting errors early, but the kind of error detection proposed hereis the kind of errors that python normally ignore. I find that examplereally similar to the kind of error you could make writing "ifremcount.count == 3: ..." or going beyond this example "maxelems =configfile.read(); if len(elems) == maxelems: ...". There are manytype errors that python ignores (especially related to equality), andPython has already made the decision of allowing mixed type equalitycomparisons everywhere, so why do you think pattern matching should bedifferent with respect to this? In my opinion trying to "fix" this (oreven agreeing if this is a bug or not) is a much more general issueunrelated to pattern matching. Given the current status-quo I normallytrust python type checkers to help me with these errors, and I'dexpect them to do the same with the "erroneous" match statement.If there are other examples you had in mind when you wrote this I'dalso be happy to discuss those.I'll try to get some time to review your specific counterproposallater, thanks for it.
Best,
    Daniel
_______________________________________________
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/SIK62L2A3X37G52OR34HGHK6BIDLKGAT/
Code of Conduct: http://python.org/psf/codeofconduct/

_______________________________________________
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/E4ZXU7XHZ5DO6Q4WDBWIZRWZM5GOM7ZO/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: The semantics of pattern matching for Python

Reply via email to