On Mon, Jun 22, 2020 at 8:20 AM nate lust <natel...@linux.com> wrote:
> I have been working on an idea that would introduce pattern matching
> syntax to Python. I now have this syntax implemented in CPython, and feel
> this is the right time to gather further input. The repository and branch
> can be found at https://github.com/natelust/cpython/tree/match_syntax.
> The new syntax would clean up readability, ease things like visitor pattern
> style programming, localize matching behavior to a class, and support
> better signaling amongst other things. This is the tl;dr, I will get into a
> longer discussion below, but first I want to introduce the syntax and how
> it works with the following simple example:
>

Great! This is a time of synchronicity -- with a few core devs and some
others I have been quietly working on a similar proposal. We've got an
implementation and a draft PEP -- we were planning to publish the PEP last
week but got distracted and there were a bunch of last-minute edits we
wanted to apply (and we thought there was no hurry).

We should join forces somehow.

> result = some_function_call()
>
> try match result:  # try match some_function_call(): is also supported
>
>     as Dog:
>
>         print("is a dog")
>
>     as Cat(lives):
>
>         print(f"is a cat with {lives} lives")
>
>     as tuple(result1, result2):
>
>         print(f"got two results {result1} and {result2}")
>
>     else:
>
>         print("unknown result")
>

Our syntax is similar but uses plain "match" instead of "try match". We
considered "as", but in the end settled on "case". We ended up rejecting a
separate "else" clause, using "case _" (where "_" matches everything)
instead. There are other small syntactic differences but the general feel
is very similar!

> The statement begins with a new compound keyword "try match". This is
> treated as one logical block; the word match is not being turned into a
> keyword. There are no backwards compatibility issues as previously no
> symbols were allowed between try and :.
> The try match compound keyword was
> chosen to make it clearer to users that this is distinct from a try block,
> provide a hint on what the block is doing, and follow the Python tradition
> of being sensible when spoken out loud to an English speaker. This keyword
> is followed by an expression that is to be matched, called the match target.
>

Note that the new PEG parser makes it possible to recognize plain "match
<expr>:" without fear of backward incompatibility.

> What follows is one or more match blocks. A match block is started with
> the keyword ‘as’ followed by a type, and optionally parameters.
>
> Matching begins by calling a __match__ (class)method on the type, with the
> match target as a parameter. The match method must return an object that
> can be evaluated as a bool. If the return value is True, the code block in
> this match branch is executed, and execution is passed to whatever comes
> after the match syntax. If __match__ returns False, execution is passed to
> the next match branch for testing.
>

We have a __match__ class method too, though its semantics are somewhat
different.

> If a match branch contains a group of parameters, they are used in the
> matching process as well. If __match__ returns True, then the match target
> will be tested for the presence of a __unpack__ method. If there is no such
> method, the match target is tried as a sequence. If both of these fail,
> execution moves on. If there is a __unpack__ method, it is called and is
> expected to return a sequence. The length of the sequence (either the
> result of __unpack__, or the match target itself) is compared to the number
> of supplied arguments. If they match, the sequence is unpacked into
> variables defined by the arguments and the match is considered a success
> and the body is executed. If the length of the sequence does not match the
> number of arguments, the match branch fails and execution continues.
> This
> is useful for differentiating tuples of different lengths, or objects that
> unpack differently depending on state.
>

Our __match__ handles the __unpack__ functionality.

> If all the match blocks are tested and fail, the try match statement will
> check for the presence of an else clause. If this is present the body is
> executed. This serves as a default execution block.
>
> What is the __match__ method and how does it determine if a match target
> is a match? This change introduces __match__ as a new default method on
> ‘object’. The default implementation first checks the match target with the
> ‘is’ operator against the object containing the __match__ method. If that
> is false, then it checks the match target using isinstance. Objects are
> free to implement whatever __match__ method they want provided it matches
> the interface.
>
> This proposal also introduces __unpack__ as a new interface, but does not
> define any default implementation. This method should return a sequence.
> There is no specific form outside this definition; a class author is free
> to implement whatever representation they would like to use in a match
> statement. One use case for this method could be a class that stores the
> parameters passed to __init__ (or some other parameters) so someone could
> construct a new object such as Animal(*animal_instance.__unpack__())
> [possibly with a builtin for calling unpack]. Another use, in keeping with
> the match syntax, is something like Stateful Enums.
>
> The behavior covered by try match can be emulated with some combination of
> compound and/or nested if statements alongside type checking and parameter
> unpacking, so why introduce the new syntax? The benefits I see are:
>
> * Much easier to read and follow compared to complicated if branching logic
>
> * The matching logic is now defined alongside the class in the __match__
> method. This is in contrast to if statements where the logic is duplicated
> in each place.
> If it is factored out into a function, the function may be
> unknown or unused in a cross package setting. Refactoring distributed logic
> to live next to an object is similar to the introduction of the format
> method on strings.
>
> * It introduces a well defined interface for programmers to depend on in
> contrast to new interfaces being built on a package by package basis. For
> instance __unpack__ and __match__ behavior may be implemented on objects
> today with any variety of names, making discoverability difficult, and
> making if blocks more difficult for a user to parse.
>
> * A pattern often found in Python is to use Exceptions for signaling
> conditions inside of execution which muddles the distinction between
> handling exceptional behavior and normal code execution. The try match
> syntax, along with object unpacking, provides a standardized way to signal
> information back to callers and standardized syntax for handling those
> signals.
>

I agree with these advantages. I hadn't thought of matching as a
replacement for exceptions but I think it's a good idea.

> The try match syntax fits in well with the ethos of including more ways to
> use typing information within Python. For instance something like
> typing.Union could have a corresponding __match__ method such that all of
> those types would be covariant under a single match block. In the opposite
> sense the try match syntax is also useful for taking a parameter defined as
> a union and dispatching to individual functions that transform the variable
> into a standardized type.
>
> Points I am unsure of:
>
> I am not sure about using “(parameters)” as part of the match branch. In
> some ways it is very familiar to look at, but it also makes it look like
> these must be parameters used to construct the object, not parameters
> returned from __unpack__ (though they may be the same, they may not be).
> I
> have toyed with the idea of something like “-> (a,b)” or using braces
> “{a,b}” and these serve the purpose of being different, but then that also
> makes them look different, so I don't really have a strong opinion formed on
> this part of the syntax.
>
> The implementation on my branch does work, but I am by no means an expert
> on all of Python; there may be much better ways to do what I did. In
> particular I implemented some of the logic in the compiler, but it may be
> better served as an opcode. The exact boundary of where best to put some
> logic is unclear.
>
> This implementation currently stops at the first matching block it comes
> to, not the best match out of all blocks. This is meant to make it easier
> to understand the “flow” of the statement, but it might be preferable to
> execute the block associated with the best match, though this would
> complicate the implementation a good deal.
>
>
> I am sure there is more that I have not considered with this proposal and
> I appreciate any feedback you choose to provide. Thank you for your time in
> reading this.
>

Thanks for taking the time to show and explain your proposal! I'm sorry I
cannot yet show our full proposal. Hopefully it'll be ready later this week.

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
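
For readers who want to experiment with the semantics described in the
quoted proposal without building the branch, the dispatch rules (default
__match__ via 'is' and then isinstance, optional __unpack__ with a sequence
fallback, a length check against the number of parameters, first matching
branch wins, else as the default) can be roughly emulated in plain Python.
This is only a sketch under assumptions: try_match, try_branch,
default_match, describe, and the Dog and Cat classes are hypothetical names
made up for illustration. They are not taken from the linked branch or from
the draft PEP, and branches are passed as (type, parameter count, handler)
tuples because there is no new syntax here.

```python
from collections.abc import Sequence

_NO_MATCH = object()  # sentinel meaning "this branch did not match"


def default_match(cls, target):
    """Default rule from the proposal: identity first, then isinstance."""
    return target is cls or isinstance(target, cls)


def try_branch(cls, target, n_params=0):
    """Return the values to bind if the branch matches, else _NO_MATCH."""
    # A class may supply its own __match__ classmethod (hypothetical hook).
    matcher = getattr(cls, "__match__", None)
    matched = matcher(target) if matcher else default_match(cls, target)
    if not matched:
        return _NO_MATCH
    if n_params == 0:
        return ()
    # Prefer __unpack__ on the target, then fall back to treating it as a sequence.
    if hasattr(target, "__unpack__"):
        values = tuple(target.__unpack__())
    elif isinstance(target, Sequence):
        values = tuple(target)
    else:
        return _NO_MATCH
    # The branch only succeeds when the lengths agree.
    return values if len(values) == n_params else _NO_MATCH


def try_match(target, branches, default=None):
    """Run the handler of the first matching (cls, n_params, handler) branch."""
    for cls, n_params, handler in branches:
        bound = try_branch(cls, target, n_params)
        if bound is not _NO_MATCH:
            return handler(*bound)
    return default() if default is not None else None


class Dog:
    pass


class Cat:
    def __init__(self, lives):
        self.lives = lives

    def __unpack__(self):
        return (self.lives,)


def describe(result):
    return try_match(
        result,
        [
            (Dog, 0, lambda: "is a dog"),
            (Cat, 1, lambda lives: f"is a cat with {lives} lives"),
            (tuple, 2, lambda a, b: f"got two results {a} and {b}"),
        ],
        default=lambda: "unknown result",
    )
```

With these definitions, describe(Cat(9)) takes the Cat branch, while a
three-element tuple falls through to the default because the length check
fails. The first-match-wins loop mirrors the behavior the proposal describes
for the current implementation; a "best match" strategy would need a
different loop.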
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/34ZI3IZ7D4DLLXTSE2SFIMYEWOV7ZGWQ/
Code of Conduct: http://python.org/psf/codeofconduct/