Thank you for the well-written PEP, although I don't agree with it. My 
response below is quite long. Here is my opinionated TL;DR:


(1) Just get over the use of `_` for the wildcard pattern. Just use 
another identifier. Now that the parser will support soft keywords, we 
should expect more cases where something that is an identifier in one 
context will be a keyword in another.

(2) The most common uses of patterns should not require sigils.

(3) None is special, and we should insist on `is` comparisons by 
default. True and False are a little more problematic.

(4) Using sigils to override the default is okay. That includes turning 
what would otherwise be a capture pattern into a comparison.

Details below.


On Sat, Oct 31, 2020 at 05:16:59PM +1000, Nick Coghlan wrote:

> The rendered version of the PEP can be found here:
> https://www.python.org/dev/peps/pep-0642/

Quoting from the PEP:

"Wildcard patterns change their syntactic marker from _ to ?"

Yuck. Sorry, I find `?` in that role very aesthetically and 
visually unappealing :-(

I really don't get why so many people are hung up over this minuscule 
issue of giving `_` special meaning inside match statements. IMO, 
consistency with other languages' pattern matching is more useful than 
the ability to capture using `_` as a variable name.

Now that the PEG parser makes it easy to have soft keywords, there will 
probably be more cases in the future where something that is 
syntactically an identifier is a regular name in one context and special 
syntax in another. This has happened before (e.g. "as") and it will 
happen again.

We have a very strong convention that `_` is used as a write-only "don't 
care" variable. (The two exceptions are the magic underscore in the 
REPL, and `_()` in i18n.) In idiomatic Python code, if we bind a value 
to `_` and then use it later, we are Doing It Wrong.

Is there such a shortage of local variable names that the inability to 
misuse `_` is a problem in practice? Just use another identifier.

But if we really *must* break that convention and bind to `_`, we can 
still do it inside a match statement:

    case a:
        _ = a
        print(_)

The fact that you have to use a temporary variable to break the rules 
is, in my opinion, a good thing -- it reminds you that what you are 
doing is weird.


Quoting code from the PEP:

```
# Literal patterns
match number:
    case ?0:
        print("Nothing")
    case ?1:
        print("Just one")
```

I think this is an example of what Larry Wall talked about when he 
discussed the mistakes of Perl's original regex syntax:

"Poor Huffman coding"

https://www.perl.com/pub/2002/06/04/apo5.html/

Wall regrets that many common patterns are longer and harder to write 
than rarer patterns.

Why do we need a `?` sigil to match a literal? `case 1` cannot possibly 
be interpreted as a capture pattern. It would be wrong to compare it 
with `is`. What else could it mean other than equality comparison? The 
question mark is pure noise.

So here's a counter suggestion:


(1) Literals still match by equality, because that is what we want 99% 
of the time. No sigil required.

You mention this in the "Rejected ideas" section, but I reject your 
rejection :-)

The PEP rejects this because:

"they have the same syntax sensitivity problem as value patterns do, 
where attempting to move the literal pattern out to a local variable for 
naming clarity would turn the value checking literal pattern into a name 
binding capture pattern"

but that's based on a really simple-minded refactoring. Sure, the naive 
user who knows little about pattern matching might try to refactor like 
this:


    # Before.
    match record:
        case (42, x): ...

    # After.
    ANSWER_TO_LIFE = 42
    match record:
        # It's a Trap!
        case (ANSWER_TO_LIFE, x): ...


and I am sympathetic to your desire to avoid that.

But this is the sort of error that:

- only applies in comparatively unusual circumstances
  (naively refactoring a literal in a case statement);

- is easily avoided by automated refactoring tools;

- linters will warn about (assignment to a CONSTANT);

- is easily spotted if you have unit tests;

- is obvious to those with more experience in pattern matching.

So I don't see this as a large problem. I expect few people will 
be bitten by this more than once, if that. I think that your 
preventative solution, forcing all literal patterns to require a 
sigil, is worse than the problem it is solving.

Bottom line: let's not hamstring pattern matching with poor Huffman 
coding right from day one.


(2) While literals usually compare by equality, the exception is three
special keywords, and one symbol, that compare by identity:


    case None | True | False | ... :
        # Compares by identity.


I can't think of any other literal where identity tests would be useful 
and guaranteed by the language (no relying on implementation-specific 
details, such as small int caching or string interning).

So these keywords (plus the ... symbol) match by identity by default, 
because that's what we want 99% of the time. (Although, see below for 
discussion about the two bools.)

Other special values, like NotImplemented and Ellipsis, aren't keywords, 
they are just names, and don't get special treatment.
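
As a quick sanity check on the "guaranteed by the language" point, here 
is a small sketch in plain Python. The variable names are arbitrary; the 
last line deliberately makes no claim about its result, since that is the 
implementation-specific part.

```python
# Singletons guaranteed by the language: identity tests are reliable.
assert type(None)() is None     # NoneType() always returns None
assert bool("spam") is True     # bool() only ever returns True or False
assert bool("") is False
assert Ellipsis is ...          # the name and the symbol are one object

# For other values, equality holds but identity is an implementation
# detail (caching, interning), so it should not be relied upon.
a = int("1000")
b = int("1000")
assert a == b
ints_share_identity = a is b    # may be True or False
```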


(3) Overriding the default comparison with an explicit sigil is 
allowed:


    case ==True:
        print("True, or 1, or 1.0, or 1+0j, etc")

    case ==None:
        print("None, or something weird that equals None")

    case is 1943.63:
        print("if you see this, the interpreter is caching floats")


I don't think that there will be any ambiguity between the unary "==" 
pattern modifier and the real `==` operator. But if I am wrong, then we 
can change the spelling:


    case ?None:
        print("None, or something weird that equals None")

    case ?is 1943.63:
        print("if you see this, the interpreter is caching floats")


(I don't love the question mark here, but I don't hate it either.)

The important thing here is that the cases with no sigil are the common 
operations; the sigil is only needed for the uncommon case.
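
Since the sigil syntax above is only a proposal and cannot be run, here 
is a rough sketch of the intended semantics in ordinary Python. The 
`Weird` class and the `value_matches` helper are hypothetical, invented 
for this illustration; the `compare` argument stands in for the sigil, 
and the identity default applies only to the None, True, False and `...` 
patterns.

```python
import operator

class Weird:
    """Hypothetical class whose instances claim to equal None."""
    def __eq__(self, other):
        return other is None

def value_matches(subject, value, compare=operator.is_):
    # Illustrative helper only: `compare` plays the role of the sigil.
    return bool(compare(subject, value))

weird = Weird()
default_none = value_matches(weird, None)               # like `case None`
equals_none = value_matches(weird, None, operator.eq)   # like `case ==None`
default_true = value_matches(1.0, True)                 # like `case True`
equals_true = value_matches(1.0, True, operator.eq)     # like `case ==True`
```

With the default, neither `weird` nor `1.0` matches; with the explicit 
equality override, both do.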


(4) Patterns which could conceivably be interpreted as assignment 
targets default to capture patterns, because that's what is normally 
wanted in pattern matching:


    case [1, spam, eggs]:
        # captures spam and eggs


If you don't want to capture a named value, but just match on it, 
override it with an explicit `==` or `is`:


    case [1, ==spam, eggs]:
        # matches `spam` by equality, captures eggs
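
Roughly speaking, I would expect that last pattern to behave like the 
hand-written check below. This is only a sketch of the proposed 
semantics, not anything the PEP specifies: the function name and its 
return convention are made up, and I am assuming sequence patterns 
exclude strings and bytes.

```python
from collections.abc import Sequence

def match_one_spam_eggs(subject, spam):
    """Hypothetical hand-expansion of `case [1, ==spam, eggs]`.

    Returns a 1-tuple holding the captured `eggs` on success,
    or None if the pattern does not match.
    """
    if (isinstance(subject, Sequence)
            and not isinstance(subject, (str, bytes, bytearray))
            and len(subject) == 3
            and subject[0] == 1          # literal: compared by equality
            and subject[1] == spam):     # `==spam`: compared, not captured
        return (subject[2],)             # `eggs`: captured
    return None

captured = match_one_spam_eggs([1, "ham", "fried"], "ham")
rejected = match_one_spam_eggs([1, "other", "fried"], "ham")
```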


Quoting the PEP:

"nobody litters their if-elif chains with x is True or x is False 
expressions, they write x and not x, both of which compare by value, not 
identity."

That's incorrect. `if x` doesn't *compare* at all, neither by identity 
nor by equality; it duck-types truthiness:


```
>>> class Demo:
...     def __bool__(self):
...             return True
...     def __eq__(self, other):
...             return False
... 
>>> x = Demo()
>>> x == True
False
>>> if x: print("truthy")
... 
truthy
```

There's a reasonable argument to make that (unless overridden by an 
explicit sigil) the `True` and `False` patterns should match by 
truthiness, not equality or identity, but I'm not going to make that 
argument.

Quote:

"Indeed, PEP 8 explicitly disallows the use if x is True"

This is true, but I think you have to understand the intention there. I 
believe the intent is that APIs should not insist on *exactly* the True 
or False singletons for boolean flags, but instead accept any truthy or 
falsey objects. (Duck typing for the win.)

But if you need to distinguish *exactly* True from an arbitrary truthy 
value like "spam and eggs" or 93.78, then identity, not equality, is the 
correct way to do it.
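
Here is a small sketch of the three different questions one can ask of a 
value. The list of candidates is arbitrary, chosen only to illustrate the 
point:

```python
candidates = [True, 1, 1.0, 1 + 0j, "spam and eggs", 93.78]

# Truthiness: what `if x` actually tests.
truthy = [x for x in candidates if x]

# Equality: also accepts 1, 1.0 and 1+0j.
equal_to_true = [x for x in candidates if x == True]

# Identity: accepts only the True singleton itself.
exactly_true = [x for x in candidates if x is True]
```

All six candidates are truthy, four of them compare equal to True, and 
only one of them *is* True.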


