Re: [Python-ideas] Pattern Matching Syntax
05.05.18 09:23, Tim Peters wrote:
> [Tim]
>>> ... I liked the way he _reached_ that conclusion: by looking at real-life
>>> Python code that may have been written instead to use constructs "like
>>> this". I find such examination far more persuasive than abstract
>>> arguments or made-up examples.
>
> [Serhiy Storchaka]
>> I would like to see such examination for PEP 572. And for all other syntax
>> changing ideas.
>
> I did it myself for 572, and posted several times about what I found.

Could you please give links to these results? It is hard to find something in hundreds of messages.

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Pattern Matching Syntax
[Tim]
>> ... I liked the way he _reached_ that conclusion: by looking at real-life
>> Python code that may have been written instead to use constructs
>> "like this". I find such examination far more persuasive than abstract
>> arguments or made-up examples.

[Serhiy Storchaka]
> I would like to see such examination for PEP 572. And for all other syntax
> changing ideas.

I did it myself for 572, and posted several times about what I found. It was far more productive to me than arguing (and, indeed, I sat out of the first several hundred msgs on python-ideas entirely because I spent all my time looking at code instead).

Short course: I found a small win frequently, a large win rarely, but in most cases wouldn't use it. In all I expect I'd use it significantly more often than ternary "if", but far less often than augmented assignment. But that's me - everybody needs to look at their own code to apply _their_ judgment.

572 is harder than a case/switch statement to consider this way, because virtually every assignment statement binding a name could _potentially_ be changed to a binding expression instead, and there are gazillions of those. For considering case/switch additions, you can automate searches to vastly whittle down the universe of places to look at (`elif` chains, and certain nested if/else if/else if/else ... patterns).

> I withdrew some my ideas and patches when my examinations showed that the
> number of cases in the stdlib that will take a benefit from rewriting using
> a new feature or from applying a compiler optimization is not large enough.

Good! I approve :-)
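The kind of automated search mentioned above (finding `elif` chains worth a closer look) can be sketched with the standard `ast` module. This is only an illustration, not the script anyone in the thread actually used: the function names, the `min_branches` threshold and the `SAMPLE` source are made up for the example. Note that the AST represents `elif` as a lone `if` nested in the parent's `else`, so the nested "else: if ..." patterns are counted the same way.

```python
import ast


def elif_chain_length(node):
    """Count the branches of the if/elif chain that starts at `node`."""
    length = 1
    # An `elif` is stored as a single ast.If inside the parent's `orelse`.
    while len(node.orelse) == 1 and isinstance(node.orelse[0], ast.If):
        node = node.orelse[0]
        length += 1
    return length


def find_elif_chains(source, min_branches=3):
    """Return sorted (line number, branch count) pairs for long chains."""
    tree = ast.parse(source)
    if_nodes = [n for n in ast.walk(tree) if isinstance(n, ast.If)]
    # Nodes that merely continue a chain must not be reported as chain heads.
    continuations = {
        id(n.orelse[0])
        for n in if_nodes
        if len(n.orelse) == 1 and isinstance(n.orelse[0], ast.If)
    }
    results = []
    for node in if_nodes:
        if id(node) in continuations:
            continue
        branches = elif_chain_length(node)
        if branches >= min_branches:
            results.append((node.lineno, branches))
    return sorted(results)


# Hypothetical input: one three-branch chain and one plain `if`.
SAMPLE = '''
def kind(x):
    if x == 1:
        return "one"
    elif x == 2:
        return "two"
    elif x == 3:
        return "three"
    else:
        return "many"

if flag:
    pass
'''

chains = find_elif_chains(SAMPLE)
```

Running something like this over a code base only narrows down where to look; judging whether a given chain would read better with new syntax still has to be done by eye.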
Re: [Python-ideas] Inline assignments using "given" clauses
On Fri, May 4, 2018 at 6:56 PM, Alexander Belopolsky wrote: > On Fri, May 4, 2018 at 8:06 AM, Nick Coghlan wrote: >> ... >> With that spelling, the three examples above would become: >> >> # Exactly one branch is executed here >> if m given m = pattern.search(data): >> ... >> elif m given m = other_pattern.search(data)): >> ... >> else: >> ... >> >> # This name is rebound on each trip around the loop >> while m given m = pattern.search(remaining_data): >> ... >> >> # "f(x)" is only evaluated once on each iteration >> result = [(x, y, x/y) for x in data if y given y = f(x)] > > I think this is a step in the right direction. I stayed away from the > PEP 572 discussions because while intuitively it felt wrong, I could > not formulate what exactly was wrong with the assignment expressions > proposals. This proposal has finally made me realize why I did not > like PEP 572. The strong expression vs. statement dichotomy is one of > the key features that set Python apart from many other languages and > it makes Python programs much easier to understand. Right from the > title, "Assignment Expressions", PEP 572 was set to destroy the very > feature that in my view is responsible for much of Python's success. This is what makes me uncomfortable too. As Dijkstra once wrote: "our intellectual powers are rather geared to master static relations and ... our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible." [1] Normally, Python code strongly maps *time* onto *vertical position*: one side-effect per line. 
Of course there is some specific order-of-operations for everything inside an individual line that the interpreter has to keep track of, but I basically never have to care about that myself. But by definition, := involves embedding side-effects within expressions, so suddenly I do have to care after all.

Except... for the three cases Nick wrote above, where the side-effect occurs at the very end of the evaluation. And these also seem to be the three cases that have the most compelling use cases anyway. So restricting to just those three cases makes it much more palatable to me.

(I won't comment on Nick's actual proposal, which is a bit more complicated than those examples, since it allows things like 'if m.group(1) given m = ...'.)

(And on another note, I also wonder if all this pent-up desire to enrich the syntax of comprehensions means that we should add some kind of multi-line version of comprehensions, that doesn't require the awkwardness of explicitly accumulating a list or creating a nested function to yield out of. Not sure what that would look like, but people sure seem to want it.)

-n

[1] This is from "Go to statement considered harmful". Then a few lines later he uses a sequence of assignment statements as an example, and says that the wonderful thing about this example is that there's a 1-1 correspondence between lines and distinguishable program states, which is also uncannily apropos.

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-ideas] Inline assignments using "given" clauses
[Nick Coghlan]
> ...
> The essence of the given clause concept would be to modify *these specific
> cases* (at least initially) to allow the condition expression to be followed
> by an inline assignment, of the form "given TARGET = EXPR".

I'm not clear on what "these specific cases" are, specifically ;-) Conditions in "if", "elif", and "while" statement expressions? Restricted to one "given" clause, or can they chain? In a listcomp, is it one "given" clause per "if", or after at most one "if"? Or is an "if" even needed at all in a listcomp? For example,

    [(f(x)**2, f(x)**3) for x in xs]

has no conditions, and

    [((fx := f(x))**2, fx**3) for x in xs]

is one reasonable use for binding expressions.

    [(fx**2, fx**3) for x in xs given fx = f(x)]

reads better, although it's initially surprising (to my eyes) to find fx defined "at the end". But no more surprising than the current:

    [(fx**2, fx**3) for x in xs for fx in [f(x)]]

trick.

> While the leading keyword would allow TARGET to be an arbitrary assignment
> target without much chance for confusion, it could also be restricted to
> simple names instead (as has been done for PEP 572).

The problem with complex targets in general assignment expressions is that, despite trying, I found no plausible use case for (at least) unpacking syntax. As in, e.g.,

    x, y = func_returning_twople()
    if x**2 + y**2 > 9:  # i.e., distance > 3, but save expensive sqrt

The names can be _unpacked_ in a general assignment expression, but there appears to be no sane way then to _use_ the names in the test. This may be as good as it gets:

    if [(x, y := func_returning_twople()), x**2 + y**2 > 9][-1]:

That reminds me of the hideous

    (condition and [T] or [F])[0]

idiom I "invented" long ago to get the effect (in all cases) of the current

    T if condition else F

That was intended to be goofy fun at the time, but I was appalled to see people later use it ;-)

It's certainly sanest as

    if x**2 + y**2 > 9 given x, y = func_returning_twople():

"given" really shines there!

> With that spelling, the three examples above would become:
>
> # Exactly one branch is executed here
> if m given m = pattern.search(data):
>     ...
> elif m given m = other_pattern.search(data):
>     ...
> else:

Which is OK. The one-letter variable name obscures that it doesn't actually reduce _redundancy_, though. That is, in the current

    match = pattern.search(data)
    if match:

it's obviously less redundant typing as:

    if match := pattern.search(data):

In

    if match given match = pattern.search(data):

the annoying visual redundancy (& typing) persists.

> # This name is rebound on each trip around the loop
> while m given m = pattern.search(remaining_data):

Also fine, but also doesn't reduce redundancy.

> # "f(x)" is only evaluated once on each iteration
> result = [(x, y, x/y) for x in data if y given y = f(x)]

As above, the potential usefulness of "given" in a listcomp doesn't really depend on having a conditional. Or on having a listcomp either, for that matter ;-)

    r2, r3 = fx**2, fx**3 given fx = f(x)

One more, a lovely (to my eyes) binding expression simplification requiring two bindings in an `if` test, taken from real-life code I happened to write during the PEP discussion:

    diff = x - x_base
    if diff:
        g = gcd(diff, n)
        if g > 1:
            return g

collapsed to the crisp & clear:

    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g

If only one trailing "given" clause can be given per `if` test expression, presumably I couldn't do that without trickery. If it's more general,

    if (diff given diff = x - x_base) and g > 1 given g = gcd(diff, n):

reads worse to my eyes (perhaps because of the "visual redundancy" thing again), while

    if diff and g > 1 given diff = x - x_base given g = gcd(diff, n):

has my eyes darting all over the place, and wondering which of the trailing `given` clauses executes first.

> ...
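For readers who want to try the comparisons above: binding expressions were later accepted and shipped in Python 3.8, so the `:=` spellings can be run directly, while the "given" spellings remain hypothetical syntax. The sketch below assumes Python 3.8 or newer; `f` is a made-up stand-in for an expensive function and the two `find_factor_*` names are invented for the example.

```python
from math import gcd


def f(x):
    # Hypothetical stand-in for an expensive computation.
    return x + 1


def powers_with_trick(xs):
    # The existing "for fx in [f(x)]" trick: f(x) is evaluated once per x.
    return [(fx**2, fx**3) for x in xs for fx in [f(x)]]


def powers_with_binding(xs):
    # Binding expression: tuple elements are evaluated left to right,
    # so fx is bound before fx**3 is computed.
    return [((fx := f(x))**2, fx**3) for x in xs]


def find_factor_nested(x, x_base, n):
    # The statement-only spelling from the message.
    diff = x - x_base
    if diff:
        g = gcd(diff, n)
        if g > 1:
            return g
    return None


def find_factor_binding(x, x_base, n):
    # The collapsed spelling; `and` short-circuits, so gcd() is only
    # called when diff is nonzero, exactly as in the nested version.
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None
```

Both pairs of functions return the same results for the same inputs; the difference is purely in how the code reads.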
Re: [Python-ideas] Inline assignments using "given" clauses
On Fri, May 4, 2018 at 8:06 AM, Nick Coghlan wrote:
> ...
> With that spelling, the three examples above would become:
>
>     # Exactly one branch is executed here
>     if m given m = pattern.search(data):
>         ...
>     elif m given m = other_pattern.search(data)):
>         ...
>     else:
>         ...
>
>     # This name is rebound on each trip around the loop
>     while m given m = pattern.search(remaining_data):
>         ...
>
>     # "f(x)" is only evaluated once on each iteration
>     result = [(x, y, x/y) for x in data if y given y = f(x)]

I think this is a step in the right direction. I stayed away from the PEP 572 discussions because while intuitively it felt wrong, I could not formulate what exactly was wrong with the assignment expressions proposals. This proposal has finally made me realize why I did not like PEP 572. The strong expression vs. statement dichotomy is one of the key features that set Python apart from many other languages and it makes Python programs much easier to understand. Right from the title, "Assignment Expressions", PEP 572 was set to destroy the very feature that in my view is responsible for much of Python's success.

Unlike PEP 572, Nick's proposal does not feel like changing the syntax of Python expressions, instead it feels like an extension to the if-, while- and for-statements syntax. (While comprehensions are expressions, for the purposes of this proposal I am willing to view them as for-statements in disguise.)
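To make the statement-versus-expression contrast concrete, here is a rough sketch of the first two quoted examples written twice: once using only assignment statements, and once using the `:=` form that eventually shipped in Python 3.8. The "given" spelling itself cannot be run, since it was never implemented. The patterns `WORD` and `NUMBER` and all function names are invented for illustration.

```python
import re

# Hypothetical patterns standing in for `pattern` and `other_pattern`.
WORD = re.compile(r"[A-Za-z]+")
NUMBER = re.compile(r"\d+")


def classify_with_statements(data):
    # One side effect per line, at the cost of nesting for the second search.
    m = WORD.search(data)
    if m:
        return ("word", m.group())
    else:
        m = NUMBER.search(data)
        if m:
            return ("number", m.group())
        else:
            return ("none", None)


def classify_with_bindings(data):
    # Exactly one branch is executed; each search runs at most once.
    if m := WORD.search(data):
        return ("word", m.group())
    elif m := NUMBER.search(data):
        return ("number", m.group())
    else:
        return ("none", None)


def all_numbers(data):
    # The name `m` is rebound on each trip around the loop.
    found = []
    pos = 0
    while m := NUMBER.search(data, pos):
        found.append(m.group())
        pos = m.end()
    return found
```

In all three places the binding happens in the header of an `if`, `elif` or `while`, which is the restricted set of positions the "given" proposal was aiming at.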
Re: [Python-ideas] Auto-wrapping coroutines into Tasks
On Fri, May 4, 2018 at 2:58 PM, Guido van Rossum wrote: > First, "start executing immediately" is an overstatement, right? They won't > run until the caller executes a (possibly unrelated) `await`. Well, traditional Future-returning functions often do execute some logic immediately, but right, what I meant was something like "starts executing without further intervention". I'm sure you know what I mean, but here's a concrete example to make sure it's clear to everyone else. Say we write this code: async_log("it happened!") If async_log is a traditional Future-returning function, then this line is sufficient to cause the message to be logged (eventually). If async_log is an async coroutine-returning function, then it's a no-op (except for generating a "coroutine was never awaited" warning). With this proposal, it would always work. > And I'm still > unclear why anyone would care, *except* in the case where they've somehow > learned by observation that "real" coroutines don't start immediately and > build a dependency on this in their code. (The happy eyeballs use case that > was brought up here earlier today seems like it would be better off not > depending on this either way, and it wouldn't be hard to do this either.) > > Second, when adding callbacks (if you *have to* -- if you're not a framework > author you're likely doing something wrong if you find yourself adding > callbacks), the right thing to do is obviously to *always* call > ensure_future() first. Async/await often lets you avoid working with the Future API directly, but Futures are still a major part of asyncio's public API, and so are synchronous-flavored systems like protocols/transports, where you can't use 'await'. I've been told that we need to keep that in mind when thinking about asyncio extensions ;-). And if the right thing to do is to *always* call a function, that's a good argument that the library should call it for you, right? 
:-)

In practice I think cases like my 'async_log' example are the main place where people are likely to run into this – there are a lot of functions out there where a bare call works to run something in the background, and a lot where it doesn't. (In particular, all existing Tornado and Twisted APIs are Future-returning, not async.)

> Third, hooks like this feel like a great way to create an even bigger mess
> -- it implicitly teaches users that all coroutines are Futures, which will
> then cause disappointments when they find themselves in an environment where
> the hook is not enabled.

Switching between async libraries is always going to be pretty messy. So I guess the only case where people are likely to actually encounter an unexpected hook configuration is in the period before they enter asyncio (or whatever library they're using). Like, if you've learned that async functions always return Futures, you might expect this to work:

    fut = some_async_fun()
    # Error, 'fut' is actually a coroutine b/c the hook isn't set up yet
    fut.add_done_callback(...)
    asyncio.run(fut)

That's a bit of a wart. But this is something that basically never worked and can't work, and very few people are likely to run into, so while it's sad that it's a wart I don't think it's an argument against fixing the other 99% of cases? (And of course this doesn't arise for libraries like Trio, where you just never call async functions outside of async context.)

> Perhaps we should go the other way and wrap most ways of creating Futures in
> coroutines? (Though there would have to be a way for ensure_future() to
> *unwrap* it instead of wrapping it in a second Future.)

So there's a few reasons I didn't suggest going this direction:

- Just in practical terms, I don't know how we could make this change. There's one place that all coroutines are created, so we at least have the technical ability to change their behavior all at once. OTOH Future-returning functions are just regular functions that happen to return a Future, so we'd have to go fix them one at a time, right?

- For regular asyncio users, the Future API is pretty much a superset of the coroutine API. (The only thing you can do with a coroutine is await it or call ensure_future, and Futures allow both of those.) That means that turning coroutines into Futures is mostly backwards compatible, but turning Futures into coroutines isn't.

- Similarly, having coroutine-returning functions start running without further intervention is *mostly* backwards compatible, because it's very unusual to intentionally create a coroutine object and then never actually run it (via await or ensure_future or whatever). But I suspect it is fairly common to call Future-returning functions and never await them, like in the async_log example above. This is why we have the weird "category 3" in the first place: people would like to refactor Future-returning APIs to take advantage of async/await, but right now that's a compatibility-breaking change.

- Exposing raw coroutine objects to users has led to var
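The 'async_log' point from earlier in this message can be checked with current asyncio. The sketch below is an illustration only: `async_log` here is a toy that appends to a list, `LOG` and `main` are names made up for the example, and it assumes Python 3.7+ for `asyncio.run`.

```python
import asyncio

LOG = []


async def async_log(message):
    # Toy stand-in for a real logging coroutine.
    LOG.append(message)


async def main():
    # A bare call only creates a coroutine object; the body does not run.
    coro = async_log("bare call")
    await asyncio.sleep(0)  # give the loop a chance to run anything pending
    bare_call_ran = "bare call" in LOG
    coro.close()  # tidy up so no "never awaited" warning is emitted

    # Wrapping the coroutine in a Task schedules it on the running loop.
    task = asyncio.ensure_future(async_log("wrapped in a Task"))
    await asyncio.sleep(0)  # the Task gets its turn here
    task_ran = "wrapped in a Task" in LOG
    return bare_call_ran, task_ran, task.done()


result = asyncio.run(main())
```

With today's semantics `result` is `(False, True, True)`: the bare call is a no-op, and only the Task-wrapped call logs anything. Under the proposal, the bare call inside a running loop would behave like the second case.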
Re: [Python-ideas] Auto-wrapping coroutines into Tasks
First, "start executing immediately" is an overstatement, right? They won't run until the caller executes a (possibly unrelated) `await`. And I'm still unclear why anyone would care, *except* in the case where they've somehow learned by observation that "real" coroutines don't start immediately and build a dependency on this in their code. (The happy eyeballs use case that was brought up here earlier today seems like it would be better off not depending on this either way, and it wouldn't be hard to do this either.) Second, when adding callbacks (if you *have to* -- if you're not a framework author you're likely doing something wrong if you find yourself adding callbacks), the right thing to do is obviously to *always* call ensure_future() first. Third, hooks like this feel like a great way to create an even bigger mess -- it implicitly teaches users that all coroutines are Futures, which will then cause disappointments when they find themselves in an environment where the hook is not enabled. Perhaps we should go the other way and wrap most ways of creating Futures in coroutines? (Though there would have to be a way for ensure_future() to *unwrap* it instead of wrapping it in a second Future.) On Fri, May 4, 2018 at 2:41 PM, Nathaniel Smith wrote: > Hi all, > > This is a bit of a wacky idea, but I think it might be doable and have > significant benefits, so throwing it out there to see what people > think. > > In asyncio, there are currently three kinds of calling conventions for > asynchronous functions: > > 1) Ones which return a Future > 2) Ones which return a raw coroutine object > 3) Ones which return a Future, but are documented to return a > coroutine object, because we want to possibly switch to doing that in > the future and are hoping people won't depend on them returning a > Future > > In practice these have slightly different semantics. 
For example, > types (1) and (3) start executing immediately, while type (2) doesn't > start executing until passed to 'await' or some function like > asyncio.gather. For type (1), you can immediately call > .add_done_callback: > > func_returning_future().add_done_callback(...) > > while for type (2) and (3), you have to explicitly call ensure_future > first: > > asyncio.ensure_future(func_returning_coro()).add_done_callback(...) > > In practice, these distinctions are mostly irrelevant and annoying to > users; the only thing you can do with a raw coroutine is pass it to > ensure_future() or equivalent, and the existence of type (3) functions > means that you can't even assume that functions documented as > returning raw coroutines actually return raw coroutines, or that these > will stay the same across versions. But it is a source of confusion, > see e.g. this thread on async-sig [1], or this one [2]. It also makes > it harder to evolve asyncio, since any function documented as > returning a Future cannot take advantage of async/await syntax. And > it's forced the creation of awkward APIs like the "coroutine hook" > used in asyncio's debug mode. > > Other languages with async/await, like C# and Javascript, don't have > these problems, because they don't have raw coroutine objects at all: > when you mark a function as async, that directly converts it into a > function that returns a Future (or local equivalent). So the > difference between async functions and Future-returning functions is > only relevant to the person writing the function; callers don't have > to care, and can assume that the full Future interface is always > available. > > I think Python did a very smart thing in *not* hard-coding Futures > into the language, like C#/JS do. But, I also think it would be nice > if we didn't force regular asyncio users to be aware of all these > details. > > So here's an idea: we add a new kind of hook that coroutine runners > can set. 
In async_function.__call__, it creates a coroutine object, > and then invokes this hook, which then can wrap the coroutine into a > Task (or Deferred or whatever is appropriate for the current coroutine > runner). This way, from the point of view of regular asyncio users, > *all* async functions become functions-returning-Futures (type 1 > above): > > async def foo(): > pass > > # This returns a Task running on the current loop > foo() > > Of course, async loops need a way to get at the actual coroutine > objects, so we should also provide some method on async functions to > do that: > > foo.__corocall__() -> returns a raw coroutine object > > And as an optimization, we can make 'await ' invoke this, so > that in regular async function -> async function calls, we don't pay > the cost of setting up an unnecessary Task object: > > # This > await foo(*args, **kwargs) > # Becomes sugar for: > try: > _callable = foo.__corocall__ > except AttributeError: > # Fallback, so 'await function_returning_promise()' still works: > _callable = foo > _awaitable = _callable(*args, **kwargs) > await _awaitable > > (So this
[Python-ideas] Auto-wrapping coroutines into Tasks
Hi all, This is a bit of a wacky idea, but I think it might be doable and have significant benefits, so throwing it out there to see what people think. In asyncio, there are currently three kinds of calling conventions for asynchronous functions: 1) Ones which return a Future 2) Ones which return a raw coroutine object 3) Ones which return a Future, but are documented to return a coroutine object, because we want to possibly switch to doing that in the future and are hoping people won't depend on them returning a Future In practice these have slightly different semantics. For example, types (1) and (3) start executing immediately, while type (2) doesn't start executing until passed to 'await' or some function like asyncio.gather. For type (1), you can immediately call .add_done_callback: func_returning_future().add_done_callback(...) while for type (2) and (3), you have to explicitly call ensure_future first: asyncio.ensure_future(func_returning_coro()).add_done_callback(...) In practice, these distinctions are mostly irrelevant and annoying to users; the only thing you can do with a raw coroutine is pass it to ensure_future() or equivalent, and the existence of type (3) functions means that you can't even assume that functions documented as returning raw coroutines actually return raw coroutines, or that these will stay the same across versions. But it is a source of confusion, see e.g. this thread on async-sig [1], or this one [2]. It also makes it harder to evolve asyncio, since any function documented as returning a Future cannot take advantage of async/await syntax. And it's forced the creation of awkward APIs like the "coroutine hook" used in asyncio's debug mode. Other languages with async/await, like C# and Javascript, don't have these problems, because they don't have raw coroutine objects at all: when you mark a function as async, that directly converts it into a function that returns a Future (or local equivalent). 
So the difference between async functions and Future-returning functions is only relevant to the person writing the function; callers don't have to care, and can assume that the full Future interface is always available. I think Python did a very smart thing in *not* hard-coding Futures into the language, like C#/JS do. But, I also think it would be nice if we didn't force regular asyncio users to be aware of all these details. So here's an idea: we add a new kind of hook that coroutine runners can set. In async_function.__call__, it creates a coroutine object, and then invokes this hook, which then can wrap the coroutine into a Task (or Deferred or whatever is appropriate for the current coroutine runner). This way, from the point of view of regular asyncio users, *all* async functions become functions-returning-Futures (type 1 above): async def foo(): pass # This returns a Task running on the current loop foo() Of course, async loops need a way to get at the actual coroutine objects, so we should also provide some method on async functions to do that: foo.__corocall__() -> returns a raw coroutine object And as an optimization, we can make 'await ' invoke this, so that in regular async function -> async function calls, we don't pay the cost of setting up an unnecessary Task object: # This await foo(*args, **kwargs) # Becomes sugar for: try: _callable = foo.__corocall__ except AttributeError: # Fallback, so 'await function_returning_promise()' still works: _callable = foo _awaitable = _callable(*args, **kwargs) await _awaitable (So this hook is actually quite similar to the existing coroutine hook, except that it's specifically only invoked on bare calls, not on await-calls.) Of course, if no coroutine runner hook is registered, then the default should remain the same as now. 
This also means that common idioms like: loop.run_until_complete(asyncfn()) still work, because at the time asyncfn() is called, no loop is running, asyncfn() silently returns a regular coroutine object, and then run_until_complete knows how to handle that. This would also help libraries like Trio that remove Futures altogether; in Trio, the convention is that 'await asyncfn()' is simply the only way to call asyncfn, and writing a bare 'asyncfn()' is always a mistake – but one that is currently confusing and difficult to detect because all it does is produce a warning ("coroutine was never awaited") at some potentially-distant location that depends on what the GC does. In this proposal, Trio could register a hook that raises an immediate error on bare 'asyncfn()' calls. This would also allow libraries built on Trio-or-similar to migrate a function from sync->async or async->sync with a deprecation period. Since in Trio sync functions would always use __call__, and async functions would always use __corocall__, then during a transition period one could use a custom object that defines both, and has one of them emit a DeprecationWarning. This is a problem that comes up a l
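A rough way to experiment with the calling convention described in this message is to emulate it for a single function with a decorator. Everything here is hypothetical: `autotask` is an invented name, `__corocall__` exists only in the proposal, and the real idea is an interpreter-level hook plus `await` sugar, neither of which a decorator can reproduce. The sketch assumes Python 3.7+ for `asyncio.get_running_loop` and `asyncio.run`.

```python
import asyncio


class autotask:
    """Hypothetical per-function emulation of the proposed hook."""

    def __init__(self, async_fn):
        self._async_fn = async_fn

    def __corocall__(self, *args, **kwargs):
        # Always hand back the raw coroutine object.
        return self._async_fn(*args, **kwargs)

    def __call__(self, *args, **kwargs):
        coro = self._async_fn(*args, **kwargs)
        try:
            loop = asyncio.get_running_loop()
        except RuntimeError:
            # No coroutine runner active: behave exactly as today.
            return coro
        # A runner is active: wrap the coroutine into a Task (type 1).
        return loop.create_task(coro)


@autotask
async def foo():
    return 42


async def main():
    task = foo()  # bare call inside a running loop
    is_task = isinstance(task, asyncio.Task)
    value = await task
    raw = foo.__corocall__()  # explicit request for the raw coroutine
    is_coro = asyncio.iscoroutine(raw)
    raw_value = await raw
    return is_task, value, is_coro, raw_value


inside = asyncio.run(main())

# Outside any loop the bare call still returns a plain coroutine,
# so the "run the result of asyncfn()" idiom keeps working.
outside_coro = foo()
outside_is_coro = asyncio.iscoroutine(outside_coro)
outside_value = asyncio.run(outside_coro)
```

Unlike the proposal, `await foo()` in this emulation still pays for a Task, because only the interpreter could route `await` through `__corocall__`.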
Re: [Python-ideas] Inline assignments using "given" clauses
I hope Python never has to go there. It's a tooling nightmare. On Fri, May 4, 2018 at 2:11 PM, Nathaniel Smith wrote: > On Fri, May 4, 2018 at 1:53 PM, Tim Peters wrote: > > [Tim] > >>> ... > >>> It's no longer the case that Python avoided that entirely, since > >>> "async def", "async for", and "async with" statements were added > >>> _without_ making "async" a new reserved word. It may require pain in > >>> the parser, but it's often doable anyway. At this stage in Python's > >>> life, adding new _reserved_ words "should be" an extremely high bar - > >>> but adding new non-reserved keywords (like "async") should be a much > >>> lower bar. > > > > [Guido] > >> Do note that this was a temporary solution. In 3.5 we introduced this > hack. > >> In 3.6, other uses of `async` and `await` became deprecated (though > you'd > >> have to use `python -Wall` to get a warning). In 3.7, it's a syntax > error. > > > > See my "that deserves more thought" at the start, but wrt future cases > > then ;-) In 3.5 and 3.6, "everything just works" for everyone. In > > 3.7 the implementation gets churned again, to go out of its way to > > break the handful of code using "async" as an identifier. It's > > obvious who that hurts, but who does that really benefit? > > > > My experience with Fortran convinces me nobody would _actually_ be > > confused even if they wrote code like: > > > > async def frobnicate(async=True): > > if async: > > async with ... > > IIUC, Javascript has also gone all-in on contextual keywords. The > realities of browser deployment mean they simply cannot have flag days > or break old code, ever, meaning that contextual keywords are really > the only kind they can add at all. So their async/await uses the same > kind of trick that Python 3.5 did, and I believe they plan to keep it > that way forever. > > FWIW. > > -n > > -- > Nathaniel J. 
Smith -- https://vorpus.org > -- --Guido van Rossum (python.org/~guido)
Re: [Python-ideas] Inline assignments using "given" clauses
On Fri, May 4, 2018 at 1:53 PM, Tim Peters wrote:
> [Tim]
>>> ...
>>> It's no longer the case that Python avoided that entirely, since
>>> "async def", "async for", and "async with" statements were added
>>> _without_ making "async" a new reserved word. It may require pain in
>>> the parser, but it's often doable anyway. At this stage in Python's
>>> life, adding new _reserved_ words "should be" an extremely high bar -
>>> but adding new non-reserved keywords (like "async") should be a much
>>> lower bar.
>
> [Guido]
>> Do note that this was a temporary solution. In 3.5 we introduced this hack.
>> In 3.6, other uses of `async` and `await` became deprecated (though you'd
>> have to use `python -Wall` to get a warning). In 3.7, it's a syntax error.
>
> See my "that deserves more thought" at the start, but wrt future cases
> then ;-) In 3.5 and 3.6, "everything just works" for everyone. In
> 3.7 the implementation gets churned again, to go out of its way to
> break the handful of code using "async" as an identifier. It's
> obvious who that hurts, but who does that really benefit?
>
> My experience with Fortran convinces me nobody would _actually_ be
> confused even if they wrote code like:
>
>     async def frobnicate(async=True):
>         if async:
>             async with ...

IIUC, JavaScript has also gone all-in on contextual keywords. The realities of browser deployment mean they simply cannot have flag days or break old code, ever, meaning that contextual keywords are really the only kind they can add at all. So their async/await uses the same kind of trick that Python 3.5 did, and I believe they plan to keep it that way forever.

FWIW.

-n

--
Nathaniel J. Smith -- https://vorpus.org
Re: [Python-ideas] Inline assignments using "given" clauses
[Tim] >> ... >> It's no longer the case that Python avoided that entirely, since >> "async def", "async for", and "async with" statements were added >> _without_ making "async" a new reserved word. It may require pain in >> the parser, but it's often doable anyway. At this stage in Python's >> life, adding new _reserved_ words "should be" an extremely high bar - >> but adding new non-reserved keywords (like "async") should be a much >> lower bar. [Guido] > Do note that this was a temporary solution. In 3.5 we introduced this hack. > In 3.6, other uses of `async` and `await` became deprecated (though you'd > have to use `python -Wall` to get a warning). In 3.7, it's a syntax error. See my "that deserves more thought" at the start, but wrt future cases then ;-) In 3.5 and 3.6, "everything just works" for everyone. In 3.7 the implementation gets churned again, to go out of its way to break the handful of code using "async" as an identifier. It's obvious who that hurts, but who does that really benefit? My experience with Fortran convinces me nobody would _actually_ be confused even if they wrote code like: async def frobnicate(async=True): if async: async with ... But nobody would actually do that. Then again, "but people _could_ do that!" barely registers with me because the nobody-actually-does-it theoretical possibilities were so much worse in Fortran, so I tend to tune that kind of argument out reflexively. For example, whitespace was also irrelevant in Fortran, and these two statements mean radically different things: D O1 0I=1 00,30 0 D O1 0I=1 00.30 0 The first is like: for I in range(100, 301): # the block ends at the next statement with label 10 The seconds is like: DO10I = 100.300 All actual Fortran code spells them like this instead: DO 10 I = 100, 300 DO10I = 100.300 The differing intents are obvious at a glance then - although, yup, to the compiler the difference is solely due to that one uses a comma where the other a period. 
I'm not suggesting Python go anywhere near _that_ far ;-) Just as far as considering that there's no actual harm in Fortran allowing "DO" to be a variable name too. Nobody is even tempted to think that "DO" might mean "DO loop" in, e.g.,

    DO = 4
    X = FUNC(DO)
    X = DO(Y)
    IF (DO.OR.DONTDO) GOTO 10

etc. People generally _don't_ use Fortran keywords as identifiers despite that they can, but it's a real boon for the relatively rare older code that failed to anticipate keywords added after it was written.

> ...
> I'd also say that the difficulty of Googling for the meaning of ":="
> shouldn't be exaggerated. Currently you can search for "python operators"
> and get tons of sites that list all operators.

I've noted before that people don't seem to have trouble finding the meaning of Python's "is", "and", and "or" either. But Googling for "is" (etc) on its own isn't the way to do it ;-)

> I also note that Google seems to be getting smarter about non-alphabetic
> searches -- I just searched for "python << operator" and the first hit was
> https://wiki.python.org/moin/BitwiseOperators

Ya - most arguments are crap ;-)
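A small sketch of the reserved-word point above, for anyone who wants to check what their own interpreter does. The helper name `usable_as_identifier` is made up for this illustration; it just tries to compile an assignment to the given name. The result for "async" depends on the Python version, as described in the message above.

```python
import keyword
import sys


def usable_as_identifier(name: str) -> bool:
    """Return True if `name = 1` compiles on the running interpreter."""
    try:
        compile(f"{name} = 1", "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# "given" is an ordinary name today; "async" is reserved from 3.7 onwards.
for word in ("async", "await", "given"):
    print(word, keyword.iskeyword(word), usable_as_identifier(word))
```

On an interpreter where "async" is a reserved word, old code that used it as a variable or argument name stops compiling, which is the breakage being weighed here.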
Re: [Python-ideas] Inline assignments using "given" clauses
On Sat, May 5, 2018 at 5:27 AM, Matt Arcidy wrote:
>> I'd also say that the difficulty of Googling for the meaning of ":="
>> shouldn't be exaggerated. Currently you can search for "python operators"
>> and get tons of sites that list all operators.
>
> Without adding hits to the search algorithm, this will remain the case.
> Google must have clicks to rank up. Right now there is no page, nothing on
> a high "Google juice" page like python.org, no one searching for it, and no
> mass of people clicking on it. no SO questions, etc.

Did you try? I searched for 'python :=' and for 'python colon equals' and got this hit each time:

https://stackoverflow.com/questions/26000198/what-does-colon-equal-in-python-mean

Which, incidentally, now has a response to it citing PEP 572. Good ol' Stack Overflow.

ChrisA
Re: [Python-ideas] Inline assignments using "given" clauses
On Fri, May 4, 2018, 11:35 Guido van Rossum wrote: > On Fri, May 4, 2018 at 11:11 AM, Tim Peters wrote: > >> [Nick Coghlan ] >> > ... >> > Using a new keyword (rather than a symbol) would make the new construct >> > easier to identify and search for, but also comes with all the >> downsides of >> > introducing a new keyword. >> >> That deserves more thought. I started my paying career working on a >> Fortran compiler, a language which, by design, had no reserved words >> (although plenty of keywords). The language itself (and >> vendor-specific extensions) never had to suffer "but adding a new >> keyword could break old code!" consequences. >> >> In practice that worked out very well, Yes, you _could_ write >> hard-to-read code using language keywords as, e.g., identifier names >> too, but, no, absolutely nobody did that outside of "stupid Fortran >> tricks" posts on Usenet ;-) It had the _intended_ effect in practice: >> no breakage of old code just because the language grew new >> constructs. >> >> It's no longer the case that Python avoided that entirely, since >> "async def", "async for", and "async with" statements were added >> _without_ making "async" a new reserved word. It may require pain in >> the parser, but it's often doable anyway. At this stage in Python's >> life, adding new _reserved_ words "should be" an extremely high bar - >> but adding new non-reserved keywords (like "async") should be a much >> lower bar. >> > > Do note that this was a temporary solution. In 3.5 we introduced this > hack. In 3.6, other uses of `async` and `await` became deprecated (though > you'd have to use `python -Wall` to get a warning). In 3.7, it's a syntax > error. > > >> That said, I expect it's easier in general to add a non-reserved >> keyword introducing a statement (like "async") than one buried inside >> expressions ("given"). >> > > I'd also say that the difficulty of Googling for the meaning of ":=" > shouldn't be exaggerated. 
Currently you can search for "python operators" > and get tons of sites that list all operators. > Without adding hits to the search algorithm, this will remain the case. Google must have clicks to rank up. Right now there is no page, nothing on a high "Google juice" page like python.org, no one searching for it, and no mass of people clicking on it. no SO questions, etc. there is a transient response for all change. uniqueness and length of search term is just a faster one. All python syntax is findable eventually due to popularity. plus a better search is "why would I use...in python" or similar. = python also doesn't bring up anything interesting that wouldn't be had because of just "python". The details are too mundane and/or technical and everyone knows already. > that being said, if := had been (theoretically) included from the beginning, would people continue to have issues with it? unlikely, but I can't know. familiarity will cure many of these issues of readability or symbolic disagreement no matter what is chosen (well, to a point). it's unfortunate that changes have to be made up front with so little information like that, so I'm not advocating anything based on this, just pointing it out. I do think post hoc assignment will cause a cognitive load, like trying to figure out which variable is the iterator, and having to keep two contexts till the end of a comp with one given statement. [f(x) + a for all a in blah given x=1] not worse than a double nested comp though. 
> > I also note that Google seems to be getting smarter about non-alphabetic > searches -- I just searched for "python << operator" and the first hit was > https://wiki.python.org/moin/BitwiseOperators > > -- > --Guido van Rossum (python.org/~guido) > ___ > Python-ideas mailing list > Python-ideas@python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Inline assignments using "given" clauses
On Fri, May 4, 2018 at 11:11 AM, Tim Peters wrote: > [Nick Coghlan ] > > ... > > Using a new keyword (rather than a symbol) would make the new construct > > easier to identify and search for, but also comes with all the downsides > of > > introducing a new keyword. > > That deserves more thought. I started my paying career working on a > Fortran compiler, a language which, by design, had no reserved words > (although plenty of keywords). The language itself (and > vendor-specific extensions) never had to suffer "but adding a new > keyword could break old code!" consequences. > > In practice that worked out very well, Yes, you _could_ write > hard-to-read code using language keywords as, e.g., identifier names > too, but, no, absolutely nobody did that outside of "stupid Fortran > tricks" posts on Usenet ;-) It had the _intended_ effect in practice: > no breakage of old code just because the language grew new > constructs. > > It's no longer the case that Python avoided that entirely, since > "async def", "async for", and "async with" statements were added > _without_ making "async" a new reserved word. It may require pain in > the parser, but it's often doable anyway. At this stage in Python's > life, adding new _reserved_ words "should be" an extremely high bar - > but adding new non-reserved keywords (like "async") should be a much > lower bar. > Do note that this was a temporary solution. In 3.5 we introduced this hack. In 3.6, other uses of `async` and `await` became deprecated (though you'd have to use `python -Wall` to get a warning). In 3.7, it's a syntax error. > That said, I expect it's easier in general to add a non-reserved > keyword introducing a statement (like "async") than one buried inside > expressions ("given"). > I'd also say that the difficulty of Googling for the meaning of ":=" shouldn't be exaggerated. Currently you can search for "python operators" and get tons of sites that list all operators. 
I also note that Google seems to be getting smarter about non-alphabetic searches -- I just searched for "python << operator" and the first hit was https://wiki.python.org/moin/BitwiseOperators

--
--Guido van Rossum (python.org/~guido)
Re: [Python-ideas] Pattern Matching Syntax
04.05.18 20:48, Tim Peters wrote:
> [Guido]
>> Can I recommend going slow here? This is a very interesting topic where many
>> languages have gone before. I liked Daniel F Moisset's analysis about the
>> choices of a language designer and his conclusion that match should be a
>> statement.
>
> Just to be annoying ;-) , I liked the way he _reached_ that conclusion: by
> looking at real-life Python code that may have been written instead to use
> constructs "like this". I find such examination far more persuasive than
> abstract arguments or made-up examples.

I would like to see such examination for PEP 572. And for all other syntax changing ideas.

I withdrew some of my ideas and patches when my examinations showed that the number of cases in the stdlib that would benefit from rewriting using a new feature, or from applying a compiler optimization, was not large enough.
Re: [Python-ideas] Inline assignments using "given" clauses
[Nick Coghlan ] > ... > Using a new keyword (rather than a symbol) would make the new construct > easier to identify and search for, but also comes with all the downsides of > introducing a new keyword. That deserves more thought. I started my paying career working on a Fortran compiler, a language which, by design, had no reserved words (although plenty of keywords). The language itself (and vendor-specific extensions) never had to suffer "but adding a new keyword could break old code!" consequences. In practice that worked out very well, Yes, you _could_ write hard-to-read code using language keywords as, e.g., identifier names too, but, no, absolutely nobody did that outside of "stupid Fortran tricks" posts on Usenet ;-) It had the _intended_ effect in practice: no breakage of old code just because the language grew new constructs. It's no longer the case that Python avoided that entirely, since "async def", "async for", and "async with" statements were added _without_ making "async" a new reserved word. It may require pain in the parser, but it's often doable anyway. At this stage in Python's life, adding new _reserved_ words "should be" an extremely high bar - but adding new non-reserved keywords (like "async") should be a much lower bar. That said, I expect it's easier in general to add a non-reserved keyword introducing a statement (like "async") than one buried inside expressions ("given"). ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Pattern Matching Syntax
[Guido] > Can I recommend going slow here? This is a very interesting topic where many > languages have gone before. I liked Daniel F Moisset's analysis about the > choices of a language designer and his conclusion that match should be a > statement. Just to be annoying ;-) , I liked the way he _reached_ that conclusion: by looking at real-life Python code that may have been written instead to use constructs "like this". I find such examination far more persuasive than abstract arguments or made-up examples. An observation: syntax borrowed from functional languages often fails to work well in practice when grafted onto a language that's statement-oriented - it only works well for the expression subset of the language. and even then just for when that subset is being used in a functional way (e.g., the expression `object.method(arg)` is usually used for its side effects, not for its typically-None return value). OTOH, syntax borrowed from a statement-oriented language usually fails to work at all when grafted onto an "almost everything's an expression" language. So that's an abstract argument of my own, but - according to me - should be given almost no weight unless confirmed by examining realistic code. Daniel did some of both - great! > ... > A larger topic may be how to reach decisions. If I've learned one thing from > PEP 572 it's that we need to adjust how we discuss and evaluate proposals. > I'll think about this and start a discussion at the Language Summit about > this. Python needs something akin to a dictator, who tells people how things are going to be, like it or not. But a benevolent dictator, not an evil one. And to prevent palace intrigue, they should hold that position for life. Just thinking outside the box there ;-) ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Inline assignments using "given" clauses
04.05.18 15:06, Nick Coghlan wrote:
> Recapping the use cases where the inline assignment capability received the
> most agreement regarding being potentially more readable than the status quo:

Sorry, but these don't look like good examples for inline assignments to me. I think that all these cases can be written better without using inline assignment.

> 1. Making an "exactly one branch is executed" construct clearer than is the
> case for nested if statements:
>
>     if m := pattern.search(data):
>         ...
>     elif m := other_pattern.search(data):
>         ...
>     else:
>         ...

This case can be better handled by combining patterns in a single regular expression.

    pattern = re.compile('(?P<foo>pattern1)|(?P<bar>pattern2)|...')
    m = pattern.search(data)
    if not m: # this can be omitted if the pattern is always found
        ...
    elif m.group('foo'):
        ...
    elif m.group('bar'):
        ...

See for example gettext.py where this pattern is used.

> 2. Replacing a loop-and-a-half construct:
>
>     while m := pattern.search(remaining_data):
>         ...

This case can be better handled by re.finditer().

    for m in pattern.finditer(remaining_data):
        ...

In more complex cases it is handy to write a simple generator function and iterate over its result. A large number of similar cases are covered by the two-argument iter().

> 3. Sharing values between filtering clauses and result expressions in
> comprehensions:
>
>     result = [(x, y, x/y) for x in data if (y := f(x))]

There are a lot of ways of writing this. PEP 572 mentions them. Different ways are used in real code depending on the preferences of the author. Actually the number of these cases is pretty low in comparison with the total number of comprehensions.

It is possible to express an assignment in comprehensions with the "for var in [value]" idiom, and this idiom is more powerful than PEP 572 in this case because it allows performing an assignment before the first 'for'. But really complex comprehensions could be better written as 'for' statements with explicit adding to the collection or yielding.
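For readers who want to try the first two alternatives, here is a minimal sketch. The two sub-patterns (a run of digits, a run of letters) and the function names `classify` and `tokens` are invented for the illustration; only `re.compile`, `search`, `finditer`, and `Match.lastgroup` come from the standard library.

```python
import re

# Hypothetical patterns: a run of digits or a run of letters.
combined = re.compile(r"(?P<number>\d+)|(?P<word>[A-Za-z]+)")


def classify(data):
    """Return (kind, text) for the first match in data, or None."""
    m = combined.search(data)
    if not m:
        return None
    # lastgroup names the alternative that actually matched.
    return m.lastgroup, m.group(m.lastgroup)


def tokens(data):
    """Iterate over every match instead of writing a loop-and-a-half."""
    return [(m.lastgroup, m.group()) for m in combined.finditer(data)]


print(classify("  42 apples"))
print(tokens("42 apples, 7 pears"))
```

One caveat worth keeping in mind: a combined pattern reports whichever alternative matches leftmost in the data, whereas an if/elif chain searches the whole string for the first pattern before trying the second. The two spellings agree only when that difference does not matter for the input at hand.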
Re: [Python-ideas] Inline assignments using "given" clauses
Thanks! Perhaps the most useful bit of this post is the clear list of use cases (a useful summary of the motivational part of PEP 572). On Fri, May 4, 2018 at 5:06 AM, Nick Coghlan wrote: > (Note: Guido's already told me off-list that he doesn't like the way this > spelling reads, but I wanted to share it anyway since it addresses one of > the recurring requests in the PEP 572 discussions for a more targeted > proposal that focused specifically on the use cases that folks had agreed > were reasonable potential use cases for inline assignment expressions. > > I'll also note that another potential concern with this specific proposal > is that even though "given" wasn't used as a term in any easily discovered > Python APIs back when I first wrote PEP 3150, it's now part of the > Hypothesis testing API, so adopting it as a keyword now would be markedly > more disruptive than it might have been historically) > > Recapping the use cases where the inline assignment capability received > the most agreement regarding being potentially more readable than the > status quo: > > 1. Making an "exactly one branch is executed" construct clearer than is > the case for nested if statements: > > if m := pattern.search(data): > ... > elif m := other_pattern.search(data): > ... > else: > ... > > 2. Replacing a loop-and-a-half construct: > > while m := pattern.search(remaining_data): > ... > > 3. Sharing values between filtering clauses and result expressions in > comprehensions: > > result = [(x, y, x/y) for x in data if (y := f(x))] > > The essence of the given clause concept would be to modify *these specific > cases* (at least initially) to allow the condition expression to be > followed by an inline assignment, of the form "given TARGET = EXPR". 
(Note: > being able to implement such a syntactic constraint is a general > consequence of using a ternary notation rather than a binary one, since it > allows the construct to start with an arbitrary expression, without > requiring that expression to be both the result of the operation *and* the > value bound to a name - it isn't unique to the "given" keyword specifically) > > While the leading keyword would allow TARGET to be an arbitrary assignment > target without much chance for confusion, it could also be restricted to > simple names instead (as has been done for PEP 572. > > With that spelling, the three examples above would become: > > # Exactly one branch is executed here > if m given m = pattern.search(data): > ... > elif m given m = other_pattern.search(data)): > ... > else: > ... > > # This name is rebound on each trip around the loop > while m given m = pattern.search(remaining_data): > ... > > # "f(x)" is only evaluated once on each iteration > result = [(x, y, x/y) for x in data if y given y = f(x)] > > Constraining the syntax that way (at least initially) would avoid poking > into any dark corners of Python's current scoping and expression execution > ordering semantics, while still leaving the door open to later making > "result given NAME = expr" a general purpose ternary operator that returns > the LHS, while binding the RHS to the given name as a side effect. > > Using a new keyword (rather than a symbol) would make the new construct > easier to identify and search for, but also comes with all the downsides of > introducing a new keyword. (Hence the not-entirely-uncommon suggestion of > using "with" for a purpose along these lines, which runs into a different > set of problems related to trying to use "with" for two distinct and > entirely unrelated purposes). > > Cheers, > Nick. 
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia

--
--Guido van Rossum (python.org/~guido)
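A rough sketch of use case 3 (sharing a value between the filter and the result of a comprehension), comparing the PEP 572 spelling with a spelling that needs no new syntax. The function `f` and the sample data are placeholders made up for the example, and the `:=` line assumes an interpreter that implements PEP 572 (Python 3.8 or later). The "given" spelling is not shown because it is not valid Python.

```python
calls = []


def f(x):
    """Stand-in for an expensive function; records each call."""
    calls.append(x)
    return x - 2


data = [1, 2, 3, 4]

# PEP 572 spelling (assumes Python 3.8 or later).
with_walrus = [(x, y, x / y) for x in data if (y := f(x))]

# Spelling without new syntax: bind y with a one-element inner loop.
with_idiom = [(x, y, x / y) for x in data for y in [f(x)] if y]

print(with_walrus == with_idiom, len(calls))
```

Both spellings call `f` once per element and skip the element where `f(x)` is zero, so the division never sees a zero divisor.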
Re: [Python-ideas] Pattern Matching Syntax
Can I recommend going slow here? This is a very interesting topic where many languages have gone before. I liked Daniel F Moisset's analysis about the choices of a language designer and his conclusion that match should be a statement. I just noticed the very similar proposal for JavaScript linked to by the OP: https://github.com/tc39/proposal-pattern-matching -- this is more relevant than what's done in e.g. F# or Swift because Python and JS are much closer. (Possibly Elixir is also relevant, it seems the JS proposal borrows from that.) A larger topic may be how to reach decisions. If I've learned one thing from PEP 572 it's that we need to adjust how we discuss and evaluate proposals. I'll think about this and start a discussion at the Language Summit about this. -- --Guido van Rossum (python.org/~guido) ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Pattern Matching Syntax
On 2018-05-04 08:26, Jacco van Dorp wrote: > Would this be valid? > > # Pattern matching with guards > x = 'three' > > number = match x: > 1 => "one" > y if y is str => f'The string is {y}' > z if z is int => f'the int is {z}' > _ => "anything" > > print(number) # The string is three > > If so, why are y and z both valid here ? Is the match variable rebound > to any other ? Or even to all names ? In the match case here: match x: y if y > 3 => f'{y} is >3' # to use an example that works there are three parts: "y" is a pattern. It specifies the shape of the value to match: in this case, anything at all. Nothing is bound yet. "if" is just the word if, used as a separator, nothing to do with "if" in expressions. "y > 3" is the guard expression for the match case. Iff the pattern matches, "y > 3" is evaluated, with names appearing in the pattern taking the values they matched. It's important to note that the thing on the left-hand side is explicitly *not* a variable. It's a pattern, which can look like a variable, but it could also be a literal or a display. > ofc, you could improve the clarity here with: > > number = match x as y: > > or any variant thereof. This way, you'd explicitely bind the variable > you use for testing. If you don't, the interpreter would never know > which ones to treat as rebindings and which to draw from surrounding > scopes, if any. I don't think anything in the pattern should be drawn from surrounding scopes. > I also haven't really seen a function where this would be better than > existing syntax, and the above is the only one to actually try > something not possible with dicts. 
The type checking one could better be:
>
> [snip]
>
> The production datetime code could be:
>
> def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
>     return {
>         "days":timedelta(**{unit: amount}),
>         "hours":timedelta(**{unit: amount}),
>         "weeks":timedelta(**{unit: amount}),
>         # why not something like subtracting two dates here to get an
>         # accurate timedelta for your specific interval ?
>         "months":timedelta(days = 30*amount), # days = (365.25 / 12)*amount ?
>         # Would be a lot more accurate for average month length. (30.4375)
>         "years":timedelta(days=365*amount), # days = 365.25*amount ?
>         "cal_years":timedelta(now - now.replace(year=now.year - amount)),
>     }.get(unit)

Don't you think the repetition of ``timedelta(**{unit: amount})'' sort of proves OP's point?

Incidentally, there's no need to use the dict trick when the unit is known statically anyway. I can't decide if that would count as more repetition or less.

> I honestly don't see the advantages of new syntax here.
> Unless you hate the eager evaluation in the dict literal getting
> indexed, so if it's performance critical an if/else might be better.
> But I can't see a match statement outperforming if/else. (and if you
> really need faster than if/else, you should perhaps move that bit of
> code to C or something.)

It's not really about performance. It's about power. A bunch of if statements can do many things--anything, arguably--but their generality translates into repetition when dealing with many instances of this family of cases.
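To make the "eager evaluation in the dict literal" remark concrete, here is one possible sketch that keeps the dict but stores callables, so only the selected branch runs. The function name `to_timedelta` and the choice to raise `ValueError` for an unknown unit are assumptions of this sketch, not part of the quoted code. Subtracting two dates already yields a timedelta, so the "cal_years" branch uses the difference directly.

```python
from datetime import date, timedelta


def to_timedelta(unit: str, amount: int, now: date) -> timedelta:
    """Convert (unit, amount) to a timedelta, evaluating only one branch."""
    branches = {
        # The unit is known statically in each branch, so no **{unit: amount}.
        "days": lambda: timedelta(days=amount),
        "hours": lambda: timedelta(hours=amount),
        "weeks": lambda: timedelta(weeks=amount),
        "months": lambda: timedelta(days=30 * amount),
        "years": lambda: timedelta(days=365 * amount),
        # Subtracting two dates already yields a timedelta.
        "cal_years": lambda: now - now.replace(year=now.year - amount),
    }
    try:
        branch = branches[unit]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return branch()


print(to_timedelta("days", 3, date(2020, 2, 29)))
print(to_timedelta("cal_years", 1, date(2018, 5, 4)))
```

With an eagerly evaluated dict literal, a `now` of 29 February and an `amount` of 1 would fail inside the "cal_years" entry even when the caller asked for "days", because `date.replace` rejects 29 February in a non-leap year. Deferring each branch avoids that, at the cost of some lambda noise, which is arguably the kind of repetition being discussed.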
Re: [Python-ideas] Pattern Matching Syntax
On Sat, May 5, 2018 at 12:45 AM, Daniel Moisset wrote: >> (3) Unlike a case/switch statement, there's no implication that the >> compiler could optimise the order of look-ups; it is purely top to >> bottom. > > > [we are talking about a multi-branch pattern matching statement now, not > just "apttern matching"] In most practical cases, a compiler can do > relatively simple static analysis (even in python) that could result in > performance improvements. One obvious improvement is that the matched > expression can be evaluated once (you can achieve the same effect always > with a dummy variable assignment right before the if/elif statement). That one isn't an optimization, but part of the specification; it is an advantage of the fact that you're writing the match expression just once. But all the rest of your optimizations aren't trustworthy. > But > for multiple literal string patterns (a common scenario), you can compile a > string matcher that is faster than a sequence of equality comparisons > (either through hashing+comparison fallback or creating some decision tree > that goes through the input string once). Only if you're doing equality checks (no substrings or anything else where it might match more than one of them). And if you're doing "pattern matching" that's nothing more than string equality comparisons, a dict is a better way to spell it. > For integers you can make lookup tables. If they're just equality checks, again, a dict is better. If they're ranges, you would have to ensure that they don't overlap (no problem if they're all literals), and then you could potentially optimize it. > Even an ifinstance check choosing between several branches (a not so > uncommon occurrence) could be implemented by a faster operation if somewhat > considered that relevant. Only if you can guarantee that no single object can be an instance of more than one of the types. Otherwise, you still have to check in some particular order. 
In CPython, you can guarantee that isinstance(x, int) and isinstance(x, str) can't both be true, but that's basically a CPython implementation detail, due to the way C-implemented classes work. You can't use this to dispatch based on exception types, for instance. Let's say you try to separately dispatch ValueError, ZeroDivisionError, and OSError; and then you get this:

>>> class DivisionByOSError(ZeroDivisionError, OSError, ValueError): pass
...
>>> raise DivisionByOSError()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
__main__.DivisionByOSError

That thing really truly is all three of those types, and you have to decide how to dispatch that. So there needs to be an order to the checks, with no optimization.

ChrisA
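A short sketch of why the order of the checks is observable for that class. The two dispatch functions are made up for the illustration; they differ only in the order of their except clauses, and they give different answers for the same exception.

```python
class DivisionByOSError(ZeroDivisionError, OSError, ValueError):
    """The class from the message above: an instance of all three types."""


def dispatch_value_first():
    try:
        raise DivisionByOSError()
    except ValueError:
        return "ValueError"
    except ZeroDivisionError:
        return "ZeroDivisionError"
    except OSError:
        return "OSError"


def dispatch_os_first():
    try:
        raise DivisionByOSError()
    except OSError:
        return "OSError"
    except ZeroDivisionError:
        return "ZeroDivisionError"
    except ValueError:
        return "ValueError"


print(dispatch_value_first(), dispatch_os_first())
```

Since the first matching clause wins, a compiler that reordered type checks for speed would change which branch runs for objects like this one.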
Re: [Python-ideas] Pattern Matching Syntax
This email from Steve has some good questions; let me try to help organize ideas:

On 4 May 2018 at 13:11, Steven D'Aprano wrote:
> I'll make a start, and you can correct me if I get any of it wrong.
>
> (1) Pattern matching takes a value, and compares it to a series of
> *patterns* until the first match, at which point it returns a specified
> value, skipping the rest of the patterns.

In a general sense, based on most other languages, patterns are a syntactic construct that can be "matched" with a value at runtime. The matching process has two effects at once:

1) check that the value has some specific form dictated by the pattern (which can have a yes/no result)
2) bind some assignable targets referenced in the pattern to components of the value matched. The binding is usually done only if there is a match according to (1)

Python actually has some form of patterns (called "target_list" in the formal syntax) that are used in assignments, for loops, and other places. As it is mostly restricted to assigning single values or decomposing iterables, we normally say "tuple unpacking" instead of "pattern matching". And there's a second type of pattern, which is included in the try/except statement and matches by subtype (and can also bind a name).

As a language designer, once you have your notion of matching defined, you can choose which kinds of constructs use patterns (I just mentioned the left side of assignments, between "for" and "in", etc. in Python). The usual constructs are multi-branch statements/expressions that match a single value against several patterns, and run a branch of code depending on which pattern matched (after performing the corresponding bindings). That's not the only option: you could also implement patterns in other places, like regular assuments, or the conditions of loops and conditionals [resulting in an effect similar to some of the ones being discussed in the PEP572 thread]; although this last sentence is beyond what the OP was suggesting and is a generalization of the idea.

> (2) Patterns typically are single values, and the match is by equality,
> although other kinds of patterns are available as well.

Typical patterns in other languages include:

a) single values (matched by equality)
b) assignables (names, stuff like mylist[0] or self.color) which match anything and bind the value to the assignables
c) type patterns (a value matches if the type of the value has a certain supertype)
d) structure patterns (a value matches if it has a certain structure, for example being a dict with certain keys, or an iterable with a certain number of elements). These are usually recursive, and components of the structure can also be patterns
e) arbitrary boolean conditions (that can use the names bound by other parts of the pattern)

Python has support for (b) and (d) in both assignments and for loops. Python supports (b) and (c) in try statements. The OP's proposal offers expanding to most of these patterns, and implementing some sort of pattern matching expression. I argued in another email that a pattern matching statement feels more adequate to Python (I'm not arguing at this point if it's a good idea, just that IF any is a good idea, it's the statement).

As an example, you could have a pattern (invented syntax) like "(1, 'foo', bar, z: int)" which would positively match 4-element tuples that have 1 in the first position, 'foo' in the second, and an int instance in the last; when matching, it would bind the names "bar" and "z" to the last 2 elements in the tuple.

> (3) Unlike a case/switch statement, there's no implication that the
> compiler could optimise the order of look-ups; it is purely top to
> bottom.

[we are talking about a multi-branch pattern matching statement now, not just "pattern matching"]

In most practical cases, a compiler can do relatively simple static analysis (even in Python) that could result in performance improvements. One obvious improvement is that the matched expression can be evaluated once (you can always achieve the same effect with a dummy variable assignment right before the if/elif statement). But for multiple literal string patterns (a common scenario), you can compile a string matcher that is faster than a sequence of equality comparisons (either through hashing+comparison fallback or creating some decision tree that goes through the input string once). For integers you can make lookup tables. Even an isinstance check choosing between several branches (a not so uncommon occurrence) could be implemented by a faster operation if someone considered that relevant.

> (4) Unlike if...elif, each branch is limited to a single expression, not
> a block. That's a feature: a match expression takes an input, and
> returns a value, and typically we don't have to worry about it having
> side-effects.
>
> So it is intentionally less general than a chain of if...elif blocks.

That's the OP proposal, yes (as I mentioned, I argued with some simple data that a feature like that is of a mo
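To illustrate the invented pattern "(1, 'foo', bar, z: int)" from the message above, here is a hand-written sketch of what matching it means in terms of today's Python. The function name `match_example` and the choice to return the bindings as a dict are assumptions of this sketch; the comments tie each step to the pattern kinds (a) to (d) listed in the message.

```python
def match_example(value):
    """Return {'bar': ..., 'z': ...} if value matches, else None."""
    # (d) structure: a tuple of exactly four elements
    if not (isinstance(value, tuple) and len(value) == 4):
        return None
    # (b) assignables: plain tuple unpacking binds the components
    first, second, bar, z = value
    # (a) literal sub-patterns, compared by equality
    if first != 1 or second != "foo":
        return None
    # (c) type sub-pattern on the last element
    if not isinstance(z, int):
        return None
    return {"bar": bar, "z": z}


print(match_example((1, "foo", [2], 5)))
print(match_example((1, "foo", 2, "x")))
```

The check and the binding are separate steps here, whereas a pattern construct would do both at once; that pairing is the two-effects-at-once behaviour described at the start of the message.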
Re: [Python-ideas] Pattern Matching Syntax
Note that most languages that you mentioned as references are functional (so they don't have a statement/expression distinction like Python has), and those that are not have matching statements. The only exception is Javascript, but in Javascript the distinction is not that hard, given that it has the idiom (function() {stmt; stmt; stmt})() to use any statement block as an expression. And again, as I mentioned, it's an outlier. Other imperative languages like C and Java have, of course, switch statements, which are similar.

Making a quick search for real code that could benefit from this, I mostly found situations where a matching *statement* would be required instead of a matching *expression*. To give you the examples I found in the stdlib for Python 3.6 (I grepped for "elif" and looked for "similar" branches manually, covering the first ~20%):

fnmatch.translate (match c: ... string options)
telnetlib.Telnet.process_rawq (match len(self.iacseq): ... integer options)
mimetypes[module __main__ body] (match opt: ... multiple str options per match)
typing._remove_dups_flatten (match p: ... isinstance checks + custom condition) [this *could* be an expression with some creativity]
typing.GenericMeta.__getitem__ (match self: ... single and multiple type options by identity)
turtle.Shape.__init__ (match type_: ... str options)
turtle.TurtleScreen._color (match len(cstr): ... int options)
turtle.TurtleScreen.colormode (match cmode: ... mixed type options)
turtle.TNavigator.distance (match x: ... isinstance checks) [could be an expression]
turtle.TNavigator.towards (match x: ... isinstance checks) [could be an expression]
turtle.TPen.color (match l: ... integer options. l is set to len(args) the line before)
turtle._TurtleImage._setshape (match self._type: ... str options) [could be an expression]
turtle.RawTurtle.__init__ (match canvas: ... isinstance checks)
turtle.RawTurtle.clone (match ttype: ... str options) [could be an expression]
turtle.RawTurtle._getshapepoly (match self._resizemode: ... str options, one with a custom condition or'ed)
turtle.RawTurtle._drawturtle (match ttype: ... str options)
turtle.RawTurtle.stamp (match ttype: ... str options)
turtle.RawTurtle._undo (match action: ... str options)
ntpath.expandvars (match c: ... str options)
sre_parse.Subpattern.getwidth (match op: ... nonliteral int constants, actually a NamedIntConstant which subclasses int)
sre_parse._class_escape (match c: ... string options with custom conditions, and inclusion+equality mixed)
sre_parse._escape (match c: ... string options with custom conditions, and inclusion+equality mixed)
sre_parse._parse (match this: ... string options with in, not in, and equality)
sre_parse._parse (match char: ... string options with in, and equality)
sre_parse.parse_template (match c: ... string options with in)
netrc.netrc._parse (match tt: ... string options with custom conditions)
netrc.netrc._parse (match tt: ... string options with custom conditions) [not a duplicate, there are two possible statements here]
argparse.HelpFormatter._format_args (match action.nargs: ... str/None options) [this *could* be an expression with some creativity/transformations]
argparse.ArgumentParser._get_nargs_pattern (match nargs: ... str/None options) [could be an expression]
argparse.ArgumentParser._get_values (match action.nargs: ... str/None options with extra conditions)
_strptime._strptime (match group_key: ... str options)
datetime._wrap_strftime (match ch: ... str options)
pickletools.optimize (match opcode,name: ... str options with reverse inclusion and equality)
json/encoder._make_iterencode (match value: ... mixed options and isinstance checks)
json/encoder._make_iterencode._iterencode_dict (match key: ... mixed options and isinstance checks)
json/encoder._make_iterencode._iterencode_dict (match value: ... mixed options and isinstance checks)
json/encoder._make_iterencode._iterencode (match o: ... mixed options and isinstance checks)
json/scanner.py_make_scanner._scan_once (match nextchar: ... str options) [could be turned into an expression with some transformation]
unittest.mock._Call.__new__ (match _len: ... int options)
unittest.mock._Call.__eq__ (match len_other: ... int options)

(I'm not saying that all these should be match statements, only that they could be.)

Cases where an expression would solve the issue are somewhat uncommon (there are many state machines, including many string or argument parsers that set state depending on the option, or object builders that grow data structures). A usual situation is that some of the branches need to raise exceptions (and raise in Python is a statement, not an expression). This could be worked around by making the default a raise ValueError that can be caught and reraised as something else, but that would end up making the code deeper and, IMO, more complicated. Also, many of the situations where an expression could be used are string matches where a dictionary lookup would work well anyway.

My conclusions
Re: [Python-ideas] Pattern Matching Syntax
On Thu, May 03, 2018 at 11:36:27AM -0700, Robert Roskam wrote:
> So I started extremely generally with my syntax, but it seems like I should
> provide a lot more examples of real use.

Yes, real-life examples will be far more compelling and useful than made up examples and pseudo-code.

Also, I think that you should delay talking about syntax until you have explained in plain English what pattern matching does, how it differs from a switch/case statement (in languages that have them) and why it is better than the two major existing idioms in Python:

- chained if...elif
- dict dispatch.

I'll make a start, and you can correct me if I get any of it wrong.

(1) Pattern matching takes a value, and compares it to a series of *patterns* until the first match, at which point it returns a specified value, skipping the rest of the patterns.

(2) Patterns typically are single values, and the match is by equality, although other kinds of patterns are available as well.

(3) Unlike a case/switch statement, there's no implication that the compiler could optimise the order of look-ups; it is purely top to bottom.

(4) Unlike if...elif, each branch is limited to a single expression, not a block. That's a feature: a match expression takes an input, and returns a value, and typically we don't have to worry about it having side-effects.

So it is intentionally less general than a chain of if...elif blocks.

(5) We could think of it as somewhat analogous to a case/switch statement, a dict lookup, or if...elif, only better.

(Why is it better?)

Here is a list of patterns I would hope to support, off the top of my head:

* match by equality;
* match by arbitrary predicates such as "greater than X" or "between X and Y";
* match by string prefix, suffix, or substring;
* match by type (isinstance).

I think that before we start talking about syntax, we need to know what features we need syntax for.
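To make the list above concrete, here is a minimal sketch of the "first match wins, top to bottom" behaviour using only existing Python. Everything in it (match_first, equals, between, has_prefix, is_a, describe) is a hypothetical helper invented for illustration; none of these names come from the proposal, and this is not a suggested implementation.

```python
_NO_DEFAULT = object()


def match_first(value, cases, default=_NO_DEFAULT):
    """Try (predicate, result) pairs top to bottom; return the first hit."""
    for predicate, result in cases:
        if predicate(value):
            return result
    if default is _NO_DEFAULT:
        raise ValueError(f"no pattern matched {value!r}")
    return default


# Hypothetical pattern builders, one per kind of pattern in the list above.
def equals(expected):
    return lambda value: value == expected


def between(low, high):
    return lambda value: isinstance(value, (int, float)) and low <= value <= high


def has_prefix(prefix):
    return lambda value: isinstance(value, str) and value.startswith(prefix)


def is_a(*types):
    return lambda value: isinstance(value, types)


CASES = [
    (equals(0), "zero"),
    (between(1, 9), "single digit"),
    (has_prefix("0x"), "hex literal"),
    (is_a(str), "some other string"),
]


def describe(value):
    return match_first(value, CASES, default="no idea")
```

Under these assumptions, describe("0xff") picks the prefix case before the more general isinstance case purely because of ordering, which is the "purely top to bottom" property from point (3).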
There's probably more to it, because so far it doesn't look like anything but a restricted switch statement. Over to someone else with a better idea of why pattern matching has become ubiquitous in functional programming.

-- 
Steve
[Python-ideas] Inline assignments using "given" clauses
(Note: Guido's already told me off-list that he doesn't like the way this spelling reads, but I wanted to share it anyway since it addresses one of the recurring requests in the PEP 572 discussions for a more targeted proposal that focused specifically on the use cases that folks had agreed were reasonable potential use cases for inline assignment expressions.

I'll also note that another potential concern with this specific proposal is that even though "given" wasn't used as a term in any easily discovered Python APIs back when I first wrote PEP 3150, it's now part of the Hypothesis testing API, so adopting it as a keyword now would be markedly more disruptive than it might have been historically)

Recapping the use cases where the inline assignment capability received the most agreement regarding being potentially more readable than the status quo:

1. Making an "exactly one branch is executed" construct clearer than is the case for nested if statements:

    if m := pattern.search(data):
        ...
    elif m := other_pattern.search(data):
        ...
    else:
        ...

2. Replacing a loop-and-a-half construct:

    while m := pattern.search(remaining_data):
        ...

3. Sharing values between filtering clauses and result expressions in comprehensions:

    result = [(x, y, x/y) for x in data if (y := f(x))]

The essence of the given clause concept would be to modify *these specific cases* (at least initially) to allow the condition expression to be followed by an inline assignment, of the form "given TARGET = EXPR".
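For readers who want to run the three use cases recapped above, here is a small sketch in the PEP 572 ":=" spelling, which needs Python 3.8 or later. The patterns, data and function names (NUMBER, WORD, classify, all_numbers, ratios) are invented for illustration and are not taken from the proposal.

```python
import re

# Invented patterns for the sketch.
NUMBER = re.compile(r"\d+")
WORD = re.compile(r"[a-z]+")


def classify(data):
    # Use case 1: exactly one branch runs, with no nested if statements.
    if m := NUMBER.search(data):
        return ("number", m.group())
    elif m := WORD.search(data):
        return ("word", m.group())
    else:
        return ("nothing", "")


def all_numbers(data):
    # Use case 2: loop-and-a-half; the search result is bound and tested
    # in the loop header on every trip around the loop.
    found = []
    remaining = data
    while m := NUMBER.search(remaining):
        found.append(m.group())
        remaining = remaining[m.end():]
    return found


def ratios(data, f):
    # Use case 3: f(x) is evaluated once per item and shared between the
    # filter clause and the result expression; falsy results are skipped.
    return [(x, y, x / y) for x in data if (y := f(x))]
```

In ratios, the filter also keeps x / y from dividing by zero, since an item whose f(x) is 0 never reaches the result expression.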
(Note: being able to implement such a syntactic constraint is a general consequence of using a ternary notation rather than a binary one, since it allows the construct to start with an arbitrary expression, without requiring that expression to be both the result of the operation *and* the value bound to a name - it isn't unique to the "given" keyword specifically)

While the leading keyword would allow TARGET to be an arbitrary assignment target without much chance for confusion, it could also be restricted to simple names instead (as has been done for PEP 572).

With that spelling, the three examples above would become:

    # Exactly one branch is executed here
    if m given m = pattern.search(data):
        ...
    elif m given m = other_pattern.search(data):
        ...
    else:
        ...

    # This name is rebound on each trip around the loop
    while m given m = pattern.search(remaining_data):
        ...

    # "f(x)" is only evaluated once on each iteration
    result = [(x, y, x/y) for x in data if y given y = f(x)]

Constraining the syntax that way (at least initially) would avoid poking into any dark corners of Python's current scoping and expression execution ordering semantics, while still leaving the door open to later making "result given NAME = expr" a general purpose ternary operator that returns the LHS, while binding the RHS to the given name as a side effect.

Using a new keyword (rather than a symbol) would make the new construct easier to identify and search for, but also comes with all the downsides of introducing a new keyword. (Hence the not-entirely-uncommon suggestion of using "with" for a purpose along these lines, which runs into a different set of problems related to trying to use "with" for two distinct and entirely unrelated purposes).

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-ideas] Pattern Matching Syntax
On Thu, May 03, 2018 at 09:04:40PM +0100, Ed Kellett wrote:
> On 2018-05-03 19:57, Chris Angelico wrote:
> > Got it. Well, I don't see why we can't use Python's existing primitives.
> >
> > def hyperop(n, a, b):
> >     if n == 0: return 1 + b
> >     if n == 1: return a + b
> >     if n == 2: return a * b
> >     if n == 3: return a ** b
> >     if n == 4: return a *** b
> >     if n == 5: return a **** b
> >     if n == 6: return a ***** b
> >     ...
>
> Well, it'd be infinitely long, but I suppose I'd have to concede that
> that's in line with the general practicality level of the example.

Yes, but only countably infinite, so at least we can enumerate them all. Eventually :-)

And aside from the tiny niggle that *** and higher order operators are syntax errors...

It's not a bad example of the syntax, but it would be considerably more compelling a use-case if it were something less obscure and impractical.

-- 
Steve
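As an aside on the quoted example: since *** and higher operators don't exist, a runnable hyperop has to fall back on the usual recursive definition of hyperoperation for n >= 4, namely H_n(a, 0) = 1 and H_n(a, b) = H_(n-1)(a, H_n(a, b - 1)). The following is a minimal sketch of that definition, assuming non-negative integer arguments; it is not something proposed in the thread.

```python
def hyperop(n, a, b):
    """Hyperoperation H_n(a, b) for non-negative integers n, a, b."""
    if n == 0:
        return b + 1
    if n == 1:
        return a + b
    if n == 2:
        return a * b
    if n == 3:
        return a ** b
    # n >= 4: start from H_n(a, 0) = 1 and apply the next-lower operation
    # b times, so that after k steps result == H_n(a, k).
    result = 1
    for _ in range(b):
        result = hyperop(n - 1, a, result)
    return result
```

Values grow extremely fast, so only tiny arguments are practical, which rather supports the point that the example is obscure.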
Re: [Python-ideas] Pattern Matching Syntax
Would this be valid?

    # Pattern matching with guards
    x = 'three'

    number = match x:
        1 => "one"
        y if y is str => f'The string is {y}'
        z if z is int => f'the int is {z}'
        _ => "anything"

    print(number)  # The string is three

If so, why are y and z both valid here? Is the match variable rebound to any other name? Or even to all names?

Of course, you could improve the clarity here with:

    number = match x as y:

or any variant thereof. This way, you'd explicitly bind the variable you use for testing. If you don't, the interpreter would never know which names to treat as rebindings and which to draw from surrounding scopes, if any.

I also haven't really seen a function where this would be better than existing syntax, and the above is the only one to actually try something not possible with dicts. The type checking one could better be:

    x = 1
    d = {
        int: "integer",
        float: "float",
        str: "str",
    }
    d.get(type(x), None)

The production datetime code could be:

    from datetime import date, timedelta

    def convert_time_to_timedelta_with_match(unit: str, amount: int, now: date):
        # Every value is built eagerly, so each entry spells out its own
        # keyword; timedelta(**{unit: amount}) would raise TypeError here
        # whenever unit is 'months', 'years' or 'cal_years'.
        return {
            "days": timedelta(days=amount),
            "hours": timedelta(hours=amount),
            "weeks": timedelta(weeks=amount),
            # why not something like subtracting two dates here to get an
            # accurate timedelta for your specific interval?
            "months": timedelta(days=30 * amount),
            # days = (365.25 / 12) * amount? Would be a lot more accurate
            # for average month length. (30.4375)
            "years": timedelta(days=365 * amount),
            # days = 365.25 * amount?
            "cal_years": now - now.replace(year=now.year - amount),
        }.get(unit)

I honestly don't see the advantages of new syntax here, unless you hate the eager evaluation in the dict literal that gets indexed; if that's performance critical, an if/else might be better. But I can't see a match statement outperforming if/else. (And if you really need faster than if/else, you should perhaps move that bit of code to C or something.)
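If the eager evaluation is the objection, one way around it with existing syntax is to store callables in the dict, so that only the selected branch runs. This is a sketch under that assumption; the names _CONVERTERS and convert_time_to_timedelta are made up for illustration, and raising ValueError for an unknown unit is a choice made here, not something taken from the original code.

```python
from datetime import date, timedelta

# Each value is a callable taking (amount, now); nothing is computed until
# the matching entry is looked up and called.
_CONVERTERS = {
    "days": lambda amount, now: timedelta(days=amount),
    "hours": lambda amount, now: timedelta(hours=amount),
    "weeks": lambda amount, now: timedelta(weeks=amount),
    "months": lambda amount, now: timedelta(days=30 * amount),
    "years": lambda amount, now: timedelta(days=365 * amount),
    "cal_years": lambda amount, now: now - now.replace(year=now.year - amount),
}


def convert_time_to_timedelta(unit: str, amount: int, now: date) -> timedelta:
    try:
        converter = _CONVERTERS[unit]
    except KeyError:
        # Assumption: unknown units are an error rather than None.
        raise ValueError(f"unknown unit: {unit!r}") from None
    return converter(amount, now)
```

The dict is built once at import time, and a lookup costs one hash plus one call, whichever unit is requested.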
2018-05-04 0:34 GMT+02:00 Ed Kellett :
> On 2018-05-03 20:17, Chris Angelico wrote:
>>> def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
>>>     return match unit:
>>>         x if x in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
>>>         'months' => timedelta(days=30 * amount)
>>>         'years' => timedelta(days=365 * amount)
>>>         'cal_years' => now - now.replace(year=now.year - amount)
>>
>> And then this comes down to the same as all the other comparisons -
>> the "x if x" gets duplicated. So maybe it would be best to describe
>> this thus:
>>
>> match :
>> | ( ) =>
>>
>> If it's just an expression, it's equivalent to a comp_op of '=='. The
>> result of evaluating the match expression is then used as the left
>> operand for ALL the comparisons. So you could write your example as:
>>
>> return match unit:
>>     in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
>>     'months' => timedelta(days=30 * amount)
>>     'years' => timedelta(days=365 * amount)
>>     'cal_years' => now - now.replace(year=now.year - amount)
>>
>> Then there's room to expand that to a comma-separated list of values,
>> which would pattern-match a tuple.
>
> I believe there are some problems with this approach. That case uses no
> destructuring at all, so the syntax that supports destructuring looks
> clumsy. In general, if you want to support something like:
>
>     match spec:
>         (None, const) => const
>         (env, fmt) if env => fmt.format(**env)
>
> then I think something like the 'if' syntax is essential for guards.
>
> One could also imagine cases where it'd be useful to guard on more
> involved properties of things:
>
>     match number_ish:
>         x:str if x.lower().startswith('0x') => int(x[2:], 16)
>         x:str => int(x)
>         x => x #yolo
>
> (I know base=0 exists, but let's imagine we're implementing base=0, or
> something).
>
> I'm usually against naming things, and deeply resent having to name the
> x in [x for x in ... if ...] and similar constructs. But in this
> specific case, where destructuring is kind of the point, I don't think
> there's much value in compromising that to avoid a name.
>
> I'd suggest something like this instead:
>
>     return match unit:
>         _ in {'days', 'hours', 'weeks'} => timedelta(**{unit: amount})
>         ...
>
> So a match entry would be one of:
> - A pattern. See below
> - A pattern followed by "if" , e.g.:
>       (False, x) if len(x) >= 7
> - A comparison where the left-hand side is a pattern, e.g.:
>       _ in {'days', 'hours', 'weeks'}
>
> Where a pattern is one of:
> - A display of patterns, e.g.:
>       {'key': v, 'ignore': _}
>   I think *x and **x should be allowed here.
> - A comma-separated list of patterns, making a tuple
> - A pattern enclosed in parentheses
> - A literal (that is not a formatted string literal, for sanity)
> - A name
> - A name with a type annotation
>
> To give a not-at-all-motivating but hopefully illustrative example:
>
>     return match x:
>