Hi, I think this syntax is very hard to read because the yielding-expression can be anywhere in the block and there is nothing to identify it. Even in your examples I can't figure out which expression will be used. What if I call a function somewhere in the block? Can't you just use generators + list/set/dict constructors when you need complex statements?
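To make the question concrete, here is a rough sketch of what I mean, based on the `break` example from the proposal. The bodies of `cond` and `f`, the name `doubled_until_stop`, and the sample data are placeholders of mine, not anything from the original post:

```python
def cond(x):
    # Placeholder stop condition (assumption): stop at the first negative value.
    return x < 0


def f(x):
    # Placeholder transformation (assumption).
    return x + 1


def doubled_until_stop(items):
    # Ordinary generator function: every `yield` is explicit, so it is
    # clear which expression becomes an element of the result.
    for x in items:
        if cond(x):
            break
        z = f(x)
        yield z * 2


y = [1, 2, 3, -1, 4]

# The same generator feeds whichever constructor is needed.
lst = list(doubled_until_stop(y))
as_set = set(doubled_until_stop(y))
as_dict = dict((value, value * value) for value in lst)

print(lst)  # [4, 6, 8]
```

Arbitrary statements (`break`, `try`, `with`, function calls) can go in the generator body, and only the explicit `yield` lines produce values.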
Regards

On Fri, 21 Feb 2020 at 08:58, Alex Hall <alex.moj...@gmail.com> wrote:
>
> This is a proposal for a new syntax where a comprehension is written as the
> appropriate brackets containing a loop which can contain arbitrary statements.
>
> Here are some simple examples. Instead of:
>
>     [
>         f(x)
>         for y in z
>         for x in y
>         if g(x)
>     ]
>
> one may write:
>
>     [
>         for y in z:
>             for x in y:
>                 if g(x):
>                     f(x)
>     ]
>
> Instead of:
>
>     lst = []
>     for x in y:
>         if cond(x):
>             break
>         z = f(x)
>         lst.append(z * 2)
>
> one may write:
>
>     lst = [
>         for x in y:
>             if cond(x):
>                 break
>             z = f(x)
>             yield z * 2
>     ]
>
> Instead of:
>
>     [
>         {k: v for k, v in foo}
>         for foo in bar
>     ]
>
> one may write:
>
>     [
>         for foo in bar:
>             {for k, v in foo: k: v}
>     ]
>
> ## Specification
>
> A list/set/dict comprehension or generator expression is written as the
> appropriate brackets containing a `for` or `while` loop.
>
> In the general case some expressions have `yield` in front and they become
> the values of the comprehension, like a generator function.
>
> If the comprehension contains exactly one expression statement at any level
> of nesting, i.e. if there is only one place where a `yield` can be placed at
> the start of a statement, then `yield` is not required and the expression is
> implicitly yielded. In particular this means that any existing comprehension
> translated into the new style doesn't require `yield`.
>
> If the comprehension doesn't contain exactly one expression statement and
> doesn't contain a `yield`, it's a SyntaxError.
>
> ### Dictionary comprehensions
>
> For dictionary comprehensions, a `key: value` pair is allowed as its own
> pseudo-statement or in a yield. It's not a real expression and cannot appear
> inside other expressions.
>
> This can potentially be confused with variable type annotations with no
> assigned value, e.g. `x: int`.
> But we can essentially apply the same rule as
> other comprehensions: either use `yield`, or only have one place where a
> `yield` could be added in front of a statement. So if there is only one pair
> `x: y` we try to implicitly yield that. The only way this could be
> misinterpreted is if a user declared the type of exactly one expression and
> completely forgot to give their comprehension elements, and the program would
> almost certainly fail spectacularly.
>
> ### Whitespace
>
> If placing the loop on a single line would be valid syntax outside a
> comprehension (i.e. it just contains a simple statement) then we call this an
> *inline* comprehension. It can be inserted in the same line(s) as other code
> and formatted however the writer likes - there are no concerns about
> whitespace.
>
> For a more complex comprehension, the loop must start and end with a newline,
> i.e. the lines containing the loop cannot contain any tokens from outside,
> including the enclosing brackets. For example, this is allowed:
>
>     foo = [
>         for x in y:
>             if x > 0:
>                 f(x)
>     ]
>
> but this is not:
>
>     foo = [for x in y:
>                if x > 0:
>                    f(x)]
>
> This ensures that code is readable even at a quick glance. The eyes can
> quickly find where the loop starts and distinguish the embedded statements
> from the rest of the enclosing expression.
>
> Furthermore, it's easy to copy paste entire lines to move them around,
> whereas refactoring the invalid example above without specific tools would be
> annoying and error-prone. It also makes it easy to adjust code outside the
> comprehension (e.g. rename `foo` to something longer) without messing up
> indentation and alignment.
>
> Inside the loop, the rules for indentation and such are the same as anywhere
> else. The syntax of the loop is valid only if it's also valid as a normal
> loop outside any expression. The body of the loop must be more indented than
> the for/while keyword that starts the loop.
>
> ### Variable scope
>
> Since comprehensions look like normal loops they should maybe behave like
> them again, including executing in the same scope and 'leaking' the iteration
> variable(s). Assignments via the walrus operator already affect the outer
> scope, only the iteration variable currently behaves differently. My
> understanding is that this is influenced by the fact that there is little
> reason to use the value of the iteration variable after a list comprehension
> completes since it will always be the last value in the iterable. But since
> the new syntax allows `break`, the value may become useful again.
>
> I don't know what the right approach is here and I imagine it can generate
> plenty of debate. Given that this whole proposal is already controversial and
> likely to be rejected this may not be the best place to start discussion. But
> maybe it is, I don't know.
>
> ## Benefits/comparison to current methods
>
> ### Uniform syntax
>
> The new comprehensions just look like normal loops in brackets, or generator
> functions. This should make them easier for beginners to learn than the old
> comprehensions.
>
> A particular concept that's easier to learn is comprehensions that contain
> multiple loops. Consider this comprehension over a nested list:
>
>     [
>         f(cell)
>         for row in matrix
>         for cell in row
>     ]
>
> For beginners this can easily be confusing, [and sometimes for experienced
> coders too](https://mail.python.org/archives/list/python-ideas@python.org/message/BX7LWUS57M52EPJMIR6A3SDQYSN7UCEX/).
> Yes there's a rule that one can learn, but putting it in reverse also
> seems logical, perhaps even more so:
>
>     [
>         f(cell)
>         for cell in row
>         for row in matrix
>     ]
>
> Now the comprehension is 'consistently backwards', it reads more like
> English, and the usage of `cell` is right next to its definition.
> But of course that order is wrong...unless we want a nested list
> comprehension that produces a new nested list:
>
>     [
>         [
>             f(cell)
>             for cell in row
>         ]
>         for row in matrix
>     ]
>
> Again, it's not hard for an experienced coder to understand this, but for a
> beginner grappling with new concepts this is not great. Now consider how the
> same two comprehensions would be written in the new syntax:
>
>     [
>         for row in matrix:
>             for cell in row:
>                 f(cell)
>     ]
>
>     [
>         for row in matrix:
>             [
>                 for cell in row:
>                     f(cell)
>             ]
>     ]
>
> ### Power and flexibility
>
> Comprehensions are great and I love using them. I want to be able to use them
> more often. I know I can solve any problem with a loop, but it's obvious that
> comprehensions are much nicer or we wouldn't need to have them at all.
> Compare this code:
>
>     new_matrix = []
>     for row in matrix:
>         new_row = []
>         for cell in row:
>             try:
>                 new_row.append(f(cell))
>             except ValueError:
>                 new_row.append(0)
>         new_matrix.append(new_row)
>
> with the solution using the new syntax:
>
>     new_matrix = [
>         for row in matrix: [
>             for cell in row:
>                 try:
>                     yield f(cell)
>                 except ValueError:
>                     yield 0
>         ]
>     ]
>
> It's immediately visually obvious that it's building a new nested list,
> there's much less syntax for me to parse, and the variable `new_row` has gone
> from appearing 4 times to 0!
>
> There have been many requests to add some special syntax to comprehensions to
> make them a bit more powerful:
>
> - [Is this PEP-able? "with" statement inside genexps / list
>   comprehensions](https://mail.python.org/archives/list/python-ideas@python.org/thread/BUD46OEPBN6YW43HPPEG3P3IFDOG6KMV/#O3U3V4Q4I2GOGVFCFH67TZ355WE7XKTD)
> - [Allowing breaks in generator expressions by overloading the while
>   keyword](https://mail.python.org/archives/list/python-ideas@python.org/thread/6PEOE5ZXHQHAINEPQ7PTKSWYFW5OFMPQ/#ETB6ISNSB4KWQQYNMTRVJMZF4AWYCXV5)
> - [while conditional in list comprehension
>   ??](https://mail.python.org/archives/list/python-ideas@python.org/thread/RYBBHV3YBBEIBUZPZ4WNQGKI76VSBWI5/#A36BJCUAGUBZA7FIQ3LN6UMZUYCL2LJG)
>
> This would solve all such problems neatly.
>
> ### No trying to fit things in a single expression
>
> The current syntax can only contain one expression in the body. This
> restriction makes it difficult to solve certain problems elegantly and
> creates an uncomfortable grey area where it's hard to decide between
> squeezing maybe a bit too much into an expression or doing things 'manually'.
> This can lead to analysis paralysis and disagreements between coders and
> reviewers. For example, which of the following is the best?
>
>     clean = [
>         line.strip()
>         for line in lines
>         if line.strip()
>     ]
>
>     stripped = [line.strip() for line in lines]
>     clean = [line for line in stripped if line]
>
>     clean = list(filter(None, map(str.strip, lines)))
>
>     clean = []
>     for line in lines:
>         line = line.strip()
>         if line:
>             clean.append(line)
>
>     def clean_lines():
>         for line in lines:
>             line = line.strip()
>             if line:
>                 yield line
>
>     clean = list(clean_lines())
>
> You probably have a favourite, but it's very subjective and this kind of
> problem requires judgement depending on the situation. For example, I'd
> choose the first version in this case, but a different version if I had to
> worry about duplicating something more complex or expensive than `.strip()`.
> And again, there's an awkward sweet spot where it's hard to decide whether I
> care enough about the duplication.
>
> What about assignment expressions? We could do this:
>
>     clean = [
>         stripped
>         for line in lines
>         if (stripped := line.strip())
>     ]
>
> Like the nested loops, this is tricky to parse without experience. The
> execution order can be confusing and the variable is used away from where
> it's defined. Even if you like it, there are clearly many who don't. I think
> the fact that assignment expressions were a desired feature despite being so
> controversial is a symptom of this problem. It's the kind of thing that
> happens when we're stuck with the limitations of a single expression.
>
> The solution with the new syntax is:
>
>     clean = [
>         for line in lines:
>             stripped = line.strip()
>             if stripped:
>                 stripped
>     ]
>
> or if you'd like to use an assignment expression:
>
>     clean = [
>         for line in lines:
>             if stripped := line.strip():
>                 stripped
>     ]
>
> I think both of these look great and are easily better than any of the other
> options. And I think it would be the clear winner in any similar situation -
> no careful judgement needed. This would become the one (and only one) obvious
> way to do it. The new syntax has the elegance of list comprehensions and the
> flexibility of multiple statements. It's completely scalable and works
> equally well from the simplest comprehension to big complicated constructions.
>
> ### Easy to change
>
> I hate when I've already written a list comprehension but a new requirement
> forces me to change it to, say, the `.append` version. It's a tedious
> refactoring involving brackets, colons, indentation, and moving things
> around. It also leaves me with a very unhelpful `git diff`. With the new
> syntax I can easily add logic as I please and get a nice simple diff.
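For comparison, the `clean` variants quoted above that are already valid Python can be run side by side, including the assignment-expression one (which needs Python 3.8+). This is only a small sketch: the sample `lines` data is made up, the result names are mine, and `clean_lines` takes `lines` as a parameter here instead of reading a global.

```python
lines = ["  spam ", "", "   ", "eggs\n", "\tham"]  # made-up sample input

# Comprehension that calls .strip() twice per line.
clean_twice = [line.strip() for line in lines if line.strip()]

# filter/map version.
clean_filter = list(filter(None, map(str.strip, lines)))

# Assignment-expression version (Python 3.8+).
clean_walrus = [stripped for line in lines if (stripped := line.strip())]


def clean_lines(lines):
    # Generator function: several statements, one explicit yield.
    for line in lines:
        stripped = line.strip()
        if stripped:
            yield stripped


clean_gen = list(clean_lines(lines))

# All four spellings produce the same list.
assert clean_twice == clean_filter == clean_walrus == clean_gen
print(clean_gen)  # ['spam', 'eggs', 'ham']
```

The generator version is the only one of these that keeps working unchanged once the loop body needs `break`, `try`, or `with`.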
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/5UIXE23B26XPIQGPYNI575XN3NNX6JRR/
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Antoine Rozo

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/VUPTDIVHZ5TINEYBQCFCPYZNK2DFYBU3/