Re: [Python-ideas] PEP 505: None-aware operators
On 23Jul2018 1530, David Mertz wrote:
> Of course I don't mean that if implemented the semantics would be ambiguous... rather, the proper "swallowing" of different kinds of exceptions is not intuitively obvious, not even to you, Steve. And if some decision was reached and documented, it would remain unclear to new (or even experienced) users of the feature.

As written in the PEP, no exceptions are ever swallowed. The translation into existing syntax is very clearly and unambiguously shown, and there is no exception handling at all. All the exception handling discussion in the PEP is under the heading of "rejected ideas".

This email discussion includes some hypotheticals, since that's the point - I want thoughts and counter-proposals for semantics and discussion. I am 100% committed to an unambiguous PEP, and I believe the current proposal is most defensible. However, I don't want to have a "discussion" where I simply assume that I'm right, everyone else is wrong, and I refuse to discuss or consider alternatives.

So sorry for letting you all think that everything I write is actually the PEP. I had assumed that because my emails are not the PEP that people would realise that they are not the PEP. I'm going to duck out of the discussions here now, since they are not as productive as I'd hoped, and once we have a BDFL-replacement I'll reawaken it and see what is required at that point.

Cheers,
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] PEP 505: None-aware operators
On 23Jul2018 1145, Antoine Pitrou wrote:
> On 23/07/2018 at 12:38, Steve Dower wrote:
>> General comment to everyone (not just Antoine): these arguments have zero value to me. Feel free to keep making them, but I am uninterested.
>
> So you're uninterested in learning from past mistakes? You sound like a child who thinks their demands should be satisfied because they are the center of the world.

Sorry if it came across like that, it wasn't the intention. A bit of context on why you think it's a mistake would have helped, but if it's a purely subjective "I don't like the look of it" (as most similar arguments have turned out) then it doesn't add anything to enhancing the PEP. As a result, I do not see any reason to engage with this class of argument.

I hope you'll also notice that I've been making very few demands in this thread, and have indicated a number of times that I'm very open to adjusting the proposal in the face of honest and useful feedback.

Cheers,
Steve
Re: [Python-ideas] PEP 505: None-aware operators
On 23Jul2018 1129, Antoine Pitrou wrote:
> On 23/07/2018 at 12:25, Steve Dower wrote:
>> On 23Jul2018 , Antoine Pitrou wrote:
>>> On Mon, 23 Jul 2018 10:51:31 +0100 Steve Dower wrote:
>>>> Which is the most important operator? - Personally, I think '?.' is the most valuable.
>>>
>>> For me, it's the most contentious. The fact that a single '?' added to a regular line of Python code can short-circuit execution silently is a net detriment to readability, IMHO.
>>
>> The only time it would short-circuit is when it would otherwise raise AttributeError for trying to access an attribute from None, which is also going to short-circuit.
>
> But AttributeError is going to bubble up as soon as it's raised, unless it's explicitly handled by an except block. Simply returning None may have silent undesired effects (perhaps even security flaws).

You're right that the silent/undesired effects would be bad, which is why I'm not proposing silent changes to existing code (such as None.__getattr__ always returning None).

This is a substitute for explicitly checking None before the attribute access, or explicitly handling AttributeError for this case (and unintentionally handling others as well). And "?." may be very small compared to the extra 3+ lines required to do exactly the same thing, but it is still an explicit change that can be reviewed and evaluated as "is None a valid but not-useful value here? or is it an indication of another error and should we fail immediately instead".

Cheers,
Steve

> This whole thing reminds me of PHP's malicious "@" operator.

General comment to everyone (not just Antoine): these arguments have zero value to me. Feel free to keep making them, but I am uninterested. Perhaps whoever gets to decide on the PEP will be swayed by them?
Re: [Python-ideas] PEP 505: None-aware operators
On 23Jul2018 , Antoine Pitrou wrote:
> On Mon, 23 Jul 2018 10:51:31 +0100 Steve Dower wrote:
>> Which is the most important operator? - Personally, I think '?.' is the most valuable.
>
> For me, it's the most contentious. The fact that a single '?' added to a regular line of Python code can short-circuit execution silently is a net detriment to readability, IMHO.

The only time it would short-circuit is when it would otherwise raise AttributeError for trying to access an attribute from None, which is also going to short-circuit. The difference is that it short-circuits the expression only, and not all statements up until the next except handler.

Cheers,
Steve
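The distinction between short-circuiting an expression and short-circuiting every statement up to the next handler can be sketched in current Python. This is only an illustration: the `User` class and both function names are made up for the example, and the conditional expression stands in for the proposed `user?.name`, which is not valid syntax today.

```python
class User:
    """Hypothetical class, used only for this illustration."""

    def __init__(self, name):
        self.name = name


def describe_with_exception(user):
    """Plain attribute access: a None user skips every following statement."""
    steps = []
    try:
        name = user.name              # raises AttributeError when user is None
        steps.append("looked up name")
        steps.append("used " + name)
    except AttributeError:
        steps.append("jumped to handler")
    return steps


def describe_with_none_check(user):
    """Stand-in for `user?.name`: only the expression is short-circuited."""
    steps = []
    name = user.name if user is not None else None
    steps.append("looked up name")    # still runs when user is None
    steps.append("used " + str(name))
    return steps
```

With `None` as the argument, the first function records only the jump to the handler, while the second still executes both statements and simply carries `None` forward.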
Re: [Python-ideas] PEP 505: None-aware operators
Responding to a few more ideas that have come up here. Again, apologies for not directing them to the original authors, but I want to focus on the ideas that are leading towards a more informed decision, and not getting distracted by providing customised examples for people or getting into side debates.

I'm also going to try and update the PEP text today (or this week at least) to better clarify some of the questions that have come up (and fix that embarrassingly broken example :( )

Cheers,
Steve

False: '?.' should be surrounded by spaces
------------------------------------------

It's basically the same as '.'. Spell it 'a?.b', not 'a ?. b' (like 'a.b' rather than 'a + b').

It's an enhancement to attribute access, not a new type of binary operator. The right-hand side cannot be evaluated in isolation.

In my opinion, it can also be read aloud the same as '.' (see the next point).

False: 'a?.b' is totally different from 'a.b'
---------------------------------------------

The expression 'a.b' either results in 'a.b' or AttributeError (assuming no descriptors are involved).

The expression 'a?.b' either results in 'a.b' or None (again, assuming no descriptors).

This isn't a crazy new idea, it really just short-circuits a specific error that can only be precisely avoided with "if None" checks (catching AttributeError is not the same).

The trivial case is already a one-liner
---------------------------------------

That may be the case if you have a single character variable, but this proposal is not intended to try and further simplify already simple cases. It is for complex cases, particularly where you do not want to reevaluate the arguments or potentially leak temporary names into a module or class namespace.

(Brief aside: 'a if (a := expr) is not None else None' is going to be the best workaround. The suggested 'a := expr if a is not None else None' is incorrect because the condition is evaluated first and so has to contain the assignment.)

False: ??= is a new form of assignment
--------------------------------------

No, it's just augmented assignment for a binary operator. "a ??= b" is identical to "a = a ?? b", just like "+=" and friends.

It has no relationship to assignment expressions. '??=' can only be used as a statement, and is not strictly necessary, but if we add a new binary operator '??' and it does not have an equivalent augmented assignment statement, people will justifiably wonder about the inconsistency.

The PEP author is unsure about how it works
-------------------------------------------

I wish this statement had come with some context, because the only thing I'm unsure about is what I'm supposed to be unsure about.

That said, I'm willing to make changes to the PEP based on the feedback and discussion. I haven't come into this with a "my way is 100% right and it will never change" mindset, so if this is a misinterpretation of my willingness to listen to feedback then I'm sorry I wasn't more clear. I *do* care about your opinions (when presented fairly and constructively).

Which is the most important operator?
-------------------------------------

Personally, I think '?.' is the most valuable. The value of '??' arises because (unless changing the semantics from None-aware to False-aware) it provides a way of setting the default that is consistent with how we got to the no-value value (e.g. `None?.a ?? b` and `""?.a ?? b` are different, whereas `None?.a or b` and `""?.a or b` are equivalent).

I'm borderline on ?[] right now. Honestly, I think it works best if it also silently handles LookupError (e.g. for traversing a loaded JSON dict), but then it's inconsistent with ?. which I think works best if it handles None but allows AttributeError. Either way, both have the ability to directly handle the exception. 
For example, (assuming e1, e2 are expressions and not values):

    v = e1?[e2]

Could be handled as this example (for None-aware):

    _temp1 = (e1)
    v = _temp1[e2] if _temp1 is not None else None

Or for silent exception handling of the lookup only:

    _temp1 = (e1)
    _temp2 = (e2)
    try:
        v = _temp1[_temp2] if _temp1 is not None else None
    except LookupError:
        v = None

Note that this second example is _not_ how most people protect against invalid lookups (most people use `.get` when it's available, or they accept that LookupErrors raised from e1 or e2 should also be silently handled). So there would be value in ?[] being able to more precisely handle the exception.

However, with ?. being available, and _most_ lookups being on dicts that have .get(), you can also traverse JSON values fairly easily like this:

    d = json.load(f)
    name = d.get('user')?.get('details')?.get('name') ?? ''

With ?[] doing the safe lookup as well, this could be:

    d = json.load(f)
    name = d?['user']?['details']?['name'] ?? ''

Now, my *least* favourite part of this is that (as
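The two candidate readings of `?[]` described in this message can be tried out today with ordinary functions. The sketch below is an approximation under stated assumptions: `none_aware_index` and `lookup_safe_index` are hypothetical helper names, not part of the PEP or of any library, and the final conditional expression stands in for `?? ''`.

```python
import json


def none_aware_index(container, key):
    """Stand-in for the None-aware reading of `container?[key]`."""
    return container[key] if container is not None else None


def lookup_safe_index(container, key):
    """Stand-in for the variant that also swallows LookupError from the lookup."""
    if container is None:
        return None
    try:
        return container[key]
    except LookupError:
        return None


# Sample document where an intermediate value is null (None after loading).
d = json.loads('{"user": {"details": null}}')

details = none_aware_index(none_aware_index(d, "user"), "details")
name = none_aware_index(details, "name")
name = name if name is not None else ""   # stands in for `?? ''`
```

Under the None-aware reading a missing key still raises `KeyError`, whereas the lookup-safe variant turns it into `None`; that difference is the inconsistency with `?.` discussed above.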
Re: [Python-ideas] PEP 505: None-aware operators
On 23Jul2018 0151, Steven D'Aprano wrote:
> What if there was a language supported, non-hackish way to officially delay evaluation of expressions until explicitly requested?

The current spelling for this is "lambda: delayed-expression" and the way to request the value is "()". :)

(I'm not even being that facetious here. People ask for delayed expressions all the time, and it's only 7 characters, provided the callee knows they're getting it, and the semantics are already well defined and likely match what you want.)

Cheers,
Steve
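As a small illustration of the "lambda plus call" spelling, the sketch below passes a delayed expression to a callee that knows to call it. The names `get_or_compute` and `expensive` are invented for this example.

```python
def get_or_compute(cache, key, make_default):
    """Return cache[key]; call make_default() only when the key is missing."""
    if key not in cache:
        cache[key] = make_default()   # the delayed expression is evaluated here
    return cache[key]


calls = []


def expensive():
    """Hypothetical costly computation that records each evaluation."""
    calls.append("called")
    return 42


cache = {"a": 1}
hit = get_or_compute(cache, "a", lambda: expensive())    # not evaluated
miss = get_or_compute(cache, "b", lambda: expensive())   # evaluated once
```

The callee decides whether and when to evaluate, which is the "provided the callee knows they're getting it" condition from the message.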
Re: [Python-ideas] Python docs page: In what ways is None special
On 23Jul2018 1003, Jonathan Fine wrote:
> This arises out of PEP 505 - None-aware operators. I thought, a page on how None is special would be nice. I've not found such a page on the web. We do have
> ===
> https://docs.python.org/3/library/constants.html
> None
> The sole value of the type NoneType. None is frequently used to represent the absence of a value, as when default arguments are not passed to a function. Assignments to None are illegal and raise a SyntaxError.
> ===
>
> So decided to start writing such a page, perhaps to be added to the docs. All code examples in Python3.4.

There's also https://docs.python.org/3/c-api/none.html?highlight=py_none#c.Py_None

"The Python None object, denoting lack of value. This object has no methods. It needs to be treated just like any other object with respect to reference counts."

I don't know that documenting the behaviours of None is that interesting (e.g. not displaying anything at the interactive prompt), though it'd be perfect for a blog and/or conference talk. But if there appear to be behaviours that are not consistent or cannot be easily inferred from the existing documentation, then we should think about why that is and how we could enhance the documentation to ensure it accurately describes what None is supposed to be.

That said, your examples are good :)

Cheers,
Steve
Re: [Python-ideas] PEP 505: None-aware operators
On 20Jul2018 1119, Brendan Barnwell wrote:
> In this situation I lean toward "explicit is better than implicit" --- if you want to compare against None, you should do so explicitly --- and "special cases aren't special enough to break the rules" --- that is, None is not special enough to warrant the creation of multiple new operators solely to compare things against this specific value.

"The rules" declare that None is special - it's the one and only value that represents "no value". So is giving it special meaning here breaking the rules or following them? (See also the ~50% of the PEP dedicated to this subject, and also consider proposing a non-special result for "??? if has_no_value(value) else value" in the 'True' case.)

Cheers,
Steve
Re: [Python-ideas] PEP 505: None-aware operators
Just for fun, I decided to go through some recently written code by some genuine Python experts (without their permission...) to see what changes would be worth taking. So I went to the sources of our github bots.

Honestly, I only found three places that were worth changing (though I'm now kind of leaning towards ?[] eating LookupError, since that seems much more useful when traversing the result of json.loads()...). I'm also not holding up the third one as the strongest example :)

From https://github.com/python/miss-islington/blob/master/miss_islington/status_change.py:

    async def check_status(event, gh, *args, **kwargs):
        if (
            event.data["commit"].get("committer")
            and event.data["commit"]["committer"]["login"] == "miss-islington"
        ):
            sha = event.data["sha"]
            await check_ci_status_and_approval(gh, sha, leave_comment=True)

After:

    async def check_status(event, gh, *args, **kwargs):
        if event.data["commit"].get("committer")?["login"] == "miss-islington":
            sha = event.data["sha"]
            await check_ci_status_and_approval(gh, sha, leave_comment=True)

From https://github.com/python/bedevere/blob/master/bedevere/__main__.py:

    try:
        print('GH requests remaining:', gh.rate_limit.remaining)
    except AttributeError:
        pass

Assuming you want to continue hiding the message when no value is available:

    if (remaining := gh.rate_limit?.remaining) is not None:
        print('GH requests remaining:', remaining)

Assuming you want the message printed anyway:

    print(f'GH requests remaining: {gh.rate_limit?.remaining ?? "N/A"}')

From https://github.com/python/bedevere/blob/master/bedevere/news.py (this is the one I'm including for completeness, not because it's the most compelling example I've ever seen):

    async def check_news(gh, pull_request, filenames=None):
        if not filenames:
            filenames = await util.filenames_for_PR(gh, pull_request)

After:

    async def check_news(gh, pull_request, filenames=None):
        filenames ??= await util.filenames_for_PR(gh, pull_request)

On 19Jul2018 , Steven D'Aprano wrote:
> In other words, we ought to be comparing the expressiveness of
>
>     process(spam ?? something)
>
> versus:
>
>     process(something if spam is None else spam)

Agreed, though to make it a more favourable comparison I'd replace "spam" with "spam()?.eggs" and put it in a class/module definition where you don't want temporary names leaking ;)

Cheers,
Steve
Re: [Python-ideas] PEP 505: None-aware operators
Thanks everyone for the feedback and discussion so far. I want to address some of the themes, so apologies for not quoting individuals and for doing this in one post instead of twenty.

--

* "It looks like line noise"

Thanks for the feedback. There's nothing constructive for me to take from this.

* "I've never needed this"

Also not very actionable, but as background I'll say that this was exactly my argument against adding them to C#. But my coding style has adapted to suit (for example, I'm more likely to use "null" as a default value and have a single function implementation than two mostly-duplicated overloads).

* "It makes it more complex"
* "It's harder to follow the flow"

Depends on your measure of complexity. For me, I prioritise "area under the indentation" as my preferred complexity metric (more lines*indents == more complex), as well as left-to-right reading of each line (more random access == more complex). By these measures, ?. significantly reduces the complexity over any of the current or future alternatives::

    def f(a=None):
        name = 'default'
        if a is not None:
            user = a.get_user()
            if user is not None:
                name = user.name
        print(name)

    def f(a=None):
        if a is not None:
            user = a.get_user()
            name = user.name if user is not None else 'default'
            print(name)
        else:
            print('default')

    def f(a=None):
        user = a.get_user() if a is not None else None
        name = user.name if user is not None else 'default'
        print(name)

    def f(a=None):
        print(user.name
              if (user := a.get_user() if a is not None else None) is not None
              else 'default')

    def f(a=None):
        print(a?.get_user()?.name ?? 'default')

* "We have 'or', we don't need '??'"

Nearly-agreed, but I think the tighter binding of ?? and its tighter test make it valuable in place of 'or'.

For example, compare:

    a ** b() or 2 # actual: (a ** b()) or 2
    a ** b() ?? 2 # proposed: a ** (b() ?? 2)

In the first, the presence of 'or' implies that either b() or __pow__(a, b()) could return a non-True value. 
This is correct (it could return 0 if a == 0). And the current precedence results in the result of __pow__ being used for the check.

In the second one, the presence of the '??' implies that either b() or __pow__(a, b()) could return None. The latter should never happen, and so the choices are to make the built-in types propagate Nones when passed None (uhh... no) or to make '??' bind to the closer part of the expression.

(If you don't think it's likely enough that a function could return [float, None], then assume 'a ** b?.c ?? 2' instead.)

* "We could have '||', we don't need '??'"

Perhaps, though this is basically just choosing the bikeshed colour. In the absence of a stronger argument, matching other languages' equivalent operators, rather than operators that do different things in those languages, should win.

* "We could have 'else', we don't need '??'"

This is the "a else 'default'" rather than "a ?? 'default'" proposal, which I do like the look of, but I think it will simultaneously mess with operator precedence and also force me to search for the 'if' that we actually need to be comparing "(a else 'default')" vs. "a ?? 'default'"::

    x = a if b else c else d
    x = a if (b else c) else d
    x = a if b else (c else d)

* "It's not clear whether it's 'is not None' or 'hasattr' checks"

I'm totally sympathetic to this. Ultimately, like everything else, this is a concept that has to be taught/learned rather than known intrinsically.

The main reasons for not having 'a?.b' be directly equivalent to getattr(a, 'b', ???) are that you lose the easy ability to find typos, and we also already have the getattr() approach.

(Aside: in this context, why should the result be 'None' if an attribute is missing? For None, the None value propagates (getattr(a, 'b', a)), while for falsies you could argue the same thing applies. But for a silently handled AttributeError? You still have to make the case that None is special here, just special as a return value vs. special as a test.) 
* "The semantics of this example changed from getattr() with ?." Yes, this was a poor example. On re-reading, all of the checks are indeed looking for optional attributes, rather than looking them up on optional targets. I'll find a better one (I've certainly seen and/or written code like this that was intended to avoid crashing on None, but I stopped my search of the stdlib too soon after finding this example). * "Bitwise operators" Uh... yeah. Have fun over there :) * "Assumes the only falsie ever returned [in some context] is None" I argue that it assumes the only falsie you want to replace with a different value is None. In many cases, I'd expect the None to be replaced with a falsie of the intended type: x = maybe_get_int() ?? 0 y = maybe_get_list() ?? [] Particularly for the
Re: [Python-ideas] PEP 505: None-aware operators
Thanks! Bit of discussion below about precedence, but thanks for spotting the typos.

On 18Jul2018 1318, MRAB wrote:
> On 2018-07-18 18:43, Steve Dower wrote:
>> Grammar changes
>> ---------------
>>
>> The following rules of the Python grammar are updated to read::
>>
>>     augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
>>                 '<<=' | '>>=' | '**=' | '//=' | '??=')
>>
>>     power: coalesce ['**' factor]
>>     coalesce: atom_expr ['??' factor]
>>     atom_expr: ['await'] atom trailer*
>>     trailer: ('(' [arglist] ')' |
>>               '[' subscriptlist ']' |
>>               '?[' subscriptlist ']' |
>>               '.' NAME |
>>               '?.' NAME)
>
> The precedence is higher than I expected. I think of it more like 'or'. What is its precedence in the other languages?

Yes, I expected this to be the contentious part. I may have to add a bit of discussion.

Mostly, I applied intuition rather than copying other languages on precedence (and if you could go through my non-git history, you'd see I tried four other places ;) ). The most "obvious" cases were these::

    a ?? 1 + b()

    b ** a() ?? 2

In the first case, both "(a ?? 1) + b()" and "a ?? (1 + b())" make sense, so it's really just my own personal preference that I think it looks like the first. If you flip the operands to get "b() + a ?? 1" then you end up with either "b() + (a ?? 1)" or "(b() + a) ?? 1", then it's more obvious that the latter doesn't make any sense (why would __add__ return None?), and so binding more tightly than "+" helps write sensible expressions with fewer parentheses.

Similarly, I feel like "b ** (a() ?? 2)" makes more sense than "(b ** a()) ?? 2", where for the latter we would have to assume a __pow__ implementation that returns None, or one that handles being passed None without raising a TypeError.

Contrasting this with "or", it is totally legitimate for arithmetic operators to return falsey values. 
As I open the text file to correct the typos, I see this is what I tried to capture with:

    Inserting the ``coalesce`` rule in this location ensures that expressions resulting in ``None`` are naturally coalesced before they are used in operations that would typically raise ``TypeError``.

Take (2 ** a.b) ?? 0. The result of __pow__ is rarely going to be None, unless we train all the builtin types to do so (which, incidentally, I am not proposing and have no intention of proposing), whereas something like "2 ** coord?.exponent" attempting to call "2.__pow__(None)" seems comparatively likely. (Unfortunately, nobody writes code like this yet :) So there aren't any real-life examples. Originally I didn't include "??" in the proposal, but it became obvious in the examples that the presence of None-propagating operators ?. and ?[] just cause more pain without having the None-terminating operator ?? as well.)

>> Inserting the ``coalesce`` rule in this location ensures that expressions resulting in ``None`` are natuarlly coalesced before they are used in
>
> Typo "natuarlly".

Thanks.

>>     assert a == 'value'
>>     assert b == ''
>>     assert c == '0' and any(os.scandir('/'))
>
> Wouldn't the last assertion fail, because c == 0?

Correct, another typo.

Cheers,
Steve
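The precedence argument can be made concrete with a function standing in for `??`. This is a sketch under assumptions: `Coord` and `coalesce` are hypothetical names, and `coalesce` evaluates its default eagerly, so it only approximates the proposed operator for constant defaults like the ones used here.

```python
class Coord:
    """Hypothetical object whose exponent may be missing (None)."""

    def __init__(self, exponent=None):
        self.exponent = exponent


def coalesce(value, default):
    """Stand-in for `value ?? default` (eager default, unlike the proposal)."""
    return default if value is None else value


def power_then_coalesce(coord):
    """Models `(2 ** coord.exponent) ?? 0`, the looser binding."""
    return coalesce(2 ** coord.exponent, 0)   # TypeError if exponent is None


def coalesce_then_power(coord):
    """Models `2 ** (coord.exponent ?? 0)`, the binding chosen in the grammar."""
    return 2 ** coalesce(coord.exponent, 0)
```

With a missing exponent, the looser binding fails inside `**` before the coalescing step can run, while the tighter binding substitutes the default first and returns 1.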
[Python-ideas] PEP 505: None-aware operators
Possibly this is exactly the wrong time to propose the next big syntax change, since we currently have nobody to declare on it, but since we're likely to argue for a while anyway it probably can't hurt (and maybe this will become the test PEP for whoever takes the reins?). FWIW, Guido had previously indicated that he was generally favourable towards most of this proposal, provided we could figure out coherent semantics. Last time we tried, that didn't happen, so this time I've made the semantics much more precise, have implemented and verified them, and made much stronger statements about why we are proposing these. Additional thanks to Mark Haase for writing most of the PEP. All the fair and balanced parts are his - all the overly strong opinions are mine. Also thanks to Nick Coghlan for writing PEPs 531 and 532 last time we went through this - if you're unhappy with "None" being treated as a special kind of value, I recommend reading those before you start repeating them. There is a formatted version of this PEP at https://www.python.org/dev/peps/pep-0505/ My current implementation is at https://github.com/zooba/cpython/tree/pep-505 (though I'm considering removing some of the new opcodes I added and just generating more complex code - in any case, let's get hung up on the proposal rather than the implementation :) ) Let the discussions begin! --- PEP: 505 Title: None-aware operators Version: $Revision$ Last-Modified: $Date$ Author: Mark E. Haase , Steve Dower Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 18-Sep-2015 Python-Version: 3.8 Abstract Several modern programming languages have so-called "``null``-coalescing" or "``null``- aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP (starting in version 7). These operators provide syntactic sugar for common patterns involving null references. * The "``null``-coalescing" operator is a binary operator that returns its left operand if it is not ``null``. 
Otherwise it returns its right operand.

* The "``null``-aware member access" operator accesses an instance member only if that instance is non-``null``. Otherwise it returns ``null``. (This is also called a "safe navigation" operator.)

* The "``null``-aware index access" operator accesses an element of a collection only if that collection is non-``null``. Otherwise it returns ``null``. (This is another type of "safe navigation" operator.)

This PEP proposes three ``None``-aware operators for Python, based on the definitions and other languages' implementations of those above. Specifically:

* The "``None`` coalescing" binary operator ``??`` returns the left hand side if it evaluates to a value that is not ``None``, or else it evaluates and returns the right hand side. A coalescing ``??=`` augmented assignment operator is included.

* The "``None``-aware attribute access" operator ``?.`` evaluates the complete expression if the left hand side evaluates to a value that is not ``None``.

* The "``None``-aware indexing" operator ``?[]`` evaluates the complete expression if the left hand side evaluates to a value that is not ``None``.

Syntax and Semantics
====================

Specialness of ``None``
-----------------------

The ``None`` object denotes the lack of a value. For the purposes of these operators, the lack of a value indicates that the remainder of the expression also lacks a value and should not be evaluated.

A rejected proposal was to treat any value that evaluates to false in a Boolean context as not having a value. However, the purpose of these operators is to propagate the "lack of value" state, rather than the "false" state.

Some argue that this makes ``None`` special. We contend that ``None`` is already special, and that using it as both the test and the result of these operators does not change the existing semantics in any way.

See the `Rejected Ideas`_ section for discussion on the rejected approaches. 
Grammar changes
---------------

The following rules of the Python grammar are updated to read::

    augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
                '<<=' | '>>=' | '**=' | '//=' | '??=')

    power: coalesce ['**' factor]
    coalesce: atom_expr ['??' factor]
    atom_expr: ['await'] atom trailer*
    trailer: ('(' [arglist] ')' |
              '[' subscriptlist ']' |
              '?[' subscriptlist ']' |
              '.' NAME |
              '?.' NAME)

Inserting the ``coalesce`` rule in this location ensures that expressions resulting in ``None`` are natuarlly coalesced before they are used in operations that would typically raise ``TypeError``. Like ``and`` and ``or`` the right-hand expression is not evaluated until the left-hand side is determined to be ``None``
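The lazy evaluation of the right-hand side can be imitated in current Python by passing it as a callable. The sketch below is only an approximation of the proposed semantics; `coalesce` and `fallback` are names invented for this example, and the contrast with ``or`` is included to show the difference between a ``None`` test and a truthiness test.

```python
def coalesce(left, make_right):
    """Stand-in for `left ?? right`; make_right runs only if left is None."""
    return make_right() if left is None else left


evaluated = []


def fallback():
    """Hypothetical right-hand side that records when it is evaluated."""
    evaluated.append("right")
    return "fallback"


kept_zero = coalesce(0, fallback)       # 0 is not None, so the right side is skipped
kept_empty = coalesce("", fallback)     # "" is not None either
replaced = coalesce(None, fallback)     # only here is the right side evaluated
with_or = 0 or "fallback"               # `or` tests truthiness and replaces 0
```

Only the call with ``None`` on the left triggers `fallback`, so `evaluated` ends up with a single entry.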
Re: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!
    # Dict display
    data = {
        key_a: local_a := 1,
        key_b: local_b := 2,
        key_c: local_c := 3,
    }

Isn't this a set display with local assignments and type annotations? :o)

(I'm -1 on all of these ideas, btw. None help readability for me, and I read much more code than I write.)

Top-posted from my Windows phone

From: Nick Coghlan
Sent: Sunday, April 8, 2018 6:27
To: Chris Angelico
Cc: python-ideas
Subject: Re: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

On 23 March 2018 at 20:01, Chris Angelico wrote:
> Apologies for letting this languish; life has an annoying habit of
> getting in the way now and then.
>
> Feedback from the previous rounds has been incorporated. From here,
> the most important concern and question is: Is there any other syntax
> or related proposal that ought to be mentioned here? If this proposal
> is rejected, it should be rejected with a full set of alternatives.

I was writing a new stdlib test case today, and thinking about how I might structure it differently in a PEP 572 world, and realised that a situation the next version of the PEP should discuss is this one:

    # Dict display
    data = {
        key_a: 1,
        key_b: 2,
        key_c: 3,
    }

    # Set display with local name bindings
    data = {
        local_a := 1,
        local_b := 2,
        local_c := 3,
    }

    # List display with local name bindings
    data = [
        local_a := 1,
        local_b := 2,
        local_c := 3,
    ]

    # Dict display
    data = {
        key_a: local_a := 1,
        key_b: local_b := 2,
        key_c: local_c := 3,
    }

    # Dict display with local key name bindings
    data = {
        local_a := key_a: 1,
        local_b := key_b: 2,
        local_c := key_c: 3,
    }

I don't think this is bad (although the interaction with dicts is a bit odd), and I don't think it counts as a rationale either, but I do think the fact that it becomes possible should be noted as an outcome arising from the "No sublocal scoping" semantics.

Cheers,
Nick.

P.S. 
The specific test case is one where I want to test the three different ways of spelling "the current directory" in some sys.path manipulation code (the empty string, os.curdir, and os.getcwd()), and it occurred to me that a version of PEP 572 that omits the sublocal scoping concept will allow inline naming of parts of data structures as you define them.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
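For reference, the test case Nick describes can be written with assignment expressions as they were eventually released in Python 3.8. This is a sketch under that assumption (it needs 3.8 or later), the variable names are invented, and the parentheses around each binding are a conservative choice so the example does not depend on which displays accept the unparenthesized form.

```python
import os

# Name each spelling of "the current directory" while building the list.
paths = [
    (empty := ""),
    (curdir := os.curdir),
    (cwd := os.getcwd()),
]

# The names remain bound after the display, since there is no sublocal scope.
same_values = paths == [empty, curdir, cwd]
```

The bindings leak into the enclosing scope, which is the outcome of the "no sublocal scoping" semantics noted in the message.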
Re: [Python-ideas] New PEP proposal -- Pathlib Module Should Contain All File Operations -- version 2
I had a colleague complaining to me the other day about having to search multiple packages for the right function to move a file (implying: with the same semantics as drag-drop).

If there isn't a pathtools library on PyPI yet, this would certainly be valuable for newer developers. My view on Path is to either have everything on it or nothing on it (without removing what's already there, of course), and since everything is so popular we should at least put everything in the one place.

Top-posted from my Windows phone

From: Mike Miller
Sent: Monday, March 19, 2018 10:51
To: python-ideas@python.org
Subject: Re: [Python-ideas] New PEP proposal -- Pathlib Module Should Contain All File Operations -- version 2

On 2018-03-18 10:55, Paul Moore wrote:
>> Should Path() have methods to access all file operations?
>
> No, (Counterexample, having a Path operation to set Windows ACLs for a path).

Agreed, not a big fan of everything filesystem-related in pathlib, simply because it doesn't read well. Having them scattered isn't a great experience either.

Perhaps it would be better to have a filesystem package instead, maybe named "fs" that included all this stuff in one easy to use location. File stuff from os, path stuff from os.path, pathlib, utils like stat, and shutil etc?
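To illustrate the complaint about searching multiple modules, moving a file currently means combining `pathlib` for the path handling with `shutil` for the move itself. The sketch below uses only standard-library calls; `move_into` is a hypothetical helper name, and it makes no claim to match drag-and-drop semantics.

```python
import shutil
import tempfile
from pathlib import Path


def move_into(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir (created if needed) and return the new path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the destination it used; str() keeps older versions happy.
    return Path(shutil.move(str(src), str(dest_dir / src.name)))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "report.txt"
    source.write_text("hello")
    moved = move_into(source, root / "archive")
    result = (moved.name, moved.read_text(), source.exists())
```

A single "fs"-style package, as suggested in the quoted message, would put both halves of this in one place.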
Re: [Python-ideas] Possible Enhancement to py Launcher - set default
Checking the Version (!=SysVersion) property should be enough (and perhaps we need to set it properly on install). The launcher currently only works with PythonCore entries anyway, so no need to worry about other distros.

PEP 514 allows for other keys to be added as well (it specifies a minimum set), so we could just set one for this. “NoDefaultLaunch” or similar.

Finally, if someone created a script for setting py.ini, it could probably be included in the Tools directory. Wouldn’t be run on install or get a start menu shortcut though, just to set expectations right.

Top-posted from my Windows phone

From: Paul Moore
Sent: Wednesday, February 7, 2018 7:37
To: Alex Walters
Cc: Python-Ideas
Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set default

I don't think so. As an example, what registry keys would Anaconda write to say that Release 5.2.1.7 is a pre-release version? Or would the py launcher have to parse the version looking for rc/a/b/... tags? And distributions would have to agree on how they record pre-release version numbers?

Paul

On 7 February 2018 at 14:57, Alex Walters wrote:
>> -----Original Message-----
>> From: Paul Moore [mailto:p.f.mo...@gmail.com]
>> Sent: Wednesday, February 7, 2018 4:15 AM
>> To: Alex Walters
>> Cc: Steve Barnes; Python-Ideas
>> Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set default
>>
> ...
>>
>> IMO the biggest technical issue with this is that as far as I can see
>> PEP 514 doesn't specify a way to determine if a given Python is a
>> pre-release version. If we do want to implement this (I'm +0 on it,
>> personally) then I think the starting point would need to be an update
>> to PEP 514 to include that data.
>>
>> Paul
>
> Looking at pep 514, it looks like sys.winver is what would have to change to
> support reporting the release status to the registry. I don't think 514 has
> to change at all if sys.winver changes. Is that a correct interpretation?
Re: [Python-ideas] Format mini-language for lakh and crore
Someone would have to check, but presumably the CRT on Windows is converting the natively thread-local locale into a process-wide locale for POSIX compatibility, which means it can probably be easily bypassed without having to use specific overloads. Top-posted from my Windows phone From: Nathaniel Smith Sent: Monday, January 29, 2018 11:29 To: Eric V. Smith Cc: python-ideas Subject: Re: [Python-ideas] Format mini-language for lakh and crore On Sun, Jan 28, 2018 at 5:46 AM, Eric V. Smithwrote: > If I recall correctly, we discussed this at the time, and the problem with > locale is that it's not thread safe. I agree that if it were, it would be > nice to be able to use it, either with 'n', or in some other mode just for > grouping. > > The underlying C setlocale()/localeconv() just isn't very friendly to this > use case. POSIX.1-2008 added thread-local locales (say that 3x fast); see uselocale(3). This appears to be supported on Linux (since glibc 2.3, which is older than all supported enterprise distros), MacOS, and the BSDs, but not Windows. OTOH Windows, MacOS, and the BSDs all seem to provide the non-standard sprintf_l, which takes an explicit locale to use. So it looks like all mainstream OSes actually make it possible to use a specific locale to do arbitrary formatting in a thread-safe way. -n -- Nathaniel J. Smith -- https://vorpus.org ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
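For the grouping this thread is about, one way to sidestep the process-wide locale question entirely is to group the digits in pure Python. The following is a minimal sketch, assuming the Indian convention of a final group of three digits preceded by groups of two; `group_indian` is a hypothetical helper name, not an existing or proposed API, and it only handles integers.

```python
def group_indian(n: int) -> str:
    """Group digits as lakh/crore: last three digits, then pairs (hypothetical helper)."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel off two digits at a time from the right of the remaining head.
    while len(head) > 2:
        groups.append(head[-2:])
        head = head[:-2]
    groups.append(head)  # one or two leading digits remain here
    return sign + ",".join(reversed(groups)) + "," + tail


print(group_indian(12345678))
print(group_indian(100000))
```

Because this never calls setlocale() or localeconv(), it is unaffected by the thread-safety concerns raised above, at the cost of hard-coding one grouping convention.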
Re: [Python-ideas] Windows Best Fit Encodings
On 20Jan2018 0518, M.-A. Lemburg wrote:
> do you know of a definite resource for Windows code pages on MSDN or
> another official MS website ?

I don't know of anything sorry, and my quick search didn't turn up anything public.

But I can at least confirm that the internal table for cp1252 has the same undefined characters as on unicode.org, so presumably if MultiByteToWideChar is mapping those to "best fit" characters it's only because the flag has been passed. As far as I can tell, Microsoft has not been secretly redefining any encodings.

Cheers,
Steve
Re: [Python-ideas] Support WHATWG versions of legacy encodings
On 12Jan2018 0342, Random832 wrote:
> On Thu, Jan 11, 2018, at 04:55, Serhiy Storchaka wrote:
>> The way of solving this issue in Python is using an error handler. The "surrogateescape" error handler is specially designed for lossless reversible decoding. It maps every unassigned byte in the range 0x80-0xff to a single character in the range U+dc80-U+dcff. This allows you to distinguish correctly decoded characters from the escaped bytes, perform character by character processing of the decoded text, and encode the result back with the same encoding.
>
> Maybe we need a new error handler that maps unassigned bytes in the range 0x80-0x9f to a single character in the range U+0080-U+009F. Do any of the encodings being discussed have behavior other than the "normal" version of the encoding plus what I just described?

+1 on this being an error handler (if possible). I suspect the semantics will be more complex than suggested above, but as this seems to be about handling normally un[en/de]codable characters, using an error handler to return something more sensible best represents what is going on.

Call it something like 'web' or 'relaxed' or 'whatwg'.

I don't know if error handlers have enough context for this though. If not, we should ensure they can have it. I'd much rather explain one new error handler to most people (and a more complex API for implementing them to the few people who do it) than explain a whole suite of new encodings.

Cheers,
Steve
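The error-handler idea above can be tried today with codecs.register_error. This is only a sketch of the decoding direction, assuming the handler name "c1passthrough" (hypothetical, not one of the names suggested in the thread) and relying on the fact that Python's cp1252 codec leaves byte 0x81 undefined.

```python
import codecs


def c1_passthrough(exc):
    """Hypothetical handler: map undecodable bytes 0x80-0x9F to U+0080-U+009F."""
    if isinstance(exc, UnicodeDecodeError):
        bad = exc.object[exc.start:exc.end]
        if all(0x80 <= byte <= 0x9F for byte in bad):
            # Return the replacement text and the position to resume decoding at.
            return "".join(chr(byte) for byte in bad), exc.end
    # Anything else (encoding errors, other byte ranges) is still an error.
    raise exc


codecs.register_error("c1passthrough", c1_passthrough)

# 0xE9 is defined in cp1252 (e-acute); 0x81 is not, so the handler is consulted.
decoded = b"caf\xe9 \x81".decode("cp1252", errors="c1passthrough")
print(ascii(decoded))
```

The handler only sees the exception object (the codec name, the input, and the failing span), which is the "enough context" question raised above: it works here because the mapping does not depend on which encoding is in use.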
Re: [Python-ideas] Make functions, methods and descriptor types living in the types module
I certainly have code that joins __module__ with __name__ to create a fully-qualified name (with special handling for those builtins that are not in builtins), and IIUC __qualname__ doesn't normally include the module name either (it's intended for nested types/functions). Can we make it visible when you import the builtins module, but not in the builtins namespace? Cheers, Steve On 12Jan2018 0941, Victor Stinner wrote: I like the idea of having a fully qualified name that "works" (can be resolved). I don't think that repr() should change, right? Can this change break the backward compatibility somehow? Victor Le 11 janv. 2018 21:00, "Serhiy Storchaka"> a écrit : Currently the classes of functions (implemented in Python and builtin), methods, and different type of descriptors, generators, etc have the __module__ attribute equal to "builtins" and the name that can't be used for accessing the class. >>> def f(): pass ... >>> type(f) >>> type(f).__module__ 'builtins' >>> type(f).__name__ 'function' >>> type(f).__qualname__ 'function' >>> import builtins >>> builtins.function Traceback (most recent call last): File "", line 1, in AttributeError: module 'builtins' has no attribute 'function' But most of this classes (if not all) are exposed in the types module. I suggest to rename them. Make the __module__ attribute equal to "builtins" and the __name__ and the __qualname__ attributes equal to the name used for accessing the class in the types module. This would allow to pickle references to these types. Currently this isn't possible. >>> pickle.dumps(types.FunctionType) Traceback (most recent call last): File "", line 1, in _pickle.PicklingError: Can't pickle : attribute lookup function on builtins failed And this will help to implement the pickle support of dynamic functions etc. Currently the third-party library that implements this needs to use a special purposed factory function (not compatible with other similar libraries) since types.FunctionType isn't pickleable. 
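A small sketch of the "fully qualified name that can be resolved" check discussed above. The helper names `qualified_name` and `resolves` are hypothetical; the sketch simply joins `__module__` with `__qualname__` and then tries to walk back to the same object.

```python
import collections
import importlib
import types


def qualified_name(obj):
    """Join __module__ and __qualname__, as described in the message above."""
    return f"{obj.__module__}.{obj.__qualname__}"


def resolves(obj):
    """Return True if importing __module__ and walking __qualname__ finds obj again."""
    try:
        target = importlib.import_module(obj.__module__)
        for part in obj.__qualname__.split("."):
            target = getattr(target, part)
    except (ImportError, AttributeError):
        return False
    return target is obj


for cls in (collections.OrderedDict, types.FunctionType):
    print(qualified_name(cls), resolves(cls))
```

For an ordinary class the name round-trips; for the function type the result depends on whether the interpreter exposes that name in the module it claims to live in, which is exactly the gap the proposal is about.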
Re: [Python-ideas] Looking for input to help with the pip situation
On 15Nov2017 0617, Nick Coghlan wrote:
> On 15 November 2017 at 22:46, Michel Desmoulin <desmoulinmic...@gmail.com> wrote:
>> Should I do a PEP with a summary of all the stuff we discussed ?
>
> I think a Windows-specific PEP covering adding PATH updates back to the default installer behaviour, and adding pythonX and pythonX.Y commands would be useful (and Guido would presumably delegate resolving that to Steve Dower as the Windows installer maintainer).

If you write such a PEP, please also research and write up the issues with modifying PATH on Windows (they're largely scattered throughout bugs.p.o and earlier discussions on python-dev). Once you realise the tradeoff involved in modifying these global settings, you'll either come around to my point of view or be volunteering to take *all* the support questions when they come in :)

> The one thing I'd ask is that any such PEP *not* advocate for promoting the variants as the preferred way of invoking Python on Windows - rather, they should be positioned as a way of making online instructions written for Linux more likely to "just work" for folks on Windows (similar to the utf-8 encoding changes in https://www.python.org/dev/peps/pep-0529/)
>
> Instead, the focus should be on ensuring the "python -m pip install" and "pip install" both work after clicking through the installer without changing any settings, and devising a troubleshooting guide to help folks that are familiar with computers and Python, but perhaps not with Windows, guide folks to a properly working environment.

My preferred solution for this is to rename "py.exe" to "python.exe" (or rather, make a copy of it with the new name), and extend (or more likely, rewrite) the launcher such that:

* if argv[0] == "py.exe", use PEP 514 company/tag resolution to find and launch Python based on first command line argument
* if argv[0] == "python<tag>.exe", find the matching PythonCore/<tag> install (where tag may be a partial match - e.g. "python3.exe" finds the latest PythonCore/3.x)
* else, if argv[0] == "<module><tag>.exe", find the matching PythonCore/<tag> install and launch "-m <module>"

With the launcher behaving like this, we can make as many hard links as we want in its install directory (it only gets installed once, so only needs one PATH entry, and this is C:\Windows for admin installs):

* python.exe
* python2.exe
* python3.exe
* python3.6.exe
* pip.exe
* pip2.exe
* pip3.exe

As well as allowing e.g. "py.exe -anaconda36-64 ..." to reliably locate and run non-Python.org installs.

It needs to be fully specced out, obviously, and we may want to move the all-users install to its own directory to reduce clutter, but part of the reason behind PEP 514 was to enable this sort of launcher. It could even extend to "you don't have this version right now, want to download and install it?"

And finally it should be fairly obvious that this doesn't have to be a core Python tool. It has no reliance on anything in core (that isn't already specified in a PEP) and could be written totally independently. I've tried (weakly) to get work time allocated to this in the past, and if it's genuinely not going to get done unless I do it then I'll try again.

Cheers,
Steve
Re: [Python-ideas] PEP draft: context variables
On 13Oct2017 1132, Yury Selivanov wrote: On Fri, Oct 13, 2017 at 1:45 PM, Ethan Furman <et...@stoneleaf.us> wrote: On 10/13/2017 09:48 AM, Steve Dower wrote: On 13Oct2017 0941, Yury Selivanov wrote: Actually, capturing context at the moment of coroutine creation (in PEP 550 v1 semantics) will not work at all. Async context managers will break. class AC: async def __aenter__(self): pass ^ If the context is captured when coroutines are instantiated, __aenter__ won't be able to set context variables and thus affect the code it wraps. That's why coroutines shouldn't capture context when created, nor they should isolate context. It's a job of async Task. Then make __aenter__/__aexit__ when called by "async with" an exception to the normal semantics? It seems simpler to have one specially named and specially called function be special, rather than make the semantics more complicated for all functions. It's not possible to special case __aenter__ and __aexit__ reliably (supporting wrappers, decorators, and possible side effects). Why not? Can you not add a decorator that sets a flag on the code object that means "do not create a new context when called", and then it doesn't matter where the call comes from - these functions will always read and write to the caller's context. That seems generally useful anyway, and then you just say that __aenter__ and __aexit__ are special and always have that flag set. +1. I think that would make it much more usable by those of us who are not experts. I still don't understand what Steve means by "more usable", to be honest. I don't know that I said "more usable", but it would certainly be easier to explain. The Zen has something to say about that... Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] PEP draft: context variables
On 13Oct2017 0941, Yury Selivanov wrote:
> On Fri, Oct 13, 2017 at 3:25 AM, Nick Coghlan wrote:
> [..]
>> However, considering that coroutines are almost always instantiated at the point where they're awaited, I do concede that creation time context capture would likely also work out OK for the coroutine case, which would leave contextlib.contextmanager as the only special case (and it would turn off both creation-time context capture *and* context isolation).
>
> Actually, capturing context at the moment of coroutine creation (in PEP 550 v1 semantics) will not work at all. Async context managers will break.
>
>     class AC:
>         async def __aenter__(self):
>             pass
>
> ^ If the context is captured when coroutines are instantiated, __aenter__ won't be able to set context variables and thus affect the code it wraps. That's why coroutines shouldn't capture context when created, nor should they isolate context. It's a job of async Task.

Then make __aenter__/__aexit__ when called by "async with" an exception to the normal semantics?

It seems simpler to have one specially named and specially called function be special, rather than make the semantics more complicated for all functions.

Cheers,
Steve
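The behaviour Yury describes can be observed with the contextvars module that later shipped in Python 3.7. That module is not the PEP 550 design under discussion here, so treat this as an illustration under that assumption: a coroutine awaited in the same task shares the caller's context, while a separate task gets a copy.

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default="unset")


class AC:
    async def __aenter__(self):
        # Runs in the caller's context, so the body of "async with" sees this.
        var.set("set by __aenter__")

    async def __aexit__(self, *exc_info):
        return False


async def child():
    # Runs in a copy of the context; this change does not leak to the parent.
    var.set("set in child task")


async def main():
    async with AC():
        inside = var.get()
    await asyncio.create_task(child())
    after_task = var.get()
    return inside, after_task


inside, after_task = asyncio.run(main())
print(inside, "|", after_task)
```

Here the isolation boundary is the task, not the coroutine, which is the "job of async Task" split from the quoted message.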
Re: [Python-ideas] PEP draft: context variables
On 11Oct2017 0458, Koos Zevenhoven wrote: Exactly. You did say it less politely than I did, but this is exactly how I thought about it. And I'm not sure people got it the first time. Yes, perhaps a little harsh. However, if I released a refactoring tool that moved function calls that far, people would file bugs against it for breaking their code (and in my experience of people filing bugs against tools that break their code, they can also be a little harsh). I want PEP 555 to be how things *should be*, not how things are. Agreed. Start with the ideal target and backpedal when a sufficient case has been made to justify it. That's how Yury's PEP has travelled, but I disagree that this example is a compelling case for the amount of bending that is being done. New users of this functionality very likely won’t assume that TLS is the semantic equivalent, especially when all the examples and naming make it sound like context managers are more related. (I predict people will expect this to behave more like unstated/implicit function arguments and be captured at the same time as other arguments are, but can’t really back that up except with gut-feel. It's certainly a feature that I want for myself more than I want another spelling for TLS…) I assume you like my decision to rename the concept to "context arguments" :). And indeed, new use cases would be more interesting than existing ones. Surely we don't want new use cases to copy the semantics from the old ones which currently have issues (because they were originally designed to work with traditional function and method calls, and using then-available techniques). I don't really care about names, as long as it's easy to use them to research the underlying concept or intended functionality. And I'm not particularly supportive of this concept as a whole anyway - EIBTI and all. But since it does address a fairly significant shortcoming in existing code, we're going to end up with something. 
If it's a new runtime feature then I'd like it to be an easy concept to grasp with clever hacks for the compatibility cases (and I do believe there are clever hacks available for getting "inject into my deferred function call" semantics), rather than the whole thing being a complicated edge-case. Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] PEP draft: context variables
Nick: “I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass”

I’m following this discussion from a distance, but cared enough about this point to chime in without even reading what comes later in the thread. (Hopefully it’s not twenty people making the same point…)

I HATE this example! Looking solely at the code we can see, you are refactoring a function call from inside an *explicit* context manager to outside of it, and assuming the behavior will not change. There’s *absolutely no* logical or semantic reason that these should be equivalent, especially given the obvious alternative of leaving the call within the explicit context. Even moving the function call before the setattr can’t be assumed to not change its behavior – how is moving it outside a with block ever supposed to be safe?

I appreciate the desire to be able to take currently working code using one construct and have it continue working with a different construct, but the burden should be on that library and not the runtime. By that I mean that the parts of decimal that set and read the context should do the extra work to maintain compatibility (e.g. through a globally mutable structure using context variables as a slightly more fine-grained key than thread ID) rather than forcing an otherwise straightforward core runtime feature to jump through hoops to accommodate it.

New users of this functionality very likely won’t assume that TLS is the semantic equivalent, especially when all the examples and naming make it sound like context managers are more related. (I predict people will expect this to behave more like unstated/implicit function arguments and be captured at the same time as other arguments are, but can’t really back that up except with gut-feel.
It's certainly a feature that I want for myself more than I want another spelling for TLS…) Top-posted from my Windows phone From: Nick Coghlan Sent: Tuesday, October 10, 2017 5:35 To: Guido van Rossum Cc: Python-Ideas Subject: Re: [Python-ideas] PEP draft: context variables On 10 October 2017 at 01:24, Guido van Rossumwrote: On Sun, Oct 8, 2017 at 11:46 PM, Nick Coghlan wrote: On 8 October 2017 at 08:40, Koos Zevenhoven wrote: I do remember Yury mentioning that the first draft of PEP 550 captured something when the generator function was called. I think I started reading the discussions after that had already been removed, so I don't know exactly what it was. But I doubt that it was *exactly* the above, because PEP 550 uses set and get operations instead of "assignment contexts" like PEP 555 (this one) does. We didn't forget it, we just don't think it's very useful. I'm not sure I agree on the usefulness. Certainly a lot of the complexity of PEP 550 exists just to cater to Nathaniel's desire to influence what a generator sees via the context of the send()/next() call. I'm still not sure that's worth it. In 550 v1 there's no need for chained lookups. The compatibility concern is that we want developers of existing libraries to be able to transparently switch from using thread local storage to context local storage, and the way thread locals interact with generators means that decimal (et al) currently use the thread local state at the time when next() is called, *not* when the generator is created. 
I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass

The easiest way to maintain that equivalence is to say that even though preventing state changes leaking *out* of generators is considered a desirable change, we see preventing them leaking *in* as a gratuitous backwards compatibility break.

This does mean that *neither* form is semantically equivalent to eager extraction of the generator values before the decimal context is changed, but that's the status quo, and we don't have a compelling justification for changing it.

If folks subsequently decide that they *do* want "capture on creation" or "capture on first iteration" semantics for their generators, those are easy enough to add as wrappers on top of the initial thread-local-compatible base by using the same building blocks as are being added to help event loops manage context snapshots for coroutine execution.

Cheers,
Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
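The equivalence Nick describes can be checked directly. In this sketch `gen` is a stand-in generator (an assumption; the thread does not define it) whose results depend on the decimal precision in effect when each value is produced.

```python
import decimal


def gen():
    # Each division uses whatever decimal context is active at next() time.
    for _ in range(2):
        yield decimal.Decimal(1) / decimal.Decimal(3)


with decimal.localcontext() as outer:
    outer.prec = 10
    g = gen()            # created while prec is 10, but not yet started
    eager = list(gen())  # values extracted eagerly under prec 10

    with decimal.localcontext() as ctx:
        ctx.prec = 5
        created_inside = list(gen())
        created_outside = list(g)

print(created_inside == created_outside, eager[0], created_inside[0])
```

Both generators produce five-digit results because only the context at iteration time matters, and neither matches the eagerly extracted values, which is the status quo described above.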
Re: [Python-ideas] Add pathlib.Path.write_json and pathlib.Path.read_json
It was enough of a benefit for text (and I never forget the argument order for writing text to a file, unlike json.dump(file_or_data?, data_or_file?) )

+1

Top-posted from my Windows Phone

-----Original Message-----
From: "Paul Moore"
Sent: 3/27/2017 5:57
To: "Ram Rachum"
Cc: "python-ideas"
Subject: Re: [Python-ideas] Add pathlib.Path.write_json and pathlib.Path.read_json

On 27 March 2017 at 13:50, Ram Rachum wrote:
> This would make writing / reading JSON to a file a one liner instead of a
> two-line with clause.

That hardly seems like a significant benefit...

Paul
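For comparison, here is the two-line with clause next to what is already possible with the existing text helpers on Path. The proposed write_json/read_json methods do not exist, so this sketch only uses json and pathlib as they are; the file name and data are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

data = {"name": "spam", "count": 3}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.json"

    # The two-line form: note that json.dump takes the object first, then the file.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    with open(path, encoding="utf-8") as f:
        two_line = json.load(f)

    # The one-liners available today via the text helpers on Path.
    path.write_text(json.dumps(data), encoding="utf-8")
    one_line = json.loads(path.read_text(encoding="utf-8"))

print(two_line == one_line == data)
```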
Re: [Python-ideas] Adding an 'errors' argument to print
On 26Mar2017 0707, Nick Coghlan wrote:
> Perhaps it would be worth noting in the table of error handlers at https://docs.python.org/3/library/codecs.html#error-handlers that backslashreplace is used by the `ascii()` builtin and the associated format specifiers

backslashreplace is also the default errors for stderr, which is arguably the right target for debugging output. Perhaps what we really want is a shorter way to send output to stderr? Though I guess it's an easy to invent one-liner, once you know about the difference:

    >>> printe = partial(print, file=sys.stderr)

Also worth noting that Python 3.6 supports Unicode characters on the console by default on Windows. So unless sys.stdout was manually constructed (a possibility, given this was a GUI app, though I designed the change such that `open("CON", "w")` would get it right), there wouldn't have been an encoding issue in the first place.

Cheers,
Steve
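To see what backslashreplace does without depending on the real console, the sketch below writes to an in-memory ASCII stream configured the same way. The stream stands in for stderr (an assumption made so the output is reproducible); `printe` follows the one-liner above.

```python
import io
from functools import partial

text = "snowman \u2603"

# Encoding directly: unencodable characters become backslash escapes.
escaped = text.encode("ascii", errors="backslashreplace")

# An ASCII-only text stream using the same error handler, standing in for stderr.
buffer = io.BytesIO()
stream = io.TextIOWrapper(
    buffer, encoding="ascii", errors="backslashreplace", newline="\n"
)
printe = partial(print, file=stream)
printe(text)
stream.flush()
written = buffer.getvalue()

print(escaped, written, ascii(text))
```

With the default "strict" handler the same print call would raise UnicodeEncodeError instead, which is the failure an `errors` argument on print was meant to avoid.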
Re: [Python-ideas] Fwd: Define a method or function attribute outside of a class with the dot operator
On 10Feb2017 1400, Stephan Hoyer wrote:
> An important note is that ideally, we would still have a way of indicating that Spam.func should exist on the Spam class itself, even if it doesn't define the implementation. I suppose an abstractmethod overwritten by the later definition might do the trick, e.g.,
>
>     class Spam(metaclass=ABCMeta):
>         @abstractmethod
>         def func(self):
>             pass
>
>     def Spam.func(self):
>         return __class__

An abstract function should not become a concrete function on the abstract class - the right way to do this is to use a subclass.

    class SpamBase(metaclass=ABCMeta):
        @abstractmethod
        def func(self):
            pass

    class Spam(SpamBase):
        def func(self):
            return __class__

If you want to define parts of the class in separate modules, use mixins:

    from myarray.transforms import MyArrayTransformMixin
    from myarray.arithmetic import MyArrayArithmeticMixin
    from myarray.constructors import MyArrayConstructorsMixin

    class MyArray(MyArrayConstructorsMixin, MyArrayArithmeticMixin, MyArrayTransformMixin):
        pass

The big difference between these approaches and the proposal is that the proposal does not require both parties to agree on the approach. This is actually a terrible idea, as subclassing or mixing in a class that wasn't meant for it leads to all sorts of trouble unless the end user is very careful. Providing first-class syntax or methods for this discourages carefulness. (Another way of saying it is that directly overriding class members should feel a bit dirty because it *is* a bit dirty.)

As Paul said in an earlier email, the best use of non-direct assignment in function definitions is putting it into a dispatch dictionary, and in this case making a decorator is likely cleaner than adding new syntax.

But by all means, let's have a PEP. It will simplify the discussion when it comes up in six months again (or whenever the last time this came up was - less than a year, I'm sure).

Cheers,
Steve
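The dispatch-dictionary decorator mentioned above might look something like this. It is a minimal sketch: `HANDLERS`, `register`, and `dispatch` are hypothetical names chosen for illustration, not anything proposed in the thread.

```python
HANDLERS = {}


def register(name):
    """Decorator that stores the function in HANDLERS under the given name."""
    def decorator(func):
        HANDLERS[name] = func
        return func  # leave the function itself unchanged
    return decorator


@register("add")
def _add(a, b):
    return a + b


@register("mul")
def _mul(a, b):
    return a * b


def dispatch(name, *args):
    """Look up a handler by name and call it."""
    return HANDLERS[name](*args)


print(dispatch("add", 2, 3), dispatch("mul", 2, 3))
```

The registration happens at definition time, which is the effect the "def d['add'](...)" style of syntax was after, but without new grammar and without touching any class.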
Re: [Python-ideas] Fwd: Define a method or function attribute outside of a class with the dot operator
When you apply the "what if everyone did this" rule, it looks like a bad idea (or alternatively, what if two people who weren't expecting anyone else to do this did it). Monkeypatching is fairly blatantly taking advantage of the object model in a way that is not "supported" and cannot behave well in the context of everyone doing it, whereas inheritance or mixins are safe. Making a dedicated syntax or decorator for patching is saying that we (the language) think you should do it. (The extension_method decorator sends exactly the wrong message about what it's doing.) Enabling a __class__ variable within the scope of the definition would also solve the motivating example, and is less likely to lead to code where you need to review multiple modules and determine whole-program import order to figure out why your calls do not work. Top-posted from my Windows Phone -Original Message- From: "Markus Meskanen" <markusmeska...@gmail.com> Sent: 2/10/2017 10:18 To: "Paul Moore" <p.f.mo...@gmail.com> Cc: "Python-Ideas" <python-ideas@python.org>; "Steve Dower" <steve.do...@python.org> Subject: Re: [Python-ideas] Fwd: Define a method or function attributeoutsideof a class with the dot operator Well yes, but I think you're a bit too fast on labeling it a mistake to use monkey patching... On Feb 10, 2017 18:15, "Paul Moore" <p.f.mo...@gmail.com> wrote: On 10 February 2017 at 16:09, Markus Meskanen <markusmeska...@gmail.com> wrote: > But if people are gonna do it anyways with the tools provided (monkey > patching), why not provide them with better tools? Because encouraging and making it easier for people to make mistakes is the wrong thing to do, surely? Paul___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Fwd: Define a method or function attribute outside of a class with the dot operator
Since votes seem to be being counted and used for debate purposes, I am -1 to anything that encourages or condones people adding functionality to classes outside of the class definition. (Monkeypatching in my mind neither condones or encourages, and most descriptions come with plenty of caveats about how it should be avoided.) My favourite description of object-oriented programming is that it's like "reading a road map through a drinking(/soda/pop) straw". We do not need to tell people that it's okay to make this problem worse by providing first-class tools to do it. Top-posted from my Windows Phone -Original Message- From: "Chris Angelico"Sent: 2/10/2017 8:27 To: "Python-Ideas" Subject: Re: [Python-ideas] Fwd: Define a method or function attributeoutside of a class with the dot operator On Sat, Feb 11, 2017 at 1:16 AM, Nick Coghlan wrote: > But what do __name__ and __qualname__ get set to? > > What happens if you do this at class scope, rather than at module > level or inside another function? > > What happens to the zero-argument super() support at class scope? > > What happens if you attempt to use zero-argument super() when *not* at > class scope? > > These are *answerable* questions... ... and are exactly why I asked the OP to write up a PEP. This isn't my proposal, so it's not up to me to make the decisions. For what it's worth, my answers would be: __name__ would be the textual representation of exactly what you typed between "def" and the open parenthesis. __qualname__ would be built the exact same way it currently is, based on that __name__. Zero-argument super() would behave exactly the way it would if you used a simple name. This just changes the assignment, not the creation of the function. So if you're inside a class, you could populate a lookup dictionary with method-like functions. Abuse this, and you're only shooting your own foot. Zero-argument super() outside of a class, just as currently, would be an error. 
(Whatever kind of error it currently is.) Maybe there are better answers to these questions, I don't know. That's what the PEP's for. ChrisA
Re: [Python-ideas] Unified TLS API for Python
On 02Feb2017 0601, Cory Benfield wrote: 4. Eventually, integrating the two backends above into the standard library so that it becomes possible to reduce the reliance on OpenSSL. This would allow future Python implementations to ship with all of their network protocol libraries supporting platform-native TLS implementations on Windows and macOS. This will almost certainly require new PEPs. I’ll probably volunteer to maintain a SecureTransport library, and I have got verbal suggestions from some other people who’d be willing to step up and help with that. Again, we’d need help with SChannel (looking at you, Steve). I'm always somewhat interested in learning a new API that I've literally never looked at before, so yeah, count me in :) (my other work was using the trust APIs directly, rather than the secure socket APIs). PyCon US sprints? It's not looking like I'll be able to set aside too much time before then, but I've already fenced off that time. Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] pathlib suggestions
On 25Jan2017 0816, Petr Viktorin wrote: On 01/25/2017 04:33 PM, Todd wrote: But what if the .tar.gz file is called "spam-4.2.5-final.tar.gz"? Existing tools like glob and endswith() can deal with the ".tar.gz" extension reliably, but "fullsuffix" would, arguably, not give the answers you want. I wouldn't use it in that situation. The existing "suffix" and "stem" properties also only work reliably under certain situations. Which situations do you mean? It works quite fine with multiple suffixes: The suffix of "pip-9.0.1.tar.gz" is ".gz", and sure enough, you can reasonably expect it's a gz-compressed file. If you uncompress it and strip the extension, you'll end up with a "pip-9.0.1.tar", where the suffix is ".tar" -- and humans would be surprised if it wasn't a tar archive. It may be handy if suffixes was a reversed tuple of suffixes (or possibly a cumulative tuple): >>> Path('pip-9.0.1.tar.gz').suffixes ('.gz', '.tar', '.1', '.0') This has a nice benefit for comparisons: >>> targzs = [f for f in all_files if f.suffixes[:2] == ('.gz', '.tar')] It doesn't necessarily improve over .endswith(), but it has a slight convenience over .split() and arguably demonstrates intent more clearly. (Though my biggest issue with all of this is case-sensitivity, which probably means we need to add comparison functions to Path flavours in order to do this stuff properly.) The "cumulative tuple" version would be like this: >>> Path('pip-9.0.1.tar.gz').suffixes ('.gz', '.tar.gz', '.1.tar.gz', '.0.1.tar.gz') This doesn't compare as nicely, since now we would use f.suffixes[1] which will raise if there is only one suffix (likely). But it does return a value which cannot be easily recreated using other functions. Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
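The two tuple shapes suggested above can be tried out today with small helper functions. This is only a sketch: `reversed_suffixes` and `cumulative_suffixes` are hypothetical names, not pathlib API, and they are built on the existing `suffixes` property.

```python
from pathlib import PurePosixPath


def reversed_suffixes(name):
    """Suffixes from last to first, as in the 'reversed tuple' idea."""
    return tuple(reversed(PurePosixPath(name).suffixes))


def cumulative_suffixes(name):
    """Progressively longer trailing suffixes, as in the 'cumulative' idea."""
    result = []
    tail = ""
    for suffix in reversed_suffixes(name):
        tail = suffix + tail
        result.append(tail)
    return tuple(result)


print(reversed_suffixes("pip-9.0.1.tar.gz"))
print(cumulative_suffixes("pip-9.0.1.tar.gz"))

# Selecting .tar.gz files by comparing the two innermost suffixes.
all_files = ["pip-9.0.1.tar.gz", "notes.txt", "spam-4.2.5-final.tar.gz"]
targzs = [f for f in all_files if reversed_suffixes(f)[:2] == (".gz", ".tar")]
print(targzs)
```

Slicing with `[:2]` does not raise when a file has fewer than two suffixes, which is the convenience over indexing the cumulative form with `[1]`.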
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
Right. Platforms that have a defined invalid value don't need the struct, and so they can define the type differently. It just means we also need to provide a macro for testing whether it's been created or not, and users should genuinely treat the value as opaque. Cheers, Steve Top-posted from my Windows Phone -Original Message- From: "Masayuki YAMAMOTO"Sent: 12/23/2016 16:34 To: "Erik Bray" Cc: "python-ideas@python.org" Subject: Re: [Python-ideas] New PyThread_tss_ C-API for CPython 2016-12-21 19:01 GMT+09:00 Erik Bray : On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan wrote: > Ouch, I'd missed that, and I agree it's not a negligible implementation > detail - there are definitely applications embedding CPython out there that > rely on being able to run multiple Initialize/Finalize cycles in the same > process and have everything "just work". It also means using the > "PyThread_*" prefix for the initialisation tracking aspect would be > misleading, since the life cycle details are: > > 1. Create the key for the first time if it has never been previously set in > the process > 2. Destroy and reinit if Py_Finalize gets called > 3. Destroy and reinit if a new subprocess is forked > > It also means we can't use pthread_once even in the pthread TLS > implementation, since it doesn't provide those semantics. > > So I see two main alternatives here. > > Option 1: Modify the proposed PyThread_tss_create and PyThread_tss_delete > APIs to accept a "bool *init_flag" pointer in addition to their current > arguments. > > If *init_flag is true, then PyThread_tss_create is a no-op, otherwise it > sets the flag to true after creating the key. > If *init_flag is false, then PyThread_tss_delete is a no-op, otherwise it > sets the flag to false after deleting the key. 
> > Option 2: Similar to option 1, but using a custom type alias, rather than > using a C99 bool directly > > The closest API we have to these semantics at the moment would be > PyGILState_Ensure, so the following API naming might work for option 2: > > Py_ensure_t > Py_ENSURE_NEEDS_INIT > Py_ENSURE_INITIALIZED > > Respectively, these would just be aliases for bool, false, and true. > > And then modify the proposed PyThread_tss_create and PyThread_tss_delete > APIs to accept a "Py_ensure_t *init_flag" in addition to their current > arguments. That all sounds good--between the two option 2 looks a bit more explicit. Though what about this? Rather than adding another type, the original proposal could be changed slightly so that Py_tss_t *is* partially defined as a struct consisting of a bool, with whatever the native TLS key is. E.g.

    typedef struct {
        bool init_flag;
    #if defined(_POSIX_THREADS)
        pthread_key_t key;
    #elif defined (NT_THREADS)
        DWORD key;
        /* etc... */
    #endif
    } Py_tss_t;

Then it's just taking Masayuki's original patch, with the global bool variables, and formalizing that by combining the initialized flag with the key, and requiring the semantics you described above for PyThread_tss_create/delete. For Python's purposes it seems like this might be good enough, with the more general purpose pthread_once-like functionality not required. Best, Erik

As mentioned above, in the current TLS API the thread key uses -1 as the defined invalid value. If the new TLS API inherits the requirement that the key has a defined invalid value, putting the key and the flag into one structure seems semantically correct. In this case, I think the TLS API should supply the defined invalid value (like PTHREAD_ONCE_INIT) to API users. Moreover, the structure gives an opportunity to signal, through the field name, that the thread key type is opaque.
I think this suggestion would improve the understandability of the API, because a good field name can signal that reading and writing the key directly is incorrect (even if API users don't read the precautionary statement). Have a nice holiday! Masayuki ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Enhancing vars()
I'm +1. This bites me far too often. > in the past developers were encouraged to put only "useful" attributes in __dir__. Good. If I'm getting vars() I really only want the useful ones. If I need interesting/secret ones then I'll getattr for them. Cheers, Steve Top-posted from my Windows Phone -Original Message- From: "Alexander Belopolsky"Sent: 12/12/2016 19:47 To: "Steven D'Aprano" Cc: "python-ideas" Subject: Re: [Python-ideas] Enhancing vars() On Mon, Dec 12, 2016 at 6:45 PM, Steven D'Aprano wrote: Proposal: enhance vars() to return a proxy to the object namespace, regardless of whether said namespace is __dict__ itself, or a number of __slots__, or both. How do you propose dealing with classes defined in C? Their objects don't have __slots__. One possibility is to use __dir__ or dir(), but those can return anything and in the past developers were encouraged to put only "useful" attributes in __dir__.___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
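To make the problem concrete: `vars()` currently raises TypeError for an object that has no `__dict__`. The helper below is a rough sketch of what a unified view could collect. `namespace` is a hypothetical name, it returns a snapshot dict rather than the live proxy proposed in the thread, and it ignores name-mangled private slots and types defined in C.

```python
def namespace(obj):
    """Collect attributes stored in __slots__ and in __dict__ (snapshot only)."""
    ns = {}
    for cls in reversed(type(obj).__mro__):
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue
            try:
                ns[name] = getattr(obj, name)
            except AttributeError:
                pass  # slot declared but never assigned
    ns.update(getattr(obj, "__dict__", {}))
    return ns


class Point:
    __slots__ = ("x", "y")


class Tagged(Point):
    pass  # no __slots__ here, so instances also get a __dict__


p = Point()
p.x = 1
try:
    vars(p)
except TypeError as exc:
    print("vars() failed:", exc)
print(namespace(p))

t = Tagged()
t.x = 1
t.label = "a"
print(namespace(t))
```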
Re: [Python-ideas] Proposal for default character representation
FWIW, Python 3.6 should print this in the console just fine. Feel free to upgrade whenever you're ready. Cheers, Steve -Original Message- From: "Mikhail V" Sent: 10/12/2016 16:07 To: "M.-A. Lemburg" Cc: "python-ideas@python.org" Subject: Re: [Python-ideas] Proposal for default character representation Forgot to reply to all, duping my message... On 12 October 2016 at 23:48, M.-A. Lemburg wrote: > Hmm, in Python3, I get: > >>> s = "абв.txt" > >>> s > 'абв.txt' I posted output with Python2 and Windows 7 BTW. In Windows 10 'print' won't work in cmd console at all by default with unicode but that's another story, let us not go into that. I think you get my idea right, it is not only about printing. > The hex notation for \u is a standard also used in many other > programming languages, it's also easier to parse, so I don't > think we should change this default. In programming literature it is used often, but let me point out that decimal is THE standard and is much much better standard in sense of readability. And there is no solid reason to use 2 standards at the same time. > > Take e.g. > >>> s = "\u123456" > >>> s > 'ሴ56' > > With decimal notation, it's not clear where to end parsing > the digit notation. How it is not clear if the digit amount is fixed? Not very clear what did you mean. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
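The parsing point in the quoted exchange can be checked directly. The short sketch below shows that `\u` consumes exactly four hex digits and `\U` exactly eight, so the existing hex escapes already rely on a fixed width. Any decimal escape would need a similar fixed-width rule, which is an observation about the discussion and not an existing feature.

```python
s = "\u123456"

# \u takes exactly four hex digits, so this is U+1234 followed by "56".
print(len(s))
print(hex(ord(s[0])), s[1:])

# The same string spelled out explicitly.
same = "\u1234" + "56"
print(s == same)

# Code points above U+FFFF need the eight-digit \U form.
wide = "\U00012345"
print(len(wide), hex(ord(wide)))
```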
Re: [Python-ideas] (Windows-only - calling Steve Dower) Is Python for Windows using PGO? If not consider this a suggestion.
It was disabled previously because of compiler bugs. 3.6.0b1 64-bit has PGO enabled, but we'll disable it again if there are any issues. Top-posted from my Windows Phone -Original Message- From: "João Matos" <jcrma...@gmail.com> Sent: 9/17/2016 4:02 To: "python-ideas@python.org" <python-ideas@python.org> Subject: [Python-ideas] (Windows-only - calling Steve Dower) Is Python for Windows using PGO? If not consider this a suggestion. Hello, Is Python for Windows using PGO (Profile Guided Optimizations)? If not consider this a suggestion. Best regards, JM ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] (Windows-only - calling Steve Dower) Consider adding a symlink to pip in the same location as the py launcher
I'd like to add a launcher in the same style as py.exe, but that would upset people who manually configure their PATH appropriately. Personally, I find "py.exe -m pip" quite okay, but appreciate the idea. I'm thinking about this issue (also for other scripts). Top-posted from my Windows Phone -Original Message- From: "João Matos" <jcrma...@gmail.com> Sent: 9/17/2016 3:57 To: "python-ideas@python.org" <python-ideas@python.org> Subject: [Python-ideas] (Windows-only - calling Steve Dower) Consider adding a symlink to pip in the same location as the py launcher Hello, If Py3.5 is installed in user mode instead of admin (all users) and we follow your advice that we shouldn't add it to the PATH env var, we can execute Python using the py launcher, but we can't use pip. Please consider adding a pip symlink in the same location as the py launcher. Best regards, JM ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Let’s make escaping in f-literals impossible
On 29Aug2016 1433, Eric V. Smith wrote: On 8/29/2016 5:26 PM, Ethan Furman wrote: Update the PEP, then it's a bugfix. ;) Heh. I guess that's true. But it's sort of a big change, so shipping beta 1 with the code not agreeing with the PEP rubs me the wrong way. Or, I could stop worrying and typing emails, and instead just get on with it! I like this approach :) But I agree. Release Manager Ned has the final say, but I think this change can comfortably go in during the beta period. (I also disagree that it's a big change - nobody could agree on the 'obvious' behaviour of backslashes anyway, so chances are people would avoid them anyway, and there was strong consensus on advising people to avoid them.) Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Fix default encodings on Windows
On 18Aug2016 1036, Terry Reedy wrote: On 8/18/2016 11:25 AM, Steve Dower wrote: In this case, we would announce in 3.6 that using bytes as paths on Windows is no longer deprecated, My understanding is that the first 2 fixes refine the deprecation rather than reversing it. And #3 simply applies it. #3 certainly just applies the deprecation. As for the first two, I don't see any reason to deprecate the functionality once the issues are resolved. If using utf-8 encoded bytes is going to work fine in all the same cases as using str, why discourage it? ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
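For readers less familiar with the two path flavours being compared, the sketch below shows how the os functions mirror the argument type in their results. It uses an ASCII file name in a temporary directory so that it behaves the same on any platform. The interesting cases in this thread are non-ASCII names on Windows, which this deliberately does not exercise.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file to list.
    open(os.path.join(tmp, "test.txt"), "wb").close()

    # A str argument gives str results; a bytes argument gives bytes results.
    as_str = os.listdir(tmp)
    as_bytes = os.listdir(os.fsencode(tmp))
    print(as_str, as_bytes)

    # os.fsdecode() converts the bytes names back using the filesystem encoding.
    decoded = [os.fsdecode(name) for name in as_bytes]
    print(decoded == as_str)
```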
Re: [Python-ideas] Fix default encodings on Windows
On 18Aug2016 0900, Chris Angelico wrote: On Fri, Aug 19, 2016 at 1:54 AM, Steve Dower <steve.do...@python.org> wrote: On 18Aug2016 0829, Chris Angelico wrote: The second call to glob doesn't have any Unicode characters at all, the way I see it - it's all bytes. Am I completely misunderstanding this? You're not the only one - I think this has been the most common misunderstanding. On Windows, the paths as stored in the filesystem are actually all text - more precisely, utf-16-le encoded bytes, represented as 16-bit characters strings. Converting to an 8-bit character representation only exists for compatibility with code written for other platforms (either Linux, or much older versions of Windows). The operating system has one way to do the conversion to bytes, which Python currently uses, but since we control that transformation I'm proposing an alternative conversion that is more reliable than compatible (with Windows 3.1... shouldn't affect compatibility with code that properly handles multibyte encodings, which should include anything developed for Linux in the last decade or two). Does that help? I tried to keep the explanation short and focused :) Ah, I think I see what you mean. There's a slight ambiguity in the word "missing" here. 1) The Unicode character in the result lacks some of the information it should have 2) The Unicode character in the file name is information that has now been lost. My reading was the first, but AIUI you actually meant the second. If so, I'd be inclined to reword it very slightly, eg: "The Unicode character in the second call to glob is now lost information." Is that a correct interpretation? I think so, though I find the wording a little awkward (and on rereading, my original wording was pretty bad). How about: "The second call to glob has replaced the Unicode character with '?', which means the actual filename cannot be recovered and the path is no longer valid." 
Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Fix default encodings on Windows
On 18Aug2016 0829, Chris Angelico wrote: The second call to glob doesn't have any Unicode characters at all, the way I see it - it's all bytes. Am I completely misunderstanding this? You're not the only one - I think this has been the most common misunderstanding. On Windows, the paths as stored in the filesystem are actually all text - more precisely, utf-16-le encoded bytes, represented as 16-bit character strings. Converting to an 8-bit character representation only exists for compatibility with code written for other platforms (either Linux, or much older versions of Windows). The operating system has one way to do the conversion to bytes, which Python currently uses, but since we control that transformation I'm proposing an alternative conversion that is more reliable than compatible (with Windows 3.1... shouldn't affect compatibility with code that properly handles multibyte encodings, which should include anything developed for Linux in the last decade or two). Does that help? I tried to keep the explanation short and focused :) Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
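The difference between the two conversions described above can be imitated with ordinary codecs on any platform. This is only an illustration: cp1252 with the "replace" error handler stands in for "whatever the active code page is", which is an assumption about one common Western configuration and not a description of what the Windows APIs do internally.

```python
name = "test\uAB00.txt"

# Stand-in for the code-page conversion: characters the code page cannot
# represent are replaced with '?', so the original name is unrecoverable.
lossy = name.encode("cp1252", errors="replace")
print(lossy)
print(lossy.decode("cp1252") == name)

# UTF-8 can represent every character that utf-16-le can, so it round-trips.
utf8 = name.encode("utf-8")
print(utf8)
print(utf8.decode("utf-8") == name)
```

The first conversion yields a byte string that no longer names the file, while the second can always be converted back to the original text.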
Re: [Python-ideas] Fix default encodings on Windows
Summary for python-dev. This is the email I'm proposing to take over to the main mailing list to get some actual decisions made. As I don't agree with some of the possible recommendations, I want to make sure that they're represented fairly. I also want to summarise the background leading to why we should consider making a change here at all, rather than simply leaving it alone. There's a chance this will all make its way into a PEP, depending on how controversial the core team thinks this is. Please let me know if you think I've misrepresented (or unfairly represented) any of the positions, or if you think I can simplify/clarify anything in here. Please don't treat this like a PEP review - it's just going to be an email to python-dev - but the more we can avoid having the discussions there we've already had here the better. Cheers, Steve --- Background == File system paths are almost universally represented as text in some encoding determined by the file system. In Python, we expose these paths via a number of interfaces, such as the os and io modules. Paths may be passed either direction across these interfaces, that is, from the filesystem to the application (for example, os.listdir()), or from the application to the filesystem (for example, os.unlink()). When paths are passed between the filesystem and the application, they are either passed through as a bytes blob or converted to/from str using sys.getfilesystemencoding(). The result of encoding a string with sys.getfilesystemencoding() is a blob of bytes in the native format for the default file system. On Windows, the native format for the filesystem is utf-16-le. The recommended platform APIs for accessing the filesystem all accept and return text encoded in this format. However, prior to Windows NT (and possibly further back), the native format was a configurable machine option and a separate set of APIs existed to accept this format. 
The option (the "active code page") and these APIs (the "*A functions") still exist in recent versions of Windows for backwards compatibility, though new functionality often only has a utf-16-le API (the "*W functions").

In Python, we recommend using str as the default format on Windows because it can correctly round-trip all the characters representable in utf-16-le. Our support for bytes explicitly uses the *A functions and hence the encoding for the bytes is "whatever the active code page is". Since the active code page cannot represent all Unicode characters, the conversion of a path into bytes can lose information without warning. As a demonstration of this:

    >>> open('test\uAB00.txt', 'wb').close()
    >>> import glob
    >>> glob.glob('test*')
    ['test\uab00.txt']
    >>> glob.glob(b'test*')
    [b'test?.txt']

The Unicode character in the second call to glob is missing information. You can observe the same results in os.listdir() or any function that matches its result type to the parameter type.

Why is this a problem?
======================

While the obvious and correct answer is to just use str everywhere, it remains well known that on Linux and MacOS it is perfectly okay to use bytes when taking values from the filesystem and passing them back. Doing so also avoids the cost of decoding and reencoding, such that (theoretically), code like below should be faster because of the `b'.'`:

    >>> for f in os.listdir(b'.'):
    ...     os.stat(f)
    ...

On Windows, if a filename exists that cannot be encoded with the active code page, you will receive an error from the above code. These errors are why in Python 3.3 the use of bytes paths on Windows was deprecated (listed in the What's New, but not clearly obvious in the documentation - more on this later). The above code produces multiple deprecation warnings in 3.3, 3.4 and 3.5 on Windows. However, we still keep seeing libraries use bytes paths, which can cause unexpected issues on Windows.
Given that the current approach of quietly recommending that library developers either write their code twice (once for bytes and once for str) or use str exclusively is not working, we should consider alternative mitigations.

Proposals
=========

There are two dimensions here - the fix and the timing. We can basically choose any fix and any timing. The main differences between the fixes are the balance between incorrect behaviour and backwards-incompatible behaviour. The main issue with respect to timing is whether or not we believe using bytes as paths on Windows was correctly deprecated in 3.3 and sufficiently advertised since to allow us to change the behaviour in 3.6.

Fixes
-----

Fix #1: Change sys.getfilesystemencoding() to utf-8 on Windows

Currently the default filesystem encoding is 'mbcs', which is a meta-encoder that uses the active code page. In reality, our implementation uses the *A APIs and we don't explicitly decode bytes in order to pass them to the filesystem. This allows the OS to quietly
Re: [Python-ideas] Fix default encodings on Windows
"You consistently ignore Makefiles, .ini, etc." Do people really do open('makefile', 'rb'), extract filenames and try to use them without ever decoding the file contents? I've honestly never seen that, and it certainly looks like the sort of thing Python 3 was intended to discourage. (As soon as you open(..., 'r') you're only affected by this change if you explicitly encode again with mbcs.) Top-posted from my Windows Phone -Original Message- From: "Stephen J. Turnbull" <turnbull.stephen...@u.tsukuba.ac.jp> Sent: 8/17/2016 19:43 To: "Steve Dower" <steve.do...@python.org> Cc: "Paul Moore" <p.f.mo...@gmail.com>; "Python-Ideas" <python-ideas@python.org> Subject: Re: [Python-ideas] Fix default encodings on Windows Steve Dower writes: > On 17Aug2016 0235, Stephen J. Turnbull wrote: > > So a full statement is, "How do we best represent Windows file > > system paths in bytes for interoperability with systems that > > natively represent paths in bytes?" ("Other systems" refers to > > both other platforms and existing programs on Windows.) > > That's incorrect, or at least possible to interpret correctly as > the wrong thing. The goal is "code compatibility with systems ...", > not interoperability. You're right, I stated that incorrectly. I don't have anything to add to your corrected version. > > In a properly set up POSIX locale[1], it Just Works by design, > > especially if you use UTF-8 as the preferred encoding. It's > > Windows developers and users who suffer, not those who wrote the > > code, nor their primary audience which uses POSIX platforms. > > You mentioned "locale", "preferred" and "encoding" in the same sentence, > so I hope you're not thinking of locale.getpreferredencoding()? Changing > that function is orthogonal to this discussion, You consistently ignore Makefiles, .ini, etc. It is *not* orthogonal, it is *the* reason for all opposition to your proposal or request that it be delayed. 
Filesystem names *are* text in part because they are *used as filenames in text*. > When Windows developers and users suffer, I see it as my responsibility > to reduce that suffering. Changing Python on Windows should do that > without affecting developers on Linux, even though the Right Way is to > change all the developers on Linux to use str for paths. I resent that. If I were a partisan Linux fanboy, I'd be cheering you on because I think your proposal is going to hurt an identifiable and large class of *Windows* users. I know about and fear this possiblity because they use a language I love (Japanese) and an encoding I hate but have achieved a state of peaceful coexistence with (Shift JIS). And on the general principle, *I* don't disagree. I mentioned earlier that I use only the str interfaces in my own code on Linux and Mac OS X, and that I suspect that there are no real efficiency implications to using str rather than bytes for those interfaces. On the other hand, the programming convenience of reading the occasional "text" filename (or other text, such as XML tags) out of a binary stream and passing it directly to filesystem APIs cannot be denied. I think that the kind of usage you propose (a fixed, universal codec, universally accepted; ie, 'utf-8') is the best way to handle that in the long run. But as Grandmaster Lasker said, "Before the end game, the gods have placed the middle game." (Lord Keynes isn't relevant here, Python will outlive all of us. :-) > I don't think there's any reasonable way to noisily deprecate these > functions within Python, but certainly the docs can be made > clearer. People who explicitly encode with > sys.getfilesystemencoding() should not get the deprecation message, > but we can't tell whether they got their bytes from the right > encoding or a RNG, so there's no way to discriminate. I agree with you within Python; the custom is for DeprecationWarnings to be silent by default. 
As for "making noise", how about announcing the deprecation as like the top headline for 3.6, postponing the actual change to 3.7, and in the meantime you and Nick do a keynote duet at PyCon? (Your partner could be Guido, too, but Nick has been the most articulate proponent for this particular aspect of "inclusion". I think having a representative from the POSIX world explaining the importance of this for "all of us" would greatly multiply the impact.) Perhaps, given my proposed timing, a discussion at the language summit in '17 and the keynote in '18 would be the best timing. (OT, political: I've been strongly influenced in this proposal by recently reading http://blog.aurynn.com/contempt-culture. There's not as much of it in Pytho
Re: [Python-ideas] Fix default encodings on Windows
On 17Aug2016 0901, Nick Coghlan wrote: On 17 August 2016 at 02:06, Chris Barker wrote: So the Solution is to either: (A) get everyone to use Unicode "properly", which will work on all platforms (but only on py3.5 and above?) or (B) kludge some *nix-compatible support for byte paths into Windows, that will work at least much of the time. It's clear (to me at least) that (A) is the "Right Thing", but real world experience has shown that it's unlikely to happen any time soon. Practicality beats Purity and all that -- this is a judgment call. Have I got that right? Yep, pretty much. Based on Stephen Turnbull's concerns, I wonder if we could make a whitelist of universal encodings that Python-on-Windows will use in preference to UTF-8 if they're configured as the current code page. If we accepted GB18030, GB2312, Shift-JIS, and ISO-2022-* as overrides, then problems would be significantly less likely. Another alternative would be to apply a similar solution as we do on Linux with regards to the "surrogateescape" error handler: there are some interfaces (like the standard streams) where we only enable that error handler specifically if the preferred encoding is reported as ASCII. In 2016, we're *very* skeptical about any properly configured system actually being ASCII-only (rather than that value showing up because the POSIX standards mandate it as the default), so we don't really believe the OS when it tells us that. The equivalent for Windows would be to disbelieve the configured code page only when it was reported as "mbcs" - for folks that had configured their system to use something other than the default, Python would believe them, just as we do on Linux. The problem here is that "mbcs" is not configurable - it's a meta-encoder that uses whatever is configured as the "language (system locale) to use when displaying text in programs that do not support Unicode" (quote from the dialog where administrators can configure this). So there's nothing to disbelieve here. 
And even on machines where the current code page is "reliable", UTF-16 is still the actual encoding, which means UTF-8 is still a better choice for representing the path as a blob of bytes. Currently we have inconsistent encoding between different Windows machines and could either remove that inconsistency completely or simply reduce it for (approx.) English speakers. I would rather an extreme here - either make it consistent regardless of user configuration, or make it so broken that nobody can use it at all. (And note that the correct way to support *some* other FS encodings would be to change the return value from sys.getfilesystemencoding(), which breaks people who currently ignore that just as badly as changing it to utf-8 would.) Cheers, Steve ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Fix default encodings on Windows
On 17Aug2016 0235, Stephen J. Turnbull wrote:
> Paul Moore writes:
> > On 16 August 2016 at 16:56, Steve Dower <steve.do...@python.org> wrote:
> > > This discussion is for the developers who insist on using bytes
> > > for paths within Python, and the question is, "how do we best
> > > represent UTF-16 encoded paths in bytes?"
>
> That's incomplete, AFAICS. (Paul makes this point somewhat
> differently.) We don't want to represent paths in bytes on Windows if
> we can avoid it. Nor does UTF-16 really enter into it (except for the
> technical issue of invalid surrogate pairs). So a full statement is,
> "How do we best represent Windows file system paths in bytes for
> interoperability with systems that natively represent paths in
> bytes?" ("Other systems" refers to both other platforms and existing
> programs on Windows.)

That's incorrect, or at least possible to interpret correctly as the
wrong thing. The goal is "code compatibility with systems ...", not
interoperability. Nothing about this will make it easier to take a path
from Windows and use it on Linux or vice versa, but it will make it
easier/more reliable to take code that uses paths on Linux and use it
on Windows.

> BTW, why "surrogate pairs"? Does Windows validate surrogates to ensure
> they come in pairs, but not necessarily in the right order (or perhaps
> sometimes they resolve to non-characters such as U+1)?

Eryk answered this better than I would have.

> Paul says:
> > People passing bytes to open() have in my view, already chosen not
> > to follow the standard advice of "decode incoming data at the
> > boundaries of your application". They may have good reasons for
> > that, but it's perfectly reasonable to expect them to take
> > responsibility for manually tracking the encoding of the resulting
> > bytes values flowing through their code.
>
> Abstractly true, but in practice there's no such need for those who
> made the choice! In a properly set up POSIX locale[1], it Just Works
> by design, especially if you use UTF-8 as the preferred encoding.
> It's Windows developers and users who suffer, not those who wrote the
> code, nor their primary audience which uses POSIX platforms.

You mentioned "locale", "preferred" and "encoding" in the same
sentence, so I hope you're not thinking of
locale.getpreferredencoding()? Changing that function is orthogonal to
this discussion, despite the fact that in most cases it returns the
same code page as what is going to be used by the file system functions
(which in most cases will also be used by the encoding returned from
sys.getfilesystemencoding()).

When Windows developers and users suffer, I see it as my responsibility
to reduce that suffering. Changing Python on Windows should do that
without affecting developers on Linux, even though the Right Way is to
change all the developers on Linux to use str for paths.

> > > If you see an alternative choice to those listed above, feel free
> > > to contribute it. Otherwise, can we focus the discussion on these
> > > (or any new) choices?
> >
> > Accept that we should have deprecated builtin open and the io module,
> > but didn't do so. Extend the existing deprecation of bytes paths on
> > Windows, to cover *all* APIs, not just the os module, but modify the
> > deprecation to be "use of the Windows CP_ACP code page (via the ...A
> > Win32 APIs) is deprecated and will be replaced with use of UTF-8 as
> > the implied encoding for all bytes paths on Windows starting in Python
> > 3.7". Document and publicise it much more prominently, as it is a
> > breaking change. Then leave it one release for people to prepare for
> > the change.
>
> I like this one! If my paranoid fears are realized, in practice it
> might have to wait two releases, but at least this announcement should
> get people who are at risk to speak up. If they don't, then you can
> just call me "Chicken Little" and go ahead!

I don't think there's any reasonable way to noisily deprecate these
functions within Python, but certainly the docs can be made clearer.
People who explicitly encode with sys.getfilesystemencoding() should
not get the deprecation message, but we can't tell whether they got
their bytes from the right encoding or a RNG, so there's no way to
discriminate.

I'm going to put together a summary post here (hopefully today) and get
those who have been contributing to basically sign off on it, then I'll
take it to python-dev. The possible outcomes I'll propose will
basically be "do we keep the status quo, undeprecate and change the
functionality, deprecate the deprecation and undeprecate/change in a
couple releases, or say that it wasn't a real deprecation so we can
deprecate and then change functionality in a couple releases".

Cheers,
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Fix default encodings on Windows
On 15Aug2016 0954, Random832 wrote:
> On Mon, Aug 15, 2016, at 12:35, Steve Dower wrote:
> > I'm still not sure we're talking about the same thing right now.
> >
> > For `open(path_as_bytes).read()`, are we talking about the way
> > path_as_bytes is passed to the file system? Or the codec used to
> > decide the returned string?
>
> We are talking about the way path_as_bytes is passed to the
> filesystem, and in particular what encoding path_as_bytes is
> *actually* in, when it was obtained from a file or other stream
> opened in binary mode.

Okay good, we are talking about the same thing.

Passing path_as_bytes in that location has been deprecated since 3.3,
so we are well within our rights (and probably overdue) to make it a
TypeError in 3.6. While it's obviously an invalid assumption, for the
purposes of changing the language we can assume that no existing code
is passing bytes into any functions where it has been deprecated.

As far as I'm concerned, there are currently no filesystem APIs on
Windows that accept paths as bytes.

Given that, I'm proposing adding support for using byte strings encoded
with UTF-8 in file system functions on Windows. This allows Python
users to omit switching code like:

    import os

    if os.name == 'nt':
        f = os.stat(os.listdir('.')[-1])
    else:
        f = os.stat(os.listdir(b'.')[-1])

Or simply using the bytes variant unconditionally because they heard it
was faster (sacrificing cross-platform correctness, since it may not
correctly round-trip on Windows).

My proposal is to remove all use of the *A APIs and only use the *W
APIs. That completely removes the (already deprecated) use of bytes as
paths. I then propose to change the (unused on Windows)
sys.getfilesystemencoding() to 'utf-8' and handle bytes being passed
into filesystem functions by transcoding into UTF-16 and calling the *W
APIs.

This completely removes the active codepage from the chain, allows
paths returned from the filesystem to correctly roundtrip via bytes in
Python, and allows those bytes paths to be manipulated at '\'
characters. (Frankly I don't mind what encoding we use, and I'd be
quite happy to force bytes paths to be UTF-16-LE encoded, which would
also round-trip invalid surrogate pairs. But that would prevent basic
manipulation which seems to be a higher priority.)

This does not allow you to take bytes from an arbitrary source and
assume that they are correctly encoded for the file system. Python 3.3,
3.4 and 3.5 have been warning that doing that is deprecated and the
path needs to be decoded to a known encoding first. At this stage, it's
time for us to either make byte paths an error, or to specify a
suitable encoding that can correctly round-trip paths.

If this does not answer the question, I'm going to need the question to
be explained more clearly for me.

Cheers,
Steve
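A minimal sketch of the transcoding described in the message above, assuming bytes paths are UTF-8 and leaving lone surrogates out of scope. The helper names (bytes_path_to_str, str_path_to_bytes, as_utf16_le) are hypothetical and exist only for illustration; the actual calls into the *W APIs are not shown. It also illustrates why UTF-8 bytes can be split at '\' while UTF-16-LE bytes cannot.

```python
# Illustrative sketch only: these helper names are hypothetical and are
# not part of CPython or of the proposal's text.


def bytes_path_to_str(path_bytes):
    """Decode a bytes path as UTF-8, the step the proposal would take
    before handing the path to a wide-character (*W) API."""
    return path_bytes.decode("utf-8")


def str_path_to_bytes(path_str):
    """Encode a str path to UTF-8 bytes (the return direction)."""
    return path_str.encode("utf-8")


def as_utf16_le(path_str):
    """Show the UTF-16-LE bytes that a *W API would ultimately receive."""
    return path_str.encode("utf-16-le")


original = "C:\\Users\\Zo\u00eb\\notes.txt"
as_bytes = str_path_to_bytes(original)

# The round trip through bytes never consults the active code page.
assert bytes_path_to_str(as_bytes) == original

# In UTF-8 every byte of a multi-byte sequence is >= 0x80, so splitting
# on the backslash byte (0x5C) never cuts a character in half.
parts = as_bytes.split(b"\\")
assert [p.decode("utf-8") for p in parts] == [
    "C:", "Users", "Zo\u00eb", "notes.txt"
]

# In UTF-16-LE the byte 0x5C can occur inside an unrelated character:
# U+015C is encoded as the two bytes 5C 01.
assert b"\\" in "\u015c".encode("utf-16-le")
```

The last assertion is the "basic manipulation" concern in concrete form: code that searches bytes paths for b"\\" would find false separators if the bytes were UTF-16-LE.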
Re: [Python-ideas] Fix default encodings on Windows
I was thinking we would end up using the console API for input but
stick with the standard handles for output, mostly to minimize the
amount of magic switching we have to do. But since we can just switch
the entire stream object in __std*__ once at startup if nothing is
redirected it probably isn't that much of a simplification.

I have some airport/aeroplane time today where I can experiment.

Top-posted from my Windows Phone

-----Original Message-----
From: "eryk sun"
Sent: 8/12/2016 5:40
To: "python-ideas"
Subject: Re: [Python-ideas] Fix default encodings on Windows

On Thu, Aug 11, 2016 at 9:07 AM, Paul Moore wrote:
> set codepage to UTF-8
> ...
> set codepage back
> spawn subprocess X, but don't wait for it
> set codepage to UTF-8
> ...
> ... At this point what codepage does Python see? What codepage does
> process X see? (Note that they are both sharing the same console).

The input and output codepages are global data in conhost.exe. They
aren't tracked for each attached process (unlike input history and
aliases). That's how chcp.com works in the first place. Otherwise its
calls to SetConsoleCP and SetConsoleOutputCP would be pointless.

But IMHO all talk of using codepage 65001 is a waste of time. I think
the trailing garbage output with this codepage in Windows 7 is
unacceptable. And getting EOF for non-ASCII input is a show stopper.
The problem occurs in conhost. All you get is the EOF result from
ReadFile/ReadConsoleA, so it can't be worked around. This kills the
REPL and raises EOFError for input().

ISTM the only people who think codepage 65001 actually works are those
using Windows 8+ who occasionally need to print non-OEM text and never
enter (or paste) anything but ASCII text.
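For anyone who wants to check what the console currently reports, here is a rough sketch that reads the input and output code pages through ctypes. It assumes the Win32 functions GetConsoleCP and GetConsoleOutputCP from kernel32; the wrapper name console_code_pages is made up for this example, and on non-Windows platforms it simply returns None.

```python
import sys


def console_code_pages():
    """Return (input_cp, output_cp) as reported by the console, or None
    when not running on Windows. Hypothetical helper for illustration."""
    if sys.platform != "win32":
        return None
    import ctypes  # ctypes.windll is only available on Windows

    kernel32 = ctypes.windll.kernel32
    return kernel32.GetConsoleCP(), kernel32.GetConsoleOutputCP()


print(console_code_pages())
```

Because the code pages are shared console state rather than per-process state, running this in two processes attached to the same console should report the same pair, which is the situation Paul's question above is asking about.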