Re: [Python-ideas] PEP 505: None-aware operators

2018-07-23 Thread Steve Dower

On 23Jul2018 1530, David Mertz wrote:
> Of course I don't mean that if implemented the semantics
> would be ambiguous... rather, the proper "swallowing" of different kinds
> of exceptions is not intuitively obvious, not even to you, Steve.  And
> if some decision was reached and documented, it would remain unclear to
> new (or even experienced) users of the feature.


As written in the PEP, no exceptions are ever swallowed. The translation 
into existing syntax is very clearly and unambiguously shown, and there 
is no exception handling at all. All the exception handling discussion 
in the PEP is under the heading of "rejected ideas".
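
To make the "no exception handling at all" point concrete, here is a rough hand expansion of an expression like `find_user(uid)?.name` in current Python. The names `User`, `find_user` and `user_name` are made up for this sketch and do not come from the PEP; the only thing the expansion adds is a single test against None.

```python
class User:
    """Hypothetical example class, not part of the PEP."""
    def __init__(self, name):
        self.name = name


def find_user(uid):
    """Hypothetical lookup: returns a User, returns None, or raises."""
    if uid < 0:
        raise ValueError("uid must be non-negative")
    return User("steve") if uid == 1 else None


def user_name(uid):
    # Hand expansion of `find_user(uid)?.name`:
    # one temporary, one None test, and no try/except anywhere.
    _temp = find_user(uid)
    return _temp.name if _temp is not None else None


found = user_name(1)      # ordinary attribute access
missing = user_name(2)    # None propagates to the result

try:
    user_name(-1)         # the ValueError is not swallowed
    error_propagated = False
except ValueError:
    error_propagated = True
```

Any exception raised while evaluating the left-hand side, or by the attribute access itself, reaches the caller exactly as it would without the operator.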


This email discussion includes some hypotheticals, since that's the 
point - I want thoughts and counter-proposals for semantics and 
discussion. I am 100% committed to an unambiguous PEP, and I believe the 
current proposal is most defensible. However, I don't want to have a 
"discussion" where I simply assume that I'm right, everyone else is 
wrong, and I refuse to discuss or consider alternatives.


So sorry for letting you all think that everything I write is actually 
the PEP. I had assumed that because my emails are not the PEP that 
people would realise that they are not the PEP. I'm going to duck out of 
the discussions here now, since they are not as productive as I'd hoped, 
and once we have a BDFL-replacement I'll reawaken it and see what is 
required at that point.


Cheers,
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] PEP 505: None-aware operators

2018-07-23 Thread Steve Dower

On 23Jul2018 1145, Antoine Pitrou wrote:
> On 23/07/2018 at 12:38, Steve Dower wrote:
>> General comment to everyone (not just Antoine): these arguments have
>> zero value to me. Feel free to keep making them, but I am uninterested.
>
> So you're uninterested in learning from past mistakes?
>
> You sound like a child who thinks their demands should be satisfied
> because they are the center of the world.


Sorry if it came across like that, it wasn't the intention. A bit of 
context on why you think it's a mistake would have helped, but if it's a 
purely subjective "I don't like the look of it" (as most similar 
arguments have turned out) then it doesn't add anything to enhancing the 
PEP. As a result, I do not see any reason to engage with this class of 
argument.


I hope you'll also notice that I've been making very few demands in this 
thread, and have indicated a number of times that I'm very open to 
adjusting the proposal in the face of honest and useful feedback.


Cheers,
Steve


Re: [Python-ideas] PEP 505: None-aware operators

2018-07-23 Thread Steve Dower

On 23Jul2018 1129, Antoine Pitrou wrote:
> On 23/07/2018 at 12:25, Steve Dower wrote:
>> On 23Jul2018 , Antoine Pitrou wrote:
>>> On Mon, 23 Jul 2018 10:51:31 +0100
>>> Steve Dower  wrote:
>>>> Which is the most important operator?
>>>> -------------------------------------
>>>>
>>>> Personally, I think '?.' is the most valuable.
>>>
>>> For me, it's the most contentious.  The fact that a single '?' added to
>>> a regular line of Python code can short-circuit execution silently is a
>>> net detriment to readability, IMHO.
>>
>> The only time it would short-circuit is when it would otherwise raise
>> AttributeError for trying to access an attribute from None, which is
>> also going to short-circuit.
>
> But AttributeError is going to bubble up as soon as it's raised, unless
> it's explicitly handled by an except block.  Simply returning None may
> have silent undesired effects (perhaps even security flaws).


You're right that the silent/undesired effects would be bad, which is 
why I'm not proposing silent changes to existing code (such as 
None.__getattr__ always returning None).


This is a substitute for explicitly checking None before the attribute 
access, or explicitly handling AttributeError for this case (and 
unintentionally handling others as well). And "?." may be very small 
compared to the extra 3+ lines required to do exactly the same thing, 
but it is still an explicit change that can be reviewed and evaluated as 
"is None a valid but not-useful value here? or is it an indication of 
another error and should we fail immediately instead".


Cheers,
Steve


> This whole thing reminds me of PHP's malicious "@" operator.


General comment to everyone (not just Antoine): these arguments have 
zero value to me. Feel free to keep making them, but I am uninterested. 
Perhaps whoever gets to decide on the PEP will be swayed by them?



Re: [Python-ideas] PEP 505: None-aware operators

2018-07-23 Thread Steve Dower

On 23Jul2018 , Antoine Pitrou wrote:
> On Mon, 23 Jul 2018 10:51:31 +0100
> Steve Dower  wrote:
>> Which is the most important operator?
>> -------------------------------------
>>
>> Personally, I think '?.' is the most valuable.
>
> For me, it's the most contentious.  The fact that a single '?' added to
> a regular line of Python code can short-circuit execution silently is a
> net detriment to readability, IMHO.


The only time it would short-circuit is when it would otherwise raise 
AttributeError for trying to access an attribute from None, which is 
also going to short-circuit. The difference is that it short-circuits 
the expression only, and not all statements up until the next except 
handler.


Cheers,
Steve


Re: [Python-ideas] PEP 505: None-aware operators

2018-07-23 Thread Steve Dower
Responding to a few more ideas that have come up here. Again, apologies 
for not directing them to the original authors, but I want to focus on 
the ideas that are leading towards a more informed decision, and not 
getting distracted by providing customised examples for people or 
getting into side debates.


I'm also going to try and update the PEP text today (or this week at 
least) to better clarify some of the questions that have come up (and 
fix that embarrassingly broken example :( )


Cheers,
Steve

False: '?.' should be surrounded by spaces
------------------------------------------

It's basically the same as '.'. Spell it 'a?.b', not 'a ?. b' (like 
'a.b' rather than 'a + b').


It's an enhancement to attribute access, not a new type of binary 
operator. The right-hand side cannot be evaluated in isolation.


In my opinion, it can also be read aloud the same as '.' as well (see 
the next point).


False: 'a?.b' is totally different from 'a.b'
---------------------------------------------

The expression 'a.b' either results in 'a.b' or AttributeError (assuming 
no descriptors are involved).


The expression 'a?.b' either results in 'a.b' or None (again, assuming 
no descriptors).


This isn't a crazy new idea, it really just short-circuits a specific 
error that can only be precisely avoided with "if None" checks (catching 
AttributeError is not the same).
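
A small sketch of why catching AttributeError is not the same as testing for None. The `Account` class and both helper functions are hypothetical; the property contains a deliberate bug so that the two approaches can be compared.

```python
class Account:
    """Hypothetical class whose property contains a bug."""
    @property
    def owner(self):
        # Bug: `_ownr` was never set, so this raises AttributeError.
        return self._ownr


def owner_via_except(account):
    """Catches AttributeError: covers the None case but also hides the bug."""
    try:
        return account.owner
    except AttributeError:
        return None


def owner_via_none_check(account):
    """Tests for None only, roughly what `account?.owner` is meant to do."""
    return account.owner if account is not None else None


hidden = owner_via_except(Account())     # the bug silently becomes None

try:
    owner_via_none_check(Account())
    bug_surfaced = False
except AttributeError:
    bug_surfaced = True                  # the bug is still visible
```

Both helpers return None when handed None; they only differ once something other than "the target is None" goes wrong.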


The trivial case is already a one-liner
---------------------------------------

That may be the case if you have a single character variable, but this 
proposal is not intended to try and further simplify already simple 
cases. It is for complex cases, particularly where you do not want to 
reevaluate the arguments or potentially leak temporary names into a 
module or class namespace.


(Brief aside: 'a if (a := expr) is not None else None' is going to be 
the best workaround. The suggested 'a := expr if a is not None else 
None' is incorrect because the condition is evaluated first and so has 
to contain the assignment.)
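
The evaluation-order point in the aside can be checked on Python 3.8 or later. This is only a sketch: `expensive` is a hypothetical stand-in for the expression, and an attribute access (`.upper()`) is added so that the None test has something to protect.

```python
calls = []


def expensive():
    """Hypothetical stand-in for an expression that should run only once."""
    calls.append("called")
    return "value"


# Working form: the condition is evaluated first, so it holds the assignment.
result = a.upper() if (a := expensive()) is not None else None

# Suggested form: the condition reads the name before anything is bound to it.
try:
    (never_bound := expensive() if never_bound is not None else None)
    wrong_form_failed = False
except NameError:
    wrong_form_failed = True
```

In the second form the call is never reached, because the condition fails first on the unbound name.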


False: ??= is a new form of assignment
--------------------------------------

No, it's just augmented assignment for a binary operator. "a ??= b" is 
identical to "a = a ?? b", just like "+=" and friends.


It has no relationship to assignment expressions. '??=' can only be used 
as a statement, and is not strictly necessary, but if we add a new 
binary operator '??' and it does not have an equivalent augmented 
assignment statement, people will justifiably wonder about the 
inconsistency.
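
For comparison, this is how the statement `filenames ??= load_default()` has to be spelled today. The function names are invented for the sketch; the point is only that the test is "is None", not truthiness.

```python
def load_default():
    """Hypothetical fallback provider."""
    return ["README.rst"]


def check(filenames=None):
    # Current spelling of `filenames ??= load_default()`.
    if filenames is None:
        filenames = load_default()
    return filenames


from_default = check()          # None is replaced
kept_empty = check([])          # an empty list is a value and is kept
kept_given = check(["a.py"])
```

Note that this differs from `if not filenames:`, which would also replace an explicitly passed empty list.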


The PEP author is unsure about how it works
-------------------------------------------

I wish this statement had come with some context, because the only thing 
I'm unsure about is what I'm supposed to be unsure about.


That said, I'm willing to make changes to the PEP based on the feedback 
and discussion. I haven't come into this with a "my way is 100% right 
and it will never change" mindset, so if this is a misinterpretation of 
my willingness to listen to feedback then I'm sorry I wasn't more clear. 
I *do* care about your opinions (when presented fairly and constructively).


Which is the most important operator?
-------------------------------------

Personally, I think '?.' is the most valuable. The value of '??' arises 
because (unless changing the semantics from None-aware to False-aware) 
it provides a way of setting the default that is consistent with how we 
got to the no-value value (e.g. `None?.a ?? b` and `""?.a ?? b` are 
different, whereas `None?.a or b` and `""?.a or b` are equivalent).
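
The distinction between a None test and a truthiness test can be seen in current Python with a small helper. `coalesce` is a hypothetical function written for this sketch; unlike the proposed `??` it evaluates its default eagerly, so it only approximates the operator.

```python
def coalesce(value, default):
    """Approximate `value ?? default` (the default is always evaluated)."""
    return value if value is not None else default


timeout = 0                            # a legitimate, but false, value

via_or = timeout or 30                 # truthiness test replaces the 0
via_coalesce = coalesce(timeout, 30)   # None test keeps the 0
```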


I'm borderline on ?[] right now. Honestly, I think it works best if it 
also silently handles LookupError (e.g. for traversing a loaded JSON 
dict), but then it's inconsistent with ?. which I think works best if it 
handles None but allows AttributeError. Either way, both have the 
ability to directly handle the exception. For example, (assuming e1, e2 
are expressions and not values):


v = e1?[e2]

Could be handled as this example (for None-aware):

_temp1 = (e1)
v = _temp1[e2] if _temp1 is not None else None

Or for silent exception handling of the lookup only:

_temp1 = (e1)
_temp2 = (e2)
try:
    v = _temp1[_temp2] if _temp1 is not None else None
except LookupError:
    v = None

Note that this second example is _not_ how most people protect against 
invalid lookups (most people use `.get` when it's available, or they 
accept that LookupErrors raised from e1 or e2 should also be silently 
handled). So there would be value in ?[] being able to more precisely 
handle the exception.


However, with ?. being available, and _most_ lookups being on dicts that 
have .get(), you can also traverse JSON values fairly easily like this:


d = json.load(f)
name = d.get('user')?.get('details')?.get('name') ?? ''

With ?[] doing the safe lookup as well, this could be:

d = json.load(f)
name = d?['user']?['details']?['name'] ?? ''
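
Without new syntax, the "safe lookup" variant can be approximated with a helper. `dig` is a hypothetical function invented for this sketch, not an existing API; it stops at None and treats LookupError from each individual lookup as "no value", which is the behaviour being weighed above for ?[].

```python
import json


def dig(obj, *keys, default=None):
    """Hypothetical helper: follow keys, stopping at None or a failed lookup."""
    for key in keys:
        if obj is None:
            return default
        try:
            obj = obj[key]
        except LookupError:     # KeyError and IndexError
            return default
    return obj if obj is not None else default


d = json.loads('{"user": {"details": null}}')
name = dig(d, "user", "details", "name", default="")
```

Only the lookup itself sits inside the try block, so errors raised while computing the target or the keys are not hidden.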

Now, my *least* favourite part of this is that (as 

Re: [Python-ideas] PEP 505: None-aware operators

2018-07-23 Thread Steve Dower

On 23Jul2018 0151, Steven D'Aprano wrote:
> What if there was a language
> supported, non-hackish way to officially delay evaluation of
> expressions until explicitly requested?


The current spelling for this is "lambda: delayed-expression" and the 
way to request the value is "()". :)


(I'm not even being that facetious here. People ask for delayed 
expressions all the time, and it's only 7 characters, provided the 
callee knows they're getting it, and the semantics are already well 
defined and likely match what you want.)
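
A minimal sketch of the lambda spelling, assuming the callee knows it is receiving a zero-argument callable. `costly` and `first_not_none` are hypothetical names used only here.

```python
evaluated = []


def costly():
    """Hypothetical expensive expression; records each evaluation."""
    evaluated.append("costly")
    return 42


def first_not_none(value, delayed_default):
    """The callee requests the delayed value with `()` only when needed."""
    return value if value is not None else delayed_default()


a = first_not_none(1, lambda: costly())      # default is never evaluated
b = first_not_none(None, lambda: costly())   # default is evaluated on request
```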


Cheers,
Steve


Re: [Python-ideas] Python docs page: In what ways is None special

2018-07-23 Thread Steve Dower

On 23Jul2018 1003, Jonathan Fine wrote:
> This arises out of PEP 505 - None-aware operators.
>
> I thought, a page on how None is special would be nice.
> I've not found such a page on the web. We do have
> ===
> https://docs.python.org/3/library/constants.html
> None
> The sole value of the type NoneType. None is
> frequently used to represent the absence of a
> value, as when default arguments are not passed
> to a function. Assignments to None are illegal
> and raise a SyntaxError.
> ===
>
> So decided to start writing such a page, perhaps to be
> added to the docs.  All code examples in Python3.4.


There's also 
https://docs.python.org/3/c-api/none.html?highlight=py_none#c.Py_None


"The Python None object, denoting lack of value. This object has no 
methods. It needs to be treated just like any other object with respect 
to reference counts."


I don't know that documenting the behaviours of None is that 
interesting (e.g. not displaying anything at the interactive prompt), 
though it'd be perfect for a blog and/or conference talk. But if there 
appear to be behaviours that are not consistent or cannot be easily 
inferred from the existing documentation, then we should think about why 
that is and how we could enhance the documentation to ensure it 
accurately describes what None is supposed to be.


That said, your examples are good :)

Cheers,
Steve



Re: [Python-ideas] PEP 505: None-aware operators

2018-07-20 Thread Steve Dower

On 20Jul2018 1119, Brendan Barnwell wrote:
> In this situation I lean toward "explicit is
> better than implicit" --- if you want to compare against None, you
> should do so explicitly --- and "special cases aren't special enough to
> break the rules" --- that is, None is not special enough to warrant the
> creation of multiple new operators solely to compare things against this
> specific value.


"The rules" declare that None is special - it's the one and only value 
that represents "no value". So is giving it special meaning here 
breaking the rules or following them? (See also the ~50% of the PEP 
dedicated to this subject, and also consider proposing a non-special 
result for "??? if has_no_value(value) else value" in the 'True' case.)


Cheers,
Steve


Re: [Python-ideas] PEP 505: None-aware operators

2018-07-20 Thread Steve Dower
Just for fun, I decided to go through some recently written code by some 
genuine Python experts (without their permission...) to see what changes would 
be worth taking. So I went to the sources of our github bots.

Honestly, I only found three places that were worth changing (though I'm now 
kind of leaning towards ?[] eating LookupError, since that seems much more 
useful when traversing the result of json.loads()...). I'm also not holding up 
the third one as the strongest example :)


From
https://github.com/python/miss-islington/blob/master/miss_islington/status_change.py:

async def check_status(event, gh, *args, **kwargs):
    if (
        event.data["commit"].get("committer")
        and event.data["commit"]["committer"]["login"] == "miss-islington"
    ):
        sha = event.data["sha"]
        await check_ci_status_and_approval(gh, sha, leave_comment=True)

After:

async def check_status(event, gh, *args, **kwargs):
    if event.data["commit"].get("committer")?["login"] == "miss-islington":
        sha = event.data["sha"]
        await check_ci_status_and_approval(gh, sha, leave_comment=True)


From https://github.com/python/bedevere/blob/master/bedevere/__main__.py:

try:
    print('GH requests remaining:', gh.rate_limit.remaining)
except AttributeError:
    pass

Assuming you want to continue hiding the message when no value is available:

if (remaining := gh.rate_limit?.remaining) is not None:
    print('GH requests remaining:', remaining)

Assuming you want the message printed anyway:

print(f'GH requests remaining: {gh.rate_limit?.remaining ?? "N/A"}')


From https://github.com/python/bedevere/blob/master/bedevere/news.py (this is
the one I'm including for completeness, not because it's the most compelling
example I've ever seen):

async def check_news(gh, pull_request, filenames=None):
    if not filenames:
        filenames = await util.filenames_for_PR(gh, pull_request)

After:

async def check_news(gh, pull_request, filenames=None):
    filenames ??= await util.filenames_for_PR(gh, pull_request)


On 19Jul2018 , Steven D'Aprano wrote:
> In other words, we ought to be comparing the expressiveness of
> 
>  process(spam ?? something)
> 
> versus:
> 
> process(something if spam is None else spam)

Agreed, though to make it a more favourable comparison I'd replace "spam" with 
"spam()?.eggs" and put it in a class/module definition where you don't want 
temporary names leaking ;)

Cheers,
Steve


Re: [Python-ideas] PEP 505: None-aware operators

2018-07-19 Thread Steve Dower
Thanks everyone for the feedback and discussion so far. I want to 
address some of the themes, so apologies for not quoting individuals and 
for doing this in one post instead of twenty.


--


* "It looks like line noise"

Thanks for the feedback. There's nothing constructive for me to take 
from this.


* "I've never needed this"

Also not very actionable, but as background I'll say that this was 
exactly my argument against adding them to C#. But my coding style has 
adapted to suit (for example, I'm more likely to use "null" as a default 
value and have a single function implementation than two 
mostly-duplicated overloads).


* "It makes it more complex"
* "It's harder to follow the flow"

Depends on your measure of complexity. For me, I prioritise "area under 
the indentation" as my preferred complexity metric (more lines*indents 
== more complex), as well as left-to-right reading of each line (more 
random access == more complex). By these measures, ?. significantly 
reduces the complexity over any of the current or future alternatives::


def f(a=None):
    name = 'default'
    if a is not None:
        user = a.get_user()
        if user is not None:
            name = user.name
    print(name)

def f(a=None):
    if a is not None:
        user = a.get_user()
        name = user.name if user is not None else 'default'
        print(name)
    else:
        print('default')

def f(a=None):
    user = a.get_user() if a is not None else None
    name = user.name if user is not None else 'default'
    print(name)

def f(a=None):
    print(user.name
          if (user := a.get_user() if a is not None else None) is not None
          else 'default')

def f(a=None):
    print(a?.get_user()?.name ?? 'default')

* "We have 'or', we don't need '??'"

Nearly agreed, but I think the tighter binding of ?? and its tighter test 
(for None rather than for any false value) make it valuable in place of 'or'.


For example, compare:

a ** b() or 2 # actual:   (a ** b()) or 2
a ** b() ?? 2 # proposed:  a ** (b() ?? 2)

In the first, the presence of 'or' implies that either b() or __pow__(a, 
b()) could return a non-True value. This is correct (it could return 0 
if a == 0). And the current precedence results in the result of __pow__ 
being used for the check.


In the second one, the presence of the '??' implies that either b() or 
__pow__(a, b()) could return None. The latter should never happen, and 
so the choices are to make the built-in types propagate Nones when 
passed None (uhh... no) or to make '??' bind to the closer part of the 
expression.


(If you don't think it's likely enough that a function could return 
[float, None], then assume 'a ** b?.c ?? 2' instead.)
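
The "actual" grouping for 'or' can be confirmed with the standard ast module. Only the existing operator is checked here, since '??' does not parse in any released Python.

```python
import ast

# Parse the expression and look at the outermost node.
tree = ast.parse("a ** b() or 2", mode="eval")
top = tree.body

# `or` binds more loosely than `**`, so the BoolOp is outermost and the
# whole power expression is its first operand: (a ** b()) or 2.
top_is_or = isinstance(top, ast.BoolOp) and isinstance(top.op, ast.Or)
left = top.values[0]
left_is_pow = isinstance(left, ast.BinOp) and isinstance(left.op, ast.Pow)

# Concrete case: a false result from `**` is replaced by the fallback.
zero_case = 0 ** 1 or 2
```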


* "We could have '||', we don't need '??'"

Perhaps, though this is basically just choosing the bikeshed colour. In 
the absence of a stronger argument, matching the equivalent operators in 
existing languages should win over reusing operators that do different 
things in those languages.


* "We could have 'else', we don't need '??'"

This is the "a else 'default'" rather than "a ?? 'default'" proposal, 
which I do like the look of, but I think it will simultaneously mess 
with operator precedence and also force me to search for the 'if' that 
each 'else' may or may not belong to. What we actually need to be 
comparing is "(a else 'default')" vs. "a ?? 'default'"::


x = a if b else c else d
x = a if (b else c) else d
x = a if b else (c else d)

* "It's not clear whether it's 'is not None' or 'hasattr' checks"

I'm totally sympathetic to this. Ultimately, like everything else, this 
is a concept that has to be taught/learned rather than known intrinsically.


The main reasons for not having 'a?.b' be directly equivalent to 
getattr(a, 'b', ???) is that you lose the easy ability to find typos, 
and we also already have the getattr() approach.
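
A short illustration of the typo argument, using a made-up `Config` class. The misspelling `timeuot` is deliberate.

```python
class Config:
    """Hypothetical object with one real attribute."""
    timeout = 10


cfg = Config()

# getattr() with a default hides the misspelling: no error, just None.
silent = getattr(cfg, "timeuot", None)


def strict_timeout(obj):
    """None test on the target only; the attribute lookup stays strict."""
    return obj.timeuot if obj is not None else None   # same typo


try:
    strict_timeout(cfg)
    typo_found = False
except AttributeError:
    typo_found = True
```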


(Aside: in this context, why should the result be 'None' if an attribute 
is missing? For None, the None value propagates (getattr(a, 'b', a)), 
while for falsies you could argue the same thing applies. But for a 
silently handled AttributeError? You still have to make the case that 
None is special here, just special as a return value vs. special as a test.)


* "The semantics of this example changed from getattr() with ?."

Yes, this was a poor example. On re-reading, all of the checks are 
indeed looking for optional attributes, rather than looking them up on 
optional targets. I'll find a better one (I've certainly seen and/or 
written code like this that was intended to avoid crashing on None, but 
I stopped my search of the stdlib too soon after finding this example).


* "Bitwise operators"

Uh... yeah. Have fun over there :)

* "Assumes the only falsie ever returned [in some context] is None"

I argue that it assumes the only falsie you want to replace with a 
different value is None. In many cases, I'd expect the None to be 
replaced with a falsie of the intended type:


x = maybe_get_int() ?? 0
y = maybe_get_list() ?? []

Particularly for the 

Re: [Python-ideas] PEP 505: None-aware operators

2018-07-18 Thread Steve Dower
Thanks! Bit of discussion below about precedence, but thanks for 
spotting the typos.


On 18Jul2018 1318, MRAB wrote:
> On 2018-07-18 18:43, Steve Dower wrote:
>> Grammar changes
>> ---------------
>>
>> The following rules of the Python grammar are updated to read::
>>
>>   augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
>>               '<<=' | '>>=' | '**=' | '//=' | '??=')
>>
>>   power: coalesce ['**' factor]
>>   coalesce: atom_expr ['??' factor]
>>   atom_expr: ['await'] atom trailer*
>>   trailer: ('(' [arglist] ')' |
>>             '[' subscriptlist ']' |
>>             '?[' subscriptlist ']' |
>>             '.' NAME |
>>             '?.' NAME)
>
> The precedence is higher than I expected. I think of it more like 'or'.
> What is its precedence in the other languages?


Yes, I expected this to be the contentious part. I may have to add a bit 
of discussion.


Mostly, I applied intuition rather than copying other languages on 
precedence (and if you could go through my non-git history, you'd see I 
tried four other places ;) ). The most "obvious" cases were these::


a ?? 1 + b()

b ** a() ?? 2

In the first case, both "(a ?? 1) + b()" and "a ?? (1 + b())" make 
sense, so it's really just my own personal preference that I think it 
looks like the first. If you flip the operands to get "b() + a ?? 1" 
then you end up with either "b() + (a ?? 1)" or "(b() + a) ?? 1", then 
it's more obvious that the latter doesn't make any sense (why would 
__add__ return None?), and so binding more tightly than "+" helps write 
sensible expressions with fewer parentheses.


Similarly, I feel like "b ** (a() ?? 2)" makes more sense than "(b ** 
a()) ?? 2", where for the latter we would have to assume a __pow__ 
implementation that returns None, or one that handles being passed None 
without raising a TypeError.


Contrasting this with "or", it is totally legitimate for arithmetic 
operators to return falsey values.


As I open the text file to correct the typos, I see this is what I tried 
to capture with:



Inserting the ``coalesce`` rule in this location ensures that expressions
resulting in ``None`` are naturally coalesced before they are used in
operations that would typically raise ``TypeError``.


Take (2 ** a.b) ?? 0. The result of __pow__ is rarely going to be None, 
unless we train all the builtin types to do so (which, incidentally, I 
am not proposing and have no intention of proposing), whereas something 
like "2 ** coord?.exponent" attempting to call "2.__pow__(None)" seems 
comparatively likely. (Unfortunately, nobody writes code like this yet 
:) So there aren't any real-life examples. Originally I didn't include 
"??" in the proposal, but it became obvious in the examples that the 
presence of None-propagating operators ?. and ?[] just cause more pain 
without having the None-terminating operator ?? as well.)
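
The TypeError scenario can be reproduced today. `Coord` and `power_of_two` are hypothetical names for this sketch; the function is a hand expansion of `2 ** (coord?.exponent ?? 0)`, which is how the proposed grammar would group `2 ** coord?.exponent ?? 0`.

```python
class Coord:
    """Hypothetical object whose exponent may be unset."""
    def __init__(self, exponent=None):
        self.exponent = exponent


def power_of_two(coord):
    # Coalesce first, then apply `**`.
    exponent = coord.exponent if coord is not None else None
    return 2 ** (exponent if exponent is not None else 0)


try:
    2 ** Coord().exponent          # None reaches `**` without coalescing
    raised_type_error = False
except TypeError:
    raised_type_error = True
```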



>> Inserting the ``coalesce`` rule in this location ensures that expressions
>> resulting in ``None`` are natuarlly coalesced before they are used in
>
> Typo "natuarlly".

Thanks.

>>   assert a == 'value'
>>   assert b == ''
>>   assert c == '0' and any(os.scandir('/'))
>
> Wouldn't the last assertion fail, because c == 0?

Correct, another typo.

Cheers,
Steve


[Python-ideas] PEP 505: None-aware operators

2018-07-18 Thread Steve Dower
Possibly this is exactly the wrong time to propose the next big syntax 
change, since we currently have nobody to declare on it, but since we're 
likely to argue for a while anyway it probably can't hurt (and maybe 
this will become the test PEP for whoever takes the reins?).


FWIW, Guido had previously indicated that he was generally favourable 
towards most of this proposal, provided we could figure out coherent 
semantics. Last time we tried, that didn't happen, so this time I've 
made the semantics much more precise, have implemented and verified 
them, and made much stronger statements about why we are proposing these.


Additional thanks to Mark Haase for writing most of the PEP. All the 
fair and balanced parts are his - all the overly strong opinions are mine.


Also thanks to Nick Coghlan for writing PEPs 531 and 532 last time we 
went through this - if you're unhappy with "None" being treated as a 
special kind of value, I recommend reading those before you start 
repeating them.


There is a formatted version of this PEP at 
https://www.python.org/dev/peps/pep-0505/


My current implementation is at 
https://github.com/zooba/cpython/tree/pep-505 (though I'm considering 
removing some of the new opcodes I added and just generating more 
complex code - in any case, let's get hung up on the proposal rather 
than the implementation :) )


Let the discussions begin!

---

PEP: 505
Title: None-aware operators
Version: $Revision$
Last-Modified: $Date$
Author: Mark E. Haase , Steve Dower 


Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Sep-2015
Python-Version: 3.8

Abstract
========

Several modern programming languages have so-called "``null``-coalescing" or
"``null``-aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP
(starting in version 7). These operators provide syntactic sugar for common
patterns involving null references.

* The "``null``-coalescing" operator is a binary operator that returns its left
  operand if it is not ``null``. Otherwise it returns its right operand.
* The "``null``-aware member access" operator accesses an instance member only
  if that instance is non-``null``. Otherwise it returns ``null``. (This is also
  called a "safe navigation" operator.)
* The "``null``-aware index access" operator accesses an element of a collection
  only if that collection is non-``null``. Otherwise it returns ``null``. (This
  is another type of "safe navigation" operator.)

This PEP proposes three ``None``-aware operators for Python, based on the
definitions and other languages' implementations of those above. Specifically:

* The "``None``-coalescing" binary operator ``??`` returns the left hand side
  if it evaluates to a value that is not ``None``, or else it evaluates and
  returns the right hand side. A coalescing ``??=`` augmented assignment
  operator is included.
* The "``None``-aware attribute access" operator ``?.`` evaluates the complete
  expression if the left hand side evaluates to a value that is not ``None``
* The "``None``-aware indexing" operator ``?[]`` evaluates the complete
  expression if the left hand side evaluates to a value that is not ``None``


Syntax and Semantics
====================

Specialness of ``None``
-----------------------

The ``None`` object denotes the lack of a value. For the purposes of these
operators, the lack of a value indicates that the remainder of the expression
also lacks a value and should not be evaluated.

A rejected proposal was to treat any value that evaluates to false in a
Boolean context as not having a value. However, the purpose of these operators
is to propagate the "lack of value" state, rather than the "false" state.

Some argue that this makes ``None`` special. We contend that ``None`` is
already special, and that using it as both the test and the result of these
operators does not change the existing semantics in any way.

See the `Rejected Ideas`_ section for discussion on the rejected approaches.

Grammar changes
---------------

The following rules of the Python grammar are updated to read::

    augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
                '<<=' | '>>=' | '**=' | '//=' | '??=')

    power: coalesce ['**' factor]
    coalesce: atom_expr ['??' factor]
    atom_expr: ['await'] atom trailer*
    trailer: ('(' [arglist] ')' |
              '[' subscriptlist ']' |
              '?[' subscriptlist ']' |
              '.' NAME |
              '?.' NAME)

Inserting the ``coalesce`` rule in this location ensures that expressions
resulting in ``None`` are naturally coalesced before they are used in
operations that would typically raise ``TypeError``. Like ``and`` and ``or``
the right-hand expression is not evaluated until the left-hand side is
determined to be ``None``

Re: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

2018-04-08 Thread Steve Dower
# Dict display
data = {
    key_a: local_a := 1,
    key_b: local_b := 2,
    key_c: local_c := 3,
}

Isn’t this a set display with local assignments and type annotations? :o)

(I’m -1 on all of these ideas, btw. None help readability for me, and I read 
much more code than I write.)

Top-posted from my Windows phone

From: Nick Coghlan
Sent: Sunday, April 8, 2018 6:27
To: Chris Angelico
Cc: python-ideas
Subject: Re: [Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

On 23 March 2018 at 20:01, Chris Angelico  wrote:
> Apologies for letting this languish; life has an annoying habit of
> getting in the way now and then.
>
> Feedback from the previous rounds has been incorporated. From here,
> the most important concern and question is: Is there any other syntax
> or related proposal that ought to be mentioned here? If this proposal
> is rejected, it should be rejected with a full set of alternatives.

I was writing a new stdlib test case today, and thinking about how I
might structure it differently in a PEP 572 world, and realised that a
situation the next version of the PEP should discuss is this one:

# Dict display
data = {
    key_a: 1,
    key_b: 2,
    key_c: 3,
}

# Set display with local name bindings
data = {
    local_a := 1,
    local_b := 2,
    local_c := 3,
}

# List display with local name bindings
data = [
    local_a := 1,
    local_b := 2,
    local_c := 3,
]

# Dict display
data = {
    key_a: local_a := 1,
    key_b: local_b := 2,
    key_c: local_c := 3,
}

# Dict display with local key name bindings
data = {
    local_a := key_a: 1,
    local_b := key_b: 2,
    local_c := key_c: 3,
}

I don't think this is bad (although the interaction with dicts is a
bit odd), and I don't think it counts as a rationale either, but I do
think the fact that it becomes possible should be noted as an outcome
arising from the "No sublocal scoping" semantics.

Cheers,
Nick.

P.S. The specific test case is one where I want to test the three
different ways of spelling "the current directory" in some sys.path
manipulation code (the empty string, os.curdir, and os.getcwd()), and
it occurred to me that a version of PEP 572 that omits the sublocal
scoping concept will allow inline naming of parts of data structures
as you define them.
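
For the test case described in the P.S., a minimal sketch of that inline naming, assuming the assignment-expression syntax that later shipped in Python 3.8. The variable names are illustrative only and are not taken from the actual stdlib test:

```python
import os

# Name each spelling of "the current directory" while building the list.
# Without sublocal scoping, the names stay bound in the enclosing scope.
candidates = [
    (empty := ""),
    (curdir := os.curdir),
    (cwd := os.getcwd()),
]

# The named parts can be reused after the display has been evaluated.
print(candidates[0] is empty, curdir, cwd == candidates[2])
```

The parentheses are not strictly needed inside a list display, but they keep the binding visually separate from the commas.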


-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP proposal -- Pathlib Module Should Contain All File Operations -- version 2

2018-03-23 Thread Steve Dower
I had a colleague complaining to me the other day about having to search 
multiple packages for the right function to move a file (implying: with the 
same semantics as drag-drop). 

If there isn’t a pathtools library on PyPI yet, this would certainly be 
valuable for newer developers. My view on Path is to either have everything on 
it or nothing on it (without removing what’s already there, of course), and 
since everything is so popular we should at least put everything in the one 
place.

Top-posted from my Windows phone

From: Mike Miller
Sent: Monday, March 19, 2018 10:51
To: python-ideas@python.org
Subject: Re: [Python-ideas] New PEP proposal -- Pathlib Module Should Contain 
All File Operations -- version 2


On 2018-03-18 10:55, Paul Moore wrote:
>> Should Path() have methods to access all file operations?
> 
> No, (Counterexample, having a Path operation to set Windows ACLs for a path).

Agreed, not a big fan of everything filesystem-related in pathlib, simply 
because it doesn't read well.  Having them scattered isn't a great experience 
either.

Perhaps it would be better to have a filesystem package instead, maybe named 
"fs" that included all this stuff in one easy to use location.  File stuff from 
os, path stuff from os.path, pathlib, utils like stat, and shutil etc?


Re: [Python-ideas] Possible Enhancement to py Launcher - set default

2018-02-07 Thread Steve Dower
Checking the Version (!=SysVersion) property should be enough (and perhaps we 
need to set it properly on install). The launcher currently only works with 
PythonCore entries anyway, so no need to worry about other distros.

PEP 514 allows for other keys to be added as well (it specifies a minimum set), 
so we could just set one for this. “NoDefaultLaunch” or similar.

Finally, if someone created a script for setting py.ini, it could probably be 
included in the Tools directory. Wouldn’t be run on install or get a start menu 
shortcut though, just to set expectations right.

Top-posted from my Windows phone

From: Paul Moore
Sent: Wednesday, February 7, 2018 7:37
To: Alex Walters
Cc: Python-Ideas
Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set default

I don't think so. As an example, what registry keys would Anaconda
write to say that Release 5.2.1.7 is a pre-release version? Or would
the py launcher have to parse the version looking for rc/a/b/... tags?
And distributions would have to agree on how they record pre-release
version numbers?

Paul

On 7 February 2018 at 14:57, Alex Walters  wrote:
>
>
>> -Original Message-
>> From: Paul Moore [mailto:p.f.mo...@gmail.com]
>> Sent: Wednesday, February 7, 2018 4:15 AM
>> To: Alex Walters 
>> Cc: Steve Barnes; Python-Ideas
>> Subject: Re: [Python-ideas] Possible Enhancement to py Launcher - set
>> default
>>
> ...
>>
>> IMO the biggest technical issue with this is that as far as I can see
>> PEP 514 doesn't specify a way to determine if a given Python is a
>> pre-release version. If we do want to implement this (I'm +0 on it,
>> personally) then I think the starting point would need to be an update
>> to PEP 514 to include that data.
>>
>> Paul
>
> Looking at pep 514, it looks like sys.winver is what would have to change to 
> support reporting the release status to the registry.  I don't think 514 has 
> to change at all if sys.winver changes.  Is that a correct interpretation?
>


Re: [Python-ideas] Format mini-language for lakh and crore

2018-01-29 Thread Steve Dower
Someone would have to check, but presumably the CRT on Windows is converting 
the natively thread-local locale into a process-wide locale for POSIX 
compatibility, which means it can probably be easily bypassed without having to 
use specific overloads.

Top-posted from my Windows phone

From: Nathaniel Smith
Sent: Monday, January 29, 2018 11:29
To: Eric V. Smith
Cc: python-ideas
Subject: Re: [Python-ideas] Format mini-language for lakh and crore

On Sun, Jan 28, 2018 at 5:46 AM, Eric V. Smith  wrote:
> If I recall correctly, we discussed this at the time, and the problem with
> locale is that it's not thread safe. I agree that if it were, it would be
> nice to be able to use it, either with 'n', or in some other mode just for
> grouping.
>
> The underlying C setlocale()/localeconv() just isn't very friendly to this
> use case.

POSIX.1-2008 added thread-local locales (say that 3x fast); see
uselocale(3). This appears to be supported on Linux (since glibc 2.3,
which is older than all supported enterprise distros), MacOS, and the
BSDs, but not Windows. OTOH Windows, MacOS, and the BSDs all seem to
provide the non-standard sprintf_l, which takes an explicit locale to
use.

So it looks like all mainstream OSes actually make it possible to use
a specific locale to do arbitrary formatting in a thread-safe way.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] Windows Best Fit Encodings

2018-01-19 Thread Steve Dower

On 20Jan2018 0518, M.-A. Lemburg wrote:

do you know of a definite resource for Windows code pages
on MSDN or another official MS website ?


I don't know of anything sorry, and my quick search didn't turn up 
anything public. But I can at least confirm that the internal table for 
cp1252 has the same undefined characters as on unicode.org, so 
presumably if MultiByteToWideChar is mapping those to "best fit" 
characters it's only because the flag has been passed. As far as I can 
tell, Microsoft has not been secretly redefining any encodings.


Cheers,
Steve


Re: [Python-ideas] Support WHATWG versions of legacy encodings

2018-01-11 Thread Steve Dower

On 12Jan2018 0342, Random832 wrote:

On Thu, Jan 11, 2018, at 04:55, Serhiy Storchaka wrote:

The way of solving this issue in Python is using an error handler. The
"surrogateescape" error handler is specially designed for lossless
reversible decoding. It maps every unassigned byte in the range
0x80-0xff to a single character in the range U+dc80-U+dcff. This allows
you to distinguish correctly decoded characters from the escaped bytes,
perform character by character processing of the decoded text, and
encode the result back with the same encoding.
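
The round trip described above can be sketched as follows. This assumes Python's own cp1252 codec, in which byte 0x81 is unassigned; the sample bytes are made up for illustration:

```python
raw = b"caf\x81"  # 0x81 is unassigned in Python's cp1252 codec

# Decoding with surrogateescape never fails on unassigned bytes.
text = raw.decode("cp1252", errors="surrogateescape")

# The undecodable byte shows up as a lone surrogate in U+DC80-U+DCFF,
# so it can be told apart from correctly decoded characters.
escaped = [ch for ch in text if "\udc80" <= ch <= "\udcff"]

# Encoding with the same codec and handler restores the original bytes.
round_tripped = text.encode("cp1252", errors="surrogateescape")
```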


Maybe we need a new error handler that maps unassigned bytes in the range 0x80-0x9f to a 
single character in the range U+0080-U+009F. Do any of the encodings being discussed have 
behavior other than the "normal" version of the encoding plus what I just 
described?


+1 on this being an error handler (if possible). I suspect the semantics 
will be more complex than suggested above, but as this seems to be about 
handling normally un[en/de]codable characters, using an error handler to 
return something more sensible best represents what is going on. Call it 
something like 'web' or 'relaxed' or 'whatwg'.
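
A rough sketch of what such a handler could look like with the existing codecs.register_error API. The handler name "c1-passthrough" and the function are invented for this illustration, it only covers the decoding direction, and it is not a full implementation of the WHATWG rules:

```python
import codecs


def c1_passthrough(exc):
    """Map undecodable bytes 0x80-0x9F to U+0080-U+009F; re-raise otherwise."""
    if isinstance(exc, UnicodeDecodeError):
        bad = exc.object[exc.start:exc.end]
        if all(0x80 <= b <= 0x9F for b in bad):
            # Return the replacement text and the position to resume from.
            return "".join(chr(b) for b in bad), exc.end
    raise exc


# "c1-passthrough" is a made-up handler name for this sketch.
codecs.register_error("c1-passthrough", c1_passthrough)

# 0x81 is unassigned in cp1252 and goes through the handler;
# 0x93 is assigned and is decoded by the codec itself.
decoded = b"\x81abc\x93".decode("cp1252", errors="c1-passthrough")
```

The exception object passed to the handler carries the encoding name, the input, and the failing range, which is the context a handler has to work with today.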


I don't know if error handlers have enough context for this though. If 
not, we should ensure they can have it. I'd much rather explain one new 
error handler to most people (and a more complex API for implementing 
them to the few people who do it) than explain a whole suite of new 
encodings.


Cheers,
Steve



Re: [Python-ideas] Make functions, methods and descriptor types living in the types module

2018-01-11 Thread Steve Dower
I certainly have code that joins __module__ with __name__ to create a 
fully-qualified name (with special handling for those builtins that are 
not in builtins), and IIUC __qualname__ doesn't normally include the 
module name either (it's intended for nested types/functions).


Can we make it visible when you import the builtins module, but not in 
the builtins namespace?
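
To make the problem concrete, here is a small sketch of that kind of name joining. The helper name is hypothetical, and the values in the comments reflect CPython's behaviour as described in the quoted message below:

```python
import builtins
import types


def fully_qualified_name(cls):
    """Join __module__ and __name__ into a dotted name."""
    return f"{cls.__module__}.{cls.__name__}"


def f():
    pass


name = fully_qualified_name(type(f))        # 'builtins.function'
resolvable = hasattr(builtins, "function")  # the dotted name does not resolve
same_type = types.FunctionType is type(f)   # the types module exposes the class
```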


Cheers,
Steve

On 12Jan2018 0941, Victor Stinner wrote:

I like the idea of having a fully qualified name that "works" (can be
resolved).

I don't think that repr() should change, right?

Can this change break the backward compatibility somehow?

Victor

On 11 Jan 2018 at 21:00, "Serhiy Storchaka" wrote:

Currently the classes of functions (implemented in Python and
builtin), methods, and different type of descriptors, generators,
etc have the __module__ attribute equal to "builtins"  and the name
that can't be used for accessing the class.

>>> def f(): pass
...
>>> type(f)
<class 'function'>
>>> type(f).__module__
'builtins'
>>> type(f).__name__
'function'
>>> type(f).__qualname__
'function'
>>> import builtins
>>> builtins.function
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'builtins' has no attribute 'function'

But most of these classes (if not all) are exposed in the types module.

I suggest renaming them. Make the __module__ attribute equal to
"types" and the __name__ and the __qualname__ attributes equal to
the name used for accessing the class in the types module.

This would allow to pickle references to these types. Currently this
isn't possible.

>>> pickle.dumps(types.FunctionType)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class 'function'>: attribute
lookup function on builtins failed

And this will help to implement the pickle support of dynamic
functions etc. Currently the third-party library that implements
this needs to use a special purposed factory function (not
compatible with other similar libraries) since types.FunctionType
isn't pickleable.




Re: [Python-ideas] Looking for input to help with the pip situation

2017-11-15 Thread Steve Dower

On 15Nov2017 0617, Nick Coghlan wrote:
On 15 November 2017 at 22:46, Michel Desmoulin 
<desmoulinmic...@gmail.com> wrote:

Should I do a PEP with a summary of all the stuff we discussed ?

I think a Windows-specific PEP covering adding PATH updates back to the 
default installer behaviour, and adding pythonX and pythonX.Y commands 
would be useful (and Guido would presumably delegate resolving that to 
Steve Dower as the Windows installer maintainer).


If you write such a PEP, please also research and write up the issues 
with modifying PATH on Windows (they're largely scattered throughout 
bugs.p.o and earlier discussions on python-dev).


Once you realise the tradeoff involved in modifying these global 
settings, you'll either come around to my point of view or be 
volunteering to take *all* the support questions when they come in :)


The one thing I'd ask is that any such PEP *not* advocate for promoting 
these variants as the preferred way of invoking Python on Windows - 
rather, they should be positioned as a way of making online instructions 
written for Linux more likely to "just work" for folks on Windows 
(similar to the utf-8 encoding changes in 
https://www.python.org/dev/peps/pep-0529/)


Instead, the focus should be on ensuring the "python -m pip install" and 
"pip install" both work after clicking through the installer without 
changing any settings, and devising a troubleshooting guide to help 
folks that are familiar with computers and Python, but perhaps not with 
Windows, guide folks to a properly working environment.


My preferred solution for this is to rename "py.exe" to "python.exe" (or 
rather, make a copy of it with the new name), and extend (or more 
likely, rewrite) the launcher such that:


* if argv[0] == "py.exe", use PEP 514 company/tag resolution to find and 
launch Python based on first command line argument
* if argv[0] == "python<tag>.exe", find the matching 
PythonCore/<tag> install (where tag may be a partial match - e.g. 
"python3.exe" finds the latest PythonCore/3.x)
* else, if argv[0] == "<module><tag>.exe", find the matching 
PythonCore/<tag> install and launch "-m <module>"
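
The three rules above can be sketched as a small name parser. Everything here is hypothetical: the function name, the returned tuples, and the regular expression are illustration only; a real launcher would be native code, would look up the PEP 514 registry keys, and would need a stricter rule for module names containing digits or underscores:

```python
import ntpath
import re

# Split "<stem><tag>.exe" where <tag> is an optional X or X.Y version.
_NAME = re.compile(r"^(?P<stem>[a-z]+?)(?P<tag>\d+(?:\.\d+)?)?\.exe$", re.IGNORECASE)


def plan_launch(argv0):
    """Return (action, tag, module) for the launcher's own file name."""
    match = _NAME.match(ntpath.basename(argv0))
    if match is None:
        raise ValueError(f"unrecognised launcher name: {argv0!r}")
    stem = match["stem"].lower()
    tag = match["tag"] or ""
    if stem == "py":
        # Rule 1: company/tag comes from the first command line argument.
        return ("resolve-from-args", None, None)
    if stem == "python":
        # Rule 2: find PythonCore/<tag>; an empty or partial tag means "latest match".
        return ("run-interpreter", tag, None)
    # Rule 3: find PythonCore/<tag> and run it with "-m <module>".
    return ("run-module", tag, stem)


for name in ("py.exe", "python3.6.exe", "pip3.exe"):
    print(name, "->", plan_launch(name))
```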


With the launcher behaving like this, we can make as many hard links as 
we want in its install directory (it only gets installed once, so only 
needs one PATH entry, and this is C:\Windows for admin installs):

* python.exe
* python2.exe
* python3.exe
* python3.6.exe
* pip.exe
* pip2.exe
* pip3.exe

As well as allowing e.g. "py.exe -anaconda36-64 ..." to reliably locate 
and run non-Python.org installs.


It needs to be fully specced out, obviously, and we may want to move the 
all-users install to its own directory to reduce clutter, but part of 
the reason behind PEP 514 was to enable this sort of launcher. It could 
even extend to "you don't have this version right now, want to download 
and install it?"


And finally it should be fairly obvious that this doesn't have to be a 
core Python tool. It has no reliance on anything in core (that isn't 
already specified in a PEP) and could be written totally independently. 
I've tried (weakly) to get work time allocated to this in the past, and 
if it's genuinely not going to get done unless I do it then I'll try again.


Cheers,
Steve


Re: [Python-ideas] PEP draft: context variables

2017-10-13 Thread Steve Dower

On 13Oct2017 1132, Yury Selivanov wrote:

On Fri, Oct 13, 2017 at 1:45 PM, Ethan Furman <et...@stoneleaf.us> wrote:

On 10/13/2017 09:48 AM, Steve Dower wrote:


On 13Oct2017 0941, Yury Selivanov wrote:




Actually, capturing context at the moment of coroutine creation (in
PEP 550 v1 semantics) will not work at all.  Async context managers
will break.

    class AC:
        async def __aenter__(self):
            pass

^ If the context is captured when coroutines are instantiated,
__aenter__ won't be able to set context variables and thus affect the
code it wraps.  That's why coroutines shouldn't capture context when
created, nor should they isolate context.  It's a job of async Task.



Then make __aenter__/__aexit__ when called by "async with" an exception to
the normal semantics?

It seems simpler to have one specially named and specially called function
be special, rather than make the semantics
more complicated for all functions.




It's not possible to special case __aenter__ and __aexit__ reliably
(supporting wrappers, decorators, and possible side effects).


Why not? Can you not add a decorator that sets a flag on the code object 
that means "do not create a new context when called", and then it 
doesn't matter where the call comes from - these functions will always 
read and write to the caller's context. That seems generally useful 
anyway, and then you just say that __aenter__ and __aexit__ are special 
and always have that flag set.
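
A toy model of that flag idea, written against the contextvars module that later shipped in Python 3.7. This is only a sketch: the decorator, the attribute name, and the call() helper are invented here, the flag lives on the function object rather than the code object, and the "new context per call" rule is simulated by call() rather than being how Python actually behaves:

```python
import contextvars

var = contextvars.ContextVar("var", default="unset")


def shares_caller_context(func):
    """Hypothetical marker meaning "do not create a new context when called"."""
    func._shares_caller_context = True
    return func


def call(func, *args):
    """Simulated call rule: isolate the callee unless it carries the flag."""
    if getattr(func, "_shares_caller_context", False):
        return func(*args)
    return contextvars.copy_context().run(func, *args)


def isolated_set():
    var.set("isolated")   # lands in a copy and is discarded


@shares_caller_context
def shared_set():
    var.set("shared")     # lands in the caller's context


call(isolated_set)
after_isolated = var.get()
call(shared_set)
after_shared = var.get()
```

Under this model it does not matter where the call comes from: a flagged function always reads and writes the caller's context.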



+1.  I think that would make it much more usable by those of us who are not
experts.


I still don't understand what Steve means by "more usable", to be honest.


I don't know that I said "more usable", but it would certainly be easier 
to explain. The Zen has something to say about that...


Cheers,
Steve


Re: [Python-ideas] PEP draft: context variables

2017-10-13 Thread Steve Dower

On 13Oct2017 0941, Yury Selivanov wrote:

On Fri, Oct 13, 2017 at 3:25 AM, Nick Coghlan  wrote:
[..]

However, considering that coroutines are almost always instantiated at the
point where they're awaited, I do concede that creation time context capture
would likely also work out OK for the coroutine case, which would leave
contextlib.contextmanager as the only special case (and it would turn off
both creation-time context capture *and* context isolation).


Actually, capturing context at the moment of coroutine creation (in
PEP 550 v1 semantics) will not work at all.  Async context managers
will break.

    class AC:
        async def __aenter__(self):
            pass

^ If the context is captured when coroutines are instantiated,
__aenter__ won't be able to set context variables and thus affect the
code it wraps.  That's why coroutines shouldn't capture context when
created, nor should they isolate context.  It's a job of async Task.


Then make __aenter__/__aexit__ when called by "async with" an exception 
to the normal semantics?


It seems simpler to have one specially named and specially called 
function be special, rather than make the semantics more complicated for 
all functions.


Cheers,
Steve


Re: [Python-ideas] PEP draft: context variables

2017-10-11 Thread Steve Dower

On 11Oct2017 0458, Koos Zevenhoven wrote:

Exactly. You did say it less politely than I did, but this is exactly 
how I thought about it. And I'm not sure people got it the first time.


Yes, perhaps a little harsh. However, if I released a refactoring tool 
that moved function calls that far, people would file bugs against it 
for breaking their code (and in my experience of people filing bugs 
against tools that break their code, they can also be a little harsh).



I want PEP 555 to be how things *should be*, not how things are.


Agreed. Start with the ideal target and backpedal when a sufficient case 
has been made to justify it. That's how Yury's PEP has travelled, but I 
disagree that this example is a compelling case for the amount of 
bending that is being done.



New users of this functionality very likely won’t assume that TLS is
the semantic equivalent, especially when all the examples and naming
make it sound like context managers are more related. (I predict
people will expect this to behave more like unstated/implicit
function arguments and be captured at the same time as other
arguments are, but can’t really back that up except with gut-feel.
It's certainly a feature that I want for myself more than I want
another spelling for TLS…)


I assume you like my decision to rename the concept to "context 
arguments" :). And indeed, new use cases would be more interesting than 
existing ones. Surely we don't want new use cases to copy the semantics 
from the old ones which currently have issues (because they were 
originally designed to work with traditional function and method calls, 
and using then-available techniques).


I don't really care about names, as long as it's easy to use them to 
research the underlying concept or intended functionality. And I'm not 
particularly supportive of this concept as a whole anyway - EIBTI and all.


But since it does address a fairly significant shortcoming in existing 
code, we're going to end up with something. If it's a new runtime 
feature then I'd like it to be an easy concept to grasp with clever 
hacks for the compatibility cases (and I do believe there are clever 
hacks available for getting "inject into my deferred function call" 
semantics), rather than the whole thing being a complicated edge-case.


Cheers,
Steve


Re: [Python-ideas] PEP draft: context variables

2017-10-10 Thread Steve Dower
Nick: “I like Yury's example for this, which is that the following two examples 
are currently semantically equivalent, and we want to preserve that equivalence:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass”

I’m following this discussion from a distance, but cared enough about this 
point to chime in without even reading what comes later in the thread. 
(Hopefully it’s not twenty people making the same point…)

I HATE this example! Looking solely at the code we can see, you are refactoring 
a function call from inside an *explicit* context manager to outside of it, and 
assuming the behavior will not change. There’s *absolutely no* logical or 
semantic reason that these should be equivalent, especially given the obvious 
alternative of leaving the call within the explicit context. Even moving the 
function call before the setattr can’t be assumed to not change its behavior – 
how is moving it outside a with block ever supposed to be safe?

I appreciate the desire to be able to take currently working code using one 
construct and have it continue working with a different construct, but the 
burden should be on that library and not the runtime. By that I mean that the 
parts of decimal that set and read the context should do the extra work to 
maintain compatibility (e.g. through a globally mutable structure using context 
variables as a slightly more fine-grained key than thread ID) rather than 
forcing an otherwise straightforward core runtime feature to jump through hoops 
to accommodate it.

New users of this functionality very likely won’t assume that TLS is the 
semantic equivalent, especially when all the examples and naming make it sound 
like context managers are more related. (I predict people will expect this to 
behave more like unstated/implicit function arguments and be captured at the 
same time as other arguments are, but can’t really back that up except with 
gut-feel. It's certainly a feature that I want for myself more than I want 
another spelling for TLS…)

Top-posted from my Windows phone

From: Nick Coghlan
Sent: Tuesday, October 10, 2017 5:35
To: Guido van Rossum
Cc: Python-Ideas
Subject: Re: [Python-ideas] PEP draft: context variables

On 10 October 2017 at 01:24, Guido van Rossum  wrote:
On Sun, Oct 8, 2017 at 11:46 PM, Nick Coghlan  wrote:
On 8 October 2017 at 08:40, Koos Zevenhoven  wrote:
I do remember Yury mentioning that the first draft of PEP 550 captured 
something when the generator function was called. I think I started reading the 
discussions after that had already been removed, so I don't know exactly what 
it was. But I doubt that it was *exactly* the above, because PEP 550 uses set 
and get operations instead of "assignment contexts" like PEP 555 (this one) 
does.

We didn't forget it, we just don't think it's very useful.


I'm not sure I agree on the usefulness. Certainly a lot of the complexity of 
PEP 550 exists just to cater to Nathaniel's desire to influence what a 
generator sees via the context of the send()/next() call. I'm still not sure 
that's worth it. In 550 v1 there's no need for chained lookups.

The compatibility concern is that we want developers of existing libraries to 
be able to transparently switch from using thread local storage to context 
local storage, and the way thread locals interact with generators means that 
decimal (et al) currently use the thread local state at the time when next() is 
called, *not* when the generator is created.

I like Yury's example for this, which is that the following two examples are 
currently semantically equivalent, and we want to preserve that equivalence:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass

The easiest way to maintain that equivalence is to say that even though 
preventing state changes leaking *out* of generators is considered a desirable 
change, we see preventing them leaking *in* as a gratuitous backwards 
compatibility break.

This does mean that *neither* form is semantically equivalent to eager 
extraction of the generator values before the decimal context is changed, but 
that's the status quo, and we don't have a compelling justification for 
changing it.
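
A runnable sketch of the three cases discussed above. The body of gen() is an assumption, since the original examples never show it, and the comparison relies on the default decimal precision not having been changed elsewhere:

```python
import decimal


def gen():
    # The division happens on each next() call, not when gen() is called.
    for _ in range(2):
        yield decimal.Decimal(1) / decimal.Decimal(7)


# Generator created inside the with block.
with decimal.localcontext() as ctx:
    ctx.prec = 30
    created_inside = list(gen())

# Generator created outside, iterated inside.
g = gen()
with decimal.localcontext() as ctx:
    ctx.prec = 30
    created_outside = list(g)

# Eager extraction before the context is changed.
eager = list(gen())

print(created_inside == created_outside, created_inside == eager)
```

The first two lists match because the precision in effect at iteration time is what counts; the eager list is computed under whatever precision was active before the with block.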

If folks subsequently decide that they *do* want "capture on creation" or 
"capture on first iteration" semantics for their generators, those are easy 
enough to add as wrappers on top of the initial thread-local-compatible base by 
using the same building blocks as are being added to help event loops manage 
context snapshots for coroutine execution.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Add pathlib.Path.write_json and pathlib.Path.read_json

2017-03-27 Thread Steve Dower
It was enough of a benefit for text (and I never forget the argument order for 
writing text to a file, unlike json.dump(file_or_data?, data_or_file?) )

+1
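
For reference, a sketch of what the proposed methods would roughly amount to, written as free functions on top of the existing write_text/read_text. The function names and the UTF-8 default are assumptions for this example, not part of any accepted API:

```python
import json
import tempfile
from pathlib import Path


def write_json(path, data, **kwargs):
    """Serialise data and write it to path in one step."""
    Path(path).write_text(json.dumps(data, **kwargs), encoding="utf-8")


def read_json(path, **kwargs):
    """Read path and parse its contents as JSON."""
    return json.loads(Path(path).read_text(encoding="utf-8"), **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "settings.json"
    write_json(target, {"answer": 42})
    loaded = read_json(target)
```

With the path first and the data second, there is no file-or-data ordering to remember.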

Top-posted from my Windows Phone

-Original Message-
From: "Paul Moore" 
Sent: 3/27/2017 5:57
To: "Ram Rachum" 
Cc: "python-ideas" 
Subject: Re: [Python-ideas] Add pathlib.Path.write_json 
and pathlib.Path.read_json

On 27 March 2017 at 13:50, Ram Rachum  wrote:
> This would make writing / reading JSON to a file a one liner instead of a
> two-line with clause.

That hardly seems like a significant benefit...

Paul


Re: [Python-ideas] Adding an 'errors' argument to print

2017-03-26 Thread Steve Dower

On 26Mar2017 0707, Nick Coghlan wrote:

Perhaps it would be worth noting in the table of error handlers at
https://docs.python.org/3/library/codecs.html#error-handlers that
backslashreplace is used by the `ascii()` builtin and the associated
format specifiers


backslashreplace is also the default errors for stderr, which is 
arguably the right target for debugging output. Perhaps what we really 
want is a shorter way to send output to stderr? Though I guess it's an 
easy to invent one-liner, once you know about the difference:


>>> printe = partial(print, file=sys.stderr)
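
A self-contained version of that one-liner, plus a sketch of what backslashreplace does on a stream that cannot encode a character. The in-memory ASCII stream is a stand-in chosen for this example so the result does not depend on the real console:

```python
import io
import sys
from functools import partial

# The one-liner, with the imports it needs.
printe = partial(print, file=sys.stderr)

# An ASCII-only stream standing in for a limited console.
buffer = io.BytesIO()
stream = io.TextIOWrapper(
    buffer, encoding="ascii", errors="backslashreplace", newline="\n"
)

# The euro sign cannot be encoded as ASCII, so it is written as an escape
# instead of raising UnicodeEncodeError.
print("price: \u20ac5", file=stream)
stream.flush()
written = buffer.getvalue()

# ascii() produces the same style of escape for a repr.
escaped_repr = ascii("\u20ac")
```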

Also worth noting that Python 3.6 supports Unicode characters on the 
console by default on Windows. So unless sys.stdout was manually 
constructed (a possibility, given this was a GUI app, though I designed 
the change such that `open("CON", "w")` would get it right), there 
wouldn't have been an encoding issue in the first place.


Cheers,
Steve


Re: [Python-ideas] Fwd: Define a method or function attribute outside of a class with the dot operator

2017-02-10 Thread Steve Dower

On 10Feb2017 1400, Stephan Hoyer wrote:

An important note is that ideally, we would still have a way of indicating
that Spam.func should exist on the Spam class itself, even if it
doesn't define the implementation. I suppose an abstractmethod
overwritten by the later definition might do the trick, e.g.,

class Spam(metaclass=ABCMeta):
    @abstractmethod
    def func(self):
        pass

def Spam.func(self):
    return __class__


An abstract method should not become a concrete method on the 
abstract class - the right way to do this is to use a subclass.


from abc import ABCMeta, abstractmethod

class SpamBase(metaclass=ABCMeta):
    @abstractmethod
    def func(self):
        pass

class Spam(SpamBase):
    def func(self):
        return __class__


If you want to define parts of the class in separate modules, use mixins:

from myarray.transforms import MyArrayTransformMixin
from myarray.arithmetic import MyArrayArithmeticMixin
from myarray.constructors import MyArrayConstructorsMixin

class MyArray(MyArrayConstructorsMixin, MyArrayArithmeticMixin,
              MyArrayTransformMixin):
    pass


The big difference between these approaches and the proposal is that the 
proposal does not require both parties to agree on the approach. This is 
actually a terrible idea, as subclassing or mixing in a class that 
wasn't meant for it leads to all sorts of trouble unless the end user is 
very careful. Providing first-class syntax or methods for this 
discourages carefulness. (Another way of saying it is that directly 
overriding class members should feel a bit dirty because it *is* a bit 
dirty.)


As Paul said in an earlier email, the best use of non-direct assignment 
in function definitions is putting it into a dispatch dictionary, and in 
this case making a decorator is likely cleaner than adding new syntax.
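
A minimal sketch of the decorator approach to a dispatch dictionary. The names handlers and register, and the sample commands, are made up for illustration:

```python
handlers = {}


def register(name):
    """Return a decorator that stores the function under name."""
    def decorator(func):
        handlers[name] = func
        return func  # leave the function usable under its own name too
    return decorator


@register("add")
def handle_add(a, b):
    return a + b


@register("neg")
def handle_neg(a):
    return -a


result = handlers["add"](2, 3)
```

The registration is explicit at the definition site, and no new syntax is involved.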


But by all means, let's have a PEP. It will simplify the discussion when 
it comes up in six months again (or whenever the last time this came up 
was - less than a year, I'm sure).


Cheers,
Steve


Re: [Python-ideas] Fwd: Define a method or function attribute outside of a class with the dot operator

2017-02-10 Thread Steve Dower
When you apply the "what if everyone did this" rule, it looks like a bad idea 
(or alternatively, what if two people who weren't expecting anyone else to do 
this did it).

Monkeypatching is fairly blatantly taking advantage of the object model in a 
way that is not "supported" and cannot behave well in the context of everyone 
doing it, whereas inheritance or mixins are safe. Making a dedicated syntax or 
decorator for patching is saying that we (the language) think you should do it. 
(The extension_method decorator sends exactly the wrong message about what it's 
doing.)

Enabling a __class__ variable within the scope of the definition would also 
solve the motivating example, and is less likely to lead to code where you need 
to review multiple modules and determine whole-program import order to figure 
out why your calls do not work.

Top-posted from my Windows Phone

-Original Message-
From: "Markus Meskanen" <markusmeska...@gmail.com>
Sent: 2/10/2017 10:18
To: "Paul Moore" <p.f.mo...@gmail.com>
Cc: "Python-Ideas" <python-ideas@python.org>; "Steve Dower" 
<steve.do...@python.org>
Subject: Re: [Python-ideas] Fwd: Define a method or function attribute outside of 
a class with the dot operator

Well yes, but I think you're a bit too fast on labeling it a mistake to use 
monkey patching...




On Feb 10, 2017 18:15, "Paul Moore" <p.f.mo...@gmail.com> wrote:

On 10 February 2017 at 16:09, Markus Meskanen <markusmeska...@gmail.com> wrote:
> But if people are gonna do it anyways with the tools provided (monkey
> patching), why not provide them with better tools?


Because encouraging and making it easier for people to make mistakes
is the wrong thing to do, surely?

Paul
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Fwd: Define a method or function attribute outside of a class with the dot operator

2017-02-10 Thread Steve Dower
Since votes seem to be being counted and used for debate purposes, I am -1 to 
anything that encourages or condones people adding functionality to classes 
outside of the class definition. (Monkeypatching in my mind neither condones nor 
encourages, and most descriptions come with plenty of caveats about how it 
should be avoided.)

My favourite description of object-oriented programming is that it's like 
"reading a road map through a drinking(/soda/pop) straw". We do not need to 
tell people that it's okay to make this problem worse by providing first-class 
tools to do it.

Top-posted from my Windows Phone

-Original Message-
From: "Chris Angelico" 
Sent: ‎2/‎10/‎2017 8:27
To: "Python-Ideas" 
Subject: Re: [Python-ideas] Fwd: Define a method or function attribute outside 
of a class with the dot operator

On Sat, Feb 11, 2017 at 1:16 AM, Nick Coghlan  wrote:
> But what do __name__ and __qualname__ get set to?
>
> What happens if you do this at class scope, rather than at module
> level or inside another function?
>
> What happens to the zero-argument super() support at class scope?
>
> What happens if you attempt to use zero-argument super() when *not* at
> class scope?
>
> These are *answerable* questions...

... and are exactly why I asked the OP to write up a PEP. This isn't
my proposal, so it's not up to me to make the decisions.

For what it's worth, my answers would be:

__name__ would be the textual representation of exactly what you typed
between "def" and the open parenthesis. __qualname__ would be built
the exact same way it currently is, based on that __name__.

Zero-argument super() would behave exactly the way it would if you
used a simple name. This just changes the assignment, not the creation
of the function. So if you're inside a class, you could populate a
lookup dictionary with method-like functions. Abuse this, and you're
only shooting your own foot.

Zero-argument super() outside of a class, just as currently, would be
an error. (Whatever kind of error it currently is.)
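For reference, a small sketch of what current CPython 3 does when a function using zero-argument super() is defined outside a class body and attached afterwards. The class names are illustrative only. The call fails with RuntimeError, because the compiler creates the implicit __class__ cell only for functions defined inside a class body.

```python
class Base:
    def greet(self):
        return "hello from Base"


class Child(Base):
    pass


def greet(self):
    # Defined at module level, so there is no __class__ cell for super().
    return super().greet()

Child.greet = greet

try:
    Child().greet()
except RuntimeError as exc:
    error = exc          # raised because zero-argument super() has no class
else:
    error = None


def greet_explicit(self):
    # The two-argument form works anywhere, since the class is spelled out.
    return super(Child, self).greet()

Child.greet = greet_explicit
explicit_result = Child().greet()
```

The explicit two-argument form is what monkeypatched methods have to use today.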

Maybe there are better answers to these questions, I don't know.
That's what the PEP's for.

ChrisA
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Unified TLS API for Python

2017-02-03 Thread Steve Dower

On 02Feb2017 0601, Cory Benfield wrote:


4. Eventually, integrating the two backends above into the standard
library so that it becomes possible to reduce the reliance on OpenSSL.
This would allow future Python implementations to ship with all of their
network protocol libraries supporting platform-native TLS
implementations on Windows and macOS. This will almost certainly require
new PEPs. I’ll probably volunteer to maintain a SecureTransport library,
and I have got verbal suggestions from some other people who’d be
willing to step up and help with that. Again, we’d need help with
SChannel (looking at you, Steve).


I'm always somewhat interested in learning a new API that I've literally 
never looked at before, so yeah, count me in :) (my other work was using 
the trust APIs directly, rather than the secure socket APIs).


PyCon US sprints? It's not looking like I'll be able to set aside too 
much time before then, but I've already fenced off that time.


Cheers,
Steve

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] pathlib suggestions

2017-01-25 Thread Steve Dower

On 25Jan2017 0816, Petr Viktorin wrote:

On 01/25/2017 04:33 PM, Todd wrote:

But what if the .tar.gz file is called "spam-4.2.5-final.tar.gz"?
Existing tools like glob and endswith() can deal with the ".tar.gz"
extension reliably, but "fullsuffix" would, arguably, not give the
answers you want.

I wouldn't use it in that situation.  The existing "suffix" and "stem"
properties also only work reliably under certain situations.


Which situations do you mean? It works quite fine with multiple suffixes:
The suffix of "pip-9.0.1.tar.gz" is ".gz", and sure enough, you can
reasonably expect it's a gz-compressed file. If you uncompress it and
strip the extension, you'll end up with a "pip-9.0.1.tar", where the
suffix is ".tar" -- and humans would be surprised if it wasn't a tar
archive.



It may be handy if suffixes was a reversed tuple of suffixes (or 
possibly a cumulative tuple):


>>> Path('pip-9.0.1.tar.gz').suffixes
('.gz', '.tar', '.1', '.0')

This has a nice benefit for comparisons:
>>> targzs = [f for f in all_files if f.suffixes[:2] == ('.gz', '.tar')]

It doesn't necessarily improve over .endswith(), but it has a slight 
convenience over .split() and arguably demonstrates intent more clearly. 
(Though my biggest issue with all of this is case-sensitivity, which 
probably means we need to add comparison functions to Path flavours in 
order to do this stuff properly.)



The "cumulative tuple" version would be like this:

>>> Path('pip-9.0.1.tar.gz').suffixes
('.gz', '.tar.gz', '.1.tar.gz', '.0.1.tar.gz')

This doesn't compare as nicely, since now we would use f.suffixes[1] 
which will raise if there is only one suffix (likely). But it does 
return a value which cannot be easily recreated using other functions.
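Both variants are hypothetical; today's suffixes returns a list in left-to-right order. A rough sketch of how the two shapes could be derived from the existing property follows. The helper names reversed_suffixes and cumulative_suffixes are invented for this example and are not part of pathlib.

```python
from pathlib import PurePosixPath


def reversed_suffixes(path):
    """Suffixes from the outermost (last) to the innermost (first)."""
    return tuple(reversed(PurePosixPath(path).suffixes))


def cumulative_suffixes(path):
    """Progressively longer trailing suffix strings, shortest first."""
    result = []
    tail = ""
    for suffix in reversed(PurePosixPath(path).suffixes):
        tail = suffix + tail
        result.append(tail)
    return tuple(result)


current = PurePosixPath("pip-9.0.1.tar.gz").suffixes
rev = reversed_suffixes("pip-9.0.1.tar.gz")
cum = cumulative_suffixes("pip-9.0.1.tar.gz")

# The comparison from above, written against the reversed tuple.
is_targz = rev[:2] == (".gz", ".tar")
```

Slicing with [:2] never raises, whereas indexing the cumulative tuple with [1] needs a length check first.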


Cheers,
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-24 Thread Steve Dower
Right. Platforms that have a defined invalid value don't need the struct, and 
so they can define the type differently. It just means we also need to provide 
a macro for testing whether it's been created or not, and users should 
genuinely treat the value as opaque.

Cheers,
Steve

Top-posted from my Windows Phone

-Original Message-
From: "Masayuki YAMAMOTO" 
Sent: ‎12/‎23/‎2016 16:34
To: "Erik Bray" 
Cc: "python-ideas@python.org" 
Subject: Re: [Python-ideas] New PyThread_tss_ C-API for CPython

2016-12-21 19:01 GMT+09:00 Erik Bray :

On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan  wrote:
> Ouch, I'd missed that, and I agree it's not a negligible implementation
> detail - there are definitely applications embedding CPython out there that
> rely on being able to run multiple Initialize/Finalize cycles in the same
> process and have everything "just work". It also means using the
> "PyThread_*" prefix for the initialisation tracking aspect would be
> misleading, since the life cycle details are:
>
> 1. Create the key for the first time if it has never been previously set in
> the process
> 2. Destroy and reinit if Py_Finalize gets called
> 3. Destroy and reinit if a new subprocess is forked
>
> It also means we can't use pthread_once even in the pthread TLS
> implementation, since it doesn't provide those semantics.
>
> So I see two main alternatives here.
>
> Option 1: Modify the proposed PyThread_tss_create and PyThread_tss_delete
> APIs to accept a "bool *init_flag" pointer in addition to their current
> arguments.
>
> If *init_flag is true, then PyThread_tss_create is a no-op, otherwise it
> sets the flag to true after creating the key.
> If *init_flag is false, then PyThread_tss_delete is a no-op, otherwise it
> sets the flag to false after deleting the key.
>
> Option 2: Similar to option 1, but using a custom type alias, rather than
> using a C99 bool directly
>
> The closest API we have to these semantics at the moment would be
> PyGILState_Ensure, so the following API naming might work for option 2:
>
> Py_ensure_t
> Py_ENSURE_NEEDS_INIT
> Py_ENSURE_INITIALIZED
>
> Respectively, these would just be aliases for bool, false, and true.
>
> And then modify the proposed PyThread_tss_create and PyThread_tss_delete
> APIs to accept a "Py_ensure_t *init_flag" in addition to their current
> arguments.


That all sounds good--between the two option 2 looks a bit more explicit.

Though what about this?  Rather than adding another type, the original
proposal could be changed slightly so that Py_tss_t *is* partially
defined as a struct consisting of a bool, with whatever the native TLS
key is.   E.g.

typedef struct {
    bool init_flag;
#if defined(_POSIX_THREADS)
    pthread_key_t key;
#elif defined(NT_THREADS)
    DWORD key;
/* etc... */
#endif
} Py_tss_t;

Then it's just taking Masayuki's original patch, with the global bool
variables, and formalizing that by combining the initialized flag with
the key, and requiring the semantics you described above for
PyThread_tss_create/delete.

For Python's purposes it seems like this might be good enough, with
the more general purpose pthread_once-like functionality not required.

Best,
Erik

As mentioned above, in the current TLS API the thread key uses -1 as its 
defined invalid value. If the new TLS API inherits the requirement that the key 
have a defined invalid value, then putting the key and the flag into one 
structure seems semantically correct. In that case, I think the TLS API should 
supply the defined invalid value (like PTHREAD_ONCE_INIT) to API users.

Moreover, the structure offers an opportunity to signal, through its field 
names, that the thread key type is opaque. I think this suggestion would make 
the API easier to understand, because a good field name can convey that reading 
or writing the key directly is incorrect (even if API users don't read the 
precautionary statement).


Have a nice holiday!

Masayuki
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Enhancing vars()

2016-12-12 Thread Steve Dower
I'm +1. This bites me far too often.

> in the past developers were encouraged to put only "useful" attributes
> in __dir__.

Good. If I'm getting vars() I really only want the useful ones. If I need 
interesting/secret ones then I'll getattr for them.

Cheers,
Steve

Top-posted from my Windows Phone

-Original Message-
From: "Alexander Belopolsky" 
Sent: ‎12/‎12/‎2016 19:47
To: "Steven D'Aprano" 
Cc: "python-ideas" 
Subject: Re: [Python-ideas] Enhancing vars()



On Mon, Dec 12, 2016 at 6:45 PM, Steven D'Aprano  wrote:

Proposal: enhance vars() to return a proxy to the object namespace,
regardless of whether said namespace is __dict__ itself, or a number of
__slots__, or both.
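To illustrate the gap being discussed: today vars() raises TypeError for an instance that has only __slots__. The helper below is a hypothetical sketch (the name namespace_snapshot is invented here) that collects both kinds of attributes into a plain dict. It returns a snapshot rather than the live proxy the proposal asks for, and it ignores name-mangled private slots.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def namespace_snapshot(obj):
    """Collect slot values and __dict__ entries into one dict (a copy)."""
    result = {}
    for cls in type(obj).__mro__:
        slots = getattr(cls, "__slots__", ())
        if isinstance(slots, str):      # __slots__ = "name" is allowed
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue
            if hasattr(obj, name):      # unset slots raise AttributeError
                result[name] = getattr(obj, name)
    result.update(getattr(obj, "__dict__", {}))
    return result


point = Point(1, 2)

try:
    vars(point)
except TypeError:
    vars_failed = True
else:
    vars_failed = False

snapshot = namespace_snapshot(point)
```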

How do you propose dealing with classes defined in C?  Their objects don't have 
__slots__.


One possibility is to use __dir__ or dir(), but those can return anything and 
in the past developers
were encouraged to put only "useful" attributes in __dir__.
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Proposal for default character representation

2016-10-15 Thread Steve Dower
FWIW, Python 3.6 should print this in the console just fine. Feel free to 
upgrade whenever you're ready.

Cheers,
Steve

-Original Message-
From: "Mikhail V" 
Sent: ‎10/‎12/‎2016 16:07
To: "M.-A. Lemburg" 
Cc: "python-ideas@python.org" 
Subject: Re: [Python-ideas] Proposal for default character representation

Forgot to reply to all, duping my message...

On 12 October 2016 at 23:48, M.-A. Lemburg  wrote:

> Hmm, in Python3, I get:
>
> >>> s = "абв.txt"
> >>> s
> 'абв.txt'

I posted output with Python 2 and Windows 7.
BTW, in Windows 10 'print' won't work with Unicode in the cmd console at all
by default, but that's another story; let us not go into that.
I think you get my idea right, it is not only about printing.


> The hex notation for \u is a standard also used in many other
> programming languages, it's also easier to parse, so I don't
> think we should change this default.

In programming literature it is used often, but let me point out that
decimal is THE standard and is a much, much better standard
in the sense of readability. And there is no solid reason to use two standards
at the same time.

>
> Take e.g.
>
> >>> s = "\u123456"
> >>> s
> 'ሴ56'
>
> With decimal notation, it's not clear where to end parsing
> the digit notation.

How is it not clear if the number of digits is fixed? It's not very clear what
you meant.
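For what it's worth, the existing hex escapes are already fixed-width, which is why the quoted example parses the way it does. A small sketch: \u always consumes exactly four hex digits and \U exactly eight, so any trailing digits are ordinary characters.

```python
s = "\u123456"

# \u takes exactly four hex digits (1234); "56" are literal characters.
first, rest = s[0], s[1:]

# \U takes exactly eight hex digits and yields a single code point.
wide = "\U00012345"
```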
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] (Windows-only - calling Steve Dower) Is Python for Windows using PGO? If not consider this a suggestion.

2016-09-18 Thread Steve Dower
It was disabled previously because of compiler bugs. 3.6.0b1 64-bit has PGO 
enabled, but we'll disable it again if there are any issues.

Top-posted from my Windows Phone

-Original Message-
From: "João Matos" <jcrma...@gmail.com>
Sent: ‎9/‎17/‎2016 4:02
To: "python-ideas@python.org" <python-ideas@python.org>
Subject: [Python-ideas] (Windows-only - calling Steve Dower) Is Python 
for Windows using PGO? If not consider this a suggestion.

Hello,

Is Python for Windows using PGO (Profile Guided Optimizations)? If not 
consider this a suggestion.

Best regards,

JM

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] (Windows-only - calling Steve Dower) Consider adding a symlink to pip in the same location as the py launcher

2016-09-18 Thread Steve Dower
I'd like to add a launcher in the same style as py.exe, but that would upset 
people who manually configure their PATH appropriately.

Personally, I find "py.exe -m pip" quite okay, but appreciate the idea. I'm 
thinking about this issue (also for other scripts).

Top-posted from my Windows Phone

-Original Message-
From: "João Matos" <jcrma...@gmail.com>
Sent: ‎9/‎17/‎2016 3:57
To: "python-ideas@python.org" <python-ideas@python.org>
Subject: [Python-ideas] (Windows-only - calling Steve Dower) Consider adding a 
symlink to pip in the same location as the py launcher

Hello,

If Py3.5 is installed in user mode instead of admin (all users) and we 
follow your advice that we shouldn't add it to the PATH env var, we can 
execute Python using the py launcher, but we can't use pip.
Please consider adding a pip symlink in the same location as the py 
launcher.

Best regards,

JM

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] Let’s make escaping in f-literals impossible

2016-08-29 Thread Steve Dower

On 29Aug2016 1433, Eric V. Smith wrote:

On 8/29/2016 5:26 PM, Ethan Furman wrote:

Update the PEP, then it's a bugfix.  ;)


Heh. I guess that's true. But it's sort of a big change, so shipping
beta 1 with the code not agreeing with the PEP rubs me the wrong way.

Or, I could stop worrying and typing emails, and instead just get on
with it!


I like this approach :)

But I agree. Release Manager Ned has the final say, but I think this 
change can comfortably go in during the beta period. (I also disagree 
that it's a big change - nobody could agree on the 'obvious' behaviour 
of backslashes anyway, so chances are people would avoid them anyway, 
and there was strong consensus on advising people to avoid them.)


Cheers,
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Fix default encodings on Windows

2016-08-18 Thread Steve Dower

On 18Aug2016 1036, Terry Reedy wrote:

On 8/18/2016 11:25 AM, Steve Dower wrote:


In this case, we would announce in 3.6 that using bytes as paths on
Windows is no longer deprecated,


My understanding is that the first 2 fixes refine the deprecation rather
than reversing it.  And #3 simply applies it.


#3 certainly just applies the deprecation.

As for the first two, I don't see any reason to deprecate the 
functionality once the issues are resolved. If using utf-8 encoded bytes 
is going to work fine in all the same cases as using str, why discourage it?


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Fix default encodings on Windows

2016-08-18 Thread Steve Dower

On 18Aug2016 0900, Chris Angelico wrote:

On Fri, Aug 19, 2016 at 1:54 AM, Steve Dower <steve.do...@python.org> wrote:

On 18Aug2016 0829, Chris Angelico wrote:


The second call to glob doesn't have any Unicode characters at all,
the way I see it - it's all bytes. Am I completely misunderstanding
this?



You're not the only one - I think this has been the most common
misunderstanding.

On Windows, the paths as stored in the filesystem are actually all text -
more precisely, utf-16-le encoded bytes, represented as 16-bit character
strings.

Converting to an 8-bit character representation only exists for
compatibility with code written for other platforms (either Linux, or much
older versions of Windows). The operating system has one way to do the
conversion to bytes, which Python currently uses, but since we control that
transformation I'm proposing an alternative conversion that is more reliable
than compatible (with Windows 3.1... shouldn't affect compatibility with
code that properly handles multibyte encodings, which should include
anything developed for Linux in the last decade or two).

Does that help? I tried to keep the explanation short and focused :)


Ah, I think I see what you mean. There's a slight ambiguity in the
word "missing" here.

1) The Unicode character in the result lacks some of the information
it should have

2) The Unicode character in the file name is information that has now been lost.

My reading was the first, but AIUI you actually meant the second. If
so, I'd be inclined to reword it very slightly, eg:

"The Unicode character in the second call to glob is now lost information."

Is that a correct interpretation?


I think so, though I find the wording a little awkward (and on 
rereading, my original wording was pretty bad). How about:


"The second call to glob has replaced the Unicode character with '?', 
which means the actual filename cannot be recovered and the path is no 
longer valid."


Cheers,
Steve

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Fix default encodings on Windows

2016-08-18 Thread Steve Dower

On 18Aug2016 0829, Chris Angelico wrote:

The second call to glob doesn't have any Unicode characters at all,
the way I see it - it's all bytes. Am I completely misunderstanding
this?


You're not the only one - I think this has been the most common 
misunderstanding.


On Windows, the paths as stored in the filesystem are actually all text 
- more precisely, utf-16-le encoded bytes, represented as 16-bit 
character strings.


Converting to an 8-bit character representation only exists for 
compatibility with code written for other platforms (either Linux, or 
much older versions of Windows). The operating system has one way to do 
the conversion to bytes, which Python currently uses, but since we 
control that transformation I'm proposing an alternative conversion that 
is more reliable than compatible (with Windows 3.1... shouldn't affect 
compatibility with code that properly handles multibyte encodings, which 
should include anything developed for Linux in the last decade or two).


Does that help? I tried to keep the explanation short and focused :)

Cheers,
Steve
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Fix default encodings on Windows

2016-08-18 Thread Steve Dower

Summary for python-dev.

This is the email I'm proposing to take over to the main mailing list to 
get some actual decisions made. As I don't agree with some of the 
possible recommendations, I want to make sure that they're represented 
fairly.


I also want to summarise the background leading to why we should 
consider making a change here at all, rather than simply leaving it 
alone. There's a chance this will all make its way into a PEP, depending 
on how controversial the core team thinks this is.


Please let me know if you think I've misrepresented (or unfairly 
represented) any of the positions, or if you think I can 
simplify/clarify anything in here. Please don't treat this like a PEP 
review - it's just going to be an email to python-dev - but the more we 
can avoid having the discussions there we've already had here the better.


Cheers,
Steve

---

Background
==========

File system paths are almost universally represented as text in some 
encoding determined by the file system. In Python, we expose these paths 
via a number of interfaces, such as the os and io modules. Paths may be 
passed either direction across these interfaces, that is, from the 
filesystem to the application (for example, os.listdir()), or from the 
application to the filesystem (for example, os.unlink()).


When paths are passed between the filesystem and the application, they 
are either passed through as a bytes blob or converted to/from str using 
sys.getfilesystemencoding(). The result of encoding a string with 
sys.getfilesystemencoding() is a blob of bytes in the native format for 
the default file system.
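As a rough illustration of that conversion step, os.fsencode() and os.fsdecode() apply the filesystem encoding together with its error handler. The filename here is an arbitrary ASCII example so that the sketch behaves the same on every platform; sys.getfilesystemencodeerrors() requires Python 3.6 or later.

```python
import os
import sys

name = "example.txt"

# str -> bytes using the filesystem encoding and its error handler.
encoded = os.fsencode(name)

# The same conversion spelled out by hand.
by_hand = name.encode(sys.getfilesystemencoding(),
                      sys.getfilesystemencodeerrors())

# bytes -> str reverses it.
decoded = os.fsdecode(encoded)
```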


On Windows, the native format for the filesystem is utf-16-le. The 
recommended platform APIs for accessing the filesystem all accept and 
return text encoded in this format. However, prior to Windows NT (and 
possibly further back), the native format was a configurable machine 
option and a separate set of APIs existed to accept this format. The 
option (the "active code page") and these APIs (the "*A functions") 
still exist in recent versions of Windows for backwards compatibility, 
though new functionality often only has a utf-16-le API (the "*W 
functions").


In Python, we recommend using str as the default format on Windows 
because it can correctly round-trip all the characters representable in 
utf-16-le. Our support for bytes explicitly uses the *A functions and 
hence the encoding for the bytes is "whatever the active code page is". 
Since the active code page cannot represent all Unicode characters, the 
conversion of a path into bytes can lose information without warning.


As a demonstration of this:

>>> open('test\uAB00.txt', 'wb').close()
>>> import glob
>>> glob.glob('test*')
['test\uab00.txt']
>>> glob.glob(b'test*')
[b'test?.txt']

The Unicode character in the second call to glob is missing information. 
You can observe the same results in os.listdir() or any function that 
matches its result type to the parameter type.
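The loss can be approximated on any platform with the codecs module. This sketch assumes cp1252 as a stand-in for the active code page and uses errors="replace" to mimic the substitution; the real *A functions perform their own conversion inside Windows, so this is an analogy rather than the exact code path.

```python
name = "test\uAB00.txt"

# A code page that cannot represent U+AB00 substitutes '?'.
lossy = name.encode("cp1252", errors="replace")

# UTF-8 can represent every non-surrogate code point, so nothing is lost.
utf8 = name.encode("utf-8")

recovered_from_lossy = lossy.decode("cp1252")
recovered_from_utf8 = utf8.decode("utf-8")
```

Once the '?' has been substituted there is no way to get back to the original name, whereas the UTF-8 bytes decode to exactly the same string.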


Why is this a problem?
======================

While the obvious and correct answer is to just use str everywhere, it 
remains well known that on Linux and MacOS it is perfectly okay to use 
bytes when taking values from the filesystem and passing them back. 
Doing so also avoids the cost of decoding and reencoding, such that 
(theoretically), code like below should be faster because of the `b'.'`:


>>> for f in os.listdir(b'.'):
... os.stat(f)
...

On Windows, if a filename exists that cannot be encoded with the active 
code page, you will receive an error from the above code. These errors 
are why in Python 3.3 the use of bytes paths on Windows was deprecated 
(listed in the What's New, but not clearly obvious in the documentation 
- more on this later). The above code produces multiple deprecation 
warnings in 3.3, 3.4 and 3.5 on Windows.
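A self-contained sketch of the "result type matches parameter type" behaviour that code like the loop above relies on. It uses a temporary directory and an ASCII filename (both arbitrary choices for the example), so it avoids the encoding problem itself and only shows the str/bytes symmetry.

```python
import os
import tempfile


def list_both_ways():
    """List one directory with a str path and again with a bytes path."""
    with tempfile.TemporaryDirectory() as directory:
        open(os.path.join(directory, "spam.txt"), "wb").close()
        as_str = os.listdir(directory)                 # str in, str out
        as_bytes = os.listdir(os.fsencode(directory))  # bytes in, bytes out
    return as_str, as_bytes


str_names, bytes_names = list_both_ways()
```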


However, we still keep seeing libraries use bytes paths, which can cause 
unexpected issues on Windows. Given that the current approach of quietly 
recommending that library developers either write their code twice (once 
for bytes and once for str) or use str exclusively is not working, we 
should consider alternative mitigations.


Proposals
=========

There are two dimensions here - the fix and the timing. We can basically 
choose any fix and any timing.


The main differences between the fixes are the balance between incorrect 
behaviour and backwards-incompatible behaviour. The main issue with 
respect to timing is whether or not we believe using bytes as paths on 
Windows was correctly deprecated in 3.3 and sufficiently advertised 
since to allow us to change the behaviour in 3.6.


Fixes
-----

Fix #1: Change sys.getfilesystemencoding() to utf-8 on Windows

Currently the default filesystem encoding is 'mbcs', which is a 
meta-encoder that uses the active code page. In reality, our 
implementation uses the *A APIs and we don't explicitly decode bytes in 
order to pass them to the filesystem. This allows the OS to quietly 

Re: [Python-ideas] Fix default encodings on Windows

2016-08-18 Thread Steve Dower
"You consistently ignore Makefiles, .ini, etc."

Do people really do open('makefile', 'rb'), extract filenames and try to use 
them without ever decoding the file contents?

I've honestly never seen that, and it certainly looks like the sort of thing 
Python 3 was intended to discourage. (As soon as you open(..., 'r') you're only 
affected by this change if you explicitly encode again with mbcs.)

Top-posted from my Windows Phone

-Original Message-
From: "Stephen J. Turnbull" <turnbull.stephen...@u.tsukuba.ac.jp>
Sent: ‎8/‎17/‎2016 19:43
To: "Steve Dower" <steve.do...@python.org>
Cc: "Paul Moore" <p.f.mo...@gmail.com>; "Python-Ideas" <python-ideas@python.org>
Subject: Re: [Python-ideas] Fix default encodings on Windows

Steve Dower writes:
 > On 17Aug2016 0235, Stephen J. Turnbull wrote:

 > > So a full statement is, "How do we best represent Windows file
 > > system paths in bytes for interoperability with systems that
 > > natively represent paths in bytes?"  ("Other systems" refers to
 > > both other platforms and existing programs on Windows.)
 > 
 > That's incorrect, or at least possible to interpret correctly as
 > the wrong thing. The goal is "code compatibility with systems ...",
 > not interoperability.

You're right, I stated that incorrectly.  I don't have anything to add
to your corrected version.

 > > In a properly set up POSIX locale[1], it Just Works by design,
 > > especially if you use UTF-8 as the preferred encoding.  It's
 > > Windows developers and users who suffer, not those who wrote the
 > > code, nor their primary audience which uses POSIX platforms.
 > 
 > You mentioned "locale", "preferred" and "encoding" in the same sentence, 
 > so I hope you're not thinking of locale.getpreferredencoding()? Changing 
 > that function is orthogonal to this discussion,

You consistently ignore Makefiles, .ini, etc.  It is *not* orthogonal,
it is *the* reason for all opposition to your proposal or request that
it be delayed.  Filesystem names *are* text in part because they are
*used as filenames in text*.

 > When Windows developers and users suffer, I see it as my responsibility 
 > to reduce that suffering. Changing Python on Windows should do that 
 > without affecting developers on Linux, even though the Right Way is to 
 > change all the developers on Linux to use str for paths.

I resent that.  If I were a partisan Linux fanboy, I'd be cheering you
on because I think your proposal is going to hurt an identifiable and
large class of *Windows* users.  I know about and fear this possibility
because they use a language I love (Japanese) and an encoding I hate
but have achieved a state of peaceful coexistence with (Shift JIS).

And on the general principle, *I* don't disagree.  I mentioned earlier
that I use only the str interfaces in my own code on Linux and Mac OS
X, and that I suspect that there are no real efficiency implications
to using str rather than bytes for those interfaces.

On the other hand, the programming convenience of reading the
occasional "text" filename (or other text, such as XML tags) out of a
binary stream and passing it directly to filesystem APIs cannot be
denied.  I think that the kind of usage you propose (a fixed,
universal codec, universally accepted; ie, 'utf-8') is the best way to
handle that in the long run.  But as Grandmaster Lasker said, "Before
the end game, the gods have placed the middle game."  (Lord Keynes
isn't relevant here, Python will outlive all of us. :-)

 > I don't think there's any reasonable way to noisily deprecate these
 > functions within Python, but certainly the docs can be made
 > clearer. People who explicitly encode with
 > sys.getfilesystemencoding() should not get the deprecation message,
 > but we can't tell whether they got their bytes from the right
 > encoding or a RNG, so there's no way to discriminate.

I agree with you within Python; the custom is for DeprecationWarnings
to be silent by default.

As for "making noise", how about announcing the deprecation as like
the top headline for 3.6, postponing the actual change to 3.7, and in
the meantime you and Nick do a keynote duet at PyCon?  (Your partner
could be Guido, too, but Nick has been the most articulate proponent
for this particular aspect of "inclusion".  I think having a
representative from the POSIX world explaining the importance of this
for "all of us" would greatly multiply the impact.)  Perhaps, given my
proposed timing, a discussion at the language summit in '17 and the
keynote in '18 would be the best timing.

(OT, political: I've been strongly influenced in this proposal by
recently reading http://blog.aurynn.com/contempt-culture.  There's not
as much of it in Pytho

Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Steve Dower

On 17Aug2016 0901, Nick Coghlan wrote:

On 17 August 2016 at 02:06, Chris Barker  wrote:

So the Solution is to either:

 (A) get everyone to use Unicode  "properly", which will work on all
platforms (but only on py3.5 and above?)

or

(B) kludge some *nix-compatible support for byte paths into Windows, that
will work at least much of the time.

It's clear (to me at least) that (A) is the "Right Thing", but real world
experience has shown that it's unlikely to happen any time soon.

Practicality beats Purity and all that -- this is a judgment call.

Have I got that right?


Yep, pretty much. Based on Stephen Turnbull's concerns, I wonder if we
could make a whitelist of universal encodings that Python-on-Windows
will use in preference to UTF-8 if they're configured as the current
code page. If we accepted GB18030, GB2312, Shift-JIS, and ISO-2022-*
as overrides, then problems would be significantly less likely.

Another alternative would be to apply a similar solution as we do on
Linux with regards to the "surrogateescape" error handler: there are
some interfaces (like the standard streams) where we only enable that
error handler specifically if the preferred encoding is reported as
ASCII. In 2016, we're *very* skeptical about any properly configured
system actually being ASCII-only (rather than that value showing up
because the POSIX standards mandate it as the default), so we don't
really believe the OS when it tells us that.
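
A minimal sketch of the round-trip that the "surrogateescape" handler
provides, for readers who have not used it. The sample bytes are made up
for illustration, and this shows only the codec behaviour, not how the
standard streams are actually configured.

```python
# Made-up sample: 0xE9 is "é" in Latin-1 and is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# A strict UTF-8 decode rejects these bytes.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate
# (0xE9 becomes U+DCE9), so encoding again restores the original bytes.
name = raw.decode("utf-8", "surrogateescape")
restored = name.encode("utf-8", "surrogateescape")

print(strict_ok, ascii(name), restored == raw)
```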

The equivalent for Windows would be to disbelieve the configured code
page only when it was reported as "mbcs" - for folks that had
configured their system to use something other than the default,
Python would believe them, just as we do on Linux.


The problem here is that "mbcs" is not configurable - it's a 
meta-encoder that uses whatever is configured as the "language (system 
locale) to use when displaying text in programs that do not support 
Unicode" (quote from the dialog where administrators can configure 
this). So there's nothing to disbelieve here.


And even on machines where the current code page is "reliable", UTF-16 
is still the actual encoding, which means UTF-8 is still a better choice 
for representing the path as a blob of bytes. Currently we have 
inconsistent encoding between different Windows machines and could 
either remove that inconsistency completely or simply reduce it for 
(approx.) English speakers. I would rather an extreme here - either make 
it consistent regardless of user configuration, or make it so broken 
that nobody can use it at all. (And note that the correct way to support 
*some* other FS encodings would be to change the return value from 
sys.getfilesystemencoding(), which breaks people who currently ignore 
that just as badly as changing it to utf-8 would.)


Cheers,
Steve



Re: [Python-ideas] Fix default encodings on Windows

2016-08-17 Thread Steve Dower

On 17Aug2016 0235, Stephen J. Turnbull wrote:

Paul Moore writes:
 > On 16 August 2016 at 16:56, Steve Dower <steve.do...@python.org> wrote:

 > > This discussion is for the developers who insist on using bytes
 > > for paths within Python, and the question is, "how do we best
 > > represent UTF-16 encoded paths in bytes?"

That's incomplete, AFAICS.  (Paul makes this point somewhat
differently.)  We don't want to represent paths in bytes on Windows if
we can avoid it.  Nor does UTF-16 really enter into it (except for the
technical issue of invalid surrogate pairs).  So a full statement is,
"How do we best represent Windows file system paths in bytes for
interoperability with systems that natively represent paths in bytes?"
("Other systems" refers to both other platforms and existing programs
on Windows.)


That's incorrect, or at least possible to interpret correctly as the 
wrong thing. The goal is "code compatibility with systems ...", not 
interoperability.


Nothing about this will make it easier to take a path from Windows and 
use it on Linux or vice versa, but it will make it easier/more reliable 
to take code that uses paths on Linux and use it on Windows.



BTW, why "surrogate pairs"?  Does Windows validate surrogates to
ensure they come in pairs, but not necessarily in the right order (or
perhaps sometimes they resolve to non-characters such as U+1)?


Eryk answered this better than I would have.


Paul says:

 > People passing bytes to open() have in my view, already chosen not
 > to follow the standard advice of "decode incoming data at the
 > boundaries of your application". They may have good reasons for
 > that, but it's perfectly reasonable to expect them to take
 > responsibility for manually tracking the encoding of the resulting
 > bytes values flowing through their code.

Abstractly true, but in practice there's no such need for those who
made the choice!  In a properly set up POSIX locale[1], it Just Works by
design, especially if you use UTF-8 as the preferred encoding.  It's
Windows developers and users who suffer, not those who wrote the code,
nor their primary audience which uses POSIX platforms.


You mentioned "locale", "preferred" and "encoding" in the same sentence, 
so I hope you're not thinking of locale.getpreferredencoding()? Changing 
that function is orthogonal to this discussion, despite the fact that in 
most cases it returns the same code page as what is going to be used by 
the file system functions (which in most cases will also be used by the 
encoding returned from sys.getfilesystemencoding()).
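
To make the distinction concrete, here is a small sketch that prints the
two values side by side. Both calls are existing standard library
functions; the variable names are only for illustration, and the values
printed depend on the platform and configuration.

```python
import locale
import sys

# Default encoding for text file *contents*, e.g. open(name) without
# an explicit encoding= argument.
text_default = locale.getpreferredencoding(False)

# Encoding used to convert file *names* between str and bytes.
fs_encoding = sys.getfilesystemencoding()

print("preferred (file contents):", text_default)
print("filesystem (file names):  ", fs_encoding)
```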


When Windows developers and users suffer, I see it as my responsibility 
to reduce that suffering. Changing Python on Windows should do that 
without affecting developers on Linux, even though the Right Way is to 
change all the developers on Linux to use str for paths.



 > > If you see an alternative choice to those listed above, feel free
 > > to contribute it. Otherwise, can we focus the discussion on these
 > > (or any new) choices?
 >
 > Accept that we should have deprecated builtin open and the io module,
 > but didn't do so. Extend the existing deprecation of bytes paths on
 > Windows, to cover *all* APIs, not just the os module. But modify the
 > deprecation to be "use of the Windows CP_ACP code page (via the ...A
 > Win32 APIs) is deprecated and will be replaced with use of UTF-8 as
 > the implied encoding for all bytes paths on Windows starting in Python
 > 3.7". Document and publicise it much more prominently, as it is a
 > breaking change. Then leave it one release for people to prepare for
 > the change.

I like this one!  If my paranoid fears are realized, in practice it
might have to wait two releases, but at least this announcement should
get people who are at risk to speak up.  If they don't, then you can
just call me "Chicken Little" and go ahead!


I don't think there's any reasonable way to noisily deprecate these 
functions within Python, but certainly the docs can be made clearer. 
People who explicitly encode with sys.getfilesystemencoding() should not 
get the deprecation message, but we can't tell whether they got their 
bytes from the right encoding or a RNG, so there's no way to discriminate.


I'm going to put together a summary post here (hopefully today) and get 
those who have been contributing to basically sign off on it, then I'll 
take it to python-dev. The possible outcomes I'll propose will basically 
be "do we keep the status quo, undeprecate and change the functionality, 
deprecate the deprecation and undeprecate/change in a couple releases, 
or say that it wasn't a real deprecation so we can deprecate and then 
change functionality in a couple releases".


Cheers,
Steve



Re: [Python-ideas] Fix default encodings on Windows

2016-08-15 Thread Steve Dower

On 15Aug2016 0954, Random832 wrote:

On Mon, Aug 15, 2016, at 12:35, Steve Dower wrote:

I'm still not sure we're talking about the same thing right now.

For `open(path_as_bytes).read()`, are we talking about the way
path_as_bytes is passed to the file system? Or the codec used to decide
the returned string?


We are talking about the way path_as_bytes is passed to the filesystem,
and in particular what encoding path_as_bytes is *actually* in, when it
was obtained from a file or other stream opened in binary mode.


Okay good, we are talking about the same thing.

Passing path_as_bytes in that location has been deprecated since 3.3, so 
we are well within our rights (and probably overdue) to make it a 
TypeError in 3.6. While it's obviously an invalid assumption, for the 
purposes of changing the language we can assume that no existing code is 
passing bytes into any functions where it has been deprecated.


As far as I'm concerned, there are currently no filesystem APIs on 
Windows that accept paths as bytes.



Given that, I'm proposing adding support for using byte strings encoded 
with UTF-8 in file system functions on Windows. This allows Python users 
to omit switching code like:


import os

if os.name == 'nt':
    f = os.stat(os.listdir('.')[-1])
else:
    f = os.stat(os.listdir(b'.')[-1])

Or simply using the bytes variant unconditionally because they heard it 
was faster (sacrificing cross-platform correctness, since it may not 
correctly round-trip on Windows).
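
As a rough sketch of the single code path this is meant to enable, the
following uses bytes paths throughout with no platform check. The
temporary directory and the file name "example.txt" are made up for the
example. It runs as-is on POSIX; on Windows it assumes the proposed
UTF-8 handling of bytes paths.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    with open(os.path.join(tmp, "example.txt"), "w") as f:
        f.write("x")

    bdir = os.fsencode(tmp)        # str path -> bytes path
    names = os.listdir(bdir)       # bytes in, bytes out
    st = os.stat(os.path.join(bdir, names[-1]))
    size = st.st_size

print(names, size)
```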


My proposal is to remove all use of the *A APIs and only use the *W 
APIs. That completely removes the (already deprecated) use of bytes as 
paths. I then propose to change the (unused on Windows) 
sys.getfilesystemencoding() to 'utf-8' and handle bytes being passed into 
filesystem functions by transcoding into UTF-16 and calling the *W APIs.
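
The transcoding step could look roughly like the sketch below. Both
helper names are hypothetical and exist only for this illustration; the
real conversion would happen in C inside the interpreter, and error
handling for lone surrogates is deliberately left out.

```python
def bytes_path_to_wide(path: bytes) -> bytes:
    """Hypothetical helper: UTF-8 bytes path -> NUL-terminated UTF-16-LE
    buffer of the kind a *W API expects."""
    text = path.decode("utf-8")
    return text.encode("utf-16-le") + b"\x00\x00"


def wide_to_bytes_path(buf: bytes) -> bytes:
    """Hypothetical helper: UTF-16-LE buffer from a *W API -> UTF-8 bytes."""
    text = buf.decode("utf-16-le").rstrip("\x00")
    return text.encode("utf-8")


original = "C:\\temp\\日本語.txt".encode("utf-8")
wide = bytes_path_to_wide(original)
print(wide_to_bytes_path(wide) == original)
```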


This completely removes the active codepage from the chain, allows paths 
returned from the filesystem to correctly roundtrip via bytes in Python, 
and allows those bytes paths to be manipulated at '\' characters. 
(Frankly I don't mind what encoding we use, and I'd be quite happy to 
force bytes paths to be UTF-16-LE encoded, which would also round-trip 
invalid surrogate pairs. But that would prevent basic manipulation which 
seems to be a higher priority.)
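
A short sketch of the trade-off described above, using a made-up path.
Splitting UTF-8 bytes at the backslash gives usable components, while
the same split on UTF-16-LE bytes cuts code units in half. The last part
shows a lone surrogate, which strict UTF-8 refuses but UTF-16-LE can
carry when the "surrogatepass" error handler is used.

```python
path = "C:\\dir\\file.txt"

# UTF-8: every ASCII character is one byte, so splitting at b"\\" is safe.
utf8_parts = path.encode("utf-8").split(b"\\")

# UTF-16-LE: each backslash is b"\\\x00", so the split leaves stray NUL
# bytes behind and the pieces are no longer valid UTF-16 on their own.
utf16_parts = path.encode("utf-16-le").split(b"\\")

# A lone surrogate cannot be encoded as strict UTF-8 ...
lone = "\ud800"
try:
    lone.encode("utf-8")
    utf8_accepts_lone = True
except UnicodeEncodeError:
    utf8_accepts_lone = False

# ... but round-trips through UTF-16-LE with surrogatepass.
utf16_lone = lone.encode("utf-16-le", "surrogatepass")
lone_back = utf16_lone.decode("utf-16-le", "surrogatepass")

print(utf8_parts)
print(utf16_parts)
print(utf8_accepts_lone, lone_back == lone)
```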


This does not allow you to take bytes from an arbitrary source and 
assume that they are correctly encoded for the file system. Python 3.3, 
3.4 and 3.5 have been warning that doing that is deprecated and the path 
needs to be decoded to a known encoding first. At this stage, it's time 
for us to either make byte paths an error, or to specify a suitable 
encoding that can correctly round-trip paths.
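
For illustration, here is what "decode to a known encoding first" means
in practice. The file name and the choice of cp1252 as the source
encoding are assumptions made for this example only.

```python
# Pretend these bytes were read from a binary stream written on a
# machine whose code page was cp1252.
raw = "naïve.txt".encode("cp1252")

# Treating them as UTF-8 without knowing their origin fails.
try:
    raw.decode("utf-8")
    guessed_ok = True
except UnicodeDecodeError:
    guessed_ok = False

# Decoding with the encoding they were actually written in recovers the
# name, which can then be passed on as str or re-encoded as UTF-8.
known = raw.decode("cp1252")
as_utf8 = known.encode("utf-8")

print(guessed_ok, known, as_utf8)
```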



If this does not answer the question, I'm going to need the question to 
be explained more clearly for me.


Cheers,
Steve



Re: [Python-ideas] Fix default encodings on Windows

2016-08-12 Thread Steve Dower
I was thinking we would end up using the console API for input but stick with 
the standard handles for output, mostly to minimize the amount of magic 
switching we have to do. But since we can just switch the entire stream object 
in __std*__ once at startup if nothing is redirected it probably isn't that 
much of a simplification.

I have some airport/aeroplane time today where I can experiment.

Top-posted from my Windows Phone

-Original Message-
From: "eryk sun" 
Sent: 8/12/2016 5:40
To: "python-ideas" 
Subject: Re: [Python-ideas] Fix default encodings on Windows

On Thu, Aug 11, 2016 at 9:07 AM, Paul Moore wrote:
> set codepage to UTF-8
> ...
> set codepage back
> spawn subprocess X, but don't wait for it
> set codepage to UTF-8
> ...
> ... At this point what codepage does Python see? What codepage does
> process X see? (Note that they are both sharing the same console).

The input and output codepages are global data in conhost.exe. They
aren't tracked for each attached process (unlike input history and
aliases). That's how chcp.com works in the first place. Otherwise its
calls to SetConsoleCP and SetConsoleOutputCP would be pointless.
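
A small sketch for inspecting those console-wide values from Python via
ctypes. The function name is hypothetical; GetConsoleCP and
GetConsoleOutputCP are the Win32 counterparts of the setters named
above. Off Windows the function simply returns None.

```python
import sys


def console_codepages():
    """Return (input_cp, output_cp) for the attached console,
    or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    import ctypes
    kernel32 = ctypes.windll.kernel32
    return kernel32.GetConsoleCP(), kernel32.GetConsoleOutputCP()


cps = console_codepages()
print(cps)
```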

But IMHO all talk of using codepage 65001 is a waste of time. I think
the trailing garbage output with this codepage in Windows 7 is
unacceptable. And getting EOF for non-ASCII input is a show stopper.
The problem occurs in conhost. All you get is the EOF result from
ReadFile/ReadConsoleA, so it can't be worked around. This kills the
REPL and raises EOFError for input(). ISTM the only people who think
codepage 65001 actually works are those using Windows 8+ who
occasionally need to print non-OEM text and never enter (or paste)
anything but ASCII text.