[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-14 Thread Jim J. Jewett
ptmcg@austin.rr.com wrote:

> ...  add a cautionary section on homoglyphs, specifically citing
> “A” (LATIN CAPITAL LETTER A) and “Α” (GREEK CAPITAL LETTER ALPHA)
> as an example problem pair.

There is a Unicode technical report about confusables, but it is never clear where 
to stop.  Are I (upper case I), l (lower case l) and 1 (numeric 1) from ASCII 
already a problem?  And if we do it at all, is there any way to avoid making 
Cyrillic languages second-class?

I'm not quickly finding the contemporary report, but these should be helpful if 
you want to go deeper:

http://www.unicode.org/reports/tr36/
http://unicode.org/reports/tr36/confusables.txt
https://util.unicode.org/UnicodeJsps/confusables.jsp


> I wanted to look a little further at the use of characters in identifiers 
> beyond the standard 7-bit ASCII, and so I found some of these same 
> issues dealing with Unicode NFKC normalization. The first discovery was 
> the overlapping normalization of “ªº” with “ao”. 

Here I don't see the problem.  Things that look slightly different are really 
the same, and you can write it either way.  So you can use what looks like a 
funny font, but the closest it comes to a security risk is that maybe you could 
access something without a casual reader realizing that you are doing so.  They 
would know that you *could* access it, just not that you *did*.
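
A minimal sketch of that equivalence, using only the standard library.  The 
variable names are illustrative and not from the original post.  It shows NFKC 
folding the two ordinal indicators to "ao", and the parser treating both 
spellings as one identifier.

```python
import unicodedata

# NFKC folds the ordinal indicators U+00AA and U+00BA to plain ASCII letters.
folded = unicodedata.normalize("NFKC", "\u00aa\u00ba")

# Identifiers are NFKC-normalized while parsing, so a name bound with the
# "funny font" spelling is reachable through the ASCII spelling.
namespace = {}
exec("\u00aa\u00ba = 1\nresult = ao + 1", namespace)

print(folded, namespace["result"])
```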

> Some other discoveries:
> “·” (ASCII 183) is a valid identifier body character, making “_···” a valid
> Python identifier.

That and the apostrophe are Unicode Consortium regrets, because they are 
normally punctuation, but there are also languages that use them as letters. 
The apostrophe is (supposedly only) used by Afrikaans.  I asked a native 
speaker where and how often it was used, and the similarity to Dutch was 
enough that Guido felt comfortable excluding it.  (It *may* have been similar 
to using the apostrophe for a contraction in English, and saying it therefore 
represents a letter, but the scope was clearly smaller.)  But the dot is used 
in Catalan, and ... we didn't find anyone ready to say it wouldn't be needed 
for sensible identifiers.  It is worth listing as a warning, and linters should 
probably complain.
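
As a rough idea of what such a linter check might look like, here is a sketch 
that reports every non-ASCII character in an identifier.  The function name 
and the "anything above 127" rule are my own assumptions, not an existing 
linter's behavior.

```python
import unicodedata

def non_ascii_identifier_chars(name):
    """Return (char, Unicode name) pairs for each non-ASCII character in name."""
    return [(ch, unicodedata.name(ch, "<unnamed>")) for ch in name if ord(ch) > 127]

# U+00B7 MIDDLE DOT may continue an identifier but may not start one.
candidate = "_\u00b7\u00b7\u00b7"
is_valid = candidate.isidentifier()
starts_ok = "\u00b7".isidentifier()
findings = non_ascii_identifier_chars(candidate)
```

A real check would probably want an allow-list per project, so that Catalan 
identifiers are not flagged in a Catalan code base.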

> “_” seems to be a special case for normalization. Only the ASCII “_”
> character is valid as a leading identifier character; the Unicode 
> characters that normalize to “_” (any of the characters in “︳︴﹍﹎﹏_”)
> can only be used as identifier body characters. “︳” especially could be
> misread as “|” followed by a space, when it actually normalizes to “_”.

So go ahead and warn, but it isn't clear how that could be abused to look like 
something other than a syntax error, except maybe through soft keywords.  (Ha!  
I snuck in a call to async︳def that had been imported with *, and you didn't 
worry about the import *, or the apparently wild cursor position marker, or the 
strange async definition that was never used!  No way I could have just issued 
a call to _flush and done the same thing!)
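
For anyone who does want to warn, a small sketch of how the low-line variants 
can be detected.  The code points are my reading of the quoted list (I am 
assuming the last one is FULLWIDTH LOW LINE), written as escapes so they 
survive any editor.

```python
import unicodedata

# Assumed code points for the characters in the quoted list.
LOW_LINE_VARIANTS = "\ufe33\ufe34\ufe4d\ufe4e\ufe4f\uff3f"

# Map each code point to what NFKC turns it into.
report = {
    f"U+{ord(ch):04X}": unicodedata.normalize("NFKC", ch)
    for ch in LOW_LINE_VARIANTS
}
all_fold_to_underscore = all(value == "_" for value in report.values())
```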

> Potential beneficial uses:
> I am considering taking my transformer code and experimenting with an
> orthogonal approach to syntax highlighting, using Unicode groups 
> instead of colors. Module names using characters from one group,
> builtins from another, program variables from another, maybe 
> distinguish local from global variables. Colorizing has always been an
> obvious syntax highlight feature, but is an accessibility issue for those
> with difficulty distinguishing colors.

I kind of like the idea, but ... if you're doing it on-the-fly in the editor, 
you could just use different fonts.  If you're actually saving those changes, 
it seems likely to lead to a lot of spurious diffs if anyone uses a different 
editor.

-jJ
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NPTL43EVT2FF76LXIBBWVHDU6NXH3HF5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-16 Thread Jim J. Jewett
Compatibility variants can look different, but they can also look identical.  
Allowing any non-ASCII characters was worrisome because of the security 
implications of confusables.  Squashing compatibility characters seemed the 
more conservative choice at the time.  Stestagg's example:
е = lambda е, e: е if е > e else e
shows it wasn't perfect, but adding more invisible differences does have risks, 
even beyond the backwards incompatibility and the problem with (hopefully rare, 
but are we sure?) editors that don't distinguish between them in the way a 
programming language would prefer.
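
A sketch of why that example slips past normalization, assuming (as I read it) 
that it mixes CYRILLIC SMALL LETTER IE with LATIN SMALL LETTER E.  NFKC only 
squashes compatibility variants; it does nothing for cross-script lookalikes.

```python
import unicodedata

latin_e = "e"            # U+0065
cyrillic_ie = "\u0435"   # U+0435, usually rendered identically

names = [unicodedata.name(ch) for ch in (latin_e, cyrillic_ie)]

# NFKC leaves the Cyrillic letter alone, so the two stay distinct identifiers.
still_distinct = unicodedata.normalize("NFKC", cyrillic_ie) != latin_e

# The lambda therefore has two different parameters that look the same.
pick_larger = eval("lambda \u0435, e: \u0435 if \u0435 > e else e")
```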

I think (but won't swear) that there were also several problematic characters 
that really should have been treated as (at most) glyph variants, but ... 
weren't.  If I Recall Correctly, the largest number were Arabic presentation 
forms, but there were also a few characters that were in Unicode only to 
support round-trip conversion with a legacy charset, even if that charset had 
been declared buggy.  In at least a few of these cases, it seemed likely that a 
beginning user would expect them to be equivalent.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GNT3AG2SCVLMCJAZXSTIWFKKAYG25E7O/


[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-16 Thread Jim J. Jewett
Stephen J. Turnbull wrote:
> Christopher Barker writes:

> > For example, in writing math we often use different scripts to mean
> > different things (e.g. TeX's Blackboard Bold).  So if I were to use
> > some of the Unicode Mathematical Alphanumeric Symbols, I wouldn't
> > want them to get normalized.

Agreed, for careful writers.  But Stephen's answer about people using the wrong 
one and expecting it to work means that normalization is probably the lesser of 
evils for most people, and the ones who don't want it normalized are more 
likely to be able to specify custom processing when it is important enough.  
(The compatibility characters aren't normalized in strings, largely because 
that should still be possible.)
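
To make the identifier/string distinction concrete, here is a sketch with one 
of the Mathematical Alphanumeric Symbols.  The variable names are mine; the 
behavior shown is the parser's normalization of identifiers only.

```python
import unicodedata

double_struck_a = "\U0001d538"  # MATHEMATICAL DOUBLE-STRUCK CAPITAL A

namespace = {}
# Used as an identifier, the symbol is folded to plain "A" by the parser;
# used inside a string literal, it is stored unchanged.
exec(f"{double_struck_a} = 'bound'\ntext = '{double_struck_a}'", namespace)

identifier_folded = namespace.get("A") == "bound"
literal_preserved = namespace["text"] == double_struck_a

# Anyone who wants the folded form of a string has to ask for it explicitly.
explicitly_folded = unicodedata.normalize("NFKC", namespace["text"])
```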

> In fact, I think adding these symbols to Unicode was a bad idea; they
> should be handled at a higher level in the linguistic stack (by
> semantic markup).

When I was a math student, these were clearly different symbols, with much less 
relation to each other than a mere case difference.  So by the Unicode 
Consortium's goals, they are independent characters that should each be 
defined.  I admit that isn't ideal for most use cases outside of math, but ... 
supporting those other cases is what compatibility normalization is for.

> It's also a UX problem.  At slightly higher layer in the stack, I'm
> used to using Japanese input methods to input sigma and pi which
> produce characters in the Greek block, and at least the upper case
> forms that denote sum and product have separate characters in the math
> operators block.  I understand why people who literally write
> mathematics in Greek might want those not normalized, but I sure am
> going to keep using "Greek sigma", not "math sigma"!  The probability
> that I'm going to have a Greek uppercase sigma in my papers is nil,
> the probability of a summation symbol near unity.  But the summation
> symbol is not easily available, I have to scroll through all the
> preceding Unicode blocks to find Mathematical Operators.  So I am
> perfectly happy with uppercase Greek sigma for that role (as is
> XeTeX!!)

I think that is mostly a backwards compatibility problem; XeTeX itself had to 
worry about compatibility with TeX (which preceded Unicode) and with the fonts 
actually available and then with earlier versions of XeTeX.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JNFLAQUKNCWCJSMBNJZGHVD5ZELOTU6G/


[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-16 Thread Jim J. Jewett
Steven D'Aprano wrote:
> I think 
> that many editors in common use don't support bidirectional text, or at 
> least the ones I use don't seem to support it fully or correctly. ...
> But, if there is a concrete threat beyond "it looks weird", that it 
> another issue.

Based on the original post (and how it looked in my web browser, after various 
automated reformattings), it seems that one of the failure modes of buggy 
editors is that stuff can be part of the code even though it looks like part 
of a comment, or vice versa.

This problem might be limited to only some of the bidi controls, and there 
might even be a workaround specific to # ... but it is an issue.  I do not 
currently have an opinion on how important an issue it is, or how adequate 
the workarounds are.
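
One conceivable workaround is simply to scan for the controls.  This is a 
sketch under my own assumptions: the function name is hypothetical, and the 
set covers only the explicit embeddings, overrides, and isolates, which may or 
may not be the full set that matters.

```python
import unicodedata

# Explicit directional embeddings/overrides (U+202A-202E) and isolates (U+2066-2069).
BIDI_CONTROLS = {chr(cp) for cp in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))}

def find_bidi_controls(source):
    """Yield (line number, column, character name) for each bidi control in source."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line):
            if ch in BIDI_CONTROLS:
                yield lineno, col, unicodedata.name(ch)

sample = "x = 1\ny = 2  # note \u202e hidden\n"
hits = list(find_bidi_controls(sample))
```

Reporting the position does not say whether the control is harmful; it only 
makes the invisible character visible to a reviewer.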

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ECO4R655UGPCVFFVAOQZ3DUZVHQY75BX/


[Python-Dev] Re: The current state of typing PEPs

2021-11-26 Thread Jim J. Jewett
Paul Moore wrote:
> More hazy memories here, but I think the original proposal left open
> the possibility of annotations not being types at all - for example,
> being docstrings for the arguments, or option names for a "function
> call to CLI" tool, etc. 

Absolutely.

While it was clear that Guido's own use cases were about typing, annotations 
were explicitly not limited to typing, which is one reason why some of the 
later changes have felt to some people like bait and switch.  Maybe it is 
already too late to avoid that.

> ... the expectation was that annotations
> would be *types*,

Even from the start, it was assumed that they would be objects.  (Types 
specifically were expected to be common, but not universal.)  The particular way 
strings are being substituted for evaluated objects has sometimes reminded me 
of raising a string instead of an exception class/object.  It will work, but it 
can seem sloppy, and it can be annoying if you were assuming otherwise and 
suddenly have to add a bunch of evals.  (That said, I haven't yet been 
sufficiently motivated to even tease out exactly what the problems are, let 
alone to propose an alternative that also satisfies the typing fans -- in part 
because it feels like the obvious optimization is to just not run typing, and 
it isn't clear what middle grounds are generally worthwhile.)

>...  personally, I have the same discomfort
> about using explicit string annotations for forward references, it
> feels like I'm not declaring a "proper type".
> If what I say above is right, the debate here isn't about whether
> annotations "are for types", but rather about whether reading the
> types in annotations and using them to affect behaviour *at runtime*
> is a legitimate use of annotations. 

I see that as a second dispute, which I had previously missed.  I think you're 
right, though.  On the other hand, I'm not sure the solution to both isn't just 
a helper function that does the 2nd-pass resolution -- preferably without 
requiring that all the rest of typing be imported, since even the people who 
want to use the typing package agree that importing it is not lightweight.
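
A rough sketch of what such a helper might look like, with no import of 
typing.  The name resolve_annotations is hypothetical, and evaluating in the 
function's globals is my simplifying assumption; it ignores class and local 
namespaces.

```python
def resolve_annotations(func):
    """Return func's annotations with string values evaluated in its globals."""
    resolved = {}
    for name, value in getattr(func, "__annotations__", {}).items():
        if isinstance(value, str):
            # eval() is only acceptable for annotations you trust.
            value = eval(value, func.__globals__)
        resolved[name] = value
    return resolved

def scale(value: "float", times: "int" = 2) -> "float":
    return value * times

resolved = resolve_annotations(scale)
```

Python 3.10 added inspect.get_annotations(), whose eval_str=True option covers 
similar ground.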

> ... I lurk on the typing-sig, and from an outsider's perspective, the
> participants seem to be almost entirely designers or heavy users of
> static type checkers. That gives a certain emphasis to the proposals
> coming from that group.

At times, it sort of reminds me of OWL and "Semantic Web".  There are plenty of 
people who will want to use annotations as a tool, but won't be willing to wade 
through what can feel like "How many angels can dance on the head of a pin?" 
discussions.  That said, I'm not sure how to best reach people who just want a 
rough-and-ready usually-good-enough tool.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JL3AQ4UVEVWXYTRFAVGVHNT23W2NCUDI/


[Python-Dev] Re: The current state of typing PEPs

2021-11-26 Thread Jim J. Jewett
Steven D'Aprano wrote:
> On Sat, Nov 20, 2021 at 11:46:56PM -0800, Christopher Barker wrote:

> Maybe PEP 563 could include a decorator in the typing module to 
> destringify all the annotations in a class or function?

If it were in an annotations module, that would probably be sufficient.

If it is in typing, then it is a very heavyweight dependency -- heavy enough 
that even the people actually using that module for development (and not for 
production runs) are worried about the costs.  If the costs of the typing 
module are that high, it is not acceptable to impose them on people not 
otherwise using the module.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZQHP24T2PKRTDBGZ36KLBHLLOKITP5ON/


[Python-Dev] Re: Explicit markers for special C-API situations (re: Clarification regarding Stable ABI and _Py_*)

2021-12-09 Thread Jim J. Jewett
Christian Heimes wrote:
> On 09/12/2021 19.26, Petr Viktorin wrote:

> > If the code is the authoritative source of truth, we need a proper
> > parser to extract the information.  ... unfortunately I don't trust it
> > enough to let it define the API. Bugs in the parser could result in
> > the API definition silently changing.

> There are other options than writing a new parser. GCC and Clang are 
> flexible. For example GCC can be extended with plugins and custom 
> attributes.

But they have the same problem ... it can be difficult to know if there is a 
subtle bug in someone's understanding of how the plugin interacts with, for 
example, nested ifndef.

The failure mode for an explicitly manually maintained text file is that 
something doesn't get added when it should, and the more conservative API 
consumers wait an extra release before using it.

-jJ



> We could extend the header files with custom attributes and 
> then use a plugin to create an ABI file from the attributes.
> I created a quick n' hack 
> https://github.com/python/cpython/compare/main...tiran:gcc-pythonapi-plugin?...
>  
> as proof of concept.
> The plugin takes
> PyAPI_ABI_FUNC(PyObject *) PyLong_FromLong(long);
> and dumps the declaration as:
> extern struct PyObject * PyLong_FromLong (long int); "abi_func"
> Christian
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PKQFEIK75EWVTNMLB5CGBYLQANZG6QJH/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-15 Thread Jim J. Jewett
> In Python 3.11, Python still implements around 100 types as "static
> types" which are not compatible with subinterpreters, like
> &PyLong_Type and &PyUnicode_Type. I opened
> https://bugs.python.org/issue40601 about these static types, but it
> seems like changing it may break the C API *and* the stable ABI (maybe
> a clever hack will avoid that).

If sub-interpreters each need their own copy of even immutable built-in types, 
then what advantage do they have over separate processes?

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/B7WO5B426HBTG6KZVKQXTJSBQL2S2ILQ/


[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL

2021-12-15 Thread Jim J. Jewett
Immortal objects shouldn't be reclaimed by garbage collection, but they still 
count as potential external roots for non-cyclic liveness.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FMIUHY6K3UUAUTK7GDTTOO4ULXO74QMP/


[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL

2021-12-15 Thread Jim J. Jewett
How common is it to reload a module in production code?

It seems like "object created at the module level" (excluding __main__) is at 
least as good a heuristic for immortality as "string that meets the 
syntactic requirements for an identifier".  Perhaps also anything created as 
part of class creation (as opposed to instance initialization).

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/F3IEICCQTKGZMRX3L4JS4NEZZNXVMZGA/


[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL

2021-12-16 Thread Jim J. Jewett
Guido van Rossum wrote:
> On Wed, Dec 15, 2021 at 6:57 PM Jim J. Jewett jimjjew...@gmail.com wrote:
> > Immortal objects shouldn't be reclaimed by garbage collection, but they
> > still count as potential external roots for non-cyclic liveness.
> So everything referenced by an immortal object should also be made immortal

Why?  As long as you can get a list of all immortal objects (and a traversal 
function from each), this is just an extra step (annoying, but tolerable) that 
removes a bunch of objects from the pool of potential garbage before you even 
begin looking for cycles.
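
A toy model of that extra step, to show what I mean.  Names stand in for 
objects and a dict stands in for each object's traversal function; none of 
this reflects how CPython's collector is actually written.

```python
def reachable_from(roots, referents):
    """Return every object name reachable from roots via the referents mapping."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(referents.get(obj, ()))
    return seen

# Toy heap: keys are objects, values are the objects they refer to.
referents = {
    "immortal_type": ["method_table"],
    "method_table": [],
    "cycle_a": ["cycle_b"],
    "cycle_b": ["cycle_a"],
}
immortal = {"immortal_type"}

# Everything an immortal object can reach is kept alive without being immortal.
kept_alive = reachable_from(immortal, referents)
gc_candidates = set(referents) - kept_alive
```

Here method_table stays mortal but is excluded from the candidate pool, and 
only the unreferenced cycle is left for the cycle detector.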

> -- even its type. Hence immortal objects must be immutable. 

This is probably a good idea, since avoiding changes also avoids races and Copy 
on Write and cache propagation, etc ... but I don't see why it is *needed*, 
rather than helpful.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4KY5XSHRMP3F3CWAW2OUW4NRXN4AB7EM/


[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Jim J. Jewett
Petr Viktorin wrote:
>>> In Python 3.11, Python still implements around 100 types as "static
>>> types" which are not compatible with subinterpreters,
...
>>> seems like changing it may break the C API *and* the stable ABI

> > If sub-interpreters each need their own copy of even immutable built-in 
> > types, then what advantage do they have over separate processes?

> They need copies of all *Python* objects. A non-Python library may allow 
> several Python wrappers/proxies for a single internal object, 
> effectively sharing that object between subinterpreters.
> (Which is a problem for removing the GIL -- currently all operations 
> done by such wrappers are protected by the GIL.)

OK, so what is the advantage of having multiple interpreters?

The only advantage I can see is that if you're embedding what are essentially 
several distinct python processes, you can still keep them all inside the 
single process used by the embedding program.  But that seems pretty far along the 
"they're already compiling anyhow; so the ABI isn't crucial" path.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/C2Z2RPRAIGYDODATM5BQQL6DA6LEOVVN/


[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL

2021-12-18 Thread Jim J. Jewett
Why are Immutability and transitive Immortality needed to share an object 
across interpreters?  

Are you assuming that a change in one interpreter should not be seen by others?  
(Typical case, but not always true.)  

Or are you saying that there is a technical problem such that a change -- even 
just to the reference count of a referenced string or something -- would cause 
data corruption?  (If so, could you explain why, or at least point me in the 
general direction?)

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E3XKSDEDOLHBFFUS2TXGDSLV7YOQUZJB/


[Python-Dev] Re: Function Prototypes

2021-12-24 Thread Jim J. Jewett
Steven D'Aprano wrote:
> In comparison, Mark's version:
> @Callable
> def IntToIntFunc(a:int)->int:
>     pass
> # in the type declaration
> func: IntToIntFunc
> uses 54 characters, plus spaces and newlines (including 7 punctuation 
> characters); it takes up three extra lines, plus a blank line. As 
> syntax goes it is double the size of Callable.

I think it takes only the characters needed to write the name IntToIntFunc.

The @Callable def section is a one-time definition, and not logically part of 
each function definition where it is used.

I get that some people prefer an inline lambda to a named function, and others 
hate naming an infrastructure function, but ...

Why are you even bothering to type the callback function?  If it is complicated 
enough to be worth explicitly typing, then it is complicated enough to chunk 
off with a name.
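
The "chunk it off with a name" idea already works with today's syntax.  This 
sketch uses an ordinary type alias rather than Mark's proposed def-based 
prototype; apply_twice and increment are made-up names for illustration.

```python
from typing import Callable

# One-time, named definition of the callback signature.
IntToIntFunc = Callable[[int], int]

def apply_twice(func: IntToIntFunc, value: int) -> int:
    """Call func on value, then on the result."""
    return func(func(value))

def increment(n: int) -> int:
    return n + 1

result = apply_twice(increment, 0)
```

Each use site then costs only the characters of the name, and the reader can 
look up the signature separately.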

I won't say it is impossible to understand a function signature on the first 
pass if it takes several lines and whitespace to write ... but it is much 
easier when the declaration is short enough to fit on a single line.  

An @ on the line above complicates the signature parsing, but can be mentally 
processed separately.  The same is true of a named-something-or-other in the 
middle.

Having to switch parsing modes to understand an internal ([int, float, int] -> 
List[int]), and then to pop that back off the stack, is much harder.  Hard 
enough that you really ought to help your reader out with a name, and let them 
figure out what that name means separately, when their brain's working memory 
isn't already loaded with the first part of your own function while still 
waiting for the last part.

> It separates the type declaration from the point at which it is used, 
> potentially far away from where it is used.

The sort of code that passes around functions tends to pass around many 
functions, but with only a few signatures.

If this is really the only time you'll need that signature (not even when you 
create the functions that will be passed from a calling site?), then ... great.  
But be nice to your reader anyhow, unless the signature is really so simple 
that the type-checking software should infer it for you.  Then be nice by 
leaving it out as cruft.

[As an aside, I would see some advantage to 

def myfunc(f:like blobfunc) 

pointing to an exemplar instead of a specifically constructed function-type.  
You discuss this later as either 

 ... f:blobfunc ... or 
 ... f:blobfunc=blobfunc ...

and I would support those, if other issues can be worked out.]

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZDCSTHMZVSILZZMGI3GTTBTWB53ZRJOI/


[Python-Dev] Re: Function Prototypes

2021-12-26 Thread Jim J. Jewett
Steven D'Aprano wrote:
> > > ... it is double the size of Callable.
> > I think it takes only the characters needed to write the name IntToIntFunc.
> ... you may only use it once.

Could you provide an example where it is only used once?

The only way I can imagine is that you use it here when defining your 
complicated function that takes a callback (once), but then you never actually 
call that complicated function, even from test code, nor do you expect your 
users to do so.

> The status quo is that we can use an anonymous type in the annotation 
> without pre-defining it, using Callable.

OK.  I'm not sure it would be a good idea, but we agree it is legal.

> PEP 677 proposes a new, more compact syntax for the same. 

Does it?  I agree that "(int, float) -> bool" is more compact than 
typing.Callable[...], but that feels like optimizing for the wrong thing.

I dislike the PEP's flat_map as an example, because it is the sort of 
infrastructure function that carries no semantic meaning, but ... I'll use it 
anyhow.

def flat_map(l, func):
    out = []
    for element in l:
        out.extend(func(element))
    return out


def wrap(x: int) -> list[int]:
    return [x]

def add(x: int, y: int) -> int:
    return x + y

It is reasonable to add a docstring to flat_map, but I grant that this doesn't 
work as well with tooling that might involve not actually seeing the function.

I agree that adding a long prefix of:

from typing import Callable

def flat_map(
    l: list[int],
    func: Callable[[int], list[int]]
) -> list[int]:

is undesirable.  But the biggest problem is not that "Callable..." has too many 
characters; the problem is that "Callable[[...], list[...]]" requires too many 
levels of sub-parsing. 

The PEP doesn't actually say what it proposes, [and you've suggested that my 
earlier attempt was slightly off, which may not bode well for likelihood of 
typos], but I'll *guess* that you prefer:

def flat_map(
    l: list[int],
    func: ((int) -> [int])
) -> list[int]:

which is slightly shorter physically, but not much simpler mentally.  It 
therefore creates an attractive nuisance.

def flat_map(
    l: list[int],
    func: wrap
) -> list[int]:

on the other hand, lets you read this definition without having to figure out 
what "wrap" does at the same time.  

"wrap" is a particularly bad example (because of the lack of semantic content 
in this example), but I think it still easily beats the proposed new solution, 
simply because it creates a "you don't need to peer below this right now" 
barrier.


> Any proposal for function prototypes using 
> `def` is directly competing against Callable or arrow syntax for the 
> common case that we want an anonymous, unnamed type written in place.

I'm saying that catering to that "common" case is a trap, often leading you to 
a local optimum that is bad globally.

> But if we can use an existing function as the prototype instead of 
> having to declare the prototype, that shifts the balance. 

I agree that re-using an existing function with the correct signature is 
better, *even* when that function doesn't make a good default.

...
> I would say the opposite: most callback or key functions have very 
> simple signatures.
> If my function takes a key function, let's say:
> def spam(mylist:[str], 
>          a: int, 
>          b: float,
>          c: bool|None,
>          key: Callable[[str], str],
>          ) -> Eggs:
>     mylist = sorted(mylist, key=key)
>     ...
> the relevant signature is (str) -> str. Do we really need to give that a 
> predefined named prototype?
> def StrToStr(s: str) -> str: pass

If you really care about enforcing the str, then yes, it is worth saying

key: str_key

and defining a str_key function as an example

def str_key(data: str) -> str:
    return str(data)

> I would argue that very few people would bother. 

Because it would usually be silly to care that the list really contained 
strings, as opposed to "something sortable".  So if you do care, it is worth 
making your requirement stand out, instead of losing it in a pile of what looks 
like boilerplate.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/32QYLA7UFTC54UM3CO3REIH57WLLBL6H/


[Python-Dev] Re: PEP 646 (Variadic Generics): final call for comments

2022-01-18 Thread Jim J. Jewett
I'm seeing enough different interpretations to think things aren't quite 
specified -- but I'm not sure if it matters.

(1)  Is any of this something that should affect computation, or is it really 
just a question of how to interpret possibly ambiguous documentation? 

(2)  Are any of these troubling cases something that a person should actually 
write for a normal situation?  Or are they just arguments about which 
abbreviations are acceptable?  Or about how automatically-generated (inferred) 
type descriptions should be written?

(3)  Are the slice-expansion questions all assumed to be indexing an 
n-dimensional array, as opposed to [start, stop, step]?  Is that explicit in 
the PEP, and just not in the extracts here?

(4)  Expanding multiple * shouldn't be ambiguous; the problem is figuring out 
what to condense into which if two are adjacent.  So 
    s1, s2 = [a, b], (1, 2, 3)
    [*s1, *s2] should turn into [a, b, 1, 2, 3]
The problem is that 
    [*s3, *s4] = (a, b, 1, 2, 3)
is ambiguous ... and I didn't really get that distinction from Petr's question 
or the answers.  I can't tell whether I've missed something crucial, or others 
are arguing over angels on a pinhead ... so whatever the PEP ends up deciding, 
it should be explicit.  (I *think* the earlier parts of this thread are 
consistent with this, and discussing whether to say explicitly that certain 
formats are forbidden (but maybe not enforced by the grammar), meaningless, or 
valid but currently meaningless outside of typing.)
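
For point (4), the runtime analogue of the distinction can be checked 
directly.  This sketch uses ordinary sequence unpacking, not the PEP's typing 
syntax, and the string values are placeholders of my own.

```python
s1, s2 = ["a", "b"], (1, 2, 3)

# Expanding two starred values is unambiguous.
combined = [*s1, *s2]

# Condensing into two starred targets is rejected at compile time, because
# there is no unique way to split the right-hand side.
try:
    compile("[*s3, *s4] = ('a', 'b', 1, 2, 3)", "<sketch>", "exec")
    two_star_targets_ok = True
except SyntaxError:
    two_star_targets_ok = False
```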

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YH4URO5EDQODG4QMGOCSXHV6RYTMLK5M/


[Python-Dev] Re: Suggestion: a little language for type definitions

2022-01-18 Thread Jim J. Jewett
Steven D'Aprano wrote:
> On Sat, Jan 08, 2022 at 12:59:38AM +0100, jack.jan...@cwi.nl wrote:

> For example, the arrow syntax for Callable `(int) -> str` (if accepted) 
> could be a plain old Python expression, usable anywhere the plain old 
> Python expression `Callable[[int], str]` would be.

In principle, yes.  In practice, I think the precedence of "->" might be 
tricky, particularly if the (int) part discourages people from wrapping the 
full expression in parentheses. 
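
As a rough illustration of both halves of that answer: the existing spelling 
already is a plain expression, and its brackets already settle the grouping 
question that a bare arrow would reopen.  The variable names below are mine, and 
the arrow form appears only in comments because it is not valid syntax today.

```python
import collections.abc
import typing
from typing import Callable, Optional

# Callable[[int], str] is an ordinary object that can be stored and inspected.
handler_type = Callable[[int], str]
args = typing.get_args(handler_type)
origin = typing.get_origin(handler_type)

# With an arrow, something like "(int) -> str or None" would need a precedence
# rule to choose between these two readings.  The bracketed spelling forces
# the author to pick one.
returns_optional = Callable[[int], Optional[str]]   # arrow binds loosest
optional_callable = Optional[Callable[[int], str]]  # arrow binds tightest
same_meaning = returns_optional == optional_callable
```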

> > What if we created a little language that is clearly flagged, for 
> > example as t”….” or t’….’? Then we could simply define the 
> > typestring language to be readable, so you could indeed say t”(int, 
> > str) -> bool”. And we could even allow escapes (similar to 
> > f-strings) so that the previous expression could also be specified, 
> > if you really wanted to, as t”{typing.Callable[[int, str], bool]}”.

> The following are not rhetorical questions. I don't know the answers, 
> which is why I am asking.

> 1. Are these t-strings only usable inside annotations, or are they 
> expressions that are usable everywhere?

I assume they *could* be used anywhere; there just wouldn't be much reason to 
do so.  Sort of like how a string expression can be an entire statement; there 
just usually isn't much reason (except as a docstring) to do it.

> 2. If only usable inside annotations, why bother with the extra prefix 
> t" and suffix "? What benefit do they give versus just the rule 
> "annotations use this syntax"?

It provides a useful box around the typing, so that people who are not 
currently worried about typing can more easily concentrate on the portion they 
do currently care about.

> 3. If usable outside of annotations, what runtime effect do they have? 

They create a string.  Which may or may not be a useful thing to do.

> The t-string must evaluate to an object. What object?

A string.  The various "let us delay annotation evaluation" proposals have made 
it clear that the people actually using typing don't want it to slow things 
down for extra evaluation until they explicitly call for that evaluation, 
perhaps as part of a special run.
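
Plain string annotations already behave this way, so they make a reasonable 
stand-in for the proposed t-strings.  This is a sketch using ordinary string 
literals, not the proposed syntax; the function func and its annotations are 
made up for illustration.

```python
import typing


def func(arr: "list[int]") -> "str":
    return ",".join(map(str, arr))


# Defining or importing the function stores the annotations as plain strings;
# nothing has been evaluated yet.
raw = dict(func.__annotations__)

# Evaluation happens only when something explicitly asks for it.
resolved = typing.get_type_hints(func)
```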

> 4. If the syntax allowed inside the t-string is specified as part of the 
> Python language definition, why do we need the prefix and suffix?

Same answer as number 2 ... it allows typing to be a less intrusive neighbor.  
I don't think t" " is as good as some sort of braces, but ... we're out of 
conventional braces available in ASCII.

> Likewise, if this is allowed:
> def func(arr: t"array [1...10] of int") -> str: ...

How many arguments do I pass to func?  That is already tricky to see at a 
glance, but 

> def func(arr: array [1...10] of int) -> str: ...

is even more difficult to parse.  By the time I've mentally attached the "of" 
and "int" to the indexed (but not really) array that just describes a type, 
I've forgotten what I was looking for and why.

> 5. What difference, if any, is there between `t"{expression}"` and 
> `expression`?

In addition to the box (so readers can more easily filter it out), there is 
also a flag to typing systems saying that they *should* elaborate the string.  
What they elaborate it into will be very different from a regular string.  That 
won't happen every time the module is imported, but it will happen when the 
string is actually needed for something.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HY7I52ILYN7IYN2S6UQT57XV4R3YEC2Z/


[Python-Dev] Re: Minor inconvenience: f-string not recognized as docstring

2022-01-18 Thread Jim J. Jewett
Guido van Rossum wrote:
> I personally think F-strings should not be usable as docstrings. If you
> want a dynamically calculated docstring you should assign it dynamically,
> not smuggle it in using a string-like expression. We don't allow "blah {x}
> blah".format(x=1) as a docstring either, nor "foo %s bar" % x.

Nor, last I checked, even "string1" + "string2", even though the result is a 
compile-time string in the appropriate location.  I think all of these should 
be allowed, but I'll grant that annotations reduce the need.  I'll even admit 
that scoping issues make the interpolating versions error prone, and the UI to 
clear that up may be more of a hassle than it is worth. 
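
For reference, a small sketch of what current CPython does with each form.  The 
function names are illustrative only.

```python
x = 1


def literal():
    "plain literal"


def adjacent():
    "string1" "string2"  # adjacent literals are merged by the parser


def concatenated():
    "string1" + "string2"  # an expression, even if it folds to a constant


def interpolated():
    f"value is {x}"  # an f-string is an expression, not a literal


def assigned():
    pass


# The dynamic route Guido describes: assign the docstring explicitly.
assigned.__doc__ = f"value is {x}"

docs = {
    f.__name__: f.__doc__
    for f in (literal, adjacent, concatenated, interpolated, assigned)
}
```

Only the first two are picked up automatically; the others leave __doc__ as None 
unless it is assigned afterwards.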

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZUWTCGK6KZJYCUDRR3JNB7H5W3ZHJWMT/


[Python-Dev] Re: Should we require IEEE 754 floating-point for CPython?

2022-02-07 Thread Jim J. Jewett
- Should we require the presence of NaNs in order for CPython to build?
- Should we require IEEE 754 floating-point for CPython-the-implementation?
- Should we require IEEE 754 floating-point for Python-the-language?

I don't have strong opinions on the first two, but for the language definition, 
I think the most we should say is "if an implementation does not support IEEE 
754 floating-point, this must be mentioned in the documentation as an 
implementation limit."
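
As a sketch of how a program could check for itself what it is running on, 
assuming only sys.float_info and the math module: the two function names are 
hypothetical, and matching the binary64 parameters is a necessary condition 
rather than proof of full IEEE 754 conformance.

```python
import math
import sys


def looks_like_binary64(info=sys.float_info):
    """True if the float parameters match IEEE 754 binary64 (a heuristic)."""
    return (
        info.radix == 2
        and info.mant_dig == 53
        and info.max_exp == 1024
        and info.min_exp == -1021
    )


def has_nan():
    """True if float('nan') exists and behaves like a NaN."""
    try:
        value = float("nan")
    except ValueError:
        return False
    return math.isnan(value) and value != value
```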
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YEXX363XX6DS7ZC653RBLIPNQIHBVYTK/


[Python-Dev] Re: It's now time to deprecate the stdlib urllib module

2022-02-07 Thread Jim J. Jewett
There are problems with urllib.  With hindsight, it would have been nice to do 
a few things differently.  But that doesn't make migrating away from it any 
easier.

This thread has mentioned several "better" alternatives -- but with the 
exception of 3rd party Requests, the docs don't even mention them.

Saying "You can do better, but we won't tell you how" is pretty rude to 
beginners, and we should not do it.

Delegating to the operating system may be sensible for a production system, and 
there is nothing wrong with saying so in the docs, and it would be great if we 
made that easy.  But it is absolutely not a reasonable replacement for a 
straightforward (possibly inefficient and non-scalable) implementation written 
in python that people can read and use for reference.  urllib shouldn't be 
deprecated until we have a better solution to *that* use case that is also in 
the stdlib.  (That might well be worth doing, but it should happen before the 
deprecation.)

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JI5CFS3WYXQEXKSEZH2ZTE3JJJ7AUAMW/


[Python-Dev] Re: It's now time to deprecate the stdlib urllib module

2022-02-08 Thread Jim J. Jewett
> Why do you think the stdlib *must* provide an example implementation 
> for this specific scenario? Is there something unique to HTTP request
> handling that you feel is important to demonstrate?

*must* is too strong, but I would use a very strong *should*.

I think the stdlib should provide simple source-included examples of most 
things.  I think the case is even stronger when it is:

(1) a fairly simple protocol (as version 1 of HTTP was) -- QUIC wouldn't 
count for a simple demonstration.
(2) something new users are likely to find motivating.  Short of "here is a way 
to do IO", and maybe "write a simple game", "get something from the web" is 
probably the most obvious case.
(3) something where bootstrapping might be an issue (network protocols, 
particularly web downloads).  Network access is not an always-available 
resource.  Even when it is available, there is sometimes a barrier between 
"available in python" and "I could read it on my phone, but can't get it open 
in python".
(4) something where a beginner is likely to be overwhelmed by choices if we 
just say "use a 3rd party module".
(5) something with a backwards-compatibility story in the stdlib already. 
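
For scale, this is roughly what "get something from the web" looks like with 
urllib today.  It is a minimal sketch: fetch_text is a hypothetical helper, the 
utf-8 fallback is my assumption, and the example uses a data: URL so that it 
runs without network access (see point 3); an http(s) URL goes through the same 
call.

```python
import urllib.request


def fetch_text(url, timeout=10.0):
    """Fetch a URL and decode the body using the charset the response declares."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# A data: URL is handled by urllib itself, so no network is needed here.
sample = fetch_text("data:text/plain;charset=utf-8,hello%20from%20the%20stdlib")
```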

As a side note, are there concerns about urllib.robotparser being broken or 
obsolete, or was that part of the deprecation proposal just contagion from 
urllib.request?

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HF5V6SFWV4BZUAOJTSEBD6DSZWSJONAM/


[Python-Dev] Re: Should we require IEEE 754 floating-point for CPython?

2022-02-09 Thread Jim J. Jewett
I think you skimmed over "A floating point expert can probably look at this ..."

I remember a time when I just assumed more bits was better, and a later time 
when I figured smaller was better, and a still later time when I wanted to 
match the published requirements for bitsize.  So that was several years when I 
didn't really understand the tradeoffs, but could have benefited from (or at 
least written better documentation by) knowing the size.  During those years, I 
would have recognized the importance of 1024, but would probably not have 
bothered interpreting 2.220446049250313.  

A method (or docstring) with a more friendly interface would be good.
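
Something along these lines is what I have in mind, as a sketch only.  The 
function describe_floats and its key names are hypothetical; the values come 
straight from sys.float_info, where on a binary64 platform max_exp is 1024 and 
epsilon is 2.220446049250313e-16.

```python
import sys


def describe_floats(info=sys.float_info):
    """Relabel the sys.float_info fields in terms a non-expert can read."""
    return {
        "significand_bits": info.mant_dig,
        "reliable_decimal_digits": info.dig,
        "largest_exponent": info.max_exp,
        "largest_value": info.max,
        # epsilon is the gap between 1.0 and the next representable float.
        "gap_after_1.0": info.epsilon,
    }


summary = describe_floats()
```

Seen this way, the opaque epsilon is just radix ** (1 - significand_bits), which 
ties it back to the bit size that a non-expert would recognize.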

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R4BW2ZL46Y23UYYQCOSWJ2B3KTSRO5LK/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

2022-02-16 Thread Jim J. Jewett
I suggest being a little more explicit (even blatant) that the particular 
details of:

(1)  which subset of functionally immortal objects are marked as immortal
(2)  how to mark something as immortal
(3)  how to recognize something as immortal
(4)  which memory-management activities are skipped or modified for immortal 
objects

are not only CPython-specific, but are also private implementation details that 
are expected to change in subsequent versions.


Ideally, things like the interned string dictionary or the constants from a pyc 
file will be not merely immortal, but stored in an immortal-only memory page, 
so that they won't be flushed or CoW-ed when a nearby non-immortal object is 
modified.  Getting those details right will make a difference to performance, 
and you don't want to be locked in to the first draft.

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EPH3PGNKUBUZK26Z2M4SQSPUVIGXZUNB/


[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)

2022-03-09 Thread Jim J. Jewett
> "periodically reset the refcount for immortal objects (only enable this
> if a stable ABI extension is imported?)" -- that sounds quite expensive, 
> both at runtime and maintenance-wise.

As I understand it, the plan is to represent an immortal object by setting two 
high-order bits to 1.  The higher bit is the actual test, and the one 
representing half of that is a safety margin.

When reducing the reference count, CPython already checks whether the 
refcount's new value is 0.  It could instead check whether 
refcount & ~immortal_bit is 0 (that is, whether everything except the immortal 
bit is 0), which would also detect when the safety margin has been reduced 
to 0 -- and could then add it back in.  Since the bit manipulation is not 
conditional, the only extra branch will occur when an object is about to be 
de-allocated, and that might be rare enough to be an acceptable cost.  (It 
still doesn't prevent rollover from too many increfs, but ... that should 
indeed be rare in the wild.)
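
A toy model of that check, written in Python rather than C so it can be run 
directly.  Everything here is an assumption for illustration: the constant 
names, the 4-bit layout (real refcounts would use the top bits of a much wider 
field), and the decref function itself.  It is not how CPython implements 
Py_DECREF.

```python
# Tiny bit positions so the wraparound is easy to follow by hand.
IMMORTAL_BIT = 1 << 3  # the bit that is actually tested
MARGIN_BIT = 1 << 2    # half of that, kept as a safety margin
IMMORTAL_REFCOUNT = IMMORTAL_BIT | MARGIN_BIT


def decref(refcount):
    """Return (new_refcount, should_deallocate) for one decrement."""
    refcount -= 1
    # Unconditional mask: zero means either a dead ordinary object or an
    # immortal object whose safety margin has just been used up.
    if (refcount & ~IMMORTAL_BIT) == 0:
        if refcount & IMMORTAL_BIT:
            return refcount | MARGIN_BIT, False  # restore the margin
        return refcount, True                    # ordinary object: free it
    return refcount, False


# An immortal object survives far more decrefs than its margin allows.
count, freed = IMMORTAL_REFCOUNT, False
for _ in range(100):
    count, dead = decref(count)
    freed = freed or dead
```

The extra branch sits inside the path that already had to branch for 
deallocation, which is the point of the suggestion.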

-jJ
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O324Q4KMMXL2UHOQIZZWS52U7YHJGYEI/


[Python-Dev] Re: PEP 684: A Per-Interpreter GIL

2022-03-11 Thread Jim J. Jewett
> Is ``allow_all_extensions`` the best name for the context manager?

Nope.  I'm pretty sure that "parallel processing via multiple simultaneous 
interpreters" won't be the only reason people ever want to exclude certain 
extensions.

It might be easier to express that through package or module name, but 
importlib and util aren't specific enough. 

For an example of an extension that works with multiple interpreters but only 
if they share a single GIL ... why wouldn't that apply to any extension 
designed to work with a Singleton external resource?  For example, the 
interpreters could all share a single database connection, and repurpose the 
GIL to ensure that there isn't a thread (or interpreter) switch mid-transaction.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RUDVIEDDCNFDRBIQVQU334GMPW77ZNOK/


[Python-Dev] Re: PEP 684: A Per-Interpreter GIL

2022-03-14 Thread Jim J. Jewett
> That sounds like a horrible idea. The GIL should never be held during an
> I/O operation.

For a greenfield design, I agree that it would be perverse.  But I thought we 
were talking about affordances for transitions from code that was written 
without consideration of multiple interpreters.  In those cases, the GIL can be 
a way of saying "OK, this is the part where I haven't thought things through 
yet."  Using a more fine-grained lock would be better, but would take a lot 
more work and be more error-prone.

For a legacy system, I've seen plenty of situations where a blunt (but simple) 
hammer like "Grab the GIL" would still be a huge improvement from the status 
quo.  And those situations tend to occur with the sort of clients where 
"Brutally inefficient, but it does work because the fragile parts are 
guaranteed by an external tool" is the right tradeoff.
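
As an analogy only, here is the same tradeoff with ordinary threads and one 
coarse lock.  This does not touch the real GIL or subinterpreters; BIG_LOCK, 
ledger and the function names are invented for the sketch.

```python
import threading

# One blunt lock standing in for "grab the GIL": simple, and obviously correct.
BIG_LOCK = threading.Lock()
ledger = {"balance": 0}


def legacy_transaction(amount):
    """Read-modify-write that was never designed for concurrent callers."""
    with BIG_LOCK:
        current = ledger["balance"]
        ledger["balance"] = current + amount


def worker(repeats):
    for _ in range(repeats):
        legacy_transaction(1)


def run_workers(workers=8, repeats=1000):
    threads = [
        threading.Thread(target=worker, args=(repeats,)) for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return ledger["balance"]


final_balance = run_workers()
```

Nothing runs in parallel inside the lock, which is the inefficiency; in exchange, 
nobody had to audit the legacy code for finer-grained locking.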
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AAWSCUNVS2NUXRHVATO736KM6I5M6RK5/


[Python-Dev] Re: Proto-PEP part 1: Forward declaration of classes

2022-04-25 Thread Jim J. Jewett
There is an important difference between monkeypatching in general and 
monkeypatching an object that was explicitly marked and documented as 
expecting a monkeypatch.

(That said, my personal opinion is that this is pretty heavyweight for very 
little gain; why not just create a placeholder class that static analysis tools 
are supposed to recognize as likely-to-be-replaced later?  And why not just 
use strings giving the expected eventual class name?  It isn't as though the 
analysis can verify whether something actually meets the full intended contract 
before it has also parsed the continuation.)
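
The string alternative already works today, as in this sketch.  The classes 
Node and Tree are made up, and passing localns explicitly is just to keep the 
example independent of where it is executed.

```python
import typing


class Node:
    # "Tree" does not exist yet; the string names the expected eventual class.
    def attach(self, owner: "Tree") -> None:
        self.owner = owner


class Tree:
    def __init__(self):
        self.nodes = []


# Stored as a plain string until something asks for the real class.
raw = Node.attach.__annotations__["owner"]

# Resolution happens only once the full definition is available.
resolved = typing.get_type_hints(Node.attach, localns={"Tree": Tree})
```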
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JCPKY36RLN5WEFET34EHM4SC6STIJIUC/

