[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)
ptmcg@austin.rr.com wrote:
> ... add a cautionary section on homoglyphs, specifically citing
> “A” (LATIN CAPITAL LETTER A) and “Α” (GREEK CAPITAL LETTER ALPHA)
> as an example problem pair.

There is a Unicode tech report about confusables, but it is never clear where to stop. Are I (upper case I), l (lower case l) and 1 (numeric 1) from ASCII already a problem? And if we do it at all, is there any way to avoid making Cyrillic languages second-class?

I'm not quickly finding the contemporary report, but these should be helpful if you want to go deeper:

http://www.unicode.org/reports/tr36/
http://unicode.org/reports/tr36/confusables.txt
https://util.unicode.org/UnicodeJsps/confusables.jsp

> I wanted to look a little further at the use of characters in identifiers
> beyond the standard 7-bit ASCII, and so I found some of these same
> issues dealing with Unicode NFKC normalization. The first discovery was
> the overlapping normalization of “ªº” with “ao”.

Here I don't see the problem. Things that look slightly different are really the same, and you can write it either way. So you can use what looks like a funny font, but the closest it comes to a security risk is that maybe you could access something without a casual reader realizing that you are doing so. They would know that you *could* access it, just not that you *did*.

> Some other discoveries:
> “·” (ASCII 183) is a valid identifier body character, making “_···” a valid
> Python identifier.

That and the apostrophe are Unicode consortium regrets, because they are normally punctuation, but there are also languages that use them as letters.

The apostrophe is (supposedly only) used by Afrikaans. I asked a native speaker about where and how often it was used, and the similarity to Dutch was enough that Guido felt comfortable excluding it. (It *may* have been similar to using the apostrophe for a contraction in English, and saying it therefore represents a letter, but the scope was clearly smaller.)

But the dot is used in Catalan, and ... we didn't find anyone ready to say it wouldn't be needed for sensible identifiers. It is worth listing as a warning, and linters should probably complain.

> “_” seems to be a special case for normalization. Only the ASCII “_”
> character is valid as a leading identifier character; the Unicode
> characters that normalize to “_” (any of the characters in “︳︴﹍﹎﹏＿”)
> can only be used as identifier body characters. “︳” especially could be
> misread as “|” followed by a space, when it actually normalizes to “_”.

So go ahead and warn, but it isn't clear how that could be abused to look like something other than a syntax error, except maybe through soft keywords. (Ha! I snuck in a call to async︳def that had been imported with *, and you didn't worry about the import *, or the apparently wild cursor position marker, or the strange async definition that was never used! No way I could have just issued a call to _flush and done the same thing!)

> Potential beneficial uses:
> I am considering taking my transformer code and experimenting with an
> orthogonal approach to syntax highlighting, using Unicode groups
> instead of colors. Module names using characters from one group,
> builtins from another, program variables from another, maybe
> distinguish local from global variables. Colorizing has always been an
> obvious syntax highlight feature, but is an accessibility issue for those
> with difficulty distinguishing colors.

I kind of like the idea, but ... if you're doing it on-the-fly in the editor, you could just use different fonts. If you're actually saving those changes, it seems likely to lead to a lot of spurious diffs if anyone uses a different editor.
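The normalization cases discussed above can be checked with the standard library. This is a minimal sketch using only `unicodedata.normalize` and `str.isidentifier`; the helper name `nfkc` is made up for illustration.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Return the NFKC form, the normalization Python applies to identifiers."""
    return unicodedata.normalize("NFKC", text)


# The ordinal indicators U+00AA and U+00BA fold to plain "ao".
print(nfkc("\u00aa\u00ba"))

# Six characters that fold to the ASCII underscore.
low_lines = "\ufe33\ufe34\ufe4d\ufe4e\ufe4f\uff3f"
print([nfkc(ch) for ch in low_lines])

# MIDDLE DOT (U+00B7) is accepted in the body of an identifier.
print("_\u00b7\u00b7\u00b7".isidentifier())

# Homoglyphs are untouched: GREEK CAPITAL LETTER ALPHA does not become Latin "A".
print(nfkc("\u0391") == "A")
```

NFKC only merges compatibility variants, so it does nothing for cross-script lookalikes such as Latin A and Greek Alpha; those need a separate confusables check.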
-jJ ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NPTL43EVT2FF76LXIBBWVHDU6NXH3HF5/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)
Compatibility variants can look different, but they can also look identical. Allowing any non-ASCII characters was worrisome because of the security implications of confusables. Squashing compatibility characters seemed the more conservative choice at the time. Stestagg's example:

    е = lambda е, e: е if е > e else e

shows it wasn't perfect, but adding more invisible differences does have risks, even beyond the backwards incompatibility and the problem with (hopefully rare, but are we sure?) editors that don't distinguish between them in the way a programming language would prefer.

I think (but won't swear) that there were also several problematic characters that really should have been treated as (at most) glyph variants, but ... weren't. If I recall correctly, the largest number were Arabic presentation forms, but there were also a few characters that were in Unicode only to support round-trip conversion with a legacy charset, even if that charset had been declared buggy. In at least a few of these cases, it seemed likely that a beginning user would expect them to be equivalent.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GNT3AG2SCVLMCJAZXSTIWFKKAYG25E7O/
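Stestagg's example depends on a Cyrillic letter that renders like Latin "e". One rough way to surface that kind of mixing is to look at the Unicode character names. The sketch below is a heuristic only: the function name `scripts_in` is hypothetical, and the first word of a character name is merely an approximation of the Unicode Script property, which the standard library does not expose directly.

```python
import unicodedata


def scripts_in(identifier: str) -> set[str]:
    """Guess the script of each letter from the first word of its Unicode name.

    A rough heuristic, not the Script property from the Unicode database.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in identifier
        if ch.isalpha()
    }


# Latin e, Cyrillic ie, and a name that mixes the two scripts.
for name in ["e", "\u0435", "p\u0430yload"]:
    print(ascii(name), sorted(scripts_in(name)))
```

A linter could flag any identifier for which this returns more than one script, although that would also catch some legitimate names.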
[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)
Stephen J. Turnbull wrote:
> Christopher Barker writes:
> > For example, in writing math we often use different scripts to mean
> > different things (e.g. TeX's Blackboard Bold). So if I were to use
> > some of the Unicode Mathematical Alphanumeric Symbols, I wouldn't
> > want them to get normalized.

Agreed, for careful writers. But Stephen's answer about people using the wrong one and expecting it to work means that normalization is probably the lesser of evils for most people, and the ones who don't want it normalized are more likely to be able to specify custom processing when it is important enough. (The compatibility characters aren't normalized in strings, largely because that should still be possible.)

> In fact, I think adding these symbols to Unicode was a bad idea; they
> should be handled at a higher level in the linguistic stack (by
> semantic markup).

When I was a math student, these were clearly different symbols, with much less relation to each other than a mere case difference. So by the Unicode consortium's goals, they are independent characters that should each be defined. I admit that isn't ideal for most use cases outside of math, but ... supporting those other cases is what compatibility normalization is for.

> It's also a UX problem. At slightly higher layer in the stack, I'm
> used to using Japanese input methods to input sigma and pi which
> produce characters in the Greek block, and at least the upper case
> forms that denote sum and product have separate characters in the math
> operators block. I understand why people who literally write
> mathematics in Greek might want those not normalized, but I sure am
> going to keep using "Greek sigma", not "math sigma"! The probability
> that I'm going to have a Greek uppercase sigma in my papers is nil,
> the probability of a summation symbol near unity. But the summation
> symbol is not easily available, I have to scroll through all the
> preceding Unicode blocks to find Mathematical Operators. So I am
> perfectly happy with uppercase Greek sigma for that role (as is
> XeTeX!!)

I think that is mostly a backwards compatibility problem; XeTeX itself had to worry about compatibility with TeX (which preceded Unicode) and with the fonts actually available and then with earlier versions of XeTeX.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JNFLAQUKNCWCJSMBNJZGHVD5ZELOTU6G/
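The difference between identifiers (normalized) and strings (left alone) mentioned in this message can be seen directly. A small sketch, using MATHEMATICAL BOLD SMALL X as an arbitrary sample character:

```python
import unicodedata

bold_x = "\U0001d431"  # MATHEMATICAL BOLD SMALL X

# As string data, the bold letter stays distinct from ASCII "x".
print(bold_x == "x")

# As an identifier, the compiler applies NFKC, so the binding is stored as "x".
namespace = {}
exec(f"{bold_x} = 1", namespace)
print("x" in namespace, bold_x in namespace)

# Code that wants the folded form for its own strings has to ask for it.
print(unicodedata.normalize("NFKC", bold_x) == "x")
```

So a careful writer can still keep the distinct symbols in data, while a careless one who types the wrong variant in a name gets the binding they probably expected.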
[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)
Steven D'Aprano wrote:
> I think
> that many editors in common use don't support bidirectional text, or at
> least the ones I use don't seem to support it fully or correctly.
...
> But, if there is a concrete threat beyond "it looks weird", that is
> another issue.

Based on the original post (and how it looked in my web browser, after various automated reformattings), it seems that one of the failure modes that buggy editors have is that stuff can be part of the code, even though it looks like part of a comment, or vice versa.

This problem might be limited to only some of the bidi controls, and there might even be a workaround specific to # ... but it is an issue. I do not currently have an opinion on how important an issue it is, or how adequate the workarounds are.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ECO4R655UGPCVFFVAOQZ3DUZVHQY75BX/
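Whatever an editor chooses to display, the bidi controls themselves are easy to find mechanically. The following is a minimal sketch of such a scan; the function name `find_bidi_controls` and the sample source are invented for illustration, and it checks only the explicit embedding, override, and isolate controls.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def find_bidi_controls(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, character name) for each bidi control in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line):
            if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES:
                hits.append((lineno, col, unicodedata.name(ch, "UNKNOWN")))
    return hits


# A comment that contains a RIGHT-TO-LEFT OVERRIDE (U+202E).
sample = "x = 1\naccess = 'user'  # \u202e check is_admin\n"
for hit in find_bidi_controls(sample):
    print(hit)
```

A check like this says nothing about whether a given use is malicious; it only reports where a reviewer should look more closely.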
[Python-Dev] Re: The current state of typing PEPs
Paul Moore wrote:
> More hazy memories here, but I think the original proposal left open
> the possibility of annotations not being types at all - for example,
> being docstrings for the arguments, or option names for a "function
> call to CLI" tool, etc.

Absolutely. While it was clear that Guido's own use cases were about typing, annotations were explicitly not limited to typing, which is one reason why some of the later changes have felt to some people like bait and switch. Maybe it is already too late to avoid that.

> ... the expectation was that annotations
> would be *types*,

Even from the start, it was assumed that they would be objects. (Types specifically were expected to be common, but not universal.) The particular way strings are being substituted for evaluated objects has sometimes reminded me of raising a string instead of an exception class/object. It will work, but it can seem sloppy, and it can be annoying if you were assuming otherwise and suddenly have to add a bunch of evals.

(That said, I haven't yet been sufficiently motivated to even tease out exactly what the problems are, let alone to propose an alternative that also satisfies the typing fans -- in part because it feels like the obvious optimization is to just not run typing, and it isn't clear what middle grounds are generally worthwhile.)

> ... personally, I have the same discomfort
> about using explicit string annotations for forward references, it
> feels like I'm not declaring a "proper type".

> If what I say above is right, the debate here isn't about whether
> annotations "are for types", but rather about whether reading the
> types in annotations and using them to affect behaviour *at runtime*
> is a legitimate use of annotations.

I see that as a second dispute, which I had previously missed. I think you're right, though. On the other hand, I'm not sure the solution to both isn't just a helper function that does the 2nd-pass resolution -- preferably without requiring that all the rest of typing be imported, since even the people who want to use the typing package agree that importing it is not lightweight.

> ... I lurk on the typing-sig, and from an outsider's perspective, the
> participants seem to be almost entirely designers or heavy users of
> static type checkers. That gives a certain emphasis to the proposals
> coming from that group.

At times, it sort of reminds me of OWL and "Semantic Web". There are plenty of people who will want to use annotations as a tool, but won't be willing to wade through what can feel like "How many angels can dance on the head of a pin?" discussions. That said, I'm not sure how to best reach people who just want a rough-and-ready usually-good-enough tool.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JL3AQ4UVEVWXYTRFAVGVHNT23W2NCUDI/
[Python-Dev] Re: The current state of typing PEPs
Steven D'Aprano wrote:
> On Sat, Nov 20, 2021 at 11:46:56PM -0800, Christopher Barker wrote:
> Maybe PEP 563 could include a decorator in the typing module to
> destringify all the annotations in a class or function?

If it were in an annotations module, that would probably be sufficient.

If it is in typing, then it is a very heavyweight dependency -- heavy enough that even the people actually using that module for development (and not for production runs) are worried about the costs. If the costs of the typing module are that high, it is not acceptable to impose them on people not otherwise using the module.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZQHP24T2PKRTDBGZ36KLBHLLOKITP5ON/
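A destringifying decorator of the kind discussed here does not need anything from typing. The sketch below is one hypothetical shape for it (the name `destringify` is invented, not an existing API): it evaluates string annotations in the function's own global namespace and leaves anything it cannot resolve as a string.

```python
def destringify(func):
    """Replace string annotations on func with the objects they name.

    Hypothetical helper: strings are evaluated against the function's
    globals, and anything that fails to evaluate is left as a string.
    """
    resolved = {}
    for name, value in func.__annotations__.items():
        if isinstance(value, str):
            try:
                value = eval(value, func.__globals__)
            except Exception:
                pass  # leave unresolved forward references untouched
        resolved[name] = value
    func.__annotations__ = resolved
    return func


@destringify
def scale(x: "int", factor: "float" = 2.0) -> "float":
    return x * factor


print(scale.__annotations__)
```

Since Python 3.10, `inspect.get_annotations(obj, eval_str=True)` offers a similar on-demand evaluation from the inspect module rather than typing.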
[Python-Dev] Re: Explicit markers for special C-API situations (re: Clarification regarding Stable ABI and _Py_*)
Christian Heimes wrote:
> On 09/12/2021 19.26, Petr Viktorin wrote:
> > If the code is the authoritative source of truth, we need a proper
> > parser to extract the information. ... unfortunately I don't trust it
> > enough to let it define the API. Bugs in the parser could result in
> > the API definition silently changing.
> There are other options than writing a new parser. GCC and Clang are
> flexible. For example GCC can be extended with plugins and custom
> attributes.

But they have the same problem ... it can be difficult to know if there is a subtle bug in someone's understanding of how the plugin interacts with, for example, nested ifndef.

The failure mode for an explicit, manually maintained text file is that something doesn't get added when it should, and the more conservative API consumers wait an extra release before using it.

-jJ

> We could extend the header files with custom attributes and
> then use a plugin to create an ABI file from the attributes.
> I created a quick n' hack
> https://github.com/python/cpython/compare/main...tiran:gcc-pythonapi-plugin?...
> as proof of concept.
> The plugin takes
> PyAPI_ABI_FUNC(PyObject *) PyLong_FromLong(long);
> and dumps the declaration as:
> extern struct PyObject * PyLong_FromLong (long int); "abi_func"
> Christian

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PKQFEIK75EWVTNMLB5CGBYLQANZG6QJH/
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
> In Python 3.11, Python still implements around 100 types as "static
> types" which are not compatible with subinterpreters, like
> &PyLong_Type and &PyUnicode_Type. I opened
> https://bugs.python.org/issue40601 about these static types, but it
> seems like changing it may break the C API *and* the stable ABI (maybe
> a clever hack will avoid that).

If sub-interpreters each need their own copy of even immutable built-in types, then what advantage do they have over separate processes?

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/B7WO5B426HBTG6KZVKQXTJSBQL2S2ILQ/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
Immortal objects shouldn't be reclaimed by garbage collection, but they still count as potential external roots for non-cyclic liveness.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FMIUHY6K3UUAUTK7GDTTOO4ULXO74QMP/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
How common is it to reload a module in production code? It seems like "object created at the module level" (excluding __main__) is at least as good a heuristic for immortality as "string that meets the syntactic requirements for an identifier". Perhaps also anything created as part of class creation (as opposed to instance initialization).

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/F3IEICCQTKGZMRX3L4JS4NEZZNXVMZGA/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
Guido van Rossum wrote:
> On Wed, Dec 15, 2021 at 6:57 PM Jim J. Jewett jimjjew...@gmail.com wrote:
> > Immortal objects shouldn't be reclaimed by garbage collection, but they
> > still count as potential external roots for non-cyclic liveness.
> So everything referenced by an immortal object should also be made immortal

Why? As long as you can get a list of all immortal objects (and a traversal function from each), this is just an extra step (annoying, but tolerable) that removes a bunch of objects from the pool of potential garbage before you even begin looking for cycles.

> -- even its type. Hence immortal objects must be immutable.

This is probably a good idea, since avoiding changes also avoids races and Copy on Write and cache propagation, etc ... but I don't see why it is *needed*, rather than helpful.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4KY5XSHRMP3F3CWAW2OUW4NRXN4AB7EM/
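The "extra step" described in this message can be pictured with a toy reachability pass. This is only a model of the idea, not how CPython's cycle collector works: objects are plain names, `edges` stands in for each object's traversal function, and the function name `find_garbage` is invented.

```python
def find_garbage(edges, candidates, roots, immortals):
    """Return the candidates not reachable from any root or immortal object.

    edges maps each object name to the names it references.
    """
    live = set()
    stack = list(roots) + list(immortals)  # immortals act as extra roots
    while stack:
        obj = stack.pop()
        if obj in live:
            continue
        live.add(obj)
        stack.extend(edges.get(obj, ()))
    return set(candidates) - live


edges = {
    "immortal_type": ["method_table"],
    "a": ["b"],
    "b": ["a"],  # a and b form an unreachable cycle
    "c": ["immortal_type"],
}
candidates = {"method_table", "a", "b", "c"}
print(sorted(find_garbage(edges, candidates, roots=["c"], immortals=["immortal_type"])))
```

In this model `method_table` survives because an immortal object references it, without itself having to be immortal.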
[Python-Dev] Re: subinterpreters and their possible impact on large extension projects
Petr Viktorin wrote:
> > > In Python 3.11, Python still implements around 100 types as "static
> > > types" which are not compatible with subinterpreters, ...
> > > seems like changing it may break the C API *and* the stable ABI
> > If sub-interpreters each need their own copy of even immutable built-in
> > types, then what advantage do they have over separate processes?
> They need copies of all *Python* objects. A non-Python library may allow
> several Python wrappers/proxies for a single internal object,
> effectively sharing that object between subinterpreters.
> (Which is a problem for removing the GIL -- currently all operations
> done by such wrappers are protected by the GIL.)

OK, so what is the advantage of having multiple interpreters?

The only advantage I can see is that if you're embedding what are essentially several distinct python processes, you can still keep them all inside the single process used by the embedding program. But that seems pretty far along the "they're already compiling anyhow; so the ABI isn't crucial" path.

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/C2Z2RPRAIGYDODATM5BQQL6DA6LEOVVN/
[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL
Why are Immutability and transitive Immortality needed to share an object across interpreters?

Are you assuming that a change in one interpreter should not be seen by others? (Typical case, but not always true.)

Or are you saying that there is a technical problem such that a change -- even just to the reference count of a referenced string or something -- would cause data corruption? (If so, could you explain why, or at least point me in the general direction?)

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/E3XKSDEDOLHBFFUS2TXGDSLV7YOQUZJB/
[Python-Dev] Re: Function Prototypes
Steven D'Aprano wrote:
> In comparison, Mark's version:
> @Callable
> def IntToIntFunc(a:int)->int:
>     pass
> # in the type declaration
> func: IntToIntFunc
> uses 54 characters, plus spaces and newlines (including 7 punctuation
> characters); it takes up three extra lines, plus a blank line. As
> syntax goes it is double the size of Callable.

I think it takes only the characters needed to write the name IntToIntFunc. The @Callable def section is a one-time definition, and not logically part of each function definition where it is used. I get that some people prefer an inline lambda to a named function, and others hate naming an infrastructure function, but ...

Why are you even bothering to type the callback function? If it is complicated enough to be worth explicitly typing, then it is complicated enough to chunk off with a name.

I won't say it is impossible to understand a function signature on the first pass if it takes several lines and whitespace to write ... but it is much easier when the declaration is short enough to fit on a single line.

An @ on the line above complicates the signature parsing, but can be mentally processed separately. The same is true of a named-something-or-other in the middle.

Having to switch parsing modes to understand an internal ([int, float, int] -> List[int]), and then to pop that back off the stack is much harder. Hard enough that you really ought to help your reader out with a name, and let them figure out what that name means separately, when their brain's working memory isn't already loaded with the first part of your own function, but still waiting for the last part.

> It separates the type declaration from the point at which it is used,
> potentially far away from where it is used.

The sort of code that passes around functions tends to pass around many functions, but with only a few signatures. If this is really the only time you'll need that signature (not even when you create the functions that will be passed from a calling site?), then ... great. But be nice to your reader anyhow, unless the signature is really so simple that the type-checking software should infer it for you. Then be nice by leaving it out as cruft.

[As an aside, I would see some advantage to

    def myfunc(f:like blobfunc)

pointing to an exemplar instead of a specifically constructed function-type. You discuss this later as either ... f:blobfunc ... or ... f:blobfunc=blobfunc ... and I would support those, if other issues can be worked out.]

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZDCSTHMZVSILZZMGI3GTTBTWB53ZRJOI/
[Python-Dev] Re: Function Prototypes
Steven D'Aprano wrote:
> > > ... double the size of Callable.
> > I think it takes only the characters needed to write the name IntToIntFunc.
> ... you may only use it once.

Could you provide an example where it is only used once? The only way I can imagine is that you use it here when defining your complicated function that takes a callback (once), but then you never actually call that complicated function, even from test code, nor do you expect your users to do so.

> The status quo is that we can use an anonymous type in the annotation
> without pre-defining it, using Callable.

OK. I'm not sure it would be a good idea, but we agree it is legal.

> PEP 677 proposes a new, more compact syntax for the same.

Does it? I agree that "(int, float) -> bool" is more compact than typing.Callable[...], but that feels like optimizing for the wrong thing.

I dislike the PEP's flat_map as an example, because it is the sort of infrastructure function that carries no semantic meaning, but ... I'll use it anyhow.

    def flat_map(l, func):
        out = []
        for element in l:
            out.extend(func(element))
        return out

    def wrap(x: int) -> list[int]:
        return [x]

    def add(x: int, y: int) -> int:
        return x + y

It is reasonable to add a docstring to flat_map, but I grant that this doesn't work as well with tooling that might involve not actually seeing the function. I agree that adding a long prefix of:

    from typing import Callable

    def flat_map(
        l: list[int],
        func: Callable[[int], list[int]]
    ) -> list[int]:

is undesirable. But the biggest problem is not that "Callable..." has too many characters; the problem is that "Callable[[...], list[...]]" requires too many levels of sub-parsing.

The PEP doesn't actually say what it proposes, [and you've suggested that my earlier attempt was slightly off, which may not bode well for likelihood of typos], but I'll *guess* that you prefer:

    def flat_map(
        l: list[int],
        func: ((int) ->[int])
    ) -> list[int]:

which is slightly shorter physically, but not much simpler mentally. It therefore creates an attractive nuisance.

    def flat_map(
        l: list[int],
        func: wrap
    ) -> list[int]:

on the other hand, lets you read this definition without having to figure out what "wrap" does at the same time. "wrap" is a particularly bad example (because of the lack of semantic content in this example), but I think it still easily beats the proposed new solution, simply because it creates a "you don't need to peer below this right now" barrier.

> Any proposal for function prototypes using
> `def` is directly competing against Callable or arrow syntax for the
> common case that we want an anonymous, unnamed type written in place.

I'm saying that catering to that "common" case is a trap, often leading you to a local optimum that is bad globally.

> But if we can use an existing function as the prototype instead of
> having to declare the prototype, that shifts the balance.

I agree that re-using an existing function with the correct signature is better, *even* when that function doesn't make a good default.

...

> I would say the opposite: most callback or key functions have very
> simple signatures.
> If my function takes a key function, let's say:
> def spam(mylist:[str],
>          a: int,
>          b: float,
>          c: bool|None,
>          key: Callable[[str], str],
>          ) -> Eggs:
>     mylist = sorted(mylist, key=key)
>     ...
> the relevant signature is (str) -> str. Do we really need to give that a
> predefined named prototype?
> def StrToStr(s: str) -> str: pass

If you really care about enforcing the str, then yes, it is worth saying

    key: str_key

and defining the str_key function as an example

    def str_key(data:str)->str:
        return str(data)

> I would argue that very few people would bother.

Because it would usually be silly to care that the list really contained strings, as opposed to "something sortable".

So if you do care, it is worth making your requirement stand out, instead of losing it in a pile of what looks like boilerplate.
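For comparison, here is what "give the signature a name" looks like with syntax that is already legal. This is a sketch using a type alias, which is not the `def`-based prototype or the exemplar-function annotation argued for above; the alias name `IntToInts` is invented for illustration.

```python
from collections.abc import Callable

# A named signature, defined once and reused wherever the callback appears.
IntToInts = Callable[[int], list[int]]


def flat_map(items: list[int], func: IntToInts) -> list[int]:
    out = []
    for element in items:
        out.extend(func(element))
    return out


def wrap(x: int) -> list[int]:
    return [x]


print(flat_map([1, 2, 3], wrap))
```

The reader of flat_map sees a single name in the parameter list and can look up the nested brackets separately, which is the barrier this message is asking for.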
-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/32QYLA7UFTC54UM3CO3REIH57WLLBL6H/
[Python-Dev] Re: PEP 646 (Variadic Generics): final call for comments
I'm seeing enough different interpretations to think things aren't quite specified -- but I'm not sure if it matters.

(1) Is any of this something that should affect computation, or is it really just a question of how to interpret possibly ambiguous documentation?

(2) Are any of these troubling cases something that a person should actually write for a normal situation? Or are they just arguments about which abbreviations are acceptable? Or about how automatically-generated (inferred) type descriptions should be written?

(3) Are the slice-expansion questions all assumed to be indexing an n-dimensional array, as opposed to [start, stop, step]? Is that explicit in the PEP, and just not in the extracts here?

(4) Expanding multiple * shouldn't be ambiguous; the problem is figuring out what to condense into which if two are adjacent. So

    s1, s2 = [a, b], (1, 2, 3)
    [*s1, *s2]

should turn into [a, b, 1, 2, 3]. The problem is that

    [*s3, *s4] = (a, b, 1, 2, 3)

is ambiguous ... and I didn't really get that distinction from Petr's question or the answers. I can't tell whether I've missed something crucial, or others are arguing over angels on a pinhead ... so whatever the PEP ends up deciding, it should be explicit.

(I *think* the earlier parts of this thread are consistent with this, and discussing whether to say explicitly that certain formats are forbidden (but maybe not enforced by the grammar), meaningless, or valid but currently meaningless outside of typing.)

-jJ

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YH4URO5EDQODG4QMGOCSXHV6RYTMLK5M/
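The asymmetry in point (4) already exists for ordinary runtime values, which may help when reading the PEP's type-level version. A short sketch; the string values bound to `a` and `b` are placeholders.

```python
a, b = "a", "b"
s1, s2 = [a, b], (1, 2, 3)

# Expanding several starred values in a display is well defined.
combined = [*s1, *s2]
print(combined)

# Splitting one sequence between two starred targets is not: the compiler
# rejects it because there is no unique way to divide the items.
try:
    compile("[*s3, *s4] = combined", "<example>", "exec")
except SyntaxError as exc:
    print("SyntaxError:", exc.msg)
```

Whether the same rule should apply to unpacked type variable tuples is the question the PEP needs to answer explicitly.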
[Python-Dev] Re: Suggestion: a little language for type definitions
Steven D'Aprano wrote: > On Sat, Jan 08, 2022 at 12:59:38AM +0100, jack.jan...@cwi.nl wrote: > For example, the arrow syntax for Callable `(int) -> str` (if accepted) > could be a plain old Python expression, usable anywhere the plain old > Python expression `Callable[[int], str]` would be. In principle, yes. In practice, I think the precedence of "->" might be tricky, particularly if the (int) part discourages people from wrapping the full expression in parentheses. > > What if we created a little language that is clearly flagged, for > > example as t”….” or t’….’? Then we could simply define the > > typestring language to be readable, so you could indeed say t”(int, > > str) -> bool”. And we could even allow escapes (similar to > > f-strings) so that the previous expression could also be specified, > > if you really wanted to, as t”{typing.Callable[[int, str], bool}”. > The following are not rhetorical questions. I don't know the answers, > which is why I am asking. > 1. Are these t-strings only usable inside annotations, or are they > expressions that are usable everywhere? I assume they *could* be used anywhere, there just wouldn't be huge reasons to do so. Sort of like a string expression can be an entire statement; there just usually isn't much reason (except as a docstring) to do it. > 2. If only usable inside annotations, why bother with the extra prefix > t" and suffix "? What benefit do they give versus just the rule > "annotations use this syntax"? It provides a useful box around the typing, so that people who are not currently worried about typing can more easily concentrate on the portion they do currently care about. > 3. If usable outside of annotations, what runtime effect do they have? They create a string. Which may or may not be a useful thing to do. > The t-string must evaluate to an object. What object? A string. 
The various "let us delay annotation evaluation" proposals have made it clear that the people actually using typing don't want it to slow things down for extra evaluation until they explicitly call for that evaluation, perhaps as part of a special run.

> 4. If the syntax allowed inside the t-string is specified as part of the
> Python language definition, why do we need the prefix and suffix?

Same answer as number 2 ... it allows typing to be a less intrusive neighbor. I don't think t" " is as good as some sort of braces, but ... we're out of conventional braces available in ASCII.

> Likewise, if this is allowed:
> def func(arr: t"array [1...10] of int") -> str: ...

How many arguments do I pass to func? That is already tricky to see at a glance, but

> def func(arr: array [1...10] of int) -> str: ...

is even more difficult to parse. By the time I've mentally attached the "of" and "int" to the indexed (but not really) array that just describes a type, I've forgotten what I was looking for and why.

> 5. What difference, if any, is there between `t"{expression}"` and
> `expression`?

In addition to the box (so readers can more easily filter it out), there is also a flag to typing systems saying that they *should* elaborate the string. What they elaborate it into will be very different from a regular string. That won't happen every time the module is imported, but it will happen when the string is actually needed for something.

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HY7I52ILYN7IYN2S6UQT57XV4R3YEC2Z/
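The "it is just a string until someone asks for more" behaviour described in the message above already exists for ordinary quoted annotations, which may make the idea easier to picture. The sketch below uses plain string annotations, not the proposed t"..." syntax (which does not exist), and the function name `describe` is hypothetical.

```python
import typing
from typing import Callable


# Plain quoted annotations stand in for the proposed t"..." strings.
def describe(func: "Callable[[int], str]", value: "int") -> "str":
    return func(value)


# At definition and import time the annotations are only stored as strings;
# nothing inside them has been evaluated.
raw = describe.__annotations__
print(raw["func"])

# A tool that actually needs the types asks for the elaboration explicitly.
resolved = typing.get_type_hints(describe)
print(resolved["func"])
```

The cost of evaluation is paid only by the caller of `typing.get_type_hints`, which matches the "perhaps as part of a special run" wording above.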
[Python-Dev] Re: Minor inconvenience: f-string not recognized as docstring
Guido van Rossum wrote:
> I personally think F-strings should not be usable as docstrings. If you
> want a dynamically calculated docstring you should assign it dynamically,
> not smuggle it in using a string-like expression. We don't allow "blah {x}
> blah".format(x=1) as a docstring either, nor "foo %s bar" % x.

Nor, last I checked, even "string1" + "string2", even though the result is a compile-time string in the appropriate location.

I think all of these should be allowed, but I'll grant that annotations reduce the need. I'll even admit that scoping issues make the interpolating versions error prone, and the UI to clear that up may be more of a hassle than it is worth.

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZUWTCGK6KZJYCUDRR3JNB7H5W3ZHJWMT/
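For readers who want to see the behaviour under discussion, here is a small sketch. The function names are hypothetical, and the concatenation case is printed but not asserted, since it rests on the "last I checked" in the message above.

```python
name = "world"


def with_literal():
    "plain docstring"


def with_fstring():
    f"hello {name}"  # evaluated and discarded; not stored as the docstring


def with_concat():
    "string1" + "string2"  # the case the message says is also not a docstring


def documented():
    pass


# The alternative suggested in the quoted text: assign the docstring
# dynamically, after the function has been created.
documented.__doc__ = f"hello {name}"

for func in (with_literal, with_fstring, with_concat, documented):
    print(func.__name__, repr(func.__doc__))
```

Only a bare string literal as the first statement is picked up automatically; anything computed has to be attached afterwards.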
[Python-Dev] Re: Should we require IEEE 754 floating-point for CPython?
- Should we require the presence of NaNs in order for CPython to build?
- Should we require IEEE 754 floating-point for CPython-the-implementation?
- Should we require IEEE 754 floating-point for Python-the-language?

I don't have strong opinions on the first two, but for the language definition, I think the most we should say is "if an implementation does not support IEEE 754 floating-point, this must be mentioned in the documentation as an implementation limit."

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YEXX363XX6DS7ZC653RBLIPNQIHBVYTK/
[Python-Dev] Re: It's now time to deprecate the stdlib urllib module
There are problems with urllib. With hindsight, it would have been nice to do a few things differently. But that doesn't make migrating away from it any easier.

This thread has mentioned several "better" alternatives -- but with the exception of 3rd party Requests, the docs don't even mention them. Saying "You can do better, but we won't tell you how" is pretty rude to beginners, and we should not do it.

Delegating to the operating system may be sensible for a production system, and there is nothing wrong with saying so in the docs, and it would be great if we made that easy. But it is absolutely not a reasonable replacement for a straightforward (possibly inefficient and non-scalable) implementation written in Python that people can read and use for reference.

urllib shouldn't be deprecated until we have a better solution to *that* use case that is also in the stdlib. (That might well be worth doing, but it should happen before the deprecation.)

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JI5CFS3WYXQEXKSEZH2ZTE3JJJ7AUAMW/
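As a reminder of how small the "straightforward, readable" use case is, here is a minimal sketch using only `urllib.request`. The helper name `fetch_text` and the 10 second default timeout are assumptions made for illustration. A `data:` URL is used so that the example runs without network access; an ordinary http or https URL would be passed the same way.

```python
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Fetch `url` and decode the body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the response does not name a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# A data: URL keeps the demonstration offline.
text = fetch_text("data:text/plain;charset=utf-8,hello%20from%20urllib")
print(text)
```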
[Python-Dev] Re: It's now time to deprecate the stdlib urllib module
> Why do you think the stdlib *must* provide an example implementation
> for this specific scenario? Is there something unique to HTTP request
> handling that you feel is important to demonstrate?

*must* is too strong, but I would use a very strong *should*.

I think the stdlib should provide simple source-included examples of most things. I think the case is even stronger when it is:

(1) a fairly simple protocol (such as version 1 of HTTP was) -- QUIC wouldn't count for a simple demonstration.

(2) something new users are likely to find motivating. Short of "here is a way to do IO", and maybe "write a simple game", "get something from the web" is probably the most obvious case.

(3) something where bootstrapping might be an issue (network protocols, particularly web downloads). Network access is not an always-available resource. Even when it is available, there is sometimes a barrier between "available in python" and "I could read it on my phone, but can't get it open in python".

(4) something where a beginner is likely to be overwhelmed by choices if we just say "use a 3rd party module".

(5) something with a backwards-compatibility story in the stdlib already.

As a side note, are there concerns about urllib.robotparser being broken or obsolete, or was that part of the deprecation proposal just contagion from urllib.request?

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HF5V6SFWV4BZUAOJTSEBD6DSZWSJONAM/
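On the side note about urllib.robotparser: the module can be exercised without any network access, which makes it easy to check what it does today. The rules and the agent name "example-bot" below are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied directly instead of downloaded.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

allowed = parser.can_fetch("example-bot", "https://example.com/index.html")
blocked = parser.can_fetch("example-bot", "https://example.com/private/data.html")
print(allowed, blocked)
```

Because `parse` accepts an iterable of lines, the parsing logic is usable even where `urllib.request` is not.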
[Python-Dev] Re: Should we require IEEE 754 floating-point for CPython?
I think you skimmed over "A floating point expert can probably look at this ... "

I remember a time when I just assumed more bits was better, and a later time when I figured smaller was better, and a still later time when I wanted to match the published requirements for bitsize. So that was several years when I didn't really understand the tradeoffs, but could have benefited from (or at least written better documentation by) knowing the size.

During those years, I would have recognized the importance of 1024, but would probably not have bothered interpreting 2.220446049250313.

A method (or docstring) with a more friendly interface would be good.

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/R4BW2ZL46Y23UYYQCOSWJ2B3KTSRO5LK/
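One possible shape for the "more friendly interface" mentioned above, as a sketch only. The function name `describe_float` and the key names are hypothetical; the underlying values all come from `sys.float_info`. The "looks like binary64" test is a heuristic on three fields, not a proof of IEEE 754 conformance.

```python
import sys


def describe_float() -> dict:
    """Summarise sys.float_info in plainer terms (hypothetical helper)."""
    info = sys.float_info
    return {
        "radix": info.radix,
        "mant_dig": info.mant_dig,   # digits of precision, in the radix
        "max_exp": info.max_exp,     # the 1024 that is easy to recognise
        "epsilon": info.epsilon,     # gap between 1.0 and the next float
        # Heuristic only: these three values match IEEE 754 binary64.
        "looks_like_binary64": (
            info.radix == 2 and info.mant_dig == 53 and info.max_exp == 1024
        ),
    }


summary = describe_float()
for key, value in summary.items():
    print(f"{key}: {value}")
```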
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"
I suggest being a little more explicit (even blatant) that the particular details of:

(1) which subset of functionally immortal objects are marked as immortal
(2) how to mark something as immortal
(3) how to recognize something as immortal
(4) which memory-management activities are skipped or modified for immortal objects

are not only CPython-specific, but are also private implementation details that are expected to change in subsequent versions.

Ideally, things like the interned string dictionary or the constants from a pyc file will be not merely immortal, but stored in an immortal-only memory page, so that they won't be flushed or CoW-ed when a nearby non-immortal object is modified. Getting those details right will make a difference to performance, and you don't want to be locked in to the first draft.

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/EPH3PGNKUBUZK26Z2M4SQSPUVIGXZUNB/
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
> "periodically reset the refcount for immortal objects (only enable this
> if a stable ABI extension is imported?)" -- that sounds quite expensive,
> both at runtime and maintenance-wise.

As I understand it, the plan is to represent an immortal object by setting two high-order bits to 1. The higher bit is the actual test, and the next bit down (representing half of that value) is a safety margin.

When reducing the reference count, CPython already checks whether the refcount's new value is 0. It could instead check whether refcount & ~immortal_bit is 0, which would detect when the safety margin has been reduced to 0 -- and could then add it back in. Since the bit manipulation is not conditional, the only extra branch will occur when an object is about to be de-allocated, and that might be rare enough to be an acceptable cost. (It still doesn't prevent rollover from too many increfs, but ... that should indeed be rare in the wild.)

-jJ
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/O324Q4KMMXL2UHOQIZZWS52U7YHJGYEI/
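The masking idea in the message above can be simulated in a few lines. This is a toy model of the suggestion, not CPython's implementation: the 8-bit width, the constant names, and the `decref` function are all assumptions chosen so the behaviour is easy to trace by hand.

```python
# Toy model with an 8-bit refcount; real bit positions are an
# implementation detail and are NOT taken from CPython.
IMMORTAL_BIT = 1 << 7                       # the bit that is actually tested
SAFETY_BIT = 1 << 6                         # half of that: the safety margin
IMMORTAL_INITIAL = IMMORTAL_BIT | SAFETY_BIT


def decref(refcount: int) -> tuple[int, bool]:
    """Return (new_refcount, should_deallocate) for one decrement."""
    refcount -= 1
    # Unconditional mask: zero means either a dead mortal object or an
    # immortal object whose safety margin has just run out.
    if (refcount & ~IMMORTAL_BIT) == 0:
        if refcount & IMMORTAL_BIT:
            return IMMORTAL_INITIAL, False  # add the margin back in
        return 0, True                      # ordinary object: deallocate
    return refcount, False


# An immortal object survives any number of decrefs.
count = IMMORTAL_INITIAL
for _ in range(1000):
    count, dead = decref(count)
    assert not dead
print(bin(count))

# A mortal object with two references dies on the second decref.
print(decref(2), decref(1))
```

The extra test sits inside the branch that was already taken for deallocation, which is the cost argument made in the message.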
[Python-Dev] Re: PEP 684: A Per-Interpreter GIL
> Is ``allow_all_extensions`` the best name for the context manager?

Nope. I'm pretty sure that "parallel processing via multiple simultaneous interpreters" won't be the only reason people ever want to exclude certain extensions. It might be easier to express that through package or module name, but importlib and util aren't specific enough.

For an example of an extension that works with multiple interpreters but only if they share a single GIL ... why wouldn't that apply to any extension designed to work with a Singleton external resource? For example, the interpreters could all share a single database connection, and repurpose the GIL to ensure that there isn't a thread (or interpreter) switch mid-transaction.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RUDVIEDDCNFDRBIQVQU334GMPW77ZNOK/
[Python-Dev] Re: PEP 684: A Per-Interpreter GIL
> That sounds like a horrible idea. The GIL should never be held during an
> I/O operation.

For a greenfield design, I agree that it would be perverse. But I thought we were talking about affordances for transitions from code that was written without consideration of multiple interpreters. In those cases, the GIL can be a way of saying "OK, this is the part where I haven't thought things through yet." Using a more fine-grained lock would be better, but would take a lot more work and be more error-prone.

For a legacy system, I've seen plenty of situations where a blunt (but simple) hammer like "Grab the GIL" would still be a huge improvement from the status quo. And those situations tend to occur with the sort of clients where "Brutally inefficient, but it does work because the fragile parts are guaranteed by an external tool" is the right tradeoff.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AAWSCUNVS2NUXRHVATO736KM6I5M6RK5/
[Python-Dev] Re: Proto-PEP part 1: Forward declaration of classes
There is an important difference between monkeypatching in general, vs monkeypatching an object that was explicitly marked and documented as expecting a monkeypatch.

(That said, my personal opinion is that this is pretty heavyweight for very little gain; why not just create a placeholder class that static analysis tools are supposed to recognize as likely-to-be-replaced later? And why not just use strings giving the expected eventual class name? It isn't as though the analysis tools can verify whether something actually meets the full intended contract before they've also parsed the continuation.)

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/JCPKY36RLN5WEFET34EHM4SC6STIJIUC/
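The "just use strings giving the expected eventual class name" alternative from the message above already works with today's typing machinery. A minimal sketch follows; the names `promote`, `Employee`, and `Manager` are hypothetical.

```python
from typing import get_type_hints


# Neither class exists yet when this function is defined; the annotations
# only record the names the classes are expected to have.
def promote(employee: "Employee") -> "Manager":
    return Manager(employee.name)


class Employee:
    def __init__(self, name: str) -> None:
        self.name = name


class Manager(Employee):
    pass


# Once both classes exist, the names can be resolved on request.
hints = get_type_hints(promote)
print(hints)
```

Nothing needs to be declared ahead of time; the cost is that a misspelled name is only noticed when something resolves the hints.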