[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Mon, 1 Nov 2021, Chris Angelico wrote: This is incompatible with the existing __get__ method, so it should get a different name. Also, functions have a __get__ method, so you definitely don't want to have everything that takes a callback run into this. Let's say it's __delayed__ instead. Right, good point. I'm clearly still learning about descriptors. :-) I'm having a LOT of trouble seeing this as an improvement. It's not meant to be an improvement exactly, more of a compatible explanation of how PEP 671 works -- in the same way that `instance.method` doesn't "magically" make a bound method, but rather checks whether `instance.method` has a `__get__` attribute, and if so, calls it with `instance` as an argument, instead of returning `instance.method` directly. This mechanism makes the whole `instance.method` less magic, more introspectable, more overridable, etc., e.g. making classmethod and similar decorators possible. I'm trying to do the same thing with PEP 671 (though possibly failing :-)). At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument. Yes, which means you can't access nonlocals or globals, only locals. So it has a subset of functionality in an awkward way. My actual intent was to just be able to access the arguments, which are all locals to the function. [Conceptually, I was thinking of the arguments being in their own object, and then getting accessed once like attributes, which triggered __get__ if defined -- but this view isn't very good, in particular because we don't want to redefine what it means to pass functions as arguments!] But the __delayed__ method is already a function, so it has its own locals, nonlocals, and globals. The difference is that those are in the frame of __delayed__, which is outside the function with the defaults, and I wanted to access that function's arguments -- hence passing in the function's locals(). Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`). That part's not a problem; if this has language support, it could be much more explicit: "if the end parameter was not set". True. I was trying to preserve the "skip this argument" property, but it might make more sense to call __delayed__ only when the argument is omitted. This might make it possible for defaults with __delayed__ methods to actually be evaluated in the function's scope, which would make it more compatible with the current PEP 671. AND it becomes impossible to have an object with this method as an early default - that's the sentinel problem. That's true. I guess my point is that these *are* early defaults, but act very much like late defaults. Functions or function calls just treat these early defaults specially because they have a __delayed__ method. I agree it's not perfect, but is there a context where you'd actually want to have an early default that is one of these objects? The point to add a method to an early default that makes the early default behave like a late default. So this feels like expected behavior...? The use of locals() (as an argument to __get__) is rather ugly, and probably prevents name lookup optimization. Yes. It also prevents use of anything other than locals. For instance, you can't have global helper functions, or anything like that; you could use something like len() from the builtins, but you couldn't use a function defined in the same module. Passing both globals and locals would be better, but still imperfect; and it incurs double lookups every time. That wasn't my intent. The __delayed__ method is still a function, and has its own locals, nonlocals, and globals. It can still call len (as my example code did) -- it's just the len visible from the __delayed__ function, not the len visible from the function with the default parameter. It's true that this approach would prevent implementing something like this: ``` def foo(a => (b := 5)): nonlocal b ``` I'm not sure that that is particularly important: I just wanted the default expression to be able to access the arguments and the surrounding scopes. Sure. Explore anything you like! But I don't think that this is any less ugly than either the status quo or PEP 671, both of which involve actual real code being parsed by the compiler. This proposal was meant to help define what the compiler with PEP 671 parsed code *into*. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an em
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Sat, 30 Oct 2021, Erik Demaine wrote: Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*. I was thinking about what other forms of deferred evaluation Python has, and ran into descriptors [https://docs.python.org/3/howto/descriptor.html]. Classes support this mechanism for calling arbitrary code when accessing the attribute, instead of when calling the class: ``` class CallMeLater: '''Descriptor for calling a specified function with no arguments.''' def __init__(self, func): self.func = func def __get__(self, obj, objtype=None): return self.func() class Foo: early_list = [] late_list = CallMeLater(lambda: []) foo1 = Foo() foo2 = Foo() foo1.early_list == foo2.early_list == foo1.late_list == foo2.late_list foo1.early_list is foo2.early_list# the same [] foo1.late_list is not foo2.late_list # two different []s ``` Written this way, it feels quite a bit like early and late arguments to me. So this got me thinking: What if parameter defaults supported descriptors? Specifically, something like the following: If a parameter (passed or defaulted) has a __get__ method, call it with one argument (beyond self), namely, the function scope's locals(). Parameters are so processed in order from left to right. (PEPs 549 and 649 are somewhat related in that they also propose extending descriptors.) This would enable the following hand-rolled late-bound defaults (using two early-bound defaults): ``` def foo(early_list = [], late_list = CallMeLater(lambda: [])): ... ``` Or we could write a decorator to make this somewhat cleaner: ``` def late_defaults(func): '''Convert callable defaults into late-bound defaults''' func.__defaults__ = tuple( CallMeLater(default) if callable(default) else default for default in func.__defaults__ ) return func @late_defaults def foo(early_list = [], late_list = lambda: []): ... ``` It's also possible, but difficult, to write `end := len(a)` defaults: ``` class LateLength: '''Descriptor for calling len(specified name)''' def __init__(self, name): self.name = name def __get__(self, locals): return len(locals[self.name]) def __repr__(self): # This is bad form for repr, but it makes help(bisect) # output the "right" thing: end=len(a) return f'len({self.name})' def bisect(a, start=0, end=LateLength('a')): ... ``` One feature/bug of this approach is that someone calling the function could pass in a descriptor, and its __get__ method will get called by the function (immediately at the start of the call). Personally I find this dangerous, but those excited about general deferreds might like it? At least it's still executing the function in its natural scope; it's "just" the locals() dict that gets exposed, as an argument. Alternatively, we could forbid this (at least for now): perhaps a __get__ method only gets checked and called on a parameter when that parameter has its default value (e.g. `end is bisect.__defaults__[1]`). In addition to feeling safer (to me), this would enable a lot of optimization: * Parameters without defaults don't need any __get__ checking. * Default values could be checked for the presence of a __get__ method at function definition time (or when setting func.__defaults__), and that flag could get checked at function call time, and __get__ semantics occur only when that flag is set. (I'm not sure whether this would actually save time, though. Maybe if it were a global flag for the function, "any late-bound arguments here?". If not, old behavior and performance.) This proposal could be compatible with PEP 671. What I find nice about this proposal is that it's valid Python syntax today, just an extension of the data model. But I wouldn't necessarily want to use the ugly incantations above, and rather use some syntactic sugar on top of it -- and that's where PEP 671 could come in. What this proposal might offer is a *meaning* for that syntactic sugar, which is more general and perhaps more Pythonic (building on the existing Python data model). It provides another way to think about what the notation in PEP 671 means, and suggests a (different) mechanism to implement it. Some nice features: * __defaults__ naturally generalizes here; no need for auxiliary structures or different signatures for __defaults__. A tool looking at __defaults__ could either be aware of descriptors in this context or not. All other introspection should be the same. * It becomes possible to skip a positional a
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Sat, 30 Oct 2021, Brendan Barnwell wrote: I agree it seems totally absurd to add a type of deferred expression but restrict it to only work inside function definitions. Functions are already a form of deferred evaluation. PEP 671 is an embellishment to this mechanism for some of the code in the function signature to actually get executed within the body scope, *just like the body of the function*. This doesn't seem weird to me. If we have a way to create deferred expressions we should try to make them more generally usable. Does anyone have a proposal for deferred expressions that could match the ease of use of PEP 671 in assigning a default argument of, say, `[]`? The proposals I've seen so far in this thread involve checking `isdeferred` and then resolving that deferred. This doesn't seem any easier than the existing sentinal approach for default arguments, whereas PEP 671 significantly simplifies this use-case. I also don't see how a function could distinguish a deferred default argument and a deferred argument passed in from another function. In my opinion, the latter would be really messy/dangerous to work with, because it could arbitrarily polute your scope. Whereas late-bound default arguments make a lot of sense: they're written in the function itself (just in the signature instead of the body), so we can see by looking at the code what happens. I've written code in dynamically scoped languages before. I don't recall enjoying it. But maybe I missed a proposal, or someone has an idea for how to fix these issues. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7ZJPAUJVUXJNI2SPAXK54CL3FGR22SCW/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Tue, 26 Oct 2021, Christopher Barker wrote: It's not actually documented that None indicates "use the default". Which, it turns out is because it doesn't :-) In [24]: bisect.bisect([1,3,4,6,8,9], 5, hi=None) --- TypeError Traceback (most recent call last) in > 1 bisect.bisect([1,3,4,6,8,9], 5, hi=None) TypeError: 'NoneType' object cannot be interpreted as an integer I guess that's because in C there is a way to define optional other than using a sentinel? or it's using an undocumented sentinal? Note: that's python 3.8 -- I can't imagine anything;s changed, but ... It seems to have changed. I can reproduce the error in CPython 3.8, but the same code words in CPython 3.9 and 3.10 (all using the C version of the module, though there's also a Python version of the module that probably always supported hi=None). I think it's the result of this commit: https://github.com/python/cpython/commit/3a855b26aed02abf87fc1163ad0d564dc3da1ea3#diff-02d3dd896d6d030e5c6c3e0961f9a4760a37b50bb05a2d89e4ab627a8f1a7b9f On the plus side, this probably means that there aren't many people using the hi=None API. :-) So it might be safe to change to a late-bound default. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PXBXUWQPX4ZGOVGMPCV2ITNAPG5KEUTW/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Tue, 26 Oct 2021, Ricky Teachey wrote: At bottom I guess I'd describe the problem this way: with most APIs, there is a way to PASS SOMETHING that says "give me the default". With this proposed API, we don't have that; the only want to say "give me the default" is to NOT pass something. I don't KNOW if that's a problem, it just feels like one. I agree that it's annoying, but it's actually an existing problem with early-bound defaults too. Consider: ``` def f(eggs = [], spam = {}): ... ``` There isn't an easy way to get the defaults for the arguments, because they're not just *any* `[]` or `{}`, they're a specific list and dict. So if you want to specify a value for the second argument but not the first, you'd need to do one of the following: ``` f(spam = {'more'}) f(f.__defaults__[0], {'more'}) ``` The former would work just as well with PEP 671. The latter depends on introspection, which we're still working out. Unfortunately, even if we can get access to the code that produces the default, we won't be able to actually call it, because it needs to be called from the function's scope. For example, consider: ``` def g(eggs := [], spam := {}): ... ``` In this simple case, there are no dependencies, so we could do something like this: ``` g(g.__defaults__[0](), {'more'}) ``` But in general we won't be able to make this call, because we don't have the scope until `g` gets called and its scope created... So there is a bit of functionality loss with PEP 671, though I'm not sure it's that big a deal. I wonder if it would make sense to offer a "missing argument" object (builtin? attribute of inspect.Parameter? attribute of types.FunctionType?) that actually simulates the behavior of that argument not being passed. Let me call it `_missing` for now. This would actually make it far easier to accomplish "pass in the second argument but not the first", both with early- and late-binding defaults: ``` f(_missing, {'more'}) g(_missing, {'more'}) ``` I started thinking about `_missing` when thinking about how to implement late-binding defaults. It's at least one way to do it (then the function itself could even do the argument checks), though perhaps there are simpler ways that avoid the ref count increments. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/3DLKREVEG62RHDHY4KP2R6IX2PPA633F/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Parameter tuple unpacking in the age of positional-only arguments
On Tue, 26 Oct 2021, Eric V. Smith wrote: You may or may not recall that a big reason for the removal of "tuple parameter unpacking" in PEP 3113 was that they couldn't be supported by the inspect module. Quoting that PEP: "Python has very powerful introspection capabilities. These extend to function signatures. There are no hidden details as to what a function's call signature is." (Aside: I loved tuple parameter unpacking, and used it all the time! I was sad to see them go, but I agreed with PEP 3113.) Having recently heard a friend say "the removal of tuple parameter unpacking was one thing that Python 3 got wrong", I read this and PEP 3113 with interest. It seems like another approach would be to treat tuple-unpacking parameters as positional-only, now that this is a thing, or perhaps require that they are explicitly positional-only via in PEP 570: def move((x, y), /): ... # could be valid? def move((x, y)): ... # could remain invalid? Is it worth revisiting parameter tuple-unpacking in the age of positional-only arguments? Or is this still a no-go from the perspective of introspection, because it violates "There are no hidden details as to what a function's call signature is."? (This may be a very short-lived thread.) Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/4DEUPSGXRJMB4TWGVLEZUEOCZUX3TNMS/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Tue, 26 Oct 2021, Steven D'Aprano wrote: def func(x=x, y=>x) # or func(x=x, @y=x) This makes me think of a "real" use-case for assigning all early-bound defaults before late-bound defaults: consider using closure hacks (my main use of early-bound defaults) together with late-bound defaults, as in ``` for i in range(n): def func(arg := expensive(i), i = i): ... ``` I think it's pretty common to put closure hacks at the end, so they don't get in the way of the caller. (The intent is that the caller never specifies those arguments.) But then it'd be nice to be able to use those variables in the late-bound defaults. I can't say this is beautiful code, but it is an application and would probably be convenient. On Tue, 26 Oct 2021, Eric V. Smith wrote: Among my objections to this proposal is introspection: how would that work? The PEP mentions that the text of the expression would be available for introspection, but that doesn't seem very useful. I think what would make sense is for code objects to be visible, in the same way as `func.__code__`. But it's definitely worth fleshing out whether: 1. Late-bound defaults are in `func.__defaults__` and `func.__kwdefaults__` -- where code objects are treated as special kind of default values. This seems problematic because we can't distinguish between a late-bound default and an early-bound default that is a code object. or 2. There are new defaults like `func.__late_defaults__` and `func.__late_kwdefaults__`. The issue here is that it's not clear in what order to mix `func.__defaults__` and `func.__late_defaults` (each a tuple). Perhaps most natural is to add a new introspection object, say LateDefault, that can take place as a default value (but can't be used as an early-bound default?), and has a __code__ attribute. --- By the way, another thing missing from the PEP: presumably lambda expressions can also have late-bound defaults? On Tue, 26 Oct 2021, Marc-Andre Lemburg wrote: Now, it may not be obvious, but the key advantage of such deferred objects is that you can pass them around, i.e. the "defer os.listdir(DEFAULT_DIR)" could also be passed in via another function. Are deferred code pieces are dynamically scoped, i.e., they are evaluated in whatever scope they end up getting evaluated? That would certainly interesting, but also kind of dangerous (about as dangerous as eval), and I imagine fairly prone to error if they get passed around a lot. If they're *not* dynamically scoped, then I think they're equivalent to lambda, and then they don't solve the default parameter problem, because they'll be evaluated in the function's enclosing scope instead of the function's scope. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/UPC3AX7ESRJ57IJS4DWEV4MS3N4SIISO/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Unpacking in tuple/list/set/dict comprehensions
On Sat, 16 Oct 2021, Erik Demaine wrote: Assuming the support remains relatively unanimous for [*...], {*...}, and {**...} (thanks for all the quick replies!), I'll put together a PEP. As promised, I put together a pre-PEP (together with my friend and coteacher Adam Hartz, not currently subscribed, but I'll keep him aprised): https://github.com/edemaine/peps/blob/unpacking-comprehensions/pep-.rst For this to become an actual PEP, it needs a sponsor. If a core developer would be willing to be the sponsor for this, please let me know. (This is my first PEP, so if I'm going about this the wrong way, also let me know.) Meanwhile, I'd welcome any comments! In writing things up, I became convinced that generators should be supported, but arguments should not be supported; see the document for details why. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/L6NZLEWOXM2KTGOIX7AHP5L76TLNKDPW/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
l), so perhaps that's a better choice. On the other hand, given that `global spam` and `nonlocal spam` would just be preventing `spam` from being defined in the function's scope, it seems more reasonable for your example to work, just like the following should: ``` spam = 5 def f(x := spam): print(x, spam) # 5 5 f() ``` Here's another example where it matters whether the default expressions are computed within their own scope: ``` def f(x := (y := 5)): print(x) # 5 print(y) # 5??? f() ``` I feel like we don't want to allow accessing `y` in the body of `f` here, because whether `y` is bound depends on whether `x` was passed. (If `x` is passed, `y` won't get assigned.) This would suggest evaluating default expressions in their own scope would be beneficial. Intuitively, the parens are indicating a separate scope, in the same way that `(x for x in it)` creates its own scope and thus doesn't leak `x`. On the other hand, `((y := x) for x in it)` does seem to leak `y`, so I'm not really sure what would be best / most consistent here. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/EWYHQLZOXLYH5DCZJIW3KQSSO3BV37TD/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Mon, 25 Oct 2021, Chris Angelico wrote: On Mon, Oct 25, 2021 at 6:13 PM Steven D'Aprano wrote: The rules for applying parameter defaults are well-defined. I would have to look it up to be sure... And that right there is all the evidence I need. If you, an experienced Python programmer, can be unsure, then there's a strong indication that novice programmers will have far more trouble. Why permit bad code at the price of hard-to-explain complexity? I'm not sure how this helps; the rules are already a bit complicated. Steven's proposed rules are a natural way to extend the existing rules; I don't see the new rules as (much) more complicated. Offer me a real use-case where this would matter. So far, we had better use-cases for arbitrary assignment expression targets than for back-to-front argument default references, and those were excluded. I can think of a few examples, though they are a bit artificial: ``` def search_listdir(path = None, files := os.listdir(path), start = 0, end = len(files)): '''specify path or files''' # variation of the LocaleTextCalendar from stdlib (in a message of Steven's) class Calendar: default_firstweekday = 0 def __init__(self, firstweekday := Calendar.default_firstweekday, locale := find_default_locale(), firstweekdayname := locale.lookup_day_name(firstweekday)): ... Calendar.default_firstweekday = 1 ``` But I think the main advantage of the left-to-right semantics is simplicity and predictability. I don't think the following functions should evaluate the default values in different orders. ``` def f(a := side_effect1(), b := side_effect2()): ... def g(a := side_effect1(), b := side_effect2() + a): ... def h(a := side_effect1() + b, b := side_effect2()): ... ``` I expect left-to-right semantics of the side effects (so function h will probably raise an error), just like I get from the corresponding tuple expressions: ``` (a := side_effect1(), b := side_effect2()) (a := side_effect1(), b := side_effect2() + a) (a := side_effect1() + b, b := side_effect2()) ``` As Jonathan Fine mentioned, if you defined the order to be a linearization of the partial order on arguments, (a) this would be complicated and (b) it would be ambiguous. I think, if you're going to forbid `def f(a := b, b:= a)` at the compiler level, you would need to forbid using late-bound arguments (at least) in least-bound argument expressions. But I don't see a reason to forbid this. It's rare that order would matter, and if it did, a quick experiment or learning "left to right" is really easy. The tuple expression equivalence leads me to think that `:=` is decent notation. As a result, I would expect: ``` def f(a := expr1, b := expr2, c := expr3): pass ``` to behave the same as: ``` _no_a = object() _no_b = object() _no_c = object() def f(a = _no_a, b = _no_b, c = _no_c): (a := expr1 if a is _no_a else a, b := expr2 if b is _no_b else b, c := expr3 if c is _no_c else c) ``` Given that `=` assignments within a function's parameter spec already only means "assign when another value isn't specified", this is pretty similar. On Mon, 25 Oct 2021, Chris Angelico wrote: On Sun, 24 Oct 2021, Erik Demaine wrote: > I think the semantics are easy to specify: the argument defaults get > evaluated for unspecified ARGUMENT(s), in left to right order as specified > in the def. Those may trigger exceptions as usual. Ah, but is it ALL argument defaults, or only those that are late-evaluated? Either way, it's going to be inconsistent with itself and harder to explain. That's what led me to change my mind. I admit I missed this subtlety, though again I don't think it would often make a difference. But working out subtleties is what PEPs and discussion are for. :-) I'd be inclined to assign the early-bound argument defaults before the late-bound arguments, because their values are already known (they're stored right in the function argument) so they can't cause side effects, and it could offer slight incremental benefits, like being able to write the following (again, somewhat artificial): ``` def manipulate(top_list): def recurse(start=0, end := len(rec_list), rec_list=top_list): ... ``` But I don't feel strongly either way about either interpretation. Mixing both types of default arguments breaks the analogy to tuple expressions above, alas. The corresponding tuple expression with `=` is just invalid. Personally, I'd expect to use late-bound defaults almost all or all the time; they behave more how I expect and how I usually need them (I use a fair amount of `[]` and `{}` and `set()` as default values). The only context I'd use/want the current default behavior is to hack closures, as in: ``` for t
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Sun, 24 Oct 2021, Erik Demaine wrote: I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual. Sorry, that should be: I think the semantics are easy to specify: the argument defaults get evaluated for unspecified ARGUMENT(s), in left to right order as specified in the def. Those may trigger exceptions as usual. ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/SNAYBJR52DHO3U76RLXZEC7HQFJLKVEX/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Mon, 25 Oct 2021, Chris Angelico wrote: On Mon, Oct 25, 2021 at 3:47 AM Chris Angelico wrote: On Mon, Oct 25, 2021 at 3:43 AM Jonathan Fine wrote: Please forgive me if it's not already been considered. Is the following valid syntax, and if so what's the semantics? Here it is: def puzzle(*, a=>b+1, b=>a+1): return a, b There are two possibilities: either it's a SyntaxError, or it's a run-time UnboundLocalError if you omit both of them (in which case it would be perfectly legal and sensible if you specify one of them). I'm currently inclined towards SyntaxError, since permitting it would open up some hard-to-track-down bugs, but am open to suggestions about how it would be of value to permit this. In fact, on subsequent consideration, I'm inclining more strongly towards SyntaxError, due to the difficulty of explaining the actual semantics. Changing the PEP accordingly. I think the semantics are easy to specify: the argument defaults get evaluated for unspecified order, in left to right order as specified in the def. Those may trigger exceptions as usual. Erik ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/HYXNABI2ACLVCLQH5TNRDX6WWSHNOING/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: PEP 671: Syntax for late-bound function argument defaults
On Sun, 24 Oct 2021, Chris Angelico wrote: Is anyone interested in coauthoring this with me? Anyone who has strong interest in seeing this happen - whether you've been around the Python lists for years, or you're new and interested in getting involved for the first time, or anywhere in between! I have a strong interest in seeing this happen, and would be happy to help how I can. Teaching (and using) the behavior of Python argument initializers is definitely a thorn in my side. :-) I'd love to be able to easily initalize an empty list/set/dict. For what it's worth, here are my thoughts on some of the syntaxes proposed so far: * I don't like `def f(arg => default)` exactly because it looks like a lambda, and so I imagine arg is an argument to that lambda, but the intended meaning has nothing to do with that. I understand lambdas give delegation, but in my mind that should look more like `def f(arg = => default)` or `def f(arg = () => default)` -- except these will have a different meaning (arg's default is a function, and they would be evaluated in parent scope not the function's scope) once `=>` is short-hand for lambda. * I find `def f(arg := default)` reasonable. I was actually thinking about this very issue before the thread started, and this was the syntax that came to mind. The main plus for this is that it uses an existing operator (so fewer to learn) and it is "another kind of assignment". The main minus is that it doesn't really have much to do with the walrus operator; we're not using the assigned value inline like `arg := default` would mean outside `def`. Then again, `def f(arg = default)` is quite different from `arg = default` outside `def`. * I find `def f(arg ?= default)` (or `def f(arg ??= default)`) reasonable, exactly because it is similar to None-aware operators (PEP 0505), which is currently/recently under discussion in python-dev). The main complaint about PEP 0505 in those discussions is that it's very None-specific, which feels biased. But the meaning of "omitted value" is extremely clear in a def. If both this were added and PEP 0505 were accepted, `def f(arg ?= default)` would be roughly equivalent to: ``` def f(arg = None): arg ??= default ``` except `def f(arg ?= default)` wouldn't trigger default because in the case of `f(None)`, whereas the above code would. I find this an acceptable difference. (FWIW, I'm also in favor of 0505.) * I also find `def f(@arg = default)` reasonable, though it feels a little inconsistent with decorators. I expect a decorator expression after @, not an argument, more like `def f(@later arg = default)`. * I'm not very familiar with thunks, but they seem a bit too magical for my liking. Evaluating argument defaults only sometimes (when they get read in the body) feels a bit unpredictable. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/DBDNNYYOVVZ5MITYXC5Q3SC5U2P3ASUS/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Unpacking in tuple/list/set/dict comprehensions
On Sun, 17 Oct 2021, Steven D'Aprano wrote: On Sat, Oct 16, 2021 at 11:42:49AM -0400, Erik Demaine wrote: I guess the question is whether to define `(*it for it in its)` to mean tuple or generator comprehension or nothing at all. I don't see why that is even a question. We don't have tuple comprehensions and `(expr for x in items)` is always a generator, never a tuple. There's no ambiguity there. Why would allowing unpacking turn it into a tuple? Agreed. I got confused by the symmetry. The only tricky corner case is that generator comprehensions can forgo the surrounding brackets in the case of a function call: func( (expr for x in items) ) func( expr for x in items ) # we can leave out the brackets But with the unpacking operator, it is unclear whether the unpacking star applies to the entire generator or the inner expression: func(*expr for x in items) That could be read as either: it = (expr for x in items) func(*it) or this: it = (*expr for x in items) func(it) Of course we can disambiguate it with precedence rules, [...] I'd be inclined to go that way, as the latter seems like the only reasonable (to me) parse for that syntax. Indeed, that's how the current parser interprets this: ``` func(*expr for x in items) ^ SyntaxError: iterable unpacking cannot be used in comprehension ``` To get the former meaning, which is possible today, you already need parentheses, as in func(*(expr for x in items)) But it would be quite surprising for this minor issue to lead to the major inconsistency of prohibiting unpacking inside generator comps when it is allowed in list, dict and set comps. Good point. Now I'm much more inclined to define the generator expression `(*expr for x in items)`. Thanks for your input! On Sat, 16 Oct 2021, Serhiy Storchaka wrote: It was considered and rejected in PEP 448. What was changed since? What new facts or arguments have emerged? I need to read the original discussion more (e.g. https://mail.python.org/pipermail/python-dev/2015-February/138564.html), but you can see the summary of why it was removed here: https://www.python.org/dev/peps/pep-0448/#variations In particular, there was "limited support" before (and the generator ambiguity issue discussed above). I expect now that we've gotten to enjoy PEP 448 for 5 years, it's more "obvious" that this functionality is missing and useful. So far that seems true (all responses have been at least +0), but if anyone disagree, please say so. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/DGPZMQXAZG55J4HLACIXMBZFCTEM6FPG/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Unpacking in tuple/list/set/dict comprehensions
On Sat, 16 Oct 2021, David Mertz, Ph.D. wrote: On Sat, Oct 16, 2021, 10:10 AM Erik Demaine (*it1, *it2, *it3) # tuple with the concatenation of three iterables [*it1, *it2, *it3] # list with the concatenation of three iterables {*it1, *it2, *it3} # set with the union of three iterables {**dict1, **dict2, **dict3} # dict with the combination of three dicts I'm +0 on the last three of these. But the first one is much more suggestive of a generator comprehension. I would want/expect it to be equivalent to itertools.chain(), not create a tuple. I guess you were referring to `(*it for it in its)` (proposed notation) rather than `(*it1, *it2, *it3)` (which already exists and builds a tuple). Very good point! This is confusing. I could also read `(*it for it in its)` as wanting to build the following generator (or something like it): ``` def generate(): for it in its: yield from it ``` I guess the question is whether to define `(*it for it in its)` to mean tuple or generator comprehension or nothing at all. Tuples are nice because they mirror `(*it1, *it2, *it3)` but bad for the reasons you raise: Moreover, it is an anti-pattern to create large and indefinite sized tuples, whereas such large collections as lists, sets, and dicts are common and useful. I'd be inclined to not define `(*it for it in its)`, given the ambiguity. Assuming the support remains relatively unanimous for [*...], {*...}, and {**...} (thanks for all the quick replies!), I'll put together a PEP. On Sat, 16 Oct 2021, Guido van Rossum wrote: Seems sensible to me. I’d write the equivalency as for x in y: answer.extend([…x…]) Oh, nice! That indeed works in all cases. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2AZBMZGKL56PERIJRCPTIJ6BRITTWHGM/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Unpacking in tuple/list/set/dict comprehensions
Extended unpacking notation (* and **) from PEP 448 gives us great ways to concatenate a few iterables or dicts: ``` (*it1, *it2, *it3) # tuple with the concatenation of three iterables [*it1, *it2, *it3] # list with the concatenation of three iterables {*it1, *it2, *it3} # set with the union of three iterables {**dict1, **dict2, **dict3} # dict with the combination of three dicts # roughly equivalent to dict1 | dict2 | dict3 thanks to PEP 584 ``` I propose (not for the first time) that similarly concatenating an unknown number of iterables or dicts should be possible via comprehensions: ``` (*it for it in its) # tuple with the concatenation of iterables in 'its' [*it for it in its] # list with the concatenation of iterables in 'its' {*it for it in its} # set with the union of iterables in 'its' {**d for d in dicts} # dict with the combination of dicts in 'dicts' ``` The above is my attempt to argue that the proposed notation is natural: `[*it for it in its]` is exactly analogous to `[*its[0], *its[1], ..., *its[len(its)-1]]`. There are other ways to do this, of course: ``` [x for it in its for x in it] itertools.chain(*its) sum(it for it in its, []) functools.reduce(operator.concat, its, []) ``` But none are as concise and (to me, and hopefully others who understand * notation) as intuitive. For example, I recently wanted to write a recursion like so, which accumulated a set of results from within a tree structure: ``` def recurse(node): # base case omitted return {*recurse(child) for child in node.children} ``` In fact, I am teaching a class and just asked a question on a written exam for which several students wrote this exact code in their solution (which inspired writing this message). So I do think it's quite intuitive, even to those relatively new to Python. Now, on to previous proposals. I found this thread from 2016 (!); please let me know if there are others. https://mail.python.org/archives/list/python-ideas@python.org/thread/SBM3LYESPJMI3FMTMP3VQ6JKKRDHYP7A/#DE4PCVNXBQJIGFBYRB2X7JUFZT75KYFR There are several arguments for and against this feature in that thread. I'll try to summarize: Arguments for: * Natural extension to PEP 448 (it's mentioned as a variant within PEP 448) * Easy to implement: all that's needed in CPython is to *remove* some code blocking this. Arguments against: * Counterintuitive (to some) * Hard to teach * `[...x... for x in y]` is no longer morally equivalent to `answer = []; for x in y: answer.append(...x...)` (unless `list1.append(a, b)` were equivalent to `list1.extend([a, b])`) Above I've tried to counter the first two "against" arguments. Some counters to the third "against" argument: 1. `[*...x... for x in y]` is equivalent to `answer = []; for x in y: answer.extend(...x...)` (about as easy to teach, I'd say) 2. Maybe `list1.append(a, b)` should be equivalent to `list1.extend([a, b])`? It is in JavaScript (`Array.push`). And I don't see why one would expect it to append a tuple `(a, b)`; that's what `list1.append((a, b))` is for. I think the main argument against this is to avoid programming errors, which is fine, but I don't see why it should hold back an advanced feature involving both unpacking and comprehensions. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7G732VMDWCRMWM4PKRG6ZMUKH7SUC7SH/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Accessing target name at runtime
On Sat, 16 Oct 2021, Steven D'Aprano wrote: The token should preferably be: * self-explanatory, not line-noise; * shorter rather than longer, otherwise it is easier to just type the target name as a string: 'x' is easier to type than NAME_OF_ASSIGNMENT_TARGET; * backwards compatible, which means it can't be anything that is already a legal name or expression; * doesn't look like an error or typo. A possible soft keyword: __lhs__ (short for 'left-hand side'): REGION = os.getenv(__lhs__) db_url = config[REGION][__lhs__] It's not especially short, and it's not backward-compatible, but at least there's a history of adding double-underscore things. Perhaps, for backward compatibility, the feature could be disabled in any scope (or file?) where __lhs__ is assigned, in which case it's treated like a variable as usual. The magic version only applies when it's used in a read-only fashion. It's kind of like a builtin variable, but its value changes on every line (and it's valid only in an assignment line). One thing I wonder: what happens if you write the following? foo[1] = __lhs__ # or <<< or whatever Maybe you get 'foo[1]', or maybe this is invalid syntax, in the same way that the following is. def foo[1]: pass Classes, functions, decorators and imports already satisfy the "low hanging fruit" for this functionality. My estimate is that well over 99% of the use-cases for this fall into just four examples, which are already satisfied by the interpreter: [...] # like func = decorator(func) # similarly for classes @decorator def func(): ... This did get me wondering about how you could simulate this feature with decorators. Probably obvious, but here's the best version I came up with: ``` def env_var(x): return os.getenv(x.__name__) @env_var def REGION(): pass ``` It's definitely ugly to avoid repetition... Using a class, I guess we could at least get several such variables at once. If we didn't already have interpreter support for these four cases, it would definitely be worth coming up with a solution. But the use-cases that remain are, I think, quite niche and uncommon. To me (a mathematician), the existence of this magic in def, class, import, etc. is a sign that this is indeed useful functionality. As a fan of first-class language features, it definitely makes me wonder whether it could be generalized. But I'm not sure what the best mechanism is. (From the description in the original post, I gather that variable assignment decorators didn't work out well.) I wonder about some generalized mechanism for automatically setting the __name__ of an assigned object (like def and class), but I'm not sure what it would look like... Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/BHGDRTX3BBYB66NINSTOPROTCIRKZNRU/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: dict_items.__getitem__?
There seems to be a growing list of issues with adding `itertools.first(x)` as shorthand for `next(iter(x))`: * If `x` is an iterator, it modifies the iterator, which is counterintuitive from the name `first`. * It'll still be difficult for new users to find/figure out. In the end, I feel like the main case I want to use a `first` and `last` functions on are `dict`s; other objects like `range`, `str`, `list`, `tuple` all support `[0]` and `[-1]`. So I wonder whether we should go back to this idea: On Tue, 5 Oct 2021, Alex Waygood wrote: [...] Another possibility I've been wondering about was whether several methods should be added to the dict interface: * dict.first_key = lambda self: next(iter(self)) * dict.first_val = lambda self: next(iter(self.values())) * dict.first_item = lambda self: next(iter(self.items())) * dict.last_key = lambda self: next(reversed(self)) * dict.last_val = lambda self: next(reversed(self.values())) * dict.last_item = lambda self: next(reversed(self.items())) But I think I like a lot more the idea of adding general ways of doing these things to itertools. At the least, I wonder whether a `dict.lastitem` method that's the nondestructive analog of `dict.popitem` would be good to add. This would solve the case of "I want an arbitrary item from this dict, I don't care which one, but I don't want to modify the dict so I'd rather not use popitem" which I've seen repeated a few times in this thread. By contrast, I don't think `next(iter(my_dict))` is an intuitive way to solve this problem, even for many experts; and I don't think it's as efficient as `my_dict.lastitem()` would be, because the current `dict` code maintains a pointer to the last item but not to the first item. [I also admit that I've mostly forgotten the original situation where I wanted this functionality. I believe it was an exhaustive search, where I wanted to branch on an arbitrary item of a dict, and nondestructively build new versions of that dict for recursive calls (instead of modifying before recursion and unmodifying afterward).] One more idea to throw around: Consider the following "anonymous unpacking" syntax. ``` first, * = [1, 2, 3] *, last = [1, 2, 3] ``` For someone used to unpacking syntax, this seems like a natural extension to what we have now, and is far more flexible than just extracting the first element. The distinction from the existing methods (with e.g. `*_`) is that it wouldn't waste time extracting elements you don't want. And it could work well with things like `dict` (and `dict_items` etc.). Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/IQ2EJM5BTDEO4URUHN3XGR6XSXX22HFR/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] dict_items.__getitem__?
Have folks thought about allowing indexing dictionary views as in the following code, where d is a dict object? d.keys()[0] d.keys()[-1] d.values()[0] d.values()[-1] d.items()[0] d.items()[-1] # item that would be returned by d.popitem() I could see value to the last form in particular: you might want to inspect the last item of a dictionary before possibly popping it. I've also often wanted to get an arbitrary item/key from a dictionary, and d.items()[0] seems natural for this. Of course, the universal way to get the first item from an iterable x is item = next(iter(x)) I can't say this is particularly readable, but it is functional and fast. Or sometimes I use this pattern: for item in x: break If you wanted the last item of a dictionary d (the one to be returned from d.popitem()), you could write this beautiful code: last = next(iter(reversed(d.items( Given the dictionary order guarantee from Python 3.7, adding indexing (__getitem__) to dict views seems natural. The potential catch is that (I think) it would require linear time to access an item in the middle, because you need to count the dummy elements. But accessing [i] and [-i] should be doable in O(|i|) time. (I've wondered about the possibility of doing binary or interpolation search, but without some stored index signposts, I don't think it's possible.) Python is also full of operations that take linear time to do: list.insert(0, x), list.pop(0), list.index(), etc. But it may be that __getitem__ takes constant time on all built-in data structures, and the apparent symmetry but very different performance between dict()[i] and list()[i] might be confusing. That said, I really just want d[0] and d[-1], which is when these are fast. I found some related discussion in https://mail.python.org/archives/list/python-ideas@python.org/thread/QVTGZD6USSC34D4IJG76UPKZRXBBB4MM/ but not this exact idea. Erik -- Erik Demaine | edema...@mit.edu | http://erikdemaine.org/ ___ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PPI747IBFYYRAVPUJDY4DKFNTJGASH3K/ Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Support parsing stream with `re`
On Mon, Oct 8, 2018 at 12:20 PM Cameron Simpson wrote: > > On 08Oct2018 10:56, Ram Rachum wrote: > >That's incredibly interesting. I've never used mmap before. > >However, there's a problem. > >I did a few experiments with mmap now, this is the latest: > > > >path = pathlib.Path(r'P:\huge_file') > > > >with path.open('r') as file: > >mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) > > Just a remark: don't tromp on the "mmap" name. Maybe "mapped"? > > >for match in re.finditer(b'.', mmap): > >pass > > > >The file is 338GB in size, and it seems that Python is trying to load it > >into memory. The process is now taking 4GB RAM and it's growing. I saw the > >same behavior when searching for a non-existing match. > > > >Should I open a Python bug for this? > > Probably not. First figure out what is going on. BTW, how much RAM have you > got? > > As you access the mapped file the OS will try to keep it in memory in case you > need that again. In the absense of competition, most stuff will get paged out > to accomodate it. That's normal. All the data are "clean" (unmodified) so the > OS can simply release the older pages instantly if something else needs the > RAM. > > However, another possibility is the the regexp is consuming lots of memory. > > The regexp seems simple enough (b'.'), so I doubt it is leaking memory like > mad; I'm guessing you're just seeing the OS page in as much of the file as it > can. Yup. Windows will aggressively fill up your RAM in cases like this because after all why not? There's no use to having memory just sitting around unused. For read-only, non-anonymous mappings it's not much problem for the OS to drop pages that haven't been recently accessed and use them for something else. So I wouldn't be too worried about the process chewing up RAM. I feel like this is veering more into python-list territory for further discussion though. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] support toml for pyproject support
On Mon, Oct 8, 2018 at 12:23 PM Nathaniel Smith wrote: > > On Mon, Oct 8, 2018 at 2:55 AM, Steven D'Aprano wrote: > > > > On Mon, Oct 08, 2018 at 09:10:40AM +0200, Jimmy Girardet wrote: > >> Each tool which wants to use pyproject.toml has to add a toml lib as a > >> conditional or hard dependency. > >> > >> Since toml is now the standard configuration file format, > > > > It is? Did I miss the memo? Because I've never even heard of TOML before > > this very moment. > > He's referring to PEPs 518 and 517 [1], which indeed standardize on > TOML as a file format for Python package build metadata. > > I think moving anything into the stdlib would be premature though – > TOML libraries are under active development, and the general trend in > the packaging space has been to move things *out* of the stdlib (e.g. > there's repeated rumblings about moving distutils out), because the > stdlib release cycle doesn't work well for packaging infrastructure. If I had the energy to argue it I would also argue against using TOML in those PEPs. I personally don't especially care for TOML and what's "obvious" to Tom is not at all obvious to me. I'd rather just stick with YAML or perhaps something even simpler than either one. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Asynchronous exception handling around with/try statement borders
On Fri, Sep 21, 2018 at 12:58 AM Chris Angelico wrote: > > On Fri, Sep 21, 2018 at 8:52 AM Kyle Lahnakoski > wrote: > > Since the java.lang.Thread.stop() "debacle", it has been obvious that > > stopping code to run other code has been dangerous. KeyboardInterrupt > > (any interrupt really) is dangerous. Now, we can probably code a > > solution, but how about we remove the danger: > > > > I suggest we remove interrupts from Python, and make them act more like > > java.lang.Thread.interrupt(); setting a thread local bit to indicate an > > interrupt has occurred. Then we can write explicit code to check for > > that bit, and raise an exception in a safe place if we wish. This can > > be done with Python code, or convenient places in Python's C source > > itself. I imagine it would be easier to whitelist where interrupts can > > raise exceptions, rather than blacklisting where they should not. > > The time machine strikes again! > > https://docs.python.org/3/c-api/exceptions.html#signal-handling Although my original post did not explicitly mention PyErr_CheckSignals() and friends, it had already taken that into account and it is not a silver bullet, at least w.r.t. the exact issue I raised, which had to do with the behavior of context managers versus the setup() try: do_thing() finally: cleanup() pattern, and the question of how signals are handled between Python interpreter opcodes. There is a still-open bug on the issue tracker discussing the exact issue in greater details: https://bugs.python.org/issue29988 ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Move optional data out of pyc files
On Tue, Apr 10, 2018 at 9:50 PM, Eric V. Smith wrote: > >>> 3. Annotations. They are used mainly by third party tools that >>> statically analyze sources. They are rarely used at runtime. >> >> Even less used than docstrings probably. > > typing.NamedTuple and dataclasses use annotations at runtime. Astropy uses annotations at runtime for optional unit checking on arguments that take dimensionful quantities: http://docs.astropy.org/en/stable/api/astropy.units.quantity_input.html#astropy.units.quantity_input ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] PEP proposal: unifying function/method classes
On Fri, Mar 23, 2018 at 11:25 AM, Antoine Pitrou wrote: > On Fri, 23 Mar 2018 07:25:33 +0100 > Jeroen Demeyer wrote: > >> On 2018-03-23 00:36, Antoine Pitrou wrote: >> > It does make sense, since the proposal sounds ambitious (and perhaps >> > impossible without breaking compatibility). >> >> Well, *some* breakage of backwards compatibility will be unavoidable. >> >> >> My plan (just a plan for now!) is to preserve backwards compatibility in >> the following ways: >> >> * Existing Python attributes of functions/methods should continue to >> exist and behave the same >> >> * The inspect module should give the same results as now (by changing >> the implementation of some of the functions in inspect to match the new >> classes) >> >> * Everything from the documented Python/C API. >> >> >> This means that I might break compatibility in the following ways: >> >> * Changing the classes of functions/methods (this is the whole point of >> this PEP). So anything involving isinstance() checks might break. >> >> * The undocumented parts of the Python/C API, in particular the C structure. > > One breaking change would be to add __get__ to C functions. This means > e.g. the following: > > class MyClass: > my_open = open > > would make my_open a MyClass method, therefore you would need to spell > it: > > class MyClass: > my_open = staticmethod(open) > > ... if you wanted MyClass().my_open('some file') to continue to work. > > Of course that might be considered a minor annoyance. I don't really see your point in this example. For one: why would anyone do this? Is this based on a real example? 2) That's how any function works. If you put some arbitrary function in a class body, and it's not able to accept an instance of that class as its first argument, then it will always be broken unless you make it a staticmethod. I don't see how there should be any difference there if the function were implemented in Python or in C. Thanks, E ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] importlib: making FileFinder easier to extend
Hello, Brief problem statement: Let's say I have a custom file type (say, with extension .foo) and these .foo files are included in a package (along with other Python modules with standard extensions like .py and .so), and I want to make these .foo files importable like any other module. On its face, importlib.machinery.FileFinder makes this easy. I make a loader for my custom file type (say, FooSourceLoader), and I can use the FileFinder.path_hook helper like: sys.path_hooks.insert(0, FileFinder.path_hook((FooSourceLoader, ['.foo']))) sys.path_importer_cache.clear() Great--now I can import my .foo modules like any other Python module. However, any standard Python modules now cannot be imported. The way PathFinder sys.meta_path hook works, sys.path_hooks entries are first-come-first-serve, and furthermore FileFinder.path_hook is very promiscuous--it will take over module loading for *any* directory on sys.path, regardless what the file extensions are in that directory. So although this mechanism is provided by the stdlib, it can't really be used for this purpose without breaking imports of normal modules (and maybe it's not intended for that purpose, but the documentation is unclear). There are a number of different ways one could get around this. One might be to pass FileFinder.path_hook loaders/extension pairs for all the basic file types known by the Python interpreter. Unfortunately there's no great way to get that information. *I* know that I want to support .py, .pyc, .so etc. files, and I know which loaders to use for them. But that's really information that should belong to the Python interpreter, and not something that should be reverse-engineered. In fact, there is such a mapping provided by importlib.machinery._get_supported_file_loaders(), but this is not a publicly documented function. One could probably think of other workarounds. For example you could implement a custom sys.meta_path hook. But I think it shouldn't be necessary to go to higher levels of abstraction in order to do this--the default sys.path handler should be able to handle this use case. In order to support adding support for new file types to sys.path_hooks, I ended up implementing the following hack: # import os import sys from importlib.abc import PathEntryFinder @PathEntryFinder.register class MetaFileFinder: """ A 'middleware', if you will, between the PathFinder sys.meta_path hook, and sys.path_hooks hooks--particularly FileFinder. The hook returned by FileFinder.path_hook is rather 'promiscuous' in that it will handle *any* directory. So if one wants to insert another FileFinder.path_hook into sys.path_hooks, that will totally take over importing for any directory, and previous path hooks will be ignored. This class provides its own sys.path_hooks hook as follows: If inserted on sys.path_hooks (it should be inserted early so that it can supersede anything else). Its find_spec method then calls each hook on sys.path_hooks after itself and, for each hook that can handle the given sys.path entry, it calls the hook to create a finder, and calls that finder's find_spec. So each sys.path_hooks entry is tried until a spec is found or all finders are exhausted. """ def __init__(self, path): if not os.path.isdir(path): raise ImportError('only directories are supported', path=path) self.path = path self._finder_cache = {} def __repr__(self): return '{}({!r})'.format(self.__class__.__name__, self.path) def find_spec(self, fullname, target=None): if not sys.path_hooks: return None for hook in sys.path_hooks: if hook is self.__class__: continue finder = None try: if hook in self._finder_cache: finder = self._finder_cache[hook] if finder is None: # We've tried this finder before and got an ImportError continue except TypeError: # The hook is unhashable pass if finder is None: try: finder = hook(self.path) except ImportError: pass try: self._finder_cache[hook] = finder except TypeError: # The hook is unhashable for some reason so we don't bother # caching it pass if finder is not None: spec = finder.find_spec(fullname, target) if spec is not None: return spec # Module spec not found through any of the finders return None def invalidate_caches(self): for finder in self._finder_cache.values(): finder.invalidate_caches() @classmet
Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?
On Thu, Dec 28, 2017 at 8:42 PM, Serhiy Storchaka wrote: > 28.12.17 12:10, Erik Bray пише: >> >> There's no index() alternative to int(). > > > operator.index() Okay, and it's broken. That doesn't change my other point that some functions that could previously take non-int arguments can no longer--if we agree on that at least then I can set about making a bug report and fixing it. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?
On Fri, Dec 8, 2017 at 7:20 PM, Ethan Furman wrote: > On 12/08/2017 04:33 AM, Erik Bray wrote: > >> More importantly not as many objects that coerce to int actually >> implement __index__. They probably *should* but there seems to be >> some confusion about how that's to be used. > > > __int__ is for coercion (float, fraction, etc) > > __index__ is for true integers > > Note that if __index__ is defined, __int__ should also be defined, and > return the same value. > > https://docs.python.org/3/reference/datamodel.html#object.__index__ This doesn't appear to be enforced, though I think maybe it should be. I'll also note that because of the changes I pointed out in my original post, it's now necessary for me to explicitly cast as int() objects that previously "just worked" when passed as arguments in some functions in itertools, collections, and other modules with C implementations. However, this is bad because if some broken code is passing floats to these arguments, they will be quietly cast to int and succeed, when really I should only be accepting objects that have __index__. There's no index() alternative to int(). I think changing all these functions to do the appropriate PyIndex_Check is a correct and valid fix, but I think it also stretches beyond the original purpose of __index__. I think that __index__ is relatively unknown, and perhaps there should be better documentation as to when and how it should be used over the better-known __int__. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?
On Fri, Dec 8, 2017 at 1:52 PM, Antoine Pitrou wrote: > On Fri, 8 Dec 2017 14:30:00 +0200 > Serhiy Storchaka > wrote: >> >> NumPy integers implement __index__. > > That doesn't help if a function calls e.g. PyLong_AsLongAndOverflow(). Right--pointing to __index__ basically implies that PyIndex_Check and subsequent PyNumber_AsSsize_t than there currently are. That I could agree with but then it becomes a question of where are those cases? And what do with, e.g. interfaces like PyLong_AsLongAndOverflow(). Add more PyNumber_ conversion functions? ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?
On Fri, Dec 8, 2017 at 12:26 PM, Serhiy Storchaka wrote: > 08.12.17 12:41, Erik Bray пише: >> >> IIUC, it seems to be carry-over from Python 2's PyLong API, but I >> don't see an obvious reason for it. In every case there's an explicit >> PyLong_Check first anyways, so not calling __int__ doesn't help for >> the common case of exact int objects; adding the fallback costs >> nothing in that case. > > > There is also a case of int subclasses. It is expected that PyLong_AsLong is > atomic, and calling __int__ can lead to crashes or similar consequences. > >> I ran into this because I was passing an object that implements >> __int__ to the maxlen argument to deque(). On Python 2 this used >> PyInt_AsSsize_t which does fall back to calling __int__, whereas >> PyLong_AsSsize_t does not. > > > PyLong_* functions provide an interface to PyLong objects. If they don't > return the content of a PyLong object, how can it be retrieved? If you want > to work with general numbers you should use PyNumber_* functions. By "you " I assume you meant the generic "you". I'm not the one who broke things in this case :) > In your particular case it is more reasonable to fallback to __index__ > rather than __int__. Unlikely maxlen=4.2 makes sense. That's true, but in Python 2 that was possible: >>> deque([], maxlen=4.2) deque([], maxlen=4) More importantly not as many objects that coerce to int actually implement __index__. They probably *should* but there seems to be some confusion about how that's to be used. It was mainly motivated by slices, but it *could* be used in general cases where it definitely wouldn't make sense to accept a float (I wonder if maybe the real problem here is that floats can be coerced automatically to ints) In other words, there are probably countless other cases in the stdlib at all where it "doesn't make sense" to accept a float, but that otherwise should accept objects that can be coerced to int without having to manually wrap those objects with an int(o) call. >> Currently the following functions fall back on __int__ where available: >> >> PyLong_AsLong >> PyLong_AsLongAndOverflow >> PyLong_AsLongLong >> PyLong_AsLongLongAndOverflow >> PyLong_AsUnsignedLongMask >> PyLong_AsUnsignedLongLongMask > > > I think this should be deprecated (and there should be an open issue for > this). Calling __int__ is just a Python 2 legacy. Okay, but then there are probably many cases where they should be replaced with PyNumber_ equivalents or else who knows how much code would break. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Is there a reason some of the PyLong_As* functions don't call an object's __int__?
IIUC, it seems to be carry-over from Python 2's PyLong API, but I don't see an obvious reason for it. In every case there's an explicit PyLong_Check first anyways, so not calling __int__ doesn't help for the common case of exact int objects; adding the fallback costs nothing in that case. I ran into this because I was passing an object that implements __int__ to the maxlen argument to deque(). On Python 2 this used PyInt_AsSsize_t which does fall back to calling __int__, whereas PyLong_AsSsize_t does not. Currently the following functions fall back on __int__ where available: PyLong_AsLong PyLong_AsLongAndOverflow PyLong_AsLongLong PyLong_AsLongLongAndOverflow PyLong_AsUnsignedLongMask PyLong_AsUnsignedLongLongMask whereas the following (at least according to the docs--haven't checked the code in all cases) do not: PyLong_AsSsize_t PyLong_AsUnsignedLong PyLong_AsSize_t PyLong_AsUnsignedLongLong PyLong_AsDouble PyLong_AsVoidPtr I think this inconsistency should be fixed, unless there's some reason for it I'm not seeing. Thanks, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] install pip packages from Python prompt
On Nov 4, 2017 08:31, "Stephen J. Turnbull" < turnbull.stephen...@u.tsukuba.ac.jp> wrote: Erik Bray writes: > Nope. I totally get that they don’t know what a shell or command prompt > is. THEY. NEED. TO. LEARN. Just to be clear I did not write this. Someone replying to me did. I'm going to go over all the different proposals in this thread and see if I can synthesize a list of options. I think, even if it's not a solution that winds up in the stdlib, it would be good to have some user stories about how package installation from within an interactive prompt might work (even if not from the standard REPL, which it should be noted has had small improvements made to it over the years). I also have my doubts about whether this *shouldn't* be possible. I mean, to a lot of beginners starting out the basic REPL *is* Python. They're so new to the scene they don't even know what IPython or Jupyter is or why they might want that. They aren't experienced enough to even know what they're missing out on. In classrooms we can resolve that easily by pointing our students to whatever tools we think will work best for them, but not everyone has that privilege. Best, Erik I don't want to take a position on the proposal, and I agree that we should *strongly* encourage everyone to learn. But "THEY. NEED. TO. LEARN." is not obvious to me. Anecdotally, my students are doing remarkably (to me, as a teacher) complex modeling with graphical interfaces to statistical and simulation packages (SPSS/AMOS, Artisoc, respectively), and collection of large textual databases from SNS with cargo-culted Python programs. For the past twenty years teaching social scientists, these accidental barriers (as Fred Brooks would have called them) have dropped dramatically, to the point where it's possible to do superficially good-looking (= complex) but entirely meaningless :-/ empirical research. (In some ways I think this lowered cost has been horribly detrimental to my work as an educator in applied social science. ;-) The point being that "user-friendly" UI in many fields where (fairly) advanced computing is used is more than keeping up with the perceived needs of most computer users, while the essential (in the sense of Brooks) non-computing modeling difficulties of their jobs remain. By "perceived" I mean I want my students using TeX, but it's hard to force them when all their professors (except me and a couple mathematicians) use Word (speaking of irreproducible results). It's good enough for government work, and that's in fact where many of them end up (and the great majority are either in government or in equivalent corporate bureaucrat positions). Yes, I meant the deprecatory connotations of "perceived", but realistically, I admit that maybe they *don't* *need* the more polished tech that I could teach them. I remember when I first started out teaching Software Carpentry I made the embarrassing mistake (coming from Physics) of assuming that LaTex is de-facto in most other academic fields :) > Hiding it is not a good idea for anyone. Agreed. Command lines and REPLs teach humility, to me as well as my students. :-) Steve -- Associate Professor Division of Policy and Planning Science http://turnbull/sk.tsukuba.ac.jp/ Faculty of Systems and Information Email: turnb...@sk.tsukuba.ac.jp University of Tsukuba Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] install pip packages from Python prompt
On Oct 30, 2017 8:57 PM, "Alex Walters" wrote: > -Original Message- > From: Python-ideas [mailto:python-ideas-bounces+tritium- > list=sdamon@python.org] On Behalf Of Erik Bray > Sent: Monday, October 30, 2017 6:28 AM > To: Python-Ideas > Subject: Re: [Python-ideas] install pip packages from Python prompt > > On Sun, Oct 29, 2017 at 8:45 PM, Alex Walters > wrote: > > Then those users have more fundamental problems. There is a minimum > level > > of computer knowledge needed to be successful in programming. > Insulating > > users from the reality of the situation is not preparing them to be > > successful. Pretending that there is no system command prompt, or shell, > or > > whatever platform specific term applies, only hurts new programmers. > Give > > users an error message they can google, and they will be better off in the > > long run than they would be if we just ran pip for them. > > While I completely agree with this in principle, I think you > overestimate the average beginner. Nope. I totally get that they don’t know what a shell or command prompt is. THEY. NEED. TO. LEARN. Hiding it is not a good idea for anyone. If this is an insurmountable problem for the newbie, maybe they really shouldn’t be attempting to program. This field is not for everyone. Reading this I get the impression, and correct me if I'm wrong, that you've never taught beginners programming. Of course long term (heck in fact fairly early on) they need to learn these nitty-gritty and sometimes frustrating lessons, but not in a 2 hour intro to programming for total beginners. And I beg to differ--this field is for everyone, and increasingly moreso every day. Doesn't mean it's easy, but it is and can be for everyone. Whether this specific proposal is technically feasible in a cross-platform manner with the state of the Python interpreter and import system is another question. But that's a discussion worth having. "Some people aren't cut out for programming" isn't. > Many beginners I've taught or > helped, even if they can manage to get to the correct command prompt, > often don't even know how to run the correct Python. They might often > have multiple Pythons installed on their system--maybe they have > Anaconda, maybe Python installed by homebrew, or a Python that came > with an IDE like Spyder. If they're on OSX often running "python" > from the command prompt gives the system's crippled Python 2.6 and > they don't know the difference. > > One thing that has been a step in the right direction is moving more > documentation toward preferring running `python -m pip` over just > `pip`, since this often has a better guarantee of running `pip` in the > Python interpreter you intended. But that still requires one to know > how to run the correct Python interpreter from the command-line (which > the newbie double-clicking on IDLE may not even have a concept of...). > > While I agree this is something that is important for beginners to > learn (e.g. print(sys.executable) if in doubt), it *is* a high bar for > many newbies just to install one or two packages from pip, which they > often might need/want to do for whatever educational pursuit they're > following (heck, it's pretty common even just to want to install the > `requests` module, as I would never throw `urllib` at a beginner). > > So while I don't think anything proposed here will work technically, I > am in favor of an in-interpreter pip install functionality. Perhaps > it could work something like this: > > a) Allow it *only* in interactive mode: running `pip(...)` (or > whatever this looks like) outside of interactive mode raises a > `RuntimeError` with the appropriate documentation > b) When running `pip(...)` the user is supplied with an interactive > prompt explaining that since installing packages with `pip()` can > result in changes to the interpreter, it is necessary to restart the > interpreter after installation--give them an opportunity to cancel the > action in case they have any work they need to save. If they proceed, > install the new package then restart the interpreter for them. This > avoids any ambiguity as to states of loaded modules before/after pip > install. > > From: Stephan Houben [mailto:stephan...@gmail.com] > > Sent: Sunday, October 29, 2017 3:43 PM > > To: Alex Walters > > Cc: Python-Ideas > > Subject: Re: [Python-ideas] install pip packages from Python prompt > > > > > > > > Hi Alex, > > > > > > > > 2017-10-29 20:26 GMT+01:00 Alex Walters : > > > > return “Please run pip from your system command prompt”
Re: [Python-ideas] install pip packages from Python prompt
On Mon, Oct 30, 2017 at 11:27 AM, Erik Bray wrote: > On Sun, Oct 29, 2017 at 8:45 PM, Alex Walters wrote: >> Then those users have more fundamental problems. There is a minimum level >> of computer knowledge needed to be successful in programming. Insulating >> users from the reality of the situation is not preparing them to be >> successful. Pretending that there is no system command prompt, or shell, or >> whatever platform specific term applies, only hurts new programmers. Give >> users an error message they can google, and they will be better off in the >> long run than they would be if we just ran pip for them. > > While I completely agree with this in principle, I think you > overestimate the average beginner. Many beginners I've taught or > helped, even if they can manage to get to the correct command prompt, > often don't even know how to run the correct Python. They might often > have multiple Pythons installed on their system--maybe they have > Anaconda, maybe Python installed by homebrew, or a Python that came > with an IDE like Spyder. If they're on OSX often running "python" > from the command prompt gives the system's crippled Python 2.6 and > they don't know the difference. I should add--another case that is becoming extremely common is beginners learning Python for the first time inside the Jupyter/IPython Notebook. And in my experience it can be very difficult for beginners to understand the connection between what's happening in the notebook ("it's in the web-browser--what does that have to do with anything on my computer??") and the underlying Python interpreter, file system, etc. Being able to pip install from within the Notebook would be a big win. This is already possible since IPython allows running system commands and it is possible to run the pip executable from the notebook, then manually restart the Jupyter kernel. It's not 100% clear to me how my proposal below would work within a Jupyter Notebook, so that would also be an angle worth looking into. Best, Erik > One thing that has been a step in the right direction is moving more > documentation toward preferring running `python -m pip` over just > `pip`, since this often has a better guarantee of running `pip` in the > Python interpreter you intended. But that still requires one to know > how to run the correct Python interpreter from the command-line (which > the newbie double-clicking on IDLE may not even have a concept of...). > > While I agree this is something that is important for beginners to > learn (e.g. print(sys.executable) if in doubt), it *is* a high bar for > many newbies just to install one or two packages from pip, which they > often might need/want to do for whatever educational pursuit they're > following (heck, it's pretty common even just to want to install the > `requests` module, as I would never throw `urllib` at a beginner). > > So while I don't think anything proposed here will work technically, I > am in favor of an in-interpreter pip install functionality. Perhaps > it could work something like this: > > a) Allow it *only* in interactive mode: running `pip(...)` (or > whatever this looks like) outside of interactive mode raises a > `RuntimeError` with the appropriate documentation > b) When running `pip(...)` the user is supplied with an interactive > prompt explaining that since installing packages with `pip()` can > result in changes to the interpreter, it is necessary to restart the > interpreter after installation--give them an opportunity to cancel the > action in case they have any work they need to save. If they proceed, > install the new package then restart the interpreter for them. This > avoids any ambiguity as to states of loaded modules before/after pip > install. > > > >> From: Stephan Houben [mailto:stephan...@gmail.com] >> Sent: Sunday, October 29, 2017 3:43 PM >> To: Alex Walters >> Cc: Python-Ideas >> Subject: Re: [Python-ideas] install pip packages from Python prompt >> >> >> >> Hi Alex, >> >> >> >> 2017-10-29 20:26 GMT+01:00 Alex Walters : >> >> return “Please run pip from your system command prompt” >> >> >> >> >> >> The target audience for my proposal are people who do not know >> >> which part of the sheep the "system command prompt" is. >> >> Stephan >> >> >> >> >> >> From: Python-ideas >> [mailto:python-ideas-bounces+tritium-list=sdamon@python.org] On Behalf >> Of Stephan Houben >> Sent: Sunday, October 29, 2017 3:19 PM >> To: Python-Ideas >> Subject: [Python-ideas] install p
Re: [Python-ideas] install pip packages from Python prompt
On Sun, Oct 29, 2017 at 8:45 PM, Alex Walters wrote: > Then those users have more fundamental problems. There is a minimum level > of computer knowledge needed to be successful in programming. Insulating > users from the reality of the situation is not preparing them to be > successful. Pretending that there is no system command prompt, or shell, or > whatever platform specific term applies, only hurts new programmers. Give > users an error message they can google, and they will be better off in the > long run than they would be if we just ran pip for them. While I completely agree with this in principle, I think you overestimate the average beginner. Many beginners I've taught or helped, even if they can manage to get to the correct command prompt, often don't even know how to run the correct Python. They might often have multiple Pythons installed on their system--maybe they have Anaconda, maybe Python installed by homebrew, or a Python that came with an IDE like Spyder. If they're on OSX often running "python" from the command prompt gives the system's crippled Python 2.6 and they don't know the difference. One thing that has been a step in the right direction is moving more documentation toward preferring running `python -m pip` over just `pip`, since this often has a better guarantee of running `pip` in the Python interpreter you intended. But that still requires one to know how to run the correct Python interpreter from the command-line (which the newbie double-clicking on IDLE may not even have a concept of...). While I agree this is something that is important for beginners to learn (e.g. print(sys.executable) if in doubt), it *is* a high bar for many newbies just to install one or two packages from pip, which they often might need/want to do for whatever educational pursuit they're following (heck, it's pretty common even just to want to install the `requests` module, as I would never throw `urllib` at a beginner). So while I don't think anything proposed here will work technically, I am in favor of an in-interpreter pip install functionality. Perhaps it could work something like this: a) Allow it *only* in interactive mode: running `pip(...)` (or whatever this looks like) outside of interactive mode raises a `RuntimeError` with the appropriate documentation b) When running `pip(...)` the user is supplied with an interactive prompt explaining that since installing packages with `pip()` can result in changes to the interpreter, it is necessary to restart the interpreter after installation--give them an opportunity to cancel the action in case they have any work they need to save. If they proceed, install the new package then restart the interpreter for them. This avoids any ambiguity as to states of loaded modules before/after pip install. > From: Stephan Houben [mailto:stephan...@gmail.com] > Sent: Sunday, October 29, 2017 3:43 PM > To: Alex Walters > Cc: Python-Ideas > Subject: Re: [Python-ideas] install pip packages from Python prompt > > > > Hi Alex, > > > > 2017-10-29 20:26 GMT+01:00 Alex Walters : > > return “Please run pip from your system command prompt” > > > > > > The target audience for my proposal are people who do not know > > which part of the sheep the "system command prompt" is. > > Stephan > > > > > > From: Python-ideas > [mailto:python-ideas-bounces+tritium-list=sdamon@python.org] On Behalf > Of Stephan Houben > Sent: Sunday, October 29, 2017 3:19 PM > To: Python-Ideas > Subject: [Python-ideas] install pip packages from Python prompt > > > > Hi all, > > Here is in somewhat more detail my earlier proposal for > > having in the interactive Python interpreter a `pip` function to > > install packages from Pypi. > > Motivation: it appears to me that there is a category of newbies > > for which "open a shell and do `pip whatever`" is a bit too much. > > It would, in my opinion, simplify things a bit if they could just > > copy-and-paste some text into the Python interpreter and have > > some packages from pip installed. > > That would simplify instructions on how to install package xyz, > > without going into the vagaries of how to open a shell on various > > platforms, and how to get to the right pip executable. > > I think this could be as simple as: > > def pip(args): > import sys > import subprocess > subprocess.check_call([sys.executable, "-m", "pip"] + args.split()) > > print("Please re-start Python now to use installed or upgraded > packages.") > > Note that I added the final message about restarting the interpreter > > as a low-tech solution to the problem of packages being already > > imported in the current Python session. > > I would imagine that the author of package xyz would then put on > > their webpage something like: > > To use, enter in your Python interpreter: > > pip("install xyz --user") > > As another example, consider prof. Baldwin from Woolamaloo university > > who teaches a course "Introductory
Re: [Python-ideas] + operator on generators
On Fri, Jun 30, 2017 at 1:09 AM, Jan Kaliszewski wrote: > 2017-06-25 Serhiy Storchaka dixit: > >> 25.06.17 15:06, lucas via Python-ideas пише: > >> > I often use generators, and itertools.chain on them. >> > What about providing something like the following: >> > >> > a = (n for n in range(2)) >> > b = (n for n in range(2, 4)) >> > tuple(a + b) # -> 0 1 2 3 > [...] >> It would be weird if the addition is only supported for instances of >> the generator class, but not for other iterators. Why (n for n in >> range(2)) >> + (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, >> 4)) and iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports >> arbitrary iterators. Therefore you will need to implement the __add__ >> method for *all* iterators in the world. >> >> However itertools.chain() accepts not just *iterators*. > [...] > > But implementation of the OP's proposal does not need to be based on > __add__ at all. It could be based on extending the current behaviour of > the `+` operator itself. > > Now this behavior is (roughly): try left side's __add__, if failed try > right side's __radd__, if failed raise TypeError. > > New behavior could be (again: roughly): try left side's __add__, if > failed try right side's __radd__, if failed try __iter__ of both sides > and chain them (creating a new iterator¹), if failed raise TypeError. > > And similarly, for `+=`: try __iadd__..., try __add__..., try > __iter__..., raise TypeError. I actually really like this proposal, in additional to the original proposal of using '+' to chain generators--I don't think it necessarily needs to be extended to *all* iterables. But this proposal goes one better. I just have to wonder what kind of strange unexpected bugs would result. For example now you could add a list to a string: >>> list(['a', 'b', 'c'] + 'def') ['a', 'b', 'c', 'd', 'e', 'f'] Personally, I really like this and find it natural. But it will break anything expecting this to be a TypeError. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Asynchronous exception handling around with/try statement borders
On Wed, Jun 28, 2017 at 3:19 PM, Greg Ewing wrote: > Erik Bray wrote: >> >> At this point a potentially >> waiting SIGINT is handled, resulting in KeyboardInterrupt being raised >> while inside the with statement's suite, and finally block, and hence >> Lock.__exit__ are entered. > > > Seems to me this is the behaviour you *want* in this case, > otherwise the lock can be acquired and never released. > It's disconcerting that it seems to be very difficult to > get that behaviour with a pure Python implementation. I think normally you're right--this is the behavior you would *want*, but not the behavior that's consistent with how Python implements the `with` statement, all else being equal. Though it's still not entirely fair either because if Lock.__enter__ were pure Python somehow, it's possible the exception would be raised either before or after the lock is actually marked as "acquired", whereas in the C implementation acquisition of the lock will always succeed (assuming the lock was free, and no other exceptional conditions) before the signal handler is executed. >> I think it might be possible to >> gain more consistency between these cases if pending signals are >> checked/handled after any direct call to PyCFunction from within the >> ceval loop. > > > IMO that would be going in the wrong direction by making > the C case just as broken as the Python case. > > Instead, I would ask what needs to be done to make this > work correctly in the Python case as well as the C case. You have a point there, but at the same time the Python case, while "broken" insofar as it can lead to broken code, seems correct from the Pythonic perspective. The other possibility would be to actually change the semantics of the `with` statement. Or as you mention below, a way to temporarily mask signals... > I don't think it's even possible to write Python code that > does this correctly at the moment. What's needed is a > way to temporarily mask delivery of asynchronous exceptions > for a region of code, but unless I've missed something, > no such facility is currently provided. > > What would such a facility look like? One possibility > would be to model it on the sigsetmask() system call, so > there would be a function such as > >mask_async_signals(bool) > > that turns delivery of async signals on or off. > > However, I don't think that would work. To fix the locking > case, what we need to do is mask async signals during the > locking operation, and only unmask them once the lock has > been acquired. We might write a context manager with an > __enter__ method like this: > >def __enter__(self): > mask_async_signals(True) > try: > self.acquire() > finally: > mask_async_signals(False) > > But then we have the same problem again -- if a Keyboard > Interrupt occurs after mask_async_signals(False) but > before __enter__ returns, the lock won't get released. Exactly. > Another approach would be to provide a context manager > such as > >async_signals_masked(bool) > > Then the whole locking operation could be written as > >with async_signals_masked(True): > lock.acquire() > try: > with async_signals_masked(False): > # do stuff here > finally: > lock.release() > > Now there's no possibility for a KeyboardInterrupt to > be delivered until we're safely inside the body, but we've > lost the ability to capture the pattern in the form of > a context manager. > > The only way out of this I can think of at the moment is > to make the above pattern part of the context manager > protocol itself. In other words, async exceptions are > always masked while the __enter__ and __exit__ methods > are executing, and unmasked while the body is executing. I think so too. That's more or less in line with Nick's idea on njs's issue (https://bugs.python.org/issue29988) of an ATOMIC_UNTIL opcode. That's just one implementation possibility. My question would be to make that a language-level requirement of the context manager protocol, or just something CPython does... Thanks, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Asynchronous exception handling around with/try statement borders
On Wed, Jun 28, 2017 at 3:09 PM, Erik Bray wrote: > On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan wrote: >> On 28 June 2017 at 21:40, Erik Bray wrote: >>> My colleague's contention is that given >>> >>> lock = threading.Lock() >>> >>> this is simply *wrong*: >>> >>> lock.acquire() >>> try: >>> do_something() >>> finally: >>> lock.release() >>> >>> whereas this is okay: >>> >>> with lock: >>> do_something() >> >> Technically both are slightly racy with respect to async signals (e.g. >> KeyboardInterrupt), but the with statement form is less exposed to the >> problem (since it does more of its work in single opcodes). >> >> Nathaniel Smith posted a good write-up of the technical details to the >> issue tracker based on his work with trio: >> https://bugs.python.org/issue29988 > > Interesting; thanks for pointing this out. Part of me felt like this > has to have come up before but my searching didn't bring this up > somehow (and even then it's only a couple months old itself). > > I didn't think about the possible race condition before > WITH_CLEANUP_START, but obviously that's a possibility as well. > Anyways since this is already acknowledged as a real bug I guess any > further followup can happen on the issue tracker. On second thought, maybe there is a case to made w.r.t. making a documentation change about the semantics of the `with` statement: The old-style syntax cannot make any guarantees about atomicity w.r.t. async events. That is, there's no way syntactically in Python to declare that no exception will be raised between "lock.acquire()" and the setup of the "try/finally" blocks. However, if issue-29988 were *fixed* somehow (and I'm not convinced it can't be fixed in the limited case of `with` statements) then there really would be a major semantic difference of the `with` statement in that it does support this invariant. Then the question is whether that difference be made a requirement of the language (probably too onerous a requirement?), or just a feature of CPython (which should still be documented one way or the other IMO). Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Asynchronous exception handling around with/try statement borders
On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan wrote: > On 28 June 2017 at 21:40, Erik Bray wrote: >> My colleague's contention is that given >> >> lock = threading.Lock() >> >> this is simply *wrong*: >> >> lock.acquire() >> try: >> do_something() >> finally: >> lock.release() >> >> whereas this is okay: >> >> with lock: >> do_something() > > Technically both are slightly racy with respect to async signals (e.g. > KeyboardInterrupt), but the with statement form is less exposed to the > problem (since it does more of its work in single opcodes). > > Nathaniel Smith posted a good write-up of the technical details to the > issue tracker based on his work with trio: > https://bugs.python.org/issue29988 Interesting; thanks for pointing this out. Part of me felt like this has to have come up before but my searching didn't bring this up somehow (and even then it's only a couple months old itself). I didn't think about the possible race condition before WITH_CLEANUP_START, but obviously that's a possibility as well. Anyways since this is already acknowledged as a real bug I guess any further followup can happen on the issue tracker. Thanks, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Asynchronous exception handling around with/try statement borders
Hi folks, I normally wouldn't bring something like this up here, except I think that there is possibility of something to be done--a language documentation clarification if nothing else, though possibly an actual code change as well. I've been having an argument with a colleague over the last couple days over the proper way order of statements when setting up a try/finally to perform cleanup of some action. On some level we're both being stubborn I think, and I'm not looking for resolution as to who's right/wrong or I wouldn't bring it to this list in the first place. The original argument was over setting and later restoring os.environ, but we ended up arguing over threading.Lock.acquire/release which I think is a more interesting example of the problem, and he did raise a good point that I do want to bring up. My colleague's contention is that given lock = threading.Lock() this is simply *wrong*: lock.acquire() try: do_something() finally: lock.release() whereas this is okay: with lock: do_something() Ignoring other details of how threading.Lock is actually implemented, assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls release() then as far as I've known ever since Python 2.5 first came out these two examples are semantically *equivalent*, and I can't find any way of reading PEP 343 or the Python language reference that would suggest otherwise. However, there *is* a difference, and has to do with how signals are handled, particularly w.r.t. context managers implemented in C (hence we are talking CPython specifically): If Lock.__enter__ is a pure Python method (even if it maybe calls some C methods), and a SIGINT is handled during execution of that method, then in almost all cases a KeyboardInterrupt exception will be raised from within Lock.__enter__--this means the suite under the with: statement is never evaluated, and Lock.__exit__ is never called. You can be fairly sure the KeyboardInterrupt will be raised from somewhere within a pure Python Lock.__enter__ because there will usually be at least one remaining opcode to be evaluated, such as RETURN_VALUE. Because of how delayed execution of signal handlers is implemented in the pyeval main loop, this means the signal handler for SIGINT will be called *before* RETURN_VALUE, resulting in the KeyboardInterrupt exception being raised. Standard stuff. However, if Lock.__enter__ is a PyCFunction things are quite different. If you look at how the SETUP_WITH opcode is implemented, it first calls the __enter__ method with _PyObjet_CallNoArg. If this returns NULL (i.e. an exception occurred in __enter__) then "goto error" is executed and the exception is raised. However if it returns non-NULL the finally block is set up with PyFrame_BlockSetup and execution proceeds to the next opcode. At this point a potentially waiting SIGINT is handled, resulting in KeyboardInterrupt being raised while inside the with statement's suite, and finally block, and hence Lock.__exit__ are entered. Long story short, because Lock.__enter__ is a C function, assuming that it succeeds normally then with lock: do_something() always guarantees that Lock.__exit__ will be called if a SIGINT was handled inside Lock.__enter__, whereas with lock.acquire() try: ... finally: lock.release() there is at last a small possibility that the SIGINT handler is called after the CALL_FUNCTION op but before the try/finally block is entered (e.g. before executing POP_TOP or SETUP_FINALLY). So the end result is that the lock is held and never released after the KeyboardInterrupt (whether or not it's handled somehow). Whereas, again, if Lock.__enter__ is a pure Python function there's less likely to be any difference (though I don't think the possibility can be ruled out entirely). At the very least I think this quirk of CPython should be mentioned somewhere (since in all other cases the semantic meaning of the "with:" statement is clear). However, I think it might be possible to gain more consistency between these cases if pending signals are checked/handled after any direct call to PyCFunction from within the ceval loop. Sorry for the tl;dr; any thoughts? ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Run length encoding
On 19/06/17 02:47, David Mertz wrote: As an only semi-joke, I have created a module on GH that meets the needs of this discussion (using the spelling I think are most elegant): https://github.com/DavidMertz/RLE It's a shame you have to build that list when encoding. I tried to work out a way to get the number of items in an iterable without having to capture all the values (on the understanding that if the iterable is already an iterator, it would be consumed). The best I came up with so far (not general purpose, but it works in this scenario) is: from iterator import groupby from operator import countOf def rle_encode(it): return ((k, countOf(g, k)) for k, g in groupby(it)) In your test code, this speeds things up quite a bit over building the list, but that's presumably only because both groupby() and countOf() will use the standard class comparison operator methods which in the case of ints will short-circuit with a C-level pointer comparison first. For user-defined classes with complicated comparison methods, getting the length of the group by comparing the items will probably be worse. Is there a better way of implementing a general-purpose "ilen()"? I tried a couple of other things, but they all required at least one lambda function and slowed things down by about 50% compared to the list-building version. (I agree this is sort of a joke, but it's still an interesting puzzle ...). Regards, E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] [Python-Dev] Language proposal: variable assignment in functional context
[cross-posted to python-ideas] Hi Robert, On 16/06/17 12:32, Robert Vanden Eynde wrote: Hello, I would like to propose an idea for the language but I don't know where I can talk about it. Can you please explain what the problem is that you are trying to solve? In a nutshell, I would like to be able to write: y = (b+2 for b = a + 1) The above is (almost) equivalent to: y = (a+1)+2 I realize the parentheses are not required, but I've included them because if your example mixed operators with different precedence then they might be necessary. Other than binding 'b' (you haven't defined what you expect the scope of that to be, but I'll assume it's the outer scope for now), what is it about the form you're proposing that's different? Or in list comprehension: Y = [b+2 for a in L for b = a+1] Which can already be done like this: Y = [b+2 for a in L for b in [a+1]] Y = [(a+1)+2 for a in L] Which is less obvious, has a small overhead (iterating over a list) and get messy with multiple assignment: Y = [b+c+2 for a in L for b,c in [(a+1,a+2)]] New syntax would allow to write: Y = [b+c+2 for a in L for b,c = (a+1,a+2)] Y = [(a+1)+(a+2)+2 for a in L] My first example (b+2 for b = a+1) can already be done using ugly syntax using lambda y = (lambda b: b+2)(b=a+1) y = (lambda b: b+2)(a+1) y = (lambda b=a+1: b+2)() Choice of syntax: for is good because it uses current keyword, and the analogy for x = 5 vs for x in [5] is natural. But the "for" loses the meaning of iteration. The use of "with" would maybe sound more logical. Python already have the "functional if", lambdas, list comprehension, but not simple assignment functional style. Can you present an example that can't be re-written simply by reducing the expression as I have done above? Regards, E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Dictionary destructing and unpacking.
On 07/06/17 23:42, C Anthony Risinger wrote: Neither of these are really comparable to destructuring. No, but they are comparable to the OP's suggested new built-in method (without requiring each mapping type - not just dicts - to implement it). That was what _I_ was responding to. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Dictionary destructing and unpacking.
On 07/06/17 19:14, Nick Humrich wrote: a, b, c = mydict.unpack('a', 'b', 'c') def retrieve(mapping, *keys): return (mapping[key] for key in keys) $ python3 Python 3.5.2 (default, Nov 17 2016, 17:05:23) [GCC 5.4.0 20160609] on linux Type "help", "copyright", "credits" or "license" for more information. >>> def retrieve(mapping, *keys): ... return (mapping[key] for key in keys) ... >>> d = {'a': 1, 'b': None, 100: 'Foo' } >>> a, b, c = retrieve(d, 'a', 'b', 100) >>> a, b, c (1, None, 'Foo') E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] π = math.pi
On Fri, Jun 2, 2017 at 7:52 AM, Greg Ewing wrote: > Victor Stinner wrote: >> >> How do you write π (pi) with a keyboard on Windows, Linux or macOS? > > > On a Mac, π is Option-p and ∑ is Option-w. I don't have a strong opinion about it being in the stdlib, but I'd also point out that a strong advantage to having these defined in a module at all is that third-party interpreters (e.g. IPython, bpython, some IDEs) that support tab-completion make these easy to type as well, and I find them to be very readable for math-heavy code. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Suggestion: push() method for lists
On 21/05/17 15:43, Paul Laos wrote: push(obj) would be equivalent to insert(index = -1, object), having -1 as the default index parameter. In fact, push() could replace both append() and insert() by unifying them. I don't think list.insert() with an index of -1 does what you think it does: $ python3 Python 3.5.2 (default, Nov 17 2016, 17:05:23) [GCC 5.4.0 20160609] on linux Type "help", "copyright", "credits" or "license" for more information. >>> l = [0, 1, 2] >>> l [0, 1, 2] >>> l.insert(-1, 99) >>> l [0, 1, 99, 2] >>> Because the indices can be thought of as referencing the spaces _between_ the objects, having a push() in which -1 is referencing a different 'space' than a -1 given to insert() or a slice operation refers to would, I suspect, be a source of confusion (and off-by-one bugs). E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Add a .chunks() method to sequences
Hi Nick, On 05/05/17 08:29, Nick Coghlan wrote: And then given the proposed str.splitgroups() on the one hand, and the existing memoryview.cast() on the other, offering itertools.itergroups() as a corresponding building block specifically for working with streams of regular data would make sense to me - that's a standard approach in time-division multiplexing protocols, and it also shows up in areas like digital audio processing as well (where you're often doing things like shuffling incoming data chunks into FFT buffers) It looks to me like your "itertools.itergroups()" is similar to more_itertools.chunked() - with at least one obvious change, see below(*). If anyone wants to persue this (or any itertools) enhancement, then please be aware of the following thread (and in particular the message being linked to - and the bug and discussion that it is replying to): https://mail.python.org/pipermail/python-dev/2012-July/120885.html I have been told off for bringing this up already, but I do it again in direct response to your suggestion because it seems there is a bar to getting something included in itertools and something like "chunked()" has already failed to make it. The thing to do is probably to talk directly to Raymond to see if there's an acceptable solution first before too much work is put into something that may be rejected as being too high level. It may be that a C version of "more_itertools" for things which people would find a speedup useful might be a solution (where the more_itertools package defers to those built-ins if they exist on the version of Python its executing on, otherwise uses its existing implementation as a fallback). I am not suggesting implementing the _whole_ of more_itertools in C - it's quite large now. (*) I had implemented itertools.chunked in C before (also for audio processing, as it happens) and one thing that I didn't like is the way strings get unpacked: >>> tuple(more_itertools.chunked("foo bar baz", 2)) (['f', 'o'], ['o', ' '], ['b', 'a'], ['r', ' '], ['b', 'a'], ['z']) If the chunked/itergroups method checked for the presence of a __chunks__ or similar dunder method in the source sequence which returns an iterator, then the string class could efficiently yield substrings rather than individual characters which then had to be wrapped in a list or tuple (which I think is what you wanted itergroups() to do): >>> tuple(itertools.chunked("foo bar baz", 2)) ('fo', 'o ', 'ba', 'r ', 'ba', 'z') Similarly, for objects which _represent_ a lot of data but do not actually hold those data literally (for example, range objects or even memoryviews), the returned chunks can also be representations of the data (subranges or subviews) and not the actual rendered data. For example, the existing: >>> range(10) range(0, 10) >>> tuple(more_itertools.chunked(range(10), 3)) ([0, 1, 2], [3, 4, 5], [6, 7, 8], [9]) becomes: >>> tuple(more_itertools.chunked(range(10), 3)) (range(0, 3), range(3, 6), range(6, 9), range(9, 10)) Obviously, with those short strings and ranges one could argue that there's no point, but the principle of doing it this way scales better than the version that collects all of the data in lists - for things like chunks of some sort of "view" object, you would still only have the actual data stored once in the original object. I suppose that one thing to consider is what happens when an iterator is passed to the chunked() function. An iterator could have a __chunks__ method which returned chunks of the source sequence from the existing point in the iteration, however the difference between such an iterator and one that _doesn't_ have a __chunks__ method is that in the second case the iterator would be consumed by the fall-back code which just does what more_itertools.chunked() does now, but in the first it would not. Perhaps there is a precedent for that particular edge case with iterators in a different context. Hope that helps, E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Add an option for delimiters in bytes.hex()
On 04/05/17 01:24, Steven D'Aprano wrote: On Thu, May 04, 2017 at 12:13:25AM +0100, Erik wrote: I had a use-case where splitting an iterable into a sequence of same-sized chunks efficiently improved the performance of my code [...] So I didn't propose it. I have no idea now what I spent my saved hours doing, but I imagine that it was fun Summary: I didn't present the argument because I'm not a masochist I'm not sure what the point of that anecdote was, unless it was "I wrote some useful code, and you missed out". Then you have misunderstood me. Paul suggested that my use-case (chunking could be faster) was perhaps enough to propose that my patch may be considered. I responded with historical/empirical evidence that perhaps that would actually not be the case. I was responding, honestly, to the questions raised by Paul's email. Your comments come across as a passive-aggressive chastisment of the core devs and the Python-Ideas community for being too quick to reject useful code: we missed out on something good, because you don't have the time or energy to deal with our negativity and knee-jerk rejection of everything good. That's the way your series of posts come across to me. I apologise if my words or my turn of phrase do not appeal to you. I am trying to be constructive with everything I post. If you choose to interpret my messages in a different way then I'm not sure what I can do about that. Back to the important stuff though: - you could have offered it to the moreitertools project; A more efficient version of moreitertools.chunked() is what we're talking about. - you could have published it on PyPy; Does PyPy support C extension modules? If so, that's a possibility. - you could have proposed it on Python-Ideas with an explicit statement I may well do that - my current patch (because of when I did it) is against a Py2 codebase, but I could port it to Py3. I still have a nagging doubt that I'd be wasting my time though ;) If you care so little that you can't be bothered even to propose it, why do you care if it is rejected? You are mistaking not caring enough about the functionality with not caring enough to enter into an argument about including that functionality ... I didn't propose it at the time because of the reasons I mentioned. But when I saw something being discussed yet again that I had a general solution for already written I thought I mention it in case it was useful. As I said, I'm _trying_ to be constructive. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Add an option for delimiters in bytes.hex()
Hi Paul, On 03/05/17 08:57, Paul Moore wrote: > On 3 May 2017 at 02:48, Erik wrote: >> Anyway, I know you can't stop anyone from *proposing* something like this, >> but as soon as they do you may decide to quote the recipe from >> "https://docs.python.org/3/library/functions.html#zip"; and try to block >> their proposition. There are already threads on fora that do that. >> >> That was my sticking point at the time when I implemented a general >> solution. Why bother to propose something that (although it made my code >> significantly faster) had already been blocked as being something that >> should be a python-level operation and not something to be included in a >> built-in? > > It sounds like you have a reasonable response to the suggestion of > using zip- that you have a use case where performance matters, and > your proposed solution is of value in that case. I don't think so, though. I had a use-case where splitting an iterable into a sequence of same-sized chunks efficiently improved the performance of my code significantly (processing a LOT of 24-bit, multi-channel - 16 to 32 - PCM streams from a WAV file). Having thought "I need to split this stream by a fixed number of bytes" and then found more_itertools.chunked() (and the zip_longest(*([iter(foo)] * num)) trick) it turned out they were not quick enough so I implemented itertools.chunked() in C. That worked well for me, so when I was done I did a search in case it was worth proposing as an enhancement to feed it back to the community. Then I came across things such as the following: http://bugs.python.org/issue6021 I am specifically referring to the "It has been rejected before" comment, also mentioned here: https://mail.python.org/pipermail/python-dev/2012-July/120885.html See this entire thread, too: https://mail.python.org/pipermail/python-ideas/2012-July/015671.html This is the reason why I really just didn't care enough to go through the process of proposing it in the end (even though the more_itertools.chunked function was one of the first 3 implemented in V1.0 and seems to _still_ be cropping up all the time in different guises - so is perhaps more fundamental than people recognise). The strong implication of the discussions linked to above is that if it had been mentioned before it would be immediately rejected, and that was supported by several members of the community in good standing. So I didn't propose it. I have no idea now what I spent my saved hours doing, but I imagine that it was fun > Whether it's a > *sufficient* response remains to be seen, but unless you present the > argument we won't know. Summary: I didn't present the argument because I'm not a masochist Regards, E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Add an option for delimiters in bytes.hex()
On 03/05/17 01:43, Steven D'Aprano wrote: On Tue, May 02, 2017 at 11:39:48PM +0100, Erik wrote: On 02/05/17 12:31, Steven D'Aprano wrote: Rather than duplicate the API and logic everywhere, I suggest we add a new string method. My suggestion is str.chunk(size, delimiter=' ') and str.rchunk() with the same arguments: For the record, I now think the second argument should be called "sep", for separator, and I'm okay with Greg's suggestion we call the method "group". "1234ABCDEF".chunk(4) => returns "1234 ABCD EF" [...] Why do you want to limit it to strings? I'm not stopping anyone from proposing a generalisation of this that works with other sequence types. As somebody did :-) Who? I didn't spot that in the thread - please give a reference. Thanks. Anyway, I know you can't stop anyone from *proposing* something like this, but as soon as they do you may decide to quote the recipe from "https://docs.python.org/3/library/functions.html#zip"; and try to block their proposition. There are already threads on fora that do that. That was my sticking point at the time when I implemented a general solution. Why bother to propose something that (although it made my code significantly faster) had already been blocked as being something that should be a python-level operation and not something to be included in a built-in? String methods should return strings. In that case, we need to fix this ASAP ;) : >>> 'foobarbaz'.split('o') ['f', '', 'barbaz'] Where the result is reasonably a sequence, a method should return a sequence (but I would agree that it should generally be a sequence of objects of the source type - which I think is what I effectively said: "Isn't something like this potentially useful for all sequences (where the result is a [sequence] of objects that are the same [type] as the source sequence)" That's not to argue against a generic iterator solution, but the barrier to use of an iterator solution is higher than just calling a method. Knowing which sequence classes have a "chunk" method and which don't is a higher barrier than knowing that all sequences can be "chunked" by a single imported function. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 21:50, Chris Angelico wrote: On Thu, Apr 27, 2017 at 6:24 AM, Erik wrote: The background is that what I find myself doing a lot of for private projects is importing data from databases into a structured collection of objects and then grouping and analyzing the data in different ways before graphing the results. So yes, I tend to have classes that accept their entire object state as parameters to the __init__ method (from the database values) and then any other methods in the class are generally to do with the subsequent analysis (including dunder methods for iteration, rendering and comparison etc). You may want to try designing your objects as namedtuples. That gives you a lot of what you're looking for. I did look at this. It looked promising. What I found was that I spent a lot of time working out how to subclass namedtuples properly (I do need to do that to add the extra logic - and sometimes some state - for my analysis) and once I got that working, I was left with a whole different set of boilerplate and special cases and therefore another set of things to remember if I return to this code at some point. So I've reverted to regular classes and multiple assignments in __init__. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Add an option for delimiters in bytes.hex()
On 02/05/17 12:31, Steven D'Aprano wrote: I disagree with this approach. There's nothing special about bytes.hex() here, perhaps we want to format the output of hex() or bin() or oct(), or for that matter "%x" and any of the other string templates? In fact, this is a string operation that could apply to any character string, including decimal digits. Rather than duplicate the API and logic everywhere, I suggest we add a new string method. My suggestion is str.chunk(size, delimiter=' ') and str.rchunk() with the same arguments: "1234ABCDEF".chunk(4) => returns "1234 ABCD EF" FWIW, I implemented a version of something similar as a fixed-length "chunk" method in itertoolsmodule.c (it was similar to izip_longest - it had a "fill" keyword to pad the final chunk). It was ~100 LOC including the structure definitions. The chunk method was an iterator (so it returned a sequence of "chunks" as defined by the API). Then I read that "itertools" should consist of primitives only and that we should defer to "moreitertools" for anything that is of a higher level (which this is - it can be done in terms of itertools functions). So I didn't propose it, although the processing of my WAV files (in which the sample data are groups of bytes - frames - of a fixed length) was significantly faster with it :( I also looked at implementing itertools.chunk as a function that would make use of a "__chunk__" method on the source object if it existed (which allowed a class to support an even more efficient version of chunking - things like range() etc). I don't see any advantage to adding this to bytes.hex(), hex(), oct(), bin(), and I really don't think it is helpful to be grouping the characters by the number of bits. Its a string formatting operation, not a bit operation. Why do you want to limit it to strings? Isn't something like this potentially useful for all sequences (where the result is a tuple of objects that are the same as the source sequence - be that strings or lists or lazy ranges or whatever?). Why aren't the chunks returned via an iterator? E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 28/04/17 10:47, Paul Moore wrote: On 28 April 2017 at 00:18, Erik wrote: The semantics are very different and there's little or no connection between importing a module and setting an attribute on self. At the technical level of what goes on under the covers, yes. At the higher level of what the words mean in spoken English, it's really not so different a concept. I disagree. If you were importing into the *class* (instance?) I might begin to see a connection, but importing into self? I know you already understand the following, but I'll spell it out anyway. Here's a module: - $ cat foo.py def foo(): global sys import sys current_namespace = set(globals().keys()) print(initial_namespace ^ current_namespace) def bar(): before_import = set(locals().keys()) import os after_import = set(locals().keys()) print(before_import ^ after_import) initial_namespace = set(globals().keys()) - Now, what happens when I run those functions: $ python3 Python 3.5.2 (default, Nov 17 2016, 17:05:23) [GCC 5.4.0 20160609] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import foo >>> foo.foo() {'sys', 'initial_namespace'} >>> foo.bar() {'before_import', 'os'} >>> ... so the net effect of "import" is to bind an object into a namespace (a dict somewhere). In the case of 'foo()' it's binding the module object for "sys" into the dict of the module object that represents 'foo.py'. In the case of 'bar()' it's binding the module object for "os" into the dict representing the local namespace of the current instance of the bar() call. Isn't binding an object to a namespace the same operation that assignment performs? So it's a type of assignment, and one that doesn't require the name to be spelled twice in the current syntax (and that's partly why I took offense at a suggestion - not by you - that I was picking "random or arbitrary" keywords. I picked it for that specific reason). I realize that there are other semantic changes (importing a module twice doesn't do anything - and specifically repeated "import * from mod" will not do anything if the module mutates) - and perhaps this is your point. Also, if you try to make the obvious generalisations (which you'd *have* to be able to make due to the way Python works) things quickly get out of hand: def __init__(self, a): self import a self.a = a OK, but self is just a variable name, so we can reasonably use a different name: def __init__(foo, a): foo import a foo.a = a So the syntax is import Presumably the following also works, because there's nothing special about parameters? def __init__(x, a): calc = a**2 x import calc x.calc = calc And of course there's nothing special about __init__ def my_method(self, a): self import a self.a = a Or indeed about methods def standalone(a, b): a import b a.b = b or statements inside functions: if __name __ == '__main__: a = 12 b = 13 a import b a.b = b Hmm, I'd hope for a type error here. But what types would be allowed for a? I think you're assuming I'm suggesting some sort of magic around "self" or some such thing. I'm not. I've written above exactly what I would expect the examples to be equivalent to. It's just an assignment which doesn't repeat the name (and in the comma-separated version allows several names to be assigned using compact syntax without spelling them twice, which is where this whole thing spawned from). See what I mean? Things get out of hand *very* fast. I don't see how that's getting "out of hand". The proposal is nothing more complicated than a slightly-different spelling of assignment. It could be done today with a text-based preprocessor which converts the proposed form to an existing valid syntax. Therefore, if it's "out of hand" then so is the existing assignment syntax ;) FWIW, I should probably state for the record that I'm not actually pushing for _anything_ right now. I'm replying to questions asked and also to statements made which I think have missed the point of what I was trying to say earlier. So I'm just engaging in the conversation at this point - if it appears confrontational then it's not meant to. To summarise: 1. There's some serious technical issues with your proposal, which as far as I can see can only be solved by arbitrary restrictions on how it can be used To be honest, I still don't understand what the serious technical issues are (other than the parser probably doesn't handle this sort of keyword/o
Re: [Python-ideas] Augmented assignment syntax for objects.
On 27/04/17 23:43, Steven D'Aprano wrote: On Wed, Apr 26, 2017 at 11:29:19PM +0100, Erik wrote: def __init__(self, a, b, c): self import a, b self.foo = c * 100 [snarky] If we're going to randomly choose arbitrary keywords with no connection to the operation being performed, The keyword I chose was not random or arbitrary and it _does_ have a connection to the operation being performed (bind a value in the source namespace to the target namespace using the same name it had in the source namespace - or rename it using the 'as' keyword). can we use `del` instead of `import` because it's three characters less typing? Comments like this just serve to dismiss or trivialize the discussion. We acknowledged that we're bikeshedding so it was not a serious suggestion, just a "synapse prodder" ... But seriously, I hate this idea. Good. It's not a proposal, but something that was supposed to generate constructive discussion. The semantics are very different and there's little or no connection between importing a module and setting an attribute on self. At the technical level of what goes on under the covers, yes. At the higher level of what the words mean in spoken English, it's really not so different a concept. If we're going to discuss pie-in-the-sky suggestions, That is just dismissing/trivializing the conversation again. (If you don't like "inject", I'm okay with "load" or even "push".) No you're not, because that's a new keyword which might break existing code and that is even harder to justify than re-using an existing keyword in a different context. the problem this solves isn't big or important enough for the disruption of adding a new keyword. So far, you are the only one to have suggested adding a new keyword, I think ;) E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 23:28, Paul Moore wrote: Or to put it another way, if the only reason for the syntax proposal is performance then show me a case where performance is so critical that it warrants a language change. It's the other way around. The proposal (arguably) makes the code clearer but does not impact performance (and is a syntax error today, so does not break existing code). The suggestions (decorators etc) make the code (arguably) clearer today without a syntax change, but impact performance. So, those who think the decorators make for clearer code have to choose between source code clarity and performance. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 19:15, Mike Miller wrote: As the new syntax ideas piggyback on existing syntax, it doesn't feel like that its a complete impossibility to have this solved. Could be another "fixed papercut" to drive Py3 adoption. Taken individually not a big deal but they add up. *sigh* OK, this has occurred to me over the last couple of days but I didn't want to suggest it as I didn't want the discussion to fragment even more. But, if we're going to bikeshed and there is some weight behind the idea that this "papercut" should be addressed, then given my previous comparisons with importing, what about having 'import' as an operator: def __init__(self, a, b, c): self import a, b self.foo = c * 100 Also allows renaming: def __init__(self, a, b, c): self import a, b, c as _c Because people are conditioned to think the comma-separated values after "import" are not tuples, perhaps the use of import as an operator rides on that wave ... (I do realise that blurring the lines between statements and operators like this is probably not going to work for technical reasons (and it just doesn't quite read correctly anyway), but now we're bikeshedding and who knows what someone else might come up with in response ...). E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 22:28, Paul Moore wrote: On 26 April 2017 at 21:51, Erik wrote: It doesn't make anything more efficient, however all of the suggestions of how to do it with current syntax (mostly decorators) _do_ make things less efficient. Is instance creation the performance bottleneck in your application? No, not at all. This discussion has split into two: 1) How can I personally achieve what I want for my own personal use-cases. This should really be on -list, and some variation of the decorator thing will probably suffice for me. 2) The original proposal, which does belong on -ideas and has to take into account the general case, not just my specific use-case. The post you are responding to is part of (2), and hence reduced performance is a consideration. Regards, E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 01:39, Nathaniel Smith wrote: [snip discussion of why current augmented assignment operators are better for other reasons] Are there any similar arguments for .=? It doesn't make anything more efficient, however all of the suggestions of how to do it with current syntax (mostly decorators) _do_ make things less efficient. So rather than a win/win as with current augmented assignment (compact/clearer code *and* potentially a performance improvement), it's now a tradeoff (wordy code *or* a performance reduction). E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 16:10, Nick Timkovich wrote: I was wondering that if there are so many arguments to a function that it *looks* ugly, that it might just *be* ugly. For one, too many required arguments to a function (constructor, whatever) is already strange. Binding them as attributes of the object, unmodified in a constructor also seems to be rare. Yes, and perhaps it's more of a problem for me because of my possibly-atypical use of Python. The background is that what I find myself doing a lot of for private projects is importing data from databases into a structured collection of objects and then grouping and analyzing the data in different ways before graphing the results. So yes, I tend to have classes that accept their entire object state as parameters to the __init__ method (from the database values) and then any other methods in the class are generally to do with the subsequent analysis (including dunder methods for iteration, rendering and comparison etc). E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 18:42, Mike Miller wrote: I want to be able to say: def __init__(self, foo, bar, baz, spam): self .= foo, bar, spam self.baz = baz * 100 I don't see ALL being set a big problem, and less work than typing several of them out again. Because, some of the parameters might be things that are just passed to another constructor to create an object that is then referenced by the object being created. If one doesn't want the object's namespace to be polluted by that stuff (which may be large and also now can't be garbage collected while the object is alive) then a set of "del self.xxx" statements is required instead, so you've just replaced one problem with another ;) I'd rather just explicitly say what I want to happen rather than have *everything* happen and then have to tidy that up instead ... E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 08:59, Paul Moore wrote: It should be possible to modify the decorator to take a list of the variable names you want to assign, but I suspect you won't like that Now you're second-guessing me. > class MyClass: > @auto_args('a', 'b') > def __init__(self, a, b, c=None): > pass I had forgotten that decorators could take parameters. Something like that pretty much ticks the boxes for me. I'd _prefer_ something that sits inside the method body rather than just outside it, and I'd probably _prefer_ something that wasn't quite so heavyweight at runtime (which may be an irrational concern on my part ;)), but those aren't deal breakers, depending on the project - and the vast majority of what I do in Python is short-lived one-off projects and rapid prototyping for later implementation in another language, so I do seem to be fleshing out a set of classes from scratch and writing a bunch of __init__ methods far more of the time than people with long-lived projects would do. Perhaps that's why it irritates me more than it does some others ;) E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 26/04/17 13:19, Joao S. O. Bueno wrote: On 25 April 2017 at 19:30, Erik wrote: decorators don't cut it anyway (at least not those proposed) because they blindly assign ALL of the arguments. I'm more than happy to hear of something that solves both of those problems without needing syntax changes though, as that means I can have it today ;) Sorry - a decorator won't "blindly assign all argments" - it will do that just if it is written to do so. Right, and the three or four variants suggested (and the vars(self).update() suggestion) all do exactly that. I was talking about the specific responses (though I can see my language is vague). [FWIW I've been using Python the whole time that decorators have existed and I've yet to need to write one - I've _used_ some non-parameterized ones though - so I guess I'd forgotten that they can take parameters] E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 25/04/17 22:15, Brice PARENT wrote: it may be easier to get something like this (I think, as there is no new operator involved) : No new operator, but still a syntax change, so that doesn't help from that POV. def __init__(self, *args, **kwargs): self.* = *args self.** = **kwargs What is "self.* = *args" supposed to do? For each positional argument, what name in the object is it bound to? E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 25/04/17 23:05, Paul Moore wrote: 1. Writing out the assignments "longhand" is an unacceptable burden. There are reasons why augmented assignment was implemented. One of them was to make the code easier to read: foil = foil + 1 foil = foi1 + 1 foil += 1 Should one be silly enough to have a "foil" and "foi1" variable in scope, only one of those is clearly incrementing a variable without requiring a slightly harder look ;) It's not about the time taken to type the line. It's about the clarity of what the line is expressing. 2. Using a decorator (which can be written directly in your project, doesn't even need to be an external dependency) is unacceptable. All of the decorators (or other language tricks that modify the object's dict) suggested so far assume that ALL of the method's arguments are to be assigned. I do not want that. I want to be able to say: def __init__(self, foo, bar, baz, spam): self .= foo, bar, spam self.baz = baz * 100 It's all still explicit inside the body of the method. Add to that the fact that these people would be arguing "I want the ability to avoid writing out the assignments, but I don't want that capability enough to use a decorator" As I said above, it's not about the effort writing it out. It's about the effort (and accuracy) of reading the code after it has been written. And as I also said above, decorators don't cut it anyway (at least not those proposed) because they blindly assign ALL of the arguments. I'm more than happy to hear of something that solves both of those problems without needing syntax changes though, as that means I can have it today ;) E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Augmented assignment syntax for objects.
On 25/04/17 02:15, Chris Angelico wrote: Bikeshedding: Your example looks a lot more like tuple assignment than multiple assignment. Well, originally, I thought it was just the spelling-the-same-name-twice thing that irritated me and I was just going to suggest a single assignment version like: self .= foo self .= bar Then I thought that this is similar to importing (referencing an object from one namespace in another under the same name). In that scenario, instead of: from other import foo from other import bar we have: from other import foo, bar That's where the comma-separated idea came from, and I understand it looks like a tuple (which is why I explicitly mentioned that) but it does in the import syntax too ;) The single argument version (though it doesn't help with vertical space) still reads better to me for the same reason that augmented assignment is clearer - there is no need to mentally parse that the same name is being used on both sides of the assignment because it's only spelled once. self .= foo .= bar .= baz .= spam .= ham Thanks for being the only person so far to understand that I don't necessarily want to bind ALL of the __init__ parameters to the object, just the ones I explicitly reference, but I'm not convinced by this suggestion. In chained assignment the thing on the RHS is bound to each name to the left of it and that is really not happening here. The trouble is that this syntax is really only going to be used inside __init__. Even if that was true, who ever writes one of those? :D E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Augmented assignment syntax for objects.
Hi. I suspect that this may have been discussed to death at some point in the past, but I've done some searching and I didn't come up with much. Apologies if I'm rehashing an old argument ;) I often find myself writing __init__ methods of the form: def __init__(self, foo, bar, baz, spam, ham): self.foo = foo self.bar = bar self.baz = baz self.spam = spam self.ham = ham This seems a little wordy and uses a lot of vertical space on the screen. Occasionally, I have considered something like: def __init__(self, foo, bar, baz, spam, ham): self.foo, self.bar, self.baz, self.spam, self.ham = \ foo, bar, baz, spam, ham ... just to make it a bit more compact - though in practice, I'd probably not do that with a list quite that long ... two or three items at most: def __init__(self, foo, bar, baz): self.foo, self.bar, self.baz = foo, bar, baz When I do that I'm torn because I know it has a runtime impact to create and unpack the implicit tuples and I'm also introducing a style asymmetry in my code just because of the number of parameters a method happens to have. So why not have an augmented assignment operator for object attributes? It addresses one of the same broad issues that the other augmented assignment operators were introduced for (that of repeatedly spelling names). The suggestion therefore is: def __init__(self, foo, bar, baz, spam, ham): self .= foo, bar, baz, spam, ham This is purely syntactic sugar for the original example: def __init__(self, foo, bar, baz, spam, ham): self.foo = foo self.bar = bar self.baz = baz self.spam = spam self.ham = ham ... so if any of the attributes have setters, then they are called as usual. It's purely a syntactic shorthand. Any token which is not suitable on the RHS of the dot in a standard "obj.attr =" assignment is a syntax error (no "self .= 1"). The comma-separators in the example are not creating a tuple object, they would work at the same level in the parser as the import statement's comma-separated lists - in the same way that "from pkg import a, b, c" is the same as saying: import pkg a = pkg.a b = pkg.b c = pkg.c ... "self .= a, b, c" is the same as writing: self.a = a self.b = b self.c = c E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Submitted a PR!
Hi. I may be way off-base here, but having scanned the patch I'm not sure I agree that it's the right way forward. What seems to be happening is that the homogeneity of the list is determined somehow (whether tracked with a hint or scanned just-in-time) and then a specific comparison function for a known subset of built-in types is selected if appropriate. I had assumed that there would be an "apples-to-apples" comparison function in the type structure and that the patch was simply tracking the list's homogeneity in order to enter a (generic) alternative loop to call that function over PyObject_RichCompare(). Why is that not the case? When a new C-level type is introduced (either a built-in or an extension module), why does the list object's code need to know about it in order to perform this optimisation? Why is there not a "tp_apple2apple" slot in the type structure which higher level functions (including the RichCompare() stuff - the first thing that function does is check the type of the objects anyway) can call if it determines that the two objects have the same type? Such a slot would also speed up "contains", "count", etc (for all classes) with no extra work, and no overhead of tracking or scanning the sequence's homogeneity. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] dict(default=int)
On 09/03/17 23:04, Spencer Brown wrote: Might make more sense to be dict.default(int), that way it doesn't have redundant dict names. I thought that, too. since you could do {1:2, 3:4}.default(int) Could you? Python 3.6.0 (default, Mar 9 2017, 00:43:06) [GCC 5.4.0 20160609] on linux Type "help", "copyright", "credits" or "license" for more information. >>> type(dict()) >>> type({}) >>> type(dict) The thing bound to the name 'dict' is not the same as the object returned by _calling_ 'dict'. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On 08/03/17 11:07, Steven D'Aprano wrote: I mentioned earlier that I have code which has to track the type of list items, and swaps to a different algorithm when the types are not all the same. Hmmm. Yes, I guess if the expensive version requires a lot of isinstance() messing or similar for each element then it could be better to have optimized versions for homogeneous lists of ints or strings etc. A list.is_heterogeneous() method could be implemented if it was necessary, I would prefer to get the list item's type: if mylist.__type_hint__ is float: If you know the list is homogeneous then the item's type is "type(mylist[0])". Also, having it be a function call gives an obvious place to put the transition from "unknown" to known state if the tri-state hint approach was taken. Otherwise, that would have to be hooked into the attribute access somehow. That's for someone who wants to try implementing it to decide and propose though :) E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On 08/03/17 00:18, Steven D'Aprano wrote: I thought about that and rejected it as an unnecessary complication. Hetrogeneous and unknown might as well be the same state: either way, you cannot use the homogeneous-type optimization. Knowing it's definitely one of two positive states and not knowing which of those two states it is is not the same thing when it comes to what one can and can't optimize cheaply :) It sort of depends on how cheaply one can track the states though ... Part of the complexity here is that I'd like this flag to be available to Python code, not just a hidden internal state of the list. Out of interest, for what purpose? Generally, I thought Python code should not need to worry about low-level optimisations such as this (which are C-Python specific AIUI). A list.is_heterogeneous() method could be implemented if it was necessary, but how would that be used? But also avoids bothering with an O(N) scan in some situations where the list really is hetrogeneous. So there's both an opportunity cost and a benefit. O(N) is worst case. Most of the anecdotal evidence in this thread so far seems to suggest that heterogeneous lists are not common. May or may not be true. Empirically, for me, it is true. Who knows? (and there is the question). Remember, we're talking about opportunities for applying an optimization here, nothing more. You're not giving up anything: at worst, the ordinary, unoptimized routine will run and you're no worse off than you are today. You are a little bit - the extra overhead of checking all of this (which is the unknown factor we're all skirting around ATM) costs. So converting a previously-heterogeneous list to a homogeneous list via a delete or whatever has a benefit if the optimisations can then be applied to that list many times in the future (i.e., once it becomes recognised as homogeneous again, it benefits from optimised paths in the interpreter). And of course, all that depends on your use case. It might work out better for one application over another. As you quite rightly point out, it needs someone to measure the alternatives and work out if _overall_ it has a positive impact ... so I'm not a fan of the "once heterogeneous, always considered heterogeneous" behaviour if it's cheap enough to avoid it. It is not just a matter of the cost of tracking three states versus two. It is a matter of the complexity of the interface. I suppose this could be reported to Python code as None, False or the type. I didn't think any of this stuff would come back to Python code (I thought we were talking about C-Python specific implementation only). How is this useful to Python code? Ultimately, this is all very pie-in-the-sky unless somebody tests just how expensive this is and whether the benefit is worthwhile. I agree. As I said before, I'm just pointing out things I noticed while looking at the current C code which could be picked up on if someone wants to try implementing and benchmarking any of this. It sort of feels like an argument, but I hope we're just violently agreeing on a generally shared goal ;) Regards, E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
Hi David, On 07/03/17 22:39, David Mertz wrote: On Tue, Mar 7, 2017 at 4:36 PM, Erik mailto:pyt...@lucidity.plus.com>> wrote: * Several other methods ('contains', 'remove', 'count', 'index') also use PyObject_RichCompareBool(). They could also presumably benefit from the same optimisation (perhaps it's not all about sort() - perhaps this gives a little more weight to the idea). Good point about list.extend(). I don't think __type_hint__ could help with .__contains__() or .count() or .remove(). E.g.: In [7]: lst = [1.0, 2.0, 1+0j, F(1,1)] In [8]: from fractions import Fraction as F In [9]: lst = [1.0, 2.0, 1+0j, F(1,1)] In [10]: 1 in lst Out[10]: True In [11]: lst.count(1) Out[11]: 3 In [12]: l.index(1) Out[12]: 0 The list has absolutely nothing of the right type. Yet it contains an item, counts things that are equal, finds a position for an equal item. Sure, but if the needle doesn't have the same type as the (homogeneous) haystack, then the rich comparison would still need to be done as a fallback (and would produce the result you indicate). But if the needle and the homogeneous haystack have the _same_ type, then a more optimised version of the operation can be done. Regards, E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
Hi Elliot, On 07/03/17 21:10, Elliot Gorokhovsky wrote: On Tue, Mar 7, 2017 at 1:47 PM Erik mailto:pyt...@lucidity.plus.com>> wrote: I'd prefer the sort optimization to be based on what my list contains NOW, not on what it may have contained some time in the past, so I'm not a fan of the "once heterogeneous, always considered heterogeneous" behaviour if it's cheap enough to avoid it. Sure. Dictionaries actually don't implement this, though: as soon as they see a non-string key, they permanently switch to a heterogeneous state (IIRC). I'd be interested to know if this approach had been considered and rejected for dicts - but I think dicts are a bit of a special case anyway. Because they are historically a fundamental building block of the language (for name lookups etc) they are probably more sensitive to small changes than other objects. I think the bigger problem, though, is that most list use does *not* involve sorting, so it would be a shame to impose the non-trivial overhead of type-checking on *all* list use. Yes, I understand that issue - I just thought I'd mention something that hadn't been pointed out yet _IF_ the idea of a type hint were to be considered (that's the sub-thread I'm replying to). If you're not doing that, then fine - I just wanted to put down things that occurred to me so they were documented (if only for rejection). So, while I'm at it ;), here are some other things I noticed scanning the list object source (again, only if a type hint was considered): * What is the type hint of an empty list? (this probably depends on how naturally the code for all of the type hint checking deals with NULL vs "unknown"). * listextend() - this should do the right thing with the type hint when extending one list with another. * Several other methods ('contains', 'remove', 'count', 'index') also use PyObject_RichCompareBool(). They could also presumably benefit from the same optimisation (perhaps it's not all about sort() - perhaps this gives a little more weight to the idea). Anyway, my patch could always be a precursor to a more general optimization along these lines. I'm almost finished fixing the problem Tim identified earlier in this thread; after that, it'll be ready for review! Nice - good job. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On 07/03/17 20:46, Erik wrote: (unless it was acceptable that once heterogeneous, a list is always considered heterogeneous - i.e., delete always sets the hint to NULL). Rubbish. I meant that delete would not touch the hint at all. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Exploiting type-homogeneity in list.sort() (again!)
On 06/03/17 03:08, David Mertz wrote: On Sun, Mar 5, 2017 at 6:45 PM, Steven D'Aprano mailto:st...@pearwood.info>> wrote: Here is a radical thought... why don't lists track their common type themselves? There's only a few methods which can add items: I had exactly the same thought. Lists would need to grow a new attribute, of course. I'm not sure how that would affect the object layout and word boundaries. But there might be free space for another attribute slot. The real question is whether doing this is a win. On each append/mutation operation we would need to do a comparison to the __type_hint__ (assuming Steven's spelling of the attribute). That's not free. Balancing that, however, when we actually *did* a sort, it would be O(1) to tell if it was homogeneous (and also the actual type if yes) rather than O(N). I don't think anyone has mentioned this yet, but FWIW I think the 'type hint' may need to be tri-state: heterogeneous (NULL), homogeneous (the pointer to the type structure) and also "unknown" (a sentinel value - the address of a static char or something). Otherwise, a delete operation on the list would need to scan the list to work out if it had changed from heterogeneous to homogeneous (unless it was acceptable that once heterogeneous, a list is always considered heterogeneous - i.e., delete always sets the hint to NULL). Instead, delete would change a NULL hint to the sentinel (leaving a valid type hint as it is) and then prior to sorting - as the hint is being checked anyway - if it's the sentinel value, perform the pre-scan that the existing patch is doing to restore the knowledge of just what type of list it is. I'd prefer the sort optimization to be based on what my list contains NOW, not on what it may have contained some time in the past, so I'm not a fan of the "once heterogeneous, always considered heterogeneous" behaviour if it's cheap enough to avoid it. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] For/in/as syntax
Hi Brice, On 04/03/17 08:45, Brice PARENT wrote: * Creating a real object at runtime for each loop which needs to be the target of a non-inner break or continue However, I'm not sure the object should be constructed and fed for every loop usage. It should probably only be instanciated if explicitly asked by the coder (by the use of "as loop_name"). That's what I meant by "needs to be the target of a non-inner break or continue" (OK, you are proposing something more than just a referenced break/continue target, but we are talking about the same thing). Only loops which use the syntax get a loop manager object. * For anything "funky" (my words, not yours ;)), there needs to be a way of creating a custom loop object - what would the syntax for that be? A callable needs to be invoked as well as the name bound (the current suggestion just binds a name to some magical object that appears from somewhere). I don't really understand what this means, as I'm not aware of how those things work in the background. What I mean is, in the syntax "for spam in ham as eggs:" the name "eggs" is bound to your loop manager object. Where is the constructor call for this object? what class is it? That's what I meant by "magical". If you are proposing the ability to create user-defined loop managers then there must be somewhere where your custom class's constructor is called. Otherwise how does Python know what type of object to create? Something like (this is not a proposal, just something plucked out of the air to hopefully illustrate what I mean): for spam in ham with MyLoop() as eggs: eggs.continue() I guess it would be magical in the sense it's not the habitual way of constructing an object. But it's what we're already used to with "as". When we use a context manager, like "with MyPersonalStream() as my_stream:", my_stream is not an object of type "MyPersonalStream" that has been built using the constructor, but the return of __enter__() By you have to spell the constructor (MyPersonalStream()) to see what type of object is being created (whether or not the eventual name bound in your context is to the result of a method call on that object, the constructor of your custom context manager is explicitly called. If you are saying that the syntax always implicitly creates an instance of a builtin class which can not be subclassed by a custom class then that's a bit different. This solution, besides having been explicitly rejected by Guido himself, I didn't realise that. Dead in the water then probably, which is fine, I wasn't pushing it. brings two functionalities that are part of the proposal, but are not its main purpose, which is having the object itself. Allowing to break and continue from it are just things that it could bring to us, but there are countless things it could also bring (not all of them being good ideas, of course), like the .skip() and the properties I mentioned, I understand that, but I concentrated on those because they were easily converted into syntax (and would probably be the only things I'd find useful - all the other stuff is mostly doable using a custom iterator, I think). I would agree that considering syntax for all of the extra things you mention would be a bad idea - which your loop manager object idea gets around. but we could discuss about some methods like forloop.reset(), forloop.is_first_iteration() which is just of shortcut to (forloop.count == 0), forloop.is_last_iteration() Also, FWIW, if I knew that in addition to the overhead of creating a loop manager object I was also incurring the overhead of a loop counter being maintained (usually, one is not required - if it is, use enumerate()) I would probably not use this construct and instead find ways of restructuring my code to avoid it using regular for loops. I'm not beating up on you - like I said, I think the idea is interesting. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] For/in/as syntax
On 03/03/17 19:02, Alexandre Brault wrote: I believe what Matthias is hoping for is an equivalent of Java's named break feature. Breaking out of an outer loop implicitly breaks out of all inner loops Yes, and although I think making this a runtime object is an interesting thought (in terms of perhaps allowing other funky stuff to be implemented by a custom object, in line with Python's general dynamic ethos), I think that it should perhaps be considered a lexer/parser level thing only. * Creating a real object at runtime for each loop which needs to be the target of a non-inner break or continue is quite a large overhead. How would this affect Python variants other than CPython? * For anything "funky" (my words, not yours ;)), there needs to be a way of creating a custom loop object - what would the syntax for that be? A callable needs to be invoked as well as the name bound (the current suggestion just binds a name to some magical object that appears from somewhere). * If nothing "funky" needs to be done then why not just make the whole thing syntax-only and have no real object, by making the 'as' name a parser-only token which is only valid as the optional subject of a break or continue statement: for foo in bar as LABEL: . # (a) . for spam in ham: . . if eggs(spam): continue LABEL . . if not frob(spam): break LABEL # (b) (a) is the code generator's implied 'continue' target for the LABEL loop. (b) is the code generator's implied 'break' target for the LABEL loop. I'm not saying that's a great solution either. It's probably not an option as there is now something that looks like a bound name but is not actually available at runtime - the following would not work: for foo in bar as LABEL: print(dir(LABEL)) (and presumably that is part of the reason why the proposal is the way it is). I'm generally +0 on the concept (it might be nice, but I'm not sure either the original proposal or what I mention above are particularly problem-free ;)). E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] More classical for-loop
On 18/02/17 19:35, Mikhail V wrote: You mean what my proposal would bring technically better than e.g.: for i,e in enumerate(Seq) Well, nothing, and I will simply use it, with only difference it could be: for i,e over enumerate(Seq) In this case only space holes will be smoothed out, so pure optical fix. But you also make the language's structure not make sense. For good or bad, English is the language that the keywords are written in so it makes sense for the Python language constructs to follow English constructs. An iterable in Python (something that can be the target of a 'for' loop) is a collection of objects (whether they represent a sequence of integers, a set of unique values, a list of random things, whatever). It is valid English to say "for each object in my collection, I will do the following:". It is not valid English to say "for each object over my collection, I will do the following:". In that respect, "in" is the correct keyword for Python to use. In the physical world, if the "collection" is some coins in your pocket, would you say "for each coin over my pocket, I will take it out and look at it"? Other than that, I also echo Stephen's comments that not all iterables' lengths can be known in advance, and not all iterables can be indexed, so looping using length and indexing is a subset of what the 'for' loop can do today. Why introduce new syntax for a restricted subset of what can already be done? Soon, someone else will propose another syntax for a different subset. This is why people are talking about the "burden" of learning these extra syntaxes. Rather than 10 different syntaxes for 10 different subsets, why not just learn the one syntax for the general case? E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Light-weight call-by-name syntax in Python
On 17/02/17 10:22, Stephan Houben wrote: Proposal: Light-weight call-by-name syntax in Python The following syntax a : b is to be interpreted as: a(lambda: b) Isn't this too general a syntax? Doesn't it lead to something like: if a: b: c: d: e: pass E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Things that won't change (proposed PEP)
On 12/01/17 19:51, Todd wrote: On Thu, Jan 12, 2017 at 2:33 PM, Sven R. Kunze mailto:srku...@mail.de>> wrote: First of all, I am anti-censor and pro-change. There is no "censorship" or "banning thoughts" going on here. Even with this PEP, people are free to think about and talk about how Python could work differently all they want. What this PEP does is tell them that certain decisions have been made about how the Python language is going to work, so they should be aware that such talk isn't going to actually result in any changes to the language. By saying that "these are things that will not change", then you _are_ sort of banning talk about them (if, as you assert, "such talk isn't going to actually result in any changes to the language" then you are saying don't waste your breath, we won't even consider your arguments). I think I get Sven's point. A long time ago, someone probably said "Python will never have any sort of type declarations.". But now there is type hinting. It's not the same thing, I know, but such a declaration in a PEP might have prevented people from even spending time considering hinting. Instead, if the PEP collected - for each 'frequently' suggested change - a summary of the reasons WHY each aspect is designed the way it is (with links to archived discussions or whatever) then that IMO that would be a good resource to cite in a canned response to such suggestions. It's not that "these things will never change", it's more of a "you need to provide a solid argument why your suggestion is different to, and better than, the cited suggestions that have already been rejected". Probably a lot of work to gather all the references though. But it could start out with one or two and grow from there. Add to it as and when people bring up the same old stuff next time. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Python reviewed
On 10/01/17 01:44, Simon Lovell wrote: Regarding the logical inconsistency of my argument, well I am saying that I would prefer my redundancy at the end of the loop rather than the beginning. To say that the status quo is better is to say that you prefer your redundancy at the beginning. It's not really that one prefers redundancy anywhere. It's more a question of: a) Does the redundancy have any (however small) benefit? b) How "expensive" is the redundancy (in this case, that equates to mandatory characters typed and subsequent screen noise when reading the code). I don't understand how a "redundancy" of a trailing colon in any statement that will introduce a new level of indentation is worse than having to remember to type "end" when a dedent (which is zero characters) does that. Trailing colon "cost": 1 * (0.n) Block end "cost": (len("end") + len(statement_text)) * 1.0 I still struggle to see why it should be mandatory though? That looks like a statement, but you've ended it with a question mark. Are you asking if you still struggle? I can't tell. Perhaps it's just the correct use of punctuation that you're objecting to ;) > One more comment I wanted to make about end blocks, is that a > respectable editor will add them for you, You are now asking me to write code with what you describe as a "respectable" editor. I use vim, which is very respectable, thank you. You'd like me to use "EditPlus 2" or equivalent. I struggle to see why that should be mandatory. Thanks for starting an entertaining thread, though ;) E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Fri, Dec 30, 2016 at 5:05 PM, Nick Coghlan wrote: > On 29 December 2016 at 22:12, Erik Bray wrote: >> >> 1) CPython's TLS: Defines -1 as an uninitialized key (by fact of the >> implementation--that the keys are integers starting from zero) >> 2) pthreads: Does not definite an uninitialized default value for >> keys, for reasons described at [1] under "Non-Idempotent Data Key >> Creation". I understand their reasoning, though I can't claim to know >> specifically what they mean when they say that some implementations >> would require the mutual-exclusion to be performed on >> pthread_getspecific() as well. I don't know that it applies here. > > > That section is a little weird, as they describe two requests (one for a > known-NULL default value, the other for implicit synchronisation of key > creation to prevent race conditions), and only provide the justification for > rejecting one of them (the second one). Right, that is confusing to me as well. I'm guessing the reason for rejecting the first is in part a way to force us to recognize the second issue. > If I've understood correctly, the situation they're worried about there is > that pthread_key_create() has to be called at least once-per-process, but > must be called before *any* call to pthread_getspecific or > pthread_setspecific for a given key. If you do "implicit init" rather than > requiring the use of an explicit mechanism like pthread_once (or our own > Py_Initialize and module import locks), then you may take a small > performance hit as either *every* thread then has to call > pthread_key_create() to ensure the key exists before using it, or else > pthread_getspecific() and pthread_setspecific() have to become potentially > blocking calls. Neither of those is desirable, so it makes sense to leave > that part of the problem to the API client. > > In our case, we don't want the implicit synchronisation, we just want the > known-NULL default value so the "Is it already set?" check can be moved > inside the library function. Okay, we're on the same page here then. I just wanted to make sure there wasn't anything else I was missing in Python's case. >> 3) windows: The return value of TlsAlloc() is a DWORD (unsigned int) >> and [2] states that its value should be opaque. >> >> So in principle we can cover all cases with an opaque struct that >> contains, as its first member, an is_initialized flag. The tricky >> part is how to initialize the rest of the struct (containing the >> underlying implementation-specific key). For 1) and 3) it doesn't >> matter--it can just be zero. For 2) it's trickier because there's no >> defined constant value to initialize a pthread_key_t to. >> >> Per Nick's suggestion this can be worked around by relying on C99's >> initialization semantics. Per [3] section 6.7.8, clause 21: >> >> """ >> If there are fewer initializers in a brace-enclosed list than there >> are elements or members of an aggregate, or fewer characters in a >> string literal used to initialize an array of known size than there >> are elements in the array, the remainder of the aggregate shall be >> initialized implicitly the same as objects that have static storage >> duration. >> """ >> >> How objects with static storage are initialized is described in the >> previous page under clause 10, but in practice it boils down to what >> you would expect: Everything is initialized to zero, including nested >> structs and arrays. >> >> So as long as we can use this feature of C99 then I think that's the >> best approach. > > > > I checked PEP 7 to see exactly which features we've added to the approved C > dialect, and designated initialisers are already on the list: > https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html > > So I believe that would allow the initializer to be declared as something > like: > > #define Py_tss_NEEDS_INIT {.is_initialized = false} Great! One could argue about whether or not the designated initializer syntax also incorporates omitted fields, but it would seem strange to insist that it doesn't. Have a happy new year, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Wed, Dec 21, 2016 at 5:07 PM, Nick Coghlan wrote: > On 21 December 2016 at 20:01, Erik Bray wrote: >> >> On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan wrote: >> > Option 2: Similar to option 1, but using a custom type alias, rather >> > than >> > using a C99 bool directly >> > >> > The closest API we have to these semantics at the moment would be >> > PyGILState_Ensure, so the following API naming might work for option 2: >> > >> > Py_ensure_t >> > Py_ENSURE_NEEDS_INIT >> > Py_ENSURE_INITIALIZED >> > >> > Respectively, these would just be aliases for bool, false, and true. >> > >> > And then modify the proposed PyThread_tss_create and PyThread_tss_delete >> > APIs to accept a "Py_ensure_t *init_flag" in addition to their current >> > arguments. >> >> That all sounds good--between the two option 2 looks a bit more explicit. >> >> Though what about this? Rather than adding another type, the original >> proposal could be changed slightly so that Py_tss_t *is* partially >> defined as a struct consisting of a bool, with whatever the native TLS >> key is. E.g. >> >> typedef struct { >> bool init_flag; >> #if defined(_POSIX_THREADS) >> pthreat_key_t key; >> #elif defined (NT_THREADS) >> DWORD key; >> /* etc... */ >> } Py_tss_t; >> >> Then it's just taking Masayuki's original patch, with the global bool >> variables, and formalizing that by combining the initialized flag with >> the key, and requiring the semantics you described above for >> PyThread_tss_create/delete. >> >> For Python's purposes it seems like this might be good enough, with >> the more general purpose pthread_once-like functionality not required. > > > Aye, I also thought of that approach, but talked myself out of it since > there's no definable default value for pthread_key_t. However, C99 partial > initialisation may deal with that for us (by zeroing the memory without > actually assigning a typed value to it), and if it does, I agree it would be > better to handle the initialisation flag automatically rather than requiring > callers to do it. I think I understand what you're saying here... To be clear, let me enumerate the three currently supported cases and how they're affected: 1) CPython's TLS: Defines -1 as an uninitialized key (by fact of the implementation--that the keys are integers starting from zero) 2) pthreads: Does not definite an uninitialized default value for keys, for reasons described at [1] under "Non-Idempotent Data Key Creation". I understand their reasoning, though I can't claim to know specifically what they mean when they say that some implementations would require the mutual-exclusion to be performed on pthread_getspecific() as well. I don't know that it applies here. 3) windows: The return value of TlsAlloc() is a DWORD (unsigned int) and [2] states that its value should be opaque. So in principle we can cover all cases with an opaque struct that contains, as its first member, an is_initialized flag. The tricky part is how to initialize the rest of the struct (containing the underlying implementation-specific key). For 1) and 3) it doesn't matter--it can just be zero. For 2) it's trickier because there's no defined constant value to initialize a pthread_key_t to. Per Nick's suggestion this can be worked around by relying on C99's initialization semantics. Per [3] section 6.7.8, clause 21: """ If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration. """ How objects with static storage are initialized is described in the previous page under clause 10, but in practice it boils down to what you would expect: Everything is initialized to zero, including nested structs and arrays. So as long as we can use this feature of C99 then I think that's the best approach. [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html [2] https://msdn.microsoft.com/en-us/library/windows/desktop/ms686801(v=vs.85).aspx [3] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Wed, Dec 21, 2016 at 11:01 AM, Erik Bray wrote: > That all sounds good--between the two option 2 looks a bit more explicit. > > Though what about this? Rather than adding another type, the original > proposal could be changed slightly so that Py_tss_t *is* partially > defined as a struct consisting of a bool, with whatever the native TLS > key is. E.g. > > typedef struct { > bool init_flag; > #if defined(_POSIX_THREADS) > pthreat_key_t key; *pthread_key_t* of course, though I wonder if that was a Freudian slip :) > #elif defined (NT_THREADS) > DWORD key; > /* etc... */ > } Py_tss_t; > > Then it's just taking Masayuki's original patch, with the global bool > variables, and formalizing that by combining the initialized flag with > the key, and requiring the semantics you described above for > PyThread_tss_create/delete. > > For Python's purposes it seems like this might be good enough, with > the more general purpose pthread_once-like functionality not required. Of course, that's not to say it might not be useful for some other purpose, but then it's outside the scope of this discussion as long as it isn't needed for TLS key initialization. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Wed, Dec 21, 2016 at 2:10 AM, Nick Coghlan wrote: > On 21 December 2016 at 01:35, Masayuki YAMAMOTO > wrote: >> >> 2016-12-20 22:30 GMT+09:00 Erik Bray : >>> >>> This is probably an implementation detail, but ISTM that even with >>> PyThread_call_once, it will be necessary to reset any used once_flags >>> manually in PyOS_AfterFork, essentially for the same reason the >>> autoTLSkey is reset there currently... >> >> >> Deleting threads key is executed on *_Fini functions, but Py_FinalizeEx >> function that calls *_Fini functions doesn't terminate CPython interpreter. >> Furthermore, source comment and document have said description about >> reinitialization after calling Py_FinalizeEx. [1] [2] That is to say there >> is an implicit possible that is reinitialization contrary to name >> "call_once" on a process level. Therefore, if CPython interpreter continues >> to allow reinitialization, I'd suggest to rename the call_once API to avoid >> misreading semantics. (for example, safe_init, check_init) > > > Ouch, I'd missed that, and I agree it's not a negligible implementation > detail - there are definitely applications embedding CPython out there that > rely on being able to run multiple Initialize/Finalize cycles in the same > process and have everything "just work". It also means using the > "PyThread_*" prefix for the initialisation tracking aspect would be > misleading, since the life cycle details are: > > 1. Create the key for the first time if it has never been previously set in > the process > 2. Destroy and reinit if Py_Finalize gets called > 3. Destroy and reinit if a new subprocess is forked > > It also means we can't use pthread_once even in the pthread TLS > implementation, since it doesn't provide those semantics. > > So I see two main alternatives here. > > Option 1: Modify the proposed PyThread_tss_create and PyThread_tss_delete > APIs to accept a "bool *init_flag" pointer in addition to their current > arguments. > > If *init_flag is true, then PyThread_tss_create is a no-op, otherwise it > sets the flag to true after creating the key. > If *init_flag is false, then PyThread_tss_delete is a no-op, otherwise it > sets the flag to false after deleting the key. > > Option 2: Similar to option 1, but using a custom type alias, rather than > using a C99 bool directly > > The closest API we have to these semantics at the moment would be > PyGILState_Ensure, so the following API naming might work for option 2: > > Py_ensure_t > Py_ENSURE_NEEDS_INIT > Py_ENSURE_INITIALIZED > > Respectively, these would just be aliases for bool, false, and true. > > And then modify the proposed PyThread_tss_create and PyThread_tss_delete > APIs to accept a "Py_ensure_t *init_flag" in addition to their current > arguments. That all sounds good--between the two option 2 looks a bit more explicit. Though what about this? Rather than adding another type, the original proposal could be changed slightly so that Py_tss_t *is* partially defined as a struct consisting of a bool, with whatever the native TLS key is. E.g. typedef struct { bool init_flag; #if defined(_POSIX_THREADS) pthreat_key_t key; #elif defined (NT_THREADS) DWORD key; /* etc... */ } Py_tss_t; Then it's just taking Masayuki's original patch, with the global bool variables, and formalizing that by combining the initialized flag with the key, and requiring the semantics you described above for PyThread_tss_create/delete. For Python's purposes it seems like this might be good enough, with the more general purpose pthread_once-like functionality not required. Best, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Tue, Dec 20, 2016 at 9:26 AM, Nick Coghlan wrote: > On 20 December 2016 at 00:53, Erik Bray wrote: >> >> On Mon, Dec 19, 2016 at 3:45 PM, Erik Bray wrote: >> >> Likewise - we know the status quo isn't right, and the proposed change >> >> addresses that. In reviewing the patch on the tracker, the one downside >> >> I've >> >> found is that due to "pthread_key_t" being an opaque type with no >> >> defined >> >> sentinel, the consuming code in _tracemalloc.c and pystate.c needed to >> >> add >> >> separate boolean flag variables to track whether or not the key had >> >> been >> >> created. (The pthread examples at >> >> >> >> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html >> >> use pthread_once for a similar effect) >> >> >> >> I don't see any obvious way around that either, as even using a small >> >> struct >> >> for native pthread TLS keys would still face the problem of how to >> >> initialise the pthread_key_t field. >> > >> > Hmm...fair point that it's not pretty. One way around it, albeit >> > requiring more work/complexity, would be to extend this proposal to >> > add a new function analogous to pthread_once--say--PyThread_call_once, >> > and an associated Py_once_flag_t >> >> Oops--fat-fingered a 'send' command before I finished. >> >> So workaround would be to add a PyThread_call_once function, >> analogous to pthread_once. Yet another interface one needs to >> implement for a native thread implementation, but not too hard either. >> For pthreads there's already an obvious analogue that can be wrapped >> directly. For other platforms that don't have a direct analogue a >> (naive) implementation is still fairly simple: All you need in >> Py_once_flag_t is a boolean flag with an associated mutex, and a >> sentinel value analogous to PTHREAD_ONCE_INIT. > > > Yeah, I think I'd prefer that - it aligns nicely with the way pthreads are > defined, and means we can be more prescriptive about how to use the new API > correctly for key declarations (we're currently a bit vague about exactly > how to handle that in the current TLS API). > > With that addition, I think it will be worth turning your initial post here > into a PR to the peps repo, though - not to resolve any particular > controversy, but rather as an easier to find reference for the design > rationale than a mailing list thread or a tracker issue. > > (I'd also be happy to volunteer as BDFL-Delegate, since I'm already > reviewing the patch on the tracker) Okay, thanks. I will work on a PR to the PEPs repo, and update the proposal to add the PyThread_call_once idea, which some prescription for how it should be used. Of course, an updated patch will have to follow as well. This is probably an implementation detail, but ISTM that even with PyThread_call_once, it will be necessary to reset any used once_flags manually in PyOS_AfterFork, essentially for the same reason the autoTLSkey is reset there currently... Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Mon, Dec 19, 2016 at 3:45 PM, Erik Bray wrote: > On Mon, Dec 19, 2016 at 1:11 PM, Nick Coghlan wrote: >> On 17 December 2016 at 03:51, Antoine Pitrou wrote: >>> >>> On Fri, 16 Dec 2016 13:07:46 +0100 >>> Erik Bray wrote: >>> > Greetings all, >>> > >>> > I wanted to bring attention to an issue that's been languishing on the >>> > bug tracker since last year, which I think would best be addressed by >>> > changes to CPython's C-API. The original issue is at >>> > http://bugs.python.org/issue25658, but I have made an effort below in >>> > a sort of proto-PEP to summarize the problem and the proposed >>> > solution. >>> > >>> > I haven't written this up in the proper PEP format because I want to >>> > see if the idea has some broader support first, and it's also not >>> > clear to me whether C-API changes (especially to undocumented APIs) >>> > even require their own PEP. >>> >>> This is a nice detailed write-up and I'm in favour of the proposal. >> >> >> Likewise - we know the status quo isn't right, and the proposed change >> addresses that. In reviewing the patch on the tracker, the one downside I've >> found is that due to "pthread_key_t" being an opaque type with no defined >> sentinel, the consuming code in _tracemalloc.c and pystate.c needed to add >> separate boolean flag variables to track whether or not the key had been >> created. (The pthread examples at >> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html >> use pthread_once for a similar effect) >> >> I don't see any obvious way around that either, as even using a small struct >> for native pthread TLS keys would still face the problem of how to >> initialise the pthread_key_t field. > > Hmm...fair point that it's not pretty. One way around it, albeit > requiring more work/complexity, would be to extend this proposal to > add a new function analogous to pthread_once--say--PyThread_call_once, > and an associated Py_once_flag_t Oops--fat-fingered a 'send' command before I finished. So workaround would be to add a PyThread_call_once function, analogous to pthread_once. Yet another interface one needs to implement for a native thread implementation, but not too hard either. For pthreads there's already an obvious analogue that can be wrapped directly. For other platforms that don't have a direct analogue a (naive) implementation is still fairly simple: All you need in Py_once_flag_t is a boolean flag with an associated mutex, and a sentinel value analogous to PTHREAD_ONCE_INIT. Best, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Mon, Dec 19, 2016 at 1:11 PM, Nick Coghlan wrote: > On 17 December 2016 at 03:51, Antoine Pitrou wrote: >> >> On Fri, 16 Dec 2016 13:07:46 +0100 >> Erik Bray wrote: >> > Greetings all, >> > >> > I wanted to bring attention to an issue that's been languishing on the >> > bug tracker since last year, which I think would best be addressed by >> > changes to CPython's C-API. The original issue is at >> > http://bugs.python.org/issue25658, but I have made an effort below in >> > a sort of proto-PEP to summarize the problem and the proposed >> > solution. >> > >> > I haven't written this up in the proper PEP format because I want to >> > see if the idea has some broader support first, and it's also not >> > clear to me whether C-API changes (especially to undocumented APIs) >> > even require their own PEP. >> >> This is a nice detailed write-up and I'm in favour of the proposal. > > > Likewise - we know the status quo isn't right, and the proposed change > addresses that. In reviewing the patch on the tracker, the one downside I've > found is that due to "pthread_key_t" being an opaque type with no defined > sentinel, the consuming code in _tracemalloc.c and pystate.c needed to add > separate boolean flag variables to track whether or not the key had been > created. (The pthread examples at > http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html > use pthread_once for a similar effect) > > I don't see any obvious way around that either, as even using a small struct > for native pthread TLS keys would still face the problem of how to > initialise the pthread_key_t field. Hmm...fair point that it's not pretty. One way around it, albeit requiring more work/complexity, would be to extend this proposal to add a new function analogous to pthread_once--say--PyThread_call_once, and an associated Py_once_flag_t ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Sat, Dec 17, 2016 at 8:21 AM, Stephen J. Turnbull wrote: > Erik Bray writes: > > > Abstract > > > > > > The proposal is to add a new Thread Local Storage (TLS) API to CPython > > which would supersede use of the existing TLS API within the CPython > > interpreter, while deprecating the existing API. > > Thank you for the analysis! And thank *you* for the feedback! > Question: > > > Further, the old PyThread_*_key* functions will be marked as > > deprecated. > > Of course, but: > > > Additionally, the pthread implementations of the old > > PyThread_*_key* functions will either fail or be no-ops on > > platforms where sizeof(pythead_t) != sizeof(int). > > Typo "pythead_t" in last line. Thanks, yes, that was suppose to be pthread_key_t of course. I think I had a few other typos too. > I don't understand this. I assume that there are no such platforms > supported at present. I would think that when such a platform becomes > supported, code supporting "key" functions becomes unsupportable > without #ifdefs on that platform, at least directly. So you should > either (1) raise UnimplementedError, or (2) provide the API as a > wrapper over the new API by making the integer keys indexes into a > table of TSS'es, or some such device. I don't understand how (3) > "make it a no-op" can be implemented for PyThread_create_key -- return > 0 or -1? That would only work if there's a failure return status like > 0 or -1, and it seems really dangerous to me since in general a lot of > code doesn't check status even though it should. Even for code > checking the status, the error message will be suboptimal ("creation > failed" vs. "unimplemented"). Masayuki already explained this downthread I think, but I could have probably made that section more precise. The point was that PyThread_create_key should immediately return -1 in this case. This is just a subtle difference over the current situation, which is that PyThread_create_key succeeds, but the key is corrupted by being cast to an int, so that later calls to PyThread_set_key_value and the like fail unexpectedly. The point is that PyThread_create_key (and we're only talking about the pthread implementation thereof, to be clear) must fail immediately if it can't work correctly. #ifdefs on the platform would not be necessary--instead, Masayuki's patch adds a feature check in configure.ac for sizeof(int) == sizeof(pthread_key_t). It should be noted that even this check is not 100% perfect, as on Linux pthread_key_t is an unsigned int, and so technically can cause Python's signed int key to overflow, but there's already an explicit check for that (which would be kept), and it's also a very unlikely scenario. > I gather from references to casting pthread_key_t to unsigned int and > back that there's probably code that does this in ways making (2) too > dangerous to support. If true, perhaps that should be mentioned here. It's not necessarily too dangerous, so much as not worth the trouble, IMO. Simpler to just provide, and immediately use the new API and make the old one deprecated and explicitly not supported on those platforms where it can't work. Thanks, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] New PyThread_tss_ C-API for CPython
On Sun, Dec 18, 2016 at 12:10 AM, Masayuki YAMAMOTO wrote: > 2016-12-17 18:35 GMT+09:00 Stephen J. Turnbull > : >> >> I don't understand this. I assume that there are no such platforms >> supported at present. I would think that when such a platform becomes >> supported, code supporting "key" functions becomes unsupportable >> without #ifdefs on that platform, at least directly. So you should >> either (1) raise UnimplementedError, or (2) provide the API as a >> wrapper over the new API by making the integer keys indexes into a >> table of TSS'es, or some such device. I don't understand how (3) >> "make it a no-op" can be implemented for PyThread_create_key -- return >> 0 or -1? That would only work if there's a failure return status like >> 0 or -1, and it seems really dangerous to me since in general a lot of >> code doesn't check status even though it should. Even for code >> checking the status, the error message will be suboptimal ("creation >> failed" vs. "unimplemented"). > > > PyThread_create_key has required user to check the return value since when > key creation fails, returns -1 instead of valid key value. Therefore, my > patch changes PyThread_create_key that always return -1 on platforms that > cannot cast key to int safely and current API never return valid key value > to these platforms. Its advantage to not change function specifications and > no effect on supported platforms. Hence, this is reason that doesn't raise > any exception on the API. > > (2) of ideas can enable current API on specific-platforms. If it's simple, > I'd have liked to select it. However, work that brings current API using > native TLS to specific-platforms brings duplication implementation that > manages keys, and it's ugly (same reason for Erik's draft, the last item of > Rejected Ideas). Thus, I gave up to keep feature and decided to implement > "no-op", delegate error handling to API users. Yep--I think it speaks to the sensibleness of that decision that I pretty much read your mind :) ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] New PyThread_tss_ C-API for CPython
oses by way of an API that is not compatible with POSIX (and in fact makes invalid assumptions about pthreads). Rationale for Proposed Solution === The use of an opaque type (Py_tss_t) to key TLS values allows the API to be compatible, at least in this regard, with CPython's internal TLS implementation, as well as all present (NT and posix) and future (C11?) native TLS implementations supported by CPython, as it allows the definition of Py_tss_t to depend on the underlying implementation. A new API must be introduced, rather than changing the function signatures of the current API, in order to maintain backwards compatibility. The new API also more clearly groups together these related functions under a single name prefix, "PyThread_tss_". The "tss" in the name stands for "thread-specific storage", and was influenced by the naming and design of the "tss" API that is part of the C11 threads API. However, this is in no way meant to imply compatibility with or support for the C11 threads API, or signal any future intention of supporting C11--it's just the influence for the naming and design. Changing PyThread_create_key to immediately return a failure status on systems using pthreads where sizeof(int) != sizeof(pthread_key_t) is intended as a sanity check: Currently, PyThread_create_key will report initial success on such systems, but attempts to use the returned key are likely to fail. Although in practice this failure occurs quickly during interpreter startup, it's better to fail immediately at the source of failure (PyThread_create_key) rather than sometime later when use of an invalid key is attempted. Rejected Ideas == * Do nothing: The status quo is fine because it works on Linux, and platforms wishing to be supported by CPython should follow the requirements of PEP-11. As explained above, while this would be a fair argument if CPython were being to asked to make changes to support particular quirks of a specific platform, in this case the platforms in question are only asking to fix a quirk of CPython that prevents it from being used to its full potential on those platforms. The fact that the current implementation happens to work on Linux is a happy accident, and there's no guarantee that will never change. * Affected platforms should just configure Python --without-threads: This is a possible temporary workaround to the issue, but only that. Python should not be hobbled on affected platforms despite them being otherwise perfectly capable of running multi-threaded Python. * Affected platforms should not define Py_HAVE_NATIVE_TLS: This is a more acceptable alternative to the previous idea, and in fact there is a patch to do just that [2]. However, CPython's internal TLS implementation being "slower and clunkier" in general than native implementations still needlessly hobbles performance on affected platforms. At least one other module (tracemalloc) is also broken if Python is built without Py_HAVE_NATIVE_TLS. * Keep the existing API, but work around the issue by providing a mapping from pthread_key_t values to ints. A couple attempts were made at this [3] [4], but this only injects needless complexity and overhead into performance-critical code on platforms that are not currently affected by this issue (such as Linux). Even if use of this workaround were made conditional on platform compatibility, it introduces platform-specific code to maintain, and still has the problem of the previous rejected ideas of needlessly hobbling performance on affected platforms. Implementation == An initial version of a patch [5] is available on the bug tracker for this issue. The patch is proposed and written by Masayuki Yamamoto, who should be considered a co-author of this proto-PEP, though I have not consulted directly with him in writing this. If he's reading, he should chime in in case I've misrepresented anything. If you've made it this far, thanks for reading and thank you for your consideration, Erik [1] https://bugs.python.org/msg116292 [2] http://bugs.python.org/file45548/configure-pthread_key_t.patch [3] http://bugs.python.org/file44269/issue25658-1.patch [4] http://bugs.python.org/file44303/key-constant-time.diff [5] http://bugs.python.org/file45763/pythread-tss.patch ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] PEP8 dictionary indenting addition
On Sun, Oct 9, 2016 at 2:25 AM, Steven D'Aprano wrote: > On Sat, Oct 08, 2016 at 09:26:13PM +0200, Jelte Fennema wrote: >> I have an idea to improve indenting guidelines for dictionaries for better >> readability: If a value in a dictionary literal is placed on a new line, it >> should have (or at least be allowed to have) a n additional hanging indent. >> >> Below is an example: >> >> mydict = {'mykey': >> 'a very very very very very long value', >> 'secondkey': 'a short value', >> 'thirdkey': 'a very very very ' >> 'long value that continues on the next line', >> } > > Looks good to me, except that my personal preference for the implicit > string concatenation (thirdkey) is to move the space to the > following line, and (if possible) align the parts: > mydict = {'mykey': > 'a very very very very very long value', > 'secondkey': 'a short value', > 'thirdkey': 'a very very very' > ' long value that continues on the next line', > } Heh--not to bikeshed, but my personal preference is to leave the trailing space on the first line. This is because by the time I've started a new line (and possibly have spent time fussing with indentation for the odd cases that my editor doesn't get quite right) I'll have forgotten that I need to start the line with a space :) Best, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] PEP8 dictionary indenting addition
On 09/10/16 12:43, Paul Moore wrote: I'd probably lay this out as # Less indent needed for keys, so thirdkey fits better in this case mydict = { 'mykey': 'a very very very very very long value', 'secondkey': 'a short value', 'thirdkey': 'a very very very long value that continues on the next line', } +1 from me on this general style of layout. Why associate the indentation level with the name of the identifier being bound? Treat the opening parenthesis as beginning a "suite" of indented key/value pairs in the same way as a colon introduces an indented suite of statements in other constructs. It may not be part of the formal syntax, but it's consistent with other constructs in the language that _are_ defined by the formal syntax. E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] if-statement in for-loop
Hi, On 11/09/16 10:36, Dominik Gresch wrote: So I asked myself if a syntax as follows would be possible: for i in range(10) if i != 5: body I've read the thread and I understand the general issues with making the condition part of the expression. However, what if this wasn't part of changing the expression syntax but changing the declarative syntax instead to remove the need for a newline and indent after the colon? I'm fairly sure this will have been suggested and shot down in the past, but I couldn't find any obvious references so I'll say it (again?). The expression suggested could be spelled: for i in range(10): if i != 5: body So, if a colon followed by another suite is equivalent to the same construct but without the INDENT (and then the corresponding DEDENT unwinds up to the point of the first keyword) then we get something that's pretty much as succinct as Dominik suggested. Of course, we then might get: for i in myweirdobject: if i != 5: while foobar(i) > 10: while frob(i+1) < 99: body ... which is hideous. But is it actually _likely_? E. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] if-statement in for-loop
On Tue, Sep 27, 2016 at 5:33 PM, Nick Coghlan wrote: > On 28 September 2016 at 00:55, Erik Bray wrote: >> On Sun, Sep 11, 2016 at 12:28 PM, Bernardo Sulzbach >> wrote: >>> On 09/11/2016 06:36 AM, Dominik Gresch wrote: >>>> >>>> So I asked myself if a syntax as follows would be possible: >>>> >>>> for i in range(10) if i != 5: >>>> body >>>> >>>> Personally, I find this extremely intuitive since this kind of >>>> if-statement is already present in list comprehensions. >>>> >>>> What is your opinion on this? Sorry if this has been discussed before -- >>>> I didn't find anything in the archives. >>>> >>> >>> I find it interesting. >>> >>> I thing that this will likely take up too many columns in more convoluted >>> loops such as >>> >>> for element in collection if is_pretty_enough(element) and ...: >>> ... >>> >>> However, this "problem" is already faced by list comprehensions, so it is >>> not a strong argument against your idea. >> >> Sorry to re-raise this thread--I'm inclined to agree that the case >> doesn't really warrant new syntax. I just wanted to add that I think >> the very fact that this syntax is supported by list comprehensions is >> an argument *in its favor*. >> >> I could easily see a Python newbie being confused that they can write >> "for x in y if z" inside a list comprehension, but not in a bare >> for-statement. Sure they'd learn quickly enough that the filtering >> syntax is unique to list comprehensions. But to anyone who doesn't >> know the historical progression of the Python language that would seem >> highly arbitrary and incongruous I would think. >> >> Just $0.02 USD from a pedagogical perspective. > > This has come up before, and it's considered a teaching moment > regarding how the comprehension syntax actually works: it's an > *arbitrarily deep* nested chain of if statements and for statements. > > That is: > > [f(x,y,z) for x in seq1 if p1(x) for y in seq2 if p2(y) for z in > seq3 if p3(z)] > > can be translated mechanically to the equivalent nested statements > (with the only difference being that the loop variable leak due to the > missing implicit scope): > > result = [] > for x in seq1: > if p1(x): > for y in seq2: > if p2(y): > for z in seq3: > if p3(z): > result.append(f(x, y, z)) > > So while the *most common* cases are a single for loop (map > equivalent), or a single for loop and a single if statement (filter > equivalent), they're not only the forms folks may encounter in the > wild. Thanks for pointing this out Nick. Then following my own logic it would be desirable to also allow the nested for loop syntax of list comprehensions outside them as well. That's a slippery slope to incomprehensibility (they're bad enough in list comprehensions, though occasionally useful). This is a helpful way to think about list comprehensions though--I'll remember it next time I teach them. Thanks, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] if-statement in for-loop
On Sun, Sep 11, 2016 at 12:28 PM, Bernardo Sulzbach wrote: > On 09/11/2016 06:36 AM, Dominik Gresch wrote: >> >> So I asked myself if a syntax as follows would be possible: >> >> for i in range(10) if i != 5: >> body >> >> Personally, I find this extremely intuitive since this kind of >> if-statement is already present in list comprehensions. >> >> What is your opinion on this? Sorry if this has been discussed before -- >> I didn't find anything in the archives. >> > > I find it interesting. > > I thing that this will likely take up too many columns in more convoluted > loops such as > > for element in collection if is_pretty_enough(element) and ...: > ... > > However, this "problem" is already faced by list comprehensions, so it is > not a strong argument against your idea. Sorry to re-raise this thread--I'm inclined to agree that the case doesn't really warrant new syntax. I just wanted to add that I think the very fact that this syntax is supported by list comprehensions is an argument *in its favor*. I could easily see a Python newbie being confused that they can write "for x in y if z" inside a list comprehension, but not in a bare for-statement. Sure they'd learn quickly enough that the filtering syntax is unique to list comprehensions. But to anyone who doesn't know the historical progression of the Python language that would seem highly arbitrary and incongruous I would think. Just $0.02 USD from a pedagogical perspective. Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] real numbers with SI scale factors
On Tue, Aug 30, 2016 at 5:48 AM, Ken Kundert wrote: > Erik, > One aspect of astropy.units that differs significantly from what I am > proposing is that with astropy.units a user would explicitly specify the scale > factor along with the units, and that scale factor would not change even if > the > value became very large or very small. For example: > > >>> from astropy import units as u > >>> d_andromeda = 7.8e5 * u.parsec > >>> print(d_andromeda) > 78.0 pc > > >>> d_sun = 93e6*u.imperial.mile > >>> print(d_sun.to(u.parsec)) > 4.850441695494146e-06 pc > > >>> print(d_andromeda.to(u.kpc)) > 780.0 kpc > > >>> print(d_sun.to(u.kpc)) > 4.850441695494146e-09 kpc > > I can see where this can be helpful at times, but it kind of goes against the > spirit of SI scale factors, were you are generally expected to 'normalize' the > scale factor (use the scale factor that results in the digits presented before > the decimal point falling between 1 and 999). So I would expected > > d_andromeda = 780 kpc > d_sun = 4.8504 upc > > Is the normalization available astropy.units and I just did not find it? > Is there some reason not to provide the normalization? > > It seems to me that pre-specifying the scale factor might be preferred if one > is > generating data for a table and all the magnitude of the values are known in > advance to within 2-3 orders of magnitude. > > It also seems to me that if these assumptions were not true, then normalizing > the scale factors would generally be preferred. > > Do you believe that? Hi Ken, I see what you're getting at, and that's a good idea. There's also nothing in the current implementation preventing it, and I think I'll even suggest this to Astropy (with proper attribution)! I think there are reasons not to always do this, but it's a nice option to have. Point being nothing about this particular feature requires special support from the language, unless I'm missing something obvious. And given that Astropy (or any other units library) is third-party chances are a feature like this will land in place a lot faster than it has any chance of showing up in Python :) Best, Erik > On Mon, Aug 29, 2016 at 03:05:50PM +0200, Erik Bray wrote: >> Astropy also has a very powerful units package--originally derived >> from pyunit I think but long since diverged and grown: >> >> http://docs.astropy.org/en/stable/units/index.html >> >> It was originally developed especially for astronomy/astrophysics use >> and has some pre-defined units that many other packages don't have, as >> well as support for logarithmic units like decibel and optional (and >> customizeable) unit equivalences (e.g. frequency/wavelength or >> flux/power). >> >> That said, its power extends beyond astronomy and I heard through last >> week's EuroScipy that even some biology people have been using it. >> There's been some (informal) talk about splitting it out from Astropy >> into a stand-alone package. This is tricky since almost everything in >> Astropy has been built around it (dimensional calculations are always >> used where possible), but not impossible. >> >> One of the other big advantages of astropy.units is the Quantity class >> representing scale+dimension values. This is deeply integrated into >> Numpy so that units can be attached to Numpy arrays, and all Numpy >> ufuncs can operate on them in a dimensionally meaningful way. The >> needs for this have driven a number of recent features in Numpy. This >> is work that, unfortunately, could never be integrated into the Python >> stdlib. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] A proposal to rename the term "duck typing"
On Sun, Aug 28, 2016 at 7:41 PM, Bruce Leban wrote: > > > On Sunday, August 28, 2016, ROGER GRAYDON CHRISTMAN wrote: >> >> >> We have a term in our lexicon "duck typing" that traces its origins, in >> part to a quote along the lines of >> "If it walks like a duck, and talks like a duck, ..." >> >> ... >> >> In that case, it would be far more appropriate for use to call this sort >> of type analysis "witch typing" > > > I believe the duck is out of the bag on this one. First the "duck test" that > you quote above is over 100 years old. > https://en.m.wikipedia.org/wiki/Duck_test So that's entrenched. > > Second this isn't a Python-only term anymore and language is notoriously > hard to change prescriptively. > > Third I think the duck test is more appropriate than the witch test which > involves the testers faking the results. Agreed. It's also fairly problematic given that you're deriving the term from a sketch about witch hunts. While the Monty Python sketch is hilarious and, it's the ignorant mob that's the butt of the joke rather than the "witch", this joke doesn't necessarily play well universally, especially given that there places today where women are being killed for being "witches". Best, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] real numbers with SI scale factors
On Mon, Aug 29, 2016 at 3:05 PM, Erik Bray wrote: > On Mon, Aug 29, 2016 at 9:07 AM, Ken Kundert > wrote: >> On Mon, Aug 29, 2016 at 01:45:20PM +1000, Steven D'Aprano wrote: >>> On Sun, Aug 28, 2016 at 08:26:38PM -0700, Brendan Barnwell wrote: >>> > On 2016-08-28 18:44, Ken Kundert wrote: >>> > >When working with a general purpose programming language, the above >>> > >numbers >>> > >become: >>> > > >>> > > 780kpc -> 7.8e+05 >>> [...] >>> >>> For the record, I don't know what kpc might mean. "kilo pico speed of >>> light"? So I looked it up using units, and it is kilo-parsecs. That >>> demonstrates that unless your audience is intimately familiar with the >>> domain you are working with, adding units (especially units that aren't >>> actually used for anything) adds confusion. >>> >>> Python is not a specialist application targetted at a single domain. It >>> is a general purpose programming language where you can expect a lot of >>> cross-domain people (e.g. a system administrator asked to hack on a >>> script in a domain they know nothing about). >> >> I talked to astrophysicist about your comments, and what she said was: >> 1. She would love it if Python had built in support for real numbers with SI >>scale factors >> 2. I told her about my library for reading and writing numbers with SI scale >>factors, and she was much less enthusiastic because using it would require >>convincing the rest of the group, which would be too much effort. >> 3. She was amused by the "kilo pico speed of light" comment, but she was >> adamant >>that the fact that you, or some system administrator, does not understand >>what kpc means has absolutely no affect on her desired to use SI scale >>factors. Her comment: I did not write it for him. >> 4. She pointed out that the software she writes and uses is intended either >> for >>herself of other astrophysicists. No system administrators involved. > > Astropy also has a very powerful units package--originally derived > from pyunit I think but long since diverged and grown: > > http://docs.astropy.org/en/stable/units/index.html > > It was originally developed especially for astronomy/astrophysics use > and has some pre-defined units that many other packages don't have, as > well as support for logarithmic units like decibel and optional (and > customizeable) unit equivalences (e.g. frequency/wavelength or > flux/power). > > That said, its power extends beyond astronomy and I heard through last > week's EuroScipy that even some biology people have been using it. > There's been some (informal) talk about splitting it out from Astropy > into a stand-alone package. This is tricky since almost everything in > Astropy has been built around it (dimensional calculations are always > used where possible), but not impossible. > > One of the other big advantages of astropy.units is the Quantity class > representing scale+dimension values. This is deeply integrated into > Numpy so that units can be attached to Numpy arrays, and all Numpy > ufuncs can operate on them in a dimensionally meaningful way. The > needs for this have driven a number of recent features in Numpy. This > is work that, unfortunately, could never be integrated into the Python > stdlib. I'll also add that syntactic support for units has rarely been an issue in Astropy. The existing algebraic rules for units work fine with Python's existing order of operations. It can be *nice* to be able to write "1m" instead of "1 * m" but ultimately it doesn't add much for clarity (and if really desired could be handled with a preparser--something I've considered adding for Astropy sources (via codecs). Best, Erik ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] real numbers with SI scale factors
On Mon, Aug 29, 2016 at 9:07 AM, Ken Kundert wrote: > On Mon, Aug 29, 2016 at 01:45:20PM +1000, Steven D'Aprano wrote: >> On Sun, Aug 28, 2016 at 08:26:38PM -0700, Brendan Barnwell wrote: >> > On 2016-08-28 18:44, Ken Kundert wrote: >> > >When working with a general purpose programming language, the above >> > >numbers >> > >become: >> > > >> > > 780kpc -> 7.8e+05 >> [...] >> >> For the record, I don't know what kpc might mean. "kilo pico speed of >> light"? So I looked it up using units, and it is kilo-parsecs. That >> demonstrates that unless your audience is intimately familiar with the >> domain you are working with, adding units (especially units that aren't >> actually used for anything) adds confusion. >> >> Python is not a specialist application targetted at a single domain. It >> is a general purpose programming language where you can expect a lot of >> cross-domain people (e.g. a system administrator asked to hack on a >> script in a domain they know nothing about). > > I talked to astrophysicist about your comments, and what she said was: > 1. She would love it if Python had built in support for real numbers with SI >scale factors > 2. I told her about my library for reading and writing numbers with SI scale >factors, and she was much less enthusiastic because using it would require >convincing the rest of the group, which would be too much effort. > 3. She was amused by the "kilo pico speed of light" comment, but she was > adamant >that the fact that you, or some system administrator, does not understand >what kpc means has absolutely no affect on her desired to use SI scale >factors. Her comment: I did not write it for him. > 4. She pointed out that the software she writes and uses is intended either > for >herself of other astrophysicists. No system administrators involved. Astropy also has a very powerful units package--originally derived from pyunit I think but long since diverged and grown: http://docs.astropy.org/en/stable/units/index.html It was originally developed especially for astronomy/astrophysics use and has some pre-defined units that many other packages don't have, as well as support for logarithmic units like decibel and optional (and customizeable) unit equivalences (e.g. frequency/wavelength or flux/power). That said, its power extends beyond astronomy and I heard through last week's EuroScipy that even some biology people have been using it. There's been some (informal) talk about splitting it out from Astropy into a stand-alone package. This is tricky since almost everything in Astropy has been built around it (dimensional calculations are always used where possible), but not impossible. One of the other big advantages of astropy.units is the Quantity class representing scale+dimension values. This is deeply integrated into Numpy so that units can be attached to Numpy arrays, and all Numpy ufuncs can operate on them in a dimensionally meaningful way. The needs for this have driven a number of recent features in Numpy. This is work that, unfortunately, could never be integrated into the Python stdlib. ___ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/