On Fri, May 29, 2020 at 1:56 PM Rhodri James <rho...@kynesim.co.uk> wrote:
> Presumably "delayed" is something that would be automatically applied to
> the actual parameter given, otherwise your call graphs might or might
> not actually be call graphs depending on how the function was called.
> What happens if I call "foo(y=0)" for instance?
>

I am slightly hijacking the thread. I think the "solution" to the narrow "problem" of mutable default arguments is not at all worth having. So yes, if that was the only, or even main, purpose of a hypothetical 'delayed' keyword and 'DelayedType', it would absolutely not be worthwhile. It would just happen to solve that problem as a side effect.

Where I think it is valuable is the idea of letting all the normal operations work on EITHER a DelayedType or whatever type the operation would otherwise operate on. So no, `foo(y=0)` would pass in a concrete type and do greedy calculations, nothing delayed, no in-memory call graph (other than whatever is implicit in the bytecode).

However, I don't really love the 'concretize' operation, which you correctly identify as a source of new bugs. I think it is necessary occasionally, and actually there is no reason it cannot be a plain old function rather than a keyword. The DelayedType object is simply some way of storing a call graph, which has some kind of underlying representation. So the only slightly special function concretize() could just walk the graph.

I think it would be better if MANY operations simply *implied* the need to concretize the call tree into a result value. The problem is, I'm really not sure exactly which operations, and probably a lot of them could have arguments in either direction.

My thinking here is inspired by several data frame libraries that address this issue.

* Pandas is always eager. Call a method, it does the computation right away, and usually returns the mutated data frame object (or some new similar object) to chain methods in a "fluent style."

* Dask.dataframe is always lazy. It implements 95% of the Pandas API (by putting Pandas "under the hood"). You can still chain methods, but everything returns a call graph that the next method uses to build a larger call graph. The ONLY time any computation is done is when you explicitly call `df.compute()`.

* Vaex is another data frame library that is very interesting. It is *almost always* lazy, and likewise allows chaining methods. You can easily build up call graphs behind the scenes (either by chaining or with intermediate names for "computations"). However, the RIGHT collection of methods know they need to concretize... and they just do the right thing.

Of these, I've used Vaex the least by far. However, my impression is that it makes the right decisions, and users rarely need to think about what is going on behind the scenes (other than that operations on big data are darn fast). Still, "things with data frames" is more specialized than "everything one can do in Python."

I'm not sure what the right answers are. For example, my first intuition is that `print(x)` is a good occasion to implicitly concretize. If you really want the underlying delayed, maybe you'd write `print(x.callgraph)` or something similar. But then, sticking in a debugging print might wind up being an expensive thing to do in that case, so...

--
The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
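
P.S. For anyone who wants something to poke at, here is a minimal sketch of the idea discussed above. Everything in it is an assumption for illustration: the `Delayed` class stands in for the hypothetical 'DelayedType', `delayed()` and `concretize()` are plain functions rather than keywords, and only `+` and `*` are wired up. It is not a proposal for how a real implementation would look.

```python
import operator


class Delayed:
    """Hypothetical stand-in for 'DelayedType': one node of a call graph."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    # Operators do no work; they just grow the graph.
    def __add__(self, other):
        return Delayed(operator.add, self, other)

    def __radd__(self, other):
        return Delayed(operator.add, other, self)

    def __mul__(self, other):
        return Delayed(operator.mul, self, other)

    def __rmul__(self, other):
        return Delayed(operator.mul, other, self)


def delayed(func, *args):
    """Hypothetical spelling of the 'delayed' keyword as a plain function."""
    return Delayed(func, *args)


def concretize(value):
    """Walk the graph recursively; concrete values pass through unchanged."""
    if isinstance(value, Delayed):
        return value.func(*(concretize(arg) for arg in value.args))
    return value


def foo(y):
    # Ordinary code that works on EITHER a concrete value or a Delayed.
    return y * 2 + 1


calls = []  # records when the expensive function actually runs


def expensive():
    calls.append("ran")
    return 20


eager = foo(0)                  # concrete in, concrete out, no graph
lazy = foo(delayed(expensive))  # builds a graph; expensive() has not run
result = concretize(lazy)       # walks the graph and runs expensive() once
```

In this sketch `foo(0)` never touches `Delayed` at all, which is the "nothing delayed" case from the quoted question. Which operations (such as `print`) ought to call `concretize()` implicitly is exactly the open design question; here nothing does.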
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/G7ZYBSFC6L3FFPFODPZWMHV72MLIQPMU/
Code of Conduct: http://python.org/psf/codeofconduct/