Subject changed for tangent.

On Sun, May 24, 2020 at 4:14 PM Dominik Vilsmeier <dominik.vilsme...@gmx.de>
wrote:

> output = []
> for x in data:
>     a = delayed inc(x)
>     b = delayed double(x)
>     c = delayed add(a, b)
>     output.append(c)
> total = sum(output)  # concrete answer here.
>
> Obviously the simple example of adding scalars isn't worth the delay
> thing.  But if those were expensive operations that built up a call graph,
> it could be useful laziness.
>
> Do you have an example which can't be solved by using generator
> expressions and itertools? As far as I understand the Dask docs the purpose
> of this is to execute in parallel which wouldn't be the case for pure
> Python I suppose? The above example can be written as:
>
>     a = (inc(x) for x in data)
>     b = (double(x) for x in data)
>     c = (add(x, y) for x, y in zip(a, b))
>     total = sum(c)
>
Obviously delayed execution CAN be done in Python, since Dask is a
pure-Python library that does it.  For the narrow example I took from the
start of the dask.delayed docs, your version looks equivalent.  But there
are many not-very-complicated cases where you cannot reduce the call graph
to a simple sequence of generator expressions.

I could make some contrived example, or with a little more work, an
actually useful one.  For example, think of creating different delayed
objects within conditional branches inside the loop.  Yes, some of those
could be expressed with an if in the comprehensions, but many cannot.
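
To make that concrete, here is a minimal pure-Python sketch (my own toy
wrapper, not Dask's actual API) where the branch taken inside the loop
determines the shape of each subgraph, which a flat sequence of generator
expressions does not express naturally:

```python
class Delayed:
    """Toy delayed-call node: records a function and its arguments,
    evaluating nothing until compute() is called."""
    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args

    def compute(self):
        # Recursively evaluate child nodes, then apply this node's function.
        args = [a.compute() if isinstance(a, Delayed) else a
                for a in self.args]
        return self.fn(*args)

def inc(x): return x + 1
def double(x): return 2 * x
def add(a, b): return a + b

data = [1, 2, 3, 4]
output = []
for x in data:
    # Different branches build differently shaped subgraphs.
    if x % 2 == 0:
        node = Delayed(add, Delayed(inc, x), Delayed(double, x))
    else:
        node = Delayed(double, Delayed(inc, x))
    output.append(node)

total = sum(node.compute() for node in output)  # concrete answer here
```

Nothing runs until the final line; up to that point we have only built a
description of the work.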

It's true that Dask is most useful for parallel execution, whether in
multiple threads, multiple processes, or multiple worker nodes.  That
doesn't mean it would be a bad thing for language level capabilities to
make similar libraries easier.  Kinda like the way we have asyncio, uvloop,
and curio all built on the same primitives.

But another really nice thing in delayed execution is that we do not
necessarily want the *final* computation. Or indeed, the DAG might not have
only one "final state."  Building a DAG of delayed operations is almost
free.  We might build one with thousands or millions of different
operations involved (and Dask users really do that).  But imagine that
different paths through the DAG lead to the states/values "final1",
"final2", and "final3", which share many, but not all, of the same
computation steps.  After building the DAG, we can decide which
computations to perform:

if some_condition():
    x = concretize final1
elif other_condition():
    x = concretize final2
else:
    x = concretize final3

If we avoid 1/3, or even 2/3, of the computation by taking that approach,
that is a nice win when we are compute bound.
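
The same effect can be sketched today in pure Python by adding memoization
to the toy node class above (again, an illustration, not Dask's
implementation): shared steps run once, and the finals we never ask for
never run at all.

```python
class Delayed:
    """Toy delayed node with memoization, so shared subgraphs
    are computed only once."""
    _UNSET = object()

    def __init__(self, fn, *args):
        self.fn, self.args = fn, args
        self._cache = Delayed._UNSET

    def compute(self):
        if self._cache is Delayed._UNSET:
            args = [a.compute() if isinstance(a, Delayed) else a
                    for a in self.args]
            self._cache = self.fn(*args)
        return self._cache

calls = []
def expensive(name, value):
    calls.append(name)   # track which steps actually execute
    return value

# One shared intermediate feeds three possible final states.
shared = Delayed(expensive, "shared", 10)
final1 = Delayed(lambda s: s + 1, shared)
final2 = Delayed(lambda s: s * 2, shared)
final3 = Delayed(lambda s: s - 3, shared)

# Only the chosen path executes; final1 and final3 never run,
# and "shared" runs exactly once.
x = final2.compute()
```

A hypothetical `concretize` keyword would essentially be sugar for that
`.compute()` call, with the graph bookkeeping handled by the language.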


-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XGZMRIYPOIZKLQOG47M5CF6HQGYUCDWA/
Code of Conduct: http://python.org/psf/codeofconduct/