On Fri, May 29, 2020 at 5:06 PM Dominik Vilsmeier <dominik.vilsme...@gmx.de>
wrote:

> I'm still struggling to imagine a real use case which can't already be
> solved by generators. Usually the purpose of such computation graphs is to
> execute on some specialized hardware or because you want to backtrack
> through the graph (e.g. Tensorflow, PyTorch, etc). Dask seems to be similar
> in a sense that the user can choose different execution models for the
> graph.
>

If you look for Dask presentations, you can find lots of examples, many
drawn from real-world use.  You want the ones that use dask.delayed
specifically, since a lot of Dask users only touch the higher-level
collections like dask.dataframe.  But off the cuff, here's one that I don't
think you can formulate as generators alone:

def lazy_tree(stream, btree=None, new_data=None):
    """Store a b-tree of results from computation and the inputs

    For example, a modular form of large integers, or a PRG
    """
    if btree is None:
        btree = BTree()
    if new_data is not None:
        # A recursive call seeds the fresh node with the pair that spawned it
        btree.result = new_data.result
        btree.inits = [new_data.init]
    for num in stream:
        # We expect a few million numbers to arrive
        result = delayed expensive_calculation(num)
        if btree.result is None:
            # First value routed to this node
            btree.result = result
            btree.inits = [num]
        elif result == btree.result:
            # Add to the inits that might produce result
            btree.inits.append(num)
        elif result < btree.result:
            btree.left = delayed lazy_tree(stream, btree.left,
                                           InitResultPair(num, result))
        elif result > btree.result:
            btree.right = delayed lazy_tree(stream, btree.right,
                                            InitResultPair(num, result))
    return btree

I probably have some logic wrong in there since I'm doing it off the cuff
without testing.  And the classes BTree and InitResultPair are hypothetical.

But the idea is that we are partitioning on the difficult-to-calculate
result.  Multiple inits might produce that same result, but you don't know
that without doing the computation.  So yes, parallelism is potentially
beneficial.  But so is partial computation without expanding the entire
tree.  That could let us run lines like these to realize only a small part
of the tree:

btree = lazy_tree(mystream)
my_node = btree.path("left/left/left/right/left/right/right/right")
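Just to make that path() call concrete, here is one possible shape for the
hypothetical classes (my own sketch, not anything Dask or the proposal
defines); the idea is that, under the proposed semantics, walking a path
would force only the delayed children actually visited:

class BTree:
    def __init__(self):
        self.result = None   # the expensive value this node keys on
        self.inits = []      # every input observed to produce that value
        self.left = None     # (delayed) subtree of smaller results
        self.right = None    # (delayed) subtree of larger results

    def path(self, spec):
        # Walk "left/right/..." steps; touching a delayed child would
        # realize just that subtree, leaving the rest unexpanded.
        node = self
        for step in spec.split("/"):
            node = getattr(node, step)
        return node

class InitResultPair:
    def __init__(self, init, result):
        self.init, self.result = init, result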

Obviously it is not impossible to write this without new Python semantics
around delayed computation.  But the scaffolding would be much more
extensive.  And it certainly couldn't be expressed just as a sequence of
generator comprehensions.
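For a feel of that scaffolding, here is a rough, untested sketch of the same
tree builder written against today's dask.delayed (lazy_tree_dask is my
name; expensive_calculation, BTree, and InitResultPair are the same
hypotheticals as above).  Every branch test needs an explicit .compute(),
and a lazily built child has to be forced by hand, which is exactly the
boilerplate the proposed syntax would absorb:

from dask import delayed
from dask.delayed import Delayed

def lazy_tree_dask(stream, btree=None, new_data=None):
    if isinstance(btree, Delayed):
        # A child that was built lazily must be forced by hand
        btree = btree.compute()
    if btree is None:
        btree = BTree()
    if new_data is not None:
        btree.result = new_data.result
        btree.inits = [new_data.init]
    for num in stream:
        # The expensive call itself can stay lazy...
        lazy_result = delayed(expensive_calculation)(num)
        # ...but the comparison cannot, so we materialize it here
        result = lazy_result.compute()
        if btree.result is None:
            btree.result = result
            btree.inits = [num]
        elif result == btree.result:
            btree.inits.append(num)
        elif result < btree.result:
            # Subtrees stay as Delayed objects until someone walks them
            btree.left = delayed(lazy_tree_dask)(stream, btree.left,
                                                 InitResultPair(num, result))
        else:
            btree.right = delayed(lazy_tree_dask)(stream, btree.right,
                                                  InitResultPair(num, result))
    return btree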

-- 
The dead increasingly dominate and strangle both the living and the
not-yet born.  Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.