On Thu, Jan 14, 2016 at 2:30 PM, Nathaniel Smith <n...@pobox.com> wrote:
> The reason I didn't suggest dask is that I had the impression that
> dask's model is better suited to bulk/streaming computations with
> vectorized semantics ("do the same thing to lots of data" kinds of
> problems, basically), whereas it sounded like the OP's algorithm
> needed lots of one-off unpredictable random access.
>
> Obviously even if this is true then it's useful to point out both
> because the OP's problem might turn out to be a better fit for dask's
> model than they indicated -- the post is somewhat vague :-).
>
> But, I just wanted to check, is the above a good characterization of
> dask's strengths/applicability?

Yes, dask is definitely designed around setting up a large streaming
computation and then executing it all at once. But it is pretty flexible
in terms of what those specific computations are, and can also work for
non-vectorized computation (especially via dask imperative). It's worth
taking a look at dask's collections for a sense of what it can do here.
The recently refreshed docs provide a nice overview:
http://dask.pydata.org/

Cheers,
Stephan
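P.S. For concreteness, here is a minimal sketch of the dask imperative style (exposed as `dask.delayed` in current releases). Wrapping plain Python functions with `delayed` makes calls build a task graph instead of running immediately, and the whole graph then executes in one `compute()` call. The `load`/`process`/`combine` functions are made up purely for illustration; this assumes dask is installed.

```python
# Sketch of non-vectorized computation via dask imperative (dask.delayed).
# Assumes dask is installed; load/process/combine are illustrative stand-ins
# for arbitrary one-off Python functions.
from dask import delayed

def load(i):
    return list(range(i))

def process(data):
    return sum(data)

def combine(results):
    return sum(results)

# Nothing runs here -- delayed() just records each call in a task graph.
partials = [delayed(process)(delayed(load)(i)) for i in range(1, 5)]
total = delayed(combine)(partials)

# The entire graph executes at once on compute().
print(total.compute())  # -> 10
```

The point is that the individual tasks need not be vectorized array operations; any Python function can be a node in the graph, and dask schedules the whole thing when you ask for the result.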
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion