Hi Flink. I'm happy to announce that I've done a small bit of initial hacking on bigpetstore-flink, in order to represent in Flink what we do in Spark.
TL;DR: the main question is at the bottom!

Currently, I want to generate transactions for a list of customers. The generation of transactions is a parallel process, and the customers are generated beforehand.

In Hadoop, we can create an input format with custom splits if we want to split a data set up; otherwise, we can break it into files. In Spark, there is a convenient "parallelize" which we can run on a list, capture the resulting RDD, and then run a parallelized transform on it.

In Flink, I have an array of "customers" and I want to run our transaction generator in parallel, once per customer. How would I do that?

-- jay vyas
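(For anyone following along, here is a minimal sketch of the closest analogue I'm aware of: `ExecutionEnvironment.fromCollection` plays the role of Spark's `parallelize`, turning a local collection into a DataSet, and a `flatMap` then fans each customer out into transactions in parallel. The `String` customer/transaction types and the fan-out count are placeholders, not the real BigPetStore model classes.)

```java
import java.util.Arrays;
import java.util.List;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.util.Collector;

public class TransactionJobSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Pre-generated customers; plain strings stand in for real Customer objects.
        List<String> customers = Arrays.asList("alice", "bob", "carol");

        // fromCollection is Flink's analogue of Spark's parallelize:
        // it turns a local collection into a distributed DataSet.
        DataSet<String> customerSet = env.fromCollection(customers);

        // Each customer fans out into several transactions; the flatMap
        // runs in parallel across the available task slots.
        DataSet<String> transactions = customerSet.flatMap(
            new FlatMapFunction<String, String>() {
                @Override
                public void flatMap(String customer, Collector<String> out) {
                    // Placeholder: a real generator would emit Transaction objects.
                    for (int i = 0; i < 3; i++) {
                        out.collect(customer + "-txn-" + i);
                    }
                }
            });

        // collect() pulls the distributed result back to the driver.
        List<String> result = transactions.collect();
        System.out.println(result.size());
    }
}
```

One caveat I'm unsure about: `fromCollection` materializes the whole list on the client before distributing it, so for very large pre-generated customer sets a custom InputFormat (as in the Hadoop approach above) may still be the better fit.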
