Hi Garrett,

You can call .setParallelism(1) on just this operator:

ds.reduceGroup(new GroupReduceFunction...).setParallelism(1)

Best,
Gabor



On Mon, Oct 2, 2017 at 3:46 PM, Garrett Barton <garrett.bar...@gmail.com> wrote:
> I have a complex alg implemented using the DataSet api and by default it
> runs with parallel 90 for good performance. At the end I want to perform a
> clustering of the resulting data and to do that correctly I need to pass all
> the data through a single thread/process.
>
> I read in the docs that as long as I did a global reduce using
> DataSet.reduceGroup(new GroupReduceFunction....) that it would force it to a
> single thread.  Yet when I run the flow and bring it up in the ui, I see
> parallel 90 all the way through the dag including this one.
>
> Is there a config or feature to force the flow back to a single thread?  Or
> should I just split this into two completely separate jobs?  I'd rather not
> split as I would like to use flinks ability to iterate on this alg and
> cluster combo.
>
> Thank you

Reply via email to