Thanks! I found this PR already but seemed to be completely outdated :)

Maybe it's worth restarting this discussion.

Gyula

Chesnay Schepler <ches...@apache.org> ezt írta (időpont: 2016. jún. 13., H,
14:58):

> FLINK-1003 may be related.
>
> On 13.06.2016 12:46, Gyula Fóra wrote:
> > Hey,
> >
> > The Flink scheduling mechanism has become quite a bit of a pain lately
> for
> > us when trying to schedule IO heavy streaming jobs. And by IO heavy I
> mean
> > it has a fairly large state that is being continuously updated/read.
> >
> > The main problem is that the scheduled task slots are not evenly
> > distributed among the different task managers but usually the first TM
> > takes as much slots as possibles and the other TMs get much fewer. And
> > since the job is RocksDB IO bound the uneven load causes a significant
> > performance penalty.
> >
> > This is further accentuated during historical runs when we are trying to
> > "fast-forward" the application. The difference can be quite substantial
> in
> > a 3-4 node cluster: with even task distribution the history might run 3
> > times faster compared to an uneven one.
> >
> > I was wondering if there was a simple way to modify the scheduler so it
> > allocates resources in a round-robin fashion. Probably someone has a lot
> of
> > experience with this already :) (I'm running 1.0.3 for this job btw)
> >
> > Cheers,
> > Gyula
> >
>
>

Reply via email to