Re: Flink stops deploying jobs on normal iteration

2016-07-07 Thread Nguyen Xuan Truong
Hi Vasia, You are right about topDistance: it is a DataSet that contains only a single double value. I already looked at the Aggregator, but I can only read an aggregator's value in the next iteration. However, my problem is a bit tricky because topDistance controls how the newSeeds is

Re: Flink stops deploying jobs on normal iteration

2016-07-05 Thread Nguyen Xuan Truong
Hi Vasia, Thank you very much for your explanation :). When running with a small maxIteration, the job graph that Flink executed was optimal. However, when maxIteration was large, Flink took a very long time to generate the job graph. The actual time to execute the jobs was very short, but the time

Re: Flink stops deploying jobs on normal iteration

2016-07-05 Thread Vasiliki Kalavri
Hi Truong, I'm afraid what you're experiencing is to be expected. Currently, for loops do not perform well in Flink because there is no support for caching intermediate results yet. This feature has been requested quite often lately, so maybe it will be added soon :) Until then, I suggest you try

Flink stops deploying jobs on normal iteration

2016-07-05 Thread Nguyen Xuan Truong
Hi, I have a Flink program which is similar to the K-means algorithm. I use a normal iteration (a for loop) because a Flink iteration does not allow computing intermediate results (in this case the topDistance) within one iteration. The problem is that my program only runs when maxIteration is small.
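
To illustrate why a driver-side for loop can become expensive as maxIteration grows, here is a toy model in plain Python. It is not Flink code and does not use any Flink API. The Node class, the operator names (points, distances, topDistance, newSeeds) and the way they feed into each other are assumptions made up for this sketch, loosely following the description above. The count also assumes that shared intermediate results are never reused, which is a worst case and not a statement about what Flink's optimizer actually does.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """One operator in a lazily built plan (hypothetical, not a Flink class)."""
    name: str
    inputs: Tuple["Node", ...] = ()


def plan_size(node: Node) -> int:
    """Count operators when the plan is walked as a tree.

    A shared input is counted once per consumer, which models a system
    that does not cache or reuse intermediate results.
    """
    return 1 + sum(plan_size(child) for child in node.inputs)


def build_plan(max_iteration: int) -> Node:
    """Chain max_iteration loop bodies into one plan, as a for loop would."""
    points = Node("points")
    seeds = Node("initialSeeds")
    for i in range(max_iteration):
        # Assumed loop body: distances from the points to the current seeds.
        distances = Node(f"distances_{i}", (points, seeds))
        # Assumed intermediate result: a single value derived from distances.
        top_distance = Node(f"topDistance_{i}", (distances,))
        # Assumed: the new seeds depend on both distances and topDistance.
        seeds = Node(f"newSeeds_{i}", (distances, top_distance))
    return seeds


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 15):
        print(n, plan_size(build_plan(n)))
```

In this model every pass of the loop consumes the previous seeds twice, so the walked plan roughly doubles per iteration. Nothing is executed inside the loop itself; the whole chain is only one ever-larger plan, which matches the symptom of a small maxIteration working while a large one stalls before any job runs.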