Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-10-06 Thread Aljoscha Krettek
Hi, As you noticed, Flink does currently not put Source-X and Throttler-X (for some X) in the same task slot (TaskManager). In the low-level execution system, there are two connection patterns: ALL_TO_ALL and POINTWISE. Flink will only schedule Source-X and Throttler-X on the same slot when

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-28 Thread Yunus Olgun
Hi Kostas, Aljoscha, To answer Kostas’s concern, the algorithm works this way: Let’s say we have two sources Source-0 and Source-1. Source-0 is slow and Source-1 is fast. Sources read from Kafka at different paces. Threshold is 10 time units. 1st cycle: Source-0 sends records with timestamp

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-28 Thread Aljoscha Krettek
To quickly make Kostas' intuition concrete: it's currently not possible to have watermarks broadcast but the data be locally forwarded. The reason is that watermarks and data travel in the same channels so if the watermark needs to be broadcast there needs to be an n to m (in this case m == n)

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-28 Thread Kostas Kloudas
Hi Yunus, I see. Currently I am not sure that you can simply broadcast the watermark only, without having a shuffle. But one thing to notice about your algorithm is that, I am not sure if your algorithm solves the problem you encounter. Your algorithm seems to prioritize the stream with the

Re: CustomPartitioner that simulates ForwardPartitioner and watermarks

2017-09-27 Thread Yunus Olgun
Hi Kostas, Yes, you have summarized well. I want to only forward the data to the next local operator, but broadcast the watermark through the cluster. - I can’t set parallelism of taskB to 1. The stream is too big for that. Also, the data is ordered at each partition. I don’t want to change