Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-12 Thread Chakravarthy varaga
Hi Tim, Thanks for your response. The results are the same. 4 CPU (*8 cores in total); kafka partitions = 4 per topic; parallelism for job = 3; task.slot / TM = 4. Basically this flink application consumes (kafka source) from 2 topics and produces (kafka

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-11 Thread Till Rohrmann
Hi CVP, changing the parallelism from 1 to 2 with every TM having only one slot will inevitably introduce another network shuffle operation between the sources and the keyed co flat map. This might be the source of your slowdown, because before everything was running on one machine without any
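The shuffle described above can be illustrated with a minimal Python sketch. It assumes a plain CRC32 hash modulo the parallelism as a stand-in for the framework's real key partitioning, and the names `target_subtask` and `remote_fraction` are hypothetical. It only counts how many records would have to leave the subtask that produced them; it does not model Flink itself.

```python
import zlib


def target_subtask(key, parallelism):
    """Pick the downstream subtask for a key.

    Assumption: CRC32 modulo the parallelism stands in for the
    framework's real key partitioning.
    """
    return zlib.crc32(key.encode("utf-8")) % parallelism


def remote_fraction(records, parallelism):
    """Fraction of records whose target subtask differs from their producer.

    `records` is a list of (producer_subtask, key) pairs.
    """
    if not records:
        return 0.0
    remote = sum(
        1 for producer, key in records
        if target_subtask(key, parallelism) != producer
    )
    return remote / len(records)


if __name__ == "__main__":
    for parallelism in (1, 2, 4):
        # Records are produced round-robin across the source subtasks.
        records = [(i % parallelism, f"key-{i}") for i in range(1000)]
        print(parallelism, remote_fraction(records, parallelism))
```

With parallelism 1 every record stays on the single subtask, so nothing is shuffled. With higher parallelism, and under the assumption of uniformly hashed keys, roughly (p - 1) / p of the records would target a different subtask, which means a network transfer whenever those subtasks sit in different TaskManagers.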

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-10 Thread Chakravarthy varaga
Hi Guys, I understand that you are extremely busy, but any pointers here are highly appreciated. I can then proceed towards concluding the activity! Best Regards CVP On Mon, Jan 9, 2017 at 11:43 AM, Chakravarthy varaga < chakravarth...@gmail.com> wrote: > Anything that I could check or

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-09 Thread Chakravarthy varaga
Is there anything that I could check or collect for you for the investigation? On Sat, Jan 7, 2017 at 1:35 PM, Chakravarthy varaga < chakravarth...@gmail.com> wrote: > Hi Stephan > > . Kafka version is: 0.9.0.1, the connector is flinkconsumer09 > . The flatmap and coflatmap are connected by keyBy > . No data

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-07 Thread Chakravarthy varaga
Hi Stephan . Kafka version is: 0.9.0.1, the connector is flinkconsumer09 . The flatmap and coflatmap are connected by keyBy . No data is broadcast and the data is not exploded based on the parallelism Cvp On 6 Jan 2017 20:16, "Stephan Ewen" wrote: > Hi! > > You are right,

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-06 Thread Stephan Ewen
Hi! You are right, parallelism 2 should be faster than parallelism 1 ;-) As ChenQin pointed out, having only 2 Kafka partitions may prevent further scale-out. A few things to check: - How are you connecting the FlatMap and CoFlatMap? Default, keyBy, broadcast? - Broadcast for example would

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-06 Thread Chen Qin
Just noticed there are only two partitions per topic. Regardless of how large the parallelism is set, at most two of the parallel instances will get a partition assigned. > On Jan 6, 2017, at 02:40, Chakravarthy varaga > wrote: > > Hi All, > > Any updates on
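To make the point about partitions capping source parallelism concrete, here is a minimal Python sketch. It assumes a simple round-robin assignment (partition index modulo parallelism) of one topic's partitions to the parallel source instances. The function names `assign_partitions` and `busy_instances` are hypothetical, and this is a simplified model rather than the Kafka connector's actual code.

```python
def assign_partitions(num_partitions, parallelism):
    """Map each parallel source instance to the partitions it would read.

    Assumption: partition i goes to instance i % parallelism (a simplified
    round-robin model, not the connector's real implementation).
    """
    assignment = {instance: [] for instance in range(parallelism)}
    for partition in range(num_partitions):
        assignment[partition % parallelism].append(partition)
    return assignment


def busy_instances(num_partitions, parallelism):
    """Count the source instances that received at least one partition."""
    assignment = assign_partitions(num_partitions, parallelism)
    return sum(1 for partitions in assignment.values() if partitions)


if __name__ == "__main__":
    # A topic with 2 partitions, read at increasing source parallelism.
    for parallelism in (1, 2, 4, 8):
        print(parallelism, assign_partitions(2, parallelism))
```

Under this model, a topic with 2 partitions keeps at most 2 source instances busy; raising the parallelism further only adds instances with an empty assignment.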

Re: Increasing parallelism skews/increases overall job processing time linearly

2017-01-06 Thread Chakravarthy varaga
Hi All, Any updates on this? Best Regards CVP On Thu, Jan 5, 2017 at 1:21 PM, Chakravarthy varaga < chakravarth...@gmail.com> wrote: > > Hi All, > > I have a job as attached. > > I have a 16 Core blade running RHEL 7. The taskmanager default number of > slots is set to 1. The source is a

Increasing parallelism skews/increases overall job processing time linearly

2017-01-05 Thread Chakravarthy varaga
Hi All, I have a job as attached. I have a 16 Core blade running RHEL 7. The taskmanager default number of slots is set to 1. The source is a kafka stream and each of the 2 sources (topics) has 2 partitions. *What I notice is that when I deploy a job to run with #parallelism=2 the total