RE: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-14 Thread Mukul Gupta
Koeninger [mailto:c...@koeninger.org] Sent: Monday, March 14, 2016 9:39 PM To: Mukul Gupta <mukul.gu...@aricent.com> Cc: user@spark.apache.org Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel So what's happening here is that print() uses take(). Take() will try to

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-13 Thread Mukul Gupta
. Following is the link to repository: https://github.com/guptamukul/sparktest.git From: Cody Koeninger <c...@koeninger.org> Sent: 11 March 2016 23:04 To: Mukul Gupta Cc: user@spark.apache.org Subject: Re: Kafka + Spark streaming, RDD partitions not pro

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-11 Thread Mukul Gupta
ssed.print(90); try { jssc.start(); jssc.awaitTermination(); } catch (Exception e) { } finally { jssc.close(); } } } ____ From: Cody Koeninger <c...@koeninger.org> Sent: 11 March 2016 20:42 To: Mukul Gupta Cc: user@spark.apache.org Subject: Re: Kafka + Spark streaming, RDD partition

Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-10 Thread Mukul Gupta
Hi All,I was running the following test:*Setup*9 VM runing spark workers with 1 spark executor each.1 VM running kafka and spark master.Spark version is 1.6.0Kafka version is 0.9.0.1Spark is using its own resource manager and is not running over YARN.*Test*I created a kafka topic with 3 partition.