Koeninger [mailto:c...@koeninger.org]
Sent: Monday, March 14, 2016 9:39 PM
To: Mukul Gupta <mukul.gu...@aricent.com>
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
So what's happening here is that print() uses take(). Take() will try to
.
Following is the link to repository:
https://github.com/guptamukul/sparktest.git
From: Cody Koeninger <c...@koeninger.org>
Sent: 11 March 2016 23:04
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not pro
ssed.print(90);
try {
jssc.start();
jssc.awaitTermination();
} catch (Exception e) {
} finally {
jssc.close();
}
}
}
____
From: Cody Koeninger <c...@koeninger.org>
Sent: 11 March 2016 20:42
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partition
Hi All,I was running the following test:*Setup*9 VM runing spark workers with
1 spark executor each.1 VM running kafka and spark master.Spark version is
1.6.0Kafka version is 0.9.0.1Spark is using its own resource manager and is
not running over YARN.*Test*I created a kafka topic with 3 partition.