RE: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-14 Thread Mukul Gupta
Koeninger [mailto:c...@koeninger.org]
Sent: Monday, March 14, 2016 9:39 PM
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel

So what's happening here is that print() uses take(). Take() will try to satisfy the request using on
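To illustrate the point about print() and take(): Spark's RDD.take documents that it scans one partition first and only then looks at further partitions if the limit is not yet met. The following is a minimal plain-Java sketch of that incremental scan, not Spark's actual implementation. The class name TakeSketch, the growth factor SCALE_UP = 4, and the sample data (3 partitions of 10 messages each) are assumptions made up for this illustration. With a slow map function such as the Thread.sleep(7000) in the posted code, the first simulated job would have to finish its single partition before the later partitions are even submitted.

```java
import java.util.ArrayList;
import java.util.List;

public class TakeSketch {

    /** Elements collected, plus how many partitions each simulated job scanned. */
    static final class Result {
        final List<String> elements = new ArrayList<>();
        final List<Integer> partitionsPerJob = new ArrayList<>();
    }

    // Hypothetical growth factor for this sketch; Spark's real heuristic differs in detail.
    static final int SCALE_UP = 4;

    /** Simplified model of an incremental take(n) over a list of partitions. */
    static Result take(List<List<String>> partitions, int n) {
        Result result = new Result();
        int scanned = 0;
        int toTry = 1; // the first job looks at a single partition only
        while (result.elements.size() < n && scanned < partitions.size()) {
            int end = Math.min(scanned + toTry, partitions.size());
            result.partitionsPerJob.add(end - scanned);
            // Within one job the partitions could run in parallel; jobs themselves run one after another.
            for (int p = scanned; p < end; p++) {
                for (String element : partitions.get(p)) {
                    if (result.elements.size() < n) {
                        result.elements.add(element);
                    }
                }
            }
            scanned = end;
            toTry *= SCALE_UP; // widen the next job if the limit is still not met
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample data: 3 partitions with 10 messages each.
        List<List<String>> partitions = new ArrayList<>();
        for (int p = 0; p < 3; p++) {
            List<String> partition = new ArrayList<>();
            for (int i = 0; i < 10; i++) {
                partition.add("p" + p + "-msg" + i);
            }
            partitions.add(partition);
        }
        Result r = take(partitions, 90);
        System.out.println("partitions scanned per job: " + r.partitionsPerJob); // [1, 2]
        System.out.println("elements returned: " + r.elements.size());           // 30
    }
}
```

In this model, a request for 90 elements first runs a job over one partition and then a second job over the remaining two, which matches the reported symptom of the first partition being processed alone. An action that touches every element of the RDD would schedule all partitions in a single job instead.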

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-14 Thread Cody Koeninger
e link to repository:
> https://github.com/guptamukul/sparktest.git
>
>
> From: Cody Koeninger
> Sent: 11 March 2016 23:04
> To: Mukul Gupta
> Cc: user@spark.apache.org
> Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
>
> Why are

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-13 Thread Mukul Gupta
efore. Following is the link to repository: https://github.com/guptamukul/sparktest.git

From: Cody Koeninger
Sent: 11 March 2016 23:04
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-11 Thread Cody Koeninger
t);
>
> JavaDStream<String> processed = messages.map(new Function<Tuple2<String, String>, String>() {
>
>     @Override
>     public String call(Tuple2<String, String> arg0) throws Exception {
>
>         Thread.sleep(7000);
>         return arg0._2;
>     }
> });
>
> processed.print(90);
>
> try {
>     jssc.start();
>     jssc

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-11 Thread Mukul Gupta
___
From: Cody Koeninger
Sent: 11 March 2016 20:42
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel

Can you post your actual code?

On Thu, Mar 10, 2016 at 9:55 PM, Mukul Gupta wrote:
> Hi All, I was running the following t

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-11 Thread Cody Koeninger
nt spark executors. I am not clear about why spark is
> waiting for operations on first RDD partition to finish, while it could
> process remaining partitions in parallel? Am I missing any configuration?
> Any help is appreciated. Thanks, Mukul
> ________________
> View t

Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-10 Thread Mukul Gupta
ation? Any help is appreciated. Thanks, Mukul

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kafka-Spark-streaming-RDD-partitions-not-processed-in-parallel-tp26457.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.