RE: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-14 Thread Mukul Gupta
Koeninger [mailto:c...@koeninger.org]
Sent: Monday, March 14, 2016 9:39 PM
To: Mukul Gupta <mukul.gu...@aricent.com>
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
So what's happening here is that print() uses take(). take() will try to

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-14 Thread Cody Koeninger
s://github.com/guptamukul/sparktest.git
>
> From: Cody Koeninger <c...@koeninger.org>
> Sent: 11 March 2016 23:04
> To: Mukul Gupta
> Cc: user@spark.apache.org
> Subject: Re: Kafka + Spark streaming, RDD partitions not processed in

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-13 Thread Mukul Gupta
rows Exception {
>
>         Thread.sleep(7000);
>         return arg0._2;
>     }
> });
>
> processed.print(90);
>
> try {
>     jssc.start();
>     jssc.awaitTermination();
> } catch (Exception e) {
>
> } finally {
>     jssc.close();
> }
> }
> }
>

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-11 Thread Cody Koeninger
>> the test after increasing the partitions of the Kafka topic to 5. This time, too, the RDD partition
>> corresponding to partition 1 of Kafka was processed on one of the Spark executors. Once processing
>> finished for this RDD partition, the RDD partitions corresponding to

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-11 Thread Mukul Gupta
ssed.print(90);
try {
    jssc.start();
    jssc.awaitTermination();
} catch (Exception e) {
} finally {
    jssc.close();
}
}
}
From: Cody Koeninger <c...@koeninger.org>
Sent: 11 March 2016 20:42
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partition

Re: Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-11 Thread Cody Koeninger
, Mukul
> View this message in context: Kafka + Spark streaming, RDD partitions not processed in parallel
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Kafka + Spark streaming, RDD partitions not processed in parallel

2016-03-10 Thread Mukul Gupta
Am I missing any configuration? Any help is appreciated.
Thanks,
Mukul
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kafka-Spark-streaming-RDD-partitions-not-processed-in-parallel-tp26457.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.