From: Cody Koeninger [mailto:c...@koeninger.org]
Sent: Monday, March 14, 2016 9:39 PM
To: Mukul Gupta <mukul.gu...@aricent.com>
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
So what's happening here is that print() uses take(). Take() will try to
s://github.com/guptamukul/sparktest.git
>
>
> From: Cody Koeninger <c...@koeninger.org>
> Sent: 11 March 2016 23:04
> To: Mukul Gupta
> Cc: user@spark.apache.org
> Subject: Re: Kafka + Spark streaming, RDD partitions not processed in
> parallel
>             throws Exception {
>
>                 Thread.sleep(7000);
>                 return arg0._2;
>             }
>         });
>
>         processed.print(90);
>
>         try {
>             jssc.start();
>             jssc.awaitTermination();
>         } catch (Exception e) {
>
>         } finally {
>             jssc.close();
>         }
>     }
> }
>
>
>
the test after
>> increasing the partitions of the Kafka topic to 5. This time, too, the RDD partition
>> corresponding to Kafka partition 1 was processed on one of the Spark
>> executors. Once processing finished for this RDD partition, the RDD
>> partitions corresponding to
        processed.print(90);
        try {
            jssc.start();
            jssc.awaitTermination();
        } catch (Exception e) {
        } finally {
            jssc.close();
        }
    }
}
From: Cody Koeninger <c...@koeninger.org>
Sent: 11 March 2016 20:42
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
> Thanks, Mukul
---
Am I missing any configuration? Any help is appreciated.

Thanks,
Mukul
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Kafka-Spark-streaming-RDD-partitions-not-processed-in-parallel-tp26457.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
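
For anyone reading this thread later: the reply at the top attributes the behaviour to print() relying on take(). Below is a minimal, Spark-free sketch of that idea in plain Java, so it runs without a cluster. Everything in it is an assumption made for illustration: the class TakeSimulation, its methods, and the numbers (5 partitions, 30 records each) are hypothetical and not part of any Spark API. Real Spark chooses the size of later scans with a heuristic; the sketch simply lets the second scan cover all remaining partitions.

```java
import java.util.ArrayList;
import java.util.List;

// Spark-free simulation. Every name here is hypothetical and only
// illustrates the "scan one partition first, then more" idea.
class TakeSimulation {

    // Builds `count` partitions, each holding `recordsEach` dummy records.
    static List<List<String>> samplePartitions(int count, int recordsEach) {
        List<List<String>> partitions = new ArrayList<>();
        for (int p = 0; p < count; p++) {
            List<String> records = new ArrayList<>();
            for (int r = 0; r < recordsEach; r++) {
                records.add("p" + p + "-r" + r);
            }
            partitions.add(records);
        }
        return partitions;
    }

    // Returns the partition indices scanned by each successive job ("wave")
    // while trying to collect `num` records. Partitions inside one wave could
    // run in parallel; separate waves run one after another.
    static List<List<Integer>> takeWaves(List<List<String>> partitions, int num) {
        List<List<Integer>> waves = new ArrayList<>();
        int collected = 0;
        int scanned = 0;
        int toTry = 1; // the first wave looks at a single partition only
        while (collected < num && scanned < partitions.size()) {
            List<Integer> wave = new ArrayList<>();
            int end = Math.min(scanned + toTry, partitions.size());
            for (int p = scanned; p < end; p++) {
                wave.add(p);
                collected += partitions.get(p).size();
            }
            waves.add(wave);
            scanned = end;
            // Simplification (assumption): the next wave takes everything left.
            toTry = partitions.size() - scanned;
        }
        return waves;
    }

    // An action that needs every partition: one wave covering all of them.
    static List<List<Integer>> fullScanWaves(List<List<String>> partitions) {
        List<Integer> all = new ArrayList<>();
        for (int p = 0; p < partitions.size(); p++) {
            all.add(p);
        }
        List<List<Integer>> waves = new ArrayList<>();
        waves.add(all);
        return waves;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = samplePartitions(5, 30);
        System.out.println("take(90):  " + takeWaves(partitions, 90));
        System.out.println("full scan: " + fullScanWaves(partitions));
    }
}
```

With these made-up numbers the first line prints [[0], [1, 2, 3, 4]]: one partition is handled alone before the rest, which resembles the pattern reported in this thread. The second line prints [[0, 1, 2, 3, 4]]. In Spark terms, an action that touches every partition (for example count(), or foreachRDD combined with foreachPartition) would be expected to schedule all partitions in one job, subject to the cores available.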