From: Cody Koeninger [mailto:c...@koeninger.org]
Sent: Monday, March 14, 2016 9:39 PM
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
So what's happening here is that print() uses take(). take() will try to
satisfy the request using on
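
To illustrate the point about print() and take(), here is a toy model in plain Java. It is not Spark code: the class TakeModel and its methods take and collectAll are hypothetical names made up for this sketch. It assumes a simplified take(n) that evaluates one partition per round, in order, and stops once it has n records, while a full action evaluates every partition in a single round. Spark's real take() may try more than one partition in later rounds, so the round counts are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not Spark's implementation.
class TakeModel {

    /** Records returned by a simulated action and the number of sequential rounds it needed. */
    static final class Result {
        final List<String> records;
        final int rounds;

        Result(List<String> records, int rounds) {
            this.records = records;
            this.rounds = rounds;
        }
    }

    /** Simplified take(n): one partition per round, stopping once n records are collected. */
    static Result take(List<List<String>> partitions, int n) {
        List<String> out = new ArrayList<>();
        int rounds = 0;
        for (List<String> partition : partitions) {
            if (out.size() >= n) {
                break; // enough records, later partitions are never evaluated
            }
            rounds++;
            for (String record : partition) {
                if (out.size() >= n) {
                    break;
                }
                out.add(record);
            }
        }
        return new Result(out, rounds);
    }

    /** Simplified full action: all partitions belong to one round, so they could run at the same time. */
    static Result collectAll(List<List<String>> partitions) {
        List<String> out = new ArrayList<>();
        for (List<String> partition : partitions) {
            out.addAll(partition);
        }
        return new Result(out, partitions.isEmpty() ? 0 : 1);
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d"),
                Arrays.asList("e", "f"));

        Result taken = take(partitions, 5);
        Result all = collectAll(partitions);
        System.out.println("take(5) rounds: " + taken.rounds + ", records: " + taken.records);
        System.out.println("full action rounds: " + all.rounds + ", records: " + all.records);
    }
}
```

In the streaming job itself, the analogous change would be to replace processed.print(90) with processed.foreachRDD(...) running a full action such as foreach or count on each RDD. That is a suggestion under the assumptions above, not something confirmed in this thread.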
> Following is the link to repository:
> https://github.com/guptamukul/sparktest.git
>
>
> From: Cody Koeninger
> Sent: 11 March 2016 23:04
> To: Mukul Gupta
> Cc: user@spark.apache.org
> Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
>
> Why are
efore.
Following is the link to repository:
https://github.com/guptamukul/sparktest.git
From: Cody Koeninger
Sent: 11 March 2016 23:04
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
t);
>
> JavaDStream<String> processed = messages.map(new Function<Tuple2<String, String>, String>() {
>
>     @Override
>     public String call(Tuple2<String, String> arg0) throws Exception {
>         // Simulate slow per-record processing (7 seconds per message).
>         Thread.sleep(7000);
>         return arg0._2;
>     }
> });
>
> processed.print(90);
>
> try {
>     jssc.start();
> jssc
___
From: Cody Koeninger
Sent: 11 March 2016 20:42
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
Can you post your actual code?
On Thu, Mar 10, 2016 at 9:55 PM, Mukul Gupta wrote:
> Hi All, I was running the following t
nt Spark executors. I am not clear about why Spark is
> waiting for operations on the first RDD partition to finish, when it could
> process the remaining partitions in parallel. Am I missing any configuration?
> Any help is appreciated. Thanks, Mukul
> ________________
> View t
ation? Any help is appreciated. Thanks, Mukul
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Kafka-Spark-streaming-RDD-partitions-not-processed-in-parallel-tp26457.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.