To: "Qiao, Richard"
Cc: Gerard Maas, "user @spark"
Subject: Re: Do I need to do .collect inside forEachRDD
Hi Richard,
I have tried your sample code now, and several times in the past as well. The
problem seems to be that kafkaProducer is not serializable, so I get "Task not
seria
r.
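A minimal sketch of the serialization problem described above, using only the Java standard library rather than Spark or Kafka. `FakeProducer`, `SerializableConsumer`, and `canSerialize` are hypothetical names introduced for illustration; the assumption is that a function shipped to executors must be Java-serializable, so a closure that captures a non-serializable producer fails, while one that creates the producer inside its body does not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Consumer;

public class TaskSerializationSketch {

    // Hypothetical stand-in for a Kafka producer: it does NOT implement Serializable.
    static class FakeProducer {
        void send(String record) { /* no-op for this sketch */ }
    }

    // A function type that must also be Serializable (an assumption modelling
    // functions that are shipped from the driver to executors).
    interface SerializableConsumer<T> extends Consumer<T>, Serializable {}

    // Returns true if Java serialization accepts the object,
    // false if it fails with NotSerializableException.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // The closure captures a producer created outside of it, so the
    // producer would have to be serialized together with the closure.
    static SerializableConsumer<String> capturingClosure() {
        FakeProducer producer = new FakeProducer();
        return record -> producer.send(record);
    }

    // The closure creates the producer inside its body, so nothing
    // non-serializable is captured.
    static SerializableConsumer<String> selfContainedClosure() {
        return record -> new FakeProducer().send(record);
    }

    public static void main(String[] args) {
        System.out.println("capturing closure serializable: " + canSerialize(capturingClosure()));
        System.out.println("self-contained closure serializable: " + canSerialize(selfContainedClosure()));
    }
}
```

Under these assumptions the first closure is expected to fail serialization and the second to pass. Creating a producer per record is only for brevity here; in a real job one would more likely create it once per partition.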
>
> Best Regards
>
> Richard
>
> *From: *kant kodali
> *Date: *Thursday, December 7, 2017 at 2:30 AM
> *To: *Gerard Maas
> *Cc: *"Qiao, Richard", "user @spark" <
> user@spark.apache.org>
> *Subject: *Re: Do I need
From: kant kodali
Date: Thursday, December 7, 2017 at 2:30 AM
To: Gerard Maas
Cc: "Qiao, Richard", "user @spark"
Subject: Re: Do I need to do .collect inside forEachRDD
@Richard I have pasted the two versions of the code below, and I still
couldn't figure out why it wouldn't work without .collect. Any help would
be great.
*The code below doesn't work, and sometimes I also run into an OutOfMemory
error.*
jsonMessagesDStream
.window(new Duration(6), new Duratio
Hi Kant,
> but would your answer on .collect() change depending on running the
> spark app in client vs cluster mode?
No, it should make no difference.
-kr, Gerard.
On Tue, Dec 5, 2017 at 11:34 PM, kant kodali wrote:
@Richard I don't see any error in the executor log but let me run again to
make sure.
@Gerard Thanks very much! But would your answer on .collect() change depending
on whether the Spark app runs in client or cluster mode?
Thanks!
On Tue, Dec 5, 2017 at 1:54 PM, Gerard Maas wrote:
The general answer to your initial question is that "it depends". If the
operation in the rdd.foreach() closure can be parallelized, then you don't
need to collect first. If it needs some local context (e.g. a socket
connection), then you need to do rdd.collect first to bring the data
locally, whic
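The distinction above can be sketched in plain Java, without Spark. This is only a simulation under stated assumptions: partitions are modelled as nested lists, `FakeConnection` is a hypothetical stand-in for a local resource such as a socket, and both method names are invented for illustration. In Spark itself the corresponding calls would be `rdd.collect()` on the driver versus something like `rdd.foreachPartition(...)` on the executors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CollectVsPartitionSketch {

    // Hypothetical stand-in for a local resource (e.g. a socket connection).
    static class FakeConnection implements AutoCloseable {
        private final List<String> sink;
        FakeConnection(List<String> sink) { this.sink = sink; }
        void send(String record) { sink.add(record); }
        @Override public void close() { /* nothing to release in this sketch */ }
    }

    // "Collect first": gather every partition into one local list, then
    // write through a single local connection. The whole data set is held
    // in one place at once. Returns the number of connections opened.
    static int sendAfterCollect(List<List<String>> partitions, List<String> sink) {
        List<String> collected = new ArrayList<>();
        for (List<String> partition : partitions) {
            collected.addAll(partition);
        }
        int connectionsOpened = 0;
        try (FakeConnection conn = new FakeConnection(sink)) {
            connectionsOpened++;
            for (String record : collected) {
                conn.send(record);
            }
        }
        return connectionsOpened;
    }

    // "No collect": each partition opens its own connection where the data
    // lives, so nothing is gathered centrally. A plain loop stands in for
    // the per-partition tasks. Returns the number of connections opened.
    static int sendPerPartition(List<List<String>> partitions, List<String> sink) {
        int connectionsOpened = 0;
        for (List<String> partition : partitions) {
            try (FakeConnection conn = new FakeConnection(sink)) {
                connectionsOpened++;
                for (String record : partition) {
                    conn.send(record);
                }
            }
        }
        return connectionsOpened;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"), Arrays.asList("c"), Arrays.asList("d", "e", "f"));
        List<String> viaCollect = new ArrayList<>();
        List<String> viaPartitions = new ArrayList<>();
        System.out.println(sendAfterCollect(partitions, viaCollect) + " connection(s), "
                + viaCollect.size() + " record(s)");
        System.out.println(sendPerPartition(partitions, viaPartitions) + " connection(s), "
                + viaPartitions.size() + " record(s)");
    }
}
```

Both variants deliver the same records; the trade-off is that collecting concentrates all data in one process, while the per-partition variant needs the resource to be creatable wherever the partition is processed.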
In the second case, is any producer error thrown in the executor's log?
Best Regards
Richard
From: kant kodali
Date: Tuesday, December 5, 2017 at 4:38 PM
To: "Qiao, Richard"
Cc: "user @spark"
Subject: Re: Do I need to do .collect inside forEachRDD
It reads from Kafka and outputs to Kafka, so I check the output from Kafka.
On Tue, Dec 5, 2017 at 1:26 PM, Qiao, Richard
wrote:
> Where do you check the output for both cases?
>
> Sent from my iPhone
>
> > On Dec 5, 2017, at 15:36, kant kodali wrote:
> >
> > Hi All,
> >
> > I have a simple
Where do you check the output for both cases?
Sent from my iPhone
> On Dec 5, 2017, at 15:36, kant kodali wrote:
>
> Hi All,
>
> I have a simple stateless transformation using DStreams (stuck with the old
> API for one of the applications). The pseudocode is roughly like this:
>
> dstream