Re: Running a spark code on multiple machines using google cloud platform

Marco Mistroni Thu, 02 Feb 2017 05:30:35 -0800

You can use EMR if you want to run on a cluster.
Kind regards

On 2 Feb 2017 12:30 pm, "Anahita Talebi" <anahita.t.am...@gmail.com> wrote:


> Dear all,
>
> I am trying to run Spark code on multiple machines by submitting a job on
> Google Cloud Platform.
> The inputs to my code are a training dataset and a testing dataset.
>
> When I use a small training data set (e.g. 10 KB), the code runs
> successfully on Google Cloud, but when I use a large data set
> (e.g. 50 GB), I receive the following error:
>
> 17/02/01 19:08:06 ERROR org.apache.spark.scheduler.LiveListenerBus: 
> SparkListenerBus has already stopped! Dropping event 
> SparkListenerTaskEnd(2,0,ResultTask,TaskKilled,org.apache.spark.scheduler.TaskInfo@3101f3b3,null)
>
> Can anyone give me a hint on how to solve this problem?
>
> PS: I cannot use a small training data set because my optimization code
> needs to use all the data.
>
> I have to use Google Cloud Platform because I need to run the code on
> multiple machines.
>
> Thanks a lot,
>
> Anahita
>
>
