Persistent disk IOPS and throughput check? Re: Running a spark code on multiple machines using google cloud platform

2017-02-02 Thread Heji Kim
Dear Anahita, When we run performance tests for Spark/YARN clusters on GCP, we have to make sure we stay within the IOPS and throughput limits. Depending on the disk type (standard or SSD) and the size of the disk, you only get so much maximum sustained IOPS and throughput per second. The GCP instance metrics
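
To make the size dependence concrete, here is a minimal Python sketch of how one might estimate a per-disk ceiling from its type and size. The per-GB rates below are illustrative assumptions only; the authoritative numbers are in the GCP persistent disk documentation and also depend on the machine type and vCPU count.

def estimate_disk_limits(disk_type: str, size_gb: int) -> dict:
    """Rough per-disk sustained IOPS / throughput ceiling (illustrative only)."""
    # Hypothetical per-GB scaling factors; substitute the published GCP numbers.
    rates = {
        "pd-standard": {"read_iops_per_gb": 0.75, "write_iops_per_gb": 1.5,
                        "throughput_mbps_per_gb": 0.12},
        "pd-ssd":      {"read_iops_per_gb": 30.0, "write_iops_per_gb": 30.0,
                        "throughput_mbps_per_gb": 0.48},
    }
    r = rates[disk_type]
    return {
        "read_iops": r["read_iops_per_gb"] * size_gb,
        "write_iops": r["write_iops_per_gb"] * size_gb,
        "throughput_mbps": r["throughput_mbps_per_gb"] * size_gb,
    }

if __name__ == "__main__":
    # A 500 GB standard disk has a far lower ceiling than a 500 GB SSD disk,
    # which is why shuffle-heavy Spark stages can stall on small standard disks.
    print(estimate_disk_limits("pd-standard", 500))
    print(estimate_disk_limits("pd-ssd", 500))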

Re: Running a spark code on multiple machines using google cloud platform

2017-02-02 Thread Anahita Talebi
Thanks for your answer. Do you mean Amazon EMR? On Thu, Feb 2, 2017 at 2:30 PM, Marco Mistroni wrote: > You can use EMR if you want to run on a cluster > Kr > > On 2 Feb 2017 12:30 pm, "Anahita Talebi" > wrote: > >> Dear all, >> >> I am trying

Re: Running a spark code on multiple machines using google cloud platform

2017-02-02 Thread Marco Mistroni
You can use EMR if you want to run on a cluster. Kr On 2 Feb 2017 12:30 pm, "Anahita Talebi" wrote: > Dear all, > > I am trying to run a Spark code on multiple machines using submit job in > Google Cloud Platform. > As the inputs of my code, I have a training and
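
As a rough illustration of the EMR route, the sketch below launches a small transient EMR cluster with Spark installed and runs one spark-submit step through boto3. The bucket names, script path, region and instance types are placeholders, not anything taken from this thread.

import boto3

# Launch a transient EMR cluster with Spark and run a single spark-submit
# step; the cluster terminates itself when the step finishes.
emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="spark-on-emr-example",
    ReleaseLabel="emr-5.3.0",
    Applications=[{"Name": "Spark"}],
    LogUri="s3://my-bucket/emr-logs/",          # hypothetical bucket
    Instances={
        "MasterInstanceType": "m4.large",
        "SlaveInstanceType": "m4.large",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    Steps=[{
        "Name": "run-spark-job",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster",
                     "s3://my-bucket/code/my_job.py"],   # hypothetical script
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Cluster id:", response["JobFlowId"])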

Running a spark code on multiple machines using google cloud platform

2017-02-02 Thread Anahita Talebi
Dear all, I am trying to run a Spark code on multiple machines using submit job in Google Cloud Platform. As the inputs of my code, I have training and testing datasets. When I use a small training dataset (around 10 KB), the code runs successfully on Google Cloud, while when I have a
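
The original code is not shown in the thread, but a job of this shape typically looks like the PySpark sketch below when run on Dataproc: the training and testing datasets are read straight from Cloud Storage so the executors, not the driver, pull the data. The gs:// paths and CSV format are placeholders, not the poster's actual data.

from pyspark.sql import SparkSession

def main():
    spark = SparkSession.builder.appName("train-test-example").getOrCreate()

    # Reading gs:// paths lets each executor fetch its own partitions instead
    # of funnelling a large file through the driver (a common reason jobs work
    # on tiny inputs but struggle once the data grows).
    train = spark.read.csv("gs://my-bucket/data/train.csv",
                           header=True, inferSchema=True)
    test = spark.read.csv("gs://my-bucket/data/test.csv",
                          header=True, inferSchema=True)

    print("training rows:", train.count())
    print("testing rows:", test.count())

    spark.stop()

if __name__ == "__main__":
    main()

A script like this would then be submitted with something along the lines of gcloud dataproc jobs submit pyspark train_test.py --cluster=<cluster> --region=<region>, with the cluster name and region filled in.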