Kevin: Can you describe how you got over the MetadataFetchFailedException?
> On Apr 16, 2016, at 9:41 AM, Kevin Eid <kevin.e...@mail.dcu.ie> wrote:
>
> One last email to announce that I've fixed all of the issues. Don't hesitate
> to contact me if you encounter the same. I'd be happy to help.
>
> Regards,
> Kevin
>
>> On 14 Apr 2016 12:39 p.m., "Kevin Eid" <kevin.e...@mail.dcu.ie> wrote:
>> Hi all,
>>
>> I managed to copy my .py files from local to the cluster using SCP, and I
>> managed to run my Spark app on the cluster against a small dataset.
>>
>> However, when I iterate over a dataset of 5 GB I get
>> org.apache.spark.shuffle.MetadataFetchFailedException (please see the
>> attached screenshots).
>>
>> I am deploying 3 m3.xlarge instances and using the following parameters
>> when submitting the app: --executor-memory 50g --driver-memory 20g
>> --executor-cores 4 --num-executors 3.
>>
>> Can you recommend other configurations (driver/executor count, memory), or
>> do I have to deploy more and larger instances in order to run my app on
>> 5 GB? Or do I need to add more partitions while reading the file?
>>
>> Best,
>> Kevin
>>
>>> On 12 April 2016 at 12:19, Sun, Rui <rui....@intel.com> wrote:
>>> Which py file is your main file (primary py file)? Zip the other two py
>>> files and leave the main py file alone. Don't copy them to S3, because it
>>> seems that only local primary and additional py files are supported.
>>>
>>> ./bin/spark-submit --master spark://... --py-files <zip file> <main py file>
>>>
>>> -----Original Message-----
>>> From: kevllino [mailto:kevin.e...@mail.dcu.ie]
>>> Sent: Tuesday, April 12, 2016 5:07 PM
>>> To: user@spark.apache.org
>>> Subject: Run a self-contained Spark app on a Spark standalone cluster
>>>
>>> Hi,
>>>
>>> I need to know how to run a self-contained Spark app (3 python files) on a
>>> Spark standalone cluster. Can I move the .py files to the cluster, or
>>> should I store them locally, on HDFS or S3?
>>> I tried the following locally
>>> and on S3 with a zip of my .py files, as suggested here
>>> <http://spark.apache.org/docs/latest/submitting-applications.html>:
>>>
>>> ./bin/spark-submit --master
>>> spark://ec2-54-51-23-172.eu-west-1.compute.amazonaws.com:5080 --py-files
>>> s3n://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@mubucket//weather_predict.zip
>>>
>>> But I get: "Error: Must specify a primary resource (JAR or Python file)"
>>>
>>> Best,
>>> Kevin
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Run-a-self-contained-Spark-app-on-a-Spark-standalone-cluster-tp26753.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>> --
>> Kevin EID
>> M.Sc. in Computing, Data Analytics
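[Editor's note] The "Must specify a primary resource" error in the oldest message comes from passing only --py-files and no main script. A minimal sketch of Rui's suggestion, assuming a primary file weather_predict.py and two helper modules utils.py and model.py (all file names here are illustrative, not from the thread; 7077 is the default standalone master port):

```shell
# Zip only the helper modules; the primary script stays outside the archive.
zip weather_deps.zip utils.py model.py

# Pass the primary .py file as the last positional argument; that file is
# the "primary resource" the error message refers to. Both the zip and the
# main script are local paths, not S3 URLs.
./bin/spark-submit \
  --master spark://<master-host>:7077 \
  --py-files weather_deps.zip \
  weather_predict.py
```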
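[Editor's note] On the partitioning question in the newer messages: m3.xlarge instances have 15 GiB of RAM, so --executor-memory 50g cannot actually be granted on those nodes; smaller executors plus a higher partition count is the usual first fix for MetadataFetchFailedException on shuffle-heavy jobs. A back-of-the-envelope sizing helper, as a sketch only (the 128 MB-per-partition target and the tasks-per-core factor are assumptions, not from the thread):

```python
def suggested_partitions(dataset_bytes, target_partition_mb=128, total_cores=12):
    """Rough minPartitions estimate: aim for ~128 MB per partition, and at
    least a few tasks per core so work balances across executors."""
    # Ceiling division: partitions needed to keep each one under the target.
    by_size = -(-dataset_bytes // (target_partition_mb * 1024 * 1024))
    # At least 3 tasks per core across the cluster.
    by_cores = total_cores * 3
    return max(by_size, by_cores)

# For the 5 GB dataset in the thread (3 executors x 4 cores = 12 cores):
n = suggested_partitions(5 * 1024**3)
print(n)  # 40: the size-based bound (40) outweighs the core-based bound (36)
```

The result would then be passed when reading the file, e.g. sc.textFile(path, minPartitions=n), or applied with rdd.repartition(n) before a wide operation.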