Problem running Spark shell (1.0.0) on EMR
I'm having a similar problem to:
http://mail-archives.apache.org/mod_mbox/spark-user/201407.mbox/browser

I'm trying to follow the tutorial at:

When I run:

    val file = sc.textFile("s3://bigdatademo/sample/wiki/")

I get:

    WARN storage.BlockManager: Putting block broadcast_1 failed
    java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;

I found a few other people raising this issue, but wasn't able to find a solution or an explanation. Has anyone encountered this? Any help or advice will be highly appreciated!

Thank you,
-- Omer
Re: Problem running Spark shell (1.0.0) on EMR
I am also having exactly the same problem when calling from pyspark. Has anyone managed to get this script to work?

-- 
Martin Goodson | VP Data Science
(0)20 3397 1240

On Wed, Jul 16, 2014 at 2:10 PM, Ian Wilkinson ia...@me.com wrote:

> Hi,
>
> I'm trying to run the Spark (1.0.0) shell on EMR and encountering a classpath issue. I suspect I'm missing something gloriously obvious, but so far it is eluding me.
>
> I launch the EMR cluster (using the aws cli) with:
>
>     aws emr create-cluster --name "Test Cluster" \
>         --ami-version 3.0.3 \
>         --no-auto-terminate \
>         --ec2-attributes KeyName=... \
>         --bootstrap-actions Path=s3://elasticmapreduce/samples/spark/1.0.0/install-spark-shark.rb \
>         --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m1.medium \
>         InstanceGroupType=CORE,InstanceCount=1,InstanceType=m1.medium --region eu-west-1
>
> then,
>
>     $ aws emr ssh --cluster-id ... --key-pair-file ... --region eu-west-1
>
> On the master node, I then launch the shell with:
>
>     [hadoop@ip-... spark]$ ./bin/spark-shell
>
> and try performing:
>
>     scala> val logs = sc.textFile("s3n://.../")
>
> This produces:
>
>     14/07/16 12:40:35 WARN storage.BlockManager: Putting block broadcast_0 failed
>     java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
>
> Any help mighty welcome,
> ian
Problem running Spark shell (1.0.0) on EMR
Hi,

I'm trying to run the Spark (1.0.0) shell on EMR and encountering a classpath issue. I suspect I'm missing something gloriously obvious, but so far it is eluding me.

I launch the EMR cluster (using the aws cli) with:

    aws emr create-cluster --name "Test Cluster" \
        --ami-version 3.0.3 \
        --no-auto-terminate \
        --ec2-attributes KeyName=... \
        --bootstrap-actions Path=s3://elasticmapreduce/samples/spark/1.0.0/install-spark-shark.rb \
        --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m1.medium \
        InstanceGroupType=CORE,InstanceCount=1,InstanceType=m1.medium --region eu-west-1

then,

    $ aws emr ssh --cluster-id ... --key-pair-file ... --region eu-west-1

On the master node, I then launch the shell with:

    [hadoop@ip-... spark]$ ./bin/spark-shell

and try performing:

    scala> val logs = sc.textFile("s3n://.../")

This produces:

    14/07/16 12:40:35 WARN storage.BlockManager: Putting block broadcast_0 failed
    java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;

Any help mighty welcome,
ian
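
A java.lang.NoSuchMethodError like the one reported in this thread generally means that the class found at runtime does not have a method that was present when the calling code was compiled, which often points to a different version of the same library sitting earlier on the classpath. One possible way to narrow this down from inside spark-shell is sketched below. The helpers jarOf and hasMethod are hypothetical names introduced here for illustration, not part of Spark or Guava, and the sketch only inspects the driver JVM.

```scala
// Hypothetical helper (not a Spark API): returns the jar or directory that
// the named class was loaded from in the current JVM.
// Throws ClassNotFoundException if the class is not on the classpath.
def jarOf(className: String): String = {
  val cls = Class.forName(className)
  // getCodeSource is null for classes loaded by the bootstrap class loader.
  Option(cls.getProtectionDomain.getCodeSource)
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)
    .getOrElse("(bootstrap class loader)")
}

// Hypothetical helper: true if the loaded class exposes a public method
// with the given name and parameter types.
def hasMethod(className: String, method: String, paramTypes: Class[_]*): Boolean =
  try {
    Class.forName(className).getMethod(method, paramTypes: _*)
    true
  } catch {
    case _: NoSuchMethodException => false
  }
```

In the shell, jarOf("com.google.common.hash.HashFunction") would show which jar supplied the class named in the error, and hasMethod("com.google.common.hash.HashFunction", "hashInt", classOf[Int]) would show whether that copy has the missing method. If the second call returns false, the jar printed by the first call is a reasonable place to start looking.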