Hi,

I’m trying to run the Spark (1.0.0) shell on EMR and encountering a classpath 
issue.
I suspect I’m missing something gloriously obvious, but so far it is eluding 
me.

I launch the EMR Cluster (using the aws cli) with:

aws emr create-cluster --name "Test Cluster" \
        --ami-version 3.0.3 \
        --no-auto-terminate \
        --ec2-attributes KeyName=<...> \
        --bootstrap-actions Path=s3://elasticmapreduce/samples/spark/1.0.0/install-spark-shark.rb \
        --instance-groups \
        InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m1.medium \
        InstanceGroupType=CORE,InstanceCount=1,InstanceType=m1.medium \
        --region eu-west-1

then,

$ aws emr ssh --cluster-id <...> --key-pair-file <...> --region eu-west-1

On the master node, I then launch the shell with:

[hadoop@ip-... spark]$ ./bin/spark-shell

and try performing:

scala> val logs = sc.textFile("s3n://.../")

this produces:

14/07/16 12:40:35 WARN storage.BlockManager: Putting block broadcast_0 failed
java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
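
A NoSuchMethodError like this generally means the JVM loaded a different 
version of the class than the one the caller was compiled against. As a rough 
diagnostic sketch (not something I have confirmed on this cluster), the 
snippet below could be pasted into the same spark-shell session to report 
which jar a class is served from. The helper name jarOf is hypothetical and 
not part of Spark; it uses only standard JVM reflection calls.

```scala
// Hypothetical helper (not a Spark API): report the location a class was loaded from.
// Throws ClassNotFoundException if the class is not on the classpath at all.
def jarOf(className: String): String = {
  val cls = Class.forName(className)
  // getCodeSource is null for classes loaded by the bootstrap class loader,
  // so wrap each step in Option rather than dereferencing directly.
  Option(cls.getProtectionDomain.getCodeSource)
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)
    .getOrElse("<bootstrap or unknown>")
}
```

Calling jarOf("com.google.common.hash.HashFunction") in the shell should then 
show which jar supplies the class named in the error, which may help tell 
whether an unexpected copy is winning on the classpath.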


Any help mighty welcome,
ian
