Hi,

I used the spark-ec2 script to create an EC2 cluster.

Now I am trying to copy data from S3 into HDFS. I am running this:
[root@ip-172-31-21-160 ephemeral-hdfs]$ bin/hadoop distcp \
    s3://<xxx>/home/mydata/small.sam \
    hdfs://ec2-52-11-148-31.us-west-2.compute.amazonaws.com:9010/data1
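
As an aside, I am not sure the plain s3:// scheme is right here; my understanding is that the s3n:// scheme with explicit credentials is the usual form for reading native S3 files, something like this (the property names are my assumption for this Hadoop version):

    bin/hadoop distcp \
        -Dfs.s3n.awsAccessKeyId=<MY_ACCESS_KEY> \
        -Dfs.s3n.awsSecretAccessKey=<MY_SECRET_KEY> \
        s3n://<xxx>/home/mydata/small.sam \
        hdfs://ec2-52-11-148-31.us-west-2.compute.amazonaws.com:9010/data1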

Either way, with the command above I get the following error:

2015-03-06 01:39:27,299 INFO  tools.DistCp (DistCp.java:run(109)) - Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[s3://<xxx>/home/mydata/small.sam], targetPath=hdfs://ec2-52-11-148-31.us-west-2.compute.amazonaws.com:9010/data1}
2015-03-06 01:39:27,585 INFO  mapreduce.Cluster (Cluster.java:initialize(114)) - Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "ec2-52-11-148-31.us-west-2.compute.amazonaws.com:9001"
2015-03-06 01:39:27,585 ERROR tools.DistCp (DistCp.java:run(126)) - Exception encountered
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
    at org.apache.hadoop.tools.DistCp.createMetaFolderPath(DistCp.java:352)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:146)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:374)
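
If I read the message right, the client is falling back to the LocalJobRunner because mapreduce.framework.name is not set to yarn, while mapreduce.jobtracker.address points at a remote host. Would setting something like this in mapred-site.xml be the fix (the path /root/ephemeral-hdfs/conf is my guess for the spark-ec2 layout)?

    <!-- /root/ephemeral-hdfs/conf/mapred-site.xml (path assumed) -->
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>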

I tried running start-all.sh, start-dfs.sh, and start-yarn.sh.
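
I assume I can confirm which daemons actually came up with jps on the master, e.g.:

    [root@ip-172-31-21-160 ephemeral-hdfs]$ jps

I would expect to see at least NameNode (and ResourceManager if YARN is running), but I am not sure what the spark-ec2 setup should list.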

What should I do?
Thanks
-roni
