Re: distcp on ec2 standalone spark cluster

Josh Rosen Sun, 07 Sep 2014 11:27:57 -0700

If I recall correctly, you should be able to start Hadoop MapReduce by running
~/ephemeral-hdfs/sbin/start-mapred.sh.
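
Roughly, the sequence would look like the sketch below. Treat it as an untested outline rather than a verified recipe: the bucket name, the HDFS target directory, and the DRY_RUN switch are placeholders I made up for illustration, and I am assuming the hadoop launcher sits under ~/ephemeral-hdfs/bin and that S3 credentials are already set in the Hadoop configuration. By default it only prints the two commands so you can check them before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: the bucket, paths, and DRY_RUN switch are illustrative assumptions.
EPHEMERAL_HDFS="${EPHEMERAL_HDFS:-$HOME/ephemeral-hdfs}"
SRC="${SRC:-s3n://example-log-bucket/logs/}"   # hypothetical S3 location
DEST="${DEST:-/logs/}"                         # hypothetical HDFS target directory
DRY_RUN="${DRY_RUN:-1}"                        # 1 = only print the commands

# Print the distcp invocation for a given source and destination.
distcp_cmd() {
  printf '%s/bin/hadoop distcp %s %s\n' "$EPHEMERAL_HDFS" "$1" "$2"
}

if [ "$DRY_RUN" = "1" ]; then
  echo "$EPHEMERAL_HDFS/sbin/start-mapred.sh"
  distcp_cmd "$SRC" "$DEST"
else
  # Start the MapReduce daemons first, then run the copy.
  "$EPHEMERAL_HDFS/sbin/start-mapred.sh" &&
    "$EPHEMERAL_HDFS/bin/hadoop" distcp "$SRC" "$DEST"
fi
```

Set DRY_RUN=0 once the printed commands look right for your cluster.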


On Sun, Sep 7, 2014 at 6:42 AM, Tomer Benyamini <tomer....@gmail.com> wrote:

> Hi,
>
> I would like to copy log files from S3 to the cluster's
> ephemeral-hdfs. I tried to use distcp, but I guess MapReduce is not
> running on the cluster; I'm getting the exception below.
>
> Is there a way to start it, or is there a Spark alternative to distcp?
>
> Thanks,
> Tomer
>
> mapreduce.Cluster (Cluster.java:initialize(114)) - Failed to use
> org.apache.hadoop.mapred.LocalClientProtocolProvider due to error:
> Invalid "mapreduce.jobtracker.address" configuration value for
> LocalJobRunner : "XXX:9001"
>
> ERROR tools.DistCp (DistCp.java:run(126)) - Exception encountered
>
> java.io.IOException: Cannot initialize Cluster. Please check your
> configuration for mapreduce.framework.name and the correspond server
> addresses.
>
>     at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
>     at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
>     at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
>     at org.apache.hadoop.tools.DistCp.createMetaFolderPath(DistCp.java:352)
>     at org.apache.hadoop.tools.DistCp.execute(DistCp.java:146)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:374)
>
