Did you follow these steps? https://wiki.apache.org/hadoop/AmazonS3 Also
make sure your jobtracker/mapreduce processes are running fine.
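For reference, a distcp pull from S3 usually looks like the sketch below; the access keys, bucket, and destination path are placeholders, not values from this thread, and the command only fires if hadoop is actually on the PATH:

```shell
# Hypothetical distcp invocation; ACCESS_KEY/SECRET_KEY and the bucket
# name are placeholders. With s3n:// the path is rooted at the bucket,
# so an "incorrect folder path" error can indicate a path that does not
# start at the bucket root.
SRC="s3n://ACCESS_KEY:SECRET_KEY@example-bucket/logs/"
DST="hdfs:///user/root/logs/"
if command -v hadoop >/dev/null 2>&1; then
  hadoop distcp "$SRC" "$DST"   # needs a running MapReduce cluster
fi
```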
Thanks
Best Regards
On Sun, Mar 8, 2015 at 7:32 AM, roni roni.epi...@gmail.com wrote:
Did you get this to work?
I got past the issues with the cluster not starting.
I am having a problem where distcp with an s3:// URI says the folder path is
incorrect, and with s3n:// it hangs.
stuck for 2 days :(
Thanks
-R
~/ephemeral-hdfs/sbin/start-mapred.sh does not exist on spark-1.0.2;
I restarted HDFS using ~/ephemeral-hdfs/sbin/stop-dfs.sh and
~/ephemeral-hdfs/sbin/start-dfs.sh, but I am still getting the same error
when trying to run distcp:
ERROR tools.DistCp (DistCp.java:run(126)) - Exception encountered
Tomer,
To use distcp, you need to have a Hadoop compute cluster up. start-dfs just
restarts HDFS. I don’t have a Spark 1.0.2 cluster up right now, but there
should be a start-mapred*.sh or start-all.sh script that will launch the Hadoop
MapReduce cluster that you will need for distcp.
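Under the spark-ec2 layout the thread assumes, the sequence would be roughly the sketch below; the script paths are taken from the messages above, and the -x guard just skips any script that is absent on a given release:

```shell
# Start HDFS and then the MapReduce daemons before running distcp.
checked=0
for script in "$HOME/ephemeral-hdfs/sbin/start-dfs.sh" \
              "$HOME/ephemeral-hdfs/sbin/start-mapred.sh"; do
  if [ -x "$script" ]; then
    "$script"                 # launch this daemon set
  fi
  checked=$((checked + 1))
done
echo "considered $checked start script(s)"
```

If start-mapred.sh is missing (as reported on spark-1.0.2), start-all.sh in the same sbin directory is the fallback suggested below.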
Tomer,
Did you try start-all.sh? It worked for me the last time I tried using
distcp, and it worked for this guy too
http://stackoverflow.com/a/18083790/877069.
Nick
On Mon, Sep 8, 2014 at 3:28 AM, Tomer Benyamini tomer@gmail.com wrote:
Still no luck, even when running stop-all.sh followed by start-all.sh.
On Mon, Sep 8, 2014 at 5:57 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Tomer,
Did you try start-all.sh? It worked for me the last time I tried using
distcp, and it worked for this guy too.
Nick
What did you see in the log? Was there anything related to MapReduce?
Can you log into your HDFS (data) node, use jps to list all Java processes, and
confirm whether there is a TaskTracker (or NodeManager) process running
alongside the DataNode process?
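A quick way to script that check (jps ships with the JDK; the daemon class names assume stock Hadoop, and the branch for a missing jps is just defensive):

```shell
# List Java processes and look for a compute daemon next to the DataNode.
if command -v jps >/dev/null 2>&1; then
  procs=$(jps)
else
  procs=""                    # no JDK on PATH on this box
fi
case "$procs" in
  *TaskTracker*|*NodeManager*)
    status="compute daemon running" ;;
  *)
    status="no TaskTracker/NodeManager found" ;;
esac
echo "$status"
```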
--
Ye Xianjin
Sent with Sparrow
No tasktracker or nodemanager. This is what I see:
On the master:
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
org.apache.hadoop.hdfs.server.namenode.NameNode
On the data node (slave):
Well, this means you didn't start a compute cluster, most likely because a
wrong value of mapreduce.jobtracker.address prevented the slave node from
starting the node manager. (I am not familiar with the ec2 script, so I don't
know whether the slave node has a node manager installed or not.)
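If mapreduce.jobtracker.address is the suspect, it lives in mapred-site.xml on each node; a sketch with a placeholder hostname (this is the MR1-era setting — under YARN the daemon addressing comes from yarn-site.xml instead):

```xml
<!-- mapred-site.xml; "master-hostname" is a placeholder -->
<configuration>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>master-hostname:9001</value>
  </property>
</configuration>
```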
I've installed a spark standalone cluster on ec2 as defined here -
https://spark.apache.org/docs/latest/ec2-scripts.html. I'm not sure if
mr1/2 is part of this installation.
On Sun, Sep 7, 2014 at 7:25 PM, Ye Xianjin advance...@gmail.com wrote:
I think you need to run start-all.sh or something similar on the EC2
cluster. MR is installed but is not running by default on EC2 clusters spun
up by spark-ec2.
On Sun, Sep 7, 2014 at 12:33 PM, Tomer Benyamini tomer@gmail.com
wrote:
If I recall, you should be able to start Hadoop MapReduce using
~/ephemeral-hdfs/sbin/start-mapred.sh.
On Sun, Sep 7, 2014 at 6:42 AM, Tomer Benyamini tomer@gmail.com wrote:
Hi,
I would like to copy log files from s3 to the cluster's
ephemeral-hdfs. I tried to use distcp, but I guess