Help troubleshooting multi-cluster setup

2015-09-23 Thread Daniel Watrous
Hi, I have deployed a multi-node cluster with one master and two data nodes. Here's what jps shows:
    hadoop@hadoop-master:~$ jps
    24641 SecondaryNameNode
    24435 DataNode
    24261 NameNode
    24791 ResourceManager
    25483 Jps
    24940 NodeManager
    hadoop@hadoop-data1:~$ jps
    15556 DataNode
    16198 NodeManager

Re: Help troubleshooting multi-cluster setup

2015-09-23 Thread Daniel Watrous
I'm not sure if this is related, but I'm seeing some errors in hadoop-hadoop-namenode-hadoop-master.log:
    2015-09-23 19:56:27,798 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.51.1,
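
This warning typically means the NameNode could not reverse-resolve the IP address a DataNode registered from. Giving every node a resolvable name (for example, matching /etc/hosts entries on all machines) is the usual remedy. As a workaround, the NameNode's check can be relaxed in hdfs-site.xml. The fragment below is a sketch assuming Hadoop 2.x; it is not confirmed as the fix used in this thread.

```xml
<!-- hdfs-site.xml on the NameNode (assumed location) -->
<configuration>
  <property>
    <!-- Accept DataNode registrations whose IP has no resolvable hostname -->
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
</configuration>
```

The NameNode would need a restart for the change to take effect.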

Re: Help troubleshooting multi-cluster setup

2015-09-23 Thread Kuhu Shukla
Hi Daniel, the RM will list only NodeManagers and not the DataNodes. You can view the DataNodes on the NameNode page (e.g. 192.168.51.4:50070). The one node you see in the 'Nodes' list on the RM page is from this: hadoop@hadoop-master:~$ jps 24641 SecondaryNameNode 24435 DataNode 24261 NameNode 24791

Re: Help troubleshooting multi-cluster setup

2015-09-23 Thread Daniel Watrous
I was able to get jobs submitting to the cluster by adding the following property to mapred-site.xml: mapreduce.framework.name = yarn. I also had to add the following properties to yarn-site.xml: yarn.nodemanager.aux-services = mapreduce_shuffle
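
For reference, the properties named above would look roughly like this in the two files. This is a sketch of the standard Hadoop 2.x property layout. The yarn-site.xml fragment shows only the one property quoted in the message; anything else the original message listed is not reproduced here.

```xml
<!-- mapred-site.xml: submit MapReduce jobs to YARN rather than the default local runner -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

```xml
<!-- yarn-site.xml: enable the shuffle auxiliary service on each NodeManager -->
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```

Both files would typically need to be identical on every node, with the YARN daemons restarted afterwards.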

Re: Multi-Cluster Setup

2014-07-04 Thread fab wol
Hey Rahul, thanks for pointing me to that page. It's definitely worth a read. Do both clusters need to be at least v2.3 for that? I also dug a little further. There is the property fs.defaultFS, which might be the exact setting I was looking for. Unfortunately MapR

Multi-Cluster Setup

2014-07-03 Thread fab wol
Hey everyone, MapR offers the possibility to access another cluster's HDFS/MapRFS from one cluster, e.g. a compute-only cluster without much storage capacity (see http://doc.mapr.com/display/MapR/mapr-clusters.conf). In times of Hadoop-as-a-Service this becomes very interesting. Is this

Re: Multi-Cluster Setup

2014-07-03 Thread Nitin Pawar
Nothing is stopping you from implementing the cluster the way you want. You can have storage-only nodes for your HDFS and not run tasktrackers on them. Start a bunch of machines with high RAM and high CPU but no storage. The only thing to worry about then would be the network bandwidth to carry data from HDFS to

Re: Multi-Cluster Setup

2014-07-03 Thread fab wol
Hey Nitin, I'm not talking about the concept. I'm talking about how to actually do it technically and how to set it up. Imagine this: I have two clusters, both running fine and both set up the same way, except that one has way more tasktrackers/NodeManagers than the other. Now I

Re: Multi-Cluster Setup

2014-07-03 Thread Rahul Chaudhari
Fabian, I see this as the classic case of federation of Hadoop clusters. The MR job can refer to a specific hdfs:// file location as input but at the same time run on another cluster. You can refer to the following link for further details on federation.