Forgive me if I sound rude, but please re-read the installation instructions carefully; it should help in your case.
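As a rough aid for the /etc/hosts points in this mail, here is a minimal Python sketch that flags the two mistakes this thread keeps running into: extra names on the loopback lines, and a first name that is not fully qualified or that uses localhost as the host part. The function names (parse_hosts, check_hosts) and the rules themselves are my own assumptions for illustration; this is not a check that Ambari itself runs.

```python
# Names that the stock CentOS /etc/hosts keeps on the loopback lines.
STOCK_LOOPBACK_NAMES = (
    "localhost", "localhost.localdomain",
    "localhost4", "localhost4.localdomain4",
    "localhost6", "localhost6.localdomain6",
)


def parse_hosts(text):
    """Return (ip, [names]) pairs, skipping blank lines and comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            fields = line.split()
            entries.append((fields[0], fields[1:]))
    return entries


def check_hosts(text):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for ip, names in parse_hosts(text):
        if ip.startswith("127.") or ip == "::1":
            # Assumed rule: loopback lines carry only the stock localhost names.
            for name in names:
                if name not in STOCK_LOOPBACK_NAMES:
                    problems.append(
                        "%s: %s does not belong on a loopback line" % (ip, name))
        elif not names:
            problems.append("%s: no hostname given" % ip)
        else:
            # Assumed rule: the first name on the line is the FQDN.
            fqdn = names[0]
            if "." not in fqdn:
                problems.append("%s: %s is not fully qualified" % (ip, fqdn))
            elif fqdn.split(".")[0].startswith("localhost"):
                problems.append(
                    "%s: %s uses localhost as the host part" % (ip, fqdn))
    return problems
```

Passing the contents of /etc/hosts on each box to check_hosts and getting an empty list back would be a reasonable first sanity check before retrying the wizard. It only looks at the file, so it says nothing about DNS or about what hostname -f actually returns.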
1. Have a sound naming convention for all your boxes, e.g. namenode01.localdomain, datanode01.localdomain, datanode0N.localdomain. This will help you a lot in future expansion and maintenance of your cluster.
2. Do not, by any means, tamper with the /etc/hosts entries for 127.0.0.1 and ::1. Leave only the localhost names on those lines. Adding other names there only creates problems for the OS's own internal lookups, so leaving them alone keeps normal operations on the box working.
3. If you have DHCP plus a very good DNS server in place, then okay. Otherwise, assign static IPs to your machines and create one /etc/hosts entry for each box with its FQDN and static IP address, replicated on ALL the boxes.
4. Set up keyless ssh login for root, or for any other uniform local user that you want to use to manage Ambari + Hadoop.
5. Confirm that the namenode and the Ambari server machines (in case they are different for you) can talk to ALL the machines using a keyless login for the universal user you created in the steps above.

Hope the above will help you sort out the issue in a single go.

regards
Dev

On Tue, Dec 16, 2014 at 11:45 PM, David Novogrodsky <[email protected]> wrote:
>
> There is nothing simply done in Ambari. :)
>
> By changing the name of this computer and restarting the namenode Ambari
> does not recognize any node. The main error I am wondering about is this:
> INFO 2014-12-16 12:02:29,669 main.py:233 - Connecting to Ambari server at
> https://namenode.localdomain:8440 (98.124.198.1)
> INFO 2014-12-16 12:02:29,670 NetUtil.py:48 - Connecting to
> https://namenode.localdomain:8440/ca
> WARNING 2014-12-16 12:02:29,718 NetUtil.py:71 - Failed to connect to
> https://namenode.localdomain:8440/ca due to [Errno 111] Connection
> refused
> WARNING 2014-12-16 12:02:29,719 NetUtil.py:92 - Server at
> https://namenode.localdomain:8440 is not reachable, sleeping for 10
> seconds...
> ', None)
> Why is Ambari using namenode.localdomain to connect?
>
> I am running Ambari on this node; I am running Ambari on the namenode of
> this cluster. The host file for this computer is this:
> GNU nano 2.0.9 File:
> /etc/hosts
>
> 127.0.0.1 localhost localhost.localdomain localhost4
> localhost4.localdomain4
> ::1 localhost localhost.localdomain localhost6
> localhost6.localdomain6
> 192.168.200.144 localhost.datanode10
> 192.168.200.107 localhost.datanode01
> 192.168.200.143 namenode.localdomain.com namenode
>
> The Ambari wizard said I needed to use fully qualified domain names, so
>
> What follows is a detailed log of the registration log. I get this error
> in the registration log for namenode.localdomain.com:
> --
> ==========================
> Creating target directory...
> ==========================
>
> Command start time 2014-12-16 12:02:18
>
> Connection to namenode.localdomain.com closed.
> SSH command execution finished
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:18
>
> ==========================
> Copying common functions script...
> ==========================
>
> Command start time 2014-12-16 12:02:18
>
> scp /usr/lib/python2.6/site-packages/ambari_commons
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:18
>
> ==========================
> Copying OS type check script...
> ==========================
>
> Command start time 2014-12-16 12:02:18
>
> scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:18
>
> ==========================
> Running OS type check...
> ==========================
>
> Command start time 2014-12-16 12:02:18
> Cluster primary/cluster OS type is redhat6 and local/current OS type is
> redhat6
>
> Connection to namenode.localdomain.com closed.
> SSH command execution finished
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:19
>
> ==========================
> Checking 'sudo' package on remote host...
> ==========================
>
> Command start time 2014-12-16 12:02:19
> sudo-1.8.6p3-15.el6.x86_64
>
> Connection to namenode.localdomain.com closed.
> SSH command execution finished
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:20
>
> ==========================
> Copying repo file to 'tmp' folder...
> ==========================
>
> Command start time 2014-12-16 12:02:20
>
> scp /etc/yum.repos.d/ambari.repo
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:20
>
> ==========================
> Moving file to repo dir...
> ==========================
>
> Command start time 2014-12-16 12:02:20
>
> Connection to namenode.localdomain.com closed.
> SSH command execution finished
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:21
>
> ==========================
> Copying setup script file...
> ==========================
>
> Command start time 2014-12-16 12:02:21
>
> scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:21
>
> ==========================
> Running setup agent script...
> ==========================
>
> Command start time 2014-12-16 12:02:21
> Verifying Python version compatibility...
> Using python /usr/bin/python2.6
> Found ambari-agent PID: 5036
> Stopping ambari-agent
> Removing PID file at /var/run/ambari-agent/ambari-agent.pid
> ambari-agent successfully stopped
> Restarting ambari-agent
> Verifying Python version compatibility...
> Using python /usr/bin/python2.6
> ambari-agent is not running. No PID found at
> /var/run/ambari-agent/ambari-agent.pid
> Verifying Python version compatibility...
> Using python /usr/bin/python2.6
> Checking for previously running Ambari Agent...
> Starting ambari-agent
> Verifying ambari-agent process status...
> Ambari Agent successfully started
> Agent PID at: /var/run/ambari-agent/ambari-agent.pid
> Agent out at: /var/log/ambari-agent/ambari-agent.out
> Agent log at: /var/log/ambari-agent/ambari-agent.log
> ('WARNING 2014-12-16 12:01:59,642 NetUtil.py:92 - Server at
> https://namenode.localdomain:8440 is not reachable, sleeping for 10
> seconds...
> INFO 2014-12-16 12:02:09,653 NetUtil.py:48 - Connecting to
> https://namenode.localdomain:8440/ca
> WARNING 2014-12-16 12:02:09,701 NetUtil.py:71 - Failed to connect to
> https://namenode.localdomain:8440/ca due to [Errno 111] Connection
> refused
> WARNING 2014-12-16 12:02:09,701 NetUtil.py:92 - Server at
> https://namenode.localdomain:8440 is not reachable, sleeping for 10
> seconds...
> INFO 2014-12-16 12:02:19,711 NetUtil.py:48 - Connecting to
> https://namenode.localdomain:8440/ca
> WARNING 2014-12-16 12:02:19,770 NetUtil.py:71 - Failed to connect to
> https://namenode.localdomain:8440/ca due to [Errno 111] Connection
> refused
> WARNING 2014-12-16 12:02:19,770 NetUtil.py:92 - Server at
> https://namenode.localdomain:8440 is not reachable, sleeping for 10
> seconds...
> INFO 2014-12-16 12:02:22,680 main.py:83 - loglevel=logging.INFO
> INFO 2014-12-16 12:02:22,681 main.py:55 - signal received, exiting.
> INFO 2014-12-16 12:02:22,681 ProcessHelper.py:39 - Removing pid file
> INFO 2014-12-16 12:02:22,681 ProcessHelper.py:46 - Removing temp files
> INFO 2014-12-16 12:02:29,532 main.py:83 - loglevel=logging.INFO
> INFO 2014-12-16 12:02:29,533 DataCleaner.py:36 - Data cleanup thread
> started
> INFO 2014-12-16 12:02:29,534 DataCleaner.py:117 - Data cleanup started
> INFO 2014-12-16 12:02:29,542 DataCleaner.py:119 - Data cleanup finished
> INFO 2014-12-16 12:02:29,667 PingPortListener.py:51 - Ping port listener
> started on port: 8670
> INFO 2014-12-16 12:02:29,669 main.py:233 - Connecting to Ambari server at
> https://namenode.localdomain:8440 (98.124.198.1)
> INFO 2014-12-16 12:02:29,670 NetUtil.py:48 - Connecting to
> https://namenode.localdomain:8440/ca
> WARNING 2014-12-16 12:02:29,718 NetUtil.py:71 - Failed to connect to
> https://namenode.localdomain:8440/ca due to [Errno 111] Connection
> refused
> WARNING 2014-12-16 12:02:29,719 NetUtil.py:92 - Server at
> https://namenode.localdomain:8440 is not reachable, sleeping for 10
> seconds...
> ', None)
>
> Connection to namenode.localdomain.com closed.
> SSH command execution finished
> host=namenode.localdomain.com, exitcode=0
> Command end time 2014-12-16 12:02:32
>
> Registering with the server...
> Registration with the server failed.
> ----
>
> David Novogrodsky
> [email protected]
> http://www.linkedin.com/in/davidnovogrodsky
>
> On Mon, Dec 15, 2014 at 10:02 PM, Devopam Mittra <[email protected]>
> wrote:
>>
>> May I suggest you simply do a ssh -l <keylessusername> using the previous
>> and the new FQDNs that you have defined to verify which one is in effect,
>> and accessible ?
>> Also, since you changed the FQDN, you may wish to simply reboot the
>> cluster once, just to make sure that new ones are in-place.
>> It might happen that after the reboot you will need to redo the ssh
>> keyless pairing once again (most probably)
>>
>> regards
>> Devopam
>>
>>
>> On Tue, Dec 16, 2014 at 4:32 AM, David Novogrodsky <
>> [email protected]> wrote:
>>>
>>> The changes I am making in the hosts file are not being picked up by the
>>> installation scripts of Ambari. I was told I could make changes to the
>>> hosts file and that Ambari would see them. I have
>>> checked the etc/ambari-agent/conf/ambari-agent.ini file and the changes
>>> I made to the hosts file are not showing up in that file. Where is Ambari
>>> getting the names for the other nodes in the cluster?
>>>
>>> Here are the changes I made to the hosts file on the host for the name
>>> node:
>>> 127.0.0.1 localhost localhost.localdomain localhost4
>>> localhost4.localdomain4
>>> ::1 localhost localhost.localdomain localhost6
>>> localhost6.localdomain6
>>> 192.168.200.144 datanode10.localdomain
>>> 192.168.200.107 datanode01.localdomain
>>> 192.168.200.143 namenode.localdomain namenode
>>>
>>> Since I made these changes Ambari can not discover any of the nodes in
>>> the network. None of them.
>>>
>>> I have not made these changes to the other nodes because I do not want
>>> to make changes to the other nodes until I can see Ambari discover the host
>>> it is sitting upon.
>>>
>>> Regarding the commands you mentioned, here are the results:
>>> [root@localhost conf]# hostname -f
>>> hostname: Unknown host
>>> [root@localhost conf]# hostname
>>> localhost.namenode
>>> [root@localhost conf]# python -c 'import socket; print
>>> socket.getfqdn()'
>>> localhost.namenode
>>>
>>> localhost.namenode was the name I used for this host during the
>>> installation of CentOS. I thought you said I could make changes to the
>>> hosts file and the installation scripts would recognize them?
>>>
>>> From the Confirm Hosts page I am getting the following errors:
>>> for connecting to the name node
>>>
>>> STDOUT: {'exitstatus': 1, 'log': "Host registration aborted. Ambari Agent
>>> host
>>> cannot reach Ambari Server 'localhost.namenode:8080'. Please check the
>>> network
>>> connectivity between the Ambari Agent host and the Ambari Server"}
>>>
>>> for connecting to the datanode10
>>>
>>> INFO 2014-12-15 16:42:33,348 DataCleaner.py:36 - Data cleanup thread started
>>> ERROR 2014-12-15 16:42:33,349 main.py:137 - Ambari agent machine hostname
>>> (localhost.datanode10) does not match expected ambari server hostname
>>> (datanode10.localdomain). Aborting registration. Please check hostname,
>>> hostname -f and /etc/hosts file to confirm your hostname is setup correctly
>>> ', None)
>>>
>>> I am getting a similar error when trying to get to the datanode01.
>>> Please note I used the following domain names for the following datanodes
>>> when I installed the CentOS
>>> datanode 10 --> localhost.datanode10
>>> datanode01 --> localhost.datanode01
>>>
>>>
>>> David Novogrodsky
>>> [email protected]
>>> http://www.linkedin.com/in/davidnovogrodsky
>>>
>>> On Mon, Dec 15, 2014 at 11:50 AM, Yusaku Sako <[email protected]>
>>> wrote:
>>>>
>>>> Did you change the FQDNs like I proposed, like namenode.localdomain,
>>>> rather than localhost.namenode?
>>>> Did you ensure that the 3 commands returned the results as shown?
>>>> Can each host resolve all the other hosts by name?
>>>>
>>>> If you want to get a cluster up and running on VMs, the best bet is to
>>>> use:
>>>> https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide
>>>>
>>>> This sets up all /etc/hosts and other settings in the way you want.
>>>> Then you can see how these VMs are being set up and mimic on your VMs
>>>> if you'd rather set them up from scratch.
>>>>
>>>> I hope this helps.
>>>> Yusaku
>>>>
>>>>
>>>> On Mon, Dec 15, 2014 at 8:18 AM, David Novogrodsky <
>>>> [email protected]> wrote:
>>>>>
>>>>> Ok, I removed the multiple instances of localhost.namenode. It now
>>>>> only appears on one line in the hosts file.
>>>>>
>>>>> The main ambari server still cannot see the data nodes nor the node
>>>>> Ambari is on. Ambari is on the namenode. When I run the install, the
>>>>> install program can not connect to any node in the network.
>>>>>
>>>>> Also I tried running /etc/init.d/network restart on one of the nodes;
>>>>> datanode10 ( a virtual machine). Now that node cannot connect to the
>>>>> internet....I would like to send you the information but I am having
>>>>> problems setting the document from the virtual machine.
>>>>>
>>>>> I do not have a DNS. These machines have hardwired IP addresses and
>>>>> names in the host file. Did running /etc/init.d/network restart break the
>>>>> connection?
>>>>>
>>>>>
>>>>> David Novogrodsky
>>>>> [email protected]
>>>>> http://www.linkedin.com/in/davidnovogrodsky
>>>>>
>>>>> On Sat, Dec 13, 2014 at 12:46 AM, Yusaku Sako <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> You can just make the changes in /etc/hosts. You might also
>>>>>> change /etc/sysconfig/network and run /etc/init.d/network restart.
>>>>>>
>>>>>> Then make sure that running the 3 commands return expected results.
>>>>>>
>>>>>> Yusaku
>>>>>>
>>>>>> On Fri, Dec 12, 2014 at 9:06 PM, David Novogrodsky <
>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> When I installed the CentOS on the machines, I chose those names,
>>>>>>> localhost.datanode01...and so on. You mean I have to reinstall CentOS
>>>>>>> on
>>>>>>> the machines again?
>>>>>>>
>>>>>>> Can I just make the changes in the host files?
>>>>>>>
>>>>>>> Will I need to recreate the SSH keys?
>>>>>>>
>>>>>>> David Novogrodsky
>>>>>>> [email protected]
>>>>>>> http://www.linkedin.com/in/davidnovogrodsky
>>>>>>>
>>>>>>> On Fri, Dec 12, 2014 at 6:21 PM, Yusaku Sako <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I would set it up like this:
>>>>>>>>
>>>>>>>> 127.0.0.1 localhost localhost.localdomain localhost4
>>>>>>>> localhost4.localdomain4* <- do not list the hostname here. *
>>>>>>>> ::1 localhost localhost.localdomain localhost6
>>>>>>>> localhost6.localdomain6
>>>>>>>> xxx.xxx.200.144 datanode10.localdomain
>>>>>>>> xxx.xxx.200.107 datanode01.localdomain
>>>>>>>> xxx.xxx.200.143 namenode.localdomain namenode
>>>>>>>>
>>>>>>>> With this change:
>>>>>>>> * *hostname -f* should display *namenode.localdomain*
>>>>>>>> * *hostname* should display *namenode*
>>>>>>>> * *python -c 'import socket; print socket.getfqdn()' *should
>>>>>>>> display *namenode.localdomain*
>>>>>>>>
>>>>>>>> I hope this helps.
>>>>>>>> Yusaku
>>>>>>>>
>>>>>>>> On Fri, Dec 12, 2014 at 3:52 PM, David Novogrodsky <
>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> All,
>>>>>>>>>
>>>>>>>>> I am having a problem with Ambari.
>>>>>>>>> I am trying to use Ambari to install Hadoop to a three node
>>>>>>>>> cluster. the name node is where the Ambari server is located. I am
>>>>>>>>> getting
>>>>>>>>> this error:
>>>>>>>>> ERROR 2014-12-12 17:39:56,963 main.py:137 – Ambari agent machine
>>>>>>>>> hostname (localhost.localdomain) does not match expected ambari server
>>>>>>>>> hostname (namenode). Aborting registration.
>>>>>>>>> Please check hostname, hostname -f and /etc/hosts file to confirm
>>>>>>>>> your hostname is setup correctly
>>>>>>>>> ‘, None)
>>>>>>>>>
>>>>>>>>> Here is the contents of my hosts file:
>>>>>>>>> 127.0.0.1 localhost localhost.localdomain localhost4
>>>>>>>>> localhost4.localdomain4 localhost.namenode namenode
>>>>>>>>> ::1 localhost localhost.localdomain localhost6
>>>>>>>>> localhost6.localdomain6
>>>>>>>>> xxx.xxx.200.144 localhost.datanode10
>>>>>>>>> xxx.xxx.200.107 localhost.datanode01
>>>>>>>>> xxx.xxx.200.143 localhost.namenode namenode
>>>>>>>>>
>>>>>>>>> I am not sure what the problem is. Since there are only four steps
>>>>>>>>> to run ambari there is not a lot of background to determine the cause
>>>>>>>>> of
>>>>>>>>> this problem.
>>>>>>>>>
>>>>>>>>> David Novogrodsky
>>>>>>>>> [email protected]
>>>>>>>>> http://www.linkedin.com/in/davidnovogrodsky
>>>>>>>>>
>>>>>>>>
>>>>>>>> CONFIDENTIALITY NOTICE
>>>>>>>> NOTICE: This message is intended for the use of the individual or
>>>>>>>> entity to which it is addressed and may contain information that is
>>>>>>>> confidential, privileged and exempt from disclosure under applicable
>>>>>>>> law.
>>>>>>>> If the reader of this message is not the intended recipient, you are
>>>>>>>> hereby
>>>>>>>> notified that any printing, copying, dissemination, distribution,
>>>>>>>> disclosure or forwarding of this communication is strictly prohibited.
>>>>>>>> If
>>>>>>>> you have received this communication in error, please contact the
>>>>>>>> sender
>>>>>>>> immediately and delete it from your system. Thank You.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>> --
>> Devopam Mittra
>> Life and Relations are not binary
>>
>
--
Devopam Mittra
Life and Relations are not binary
