Hi again,

You are hitting a problem caused by some code changes that require a modified repo. Unfortunately, I was delayed in updating the documentation to point to the new repo.
Could you try using
http://public-repo-1.hortonworks.com/HDP-1.0.1.14/repos/centos5/hdp-release-1.0.1.14-1.el5.noarch.rpm
or
http://public-repo-1.hortonworks.com/HDP-1.0.1.14/repos/centos6/hdp-release-1.0.1.14-1.el6.noarch.rpm

The above should install the yum repo configs to point to the correct repo, which will have the lzo packages.

-- Hitesh

On Aug 16, 2012, at 9:27 PM, xu peng wrote:

> Hitesh Shah :
>
> It is my pleasure to file a jira for ambari to help other users. As
> a matter of fact, I want to summarize all the problems before I install
> the ambari cluster successfully, and I will give feedback as soon as
> possible.
>
> Here is another problem I encountered when installing hadoop using ambari:
> I found that an rpm package "hadoop-lzp-native" is not in the hdp repo
> (baseurl=http://public-repo-1.hortonworks.com/HDP-1.0.13/repos/centos5),
> so I failed again during the deploying step.
>
> The attachment is the deploying log; please refer to it.
>
> Thanks a lot, and I look forward to your reply.
>
>
> On Tue, Aug 14, 2012 at 11:35 PM, Hitesh Shah <[email protected]> wrote:
>> Ok - the cert issue is sometimes a result of uninstalling and re-installing
>> ambari agents.
>>
>> The re-install causes ambari agents to regenerate a new certificate, and if
>> the master was bootstrapped earlier, it would still be looking to match
>> against old certs.
>>
>> Stop ambari master and remove ambari-agent rpm from all hosts.
>>
>> To fix this:
>> - on the master, do a puppet cert revoke for all hosts (
>> http://docs.puppetlabs.com/man/cert.html )
>> - you can do a cert list to get all signed or non-signed hosts
>>
>> On all hosts, delete the following dirs ( if they exist ) :
>> - /etc/puppet/ssl
>> - /etc/puppet/[master|agent]/ssl/
>> - /var/lib/puppet/ssl/
>>
>>
>> After doing the above, re-install the ambari agent.
>>
>> On the ambari master, stop the master. Run the following command:
>>
>> puppet master --no-daemonize --debug
>>
>> The above runs in the foreground.
>> The reason to run this is to make sure the
>> cert for the master is recreated as we deleted it earlier.
>>
>> Now, kill the above process running in the foreground and do a service
>> ambari start to bring up the UI.
>>
>> You should be able to bootstrap from this point on.
>>
>> Would you mind filing a jira and mentioning all the various issues you have
>> come across and how you solved them? We can use that to create an FAQ for
>> other users.
>>
>> thanks
>> -- Hitesh
>>
>>
>> On Aug 14, 2012, at 1:55 AM, xu peng wrote:
>>
>>> Hi Hitesh :
>>>
>>> Thanks a lot for your reply.
>>>
>>> 1. I did a puppet kick --ping to the clients from my ambari master;
>>> all five nodes failed with the same log (Triggering
>>> vbaby2.cloud.eb
>>> Host vbaby2.cloud.eb failed: certificate verify failed. This is often
>>> because the time is out of sync on the server or client
>>> vbaby2.cloud.eb finished with exit code 2)
>>>
>>> I manually ran "service ambari-agent start"; is that necessary? How
>>> can I fix this problem?
>>>
>>> 2. As you suggested, I ran the yum command manually and found that the
>>> installation was missing a dependency, php-gd, so I have to update my
>>> yum repo.
>>>
>>>
>>>
>>> On Tue, Aug 14, 2012 at 1:01 AM, Hitesh Shah <[email protected]> wrote:
>>>> Based on your deploy error log:
>>>>
>>>> "3": {
>>>>   "nodeReport": {
>>>>     "PUPPET_KICK_FAILED": [],
>>>>     "PUPPET_OPERATION_FAILED": [
>>>>       "vbaby3.cloud.eb",
>>>>       "vbaby5.cloud.eb",
>>>>       "vbaby4.cloud.eb",
>>>>       "vbaby2.cloud.eb",
>>>>       "vbaby6.cloud.eb",
>>>>       "vbaby1.cloud.eb"
>>>>     ],
>>>>     "PUPPET_OPERATION_TIMEDOUT": [
>>>>       "vbaby5.cloud.eb",
>>>>       "vbaby4.cloud.eb",
>>>>       "vbaby2.cloud.eb",
>>>>       "vbaby6.cloud.eb",
>>>>       "vbaby1.cloud.eb"
>>>>     ],
>>>>
>>>> 5 nodes timed out which means the puppet agent is not running on them or
>>>> they cannot communicate with the master. Try doing a puppet kick --ping
>>>> to them from the master.
>>>>
>>>> For the one which failed, it failed at
>>>>
>>>> "\"Mon Aug 13 11:54:17 +0800 2012
>>>> /Stage[1]/Hdp::Pre_install_pkgs/Hdp::Exec[yum install
>>>> $pre_installed_pkgs]/Exec[yum install $pre_installed_pkgs]/returns (err):
>>>> change from notrun to 0 failed: yum install -y hadoop hadoop-libhdfs
>>>> hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hadoop hadoop-libhdfs
>>>> hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hdp_mon_dashboard
>>>> ganglia-gmond-3.2.0 gweb hdp_mon_ganglia_addons snappy snappy-devel
>>>> returned 1 instead of one of [0] at
>>>> /etc/puppet/agent/modules/hdp/manifests/init.pp:265\"",
>>>>
>>>> It seems like yum install failed on the host. Try running the command
>>>> manually and see what the error is.
>>>>
>>>> -- Hitesh
>>>>
>>>>
>>>>
>>>> On Aug 13, 2012, at 2:28 AM, xu peng wrote:
>>>>
>>>>> Hi Hitesh :
>>>>>
>>>>> It's me again.
>>>>>
>>>>> Following your advice, I reinstalled the ambari server. But deploying
>>>>> the cluster and uninstalling the cluster failed again. I really don't
>>>>> know why.
>>>>>
>>>>> I supplied an attachment which contains the logs of all the nodes in
>>>>> my cluster (/var/log/puppet_*.log , /var/log/puppet/*.log ,
>>>>> /var/log/yum.log, /var/log/hmc/hmc.log). vbaby3.cloud.eb is the
>>>>> ambari server. Please refer to it.
>>>>>
>>>>> Attachments DeployError and UninstallError are the logs supplied by the
>>>>> ambari website on failure. Attachment DeployingDetails.jpg shows
>>>>> the deploy details of my cluster. Please refer to them.
>>>>>
>>>>>
>>>>> Thanks again for your patience! I look forward to your reply.
>>>>>
>>>>> Xupeng
>>>>>
>>>>> On Sat, Aug 11, 2012 at 10:56 PM, Hitesh Shah <[email protected]>
>>>>> wrote:
>>>>>> For uninstall failures, you will need to do a couple of things.
>>>>>> Depending on where the uninstall failed, you may have to manually do a
>>>>>> killall java on all the nodes to kill any missed processes.
>>>>>> If you want
>>>>>> to start with a complete clean install, you should also delete the
>>>>>> hadoop dir in the mount points you selected during the previous install
>>>>>> so that the new fresh install does not face errors when it tries to
>>>>>> re-format hdfs.
>>>>>>
>>>>>> After that, simply uninstall and re-install the ambari rpm, and that
>>>>>> should allow you to re-create a fresh cluster.
>>>>>>
>>>>>> -- Hitesh
>>>>>>
>>>>>> On Aug 11, 2012, at 2:34 AM, xu peng wrote:
>>>>>>
>>>>>>> Hi Hitesh :
>>>>>>>
>>>>>>> Thanks a lot for your reply.
>>>>>>>
>>>>>>> I solved this problem; it was a silly mistake. Someone had changed the
>>>>>>> owner of the "/" dir, and according to the error log, pdsh needs "/"
>>>>>>> to be owned by root to proceed.
>>>>>>>
>>>>>>> After changing the owner of "/" to root, the problem was solved. Thank
>>>>>>> you again for your reply.
>>>>>>>
>>>>>>> I have another question. I had an uninstall failure, and there is no
>>>>>>> button on the website for me to roll back, and I don't know what to do
>>>>>>> about that. What should I do now to reinstall hadoop?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Fri, Aug 10, 2012 at 10:55 PM, Hitesh Shah <[email protected]>
>>>>>>> wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Currently, the ambari installer requires everything to be run as root.
>>>>>>>> It does not detect that the user is not root and use sudo either on
>>>>>>>> the master or on the agent nodes.
>>>>>>>> Furthermore, it seems like it is failing when trying to use pdsh to
>>>>>>>> make remote calls to the host list that you passed in due to the
>>>>>>>> errors mentioned in your script. This could be due to how it was
>>>>>>>> installed but I am not sure.
>>>>>>>>
>>>>>>>> Could you switch to become root and run any simple command on all
>>>>>>>> hosts using pdsh?
>>>>>>>> If you want to reference exactly how ambari uses
>>>>>>>> pdsh, you can look into /usr/share/hmc/php/frontend/commandUtils.php
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> -- Hitesh
>>>>>>>>
>>>>>>>> On Aug 9, 2012, at 9:04 PM, xu peng wrote:
>>>>>>>>
>>>>>>>>> According to the error log, is there something wrong with my
>>>>>>>>> account?
>>>>>>>>>
>>>>>>>>> I installed all the dependency modules and ambari with the user
>>>>>>>>> "ambari" instead of root. I added user "ambari" to /etc/sudoers
>>>>>>>>> with no passwd.
>>>>>>>>>
>>>>>>>>> On Fri, Aug 10, 2012 at 11:49 AM, xu peng <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>> There is no 100.log file in the /var/log/hmc dir, only a 55.log
>>>>>>>>>> file (55 is the biggest version num).
>>>>>>>>>>
>>>>>>>>>> The content of 55.log is :
>>>>>>>>>> pdsh@vbaby1: module path "/usr/lib64/pdsh" insecure.
>>>>>>>>>> pdsh@vbaby1: "/": Owner not root, current uid, or pdsh executable
>>>>>>>>>> owner
>>>>>>>>>> pdsh@vbaby1: Couldn't load any pdsh modules
>>>>>>>>>>
>>>>>>>>>> Thanks ~
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 10, 2012 at 11:36 AM, Hitesh Shah
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> Sorry - my mistake. The last txn mentioned is 100 so please look
>>>>>>>>>>> for the 100.log file.
>>>>>>>>>>>
>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Aug 9, 2012, at 8:34 PM, Hitesh Shah wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks - will take a look and get back to you.
>>>>>>>>>>>>
>>>>>>>>>>>> Could you also look at /var/log/hmc/hmc.txn.55.log and see if
>>>>>>>>>>>> there are any errors in it?
>>>>>>>>>>>>
>>>>>>>>>>>> -- Hitesh.
>>>>>>>>>>>>
>>>>>>>>>>>> On Aug 9, 2012, at 8:00 PM, xu peng wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Hitesh :
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks a lot for your reply. I have tried all your suggestions
>>>>>>>>>>>>> on my ambari server, and the results are below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1.
>>>>>>>>>>>>> I can confirm that the hosts.txt file is empty after I failed
>>>>>>>>>>>>> at the step of finding reachable nodes.
>>>>>>>>>>>>> 2. I tried making the hostdetails file in win7 and in redhat; both
>>>>>>>>>>>>> failed. (Please see the attachment, my hostdetails file.)
>>>>>>>>>>>>> 3. I removed the logging re-directs and ran the .sh script. It
>>>>>>>>>>>>> seems like the script works well: it prints the hostname to the
>>>>>>>>>>>>> console and generates a file (content is "0") in the same dir.
>>>>>>>>>>>>> (Please see the attachment, the result and my .sh script.)
>>>>>>>>>>>>> 4. I attached the hmc.log and error_log too. Hope this helps ~
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks ~
>>>>>>>>>>>>> Xupeng
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 12:24 AM, Hitesh Shah
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>> Xupeng, can you confirm that the hosts.txt file at
>>>>>>>>>>>>>> /var/run/hmc/clusters/EBHadoop/hosts.txt is empty?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, can you ensure that the hostdetails file that you upload
>>>>>>>>>>>>>> does not have any special characters that may be creating
>>>>>>>>>>>>>> problems for the parsing layer?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the same dir, there should be an ssh.sh script. Can you
>>>>>>>>>>>>>> create a copy of it, edit to remove the logging re-directs to
>>>>>>>>>>>>>> files and run the script manually from command-line ( it takes
>>>>>>>>>>>>>> in a hostname as the argument ) ? The output of that should show
>>>>>>>>>>>>>> you what is going wrong.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, please look at /var/log/hmc/hmc.log and httpd/error_log to
>>>>>>>>>>>>>> see if there are any errors being logged which may shed more
>>>>>>>>>>>>>> light on the issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 9, 2012, at 9:11 AM, Artem Ervits wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Which file are you supplying in the step?
>>>>>>>>>>>>>>> Hostdetail.txt or hosts?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> From: xupeng.bupt [mailto:[email protected]]
>>>>>>>>>>>>>>> Sent: Thursday, August 09, 2012 11:33 AM
>>>>>>>>>>>>>>> To: ambari-user
>>>>>>>>>>>>>>> Subject: Re: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you for your reply ~
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I made only one hostdetail.txt file, which contains the names of
>>>>>>>>>>>>>>> all servers, and I submitted this file on the website, but I
>>>>>>>>>>>>>>> still have the same problem. I failed at the step of finding
>>>>>>>>>>>>>>> reachable nodes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The error log is : "
>>>>>>>>>>>>>>> [ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args:
>>>>>>>>>>>>>>> EBHadoop root
>>>>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>>>>> "
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And my hostdetail.txt file is :"
>>>>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>>>>> "
>>>>>>>>>>>>>>> Thank you very much ~
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2012-08-09
>>>>>>>>>>>>>>> xupeng.bupt
>>>>>>>>>>>>>>> From: Artem Ervits
>>>>>>>>>>>>>>> Sent: 2012-08-09 22:16:53
>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>> Cc:
>>>>>>>>>>>>>>> Subject: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>> The installer requires a hosts file, which I believe you called
>>>>>>>>>>>>>>> hostdetail. Make sure it's the same file. You also mention a
>>>>>>>>>>>>>>> hosts.txt and host.txt. You only need one file with the names
>>>>>>>>>>>>>>> of all servers.
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: xu peng [mailto:[email protected]]
>>>>>>>>>>>>>>> Sent: Thursday, August 09, 2012 2:02 AM
>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>> Subject: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>> Hi everyone :
>>>>>>>>>>>>>>> I am trying to use ambari to set up a hadoop cluster, but I
>>>>>>>>>>>>>>> encountered a problem on step 2. I already set up
>>>>>>>>>>>>>>> password-less ssh, and I created a hostdetail.txt file.
>>>>>>>>>>>>>>> The problem is that I found the file
>>>>>>>>>>>>>>> "/var/run/hmc/clusters/EBHadoop/hosts.txt" is empty, no matter
>>>>>>>>>>>>>>> how many times I submit the host.txt file on the website, and
>>>>>>>>>>>>>>> I really don't know why.
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>> Here is the log file : [2012:08:09
>>>>>>>>>>>>>>> 05:17:56][ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args:
>>>>>>>>>>>>>>> EBHadoop root
>>>>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>>>>> and my host.txt is like this (vbaby1.cloud.eb is the master
>>>>>>>>>>>>>>> node) :
>>>>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> Can anyone help me and tell me what I am doing wrong?
>>>>>>>>>>>>>>> Thank you very much ~!
>>>>>>>>>>>>>>> This electronic message is intended to be for the use only of
>>>>>>>>>>>>>>> the named recipient, and may contain information that is
>>>>>>>>>>>>>>> confidential or privileged. If you are not the intended
>>>>>>>>>>>>>>> recipient, you are hereby notified that any disclosure,
>>>>>>>>>>>>>>> copying, distribution or use of the contents of this message is
>>>>>>>>>>>>>>> strictly prohibited.
>>>>>>>>>>>>>>> If you have received this message in error
>>>>>>>>>>>>>>> or are not the named recipient, please notify us immediately by
>>>>>>>>>>>>>>> contacting the sender at the electronic mail address noted
>>>>>>>>>>>>>>> above, and delete and destroy all copies of this message. Thank
>>>>>>>>>>>>>>> you.
>>>>>>>>>>>>> <hmcLog.txt><hostdetails.txt><httpdLog.txt><ssh1.sh><ssh1_result.jpg>
>>>>> <DeployError1_2012.8.13.txt><log.rar><DeployingDetails.jpg><UninstallError1_2012.8.13.txt>
> <deployError2012.8.17.txt>
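
One of the earlier suggestions in this thread was to make sure the uploaded hostdetails file has no special characters that could confuse the parsing layer. A minimal sketch of such a check in Python follows. It is not part of ambari: the function name find_host_file_problems and the particular checks (byte-order mark, Windows line endings, non-ASCII bytes, stray whitespace) are assumptions about what might cause trouble, not a description of what the installer actually rejects.

```python
def find_host_file_problems(data):
    """Return a list of warnings for a hosts file passed in as raw bytes.

    Hypothetical helper for illustration only; ambari does not ship it.
    """
    problems = []
    # Some editors prepend a UTF-8 byte-order mark to the file.
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte-order mark")
        data = data[3:]
    for number, line in enumerate(data.split(b"\n"), start=1):
        # A file saved on Windows ends each line with CR LF.
        if line.endswith(b"\r"):
            problems.append("line %d: Windows line ending (CR)" % number)
            line = line[:-1]
        if line == b"":
            continue  # blank lines are ignored, not reported
        # Iterating over bytes yields integers; anything above 127 is non-ASCII.
        if any(byte > 127 for byte in line):
            problems.append("line %d: non-ASCII character" % number)
        # A clean line is exactly one whitespace-free token.
        if line.split() != [line]:
            problems.append("line %d: stray whitespace" % number)
    return problems
```

To try it on a real file, something like find_host_file_problems(open("hostdetails.txt", "rb").read()) should work; an empty list means none of these particular problems were found. The file is read as bytes on purpose, so that line-ending and encoding issues are visible rather than silently normalized.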
