Yes - not using /dev/mapper/hdvg-rootlv was what I was planning to suggest.
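To see why that mount point cannot work, note that the /dev/mapper entry is a device node rather than a directory. A quick check (the custom path shown is only an example):

    ls -ld /dev/mapper/hdvg-rootlv   # a leading "b" in the mode means block device, not a dir
    mount | grep hdvg-rootlv         # find where the volume is actually mounted
    mkdir -p /hadoop/hdfs/data       # then use a directory on that filesystem instead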
It seems to me that you installed mod_passenger and/or ambari on vbaby1. Is this your new ambari master?

Try doing this on vbaby1:

$ puppet master --no-daemonize --debug

The above will create the cert required by the puppet master running in httpd. Kill the above process (it will run in the foreground). Now, try the httpd restart.

(Also, note that you should not need to do anything for the ganglia start unless you are running the ganglia server on the same host as ambari.)
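Spelled out as a shell session, the sequence above might look like this (a sketch; paths and service names assume a stock RHEL/CentOS layout):

    # Run the master once in the foreground so it bootstraps its CA and cert:
    puppet master --no-daemonize --debug
    # ...once it settles, stop it with Ctrl-C.

    # The file httpd complained about should now exist:
    ls -l /var/lib/puppet/ssl/ca/ca_crt.pem

    # Restart httpd so mod_passenger picks up the new cert:
    service httpd restart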
thanks
-- Hitesh

On Aug 19, 2012, at 6:03 AM, xu peng wrote:

> Hi Hitesh:
>
> It is me again.
>
> I figured out the previous problem by changing the mount point to a custom path.
>
> But I failed at the step of starting the ganglia server.
>
> I ran this command manually on the vbaby1 node, but it failed. The other nodes succeeded.
> ([root@vbaby1 log]# service httpd start
> Starting httpd: Syntax error on line 37 of /etc/httpd/conf.d/puppetmaster.conf:
> SSLCertificateChainFile: file '/var/lib/puppet/ssl/ca/ca_crt.pem' does not exist or is empty
> [FAILED])
>
> Please refer to the error log.
>
> Thanks a lot.
>
> On Sun, Aug 19, 2012 at 7:26 PM, xu peng <[email protected]> wrote:
>> Hi Hitesh:
>>
>> It is me again.
>>
>> I encountered another problem while deploying the services. According to the log, something went wrong when executing a command (Dependency Exec[mkdir -p /dev/mapper/hdvg-rootlv/hadoop/hdfs/data] has failures: true).
>>
>> Please refer to the attachment. It seems like all the rpm packages installed successfully, and I don't know where the dependency failed.
>>
>> Please help, thanks a lot.
>>
>> On Sun, Aug 19, 2012 at 8:08 AM, xu peng <[email protected]> wrote:
>>> Hi Hitesh:
>>>
>>> I used the default settings for the mount point, but this path (/dev/mapper/hdvg-rootlv/) is not a directory, and I cannot execute mkdir -p on it. hdvg-rootlv is a block device file (brwxrwxrwx). Is there something wrong?
>>>
>>> On Sun, Aug 19, 2012 at 3:38 AM, Hitesh Shah <[email protected]> wrote:
>>>> Hi
>>>>
>>>> Yes - you should install all packages from the new repo and none from the old repo. Most of the packages are the same, but some, like hadoop-lzo, were re-factored to work correctly with respect to 32/64-bit installs on RHEL6.
>>>>
>>>> Regarding the mount points: from a hadoop point of view, the namenode and datanode dirs are just dirs. From a performance point of view, you want each dir to be created on a separate mount point to increase disk I/O bandwidth. This means that the mount points you select in the UI should allow directories to be created. If you have mounted certain kinds of filesystems which you do not wish to use for hadoop (tmpfs, nfs mounts, etc.), you should de-select them in the UI and/or use the custom mount point text box as appropriate. The UI currently does not distinguish valid mount points, so it is up to the user to select correctly.
>>>>
>>>> -- Hitesh
>>>>
>>>> On Aug 18, 2012, at 9:48 AM, xu peng wrote:
>>>>
>>>>> Hi Hitesh:
>>>>>
>>>>> Thanks again for your reply.
>>>>>
>>>>> I solved the dependency problem after updating the hdp repo.
>>>>>
>>>>> But here come two new problems:
>>>>>
>>>>> 1. I updated to the new hdp repo, but I had created a local copy of the old hdp repo, and I installed all the rpm packages except hadoop-lzo-native from the old repo. So it seems that hadoop-lzo-native has some conflict with hadoop-lzo. Do I have to install all the rpm packages from the new repo?
>>>>>
>>>>> 2. In the error log I can see a command "mkdir -p /var/.../.. (mount point of hadoop)", but I found the mount point is not a dir but a block device file (brwxrwxrwx), and the execution of this step failed. Did I do something wrong?
>>>>>
>>>>> I am sorry that this deploy error log is on my company's computer; I will upload it in my next email.
>>>>>
>>>>> Thanks
>>>>> -- Xupeng
>>>>>
>>>>> On Sat, Aug 18, 2012 at 4:43 AM, Hitesh Shah <[email protected]> wrote:
>>>>>> Hi again,
>>>>>>
>>>>>> You are actually hitting a problem caused by some changes in the code which require a modified repo. Unfortunately, I got delayed in modifying the documentation to point to the new repo.
>>>>>>
>>>>>> Could you try using
>>>>>> http://public-repo-1.hortonworks.com/HDP-1.0.1.14/repos/centos5/hdp-release-1.0.1.14-1.el5.noarch.rpm
>>>>>> or
>>>>>> http://public-repo-1.hortonworks.com/HDP-1.0.1.14/repos/centos6/hdp-release-1.0.1.14-1.el6.noarch.rpm
>>>>>>
>>>>>> The above should install the yum repo configs to point to the correct repo, which will have the lzo packages.
>>>>>>
>>>>>> -- Hitesh
>>>>>>
>>>>>> On Aug 16, 2012, at 9:27 PM, xu peng wrote:
>>>>>>
>>>>>>> Hitesh Shah:
>>>>>>>
>>>>>>> It is my pleasure to file an ambari jira to help other users. As a matter of fact, I want to summarize all the problems before I install the ambari cluster successfully, and I will report back as soon as possible.
>>>>>>>
>>>>>>> Here is another problem I encountered when installing hadoop using ambari: the rpm package "hadoop-lzo-native" is not in the hdp repo (baseurl=http://public-repo-1.hortonworks.com/HDP-1.0.13/repos/centos5), so I failed again during the deploy step.
>>>>>>>
>>>>>>> The attachment is the deploy log; please refer to it.
>>>>>>>
>>>>>>> Thanks a lot, and I look forward to your reply.
>>>>>>>
>>>>>>> On Tue, Aug 14, 2012 at 11:35 PM, Hitesh Shah <[email protected]> wrote:
>>>>>>>> Ok - the cert issue is sometimes a result of uninstalling and re-installing ambari agents.
>>>>>>>>
>>>>>>>> The re-install causes ambari agents to regenerate a new certificate, and if the master was bootstrapped earlier, it would still be looking to match against the old certs.
>>>>>>>>
>>>>>>>> Stop the ambari master and remove the ambari-agent rpm from all hosts.
>>>>>>>>
>>>>>>>> To fix this:
>>>>>>>> - on the master, do a puppet cert revoke for all hosts (http://docs.puppetlabs.com/man/cert.html)
>>>>>>>> - you can do a cert list to get all signed or non-signed hosts
>>>>>>>>
>>>>>>>> On all hosts, delete the following dirs (if they exist):
>>>>>>>> - /etc/puppet/ssl
>>>>>>>> - /etc/puppet/[master|agent]/ssl/
>>>>>>>> - /var/lib/puppet/ssl/
>>>>>>>>
>>>>>>>> After doing the above, re-install the ambari agent.
>>>>>>>>
>>>>>>>> On the ambari master, stop the master. Run the following command:
>>>>>>>>
>>>>>>>> puppet master --no-daemonize --debug
>>>>>>>>
>>>>>>>> The above runs in the foreground. The reason to run this is to make sure the cert for the master is recreated, as we deleted it earlier.
>>>>>>>>
>>>>>>>> Now, kill the above process running in the foreground and do a service ambari start to bring up the UI.
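Condensed into commands, the cleanup above might run as follows (a sketch; the hostname is illustrative, and the rm should only touch dirs that actually exist):

    # On the master: list certs, then revoke one per agent host:
    puppet cert list --all
    puppet cert revoke vbaby2.cloud.eb        # repeat for each agent host

    # On all hosts: remove stale ssl state:
    rm -rf /etc/puppet/ssl /etc/puppet/master/ssl /etc/puppet/agent/ssl /var/lib/puppet/ssl

    # Re-install the ambari-agent rpm on the agents; then, on the master:
    puppet master --no-daemonize --debug      # recreates the master cert; Ctrl-C when done
    service ambari start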
>>>>>>>> You should be able to bootstrap from this point on.
>>>>>>>>
>>>>>>>> Would you mind filing a jira and mentioning all the various issues you have come across and how you solved them? We can use that to create an FAQ for other users.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> -- Hitesh
>>>>>>>>
>>>>>>>> On Aug 14, 2012, at 1:55 AM, xu peng wrote:
>>>>>>>>
>>>>>>>>> Hi Hitesh:
>>>>>>>>>
>>>>>>>>> Thanks a lot for your reply.
>>>>>>>>>
>>>>>>>>> 1. I did a puppet kick --ping to the clients from my ambari master; all five nodes failed with the same log:
>>>>>>>>> (Triggering vbaby2.cloud.eb
>>>>>>>>> Host vbaby2.cloud.eb failed: certificate verify failed. This is often because the time is out of sync on the server or client
>>>>>>>>> vbaby2.cloud.eb finished with exit code 2)
>>>>>>>>>
>>>>>>>>> I manually ran "service ambari-agent start"; is that necessary? How can I fix this problem?
>>>>>>>>>
>>>>>>>>> 2. As you suggested, I ran the yum command manually and found that the installation was missing a dependency - php-gd. I had to update my yum repo.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 14, 2012 at 1:01 AM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>> Based on your deploy error log:
>>>>>>>>>>
>>>>>>>>>> "3": {
>>>>>>>>>>   "nodeReport": {
>>>>>>>>>>     "PUPPET_KICK_FAILED": [],
>>>>>>>>>>     "PUPPET_OPERATION_FAILED": [
>>>>>>>>>>       "vbaby3.cloud.eb",
>>>>>>>>>>       "vbaby5.cloud.eb",
>>>>>>>>>>       "vbaby4.cloud.eb",
>>>>>>>>>>       "vbaby2.cloud.eb",
>>>>>>>>>>       "vbaby6.cloud.eb",
>>>>>>>>>>       "vbaby1.cloud.eb"
>>>>>>>>>>     ],
>>>>>>>>>>     "PUPPET_OPERATION_TIMEDOUT": [
>>>>>>>>>>       "vbaby5.cloud.eb",
>>>>>>>>>>       "vbaby4.cloud.eb",
>>>>>>>>>>       "vbaby2.cloud.eb",
>>>>>>>>>>       "vbaby6.cloud.eb",
>>>>>>>>>>       "vbaby1.cloud.eb"
>>>>>>>>>>     ],
>>>>>>>>>>
>>>>>>>>>> 5 nodes timed out, which means the puppet agent is not running on them or they cannot communicate with the master. Try doing a puppet kick --ping to them from the master.
>>>>>>>>>>
>>>>>>>>>> The one which failed, failed at
>>>>>>>>>>
>>>>>>>>>> "\"Mon Aug 13 11:54:17 +0800 2012 /Stage[1]/Hdp::Pre_install_pkgs/Hdp::Exec[yum install $pre_installed_pkgs]/Exec[yum install $pre_installed_pkgs]/returns (err): change from notrun to 0 failed: yum install -y hadoop hadoop-libhdfs hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hadoop hadoop-libhdfs hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hdp_mon_dashboard ganglia-gmond-3.2.0 gweb hdp_mon_ganglia_addons snappy snappy-devel returned 1 instead of one of [0] at /etc/puppet/agent/modules/hdp/manifests/init.pp:265\"",
>>>>>>>>>>
>>>>>>>>>> It seems like the yum install failed on that host. Try running the command manually and see what the error is.
>>>>>>>>>>
>>>>>>>>>> -- Hitesh
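To see the underlying yum error, the package set from the log above can be re-run by hand on the failing host (duplicates dropped):

    # Re-run the failing install step manually to get yum's actual error:
    yum install -y hadoop hadoop-libhdfs hadoop-native hadoop-pipes \
        hadoop-sbin hadoop-lzo hdp_mon_dashboard ganglia-gmond-3.2.0 \
        gweb hdp_mon_ganglia_addons snappy snappy-devel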
>>>>>>>>>>
>>>>>>>>>> On Aug 13, 2012, at 2:28 AM, xu peng wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>>
>>>>>>>>>>> It's me again.
>>>>>>>>>>>
>>>>>>>>>>> Following your advice, I reinstalled the ambari server. But deploying the cluster and uninstalling the cluster failed again, and I really don't know why.
>>>>>>>>>>>
>>>>>>>>>>> I attached an archive which contains the logs of all the nodes in my cluster (/var/log/puppet_*.log, /var/log/puppet/*.log, /var/log/yum.log, /var/log/hmc/hmc.log). vbaby3.cloud.eb is the ambari server. Please refer to it.
>>>>>>>>>>>
>>>>>>>>>>> Attachments DeployError and UninstallError are the logs shown by the ambari web UI on failure, and attachment DeployingDetails.jpg shows the deploy details of my cluster. Please refer to them.
>>>>>>>>>>>
>>>>>>>>>>> Thanks again for your patience! I look forward to your reply.
>>>>>>>>>>>
>>>>>>>>>>> Xupeng
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Aug 11, 2012 at 10:56 PM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>> For uninstall failures, you will need to do a couple of things. Depending on where the uninstall failed, you may have to manually do a killall java on all the nodes to kill any missed processes. If you want to start with a completely clean install, you should also delete the hadoop dir in the mount points you selected during the previous install, so that the fresh install does not hit errors when it tries to re-format hdfs.
>>>>>>>>>>>>
>>>>>>>>>>>> After that, simply uninstall and re-install the ambari rpm, and that should allow you to re-create a fresh cluster.
>>>>>>>>>>>>
>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>
>>>>>>>>>>>> On Aug 11, 2012, at 2:34 AM, xu peng wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks a lot for your reply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I solved this problem; it was a silly mistake. Someone had changed the owner of the "/" dir, and according to the error log, pdsh needs root to proceed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> After changing the owner of "/" back to root, the problem was solved. Thank you again for your reply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have another question. I had an uninstall failure, and there is no button on the website for me to roll back, so I don't know what to do about that. What should I do now to reinstall hadoop?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 10:55 PM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Currently, the ambari installer requires everything to be run as root. It does not detect that the user is not root and use sudo, either on the master or on the agent nodes. Furthermore, it seems like it is failing when trying to use pdsh to make remote calls to the host list that you passed in, due to the errors mentioned in your script. This could be due to how it was installed, but I am not sure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you switch to root and run any simple command on all hosts using pdsh? If you want to reference exactly how ambari uses pdsh, you can look into /usr/share/hmc/php/frontend/commandUtils.php
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>> -- Hitesh
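A minimal version of that pdsh check (assuming pdsh's ssh rcmd module and the hostnames used elsewhere in this thread):

    su -                                        # the installer expects root
    pdsh -R ssh -w vbaby[2-6].cloud.eb uptime   # any simple command will do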
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 9, 2012, at 9:04 PM, xu peng wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> According to the error log, is there something wrong with my account?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I installed all the dependency modules and ambari as the user "ambari" instead of root. I added user "ambari" to /etc/sudoers with no password required.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 11:49 AM, xu peng <[email protected]> wrote:
>>>>>>>>>>>>>>>> There is no 100.log file in the /var/log/hmc dir, only a 55.log file (55 is the biggest log number).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The content of 55.log is:
>>>>>>>>>>>>>>>> pdsh@vbaby1: module path "/usr/lib64/pdsh" insecure.
>>>>>>>>>>>>>>>> pdsh@vbaby1: "/": Owner not root, current uid, or pdsh executable owner
>>>>>>>>>>>>>>>> pdsh@vbaby1: Couldn't load any pdsh modules
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks ~
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 11:36 AM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>>>>>>> Sorry - my mistake. The last txn mentioned is 100, so please look for the 100.log file.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Aug 9, 2012, at 8:34 PM, Hitesh Shah wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks - will take a look and get back to you.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Could you also look at /var/log/hmc/hmc.txn.55.log and see if there are any errors in it?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -- Hitesh.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Aug 9, 2012, at 8:00 PM, xu peng wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks a lot for your reply. I have tried all your suggestions on my ambari server, and the results are as below.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 1. I can confirm that the hosts.txt file is empty after I fail at the step of finding reachable nodes.
>>>>>>>>>>>>>>>>>>> 2. I tried making the hostdetails file on win7 and on redhat; both failed. (Please see the attachment for my hostdetails file.)
>>>>>>>>>>>>>>>>>>> 3. I removed the logging re-direct and ran the .sh script. The script seems to work well; it printed the hostname to the console and generated a file (content is "0") in the same dir. (Please see the attachment for the result and my .sh script.)
>>>>>>>>>>>>>>>>>>> 4. I attached the hmc.log and error_log too. Hope this helps ~
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks ~
>>>>>>>>>>>>>>>>>>> Xupeng
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 12:24 AM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>> Xupeng, can you confirm that the hosts.txt file at /var/run/hmc/clusters/EBHadoop/hosts.txt is empty?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Also, can you ensure that the hostdetails file that you upload does not have any special characters that may be creating problems for the parsing layer?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In the same dir, there should be an ssh.sh script. Can you create a copy of it, edit it to remove the logging re-directs to files, and run the script manually from the command line (it takes a hostname as the argument)? The output of that should show you what is going wrong.
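One concrete form of the special-characters check (the filename is illustrative): a hostdetails file saved on Windows carries CRLF line endings, which can leave the parsed host list empty.

    file hostdetail.txt        # reports "with CRLF line terminators" for a win7-saved file
    cat -A hostdetail.txt      # a trailing ^M on each line also means CRLF
    tr -d '\r' < hostdetail.txt > hostdetail.unix.txt   # strip them before re-uploading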
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Also, please look at /var/log/hmc/hmc.log and httpd/error_log to see if there are any errors being logged which may shed more light on the issue.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Aug 9, 2012, at 9:11 AM, Artem Ervits wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Which file are you supplying in the step? Hostdetail.txt or hosts?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> From: xupeng.bupt [mailto:[email protected]]
>>>>>>>>>>>>>>>>>>>>> Sent: Thursday, August 09, 2012 11:33 AM
>>>>>>>>>>>>>>>>>>>>> To: ambari-user
>>>>>>>>>>>>>>>>>>>>> Subject: Re: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you for your reply ~
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I made only one hostdetail.txt file, which contains the names of all the servers, and I submitted this file on the website, but I still have the same problem: I failed at the step of finding reachable nodes.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The error log is: "
>>>>>>>>>>>>>>>>>>>>> [ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]: Encountered total failure in transaction 100 while running cmd: /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop root 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>>>>>>>>>>> "
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> And my hostdetail.txt file is: "
>>>>>>>>>>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>>>>>>>>>>> "
>>>>>>>>>>>>>>>>>>>>> Thank you very much ~
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 2012-08-09
>>>>>>>>>>>>>>>>>>>>> xupeng.bupt
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> From: Artem Ervits
>>>>>>>>>>>>>>>>>>>>> Sent: 2012-08-09 22:16:53
>>>>>>>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>>>>>>>> Cc:
>>>>>>>>>>>>>>>>>>>>> Subject: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The installer requires a hosts file, which I believe you called hostdetail. Make sure it's the same file. You also mention a hosts.txt and a host.txt. You only need one file with the names of all the servers.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>>> From: xu peng [mailto:[email protected]]
>>>>>>>>>>>>>>>>>>>>> Sent: Thursday, August 09, 2012 2:02 AM
>>>>>>>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>>>>>>>> Subject: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi everyone:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I am trying to use ambari to set up a hadoop cluster, but I have hit a problem at step 2. I already set up password-less ssh, and I created a hostdetail.txt file.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The problem is that the file "/var/run/hmc/clusters/EBHadoop/hosts.txt" is empty, no matter how many times I submit the host.txt file on the website, and I really don't know why.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Here is the log file: [2012:08:09 05:17:56][ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]: Encountered total failure in transaction 100 while running cmd: /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop root 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> And my host.txt is like this (vbaby1.cloud.eb is the master node):
>>>>>>>>>>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can anyone help me and tell me what I am doing wrong? Thank you very much ~!
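For reference, the password-less ssh mentioned above is typically set up from the installer host roughly like this (a sketch; hostnames taken from the list above):

    ssh-keygen -t rsa          # accept the defaults, empty passphrase
    for h in vbaby2 vbaby3 vbaby4 vbaby5 vbaby6; do
        ssh-copy-id root@$h.cloud.eb
    done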
>>>>>>>>>>>>>>>>>>> <hmcLog.txt><hostdetails.txt><httpdLog.txt><ssh1.sh><ssh1_result.jpg>
>>>>>>>>>>> <DeployError1_2012.8.13.txt><log.rar><DeployingDetails.jpg><UninstallError1_2012.8.13.txt>
>>>>>>> <deployError2012.8.17.txt>
> <gangliaStartError.txt><4.jpg>
