NameNode High Availability in Cloudera 4.1.1

2013-09-19 Thread Pavan Kumar Polineni
Hi all,

Is *NameNode high availability & JobTracker high availability* available in
Cloudera 4.1.1?

If not, then what properties need to be changed in Cloudera 4.1.1 to make
the cluster highly available?
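
So far, the properties I could find in the Apache HA guide are roughly the
ones below (the nameservice ID and host names are placeholders, not from our
cluster); are these also the right hdfs-site.xml settings for Cloudera 4.1.1?

  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1-host:8020</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2-host:8020</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value></property>
  <property><name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>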

Please help with this. Thanks in advance.

-- 
 Pavan Kumar Polineni


Re: auto-failover does not work

2013-12-02 Thread Pavan Kumar Polineni
Please post your config files and describe which method you are following for
automatic failover.
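
For comparison, automatic failover normally needs the settings below on top
of a working manual-failover setup, plus a DFSZKFailoverController running on
each NameNode host. Also note the "Auth fail" in your log [1]: with sshfence,
the key named in dfs.ha.fencing.ssh.private-key-files must let the zkfc user
ssh to the other NameNode host without a password (hosts and the key path
below are placeholders):

  <!-- hdfs-site.xml -->
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/hadoop/.ssh/id_rsa</value></property>

  <!-- core-site.xml -->
  <property><name>ha.zookeeper.quorum</name><value>zk1:2181,zk2:2181,zk3:2181</value></property>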


On Mon, Dec 2, 2013 at 5:34 PM, YouPeng Yang wrote:

> Hi,
>   I'm testing HA auto-failover with hadoop-2.2.0.
>
>   The cluster can be failed over manually; however, automatic failover
> fails. I set up HA according to the URL
>
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
>
>   When I test automatic failover, I kill the active NN with kill -9, but
> the standby NameNode does not change to the active state.
>   The log from my DFSZKFailoverController is shown in [1] below.
>
>  Please help me; any suggestion will be appreciated.
>
>
> Regards.
>
>
> zkfc log [1]:
>
> 2013-12-02 19:49:28,588 INFO org.apache.hadoop.ha.NodeFencer: ==
> Beginning Service Fencing Process... ==
> 2013-12-02 19:49:28,588 INFO org.apache.hadoop.ha.NodeFencer: Trying
> method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
> 2013-12-02 19:49:28,590 INFO org.apache.hadoop.ha.SshFenceByTcpPort:
> Connecting to hadoop3...
> 2013-12-02 19:49:28,590 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Connecting to hadoop3 port 22
> 2013-12-02 19:49:28,592 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Connection established
> 2013-12-02 19:49:28,603 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Remote version string: SSH-2.0-OpenSSH_5.3
> 2013-12-02 19:49:28,603 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Local version string: SSH-2.0-JSCH-0.1.42
> 2013-12-02 19:49:28,603 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> CheckCiphers:
> aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256
> 2013-12-02 19:49:28,608 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> aes256-ctr is not available.
> 2013-12-02 19:49:28,608 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> aes192-ctr is not available.
> 2013-12-02 19:49:28,608 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> aes256-cbc is not available.
> 2013-12-02 19:49:28,608 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> aes192-cbc is not available.
> 2013-12-02 19:49:28,609 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> arcfour256 is not available.
> 2013-12-02 19:49:28,610 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> SSH_MSG_KEXINIT sent
> 2013-12-02 19:49:28,610 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> SSH_MSG_KEXINIT received
> 2013-12-02 19:49:28,610 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> kex: server->client aes128-ctr hmac-md5 none
> 2013-12-02 19:49:28,610 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> kex: client->server aes128-ctr hmac-md5 none
> 2013-12-02 19:49:28,617 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> SSH_MSG_KEXDH_INIT sent
> 2013-12-02 19:49:28,617 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> expecting SSH_MSG_KEXDH_REPLY
> 2013-12-02 19:49:28,634 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> ssh_rsa_verify: signature true
> 2013-12-02 19:49:28,635 WARN org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Permanently added 'hadoop3' (RSA) to the list of known hosts.
> 2013-12-02 19:49:28,635 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> SSH_MSG_NEWKEYS sent
> 2013-12-02 19:49:28,635 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> SSH_MSG_NEWKEYS received
> 2013-12-02 19:49:28,636 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> SSH_MSG_SERVICE_REQUEST sent
> 2013-12-02 19:49:28,637 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> SSH_MSG_SERVICE_ACCEPT received
> 2013-12-02 19:49:28,638 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Authentications that can continue:
> gssapi-with-mic,publickey,keyboard-interactive,password
> 2013-12-02 19:49:28,639 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Next authentication method: gssapi-with-mic
> 2013-12-02 19:49:28,642 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Authentications that can continue: publickey,keyboard-interactive,password
> 2013-12-02 19:49:28,642 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Next authentication method: publickey
> 2013-12-02 19:49:28,644 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
> Disconnecting from hadoop3 port 22
> 2013-12-02 19:49:28,644 WARN org.apache.hadoop.ha.SshFenceByTcpPort:
> Unable to connect to hadoop3 as user hadoop
> com.jcraft.jsch.JSchException: Auth fail
> at com.jcraft.jsch.Session.connect(Session.java:452)
> at
> org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
> at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
> at
> org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:521)
> at
> org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
> at
> org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverCo

How to fail or crash the NameNode for testing purposes

2013-06-18 Thread Pavan Kumar Polineni
I want to test NameNode crashes and failures, since the NameNode is a single
point of failure.

-- 
 Pavan Kumar Polineni


Re: How to fail or crash the NameNode for testing purposes

2013-06-18 Thread Pavan Kumar Polineni
I am checking this on Cloudera only, and without HA; we have just a single
NameNode. This is for testing purposes and preventive action: preparing
expected failure scenarios and solutions for them.


On Wed, Jun 19, 2013 at 12:14 PM, Nitin Pawar wrote:

> Are you testing it for HA?
> Which version of Hadoop are you using?
>
> Can you explain your test scenario in detail?
>
>
> On Wed, Jun 19, 2013 at 12:08 PM, Pavan Kumar Polineni <
> smartsunny...@gmail.com> wrote:
>
>> I want to test NameNode crashes and failures, since the NameNode is a
>> single point of failure.
>>
>> --
>>  Pavan Kumar Polineni
>>
>
>
>
> --
> Nitin Pawar
>



-- 
 Pavan Kumar Polineni


Re: How to fail or crash the NameNode for testing purposes

2013-06-18 Thread Pavan Kumar Polineni
I am using Hadoop-1. I don't want HA.


On Wed, Jun 19, 2013 at 12:20 PM, Azuryy Yu  wrote:

> Hey Pavan,
> Hadoop-2.x has HDFS HA. Which Hadoop version are you using?
>
>
>
>
> On Wed, Jun 19, 2013 at 2:46 PM, Pavan Kumar Polineni <
> smartsunny...@gmail.com> wrote:
>
>> I am checking this on Cloudera only, and without HA; we have just a single
>> NameNode. This is for testing purposes and preventive action: preparing
>> expected failure scenarios and solutions for them.
>>
>>
>> On Wed, Jun 19, 2013 at 12:14 PM, Nitin Pawar wrote:
>>
>>> Are you testing it for HA?
>>> Which version of Hadoop are you using?
>>>
>>> Can you explain your test scenario in detail?
>>>
>>>
>>>  On Wed, Jun 19, 2013 at 12:08 PM, Pavan Kumar Polineni <
>>> smartsunny...@gmail.com> wrote:
>>>
>>>> I want to test NameNode crashes and failures, since the NameNode is a
>>>> single point of failure.
>>>>
>>>> --
>>>>  Pavan Kumar Polineni
>>>>
>>>
>>>
>>>
>>> --
>>> Nitin Pawar
>>>
>>
>>
>>
>> --
>>  Pavan Kumar Polineni
>>
>
>


-- 
 Pavan Kumar Polineni


Re: How to fail or crash the NameNode for testing purposes

2013-06-19 Thread Pavan Kumar Polineni
Hi Manoj,

If we power off the host, then the Secondary NameNode also goes down. I want
to simulate a NameNode failure and then work through the recovery; all of
this is part of testing NameNode recovery.
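
To be concrete, what I plan to run on the NameNode host is roughly the sketch
below (it assumes jps is on the PATH and the stock hadoop-daemon.sh script is
available):

  # hard crash: find the NameNode JVM and kill it without warning
  NN_PID=`jps | grep -w NameNode | awk '{print $1}'`
  kill -9 $NN_PID

  # or, for comparison, a clean shutdown
  hadoop-daemon.sh stop namenode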



On Wed, Jun 19, 2013 at 12:32 PM, Manoj S wrote:

> You can stop the NameNode daemon or even power the host off :)
>
>
> On Wed, Jun 19, 2013 at 12:27 PM, Pavan Kumar Polineni <
> smartsunny...@gmail.com> wrote:
>
>> I am using Hadoop-1. I don't want HA.
>>
>>
>> On Wed, Jun 19, 2013 at 12:20 PM, Azuryy Yu  wrote:
>>
>>> Hey Pavan,
>>> Hadoop-2.x has HDFS HA. Which Hadoop version are you using?
>>>
>>>
>>>
>>>
>>> On Wed, Jun 19, 2013 at 2:46 PM, Pavan Kumar Polineni <
>>> smartsunny...@gmail.com> wrote:
>>>
>>>> I am checking this on Cloudera only, and without HA; we have just a
>>>> single NameNode. This is for testing purposes and preventive action:
>>>> preparing expected failure scenarios and solutions for them.
>>>>
>>>>
>>>> On Wed, Jun 19, 2013 at 12:14 PM, Nitin Pawar 
>>>> wrote:
>>>>
>>>>> Are you testing it for HA?
>>>>> Which version of Hadoop are you using?
>>>>>
>>>>> Can you explain your test scenario in detail?
>>>>>
>>>>>
>>>>>  On Wed, Jun 19, 2013 at 12:08 PM, Pavan Kumar Polineni <
>>>>> smartsunny...@gmail.com> wrote:
>>>>>
>>>>>> I want to test NameNode crashes and failures, since the NameNode is a
>>>>>> single point of failure.
>>>>>>
>>>>>> --
>>>>>>  Pavan Kumar Polineni
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Nitin Pawar
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>  Pavan Kumar Polineni
>>>>
>>>
>>>
>>
>>
>> --
>>  Pavan Kumar Polineni
>>
>
>


-- 
 Pavan Kumar Polineni


Re: How to fail or crash the NameNode for testing purposes

2013-06-19 Thread Pavan Kumar Polineni
Hi Nitin,

That's right, mate; you've understood me correctly. And if you have some test
scenarios to share, those are welcome too :).
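
For the record, the Hadoop-1 recovery path I plan to test is roughly the
sketch below (it assumes dfs.name.dir on the replacement NameNode host is
empty and fs.checkpoint.dir points at a copy of the secondary's latest
checkpoint):

  # stop the secondary so its checkpoint directory is stable, then copy it over
  hadoop-daemon.sh stop secondarynamenode

  # rebuild the fsimage in dfs.name.dir from the checkpoint; this also starts the NN
  hadoop namenode -importCheckpoint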

Thanks in advance.


On Wed, Jun 19, 2013 at 1:01 PM, Nitin Pawar wrote:

> If I understand correctly,
>
> you want to crash the NameNode manually and recover it using the Secondary
> NameNode? That is, you want to know both how to crash the NameNode and how
> to recover it.
>
> Is that correct?
>
>
> On Wed, Jun 19, 2013 at 12:34 PM, Pavan Kumar Polineni <
> smartsunny...@gmail.com> wrote:
>
>> Hi Manoj,
>>
>> If we power off the host, then the Secondary NameNode also goes down. I
>> want to simulate a NameNode failure and then work through the recovery;
>> all of this is part of testing NameNode recovery.
>>
>>
>>
>> On Wed, Jun 19, 2013 at 12:32 PM, Manoj S wrote:
>>
>>> You can stop the NameNode daemon or even power the host off :)
>>>
>>>
>>> On Wed, Jun 19, 2013 at 12:27 PM, Pavan Kumar Polineni <
>>> smartsunny...@gmail.com> wrote:
>>>
>>>> I am using Hadoop-1. I don't want HA.
>>>>
>>>>
>>>> On Wed, Jun 19, 2013 at 12:20 PM, Azuryy Yu  wrote:
>>>>
>>>>> Hey Pavan,
>>>>> Hadoop-2.x has HDFS HA. Which Hadoop version are you using?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 19, 2013 at 2:46 PM, Pavan Kumar Polineni <
>>>>> smartsunny...@gmail.com> wrote:
>>>>>
>>>>>> I am checking this on Cloudera only, and without HA; we have just a
>>>>>> single NameNode. This is for testing purposes and preventive action:
>>>>>> preparing expected failure scenarios and solutions for them.
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 19, 2013 at 12:14 PM, Nitin Pawar <
>>>>>> nitinpawar...@gmail.com> wrote:
>>>>>>
>>>>>>> Are you testing it for HA?
>>>>>>> Which version of Hadoop are you using?
>>>>>>>
>>>>>>> Can you explain your test scenario in detail?
>>>>>>>
>>>>>>>
>>>>>>>  On Wed, Jun 19, 2013 at 12:08 PM, Pavan Kumar Polineni <
>>>>>>> smartsunny...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I want to test NameNode crashes and failures, since the NameNode is
>>>>>>>> a single point of failure.
>>>>>>>>
>>>>>>>> --
>>>>>>>>  Pavan Kumar Polineni
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Nitin Pawar
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>  Pavan Kumar Polineni
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>  Pavan Kumar Polineni
>>>>
>>>
>>>
>>
>>
>> --
>>  Pavan Kumar Polineni
>>
>
>
>
> --
> Nitin Pawar
>



-- 
 Pavan Kumar Polineni


MapReduce job not running - I think I have all the correct configuration.

2013-06-23 Thread Pavan Kumar Polineni
Hi all,

First, I had a single machine with all the daemons running on it. After that
I added two DataNodes; in that setup, MR jobs worked fine.

Then I changed the first machine to be just the NameNode by stopping all the
daemons except the NN daemon, and converted one DataNode to run SNN, JT, DN,
and TT; all of these are working. I kept the other DataNode as it was.

I changed the configurations to link up the NN and JT.
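
Concretely, the linking changes were along these lines (host names and ports
below are placeholders for my actual machines):

  <!-- core-site.xml, on every node -->
  <property><name>fs.default.name</name><value>hdfs://nn-host:9000</value></property>

  <!-- mapred-site.xml, on every node -->
  <property><name>mapred.job.tracker</name><value>jt-host:9001</value></property>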

From that point, when I try to run an MR job, it does not run.

Please help me. Thanks.

-- 
 Pavan Kumar Polineni


Re: MapReduce job not running - I think I have all the correct configuration.

2013-06-23 Thread Pavan Kumar Polineni
Hi Ravi,

After checking the configs: in one mapred-site.xml I had kept a replication
factor of 1 instead of 2. After changing this I restarted all the daemons,
but the problem still exists. Can you come to GTalk so I can explain more?
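
For reference, the setting I changed is the one below. (I have since read
that dfs.replication normally belongs in hdfs-site.xml rather than
mapred-site.xml, so its placement may itself be part of my problem.)

  <!-- hdfs-site.xml -->
  <property><name>dfs.replication</name><value>2</value></property>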
Thanks.


On Sun, Jun 23, 2013 at 7:04 PM, Ravi Prakash  wrote:

> Hi Pavan,
>
> I assure you this configuration works. The problem is very likely in your
> configuration files. Please look them over once again. Also did you restart
> your daemons after changing the configuration? Some configurations
> necessarily require a restart.
>
> Ravi.
>
>
>   ------
>  *From:* Pavan Kumar Polineni 
> *To:* user@hadoop.apache.org
> *Sent:* Sunday, June 23, 2013 6:20 AM
> *Subject:* MapReduce job not running - I think I have all the correct
> configuration.
>
>
> Hi all,
>
> First, I had a single machine with all the daemons running on it. After
> that I added two DataNodes; in that setup, MR jobs worked fine.
>
> Then I changed the first machine to be just the NameNode by stopping all
> the daemons except the NN daemon, and converted one DataNode to run SNN,
> JT, DN, and TT; all of these are working. I kept the other DataNode as it
> was.
>
> I changed the configurations to link up the NN and JT.
>
> From that point, when I try to run an MR job, it does not run.
>
> Please help me. Thanks.
>
> --
>  Pavan Kumar Polineni
>
>
>


-- 
 Pavan Kumar Polineni


Re: MapReduce job not running - I think I have all the correct configuration.

2013-06-23 Thread Pavan Kumar Polineni
Hi all,

I solved it; it was due to a cluster ID problem. Thanks for the support.
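
In case it helps anyone else: on Hadoop-1 the mismatch shows up as different
namespaceID values in the VERSION files under the NameNode's name directory
and each DataNode's data directory. Comparing them is a quick check (the
paths below are placeholders; the real ones come from dfs.name.dir and
dfs.data.dir in hdfs-site.xml):

  cat /path/to/dfs/name/current/VERSION   # on the NameNode
  cat /path/to/dfs/data/current/VERSION   # on each DataNode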


On Mon, Jun 24, 2013 at 11:39 AM, Azuryy Yu  wrote:

> Can you paste some error logs here? You can find them on the JT or TT, and
> tell us the Hadoop version.
>
>
> On Sun, Jun 23, 2013 at 9:20 PM, Pavan Kumar Polineni <
> smartsunny...@gmail.com> wrote:
>
>>
>> Hi all,
>>
>> First, I had a single machine with all the daemons running on it. After
>> that I added two DataNodes; in that setup, MR jobs worked fine.
>>
>> Then I changed the first machine to be just the NameNode by stopping all
>> the daemons except the NN daemon, and converted one DataNode to run SNN,
>> JT, DN, and TT; all of these are working. I kept the other DataNode as it
>> was.
>>
>> I changed the configurations to link up the NN and JT.
>>
>> From that point, when I try to run an MR job, it does not run.
>>
>> Please help me. Thanks.
>>
>> --
>>  Pavan Kumar Polineni
>>
>
>


-- 
 Pavan Kumar Polineni


Re: how to find processes under a node

2013-08-29 Thread Pavan Kumar Polineni
Hi Suneel,

Please provide more details, such as what you want to print and which files
you are using within the script, so that I can help. Maybe something is
wrong in your script; I would like to check it from my end and help you with
this case.
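
As a starting point, the ssh + jps approach Shekhar mentions below can be as
simple as this sketch (it assumes passwordless ssh to each node and one
hostname per line in nodefilename.txt):

  #!/bin/ksh
  # print the java processes running on each node in the list;
  # ssh -n keeps ssh from swallowing the rest of the node list on stdin
  while read node; do
      echo "== $node =="
      ssh -n "$node" jps
  done < /users/hadoop/unixtest/nodefilename.txt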


On Thu, Aug 29, 2013 at 1:10 PM, Shekhar Sharma wrote:

> Are you trying to find the Java processes under a node? Then the simple
> thing would be to do ssh and run the jps command to get the list of Java
> processes.
> Regards,
> Som Shekhar Sharma
> +91-8197243810
>
>
> On Thu, Aug 29, 2013 at 12:27 PM, suneel hadoop
>  wrote:
> > Hi All,
> >
> > What I am trying to do here is capture which process is running under
> > which node.
> >
> > This is the Unix script I tried:
> >
> >
> > #!/bin/ksh
> >
> >
> > Cnt=`cat /users/hadoop/unixtest/nodefilename.txt | wc -l`
> > cd /users/hadoop/unixtest/
> > ls -ltr | awk '{print $9}' > list_of_scripts.txt
> > split -l $Cnt list_of_scripts.txt node_scripts
> > ls -ltr node_scripts* | awk '{print $9}' > list_of_node_scripts.txt
> > for i in nodefilename.txt
> > do
> > for j in list_of_node_scripts.txt
> > do
> > node=$i
> > script_file=$j
> > cat $node\n $script_file >> $script_file
> > done
> > done
> >
> >
> > exit 0;
> >
> >
> >
> > But my result should look like the following:
> >
> > node1      node2
> > --------   --------
> > process1   process3
> > process2   process4
> >
> >
> > Can someone please help with this?
> > Thanks in advance.
>



-- 
 Pavan Kumar Polineni