[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-20 Thread Andrew Wang (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-6089:

   Resolution: Fixed
Fix Version/s: 2.4.0
       Status: Resolved  (was: Patch Available)

Committed this back through 2.4, thanks again Jing for the patch and Arpit for 
the report.

> Standby NN while transitioning to active throws a connection refused error 
> when the prior active NN process is suspended
>
>              Key: HDFS-6089
>              URL: https://issues.apache.org/jira/browse/HDFS-6089
>          Project: Hadoop HDFS
>       Issue Type: Bug
>       Components: ha
> Affects Versions: 2.4.0
>         Reporter: Arpit Gupta
>         Assignee: Jing Zhao
>          Fix For: 2.4.0
>
>      Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch, HDFS-6089.002.patch
>
>
> The following scenario was tested:
> * Determine Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
> active.
> What was noticed is that sometimes the call to get the service state of nn2 got 
> a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-19 Thread Jing Zhao (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-6089:


Attachment: HDFS-6089.002.patch

Patch that adds an RPC timeout for the rollEditLog call. I set the default 
timeout to 20s.
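
As a rough illustration of the idea above (not the code in HDFS-6089.002.patch), the sketch below bounds a blocking call with a deadline, so that a peer that accepts the request but never answers, presumably what a suspended active NN does, cannot stall the caller indefinitely. The class RollTimeoutSketch, the helper callWithTimeout, and the constant DEFAULT_TIMEOUT_MS are hypothetical names. Only the 20s default comes from the comment above. The patch is described as an RPC-level timeout, whereas this sketch wraps the call in a Future to stay self-contained.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: none of these names come from the HDFS code base.
final class RollTimeoutSketch {

    // Mirrors the 20s default mentioned in the comment; the real config key is not shown here.
    static final long DEFAULT_TIMEOUT_MS = 20_000L;

    /**
     * Runs the given call on a helper thread and waits at most timeoutMs for its result.
     * Throws TimeoutException if the call has not completed by then.
     */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        // Daemon thread so a call that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the helper thread; the caller is free to move on.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A call that answers promptly returns its result.
        System.out.println(callWithTimeout(() -> "rolled", DEFAULT_TIMEOUT_MS));

        // A call that does not answer in time is abandoned (short timeout for the demo).
        try {
            callWithTimeout(() -> { Thread.sleep(5_000); return "late"; }, 100);
        } catch (TimeoutException e) {
            System.out.println("gave up waiting");
        }
    }
}
```

The design point is that the caller decides how long it is willing to wait, instead of depending on the remote side to respond or to close the connection.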






[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-12 Thread Jing Zhao (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-6089:


Attachment: HDFS-6089.001.patch

Fix unit tests.






[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Jing Zhao (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-6089:


Attachment: HDFS-6089.000.patch

Simple patch to remove the edit log roll from the SBN (standby NameNode).






[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Jing Zhao (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-6089:


Status: Patch Available  (was: Open)






[jira] [Updated] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

2014-03-11 Thread Arpit Gupta (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Gupta updated HDFS-6089:

Description: 
The following scenario was tested:

* Determine Active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
active.


What was noticed is that sometimes the call to get the service state of nn2 
got a socket timeout exception.

  was:
The following scenario was tested:

* Determine Active NN and suspend the process (kill -19)
* Wait about 60s to let the standby transition to active
* Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
active.


What was noticed is that sometimes the call to get the service state of nn2 
got a socket timeout connection.




