[ 
https://issues.apache.org/jira/browse/HDFS-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703890#comment-15703890
 ] 

Mingliang Liu edited comment on HDFS-11094 at 11/29/16 2:18 AM:
----------------------------------------------------------------

Hi [~ebadger], thanks for updating the patch. Sorry for returning late from 
holiday. The patch looks good to me overall. I have two thoughts for your 
consideration.

# I discussed this offline with [~arpitagarwal], and he suggested that we reuse 
the logic in {{updateActorStatesFromHeartbeat}} to update the active NN 
{{bpServiceToActive}}, since it already handles several cases carefully. 
Moreover, if we are updating {{bpServiceToActive}}, we should likely also 
update {{lastActiveClaimTxId}}. To achieve this, I think we can pass 
{{NNHAStatusHeartbeatProto}} instead of {{HAServiceStateProto}} in 
{{NamespaceInfoProto}}.
# For the unit test, can we set a very large heartbeat interval in the 
configuration and check that the active NN is not null after 
{{cluster.waitForActive()}}? Mocked tests are useful as well and can be kept. 
Another idea is to drop heartbeat requests against a spied HeartbeatManager.
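
To make the first point concrete, here is a minimal, self-contained sketch of the kind of bookkeeping involved when an HA state and a transaction id arrive together. It is a simplified model and not the actual {{BPOfferService}} code: the class {{ActiveNnTracker}}, the method {{updateActorState}}, and the use of plain strings to identify actors are all hypothetical names chosen for illustration. The only ideas taken from the discussion are that a claim to be active is compared against {{lastActiveClaimTxId}}, and that {{bpServiceToActive}} and {{lastActiveClaimTxId}} are updated together.

```java
/**
 * Simplified, hypothetical model of the datanode-side bookkeeping.
 * Not the real Hadoop implementation; actors are plain strings here.
 */
class ActiveNnTracker {
    // Name of the actor believed to talk to the active NN; null if unknown.
    private String bpServiceToActive;
    // Highest txid seen so far in an accepted "I am active" claim.
    private long lastActiveClaimTxId = -1;

    /** Apply one HA status report (state plus txid) from the given actor. */
    synchronized void updateActorState(String actor, boolean nnClaimsActive, long txId) {
        boolean thinksActive = actor.equals(bpServiceToActive);
        boolean moreRecentClaim = txId > lastActiveClaimTxId;

        if (nnClaimsActive && !thinksActive) {
            if (!moreRecentClaim) {
                // Stale or conflicting claim (possible split-brain): ignore it.
                return;
            }
            bpServiceToActive = actor;
        } else if (!nnClaimsActive && thinksActive) {
            // The NN we believed to be active now reports otherwise.
            bpServiceToActive = null;
        }

        if (actor.equals(bpServiceToActive)) {
            // Keep the claim txid in step with the active actor; never move it backwards.
            lastActiveClaimTxId = Math.max(lastActiveClaimTxId, txId);
        }
    }

    synchronized String getActive() {
        return bpServiceToActive;
    }

    synchronized long getLastActiveClaimTxId() {
        return lastActiveClaimTxId;
    }

    public static void main(String[] args) {
        ActiveNnTracker tracker = new ActiveNnTracker();
        tracker.updateActorState("nn1", true, 100L); // e.g. state carried by a versionRequest reply
        tracker.updateActorState("nn2", true, 90L);  // older claim, ignored
        System.out.println(tracker.getActive() + " " + tracker.getLastActiveClaimTxId());
    }
}
```

In this model a state-only message (without a txid) could not apply the "more recent claim" check, which is the motivation for carrying the txid alongside the state.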


was (Author: liuml07):
Hi [~ebadger], thanks for updating the patch. Sorry for returning late from 
holiday. The patch looks good to me overall. I have two thoughts for your 
consideration.

# I discussed with [~arpitagarwal] and he suggested we consider using the same 
logic in {{updateActorStatesFromHeartbeat}} to update the active NN 
{{bpServiceToActive}}, which has dealt with several cases carefully. If we are 
updating {{bpServiceToActive}} we should likely also update 
{{lastActiveClaimTxId}}.

> Send back HAState along with NamespaceInfo during a versionRequest as an 
> optional parameter
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11094
>                 URL: https://issues.apache.org/jira/browse/HDFS-11094
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>         Attachments: HDFS-11094.001.patch, HDFS-11094.002.patch, 
> HDFS-11094.003.patch, HDFS-11094.004.patch, HDFS-11094.005.patch, 
> HDFS-11094.006.patch, HDFS-11094.007.patch, HDFS-11094.008.patch, 
> HDFS-11094.009.patch
>
>
> The datanode should know which NN is active when it is connecting/registering 
> to the NN. Currently, it only figures this out during its first (and 
> subsequent) heartbeat(s), so there is a period of time where the datanode 
> is alive and registered but can't actually do anything because it doesn't 
> know which NN is active. A byproduct of this is that the MiniDFSCluster will 
> become active before it knows which NN is active, which can lead to NPEs when 
> calling getActiveNN(). 



