[ https://issues.apache.org/jira/browse/HDFS-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716886#comment-15716886 ]
Mingliang Liu commented on HDFS-11094:
--------------------------------------

For protocol changes, end-to-end tests are very helpful. Starting a mini DFS cluster is not very expensive; I can usually start and shut down an empty mini cluster in 3 to 5 seconds on my dev machine.

The first heartbeat will bypass the large interval, so:
1) Choosing {{HAServiceStateProto}} instead of {{HAServiceStateProto}} makes sense, as {{lastActiveClaimTxId}} will be updated in a timely manner, and we can save the complexity of updating it in this patch;
2) Unfortunately, the current methods (e.g. setting a large {{DFS_HEARTBEAT_INTERVAL_KEY}} config value, or {{DataNode#setHeartbeatsDisabledForTests()}}) do not work without changes for testing this patch.

I can accept that the existing tests in the patch are somewhat adequate, so this will not block the progress of this patch.

Thanks,

> Send back HAState along with NamespaceInfo during a versionRequest as an optional parameter
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11094
>                 URL: https://issues.apache.org/jira/browse/HDFS-11094
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>         Attachments: HDFS-11094.001.patch, HDFS-11094.002.patch, HDFS-11094.003.patch, HDFS-11094.004.patch, HDFS-11094.005.patch, HDFS-11094.006.patch, HDFS-11094.007.patch, HDFS-11094.008.patch, HDFS-11094.009.patch
>
>
> The datanode should know which NN is active when it is connecting/registering to the NN. Currently, it only figures this out during its first (and subsequent) heartbeat(s), so there is a period of time where the datanode is alive and registered but can't actually do anything because it doesn't know which NN is active. A byproduct of this is that the MiniDFSCluster will become active before it knows which NN is active, which can lead to NPEs when calling getActiveNN().
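The gap described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions: {{ToyNameNode}}, {{ToyDataNode}}, {{HaState}} and the {{includeHaState}} flag are hypothetical names invented for illustration and are not the HDFS classes or the code in the patch. It shows a datanode that learns the active NN only at its first heartbeat versus one that receives the HA state as an optional field in the registration-time reply.

```java
import java.util.Optional;

public class ActiveNnDiscoverySketch {
    enum HaState { ACTIVE, STANDBY }

    /** Toy NameNode; a hypothetical stand-in, not the real HDFS class. */
    static final class ToyNameNode {
        final String name;
        final HaState state;

        ToyNameNode(String name, HaState state) {
            this.name = name;
            this.state = state;
        }

        /** Registration-time reply; the HA state is an optional extra field. */
        Optional<HaState> versionRequest(boolean includeHaState) {
            return includeHaState ? Optional.of(state) : Optional.empty();
        }

        /** In this model the heartbeat reply always carries the HA state. */
        HaState heartbeat() {
            return state;
        }
    }

    /** Toy DataNode; a hypothetical stand-in, not the real HDFS class. */
    static final class ToyDataNode {
        private ToyNameNode active; // null until the active NN is known

        void register(ToyNameNode nn, boolean includeHaState) {
            // Record the NN as active only if the reply says so.
            nn.versionRequest(includeHaState)
              .filter(s -> s == HaState.ACTIVE)
              .ifPresent(s -> active = nn);
        }

        void heartbeat(ToyNameNode nn) {
            if (nn.heartbeat() == HaState.ACTIVE) {
                active = nn;
            }
        }

        Optional<String> activeNn() {
            return Optional.ofNullable(active).map(nn -> nn.name);
        }
    }

    public static void main(String[] args) {
        ToyNameNode nn1 = new ToyNameNode("nn1", HaState.ACTIVE);
        ToyNameNode nn2 = new ToyNameNode("nn2", HaState.STANDBY);

        // Without the optional field: registered, but active NN still unknown.
        ToyDataNode without = new ToyDataNode();
        without.register(nn1, false);
        without.register(nn2, false);
        System.out.println("no HA state, after registration: " + without.activeNn());
        without.heartbeat(nn1);
        System.out.println("no HA state, after first heartbeat: " + without.activeNn());

        // With the optional field: active NN known as soon as registration ends.
        ToyDataNode with = new ToyDataNode();
        with.register(nn1, true);
        with.register(nn2, true);
        System.out.println("with HA state, after registration: " + with.activeNn());
    }
}
```

In this model, a caller that asks for the active NN between registration and the first heartbeat gets an empty result in the first case, which mirrors the window in which getActiveNN() can fail in the MiniDFSCluster.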