[ https://issues.apache.org/jira/browse/HDFS-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang resolved HDFS-16947.
------------------------------------
    Resolution: Fixed

> RBF NamenodeHeartbeatService to report error for not being able to register namenode in state store
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16947
>                 URL: https://issues.apache.org/jira/browse/HDFS-16947
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>
> The namenode heartbeat service should report an error with the full stack trace if it
> cannot register the namenode in the state store. As of today, it only logs an INFO
> message.
> For the ZooKeeper-based implementation, this can mean that either a) the curator manager
> is not initialized, or b) the write to the znode failed after exhausting retries. In
> either case, an INFO log alone might not be good enough, and we might have to look for
> errors elsewhere.
>
> Sample of the current output:
> {code:java}
> 2023-02-20 23:10:33,714 DEBUG [NamenodeHeartbeatService {ns} nn0-0] router.NamenodeHeartbeatService - Received service state: ACTIVE from HA namenode: {ns}-nn0:nn-0-{ns}.{cluster}:9000
> 2023-02-20 23:10:33,731 INFO [NamenodeHeartbeatService {ns} nn0-0] impl.MembershipStoreImpl - Inserting new NN registration: nn-0.namenode.{cluster}:8888->{ns}:nn0:nn-0-{ns}.{cluster}:9000-ACTIVE
> 2023-02-20 23:10:33,731 INFO [NamenodeHeartbeatService {ns} nn0-0] router.NamenodeHeartbeatService - Cannot register namenode in the State Store
> {code}
> If we log the full stack trace instead, the cause is visible:
> {code:java}
> 2023-02-21 00:20:24,691 ERROR [NamenodeHeartbeatService {ns} nn0-0] router.NamenodeHeartbeatService - Cannot register namenode in the State Store
> org.apache.hadoop.hdfs.server.federation.store.StateStoreUnavailableException: State Store driver StateStoreZooKeeperImpl in nn-0.namenode.{cluster} is not ready.
>         at org.apache.hadoop.hdfs.server.federation.store.driver.StateStoreDriver.verifyDriverReady(StateStoreDriver.java:158)
>         at org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.putAll(StateStoreZooKeeperImpl.java:235)
>         at org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreBaseImpl.put(StateStoreBaseImpl.java:74)
>         at org.apache.hadoop.hdfs.server.federation.store.impl.MembershipStoreImpl.namenodeHeartbeat(MembershipStoreImpl.java:179)
>         at org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver.registerNamenode(MembershipNamenodeResolver.java:381)
>         at org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:317)
>         at org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.lambda$periodicInvoke$0(NamenodeHeartbeatService.java:244)
>         ...
>         ...
> {code}
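
For illustration only, the difference the issue describes can be sketched as below. This is not the committed patch. It uses java.util.logging so that it runs without extra dependencies, whereas the router code uses its own logging setup; the names HeartbeatLoggingSketch, Registrar, CapturingHandler and capture are hypothetical and do not exist in Hadoop. The point is only that passing the exception to the logger at error level keeps the stack trace, which a bare message does not.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class HeartbeatLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(HeartbeatLoggingSketch.class.getName());

  /** Hypothetical stand-in for the call that writes the registration to the State Store. */
  public interface Registrar {
    boolean register() throws IOException;
  }

  /** Logs a failed registration at error level and keeps the exception with the record. */
  public static void updateState(Registrar registrar) {
    try {
      if (!registrar.register()) {
        LOG.log(Level.SEVERE, "Cannot register namenode in the State Store");
      }
    } catch (IOException e) {
      // Passing the Throwable is what lets the logger print the full stack trace.
      LOG.log(Level.SEVERE, "Cannot register namenode in the State Store", e);
    }
  }

  /** Collects log records in memory so the logged level and cause can be inspected. */
  public static class CapturingHandler extends Handler {
    public final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      records.add(record);
    }

    @Override
    public void flush() {
    }

    @Override
    public void close() {
    }
  }

  /** Runs updateState with a temporary handler attached and returns what was logged. */
  public static List<LogRecord> capture(Registrar registrar) {
    CapturingHandler handler = new CapturingHandler();
    LOG.addHandler(handler);
    try {
      updateState(registrar);
    } finally {
      LOG.removeHandler(handler);
    }
    return handler.records;
  }

  public static void main(String[] args) {
    // Simulate a State Store that is not ready (message text is made up for the demo).
    List<LogRecord> records = capture(() -> {
      throw new IOException("State Store driver is not ready");
    });
    LogRecord r = records.get(0);
    System.out.println(r.getLevel() + " " + r.getMessage() + " cause=" + r.getThrown());
  }
}
```

With an SLF4J-style logger the equivalent would presumably be an error call that takes the exception as its last argument; that mapping is an assumption here, not something stated in the issue.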