[ https://issues.apache.org/jira/browse/HBASE-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861035#comment-13861035 ]
Lars Hofhansl commented on HBASE-10271:
---------------------------------------

Could we:
# create ephemeral znode (as we do now)
# report the RS for duty
# update the ephemeral znode with the name as seen by the Master
?

That way we'd have the ephemeral node in place in case the RS dies and as soon as we know the proper hostname we update that node.

> [regression] Cannot use the wildcard address since HBASE-9593
> -------------------------------------------------------------
>
>                 Key: HBASE-10271
>                 URL: https://issues.apache.org/jira/browse/HBASE-10271
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0, 0.94.13, 0.96.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.98.0, 0.94.16
>
>
> HBASE-9593 moved the creation of the ephemeral znode earlier in the region
> server startup process such that we don't have access to the ServerName from
> the Master's POV. HRS.getMyEphemeralNodePath() calls HRS.getServerName()
> which at that point will return this.isa.getHostName(). If you set
> hbase.regionserver.ipc.address to 0.0.0.0, you will create a znode with that
> address.
> What happens next is that the RS will report for duty correctly but the
> master will do this:
> {noformat}
> 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:60000] master.ServerManager: Registering server=0:0:0:0:0:0:0:0%0,60020,1388691892014
> 2014-01-02 11:45:49,498 INFO [master:172.21.3.117:60000] master.HMaster: Registered server found up in zk but who has not yet reported in: 0:0:0:0:0:0:0:0%0,60020,1388691892014
> {noformat}
> The cluster is then unusable.
> I think a better solution is to track the heartbeats for the region servers
> and expire those that haven't checked-in for some time. The 0.89-fb branch
> has this concept, and they also use it to detect rack failures:
> https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java#L1224.
> In this jira's scope I would just add the heartbeat tracking and add a unit
> test for the wildcard address.
> What do you think [~rajesh23]?
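
For illustration only, the heartbeat-tracking idea from the quoted description (remember when each region server last checked in, and expire those that have been silent too long) could look roughly like the sketch below. Everything in it is an assumption made for the example: HeartbeatTracker, recordHeartbeat, expireStale and isTracked are hypothetical names, not HBase classes or methods, and this is not the 0.89-fb ServerManager code. Timestamps are passed in explicitly so the behavior is easy to unit test.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration; not part of HBase.
class HeartbeatTracker {
    // Maximum allowed silence, in milliseconds, before a server is expired.
    private final long timeoutMs;
    // Server name (as seen by the master) -> timestamp of its last heartbeat.
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Called whenever a region server reports in.
    void recordHeartbeat(String serverName, long nowMs) {
        lastSeen.put(serverName, nowMs);
    }

    // Removes and returns every server whose last heartbeat is older than the timeout.
    List<String> expireStale(long nowMs) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > timeoutMs) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }

    // True if the server has reported in and has not been expired.
    boolean isTracked(String serverName) {
        return lastSeen.containsKey(serverName);
    }
}
```

In a sketch like this the tracker is keyed by whatever name the server reports to the master, so expiry does not depend on the znode name. How the master would act on an expired server is left out here.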