bharathv commented on a change in pull request #2130: URL: https://github.com/apache/hbase/pull/2130#discussion_r461737913
##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/MasterRegistry.java
##########
@@ -170,6 +214,11 @@ public static String getMasterAddr(Configuration conf) throws UnknownHostExcepti
       callable.call(controller, stub, resp -> {
         if (controller.failed()) {
           future.completeExceptionally(controller.getFailed());
+          // RPC has failed, trigger a refresh of master end points. We can have some spurious
+          // refreshes, but that is okay since the RPC is not expensive and not in a hot path.
+          synchronized (refreshMasters) {
+            refreshMasters.notify();

Review comment:
> I meant to say if refresh thread misses this notify because it is already done waiting on refreshMasters, for the next loop, it should not again wait 5 min on refreshMasters and rather quickly perform RPC call to populate masters.

I don't think that's needed. If the thread missed the notification, it is because it had just finished waiting and was already fetching the masters, so it is very unlikely that something new has been added or removed since. A change in the set of masters is typically a very rare event. It is probably less rare in K8s environments than in DC deployments, but even then I don't think things usually change for days, if not weeks.
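To make the trade-off concrete, here is a minimal sketch, assuming the refresh thread blocks in a timed `wait()` on the `refreshMasters` monitor between refreshes. The class `RefreshLoopSketch`, its `periodMs` parameter, the `onRpcFailure()` method, and the counter that stands in for the master-list RPC are all hypothetical names for illustration, not the actual `MasterRegistry` code.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a notify-driven refresh loop; not the HBase implementation. */
final class RefreshLoopSketch {
  // Monitor that the worker waits on; the name mirrors the diff, the rest is assumed.
  private final Object refreshMasters = new Object();
  private final AtomicInteger refreshCount = new AtomicInteger();
  private final long periodMs;
  private final Thread worker;

  RefreshLoopSketch(long periodMs) {
    this.periodMs = periodMs;
    this.worker = new Thread(this::loop, "refresh-sketch");
    this.worker.setDaemon(true);
  }

  void start() {
    worker.start();
  }

  private void loop() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        synchronized (refreshMasters) {
          // Wakes up on notify(), on timeout, or spuriously; every case leads to a refresh.
          refreshMasters.wait(periodMs);
        }
      } catch (InterruptedException e) {
        return; // stop() was called
      }
      // Stand-in for the RPC that re-fetches the master end points.
      refreshCount.incrementAndGet();
    }
  }

  /** Called from an RPC failure callback, as in the diff above. */
  void onRpcFailure() {
    synchronized (refreshMasters) {
      // If the worker is not waiting right now, this notification is simply lost.
      refreshMasters.notify();
    }
  }

  int refreshCount() {
    return refreshCount.get();
  }

  void stop() {
    worker.interrupt();
  }
}
```

In this sketch, a call to `onRpcFailure()` that arrives while the worker is between waits is dropped, and the next refresh happens after `periodMs`. That is the case argued above to be acceptable, since the worker was refreshing at that moment anyway.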