zhengchenyu commented on PR #5132: URL: https://github.com/apache/hadoop/pull/5132#issuecomment-1319904398
## 1 Background

After HDFS-13522, observer reads are supported on RBF. Currently, the client and the router transmit the state IDs of all nameservices. This has two disadvantages:

* It may carry a lot of unnecessary data.
* msync must invokeConcurrent on all nameservices.

> Note: In my experience, calls that invokeConcurrent on all nameservices harm the stability of dfsrouter. For example, in our production environment, after applying HDFS-16283, which avoids invoking renewLease on all nameservices, the call queue length of the router has stayed very low.

## 2 How to transmit state IDs according to the client's demand?

The problem is that the client cannot know the downstream nameservice. But in PoolAlignmentContext::updateRequestState the nameservice is known, so we can record it in the call. Here is an example:

* Initial state: the client holds no state IDs. The client makes call1 without any federated state ID.
* The router finds the downstream nameservice, then invokes PoolAlignmentContext::updateRequestState. Here we can record the nsid and state ID in call1.
* When the router returns the response to the client (RouterStateIdContext::updateResponseState), we can get the nsid and state ID from call1 and return them to the client.
* When the client receives the response from the router, it stores the state ID of ns1.
* When the client connects to dfsrouter again, it makes call2 carrying the state ID of ns1.

This means we transmit state IDs according to the client's demand.

## 3 How to msync?

In RouterClientProtocol::msync, we can get the nameservices from the current call. Then we can invoke msync on the downstream namenodes according to the client's demand.

## 4 How to handle the first msync?

There is a problem that I did not mention in section 2: how to handle the first msync? In the initial state, there are no federated state IDs on the client side, so dfsrouter will not invoke msync on any namenode.
On the following calls, for the first non-msync call to each downstream nameservice, RouterStateIdContext.getClientStateIdFromCurrentCall returns Long.MIN_VALUE, so that call goes to the active namenode. Subsequent read calls to that nameservice then go to the observer namenode.

> Note: In fact, I would prefer to do msync on the router side and not on the client side, just like https://issues.apache.org/jira/secure/attachment/13011000/HDFS-13522_WIP.patch. Then there is no problem with the first msync.

## 5 What about the unit test?

/ns0 is mounted on ns0 and /ns1 is mounted on ns1. fs0 can only transmit the ns0 state ID, and fs1 can only transmit the ns1 state ID. DFS_ROUTER_OBSERVER_FEDERATED_STATE_PROPAGATION_MAXSIZE is set to 1, so transmitting more than one state ID is not allowed.
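The flow described in sections 2 to 4 can be modelled in a few lines of standalone Java. This is only a sketch under stated assumptions: the `Router` and `Client` classes, their methods, the mount prefixes, and the state ID values are all hypothetical and are not Hadoop APIs. The only detail taken from the description above is that a missing client state ID is treated as Long.MIN_VALUE and routed to the active namenode.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of demand-driven state ID propagation; not Hadoop code.
class DemandDrivenStateIdSketch {

  /** Toy router: a mount table plus one state ID per nameservice. */
  static final class Router {
    private final Map<String, String> mountTable = new HashMap<>();
    private final Map<String, Long> namenodeStateIds = new HashMap<>();

    void mount(String prefix, String nsId, long stateId) {
      mountTable.put(prefix, nsId);
      namenodeStateIds.put(nsId, stateId);
    }

    /** Resolve the downstream nameservice; only the router can do this. */
    String resolve(String path) {
      for (Map.Entry<String, String> e : mountTable.entrySet()) {
        if (path.equals(e.getKey()) || path.startsWith(e.getKey() + "/")) {
          return e.getValue();
        }
      }
      throw new IllegalArgumentException("No mount point for " + path);
    }

    /** A missing state ID maps to Long.MIN_VALUE, which selects the active namenode. */
    String targetFor(String path, Map<String, Long> clientStateIds) {
      long id = clientStateIds.getOrDefault(resolve(path), Long.MIN_VALUE);
      return id == Long.MIN_VALUE ? "active" : "observer";
    }

    /** The response carries only the state ID of the nameservice used by this call. */
    Map<String, Long> responseStateFor(String path) {
      String nsId = resolve(path);
      Map<String, Long> response = new HashMap<>();
      response.put(nsId, namenodeStateIds.get(nsId));
      return response;
    }

    /** msync fans out only to nameservices the client already carries. */
    List<String> msyncTargets(Map<String, Long> clientStateIds) {
      return new ArrayList<>(clientStateIds.keySet());
    }
  }

  /** Toy client: stores only the state IDs it has received so far. */
  static final class Client {
    final Map<String, Long> stateIds = new TreeMap<>();

    String read(Router router, String path) {
      String target = router.targetFor(path, stateIds);
      stateIds.putAll(router.responseStateFor(path));
      return target;
    }
  }

  public static void main(String[] args) {
    Router router = new Router();
    router.mount("/ns0", "ns0", 100L);
    router.mount("/ns1", "ns1", 200L);
    Client client = new Client();

    System.out.println(router.msyncTargets(client.stateIds)); // []
    System.out.println(client.read(router, "/ns0/a"));        // active
    System.out.println(client.read(router, "/ns0/a"));        // observer
    System.out.println(router.msyncTargets(client.stateIds)); // [ns0]
    System.out.println(client.read(router, "/ns1/b"));        // active
  }
}
```

In this model the first msync reaches no namenode, and the first read on each nameservice goes to the active namenode, which is the behaviour section 4 describes. A router-side msync, as preferred in the note above, would remove that special case.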