Sandeep,

Can you please share which Hadoop version you are using, and the size of the cluster in terms of fsimage size or file/block count? Also, what threshold is set for RPC latency?
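In case it helps with collecting those numbers, here is a rough sketch that reads the file/block counts and the RPC timings from the NameNode's JMX servlet. The host name is a placeholder, 50070 is only the default NameNode HTTP port on Hadoop 2.x, and the helper names `fetch_jmx` and `summarize_jmx` are made up for this example, so please adjust it to your setup.

```python
import json
from urllib.request import urlopen

# Hypothetical host; 50070 is the default NameNode HTTP port on Hadoop 2.x.
JMX_URL = "http://standby-nn.example.com:50070/jmx"

FSNAMESYSTEM_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"
RPC_BEAN_PREFIX = "Hadoop:service=NameNode,name=RpcActivityForPort"


def fetch_jmx(url=JMX_URL, timeout=10):
    """Download the NameNode JMX dump and return it as a dict."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_jmx(payload):
    """Pick namespace size and RPC timing metrics out of a JMX dump."""
    summary = {}
    for bean in payload.get("beans", []):
        name = bean.get("name", "")
        if name == FSNAMESYSTEM_BEAN:
            # Namespace size: total files/directories and total blocks.
            summary["files_total"] = bean.get("FilesTotal")
            summary["blocks_total"] = bean.get("BlocksTotal")
        elif name.startswith(RPC_BEAN_PREFIX):
            # One bean per RPC server port, e.g. ...RpcActivityForPort8020.
            port = name[len(RPC_BEAN_PREFIX):]
            summary["rpc_" + port] = {
                "queue_time_avg_ms": bean.get("RpcQueueTimeAvgTime"),
                "processing_time_avg_ms": bean.get("RpcProcessingTimeAvgTime"),
            }
    return summary
```

Calling `summarize_jmx(fetch_jmx())` against the standby NN should return a small dict with the counts and the average RPC queue and processing times. Running `hadoop version` on the same node gives the exact Hadoop version.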
It is unlikely that the standby NN would show RPC latency unless a checkpoint is in progress. Checkpointing is done by the standby NN, which acquires the FSNamesystem write lock for the duration of the process. Other NN operations coming from the DNs, such as heartbeat processing, incremental block reports, and full block reports, are therefore blocked during this time. This might be what you are seeing in your cluster. If the fsimage is large enough (on the order of a few GB), checkpointing can take more than a minute. If you are using Hadoop 2.6.0, you might be encountering this situation. This was fixed in hadoop-2.7.0 <https://issues.apache.org/jira/browse/HDFS-7097>.

Thanks,
Chackra

On Mon, Jul 18, 2016 at 8:35 AM, sandeep vura <sandeepv...@gmail.com> wrote:

> Hi Team,
>
> We are getting rpc latency alerts from the standby namenode. What does it
> means? Where to check the logs for the root cause?
>
> I have already checked standby namenode logs but didn't find any specific
> error.
>
> Regards,
> Sandeep.v