[ https://issues.apache.org/jira/browse/HDFS-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723272#comment-16723272 ]
Chen Liang edited comment on HDFS-12943 at 12/17/18 10:45 PM:
--------------------------------------------------------------

Hi [~brahmareddy], thanks for testing! The timeout issue seems interesting. To start with, some performance degradation is expected *from the CLI*, because the CLI initiates a new DFSClient for every command, and a fresh DFSClient has to fetch the status of the NameNodes each time. If the same DFSClient is reused, this is not an issue. I have never seen the second-call issue.

Here is output from our cluster (log output omitted), and I think you are right about lowering dfs.ha.tail-edits.period; we saw similar numbers here:

{code:java}
$ time hdfs --loglevel debug dfs -Ddfs.client.failover.proxy.provider.***=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /TestsORF1

real 0m2.254s
user 0m3.608s
sys 0m0.331s

$ time hdfs --loglevel debug dfs -Ddfs.client.failover.proxy.provider.***=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider -mkdir /TestsORF2

real 0m2.159s
user 0m3.855s
sys 0m0.330s
{code}

Curious: how many NameNodes did you have in the test, and were there any errors in the NN logs?
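(For readers following along: the dfs.ha.tail-edits.period tuning mentioned above is a NameNode-side setting controlling how often a standby/observer tails edits from the JournalNodes. A sketch of the relevant hdfs-site.xml fragment follows; the 0ms value and the companion in-progress-tailing flag reflect common guidance for this feature, not numbers confirmed in this thread.)

{code:xml}
<!-- hdfs-site.xml on the standby/observer NameNodes (illustrative values) -->
<property>
  <!-- How often the standby polls the JournalNodes for new edits.
       A small value keeps the observer's namespace close to the active's. -->
  <name>dfs.ha.tail-edits.period</name>
  <value>0ms</value>
</property>
<property>
  <!-- Allow tailing in-progress edit log segments, so edits are visible
       to the standby without waiting for a segment roll. -->
  <name>dfs.ha.tail-edits.in-progress</name>
  <value>true</value>
</property>
{code}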
> Consistent Reads from Standby Node
> ----------------------------------
>
>                 Key: HDFS-12943
>                 URL: https://issues.apache.org/jira/browse/HDFS-12943
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>            Reporter: Konstantin Shvachko
>            Priority: Major
>         Attachments: ConsistentReadsFromStandbyNode.pdf, ConsistentReadsFromStandbyNode.pdf, HDFS-12943-001.patch, HDFS-12943-002.patch, TestPlan-ConsistentReadsFromStandbyNode.pdf
>
> StandbyNode in HDFS is a replica of the active NameNode. The states of the NameNodes are coordinated via the journal. It is natural to consider StandbyNode as a read-only replica. As with any replicated distributed system the problem of stale reads should be resolved. Our main goal is to provide reads from standby in a consistent way in order to enable a wide range of existing applications running on top of HDFS.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org