[ https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973771#comment-16973771 ]
Siyao Meng commented on HDFS-14283:
-----------------------------------

[~leosun08] Code looks good. Would you add test cases in {{TestDFSInputStream}} to demonstrate:
1) when the client sets {{dfs.client.read.use.cache.priority}} to true, a cached block will be used;
2) when the config is set to false, it won't affect the current behavior of the client without the patch.
Thanks!

> DFSInputStream to prefer cached replica
> ---------------------------------------
>
>                 Key: HDFS-14283
>                 URL: https://issues.apache.org/jira/browse/HDFS-14283
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>   Affects Versions: 2.6.0
>         Environment: HDFS Caching
>            Reporter: Wei-Chiu Chuang
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch,
> HDFS-14283.003.patch, HDFS-14283.004.patch, HDFS-14283.005.patch
>
>
> HDFS Caching offers performance benefits. However, currently NameNode does
> not give cached replicas higher priority, so HDFS caching is only useful
> when cache replication = 3, that is to say, when all replicas are cached in
> memory, so that a client doesn't randomly pick an uncached replica.
> HDFS-6846 proposed to let NameNode give higher priority to cached replicas.
> Changing logic in NameNode is always tricky, so that didn't get much
> traction. Here I propose a different approach: let the client (DFSInputStream)
> prefer cached replicas.
> A {{LocatedBlock}} object already contains the cached replica locations, so a
> client has the needed information. I think we can change
> {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
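
For illustration only, the selection rule discussed above and the two behaviors the requested tests would cover can be sketched in plain Java. This is not the code from the attached patches: the class {{CachedReplicaSketch}}, the method {{chooseReplica}}, and the use of strings for datanodes are all hypothetical stand-ins, and the {{preferCached}} flag only models what {{dfs.client.read.use.cache.priority}} is described as doing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of replica selection; not Hadoop code.
public class CachedReplicaSketch {

    /**
     * Picks a datanode to read from.
     * When preferCached is true, the first usable node that holds a cached
     * replica wins; otherwise (or if no cached node is usable) the first
     * usable node in the original order is returned.
     * Returns null if every node is ignored.
     */
    public static String chooseReplica(List<String> locations,
                                       Set<String> cachedLocations,
                                       Set<String> ignoredNodes,
                                       boolean preferCached) {
        if (preferCached) {
            for (String node : locations) {
                if (cachedLocations.contains(node) && !ignoredNodes.contains(node)) {
                    return node;
                }
            }
        }
        // Fallback: same order-based choice as when the flag is off.
        for (String node : locations) {
            if (!ignoredNodes.contains(node)) {
                return node;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> locations = Arrays.asList("dn1", "dn2", "dn3");
        Set<String> cached = new HashSet<>(Arrays.asList("dn3"));
        Set<String> none = new HashSet<>();

        // Flag on: the cached replica is chosen.
        System.out.println(chooseReplica(locations, cached, none, true));   // dn3
        // Flag off: the choice ignores cache state.
        System.out.println(chooseReplica(locations, cached, none, false));  // dn1
        // Flag on but the cached node is ignored: fall back to the default.
        System.out.println(chooseReplica(locations, cached, cached, true)); // dn1
    }
}
```

The first two calls in {{main}} correspond to cases 1) and 2) in the comment; the third shows one possible fallback when the cached node cannot be used, which is an assumption of this sketch rather than something stated in the issue.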