[ https://issues.apache.org/jira/browse/HDFS-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ayush Saxena updated HDFS-14283:
--------------------------------
    Fix Version/s: 3.4.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> DFSInputStream to prefer cached replica
> ---------------------------------------
>
>                 Key: HDFS-14283
>                 URL: https://issues.apache.org/jira/browse/HDFS-14283
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>         Environment: HDFS Caching
>            Reporter: Wei-Chiu Chuang
>            Assignee: Lisheng Sun
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: HDFS-14283.001.patch, HDFS-14283.002.patch,
> HDFS-14283.003.patch, HDFS-14283.004.patch, HDFS-14283.005.patch,
> HDFS-14283.006.patch, HDFS-14283.007.patch, HDFS-14283.008.patch,
> HDFS-14283.009.patch
>
>
> HDFS Caching offers performance benefits. However, the NameNode currently
> does not give cached replicas higher priority, so HDFS caching is only
> useful when cache replication = 3, that is, when all replicas are cached in
> memory and a client cannot randomly pick an uncached replica.
> HDFS-6846 proposed letting the NameNode give higher priority to cached
> replicas. Changing NameNode logic is always tricky, so that proposal didn't
> get much traction. Here I propose a different approach: let the client
> (DFSInputStream) prefer cached replicas.
> A {{LocatedBlock}} object already contains the cached replica locations, so
> the client has the information it needs. I think we can change
> {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
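
The client-side selection idea in the issue description can be illustrated with a minimal sketch. It is not the committed patch and does not use Hadoop classes: datanodes are represented as plain strings, and `CachedReplicaChooser`, `chooseNode`, and the `deadNodes` parameter are hypothetical names introduced only for illustration. The assumption is that the client knows both the full replica list and the subset that is cached, tries the cached subset first, and falls back to the normal order otherwise.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the replica choice made by a DFS client.
// Not the real DFSInputStream#getBestNodeDNAddrPair() implementation.
public class CachedReplicaChooser {

    /**
     * Picks a datanode to read from, preferring cached replicas.
     *
     * @param locations       all replica locations, in the order the client received them
     * @param cachedLocations the subset of locations that hold the block in cache
     * @param deadNodes       nodes the client has already marked as unusable
     * @return the chosen node, or null if no usable replica remains
     */
    static String chooseNode(List<String> locations,
                             List<String> cachedLocations,
                             Set<String> deadNodes) {
        // First pass: any usable cached replica wins.
        for (String node : cachedLocations) {
            if (locations.contains(node) && !deadNodes.contains(node)) {
                return node;
            }
        }
        // Fallback: first usable replica in the original order.
        for (String node : locations) {
            if (!deadNodes.contains(node)) {
                return node;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> locations = Arrays.asList("dn1", "dn2", "dn3");
        List<String> cached = Collections.singletonList("dn3");

        // Only one of three replicas is cached; it is still chosen first.
        System.out.println(chooseNode(locations, cached, new HashSet<>()));

        // If the cached node is marked dead, fall back to the normal order.
        Set<String> dead = new HashSet<>(Collections.singletonList("dn3"));
        System.out.println(chooseNode(locations, cached, dead));
    }
}
```

With this ordering, a cache replication factor of 1 is enough for reads to reach the cached copy, which is the motivation stated in the description. A real client would also have to account for network distance and ignored or slow nodes, which this sketch leaves out.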