Wei-Chiu Chuang created HDFS-14283:
--------------------------------------

             Summary: DFSInputStream to prefer cached replica
                 Key: HDFS-14283
                 URL: https://issues.apache.org/jira/browse/HDFS-14283
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 2.6.0
         Environment: HDFS Caching
            Reporter: Wei-Chiu Chuang

HDFS Caching offers performance benefits. However, the NameNode currently does not give cached replicas higher priority, so HDFS caching is only useful when cache replication = 3, that is to say, when all replicas are cached in memory and a client cannot randomly pick an uncached replica.

HDFS-6846 proposed letting the NameNode give higher priority to cached replicas. Changing NameNode logic is always tricky, so that proposal didn't get much traction.

Here I propose a different approach: let the client (DFSInputStream) prefer cached replicas. A {{LocatedBlock}} object already contains the cached replica locations, so the client has the information it needs. I think we can change {{DFSInputStream#getBestNodeDNAddrPair()}} for this purpose.
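
To make the proposal concrete, here is a minimal, self-contained sketch of the selection idea. It is not the actual DFSInputStream code: the class {{CachedReplicaPreference}} and the method {{preferCached}} are hypothetical names, and DataNodes are modeled as plain strings rather than {{DatanodeInfo}} objects. It assumes the client holds the block's replica locations in the order the NameNode returned them, the subset of those locations that are cached, and a set of nodes to skip (for example, nodes already marked dead).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the replica selection step; not Hadoop code.
public class CachedReplicaPreference {

    // Returns candidate DataNodes with cached replicas first.
    // Within each group the original (NameNode-provided) order is kept,
    // so any distance-based sorting done earlier is preserved.
    static List<String> preferCached(List<String> locations,
                                     Set<String> cachedLocations,
                                     Set<String> skippedNodes) {
        List<String> cached = new ArrayList<>();
        List<String> uncached = new ArrayList<>();
        for (String dn : locations) {
            if (skippedNodes.contains(dn)) {
                continue; // never offer a node the client already gave up on
            }
            if (cachedLocations.contains(dn)) {
                cached.add(dn);
            } else {
                uncached.add(dn);
            }
        }
        cached.addAll(uncached); // uncached replicas remain as fallbacks
        return cached;
    }

    public static void main(String[] args) {
        List<String> locations = Arrays.asList("dn1", "dn2", "dn3");
        Set<String> cachedOn = new HashSet<>(Arrays.asList("dn3"));
        Set<String> skipped = new HashSet<>();
        // The client would try the first entry of this list first.
        System.out.println(preferCached(locations, cachedOn, skipped));
    }
}
```

Keeping uncached replicas at the tail of the list means the client can still fall back to them if the cached replica is unreachable, so the change affects only the order of attempts, not availability.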