Steve Loughran created MAPREDUCE-6800:
-----------------------------------------
Summary: FileInputFormat.singleThreadedListStatus to use listFiles(recursive)
Key: MAPREDUCE-6800
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6800
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 2.7.3
Reporter: Steve Loughran
Priority: Minor

{{FileInputFormat.singleThreadedListStatus}} does recursive directory walks to pick files to scan. This is very inefficient on object stores, and can be bypassed if {{listFiles(recursive=true)}} can be used instead.

Based on the experience of SPARK-2984, it should also be resilient to a source file going away during the iteration, downgrading an FNFE ({{FileNotFoundException}}) to a "skip that nonexistent path".
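
For illustration only, here is a minimal sketch of the "skip that nonexistent path" behaviour described above. It is not the proposed patch. To stay self-contained it uses a hypothetical stand-in interface, {{RemoteIter}}, shaped like Hadoop's {{RemoteIterator}}, and a hypothetical test double, {{fakeListing}}. It also assumes the iterator has already moved past an entry when {{next()}} throws; whether a given filesystem's iterator behaves that way would need checking.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ResilientListing {

    /** Hypothetical stand-in shaped like Hadoop's RemoteIterator: both calls may throw IOException. */
    public interface RemoteIter<T> {
        boolean hasNext() throws IOException;
        T next() throws IOException;
    }

    /**
     * Drain a flat recursive listing, skipping entries that vanish mid-iteration.
     * Assumption: the iterator has already advanced past an entry when next() throws.
     */
    public static List<String> collect(RemoteIter<String> listing) throws IOException {
        List<String> files = new ArrayList<>();
        while (listing.hasNext()) {
            try {
                files.add(listing.next());
            } catch (FileNotFoundException e) {
                // The source path was deleted during the scan: skip it instead of failing the job.
            }
        }
        return files;
    }

    /** Hypothetical test double: yields the given paths, throwing FileNotFoundException for those in 'missing'. */
    public static RemoteIter<String> fakeListing(List<String> paths, List<String> missing) {
        Iterator<String> it = paths.iterator();
        return new RemoteIter<String>() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public String next() throws IOException {
                String p = it.next(); // advance first, so a failure cannot loop forever
                if (missing.contains(p)) {
                    throw new FileNotFoundException(p);
                }
                return p;
            }
        };
    }

    public static void main(String[] args) throws IOException {
        List<String> paths = Arrays.asList("/in/a.csv", "/in/sub/b.csv", "/in/sub/c.csv");
        List<String> gone = Arrays.asList("/in/sub/b.csv");
        System.out.println(collect(fakeListing(paths, gone))); // prints [/in/a.csv, /in/sub/c.csv]
    }
}
```

In Hadoop itself the listing would presumably come from {{FileSystem.listFiles(path, true)}}, which returns a {{RemoteIterator<LocatedFileStatus>}}, so a single flat iteration replaces the per-directory walk.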