Jason Lowe created MAPREDUCE-6219:
-------------------------------------
Summary: Reduce memory required for FileInputFormat located status optimization
Key: MAPREDUCE-6219
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6219
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Priority: Minor
MAPREDUCE-1981 introduced an optimization that drastically reduces the number
of namenode operations required to compute input splits when processing a
directory. However, the optimization needs more memory, because it retains the
full LocatedFileStatus object for every input file while the splits are being
computed. This can lead to an odd situation for users: passing a directory as
the input can run the job client out of heap space, while passing directory/*
as the input spec lets the same job run within the original heap space.
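
To make the trade-off concrete, here is a rough sketch. It is not the
FileInputFormat code. LocatedStatus, Split, fakeLookup and the 128-byte block
size are all hypothetical stand-ins, and the sketch assumes the difference
between the two input specs is only whether every located status is held at
once or each one is dropped after its splits are computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; none of these are Hadoop classes.
final class SplitMemorySketch {
    static final long BLOCK_SIZE = 128; // assumed toy block size, in bytes

    /** Models a status object that carries one host per block of the file. */
    static final class LocatedStatus {
        final String path;
        final long length;
        final List<String> blockHosts;

        LocatedStatus(String path, long length, List<String> blockHosts) {
            this.path = path;
            this.length = length;
            this.blockHosts = blockHosts;
        }
    }

    /** Models a split: only the fields needed after split computation. */
    static final class Split {
        final String path;
        final long start;
        final long length;
        final String host;

        Split(String path, long start, long length, String host) {
            this.path = path;
            this.start = start;
            this.length = length;
            this.host = host;
        }

        @Override
        public String toString() {
            return path + ":" + start + "+" + length + "@" + host;
        }
    }

    /** Hypothetical lookup: every file is 300 bytes spread over three hosts. */
    static LocatedStatus fakeLookup(String path) {
        return new LocatedStatus(path, 300, Arrays.asList("h1", "h2", "h3"));
    }

    /** Emits one split per block of the given file. */
    static void addSplits(LocatedStatus s, List<Split> out) {
        long offset = 0;
        for (String host : s.blockHosts) {
            long len = Math.min(BLOCK_SIZE, s.length - offset);
            out.add(new Split(s.path, offset, len, host));
            offset += len;
        }
    }

    /** Eager: every located status stays reachable until all splits are built. */
    static List<Split> eagerSplits(List<String> paths,
                                   Function<String, LocatedStatus> lookup) {
        List<LocatedStatus> retained = new ArrayList<>();
        for (String p : paths) {
            retained.add(lookup.apply(p));
        }
        List<Split> splits = new ArrayList<>();
        for (LocatedStatus s : retained) {
            addSplits(s, splits);
        }
        return splits;
    }

    /** Lazy: each located status becomes garbage as soon as its splits exist. */
    static List<Split> lazySplits(List<String> paths,
                                  Function<String, LocatedStatus> lookup) {
        List<Split> splits = new ArrayList<>();
        for (String p : paths) {
            addSplits(lookup.apply(p), splits);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/in/a", "/in/b");
        System.out.println(eagerSplits(paths, SplitMemorySketch::fakeLookup));
        System.out.println(lazySplits(paths, SplitMemorySketch::fakeLookup));
    }
}
```

Both methods return the same splits. The eager variant keeps one LocatedStatus
per input file alive at the same time, while the lazy variant keeps only one.
In a real cluster the lazy variant would pay for that with one lookup per
file, which is the namenode load MAPREDUCE-1981 set out to reduce, so any fix
has to balance the two costs.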