Cache HAR filesystem metadata
-----------------------------

                 Key: MAPREDUCE-2459
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2459
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: harchive
            Reporter: Mac Yang
            Assignee: Mac Yang


Each HAR file system has two index files that contain information on how files 
are stored in the part files. During block location calculation, these 
indexes are reread for every file in the archive. Caching the indexes and the 
status of the part files should greatly reduce the number of NameNode operations 
during job setup.
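
As a rough illustration of the idea (not the actual patch), the sketch below
memoizes per-archive metadata so that the expensive load runs once per archive
rather than once per file. The class name HarMetadataCache, the loader
function, and the loadCount() helper are all hypothetical; a real change would
load the parsed indexes and part-file statuses through the Hadoop FileSystem
API instead of the placeholder loader used here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical sketch: caches archive metadata keyed by archive URI.
 * M stands in for whatever holds the parsed indexes and part-file statuses.
 */
final class HarMetadataCache<M> {
    private final Map<String, M> cache = new ConcurrentHashMap<>();
    private final Function<String, M> loader;   // assumed: reads the index files
    private final AtomicInteger loads = new AtomicInteger();

    HarMetadataCache(Function<String, M> loader) {
        this.loader = loader;
    }

    /** Returns cached metadata, invoking the loader only on the first request. */
    M get(String archiveUri) {
        // The loader must return a non-null value for the entry to be cached.
        return cache.computeIfAbsent(archiveUri, uri -> {
            loads.incrementAndGet();
            return loader.apply(uri);
        });
    }

    /** Number of times the loader actually ran (a stand-in for NameNode round trips). */
    int loadCount() {
        return loads.get();
    }
}
```

With a cache like this, computing block locations for a thousand files in the
same archive would trigger a single metadata load instead of a thousand. The
sketch never invalidates entries, which assumes the archive does not change
for the lifetime of the cache.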

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
