[ https://issues.apache.org/jira/browse/MAPREDUCE-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Joseph Evans updated MAPREDUCE-2572:
-------------------------------------------

    Attachment: MR-2572-trunk-v1.patch

It looks like the LRU changes never made it into 0.22, so no patch will be submitted for that branch.

> Throttle the deletion of data from the distributed cache
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-2572
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2572
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distributed-cache
>    Affects Versions: 0.20.205.0
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>         Attachments: MR-2572-trunk-v1.patch, THROTTLING-security-v1.patch
>
>
> When deleting entries from the distributed cache we do so in a background
> thread. Once the size limit of the distributed cache is reached, all unused
> entries are deleted. MAPREDUCE-2494 changes this so that entries are deleted
> in LRU order until the usage falls below a given threshold. In either of
> these cases we are periodically flooding a disk with delete requests, which
> can slow down all IO operations to a drive. It would be better to be able to
> throttle this deletion so that it is spread out over a longer period of time.
> This jira is to add in this throttling.
> On investigating, it seems much simpler to backport MAPREDUCE-2494 to 20S
> before implementing this change rather than try to implement it without LRU
> deletion, because LRU goes a long way towards reducing the load on the disk
> anyway.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
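
For readers of the archive, the idea in the issue description (delete unused entries in LRU order until usage falls below a threshold, but spread the deletes out over time) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either attached patch: the names ThrottledCacheCleaner, Entry, maxDeletesPerBatch, and pauseMillis are hypothetical, and pausing between fixed-size batches is just one possible throttling strategy. The sketch also stops once usage is at or below the target, which may differ from the exact comparison Hadoop uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of throttled LRU deletion; not Hadoop's actual code. */
public class ThrottledCacheCleaner {

    /** Illustrative stand-in for one unused distributed-cache entry. */
    public static final class Entry {
        final String path;
        final long sizeBytes;
        final long lastUsedMillis;

        public Entry(String path, long sizeBytes, long lastUsedMillis) {
            this.path = path;
            this.sizeBytes = sizeBytes;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    private final long targetBytes;        // stop once usage is at or below this
    private final int maxDeletesPerBatch;  // assumed throttle knob
    private final long pauseMillis;        // assumed pause between batches

    public ThrottledCacheCleaner(long targetBytes, int maxDeletesPerBatch, long pauseMillis) {
        if (maxDeletesPerBatch < 1) {
            throw new IllegalArgumentException("maxDeletesPerBatch must be >= 1");
        }
        if (pauseMillis < 0) {
            throw new IllegalArgumentException("pauseMillis must be >= 0");
        }
        this.targetBytes = targetBytes;
        this.maxDeletesPerBatch = maxDeletesPerBatch;
        this.pauseMillis = pauseMillis;
    }

    /**
     * Deletes unused entries, least recently used first, until usage reaches
     * the target. After every maxDeletesPerBatch deletes the thread sleeps for
     * pauseMillis so the disk is not flooded with delete requests.
     *
     * @return the paths handed to the deleter, in deletion order
     */
    public List<String> clean(List<Entry> unused, long usedBytes, Consumer<String> deleter) {
        List<Entry> lru = new ArrayList<>(unused);
        lru.sort(Comparator.comparingLong((Entry e) -> e.lastUsedMillis));

        List<String> deleted = new ArrayList<>();
        int inBatch = 0;
        for (Entry e : lru) {
            if (usedBytes <= targetBytes) {
                break; // enough space has been reclaimed
            }
            if (inBatch == maxDeletesPerBatch) {
                try {
                    Thread.sleep(pauseMillis); // throttle: wait before the next batch
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop
                    break;
                }
                inBatch = 0;
            }
            deleter.accept(e.path);
            usedBytes -= e.sizeBytes;
            deleted.add(e.path);
            inBatch++;
        }
        return deleted;
    }
}
```

In a background cleanup thread, the deleter passed to clean would be whatever actually removes the localized files. Smaller batches or longer pauses spread the IO load over more time, at the cost of the cache staying over its limit for longer.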