[ https://issues.apache.org/jira/browse/MAPREDUCE-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj K resolved MAPREDUCE-2647.
----------------------------------
    Resolution: Won't Fix

Closing it as Won't Fix as there is no active feature development happening in MRv1.

> Memory sharing across all the Tasks in the Task Tracker to improve the job performance
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2647
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2647
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: tasktracker
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>
> If all the tasks (maps/reduces) work with the same additional data to
> execute the map/reduce task, each task has to load the data into memory
> individually and read it. Every task repeats the same work. Instead of each
> task loading the data, the data can be loaded into main memory once and used
> to execute all the tasks.
> h5.Proposed Solution:
> 1. Provide a mechanism to load the data into shared memory and to read that
> data from main memory.
> 2. We can provide a Java API, which internally uses a native implementation
> to read the data from memory. All the maps/reducers can use this API for
> reading the data from main memory.
> h5.Example:
> Suppose that in a map task an IP address is the key and the task needs to get
> the location of the IP address from a local file. In this case each map task
> has to load the file into main memory, read from it, and close it. Opening,
> reading, and processing the file takes time in every task. Instead of this,
> we can load the file into the task tracker's memory and each task can read
> from that memory directly.
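
For illustration only, here is a minimal sketch of how the IP-to-location example might look from the task side. It is not the API proposed in this issue, which was never implemented. Instead of a native shared-memory implementation it assumes a read-only memory-mapped file: when several task JVMs on one node map the same file, the operating system can back the mappings with the same physical pages, so the data is loaded once per node. The class name SharedIpTable, the 8-byte record layout, and the use of -1 for "not found" are all hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper, not part of Hadoop: a read-only lookup table backed by a memory-mapped file. */
public class SharedIpTable {
    // Assumed record layout: 4-byte IPv4 address followed by a 4-byte non-negative
    // location id, big-endian, with records sorted by address as an unsigned value.
    private static final int RECORD_SIZE = 8;

    private final MappedByteBuffer buf;
    private final int records;

    private SharedIpTable(MappedByteBuffer buf) {
        this.buf = buf;
        this.records = buf.capacity() / RECORD_SIZE;
    }

    /** Maps the whole file read-only. The mapping remains valid after the channel is closed. */
    public static SharedIpTable open(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return new SharedIpTable(ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size()));
        }
    }

    /** Binary search over the mapped records; returns the location id, or -1 if the address is absent. */
    public int lookup(int ip) {
        int lo = 0;
        int hi = records - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int key = buf.getInt(mid * RECORD_SIZE);   // absolute read, no copy onto the heap
            int cmp = Integer.compareUnsigned(key, ip);
            if (cmp == 0) {
                return buf.getInt(mid * RECORD_SIZE + 4);
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }

    /** Packs a dotted-quad IPv4 string such as "10.0.0.1" into an int. */
    public static int parseIp(String s) {
        String[] p = s.split("\\.");
        return (Integer.parseInt(p[0]) << 24)
             | (Integer.parseInt(p[1]) << 16)
             | (Integer.parseInt(p[2]) << 8)
             |  Integer.parseInt(p[3]);
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny two-record table in a temp file, standing in for the node-local data file.
        Path file = Files.createTempFile("ip-locations", ".bin");
        ByteBuffer data = ByteBuffer.allocate(2 * RECORD_SIZE);
        data.putInt(parseIp("10.0.0.1")).putInt(7);
        data.putInt(parseIp("192.168.1.5")).putInt(42);
        Files.write(file, data.array());

        SharedIpTable table = SharedIpTable.open(file);
        System.out.println(table.lookup(parseIp("192.168.1.5")));
        System.out.println(table.lookup(parseIp("8.8.8.8")));
    }
}
```

A task would call SharedIpTable.open once during setup and lookup once per input key. Because lookups read directly from the mapping, no task builds its own in-heap copy of the table, which is the saving the issue is after.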