Your problem seems to center on available memory and over-subscription. If you're using a 0.20.x or 1.x version of Apache Hadoop, you probably want to use the CapacityScheduler to address this for you.
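With memory-based scheduling enabled, each job declares how much memory its tasks need, and a high-memory task simply occupies more than one scheduling slot instead of over-subscribing the node. A minimal sketch of the relevant mapred-site.xml properties in 1.x (the values below are illustrative; tune them to your cluster):

<!-- Switch the JobTracker to the CapacityScheduler -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>

<!-- Cluster-wide memory per map/reduce slot; a task asking for
     more than this consumes multiple slots -->
<property>
  <name>mapred.cluster.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapred.cluster.reduce.memory.mb</name>
  <value>1024</value>
</property>

<!-- Upper bound on what any single job may request per task -->
<property>
  <name>mapred.cluster.max.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapred.cluster.max.reduce.memory.mb</name>
  <value>4096</value>
</property>

The high-memory job then requests its 2GB per task via -Dmapred.job.reduce.memory.mb=2048 (and/or mapred.job.map.memory.mb); only that job pays the cost in extra slots, while the rest of the cluster runs unchanged.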
I once detailed the how-to on a similar question here: http://search-hadoop.com/m/gnFs91yIg1e

On Wed, May 22, 2013 at 2:55 PM, Steve Lewis <lordjoe2...@gmail.com> wrote:
> I have a series of Hadoop jobs to run - one of my jobs requires larger
> than standard memory. I allow the task to use 2GB of memory. When I run
> some of these jobs the slave nodes are crashing because they run out of
> swap space. It is not that a slave could not run one, or even 4, of these
> jobs, but 8 stresses the limits.
> I could cut mapred.tasktracker.reduce.tasks.maximum for the entire
> cluster, but this cripples the whole cluster for one of many jobs.
> It seems to be a very bad design
> a) to allow the job tracker to keep assigning tasks to a slave that is
> already getting low on memory
> b) to allow the user to run jobs capable of crashing nodes on the cluster
> c) not to allow the user to specify that some jobs need to be limited to
> a lower value without requiring this limit for every job.
>
> Are there plans to fix this?

--
Harsh J