RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Naveen Kumar Prasad
This functionality may not be readily available in Hadoop, but it would be appreciated if anyone could help me understand how to go about developing this feature. Regards, Naveen Kumar HUAWEI TECHNOLOGIES CO., LTD. Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen

Re: RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Ryan Rawson
The only thing you could do is have the tasktracker nice the child when it's exceeding its reservation. Aside from that, it's hard to limit CPU without killing the process.
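A minimal sketch of what that tasktracker-side hook could look like, shelling out to renice(1) on Linux. The class name, the PID plumbing, and the trigger for "exceeding its reservation" are all assumptions here, not anything Hadoop ships:

    import java.io.IOException;

    public class TaskNicer {
        // Lower the scheduling priority of a task's child JVM by shelling
        // out to renice(1). Higher niceness means less CPU weight.
        public static void reniceChild(int pid, int niceness)
                throws IOException, InterruptedException {
            Process p = new ProcessBuilder(
                    "renice", String.valueOf(niceness), "-p", String.valueOf(pid))
                    .start();
            if (p.waitFor() != 0) {
                throw new IOException("renice failed for pid " + pid);
            }
        }

        public static void main(String[] args) throws Exception {
            // Hypothetical trigger: monitoring decided pid 12345 is over
            // its CPU reservation, so push it to niceness 10.
            reniceChild(12345, 10);
        }
    }

Note that an unprivileged tasktracker can only make a process nicer (raise the value), never lower it again, so without root the adjustment is one-way.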

Re: RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Todd Lipcon
If you can require a recent kernel, you could use cgroups: http://broadcast.oreilly.com/2009/06/manage-your-performance-with-cgroups-and-projects.html No one has integrated this with Hadoop yet as it's still pretty new, and Hadoop clusters are meant to be run on unshared hardware. -Todd
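For the curious, a rough sketch of the cgroups idea, assuming a cgroup-v1 kernel with the cpu controller mounted at /cgroup/cpu. The mount point, group naming, and share value are assumptions about the host, and, as Todd says, no such Hadoop integration exists:

    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    public class CgroupCpuLimiter {
        // Assumed mount point for the cgroup-v1 cpu controller; not a
        // Hadoop constant, just where many distros mount it.
        private static final String CPU_ROOT = "/cgroup/cpu";

        // Create a per-task group, give it a relative CPU weight, and move
        // the child process into it by writing its pid to the tasks file.
        public static void limitTask(String taskId, int pid, int shares)
                throws IOException {
            File group = new File(CPU_ROOT, "hadoop-" + taskId);
            if (!group.exists() && !group.mkdir()) {
                throw new IOException("could not create cgroup " + group);
            }
            write(new File(group, "cpu.shares"), String.valueOf(shares));
            write(new File(group, "tasks"), String.valueOf(pid));
        }

        private static void write(File f, String value) throws IOException {
            FileWriter w = new FileWriter(f);
            try {
                w.write(value);
            } finally {
                w.close();
            }
        }
    }

cpu.shares is a relative weight (default 1024), so a group set to 512 gets roughly half the CPU of a default sibling under contention; it is not a hard cap.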

Re: RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Gautam Singaraju
One thing that you might want to consider is increasing the replication factor. This costs disk space, but it can also improve performance, since more replicas give the scheduler more nodes where a task can read its data locally. You might also want to check out: Sun Grid Engine Hadoop Integration http://blogs.sun.com/templedf/entry/beta_testing_the_sun_grid
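Raising replication is a one-liner against Hadoop's FileSystem API; a minimal sketch, with the path and target factor as placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RaiseReplication {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // Replicate a hot input 5 ways instead of the default 3 so
            // more nodes can run map tasks against a local copy.
            fs.setReplication(new Path("/data/shared-input.txt"), (short) 5);
        }
    }

The same thing is available from the shell as: hadoop fs -setrep 5 /data/shared-input.txt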

Re: set how much CPU to be utilised by a MapReduce job

2010-01-24 Thread Allen Wittenauer
On 1/24/10 10:33 PM, Naveen Kumar Prasad naveenkum...@huawei.com wrote: If many jobs are running concurrently in Hadoop, how can we set CPU usage for individual tasks? That functionality does not exist.