Re: RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Gautam Singaraju
One thing that you might want to consider is increasing the replication factor. This will cost extra disk space, but it may also improve performance, since more map tasks can read a local replica of their input. You might also want to check out: Sun Grid Engine Hadoop Integration http://blogs.sun.com/templedf/entry/beta_testing_the_sun_grid --- Gaut
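The replication-factor suggestion above can be applied cluster-wide in `hdfs-site.xml`. A minimal sketch, assuming 0.20-era property names; the value 5 is purely illustrative (the default is 3):

```xml
<!-- hdfs-site.xml: raise the default block replication so map tasks
     are more likely to find a local copy of their input split -->
<property>
  <name>dfs.replication</name>
  <value>5</value>
</property>
```

Note this only affects files written after the change; existing files keep their replication unless changed explicitly (e.g. with `hadoop fs -setrep`).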

Re: setNumReduceTasks(1)

2010-01-25 Thread Jeff Zhang
*See my comments below* On Mon, Jan 25, 2010 at 3:22 PM, Something Something <mailinglist...@gmail.com> wrote:
> If I set # of reduce tasks to 1 using setNumReduceTasks(1), would the class be instantiated only on one machine.. always? I mean if I have a cluster of say 1 master, 10 workers

setNumReduceTasks(1)

2010-01-25 Thread Something Something
If I set the # of reduce tasks to 1 using setNumReduceTasks(1), would the class be instantiated on only one machine.. always? I mean, if I have a cluster of, say, 1 master, 10 workers & 3 zookeepers, is the Reducer class guaranteed to be instantiated on only 1 machine? If the answer is yes, then I will use
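For what it's worth: with a single reduce task there is exactly one reducer instance per job run, but it runs on whichever worker the JobTracker schedules it on, which is not guaranteed to be the same machine across runs. The same effect as `setNumReduceTasks(1)` can also come from configuration; a sketch, assuming 0.20-era property names:

```xml
<!-- job configuration (or mapred-site.xml): equivalent of
     JobConf.setNumReduceTasks(1) -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>1</value>
</property>
```

Also be aware that speculative execution or task retries can briefly start a second attempt of that one reduce task on another node.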

Interested in Hadoop Training outside the US? Let us know!

2010-01-25 Thread Christophe Bisciglia
NOTE: Please forward this message to local user groups, especially those that might be less active on the core Apache mailing lists. Feel free to translate into local languages. Hadoop Fans, Over the next year, you'll see new options for Hadoop training and certification from Cloudera. One of the

Re: RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Todd Lipcon
If you can require a recent kernel, you could use cgroups: http://broadcast.oreilly.com/2009/06/manage-your-performance-with-cgroups-and-projects.html No one has integrated this with Hadoop yet as it's still pretty new, and Hadoop clusters are meant to be run on unshared hardware. -Todd On Mon,

Re: RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Ryan Rawson
The only thing you could do is have the TaskTracker nice the child when it's exceeding its reservation. Aside from that, it's hard to limit CPU without killing a process. On Jan 25, 2010 12:23 AM, "Naveen Kumar Prasad" wrote: This functionality may not be readily available with Hadoop. But it wou
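Ryan's "nice the child" idea can be sketched in plain Java: launch the child process under the POSIX `nice` utility so it yields CPU to other work. This is a hypothetical stand-in for what a TaskTracker-like wrapper could do, not actual Hadoop code; it assumes a Unix-like system with `nice` on the PATH (a child `nice` with no arguments simply prints its own niceness, which makes the effect observable).

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class NiceChild {
    // Launch a child process under "nice" with the given increment and
    // return what the child reports as its niceness. Running "nice" with
    // no arguments prints the current niceness value.
    static String childNiceness(int increment) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(
                "nice", "-n", String.valueOf(increment), "nice");
        Process p = pb.start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine();
            p.waitFor();
            return line == null ? "" : line.trim();
        }
    }

    public static void main(String[] args) throws Exception {
        // A wrapper could instead launch the real task command here.
        System.out.println(childNiceness(10));
    }
}
```

A real integration would wrap the task JVM's command line the same way; renicing an already-running child after it exceeds its reservation would additionally need its PID and a call out to `renice`.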

RE: set how much CPU to be utilised by a MapReduce job

2010-01-25 Thread Naveen Kumar Prasad
This functionality may not be readily available with Hadoop, but I would appreciate it if anyone could help me understand how to go about developing this feature. Regards, Naveen Kumar HUAWEI TECHNOLOGIES CO.,LTD. Address: Huawei Industrial Base Bantian Longgang Shenzhen 5