This isn't specific to Mahout. In Hadoop, your cluster's capacity has no
bearing on how many tasks a job runs. The ".maximum" properties only
express how many tasks may run concurrently on one worker node.
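For reference, those per-worker limits live in mapred-site.xml on each
TaskTracker (the values here are illustrative, not recommendations):

```xml
<!-- mapred-site.xml on each TaskTracker node -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>  <!-- concurrent map tasks on this worker -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>  <!-- concurrent reduce tasks on this worker -->
</property>
```

Changing these affects per-node concurrency only, not how many tasks the
job is split into.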

Hadoop will always run 1 reducer unless you tell it otherwise with
mapred.reduce.tasks. I think most of the jobs don't set this for you, so
you'll have to do it yourself.
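For example, you can set it on the command line when launching a job.
This is just a sketch: the jar, class, and paths are placeholders, only
the -Dmapred.reduce.tasks flag is the real property:

```shell
# Request 10 reducers for this job; jar/class/paths are placeholders
hadoop jar your-job.jar org.example.SomeJob \
  -Dmapred.reduce.tasks=10 \
  --input /data/in --output /data/out
```

The -D form works for any job whose driver uses ToolRunner /
GenericOptionsParser, which the Mahout drivers do.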

The number of map tasks depends on the size and number of input files. I
wouldn't override it, but if you must, you can do so by decreasing the
maximum split size that one mapper takes.
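If you really do need more mappers, the knob (for input formats that
honor it, such as the mapreduce-API FileInputFormat in Hadoop 1.x) is
the maximum split size. Again a sketch with placeholder jar/class/paths:

```shell
# Cap each input split at 64 MB (67108864 bytes) so more map tasks
# are created; jar/class/paths are placeholders
hadoop jar your-job.jar org.example.SomeJob \
  -Dmapred.max.split.size=67108864 \
  --input /data/in --output /data/out
```

A smaller split size means more, smaller map tasks over the same input.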
