MapReduce ought to control the number of workers reasonably well, and
you can override it with mapred.reduce.tasks if you want. I don't think
any fixed number works: what's right for 2 machines isn't right for
200.
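
For example, something like this (a rough sketch, not Mahout's own code;
the class name, job name, and reducer count are just illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCountExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Same effect as passing -Dmapred.reduce.tasks=32 on the command line.
    conf.setInt("mapred.reduce.tasks", 32);

    Job job = new Job(conf, "example-job");
    // The explicit API call takes precedence over the configuration property.
    job.setNumReduceTasks(32);
  }
}

On the command line the usual route is the -D generic option, e.g.
hadoop jar ... -Dmapred.reduce.tasks=32, provided the driver honors
generic options.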

2011/9/13 myn <m...@163.com>:
>  private static void startDFCounting(Path input, Path output, Configuration baseConf, int numReducers)
>
>  private static void makePartialVectors(Path input,
>
> meanshift clustering
>
> and in so many other places. Why? Hadoop's default is 2 reducers, but my
> data is 3 billion records, and with only 2 reducers it is very slow.
