I see Nutch is moving towards using MapReduce for many things, and there is already a branch that uses MapReduce for parsing and updatedb. I was wondering whether there are any benchmarks or tests validating the benefit of using the Nutch implementation of MapReduce, especially at large scale in a distributed setting. What tests have been done, and at what scale, for this MapReduce branch? I am doing my own testing, but it would be good to know what others have experienced.
Thanks,
Yitao
