Re: Biggest cluster running YARN in the world?

2013-01-15 Thread Hemanth Yamijala
You may get more up-to-date information from folks at Yahoo!, but here is a mail on the hadoop-general mailing list that has some statistics: http://www.mail-archive.com/general@hadoop.apache.org/msg05592.html Please note it is a little dated, so things should be better now :-) Thanks, Hemanth On Tue,

MPI and hadoop on same cluster

2013-01-15 Thread rahul v
Hi, This issue issues.apache.org/jira/browse/MAPREDUCE-2911 talks about executing Hadoop and MPI on the same cluster. Even though the comments suggest Ralph has finished writing the code, I am not able to find the patch. Can someone guide me towards finding it? -- Regards, R.V.

When reduce tasks start in MapReduce Streaming?

2013-01-15 Thread Pedro Sá da Costa
Hi, I read from documents that in MapReduce, the reduce tasks only start after a percentage (by default 90%) of maps end. This means that the slowest maps can delay the start of reduce tasks, and the input data that is consumed by the reduce tasks is represented as a batch of data. This means

Re: When reduce tasks start in MapReduce Streaming?

2013-01-15 Thread Jeff Bean
Hi Pedro, Yes, Hadoop Streaming has the same property. The reduce method is not called until the mappers are done, and the reducers are not scheduled before the threshold set by mapred.reduce.slowstart.completed.maps is reached. On Tue, Jan 15, 2013 at 3:06 PM, Pedro Sá da Costa
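
To make the threshold behaviour described above concrete, here is a minimal sketch of the check in Python. It is a simplified model, not Hadoop's scheduler code: the function name and its parameters are hypothetical, and the rounding (ceiling of fraction times map count) is an assumption about how the fraction is turned into a number of maps.

```python
import math


def reducers_may_be_scheduled(completed_maps, total_maps, slowstart):
    """Simplified model of the slowstart threshold (hypothetical helper).

    slowstart stands in for the value of
    mapred.reduce.slowstart.completed.maps, a fraction between 0.0 and 1.0.
    Assumption: the fraction is converted to a map count by rounding up.
    """
    required_maps = math.ceil(slowstart * total_maps)
    return completed_maps >= required_maps


if __name__ == "__main__":
    # With a threshold of 0.5 and 10 maps, reducers wait for 5 finished maps.
    for done in (4, 5):
        print(done, reducers_may_be_scheduled(done, 10, 0.5))
```

Even once this check passes, scheduled reducers only copy and sort map output; as noted above, the reduce method itself is not called until all mappers are done. For a streaming job the property can presumably be supplied as a generic option, e.g. -D mapred.reduce.slowstart.completed.maps=0.5, placed before the streaming-specific options.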