I have a single-node pseudo-cluster with plenty of RAM and 5 hard disks. As an
experiment, I set mapreduce.cluster.local.dir to point to a RAM disk.
For this experiment I am running an 8 GB terasort, so I made a 9 GB RAM
disk. This change reduced the run time of the job by ~16% versus
pointing mapreduce.clu
Hi Matei,
Using the Fair Scheduler from the Cloudera distribution seems to have (mostly)
solved the problem. Thanks a lot for the suggestion.
-Virajith
On Tue, Jul 12, 2011 at 7:23 PM, Matei Zaharia wrote:
> Hi Virajith,
>
> The default FIFO scheduler just isn't optimized for locality for small
>
Hi all,
I have run into a very weird problem. When I submit input data that has
to be split into more than 2 tasks, the last task's status is always
*initializing*, and no node can complete the task. I checked the
tasktracker logs, userlogs, and datanode logs, and no errors are reported.
Pleas