Vitalii Tymchyshyn wrote:
01.09.11 18:14, Per Steffensen wrote:
Well, I am not sure I understand you correctly, but basically I want a
timer framework that triggers my jobs. The triggering of the jobs
needs to work even if one or two particular machines go down, so
the "timer triggering mechanism" has to live in the cluster, so to
speak. What I don't want is a timer framework that is driven from
one particular machine, so that the triggering of jobs stops
when that particular machine goes down. If I have, e.g.,
10 machines in a Hadoop cluster, I will be able to run, e.g., MapReduce
jobs even if 3 of the 10 machines are down. I want my timer framework
to also be clustered, distributed and coordinated, so that my
timer jobs are still triggered even when 3 out of 10 machines
are down.
Hello.
AFAIK you currently still have the HDFS NameNode, and as soon as the
NameNode is down, your cluster is down. So putting scheduling on the
same machine as the NameNode won't make your cluster worse in terms of
SPOF (at least for HW failures).
Best regards, Vitalii Tymchyshyn
I believe this is why there is also a secondary namenode. But even with
two namenodes it is still too centralized in my opinion. I guess the
Hadoop people know that, and that the namenode role will become more
distributed in the future. But that does not change the fact that I
would like to have a real distributed, clustered scheduler.