Dear all,
Does anyone have experience integrating Hadoop with SGE (Sun Grid
Engine)?
It is open source too (sge-6.2u5).
Does SGE really overcome some of the deficiencies of Hadoop?
According to an article:
Instead, to set the stage, let's talk about what Hadoop doesn't do so
well. I currently see two important deficiencies in Hadoop: it doesn't
play well with others, and it has no real accounting framework. Pretty
much every customer I've seen running Hadoop does it on a dedicated
cluster. Why? Because the tasktrackers assume they own the machines on
which they run. If there's anything on the cluster other than Hadoop,
it's in direct competition with Hadoop. That wouldn't be such a big deal
if Hadoop clusters didn't tend to be so huge. Folks are dedicating
hundreds, thousands, or even tens of thousands of machines to their
Hadoop applications. That's a lot of hardware to be walled off for a
single purpose. Are those machines really being used? You may not be
able to tell. You can monitor state in the moment, and you can grep
through log files to find out about past usage (Gah!), but there's no
historical accounting capability there.
So I want to know whether it is worthwhile to use SGE with Hadoop in a
production cluster.
Please share your views.
Thanks in advance,
Adarsh Sharma