Doğacan Güney wrote:
Hello everyone,

When we started to use Hadoop (which was around 0.4.0 I think), we
used different machines for DFS and MR. IIRC, we had some problems
with running both a datanode and a tasktracker on the same machine, or
perhaps we were just superstitious. Anyway, the decision stuck and we
still use different machines.

So, the question is:
How do you run MR/DFS? Do you run JT/NN on the same machine or on
different machines? Do you run a tasktracker and a datanode on the
same machine? Also, in general is it recommended to run them on the
same machine?
(Our machines are dual core AMD64s with 2-4 GBs of RAM, btw)

Your setup is rather unusual. Typically you should run DN/TT on the same machines, because map tasks can then benefit from data locality (i.e. the DFS blocks a task reads may be found on the local disk and don't have to be transferred over the network). I think it would be much better to resolve whatever issue prevented you from doing this in the first place.
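To make the locality argument concrete, here is a minimal sketch in Python. It is not Hadoop code and does not model the real scheduler; the host names, block IDs and the helper `local_read_fraction` are all made up for illustration. It only counts how many blocks have at least one replica on a machine that also runs a tasktracker, which is an upper bound on how many reads could avoid the network.

```python
def local_read_fraction(block_replicas, tasktracker_hosts):
    """Fraction of blocks with at least one replica on a tasktracker host.

    block_replicas: dict mapping a block id to the list of hosts holding a replica.
    tasktracker_hosts: iterable of hosts that run a tasktracker.
    """
    trackers = set(tasktracker_hosts)
    if not block_replicas:
        return 0.0
    # A block counts as "local" if any replica host also runs a tasktracker.
    local = sum(1 for hosts in block_replicas.values() if trackers & set(hosts))
    return local / len(block_replicas)


# Hypothetical placement: three blocks, two replicas each, on datanodes node1..node3.
blocks = {
    "blk_1": ["node1", "node2"],
    "blk_2": ["node2", "node3"],
    "blk_3": ["node3", "node1"],
}

# Tasktrackers on separate machines: no block can ever be read locally.
separate = local_read_fraction(blocks, ["node4", "node5"])

# Tasktrackers on the datanode machines: every block has a local candidate.
colocated = local_read_fraction(blocks, ["node1", "node2", "node3"])

print(separate, colocated)
```

With tasktrackers on their own machines the fraction is always zero, so every input block crosses the network regardless of how the tasks are scheduled.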

JT/NN don't have to run on the same machine, although in my setups I usually end up with this configuration: JT and NN create only moderate load, so a single machine is usually sufficient, and I usually can't afford to put them on dedicated separate machines.

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

