I think the biggest issue would be upstream bandwidth and latency. If the
thought was to use a SETI@home-style approach, most users wouldn't have the
necessary upstream bandwidth to support the DFS. It is likely that a
few local desktop machines would significantly outpace a much larger
DSL/cabl
This is likely to fail, yes. The reason is that you'll almost
certainly encounter timeouts in the heartbeats between the data nodes and
the name node, and between the task trackers and the job tracker. Also,
Hadoop uses pipelined replication between data nodes (client -> DN1 -> DN2 -> ...),
which will also en
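
To put rough numbers on the pipeline point, here is a back-of-the-envelope sketch in Python. It is not Hadoop code. It assumes a 64 MB block, that a pipelined write runs no faster than its slowest hop, and that the final ack costs one round trip across all hops. The function name and the LAN/WAN link figures are made up for illustration, and the model ignores TCP behaviour, packet-level pipelining, and retries.

```python
def pipeline_write_seconds(block_mb, links):
    """Estimate the time to push one block through a replication pipeline.

    links: one (bandwidth_mbit_per_s, one_way_latency_s) tuple per hop,
    e.g. client -> DN1, DN1 -> DN2, DN2 -> DN3.
    """
    # Data streams through every hop, so the slowest link sets the rate.
    slowest_mbit = min(bw for bw, _ in links)
    transfer = block_mb * 8 / slowest_mbit
    # Acks travel back up the pipeline: one round trip over all hops.
    round_trip = 2 * sum(lat for _, lat in links)
    return transfer + round_trip


# Hypothetical link figures, chosen only for illustration.
LAN = (1000.0, 0.0005)  # 1 Gbit/s, 0.5 ms one way
WAN = (50.0, 0.040)     # 50 Mbit/s, 40 ms one way

lan_only = pipeline_write_seconds(64, [LAN, LAN, LAN])
one_wan = pipeline_write_seconds(64, [LAN, WAN, LAN])

print(f"all replicas in one datacenter: {lan_only:.3f} s")
print(f"one hop crosses the Internet:   {one_wan:.3f} s")
```

Under these assumed figures, a single Internet hop in the pipeline makes the whole block write roughly twenty times slower, since every replica downstream waits on that link.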
I think the original assumption behind Google's implementation of MapReduce
(and Hadoop's) is that it runs on in-house clusters.
2010/4/17
> Hello,
>
> I want to investigate the matter of running Hadoop MapReduce jobs over the
> Internet. I don't mean on private computers, all of them in different
> places, rathe
Hello,
I want to investigate the matter of running Hadoop MapReduce jobs over the
Internet. I don't mean on private computers, all of them in different
places, rather a collection of datacenters, connected to each other over
the Internet.
Would that fail? If yes, how and why? What issues would ar