Hello Sundeep,
As Harsh said, it doesn't make much sense to use MapReduce with the
native (local) filesystem. If you really want to leverage the power of
Hadoop, you should use the MR+HDFS combo, as divide and conquer is
Hadoop's strength. It's a distributed system where each component gets
its own piece of the work to process.
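To illustrate the combo concretely, pointing a cluster at HDFS comes down to the default filesystem URI in core-site.xml. A minimal sketch (the namenode host and port below are placeholders for your own cluster):

    <!-- core-site.xml: make HDFS the default filesystem -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://namenode.example.com:8020</value>
      </property>
    </configuration>

With this in place, input and output paths given without a scheme (e.g. /user/sundeep/input) resolve to HDFS, and map tasks can be scheduled near the datanodes that hold their input splits.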
Hi Users,
I am kind of new to MapReduce programming, and I am trying to
understand the integration between MapReduce and HDFS.
I understand that MapReduce can use HDFS for data access. But is it
possible to run MapReduce programs without using HDFS at all?
HDFS handles file replication and partitioning for you.
The local filesystem has no sense of being 'distributed'. If you run
Hadoop in distributed mode over file:// (the local FS), then unless the
mount point that file:// refers to is itself distributed (such as an
NFS share), your jobs will fail their tasks on every node where the
referenced files cannot be found.
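To answer the original question concretely: you can run MapReduce without HDFS by making the local filesystem the default, but that is really only sane for single-node testing or when file:// is backed by a shared mount. A minimal core-site.xml sketch:

    <!-- core-site.xml: use the local filesystem instead of HDFS -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>file:///</value>
      </property>
    </configuration>

On a multi-node cluster this only works if every node sees the same paths (e.g. via NFS); otherwise tasks scheduled on other nodes will not find the input files, which is exactly the failure mode Harsh describes.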