Re: Data replication and moving computation

Roshan James Thu, 18 Jun 2009 15:24:20 -0700

Further, look at the namenode file system browser for your cluster to see
the chunking in action.


http://wiki.apache.org/hadoop/WebApp%20URLs

Roshan

On Thu, Jun 18, 2009 at 6:28 AM, Harish Mallipeddi <
harish.mallipe...@gmail.com> wrote:

> On Thu, Jun 18, 2009 at 3:43 PM, rajeev gupta <graj1...@yahoo.com> wrote:
>
> >
> > I have this doubt regarding HDFS. Suppose I have 3 machines in my HDFS
> > cluster and the replication factor is 1. A large file sits on the local
> > file system of one of those three machines. If I put that file in HDFS,
> > will it be divided and distributed across all three machines? I ask
> > because of the HDFS principle that "moving computation is cheaper than
> > moving data".
> >
> > If the file is distributed across all three machines, there will be a lot
> > of data transfer, whereas if the file is NOT distributed, the compute
> > power of the other machines will be unused. Am I missing something here?
> >
> > -Raj
> >
> >
> >
> Irrespective of what you set as the replication factor, large files will
> always be split into chunks (chunk size is what you set as your HDFS
> block-size) and they'll be distributed across your entire cluster.
>
>
> --
> Harish Mallipeddi
> http://blog.poundbang.in
>
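
To make the chunking described above concrete, here is a minimal sketch of
how a file's byte range divides into block-sized pieces. It is only
arithmetic, not HDFS code: split_into_blocks and DEFAULT_BLOCK_SIZE are
hypothetical names, and the 64 MB figure is an assumption standing in for
whatever block size your cluster is configured with. It says nothing about
which datanode ends up holding each block.

```python
# Assumption: 64 MB stands in for the cluster's configured HDFS block size.
DEFAULT_BLOCK_SIZE = 64 * 1024 * 1024


def split_into_blocks(file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Return a list of (offset, length) pairs covering file_size bytes.

    Every block is block_size bytes long except possibly the last one,
    which holds whatever remains.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")

    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


if __name__ == "__main__":
    # A 200 MB file under the assumed 64 MB block size.
    for index, (offset, length) in enumerate(split_into_blocks(200 * 1024 * 1024)):
        print(index, offset, length)
```

Under that assumed block size, a 200 MB file comes out as three full 64 MB
blocks plus one 8 MB block. The replication factor does not appear in the
calculation at all, which matches the point that splitting happens
regardless of replication.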
