I may be missing part of your question. If you only want one copy of the data stored, but you want the 128MB distributed across your datanodes, then you need to set the replication factor to 1. I'm surprised it let you set the factor to 0 at all.
If you were to set the replication value higher than 1, then multiple copies of each block would exist for redundancy, distributed across the three nodes.
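For reference, the default replication and block size can be set in conf/hadoop-site.xml; a minimal sketch, assuming a stock 0.15.x install (the 10M block size matches what you described; values are in bytes):

    <!-- conf/hadoop-site.xml: defaults applied to newly created files -->
    <property>
      <name>dfs.replication</name>
      <value>1</value>          <!-- keep a single copy of each block -->
    </property>
    <property>
      <name>dfs.block.size</name>
      <value>10485760</value>   <!-- 10MB blocks, as in your test -->
    </property>

If your release's shell supports it, you can also change replication on a file that is already in HDFS:

    # set replication to 1 on an existing file
    bin/hadoop dfs -setrep 1 input/file.gz

One caveat, if memory serves: when the client doing the put runs on one of the datanodes, HDFS places the first replica on that local node, so even with replication 1 the whole file can end up on the node where you ran the command.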
Hope this helps,
Grant

On Jul 14 2008, Yi Zhao wrote:
hi, all

I have a hadoop cluster which has one master and three datanodes. I want to put a local file of about 128M into HDFS, and I have set the block size to 10M.

When I set the replication to 0, I found that all the data went to the node where I executed the command 'bin/hadoop dfs -put file.gz input', so this node used about 128M of disk space, but the other nodes used none.

When I set the replication to 3, I found that every node has the same data, so every node used about 128M of disk space.

What should I do? I'm using hadoop-0.15.2. Can anyone help me? thanks.
--
Grant Mackey
UCF Researcher
Eng. III Rm238
