I may be misreading your question, but: if you only want one copy of the data stored, with the 128M spread across your datanodes, then you need to set the replication factor to 1. The replication factor is the number of copies kept of each block, so 1 means exactly one copy; I'm surprised it let you set the factor to 0 at all. One caveat: HDFS prefers to place the first replica on the datanode you are writing from, so if you run the put command on one of your three datanodes, the single copies will still tend to land on that machine. Writing from a client outside the cluster will spread the blocks around.
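For example (a minimal sketch; the dfs.replication property name is from the 0.15.x series, and the file location assumes a default install layout), you could set the cluster-wide default in conf/hadoop-site.xml:

    <!-- conf/hadoop-site.xml: store a single copy of each block -->
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>

If your release's dfs shell supports it, you can also change the factor on an existing file with something like 'bin/hadoop dfs -setrep 1 input/file.gz'.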

If you were to set the replication value any higher than 1, then multiple copies of each block would exist, for redundancy, and would be distributed across the three nodes. In your case, a 128M file at a 10M block size splits into 13 blocks; with a factor of 3 on a three-node cluster, every node ends up holding a copy of every block, which is exactly the ~128M per node you saw.
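To see where the blocks of your file actually ended up, fsck should work (the path below is just a placeholder; substitute your own, and note the output format varies somewhat between releases):

    # list each block of the file and the datanodes holding it
    bin/hadoop fsck /user/<your-user>/input/file.gz -files -blocks -locations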

Hope this helps.

- Grant

On Jul 14 2008, Yi Zhao wrote:

hi, all
I have a Hadoop cluster which has one master and three datanodes.

I want to put a local file of about 128M into HDFS, and I have set the
block size to 10M.

When I set the replication to 0,
I found that all the data went to the node where I executed the
command 'bin/hadoop dfs -put file.gz input', so about 128M of this
node's disk space is used, but no disk space is used on the other nodes.

When I set the replication to 3,
I found that every node has the same data, so about 128M of disk
space is used on every node.

What should I do? I'm using hadoop-0.15.2.

Can anyone help me?

thanks.


--
Grant Mackey
UCF Researcher
Eng. III Rm238
