The easiest way would be to use only your reliable machines as
datanodes. Alternatively, for better performance, you could run two DFS
systems, one on all machines and one on just the reliable machines, and
back the first up to the second before you shut down the "unreliable"
nodes each night. Then, in the morning, restore things.
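For the nightly backup and morning restore, distcp between the two
filesystems ought to do it. A minimal sketch, assuming namenodes named
full-nn (the all-machines DFS) and reliable-nn (the reliable-only DFS)
on port 9000; the hostnames, port, and paths are placeholders for your
own setup:

  # evening: copy the day's data to the reliable-only DFS before the
  # unreliable nodes go down
  bin/hadoop distcp hdfs://full-nn:9000/data \
      hdfs://reliable-nn:9000/backup/data

  # morning: restore in the other direction once everything is back up
  bin/hadoop distcp hdfs://reliable-nn:9000/backup/data \
      hdfs://full-nn:9000/data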

Long-term, we hope to add a feature that permits one to remove a number of nodes from DFS at once, forcing all of the blocks stored on these nodes to migrate to other nodes. But that feature has not yet been implemented.
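The interface might look roughly like the following; this is purely a
sketch of the idea, and none of these names exist yet:

  # hypothetical: list the nodes to drain in an exclude file
  echo unreliable-slave-042 >> conf/exclude-hosts

  # hypothetical: tell the namenode to re-read the file and migrate
  # blocks off the excluded datanodes
  bin/hadoop dfsadmin -refreshNodes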

Doug

Mikkel Kamstrup Erlandsen wrote:
I will be running a cluster with 100-200 nodes, most of which will be
shut down at night. For the sake of example, let's say that 4 'reliable
slaves' will remain turned on continuously, and let me call the rest
'unreliable slaves'.

Storage-wise, how would I go about this (using HDFS)? I figure that it
would be a bad idea to put persistent data on the unreliable slaves,
since turning ~100 computers off simultaneously might wreak havoc on
HDFS(?). So the idea would be to let persistent data reside only on the
reliable slaves.

Would setting dfs.datanode.du.pct=0 on the unreliable slaves do the
trick?

Cheers,
Mikkel
