Hi,
I am using Hadoop 0.19.0 on EC2. The Hadoop execution and HDFS directories
are on EBS volumes mounted on each node in my EC2 cluster. Only the Hadoop
install is in the AMI. We have 10 EBS volumes, and when the cluster starts it
randomly picks one for each slave. We don't always start all 10 slaves
Hi Chris,
You should really start all the slave nodes to be sure that you don't
lose data. If you start fewer than (#nodes - #replication + 1) nodes,
you are virtually guaranteed to lose blocks. Starting 6 nodes out
of 10 will cause the filesystem to remain in safe mode, as you've
seen.
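
To make that arithmetic concrete, here is a rough Python sketch. It assumes a replication factor of 3 (the HDFS default; your dfs.replication may differ) and that each block's replicas sit on distinct nodes chosen uniformly at random, which only approximates HDFS's real placement policy. The function names are made up for illustration.

```python
from math import comb  # requires Python 3.8+


def min_nodes_to_start(total_nodes: int, replication: int) -> int:
    """Smallest node count that guarantees every block has a replica online.

    If at most (replication - 1) nodes are missing, no block can have
    all of its replicas on missing nodes.
    """
    return total_nodes - replication + 1


def fraction_of_blocks_unavailable(total_nodes: int,
                                   started_nodes: int,
                                   replication: int) -> float:
    """Chance that every replica of a given block is on a node not started.

    ASSUMPTION: replicas are on distinct nodes picked uniformly at random.
    """
    missing = total_nodes - started_nodes
    # comb() returns 0 when fewer nodes are missing than there are replicas.
    return comb(missing, replication) / comb(total_nodes, replication)


if __name__ == "__main__":
    # 10 nodes with replication 3: at least 8 nodes must be started.
    print(min_nodes_to_start(10, 3))
    # Starting only 6 of 10: about 1 block in 30 has no replica online.
    print(fraction_of_blocks_unavailable(10, 6, 3))
```

Under those assumptions, starting 6 of 10 nodes leaves roughly 3% of blocks without any live replica. If I recall the default correctly, the namenode waits for 99.9% of blocks to be reported (dfs.safemode.threshold.pct) before leaving safe mode, which would explain why it stays there.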
BTW