I am building a small two-node cluster, following http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
Everything seems to be working, except that the data is NOT evenly distributed across the physical boxes. For example, when I run "hadoop dfs -put" on ~6 GB of data, I expect ~3 GB to end up on each node (with the nodes taking turns roughly every 64 MB). However, I checked dfshealth.jsp and ran "du -k" on the local box, and found that the uploaded data resides ONLY on the physical box where I ran "dfs -put". Doesn't that defeat the whole (data locality) purpose of Hadoop?
Please help.
Thanks
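
To make the expectation concrete, here is a minimal Python sketch of the arithmetic behind the "~3 GB per node" figure. It assumes round-robin placement of 64 MB blocks with a single copy of each block; that is only my expectation, not necessarily how HDFS actually places blocks, and the node names are hypothetical.

```python
# Sketch of the placement I expected (an assumption, not HDFS's actual policy):
# the file is cut into 64 MB blocks and the blocks alternate between the nodes.

BLOCK_SIZE_MB = 64          # block size mentioned in the post
FILE_SIZE_MB = 6 * 1024     # the ~6 GB upload
NODES = ["node1", "node2"]  # hypothetical host names


def expected_distribution(file_size_mb, block_size_mb, nodes):
    """Return MB stored per node if blocks were assigned round-robin."""
    per_node = {name: 0 for name in nodes}
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    # All full blocks, plus one shorter trailing block if the file
    # size is not a multiple of the block size.
    sizes = [block_size_mb] * full_blocks + ([remainder] if remainder else [])
    for index, size in enumerate(sizes):
        per_node[nodes[index % len(nodes)]] += size
    return per_node


if __name__ == "__main__":
    print(expected_distribution(FILE_SIZE_MB, BLOCK_SIZE_MB, NODES))
```

Under these assumptions the 6 GB file splits into 96 blocks, 48 per node, which is where the ~3 GB figure comes from. What I actually observe is all 96 blocks on one box.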