Do the jobs run on the whole cluster or only on a single rack? If you write from a single rack, you will get something very much like what you describe, because the default placement policy puts one replica on the local node and the other two replicas on the same remote rack. It does check that enough space is available on the chosen nodes, but it does not try to balance usage across the cluster.
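For illustration, here is a minimal Python sketch of how that default policy behaves (a simplification, not the actual HDFS BlockPlacementPolicyDefault code; the cluster layout, node names, and block size are invented): the writer's own node gets the first replica, a single remote rack gets the other two, and the only check made is that each candidate node has enough free space.

```python
# Simplified sketch of HDFS's default replica placement (replication = 3).
# Not the real placement code; the cluster layout below is made up.
import random

BLOCK_SIZE = 64 * 1024 * 1024  # bytes needed per replica

# rack -> {node: free bytes}
cluster = {
    "rack1": {"node11": 5e11, "node12": 5e11, "node13": 5e11},
    "rack2": {"node21": 5e11, "node22": 5e11, "node23": 5e11},
}

def has_space(rack, node):
    """Only check: does the node have room for one more replica?"""
    return cluster[rack][node] >= BLOCK_SIZE

def choose_targets(writer_rack, writer_node):
    """Return (rack, node) targets: one local, two on a single remote rack."""
    targets = []
    # 1st replica: the writer's own node, if it has space.
    if has_space(writer_rack, writer_node):
        targets.append((writer_rack, writer_node))
    # 2nd replica: a random node with space on a different rack.
    remote_rack = random.choice([r for r in cluster if r != writer_rack])
    second = random.choice(
        [n for n in cluster[remote_rack] if has_space(remote_rack, n)])
    targets.append((remote_rack, second))
    # 3rd replica: another node with space on that *same* remote rack.
    third = random.choice(
        [n for n in cluster[remote_rack]
         if n != second and has_space(remote_rack, n)])
    targets.append((remote_rack, third))
    return targets

# If every writer sits on one rack, the other rack receives two replicas of
# every block and fills up roughly twice as fast -- no balancing is attempted.
print(choose_targets("rack1", "node11"))
```

So when all writers live on one rack, the skew you see is expected; the balancer fixes it after the fact, but new writes keep recreating it.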
On Thu, Aug 22, 2013 at 9:41 AM, Marc Sturlese <marc.sturl...@gmail.com> wrote:

> Hey there,
> I've set up rack awareness on my Hadoop cluster with replication 3. I have 2
> racks and each contains 50% of the nodes.
> I can see that the blocks are spread across the 2 racks; the problem is that
> all nodes from one rack are storing 2 replicas and the nodes of the other
> rack just one. If I launch the hadoop balancer script, it will properly
> spread the replicas across the 2 racks, leaving all nodes with exactly the
> same available disk space, but after jobs have been running for hours the
> data will be unbalanced again (rack1 having all nodes with less empty disk
> space than all nodes from rack2).
>
> Any clue what's going on?
> Thanks in advance
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/rack-awarness-unexpected-behaviour-tp4086029.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.