If the balancer is not running, or is running with low bandwidth and reacting 
slowly, I think there may be a significant imbalance between datanodes.






At 2013-03-25 04:37:05,"Jamal B" <jm151...@gmail.com> wrote:

Then I think the only way around this would be to decommission the smaller 
nodes one at a time and ensure that their blocks are moved to the larger nodes. 
Once that is complete, bring the smaller nodes back in, but maybe only after 
you tweak the rack topology to match your disk layout more than your network 
layout, to compensate for the unbalanced nodes.


Just my 2 cents



On Sun, Mar 24, 2013 at 4:31 PM, Tapas Sarangi <tapas.sara...@gmail.com> wrote:

Thanks. We have a 1-1 configuration of drives and folders on all the datanodes.


-Tapas


On Mar 24, 2013, at 3:29 PM, Jamal B <jm151...@gmail.com> wrote:


On both types of nodes, what is your dfs.data.dir set to? Does it specify 
multiple folders on the same set of drives, or is it 1-1 between folder and 
drive? If it's set to multiple folders on the same drives, it is probably 
overcounting the "available capacity", in that it assumes a 1-1 relationship 
between each folder and the total capacity of the drive.
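
To illustrate the overcounting described above, here is a minimal Python sketch. It assumes, as suggested above and not verified against the Hadoop source, that capacity is summed once per configured folder. The folder paths, device names, and the helper `reported_vs_actual` are hypothetical.

```python
def reported_vs_actual(data_dirs):
    """data_dirs: list of (folder, device, device_capacity_tb) tuples."""
    # Naive view: every folder is credited with the full capacity of its drive.
    reported = sum(cap for _folder, _dev, cap in data_dirs)
    # Actual view: each physical drive counts once, however many folders it hosts.
    per_device = {dev: cap for _folder, dev, cap in data_dirs}
    actual = sum(per_device.values())
    return reported, actual


# Hypothetical layouts: one folder per drive vs. two folders on one drive.
one_to_one = [("/data1", "sda", 2.0), ("/data2", "sdb", 2.0)]
shared = [("/data1/a", "sda", 2.0), ("/data1/b", "sda", 2.0)]

print(reported_vs_actual(one_to_one))  # (4.0, 4.0)
print(reported_vs_actual(shared))      # (4.0, 2.0)
```

With a 1-1 layout the two numbers agree, so this effect would not explain a mismatch on such nodes.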



On Sun, Mar 24, 2013 at 3:01 PM, Tapas Sarangi <tapas.sara...@gmail.com> wrote:

Yes, thanks for pointing that out, but I already know that the balancer has 
finished balancing when it exits; otherwise it shouldn't exit. 
Your answer doesn't solve the problem I mentioned earlier in my message. 'hdfs' 
is stalling, and hadoop does not write unless space is cleared up on the 
cluster, even though "df" shows the cluster has about 500 TB of free space. 


-------
 


On Mar 24, 2013, at 1:54 PM, Balaji Narayanan (பாலாஜி நாராயணன்) 
<bal...@balajin.net> wrote:


 -setBalancerBandwidth <bandwidth in bytes per second>

So the value is in bytes per second. If the balancer is running and then 
exiting, it means it has completed the balancing.
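
As a quick sanity check on the units, the sketch below converts the 2x10^9 value mentioned in this thread from bytes per second into gigabytes and gigabits per second. The helper name `describe_bandwidth` is hypothetical; only the arithmetic is shown.

```python
def describe_bandwidth(bytes_per_sec):
    """Return (gigabytes/s, gigabits/s) for a bandwidth given in bytes/s."""
    gigabytes_per_sec = bytes_per_sec / 1e9
    gigabits_per_sec = bytes_per_sec * 8 / 1e9
    return gigabytes_per_sec, gigabits_per_sec


gb, gbit = describe_bandwidth(2e9)
print(f"{gb:g} GB/s = {gbit:g} Gbit/s")  # 2 GB/s = 16 Gbit/s
```

If the datanodes have 1 or 10 Gbit/s network links (an assumption; the thread does not say), a 16 Gbit/s cap is above what the network can deliver, so the configured bandwidth would not be the limiting factor.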




On 24 March 2013 11:32, Tapas Sarangi <tapas.sara...@gmail.com> wrote:

Yes, we are running the balancer, though a balancer process runs for almost a 
day or more before exiting and starting over.
The current dfs.balance.bandwidthPerSec value is set to 2x10^9. I assume that's 
in bytes, so about 2 gigabytes/sec. Shouldn't that be reasonable? If it is in 
bits, then we have a problem.
What's the unit for "dfs.balance.bandwidthPerSec"?


-----


On Mar 24, 2013, at 1:23 PM, Balaji Narayanan (பாலாஜி நாராயணன்) 
<li...@balajin.net> wrote:


Are you running the balancer? If the balancer is running and it is slow, try 
increasing the balancer bandwidth.




On 24 March 2013 09:21, Tapas Sarangi <tapas.sara...@gmail.com> wrote:

Thanks for the follow-up. I don't know whether the attachment will pass through 
this mailing list, but I am attaching a PDF that contains the usage of all live 
nodes.


All nodes starting with the letter "g" are the ones with smaller storage space, 
whereas nodes starting with the letter "s" have larger storage space. As you 
will see, most of the "gXX" nodes are completely full, whereas the "sXX" nodes 
have a lot of unused space. 


Recently, we have been facing frequent crises, as 'hdfs' goes into a mode where 
it is not able to write any further even though the total space available in 
the cluster is about 500 TB. We believe this has something to do with the way 
it is balancing the nodes, but we don't understand the problem yet. Maybe the 
attached PDF will help some of you (experts) see what is going wrong here...


Thanks
------













The balancer knows about the topology, but when it calculates balancing it 
operates only on nodes, not on racks.
You can see how it works in Balancer.java, in BalancerDatanode, around line 509.

I was wrong about 350 TB and 35 TB; it is calculated this way:

For example:
cluster_capacity = 3.5 PB
cluster_dfsused = 2 PB

avgutil = cluster_dfsused / cluster_capacity * 100 = 57.14% of cluster capacity used
Then we compute each node's utilization (node_dfsused / node_capacity * 100). 
The balancer considers a node fine if 
avgutil + 10 > node_utilization >= avgutil - 10.

In the ideal case, every node uses avgutil of its capacity, but for a 12 TB 
node that is only about 6.9 TB, and for a 72 TB node it is about 41 TB.

The balancer can't help you.
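
The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this message, not the code from Balancer.java; the function names and the sample node values are hypothetical, and the threshold of 10 is the one used in the inequality above.

```python
def average_utilization(cluster_used, cluster_capacity):
    """Cluster-wide utilization in percent."""
    return cluster_used / cluster_capacity * 100


def is_balanced(node_used, node_capacity, avg_util, threshold=10.0):
    """True if the node's utilization is within `threshold` points of the average."""
    node_util = node_used / node_capacity * 100
    return avg_util - threshold <= node_util < avg_util + threshold


avg = average_utilization(2.0, 3.5)  # 2 PB used out of 3.5 PB
print(round(avg, 2))  # 57.14

# Data each node would hold if it sat exactly at the average utilization.
for capacity_tb in (12, 72):
    print(capacity_tb, round(capacity_tb * avg / 100, 2))

# A completely full 12 TB node is far above the band; a half-full 72 TB node is inside it.
print(is_balanced(12, 12, avg))  # False
print(is_balanced(36, 72, avg))  # True
```

The balancer works on percentages, so "balanced" means equal utilization, not equal free space: a small node reaches the same percentage with far less data.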

Show me http://namenode.rambler.ru:50070/dfsnodelist.jsp?whatNodes=LIVE if you 
can.

 





In the ideal case, with replication factor 2 and two nodes of 12 TB and 72 TB, 
you will be able to store only 12 TB of replicated data.



Yes, this is true for exactly two nodes in the cluster with 12 TB and 72 TB, 
but not true for more than two nodes in the cluster.
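
To make the two-node versus many-node point concrete, here is a rough Python sketch that estimates how much data fits when every block needs its replicas on distinct nodes. It is an idealized model, not how HDFS places blocks: it ignores rack awareness, block granularity, and reserved space, and assumes perfect placement. The function name `max_logical_data` is hypothetical.

```python
def max_logical_data(capacities, replication):
    """Largest amount of data storable with `replication` copies on distinct nodes."""
    def feasible(d):
        # A node holds at most one replica of each block, so it can contribute
        # at most min(capacity, d) toward the replication * d that is needed.
        return sum(min(c, d) for c in capacities) >= replication * d

    lo, hi = 0.0, sum(capacities) / replication
    if feasible(hi):
        return hi  # all raw capacity is usable
    for _ in range(100):  # bisection on the feasibility boundary
        mid = (lo + hi) / 2
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo


print(round(max_logical_data([12, 72], 2), 3))      # 12.0
print(round(max_logical_data([12, 72, 72], 2), 3))  # 78.0
```

With only the 12 TB and 72 TB nodes, the small node caps the data at 12 TB. Adding a second 72 TB node lets the large nodes replicate to each other, so in this idealized model all 156 TB of raw space becomes usable.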



The best way, in my opinion, is to use multiple racks. Nodes within a rack must 
have identical capacity, and the racks must have identical total capacity.
For example:

rack1: 1 node with 72 TB
rack2: 6 nodes with 12 TB
rack3: 3 nodes with 24 TB

This helps with balancing, because a replicated block must go to another rack.




The same question I asked earlier in this message: do multiple racks, with the 
default threshold for the balancer, minimize the difference between racks?


Why did you select HDFS? Maybe Lustre, CephFS, or something else is a better 
choice. 



It wasn't my decision, and I probably can't change it now. I am new to this 
cluster and trying to understand a few issues. I will explore other options as 
you mentioned.

--
http://balajin.net/blog
http://flic.kr/balajijegan











