Just a note:

> Usually around the ~80% full mark is when HDFS starts getting a bit wonky

These days, we have large grids over 90% full and still running fine.

The percentage of HDFS space used can be misleading. We usually monitor the
percentage of full datanodes instead.

Koji
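A minimal sketch of that kind of check, assuming the per-node "DFS Used%"
lines that "hadoop dfsadmin -report" prints (the exact report format varies
across Hadoop versions, and the 90% cutoff for "full" is an arbitrary
example, not a recommendation):

    # Rough sketch: count near-full datanodes from "hadoop dfsadmin -report".
    # The 90% threshold below is an assumed definition of "full".
    import re
    import subprocess

    report = subprocess.check_output(["hadoop", "dfsadmin", "-report"]).decode()

    # The first "DFS Used%" line is the cluster-wide figure; the rest are
    # one per datanode.
    pcts = [float(m) for m in re.findall(r"DFS Used%:\s*([\d.]+)%", report)]
    cluster_pct, node_pcts = pcts[0], pcts[1:]

    THRESHOLD = 90.0  # assumed cutoff for calling a datanode "full"
    full = [p for p in node_pcts if p >= THRESHOLD]
    print("Cluster DFS used: %.1f%%" % cluster_pct)
    print("Datanodes >= %.0f%% full: %d of %d"
          % (THRESHOLD, len(full), len(node_pcts)))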
On 3/17/11 2:20 PM, "Allen Wittenauer" <a...@apache.org> wrote:

> On Mar 17, 2011, at 12:13 PM, Stuart Smith wrote:
>
>> Parts of this may end up on the hbase list, but I thought I'd start here.
>> My basic problem is:
>>
>> My cluster is getting full enough that having one data node go down puts
>> a bit of pressure on the system (when balanced, every DN is more than
>> half full).
>
> Usually around the ~80% full mark is when HDFS starts getting a bit wonky
> on super active grids. Your best bet is to delete some data (or store it
> more efficiently), add more nodes, or upgrade the storage capacity of the
> nodes you have. The balancer is only going to save you for so long until
> the whole thing tips over.
>
>> Anybody here have any idea how badly running the balancer on a heavily
>> active system messes things up? (for HDFS/HBase - if anyone knows.)
>
> I don't run HBase, but at Y! we used to run the balancer pretty much every
> day, even on super active grids. It 'mostly works' until you get to the
> point of no return, which it sounds like you are heading for...
>
>> Any ideas? Or do I just need better hardware? Not sure if that's an
>> option, though...
>
> Depending upon how your systems are configured, something else to look at
> is how much space is getting eaten by logs, mapreduce spill space, etc. A
> good daemon bounce might free up some stale handles as well.
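For reference, the "run it every day" approach amounts to something like the
following, e.g. invoked nightly from cron. This is a sketch that assumes the
"hadoop balancer" command is on PATH; -threshold 10 is the balancer's default
utilization band, just spelled out explicitly here:

    # Minimal sketch of a daily balancer run.
    import subprocess
    import sys

    # Move blocks until every datanode is within 10 percentage points of
    # the cluster-wide average utilization (the balancer's default).
    rc = subprocess.call(["hadoop", "balancer", "-threshold", "10"])
    if rc != 0:
        sys.stderr.write("balancer exited with status %d\n" % rc)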
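And a rough sketch of the "where did the local disk go" audit Allen suggests.
The paths listed are hypothetical placeholders; substitute your actual log
directories and whatever mapred.local.dir points at:

    # Sum local disk usage under the usual non-HDFS space hogs.
    import os

    CANDIDATES = [
        "/var/log/hadoop",            # daemon logs (assumed location)
        "/var/log/hbase",             # assumed location
        "/tmp/hadoop/mapred/local",   # mapreduce spill space (assumed)
    ]

    def du_bytes(root):
        # Recursively total file sizes under root.
        total = 0
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # file vanished or unreadable; skip it
        return total

    for root in CANDIDATES:
        if os.path.isdir(root):
            print("%-30s %8.1f GB" % (root, du_bytes(root) / 1e9))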