Hadoop / HBase hotspotting / overloading specific nodes

2014-10-09 Thread SF Hadoop
I'm not sure if this is an HBase issue or a Hadoop issue, so please forgive me if this is off-topic. I am having a problem with Hadoop maxing out drive space on a select few nodes when I am running an HBase job. The scenario is this: - The job is a data import using Map/Reduce / HBase - The data

Re: Hadoop / HBase hotspotting / overloading specific nodes

2014-10-09 Thread Bing Jiang
Could you reserve some room for non-DFS usage, just to keep the disk from getting full? In hdfs-site.xml:
<property>
  <name>dfs.datanode.du.reserved</name>
  <value></value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.</description>
</property>
2014-10-09 14:01
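Since dfs.datanode.du.reserved is given in bytes per volume, a small helper can make the value less error-prone to write. The following is only a sketch: the function name and the 10 GiB figure are illustrative assumptions, not values from this thread.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def reserved_property(reserved_gib: int) -> str:
    """Render a dfs.datanode.du.reserved entry for hdfs-site.xml.

    reserved_gib is a hypothetical per-volume amount chosen by the operator.
    """
    reserved_bytes = reserved_gib * GIB  # the property expects bytes
    return (
        "<property>\n"
        "  <name>dfs.datanode.du.reserved</name>\n"
        f"  <value>{reserved_bytes}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # 10 GiB is an assumed example value, not a recommendation.
    print(reserved_property(10))
```

The setting applies to each configured data volume separately, so a node with several disks reserves this amount on every one of them.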

Re: Hadoop / HBase hotspotting / overloading specific nodes

2014-10-09 Thread Ted Yu
Looks like the number of regions is lower than the number of nodes in the cluster. Can you split the table such that, after the HBase balancer is run, there is a region hosted by every node? Cheers On Oct 8, 2014, at 11:01 PM, SF Hadoop sfhad...@gmail.com wrote: I'm not sure if this is an HBase
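One way to get at least one region per node is to pre-split the table at evenly spaced keys. The sketch below assumes row keys are uniformly distributed fixed-width hex strings (for example, a hashed prefix), which the thread does not state; the function name and the 8-digit width are hypothetical.

```python
from typing import List


def hex_split_points(num_regions: int, width: int = 8) -> List[str]:
    """Return num_regions - 1 evenly spaced hex split keys.

    Assumes row keys are uniformly distributed hex strings of `width`
    digits; skewed keys would still hotspot regardless of the splits.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    key_space = 16 ** width
    return [
        format(i * key_space // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Four regions as an example; in practice use at least the node count.
    print(hex_split_points(4))
```

The resulting keys could then be supplied as split points when the table is created or split. If the real row keys are sequential rather than uniformly distributed, evenly spaced splits alone will not spread the write load.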

Re: Hadoop / HBase hotspotting / overloading specific nodes

2014-10-09 Thread SF Hadoop
This doesn't help because the space is simply reserved for the OS. Hadoop still maxes out its quota and spits out out-of-space errors. Thanks On Wednesday, October 8, 2014, Bing Jiang jiangbinglo...@gmail.com wrote: Could you set a reserved room for non-dfs usage? Just to avoid the disk gets

Re: Hadoop / HBase hotspotting / overloading specific nodes

2014-10-09 Thread SF Hadoop
Haven't tried this. I'll give it a shot. Thanks On Thursday, October 9, 2014, Ted Yu yuzhih...@gmail.com wrote: Looks like the number of regions is lower than the number of nodes in the cluster. Can you split the table such that, after hbase balancer is run, there is region hosted by every