Ok. The balancer runs as a separate thread (there's a config that sets how often 
the thread wakes up, but I can't remember the name off the top of my head). Maybe 
if you wait long enough, it will balance eventually. Another thing you can try is 
running the balancer from the HBase shell and seeing what you get back. If you get 
back true, it means it should balance. If you get back false, look at the HBase 
master logs to see what's happening. I once had a scenario where my Unix accounts 
were messed up (two users - hbase and another user - were mapped to the same Unix 
ID, and HDFS thought the user did not have permission to write to the HBase files 
on HDFS), and the balancer did not run due to the resulting exception.

Another thing is (I think!) that the balancer generally does not run while regions 
are splitting. So it's possible that in your case regions are splitting so often 
(due to the 10MB limit) that the balancer never gets a chance to run, since your 
regions are not stationary.
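For reference, this is roughly what that check looks like from the command line (a sketch only - it assumes the `hbase` launcher is on your PATH and the cluster is up; `balance_switch` and `balancer` are both present in the 0.94 shell):

```shell
# Make sure the balancer is switched on, then ask the master to run it now:
echo "balance_switch true" | hbase shell
echo "balancer" | hbase shell
# The balancer command prints true if a balancing run was triggered, and
# false if it was skipped (regions in transition, splitting regions, or
# the switch being off) - in the false case, go read the master log.
```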


Regards,
Dhaval


________________________________
From: Vamshi Krishna <vamshi2...@gmail.com>
To: user@hbase.apache.org; Dhaval Shah <prince_mithi...@yahoo.co.in> 
Sent: Friday, 23 August 2013 10:21 AM
Subject: Re: Will hbase automatically distribute the data across region servers 
or NOT..??


No, that is 10MB intentionally - just to observe region splitting with respect
to the amount of data I am inserting into HBase.
So here I am inserting 40-50MB of data, fixing that property to 10MB, and
checking that the region splitting happens.
But the interesting thing is that the regions did split, BUT they are not being
distributed across the other servers.
All of the regions formed from the tables created on machine-1 are residing
on that same machine-1 and are not being moved to the other machine.
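Two quick sanity checks here (a sketch, not from the thread): hbase.hregion.max.filesize is specified in bytes, and in 0.94 the region-to-server mapping can be read straight out of the .META. catalog table:

```shell
# 10 MB in bytes - the value this cluster sets for hbase.hregion.max.filesize:
echo $((10 * 1024 * 1024))
# To see which server each region actually landed on (HBase 0.94 shell,
# where the catalog table is still named .META.; needs a live cluster):
#   scan '.META.', {COLUMNS => 'info:server'}
```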




On Fri, Aug 23, 2013 at 7:40 PM, Dhaval Shah <prince_mithi...@yahoo.co.in> wrote:

> Vamshi, setting hbase.hregion.max.filesize to 10MB seems too small.
> Did you mean 10GB?
>
>
> Regards,
> Dhaval
>
>
> ________________________________
> From: Vamshi Krishna <vamshi2...@gmail.com>
> To: user@hbase.apache.org; zhoushuaifeng <zhoushuaif...@gmail.com>
> Sent: Friday, 23 August 2013 9:38 AM
> Subject: Re: Will hbase automatically distribute the data across region
> servers or NOT..??
>
>
> Thanks for the clarifications.
> I am using hbase-0.94.10 and zookeeper-3.4.5,
> but I am running into different issues.
> I set hbase.hregion.max.filesize to 10MB and I am inserting 10 million
> rows into an hbase table. After some time during the insertion, the
> master suddenly goes down. I don't know the reason for such peculiar
> behavior.
> I found the content below in the master log and am not able to make out
> what exactly the mistake is. Please, somebody help.
>
> master-log:
>
> 2013-08-23 18:56:36,865 FATAL org.apache.hadoop.hbase.master.HMaster:
> Master server abort: loaded coprocessors are: []
> 2013-08-23 18:56:36,866 FATAL org.apache.hadoop.hbase.master.HMaster:
> Unexpected state :
>
> scores,\x00\x00\x00\x00\x00\x02\xC8t,1377264003140.a564f31795091b6513880c5db49ec90f.
> state=PENDING_OPEN, ts=1377264396861, server=vamshi,60020,1377263789273 ..
> Cannot transit it to OFFLINE.
> java.lang.IllegalStateException: Unexpected state :
>
> scores,\x00\x00\x00\x00\x00\x02\xC8t,1377264003140.a564f31795091b6513880c5db49ec90f.
> state=PENDING_OPEN, ts=1377264396861, server=vamshi,60020,1377263789273 ..
> Cannot transit it to OFFLINE.
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>     at
>
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>     at
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>     at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
> 2013-08-23 18:56:36,867 INFO org.apache.hadoop.hbase.master.HMaster:
> Aborting
> 2013-08-23 18:56:36,867 DEBUG org.apache.hadoop.hbase.master.HMaster:
> Stopping service threads
> 2013-08-23 18:56:36,867 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60000
> 2013-08-23 18:56:36,867 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60000: exiting
> 2013-08-23 18:56:36,867 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60000: exiting
> 2013-08-23 18:56:36,867 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60000: exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server listener on 60000
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC
> Server handler 2 on 60000: exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC
> Server handler 1 on 60000: exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.hbase.master.HMaster$2:
> vamshi,60000,1377263788019-BalancerChore exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.hbase.master.HMaster:
> Stopping infoServer
> 2013-08-23 18:56:36,873 INFO
> org.apache.hadoop.hbase.master.cleaner.HFileCleaner:
> master-vamshi,60000,1377263788019.archivedHFileCleaner exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
> vamshi,60000,1377263788019-CatalogJanitor exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC
> Server handler 0 on 60000: exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 60000: exiting
> 2013-08-23 18:56:36,874 INFO org.mortbay.log: Stopped
> SelectChannelConnector@0.0.0.0:60010
> 2013-08-23 18:56:36,874 INFO
> org.apache.hadoop.hbase.master.cleaner.LogCleaner:
> master-vamshi,60000,1377263788019.oldLogCleaner exiting
> 2013-08-23 18:56:36,874 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60000: exiting
> 2013-08-23 18:56:36,874 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 60000: exiting
> 2013-08-23 18:56:36,874 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60000: exiting
> 2013-08-23 18:56:36,874 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 60000: exiting
> 2013-08-23 18:56:36,874 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server Responder
> 2013-08-23 18:56:36,876 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server Responder
> 2013-08-23 18:56:36,874 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60000: exiting
> 2013-08-23 18:56:36,873 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 60000: exiting
> 2013-08-23 18:56:36,877 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
> master:60000-0x140ab519b0f0000 Unable to set watcher on znode
> (/hbase/unassigned/05e30711673614f6b41a364c76f3f05f)
> java.lang.InterruptedException
>     at java.lang.Object.wait(Native Method)
>     at java.lang.Object.wait(Object.java:485)
>     at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
>     at
>
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:172)
>     at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:450)
>     at
>
> org.apache.hadoop.hbase.zookeeper.ZKAssign.createOrForceNodeOffline(ZKAssign.java:271)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1905)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>     at
>
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>     at
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>     at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
> 2013-08-23 18:56:36,876 WARN
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to create/force
> node into OFFLINE state before completing assignment but failed to do so
> for
>
> scores,\x00\x00\x00\x00\x00\x08b8,1377264147374.39794b7deea3203fc260756f5038d6f8.
> state=OFFLINE, ts=1377264396802, server=null
> 2013-08-23 18:56:36,876 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
> master:60000-0x140ab519b0f0000 Unable to get data of znode
> /hbase/unassigned/d476f8442ce31de90b60080b74daf47f
> java.lang.InterruptedException
>     at java.lang.Object.wait(Native Method)
>     at java.lang.Object.wait(Object.java:485)
>     at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1149)
>     at
>
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:290)
>     at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:746)
>     at
>
> org.apache.hadoop.hbase.zookeeper.ZKAssign.getDataNoWatch(ZKAssign.java:904)
>     at
>
> org.apache.hadoop.hbase.zookeeper.ZKAssign.createOrForceNodeOffline(ZKAssign.java:283)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1905)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>     at
>
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>     at
>
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>     at
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>     at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
> 2013-08-23 18:56:36,877 WARN
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to create/force
> node into OFFLINE state before completing assignment but failed to do so
> for
>
> scores,\x00\x00\x00\x00\x00\x10\xC1\xF4,1377264146360.05e30711673614f6b41a364c76f3f05f.
> state=OFFLINE, ts=1377264396862, server=null
> 2013-08-23 18:56:36,877 WARN
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to create/force
> node into OFFLINE state before completing assignment but failed to do so
> for
>
> scores,\x00\x00\x00\x00\x00\x17\xC0i,1377264302391.d476f8442ce31de90b60080b74daf47f.
> state=OFFLINE, ts=1377264396813, server=null
> 2013-08-23 18:56:36,882 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> transition=RS_ZK_REGION_FAILED_OPEN, server=vamshi_RS,60020,1377263792053,
> region=d476f8442ce31de90b60080b74daf47f
> 2013-08-23 18:56:36,882 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan
> for
>
> scores,\x00\x00\x00\x00\x00\x17\xC0i,1377264302391.d476f8442ce31de90b60080b74daf47f.
> destination server is vamshi,60020,1377263789273
> 2013-08-23 18:56:36,882 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: No previous transition
> plan was found (or we are ignoring an existing plan) for
>
> scores,\x00\x00\x00\x00\x00\x17\xC0i,1377264302391.d476f8442ce31de90b60080b74daf47f.
> so generated a random one;
>
> hri=scores,\x00\x00\x00\x00\x00\x17\xC0i,1377264302391.d476f8442ce31de90b60080b74daf47f.,
> src=, dest=vamshi,60020,1377263789273; 2 (online=2, available=1) available
> servers
> 2013-08-23 18:56:36,882 ERROR
> org.apache.hadoop.hbase.executor.ExecutorService: Cannot submit
> [ClosedRegionHandler-vamshi,60000,1377263788019-38] because the executor is
> missing. Is this process shutting down?
> 2013-08-23 18:56:36,906 DEBUG
> org.apache.hadoop.hbase.catalog.CatalogTracker: Stopping catalog tracker
> org.apache.hadoop.hbase.catalog.CatalogTracker@451415c8
> 2013-08-23 18:56:36,906 INFO
> org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor:
> vamshi,60000,1377263788019.timeoutMonitor exiting
> 2013-08-23 18:56:36,906 INFO
> org.apache.hadoop.hbase.master.AssignmentManager$TimerUpdater:
> vamshi,60000,1377263788019.timerUpdater exiting
> 2013-08-23 18:56:36,907 INFO
> org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor:
> vamshi,60000,1377263788019.splitLogManagerTimeoutMonitor exiting
> 2013-08-23 18:56:36,910 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> transition=RS_ZK_REGION_FAILED_OPEN, server=vamshi_RS,60020,1377263792053,
> region=05e30711673614f6b41a364c76f3f05f
> 2013-08-23 18:56:36,911 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan
> for
>
> scores,\x00\x00\x00\x00\x00\x10\xC1\xF4,1377264146360.05e30711673614f6b41a364c76f3f05f.
> destination server is vamshi,60020,1377263789273
> 2013-08-23 18:56:36,911 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: No previous transition
> plan was found (or we are ignoring an existing plan) for
>
> scores,\x00\x00\x00\x00\x00\x10\xC1\xF4,1377264146360.05e30711673614f6b41a364c76f3f05f.
> so generated a random one;
>
> hri=scores,\x00\x00\x00\x00\x00\x10\xC1\xF4,1377264146360.05e30711673614f6b41a364c76f3f05f.,
> src=, dest=vamshi,60020,1377263789273; 2 (online=2, available=1) available
> servers
> 2013-08-23 18:56:36,911 ERROR
> org.apache.hadoop.hbase.executor.ExecutorService: Cannot submit
> [ClosedRegionHandler-vamshi,60000,1377263788019-39] because the executor is
> missing. Is this process shutting down?
> 2013-08-23 18:56:36,912 WARN
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> ZooKeeper exception:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for
> /hbase/unassigned/d476f8442ce31de90b60080b74daf47f
> 2013-08-23 18:56:36,912 INFO org.apache.hadoop.hbase.util.RetryCounter:
> Sleeping 2000ms before retry #1...
> 2013-08-23 18:56:36,914 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x140ab519b0f0000 closed
> 2013-08-23 18:56:36,914 INFO org.apache.hadoop.hbase.master.HMaster:
> HMaster main thread exiting
> 2013-08-23 18:56:36,914 ERROR
> org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> java.lang.RuntimeException: HMaster Aborted
>     at
>
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:160)
>     at
>
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at
>
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2100)
>
>
>
> My hbase-site.xml :
>
> <configuration>
>     <property>
>         <name>hbase.rootdir</name>
>
>     <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/hbstmp/</value>
>     </property>
>
>     <property>
>         <name>hbase.cluster.distributed</name>
>         <value>true</value>
>     </property>
>     <property>
>         <name>hbase.master</name>
>         <value>vamshi</value>
>     </property>
>     <property>
>         <name>hbase.zookeeper.property.clientPort</name>
>         <value>2181</value>
>     </property>
>
>
>    <property>
>         <name>hbase.hregion.max.filesize</name>
>         <value>10485760</value>
>     </property>
>
>
>
>     <property>
>         <name>hbase.zookeeper.quorum</name>
>         <value>vamshi</value>
>     </property>
>     <property>
>         <name>hbase.zookeeper.property.dataDir</name>
>         <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/zkptmp</value>
>     </property>
>
> <property>
>     <name>hbase.zookeeper.property.maxClientCnxns</name>
>     <value>1024</value>
>   </property>
>
> <property>
>     <name>hbase.coprocessor.user.region.classes</name>
>     <value>com.bil.coproc.ColumnAggregationEndpoint</value>
>   </property>
> </configuration>
>
>
>
>
> On Fri, Aug 23, 2013 at 7:00 PM, Frank Chow <zhoushuaif...@gmail.com>
> wrote:
>
> > Hi,
> > You should check whether compaction is running. If the data size in a
> > region exceeds the limit, the region will split and balance after a
> > major compaction (which usually occurs automatically).
> > You can run the compaction manually with the shell commands:
> > compact <tableName>, or major_compact <tableName>
> >
> >
> >
> >
> > Frank Chow
>
>
>
>
> --
> Regards,
> Vamshi Krishna
>



-- 
Regards,
Vamshi Krishna
