Re: want to join in slack
Sent, please check.

--
Best regards,
R.C

From: Jaze Lee
Sent: 29 September 2018 17:53
To: dev@hbase.apache.org
Subject: want to join in slack

Hello, my slack is tianq...@unitedstack.com, can you help me to join in hbase slack? Thanks a lot.

--
谦谦君子
[jira] [Created] (HBASE-21260) The whole balancer plans might be aborted if there are more than one plans to move the same region
Xiaolin Ha created HBASE-21260:

Summary: The whole balancer plans might be aborted if there are more than one plans to move the same region
Key: HBASE-21260
URL: https://issues.apache.org/jira/browse/HBASE-21260
Project: HBase
Issue Type: Bug
Components: Balancer, master
Affects Versions: 2.0.0, 2.1.0
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha

In SimpleLoadBalancer, plans are first generated per table from the average number of regions per server. Each server is randomly assigned either floor(average) or ceiling(average) regions (if the average is not an integer). Afterwards, however, the balanceOverall method might generate new plans for some regions of the table in order to balance server loads across the whole cluster. As a result, a single call of balance can contain more than one plan to move the same region.

Currently, branch-2 uses async procedures to execute balancer plans. Concurrent moves of the same region cause the balance method to fail, and once one plan hits an exception, none of the remaining plans are executed.
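To make the conflict concrete, here is a minimal Java sketch that groups a plan list by region and reports the regions targeted by more than one plan. `MovePlan` and `DuplicatePlanCheck` are hypothetical stand-ins invented for illustration; they are not the HBase balancer classes, and the report does not say how plans are actually represented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a balancer plan: move one region from source to destination. */
final class MovePlan {
    final String region;
    final String source;
    final String destination;

    MovePlan(String region, String source, String destination) {
        this.region = region;
        this.source = source;
        this.destination = destination;
    }
}

final class DuplicatePlanCheck {
    /** Groups plans by region and keeps only the regions that appear in two or more plans. */
    static Map<String, List<MovePlan>> findConflicts(List<MovePlan> plans) {
        Map<String, List<MovePlan>> byRegion = new LinkedHashMap<>();
        for (MovePlan plan : plans) {
            byRegion.computeIfAbsent(plan.region, r -> new ArrayList<>()).add(plan);
        }
        // Drop every region that is moved by a single plan only.
        byRegion.values().removeIf(group -> group.size() < 2);
        return byRegion;
    }
}
```

A non-empty result from `findConflicts` would mean that one balance call contains competing moves for the same region, which is the situation described above.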
We hit this problem in practice. The logs are as follows:

{color:#205081}2018-09-26,12:12:38,224 INFO [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.HMaster: Balancer plans size is 3757, the balance interval is 79 ms, and the max number regions in transition is 25
2018-09-26,12:12:38,224 INFO [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.HMaster: balance hri=1588230740, source=c4-hadoop-tst-st99.bj,52900,1537522783781, destination=c4-hadoop-tst-st28.bj,52900,1537520009497
2018-09-26,12:12:38,325 INFO [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.HMaster: balance hri=1588230740, source=c4-hadoop-tst-st99.bj,52900,1537522783781, destination=c4-hadoop-tst-st29.bj,52900,1537522784188
2018-09-26,12:12:38,325 INFO [PEWorker-16] org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler: pid=119197, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE; TransitRegionStateProcedure table=hbase:meta, region=1588230740, REOPEN/MOVE checking lock on 1588230740
2018-09-26,12:12:38,325 ERROR [master/c4-hadoop-tst-ct15:52900.Chore.1] org.apache.hadoop.hbase.master.balancer.BalancerChore: Failed to balance.
org.apache.hadoop.hbase.HBaseIOException: rit=OPEN, location=c4-hadoop-tst-st99.bj,52900,1537522783781, table=hbase:meta, region=1588230740 is currently in transition
 at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:536)
 at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createMoveRegionProcedure(AssignmentManager.java:592)
 at org.apache.hadoop.hbase.master.assignment.AssignmentManager.moveAsync(AssignmentManager.java:609)
 at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1707)
 at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1622)
 at org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:49)
 at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745){color}

This is a serious problem because it often occurs when new RSs start or old RSs fail over. What is more, there is no effective way to bring the balance of the cluster back to normal.

The solution may be simple, though. We can catch the exception raised while executing a plan and just skip that plan, so that a failed plan does not affect the later plans in the list. Subsequent calls of balance can then pick up the failed and skipped plans.
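The skip-on-failure idea can be sketched in plain Java as below. This is only an illustration under assumptions: `PlanSketch`, `PlanSubmitter` and `SkipOnFailureBalance` are hypothetical names, and the loop is not the actual `HMaster.balance` code. The point is that the try/catch sits inside the loop, so one failing plan is recorded and the remaining plans are still submitted.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plan type; the real balancer plan class is not shown in the report. */
final class PlanSketch {
    final String region;
    final String destination;

    PlanSketch(String region, String destination) {
        this.region = region;
        this.destination = destination;
    }
}

/** Hypothetical hook that submits one move; it may fail, e.g. when the region is in transition. */
interface PlanSubmitter {
    void submit(PlanSketch plan) throws IOException;
}

final class SkipOnFailureBalance {
    /** Submits every plan in order and returns the plans that failed and were skipped. */
    static List<PlanSketch> submitAll(List<PlanSketch> plans, PlanSubmitter submitter) {
        List<PlanSketch> skipped = new ArrayList<>();
        for (PlanSketch plan : plans) {
            try {
                submitter.submit(plan);
            } catch (IOException e) {
                // Remember the failed plan and continue with the rest of the list.
                skipped.add(plan);
            }
        }
        return skipped;
    }
}
```

The returned list corresponds to the plans that a later balance call would have to pick up again.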
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21207) Add client side sorting functionality in master web UI for table and region server details.
[ https://issues.apache.org/jira/browse/HBASE-21207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu reopened HBASE-21207:

> Add client side sorting functionality in master web UI for table and region server details.
>
> Key: HBASE-21207
> URL: https://issues.apache.org/jira/browse/HBASE-21207
> Project: HBase
> Issue Type: Improvement
> Components: master, monitoring, UI, Usability
> Reporter: Archana Katiyar
> Assignee: Archana Katiyar
> Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.8
> Attachments: 14926e82-b929-11e8-8bdd-4ce4621f1118.png, 21207.branch-1.addendum.patch, 2724afd8-b929-11e8-8171-8b5b2ba3084e.png, HBASE-21207-branch-1.patch, HBASE-21207-branch-1.v1.patch, HBASE-21207-branch-2.v1.patch, HBASE-21207.patch, HBASE-21207.patch, HBASE-21207.v1.patch, edc5c812-b928-11e8-87e2-ce6396629bbc.png
>
> In the Master UI, we can see region server details such as requests per second and number of regions. Similarly, for tables we can see online regions and offline regions.
> It will help ops people detect hot spotting if we provide sort functionality in the UI.
[jira] [Created] (HBASE-21259) [amv2] Revived deadservers; recreated serverstatenode
stack created HBASE-21259:

Summary: [amv2] Revived deadservers; recreated serverstatenode
Key: HBASE-21259
URL: https://issues.apache.org/jira/browse/HBASE-21259
Project: HBase
Issue Type: Bug
Components: amv2
Affects Versions: 2.1.0
Reporter: stack
Assignee: stack
Fix For: 2.2.0, 2.1.1, 2.0.3

On startup, I see servers being revived; i.e. their serverstatenode is getting marked online even though it has just been processed by ServerCrashProcedure. It looks like this (in a patched server that reports whenever a serverstatenode is created):

{code}
2018-09-29 03:45:40,963 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=3982597, state=SUCCESS; ServerCrashProcedure server=vb1442.halxg.cloudera.com,22101,1536675314426, splitWal=true, meta=false in 1.0130sec
...
2018-09-29 03:45:43,733 INFO org.apache.hadoop.hbase.master.assignment.RegionStates: CREATING! vb1442.halxg.cloudera.com,22101,1536675314426
java.lang.RuntimeException: WHERE AM I?
 at org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1116)
 at org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1143)
 at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1464)
 at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:200)
 at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:369)
 at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
 at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
 at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1716)
 at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1494)
 at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:75)
 at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2022)
{code}

See how we've just finished an SCP, which will have removed the serverstatenode... but then we come across an unassign that references the server that was just processed. The unassign attempts to update the serverstatenode, and in doing so we create one if one is not present. We shouldn't be creating one.

I think I see this a lot because I am scheduling unassigns with hbck2. The servers crash and then come back up, with SCPs cleaning up the old server while unassign procedures are still waiting in the procedure executor queue. But it could happen at any time on a cluster, should an unassign happen to get scheduled near an SCP.
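The difference between a get-or-create lookup and a lookup-only path can be shown with a small, self-contained Java sketch. `ServerNodeRegistrySketch` is a hypothetical, simplified registry written for this example; it is not the HBase `RegionStates` class and does not claim to match the eventual fix.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical, simplified per-server registry; not the HBase RegionStates class. */
final class ServerNodeRegistrySketch {
    private final ConcurrentMap<String, Set<String>> nodes = new ConcurrentHashMap<>();

    /** Creates the node when it is missing; calling this for a dead server revives it. */
    Set<String> getOrCreate(String server) {
        return nodes.computeIfAbsent(server, s -> ConcurrentHashMap.<String>newKeySet());
    }

    /** Stand-in for crash processing: drops the node of a dead server. */
    void remove(String server) {
        nodes.remove(server);
    }

    /** Lookup-only update: records the region only if the server node still exists. */
    boolean addRegionIfServerKnown(String server, String region) {
        Set<String> regions = nodes.get(server);
        if (regions == null) {
            return false; // server already cleaned up, so do not recreate its node
        }
        regions.add(region);
        return true;
    }

    boolean contains(String server) {
        return nodes.containsKey(server);
    }
}
```

In this sketch, an update that goes through `addRegionIfServerKnown` after `remove` leaves the server absent, whereas the same update through `getOrCreate` would bring the node back, which mirrors the revival seen in the log.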
[jira] [Reopened] (HBASE-21257) misspelled words.[occured -> occurred]
[ https://issues.apache.org/jira/browse/HBASE-21257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhanggangxue reopened HBASE-21257:

> misspelled words.[occured -> occurred]
>
> Key: HBASE-21257
> URL: https://issues.apache.org/jira/browse/HBASE-21257
> Project: HBase
> Issue Type: Bug
> Reporter: zhanggangxue
> Priority: Trivial
> Labels: occured, occurred, typo
> Attachments: 0001-misspelled-words.-occured-occurred.patch
>
> I found some spelling errors on the master branch. Found in [misspelled words|https://github.com/apache/hbase/pull/91]
[jira] [Resolved] (HBASE-21257) misspelled words.[occured -> occurred]
[ https://issues.apache.org/jira/browse/HBASE-21257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhanggangxue resolved HBASE-21257.

Resolution: Fixed

> misspelled words.[occured -> occurred]
>
> Key: HBASE-21257
> URL: https://issues.apache.org/jira/browse/HBASE-21257
> Project: HBase
> Issue Type: Bug
> Reporter: zhanggangxue
> Priority: Trivial
> Labels: occured, occurred, typo
> Attachments: 0001-misspelled-words.-occured-occurred.patch
>
> I found some spelling errors on the master branch. Found in [misspelled words|https://github.com/apache/hbase/pull/91]
[jira] [Created] (HBASE-21258) Add resetting of flags for RS Group pre/post hooks in TestRSGroups
Ted Yu created HBASE-21258:

Summary: Add resetting of flags for RS Group pre/post hooks in TestRSGroups
Key: HBASE-21258
URL: https://issues.apache.org/jira/browse/HBASE-21258
Project: HBase
Issue Type: Test
Reporter: Ted Yu
Assignee: Ted Yu

Over in HBASE-20627, [~xucang] reminded me that the resetting of flags for RS Group pre/post hooks in TestRSGroups was absent.

This issue is to add the resetting of these flags before each subtest starts.
[jira] [Created] (HBASE-21257) misspelled words.[occured -> occurred]
zhanggangxue created HBASE-21257:

Summary: misspelled words.[occured -> occurred]
Key: HBASE-21257
URL: https://issues.apache.org/jira/browse/HBASE-21257
Project: HBase
Issue Type: Bug
Reporter: zhanggangxue
Attachments: 0001-misspelled-words.-occured-occurred.patch

I found some spelling errors on the master branch. Found in [misspelled words.[occured -> occurred] #91|https://github.com/apache/hbase/pull/91]
[jira] [Created] (HBASE-21256) Improve IntegrationTestBigLinkedList for testing huge data
Zephyr Guo created HBASE-21256:

Summary: Improve IntegrationTestBigLinkedList for testing huge data
Key: HBASE-21256
URL: https://issues.apache.org/jira/browse/HBASE-21256
Project: HBase
Issue Type: Improvement
Components: integration tests
Affects Versions: 3.0.0
Reporter: Zephyr Guo
Assignee: Zephyr Guo
Fix For: 3.0.0
Attachments: ITBLL-1.png, ITBLL-2.png

Recently, I used ITBLL to test some features at our company and ran into the following problems:

1. Generator is too slow in the generating stage. The root cause is SecureRandom: there is a global lock in SecureRandom (see the following picture). Using Random instead of SecureRandom speeds up this stage (a 500% speedup with 20 mappers). SecureRandom was introduced by HBASE-13382, but for generating random bytes, in my opinion, it is no different from Random. !ITBLL-1.png!
2. VerifyReducer spends 14% of its CPU in the format method. This is caused by creating the keyString variable. However, keyString is never used if the test result is correct (which is the most common case). Simply delaying the creation of keyString can yield a huge performance boost in the verifying stage. !ITBLL-2.png!
3. Argument checking is needed, because there are constraints between the arguments. If we break these constraints, we cannot get a correct circular list.
4. Make the big family value size configurable.
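The deferred-formatting idea from item 2 can be sketched as follows. This is an illustration under assumptions, not the ITBLL code: `LazyKeyStringSketch`, `formatKey` and `verify` are hypothetical names, and the call counter exists only to show when the expensive formatting runs.

```java
final class LazyKeyStringSketch {
    /** Counts how often the expensive formatting actually runs (illustration only). */
    static int formatCalls = 0;

    /** Hypothetical stand-in for an expensive key formatter: hex-encodes the key bytes. */
    static String formatKey(byte[] key) {
        formatCalls++;
        StringBuilder sb = new StringBuilder(key.length * 2);
        for (byte b : key) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Returns null for a valid row; the key string is built only on the failure path. */
    static String verify(byte[] key, boolean rowIsValid) {
        if (rowIsValid) {
            return null; // common case: no formatting cost at all
        }
        return "bad row: " + formatKey(key);
    }
}
```

Because the formatter is called only inside the failure branch, a run in which every row verifies cleanly never pays for building the string.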
want to join in slack
Hello, my slack is tianq...@unitedstack.com, can you help me to join in hbase slack? Thanks a lot. -- 谦谦君子
[jira] [Created] (HBASE-21255) Refactor TablePermission into three classes (Global, Namespace, Table)
Reid Chan created HBASE-21255:

Summary: Refactor TablePermission into three classes (Global, Namespace, Table)
Key: HBASE-21255
URL: https://issues.apache.org/jira/browse/HBASE-21255
Project: HBase
Issue Type: Improvement
Reporter: Reid Chan
Assignee: Reid Chan

A TODO in {{TablePermission.java}}
{code}
//TODO refactor this class
//we need to refacting this into three classes (Global, Table, Namespace)
{code}
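One possible shape for such a split is sketched below, purely as an assumption. All class names, fields and the string-based action set are hypothetical and were invented for this example; the issue itself does not describe the intended design.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical base class holding what all three scopes share: the granted actions. */
abstract class PermissionSketch {
    enum Scope { GLOBAL, NAMESPACE, TABLE }

    private final Set<String> actions;

    PermissionSketch(Set<String> actions) {
        this.actions = Collections.unmodifiableSet(new HashSet<>(actions));
    }

    abstract Scope scope();

    /** True when this permission grants the given action. */
    boolean implies(String action) {
        return actions.contains(action);
    }
}

/** Cluster-wide permission: no namespace or table attached. */
final class GlobalPermissionSketch extends PermissionSketch {
    GlobalPermissionSketch(Set<String> actions) { super(actions); }
    @Override Scope scope() { return Scope.GLOBAL; }
}

/** Permission limited to one namespace. */
final class NamespacePermissionSketch extends PermissionSketch {
    final String namespace;
    NamespacePermissionSketch(String namespace, Set<String> actions) {
        super(actions);
        this.namespace = namespace;
    }
    @Override Scope scope() { return Scope.NAMESPACE; }
}

/** Permission limited to one table. */
final class TablePermissionSketch extends PermissionSketch {
    final String table;
    TablePermissionSketch(String table, Set<String> actions) {
        super(actions);
        this.table = table;
    }
    @Override Scope scope() { return Scope.TABLE; }
}
```

With one class per scope, each subclass carries only the fields that make sense for it, instead of a single class with optional namespace and table fields.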