[jira] [Resolved] (HBASE-22154) Facing issue with HA of HBase
[ https://issues.apache.org/jira/browse/HBASE-22154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang resolved HBASE-22154. Resolution: Not A Problem > Facing issue with HA of HBase > - > > Key: HBASE-22154 > URL: https://issues.apache.org/jira/browse/HBASE-22154 > Project: HBase > Issue Type: Test >Reporter: James >Priority: Critical > Labels: /hbase-1.2.6.1 > > Hi Team, > I have set up an HA Hadoop cluster and the same for HBase. > When my active NameNode goes down, the standby NameNode becomes the active > NameNode; however, at the same time my backup HBase master does not become the active > HMaster (the active HMaster and Region server go down). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22047) LeaseException in Scan should be retried
Allan Yang created HBASE-22047: -- Summary: LeaseException in Scan should be retried Key: HBASE-22047 URL: https://issues.apache.org/jira/browse/HBASE-22047 Project: HBase Issue Type: Bug Affects Versions: 2.1.3, 2.0.4, 2.2.0 Reporter: Allan Yang We should retry LeaseException just as we do for other exceptions such as OutOfOrderScannerNextException and UnknownScannerException. Code in ClientScanner: {code:java} if ((cause != null && cause instanceof NotServingRegionException) || (cause != null && cause instanceof RegionServerStoppedException) || e instanceof OutOfOrderScannerNextException || e instanceof UnknownScannerException || e instanceof ScannerResetException) { // Pass. It is easier writing the if loop test as list of what is allowed rather than // as a list of what is not allowed... so if in here, it means we do not throw. if (retriesLeft <= 0) { throw e; // no more retries } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
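A minimal sketch of the change suggested above (illustrative only, not the committed patch): extend the retry test quoted from ClientScanner so that a LeaseException, whether it appears as the exception itself or as its cause, is also treated as retriable.

{code:java}
// Sketch only: same structure as the ClientScanner test quoted above, with
// LeaseException added to the list of retriable exceptions.
if ((cause != null && cause instanceof NotServingRegionException)
    || (cause != null && cause instanceof RegionServerStoppedException)
    || (cause != null && cause instanceof LeaseException)  // newly retriable
    || e instanceof OutOfOrderScannerNextException
    || e instanceof UnknownScannerException
    || e instanceof LeaseException                         // newly retriable
    || e instanceof ScannerResetException) {
  // Still a list of what is allowed; if we are in here, we do not throw.
  if (retriesLeft <= 0) {
    throw e; // no more retries
  }
}
{code}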
[jira] [Resolved] (HBASE-22043) HMaster Went down
[ https://issues.apache.org/jira/browse/HBASE-22043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang resolved HBASE-22043. Resolution: Not A Problem > HMaster Went down > - > > Key: HBASE-22043 > URL: https://issues.apache.org/jira/browse/HBASE-22043 > Project: HBase > Issue Type: Bug > Components: Admin >Reporter: James >Priority: Critical > > HMaster went down > /hbase/WALs/regionserver80-XXXsplitting is non empty': Directory is > not empty -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21962) Filters do not work in ThriftTable
Allan Yang created HBASE-21962: -- Summary: Filters do not work in ThriftTable Key: HBASE-21962 URL: https://issues.apache.org/jira/browse/HBASE-21962 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0, 2.2.0 Filters in ThriftTable are not working; this issue is to fix that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21809) Add retry thrift client for ThriftTable/Admin
Allan Yang created HBASE-21809: -- Summary: Add retry thrift client for ThriftTable/Admin Key: HBASE-21809 URL: https://issues.apache.org/jira/browse/HBASE-21809 Project: HBase Issue Type: Sub-task Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0, 2.2.0 This adds a retrying thrift client so that ThriftTable/Admin can handle exceptions like connection loss. It is only available for the HTTP thrift client; for clients using TSocket it is not so easy to implement a retry client, so that may come later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
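To illustrate the general shape of such a retry wrapper (a hypothetical sketch, not the actual HBase patch; the class name and retry policy here are made up), a thrift-generated client interface can be wrapped in a dynamic proxy that replays a failed call a bounded number of times:

{code:java}
// Hypothetical illustration of a generic retrying wrapper for a
// thrift-generated client interface.
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public final class RetryingClientFactory {
  @SuppressWarnings("unchecked")
  public static <T> T wrap(Class<T> iface, T delegate, int maxRetries) {
    InvocationHandler handler = (proxy, method, args) -> {
      Throwable last = null;
      for (int attempt = 0; attempt <= maxRetries; attempt++) {
        try {
          return method.invoke(delegate, args);
        } catch (InvocationTargetException e) {
          last = e.getCause();
          // A real client would also re-open the underlying HTTP transport
          // here before the next attempt.
        }
      }
      throw last;
    };
    return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] { iface }, handler);
  }
}
{code}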
[jira] [Created] (HBASE-21754) ReportRegionStateTransitionRequest should be executed in priority executor
Allan Yang created HBASE-21754: -- Summary: ReportRegionStateTransitionRequest should be executed in priority executor Key: HBASE-21754 URL: https://issues.apache.org/jira/browse/HBASE-21754 Project: HBase Issue Type: Bug Affects Versions: 2.0.4, 2.1.2 Reporter: Allan Yang Assignee: Allan Yang Now, ReportRegionStateTransitionRequest is executed in the default handlers; only regions of system tables are handled in the priority handlers. That is because we have only two kinds of handlers in the master, default and priority (the replication handler is for replication specifically). If the transition reports for all regions were executed in the priority handlers, there would be a deadlock: other regions' transition reports take all the handlers and need to update meta, but the meta region cannot report online since all handlers are taken (this is addressed in the comments of MasterAnnotationReadingPriorityFunction). But there is another deadlock case: a user's DDL requests (or other sync ops like moving a region) take over all the default handlers, making region transition reports impossible, so those sync ops can't complete either. A simple UT provided in the patch shows this case. To resolve this problem, I added a new metaTransitionExecutor that executes meta region transition reports only, and all the other regions' reports are executed in the priority handlers, separating them from user requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
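A rough sketch of the routing idea described above (names and pool sizes are illustrative, not the actual patch): keep a small dedicated executor for meta region transition reports and send every other region's report to the priority pool, so neither user requests nor ordinary reports can starve meta.

{code:java}
// Illustrative only: route region transition reports by whether they concern
// a meta region. Executor sizes and the isMetaRegion flag are assumptions.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TransitionReportDispatcher {
  private final ExecutorService metaTransitionExecutor = Executors.newSingleThreadExecutor();
  private final ExecutorService priorityExecutor = Executors.newFixedThreadPool(20);

  void dispatch(Runnable reportCall, boolean isMetaRegion) {
    if (isMetaRegion) {
      // Meta reports get their own worker so they can never be starved by
      // other regions' reports or by user DDL requests.
      metaTransitionExecutor.execute(reportCall);
    } else {
      priorityExecutor.execute(reportCall);
    }
  }
}
{code}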
[jira] [Resolved] (HBASE-14223) Meta WALs are not cleared if meta region was closed and RS aborts
[ https://issues.apache.org/jira/browse/HBASE-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang resolved HBASE-14223. Resolution: Fixed > Meta WALs are not cleared if meta region was closed and RS aborts > - > > Key: HBASE-14223 > URL: https://issues.apache.org/jira/browse/HBASE-14223 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar >Priority: Major > Fix For: 3.0.0, 1.5.0, 2.2.0 > > Attachments: HBASE-14223logs, hbase-14223_v0.patch, > hbase-14223_v1-branch-1.patch, hbase-14223_v2-branch-1.patch, > hbase-14223_v3-branch-1.patch, hbase-14223_v3-branch-1.patch, > hbase-14223_v3-master.patch > > > When an RS opens meta, and later closes it, the WAL(FSHlog) is not closed. > The last WAL file just sits there in the RS WAL directory. If RS stops > gracefully, the WAL file for meta is deleted. Otherwise if RS aborts, WAL for > meta is not cleaned. It is also not split (which is correct) since master > determines that the RS no longer hosts meta at the time of RS abort. > From a cluster after running ITBLL with CM, I see a lot of {{-splitting}} > directories left uncleaned: > {code} > [root@os-enis-dal-test-jun-4-7 cluster-os]# sudo -u hdfs hadoop fs -ls > /apps/hbase/data/WALs > Found 31 items > drwxr-xr-x - hbase hadoop 0 2015-06-05 01:14 > /apps/hbase/data/WALs/hregion-58203265 > drwxr-xr-x - hbase hadoop 0 2015-06-05 07:54 > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433489308745-splitting > drwxr-xr-x - hbase hadoop 0 2015-06-05 09:28 > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433494382959-splitting > drwxr-xr-x - hbase hadoop 0 2015-06-05 10:01 > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433498252205-splitting > ... > {code} > The directories contain WALs from meta: > {code} > [root@os-enis-dal-test-jun-4-7 cluster-os]# sudo -u hdfs hadoop fs -ls > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting > Found 2 items > -rw-r--r-- 3 hbase hadoop 201608 2015-06-05 03:15 > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433470511501.meta > -rw-r--r-- 3 hbase hadoop 44420 2015-06-05 04:36 > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433474111645.meta > {code} > The RS hosted the meta region for some time: > {code} > 2015-06-05 03:14:28,692 INFO [PostOpenDeployTasks:1588230740] > zookeeper.MetaTableLocator: Setting hbase:meta region location in ZooKeeper > as os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285 > ... 
> 2015-06-05 03:15:17,302 INFO > [RS_CLOSE_META-os-enis-dal-test-jun-4-5:16020-0] regionserver.HRegion: Closed > hbase:meta,,1.1588230740 > {code} > In between, a WAL is created: > {code} > 2015-06-05 03:15:11,707 INFO > [RS_OPEN_META-os-enis-dal-test-jun-4-5:16020-0-MetaLogRoller] wal.FSHLog: > Rolled WAL > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433470511501.meta > with entries=385, filesize=196.88 KB; new WAL > /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433474111645.meta > {code} > When CM killed the region server later master did not see these WAL files: > {code} > ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:46,075 > INFO [MASTER_SERVER_OPERATIONS-os-enis-dal-test-jun-4-3:16000-0] > master.SplitLogManager: started splitting 2 logs in > [hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting] > for [os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285] > ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:47,300 > INFO [main-EventThread] wal.WALSplitter: Archived processed log > hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475074436 > to > hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/oldWALs/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475074436 > ./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:50,497 > INFO [main-EventThread] wal.WALSplitter: Archived processed l
[jira] [Created] (HBASE-21751) WAL create fails during region open may cause region assign forever fail
Allan Yang created HBASE-21751: -- Summary: WAL create fails during region open may cause region assign forever fail Key: HBASE-21751 URL: https://issues.apache.org/jira/browse/HBASE-21751 Project: HBase Issue Type: Bug Affects Versions: 2.0.4, 2.1.2 Reporter: Allan Yang Assignee: Allan Yang Fix For: 2.2.0, 2.1.3, 2.0.5 When the first region opens on an RS, WALFactory will create a WAL file, but if the WAL creation fails, in some cases HDFS will leave an empty file in the dir (e.g. disk full: the file is created successfully but block allocation fails). We have a check in AbstractFSWAL that if a WAL belonging to the same factory already exists, an error is thrown. Thus, the region can never be opened on this RS later. {code:java} 2019-01-17 02:15:53,320 ERROR [RS_OPEN_META-regionserver/server003:16020-0] handler.OpenRegionHandler(301): Failed open of region=hbase:meta,,1.1588230740 java.io.IOException: Target WAL already exists within directory hdfs://cluster/hbase/WALs/server003.hbase.hostname.com,16020,1545269815888 at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.<init>(AbstractFSWAL.java:382) at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.<init>(AsyncFSWAL.java:210) at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createWAL(AsyncFSWALProvider.java:72) at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createWAL(AsyncFSWALProvider.java:47) at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:138) at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(AbstractFSWALProvider.java:57) at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.java:264) at org.apache.hadoop.hbase.regionserver.HRegionServer.getWAL(HRegionServer.java:2085) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622) at java.lang.Thread.run(Thread.java:834) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
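One way to defuse this situation (a hedged sketch under the assumption that the leftover file is zero-length and therefore safe to remove; this is not the committed fix) is to look for an empty, never-written WAL file in the RS WAL directory and delete it before giving up on the open:

{code:java}
// Sketch only: remove a zero-length WAL left behind by a failed create, so a
// later open attempt does not trip the "Target WAL already exists" check.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

final class LeftoverWalCleaner {
  static void deleteIfEmpty(FileSystem fs, Path walPath) throws IOException {
    if (fs.exists(walPath)) {
      FileStatus status = fs.getFileStatus(walPath);
      if (status.getLen() == 0) {
        // Zero bytes were ever written, so nothing is lost by deleting it.
        fs.delete(walPath, false);
      }
    }
  }
}
{code}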
[jira] [Reopened] (HBASE-21652) Refactor ThriftServer making thrift2 server inherited from thrift1 server
[ https://issues.apache.org/jira/browse/HBASE-21652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang reopened HBASE-21652: > Refactor ThriftServer making thrift2 server inherited from thrift1 server > - > > Key: HBASE-21652 > URL: https://issues.apache.org/jira/browse/HBASE-21652 > Project: HBase > Issue Type: Sub-task >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21652.addendum.patch, HBASE-21652.branch-2.patch, > HBASE-21652.patch, HBASE-21652.v2.patch, HBASE-21652.v3.patch, > HBASE-21652.v4.patch, HBASE-21652.v5.patch, HBASE-21652.v6.patch, > HBASE-21652.v7.patch > > > Apart from the different protocol, the thrift2 server should not differ much > from the thrift1 server. So refactor the thrift server, making the thrift2 server > inherit from the thrift1 server and getting rid of a lot of duplicated code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21661) Provide Thrift2 implementation of Table/Admin
Allan Yang created HBASE-21661: -- Summary: Provide Thrift2 implementation of Table/Admin Key: HBASE-21661 URL: https://issues.apache.org/jira/browse/HBASE-21661 Project: HBase Issue Type: Sub-task Environment: Provide a Thrift2 implementation of Table/Admin, making it easier for Java users to use the thrift client (some environments which cannot expose ZK or RS servers directly require the thrift or REST protocol even when using Java). Another example of this is RemoteHTable and RemoteAdmin, which are REST connectors. Reporter: Allan Yang Assignee: Allan Yang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21652) Refactor ThriftServer making thrift2 server to support both thrift1 and thrift2 protocol
Allan Yang created HBASE-21652: -- Summary: Refactor ThriftServer making thrift2 server to support both thrift1 and thrift2 protocol Key: HBASE-21652 URL: https://issues.apache.org/jira/browse/HBASE-21652 Project: HBase Issue Type: Sub-task Reporter: Allan Yang Assignee: Allan Yang Apart from the different protocol, the thrift2 server should not differ much from the thrift1 server. So refactor the thrift server, making the thrift2 server inherit from the thrift1 server, getting rid of a lot of duplicated code and letting the thrift2 server serve the thrift1 protocol at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21650) Add DDL operation and some other miscellaneous to thrift2
Allan Yang created HBASE-21650: -- Summary: Add DDL operation and some other miscellaneous to thrift2 Key: HBASE-21650 URL: https://issues.apache.org/jira/browse/HBASE-21650 Project: HBase Issue Type: Sub-task Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0, 2.2.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21649) Complete Thrift2 to supersede Thrift1
Allan Yang created HBASE-21649: -- Summary: Complete Thrift2 to supersede Thrift1 Key: HBASE-21649 URL: https://issues.apache.org/jira/browse/HBASE-21649 Project: HBase Issue Type: Umbrella Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0, 2.2.0 Thrift1 and Thrift2 have coexisted in our project for a very long time. Functionality is more complete in thrift1, but its interface design is bad for adding new features (we have get(), getVer(), getVerTs(), getRowWithColumns() and so many other methods for a single get request, which is bad). Thrift2 has a cleaner interface and structure definition, making it easier for users to use. But it has not been updated for a long time, and the lack of DDL methods is a major weakness. I think we should complete Thrift2 and supersede Thrift1, making Thrift2 the standard multi-language definition. This is an umbrella issue to make it happen. The plan would be: 1. complete the DDL interface of thrift2; 2. make the thrift2 server able to handle thrift1 requests, so users don't have to choose which thrift server to start; 3. deprecate thrift1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21392) HTable can still write data after calling the close method.
[ https://issues.apache.org/jira/browse/HBASE-21392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang reopened HBASE-21392: > HTable can still write data after calling the close method. > --- > > Key: HBASE-21392 > URL: https://issues.apache.org/jira/browse/HBASE-21392 > Project: HBase > Issue Type: Improvement > Components: Client >Affects Versions: 1.2.0, 2.1.0, 2.0.0 > Environment: HBase 1.2.0 >Reporter: lixiaobao >Assignee: lixiaobao >Priority: Major > Attachments: HBASE-21392.patch > > Original Estimate: 1h > Remaining Estimate: 1h > > HTable can still write data after calling the close method. > > {code:java} > val conn = ConnectionFactory.createConnection(conf) > var table = conn.getTable(TableName.valueOf(tableName)) > val put = new Put(rowKey.getBytes()) > put.addColumn("cf".getBytes(), columnField.getBytes(), endTimeLong, > Bytes.toBytes(line.getLong(8))) > table.put(put) > //call table close() method > table.close() > //put again > val put1 = new Put(rowKey4.getBytes()) > put1.addColumn("cf".getBytes(), columnField.getBytes(), endTimeLong, > Bytes.toBytes(line.getLong(8))) > table.put(put1) > {code} > > After calling the close method, we can still write data into HBase; I think this does not > match the close semantics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21469) Re-visit post* hooks in DDL operations
Allan Yang created HBASE-21469: -- Summary: Re-visit post* hooks in DDL operations Key: HBASE-21469 URL: https://issues.apache.org/jira/browse/HBASE-21469 Project: HBase Issue Type: Bug Affects Versions: 2.0.2, 2.1.1 Reporter: Allan Yang Assignee: Allan Yang I had some discussion in HBASE-19953, starting [here|https://issues.apache.org/jira/browse/HBASE-19953?focusedCommentId=16673126&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16673126]. In HBASE-19953, [~elserj] wanted to make sure that the post* hooks are called only when the procedures finish. But it accidentally turned modify table and truncate table requests into sync calls, which makes clients hit RPC timeouts easily on big tables. We should re-visit those postxxx hooks in DDL operations, because they are not consistent now: for DDLs other than modify table and truncate table, although the call waits on the latch, the latch is actually released just after the prepare state, so we still call the postxxx hooks before the operation finishes. For modify table and truncate table, the latch is only released after the whole procedure finishes, so the effort works there (but will cause RPC timeouts). I think these latches are designed for compatibility with 1.x clients. Take ModifyTable for example: in 1.x we use admin.getAlterStatus() to check the alter status, but in 2.x this method is deprecated and returns inaccurate results, so we have to make the 1.x client wait synchronously. And for the semantics of postxxx hooks in 1.x, we call them after the corresponding DDL request returns, but the DDL request may not have finished either, since we don't wait for region assignment. So here we need to discuss the semantics of the postxxx hooks in DDL operations and make them consistent across all DDL operations: do we really need to make sure these hooks are called only after the operation finishes? What's more, we have the postCompletedxxx hooks for that need. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21423) Procedures for meta table/region should be able to execute in separate workers
[ https://issues.apache.org/jira/browse/HBASE-21423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang resolved HBASE-21423. Resolution: Fixed Opened HBASE-21468 for the addendum, closing this one > Procedures for meta table/region should be able to execute in separate > workers > --- > > Key: HBASE-21423 > URL: https://issues.apache.org/jira/browse/HBASE-21423 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.1, 2.0.2 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 2.0.3, 2.1.2 > > Attachments: HBASE-21423.branch-2.0.001.patch, > HBASE-21423.branch-2.0.002.patch, HBASE-21423.branch-2.0.003.patch, > HBASE-21423.branch-2.0.addendum.patch > > > We have a higher priority for meta table procedures, but only at the queue level. > There is a case where the meta table is closed and an AssignProcedure (or RTSP > in branch-2+) is waiting there to be executed, but at the same time all the > worker threads are executing procedures that need to write to the meta table; then all > the workers get stuck retrying the meta writes, and no worker will take the > AP for meta. > Though we have a mechanism that detects the stuck state and adds more > 'KeepAlive' workers to the pool to resolve it, by then it has already been stuck for a > long time. > This is a real case I encountered in ITBLL. > So, I add one 'urgent worker' to the ProcedureExecutor, which only takes meta > procedures (other workers can take meta procedures too), which can resolve > this kind of stuck state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21468) separate workers for meta table is not working
Allan Yang created HBASE-21468: -- Summary: separate workers for meta table is not working Key: HBASE-21468 URL: https://issues.apache.org/jira/browse/HBASE-21468 Project: HBase Issue Type: Bug Affects Versions: 2.0.2, 2.1.1 Reporter: Allan Yang Assignee: Allan Yang This is an addendum for HBASE-21423; since HBASE-21423 is already closed, the QA won't be triggered. It is my mistake that the separate workers for the meta table are not working: when polling from the queue, the onlyUrgent flag is not passed in. And for some UTs that require only one worker thread, urgent workers should be set to 0 to ensure there is only one worker at a time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21423) Procedures for meta table/region should be able to execute in separate workers
[ https://issues.apache.org/jira/browse/HBASE-21423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang reopened HBASE-21423: > Procedures for meta table/region should be able to execute in separate > workers > --- > > Key: HBASE-21423 > URL: https://issues.apache.org/jira/browse/HBASE-21423 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.1, 2.0.2 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 2.0.3, 2.1.2 > > Attachments: HBASE-21423.branch-2.0.001.patch, > HBASE-21423.branch-2.0.002.patch, HBASE-21423.branch-2.0.003.patch > > > We have a higher priority for meta table procedures, but only at the queue level. > There is a case where the meta table is closed and an AssignProcedure (or RTSP > in branch-2+) is waiting there to be executed, but at the same time all the > worker threads are executing procedures that need to write to the meta table; then all > the workers get stuck retrying the meta writes, and no worker will take the > AP for meta. > Though we have a mechanism that detects the stuck state and adds more > 'KeepAlive' workers to the pool to resolve it, by then it has already been stuck for a > long time. > This is a real case I encountered in ITBLL. > So, I add one 'urgent worker' to the ProcedureExecutor, which only takes meta > procedures (other workers can take meta procedures too), which can resolve > this kind of stuck state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21423) Procedures for meta table/region should be able to execute in separate workers
Allan Yang created HBASE-21423: -- Summary: Procedures for meta table/region should be able to execute in separate workers Key: HBASE-21423 URL: https://issues.apache.org/jira/browse/HBASE-21423 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.2, 2.1.1 Reporter: Allan Yang Assignee: Allan Yang We have a higher priority for meta table procedures, but only at the queue level. There is a case where the meta table is closed and an AssignProcedure (or RTSP in branch-2+) is waiting there to be executed, but at the same time all the worker threads are executing procedures that need to write to the meta table; then all the workers get stuck retrying the meta writes, and no worker will take the AP for meta. Though we have a mechanism that detects the stuck state and adds more 'KeepAlive' workers to the pool to resolve it, by then it has already been stuck for a long time. This is a real case I encountered in ITBLL. So, I add one 'urgent worker' to the ProcedureExecutor, which only takes meta procedures (other workers can take meta procedures too), which can resolve this kind of stuck state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21421) Do not kill RS if reportOnlineRegions fails
Allan Yang created HBASE-21421: -- Summary: Do not kill RS if reportOnlineRegions fails Key: HBASE-21421 URL: https://issues.apache.org/jira/browse/HBASE-21421 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.2, 2.1.1 Reporter: Allan Yang Assignee: Allan Yang In the periodic regionServerReport call from RS to master, we check master.getAssignmentManager().reportOnlineRegions() to see whether the RS has a different state from the Master. If the RS holds a region that the master thinks should be on another RS, the Master will kill the RS. But the regionServerReport could be lagging (due to the network or something), so it may not represent the current state of the RegionServer. Besides, when onlining a region we call reportRegionStateTransition and retry forever until it is successfully reported to the master, so we can count on the reportRegionStateTransition calls. I have encountered cases where regions are closed on the RS and reported to the master successfully via reportRegionStateTransition, but later a lagging regionServerReport tells the master the region is online on the RS (which is not true at that moment; the call may have been generated some time ago and delayed by the network somehow). Then the master thinks the region should be on another RS and kills the RS, which should not happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on
Allan Yang created HBASE-21395: -- Summary: Abort split/merge procedure if there is a table procedure of the same table going on Key: HBASE-21395 URL: https://issues.apache.org/jira/browse/HBASE-21395 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.2, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang In my ITBLL runs, I often see that if a split/merge procedure and a table procedure (like ModifyTableProcedure) happen at the same time, the race conditions between these two kinds of procedures cause some serious problems, e.g. the split/merged parent is brought online by the table procedure, or the split/merged region makes the whole table procedure roll back. I talked with [~Apache9] offline today; this kind of problem is solved in branch-2+ since there is a fence so that only one RTSP can run against a single region at the same time. To keep out of this mess in branch-2.0 and branch-2.1, I added a simple safety fence in the split/merge procedure: if there is a table procedure going on against the same table, then abort the split/merge procedure. Aborting the split/merge procedure at the beginning of its execution is no big deal, compared with the mess it would otherwise cause... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21364) Procedure holding the lock should be put to the front of the queue after restart
[ https://issues.apache.org/jira/browse/HBASE-21364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang resolved HBASE-21364. Resolution: Fixed Fix Version/s: 2.2.0 3.0.0 > Procedure holding the lock should be put to the front of the queue after restart > --- > > Key: HBASE-21364 > URL: https://issues.apache.org/jira/browse/HBASE-21364 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0, 2.0.2 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Blocker > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21364.branch-2.0.001.patch, > HBASE-21364.branch-2.0.002.patch > > > After restoring the procedures from the procedure WALs, we put the runnable > procedures back into the queue to execute. The order was not a problem before > HBASE-20846, since the first one to execute would acquire the lock itself. But > the locks are restored after HBASE-20846, so if we execute a procedure > without the lock before a procedure with the lock in the same queue, > there is a race condition where we may not be able to execute any of the procedures > in that queue at all. > The race condition is: > 1. A procedure that needs to take the table's exclusive lock is put into the > table's queue, but the table's shared lock is held by a region procedure. > Since no one takes the exclusive lock, the queue is put into the run queue to > execute. But soon the worker thread sees the procedure can't execute because > it doesn't hold the lock, so it stops executing and removes the queue from the > run queue. > 2. At the same time, the region procedure which holds the table's shared lock > and the region's exclusive lock is put into the table's queue. But since the > queue was already added to the run queue, it won't be added again. > 3. Because of 1, the table's queue is removed from the run queue. > 4. Then no one will put the table's queue back, so no worker will execute > the procedures inside. > A test case in the patch shows how. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21384) Procedure with holdLock=false should not have its lock restored when the master restarts
Allan Yang created HBASE-21384: -- Summary: Procedure with holdLock=false should not have its lock restored when the master restarts Key: HBASE-21384 URL: https://issues.apache.org/jira/browse/HBASE-21384 Project: HBase Issue Type: Sub-task Reporter: Allan Yang Assignee: Allan Yang Yet another stuck case, similar to HBASE-21364. The case is: 1. A ModifyProcedure spawned a ReopenTableProcedure, and since its holdLock=false, it released the lock. 2. The ReopenTableProcedure spawned several MoveRegionProcedures; it also has holdLock=false, but just after it stored the children procedures to the WAL and began to release the lock, the master was killed. 3. When restarting, the ReopenTableProcedure's lock was restored (since it held the lock before, which is not right, because it is in WAITING state now and its holdLock=false). 4. After the restart, the MoveRegionProcedure can execute since its parent has the lock, but when it spawned the AssignProcedure, the AssignProcedure couldn't execute anymore, since its parent didn't have the lock but its 'grandpa', the ReopenTableProcedure, did. 5. Restarting the master again does not help, because we will restore the lock for the ReopenTableProcedure again. Two fixes: 1. We should not restore the lock if the procedure doesn't hold the lock and is in WAITING state. 2. Procedures that don't have the lock but whose parent has it should also be put at the front of the queue, as an addendum to HBASE-21364. Discussion: should we check the lock of all ancestors, not only the parent? As addressed in the comments of the patch, after fixing the issue above, checking the parent is enough. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21376) Add some verbose log to MasterProcedureScheduler
Allan Yang created HBASE-21376: -- Summary: Add some verbose log to MasterProcedureScheduler Key: HBASE-21376 URL: https://issues.apache.org/jira/browse/HBASE-21376 Project: HBase Issue Type: Sub-task Reporter: Allan Yang Assignee: Allan Yang As discussed in HBASE-21364, we divided the patch there into two parts; the critical one was already submitted in HBASE-21364 to branch-2.0 and branch-2.1, but I also added some useful logs which need to be committed to all branches. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21364) Procedure holding the lock should be put to the front of the queue after restart
Allan Yang created HBASE-21364: -- Summary: Procedure holding the lock should be put to the front of the queue after restart Key: HBASE-21364 URL: https://issues.apache.org/jira/browse/HBASE-21364 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.2, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang After restoring the procedures from the procedure WALs, we put the runnable procedures back into the queue to execute. The order was not a problem before HBASE-20846, since the first one to execute would acquire the lock itself. But the locks are restored after HBASE-20846, so if we execute a procedure without the lock before a procedure with the lock in the same queue, there is a race condition where we may not be able to execute any of the procedures in that queue at all. The race condition is: 1. A procedure that needs to take the table's exclusive lock is put into the table's queue, but the table's shared lock is held by a region procedure. Since no one takes the exclusive lock, the queue is put into the run queue to execute. But soon the worker thread sees the procedure can't execute because it doesn't hold the lock, so it stops executing and removes the queue from the run queue. 2. At the same time, the region procedure which holds the table's shared lock and the region's exclusive lock is put into the table's queue. But since the queue was already added to the run queue, it won't be added again. 3. Because of 1, the table's queue is removed from the run queue. 4. Then no one will put the table's queue back, so no worker will execute the procedures inside. A test case in the patch shows how. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21357) RS should abort if OOM in Reader thread
Allan Yang created HBASE-21357: -- Summary: RS should abort if OOM in Reader thread Key: HBASE-21357 URL: https://issues.apache.org/jira/browse/HBASE-21357 Project: HBase Issue Type: Bug Affects Versions: 1.4.8 Reporter: Allan Yang Assignee: Allan Yang It is a bit strange: we abort the RS if OOM happens in the Listener thread, the Responder thread and the CallRunner threads, but not in the Reader threads... We should abort the RS if OOM happens in a Reader thread, too. If not, the reader thread exits because of the OOM and its selector closes. Later connections selected to this reader will be ignored: {code:java} try { if (key.isValid()) { if (key.isAcceptable()) doAccept(key); } } catch (IOException ignored) { if (LOG.isTraceEnabled()) LOG.trace("ignored", ignored); } {code} leaving the client's (or the Master's and other RSes') calls to wait until SocketTimeout. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
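A minimal sketch of the suggested behavior (illustrative only, not the committed patch; the abort() call, field names and logger stand in for whatever the RpcServer actually uses):

{code:java}
// Sketch only: mirror what the Listener/Responder/CallRunner threads already
// do, so an OutOfMemoryError in a Reader takes the RS down instead of silently
// killing the selector loop.
private void doRunLoop() {
  while (running) {
    try {
      readSelector.select();
      // ... iterate selected keys and read from the connections as before ...
    } catch (OutOfMemoryError e) {
      LOG.error("Out of memory in RPC Reader thread, aborting region server", e);
      abort("OOM in RPC Reader", e); // assumed abort hook
      return;
    } catch (IOException e) {
      LOG.warn("RPC Reader caught IOException", e);
    }
  }
}
{code}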
[jira] [Created] (HBASE-21354) Procedure may be deleted improperly during master restarts resulting in
Allan Yang created HBASE-21354: -- Summary: Procedure may be deleted improperly during master restarts resulting in Key: HBASE-21354 URL: https://issues.apache.org/jira/browse/HBASE-21354 Project: HBase Issue Type: Sub-task Reporter: Allan Yang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21292) IdLock.getLockEntry() may hang if interrupted
Allan Yang created HBASE-21292: -- Summary: IdLock.getLockEntry() may hang if interrupted Key: HBASE-21292 URL: https://issues.apache.org/jira/browse/HBASE-21292 Project: HBase Issue Type: Bug Reporter: Allan Yang Assignee: Allan Yang Fix For: 1.4.9, 2.0.2, 2.1.0 This is a rare case found by a colleague which really happened in our production env. A thread may hang (or enter an infinite loop) when calling IdLock.getLockEntry(). Here is the case: 1. Thread1 owned the lock entry, while Thread2 (the only one waiting) was waiting for it. 2. Thread1 called releaseLockEntry, which sets the entry's locked flag to false, but since Thread2 was waiting, it won't call map.remove(entry.id). 3. While Thread1 was calling releaseLockEntry, Thread2 was interrupted, so no one removes this entry from the map. 4. If another thread then calls getLockEntry for this id, it ends up in an infinite loop, since (existing = map.putIfAbsent(entry.id, entry)) != null and existing.locked == false. It is hard to write a UT since it is a very rare race condition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
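A simplified illustration of the loop described above (a condensed sketch of the IdLock.getLockEntry() pattern, not the real HBase source): once an unlocked entry is stranded in the map with no waiters, the retry below spins forever because nobody ever removes or re-signals the entry.

{code:java}
// Condensed sketch of the problematic pattern, for illustration only.
import java.io.InterruptedIOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class IdLockSketch {
  static class Entry {
    final long id;
    boolean locked = true;
    int numWaiters = 0;
    Entry(long id) { this.id = id; }
  }

  private final ConcurrentMap<Long, Entry> map = new ConcurrentHashMap<>();

  Entry getLockEntry(long id) throws InterruptedIOException {
    Entry entry = new Entry(id);
    Entry existing;
    while ((existing = map.putIfAbsent(id, entry)) != null) {
      synchronized (existing) {
        if (existing.locked) {
          existing.numWaiters++;
          while (existing.locked) {
            try {
              existing.wait();
            } catch (InterruptedException e) {
              existing.numWaiters--;
              // Bug window: we leave without removing the entry, while the
              // releaser already chose notify() over map.remove() because it
              // saw numWaiters > 0 -- the entry is now stranded and unlocked.
              throw new InterruptedIOException("interrupted acquiring id lock");
            }
          }
          existing.numWaiters--;
          existing.locked = true;
          return existing;
        }
        // Unlocked entry: loop and retry putIfAbsent. If the stranded entry is
        // never removed from the map, the retry never succeeds -- the
        // "infinite loop" in the report.
      }
    }
    return entry;
  }
}
{code}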
[jira] [Created] (HBASE-21288) HostingServer in UnassignProcedure is not accurate
Allan Yang created HBASE-21288: -- Summary: HostingServer in UnassignProcedure is not accurate Key: HBASE-21288 URL: https://issues.apache.org/jira/browse/HBASE-21288 Project: HBase Issue Type: Sub-task Components: amv2, Balancer Affects Versions: 2.0.2, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang We have a case where a region shows status OPEN on an already dead server in the meta table (it is hard to trace how this happened), meaning this region is actually not online. But the balancer came and scheduled a MoveRegionProcedure for this region, which created a mess: the balancer 'thought' this region was on the server which has the same address (but with a different startcode). So it schedules an MRP from this online server to another, but the UnassignProcedure dispatches the unassign call to the dead server according to the region state, then finds the server dead and schedules an SCP for the dead server. But since the UnassignProcedure's hostingServer is not accurate, the SCP can't interrupt it. So, in the end, the SCP can't finish since the UnassignProcedure has the region's lock, and the UnassignProcedure can't finish since no one wakes it, thus stuck. Here is the log; notice that the server of the UnassignProcedure is 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584' but it was dispatched to 'hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964' {code} 2018-10-10 14:34:50,011 INFO [PEWorker-4] assignment.RegionTransitionProcedure(252): Dispatch pid=13, ppid=12, state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 2018-10-10 14:34:50,011 WARN [PEWorker-4] assignment.RegionTransitionProcedure(230): Remote call failed hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584; rit=CLOSING, location=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; exception=NoServerDispatchException org.apache.hadoop.hbase.procedure2.NoServerDispatchException: hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964; pid=13, ppid=12, state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f, server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539153278584 //Then a SCP was scheduled 2018-10-10 14:34:50,012 WARN [PEWorker-4] master.ServerManager(635): Expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 but server not online 2018-10-10 14:34:50,012 INFO [PEWorker-4] master.ServerManager(615): Processing expiration of hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964 on hb-uf6oyi699w8h700f0-001.hbase.rds. ,16000,1539088156164 2018-10-10 14:34:50,017 DEBUG [PEWorker-4] procedure2.ProcedureExecutor(1089): Stored pid=14, state=RUNNABLE:SERVER_CRASH_START, hasLock=false; ServerCrashProcedure server=hb-uf6oyi699w8h700f0-003.hbase.rds. 
,16020,1539076734964, splitWal=true, meta=false //The SCP did not interrupt the UnassignProcedure but scheduled new AssignProcedures for this region 2018-10-10 14:34:50,043 DEBUG [PEWorker-6] procedure.ServerCrashProcedure(250): Done splitting WALs pid=14, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, hasLock=true; ServerCrashProcedure server=hb-uf6oyi699w8h700f0-003.hbase.rds. ,16020,1539076734964, splitWal=true, meta=false 2018-10-10 14:34:50,054 INFO [PEWorker-8] procedure2.ProcedureExecutor(1691): Initialized subprocedures=[{pid=15, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure table=hbase:acl, region=267335c85766c62479fb4a5f18a1e95f}, {pid=16, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure table=hbase:req_intercept_rule, region=460481706415d776b3742f428a6f579b}, {pid=17, ppid=14, state=RUNNABLE:REGION_TRANSITION_QUEUE, hasLock=false; AssignProcedure table=hbase:namespace, region=ec7a965e7302840120a5d8289947c40b}] {code} Here I also added a safety fence in the balancer: if such regions are found, balancing is skipped to be safe. It should do no harm. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21253) Backport HBASE-21244 Skip persistence when retrying for assignment related procedures to branch-2.0 and branch-2.1
Allan Yang created HBASE-21253: -- Summary: Backport HBASE-21244 Skip persistence when retrying for assignment related procedures to branch-2.0 and branch-2.1 Key: HBASE-21253 URL: https://issues.apache.org/jira/browse/HBASE-21253 Project: HBase Issue Type: Bug Affects Versions: 2.0.2, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang See HBASE-21244 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
Allan Yang created HBASE-21237: -- Summary: Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS Key: HBASE-21237 URL: https://issues.apache.org/jira/browse/HBASE-21237 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.2, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to dispatch region open/close requests to the RS. ExecuteProceduresRemoteCall groups all the open/close operations into one call and executes them sequentially on the target RS; if one operation fails, all the operations will be marked as failed. Actually, some of the operations (like open region) are already executing in the open region handler thread, but the master thinks these operations failed and reassigns the regions to another RS. So when the previous RS reports to the master that the region is online, the master will kill the RS since it already assigned the region to another RS. For branch-2.2+, HBASE-21217 will fix this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21228) Memory leak since AbstractFSWAL caches Thread objects and never cleans them later
Allan Yang created HBASE-21228: -- Summary: Memory leak since AbstractFSWAL caches Thread objects and never cleans them later Key: HBASE-21228 URL: https://issues.apache.org/jira/browse/HBASE-21228 Project: HBase Issue Type: Bug Affects Versions: 1.4.7, 2.0.2, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang In AbstractFSWAL (FSHLog in branch-1), we have a map that caches Thread objects and their SyncFutures: {code} /** * Map of {@link SyncFuture}s keyed by Handler objects. Used so we reuse SyncFutures. * * TODO: Reuse FSWALEntry's rather than create them anew each time as we do SyncFutures here. * * TODO: Add a FSWalEntry and SyncFuture as thread locals on handlers rather than have them get * them from this Map? */ private final ConcurrentMap<Thread, SyncFuture> syncFuturesByHandler; {code} A colleague of mine found a memory leak caused by this map. Every thread that writes the WAL is cached in this map, and no one ever cleans the threads out of the map, even after a thread is dead. In one of our customers' clusters, we noticed that even though there were no requests, the heap of the RS was almost full and CMS GC was triggered every second. We dumped the heap and found more than 30 thousand threads in Terminated state, all cached in the map above. Everything referenced by these threads was leaked. Most of the threads are: 1. PostOpenDeployTasksThread, which writes the open-region mark in the WAL; 2. hconnection-0x1f838e31-shared--pool threads, which are used to write index short circuits (Phoenix), and the WAL is written and synced in these threads; 3. Index writer threads (Phoenix), which are referenced by RegionEnvironment, then by HRegion, and finally by PostOpenDeployTasksThread. We should turn this map into a thread-local, letting the JVM GC the terminated threads for us. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
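A minimal sketch of the direction proposed above (assumptions: SyncFuture has a no-arg constructor and a reset() method that reuses it; this is not necessarily the committed patch):

{code:java}
// Sketch: replace the Thread-keyed ConcurrentMap with a ThreadLocal so the
// cached SyncFuture is reachable only through its owning thread and can be
// garbage collected once that thread terminates.
private final ThreadLocal<SyncFuture> cachedSyncFutures =
    ThreadLocal.withInitial(SyncFuture::new);

private SyncFuture getSyncFuture(long sequence) {
  // Reuse the per-thread future instead of looking it up in a map that is
  // never cleaned when the handler thread dies.
  return cachedSyncFutures.get().reset(sequence);
}
{code}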
[jira] [Created] (HBASE-21212) Wrong flush time when update flush metric
Allan Yang created HBASE-21212: -- Summary: Wrong flush time when update flush metric Key: HBASE-21212 URL: https://issues.apache.org/jira/browse/HBASE-21212 Project: HBase Issue Type: Bug Affects Versions: 2.0.2, 2.1.0, 3.0.0 Reporter: Allan Yang Assignee: Allan Yang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21085) Adding getter methods to some private fields in ProcedureV2 module
Allan Yang created HBASE-21085: -- Summary: Adding getter methods to some private fields in ProcedureV2 module Key: HBASE-21085 URL: https://issues.apache.org/jira/browse/HBASE-21085 Project: HBase Issue Type: Sub-task Reporter: Allan Yang Assignee: Allan Yang Many fields in the ProcedureV2 module are private; adding getter methods to them makes them more transparent. Some classes in the ProcedureV2 module are private too; make them public. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure
Allan Yang created HBASE-21083: -- Summary: Introduce a mechanism to bypass the execution of a stuck procedure Key: HBASE-21083 URL: https://issues.apache.org/jira/browse/HBASE-21083 Project: HBase Issue Type: Sub-task Components: amv2 Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang I discussed this offline with [~stack] and [~Apache9]. We all agreed that we need to introduce a mechanism to 'force complete' a stuck procedure so the AMv2 can continue running. We still have some unrevealed bugs hiding in our AMv2 and ProcedureV2 systems, and we need something to interfere with stuck procedures before HBCK2 can work. This is crucial for a production-ready system. For now, we have few ways to interfere with running procedures. Aborting them is not a good choice, since some procedures are not abortable, and some procedures may have overridden the abort() method to ignore the abort request. So here I will introduce a mechanism to bypass the execution of a stuck procedure. Basically, I added a field called 'bypass' to the Procedure class. If we set this field to true, all the logic in execute/rollback will be skipped, letting this procedure and its ancestors complete normally and release the lock resources at last. Notice that bypassing a procedure may leave the cluster in an intermediate state, e.g. a region not assigned, or some HDFS files left behind. Operators need to know the side effects of bypassing and recover the inconsistent state of the cluster themselves, for example by issuing new procedures to assign the regions. A patch will be uploaded and a review board will be opened. For now, only APIs in ProcedureExecutor are provided. If everything is fine, I will add it to the master service and add a shell command to bypass a procedure. Or maybe we can use dynamically compiled JSPs to execute those APIs, as mentioned in HBASE-20679. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
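A minimal sketch of the idea as described (illustrative only; the class, field and method names here are assumptions, not the actual Procedure API):

{code:java}
// Sketch: a 'bypass' flag that short-circuits execute()/rollback() so a stuck
// procedure can complete and release its locks without doing any more work.
abstract class SketchProcedure<TEnv> {
  private volatile boolean bypass = false;

  void bypass() { this.bypass = true; }

  final Object[] doExecute(TEnv env) throws Exception {
    if (bypass) {
      // Skip the real work entirely; the framework then marks the procedure
      // (and, transitively, its waiting ancestors) as finished and releases
      // the lock resources it held.
      return null;
    }
    return execute(env);
  }

  protected abstract Object[] execute(TEnv env) throws Exception;
}
{code}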
[jira] [Created] (HBASE-21051) Possible NPE if ModifyTable and region split happen at the same time
Allan Yang created HBASE-21051: -- Summary: Possible NPE if ModifyTable and region split happen at the same time Key: HBASE-21051 URL: https://issues.apache.org/jira/browse/HBASE-21051 Project: HBase Issue Type: Sub-task Components: amv2 Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Similar to HBASE-20921, the ModifyTable procedure and the reopen procedure won't hold the lock, so other procedures like split/merge can execute at the same time. 1. A split happened during ModifyTable; as you can see from the log, the split was nearly complete. {code} 2018-08-05 01:28:31,339 INFO [PEWorker-8] procedure2.ProcedureExecutor(1659): Finished subprocedure(s) of pid=772, state=RUNNABLE:SPLIT_TABLE_REGION_POST_OPERATION, hasLock=true; SplitTableRegionProcedure table=IntegrationTestBigLinkedList, parent=357a7a6a62c76bc2d7ab30a6cc812637, daughterA=b13e5d155b65a5f752f3adda78fcfb6a, daughterB=5be3aadcee68d91c3d1e464865550246; resume parent processing. 2018-08-05 01:28:31,345 INFO [PEWorker-8] procedure2.ProcedureExecutor(1296): Finished pid=795, ppid=772, state=SUCCESS, hasLock=false; AssignProcedure table=IntegrationTestBigLinkedList, region=b13e5d155b65a5f752f3adda78fcfb6a, target=e010125048016.bja,60020,1533402809226 in 5.0280sec {code} 2. The reopen procedure began to reopen the region by moving it {code} 2018-08-05 01:28:31,389 INFO [PEWorker-11] procedure.MasterProcedureScheduler(631): pid=781, ppid=774, state=RUNNABLE:MOVE_REGION_UNASSIGN, hasLock=false; MoveRegionProcedure hri=357a7a6a62c76bc2d7ab30a6cc812637, source=e010125048016.bja,60020,1533402809226, destination=e010125048016.bja,60020,1533402809226 checking lock on 357a7a6a62c76bc2d7ab30a6cc812637 2018-08-05 01:28:31,390 INFO [PEWorker-3] procedure2.ProcedureExecutor(1296): Finished pid=772, state=SUCCESS, hasLock=false; SplitTableRegionProcedure table=IntegrationTestBigLinkedList, parent=357a7a6a62c76bc2d7ab30a6cc812637, daughterA=b13e5d155b65a5f752f3adda78fcfb6a, daughterB=5be3aadcee68d91c3d1e464865550246 in 21.9050sec 2018-08-05 01:28:31,518 INFO [PEWorker-11] procedure2.ProcedureExecutor(1533): Initialized subprocedures=[{pid=797, ppid=781, state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=false; UnassignProcedure table=IntegrationTestBigLinkedList, region=357a7a6a62c76bc2d7ab30a6cc812637, server=e010125048016.bja,60020,1533402809226}] 2018-08-05 01:28:31,530 INFO [PEWorker-15] procedure.MasterProcedureScheduler(631): pid=797, ppid=781, state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=false; UnassignProcedure table=IntegrationTestBigLinkedList, region=357a7a6a62c76bc2d7ab30a6cc812637, server=e010125048016.bja,60020,1533402809226 checking lock on 357a7a6a62c76bc2d7ab30a6cc812637 {code} 3. 
MoveRegionProcedure fails since the region does not exist any more (due to the split) {code} 2018-08-05 01:28:31,543 ERROR [PEWorker-15] procedure2.ProcedureExecutor(1517): CODE-BUG: Uncaught runtime exception: pid=797, ppid=781, state=RUNNABLE:REGION_TRANSITION_DISPATCH, hasLock=true; UnassignProcedure table=IntegrationTestBigLinkedList, region=357a7a6a62c76bc2d7ab30a6cc812637, server=e010125048016.bja,60020,1533402809226 java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) at org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:1097) at org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:1125) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1455) at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:204) at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:349) at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:101) at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:873) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1498) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1278) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$900(ProcedureExecutor.java:76) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1785) {code} We need to think about this case and find a proper solution for it; otherwise, issues like this one and HBASE-20921 will keep coming. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21050) Exclusive lock may be held by a SUCCESS state procedure forever
Allan Yang created HBASE-21050: -- Summary: Exclusive lock may be held by a SUCCESS state procedure forever Key: HBASE-21050 URL: https://issues.apache.org/jira/browse/HBASE-21050 Project: HBase Issue Type: Sub-task Components: amv2 Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang After HBASE-20846, we restore lock info for procedures. But there is a case where the lock can be held by an already-successful procedure. Since the procedure won't execute again, the lock will be held by it forever. 1. All children of pid=1208 had finished, but before procedure 1208 woke up, the master was killed {code} 2018-08-05 02:20:14,465 INFO [PEWorker-8] procedure2.ProcedureExecutor(1659): Finished subprocedure(s) of pid=1208, ppid=1206, state=RUNNABLE, hasLock=true; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034; resume parent processing. 2018-08-05 02:20:14,466 INFO [PEWorker-8] procedure2.ProcedureExecutor(1296): Finished pid=1232, ppid=1208, state=SUCCESS, hasLock=false; AssignProcedure table=IntegrationTestBigLinkedList, region=c2a23a735f16df57299dba6fd4599f2f, target=e010125050127.bja,60020,1533403109034 in 1.5060sec {code} 2. The master restarts; since procedure 1208 held the lock before the restart, the lock was restored for it {code} 2018-08-05 02:20:30,803 DEBUG [Thread-15] procedure2.ProcedureExecutor(456): Loading pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 2018-08-05 02:20:30,818 DEBUG [Thread-15] procedure2.Procedure(898): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 held the lock before restarting, call acquireLock to restore it. 2018-08-05 02:20:30,818 INFO [Thread-15] procedure.MasterProcedureScheduler(631): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 checking lock on c2a23a735f16df57299dba6fd4599f2f {code} 3. Since procedure 1208 is in SUCCESS state, it won't execute later, so the lock will be held by it forever. We need to check the state of the procedure before restoring locks: if the procedure is already finished (success or rolled back), we do not need to acquire the lock for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
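A minimal sketch of the check described above (illustrative; the heldLockBeforeRestart()/isFinished()/acquireLock() names are assumptions about the procedure framework, not actual HBase APIs):

{code:java}
// Sketch: when replaying procedure WALs, only restore the lock for procedures
// that are still runnable; finished (SUCCESS / rolled back) ones never run
// again and would otherwise hold the lock forever.
for (Procedure<?> proc : loadedProcedures) {
  if (proc.heldLockBeforeRestart() && !proc.isFinished()) {
    proc.acquireLock(env); // restore the lock only for unfinished procedures
  }
}
{code}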
[jira] [Created] (HBASE-21041) Memstore's heap size will be decreased to minus zero after flush
Allan Yang created HBASE-21041: -- Summary: Memstore's heap size will be decreased to minus zero after flush Key: HBASE-21041 URL: https://issues.apache.org/jira/browse/HBASE-21041 Project: HBase Issue Type: Bug Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang When creating an active mutable segment (MutableSegment) in the memstore, the MutableSegment's deep overhead (208 bytes) is added to its own heap size, but not to the region's memstore heap size. The same goes for the immutable segment (CSLMImmutableSegment) which the mutable segment later turns into (an additional 8 bytes). So after one flush, the memstore's heap size is decreased to -216 bytes, and the negative number accumulates after every flush. CompactingMemstore has this problem too. We need to record the overhead for CSLMImmutableSegment and MutableSegment in the corresponding region's memstore size. For CellArrayImmutableSegment, CellChunkImmutableSegment and CompositeImmutableSegment, it is not necessary to do so, because inside CompactingMemstore the overheads are already taken care of when transferring a CSLMImmutableSegment into them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21035) Meta Table should be able to online even if all procedures are lost
Allan Yang created HBASE-21035: -- Summary: Meta Table should be able to online even if all procedures are lost Key: HBASE-21035 URL: https://issues.apache.org/jira/browse/HBASE-21035 Project: HBase Issue Type: Sub-task Affects Versions: 2.1.0 Reporter: Allan Yang Assignee: Allan Yang After HBASE-20708, we changed the way we initialize after the master starts. It only checks the WAL dirs and compares them to the ZooKeeper RS nodes to decide which servers need to be expired. For servers whose dir ends with 'SPLITTING', we ensure that there will be an SCP for them. But if the server hosting the meta region crashed before the master restarts, and all the procedure WALs are lost (due to a bug, or deleted manually, whatever), the newly restarted master will be stuck while initializing, since no one will bring the meta region online. Although it is an anomalous case, I think that no matter what happens we need to bring the meta region online; otherwise we are sitting ducks and nothing can be done. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-20976) SCP can be scheduled multiple times for the same RS
[ https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang reopened HBASE-20976: > SCP can be scheduled multiple times for the same RS > --- > > Key: HBASE-20976 > URL: https://issues.apache.org/jira/browse/HBASE-20976 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 2.0.2 > > Attachments: HBASE-20976.branch-2.0.001.patch > > > SCP can be scheduled multiple times for the same RS: > 1. a RS crashed, a SCP was submitted for it > 2. before this SCP finish, the Master crashed > 3. The new master will scan the meta table and find some region is still open > on a dead server > 4. The new master submit a SCP for the dead server again > The two SCP for the same RS can even execute concurrently if without > HBASE-20846… > Provided a test case to reproduce this issue and a fix solution in the patch. > Another case that SCP might be scheduled multiple times for the same RS(with > HBASE-20708.): > 1. a RS crashed, a SCP was submitted for it > 2. A new RS on the same host started, the old RS's Serveranme was remove from > DeadServer.deadServers > 3. after the SCP passed the Handle_RIT state, a UnassignProcedure need to > send a close region operation to the crashed RS > 4. The UnassignProcedure's dispatch failed since 'NoServerDispatchException' > 5. Begin to expire the RS, but only find it not online and not in deadServer > list, so a SCP was submitted for the same RS again > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21031) Memory leak if replay edits failed during region opening
Allan Yang created HBASE-21031: -- Summary: Memory leak if replay edits failed during region opening Key: HBASE-21031 URL: https://issues.apache.org/jira/browse/HBASE-21031 Project: HBase Issue Type: Bug Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Due to HBASE-21029, when replaying edits that contain many identical cells, the memstore won't flush, and an exception is thrown once all heap space is used up: {code} 2018-08-06 15:52:27,590 ERROR [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] handler.OpenRegionHandler(302): Failed open of region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., starting to roll back the global memstore size. java.lang.OutOfMemoryError: Java heap space at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) at org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41) at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104) at org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226) at org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180) at org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163) at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273) at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148) at org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111) at org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178) at org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287) at org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107) at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706) at org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494) at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608) at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404) {code} After this exception, the memstore is not rolled back, and since MSLAB is used, the chunks that were allocated are never released. That memory is leaked forever... We need to roll back the memory if opening the region fails (for now, only the global memstore size is decreased after a failure). Another problem is that we use replayEditsPerRegion in RegionServerAccounting to record how much memory is used during replay, and decrease the global memstore size if the replay fails. This is not right: during replay we may also flush the memstore, so the size kept in the replayEditsPerRegion map is not accurate at all! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
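A rollback of the accounting could be sketched as below, with plain counters standing in for HBase's RegionServerAccounting and the region's own memstore sizing (the helper shape and names are assumptions for illustration, not the committed fix): whatever the failed open added to the memstore counters has to be subtracted again, at both the region and the global level, before the exception is propagated.
{code:java}
import java.io.IOException;
import java.util.function.LongSupplier;

class ReplayRollbackSketch {
  long globalMemStoreSize;  // stand-in for the RS-wide counter in RegionServerAccounting
  long regionMemStoreSize;  // stand-in for the per-region counter

  // 'replay' performs the recovered-edits replay; 'addedSoFar' reports how much the replay
  // has added to the memstore counters at the moment of failure.
  void replayWithRollback(Runnable replay, LongSupplier addedSoFar) throws IOException {
    try {
      replay.run();
    } catch (Throwable t) {
      long added = addedSoFar.getAsLong();
      globalMemStoreSize -= added;   // roll back the global accounting
      regionMemStoreSize -= added;   // ...and the region's own accounting
      throw new IOException("Replay of recovered edits failed; memstore accounting rolled back", t);
    }
  }
}
{code}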
[jira] [Created] (HBASE-21029) Miscount of memstore's heap/offheap size if same cell was put
Allan Yang created HBASE-21029: -- Summary: Miscount of memstore's heap/offheap size if same cell was put Key: HBASE-21029 URL: https://issues.apache.org/jira/browse/HBASE-21029 Project: HBase Issue Type: Bug Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang We now use memstore.heapSize() + memstore.offheapSize() to decide whether a flush is needed. But if the same cell is put into the memstore again, only the memstore's dataSize is increased; the heap/offheap size is not. Actually, if MSLAB is used, the heap/offheap size should increase whether or not the cell is added to the cell set. IIRC, the memstore's heap/offheap size should always be bigger than its data size; we introduced the heap/offheap size besides the data size to reflect the memory footprint more precisely. {code} // If there's already a same cell in the CellSet and we are using MSLAB, we must count in the // MSLAB allocation size as well, or else there will be memory leak (occupied heap size larger // than the counted number) if (succ || mslabUsed) { cellSize = getCellLength(cellToAdd); } // heap/offheap size is changed only if the cell is truly added in the cellSet long heapSize = heapSizeChange(cellToAdd, succ); long offHeapSize = offHeapSizeChange(cellToAdd, succ); {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
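The direction of the fix implied by the report could be sketched as follows, reusing the names from the snippet above (a sketch of the idea only, not the committed patch): when MSLAB is in use, the duplicate cell still consumed space in an MSLAB chunk, so the heap/offheap deltas should be counted under the same (succ || mslabUsed) condition already used for the data size.
{code:java}
long cellSize = 0, heapDelta = 0, offHeapDelta = 0;
if (succ || mslabUsed) {
  cellSize = getCellLength(cellToAdd);
  // Count the MSLAB copy even when the cell was a duplicate and not added to the CellSet,
  // so heap/offheap size keeps tracking the real memory footprint.
  heapDelta = heapSizeChange(cellToAdd, true);
  offHeapDelta = offHeapSizeChange(cellToAdd, true);
}
{code}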
[jira] [Resolved] (HBASE-20976) SCP can be scheduled multiple times for the same RS
[ https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang resolved HBASE-20976. Resolution: Invalid > SCP can be scheduled multiple times for the same RS > --- > > Key: HBASE-20976 > URL: https://issues.apache.org/jira/browse/HBASE-20976 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 2.0.2 > > Attachments: HBASE-20976.branch-2.0.001.patch > > > SCP can be scheduled multiple times for the same RS: > 1. a RS crashed, a SCP was submitted for it > 2. before this SCP finish, the Master crashed > 3. The new master will scan the meta table and find some region is still open > on a dead server > 4. The new master submit a SCP for the dead server again > The two SCP for the same RS can even execute concurrently if without > HBASE-20846… > Provided a test case to reproduce this issue and a fix solution in the patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21003) Fix the flaky TestSplitOrMergeStatus
Allan Yang created HBASE-21003: -- Summary: Fix the flaky TestSplitOrMergeStatus Key: HBASE-21003 URL: https://issues.apache.org/jira/browse/HBASE-21003 Project: HBase Issue Type: Bug Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang TestSplitOrMergeStatus.testSplitSwitch() is flaky because : {code} //Set the split switch to false boolean[] results = admin.setSplitOrMergeEnabled(false, false, MasterSwitchType.SPLIT); .. //Split the region admin.split(t.getName()); int count = admin.getTableRegions(tableName).size(); assertTrue(originalCount == count); //Set the split switch to true, actually, the last split procedure may not started yet on master //So, after setting the switch to true, the last split operation may success, which is not //excepted results = admin.setSplitOrMergeEnabled(true, false, MasterSwitchType.SPLIT); assertEquals(1, results.length); assertFalse(results[0]); //Since last split success, split the region again will end up with a //DoNotRetryRegionException here admin.split(t.getName()); {code} {code} org.apache.hadoop.hbase.client.DoNotRetryRegionException: 3f16a57c583e6ecf044c5b7de2e97121 is not OPEN; regionState={3f16a57c583e6ecf044c5b7de2e97121 state=SPLITTING, ts=1533239385789, server=asf911.gq1.ygridcore.net,60061,1533239369899} at org.apache.hadoop.hbase.master.procedure.AbstractStateMachineTableProcedure.checkOnline(AbstractStateMachineTableProcedure.java:191) at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.(SplitTableRegionProcedure.java:112) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createSplitProcedure(AssignmentManager.java:756) at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1722) at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131) at org.apache.hadoop.hbase.master.HMaster.splitRegion(HMaster.java:1714) at org.apache.hadoop.hbase.master.MasterRpcServices.splitRegion(MasterRpcServices.java:797) at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20990) One operation in procedure batch throws an exception will cause all RegionTransitionProcedures receive the same exception
Allan Yang created HBASE-20990: -- Summary: One operation in procedure batch throws an exception will cause all RegionTransitionProcedures receive the same exception Key: HBASE-20990 URL: https://issues.apache.org/jira/browse/HBASE-20990 Project: HBase Issue Type: Sub-task Components: amv2 Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang In AMv2, we batch open/close region operations and call the RS with the executeProcedures API. But in this API, if one region's operation throws an exception, all the operations in the batch receive the same exception, even though some of the operations in the batch are actually executing normally on the RS. I think we should try/catch exceptions for each operation separately, and call remoteCallFailed or remoteCallCompleted on each RegionTransitionProcedure accordingly. Otherwise there can be some very strange behavior, such as this one: {code} 2018-07-18 02:56:18,506 WARN [RSProcedureDispatcher-pool3-t1] assignment.RegionTransitionProcedure(226): Remote call failed e010125048016.bja,60020,1531848989401; pid=8362, ppid=8272, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=IntegrationTestBigLinkedList, region=0beb8ea4e2f239fc082be7cefede1427, target=e010125048016.bja,60020,1531848989401; rit=OPENING, location=e010125048016.bja,60020,1531848989401; exception=NotServingRegionException {code} The AssignProcedure failed with a NotServingRegionException, what??? It is very strange. Actually, the AssignProcedure succeeded on the RS; a CloseRegion operation that failed in the same batch caused the exception. To correct this, we need to modify the response of the executeProcedures API, i.e. the ExecuteProceduresResponse proto, to return info (status, exceptions) per operation. This issue alone won't cause much trouble, so there is no hurry to change the behavior here, but we do need to take it into account when we reconstruct AMv2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
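On the RS side, the per-operation error handling asked for above could be sketched like this (plain Java collections stand in for the ExecuteProcedures request/response protos; the names are illustrative): every open/close in the batch gets its own try/catch, and the outcome is recorded per region instead of failing the whole batch with the first exception.
{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExecuteBatchSketch {
  // Returns the failures keyed by region name; regions absent from the map succeeded.
  static Map<String, Throwable> executeBatch(List<String> regionNames, List<Runnable> regionOps) {
    Map<String, Throwable> failuresByRegion = new HashMap<>();
    for (int i = 0; i < regionOps.size(); i++) {
      try {
        regionOps.get(i).run();                        // open or close exactly one region
      } catch (Throwable t) {
        failuresByRegion.put(regionNames.get(i), t);   // only this region's procedure sees the failure
      }
    }
    return failuresByRegion;
  }
}
{code}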
[jira] [Reopened] (HBASE-20976) SCP can be scheduled multiple times for the same RS
[ https://issues.apache.org/jira/browse/HBASE-20976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang reopened HBASE-20976: > SCP can be scheduled multiple times for the same RS > --- > > Key: HBASE-20976 > URL: https://issues.apache.org/jira/browse/HBASE-20976 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20976.branch-2.0.001.patch > > > SCP can be scheduled multiple times for the same RS: > 1. a RS crashed, a SCP was submitted for it > 2. before this SCP finish, the Master crashed > 3. The new master will scan the meta table and find some region is still open > on a dead server > 4. The new master submit a SCP for the dead server again > The two SCP for the same RS can even execute concurrently if without > HBASE-20846… > Provided a test case to reproduce this issue and a fix solution in the patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20976) SCP can be scheduled multiple times for the same RS
Allan Yang created HBASE-20976: -- Summary: SCP can be scheduled multiple times for the same RS Key: HBASE-20976 URL: https://issues.apache.org/jira/browse/HBASE-20976 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang SCP can be scheduled multiple times for the same RS: 1. a RS crashed, a SCP was submitted for it 2. before this SCP finish, the Master crashed 3. The new master will scan the meta table and find some region is still open on a dead server 4. The new master submit a SCP for the dead server again The two SCP for the same RS can even execute concurrently if without HBASE-20846… Provided a test case to reproduce this issue and a fix solution in the patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20975) Lock may not be taken while rolling back procedure
Allan Yang created HBASE-20975: -- Summary: Lock may not be taken while rolling back procedure Key: HBASE-20975 URL: https://issues.apache.org/jira/browse/HBASE-20975 Project: HBase Issue Type: Sub-task Components: amv2 Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Found this one when investigating HBASE-20921, too. Here is some code from executeRollback in ProcedureExecutor.java: {code} boolean reuseLock = false; while (stackTail --> 0) { final Procedure proc = subprocStack.get(stackTail); LockState lockState; //If reuseLock, then don't acquire the lock if (!reuseLock && (lockState = acquireLock(proc)) != LockState.LOCK_ACQUIRED) { return lockState; } lockState = executeRollback(proc); boolean abortRollback = lockState != LockState.LOCK_ACQUIRED; abortRollback |= !isRunning() || !store.isRunning(); //If the next procedure in the stack is the current one, then reuseLock = true reuseLock = stackTail > 0 && (subprocStack.get(stackTail - 1) == proc) && !abortRollback; //If reuseLock, don't releaseLock if (!reuseLock) { releaseLock(proc, false); } if (abortRollback) { return lockState; } subprocStack.remove(stackTail); if (proc.isYieldAfterExecutionStep(getEnvironment())) { return LockState.LOCK_YIELD_WAIT; } //But, here, lock is released no matter reuseLock is true or false if (proc != rootProc) { execCompletionCleanup(proc); } } {code} As my comments in the code above show, reuseLock can cause a procedure to execute (roll back) without holding the lock. Though I haven't found any bug actually introduced by this, it is indeed a potential bug that needs to be fixed. I think we can just remove the reuseLock logic and acquire and release the lock every time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
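With the reuseLock optimization removed, the loop could be simplified roughly as below (a sketch following the names in the snippet above, not the committed patch): the lock is acquired and released on every iteration, so no rollback step can run unlocked.
{code:java}
while (stackTail-- > 0) {
  final Procedure proc = subprocStack.get(stackTail);
  LockState lockState = acquireLock(proc);
  if (lockState != LockState.LOCK_ACQUIRED) {
    return lockState;                      // could not lock this procedure; try again later
  }
  try {
    lockState = executeRollback(proc);
  } finally {
    releaseLock(proc, false);              // always release; no lock reuse across iterations
  }
  if (lockState != LockState.LOCK_ACQUIRED || !isRunning() || !store.isRunning()) {
    return lockState;                      // abort the rollback pass
  }
  subprocStack.remove(stackTail);
  if (proc.isYieldAfterExecutionStep(getEnvironment())) {
    return LockState.LOCK_YIELD_WAIT;
  }
  if (proc != rootProc) {
    execCompletionCleanup(proc);
  }
}
{code}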
[jira] [Created] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure
Allan Yang created HBASE-20973: -- Summary: ArrayIndexOutOfBoundsException when rolling back procedure Key: HBASE-20973 URL: https://issues.apache.org/jira/browse/HBASE-20973 Project: HBase Issue Type: Sub-task Components: amv2 Affects Versions: 2.0.1, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Found this one while investigating HBASE-20921. After the root procedure (ModifyTableProcedure in this case) rolled back, an ArrayIndexOutOfBoundsException was thrown {code} 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): CODE-BUG: Uncaught runtime exception for pid=5973, state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.lang.NullPointerException; ModifyTableProcedure table=IntegrationTestBigLinkedList java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147) at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50) at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203) at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741) 2018-07-18 01:39:10,243 WARN [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405) at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178) at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513) at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505) at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741) at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691) at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741) {code} This is a very serious condition: after this exception is thrown, the exclusive lock held by ModifyTableProcedure was never released.
All the procedures against this table were blocked until the master restarted; since the lock info for the procedure was not restored, the other procedures could run again. It is quite embarrassing that a bug saved us... (that bug will be fixed in HBASE-20846). I tried to reproduce this one using the test case in HBASE-20921, but I just can't reproduce it. An easy way to resolve this is to add a try/catch, making sure that no matter what happens, the table's exclusive lock is always released. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
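The defensive try/finally could be sketched as below, reusing the helper names quoted from ProcedureExecutor in HBASE-20975 above (an illustration of the idea, not the committed fix): even if the rollback step throws an unexpected runtime exception, the exclusive table lock is still released, so other procedures against the table are not blocked until the next master restart.
{code:java}
LockState rollbackAndAlwaysRelease(Procedure proc) {
  LockState state = acquireLock(proc);
  if (state != LockState.LOCK_ACQUIRED) {
    return state;
  }
  try {
    return executeRollback(proc);   // may throw an unexpected runtime exception
  } finally {
    releaseLock(proc, false);       // runs even when executeRollback blows up
  }
}
{code}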
[jira] [Created] (HBASE-20921) Possible NPE in ReopenTableRegionsProcedure
Allan Yang created HBASE-20921: -- Summary: Possible NPE in ReopenTableRegionsProcedure Key: HBASE-20921 URL: https://issues.apache.org/jira/browse/HBASE-20921 Project: HBase Issue Type: Sub-task Components: amv2 Affects Versions: 2.1.0, 3.0.0, 2.0.2 Reporter: Allan Yang Assignee: Allan Yang After HBASE-20752, we issue a ReopenTableRegionsProcedure in ModifyTableProcedure to ensure all regions are reopened. But ModifyTableProcedure and ReopenTableRegionsProcedure do not hold the lock (why?), so there is a chance that while ModifyTableProcedure is executing, a merge/split procedure is executed at the same time. So when ReopenTableRegionsProcedure reaches the "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state, some of the persisted regions to check no longer exist, and an NPE is thrown: {code} 2018-07-18 01:38:57,528 INFO [PEWorker-9] procedure2.ProcedureExecutor(1246): Finished pid=6110, state=SUCCESS; MergeTableRegionsProcedure table=IntegrationTestBigLinkedList, regions=[845d286231eb01b71aeaa17b0e30058d, 4a46ab0918c99cada72d5336ad83a828], forcibly=false in 10.8610sec 2018-07-18 01:38:57,530 ERROR [PEWorker-8] procedure2.ProcedureExecutor(1478): CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList java.lang.NullPointerException at org.apache.hadoop.hbase.master.assignment.RegionStates.checkReopened(RegionStates.java:651) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) at org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:102) at org.apache.hadoop.hbase.master.procedure.ReopenTableRegionsProcedure.executeFromState(ReopenTableRegionsProcedure.java:45) at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:184) at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:850) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1453) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741) {code} I think we need to renew the table's region list at the "REOPEN_TABLE_REGIONS_CONFIRM_REOPENED" state. Regions that have been merged or split do not need to be checked, since we can be sure they were opened after we made the change to the table descriptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
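The renewal idea could be sketched with simplified types (illustrative only, not the committed fix): at the confirm step, drop the regions that disappeared because of a concurrent split/merge, since checkReopened() has nothing to look up for them, and keep waiting only for the regions that still exist.
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ConfirmReopenSketch {
  // 'persistedRegions' is the region list recorded when the procedure started;
  // 'currentlyReopened' maps the regions that exist right now to whether they have reopened.
  static List<String> stillToConfirm(List<String> persistedRegions, Map<String, Boolean> currentlyReopened) {
    List<String> remaining = new ArrayList<>();
    for (String encodedName : persistedRegions) {
      Boolean reopened = currentlyReopened.get(encodedName);
      if (reopened == null) {
        continue;                    // region was merged/split away; nothing left to confirm
      }
      if (!reopened) {
        remaining.add(encodedName);  // still waiting for this one to reopen
      }
    }
    return remaining;
  }
}
{code}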
[jira] [Resolved] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server
[ https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang resolved HBASE-20864. Resolution: Resolved Fix Version/s: (was: 2.0.2) HBASE-20792 solved this issue > RS was killed due to master thought the region should be on a already dead > server > - > > Key: HBASE-20864 > URL: https://issues.apache.org/jira/browse/HBASE-20864 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: log.zip > > > When I was running ITBLL with our internal 2.0.0 version(with 2.0.1 > backported and with other two issues: HBASE-20706, HBASE-20752). I found two > of my RS killed by master since master has a different region state with > those RS. It is very strange that master thought these region should be on a > already dead server. There might be a serious bug, but I haven't found it > yet. Here is the process: > 1. e010125048153.bja,60020,1531137365840 is crashed, and clearly > 4423e4182457c5b573729be4682cc3a3 was assigned to > e010125049164.bja,60020,1531136465378 during ServerCrashProcedure > {code:java} > 2018-07-09 20:03:32,443 INFO [PEWorker-10] procedure.ServerCrashProcedure: > Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure > server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false > 2018-07-09 20:03:39,220 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] > assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, > pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH; > AssignProcedure table=IntegrationTestBigLinkedList, > region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, > location=e010125049164.bja,60020,1531136465378 > 2018-07-09 20:03:39,220 INFO [PEWorker-13] assignment.RegionStateStore: > pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, > regionState=OPEN, openSeqNum=16021, > regionLocation=e010125049164.bja,60020,1531136465378 > 2018-07-09 20:03:43,190 INFO [PEWorker-12] procedure2.ProcedureExecutor: > Finished pid=2303, state=SUCCESS; ServerCrashProcedure > server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in > 10.7490sec > {code} > 2. A modify table happened later, the 4423e4182457c5b573729be4682cc3a3 was > reopend on e010125049164.bja,60020,1531136465378 > {code:java} > 2018-07-09 20:04:39,929 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] > assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, > pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH; > AssignProcedure table=IntegrationTestBigLinkedList, > region=4423e4182457c5b573729be4682cc3a3, > target=e010125049164.bja,60020,1531136465378; rit=OPENING, > location=e010125049164.bja,60020,1531136465378 > 2018-07-09 20:04:40,554 INFO [PEWorker-6] assignment.RegionStateStore: > pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, > regionState=OPEN, openSeqNum=16024, > regionLocation=e010125049164.bja,60020,1531136465378 > {code} > 3. Active master was killed, the backup master took over, but when loading > meta entry, it clearly showed 4423e4182457c5b573729be4682cc3a3 is on the > privous dead server e010125048153.bja,60020,1531137365840. That is very very > strange!!! 
> {code:java} > 2018-07-09 20:06:17,985 INFO [master/e010125048016:6] > assignment.RegionStateStore: Load hbase:meta entry > region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, > lastHost=e010125049164.bja,60020,1531136465378, > regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024 > {code} > 4. the rs was killed > {code:java} > 2018-07-09 20:06:20,265 WARN > [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6] > assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378: > rit=OPEN, location=e010125048153.bja,60020,1531137365840, > table=IntegrationTestBigLinkedList, > region=4423e4182457c5b573729be4682cc3a3reported OPEN on > server=e010125049164.bja,60020,1531136465378 but state has otherwise. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20903) backport HBASE-20792 "info:servername and info:sn inconsistent for OPEN region" to branch-2.0
Allan Yang created HBASE-20903: -- Summary: backport HBASE-20792 "info:servername and info:sn inconsistent for OPEN region" to branch-2.0 Key: HBASE-20903 URL: https://issues.apache.org/jira/browse/HBASE-20903 Project: HBase Issue Type: Bug Affects Versions: 2.0.1 Reporter: Allan Yang Assignee: Allan Yang Fix For: 2.0.2 As discussed in HBASE-20864, this is a very serious bug which can cause an RS to be killed or data to be lost. It should be backported to branch-2.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing
Allan Yang created HBASE-20893: -- Summary: Data loss if splitting region while ServerCrashProcedure executing Key: HBASE-20893 URL: https://issues.apache.org/jira/browse/HBASE-20893 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.1, 3.0.0, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Similar case as HBASE-20878. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing
Allan Yang created HBASE-20878: -- Summary: Data loss if merging regions while ServerCrashProcedure executing Key: HBASE-20878 URL: https://issues.apache.org/jira/browse/HBASE-20878 Project: HBase Issue Type: Bug Components: amv2 Affects Versions: 2.0.1, 3.0.0, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang In MergeTableRegionsProcedure, we close the regions to merge using UnassignProcedures. But if the RS these regions are on has crashed, a ServerCrashProcedure will execute at the same time, and the UnassignProcedures will be blocked until all logs are split. Since these regions are closed for merging, they won't be opened again, so the recovered.edits in the region dirs won't be replayed and data will be lost. I provided a test to reproduce this case. I strongly suspect the split region procedure has the same kind of problem; I will check later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool
Allan Yang created HBASE-20870: -- Summary: Wrong HBase root dir in ITBLL's Search Tool Key: HBASE-20870 URL: https://issues.apache.org/jira/browse/HBASE-20870 Project: HBase Issue Type: Bug Components: integration tests Affects Versions: 2.0.1, 3.0.0, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang When using IntegrationTestBigLinkedList's Search tool, it always fails because it tries to read WALs from the wrong HBase root dir. It turned out that when IntegrationTestingUtility is initialized in IntegrationTestBigLinkedList, its super class HBaseTestingUtility changes hbase.rootdir to a random local dir. That is not wrong in itself, since HBaseTestingUtility is mostly used with the minicluster, but for integration tests run on distributed clusters we should change it back. Here is the error info: {code:java} 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting hbase.rootdir to /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running command-line tool java.io.FileNotFoundException: File file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs does not exist at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20867) RS may got killed while master restarts
Allan Yang created HBASE-20867: -- Summary: RS may got killed while master restarts Key: HBASE-20867 URL: https://issues.apache.org/jira/browse/HBASE-20867 Project: HBase Issue Type: Bug Affects Versions: 2.0.1, 3.0.0, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang If the master is dispatching an RPC call to an RS while aborting, a connection exception may be thrown by the RPC layer (an IOException with a "Connection closed" message in this case). The RSProcedureDispatcher regards it as an un-retryable exception and passes it to UnassignProcedure.remoteCallFailed, which expires the RS. Actually, the RS is perfectly healthy; only the master is restarting. I think we should handle these kinds of connection exceptions in RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
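The retry classification proposed above could be sketched as follows (the message check is an assumption based on the error described in the report, not the committed patch): treat a "Connection closed" style IOException from the RPC layer as retryable instead of immediately expiring the target RS.
{code:java}
import java.io.IOException;

final class DispatchRetrySketch {
  // True when the failure looks like a transient connection problem on the master side,
  // in which case the dispatcher should retry the call rather than expire the server.
  static boolean isRetryableConnectionError(Throwable t) {
    if (!(t instanceof IOException)) {
      return false;
    }
    String msg = t.getMessage();
    return msg != null && (msg.contains("Connection closed") || msg.contains("Connection reset"));
  }
}
{code}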
[jira] [Created] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server
Allan Yang created HBASE-20864: -- Summary: RS was killed due to master thought the region should be on a already dead server Key: HBASE-20864 URL: https://issues.apache.org/jira/browse/HBASE-20864 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Allan Yang When I was running ITBLL with our internal 2.0.0 version(with 2.0.1 backported and with other two issues: HBASE-20706, HBASE-20752). I found two of my RS killed by master since master has a different region state with those RS. It is very strange that master thought these region should be on a already dead server. There might be a serious bug, but I haven't found it yet. Here is the process: 1. e010125048153.bja,60020,1531137365840 is crashed, and clearly 4423e4182457c5b573729be4682cc3a3 was assigned to e010125049164.bja,60020,1531136465378 during ServerCrashProcedure {code:java} 2018-07-09 20:03:32,443 INFO [PEWorker-10] procedure.ServerCrashProcedure: Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure server=e010125048153.bja,60020,1531137365840, splitWa l=true, meta=false 2018-07-09 20:03:39,220 DEBUG [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6] assignment.RegionTransitionProcedure: Received report OPENED seqId=16021, pid=2305, ppid=2303, state=RUNNABLE :REGION_TRANSITION_DISPATCH; AssignProcedure table=IntegrationTestBigLinkedList, region=4423e4182457c5b573729be4682cc3a3; rit=OPENING, location=e010125049164.bja,60020,1531136465378 2018-07-09 20:03:39,220 INFO [PEWorker-13] assignment.RegionStateStore: pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, openSeqNum=16021, regionLocation=e010125049 164.bja,60020,1531136465378 2018-07-09 20:03:43,190 INFO [PEWorker-12] procedure2.ProcedureExecutor: Finished pid=2303, state=SUCCESS; ServerCrashProcedure server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in 10.7490sec {code} 2. A modify table happened later, the 4423e4182457c5b573729be4682cc3a3 was reopend on e010125049164.bja,60020,1531136465378 {code:java} 2018-07-09 20:04:39,929 DEBUG [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6] assignment.RegionTransitionProcedure: Received report OPENED seqId=16024, pid=2351, ppid=2314, state=RUNNABLE :REGION_TRANSITION_DISPATCH; AssignProcedure table=IntegrationTestBigLinkedList, region=4423e4182457c5b573729be4682cc3a3, target=e010125049164.bja,60020,1531136465378; rit=OPENING, location=e0101250491 64.bja,60020,1531136465378 2018-07-09 20:04:40,554 INFO [PEWorker-6] assignment.RegionStateStore: pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, openSeqNum=16024, regionLocation=e0101250491 64.bja,60020,1531136465378 {code} 3. Active master was killed, the backup master took over, but when loading meta entry, it clearly showed 4423e4182457c5b573729be4682cc3a3 is on the privous dead server e010125048153.bja,60020,1531137365840. That is very very strange!!! {code:java} 2018-07-09 20:06:17,985 INFO [master/e010125048016:6] assignment.RegionStateStore: Load hbase:meta entry region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN, lastHost=e010125049164.bja,60020 ,1531136465378, regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024 {code} 4. 
the rs was killed {code:java} 2018-07-09 20:06:20,265 WARN [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6] assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378: rit=OPEN, location=e010125048153 .bja,60020,1531137365840, table=IntegrationTestBigLinkedList, region=4423e4182457c5b573729be4682cc3a3reported OPEN on server=e010125049164.bja,60020,1531136465378 but state has otherwise. {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
Allan Yang created HBASE-20860: -- Summary: Merged region's RIT state may not be cleaned after master restart Key: HBASE-20860 URL: https://issues.apache.org/jira/browse/HBASE-20860 Project: HBase Issue Type: Bug Affects Versions: 2.0.1, 3.0.0, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0, 2.1.0, 2.0.2 In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the regions to merge. But if the master is restarted just after MergeTableRegionsProcedure finishes these two UnassignProcedures and before it can delete their meta entries, the new master will find the two regions CLOSED with no procedures attached to them. They will be regarded as RIT regions, and nobody will ever clean up their RIT state. A quick way to resolve this stuck situation in a production environment is to restart the master once more, since by then the meta entries will have been deleted by MergeTableRegionsProcedure. Here I offer a fix for this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20854) Wrong retires in RpcRetryingCaller's log message
Allan Yang created HBASE-20854: -- Summary: Wrong retires in RpcRetryingCaller's log message Key: HBASE-20854 URL: https://issues.apache.org/jira/browse/HBASE-20854 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0, 2.1.0, 2.0.2 Just a small bug fix: in the error log message in RpcRetryingCallerImpl, the tries number is passed for both tries and retries, causing a bit of confusion. {code} 2018-07-05 21:04:46,343 INFO [Thread-20] org.apache.hadoop.hbase.client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4174 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.exceptions.RegionOpeningException: Region IntegrationTestBigLinkedList,\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFE,1530795739116.0cfd339596648348ac13d979150eb2bf. is opening on e010125049164.bja,60020,1530795698451 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20846) Table's shared lock is not hold by sub-procedures after master restart
Allan Yang created HBASE-20846: -- Summary: Table's shared lock is not hold by sub-procedures after master restart Key: HBASE-20846 URL: https://issues.apache.org/jira/browse/HBASE-20846 Project: HBase Issue Type: Bug Affects Versions: 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0, 2.1.0, 2.0.2 Found this one while investigating a ModifyTableProcedure that got stuck while a MoveRegionProcedure was going on after a master restart. That particular issue can be solved by HBASE-20752, but I discovered something else. Before a MoveRegionProcedure can execute, it holds the table's shared lock. So, when an UnassignProcedure is spawned, it will not check the table's shared lock, since it is sure that its parent (MoveRegionProcedure) has acquired the table's lock. {code:java} // If there is parent procedure, it would have already taken xlock, so no need to take // shared lock here. Otherwise, take shared lock. if (!procedure.hasParent() && waitTableQueueSharedLock(procedure, table) == null) { return true; } {code} But that is not the case when the master is restarted. The child procedure (UnassignProcedure) is executed first after the restart; though it has a parent (MoveRegionProcedure), the parent apparently no longer holds the table's lock. So the child begins to execute without holding the table's shared lock, and a ModifyTableProcedure can acquire the table's exclusive lock and execute at the same time, which would not be possible if the master had not been restarted. This caused a stuck situation before HBASE-20752; since HBASE-20752 fixed that, I wrote a simple UT to reproduce this case. I think we don't have to check the parent for the table's shared lock. It is a shared lock, right? I think we can acquire it every time we need it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
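Based on the snippet quoted above, the proposed change could be as small as dropping the hasParent() shortcut so every procedure goes through the shared-lock acquisition itself, including sub-procedures resumed after a master restart whose parent's lock was never restored (a sketch, not necessarily the committed patch):
{code:java}
// Always take the table's shared lock, whether or not this procedure has a parent;
// the rest of the logic keeps the original return behavior.
if (waitTableQueueSharedLock(procedure, table) == null) {
  return true;
}
{code}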
[jira] [Created] (HBASE-20727) Persist FlushedSequenceId to speed up WAL split after cluster restart
Allan Yang created HBASE-20727: -- Summary: Persist FlushedSequenceId to speed up WAL split after cluster restart Key: HBASE-20727 URL: https://issues.apache.org/jira/browse/HBASE-20727 Project: HBase Issue Type: New Feature Affects Versions: 2.0.0 Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0 We use flushedSequenceIdByRegion and storeFlushedSequenceIdsByRegion in ServerManager to record the latest flushed seqids of regions and stores, so during log split we can use the seqids stored in those maps to filter out the edits that do not need to be replayed. But those maps are not persisted; after a cluster restart or a master restart, the flushed-seqid info is all lost. Here I offer a way to persist that info to HDFS, so that even after a master restart we can still use it to filter WAL edits and speed up replay. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20679) Add the ability to compile JSP dynamically in Jetty
Allan Yang created HBASE-20679: -- Summary: Add the ability to compile JSP dynamically in Jetty Key: HBASE-20679 URL: https://issues.apache.org/jira/browse/HBASE-20679 Project: HBase Issue Type: New Feature Affects Versions: 2.0.0 Reporter: Allan Yang Assignee: Allan Yang Fix For: 3.0.0 As discussed in HBASE-20617, adding the ability to dynamically compile JSPs enables us to do hot fixes. For example, several days ago, in our testing HBase 2.0 cluster, the procedure WALs were corrupted for some unknown reason. After restarting the cluster, some procedures (AssignProcedure for example) were corrupted and couldn't be replayed, so some regions were stuck in RIT forever. We couldn't use HBCK since it doesn't support AssignmentV2 yet. As a matter of fact, the namespace region was not online, so the master was not initialized and we couldn't even use shell commands like assign/move. But we wrote a JSP and fixed the issue easily. The JSP file is like this: {code:java} <% String action = request.getParameter("action"); HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER); List<RegionInfo> offlineRegionsToAssign = new ArrayList<>(); List<RegionStates.RegionStateNode> regionRITs = master.getAssignmentManager() .getRegionStates().getRegionsInTransition(); for (RegionStates.RegionStateNode regionStateNode : regionRITs) { // if regionStateNode doesn't have a procedure attached, but the meta state shows // this region is in RIT, the previous procedure may be corrupted; // we need to create a new AssignProcedure to assign the region if (!regionStateNode.isInTransition()) { offlineRegionsToAssign.add(regionStateNode.getRegionInfo()); out.println("RIT region:" + regionStateNode); } } // Assign offline regions. Uses round-robin. if ("fix".equals(action) && offlineRegionsToAssign.size() > 0) { master.getMasterProcedureExecutor().submitProcedures(master.getAssignmentManager(). createRoundRobinAssignProcedures(offlineRegionsToAssign)); } else { out.println("use ?action=fix to fix RIT regions"); } %> {code} The above is only one example of what we can do if we have the ability to compile JSPs dynamically. We think it is very useful. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20611) UnsupportedOperationException may thrown when calling getCallQueueInfo()
Allan Yang created HBASE-20611: -- Summary: UnsupportedOperationException may thrown when calling getCallQueueInfo() Key: HBASE-20611 URL: https://issues.apache.org/jira/browse/HBASE-20611 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Allan Yang HBASE-16290 added a new feature to dump queue info; the method getCallQueueInfo() needs to iterate the queue to get the elements in it. But apart from Java's LinkedBlockingQueue, the other queue implementations such as BoundedPriorityBlockingQueue and AdaptiveLifoCoDelCallQueue don't implement iterator(). If those queues are used, an UnsupportedOperationException is thrown. This can easily be reproduced with the UT testCallQueueInfo by adding a conf setting (conf.set("hbase.ipc.server.callqueue.type", "deadline");): {code} java.lang.UnsupportedOperationException at org.apache.hadoop.hbase.util.BoundedPriorityBlockingQueue.iterator(BoundedPriorityBlockingQueue.java:285) at org.apache.hadoop.hbase.ipc.RpcExecutor.getCallQueueCountsSummary(RpcExecutor.java:166) at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.getCallQueueInfo(SimpleRpcScheduler.java:241) at org.apache.hadoop.hbase.ipc.TestSimpleRpcScheduler.testCallQueueInfo(TestSimpleRpcScheduler.java:164) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
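One possible direction, sketched under the assumption that a best-effort view is acceptable for the summary (not the committed fix), is to fall back gracefully when a queue implementation does not support iteration, rather than letting the UnsupportedOperationException escape from getCallQueueInfo():
{code:java}
import java.util.Collections;
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;

final class QueueIterationSketch {
  static <E> Iterator<E> safeIterator(BlockingQueue<E> queue) {
    try {
      return queue.iterator();
    } catch (UnsupportedOperationException e) {
      // Queue type without iterator() support (e.g. a priority/CoDel call queue):
      // report it as empty instead of failing the whole call-queue dump.
      return Collections.emptyIterator();
    }
  }
}
{code}
Implementing iterator() in the affected queue classes over a snapshot of their backing arrays would be the other obvious option.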
[jira] [Created] (HBASE-20601) And multiPut support and other miscellaneous to pe
Allan Yang created HBASE-20601: -- Summary: And multiPut support and other miscellaneous to pe Key: HBASE-20601 URL: https://issues.apache.org/jira/browse/HBASE-20601 Project: HBase Issue Type: Bug Components: tooling Affects Versions: 2.0.0 Reporter: Allan Yang Assignee: Allan Yang Fix For: 2.1.0 Add some useful stuff and some refinements to the PE tool. 1. Add multiPut support. Though we have BufferedMutator, sometimes we need to benchmark batched puts of a given size. Set --multiPut=number to enable batch put (meanwhile, --autoflush needs to be set to false). 2. Add connection number support. Before, there was only one parameter to control the connections used by threads: oneCon=true means all threads use one connection, false means each thread has its own connection. When the thread number is high and oneCon=false, we noticed a high context-switch frequency on the machine PE runs on, disturbing the benchmark results (each connection has its own netty worker threads, 2*CPU IIRC). So a new parameter conNum was added to PE; setting --conNum=2 means all threads will share 2 connections. 3. Add avg RT and avg TPS/QPS statistics for all threads. Useful when we want to measure the total throughput of the cluster. 4. Delete some redundant code. RandomWriteTest now inherits from SequentialWrite. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-18233) We shouldn't wait for readlock in doMiniBatchMutation in case of deadlock
Allan Yang created HBASE-18233: -- Summary: We shouldn't wait for readlock in doMiniBatchMutation in case of deadlock Key: HBASE-18233 URL: https://issues.apache.org/jira/browse/HBASE-18233 Project: HBase Issue Type: Bug Affects Versions: 1.2.7 Reporter: Allan Yang Assignee: Allan Yang Please refer to the discuss in HBASE-18144 https://issues.apache.org/jira/browse/HBASE-18144?focusedCommentId=16051701&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16051701 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18168) NoSuchElementException when rolling the log
Allan Yang created HBASE-18168: -- Summary: NoSuchElementException when rolling the log Key: HBASE-18168 URL: https://issues.apache.org/jira/browse/HBASE-18168 Project: HBase Issue Type: Bug Affects Versions: 1.1.11 Reporter: Allan Yang Assignee: Allan Yang Today, one of our servers aborted due to the following log. {code} 2017-06-06 05:38:47,142 ERROR [regionserver/.logRoller] regionserver.LogRoller: Log rolling failed java.util.NoSuchElementException at java.util.concurrent.ConcurrentSkipListMap$Iter.advance(ConcurrentSkipListMap.java:2224) at java.util.concurrent.ConcurrentSkipListMap$ValueIterator.next(ConcurrentSkipListMap.java:2253) at java.util.Collections.min(Collections.java:628) at org.apache.hadoop.hbase.regionserver.wal.FSHLog.findEligibleMemstoresToFlush(FSHLog.java:861) at org.apache.hadoop.hbase.regionserver.wal.FSHLog.findRegionsToForceFlush(FSHLog.java:886) at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:728) at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:137) at java.lang.Thread.run(Thread.java:756) 2017-06-06 05:38:47,142 FATAL [regionserver/.logRoller] regionserver.HRegionServer: ABORTING region server : Log rolling failed java.util.NoSuchElementException .. {code} The code is here: {code} private byte[][] findEligibleMemstoresToFlush(Map<byte[], Long> regionsSequenceNums) { List<byte[]> regionsToFlush = null; // Keeping the old behavior of iterating unflushedSeqNums under oldestSeqNumsLock. synchronized (regionSequenceIdLock) { for (Map.Entry<byte[], Long> e: regionsSequenceNums.entrySet()) { ConcurrentMap<byte[], Long> m = this.oldestUnflushedStoreSequenceIds.get(e.getKey()); if (m == null) { continue; } long unFlushedVal = Collections.min(m.values()); //The exception is thrown here .. {code} The map 'm' being empty is the only reason I can think of for the NoSuchElementException. I then looked up all the code related to updates of 'oldestUnflushedStoreSequenceIds'. Every update to 'oldestUnflushedStoreSequenceIds' is guarded by synchronization on 'regionSequenceIdLock' except here: {code} private ConcurrentMap<byte[], Long> getOrCreateOldestUnflushedStoreSequenceIdsOfRegion( byte[] encodedRegionName) { .. oldestUnflushedStoreSequenceIdsOfRegion = new ConcurrentSkipListMap<byte[], Long>(Bytes.BYTES_COMPARATOR); ConcurrentMap<byte[], Long> alreadyPut = oldestUnflushedStoreSequenceIds.putIfAbsent(encodedRegionName, oldestUnflushedStoreSequenceIdsOfRegion); // Here, an empty map may be put into 'oldestUnflushedStoreSequenceIds' with no synchronization return alreadyPut == null ? oldestUnflushedStoreSequenceIdsOfRegion : alreadyPut; } {code} It should be a very rare bug, but it can lead to a server abort. It only exists in branch-1.1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
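A simple defensive guard, sketched from the snippet above (not necessarily how the issue was ultimately fixed), would skip a region whose per-store map is empty instead of letting Collections.min() throw NoSuchElementException and abort the log roller:
{code:java}
ConcurrentMap<byte[], Long> m = this.oldestUnflushedStoreSequenceIds.get(e.getKey());
if (m == null || m.isEmpty()) {
  continue;  // nothing recorded yet for this region (racily created empty map); skip it
}
long unFlushedVal = Collections.min(m.values());
{code}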
[jira] [Created] (HBASE-18156) Provide a tool to show cache summary
Allan Yang created HBASE-18156: -- Summary: Provide a tool to show cache summary Key: HBASE-18156 URL: https://issues.apache.org/jira/browse/HBASE-18156 Project: HBase Issue Type: New Feature Affects Versions: 2.0.0, 1.4.0 Reporter: Allan Yang Assignee: Allan Yang HBASE-17757 is already committed, but since there is no easy way to show the size distribution of cached blocks, it is hard to decide which unified size should be used. Here I provide a tool to show the details of the size distribution of cached blocks. This tool is well used in our production environment. It is a JSP page that summarizes the cache details like this: {code} BlockCache type:org.apache.hadoop.hbase.io.hfile.LruBlockCache LruBlockCache Total size:28.40 GB Current size:22.49 GB MetaBlock size:1.56 GB Free size:5.91 GB Block count:152684 Size distribution summary: BlockCacheSizeDistributionSummary [0 B<=blocksize<4 KB, blocks=833, heapSize=1.19 MB] BlockCacheSizeDistributionSummary [4 KB<=blocksize<8 KB, blocks=65, heapSize=310.83 KB] BlockCacheSizeDistributionSummary [8 KB<=blocksize<12 KB, blocks=175, heapSize=1.46 MB] BlockCacheSizeDistributionSummary [12 KB<=blocksize<16 KB, blocks=18, heapSize=267.43 KB] BlockCacheSizeDistributionSummary [16 KB<=blocksize<20 KB, blocks=512, heapSize=8.30 MB] BlockCacheSizeDistributionSummary [20 KB<=blocksize<24 KB, blocks=22, heapSize=499.66 KB] BlockCacheSizeDistributionSummary [24 KB<=blocksize<28 KB, blocks=24, heapSize=632.59 KB] BlockCacheSizeDistributionSummary [28 KB<=blocksize<32 KB, blocks=34, heapSize=1.02 MB] BlockCacheSizeDistributionSummary [32 KB<=blocksize<36 KB, blocks=31, heapSize=1.02 MB] BlockCacheSizeDistributionSummary [36 KB<=blocksize<40 KB, blocks=22, heapSize=838.58 KB] BlockCacheSizeDistributionSummary [40 KB<=blocksize<44 KB, blocks=28, heapSize=1.15 MB] {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade
Allan Yang created HBASE-18132: -- Summary: Low replication should be checked in period in case of datanode rolling upgrade Key: HBASE-18132 URL: https://issues.apache.org/jira/browse/HBASE-18132 Project: HBase Issue Type: Bug Affects Versions: 1.1.10, 1.4.0 Reporter: Allan Yang Assignee: Allan Yang For now, we only check a WAL for low replication when there is a sync operation (HBASE-2234), rolling the log if the WAL's replica count is lower than configured. But if the WAL has very few writes, or no writes at all, low replication will not be detected and no log will be rolled. That is a problem when rolling-upgrading datanodes: all replicas of a WAL with no writes will be restarted, the WAL file ends up in an abnormal state, and later attempts to open the file will always fail. I bring up a patch to check WALs for low replication at a configured period. When rolling-upgrading datanodes, as long as the restart interval between two nodes is bigger than the low-replication check period, the WAL will be closed and rolled normally. A UT in the patch shows everything. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
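The periodic check could be sketched with plain Java scheduling (the actual patch would hook into the region server's chore framework, and the Runnable here stands in for the existing sync-path low-replication check):
{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class LowReplicationCheckerSketch {
  private final ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();

  // 'checkLowReplicationAndMaybeRoll' is whatever currently runs on the sync path;
  // scheduling it at a fixed period means it also runs for WALs that receive no writes.
  void start(Runnable checkLowReplicationAndMaybeRoll, long periodMillis) {
    pool.scheduleAtFixedRate(checkLowReplicationAndMaybeRoll, periodMillis, periodMillis,
        TimeUnit.MILLISECONDS);
  }
}
{code}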
[jira] [Created] (HBASE-18058) Zookeeper retry sleep time should have a up limit
Allan Yang created HBASE-18058: -- Summary: Zookeeper retry sleep time should have a up limit Key: HBASE-18058 URL: https://issues.apache.org/jira/browse/HBASE-18058 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.4.0 Reporter: Allan Yang Assignee: Allan Yang Now, in {{RecoverableZooKeeper}}, the retry backoff sleep time grows exponentially, but it doesn't have any upper limit. This directly leads to a very long recovery time after ZooKeeper goes down for a while and comes back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
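A capped exponential backoff could be sketched as below (the cap and the method shape are illustrative, not the committed patch): the sleep still doubles per attempt but can never exceed a configured maximum, so recovery time stays bounded after a long ZooKeeper outage.
{code:java}
// baseSleepMillis: initial backoff; maxSleepMillis: the upper limit this issue asks for.
static long retrySleepMillis(long baseSleepMillis, int retryNumber, long maxSleepMillis) {
  long sleep = baseSleepMillis * (1L << Math.min(retryNumber, 30));  // cap the shift to avoid overflow
  return Math.min(sleep, maxSleepMillis);                            // never exceed the configured limit
}
{code}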
[jira] [Created] (HBASE-18014) A case of Region remain unassigned when table enabled
Allan Yang created HBASE-18014: -- Summary: A case of Region remain unassigned when table enabled Key: HBASE-18014 URL: https://issues.apache.org/jira/browse/HBASE-18014 Project: HBase Issue Type: Bug Affects Versions: 1.1.10, 1.4.0 Reporter: Allan Yang Assignee: Allan Yang Reproduction procedure: 1. Create a table; say its regions are opened on RS1. 2. Disable the table. 3. Abort RS1 and wait for SSH to complete. 4. Wait for a while; RS1 will be deleted from processedServers (a HashMap in {{RegionStates}} that stores processed dead servers). 5. Enable the table; the table's region will then remain unassigned until the master restarts. Why? When assigning regions after the table is enabled, AssignmentManager checks whether those regions are on servers that are dead but not yet processed. Since RS1 has already been deleted from the 'processedServers' map, the AssignmentManager thinks the region is on a dead-but-not-processed server, so it skips the assign and leaves the region to be handled by SSH. {code:java} case OFFLINE: if (useZKForAssignment && regionStates.isServerDeadAndNotProcessed(sn) && wasRegionOnDeadServerByMeta(region, sn)) { if (!regionStates.isRegionInTransition(region)) { LOG.info("Updating the state to " + State.OFFLINE + " to allow to be reassigned by SSH"); regionStates.updateRegionState(region, State.OFFLINE); } LOG.info("Skip assigning " + region.getRegionNameAsString() + ", it is on a dead but not processed yet server: " + sn); return null; } {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17969) Balance by table using SimpleLoadBalancer could end up imbalance
Allan Yang created HBASE-17969: -- Summary: Balance by table using SimpleLoadBalancer could end up imbalance Key: HBASE-17969 URL: https://issues.apache.org/jira/browse/HBASE-17969 Project: HBase Issue Type: Improvement Affects Versions: 1.1.10 Reporter: Allan Yang Assignee: Allan Yang This really happens in our production env. Here is an example. Say we have three RSes named r1, r2, r3, and a table named table1 with 3 regions distributed like this: r1 1, r2 1, r3 1. Each RS has one region, so table1 is balanced and the balancer will not run. If the region on r3 splits, it becomes: r1 1, r2 1, r3 2. For table1, on average each RS should have min=1, max=2 regions, so it is still balanced and the balancer will not run. Then a region on r3 splits again and the distribution becomes: r1 1, r2 1, r3 3. On average, each RS should have min=1, max=2 regions, so now the balancer will run. r1 and r2 already have min=1 regions, so the balancer won't do any operation on them. But r3, with 3 regions, exceeds max=2, so the balancer will remove one region from r3 and choose one of r1, r2 to move it to. Since r1 and r2 have the same load, the balancer will always choose r1, because servername r1 < r2 (alphabetical order, as sorted by ServerAndLoad's compareTo method). That is OK for table1 itself, but if every table in the cluster is in a similar situation to table1, the load in the cluster will always end up as r1 > r2 > r3. So the solution here is: when every RS has reached its min regions (min = total regions / servers) but there are still regions that need to move, shuffle the regionservers before moving. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
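The proposed tie-breaking change could be sketched as follows (illustrative only, not the committed patch): once every server is at its minimum, shuffle a copy of the candidate list before handing out the remaining regions, so ties are no longer always resolved toward the alphabetically smallest server name.
{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BalancerShuffleSketch {
  // Returns the candidate destination servers in a randomized order; the caller then
  // assigns the leftover regions round-robin over this shuffled list.
  static <S> List<S> destinationsForRemainingRegions(List<S> serversAtMin) {
    List<S> candidates = new ArrayList<>(serversAtMin);
    Collections.shuffle(candidates);   // break load ties randomly instead of by name order
    return candidates;
  }
}
{code}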
[jira] [Created] (HBASE-17808) FastPath for RWQueueRpcExecutor
Allan Yang created HBASE-17808: -- Summary: FastPath for RWQueueRpcExecutor Key: HBASE-17808 URL: https://issues.apache.org/jira/browse/HBASE-17808 Project: HBase Issue Type: Improvement Components: rpc Affects Versions: 2.0.0 Reporter: Allan Yang Assignee: Allan Yang FastPath for the FIFO RPC scheduler was introduced in HBASE-16023, but it is not implemented for the RW queues. In this issue, I use FastPathBalancedQueueRpcExecutor in the RW queues, so anyone who wants to isolate their read/write requests can also benefit from the fast path. I haven't tested the performance yet, but since I haven't changed any of the core implementation of FastPathBalancedQueueRpcExecutor, it should show the same performance as in HBASE-16023. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17757) Unify blocksize after encoding to decrease memory fragment
Allan Yang created HBASE-17757: -- Summary: Unify blocksize after encoding to decrease memory fragment Key: HBASE-17757 URL: https://issues.apache.org/jira/browse/HBASE-17757 Project: HBase Issue Type: New Feature Reporter: Allan Yang Assignee: Allan Yang
Usually, we store encoded (uncompressed) blocks in the blockcache/bucketCache. Though we have set the blocksize, the blocksize varies after encoding. Varied blocksizes cause a memory fragmentation problem, which eventually results in more full GCs. In order to relieve the memory fragmentation, this issue adjusts the encoded block to a unified size (see the sketch below). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
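A simplified illustration of the idea, using a hypothetical padding helper rather than any real HBase API: the encoded (uncompressed) payload is rounded up to a fixed boundary before caching, so cached blocks land on a small set of sizes and fragment memory less:
{code:java}
import java.util.Arrays;

public class UnifiedBlockSizeSketch {
  // Assumed target size; in practice this would follow the configured block size.
  static final int UNIFIED_BLOCK_SIZE = 64 * 1024;

  // Rounds the encoded payload up to the next multiple of UNIFIED_BLOCK_SIZE.
  static byte[] padToUnifiedSize(byte[] encodedBlock) {
    int unified = ((encodedBlock.length + UNIFIED_BLOCK_SIZE - 1) / UNIFIED_BLOCK_SIZE)
        * UNIFIED_BLOCK_SIZE;
    return encodedBlock.length == unified
        ? encodedBlock
        : Arrays.copyOf(encodedBlock, unified); // the tail is zero padding
  }

  public static void main(String[] args) {
    byte[] encoded = new byte[53 * 1024 + 17]; // size varies after encoding
    System.out.println(padToUnifiedSize(encoded).length); // prints 65536
  }
}
{code}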
[jira] [Created] (HBASE-17718) Difference between RS's servername and its ephemeral node cause SSH stop working
Allan Yang created HBASE-17718: -- Summary: Difference between RS's servername and its ephemeral node cause SSH stop working Key: HBASE-17718 URL: https://issues.apache.org/jira/browse/HBASE-17718 Project: HBase Issue Type: Bug Affects Versions: 1.1.8, 1.2.4, 2.0.0 Reporter: Allan Yang Assignee: Allan Yang
After HBASE-9593, the RS puts up an ephemeral node in ZK before reporting for duty. But if the hosts config (/etc/hosts) differs between master and RS, the RS's serverName can be different from the one stored in the ephemeral zk node. The email mentioned in HBASE-13753 (http://mail-archives.apache.org/mod_mbox/hbase-user/201505.mbox/%3CCANZDn9ueFEEuZMx=pZdmtLsdGLyZz=rrm1N6EQvLswYc1z-H=g...@mail.gmail.com%3E) is exactly what happened in our production env. But what the email didn't point out is that the difference between the serverName in the RS and the zk node can cause SSH to stop working, as we can see from the code in {{RegionServerTracker}}:
{code}
@Override
public void nodeDeleted(String path) {
  if (path.startsWith(watcher.rsZNode)) {
    String serverName = ZKUtil.getNodeName(path);
    LOG.info("RegionServer ephemeral node deleted, processing expiration [" + serverName + "]");
    ServerName sn = ServerName.parseServerName(serverName);
    if (!serverManager.isServerOnline(sn)) {
      LOG.warn(serverName.toString() + " is not online or isn't known to the master."+
        "The latter could be caused by a DNS misconfiguration.");
      return;
    }
    remove(sn);
    this.serverManager.expireServer(sn);
  }
}
{code}
The server will not be processed by SSH/ServerCrashProcedure, and the regions on this server will not be assigned again until a master restart or failover. I know HBASE-9593 was meant to fix the case where an RS reports for duty and crashes before it can put up a zk node. That is a very rare case. But the issue I mentioned can happen more often (due to DNS, config, etc.) and has more severe consequences. So here I offer some solutions to discuss: 1. Revert HBASE-9593 from all branches; Andrew Purtell has already reverted it in branch-0.98 2. Abort the RS if the master returns a different name, otherwise SSH can't work properly 3. The master accepts whatever servername is reported by the RS and doesn't change it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17673) Monitored RPC Handler not show in the WebUI
Allan Yang created HBASE-17673: -- Summary: Monitored RPC Handler not show in the WebUI Key: HBASE-17673 URL: https://issues.apache.org/jira/browse/HBASE-17673 Project: HBase Issue Type: Bug Affects Versions: 1.1.8, 1.2.4, 2.0.0, 3.0.0 Reporter: Allan Yang Assignee: Allan Yang Priority: Minor
This issue has been fixed once in HBASE-14674. But I noticed that almost all RS in our production environment still have this problem. The strange thing is that newly started servers do not seem to be affected. After digging for a while, I realized the {{CircularFifoBuffer}} introduced by HBASE-10312 is the root cause. An RPC handler's monitoredTask is only created once; if the server is flooded with tasks, the RPC monitoredTask can be purged by the CircularFifoBuffer and is then never visible in the WebUI again. So my solution is to create a separate list for RPC monitoredTasks (a sketch follows below). It is OK to do so since the number of RPC handlers is fixed; it won't increase or decrease during the lifetime of the server. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
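A minimal sketch of the proposed separation, with hypothetical names rather than the actual TaskMonitor code: handler tasks live in their own plain list that is created once per handler and never evicted, independent of the bounded buffer holding ordinary MonitoredTasks:
{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SeparateRpcTaskListSketch {
  // Hypothetical stand-in for a MonitoredRPCHandler status object.
  static class MonitoredTask {
    final String description;
    MonitoredTask(String description) {
      this.description = description;
    }
  }

  // Fixed number of handlers -> fixed-size list that nothing ever purges.
  private final List<MonitoredTask> rpcTasks = new ArrayList<>();

  MonitoredTask createRpcStatus(String handlerName) {
    MonitoredTask task = new MonitoredTask("RpcServer.handler=" + handlerName);
    rpcTasks.add(task); // added once, lives for the lifetime of the server
    return task;
  }

  // The WebUI would render ordinary tasks plus this list.
  List<MonitoredTask> getRpcTasks() {
    return Collections.unmodifiableList(rpcTasks);
  }
}
{code}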
[jira] [Created] (HBASE-17506) started mvcc transaction is not completed in branch-1
Allan Yang created HBASE-17506: -- Summary: started mvcc transaction is not completed in branch-1 Key: HBASE-17506 URL: https://issues.apache.org/jira/browse/HBASE-17506 Project: HBase Issue Type: Bug Affects Versions: 1.4.0 Reporter: Allan Yang Assignee: Allan Yang
In {{doMiniBatchMutation}}, if we are in replay and the nonce of the mutation differs, we append the accumulated edits as a separate wal entry. But after HBASE-14465, we start an mvcc transaction in the ringbuffer's append thread. So every time we append a wal entry we start an mvcc transaction, but we never complete that transaction anywhere. This can block other transactions of this region.
{code}
// txid should always increase, so having the one from the last call is ok.
// we use HLogKey here instead of WALKey directly to support legacy coprocessors.
walKey = new ReplayHLogKey(this.getRegionInfo().getEncodedNameAsBytes(),
    this.htableDescriptor.getTableName(), now, m.getClusterIds(),
    currentNonceGroup, currentNonce, mvcc);
txid = this.wal.append(this.htableDescriptor, this.getRegionInfo(), walKey,
    walEdit, true);
walEdit = new WALEdit(cellCount, isInReplay);
walKey = null;
{code}
Looking at the master branch, there is no such problem. It has a method named {{appendCurrentNonces}}:
{code}
private void appendCurrentNonces(final Mutation mutation, final boolean replay,
    final WALEdit walEdit, final long now, final long currentNonceGroup, final long currentNonce)
    throws IOException {
  if (walEdit.isEmpty()) return;
  if (!replay) throw new IOException("Multiple nonces per batch and not in replay");
  WALKey walKey = new WALKey(this.getRegionInfo().getEncodedNameAsBytes(),
      this.htableDescriptor.getTableName(), now, mutation.getClusterIds(),
      currentNonceGroup, currentNonce, mvcc, this.getReplicationScope());
  this.wal.append(this.getRegionInfo(), walKey, walEdit, true);
  // Complete the mvcc transaction started down in append else it will block others
  this.mvcc.complete(walKey.getWriteEntry());
}
{code}
Yes, the easiest way to fix branch-1 is to complete the writeEntry as the master branch does. But is it really fine to do this? Question 1: completing the mvcc transaction before waiting for the sync creates a disturbance of data visibility. Question 2: in what circumstance will there be different nonces and nonce groups in a single wal entry? Nonces are used in append/increment, but in {{batchMutate}} we treat them differently and append one wal entry for each of them. So I think no test can reach this code path; that may be why no one has found this bug (please tell me if I'm wrong). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-17482) mvcc mechanism failed when using mvccPreAssign
Allan Yang created HBASE-17482: -- Summary: mvcc mechanism failed when using mvccPreAssign Key: HBASE-17482 URL: https://issues.apache.org/jira/browse/HBASE-17482 Project: HBase Issue Type: Bug Affects Versions: 2.0.. Reporter: Allan Yang Assignee: Allan Yang Priority: Critical
If mvccPreAssign and ASYNC_WAL are used together, cells may be committed to the memstore before the append thread can stamp a seqid on them. The attached unit test shows everything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-17475) Stack overflow in AsyncProcess if retry too much
Allan Yang created HBASE-17475: -- Summary: Stack overflow in AsyncProcess if retry too much Key: HBASE-17475 URL: https://issues.apache.org/jira/browse/HBASE-17475 Project: HBase Issue Type: Bug Components: API Affects Versions: 2.0.0, 1.4.0 Reporter: Allan Yang Assignee: Allan Yang
In AsyncProcess, we resubmit the retry task in the same thread:
{code}
// run all the runnables
for (Runnable runnable : runnables) {
  if ((--actionsRemaining == 0) && reuseThread) {
    runnable.run();
  } else {
    try {
      pool.submit(runnable);
    }
..
{code}
But if we retry too many times, a stack overflow will soon occur. This is very common in clusters with Phoenix: Phoenix needs to write the index table in the normal write path, so retries can cause a stack overflow exception (a sketch of one way to break the recursion follows after the thread dump below).
{noformat}
"htable-pool19-t2" #582 daemon prio=5 os_prio=0 tid=0x02687800 nid=0x4a96 waiting on condition [0x7fe3f6301000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
 at java.lang.Thread.sleep(Native Method)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1174)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886)
 at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181)
 at org.apache.hadoop.hbase.client.Asy
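A simplified sketch of one way to break the recursion, with hypothetical names and structure; it is not the actual AsyncProcess fix, just the idea of never running a retry on the calling thread so each resubmission starts from a fresh, shallow stack:
{code:java}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ResubmitOnPoolSketch {
  private final ExecutorService pool = Executors.newFixedThreadPool(4);

  void sendMultiAction(List<Runnable> runnables, boolean isRetry) {
    int actionsRemaining = runnables.size();
    for (Runnable runnable : runnables) {
      boolean reuseThread = (--actionsRemaining == 0) && !isRetry;
      if (reuseThread) {
        runnable.run();        // only a first attempt may reuse the calling thread
      } else {
        pool.submit(runnable); // retries always go back to the pool, so the stack stays flat
      }
    }
  }
}
{code}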
[jira] [Created] (HBASE-17471) Region Seqid will be out of order in WAL if using mvccPreAssign
Allan Yang created HBASE-17471: -- Summary: Region Seqid will be out of order in WAL if using mvccPreAssign Key: HBASE-17471 URL: https://issues.apache.org/jira/browse/HBASE-17471 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.4.0 Reporter: Allan Yang Assignee: Allan Yang
mvccPreAssign was introduced by HBASE-16698 and truly improved write performance, especially in the ASYNC_WAL scenario. But mvccPreAssign is only used in {{doMiniBatchMutate}}, not in the Increment/Append path. If Increment/Append and batch put are used against the same region in parallel, the seqid of that region may not be monotonically increasing in the WAL, since one write path acquires the mvcc/seqid before the append, and the other acquires it in the append/sync consumer thread. The out-of-order situation can easily be reproduced by a simple UT, which is attached. I modified the code to assert on the disorder:
{code}
if(this.highestSequenceIds.containsKey(encodedRegionName)) {
  assert highestSequenceIds.get(encodedRegionName) < sequenceid;
}
{code}
I'd like to say: if we allow disorder in WALs, then this is not an issue. But as far as I know, if {{highestSequenceIds}} is not properly set, some WALs may not be archived to oldWALs correctly. What I haven't figured out yet is whether disorder in the WAL can cause data loss when recovering from a disaster; if so, it is a big problem that needs to be fixed. I have fixed this problem in our custom 1.1.x branch; my solution is to use mvccPreAssign everywhere and make it non-configurable, since mvccPreAssign is indeed a better way than assigning the seqid in the ringbuffer thread while keeping handlers waiting for it. If anyone thinks this is doable, I will port it to branch-1 and the master branch and upload it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-17319) Truncate table with preserve after split may cause truncate fail
Allan Yang created HBASE-17319: -- Summary: Truncate table with preserve after split may cause truncate fail Key: HBASE-17319 URL: https://issues.apache.org/jira/browse/HBASE-17319 Project: HBase Issue Type: Bug Components: Admin Affects Versions: 1.2.4, 1.1.7 Reporter: Allan Yang Assignee: Allan Yang
In TruncateTableProcedure, when getting the table's regions from meta to recreate new regions, split parents are not excluded, so the new regions can end up with the same start key and the same region dir:
{noformat} 2016-12-14 20:15:22,231 WARN [RegionOpenAndInitThread-writetest-1] regionserver.HRegionFileSystem: Trying to create a region that already exists on disk: hdfs://hbasedev1/zhengyan-hbase11-func2/.tmp/data/default/writetest/9b2c8d1539cd92661703ceb8a4d518a1 {noformat}
The TruncateTableProcedure will then retry forever and never succeed (a sketch of the fix idea follows below). An attached unit test shows everything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
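A sketch of the fix idea mentioned above, assuming HRegionInfo#isSplitParent()/isOffline() as the filter; the exact call site inside TruncateTableProcedure may differ:
{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.HRegionInfo;

public class SkipSplitParentsSketch {
  // Keep only live regions: a split parent's daughters already cover its key range,
  // so recreating the parent as well would duplicate start keys and region dirs.
  static List<HRegionInfo> regionsToRecreate(List<HRegionInfo> regionsFromMeta) {
    List<HRegionInfo> result = new ArrayList<>();
    for (HRegionInfo hri : regionsFromMeta) {
      if (hri.isSplitParent() || hri.isOffline()) {
        continue;
      }
      result.add(hri);
    }
    return result;
  }
}
{code}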
[jira] [Created] (HBASE-17275) Assign timeout cause region unassign forever
Allan Yang created HBASE-17275: -- Summary: Assign timeout cause region unassign forever Key: HBASE-17275 URL: https://issues.apache.org/jira/browse/HBASE-17275 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 1.1.7, 1.2.3 Reporter: Allan Yang Assignee: Allan Yang
This is a real case that happened in my test cluster. I had more than 8000 regions to assign when I restarted the cluster, but I only started one regionserver. That means the master needed to assign these 8000 regions to a single server (I know it is not right, but it was just for testing). The RS received the open region RPC and began to open regions. But due to the huge number of regions, the master timed out the RPC call after 1 minute (though some regions had actually already been opened), as you can see from log 1.
{noformat}
1. 2016-11-22 10:17:32,285 INFO [example.org:30001.activeMasterManager] master.AssignmentManager: Unable to communicate with example.org,30003,1479780976834 in order to assign regions, java.io.IOException: Call to /example.org:30003 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1, waitTime=60001, operationTimeout=6 expired.
 at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1338)
 at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1272)
 at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
 at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:290)
 at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:30177)
 at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:1000)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1719)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2828)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2775)
 at org.apache.hadoop.hbase.master.AssignmentManager.assignAllUserRegions(AssignmentManager.java:2876)
 at org.apache.hadoop.hbase.master.AssignmentManager.processDeadServersAndRegionsInTransition(AssignmentManager.java:646)
 at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:493)
 at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:796)
 at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:188)
 at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1711)
 at java.lang.Thread.run(Thread.java:756)
Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1, waitTime=60001, operationTimeout=6 expired.
 at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:81)
 at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1246)
 ... 14 more
{noformat}
For the region 7e9aee32eb98a6fc9d503b99fc5f9615 (like many others), after the timeout the master used a pool to re-assign it, as in 2.
{noformat}
2. 2016-11-22 10:17:32,303 DEBUG [AM.-pool1-t26] master.AssignmentManager: Force region state offline {7e9aee32eb98a6fc9d503b99fc5f9615 state=PENDING_OPEN, ts=1479780992078, server=example.org,30003,1479780976834}
{noformat}
But this region had actually been opened on the RS; (maybe) due to the huge pressure, the OPENED zk event was received by the master late, as you can tell from 3, "which is more than 15 seconds late".
{noformat}
3. 2016-11-22 10:17:32,304 DEBUG [AM.ZK.Worker-pool2-t3] master.AssignmentManager: Handling RS_ZK_REGION_OPENED, server=example.org,30003,1479780976834, region=7e9aee32eb98a6fc9d503b99fc5f9615, which is more than 15 seconds late, current_state={7e9aee32eb98a6fc9d503b99fc5f9615 state=PENDING_OPEN, ts=1479780992078, server=example.org,30003,1479780976834}
{noformat}
In the meantime, the master was still trying to re-assign this region in another thread. The master first closed this region to guard against a multiple assign, then changed the state of this region from PENDING_OPEN > OFFLINE > PENDING_OPEN. Its RIT node in zk was also transitioned to OFFLINE, as in 4, 5, 6, 7.
{noformat}
4. 2016-11-22 10:17:32,321 DEBUG [AM.-pool1-t26] master.AssignmentManager: Sent CLOSE to example.org,30003,1479780976834 for region test,P7HQ55,1475985973151.7e9aee32eb98a6fc9d503b99fc5f9615.
5. 2016-11-22 10:17:32,461 INFO [AM.-pool1-t26] master.RegionStates: Transition {7e9aee32eb98a6fc9d503b99fc5f9615 state=PENDING_OPEN, ts=1479781052344, server=example.org,30003,1479780976834} to {7e9aee32eb98a6fc9d503b99fc5f9615 state=OFFLINE, ts=1479781052461, server=example.org,30003,1479780976834}
6. 2016-11-22 10:17:32,469 DEBUG [AM.-poo
[jira] [Created] (HBASE-17264) Process RIT with offline state will always fail to open in the first time
Allan Yang created HBASE-17264: -- Summary: Process RIT with offline state will always fail to open in the first time Key: HBASE-17264 URL: https://issues.apache.org/jira/browse/HBASE-17264 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 1.1.7 Reporter: Allan Yang Assignee: Allan Yang Attachments: HBASE-17264-branch1.1.patch
In AssignmentManager#processRegionsInTransition, when handling regions in the M_ZK_REGION_OFFLINE state, we use a handler to reassign the region. But when calling assign, we pass a flag telling it not to set the zk node:
{code}
case M_ZK_REGION_OFFLINE:
  // Insert in RIT and resend to the regionserver
  regionStates.updateRegionState(rt, State.PENDING_OPEN);
  final RegionState rsOffline = regionStates.getRegionState(regionInfo);
  this.executorService.submit(
    new EventHandler(server, EventType.M_MASTER_RECOVERY) {
      @Override
      public void process() throws IOException {
        ReentrantLock lock = locker.acquireLock(regionInfo.getEncodedName());
        try {
          RegionPlan plan = new RegionPlan(regionInfo, null, sn);
          addPlan(encodedName, plan);
          assign(rsOffline, false, false); //we decide to not to setOfflineInZK
        } finally {
          lock.unlock();
        }
      }
    });
  break;
{code}
But when setOfflineInZK is false, we pass a zk node version of -1 to the regionserver, meaning the zk node does not exist. In fact the offline zk node does exist, with a different version, so the RegionServer reports a failed open because of this. This situation truly happened in our test environment. The master will receive the FAILED_OPEN zk event and retry later, but due to another bug (I will open another jira later), the region will remain in the closed state forever. The master assigns the region in RIT:
{noformat}
2016-11-23 17:11:46,842 INFO [example.org:30001.activeMasterManager] master.AssignmentManager: Processing 57513956a7b671f4e8da1598c2e2970e in state: M_ZK_REGION_OFFLINE
2016-11-23 17:11:46,842 INFO [example.org:30001.activeMasterManager] master.RegionStates: Transition {57513956a7b671f4e8da1598c2e2970e state=OFFLINE, ts=1479892306738, server=example.org,30003,1475893095003} to {57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306842, server=example.org,30003,1479780976834}
2016-11-23 17:11:46,842 INFO [example.org:30001.activeMasterManager] master.AssignmentManager: Processed region 57513956a7b671f4e8da1598c2e2970e in state M_ZK_REGION_OFFLINE, on server: example.org,30003,1479780976834
2016-11-23 17:11:46,843 INFO [MASTER_SERVER_OPERATIONS-example.org:30001-0] master.AssignmentManager: Assigning test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e. to example.org,30003,1479780976834
{noformat}
The RegionServer received the open region request and created a RegionOpenHandler to open the region, only to find that the RIT node's version was not what it expected. The RS transitions the RIT ZK node to failed open in the end:
{noformat}
2016-11-23 17:11:46,860 WARN [RS_OPEN_REGION-example.org:30003-1] coordination.ZkOpenRegionCoordination: Failed transition from OFFLINE to OPENING for region=57513956a7b671f4e8da1598c2e2970e
2016-11-23 17:11:46,861 WARN [RS_OPEN_REGION-example.org:30003-1] handler.OpenRegionHandler: Region was hijacked? Opening cancelled for encodedName=57513956a7b671f4e8da1598c2e2970e
2016-11-23 17:11:46,860 WARN [RS_OPEN_REGION-example.org:30003-1] zookeeper.ZKAssign: regionserver:30003-0x15810b5f633015f, quorum=hbase4dev04.et2sqa:2181,hbase4dev05.et2sqa:2181,hbase4dev06.et2sqa:2181, baseZNode=/test-hbase11-func2 Attempt to transition the unassigned node for 57513956a7b671f4e8da1598c2e2970e from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING failed, the node existed but was version 3 not the expected version -1
{noformat}
The master received this zk event and began to handle RS_ZK_REGION_FAILED_OPEN:
{noformat}
2016-11-23 17:11:46,944 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, server=example.org,30003,1479780976834, region=57513956a7b671f4e8da1598c2e2970e, current_state={57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306843, server=example.org,30003,1479780976834}
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-17265) Region left unassigned in master failover when failed open
Allan Yang created HBASE-17265: -- Summary: Region left unassigned in master failover when failed open Key: HBASE-17265 URL: https://issues.apache.org/jira/browse/HBASE-17265 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 1.1.7 Reporter: Allan Yang Assignee: Allan Yang Attachments: HBASE-17265-branch1.patch
This problem is very similar to HBASE-13330. It is also a result of ServerShutdownHandler and AssignmentManager each 'thinking' the region will be assigned by the other, leaving the region unassigned. But HBASE-13330 only dealt with RS_ZK_REGION_FAILED_OPEN in {{processRegionInTransition}}; a failed region open may also happen after {{processRegionInTransition}}. In my case, when the master failed over, it assigned all RIT regions, but some failed to open (due to HBASE-17264). The AssignmentManager received the zk event and skipped assigning the region (this region had been opened on a failed server before and was already in RIT before the master failover). SSH also skipped assigning it because it was in RIT on another RS. The master received a zk event of RS_ZK_REGION_FAILED_OPEN and began to handle it:
{noformat}
2016-11-23 17:11:46,944 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, server=example.org,30003,1479780976834, region=57513956a7b671f4e8da1598c2e2970e, current_state={57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306843, server=example.org,30003,1479780976834}
2016-11-23 17:11:46,944 INFO [AM.ZK.Worker-pool2-t1] master.RegionStates: Transition {57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306843, server=example.org,30003,1479780976834} to {57513956a7b671f4e8da1598c2e2970e state=CLOSED, ts=1479892306944, server=example.org,30003,1479780976834}
2016-11-23 17:11:46,945 WARN [AM.ZK.Worker-pool2-t1] master.RegionStates: 57513956a7b671f4e8da1598c2e2970e moved to CLOSED on example.org,30003,1479780976834, expected example.org,30003,1475893095003
2016-11-23 17:11:46,950 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Found an existing plan for test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e. destination server is example.org,30003,1479780976834 accepted as a dest server = false
2016-11-23 17:11:47,012 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: No previous transition plan found (or ignoring an existing plan) for test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e.; generated random plan=hri=test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e., src=, dest=11.239.21.235,30003,1479781410131; 2 (online=3) available servers, forceNewPlan=true
2016-11-23 17:11:47,014 DEBUG [AM.ZK.Worker-pool2-t1] handler.ClosedRegionHandler: Handling CLOSED event for 57513956a7b671f4e8da1598c2e2970e
2016-11-23 17:11:47,015 WARN [AM.ZK.Worker-pool2-t1] master.RegionStates: 57513956a7b671f4e8da1598c2e2970e moved to CLOSED on example.org,30003,1479780976834, expected example.org,30003,1475893095003
{noformat}
The AssignmentManager skipped assigning it because the region was on a failed server:
{noformat}
2016-11-23 17:11:47,017 INFO [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Skip assigning test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e., it's host example.org,30003,1475893095003 is dead but not processed yet
{noformat}
SSH also skipped it because it was in RIT on another server:
{noformat}
2016-11-23 17:12:17,850 INFO [MASTER_SERVER_OPERATIONS-example.org:30001-0] master.RegionStates: Transitioning {57513956a7b671f4e8da1598c2e2970e state=CLOSED, ts=1479892307015, server=example.org,30003,1479780976834} will be handled by SSH for example.org,30003,1475893095003
2016-11-23 17:12:17,910 INFO [MASTER_SERVER_OPERATIONS-example.org:30001-0] handler.ServerShutdownHandler: Skip assigning region in transition on other server{57513956a7b671f4e8da1598c2e2970e state=CLOSED, ts=1479892307015, server=example.org,30003,1479780976834}
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-17113) finding middle key in HFileV2 is always wrong and can cause IndexOutOfBoundsException
Allan Yang created HBASE-17113: -- Summary: finding middle key in HFileV2 is always wrong and can cause IndexOutOfBoundsException Key: HBASE-17113 URL: https://issues.apache.org/jira/browse/HBASE-17113 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 1.2.4, 0.98.23, 1.1.7, 0.94.17, 2.0.0 Reporter: Allan Yang Assignee: Allan Yang
When we want to split a region, we need to get the middle rowkey from the biggest store file. Here is the code from HFileBlockIndex.midkey() which helps us find an approximate middle key:
{code}
// Caching, using pread, assuming this is not a compaction.
HFileBlock midLeafBlock = cachingBlockReader.readBlock(
    midLeafBlockOffset, midLeafBlockOnDiskSize, true, true, false, true,
    BlockType.LEAF_INDEX, null);
ByteBuffer b = midLeafBlock.getBufferWithoutHeader();
int numDataBlocks = b.getInt();
int keyRelOffset = b.getInt(Bytes.SIZEOF_INT * (midKeyEntry + 1));
int keyLen = b.getInt(Bytes.SIZEOF_INT * (midKeyEntry + 2)) - keyRelOffset - SECONDARY_INDEX_ENTRY_OVERHEAD;
int keyOffset = Bytes.SIZEOF_INT * (numDataBlocks + 2) + keyRelOffset + SECONDARY_INDEX_ENTRY_OVERHEAD;
targetMidKey = ByteBufferUtils.toBytes(b, keyOffset, keyLen);
{code}
Each entry of a non-root block index contains three objects: 1. Offset of the block referenced by this entry in the file (long) 2. On-disk size of the referenced block (int) 3. RowKey. But when we calculate the keyLen from the entry, we forget to take away the 12-byte overhead (items 1 and 2 above, SECONDARY_INDEX_ENTRY_OVERHEAD in the code). So the keyLen is always 12 bytes bigger than the real rowkey length, and every time we read the rowkey from the entry, we also read 12 bytes from the next entry. No exception is thrown unless the middle key is in the last entry of the non-root block index, which causes an IndexOutOfBoundsException. That is exactly what HBASE-16097 is suffering from.
{code}
2016-11-16 05:27:31,991 ERROR [MemStoreFlusher.1] regionserver.MemStoreFlusher: Cache flusher failed for entry [flush region hitsdb,\x14\x03\x83\x1AX\x1A\x9A \x00\x00\x07\x00\x00\x07\x00\x00\x09\x00\x00\x09\x00\x01\x9F\x00F\xE3\x00\x00\x0A\x00\x01~\x00\x00\x08\x00\x5C\x09\x00\x03\x11\x00\xEF\x99,1478311873096.79d3f7f285396b6896f3229e2bcac7af.]
java.lang.IndexOutOfBoundsException
 at java.nio.Buffer.checkIndex(Buffer.java:532)
 at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
 at org.apache.hadoop.hbase.util.ByteBufferUtils.toBytes(ByteBufferUtils.java:490)
 at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:349)
 at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:529)
 at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1527)
 at org.apache.hadoop.hbase.regionserver.StoreFile.getFileSplitPoint(StoreFile.java:684)
 at org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.getSplitPoint(DefaultStoreFileManager.java:126)
 at org.apache.hadoop.hbase.regionserver.HStore.getSplitPoint(HStore.java:1976)
 at org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:82)
 at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:7614)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:521)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
 at java.lang.Thread.run(Thread.java:756)
{code}
It is quite a serious bug. It may have existed since HFileV2 was invented, but no one has found it! Since this bug ONLY happens when finding a middle key, and since we compare rowkeys from the left side, reading 12 extra bytes on the right side is usually harmless, so no one noticed. It even wouldn't throw an IndexOutOfBoundsException before HBASE-12297, since {{Arrays.copyOfRange}} was used, which checks the limit to ensure the length won't run past the end of the array. But now {{ByteBufferUtils.toBytes}} is used and an IndexOutOfBoundsException is thrown. It happened in our production environment. Because of this bug, the region can't be split and grows bigger and bigger. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16856) Exception message in SyncRunner.run() should print currentSequence but syncFutureSequence
Allan Yang created HBASE-16856: -- Summary: Exception message in SyncRunner.run() should print currentSequence but syncFutureSequence Key: HBASE-16856 URL: https://issues.apache.org/jira/browse/HBASE-16856 Project: HBase Issue Type: Bug Components: wal Affects Versions: 1.1.7, 1.2.2, 2.0.0 Reporter: Allan Yang Assignee: Allan Yang Priority: Minor
A very small bug, a typo in an exception message:
{code}
if (syncFutureSequence > currentSequence) {
  throw new IllegalStateException("currentSequence=" + syncFutureSequence
      + ", syncFutureSequence=" + syncFutureSequence);
}
{code}
It should print both currentSequence and syncFutureSequence, but it prints syncFutureSequence twice (the corrected snippet is sketched below). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
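The fix is presumably just to print the right variable on the left-hand side, mirroring the quoted snippet:
{code}
if (syncFutureSequence > currentSequence) {
  throw new IllegalStateException("currentSequence=" + currentSequence
      + ", syncFutureSequence=" + syncFutureSequence);
}
{code}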
[jira] [Created] (HBASE-16816) HMaster.move() should throw exception if region to move is not online
Allan Yang created HBASE-16816: -- Summary: HMaster.move() should throw exception if region to move is not online Key: HBASE-16816 URL: https://issues.apache.org/jira/browse/HBASE-16816 Project: HBase Issue Type: Bug Components: Admin Affects Versions: 1.1.2 Reporter: Allan Yang Assignee: Allan Yang Priority: Minor
The move region function in HMaster only checks whether the region to move exists:
{code}
if (regionState == null) {
  throw new UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
}
{code}
It does not report anything if the region is split or in transition, i.e. not movable, so the caller has no way to know that the move region operation failed. This is a problem for "region_move.rb": it only gives up moving a region if an exception is thrown; otherwise, it will wait until a timeout and retry. Without an exception, it has no idea that the region is not movable (a sketch of the proposed check follows below).
{code}
begin
  admin.move(Bytes.toBytes(r.getEncodedName()), Bytes.toBytes(newServer))
rescue java.lang.reflect.UndeclaredThrowableException,
    org.apache.hadoop.hbase.UnknownRegionException => e
  $LOG.info("Exception moving " + r.getEncodedName() + "; split/moved? Continuing: " + e)
  return
end
# Wait till its up on new server before moving on
maxWaitInSeconds = admin.getConfiguration.getInt("hbase.move.wait.max", 60)
maxWait = Time.now + maxWaitInSeconds
while Time.now < maxWait
  same = isSameServer(admin, r, original)
  break unless same
  sleep 0.1
end
end
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
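A self-contained sketch of the proposed behavior, using simplified hypothetical types (the real patch would work on HMaster/RegionState and may choose a different exception): move() should fail loudly instead of returning silently when the region is not online:
{code:java}
public class MoveCheckSketch {
  enum RegionStateKind { OPEN, SPLIT, IN_TRANSITION }

  static class UnknownRegionException extends Exception {
    UnknownRegionException(String msg) { super(msg); }
  }

  static class RegionNotOnlineException extends Exception {
    RegionNotOnlineException(String msg) { super(msg); }
  }

  static void checkMovable(String encodedRegionName, RegionStateKind state)
      throws UnknownRegionException, RegionNotOnlineException {
    if (state == null) {
      throw new UnknownRegionException(encodedRegionName);
    }
    if (state != RegionStateKind.OPEN) {
      // Previously this case fell through silently and callers such as
      // region_move.rb kept waiting for a move that could never happen.
      throw new RegionNotOnlineException(encodedRegionName + " is " + state);
    }
  }
}
{code}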
[jira] [Created] (HBASE-16649) Truncate table with splits preserved can cause both data loss and truncated data appeared again
Allan Yang created HBASE-16649: -- Summary: Truncate table with splits preserved can cause both data loss and truncated data appeared again Key: HBASE-16649 URL: https://issues.apache.org/jira/browse/HBASE-16649 Project: HBase Issue Type: Bug Affects Versions: 1.1.3 Reporter: Allan Yang
Since truncating a table with splits preserved deletes the hfiles but reuses the previous regioninfo, it can cause odd behaviors.
- Case 1: *Data appears again after truncate* Reproduce procedure:
1. Create a table, let's say 'test'
2. Write data to 'test', making sure the memstore of 'test' is not empty
3. Truncate 'test' with splits preserved
4. Kill the regionserver hosting the region(s) of 'test'
5. Start the regionserver; now it is time to witness the miracle: the truncated data appears again in table 'test'
- Case 2: *Data loss* Reproduce procedure:
1. Create a table, let's say 'test'
2. Write some data to 'test', no matter how much
3. Truncate 'test' with splits preserved
4. Write some data, but less than in step 2, since we don't want the seqid to run over the one from step 2
5. Kill the regionserver hosting the region(s) of 'test'
6. Restart the regionserver. Congratulations! The data written in step 4 is now all lost.
*Why?* For case 1: since preserving splits in the truncate table procedure does not change the regioninfo, when log replay happens, the 'unflushed' data is restored back into the region. For case 2: flushedSequenceIdByRegion is stored in the Master in a map keyed by the region's encodedName. Although the table is truncated, the region's name is not changed since we chose to preserve the splits. So after truncating the table, the region's sequenceid is reset on the regionserver, but not in the master. When a flush comes and reports to the master, the master rejects the sequenceid update since the new one is smaller than the old one. So during log replay, all the edits written in step 4 are skipped since they have a smaller seqid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16572) Sync method in RecoverableZooKeeper failed to pass callback fucntion in
Allan Yang created HBASE-16572: -- Summary: Sync method in RecoverableZooKeeper failed to pass callback fucntion in Key: HBASE-16572 URL: https://issues.apache.org/jira/browse/HBASE-16572 Project: HBase Issue Type: Bug Components: Zookeeper Affects Versions: 1.1.4, 2.0.0 Reporter: Allan Yang Priority: Minor Fix For: 2.0.0
{code:java}
public void sync(String path, AsyncCallback.VoidCallback cb, Object ctx) throws KeeperException {
  checkZk().sync(path, null, null); //callback function cb is not passed in
}
{code}
It is obvious that the callback method is not passed in. Since the sync operation in ZooKeeper is an asynchronous operation, we need the callback method to notify the caller that the sync operation has finished (see the corrected sketch below). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
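The corrected method would presumably just forward the caller's callback and context, mirroring the quoted snippet:
{code:java}
public void sync(String path, AsyncCallback.VoidCallback cb, Object ctx) throws KeeperException {
  // Forward cb/ctx so the caller is actually notified when the asynchronous sync completes.
  checkZk().sync(path, cb, ctx);
}
{code}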
[jira] [Created] (HBASE-16283) Batch Append/Increment will always fail if set ReturnResults to false
Allan Yang created HBASE-16283: -- Summary: Batch Append/Increment will always fail if set ReturnResults to false Key: HBASE-16283 URL: https://issues.apache.org/jira/browse/HBASE-16283 Project: HBase Issue Type: Bug Components: API Affects Versions: 1.2.2, 1.1.5, 2.0.0 Reporter: Allan Yang Priority: Minor Fix For: 2.0.0
If an Append/Increment's ReturnResults attribute is set to false and the appends/increments are sent to the server in a batch, the batch operation will always fail. The reason is that, since returning results is disabled, the append/increment returns null instead of a Result object. But ResponseConverter#getResults contains this check:
{code}
if (requestRegionActionCount != responseRegionActionResultCount) {
  throw new IllegalStateException("Request mutation count=" + requestRegionActionCount
      + " does not match response mutation result count=" + responseRegionActionResultCount);
}
{code}
That means if the result count does not match the request mutation count, the request fails. The solution is simple: instead of returning a null result, return an empty result when ReturnResults is set to false (a sketch follows below). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
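A small sketch of the proposed fix, using a hypothetical helper around the server-side result handling (the exact call site is in the append/increment path): substitute an empty Result for null when results were not requested, so the per-mutation result count still matches the request count:
{code:java}
import org.apache.hadoop.hbase.client.Result;

public class EmptyResultSketch {
  static Result resultForResponse(Result fromRegion, boolean returnResults) {
    if (!returnResults) {
      return Result.EMPTY_RESULT; // placeholder keeps request/response counts aligned
    }
    return fromRegion;
  }
}
{code}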
[jira] [Created] (HBASE-16238) It's useless to catch SESSIONEXPIRED exception and retry in RecoverableZooKeeper
Allan Yang created HBASE-16238: -- Summary: It's useless to catch SESSIONEXPIRED exception and retry in RecoverableZooKeeper Key: HBASE-16238 URL: https://issues.apache.org/jira/browse/HBASE-16238 Project: HBase Issue Type: Bug Components: Zookeeper Reporter: Allan Yang Priority: Minor
After HBASE-5549, the SESSIONEXPIRED exception is caught and retried together with other zookeeper exceptions like ConnectionLoss. But it is useless to retry when a session expiration happens, since the retry will never succeed. There is a config called "zookeeper.recovery.retry" to control the number of retries; in our case we set this config to a very big number like "9". When a session expiration happens, the regionserver should kill itself, but because of the retrying, threads of the regionserver get stuck trying to reconnect to zookeeper and the server never properly shuts down (the proposed change is sketched after the quoted code below).
{code}
public Stat exists(String path, boolean watch) throws KeeperException, InterruptedException {
  TraceScope traceScope = null;
  try {
    traceScope = Trace.startSpan("RecoverableZookeeper.exists");
    RetryCounter retryCounter = retryCounterFactory.create();
    while (true) {
      try {
        return checkZk().exists(path, watch);
      } catch (KeeperException e) {
        switch (e.code()) {
          case CONNECTIONLOSS:
          case SESSIONEXPIRED: //we shouldn't catch this
          case OPERATIONTIMEOUT:
            retryOrThrow(retryCounter, e, "exists");
            break;
          default:
            throw e;
        }
      }
      retryCounter.sleepUntilNextRetry();
    }
  } finally {
    if (traceScope != null) traceScope.close();
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
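The proposed change, mirroring the quoted switch: stop treating SESSIONEXPIRED as retryable, since a retry on an expired session can never succeed and only delays the abort:
{code}
switch (e.code()) {
  case CONNECTIONLOSS:
  case OPERATIONTIMEOUT:
    retryOrThrow(retryCounter, e, "exists");
    break;
  case SESSIONEXPIRED: // surface it to the caller so the server can shut down
  default:
    throw e;
}
{code}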
[jira] [Created] (HBASE-15474) Exception in HConnectionImplementation's constructor cause Zookeeper connnections leak
Allan Yang created HBASE-15474: -- Summary: Exception in HConnectionImplementation's constructor cause Zookeeper connnections leak Key: HBASE-15474 URL: https://issues.apache.org/jira/browse/HBASE-15474 Project: HBase Issue Type: Bug Affects Versions: 1.1.0 Reporter: Allan Yang Assignee: Allan Yang
HConnectionImplementation creates a ZooKeeperKeepAliveConnection during construction, but if the constructor throws an exception, the zookeeper connection is not properly closed (a sketch of the fix pattern follows below).
{code}
HConnectionImplementation(Configuration conf, boolean managed,
    ExecutorService pool, User user) throws IOException {
  this(conf);
  this.user = user;
  this.batchPool = pool;
  this.managed = managed;
  this.registry = setupRegistry();
  retrieveClusterId(); //here is the zookeeper connection created
  this.rpcClient = RpcClientFactory.createClient(this.conf, this.clusterId);
  this.rpcControllerFactory = RpcControllerFactory.instantiate(conf); // In our case, the exception happens here, so the zookeeper connection never closes
  ..
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
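A sketch of the general fix pattern, mirroring the quoted constructor (hypothetical structure, not the actual patch): if anything after the registry setup throws, close what was already opened so the ZooKeeper connection does not leak:
{code}
HConnectionImplementation(Configuration conf, boolean managed,
    ExecutorService pool, User user) throws IOException {
  this(conf);
  this.user = user;
  this.batchPool = pool;
  this.managed = managed;
  try {
    this.registry = setupRegistry();
    retrieveClusterId(); // opens the ZooKeeper connection
    this.rpcClient = RpcClientFactory.createClient(this.conf, this.clusterId);
    this.rpcControllerFactory = RpcControllerFactory.instantiate(conf);
    // ... rest of the constructor ...
  } catch (RuntimeException | IOException e) {
    close(); // release the ZooKeeper connection before propagating the failure
    throw e;
  }
}
{code}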