[jira] [Created] (HBASE-18380) Implement async RSGroup admin based on the async admin
Guanghao Zhang created HBASE-18380:
--
Summary: Implement async RSGroup admin based on the async admin
Key: HBASE-18380
URL: https://issues.apache.org/jira/browse/HBASE-18380
Project: HBase
Issue Type: Sub-task
Reporter: Guanghao Zhang

Currently the RSGroup admin client gets a blocking stub based on the blocking admin's coprocessor service. Now that coprocessor service support has been added to the async admin, we can implement a new async RSGroup admin client based on it.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: Exception in HBase get
Hi

A few more pointers on the issue. Say there are two cells with column qualifiers A and B for the same row, and the two cells are present in two different files. Cell B was deleted recently. While scanning, StoreFileScanner reads B first and then reads A. Given that lexicographical sorting is preserved within each individual file, why does the scanner read the deleted cell B first? This is the issue causing problems in reads. Because of this, we have commented out the else branch which throws IllegalStateException. Will commenting it out cause any issues (none observed so far)? Could someone please shed more light on this issue? We are using the 1.2.5 stable version.

On Mon, 10 Jul 2017 at 7:29 PM, mukund murrali wrote:
> +dev
>
> Regards,
> Mukund Murrali
>
> On Mon, Jul 10, 2017 at 1:17 PM, mukund murrali wrote:
>> I know the row key. So in what way will it be helpful in analyzing this issue?
>>
>> Regards,
>> Mukund Murrali
>>
>> On Mon, Jul 10, 2017 at 5:32 AM, Ted Yu wrote:
>>> Can you find the hfile where this exception happens when doing get / scan?
>>>
>>> Unfortunately the log didn't contain the row key.
>>> Here is a small change which would log the row key:
>>>
>>> https://pastebin.com/XU5hCLXq
>>>
>>> Cheers
>>>
>>> On Thu, Jul 6, 2017 at 10:17 PM, Graceline Abigail Prem Kumar <pgabigai...@gmail.com> wrote:
>>>> Hi
>>>>
>>>> We are currently using HBase 1.2.5. This exception occurs frequently, both on the server and on the client side.
>>>>
>>>> Regards,
>>>> Graceline Abigail P
>>>>
>>>> On Thu, Jul 6, 2017 at 11:08 AM, Graceline Abigail Prem Kumar <pgabigai...@gmail.com> wrote:
>>>>> Hi
>>>>>
>>>>> We have got an IllegalStateException during an HBase get operation, and we are not sure about the cause. Here is the exception trace. What could the problem be?
>>>>>
>>>>> Caused by: java.lang.IllegalStateException: isDelete failed: deleteBuffer=22, qualifier=21, timestamp=1487055525513, comparison result: 1
>>>>> at org.apache.hadoop.hbase.regionserver.ScanDeleteTracker.isDeleted(ScanDeleteTracker.java:147)
>>>>> at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:395)
>>>>> at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
>>>>> at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:150)
>>>>> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5731)
>>>>> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5894)
>>>>> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5668)
>>>>> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5645)
>>>>> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5631)
>>>>> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6829)
>>>>> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6807)
>>>>> at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2049)
>>>>> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
>>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2188)
>>>>> ... 4 more
>>>>>
>>>>> Mon Jul 03 04:34:07 PDT 2017, RpcRetryingCaller{globalStartTime=1499081647183, pause=100, retries=35}, java.io.IOException: java.io.IOException: isDelete failed: deleteBuffer=22, qualifier=21, timestamp=1487055525513, comparison result: 1
>>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2239)
>>>>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>>>>> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>>>>> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> Regards,
>>>>> Graceline Abigail P

--
Regards,
Mukund Murrali
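[Editor's note] On the ordering question above: KeyValueHeap performs a k-way merge over the per-file scanners, and the merged stream is globally sorted only if every input is itself sorted. The sketch below is a minimal, self-contained illustration of that invariant (plain strings stand in for cells; nothing here is HBase's actual API), including why disabling the IllegalStateException check only hides out-of-order data rather than fixing it:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class MergeHeapSketch {
    // One entry per source "file": the current head plus the remaining iterator.
    private static class Source {
        String head;
        final Iterator<String> rest;
        Source(Iterator<String> it) { this.rest = it; this.head = it.next(); }
    }

    // KeyValueHeap-style k-way merge. If every input list is individually
    // sorted, the merged output is globally sorted. The invariant check below
    // plays the role of the IllegalStateException in ScanDeleteTracker:
    // commenting it out does not fix out-of-order data, it only hides it.
    static List<String> merge(List<List<String>> inputs) {
        PriorityQueue<Source> heap =
            new PriorityQueue<>(Comparator.comparing((Source s) -> s.head));
        for (List<String> in : inputs) {
            if (!in.isEmpty()) {
                heap.add(new Source(in.iterator()));
            }
        }
        List<String> out = new ArrayList<>();
        String prev = null;
        while (!heap.isEmpty()) {
            Source s = heap.poll();
            if (prev != null && s.head.compareTo(prev) < 0) {
                throw new IllegalStateException("out-of-order cell: " + s.head);
            }
            prev = s.head;
            out.add(s.head);
            if (s.rest.hasNext()) {
                s.head = s.rest.next();
                heap.add(s);  // re-seat the "scanner" with its new head
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two sorted "files" merge into one globally sorted stream.
        System.out.println(merge(List.of(List.of("A", "C"), List.of("B", "D"))));
    }
}
```

If the exception fires despite each file being internally sorted, that points at inconsistent comparison logic or corrupted keys rather than the merge itself, which is why suppressing the check is risky.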
[jira] [Created] (HBASE-18379) SnapshotManager#checkSnapshotSupport() should better handle malfunctioning hdfs snapshot
Ted Yu created HBASE-18379:
--
Summary: SnapshotManager#checkSnapshotSupport() should better handle malfunctioning hdfs snapshot
Key: HBASE-18379
URL: https://issues.apache.org/jira/browse/HBASE-18379
Project: HBase
Issue Type: Bug
Reporter: Ted Yu

The following was observed at a customer site, and prevented the master from coming up:
{code}
2017-07-13 13:25:07,898 FATAL [xyz:16000.activeMasterManager] master.HMaster: Failed to become active master
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: Daily_Snapshot_Apps_2017-xx
  at org.apache.hadoop.fs.Path.initialize(Path.java:205)
  at org.apache.hadoop.fs.Path.<init>(Path.java:171)
  at org.apache.hadoop.fs.Path.<init>(Path.java:93)
  at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:230)
  at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.makeQualified(HdfsFileStatus.java:263)
  at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:911)
  at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:113)
  at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:966)
  at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:962)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:962)
  at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1534)
  at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1574)
  at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.getCompletedSnapshots(SnapshotManager.java:206)
  at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.checkSnapshotSupport(SnapshotManager.java:1011)
  at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.initialize(SnapshotManager.java:1070)
  at org.apache.hadoop.hbase.procedure.MasterProcedureManagerHost.initialize(MasterProcedureManagerHost.java:50)
  at org.apache.hadoop.hbase.master.HMaster.initializeZKBasedSystemTrackers(HMaster.java:667)
  at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:732)
  at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:213)
  at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1863)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: Daily_Snapshot_Apps_2017-xx
  at java.net.URI.checkPath(URI.java:1823)
  at java.net.URI.<init>(URI.java:745)
  at org.apache.hadoop.fs.Path.initialize(Path.java:202)
{code}
It turns out the exception can be reproduced from the hdfs command line by accessing the .snapshot directory. SnapshotManager#checkSnapshotSupport() should better handle a malfunctioning hdfs snapshot so that the master can still start up.
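[Editor's note] The defensive handling the issue asks for can be sketched without Hadoop on the classpath: java.net.URI raises the same "Relative path in absolute URI" error when a scheme is combined with a relative path, so a listing routine can validate each entry and skip the malformed ones instead of aborting startup. All names below are illustrative, not the actual SnapshotManager API:

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

public class TolerantSnapshotListing {
    // Reproduces the failure mode from the stack trace: qualifying a relative
    // name against a scheme throws "Relative path in absolute URI".
    static boolean isWellFormed(String scheme, String path) {
        try {
            new URI(scheme, null, path, null, null);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Instead of letting one malformed snapshot name abort master startup,
    // log and skip it, returning only the entries that qualify cleanly.
    static List<String> listCompletedSnapshots(String scheme, List<String> entries) {
        List<String> ok = new ArrayList<>();
        for (String entry : entries) {
            if (isWellFormed(scheme, entry)) {
                ok.add(entry);
            } else {
                System.err.println("Skipping malformed snapshot entry: " + entry);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        List<String> entries =
            List.of("/apps/hbase/.hbase-snapshot/good", "Daily_Snapshot_Apps_2017-xx");
        System.out.println(listCompletedSnapshots("hdfs", entries));
    }
}
```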
[jira] [Created] (HBASE-18378) Cloning configuration contained in CoprocessorEnvironment doesn't work
Samarth Jain created HBASE-18378:
--
Summary: Cloning configuration contained in CoprocessorEnvironment doesn't work
Key: HBASE-18378
URL: https://issues.apache.org/jira/browse/HBASE-18378
Project: HBase
Issue Type: Bug
Reporter: Samarth Jain

In our Phoenix coprocessors, we need to clone the configuration passed in via CoprocessorEnvironment. However, using the copy constructor declared in its parent class, Configuration, doesn't copy over anything. For example:
{code}
CoprocessorEnvironment e;
Configuration original = e.getConfiguration();
Configuration clone = new Configuration(original);
clone.get(HConstants.ZK_SESSION_TIMEOUT) -> returns null
e.configuration.get(HConstants.ZK_SESSION_TIMEOUT) -> returns HConstants.DEFAULT_ZK_SESSION_TIMEOUT
{code}
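[Editor's note] The pitfall can be mimicked with plain java.util.Properties, whose defaults layer behaves like settings that have not been materialized into the copied object: a putAll-style copy misses it, while enumerating every resolvable key captures it. This is a stdlib sketch of the workaround pattern only; the property name is borrowed from the report and nothing below is the Hadoop Configuration API:

```java
import java.util.Properties;

public class ConfigCopy {
    // Copying via putAll() takes only the object's own entries; values that
    // live in the defaults layer are silently lost, analogous to the clone
    // in the report returning null for ZK_SESSION_TIMEOUT.
    static Properties shallowCopy(Properties original) {
        Properties copy = new Properties();
        copy.putAll(original);  // own entries only; defaults are not copied
        return copy;
    }

    // Workaround pattern: enumerate every resolvable key and re-set it, so
    // defaulted values are flattened into the copy.
    static Properties deepCopy(Properties original) {
        Properties copy = new Properties();
        for (String key : original.stringPropertyNames()) {
            copy.setProperty(key, original.getProperty(key));
        }
        return copy;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("zookeeper.session.timeout", "90000");
        Properties original = new Properties(defaults);

        // prints: null then 90000
        System.out.println(shallowCopy(original).getProperty("zookeeper.session.timeout"));
        System.out.println(deepCopy(original).getProperty("zookeeper.session.timeout"));
    }
}
```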
[jira] [Created] (HBASE-18377) Error handling for FileNotFoundException should consider RemoteException in ReplicationSource#openReader()
Ted Yu created HBASE-18377:
--
Summary: Error handling for FileNotFoundException should consider RemoteException in ReplicationSource#openReader()
Key: HBASE-18377
URL: https://issues.apache.org/jira/browse/HBASE-18377
Project: HBase
Issue Type: Bug
Reporter: Ted Yu

In the region server log, I observed the following:
{code}
org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hbase/data/WALs/lx.p.com,16020,1497300923131/497300923131.default.1497302873178
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1860)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1831)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1744)
  ...
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:326)
  at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:162)
  at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:782)
  at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
  at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:267)
  at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:255)
  at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:414)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationWALReaderManager.openReader(ReplicationWALReaderManager.java:69)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:605)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:364)
{code}
We have code in ReplicationSource#openReader() which is supposed to handle FileNotFoundException, but the case of a RemoteException wrapping a FileNotFoundException was missed.
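[Editor's note] Hadoop's RemoteException carries the server-side exception class only as a string, so a plain instanceof FileNotFoundException check on the caught exception never matches the wrapper; the wrapper has to be unwrapped first (RemoteException#unwrapRemoteException does this in Hadoop). Below is a self-contained sketch of that pattern with a minimal, hypothetical stand-in for RemoteException; it is not Hadoop's class:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical minimal stand-in for org.apache.hadoop.ipc.RemoteException:
// the server-side class arrives only as a String, so the wrapper itself is
// never an instance of FileNotFoundException.
class FakeRemoteException extends IOException {
    final String className;

    FakeRemoteException(String className, String msg) {
        super(msg);
        this.className = className;
    }

    // Mirrors the unwrap pattern: reconstruct the wrapped exception type when
    // its class name matches one of the requested lookup types.
    IOException unwrapRemoteException(Class<?>... lookupTypes) {
        for (Class<?> t : lookupTypes) {
            if (t.getName().equals(className)) {
                try {
                    return (IOException) t.getConstructor(String.class)
                        .newInstance(getMessage());
                } catch (ReflectiveOperationException e) {
                    return this;  // cannot reconstruct; keep the wrapper
                }
            }
        }
        return this;
    }
}

public class UnwrapDemo {
    // The fix sketched by the issue: before deciding how to handle a missing
    // WAL, peel the remote wrapper off and inspect the wrapped class.
    static boolean isFileNotFound(IOException ioe) {
        if (ioe instanceof FakeRemoteException) {
            ioe = ((FakeRemoteException) ioe)
                .unwrapRemoteException(FileNotFoundException.class);
        }
        return ioe instanceof FileNotFoundException;
    }

    public static void main(String[] args) {
        IOException wrapped = new FakeRemoteException(
            "java.io.FileNotFoundException", "File does not exist: ...");
        System.out.println(isFileNotFound(wrapped));  // prints true
    }
}
```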
[jira] [Resolved] (HBASE-11249) Missing null check in finally block of HRegion#processRowsWithLocks() may lead to partially rolled back state in memstore.
[ https://issues.apache.org/jira/browse/HBASE-11249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Drob resolved HBASE-11249.
---
Resolution: Not A Problem

This code was rewritten in HBASE-15158, doesn't look like a problem anymore.

> Missing null check in finally block of HRegion#processRowsWithLocks() may
> lead to partially rolled back state in memstore.
> --
>
> Key: HBASE-11249
> URL: https://issues.apache.org/jira/browse/HBASE-11249
> Project: HBase
> Issue Type: Bug
> Reporter: Ted Yu
> Priority: Minor
>
> At line 4883:
> {code}
> Store store = getStore(kv);
> if (store == null) {
>   checkFamily(CellUtil.cloneFamily(kv));
>   // unreachable
> }
> {code}
> Exception would be thrown from checkFamily() if store is null.
> In the finally block:
> {code}
> } finally {
>   if (!mutations.isEmpty() && !walSyncSuccessful) {
>     LOG.warn("Wal sync failed. Roll back " + mutations.size() +
>       " memstore keyvalues for row(s):" + StringUtils.byteToHexString(
>       processor.getRowsToLock().iterator().next()) + "...");
>     for (KeyValue kv : mutations) {
>       getStore(kv).rollback(kv);
>     }
> {code}
> There is no corresponding null check for the return value of getStore() above,
> potentially leading to partially rolled back state in memstore.
[jira] [Created] (HBASE-18376) Flaky exclusion doesn't appear to work in precommit
Sean Busbey created HBASE-18376:
---
Summary: Flaky exclusion doesn't appear to work in precommit
Key: HBASE-18376
URL: https://issues.apache.org/jira/browse/HBASE-18376
Project: HBase
Issue Type: Bug
Components: community, test
Reporter: Sean Busbey
Priority: Critical

Yesterday we started defaulting the precommit parameter for the flaky test list to point to the job on builds.a.o. It looks like the personality is ignoring it.

Example build that's marked to keep: https://builds.apache.org/job/PreCommit-HBASE-Build/7646/ (search for 'Running unit tests' to skip to the right part of the console)

We should add some more debug output in there too.
[jira] [Created] (HBASE-18375) The pool chunks from ChunkCreator are deallocated while in pool because there is no reference to them
Anastasia Braginsky created HBASE-18375:
---
Summary: The pool chunks from ChunkCreator are deallocated while in pool because there is no reference to them
Key: HBASE-18375
URL: https://issues.apache.org/jira/browse/HBASE-18375
Project: HBase
Issue Type: Sub-task
Reporter: Anastasia Braginsky

Because the MSLAB's list of chunks was changed to a list of chunk IDs, the chunks returned to the pool can be deallocated by the JVM, since there is no longer any strong reference to them. The solution is to protect the pool chunks from GC via the strong map in ChunkCreator introduced by HBASE-18010. Will prepare a patch today.
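[Editor's note] The fix described above can be sketched in plain Java: keep every pooled chunk in a strong map keyed by its ID, so components that track only integer IDs cannot cause the chunk itself to become unreachable. All names are illustrative; this is not HBase's ChunkCreator:

```java
import java.util.HashMap;
import java.util.Map;

public class ChunkRegistry {
    // Strong references keyed by chunk ID. While an ID is live in the pool,
    // the backing buffer cannot be garbage-collected, even if every other
    // component refers to the chunk by ID only.
    private final Map<Integer, byte[]> strongRefs = new HashMap<>();
    private int nextId = 0;

    // Creating a chunk registers it in the strong map under its new ID.
    synchronized int createChunk(int size) {
        int id = nextId++;
        strongRefs.put(id, new byte[size]);
        return id;
    }

    synchronized byte[] getChunk(int id) {
        return strongRefs.get(id);
    }

    // Called only when a chunk truly leaves the pool for good; removing the
    // mapping is what finally makes the buffer collectible.
    synchronized void releaseChunk(int id) {
        strongRefs.remove(id);
    }

    public static void main(String[] args) {
        ChunkRegistry registry = new ChunkRegistry();
        int id = registry.createChunk(2 * 1024 * 1024);  // 2 MB chunk
        System.out.println(registry.getChunk(id).length);  // prints 2097152
    }
}
```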