[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188837#comment-15188837 ] Hudson commented on HBASE-15425: FAILURE: Integrated in HBase-Trunk_matrix #768 (See [https://builds.apache.org/job/HBase-Trunk_matrix/768/]) HBASE-15425 Failing to write bulk load event marker in the WAL is (tedyu: rev d14b6c3810f193adb658a4052aca9c3c23d74ae9) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java > Failing to write bulk load event marker in the WAL is ignored > - > > Key: HBASE-15425 > URL: https://issues.apache.org/jira/browse/HBASE-15425 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Ashish Singhi >Assignee: Ashish Singhi > Fix For: 2.0.0 > > Attachments: HBASE-15425.patch, HBASE-15425.v1.patch > > > During the LoadIncrementalHFiles process, if we fail to write the bulk load event > marker in the WAL, the failure is ignored. This will lead to a data mismatch > between the source and peer clusters in bulk loaded data replication scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
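The fix described above amounts to letting a failed WAL append propagate instead of being swallowed. A minimal, hypothetical sketch of that contrast (the interface and method names here are invented for illustration; this is not the actual HRegion code):

```java
import java.io.IOException;

// Hypothetical sketch: the point of the fix is that a failed WAL append of
// the bulk load event marker must reach the caller, not be caught and logged.
public class BulkLoadMarker {
    /** Minimal stand-in for a WAL that may fail on append (illustrative). */
    public interface Wal {
        void append(String marker) throws IOException;
    }

    /**
     * No catch-and-ignore here: if the append fails, the IOException
     * propagates and the bulk load fails visibly, instead of leaving the
     * source and peer clusters silently out of sync.
     */
    public static void writeBulkLoadMarker(Wal wal, String marker) throws IOException {
        wal.append(marker);
    }
}
```

The pre-fix behavior would correspond to a try/catch around the `append` call that only logged the exception; removing that silent catch is the conceptual shape of the change.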
[jira] [Commented] (HBASE-15265) Implement an asynchronous FSHLog
[ https://issues.apache.org/jira/browse/HBASE-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188835#comment-15188835 ] Duo Zhang commented on HBASE-15265: --- {quote} Why is this option called asyncfs and not something like async_wal? {quote} The {{DefaultWALProvider}} is called 'filesystem' so I name it 'asyncfs'... > Implement an asynchronous FSHLog > > > Key: HBASE-15265 > URL: https://issues.apache.org/jira/browse/HBASE-15265 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-15265-v1.patch, HBASE-15265-v2.patch, > HBASE-15265-v3.patch, HBASE-15265-v4.patch, HBASE-15265-v5.patch, > HBASE-15265-v6.patch, HBASE-15265.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15120) Backport HBASE-14883 to branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188829#comment-15188829 ] Hudson commented on HBASE-15120: FAILURE: Integrated in HBase-1.1-JDK8 #1762 (See [https://builds.apache.org/job/HBase-1.1-JDK8/1762/]) HBASE-15120 Backport HBASE-14883 to branch-1.1 (Yu Li) (ndimiduk: rev 30855853f0fc570138fc79f0316412fe32450b9a) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java > Backport HBASE-14883 to branch-1.1 > -- > > Key: HBASE-15120 > URL: https://issues.apache.org/jira/browse/HBASE-15120 > Project: HBase > Issue Type: Bug >Affects Versions: 1.1.2 >Reporter: Yu Li >Assignee: Yu Li >Priority: Minor > Fix For: 1.1.4 > > Attachments: HBASE-15120.branch-1.1.patch, > HBASE-15120.branch-1.1.patch, HBASE-15120.branch-1.1.patch, > HBASE-15120.branch-1.1.patch > > > When checking branch-1.1 UT in HBASE-13590, we found that > TestSplitTransactionOnCluster#testFailedSplit fails with a 12/50 chance. > The issue is fixed by HBASE-14883, but the change didn't go into branch-1.1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188828#comment-15188828 ] Hudson commented on HBASE-15378: FAILURE: Integrated in HBase-1.1-JDK8 #1762 (See [https://builds.apache.org/job/HBase-1.1-JDK8/1762/]) HBASE-15378 Scanner cannot handle heartbeat message with no results (tedyu: rev f57c6193b6a22d3dc0b14e73f3fa2d06df510f88) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java > Scanner cannot handle heartbeat message with no results > --- > > Key: HBASE-15378 > URL: https://issues.apache.org/jira/browse/HBASE-15378 > Project: HBase > Issue Type: Bug > Components: dataloss, Scanners >Affects Versions: 1.2.0, 1.1.3 >Reporter: Phil Yang >Assignee: Phil Yang >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.4.0 > > Attachments: HBASE-15378-v1.txt, HBASE-15378-v2.txt, > HBASE-15378-v3.txt, HBASE-15378-v4.patch, HBASE-15378-v5.patch, > HBASE-15378-v6.patch > > > When a RS scanner gets a TIME_LIMIT_REACHED_MID_ROW state, it will stop > scanning, send back what it has read to the client, and mark the message as a > heartbeat message. If no cell has been read, the response will be empty. > However, ClientScanner only handles the case where the client gets an > empty heartbeat while its cache is not empty. If the cache is empty too, the > response is regarded as end-of-region and a new scanner is opened for the next region. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
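The cases described in the report reduce to a small decision table. A hedged sketch of what the fix has to guarantee (illustrative names only, not the real ClientScanner API): an empty response flagged as a heartbeat must never be mistaken for end-of-region, regardless of the client cache state.

```java
// Hypothetical sketch of the client-side decision HBASE-15378 fixes.
// The enum and method names are invented for illustration.
public class HeartbeatDecision {
    public enum Next { CONTINUE_SAME_REGION, OPEN_NEXT_REGION }

    /**
     * @param resultsEmpty the RPC returned no cells
     * @param heartbeat    the server marked the response as a heartbeat
     *                     (e.g. it hit TIME_LIMIT_REACHED_MID_ROW)
     */
    public static Next onResponse(boolean resultsEmpty, boolean heartbeat) {
        if (resultsEmpty && heartbeat) {
            // The buggy path treated this as region exhausted when the client
            // cache was also empty; the fix keeps scanning the same region.
            return Next.CONTINUE_SAME_REGION;
        }
        // An empty, non-heartbeat response genuinely means the region is done.
        return resultsEmpty ? Next.OPEN_NEXT_REGION : Next.CONTINUE_SAME_REGION;
    }
}
```

The key point is that the heartbeat flag, not the emptiness of the client cache, decides whether the region is exhausted.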
[jira] [Commented] (HBASE-14883) TestSplitTransactionOnCluster#testFailedSplit flakey
[ https://issues.apache.org/jira/browse/HBASE-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188827#comment-15188827 ] Hudson commented on HBASE-14883: FAILURE: Integrated in HBase-1.1-JDK8 #1762 (See [https://builds.apache.org/job/HBase-1.1-JDK8/1762/]) HBASE-15120 Backport HBASE-14883 to branch-1.1 (Yu Li) (ndimiduk: rev 30855853f0fc570138fc79f0316412fe32450b9a) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java > TestSplitTransactionOnCluster#testFailedSplit flakey > > > Key: HBASE-14883 > URL: https://issues.apache.org/jira/browse/HBASE-14883 > Project: HBase > Issue Type: Sub-task > Components: flakey, test >Affects Versions: 1.2.0, 1.3.0 >Reporter: stack >Assignee: stack > Fix For: 1.2.0, 1.3.0 > > Attachments: 14883-branch-1.txt > > > Only in branch-1 and branch-1.2. > Fails look like this: > https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.3/jdk=latest1.8,label=Hadoop/397/ > TEST-org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.xml. > If I look in the xml, I see this: > {code} >classname="org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster" > time="8.275"> > java.lang.AssertionError: > null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testFailedSplit(TestSplitTransactionOnCluster.java:1339) >
[jira] [Commented] (HBASE-15265) Implement an asynchronous FSHLog
[ https://issues.apache.org/jira/browse/HBASE-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188824#comment-15188824 ] Duo Zhang commented on HBASE-15265: --- [~stack] {code} LOG.info("Instantiating WALProvider of type " + clazz); {code} Could you try looking for this log line? > Implement an asynchronous FSHLog > > > Key: HBASE-15265 > URL: https://issues.apache.org/jira/browse/HBASE-15265 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-15265-v1.patch, HBASE-15265-v2.patch, > HBASE-15265-v3.patch, HBASE-15265-v4.patch, HBASE-15265-v5.patch, > HBASE-15265-v6.patch, HBASE-15265.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15431) A bunch of methods are hot and too big to be inlined
[ https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188823#comment-15188823 ] Lars Hofhansl commented on HBASE-15431: --- bq. Was 1010 enough to get the list of methods inlined? Yep. The largest is 1006 bytes. After that I no longer see any "hot method too big" messages. But performance actually seemed to be slower. So maybe this is a dud...? I looked into breaking up StoreScanner.next(...); it's pretty messy now, with many exits (continue with the loop, break out of the loop, return from the method with result true, return with false, etc.). What we want to do is break up the larger method so that the hot code is separate from the not-so-hot code. So in StoreScanner.next we'd try to move DONE, DONE_SCAN, and maybe XXX_SEEK code to separate methods. That would allow the more frequent and very hot SKIP and INCLUDE code paths to be inlined... But as I said, it's not a trivial quick fix; it needs a rethinking of the structure of this method. Others are similar. > A bunch of methods are hot and too big to be inlined > > > Key: HBASE-15431 > URL: https://issues.apache.org/jira/browse/HBASE-15431 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl > Attachments: hotMethods.txt > > > I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions > -XX:+PrintInlining" and then looked for "hot method too big" log lines. > I'll attach a log of those messages. > I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods > (as long as they're hot), but actually didn't see any improvement. > In all cases I primed the JVM to make sure the JVM gets a chance to profile > the methods and decide whether they're hot or not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
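The restructuring Lars describes, keeping the frequent SKIP/INCLUDE handling in a small method while the rare DONE/SEEK handling moves elsewhere, can be sketched in miniature. This is an illustrative hot/cold split, not actual StoreScanner code: the hot loop stays small enough for the JIT to inline (under -XX:FreqInlineSize), and the cold cases pay a call that rarely executes.

```java
// Illustrative hot/cold split. The enum names echo the scanner codes
// mentioned in the thread, but the logic here is a toy stand-in.
public class HotColdSplit {
    public enum Code { INCLUDE, SKIP, DONE, SEEK_NEXT_ROW }

    /** Hot path: small on purpose so the JIT can inline it at call sites. */
    public static int scan(Code[] codes) {
        int included = 0;
        for (Code c : codes) {
            if (c == Code.INCLUDE) { included++; continue; }
            if (c == Code.SKIP) { continue; }
            // Rare cases are delegated; the call is cheap because it is cold.
            if (handleColdCase(c)) break;
        }
        return included;
    }

    /** Cold path: DONE/SEEK handling; returns true when scanning should stop. */
    private static boolean handleColdCase(Code c) {
        switch (c) {
            case DONE: return true;
            case SEEK_NEXT_ROW: return false; // a real impl would reseek here
            default: return false;
        }
    }
}
```

The design choice is exactly the one in the comment: bytecode size, not frequency, is what `-XX:FreqInlineSize` gates, so shrinking the hot method is more reliable than raising the threshold globally.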
[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188816#comment-15188816 ] Lars Hofhansl commented on HBASE-15392: --- (Maybe it's time to look at HBASE-11811. I started that, but never finished. It had a pretty big impact for Gets, since we can find the offset of any Cell in an HFileBlock in O(log\(n)) time rather than O\(n) (n being the number of Cells in a block), and we also have a notion of the nth Cell in a block, which will enable other optimizations like sampling.) > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, > no_optimize.patch, reads.png, reads.png, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. 
> Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > 
prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > java.lang.Throwable > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198) > at > org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321) > at >
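The HBASE-11811 idea Lars mentions above, finding a Cell's offset in O(log\(n)) instead of O\(n), can be sketched as a binary search over an in-block index of cell keys. A hedged, simplified sketch, with plain Strings standing in for serialized cell keys (not the real HFileBlock layout):

```java
import java.util.Arrays;

// Illustrative per-block cell index: with the keys of a block's cells held
// in a sorted array, a seek is a binary search over n cells, and positional
// access ("the nth Cell in a block") enables things like sampling.
public class BlockCellIndex {
    /** Sorted keys of the cells in one block, in store order. */
    private final String[] keys;

    public BlockCellIndex(String[] sortedKeys) {
        this.keys = sortedKeys;
    }

    /** Index of the first cell with key >= target, or keys.length if none. */
    public int seekTo(String target) {
        int pos = Arrays.binarySearch(keys, target);
        // binarySearch returns -(insertionPoint) - 1 on a miss.
        return pos >= 0 ? pos : -(pos + 1);
    }

    /** Direct positional access: the "nth Cell in a block" notion. */
    public String cellAt(int n) {
        return keys[n];
    }
}
```

The linear alternative, walking cell-by-cell through the block until the key compares >= the target, is what makes in-block seeks O\(n) today; the index trades a little per-block memory for logarithmic seeks.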
[jira] [Comment Edited] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188810#comment-15188810 ] Lars Hofhansl edited comment on HBASE-15392 at 3/10/16 7:03 AM: We're optimizing the wrong thing here. Let's fix it for Gets. [~anoopsamjohn], I was surprised by the cost of SEEKs as well. I've spent some time to make that better, but at the end we have to reset the pointers for every HFile we're touching. That means SEEKing in the right HFile block of every HFile. Then there are a bunch of compares in KeyValueHeap and the HFileReader classes. Note that this is true whether we are reading a new block from disk or from the block cache; we're doing a bunch of work for every SEEK. The case I missed is that moreRowsMayExistAfter is not called unless we're getting an XXX_SEEK code. My bad. Let's fix that part. For Gets for sure, we should fix it. For Scans, unless we have a degenerate case (a very small scan that fits into one HFile block), we might load an extra block if the StopRow falls towards the end of a block, which is far outweighed by the time we've saved during the scan. That block will be a read of the same file and it will be the _next_ block to read (so not a random read, but a sequential read). That is a fair trade-off, IMHO. HBase is usually CPU bound (which is in part what I am trying to fix). Maybe we can disable the next row optimization for small scans as well. Anyway... If you want to remove optimize() or change it significantly, show me end-to-end perf numbers (like I have done in HBASE-13109). :) Areas where it might be degenerate are the following: * Gets.. I missed that. * Large Cells, such that only a few (maybe just one) fit into an HFileBlock. * Very small scans (that just read a single block, or very few blocks) was (Author: lhofhansl): We're optimizing the wrong thing here. Let's fix it for Gets. [~anoopsamjohn], I was surprised by the cost of SEEKs as well. 
I've spent some time to make that better, but at the we have reset the pointers for every HFile we're touching. That means SEEKing in the right HFile block of every HFile. Then there are a bunch of compares in KeyValueHeap and the HFileReader classes. Note that is true whether we are reading a new block from disk or from the block cache, we're doing a bunch of work for every SEEK. The case I missed is that moreRowsMayExistAfter is not called unless we're getting a XXX_SEEK code. My bad. Let's fix that part. For Gets for sure, we should fix it. For Scans, unless we have a degenerate case (a very small scan that fits into one HFile block), we might load an extra block if the StopRow falls towards the end of a block, which is far outweighed by the time we've saved during the scan. That block will a read of the same file and it will be the _next_ block to read (so not a random read, but a sequential read). That is a fair trade-off. IMHO. (Maybe we can disable the next row optimization for small scans as well.) Anyway... If you want to remove optimize() or change it significantly, show me end-to-end perf numbers (like I have done in HBASE-13109). :) Areas where it might be degenerate are the following: * Gets.. I missed that. * Large Cells, such that only a few (maybe just one) fit into into an HFileBlock. 
* Very small scans (that just read a single block, or very few block) > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, > no_optimize.patch, reads.png, reads.png, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. > Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read >
[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188810#comment-15188810 ] Lars Hofhansl commented on HBASE-15392: --- We're optimizing the wrong thing here. Let's fix it for Gets. [~anoopsamjohn], I was surprised by the cost of SEEKs as well. I've spent some time to make that better, but at the end we have to reset the pointers for every HFile we're touching. That means SEEKing in the right HFile block of every HFile. Then there are a bunch of compares in KeyValueHeap and the HFileReader classes. Note that this is true whether we are reading a new block from disk or from the block cache; we're doing a bunch of work for every SEEK. The case I missed is that moreRowsMayExistAfter is not called unless we're getting an XXX_SEEK code. My bad. Let's fix that part. For Gets for sure, we should fix it. For Scans, unless we have a degenerate case (a very small scan that fits into one HFile block), we might load an extra block if the StopRow falls towards the end of a block, which is far outweighed by the time we've saved during the scan. That block will be a read of the same file and it will be the _next_ block to read (so not a random read, but a sequential read). That is a fair trade-off, IMHO. (Maybe we can disable the next row optimization for small scans as well.) Anyway... If you want to remove optimize() or change it significantly, show me end-to-end perf numbers (like I have done in HBASE-13109). :) Areas where it might be degenerate are the following: * Gets.. I missed that. * Large Cells, such that only a few (maybe just one) fit into an HFileBlock. 
* Very small scans (that just read a single block, or very few block) > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, > no_optimize.patch, reads.png, reads.png, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. > Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > 
getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384,
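The trade-off Lars defends in the comment above, replacing a real seek with cheap in-block reads only when that pays off, boils down to one comparison. A hedged sketch (illustrative, not the actual StoreScanner.optimize logic): only stay within the current block when the seek target is known to land inside it; otherwise issue the real seek, so degenerate cases like single-Cell Gets do not pull in the next block for nothing.

```java
// Toy model of the seek-vs-skip decision discussed in this thread.
// Offsets are byte positions within one HFile; names are invented.
public class SeekVsSkip {
    public enum Action { SKIP_WITHIN_BLOCK, REAL_SEEK }

    /**
     * @param targetOffset   offset of the cell we want to reach
     * @param blockEndOffset end offset (exclusive) of the current block
     */
    public static Action choose(long targetOffset, long blockEndOffset) {
        // Inside the current block: sequential in-memory skips are cheaper
        // than resetting scanner pointers and re-walking the block index.
        // At or past the block boundary: a real seek avoids speculatively
        // reading the next block (the degenerate Get / tiny-scan case).
        return targetOffset < blockEndOffset ? Action.SKIP_WITHIN_BLOCK
                                             : Action.REAL_SEEK;
    }
}
```

This is only the shape of the heuristic; the real code also has to weigh scan size, as the comment notes for small scans.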
[jira] [Commented] (HBASE-15314) Allow more than one backing file in bucketcache
[ https://issues.apache.org/jira/browse/HBASE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188804#comment-15188804 ] Anoop Sam John commented on HBASE-15314: From what I checked in the BC code area, yes, we are not utilizing the full area of the BC. It seems buckets of some specific sizes will not get used. This happens when all the tables have the same block size. Yes, the issue with block sizes in HFiles is that we really cannot guarantee the size... Checking these areas as well. > Allow more than one backing file in bucketcache > --- > > Key: HBASE-15314 > URL: https://issues.apache.org/jira/browse/HBASE-15314 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: Amal Joshy > Attachments: HBASE-15314.patch > > > Allow bucketcache to use more than just one backing file: e.g. a chassis has more > than one SSD in it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
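The symptom Anoop describes, buckets of some size classes never being used when every table has the same block size, follows directly from size-class allocation: each cached block goes into the smallest bucket size that fits it. A hedged sketch with illustrative size classes (not the real BucketCache defaults):

```java
// Toy size-class selector: with uniform block sizes, every block maps to
// the same bucket size, and buckets for the other classes sit idle.
public class BucketSizing {
    // Sorted candidate bucket sizes in bytes (illustrative values).
    private static final int[] BUCKET_SIZES =
        {8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024, 128 * 1024};

    /** Smallest bucket size >= blockSize, or -1 if the block is too large. */
    public static int bucketFor(int blockSize) {
        for (int size : BUCKET_SIZES) {
            if (blockSize <= size) return size;
        }
        return -1;
    }
}
```

Note also the point from the comment that HFile block sizes are only a target, not a guarantee, so real blocks straddle size classes: a nominally 64 KB block that serializes to 65 KB lands in the next class up, which is part of why sizing buckets well is hard.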
[jira] [Commented] (HBASE-15265) Implement an asynchronous FSHLog
[ https://issues.apache.org/jira/browse/HBASE-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188788#comment-15188788 ] stack commented on HBASE-15265: --- Would this be tough to put on branch-1 [~Apache9]? Would be good to get some experience w/ this new stuff on the branch-1 line. > Implement an asynchronous FSHLog > > > Key: HBASE-15265 > URL: https://issues.apache.org/jira/browse/HBASE-15265 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-15265-v1.patch, HBASE-15265-v2.patch, > HBASE-15265-v3.patch, HBASE-15265-v4.patch, HBASE-15265-v5.patch, > HBASE-15265-v6.patch, HBASE-15265.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15265) Implement an asynchronous FSHLog
[ https://issues.apache.org/jira/browse/HBASE-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188782#comment-15188782 ] stack commented on HBASE-15265: --- Needs a release note when you get a chance, [~Apache9]. Why is this option called asyncfs and not something like async_wal? I set this in my config: {{hbase.wal.provider}} = {{asyncfs}} ... but I don't seem to see anything in the logs saying the async WAL is enabled. What do you see? I have this: {code} 2016-03-09 22:26:04,968 INFO [regionserver/ve0530.halxg.cloudera.com/10.17.240.23:16020] wal.AbstractFSWAL: WAL configuration: blocksize=128 MB, rollsize=121.60 MB, prefix=ve0530.halxg.cloudera.com%2C16020%2C1457591160305, suffix=, logDir=hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs/ve0530.halxg.cloudera.com,16020,1457591160305, archiveDir=hdfs://ve0524.halxg.cloudera.com:8020/hbase/oldWALs 2016-03-09 22:26:05,200 INFO [regionserver/ve0530.halxg.cloudera.com/10.17.240.23:16020] wal.AbstractFSWAL: New WAL /hbase/WALs/ve0530.halxg.cloudera.com,16020,1457591160305/ve0530.halxg.cloudera.com%2C16020%2C1457591160305.1457591164968 {code} Does this mean the async WAL is 'on'? > Implement an asynchronous FSHLog > > > Key: HBASE-15265 > URL: https://issues.apache.org/jira/browse/HBASE-15265 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-15265-v1.patch, HBASE-15265-v2.patch, > HBASE-15265-v3.patch, HBASE-15265-v4.patch, HBASE-15265-v5.patch, > HBASE-15265-v6.patch, HBASE-15265.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
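The exchange in this thread (a provider *name* like 'filesystem' or 'asyncfs' selects an implementation, and a log line at instantiation is how you confirm which one is on) is a name-to-factory lookup. A hedged sketch, with stand-in classes rather than HBase's real WALFactory machinery; only the provider names and the quoted log message come from the discussion:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative provider registry: the configured provider name picks the
// implementation, and the instantiation log line reveals the choice.
public class WalProviderFactory {
    public interface WalProvider {}
    public static class FsHLogProvider implements WalProvider {}     // 'filesystem'
    public static class AsyncFsWalProvider implements WalProvider {} // 'asyncfs'

    private static final Map<String, Supplier<WalProvider>> PROVIDERS = new HashMap<>();
    static {
        PROVIDERS.put("filesystem", FsHLogProvider::new);
        PROVIDERS.put("asyncfs", AsyncFsWalProvider::new);
    }

    public static WalProvider create(String name) {
        Supplier<WalProvider> supplier = PROVIDERS.get(name);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown WAL provider: " + name);
        }
        WalProvider provider = supplier.get();
        // Mirrors the log line Duo points to earlier in the thread.
        System.out.println("Instantiating WALProvider of type " + provider.getClass());
        return provider;
    }
}
```

In this model, stack's question answers itself: grep for the instantiation line and check the class name, since the later "WAL configuration" lines come from a shared base class and look the same for both providers.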
[jira] [Created] (HBASE-15440) Master's meta region server should not be shown in Dead Region server's list
neha created HBASE-15440: Summary: Master's meta region server should not be shown in Dead Region server's list Key: HBASE-15440 URL: https://issues.apache.org/jira/browse/HBASE-15440 Project: HBase Issue Type: Bug Components: hbase Reporter: neha Priority: Minor For zk-less region assignment. Problem: the master's meta region server is always shown as a dead RS after a force kill of the HBase master process, even when the killed master starts up successfully and registers itself as a backup master. This is not the case with a graceful stop of the active master (./hbase-daemon.sh stop master) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-15392: -- Attachment: reads.png gc.png io.png Small YCSB test comparing before and after the patch. It's the pure random read test. Interestingly, while the read rate is up when the patch is applied, it is not that much different when the cache is LRU L1 (no L2 in this case). You can see though that we are doing less i/o. That's good. > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, > no_optimize.patch, reads.png, reads.png, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. 
> Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > 
prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > java.lang.Throwable > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198) > at > org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279) > at >
[jira] [Updated] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-15392: -- Attachment: reads.png gc.png Two charts comparing GC and hits w/ patch and w/o. Weird is how little difference the extra read makes when all is out of L1 LRU (no L2 in these tests). > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, gc.png, no_optimize.patch, no_optimize.patch, > reads.png, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. > Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, 
fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > java.lang.Throwable > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812) > at > 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198) > at > org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795) > at >
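The surprise in this issue is that a single-Cell Get reads the block holding the wanted Cell plus the block that follows. Conceptually, a block index lookup for one key should resolve to exactly one block; the sketch below models that expectation with a plain sorted map (class and method names are illustrative only, not HBase's actual block index API):

```java
import java.util.Map;
import java.util.TreeMap;

// Simplified model of an HFile block index: it maps each block's first key
// to the block's file offset. For a point Get, a floor lookup identifies
// the single block that can contain the key, so only one block should need
// to be read from cache or disk.
class BlockIndexSketch {
    private final TreeMap<String, Long> firstKeyToOffset = new TreeMap<>();

    void addBlock(String firstKey, long offset) {
        firstKeyToOffset.put(firstKey, offset);
    }

    // The candidate block is the one with the greatest first key <= key.
    // Returns null when the key sorts before every block in the file.
    Long findBlockOffset(String key) {
        Map.Entry<String, Long> e = firstKeyToOffset.floorEntry(key);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        BlockIndexSketch idx = new BlockIndexSketch();
        idx.addBlock("user0", 0L);      // block 1 covers [user0, user5)
        idx.addBlock("user5", 65536L);  // block 2 covers [user5, ...)
        // "user3" can only live in the first block; nothing about the
        // lookup requires touching block 2.
        System.out.println(idx.findBlockOffset("user3")); // prints 0
    }
}
```

Reading the following block as well, as the logs above show happening, is extra work per Get that this single-lookup model does not require.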
[jira] [Commented] (HBASE-15322) HBase 1.1.3 crashing
[ https://issues.apache.org/jira/browse/HBASE-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188713#comment-15188713 ] Anoop Sam John commented on HBASE-15322: Will commit tonight IST. Yes, we can do it the best way (as we had 2 impls for the Comparer): have 2 implementations for all of these operations (reading/writing the different primitive types), one for byte[] and one for ByteBuffer. Code-change-wise it will be a bigger patch, so let me do that later. This patch fixes the reported issue, so let us get it in. > HBase 1.1.3 crashing > > > Key: HBASE-15322 > URL: https://issues.apache.org/jira/browse/HBASE-15322 > Project: HBase > Issue Type: Bug > Components: hbase >Affects Versions: 1.0.0, 2.0.0, 0.98.7, 0.94.24 > Environment: OS: Ubuntu 14.04/Ubuntu 15.10 > JDK: OpenJDK8/OpenJDK9 >Reporter: Anant Sharma >Assignee: Anoop Sam John >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 0.98.18, 1.4.0 > > Attachments: BASE-15322.patch > > > HBase crashes in standalone mode with the following log: > __ > 2016-02-24 22:38:37,578 ERROR [main] master.HMasterCommandLine: Master exiting > java.lang.RuntimeException: Failed construction of Master: class > org.apache.hadoop.hbase.master.HMaster > at > org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2341) > at > org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:233) > at > org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at > org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) > at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2355) > Caused by: java.lang.NoClassDefFoundError: Could not initialize class > org.apache.hadoop.hbase.util.Bytes$LexicographicalComparerHolder$UnsafeComparer > at org.apache.hadoop.hbase.util.Bytes.putInt(Bytes.java:899) > at > 
org.apache.hadoop.hbase.KeyValue.createByteArray(KeyValue.java:1082) > at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:652) > at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:580) > at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:483) > at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:370) > at org.apache.hadoop.hbase.KeyValue.<init>(KeyValue.java:267) > at org.apache.hadoop.hbase.HConstants.<clinit>(HConstants.java:978) > at > org.apache.hadoop.hbase.HTableDescriptor.<clinit>(HTableDescriptor.java:1488) > at > org.apache.hadoop.hbase.util.FSTableDescriptors.<init>(FSTableDescriptors.java:124) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:570) > at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:365) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2336) > __ > The class is in the hbase-common.jar and it's there in the classpath as can be > seen from the log: > _ > 2016-02-24 22:38:32,538 INFO [main] util.ServerCommandLine: >
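The "Could not initialize class" above is the signature of a static initializer that failed once and left the class permanently unusable. A defensive pattern, sketched here with hypothetical names (HBase's real `Bytes$LexicographicalComparerHolder` differs in detail), is to catch any failure during class initialization and fall back to a pure-Java implementation:

```java
// Sketch of a holder that tries a "fast" implementation during class
// initialization and falls back to a safe one if anything goes wrong.
class ComparerHolder {
    interface Comparer {
        int compare(byte[] a, byte[] b);
    }

    // Pure-Java lexicographic comparison of unsigned bytes: always available.
    static final Comparer JAVA_COMPARER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    };

    static final Comparer BEST;
    static {
        Comparer chosen;
        try {
            chosen = loadUnsafeComparer();  // may fail on exotic JVMs
        } catch (Throwable t) {             // catch Throwable, not just Exception,
            chosen = JAVA_COMPARER;         // or the class stays broken forever
        }
        BEST = chosen;
    }

    // Stand-in for a reflective sun.misc.Unsafe lookup; failure simulated here.
    private static Comparer loadUnsafeComparer() {
        throw new UnsupportedOperationException("Unsafe not available");
    }

    public static void main(String[] args) {
        // The fallback was selected, and the class is still usable.
        System.out.println(ComparerHolder.BEST == ComparerHolder.JAVA_COMPARER); // prints true
    }
}
```

If instead the static block let the Throwable escape, every later reference to the class would throw NoClassDefFoundError, exactly as in the crash log above.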
[jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188712#comment-15188712 ] Ted Yu commented on HBASE-15433: {code} 112 return state != null ? state.getRegionCountOfTable(tName) : -1; 113 } else { {code} the 'else' keyword is not needed above. Is it possible to add a test ? > SnapshotManager#restoreSnapshot not update table and region count quota > correctly when encountering exception > - > > Key: HBASE-15433 > URL: https://issues.apache.org/jira/browse/HBASE-15433 > Project: HBase > Issue Type: Bug > Components: snapshots >Affects Versions: 2.0.0 >Reporter: Jianwei Cui > Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk.patch > > > In SnapshotManager#restoreSnapshot, the table and region quota will be > checked and updated as: > {code} > try { > // Table already exist. Check and update the region quota for this > table namespace > checkAndUpdateNamespaceRegionQuota(manifest, tableName); > restoreSnapshot(snapshot, snapshotTableDesc); > } catch (IOException e) { > > this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName); > LOG.error("Exception occurred while restoring the snapshot " + > snapshot.getName() > + " as table " + tableName.getNameAsString(), e); > throw e; > } > {code} > The 'checkAndUpdateNamespaceRegionQuota' will fail if regions in the snapshot > make the region count quota exceeded, then, the table will be removed in the > 'catch' block. This will make the current table count and region count > decrease, following table creation or region split will succeed even if the > actual quota is exceeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
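One way to avoid the quota under-count described above is to snapshot the counter before the check and restore exactly that value on failure, rather than unconditionally removing the table's contribution in the catch block. The sketch below models the accounting with a plain counter (all names are hypothetical, not HBase's actual QuotaManager API):

```java
// Toy model of namespace quota accounting: a region count with a limit.
// Shows why restoring the *previous* count on failure is safer than
// blindly subtracting the table's regions.
class QuotaSketch {
    static final int REGION_LIMIT = 10;
    int regionCount;

    // Check-and-update: add the snapshot's regions, throw if over quota.
    void checkAndUpdate(int snapshotRegions) {
        int updated = regionCount + snapshotRegions;
        if (updated > REGION_LIMIT) {
            throw new IllegalStateException("region quota exceeded");
        }
        regionCount = updated;
    }

    // Restore with exact rollback: remember the old count and put it back
    // on any failure, so a failed restore leaves the quota unchanged.
    void restoreSnapshot(int snapshotRegions) {
        int before = regionCount;
        try {
            checkAndUpdate(snapshotRegions);
            // ... the actual snapshot restore would happen here ...
        } catch (RuntimeException e) {
            regionCount = before; // exact rollback, not a blind removal
            throw e;
        }
    }

    public static void main(String[] args) {
        QuotaSketch q = new QuotaSketch();
        q.regionCount = 8;
        try {
            q.restoreSnapshot(5); // 8 + 5 exceeds the limit of 10
        } catch (IllegalStateException expected) {
            // rollback ran: count is still 8, not 8 - 5
        }
        System.out.println(q.regionCount); // prints 8
    }
}
```

In the buggy flow, the check throws before the count is ever incremented, yet the catch block still removes the table's regions, leaving the counter too low for later quota decisions.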
[jira] [Updated] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
[ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-15433: Attachment: HBASE-15433-trunk-v1.patch This patch could be applied to 2.0.0, 1.4.0, 1.3.0, 1.2.0 and 1.1.4. > SnapshotManager#restoreSnapshot not update table and region count quota > correctly when encountering exception > - > > Key: HBASE-15433 > URL: https://issues.apache.org/jira/browse/HBASE-15433 > Project: HBase > Issue Type: Bug > Components: snapshots >Affects Versions: 2.0.0 >Reporter: Jianwei Cui > Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk.patch > > > In SnapshotManager#restoreSnapshot, the table and region quota will be > checked and updated as: > {code} > try { > // Table already exist. Check and update the region quota for this > table namespace > checkAndUpdateNamespaceRegionQuota(manifest, tableName); > restoreSnapshot(snapshot, snapshotTableDesc); > } catch (IOException e) { > > this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName); > LOG.error("Exception occurred while restoring the snapshot " + > snapshot.getName() > + " as table " + tableName.getNameAsString(), e); > throw e; > } > {code} > The 'checkAndUpdateNamespaceRegionQuota' will fail if regions in the snapshot > make the region count quota exceeded, then, the table will be removed in the > 'catch' block. This will make the current table count and region count > decrease, following table creation or region split will succeed even if the > actual quota is exceeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13590) TestEnableTableHandler.testEnableTableWithNoRegionServers is flakey
[ https://issues.apache.org/jira/browse/HBASE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13590: - Resolution: Fixed Status: Resolved (was: Patch Available) Resolving committed ticket. > TestEnableTableHandler.testEnableTableWithNoRegionServers is flakey > --- > > Key: HBASE-13590 > URL: https://issues.apache.org/jira/browse/HBASE-13590 > Project: HBase > Issue Type: Test > Components: master >Reporter: Nick Dimiduk >Assignee: Yu Li > Fix For: 1.3.0, 1.2.1, 1.1.4 > > Attachments: HBASE-13590.branch-1.1.patch, > HBASE-13590.branch-1.1.patch, HBASE-13590.branch-1.1.patch, > HBASE-13590.branch-1.patch, HBASE-13590.branch-1.v2.patch, > testEnableTableHandler_branch-1.1.log.zip, > testEnableTableHandler_branch-1.log.zip > > > Looking at our [build > history|https://builds.apache.org/job/HBase-1.1/buildTimeTrend], it seems > this test is flakey. See builds 429, 431, 439. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13590) TestEnableTableHandler.testEnableTableWithNoRegionServers is flakey
[ https://issues.apache.org/jira/browse/HBASE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13590: - Fix Version/s: (was: 2.0.0) > TestEnableTableHandler.testEnableTableWithNoRegionServers is flakey > --- > > Key: HBASE-13590 > URL: https://issues.apache.org/jira/browse/HBASE-13590 > Project: HBase > Issue Type: Test > Components: master >Reporter: Nick Dimiduk >Assignee: Yu Li > Fix For: 1.3.0, 1.2.1, 1.1.4 > > Attachments: HBASE-13590.branch-1.1.patch, > HBASE-13590.branch-1.1.patch, HBASE-13590.branch-1.1.patch, > HBASE-13590.branch-1.patch, HBASE-13590.branch-1.v2.patch, > testEnableTableHandler_branch-1.1.log.zip, > testEnableTableHandler_branch-1.log.zip > > > Looking at our [build > history|https://builds.apache.org/job/HBase-1.1/buildTimeTrend], it seems > this test is flakey. See builds 429, 431, 439. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15295) MutateTableAccess.multiMutate() does not get high priority causing a deadlock
[ https://issues.apache.org/jira/browse/HBASE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-15295: - Fix Version/s: (was: 1.1.4) 1.1.5 > MutateTableAccess.multiMutate() does not get high priority causing a deadlock > - > > Key: HBASE-15295 > URL: https://issues.apache.org/jira/browse/HBASE-15295 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.5 > > Attachments: hbase-15295_v1.patch, hbase-15295_v1.patch, > hbase-15295_v2.patch, hbase-15295_v3.patch, hbase-15295_v4.patch, > hbase-15295_v5.patch, hbase-15295_v5.patch > > > We have seen this in a cluster with Phoenix secondary indexes leading to a > deadlock. All handlers are busy waiting on the index updates to finish: > {code} > "B.defaultRpcServer.handler=50,queue=0,port=16020" #91 daemon prio=5 > os_prio=0 tid=0x7f29f64ba000 nid=0xab51 waiting on condition > [0x7f29a8762000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x000124f1d5c8> (a > com.google.common.util.concurrent.AbstractFuture$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:275) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111) > at > org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:66) > at > org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99) > at > 
org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter.write(ParallelWriterIndexCommitter.java:194) > at > org.apache.phoenix.hbase.index.write.IndexWriter.write(IndexWriter.java:179) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:144) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:134) > at > org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:457) > at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:406) > at > org.apache.phoenix.hbase.index.Indexer.postBatchMutate(Indexer.java:401) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$36.call(RegionCoprocessorHost.java:1006) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1673) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1748) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1705) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutate(RegionCoprocessorHost.java:1002) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3162) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2801) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2743) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:692) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:654) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2031) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114) > at 
org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > {code} > And the index region is trying to split, and is trying to do a meta update: > {code} > "regionserver//10.132.70.191:16020-splits-1454693389669" #1779 > prio=5 os_prio=0 tid=0x7f29e149c000 nid=0x5107 in Object.wait() > [0x7f1f136d6000] >java.lang.Thread.State: TIMED_WAITING (on
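The stacks above show the classic self-deadlock: every RPC handler is blocked on work that can only complete by obtaining a handler from the same exhausted pool, which is why the fix routes meta mutations through a higher-priority handler set. A minimal reproduction of the pattern, using generic java.util.concurrent executors rather than HBase's actual RpcScheduler:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Demonstrates pool self-deadlock: a task in a one-thread pool blocks on a
// second task submitted to the SAME pool. The second task can never start,
// so the first never finishes -- analogous to all RPC handlers waiting on
// meta updates that themselves need a free handler.
class PoolDeadlockSketch {
    static boolean deadlocks(ExecutorService pool) throws Exception {
        Future<?> outer = pool.submit(() -> {
            Future<?> inner = pool.submit(() -> { });
            try {
                inner.get();            // blocks: no free thread to run 'inner'
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        try {
            outer.get(500, TimeUnit.MILLISECONDS);
            return false;               // completed: no deadlock
        } catch (TimeoutException e) {
            return true;                // stuck: deadlock reproduced
        }
    }

    public static void main(String[] args) throws Exception {
        // One "handler": dependent work submitted to it deadlocks.
        ExecutorService one = Executors.newFixedThreadPool(1);
        System.out.println(deadlocks(one)); // prints true
        one.shutdownNow();
    }
}
```

Giving the dependent work its own pool (in HBase terms, dedicated high-priority handlers for meta operations) breaks the cycle, since the inner task no longer competes with its caller for threads.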
[jira] [Commented] (HBASE-15295) MutateTableAccess.multiMutate() does not get high priority causing a deadlock
[ https://issues.apache.org/jira/browse/HBASE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188669#comment-15188669 ] Nick Dimiduk commented on HBASE-15295: -- Bleh. I'm not able to give this the review attention it needs this week, so unless someone else is able to volunteer, I'm going to bump it to 1.1.5. Maybe [~jesse_yates] or [~rajeshbabu] can spare a moment here? > MutateTableAccess.multiMutate() does not get high priority causing a deadlock > - > > Key: HBASE-15295 > URL: https://issues.apache.org/jira/browse/HBASE-15295 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.5 > > Attachments: hbase-15295_v1.patch, hbase-15295_v1.patch, > hbase-15295_v2.patch, hbase-15295_v3.patch, hbase-15295_v4.patch, > hbase-15295_v5.patch, hbase-15295_v5.patch > > > We have seen this in a cluster with Phoenix secondary indexes leading to a > deadlock. All handlers are busy waiting on the index updates to finish: > {code} > "B.defaultRpcServer.handler=50,queue=0,port=16020" #91 daemon prio=5 > os_prio=0 tid=0x7f29f64ba000 nid=0xab51 waiting on condition > [0x7f29a8762000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x000124f1d5c8> (a > com.google.common.util.concurrent.AbstractFuture$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:275) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111) > at > 
org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:66) > at > org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99) > at > org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter.write(ParallelWriterIndexCommitter.java:194) > at > org.apache.phoenix.hbase.index.write.IndexWriter.write(IndexWriter.java:179) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:144) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:134) > at > org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:457) > at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:406) > at > org.apache.phoenix.hbase.index.Indexer.postBatchMutate(Indexer.java:401) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$36.call(RegionCoprocessorHost.java:1006) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1673) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1748) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1705) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutate(RegionCoprocessorHost.java:1002) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3162) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2801) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2743) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:692) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:654) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2031) > at > 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > {code} > And the index region is trying to split, and is trying to do a meta update: > {code} >
[jira] [Commented] (HBASE-15295) MutateTableAccess.multiMutate() does not get high priority causing a deadlock
[ https://issues.apache.org/jira/browse/HBASE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188659#comment-15188659 ] Enis Soztutar commented on HBASE-15295: --- Reviews welcome :) precommit unit test status is pretty bad nowadays. I did not spend much time looking at it, but it seems that there is either a recent regression from a previous commit, or an env problem. I was not able to get a good run locally as well. > MutateTableAccess.multiMutate() does not get high priority causing a deadlock > - > > Key: HBASE-15295 > URL: https://issues.apache.org/jira/browse/HBASE-15295 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4 > > Attachments: hbase-15295_v1.patch, hbase-15295_v1.patch, > hbase-15295_v2.patch, hbase-15295_v3.patch, hbase-15295_v4.patch, > hbase-15295_v5.patch, hbase-15295_v5.patch > > > We have seen this in a cluster with Phoenix secondary indexes leading to a > deadlock. 
All handlers are busy waiting on the index updates to finish: > {code} > "B.defaultRpcServer.handler=50,queue=0,port=16020" #91 daemon prio=5 > os_prio=0 tid=0x7f29f64ba000 nid=0xab51 waiting on condition > [0x7f29a8762000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x000124f1d5c8> (a > com.google.common.util.concurrent.AbstractFuture$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:275) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111) > at > org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:66) > at > org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99) > at > org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter.write(ParallelWriterIndexCommitter.java:194) > at > org.apache.phoenix.hbase.index.write.IndexWriter.write(IndexWriter.java:179) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:144) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:134) > at > org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:457) > at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:406) > at > org.apache.phoenix.hbase.index.Indexer.postBatchMutate(Indexer.java:401) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$36.call(RegionCoprocessorHost.java:1006) > at > 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1673) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1748) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1705) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutate(RegionCoprocessorHost.java:1002) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3162) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2801) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2743) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:692) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:654) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2031) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > {code} > And the index region is trying to split, and is
[jira] [Updated] (HBASE-15120) Backport HBASE-14883 to branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-15120: - Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to branch-1.1. Thank you, mighty [~carp84]. > Backport HBASE-14883 to branch-1.1 > -- > > Key: HBASE-15120 > URL: https://issues.apache.org/jira/browse/HBASE-15120 > Project: HBase > Issue Type: Bug >Affects Versions: 1.1.2 >Reporter: Yu Li >Assignee: Yu Li >Priority: Minor > Fix For: 1.1.4 > > Attachments: HBASE-15120.branch-1.1.patch, > HBASE-15120.branch-1.1.patch, HBASE-15120.branch-1.1.patch, > HBASE-15120.branch-1.1.patch > > > When checking branch-1.1 UT in HBASE-13590, found > TestSplitTransactionOnCluster#testFailedSplit will fail at a 12/50 chance. > The issue is fixed by HBASE-14883 but the change didn't go into branch-1.1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15295) MutateTableAccess.multiMutate() does not get high priority causing a deadlock
[ https://issues.apache.org/jira/browse/HBASE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188648#comment-15188648 ] Nick Dimiduk commented on HBASE-15295: -- Any progress here? Seems like Phoenix users will be sad without it. > MutateTableAccess.multiMutate() does not get high priority causing a deadlock > - > > Key: HBASE-15295 > URL: https://issues.apache.org/jira/browse/HBASE-15295 > Project: HBase > Issue Type: Bug >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4 > > Attachments: hbase-15295_v1.patch, hbase-15295_v1.patch, > hbase-15295_v2.patch, hbase-15295_v3.patch, hbase-15295_v4.patch, > hbase-15295_v5.patch, hbase-15295_v5.patch > > > We have seen this in a cluster with Phoenix secondary indexes leading to a > deadlock. All handlers are busy waiting on the index updates to finish: > {code} > "B.defaultRpcServer.handler=50,queue=0,port=16020" #91 daemon prio=5 > os_prio=0 tid=0x7f29f64ba000 nid=0xab51 waiting on condition > [0x7f29a8762000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x000124f1d5c8> (a > com.google.common.util.concurrent.AbstractFuture$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:275) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111) > at > org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:66) > at > 
org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99) > at > org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter.write(ParallelWriterIndexCommitter.java:194) > at > org.apache.phoenix.hbase.index.write.IndexWriter.write(IndexWriter.java:179) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:144) > at > org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:134) > at > org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:457) > at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:406) > at > org.apache.phoenix.hbase.index.Indexer.postBatchMutate(Indexer.java:401) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$36.call(RegionCoprocessorHost.java:1006) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1673) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1748) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1705) > at > org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutate(RegionCoprocessorHost.java:1002) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3162) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2801) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2743) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:692) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:654) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2031) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213) > at 
org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > {code} > And the index region is trying to split, and is trying to do a meta update: > {code} > "regionserver//10.132.70.191:16020-splits-1454693389669" #1779 > prio=5 os_prio=0 tid=0x7f29e149c000 nid=0x5107 in Object.wait() > [0x7f1f136d6000] >
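The deadlock pattern described above (every RPC handler blocked on a future whose work itself needs a handler) can be reproduced in miniature with a bounded executor. This is an illustrative stdlib-only sketch, not HBase or Phoenix code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HandlerDeadlockDemo {
    // A single "handler" thread: the outer task occupies it while waiting on
    // an inner task that can only run on that same handler -> deadlock.
    public static boolean deadlocks() {
        ExecutorService handlers = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> outer = handlers.submit(() -> {
                // Stands in for the meta update the split is waiting on.
                Future<?> inner = handlers.submit(() -> { });
                try {
                    inner.get(500, TimeUnit.MILLISECONDS); // would block forever
                    return false;
                } catch (TimeoutException e) {
                    return true; // inner never ran: the only handler is busy here
                }
            });
            return outer.get();
        } catch (Exception e) {
            return false;
        } finally {
            handlers.shutdownNow();
        }
    }
}
```

The fix discussed in the issue is to give the meta-update RPC a higher priority so it is served by a different (non-exhausted) handler pool.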
[jira] [Commented] (HBASE-15322) HBase 1.1.3 crashing
[ https://issues.apache.org/jira/browse/HBASE-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188646#comment-15188646 ] Nick Dimiduk commented on HBASE-15322: -- If this patch fixed it for our reporter's use case, I'm good with committing this patch as is. We can make further improvements in this area as developer time/attention allows. +1, ready for commit [~anoop.hbase]? > HBase 1.1.3 crashing > > > Key: HBASE-15322 > URL: https://issues.apache.org/jira/browse/HBASE-15322 > Project: HBase > Issue Type: Bug > Components: hbase >Affects Versions: 1.0.0, 2.0.0, 0.98.7, 0.94.24 > Environment: OS: Ubuntu 14.04/Ubuntu 15.10 > JDK: OpenJDK8/OpenJDK9 >Reporter: Anant Sharma >Assignee: Anoop Sam John >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 0.98.18, 1.4.0 > > Attachments: BASE-15322.patch > > > HBase crashes in standalone mode with the following log: > __ > 2016-02-24 22:38:37,578 ERROR [main] master.HMasterCommandLine: Master exiting > java.lang.RuntimeException: Failed construction of Master: class > org.apache.hadoop.hbase.master.HMaster > at > org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2341) > at > org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:233) > at > org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at > org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) > at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2355) > Caused by: java.lang.NoClassDefFoundError: Could not initialize class > org.apache.hadoop.hbase.util.Bytes$LexicographicalComparerHolder$UnsafeComparer > at org.apache.hadoop.hbase.util.Bytes.putInt(Bytes.java:899) > at > org.apache.hadoop.hbase.KeyValue.createByteArray(KeyValue.java:1082) > at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:652) > at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:580) > 
at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:483) > at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:370) > at org.apache.hadoop.hbase.KeyValue.(KeyValue.java:267) > at org.apache.hadoop.hbase.HConstants.(HConstants.java:978) > at > org.apache.hadoop.hbase.HTableDescriptor.(HTableDescriptor.java:1488) > at > org.apache.hadoop.hbase.util.FSTableDescriptors.(FSTableDescriptors.java:124) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:570) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:365) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2336) > __ > The class is in the hbase-common.jar and its there in the classpath as can be > seen from the log: > _ > 2016-02-24 22:38:32,538 INFO [main] util.ServerCommandLine: >
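The "Could not initialize class" error above is the JVM's signature for an earlier static-initializer failure: the first touch of the class throws `ExceptionInInitializerError`, and every later touch throws `NoClassDefFoundError` even though the class is on the classpath. A minimal reproduction with a hypothetical class (not HBase's `Bytes`):

```java
public class StaticInitFailureDemo {
    static class Unsafeish {
        static {
            // Simulates Bytes$LexicographicalComparerHolder$UnsafeComparer
            // failing during class initialization.
            if (true) throw new RuntimeException("init failed");
        }
        static int use() { return 1; }
    }

    // Returns the error seen on the first and on a subsequent access.
    public static String[] twoFailures() {
        String[] out = new String[2];
        try { Unsafeish.use(); } catch (Throwable t) { out[0] = t.getClass().getSimpleName(); }
        try { Unsafeish.use(); } catch (Throwable t) { out[1] = t.getClass().getSimpleName(); }
        return out;
    }
}
```

So the root cause to look for is whatever made the comparer's static initializer fail the first time (often an `Unsafe` access problem on the given JDK), not a missing jar.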
[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188641#comment-15188641 ] ramkrishna.s.vasudevan commented on HBASE-15425: If the bulk loaded file is loaded again and this time the WAL marker is added, will only the latest file that got retried be replicated in the peer cluster? Yes, the seqId does matter here. > Failing to write bulk load event marker in the WAL is ignored > - > > Key: HBASE-15425 > URL: https://issues.apache.org/jira/browse/HBASE-15425 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Ashish Singhi >Assignee: Ashish Singhi > Fix For: 2.0.0 > > Attachments: HBASE-15425.patch, HBASE-15425.v1.patch > > > During LoadIncrementalHFiles process if we fail to write the bulk load event > marker in the WAL, it is ignored. So this will lead to data mismatch issue in > source and peer cluster in case of bulk loaded data replication scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14844) backport jdk.tools exclusion to 1.0 and 1.1
[ https://issues.apache.org/jira/browse/HBASE-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188643#comment-15188643 ] Nick Dimiduk commented on HBASE-14844: -- Reopening HBASE-13963 is fine by me [~busbey], since you haven't closed it yet. > backport jdk.tools exclusion to 1.0 and 1.1 > --- > > Key: HBASE-14844 > URL: https://issues.apache.org/jira/browse/HBASE-14844 > Project: HBase > Issue Type: Task > Components: build >Affects Versions: 1.1.2, 1.0.3 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Critical > Fix For: 1.1.4, 1.0.4 > > > per [~apurtell]'s comment when backporting HBASE-13963 to 0.98, we should > probably consider leaking jdk.tools in 1.0 and 1.1 to be bugs as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14963) Remove Guava dependency from HBase client code
[ https://issues.apache.org/jira/browse/HBASE-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188631#comment-15188631 ] Enis Soztutar commented on HBASE-14963: --- I just checked the patch again. It does not remove the dependency of guava from the client module as the title suggests. It just changes an internal code path to not use guava, that is all. I think this can go in all applicable branches. > Remove Guava dependency from HBase client code > -- > > Key: HBASE-14963 > URL: https://issues.apache.org/jira/browse/HBASE-14963 > Project: HBase > Issue Type: Improvement > Components: Client >Reporter: Devaraj Das >Assignee: Devaraj Das > Labels: needs_releasenote > Fix For: 2.0.0 > > Attachments: no-stopwatch.txt > > > We ran into an issue where an application bundled its own Guava (and that > happened to be in the classpath first) and HBase's MetaTableLocator threw an > exception due to the fact that Stopwatch's constructor wasn't compatible... > Might be better to not depend on Stopwatch at all in MetaTableLocator since > the functionality is easily doable without. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
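As the comment notes, the Stopwatch functionality is easily doable without Guava. A minimal `System.nanoTime()`-based sketch of what a caller like MetaTableLocator needs (class and method names here are hypothetical, not the actual patch):

```java
public class SimpleStopwatch {
    private long startNanos = -1;

    // Records the start time; returns this for chaining.
    public SimpleStopwatch start() {
        startNanos = System.nanoTime();
        return this;
    }

    // Elapsed wall-clock time since start(), in milliseconds.
    public long elapsedMillis() {
        if (startNanos < 0) throw new IllegalStateException("not started");
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }
}
```

Because this touches only `java.lang`, it cannot collide with whatever Guava version an application bundles, which is exactly the failure mode the issue describes.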
[jira] [Commented] (HBASE-15411) Rewrite backup with Procedure V2
[ https://issues.apache.org/jira/browse/HBASE-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188629#comment-15188629 ] Enis Soztutar commented on HBASE-15411: --- bq. Still encounters the following when running TestFullBackup#testFullBackupSingle : In the offline discussion, I think the plan was to submit the proc to the queue for the {{hbase:backup}} table as a pseudo-global queue. cc. [~syuanjiang]. > Rewrite backup with Procedure V2 > > > Key: HBASE-15411 > URL: https://issues.apache.org/jira/browse/HBASE-15411 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 15411-v1.txt, 15411-v3.txt, 15411-v5.txt, > FullTableBackupProcedure.java > > > Currently full / incremental backup is driven by BackupHandler (see call() > method for flow). > This issue is to rewrite the flow using Procedure V2. > States (enum) for full / incremental backup would be introduced in > Backup.proto which correspond to the steps performed in BackupHandler#call(). > executeFromState() would pace the backup based on the current state. > serializeStateData() / deserializeStateData() would be used to persist state > into procedure WAL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
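The description's shape — `executeFromState()` pacing the work and `serializeStateData()`/`deserializeStateData()` persisting progress — is the standard Procedure V2 state-machine pattern. A stdlib-only sketch under that assumption (state names and structure are illustrative, not the actual backup procedure):

```java
import java.util.ArrayList;
import java.util.List;

public class BackupProcedureSketch {
    // Hypothetical states mirroring steps of BackupHandler#call().
    enum State { SNAPSHOT_TABLES, EXPORT_SNAPSHOTS, FINISH, DONE }

    private State state = State.SNAPSHOT_TABLES;
    final List<String> log = new ArrayList<>();

    // Analogue of executeFromState(): perform one step, advance the state,
    // return whether more work remains.
    boolean executeFromState() {
        log.add(state.name());
        switch (state) {
            case SNAPSHOT_TABLES: state = State.EXPORT_SNAPSHOTS; return true;
            case EXPORT_SNAPSHOTS: state = State.FINISH; return true;
            case FINISH: state = State.DONE; return false;
            default: return false;
        }
    }

    // Analogues of serializeStateData()/deserializeStateData(): persist just
    // enough into the procedure WAL to resume after a crash.
    String serializeStateData() { return state.name(); }
    void deserializeStateData(String data) { state = State.valueOf(data); }
}
```
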
[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does not count CellScanner payload
[ https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188624#comment-15188624 ] Enis Soztutar commented on HBASE-15437: --- We can keep the other metric updated, and move only the responseSize metric update to the new place. > Response size calculated in RPCServer for warning tooLarge responses does > not count CellScanner payload > --- > > Key: HBASE-15437 > URL: https://issues.apache.org/jira/browse/HBASE-15437 > Project: HBase > Issue Type: Bug > Components: IPC/RPC >Reporter: deepankar > > After HBASE-13158, where we respond back to RPCs with cells in the payload, > the protobuf response will just have the count of the cells to read from the > payload, but there is a feature where we log a warning in RPCServer > whenever the response is tooLarge, and this size is now not considering the > sizes of the cells in the PayloadCellScanner. Code from RPCServer > {code} > long responseSize = result.getSerializedSize(); > // log any RPC responses that are slower than the configured warn > // response time or larger than configured warning size > boolean tooSlow = (processingTime > warnResponseTime && > warnResponseTime > -1); > boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > > -1); > if (tooSlow || tooLarge) { > // when tagging, we let TooLarge trump TooSmall to keep output simple > // note that large responses will often also be slow. > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + > ")", > (tooLarge ? "TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > } > {code} > Should this feature no longer be supported, or should we add a method to > CellScanner or a new interface which returns the serialized size (but this > might not include the compression codecs which might be used during the > response)? Any other idea how this could be fixed? 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
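The fix under discussion amounts to adding the CellScanner payload to the size used in the warning check. A stdlib-only sketch of that change (method and parameter names are hypothetical, not the actual RPCServer fields):

```java
public class ResponseWarnCheck {
    // Mirrors the tooLarge logic quoted above, but counts the cell payload
    // alongside the serialized protobuf size; warnResponseSize == -1 disables.
    static boolean tooLarge(long pbSerializedSize, long cellPayloadSize,
                            long warnResponseSize) {
        long responseSize = pbSerializedSize + cellPayloadSize;
        return responseSize > warnResponseSize && warnResponseSize > -1;
    }
}
```

With this shape, a response whose protobuf part is tiny but whose cell payload is many megabytes would again trip the tooLarge warning.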
[jira] [Commented] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188605#comment-15188605 ] Hadoop QA commented on HBASE-15430: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HBASE-15430 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.2.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12792426/hbase-15430-v2.patch | | JIRA Issue | HBASE-15430 | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/917/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > Failed taking snapshot - Manifest proto-message too large > - > > Key: HBASE-15430 > URL: https://issues.apache.org/jira/browse/HBASE-15430 > Project: HBase > Issue Type: Bug > Components: snapshots >Affects Versions: 0.98.11 >Reporter: JunHo Cho >Assignee: JunHo Cho >Priority: Critical > Attachments: hbase-15430-v1.patch, hbase-15430-v2.patch, > hbase-15430.patch > > > the size of a protobuf message is 64MB (default). but the size of snapshot > meta is over 64MB. > Caused by: com.google.protobuf.InvalidProtocolBufferException via Failed > taking snapshot { ss=snapshot_xxx table=xxx type=FLUSH } due to > exception:Protocol message was too large. May be malicious. Use > CodedInputStream.setSizeLimit() to increase the size > limit.:com.google.protobuf.InvalidProtocolBufferException: Protocol message > was too large. May be malicious. Use CodedInputStream.setSizeLimit() to > increase the size limit. 
> at > org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83) > at > org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:307) > at > org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:341) > ... 10 more > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol > message was too large. May be malicious. Use > CodedInputStream.setSizeLimit() to increase the size limit. > at > com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110) > at > com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755) > at > com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811) > at > com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329) > at > org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.(HBaseProtos.java:3767) > at > org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.(HBaseProtos.java:3699) > at > org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3815) > at > org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3810) > at > com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) > at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.(SnapshotProtos.java:1152) > at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.(SnapshotProtos.java:1094) > at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1201) > at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1196) > at > com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) 
> at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.(SnapshotProtos.java:3858) > at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.(SnapshotProtos.java:3792) > at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3894) > at > org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3889) > at > com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200) > at >
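The exception above comes from protobuf's `CodedInputStream` refusing to buffer a message past its configured cap (64 MB by default), which callers can relax via `setSizeLimit()`. The stdlib-only sketch below illustrates the same guard on a length-prefixed payload; it is an analogy, not protobuf's implementation:

```java
import java.io.DataInputStream;
import java.io.IOException;

public class SizeLimitDemo {
    // Reads one length-prefixed message, rejecting anything over sizeLimit —
    // the same policy CodedInputStream.setSizeLimit() controls.
    static byte[] readMessage(DataInputStream in, int sizeLimit) throws IOException {
        int len = in.readInt();
        if (len > sizeLimit) {
            throw new IOException("Protocol message was too large: " + len + " > " + sizeLimit);
        }
        byte[] buf = new byte[len];
        in.readFully(buf);
        return buf;
    }
}
```

Raising the limit when reading the snapshot data manifest (the direction the attached patches take) keeps large-but-legitimate manifests loadable.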
[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring
[ https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188588#comment-15188588 ] Yu Li commented on HBASE-15160: --- Sure, please go ahead [~enis], and thanks for offering help. Sorry for the lag, kinda busy recently resolving online issues... > Put back HFile's HDFS op latency sampling code and add metrics for monitoring > - > > Key: HBASE-15160 > URL: https://issues.apache.org/jira/browse/HBASE-15160 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0, 1.1.2 >Reporter: Yu Li >Assignee: Yu Li > Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, > HBASE-15160_v3.patch > > > In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, > fsPreadLatency and fsWriteLatency, have been removed. There was some > discussion about putting them back in a new JIRA but never happened. > According to our experience, these metrics are useful to judge whether issue > lies on HDFS when slow request occurs, so we propose to put them back in this > JIRA, and add the metrics for monitoring as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
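The metrics being restored are simple op-latency samples taken around each filesystem call. A hedged sketch of that pattern (nothing here is the actual HFile or metrics code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class FsLatencySampler {
    private final List<Long> readLatenciesNanos = new ArrayList<>();

    // Wraps an FS read, recording its wall-clock latency for metrics so slow
    // requests can be attributed to HDFS rather than HBase internals.
    <T> T timedRead(Supplier<T> read) {
        long start = System.nanoTime();
        try {
            return read.get();
        } finally {
            readLatenciesNanos.add(System.nanoTime() - start);
        }
    }

    int sampleCount() { return readLatenciesNanos.size(); }
}
```

In the real patch these samples would feed a metrics histogram (fsReadLatency, fsPreadLatency, fsWriteLatency) rather than an in-memory list.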
[jira] [Commented] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188582#comment-15188582 ] JunHo Cho commented on HBASE-15430: --- I changed DATA_MANIFEST to be public and added a test class. > Failed taking snapshot - Manifest proto-message too large > - > > Key: HBASE-15430 > URL: https://issues.apache.org/jira/browse/HBASE-15430 > Project: HBase > Issue Type: Bug > Components: snapshots >Affects Versions: 0.98.11 >Reporter: JunHo Cho >Assignee: JunHo Cho >Priority: Critical > Attachments: hbase-15430-v1.patch, hbase-15430-v2.patch, > hbase-15430.patch > > > the size of a protobuf message is 64MB (default). but the size of snapshot > meta is over 64MB. > Caused by: com.google.protobuf.InvalidProtocolBufferException via Failed > taking snapshot { ss=snapshot_xxx table=xxx type=FLUSH } due to > exception:Protocol message was too large. May be malicious. Use > CodedInputStream.setSizeLimit() to increase the size > limit.:com.google.protobuf.InvalidProtocolBufferException: Protocol message > was too large. May be malicious. Use CodedInputStream.setSizeLimit() to > increase the size limit. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JunHo Cho updated HBASE-15430: -- Attachment: hbase-15430-v2.patch > Failed taking snapshot - Manifest proto-message too large > - > > Key: HBASE-15430 > URL: https://issues.apache.org/jira/browse/HBASE-15430 > Project: HBase > Issue Type: Bug > Components: snapshots >Affects Versions: 0.98.11 >Reporter: JunHo Cho >Assignee: JunHo Cho >Priority: Critical > Attachments: hbase-15430-v1.patch, hbase-15430-v2.patch, > hbase-15430.patch > > > the size of a protobuf message is 64MB (default). but the size of snapshot > meta is over 64MB. > Caused by: com.google.protobuf.InvalidProtocolBufferException via Failed > taking snapshot { ss=snapshot_xxx table=xxx type=FLUSH } due to > exception:Protocol message was too large. May be malicious. Use > CodedInputStream.setSizeLimit() to increase the size > limit.:com.google.protobuf.InvalidProtocolBufferException: Protocol message > was too large. May be malicious. Use CodedInputStream.setSizeLimit() to > increase the size limit. > at > org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83) > at > org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:307) > at > org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:341) > ... 10 more > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol > message was too large. May be malicious. Use > CodedInputStream.setSizeLimit() to increase the size limit. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188575#comment-15188575 ] Anoop Sam John commented on HBASE-15425: bq. For region replicas, missing a flush file or bulk load files is not a critical condition (since eventually they will be picked up due to compactions) Regarding the bulk load markers, that is not for region replicas; it is for replication across clusters. These WAL marker cells will also be passed via the replication path, and the sink cluster will read the cell and load the file there as well. So a miss in writing this cell to the WAL during bulk load will result in all of that bulk loaded data never becoming available in the peer cluster (and it will not become available at any later time either). But yes, regarding your point about retrying and loading the file again: we need to check for any impact. So these 2 files will have different seqIds? > Failing to write bulk load event marker in the WAL is ignored > - > > Key: HBASE-15425 > URL: https://issues.apache.org/jira/browse/HBASE-15425 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Ashish Singhi >Assignee: Ashish Singhi > Fix For: 2.0.0 > > Attachments: HBASE-15425.patch, HBASE-15425.v1.patch > > > During LoadIncrementalHFiles process if we fail to write the bulk load event > marker in the WAL, it is ignored. So this will lead to data mismatch issue in > source and peer cluster in case of bulk loaded data replication scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
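The fix under discussion is to stop swallowing a failed bulk-load-event append, so the load fails (and can be retried) rather than silently diverging from the peer cluster. A stdlib-only sketch with hypothetical interfaces, not the actual HRegion/WAL API:

```java
import java.io.IOException;

public class BulkLoadMarkerSketch {
    // Stand-in for the WAL append of a bulk load event marker.
    interface Wal {
        void appendMarker(String bulkLoadEvent) throws IOException;
    }

    // Before the fix, an IOException here was caught and ignored; after the
    // fix it propagates, failing the bulk load so replication stays in sync.
    static void finishBulkLoad(Wal wal, String hfilePath) throws IOException {
        wal.appendMarker("BULK_LOAD " + hfilePath);
    }
}
```
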
[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188566#comment-15188566 ] Anoop Sam John commented on HBASE-15392: The index is used to know which HFile block a row/cell is in. Within a block there may be many rows/cells. To get to a particular cell within a block, it is a linear search: you read the next cell, see whether it is the one needed, and if not read the next. There is no indexing within a block. That said, it's not generic seek vs. skip; what Lars says is about seek within a block. So this seek (within a block) also works somewhat like the read-and-skip way, but he says the seek is way slower. Looking at the code I was not able to judge why it is so slow. Any idea, [~lhofhansl]? Hope I am answering your questions and clearing up doubts, Daniel. And yes, when the seek point is outside the current block, we won't do skip-skip. > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, no_optimize.patch, no_optimize.patch, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. 
> Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > 
prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > java.lang.Throwable > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288) > at >
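Anoop's point above — the block index locates the HFile block, but within a block lookup is a linear scan over cells — can be shown with a simplified sketch (sorted String keys standing in for cells; not the actual HFileScanner code):

```java
import java.util.List;

public class WithinBlockSeek {
    // The index narrows the search to one block; inside it there is no
    // further index, so we read cell after cell until we reach or pass
    // the seek key.
    static int seekInBlock(List<String> blockCells, String key) {
        for (int i = 0; i < blockCells.size(); i++) {
            if (blockCells.get(i).compareTo(key) >= 0) {
                return i; // first cell at or after the seek key
            }
        }
        return -1; // seek point is beyond this block: move to the next block
    }
}
```

The `-1` case corresponds to the situation Anoop describes where the seek point falls outside the current block and the scanner consults the index again instead of skipping cell by cell.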
[jira] [Created] (HBASE-15439) Mob compaction is not triggered after extended period of time
Ted Yu created HBASE-15439: -- Summary: Mob compaction is not triggered after extended period of time Key: HBASE-15439 URL: https://issues.apache.org/jira/browse/HBASE-15439 Project: HBase Issue Type: Bug Reporter: Ted Yu
I was running the IntegrationTestIngestWithMOB test. I lowered the mob compaction chore interval to this value:
{code}
<property>
  <name>hbase.mob.compaction.chore.period</name>
  <value>6000</value>
</property>
{code}
After a whole night, there was no indication from the master log that mob compaction ran. All I found was:
{code}
2016-03-09 04:18:52,194 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 05:58:52,516 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 07:38:52,847 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 09:18:52,848 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 10:58:52,932 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 12:38:52,932 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 14:18:52,933 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 15:58:52,957 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_1] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
2016-03-09 17:38:52,960 INFO [tyu-hbase-rhel-re-2.novalocal,2,1457491115327_ChoreService_2] hbase.ScheduledChore: Chore: tyu-hbase-rhel-re-2.novalocal,2,1457491115327-MobCompactionChore missed its start time
{code}
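The "missed its start time" messages above come from HBase's chore scheduler. As a rough illustration of the condition being reported, the sketch below (not HBase's actual ScheduledChore code; the class and method names are invented) checks whether the elapsed time since a chore's previous run exceeds its configured period, which is what the log above shows happening roughly every 100 minutes:

```java
// Hypothetical sketch of a "missed start time" check; not HBase's
// actual ScheduledChore implementation.
public class ChoreTimingSketch {
    // All arguments use the same time unit (e.g. milliseconds).
    // Returns true when the gap since the last run exceeds the period,
    // i.e. the chore has missed at least one scheduled start.
    public static boolean missedStartTime(long period, long lastRunTime, long now) {
        return (now - lastRunTime) > period;
    }

    public static void main(String[] args) {
        long period = 6000; // the lowered hbase.mob.compaction.chore.period from the report
        // A run arriving 100 minutes (in ms) after the previous one has
        // clearly missed its slot.
        System.out.println(missedStartTime(period, 0, 100L * 60 * 1000));
        // A run arriving well within the period has not.
        System.out.println(missedStartTime(period, 0, 3000));
    }
}
```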
[jira] [Commented] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188563#comment-15188563 ] Hadoop QA commented on HBASE-15430: ---
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HBASE-15430 does not apply to master. Rebase required? Wrong branch? See https://yetus.apache.org/documentation/0.2.0/precommit-patchnames for help. {color} |
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12792422/hbase-15430-v1.patch |
| JIRA Issue | HBASE-15430 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/916/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
> Failed taking snapshot - Manifest proto-message too large
> -
>
> Key: HBASE-15430
> URL: https://issues.apache.org/jira/browse/HBASE-15430
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 0.98.11
> Reporter: JunHo Cho
> Assignee: JunHo Cho
> Priority: Critical
> Attachments: hbase-15430-v1.patch, hbase-15430.patch
>
> The default maximum size of a protobuf message is 64MB, but the size of the snapshot meta is over 64MB.
> Caused by: com.google.protobuf.InvalidProtocolBufferException via Failed taking snapshot { ss=snapshot_xxx table=xxx type=FLUSH } due to exception: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
> at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
> at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:307)
> at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:341)
> ... 10 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
> at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
> at com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
> at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
> at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.<init>(HBaseProtos.java:3767)
> at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo.<init>(HBaseProtos.java:3699)
> at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3815)
> at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionInfo$1.parsePartialFrom(HBaseProtos.java:3810)
> at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1152)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1094)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1201)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1196)
> at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.<init>(SnapshotProtos.java:3858)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.<init>(SnapshotProtos.java:3792)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3894)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3889)
> at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
> at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.parseFrom(SnapshotProtos.java:4094)
> at org.apache.hadoop.hbase.snapshot.SnapshotManifest.readDataManifest(SnapshotManifest.java:433)
> at org.apache.hadoop.hbase.snapshot.SnapshotManifest.load(SnapshotManifest.java:273)
> at org.apache.hadoop.hbase.snapshot.SnapshotManifest.open(SnapshotManifest.java:119)
> at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:106)
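The failure above is protobuf's size guard: CodedInputStream refuses to parse a message larger than its configured limit (64MB by default), and the exception message points at CodedInputStream.setSizeLimit() as the escape hatch. The sketch below is only an illustration of that failure mode, with an invented class name and without the real protobuf API, since the fix under discussion amounts to raising the limit when reading the snapshot data manifest:

```java
// Illustrative sketch of protobuf's size-limit behavior; not the actual
// com.google.protobuf.CodedInputStream code. Class name is invented.
public class SizeLimitSketch {
    // protobuf's default message size limit is 64 MiB.
    public static final int DEFAULT_SIZE_LIMIT = 64 << 20;

    private final int sizeLimit;

    public SizeLimitSketch(int sizeLimit) {
        this.sizeLimit = sizeLimit;
    }

    /** True if a message of the given byte size would parse under this limit. */
    public boolean fits(long messageSizeBytes) {
        return messageSizeBytes <= sizeLimit;
    }

    public static void main(String[] args) {
        long manifestSize = 70L << 20; // a 70 MiB snapshot data manifest
        // Under the default limit the parse is rejected, which is the
        // InvalidProtocolBufferException seen in the stack trace.
        System.out.println(new SizeLimitSketch(DEFAULT_SIZE_LIMIT).fits(manifestSize));
        // Raising the limit (what setSizeLimit() allows a caller to do)
        // lets the same manifest parse.
        System.out.println(new SizeLimitSketch(128 << 20).fits(manifestSize));
    }
}
```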
[jira] [Updated] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-15425: --- Hadoop Flags: Reviewed Fix Version/s: 2.0.0
[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188557#comment-15188557 ] Ted Yu commented on HBASE-15425: BulkLoad tests passed. Let me commit to master branch first.
[jira] [Updated] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-15430: Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188552#comment-15188552 ] Matteo Bertozzi commented on HBASE-15430: - v1 looks good to me. The QA checkstyle will probably complain about parenthesis spacing and tabs vs. 2-space indentation, but other than that it looks OK to me. Maybe the only change is to make DATA_MANIFEST_NAME visible for testing instead of copying it in the test class.
[jira] [Updated] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-15430: Assignee: JunHo Cho
[jira] [Updated] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-15378: --- Fix Version/s: 1.1.4
> Scanner cannot handle heartbeat message with no results
> ---
>
> Key: HBASE-15378
> URL: https://issues.apache.org/jira/browse/HBASE-15378
> Project: HBase
> Issue Type: Bug
> Components: dataloss, Scanners
> Affects Versions: 1.2.0, 1.1.3
> Reporter: Phil Yang
> Assignee: Phil Yang
> Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.4.0
>
> Attachments: HBASE-15378-v1.txt, HBASE-15378-v2.txt, HBASE-15378-v3.txt, HBASE-15378-v4.patch, HBASE-15378-v5.patch, HBASE-15378-v6.patch
>
> When a RS scanner gets a TIME_LIMIT_REACHED_MID_ROW state, it will stop scanning, send back what it has read to the client, and mark the message as a heartbeat message. If no cell has been read, the response will be empty.
> However, ClientScanner only handles the situation where the client gets an empty heartbeat and its cache is not empty. If the cache is empty too, the empty response will be regarded as end-of-region and a new scanner will be opened for the next region.
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188547#comment-15188547 ] Phil Yang commented on HBASE-15378: --- Thank you all. [~tedyu] Why didn't we push to branch-1.1?
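The bug described for HBASE-15378 is a missing case in the client's end-of-region decision: an empty response plus an empty cache was always taken as "region exhausted", even when the server had flagged the response as a heartbeat. The sketch below is only an illustration of that decision (an invented class and method, not the actual ClientScanner code):

```java
// Hedged sketch of the end-of-region decision described in HBASE-15378;
// not the actual org.apache.hadoop.hbase.client.ClientScanner logic.
public class HeartbeatSketch {
    /**
     * Decide whether the current region is exhausted. The buggy behavior per
     * the report was equivalent to ignoring the heartbeat flag: an empty
     * response with an empty cache always looked like end-of-region. The fix
     * is that an empty heartbeat just means "keep scanning this region".
     */
    public static boolean regionExhausted(boolean responseEmpty,
                                          boolean cacheEmpty,
                                          boolean heartbeat) {
        return responseEmpty && cacheEmpty && !heartbeat;
    }

    public static void main(String[] args) {
        // Empty heartbeat, empty cache: must NOT be treated as end-of-region.
        System.out.println(regionExhausted(true, true, true));
        // Empty non-heartbeat response, empty cache: region really is done.
        System.out.println(regionExhausted(true, true, false));
    }
}
```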
[jira] [Updated] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JunHo Cho updated HBASE-15430: -- Attachment: hbase-15430-v1.patch Added a test case.
[jira] [Commented] (HBASE-15430) Failed taking snapshot - Manifest proto-message too large
[ https://issues.apache.org/jira/browse/HBASE-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188544#comment-15188544 ] JunHo Cho commented on HBASE-15430: --- I added a test case.
[jira] [Commented] (HBASE-15377) Per-RS Get metric is time based, per-region metric is size-based
[ https://issues.apache.org/jira/browse/HBASE-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188533#comment-15188533 ] Enis Soztutar commented on HBASE-15377: --- Thanks for the updated patch. Looks good to go. There is something fishy with the unit tests lately. Almost all the precommit runs and normal runs have a very large set of timed-out tests. > Per-RS Get metric is time based, per-region metric is size-based > > > Key: HBASE-15377 > URL: https://issues.apache.org/jira/browse/HBASE-15377 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Heng Chen > Fix For: 2.0.0, 1.3.0, 1.4.0 > > Attachments: HBASE-15377.patch, HBASE-15377_v1.patch, > HBASE-15377_v2.patch > > > We have metrics for Get operations at the region server level and region > level. > {code} >"Get_num_ops" : 4837505, > "Get_min" : 0, > "Get_max" : 296, > "Get_mean" : 0.2934618155433431, > "Get_median" : 0.0, > "Get_75th_percentile" : 0.0, > "Get_95th_percentile" : 1.0, > "Get_99th_percentile" : 1.0, > {code} > and > {code} >"Namespace_hbase_table_meta_region_1588230740_metric_get_num_ops" : 103, > "Namespace_hbase_table_meta_region_1588230740_metric_get_min" : 450, > "Namespace_hbase_table_meta_region_1588230740_metric_get_max" : 470, > "Namespace_hbase_table_meta_region_1588230740_metric_get_mean" : > 450.19417475728153, > "Namespace_hbase_table_meta_region_1588230740_metric_get_median" : 460.0, > "Namespace_hbase_table_meta_region_1588230740_metric_get_75th_percentile" > : 470.0, > "Namespace_hbase_table_meta_region_1588230740_metric_get_95th_percentile" > : 470.0, > "Namespace_hbase_table_meta_region_1588230740_metric_get_99th_percentile" > : 470.0, > {code} > The problem is that the report values for the region server shows the > latency, versus the reported values for the region shows the response sizes. > There is no way of telling this without reading the source code. 
> I think we should deprecate response size histograms in favor of latency > histograms. > See also HBASE-15376. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
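The confusion described above, a per-RS histogram reporting milliseconds while the per-region histogram reports bytes under near-identical names, is exactly what unit-qualified metric names prevent. A small sketch of that idea (this is not HBase's MetricsRegionServer API; the registry and names are illustrative):

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative metric registry that forces a unit suffix onto every
// histogram name, so a latency "Get" metric and a response-size "get"
// metric can never be confused by name alone.
public class UnitSuffixedMetrics {
    public enum Unit {
        MILLIS("_millis"), BYTES("_bytes");
        final String suffix;
        Unit(String s) { suffix = s; }
    }

    private final Map<String, Long> numOps = new TreeMap<>();

    // Record one observation under the unit-qualified name; the actual
    // histogram bookkeeping (min/max/percentiles) is elided here.
    public void record(String name, Unit unit, long value) {
        numOps.merge(name + unit.suffix, 1L, Long::sum);
    }

    public long numOps(String name, Unit unit) {
        return numOps.getOrDefault(name + unit.suffix, 0L);
    }
}
```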
[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188530#comment-15188530 ] Daniel Pol commented on HBASE-15392: Take my comments with a grain of salt since I'm not familiar with this code, or coding in general :). I am familiar with systems performance. Re: Seek vs Skip discussion: is that true in general or only for rotational HDDs? Is that true for BucketCache usage also? Re: How do you know you're done with the row?: Where do indexes come into play? I keep seeing skips, seeks, and reads of extra blocks. I don't see any comments related to looking at indexes to know where to seek and how much to read. > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, no_optimize.patch, no_optimize.patch, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. 
> Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > 
prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > java.lang.Throwable > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198) > at > org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54) > at > org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321) >
[jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring
[ https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188526#comment-15188526 ] Enis Soztutar commented on HBASE-15160: --- [~carp84] did you get a chance to update the patch? I can take this on if you want. > Put back HFile's HDFS op latency sampling code and add metrics for monitoring > - > > Key: HBASE-15160 > URL: https://issues.apache.org/jira/browse/HBASE-15160 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0, 1.1.2 >Reporter: Yu Li >Assignee: Yu Li > Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, > HBASE-15160_v3.patch > > > In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, > fsPreadLatency and fsWriteLatency, have been removed. There was some > discussion about putting them back in a new JIRA but never happened. > According to our experience, these metrics are useful to judge whether issue > lies on HDFS when slow request occurs, so we propose to put them back in this > JIRA, and add the metrics for monitoring as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
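The proposal above is to restore per-operation HDFS latency sampling (fsReadLatency, fsPreadLatency, fsWriteLatency) removed in HBASE-11586. The sampling itself amounts to timing each filesystem call and accumulating the result; a self-contained sketch (FsLatencySampler and its names are illustrative, not the patch's API):

```java
import java.util.concurrent.atomic.LongAdder;

// Sketch of per-op HDFS latency sampling: wrap each read in a timer and
// accumulate op count and total nanoseconds for a latency metric.
public class FsLatencySampler {
    private final LongAdder readOps = new LongAdder();
    private final LongAdder readNanos = new LongAdder();

    public interface Read {
        int run();
    }

    // The sampling overhead is two System.nanoTime() calls per operation,
    // which is why putting these metrics back is cheap.
    public int timedRead(Read read) {
        long start = System.nanoTime();
        try {
            return read.run();
        } finally {
            readNanos.add(System.nanoTime() - start);
            readOps.increment();
        }
    }

    public long ops() { return readOps.sum(); }

    public long avgNanos() {
        long n = readOps.sum();
        return n == 0 ? 0 : readNanos.sum() / n;
    }
}
```

With such counters in place, a slow-request investigation can check whether the time went to HDFS or to HBase itself, which is the monitoring gap the issue describes.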
[jira] [Updated] (HBASE-15411) Rewrite backup with Procedure V2
[ https://issues.apache.org/jira/browse/HBASE-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-15411: --- Attachment: 15411-v5.txt Patch v5 lets FullTableBackupProcedure implement TableProcedureInterface. Still encounters the following when running TestFullBackup#testFullBackupSingle : {code} 2016-03-09 18:03:19,414 ERROR [ProcedureExecutor-0] server.NIOServerCnxnFactory$1(44): Thread Thread[ProcedureExecutor-0,5,main] died java.lang.UnsupportedOperationException: RQs for non-table/non-server procedures are not implemented yet at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.doAdd(MasterProcedureScheduler.java:117) at org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler.addFront(MasterProcedureScheduler.java:92) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1175) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:855) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:808) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:494) {code} > Rewrite backup with Procedure V2 > > > Key: HBASE-15411 > URL: https://issues.apache.org/jira/browse/HBASE-15411 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 15411-v1.txt, 15411-v3.txt, 15411-v5.txt, > FullTableBackupProcedure.java > > > Currently full / incremental backup is driven by BackupHandler (see call() > method for flow). > This issue is to rewrite the flow using Procedure V2. > States (enum) for full / incremental backup would be introduced in > Backup.proto which correspond to the steps performed in BackupHandler#call(). > executeFromState() would pace the backup based on the current state. 
> serializeStateData() / deserializeStateData() would be used to persist state > into procedure WAL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
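The UnsupportedOperationException above comes from MasterProcedureScheduler having run queues only for table and server procedures, and patch v5's fix is to make FullTableBackupProcedure implement TableProcedureInterface. The shape of that fix can be sketched with local stand-in types (the real HBase interfaces differ; the queue naming below is hypothetical):

```java
// Stand-in for HBase's TableProcedureInterface; illustrative only.
interface TableProcedureInterface {
    String getTableName();
}

public class ProcedureSchedulerSketch {
    // Mirrors MasterProcedureScheduler.doAdd(): a procedure that is neither
    // a table procedure nor a server procedure has no run queue yet.
    public static String queueFor(Object proc) {
        if (proc instanceof TableProcedureInterface) {
            return "table:" + ((TableProcedureInterface) proc).getTableName();
        }
        throw new UnsupportedOperationException(
            "RQs for non-table/non-server procedures are not implemented yet");
    }

    // The v5 fix in spirit: the backup procedure advertises a table so the
    // scheduler can route it to a table run queue. The table name here is
    // a made-up placeholder.
    static class FullTableBackupProcedure implements TableProcedureInterface {
        public String getTableName() { return "backup:system"; }
    }
}
```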
[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188511#comment-15188511 ] Enis Soztutar commented on HBASE-15425: --- Flush and bulk load markers have been added for region replicas so that they can replay these events. Normally, the regular log split / replay ignores these markers. For region replicas, missing a flush file or bulk load files is not a critical condition (since eventually they will be picked up due to compactions), so we were following the safe route there. Now, returning failure will cause the bulk load RPC to be retried, and the regionserver would have already bulk loaded those files, so they will be bulk loaded again. One cluster will see 2 sets of bulk load files, the other cluster which gets replication will see only one set. There is no atomic transaction to make sure that the bulk load and WAL event happens atomically, so it is a best effort in that case. Semantically it should still be correct though. Patch looks fine to me. > Failing to write bulk load event marker in the WAL is ignored > - > > Key: HBASE-15425 > URL: https://issues.apache.org/jira/browse/HBASE-15425 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Ashish Singhi >Assignee: Ashish Singhi > Attachments: HBASE-15425.patch, HBASE-15425.v1.patch > > > During LoadIncrementalHFiles process if we fail to write the bulk load event > marker in the WAL, it is ignored. So this will lead to data mismatch issue in > source and peer cluster in case of bulk loaded data replication scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
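The behavioral change under review, stop swallowing the WAL append failure so the bulk load RPC fails and the client retries, can be sketched with a stand-in WAL type (Wal and the method names below are illustrative, not the HRegion API touched by the patch):

```java
import java.io.IOException;

// Sketch of the HBASE-15425 change: propagate the bulk load event marker
// write failure instead of ignoring it.
public class BulkLoadMarkerSketch {
    interface Wal {
        void appendMarker(String marker) throws IOException;
    }

    // Before: the failure is caught and logged, so replication peers never
    // learn about the newly bulk loaded HFiles and the clusters diverge.
    public static boolean bulkLoadIgnoringMarkerFailure(Wal wal) {
        try {
            wal.appendMarker("BULK_LOAD");
        } catch (IOException e) {
            // swallowed: source and peer cluster can now mismatch
        }
        return true;
    }

    // After: the failure surfaces, the RPC fails, and the client retries.
    // As the comment above notes, a retry may re-load files the server
    // already took, since the load and the WAL event are not atomic.
    public static boolean bulkLoadPropagatingMarkerFailure(Wal wal) throws IOException {
        wal.appendMarker("BULK_LOAD");
        return true;
    }
}
```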
[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does count CellScanner payload
[ https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188500#comment-15188500 ] deepankar commented on HBASE-15437: --- In that case should the values of queueTime, processingTime be stored inside the call ? or should the calculation of those be also moved to setResponse() ? > Response size calculated in RPCServer for warning tooLarge responses does > count CellScanner payload > --- > > Key: HBASE-15437 > URL: https://issues.apache.org/jira/browse/HBASE-15437 > Project: HBase > Issue Type: Bug > Components: IPC/RPC >Reporter: deepankar > > After HBASE-13158 where we respond back to RPCs with cells in the payload , > the protobuf response will just have the count the cells to read from > payload, but there are set of features where we log warn in RPCServer > whenever the response is tooLarge, but this size now is not considering the > sizes of the cells in the PayloadCellScanner. Code form RPCServer > {code} > long responseSize = result.getSerializedSize(); > // log any RPC responses that are slower than the configured warn > // response time or larger than configured warning size > boolean tooSlow = (processingTime > warnResponseTime && > warnResponseTime > -1); > boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > > -1); > if (tooSlow || tooLarge) { > // when tagging, we let TooLarge trump TooSmall to keep output simple > // note that large responses will often also be slow. > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + > ")", > (tooLarge ? "TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > } > {code} > Should this feature be not supported any more or should we add a method to > CellScanner or a new interface which returns the serialized size (but this > might not include the compression codecs which might be used during response > ?) Any other Idea this could be fixed ? 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does count CellScanner payload
[ https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188489#comment-15188489 ] Enis Soztutar commented on HBASE-15437: --- We can move the metric + log warning to Call.setResponse() and get the correct size there. > Response size calculated in RPCServer for warning tooLarge responses does > count CellScanner payload > --- > > Key: HBASE-15437 > URL: https://issues.apache.org/jira/browse/HBASE-15437 > Project: HBase > Issue Type: Bug > Components: IPC/RPC >Reporter: deepankar > > After HBASE-13158 where we respond back to RPCs with cells in the payload , > the protobuf response will just have the count the cells to read from > payload, but there are set of features where we log warn in RPCServer > whenever the response is tooLarge, but this size now is not considering the > sizes of the cells in the PayloadCellScanner. Code form RPCServer > {code} > long responseSize = result.getSerializedSize(); > // log any RPC responses that are slower than the configured warn > // response time or larger than configured warning size > boolean tooSlow = (processingTime > warnResponseTime && > warnResponseTime > -1); > boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > > -1); > if (tooSlow || tooLarge) { > // when tagging, we let TooLarge trump TooSmall to keep output simple > // note that large responses will often also be slow. > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + > ")", > (tooLarge ? "TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > } > {code} > Should this feature be not supported any more or should we add a method to > CellScanner or a new interface which returns the serialized size (but this > might not include the compression codecs which might be used during response > ?) Any other Idea this could be fixed ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does count CellScanner payload
[ https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188481#comment-15188481 ] Enis Soztutar commented on HBASE-15437: --- Not just the log warn, we are also not computing the response size metric correctly: {code} metrics.sentResponse(responseSize); {code} The correct size of the response can be obtained after call.setResponse() is called, because the cells would already been encoded in the buffers in the call.response buffer chain. > Response size calculated in RPCServer for warning tooLarge responses does > count CellScanner payload > --- > > Key: HBASE-15437 > URL: https://issues.apache.org/jira/browse/HBASE-15437 > Project: HBase > Issue Type: Bug > Components: IPC/RPC >Reporter: deepankar > > After HBASE-13158 where we respond back to RPCs with cells in the payload , > the protobuf response will just have the count the cells to read from > payload, but there are set of features where we log warn in RPCServer > whenever the response is tooLarge, but this size now is not considering the > sizes of the cells in the PayloadCellScanner. Code form RPCServer > {code} > long responseSize = result.getSerializedSize(); > // log any RPC responses that are slower than the configured warn > // response time or larger than configured warning size > boolean tooSlow = (processingTime > warnResponseTime && > warnResponseTime > -1); > boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > > -1); > if (tooSlow || tooLarge) { > // when tagging, we let TooLarge trump TooSmall to keep output simple > // note that large responses will often also be slow. > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + > ")", > (tooLarge ? 
"TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > } > {code} > Should this feature be not supported any more or should we add a method to > CellScanner or a new interface which returns the serialized size (but this > might not include the compression codecs which might be used during response > ?) Any other Idea this could be fixed ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15438) error: CF specified in importtsv.columns does not match with table CF
[ https://issues.apache.org/jira/browse/HBASE-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bob zhao updated HBASE-15438: - Summary: error: CF specified in importtsv.columns does not match with table CF (was: error: Column Families specified in importtsv.columns does not match with any of the table column families ) > error: CF specified in importtsv.columns does not match with table CF > - > > Key: HBASE-15438 > URL: https://issues.apache.org/jira/browse/HBASE-15438 > Project: HBase > Issue Type: Bug > Components: hbase >Affects Versions: 1.1.2 > Environment: HDP 2.3 Ubuntu 14 >Reporter: bob zhao >Priority: Minor > Labels: easyfix, easytest > > Try to play with the hbase tsv import, get such error: > ERROR: Column Families [ Current, Closing] specified in importtsv.columns > does not match with any of the table StocksB column families [Closing, > Current]. > the script is: > hbase org.apache.hadoop.hbase.mapreduce.ImportTsv > -Dimporttsv.columns="HBASE_ROW_KEY, Current:Price, Closing:Price" > -Dimporttsv.bulk.output="/user/bob/storeDataFileOutput/" StocksB > /user/bob/stocks.txt > If i remove the space which behind the comma and before CF name, everything > is fine. > As a Dev, I like to add some space over there for easy reading and checking. > Please trim these CF names before processing, thanks! > https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
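The trim the reporter asks for is a one-line change when the column spec is split: strip whitespace around each entry before the family comparison. A self-contained sketch (parseColumns and familyOf are illustrative helpers, not ImportTsv's actual methods):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of tolerant importtsv.columns parsing: split on commas and trim
// surrounding whitespace so " Current:Price" matches the table's CF names.
public class ColumnSpecParser {
    public static List<String> parseColumns(String spec) {
        List<String> out = new ArrayList<>();
        for (String col : spec.split(",")) {
            out.add(col.trim()); // " Current:Price" -> "Current:Price"
        }
        return out;
    }

    // The column family is everything before the first ':'; special columns
    // like HBASE_ROW_KEY have no family part.
    public static String familyOf(String column) {
        int idx = column.indexOf(':');
        return idx < 0 ? column : column.substring(0, idx);
    }
}
```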
[jira] [Created] (HBASE-15438) error: Column Families specified in importtsv.columns does not match with any of the table column families
bob zhao created HBASE-15438: Summary: error: Column Families specified in importtsv.columns does not match with any of the table column families Key: HBASE-15438 URL: https://issues.apache.org/jira/browse/HBASE-15438 Project: HBase Issue Type: Bug Components: hbase Affects Versions: 1.1.2 Environment: HDP 2.3 Ubuntu 14 Reporter: bob zhao Priority: Minor Try to play with the hbase tsv import, get such error: ERROR: Column Families [ Current, Closing] specified in importtsv.columns does not match with any of the table StocksB column families [Closing, Current]. the script is: hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns="HBASE_ROW_KEY, Current:Price, Closing:Price" -Dimporttsv.bulk.output="/user/bob/storeDataFileOutput/" StocksB /user/bob/stocks.txt If i remove the space which behind the comma and before CF name, everything is fine. As a Dev, I like to add some space over there for easy reading and checking. Please trim these CF names before processing, thanks! https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188439#comment-15188439 ] Ted Yu commented on HBASE-15191: {code} 95 if(cacheRow > 0) { 96 scan.setCaching(cacheRow); 97 } 98 else { {code} nit: insert space between if and ( Move else to the line above it, following right curly. I can do the above if QA result comes back good. > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > Attachments: HBASE_15191.patch > > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
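Ted's two style nits applied together look like this (Scan and the default caching value are simplified stand-ins for the patch's actual code; only the spacing and brace placement are the point):

```java
// The review nit from the comment above, applied: space between "if" and "(",
// and "else" cuddled onto the closing brace's line.
public class CachingStyleSketch {
    static class Scan {
        int caching = -1;
        void setCaching(int n) { caching = n; }
    }

    public static int applyCaching(int cacheRow) {
        Scan scan = new Scan();
        if (cacheRow > 0) {
            scan.setCaching(cacheRow);
        } else {
            scan.setCaching(100); // illustrative default, not the patch's value
        }
        return scan.caching;
    }
}
```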
[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188427#comment-15188427 ] Lars Hofhansl commented on HBASE-15392: --- Lemme try to answer. There seems to be a bunch of confusion about the optimize method. bq. Lars Hofhansl Is this true? If a row of 1M columns and we are looking for the second column only, SKIP'ing through the rest is what we want? Nope. That's the point. If the SEEK gets us to another block it does the SEEK. Otherwise it saves time by doing SKIP, SKIP, ... That _is_ cheaper. If it does that, even though the remaining columns are on different block, that's a bug. It should only SKIP as long as the point we SEEK is to is on the same block. SEEK is very expensive. Phoenix has an "optimization" where they always load all columns, so that it won't use the ExplicitColumnTracker, because seeking between columns or rows was too slow. With wide rows they are f'ed. With optimize they can remove that, small rows are performant, and for large rows we still SEEK. The Key is: Don't count the number of Cells the ColumnTracker is seeing. Count the work that HBase doing when we're not SKIP'ing, but doing a SEEK instead. bq. Optimize is squashing the SEEK making it an INCLUDE or a SKIP instead when the index key is >= current cell. I do not think so. It's only because of the bug that moreRowsMayExistAfter is not called unless we SEEK. We're fixing this for Gets. bq. Because it suppresses the SEEK, turning it into a SKIP or INCLUDE, then we are skirting the check of there being any more rows in this scan so we keep going... till we hit the next row... (loading blocks if we have to) and only then, do we realize we are actually done. That is only true for the last row in a block. bq. -looksee is good. Doesn't do anything for the case when stopRows are sprecified (here again we can over read) and it could make broader use of the fact that we know there are no more rows if the Scan is a Get Scan. 
A scan presumably is big. We're doing an extra check for every single row to avoid loading a single block in the end. Sure, for small scans the block load might be worse, for big scans it is not. I'll point out the 3x improvement again in our M/R jobs that do a lot of scanning. bq. Scenario #6 and #7 above are interesting. For a Get, we should not be doing #7. I do not think they are. In #6, how do you know you're done with the row? You need to read the next Cell of the next row. If that's on another block that block needs to be loaded first. #7 similarly, you need to check the next Cell to see whether you are done. If that is on the next block it needs to be loaded. All of this should only happen when we talk about the last row in a block, no? If the next row is in the same block we _want_ to SKIP (not SEEK). > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, no_optimize.patch, no_optimize.patch, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. 
> Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, >
[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks
[ https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188405#comment-15188405 ] Lars Hofhansl commented on HBASE-15392: --- bq. Optimize seems broke to me. We keep SKIPping even though we've found all our rows. No, no... That's the _whole_ point of optimize :) SEEKing is so expensive that we translate it to a series of SKIPs instead. We can do 100's of SKIPs in the time of a single SEEK - unless we expect the SEEK to get us onto the next block with high likelihood, in which case we should SEEK, which can definitely happen with many versions (so we want to mostly SKIP, but not have terrible performance when we store 1000 versions). A bunch of SKIP, SKIP, SKIP, SKIP, ..., SKIP is much cheaper than a single SEEK_TO_NEXT_ROW. In fact it can be 3x faster. > Single Cell Get reads two HFileBlocks > - > > Key: HBASE-15392 > URL: https://issues.apache.org/jira/browse/HBASE-15392 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack > Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, > 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, > HBASE-15392_suggest.patch, no_optimize.patch, no_optimize.patch, two_seeks.txt > > > As found by Daniel "SystemTap" Pol, a simple Get results in our reading two > HFileBlocks, the one that contains the wanted Cell, and the block that > follows. 
> Here is a bit of custom logging that logs a stack trace on each HFileBlock > read so you can see the call stack responsible: > {code} > 2016-03-03 22:20:30,191 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > START LOOP > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: > QCODE SEEK_NEXT_COL > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: > STARTED WHILE > 2016-03-03 22:20:30,192 INFO > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: > OUT OF L2 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read > offset=31409152, len=2103 > 2016-03-03 22:20:30,192 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: > offset=31409152, length=2103 > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > 2016-03-03 22:20:30,193 TRACE > [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: > Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, > onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, > 
prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, > getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, > buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], > dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, > fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, > bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, > includesTags=false, compressAlgo=NONE, compressTags=false, > cryptoContext=[cipher=NONE keyHash=NONE]]] > java.lang.Throwable > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515) > at > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198) > at >
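The rule Lars argues for in this thread, demote an expensive SEEK to a cheap SKIP whenever the seek target still falls inside the block already in hand, can be sketched with a simplified block model (the offsets and types below are stand-ins for HFile internals, not StoreScanner.optimize's real signature):

```java
// Sketch of the SKIP-vs-SEEK heuristic: only pay for a SEEK when the target
// lies beyond the current HFileBlock; otherwise a run of SKIPs is cheaper.
public class SeekVsSkipSketch {
    enum MatchCode { SKIP, SEEK_NEXT_ROW }

    static class Block {
        final long startOffset, endOffset; // [start, end) of the current block
        Block(long start, long end) { startOffset = start; endOffset = end; }
        boolean contains(long offset) { return offset >= startOffset && offset < endOffset; }
    }

    // Hundreds of SKIPs cost less than one SEEK through the block index, so
    // an in-block target is skipped to; an out-of-block target is seeked to,
    // which also lets moreRowsMayExistAfter-style end checks run.
    public static MatchCode optimize(Block current, long seekTargetOffset) {
        return current.contains(seekTargetOffset)
            ? MatchCode.SKIP
            : MatchCode.SEEK_NEXT_ROW;
    }
}
```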
[jira] [Commented] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188393#comment-15188393 ] Parth Shah commented on HBASE-15191: - Changed the curly braces. - As per my above comment, CopyTable also does the same. https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java#L89 > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > Attachments: HBASE_15191.patch > > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Shah updated HBASE-15191: --- Attachment: HBASE_15191.patch > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > Attachments: HBASE_15191.patch > > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Shah updated HBASE-15191: --- Attachment: (was: HBASE_15191.patch) > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188386#comment-15188386 ] Parth Shah commented on HBASE-15191: 1. Added the curly braces. Checked the documentation online again for the default value of the hbase.client.scanner.caching parameter, which is 100, and set the value to 100 accordingly. The idea is to fall back to the default value in case nothing is supplied. 2. By default the value is true; in the case of VerifyReplication the idea is to verify the blocks, so there should be no need to cache them. Similarly, in CopyTable this value is also set to false, as the functionality is used for replication, which should not require caching the blocks. > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > Attachments: HBASE_15191.patch > > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188383#comment-15188383 ] Ted Yu commented on HBASE-15191: {code} 95 if(cacheRow > 0) 96 { {code} Please put the right curly at the end of the previous line (same with the else). The scan.setCacheBlocks(false) call is still there. Do you need it? Why? > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > Attachments: HBASE_15191.patch > > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
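[Editor's note] Read alongside the review exchange above, here is a minimal plain-Java sketch of the caching fallback being discussed. Class and method names are hypothetical; the actual patch applies this through HBase's Scan API (Scan.setCaching), and this models only the logic, with the right curly at the end of the line as the reviewer requests.

```java
// Hypothetical model of the caching fallback discussed in the review thread.
public class ScannerCachingDefaults {

    // Default of hbase.client.scanner.caching cited in the thread (100).
    static final int DEFAULT_SCANNER_CACHING = 100;

    // Use the supplied caching value when positive, else fall back to the default.
    public static int effectiveCaching(int requested) {
        if (requested > 0) {
            return requested;
        } else {
            return DEFAULT_SCANNER_CACHING;
        }
    }
}
```

The brace placement above follows the style asked for in the review: opening brace on the same line, right curly at the end of the line before the else.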
[jira] [Comment Edited] (HBASE-15406) Split / merge switch left disabled after early termination of hbck
[ https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188378#comment-15188378 ] Heng Chen edited comment on HBASE-15406 at 3/10/16 12:32 AM: - {quote} The exception message doesn't match the condition. Did you mean to say 'has to be acquired" ? {quote} Sorry, that is confusing. I should rename "lease" to "lock". My original thought was: before changing the switch, if we find there is a lock on it, the action will be refused. And the condition should be ">0". {quote} SplitOrMergeLeaseTracker does the rollback. What if master crashes after getting zookeeper notification but before restoring split / merge states ? {quote} That is a problem; we can do cleanup when the master starts up. was (Author: chenheng): {quote} The exception message doesn't match the condition. Did you mean to say 'has to be acquired" ? {quote} Sorry, that is confusing. I should rename "lease" to "lock". My original thought was: before changing the switch, if we find there is a lock on it, the action will be refused. {quote} SplitOrMergeLeaseTracker does the rollback. What if master crashes after getting zookeeper notification but before restoring split / merge states ? {quote} That is a problem; we can do cleanup when the master starts up. 
> Split / merge switch left disabled after early termination of hbck > -- > > Key: HBASE-15406 > URL: https://issues.apache.org/jira/browse/HBASE-15406 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.4.0 > > Attachments: HBASE-15406.v1.patch, wip.patch > > > This was what I did on cluster with 1.4.0-SNAPSHOT built Thursday: > Run 'hbase hbck -disableSplitAndMerge' on gateway node of the cluster > Terminate hbck early > Enter hbase shell where I observed: > {code} > hbase(main):001:0> splitormerge_enabled 'SPLIT' > false > 0 row(s) in 0.3280 seconds > hbase(main):002:0> splitormerge_enabled 'MERGE' > false > 0 row(s) in 0.0070 seconds > {code} > Expectation is that the split / merge switches should be restored to default > value after hbck exits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15406) Split / merge switch left disabled after early termination of hbck
[ https://issues.apache.org/jira/browse/HBASE-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188378#comment-15188378 ] Heng Chen commented on HBASE-15406: --- {quote} The exception message doesn't match the condition. Did you mean to say 'has to be acquired" ? {quote} Sorry, that is confusing. I should rename "lease" to "lock". My original thought was: before changing the switch, if we find there is a lock on it, the action will be refused. {quote} SplitOrMergeLeaseTracker does the rollback. What if master crashes after getting zookeeper notification but before restoring split / merge states ? {quote} That is a problem; we can do cleanup when the master starts up. > Split / merge switch left disabled after early termination of hbck > -- > > Key: HBASE-15406 > URL: https://issues.apache.org/jira/browse/HBASE-15406 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.4.0 > > Attachments: HBASE-15406.v1.patch, wip.patch > > > This was what I did on cluster with 1.4.0-SNAPSHOT built Thursday: > Run 'hbase hbck -disableSplitAndMerge' on gateway node of the cluster > Terminate hbck early > Enter hbase shell where I observed: > {code} > hbase(main):001:0> splitormerge_enabled 'SPLIT' > false > 0 row(s) in 0.3280 seconds > hbase(main):002:0> splitormerge_enabled 'MERGE' > false > 0 row(s) in 0.0070 seconds > {code} > Expectation is that the split / merge switches should be restored to default > value after hbck exits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
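[Editor's note] The guard Heng Chen describes — refuse a switch change whenever a lock is held on it, the ">0" condition — can be sketched in a few lines of plain Java. The naming is entirely hypothetical; the actual proposal tracks locks through zookeeper-backed tracker classes.

```java
// Hypothetical model of the guard described in the comment above: a
// split/merge switch change is refused whenever at least one lock is
// outstanding on the switch (the ">0" condition from the discussion).
public class SwitchLockGuard {
    public static boolean canChangeSwitch(int locksHeld) {
        // Refuse the action when any lock is held on the switch.
        return locksHeld <= 0;
    }
}
```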
[jira] [Updated] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Shah updated HBASE-15191: --- Attachment: HBASE_15191.patch > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > Attachments: HBASE_15191.patch > > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15191) CopyTable and VerifyReplication - Option to specify batch size, versions
[ https://issues.apache.org/jira/browse/HBASE-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Shah updated HBASE-15191: --- Attachment: (was: HBASE_15191.patch) > CopyTable and VerifyReplication - Option to specify batch size, versions > > > Key: HBASE-15191 > URL: https://issues.apache.org/jira/browse/HBASE-15191 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.98.16.1 >Reporter: Parth Shah >Priority: Minor > > Need option to specify batch size for CopyTable and VerifyReplication. We > are working on patch for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188366#comment-15188366 ] Hudson commented on HBASE-15378: FAILURE: Integrated in HBase-1.4 #10 (See [https://builds.apache.org/job/HBase-1.4/10/]) HBASE-15378 Scanner cannot handle heartbeat message with no results (tedyu: rev e103f75ae760680f66f45fe6ae9b9cd5173a3466) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java > Scanner cannot handle heartbeat message with no results > --- > > Key: HBASE-15378 > URL: https://issues.apache.org/jira/browse/HBASE-15378 > Project: HBase > Issue Type: Bug > Components: dataloss, Scanners >Affects Versions: 1.2.0, 1.1.3 >Reporter: Phil Yang >Assignee: Phil Yang >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.4.0 > > Attachments: HBASE-15378-v1.txt, HBASE-15378-v2.txt, > HBASE-15378-v3.txt, HBASE-15378-v4.patch, HBASE-15378-v5.patch, > HBASE-15378-v6.patch > > > When a RS scanner gets a TIME_LIMIT_REACHED_MID_ROW state, it will stop > scanning, send back what it has read to the client, and mark the message as a > heartbeat message. If no cell has been read, it will be an empty > response. > However, ClientScanner only handles the situation where the client gets an > empty heartbeat and its cache is not empty. If the cache is empty too, the > response will be regarded as end-of-region and a new scanner will be opened for the next region. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188355#comment-15188355 ] Hudson commented on HBASE-15378: FAILURE: Integrated in HBase-Trunk_matrix #767 (See [https://builds.apache.org/job/HBase-Trunk_matrix/767/]) HBASE-15378 Scanner cannot handle heartbeat message with no results (tedyu: rev ad9b91a9042607c4528ac79b2aed1254d99f6db4) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java > Scanner cannot handle heartbeat message with no results > --- > > Key: HBASE-15378 > URL: https://issues.apache.org/jira/browse/HBASE-15378 > Project: HBase > Issue Type: Bug > Components: dataloss, Scanners >Affects Versions: 1.2.0, 1.1.3 >Reporter: Phil Yang >Assignee: Phil Yang >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.4.0 > > Attachments: HBASE-15378-v1.txt, HBASE-15378-v2.txt, > HBASE-15378-v3.txt, HBASE-15378-v4.patch, HBASE-15378-v5.patch, > HBASE-15378-v6.patch > > > When a RS scanner gets a TIME_LIMIT_REACHED_MID_ROW state, it will stop > scanning, send back what it has read to the client, and mark the message as a > heartbeat message. If no cell has been read, it will be an empty > response. > However, ClientScanner only handles the situation where the client gets an > empty heartbeat and its cache is not empty. If the cache is empty too, the > response will be regarded as end-of-region and a new scanner will be opened for the next region. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
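[Editor's note] The client-side decision described in the HBASE-15378 report can be modeled in a few lines of plain Java. The names are hypothetical; the actual fix lives in ClientScanner, and this only captures the decision rule: an empty heartbeat must never be read as end-of-region, regardless of whether the client cache is empty.

```java
// Hypothetical model of the ClientScanner decision described in the issue.
public class HeartbeatDecision {
    public enum Next { STAY_ON_REGION, OPEN_NEXT_REGION }

    public static Next onEmptyResults(boolean heartbeat, boolean cacheEmpty) {
        if (heartbeat) {
            // Before the fix, an empty heartbeat with an empty cache fell
            // through to the end-of-region branch. A heartbeat carries no
            // end-of-region meaning, so stay on the current region.
            return Next.STAY_ON_REGION;
        }
        // A genuinely empty, non-heartbeat response marks the region's end.
        return Next.OPEN_NEXT_REGION;
    }
}
```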
[jira] [Commented] (HBASE-14801) Enhance the Spark-HBase connector catalog with json format
[ https://issues.apache.org/jira/browse/HBASE-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188356#comment-15188356 ] Hudson commented on HBASE-14801: FAILURE: Integrated in HBase-Trunk_matrix #767 (See [https://builds.apache.org/job/HBase-Trunk_matrix/767/]) HBASE-14801 Enhance the Spark-HBase connector catalog with json format (jmhsieh: rev 97cce850fed130aa263d61f6a3c4f361f2629c7c) * hbase-spark/src/main/scala/org/apache/spark/sql/datasources/hbase/DataTypeParserWrapper.scala * hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/datasources/SerDes.scala * hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/DefaultSourceSuite.scala * hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/datasources/HBaseTableScanRDD.scala * hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/HBaseCatalogSuite.scala * hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/DefaultSource.scala * hbase-spark/src/main/scala/org/apache/spark/sql/datasources/hbase/HBaseTableCatalog.scala * hbase-spark/src/main/java/org/apache/hadoop/hbase/spark/SparkSQLPushDownFilter.java * hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/datasources/HBaseSparkConf.scala > Enhance the Spark-HBase connector catalog with json format > -- > > Key: HBASE-14801 > URL: https://issues.apache.org/jira/browse/HBASE-14801 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-14801-1.patch, HBASE-14801-2.patch, > HBASE-14801-3.patch, HBASE-14801-4.patch, HBASE-14801-5.patch, > HBASE-14801-6.patch, HBASE-14801-7.patch, HBASE-14801-8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15336) Support Dataframe writer to the connector
[ https://issues.apache.org/jira/browse/HBASE-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188342#comment-15188342 ] Ted Yu commented on HBASE-15336: If there are no more review comments, I plan to commit tomorrow. > Support Dataframe writer to the connector > - > > Key: HBASE-15336 > URL: https://issues.apache.org/jira/browse/HBASE-15336 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15336-1.patch, HBASE-15336-2.patch, > HBASE-15336-3.patch > > > Currently, the connector only supports the read path. A complete solution should > support both read and write. This subtask adds write support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15336) Support Dataframe writer to the connector
[ https://issues.apache.org/jira/browse/HBASE-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188340#comment-15188340 ] Hadoop QA commented on HBASE-15336: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 28s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 33s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} scalac {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} scalac {color} | {color:green} 1m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 35s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s {color} | {color:green} hbase-spark in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 8s {color} | {color:green} hbase-spark in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 9s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 41m 37s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.9.1 Server=1.9.1 Image:yetus/hbase:date2016-03-09 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12792374/HBASE-15336-3.patch | | JIRA Issue | HBASE-15336 | | Optional Tests | asflicense scalac scaladoc unit compile | | uname | Linux 245d4215338b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 97cce85 | | JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/915/testReport/ | | modules | C: hbase-spark U: hbase-spark | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/915/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message 
was automatically generated. > Support Dataframe writer to the connector > - > > Key: HBASE-15336 > URL: https://issues.apache.org/jira/browse/HBASE-15336 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15336-1.patch, HBASE-15336-2.patch, > HBASE-15336-3.patch > > > Currently, the connector only supports the read path. A complete solution should > support both read and write. This subtask adds write support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188297#comment-15188297 ] Hudson commented on HBASE-15378: SUCCESS: Integrated in HBase-1.3 #595 (See [https://builds.apache.org/job/HBase-1.3/595/]) HBASE-15378 Scanner cannot handle heartbeat message with no results (tedyu: rev d686fec14495ac5eff3d661cc6a842d1f401c8d2) * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java > Scanner cannot handle heartbeat message with no results > --- > > Key: HBASE-15378 > URL: https://issues.apache.org/jira/browse/HBASE-15378 > Project: HBase > Issue Type: Bug > Components: dataloss, Scanners >Affects Versions: 1.2.0, 1.1.3 >Reporter: Phil Yang >Assignee: Phil Yang >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.4.0 > > Attachments: HBASE-15378-v1.txt, HBASE-15378-v2.txt, > HBASE-15378-v3.txt, HBASE-15378-v4.patch, HBASE-15378-v5.patch, > HBASE-15378-v6.patch > > > When a RS scanner gets a TIME_LIMIT_REACHED_MID_ROW state, it will stop > scanning, send back what it has read to the client, and mark the message as a > heartbeat message. If no cell has been read, it will be an empty > response. > However, ClientScanner only handles the situation where the client gets an > empty heartbeat and its cache is not empty. If the cache is empty too, the > response will be regarded as end-of-region and a new scanner will be opened for the next region. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does not count CellScanner payload
[ https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188288#comment-15188288 ] deepankar commented on HBASE-15437: --- ping [~stack] > Response size calculated in RPCServer for warning tooLarge responses does > not count CellScanner payload > --- > > Key: HBASE-15437 > URL: https://issues.apache.org/jira/browse/HBASE-15437 > Project: HBase > Issue Type: Bug > Components: IPC/RPC >Reporter: deepankar > > After HBASE-13158, where we respond back to RPCs with cells in the payload, > the protobuf response will just have the count of the cells to read from the > payload, but there is a set of features where we log a warning in RPCServer > whenever the response is tooLarge; this size, however, does not consider the > sizes of the cells in the PayloadCellScanner. Code from RPCServer > {code} > long responseSize = result.getSerializedSize(); > // log any RPC responses that are slower than the configured warn > // response time or larger than configured warning size > boolean tooSlow = (processingTime > warnResponseTime && > warnResponseTime > -1); > boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > > -1); > if (tooSlow || tooLarge) { > // when tagging, we let TooLarge trump TooSmall to keep output simple > // note that large responses will often also be slow. > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + > ")", > (tooLarge ? "TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > } > {code} > Should this feature no longer be supported, or should we add a method to > CellScanner or a new interface which returns the serialized size (though this > might not include the compression codecs which might be used during the response > ?) Any other idea how this could be fixed ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does not count CellScanner payload
[ https://issues.apache.org/jira/browse/HBASE-15437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188285#comment-15188285 ] deepankar commented on HBASE-15437: --- ping [~saint@gmail.com] [~anoopsamjohn] [~ram_krish] > Response size calculated in RPCServer for warning tooLarge responses does > not count CellScanner payload > --- > > Key: HBASE-15437 > URL: https://issues.apache.org/jira/browse/HBASE-15437 > Project: HBase > Issue Type: Bug > Components: IPC/RPC >Reporter: deepankar > > After HBASE-13158, where we respond back to RPCs with cells in the payload, > the protobuf response will just have the count of the cells to read from the > payload, but there is a set of features where we log a warning in RPCServer > whenever the response is tooLarge; this size, however, does not consider the > sizes of the cells in the PayloadCellScanner. Code from RPCServer > {code} > long responseSize = result.getSerializedSize(); > // log any RPC responses that are slower than the configured warn > // response time or larger than configured warning size > boolean tooSlow = (processingTime > warnResponseTime && > warnResponseTime > -1); > boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > > -1); > if (tooSlow || tooLarge) { > // when tagging, we let TooLarge trump TooSmall to keep output simple > // note that large responses will often also be slow. > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + > ")", > (tooLarge ? "TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > } > {code} > Should this feature no longer be supported, or should we add a method to > CellScanner or a new interface which returns the serialized size (though this > might not include the compression codecs which might be used during the response > ?) Any other idea how this could be fixed ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188282#comment-15188282 ] Hudson commented on HBASE-15378: FAILURE: Integrated in HBase-1.2 #574 (See [https://builds.apache.org/job/HBase-1.2/574/]) HBASE-15378 Scanner cannot handle heartbeat message with no results (tedyu: rev 6497b365c7734ed9984de561ec37292a3656e878) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java * hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java > Scanner cannot handle heartbeat message with no results > --- > > Key: HBASE-15378 > URL: https://issues.apache.org/jira/browse/HBASE-15378 > Project: HBase > Issue Type: Bug > Components: dataloss, Scanners >Affects Versions: 1.2.0, 1.1.3 >Reporter: Phil Yang >Assignee: Phil Yang >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.2.1, 1.4.0 > > Attachments: HBASE-15378-v1.txt, HBASE-15378-v2.txt, > HBASE-15378-v3.txt, HBASE-15378-v4.patch, HBASE-15378-v5.patch, > HBASE-15378-v6.patch > > > When a RS scanner gets a TIME_LIMIT_REACHED_MID_ROW state, it will stop > scanning, send back what it has read to the client, and mark the message as a > heartbeat message. If no cell has been read, it will be an empty > response. > However, ClientScanner only handles the situation where the client gets an > empty heartbeat and its cache is not empty. If the cache is empty too, the > response will be regarded as end-of-region and a new scanner will be opened for the next region. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15437) Response size calculated in RPCServer for warning tooLarge responses does not count CellScanner payload
deepankar created HBASE-15437: - Summary: Response size calculated in RPCServer for warning tooLarge responses does not count CellScanner payload Key: HBASE-15437 URL: https://issues.apache.org/jira/browse/HBASE-15437 Project: HBase Issue Type: Bug Components: IPC/RPC Reporter: deepankar After HBASE-13158, where we respond back to RPCs with cells in the payload, the protobuf response will just have the count of the cells to read from the payload, but there is a set of features where we log a warning in RPCServer whenever the response is tooLarge; this size, however, does not consider the sizes of the cells in the PayloadCellScanner. Code from RPCServer {code} long responseSize = result.getSerializedSize(); // log any RPC responses that are slower than the configured warn // response time or larger than configured warning size boolean tooSlow = (processingTime > warnResponseTime && warnResponseTime > -1); boolean tooLarge = (responseSize > warnResponseSize && warnResponseSize > -1); if (tooSlow || tooLarge) { // when tagging, we let TooLarge trump TooSmall to keep output simple // note that large responses will often also be slow. logResponse(new Object[]{param}, md.getName(), md.getName() + "(" + param.getClass().getName() + ")", (tooLarge ? "TooLarge" : "TooSlow"), status.getClient(), startTime, processingTime, qTime, responseSize); } {code} Should this feature no longer be supported, or should we add a method to CellScanner or a new interface which returns the serialized size (though this might not include the compression codecs which might be used during the response)? Any other idea how this could be fixed? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
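[Editor's note] One direction the reporter floats — exposing a serialized size for the cell payload and folding it into the tooLarge check — could look roughly like the sketch below. The cellPayloadSize parameter is an assumption: the issue itself notes that CellScanner exposes no such accessor in this version, and the figure would not account for any compression codec applied on the wire.

```java
// Hedged sketch of the RPCServer size check with the cell payload included.
// cellPayloadSize is hypothetical; it would come from the new accessor the
// issue proposes, not from any existing CellScanner method.
public class ResponseSizeCheck {
    public static boolean isTooLarge(long pbSerializedSize,
                                     long cellPayloadSize,
                                     long warnResponseSize) {
        // Total on-wire response: protobuf header plus the cell payload.
        long responseSize = pbSerializedSize + cellPayloadSize;
        // Same shape as the existing check; warnResponseSize <= -1 disables it.
        return responseSize > warnResponseSize && warnResponseSize > -1;
    }
}
```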
[jira] [Updated] (HBASE-15336) Support Dataframe writer to the connector
[ https://issues.apache.org/jira/browse/HBASE-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated HBASE-15336: --- Attachment: HBASE-15336-3.patch > Support Dataframe writer to the connector > - > > Key: HBASE-15336 > URL: https://issues.apache.org/jira/browse/HBASE-15336 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15336-1.patch, HBASE-15336-2.patch, > HBASE-15336-3.patch > > > Currently, the connector only supports the read path. A complete solution should > support both read and write. This subtask adds write support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15436) BufferedMutatorImpl.flush() appears to get stuck
[ https://issues.apache.org/jira/browse/HBASE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188261#comment-15188261 ] Sangjin Lee commented on HBASE-15436: - See YARN-4736 for more details. > BufferedMutatorImpl.flush() appears to get stuck > > > Key: HBASE-15436 > URL: https://issues.apache.org/jira/browse/HBASE-15436 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 1.0.2 >Reporter: Sangjin Lee > Attachments: hbaseException.log, threaddump.log > > > We noticed an instance where the thread that was executing a flush > ({{BufferedMutatorImpl.flush()}}) got stuck when the (local one-node) cluster > shut down and was unable to get out of that stuck state. > The setup is a single node HBase cluster, and apparently the cluster went > away when the client was executing flush. The flush eventually logged a > failure after 30+ minutes of retrying. That is understandable. > What is unexpected is that the thread is stuck in this state (i.e. in the > {{flush()}} call). I would have expected the {{flush()}} call to return after > the complete failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15436) BufferedMutatorImpl.flush() appears to get stuck
[ https://issues.apache.org/jira/browse/HBASE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HBASE-15436: Attachment: threaddump.log hbaseException.log The hbaseException.log file shows the exception and the failure during {{flush()}}. The threaddump.log file shows the full thread stack trace dump after the shutdown mechanism was unable to shut down the thread that was stuck in the {{flush()}} call. > BufferedMutatorImpl.flush() appears to get stuck > > > Key: HBASE-15436 > URL: https://issues.apache.org/jira/browse/HBASE-15436 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 1.0.2 >Reporter: Sangjin Lee > Attachments: hbaseException.log, threaddump.log > > > We noticed an instance where the thread that was executing a flush > ({{BufferedMutatorImpl.flush()}}) got stuck when the (local one-node) cluster > shut down and was unable to get out of that stuck state. > The setup is a single node HBase cluster, and apparently the cluster went > away when the client was executing flush. The flush eventually logged a > failure after 30+ minutes of retrying. That is understandable. > What is unexpected is that the thread is stuck in this state (i.e. in the > {{flush()}} call). I would have expected the {{flush()}} call to return after > the complete failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15436) BufferedMutatorImpl.flush() appears to get stuck
[ https://issues.apache.org/jira/browse/HBASE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HBASE-15436: Description: We noticed an instance where the thread that was executing a flush ({{BufferedMutatorImpl.flush()}}) got stuck when the (local one-node) cluster shut down and was unable to get out of that stuck state. The setup is a single node HBase cluster, and apparently the cluster went away when the client was executing flush. The flush eventually logged a failure after 30+ minutes of retrying. That is understandable. What is unexpected is that thread is stuck in this state (i.e. in the {{flush()}} call). I would have expected the {{flush()}} call to return after the complete failure. was: We noticed an instance where the thread that was executing a flush ({{BufferedMutatorImpl.flush()}} got stuck when the (local one-node) cluster shut down and was unable to get out of that stuck state. The setup is a single node HBase cluster, and apparently the cluster went away when the client was executing flush. The flush eventually logged a failure after 30+ minutes of retrying. That is understandable. What is unexpected is that thread is stuck in this state (i.e. in the {{flush()}} call). I would have expected the {{flush()}} call to return after the complete failure. > BufferedMutatorImpl.flush() appears to get stuck > > > Key: HBASE-15436 > URL: https://issues.apache.org/jira/browse/HBASE-15436 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 1.0.2 >Reporter: Sangjin Lee > > We noticed an instance where the thread that was executing a flush > ({{BufferedMutatorImpl.flush()}}) got stuck when the (local one-node) cluster > shut down and was unable to get out of that stuck state. > The setup is a single node HBase cluster, and apparently the cluster went > away when the client was executing flush. The flush eventually logged a > failure after 30+ minutes of retrying. That is understandable. 
> What is unexpected is that the thread is stuck in this state (i.e. in the > {{flush()}} call). I would have expected the {{flush()}} call to return after > the complete failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15436) BufferedMutatorImpl.flush() appears to get stuck
Sangjin Lee created HBASE-15436: --- Summary: BufferedMutatorImpl.flush() appears to get stuck Key: HBASE-15436 URL: https://issues.apache.org/jira/browse/HBASE-15436 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.0.2 Reporter: Sangjin Lee We noticed an instance where the thread that was executing a flush ({{BufferedMutatorImpl.flush()}}) got stuck when the (local one-node) cluster shut down and was unable to get out of that stuck state. The setup is a single node HBase cluster, and apparently the cluster went away when the client was executing flush. The flush eventually logged a failure after 30+ minutes of retrying. That is understandable. What is unexpected is that the thread is stuck in this state (i.e. in the {{flush()}} call). I would have expected the {{flush()}} call to return after the complete failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
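The contract the report expects from {{flush()}} can be sketched with a small, self-contained simulation: after the retry budget is exhausted, the call should surface the failure to the caller rather than block forever. This is an illustrative model only, not the real {{BufferedMutatorImpl}} code; all names here are hypothetical.

```java
// Illustrative sketch: a flush that retries a bounded number of times and
// then throws, instead of blocking indefinitely when the cluster is gone.
// NOT the actual BufferedMutatorImpl implementation; names are hypothetical.
import java.io.IOException;
import java.util.function.BooleanSupplier;

class BoundedFlush {
    /**
     * Runs the flush attempt up to maxRetries times. Returns the attempt
     * number on success; throws IOException once the budget is exhausted,
     * which is the behavior the bug report says should happen.
     */
    static int flushWithRetries(BooleanSupplier attempt, int maxRetries) throws IOException {
        for (int i = 1; i <= maxRetries; i++) {
            if (attempt.getAsBoolean()) {
                return i; // flush succeeded on attempt i
            }
        }
        // Complete failure: surface it to the caller instead of retrying forever.
        throw new IOException("flush failed after " + maxRetries + " retries");
    }

    public static void main(String[] args) throws IOException {
        // Transient outage: the "cluster" comes back on the third attempt.
        int[] calls = {0};
        System.out.println("attempts = " + flushWithRetries(() -> ++calls[0] >= 3, 5));

        // Permanent outage: flush must throw, not hang.
        boolean threw = false;
        try {
            flushWithRetries(() -> false, 5);
        } catch (IOException expected) {
            threw = true;
        }
        System.out.println("threw = " + threw);
    }
}
```

The key point the simulation makes is that the error path terminates: every code path out of {{flushWithRetries}} either returns or throws, so a caller can never remain parked in the call after the failure is final.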
[jira] [Commented] (HBASE-15336) Support Dataframe writer to the connector
[ https://issues.apache.org/jira/browse/HBASE-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188184#comment-15188184 ] Hadoop QA commented on HBASE-15336: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 8s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 31s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} scalac {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} scalac {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 23s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} scaladoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 8s {color} | {color:green} hbase-spark in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 6s {color} | {color:green} hbase-spark in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 7s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 5s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.9.1 Server=1.9.1 Image:yetus/hbase:date2016-03-09 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12792342/HBASE-15336-2.patch | | JIRA Issue | HBASE-15336 | | Optional Tests | asflicense scalac scaladoc unit compile | | uname | Linux e8b401ca62bb 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 97cce85 | | JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/914/testReport/ | | modules | C: hbase-spark U: hbase-spark | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/914/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message 
was automatically generated. > Support Dataframe writer to the connector > - > > Key: HBASE-15336 > URL: https://issues.apache.org/jira/browse/HBASE-15336 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15336-1.patch, HBASE-15336-2.patch > > > Currently, the connector only supports the read path. A complete solution should > support both read and write. This subtask adds write support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188142#comment-15188142 ] Hadoop QA commented on HBASE-15378: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 21s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 7m 24s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 0s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 6s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 32s {color} | {color:green} master passed with JDK v1.8.0_74 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 50s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 7s {color} | {color:green} hbase-client: patch generated 0 new + 624 unchanged - 3 fixed = 624 total (was 627) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 40s {color} | {color:green} hbase-client: patch generated 0 new + 624 unchanged - 3 fixed = 624 total (was 627) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s {color} | {color:green} hbase-server: patch generated 0 new + 624 unchanged - 3 fixed = 624 total (was 627) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 45s {color} | {color:green} hbase-server: patch generated 0 new + 624 unchanged - 3 fixed = 624 total (was 627) {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | 
{color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 47s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 30s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 2s {color} | {color:green} hbase-client in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} |
[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188081#comment-15188081 ] Hadoop QA commented on HBASE-15425: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 5s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 5s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse 
{color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 31s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 131m 10s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 124m 0s {color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 308m 34s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 | | JDK v1.8.0_74 Timed out junit tests | org.apache.hadoop.hbase.coprocessor.TestRegionObserverScannerOpenHook | | | org.apache.hadoop.hbase.coprocessor.TestRegionServerObserver | | | org.apache.hadoop.hbase.TestClusterBootOrder | | | org.apache.hadoop.hbase.snapshot.TestMobSecureExportSnapshot | | |
[jira] [Commented] (HBASE-15425) Failing to write bulk load event marker in the WAL is ignored
[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188071#comment-15188071 ] Ted Yu commented on HBASE-15425: +1, assuming tests pass. > Failing to write bulk load event marker in the WAL is ignored > - > > Key: HBASE-15425 > URL: https://issues.apache.org/jira/browse/HBASE-15425 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Ashish Singhi >Assignee: Ashish Singhi > Attachments: HBASE-15425.patch, HBASE-15425.v1.patch > > > During the LoadIncrementalHFiles process, if we fail to write the bulk load event > marker in the WAL, the failure is ignored. This will lead to a data mismatch between > the source and peer clusters in the bulk loaded data replication scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
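The shape of the bug and of the fix can be illustrated with a hypothetical sketch (this is not the actual {{HRegion}} code; the {{Wal}} interface and method names below are invented for illustration): a failure to append the bulk load event marker must propagate to the bulk load caller instead of being caught and discarded, so the caller can retry or abort before the clusters diverge.

```java
// Hypothetical illustration of the fix direction for HBASE-15425: a failure
// to write the bulk load event marker must propagate, not be swallowed.
// NOT the actual HRegion implementation; all names here are illustrative.
import java.io.IOException;

class BulkLoadMarkerDemo {
    interface Wal {
        void appendMarker(String marker) throws IOException;
    }

    /** Pre-fix shape: the marker failure is swallowed, so the caller
     *  believes the bulk load fully succeeded and replication can diverge. */
    static boolean bulkLoadIgnoringMarkerFailure(Wal wal) {
        try {
            wal.appendMarker("BULK_LOAD");
        } catch (IOException ignored) {
            // swallowed: no marker is replicated, yet the load "succeeds"
        }
        return true;
    }

    /** Post-fix shape: the failure reaches the caller. */
    static boolean bulkLoadPropagatingMarkerFailure(Wal wal) throws IOException {
        wal.appendMarker("BULK_LOAD"); // IOException propagates to the caller
        return true;
    }

    public static void main(String[] args) {
        Wal failing = m -> { throw new IOException("WAL unavailable"); };
        System.out.println("ignoring variant reports success: "
                + bulkLoadIgnoringMarkerFailure(failing));
        try {
            bulkLoadPropagatingMarkerFailure(failing);
        } catch (IOException e) {
            System.out.println("propagating variant surfaced: " + e.getMessage());
        }
    }
}
```

In the replication scenario the description mentions, the peer cluster only learns about bulk loaded HFiles through that WAL marker, which is why silently dropping the append leaves the two clusters with different data.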
[jira] [Resolved] (HBASE-15350) Enable individual unit test in hbase-spark module
[ https://issues.apache.org/jira/browse/HBASE-15350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang resolved HBASE-15350. Resolution: Duplicate > Enable individual unit test in hbase-spark module > - > > Key: HBASE-15350 > URL: https://issues.apache.org/jira/browse/HBASE-15350 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15350-1.patch, HBASE-15350-2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15336) Support Dataframe writer to the connector
[ https://issues.apache.org/jira/browse/HBASE-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188033#comment-15188033 ] Zhan Zhang commented on HBASE-15336: [~tedyu] Review comments addressed and a review board request opened; please review. > Support Dataframe writer to the connector > - > > Key: HBASE-15336 > URL: https://issues.apache.org/jira/browse/HBASE-15336 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15336-1.patch, HBASE-15336-2.patch > > > Currently, the connector only supports the read path. A complete solution should > support both read and write. This subtask adds write support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15336) Support Dataframe writer to the connector
[ https://issues.apache.org/jira/browse/HBASE-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated HBASE-15336: --- Attachment: HBASE-15336-2.patch > Support Dataframe writer to the connector > - > > Key: HBASE-15336 > URL: https://issues.apache.org/jira/browse/HBASE-15336 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15336-1.patch, HBASE-15336-2.patch > > > Currently, the connector only supports the read path. A complete solution should > support both read and write. This subtask adds write support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15336) Support Dataframe writer to the connector
[ https://issues.apache.org/jira/browse/HBASE-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated HBASE-15336: --- Status: Patch Available (was: Open) > Support Dataframe writer to the connector > - > > Key: HBASE-15336 > URL: https://issues.apache.org/jira/browse/HBASE-15336 > Project: HBase > Issue Type: Sub-task >Reporter: Zhan Zhang >Assignee: Zhan Zhang > Attachments: HBASE-15336-1.patch, HBASE-15336-2.patch > > > Currently, the connector only supports the read path. A complete solution should > support both read and write. This subtask adds write support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15424) Add bulk load hfile-refs for replication in ZK after the event is appended in the WAL
[ https://issues.apache.org/jira/browse/HBASE-15424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187909#comment-15187909 ] Hadoop QA commented on HBASE-15424: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 40s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 33s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 14s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s {color} | {color:green} master passed with JDK v1.7.0_95 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 13s {color} | {color:green} hbase-server: patch generated 0 new + 80 unchanged - 1 fixed = 80 total (was 81) {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 5s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 9s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_74. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 125m 2s {color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 243m 33s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Timed out junit tests | org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedure | | | org.apache.hadoop.hbase.regionserver.compactions.TestFIFOCompactionPolicy | | | org.apache.hadoop.hbase.master.TestGetLastFlushedSequenceId | | | org.apache.hadoop.hbase.regionserver.TestCompaction | | | org.apache.hadoop.hbase.snapshot.TestSnapshotClientRetries | | | org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 | | |
[jira] [Commented] (HBASE-15377) Per-RS Get metric is time based, per-region metric is size-based
[ https://issues.apache.org/jira/browse/HBASE-15377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187785#comment-15187785 ] Hadoop QA commented on HBASE-15377: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 34s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 24s {color} | {color:green} master passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 47s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 49s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 49s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 56s {color} | {color:green} master passed with JDK v1.8.0_74 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 47s {color} | {color:green} master passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 18s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 1s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 41m 54s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 28s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 7s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 31s {color} | {color:green} hbase-hadoop-compat in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s {color} | {color:green} hbase-hadoop-compat in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 44s {color} | {color:green} hbase-hadoop2-compat in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s {color} | {color:green} hbase-hadoop2-compat in the patch passed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 33m 4s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color}
[jira] [Commented] (HBASE-15378) Scanner cannot handle heartbeat message with no results
[ https://issues.apache.org/jira/browse/HBASE-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187780#comment-15187780 ]

Hudson commented on HBASE-15378:

SUCCESS: Integrated in HBase-1.2-IT #459 (See [https://builds.apache.org/job/HBase-1.2-IT/459/])
HBASE-15378 Scanner cannot handle heartbeat message with no results (tedyu: rev 6497b365c7734ed9984de561ec37292a3656e878)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java

> Scanner cannot handle heartbeat message with no results
> -------------------------------------------------------
>
> Key: HBASE-15378
> URL: https://issues.apache.org/jira/browse/HBASE-15378
> Project: HBase
> Issue Type: Bug
> Components: dataloss, Scanners
> Affects Versions: 1.2.0, 1.1.3
> Reporter: Phil Yang
> Assignee: Phil Yang
> Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.4.0
>
> Attachments: HBASE-15378-v1.txt, HBASE-15378-v2.txt, HBASE-15378-v3.txt, HBASE-15378-v4.patch, HBASE-15378-v5.patch, HBASE-15378-v6.patch
>
> When an RS scanner reaches the TIME_LIMIT_REACHED_MID_ROW state, it stops scanning, sends back what it has read so far to the client, and marks the message as a heartbeat message. If no cells have been read yet, this will be an empty response.
> However, ClientScanner only handles the case where the client gets an empty heartbeat while its cache is not empty. If the cache is empty too, the empty heartbeat is treated as end-of-region and a new scanner is opened for the next region.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
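The bug described above can be sketched in miniature. This is a hypothetical, self-contained model of the decision the fix changes, not the actual HBase `ClientScanner` code; the class and method names (`ScanResponse`, `isEndOfRegion`) are illustrative. The key point is that an empty response may only be treated as end-of-region when it is NOT flagged as a heartbeat:

```java
import java.util.ArrayList;
import java.util.List;

class ScanResponse {
    final List<String> cells;   // stand-in for the cells returned by the RS
    final boolean heartbeat;    // server set the heartbeat flag
    ScanResponse(List<String> cells, boolean heartbeat) {
        this.cells = cells;
        this.heartbeat = heartbeat;
    }
}

public class HeartbeatAwareScanner {
    /**
     * Returns true when the client should move on to the next region.
     * Before the fix, an empty response arriving while the client cache was
     * also empty was always treated as end-of-region, even for heartbeats.
     */
    static boolean isEndOfRegion(ScanResponse resp, List<String> clientCache) {
        if (resp.heartbeat) {
            // A heartbeat only signals "still scanning"; keep the current
            // region open even if both the response and the cache are empty.
            return false;
        }
        return resp.cells.isEmpty() && clientCache.isEmpty();
    }

    public static void main(String[] args) {
        List<String> emptyCache = new ArrayList<>();
        ScanResponse emptyHeartbeat = new ScanResponse(new ArrayList<>(), true);
        ScanResponse emptyFinal = new ScanResponse(new ArrayList<>(), false);
        System.out.println(isEndOfRegion(emptyHeartbeat, emptyCache)); // false
        System.out.println(isEndOfRegion(emptyFinal, emptyCache));     // true
    }
}
```

Without the heartbeat check, the first case would also return true, silently skipping the rest of the region — which is why the issue carries the `dataloss` component.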
[jira] [Commented] (HBASE-15314) Allow more than one backing file in bucketcache
[ https://issues.apache.org/jira/browse/HBASE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187759#comment-15187759 ]

Daniel Pol commented on HBASE-15314:

My favorite use case is when you have the hottest table that you want to cache completely and make sure you get the best performance for it (and the table doesn't fit into RAM). Right now it's a matter of adding software RAID on top of multiple SSDs to achieve that. I would like to remove the software RAID overhead by doing the parallelism in HBase. I agree it's not easy, mostly because the HBase block size is not really fixed, so you end up having to add logic to handle that.
Funny you mentioned the buckets of fixed size. I'm thinking about filing another JIRA about the space wasted in bucketcache because of that. With a small bucketcache that's not an issue, but when you get to a few TiB and end up with half the space allocated but unused, it becomes a serious issue.

> Allow more than one backing file in bucketcache
> -----------------------------------------------
>
> Key: HBASE-15314
> URL: https://issues.apache.org/jira/browse/HBASE-15314
> Project: HBase
> Issue Type: Sub-task
> Components: BucketCache
> Reporter: stack
> Assignee: Amal Joshy
> Attachments: HBASE-15314.patch
>
> Allow bucketcache to use more than just one backing file, e.g. when the chassis has more than one SSD in it.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
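One way to get the parallelism the comment asks for without software RAID is to treat the configured files as stripes of one logical cache address space. The sketch below is hypothetical (the class name `StripedFileMapper` and its methods are illustrative, not the actual BucketCache `FileIOEngine` API): it maps a global cache offset to a (file, offset-within-file) pair, assuming every backing file has the same capacity:

```java
// Hypothetical sketch: map a global bucketcache offset onto one of several
// equally sized backing files, so reads/writes spread across multiple SSDs.
public class StripedFileMapper {
    private final int numFiles;
    private final long sizePerFile; // assumed identical for every file

    public StripedFileMapper(int numFiles, long sizePerFile) {
        this.numFiles = numFiles;
        this.sizePerFile = sizePerFile;
    }

    /** Index of the backing file a global cache offset lands in. */
    public int fileIndex(long globalOffset) {
        return (int) (globalOffset / sizePerFile);
    }

    /** Offset within that backing file. */
    public long fileOffset(long globalOffset) {
        return globalOffset % sizePerFile;
    }

    /** Total logical capacity across all files. */
    public long capacity() {
        return numFiles * sizePerFile;
    }

    public static void main(String[] args) {
        // Two 1 KiB "files": global offset 1536 lands in file 1 at offset 512.
        StripedFileMapper m = new StripedFileMapper(2, 1024L);
        System.out.println(m.fileIndex(1536));  // 1
        System.out.println(m.fileOffset(1536)); // 512
    }
}
```

Because buckets in bucketcache are allocated at fixed-size-aligned offsets, a mapping like this keeps each bucket inside a single file as long as the per-file size is a multiple of the bucket size, which sidesteps the variable-block-size concern at the file boundary.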
[jira] [Updated] (HBASE-15435) Add WAL (in bytes) written metric
[ https://issues.apache.org/jira/browse/HBASE-15435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alicia Ying Shu updated HBASE-15435:
Fix Version/s: 1.4.0
               1.3.0
               2.0.0

> Add WAL (in bytes) written metric
> ---------------------------------
>
> Key: HBASE-15435
> URL: https://issues.apache.org/jira/browse/HBASE-15435
> Project: HBase
> Issue Type: Sub-task
> Reporter: Alicia Ying Shu
> Assignee: Alicia Ying Shu
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> We have a histogram metric for WAL bytes written, but we do not have a single metric that tracks the total WAL bytes written as a count per regionserver.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15435) Add WAL (in bytes) written metric
Alicia Ying Shu created HBASE-15435:
-----------------------------------

Summary: Add WAL (in bytes) written metric
Key: HBASE-15435
URL: https://issues.apache.org/jira/browse/HBASE-15435
Project: HBase
Issue Type: Sub-task
Reporter: Alicia Ying Shu
Assignee: Alicia Ying Shu

We have a histogram metric for WAL bytes written, but we do not have a single metric that tracks the total WAL bytes written as a count per regionserver.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
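The distinction the issue draws — a per-append histogram versus a single running total — can be sketched as follows. This is a hypothetical model, not the actual `MetricsWALSource` interface; the class and method names are illustrative. The histogram records the size distribution of individual appends, while the new counter accumulates total bytes so it can be read (and rated) per regionserver:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of adding a monotonically increasing WAL-bytes counter
// alongside an existing per-append size histogram.
public class WalByteMetrics {
    // Running total of all WAL bytes written by this regionserver.
    private final LongAdder totalWalBytesWritten = new LongAdder();

    /** Called after each WAL append with the size of the appended entry. */
    public void postAppend(long entrySizeBytes) {
        // The existing code would also record entrySizeBytes in the
        // size histogram here; only the new counter is shown.
        totalWalBytesWritten.add(entrySizeBytes);
    }

    /** Single count of WAL bytes written, as the issue requests. */
    public long getTotalWalBytesWritten() {
        return totalWalBytesWritten.sum();
    }

    public static void main(String[] args) {
        WalByteMetrics metrics = new WalByteMetrics();
        metrics.postAppend(100);
        metrics.postAppend(250);
        System.out.println(metrics.getTotalWalBytesWritten()); // 350
    }
}
```

`LongAdder` is used rather than a plain `long` because WAL appends on a regionserver can come from many handler threads concurrently, and it scales better under contention than `AtomicLong`.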