[jira] [Created] (HBASE-27034) NegativeArraySizeException was encountered during compaction

2022-05-13 Thread fengxianjing (Jira)
fengxianjing created HBASE-27034:


 Summary: NegativeArraySizeException was encountered during 
compaction
 Key: HBASE-27034
 URL: https://issues.apache.org/jira/browse/HBASE-27034
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 2.3.4
Reporter: fengxianjing
Assignee: fengxianjing


 
{code:java}
2022-04-13 12:45:37,122 ERROR [regionserver/xx:26020-shortCompactions-0] 
regionserver.CompactSplit: Compaction failed 
region=XXX:XXX,002CX21205070934507532021052320210523174923,162
8091302516.7d2e05ad63b91843d438d2464a908d49., 
storeName=7d2e05ad63b91843d438d2464a908d49/info, priority=90, 
startTime=1649825135950
java.lang.NegativeArraySizeException
        at org.apache.hadoop.hbase.CellUtil.cloneQualifier(CellUtil.java:120)
        at 
org.apache.hadoop.hbase.ByteBufferKeyValue.getQualifierArray(ByteBufferKeyValue.java:112)
        at 
org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(CellUtil.java:1335)
        at 
org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(CellUtil.java:1318)
        at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.getMidpoint(HFileWriterImpl.java:384)
        at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishBlock(HFileWriterImpl.java:349)
        at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.checkBlockBoundary(HFileWriterImpl.java:328)
        at 
org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.append(HFileWriterImpl.java:739)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileWriter.append(StoreFileWriter.java:299)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:410)
        at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:333)
        at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:65)
        at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:126)
        at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1544)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2288)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:619)
        at 
org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:661)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748) {code}
 

We have encountered the exception above many times. Usually the compaction is 
retried and succeeds on the next attempt. But sometimes it makes _getMidpoint_ 
return a wrong result, which produces an abnormal index block like the 
following:

 
{code:java}
068c892d122//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
1//LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
068c896a6fc6f155beaddab036a4225ef79_12022011420220114205945/info:q/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0
 {code}
 

Such an index block then leads to an endless loop in 
_org.apache.hadoop.hbase.regionserver.KeyValueHeap#generalizedSeek_.

The root cause is that _lastCellOfPreviousBlock_ holds a reference to cells 
from the read path ([HBASE-16372|https://issues.apache.org/jira/browse/HBASE-16372]), 
whose underlying block buffers can be reused before the writer is done with them.

I have fixed it and will open a PR for it.
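
For illustration, here is a minimal, self-contained demo of the hazard and of 
the essence of the fix (the class name and flow are hypothetical, assuming only 
hbase-common on the classpath; this is not the actual patch): a cell that 
merely references a reused block buffer decodes garbage once the buffer is 
overwritten, while a deep copy stays intact.
{code:java}
import java.nio.ByteBuffer;
import java.util.Arrays;

import org.apache.hadoop.hbase.ByteBufferKeyValue;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.KeyValueUtil;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical demo class, not part of HBase.
public class StaleCellReferenceDemo {
  public static void main(String[] args) {
    // Serialize a cell into a byte[], standing in for a read-path block
    // buffer that the region server will later reuse for another block.
    KeyValue kv = new KeyValue(Bytes.toBytes("row"), Bytes.toBytes("info"),
      Bytes.toBytes("q"), Bytes.toBytes("value"));
    byte[] blockBuf = KeyValueUtil.copyToNewByteArray(kv);
    Cell cellInBlock =
      new ByteBufferKeyValue(ByteBuffer.wrap(blockBuf), 0, blockBuf.length);

    // The essence of the fix: materialize an independent deep copy before
    // retaining a cell (e.g. lastCellOfPreviousBlock) across blocks.
    Cell safeCopy = KeyValueUtil.copyToNewKeyValue(cellInBlock);

    // The shared buffer is reused, i.e. overwritten with unrelated bytes.
    Arrays.fill(blockBuf, (byte) 0x7f);

    // The deep copy still decodes correctly ...
    System.out.println("copy: " + Bytes.toString(CellUtil.cloneQualifier(safeCopy)));
    // ... while the stale reference now reads bogus lengths from the
    // overwritten buffer, the same failure mode that surfaced above as a
    // NegativeArraySizeException in CellUtil.cloneQualifier.
    System.out.println("stale qualifier length: " + cellInBlock.getQualifierLength());
  }
}
{code}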



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HBASE-27019) Minor compression performance improvements

2022-05-13 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-27019.
-
Hadoop Flags: Reviewed
  Resolution: Fixed

> Minor compression performance improvements
> --
>
> Key: HBASE-27019
> URL: https://issues.apache.org/jira/browse/HBASE-27019
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Trivial
> Fix For: 2.5.0, 3.0.0-alpha-3
>
>
> The TRACE level log statements are expensive enough to warrant removal. They 
> were useful during development but are now just overhead.
> {noformat}
>   1270039022  4.07%  127  jbyte_disjoint_arraycopy
> {noformat}
> e.g.
> {noformat}
>   [ 0] jbyte_disjoint_arraycopy
>   [ 1] org.slf4j.impl.Reload4jLoggerAdapter.isTraceEnabled
>   [ 2] org.slf4j.impl.Reload4jLoggerAdapter.trace
>   [ 3] 
> org.apache.hadoop.hbase.io.compress.aircompressor.HadoopCompressor.setInput
>   [ 4] org.apache.hadoop.io.compress.BlockCompressorStream.write
>   [ 5] java.io.OutputStream.write
>   [ 6] com.salesforce.hbase.util.TestUtils.outputStreamTest
>   [ 7] com.salesforce.hbase.util.TestUtils.outputStreamTest
>   [ 8] com.salesforce.hbase.BenchmarkAircompressorLz4.test
>   [ 9] 
> com.salesforce.hbase.jmh_generated.BenchmarkAircompressorLz4_test_jmhTest.test_avgt_jmhStub
>   [10] 
> com.salesforce.hbase.jmh_generated.BenchmarkAircompressorLz4_test_jmhTest.test_AverageTime
> {noformat}
> Also, we unnecessarily create new LZ4 compressor and decompressor instances 
> in the reset() methods.
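> 
> A minimal sketch of the reuse idea (class and method names here are 
> illustrative, not the actual patch): aircompressor compressors are reusable 
> across calls, so a single instance can serve every reset() cycle instead of 
> being re-created each time.
> {code:java}
> import io.airlift.compress.lz4.Lz4Compressor;
> 
> // Hypothetical wrapper, not HBase code.
> public class ReusableLz4Block {
>   // Allocated once; reset() no longer does "compressor = new Lz4Compressor()".
>   private final Lz4Compressor compressor = new Lz4Compressor();
> 
>   public int compress(byte[] in, int inLen, byte[] out) {
>     return compressor.compress(in, 0, inLen, out, 0, out.length);
>   }
> 
>   public void reset() {
>     // Only per-block bookkeeping would be cleared here; the reusable
>     // compressor instance is deliberately kept.
>   }
> }
> {code}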



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Disable Style/FrozenStringLiteralComment for ruby

2022-05-13 Thread Tao Li
Hi Mike,

Thank you for your reply.

I added `# frozen_string_literal: true` after the license header and RuboCop
was not happy. I verified this in PR
https://github.com/apache/hbase/pull/4416, which you can refer to.

This comment (while no longer the default in Ruby 3) may indeed provide
some minor performance improvements. However, Ruby is used in HBase in only a
small number of scenarios, so I think the potential performance gains are limited.

But if we can only add it before the license header, that looks a little
weird. Perhaps it comes down to how we weigh the trade-off. Thank you again.



Mike Drob  于2022年5月13日周五 20:38写道:

>
>
> On 2022/05/12 05:04:51 Tao Li wrote:
> > Hi team,
> >
> > By default Style/FrozenStringLiteralComment is enabled in RuboCop. If we
> > update a Ruby file, RuboCop prompts `Missing frozen string literal
> > comment` (see
> > https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4416/2/artifact/yetus-general-check/output/diff-patch-rubocop.txt).
> >
> >
> > To address this warning, we need to add `# frozen_string_literal: true`
> > at the top of the Ruby file (see
> > https://github.com/rubocop/rubocop/blob/master/config/default.yml#L3631),
> > which would place it above the `Apache License` header and look strange.
> >
>
> I think this line can be applied after the license header and still be
> compliant with RuboCop.
>
> >
> > I don't think this `FrozenStringLiteralComment` check is really necessary.
> > Can we disable it?
> >
>
> Using frozen strings (while no longer the default in Ruby 3) has some
> minor performance improvements where it is applied.
>
> >
> > I filed a JIRA, https://issues.apache.org/jira/browse/HBASE-27026, to track
> > the issue and opened a PR: https://github.com/apache/hbase/pull/4423.
> >
> > You are welcome to discuss whether or not this approach is reasonable.
> >
> >
> >
> > Thanks,
> >
> > Tao Li
> >
>


[jira] [Resolved] (HBASE-27013) Introduce read all bytes when using pread for prefetch

2022-05-13 Thread Tak-Lon (Stephen) Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak-Lon (Stephen) Wu resolved HBASE-27013.
--
Fix Version/s: 2.5.0
   2.6.0
   3.0.0-alpha-3
 Hadoop Flags: Reviewed
 Release Note: Introduces an optional flag, hfile.pread.all.bytes.enabled, 
for pread that always reads the full bytes, including the next block header. 
This is especially helpful when running HBase against blob storage such as S3 
or Azure Blob Storage.
   Resolution: Fixed

> Introduce read all bytes when using pread for prefetch
> --
>
> Key: HBASE-27013
> URL: https://issues.apache.org/jira/browse/HBASE-27013
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile, Performance
>Affects Versions: 2.5.0, 2.6.0, 3.0.0-alpha-3, 2.4.13
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
> Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3
>
>
> h2. Problem statement
> When prefetching HFiles from blob storage such as S3 through a storage 
> implementation like S3A, we found a logical issue in HBase pread that causes 
> reading a remote HFile to abort the input stream multiple times. The aborted 
> and reopened streams slow down reads, discard many bytes, and waste time 
> re-creating connections, especially when SSL is enabled.
> h2. ROOT CAUSE
> The root cause of the above issue is that 
> [BlockIOUtils#preadWithExtra|https://github.com/apache/hbase/blob/9c8c9e7fbf8005ea89fa9b13d6d063b9f0240443/hbase-common/src/main/java/org/apache/hadoop/hbase/io/util/BlockIOUtils.java#L214-L257]
>  reads from an input stream that does not guarantee returning both the data 
> block and the next block header (optional data to be cached).
> When the input stream reads short, returning the necessary data block plus a 
> few more bytes but fewer than a full next block header, 
> [BlockIOUtils#preadWithExtra|https://github.com/apache/hbase/blob/9c8c9e7fbf8005ea89fa9b13d6d063b9f0240443/hbase-common/src/main/java/org/apache/hadoop/hbase/io/util/BlockIOUtils.java#L214-L257]
>  returns to the caller without having cached the next block header. As a 
> result, before HBase reads the next block, 
> [HFileBlock#readBlockDataInternal|https://github.com/apache/hbase/blob/9c8c9e7fbf8005ea89fa9b13d6d063b9f0240443/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L1648-L1664]
>  tries to re-read the next block header from the input stream. By this point 
> the reusable input stream has moved its position past the offset of the last 
> read data block; with the [S3A 
> implementation|https://github.com/apache/hadoop/blob/29401c820377d02a992eecde51083cf87f8e57af/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L339-L361],
>  the stream is then closed, all remaining bytes are aborted, and a new input 
> stream is reopened at the offset of the last read data block.
> h2. How do we fix it?
> S3A is doing the right job: HBase is telling it to move the offset from 
> position A back to A - N, so there is not much we can do about how S3A 
> handles the input stream; in the case of HDFS this operation happens to be 
> fast. Therefore we should fix it at the HBase level and always try to read 
> the data block plus the next block header when using blob storage, avoiding 
> the expensive draining of the stream and reopening of the socket to the 
> remote storage.
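> Below is a self-contained sketch of that retry-until-full idea (the 
> PositionalReader interface and all names are illustrative, not the actual 
> BlockIOUtils code):
> {code:java}
> import java.io.IOException;
> 
> public final class PreadAllBytesSketch {
> 
>   /** Positional read contract, mirroring Hadoop's PositionedReadable#read. */
>   interface PositionalReader {
>     int read(long position, byte[] buf, int offset, int length) throws IOException;
>   }
> 
>   /**
>    * Returns true if the optional extra bytes (the next block header) were
>    * fully read and can be cached by the caller.
>    */
>   static boolean preadWithExtra(PositionalReader in, long position, byte[] buf,
>       int bufOffset, int necessaryLen, int extraLen, boolean readAllBytes)
>       throws IOException {
>     int totalLen = necessaryLen + extraLen;
>     int bytesRead = 0;
>     while (bytesRead < totalLen) {
>       int ret = in.read(position + bytesRead, buf, bufOffset + bytesRead,
>           totalLen - bytesRead);
>       if (ret < 0) {
>         if (bytesRead >= necessaryLen) {
>           break; // EOF after the data block; the extra header is optional
>         }
>         throw new IOException("Premature EOF: read " + bytesRead + " of "
>             + necessaryLen + " necessary bytes");
>       }
>       bytesRead += ret;
>       // Without the flag, a short read that already covers the data block
>       // returns early and leaves the next header unread, forcing the later
>       // seek-back that makes S3A abort and reopen the stream.
>       if (!readAllBytes && bytesRead >= necessaryLen) {
>         break;
>       }
>     }
>     return bytesRead == totalLen;
>   }
> }
> {code}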
> h2. Drawback and discussion
>  * A known drawback: when we are at the last block, we read extra bytes that 
> are not actually a header, and we still read them into the byte buffer array. 
> The size is always 33 bytes, and it is not a correctness issue because the 
> trailer tells us where the last data block ends; we just waste a 33-byte read 
> whose data is never used.
>  * I don't know whether we could use HFileStreamReader instead, but that 
> would change the prefetch logic a lot, so this minimal change seems best.
> h2. Initial result
> We used a YCSB dataset of 1 billion records and enabled prefetch for the user 
> table. We collected the S3A metric {{stream_read_bytes_discarded_in_abort}} 
> to compare solutions; each region server had ~290 GB of data to prefetch into 
> the bucket cache.
> * Before the change, a total of 4235973338472 bytes (~4235 GB) was aborted on 
> a sample region server for about 290 GB of data.
> ** The overall time was about 45-60 minutes.
>  
> {code}
> % grep "stream_read_bytes_discarded_in_abort" 
> ~/prefetch-result/prefetch-s3a-jmx-metrics.json | grep -wv 
> "stream_read_bytes_discarded_in_abort\":0,"
>  

[jira] [Created] (HBASE-27033) Backport "HBASE-27013 Introduce read all bytes when using pread for prefetch" to branch-2.4

2022-05-13 Thread Tak-Lon (Stephen) Wu (Jira)
Tak-Lon (Stephen) Wu created HBASE-27033:


 Summary: Backport "HBASE-27013 Introduce read all bytes when using 
pread for prefetch" to branch-2.4
 Key: HBASE-27033
 URL: https://issues.apache.org/jira/browse/HBASE-27033
 Project: HBase
  Issue Type: Task
Affects Versions: 2.4.13
Reporter: Tak-Lon (Stephen) Wu
 Fix For: 2.4.13


Backport HBASE-27013 to branch-2.4. A separate issue is required because it is 
not a clean backport.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Disable Style/FrozenStringLiteralComment for ruby

2022-05-13 Thread Mike Drob



On 2022/05/12 05:04:51 Tao Li wrote:
> Hi team,
> 
> By default Style/FrozenStringLiteralComment is enabled in RuboCop. If we
> update a Ruby file, RuboCop prompts `Missing frozen string literal comment`
> (see
> https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4416/2/artifact/yetus-general-check/output/diff-patch-rubocop.txt).
> 
> 
> To address this warning, we need to add `# frozen_string_literal: true` at the
> top of the Ruby file (see
> https://github.com/rubocop/rubocop/blob/master/config/default.yml#L3631),
> which would place it above the `Apache License` header and look strange.
> 

I think this line can be applied after the license header and still be 
compliant with RuboCop.

> 
> I don't think this `FrozenStringLiteralComment` check is really necessary.
> Can we disable it?
> 

Using frozen strings (while no longer the default in Ruby 3) has some minor 
performance improvements where it is applied.

> 
> I filed a JIRA, https://issues.apache.org/jira/browse/HBASE-27026, to track
> the issue and opened a PR: https://github.com/apache/hbase/pull/4423.
> 
> You are welcome to discuss whether or not this approach is reasonable.
> 
> 
> 
> Thanks,
> 
> Tao Li
> 


[jira] [Created] (HBASE-27032) The draining region servers metric description is incorrect

2022-05-13 Thread Tao Li (Jira)
Tao Li created HBASE-27032:
--

 Summary: The draining region servers metric description is 
incorrect
 Key: HBASE-27032
 URL: https://issues.apache.org/jira/browse/HBASE-27032
 Project: HBase
  Issue Type: Bug
Reporter: Tao Li
Assignee: Tao Li


The draining region servers metric description is incorrect.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HBASE-27031) Add metrics for draining region servers

2022-05-13 Thread Tao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li resolved HBASE-27031.

Resolution: Duplicate

> Add metrics for draining region servers
> ---
>
> Key: HBASE-27031
> URL: https://issues.apache.org/jira/browse/HBASE-27031
> Project: HBase
>  Issue Type: Wish
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
> Attachments: image-2022-05-13-19-28-40-618.png
>
>
> To make it easier to query and monitor the draining region servers, add the 
> metrics numDrainingRegionServers and drainingRegionServers.
> !image-2022-05-13-19-28-40-618.png|width=317,height=165!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HBASE-27031) Add metrics for draining region servers

2022-05-13 Thread Tao Li (Jira)
Tao Li created HBASE-27031:
--

 Summary: Add metrics for draining region servers
 Key: HBASE-27031
 URL: https://issues.apache.org/jira/browse/HBASE-27031
 Project: HBase
  Issue Type: Wish
Reporter: Tao Li
Assignee: Tao Li
 Attachments: image-2022-05-13-19-28-40-618.png

To make it easier to query and monitor the draining region servers, add the 
metrics numDrainingRegionServers and drainingRegionServers.

!image-2022-05-13-19-28-40-618.png|width=317,height=165!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HBASE-27030) Fix undefined local variable error in draining_servers.rb

2022-05-13 Thread Junegunn Choi (Jira)
Junegunn Choi created HBASE-27030:
-

 Summary: Fix undefined local variable error in draining_servers.rb
 Key: HBASE-27030
 URL: https://issues.apache.org/jira/browse/HBASE-27030
 Project: HBase
  Issue Type: Bug
Affects Versions: 3.0.0-alpha-3
Reporter: Junegunn Choi


HBASE-21812 replaced a for-loop with an `each` block. An `each` block 
introduces a new scope, so a local variable defined inside it cannot be 
accessed afterwards.

{quote}
  NameError: undefined local variable or method `admin' for main:Object
    getServerNames at /opt/khp/hbase/bin/draining_servers.rb:81
        addServers at /opt/khp/hbase/bin/draining_servers.rb:88
             at /opt/khp/hbase/bin/draining_servers.rb:146
{quote}
 
{code:ruby}
for i in [1, 2, 3]
  a = i
end
puts a
  # 3

[4, 5, 6].each do |i|
  b = i
end
puts b
  # undefined local variable or method `b'
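
# The fix described below: predeclare the variable in the enclosing scope,
# so the assignment inside the block stays visible after it.
c = nil
[7, 8, 9].each do |i|
  c = i
end
puts c
  # 9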
{code}

We can define the `admin` local variable in the enclosing scope beforehand, as 
sketched at the end of the example above, so it can still be accessed after 
the block.

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)