Re: [DISCUSS] Contribution of a thrift2 python api

2021-09-28 Thread Yutong Xiao
Contributing to hbase-connectors is OK for me. But there is a question
about how to release the package. Currently, I upload the package to PyPI
with my own PyPI account. It is not clear whether this should change if I
contribute the client to hbase-connectors.

Reid Chan wrote on Tue, Aug 3, 2021 at 10:27 PM:

> Brief summaries from this thread:
>
> I think you can push to the hbase-connectors repo. Just create a new dir,
> hbase-thrift-python, and put your code under it.
> Your code will need to be reviewed by experienced Python developers in the
> community.
>
> Some follow-up tasks include (but are not limited to):
>
>- requirements.txt
>- docs about how to release
>- some client examples
>- add some test code (I'm not sure about this part, i.e. whether there are
>similar conventions in the Python world)
>- other related information
>
>
> About the *.thrift files you mentioned, I can't come up with a good idea for
> now. It looks like we still need to create two separate PRs to update both
> the hbase and hbase-connectors repos.
> However, as *.thrift files seldom get updated, I feel it should be OK.
>
>
> --
> Best Regards,
> R.C
>
>
> On Tue, Aug 3, 2021 at 10:42 AM Yutong Xiao  wrote:
>
> > Hello, any other thoughts about this?
> >
> > Yutong Xiao wrote on Wed, Jul 21, 2021 at 11:30 AM:
> >
> > > Hi. I have removed the personal info in the licenses. For point 3.2,
> > > thbase depends on the hbase.thrift file, and I have included the
> > > hbase.thrift file that thbase uses in the repo. In this case, repo
> > > separation would lead to a sync problem between the hbase.thrift files
> > > in the HBase repo and the connectors repo. I am concerned this may make
> > > maintenance hard. What do you think?
> > >
> > > 张铎 (Duo Zhang) wrote on Sat, Jul 17, 2021 at 9:39 AM:
> > >
> > >> One of the difficulties of moving hbase-thrift and hbase-rest out is
> > >> that we make use of hbase-http in those two modules, at least for
> > >> setting up the status servlet...
> > >>
> > >> Sean Busbey wrote on Fri, Jul 16, 2021 at 11:01 PM:
> > >>
> > >> > Maybe a good fit for the hbase-connectors repo? I know we've talked a
> > >> > few times about moving the thrift server out there. If we did both,
> > >> > then the compatibility question becomes just the standard
> > >> > client/server compatibility, provided the thrift server only uses our
> > >> > public Java client API.
> > >> >
> > >> > On Thu, Jul 15, 2021 at 10:21 PM Yutong Xiao  wrote:
> > >> > >
> > >> > > BTW, for point 2, if allowed I can do that.
> > >> > > And for point 3.2, it is only a personal opinion; the final decision
> > >> > > should be made by the community.
> > >> > > Besides, many of my Python-using colleagues have started using this
> > >> > > library. I think many Python users want a good HBase Python client.
> > >> > >
> > >> > > Yutong Xiao wrote on Fri, Jul 16, 2021 at 11:07 AM:
> > >> > >
> > >> > > > 1. The license is no problem.
> > >> > > > 2. This depends on whether any committer or PMC member is
> > >> > > > interested in doing that.
> > >> > > > 3. I can be responsible for those documents. About 3.2, as thbase
> > >> > > > has already been uploaded to PyPI, I think it would be better as
> > >> > > > a new, separate repo.
> > >> > > >
> > >> > > > Wei-Chiu Chuang wrote on Tue, Jul 6, 2021 at 10:22 AM:
> > >> > > >
> > >> > > >> Hi,
> > >> > > >> thanks for your interest in contributing the Python API to the
> > >> > > >> HBase project.
> > >> > > >>
> > >> > > >> I quickly checked, and it doesn't look like there's another
> > >> > > >> active Python HBase thrift client project at this point.
> > >> > > >> I don't personally have a need for a Python thrift HBase client
> > >> > > >> library. If there are people who will benefit from this library,
> > >> > > >> then it's a good idea to make sure the library is well
> > >> > > >> maintained, by having it become part of the Apache HBase project
> > >> > > >> so that more developers can contribute to it.
> > >> > > >>
> > >> > > >> As a hobbyist Python developer I can help review/commit the
> > >> > > >> patch.
> > >> > > >>
> > >> > > >> My two cents:
> > >> > > >> (1) license: the code is ASL 2.0, so it's compatible. The text
> > >> > > >> "Copyright 2021 Yutong Sean" would need to be removed.
> > >> > > >> (2) Apache Infra does not manage PyPI, so we (the Apache HBase
> > >> > > >> project committers/PMC) will have to do that.
> > >> > > >> I suspect we will have to replicate this PyPI project and add
> > >> > > >> the interested HBase PMC members who are willing to do the
> > >> > > >> release work.
> > >> > > >> (3) compatibility matrix: we need to document which versions of
> > >> > > >> the HBase server are supported.
> > >> > > >> (3) code:
> > >> > > >> (3.1) You will need a requirements.txt, and preferably specify
> > >> > > >> the versions of the dependencies.
> > >> > > >> (3.2) If the community accepts it, should it be part of 

[jira] [Created] (HBASE-26305) Move NavigableSet add operation to writer thread in BucketCache

2021-09-28 Thread Yutong Xiao (Jira)
Yutong Xiao created HBASE-26305:
---

 Summary: Move NavigableSet add operation to writer thread in 
BucketCache
 Key: HBASE-26305
 URL: https://issues.apache.org/jira/browse/HBASE-26305
 Project: HBase
  Issue Type: Improvement
Reporter: Yutong Xiao
Assignee: Yutong Xiao
 Attachments: logn in WriterThreads.png, logn in cacheBlock.png

We currently use a ConcurrentSkipListSet to store blocks by HFile in the bucket 
cache. The average time complexity of the add operation is O(log n). We can move 
this costly operation to the writer threads to reduce the response latency of 
read requests. I have tested the time cost of the cacheBlock function in 
BucketCache and attached the metrics screenshots.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-26299) Fix TestHTableTracing.testTableClose for nightly build of branch-2

2021-09-28 Thread Tak-Lon (Stephen) Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak-Lon (Stephen) Wu resolved HBASE-26299.
--
Fix Version/s: 2.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Fix TestHTableTracing.testTableClose for nightly build of branch-2
> --
>
> Key: HBASE-26299
> URL: https://issues.apache.org/jira/browse/HBASE-26299
> Project: HBase
>  Issue Type: Bug
>  Components: test, tracing
>Affects Versions: 2.5.0
>Reporter: Tak-Lon (Stephen) Wu
>Assignee: Tak-Lon (Stephen) Wu
>Priority: Major
> Fix For: 2.5.0
>
>
> After merging HBASE-26141, something isn't right with the last testTableClose 
> when we close the table and the connection; we need to figure out why it's not 
> working in the unit test with the JDK8 and Hadoop 3 profile.
> https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/351/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/
> https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/352/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/
> {code}
> [ERROR] org.apache.hadoop.hbase.client.TestHTableTracing.testTableClose  Time 
> elapsed: 0.001 s  <<< ERROR!
> java.lang.IllegalStateException: GlobalOpenTelemetry.set has already been 
> called. GlobalOpenTelemetry.set must be called only once before any calls to 
> GlobalOpenTelemetry.get. If you are using the OpenTelemetrySdk, use 
> OpenTelemetrySdkBuilder.buildAndRegisterGlobal instead. Previous invocation 
> set to cause of this exception.
>   at 
> io.opentelemetry.api.GlobalOpenTelemetry.set(GlobalOpenTelemetry.java:83)
>   at 
> io.opentelemetry.sdk.testing.junit4.OpenTelemetryRule.before(OpenTelemetryRule.java:95)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:50)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at 
> org.apache.hadoop.hbase.SystemExitRule$1.evaluate(SystemExitRule.java:38)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.Throwable
>   at 
> io.opentelemetry.api.GlobalOpenTelemetry.set(GlobalOpenTelemetry.java:91)
>   at 
> io.opentelemetry.api.GlobalOpenTelemetry.get(GlobalOpenTelemetry.java:61)
>   at 
> io.opentelemetry.api.GlobalOpenTelemetry.getTracer(GlobalOpenTelemetry.java:110)
>   at 
> org.apache.hadoop.hbase.trace.TraceUtil.getGlobalTracer(TraceUtil.java:71)
>   at org.apache.hadoop.hbase.trace.TraceUtil.createSpan(TraceUtil.java:95)
>   at org.apache.hadoop.hbase.trace.TraceUtil.createSpan(TraceUtil.java:78)
>   at 
> org.apache.hadoop.hbase.trace.TraceUtil.lambda$trace$1(TraceUtil.java:176)
>   at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:180)
>   at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:176)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.close(ConnectionImplementation.java:2110)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.finalize(ConnectionImplementation.java:2149)
>   at java.lang.System$2.invokeFinalize(System.java:1273)
>   at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:102)
>   at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
>   at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:217)
> {code}





[jira] [Resolved] (HBASE-26293) Use reservoir sampling when selecting bootstrap nodes

2021-09-28 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26293.
---
Fix Version/s: 3.0.0-alpha-2
   2.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to master and branch-2.

Thanks [~haxiaolin] for reviewing.

> Use reservoir sampling when selecting bootstrap nodes
> -
>
> Key: HBASE-26293
> URL: https://issues.apache.org/jira/browse/HBASE-26293
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, regionserver
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-2
>
>
> In the current implementation, we need to copy all the region servers out to 
> a new array list, shuffle the list, and then get the first several elements.
> We could use reservoir sampling to reduce the overhead here.
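For reference, the reservoir sampling approach mentioned above (classic Algorithm R) can be sketched like this. This is a generic illustration, not the actual HBase patch; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ReservoirSample {
  // Algorithm R: pick k elements uniformly at random from source in a single
  // pass, without copying and shuffling the whole list.
  static <T> List<T> sample(List<T> source, int k, Random rng) {
    List<T> reservoir = new ArrayList<>(k);
    for (int i = 0; i < source.size(); i++) {
      if (i < k) {
        reservoir.add(source.get(i));        // fill the reservoir first
      } else {
        int j = rng.nextInt(i + 1);          // uniform index in [0, i]
        if (j < k) {
          reservoir.set(j, source.get(i));   // keep element i with prob k/(i+1)
        }
      }
    }
    return reservoir;
  }

  public static void main(String[] args) {
    List<Integer> servers = new ArrayList<>();
    for (int i = 0; i < 1000; i++) servers.add(i);
    List<Integer> picked = sample(servers, 5, new Random());
    System.out.println("picked " + picked.size() + " of " + servers.size());
  }
}
```

Each element ends up in the result with equal probability k/n, and only the k-element reservoir is allocated, instead of an n-element copy that gets shuffled.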





[jira] [Created] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-09-28 Thread Bryan Beaudreault (Jira)
Bryan Beaudreault created HBASE-26304:
-

 Summary: Reflect out-of-band locality improvements in served 
requests
 Key: HBASE-26304
 URL: https://issues.apache.org/jira/browse/HBASE-26304
 Project: HBase
  Issue Type: Sub-task
Reporter: Bryan Beaudreault
Assignee: Bryan Beaudreault


Once the LocalityHealer has improved locality of a StoreFile (by moving blocks 
onto the correct host), the Reader's DFSInputStream and Region's localityIndex 
metric must be refreshed. Without refreshing the DFSInputStream, the improved 
locality will not improve latencies. In fact, the DFSInputStream may try to 
fetch blocks that have moved, resulting in a ReplicaNotFoundException. This is 
automatically retried, but the retries increase long-tail latencies according 
to the configured backoff strategy.

See https://issues.apache.org/jira/browse/HDFS-16155 for an improvement in 
backoff strategy which can greatly mitigate latency impact of the missing block 
retry.

Even with that mitigation, a StoreFile is often made up of many blocks. Without 
some sort of intervention, we will continue to hit ReplicaNotFoundException 
over time as clients naturally request data from moved blocks.

In the original LocalityHealer design, I created a new 
RefreshHDFSBlockDistribution RPC on the RegionServer. This RPC accepts a list 
of region names and, for each region store, re-opens the underlying StoreFile 
if the locality has changed.

I will submit a PR with that implementation, but I am also investigating other 
avenues. For example, I noticed 
https://issues.apache.org/jira/browse/HDFS-15119, which doesn't seem ideal but 
could perhaps be improved into automatic, lower-level handling of block moves.





[jira] [Created] (HBASE-26303) Use priority queue in dir scan pool of cleaner

2021-09-28 Thread Xiaolin Ha (Jira)
Xiaolin Ha created HBASE-26303:
--

 Summary: Use priority queue in dir scan pool of cleaner
 Key: HBASE-26303
 URL: https://issues.apache.org/jira/browse/HBASE-26303
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0, 3.0.0-alpha-1
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha


DirScanPool uses a normal LinkedBlockingQueue when creating its thread pool,
{code:java}
  private static ThreadPoolExecutor initializePool(int size) {
    return Threads.getBoundedCachedThreadPool(size, 1, TimeUnit.MINUTES,
      new ThreadFactoryBuilder().setNameFormat("dir-scan-pool-%d").setDaemon(true)
        .setUncaughtExceptionHandler(Threads.LOGGING_EXCEPTION_HANDLER).build());
  }
{code}
which does not scan larger directories first and delete their files as 
expected, even though CleanerChore#sortByConsumedSpace() sorts the directories 
before they are put into the queue.

Subdirectories of large directories and small directories are taken from the 
queue with equal priority.

We should use a priority queue here instead, e.g. PriorityBlockingQueue, so 
that larger directories get cleaned earlier.
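A minimal sketch of this change (the DirScanTask type and consumedSpace field are hypothetical names for illustration, not the actual patch): make the queued tasks comparable by the consumed space of their directory, and back the pool with a PriorityBlockingQueue so the largest directory is dequeued first.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PriorityScanSketch {
  // Orders scan tasks so the directory with the most consumed space runs first.
  static class DirScanTask implements Runnable, Comparable<DirScanTask> {
    final long consumedSpace;
    final Runnable work;
    DirScanTask(long consumedSpace, Runnable work) {
      this.consumedSpace = consumedSpace;
      this.work = work;
    }
    @Override public void run() { work.run(); }
    @Override public int compareTo(DirScanTask other) {
      // Descending order: larger directories first.
      return Long.compare(other.consumedSpace, this.consumedSpace);
    }
  }

  // Back the pool with a PriorityBlockingQueue instead of LinkedBlockingQueue.
  // Note: tasks must be handed to execute(), not submit(), because submit()
  // wraps them in a FutureTask that is not Comparable.
  static ThreadPoolExecutor initializePool(int size) {
    return new ThreadPoolExecutor(size, size, 1, TimeUnit.MINUTES,
        new PriorityBlockingQueue<>());
  }

  public static void main(String[] args) throws InterruptedException {
    PriorityBlockingQueue<DirScanTask> queue = new PriorityBlockingQueue<>();
    queue.add(new DirScanTask(1L, () -> {}));
    queue.add(new DirScanTask(100L, () -> {}));
    queue.add(new DirScanTask(10L, () -> {}));
    System.out.println(queue.take().consumedSpace); // largest first: 100
  }
}
```

Note that the ordering only applies to tasks still waiting in the queue; tasks already picked up by worker threads are unaffected.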


