[jira] [Created] (HBASE-26552) Introduce retry to logroller when encounters IOException

2021-12-08 Thread Xiaolin Ha (Jira)
Xiaolin Ha created HBASE-26552:
--

 Summary: Introduce retry to logroller when encounters IOException
 Key: HBASE-26552
 URL: https://issues.apache.org/jira/browse/HBASE-26552
 Project: HBase
  Issue Type: Improvement
  Components: wal
Affects Versions: 2.0.0, 3.0.0-alpha-1
Reporter: Xiaolin Ha
Assignee: Xiaolin Ha


When RollController#rollWal is called in AbstractWALRoller, the regionserver may 
abort if it encounters an exception:
{code:java}
...
} catch (FailedLogCloseException | ConnectException e) {
  abort("Failed log close in log roller", e);
} catch (IOException ex) {
  // Abort if we get here. We probably won't recover an IOE. HBASE-1132
  abort("IOE in log roller",
    ex instanceof RemoteException ? ((RemoteException) ex).unwrapRemoteException() : ex);
} catch (Exception ex) {
  LOG.error("Log rolling failed", ex);
  abort("Log rolling failed", ex);
}
{code}
I think we should support retrying rollWal here, so that the service does not 
have to be recovered by killing the regionserver. Restarting a regionserver is 
costly and hurts availability.

I found that when a new writer for the WAL is created in 
FanOutOneBlockAsyncDFSOutputHelper#createOutput, addBlock can be retried 
by setting the config "hbase.fs.async.create.retries".

But initializing the new WAL writer also includes flushing the write 
buffer and waiting for the flush to complete in 
AsyncProtobufLogWriter#writeMagicAndWALHeader, which can also fail for 
hardware reasons. The regionserver is connected to the datanodes after addBlock, 
but that does not mean the magic and header can be flushed successfully.
{code:java}
protected long writeMagicAndWALHeader(byte[] magic, WALHeader header) throws IOException {
  return write(future -> {
    output.write(magic);
    try {
      header.writeDelimitedTo(asyncOutputWrapper);
    } catch (IOException e) {
      // should not happen
      throw new AssertionError(e);
    }
    addListener(output.flush(false), (len, error) -> {
      if (error != null) {
        future.completeExceptionally(error);
      } else {
        future.complete(len);
      }
    });
  });
}
{code}
In our production clusters we have seen regionservers abort with 
"IOE in log roller". Our experience in these clusters is that just one more 
retry of rollWal lets the WAL roll complete and the regionserver continue 
serving.
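
To illustrate the kind of retry proposed above, here is a minimal, self-contained sketch. It is not the HBase implementation: the class WalRollRetrySketch, the RollAction interface, and the maxRetries and sleepMs parameters are hypothetical names chosen for this example. The only idea taken from the description is that an IOException during the roll is retried a bounded number of times, and the last exception is rethrown so the caller can still abort.

```java
import java.io.IOException;

/** Hypothetical sketch of a bounded retry; none of these names exist in HBase. */
class WalRollRetrySketch {

  /** Stand-in for the WAL roll operation, which may fail with an IOException. */
  interface RollAction {
    void roll() throws IOException;
  }

  /**
   * Runs the action once and retries it up to maxRetries more times on IOException.
   * Returns the number of attempts used. If every attempt fails, the last
   * IOException is rethrown so the caller can fall back to aborting.
   */
  static int rollWithRetry(RollAction action, int maxRetries, long sleepMs)
      throws IOException, InterruptedException {
    if (maxRetries < 0) {
      throw new IllegalArgumentException("maxRetries must be >= 0");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
      try {
        action.roll();
        return attempt;
      } catch (IOException e) {
        last = e;
        if (attempt <= maxRetries) {
          // Pause briefly before the next attempt (assumed fixed delay).
          Thread.sleep(sleepMs);
        }
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated roll: the first call fails, the second one succeeds.
    int attempts = rollWithRetry(() -> {
      if (calls[0]++ == 0) {
        throw new IOException("simulated flush failure");
      }
    }, 1, 10L);
    System.out.println("WAL roll succeeded after " + attempts + " attempt(s)");
  }
}
```

With maxRetries set to 1 this matches the "just one more retry" observation: a transient failure costs one extra attempt, while a persistent failure still surfaces as an IOException.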

 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HBASE-26551) Add FastPath feature to HBase RWQueueRpcExecutor

2021-12-08 Thread Yutong Xiao (Jira)
Yutong Xiao created HBASE-26551:
---

 Summary: Add FastPath feature to HBase RWQueueRpcExecutor
 Key: HBASE-26551
 URL: https://issues.apache.org/jira/browse/HBASE-26551
 Project: HBase
  Issue Type: Task
  Components: rpc
Reporter: Yutong Xiao
Assignee: Yutong Xiao








Re: [DISCUSS] In flight work to complete before 2.5.0RC0

2021-12-08 Thread Andrew Purtell
Thanks Duo.

I haven’t been around much lately but will be reviewing and progressing PRs for 
branch-2 / branch-2.5 more actively now. A committer or contributor looking for 
a review can at-mention me on a relevant JIRA to get my attention. That will 
drop a notification mail into my inbox. (GitHub emails go to /dev/null, there 
are too many.) 

> On Dec 8, 2021, at 6:09 PM, 张铎  wrote:
> 
> OpenTracing -> OpenTelemetry :)
> 
> For me, I think the OpenTelemetry part is a blocker, we must finish it
> before cutting an RC since the current implementation is already landed on
> branch-2.5 and it breaks some Otel best practises, so we should not release
> it out.
> 
> Now it is only Nick doing the work and Tak Lon Wu and I reviewing the PRs.
> And I also joined the CNCF slack channel and saw Nick is working hard in
> communication with the Otel community on how to better implement tracing in
> HBase, for example, how to trace big scans.
> I would encourage more people in our community to involve so we can make
> progress faster.
> 
> Thanks.
> 
> On Thu, Dec 9, 2021 at 10:02, Sean Busbey wrote:
> 
>> If we don't want to wait for HBASE-26543 (fix arg parsing for shell)
>> then we should revert HBASE-24772 from branch-2.5 prior to an RC.
>> 
>> 
>>> On Wed, Dec 8, 2021 at 7:34 PM Andrew Purtell  wrote:
>>> 
>>> As your branch-2.5 RM I am assembling a list of work items that should be
>>> completed before a 2.5.0RC0 candidate is submitted for the PMC's
>>> consideration.
>>> 
>>> I have so far:
>>> 
>>> - OpenTracing span naming convention and coverage improvements.
>>> 
>>> - Shell exit code fixes/improvements.
>>> 
>>> - The "encryption improvements umbrella". Arguable, but let's include it
>>> for now. Can all be resolved as Later if need be.
>>> 
>>> Let's discuss what else, if anything, should be on this list, or if one
>> or
>>> more of the above items does not constitute a release blocker. I consider
>>> incomplete work-in-progress a blocker. Obviously all of the work in
>>> progress should land before release. For WIP, let's also agree on a
>>> definition of done.
>>> 
>>> --
>>> Best regards,
>>> Andrew
>>> 
>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>> decrepit hands
>>>   - A23, Crosstalk
>> 


Re: [DISCUSS] In flight work to complete before 2.5.0RC0

2021-12-08 Thread Duo Zhang
OpenTracing -> OpenTelemetry :)

For me, the OpenTelemetry part is a blocker. We must finish it
before cutting an RC, since the current implementation has already landed on
branch-2.5 and it breaks some Otel best practices, so we should not release
it as is.

Right now only Nick is doing the work, with Tak Lon Wu and me reviewing the PRs.
I also joined the CNCF Slack channel and saw that Nick is working hard on
communicating with the Otel community about how to better implement tracing in
HBase, for example, how to trace big scans.
I would encourage more people in our community to get involved so we can make
progress faster.

Thanks.

On Thu, Dec 9, 2021 at 10:02, Sean Busbey wrote:

> If we don't want to wait for HBASE-26543 (fix arg parsing for shell)
> then we should revert HBASE-24772 from branch-2.5 prior to an RC.
>
>
> On Wed, Dec 8, 2021 at 7:34 PM Andrew Purtell  wrote:
> >
> > As your branch-2.5 RM I am assembling a list of work items that should be
> > completed before a 2.5.0RC0 candidate is submitted for the PMC's
> > consideration.
> >
> > I have so far:
> >
> > - OpenTracing span naming convention and coverage improvements.
> >
> > - Shell exit code fixes/improvements.
> >
> > - The "encryption improvements umbrella". Arguable, but let's include it
> > for now. Can all be resolved as Later if need be.
> >
> > Let's discuss what else, if anything, should be on this list, or if one
> or
> > more of the above items does not constitute a release blocker. I consider
> > incomplete work-in-progress a blocker. Obviously all of the work in
> > progress should land before release. For WIP, let's also agree on a
> > definition of done.
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >- A23, Crosstalk
>


Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

2021-12-08 Thread Andrew Purtell
+1 for merging to branch-2 (2.6)

> On Dec 8, 2021, at 6:04 PM, 张铎  wrote:
> 
> I think here we just want this to be backported to 2.x, not 2.5.x.
> 
> So thanks Andrew for the quick action.
> 
> +1 on merging HBASE-26067 to master and backporting to branch-2(2.6.0).
> 
> Thanks.
> 
> On Thu, Dec 9, 2021 at 08:45, Andrew Purtell wrote:
> 
>> I concur with Nick, but let me help here by branching 2.5 today. It was
>> always going to be somewhat arbitrary a point.
>> 
>>> On Wed, Dec 8, 2021 at 3:09 PM Nick Dimiduk  wrote:
>>> 
>>> Based solely on the comments made to this thread, I would recommend
>> against
>>> a merge to branch-2, given that we are very close to 2.5. The points
>> about
>>> existing gaps seem like things we're not ready to publish in the
>> impending
>>> minor release. Once we have a branch-2.5, this particular concern of mine
>>> will be alleviated.
>>> 
>>> Thanks,
>>> Nick
>>> 
>>>> On Wed, Dec 8, 2021 at 1:37 PM Josh Elser  wrote:
>>> 
>>>> I was going to wait for some other folks to chime in, but I guess I can
>>>> be the next one :)
>>>> 
>>>> Duo, Wellington, and Szabolcs have been doing some excellent work on
>> the
>>>> storefile tracking (SFT) to a degree that I never expected to see. I
>>>> remember some of the original "Filesystem re-do" issues on Jira. The
>>>> idea was exceptional, but the result seemed unreachable.
>>>> 
>>>> These devs, building on the success of what Zach/Stephen first talked
>>>> about in HBASE-24749, came up with what I think is an excellent step
>>>> forward. I've yet to break it via my own testing, but do acknowledge
>>>> that there's always more work to be done.
>>>> 
>>>> I think this is at a reasonable place to merge this back into the
>>>> "mainline" branches from the feature branch (HBASE-26067). I believe
>>>> this is ready because:
>>>> 
>>>> 1. The feature is completely opt-in (HBase works the same way by
>> default)
>>>> 2. There is API to migrate tables into the new SFT implementation
>>>> 3. There is also API to migrate tables back to the default
>> implementation
>>>> 
>>>> Some gaps still exist around bulk loading, documentation, snapshots,
>> and
>>>> recovery tooling, but these are being worked on. In the context of S3,
>>>> this makes a significantly more compelling offering of HBase by
>> removing
>>>> the complexity of HBOSS. For HBase in all installations, I think SFT
>>>> makes more a significantly more "deterministic" way of managing
>>>> regions/files.
>>>> 
>>>> +1 from me to merge HBASE-26067 into master and branch-2
>>>> 
>>>> - Josh
>>>> 
>>>> On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
>>>>> Hello everyone,
>>>>> 
>>>>> We have been making progress on the alternative way of tracking store
>>>> files
>>>>> originally proposed by Duo in HBASE-26067.
>>>>> 
>>>>> To briefly summarize it for those not following it, this feature
>>>> introduces
>>>>> an abstraction layer to track store files still used/needed by store
>>>>> engines, allowing for plugging different approaches of identifying
>>> store
>>>>> files required by the given store. The design doc describing it in
>> more
>>>>> detail is available here
>>>>> <
>>>> 
>>> 
>> https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s
>>>>> 
>>>>> .
>>>>> 
>>>>> Our main goal within this feature is to avoid the need for using temp
>>>> files
>>>>> and renames when creating new hfiles (whenever flushing, compacting,
>>>>> splitting/merging or snapshotting). This is made possible by the
>>>> pluggable
>>>>> tracker implementation labeled "FILE". The current behavior using
>> temp
>>>> dirs
>>>>> and renames would still be the default approach (labeled "DEFAULT").
>>>>> 
>>>>> This "renameless" approach is appealing for deployments using Amazon
>> S3
>>>>> Object store file system, where the lack of atomic rename operations
>>>>> imposed the necessity of an additional layer of locking (HBOSS),
>> which
>>>>> combined with the s3a rename operation can have a performance
>> overhead.
>>>>> 
>>>>> Some test runs on my employer infrastructure have shown promising
>>>> results.
>>>>> A pure insertion ycsb run has shown ~6% performance gain on the
>> client
>>>>> writes. Snapshot clone of hundreds of regions table completes in half
>>> of
>>>>> the time. There are also improvements in compaction, splits and
>> merges
>>>>> times.
>>>>> 
>>>>> Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we
>> feel
>>>>> optimistic that the current implementation is in a good state to get
>>>> merged
>>>>> into master branch, but it would be nice to hear other opinions about
>>> it,
>>>>> before we effectively commit it. Looking forward to hearing some
>>>>> thoughts/concerns you might have.
>>>>> 
>>>>> Kind regards,
>>>>> Wellington.
>>>>> 
>>>> 
>>> 
>> 
>> 
>> --
>> Best regards,
>> Andrew
>> 
>> Words like orphans lost among the crosstalk, meaning torn from truth's
>> decrepit hands
>>   - A23, Crosstalk
>> 


Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

2021-12-08 Thread Duo Zhang
I think here we just want this to be backported to 2.x, not 2.5.x.

So thanks Andrew for the quick action.

+1 on merging HBASE-26067 to master and backporting to branch-2(2.6.0).

Thanks.

On Thu, Dec 9, 2021 at 08:45, Andrew Purtell wrote:

> I concur with Nick, but let me help here by branching 2.5 today. It was
> always going to be somewhat arbitrary a point.
>
> On Wed, Dec 8, 2021 at 3:09 PM Nick Dimiduk  wrote:
>
> > Based solely on the comments made to this thread, I would recommend
> against
> > a merge to branch-2, given that we are very close to 2.5. The points
> about
> > existing gaps seem like things we're not ready to publish in the
> impending
> > minor release. Once we have a branch-2.5, this particular concern of mine
> > will be alleviated.
> >
> > Thanks,
> > Nick
> >
> > On Wed, Dec 8, 2021 at 1:37 PM Josh Elser  wrote:
> >
> > > I was going to wait for some other folks to chime in, but I guess I can
> > > be the next one :)
> > >
> > > Duo, Wellington, and Szabolcs have been doing some excellent work on
> the
> > > storefile tracking (SFT) to a degree that I never expected to see. I
> > > remember some of the original "Filesystem re-do" issues on Jira. The
> > > idea was exceptional, but the result seemed unreachable.
> > >
> > > These devs, building on the success of what Zach/Stephen first talked
> > > about in HBASE-24749, came up with what I think is an excellent step
> > > forward. I've yet to break it via my own testing, but do acknowledge
> > > that there's always more work to be done.
> > >
> > > I think this is at a reasonable place to merge this back into the
> > > "mainline" branches from the feature branch (HBASE-26067). I believe
> > > this is ready because:
> > >
> > > 1. The feature is completely opt-in (HBase works the same way by
> default)
> > > 2. There is API to migrate tables into the new SFT implementation
> > > 3. There is also API to migrate tables back to the default
> implementation
> > >
> > > Some gaps still exist around bulk loading, documentation, snapshots,
> and
> > > recovery tooling, but these are being worked on. In the context of S3,
> > > this makes a significantly more compelling offering of HBase by
> removing
> > > the complexity of HBOSS. For HBase in all installations, I think SFT
> > > makes more a significantly more "deterministic" way of managing
> > > regions/files.
> > >
> > > +1 from me to merge HBASE-26067 into master and branch-2
> > >
> > > - Josh
> > >
> > > On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
> > > > Hello everyone,
> > > >
> > > > We have been making progress on the alternative way of tracking store
> > > files
> > > > originally proposed by Duo in HBASE-26067.
> > > >
> > > > To briefly summarize it for those not following it, this feature
> > > introduces
> > > > an abstraction layer to track store files still used/needed by store
> > > > engines, allowing for plugging different approaches of identifying
> > store
> > > > files required by the given store. The design doc describing it in
> more
> > > > detail is available here
> > > > <
> > >
> >
> https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s
> > > >
> > > > .
> > > >
> > > > Our main goal within this feature is to avoid the need for using temp
> > > files
> > > > and renames when creating new hfiles (whenever flushing, compacting,
> > > > splitting/merging or snapshotting). This is made possible by the
> > > pluggable
> > > > tracker implementation labeled "FILE". The current behavior using
> temp
> > > dirs
> > > > and renames would still be the default approach (labeled "DEFAULT").
> > > >
> > > > This "renameless" approach is appealing for deployments using Amazon
> S3
> > > > Object store file system, where the lack of atomic rename operations
> > > > imposed the necessity of an additional layer of locking (HBOSS),
> which
> > > > combined with the s3a rename operation can have a performance
> overhead.
> > > >
> > > > Some test runs on my employer infrastructure have shown promising
> > > results.
> > > > A pure insertion ycsb run has shown ~6% performance gain on the
> client
> > > > writes. Snapshot clone of hundreds of regions table completes in half
> > of
> > > > the time. There are also improvements in compaction, splits and
> merges
> > > > times.
> > > >
> > > > Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we
> feel
> > > > optimistic that the current implementation is in a good state to get
> > > merged
> > > > into master branch, but it would be nice to hear other opinions about
> > it,
> > > > before we effectively commit it. Looking forward to hearing some
> > > > thoughts/concerns you might have.
> > > >
> > > > Kind regards,
> > > > Wellington.
> > > >
> > >
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk
>


Re: [DISCUSS] In flight work to complete before 2.5.0RC0

2021-12-08 Thread Sean Busbey
If we don't want to wait for HBASE-26543 (fix arg parsing for shell)
then we should revert HBASE-24772 from branch-2.5 prior to an RC.


On Wed, Dec 8, 2021 at 7:34 PM Andrew Purtell  wrote:
>
> As your branch-2.5 RM I am assembling a list of work items that should be
> completed before a 2.5.0RC0 candidate is submitted for the PMC's
> consideration.
>
> I have so far:
>
> - OpenTracing span naming convention and coverage improvements.
>
> - Shell exit code fixes/improvements.
>
> - The "encryption improvements umbrella". Arguable, but let's include it
> for now. Can all be resolved as Later if need be.
>
> Let's discuss what else, if anything, should be on this list, or if one or
> more of the above items does not constitute a release blocker. I consider
> incomplete work-in-progress a blocker. Obviously all of the work in
> progress should land before release. For WIP, let's also agree on a
> definition of done.
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk


[DISCUSS] In flight work to complete before 2.5.0RC0

2021-12-08 Thread Andrew Purtell
As your branch-2.5 RM I am assembling a list of work items that should be
completed before a 2.5.0RC0 candidate is submitted for the PMC's
consideration.

I have so far:

- OpenTracing span naming convention and coverage improvements.

- Shell exit code fixes/improvements.

- The "encryption improvements umbrella". Arguable, but let's include it
for now. Can all be resolved as Later if need be.

Let's discuss what else, if anything, should be on this list, or if one or
more of the above items does not constitute a release blocker. I consider
incomplete work-in-progress a blocker. Obviously all of the work in
progress should land before release. For WIP, let's also agree on a
definition of done.

-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


[jira] [Resolved] (HBASE-25799) add clusterReadRequests and clusterWriteRequests jmx

2021-12-08 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-25799.
-
Resolution: Fixed

> add clusterReadRequests and clusterWriteRequests jmx 
> -
>
> Key: HBASE-25799
> URL: https://issues.apache.org/jira/browse/HBASE-25799
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: xijiawen
>Assignee: xijiawen
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>






[jira] [Resolved] (HBASE-26445) Procedure state pretty-printing should use printable character friendly encoding

2021-12-08 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-26445.
-
Resolution: Later

> Procedure state pretty-printing should use printable character friendly 
> encoding
> 
>
> Key: HBASE-26445
> URL: https://issues.apache.org/jira/browse/HBASE-26445
> Project: HBase
>  Issue Type: Task
>  Components: Operability
>Affects Versions: 2.4.8
>Reporter: Andrew Kyle Purtell
>Priority: Minor
>
> The shell 'list_procedures' command produces output like:
> 
>  889 org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure 
> SUCCESS 2021-11-10 22:20:34 UTC 2021-11-10 22:20:35 UTC [{"state"=>[1, 2, 3, 
> 11, 4, 5, 6, 7, 8, 9, 10, 2147483648]}, {"regionId"=>"1636579678894", 
> "tableName"=>{"namespace"=>"ZGVmYXVsdA==", 
> "qualifier"=>"SW50ZWdyYXRpb25UZXN0TG9hZENvbW1vbkNyYXds"}, 
> "startKey"=>"dWsuY28uZ3Jhbml0ZXRyYW5zZm9ybWF0aW9ucy53d3d8L2dhbGxlcnkvdA==", 
> "endKey"=>"dXMuYmFuZHwvYmFuZC81OA==", "offline"=>false, "split"=>false, 
> "replicaId"=>0}, {"userInfo"=>{"effectiveUser"=>"apurtell"}, 
> "parentRegionInfo"=>{"regionId"=>"1636579678894", 
> "tableName"=>{"namespace"=>"ZGVmYXVsdA==", 
> "qualifier"=>"SW50ZWdyYXRpb25UZXN0TG9hZENvbW1vbkNyYXds"}, 
> "startKey"=>"dWsuY28uZ3Jhbml0ZXRyYW5zZm9ybWF0aW9ucy53d3d8L2dhbGxlcnkvdA==", 
> "endKey"=>"dXMuYmFuZHwvYmFuZC81OA==", "offline"=>false, "split"=>false, 
> "replicaId"=>0}, "childRegionInfo"=>[{"regionId"=>"1636582834759", 
> "tableName"=>{"namespace"=>"ZGVmYXVsdA==", 
> "qualifier"=>"SW50ZWdyYXRpb25UZXN0TG9hZENvbW1vbkNyYXds"}, 
> "startKey"=>"dWsuY28uZ3Jhbml0ZXRyYW5zZm9ybWF0aW9ucy53d3d8L2dhbGxlcnkvdA==", 
> "endKey"=>"dWsuY28uc2ltb25hbmRzY2h1c3Rlci53d3d8L2Jvb2tzL1RoZS1P", 
> "offline"=>false, "split"=>false, "replicaId"=>0}, 
> {"regionId"=>"1636582834759", "tableName"=>{"namespace"=>"ZGVmYXVsdA==", 
> "qualifier"=>"SW50ZWdyYXRpb25UZXN0TG9hZENvbW1vbkNyYXds"}, 
> "startKey"=>"dWsuY28uc2ltb25hbmRzY2h1c3Rlci53d3d8L2Jvb2tzL1RoZS1P", 
> "endKey"=>"dXMuYmFuZHwvYmFuZC81OA==", "offline"=>false, "split"=>false, 
> "replicaId"=>0}]}]
> 
> The base64 encoding of byte[] values offers poor usability. 
> Generally, table names etc are printable characters encoded in byte[]. Base64 
> encoding them totally obfuscates information that is important to see at a 
> glance. Even start keys and end keys might be printable characters.
> It would be better to use Bytes.toStringBinary or something else that shows 
> printable characters as-is while escaping others that would be invalid in 
> JSON formatting.
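
As an illustration of the difference described above, the following sketch decodes one of the Base64 values from the sample output and prints it in a form that keeps printable characters readable. It is not HBase code: the class PrintableBytesSketch and the helper toPrintable are hypothetical and only similar in spirit to Bytes.toStringBinary. The choice to also escape the double quote and the backslash is an assumption made here so the result is easy to embed in a JSON string.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical sketch; toPrintable is not the HBase Bytes.toStringBinary implementation. */
class PrintableBytesSketch {

  /**
   * Keeps printable ASCII characters as they are and writes every other byte,
   * plus the double quote and the backslash, as a \xNN hex escape.
   */
  static String toPrintable(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      int ch = b & 0xFF;
      if (ch >= 0x20 && ch <= 0x7E && ch != '"' && ch != '\\') {
        sb.append((char) ch);
      } else {
        sb.append(String.format("\\x%02X", ch));
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // "ZGVmYXVsdA==" is the namespace value shown in the list_procedures output.
    byte[] namespace = Base64.getDecoder().decode("ZGVmYXVsdA==");
    System.out.println(toPrintable(namespace)); // prints: default

    // Non-printable bytes are escaped instead of obscuring the whole value.
    byte[] rowKey = {'r', 'o', 'w', 0x00, (byte) 0xFF};
    System.out.println(toPrintable(rowKey)); // prints: row\x00\xFF

    // For comparison, the Base64 form of the same readable namespace.
    System.out.println(
        Base64.getEncoder().encodeToString("default".getBytes(StandardCharsets.UTF_8)));
  }
}
```

A JSON writer would still have to escape the backslashes in the \xNN sequences when emitting the string; the sketch only shows how readable values stay readable.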





[ANNOUNCE] branch-2.5 branched today

2021-12-08 Thread Andrew Purtell
Today we opened a new branch for releasing the 2.5 code line, branch-2.5.

branch-2 has been renumbered to 2.6.0-SNAPSHOT.

-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: Branching branch-2.5 soon

2021-12-08 Thread Andrew Purtell
The branch-2.5 branch was created today.

The branch-2.5 version is currently 2.5.0-SNAPSHOT. branch-2's version is
now 2.6.0-SNAPSHOT.

On Mon, Oct 11, 2021 at 10:28 AM Andrew Purtell  wrote:

> To give you a clearer sense of timeline, I am thinking we branch for 2.5
> at the end of this current month (October), so we can have a month to
> stabilize, then try a 2.5.0RC0 in December.
>
>
>
> On Thu, Oct 7, 2021 at 10:36 AM Andrew Purtell 
> wrote:
>
>> We are thinking about bringing backup back to branch-2 on HBASE-26301, so
>> it can receive more attention and ultimately be prepared for release.
>> Before we do that, because we have collected a few things in branch-2
>> already that motivate a 2.5 release, I plan to branch for 2.5 soon, as
>> branch-2.5, and begin stabilization efforts for a 2.5.0 release. Backup
>> would make a nice motivator for a future 2.6 release.
>>
>> I volunteer to RM the upcoming 2.5 release line and you can expect the
>> first release by the end of the year.
>>
>> Branch-2.4 will continue to see releasing on roughly a monthly cadence.
>>
>> Related, our Nick asked us to revive previous threads "[DISCUSS] Updating
>> the 'stable' pointer to 2.4.2" [0] and "[DISCUSS] EOL 2.3" [1]. It would be
>> great, if you are reading this note with interest, if you could help move
>> those discussions forward.
>>
>> [0]:
>>
>> https://lists.apache.org/thread.html/r1cc4528a6a35cd1b0d38398aa61ad642a368901795d6970544d0a0a9%40%3Cdev.hbase.apache.org%3E
>> [1]:
>>
>> https://lists.apache.org/thread.html/r1ed22b0223920ae68eb48d0208cbc1a673f4ddf96b548a16601dc471%40%3Cdev.hbase.apache.org%3E
>>
>> --
>> Best regards,
>> Andrew
>>
>> Words like orphans lost among the crosstalk, meaning torn from truth's
>> decrepit hands
>>- A23, Crosstalk
>>
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

2021-12-08 Thread Andrew Purtell
I concur with Nick, but let me help here by branching 2.5 today. It was
always going to be somewhat arbitrary a point.

On Wed, Dec 8, 2021 at 3:09 PM Nick Dimiduk  wrote:

> Based solely on the comments made to this thread, I would recommend against
> a merge to branch-2, given that we are very close to 2.5. The points about
> existing gaps seem like things we're not ready to publish in the impending
> minor release. Once we have a branch-2.5, this particular concern of mine
> will be alleviated.
>
> Thanks,
> Nick
>
> On Wed, Dec 8, 2021 at 1:37 PM Josh Elser  wrote:
>
> > I was going to wait for some other folks to chime in, but I guess I can
> > be the next one :)
> >
> > Duo, Wellington, and Szabolcs have been doing some excellent work on the
> > storefile tracking (SFT) to a degree that I never expected to see. I
> > remember some of the original "Filesystem re-do" issues on Jira. The
> > idea was exceptional, but the result seemed unreachable.
> >
> > These devs, building on the success of what Zach/Stephen first talked
> > about in HBASE-24749, came up with what I think is an excellent step
> > forward. I've yet to break it via my own testing, but do acknowledge
> > that there's always more work to be done.
> >
> > I think this is at a reasonable place to merge this back into the
> > "mainline" branches from the feature branch (HBASE-26067). I believe
> > this is ready because:
> >
> > 1. The feature is completely opt-in (HBase works the same way by default)
> > 2. There is API to migrate tables into the new SFT implementation
> > 3. There is also API to migrate tables back to the default implementation
> >
> > Some gaps still exist around bulk loading, documentation, snapshots, and
> > recovery tooling, but these are being worked on. In the context of S3,
> > this makes a significantly more compelling offering of HBase by removing
> > the complexity of HBOSS. For HBase in all installations, I think SFT
> > makes more a significantly more "deterministic" way of managing
> > regions/files.
> >
> > +1 from me to merge HBASE-26067 into master and branch-2
> >
> > - Josh
> >
> > On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
> > > Hello everyone,
> > >
> > > We have been making progress on the alternative way of tracking store
> > files
> > > originally proposed by Duo in HBASE-26067.
> > >
> > > To briefly summarize it for those not following it, this feature
> > introduces
> > > an abstraction layer to track store files still used/needed by store
> > > engines, allowing for plugging different approaches of identifying
> store
> > > files required by the given store. The design doc describing it in more
> > > detail is available here
> > > <
> >
> https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s
> > >
> > > .
> > >
> > > Our main goal within this feature is to avoid the need for using temp
> > files
> > > and renames when creating new hfiles (whenever flushing, compacting,
> > > splitting/merging or snapshotting). This is made possible by the
> > pluggable
> > > tracker implementation labeled "FILE". The current behavior using temp
> > dirs
> > > and renames would still be the default approach (labeled "DEFAULT").
> > >
> > > This "renameless" approach is appealing for deployments using Amazon S3
> > > Object store file system, where the lack of atomic rename operations
> > > imposed the necessity of an additional layer of locking (HBOSS), which
> > > combined with the s3a rename operation can have a performance overhead.
> > >
> > > Some test runs on my employer infrastructure have shown promising
> > results.
> > > A pure insertion ycsb run has shown ~6% performance gain on the client
> > > writes. Snapshot clone of hundreds of regions table completes in half
> of
> > > the time. There are also improvements in compaction, splits and merges
> > > times.
> > >
> > > Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we feel
> > > optimistic that the current implementation is in a good state to get
> > merged
> > > into master branch, but it would be nice to hear other opinions about
> it,
> > > before we effectively commit it. Looking forward to hearing some
> > > thoughts/concerns you might have.
> > >
> > > Kind regards,
> > > Wellington.
> > >
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


[jira] [Created] (HBASE-26550) NPE if balance request comes in before master is initialized

2021-12-08 Thread Josh Elser (Jira)
Josh Elser created HBASE-26550:
--

 Summary: NPE if balance request comes in before master is 
initialized
 Key: HBASE-26550
 URL: https://issues.apache.org/jira/browse/HBASE-26550
 Project: HBase
  Issue Type: Bug
  Components: Balancer, master
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 3.0.0-alpha-2


Noticed this in a unit test from [https://github.com/apache/hbase/pull/3851]

I believe this is a result of the new balance() implementation in the Master, 
and a client submitting a request to the master before it's completed its 
instantiation. Simple fix to avoid the NPE.





[jira] [Created] (HBASE-26549) hbaseprotoc plugin should initialize maven

2021-12-08 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-26549:


 Summary: hbaseprotoc plugin should initialize maven
 Key: HBASE-26549
 URL: https://issues.apache.org/jira/browse/HBASE-26549
 Project: HBase
  Issue Type: Task
  Components: jenkins
Affects Versions: 3.0.0-alpha-1
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk


My PR testing HBASE-26542 fails because the mvn incantation does not find other 
project module dependencies. Looking over our personality file and the Yetus 
documentation, 
[precommit-advanced|https://yetus.apache.org/documentation/0.12.0/precommit-advanced/]
 says:

{quote}
NOTE: If the plug-in has support for maven, the maven_add_install pluginname 
should be executed. See more information in Custom Maven Tests in the build 
tool documentation.
{quote}





Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

2021-12-08 Thread Nick Dimiduk
Based solely on the comments made to this thread, I would recommend against
a merge to branch-2, given that we are very close to 2.5. The points about
existing gaps seem like things we're not ready to publish in the impending
minor release. Once we have a branch-2.5, this particular concern of mine
will be alleviated.

Thanks,
Nick

On Wed, Dec 8, 2021 at 1:37 PM Josh Elser  wrote:

> I was going to wait for some other folks to chime in, but I guess I can
> be the next one :)
>
> Duo, Wellington, and Szabolcs have been doing some excellent work on the
> storefile tracking (SFT) to a degree that I never expected to see. I
> remember some of the original "Filesystem re-do" issues on Jira. The
> idea was exceptional, but the result seemed unreachable.
>
> These devs, building on the success of what Zach/Stephen first talked
> about in HBASE-24749, came up with what I think is an excellent step
> forward. I've yet to break it via my own testing, but do acknowledge
> that there's always more work to be done.
>
> I think this is at a reasonable place to merge this back into the
> "mainline" branches from the feature branch (HBASE-26067). I believe
> this is ready because:
>
> 1. The feature is completely opt-in (HBase works the same way by default)
> 2. There is an API to migrate tables into the new SFT implementation
> 3. There is also an API to migrate tables back to the default implementation
>
> Some gaps still exist around bulk loading, documentation, snapshots, and
> recovery tooling, but these are being worked on. In the context of S3,
> this makes a significantly more compelling offering of HBase by removing
> the complexity of HBOSS. For HBase in all installations, I think SFT
> makes for a significantly more "deterministic" way of managing
> regions/files.
>
> +1 from me to merge HBASE-26067 into master and branch-2
>
> - Josh
>
> On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
> > Hello everyone,
> >
> > We have been making progress on the alternative way of tracking store
> > files originally proposed by Duo in HBASE-26067.
> >
> > To briefly summarize it for those not following it, this feature
> > introduces an abstraction layer to track store files still used/needed
> > by store engines, allowing for plugging different approaches of
> > identifying store files required by the given store. The design doc
> > describing it in more detail is available here
> > <https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>.
> >
> > Our main goal within this feature is to avoid the need for using temp
> > files and renames when creating new hfiles (whenever flushing,
> > compacting, splitting/merging or snapshotting). This is made possible by
> > the pluggable tracker implementation labeled "FILE". The current behavior
> > using temp dirs and renames would still be the default approach (labeled
> > "DEFAULT").
> >
> > This "renameless" approach is appealing for deployments using Amazon S3
> > Object store file system, where the lack of atomic rename operations
> > imposed the necessity of an additional layer of locking (HBOSS), which,
> > combined with the s3a rename operation, can have a performance overhead.
> >
> > Some test runs on my employer's infrastructure have shown promising
> > results. A pure insertion ycsb run has shown ~6% performance gain on the
> > client writes. A snapshot clone of a table with hundreds of regions
> > completes in half the time. There are also improvements in compaction,
> > split, and merge times.
> >
> > Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we feel
> > optimistic that the current implementation is in a good state to get
> > merged into master branch, but it would be nice to hear other opinions
> > about it before we effectively commit it. Looking forward to hearing
> > some thoughts/concerns you might have.
> >
> > Kind regards,
> > Wellington.
> >
>


Re: [DISCUSS] Merge HBASE-26067 branch into master, and backport it to base 2 branches

2021-12-08 Thread Josh Elser
I was going to wait for some other folks to chime in, but I guess I can 
be the next one :)


Duo, Wellington, and Szabolcs have been doing some excellent work on the 
storefile tracking (SFT) to a degree that I never expected to see. I 
remember some of the original "Filesystem re-do" issues on Jira. The 
idea was exceptional, but the result seemed unreachable.


These devs, building on the success of what Zach/Stephen first talked 
about in HBASE-24749, came up with what I think is an excellent step 
forward. I've yet to break it via my own testing, but do acknowledge 
that there's always more work to be done.


I think this is at a reasonable place to merge this back into the 
"mainline" branches from the feature branch (HBASE-26067). I believe 
this is ready because:


1. The feature is completely opt-in (HBase works the same way by default)
2. There is an API to migrate tables into the new SFT implementation
3. There is also an API to migrate tables back to the default implementation

Some gaps still exist around bulk loading, documentation, snapshots, and 
recovery tooling, but these are being worked on. In the context of S3, 
this makes a significantly more compelling offering of HBase by removing 
the complexity of HBOSS. For HBase in all installations, I think SFT 
makes for a significantly more "deterministic" way of managing 
regions/files.


+1 from me to merge HBASE-26067 into master and branch-2

- Josh

On 12/7/21 10:31 AM, Wellington Chevreuil wrote:

Hello everyone,

We have been making progress on the alternative way of tracking store files
originally proposed by Duo in HBASE-26067.

To briefly summarize it for those not following it, this feature introduces
an abstraction layer to track store files still used/needed by store
engines, allowing for plugging different approaches of identifying store
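files required by the given store. The design doc describing it in more
detail is available here
<https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>.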

Our main goal within this feature is to avoid the need for using temp files
and renames when creating new hfiles (whenever flushing, compacting,
splitting/merging or snapshotting). This is made possible by the pluggable
tracker implementation labeled "FILE". The current behavior using temp dirs
and renames would still be the default approach (labeled "DEFAULT").

This "renameless" approach is appealing for deployments using Amazon S3
Object store file system, where the lack of atomic rename operations
imposed the necessity of an additional layer of locking (HBOSS), which,
combined with the s3a rename operation, can have a performance overhead.

Some test runs on my employer's infrastructure have shown promising results.
A pure insertion ycsb run has shown ~6% performance gain on the client
writes. A snapshot clone of a table with hundreds of regions completes in
half the time. There are also improvements in compaction, split, and merge
times.

Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we feel
optimistic that the current implementation is in a good state to get merged
into master branch, but it would be nice to hear other opinions about it,
before we effectively commit it. Looking forward to hearing some
thoughts/concerns you might have.

Kind regards,
Wellington.



[jira] [Created] (HBASE-26548) Investigate mTLS in RPC layer

2021-12-08 Thread Bryan Beaudreault (Jira)
Bryan Beaudreault created HBASE-26548:
-

 Summary: Investigate mTLS in RPC layer
 Key: HBASE-26548
 URL: https://issues.apache.org/jira/browse/HBASE-26548
 Project: HBase
  Issue Type: New Feature
Reporter: Bryan Beaudreault


Current authentication options are heavily based on SASL and Kerberos. For 
organizations that don't already deploy Kerberos or another token provider, this 
is a heavy lift. Another very common way of authenticating in the industry is 
mTLS, which makes use of SSL certificates and can solve both wire encryption 
and auth. For those already deploying trusted certificates in their infra, mTLS 
may be much easier to integrate.

It isn't necessarily easy to implement this, but I do think we could use 
existing Netty SSL support in the NettyRpcClient and NettyRpcServer. I know 
it's easy to add SSL to non-blocking IO through a 
hadoop.rpc.socket.factory.class.default which returns SSLSockets, but that 
doesn't touch on certificate verification at all.
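
As a rough illustration of the "mutual" part, here is a minimal sketch using only the JDK's javax.net.ssl API. It is not HBase or Netty code: the class and method names (MtlsEngineSketch, newServerEngine) are hypothetical, and it assumes the default SSLContext, whereas a real deployment would build the context from its own keystore and truststore. The point is only that a server-side SSLEngine has to require a client certificate for the handshake to authenticate both peers.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical sketch, not part of HBase: shows the JDK settings that turn
// one-way TLS into mutual TLS on the server side.
public class MtlsEngineSketch {

  // Builds a server-mode engine that fails the handshake unless the client
  // presents a certificate trusted by the context's trust managers.
  static SSLEngine newServerEngine(SSLContext context) {
    SSLEngine engine = context.createSSLEngine();
    engine.setUseClientMode(false); // act as the server side of the handshake
    engine.setNeedClientAuth(true); // require (not just request) a client cert
    return engine;
  }

  public static void main(String[] args) throws NoSuchAlgorithmException {
    // Assumption: the JVM default context stands in for a real one here.
    SSLEngine engine = newServerEngine(SSLContext.getDefault());
    System.out.println("needClientAuth=" + engine.getNeedClientAuth());
  }
}
```

Presumably an engine configured this way could be wrapped in Netty's SslHandler and added to the RPC server pipeline, but how that would fit into NettyRpcServer is exactly the part that still needs investigation.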

Much more investigation is needed, but I'm logging this because of some 
interest expressed on Slack.





New dedicated Jenkins Controller

2021-12-08 Thread Gavin McDonald
Hi HBase devs.

This is to let you know that Infra has created a dedicated Jenkins
Controller at https://ci-hbase.apache.org .

You have new incoming donated agents which will get connected to this new
controller by Infra.

You also have 10 or so agents connected to https://ci-hadoop.apache.org - I
would like to move these agents - and all of your jobs - over to the new
controller as soon as possible.

You may or may not need to reconfigure your jobs and/or Jenkinsfile or
other settings.

Can I get the go-ahead to move these at my convenience, or could you please
suggest a timeline for Infra to move these agents?

Thanks!

-- 

*Gavin McDonald*
Systems Administrator
ASF Infrastructure Team