Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts

2022-08-16 Thread Duo Zhang
Ah, the good news is that, for branch-2, I tried building it locally
with the hadoop 2 profile (the hadoop version is 2.10.2), then removed
all the hadoop 2.10.2 jars from the binary and copied all the hadoop
jars from hadoop 3.3.4 there. Starting a mini cluster is fine, hbase
shell is fine, the LTT tool is fine, and scanning the LTT result in the
shell is also fine.
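
The jar swap described above could be scripted roughly as follows. This
is only a sketch under assumptions: the function name, the
hadoop-*-<version>.jar naming pattern, and the idea of searching the
whole Hadoop distribution directory for replacement jars are
illustrative and not taken from the HBase build. It does not touch any
third-party jars that Hadoop itself depends on.

```python
import shutil
from pathlib import Path


def swap_hadoop_jars(hbase_lib, hadoop_home, old_version, new_version):
    """Replace the hadoop-* jars in hbase_lib with those found under hadoop_home.

    Returns (removed, copied) as sorted lists of jar file names.
    """
    hbase_lib, hadoop_home = Path(hbase_lib), Path(hadoop_home)

    # Remove every bundled Hadoop jar of the old version, for example
    # hadoop-common-2.10.2.jar or hadoop-hdfs-2.10.2-tests.jar.
    removed = sorted(p.name for p in hbase_lib.glob(f"hadoop-*-{old_version}*.jar"))
    for name in removed:
        (hbase_lib / name).unlink()

    # Copy in every Hadoop jar of the new version. Searching the whole
    # distribution tree is an assumption about where the jars live.
    copied = set()
    for jar in hadoop_home.rglob(f"hadoop-*-{new_version}.jar"):
        shutil.copy2(jar, hbase_lib / jar.name)
        copied.add(jar.name)
    return removed, sorted(copied)
```

Something like swap_hadoop_jars("hbase/lib", "hadoop-3.3.4", "2.10.2",
"3.3.4") would then mirror the manual steps, with both paths being
placeholders for wherever the two distributions were unpacked.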

So maybe upgrading to 2.10.2 is enough for hbase to maintain drop-in
replacement.

You could try running the phoenix IT tests to see if it works.

Thanks.

张铎(Duo Zhang) wrote on Wed, Aug 17, 2022 at 10:31:
>
> In general, it is fine to set up an hbase cluster which ships with
> the hadoop 2.x client library against a hadoop 3.x hdfs and yarn
> cluster. The wire communication is still compatible. That's why we
> still keep the client library at 2.x and do not do what we did back in
> 0.98, when we published two binaries, one for hadoop1 and one for
> hadoop2.
>
> So I do not understand why 'no end-users with Hadoop 3 clusters will
> be able to use the Apache-distributed binaries'.
>
> As for the Phoenix IT tests, that is a problem, because HBase does not
> support drop-in replacement of the hadoop libraries. There are several
> possible directions:
> 1. Support drop-in replacement.
> 2. Publish two binaries, for hadoop2 and hadoop3, and also publish two
> sets of maven dependencies, for hadoop2 and hadoop3.
> 3. Only publish hadoop3 binaries and maven dependencies.
>
> The problems with these directions:
> 1. Not sure if this is easy to implement...
> 2. Will increase the complexity for users who just want to use hbase.
> 3. May have compatibility issues when using the hadoop3 libraries
> against a hadoop 2 cluster.
>
> Thanks.
>
> Geoffrey Jacoby wrote on Wed, Aug 17, 2022 at 03:30:
> >
> > I see that the next HBase 2.5 RC is imminent, and before that's set in
> > stone, I wanted to bring up the question of whether there will be official
> > HBase 2.5 binaries built with the Hadoop 3 profile and available in the
> > usual Maven repositories. (In addition to the usual Hadoop 2 profile
> > binaries)
> >
> > The HBase 2.x line has a commitment to maintain support for Hadoop 2.x, but
> > Hadoop 3.3 is the current stable Hadoop line and the most recent release
> > notes [1] encourage all users of Hadoop 2.x to upgrade to Hadoop 3.
> >
> > Without convenience artifacts built against Hadoop 3, no end-users with
> > Hadoop 3 clusters will be able to use the Apache-distributed binaries and
> > will instead have to recompile HBase from source themselves, or use a 3rd
> > party distribution that does so for them.
> >
> > This is especially inconvenient for downstream projects such as Apache
> > Phoenix, which has never officially supported the HBase 2.x / Hadoop 2.10
> > combination. (It currently supports only HBase 2.3 or 2.4 with Hadoop 3.
> > HBase 2.5 support will be added very shortly after its release as part of
> > Phoenix 5.2.)
> >
> > To even run the Phoenix IT tests locally requires contributors to download
> > the HBase source release and manually mvn install to their local maven repo
> > using the Hadoop 3 profile, to avoid crashes in the HBase minicluster.[2]
> > This is a barrier to new contributors and confuses even veteran ones, and
> > has to be done again for every new HBase release.
> >
> > In general, I expect the Hadoop 3 user base to grow and the Hadoop 2.10
> > user base to shrink with every future HBase 2 release, so I think this is a
> > worthwhile improvement.
> >
> > Thanks,
> >
> > Geoffrey
> >
> > [1] https://hadoop.apache.org/release/3.3.4.html
> > [2] https://github.com/apache/phoenix/blob/master/BUILDING.md


Re: Time for hbase-2.4.14 release

2022-08-16 Thread Duo Zhang
Thanks for taking care of this!

Huaxiang Sun wrote on Wed, Aug 17, 2022 at 02:33:
>
> Hi Folks,
>
> There have been quite a few critical fixes since 2.4.13, especially for
> memory leaks in the SASL implementation and ByteBuffAllocator. Since the
> 2.5.0 release is being worked on, note that there is an issue with rolling
> back from 2.5.0 to 2.4.*, which is fixed in the coming 2.4.14 release.
> 2.4.13 was released on 7/1/22, so it is about time for a 2.4.14 release.
>
> Andrew is on PTO for two weeks. I volunteer to run this release unless
> someone else wants to do it.
>
> Please let me know if there are any concerns.
>
> Best Regards,
> Huaxiang


Re: [DISCUSS] HBase 2.5 / Hadoop 3 artifacts

2022-08-16 Thread Duo Zhang
In general, it is fine to set up an hbase cluster which ships with
the hadoop 2.x client library against a hadoop 3.x hdfs and yarn
cluster. The wire communication is still compatible. That's why we
still keep the client library at 2.x and do not do what we did back in
0.98, when we published two binaries, one for hadoop1 and one for
hadoop2.

So I do not understand why 'no end-users with Hadoop 3 clusters will
be able to use the Apache-distributed binaries'.

As for the Phoenix IT tests, that is a problem, because HBase does not
support drop-in replacement of the hadoop libraries. There are several
possible directions:
1. Support drop-in replacement.
2. Publish two binaries, for hadoop2 and hadoop3, and also publish two
sets of maven dependencies, for hadoop2 and hadoop3.
3. Only publish hadoop3 binaries and maven dependencies.

The problems with these directions:
1. Not sure if this is easy to implement...
2. Will increase the complexity for users who just want to use hbase.
3. May have compatibility issues when using the hadoop3 libraries
against a hadoop 2 cluster.

Thanks.

Geoffrey Jacoby wrote on Wed, Aug 17, 2022 at 03:30:
>
> I see that the next HBase 2.5 RC is imminent, and before that's set in
> stone, I wanted to bring up the question of whether there will be official
> HBase 2.5 binaries built with the Hadoop 3 profile and available in the
> usual Maven repositories. (In addition to the usual Hadoop 2 profile
> binaries)
>
> The HBase 2.x line has a commitment to maintain support for Hadoop 2.x, but
> Hadoop 3.3 is the current stable Hadoop line and the most recent release
> notes [1] encourage all users of Hadoop 2.x to upgrade to Hadoop 3.
>
> Without convenience artifacts built against Hadoop 3, no end-users with
> Hadoop 3 clusters will be able to use the Apache-distributed binaries and
> will instead have to recompile HBase from source themselves, or use a 3rd
> party distribution that does so for them.
>
> This is especially inconvenient for downstream projects such as Apache
> Phoenix, which has never officially supported the HBase 2.x / Hadoop 2.10
> combination. (It currently supports only HBase 2.3 or 2.4 with Hadoop 3.
> HBase 2.5 support will be added very shortly after its release as part of
> Phoenix 5.2.)
>
> To even run the Phoenix IT tests locally requires contributors to download
> the HBase source release and manually mvn install to their local maven repo
> using the Hadoop 3 profile, to avoid crashes in the HBase minicluster.[2]
> This is a barrier to new contributors and confuses even veteran ones, and
> has to be done again for every new HBase release.
>
> In general, I expect the Hadoop 3 user base to grow and the Hadoop 2.10
> user base to shrink with every future HBase 2 release, so I think this is a
> worthwhile improvement.
>
> Thanks,
>
> Geoffrey
>
> [1] https://hadoop.apache.org/release/3.3.4.html
> [2] https://github.com/apache/phoenix/blob/master/BUILDING.md


[DISCUSS] HBase 2.5 / Hadoop 3 artifacts

2022-08-16 Thread Geoffrey Jacoby
I see that the next HBase 2.5 RC is imminent, and before that's set in
stone, I wanted to bring up the question of whether there will be official
HBase 2.5 binaries built with the Hadoop 3 profile and available in the
usual Maven repositories. (In addition to the usual Hadoop 2 profile
binaries)

The HBase 2.x line has a commitment to maintain support for Hadoop 2.x, but
Hadoop 3.3 is the current stable Hadoop line and the most recent release
notes [1] encourage all users of Hadoop 2.x to upgrade to Hadoop 3.

Without convenience artifacts built against Hadoop 3, no end-users with
Hadoop 3 clusters will be able to use the Apache-distributed binaries and
will instead have to recompile HBase from source themselves, or use a 3rd
party distribution that does so for them.

This is especially inconvenient for downstream projects such as Apache
Phoenix, which has never officially supported the HBase 2.x / Hadoop 2.10
combination. (It currently supports only HBase 2.3 or 2.4 with Hadoop 3.
HBase 2.5 support will be added very shortly after its release as part of
Phoenix 5.2.)

To even run the Phoenix IT tests locally requires contributors to download
the HBase source release and manually mvn install to their local maven repo
using the Hadoop 3 profile, to avoid crashes in the HBase minicluster.[2]
This is a barrier to new contributors and confuses even veteran ones, and
has to be done again for every new HBase release.

In general, I expect the Hadoop 3 user base to grow and the Hadoop 2.10
user base to shrink with every future HBase 2 release, so I think this is a
worthwhile improvement.

Thanks,

Geoffrey

[1] https://hadoop.apache.org/release/3.3.4.html
[2] https://github.com/apache/phoenix/blob/master/BUILDING.md


Time for hbase-2.4.14 release

2022-08-16 Thread Huaxiang Sun
Hi Folks,

There have been quite a few critical fixes since 2.4.13, especially for
memory leaks in the SASL implementation and ByteBuffAllocator. Since the
2.5.0 release is being worked on, note that there is an issue with rolling
back from 2.5.0 to 2.4.*, which is fixed in the coming 2.4.14 release.
2.4.13 was released on 7/1/22, so it is about time for a 2.4.14 release.

Andrew is on PTO for two weeks. I volunteer to run this release unless
someone else wants to do it.

Please let me know if there are any concerns.

Best Regards,
Huaxiang


[jira] [Resolved] (HBASE-27294) Add new hadoop releases in our hadoop checks

2022-08-16 Thread Duo Zhang (Jira)


 [ https://issues.apache.org/jira/browse/HBASE-27294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-27294.
-------------------------------
Fix Version/s: 2.5.0
               3.0.0-alpha-4
               2.4.14
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to branch-2.4+.

Thanks [~ndimiduk] for reviewing!

> Add new hadoop releases in our hadoop checks
> --------------------------------------------
>
>                 Key: HBASE-27294
>                 URL: https://issues.apache.org/jira/browse/HBASE-27294
>             Project: HBase
>          Issue Type: Task
>          Components: scripts
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.5.0, 3.0.0-alpha-4, 2.4.14
>
>
> 3.1.3, 3.1.4
> 3.2.3, 3.2.4
> 3.3.2, 3.3.3, 3.3.4



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-27279) Make SslHandler work with SaslWrapHandler/SaslUnwrapHandler

2022-08-16 Thread Duo Zhang (Jira)


 [ https://issues.apache.org/jira/browse/HBASE-27279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-27279.
-------------------------------
Fix Version/s: 2.6.0
               3.0.0-alpha-4
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to master and branch-2.

Thanks [~andor] and [~bbeaudreault]!

> Make SslHandler work with SaslWrapHandler/SaslUnwrapHandler
> -----------------------------------------------------------
>
>                 Key: HBASE-27279
>                 URL: https://issues.apache.org/jira/browse/HBASE-27279
>             Project: HBase
>          Issue Type: Improvement
>          Components: IPC/RPC, security
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-alpha-4
>
>
> Though this is not recommended, as SslHandler has already done wrap/unwrap, it
> is still better to make it work (maybe we could add some warning logs to say
> this is unnecessary?) than to fail with a strange error message.





[jira] [Created] (HBASE-27305) add an option to skip file splitting when bulkload hfiles

2022-08-16 Thread ruanhui (Jira)
ruanhui created HBASE-27305:
----------------------------

             Summary: add an option to skip file splitting when bulkload hfiles
                 Key: HBASE-27305
                 URL: https://issues.apache.org/jira/browse/HBASE-27305
             Project: HBase
          Issue Type: Improvement
          Components: tooling
    Affects Versions: 3.0.0-alpha-3
            Reporter: ruanhui
            Assignee: ruanhui
             Fix For: 3.0.0-alpha-4


When bulk loading hfiles, if the key range of an hfile does not match the key
range of a region, the BulkLoadHFilesTool will split the hfile so that the key
range of each new file matches the key range of a region. If there are many
files to be split, the load on the BulkLoadHFilesTool will be very high.
Sometimes we want to avoid this situation and just fail directly and regenerate
new hfiles. Here we try to introduce a new option: when the above problem is
encountered, an exception will be thrown and the upper-level client is left to
handle it.
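
The proposed behaviour can be sketched as a small decision function.
This is a simplified model, not the BulkLoadHFilesTool API: the names
plan_bulk_load, HFileSplitRequired and fail_on_split are hypothetical,
and regions are modelled only by their sorted start keys, with the
first region starting at the empty key.

```python
from bisect import bisect_right


class HFileSplitRequired(Exception):
    """Raised when an hfile spans several regions and splitting is disabled."""


def plan_bulk_load(region_start_keys, first_key, last_key, fail_on_split=False):
    """Decide what to do with one hfile covering [first_key, last_key].

    region_start_keys must be sorted and begin with b"". Region i covers
    keys from region_start_keys[i] up to, but not including, the next
    start key.
    """
    # Index of the region that contains each end of the hfile's key range.
    first_region = bisect_right(region_start_keys, first_key) - 1
    last_region = bisect_right(region_start_keys, last_key) - 1

    if first_region == last_region:
        # The whole file fits in one region, so it can be loaded as-is.
        return ("load", [first_region])
    if fail_on_split:
        # Hypothetical new option: refuse to split and let the caller
        # regenerate the hfiles instead.
        raise HFileSplitRequired(
            f"hfile spans regions {first_region} to {last_region}")
    # Existing behaviour: the file would be split along region boundaries.
    return ("split", list(range(first_region, last_region + 1)))
```

With region start keys [b"", b"g", b"n"], an hfile covering b"a" to
b"f" is loaded directly, while one covering b"e" to b"h" is either
split across the first two regions or rejected with an exception,
depending on the option.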





[jira] [Resolved] (HBASE-26907) Update Hadoop3 versions for JEP 223 compliance

2022-08-16 Thread Nick Dimiduk (Jira)


 [ https://issues.apache.org/jira/browse/HBASE-26907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk resolved HBASE-26907.
----------------------------------
  Assignee: (was: Nick Dimiduk)
Resolution: Won't Fix

> Update Hadoop3 versions for JEP 223 compliance
> ----------------------------------------------
>
>                 Key: HBASE-26907
>                 URL: https://issues.apache.org/jira/browse/HBASE-26907
>             Project: HBase
>          Issue Type: Task
>          Components: build
>    Affects Versions: 2.5.0, 3.0.0-alpha-3, 2.4.12
>            Reporter: Nick Dimiduk
>            Priority: Major
>
> It happened that my JDK version upgraded to 11.0.14.1. Running unit tests 
> involving the HDFS mini cluster now fails with a stack trace that ends with
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Invalid Java version 11.0.14.1
>         at org.eclipse.jetty.util.JavaVersion.parseJDK9(JavaVersion.java:71)
>         at org.eclipse.jetty.util.JavaVersion.parse(JavaVersion.java:49)
>         at org.eclipse.jetty.util.JavaVersion.(JavaVersion.java:43)
> {noformat}
> We are using hadoop-3.2.0, which uses jetty-9.3.24. This is a Jetty issue that
> has been fixed upstream via
> https://github.com/eclipse/jetty.project/issues/2090. Hadoop upgraded its
> Jetty version to 9.4.20 in HADOOP-16152, which is available as of
> hadoop-3.2.2.
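
To illustrate why a version string such as 11.0.14.1 can trip up a
parser, here is a small sketch comparing two patterns. The VNUM pattern
follows the version-number grammar given in JEP 223, which allows any
number of dot-separated elements. The three-part pattern is a
hypothetical stand-in for a parser that assumes at most
major.minor.security; it is not Jetty's actual code, and the assumption
that the failure above has this cause is mine.

```python
import re

# JEP 223 version number: dot-separated integers, the first one non-zero,
# no trailing zero elements, and no limit on the number of elements.
VNUM = re.compile(r"[1-9][0-9]*((\.0)*\.[1-9][0-9]*)*")

# Hypothetical stricter pattern that accepts at most three elements.
THREE_PART = re.compile(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?")


def parse_vnum(version):
    """Parse a JEP 223 style version number into a tuple of ints."""
    if not VNUM.fullmatch(version):
        raise ValueError(f"Invalid Java version {version}")
    return tuple(int(part) for part in version.split("."))


def parse_three_part(version):
    """Parse a version with at most three elements, rejecting longer ones."""
    match = THREE_PART.fullmatch(version)
    if not match:
        raise ValueError(f"Invalid Java version {version}")
    return tuple(int(group) for group in match.groups() if group is not None)
```

Under these definitions parse_vnum accepts 11.0.14.1, while
parse_three_part accepts 11.0.14 but raises ValueError for the
four-element form.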


