Re: Issue with joda-time library bundled in hive-exec:4.0.0

2024-04-17 Thread Cheng Pan
Hi Ayush,

> Hive is already discussing marking Hive-2.x EOL, so at best we would have
> one release and announce EOL immediately after it

Is that discussion happening in public? Is there an ETA for the final release of
branch-2.3?

Thanks,
Cheng Pan


> On Apr 17, 2024, at 18:03, Ayush Saxena wrote:
> 
> Thanks, Cheng Pan, for sharing the pointers. Do you have a list of issues or
> pointers on the challenges Spark faces in moving to a higher Hive version? I
> know upgrading libraries is quite challenging, but it is inevitable.
>
> Hive is already discussing marking Hive-2.x EOL, so at best we would have
> one release and announce EOL immediately after it. Maintaining a release
> line is quite an effort for us at Hive, and doing it because other projects
> don't want to upgrade isn't a convincing reason for most of us. The best we
> can do, and are trying to do, is to address issues for Spark wherever we can
> as part of the Hive code, and we would definitely need help/support from the
> Spark side as well. Since the move is from 2.x to 4.x, it would be a big
> change and would meet resistance on both sides.
>
> So it would be a great help if any pointers can be shared from the Spark
> side for the move. If there is no help/interest from Spark, then we can't do
> anything, and there is no need for Hive-2.x either in that case :-)
> 
> -Ayush
> 
> On Wed, 17 Apr 2024 at 15:00, Cheng Pan wrote:
> > … we are exploring ways to get Spark move from 2.3.9 to 4.0, Our initial 
> > hunch is that it would be quite challenging without a hive-exec slim jar …
> 
> Upgrading Spark’s built-in Hive version is likely to be challenging.
> Actually, we have already done a lot of work on branch-2.3 focused on CVE
> reduction, for example, allowing Spark to upgrade Guava to modern versions
> and get rid of Guava 14. This was tested with the latest Spark master
> branch[1]. Maybe we need a 2.3.10 release now.
> 
> [1] https://github.com/apache/spark/pull/45372
> 
> Thanks,
> Cheng Pan
> 
> 



Re: Issue with joda-time library bundled in hive-exec:4.0.0

2024-04-17 Thread Ayush Saxena
Thanks, Cheng Pan, for sharing the pointers. Do you have a list of issues or
pointers on the challenges Spark faces in moving to a higher Hive version? I
know upgrading libraries is quite challenging, but it is inevitable.

Hive is already discussing marking Hive-2.x EOL, so at best we would have
one release and announce EOL immediately after it. Maintaining a release
line is quite an effort for us at Hive, and doing it because other projects
don't want to upgrade isn't a convincing reason for most of us. The best we
can do, and are trying to do, is to address issues for Spark wherever we can
as part of the Hive code, and we would definitely need help/support from the
Spark side as well. Since the move is from 2.x to 4.x, it would be a big
change and would meet resistance on both sides.

So it would be a great help if any pointers can be shared from the Spark
side for the move. If there is no help/interest from Spark, then we can't do
anything, and there is no need for Hive-2.x either in that case :-)

-Ayush

On Wed, 17 Apr 2024 at 15:00, Cheng Pan wrote:

> > … we are exploring ways to get Spark move from 2.3.9 to 4.0, Our initial
> hunch is that it would be quite challenging without a hive-exec slim jar …
>
> Upgrading Spark’s built-in Hive version is likely to be challenging.
> Actually, we have already done a lot of work on branch-2.3 focused on CVE
> reduction, for example, allowing Spark to upgrade Guava to modern versions
> and get rid of Guava 14. This was tested with the latest Spark master
> branch[1]. Maybe we need a 2.3.10 release now.
>
> [1] https://github.com/apache/spark/pull/45372
>
> Thanks,
> Cheng Pan
>
>
>


Re: Issue with joda-time library bundled in hive-exec:4.0.0

2024-04-17 Thread Cheng Pan
> … we are exploring ways to get Spark move from 2.3.9 to 4.0, Our initial 
> hunch is that it would be quite challenging without a hive-exec slim jar …

Upgrading Spark’s built-in Hive version is likely to be challenging. Actually,
we have already done a lot of work on branch-2.3 focused on CVE reduction, for
example, allowing Spark to upgrade Guava to modern versions and get rid of
Guava 14. This was tested with the latest Spark master branch[1]. Maybe we need
a 2.3.10 release now.

[1] https://github.com/apache/spark/pull/45372

Thanks,
Cheng Pan




Re: Issue with joda-time library bundled in hive-exec:4.0.0

2024-04-17 Thread Cheng Pan
There is a JIRA ticket[1] that tracks "upgrading built-in Hive to 3+".

BTW, regarding the HMS API used by Spark, the Hive 2.3.9 client is compatible
with HMS from 2.0 to 4.0, while the upcoming Hive 2.3.10 client should be
compatible with HMS from 1.2 to 4.0. If we decide to upgrade the built-in Hive,
it’s better to keep such compatibility.
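
To make the ranges above concrete, here is a minimal sketch that encodes them as data and checks whether a given HMS version falls inside a client's range. The names `HMS_COMPAT`, `parse` and `client_supports` are hypothetical and are not part of Hive or Spark; the 2.3.10 range is only the expected one ("should be"), not a confirmed result.

```python
# Hypothetical helper for illustration only; not a Hive or Spark API.
# Ranges come from the message above: client version -> (min HMS, max HMS), inclusive.
HMS_COMPAT = {
    "2.3.9": ("2.0", "4.0"),
    "2.3.10": ("1.2", "4.0"),  # assumption: expected range, release still upcoming
}


def parse(version):
    """Turn '2.3.10' into (2, 3, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def client_supports(client, hms):
    """Return True if the HMS major.minor version is inside the client's stated range."""
    low, high = HMS_COMPAT[client]
    return parse(low) <= parse(hms)[:2] <= parse(high)


print(client_supports("2.3.9", "1.2"))   # False
print(client_supports("2.3.10", "1.2"))  # True
print(client_supports("2.3.9", "3.1"))   # True
```

Any upgrade of the built-in Hive could be checked against a table like this to see which HMS versions would fall out of range.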

[1] https://issues.apache.org/jira/browse/SPARK-44114

Thanks,
Cheng Pan


> On Apr 17, 2024, at 18:03, Ayush Saxena wrote:
> 
> Thanks, Cheng Pan, for sharing the pointers. Do you have a list of issues or
> pointers on the challenges Spark faces in moving to a higher Hive version? I
> know upgrading libraries is quite challenging, but it is inevitable.
>
> Hive is already discussing marking Hive-2.x EOL, so at best we would have
> one release and announce EOL immediately after it. Maintaining a release
> line is quite an effort for us at Hive, and doing it because other projects
> don't want to upgrade isn't a convincing reason for most of us. The best we
> can do, and are trying to do, is to address issues for Spark wherever we can
> as part of the Hive code, and we would definitely need help/support from the
> Spark side as well. Since the move is from 2.x to 4.x, it would be a big
> change and would meet resistance on both sides.
>
> So it would be a great help if any pointers can be shared from the Spark
> side for the move. If there is no help/interest from Spark, then we can't do
> anything, and there is no need for Hive-2.x either in that case :-)
> 
> -Ayush
> 
> On Wed, 17 Apr 2024 at 15:00, Cheng Pan wrote:
> > … we are exploring ways to get Spark move from 2.3.9 to 4.0, Our initial 
> > hunch is that it would be quite challenging without a hive-exec slim jar …
> 
> Upgrading Spark’s built-in Hive version is likely to be challenging.
> Actually, we have already done a lot of work on branch-2.3 focused on CVE
> reduction, for example, allowing Spark to upgrade Guava to modern versions
> and get rid of Guava 14. This was tested with the latest Spark master
> branch[1]. Maybe we need a 2.3.10 release now.
> 
> [1] https://github.com/apache/spark/pull/45372
> 
> Thanks,
> Cheng Pan
> 
> 



Re: [DISCUS] Plan the next Hive release

2024-04-17 Thread Ayush Saxena
Hi Stamatis,
The plan is to cut a release line from branch-4.0, pull some critical bug
fixes & improvements into the 4.0.1 release, and have a quicker release.
As of now we are just putting the label "hive-4.0.1-must" on the tickets,
and we plan to make sure those get cherry-picked to the release line. AFAIK
we haven't started committing to any branch yet; I was waiting to see if
anyone feels differently, so we can hold back if you have concerns or take a
different approach as well.

By CI, do you mean the daily builds? Otherwise, if you create a PR
targeting branch-4.0, it will run the entire test suite, I believe. In the
meantime I will update the instructions regarding the target branch & the
label, in case anyone wants a particular ticket to be part of the 4.0.1
release.

-Ayush

On Wed, 17 Apr 2024 at 12:42, Stamatis Zampetakis wrote:

> Thanks for starting the discussion Ayush.
>
> Having frequent releases is definitely needed so we should keep the
> momentum going.
>
> I had the impression from other threads that the next Hive release
> would be 4.1.0 and that it would be cut from master. I would like to
> understand how 4.0.1 is different and if it is, what is the
> contribution pattern that contributors and committers should follow?
> If the idea is to maintain and commit in two (or more) branches the
> steps should be documented and CI should be running on those branches.
>
> Best,
> Stamatis
>
> On Wed, Apr 10, 2024 at 1:18 PM Denys Kuzmenko wrote:
> >
> > We might need it sooner, as we identified some critical issues in the
> > recent code:
> > 1. HIVE-28166: Truncate on Iceberg table disregards the branch name and
> > operates on main;
> > 2. HIVE-28190: Materialized view rebuild lock heart-beating is broken;
>


Archive old Hive releases

2024-04-17 Thread Stamatis Zampetakis
Hi all,

Following the INFRA policy [1] about handling current and older
releases, I just removed the following releases from the main download
site [2].

Apache Hive 1.2.2
Apache Hive 3.1.2
Apache Hive 4.0.0-alpha-1
Apache Hive 4.0.0-alpha-2
Apache Hive 4.0.0-beta-1

The aforementioned releases can now be found in the archive [3]. When
a new release comes out, we should remember to perform the necessary
cleanup, so I added a new section to the wiki [4].

Best,
Stamatis

[1] https://infra.apache.org/release-download-pages.html#current-and-older-releases
[2] https://downloads.apache.org/hive/
[3] https://archive.apache.org/dist/hive/
[4] https://cwiki.apache.org/confluence/display/Hive/HowToRelease#HowToRelease-Archiveoldreleases


Query Regarding Core Classified Hive-Exec Artifact in Future Releases

2024-04-17 Thread Mergu Ravi
Hi Hive Team,
From this Hive ticket https://issues.apache.org/jira/browse/HIVE-25531, I
understood that the core classified hive-exec artifact was removed. Is
there any plan to include this core artifact in upcoming releases?
-- 

Thanks & Regards,
Ravi Mergu


Re: [DISCUS] Plan the next Hive release

2024-04-17 Thread Stamatis Zampetakis
Thanks for starting the discussion Ayush.

Having frequent releases is definitely needed so we should keep the
momentum going.

I had the impression from other threads that the next Hive release
would be 4.1.0 and that it would be cut from master. I would like to
understand how 4.0.1 is different and if it is, what is the
contribution pattern that contributors and committers should follow?
If the idea is to maintain and commit in two (or more) branches the
steps should be documented and CI should be running on those branches.

Best,
Stamatis

On Wed, Apr 10, 2024 at 1:18 PM Denys Kuzmenko wrote:
>
> We might need it sooner, as we identified some critical issues in the recent code:
> 1. HIVE-28166: Truncate on Iceberg table disregards the branch name and
> operates on main;
> 2. HIVE-28190: Materialized view rebuild lock heart-beating is broken;