Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-10 Thread Stamatis Zampetakis
Thanks everyone for sharing your thoughts. I am happy to see so many people
involved in the discussion.

I would say that the current 4.0.0-alpha-1 is better in many aspects than
previous stable releases, although this might be a bit subjective.

I am afraid that if we keep supporting older releases it will take too long
before people start using 4.x.
Having real deployments of Hive 4 is the only way to go from alpha to
stable releases with confidence.

I checked the download statistics for Hive releases [1], [2] for the past
month and the results show that the vast majority of downloads are for
older releases.
I am not posting the stats here since I am not sure if this would violate
some policies. Hive committers can access the stats using their ASF
credentials.
To some degree this is expected but at the same time problematic given the
number of open issues which affect older releases.

I would definitely like to have multiple maintenance branches with high
quality standards but I don't think there are enough active committers in
the project to successfully maintain those.
The https://github.com/mr3project/hive-mr3 repo may be a great fit for an
upcoming ASF Hive release.
However, according to what Sungwoo said, this seems more like a new
maintenance branch rather than a continuation of Hive 3.
Moving towards this direction would certainly require more time from all of
us.

Lastly, it seems that there are some issues preventing people from using
4.0.0-alpha-1.
As Peter already mentioned, these issues are probably release blockers and
they should be taken into account in the next Hive 4 release.
The thread about the next steps after 4.0.0-alpha-1 [3] is the perfect
place to discuss those.
For those with certain demands around Hive 4, please reply to [3] and
include any specific JIRAs that need to be in the scope of the next release.

Best,
Stamatis

[1] https://logging1-he-de.apache.org/stats/
[2] https://repository.apache.org/#central-stat
[3] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj


On Tue, May 10, 2022 at 10:55 AM Sungwoo Park  wrote:

> We maintain our own fork of Hive 3 because we are not always adding new
> commits to the tip of the branch. To backport a new patch, sometimes we
> have to add new commits between existing commits, update earlier commits,
> and so on. This makes it impractical to keep adding new patches only to the
> tip of the branch while reverting commits if necessary. Maintaining the
> Hive 3 branch would mean frequent force-updates, which might produce more
> problems. (If this is not an issue, we could try to completely rebuild the
> Hive 3 branch.)
>
> I hope the Apache community can make a concerted effort to figure out what
> patches to include in Hive 3. For us, the challenge was 1) to decide which
> patch to include; 2) to figure out its dependencies if any; 3) to resolve
> conflicts. Testing was also another source of pain.
>
> Thanks,
>
> --- Sungwoo
>
>
>
>
>
> On Tue, May 10, 2022 at 4:26 PM Peter Vary  wrote:
>
>> When we were brainstorming about the future of the Hive 3 branch with
>> Zoltan Haindrich, he mentioned this letter:
>> https://lists.apache.org/thread/by9ppc2z8oqdzpqotzv5bs34yrxrd84l
>>
>> I think Sungwoo Park and his team makes a huge effort to maintain this
>> branch, and maybe it would be better to help them do this inside the Apache
>> Hive project. They should not need to maintain their own branch if there is
>> no particular reason behind it, or we can remove those blockers. This could
>> be beneficial for every Hive user who still uses Hive 3.
>>
>> @Sungwoo: Do you have any specific reason to keep you own fork of Hive 3?
>>
>> That would mean we could have a much better Hive 3.x branch than we have
>> now.
>>
>> What do you think?
>>
>> Thanks,
>> Peter
>>
>>
>>
>> On 2022. May 10., at 8:40, Battula, Brahma Reddy <
>> bbatt...@visa.com.INVALID> wrote:
>>
>> Agree to Peter and sunchao..
>>
>> Even we are using the hive 3.x, we might contribute on bugfixes.
>>
>> Even I am +1 on 1.x EOL as it's hard to maintain so many releases and
>> time to user's migrate to 2.x and 3.x.
>>
>>
>> On 09/05/22, 10:51 PM, "Chao Sun"  wrote:
>>
>>Agree to Peter above. I know quite a few projects such as Spark,
>>Iceberg and Trino/Presto are depending on Hive 2.x and 3.x, and
>>periodically they may need new fixes in these. Upgrading them to use
>>4.x seems not an option for now since the core classified artifact has
>>been removed and the shading issue has to be solved before they can
>>consume the new jar.
>>
>>On Mon, May 9, 2022 at 4:10 AM Peter Vary  wrote:
>>
>>
>> Hi Team,
>>
>> My experience with the Iceberg community shows that there are some
>> sizeable userbase around Hive 2.x. I have seen patches, contributions to
>> Hive 2.3.x branches, and the tests are in much better shape there.
>>
>> I would definitely vote for EOL Hive 1.x, but until we have a stable 4.x,
>> I would be cautious about slashing 2.x,

Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-10 Thread Sungwoo Park
We maintain our own fork of Hive 3 because we are not always adding new
commits to the tip of the branch. To backport a new patch, sometimes we
have to add new commits between existing commits, update earlier commits,
and so on. This makes it impractical to keep adding new patches only to the
tip of the branch while reverting commits if necessary. Maintaining the
Hive 3 branch would mean frequent force-updates, which might produce more
problems. (If this is not an issue, we could try to completely rebuild the
Hive 3 branch.)

I hope the Apache community can make a concerted effort to figure out what
patches to include in Hive 3. For us, the challenge was 1) to decide which
patch to include; 2) to figure out its dependencies if any; 3) to resolve
conflicts. Testing was another source of pain.

Thanks,

--- Sungwoo





On Tue, May 10, 2022 at 4:26 PM Peter Vary  wrote:

> When we were brainstorming about the future of the Hive 3 branch with
> Zoltan Haindrich, he mentioned this letter:
> https://lists.apache.org/thread/by9ppc2z8oqdzpqotzv5bs34yrxrd84l
>
> I think Sungwoo Park and his team makes a huge effort to maintain this
> branch, and maybe it would be better to help them do this inside the Apache
> Hive project. They should not need to maintain their own branch if there is
> no particular reason behind it, or we can remove those blockers. This could
> be beneficial for every Hive user who still uses Hive 3.
>
> @Sungwoo: Do you have any specific reason to keep you own fork of Hive 3?
>
> That would mean we could have a much better Hive 3.x branch than we have
> now.
>
> What do you think?
>
> Thanks,
> Peter
>
>
>
> On 2022. May 10., at 8:40, Battula, Brahma Reddy <
> bbatt...@visa.com.INVALID> wrote:
>
> Agree to Peter and sunchao..
>
> Even we are using the hive 3.x, we might contribute on bugfixes.
>
> Even I am +1 on 1.x EOL as it's hard to maintain so many releases and time
> to user's migrate to 2.x and 3.x.
>
>
> On 09/05/22, 10:51 PM, "Chao Sun"  wrote:
>
>Agree to Peter above. I know quite a few projects such as Spark,
>Iceberg and Trino/Presto are depending on Hive 2.x and 3.x, and
>periodically they may need new fixes in these. Upgrading them to use
>4.x seems not an option for now since the core classified artifact has
>been removed and the shading issue has to be solved before they can
>consume the new jar.
>
>On Mon, May 9, 2022 at 4:10 AM Peter Vary  wrote:
>
>
> Hi Team,
>
> My experience with the Iceberg community shows that there are some
> sizeable userbase around Hive 2.x. I have seen patches, contributions to
> Hive 2.3.x branches, and the tests are in much better shape there.
>
> I would definitely vote for EOL Hive 1.x, but until we have a stable 4.x,
> I would be cautious about slashing 2.x, 3.x branches.
>
> Just my 2 cents.
>
> Peter
>
> On 2022. May 9., at 10:51, Alessandro Solimando <
> alessandro.solima...@gmail.com> wrote:
>
> Hi Stamatis,
> thanks for bringing up this topic, I basically agree on everything you
> wrote.
>
> I just wanted to add that this kind of proposal might sound harsh, because
> in many contexts upgrading is a complex process, but it's in nobody's
> interest to keep release branches that are missing important
> fixes/improvements and that might not meet the quality standards that
> people expect, as mentioned.
>
> Since we don't have yet a stable 4.x release (only alpha for now) we might
> want to keep supporting the 3.x branch until the first 4.x stable release
> and EOL < 3.x branches, WDYT?
>
> Best regards,
> Alessandro
>
> On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis 
> wrote:
>
>
> Hi all,
>
> The current master has many critical bug fixes as well as important
> performance improvements that are not backported (and most likely never
> will) to the maintenance branches.
>
> Backporting changes from master usually requires adapting the code and
> tests in questions making it a non-trivial and time consuming task.
>
> The ASF bylaws require PMCs to deliver high quality software which satisfy
> certain criteria. Cutting new releases from maintenance branches with known
> critical bugs is not compliant with the ASF.
>
> CI is unstable in all maintenance branches making the quality of a release
> questionable and merging new PRs rather difficult. Enabling and running it
> frequently in all maintenance branches would require a big amount of
> resources on top of what we already need for master.
>
> History has shown that it is very difficult or impossible to properly
> maintain multiple release branches for Hive.
>
> I think it would be to the best interest of the project if the PMC decided
> to drop support for maintenance branches and focused on releasing
> exclusively from master.
>
> This mail is related to the discussion about the release cadence [1] since
> it would certainly help making Hive releases more regular. I decided to
> start a separate thread to avoid mixing multiple topics together.
>
> Looking for

Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-10 Thread Peter Vary
When we were brainstorming about the future of the Hive 3 branch with Zoltan 
Haindrich, he mentioned this letter: 
https://lists.apache.org/thread/by9ppc2z8oqdzpqotzv5bs34yrxrd84l 


I think Sungwoo Park and his team make a huge effort to maintain this branch,
and maybe it would be better to help them do this inside the Apache Hive 
project. They should not need to maintain their own branch if there is no 
particular reason behind it, or we can remove those blockers. This could be 
beneficial for every Hive user who still uses Hive 3.

@Sungwoo: Do you have any specific reason to keep your own fork of Hive 3?

That would mean we could have a much better Hive 3.x branch than we have now.

What do you think?

Thanks,
Peter



> On 2022. May 10., at 8:40, Battula, Brahma Reddy  
> wrote:
> 
> Agree to Peter and sunchao..
> 
> Even we are using the hive 3.x, we might contribute on bugfixes. 
> 
> Even I am +1 on 1.x EOL as it's hard to maintain so many releases and time to 
> user's migrate to 2.x and 3.x.
> 
> 
> On 09/05/22, 10:51 PM, "Chao Sun" wrote:
> 
>Agree to Peter above. I know quite a few projects such as Spark,
>Iceberg and Trino/Presto are depending on Hive 2.x and 3.x, and
>periodically they may need new fixes in these. Upgrading them to use
>4.x seems not an option for now since the core classified artifact has
>been removed and the shading issue has to be solved before they can
>consume the new jar.
> 
>On Mon, May 9, 2022 at 4:10 AM Peter Vary  wrote:
>> 
>> Hi Team,
>> 
>> My experience with the Iceberg community shows that there are some sizeable 
>> userbase around Hive 2.x. I have seen patches, contributions to Hive 2.3.x 
>> branches, and the tests are in much better shape there.
>> 
>> I would definitely vote for EOL Hive 1.x, but until we have a stable 4.x, I 
>> would be cautious about slashing 2.x, 3.x branches.
>> 
>> Just my 2 cents.
>> 
>> Peter
>> 
>> On 2022. May 9., at 10:51, Alessandro Solimando 
>>  wrote:
>> 
>> Hi Stamatis,
>> thanks for bringing up this topic, I basically agree on everything you wrote.
>> 
>> I just wanted to add that this kind of proposal might sound harsh, because 
>> in many contexts upgrading is a complex process, but it's in nobody's 
>> interest to keep release branches that are missing important 
>> fixes/improvements and that might not meet the quality standards that people 
>> expect, as mentioned.
>> 
>> Since we don't have yet a stable 4.x release (only alpha for now) we might 
>> want to keep supporting the 3.x branch until the first 4.x stable release 
>> and EOL < 3.x branches, WDYT?
>> 
>> Best regards,
>> Alessandro
>> 
>> On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis  wrote:
>>> 
>>> Hi all,
>>> 
>>> The current master has many critical bug fixes as well as important 
>>> performance improvements that are not backported (and most likely never 
>>> will) to the maintenance branches.
>>> 
>>> Backporting changes from master usually requires adapting the code and 
>>> tests in questions making it a non-trivial and time consuming task.
>>> 
>>> The ASF bylaws require PMCs to deliver high quality software which satisfy 
>>> certain criteria. Cutting new releases from maintenance branches with known 
>>> critical bugs is not compliant with the ASF.
>>> 
>>> CI is unstable in all maintenance branches making the quality of a release 
>>> questionable and merging new PRs rather difficult. Enabling and running it 
>>> frequently in all maintenance branches would require a big amount of 
>>> resources on top of what we already need for master.
>>> 
>>> History has shown that it is very difficult or impossible to properly 
>>> maintain multiple release branches for Hive.
>>> 
>>> I think it would be to the best interest of the project if the PMC decided 
>>> to drop support for maintenance branches and focused on releasing 
>>> exclusively from master.
>>> 
>>> This mail is related to the discussion about the release cadence [1] since 
>>> it would certainly help making Hive releases more regular. I decided to 
>>> start a separate thread to avoid mixing multiple topics together.
>>> 
>>> Looking forward to your thoughts.
>>> 
>>> Best,
>>> Stamatis
>>> 
>>> [1] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj
>>> 

Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-09 Thread Battula, Brahma Reddy
I agree with Peter and Chao Sun.

We are also using Hive 3.x, and we might contribute bug fixes.

I am also +1 on 1.x EOL, as it is hard to maintain so many releases and it is
time for users to migrate to 2.x and 3.x.


On 09/05/22, 10:51 PM, "Chao Sun"  wrote:

Agree to Peter above. I know quite a few projects such as Spark,
Iceberg and Trino/Presto are depending on Hive 2.x and 3.x, and
periodically they may need new fixes in these. Upgrading them to use
4.x seems not an option for now since the core classified artifact has
been removed and the shading issue has to be solved before they can
consume the new jar.

On Mon, May 9, 2022 at 4:10 AM Peter Vary  wrote:
>
> Hi Team,
>
> My experience with the Iceberg community shows that there are some
> sizeable userbase around Hive 2.x. I have seen patches, contributions to
> Hive 2.3.x branches, and the tests are in much better shape there.
>
> I would definitely vote for EOL Hive 1.x, but until we have a stable 4.x,
> I would be cautious about slashing 2.x, 3.x branches.
>
> Just my 2 cents.
>
> Peter
>
> On 2022. May 9., at 10:51, Alessandro Solimando wrote:
>
> Hi Stamatis,
> thanks for bringing up this topic, I basically agree on everything you
> wrote.
>
> I just wanted to add that this kind of proposal might sound harsh, because
> in many contexts upgrading is a complex process, but it's in nobody's
> interest to keep release branches that are missing important
> fixes/improvements and that might not meet the quality standards that
> people expect, as mentioned.
>
> Since we don't have yet a stable 4.x release (only alpha for now) we might
> want to keep supporting the 3.x branch until the first 4.x stable release
> and EOL < 3.x branches, WDYT?
>
> Best regards,
> Alessandro
>
> On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis wrote:
>>
>> Hi all,
>>
>> The current master has many critical bug fixes as well as important
>> performance improvements that are not backported (and most likely never
>> will) to the maintenance branches.
>>
>> Backporting changes from master usually requires adapting the code and
>> tests in questions making it a non-trivial and time consuming task.
>>
>> The ASF bylaws require PMCs to deliver high quality software which satisfy
>> certain criteria. Cutting new releases from maintenance branches with known
>> critical bugs is not compliant with the ASF.
>>
>> CI is unstable in all maintenance branches making the quality of a release
>> questionable and merging new PRs rather difficult. Enabling and running it
>> frequently in all maintenance branches would require a big amount of
>> resources on top of what we already need for master.
>>
>> History has shown that it is very difficult or impossible to properly
>> maintain multiple release branches for Hive.
>>
>> I think it would be to the best interest of the project if the PMC decided
>> to drop support for maintenance branches and focused on releasing
>> exclusively from master.
>>
>> This mail is related to the discussion about the release cadence [1] since
>> it would certainly help making Hive releases more regular. I decided to
>> start a separate thread to avoid mixing multiple topics together.
>>
>> Looking forward to your thoughts.
>>
>> Best,
>> Stamatis
>>
>> [1] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj
>>
>



Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-09 Thread Peter Vary
Shall we make the core/shading issue a blocker for 4.0.0?
Do we have a JIRA for it?

Thanks Sun Chao for bringing this up!

Peter

On Mon, May 9, 2022, 19:21 Chao Sun  wrote:

> Agree to Peter above. I know quite a few projects such as Spark,
> Iceberg and Trino/Presto are depending on Hive 2.x and 3.x, and
> periodically they may need new fixes in these. Upgrading them to use
> 4.x seems not an option for now since the core classified artifact has
> been removed and the shading issue has to be solved before they can
> consume the new jar.
>
> On Mon, May 9, 2022 at 4:10 AM Peter Vary  wrote:
> >
> > Hi Team,
> >
> > My experience with the Iceberg community shows that there are some
> > sizeable userbase around Hive 2.x. I have seen patches, contributions to
> > Hive 2.3.x branches, and the tests are in much better shape there.
> >
> > I would definitely vote for EOL Hive 1.x, but until we have a stable
> > 4.x, I would be cautious about slashing 2.x, 3.x branches.
> >
> > Just my 2 cents.
> >
> > Peter
> >
> > On 2022. May 9., at 10:51, Alessandro Solimando <
> > alessandro.solima...@gmail.com> wrote:
> >
> > Hi Stamatis,
> > thanks for bringing up this topic, I basically agree on everything you
> > wrote.
> >
> > I just wanted to add that this kind of proposal might sound harsh,
> > because in many contexts upgrading is a complex process, but it's in
> > nobody's interest to keep release branches that are missing important
> > fixes/improvements and that might not meet the quality standards that
> > people expect, as mentioned.
> >
> > Since we don't have yet a stable 4.x release (only alpha for now) we
> > might want to keep supporting the 3.x branch until the first 4.x stable
> > release and EOL < 3.x branches, WDYT?
> >
> > Best regards,
> > Alessandro
> >
> > On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis wrote:
> >>
> >> Hi all,
> >>
> >> The current master has many critical bug fixes as well as important
> >> performance improvements that are not backported (and most likely never
> >> will) to the maintenance branches.
> >>
> >> Backporting changes from master usually requires adapting the code and
> >> tests in questions making it a non-trivial and time consuming task.
> >>
> >> The ASF bylaws require PMCs to deliver high quality software which
> >> satisfy certain criteria. Cutting new releases from maintenance branches
> >> with known critical bugs is not compliant with the ASF.
> >>
> >> CI is unstable in all maintenance branches making the quality of a
> >> release questionable and merging new PRs rather difficult. Enabling and
> >> running it frequently in all maintenance branches would require a big
> >> amount of resources on top of what we already need for master.
> >>
> >> History has shown that it is very difficult or impossible to properly
> >> maintain multiple release branches for Hive.
> >>
> >> I think it would be to the best interest of the project if the PMC
> >> decided to drop support for maintenance branches and focused on releasing
> >> exclusively from master.
> >>
> >> This mail is related to the discussion about the release cadence [1]
> >> since it would certainly help making Hive releases more regular. I decided
> >> to start a separate thread to avoid mixing multiple topics together.
> >>
> >> Looking forward to your thoughts.
> >>
> >> Best,
> >> Stamatis
> >>
> >> [1] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj
> >>
> >
>


Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-09 Thread Chao Sun
I agree with Peter above. I know quite a few projects, such as Spark,
Iceberg and Trino/Presto, depend on Hive 2.x and 3.x, and
periodically they may need new fixes in these. Upgrading them to use
4.x does not seem to be an option for now, since the core classified
artifact has been removed and the shading issue has to be solved before
they can consume the new jar.

On Mon, May 9, 2022 at 4:10 AM Peter Vary  wrote:
>
> Hi Team,
>
> My experience with the Iceberg community shows that there are some sizeable 
> userbase around Hive 2.x. I have seen patches, contributions to Hive 2.3.x 
> branches, and the tests are in much better shape there.
>
> I would definitely vote for EOL Hive 1.x, but until we have a stable 4.x, I 
> would be cautious about slashing 2.x, 3.x branches.
>
> Just my 2 cents.
>
> Peter
>
> On 2022. May 9., at 10:51, Alessandro Solimando 
>  wrote:
>
> Hi Stamatis,
> thanks for bringing up this topic, I basically agree on everything you wrote.
>
> I just wanted to add that this kind of proposal might sound harsh, because in 
> many contexts upgrading is a complex process, but it's in nobody's interest 
> to keep release branches that are missing important fixes/improvements and 
> that might not meet the quality standards that people expect, as mentioned.
>
> Since we don't have yet a stable 4.x release (only alpha for now) we might 
> want to keep supporting the 3.x branch until the first 4.x stable release and 
> EOL < 3.x branches, WDYT?
>
> Best regards,
> Alessandro
>
> On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis  wrote:
>>
>> Hi all,
>>
>> The current master has many critical bug fixes as well as important 
>> performance improvements that are not backported (and most likely never 
>> will) to the maintenance branches.
>>
>> Backporting changes from master usually requires adapting the code and tests 
>> in questions making it a non-trivial and time consuming task.
>>
>> The ASF bylaws require PMCs to deliver high quality software which satisfy 
>> certain criteria. Cutting new releases from maintenance branches with known 
>> critical bugs is not compliant with the ASF.
>>
>> CI is unstable in all maintenance branches making the quality of a release 
>> questionable and merging new PRs rather difficult. Enabling and running it 
>> frequently in all maintenance branches would require a big amount of 
>> resources on top of what we already need for master.
>>
>> History has shown that it is very difficult or impossible to properly 
>> maintain multiple release branches for Hive.
>>
>> I think it would be to the best interest of the project if the PMC decided 
>> to drop support for maintenance branches and focused on releasing 
>> exclusively from master.
>>
>> This mail is related to the discussion about the release cadence [1] since 
>> it would certainly help making Hive releases more regular. I decided to 
>> start a separate thread to avoid mixing multiple topics together.
>>
>> Looking forward to your thoughts.
>>
>> Best,
>> Stamatis
>>
>> [1] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj
>>
>


Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-09 Thread Peter Vary
Hi Team,

My experience with the Iceberg community shows that there is a sizeable
user base around Hive 2.x. I have seen patches and contributions to the Hive
2.3.x branches, and the tests are in much better shape there.

I would definitely vote to EOL Hive 1.x, but until we have a stable 4.x, I
would be cautious about slashing the 2.x and 3.x branches.

Just my 2 cents.

Peter

> On 2022. May 9., at 10:51, Alessandro Solimando 
>  wrote:
> 
> Hi Stamatis,
> thanks for bringing up this topic, I basically agree on everything you wrote. 
> 
> I just wanted to add that this kind of proposal might sound harsh, because in 
> many contexts upgrading is a complex process, but it's in nobody's interest 
> to keep release branches that are missing important fixes/improvements and 
> that might not meet the quality standards that people expect, as mentioned.
> 
> Since we don't have yet a stable 4.x release (only alpha for now) we might 
> want to keep supporting the 3.x branch until the first 4.x stable release and 
> EOL < 3.x branches, WDYT?
> 
> Best regards,
> Alessandro
> 
> On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis wrote:
> Hi all,
> 
> The current master has many critical bug fixes as well as important 
> performance improvements that are not backported (and most likely never will) 
> to the maintenance branches.
> 
> Backporting changes from master usually requires adapting the code and tests 
> in questions making it a non-trivial and time consuming task.
> 
> The ASF bylaws require PMCs to deliver high quality software which satisfy 
> certain criteria. Cutting new releases from maintenance branches with known 
> critical bugs is not compliant with the ASF.  
> 
> CI is unstable in all maintenance branches making the quality of a release 
> questionable and merging new PRs rather difficult. Enabling and running it 
> frequently in all maintenance branches would require a big amount of 
> resources on top of what we already need for master.
> 
> History has shown that it is very difficult or impossible to properly 
> maintain multiple release branches for Hive.
> 
> I think it would be to the best interest of the project if the PMC decided to 
> drop support for maintenance branches and focused on releasing exclusively 
> from master. 
> 
> This mail is related to the discussion about the release cadence [1] since it 
> would certainly help making Hive releases more regular. I decided to start a 
> separate thread to avoid mixing multiple topics together.
> 
> Looking forward to your thoughts.   
> 
> Best,
> Stamatis
> 
> [1] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj 
> 
> 



Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-09 Thread Alessandro Solimando
Hi Stamatis,
thanks for bringing up this topic. I basically agree with everything you
wrote.

I just wanted to add that this kind of proposal might sound harsh, because
in many contexts upgrading is a complex process, but it's in nobody's
interest to keep release branches that are missing important
fixes/improvements and that might not meet the quality standards that
people expect, as mentioned.

Since we don't yet have a stable 4.x release (only alpha for now), we might
want to keep supporting the 3.x branch until the first 4.x stable release
and EOL < 3.x branches, WDYT?

Best regards,
Alessandro

On Fri, 6 May 2022 at 23:14, Stamatis Zampetakis  wrote:

> Hi all,
>
> The current master has many critical bug fixes as well as important
> performance improvements that are not backported (and most likely never
> will) to the maintenance branches.
>
> Backporting changes from master usually requires adapting the code and
> tests in questions making it a non-trivial and time consuming task.
>
> The ASF bylaws require PMCs to deliver high quality software which satisfy
> certain criteria. Cutting new releases from maintenance branches with known
> critical bugs is not compliant with the ASF.
>
> CI is unstable in all maintenance branches making the quality of a release
> questionable and merging new PRs rather difficult. Enabling and running it
> frequently in all maintenance branches would require a big amount of
> resources on top of what we already need for master.
>
> History has shown that it is very difficult or impossible to properly
> maintain multiple release branches for Hive.
>
> I think it would be to the best interest of the project if the PMC decided
> to drop support for maintenance branches and focused on releasing
> exclusively from master.
>
> This mail is related to the discussion about the release cadence [1] since
> it would certainly help making Hive releases more regular. I decided to
> start a separate thread to avoid mixing multiple topics together.
>
> Looking forward to your thoughts.
>
> Best,
> Stamatis
>
> [1] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj
>
>


[DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-05-06 Thread Stamatis Zampetakis
Hi all,

The current master has many critical bug fixes as well as important
performance improvements that are not backported (and most likely never
will) to the maintenance branches.

Backporting changes from master usually requires adapting the code and
tests in question, making it a non-trivial and time-consuming task.

The ASF bylaws require PMCs to deliver high-quality software that satisfies
certain criteria. Cutting new releases from maintenance branches with known
critical bugs is not compliant with ASF requirements.

CI is unstable in all maintenance branches, making the quality of a release
questionable and merging new PRs rather difficult. Enabling and running it
frequently in all maintenance branches would require a large amount of
resources on top of what we already need for master.

History has shown that it is very difficult or impossible to properly
maintain multiple release branches for Hive.

I think it would be in the best interest of the project if the PMC decided
to drop support for maintenance branches and focused on releasing
exclusively from master.

This mail is related to the discussion about the release cadence [1] since
it would certainly help make Hive releases more regular. I decided to
start a separate thread to avoid mixing multiple topics together.

Looking forward to your thoughts.

Best,
Stamatis

[1] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj

