Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-07-28 Thread Chao Sun
Hive 2.x is still being used by other projects like Spark and Iceberg,
and bug fixes and CVE fixes still land on the branch periodically.
So I would suggest keeping it alive a bit longer (maybe until after the
2.3.10/11 release), until the other projects are ready to move away from
it (which could take significant effort).

Chao

On Thu, Jul 28, 2022 at 5:51 AM Ayush Saxena wrote:
>
> +1 to starting an EOL vote for 1.x, and we can keep a doc or a reference in the
> Hive Wiki/Website to mark the lines as EOL.
>
> Sharing thoughts about the other release lines.
> Though there were assertions that we have a lot of users on the 2.x & 3.x
> lines, I don't think marking these lines as EOL will impact them that
> badly.
> Marking a release line as EOL seems to be a Dev agreement that we as the
> developers aren't putting enough effort into maintaining these branches and
> that they aren't very up to date.
>
> Take Hadoop as an example. The Hadoop 3.1.x line is marked as EOL, and
> still almost every second person on the Hadoop 3.x line is on a heavily patched
> version of 3.1.x; of the rest, a bunch are still on the 2.x
> family, of which only 2.10.x isn't EOL. Side note: as of today, Hive's
> master branch also depends on an unstable EOL version of Hadoop, namely
> 3.1.0 (upgrade in progress).
>
> From the stability point of view, I agree with Stamatis that 4.x in its alpha
> stage is still better than a bunch of previous releases in many aspects,
> and supporting older releases will just slow down the adoption
> of the new 4.x.
> If we look at the git history of these old branches, the frequency of
> commits is very low, so I don't think most of the
> developers/committers are putting effort into maintaining these
> branches. (Subjective opinion)
>
> IMO, we should consider marking 1.x & 2.x as EOL and resolve the upgrade issues
> mentioned for 3.x->4.x. Once they are resolved, if that doesn't require any
> changes on the 3.x line and everyone is happy, then mark that as EOL too;
> otherwise have a last bridge release on that branch for moving to 4.x.
>
> Just my 2 cents.
>
> -Ayush
>
>
>
> On Mon, 25 Jul 2022 at 19:38, Stamatis Zampetakis wrote:
>
> > Hi all,
> >
> > In the last exchanges there was a general consensus to EOL Hive 1.X but no
> > additional action.
> > I believe the next step would be to start a VOTE and move forward with an
> > official announcement.
> >
> > I think it would be helpful for the end-users to know which releases are
> > supported and which are strongly discouraged.
> > The Hadoop community keeps this information in their wiki [1].
> >
> > Although I am still not convinced that we should encourage users to use
> > the older release lines (2.X, 3.X), we can postpone the decision for the
> > time being and proceed just for 1.X.
> >
> > Best,
> > Stamatis
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/HADOOP/EOL+%28End-of-life%29+Release+Branches
> >
> > On Tue, May 10, 2022 at 2:51 PM Stamatis Zampetakis wrote:
> >
> > > Thanks everyone for sharing your thoughts. I am happy to see so many
> > > people involved in the discussion.
> > >
> > > I would say that the current 4.0.0-alpha-1 is better in many aspects than
> > > previous stable releases, although this might be a bit subjective.
> > >
> > > I am afraid that if we keep supporting older releases it will take too
> > > much time till people start using the 4.x.
> > > Having real deployments of Hive 4 is the only way to go from alpha to
> > > stable releases with confidence.
> > >
> > > I checked the download statistics for Hive releases [1], [2] for the past
> > > month and the results show that the vast majority of downloads are for
> > > older releases.
> > > I am not posting the stats here since I am not sure if this would violate
> > > some policies. Hive committers can access the stats using their ASF
> > > credentials.
> > > To some degree this is expected but at the same time problematic given
> > > the number of open issues which affect older releases.
> > >
> > > I would definitely like to have multiple maintenance branches with high
> > > quality standards but I don't think there are enough active committers in
> > > the project to successfully maintain those.
> > > The https://github.com/mr3project/hive-mr3 repo may be a great fit for
> > > an upcoming ASF Hive release.
> > > However, according to what Sungwoo said, this seems more like a new
> > > maintenance branch rather than a continuation of Hive 3.
> > > Moving towards this direction would certainly require more time from all
> > > of us.
> > >
> > > Lastly, it seems that there are some issues preventing people from using
> > > 4.0.0-alpha-1.
> > > As Peter already mentioned, these issues are probably release blockers and
> > > they should be taken into account in the next Hive 4 release.
> > > The thread about the next steps after 4.0.0-alpha-1 [3] is the perfect
> > > place to discuss those.
> > > For those with certain demands around Hive 4, 

[jira] [Created] (HIVE-26437) dump unpartitioned Tables in parallel

2022-07-28 Thread Amit Saonerkar (Jira)
Amit Saonerkar created HIVE-26437:
-

 Summary: dump unpartitioned Tables in parallel
 Key: HIVE-26437
 URL: https://issues.apache.org/jira/browse/HIVE-26437
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Amit Saonerkar
Assignee: Amit Saonerkar






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26436) test

2022-07-28 Thread TE (Jira)
TE created HIVE-26436:
-

 Summary: test
 Key: HIVE-26436
 URL: https://issues.apache.org/jira/browse/HIVE-26436
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 3.1.2
Reporter: TE
Assignee: TE








Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2022-07-28 Thread Ayush Saxena
+1 to starting an EOL vote for 1.x, and we can keep a doc or a reference in the
Hive Wiki/Website to mark the lines as EOL.

Sharing thoughts about the other release lines.
Though there were assertions that we have a lot of users on the 2.x & 3.x
lines, I don't think marking these lines as EOL will impact them that
badly.
Marking a release line as EOL seems to be a Dev agreement that we as the
developers aren't putting enough effort into maintaining these branches and
that they aren't very up to date.

Take Hadoop as an example. The Hadoop 3.1.x line is marked as EOL, and
still almost every second person on the Hadoop 3.x line is on a heavily patched
version of 3.1.x; of the rest, a bunch are still on the 2.x
family, of which only 2.10.x isn't EOL. Side note: as of today, Hive's
master branch also depends on an unstable EOL version of Hadoop, namely
3.1.0 (upgrade in progress).

From the stability point of view, I agree with Stamatis that 4.x in its alpha
stage is still better than a bunch of previous releases in many aspects,
and supporting older releases will just slow down the adoption
of the new 4.x.
If we look at the git history of these old branches, the frequency of
commits is very low, so I don't think most of the
developers/committers are putting effort into maintaining these
branches. (Subjective opinion)

IMO, we should consider marking 1.x & 2.x as EOL and resolve the upgrade issues
mentioned for 3.x->4.x. Once they are resolved, if that doesn't require any
changes on the 3.x line and everyone is happy, then mark that as EOL too;
otherwise have a last bridge release on that branch for moving to 4.x.

Just my 2 cents.

-Ayush



On Mon, 25 Jul 2022 at 19:38, Stamatis Zampetakis wrote:

> Hi all,
>
> In the last exchanges there was a general consensus to EOL Hive 1.X but no
> additional action.
> I believe the next step would be to start a VOTE and move forward with an
> official announcement.
>
> I think it would be helpful for the end-users to know which releases are
> supported and which are strongly discouraged.
> The Hadoop community keeps this information in their wiki [1].
>
> Although I am still not convinced that we should encourage users to use
> the older release lines (2.X, 3.X), we can postpone the decision for the
> time being and proceed just for 1.X.
>
> Best,
> Stamatis
>
> [1]
>
> https://cwiki.apache.org/confluence/display/HADOOP/EOL+%28End-of-life%29+Release+Branches
>
> On Tue, May 10, 2022 at 2:51 PM Stamatis Zampetakis wrote:
>
> > Thanks everyone for sharing your thoughts. I am happy to see so many
> > people involved in the discussion.
> >
> > I would say that the current 4.0.0-alpha-1 is better in many aspects than
> > previous stable releases, although this might be a bit subjective.
> >
> > I am afraid that if we keep supporting older releases it will take too
> > much time till people start using the 4.x.
> > Having real deployments of Hive 4 is the only way to go from alpha to
> > stable releases with confidence.
> >
> > I checked the download statistics for Hive releases [1], [2] for the past
> > month and the results show that the vast majority of downloads are for
> > older releases.
> > I am not posting the stats here since I am not sure if this would violate
> > some policies. Hive committers can access the stats using their ASF
> > credentials.
> > To some degree this is expected but at the same time problematic given
> > the number of open issues which affect older releases.
> >
> > I would definitely like to have multiple maintenance branches with high
> > quality standards but I don't think there are enough active committers in
> > the project to successfully maintain those.
> > The https://github.com/mr3project/hive-mr3 repo may be a great fit for
> > an upcoming ASF Hive release.
> > However, according to what Sungwoo said, this seems more like a new
> > maintenance branch rather than a continuation of Hive 3.
> > Moving towards this direction would certainly require more time from all
> > of us.
> >
> > Lastly, it seems that there are some issues preventing people from using
> > 4.0.0-alpha-1.
> > As Peter already mentioned, these issues are probably release blockers and
> > they should be taken into account in the next Hive 4 release.
> > The thread about the next steps after 4.0.0-alpha-1 [3] is the perfect
> > place to discuss those.
> > For those with certain demands around Hive 4, please reply to [3] and
> > include any specific JIRAs that need to be in the scope of the next
> > release.
> >
> > Best,
> > Stamatis
> >
> > [1] https://logging1-he-de.apache.org/stats/
> > [2] https://repository.apache.org/#central-stat
> > [3] https://lists.apache.org/thread/n245dd23kb2v3qrrfp280w3pto89khxj
> >
> >
> > On Tue, May 10, 2022 at 10:55 AM Sungwoo Park wrote:
> >
> >> We maintain our own fork of Hive 3 because we are not always adding new
> >> commits to the tip of the branch. To backport a new patch, sometimes we
> >> have to add new commits between existing commits,