Thanks for the context, Denys and Peter. Sounds like there's a good
question here about where the Hive integration should live, and the most
recent decision was to maintain that support in Hive. I definitely hear the
point about Hive 3 users depending on the Iceberg modules. I'm also glad to
hear that some of the issues are expected to be fixed with the release of
Hive 4.0.x.

I think that we have two separate questions for how to move forward with
Hive support, depending on the Hive version: what we do with the current
Hive modules, and what to do with the Hive 4 support that has been
developed externally.

For Hive 2.x and 3.x, we have code in the Iceberg repo that is not being
developed. Hive 2 is fairly easy since it is EOL. While Hive 3 is still
used, I don't think it makes sense to keep releasing versions of it if it
requires Java 8, which has not been publicly maintained for 5 years. We
need to upgrade and that is at odds with keeping support for Hive 3. As
Fokko and I both pointed out, people can still use older releases.

For the question of how to maintain support for Hive 4, I think it's worth
having a separate discussion (probably not on the thread about JDK
versions) about where to maintain it. I think that it is best to maintain
integration in engines and not in the Iceberg project; there are few
implementations here and I think that it is a hard problem for Iceberg to
maintain support for multiple versions (as you can see with support for so
many different Flink, Hive, and Spark versions).

Ryan

On Thu, Jul 18, 2024 at 7:25 AM Denys Kuzmenko <dkuzme...@apache.org> wrote:

> In the following 1-2 months we plan to release HIVE-4.0.1 which includes
> bug fixes and then focus on HIVE-4.1.0 release with jdk17.
>


-- 
Ryan Blue
Databricks
