Thanks, Bowen, for getting involved.

> why you proposed segregating hive versions into the 5 ranges above? &
> what different Hive features are supported in the 5 ranges?

Because only a higher version of the client dependencies can support lower
Hive metastore versions, the ranges are:
- Hive 1.0.0 - 1.2.2: the thrift changes are OK. The only difference is Hive
date column stats, and we can throw an exception for the unsupported feature.
- Hive 2.0 and Hive 2.1: primary key support and an alter_partition API change.
- Hive 2.2: no thrift change.
- Hive 2.3: changes many things, with lots of thrift changes.
- Hive 3+: not null, unique, timestamp, and many other things.

All these things can be found in hive_metastore.thrift.

I think I can put more effort into the implementation so that Hive 2.2 is
used to support Hive 2.0. Then the number of ranges will be 4.
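
To make the proposed split concrete, here is a minimal sketch of how a Hive
version would map to a module. It uses the five ranges from my original mail
as they stand. PROPOSED_MODULES, parse_version and pick_module are
hypothetical names for illustration only, not existing Flink code.

```python
# Hypothetical helper, not part of Flink: maps a Hive version string to the
# connector module proposed for it in the original mail.
PROPOSED_MODULES = [
    # (lowest covered version, highest covered version, module name)
    ((1, 0, 0), (1, 2, 2), "flink-connector-hive-1.2"),
    ((2, 0, 0), (2, 0, 1), "flink-connector-hive-2.0"),
    ((2, 1, 0), (2, 2, 0), "flink-connector-hive-2.2"),
    ((2, 3, 0), (2, 3, 6), "flink-connector-hive-2.3"),
    ((3, 0, 0), (3, 1, 2), "flink-connector-hive-3.1"),
]


def parse_version(version):
    """Turn '2.3.4' into (2, 3, 4); missing parts default to 0."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def pick_module(hive_version):
    """Return the proposed module for a Hive version, or raise if none covers it."""
    v = parse_version(hive_version)
    for low, high, module in PROPOSED_MODULES:
        # Tuples compare element by element, so this is a plain range check.
        if low <= v <= high:
            return module
    raise ValueError("no proposed module covers Hive " + hive_version)
```

For example, pick_module("2.1.1") picks the "flink-connector-hive-2.2" module,
and a version outside every listed range raises ValueError instead of guessing.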

> have you tested that whether the proposed corresponding Flink module will
> be fully compatible with each Hive version range?

Yes, I have done some tests. They do not really establish "fully" compatible,
so that part is a technical judgment.

Best,
Jingsong Lee

On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com> wrote:

> Thanks, Jingsong, for bringing this up. We've received lots of feedback in
> the past few months that the complexity involved in different Hive versions
> has been quite painful for users to start with. So it's great to step
> forward and deal with this issue.
>
> Before reaching a decision, can you please explain:
>
> 1) why you proposed segregating hive versions into the 5 ranges above?
> 2) what different Hive features are supported in the 5 ranges?
> 3) have you tested that whether the proposed corresponding Flink module
> will be fully compatible with each Hive version range?
>
> Thanks,
> Bowen
>
>
>
> On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org>
> wrote:
>
> > Hi all,
> >
> > I'd like to propose introducing flink-connector-hive-xx modules.
> >
> > We have documented the dependencies in detail[2]. But there are still
> > some inconveniences:
> > - Too many versions: users need to pick one version from 8 versions.
> > - Too many versions: it's not friendly to our developers either, because
> > when there's a problem/exception, we need to look at eight different
> > versions of hive client code, which often differ.
> > - Too many jars: for example, users need to download 4+ jars for Hive 1.x
> > from various places.
> >
> > We have discussed this in [1] and [2], but unfortunately we could not
> > reach an agreement.
> >
> > To improve this, I'd like to introduce a few flink-connector-hive-xx
> > modules in flink-connectors. Each module contains all the dependencies
> > related to hive, and only supports lower hive metastore versions:
> > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> >
> > Users can choose one and download it to flink/lib. It includes all hive
> > things.
> >
> > I tried to use a single module to deploy multiple versions, but I could
> > not find a suitable way, because different modules require different
> > versions and different dependencies.
> >
> > What do you think?
> >
> > [1]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > [2]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> >
> > Best,
> > Jingsong Lee
> >
>


-- 
Best, Jingsong Lee
