Thanks Jingsong for your explanation! I'm +1 for this initiative.

According to your description, I think it makes sense to fold support for
Hive 2.2 into that of 2.0/2.1, reducing the number of ranges to
4.

A couple of minor follow-up questions:
1) Will there be a base module like "flink-connector-hive-base" which holds
all the common logic of these proposed modules and is compiled into the
uber jar of "flink-connector-hive-xxx"?
2) According to my observation, it's more common to set the version in the
module name to the lowest version that the module supports, e.g. for
Hive 1.0.0 - 1.2.2, the module name can be "flink-connector-hive-1.0"
rather than "flink-connector-hive-1.2".
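
To make 2) concrete, here is a rough sketch of how a Hive version could be
resolved to a module under the lowest-version naming. It is only an
illustration: Python is used for brevity, the four ranges are the ones from
this thread with 2.0 - 2.2 merged, and every module name except
"flink-connector-hive-1.0" is my assumption of what the convention would
produce. None of these modules exist yet.

```python
# Hypothetical mapping: the four ranges discussed in this thread, each named
# after the LOWEST Hive version it covers. Module names are assumptions.
RANGES = [
    ((1, 0, 0), (1, 2, 2), "flink-connector-hive-1.0"),
    ((2, 0, 0), (2, 2, 0), "flink-connector-hive-2.0"),
    ((2, 3, 0), (2, 3, 6), "flink-connector-hive-2.3"),
    ((3, 0, 0), (3, 1, 2), "flink-connector-hive-3.0"),
]


def parse_version(text):
    """Turn a string like "2.1.0" into (2, 1, 0); missing parts become 0."""
    parts = [int(p) for p in text.strip().split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected version format: {text!r}")
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def module_for(hive_version):
    """Return the module whose inclusive range contains hive_version."""
    version = parse_version(hive_version)
    for lowest, highest, module in RANGES:
        if lowest <= version <= highest:
            return module
    raise ValueError(f"no module covers Hive {hive_version}")
```

With this naming, module_for("1.2.2") resolves to "flink-connector-hive-1.0",
and a version outside every range raises an error instead of silently picking
a neighbour.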


On Wed, Mar 4, 2020 at 10:20 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Thanks Bowen for getting involved.
>
> > why you proposed segregating hive versions into the 5 ranges above? &
> > what different Hive features are supported in the 5 ranges?
>
> Only a higher client dependency version can support lower Hive metastore
> versions, which leads to these ranges:
> - Hive 1.0.0 - 1.2.2: the thrift changes are OK; the only difference is Hive
> date column stats, and we can throw an exception for the unsupported feature.
> - Hive 2.0 and Hive 2.1: primary key support and an alter_partition API
> change.
> - Hive 2.2: no thrift change.
> - Hive 2.3: changes many things, with lots of thrift changes.
> - Hive 3+: not null, unique, timestamp, and many other things.
>
> All these things can be found in hive_metastore.thrift.
>
> I think I can put more effort into the implementation so that Hive 2.2 is
> used to support Hive 2.0. So the number of ranges will be 4.
>
> > have you tested whether the proposed corresponding Flink module will
> > be fully compatible with each Hive version range?
>
> Yes, I have done some tests. I can't really claim "fully", but it is a
> technical judgment.
>
> Best,
> Jingsong Lee
>
> On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com> wrote:
>
> > Thanks, Jingsong, for bringing this up. We've received lots of feedback in
> > the past few months that the complexity involved in different Hive versions
> > has been quite painful for users to start with. So it's great to step
> > forward and deal with this issue.
> >
> > Before reaching a decision, can you please explain:
> >
> > 1) why you proposed segregating hive versions into the 5 ranges above?
> > 2) what different Hive features are supported in the 5 ranges?
> > 3) have you tested whether the proposed corresponding Flink module
> > will be fully compatible with each Hive version range?
> >
> > Thanks,
> > Bowen
> >
> >
> >
> > On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org>
> > wrote:
> >
> > > Hi all,
> > >
> > > > I'd like to propose introducing flink-connector-hive-xx modules.
> > >
> > > > We have documented the dependencies in detail[2]. But there are still
> > > > some inconveniences:
> > > > - Too many versions: users need to pick one version from 8 versions.
> > > > - Too many versions: it's not friendly to our developers either, because
> > > > when there's a problem/exception, we need to look at eight different
> > > > versions of hive client code, which often differ.
> > > > - Too many jars: for example, users need to download 4+ jars for Hive 1.x
> > > > from various places.
> > >
> > > > We have discussed this in [1] and [2], but unfortunately, we could not
> > > > reach an agreement.
> > >
> > > > To improve this, I'd like to introduce a few flink-connector-hive-xx
> > > > modules in flink-connectors. Each module contains all the dependencies
> > > > related to hive, and only supports lower hive metastore versions:
> > > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> > >
> > > > Users can choose one and download it to flink/lib. It includes all the
> > > > hive dependencies.
> > >
> > > > I tried to use a single module to deploy multiple versions, but I could
> > > > not find a suitable way, because different modules require different
> > > > versions and different dependencies.
> > >
> > > What do you think?
> > >
> > > [1]
> > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > > [2]
> > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> > >
> > > Best,
> > > Jingsong Lee
> > >
> >
>
>
> --
> Best, Jingsong Lee
>
