> I have some hesitation, because the actual version number can better
> reflect the actual dependency. For example, if the user also knows the
> field hiveVersion[1]. He may enter the wrong hiveVersion because of the
> name, or he may have the wrong expectation for the hive built-in functions.

Sorry, I'm not sure if my proposal is understood correctly.

What I mean is this: your original proposal suggested, for example, naming
the module "flink-connector-hive-1.2" to support Hive 1.0.0 - 1.2.2, i.e. a
name that includes the highest Hive version it supports. I'm suggesting
naming it "flink-connector-hive-1.0" instead, a name that includes the
lowest Hive version it supports.
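
To make the two conventions concrete, here is a minimal sketch in Python. It is only an illustration: the function names and the "convention" parameter are hypothetical, and the ranges are the five from the original proposal quoted further down, not a final decision.

```python
# Hypothetical illustration only: these ranges are copied from the original
# proposal in this thread, and nothing here is an existing Flink API.
RANGES = [
    ((1, 0, 0), (1, 2, 2)),
    ((2, 0, 0), (2, 0, 1)),
    ((2, 1, 0), (2, 2, 0)),
    ((2, 3, 0), (2, 3, 6)),
    ((3, 0, 0), (3, 1, 2)),
]


def parse_version(version):
    """Turn "1.2.2" (or "2.0") into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def module_name(hive_version, convention="lowest"):
    """Return the module name covering hive_version under a naming convention.

    convention="lowest" names the module after the lowest supported version,
    convention="highest" after the highest supported version.
    """
    if convention not in ("lowest", "highest"):
        raise ValueError(f"unknown convention: {convention}")
    v = parse_version(hive_version)
    for low, high in RANGES:
        if low <= v <= high:
            major, minor, _ = low if convention == "lowest" else high
            return f"flink-connector-hive-{major}.{minor}"
    raise ValueError(f"no module covers Hive {hive_version}")


if __name__ == "__main__":
    for version in ("1.0.0", "1.2.2", "2.1.1", "3.1.2"):
        print(version, module_name(version, "lowest"), module_name(version, "highest"))
```

Under the "lowest" convention a user on Hive 1.2.2 would pick "flink-connector-hive-1.0"; under the "highest" convention the same user would pick "flink-connector-hive-1.2".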

What do you think?



On Wed, Mar 4, 2020 at 11:14 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Hi Bowen, thanks for your reply.
>
> > will there be a base module like "flink-connector-hive-base" which holds
> > all the common logic of these proposed modules
>
> Maybe we don't need one; the implementation of each module is only a
> "pom.xml". Different versions have different dependencies.
>
> > it's more common to set the version in module name to be the lowest
> > version that this module supports
>
> I have some hesitation, because the actual version number better reflects
> the actual dependency. For example, a user who also knows the hiveVersion
> field [1] may enter the wrong hiveVersion because of the module name, or
> may have the wrong expectations about the Hive built-in functions.
>
> [1] https://github.com/apache/flink/pull/11304
>
> Best,
> Jingsong Lee
>
> On Thu, Mar 5, 2020 at 2:34 PM Bowen Li <bowenl...@gmail.com> wrote:
>
> > Thanks Jingsong for your explanation! I'm +1 for this initiative.
> >
> > According to your description, I think it makes sense to fold support
> > for Hive 2.2 into that of 2.0/2.1, reducing the number of ranges to 4.
> >
> > A couple minor followup questions:
> > 1) will there be a base module like "flink-connector-hive-base" which
> > holds all the common logic of these proposed modules and is compiled into
> > the uber jar of "flink-connector-hive-xxx"?
> > 2) according to my observation, it's more common to set the version in
> > module name to be the lowest version that this module supports, e.g. for
> > Hive 1.0.0 - 1.2.2, the module name can be "flink-connector-hive-1.0"
> > rather than "flink-connector-hive-1.2"
> >
> >
> > On Wed, Mar 4, 2020 at 10:20 PM Jingsong Li <jingsongl...@gmail.com>
> > wrote:
> >
> > > Thanks Bowen for getting involved.
> > >
> > > > why you proposed segregating hive versions into the 5 ranges above? &
> > > > what different Hive features are supported in the 5 ranges?
> > >
> > > Because a higher client dependency version can only support lower Hive
> > > metastore versions:
> > > - Hive 1.0.0 - 1.2.2: the thrift changes are OK; the only difference is
> > > hive date column stats, and we can throw an exception for the
> > > unsupported feature.
> > > - Hive 2.0 and Hive 2.1: primary key support and alter_partition api
> > > change.
> > > - Hive 2.2: no thrift change.
> > > - Hive 2.3: changes many things, lots of thrift changes.
> > > - Hive 3+: not null, unique, timestamp, and many other things.
> > >
> > > All these things can be found in hive_metastore.thrift.
> > >
> > > I think I can put more effort into the implementation so that Hive 2.2
> > > is used to support Hive 2.0. Then the number of ranges will be 4.
> > >
> > > > have you tested that whether the proposed corresponding Flink module
> > > > will be fully compatible with each Hive version range?
> > >
> > > Yes, I have done some tests. Not really "fully", but it is a technical
> > > judgment.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com> wrote:
> > >
> > > > Thanks, Jingsong, for bringing this up. We've received lots of feedback
> > > > in the past few months that the complexity involved in different Hive
> > > > versions has been quite painful for users starting out. So it's great
> > > > to step forward and deal with this issue.
> > > >
> > > > Before reaching a decision, can you please explain:
> > > >
> > > > 1) why you proposed segregating hive versions into the 5 ranges above?
> > > > 2) what different Hive features are supported in the 5 ranges?
> > > > 3) have you tested whether the proposed corresponding Flink module
> > > > will be fully compatible with each Hive version range?
> > > >
> > > > Thanks,
> > > > Bowen
> > > >
> > > >
> > > >
> > > > On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'd like to propose introducing flink-connector-hive-xx modules.
> > > > >
> > > > > We have documented the dependencies in detail [2], but there are
> > > > > still some inconveniences:
> > > > > - Too many versions: users need to pick one version from 8 versions.
> > > > > - Too many versions for our developers as well: when there is a
> > > > > problem/exception, we need to look at eight different versions of the
> > > > > hive client code, which often differ.
> > > > > - Too many jars: for example, users need to download 4+ jars for Hive
> > > > > 1.x from various places.
> > > > >
> > > > > We have discussed this in [1] and [2], but unfortunately we could not
> > > > > reach an agreement.
> > > > >
> > > > > To improve this, I'd like to introduce a few flink-connector-hive-xx
> > > > > modules in flink-connectors. Each module contains all the dependencies
> > > > > related to hive, and only supports lower hive metastore versions:
> > > > > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > > > > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > > > > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > > > > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > > > > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> > > > >
> > > > > Users can choose one and download it to flink/lib. It includes
> > > > > everything related to hive.
> > > > >
> > > > > I tried to use a single module to deploy multiple versions, but I
> > > > > could not find a suitable way, because different modules require
> > > > > different versions and different dependencies.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > [1]
> > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > > > > [2]
> > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> > > > >
> > > > > Best,
> > > > > Jingsong Lee
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>
>
> --
> Best, Jingsong Lee
>
