Thanks, Bowen, for getting involved.

> why you proposed segregating hive versions into the 5 ranges above?
> & what different Hive features are supported in the 5 ranges?

The idea is that a higher version of the client dependencies is used to
support lower Hive metastore versions, so the ranges follow the metastore
changes:
- Hive 1.0.0 - 1.2.2: the Thrift changes are OK. The only difference is Hive
  date column stats, and we can throw an exception for the unsupported
  feature.
- Hive 2.0 and Hive 2.1: primary key support and the alter_partition API
  change.
- Hive 2.2: no Thrift change.
- Hive 2.3: many things changed, with lots of Thrift changes.
- Hive 3+: not null, unique, timestamp, and many other things.

All of these can be found in hive_metastore.thrift.

I think I can put more effort into the implementation so that Hive 2.2 also
supports Hive 2.0. Then there will be 4 ranges.

> have you tested that whether the proposed corresponding Flink module will
> be fully compatible with each Hive version range?

Yes, I have done some tests. They do not really cover "fully"; that part is
a technical judgment.

Best,
Jingsong Lee

On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com> wrote:

> Thanks, Jingsong, for bringing this up. We've received lots of feedbacks in
> the past few months that the complexity involved in different Hive versions
> has been quite painful for users to start with. So it's great to step
> forward and deal with such issue.
>
> Before getting on a decision, can you please explain:
>
> 1) why you proposed segregating hive versions into the 5 ranges above?
> 2) what different Hive features are supported in the 5 ranges?
> 3) have you tested that whether the proposed corresponding Flink module
> will be fully compatible with each Hive version range?
>
> Thanks,
> Bowen
>
> On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org>
> wrote:
>
> > Hi all,
> >
> > I'd like to propose introduce flink-connector-hive-xx modules.
> >
> > We have documented the dependencies detailed information[2]. But still has
> > some inconvenient:
> > - Too many versions, users need to pick one version from 8 versions.
> > - Too many versions, It's not friendly to our developers either, because
> > there's a problem/exception, we need to look at eight different versions of
> > hive client code, which are often various.
> > - Too many jars, for example, users need to download 4+ jars for Hive 1.x
> > from various places.
> >
> > We have discussed in [1] and [2], but unfortunately, we can not achieve an
> > agreement.
> >
> > For improving this, I'd like to introduce few flink-connector-hive-xx
> > modules in flink-connectors, module contains all the dependencies related
> > to hive. And only support lower hive metastore versions:
> > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> >
> > Users can choose one and download to flink/lib. It includes all hive
> > things.
> >
> > I try to use a single module to deploy multiple versions, but I can not
> > find a suitable way, because different modules require different versions
> > and different dependencies.
> >
> > What do you think?
> >
> > [1]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > [2]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> >
> > Best,
> > Jingsong Lee
> >
>

--
Best, Jingsong Lee
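
For reference, here is a minimal sketch of how the five ranges in the quoted proposal would map a Hive version to a module. It is only an illustration in Python: `PROPOSED_MODULES`, `parse_version`, and `module_for` are hypothetical names and not part of Flink. The table would shrink to four entries if Hive 2.0 ends up covered by the 2.2 module.

```python
# Hypothetical lookup table built from the ranges in the quoted proposal.
# Each entry: (lowest supported Hive version, highest supported, module name).
PROPOSED_MODULES = [
    ((1, 0, 0), (1, 2, 2), "flink-connector-hive-1.2"),
    ((2, 0, 0), (2, 0, 1), "flink-connector-hive-2.0"),
    ((2, 1, 0), (2, 2, 0), "flink-connector-hive-2.2"),
    ((2, 3, 0), (2, 3, 6), "flink-connector-hive-2.3"),
    ((3, 0, 0), (3, 1, 2), "flink-connector-hive-3.1"),
]


def parse_version(text):
    """Turn a 'major.minor.patch' string into a tuple of three ints."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {text!r}")
    return tuple(int(part) for part in parts)


def module_for(hive_version):
    """Return the proposed module name whose range contains hive_version."""
    version = parse_version(hive_version)
    for low, high, module in PROPOSED_MODULES:
        # Tuples compare element by element, so this is a range check.
        if low <= version <= high:
            return module
    raise ValueError(f"no proposed module covers Hive {hive_version}")


if __name__ == "__main__":
    print(module_for("2.1.1"))  # flink-connector-hive-2.2
```

A version outside every listed range, such as 2.3.7, raises `ValueError` in this sketch instead of guessing the nearest module.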