Hi Bowen, thanks for your reply.

> will there be a base module like "flink-connector-hive-base" which holds all the common logic of these proposed modules

Maybe we don't need one; each module's implementation is only a "pom.xml".
Different versions have different dependencies.

> it's more common to set the version in module name to be the lowest version that this module supports

I have some hesitation, because the actual version number better reflects
the actual dependency. For example, a user who also knows about the
hiveVersion field [1] may enter the wrong hiveVersion because of the module
name, or may have the wrong expectations about the Hive built-in functions.

[1] https://github.com/apache/flink/pull/11304

Best,
Jingsong Lee

On Thu, Mar 5, 2020 at 2:34 PM Bowen Li <bowenl...@gmail.com> wrote:

> Thanks Jingsong for your explanation! I'm +1 for this initiative.
>
> According to your description, I think it makes sense to incorporate
> support for Hive 2.2 into that of 2.0/2.1, reducing the number of ranges
> to 4.
>
> A couple of minor follow-up questions:
> 1) will there be a base module like "flink-connector-hive-base" which holds
> all the common logic of these proposed modules and is compiled into the
> uber jar of "flink-connector-hive-xxx"?
> 2) according to my observation, it's more common to set the version in
> module name to be the lowest version that this module supports, e.g. for
> Hive 1.0.0 - 1.2.2, the module name can be "flink-connector-hive-1.0"
> rather than "flink-connector-hive-1.2"
>
>
> On Wed, Mar 4, 2020 at 10:20 PM Jingsong Li <jingsongl...@gmail.com>
> wrote:
>
> > Thanks Bowen for getting involved.
> >
> > > why you proposed segregating hive versions into the 5 ranges above? &
> > > what different Hive features are supported in the 5 ranges?
> >
> > Since only a higher client dependency version supports lower Hive
> > metastore versions:
> > - Hive 1.0.0 - 1.2.2: the thrift changes are OK; only Hive date column
> > stats differ, and we can throw an exception for the unsupported feature.
> > - Hive 2.0 and Hive 2.1: primary key support and an alter_partition API
> > change.
> > - Hive 2.2: no thrift change.
> > - Hive 2.3: changes many things, lots of thrift changes.
> > - Hive 3+: not null, unique, timestamp, and many other things.
> >
> > All these things can be found in hive_metastore.thrift.
> >
> > I think I can put more effort into the implementation so that Hive 2.2
> > also supports Hive 2.0, so the number of ranges will be 4.
> >
> > > have you tested whether the proposed corresponding Flink module will
> > > be fully compatible with each Hive version range?
> >
> > Yes, I have done some tests. Not really "fully", but it is a technical
> > judgment.
> >
> > Best,
> > Jingsong Lee
> >
> > On Thu, Mar 5, 2020 at 1:17 PM Bowen Li <bowenl...@gmail.com> wrote:
> >
> > > Thanks, Jingsong, for bringing this up. We've received lots of feedback
> > > in the past few months that the complexity involved in different Hive
> > > versions has been quite painful for users to start with. So it's great
> > > to step forward and deal with this issue.
> > >
> > > Before reaching a decision, can you please explain:
> > >
> > > 1) why you proposed segregating hive versions into the 5 ranges above?
> > > 2) what different Hive features are supported in the 5 ranges?
> > > 3) have you tested whether the proposed corresponding Flink module
> > > will be fully compatible with each Hive version range?
> > >
> > > Thanks,
> > > Bowen
> > >
> > >
> > >
> > > On Wed, Mar 4, 2020 at 1:00 AM Jingsong Lee <lzljs3620...@apache.org>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to propose introducing flink-connector-hive-xx modules.
> > > >
> > > > We have documented the detailed dependency information [2], but
> > > > there are still some inconveniences:
> > > > - Too many versions: users need to pick one version out of 8.
> > > > - Too many versions: it's not friendly to our developers either,
> > > > because when there's a problem/exception, we need to look at eight
> > > > different versions of the Hive client code, which often differ.
> > > > - Too many jars: for example, users need to download 4+ jars for
> > > > Hive 1.x from various places.
> > > >
> > > > We have discussed this in [1] and [2], but unfortunately we could not
> > > > reach an agreement.
> > > >
> > > > To improve this, I'd like to introduce a few flink-connector-hive-xx
> > > > modules in flink-connectors. Each module contains all the
> > > > dependencies related to Hive and supports only lower Hive metastore
> > > > versions:
> > > > - "flink-connector-hive-1.2" to support hive 1.0.0 - 1.2.2
> > > > - "flink-connector-hive-2.0" to support hive 2.0.0 - 2.0.1
> > > > - "flink-connector-hive-2.2" to support hive 2.1.0 - 2.2.0
> > > > - "flink-connector-hive-2.3" to support hive 2.3.0 - 2.3.6
> > > > - "flink-connector-hive-3.1" to support hive 3.0.0 - 3.1.2
> > > >
> > > > Users can choose one and download it to flink/lib. It includes
> > > > everything Hive-related.
> > > >
> > > > I tried to use a single module to deploy multiple versions, but I
> > > > could not find a suitable way, because different modules require
> > > > different versions and different dependencies.
> > > >
> > > > What do you think?
> > > >
> > > > [1]
> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-have-separate-Flink-distributions-with-built-in-Hive-dependencies-td35918.html
> > > > [2]
> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-109-Improve-Hive-dependencies-out-of-box-experience-td38290.html
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > >
> >
> > --
> > Best, Jingsong Lee
> >
>
--
Best, Jingsong Lee
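
For illustration only, the five ranges listed in the original proposal above can be read as a lookup from a user's Hive version to the module they would download. The following is a minimal Python sketch of that reading; the names `PROPOSED_MODULES`, `parse_version` and `pick_module` are hypothetical and not part of Flink, and the table reflects the first proposal (five ranges), not the four-range variant discussed later in the thread.

```python
from typing import Optional, Tuple

# (lowest version, highest version, module name), copied from the proposal.
# Hypothetical table for illustration; not an actual Flink artifact list.
PROPOSED_MODULES = [
    ((1, 0, 0), (1, 2, 2), "flink-connector-hive-1.2"),
    ((2, 0, 0), (2, 0, 1), "flink-connector-hive-2.0"),
    ((2, 1, 0), (2, 2, 0), "flink-connector-hive-2.2"),
    ((2, 3, 0), (2, 3, 6), "flink-connector-hive-2.3"),
    ((3, 0, 0), (3, 1, 2), "flink-connector-hive-3.1"),
]


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn "2.3" or "2.3.4" into a 3-tuple, padding missing parts with 0."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def pick_module(hive_version: str) -> Optional[str]:
    """Return the proposed module whose inclusive range covers the version."""
    v = parse_version(hive_version)
    for lowest, highest, module in PROPOSED_MODULES:
        if lowest <= v <= highest:
            return module
    return None  # version falls outside every proposed range


if __name__ == "__main__":
    for hv in ("1.2.1", "2.1.1", "2.3.4", "3.1.2", "0.13.1"):
        print(hv, "->", pick_module(hv))
```

Note that each module name carries the highest minor version of its range (e.g. 1.0.0 - 1.2.2 maps to "flink-connector-hive-1.2"), which is the naming question raised in the replies.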