Hi Kurt, I also prefer "-" as version delimiter now. I didn't remove the "_" proposal by mistake, that's why I sent another email last night :) Regarding to "property-version", I also think we shouldn't let users to learn about this. And ConfigOption provides a good ability to support deprecated keys and auto-generate documentation for deprecated keys.
Hi Danny, Regarding to “connector.properties.*”: In FLIP-95, the Factory#requiredOptions() and Factory#optionalOptions() inferfaces are only used for generation of documentation. It does not influence the discovery and validation of a factory. The validation logic is defined by connectors in createDynamicTableSource/Sink(). So you don't have to provide an option for "connector.properties.*". But I think we should make ConfigOption support wildcard in the long term for a full story. I don't think we should inline all the "connector.properties.*", otherwise, it will be very tricky for users to configure the properties. Regarding to FLIP-113, I suggest to provide some ConfigOptions for commonly used kafka properties and put them in the supportedHintOptions(), e.g. "connector.properties.group.id", "connector.properties.fetch.min.bytes". Best, Jark On Tue, 31 Mar 2020 at 12:04, Danny Chan <yuzhao....@gmail.com> wrote: > Thanks Jark for bring up this discussion, +1 for this idea, I believe the > user has suffered from the verbose property key for long time. > > Just one question, how do we handle the keys with wildcard, such as the > “connector.properties.*” in Kafka connector which would then hand-over to > Kafka client directly. As what suggested in FLIP-95, we use a ConfigOption > to describe the “supported properties”, then I have to concerns: > > • For the new keys, do we still need to put multi-lines there the such > key, such as “connector.properties.abc” “connector.properties.def”, or > should we inline them, such as “some-key-prefix” = “k1=v1, k2=v2 ..." > • Should the ConfigOption support the wildcard ? (If we plan to support > the current multi-line style) > > > Best, > Danny Chan > 在 2020年3月31日 +0800 AM12:37,Jark Wu <imj...@gmail.com>,写道: > > Hi all, > > > > Thanks for the feedbacks. > > > > It seems that we have a conclusion to put the version into the factory > > identifier. I'm also fine with this. > > If we have this outcome, the interface of Factory#factoryVersion is not > > needed anymore, this can simplify the learning cost of new factory. > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > > <twal...@apache.org> > > > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > > the meaning of "universal" not easy to understand. > > kafka-0.11 => kafka for 0.11 version > > kafka-0.10 => kafka for 0.10 version > > elasticsearch-6 => elasticsearch for 6.x versions > > elasticsearch-7 => elasticsearch for 7.x versions > > hbase-1.4 => hbase for 1.4.x versions > > jdbc > > filesystem > > > > We use "-" as the version delimiter to make them to be more consistent. > > This is not forces, users can also use other delimiters or without > > delimiter. > > But this can be a guilde in the Javadoc of Factory, to make the connector > > ecosystem to be more consistent. > > > > What do you think? > > > > ------------------------ > > > > Regarding "connector.property-version": > > > > Hi @Dawid Wysakowicz <dwysakow...@apache.org> , the new fatories are > > designed not support to read current properties. > > All the current properties are routed to the old factories if they are > > using "connector.type". Otherwise, properties are routed to new > factories. > > > > If I understand correctly, the "connector.property-version" is attched > > implicitly by system, not manually set by users. > > For example, the framework should add "connector.property-version=1" to > > properties when processing DDL statement. > > I'm fine to add a "connector.property-version=1" when processing DDL > > statement, but I think it's also fine if we don't, > > because this can be done in the future if need and the default version > can > > be 1. > > > > Best, > > Jark > > > > On Tue, 31 Mar 2020 at 00:36, Jark Wu <imj...@gmail.com> wrote: > > > > > Hi all, > > > > > > Thanks for the feedbacks. > > > > > > It seems that we have a conclusion to put the version into the factory > > > identifier. I'm also fine with this. > > > If we have this outcome, the interface of Factory#factoryVersion is not > > > needed anymore, this can simplify the learning cost of new factory. > > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > > > <twal...@apache.org> > > > > > > Btw, I would like to use "_" instead of "-" as the version delimiter, > > > because "-" looks like minus and may confuse users, e.g. > "elasticsearch-6". > > > This is not forced, but should be a guilde in the Javadoc of Factory. > > > I propose to use the following identifiers for existing connectors, > > > > > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", > because > > > the meaning of "universal" not easy to understand. > > > kafka-0.11 => kafka for 0.11 version > > > kafka-0.10 => kafka for 0.10 version > > > elasticsearch-6 => elasticsearch for 6.x versions > > > elasticsearch-7 => elasticsearch for 7.x versions > > > hbase-1.4 => hbase for 1.4.x versions > > > jdbc > > > filesystem > > > > > > We use "-" as the version delimiter to make them to be more consistent. > > > This is not forces, users can also use other delimiters or without > > > delimiter. > > > But this can be a guilde in the Javadoc of Factory, to make the > connector > > > ecosystem to be more consistent. > > > > > > What do you think? > > > > > > ------------------------ > > > > > > Regarding "connector.property-version": > > > > > > Hi @Dawid Wysakowicz <dwysakow...@apache.org> , the new fatories are > > > designed not support to read current properties. > > > All the current properties are routed to the old factories if they are > > > using "connector.type". Otherwise, properties are routed to new > factories. > > > > > > If I understand correctly, the "connector.property-version" is attched > > > implicitly by system, not manually set by users. > > > For example, the framework should add "connector.property-version=1" to > > > properties when processing DDL statement. > > > I'm fine to add a "connector.property-version=1" when processing DDL > > > statement, but I think it's also fine if we don't, > > > because this can be done in the future if need and the default version > can > > > be 1. > > > > > > Best, > > > Jark > > > > > > > > > > > > > > > > > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <dwysakow...@apache.org > > > > > wrote: > > > > > > > Hi all, > > > > > > > > I like the overall design of the FLIP. > > > > > > > > As for the withstanding concerns. I kind of like the approach to put > the > > > > version into the factory identifier. I think it's the cleanest way to > > > > say that this version actually applies to the connector itself and > not > > > > to the system it connects to. BTW, I think the outcome of this > > > > discussion will affect interfaces described in FLIP-95. If we put the > > > > version into the functionIdentifier, do we need > Factory#factoryVersion? > > > > > > > > I also think it does make sense to have a versioning for the > properties > > > > as well. Are we able to read all the current properties with the new > > > > factories? I think we could use the "connector.property-version" to > > > > alternate between different Factory interfaces to support the old > set of > > > > properties. Otherwise the new factories need to understand both set > of > > > > properties, don't they? > > > > > > > > Best, > > > > > > > > Dawid > > > > > > > > On 30/03/2020 17:07, Timo Walther wrote: > > > > > Hi Jark, > > > > > > > > > > thanks for the FLIP. I don't have a strong opinion on > > > > > DynamicTableFactory#factoryVersion() for distiguishing connector > > > > > versions. We can also include it in the factory identifier itself. > For > > > > > Kafka, I would then just use `kafka` because the naming "universal" > > > > > was just a helper solution at that time. > > > > > > > > > > What we need is a good guide for how to design the options. We > should > > > > > include that in the JavaDoc of factory interface itself, once it is > > > > > in-place. I like the difference between source and sink in > properties. > > > > > > > > > > Regarding "connector.property-version": > > > > > > > > > > It was meant for backwards compatibility along the dimension of > > > > > "property design" compared to "connector version". However, since > the > > > > > connector is now responsible for parsing all properties, it is not > as > > > > > necessary as it was before. > > > > > > > > > > However, a change of the property design as it is done in FLIP-107 > > > > > becomes more difficult without a property version and versioning is > > > > > always a good idea in API world. > > > > > > > > > > Regards, > > > > > Timo > > > > > > > > > > > > > > > On 30.03.20 16:38, Benchao Li wrote: > > > > > > Hi Jark, > > > > > > > > > > > > Thanks for starting the discussion. The FLIP LGTM generally. > > > > > > > > > > > > Regarding connector version VS put version into connector's name, > > > > > > I favor the latter personally, using one property to locate a > > > > > > connector can make the error log more precise. > > > > > > From the users' side, using one property to match a connector > will > > > > > > be easier. Especially we have many connectors, > > > > > > and some of the need version property required, and some of them > not. > > > > > > > > > > > > Regarding Jingsong's suggestion, > > > > > > IMO, it's a very good complement to the FLIP. Distinguishing > > > > > > properties for source and sink can be very useful, and > > > > > > also this will make the connector property more precise. > > > > > > We are also sick of this for now, we cannot know whether a DDL > is a > > > > > > source or sink unless we look through all queries where > > > > > > the table is used. > > > > > > Even more, some of the required properties are only required for > > > > > > source, bug we cannot leave it blank for sink, and vice versa. > > > > > > I think we can also add a type for dimension tables except > source and > > > > > > sink. > > > > > > > > > > > > Kurt Young <ykt...@gmail.com <mailto:ykt...@gmail.com>> > 于2020年3月30日 > > > > > > 周一 下午8:16写道: > > > > > > > > > > > > > It's not possible for the framework to throw such exception. > > > > > > Because the > > > > > > framework doesn't know what versions do the connector support. > > > > > > > > > > > > Not really, we can list all valid connectors framework could > > > > > > found. E.g. > > > > > > user mistyped 'kafka-0.x', the error message will looks like: > > > > > > > > > > > > we don't have any connector named "kafka-0.x", but we have: > > > > > > FileSystem > > > > > > Kafka-0.10 > > > > > > Kafka-0.11 > > > > > > ElasticSearch6 > > > > > > ElasticSearch7 > > > > > > > > > > > > Best, > > > > > > Kurt > > > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <imj...@gmail.com > > > > > > <mailto:imj...@gmail.com>> wrote: > > > > > > > > > > > > > Hi Kurt, > > > > > > > > > > > > > > > 2) Lists all available connectors seems also quite > > > > > > straightforward, e.g > > > > > > > user provided a wrong "kafka-0.8", we tell user we have > > > > > > candidates of > > > > > > > "kakfa-0.11", "kafka-universal" > > > > > > > It's not possible for the framework to throw such exception. > > > > > > Because the > > > > > > > framework doesn't know what versions do the connector support. > > > > > > All the > > > > > > > version information is a blackbox in the identifier. But with > > > > > > > `Factory#factoryVersion()` interface, we can know all the > > > > > > supported > > > > > > > versions. > > > > > > > > > > > > > > > 3) I don't think so. We can still treat it as the same > > > > > > connector but with > > > > > > > different versions. > > > > > > > That's true but that's weird. Because from the plain DDL > > > > > > definition, they > > > > > > > look like different connectors with different "connector" > > > > > > value, e.g. > > > > > > > 'connector=kafka-0.8', 'connector=kafka-0.10'. > > > > > > > > > > > > > > > If users don't set any version, we will use "kafka-universal" > > > > > > instead. > > > > > > > The behavior is inconsistent IMO. > > > > > > > That is a long term vision when there is no kafka clusters > > > > > > with <0.11 > > > > > > > version. > > > > > > > At that point, "universal" is the only supported version in > > > > Flink > > > > > > and the > > > > > > > "version" key can be optional. > > > > > > > > > > > > > > --------------------------------------------- > > > > > > > > > > > > > > Hi Jingsong, > > > > > > > > > > > > > > > "version" vs "kafka.version" > > > > > > > I though about it. But if we prefix "kafka" to version, we > > > > should > > > > > > prefix > > > > > > > "kafka" for all other property keys, because they are all kafka > > > > > > specific > > > > > > > options. > > > > > > > However, that will make the property set verbose, see rejected > > > > > > option#2 in > > > > > > > the FLIP. > > > > > > > > > > > > > > > explicitly separate options for source and sink > > > > > > > That's a good topic. It's good to have a guideline for the new > > > > > > property > > > > > > > keys. > > > > > > > I'm fine to prefix with a "source"/"sink" for some connector > > > > > > keys. > > > > > > > Actually, we already do this in some connectors, e.g. jdbc and > > > > > > hbase. > > > > > > > > > > > > > > Best, > > > > > > > Jark > > > > > > > > > > > > > > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < > > > > jingsongl...@gmail.com > > > > > > <mailto:jingsongl...@gmail.com>> wrote: > > > > > > > > > > > > > > > Thanks Jark for the proposal. > > > > > > > > > > > > > > > > +1 to the general idea. > > > > > > > > > > > > > > > > For "version", what about "kafka.version"? It is obvious to > > > > > > know its > > > > > > > > meaning. > > > > > > > > > > > > > > > > And I'd like to start a new topic: > > > > > > > > Should we need to explicitly separate source from sink? > > > > > > > > With the development of batch and streaming, more and more > > > > > > connectors > > > > > > > have > > > > > > > > both source and sink. > > > > > > > > > > > > > > > > So should we set a rule for table properties: > > > > > > > > - properties for both source and sink: without prefix, like > > > > > > "topic" > > > > > > > > - properties for source only: with "source." prefix, like > > > > > > > > "source.startup-mode" > > > > > > > > - properties for sink only: with "sink." prefix, like > > > > > > "sink.partitioner" > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > Best, > > > > > > > > Jingsong Lee > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <imj...@gmail.com > > > > > > <mailto:imj...@gmail.com>> wrote: > > > > > > > > > > > > > > > > > Hi Kurt, > > > > > > > > > > > > > > > > > > That's good questions. > > > > > > > > > > > > > > > > > > > the meaning of "version" > > > > > > > > > There are two versions in the old design. One is property > > > > > > version > > > > > > > > > "connector.property-version" which can be used for backward > > > > > > > > compatibility. > > > > > > > > > The other one is "connector.version" which defines the > > > > > > version of > > > > > > > > external > > > > > > > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > > > > > > > > > In this proposal, the "version" is the previous > > > > > > "connector.version". > > > > > > > The > > > > > > > > > ""connector.property-version" is not introduced in new > > > > > > design. > > > > > > > > > > > > > > > > > > > how to keep the old capability which can evolve connector > > > > > > properties > > > > > > > > > The "connector.property-version" is an optional key in the > > > > > > old design > > > > > > > and > > > > > > > > > is never bumped up. > > > > > > > > > I'm not sure how "connector.property-version" should work > > > > > > in the > > > > > > > initial > > > > > > > > > design. Maybe @Timo Walther <twal...@apache.org > > > > > > <mailto:twal...@apache.org>> has more knowledge on > > > > > > > > > this. > > > > > > > > > But for the new properties, every options should be > > > > > > expressed as > > > > > > > > > `ConfigOption` which provides `withDeprecatedKeys(...)` > > > > > > method to > > > > > > > easily > > > > > > > > > support evolving keys. > > > > > > > > > > > > > > > > > > > a single keys instead of two, e.g. "kafka-0.11", > > > > > > "kafka-universal"? > > > > > > > > > There are several benefit to use separate "version" key I > > > > can > > > > > > see: > > > > > > > > > 1) it's more readable to separete them into different keys, > > > > > > because > > > > > > > they > > > > > > > > > are orthogonal concepts. > > > > > > > > > 2) the planner can give all the availble versions in the > > > > > > exception > > > > > > > > message, > > > > > > > > > if user uses a wrong version (this is often reported in > > > > > > user ML). > > > > > > > > > 3) If we use "kafka-0.11" as connector identifier, we may > > > > > > have to > > > > > > > write a > > > > > > > > > full documentation for each version, because they are > > > > > > different > > > > > > > > > "connector"? > > > > > > > > > IMO, for 0.11, 0.11, etc... kafka, they are actually > > > > > > the same > > > > > > > > connector > > > > > > > > > but with different "client jar" version, > > > > > > > > > they share all the same supported property keys and > > > > > > should reside > > > > > > > > > together. > > > > > > > > > 4) IMO, the future vision is version-free. At some point in > > > > > > the future, > > > > > > > > we > > > > > > > > > may don't need users to specify kafka version anymore, and > > > > > > make > > > > > > > > > "version=universal" as optional or removed in the future. > > > > > > This is can > > > > > > > be > > > > > > > > > done easily if they are separate keys. > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <ykt...@gmail.com > > > > > > <mailto:ykt...@gmail.com>> wrote: > > > > > > > > > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > > > > > > > > > Thanks for the proposal, I'm +1 to the general idea. > > > > > > However I have a > > > > > > > > > > question about "version", > > > > > > > > > > in the old design, the version seems to be aimed for > > > > > > tracking > > > > > > > property > > > > > > > > > > version, with different > > > > > > > > > > version, we could evolve these step by step without > > > > > > breaking backward > > > > > > > > > > compatibility. But in this > > > > > > > > > > design, version is representing external system's > > > > > > version, like > > > > > > > "0.11" > > > > > > > > > for > > > > > > > > > > kafka, "6" or "7" for ES. > > > > > > > > > > I'm not sure if this is necessary, what's the benefit of > > > > > > using two > > > > > > > keys > > > > > > > > > > instead of one, like "kafka-0.11" > > > > > > > > > > or "ES6" directly? And how about the old capability which > > > > > > could let > > > > > > > us > > > > > > > > > > evolving connector properties? > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > Kurt > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen > > > > > > <shenleifight...@gmail.com <mailto:shenleifight...@gmail.com>> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi Jark, > > > > > > > > > > > > > > > > > > > > > > I am really looking forward to this feature. I think > > > > this > > > > > > feature > > > > > > > > > > > could simplify flink sql code,and at the same time , > > > > > > > > > > > it could make the developer more easlier to config the > > > > > > flink sql > > > > > > > WITH > > > > > > > > > > > options. > > > > > > > > > > > > > > > > > > > > > > Now when I am using flink sql to write flink task , > > > > > > sometimes I > > > > > > > think > > > > > > > > > the > > > > > > > > > > > WITH options is too long for user. > > > > > > > > > > > For example,I config the kafka source connector > > > > > > parameter,for > > > > > > > > consumer > > > > > > > > > > > group and brokers parameter: > > > > > > > > > > > > > > > > > > > > > > 'connector.properties.0.key' = 'group.id > > > > > > <http://group.id>' > > > > > > > > > > > > , 'connector.properties.0.value' = 'xxx' > > > > > > > > > > > > , 'connector.properties.1.key' = > > > > > > 'bootstrap.servers' > > > > > > > > > > > > , 'connector.properties.1.value' = 'xxxxx' > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I can understand this config , but for the flink fresh > > > > > > man,maybe it > > > > > > > > > > > is confused for him. > > > > > > > > > > > In my thought, I am really looking forward to this > > > > > > feature,thank > > > > > > > you > > > > > > > > to > > > > > > > > > > > propose this feature. > > > > > > > > > > > > > > > > > > > > > > Best wishes, > > > > > > > > > > > LakeShen > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jark Wu <imj...@gmail.com <mailto:imj...@gmail.com>> 于 > > > > > > 2020年3月30日周一 下午2:02写道: > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > I want to start a discussion about further improve > and > > > > > > simplify > > > > > > > our > > > > > > > > > > > current > > > > > > > > > > > > connector porperty keys, aka WITH options. Currently, > > > > > > we have a > > > > > > > > > > > > 'connector.' prefix for many properties, but they are > > > > > > verbose, > > > > > > > and > > > > > > > > we > > > > > > > > > > > see a > > > > > > > > > > > > big inconsistency between the properties when > > > > designing > > > > > > FLIP-107. > > > > > > > > > > > > > > > > > > > > > > > > So we propose to remove all the 'connector.' prefix > > > > and > > > > > > rename > > > > > > > > > > > > 'connector.type' to 'connector', 'format.type' to > > > > > > 'format'. So a > > > > > > > > new > > > > > > > > > > > Kafka > > > > > > > > > > > > DDL may look like this: > > > > > > > > > > > > > > > > > > > > > > > > CREATE TABLE kafka_table ( > > > > > > > > > > > > ... > > > > > > > > > > > > ) WITH ( > > > > > > > > > > > > 'connector' = 'kafka', > > > > > > > > > > > > 'version' = '0.10', > > > > > > > > > > > > 'topic' = 'test-topic', > > > > > > > > > > > > 'startup-mode' = 'earliest-offset', > > > > > > > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > > > > > > > > > > > > 'properties.group.id <http://properties.group.id>' = > > > > > > 'testGroup', > > > > > > > > > > > > 'format' = 'json', > > > > > > > > > > > > 'format.fail-on-missing-field' = 'false' > > > > > > > > > > > > ); > > > > > > > > > > > > > > > > > > > > > > > > The new connector property key set will come together > > > > > > with new > > > > > > > > > Factory > > > > > > > > > > > > inferface which is proposed in FLIP-95. Old > properties > > > > > > are still > > > > > > > > > > > compatible > > > > > > > > > > > > with their existing implementation. New properties > > > > > > are only > > > > > > > > available > > > > > > > > > > in > > > > > > > > > > > > new DynamicTableFactory implementations. > > > > > > > > > > > > > > > > > > > > > > > > You can access the detailed FLIP here: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > Jark > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Best, Jingsong Lee > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Benchao Li > > > > > > School of Electronics Engineering and Computer Science, Peking > > > > > > University > > > > > > Tel:+86-15650713730 > > > > > > Email:libenc...@gmail.com > > > > > > <mailto:libenc...@gmail.com>;libenc...@pku.edu.cn > > > > > > <mailto:libenc...@pku.edu.cn> > > > > > > > > > > > > > > > > > > >