Hi all,

I like the overall design of the FLIP.

As for the withstanding concerns. I kind of like the approach to put the
version into the factory identifier. I think it's the cleanest way to
say that this version actually applies to the connector itself and not
to the system it connects to. BTW, I think the outcome of this
discussion will affect interfaces described in FLIP-95. If we put the
version into the functionIdentifier, do we need Factory#factoryVersion?

I also think it does make sense to have a versioning for the properties
as well. Are we able to read all the current properties with the new
factories? I think we could use the "connector.property-version" to
alternate between different Factory interfaces to support the old set of
properties. Otherwise the new factories need to understand both set of
properties, don't they?

Best,

Dawid

On 30/03/2020 17:07, Timo Walther wrote:
> Hi Jark,
>
> thanks for the FLIP. I don't have a strong opinion on
> DynamicTableFactory#factoryVersion() for distiguishing connector
> versions. We can also include it in the factory identifier itself. For
> Kafka, I would then just use `kafka` because the naming "universal"
> was just a helper solution at that time.
>
> What we need is a good guide for how to design the options. We should
> include that in the JavaDoc of factory interface itself, once it is
> in-place. I like the difference between source and sink in properties.
>
> Regarding "connector.property-version":
>
> It was meant for backwards compatibility along the dimension of
> "property design" compared to "connector version". However, since the
> connector is now responsible for parsing all properties, it is not as
> necessary as it was before.
>
> However, a change of the property design as it is done in FLIP-107
> becomes more difficult without a property version and versioning is
> always a good idea in API world.
>
> Regards,
> Timo
>
>
> On 30.03.20 16:38, Benchao Li wrote:
>> Hi Jark,
>>
>> Thanks for starting the discussion. The FLIP LGTM generally.
>>
>> Regarding connector version VS put version into connector's name,
>> I favor the latter personally, using one property to locate a
>> connector can make the error log more precise.
>>  From the users' side, using one property to match a connector will
>> be easier. Especially we have many connectors,
>> and some of the need version property required, and some of them not.
>>
>> Regarding Jingsong's suggestion,
>> IMO, it's a very good complement to the FLIP. Distinguishing
>> properties for source and sink can be very useful, and
>> also this will make the connector property more precise.
>> We are also sick of this for now, we cannot know whether a DDL is a
>> source or sink unless we look through all queries where
>> the table is used.
>> Even more, some of the required properties are only required for
>> source, bug we cannot leave it blank for sink, and vice versa.
>> I think we can also add a type for dimension tables except source and
>> sink.
>>
>> Kurt Young <ykt...@gmail.com <mailto:ykt...@gmail.com>> 于2020年3月30日
>> 周一 下午8:16写道:
>>
>>      > It's not possible for the framework to throw such exception.
>>     Because the
>>     framework doesn't know what versions do the connector support.
>>
>>     Not really, we can list all valid connectors framework could
>> found. E.g.
>>     user mistyped 'kafka-0.x', the error message will looks like:
>>
>>     we don't have any connector named "kafka-0.x", but we have:
>>     FileSystem
>>     Kafka-0.10
>>     Kafka-0.11
>>     ElasticSearch6
>>     ElasticSearch7
>>
>>     Best,
>>     Kurt
>>
>>
>>     On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <imj...@gmail.com
>>     <mailto:imj...@gmail.com>> wrote:
>>
>>      > Hi Kurt,
>>      >
>>      > > 2) Lists all available connectors seems also quite
>>     straightforward, e.g
>>      > user provided a wrong "kafka-0.8", we tell user we have
>> candidates of
>>      > "kakfa-0.11", "kafka-universal"
>>      > It's not possible for the framework to throw such exception.
>>     Because the
>>      > framework doesn't know what versions do the connector support.
>>     All the
>>      > version information is a blackbox in the identifier. But with
>>      > `Factory#factoryVersion()` interface, we can know all the
>> supported
>>      > versions.
>>      >
>>      > > 3) I don't think so. We can still treat it as the same
>>     connector but with
>>      > different versions.
>>      > That's true but that's weird. Because from the plain DDL
>>     definition, they
>>      > look like different connectors with different "connector"
>> value, e.g.
>>      > 'connector=kafka-0.8', 'connector=kafka-0.10'.
>>      >
>>      > > If users don't set any version, we will use "kafka-universal"
>>     instead.
>>      > The behavior is inconsistent IMO.
>>      > That is a long term vision when there is no kafka clusters
>> with <0.11
>>      > version.
>>      > At that point, "universal" is the only supported version in Flink
>>     and the
>>      > "version" key can be optional.
>>      >
>>      > ---------------------------------------------
>>      >
>>      > Hi Jingsong,
>>      >
>>      > > "version" vs "kafka.version"
>>      > I though about it. But if we prefix "kafka" to version, we should
>>     prefix
>>      > "kafka" for all other property keys, because they are all kafka
>>     specific
>>      > options.
>>      > However, that will make the property set verbose, see rejected
>>     option#2 in
>>      > the FLIP.
>>      >
>>      > > explicitly separate options for source and sink
>>      > That's a good topic. It's good to have a guideline for the new
>>     property
>>      > keys.
>>      > I'm fine to prefix with a "source"/"sink" for some connector
>> keys.
>>      > Actually, we already do this in some connectors, e.g. jdbc and
>> hbase.
>>      >
>>      > Best,
>>      > Jark
>>      >
>>      > On Mon, 30 Mar 2020 at 16:36, Jingsong Li <jingsongl...@gmail.com
>>     <mailto:jingsongl...@gmail.com>> wrote:
>>      >
>>      > > Thanks Jark for the proposal.
>>      > >
>>      > > +1 to the general idea.
>>      > >
>>      > > For "version", what about "kafka.version"? It is obvious to
>>     know its
>>      > > meaning.
>>      > >
>>      > > And I'd like to start a new topic:
>>      > > Should we need to explicitly separate source from sink?
>>      > > With the development of batch and streaming, more and more
>>     connectors
>>      > have
>>      > > both source and sink.
>>      > >
>>      > > So should we set a rule for table properties:
>>      > > - properties for both source and sink: without prefix, like
>> "topic"
>>      > > - properties for source only: with "source." prefix, like
>>      > > "source.startup-mode"
>>      > > - properties for sink only: with "sink." prefix, like
>>     "sink.partitioner"
>>      > >
>>      > > What do you think?
>>      > >
>>      > > Best,
>>      > > Jingsong Lee
>>      > >
>>      > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <imj...@gmail.com
>>     <mailto:imj...@gmail.com>> wrote:
>>      > >
>>      > > > Hi Kurt,
>>      > > >
>>      > > > That's good questions.
>>      > > >
>>      > > > > the meaning of "version"
>>      > > > There are two versions in the old design. One is property
>> version
>>      > > > "connector.property-version" which can be used for backward
>>      > > compatibility.
>>      > > > The other one is "connector.version" which defines the
>> version of
>>      > > external
>>      > > > system, e.g. 0.11" for kafka, "6" or "7" for ES.
>>      > > > In this proposal, the "version" is the previous
>>     "connector.version".
>>      > The
>>      > > > ""connector.property-version" is not introduced in new
>> design.
>>      > > >
>>      > > > > how to keep the old capability which can evolve connector
>>     properties
>>      > > > The "connector.property-version" is an optional key in the
>>     old design
>>      > and
>>      > > > is never bumped up.
>>      > > > I'm not sure how "connector.property-version" should work
>> in the
>>      > initial
>>      > > > design. Maybe @Timo Walther <twal...@apache.org
>>     <mailto:twal...@apache.org>> has more knowledge on
>>      > > > this.
>>      > > > But for the new properties, every options should be
>> expressed as
>>      > > > `ConfigOption` which provides `withDeprecatedKeys(...)`
>> method to
>>      > easily
>>      > > > support evolving keys.
>>      > > >
>>      > > > > a single keys instead of two, e.g. "kafka-0.11",
>>     "kafka-universal"?
>>      > > > There are several benefit to use separate "version" key I can
>>     see:
>>      > > > 1) it's more readable to separete them into different keys,
>>     because
>>      > they
>>      > > > are orthogonal concepts.
>>      > > > 2) the planner can give all the availble versions in the
>>     exception
>>      > > message,
>>      > > > if user uses a wrong version (this is often reported in
>> user ML).
>>      > > > 3) If we use "kafka-0.11" as connector identifier, we may
>> have to
>>      > write a
>>      > > > full documentation for each version, because they are
>> different
>>      > > > "connector"?
>>      > > >     IMO, for 0.11, 0.11, etc... kafka, they are actually
>> the same
>>      > > connector
>>      > > > but with different "client jar" version,
>>      > > >     they share all the same supported property keys and
>>     should reside
>>      > > > together.
>>      > > > 4) IMO, the future vision is version-free. At some point in
>>     the future,
>>      > > we
>>      > > > may don't need users to specify kafka version anymore, and
>> make
>>      > > > "version=universal" as optional or removed in the future.
>>     This is can
>>      > be
>>      > > > done easily if they are separate keys.
>>      > > >
>>      > > > Best,
>>      > > > Jark
>>      > > >
>>      > > >
>>      > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young <ykt...@gmail.com
>>     <mailto:ykt...@gmail.com>> wrote:
>>      > > >
>>      > > > > Hi Jark,
>>      > > > >
>>      > > > > Thanks for the proposal, I'm +1 to the general idea.
>>     However I have a
>>      > > > > question about "version",
>>      > > > > in the old design, the version seems to be aimed for
>> tracking
>>      > property
>>      > > > > version, with different
>>      > > > > version, we could evolve these step by step without
>>     breaking backward
>>      > > > > compatibility. But in this
>>      > > > > design, version is representing external system's
>> version, like
>>      > "0.11"
>>      > > > for
>>      > > > > kafka, "6" or "7" for ES.
>>      > > > > I'm not sure if this is necessary, what's the benefit of
>>     using two
>>      > keys
>>      > > > > instead of one, like "kafka-0.11"
>>      > > > > or "ES6" directly? And how about the old capability which
>>     could let
>>      > us
>>      > > > > evolving connector properties?
>>      > > > >
>>      > > > > Best,
>>      > > > > Kurt
>>      > > > >
>>      > > > >
>>      > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen
>>     <shenleifight...@gmail.com <mailto:shenleifight...@gmail.com>>
>>      > > > > wrote:
>>      > > > >
>>      > > > > > Hi Jark,
>>      > > > > >
>>      > > > > > I am really looking forward to this feature. I think this
>>     feature
>>      > > > > > could simplify flink sql code,and at the same time ,
>>      > > > > > it could make the developer more easlier to config the
>>     flink sql
>>      > WITH
>>      > > > > > options.
>>      > > > > >
>>      > > > > > Now when I am using flink sql to write flink task ,
>>     sometimes I
>>      > think
>>      > > > the
>>      > > > > > WITH options is too long for user.
>>      > > > > > For example,I config the kafka source connector
>>     parameter,for
>>      > > consumer
>>      > > > > > group and brokers parameter:
>>      > > > > >
>>      > > > > >           'connector.properties.0.key' = 'group.id
>>     <http://group.id>'
>>      > > > > > >          , 'connector.properties.0.value' = 'xxx'
>>      > > > > > >          , 'connector.properties.1.key' =
>>     'bootstrap.servers'
>>      > > > > > >          , 'connector.properties.1.value' = 'xxxxx'
>>      > > > > > >
>>      > > > > >
>>      > > > > > I can understand this config , but for the flink fresh
>>     man,maybe it
>>      > > > > > is confused for him.
>>      > > > > > In my thought, I am really looking forward to this
>>     feature,thank
>>      > you
>>      > > to
>>      > > > > > propose this feature.
>>      > > > > >
>>      > > > > > Best wishes,
>>      > > > > > LakeShen
>>      > > > > >
>>      > > > > >
>>      > > > > > Jark Wu <imj...@gmail.com <mailto:imj...@gmail.com>> 于
>>     2020年3月30日周一 下午2:02写道:
>>      > > > > >
>>      > > > > > > Hi everyone,
>>      > > > > > >
>>      > > > > > > I want to start a discussion about further improve and
>>     simplify
>>      > our
>>      > > > > > current
>>      > > > > > > connector porperty keys, aka WITH options. Currently,
>>     we have a
>>      > > > > > > 'connector.' prefix for many properties, but they are
>>     verbose,
>>      > and
>>      > > we
>>      > > > > > see a
>>      > > > > > > big inconsistency between the properties when designing
>>     FLIP-107.
>>      > > > > > >
>>      > > > > > > So we propose to remove all the 'connector.' prefix and
>>     rename
>>      > > > > > > 'connector.type' to 'connector', 'format.type' to
>>     'format'. So a
>>      > > new
>>      > > > > > Kafka
>>      > > > > > > DDL may look like this:
>>      > > > > > >
>>      > > > > > > CREATE TABLE kafka_table (
>>      > > > > > >  ...
>>      > > > > > > ) WITH (
>>      > > > > > >  'connector' = 'kafka',
>>      > > > > > >  'version' = '0.10',
>>      > > > > > >  'topic' = 'test-topic',
>>      > > > > > >  'startup-mode' = 'earliest-offset',
>>      > > > > > >  'properties.bootstrap.servers' = 'localhost:9092',
>>      > > > > > >  'properties.group.id <http://properties.group.id>' =
>>     'testGroup',
>>      > > > > > >  'format' = 'json',
>>      > > > > > >  'format.fail-on-missing-field' = 'false'
>>      > > > > > > );
>>      > > > > > >
>>      > > > > > > The new connector property key set will come together
>>     with new
>>      > > > Factory
>>      > > > > > > inferface which is proposed in FLIP-95. Old properties
>>     are still
>>      > > > > > compatible
>>      > > > > > > with their existing implementation. New properties
>> are only
>>      > > available
>>      > > > > in
>>      > > > > > > new DynamicTableFactory implementations.
>>      > > > > > >
>>      > > > > > > You can access the detailed FLIP here:
>>      > > > > > >
>>      > > > > > >
>>      > > > > >
>>      > > > >
>>      > > >
>>      > >
>>      >
>>    
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
>>      > > > > > >
>>      > > > > > > Best,
>>      > > > > > > Jark
>>      > > > > > >
>>      > > > > >
>>      > > > >
>>      > > >
>>      > >
>>      > >
>>      > > --
>>      > > Best, Jingsong Lee
>>      > >
>>      >
>>
>>
>>
>> -- 
>>
>> Benchao Li
>> School of Electronics Engineering and Computer Science, Peking
>> University
>> Tel:+86-15650713730
>> Email:libenc...@gmail.com 
>> <mailto:libenc...@gmail.com>;libenc...@pku.edu.cn 
>> <mailto:libenc...@pku.edu.cn>
>>
>

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to