Hi Jing,

Thanks for the reminder. The aim of this flip is letting the sql users to
use those features in the Datastream API, we don't intend to extend
flip-217. In my opinion, the watermark alignment with only one source can
be configured by the options given in flip, and if the source connector
does not implement flip-217, the task will run with an error, reminding the
user to use `pipeline.watermark-alignment.allow- unaligned-source-splits`,
but maybe these behaviors are not understood by everyone, I will add some
tips about flip-217 in the flip to let users understand the behavior in the
case of source splits.


Best,

Kui Yuan

Jing Ge <j...@ververica.com.invalid> 于2023年3月7日周二 04:23写道:

> Hi Kui,
>
> Thanks for pointing that out. I knew FLIP-217 which was done by an
> engineer working in my team.  As far as I am concerned, your FLIP should
> answer the following questions:
>
> 1. How to enable the watermark alignment of a source splits with Flink SQL?
> e.g. which options should be used if only one source is used?
>
> 2. Default behaviour. i.e. Flink SQL users should be aware that watermark
> alignment of source split will only work for sources that implement
> FLIP-217 properly. Should users take care of
> `pipeline.watermark-alignment.allow-unaligned-source-splits`
> while using Flink SQL?
>
> Best regards,
> Jing
>
>
> On Fri, Mar 3, 2023 at 8:46 AM Kui Yuan <catye...@gmail.com> wrote:
>
> > Hi all,
> >
> > Thanks for all. There are more questions and I will answer one by one.
> >
> > @Jark Thanks for your tips. For the first question, I will add more
> details
> > in the flip, and give a POC[1] so that pepole can know how I'm currently
> > implementing these features.
> >
> > > IIRC, this is the first time we introduce the framework-level connector
> > > options that the option is not recognized and handled by connectors.
> > > The FLIP should cover how framework filters the watermark related
> options
> > > to avoid discover connector factory failed, and what happens if the
> > > connector already supported the conflict options
> >
> > For the second question, We know that the default strategy is
> 'on-periodic'
> > in SQL layer, and the default interval is 200ms. The reason for emiting
> > watermark periodically is that the time advancement of consecutive events
> > may be very small, we don't need to calculate watermark for each event.
> > Same for 'on-event' strategy, so my idea is that we can set a fixed gap
> for
> > 'on-event' strategy.
> >
> > > I'm not sure about the usage scenarios of event gap emit strategy. Do
> > > you have any specific use case of this strategy? I'm confused why no
> one
> > > requested this strategy before no matter in DataStream or SQL, but
> maybe
> > > I missed something. I'm not against to add this option, but just want
> to
> > be
> > > careful when adding new API because it's hard to remove in the future.
> >
> > As @Timo said, There is no default features like 'on-event-gap' in
> > DataStream API, but the users can achieve the 'on-event-gap' feature by
> > using `WatermarkGenerator` interface, just like the implemention in my
> > POC[1]. However, If we don't provide it  in SQL layer, there is no way
> for
> > users to use similar features.
> >
> > > Jark raised a very good point. I thought we only expose what is
> > > contained in DataStream API already. If this strategy is not part of
> > > DataStream API, would like to exclude it from the FLIP. We need to be
> > > careful which strategies we offer by default.
> >
> > @Jark @Timo I'm sorry, perhaps I don't understand what are your concerns
> > about CompiledPlan, maybe I missed something else, maybe you can look at
> my
> > POC first to see if there is somewhere to worry about.
> >
> > > Sorry, I forgot to remind you that Timo's concern about the changes to
> > the
> > > CompiledPlan looks like is still not covered in the FLIP.
> >
> > @Jing We could have more discussion about naming, but I prefer that the
> > naming should be consistent with the DataStream API.
> > About aligning splits/partitions/shards, maybe you missed FLIP-217[2]
> which
> > aims to support watermark alignment of source splits.
> >
> > > After reading the most up-to-date Flip, I didn't find any information
> if
> > > this solution will support aligning splits/partitions/shards [1]. Did I
> > > miss anything?
> >
> > Best
> > Kui Yuan
> >
> > [1] the POC:
> > https://github.com/yuchengxin/flink/tree/yuankui/watermark_params
> > [2] FLIP-217:
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits
> >
> >
> > Jing Ge <j...@ververica.com.invalid> 于2023年3月3日周五 08:03写道:
> >
> > > Hi,
> > >
> > > Thanks Kui for driving this Flip and thanks all for the informative
> > > discussion.
> > >
> > > @Timo
> > >
> > > Your suggestion about the naming convention is excellent. Thanks! I was
> > > wondering why you, exceptionally, suggested 'scan.idle-timeout' instead
> > of
> > > 'scan.watermark.idle-timeout'. I must miss something here.
> > >
> > > There is one more NIT. I am just aware that "drift" is used for the
> > > watermark alignment. It seems to be fine while using DataStream API,
> > > because we will not really see it. But with the OPTIONS in SQL, a much
> > > bigger group of users (including SRE, tech support, etc) will see the
> > word
> > > "drift". Given that "drift" wasn't used widely yet and with all
> training
> > > materials, Flink doc [1][2][3] (search with "lag"), "lag" has been used
> > to
> > > describe timestamp difference between watermark and its
> > > corresponding event. Do we really need to introduce another term for
> the
> > > same thing? How about using 'scan.watermark.alignment.max-lag'='1min'
> and
> > > change the parameter name from maxAllowedWatermarkDrift to
> > > maxAllowedWatermarkLag [4] because of naming consistency? Just my two
> > cents
> > > worth.
> > >
> > > @Kui
> > >
> > > After reading the most up-to-date Flip, I didn't find any information
> if
> > > this solution will support aligning splits/partitions/shards [1]. Did I
> > > miss anything?
> > >
> > > +1 for the concern about Table API. We'd be better keep Table API and
> SQL
> > > synced for new features.
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment-_beta_
> > > [2]
> > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/event-time/built_in/#fixed-amount-of-lateness
> > >
> > > [3]
> > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/datastream/kafka/
> > > [4]
> > >
> > >
> >
> https://github.com/apache/flink/blob/4aacff572a9e3996c5dee9273638831e4040c767/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarkStrategy.java#L169
> > >
> > >
> > >
> > > On Wed, Mar 1, 2023 at 3:54 PM Timo Walther <twal...@apache.org>
> wrote:
> > >
> > > > Reg. 2:
> > > >  > event gap emit strategy [...] no matter in DataStream or SQL
> > > >
> > > > Jark raised a very good point. I thought we only expose what is
> > > > contained in DataStream API already. If this strategy is not part of
> > > > DataStream API, would like to exclude it from the FLIP. We need to be
> > > > careful which strategies we offer by default.
> > > >
> > > > Reg 1:
> > > > This already has a JIRA ticket with additional thoughts on this
> topic:
> > > > https://issues.apache.org/jira/browse/FLINK-25221
> > > >
> > > > Regards,
> > > > Timo
> > > >
> > > >
> > > >
> > > > On 01.03.23 12:31, Jark Wu wrote:
> > > > > Sorry, I forgot to remind you that Timo's concern about the changes
> > to
> > > > the
> > > > > CompiledPlan looks like is still not covered in the FLIP.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > On Wed, 1 Mar 2023 at 19:28, Jark Wu <imj...@gmail.com> wrote:
> > > > >
> > > > >> Hi Kui,
> > > > >>
> > > > >> Thank you for the great proposal, I think this is already in a
> good
> > > > shape.
> > > > >>
> > > > >> Just a kind reminder, according to the community guidelines[1],
> > > > >> if there are unresponsive reviewers, a typical reasonable time
> > > > >> to wait for responses is one week, but be pragmatic about it.
> > > > >>
> > > > >> Regarding the FLIP, I have some comments below:
> > > > >>
> > > > >> 1. IIRC, this is the first time we introduce the framework-level
> > > > connector
> > > > >> options that the option is not recognized and handled by
> connectors.
> > > > >> The FLIP should cover how framework filters the watermark related
> > > > options
> > > > >> to avoid discover connector factory failed, and what happens if
> the
> > > > >> connector
> > > > >> already supported the conflict options.
> > > > >>
> > > > >> 2. I'm not sure about the usage scenarios of event gap emit
> > strategy.
> > > Do
> > > > >> you have any specific use case of this strategy? I'm confused why
> no
> > > one
> > > > >> requested this strategy before no matter in DataStream or SQL, but
> > > maybe
> > > > >> I missed something. I'm not against to add this option, but just
> > want
> > > to
> > > > >> be
> > > > >> careful when adding new API because it's hard to remove in the
> > future.
> > > > >>
> > > > >>
> > > > >> 3. Adding a "Public Interface"[2] section to summarize the
> > > > >> proposed APIs and options would be better for developers to
> > > > >> know the impact. Currently, the APIs are scattered in the long
> > > > >> design sections.
> > > > >>
> > > > >> Best,
> > > > >> Jark
> > > > >>
> > > > >>
> > > > >> [1]:
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> > > > >> [2]:
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template
> > > > >>
> > > > >> On Wed, 1 Mar 2023 at 16:56, Kui Yuan <catye...@gmail.com> wrote:
> > > > >>
> > > > >>> Hi all,
> > > > >>>
> > > > >>> Thanks for all discussions!
> > > > >>>
> > > > >>> Anyone else have questions or suggestions? if not, I will start a
> > > vote
> > > > >>> thread later.
> > > > >>>
> > > > >>> Best
> > > > >>> Kui Yuan
> > > > >>>
> > > > >>> kui yuan <catye...@gmail.com> 于2023年2月27日周一 20:21写道:
> > > > >>>
> > > > >>>> Hi Timo,
> > > > >>>>
> > > > >>>> Thanks for your advice. I totally agree with your suggestion of
> > > naming
> > > > >>>> convention, I will rename these options and update the flip
> later,
> > > > >>> thanks
> > > > >>>> very much.
> > > > >>>>
> > > > >>>> In our internal implementation we had put these options inside
> the
> > > > >>>> `FactoryUtil`, just as you expect.  We have also taken into
> > account
> > > > the
> > > > >>>> changes to the CompiledPlan and we have packaged these options
> > > > >>>> appropriately to minimize intrusiveness and ensure the
> > compatibility
> > > > to
> > > > >>> the
> > > > >>>> `WatermarkPushDownSpec`.
> > > > >>>>
> > > > >>>>> A hint to the implementation: I would suggest that we add those
> > > > >>> options
> > > > >>>>> to `FactoryUtil`. All cross-connector options should end up
> > there.
> > > > >>>>
> > > > >>>>
> > > > >>>>> Please also consider the changes to the CompiledPlan in your
> > FLIP.
> > > > >>> This
> > > > >>>>> change has implications on the JSON format as watermark
> strategy
> > of
> > > > >>>>> ExecNode becomes more complex, see WatermarkPushDownSpec
> > > > >>>>
> > > > >>>> Best
> > > > >>>> Kui Yuan
> > > > >>>>
> > > > >>>> Timo Walther <twal...@apache.org> 于2023年2月27日周一 18:05写道:
> > > > >>>>
> > > > >>>>> Hi Kui Yuan,
> > > > >>>>>
> > > > >>>>> thanks for working on this FLIP. Let me also give some comments
> > > about
> > > > >>>>> the proposed changes.
> > > > >>>>>
> > > > >>>>> I support the direction of this FLIP about handling these
> > > > >>>>> watermark-specific properties through options and
> /*+OPTIONS(...)
> > > */
> > > > >>>>> hints.
> > > > >>>>>
> > > > >>>>> Regarding naming, I would like to keep the options in sync with
> > > > >>> existing
> > > > >>>>> options:
> > > > >>>>>
> > > > >>>>>   > 'watermark.emit.strategy'='ON_EVENT'
> > > > >>>>>
> > > > >>>>> Let's use lower case (e.g. `on-event`) that matches with
> > properties
> > > > >>> like
> > > > >>>>> sink.partitioner [1] or sink.delivery-guarantee [2].
> > > > >>>>>
> > > > >>>>>   > 'source.idle-timeout'='1min'
> > > > >>>>>
> > > > >>>>> According to FLIP-122 [3], we want to prefix all scan-source
> > > related
> > > > >>>>> properties with `scan.*`. This clearly includes idle-timeout
> and
> > > > >>>>> actually also watermark strategies which don't apply for lookup
> > > > >>> sources.
> > > > >>>>>
> > > > >>>>> Summarizing the comments above, we should use the following
> > > options:
> > > > >>>>>
> > > > >>>>> 'scan.watermark.emit.strategy'='on-event',
> > > > >>>>> 'scan.watermark.emit.on-event.gap'='10000',
> > > > >>>>> 'scan.idle-timeout'='1min',
> > > > >>>>> 'scan.watermark.alignment.group'='alignment-group-1',
> > > > >>>>> 'scan.watermark.alignment.max-drift'='1min',
> > > > >>>>> 'scan.watermark.alignment.update-interval'='1s'
> > > > >>>>>
> > > > >>>>> I know that this makes the keys even longer, but given that
> those
> > > > >>>>> options are for power users this should be acceptable. It also
> > > > clearly
> > > > >>>>> indicates which options are for sinks, scans, and lookups. This
> > > > >>>>> potentially also helps in allow lists.
> > > > >>>>>
> > > > >>>>> A hint to the implementation: I would suggest that we add those
> > > > options
> > > > >>>>> to `FactoryUtil`. All cross-connector options should end up
> > there.
> > > > >>>>>
> > > > >>>>> Please also consider the changes to the CompiledPlan in your
> > FLIP.
> > > > This
> > > > >>>>> change has implications on the JSON format as watermark
> strategy
> > of
> > > > >>>>> ExecNode becomes more complex, see WatermarkPushDownSpec [4].
> > > > >>>>>
> > > > >>>>> Regards,
> > > > >>>>> Timo
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> [1]
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html#sink-partitioner
> > > > >>>>> [2]
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/kafka/#sink-delivery-guarantee
> > > > >>>>> [3]
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> > > > >>>>> [4]
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/abilities/source/WatermarkPushDownSpec.java
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On 24.02.23 04:55, kui yuan wrote:
> > > > >>>>>> Hi all,
> > > > >>>>>>
> > > > >>>>>> I have updated the flip according to the discussion, and we
> will
> > > > >>> extend
> > > > >>>>> the
> > > > >>>>>> watermark-related features with both table options and
> 'OPTIONS'
> > > > >>> hint,
> > > > >>>>> like
> > > > >>>>>> this:
> > > > >>>>>>
> > > > >>>>>> ```
> > > > >>>>>> -- configure in table options
> > > > >>>>>> CREATE TABLE user_actions (
> > > > >>>>>>     ...
> > > > >>>>>>     user_action_time TIMESTAMP(3),
> > > > >>>>>>     WATERMARK FOR user_action_time AS user_action_time -
> > INTERVAL
> > > > '5'
> > > > >>>>> SECOND
> > > > >>>>>> ) WITH (
> > > > >>>>>>     'watermark.emit.strategy'='ON_PERIODIC',
> > > > >>>>>>     ...
> > > > >>>>>> );
> > > > >>>>>>
> > > > >>>>>> -- use 'OPTIONS' hint
> > > > >>>>>> select ... from source_table /*+
> > > OPTIONS('watermark.emit.strategy'=
> > > > >>>>>> 'ON_PERIODIC') */
> > > > >>>>>> ```
> > > > >>>>>>
> > > > >>>>>> Does everybody have any other questions?
> > > > >>>>>>
> > > > >>>>>> Best
> > > > >>>>>> Kui Yuan
> > > > >>>>>>
> > > > >>>>>> kui yuan <catye...@gmail.com> 于2023年2月23日周四 20:05写道:
> > > > >>>>>>
> > > > >>>>>>> Hi all,
> > > > >>>>>>>
> > > > >>>>>>> Thanks for all suggestions.
> > > > >>>>>>>
> > > > >>>>>>> We will extend the watermark-related features in SQL layer
> with
> > > > >>> dynamic
> > > > >>>>>>> table options and 'OPTIONS' hint, just as everyone expects. I
> > > will
> > > > >>>>> modify
> > > > >>>>>>> Flip-296 as discussed.
> > > > >>>>>>>
> > > > >>>>>>> @Martijn As far as I know, there is no hint interface in the
> > > table
> > > > >>> API,
> > > > >>>>>>> so we can't use hint in table API directly. if we need to
> > extend
> > > > the
> > > > >>>>> hint
> > > > >>>>>>> interface in the table API, maybe we need another flip.
> > However,
> > > if
> > > > >>> we
> > > > >>>>>>> extend the watermark-related features in the dynamic table
> > > options,
> > > > >>>>> maybe
> > > > >>>>>>> we are able to use them indirectly in the table API like
> > this[1]:
> > > > >>>>>>>
> > > > >>>>>>> ```
> > > > >>>>>>> // register a table named "Orders"
> > > > >>>>>>> tableEnv.executeSql("CREATE TABLE Orders (`user` BIGINT,
> > product
> > > > >>>>> STRING,
> > > > >>>>>>> amount INT) WITH ('watermark.emit.strategy'='ON_EVENT'...)");
> > > > >>>>>>> ```
> > > > >>>>>>>
> > > > >>>>>>> [1]
> > > > >>>>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/
> > > > >>>>>>>
> > > > >>>>>>> Best
> > > > >>>>>>> Kui Yuan
> > > > >>>>>>>
> > > > >>>>>>> Yun Tang <myas...@live.com> 于2023年2月23日周四 17:46写道:
> > > > >>>>>>>
> > > > >>>>>>>> Thanks for the warm discussions!
> > > > >>>>>>>>
> > > > >>>>>>>> I had an offline discussion with Kui about the replies. I
> > think
> > > I
> > > > >>>>> could
> > > > >>>>>>>> give some explanations on the original intention to
> introduce
> > > > >>> another
> > > > >>>>>>>> WATERMARK_PARAMS. If we take a look at the current
> datastream
> > > API,
> > > > >>> the
> > > > >>>>>>>> watermark strategy does not belong to any specific
> connector.
> > > And
> > > > >>> we
> > > > >>>>>>>> thought the dynamic table options were more like the
> > > > configurations
> > > > >>>>> within
> > > > >>>>>>>> some specific connector.
> > > > >>>>>>>>
> > > > >>>>>>>>   From the review comments, I think most people feel good to
> > > make
> > > > it
> > > > >>>>> part
> > > > >>>>>>>> of the dynamic table options. I think this is fine if we
> give
> > > more
> > > > >>>>> clear
> > > > >>>>>>>> scope definition of the dynamic table options here. And I
> also
> > > > >>> agree
> > > > >>>>> with
> > > > >>>>>>>> Jingsong's concern about adding SQL syntax which is the most
> > > > >>>>> concerning
> > > > >>>>>>>> part before launching this discussion.
> > > > >>>>>>>>
> > > > >>>>>>>> For Martijn's concern, if we accept to make the
> > > watermark-related
> > > > >>>>> options
> > > > >>>>>>>> part of dynamic table options, the problem becomes another
> > > topic:
> > > > >>> how
> > > > >>>>> to
> > > > >>>>>>>> support the dynamic table options in table API, which is
> > > deserved
> > > > >>> to
> > > > >>>>> create
> > > > >>>>>>>> another FLIP.
> > > > >>>>>>>>
> > > > >>>>>>>> Best
> > > > >>>>>>>> Yun Tang
> > > > >>>>>>>> ________________________________
> > > > >>>>>>>> From: Martijn Visser <martijnvis...@apache.org>
> > > > >>>>>>>> Sent: Thursday, February 23, 2023 17:14
> > > > >>>>>>>> To: dev@flink.apache.org <dev@flink.apache.org>
> > > > >>>>>>>> Subject: Re: [DISCUSS] FLIP-296: Watermark options for table
> > > API &
> > > > >>> SQL
> > > > >>>>>>>>
> > > > >>>>>>>> Hi,
> > > > >>>>>>>>
> > > > >>>>>>>> While I can understand that there's a desire to first focus
> on
> > > > >>> solving
> > > > >>>>>>>> this
> > > > >>>>>>>> problem for SQL, I do wonder if we should ignore the Table
> API
> > > at
> > > > >>> this
> > > > >>>>>>>> point. If we could include the syntax for the Table API, it
> > > > >>>>> potentially
> > > > >>>>>>>> could also be implemented by another contributor without
> > needing
> > > > to
> > > > >>>>> create
> > > > >>>>>>>> another FLIP. If we don't design it right now, my concern is
> > > that
> > > > >>> this
> > > > >>>>>>>> will
> > > > >>>>>>>> increase sparsity for the Table API which ultimately hurts
> > > > >>> adoption.
> > > > >>>>>>>>
> > > > >>>>>>>> With regards to the syntax, I have a preference to solve
> this
> > > via
> > > > >>> the
> > > > >>>>>>>> connector options (e.g. like you can currently specify
> things
> > as
> > > > >>>>>>>> scan.startup.specific-offsets or scan.bounded.mode for the
> > Kafka
> > > > >>>>>>>> connector). You could still use the dynamic table options to
> > > > >>>>> override/add
> > > > >>>>>>>> them.
> > > > >>>>>>>>
> > > > >>>>>>>> Best regards,
> > > > >>>>>>>>
> > > > >>>>>>>> Martijn
> > > > >>>>>>>>
> > > > >>>>>>>> On Thu, Feb 23, 2023 at 7:21 AM Shammon FY <
> zjur...@gmail.com
> > >
> > > > >>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi kui
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for your answer and +1 to yuxia too
> > > > >>>>>>>>>
> > > > >>>>>>>>>> we should not bind the watermark-related options to a
> > > connector
> > > > >>> to
> > > > >>>>>>>> ensure
> > > > >>>>>>>>> semantic clarity.
> > > > >>>>>>>>>
> > > > >>>>>>>>> In my opinion, adding watermark-related options to a
> > connector
> > > is
> > > > >>>>> much
> > > > >>>>>>>> more
> > > > >>>>>>>>> clear. Currently users can define simple watermark strategy
> > in
> > > > >>> DDL,
> > > > >>>>>>>> adding
> > > > >>>>>>>>> more configuration items in connector options is easy to
> > > > >>> understand
> > > > >>>>>>>>>
> > > > >>>>>>>>> Best,
> > > > >>>>>>>>> Shammon
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Thu, Feb 23, 2023 at 10:52 AM Jingsong Li <
> > > > >>> jingsongl...@gmail.com
> > > > >>>>>>
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Thanks for your proposal.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> +1 to yuxia, consider watermark-related hints as option
> > hints.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Personally, I am cautious about adding SQL syntax,
> > > > >>> WATERMARK_PARAMS
> > > > >>>>> is
> > > > >>>>>>>>>> also SQL syntax to some extent.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> We can use OPTIONS to meet this requirement if possible.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Best,
> > > > >>>>>>>>>> Jingsong
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Thu, Feb 23, 2023 at 10:41 AM yuxia <
> > > > >>> luoyu...@alumni.sjtu.edu.cn
> > > > >>>>>>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Hi, Yuan Kui.
> > > > >>>>>>>>>>> Thanks for driving it.
> > > > >>>>>>>>>>> IMO, the 'OPTIONS' hint may be not only specific to the
> > > > >>> connector
> > > > >>>>>>>>>> options. Just as a reference, we also have
> > > `sink.parallelism`[1]
> > > > >>> as
> > > > >>>>> a
> > > > >>>>>>>>>> connector options. It enables
> > > > >>>>>>>>>>> user to specific the writer's parallelism dynamically
> > > > per-query.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Personally, I perfer to consider watermark-related hints
> as
> > > > >>> option
> > > > >>>>>>>>>> hints. So, user can define a default watermark strategy
> for
> > > the
> > > > >>>>> table,
> > > > >>>>>>>>> and
> > > > >>>>>>>>>> if user dosen't needed to changes it, they need to do
> > nothing
> > > in
> > > > >>>>> their
> > > > >>>>>>>>>> query instead of specific it ervery time.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> [1]
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/zh/docs/connectors/table/filesystem/#sink-parallelism
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Best regards,
> > > > >>>>>>>>>>> Yuxia
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Best regards,
> > > > >>>>>>>>>>> Yuxia
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> ----- 原始邮件 -----
> > > > >>>>>>>>>>> 发件人: "kui yuan" <catye...@gmail.com>
> > > > >>>>>>>>>>> 收件人: "dev" <dev@flink.apache.org>
> > > > >>>>>>>>>>> 抄送: "Jark Wu" <imj...@gmail.com>
> > > > >>>>>>>>>>> 发送时间: 星期三, 2023年 2 月 22日 下午 10:08:11
> > > > >>>>>>>>>>> 主题: Re: [DISCUSS] FLIP-296: Watermark options for table
> > API &
> > > > >>> SQL
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Hi all,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Thanks for the lively discussion and I will respond to
> > these
> > > > >>>>>>>> questions
> > > > >>>>>>>>>> one
> > > > >>>>>>>>>>> by one. However, there are also some common questions
> and I
> > > > will
> > > > >>>>>>>> answer
> > > > >>>>>>>>>>> together.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> @郑 Thanks for your reply. The features mentioned in this
> > flip
> > > > >>> are
> > > > >>>>>>>> only
> > > > >>>>>>>>>> for
> > > > >>>>>>>>>>> those source connectors that implement the
> > > > >>>>> SupportsWatermarkPushDown
> > > > >>>>>>>>>>> interface, generating watermarks in other graph locations
> > is
> > > > >>> not in
> > > > >>>>>>>> the
> > > > >>>>>>>>>>> scope of this discussion. Perhaps another flip can be
> > > proposed
> > > > >>>>>>>> later to
> > > > >>>>>>>>>>> implement this feature.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> @Shammon Thanks for your reply. In Flip-296, a rejected
> > > > >>> alternative
> > > > >>>>>>>> is
> > > > >>>>>>>>>>> adding watermark related options in the connector
> > options,we
> > > > >>>>> believe
> > > > >>>>>>>>> that
> > > > >>>>>>>>>>> we should not bind the watermark-related options to a
> > > connector
> > > > >>> to
> > > > >>>>>>>>> ensure
> > > > >>>>>>>>>>> semantic clarity.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> What will happen if we add watermark related options in
> > `the
> > > > >>>>>>>>> connector
> > > > >>>>>>>>>>>> options`? Will the connector ignore these options or
> throw
> > > an
> > > > >>>>>>>>>> exception?
> > > > >>>>>>>>>>>> How can we support this?
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> If user defines different watermark configurations for
> one
> > > > >>> table in
> > > > >>>>>>>> two
> > > > >>>>>>>>>>> places, I tend to prefer the first place would prevail,
> but
> > > we
> > > > >>> can
> > > > >>>>>>>>> also
> > > > >>>>>>>>>>> throw exception or just print logs to prompt the user,
> > which
> > > > are
> > > > >>>>>>>>>>> implementation details.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> If one table is used by two operators with different
> > > watermark
> > > > >>>>>>>>> params,
> > > > >>>>>>>>>>>> what will happen?
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> @Martijn Thanks for your reply.  I'm sorry that we are
> not
> > > > >>>>>>>> particularly
> > > > >>>>>>>>>>> accurate, this hint is mainly for SQL,  not table API.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> While the FLIP talks about watermark options for Table
> > API &
> > > > >>> SQL,
> > > > >>>>>>>> I
> > > > >>>>>>>>>> only
> > > > >>>>>>>>>>>> see proposed syntax for SQL, not for the Table API. What
> > is
> > > > >>> your
> > > > >>>>>>>>>> proposal
> > > > >>>>>>>>>>>> for the Table API
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> @Jane Thanks for your reply. For the first question, If
> the
> > > > user
> > > > >>>>>>>> uses
> > > > >>>>>>>>>> this
> > > > >>>>>>>>>>> hint on those sourse that does not implement the
> > > > >>>>>>>>>> SupportsWatermarkPushDown
> > > > >>>>>>>>>>> interface, it will be completely invalid. The task will
> run
> > > as
> > > > >>>>>>>> normal
> > > > >>>>>>>>> as
> > > > >>>>>>>>>> if
> > > > >>>>>>>>>>> the hint had not been used.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> What's the behavior if there are multiple table sources,
> > > among
> > > > >>>>>>>> which
> > > > >>>>>>>>>>>> some do not support `SupportsWatermarkPushDown`?
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> @Jane feedback that 'WATERMARK_PARAMS' is difficult to
> > > > remember,
> > > > >>>>>>>>> perhaps
> > > > >>>>>>>>>>> the naming issue can be put to the end of the discussion,
> > > > >>> because
> > > > >>>>>>>> more
> > > > >>>>>>>>>>> people like @Martijn @Shuo are considering whether these
> > > > >>>>>>>> configurations
> > > > >>>>>>>>>>> should be put into the DDL or the 'OPTIONS' hint. Here's
> > > what I
> > > > >>>>>>>>>>> think, Putting these configs into DDL or putting them
> into
> > > > >>>>> 'OPTIONS'
> > > > >>>>>>>>> hint
> > > > >>>>>>>>>>> is actually the same thing, because the 'OPTIONS' hint is
> > > > mainly
> > > > >>>>>>>> used
> > > > >>>>>>>>> to
> > > > >>>>>>>>>>> configure the properties of conenctor. The reason why I
> > want
> > > to
> > > > >>> use
> > > > >>>>>>>> a
> > > > >>>>>>>>> new
> > > > >>>>>>>>>>> hint is to make sure the semantics clear, in my opinion
> the
> > > > >>>>>>>>> configuration
> > > > >>>>>>>>>>> of watermark should not be mixed up with connector.
> > However,
> > > a
> > > > >>> new
> > > > >>>>>>>> hint
> > > > >>>>>>>>>>> does make it more difficult to use to some extent, for
> > > example,
> > > > >>>>>>>> when a
> > > > >>>>>>>>>> user
> > > > >>>>>>>>>>> uses both 'OPTIONS' hint and 'WATERMARK_PARAMS' hint. For
> > > this
> > > > >>>>>>>> point,
> > > > >>>>>>>>>> maby
> > > > >>>>>>>>>>> it is more appropriate to use uniform 'OPTIONS' hint.
> > > > >>>>>>>>>>> On the other hand, if we enrich more watermark option
> keys
> > in
> > > > >>>>>>>> 'OPTIONS'
> > > > >>>>>>>>>>> hints, The question will be what we treat the
> definatrions
> > of
> > > > >>>>>>>>> 'OPTIONS'
> > > > >>>>>>>>>>> hint, is this only specific to the connector options or
> > could
> > > > be
> > > > >>>>>>>> more?
> > > > >>>>>>>>>>> Maybe @Jark could share more insights here. In my opion,
> > > > >>> 'OPTIONS'
> > > > >>>>>>>> is
> > > > >>>>>>>>>> only
> > > > >>>>>>>>>>> related to the connector options, which is not like the
> > > gernal
> > > > >>>>>>>>> watermark
> > > > >>>>>>>>>>> options.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Shuo Cheng <njucs...@gmail.com> 于2023年2月22日周三 19:17写道:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Hi Kui,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Thanks for driving the discussion. It's quite useful to
> > > > >>> introduce
> > > > >>>>>>>>>> Watermark
> > > > >>>>>>>>>>>> options. I have some questions:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> What kind of hints is "WATERMARK_PARAMS"?
> > > > >>>>>>>>>>>> Currently, we have two kinds of hints in Flink: Dynamic
> > > Table
> > > > >>>>>>>>> Options &
> > > > >>>>>>>>>>>> Query Hints. As described in the Flip,
> "WATERMARK_PARAMS"
> > is
> > > > >>> more
> > > > >>>>>>>>> like
> > > > >>>>>>>>>>>> Dynamic Table Options. So two questions arise here:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> 1)  Are these watermark options to be exposed as
> connector
> > > > WITH
> > > > >>>>>>>>>> options? Aa
> > > > >>>>>>>>>>>> described in SQL Hints doc[1],  "Dynamic Table Options
> > allow
> > > > to
> > > > >>>>>>>>>> specify or
> > > > >>>>>>>>>>>> override table options dynamically", which implies that
> > > these
> > > > >>>>>>>> options
> > > > >>>>>>>>>> can
> > > > >>>>>>>>>>>> also be configured in WITH options.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> 2)  Do we really need a new hint name like
> > > 'WATERMARK_PARAMS',
> > > > >>>>>>>> table
> > > > >>>>>>>>>>>> options use "OPTIONS" as hint name, like '/*+
> > > > >>>>>>>>>>>> OPTIONS('csv.ignore-parse-errors'='true') */', maybe we
> > can
> > > > >>> enrich
> > > > >>>>>>>>> more
> > > > >>>>>>>>>>>> table option keys for watermark, e.g., /*+
> > > > >>>>>>>>>>>> OPTIONS('watermark.emit-strategy'='ON_PERIODIC') */.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> [1]
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Wed, Feb 22, 2023 at 10:22 AM kui yuan <
> > > catye...@gmail.com
> > > > >
> > > > >>>>>>>>> wrote:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Hi devs,
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> I'd like to start a discussion thread for FLIP-296[1].
> > This
> > > > >>>>>>>> comes
> > > > >>>>>>>>>> from an
> > > > >>>>>>>>>>>>> offline discussion with @Yun Tang, and we hope to
> enrich
> > > > table
> > > > >>>>>>>> API
> > > > >>>>>>>>> &
> > > > >>>>>>>>>> SQL
> > > > >>>>>>>>>>>> to
> > > > >>>>>>>>>>>>> support many watermark-related features which were only
> > > > >>>>>>>> implemented
> > > > >>>>>>>>>> at
> > > > >>>>>>>>>>>> the
> > > > >>>>>>>>>>>>> datastream API level.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Basically, we want to introduce watermark options in
> > table
> > > > >>> API &
> > > > >>>>>>>>> SQL
> > > > >>>>>>>>>> via
> > > > >>>>>>>>>>>>> SQL hint named 'WATERMARK_PARAMS' to support features:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> 1、Configurable watermark emit strategy
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> 2、Dealing with idle sources
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> 3、Watermark alignment
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Last but not least, thanks to Qingsheng and Jing Zhang
> > for
> > > > the
> > > > >>>>>>>>>> initial
> > > > >>>>>>>>>>>>> reviews.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Looking forward to your thoughts and any feedback is
> > > > >>>>>>>> appreciated!
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> [1]
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>
> > > > >>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240884405
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Best
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Yuan Kui
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > > >
> > >
> >
>

Reply via email to