Re: [Discuss] FLIP-366: Support standard YAML for FLINK configuration

2023-09-21 Thread Yuxin Tan
Hi, Junrui

+1 for the proposal.
Thanks for your effort.

Best,
Yuxin


Samrat Deb wrote on Fri, Sep 22, 2023 at 13:23:

> Hello Junrui,
>
> +1 for the proposal.
>
>
> Bests,
> Samrat
>
> On Fri, Sep 22, 2023 at 10:18 AM Shammon FY  wrote:
>
> > +1 for the proposal, thanks for driving.
> >
> > Best,
> > Shammon FY
> >
> > On Fri, Sep 22, 2023 at 12:41 PM Yangze Guo  wrote:
> >
> > > Thanks for driving this, +1 for the proposal.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > >
> > > On Fri, Sep 22, 2023 at 11:59 AM Lijie Wang 
> > > wrote:
> > > >
> > > > Hi Junrui,
> > > >
> > > > +1 for this proposal, thanks for driving.
> > > >
> > > > Best,
> > > > Lijie
> > > >
> > > > ConradJam wrote on Fri, Sep 22, 2023 at 10:07:
> > > >
> > > > > +1. Supporting the standard YAML format facilitates standardization.
> > > > >
> > > > > Jing Ge wrote on Fri, Sep 22, 2023 at 02:23:
> > > > >
> > > > > > Hi Junrui,
> > > > > >
> > > > > > +1 for following the standard. Thanks for your effort!
> > > > > >
> > > > > > Best regards,
> > > > > > Jing
> > > > > >
> > > > > > On Thu, Sep 21, 2023 at 5:09 AM Junrui Lee 
> > > wrote:
> > > > > >
> > > > > > > Hi Jane,
> > > > > > >
> > > > > > > Thank you for your valuable feedback and suggestions.
> > > > > > > I agree with your point about differentiating between
> > > > > "flink-config.yaml"
> > > > > > > and "flink-conf.yaml" to determine the standard syntax at a
> > glance.
> > > > > > >
> > > > > > > While I understand your suggestion of using
> > > "flink-conf-default.yaml"
> > > > > to
> > > > > > > represent the default YAML file for Flink 1.x, I have been
> > > considering
> > > > > > > the option of using "flink-configuration.yaml" as the file name
> > > for the
> > > > > > > new configuration file.
> > > > > > > This name "flink-configuration.yaml" provides a clear
> distinction
> > > > > between
> > > > > > > the new and old configuration files based on their names, and
> it
> > > does
> > > > > not
> > > > > > > introduce any additional semantics. Moreover, this name
> > > > > > > "flink-configuration.yaml" can continue to be used in future
> > > versions
> > > > > > > such as Flink 2.0.
> > > > > > >
> > > > > > > WDYT? If we can reach a consensus on this, I will update the
> FLIP
> > > > > > > documentation
> > > > > > > accordingly.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Junrui
> > > > > > >
> > > > > > > Jane Chan wrote on Wed, Sep 20, 2023 at 23:38:
> > > > > > >
> > > > > > > > Hi Junrui,
> > > > > > > >
> > > > > > > > Thanks for driving this FLIP. +1 for adoption of the standard
> > > YAML
> > > > > > > syntax.
> > > > > > > > I just have one minor suggestion. It's a little bit
> challenging
> > > to
> > > > > > > > differentiate between `flink-config.yaml` and
> `flink-conf.yaml`
> > > to
> > > > > > > > determine which one uses the standard syntax at a glance. How
> > > about
> > > > > > > > using `flink-conf-default.yaml` to represent the default yaml
> > > file
> > > > > for
> > > > > > > > Flink 1.x?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jane
> > > > > > > >
> > > > > > > > On Wed, Sep 20, 2023 at 11:06 AM Junrui Lee <
> > jrlee@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi devs,
> > > > > > > > >
> > > > > > > > > I would like to start a discussion about FLIP-366:
> > > > > > > > > Support standard YAML for FLINK configuration[1]
> > > > > > > > >
> > > > > > > > > The current flink-conf.yaml parser in FLINK is not a
> standard
> > > YAML
> > > > > > > > parser,
> > > > > > > > > which has some shortcomings.
> > > > > > > > > Firstly, it does not support nested structure configuration
> > > items
> > > > > and
> > > > > > > > only
> > > > > > > > > supports key-value pairs, resulting in poor readability.
> > > Secondly,
> > > > > if
> > > > > > > the
> > > > > > > > > value is a collection type, such as a List or Map, users
> are
> > > > > required
> > > > > > > to
> > > > > > > > > write the value in a FLINK-specific pattern, which is
> > > inconvenient
> > > > > to
> > > > > > > > use.
> > > > > > > > > Additionally, the parser of FLINK has some differences in
> > > syntax
> > > > > > > compared
> > > > > > > > > to the standard YAML parser, such as the syntax for parsing
> > > > > comments
> > > > > > > and
> > > > > > > > > null values. These inconsistencies can cause confusion for
> > > users,
> > > > > as
> > > > > > > seen
> > > > > > > > > in FLINK-15358 and FLINK-32740.
> > > > > > > > >
> > > > > > > > > By supporting standard YAML, these issues can be resolved,
> > and
> > > > > users
> > > > > > > can
> > > > > > > > > create a Flink configuration file using third-party tools
> and
> > > > > > leverage
> > > > > > > > > some advanced YAML features. Therefore, we propose to
> support
> > > > > > standard
> > > > > > > > > YAML for FLINK configuration.
> > > > > > > > >
> > > > > > > > > You can find more details in the FLIP-366[1]. Looking
> forward
> > > to
> > > > > your
> > > > > > > > > feedback.
> > > > > > > > >
> > > > > > > > > [1]
> > >
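The legacy-parser shortcomings listed in the quoted proposal (flat key-value pairs only, Flink-specific patterns for List values, nonstandard comment handling) can be illustrated with a minimal sketch. The `parse_flat` function and the `pipeline.cache-kind` key below are hypothetical illustrations for contrast, not Flink's actual parser or configuration options:

```python
# A rough sketch of a line-based "key: value" parser, in the style the
# legacy flink-conf.yaml format accepts. This is an illustration only,
# not Flink's actual implementation.

def parse_flat(text):
    """Parse 'key: value' lines into a flat dict, one entry per line."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(": ")
        conf[key] = value.strip()
    return conf

legacy = """
# Flat keys only; collection values need a Flink-specific pattern.
taskmanager.numberOfTaskSlots: 4
pipeline.cache-kind: lru;fifo
"""

flat = parse_flat(legacy)
print(flat["taskmanager.numberOfTaskSlots"])  # '4'
print(flat["pipeline.cache-kind"])            # 'lru;fifo' -- one opaque string

# The same settings in standard YAML (again, hypothetical keys) would
# support nesting and native lists, and any YAML 1.2 parser could read them:
#
# taskmanager:
#   numberOfTaskSlots: 4
# pipeline:
#   cache-kind: [lru, fifo]
```

With a standard parser, third-party tools could generate and validate such files directly, which is the main benefit the proposal cites.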

Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL Sources

2023-09-21 Thread Chen Zhanghao
Hi Jane,

Thanks for the suggestions; I totally agree with them. I've updated the FLIP 
with the following two changes:

1. Rename WrapperTransformation to SourceTransformationWrapper, which wraps a 
SourceTransformation only. Note that we do not plan to support 
LegacySourceTransformation.
2. The partitioner after the source will be chosen based on the changelog mode 
of the source and the existence of a primary key in the source schema. If the 
source produces update/delete messages but no primary key exists, an exception 
will be thrown.

Best,
Zhanghao Chen
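The rule in point 2 can be sketched as a small decision function. The function name, row-kind strings, and partitioner labels below are illustrative assumptions, not Flink's actual API:

```python
# Illustrative sketch (not Flink's implementation) of choosing the
# partitioner after a source whose parallelism differs from downstream:
# base the choice on the source's changelog mode and its primary key.

INSERT_ONLY = {"INSERT"}

def choose_partitioner(changelog_mode, primary_key_columns):
    """Return the partitioner to insert after the source.

    changelog_mode: set of row kinds the source may emit,
        e.g. {"INSERT"} or {"INSERT", "UPDATE_AFTER", "DELETE"}.
    primary_key_columns: PK column names declared in the DDL (may be empty).
    """
    produces_updates = bool(changelog_mode - INSERT_ONLY)
    if not produces_updates:
        # Insert-only sources keep the default rebalance behavior.
        return "rebalance"
    if not primary_key_columns:
        # A CDC-style source without a declared primary key cannot be hash
        # partitioned consistently, so a meaningful error is raised instead.
        raise ValueError(
            "Source produces update/delete messages but no primary key is "
            "declared; cannot set a different source parallelism.")
    # Hash by primary key so all changelog entries for a key stay ordered.
    return "hash"
```

Hashing by the primary key preserves per-key ordering of changelog entries, which is why the primary key constraint becomes a hard requirement in this case.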

From: Jane Chan 
Sent: September 20, 2023 15:13
To: dev@flink.apache.org 
Subject: Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL Sources

Hi Zhanghao,

Thanks for the update. The FLIP now looks good to me in general, and I have
two minor comments.

1. Compared with other subclasses like `CacheTransformation` or
`PartitionTransformation`, the name `WrapperTransformation` seems too
general. What about `SourceTransformationWrapper`, which is more specific
and descriptive? WDYT?

2.

> When the source generates update and delete data (determined by checking
> the existence of a primary key in the source schema), the source will use
> hash partitioner to send data.


It might not be sufficient to determine whether the source is a CDC source
solely based on checking the existence of the primary key. It's better to
check the changelog mode of the source. On the other hand, adding the hash
partitioner requires the CDC source table to declare the primary key in the
DDL. Therefore, it is preferable to explain this restriction in the FLIP
and doc and throw a meaningful exception when users want to configure a
different parallelism for a CDC source but forget to declare the primary
key constraint.

Best,
Jane

On Wed, Sep 20, 2023 at 9:20 AM Benchao Li  wrote:

> Thank you for the update, the FLIP now looks good to me.
>
> Chen Zhanghao wrote on Tue, Sep 19, 2023 at 22:50:
> >
> > Thanks to everyone for the valuable input; we learnt a lot during the
> discussion. We've updated the FLIP in three main aspects based on the
> discussion here:
> >
> > - Add a new subsection on keeping downstream operators' parallelism
> unchanged by wrapping the source transformation in a phantom transformation.
> > - Add a new subsection on how to deal with changelog messages, simply
> put, build a hash partitioner based on the primary key when a source
> generates update/delete data.
> > - Update the non-goals section to remove the possibly misleading
> statement that setting parallelism for individual operators lacks public
> interest and state that we leave it for future work due to its extra
> complexity.
> >
> > Looking forward to your suggestions.
> >
> > Best,
> > Zhanghao Chen
> > 
> > From: Feng Jin 
> > Sent: September 17, 2023 0:56
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] FLIP-367: Support Setting Parallelism for Table/SQL
> Sources
> >
> > Hi, Zhanghao
> >
> > Thank you for proposing this FLIP, it is a very meaningful feature.
> >
> > I agree that currently we may only consider the parallelism setting of
> the
> > source itself. If we consider the parallelism setting of other operators,
> > it may make the entire design more complex.
> >
> > Regarding the situation where the parallelism of the source is different
> > from that of downstream tasks, I did not find a more detailed description
> > in FLIP.
> >
> > By default, if the parallelism between two operators is different, the
> > rebalance partitioner will be used.
> > But in the SQL scenario, I believe that we should keep the behavior of
> > parallelism setting consistent with that of the sink.
> >
> > 1. When the source only generates insert-only data, if there is a
> mismatch
> > in parallelism between the source and downstream operators, rebalance is
> > used by default.
> >
> > 2. When the source generates update and delete data, we should require
> the
> > source to configure a primary key and then build a hash partitioner based
> > on that primary key.
> >
> > WDYT ?
> >
> >
> > Best,
> > Feng
> >
> >
> > On Sat, Sep 16, 2023 at 5:58 PM Jane Chan  wrote:
> >
> > > Hi Zhanghao,
> > >
> > > Thanks for the explanation.
> > >
> > > For Q1, I think the key lies in determining the boundary where the
> chain
> > > should be broken. However, this boundary is ultimately determined by
> the
> > > specific requirements of each user query.
> > >
> > > The most straightforward approach is breaking the chain after the
> source
> > > operator, even though it involves a tradeoff. This is because there
> may be
> > > instances of `StreamExecWatermarkAssigner`,
> `StreamExecMiniBatchAssigner`,
> > > or `StreamExecChangelogNormalize` occurring before the `StreamExecCalc`
> > > node, and it would be complex and challenging to enumerate all possible
> > > match patterns.
> > >
> > > A more complex workaround would be to provide an entry point for u

[VOTE] FLIP-327: Support switching from batch to stream mode to improve throughput when processing backlog data

2023-09-21 Thread Dong Lin
Hi all,

We would like to start the vote for FLIP-327: Support switching from batch
to stream mode to improve throughput when processing backlog data [1]. This
FLIP was discussed in this thread [2].

The vote will be open until at least Sep 27th (at least 72
hours), following the consensus voting process.

Cheers,
Xuannan and Dong

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-327%3A+Support+switching+from+batch+to+stream+mode+to+improve+throughput+when+processing+backlog+data
[2] https://lists.apache.org/thread/29nvjt9sgnzvs90browb8r6ng31dcs3n


Re: [Discuss] FLIP-366: Support standard YAML for FLINK configuration

2023-09-21 Thread Samrat Deb
Hello Junrui,

+1 for the proposal.


Bests,
Samrat

On Fri, Sep 22, 2023 at 10:18 AM Shammon FY  wrote:

> +1 for the proposal, thanks for driving.
>
> Best,
> Shammon FY
>
> On Fri, Sep 22, 2023 at 12:41 PM Yangze Guo  wrote:
>
> > Thanks for driving this, +1 for the proposal.
> >
> > Best,
> > Yangze Guo
> >
> >
> > On Fri, Sep 22, 2023 at 11:59 AM Lijie Wang 
> > wrote:
> > >
> > > Hi Junrui,
> > >
> > > +1 for this proposal, thanks for driving.
> > >
> > > Best,
> > > Lijie
> > >
> > > ConradJam wrote on Fri, Sep 22, 2023 at 10:07:
> > >
> > > > +1. Supporting the standard YAML format facilitates standardization.
> > > >
> > > > Jing Ge wrote on Fri, Sep 22, 2023 at 02:23:
> > > >
> > > > > Hi Junrui,
> > > > >
> > > > > +1 for following the standard. Thanks for your effort!
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > On Thu, Sep 21, 2023 at 5:09 AM Junrui Lee 
> > wrote:
> > > > >
> > > > > > Hi Jane,
> > > > > >
> > > > > > Thank you for your valuable feedback and suggestions.
> > > > > > I agree with your point about differentiating between
> > > > "flink-config.yaml"
> > > > > > and "flink-conf.yaml" to determine the standard syntax at a
> glance.
> > > > > >
> > > > > > While I understand your suggestion of using
> > "flink-conf-default.yaml"
> > > > to
> > > > > > represent the default YAML file for Flink 1.x, I have been
> > considering
> > > > > > the option of using "flink-configuration.yaml" as the file name
> > for the
> > > > > > new configuration file.
> > > > > > This name "flink-configuration.yaml" provides a clear distinction
> > > > between
> > > > > > the new and old configuration files based on their names, and it
> > does
> > > > not
> > > > > > introduce any additional semantics. Moreover, this name
> > > > > > "flink-configuration.yaml" can continue to be used in future
> > versions
> > > > > > such as Flink 2.0.
> > > > > >
> > > > > > WDYT? If we can reach a consensus on this, I will update the FLIP
> > > > > > documentation
> > > > > > accordingly.
> > > > > >
> > > > > > Best regards,
> > > > > > Junrui
> > > > > >
> > > > > > Jane Chan wrote on Wed, Sep 20, 2023 at 23:38:
> > > > > >
> > > > > > > Hi Junrui,
> > > > > > >
> > > > > > > Thanks for driving this FLIP. +1 for adoption of the standard
> > YAML
> > > > > > syntax.
> > > > > > > I just have one minor suggestion. It's a little bit challenging
> > to
> > > > > > > differentiate between `flink-config.yaml` and `flink-conf.yaml`
> > to
> > > > > > > determine which one uses the standard syntax at a glance. How
> > about
> > > > > > > using `flink-conf-default.yaml` to represent the default yaml
> > file
> > > > for
> > > > > > > Flink 1.x?
> > > > > > >
> > > > > > > Best,
> > > > > > > Jane
> > > > > > >
> > > > > > > On Wed, Sep 20, 2023 at 11:06 AM Junrui Lee <
> jrlee@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi devs,
> > > > > > > >
> > > > > > > > I would like to start a discussion about FLIP-366:
> > > > > > > > Support standard YAML for FLINK configuration[1]
> > > > > > > >
> > > > > > > > The current flink-conf.yaml parser in FLINK is not a standard
> > YAML
> > > > > > > parser,
> > > > > > > > which has some shortcomings.
> > > > > > > > Firstly, it does not support nested structure configuration
> > items
> > > > and
> > > > > > > only
> > > > > > > > supports key-value pairs, resulting in poor readability.
> > Secondly,
> > > > if
> > > > > > the
> > > > > > > > value is a collection type, such as a List or Map, users are
> > > > required
> > > > > > to
> > > > > > > > write the value in a FLINK-specific pattern, which is
> > inconvenient
> > > > to
> > > > > > > use.
> > > > > > > > Additionally, the parser of FLINK has some differences in
> > syntax
> > > > > > compared
> > > > > > > > to the standard YAML parser, such as the syntax for parsing
> > > > comments
> > > > > > and
> > > > > > > > null values. These inconsistencies can cause confusion for
> > users,
> > > > as
> > > > > > seen
> > > > > > > > in FLINK-15358 and FLINK-32740.
> > > > > > > >
> > > > > > > > By supporting standard YAML, these issues can be resolved,
> and
> > > > users
> > > > > > can
> > > > > > > > create a Flink configuration file using third-party tools and
> > > > > leverage
> > > > > > > > some advanced YAML features. Therefore, we propose to support
> > > > > standard
> > > > > > > > YAML for FLINK configuration.
> > > > > > > >
> > > > > > > > You can find more details in the FLIP-366[1]. Looking forward
> > to
> > > > your
> > > > > > > > feedback.
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-366%3A+Support+standard+YAML+for+FLINK+configuration
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Junrui
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best
> > > >
> > > > ConradJam
> > > >
> >
>


Re: [Discuss] FLIP-366: Support standard YAML for FLINK configuration

2023-09-21 Thread Shammon FY
+1 for the proposal, thanks for driving.

Best,
Shammon FY

On Fri, Sep 22, 2023 at 12:41 PM Yangze Guo  wrote:

> Thanks for driving this, +1 for the proposal.
>
> Best,
> Yangze Guo
>
>
> On Fri, Sep 22, 2023 at 11:59 AM Lijie Wang 
> wrote:
> >
> > Hi Junrui,
> >
> > +1 for this proposal, thanks for driving.
> >
> > Best,
> > Lijie
> >
> > ConradJam wrote on Fri, Sep 22, 2023 at 10:07:
> >
> > > +1. Supporting the standard YAML format facilitates standardization.
> > >
> > > Jing Ge wrote on Fri, Sep 22, 2023 at 02:23:
> > >
> > > > Hi Junrui,
> > > >
> > > > +1 for following the standard. Thanks for your effort!
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Thu, Sep 21, 2023 at 5:09 AM Junrui Lee 
> wrote:
> > > >
> > > > > Hi Jane,
> > > > >
> > > > > Thank you for your valuable feedback and suggestions.
> > > > > I agree with your point about differentiating between
> > > "flink-config.yaml"
> > > > > and "flink-conf.yaml" to determine the standard syntax at a glance.
> > > > >
> > > > > While I understand your suggestion of using
> "flink-conf-default.yaml"
> > > to
> > > > > represent the default YAML file for Flink 1.x, I have been
> considering
> > > > > the option of using "flink-configuration.yaml" as the file name
> for the
> > > > > new configuration file.
> > > > > This name "flink-configuration.yaml" provides a clear distinction
> > > between
> > > > > the new and old configuration files based on their names, and it
> does
> > > not
> > > > > introduce any additional semantics. Moreover, this name
> > > > > "flink-configuration.yaml" can continue to be used in future
> versions
> > > > > such as Flink 2.0.
> > > > >
> > > > > WDYT? If we can reach a consensus on this, I will update the FLIP
> > > > > documentation
> > > > > accordingly.
> > > > >
> > > > > Best regards,
> > > > > Junrui
> > > > >
> > > > > Jane Chan wrote on Wed, Sep 20, 2023 at 23:38:
> > > > >
> > > > > > Hi Junrui,
> > > > > >
> > > > > > Thanks for driving this FLIP. +1 for adoption of the standard
> YAML
> > > > > syntax.
> > > > > > I just have one minor suggestion. It's a little bit challenging
> to
> > > > > > differentiate between `flink-config.yaml` and `flink-conf.yaml`
> to
> > > > > > determine which one uses the standard syntax at a glance. How
> about
> > > > > > using `flink-conf-default.yaml` to represent the default yaml
> file
> > > for
> > > > > > Flink 1.x?
> > > > > >
> > > > > > Best,
> > > > > > Jane
> > > > > >
> > > > > > On Wed, Sep 20, 2023 at 11:06 AM Junrui Lee  >
> > > > wrote:
> > > > > >
> > > > > > > Hi devs,
> > > > > > >
> > > > > > > I would like to start a discussion about FLIP-366:
> > > > > > > Support standard YAML for FLINK configuration[1]
> > > > > > >
> > > > > > > The current flink-conf.yaml parser in FLINK is not a standard
> YAML
> > > > > > parser,
> > > > > > > which has some shortcomings.
> > > > > > > Firstly, it does not support nested structure configuration
> items
> > > and
> > > > > > only
> > > > > > > supports key-value pairs, resulting in poor readability.
> Secondly,
> > > if
> > > > > the
> > > > > > > value is a collection type, such as a List or Map, users are
> > > required
> > > > > to
> > > > > > > write the value in a FLINK-specific pattern, which is
> inconvenient
> > > to
> > > > > > use.
> > > > > > > Additionally, the parser of FLINK has some differences in
> syntax
> > > > > compared
> > > > > > > to the standard YAML parser, such as the syntax for parsing
> > > comments
> > > > > and
> > > > > > > null values. These inconsistencies can cause confusion for
> users,
> > > as
> > > > > seen
> > > > > > > in FLINK-15358 and FLINK-32740.
> > > > > > >
> > > > > > > By supporting standard YAML, these issues can be resolved, and
> > > users
> > > > > can
> > > > > > > create a Flink configuration file using third-party tools and
> > > > leverage
> > > > > > > some advanced YAML features. Therefore, we propose to support
> > > > standard
> > > > > > > YAML for FLINK configuration.
> > > > > > >
> > > > > > > You can find more details in the FLIP-366[1]. Looking forward
> to
> > > your
> > > > > > > feedback.
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-366%3A+Support+standard+YAML+for+FLINK+configuration
> > > > > > >
> > > > > > > Best,
> > > > > > > Junrui
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best
> > >
> > > ConradJam
> > >
>


Re: [Discuss] FLIP-366: Support standard YAML for FLINK configuration

2023-09-21 Thread Yangze Guo
Thanks for driving this, +1 for the proposal.

Best,
Yangze Guo


On Fri, Sep 22, 2023 at 11:59 AM Lijie Wang  wrote:
>
> Hi Junrui,
>
> +1 for this proposal, thanks for driving.
>
> Best,
> Lijie
>
> ConradJam wrote on Fri, Sep 22, 2023 at 10:07:
>
> > +1. Supporting the standard YAML format facilitates standardization.
> >
> > Jing Ge wrote on Fri, Sep 22, 2023 at 02:23:
> >
> > > Hi Junrui,
> > >
> > > +1 for following the standard. Thanks for your effort!
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Thu, Sep 21, 2023 at 5:09 AM Junrui Lee  wrote:
> > >
> > > > Hi Jane,
> > > >
> > > > Thank you for your valuable feedback and suggestions.
> > > > I agree with your point about differentiating between
> > "flink-config.yaml"
> > > > and "flink-conf.yaml" to determine the standard syntax at a glance.
> > > >
> > > > While I understand your suggestion of using "flink-conf-default.yaml"
> > to
> > > > represent the default YAML file for Flink 1.x, I have been considering
> > > > the option of using "flink-configuration.yaml" as the file name for the
> > > > new configuration file.
> > > > This name "flink-configuration.yaml" provides a clear distinction
> > between
> > > > the new and old configuration files based on their names, and it does
> > not
> > > > introduce any additional semantics. Moreover, this name
> > > > "flink-configuration.yaml" can continue to be used in future versions
> > > > such as Flink 2.0.
> > > >
> > > > WDYT? If we can reach a consensus on this, I will update the FLIP
> > > > documentation
> > > > accordingly.
> > > >
> > > > Best regards,
> > > > Junrui
> > > >
> > > > Jane Chan wrote on Wed, Sep 20, 2023 at 23:38:
> > > >
> > > > > Hi Junrui,
> > > > >
> > > > > Thanks for driving this FLIP. +1 for adoption of the standard YAML
> > > > syntax.
> > > > > I just have one minor suggestion. It's a little bit challenging to
> > > > > differentiate between `flink-config.yaml` and `flink-conf.yaml` to
> > > > > determine which one uses the standard syntax at a glance. How about
> > > > > using `flink-conf-default.yaml` to represent the default yaml file
> > for
> > > > > Flink 1.x?
> > > > >
> > > > > Best,
> > > > > Jane
> > > > >
> > > > > On Wed, Sep 20, 2023 at 11:06 AM Junrui Lee 
> > > wrote:
> > > > >
> > > > > > Hi devs,
> > > > > >
> > > > > > I would like to start a discussion about FLIP-366:
> > > > > > Support standard YAML for FLINK configuration[1]
> > > > > >
> > > > > > The current flink-conf.yaml parser in FLINK is not a standard YAML
> > > > > parser,
> > > > > > which has some shortcomings.
> > > > > > Firstly, it does not support nested structure configuration items
> > and
> > > > > only
> > > > > > supports key-value pairs, resulting in poor readability. Secondly,
> > if
> > > > the
> > > > > > value is a collection type, such as a List or Map, users are
> > required
> > > > to
> > > > > > write the value in a FLINK-specific pattern, which is inconvenient
> > to
> > > > > use.
> > > > > > Additionally, the parser of FLINK has some differences in syntax
> > > > compared
> > > > > > to the standard YAML parser, such as the syntax for parsing
> > comments
> > > > and
> > > > > > null values. These inconsistencies can cause confusion for users,
> > as
> > > > seen
> > > > > > in FLINK-15358 and FLINK-32740.
> > > > > >
> > > > > > By supporting standard YAML, these issues can be resolved, and
> > users
> > > > can
> > > > > > create a Flink configuration file using third-party tools and
> > > leverage
> > > > > > some advanced YAML features. Therefore, we propose to support
> > > standard
> > > > > > YAML for FLINK configuration.
> > > > > >
> > > > > > You can find more details in the FLIP-366[1]. Looking forward to
> > your
> > > > > > feedback.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-366%3A+Support+standard+YAML+for+FLINK+configuration
> > > > > >
> > > > > > Best,
> > > > > > Junrui
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Best
> >
> > ConradJam
> >


Re: [Discuss] FLIP-366: Support standard YAML for FLINK configuration

2023-09-21 Thread Lijie Wang
Hi Junrui,

+1 for this proposal, thanks for driving.

Best,
Lijie

ConradJam wrote on Fri, Sep 22, 2023 at 10:07:

> +1. Supporting the standard YAML format facilitates standardization.
>
> Jing Ge wrote on Fri, Sep 22, 2023 at 02:23:
>
> > Hi Junrui,
> >
> > +1 for following the standard. Thanks for your effort!
> >
> > Best regards,
> > Jing
> >
> > On Thu, Sep 21, 2023 at 5:09 AM Junrui Lee  wrote:
> >
> > > Hi Jane,
> > >
> > > Thank you for your valuable feedback and suggestions.
> > > I agree with your point about differentiating between
> "flink-config.yaml"
> > > and "flink-conf.yaml" to determine the standard syntax at a glance.
> > >
> > > While I understand your suggestion of using "flink-conf-default.yaml"
> to
> > > represent the default YAML file for Flink 1.x, I have been considering
> > > the option of using "flink-configuration.yaml" as the file name for the
> > > new configuration file.
> > > This name "flink-configuration.yaml" provides a clear distinction
> between
> > > the new and old configuration files based on their names, and it does
> not
> > > introduce any additional semantics. Moreover, this name
> > > "flink-configuration.yaml" can continue to be used in future versions
> > > such as Flink 2.0.
> > >
> > > WDYT? If we can reach a consensus on this, I will update the FLIP
> > > documentation
> > > accordingly.
> > >
> > > Best regards,
> > > Junrui
> > >
> > > Jane Chan wrote on Wed, Sep 20, 2023 at 23:38:
> > >
> > > > Hi Junrui,
> > > >
> > > > Thanks for driving this FLIP. +1 for adoption of the standard YAML
> > > syntax.
> > > > I just have one minor suggestion. It's a little bit challenging to
> > > > differentiate between `flink-config.yaml` and `flink-conf.yaml` to
> > > > determine which one uses the standard syntax at a glance. How about
> > > > using `flink-conf-default.yaml` to represent the default yaml file
> for
> > > > Flink 1.x?
> > > >
> > > > Best,
> > > > Jane
> > > >
> > > > On Wed, Sep 20, 2023 at 11:06 AM Junrui Lee 
> > wrote:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > I would like to start a discussion about FLIP-366:
> > > > > Support standard YAML for FLINK configuration[1]
> > > > >
> > > > > The current flink-conf.yaml parser in FLINK is not a standard YAML
> > > > parser,
> > > > > which has some shortcomings.
> > > > > Firstly, it does not support nested structure configuration items
> and
> > > > only
> > > > > supports key-value pairs, resulting in poor readability. Secondly,
> if
> > > the
> > > > > value is a collection type, such as a List or Map, users are
> required
> > > to
> > > > > write the value in a FLINK-specific pattern, which is inconvenient
> to
> > > > use.
> > > > > Additionally, the parser of FLINK has some differences in syntax
> > > compared
> > > > > to the standard YAML parser, such as the syntax for parsing
> comments
> > > and
> > > > > null values. These inconsistencies can cause confusion for users,
> as
> > > seen
> > > > > in FLINK-15358 and FLINK-32740.
> > > > >
> > > > > By supporting standard YAML, these issues can be resolved, and
> users
> > > can
> > > > > create a Flink configuration file using third-party tools and
> > leverage
> > > > > some advanced YAML features. Therefore, we propose to support
> > standard
> > > > > YAML for FLINK configuration.
> > > > >
> > > > > You can find more details in the FLIP-366[1]. Looking forward to
> your
> > > > > feedback.
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-366%3A+Support+standard+YAML+for+FLINK+configuration
> > > > >
> > > > > Best,
> > > > > Junrui
> > > > >
> > > >
> > >
> >
>
>
> --
> Best
>
> ConradJam
>


Re: [VOTE] FLIP-312: Prometheus Sink Connector

2023-09-21 Thread ConradJam
+1 (non binding)

Samrat Deb wrote on Thu, Sep 21, 2023 at 23:31:

> Thank you,
>
> +1 (non binding)
>
> Bests,
> Samrat
>
> On Thu, Sep 21, 2023 at 9:43 AM Leonard Xu  wrote:
>
> > Thanks Lorenzo for driving this.
> >
> > +1(binding)
> >
> > Best,
> > Leonard
> >
> > > On Sep 21, 2023, at 11:47 AM, Yun Tang  wrote:
> > >
> > > +1 (binding)
> > >
> > > Thanks for driving this, Lorenzo.
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Hong 
> > > Sent: Thursday, September 21, 2023 1:22
> > > To: dev@flink.apache.org 
> > > Subject: Re: [VOTE] FLIP-312: Prometheus Sink Connector
> > >
> > > +1 (binding)
> > >
> > > Thanks Lorenzo.
> > >
> > > Hong
> > >
> > >> On 20 Sep 2023, at 17:49, Danny Cranmer 
> > wrote:
> > >>
> > >> +1 binding.
> > >>
> > >> Thanks for picking this up Lorenzo!
> > >>
> > >> Danny
> > >>
> > >>
> > >>> On Wed, 20 Sept 2023, 16:33 Jing Ge, 
> > wrote:
> > >>>
> > >>> +1(binding) Thanks!
> > >>>
> > >>> Best regards,
> > >>> Jing
> > >>>
> > >>> On Wed, Sep 20, 2023 at 3:20 PM Martijn Visser <
> > martijnvis...@apache.org>
> > >>> wrote:
> > >>>
> >  +1 (binding)
> > 
> >  Thanks for driving this. Cheers, M
> > 
> >  On Mon, Sep 18, 2023 at 1:51 PM Lorenzo Nicora <
> > lorenzo.nic...@gmail.com
> > 
> >  wrote:
> > >
> > > Hi All,
> > >
> > > Thanks for the feedback on FLIP-312: Prometheus Sink Connector [1].
> > > We updated the FLIP accordingly [2].
> > >
> > > I would like to open the vote on FLIP-312.
> > > The vote will be open for at least 72 hours unless there is an
> > >>> objection
> >  or
> > > insufficient votes.
> > >
> > >
> > > [1]
> https://lists.apache.org/thread/tm4qqfb4fxr7bc6nq5mwty1fqz8sj39x
> > > [2]
> > >
> > 
> > >>>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-312%3A+Prometheus+Sink+Connector
> > >
> > > Regards
> > > Lorenzo Nicora
> > 
> > >>>
> > >
> >
> >
>


Re: [Discuss] FLIP-366: Support standard YAML for FLINK configuration

2023-09-21 Thread ConradJam
+1. Supporting the standard YAML format facilitates standardization.

Jing Ge wrote on Fri, Sep 22, 2023 at 02:23:

> Hi Junrui,
>
> +1 for following the standard. Thanks for your effort!
>
> Best regards,
> Jing
>
> On Thu, Sep 21, 2023 at 5:09 AM Junrui Lee  wrote:
>
> > Hi Jane,
> >
> > Thank you for your valuable feedback and suggestions.
> > I agree with your point about differentiating between "flink-config.yaml"
> > and "flink-conf.yaml" to determine the standard syntax at a glance.
> >
> > While I understand your suggestion of using "flink-conf-default.yaml" to
> > represent the default YAML file for Flink 1.x, I have been considering
> > the option of using "flink-configuration.yaml" as the file name for the
> > new configuration file.
> > This name "flink-configuration.yaml" provides a clear distinction between
> > the new and old configuration files based on their names, and it does not
> > introduce any additional semantics. Moreover, this name
> > "flink-configuration.yaml" can continue to be used in future versions
> > such as Flink 2.0.
> >
> > WDYT? If we can reach a consensus on this, I will update the FLIP
> > documentation
> > accordingly.
> >
> > Best regards,
> > Junrui
> >
> > Jane Chan wrote on Wed, Sep 20, 2023 at 23:38:
> >
> > > Hi Junrui,
> > >
> > > Thanks for driving this FLIP. +1 for adoption of the standard YAML
> > syntax.
> > > I just have one minor suggestion. It's a little bit challenging to
> > > differentiate between `flink-config.yaml` and `flink-conf.yaml` to
> > > determine which one uses the standard syntax at a glance. How about
> > > using `flink-conf-default.yaml` to represent the default yaml file for
> > > Flink 1.x?
> > >
> > > Best,
> > > Jane
> > >
> > > On Wed, Sep 20, 2023 at 11:06 AM Junrui Lee 
> wrote:
> > >
> > > > Hi devs,
> > > >
> > > > I would like to start a discussion about FLIP-366:
> > > > Support standard YAML for FLINK configuration[1]
> > > >
> > > > The current flink-conf.yaml parser in FLINK is not a standard YAML
> > > parser,
> > > > which has some shortcomings.
> > > > Firstly, it does not support nested structure configuration items and
> > > only
> > > > supports key-value pairs, resulting in poor readability. Secondly, if
> > the
> > > > value is a collection type, such as a List or Map, users are required
> > to
> > > > write the value in a FLINK-specific pattern, which is inconvenient to
> > > use.
> > > > Additionally, the parser of FLINK has some differences in syntax
> > compared
> > > > to the standard YAML parser, such as the syntax for parsing comments
> > and
> > > > null values. These inconsistencies can cause confusion for users, as
> > seen
> > > > in FLINK-15358 and FLINK-32740.
> > > >
> > > > By supporting standard YAML, these issues can be resolved, and users
> > can
> > > > create a Flink configuration file using third-party tools and
> leverage
> > > > some advanced YAML features. Therefore, we propose to support
> standard
> > > > YAML for FLINK configuration.
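The readability gap described above can be illustrated with a small sketch. The keys below are real Flink options, but the values are illustrative, and the nested form is only a sketch of what a standard parser would enable, not the FLIP's final layout:

```yaml
# Legacy flink-conf.yaml style: flat dotted keys only.
taskmanager.numberOfTaskSlots: 4
restart-strategy.fixed-delay.attempts: 3

# Standard YAML equivalent: nested mappings, parseable by any
# third-party YAML tool or standard YAML library.
taskmanager:
  numberOfTaskSlots: 4
restart-strategy:
  fixed-delay:
    attempts: 3
```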
> > > >
> > > > You can find more details in the FLIP-366[1]. Looking forward to your
> > > > feedback.
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-366%3A+Support+standard+YAML+for+FLINK+configuration
> > > >
> > > > Best,
> > > > Junrui
> > > >
> > >
> >
>


-- 
Best

ConradJam


Re: [DISCUSS] FLIP-365: Introduce flush interval to adjust the interval of emitting results with idempotent semantics

2023-09-21 Thread Yunfeng Zhou
Hi Zakelly,

Thanks for your comments on this FLIP. Please let me attempt to
clarify these points.

1. Yes, this FLIP proposes to buffer the outputs in the state backend.
As only the latest one of each type of StreamElement is about to be
buffered, a ValueState in keyed context or a ListState in non-keyed
context would be enough to hold each type of StreamElement. The value
to be stored in the ValueState/ListState would be the original
StreamRecord/Watermark/WatermarkStatus/LatencyMarker. Besides, the
KeyedStateBackend#applyToAllKeys method makes it possible to access
states for all keys in one keyed context.
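A Flink-free sketch of the buffering described above, assuming the idempotent, latest-value-wins semantics of the FLIP: a plain HashMap stands in for the keyed ValueState, and iterating the map in flush() plays the role KeyedStateBackend#applyToAllKeys would play in a real operator. This is a toy model, not Flink code:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Per key, only the latest record is retained (idempotent semantics),
 * and all buffered records are emitted together when flush() fires.
 */
public class FlushBuffer {
    private final Map<String, Long> latestPerKey = new HashMap<>();

    /** Called per incoming record; overwrites any earlier value for the key. */
    public void onElement(String key, long value) {
        latestPerKey.put(key, value);
    }

    /** Called on the flush interval: emit the latest value per key, then clear. */
    public Map<String, Long> flush() {
        Map<String, Long> emitted = new HashMap<>(latestPerKey);
        latestPerKey.clear();
        return emitted;
    }

    public static void main(String[] args) {
        FlushBuffer buffer = new FlushBuffer();
        buffer.onElement("a", 1L);
        buffer.onElement("a", 2L); // supersedes the earlier value for "a"
        buffer.onElement("b", 10L);
        System.out.println(buffer.flush()); // only the latest value per key
    }
}
```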

2.1 The buffered intermediate results need to be included in the next
checkpoint to preserve exactly-once semantics during failover. The
buffer would be cleared in the `flush()` operation, but `flush()` need
not be triggered before checkpoints. I agree that saving buffered
results to state would increase the state-access workload, but given
that the state buffer would be enabled on aggregation operators that
already involve state, the additional buffered results would not
increase the time complexity of state accesses or the memory (state)
complexity. If one state read/write operation and the space of a
ValueState can be exchanged for all the downstream computation needed
to process one intermediate result, I believe the throughput
optimization would be worth the trade-off in state.

2.2 Setting checkpoints aside, it might still be meaningful to discuss
the alternative solutions you suggested for storing buffered results at
runtime. At least for keyed streams, I'm concerned that keeping all
buffered results in memory could easily cause OOM problems, as there is
no bound on the number of keyed states to store within a flush
interval. I'm also wondering whether a file-based map would actually
outperform state backends, and why Flink hasn't introduced a
FileSystemStateBackend if a file-based map could be better. Could you
please elaborate on the pros and cons of state backends vs.
memory/filesystem?

Best regards,
Yunfeng

On Thu, Sep 21, 2023 at 4:10 PM Zakelly Lan  wrote:
>
> Hi Yunfeng and Dong,
>
> Thanks for this FLIP. I have reviewed it briefly and have a few questions:
>
> 1. Is this FLIP proposing to buffer the output in the state backend?
> If so, what is the data format of this buffer (what type of state does
> it use and what is the value)? Additionally, how does the operator
> retrieve all the buffer data from the state backend during the
> `flush()` operation (while the keyed states can only be accessed under
> a keyed context)?
> 2. Are the buffered intermediate results required to be included in
> the next checkpoint? Or are they deleted and subsumed in the original
> states during the `flush()` operation before triggering the
> checkpoint? I'm asking because if they are not included in the
> checkpoint, it may be more efficient to avoid using keyed states for
> buffering. In this scenario, a simple heap-based or even file-based
> map could be more efficient. Frequent writes and clears can lead to
> increased space usage and read amplification for RocksDB, and it also
> requires more CPU resources for checkpointing and compaction.
>
>
> Looking forward to your thoughts.
>
>
> Best,
> Zakelly
>
>
> On Mon, Sep 11, 2023 at 1:39 PM Yunfeng Zhou
>  wrote:
> >
> > Hi all,
> >
> > Dong(cc'ed) and I are opening this thread to discuss our proposal to
> > support buffering & flushing the output of operators with idempotent
> > semantics, which has been documented in
> > FLIP-365.
> >
> > In the pursuit of unifying batch and stream processing, it has been
> > discovered that the batch execution mode provides a significant
> > advantage by allowing only the final result from "rolling" operations
> > such as reduce() or sum() to be emitted, thus reducing the amount of
> > time and resources required by downstream applications. Inspired by
> > this advantage, the proposed solution supports buffering the output of
> > operators in the streaming mode and periodically emitting the final
> > results, much like in batch processing. This approach is designed to
> > help improve the throughput of jobs by reducing the need for
> > downstream applications to process intermediate results in a stream,
> > at the cost of increased latency and state size.
> >
> > Please refer to the FLIP document for more details about the proposed
> > design and implementation. We welcome any feedback and opinions on
> > this proposal.
> >
> > Best regards.
> > Dong and Yunfeng


[RESULT][VOTE] FLIP-307: Flink Connector Redshift

2023-09-21 Thread Samrat Deb
Hi Everyone,

The proposal, FLIP-307: Flink Connector Redshift, has been accepted
with 5 votes (4 binding).

+1 votes:
- Danny Cranmer (binding)
- Jing Ge (binding)
- Ahmed Hamdy (non-binding)
- Martijn Visser (binding)
- Leonard Xu (binding)


Bests,
Samrat


Re: [DISCUSS] FLIP-356: Support Nested Fields Filter Pushdown

2023-09-21 Thread Venkatakrishnan Sowrirajan
Got it, Martijn.

Unfortunately, I don't have edit access to the already created JIRA,
FLINK-20767. If you can remove the task from the epic FLINK-16987
(FLIP-95: Add new table source and sink interfaces), can you please
change it?

If not, I can open a new ticket, close this one, and link the two
tickets as duplicates.

Regards
Venkata krishnan


On Thu, Sep 21, 2023 at 12:40 AM Martijn Visser 
wrote:

> Hi Venkatakrishnan,
>
> The reason why I thought it's abandoned because the Jira ticket is
> part of the umbrella ticket for FLIP-95. Let's move the Jira ticket to
> its own dedicated task instead of nested to a FLIP-95 ticket.
>
> Thanks,
>
> Martijn
>
> On Wed, Sep 20, 2023 at 4:34 PM Becket Qin  wrote:
> >
> > Hi Martijn,
> >
> > This FLIP has passed voting[1]. It is a modification on top of the
> FLIP-95
> > interface.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > [1]
> https://lists.apache.org/thread/hysv9y1f48gtpr5vx3x40wtjb6cp9ky6
> >
> > On Wed, Sep 20, 2023 at 9:29 PM Martijn Visser  >
> > wrote:
> >
> > > For clarity purposes, this FLIP is being abandoned because it was part
> > > of FLIP-95?
> > >
> > > On Thu, Sep 7, 2023 at 3:01 AM Venkatakrishnan Sowrirajan
> > >  wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > Posted a PR (
> https://github.com/apache/flink/pull/23313
> ) to add nested
> > > > fields filter pushdown. Please review. Thanks.
> > > >
> > > > Regards
> > > > Venkata krishnan
> > > >
> > > >
> > > > On Tue, Sep 5, 2023 at 10:04 PM Venkatakrishnan Sowrirajan <
> > > vsowr...@asu.edu>
> > > > wrote:
> > > >
> > > > > Based on an offline discussion with Becket Qin, I added *fieldIndices*
> > > > > back, which is the field index of the nested field at every level, to
> > > > > the *NestedFieldReferenceExpression* in FLIP-356
> > > > > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown>.
> > > > > Two reasons to do it:
> > > > >
> > > > > 1. Agree with using *fieldIndices *as the only contract to refer
> to the
> > > > > column from the underlying datasource.
> > > > > 2. To keep it consistent with *FieldReferenceExpression*
> > > > >
> > > > > Having said that, I see that with *projection pushdown*, the index of
> > > > > the fields is used, whereas with *filter pushdown* (based on scanning a
> > > > > few table sources) the *FieldReferenceExpression*'s name is used, e.g.
> > > > > even in Flink's *FileSystemTableSource*, *IcebergSource*, and
> > > > > *JDBCDataSource*. This way, I feel the contract is not quite clear and
> > > > > explicit. Wanted to understand others' thoughts as well.
> > > > >
> > > > > Regards
> > > > > Venkata krishnan
> > > > >
> > > > >
> > > > > On Tue, Sep 5, 2023 at 5:34 PM Becket Qin 
> > > wrote:
> > > > >
> > > > >> Hi Venkata,
> > > > >>
> > > > >>
> > > > >> > Also I made minor changes to the *NestedFieldReferenceExpression*:
> > > > >> > instead of *fieldIndexArray*, we can just have a *fieldNames* array
> > > > >> > that includes the fieldName at every level of the nested field.
> > > > >>
> > > > >>
> > > > >> I don't think keeping only the field names array would work. At
> the
> > > end of
> > > > >> the day, the contract between Flink SQL and the connectors is
> based
> > > on the
> > > > >> indexes, not the names. Technically speaking, the connectors only
> > > emit a
> > > > >> bunch of RowData which is based on positions. The field names are
> > > added by
> > > > >> the SQL framework via the DDL for those RowData. In this sense,
> the
> > > > >> connectors may not be aware of the field names in Flink DDL at
> all.
> > > The
> > > > >> common language between Flink SQL and source is just positions.
> This
> > > is
> > > > >> also why ProjectionPushDown would work by only relying on the
> > > indexes, not
> > > > >> the field names. So I think the field index array is a must have
> here
> > > in
> > > > >> the NestedFieldReferenceExpression.
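Becket's point that positions, not names, are the contract can be sketched as follows. This is a toy positional row model, not Flink's actual RowData API: an index path alone is enough to resolve a nested field, with field names never entering the picture:

```java
public class NestedFieldLookup {

    /**
     * Resolves a nested field by walking an index path through
     * position-based rows; each nested row is itself an Object[].
     * No field names are involved at any point.
     */
    static Object fieldAt(Object[] row, int[] indexPath) {
        Object current = row;
        for (int index : indexPath) {
            current = ((Object[]) current)[index];
        }
        return current;
    }

    public static void main(String[] args) {
        // Schema (names exist only in the DDL, not here):
        // row = (id, user(name, address(city)))
        Object[] address = {"Berlin"};
        Object[] user = {"alice", address};
        Object[] row = {42, user};
        // An index path [1, 1, 0] resolves user.address.city positionally.
        System.out.println(fieldAt(row, new int[] {1, 1, 0})); // prints Berlin
    }
}
```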
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Jiangjie (Becket) Qin
> > > > >>
> > > > >> On Fri, Sep 1, 2023 at 8:12 AM Venkatakrishnan Sowrirajan <
> > > > >> vsowr...@asu.edu>
> > > > >> wrote:
> > > > >>
> > > > >> > Gentle ping on the vote for FLIP-356: Support Nested fields
> filter
> > > > >> pushdown
> > > > >> > <
> > > > >>
> > >
> https://www.mail-archive.com/dev@flink.apache.org/msg69289.html

Re: [Discuss] FLIP-366: Support standard YAML for FLINK configuration

2023-09-21 Thread Jing Ge
Hi Junrui,

+1 for following the standard. Thanks for your effort!

Best regards,
Jing

On Thu, Sep 21, 2023 at 5:09 AM Junrui Lee  wrote:

> Hi Jane,
>
> Thank you for your valuable feedback and suggestions.
> I agree with your point about differentiating between "flink-config.yaml"
> and "flink-conf.yaml" to determine the standard syntax at a glance.
>
> While I understand your suggestion of using "flink-conf-default.yaml" to
> represent the default YAML file for Flink 1.x, I have been considering
> the option of using "flink-configuration.yaml" as the file name for the
> new configuration file.
> This name "flink-configuration.yaml" provides a clear distinction between
> the new and old configuration files based on their names, and it does not
> introduce any additional semantics. Moreover, the name
> "flink-configuration.yaml" can continue to be used in future versions,
> such as Flink 2.0.
>
> WDYT? If we can reach a consensus on this, I will update the FLIP
> documentation
> accordingly.
>
> Best regards,
> Junrui
>
> On Wed, Sep 20, 2023 at 23:38, Jane Chan  wrote:
>
> > Hi Junrui,
> >
> > Thanks for driving this FLIP. +1 for adoption of the standard YAML
> syntax.
> > I just have one minor suggestion. It's a little bit challenging to
> > differentiate between `flink-config.yaml` and `flink-conf.yaml` to
> > determine which one uses the standard syntax at a glance. How about
> > using `flink-conf-default.yaml` to represent the default yaml file for
> > Flink 1.x?
> >
> > Best,
> > Jane
> >
> > On Wed, Sep 20, 2023 at 11:06 AM Junrui Lee  wrote:
> >
> > > Hi devs,
> > >
> > > I would like to start a discussion about FLIP-366:
> > > Support standard YAML for FLINK configuration[1]
> > >
> > > The current flink-conf.yaml parser in FLINK is not a standard YAML
> > parser,
> > > which has some shortcomings.
> > > Firstly, it does not support nested structure configuration items and
> > only
> > > supports key-value pairs, resulting in poor readability. Secondly, if
> the
> > > value is a collection type, such as a List or Map, users are required
> to
> > > write the value in a FLINK-specific pattern, which is inconvenient to
> > use.
> > > Additionally, the parser of FLINK has some differences in syntax
> compared
> > > to the standard YAML parser, such as the syntax for parsing comments
> and
> > > null values. These inconsistencies can cause confusion for users, as
> seen
> > > in FLINK-15358 and FLINK-32740.
> > >
> > > By supporting standard YAML, these issues can be resolved, and users
> can
> > > create a Flink configuration file using third-party tools and leverage
> > > some advanced YAML features. Therefore, we propose to support standard
> > > YAML for FLINK configuration.
> > >
> > > You can find more details in the FLIP-366[1]. Looking forward to your
> > > feedback.
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-366%3A+Support+standard+YAML+for+FLINK+configuration
> > >
> > > Best,
> > > Junrui
> > >
> >
>


Re: [VOTE] FLIP-307: Flink Connector Redshift

2023-09-21 Thread Samrat Deb
Thank you,
closing the vote.
I will share the result in a separate thread.

Bests,
Samrat


On Wed, Sep 20, 2023 at 7:22 PM Leonard Xu  wrote:

> +1 (binding)
>
>
> Best,
> Leonard
>
> > On Sep 18, 2023, at 11:53 PM, Ahmed Hamdy  wrote:
> >
> > +1 (non-binding)
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Mon, 18 Sept 2023 at 16:52, Jing Ge 
> wrote:
> >
> >> +1(binding). Thanks!
> >>
> >> Best regards,
> >> Jing
> >>
> >> On Mon, Sep 18, 2023 at 5:26 PM Danny Cranmer 
> >> wrote:
> >>
> >>> Thanks for driving this Samrat!
> >>>
> >>> +1 (binding)
> >>>
> >>> Thanks,
> >>> Danny
> >>>
> >>> On Mon, Sep 18, 2023 at 4:17 AM Samrat Deb 
> >> wrote:
> >>>
>  Hi All,
> 
>  Thanks for all the feedback on FLIP-307: Flink Connector Redshift
> >> [1][2]
> 
>  I'd like to start a vote for FLIP-307. The vote will be open for at
> >> least
>  72
>  hours unless there is an objection or insufficient votes.
> 
>  Bests,
>  Samrat
> 
>  [1]
> 
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
>  [2] https://lists.apache.org/thread/wsz4jgdpnlyw1x781f9qpk7y416b45dj
> 
> >>>
> >>
>
>


Re: [VOTE] FLIP-312: Prometheus Sink Connector

2023-09-21 Thread Samrat Deb
Thank you,

+1 (non-binding)

Bests,
Samrat

On Thu, Sep 21, 2023 at 9:43 AM Leonard Xu  wrote:

> Thanks Lorenzo for driving this.
>
> +1(binding)
>
> Best,
> Leonard
>
> > On Sep 21, 2023, at 11:47 AM, Yun Tang  wrote:
> >
> > +1 (binding)
> >
> > Thanks for driving this, Lorenzo.
> >
> > Best
> > Yun Tang
> > 
> > From: Hong 
> > Sent: Thursday, September 21, 2023 1:22
> > To: dev@flink.apache.org 
> > Subject: Re: [VOTE] FLIP-312: Prometheus Sink Connector
> >
> > +1 (binding)
> >
> > Thanks Lorenzo.
> >
> > Hong
> >
> >> On 20 Sep 2023, at 17:49, Danny Cranmer 
> wrote:
> >>
> >> +1 binding.
> >>
> >> Thanks for picking this up Lorenzo!
> >>
> >> Danny
> >>
> >>
> >>> On Wed, 20 Sept 2023, 16:33 Jing Ge, 
> wrote:
> >>>
> >>> +1(binding) Thanks!
> >>>
> >>> Best regards,
> >>> Jing
> >>>
> >>> On Wed, Sep 20, 2023 at 3:20 PM Martijn Visser <
> martijnvis...@apache.org>
> >>> wrote:
> >>>
>  +1 (binding)
> 
>  Thanks for driving this. Cheers, M
> 
>  On Mon, Sep 18, 2023 at 1:51 PM Lorenzo Nicora <
> lorenzo.nic...@gmail.com
> 
>  wrote:
> >
> > Hi All,
> >
> > Thanks for the feedback on FLIP-312: Prometheus Sink Connector [1].
> > We updated the FLIP accordingly [2].
> >
> > I would like to open the vote on FLIP-312.
> > The vote will be open for at least 72 hours unless there is an
> >>> objection
>  or
> > insufficient votes.
> >
> >
> > [1] https://lists.apache.org/thread/tm4qqfb4fxr7bc6nq5mwty1fqz8sj39x
> > [2]
> >
> 
> >>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-312%3A+Prometheus+Sink+Connector
> >
> > Regards
> > Lorenzo Nicora
> 
> >>>
> >
>
>


Re: [DISCUSS] FLIP-328: Allow source operators to determine isProcessingBacklog based on watermark lag

2023-09-21 Thread Dong Lin
Hi all,

Jark and I discussed this FLIP offline and I will summarize our discussion
below. It would be great if you could provide your opinion of the proposed
options.

Regarding the target use-cases:
- We both agreed that MySQL CDC should have backlog=true when watermarkLag
is large during the binlog phase.
- Dong argued that other streaming sources with watermarkLag defined (e.g.
Kafka) should also have backlog=true when watermarkLag is large. The
pros/cons discussion below assumes this use-case needs to be supported.

The 1st option is what is currently proposed in FLIP-328, with the
following key characteristics:
1) There is one job-level config (i.e.
pipeline.backlog.watermark-lag-threshold) that applies to all sources with
watermarkLag metric defined.
2) The semantics of `#setIsProcessingBacklog(false)` is that it overrides
the effect of the previous invocation (if any) of
`#setIsProcessingBacklog(true)` on the given source instance.
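For concreteness, the option-1 rule could be sketched as below. The method and parameter names are illustrative assumptions, not the actual Flink API: a source reports backlog=true whenever its watermark lag exceeds the job-level threshold:

```java
public class BacklogByWatermarkLag {

    /**
     * Sketch of option 1: derive the backlog status of a source from its
     * watermark lag and the job-level
     * pipeline.backlog.watermark-lag-threshold value.
     */
    static boolean isProcessingBacklog(long currentTimeMs, long watermarkMs,
                                       long lagThresholdMs) {
        long watermarkLagMs = currentTimeMs - watermarkMs;
        return watermarkLagMs > lagThresholdMs;
    }

    public static void main(String[] args) {
        long thresholdMs = 5 * 60 * 1000; // e.g. a 5-minute threshold
        // Catch-up (e.g. binlog) phase: watermark far behind -> backlog.
        System.out.println(isProcessingBacklog(10_000_000L, 1_000_000L, thresholdMs));
        // Caught up: small lag -> no backlog.
        System.out.println(isProcessingBacklog(10_000_000L, 9_950_000L, thresholdMs));
    }
}
```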

The 2nd option is what Jark proposed in this email thread, with the
following key characteristics:
1) Add source-specific config (both Java API and SQL source property) to
every source for which we want to set backlog status based on the
watermarkLag metric. For example, we might add separate Java APIs
`#setWatermarkLagThreshold`  for MySQL CDC source, HybridSource,
KafkaSource, PulsarSource etc.
2) The semantics of `#setIsProcessingBacklog(false)` is that the given
source instance will have backlog=false.

Here are the key pros/cons of these two options.

Cons of the 1st option:
1) The semantics of `#setIsProcessingBacklog(false)` is harder to
understand for Flink operator developers than the corresponding semantics
in option-2.

Cons of the 2nd option:
1) More work for end-users. For a job with multiple sources that need to be
configured with a watermark lag threshold, users need to specify multiple
configs (one for each source) instead of specifying one job-level config.

2) More work for Flink operator developers. Overall there are more public
APIs (one Java API and one SQL property for each source that needs to
determine backlog based on watermark) exposed to end users. This also adds
more burden for the Flink community to maintain these APIs.

3) It would be hard (e.g. require backward incompatible API change) to
extend the Flink runtime to support job-level config to set watermark
strategy in the future (e.g. support the
pipeline.backlog.watermark-lag-threshold in option-1). This is because an
existing source operator's code might have hardcoded an invocation of
`#setIsProcessingBacklog(false)`, which means the backlog status must be
set to false, which prevents the Flink runtime from setting backlog=true
when a new strategy is triggered.

Overall, I am still inclined to choose option-1 because it is more
extensible and simpler to use in the long term when we want to support/use
multiple sources whose backlog status can change based on the watermark
lag. While option-1's `#setIsProcessingBacklog` is a bit harder to
understand than option-2, I think this overhead/cost is worthwhile as it
makes end-users' life easier in the long term.

Jark: thank you for taking the time to review this FLIP. Please feel free
to comment if I missed anything in the pros/cons above.

Jark and I have not reached agreement on which option is better. It will be
really helpful if we can get more comments on these options.

Thanks,
Dong


On Tue, Sep 19, 2023 at 11:26 AM Dong Lin  wrote:

> Hi Jark,
>
> Thanks for the reply. Please see my comments inline.
>
> On Tue, Sep 19, 2023 at 10:12 AM Jark Wu  wrote:
>
>> Hi Dong,
>>
>> Sorry for the late reply.
>>
>> > The rationale is that if there is any strategy that is triggered and
>> says
>> > backlog=true, then job's backlog should be true. Otherwise, the job's
>> > backlog status is false.
>>
>> I'm quite confused about this. Does that mean, if the source is in the
>> changelog phase, the source has to continuously invoke
>> "setIsProcessingBacklog(true)" (in an infinite loop?). Otherwise,
>> the job's backlog status would be set to false by the framework?
>>
>
> No, the source would not have to continuously invoke
> setIsProcessingBacklog(true) in an infinite loop.
>
> Actually, I am not very sure why there is confusion that "the job's
> backlog status would be set to false by the framework". Could you explain
> where that comes from?
>
> I guess it might be useful to provide a complete overview of how
> setIsProcessingBacklog(...)
> and pipeline.backlog.watermark-lag-threshold work together to determine the
> overall job's backlog status. Let me explain it below.
>
> Here is the semantics/behavior of setIsProcessingBacklog(..).
> - This method is invoked on a per-source basis.
> - This method can be invoked multiple times with true/false as its input
> parameter.
> - For a given source, the last invocation of this method overwrites the
> effect of earlier invocation of this method.
> - For a given source, if this method has been invoked at

Re: Close orphaned/stale PRs

2023-09-21 Thread Martijn Visser
Hi all,

I really believe that the problem of the number of open PRs is just
that there aren't enough reviewers/resources available to review them.

> Stale PRs can clutter the repository, and closing them helps keep it 
> organized and ensures that only relevant and up-to-date PRs are present.

Sure, but what's the indicator that the PR is stale? The fact that
there has been no reviewer yet to review it, doesn't mean that the PR
is stale. For me, a stale PR is a PR that has been reviewed, changes
have been requested and the contributor isn't participating in the
discussion anymore. But that's a different story compared to closing
PRs where there has been no review done at all.

> It mainly helps the project maintainers/reviewers to focus on only the 
> actively updated trimmed list of PRs that are ready for review.

I disagree that closing PRs helps with this. If you want to help
maintainers/reviewers, we should have a situation where it's obvious
that a PR is really ready (meaning, CI has passed, PR contents/commit
message etc are following the code contribution guidelines).

> It helps Flink users who are waiting on a PR that enhances an existing 
> feature or fixes an issue a clear indication on whether the PR will be 
> continually worked on and eventually get a closure or not and therefore will 
> be closed.

Having other PRs being closed doesn't increase the guarantee that
other PRs will be reviewed. It's still a capacity problem.

> It would be demotivating for any contributor when there is no feedback for a 
> PR within a sufficient period of time anyway.

Definitely. But I think it would be even worse if someone makes a
contribution, there is no response but after X days they get a message
that their PR was closed automatically.

I'm +1 for (automatically) closing PRs after X days which:
a) Don't have a CI that has passed
b) Don't follow the code contribution guide (like commit naming conventions)
c) Have changes requested but aren't being followed-up by the contributor

I'm -1 for automatically closing PRs where no maintainers have taken a
review for the reasons I've listed above.

Best regards,

Martijn

On Wed, Sep 20, 2023 at 7:41 AM Venkatakrishnan Sowrirajan
 wrote:
>
> Thanks for your response, Martijn.
>
> > What's the added value of
> closing these PRs
>
> It mainly helps the project maintainers/reviewers to focus on only the
> actively updated trimmed list of PRs that are ready for review.
>
> It helps Flink users who are waiting on a PR that enhances an existing
> feature or fixes an issue a clear indication on whether the PR will be
> continually worked on and eventually get a closure or not and therefore
> will be closed.
>
> Btw, I am open to other suggestions or enhancements on top of the proposal
> as well.
>
> > it would
> just close PRs where maintainers haven't been able to perform a
> review, but getting a PR closed without any feedback is also
> demotivating for a (potential new) contributor
>
> It would be demotivating for any contributor when there is no feedback for
> a PR within a sufficient period of time anyway. I don't see how closing a PR
> that is inactive after a sufficient period of time (say 60 to 90 days)
> would be any more discouraging than not getting any feedback. The problem
> of not getting feedback due to insufficient maintainer bandwidth has to be
> solved through other mechanisms.
>
> > I think the important
> thing is that we get into a cycle where maintainers can see which PRs
> are ready for review, and also a way to divide the bulk of the work.
>
> Yes, exactly my point as well. It helps the maintainers to see a trimmed
> list which is ready to be reviewed.
>
> +1 for the other automation to nudge/help the contributor to fix the PR
> that follows the contribution guide, CI checks passed etc.
>
> > IIRC we can't really fix that until we can
> finally move to dedicated Github Action Runners instead of the current
> setup with Azure, but that's primarily blocked by ASF Infra.
>
> Curious if you can share the JIRA or prior discussion on this topic. I
> would like to learn more about why GitHub Actions cannot be used for Apache
> Flink.
>
> Regards
> Venkata krishnan
>
>
> On Tue, Sep 19, 2023 at 2:00 PM Martijn Visser 
> wrote:
>
> > Hi Venkata,
> >
> > Thanks for opening the discussion, I've been thinking about it quite a
> > bit but I'm not sure what's the right approach.
> >
> > From your proposal, the question would be "What's the added value of
> > closing these PRs"? I don't see an immediate value of that: it would
> > just close PRs where maintainers haven't been able to perform a
> > review, but getting a PR closed without any feedback is also
> > demotivating for a (potential new) contributor. I think the important
> > thing is that we get into a cycle where maintainers can see which PRs
> > are ready for review, and also a way to divide the bulk of the work.
> > Because doing proper reviews requires time, and these resources are
> > scarce.

Re: [DISCUSS] FLIP-368 Reorganize the exceptions thrown in state interfaces

2023-09-21 Thread Jing Ge
Fair enough! Thanks Zakelly for the information. AFAIC, even if users can do
nothing with Flink itself, they can still do something in their own
territory, at least some logging and metrics, or triggering other
services in their ecosystem. After all, the Flink jobs they build are part
of their service components. It doesn't change the fact that we are going to
use the anti-pattern. Just because we don't expect users to do
anything with Flink does not mean users don't expect to do something with
the expected exception. Anyway, I am open to hearing different opinions.

Best regards,
Jing

On Thu, Sep 21, 2023 at 7:02 AM Zakelly Lan  wrote:

> Hi Martijn,
>
> Thanks for the reminder!
>
> This FLIP proposes a change to the state API that is annotated as
> @PublicEvolving and targets version 1.19.  I have clarified this in
> the "Proposed Change" section of the FLIP.
>
>
> Hi Jing,
>
> Thanks for sharing your thoughts! Here are my opinions:
>
> 1. The exceptions of the state API are usually treated as critical
> ones. In other words, if anything goes wrong with state access, the
> element processing cannot proceed and the job should fail. Flink users
> may not know what to do when they encounter these exceptions. I
> believe this is the main reason why we want to replace them with
> unchecked exceptions.
> 2. There have also been some further discussions[1][2] from Stephan
> and Shixiaogang below the one you pointed out [3], and it seems they
> come to an agreement to use unchecked exceptions. After reviewing the
> entire discussion on that PR, I think their arguments are reasonable
> given the use case.
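For illustration, the unchecked-exception approach the thread converges on could look roughly like this. The class names mirror the StateAccessException/StateIOException classes discussed in this thread, but the exact hierarchy and wrapping pattern here are assumptions, not the FLIP's final design:

```java
import java.io.IOException;

public class UncheckedStateExceptions {
    // Unchecked base class: callers are not forced to catch it.
    static class StateAccessException extends RuntimeException {
        StateAccessException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // More specific subclass for IO-backed state (e.g. RocksDB).
    static class StateIOException extends StateAccessException {
        StateIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // A state accessor that no longer declares `throws IOException`:
    // the checked exception is wrapped and rethrown unchecked.
    static String readValue(boolean failIO) {
        try {
            if (failIO) {
                throw new IOException("disk read failed");
            }
            return "value";
        } catch (IOException e) {
            throw new StateIOException("Could not access state", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readValue(false)); // prints "value"
        try {
            readValue(true);
        } catch (StateAccessException e) { // catching is optional, not enforced
            System.out.println(e.getMessage()); // prints "Could not access state"
        }
    }
}
```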
>
> Looking forward to your feedback.
>
>
> Best,
> Zakelly
>
> [1] https://github.com/apache/flink/pull/3380#issuecomment-286807853
> [2] https://github.com/apache/flink/pull/3380#issuecomment-286932133
> [3] https://github.com/apache/flink/pull/3380#issuecomment-281631160
>
> On Thu, Sep 21, 2023 at 1:27 AM Jing Ge 
> wrote:
> >
> > sorry, typo: It is a known "anti-pattern" instead of "ant-pattern"
> >
> > Best regards,
> > Jing
> >
> > On Wed, Sep 20, 2023 at 7:23 PM Jing Ge  wrote:
> >
> > > Hi Zakelly,
> > >
> > > Thanks for driving this topic. From good software engineering's
> > > perspective, I have different thoughts:
> > >
> > > 1. The idea to get rid of all checked Exceptions and replace them with
> > > unchecked Exceptions is a known ant-pattern: "Generally speaking, do
> not
> > > throw a RuntimeException or create a subclass of RuntimeException
> simply
> > > because you don't want to be bothered with specifying the exceptions
> your
> > > methods can throw." [1] Checked Exceptions mean expected exceptions
> that
> > > can help developers find a way to catch them and decide what to do. It
> is
> > > part of the public API signature that can help developers build robust
> > > systems. We should not mix concepts and build expected exceptions with
> > > unchecked Java Exception classes.
> > > 2. The comment Stephan left [2] clearly pointed out that we should
> avoid
> > > using generic Java Exceptions, and "find some more 'specific'
> exceptions
> > > for the signature, like throws IOException or throws
> StateAccessException."
> > > So, the idea is to define/use specific checked Exception classes
> instead of
> > > using unchecked Exceptions.
> > >
> > > Looking forward to your thoughts.
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > [1]
> > >
> https://docs.oracle.com/javase/tutorial/essential/exceptions/runtime.html
> > > [2] https://github.com/apache/flink/pull/3380#issuecomment-281631160
> > >
> > > On Wed, Sep 20, 2023 at 4:52 PM Zakelly Lan 
> wrote:
> > >
> > >> Hi Yanfei,
> > >>
> > >> Thanks for your reply!
> > >>
> > >> Yes, this FLIP aims to change all state-related exceptions to
> > >> unchecked exceptions and remove all exceptions from the signature. So
> > >> I believe we have come to an agreement to keep the interfaces simple.
> > >>
> > >>
> > >> Best regards,
> > >> Zakelly
> > >>
> > >> On Wed, Sep 20, 2023 at 2:26 PM Zakelly Lan 
> > >> wrote:
> > >> >
> > >> > Hi Hangxiang,
> > >> >
> > >> > Thank you for your response! Here are my thoughts:
> > >> >
> > >> > 1. Regarding the exceptions thrown by internal interfaces, I suggest
> > >> > keeping them as checked exceptions. Since these exceptions will be
> > >> > handled by the internal callers, it is meaningful to throw them as
> > >> > checked ones. If we need to make changes to these classes, we can
> > >> > create separate tickets alongside this FLIP. What are your thoughts
> on
> > >> > this?
> > >> > 2. StateIOException is primarily thrown by file-based state like
> > >> > RocksDB, while StateAccessException is more generic and can be
> thrown
> > >> > by heap states. Additionally, I believe there may be more subclasses
> > >> > of StateAccessException that we can add. We can consider this when
> > >> > implementing.
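A small sketch of the hierarchy discussed in point 2, under two assumptions: that StateIOException would be a subclass of the more generic StateAccessException, and that both would be unchecked per this FLIP's direction. Neither is a confirmed design; the names are taken from the thread.

```java
// Generic state exception; unchecked, per the FLIP's proposal (assumption).
class StateAccessException extends RuntimeException {
    StateAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

// More specific subclass for file/IO-backed state such as RocksDB
// (the subclass relationship is an assumption, not the final design).
class StateIOException extends StateAccessException {
    StateIOException(String message, Throwable cause) {
        super(message, cause);
    }
}

public class HierarchyDemo {
    public static void main(String[] args) {
        RuntimeException e = new StateIOException("disk full", null);
        // Callers can catch the generic type while backends throw the
        // specific one, leaving room for more subclasses later.
        System.out.println(e instanceof StateAccessException);
    }
}
```

This layout would let heap backends throw the generic type while RocksDB throws the IO-specific subclass, and further subclasses could be added without changing any signatures.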
> > >> > 3. I would like to make this change in version 1.19. As mentioned in
> > >> > this FLIP,

Re: [DISCUSS] FLIP-365: Introduce flush interval to adjust the interval of emitting results with idempotent semantics

2023-09-21 Thread Zakelly Lan
Hi Yunfeng and Dong,

Thanks for this FLIP. I have reviewed it briefly and have a few questions:

1. Is this FLIP proposing to buffer the output in the state backend?
If so, what is the data format of this buffer (what type of state does
it use and what is the value)? Additionally, how does the operator
retrieve all the buffer data from the state backend during the
`flush()` operation (while the keyed states can only be accessed under
a keyed context)?
2. Are the buffered intermediate results required to be included in
the next checkpoint? Or are they deleted and subsumed in the original
states during the `flush()` operation before triggering the
checkpoint? I'm asking because if they are not included in the
checkpoint, it may be more efficient to avoid using keyed states for
buffering. In this scenario, a simple heap-based or even file-based
map could be more efficient. Frequent writes and clears can lead to
increased space usage and read amplification for RocksDB, and it also
requires more CPU resources for checkpointing and compaction.
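The alternative raised in question 2 can be sketched as follows: a plain heap-based map (not Flink's keyed state API) that keeps only the latest result per key, so `flush()` emits one final record per key instead of every intermediate update. This is an illustration of the suggestion, not FLIP-365's actual design.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal heap-based buffer sketch: latest-value-per-key, cleared on flush.
public class FlushBufferSketch {
    private final Map<String, Long> buffer = new HashMap<>();

    // Each upstream update overwrites the previous intermediate result,
    // so intermediate values never reach downstream consumers.
    void update(String key, long value) {
        buffer.put(key, value);
    }

    // Emit only the final value per key, then clear. Because the buffer
    // lives on the heap, RocksDB sees no frequent write/clear churn.
    Map<String, Long> flush() {
        Map<String, Long> out = new HashMap<>(buffer);
        buffer.clear();
        return out;
    }

    public static void main(String[] args) {
        FlushBufferSketch op = new FlushBufferSketch();
        op.update("a", 1);
        op.update("a", 2);
        op.update("a", 3);
        op.update("b", 10);
        // Three updates to "a" collapse into a single emitted record.
        System.out.println(op.flush());
    }
}
```

The trade-off Zakelly points out is visible here: this buffer is cheap to write and clear, but its contents are lost on failure unless they are also included in a checkpoint.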


Looking forward to your thoughts.


Best,
Zakelly


On Mon, Sep 11, 2023 at 1:39 PM Yunfeng Zhou
 wrote:
>
> Hi all,
>
> Dong(cc'ed) and I are opening this thread to discuss our proposal to
> support buffering & flushing the output of operators with idempotent
> semantics, which has been documented in
> FLIP-365.
>
> In the pursuit of unifying batch and stream processing, it has been
> discovered that the batch execution mode provides a significant
> advantage by allowing only the final result from "rolling" operations
> such as reduce() or sum() to be emitted, thus reducing the amount of
> time and resources required by downstream applications. Inspired by
> this advantage, the proposed solution supports buffering the output of
> operators in the streaming mode and periodically emitting the final
> results, much like in batch processing. This approach is designed to
> help improve the throughput of jobs by reducing the need for
> downstream applications to process intermediate results in a stream,
> at the cost of increased latency and state size.
>
> Please refer to the FLIP document for more details about the proposed
> design and implementation. We welcome any feedback and opinions on
> this proposal.
>
> Best regards.
> Dong and Yunfeng


Re: [DISCUSS] FLIP-356: Support Nested Fields Filter Pushdown

2023-09-21 Thread Martijn Visser
Hi Venkatakrishnan,

The reason I thought it was abandoned is that the Jira ticket is
part of the umbrella ticket for FLIP-95. Let's move the Jira ticket to
its own dedicated task instead of nesting it under a FLIP-95 ticket.

Thanks,

Martijn

On Wed, Sep 20, 2023 at 4:34 PM Becket Qin  wrote:
>
> Hi Martijn,
>
> This FLIP has passed voting[1]. It is a modification on top of the FLIP-95
> interface.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> [1] https://lists.apache.org/thread/hysv9y1f48gtpr5vx3x40wtjb6cp9ky6
>
> On Wed, Sep 20, 2023 at 9:29 PM Martijn Visser 
> wrote:
>
> > For clarity: is this FLIP being abandoned because it was part
> > of FLIP-95?
> >
> > On Thu, Sep 7, 2023 at 3:01 AM Venkatakrishnan Sowrirajan
> >  wrote:
> > >
> > > Hi everyone,
> > >
> > > Posted a PR (https://github.com/apache/flink/pull/23313) to add nested
> > > fields filter pushdown. Please review. Thanks.
> > >
> > > Regards
> > > Venkata krishnan
> > >
> > >
> > > On Tue, Sep 5, 2023 at 10:04 PM Venkatakrishnan Sowrirajan <
> > vsowr...@asu.edu>
> > > wrote:
> > >
> > > > Based on an offline discussion with Becket Qin, I added *fieldIndices*
> > > > back, which is the field index of the nested field at every level, to
> > > > the *NestedFieldReferenceExpression* in FLIP-356
> > > > <
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown
> > >
> > > > Two reasons to do it:
> > > >
> > > > 1. Agree with using *fieldIndices* as the only contract to refer to
> > > > the column from the underlying data source.
> > > > 2. To keep it consistent with *FieldReferenceExpression*.
> > > >
> > > > Having said that, I see that with *projection pushdown* the indexes of
> > > > the fields are used, whereas with *filter pushdown* (based on scanning a
> > > > few table sources) the *FieldReferenceExpression*'s name is used, e.g. in
> > > > Flink's *FileSystemTableSource, IcebergSource, JDBCDatsource*. This
> > > > way, I feel the contract is not quite clear and explicit. Wanted to
> > > > understand others' thoughts as well.
> > > >
> > > > Regards
> > > > Venkata krishnan
> > > >
> > > >
> > > > On Tue, Sep 5, 2023 at 5:34 PM Becket Qin 
> > wrote:
> > > >
> > > >> Hi Venkata,
> > > >>
> > > >>
> > > >> > Also I made minor changes to the *NestedFieldReferenceExpression,
> > > >> *instead
> > > >> > of *fieldIndexArray* we can just do away with *fieldNames *array
> > that
> > > >> > includes fieldName at every level for the nested field.
> > > >>
> > > >>
> > > >> I don't think keeping only the field names array would work. At the
> > end of
> > > >> the day, the contract between Flink SQL and the connectors is based
> > on the
> > > >> indexes, not the names. Technically speaking, the connectors only
> > emit a
> > > >> bunch of RowData which is based on positions. The field names are
> > added by
> > > >> the SQL framework via the DDL for those RowData. In this sense, the
> > > >> connectors may not be aware of the field names in Flink DDL at all.
> > The
> > > >> common language between Flink SQL and source is just positions. This
> > is
> > > >> also why ProjectionPushDown would work by only relying on the
> > indexes, not
> > > >> the field names. So I think the field index array is a must have here
> > in
> > > >> the NestedFieldReferenceExpression.
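The positional contract Becket describes can be sketched in a few lines: connectors emit rows accessed by position (like RowData), so a nested field is located by walking an index path, with names kept only for readability. The class below is a hypothetical illustration, not Flink's actual NestedFieldReferenceExpression.

```java
import java.util.List;

// Hypothetical sketch of index-based nested field resolution.
public class NestedFieldRefSketch {

    // Walk a nested positional structure one index at a time, e.g.
    // fieldIndices = [0, 2, 1] picks row[0][2][1]. Lists stand in for
    // the positional access a connector's RowData provides.
    static Object resolve(Object row, int[] fieldIndices) {
        Object current = row;
        for (int idx : fieldIndices) {
            current = ((List<?>) current).get(idx);
        }
        return current;
    }

    public static void main(String[] args) {
        // row = [ user = [ name, phone, address = [ street, zip ] ] ]
        Object row = List.of(
                List.of("alice", "555", List.of("main st", "94107")));
        // Resolve user.address.zip purely by position: [0, 2, 1].
        Object zip = resolve(row, new int[]{0, 2, 1});
        System.out.println(zip);
    }
}
```

Note that no field names appear in the resolution path at all, which is why the index array is the part of the expression a connector cannot do without.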
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Jiangjie (Becket) Qin
> > > >>
> > > >> On Fri, Sep 1, 2023 at 8:12 AM Venkatakrishnan Sowrirajan <
> > > >> vsowr...@asu.edu>
> > > >> wrote:
> > > >>
> > > >> > Gentle ping on the vote for FLIP-356: Support Nested fields filter
> > > >> pushdown
> > > >> > <
> > > >>
> > https://urldefense.com/v3/__https://www.mail-archive.com/dev@flink.apache.org/msg69289.html__;!!IKRxdwAv5BmarQ!bOW26WlafOQQcb32eWtUiXBAl0cTCK1C6iYhDI2f_z__eczudAWmTRvjDiZg6gzlXmPXrDV4KJS5cFxagFE$
> > > >> >.
> > > >> >
> > > >> > Regards
> > > >> > Venkata krishnan
> > > >> >
> > > >> >
> > > >> > On Tue, Aug 29, 2023 at 9:18 PM Venkatakrishnan Sowrirajan <
> > > >> > vsowr...@asu.edu>
> > > >> > wrote:
> > > >> >
> > > >> > > Sure, will reference this discussion to resume where we started as
> > > >> > > part of the FLIP to refactor SupportsProjectionPushDown.
> > > >> > >
> > > >> > > On Tue, Aug 29, 2023, 7:22 PM Jark Wu  wrote:
> > > >> > >
> > > >> > >> I'm fine with this. `ReferenceExpression` and
> > > >> > `SupportsProjectionPushDown`
> > > >> > >> can be another FLIP. However, could you summarize the design of
> > this
> > > >> > part
> > > >> > >> in the future part of the FLIP? This can be easier to get started
> > > >> with
> > > >> > in
> > > >> > >> the future.
> > > >> > >>
> > > >> > >>
> > > >> > >> Best,
> > > >> > >> Jark
> > > >> > >>
> > > >> > >>
> > > >> > >> On Wed, 30 Aug 2023 at 02:45, Venkatakrishnan Sowrirajan <
> > > >> > >> vsowr...@asu.edu>
> > > >> > >> wrote:
> > > >> > >>
> > > >> > >> > Thanks Jark. Sounds good.
> > > >> > >> >
> > > >> > >> > One more thing, earlier in 

Re: [Re-DISCUSS] FLIP-202: Introduce ClickHouse Connector

2023-09-21 Thread ConradJam
Thanks to @Martijn @Leonard for raising this question.

This is also something I have been worried about, namely the issue of
code copyright ownership and cooperation. That's why I marked FLIP-202 [1]
as a draft.
I have sent a private email inviting the author to participate. If there is
more news, I will share it promptly. I believe that with continued effort
we will find an opportunity to cooperate.

[1] FLIP-202
https://cwiki.apache.org/confluence/display/FLINK/%5BDRAFT%5D+FLIP-202%3A+Introduce+ClickHouse+Connector