[jira] [Created] (FLINK-31194) Introduces savepoint mechanism of Table Store
Nicholas Jiang created FLINK-31194: -- Summary: Introduces savepoint mechanism of Table Store Key: FLINK-31194 URL: https://issues.apache.org/jira/browse/FLINK-31194 Project: Flink Issue Type: New Feature Components: Table Store Affects Versions: table-store-0.4.0 Reporter: Nicholas Jiang Fix For: table-store-0.4.0 Disaster recovery is mission critical for any software. Especially when it comes to data systems, the impact can be serious, leading to delayed or even wrong business decisions. Flink Table Store could introduce a savepoint mechanism to assist users in recovering data from a previous state. As the name suggests, a "savepoint" saves the table as of a given snapshot, so that you can restore the table to this savepoint at a later point in time if need be. Care is taken to ensure that the cleaner will not clean up any files that are savepointed. Along the same lines, a savepoint cannot be triggered on a snapshot that has already been cleaned up. In simpler terms, this is synonymous with taking a backup, except that we don't make a new copy of the table; we just save the state of the table elegantly so that we can restore it later when in need. -- This message was sent by Atlassian Jira (v8.20.10#820010)
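For illustration only, here is a minimal Java sketch of the semantics the ticket describes — a savepoint pins a snapshot so the cleaner skips its files, and cannot be created from an already-cleaned snapshot. The class, methods, and bookkeeping are hypothetical, not the actual Table Store API:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of savepoint semantics (not the Table Store API):
// a savepoint pins a snapshot id so the snapshot cleaner skips its files,
// and a savepoint cannot be created from an already-cleaned snapshot.
class SavepointManager {
    private final Set<Long> savepointedSnapshots = new HashSet<>();
    private final Set<Long> liveSnapshots;

    SavepointManager(Set<Long> liveSnapshots) {
        this.liveSnapshots = liveSnapshots;
    }

    void createSavepoint(long snapshotId) {
        if (!liveSnapshots.contains(snapshotId)) {
            throw new IllegalArgumentException(
                    "Snapshot " + snapshotId + " was already cleaned up");
        }
        savepointedSnapshots.add(snapshotId);
    }

    /** The cleaner must consult this before deleting a snapshot's files. */
    boolean isPinned(long snapshotId) {
        return savepointedSnapshots.contains(snapshotId);
    }
}
```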
[jira] [Created] (FLINK-31193) The option `table.exec.hive.native-agg-function.enabled` should work at job level when used on the SqlClient side
dalongliu created FLINK-31193: - Summary: The option `table.exec.hive.native-agg-function.enabled` should work at job level when used on the SqlClient side Key: FLINK-31193 URL: https://issues.apache.org/jira/browse/FLINK-31193 Project: Flink Issue Type: Sub-task Components: Connectors / Hive Affects Versions: 1.17.0 Reporter: dalongliu Fix For: 1.18.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
Hi Kui, Thanks for your answer, and +1 to yuxia too. > we should not bind the watermark-related options to a connector to ensure semantic clarity. In my opinion, adding watermark-related options to a connector is much clearer. Currently users can define a simple watermark strategy in DDL; adding more configuration items in the connector options is easy to understand. Best, Shammon On Thu, Feb 23, 2023 at 10:52 AM Jingsong Li wrote: > Thanks for your proposal. > > +1 to yuxia, consider watermark-related hints as option hints. > > Personally, I am cautious about adding SQL syntax; WATERMARK_PARAMS is > also SQL syntax to some extent. > > We can use OPTIONS to meet this requirement if possible. > > Best, > Jingsong
[jira] [Created] (FLINK-31192) dataGen takes too long to initialize under sequence
xzw0223 created FLINK-31192: --- Summary: dataGen takes too long to initialize under sequence Key: FLINK-31192 URL: https://issues.apache.org/jira/browse/FLINK-31192 Project: Flink Issue Type: Improvement Affects Versions: 1.16.1, 1.16.0 Reporter: xzw0223 Fix For: 1.16.1, 1.16.0 The SequenceGenerator preloads all sequence values in open(). If the totalElement number is too large, this takes too long. [https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/datagen/SequenceGenerator.java#L91] The reason is that the capacity of the Deque is doubled whenever the current capacity is full, which requires an array copy and is time-consuming. Here is what I propose: do not preload the full amount of data for a sequence; instead, generate one element each time next() is called, which solves the slow initialization caused by loading the full amount of data. Record the currently emitted sequence position in the checkpoint, and continue sending data from the recorded position after an abnormal restart to ensure fault tolerance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
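A simplified sketch of the proposed behavior — the names here are illustrative, not the actual SequenceGenerator API. Values are derived lazily from a cursor instead of being preloaded into a Deque in open(), and the cursor is what gets checkpointed for fault tolerance:

```java
// Illustrative sketch only (not the actual SequenceGenerator API).
public class LazySequence {
    private final long start;
    private final long end;
    private long nextOffset; // restored from checkpointed state on restart

    public LazySequence(long start, long end, long restoredOffset) {
        this.start = start;
        this.end = end;
        this.nextOffset = restoredOffset;
    }

    public boolean hasNext() {
        return start + nextOffset <= end;
    }

    public long next() {
        // O(1) per element; no O(totalElement) preload in open()
        return start + nextOffset++;
    }

    public long snapshotOffset() {
        // persisted on checkpoint; after an abnormal restart the generator
        // resumes from here instead of re-emitting earlier values
        return nextOffset;
    }
}
```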
[jira] [Created] (FLINK-31191) VectorIndexer should check whether doublesByColumn is null before snapshot
Zhipeng Zhang created FLINK-31191: - Summary: VectorIndexer should check whether doublesByColumn is null before snapshot Key: FLINK-31191 URL: https://issues.apache.org/jira/browse/FLINK-31191 Project: Flink Issue Type: Bug Components: Library / Machine Learning Affects Versions: ml-2.2.0 Reporter: Zhipeng Zhang Currently VectorIndexer can lead to an NPE when taking a checkpoint. It should check whether `doublesByColumn` is null before calling snapshot. logview: [https://github.com/apache/flink-ml/actions/runs/4249415318/jobs/7389547039] details:
Caused by: java.lang.NullPointerException
    at org.apache.flink.ml.feature.vectorindexer.VectorIndexer$ComputeDistinctDoublesOperator.convertToListArray(VectorIndexer.java:232)
    at org.apache.flink.ml.feature.vectorindexer.VectorIndexer$ComputeDistinctDoublesOperator.snapshotState(VectorIndexer.java:228)
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:222)
    ... 33 more
-- This message was sent by Atlassian Jira (v8.20.10#820010)
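A minimal sketch of the proposed guard; the state handle name `doublesState` and the surrounding operator plumbing are assumptions — only `convertToListArray`, `doublesByColumn`, and the `snapshotState` entry point come from the stack trace above:

```java
// Hedged sketch of the fix; doublesState is an assumed state handle name.
@Override
public void snapshotState(StateSnapshotContext context) throws Exception {
    super.snapshotState(context);
    // A checkpoint may fire before the operator has processed any input,
    // in which case doublesByColumn is still null; guard the conversion
    // instead of letting convertToListArray throw an NPE.
    if (doublesByColumn != null) {
        doublesState.update(convertToListArray(doublesByColumn));
    }
}
```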
Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Hi Jingsong, thanks, I got it. In this way, there is no need to introduce new API changes. Best Regards, Ran Tao On Thu, Feb 23, 2023 at 12:26 PM, Jingsong Li wrote: > Hi Ran, > > I mean we can just use > TableEnvironment.getCatalog(getCurrentCatalog).get().listDatabases(). > > We don't need to provide new APIs just for utils. > > Best, > Jingsong
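For reference, a minimal runnable sketch of the approach Jingsong describes, using only the existing public API (`TableEnvironment.getCatalog` returns an `Optional<Catalog>`); the streaming-mode setup is just for the example:

```java
import java.util.List;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.Catalog;

public class ListDatabasesViaCatalogApi {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Resolve the current catalog and list its databases directly;
        // no new TableEnvironment method is required.
        Catalog catalog =
                tEnv.getCatalog(tEnv.getCurrentCatalog())
                        .orElseThrow(() -> new IllegalStateException("no current catalog"));
        List<String> databases = catalog.listDatabases();
        databases.forEach(System.out::println);
    }
}
```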
[jira] [Created] (FLINK-31190) Supports Spark call procedure command on Table Store
Nicholas Jiang created FLINK-31190: -- Summary: Supports Spark call procedure command on Table Store Key: FLINK-31190 URL: https://issues.apache.org/jira/browse/FLINK-31190 Project: Flink Issue Type: New Feature Components: Table Store Affects Versions: table-store-0.4.0 Reporter: Nicholas Jiang Fix For: table-store-0.4.0 At present, Hudi and Iceberg support the Spark call procedure command to execute table service actions etc. Flink Table Store could also support the Spark call procedure command to run compaction etc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Hi Ran, I mean we can just use TableEnvironment.getCatalog(getCurrentCatalog).get().listDatabases(). We don't need to provide new APIs just for utils. Best, Jingsong On Thu, Feb 23, 2023 at 12:11 PM Ran Tao wrote: > > Hi Jingsong, thanks. > > The implementation of these statements in TableEnvironmentImpl is called > through the catalog API. > > But it does need to support some new override methods on the catalog API side, and > I will update it later. Thank you. > > e.g. > TableEnvironmentImpl > @Override > public String[] listDatabases() { > return catalogManager > .getCatalog(catalogManager.getCurrentCatalog()) > .get() > .listDatabases() > .toArray(new String[0]); > } > > Best Regards, > Ran Tao
Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Hi Jingsong, thanks. The implementation of these statements in TableEnvironmentImpl is called through the catalog API. But it does need to support some new override methods on the catalog API side, and I will update it later. Thank you. e.g. TableEnvironmentImpl

@Override
public String[] listDatabases() {
    return catalogManager
            .getCatalog(catalogManager.getCurrentCatalog())
            .get()
            .listDatabases()
            .toArray(new String[0]);
}

Best Regards, Ran Tao On Thu, Feb 23, 2023 at 11:47 AM, Jingsong Li wrote: > Thanks for the proposal. > > +1 for the proposal. > > I am confused about "Proposed TableEnvironment SQL API Changes"; can > we just use the catalog API for this requirement? > > Best, > Jingsong
RE: Re: Re: [discussion] To introduce a formatter for Markdown files
Hi Jing, Prettier is a versatile formatting tool with very limited options for markdown files. If we need to specify the rules, I think other tools, such as markdownlint [1], can be a better choice. With markdownlint, we can specify many kinds of rules. Here is an example for the `unordered list` rule, whose style ID is MD004 [2].

```json
{
  "default": true,
  "MD004": { "style": "asterisk" }
}
```

Then markdown files will only accept '*' for making a list item. The specific formatting tool may be left to a vote later. [1] https://github.com/DavidAnson/markdownlint [2] https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md On 2023/02/22 21:22:02 Jing Ge wrote: > Hi Zhongpu, > > Thanks for the clarification. Prettier looks fine for formatting md files. > Is there any way to validate rules, e.g. only using '*' for listing > items, in the CI pipeline? > > Best regards, > Jing
Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Thanks for the proposal. +1 for the proposal. I am confused about "Proposed TableEnvironment SQL API Changes"; can we just use the catalog API for this requirement? Best, Jingsong On Thu, Feb 23, 2023 at 10:48 AM Jacky Lau wrote: > > Hi Ran: > Thanks for driving the FLIP. The google doc looks really good. It is > important to improve the user's interactive experience. +1 to support this > feature.
Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Thanks Jing. There is a small mistake: useLike means whether to enable LIKE. e.g.

SHOW TABLES from cat1.db1 not like 't%' // useLike is true, notLike is true
SHOW TABLES from cat1.db1 like 't%' // useLike is true, notLike is false

In both cases useLike is true. I have updated the FLIP for this. Question 2: I think we currently do not support 'ILIKE' in this FLIP. Because other statements are currently based on the implementation of LIKE, the implementation of this FLIP is compatible with other SQL syntax. Also, in the doc in this FLIP you can find that some other engines only support LIKE. What do you think? Best Regards, Ran Tao https://github.com/chucheng92 On Thu, Feb 23, 2023 at 00:50, Jing Ge wrote: > Hi Ran, > > Thanks for driving the FLIP. It looks overall good. Would you like to add > a description of useLike and notLike? I guess useLike true is for "LIKE" > and notLike true is for "NOT LIKE" but I am not sure if I understood it > correctly. Furthermore, does it make sense to support "ILIKE" too? > > Best regards, > Jing
[jira] [Created] (FLINK-31189) Allow ignoring less frequent values in StringIndexer
Fan Hong created FLINK-31189: Summary: Allow ignoring less frequent values in StringIndexer Key: FLINK-31189 URL: https://issues.apache.org/jira/browse/FLINK-31189 Project: Flink Issue Type: Improvement Components: Library / Machine Learning Reporter: Fan Hong -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
Thanks for your proposal. +1 to yuxia, consider watermark-related hints as option hints. Personally, I am cautious about adding SQL syntax; WATERMARK_PARAMS is also SQL syntax to some extent. We can use OPTIONS to meet this requirement if possible. Best, Jingsong On Thu, Feb 23, 2023 at 10:41 AM yuxia wrote: > > Hi, Yuan Kui. > Thanks for driving it. > IMO, the 'OPTIONS' hint may not be specific only to the connector options. > Just as a reference, we also have `sink.parallelism`[1] as a connector > option. It enables users to specify the writer's parallelism dynamically per-query. > > Personally, I prefer to consider watermark-related hints as option hints. So, > users can define a default watermark strategy for the table, and if they > don't need to change it, they need to do nothing in their query instead of > specifying it every time. > > [1] > https://nightlies.apache.org/flink/flink-docs-master/zh/docs/connectors/table/filesystem/#sink-parallelism > > Best regards, > Yuxia
Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Hi Ran: Thanks for driving the FLIP. The google doc looks really good. It is important to improve the user's interactive experience. +1 to support this feature. On Thu, Feb 23, 2023 at 00:51, Jing Ge wrote: > Hi Ran, > > Thanks for driving the FLIP. It looks overall good. Would you like to add > a description of useLike and notLike? I guess useLike true is for "LIKE" > and notLike true is for "NOT LIKE" but I am not sure if I understood it > correctly. Furthermore, does it make sense to support "ILIKE" too? > > Best regards, > Jing
Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
Hi, Yuan Kui. Thanks for driving it. IMO, the 'OPTIONS' hint may not be specific only to the connector options. Just as a reference, we also have `sink.parallelism`[1] as a connector option. It enables users to specify the writer's parallelism dynamically per-query. Personally, I prefer to consider watermark-related hints as option hints. So, users can define a default watermark strategy for the table, and if they don't need to change it, they need to do nothing in their query instead of specifying it every time. [1] https://nightlies.apache.org/flink/flink-docs-master/zh/docs/connectors/table/filesystem/#sink-parallelism Best regards, Yuxia ----- Original Message ----- From: "kui yuan" To: "dev" Cc: "Jark Wu" Sent: Wednesday, February 22, 2023, 10:08:11 PM Subject: Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL Hi all, Thanks for the lively discussion; I will respond to these questions one by one. However, there are also some common questions, which I will answer together. @郑 Thanks for your reply. The features mentioned in this FLIP are only for those source connectors that implement the SupportsWatermarkPushDown interface; generating watermarks at other locations in the graph is not in the scope of this discussion. Perhaps another FLIP can be proposed later to implement this feature. @Shammon Thanks for your reply. In FLIP-296, a rejected alternative is adding watermark-related options in the connector options; we believe that we should not bind the watermark-related options to a connector, to ensure semantic clarity. > What will happen if we add watermark related options in `the connector > options`? Will the connector ignore these options or throw an exception? > How can we support this? If a user defines different watermark configurations for one table in two places, I tend to prefer that the first place prevails, but we could also throw an exception or just print logs to prompt the user; these are implementation details. > If one table is used by two operators with different watermark params, > what will happen? @Martijn Thanks for your reply. I'm sorry that we were not particularly accurate; this hint is mainly for SQL, not the Table API. > While the FLIP talks about watermark options for Table API & SQL, I only > see proposed syntax for SQL, not for the Table API. What is your proposal > for the Table API @Jane Thanks for your reply. For the first question, if the user uses this hint on sources that do not implement the SupportsWatermarkPushDown interface, it will be completely invalid. The task will run as normal, as if the hint had not been used. > What's the behavior if there are multiple table sources, among which > some do not support `SupportsWatermarkPushDown`? @Jane's feedback is that 'WATERMARK_PARAMS' is difficult to remember; perhaps the naming issue can be put to the end of the discussion, because more people like @Martijn @Shuo are considering whether these configurations should be put into the DDL or the 'OPTIONS' hint. Here is what I think: putting these configs into DDL or putting them into the 'OPTIONS' hint is actually the same thing, because the 'OPTIONS' hint is mainly used to configure the properties of the connector. The reason why I want to use a new hint is to keep the semantics clear; in my opinion the configuration of the watermark should not be mixed up with the connector. However, a new hint does make it somewhat more difficult to use, for example, when a user uses both the 'OPTIONS' hint and the 'WATERMARK_PARAMS' hint. For this point, maybe it is more appropriate to use a uniform 'OPTIONS' hint. On the other hand, if we enrich more watermark option keys in 'OPTIONS' hints, the question will be how we treat the definition of the 'OPTIONS' hint: is it specific only to the connector options, or could it be more? Maybe @Jark could share more insights here. In my opinion, 'OPTIONS' is only related to the connector options, which is not like the general watermark options.
[jira] [Created] (FLINK-31188) Expose kubernetes scheduler configOption when running flink on kubernetes
Kelu Tao created FLINK-31188: Summary: Expose kubernetes scheduler configOption when running flink on kubernetes Key: FLINK-31188 URL: https://issues.apache.org/jira/browse/FLINK-31188 Project: Flink Issue Type: Improvement Components: Deployment / Kubernetes Reporter: Kelu Tao Currently, when we deploy a Flink job on Kubernetes, the default Kubernetes scheduler is used. But users sometimes need a custom Kubernetes scheduler. Can we add a config option for the Kubernetes scheduler setting? Thanks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
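A sketch of what such an option could look like; the key `kubernetes.pod-scheduler-name`, the class name, and the description are assumptions, not an agreed-upon Flink option:

```java
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

// Hypothetical option definition in the usual Flink ConfigOptions style.
public class KubernetesSchedulerOptions {
    public static final ConfigOption<String> POD_SCHEDULER_NAME =
            ConfigOptions.key("kubernetes.pod-scheduler-name")
                    .stringType()
                    .noDefaultValue()
                    .withDescription(
                            "Name of the custom Kubernetes scheduler assigned to "
                                    + "JobManager and TaskManager pods via "
                                    + "spec.schedulerName.");
}
```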
Re: Re: [discussion] To introduce a formatter for Markdown files
Hi Zhongpu, Thanks for the clarification. Prettier looks fine for formatting md files. Is there any way to validate rules, e.g. only using '*' for listing items, in the CI pipeline? Best regards, Jing On Wed, Feb 22, 2023 at 2:28 PM Zhongpu Chen wrote: > Hi Jing, > > Sorry for the last reply in a messed-up formatting, as I am not quite > familiar with the mailing list usage. > > First, as long as contributors install Prettier with the same version, > markdown files can be auto-formatted in the same way without extra work > via the `prettier --write **/*.md` command, and this can also be set as > part of a pre-commit hook. To check whether markdown files satisfy the formatting > requirements, we can use `prettier --check **/*.md`, where exit code 0 means > the files do follow the styles. > > Secondly, the format check can be integrated with the CI pipeline easily. > Here is an example using GitHub Actions, and I think the setting in Azure > pipelines is similar. > > ```yml > jobs: > Check-MD-Format-Actions: > runs-on: ubuntu-latest > steps: > - name: Check out repo code > uses: actions/checkout@v3 > - name: Use Node.js > uses: actions/setup-node@v3 > with: > node-version: "12.x" > - name: Install prettier > run: npm i -g prettier > - name: Prettify code > run: prettier --check **/*.md > ``` > > At last, it will never cause any conflict. Prettier is compatible with > CommonMark (https://commonmark.org/) and GitHub Flavored Markdown.
[jira] [Created] (FLINK-31187) Standalone HA mode does not work if dynamic properties are supplied
Mate Czagany created FLINK-31187: Summary: Standalone HA mode does not work if dynamic properties are supplied Key: FLINK-31187 URL: https://issues.apache.org/jira/browse/FLINK-31187 Project: Flink Issue Type: Bug Components: Kubernetes Operator Affects Versions: kubernetes-operator-1.4.0 Reporter: Mate Czagany Attachments: standalone-ha.yaml With FLINK-30518, '--host $(POD_IP)' has been added to the arguments of the JMs, which fixes the issue with HA in standalone mode, but it always gets appended to the end of the final JM arguments: https://github.com/usamj/flink-kubernetes-operator/blob/72ec9d384def3091ce50c2a3e2a06cded3b572e6/flink-kubernetes-standalone/src/main/java/org/apache/flink/kubernetes/operator/kubeclient/decorators/CmdStandaloneJobManagerDecorator.java#L107 This will not be parsed properly if any dynamic properties were set in the arguments, e.g.: {code:java} Program Arguments: --configDir /opt/flink/conf -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=469762048b -D jobmanager.memory.jvm-overhead.max=201326592b --job-classname org.apache.flink.streaming.examples.statemachine.StateMachineExample --test test --host 172.17.0.11{code} You can verify this bug by using the YAML I've attached; in the JM logs you can see this line: {code:java} Remoting started; listening on addresses :[akka.tcp://flink@flink-example-statemachine.flink:6123]{code} Without any program arguments supplied this would correctly be: {code:java} Remoting started; listening on addresses :[akka.tcp://flink@172.17.0.8:6123]{code} I believe this could be easily fixed by putting the --host parameter before JobSpec.args. If a committer can assign this ticket to me, I can create a PR for this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
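A simplified sketch of the proposed argument ordering (the real change would live in CmdStandaloneJobManagerDecorator; method and variable names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: --host must precede the user-supplied JobSpec.args so that
// trailing "-D key=value" pairs cannot swallow its value during parsing.
public class JobManagerArgsSketch {
    static List<String> buildArgs(String confDir, String podIp, List<String> userArgs) {
        List<String> cmdArgs = new ArrayList<>();
        cmdArgs.add("--configDir");
        cmdArgs.add(confDir);
        cmdArgs.add("--host");
        cmdArgs.add(podIp);       // e.g. resolved from the POD_IP env var
        cmdArgs.addAll(userArgs); // JobSpec.args appended last
        return cmdArgs;
    }
}
```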
Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Hi Ran, Thanks for driving the FLIP. It looks overall good. Would you like to add a description of useLike and notLike? I guess useLike true is for "LIKE" and notLike true is for "NOT LIKE" but I am not sure if I understood it correctly. Furthermore, does it make sense to support "ILIKE" too? Best regards, Jing On Wed, Feb 22, 2023 at 1:17 PM Ran Tao wrote: > Currently Flink SQL auxiliary statements have supported some good features > such as catalog/database/table support. > > But these features are not very complete compared with other popular > engines such as Spark, Presto, Hive and commercial engines such as > Snowflake. > > For example, many engines support the show operation with filtering, except > Flink, and support describing other objects (Flink only supports describe > table). > > I wonder whether we can add these useful features to Flink? > You can find details in this doc.[1] or FLIP.[2] > > Also, please let me know if there is a mistake. Looking forward to your > reply. > > [1] > > https://docs.google.com/document/d/1hAiOfPx14VTBTOlpyxG7FA2mB1k5M31VnKYad2XpJ1I/ > [2] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements > > Best Regards, > Ran Tao >
Re: [ANNOUNCE] New Apache Flink Committer - Sergey Nuyanzin
Congrats Sergey! Well deserved!!! Best Regards, Andriy Redko On Wed, Feb 22, 2023, 9:03 AM Samrat Deb wrote: > Congratulations Sergey
Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
Hi all, Thanks for the lively discussion; I will respond to these questions one by one. However, there are also some common questions, which I will answer together. @郑 Thanks for your reply. The features mentioned in this FLIP are only for those source connectors that implement the SupportsWatermarkPushDown interface; generating watermarks at other locations in the graph is not in the scope of this discussion. Perhaps another FLIP can be proposed later to implement this feature. @Shammon Thanks for your reply. In FLIP-296, a rejected alternative is adding watermark-related options in the connector options; we believe that we should not bind the watermark-related options to a connector, to ensure semantic clarity. > What will happen if we add watermark related options in `the connector > options`? Will the connector ignore these options or throw an exception? > How can we support this? If a user defines different watermark configurations for one table in two places, I tend to prefer that the first place prevails, but we could also throw an exception or just print logs to prompt the user; these are implementation details. > If one table is used by two operators with different watermark params, > what will happen? @Martijn Thanks for your reply. I'm sorry that we were not particularly accurate; this hint is mainly for SQL, not the Table API. > While the FLIP talks about watermark options for Table API & SQL, I only > see proposed syntax for SQL, not for the Table API. What is your proposal > for the Table API @Jane Thanks for your reply. For the first question, if the user uses this hint on sources that do not implement the SupportsWatermarkPushDown interface, it will be completely invalid. The task will run as normal, as if the hint had not been used. > What's the behavior if there are multiple table sources, among which > some do not support `SupportsWatermarkPushDown`? @Jane's feedback is that 'WATERMARK_PARAMS' is difficult to remember; perhaps the naming issue can be put to the end of the discussion, because more people like @Martijn @Shuo are considering whether these configurations should be put into the DDL or the 'OPTIONS' hint. Here is what I think: putting these configs into DDL or putting them into the 'OPTIONS' hint is actually the same thing, because the 'OPTIONS' hint is mainly used to configure the properties of the connector. The reason why I want to use a new hint is to keep the semantics clear; in my opinion the configuration of the watermark should not be mixed up with the connector. However, a new hint does make it somewhat more difficult to use, for example, when a user uses both the 'OPTIONS' hint and the 'WATERMARK_PARAMS' hint. For this point, maybe it is more appropriate to use a uniform 'OPTIONS' hint. On the other hand, if we enrich more watermark option keys in 'OPTIONS' hints, the question will be how we treat the definition of the 'OPTIONS' hint: is it specific only to the connector options, or could it be more? Maybe @Jark could share more insights here. In my opinion, 'OPTIONS' is only related to the connector options, which is not like the general watermark options. On Wed, Feb 22, 2023 at 19:17, Shuo Cheng wrote: > Hi Kui, > > Thanks for driving the discussion. It's quite useful to introduce watermark > options. I have some questions: > > What kind of hint is "WATERMARK_PARAMS"? > Currently, we have two kinds of hints in Flink: Dynamic Table Options & > Query Hints. As described in the FLIP, "WATERMARK_PARAMS" is more like > Dynamic Table Options. So two questions arise here: > > 1) Are these watermark options to be exposed as connector WITH options? As > described in the SQL Hints doc[1], "Dynamic Table Options allow to specify or > override table options dynamically", which implies that these options can > also be configured in WITH options. > > 2) Do we really need a new hint name like 'WATERMARK_PARAMS'? Table > options use "OPTIONS" as the hint name, like '/*+ OPTIONS('csv.ignore-parse-errors'='true') */'; maybe we can enrich more > table option keys for watermark, e.g., /*+ OPTIONS('watermark.emit-strategy'='ON_PERIODIC') */. > > [1] > https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/ > > On Wed, Feb 22, 2023 at 10:22 AM kui yuan wrote: > > Hi devs, > > > > I'd like to start a discussion thread for FLIP-296[1]. This comes from an > > offline discussion with @Yun Tang, and we hope to enrich Table API & SQL to > > support many watermark-related features which were previously only implemented at > > the DataStream API level. > > > > Basically, we want to introduce watermark options in Table API & SQL via a > > SQL hint named 'WATERMARK_PARAMS' to support these features: > > 1. Configurable watermark emit strategy > > 2. Dealing with idle sources > > 3. Watermark alignment > > > > Last but not least, thanks to Qingsheng and Jing Zhang for the initial > > reviews. > > > > Looking forward to your thoughts and any feedback is appreciated!
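For context, a sketch of the DataStream-level counterparts of the three features the FLIP wants to surface in Table API & SQL; `Event` is a placeholder type and the alignment group name is arbitrary:

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class WatermarkFeaturesSketch {
    // Placeholder for any POJO with an event-time field.
    public static class Event {}

    public static void main(String[] args) {
        WatermarkStrategy<Event> strategy =
                WatermarkStrategy.<Event>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                        // 2. dealing with idle sources
                        .withIdleness(Duration.ofMinutes(1))
                        // 3. watermark alignment across sources in one group
                        .withWatermarkAlignment("alignment-group", Duration.ofSeconds(20));
        // 1. the emit strategy: periodic emission is governed by
        // pipeline.auto-watermark-interval at the job level.
        System.out.println(strategy);
    }
}
```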
Re: [ANNOUNCE] New Apache Flink Committer - Sergey Nuyanzin
Congratulations Sergey On Wed, Feb 22, 2023 at 3:31 PM Hang Ruan wrote: > Congratulations Sergey! > > Jane Chan 于2023年2月22日周三 15:53写道: > > > Congratulations, Sergey! > > > > Best regards, > > Jane > > > > > > On Wed, Feb 22, 2023 at 9:44 AM Dian Fu wrote: > > > > > Congratulations Sergey! > > > > > > On Tue, Feb 21, 2023 at 9:07 PM Rui Fan wrote: > > > > > > > Congratulations, Sergey! > > > > > > > > Best > > > > Rui Fan > > > > > > > > On Tue, Feb 21, 2023 at 20:55 Junrui Lee > wrote: > > > > > > > > > Congratulations Sergey! > > > > > > > > > > weijie guo 于2023年2月21日周二 20:54写道: > > > > > > > > > > > Congratulations, Sergey~ > > > > > > > > > > > > Best regards, > > > > > > > > > > > > Weijie > > > > > > > > > > > > > > > > > > Jing Ge 于2023年2月21日周二 20:52写道: > > > > > > > > > > > > > congrats Sergey! > > > > > > > > > > > > > > On Tue, Feb 21, 2023 at 1:15 PM Matthias Pohl > > > > > > > wrote: > > > > > > > > > > > > > > > Congratulations, Sergey! Good job & well-deserved! :) > > > > > > > > > > > > > > > > On Tue, Feb 21, 2023 at 1:03 PM yuxia < > > > luoyu...@alumni.sjtu.edu.cn > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Congratulations Sergey! > > > > > > > > > > > > > > > > > > Best regards, > > > > > > > > > Yuxia > > > > > > > > > > > > > > > > > > - 原始邮件 - > > > > > > > > > 发件人: "Martijn Visser" > > > > > > > > > 收件人: "dev" > > > > > > > > > 发送时间: 星期二, 2023年 2 月 21日 下午 7:58:35 > > > > > > > > > 主题: Re: [ANNOUNCE] New Apache Flink Committer - Sergey > > Nuyanzin > > > > > > > > > > > > > > > > > > Congrats Sergey, well deserved :) > > > > > > > > > > > > > > > > > > On Tue, Feb 21, 2023 at 12:53 PM Benchao Li < > > > > libenc...@apache.org> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Congratulations Sergey! > > > > > > > > > > > > > > > > > > > > Timo Walther 于2023年2月21日周二 19:51写道: > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > On behalf of the PMC, I'm very happy to announce Sergey > > > > > Nuyanzin > > > > > > > as a > > > > > > > > > > > new Flink Committer. > > > > > > > > > > > > > > > > > > > > > > Sergey started contributing small improvements to the > > > project > > > > > in > > > > > > > > 2018. > > > > > > > > > > > Over the past 1.5 years, he has become more active and > > > > focused > > > > > on > > > > > > > > > adding > > > > > > > > > > > and reviewing changes to the Flink SQL ecosystem. > > > > > > > > > > > > > > > > > > > > > > Currently, he is upgrading Flink's SQL engine to the > > latest > > > > > > Apache > > > > > > > > > > > Calcite version [1][2][3] and helps in updating other > > > > > > project-wide > > > > > > > > > > > dependencies as well. > > > > > > > > > > > > > > > > > > > > > > Please join me in congratulating Sergey Nuyanzin for > > > > becoming a > > > > > > > Flink > > > > > > > > > > > Committer! > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Timo Walther (on behalf of the Flink PMC) > > > > > > > > > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-29932 > > > > > > > > > > > [2] https://issues.apache.org/jira/browse/FLINK-21239 > > > > > > > > > > > [3] https://issues.apache.org/jira/browse/FLINK-20873 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
RE: Re: [discussion] To introduce a formatter for Markdown files
Hi Jing,

Sorry that my last reply was in messed-up formatting, as I am not quite familiar with mailing list usage.

First, as long as contributors install Prettier with the same version, markdown files can be auto-formatted in the same way without extra work via the `prettier --write **/*.md` command, and this can also be set up as part of a pre-commit hook. To check whether markdown files satisfy the formatting requirements, we can use `prettier --check **/*.md`, where an exit code of 0 means the files follow the style.

Secondly, the format check can be integrated with the CI pipeline easily. Here is an example using GitHub Actions, and I think the setup in Azure Pipelines is similar.

```yml
jobs:
  Check-MD-Format-Actions:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repo code
        uses: actions/checkout@v3
      - name: Use Node.js
        uses: actions/setup-node@v3
        with:
          node-version: "12.x"
      - name: Install prettier
        run: npm i -g prettier
      - name: Prettify code
        run: prettier --check **/*.md
```

Lastly, it should not cause any conflict: Prettier is compatible with CommonMark (https://commonmark.org/) and GitHub Flavored Markdown.

On 2023/02/22 10:22:46 Jing Ge wrote: > Hi Zhongpu, > > Thanks for starting this discussion. I was wondering how we could let every > contributor follow the same md rule. I am not familiar with Prettier. Would > you like to help me understand some basic questions? Thanks. > > Is there any way to integrate the format check into our CI pipeline? > Otherwise, it will be hard to control it. Another thing is that some docs > are written with Hugo syntax [1]. Would they work together with no conflict? > > > Best regards, > Jing > > [1] https://gohugo.io/ > > On Wed, Feb 22, 2023 at 9:50 AM Zhongpu Chen wrote: > > > As I mentioned in FLINK-31177 ( > > https://issues.apache.org/jira/browse/FLINK-31177): > > > > Currently, markdown files in *docs* are maintained and updated by many > > contributors, and different people have varying code style tastes. Besides, > > as the syntax of markdown is not really strict, the styles tend to be > > inconsistent. > > > > To name a few: > >- Some prefer `*` to make a list item, while others may prefer `-`. > >- It is common to leave many unnecessary blank lines and spaces. > >- To make a divider, the number of `-` characters can vary. > > > > To this end, I think it would be nicer to encourage or require contributors > > to format their markdown files before making a pull request. Personally, I > > think Prettier (https://prettier.io/) is a good candidate. > > What do you think? > > > > -- > > Zhongpu Chen > > >
[jira] [Created] (FLINK-31186) Removing topic from kafka source does nothing
Exidex created FLINK-31186: -- Summary: Removing topic from kafka source does nothing Key: FLINK-31186 URL: https://issues.apache.org/jira/browse/FLINK-31186 Project: Flink Issue Type: Bug Affects Versions: 1.15.3 Reporter: Exidex

As far as I can tell, there is no good way to remove a topic from the list of topics that the Kafka source consumes from. We use the {{StreamExecutionEnvironment.fromSource}} API with {{KafkaSource.setTopics}}, which accepts a list of topics, but when we remove a topic from the list, the Flink Kafka source still consumes from it even after some time. My guess is that it relates to this TODO in the code: [GitHub|https://github.com/apache/flink/blob/cc66d4855e6f8ee9986809a18f68a458bcfe3c12/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/KafkaSourceEnumerator.java#L305] You can partially work around this by removing the whole job state or by changing the uid of the Kafka source, but that affects either the whole job or the whole source. The other way is to use the State Processor API, but it doesn't expose the source operator state, which in turn can be worked around using reflection and copying code from SourceCoordinator. None of these options is satisfactory.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
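For context, a minimal sketch of the setup described above (broker address, topic names, and group id are placeholders): a job originally built with two topics and then redeployed from a checkpoint or savepoint with one topic removed keeps consuming the removed topic, because the restored enumerator state still holds its splits.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TopicRemovalRepro {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // First deployment used .setTopics("topic-a", "topic-b"). After a
        // restart from state with the list below, splits for "topic-b" are
        // restored from the enumerator state and are never unassigned.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("broker:9092")
                .setTopics("topic-a")
                .setGroupId("my-group")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source").print();
        env.execute("topic-removal-repro");
    }
}
```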
RE: Re: [discussion] To introduce a formatter for Markdown files
> I was wondering how we could let every contributor follow the same md rule.

As long as contributors install Prettier with the same version, markdown files can be auto-formatted in the same way without extra work via the `prettier --write **/*.md` command, and this can also be set up as part of a pre-commit hook. To check whether markdown files satisfy the formatting requirements, we can use `prettier --check **/*.md`, where an exit code of 0 means the files follow the style.

> Is there any way to integrate the format check into our CI pipeline?

Yes, it is possible. I have tested it using GitHub Actions, and I think Azure Pipelines also supports this.

```yml
jobs:
  Check-MD-Format-Actions:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repo code
        uses: actions/checkout@v3
      - name: Prettify code
        uses: creyD/prettier_action@v4.2
        with:
          prettier_options: --check **/*.md
```

> Another thing is that some docs are written with Hugo syntax. Would they work together with no conflict?

Well, it should not cause any conflict, since Hugo syntax is a superset of regular markdown syntax, and Prettier is compatible with CommonMark (https://commonmark.org/) and GitHub Flavored Markdown.

On 2023/02/22 10:22:46 Jing Ge wrote: > Hi Zhongpu, > > Thanks for starting this discussion. I was wondering how we could let every > contributor follow the same md rule. I am not familiar with Prettier. Would > you like to help me understand some basic questions? Thanks. > > Is there any way to integrate the format check into our CI pipeline? > Otherwise, it will be hard to control it. Another thing is that some docs > are written with Hugo syntax [1]. Would they work together with no conflict? > > > Best regards, > Jing > > [1] https://gohugo.io/ > > On Wed, Feb 22, 2023 at 9:50 AM Zhongpu Chen wrote: > > > As I mentioned in FLINK-31177 ( > > https://issues.apache.org/jira/browse/FLINK-31177): > > > > Currently, markdown files in *docs* are maintained and updated by many > > contributors, and different people have varying code style tastes. Besides, > > as the syntax of markdown is not really strict, the styles tend to be > > inconsistent. > > > > To name a few: > >- Some prefer `*` to make a list item, while others may prefer `-`. > >- It is common to leave many unnecessary blank lines and spaces. > >- To make a divider, the number of `-` characters can vary. > > > > To this end, I think it would be nicer to encourage or require contributors > > to format their markdown files before making a pull request. Personally, I > > think Prettier (https://prettier.io/) is a good candidate. > > What do you think? > > > > -- > > Zhongpu Chen > > >
[DISCUSS] FLIP-297: Improve Auxiliary Sql Statements
Currently, Flink SQL auxiliary statements support some good features, such as catalog/database/table support. But these features are not very complete compared with other popular engines such as Spark, Presto and Hive, or commercial engines such as Snowflake. For example, many engines except Flink support SHOW operations with filtering, and they support describing other objects (Flink only supports DESCRIBE for tables). I wonder whether we can add these useful features to Flink. You can find the details in this doc [1] or the FLIP [2]. Also, please let me know if there is a mistake. Looking forward to your reply. [1] https://docs.google.com/document/d/1hAiOfPx14VTBTOlpyxG7FA2mB1k5M31VnKYad2XpJ1I/ [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements Best Regards, Ran Tao
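To make the gap concrete, here is a sketch of the kind of filtered auxiliary statements being proposed, run through the Table API. The exact grammar is defined in the FLIP, so the LIKE-filtered statement below is illustrative rather than final syntax, and `db1` is a placeholder database name.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class AuxiliaryStatementsSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Supported today: plain listing without any filtering.
        tEnv.executeSql("SHOW TABLES").print();

        // Proposed by FLIP-297 (illustrative syntax): listing with an explicit
        // database and a LIKE filter, as Spark, Presto and Hive already allow.
        tEnv.executeSql("SHOW TABLES FROM db1 LIKE 'order%'").print();
    }
}
```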
[jira] [Created] (FLINK-31185) Python BroadcastProcessFunction not support side output
Juntao Hu created FLINK-31185: - Summary: Python BroadcastProcessFunction not support side output Key: FLINK-31185 URL: https://issues.apache.org/jira/browse/FLINK-31185 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.16.1 Reporter: Juntao Hu Fix For: 1.17.0, 1.16.2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[RESULT] [VOTE] Apache Flink Kubernetes Operator Release 1.4.0, release candidate #1
I'm happy to announce that we have unanimously approved this release. There are 6 approving votes, 3 of which are binding: * Marton Balassi (binding) * Gyula Fora (binding) * Maximilian Michels (binding) * Jim Busche * Peter Huang * Matt Wang There are no disapproving votes. Thanks everyone! Gyula
Re: [VOTE] Apache Flink Kubernetes Operator Release 1.4.0, release candidate #1
Thank you everyone, closing this vote now! Gyula On Tue, Feb 21, 2023 at 2:19 PM Matt Wang wrote: > Thank you, Gyula. > > > +1 (non-binding) > > I tested the following: > > 1. Downloaded the archives, checksums, signatures and README file > 2. Built the source distribution to ensure all source files have Apache > headers > 3. Verified that all POM files point to the same version > 4. Helm repo install from > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.4.0-rc1/ > 5. Ran example jobs with V1_16, and verified operator logs > > > > -- > > Best, > Matt Wang > > > Replied Message > | From | Márton Balassi | > | Date | 02/21/2023 21:07 | > | To | | > | Subject | Re: [VOTE] Apache Flink Kubernetes Operator Release 1.4.0, > release candidate #1 | > Thank you, Gyula. > > +1 (binding) > > - Verified Helm repo works as expected, points to correct image tag, build, > version > - Verified basic examples + checked operator logs, everything looks as > expected > - Verified hashes, signatures and source release contains no binaries > - Ran built-in tests, built jars + docker image from source successfully > > Best, > Marton > > On Mon, Feb 20, 2023 at 2:58 PM Gyula Fóra wrote: > > +1 (binding) > > - Downloaded archives, validated binary content, checksum signatures, > NOTICE files > - Installed to local k8s cluster from Helm repo, verified docker image, > logs > - Ran some basic examples, ran autoscaler example > > Cheers > Gyula > > On Sun, Feb 19, 2023 at 5:09 AM Peter Huang > wrote: > > Thanks for preparing the release! > > +1 (non-binding) > > 1. Downloaded the archives, checksums, and signatures > 2. Extracted and inspected the source code for binaries > 3. Compiled and tested the source code via mvn verify > 4. Deployed helm chart from flink-kubernetes-operator-1.4.0-helm.tgz to > local minikube cluster > 5. Ran example jobs with V1_14, V1_15, V1_16, and verified operator logs > 6. Built the docker container locally and verified it through repeating > step 5 > > Best Regards > Peter Huang > > On Fri, Feb 17, 2023 at 8:53 PM Jim Busche wrote: > > Thanks Gyula for the release. > > +1 (non-binding) > > > I tested the following: > > * Helm repo install from flink-kubernetes-operator-1.4.0-helm.tgz > * Podman Dockerfile build from source, looked good. > * Twistlock security scans of the proposed image look good. No > currently > known vulnerabilities with the built image or > ghcr.io/apache/flink-kubernetes-operator:7fc23a1 > * UI, basic sample, basic session jobs look good. Logs look as > expected. > * Checksums looked good > * Tested OLM build/install on OpenShift 4.12 > * Verified that sessionjob correctly can write in the operator’s > /opt/flink/artifacts filesystem on OpenShift even in a non-default > namespace. > > > > Thanks, Jim > > > >
[jira] [Created] (FLINK-31184) Failed to get python udf runner directory via running GET_RUNNER_DIR_SCRIPT
Wei Zhong created FLINK-31184: - Summary: Failed to get python udf runner directory via running GET_RUNNER_DIR_SCRIPT Key: FLINK-31184 URL: https://issues.apache.org/jira/browse/FLINK-31184 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.16.1, 1.15.3, 1.17.0 Reporter: Wei Zhong

The following exception is thrown when using a Python UDF in a user job:

{code:java}
Caused by: java.io.IOException: Cannot run program "ERROR: ld.so: object '/usr/lib64/libjemalloc.so.1' from LD_PRELOAD cannot be preloaded: ignored. /mnt/ssd/0/yarn/nm-local-dir/usercache/flink/appcache/application_1670838323719_705777/python-dist-fe870981-4de7-4229-ad0b-f51881e80d90/python-archives/pipeline_venv_v5.tar.gz/lib/python3.7/site-packages/pyflink/bin/pyflink-udf-runner.sh": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at org.apache.beam.runners.fnexecution.environment.ProcessManager.startProcess(ProcessManager.java:147)
    at org.apache.beam.runners.fnexecution.environment.ProcessManager.startProcess(ProcessManager.java:122)
    at org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:106)
    at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:252)
    at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:231)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
    at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
    ... 19 more
    Suppressed: java.lang.NullPointerException: Process for id does not exist: 1-1
        at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:895)
        at org.apache.beam.runners.fnexecution.environment.ProcessManager.stopProcess(ProcessManager.java:172)
        at org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:126)
        ... 29 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.(UNIXProcess.java:247)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    ... 32 more
{code}

This is because SRE introduced an environment variable:

{code:java}
LD_PRELOAD=/usr/lib64/libjemalloc.so.1
{code}

The Python process itself executes normally, but an extra error message is printed, so the whole output looks like:

{code:java}
ERROR: ld.so: object '/usr/lib64/libjemalloc.so.1' from LD_PRELOAD cannot be preloaded: ignored.
/mnt/ssd/0/yarn/nm-local-dir/usercache/flink/appcache/application_1670838323719_705777/python-dist-fe870981-4de7-4229-ad0b-f51881e80d90/python-archives/pipeline_venv_v5.tar.gz/lib/python3.7/site-packages/pyflink/bin/
{code}

The whole output is treated as a command, which causes the exception. It seems the output is not very reliable; maybe we need to find another way to transfer the data, or filter the output before using it.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
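One possible mitigation, as a hypothetical sketch only (the class name, helper name and filtering rule below are assumptions for illustration, not the actual fix): instead of treating the raw script output as the runner path, keep only the output line that resolves to an existing directory, which would drop noise such as the LD_PRELOAD warning.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RunnerDirResolver {
    /** Runs the given script and returns the last output line that is an existing directory. */
    static String resolveRunnerDir(String... command) throws Exception {
        Process process = new ProcessBuilder(command).start();
        String result = null;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Skip anything that is not a real path, e.g. loader warnings
                // injected by LD_PRELOAD; keep only existing directories.
                if (Files.isDirectory(Paths.get(line.trim()))) {
                    result = line.trim();
                }
            }
        }
        process.waitFor();
        return result;
    }
}
```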
Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
Hi Kui, Thanks for driving the discussion. It's quite useful to introduce watermark options. I have some questions: What kind of hint is "WATERMARK_PARAMS"? Currently, we have two kinds of hints in Flink: Dynamic Table Options & Query Hints. As described in the FLIP, "WATERMARK_PARAMS" is more like Dynamic Table Options. So two questions arise here: 1) Are these watermark options to be exposed as connector WITH options? As described in the SQL Hints doc [1], "Dynamic Table Options allow to specify or override table options dynamically", which implies that these options could also be configured in WITH options. 2) Do we really need a new hint name like 'WATERMARK_PARAMS'? Table options use "OPTIONS" as the hint name, like '/*+ OPTIONS('csv.ignore-parse-errors'='true') */', so maybe we can instead enrich the table option keys with watermark options, e.g., /*+ OPTIONS('watermark.emit-strategy'='ON_PERIODIC') */. [1] https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/ On Wed, Feb 22, 2023 at 10:22 AM kui yuan wrote: > Hi devs, > > > I'd like to start a discussion thread for FLIP-296[1]. This comes from an > offline discussion with @Yun Tang, and we hope to enrich table API & SQL to > support many watermark-related features which were only implemented at the > datastream API level. > > > Basically, we want to introduce watermark options in table API & SQL via > SQL hint named 'WATERMARK_PARAMS' to support features: > > 1、Configurable watermark emit strategy > > 2、Dealing with idle sources > > 3、Watermark alignment > > > Last but not least, thanks to Qingsheng and Jing Zhang for the initial > reviews. > > > Looking forward to your thoughts and any feedback is appreciated! > > > [1] > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240884405 > > > Best > > Yuan Kui >
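For comparison, the two hint spellings under discussion would look roughly as follows. Neither exists in Flink yet; the table name and the option keys are placeholders approximated from the FLIP and this thread.

```java
import org.apache.flink.table.api.TableEnvironment;

public class WatermarkHintSpellings {
    public static void demo(TableEnvironment tEnv) {
        // Alternative 1 (FLIP-296 proposal): a dedicated hint name.
        tEnv.executeSql(
                "SELECT * FROM orders /*+ WATERMARK_PARAMS('emit-strategy' = 'ON_EVENT') */");

        // Alternative 2 (raised in this thread): reuse the existing OPTIONS
        // hint with watermark-prefixed table option keys.
        tEnv.executeSql(
                "SELECT * FROM orders /*+ OPTIONS('watermark.emit-strategy' = 'ON_EVENT') */");
    }
}
```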
[jira] [Created] (FLINK-31183) Flink Kinesis EFO Consumer can fail to stop gracefully
Danny Cranmer created FLINK-31183: - Summary: Flink Kinesis EFO Consumer can fail to stop gracefully Key: FLINK-31183 URL: https://issues.apache.org/jira/browse/FLINK-31183 Project: Flink Issue Type: Bug Components: Connectors / Kinesis Affects Versions: aws-connector-4.0.0, 1.16.1, aws-connector-3.0.0, 1.15.3 Reporter: Danny Cranmer Fix For: 1.17.0, 1.15.4, aws-connector-4.1.0, 1.16.2

*Background*

When stopping a Flink job using the stop-with-savepoint API, the EFO Kinesis source can fail to close gracefully. Sample stack trace:

{code:java}
2023-02-16 20:45:40
org.apache.flink.runtime.checkpoint.CheckpointException: Task has failed.
    at org.apache.flink.runtime.messages.checkpoint.SerializedCheckpointException.unwrap(SerializedCheckpointException.java:51)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.receiveDeclineMessage(CheckpointCoordinator.java:1013)
    at org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$declineCheckpoint$2(ExecutionGraphHandler.java:103)
    at org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$processCheckpointCoordinatorMessage$3(ExecutionGraphHandler.java:119)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Task name with subtask : Source: vas_source_stream (38/48)#0 Failure reason: Task has failed.
    at org.apache.flink.runtime.taskmanager.Task.declineCheckpoint(Task.java:1395)
    at org.apache.flink.runtime.taskmanager.Task.lambda$triggerCheckpointBarrier$3(Task.java:1338)
    at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
    at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:343)
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: event executor terminated
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
    at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1063)
    ... 3 more
Caused by: java.util.concurrent.RejectedExecutionException: event executor terminated
    at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:923)
    at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:350)
    at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:343)
    at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:825)
    at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:815)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher$ChannelSubscription.cancel(HandlerPublisher.java:502)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.utils.async.DelegatingSubscription.cancel(DelegatingSubscription.java:37)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.http2.Http2ResetSendingSubscription.cancel(Http2ResetSendingSubscription.java:41)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.utils.async.DelegatingSubscription.cancel(DelegatingSubscription.java:37)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$OnCancelSubscription.cancel(ResponseHandler.java:409)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.utils.async.FlatteningSubscriber$1.cancel(FlatteningSubscriber.java:98)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.utils.async.FlatteningSubscriber.handleStateUpdate(FlatteningSubscriber.java:170)
    at org.apache.flink.kinesis.shaded.software.amazon.awssdk.utils.async.FlatteningSubscriber.access$100(FlatteningSubscriber.java:29)
    at
Re: [discussion] To introduce a formatter for Markdown files
Hi Zhongpu, Thanks for starting this discussion. I was wondering how we could let every contributor follow the same md rule. I am not familiar with Prettier. Would you like to help me understand some basic questions? Thanks. Is there any way to integrate the format check into our CI pipeline? Otherwise, it will be hard to control it. Another thing is that some docs are written with Hugo syntax [1]. Would they work together with no conflict? Best regards, Jing [1] https://gohugo.io/ On Wed, Feb 22, 2023 at 9:50 AM Zhongpu Chen wrote: > As I mentioned in FLINK-31177 ( > https://issues.apache.org/jira/browse/FLINK-31177): > > Currently, markdown files in *docs* are maintained and updated by many > contributors, and different people have varying code style tastes. Besides, > as the syntax of markdown is not really strict, the styles tend to be > inconsistent. > > To name a few: > >- Some prefer `*` to make a list item, while others may prefer `-`. >- It is common to leave many unnecessary blank lines and spaces. >- To make a divider, the number of `-` characters can vary. > > To this end, I think it would be nicer to encourage or require contributors > to format their markdown files before making a pull request. Personally, I > think Prettier (https://prettier.io/) is a good candidate. > What do you think? > > -- > Zhongpu Chen >
[jira] [Created] (FLINK-31182) CompiledPlan cannot deserialize BridgingSqlFunction correctly
Jane Chan created FLINK-31182: - Summary: CompiledPlan cannot deserialize BridgingSqlFunction correctly Key: FLINK-31182 URL: https://issues.apache.org/jira/browse/FLINK-31182 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.17.0, 1.18.0, 1.17.1 Reporter: Jane Chan

This issue was reported on the [user mailing list|https://lists.apache.org/thread/dglf8zgtt1yx3vrdytn0dlv7b3pw1nq6]. The stack trace is:

{code:java}
Caused by: org.apache.flink.table.api.TableException: Could not resolve internal system function '$UNNEST_ROWS$1'. This is a bug, please file an issue.
    at org.apache.flink.table.planner.plan.nodes.exec.serde.RexNodeJsonDeserializer.deserializeInternalFunction(RexNodeJsonDeserializer.java:392)
    at org.apache.flink.table.planner.plan.nodes.exec.serde.RexNodeJsonDeserializer.deserializeSqlOperator(RexNodeJsonDeserializer.java:337)
    at org.apache.flink.table.planner.plan.nodes.exec.serde.RexNodeJsonDeserializer.deserializeCall(RexNodeJsonDeserializer.java:307)
    at org.apache.flink.table.planner.plan.nodes.exec.serde.RexNodeJsonDeserializer.deserialize(RexNodeJsonDeserializer.java:146)
    at org.apache.flink.table.planner.plan.nodes.exec.serde.RexNodeJsonDeserializer.deserialize(RexNodeJsonDeserializer.java:128)
    at org.apache.flink.table.planner.plan.nodes.exec.serde.RexNodeJsonDeserializer.deserialize(RexNodeJsonDeserializer.java:115)
{code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
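For readers wanting to exercise the round trip where the failure surfaces, here is a minimal sketch. The table definitions and the UNNEST query are illustrative, not the reporter's exact job; UNNEST is simply one way to end up with an internal function such as $UNNEST_ROWS$1 in the serialized plan.

```java
import org.apache.flink.table.api.CompiledPlan;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.PlanReference;
import org.apache.flink.table.api.TableEnvironment;

public class CompiledPlanRoundTrip {
    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
                "CREATE TABLE src (arr ARRAY<ROW<f INT>>) WITH ('connector' = 'datagen')");
        tEnv.executeSql(
                "CREATE TABLE snk (f INT) WITH ('connector' = 'blackhole')");

        // UNNEST is rewritten to an internal system function in the plan JSON.
        CompiledPlan plan = tEnv.compilePlanSql(
                "INSERT INTO snk SELECT t.f FROM src CROSS JOIN UNNEST(arr) AS t (f)");
        plan.writeToFile("/tmp/plan.json");

        // Deserialization is where the reported function lookup failure occurs.
        tEnv.loadPlan(PlanReference.fromFile("/tmp/plan.json")).execute();
    }
}
```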
Re: [ANNOUNCE] New Apache Flink Committer - Sergey Nuyanzin
Congratulations Sergey! Jane Chan 于2023年2月22日周三 15:53写道: > Congratulations, Sergey! > > Best regards, > Jane > > > On Wed, Feb 22, 2023 at 9:44 AM Dian Fu wrote: > > > Congratulations Sergey! > > > > On Tue, Feb 21, 2023 at 9:07 PM Rui Fan wrote: > > > > > Congratulations, Sergey! > > > > > > Best > > > Rui Fan > > > > > > On Tue, Feb 21, 2023 at 20:55 Junrui Lee wrote: > > > > > > > Congratulations Sergey! > > > > > > > > weijie guo 于2023年2月21日周二 20:54写道: > > > > > > > > > Congratulations, Sergey~ > > > > > > > > > > Best regards, > > > > > > > > > > Weijie > > > > > > > > > > > > > > > Jing Ge 于2023年2月21日周二 20:52写道: > > > > > > > > > > > congrats Sergey! > > > > > > > > > > > > On Tue, Feb 21, 2023 at 1:15 PM Matthias Pohl > > > > > > wrote: > > > > > > > > > > > > > Congratulations, Sergey! Good job & well-deserved! :) > > > > > > > > > > > > > > On Tue, Feb 21, 2023 at 1:03 PM yuxia < > > luoyu...@alumni.sjtu.edu.cn > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Congratulations Sergey! > > > > > > > > > > > > > > > > Best regards, > > > > > > > > Yuxia > > > > > > > > > > > > > > > > - 原始邮件 - > > > > > > > > 发件人: "Martijn Visser" > > > > > > > > 收件人: "dev" > > > > > > > > 发送时间: 星期二, 2023年 2 月 21日 下午 7:58:35 > > > > > > > > 主题: Re: [ANNOUNCE] New Apache Flink Committer - Sergey > Nuyanzin > > > > > > > > > > > > > > > > Congrats Sergey, well deserved :) > > > > > > > > > > > > > > > > On Tue, Feb 21, 2023 at 12:53 PM Benchao Li < > > > libenc...@apache.org> > > > > > > > wrote: > > > > > > > > > > > > > > > > > Congratulations Sergey! > > > > > > > > > > > > > > > > > > Timo Walther 于2023年2月21日周二 19:51写道: > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > On behalf of the PMC, I'm very happy to announce Sergey > > > > Nuyanzin > > > > > > as a > > > > > > > > > > new Flink Committer. > > > > > > > > > > > > > > > > > > > > Sergey started contributing small improvements to the > > project > > > > in > > > > > > > 2018. > > > > > > > > > > Over the past 1.5 years, he has become more active and > > > focused > > > > on > > > > > > > > adding > > > > > > > > > > and reviewing changes to the Flink SQL ecosystem. > > > > > > > > > > > > > > > > > > > > Currently, he is upgrading Flink's SQL engine to the > latest > > > > > Apache > > > > > > > > > > Calcite version [1][2][3] and helps in updating other > > > > > project-wide > > > > > > > > > > dependencies as well. > > > > > > > > > > > > > > > > > > > > Please join me in congratulating Sergey Nuyanzin for > > > becoming a > > > > > > Flink > > > > > > > > > > Committer! > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > Timo Walther (on behalf of the Flink PMC) > > > > > > > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-29932 > > > > > > > > > > [2] https://issues.apache.org/jira/browse/FLINK-21239 > > > > > > > > > > [3] https://issues.apache.org/jira/browse/FLINK-20873 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
[VOTE] Flink minor version support policy for old releases
I am starting a vote to update the "Update Policy for old releases" [1] to include additional bugfix support for end of life versions. As per the discussion thread [2], the change we are voting on is: - Support policy: updated to include: "Upon release of a new Flink minor version, the community will perform one final bugfix release for resolved critical/blocker issues in the Flink minor version losing support." - Release process: add a step to start the discussion thread for the final patch version, if there are resolved critical/blocking issues to flush. Voting schema: since our bylaws [3] do not cover this particular scenario, and releases require PMC involvement, we will use a consensus vote with PMC binding votes. Thanks, Danny [1] https://flink.apache.org/downloads.html#update-policy-for-old-releases [2] https://lists.apache.org/thread/szq23kr3rlkm80rw7k9n95js5vqpsnbv [3] https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws
[discussion] To introduce a formatter for Markdown files
As I mentioned in FLINK-31177 ( https://issues.apache.org/jira/browse/FLINK-31177):

Currently, markdown files in *docs* are maintained and updated by many contributors, and different people have varying code style tastes. Besides, as the syntax of markdown is not really strict, the styles tend to be inconsistent.

To name a few:

- Some prefer `*` to make a list item, while others may prefer `-`.
- It is common to leave many unnecessary blank lines and spaces.
- To make a divider, the number of `-` characters can vary.

To this end, I think it would be nicer to encourage or require contributors to format their markdown files before making a pull request. Personally, I think Prettier (https://prettier.io/) is a good candidate. What do you think?

-- Zhongpu Chen
Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
Hi Kui, Thanks for bringing this to the discussion. Toward the FLIP, I have several questions. 1. What's the behavior if there are multiple table sources, among which some do not support `SupportsWatermarkPushDown`? > So the features that this flip intends to support are only for those > sources that implement the `SupportsWatermarkPushDown` interface. > 2. Is there any reason for having a different parameter name for the global and hint configuration? Personally, I feel this is a little hard to remember. Can we unify the parameter naming? > For the 'ON_EVENT' strategy, option ‘emit-gap-on-event’ can configure how > many events to emit a watermark, the default value is 1. We will also add > a global parameter 'table.exec.watermark-emit.gap' to achieve the same > goal, which will be valid for each source and will ease the user's > configuration to some extent. > 3. As Martijn mentioned, the API changes are absent in the FLIP. Can you add more details? Best, Jane On Wed, Feb 22, 2023 at 4:29 PM Martijn Visser wrote: > Hi Yuan Kui, > > Thanks for creating the FLIP. A couple of questions / remarks > > 1. While the FLIP talks about watermark options for Table API & SQL, I only > see proposed syntax for SQL, not for the Table API. What is your proposal > for the Table API? > > 2. A rejected alternative is adding watermark related options in the SQL > DDL because it would make the DDL complex and lengthy. But looking at the > current hints for dynamic table options, we already provide the option to > specify or override table options dynamically [1]. So I would think that > these watermark options actually should be part of the SQL DDL and that a > user can use the dynamic table hints to specify/override these options if > needed. The advantage of putting these options in the SQL DDL is that the > user has one location where these types of options need to be specified. If > we would only do this via hints, the user would need to specify these > options in two places. What do you think? > > Best regards, > > Martijn > > [1] > > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#dynamic-table-options > > On Wed, Feb 22, 2023 at 7:58 AM Shammon FY wrote: > > > Hi kui > > > > Thanks for initializing this discussion. I have two questions > > > > 1. What will happen if we add watermark related options in `the connector > > options`? Will the connector ignore these options or throw an exception? > > How can we support this? > > > > 2. If one table is used by two operators with different watermark params, > > what will happen? 
For example, there is a source table T and user submit > > the following job > > > > SELECT T1.id, T1.cnt_val, T2.sum_val FROM (SELECT id, count(val) as > cnt_val > > FROM T /*+ WATERMARK PARAMS(‘rowtime’ = ’time_column’, > ‘rowtime_expression’ > > = ’time_column - INTERVAL 5 SECOND') */ GROUP BY id) T1 JOIN (SELECT id, > > sum(val) as sum_val FROM T /*+ WATERMARK PARAMS(‘rowtime’ = > ’time_column’, > > ‘rowtime_expression’ = ’time_column - INTERVAL 10 SECOND') */ GROUP BY > id) > > T2 ON T1.id=T2.id > > > > Best, > > Shammon > > > > > > On Wed, Feb 22, 2023 at 11:28 AM 郑舒力 wrote: > > > > > Hi kui, > > > > > > There is a scenario using watermark that cannot be implemented through > > SQL > > > now: > > > We only support specify rowtime column in table DDL,if the rowtime > field > > > is generated by join dimension table , it cannot be implemented > > > > > > Can we consider implement through HINTS like : > > > Select * from t1 join t2 /*+ WATERMARK PARAMS(‘rowtime’ = > ’time_column’, > > > ‘rowtime_expression’ = ’time_column - INTERVAL 5 SECOND') */ > > > > > > > > > > > > > 2023年2月22日 10:22,kui yuan 写道: > > > > > > > > Hi devs, > > > > > > > > > > > > I'd like to start a discussion thread for FLIP-296[1]. This comes > from > > an > > > > offline discussion with @Yun Tang, and we hope to enrich table API & > > SQL > > > to > > > > support many watermark-related features which were only implemented > at > > > the > > > > datastream API level. > > > > > > > > > > > > Basically, we want to introduce watermark options in table API & SQL > > via > > > > SQL hint named 'WATERMARK_PARAMS' to support features: > > > > > > > > 1、Configurable watermark emit strategy > > > > > > > > 2、Dealing with idle sources > > > > > > > > 3、Watermark alignment > > > > > > > > > > > > Last but not least, thanks to Qingsheng and Jing Zhang for the > initial > > > > reviews. > > > > > > > > > > > > Looking forward to your thoughts and any feedback is appreciated! > > > > > > > > > > > > [1] > > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240884405 > > > > > > > > > > > > Best > > > > > > > > Yuan Kui > > > > > > > > >
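For readers less familiar with the DataStream side, the three features the FLIP wants to surface in SQL already exist there, roughly as below. This is a sketch: `MyEvent` is a placeholder type, and the periodic/per-event emit strategy on DataStream is governed by `pipeline.auto-watermark-interval` rather than a per-source option.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public class DataStreamWatermarkFeatures {
    public static WatermarkStrategy<MyEvent> strategy() {
        return WatermarkStrategy
                // Bounded-out-of-orderness watermarks with a 5s delay.
                .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withTimestampAssigner((event, recordTimestamp) -> recordTimestamp)
                // Feature 2 of the FLIP: dealing with idle sources.
                .withIdleness(Duration.ofMinutes(1))
                // Feature 3 of the FLIP: watermark alignment across sources in
                // the same group (available on the DataStream API since 1.15).
                .withWatermarkAlignment("alignment-group", Duration.ofSeconds(20));
    }

    public static class MyEvent {}
}
```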
Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
Hi Yuan Kui, Thanks for creating the FLIP. A couple of questions / remarks 1. While the FLIP talks about watermark options for Table API & SQL, I only see proposed syntax for SQL, not for the Table API. What is your proposal for the Table API? 2. A rejected alternative is adding watermark related options in the SQL DDL because it would make the DDL complex and lengthy. But looking at the current hints for dynamic table options, we already provide the option to specify or override table options dynamically [1]. So I would think that these watermark options actually should be part of the SQL DDL and that a user can use the dynamic table hints to specify/override these options if needed. The advantage of putting these options in the SQL DDL is that the user has one location where these types of options need to be specified. If we would only do this via hints, the user would need to specify these options in two places. What do you think? Best regards, Martijn [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#dynamic-table-options On Wed, Feb 22, 2023 at 7:58 AM Shammon FY wrote: > Hi kui > > Thanks for initializing this discussion. I have two questions > > 1. What will happen if we add watermark related options in `the connector > options`? Will the connector ignore these options or throw an exception? > How can we support this? > > 2. If one table is used by two operators with different watermark params, > what will happen? For example, there is a source table T and user submit > the following job > > SELECT T1.id, T1.cnt_val, T2.sum_val FROM (SELECT id, count(val) as cnt_val > FROM T /*+ WATERMARK PARAMS(‘rowtime’ = ’time_column’, ‘rowtime_expression’ > = ’time_column - INTERVAL 5 SECOND') */ GROUP BY id) T1 JOIN (SELECT id, > sum(val) as sum_val FROM T /*+ WATERMARK PARAMS(‘rowtime’ = ’time_column’, > ‘rowtime_expression’ = ’time_column - INTERVAL 10 SECOND') */ GROUP BY id) > T2 ON T1.id=T2.id > > Best, > Shammon > > > On Wed, Feb 22, 2023 at 11:28 AM 郑舒力 wrote: > > > Hi kui, > > > > There is a scenario using watermark that cannot be implemented through > SQL > > now: > > We only support specify rowtime column in table DDL,if the rowtime field > > is generated by join dimension table , it cannot be implemented > > > > Can we consider implement through HINTS like : > > Select * from t1 join t2 /*+ WATERMARK PARAMS(‘rowtime’ = ’time_column’, > > ‘rowtime_expression’ = ’time_column - INTERVAL 5 SECOND') */ > > > > > > > > > 2023年2月22日 10:22,kui yuan 写道: > > > > > > Hi devs, > > > > > > > > > I'd like to start a discussion thread for FLIP-296[1]. This comes from > an > > > offline discussion with @Yun Tang, and we hope to enrich table API & > SQL > > to > > > support many watermark-related features which were only implemented at > > the > > > datastream API level. > > > > > > > > > Basically, we want to introduce watermark options in table API & SQL > via > > > SQL hint named 'WATERMARK_PARAMS' to support features: > > > > > > 1、Configurable watermark emit strategy > > > > > > 2、Dealing with idle sources > > > > > > 3、Watermark alignment > > > > > > > > > Last but not least, thanks to Qingsheng and Jing Zhang for the initial > > > reviews. > > > > > > > > > Looking forward to your thoughts and any feedback is appreciated! > > > > > > > > > [1] > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240884405 > > > > > > > > > Best > > > > > > Yuan Kui > > > > >
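To illustrate the model being argued for here (a sketch; the `watermark.idle-timeout` key is hypothetical and would not validate against any connector today): the default lives in the table's WITH clause, and the existing OPTIONS hint overrides it per query, so no new hint name is needed.

```java
import org.apache.flink.table.api.TableEnvironment;

public class DdlDefaultWithHintOverride {
    public static void demo(TableEnvironment tEnv) {
        // Default watermark behavior declared once, in the table definition.
        tEnv.executeSql(
                "CREATE TABLE orders ("
                        + "  ts TIMESTAMP(3), amount INT,"
                        + "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND"
                        + ") WITH ("
                        + "  'connector' = 'kafka',"
                        + "  'watermark.idle-timeout' = '1 min'" // hypothetical option key
                        + ")");

        // Per-query override via the existing dynamic table options hint.
        tEnv.executeSql(
                "SELECT * FROM orders /*+ OPTIONS('watermark.idle-timeout' = '5 min') */");
    }
}
```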
[jira] [Created] (FLINK-31181) Support LIKE operator pushdown
Grzegorz Kołakowski created FLINK-31181: --- Summary: Support LIKE operator pushdown Key: FLINK-31181 URL: https://issues.apache.org/jira/browse/FLINK-31181 Project: Flink Issue Type: Improvement Components: Connectors / JDBC Reporter: Grzegorz Kołakowski Filter pushdown was introduced in [FLINK-16024|https://issues.apache.org/jira/browse/FLINK-16024] for selected unary and binary operators. We can extend the functionality to support pushdown for the LIKE operator as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
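A hypothetical sketch of what the extension could look like (the class and method names below are assumptions for illustration, not the connector's actual API): the pushdown logic would render a LIKE call as a parameterized dialect predicate, and fall back to evaluating the filter in Flink whenever it cannot be pushed, mirroring how the comparison operators were handled in FLINK-16024.

```java
import java.util.Optional;

public class LikePushdownSketch {
    /**
     * Renders a LIKE predicate for the dialect, e.g. ("name", "Jo%") -> "name" LIKE ?.
     * Returning Optional.empty() keeps the filter in Flink when it cannot be pushed.
     */
    static Optional<String> renderLike(String column, String pattern) {
        if (column == null || pattern == null) {
            return Optional.empty();
        }
        // The pattern itself is bound as a prepared-statement parameter,
        // so no escaping of user input happens here.
        return Optional.of(quoteIdentifier(column) + " LIKE ?");
    }

    static String quoteIdentifier(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }
}
```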