Re: [DISCUSS] FLIP-437: Support ML Models in Flink SQL

2024-03-14 Thread Hao Li
Hi Jark,

Thanks for the pointer. Sorry for the confusion: I meant how the table name
in a window TVF gets translated to `SqlCallBinding`. Probably we need to
fetch the table definition from the catalog somewhere. Do we treat those
window TVFs specially in the parser/planner so that the catalog is looked up
when they are seen?

As for what a model is, I'm wondering if it has to be a datatype or relation.
Can it be another kind of citizen, parallel to datatype/relation/function/db?
Redshift also supports the `SHOW MODELS` operation, so it seems models are
treated specially there as well. The reasons I don't like Redshift's syntax
are:
1. It's a bit verbose: you need to think of a model name as well as a
function name, and the function name also needs to be unique.
2. More importantly, the prediction function isn't the only function that can
operate on models. There could be a set of inference functions [1] and
evaluation functions [2] which can operate on models. It's hard to specify
all of them at model creation.

[1]:
https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-predict
[2]:
https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-evaluate
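To illustrate the second point, here is a toy sketch (Python; the registry and the `ml_predict`/`ml_evaluate` helpers are invented purely for illustration and are not Flink APIs) of how one generic entry point per function family can serve every registered model, so that model creation never needs to enumerate the functions that may later operate on it:

```python
# Toy model catalog: generic ML_PREDICT / ML_EVALUATE entry points resolve
# the model by name at call time, so CREATE MODEL never has to bind a
# per-model function name.

class Model:
    def __init__(self, name, predict_fn, eval_fn):
        self.name = name
        self.predict_fn = predict_fn
        self.eval_fn = eval_fn

CATALOG = {}

def create_model(name, predict_fn, eval_fn):
    CATALOG[name] = Model(name, predict_fn, eval_fn)

def ml_predict(model_name, rows):
    # Generic inference function: looks the model up in the catalog.
    model = CATALOG[model_name]
    return [model.predict_fn(r) for r in rows]

def ml_evaluate(model_name, rows, labels):
    # A second function family operating on the same model; no extra
    # per-model function name is needed.
    model = CATALOG[model_name]
    preds = [model.predict_fn(r) for r in rows]
    return model.eval_fn(preds, labels)

create_model(
    "spam_model",
    predict_fn=lambda r: int(r["len"] > 10),
    eval_fn=lambda p, y: sum(a == b for a, b in zip(p, y)) / len(y),
)

rows = [{"len": 5}, {"len": 20}]
print(ml_predict("spam_model", rows))           # [0, 1]
print(ml_evaluate("spam_model", rows, [0, 1]))  # 1.0
```

Binding a unique function name to each model at creation time (the Redshift style) would instead require one registry entry per model per function family.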

Thanks,
Hao

On Thu, Mar 14, 2024 at 8:18 PM Jark Wu  wrote:

> Hi Hao,
>
> > Can you send me some pointers
> > where the function gets the table information?
>
> Here is the code of cumulate window type checking [1].
>
> > Also is it possible to support  in
> > window functions in addition to table?
>
> Yes. It is not allowed in TVF.
>
> Thanks for the syntax links of other systems. The reason I prefer the
> Redshift way is that it avoids introducing Model as a relation or datatype
> (referenced as a parameter in a TVF).
> Model is not a relation because it cannot be queried directly (e.g., SELECT *
> FROM model).
> I'm also confused about making Model a datatype, because I don't know what
> class the model parameter of the eval method of TableFunction/ScalarFunction
> should be. By defining the function together with the model, users can
> directly invoke the function without referencing the model name.
>
> Best,
> Jark
>
> [1]:
>
> https://github.com/apache/flink/blob/d6c7eee8243b4fe3e593698f250643534dc79cb5/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/functions/sql/SqlCumulateTableFunction.java#L53
>
> On Fri, 15 Mar 2024 at 02:48, Hao Li  wrote:
>
> > Hi Jark,
> >
> > Thanks for the pointers. It's very helpful.
> >
> > 1. Looks like `tumble`, `hopping` are keywords in calcite parser. And the
> > syntax `cumulate(Table my_table, ...)` needs to get table information
> from
> > catalog somewhere for type validation etc. Can you send me some pointers
> > where the function gets the table information?
> > 2. The ideal syntax for model function I think would be `ML_PREDICT(MODEL
> > , {table  | (query_stmt) })`. I think with
> special
> > handling of the `ML_PREDICT` function in parser/planner, maybe we can do
> > this like window functions. But to support `MODEL` keyword, we need
> calcite
> > parser change I guess. Also is it possible to support  in
> > window functions in addition to table?
> >
> > For the Redshift syntax, I'm not sure about the purpose of defining the
> > function name together with the model. Is it to define the function
> > input/output schema? We have the schema in our CREATE MODEL syntax, and
> > `ML_PREDICT` can handle it by fetching the model definition. I think our
> > syntax is more concise in having a generic prediction function. I also did
> > some research, and it's the syntax used by Databricks `ai_query` [1],
> > Snowflake `predict` [2], and Azure ML `predict` [3].
> >
> > [1]:
> >
> https://docs.databricks.com/en/sql/language-manual/functions/ai_query.html
> > [2]:
> >
> >
> https://github.com/Snowflake-Labs/sfguide-intro-to-machine-learning-with-snowpark-ml-for-python/blob/main/3_snowpark_ml_model_training_inference.ipynb?_fsi=sksXUwQ0
> > [3]:
> >
> >
> https://learn.microsoft.com/en-us/sql/machine-learning/tutorials/quickstart-python-train-score-model?view=azuresqldb-mi-current
> >
> > Thanks,
> > Hao
> >
> > On Wed, Mar 13, 2024 at 8:57 PM Jark Wu  wrote:
> >
> > > Hi Mingge, Hao,
> > >
> > > Thanks for your replies.
> > >
> > > > PTF is actually the ideal approach for model functions, and we do
> have
> > > the plans to use PTF for
> > > all model functions (including prediction, evaluation etc..) once the
> PTF
> > > is supported in FlinkSQL
> > > confluent extension.
> > >
> > > It sounds like PTF is the ideal way and the table function is a
> > > temporary solution which will be dropped in the future.
> > > I'm not sure whether we can implement it using PTF in Flink SQL. But we
> > > have implemented window
> > > functions using PTF[1]. And introduced a new window function (called
> > > CUMULATE[2]) in Flink SQL based
> > > on this. I think it might work to use PTF and implement model function
> > > syntax like this:
> > >
> > > SELECT * FROM TABLE(ML_PREDICT(
> > >   TABLE my_table,
> > >   my_model,

Re: [DISCUSS] FLIP-426: Grouping Remote State Access

2024-03-14 Thread Jinzhong Li
Hi Rui,

Thanks for your comments.

> In my opinion, we need to reduce the times of fetching RocksDB
> SST from remote to local. The FLIP seems to batch the RocksDB
> put/get requests. I am not sure this will reduce the SST fetching times.

For multiple Get (read) requests, we will convert them into one MultiGet
request. The MultiGet can merge block read I/Os that belong to multiple Get
requests, thereby reducing the number of times RocksDB blocks/SSTs are
fetched from the remote filesystem.
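The effect can be illustrated with a toy simulation (Python; the block layout and fetch counting below are invented for illustration and do not reflect RocksDB internals):

```python
# Toy simulation: keys live in fixed-size "blocks" on remote storage.
# Per-record Get fetches the containing block once per request; a MultiGet
# deduplicates the needed blocks first, so each block is fetched only once.

BLOCK_SIZE = 4  # keys 0-3 in block 0, keys 4-7 in block 1, ...
remote_fetches = 0

def fetch_block(block_id):
    global remote_fetches
    remote_fetches += 1  # one remote round trip / block read IO
    return {k: f"v{k}" for k in range(block_id * BLOCK_SIZE,
                                      (block_id + 1) * BLOCK_SIZE)}

def get_one(key):
    return fetch_block(key // BLOCK_SIZE)[key]

def multi_get(keys):
    blocks = {k // BLOCK_SIZE for k in keys}  # merged, deduplicated IO
    data = {}
    for b in sorted(blocks):
        data.update(fetch_block(b))
    return [data[k] for k in keys]

keys = [0, 1, 2, 5, 6, 7]

remote_fetches = 0
for k in keys:
    get_one(k)
per_record_io = remote_fetches   # 6 block fetches, one per Get

remote_fetches = 0
multi_get(keys)
batched_io = remote_fetches      # 2 block fetches (blocks 0 and 1)

print(per_record_io, batched_io)  # 6 2
```

The same six keys cost six remote block reads when fetched one by one, but only two when the reads are grouped, which is the round-trip saving the FLIP targets.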

> How to monitor the I/O used by state disaggregation?

On the one hand, the remote storage (S3/HDFS/OSS) generally provides
network/IO throughput monitoring. On the other hand, on the Flink side we
can also provide some metrics about accessing remote storage, e.g. the ratio
of remote storage accesses to local disk cache accesses, etc.

> How about customizing batching strategy?

I don't recommend permitting users to customize the batching strategy, as
very few users have the capability to customize this strategy effectively.
However, we could offer some configurable options that allow users to
adjust the batching behavior, such as the batch size and so on.

Best,
Jinzhong


On Fri, Mar 15, 2024 at 12:08 PM 夏 瑞  wrote:

> Hi Jinzhong,
>
> Batching state access is a reasonable way to reduce the amount of I/O
> compared to per-record state access. But I have some questions:
>
> - In my opinion, we need to reduce the number of times RocksDB SST files
> are fetched from remote storage to local. The FLIP seems to batch the
> RocksDB put/get requests. I am not sure this will reduce the SST fetching
> times.
>
> - How to monitor the I/O used by state disaggregation? The latency/amount
> of I/O on DFS is important for performance diagnosis. Moreover, the amount
> of I/O also influences the stability of the DFS. For example, the
> throughput of the HDFS NameNode is hard to scale and suffers from I/O
> floods.
>
> - How about customizing the batching strategy? Intuitively, an extremely
> large batch may need lots of memory to hold the returned results and can
> cause OOM. On the other side, if we randomly batch keys, state storage may
> have to scan all SSTs to find the results.
>
> Best wishes,
> Rui Xia.
>
> 
> From: Hangxiang Yu
> Sent: March 15, 2024, 4:04
> To: xiarui0...@hotmail.com
> Subject: Fwd: [DISCUSS] FLIP-426: Grouping Remote State Access
>
>
>
> -- Forwarded message -
> From: Jinzhong Li <lijinzhong2...@gmail.com>
> Date: Thu, Mar 7, 2024 at 4:52 PM
> Subject: [DISCUSS] FLIP-426: Grouping Remote State Access
> To: <dev@flink.apache.org>
> Cc: <yuanmei.w...@gmail.com>, <zakelly@gmail.com>, <master...@gmail.com>,
> <fredia...@gmail.com>, <fengw...@apache.org>
>
>
>
> Hi devs,
>
>
> I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated
> State Storage and Management[1], which is a joint work of Yuan Mei, Zakelly
> Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang:
>
> - FLIP-426: Grouping Remote State Access <https://cwiki.apache.org/confluence/display/FLINK/FLIP-426%3A+Grouping+Remote+State+Access> [2]
>
> This FLIP enables retrieval of remote state data in batches to avoid
> unnecessary round-trip costs for remote access.
>
> Please make sure you have read FLIP-423[1] to know the whole story,
> and we'll discuss the details of FLIP-426[2] under this mail. For the
> discussion of the overall architecture or topics related to multiple
> sub-FLIPs, please post in the previous mail[3].
>
> Looking forward to hearing from you!
>
> [1] https://cwiki.apache.org/confluence/x/R4p3EQ
>
> [2] https://cwiki.apache.org/confluence/x/TYp3EQ
>
> [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0
>
> Best,
>
> Jinzhong Li
>
>
>
>
> --
> Best,
> Hangxiang.
>


[jira] [Created] (FLINK-34690) If data from upstream is decimal and primary key , starrocks sink will not support.

2024-03-14 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-34690:
-

 Summary: If data from upstream is decimal and primary key , 
starrocks sink will not support.
 Key: FLINK-34690
 URL: https://issues.apache.org/jira/browse/FLINK-34690
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: Hongshun Wang
 Fix For: cdc-3.1.0


If the upstream data has a DECIMAL column as part of the primary key, the
StarRocks sink will not support it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-426: Grouping Remote State Access

2024-03-14 Thread 夏 瑞
Hi Jinzhong,

Batching state access is a reasonable way to reduce the amount of I/O compared 
to per-record state access. But I have some questions:

- In my opinion, we need to reduce the number of times RocksDB SST files are
fetched from remote storage to local. The FLIP seems to batch the RocksDB
put/get requests. I am not sure this will reduce the SST fetching times.

- How to monitor the I/O used by state disaggregation? The latency/amount of
I/O on DFS is important for performance diagnosis. Moreover, the amount of
I/O also influences the stability of the DFS. For example, the throughput of
the HDFS NameNode is hard to scale and suffers from I/O floods.

- How about customizing the batching strategy? Intuitively, an extremely large
batch may need lots of memory to hold the returned results and can cause OOM.
On the other side, if we randomly batch keys, state storage may have to scan
all SSTs to find the results.

Best wishes,
Rui Xia.


From: Hangxiang Yu
Sent: March 15, 2024, 4:04
To: xiarui0...@hotmail.com
Subject: Fwd: [DISCUSS] FLIP-426: Grouping Remote State Access



-- Forwarded message -
From: Jinzhong Li <lijinzhong2...@gmail.com>
Date: Thu, Mar 7, 2024 at 4:52 PM
Subject: [DISCUSS] FLIP-426: Grouping Remote State Access
To: <dev@flink.apache.org>
Cc: <yuanmei.w...@gmail.com>, <zakelly@gmail.com>, <master...@gmail.com>,
<fredia...@gmail.com>, <fengw...@apache.org>



Hi devs,


I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated State 
Storage and Management[1], which is a joint work of Yuan Mei, Zakelly Lan, 
Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang:

- FLIP-426: Grouping Remote State Access [2]

This FLIP enables retrieval of remote state data in batches to avoid 
unnecessary round-trip costs for remote access.

Please make sure you have read FLIP-423[1] to know the whole story, and
we'll discuss the details of FLIP-426[2] under this mail. For the discussion
of the overall architecture or topics related to multiple sub-FLIPs, please
post in the previous mail[3].

Looking forward to hearing from you!

[1] https://cwiki.apache.org/confluence/x/R4p3EQ

[2] https://cwiki.apache.org/confluence/x/TYp3EQ

[3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0

Best,

Jinzhong Li




--
Best,
Hangxiang.


Re: Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Yubin Li
Hi Xuyang,

Thank you for pointing this out. The parser part of the `DESCRIBE CATALOG`
syntax was indeed introduced in FLIP-69, but it is not actually available.
We can complete the syntax in this FLIP [1]. I have updated the doc :)
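As a toy illustration (Python; this is not Flink's actual CatalogManager API, and the class and method names here are invented), the three proposed statements reduce to simple operations on the stored catalog descriptor:

```python
# Toy catalog manager: SHOW CREATE CATALOG renders the stored options,
# ALTER CATALOG ... SET merges new key/value pairs into the existing
# descriptor (mirroring the ALTER DATABASE design mentioned in the thread).

class CatalogManager:
    def __init__(self):
        self._descriptors = {}  # catalog name -> options dict

    def create_catalog(self, name, options):
        self._descriptors[name] = dict(options)

    def get_catalog_descriptor(self, name):
        # Narrow accessor instead of exposing the whole CatalogStore.
        return dict(self._descriptors[name])

    def alter_catalog(self, name, new_options):
        # ALTER CATALOG catalog_name SET (k1=v1, ...): merge, not replace.
        self._descriptors[name].update(new_options)

    def show_create_catalog(self, name):
        opts = ", ".join(f"'{k}'='{v}'"
                         for k, v in sorted(self._descriptors[name].items()))
        return f"CREATE CATALOG {name} WITH ({opts})"

cm = CatalogManager()
cm.create_catalog("cat1", {"type": "generic_in_memory"})
cm.alter_catalog("cat1", {"default-database": "db1"})
print(cm.show_create_catalog("cat1"))
# CREATE CATALOG cat1 WITH ('default-database'='db1', 'type'='generic_in_memory')
```

The narrow `get_catalog_descriptor()` accessor sketches the point raised earlier in the thread: the feature only needs the descriptor, not the whole store.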

Best,
Yubin

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax

On Fri, Mar 15, 2024 at 10:12 AM Xuyang  wrote:

> Hi, Yubin. Big +1 for this FLIP. I just left one minor comment below.
>
>
> I found that although Flink does not support the syntax 'DESCRIBE CATALOG
> catalog_name' currently, it was already discussed in FLIP-69[1]. Do we
> need to restart the discussion on it?
> I don't have a particular preference regarding restarting the discussion.
> It seems that there is no difference in this syntax in FLIP-436, so maybe
> it would be best to refer back to FLIP-69 in this FLIP. WDYT?
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-69%3A+Flink+SQL+DDL+Enhancement
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> At 2024-03-15 02:49:59, "Yubin Li"  wrote:
> >Hi folks,
> >
> >Thank you all for your input, it really makes sense to introduce missing
> >catalog-related SQL syntaxes under this FLIP, and I have changed the
> >title of doc to "FLIP-436: Introduce Catalog-related Syntax".
> >
> >After comprehensive consideration, the following syntaxes should be
> >introduced, more suggestions are welcome :)
> >
> >> 1. SHOW CREATE CATALOG catalog_name
> >> 2. DESCRIBE/DESC CATALOG catalog_name
> >> 3. ALTER CATALOG catalog_name SET (key1=val1, key2=val2, ...)
> >
> >Regarding the `alter catalog` syntax format, I refer to the current design
> >of `alter database`.
> >
> >Given that CatalogManager already provides catalog operations such as
> >create, get, and unregister, and in order to facilitate future
> >implementation
> >of audit tracking, I propose to introduce the alterCatalog() function in
> >CatalogManager. WDYT?
> >
> >Please see details in FLIP doc [1] .
> >
> >Best,
> >Yubin
> >
> >[1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
> >
> >
> >On Thu, Mar 14, 2024 at 11:07 PM Leonard Xu  wrote:
> >
> >> Hi Yubin,
> >>
> >> Thanks for driving the discussion, generally +1 for the FLIP, big +1 to
> >> finalize the whole catalog syntax story in one FLIP,
> >> thus I want to jump into the discussion again after you completed the
> >> whole catalog syntax story.
> >>
> >> Best,
> >> Leonard
> >>
> >>
> >>
> >> > On Mar 14, 2024, at 20:39, Roc Marshal wrote:
> >> >
> >> > Hi, Yubin
> >> >
> >> >
> >> > Thank you for initiating this discussion! +1 for the proposal.
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > Best,
> >> > Yuepeng Pan
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > At 2024-03-14 18:57:35, "Ferenc Csaky" 
> >> wrote:
> >> >> Hi Yubin,
> >> >>
> >> >> Thank you for initiating this discussion! +1 for the proposal.
> >> >>
> >> >> I also think it makes sense to group the missing catalog related
> >> >> SQL syntaxes under this FLIP.
> >> >>
> >> >> Looking forward to these features!
> >> >>
> >> >> Best,
> >> >> Ferenc
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Thursday, March 14th, 2024 at 08:31, Jane Chan <
> >> qingyue@gmail.com> wrote:
> >> >>
> >> >>>
> >> >>>
> >> >>> Hi Yubin,
> >> >>>
> >> >>> Thanks for leading the discussion. I'm +1 for the FLIP.
> >> >>>
> >> >>> As Jark said, it's a good opportunity to enhance the syntax for
> Catalog
> >> >>> from a more comprehensive perspective. So, I suggest expanding the
> >> scope of
> >> >>> this FLIP by focusing on the mechanism instead of one use case to
> >> enhance
> >> >>> the overall functionality. WDYT?
> >> >>>
> >> >>> Best,
> >> >>> Jane
> >> >>>
> >> >>> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com
> >> wrote:
> >> >>>
> >>  Hi, Yubin.
> >> 
> >>  Thanks for the FLIP. +1 for it.
> >> 
> >>  Best,
> >>  Hang
> >> 
> >>  Yubin Li lyb5...@gmail.com wrote on Thu, Mar 14, 2024 at 10:15:
> >> 
> >> > Hi Jingsong, Feng, and Jeyhun
> >> >
> >> > Thanks for your support and feedback!
> >> >
> >> >> However, could we add a new method `getCatalogDescriptor()` to
> >> >> CatalogManager instead of directly exposing CatalogStore?
> >> >
> >> > Good point. Besides the audit tracking issue, the proposed feature
> >> > only requires the `getCatalogDescriptor()` function. Exposing
> >> > components with excessive functionality would bring unnecessary
> >> > risks; I have made modifications in the FLIP doc [1]. Thanks Feng :)
> >> >
> >> >> Showing the SQL parser implementation in the FLIP for the SQL syntax
> >> >> might be a bit confusing. Also, the formal definition is missing for
> >> >> this SQL clause.
> >> >
> >> > Thank Jeyhun for pointing it out :) I have updated the doc [1] .
> >> >
> >> > [1]
> >> 
> >> 
> >>
> 

[jira] [Created] (FLINK-34689) check binlog_row_value_optoins

2024-03-14 Thread Lee SeungMin (Jira)
Lee SeungMin created FLINK-34689:


 Summary: check binlog_row_value_optoins
 Key: FLINK-34689
 URL: https://issues.apache.org/jira/browse/FLINK-34689
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Lee SeungMin
 Attachments: image-2024-03-15-12-56-49-344.png

When {{binlog_row_value_options}} is set to {{PARTIAL_JSON}},
update events are written as {{Update_rows_partial}}.

Flink CDC does not parse this event, because the {{Update_rows_partial}} binlog
event is mapped to {{PARTIAL_UPDATE_ROWS_EVENT}} and Flink CDC does not handle
that event type.

 

Example of {{Update_rows_partial}} (when {{binlog_row_value_options}} =
{{PARTIAL_JSON}}):

!image-2024-03-15-12-56-49-344.png|width=1015,height=30!

So we have to check {{binlog_row_value_options}} before starting.
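A minimal sketch of such a pre-flight check (Python; the helper name and the way the variable value is obtained are assumptions for illustration, not connector code):

```python
# Toy pre-flight check: refuse to start if the MySQL server is configured
# with binlog_row_value_options=PARTIAL_JSON, since updates would then be
# logged as PARTIAL_UPDATE_ROWS_EVENT, which is not parsed.

def validate_binlog_row_value_options(server_variables):
    # server_variables: dict of variable name -> value, e.g. as fetched via
    # SHOW GLOBAL VARIABLES LIKE 'binlog_row_value_options'
    value = server_variables.get("binlog_row_value_options", "")
    if "PARTIAL_JSON" in value.upper():
        raise ValueError(
            "binlog_row_value_options=PARTIAL_JSON is not supported: "
            "updates would be written as PARTIAL_UPDATE_ROWS_EVENT, "
            "which the connector does not handle")

validate_binlog_row_value_options({"binlog_row_value_options": ""})  # ok
try:
    validate_binlog_row_value_options(
        {"binlog_row_value_options": "PARTIAL_JSON"})
except ValueError as e:
    print("rejected:", e)
```

Failing fast at startup gives a clear error instead of silently dropped update events at runtime.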

 

 

Created PR: [[MySQL][Feature] check binlog_row_value_optoins by SML0127 · Pull
Request #3148 · apache/flink-cdc|https://github.com/apache/flink-cdc/pull/3148]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34688) CDC framework split snapshot chunks asynchronously

2024-03-14 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-34688:
-

 Summary: CDC framework split snapshot chunks asynchronously
 Key: FLINK-34688
 URL: https://issues.apache.org/jira/browse/FLINK-34688
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Hongshun Wang
 Fix For: cdc-3.1.0


In MySQL CDC, MysqlSnapshotSplitAssigner splits snapshot chunks
asynchronously ([https://github.com/apache/flink-cdc/pull/931]), but the CDC
framework lacks this.

If a table is too big to split, the enumerator will be stuck, and
checkpointing will be affected (sometimes checkpoint timeouts occur).
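The idea can be sketched with a toy enumerator (Python; the structure below is invented for illustration and is not the CDC framework's API), where chunk splitting runs on a background thread so the enumerator thread stays responsive for checkpoints:

```python
# Toy sketch: split a large table into snapshot chunks on a background
# thread; the "enumerator" thread only polls completed splits, so it is
# never blocked by a long-running split computation.

import queue
import threading

def split_table(row_count, chunk_size, out):
    # Potentially slow for huge tables -- runs off the enumerator thread.
    for start in range(0, row_count, chunk_size):
        out.put((start, min(start + chunk_size, row_count)))
    out.put(None)  # sentinel: splitting finished

splits = queue.Queue()
worker = threading.Thread(target=split_table, args=(10, 4, splits))
worker.start()

assigned = []
while True:
    s = splits.get()   # a real enumerator would poll without blocking
    if s is None:
        break
    assigned.append(s)  # hand the split to a source reader
worker.join()

print(assigned)  # [(0, 4), (4, 8), (8, 10)]
```

With splitting off the main thread, checkpoint barriers can be processed between split assignments instead of waiting for the whole table to be chunked.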



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34687) Home Page of Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34687:
-

 Summary: Home Page of Flink CDC Documentation
 Key: FLINK-34687
 URL: https://issues.apache.org/jira/browse/FLINK-34687
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: cdc-3.1.0
Reporter: Qingsheng Ren
 Fix For: cdc-3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-437: Support ML Models in Flink SQL

2024-03-14 Thread Jark Wu
Hi Hao,

> Can you send me some pointers
> where the function gets the table information?

Here is the code of cumulate window type checking [1].

> Also is it possible to support  in
> window functions in addition to table?

Yes. It is not allowed in TVF.

Thanks for the syntax links of other systems. The reason I prefer the
Redshift way is that it avoids introducing Model as a relation or datatype
(referenced as a parameter in a TVF).
Model is not a relation because it cannot be queried directly (e.g., SELECT *
FROM model).
I'm also confused about making Model a datatype, because I don't know what
class the model parameter of the eval method of TableFunction/ScalarFunction
should be. By defining the function together with the model, users can
directly invoke the function without referencing the model name.

Best,
Jark

[1]:
https://github.com/apache/flink/blob/d6c7eee8243b4fe3e593698f250643534dc79cb5/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/functions/sql/SqlCumulateTableFunction.java#L53

On Fri, 15 Mar 2024 at 02:48, Hao Li  wrote:

> Hi Jark,
>
> Thanks for the pointers. It's very helpful.
>
> 1. Looks like `tumble`, `hopping` are keywords in calcite parser. And the
> syntax `cumulate(Table my_table, ...)` needs to get table information from
> catalog somewhere for type validation etc. Can you send me some pointers
> where the function gets the table information?
> 2. The ideal syntax for model function I think would be `ML_PREDICT(MODEL
> , {table  | (query_stmt) })`. I think with special
> handling of the `ML_PREDICT` function in parser/planner, maybe we can do
> this like window functions. But to support `MODEL` keyword, we need calcite
> parser change I guess. Also is it possible to support  in
> window functions in addition to table?
>
> For the Redshift syntax, I'm not sure about the purpose of defining the
> function name together with the model. Is it to define the function
> input/output schema? We have the schema in our CREATE MODEL syntax, and
> `ML_PREDICT` can handle it by fetching the model definition. I think our
> syntax is more concise in having a generic prediction function. I also did
> some research, and it's the syntax used by Databricks `ai_query` [1],
> Snowflake `predict` [2], and Azure ML `predict` [3].
>
> [1]:
> https://docs.databricks.com/en/sql/language-manual/functions/ai_query.html
> [2]:
>
> https://github.com/Snowflake-Labs/sfguide-intro-to-machine-learning-with-snowpark-ml-for-python/blob/main/3_snowpark_ml_model_training_inference.ipynb?_fsi=sksXUwQ0
> [3]:
>
> https://learn.microsoft.com/en-us/sql/machine-learning/tutorials/quickstart-python-train-score-model?view=azuresqldb-mi-current
>
> Thanks,
> Hao
>
> On Wed, Mar 13, 2024 at 8:57 PM Jark Wu  wrote:
>
> > Hi Mingge, Hao,
> >
> > Thanks for your replies.
> >
> > > PTF is actually the ideal approach for model functions, and we do have
> > the plans to use PTF for
> > all model functions (including prediction, evaluation etc..) once the PTF
> > is supported in FlinkSQL
> > confluent extension.
> >
> > It sounds like PTF is the ideal way and the table function is a
> > temporary solution which will be dropped in the future.
> > I'm not sure whether we can implement it using PTF in Flink SQL. But we
> > have implemented window
> > functions using PTF[1]. And introduced a new window function (called
> > CUMULATE[2]) in Flink SQL based
> > on this. I think it might work to use PTF and implement model function
> > syntax like this:
> >
> > SELECT * FROM TABLE(ML_PREDICT(
> >   TABLE my_table,
> >   my_model,
> >   col1,
> >   col2
> > ));
> >
> > Besides, did you consider following the approach of AWS Redshift, which
> > defines the model function together with the model itself?
> > IIUC, a model is a black box that defines input parameters and output
> > parameters, which can be modeled as a function.
> >
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-tvf/#session
> > [2]:
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function#FLIP145:SupportSQLwindowingtablevaluedfunction-CumulatingWindows
> > [3]:
> >
> >
> https://github.com/aws-samples/amazon-redshift-ml-getting-started/blob/main/use-cases/bring-your-own-model-remote-inference/README.md#create-model
> >
> >
> >
> >
> > On Wed, 13 Mar 2024 at 15:00, Hao Li  wrote:
> >
> > > Hi Jark,
> > >
> > > Thanks for your questions. These are good questions!
> > >
> > > 1. The polymorphic table function I was referring to takes a table as
> > > input and outputs a table. So the syntax would be like
> > > ```
> > > SELECT * FROM ML_PREDICT('model', (SELECT * FROM my_table))
> > > ```
> > > As far as I know, this is not supported yet on Flink. So before it's
> > > supported, one option for the predict function is using table function
> > > which can output multiple columns
> > > ```
> > > SELECT * FROM my_table, LATERAL VIEW (ML_PREDICT('model', 

[jira] [Created] (FLINK-34686) "Deployment - YARN" Page for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34686:
-

 Summary: "Deployment - YARN" Page for Flink CDC Documentation
 Key: FLINK-34686
 URL: https://issues.apache.org/jira/browse/FLINK-34686
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: cdc-3.1.0
Reporter: Qingsheng Ren
 Fix For: cdc-3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34685) "Deployment - Kubernetes" Page for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34685:
-

 Summary: "Deployment - Kubernetes" Page for Flink CDC Documentation
 Key: FLINK-34685
 URL: https://issues.apache.org/jira/browse/FLINK-34685
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: cdc-3.1.0
Reporter: Qingsheng Ren
 Fix For: cdc-3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34684) "Developer Guide - Licenses" Page for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34684:
-

 Summary: "Developer Guide - Licenses" Page for Flink CDC 
Documentation
 Key: FLINK-34684
 URL: https://issues.apache.org/jira/browse/FLINK-34684
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: cdc-3.1.0
Reporter: Qingsheng Ren
 Fix For: cdc-3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34683) "Developer Guide - Contribute to Flink CDC" Page for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34683:
-

 Summary: "Developer Guide - Contribute to Flink CDC" Page for 
Flink CDC Documentation
 Key: FLINK-34683
 URL: https://issues.apache.org/jira/browse/FLINK-34683
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: 3.1.0
Reporter: Qingsheng Ren
 Fix For: 3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34682) "Developer Guide - Understanding Flink CDC" Page for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34682:
-

 Summary: "Developer Guide - Understanding Flink CDC" Page for 
Flink CDC Documentation
 Key: FLINK-34682
 URL: https://issues.apache.org/jira/browse/FLINK-34682
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: 3.1.0
Reporter: Qingsheng Ren
 Fix For: 3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34681) "Deployment" Pages for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34681:
-

 Summary: "Deployment" Pages for Flink CDC Documentation
 Key: FLINK-34681
 URL: https://issues.apache.org/jira/browse/FLINK-34681
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: 3.1.0
Reporter: Qingsheng Ren
 Fix For: 3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34680) "Overview" Page for Connectors in Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34680:
-

 Summary: "Overview" Page for Connectors in Flink CDC Documentation
 Key: FLINK-34680
 URL: https://issues.apache.org/jira/browse/FLINK-34680
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: 3.1.0
Reporter: Qingsheng Ren
 Fix For: 3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34679) "Core Concept" Pages for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34679:
-

 Summary: "Core Concept" Pages for Flink CDC Documentation
 Key: FLINK-34679
 URL: https://issues.apache.org/jira/browse/FLINK-34679
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: 3.1.0
Reporter: Qingsheng Ren
 Fix For: 3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34678) "Introduction" Page for Flink CDC Documentation

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34678:
-

 Summary: "Introduction" Page for Flink CDC Documentation
 Key: FLINK-34678
 URL: https://issues.apache.org/jira/browse/FLINK-34678
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Flink CDC
Affects Versions: 3.1.0
Reporter: Qingsheng Ren
 Fix For: 3.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34677) Refactor the structure of documentation for Flink CDC

2024-03-14 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-34677:
-

 Summary: Refactor the structure of documentation for Flink CDC
 Key: FLINK-34677
 URL: https://issues.apache.org/jira/browse/FLINK-34677
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Affects Versions: 3.1.0
Reporter: Qingsheng Ren
Assignee: Qingsheng Ren
 Fix For: 3.1.0


The documentation structure of Flink CDC is not quite in good shape currently.
We plan to refactor it as below (✅ marks existing pages; the rest are new
pages to write):
 * Get Started
 ** Introduction
 ** ✅ Quickstart
 * Core Concept
 ** (Pages for data pipeline / sources / sinks / table ID / transform / route)
 * Connectors
 ** Overview
 ** ✅ (Pages for connectors)
 ** ✅ Legacy Flink CDC Sources (for CDC sources before 3.0)
 * Deployment
 ** (Pages for different deployment modes)
 * Developer Guide
 ** Understand Flink CDC API
 ** Contribute to Flink CDC
 ** Licenses
 * FAQ
 ** ✅ Frequently Asked Questions



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re:Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Xuyang
Hi, Yubin. Big +1 for this FLIP. I just left one minor comment below.


I found that although Flink does not support the syntax 'DESCRIBE CATALOG
catalog_name' currently, it was already discussed in FLIP-69[1]. Do we need
to restart the discussion on it?
I don't have a particular preference regarding restarting the discussion. It
seems that there is no difference in this syntax in FLIP-436, so maybe it
would be best to refer back to FLIP-69 in this FLIP. WDYT?


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-69%3A+Flink+SQL+DDL+Enhancement



--

Best!
Xuyang





At 2024-03-15 02:49:59, "Yubin Li"  wrote:
>Hi folks,
>
>Thank you all for your input, it really makes sense to introduce missing
>catalog-related SQL syntaxes under this FLIP, and I have changed the
>title of doc to "FLIP-436: Introduce Catalog-related Syntax".
>
>After comprehensive consideration, the following syntaxes should be
>introduced, more suggestions are welcome :)
>
>> 1. SHOW CREATE CATALOG catalog_name
>> 2. DESCRIBE/DESC CATALOG catalog_name
>> 3. ALTER CATALOG catalog_name SET (key1=val1, key2=val2, ...)
>
>Regarding the `alter catalog` syntax format, I refer to the current design
>of `alter database`.
>
>Given that CatalogManager already provides catalog operations such as
>create, get, and unregister, and in order to facilitate future
>implementation
>of audit tracking, I propose to introduce the alterCatalog() function in
>CatalogManager. WDYT?
>
>Please see details in FLIP doc [1] .
>
>Best,
>Yubin
>
>[1]
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
>
>
>On Thu, Mar 14, 2024 at 11:07 PM Leonard Xu  wrote:
>
>> Hi Yubin,
>>
>> Thanks for driving the discussion, generally +1 for the FLIP, big +1 to
>> finalize the whole catalog syntax story in one FLIP,
>> thus I want to jump into the discussion again after you completed the
>> whole catalog syntax story.
>>
>> Best,
>> Leonard
>>
>>
>>
>> > 2024年3月14日 下午8:39,Roc Marshal  写道:
>> >
>> > Hi, Yubin
>> >
>> >
>> > Thank you for initiating this discussion! +1 for the proposal.
>> >
>> >
>> >
>> >
>> >
>> >
>> > Best,
>> > Yuepeng Pan
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > At 2024-03-14 18:57:35, "Ferenc Csaky" 
>> wrote:
>> >> Hi Yubin,
>> >>
>> >> Thank you for initiating this discussion! +1 for the proposal.
>> >>
>> >> I also think it makes sense to group the missing catalog related
>> >> SQL syntaxes under this FLIP.
>> >>
>> >> Looking forward to these features!
>> >>
>> >> Best,
>> >> Ferenc
>> >>
>> >>
>> >>
>> >>
>> >> On Thursday, March 14th, 2024 at 08:31, Jane Chan <
>> qingyue@gmail.com> wrote:
>> >>
>> >>>
>> >>>
>> >>> Hi Yubin,
>> >>>
>> >>> Thanks for leading the discussion. I'm +1 for the FLIP.
>> >>>
>> >>> As Jark said, it's a good opportunity to enhance the syntax for Catalog
>> >>> from a more comprehensive perspective. So, I suggest expanding the
>> scope of
>> >>> this FLIP by focusing on the mechanism instead of one use case to
>> enhance
>> >>> the overall functionality. WDYT?
>> >>>
>> >>> Best,
>> >>> Jane
>> >>>
>> >>> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com
>> wrote:
>> >>>
>>  Hi, Yubin.
>> 
>>  Thanks for the FLIP. +1 for it.
>> 
>>  Best,
>>  Hang
>> 
>>  Yubin Li lyb5...@gmail.com 于2024年3月14日周四 10:15写道:
>> 
>> > Hi Jingsong, Feng, and Jeyhun
>> >
>> > Thanks for your support and feedback!
>> >
>> >> However, could we add a new method `getCatalogDescriptor()` to
>> >> CatalogManager instead of directly exposing CatalogStore?
>> >
> > Good point. Besides the audit tracking issue, the proposed feature
> > only requires the `getCatalogDescriptor()` function. Exposing components
> > with excessive functionality would bring unnecessary risks; I have made
> > modifications in the FLIP doc [1]. Thanks, Feng :)
>> >
>> >> Showing the SQL parser implementation in the FLIP for the SQL syntax
>> >> might be a bit confusing. Also, the formal definition is missing for
>> >> this SQL clause.
>> >
>> > Thank Jeyhun for pointing it out :) I have updated the doc [1] .
>> >
>> > [1]
>> 
>> 
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
>> 
>> > Best,
>> > Yubin
>> >
>> > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov je.kari...@gmail.com
>> > wrote:
>> >
>> >> Hi Yubin,
>> >>
>> >> Thanks for the proposal. +1 for it.
>> >> I have one comment:
>> >>
>> >> I would like to see the SQL syntax for the proposed statement.
>> Showing
>> >> the
>> >> SQL parser implementation in the FLIP
>> >> for the SQL syntax might be a bit confusing. Also, the formal
>> >> definition
>> >> is
>> >> missing for this SQL clause.
>> >> Maybe something like [1] might be useful. WDYT?
>> >>
>> >> Regards,
>> >> Jeyhun
>> >>
>> >> [1]
>> 

[RESULT][VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Lincoln Lee
Hi everyone,

I'm happy to announce that we have unanimously approved this release.

There are 24 approving votes, 4 of which are binding:

- Xintong Song (binding)
- Jean-Baptiste Onofré (non binding)
- Ahmed Hamdy (non binding)
- Samrat Deb (non binding)
- Jeyhun Karimov (non binding)
- Hangxiang Yu (non binding)
- Yanfei Lei (non binding)
- Ron liu (non binding)
- Feng Jin (non binding)
- Hang Ruan (non binding)
- Xuannan Su (non binding)
- Jane Chan (non binding)
- Rui Fan (non binding)
- weijie guo (non binding)
- Benchao Li (non binding)
- Qingsheng Ren (binding)
- Yun Tang (non binding)
- Ferenc Csaky (non binding)
- gongzhongqiang (non binding)
- Martijn Visser (binding)
- Sergey Nuyanzin (non binding)
- Jing Ge (non binding)
- Leonard Xu (binding)
- Jiabao Sun (non binding)


There are no disapproving votes.

Thank you for verifying the release candidate. We will now proceed
to finalize the release and announce it once everything is published.


Best,
Yun, Jing, Martijn and Lincoln


[jira] [Created] (FLINK-34676) Migrate ConvertToNotInOrInRule

2024-03-14 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-34676:
---

 Summary: Migrate ConvertToNotInOrInRule
 Key: FLINK-34676
 URL: https://issues.apache.org/jira/browse/FLINK-34676
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin








Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-03-14 Thread Jeyhun Karimov
Hi Jane,

Thanks for your comments.

I understand your point.  **It would be better if you could sync the
> content to the FLIP**.


- Sure thing. I added my above answer to the FLIP.


Another thing is I'm curious about what the physical plan looks like. Is
> there any specific info that will be added to the table source (like
> filter/project pushdown)? It would be great if you could attach an example
> to the FLIP.


- For the physical plan, the table source will carry an additional property
named "partitionedReading" or "partitionedRead". For example:

CREATE TABLE MyTableP (
  a bigint,
  b int,
  c varchar
) PARTITIONED BY (a, b) with (
 'connector' = 'filesystem', 'format' = 'testcsv', 'path' = '/root_dir')


SELECT a, b, COUNT (c) from MyTableP GROUP BY a, b


+- LocalHashAggregate(groupBy=[a, b], select=[a, b, Partial_COUNT(c) AS
count$0])
  +- TableSourceScan(table=[[default_catalog, default_database, MyTableP,
partitionedReading]], fields=[a, b, c])


I also added this example to the FLIP.


Please let me know if that answers your question or if you have any other
comments.


Regards,

Jeyhun


On Thu, Mar 14, 2024 at 8:23 AM Jane Chan  wrote:

> Hi Jeyhun,
>
> Thanks for your clarification.
>
> > Once a new partition is detected, we add it to our existing mapping. Our
> mapping looks like Map> subtaskToPartitionAssignment,
> where it maps each source subtaskID to zero or more partitions.
>
> I understand your point.  **It would be better if you could sync the
> content to the FLIP**.
>
> Another thing is I'm curious about what the physical plan looks like. Is
> there any specific info that will be added to the table source (like
> filter/project pushdown)? It would be great if you could attach an example
> to the FLIP.
>
> Bests,
> Jane
>
> On Wed, Mar 13, 2024 at 9:11 PM Jeyhun Karimov 
> wrote:
>
> > Hi Jane,
> >
> > Thanks for your comments.
> >
> >
> > 1. Concerning the `sourcePartitions()` method, the partition information
> > > returned during the optimization phase may not be the same as the
> > partition
> > > information during runtime execution. For long-running jobs, partitions
> > may
> > > be continuously created. Is this FLIP equipped to handle scenarios?
> >
> >
> > - Good point. This scenario is definitely supported.
> > Once a new partition is added, or in general, new splits are
> > discovered,
> > PartitionAwareSplitAssigner::addSplits(Collection
> > newSplits)
> > method will be called. Inside that method, we are able to detect if a
> split
> > belongs to existing partitions or there is a new partition.
> > Once a new partition is detected, we add it to our existing mapping. Our
> > mapping looks like Map>
> subtaskToPartitionAssignment,
> > where
> > it maps each source subtaskID to zero or more partitions.
> >
> > 2. Regarding the `RemoveRedundantShuffleRule` optimization rule, I
> > > understand that it is also necessary to verify whether the hash key
> > within
> > > the Exchange node is consistent with the partition key defined in the
> > table
> > > source that implements `SupportsPartitioning`.
> >
> >
> > - Yes, I overlooked that point; fixed. Actually, the rule is much more
> > complicated; I tried to simplify it in the FLIP. Good point.
> >
> >
> > 3. Could you elaborate on the desired physical plan and integration with
> > > `CompiledPlan` to enhance the overall functionality?
> >
> >
> > - For compiled plan, PartitioningSpec will be used, with a json tag
> > "Partitioning". As a result, in the compiled plan, the source operator
> will
> > have
> > "abilities" : [ { "type" : "Partitioning" } ] as part of the compiled
> plan.
> > More about the implementation details below:
> >
> > 
> > PartitioningSpec class
> > 
> > @JsonTypeName("Partitioning")
> > public final class PartitioningSpec extends SourceAbilitySpecBase {
> >  // some code here
> > @Override
> > public void apply(DynamicTableSource tableSource,
> SourceAbilityContext
> > context) {
> > if (tableSource instanceof SupportsPartitioning) {
> > ((SupportsPartitioning)
> tableSource).applyPartitionedRead();
> > } else {
> > throw new TableException(
> > String.format(
> > "%s does not support SupportsPartitioning.",
> > tableSource.getClass().getName()));
> > }
> > }
> >   // some code here
> > }
> >
> > 
> > SourceAbilitySpec class
> > 
> > @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include =
> > JsonTypeInfo.As.PROPERTY, property = "type")
> > @JsonSubTypes({
> > @JsonSubTypes.Type(value = FilterPushDownSpec.class),
> > @JsonSubTypes.Type(value = LimitPushDownSpec.class),
> > @JsonSubTypes.Type(value = PartitionPushDownSpec.class),
> > @JsonSubTypes.Type(value = ProjectPushDownSpec.class),
> > @JsonSubTypes.Type(value = 

[jira] [Created] (FLINK-34675) Migrate AggregateReduceGroupingRule

2024-03-14 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-34675:
---

 Summary: Migrate AggregateReduceGroupingRule
 Key: FLINK-34675
 URL: https://issues.apache.org/jira/browse/FLINK-34675
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin








[jira] [Created] (FLINK-34674) Migrate CalcSnapshotTransposeRule

2024-03-14 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-34674:
---

 Summary: Migrate CalcSnapshotTransposeRule
 Key: FLINK-34674
 URL: https://issues.apache.org/jira/browse/FLINK-34674
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin








Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-03-14 Thread Jeyhun Karimov
Hi Hang,

Thanks for the comments.

I have a question about the part `Additional option to disable this
> optimization`. Is this option a source configuration or a table
> configuration?


- It is a source configuration.

Besides that, there is a little mistake if I do not understand wrongly.
> Should `Check if upstream_any is pre-partitioned data source AND contains
> the same partition keys as the source` be changed as `Check if upstream_any
> is pre-partitioned data source AND contains the same partition keys as
> downstream_any` ?


- Yes, 'source' should be 'exchange' here.

I materialized both points to the FLIP.
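To make the corrected condition concrete, here is a minimal sketch; all class and method names below are invented for illustration only and are not part of the FLIP's actual rule API:

```java
import java.util.List;

// Minimal sketch of the corrected rule condition: an Exchange can only be
// removed when the upstream source is pre-partitioned AND its partition
// keys match the hash keys of the downstream Exchange. All names here are
// invented for illustration; this is not the FLIP's actual rule API.
public class RemoveRedundantShuffleSketch {

    static boolean canRemoveExchange(
            boolean sourceIsPrePartitioned,
            List<String> sourcePartitionKeys,
            List<String> exchangeHashKeys) {
        return sourceIsPrePartitioned
                && sourcePartitionKeys.equals(exchangeHashKeys);
    }

    public static void main(String[] args) {
        // A PARTITIONED BY (a, b) source feeding GROUP BY a, b: shuffle removable.
        System.out.println(canRemoveExchange(
                true, List.of("a", "b"), List.of("a", "b"))); // true
        // Hash keys differ from the partition keys: the shuffle must stay.
        System.out.println(canRemoveExchange(
                true, List.of("a", "b"), List.of("a"))); // false
    }
}
```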

Please let me know if that answers your questions or if you have any other
comments.

Regards,
Jeyhun


On Thu, Mar 14, 2024 at 8:10 AM Hang Ruan  wrote:

> Hi, Jeyhun.
>
> Thanks for the FLIP. Totally +1 for it.
>
> I have a question about the part `Additional option to disable this
> optimization`. Is this option a source configuration or a table
> configuration?
>
> Besides that, there is a little mistake if I do not understand wrongly.
> Should `Check if upstream_any is pre-partitioned data source AND contains
> the same partition keys as the source` be changed as `Check if upstream_any
> is pre-partitioned data source AND contains the same partition keys as
> downstream_any` ?
>
> Best,
> Hang
>
> Jeyhun Karimov  于2024年3月13日周三 21:11写道:
>
> > Hi Jane,
> >
> > Thanks for your comments.
> >
> >
> > 1. Concerning the `sourcePartitions()` method, the partition information
> > > returned during the optimization phase may not be the same as the
> > partition
> > > information during runtime execution. For long-running jobs, partitions
> > may
> > > be continuously created. Is this FLIP equipped to handle scenarios?
> >
> >
> > - Good point. This scenario is definitely supported.
> > Once a new partition is added, or in general, new splits are
> > discovered,
> > PartitionAwareSplitAssigner::addSplits(Collection
> > newSplits)
> > method will be called. Inside that method, we are able to detect if a
> split
> > belongs to existing partitions or there is a new partition.
> > Once a new partition is detected, we add it to our existing mapping. Our
> > mapping looks like Map>
> subtaskToPartitionAssignment,
> > where
> > it maps each source subtaskID to zero or more partitions.
> >
> > 2. Regarding the `RemoveRedundantShuffleRule` optimization rule, I
> > > understand that it is also necessary to verify whether the hash key
> > within
> > > the Exchange node is consistent with the partition key defined in the
> > table
> > > source that implements `SupportsPartitioning`.
> >
> >
> > - Yes, I overlooked that point; fixed. Actually, the rule is much more
> > complicated; I tried to simplify it in the FLIP. Good point.
> >
> >
> > 3. Could you elaborate on the desired physical plan and integration with
> > > `CompiledPlan` to enhance the overall functionality?
> >
> >
> > - For compiled plan, PartitioningSpec will be used, with a json tag
> > "Partitioning". As a result, in the compiled plan, the source operator
> will
> > have
> > "abilities" : [ { "type" : "Partitioning" } ] as part of the compiled
> plan.
> > More about the implementation details below:
> >
> > 
> > PartitioningSpec class
> > 
> > @JsonTypeName("Partitioning")
> > public final class PartitioningSpec extends SourceAbilitySpecBase {
> >  // some code here
> > @Override
> > public void apply(DynamicTableSource tableSource,
> SourceAbilityContext
> > context) {
> > if (tableSource instanceof SupportsPartitioning) {
> > ((SupportsPartitioning)
> tableSource).applyPartitionedRead();
> > } else {
> > throw new TableException(
> > String.format(
> > "%s does not support SupportsPartitioning.",
> > tableSource.getClass().getName()));
> > }
> > }
> >   // some code here
> > }
> >
> > 
> > SourceAbilitySpec class
> > 
> > @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include =
> > JsonTypeInfo.As.PROPERTY, property = "type")
> > @JsonSubTypes({
> > @JsonSubTypes.Type(value = FilterPushDownSpec.class),
> > @JsonSubTypes.Type(value = LimitPushDownSpec.class),
> > @JsonSubTypes.Type(value = PartitionPushDownSpec.class),
> > @JsonSubTypes.Type(value = ProjectPushDownSpec.class),
> > @JsonSubTypes.Type(value = ReadingMetadataSpec.class),
> > @JsonSubTypes.Type(value = WatermarkPushDownSpec.class),
> > @JsonSubTypes.Type(value = SourceWatermarkSpec.class),
> > @JsonSubTypes.Type(value = AggregatePushDownSpec.class),
> > +  @JsonSubTypes.Type(value = PartitioningSpec.class)
>  //
> > new added
> >
> >
> >
> > Please let me know if that answers your questions or if you have other
> > comments.
> >
> > Regards,
> > Jeyhun
> >
> >
> > On Tue, Mar 12, 2024 at 8:56 AM 

Re: Additional metadata available for Kafka serdes

2024-03-14 Thread Balint Bene
Hi David!

I think passing the headers as a map (as opposed to
ConsumerRecord/ProducerRecord) is a great idea that should work. That way
the core Flink package doesn't have Kafka dependencies, it seems like
they're meant to be decoupled anyway. The one bonus that using the Record
objects has is that it also provides the topic name, which is a part of the
signature (but usually unused) for Kafka serdes. Do you think it's
worthwhile to also have the topic name included in the signature along with
the map?
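
Roughly, the backward-compatible shape discussed here could look like the sketch below. The interface is a simplified stand-in for illustration, not Flink's actual DeserializationSchema API:

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of exposing Kafka record headers to a format in a
// backward-compatible way: a new default method accepts the headers as a
// Map and falls back to the existing header-less method, so current
// formats keep working unchanged. This interface is a stand-in for
// illustration, not Flink's real DeserializationSchema.
public class HeaderAwareDeserializationSketch {

    interface DeserializationSchema<T> {
        T deserialize(byte[] message) throws IOException;

        // New default method: formats that need headers (e.g. one reading
        // a schema ID from a header) would override this one.
        default T deserialize(byte[] message, Map<String, byte[]> headers)
                throws IOException {
            return deserialize(message); // legacy formats ignore headers
        }
    }

    static class UpperCaseSchema implements DeserializationSchema<String> {
        @Override
        public String deserialize(byte[] message) {
            return new String(message).toUpperCase();
        }
    }

    public static void main(String[] args) throws IOException {
        DeserializationSchema<String> schema = new UpperCaseSchema();
        // The header-aware call falls through to the legacy method.
        String result = schema.deserialize("hello".getBytes(),
                Collections.singletonMap("schema-id", new byte[] {1}));
        System.out.println(result); // HELLO
    }
}
```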

Happy to test things out, provide feedback. I'm not working on an Apicurio
format myself, but the use case is very similar.

Thanks,
Balint

On Thu, Mar 14, 2024 at 12:41 PM David Radley 
wrote:

> Hi ,
> I am currently prototyping an Avro Apicurio format that I hope to raise as
> a FLIP very soon (hopefully by early next week). In my prototyping, I am
> passing through the Kafka headers content as a map to the
> DeserializationSchema and have extended the SerializationSchema to pass
> back headers. I am using new default methods in the interface so as to be
> backwardly compatible. I have the deserialise working and the serialise is
> close.
>
> We did consider trying to use the Apicurio deser libraries but this is
> tricky due to the way the code is split.
>
> Let me know what you think – I hope this approach will meet your needs,
> Kind regards, David.
>
> From: Balint Bene 
> Date: Tuesday, 12 March 2024 at 22:18
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Additional metadata available for Kafka serdes
> Hello! Looking to get some guidance for a problem around the Flink formats
> used for Kafka.
>
> Flink currently uses common serdes interfaces across all formats. However,
> some data formats used in Kafka require headers for serdes.  It's the same
> problem for serialization and deserialization, so I'll just use
> DynamicKafkaDeserialationSchema
> <
> https://github.com/Shopify/shopify-flink-connector-kafka/blob/979791c4c71e944c16c51419cf9a84aa1f8fea4c/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/DynamicKafkaDeserializationSchema.java#L130
> >
> as
> an example. It has access to the Kafka record headers, but it can't pass
> them to the DeserializationSchema
> <
> https://github.com/apache/flink/blob/94b55d1ae61257f21c7bb511660e7497f269abc7/flink-core/src/main/java/org/apache/flink/api/common/serialization/DeserializationSchema.java#L81
> >
> implemented
> by the format since the interface is generic.
>
> If it were possible to pass the headers, then open source formats such as
> Apicurio could be supported. Unlike the Confluent formats which store the
> metadata (schema ID) appended to the serialized bytes in the key and value,
> the Apicurio formats store their metadata in the record headers.
>
> I have bandwidth to work on this, but it would be great to have direction
> from the community. I have a simple working prototype that's able to load a
> custom version of the format with a modified interface that can accept the
> headers (I just put the entire Apache Kafka ConsumerRecord
> <
> https://kafka.apache.org/36/javadoc/org/apache/kafka/clients/consumer/ConsumerRecord.html
> >
> /ProducerRecord
> <
> https://kafka.apache.org/36/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html
> >
> for simplicity). The issues I foresee is that the class-loader
> <
> https://github.com/apache/flink/blob/94b55d1ae61257f21c7bb511660e7497f269abc7/flink-core/src/main/java/org/apache/flink/core/classloading/ComponentClassLoader.java
> >
> exists in the Flink repo along with interfaces for the formats, but these
> changes are specific to Kafka. This solution could require migrating
> formats to the Flink-connector-kafka repo which is a decent amount of work.
>
> Feedback is appreciated!
> Thanks
> Balint
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


Efficient Use of Disk Resources (Only Attach Disks to Stateful Task Managers)

2024-03-14 Thread Kevin Lam
Hi all,

I was wondering if anyone has any ideas or advice when it comes to being
efficient about the use of disks with the RocksDB StateBackend.

In general not all operators in a Flink Job will be stateful and require
persistent disks to use with RocksDB, and so any stateless operators
running on TaskManagers with disks will not use the disk at all, which is
wasteful.

I found the External Resources plugin documentation,
which seems interesting, but we are using the Flink SQL API, and only the
DataStream API is supported there.

Thanks in advance,
Kevin


Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster

2024-03-14 Thread Kevin Lam
Thanks for your response Gyula. Yes I understand, it doesn't really fit
nicely into the Kubernetes Operator pattern.

I do still wonder about the idea of supporting a feature where upon first
deploy, Flink Operator optionally (flag/param enabled) finds the most
recent snapshot (in a specified object storage URI) and uses that as the
initialSavepointPath to restore and run the Flink job. It doesn't require
being aware of clusters, or submitting resources to different clusters at
all, while still facilitating such fail overs.


On Wed, Mar 13, 2024 at 4:51 PM Gyula Fóra  wrote:

> Hey Kevin!
>
> The general mismatch I see here is that operators and resources are pretty
> cluster dependent. The operator itself is running in the same cluster so it
> feels out of scope to submit resources to different clusters, this doesn't
> really sound like what any Kubernetes Operator should do in general.
>
> To me this sounds more like a typical control plane feature that sits above
> different environments and operator instances. There are a lot of features
> like this, blue/green deployments also fall into this category in my head,
> but there are of course many many others.
>
> There may come a time when the Flink community decides to take on such a
> scope but it feels a bit too much at this point to try to standardize this.
>
> Cheers,
> Gyula
>
> On Wed, Mar 13, 2024 at 9:18 PM Kevin Lam 
> wrote:
>
> > Hi Max,
> >
> > It feels a bit hacky to need to back-up the resources directly from the
> > cluster, as opposed to being able to redeploy our checked-in k8s
> manifests
> > such that they failover correctly, but that makes sense to me and we can
> > look into this approach. Thanks for the suggestion!
> >
> > I'd still be interested in hearing the community's thoughts on if we can
> > support this in a more first-class way as part of the Apache Flink
> > Kubernetes Operator.
> >
> > Thanks,
> > Kevin
> >
> > On Wed, Mar 13, 2024 at 9:41 AM Maximilian Michels 
> wrote:
> >
> > > Hi Kevin,
> > >
> > > Theoretically, as long as you move over all k8s resources, failover
> > > should work fine on the Flink and Flink Operator side. The tricky part
> > > is the handover. You will need to backup all resources from the old
> > > cluster, shutdown the old cluster, then re-create them on the new
> > > cluster. The operator deployment and the Flink cluster should then
> > > recover fine (assuming that high availability has been configured and
> > > checkpointing is done to persistent storage available in the new
> > > cluster). The operator state / Flink state is actually kept in
> > > ConfigMaps which would be part of the resource dump.
> > >
> > > This method has proven to work in case of Kubernetes cluster upgrades.
> > > Moving to an entirely new cluster is a bit more involved but exporting
> > > all resource definitions and re-importing them into the new cluster
> > > should yield the same result as long as the checkpoint paths do not
> > > change.
> > >
> > > Probably something worth trying :)
> > >
> > > -Max
> > >
> > >
> > >
> > > On Wed, Mar 6, 2024 at 9:09 PM Kevin Lam  >
> > > wrote:
> > > >
> > > > Another thought could be modifying the operator to have a behaviour
> > where
> > > > upon first deploy, it optionally (flag/param enabled) finds the most
> > > recent
> > > > snapshot and uses that as the initialSavepointPath to restore and run
> > the
> > > > Flink job.
> > > >
> > > > On Wed, Mar 6, 2024 at 2:07 PM Kevin Lam 
> > wrote:
> > > >
> > > > > Hi there,
> > > > >
> > > > > We use the Flink Kubernetes Operator, and I am investigating how we
> > can
> > > > > easily support failing over a FlinkDeployment from one Kubernetes
> > > Cluster
> > > > > to another in the case of an outage that requires us to migrate a
> > large
> > > > > number of FlinkDeployments from one K8s cluster to another.
> > > > >
> > > > > I understand one way to do this is to set `initialSavepoint` on all
> > the
> > > > > FlinkDeployments to the most recent/appropriate snapshot so the
> jobs
> > > > > continue from where they left off, but for a large number of jobs,
> > this
> > > > > would be quite a bit of manual labor.
> > > > >
> > > > > Do others have an approach they are using? Any advice?
> > > > >
> > > > > Could this be something addressed in a future FLIP? Perhaps we
> could
> > > store
> > > > > some kind of metadata in object storage so that the Flink
> Kubernetes
> > > > > Operator can restore a FlinkDeployment from where it left off, even
> > if
> > > the
> > > > > job is shifted to another Kubernetes Cluster.
> > > > >
> > > > > Looking forward to hearing folks' thoughts!
> > > > >
> > >
> >
>


Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Yubin Li
Hi folks,

Thank you all for your input, it really makes sense to introduce missing
catalog-related SQL syntaxes under this FLIP, and I have changed the
title of the doc to "FLIP-436: Introduce Catalog-related Syntax".

After comprehensive consideration, the following syntaxes should be
introduced; more suggestions are welcome :)

> 1. SHOW CREATE CATALOG catalog_name
> 2. DESCRIBE/DESC CATALOG catalog_name
> 3. ALTER CATALOG catalog_name SET (key1=val1, key2=val2, ...)

Regarding the `alter catalog` syntax format, I refer to the current design
of `alter database`.

Given that CatalogManager already provides catalog operations such as
create, get, and unregister, and in order to facilitate future
implementation of audit tracking, I propose to introduce an
alterCatalog() function in CatalogManager. WDYT?

Please see details in FLIP doc [1] .

Best,
Yubin

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax


On Thu, Mar 14, 2024 at 11:07 PM Leonard Xu  wrote:

> Hi Yubin,
>
> Thanks for driving the discussion, generally +1 for the FLIP, big +1 to
> finalize the whole catalog syntax story in one FLIP,
> thus I want to jump into the discussion again after you completed the
> whole catalog syntax story.
>
> Best,
> Leonard
>
>
>
> > 2024年3月14日 下午8:39,Roc Marshal  写道:
> >
> > Hi, Yubin
> >
> >
> > Thank you for initiating this discussion! +1 for the proposal.
> >
> >
> >
> >
> >
> >
> > Best,
> > Yuepeng Pan
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > At 2024-03-14 18:57:35, "Ferenc Csaky" 
> wrote:
> >> Hi Yubin,
> >>
> >> Thank you for initiating this discussion! +1 for the proposal.
> >>
> >> I also think it makes sense to group the missing catalog related
> >> SQL syntaxes under this FLIP.
> >>
> >> Looking forward to these features!
> >>
> >> Best,
> >> Ferenc
> >>
> >>
> >>
> >>
> >> On Thursday, March 14th, 2024 at 08:31, Jane Chan <
> qingyue@gmail.com> wrote:
> >>
> >>>
> >>>
> >>> Hi Yubin,
> >>>
> >>> Thanks for leading the discussion. I'm +1 for the FLIP.
> >>>
> >>> As Jark said, it's a good opportunity to enhance the syntax for Catalog
> >>> from a more comprehensive perspective. So, I suggest expanding the
> scope of
> >>> this FLIP by focusing on the mechanism instead of one use case to
> enhance
> >>> the overall functionality. WDYT?
> >>>
> >>> Best,
> >>> Jane
> >>>
> >>> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com
> wrote:
> >>>
>  Hi, Yubin.
> 
>  Thanks for the FLIP. +1 for it.
> 
>  Best,
>  Hang
> 
>  Yubin Li lyb5...@gmail.com 于2024年3月14日周四 10:15写道:
> 
> > Hi Jingsong, Feng, and Jeyhun
> >
> > Thanks for your support and feedback!
> >
> >> However, could we add a new method `getCatalogDescriptor()` to
> >> CatalogManager instead of directly exposing CatalogStore?
> >
> > Good point. Besides the audit tracking issue, the proposed feature
> > only requires the `getCatalogDescriptor()` function. Exposing components
> > with excessive functionality would bring unnecessary risks; I have made
> > modifications in the FLIP doc [1]. Thanks, Feng :)
> >
> >> Showing the SQL parser implementation in the FLIP for the SQL syntax
> >> might be a bit confusing. Also, the formal definition is missing for
> >> this SQL clause.
> >
> > Thank Jeyhun for pointing it out :) I have updated the doc [1] .
> >
> > [1]
> 
> 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> 
> > Best,
> > Yubin
> >
> > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov je.kari...@gmail.com
> > wrote:
> >
> >> Hi Yubin,
> >>
> >> Thanks for the proposal. +1 for it.
> >> I have one comment:
> >>
> >> I would like to see the SQL syntax for the proposed statement.
> Showing
> >> the
> >> SQL parser implementation in the FLIP
> >> for the SQL syntax might be a bit confusing. Also, the formal
> >> definition
> >> is
> >> missing for this SQL clause.
> >> Maybe something like [1] might be useful. WDYT?
> >>
> >> Regards,
> >> Jeyhun
> >>
> >> [1]
> 
> 
> https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
> 
> >> On Wed, Mar 13, 2024 at 3:28 PM Feng Jin jinfeng1...@gmail.com
> >> wrote:
> >>
> >>> Hi Yubin
> >>>
> >>> Thank you for initiating this FLIP.
> >>>
> >>> I have just one minor question:
> >>>
> >>> I noticed that we added a new function `getCatalogStore` to expose
> >>> CatalogStore, and it seems fine.
> >>> However, could we add a new method `getCatalogDescriptor()` to
> >>> CatalogManager instead of directly exposing CatalogStore?
> >>> By only providing the `getCatalogDescriptor()` interface, it may be
> >>> easier
> >>> for us to implement audit tracking in 

Re: [DISCUSS] FLIP-437: Support ML Models in Flink SQL

2024-03-14 Thread Hao Li
Hi Jark,

Thanks for the pointers. It's very helpful.

1. It looks like `tumble` and `hopping` are keywords in the Calcite parser,
and the syntax `cumulate(Table my_table, ...)` needs to get table
information from the catalog somewhere for type validation etc. Can you
send me some pointers to where the function gets the table information?
2. The ideal syntax for model function I think would be `ML_PREDICT(MODEL
, {table  | (query_stmt) })`. I think with special
handling of the `ML_PREDICT` function in parser/planner, maybe we can do
this like window functions. But to support `MODEL` keyword, we need calcite
parser change I guess. Also, is it possible to support  in
window functions in addition to table?

For the Redshift syntax, I'm not sure about the purpose of defining the
function name together with the model. Is it to define the function
input/output schema? We have the schema in our create model syntax, and
`ML_PREDICT` can handle it by fetching the model definition. I think our
syntax, with a single generic prediction function, is more concise. I also
did some research, and this is the syntax used by Databricks `ai_query` [1],
Snowflake `predict` [2], and Azure ML `predict` [3].

[1]:
https://docs.databricks.com/en/sql/language-manual/functions/ai_query.html
[2]:
https://github.com/Snowflake-Labs/sfguide-intro-to-machine-learning-with-snowpark-ml-for-python/blob/main/3_snowpark_ml_model_training_inference.ipynb?_fsi=sksXUwQ0
[3]:
https://learn.microsoft.com/en-us/sql/machine-learning/tutorials/quickstart-python-train-score-model?view=azuresqldb-mi-current
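
As a rough sketch of what "getting the model definition" for type inference could look like (all names here are hypothetical, not actual Flink planner classes):

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of catalog-based type inference for ML_PREDICT:
// the planner resolves the model's declared input/output schema from the
// catalog using the model name, then types the function call accordingly.
// None of these classes exist in Flink; they only illustrate the idea.
public class ModelTypeInferenceSketch {

    record ModelSchema(List<String> inputTypes, List<String> outputTypes) {}

    static ModelSchema inferPredictTypes(
            Map<String, ModelSchema> catalog, String modelName) {
        ModelSchema schema = catalog.get(modelName);
        if (schema == null) {
            throw new IllegalArgumentException("Unknown model: " + modelName);
        }
        return schema;
    }

    public static void main(String[] args) {
        Map<String, ModelSchema> catalog = Map.of(
                "my_model",
                new ModelSchema(List.of("INT", "STRING"), List.of("DOUBLE")));
        // ML_PREDICT('my_model', col1, col2) would be typed from this schema.
        System.out.println(inferPredictTypes(catalog, "my_model").outputTypes());
    }
}
```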

Thanks,
Hao

On Wed, Mar 13, 2024 at 8:57 PM Jark Wu  wrote:

> Hi Mingge, Hao,
>
> Thanks for your replies.
>
> > PTF is actually the ideal approach for model functions, and we do have
> the plans to use PTF for
> all model functions (including prediction, evaluation etc..) once the PTF
> is supported in FlinkSQL
> confluent extension.
>
> It sounds that PTF is the ideal way and table function is a temporary
> solution which will be dropped in the future.
> I'm not sure whether we can implement it using PTF in Flink SQL. But we
> have implemented window
> functions using PTF[1]. And introduced a new window function (called
> CUMULATE[2]) in Flink SQL based
> on this. I think it might work to use PTF and implement model function
> syntax like this:
>
> SELECT * FROM TABLE(ML_PREDICT(
>   TABLE my_table,
>   my_model,
>   col1,
>   col2
> ));
>
> Besides, did you consider following the way of AWS Redshift which defines
> model function with the model itself together?
> IIUC, a model is a black-box which defines input parameters and output
> parameters which can be modeled into functions.
>
>
> Best,
> Jark
>
> [1]:
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-tvf/#session
> [2]:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function#FLIP145:SupportSQLwindowingtablevaluedfunction-CumulatingWindows
> [3]:
>
> https://github.com/aws-samples/amazon-redshift-ml-getting-started/blob/main/use-cases/bring-your-own-model-remote-inference/README.md#create-model
>
>
>
>
> On Wed, 13 Mar 2024 at 15:00, Hao Li  wrote:
>
> > Hi Jark,
> >
> > Thanks for your questions. These are good questions!
> >
> > 1. The polymorphism table function I was referring to takes a table as
> > input and outputs a table. So the syntax would be like
> > ```
> > SELECT * FROM ML_PREDICT('model', (SELECT * FROM my_table))
> > ```
> > As far as I know, this is not supported yet on Flink. So before it's
> > supported, one option for the predict function is using table function
> > which can output multiple columns
> > ```
> > SELECT * FROM my_table, LATERAL VIEW (ML_PREDICT('model', col1, col2))
> > ```
> >
> > 2. Good question. Type inference is hard for the `ML_PREDICT` function
> > because it takes a model name string as input. I can think of three ways
> of
> > doing type inference for it.
> >1). Treat `ML_PREDICT` function as something special and during sql
> > parsing or planning time, if it's encountered, we need to look up the
> model
> > from the first argument which is a model name from catalog. Then we can
> > infer the input/output for the function.
> >2). We can define a `model` keyword and use that in the predict
> function
> > to indicate the argument refers to a model. So it's like
> `ML_PREDICT(model
> > 'my_model', col1, col2))`
> >3). We can create a special type of table function maybe called
> > `ModelFunction` which can resolve the model type inference by special
> > handling it during parsing or planning time.
> > 1) is hacky, 2) isn't supported in Flink for function, 3) might be a
> > good option.
> >
> > 3. I sketched the `ML_PREDICT` function for inference. But there are
> > limitations of the function mentioned in 1 and 2. So maybe we don't need
> to
> > introduce them as built-in functions until polymorphism table function
> and
> > we can properly deal with type inference.
> > After that, 

[VOTE] Apache Flink Kubernetes Operator Release 1.8.0, release candidate #1

2024-03-14 Thread Maximilian Michels
Hi everyone,

Please review and vote on the release candidate #1 for the version
1.8.0 of the Apache Flink Kubernetes Operator, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

**Release Overview**

As an overview, the release consists of the following:
a) Kubernetes Operator canonical source distribution (including the
Dockerfile), to be deployed to the release repository at
dist.apache.org
b) Kubernetes Operator Helm Chart to be deployed to the release
repository at dist.apache.org
c) Maven artifacts to be deployed to the Maven Central Repository
d) Docker image to be pushed to Dockerhub

**Staging Areas to Review**

The staging areas containing the above mentioned artifacts are as
follows, for your review:
* All artifacts for (a), (b) can be found in the corresponding dev
repository at dist.apache.org [1]
* All artifacts for (c) can be found at the Apache Nexus Repository [2]
* The docker image for (d) is staged on github [3]

All artifacts are signed with the key
DA359CBFCEB13FC302A8793FB655E6F7693D5FDE [4]

Other links for your review:
* JIRA release notes [5]
* source code tag "release-1.8.0-rc1" [6]
* PR to update the website Downloads page to include Kubernetes
Operator links [7]

**Vote Duration**

The voting time will run for at least 72 hours. It is adopted by
majority approval, with at least 3 PMC affirmative votes.

**Note on Verification**

You can follow the basic verification guide here [8]. Note that you
don't need to verify everything yourself, but please make note of what
you have tested together with your +1/-1 vote.
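As a minimal local sketch of what the checksum part of that verification boils
down to (the artifact name below is a stand-in; the real files come from the
dist.apache.org links above):

```shell
# Create a stand-in artifact, produce its checksum file, then verify it.
# This is the same mechanism used for the real *-src.tgz.sha512 files.
printf 'release bits' > flink-kubernetes-operator-1.8.0-src.tgz
sha512sum flink-kubernetes-operator-1.8.0-src.tgz > flink-kubernetes-operator-1.8.0-src.tgz.sha512
sha512sum -c flink-kubernetes-operator-1.8.0-src.tgz.sha512
```

Signature checking additionally needs the KEYS file [4] imported into gpg
(`gpg --import KEYS` followed by `gpg --verify` on the `.asc` file).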

Thanks,
Max

[1] 
https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1/
[2] https://repository.apache.org/content/repositories/orgapacheflink-1710/
[3] ghcr.io/apache/flink-kubernetes-operator:8938658
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12353866=12315522
[6] https://github.com/apache/flink-kubernetes-operator/tree/release-1.8.0-rc1
[7] https://github.com/apache/flink-web/pull/726
[8] 
https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Kubernetes+Operator+Release


Re: Additional metadata available for Kafka serdes

2024-03-14 Thread David Radley
Hi,
I am currently prototyping an Avro Apicurio format that I hope to raise as a
FLIP very soon (hopefully by early next week). In my prototyping, I am passing
the Kafka header content through as a map to the DeserializationSchema, and I
have extended the SerializationSchema to pass headers back. I am using new
default methods in the interfaces so as to be backward compatible. I have
deserialization working, and serialization is close.
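As a rough illustration of the backward-compatibility trick (the names and
signatures here are hypothetical, not the actual Flink interfaces): a new
default overload receives the headers and falls back to the existing method,
so existing format implementations compile and behave unchanged.

```java
import java.io.IOException;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical sketch of a header-aware deserialization interface.
interface HeaderAwareDeserializationSchema<T> extends Serializable {
    T deserialize(byte[] message) throws IOException;

    // New default method: legacy implementations simply ignore the headers,
    // which keeps the change backward compatible.
    default T deserialize(byte[] message, Map<String, byte[]> headers) throws IOException {
        return deserialize(message);
    }
}

public class HeaderAwareDemo {
    public static void main(String[] args) throws IOException {
        // A "legacy" implementation that only knows about the old method.
        HeaderAwareDeserializationSchema<String> legacy =
                message -> new String(message, StandardCharsets.UTF_8);
        // The new header-passing call site still works with it.
        String out = legacy.deserialize(
                "payload".getBytes(StandardCharsets.UTF_8),
                Map.of("schemaId", new byte[] {42}));
        System.out.println(out);
    }
}
```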

We did consider trying to use the Apicurio deser libraries but this is tricky 
due to the way the code is split.

Let me know what you think – I hope this approach will meet your needs,
Kind regards, David.

From: Balint Bene 
Date: Tuesday, 12 March 2024 at 22:18
To: dev@flink.apache.org 
Subject: [EXTERNAL] Additional metadata available for Kafka serdes
Hello! Looking to get some guidance for a problem around the Flink formats
used for Kafka.

Flink currently uses common serdes interfaces across all formats. However,
some data formats used in Kafka require headers for serdes.  It's the same
problem for serialization and deserialization, so I'll just use
DynamicKafkaDeserializationSchema as an example. It has access to the Kafka
record headers, but it can't pass them to the DeserializationSchema
implemented by the format, since the interface is generic.

If it were possible to pass the headers, then open source formats such as
Apicurio could be supported. Unlike the Confluent formats which store the
metadata (schema ID) appended to the serialized bytes in the key and value,
the Apicurio formats store their metadata in the record headers.

I have bandwidth to work on this, but it would be great to have direction
from the community. I have a simple working prototype that's able to load a
custom version of the format with a modified interface that can accept the
headers (I just pass the entire Apache Kafka ConsumerRecord/ProducerRecord for
simplicity). The issue I foresee is that the class loader exists in the Flink
repo along with the interfaces for the formats, but these changes are specific
to Kafka. This solution could require migrating formats to the
flink-connector-kafka repo, which is a decent amount of work.

Feedback is appreciated!
Thanks
Balint

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


ARM flink docker image

2024-03-14 Thread Yang LI
Dear Flink Community,

Do you know if we have a tested ARM-based Flink Docker image somewhere? I
think we can already run Flink locally on an ARM MacBook, but we don't have
an ARM-specific Docker image yet.

Regards,
Yang LI


Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Leonard Xu
Hi Yubin,

Thanks for driving the discussion. Generally +1 for the FLIP, and a big +1 for
finalizing the whole catalog syntax story in one FLIP. I will jump into the
discussion again after you complete the whole catalog syntax story.
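For readers following along, the statement under discussion mirrors the
existing `SHOW CREATE TABLE`; a sketch (catalog name, type, and output shape
are illustrative):

```sql
CREATE CATALOG my_catalog WITH ('type' = 'generic_in_memory');

SHOW CREATE CATALOG my_catalog;
-- expected to print the reusable DDL, e.g.:
-- CREATE CATALOG `my_catalog` WITH ('type' = 'generic_in_memory')
```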

Best,
Leonard



> On Mar 14, 2024 at 8:39 PM, Roc Marshal  wrote:
> 
> Hi, Yubin
> 
> 
> Thank you for initiating this discussion! +1 for the proposal.
> 
> 
> 
> 
> 
> 
> Best,
> Yuepeng Pan
> 
> 
> 
> 
> 
> 
> 
> 
> 
> At 2024-03-14 18:57:35, "Ferenc Csaky"  wrote:
>> Hi Yubin,
>> 
>> Thank you for initiating this discussion! +1 for the proposal.
>> 
>> I also think it makes sense to group the missing catalog related
>> SQL syntaxes under this FLIP.
>> 
>> Looking forward to these features!
>> 
>> Best,
>> Ferenc
>> 
>> 
>> 
>> 
>> On Thursday, March 14th, 2024 at 08:31, Jane Chan  
>> wrote:
>> 
>>> 
>>> 
>>> Hi Yubin,
>>> 
>>> Thanks for leading the discussion. I'm +1 for the FLIP.
>>> 
>>> As Jark said, it's a good opportunity to enhance the syntax for Catalog
>>> from a more comprehensive perspective. So, I suggest expanding the scope of
>>> this FLIP by focusing on the mechanism instead of one use case to enhance
>>> the overall functionality. WDYT?
>>> 
>>> Best,
>>> Jane
>>> 
>>> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com wrote:
>>> 
 Hi, Yubin.
 
 Thanks for the FLIP. +1 for it.
 
 Best,
 Hang
 
Yubin Li lyb5...@gmail.com wrote on Thu, Mar 14, 2024 at 10:15:
 
> Hi Jingsong, Feng, and Jeyhun
> 
> Thanks for your support and feedback!
> 
>> However, could we add a new method `getCatalogDescriptor()` to
>> CatalogManager instead of directly exposing CatalogStore?
> 
> Good point, Besides the audit tracking issue, The proposed feature
> only requires `getCatalogDescriptor()` function. Exposing components
> with excessive functionality will bring unnecessary risks, I have made
> modifications in the FLIP doc [1]. Thank Feng :)
> 
>> Showing the SQL parser implementation in the FLIP for the SQL syntax
>> might be a bit confusing. Also, the formal definition is missing for
>> this SQL clause.
> 
> Thank Jeyhun for pointing it out :) I have updated the doc [1] .
> 
> [1]
 
 https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
 
> Best,
> Yubin
> 
> On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov je.kari...@gmail.com
> wrote:
> 
>> Hi Yubin,
>> 
>> Thanks for the proposal. +1 for it.
>> I have one comment:
>> 
>> I would like to see the SQL syntax for the proposed statement. Showing
>> the
>> SQL parser implementation in the FLIP
>> for the SQL syntax might be a bit confusing. Also, the formal
>> definition
>> is
>> missing for this SQL clause.
>> Maybe something like [1] might be useful. WDYT?
>> 
>> Regards,
>> Jeyhun
>> 
>> [1]
 
 https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
 
>> On Wed, Mar 13, 2024 at 3:28 PM Feng Jin jinfeng1...@gmail.com
>> wrote:
>> 
>>> Hi Yubin
>>> 
>>> Thank you for initiating this FLIP.
>>> 
>>> I have just one minor question:
>>> 
>>> I noticed that we added a new function `getCatalogStore` to expose
>>> CatalogStore, and it seems fine.
>>> However, could we add a new method `getCatalogDescriptor()` to
>>> CatalogManager instead of directly exposing CatalogStore?
>>> By only providing the `getCatalogDescriptor()` interface, it may be
>>> easier
>>> for us to implement audit tracking in CatalogManager in the future.
>>> WDYT ?
>>> Although we have only collected some modified events at the
>>> moment.[1]
>>> 
>>> [1].
 
 https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
 
>>> Best,
>>> Feng
>>> 
>>> On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li jingsongl...@gmail.com
>>> wrote:
>>> 
 +1 for this.
 
 We are missing a series of catalog related syntaxes.
 Especially after the introduction of catalog store. [1]
 
 [1]
 
 https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
 
 Best,
 Jingsong
 
 On Wed, Mar 13, 2024 at 5:09 PM Yubin Li lyb5...@gmail.com
 wrote:
 
> Hi devs,
> 
> I'd like to start a discussion about FLIP-436: Introduce "SHOW
> CREATE
> CATALOG" Syntax [1].
> 
> At present, the `SHOW CREATE TABLE` statement provides strong
> support
> for
> users to easily
> reuse created tables. However, despite the increasing importance
> of the

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Jiabao Sun
+1 (non-binding)

- Verified signatures and checksums
- Verified that source does not contain binaries
- Build source code successfully
- Reviewed the web PR
- Reviewed the release note

Best,
Jiabao

Leonard Xu  wrote on Thu, Mar 14, 2024 at 23:01:

>
> +1 (binding)
>
> - verified signatures
> - verified hashsums
> - checked Github release tag
> - started SQL Client, used MySQL CDC connector to read records from
> database , the result is expected
> - checked Jira issues for 1.19.0 and discussed with RMs that  FLINK-29114
> won’t block this RC
> - checked release notes
> - reviewed the web PR
>
> Best,
> Leonard
>
> > On Mar 14, 2024 at 9:36 PM, Sergey Nuyanzin  wrote:
> >
> > +1 (non-binding)
> >
> > - Checked the pre-built jars are generated with jdk8
> > - Verified signature and checksum
> > - Verified no binary in source
> > - Verified source code tag
> > - Reviewed release note
> > - Reviewed web PR
> > - Built from source
> > - Run a simple job successfully
> >
> > On Thu, Mar 14, 2024 at 2:21 PM Martijn Visser  >
> > wrote:
> >
> >> +1 (binding)
> >>
> >> - Validated hashes
> >> - Verified signature
> >> - Verified that no binaries exist in the source archive
> >> - Build the source with Maven via mvn clean install -Pcheck-convergence
> >> -Dflink.version=1.19.0
> >> - Verified licenses
> >> - Verified web PR
> >> - Started a cluster and the Flink SQL client, successfully read and
> wrote
> >> with the Kafka connector to Confluent Cloud with AVRO and Schema
> Registry
> >> enabled
> >>
> >> On Thu, Mar 14, 2024 at 1:32 PM gongzhongqiang <
> gongzhongqi...@apache.org>
> >> wrote:
> >>
> >>> +1 (non-binding)
> >>>
> >>> - Verified no binary files in source code
> >>> - Verified signature and checksum
> >>> - Build source code and run a simple job successfully
> >>> - Reviewed the release announcement PR
> >>>
> >>> Best,
> >>>
> >>> Zhongqiang Gong
> >>>
Ferenc Csaky  wrote on Thu, Mar 14, 2024 at 20:07:
> >>>
>  +1 (non-binding)
> 
>  - Verified checksum and signature
>  - Verified no binary in src
>  - Built from src
>  - Reviewed release note PR
>  - Reviewed web PR
>  - Tested a simple datagen query and insert to blackhole sink via SQL
>  Gateway
> 
>  Best,
>  Ferenc
> 
> 
> 
> 
>  On Thursday, March 14th, 2024 at 12:14, Jane Chan <
> >> qingyue@gmail.com
> 
>  wrote:
> 
> >
> >
> > Hi Lincoln,
> >
> > Thank you for the prompt response and the effort to provide clarity
> >> on
>  this
> > matter.
> >
> > Best,
> > Jane
> >
> > On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com
>  wrote:
> >
> >> Hi Jane,
> >>
> >> Thank you for raising this question. I saw the discussion in the
> >> Jira
> >> (include Matthias' point)
> >> and sought advice from several PMCs (including the previous RMs),
> >> the
> >> majority of people
> >> are in favor of merging the bugfix into the release branch even
> >>> during
>  the
> >> release candidate
> >> (RC) voting period, so we should accept all bugfixes (unless there
> >>> is a
> >> specific community
> >> rule preventing it).
> >>
> >> Thanks again for contributing to the community!
> >>
> >> Best,
> >> Lincoln Lee
> >>
> >> Matthias Pohl matthias.p...@aiven.io.invalid 于2024年3月14日周四
> >> 17:50写道:
> >>
> >>> Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
> >>> identify
> >>> a concurrency issue in the JobMaster shutdown logic which seems
> >> to
>  be in
> >>> the code for quite some time. I created a PR fixing the issue
> >>> hoping
>  that
> >>> the test instability is resolved with it.
> >>>
> >>> The concurrency issue doesn't really explain why it only started
> >> to
> >>> appear
> >>> recently in a specific CI setup (GHA with AdaptiveScheduler).
> >> There
>  is no
> >>> hint in the git history indicating that it's caused by some newly
> >>> introduced change. That is why I wouldn't make FLINK-34227 a
> >> reason
>  to
> >>> cancel rc2. Instead, the fix can be provided in subsequent patch
> >>> releases.
> >>>
> >>> Matthias
> >>>
> >>> [1] https://issues.apache.org/jira/browse/FLINK-34227
> >>>
> >>> On Thu, Mar 14, 2024 at 8:49 AM Jane Chan qingyue@gmail.com
>  wrote:
> >>>
>  Hi Yun, Jing, Martijn and Lincoln,
> 
>  I'm seeking guidance on whether merging the bugfix[1][2] at
> >> this
>  stage
>  is
>  appropriate. I want to ensure that the actions align with the
>  current
>  release process and do not disrupt the ongoing preparations.
> 
>  [1] https://issues.apache.org/jira/browse/FLINK-29114
>  [2] https://github.com/apache/flink/pull/24492
> 
>  Best,
>  Jane
> 
>  On Thu, Mar 14, 2024 at 1:33 PM Yun Tang 

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Leonard Xu

+1 (binding)

- verified signatures
- verified hashsums
- checked Github release tag 
- started SQL Client, used MySQL CDC connector to read records from the
database, and the result is as expected
- checked Jira issues for 1.19.0 and discussed with RMs that FLINK-29114 won’t
block this RC
- checked release notes
- reviewed the web PR 

Best,
Leonard

> On Mar 14, 2024 at 9:36 PM, Sergey Nuyanzin  wrote:
> 
> +1 (non-binding)
> 
> - Checked the pre-built jars are generated with jdk8
> - Verified signature and checksum
> - Verified no binary in source
> - Verified source code tag
> - Reviewed release note
> - Reviewed web PR
> - Built from source
> - Run a simple job successfully
> 
> On Thu, Mar 14, 2024 at 2:21 PM Martijn Visser 
> wrote:
> 
>> +1 (binding)
>> 
>> - Validated hashes
>> - Verified signature
>> - Verified that no binaries exist in the source archive
>> - Build the source with Maven via mvn clean install -Pcheck-convergence
>> -Dflink.version=1.19.0
>> - Verified licenses
>> - Verified web PR
>> - Started a cluster and the Flink SQL client, successfully read and wrote
>> with the Kafka connector to Confluent Cloud with AVRO and Schema Registry
>> enabled
>> 
>> On Thu, Mar 14, 2024 at 1:32 PM gongzhongqiang 
>> wrote:
>> 
>>> +1 (non-binding)
>>> 
>>> - Verified no binary files in source code
>>> - Verified signature and checksum
>>> - Build source code and run a simple job successfully
>>> - Reviewed the release announcement PR
>>> 
>>> Best,
>>> 
>>> Zhongqiang Gong
>>> 
Ferenc Csaky  wrote on Thu, Mar 14, 2024 at 20:07:
>>> 
 +1 (non-binding)
 
 - Verified checksum and signature
 - Verified no binary in src
 - Built from src
 - Reviewed release note PR
 - Reviewed web PR
 - Tested a simple datagen query and insert to blackhole sink via SQL
 Gateway
 
 Best,
 Ferenc
 
 
 
 
 On Thursday, March 14th, 2024 at 12:14, Jane Chan <
>> qingyue@gmail.com
 
 wrote:
 
> 
> 
> Hi Lincoln,
> 
> Thank you for the prompt response and the effort to provide clarity
>> on
 this
> matter.
> 
> Best,
> Jane
> 
> On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com
 wrote:
> 
>> Hi Jane,
>> 
>> Thank you for raising this question. I saw the discussion in the
>> Jira
>> (include Matthias' point)
>> and sought advice from several PMCs (including the previous RMs),
>> the
>> majority of people
>> are in favor of merging the bugfix into the release branch even
>>> during
 the
>> release candidate
>> (RC) voting period, so we should accept all bugfixes (unless there
>>> is a
>> specific community
>> rule preventing it).
>> 
>> Thanks again for contributing to the community!
>> 
>> Best,
>> Lincoln Lee
>> 
>> Matthias Pohl matthias.p...@aiven.io.invalid 于2024年3月14日周四
>> 17:50写道:
>> 
>>> Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
>>> identify
>>> a concurrency issue in the JobMaster shutdown logic which seems
>> to
 be in
>>> the code for quite some time. I created a PR fixing the issue
>>> hoping
 that
>>> the test instability is resolved with it.
>>> 
>>> The concurrency issue doesn't really explain why it only started
>> to
>>> appear
>>> recently in a specific CI setup (GHA with AdaptiveScheduler).
>> There
 is no
>>> hint in the git history indicating that it's caused by some newly
>>> introduced change. That is why I wouldn't make FLINK-34227 a
>> reason
 to
>>> cancel rc2. Instead, the fix can be provided in subsequent patch
>>> releases.
>>> 
>>> Matthias
>>> 
>>> [1] https://issues.apache.org/jira/browse/FLINK-34227
>>> 
>>> On Thu, Mar 14, 2024 at 8:49 AM Jane Chan qingyue@gmail.com
 wrote:
>>> 
 Hi Yun, Jing, Martijn and Lincoln,
 
 I'm seeking guidance on whether merging the bugfix[1][2] at
>> this
 stage
 is
 appropriate. I want to ensure that the actions align with the
 current
 release process and do not disrupt the ongoing preparations.
 
 [1] https://issues.apache.org/jira/browse/FLINK-29114
 [2] https://github.com/apache/flink/pull/24492
 
 Best,
 Jane
 
 On Thu, Mar 14, 2024 at 1:33 PM Yun Tang myas...@live.com
>> wrote:
 
> +1 (non-binding)
> 
> *
> Verified the signature and checksum.
> *
> Reviewed the release note PR
> *
> Reviewed the web announcement PR
> *
> Start a standalone cluster to submit the state machine
>> example,
 which
> works well.
> *
> Checked the pre-built jars are generated via JDK8
> *
> Verified the process profiler works well after setting
> rest.profiling.enabled: true
> 

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Jing Ge
+1 (not-binding)

- verified signature
- verified checksum
- verified that source distribution does not contain binaries
- built from source
- reviewed the web PR
- reviewed the release note
- checked the tag
- checked the repo

Best regards,
Jing

On Thu, Mar 14, 2024 at 2:37 PM Sergey Nuyanzin  wrote:

> +1 (non-binding)
>
> - Checked the pre-built jars are generated with jdk8
> - Verified signature and checksum
> - Verified no binary in source
> - Verified source code tag
> - Reviewed release note
> - Reviewed web PR
> - Built from source
> - Run a simple job successfully
>
> On Thu, Mar 14, 2024 at 2:21 PM Martijn Visser 
> wrote:
>
> > +1 (binding)
> >
> > - Validated hashes
> > - Verified signature
> > - Verified that no binaries exist in the source archive
> > - Build the source with Maven via mvn clean install -Pcheck-convergence
> > -Dflink.version=1.19.0
> > - Verified licenses
> > - Verified web PR
> > - Started a cluster and the Flink SQL client, successfully read and wrote
> > with the Kafka connector to Confluent Cloud with AVRO and Schema Registry
> > enabled
> >
> > On Thu, Mar 14, 2024 at 1:32 PM gongzhongqiang <
> gongzhongqi...@apache.org>
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > - Verified no binary files in source code
> > > - Verified signature and checksum
> > > - Build source code and run a simple job successfully
> > > - Reviewed the release announcement PR
> > >
> > > Best,
> > >
> > > Zhongqiang Gong
> > >
> > > Ferenc Csaky  wrote on Thu, Mar 14, 2024 at 20:07:
> > >
> > > >  +1 (non-binding)
> > > >
> > > > - Verified checksum and signature
> > > > - Verified no binary in src
> > > > - Built from src
> > > > - Reviewed release note PR
> > > > - Reviewed web PR
> > > > - Tested a simple datagen query and insert to blackhole sink via SQL
> > > > Gateway
> > > >
> > > > Best,
> > > > Ferenc
> > > >
> > > >
> > > >
> > > >
> > > > On Thursday, March 14th, 2024 at 12:14, Jane Chan <
> > qingyue@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > >
> > > > >
> > > > > Hi Lincoln,
> > > > >
> > > > > Thank you for the prompt response and the effort to provide clarity
> > on
> > > > this
> > > > > matter.
> > > > >
> > > > > Best,
> > > > > Jane
> > > > >
> > > > > On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com
> > > > wrote:
> > > > >
> > > > > > Hi Jane,
> > > > > >
> > > > > > Thank you for raising this question. I saw the discussion in the
> > Jira
> > > > > > (include Matthias' point)
> > > > > > and sought advice from several PMCs (including the previous RMs),
> > the
> > > > > > majority of people
> > > > > > are in favor of merging the bugfix into the release branch even
> > > during
> > > > the
> > > > > > release candidate
> > > > > > (RC) voting period, so we should accept all bugfixes (unless
> there
> > > is a
> > > > > > specific community
> > > > > > rule preventing it).
> > > > > >
> > > > > > Thanks again for contributing to the community!
> > > > > >
> > > > > > Best,
> > > > > > Lincoln Lee
> > > > > >
> > > > > > Matthias Pohl matthias.p...@aiven.io.invalid 于2024年3月14日周四
> > 17:50写道:
> > > > > >
> > > > > > > Update on FLINK-34227 [1] which I mentioned above: Chesnay
> helped
> > > > > > > identify
> > > > > > > a concurrency issue in the JobMaster shutdown logic which seems
> > to
> > > > be in
> > > > > > > the code for quite some time. I created a PR fixing the issue
> > > hoping
> > > > that
> > > > > > > the test instability is resolved with it.
> > > > > > >
> > > > > > > The concurrency issue doesn't really explain why it only
> started
> > to
> > > > > > > appear
> > > > > > > recently in a specific CI setup (GHA with AdaptiveScheduler).
> > There
> > > > is no
> > > > > > > hint in the git history indicating that it's caused by some
> newly
> > > > > > > introduced change. That is why I wouldn't make FLINK-34227 a
> > reason
> > > > to
> > > > > > > cancel rc2. Instead, the fix can be provided in subsequent
> patch
> > > > > > > releases.
> > > > > > >
> > > > > > > Matthias
> > > > > > >
> > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > > > > >
> > > > > > > On Thu, Mar 14, 2024 at 8:49 AM Jane Chan
> qingyue@gmail.com
> > > > wrote:
> > > > > > >
> > > > > > > > Hi Yun, Jing, Martijn and Lincoln,
> > > > > > > >
> > > > > > > > I'm seeking guidance on whether merging the bugfix[1][2] at
> > this
> > > > stage
> > > > > > > > is
> > > > > > > > appropriate. I want to ensure that the actions align with the
> > > > current
> > > > > > > > release process and do not disrupt the ongoing preparations.
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > > > > > > > [2] https://github.com/apache/flink/pull/24492
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jane
> > > > > > > >
> > > > > > > > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang myas...@live.com
> > wrote:
> > > > > > > >
> > > > > > > > > +1 (non-binding)
> > > > > > > > >
> > > > > > > > > *
> > > > > > > > > 

[jira] [Created] (FLINK-34673) SessionRelatedITCase#testTouchSession failure on GitHub Actions

2024-03-14 Thread Ryan Skraba (Jira)
Ryan Skraba created FLINK-34673:
---

 Summary: SessionRelatedITCase#testTouchSession failure on GitHub 
Actions
 Key: FLINK-34673
 URL: https://issues.apache.org/jira/browse/FLINK-34673
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Gateway
Affects Versions: 1.19.0
Reporter: Ryan Skraba


[https://github.com/apache/flink/actions/runs/8258416388/job/22590907051#step:10:12155]
{code:java}
 Error: 03:08:21 03:08:21.304 [ERROR] 
org.apache.flink.table.gateway.rest.SessionRelatedITCase.testTouchSession -- 
Time elapsed: 0.015 s <<< FAILURE!
Mar 13 03:08:21 java.lang.AssertionError: 
Mar 13 03:08:21 
Mar 13 03:08:21 Expecting actual:
Mar 13 03:08:21   1710299301198L
Mar 13 03:08:21 to be greater than:
Mar 13 03:08:21   1710299301198L
Mar 13 03:08:21 
Mar 13 03:08:21     at 
org.apache.flink.table.gateway.rest.SessionRelatedITCase.testTouchSession(SessionRelatedITCase.java:175)
Mar 13 03:08:21     at 
java.base/java.lang.reflect.Method.invoke(Method.java:580)
Mar 13 03:08:21     at 
java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
Mar 13 03:08:21     at 
java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
Mar 13 03:08:21     at 
java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
Mar 13 03:08:21     at 
java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
Mar 13 03:08:21     at 
java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
Mar 13 03:08:21     at 
java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34672) HA deadlock between JobMasterServiceLeadershipRunner and DefaultLeaderElectionService

2024-03-14 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-34672:


 Summary: HA deadlock between JobMasterServiceLeadershipRunner and 
DefaultLeaderElectionService
 Key: FLINK-34672
 URL: https://issues.apache.org/jira/browse/FLINK-34672
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.1
Reporter: Chesnay Schepler
 Fix For: 1.18.2, 1.20.0, 1.19.1


We recently observed a deadlock in the JM within the HA system.
(see below for the thread dump)

[~mapohl] and I looked a bit into it and there appears to be a race condition 
when leadership is revoked while a JobMaster is being started.
It appears to be caused by 
{{JobMasterServiceLeadershipRunner#createNewJobMasterServiceProcess}} 
forwarding futures while holding a lock; depending on whether the forwarded 
future is already complete the next stage may or may not run while holding that 
same lock.
We haven't determined yet whether we should be holding that lock or not.

{code}
"DefaultLeaderElectionService-leadershipOperationExecutor-thread-1" #131 daemon 
prio=5 os_prio=0 cpu=157.44ms elapsed=78749.65s tid=0x7f531f43d000 
nid=0x19d waiting for monitor entry  [0x7f53084fd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.runIfStateRunning(JobMasterServiceLeadershipRunner.java:462)
- waiting to lock <0xf1c0e088> (a java.lang.Object)
at 
org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.revokeLeadership(JobMasterServiceLeadershipRunner.java:397)
at 
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.notifyLeaderContenderOfLeadershipLoss(DefaultLeaderElectionService.java:484)
at 
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService$$Lambda$1252/0x000840ddec40.accept(Unknown
 Source)
at java.util.HashMap.forEach(java.base@11.0.22/HashMap.java:1337)
at 
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.onRevokeLeadershipInternal(DefaultLeaderElectionService.java:452)
at 
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService$$Lambda$1251/0x000840dcf840.run(Unknown
 Source)
at 
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.lambda$runInLeaderEventThread$3(DefaultLeaderElectionService.java:549)
- locked <0xf0e3f4d8> (a java.lang.Object)
at 
org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService$$Lambda$1075/0x000840c23040.run(Unknown
 Source)
at 
java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@11.0.22/CompletableFuture.java:1736)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.22/ThreadPoolExecutor.java:1128)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.22/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.22/Thread.java:829)
{code}

{code}
"jobmanager-io-thread-1" #636 daemon prio=5 os_prio=0 cpu=125.56ms elapsed=78699.01s tid=0x7f5321c6e800 nid=0x396 waiting for monitor entry [0x7f530567d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService.hasLeadership(DefaultLeaderElectionService.java:366)
- waiting to lock <0xf0e3f4d8> (a java.lang.Object)
at org.apache.flink.runtime.leaderelection.DefaultLeaderElection.hasLeadership(DefaultLeaderElection.java:52)
at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.isValidLeader(JobMasterServiceLeadershipRunner.java:509)
at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner.lambda$forwardIfValidLeader$15(JobMasterServiceLeadershipRunner.java:520)
- locked <0xf1c0e088> (a java.lang.Object)
at org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner$$Lambda$1320/0x000840e1a840.accept(Unknown Source)
at java.util.concurrent.CompletableFuture.uniWhenComplete(java.base@11.0.22/CompletableFuture.java:859)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(java.base@11.0.22/CompletableFuture.java:837)
at java.util.concurrent.CompletableFuture.postComplete(java.base@11.0.22/CompletableFuture.java:506)
at java.util.concurrent.CompletableFuture.complete(java.base@11.0.22/CompletableFuture.java:2079)
at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.registerJobMasterServiceFutures(DefaultJobMasterServiceProcess.java:124)
at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:114)
at

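Taken together, the dumps suggest a lock-order inversion: one thread appears to hold the leader-election service's monitor (0xf0e3f4d8) while calling into runner code, while the jobmanager-io thread holds a runner monitor (0xf1c0e088) and waits for the service monitor in hasLeadership(). A minimal stand-alone sketch of that pattern (hypothetical thread and lock names, not Flink code, and using tryLock with a timeout so the demo itself cannot hang):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockInversionDemo {
    // stand-ins for the two monitors seen in the dumps (0xf0e3f4d8 and 0xf1c0e088)
    static final ReentrantLock serviceLock = new ReentrantLock();
    static final ReentrantLock runnerLock = new ReentrantLock();
    static final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
    static final CountDownLatch bothTried = new CountDownLatch(2);
    static final boolean[] crossAcquired = new boolean[2];

    public static void main(String[] args) throws Exception {
        Thread leaderEventThread = new Thread(() -> runWith(serviceLock, runnerLock, 0));
        Thread jobMasterIoThread = new Thread(() -> runWith(runnerLock, serviceLock, 1));
        leaderEventThread.start();
        jobMasterIoThread.start();
        leaderEventThread.join();
        jobMasterIoThread.join();
        // with both outer locks held, neither thread can take the other's lock
        if (crossAcquired[0] || crossAcquired[1]) {
            throw new AssertionError("expected both cross acquisitions to time out");
        }
        System.out.println("cross acquisitions timed out: blocking locks would deadlock here");
    }

    // take `first`, wait until the other thread holds its own first lock, then try `second`
    static void runWith(ReentrantLock first, ReentrantLock second, int idx) {
        first.lock();
        try {
            bothHoldFirstLock.countDown();
            bothHoldFirstLock.await();
            crossAcquired[idx] = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (crossAcquired[idx]) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await(); // keep holding `first` until the other side has also timed out
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }
}
```

With plain synchronized blocks instead of tryLock, both threads would block forever, which matches the BLOCKED state shown in the dump.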
Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Sergey Nuyanzin
+1 (non-binding)

- Checked the pre-built jars are generated with jdk8
- Verified signature and checksum
- Verified no binary in source
- Verified source code tag
- Reviewed release note
- Reviewed web PR
- Built from source
- Run a simple job successfully

On Thu, Mar 14, 2024 at 2:21 PM Martijn Visser 
wrote:

> +1 (binding)
>
> - Validated hashes
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Build the source with Maven via mvn clean install -Pcheck-convergence
> -Dflink.version=1.19.0
> - Verified licenses
> - Verified web PR
> - Started a cluster and the Flink SQL client, successfully read and wrote
> with the Kafka connector to Confluent Cloud with AVRO and Schema Registry
> enabled
>
> On Thu, Mar 14, 2024 at 1:32 PM gongzhongqiang 
> wrote:
>
> > +1 (non-binding)
> >
> > - Verified no binary files in source code
> > - Verified signature and checksum
> > - Build source code and run a simple job successfully
> > - Reviewed the release announcement PR
> >
> > Best,
> >
> > Zhongqiang Gong
> >
> > Ferenc Csaky  于2024年3月14日周四 20:07写道:
> >
> > >  +1 (non-binding)
> > >
> > > - Verified checksum and signature
> > > - Verified no binary in src
> > > - Built from src
> > > - Reviewed release note PR
> > > - Reviewed web PR
> > > - Tested a simple datagen query and insert to blackhole sink via SQL
> > > Gateway
> > >
> > > Best,
> > > Ferenc
> > >
> > >
> > >
> > >
> > > On Thursday, March 14th, 2024 at 12:14, Jane Chan <
> qingyue@gmail.com
> > >
> > > wrote:
> > >
> > > >
> > > >
> > > > Hi Lincoln,
> > > >
> > > > Thank you for the prompt response and the effort to provide clarity
> on
> > > this
> > > > matter.
> > > >
> > > > Best,
> > > > Jane
> > > >
> > > > On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com
> > > wrote:
> > > >
> > > > > Hi Jane,
> > > > >
> > > > > Thank you for raising this question. I saw the discussion in the
> Jira
> > > > > (include Matthias' point)
> > > > > and sought advice from several PMCs (including the previous RMs),
> the
> > > > > majority of people
> > > > > are in favor of merging the bugfix into the release branch even
> > during
> > > the
> > > > > release candidate
> > > > > (RC) voting period, so we should accept all bugfixes (unless there
> > is a
> > > > > specific community
> > > > > rule preventing it).
> > > > >
> > > > > Thanks again for contributing to the community!
> > > > >
> > > > > Best,
> > > > > Lincoln Lee
> > > > >
> > > > > Matthias Pohl matthias.p...@aiven.io.invalid 于2024年3月14日周四
> 17:50写道:
> > > > >
> > > > > > Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
> > > > > > identify
> > > > > > a concurrency issue in the JobMaster shutdown logic which seems
> to
> > > be in
> > > > > > the code for quite some time. I created a PR fixing the issue
> > hoping
> > > that
> > > > > > the test instability is resolved with it.
> > > > > >
> > > > > > The concurrency issue doesn't really explain why it only started
> to
> > > > > > appear
> > > > > > recently in a specific CI setup (GHA with AdaptiveScheduler).
> There
> > > is no
> > > > > > hint in the git history indicating that it's caused by some newly
> > > > > > introduced change. That is why I wouldn't make FLINK-34227 a
> reason
> > > to
> > > > > > cancel rc2. Instead, the fix can be provided in subsequent patch
> > > > > > releases.
> > > > > >
> > > > > > Matthias
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > > > >
> > > > > > On Thu, Mar 14, 2024 at 8:49 AM Jane Chan qingyue@gmail.com
> > > wrote:
> > > > > >
> > > > > > > Hi Yun, Jing, Martijn and Lincoln,
> > > > > > >
> > > > > > > I'm seeking guidance on whether merging the bugfix[1][2] at
> this
> > > stage
> > > > > > > is
> > > > > > > appropriate. I want to ensure that the actions align with the
> > > current
> > > > > > > release process and do not disrupt the ongoing preparations.
> > > > > > >
> > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > > > > > > [2] https://github.com/apache/flink/pull/24492
> > > > > > >
> > > > > > > Best,
> > > > > > > Jane
> > > > > > >
> > > > > > > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang myas...@live.com
> wrote:
> > > > > > >
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > *
> > > > > > > > Verified the signature and checksum.
> > > > > > > > *
> > > > > > > > Reviewed the release note PR
> > > > > > > > *
> > > > > > > > Reviewed the web announcement PR
> > > > > > > > *
> > > > > > > > Start a standalone cluster to submit the state machine
> example,
> > > which
> > > > > > > > works well.
> > > > > > > > *
> > > > > > > > Checked the pre-built jars are generated via JDK8
> > > > > > > > *
> > > > > > > > Verified the process profiler works well after setting
> > > > > > > > rest.profiling.enabled: true
> > > > > > > >
> > > > > > > > Best
> > > > > > > > Yun Tang
> > > > > > > >
> > > > > > > > 
> > > > > > > 

Re: [DISCUSS] Removing documentation on Azure Pipelines for Flink forks

2024-03-14 Thread Matthias Pohl
Good point. I guess marking it as deprecated and pointing to GitHub
Actions as the new workaround would be a better way than removing it
entirely for now.

On Thu, Mar 14, 2024 at 11:47 AM Sergey Nuyanzin 
wrote:

> Hi Matthias,
>
> thanks for driving this
> agree, GHA seems to be working ok
>
> however to be on the safe side what if we mark it for removal or deprecated
> first
> and then remove together with dropping support of 1.17 where GHA is not
> supported IIUC?
>
> On Thu, Mar 14, 2024 at 11:42 AM Matthias Pohl
>  wrote:
>
> > Hi everyone,
> > I'm wondering whether anyone has objections against removing the Azure
> > Pipelines tutorial on setting up CI for a fork of the Flink repository in
> > the Flink wiki [1]. Flink's GitHub Actions workflow seems to work fine for forks
> > (at least for 1.18+ changes). No need to guide contributors to the
> > flink-mirror repository to create draft PRs. And it's not used that
> often,
> > anyway [2].
> >
> > Best,
> > Matthias
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-Tutorial:SettingupAzurePipelinesforaforkoftheFlinkrepository
> > [2] https://github.com/flink-ci/flink-mirror/pulls?q=is%3Apr
> >
>
>
> --
> Best regards,
> Sergey
>


Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Martijn Visser
+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven via mvn clean install -Pcheck-convergence
-Dflink.version=1.19.0
- Verified licenses
- Verified web PR
- Started a cluster and the Flink SQL client, successfully read and wrote
with the Kafka connector to Confluent Cloud with AVRO and Schema Registry
enabled

On Thu, Mar 14, 2024 at 1:32 PM gongzhongqiang 
wrote:

> +1 (non-binding)
>
> - Verified no binary files in source code
> - Verified signature and checksum
> - Build source code and run a simple job successfully
> - Reviewed the release announcement PR
>
> Best,
>
> Zhongqiang Gong
>
> Ferenc Csaky  于2024年3月14日周四 20:07写道:
>
> >  +1 (non-binding)
> >
> > - Verified checksum and signature
> > - Verified no binary in src
> > - Built from src
> > - Reviewed release note PR
> > - Reviewed web PR
> > - Tested a simple datagen query and insert to blackhole sink via SQL
> > Gateway
> >
> > Best,
> > Ferenc
> >
> >
> >
> >
> > On Thursday, March 14th, 2024 at 12:14, Jane Chan  >
> > wrote:
> >
> > >
> > >
> > > Hi Lincoln,
> > >
> > > Thank you for the prompt response and the effort to provide clarity on
> > this
> > > matter.
> > >
> > > Best,
> > > Jane
> > >
> > > On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com
> > wrote:
> > >
> > > > Hi Jane,
> > > >
> > > > Thank you for raising this question. I saw the discussion in the Jira
> > > > (include Matthias' point)
> > > > and sought advice from several PMCs (including the previous RMs), the
> > > > majority of people
> > > > are in favor of merging the bugfix into the release branch even
> during
> > the
> > > > release candidate
> > > > (RC) voting period, so we should accept all bugfixes (unless there
> is a
> > > > specific community
> > > > rule preventing it).
> > > >
> > > > Thanks again for contributing to the community!
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > > Matthias Pohl matthias.p...@aiven.io.invalid 于2024年3月14日周四 17:50写道:
> > > >
> > > > > Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
> > > > > identify
> > > > > a concurrency issue in the JobMaster shutdown logic which seems to
> > be in
> > > > > the code for quite some time. I created a PR fixing the issue
> hoping
> > that
> > > > > the test instability is resolved with it.
> > > > >
> > > > > The concurrency issue doesn't really explain why it only started to
> > > > > appear
> > > > > recently in a specific CI setup (GHA with AdaptiveScheduler). There
> > is no
> > > > > hint in the git history indicating that it's caused by some newly
> > > > > introduced change. That is why I wouldn't make FLINK-34227 a reason
> > to
> > > > > cancel rc2. Instead, the fix can be provided in subsequent patch
> > > > > releases.
> > > > >
> > > > > Matthias
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > > >
> > > > > On Thu, Mar 14, 2024 at 8:49 AM Jane Chan qingyue@gmail.com
> > wrote:
> > > > >
> > > > > > Hi Yun, Jing, Martijn and Lincoln,
> > > > > >
> > > > > > I'm seeking guidance on whether merging the bugfix[1][2] at this
> > stage
> > > > > > is
> > > > > > appropriate. I want to ensure that the actions align with the
> > current
> > > > > > release process and do not disrupt the ongoing preparations.
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > > > > > [2] https://github.com/apache/flink/pull/24492
> > > > > >
> > > > > > Best,
> > > > > > Jane
> > > > > >
> > > > > > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang myas...@live.com wrote:
> > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > *
> > > > > > > Verified the signature and checksum.
> > > > > > > *
> > > > > > > Reviewed the release note PR
> > > > > > > *
> > > > > > > Reviewed the web announcement PR
> > > > > > > *
> > > > > > > Start a standalone cluster to submit the state machine example,
> > which
> > > > > > > works well.
> > > > > > > *
> > > > > > > Checked the pre-built jars are generated via JDK8
> > > > > > > *
> > > > > > > Verified the process profiler works well after setting
> > > > > > > rest.profiling.enabled: true
> > > > > > >
> > > > > > > Best
> > > > > > > Yun Tang
> > > > > > >
> > > > > > > 
> > > > > > > From: Qingsheng Ren re...@apache.org
> > > > > > > Sent: Wednesday, March 13, 2024 12:45
> > > > > > > To: dev@flink.apache.org dev@flink.apache.org
> > > > > > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> > > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > - Verified signature and checksum
> > > > > > > - Verified no binary in source
> > > > > > > - Built from source
> > > > > > > - Tested reading and writing Kafka with SQL client and Kafka
> > > > > > > connector
> > > > > > > 3.1.0
> > > > > > > - Verified source code tag
> > > > > > > - Reviewed release note
> > > > > > > - Reviewed web PR
> > > > > > >
> > > > > > > 

[jira] [Created] (FLINK-34671) Update the content of README.md in FlinkCDC project

2024-03-14 Thread LvYanquan (Jira)
LvYanquan created FLINK-34671:
-

 Summary: Update the content of README.md in FlinkCDC project
 Key: FLINK-34671
 URL: https://issues.apache.org/jira/browse/FLINK-34671
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Affects Versions: cdc-3.1.0
Reporter: LvYanquan
 Fix For: cdc-3.1.0


As we have updated the doc site of FlinkCDC, we should modify the content of 
README.md to update those links and add a more accurate description.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Roc Marshal
Hi, Yubin


Thank you for initiating this discussion! +1 for the proposal.

Best,
Yuepeng Pan

At 2024-03-14 18:57:35, "Ferenc Csaky"  wrote:
>Hi Yubin,
>
>Thank you for initiating this discussion! +1 for the proposal.
>
>I also think it makes sense to group the missing catalog related
>SQL syntaxes under this FLIP.
>
>Looking forward to these features!
>
>Best,
>Ferenc
>
>
>
>
>On Thursday, March 14th, 2024 at 08:31, Jane Chan  
>wrote:
>
>> 
>> 
>> Hi Yubin,
>> 
>> Thanks for leading the discussion. I'm +1 for the FLIP.
>> 
>> As Jark said, it's a good opportunity to enhance the syntax for Catalog
>> from a more comprehensive perspective. So, I suggest expanding the scope of
>> this FLIP by focusing on the mechanism instead of one use case to enhance
>> the overall functionality. WDYT?
>> 
>> Best,
>> Jane
>> 
>> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com wrote:
>> 
>> > Hi, Yubin.
>> > 
>> > Thanks for the FLIP. +1 for it.
>> > 
>> > Best,
>> > Hang
>> > 
>> > Yubin Li lyb5...@gmail.com 于2024年3月14日周四 10:15写道:
>> > 
>> > > Hi Jingsong, Feng, and Jeyhun
>> > > 
>> > > Thanks for your support and feedback!
>> > > 
>> > > > However, could we add a new method `getCatalogDescriptor()` to
>> > > > CatalogManager instead of directly exposing CatalogStore?
>> > > 
>> > > Good point. Besides the audit tracking issue, the proposed feature
>> > > only requires the `getCatalogDescriptor()` function. Exposing components
>> > > with excessive functionality would bring unnecessary risks; I have made
>> > > modifications in the FLIP doc [1]. Thanks, Feng :)
>> > > 
>> > > > Showing the SQL parser implementation in the FLIP for the SQL syntax
>> > > > might be a bit confusing. Also, the formal definition is missing for
>> > > > this SQL clause.
>> > > 
>> > > Thank Jeyhun for pointing it out :) I have updated the doc [1] .
>> > > 
>> > > [1]
>> > 
>> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
>> > 
>> > > Best,
>> > > Yubin
>> > > 
>> > > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov je.kari...@gmail.com
>> > > wrote:
>> > > 
>> > > > Hi Yubin,
>> > > > 
>> > > > Thanks for the proposal. +1 for it.
>> > > > I have one comment:
>> > > > 
>> > > > I would like to see the SQL syntax for the proposed statement. Showing
>> > > > the
>> > > > SQL parser implementation in the FLIP
>> > > > for the SQL syntax might be a bit confusing. Also, the formal
>> > > > definition
>> > > > is
>> > > > missing for this SQL clause.
>> > > > Maybe something like [1] might be useful. WDYT?
>> > > > 
>> > > > Regards,
>> > > > Jeyhun
>> > > > 
>> > > > [1]
>> > 
>> > https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
>> > 
>> > > > On Wed, Mar 13, 2024 at 3:28 PM Feng Jin jinfeng1...@gmail.com
>> > > > wrote:
>> > > > 
>> > > > > Hi Yubin
>> > > > > 
>> > > > > Thank you for initiating this FLIP.
>> > > > > 
>> > > > > I have just one minor question:
>> > > > > 
>> > > > > I noticed that we added a new function `getCatalogStore` to expose
>> > > > > CatalogStore, and it seems fine.
>> > > > > However, could we add a new method `getCatalogDescriptor()` to
>> > > > > CatalogManager instead of directly exposing CatalogStore?
>> > > > > By only providing the `getCatalogDescriptor()` interface, it may be
>> > > > > easier
>> > > > > for us to implement audit tracking in CatalogManager in the future.
>> > > > > WDYT ?
>> > > > > Although we have only collected some modified events at the
>> > > > > moment.[1]
>> > > > > 
>> > > > > [1].
>> > 
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
>> > 
>> > > > > Best,
>> > > > > Feng
>> > > > > 
>> > > > > On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li jingsongl...@gmail.com
>> > > > > wrote:
>> > > > > 
>> > > > > > +1 for this.
>> > > > > > 
>> > > > > > We are missing a series of catalog related syntaxes.
>> > > > > > Especially after the introduction of catalog store. [1]
>> > > > > > 
>> > > > > > [1]
>> > 
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
>> > 
>> > > > > > Best,
>> > > > > > Jingsong
>> > > > > > 
>> > > > > > On Wed, Mar 13, 2024 at 5:09 PM Yubin Li lyb5...@gmail.com
>> > > > > > wrote:
>> > > > > > 
>> > > > > > > Hi devs,
>> > > > > > > 
>> > > > > > > I'd like to start a discussion about FLIP-436: Introduce "SHOW
>> > > > > > > CREATE
>> > > > > > > CATALOG" Syntax [1].
>> > > > > > > 
>> > > > > > > At present, the `SHOW CREATE TABLE` statement provides strong
>> > > > > > > support
>> > > > > > > for
>> > > > > > > users to easily
>> > > > > > > reuse created tables. However, despite the increasing importance
>> > > > > > > of the
>> > > > > > > `Catalog` in user's
>> > > > > > > business, there is no similar statement for users to use.
>> > > > > > > 
>> > > > > > > 

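For reference, the statement proposed in the FLIP-436 thread above would be used along these lines (a hypothetical sketch of the proposed syntax, not a committed design):

```sql
-- create a catalog, then later recover its definition for reuse
CREATE CATALOG my_catalog WITH ('type' = 'generic_in_memory');

-- proposed: print the CREATE CATALOG statement that defines my_catalog,
-- analogous to the existing SHOW CREATE TABLE
SHOW CREATE CATALOG my_catalog;
```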
[jira] [Created] (FLINK-34670) Use SynchronousQueue to create asyncOperationsThreadPool for SubtaskCheckpointCoordinatorImpl

2024-03-14 Thread Jinzhong Li (Jira)
Jinzhong Li created FLINK-34670:
---

 Summary: Use SynchronousQueue to create asyncOperationsThreadPool 
for SubtaskCheckpointCoordinatorImpl
 Key: FLINK-34670
 URL: https://issues.apache.org/jira/browse/FLINK-34670
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing, Runtime / State Backends
Affects Versions: 1.18.0, 1.19.0
Reporter: Jinzhong Li
 Fix For: 1.20.0
 Attachments: image-2024-03-14-20-24-14-198.png, 
image-2024-03-14-20-27-37-540.png, image-2024-03-14-20-33-28-851.png

Now, the asyncOperations ThreadPoolExecutor of SubtaskCheckpointCoordinatorImpl 
is created with a LinkedBlockingQueue and zero corePoolSize.

!image-2024-03-14-20-24-14-198.png|width=456,height=147!

In the ThreadPoolExecutor, except for the first task submission, *no* new 
thread is created until the queue is full. But the capacity of the 
LinkedBlockingQueue is Integer.MAX_VALUE. This means that there is effectively 
*only one thread* working in this thread pool, even if there are many concurrent 
checkpoint requests or checkpoint abort requests waiting to be processed.

!image-2024-03-14-20-27-37-540.png|width=575,height=164!

This problem can be verified by changing the ExecutorService implementation in the UT 
SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. 
When the LinkedBlockingQueue is used, this UT deadlocks because only one 
worker thread can be created.
!image-2024-03-14-20-33-28-851.png|width=598,height=232!
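The single-worker behavior can be demonstrated with a plain ThreadPoolExecutor outside of Flink. The sketch below (not Flink code; pool parameters chosen to mirror the description above) compares a LinkedBlockingQueue-backed pool with a SynchronousQueue-backed one:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolQueueDemo {
    static ThreadPoolExecutor pool(BlockingQueue<Runnable> q) {
        // zero core threads, effectively unbounded max, like the coordinator's async pool
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, q);
    }

    public static void main(String[] args) {
        Runnable sleepy = () -> {
            try { Thread.sleep(300); } catch (InterruptedException ignored) { }
        };

        // LinkedBlockingQueue: tasks queue up and only a single worker is ever created
        ThreadPoolExecutor linked = pool(new LinkedBlockingQueue<>());
        for (int i = 0; i < 4; i++) linked.execute(sleepy);
        System.out.println("linked pool size = " + linked.getPoolSize());   // 1

        // SynchronousQueue: every execute() with no idle worker spawns a new thread
        ThreadPoolExecutor sync = pool(new SynchronousQueue<>());
        for (int i = 0; i < 4; i++) sync.execute(sleepy);
        System.out.println("sync pool size = " + sync.getPoolSize());       // 4

        if (linked.getPoolSize() != 1 || sync.getPoolSize() != 4) {
            throw new AssertionError("unexpected pool sizes");
        }
        linked.shutdownNow();
        sync.shutdownNow();
    }
}
```

Because the LinkedBlockingQueue is effectively unbounded, the pool never reaches the "queue full" condition that triggers thread creation, which is why switching to a SynchronousQueue lets the pool scale up toward maximumPoolSize.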



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread gongzhongqiang
+1 (non-binding)

- Verified no binary files in source code
- Verified signature and checksum
- Build source code and run a simple job successfully
- Reviewed the release announcement PR

Best,

Zhongqiang Gong

Ferenc Csaky  于2024年3月14日周四 20:07写道:

>  +1 (non-binding)
>
> - Verified checksum and signature
> - Verified no binary in src
> - Built from src
> - Reviewed release note PR
> - Reviewed web PR
> - Tested a simple datagen query and insert to blackhole sink via SQL
> Gateway
>
> Best,
> Ferenc
>
>
>
>
> On Thursday, March 14th, 2024 at 12:14, Jane Chan 
> wrote:
>
> >
> >
> > Hi Lincoln,
> >
> > Thank you for the prompt response and the effort to provide clarity on
> this
> > matter.
> >
> > Best,
> > Jane
> >
> > On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com
> wrote:
> >
> > > Hi Jane,
> > >
> > > Thank you for raising this question. I saw the discussion in the Jira
> > > (include Matthias' point)
> > > and sought advice from several PMCs (including the previous RMs), the
> > > majority of people
> > > are in favor of merging the bugfix into the release branch even during
> the
> > > release candidate
> > > (RC) voting period, so we should accept all bugfixes (unless there is a
> > > specific community
> > > rule preventing it).
> > >
> > > Thanks again for contributing to the community!
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > > Matthias Pohl matthias.p...@aiven.io.invalid 于2024年3月14日周四 17:50写道:
> > >
> > > > Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
> > > > identify
> > > > a concurrency issue in the JobMaster shutdown logic which seems to
> be in
> > > > the code for quite some time. I created a PR fixing the issue hoping
> that
> > > > the test instability is resolved with it.
> > > >
> > > > The concurrency issue doesn't really explain why it only started to
> > > > appear
> > > > recently in a specific CI setup (GHA with AdaptiveScheduler). There
> is no
> > > > hint in the git history indicating that it's caused by some newly
> > > > introduced change. That is why I wouldn't make FLINK-34227 a reason
> to
> > > > cancel rc2. Instead, the fix can be provided in subsequent patch
> > > > releases.
> > > >
> > > > Matthias
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > >
> > > > On Thu, Mar 14, 2024 at 8:49 AM Jane Chan qingyue@gmail.com
> wrote:
> > > >
> > > > > Hi Yun, Jing, Martijn and Lincoln,
> > > > >
> > > > > I'm seeking guidance on whether merging the bugfix[1][2] at this
> stage
> > > > > is
> > > > > appropriate. I want to ensure that the actions align with the
> current
> > > > > release process and do not disrupt the ongoing preparations.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > > > > [2] https://github.com/apache/flink/pull/24492
> > > > >
> > > > > Best,
> > > > > Jane
> > > > >
> > > > > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang myas...@live.com wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > *
> > > > > > Verified the signature and checksum.
> > > > > > *
> > > > > > Reviewed the release note PR
> > > > > > *
> > > > > > Reviewed the web announcement PR
> > > > > > *
> > > > > > Start a standalone cluster to submit the state machine example,
> which
> > > > > > works well.
> > > > > > *
> > > > > > Checked the pre-built jars are generated via JDK8
> > > > > > *
> > > > > > Verified the process profiler works well after setting
> > > > > > rest.profiling.enabled: true
> > > > > >
> > > > > > Best
> > > > > > Yun Tang
> > > > > >
> > > > > > 
> > > > > > From: Qingsheng Ren re...@apache.org
> > > > > > Sent: Wednesday, March 13, 2024 12:45
> > > > > > To: dev@flink.apache.org dev@flink.apache.org
> > > > > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> > > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > - Verified signature and checksum
> > > > > > - Verified no binary in source
> > > > > > - Built from source
> > > > > > - Tested reading and writing Kafka with SQL client and Kafka
> > > > > > connector
> > > > > > 3.1.0
> > > > > > - Verified source code tag
> > > > > > - Reviewed release note
> > > > > > - Reviewed web PR
> > > > > >
> > > > > > Thanks to all release managers and contributors for the awesome
> work!
> > > > > >
> > > > > > Best,
> > > > > > Qingsheng
> > > > > >
> > > > > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
> > > > > > matthias.p...@aiven.io.invalid wrote:
> > > > > >
> > > > > > > I want to share an update on FLINK-34227 [1]: It's still not
> clear
> > > > > > > what's
> > > > > > > causing the test instability. So far, we agreed in today's
> release
> > > > > > > sync
> > > > > > > [2]
> > > > > > > that it's not considered a blocker because it is observed in
> 1.18
> > > > > > > nightly
> > > > > > > builds and it only appears in the GitHub Actions workflow. But
> I
> > > > > > > still
> > > > > > > have
> > > > > > > a bit of a concern that this is 

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Ferenc Csaky
 +1 (non-binding)

- Verified checksum and signature
- Verified no binary in src
- Built from src
- Reviewed release note PR
- Reviewed web PR
- Tested a simple datagen query and insert to blackhole sink via SQL Gateway

Best,
Ferenc




On Thursday, March 14th, 2024 at 12:14, Jane Chan  wrote:

> 
> 
> Hi Lincoln,
> 
> Thank you for the prompt response and the effort to provide clarity on this
> matter.
> 
> Best,
> Jane
> 
> On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee lincoln.8...@gmail.com wrote:
> 
> > Hi Jane,
> > 
> > Thank you for raising this question. I saw the discussion in the Jira
> > (include Matthias' point)
> > and sought advice from several PMCs (including the previous RMs), the
> > majority of people
> > are in favor of merging the bugfix into the release branch even during the
> > release candidate
> > (RC) voting period, so we should accept all bugfixes (unless there is a
> > specific community
> > rule preventing it).
> > 
> > Thanks again for contributing to the community!
> > 
> > Best,
> > Lincoln Lee
> > 
> > Matthias Pohl matthias.p...@aiven.io.invalid 于2024年3月14日周四 17:50写道:
> > 
> > > Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
> > > identify
> > > a concurrency issue in the JobMaster shutdown logic which seems to be in
> > > the code for quite some time. I created a PR fixing the issue hoping that
> > > the test instability is resolved with it.
> > > 
> > > The concurrency issue doesn't really explain why it only started to
> > > appear
> > > recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
> > > hint in the git history indicating that it's caused by some newly
> > > introduced change. That is why I wouldn't make FLINK-34227 a reason to
> > > cancel rc2. Instead, the fix can be provided in subsequent patch
> > > releases.
> > > 
> > > Matthias
> > > 
> > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > 
> > > On Thu, Mar 14, 2024 at 8:49 AM Jane Chan qingyue@gmail.com wrote:
> > > 
> > > > Hi Yun, Jing, Martijn and Lincoln,
> > > > 
> > > > I'm seeking guidance on whether merging the bugfix[1][2] at this stage
> > > > is
> > > > appropriate. I want to ensure that the actions align with the current
> > > > release process and do not disrupt the ongoing preparations.
> > > > 
> > > > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > > > [2] https://github.com/apache/flink/pull/24492
> > > > 
> > > > Best,
> > > > Jane
> > > > 
> > > > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang myas...@live.com wrote:
> > > > 
> > > > > +1 (non-binding)
> > > > > 
> > > > > *
> > > > > Verified the signature and checksum.
> > > > > *
> > > > > Reviewed the release note PR
> > > > > *
> > > > > Reviewed the web announcement PR
> > > > > *
> > > > > Start a standalone cluster to submit the state machine example, which
> > > > > works well.
> > > > > *
> > > > > Checked the pre-built jars are generated via JDK8
> > > > > *
> > > > > Verified the process profiler works well after setting
> > > > > rest.profiling.enabled: true
> > > > > 
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > 
> > > > > From: Qingsheng Ren re...@apache.org
> > > > > Sent: Wednesday, March 13, 2024 12:45
> > > > > To: dev@flink.apache.org dev@flink.apache.org
> > > > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> > > > > 
> > > > > +1 (binding)
> > > > > 
> > > > > - Verified signature and checksum
> > > > > - Verified no binary in source
> > > > > - Built from source
> > > > > - Tested reading and writing Kafka with SQL client and Kafka
> > > > > connector
> > > > > 3.1.0
> > > > > - Verified source code tag
> > > > > - Reviewed release note
> > > > > - Reviewed web PR
> > > > > 
> > > > > Thanks to all release managers and contributors for the awesome work!
> > > > > 
> > > > > Best,
> > > > > Qingsheng
> > > > > 
> > > > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
> > > > > matthias.p...@aiven.io.invalid wrote:
> > > > > 
> > > > > > I want to share an update on FLINK-34227 [1]: It's still not clear
> > > > > > what's
> > > > > > causing the test instability. So far, we agreed in today's release
> > > > > > sync
> > > > > > [2]
> > > > > > that it's not considered a blocker because it is observed in 1.18
> > > > > > nightly
> > > > > > builds and it only appears in the GitHub Actions workflow. But I
> > > > > > still
> > > > > > have
> > > > > > a bit of a concern that this is something that was introduced in
> > > > > > 1.19
> > > > > > and
> > > > > > backported to 1.18 after the 1.18.1 release (because the test
> > > > > > instability
> > > > > > started to appear more regularly in March; with one occurrence in
> > > > > > January).
> > > > > > Additionally, I have no reason to believe, yet, that the
> > > > > > instability
> > > > > > is
> > > > > > caused by some GHA-related infrastructure issue.
> > > > > > 
> > > > > > So, if someone else has some capacity to help looking into it; that
> > > > 

[jira] [Created] (FLINK-34669) Optimization of Arch Rules for Connector Constraints and Violation File Updates

2024-03-14 Thread Jane Chan (Jira)
Jane Chan created FLINK-34669:
-

 Summary: Optimization of Arch Rules for Connector Constraints and 
Violation File Updates
 Key: FLINK-34669
 URL: https://issues.apache.org/jira/browse/FLINK-34669
 Project: Flink
  Issue Type: Improvement
  Components: Test Infrastructure
Reporter: Jane Chan


Description:
I have identified potential areas for optimization within our Arch rules that 
could improve our development workflow. This originated from the discussion for 
PR https://github.com/apache/flink/pull/24492

1. Connector Constraints:
Our current Arch rule, `CONNECTOR_CLASSES_ONLY_DEPEND_ON_PUBLIC_API`, was 
implemented to prevent internal code changes in Flink from affecting the 
compilation of connectors in external repositories. This rule is crucial for 
connectors that are external, but it may be unnecessarily restrictive for the 
filesystem connector, which remains within the same code repository as Flink. 
Maybe we should consider excluding the filesystem connector from this rule to 
better reflect its status as an internal component.

2. Preconditions Class Promotion:
The majority of Arch rule violations for connectors are related to the use of 
`Preconditions#checkX`. This consistent pattern of violations prompts the 
question of whether we should reclassify `Preconditions` from its current 
internal status to a `Public` or `PublicEvolving` interface, allowing broader 
and more official usage within our codebase.

3. Violation File Updates:
Updating the violation file following the `freeze.refreeze=true` process 
outlined in the readme proves to be difficult. The diffs generated include the 
line numbers, which complicates the review process, especially when substantial 
changes are submitted. Reviewers face a considerable challenge in 
distinguishing between meaningful changes and mere line number alterations. To 
alleviate this issue, I suggest that we modify the process so that line numbers 
are not included in the violation file diffs, streamlining reviews and commits.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-03-14 Thread Benchao Li
Thanks Jeyhun for bringing up this discussion; it is really exciting.
+1 for the general idea.

We also introduced a similar concept in Flink Batch internally to cope
with bucketed tables in Hive; it is a very important improvement.

> One thing to note is that for join queries, the parallelism of each join
> source might be different. This might result in
> inconsistencies while using the pre-partitioned/pre-divided data (e.g.,
> different mappings of partitions to source operators).
> Therefore, it is the job of the planner to detect this and adjust the
> parallelism. With that in mind,
> the rest (how the split assigners perform) is consistent among many
> sources.

Could you elaborate a little more on this? I'll add my two cents
about this part:
1. What parallelism would you take? E.g., 128 + 256 => 128? What
if we cannot find a good greatest common divisor, like 127 + 128,
could we just utilize one side's pre-partitioned attribute and let
the other side just do the shuffle?
2. In our current shuffle removal design (FlinkExpandConversionRule),
we don't consider parallelism; we just remove unnecessary shuffles
according to the distribution columns. After this FLIP, the
parallelism may be bundled with the source's partitions, so how will
this optimization accommodate FlinkExpandConversionRule? Will you
also change downstream operators' parallelism if we want to
remove subsequent shuffles?
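
To make question 1 concrete, here is a hypothetical sketch of the parallelism choice; the gcd heuristic and all names are my own illustration, not anything proposed in the FLIP:

```java
// Hypothetical illustration of question 1: derive a common source
// parallelism for a join over two pre-partitioned inputs from the greatest
// common divisor of their partition counts. When the gcd degenerates to 1
// (e.g. 127 and 128), reusing both sides' pre-partitioning is not feasible
// and one side would have to fall back to a shuffle.
final class ParallelismChoiceSketch {

    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    /** Returns a common parallelism, or -1 when one side should just shuffle. */
    static int commonParallelism(int leftPartitions, int rightPartitions) {
        int g = gcd(leftPartitions, rightPartitions);
        return g > 1 ? g : -1;
    }
}
```

Under this sketch, 128 + 256 sources would agree on 128, while 127 + 128 would signal that one side needs a shuffle.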


Regarding the new optimization rule, have you also considered allowing
some non-strict mode like FlinkRelDistribution#requireStrict? For
example, if the source is pre-partitioned by columns a and b, and we are
consuming this source and doing an aggregate on a, b, c, can we utilize
this optimization?

Jane Chan  于2024年3月14日周四 15:24写道:
>
> Hi Jeyhun,
>
> Thanks for your clarification.
>
> > Once a new partition is detected, we add it to our existing mapping. Our
> mapping looks like Map> subtaskToPartitionAssignment,
> where it maps each source subtaskID to zero or more partitions.
>
> I understand your point.  **It would be better if you could sync the
> content to the FLIP**.
>
> Another thing is I'm curious about what the physical plan looks like. Is
> there any specific info that will be added to the table source (like
> filter/project pushdown)? It would be great if you could attach an example
> to the FLIP.
>
> Bests,
> Jane
>
> On Wed, Mar 13, 2024 at 9:11 PM Jeyhun Karimov  wrote:
>
> > Hi Jane,
> >
> > Thanks for your comments.
> >
> >
> > 1. Concerning the `sourcePartitions()` method, the partition information
> > > returned during the optimization phase may not be the same as the
> > partition
> > > information during runtime execution. For long-running jobs, partitions
> > may
> > > be continuously created. Is this FLIP equipped to handle scenarios?
> >
> >
> > - Good point. This scenario is definitely supported.
> > Once a new partition is added, or in general, new splits are
> > discovered,
> > PartitionAwareSplitAssigner::addSplits(Collection
> > newSplits)
> > method will be called. Inside that method, we are able to detect if a split
> > belongs to existing partitions or there is a new partition.
> > Once a new partition is detected, we add it to our existing mapping. Our
> > mapping looks like Map> subtaskToPartitionAssignment,
> > where
> > it maps each source subtaskID to zero or more partitions.
> >
> > 2. Regarding the `RemoveRedundantShuffleRule` optimization rule, I
> > > understand that it is also necessary to verify whether the hash key
> > within
> > > the Exchange node is consistent with the partition key defined in the
> > table
> > > source that implements `SupportsPartitioning`.
> >
> >
> > - Yes, I overlooked that point; fixed. Actually, the rule is much
> > more complicated. I tried to simplify it in the FLIP. Good point.
> >
> >
> > 3. Could you elaborate on the desired physical plan and integration with
> > > `CompiledPlan` to enhance the overall functionality?
> >
> >
> > - For compiled plan, PartitioningSpec will be used, with a json tag
> > "Partitioning". As a result, in the compiled plan, the source operator will
> > have
> > "abilities" : [ { "type" : "Partitioning" } ] as part of the compiled plan.
> > More about the implementation details below:
> >
> > 
> > PartitioningSpec class
> > 
> > @JsonTypeName("Partitioning")
> > public final class PartitioningSpec extends SourceAbilitySpecBase {
> >     // some code here
> >     @Override
> >     public void apply(DynamicTableSource tableSource, SourceAbilityContext context) {
> >         if (tableSource instanceof SupportsPartitioning) {
> >             ((SupportsPartitioning) tableSource).applyPartitionedRead();
> >         } else {
> >             throw new TableException(
> >                     String.format(
> >                             "%s does not support SupportsPartitioning.",
> >                             tableSource.getClass().getName()));
> >         }
> >     }
> > }
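
For readers following the thread, the partition bookkeeping Jeyhun describes earlier could be sketched roughly as follows (hypothetical names and types, not Flink's actual split assigner API):

```java
// Hypothetical sketch of the bookkeeping described in the thread: map each
// source subtask to zero or more partitions, keep existing assignments
// stable, and assign newly detected partitions round-robin. Names and
// types are illustrative only.
import java.util.*;

final class PartitionAwareAssignerSketch {
    private final int numSubtasks;
    // subtaskID -> partitions owned by that subtask
    final Map<Integer, Set<String>> subtaskToPartitionAssignment = new HashMap<>();
    private int nextSubtask = 0;

    PartitionAwareAssignerSketch(int numSubtasks) {
        this.numSubtasks = numSubtasks;
        for (int i = 0; i < numSubtasks; i++) {
            subtaskToPartitionAssignment.put(i, new HashSet<>());
        }
    }

    /** Returns the subtask that should read the given partition. */
    int assign(String partition) {
        for (Map.Entry<Integer, Set<String>> e : subtaskToPartitionAssignment.entrySet()) {
            if (e.getValue().contains(partition)) {
                return e.getKey(); // existing partition: keep the mapping stable
            }
        }
        // new partition detected: extend the mapping round-robin
        int subtask = nextSubtask;
        nextSubtask = (nextSubtask + 1) % numSubtasks;
        subtaskToPartitionAssignment.get(subtask).add(partition);
        return subtask;
    }
}
```

The key property is that splits of an already-known partition always land on the same subtask, while new partitions extend the mapping without disturbing existing assignments.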

Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread gongzhongqiang
Hi, Yubin.

Thanks for leading the discussion. +1 for the FLIP.



Best,
Zhongqiang Gong

Ferenc Csaky  于2024年3月14日周四 18:58写道:

> Hi Yubin,
>
> Thank you for initiating this discussion! +1 for the proposal.
>
> I also think it makes sense to group the missing catalog related
> SQL syntaxes under this FLIP.
>
> Looking forward to these features!
>
> Best,
> Ferenc
>
>
>
>
> On Thursday, March 14th, 2024 at 08:31, Jane Chan 
> wrote:
>
> >
> >
> > Hi Yubin,
> >
> > Thanks for leading the discussion. I'm +1 for the FLIP.
> >
> > As Jark said, it's a good opportunity to enhance the syntax for Catalog
> > from a more comprehensive perspective. So, I suggest expanding the scope
> of
> > this FLIP by focusing on the mechanism instead of one use case to enhance
> > the overall functionality. WDYT?
> >
> > Best,
> > Jane
> >
> > On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com wrote:
> >
> > > Hi, Yubin.
> > >
> > > Thanks for the FLIP. +1 for it.
> > >
> > > Best,
> > > Hang
> > >
> > > Yubin Li lyb5...@gmail.com 于2024年3月14日周四 10:15写道:
> > >
> > > > Hi Jingsong, Feng, and Jeyhun
> > > >
> > > > Thanks for your support and feedback!
> > > >
> > > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > > CatalogManager instead of directly exposing CatalogStore?
> > > >
> > > > Good point. Besides the audit tracking issue, the proposed feature
> > > > only requires the `getCatalogDescriptor()` function. Exposing components
> > > > with excessive functionality would bring unnecessary risks; I have made
> > > > modifications in the FLIP doc [1]. Thanks, Feng :)
> > > >
> > > > > Showing the SQL parser implementation in the FLIP for the SQL
> syntax
> > > > > might be a bit confusing. Also, the formal definition is missing
> for
> > > > > this SQL clause.
> > > >
> > > > Thanks Jeyhun for pointing it out :) I have updated the doc [1].
> > > >
> > > > [1]
> > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> > >
> > > > Best,
> > > > Yubin
> > > >
> > > > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov je.kari...@gmail.com
> > > > wrote:
> > > >
> > > > > Hi Yubin,
> > > > >
> > > > > Thanks for the proposal. +1 for it.
> > > > > I have one comment:
> > > > >
> > > > > I would like to see the SQL syntax for the proposed statement.
> Showing
> > > > > the
> > > > > SQL parser implementation in the FLIP
> > > > > for the SQL syntax might be a bit confusing. Also, the formal
> > > > > definition
> > > > > is
> > > > > missing for this SQL clause.
> > > > > Maybe something like [1] might be useful. WDYT?
> > > > >
> > > > > Regards,
> > > > > Jeyhun
> > > > >
> > > > > [1]
> > >
> > >
> https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
> > >
> > > > > On Wed, Mar 13, 2024 at 3:28 PM Feng Jin jinfeng1...@gmail.com
> > > > > wrote:
> > > > >
> > > > > > Hi Yubin
> > > > > >
> > > > > > Thank you for initiating this FLIP.
> > > > > >
> > > > > > I have just one minor question:
> > > > > >
> > > > > > I noticed that we added a new function `getCatalogStore` to
> expose
> > > > > > CatalogStore, and it seems fine.
> > > > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > > > CatalogManager instead of directly exposing CatalogStore?
> > > > > > By only providing the `getCatalogDescriptor()` interface, it may
> be
> > > > > > easier
> > > > > > for us to implement audit tracking in CatalogManager in the
> future.
> > > > > > WDYT ?
> > > > > > Although we have only collected some modified events at the
> > > > > > moment.[1]
> > > > > >
> > > > > > [1].
> > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
> > >
> > > > > > Best,
> > > > > > Feng
> > > > > >
> > > > > > On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li
> jingsongl...@gmail.com
> > > > > > wrote:
> > > > > >
> > > > > > > +1 for this.
> > > > > > >
> > > > > > > We are missing a series of catalog related syntaxes.
> > > > > > > Especially after the introduction of catalog store. [1]
> > > > > > >
> > > > > > > [1]
> > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > >
> > > > > > > Best,
> > > > > > > Jingsong
> > > > > > >
> > > > > > > On Wed, Mar 13, 2024 at 5:09 PM Yubin Li lyb5...@gmail.com
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi devs,
> > > > > > > >
> > > > > > > > I'd like to start a discussion about FLIP-436: Introduce
> "SHOW
> > > > > > > > CREATE
> > > > > > > > CATALOG" Syntax [1].
> > > > > > > >
> > > > > > > > At present, the `SHOW CREATE TABLE` statement provides strong
> > > > > > > > support
> > > > > > > > for
> > > > > > > > users to easily
> > > > > > > > reuse created tables. However, despite the increasing
> importance
> > > > > > > > of the
> > > > > > > > `Catalog` in user's
> > > 

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Jane Chan
Hi Lincoln,

Thank you for the prompt response and the effort to provide clarity on this
matter.

Best,
Jane

On Thu, Mar 14, 2024 at 6:02 PM Lincoln Lee  wrote:

> Hi Jane,
>
> Thank you for raising this question. I saw the discussion in the Jira
> (include Matthias' point)
> and sought advice from several PMCs (including the previous RMs), the
> majority of people
> are in favor of merging the bugfix into the release branch even during the
> release candidate
> (RC) voting period, so we should accept all bugfixes (unless there is a
> specific community
> rule preventing it).
>
> Thanks again for contributing to the community!
>
> Best,
> Lincoln Lee
>
>
> Matthias Pohl  于2024年3月14日周四 17:50写道:
>
> > Update on FLINK-34227 [1] which I mentioned above: Chesnay helped
> > identify a concurrency issue in the JobMaster shutdown logic which seems
> > to have been in the code for quite some time. I created a PR fixing the
> > issue, hoping that the test instability is resolved with it.
> >
> > The concurrency issue doesn't really explain why it only started to
> appear
> > recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
> > hint in the git history indicating that it's caused by some newly
> > introduced change. That is why I wouldn't make FLINK-34227 a reason to
> > cancel rc2. Instead, the fix can be provided in subsequent patch
> releases.
> >
> > Matthias
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-34227
> >
> > On Thu, Mar 14, 2024 at 8:49 AM Jane Chan  wrote:
> >
> > > Hi Yun, Jing, Martijn and Lincoln,
> > >
> > > I'm seeking guidance on whether merging the bugfix[1][2] at this stage
> is
> > > appropriate. I want to ensure that the actions align with the current
> > > release process and do not disrupt the ongoing preparations.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > > [2] https://github.com/apache/flink/pull/24492
> > >
> > > Best,
> > > Jane
> > >
> > > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang  wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > >
> > > >   *
> > > > Verified the signature and checksum.
> > > >   *
> > > > Reviewed the release note PR
> > > >   *
> > > > Reviewed the web announcement PR
> > > >   *
> > > > Started a standalone cluster and submitted the state machine example,
> > > > which works well.
> > > >   *
> > > > Checked the pre-built jars are generated via JDK8
> > > >   *
> > > > Verified the process profiler works well after setting
> > > > rest.profiling.enabled: true
> > > >
> > > > Best
> > > > Yun Tang
> > > >
> > > > 
> > > > From: Qingsheng Ren 
> > > > Sent: Wednesday, March 13, 2024 12:45
> > > > To: dev@flink.apache.org 
> > > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> > > >
> > > > +1 (binding)
> > > >
> > > > - Verified signature and checksum
> > > > - Verified no binary in source
> > > > - Built from source
> > > > - Tested reading and writing Kafka with SQL client and Kafka
> connector
> > > > 3.1.0
> > > > - Verified source code tag
> > > > - Reviewed release note
> > > > - Reviewed web PR
> > > >
> > > > Thanks to all release managers and contributors for the awesome work!
> > > >
> > > > Best,
> > > > Qingsheng
> > > >
> > > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
> > > >  wrote:
> > > >
> > > > > I want to share an update on FLINK-34227 [1]: It's still not clear
> > > what's
> > > > > causing the test instability. So far, we agreed in today's release
> > sync
> > > > [2]
> > > > > that it's not considered a blocker because it is observed in 1.18
> > > nightly
> > > > > builds and it only appears in the GitHub Actions workflow. But I
> > still
> > > > have
> > > > > a bit of a concern that this is something that was introduced in
> 1.19
> > > and
> > > > > backported to 1.18 after the 1.18.1 release (because the test
> > > instability
> > > > > started to appear more regularly in March; with one occurrence in
> > > > January).
> > > > > Additionally, I have no reason to believe, yet, that the
> instability
> > is
> > > > > caused by some GHA-related infrastructure issue.
> > > > >
> > > > > So, if someone else has some capacity to help looking into it, that
> > > would
> > > > > be appreciated. I will continue my investigation tomorrow.
> > > > >
> > > > > Best,
> > > > > Matthias
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > > > [2]
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/1.19+Release#id-1.19Release-03/12/2024
> > > > >
> > > > > On Tue, Mar 12, 2024 at 12:50 PM Benchao Li 
> > > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > - checked signature and checksum: OK
> > > > > > - checked copyright year in notice file: OK
> > > > > > - diffed source distribution with tag, make sure there is no
> > > > > > unexpected files: OK
> > > > > > - build from source : OK
> > > > > > - start a local cluster, played with jdbc connector: OK
> > > > > >
> > 

Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Ferenc Csaky
Hi Yubin,

Thank you for initiating this discussion! +1 for the proposal.

I also think it makes sense to group the missing catalog related
SQL syntaxes under this FLIP.

Looking forward to these features!

Best,
Ferenc




On Thursday, March 14th, 2024 at 08:31, Jane Chan  wrote:

> 
> 
> Hi Yubin,
> 
> Thanks for leading the discussion. I'm +1 for the FLIP.
> 
> As Jark said, it's a good opportunity to enhance the syntax for Catalog
> from a more comprehensive perspective. So, I suggest expanding the scope of
> this FLIP by focusing on the mechanism instead of one use case to enhance
> the overall functionality. WDYT?
> 
> Best,
> Jane
> 
> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan ruanhang1...@gmail.com wrote:
> 
> > Hi, Yubin.
> > 
> > Thanks for the FLIP. +1 for it.
> > 
> > Best,
> > Hang
> > 
> > Yubin Li lyb5...@gmail.com 于2024年3月14日周四 10:15写道:
> > 
> > > Hi Jingsong, Feng, and Jeyhun
> > > 
> > > Thanks for your support and feedback!
> > > 
> > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > CatalogManager instead of directly exposing CatalogStore?
> > > 
> > > Good point. Besides the audit tracking issue, the proposed feature
> > > only requires the `getCatalogDescriptor()` function. Exposing components
> > > with excessive functionality would bring unnecessary risks; I have made
> > > modifications in the FLIP doc [1]. Thanks, Feng :)
> > > 
> > > > Showing the SQL parser implementation in the FLIP for the SQL syntax
> > > > might be a bit confusing. Also, the formal definition is missing for
> > > > this SQL clause.
> > > 
> > > Thanks Jeyhun for pointing it out :) I have updated the doc [1].
> > > 
> > > [1]
> > 
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> > 
> > > Best,
> > > Yubin
> > > 
> > > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov je.kari...@gmail.com
> > > wrote:
> > > 
> > > > Hi Yubin,
> > > > 
> > > > Thanks for the proposal. +1 for it.
> > > > I have one comment:
> > > > 
> > > > I would like to see the SQL syntax for the proposed statement. Showing
> > > > the
> > > > SQL parser implementation in the FLIP
> > > > for the SQL syntax might be a bit confusing. Also, the formal
> > > > definition
> > > > is
> > > > missing for this SQL clause.
> > > > Maybe something like [1] might be useful. WDYT?
> > > > 
> > > > Regards,
> > > > Jeyhun
> > > > 
> > > > [1]
> > 
> > https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
> > 
> > > > On Wed, Mar 13, 2024 at 3:28 PM Feng Jin jinfeng1...@gmail.com
> > > > wrote:
> > > > 
> > > > > Hi Yubin
> > > > > 
> > > > > Thank you for initiating this FLIP.
> > > > > 
> > > > > I have just one minor question:
> > > > > 
> > > > > I noticed that we added a new function `getCatalogStore` to expose
> > > > > CatalogStore, and it seems fine.
> > > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > > CatalogManager instead of directly exposing CatalogStore?
> > > > > By only providing the `getCatalogDescriptor()` interface, it may be
> > > > > easier
> > > > > for us to implement audit tracking in CatalogManager in the future.
> > > > > WDYT ?
> > > > > Although we have only collected some modified events at the
> > > > > moment.[1]
> > > > > 
> > > > > [1].
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
> > 
> > > > > Best,
> > > > > Feng
> > > > > 
> > > > > On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li jingsongl...@gmail.com
> > > > > wrote:
> > > > > 
> > > > > > +1 for this.
> > > > > > 
> > > > > > We are missing a series of catalog related syntaxes.
> > > > > > Especially after the introduction of catalog store. [1]
> > > > > > 
> > > > > > [1]
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > 
> > > > > > Best,
> > > > > > Jingsong
> > > > > > 
> > > > > > On Wed, Mar 13, 2024 at 5:09 PM Yubin Li lyb5...@gmail.com
> > > > > > wrote:
> > > > > > 
> > > > > > > Hi devs,
> > > > > > > 
> > > > > > > I'd like to start a discussion about FLIP-436: Introduce "SHOW
> > > > > > > CREATE
> > > > > > > CATALOG" Syntax [1].
> > > > > > > 
> > > > > > > At present, the `SHOW CREATE TABLE` statement provides strong
> > > > > > > support
> > > > > > > for
> > > > > > > users to easily
> > > > > > > reuse created tables. However, despite the increasing importance
> > > > > > > of the
> > > > > > > `Catalog` in user's
> > > > > > > business, there is no similar statement for users to use.
> > > > > > > 
> > > > > > > According to the online discussion in FLINK-24939 [2] with Jark
> > > > > > > Wu
> > > > > > > and
> > > > > > > Feng
> > > > > > > Jin, since `CatalogStore`
> > > > > > > has been introduced in FLIP-295 [3], we could use this component
> > > > > > > to
> > > > > > > implement such a long-awaited
> > > > > > 

Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Benchao Li
+1 for the FLIP, thanks Yubin for driving it.

Also +1 to complete the whole story about Catalog in this FLIP as Jark
and Jane said above.

Jane Chan  于2024年3月14日周四 15:33写道:
>
> Hi Yubin,
>
> Thanks for leading the discussion. I'm +1 for the FLIP.
>
> As Jark said, it's a good opportunity to enhance the syntax for Catalog
> from a more comprehensive perspective. So, I suggest expanding the scope of
> this FLIP by focusing on the mechanism instead of one use case to enhance
> the overall functionality. WDYT?
>
> Best,
> Jane
>
> On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan  wrote:
>
> > Hi, Yubin.
> >
> > Thanks for the FLIP. +1 for it.
> >
> > Best,
> > Hang
> >
> > Yubin Li  于2024年3月14日周四 10:15写道:
> >
> > > Hi Jingsong, Feng, and Jeyhun
> > >
> > > Thanks for your support and feedback!
> > >
> > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > CatalogManager instead of directly exposing CatalogStore?
> > >
> > > Good point. Besides the audit tracking issue, the proposed feature
> > > only requires the `getCatalogDescriptor()` function. Exposing components
> > > with excessive functionality would bring unnecessary risks; I have made
> > > modifications in the FLIP doc [1]. Thanks, Feng :)
> > >
> > > > Showing the SQL parser implementation in the FLIP for the SQL syntax
> > > > might be a bit confusing. Also, the formal definition is missing for
> > > > this SQL clause.
> > >
> > > Thanks Jeyhun for pointing it out :) I have updated the doc [1].
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> > >
> > > Best,
> > > Yubin
> > >
> > >
> > > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov 
> > > wrote:
> > > >
> > > > Hi Yubin,
> > > >
> > > > Thanks for the proposal. +1 for it.
> > > > I have one comment:
> > > >
> > > > I would like to see the SQL syntax for the proposed statement.  Showing
> > > the
> > > > SQL parser implementation in the FLIP
> > > > for the SQL syntax might be a bit confusing. Also, the formal
> > definition
> > > is
> > > > missing for this SQL clause.
> > > > Maybe something like [1] might be useful. WDYT?
> > > >
> > > > Regards,
> > > > Jeyhun
> > > >
> > > > [1]
> > > >
> > >
> > https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
> > > >
> > > > On Wed, Mar 13, 2024 at 3:28 PM Feng Jin 
> > wrote:
> > > >
> > > > > Hi Yubin
> > > > >
> > > > > Thank you for initiating this FLIP.
> > > > >
> > > > > I have just one minor question:
> > > > >
> > > > > I noticed that we added a new function `getCatalogStore` to expose
> > > > > CatalogStore, and it seems fine.
> > > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > > CatalogManager instead of directly exposing CatalogStore?
> > > > > By only providing the `getCatalogDescriptor()` interface, it may be
> > > easier
> > > > > for us to implement audit tracking in CatalogManager in the future.
> > > WDYT ?
> > > > > Although we have only collected some modified events at the
> > moment.[1]
> > > > >
> > > > >
> > > > > [1].
> > > > >
> > > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
> > > > >
> > > > > Best,
> > > > > Feng
> > > > >
> > > > > On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li 
> > > > > wrote:
> > > > >
> > > > > > +1 for this.
> > > > > >
> > > > > > We are missing a series of catalog related syntaxes.
> > > > > > Especially after the introduction of catalog store. [1]
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > > > > >
> > > > > > Best,
> > > > > > Jingsong
> > > > > >
> > > > > > On Wed, Mar 13, 2024 at 5:09 PM Yubin Li 
> > wrote:
> > > > > > >
> > > > > > > Hi devs,
> > > > > > >
> > > > > > > I'd like to start a discussion about FLIP-436: Introduce "SHOW
> > > CREATE
> > > > > > > CATALOG" Syntax [1].
> > > > > > >
> > > > > > > At present, the `SHOW CREATE TABLE` statement provides strong
> > > support
> > > > > for
> > > > > > > users to easily
> > > > > > > reuse created tables. However, despite the increasing importance
> > > of the
> > > > > > > `Catalog` in user's
> > > > > > > business, there is no similar statement for users to use.
> > > > > > >
> > > > > > > According to the online discussion in FLINK-24939 [2] with Jark
> > Wu
> > > and
> > > > > > Feng
> > > > > > > Jin, since `CatalogStore`
> > > > > > > has been introduced in FLIP-295 [3], we could use this component
> > to
> > > > > > > implement such a long-awaited
> > > > > > > feature, Please refer to the document [1] for implementation
> > > details.
> > > > > > >
> > > > > > > examples as follows:
> > > > > > >
> > > > > > > Flink SQL> create catalog cat2 WITH ('type'='generic_in_memory',
> > > > > > > > 'default-database'='db');
> > > 

Re: [DISCUSS] Removing documentation on Azure Pipelines for Flink forks

2024-03-14 Thread Sergey Nuyanzin
Hi Matthias,

Thanks for driving this.
I agree, GHA seems to be working OK.

However, to be on the safe side, what if we mark it as deprecated
first, and then remove it together with dropping support for 1.17,
where GHA is not supported, IIUC?

On Thu, Mar 14, 2024 at 11:42 AM Matthias Pohl
 wrote:

> Hi everyone,
> I'm wondering whether anyone has objections against removing the Azure
> Pipelines Tutorial to "set up CI for a fork of the Flink repository" in the
> Flink wiki [1]. Flink's GitHub Actions workflow seems to work fine for forks
> (at least for 1.18+ changes). No need to guide contributors to the
> flink-mirror repository to create draft PRs. And it's not used that often,
> anyway [2].
>
> Best,
> Matthias
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-Tutorial:SettingupAzurePipelinesforaforkoftheFlinkrepository
> [2] https://github.com/flink-ci/flink-mirror/pulls?q=is%3Apr
>


-- 
Best regards,
Sergey


[DISCUSS] Removing documentation on Azure Pipelines for Flink forks

2024-03-14 Thread Matthias Pohl
Hi everyone,
I'm wondering whether anyone has objections against removing the Azure
Pipelines Tutorial to "set up CI for a fork of the Flink repository" in the
Flink wiki [1]. Flink's GitHub Actions workflow seems to work fine for forks
(at least for 1.18+ changes). No need to guide contributors to the
flink-mirror repository to create draft PRs. And it's not used that often,
anyway [2].

Best,
Matthias

[1]
https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-Tutorial:SettingupAzurePipelinesforaforkoftheFlinkrepository
[2] https://github.com/flink-ci/flink-mirror/pulls?q=is%3Apr


Re: [DISCUSS] Kubernetes Operator 1.8.0 release planning

2024-03-14 Thread Maximilian Michels
Hey everyone,

I don't see any immediate blockers, so I'm going to start the release process.

Thanks,
Max

On Tue, Feb 20, 2024 at 8:55 PM Maximilian Michels  wrote:
>
> Hey Rui, hey Ryan,
>
> Good points. Non-committers can't directly release but they can assist
> with the release. It would be great to get help from both of you in
> the release process.
>
> I'd be happy to be the release manager for the 1.8 release. As for the
> timing, I think we need to reach consensus in which form to include
> the new memory tuning. Also, considering that Gyula just merged a
> pretty big improvement / refactor of the metric collection code, we
> might want to give it another week. I would target the end of February
> to begin with the release process.
>
> Cheers,
> Max
>
> On Sun, Feb 18, 2024 at 4:48 AM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > Thanks Max and Ryan for the volunteering.
> >
> > To Ryan:
> >
> > I'm not sure whether non-flink-committers have permission to release.
> > If I remember correctly, multiple steps of the release process[1] need
> > the apache account, such as: Apache GPG key and Apache Nexus.
> >
> > If the release process needs the committer permission, feel free to
> > assist this release, thanks~
> >
> > To all:
> >
> > Max is one of the very active contributors to the
> > flink-kubernetes-operator
> > project, and he hasn't done a release before. So Max as the release
> > manager makes sense to me.
> >
> > I can assist with this release if you all don't mind. In particular,
> > Autoscaler Standalone 1.8.0 is much improved compared to 1.7.0,
> > and I can help write the related release notes. Besides, I can help
> > check and test this release.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Kubernetes+Operator+Release
> >
> > Best,
> > Rui
> >
> > On Wed, Feb 7, 2024 at 11:01 PM Ryan van Huuksloot <
> > ryan.vanhuuksl...@shopify.com> wrote:
> >
> > > I can volunteer to be a release manager. I haven't done it for
> > > Apache/Flink or the operator before so I may be a good candidate.
> > >
> > > Ryan van Huuksloot
> > > Sr. Production Engineer | Streaming Platform
> > > [image: Shopify]
> > > 
> > >
> > >
> > > On Wed, Feb 7, 2024 at 6:06 AM Maximilian Michels  wrote:
> > >
> > >> It's very considerate that you want to volunteer to be the release
> > >> manager, but given that you have already managed one release, I would
> > >> ideally like somebody else to do it. Personally, I haven't managed an
> > >> operator release, although I've done it for Flink itself in the past.
> > >> Nevertheless, it would be nice to have somebody new to the process.
> > >>
> > >> Anyone reading this who wants to try being a release manager, please
> > >> don't be afraid to volunteer. Of course we'll be able to assist. That
> > >> would also be a good opportunity for us to update the docs regarding
> > >> the release process.
> > >>
> > >> Cheers,
> > >> Max
> > >>
> > >>
> > >> On Wed, Feb 7, 2024 at 10:08 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >> >
> > >> > If the release is postponed 1-2 more weeks, I could volunteer
> > >> > as one of the release managers.
> > >> >
> > >> > Best,
> > >> > Rui
> > >> >
> > >> > On Wed, Feb 7, 2024 at 4:54 AM Gyula Fóra  wrote:
> > >> >>
> > >> >> Given the proposed timeline was a bit short / rushed I agree with Max
> > >> that
> > >> >> we could wait 1-2 more weeks to wrap up the current outstanding bigger
> > >> >> features around memory tuning and the JDBC state store.
> > >> >>
> > >> >> In the meantime it would be great to involve 1-2 new committers (or
> > >> other
> > >> >> contributors) in the operator release process so that we have some
> > >> fresh
> > >> >> eyes on the process.
> > >> >> Would anyone be interested in volunteering to help with the next
> > >> release?
> > >> >>
> > >> >> Cheers,
> > >> >> Gyula
> > >> >>
> > >> >> On Tue, Feb 6, 2024 at 4:35 PM Maximilian Michels 
> > >> wrote:
> > >> >>
> > >> >> > Thanks for starting the discussion Gyula!
> > >> >> >
> > >> >> > It comes down to how important the outstanding changes are for the
> > >> >> > release. Both the memory tuning as well as the JDBC changes probably
> > >> >> > need 1-2 weeks realistically to complete the initial spec. For the
> > >> >> > memory tuning, I would prefer merging it in the current state as an
> > >> >> > experimental feature for the release which comes disabled out of the
> > >> >> > box. The reason is that it can already be useful to users who want 
> > >> >> > to
> > >> >> > try it out; we have seen some interest in it. Then for the next
> > >> >> > release we will offer a richer feature set and might enable it by
> > >> >> > default.
> > >> >> >
> > >> >> > Cheers,
> > >> >> > Max
> > >> >> >
> > >> >> > On Tue, Feb 6, 2024 at 10:53 AM Rui Fan <1996fan...@gmail.com>
> > >> wrote:
> > >> >> > >
> > >> >> > > Thanks Gyula for driving this release!
> > >> >> > >
> > 

Re: [VOTE] FLIP-402: Extend ZooKeeper Curator configurations

2024-03-14 Thread Matthias Pohl
Nothing to add from my side. Thanks, Alex.

+1 (binding)

On Thu, Mar 7, 2024 at 4:09 PM Alex Nitavsky  wrote:

> Hi everyone,
>
> I'd like to start a vote on FLIP-402 [1]. It introduces new configuration
> options for Apache Flink's ZooKeeper integration for high availability by
> reflecting existing Apache Curator configuration options. It has been
> discussed in this thread [2].
>
> I would like to start a vote.  The vote will be open for at least 72 hours
> (until March 10th 18:00 GMT) unless there is an objection or
> insufficient votes.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations
> [2] https://lists.apache.org/thread/gqgs2jlq6bmg211gqtgdn8q5hp5v9l1z
>
> Thanks
> Alex
>


Re: [DISCUSS] FLIP Suggestion: Externalize Kudu Connector from Bahir

2024-03-14 Thread Ferenc Csaky
Hi,

Gentle ping to see if there are any other concerns or things that seem missing 
from the FLIP.

Best,
Ferenc




On Monday, March 11th, 2024 at 11:11, Ferenc Csaky  
wrote:

> 
> 
> Hi Jing,
> 
> Thank you for your comments! Updated the FLIP with reasoning on the proposed 
> release versions and included them in the headline "Release" field.
> 
> Best,
> Ferenc
> 
> 
> 
> 
> On Sunday, March 10th, 2024 at 16:59, Jing Ge j...@ververica.com.INVALID 
> wrote:
> 
> > Hi Ferenc,
> > 
> > Thanks for the proposal! +1 for it!
> > 
> > Similar to what Leonard mentioned. I would suggest:
> > 1. Use the "release" to define the release version of the Kudu connector
> > itself.
> > 2. Optionally, add one more row underneath to describe which Flink versions
> > this release will be compatible with, e.g. 1.17, 1.18. I think it makes
> > sense to support at least two last Flink releases. An example could be
> > found at [1]
> > 
> > Best regards,
> > Jing
> > 
> > [1] https://lists.apache.org/thread/jcjfy3fgpg5cdnb9noslq2c77h0gtcwp
> > 
> > On Sun, Mar 10, 2024 at 3:46 PM Yanquan Lv decq12y...@gmail.com wrote:
> > 
> > > Hi Ferenc, +1 for this FLIP.
> > > 
> > > Ferenc Csaky ferenc.cs...@pm.me.invalid wrote on Sat, Mar 9, 2024 at 01:49:
> > > 
> > > > Thank you Jeyhun, Leonard, and Hang for your comments! Let me
> > > > address them from earliest to latest.
> > > > 
> > > > > How do you plan the review process in this case (e.g. incremental
> > > > > over existing codebase or cumulative all at once) ?
> > > > 
> > > > I think incremental would be less time consuming and complex for
> > > > reviewers, so I am leaning towards that direction. I would
> > > > imagine multiple subtasks for migrating the existing code, and
> > > > updating the deprecated interfaces, so those should be separate PRs and
> > > > the release can be initiated when everything is merged.
> > > > 
> > > > > (1) About the release version, could you specify kudu connector 
> > > > > version
> > > > > instead of flink version 1.18 as external connector version is 
> > > > > different
> > > > > with flink?
> > > > > (2) About the connector config options, could you enumerate these
> > > > > options so that we can review they’re reasonable or not?
> > > > 
> > > > I added these to the FLIP, copied the current configs options as is,
> > > > PTAL.
> > > > 
> > > > > (3) Metrics is also key part of connector, could you add the supported
> > > > > connector metrics to public interface as well?
> > > > 
> > > > The current Bahir connector code does not include any metrics and I did
> > > > not plan to include them into the scope of this FLIP.
> > > > 
> > > > > I think that how to state this code originally lived in Bahir may be 
> > > > > in
> > > > > the
> > > > > FLIP.
> > > > 
> > > > I might miss your point, but the FLIP contains this: "Migrating the
> > > > current code keeping the history and noting it explicitly it was forked
> > > > from the Bahir repository [2]." Pls. share if you meant something else.
> > > > 
> > > > Best,
> > > > Ferenc
> > > > 
> > > > On Friday, March 8th, 2024 at 10:42, Hang Ruan ruanhang1...@gmail.com
> > > > wrote:
> > > > 
> > > > > Hi, Ferenc.
> > > > > 
> > > > > Thanks for the FLIP discussion. +1 for the proposal.
> > > > > I think that how to state this code originally lived in Bahir may be 
> > > > > in
> > > > > the
> > > > > FLIP.
> > > > > 
> > > > > Best,
> > > > > Hang
> > > > > 
> > > > > Leonard Xu xbjt...@gmail.com wrote on Thu, Mar 7, 2024 at 14:14:
> > > > > 
> > > > > > Thanks Ferenc for kicking off this discussion, I left some comments
> > > > > > here:
> > > > > > 
> > > > > > (1) About the release version, could you specify kudu connector
> > > > > > version
> > > > > > instead of flink version 1.18 as external connector version is
> > > > > > different
> > > > > > with flink ?
> > > > > > 
> > > > > > (2) About the connector config options, could you enumerate these
> > > > > > options
> > > > > > so that we can review they’re reasonable or not?
> > > > > > 
> > > > > > (3) Metrics is also key part of connector, could you add the
> > > > > > supported
> > > > > > connector metrics to public interface as well?
> > > > > > 
> > > > > > Best,
> > > > > > Leonard
> > > > > > 
> > > > > > > On Mar 6, 2024, at 11:23 PM, Ferenc Csaky ferenc.cs...@pm.me.INVALID wrote:
> > > > > > > 
> > > > > > > Hello devs,
> > > > > > > 
> > > > > > > Opening this thread to discuss a FLIP [1] about externalizing the
> > > > > > > Kudu
> > > > > > > connector, as recently
> > > > > > > the Apache Bahir project were moved to the attic [2]. Some details
> > > > > > > were
> > > > > > > discussed already
> > > > > > > in another thread [3]. I am proposing to externalize this 
> > > > > > > connector
> > > > > > > and
> > > > > > > keep it maintainable,
> > > > > > > and up to date.
> > > > > > > 
> > > > > > > Best regards,
> > > > > > > Ferenc
> > > > > > > 
> > > > > > > [1]
> > > 
> > > 

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Lincoln Lee
Hi Matthias,

Thanks for the update!
And once again, thank you for continuously tracking all unstable cases and
driving the resolution!

Best,
Lincoln Lee


Lincoln Lee  wrote on Thu, Mar 14, 2024 at 18:00:

> Hi Jane,
>
> Thank you for raising this question. I saw the discussion in the Jira
> (include Matthias' point)
> and sought advice from several PMCs (including the previous RMs), the
> majority of people
> are in favor of merging the bugfix into the release branch even during the
> release candidate
> (RC) voting period, so we should accept all bugfixes (unless there is a
> specific community
> rule preventing it).
>
> Thanks again for contributing to the community!
>
> Best,
> Lincoln Lee
>
>
> Matthias Pohl  wrote on Thu, Mar 14, 2024 at 17:50:
>
>> Update on FLINK-34227 [1] which I mentioned above: Chesnay helped identify
>> a concurrency issue in the JobMaster shutdown logic which seems to be in
>> the code for quite some time. I created a PR fixing the issue hoping that
>> the test instability is resolved with it.
>>
>> The concurrency issue doesn't really explain why it only started to appear
>> recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
>> hint in the git history indicating that it's caused by some newly
>> introduced change. That is why I wouldn't make FLINK-34227 a reason to
>> cancel rc2. Instead, the fix can be provided in subsequent patch releases.
>>
>> Matthias
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-34227
>>
>> On Thu, Mar 14, 2024 at 8:49 AM Jane Chan  wrote:
>>
>> > Hi Yun, Jing, Martijn and Lincoln,
>> >
>> > I'm seeking guidance on whether merging the bugfix[1][2] at this stage
>> is
>> > appropriate. I want to ensure that the actions align with the current
>> > release process and do not disrupt the ongoing preparations.
>> >
>> > [1] https://issues.apache.org/jira/browse/FLINK-29114
>> > [2] https://github.com/apache/flink/pull/24492
>> >
>> > Best,
>> > Jane
>> >
>> > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang  wrote:
>> >
>> > > +1 (non-binding)
>> > >
>> > >
>> > >   *
>> > > Verified the signature and checksum.
>> > >   *
>> > > Reviewed the release note PR
>> > >   *
>> > > Reviewed the web announcement PR
>> > >   *
>> > > Start a standalone cluster to submit the state machine example, which
>> > > works well.
>> > >   *
>> > > Checked the pre-built jars are generated via JDK8
>> > >   *
>> > > Verified the process profiler works well after setting
>> > > rest.profiling.enabled: true
>> > >
>> > > Best
>> > > Yun Tang
>> > >
>> > > 
>> > > From: Qingsheng Ren 
>> > > Sent: Wednesday, March 13, 2024 12:45
>> > > To: dev@flink.apache.org 
>> > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
>> > >
>> > > +1 (binding)
>> > >
>> > > - Verified signature and checksum
>> > > - Verified no binary in source
>> > > - Built from source
>> > > - Tested reading and writing Kafka with SQL client and Kafka connector
>> > > 3.1.0
>> > > - Verified source code tag
>> > > - Reviewed release note
>> > > - Reviewed web PR
>> > >
>> > > Thanks to all release managers and contributors for the awesome work!
>> > >
>> > > Best,
>> > > Qingsheng
>> > >
>> > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
>> > >  wrote:
>> > >
>> > > > I want to share an update on FLINK-34227 [1]: It's still not clear
>> > what's
>> > > > causing the test instability. So far, we agreed in today's release
>> sync
>> > > [2]
>> > > > that it's not considered a blocker because it is observed in 1.18
>> > nightly
>> > > > builds and it only appears in the GitHub Actions workflow. But I
>> still
>> > > have
>> > > > a bit of a concern that this is something that was introduced in
>> 1.19
>> > and
>> > > > backported to 1.18 after the 1.18.1 release (because the test
>> > instability
>> > > > started to appear more regularly in March; with one occurrence in
>> > > January).
>> > > > Additionally, I have no reason to believe, yet, that the
>> instability is
>> > > > caused by some GHA-related infrastructure issue.
>> > > >
>> > > > So, if someone else has some capacity to help looking into it; that
>> > would
>> > > > be appreciated. I will continue my investigation tomorrow.
>> > > >
>> > > > Best,
>> > > > Matthias
>> > > >
>> > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
>> > > > [2]
>> > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/1.19+Release#id-1.19Release-03/12/2024
>> > > >
>> > > > On Tue, Mar 12, 2024 at 12:50 PM Benchao Li 
>> > > wrote:
>> > > >
>> > > > > +1 (non-binding)
>> > > > >
>> > > > > - checked signature and checksum: OK
> > > > > > - checked copyright year in notice file: OK
> > > > > > - diffed source distribution with tag, made sure there are no
> > > > > > unexpected files: OK
> > > > > > - built from source: OK
> > > > > > - started a local cluster, played with jdbc connector: OK
>> > > > >
> > > > > > weijie guo  wrote on Tue, Mar 12, 2024 at 16:55:
>> > > > > >
>> > > > > > +1 (non-binding)
>> > > > > >
>> 
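Several of the vote checklists in this thread list "Verified signature and checksum" as a step. Below is a minimal sketch of the checksum half of that check; the artifact name is a stand-in, and real votes run the equivalent `sha512sum -c` (plus `gpg --verify` for the signature) against the files downloaded from the RC staging area.

```python
import hashlib
import pathlib

# Stand-in for a release artifact; voters would use the real files
# downloaded from the release candidate staging area.
artifact = pathlib.Path("flink-1.19.0-src.tgz")  # hypothetical name
artifact.write_bytes(b"stand-in release payload")

# A .sha512 file as published next to release artifacts: "<digest>  <name>".
sha_file = pathlib.Path(artifact.name + ".sha512")
digest = hashlib.sha512(artifact.read_bytes()).hexdigest()
sha_file.write_text(f"{digest}  {artifact.name}\n")

# Verification: recompute the digest and compare, mirroring what
# `sha512sum -c flink-1.19.0-src.tgz.sha512` does.
expected, name = sha_file.read_text().split()
recomputed = hashlib.sha512(pathlib.Path(name).read_bytes()).hexdigest()
print("OK" if recomputed == expected else "MISMATCH")  # prints: OK
```

The signature half additionally requires importing the release manager's public key (from the project's KEYS file) before running `gpg --verify`.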

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Lincoln Lee
Hi Jane,

Thank you for raising this question. I saw the discussion in the Jira
(include Matthias' point)
and sought advice from several PMCs (including the previous RMs), the
majority of people
are in favor of merging the bugfix into the release branch even during the
release candidate
(RC) voting period, so we should accept all bugfixes (unless there is a
specific community
rule preventing it).

Thanks again for contributing to the community!

Best,
Lincoln Lee


Matthias Pohl  wrote on Thu, Mar 14, 2024 at 17:50:

> Update on FLINK-34227 [1] which I mentioned above: Chesnay helped identify
> a concurrency issue in the JobMaster shutdown logic which seems to be in
> the code for quite some time. I created a PR fixing the issue hoping that
> the test instability is resolved with it.
>
> The concurrency issue doesn't really explain why it only started to appear
> recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
> hint in the git history indicating that it's caused by some newly
> introduced change. That is why I wouldn't make FLINK-34227 a reason to
> cancel rc2. Instead, the fix can be provided in subsequent patch releases.
>
> Matthias
>
> [1] https://issues.apache.org/jira/browse/FLINK-34227
>
> On Thu, Mar 14, 2024 at 8:49 AM Jane Chan  wrote:
>
> > Hi Yun, Jing, Martijn and Lincoln,
> >
> > I'm seeking guidance on whether merging the bugfix[1][2] at this stage is
> > appropriate. I want to ensure that the actions align with the current
> > release process and do not disrupt the ongoing preparations.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > [2] https://github.com/apache/flink/pull/24492
> >
> > Best,
> > Jane
> >
> > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang  wrote:
> >
> > > +1 (non-binding)
> > >
> > >
> > >   *
> > > Verified the signature and checksum.
> > >   *
> > > Reviewed the release note PR
> > >   *
> > > Reviewed the web announcement PR
> > >   *
> > > Start a standalone cluster to submit the state machine example, which
> > > works well.
> > >   *
> > > Checked the pre-built jars are generated via JDK8
> > >   *
> > > Verified the process profiler works well after setting
> > > rest.profiling.enabled: true
> > >
> > > Best
> > > Yun Tang
> > >
> > > 
> > > From: Qingsheng Ren 
> > > Sent: Wednesday, March 13, 2024 12:45
> > > To: dev@flink.apache.org 
> > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> > >
> > > +1 (binding)
> > >
> > > - Verified signature and checksum
> > > - Verified no binary in source
> > > - Built from source
> > > - Tested reading and writing Kafka with SQL client and Kafka connector
> > > 3.1.0
> > > - Verified source code tag
> > > - Reviewed release note
> > > - Reviewed web PR
> > >
> > > Thanks to all release managers and contributors for the awesome work!
> > >
> > > Best,
> > > Qingsheng
> > >
> > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
> > >  wrote:
> > >
> > > > I want to share an update on FLINK-34227 [1]: It's still not clear
> > what's
> > > > causing the test instability. So far, we agreed in today's release
> sync
> > > [2]
> > > > that it's not considered a blocker because it is observed in 1.18
> > nightly
> > > > builds and it only appears in the GitHub Actions workflow. But I
> still
> > > have
> > > > a bit of a concern that this is something that was introduced in 1.19
> > and
> > > > backported to 1.18 after the 1.18.1 release (because the test
> > instability
> > > > started to appear more regularly in March; with one occurrence in
> > > January).
> > > > Additionally, I have no reason to believe, yet, that the instability
> is
> > > > caused by some GHA-related infrastructure issue.
> > > >
> > > > So, if someone else has some capacity to help looking into it; that
> > would
> > > > be appreciated. I will continue my investigation tomorrow.
> > > >
> > > > Best,
> > > > Matthias
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > > [2]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/1.19+Release#id-1.19Release-03/12/2024
> > > >
> > > > On Tue, Mar 12, 2024 at 12:50 PM Benchao Li 
> > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > - checked signature and checksum: OK
> > > > > - checked copyright year in notice file: OK
> > > > > - diffed source distribution with tag, made sure there are no
> > > > > unexpected files: OK
> > > > > - built from source: OK
> > > > > - started a local cluster, played with jdbc connector: OK
> > > > >
> > > > > weijie guo  wrote on Tue, Mar 12, 2024 at 16:55:
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > - Verified signature and checksum
> > > > > > - Verified source distribution does not contain binaries
> > > > > > - Built from source code and submitted a word-count job successfully
> > > > > >
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Weijie
> > > > > >
> > > > > >
> > > > > > Jane Chan  wrote on Tue, Mar 12, 2024 at 16:38:
> > > > > >
> > > > > 

[jira] [Created] (FLINK-34668) Report State handle of file merging directory to JM

2024-03-14 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-34668:
---

 Summary: Report State handle of file merging directory to JM
 Key: FLINK-34668
 URL: https://issues.apache.org/jira/browse/FLINK-34668
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Reporter: Zakelly Lan
Assignee: Yanfei Lei
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Matthias Pohl
Update on FLINK-34227 [1] which I mentioned above: Chesnay helped identify
a concurrency issue in the JobMaster shutdown logic which seems to be in
the code for quite some time. I created a PR fixing the issue hoping that
the test instability is resolved with it.

The concurrency issue doesn't really explain why it only started to appear
recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
hint in the git history indicating that it's caused by some newly
introduced change. That is why I wouldn't make FLINK-34227 a reason to
cancel rc2. Instead, the fix can be provided in subsequent patch releases.

Matthias

[1] https://issues.apache.org/jira/browse/FLINK-34227

On Thu, Mar 14, 2024 at 8:49 AM Jane Chan  wrote:

> Hi Yun, Jing, Martijn and Lincoln,
>
> I'm seeking guidance on whether merging the bugfix[1][2] at this stage is
> appropriate. I want to ensure that the actions align with the current
> release process and do not disrupt the ongoing preparations.
>
> [1] https://issues.apache.org/jira/browse/FLINK-29114
> [2] https://github.com/apache/flink/pull/24492
>
> Best,
> Jane
>
> On Thu, Mar 14, 2024 at 1:33 PM Yun Tang  wrote:
>
> > +1 (non-binding)
> >
> >
> >   *
> > Verified the signature and checksum.
> >   *
> > Reviewed the release note PR
> >   *
> > Reviewed the web announcement PR
> >   *
> > Start a standalone cluster to submit the state machine example, which
> > works well.
> >   *
> > Checked the pre-built jars are generated via JDK8
> >   *
> > Verified the process profiler works well after setting
> > rest.profiling.enabled: true
> >
> > Best
> > Yun Tang
> >
> > 
> > From: Qingsheng Ren 
> > Sent: Wednesday, March 13, 2024 12:45
> > To: dev@flink.apache.org 
> > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> >
> > +1 (binding)
> >
> > - Verified signature and checksum
> > - Verified no binary in source
> > - Built from source
> > - Tested reading and writing Kafka with SQL client and Kafka connector
> > 3.1.0
> > - Verified source code tag
> > - Reviewed release note
> > - Reviewed web PR
> >
> > Thanks to all release managers and contributors for the awesome work!
> >
> > Best,
> > Qingsheng
> >
> > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
> >  wrote:
> >
> > > I want to share an update on FLINK-34227 [1]: It's still not clear
> what's
> > > causing the test instability. So far, we agreed in today's release sync
> > [2]
> > > that it's not considered a blocker because it is observed in 1.18
> nightly
> > > builds and it only appears in the GitHub Actions workflow. But I still
> > have
> > > a bit of a concern that this is something that was introduced in 1.19
> and
> > > backported to 1.18 after the 1.18.1 release (because the test
> instability
> > > started to appear more regularly in March; with one occurrence in
> > January).
> > > Additionally, I have no reason to believe, yet, that the instability is
> > > caused by some GHA-related infrastructure issue.
> > >
> > > So, if someone else has some capacity to help looking into it; that
> would
> > > be appreciated. I will continue my investigation tomorrow.
> > >
> > > Best,
> > > Matthias
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/1.19+Release#id-1.19Release-03/12/2024
> > >
> > > On Tue, Mar 12, 2024 at 12:50 PM Benchao Li 
> > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > - checked signature and checksum: OK
> > > > - checked copyright year in notice file: OK
> > > > - diffed source distribution with tag, made sure there are no
> > > > unexpected files: OK
> > > > - built from source: OK
> > > > - started a local cluster, played with jdbc connector: OK
> > > >
> > > > > weijie guo  wrote on Tue, Mar 12, 2024 at 16:55:
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > - Verified signature and checksum
> > > > > - Verified source distribution does not contain binaries
> > > > > - Built from source code and submitted a word-count job successfully
> > > > >
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Weijie
> > > > >
> > > > >
> > > > > Jane Chan  wrote on Tue, Mar 12, 2024 at 16:38:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > - Verify that the source distributions do not contain any
> binaries;
> > > > > > - Build the source distribution to ensure all source files have
> > > Apache
> > > > > > headers;
> > > > > > - Verify checksum and GPG signatures;
> > > > > >
> > > > > > Best,
> > > > > > Jane
> > > > > >
> > > > > > On Tue, Mar 12, 2024 at 4:08 PM Xuannan Su <
> suxuanna...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > - Verified signature and checksum
> > > > > > > - Verified that source distribution does not contain binaries
> > > > > > > - Built from source code successfully
> > > > > > > - Reviewed the release announcement PR
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Xuannan
> > > 

[jira] [Created] (FLINK-34667) Changelog state backend support local rescaling

2024-03-14 Thread Yanfei Lei (Jira)
Yanfei Lei created FLINK-34667:
--

 Summary: Changelog state backend support local rescaling
 Key: FLINK-34667
 URL: https://issues.apache.org/jira/browse/FLINK-34667
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends
Affects Versions: 1.20.0
Reporter: Yanfei Lei


FLINK-33341 uses the available local keyed state for rescaling, which causes the 
changelog state backend to incorrectly treat part of the local state as the 
complete local state.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34666) Keep assigned splits in order to fix wrong meta group calculation

2024-03-14 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-34666:
-

 Summary: Keep assigned splits in order to fix wrong meta group 
calculation
 Key: FLINK-34666
 URL: https://issues.apache.org/jira/browse/FLINK-34666
 Project: Flink
  Issue Type: Improvement
Reporter: Hongshun Wang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Jane Chan
Hi Yun, Jing, Martijn and Lincoln,

I'm seeking guidance on whether merging the bugfix[1][2] at this stage is
appropriate. I want to ensure that the actions align with the current
release process and do not disrupt the ongoing preparations.

[1] https://issues.apache.org/jira/browse/FLINK-29114
[2] https://github.com/apache/flink/pull/24492

Best,
Jane

On Thu, Mar 14, 2024 at 1:33 PM Yun Tang  wrote:

> +1 (non-binding)
>
>
>   *
> Verified the signature and checksum.
>   *
> Reviewed the release note PR
>   *
> Reviewed the web announcement PR
>   *
> Start a standalone cluster to submit the state machine example, which
> works well.
>   *
> Checked the pre-built jars are generated via JDK8
>   *
> Verified the process profiler works well after setting
> rest.profiling.enabled: true
>
> Best
> Yun Tang
>
> 
> From: Qingsheng Ren 
> Sent: Wednesday, March 13, 2024 12:45
> To: dev@flink.apache.org 
> Subject: Re: [VOTE] Release 1.19.0, release candidate #2
>
> +1 (binding)
>
> - Verified signature and checksum
> - Verified no binary in source
> - Built from source
> - Tested reading and writing Kafka with SQL client and Kafka connector
> 3.1.0
> - Verified source code tag
> - Reviewed release note
> - Reviewed web PR
>
> Thanks to all release managers and contributors for the awesome work!
>
> Best,
> Qingsheng
>
> On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
>  wrote:
>
> > I want to share an update on FLINK-34227 [1]: It's still not clear what's
> > causing the test instability. So far, we agreed in today's release sync
> [2]
> > that it's not considered a blocker because it is observed in 1.18 nightly
> > builds and it only appears in the GitHub Actions workflow. But I still
> have
> > a bit of a concern that this is something that was introduced in 1.19 and
> > backported to 1.18 after the 1.18.1 release (because the test instability
> > started to appear more regularly in March; with one occurrence in
> January).
> > Additionally, I have no reason to believe, yet, that the instability is
> > caused by some GHA-related infrastructure issue.
> >
> > So, if someone else has some capacity to help looking into it; that would
> > be appreciated. I will continue my investigation tomorrow.
> >
> > Best,
> > Matthias
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/1.19+Release#id-1.19Release-03/12/2024
> >
> > On Tue, Mar 12, 2024 at 12:50 PM Benchao Li 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > - checked signature and checksum: OK
> > > - checked copyright year in notice file: OK
> > > - diffed source distribution with tag, made sure there are no
> > > unexpected files: OK
> > > - built from source: OK
> > > - started a local cluster, played with jdbc connector: OK
> > >
> > > weijie guo  wrote on Tue, Mar 12, 2024 at 16:55:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > - Verified signature and checksum
> > > > - Verified source distribution does not contain binaries
> > > > - Built from source code and submitted a word-count job successfully
> > > >
> > > >
> > > > Best regards,
> > > >
> > > > Weijie
> > > >
> > > >
> > > > Jane Chan  wrote on Tue, Mar 12, 2024 at 16:38:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > - Verify that the source distributions do not contain any binaries;
> > > > > - Build the source distribution to ensure all source files have
> > Apache
> > > > > headers;
> > > > > - Verify checksum and GPG signatures;
> > > > >
> > > > > Best,
> > > > > Jane
> > > > >
> > > > > On Tue, Mar 12, 2024 at 4:08 PM Xuannan Su 
> > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > - Verified signature and checksum
> > > > > > - Verified that source distribution does not contain binaries
> > > > > > - Built from source code successfully
> > > > > > - Reviewed the release announcement PR
> > > > > >
> > > > > > Best regards,
> > > > > > Xuannan
> > > > > >
> > > > > > On Tue, Mar 12, 2024 at 2:18 PM Hang Ruan <
> ruanhang1...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > - Verified signatures and checksums
> > > > > > > - Verified that source does not contain binaries
> > > > > > > - Built source code successfully
> > > > > > > - Reviewed the release note and left a comment
> > > > > > >
> > > > > > > Best,
> > > > > > > Hang
> > > > > > >
> > > > > > > Feng Jin  wrote on Tue, Mar 12, 2024 at 11:23:
> > > > > > >
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > - Verified signatures and checksums
> > > > > > > > - Verified that source does not contain binaries
> > > > > > > > - Built source code successfully
> > > > > > > > - Ran a simple SQL query successfully
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Feng Jin
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Mar 12, 2024 at 11:09 AM Ron liu  >
> > > wrote:
> > > > > > > >
> > > > > > > > > +1 (non binding)
> > > > > > > > >
> > > > > > > > > quickly 

[jira] [Created] (FLINK-34665) Add streaming rule to convert Union to Rank so that it is finally converted to StreamExecDeduplicate

2024-03-14 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34665:
-

 Summary: Add streaming rule to convert Union to Rank so that it is 
finally converted to StreamExecDeduplicate
 Key: FLINK-34665
 URL: https://issues.apache.org/jira/browse/FLINK-34665
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0


The semantics of a union in SQL involves deduplication, and in Calcite, when 
converting a SQL node to a RelNode, a Distinct Aggregate is inserted above the 
Union to achieve this deduplication. In Flink, the Distinct Aggregate 
eventually gets converted into a StreamExecGroupAggregate operator. This 
operator accesses the state multiple times, and from our observations of 
numerous jobs, we can see that the stack often gets stuck at state access. This 
is because the key for the distinct aggregate is all the fields of the union, 
meaning that for the state, the key will be relatively large, and repeated 
access and comparisons to the state can be time-consuming.

In fact, a potential optimization is to add a rule to convert the Union into a 
Rank with processing time, which then ultimately gets converted into a 
StreamExecDeduplicate. Currently, we have users rewrite their SQL to use 
Row_number for deduplication, and this approach works very well. Therefore, it 
is possible to add a rule at the engine level to support this optimization.

 

This will change the plan, however, which may cause users' Flink version 
upgrades to fail. So I suggest adding a flag whose default value does not 
change the current behavior.
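The rewrite described above, expressing UNION's implicit deduplication as a ROW_NUMBER()-based dedup over UNION ALL (the shape that maps to StreamExecDeduplicate), can be illustrated in standard SQL. The snippet below uses SQLite through Python rather than Flink, purely to show the equivalence the proposed rule relies on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a(k INTEGER, v TEXT);
    CREATE TABLE b(k INTEGER, v TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y');
    INSERT INTO b VALUES (2, 'y'), (3, 'z');  -- (2, 'y') duplicates a row in a
""")

# UNION deduplicates implicitly (a distinct aggregate in the plan).
union_rows = conn.execute(
    "SELECT k, v FROM a UNION SELECT k, v FROM b ORDER BY k"
).fetchall()

# Equivalent ROW_NUMBER()-based dedup over UNION ALL, keeping one row per
# distinct (k, v) -- the pattern users currently rewrite to by hand.
dedup_rows = conn.execute("""
    SELECT k, v FROM (
        SELECT k, v,
               ROW_NUMBER() OVER (PARTITION BY k, v ORDER BY k) AS rn
        FROM (SELECT k, v FROM a UNION ALL SELECT k, v FROM b)
    ) WHERE rn = 1
    ORDER BY k
""").fetchall()

assert union_rows == dedup_rows == [(1, 'x'), (2, 'y'), (3, 'z')]
```

In Flink, the dedup's ORDER BY would be a processing-time attribute as the description notes; `ORDER BY k` here is only a stand-in for the ordering column.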



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-14 Thread Jane Chan
Hi Yubin,

Thanks for leading the discussion. I'm +1 for the FLIP.

As Jark said, it's a good opportunity to enhance the syntax for Catalog
from a more comprehensive perspective. So, I suggest expanding the scope of
this FLIP by focusing on the mechanism instead of one use case to enhance
the overall functionality. WDYT?

Best,
Jane
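
Feng's point in the quoted discussion below, exposing a narrow `getCatalogDescriptor()` on CatalogManager rather than the CatalogStore itself, is essentially an encapsulation argument. A simplified sketch of that shape (all classes here are stand-ins, not Flink's actual API):

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class CatalogDescriptor:          # stand-in for Flink's CatalogDescriptor
    name: str
    options: Dict[str, str]

class CatalogStore:               # storage component, kept private
    def __init__(self) -> None:
        self._catalogs: Dict[str, CatalogDescriptor] = {}

    def save(self, d: CatalogDescriptor) -> None:
        self._catalogs[d.name] = d

    def get(self, name: str) -> Optional[CatalogDescriptor]:
        return self._catalogs.get(name)

class CatalogManager:
    def __init__(self, store: CatalogStore) -> None:
        self._store = store       # never handed out directly

    def get_catalog_descriptor(self, name: str) -> Optional[CatalogDescriptor]:
        # A single choke point: audit logging / modification listeners
        # can be added here later without touching any caller.
        return self._store.get(name)

store = CatalogStore()
store.save(CatalogDescriptor("cat2", {"type": "generic_in_memory"}))
mgr = CatalogManager(store)
desc = mgr.get_catalog_descriptor("cat2")
assert desc is not None and desc.options["type"] == "generic_in_memory"
```

A SHOW CREATE CATALOG handler would then reconstruct the DDL from the returned descriptor's options alone, never from the mutable store.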

On Thu, Mar 14, 2024 at 11:38 AM Hang Ruan  wrote:

> Hi, Yubin.
>
> Thanks for the FLIP. +1 for it.
>
> Best,
> Hang
>
> Yubin Li  wrote on Thu, Mar 14, 2024 at 10:15:
>
> > Hi Jingsong, Feng, and Jeyhun
> >
> > Thanks for your support and feedback!
> >
> > > However, could we add a new method `getCatalogDescriptor()` to
> > > CatalogManager instead of directly exposing CatalogStore?
> >
> > Good point, Besides the audit tracking issue, The proposed feature
> > only requires `getCatalogDescriptor()` function. Exposing components
> > with excessive functionality will bring unnecessary risks, I have made
> > modifications in the FLIP doc [1]. Thank Feng :)
> >
> > > Showing the SQL parser implementation in the FLIP for the SQL syntax
> > > might be a bit confusing. Also, the formal definition is missing for
> > > this SQL clause.
> >
> > Thank Jeyhun for pointing it out :) I have updated the doc [1] .
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=296290756
> >
> > Best,
> > Yubin
> >
> >
> > On Thu, Mar 14, 2024 at 2:18 AM Jeyhun Karimov 
> > wrote:
> > >
> > > Hi Yubin,
> > >
> > > Thanks for the proposal. +1 for it.
> > > I have one comment:
> > >
> > > I would like to see the SQL syntax for the proposed statement.  Showing
> > the
> > > SQL parser implementation in the FLIP
> > > for the SQL syntax might be a bit confusing. Also, the formal
> definition
> > is
> > > missing for this SQL clause.
> > > Maybe something like [1] might be useful. WDYT?
> > >
> > > Regards,
> > > Jeyhun
> > >
> > > [1]
> > >
> >
> https://github.com/apache/flink/blob/0da60ca1a4754f858cf7c52dd4f0c97ae0e1b0cb/docs/content/docs/dev/table/sql/show.md?plain=1#L620-L632
> > >
> > > On Wed, Mar 13, 2024 at 3:28 PM Feng Jin 
> wrote:
> > >
> > > > Hi Yubin
> > > >
> > > > Thank you for initiating this FLIP.
> > > >
> > > > I have just one minor question:
> > > >
> > > > I noticed that we added a new function `getCatalogStore` to expose
> > > > CatalogStore, and it seems fine.
> > > > However, could we add a new method `getCatalogDescriptor()` to
> > > > CatalogManager instead of directly exposing CatalogStore?
> > > > By only providing the `getCatalogDescriptor()` interface, it may be
> > easier
> > > > for us to implement audit tracking in CatalogManager in the future.
> > WDYT ?
> > > > Although we have only collected some modified events at the
> moment.[1]
> > > >
> > > >
> > > > [1].
> > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener
> > > >
> > > > Best,
> > > > Feng
> > > >
> > > > On Wed, Mar 13, 2024 at 5:31 PM Jingsong Li 
> > > > wrote:
> > > >
> > > > > +1 for this.
> > > > >
> > > > > We are missing a series of catalog related syntaxes.
> > > > > Especially after the introduction of catalog store. [1]
> > > > >
> > > > > [1]
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > > > >
> > > > > Best,
> > > > > Jingsong
> > > > >
> > > > > On Wed, Mar 13, 2024 at 5:09 PM Yubin Li 
> wrote:
> > > > > >
> > > > > > Hi devs,
> > > > > >
> > > > > > I'd like to start a discussion about FLIP-436: Introduce "SHOW
> > CREATE
> > > > > > CATALOG" Syntax [1].
> > > > > >
> > > > > > At present, the `SHOW CREATE TABLE` statement provides strong
> > support
> > > > for
> > > > > > users to easily
> > > > > > reuse created tables. However, despite the increasing importance
> > of the
> > > > > > `Catalog` in user's
> > > > > > business, there is no similar statement for users to use.
> > > > > >
> > > > > > According to the online discussion in FLINK-24939 [2] with Jark
> Wu
> > and
> > > > > Feng
> > > > > > Jin, since `CatalogStore`
> > > > > > has been introduced in FLIP-295 [3], we could use this component
> to
> > > > > > implement such a long-awaited
> > > > > > feature, Please refer to the document [1] for implementation
> > details.
> > > > > >
> > > > > > examples as follows:
> > > > > >
> > > > > > Flink SQL> create catalog cat2 WITH ('type'='generic_in_memory',
> > > > > > > 'default-database'='db');
> > > > > > > [INFO] Execute statement succeeded.
> > > > > > > Flink SQL> show create catalog cat2;
> > > > > > >
> > > > > > >
> > > > >
> > > >
> >
> ++
> > > > > > > | result |
> > > > > > >
> > > > > > >
> > > > >
> > > >
> >
> ++
> > > > > > > | CREATE CATALOG `cat2` WITH (
> > > > > > >   

Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-03-14 Thread Jane Chan
Hi Jeyhun,

Thanks for your clarification.

> Once a new partition is detected, we add it to our existing mapping. Our
mapping looks like Map<Integer, Set<Partition>> subtaskToPartitionAssignment,
where it maps each source subtaskID to zero or more partitions.

I understand your point.  **It would be better if you could sync the
content to the FLIP**.
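
As a rough sketch of the assignment bookkeeping described above (the class and method names here are illustrative assumptions, not the actual FLIP-434 implementation):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: a split assigner that keeps a
// subtaskID -> partitions mapping and extends it when a new
// partition shows up among newly discovered splits.
class PartitionAwareAssigner {
    private final Map<Integer, Set<String>> subtaskToPartitionAssignment = new HashMap<>();
    private final int parallelism;

    PartitionAwareAssigner(int parallelism) {
        this.parallelism = parallelism;
    }

    /** Returns the owning subtask for a partition, assigning a new one if unseen. */
    int assign(String partition) {
        for (Map.Entry<Integer, Set<String>> e : subtaskToPartitionAssignment.entrySet()) {
            if (e.getValue().contains(partition)) {
                return e.getKey(); // existing partition keeps its subtask
            }
        }
        // new partition detected: pick a subtask, e.g. by hashing the partition key
        int subtask = Math.floorMod(partition.hashCode(), parallelism);
        subtaskToPartitionAssignment
                .computeIfAbsent(subtask, k -> new HashSet<>())
                .add(partition);
        return subtask;
    }
}
```

A partition discovered at runtime simply gains an entry in the map, while splits of already-known partitions stay on their subtask.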

Another thing is I'm curious about what the physical plan looks like. Is
there any specific info that will be added to the table source (like
filter/project pushdown)? It would be great if you could attach an example
to the FLIP.

Bests,
Jane

On Wed, Mar 13, 2024 at 9:11 PM Jeyhun Karimov  wrote:

> Hi Jane,
>
> Thanks for your comments.
>
>
> 1. Concerning the `sourcePartitions()` method, the partition information
> > returned during the optimization phase may not be the same as the
> partition
> > information during runtime execution. For long-running jobs, partitions
> may
> > be continuously created. Is this FLIP equipped to handle such scenarios?
>
>
> - Good point. This scenario is definitely supported.
> Once a new partition is added, or in general, new splits are
> discovered,
> PartitionAwareSplitAssigner::addSplits(Collection<SplitT> newSplits)
> method will be called. Inside that method, we are able to detect if a split
> belongs to existing partitions or there is a new partition.
> Once a new partition is detected, we add it to our existing mapping. Our
> mapping looks like Map<Integer, Set<Partition>> subtaskToPartitionAssignment,
> where
> it maps each source subtaskID to zero or more partitions.
>
> 2. Regarding the `RemoveRedundantShuffleRule` optimization rule, I
> > understand that it is also necessary to verify whether the hash key
> within
> > the Exchange node is consistent with the partition key defined in the
> table
> > source that implements `SupportsPartitioning`.
>
>
> - Yes, I overlooked that point; fixed. Actually, the rule is much more
> complicated. I tried to simplify it in the FLIP. Good point.
>
>
> 3. Could you elaborate on the desired physical plan and integration with
> > `CompiledPlan` to enhance the overall functionality?
>
>
> - For compiled plan, PartitioningSpec will be used, with a json tag
> "Partitioning". As a result, in the compiled plan, the source operator will
> have
> "abilities" : [ { "type" : "Partitioning" } ] as part of the compiled plan.
> More about the implementation details below:
>
> 
> PartitioningSpec class
> 
> @JsonTypeName("Partitioning")
> public final class PartitioningSpec extends SourceAbilitySpecBase {
>  // some code here
> @Override
> public void apply(DynamicTableSource tableSource, SourceAbilityContext
> context) {
> if (tableSource instanceof SupportsPartitioning) {
> ((SupportsPartitioning) tableSource).applyPartitionedRead();
> } else {
> throw new TableException(
> String.format(
> "%s does not support SupportsPartitioning.",
> tableSource.getClass().getName()));
> }
> }
>   // some code here
> }
>
> 
> SourceAbilitySpec class
> 
> @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include =
> JsonTypeInfo.As.PROPERTY, property = "type")
> @JsonSubTypes({
> @JsonSubTypes.Type(value = FilterPushDownSpec.class),
> @JsonSubTypes.Type(value = LimitPushDownSpec.class),
> @JsonSubTypes.Type(value = PartitionPushDownSpec.class),
> @JsonSubTypes.Type(value = ProjectPushDownSpec.class),
> @JsonSubTypes.Type(value = ReadingMetadataSpec.class),
> @JsonSubTypes.Type(value = WatermarkPushDownSpec.class),
> @JsonSubTypes.Type(value = SourceWatermarkSpec.class),
> @JsonSubTypes.Type(value = AggregatePushDownSpec.class),
> +  @JsonSubTypes.Type(value = PartitioningSpec.class)   // newly added
>
>
>
> Please let me know if that answers your questions or if you have other
> comments.
>
> Regards,
> Jeyhun
>
>
> On Tue, Mar 12, 2024 at 8:56 AM Jane Chan  wrote:
>
> > Hi Jeyhun,
> >
> > Thank you for leading the discussion. I'm generally +1 with this
> proposal,
> > along with some questions. Please see my comments below.
> >
> > 1. Concerning the `sourcePartitions()` method, the partition information
> > returned during the optimization phase may not be the same as the
> partition
> > information during runtime execution. For long-running jobs, partitions
> may
> > be continuously created. Is this FLIP equipped to handle such scenarios?
> >
> > 2. Regarding the `RemoveRedundantShuffleRule` optimization rule, I
> > understand that it is also necessary to verify whether the hash key
> within
> > the Exchange node is consistent with the partition key defined in the
> table
> > source that implements `SupportsPartitioning`.
> >
> > 3. Could you elaborate on the desired physical plan and integration with
> > `CompiledPlan` to enhance the overall functionality?
> >
> > Best,

Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-03-14 Thread Hang Ruan
Hi, Jeyhun.

Thanks for the FLIP. Totally +1 for it.

I have a question about the part `Additional option to disable this
optimization`. Is this option a source configuration or a table
configuration?

Besides that, unless I have misunderstood, there is a small mistake.
Should `Check if upstream_any is pre-partitioned data source AND contains
the same partition keys as the source` be changed to `Check if upstream_any
is pre-partitioned data source AND contains the same partition keys as
downstream_any`?
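
A minimal sketch of that consistency check (purely illustrative: the actual RemoveRedundantShuffleRule would operate on planner nodes and field indices rather than plain key names):

```java
import java.util.List;

// Illustrative only: the shuffle between a pre-partitioned source and a
// downstream operator can be removed only when the Exchange's hash
// distribution keys match the partition keys exposed by the source.
class ShuffleRemovalCheck {
    static boolean canRemoveShuffle(List<String> exchangeHashKeys,
                                    List<String> sourcePartitionKeys) {
        // order-sensitive comparison; a real rule would also map field
        // positions through any projections between source and exchange
        return !sourcePartitionKeys.isEmpty()
                && exchangeHashKeys.equals(sourcePartitionKeys);
    }
}
```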

Best,
Hang

Jeyhun Karimov  wrote on Wed, Mar 13, 2024, 21:11:

> Hi Jane,
>
> Thanks for your comments.
>
>
> 1. Concerning the `sourcePartitions()` method, the partition information
> > returned during the optimization phase may not be the same as the
> partition
> > information during runtime execution. For long-running jobs, partitions
> may
> > be continuously created. Is this FLIP equipped to handle such scenarios?
>
>
> - Good point. This scenario is definitely supported.
> Once a new partition is added, or in general, new splits are
> discovered,
> PartitionAwareSplitAssigner::addSplits(Collection<SplitT> newSplits)
> method will be called. Inside that method, we are able to detect if a split
> belongs to existing partitions or there is a new partition.
> Once a new partition is detected, we add it to our existing mapping. Our
> mapping looks like Map<Integer, Set<Partition>> subtaskToPartitionAssignment,
> where
> it maps each source subtaskID to zero or more partitions.
>
> 2. Regarding the `RemoveRedundantShuffleRule` optimization rule, I
> > understand that it is also necessary to verify whether the hash key
> within
> > the Exchange node is consistent with the partition key defined in the
> table
> > source that implements `SupportsPartitioning`.
>
>
> - Yes, I overlooked that point; fixed. Actually, the rule is much more
> complicated. I tried to simplify it in the FLIP. Good point.
>
>
> 3. Could you elaborate on the desired physical plan and integration with
> > `CompiledPlan` to enhance the overall functionality?
>
>
> - For compiled plan, PartitioningSpec will be used, with a json tag
> "Partitioning". As a result, in the compiled plan, the source operator will
> have
> "abilities" : [ { "type" : "Partitioning" } ] as part of the compiled plan.
> More about the implementation details below:
>
> 
> PartitioningSpec class
> 
> @JsonTypeName("Partitioning")
> public final class PartitioningSpec extends SourceAbilitySpecBase {
>  // some code here
> @Override
> public void apply(DynamicTableSource tableSource, SourceAbilityContext
> context) {
> if (tableSource instanceof SupportsPartitioning) {
> ((SupportsPartitioning) tableSource).applyPartitionedRead();
> } else {
> throw new TableException(
> String.format(
> "%s does not support SupportsPartitioning.",
> tableSource.getClass().getName()));
> }
> }
>   // some code here
> }
>
> 
> SourceAbilitySpec class
> 
> @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include =
> JsonTypeInfo.As.PROPERTY, property = "type")
> @JsonSubTypes({
> @JsonSubTypes.Type(value = FilterPushDownSpec.class),
> @JsonSubTypes.Type(value = LimitPushDownSpec.class),
> @JsonSubTypes.Type(value = PartitionPushDownSpec.class),
> @JsonSubTypes.Type(value = ProjectPushDownSpec.class),
> @JsonSubTypes.Type(value = ReadingMetadataSpec.class),
> @JsonSubTypes.Type(value = WatermarkPushDownSpec.class),
> @JsonSubTypes.Type(value = SourceWatermarkSpec.class),
> @JsonSubTypes.Type(value = AggregatePushDownSpec.class),
> +  @JsonSubTypes.Type(value = PartitioningSpec.class)   // newly added
>
>
>
> Please let me know if that answers your questions or if you have other
> comments.
>
> Regards,
> Jeyhun
>
>
> On Tue, Mar 12, 2024 at 8:56 AM Jane Chan  wrote:
>
> > Hi Jeyhun,
> >
> > Thank you for leading the discussion. I'm generally +1 with this
> proposal,
> > along with some questions. Please see my comments below.
> >
> > 1. Concerning the `sourcePartitions()` method, the partition information
> > returned during the optimization phase may not be the same as the
> partition
> > information during runtime execution. For long-running jobs, partitions
> may
> > be continuously created. Is this FLIP equipped to handle such scenarios?
> >
> > 2. Regarding the `RemoveRedundantShuffleRule` optimization rule, I
> > understand that it is also necessary to verify whether the hash key
> within
> > the Exchange node is consistent with the partition key defined in the
> table
> > source that implements `SupportsPartitioning`.
> >
> > 3. Could you elaborate on the desired physical plan and integration with
> > `CompiledPlan` to enhance the overall functionality?
> >
> > Best,
> > Jane
> >
> > On Tue, Mar 12, 2024 at 11:11 AM Jim Hughes  >

Re: [DISCUSS] Add "Special Thanks" Page on the Flink Website

2024-03-14 Thread Jark Wu
The pull request has been merged. Thank you for the discussion and
reviewing.
The page is live now: https://flink.apache.org/what-is-flink/special-thanks/

Best,
Jark

On Tue, 12 Mar 2024 at 17:44, Jark Wu  wrote:

> I have created a JIRA issue and opened a pull request for this:
> https://github.com/apache/flink-web/pull/725.
>
> Best,
> Jark
>
> On Tue, 12 Mar 2024 at 16:56, Jark Wu  wrote:
>
>> Thank you all for your feedback. If there are no other concerns or
>> objections,
>> I'm going to create a pull request to add the Special Thanks page.
>>
>> Further feedback and sponsors to be added are still welcome!
>>
>> Best,
>> Jark
>>
>> On Mon, 11 Mar 2024 at 23:09, Maximilian Michels  wrote:
>>
>>> Hi Jark,
>>>
>>> Thanks for clarifying. At first sight, such a page indicated general
>>> sponsorship. +1 for a Thank You page to list specific monetary
>>> contributions to the project for resources which are actively used or
>>> were actively used in the past.
>>>
>>> Cheers,
>>> Max
>>>
>>> On Fri, Mar 8, 2024 at 11:55 AM Martijn Visser 
>>> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I'm +1 on it. As long as we follow the ASF rules on this, we can thank
>>> > those that are/have made contributions.
>>> >
>>> > Best regards,
>>> >
>>> > Martijn
>>> >
>>> > On Wed, Mar 6, 2024 at 7:45 AM Jark Wu  wrote:
>>> >
>>> > > Hi Matthias,
>>> > >
>>> > > Thanks for your comments! Please see my reply inline.
>>> > >
>>> > > > What do we do if we have enough VMs? Do we still allow
>>> > > companies to add more VMs to the pool even though it's not adding any
>>> > > value?
>>> > >
>>> > > The ASF policy[1] makes it very clear: "Project Thanks pages are to
>>> show
>>> > > appreciation
>>> > > for goods that the project truly needs, not just for goods that
>>> someone
>>> > > wants to donate."
>>> > > Therefore, the community should reject new VMs if it is enough.
>>> > >
>>> > >
>>> > > > The community lacks the openly accessible tools to monitor the VM
>>> usage
>>> > > independently
>>> > > as far as I know (the Azure Pipelines project is owned by Ververica
>>> right
>>> > > now).
>>> > >
>>> > > The Azure pipeline account is sponsored by Ververica, and is managed
>>> by the
>>> > > community.
>>> > > AFAIK, Chesnay and Robert both have admin permissions [2] to the
>>> Azure
>>> > > pipeline project.
>>> > > Others can contact the managers to get access to the environment.
>>> > >
>>> > > > I figured that there could be a chance for us to
>>> > > rely on Apache-provided infrastructure entirely with our current
>>> workload
>>> > > when switching over from Azure Pipelines.
>>> > >
>>> > > That sounds great. We can return back the VMs and mark the donations
>>> as
>>> > > historical
>>> > > on the Thank Page once the new GitHub Actions CI is ready.
>>> > >
>>> > > > I am fine with creating a Thank You page to acknowledge the
>>> financial
>>> > > contributions from Alibaba and Ververica in the past (since Apache
>>> allows
>>> > > historical donations) considering that the contributions of the two
>>> > > companies go way back in time and are quite significant in my
>>> opinion. I
>>> > > suggest focusing on the past for now because of the option to
>>> migrate to
>>> > > Apache infrastructure midterm.
>>> > >
>>> > > Sorry, do you mean we only mention past donations for now?
>>> > > IIUC, the new GitHub Actions might be ready after the end of v1.20,
>>> > > which will probably be in half a year.
>>> > > I'm worried that if we say the sponsorship is ongoing until now (but
>>> > > it's not), it will confuse people and disrespect the sponsor.
>>> > >
>>> > > Besides, I'm not sure whether the new GitHub Actions CI will replace
>>> the
>>> > > machines for running
>>> > > flink-ci mirrors [3] and the flink benchmarks [4]. If not, I think
>>> it's
>>> > > inappropriate to say they are
>>> > > historical donations.
>>> > >
>>> > > Furthermore, we are collecting all kinds of donations. I just
>>> noticed that
>>> > > AWS donated [5] service costs
>>> > > for flink-connector-aws tests that hit real AWS services. This is an
>>> > > ongoing donation and I think it's not
>>> > > good to mark it as a historical donation. (Thanks for the donation,
>>> AWS,
>>> > > @Danny
>>> > > Cranmer  @HongTeoh!
>>> > > We should add it to the Thank Page!)
>>> > >
>>> > > Best,
>>> > > Jark
>>> > >
>>> > >
>>> > > [1]: https://www.apache.org/foundation/marks/linking#projectthanks
>>> > > [2]:
>>> > >
>>> > >
>>> https://cwiki.apache.org/confluence/display/FLINK/Continuous+Integration#ContinuousIntegration-Contacts
>>> > >
>>> > > [3]:
>>> > >
>>> > >
>>> https://cwiki.apache.org/confluence/display/FLINK/Continuous+Integration#ContinuousIntegration-Repositories
>>> > >
>>> > > [4]:
>>> https://lists.apache.org/thread/bkw6ozoflgltwfwmzjtgx522hyssfko6
>>> > >
>>> > > [5]: https://issues.apache.org/jira/browse/INFRA-24474
>>> > >
>>> > > On Wed, 6 Mar 2024 at 17:58, Matthias Pohl 
>>> wrote:
>>> > >
>>> > > > Thanks for starting this 

[RESULT][VOTE] FLIP-420: Add API annotations for RocksDB StateBackend user-facing classes

2024-03-14 Thread Jinzhong Li
Hi devs,

I'm happy to announce that FLIP-420: Add API annotations for
RocksDB StateBackend user-facing classes [1] has been accepted with 7
approving votes (4 binding) [2]:

- Yun Tang (binding)
- Jeyhun Karimov (non-binding)
- Hangxiang Yu (binding)
- Yanfei Lei (binding)
- Zakelly Lan (non-binding)
- Jing Ge (binding)
- Ahmed Hamdy (non-binding)

There are no disapproving votes. Thanks to everyone who participated in
the discussion and voting.

[1] https://cwiki.apache.org/confluence/x/JQs4EQ
[2] https://lists.apache.org/thread/gfgz4j2m15w8ppwhdgm1f3nhsdpvphox

Best,
Jinzhong Li


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-03-14 Thread Lincoln Lee
Hi Jing,

Thanks for your attention to this flip! I'll try to answer the following
questions.

> 1. How to define query of dynamic table?
> Use flink sql or introducing new syntax?
> If use flink sql, how to handle the difference in SQL between streaming
and
> batch processing?
> For example, a query including window aggregate based on processing time?
> or a query including global order by?

Similar to `CREATE TABLE AS query`, here the `query` also uses Flink SQL and
doesn't introduce a totally new syntax. We will not change the status with
respect to the differences in functionality of Flink SQL itself between
streaming and batch: for example, the proctime window aggregate on streaming
and the global sort on batch that you mentioned do not, in fact, work
properly in the other mode, so when a user switches the refresh mode of a
dynamic table to one that is not supported, we will throw an exception.

> 2. Whether modify the query of dynamic table is allowed?
> Or we could only refresh a dynamic table based on the initial query?

Yes, in the current design the query definition of the dynamic table is not
allowed to be modified, and you can only refresh the data based on the
initial definition.

> 3. How to use dynamic table?
> The dynamic table seems to be similar to the materialized view.  Will we
do
> something like materialized view rewriting during the optimization?

It's true that dynamic tables and materialized views are similar in some
ways, but as Ron explains there are differences. In terms of optimization,
automated materialization discovery similar to that supported by Calcite is
also a potential possibility, perhaps with the addition of automated
rewriting in the future.



Best,
Lincoln Lee


Ron liu  wrote on Thu, Mar 14, 2024, 14:01:

> Hi, Timo
>
> Sorry for later response,  thanks for your feedback.
> Regarding your questions:
>
> > Flink has introduced the concept of Dynamic Tables many years ago. How
>
> does the term "Dynamic Table" fit into Flink's regular tables and also
>
> how does it relate to Table API?
>
>
> > I fear that adding the DYNAMIC TABLE keyword could cause confusion for
> > users, because a term for regular CREATE TABLE (that can be "kind of
> > dynamic" as well and is backed by a changelog) is then missing. Also
> > given that we call our connectors for those tables, DynamicTableSource
> > and DynamicTableSink.
>
>
> > In general, I find it contradicting that a TABLE can be "paused" or
> > "resumed". From an English language perspective, this does sound
> > incorrect. In my opinion (without much research yet), a continuous
> > updating trigger should rather be modelled as a CREATE MATERIALIZED VIEW
> > (which users are familiar with?) or a new concept such as a CREATE TASK
> > (that can be paused and resumed?).
>
>
> 1. In the current concept [1], it actually includes both Dynamic Tables and
> Continuous Queries. A Dynamic Table is just an abstract logical concept,
> which in its physical form represents either a table or a changelog stream.
> It requires the combination with a Continuous Query to achieve dynamic
> updates of the target table, similar to a database's Materialized View.
> We hope to upgrade the Dynamic Table to a real entity that users can
> operate on, one that combines the logical concepts of Dynamic Tables and
> Continuous Queries. By integrating the definition of tables and queries, it
> can achieve functionality similar to Materialized Views, simplifying users'
> data processing pipelines.
> So, the object of the suspend operation is the refresh task of the dynamic
> table. The command `ALTER DYNAMIC TABLE table_name SUSPEND` is actually a
> shorthand for `ALTER DYNAMIC TABLE table_name SUSPEND REFRESH` (if writing
> it in full is clearer, we can also change it).
>
> 2. Initially, we also considered Materialized Views, but ultimately decided
> against them. Materialized views are designed to enhance query performance
> for workloads that consist of common, repetitive query patterns. In essence,
> a materialized view represents the result of a query. However, it is not
> intended to support data modification. For Lakehouse scenarios, where the
> ability to delete or update data is crucial (such as compliance with GDPR,
> FLIP-2), materialized views fall short.
>
> 3. Compared to CREATE (regular) TABLE, CREATE DYNAMIC TABLE not only
> defines metadata in the catalog but also automatically initiates a
> data refresh task based on the query specified during table creation.
> It dynamically executes data updates. Users can focus on data
> dependencies and data generation logic.
>
> 4. The new dynamic table does not conflict with the existing
> DynamicTableSource and DynamicTableSink interfaces. For the developer,
> all that needs to be implemented is the new CatalogDynamicTable,
> without changing the implementation of source and sink.
>
> 5. For now, the FLIP does not consider supporting Table API operations on
> Dynamic Tables. However, once the SQL syntax is finalized, we can discuss 

Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-03-14 Thread Ron liu
Hi, Timo

Sorry for later response,  thanks for your feedback.
Regarding your questions:

> Flink has introduced the concept of Dynamic Tables many years ago. How

does the term "Dynamic Table" fit into Flink's regular tables and also

how does it relate to Table API?


> I fear that adding the DYNAMIC TABLE keyword could cause confusion for
> users, because a term for regular CREATE TABLE (that can be "kind of
> dynamic" as well and is backed by a changelog) is then missing. Also
> given that we call our connectors for those tables, DynamicTableSource
> and DynamicTableSink.


> In general, I find it contradicting that a TABLE can be "paused" or
> "resumed". From an English language perspective, this does sound
> incorrect. In my opinion (without much research yet), a continuous
> updating trigger should rather be modelled as a CREATE MATERIALIZED VIEW
> (which users are familiar with?) or a new concept such as a CREATE TASK
> (that can be paused and resumed?).


1. In the current concept [1], it actually includes both Dynamic Tables and
Continuous Queries. A Dynamic Table is just an abstract logical concept,
which in its physical form represents either a table or a changelog stream.
It requires the combination with a Continuous Query to achieve dynamic
updates of the target table, similar to a database's Materialized View.
We hope to upgrade the Dynamic Table to a real entity that users can
operate on, one that combines the logical concepts of Dynamic Tables and
Continuous Queries. By integrating the definition of tables and queries, it
can achieve functionality similar to Materialized Views, simplifying users'
data processing pipelines.
So, the object of the suspend operation is the refresh task of the dynamic
table. The command `ALTER DYNAMIC TABLE table_name SUSPEND` is actually a
shorthand for `ALTER DYNAMIC TABLE table_name SUSPEND REFRESH` (if writing
it in full is clearer, we can also change it).

2. Initially, we also considered Materialized Views, but ultimately decided
against them. Materialized views are designed to enhance query performance
for workloads that consist of common, repetitive query patterns. In essence,
a materialized view represents the result of a query. However, it is not
intended to support data modification. For Lakehouse scenarios, where the
ability to delete or update data is crucial (such as compliance with GDPR,
FLIP-2), materialized views fall short.

3. Compared to CREATE (regular) TABLE, CREATE DYNAMIC TABLE not only
defines metadata in the catalog but also automatically initiates a
data refresh task based on the query specified during table creation.
It dynamically executes data updates. Users can focus on data
dependencies and data generation logic.

4. The new dynamic table does not conflict with the existing
DynamicTableSource and DynamicTableSink interfaces. For the developer,
all that needs to be implemented is the new CatalogDynamicTable,
without changing the implementation of source and sink.
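
As a purely hypothetical sketch of what such a CatalogDynamicTable could carry beyond a regular catalog table (the field and method names below are assumptions for illustration, not the FLIP-435 API):

```java
import java.util.Map;

// Illustrative only: a dynamic table's catalog entry would bundle the
// defining query and a freshness target alongside the usual table options.
class CatalogDynamicTable {
    private final String definitionQuery;          // the SELECT used to refresh the table
    private final String freshness;                // e.g. "30 MINUTES"
    private final Map<String, String> options;     // regular connector options

    CatalogDynamicTable(String definitionQuery, String freshness,
                        Map<String, String> options) {
        this.definitionQuery = definitionQuery;
        this.freshness = freshness;
        this.options = options;
    }

    String getDefinitionQuery() { return definitionQuery; }
    String getFreshness() { return freshness; }
    Map<String, String> getOptions() { return options; }
}
```

The point is that the source/sink side stays untouched: only the catalog object grows the query and freshness metadata that the refresh task needs.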

5. For now, the FLIP does not consider supporting Table API operations on
Dynamic Tables. However, once the SQL syntax is finalized, we can discuss
this in a separate FLIP. Currently, I have a rough idea: the Table API
should also introduce DynamicTable operation interfaces corresponding to
the existing Table interfaces. The TableEnvironment will provide relevant
methods to support various dynamic table operations. The goal for the new
Dynamic Table is to offer users an experience similar to using a database,
which is why we prioritize SQL-based approaches initially.

> How do you envision re-adding the functionality of a statement set, that
> fans out to multiple tables? This is a very important use case for data
> pipelines.


Multi-table output is indeed a very important user scenario. In the future,
we can consider extending the statement set syntax to support the
creation of multiple dynamic tables.


> > Since the early days of Flink SQL, we were discussing `SELECT STREAM *
> FROM T EMIT 5 MINUTES`. Your proposal seems to rephrase STREAM and EMIT,
> into other keywords DYNAMIC TABLE and FRESHNESS. But the core
> functionality is still there. I'm wondering if we should widen the scope
> (maybe not part of this FLIP but a new FLIP) to follow the standard more
> closely. Making `SELECT * FROM t` bounded by default and use new syntax
> for the dynamic behavior. Flink 2.0 would be the perfect time for this,
> however, it would require careful discussions. What do you think?


The query part indeed requires a separate FLIP
for discussion, as it involves changes to the default behavior.


[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/dynamic_tables


Best,

Ron


Jing Zhang  wrote on Wed, Mar 13, 2024, 15:19:

> Hi, Lincoln & Ron,
>
> Thanks for the proposal.
>
> I agree with the question raised by Timo.
>
> Besides, I have some other questions.
> 1. How to define query of dynamic table?
> Use flink sql or introducing new syntax?
> If use flink sql, how to handle the