[jira] [Created] (FLINK-35976) StreamPhysicalOverAggregate should handle column name confliction

2024-08-05 Thread lincoln lee (Jira)
lincoln lee created FLINK-35976:
---

 Summary: StreamPhysicalOverAggregate should handle column name 
confliction
 Key: FLINK-35976
 URL: https://issues.apache.org/jira/browse/FLINK-35976
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.1, 1.20.0
Reporter: lincoln lee
Assignee: lincoln lee
 Fix For: 2.0.0


A duplicate column name exception occurs when using a nested over aggregate query,

e.g., a repro case:

{code}
@Test
def testNestedOverAgg(): Unit = {
  util.addTable(s"""
    |CREATE TEMPORARY TABLE src (
    |  a STRING,
    |  b STRING,
    |  ts TIMESTAMP_LTZ(3),
    |  watermark FOR ts as ts
    |) WITH (
    |  'connector' = 'values'
    |)
    |""".stripMargin)

  util.verifyExecPlan(s"""
    |SELECT *
    |FROM (
    |  SELECT
    |    *, count(*) OVER (PARTITION BY a ORDER BY ts) AS c2
    |  FROM (
    |    SELECT
    |      *, count(*) OVER (PARTITION BY a,b ORDER BY ts) AS c1
    |    FROM src
    |  )
    |)
    |""".stripMargin)
}
{code}

 

{code}
org.apache.flink.table.api.ValidationException: Field names must be unique. Found duplicates: [w0$o0]

    at org.apache.flink.table.types.logical.RowType.validateFields(RowType.java:273)
    at org.apache.flink.table.types.logical.RowType.<init>(RowType.java:158)
    at org.apache.flink.table.types.logical.RowType.of(RowType.java:298)
    at org.apache.flink.table.types.logical.RowType.of(RowType.java:290)
    at org.apache.flink.table.planner.calcite.FlinkTypeFactory$.toLogicalRowType(FlinkTypeFactory.scala:678)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamPhysicalOverAggregate.translateToExecNode(StreamPhysicalOverAggregate.scala:57)
    at org.apache.flink.table.planner.plan.nodes.physical.FlinkPhysicalRel.translateToExecNode(FlinkPhysicalRel.scala:53)
    at org.apache.flink.table.planner.plan.nodes.physical.FlinkPhysicalRel.translateToExecNode$(FlinkPhysicalRel.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamPhysicalOverAggregateBase.translateToExecNode(StreamPhysicalOverAggregateBase.scala:35)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeGraphGenerator.generate(ExecNodeGraphGenerator.java:74)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeGraphGenerator.generate(ExecNodeGraphGenerator.java:54)
    at org.apache.flink.table.planner.delegation.PlannerBase.translateToExecNodeGraph(PlannerBase.scala:407)
    at org.apache.flink.table.planner.utils.TableTestUtilBase.assertPlanEquals(TableTestBase.scala:1076)
    at org.apache.flink.table.planner.utils.TableTestUtilBase.doVerifyPlan(TableTestBase.scala:920)
    at org.apache.flink.table.planner.utils.TableTestUtilBase.verifyExecPlan(TableTestBase.scala:675)
    at org.apache.flink.table.planner.plan.stream.sql.agg.OverAggregateTest.testNestedOverAgg(OverAggregateTest.scala:460)
{code}

 

This is similar to https://issues.apache.org/jira/browse/FLINK-22121,
but that fix did not cover the streaming over aggregate scenario.
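
To illustrate the kind of de-duplication that has to happen before the row type is built, here is a minimal sketch. It is not the actual Flink fix: `FieldNameDeduper` and `makeUnique` are hypothetical names, and the `_0`, `_1`, ... suffix scheme is an assumption chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of Flink: renames duplicate field names so each one is unique. */
final class FieldNameDeduper {

    /** Keeps the first occurrence of each name and appends _0, _1, ... to later duplicates. */
    static List<String> makeUnique(List<String> names) {
        Set<String> used = new HashSet<>();
        List<String> result = new ArrayList<>(names.size());
        for (String name : names) {
            String candidate = name;
            int suffix = 0;
            // Try name_0, name_1, ... until a name that is not yet used is found.
            while (!used.add(candidate)) {
                candidate = name + "_" + suffix++;
            }
            result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed field list: both nested over aggregates produce the internal name w0$o0.
        System.out.println(makeUnique(List.of("a", "b", "ts", "w0$o0", "w0$o0")));
    }
}
```

With a list like this, the second `w0$o0` would be renamed, so a uniqueness check such as the one in `RowType.validateFields` would no longer fail on it.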

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-470: Support Adaptive Broadcast Join

2024-08-01 Thread Lincoln Lee
Hi Xia,

Thanks for your updates! Looks good to me.
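
For readers following along, the three-option strategy from the quoted proposal below can be sketched roughly as follows. This is only an illustration under stated assumptions: the enum and method names are hypothetical, and treating a negative threshold as "broadcast join disabled" follows the -1 convention mentioned earlier in this thread.

```java
/** Hypothetical model of the proposed `table.optimizer.adaptive-broadcast-join.strategy` values. */
enum AdaptiveBroadcastJoinStrategy {
    AUTO,         // Flink selects the timing of the optimization
    RUNTIME_ONLY, // optimization is performed only at runtime
    NONE;         // optimization is performed only in the compile phase

    /** Assumption: a negative unified threshold (e.g. -1) disables broadcast join entirely. */
    private static boolean enabled(long thresholdBytes) {
        return thresholdBytes >= 0;
    }

    /** Whether the broadcast hash join optimization may be applied at compile time. */
    boolean allowsCompileTime(long thresholdBytes) {
        return enabled(thresholdBytes) && this != RUNTIME_ONLY;
    }

    /** Whether the broadcast hash join optimization may be applied at runtime. */
    boolean allowsRuntime(long thresholdBytes) {
        return enabled(thresholdBytes) && this != NONE;
    }
}

final class StrategyDemo {
    public static void main(String[] args) {
        long threshold = 1024L * 1024L; // assumed value of the unified threshold, in bytes
        for (AdaptiveBroadcastJoinStrategy s : AdaptiveBroadcastJoinStrategy.values()) {
            System.out.println(s + " compile=" + s.allowsCompileTime(threshold)
                    + " runtime=" + s.allowsRuntime(threshold));
        }
    }
}
```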

Best,
Lincoln Lee


Xia Sun wrote on Thu, Aug 1, 2024 at 11:15:

> Hi Lincoln,
>
> Thanks for your detailed explanation. I understand your concern.
> Introducing configuration with redundant semantics can indeed confuse
> users, and the engine should minimize user exposure to these details. Based
> on this premise, while also ensuring that users can choose to enable the
> broadcast hash join optimization during either the compile-time or runtime,
> I think we can introduce a new configuration
> `table.optimizer.adaptive-broadcast-join.strategy`, and reuse the existing
> configuration `table.optimizer.join.broadcast-threshold` as a unified
> threshold for determining broadcast hash join optimization. The
> `table.optimizer.adaptive-broadcast-join.strategy` configuration would be
> of an enumeration type with three options:
>
> AUTO: Flink will autonomously select the optimal timing for the
> optimization.
> RUNTIME_ONLY: The broadcast hash join optimization will only be performed
> at runtime.
> NONE: The broadcast hash join optimization will only be performed at
> compile phase.
> And AUTO will be the default option.
>
> I have also updated this information in FLIP, PTAL.
>
> Best,
> Xia
>
> Lincoln Lee wrote on Tue, Jul 30, 2024 at 23:39:
>
> > Thanks Xia for your explanation!
> >
> > I can understand your concern, but considering the design of this FLIP,
> > which already covers compile-time inaccurate optimization for runtime
> > de-optimization, is it necessary to make the user manually turn off
> > 'table.optimizer.join.broadcast-threshold' and set the new
> > 'table.optimizer.adaptive.join.broadcast-threshold' again? Another option
> > is that users only need to focus on the existing broadcast size
> threshold,
> > and accept the reality that 100% accurate optimization cannot be done
> > at compile time, and adopt the new capability of dynamic optimization at
> > runtime, and ultimately, users will trust that flink will always optimize
> > accurately, and from this point of view, I would prefer a generic
> parameter
> > 'table.optimizer.adaptive-optimization.enabled', which would allow for
> > more dynamic optimization in the future, not limited to broadcast join
> > scenarios and will not continuously bring more new options, WDYT?
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Xia Sun wrote on Tue, Jul 30, 2024 at 11:27:
> >
> > > Hi Lincoln,
> > >
> > > Thank you for your input and participation in the discussion!
> > >
> > > Compared to introducing the 'table.optimizer.adaptive-join.enabled'
> > option,
> > > introducing the "table.optimizer.adaptive.join.broadcast-threshold" can
> > > also cover the need to disable static broadcast optimization while only
> > > enabling dynamic broadcast optimization. From this perspective,
> > introducing
> > > a new threshold configuration might be more appropriate. What do you
> > think?
> > >
> > > Best,
> > > Xia
> > >
> > > Lincoln Lee wrote on Mon, Jul 29, 2024 at 23:12:
> > >
> > > > +1 for this useful optimization!
> > > >
> > > > I have a question about the new option, do we really need two
> broadcast
> > > > join thresholds? IIUC, this adaptive broadcast join is a complement
> to
> > > > compile-time optimization, there is no need for the user to configure
> > two
> > > > different thresholds (not the off represented by -1), so we just want
> > to
> > > > control the adaptive optimization itself, should we provide a
> > > configuration
> > > > option like 'table.optimizer.adaptive-join.enabled' or a more general
> > one
> > > > 'table.optimizer.adaptive-optimization.enabled' for such related
> > > > optimizations?
> > > >
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > >
> > > > Ron Liu wrote on Fri, Jul 26, 2024 at 11:59:
> > > >
> > > > > Hi, Xia
> > > > >
> > > > > Thanks for your reply. It looks good to me.
> > > > >
> > > > >
> > > > > Best,
> > > > > Ron
> > > > >
> > > > > Xia Sun wrote on Fri, Jul 26, 2024 at 10:49:
> > > > >
> > > > > > Hi Ron,
> > > > > >
> > > > > > Thanks for your feedback!
> > > > > >
> > > > > > -> creation of the join operators until runtime
>

Re: [DISCUSS] FLIP-470: Support Adaptive Broadcast Join

2024-07-30 Thread Lincoln Lee
Thanks Xia for your explanation!

I can understand your concern. However, the design of this FLIP already
covers the case where the compile-time optimization is inaccurate, by
de-optimizing at runtime. Given that, is it necessary to make the user
manually turn off 'table.optimizer.join.broadcast-threshold' and set the new
'table.optimizer.adaptive.join.broadcast-threshold' again?

Another option is that users only need to focus on the existing broadcast
size threshold. They accept that 100% accurate optimization cannot be done
at compile time and adopt the new capability of dynamic optimization at
runtime, so that ultimately users can trust that Flink will always optimize
accurately. From this point of view, I would prefer a generic parameter
'table.optimizer.adaptive-optimization.enabled', which would allow for
more dynamic optimizations in the future, not limited to broadcast join
scenarios, without continuously adding new options. WDYT?


Best,
Lincoln Lee


Xia Sun wrote on Tue, Jul 30, 2024 at 11:27:

> Hi Lincoln,
>
> Thank you for your input and participation in the discussion!
>
> Compared to introducing the 'table.optimizer.adaptive-join.enabled' option,
> introducing the "table.optimizer.adaptive.join.broadcast-threshold" can
> also cover the need to disable static broadcast optimization while only
> enabling dynamic broadcast optimization. From this perspective, introducing
> a new threshold configuration might be more appropriate. What do you think?
>
> Best,
> Xia
>
> Lincoln Lee wrote on Mon, Jul 29, 2024 at 23:12:
>
> > +1 for this useful optimization!
> >
> > I have a question about the new option, do we really need two broadcast
> > join thresholds? IIUC, this adaptive broadcast join is a complement to
> > compile-time optimization, there is no need for the user to configure two
> > different thresholds (not the off represented by -1), so we just want to
> > control the adaptive optimization itself, should we provide a
> configuration
> > option like 'table.optimizer.adaptive-join.enabled' or a more general one
> > 'table.optimizer.adaptive-optimization.enabled' for such related
> > optimizations?
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Ron Liu wrote on Fri, Jul 26, 2024 at 11:59:
> >
> > > Hi, Xia
> > >
> > > Thanks for your reply. It looks good to me.
> > >
> > >
> > > Best,
> > > Ron
> > >
> > > Xia Sun wrote on Fri, Jul 26, 2024 at 10:49:
> > >
> > > > Hi Ron,
> > > >
> > > > Thanks for your feedback!
> > > >
> > > > -> creation of the join operators until runtime
> > > >
> > > >
> > > > That means when creating the AdaptiveJoinOperatorFactory, we will not
> > > > immediately create the JoinOperator. Instead, we only pass in the
> > > necessary
> > > > parameters for creating the JoinOperator. The appropriate
> JoinOperator
> > > will
> > > > be created during the StreamGraphOptimizationStrategy optimization
> > phase.
> > > >
> > > > You mentioned that the runtime's visibility into the table planner is
> > > > indeed an issue. It includes two aspects,
> > > > (1) we plan to place both implementations of the
> > > > AdaptiveBroadcastJoinOptimizationStrategy and
> > AdaptiveJoinOperatorFactory
> > > > in the table layer. During the runtime phase, we will obtain the
> > > > AdaptiveBroadcastJoinOptimizationStrategy through class loading.
> > > Therefore,
> > > > the flink-runtime does not need to be aware of the table layer's
> > > > implementation.
> > > > (2) Since the dynamic codegen in the AdaptiveJoinOperatorFactory
> needs
> > to
> > > > be aware of the table planner, we will consider placing the
> > > > AdaptiveJoinOperatorFactory in the table planner module as well.
> > > >
> > > >
> > > >  -> When did you configure these optimization strategies uniformly
> into
> > > > > `execution.batch.adaptive.stream-graph-optimization.strategies`
> > > >
> > > >
> > > > Thank you for pointing out this issue. When there are multiple
> > > > StreamGraphOptimizationStrategies, the optimization order at the
> > runtime
> > > > phase will strictly follow the order specified in the configuration
> > > option
> > > > `execution.batch.adaptive.stream-graph-optimization.strategies`.
> > > Therefore,
> > > > it is necessary to have a unified configuration during the sql
> planner
> >

Re: [DISCUSS] FLIP-470: Support Adaptive Broadcast Join

2024-07-29 Thread Lincoln Lee
+1 for this useful optimization!

I have a question about the new option: do we really need two broadcast
join thresholds? IIUC, this adaptive broadcast join complements the
compile-time optimization, so there is no need for the user to configure two
different thresholds (other than the "off" value represented by -1). We just
want to control the adaptive optimization itself, so should we provide a
configuration option like 'table.optimizer.adaptive-join.enabled' or a more
general one, 'table.optimizer.adaptive-optimization.enabled', for such
related optimizations?


Best,
Lincoln Lee


Ron Liu wrote on Fri, Jul 26, 2024 at 11:59:

> Hi, Xia
>
> Thanks for your reply. It looks good to me.
>
>
> Best,
> Ron
>
> Xia Sun wrote on Fri, Jul 26, 2024 at 10:49:
>
> > Hi Ron,
> >
> > Thanks for your feedback!
> >
> > -> creation of the join operators until runtime
> >
> >
> > That means when creating the AdaptiveJoinOperatorFactory, we will not
> > immediately create the JoinOperator. Instead, we only pass in the
> necessary
> > parameters for creating the JoinOperator. The appropriate JoinOperator
> will
> > be created during the StreamGraphOptimizationStrategy optimization phase.
> >
> > You mentioned that the runtime's visibility into the table planner is
> > indeed an issue. It includes two aspects,
> > (1) we plan to place both implementations of the
> > AdaptiveBroadcastJoinOptimizationStrategy and AdaptiveJoinOperatorFactory
> > in the table layer. During the runtime phase, we will obtain the
> > AdaptiveBroadcastJoinOptimizationStrategy through class loading.
> Therefore,
> > the flink-runtime does not need to be aware of the table layer's
> > implementation.
> > (2) Since the dynamic codegen in the AdaptiveJoinOperatorFactory needs to
> > be aware of the table planner, we will consider placing the
> > AdaptiveJoinOperatorFactory in the table planner module as well.
> >
> >
> >  -> When did you configure these optimization strategies uniformly into
> > > `execution.batch.adaptive.stream-graph-optimization.strategies`
> >
> >
> > Thank you for pointing out this issue. When there are multiple
> > StreamGraphOptimizationStrategies, the optimization order at the runtime
> > phase will strictly follow the order specified in the configuration
> option
> > `execution.batch.adaptive.stream-graph-optimization.strategies`.
> Therefore,
> > it is necessary to have a unified configuration during the sql planner
> > phase to ensure the correct optimization order. Currently, we are
> > considering performing this unified configuration in
> > BatchPlanner#afterTranslation().
> >
> > For simplicity, as long as the adaptive broadcast join/skewed join
> > optimization features are enabled (e.g.,
> > `table.optimizer.adaptive.join.broadcast-threshold` is not -1), the
> > corresponding strategy will be configured. This optimization is
> independent
> > of the specific SQL query, although it might not produce any actual
> effect.
> >
> > Best,
> > Xia
> >
> > Ron Liu wrote on Wed, Jul 24, 2024 at 14:10:
> >
> > > Hi, Xia
> > >
> > > This FLIP looks good to me, +1.
> > >
> > > I've two questions:
> > >
> > > 1.
> > > >> Accordingly, in terms of implementation, we will delay the codegen
> and
> > > creation of the join operators until runtime.
> > >
> > > How are you delaying codegen to runtime? The current runtime is not SQL
> > > planner aware. In other words, how should I understand this sentence?
> > >
> > > 2. FLIP-469 mentions passing StreamGraphOptimizationStrategy to runtime
> > via
> > > option `execution.batch.adaptive.stream-graph-optimization.strategies`.
> > In
> > > SQL planner if you have multiple different optimization strategies like
> > > broadcast join, skew join, etc...  When did you configure these
> > > optimization strategies uniformly into
> > > `execution.batch.adaptive.stream-graph-optimization.strategies`?
> > >
> > >
> > >
> > > Zhu Zhu wrote on Fri, Jul 19, 2024 at 17:41:
> > >
> > > > +1 for the FLIP
> > > >
> > > > It's a good start to adaptively optimize the logical execution plan
> > with
> > > > runtime information.
> > > >
> > > > Thanks,
> > > > Zhu
> > > >
> > > > Xia Sun wrote on Thu, Jul 18, 2024 at 18:23:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > Junrui Lee, Lei Yang, and I would like to initiate a discussio

[jira] [Created] (FLINK-35816) Non-mergeable proctime tvf window aggregate needs to fallback to group aggregate

2024-07-11 Thread lincoln lee (Jira)
lincoln lee created FLINK-35816:
---

 Summary: Non-mergeable proctime tvf window aggregate needs to 
fallback to group aggregate
 Key: FLINK-35816
 URL: https://issues.apache.org/jira/browse/FLINK-35816
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.1, 1.20.0
Reporter: lincoln lee
Assignee: lincoln lee
 Fix For: 1.20.0


A non-mergeable proctime TVF window aggregate needs to fall back to a group
aggregate. For example:

{code}
select c, count(a)
from
   TABLE(CUMULATE(table MyTable, DESCRIPTOR(proctime), interval '10' seconds, 
interval '5' minutes))
where window_start <> '123'
group by window_start, window_end, c, window_time
{code}

The window property in the above query is materialized before the
aggregation, so it loses its processing-time attribute. As a result, the
planner fails to pull up `StreamPhysicalWindowTableFunction` into
`StreamPhysicalWindowAggregate` and cannot generate a valid execution plan.
It produces a plan like the following, which uses the attached window
strategy that relies on the upstream watermark but lacks a watermark
assigner:

{code}
Calc(select=[c, EXPR$1])
+- WindowAggregate(groupBy=[c], window=[CUMULATE(win_start=[window_start], win_end=[window_end], max_size=[5 min], step=[10 s])], select=[c, COUNT(a) AS EXPR$1, start('w$) AS window_start, end('w$) AS window_end, proctime('w$) AS window_time])
   +- Exchange(distribution=[hash[c]])
      +- Calc(select=[window_start, window_end, c, window_time, a], where=[<>(window_start, '123')])
         +- WindowTableFunction(window=[CUMULATE(time_col=[proctime], max_size=[5 min], step=[10 s])])
            +- Calc(select=[a, c, proctime])
               +- WatermarkAssigner(rowtime=[rowtime], watermark=[-(rowtime, 1000:INTERVAL SECOND)])
                  +- Calc(select=[a, c, PROCTIME() AS proctime, rowtime])
                     +- TableSourceScan(table=[[default_catalog, default_database, MyTable, project=[a, c, rowtime], metadata=[]]], fields=[a, c, rowtime])
{code}

So, semantically, when the window time attribute has been materialized after
the window table function, the downstream aggregation should be a group
aggregation. The expected plan for the above example would be:

{code}
Calc(select=[c, EXPR$1])
+- GroupAggregate(groupBy=[window_start, window_end, c, window_time], select=[window_start, window_end, c, window_time, COUNT(a) AS EXPR$1])
   +- Exchange(distribution=[hash[window_start, window_end, c, window_time]])
      +- Calc(select=[window_start, window_end, c, window_time, a], where=[<>(window_start, '123')])
         +- WindowTableFunction(window=[CUMULATE(time_col=[proctime], max_size=[5 min], step=[10 s])])
            +- Calc(select=[a, c, proctime])
               +- WatermarkAssigner(rowtime=[rowtime], watermark=[-(rowtime, 1000:INTERVAL SECOND)])
                  +- Calc(select=[a, c, PROCTIME() AS proctime, rowtime])
                     +- TableSourceScan(table=[[default_catalog, default_database, MyTable, project=[a, c, rowtime], metadata=[]]], fields=[a, c, rowtime])
{code}
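
The fallback rule described in this issue can be summarized in a small sketch. This is a simplified illustration only, not Flink planner code: the class, enum, and parameter names are hypothetical, and the real planner decides this through its optimization rules rather than two boolean flags.

```java
/** Hypothetical sketch of the fallback decision; not actual Flink planner code. */
final class WindowAggFallbackSketch {

    enum AggKind { WINDOW_AGGREGATE, GROUP_AGGREGATE }

    /**
     * @param isProctimeWindow the window TVF is defined on a processing-time attribute
     * @param canMergeWindowTableFunction the window table function can be pulled up into the
     *     window aggregate, i.e. the window time attribute was not materialized in between
     */
    static AggKind choose(boolean isProctimeWindow, boolean canMergeWindowTableFunction) {
        if (isProctimeWindow && !canMergeWindowTableFunction) {
            // The proctime attribute is lost, so only a plain group aggregate is valid.
            return AggKind.GROUP_AGGREGATE;
        }
        return AggKind.WINDOW_AGGREGATE;
    }

    public static void main(String[] args) {
        // The query above: proctime CUMULATE window with a filter on window_start.
        System.out.println(choose(true, false));
    }
}
```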





Re: [VOTE] Apache Flink CDC Release 3.1.1, release candidate #0

2024-06-18 Thread Lincoln Lee
+1 (binding)

- verified hashes
- verified signatures
- built from source with JDK 8 & Maven 3.8.6
- the source distributions do not contain any binaries
- all POM files point to the same version
- checked release notes
- reviewed the web PR


Best,
Lincoln Lee


Jark Wu wrote on Tue, Jun 18, 2024 at 19:58:

> +1 (binding)
>
> - Build and compile the source code locally: *OK*
> - Verified signatures: *OK*
> - Verified hashes: *OK*
> - Checked no missing artifacts in the staging area: *OK*
> - Reviewed the website release PR: *OK*
> - Checked the licenses: *OK*
>
> Best,
> Jark
>
> On Tue, 18 Jun 2024 at 18:14, Leonard Xu  wrote:
>
> > +1 (binding)
> >
> > - verified signatures
> > - verified hashsums
> > - checked release notes
> > - reviewed the web PR
> > - tested Flink CDC works with Flink 1.19
> > - tested route, transform in MySQL to Doris Pipeline
> >
> > Best,
> > Leonard
> >
> >
>


Re: [VOTE] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-17 Thread Lincoln Lee
+1 (binding)


Best,
Lincoln Lee


Xintong Song wrote on Mon, Jun 17, 2024 at 13:39:

> +1 (binding)
>
> Best,
>
> Xintong
>
>
>
> On Mon, Jun 17, 2024 at 11:41 AM Zhanghao Chen 
> wrote:
>
> > +1 (non-binding)
> >
> > Best,
> > Zhanghao Chen
> > 
> > From: weijie guo 
> > Sent: Monday, June 17, 2024 10:13
> > To: dev 
> > Subject: [VOTE] FLIP-462: Support Custom Data Distribution for Input
> > Stream of Lookup Join
> >
> > Hi everyone,
> >
> >
> > Thanks for all the feedback about the FLIP-462: Support Custom Data
> > Distribution for Input Stream of Lookup Join [1]. The discussion
> > thread is here [2].
> >
> >
> > The vote will be open for at least 72 hours unless there is an
> > objection or insufficient votes.
> >
> >
> > Best,
> >
> > Weijie
> >
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-462+Support+Custom+Data+Distribution+for+Input+Stream+of+Lookup+Join
> >
> >
> > [2] https://lists.apache.org/thread/kds2zrcdmykrz5lmn0hf9m4phdl60nfb
> >
>


Re: Re: Re: [DISCUSS] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-13 Thread Lincoln Lee
Thanks Weijie & Wencong for your update including the conclusions of
the offline discussion.

There's one thing that needs to be confirmed in the FLIP:
> The hint only provides a suggestion to the optimizer, it is not an
> enforcer. As a result, If the target dim table not implements
> SupportsLookupCustomShuffle, planner will ignore this newly introduced
> shuffle option.

Since we've decided to add a new hint option 'shuffle' to the current
`LOOKUP` join hint, do we support hash shuffle as well? (It seems like it
shouldn't require a lot of extra work, right?)
This would deliver a complete new feature to users. Also, since FLIP-204
is stale for now, this new extension would give users a simpler way to
achieve the goal. WDYT?

Another small comment for the new interface:
> "... planner may not apply this partitioner in upsert mode ..."
> default boolean isDeterministic()
"upsert mode" should be "updating stream" or "non-insert-only stream".


Best,
Lincoln Lee


Wencong Liu wrote on Wed, Jun 12, 2024 at 21:43:

> Hi Jingsong,
>
>
> Some of the points you mentioned are currently clarified in
> the updated FLIP. Please check it out.
>
>
> 1. Enabling custom data distribution can be done through the
> LOOKUP SQL Hint. There are detailed examples provided in the FLIP.
>
>
> 2. We will add the isDeterministic method to the `InputDataPartitioner`
> interface, which will return true by default. If the
> `InputDataPartitioner`
> is not deterministic, the connector developer needs to override the
> isDeterministic method to return false. If the connector developer
> cannot ensure this protocol, they will need to bear the correctness
> issues that arise.
>
>
> 3. Yes, this feature will work in batch mode as well.
>
>
> Best regards,
> Wencong
>
>
>
>
>
> At 2024-06-11 23:47:40, "Jingsong Li"  wrote:
> >Hi all,
> >
> >+1 to this FLIP, many thanks to all for your proposal.
> >
> >isDeterministic looks good to me too.
> >
> >We can consider stating the following points:
> >
> >1. How to enable custom data distribution? Is it a dynamic hint? Can
> >you provide an SQL example.
> >
> >2. What impact will it have when the mainstream is changelog? Causing
> >disorder? This may need to be emphasized.
> >
> >3. Does this feature work in batch mode too?
> >
> >Best,
> >Jingsong
> >
> >On Tue, Jun 11, 2024 at 8:22 PM Wencong Liu  wrote:
> >>
> >> Hi Lincoln,
> >>
> >>
> >> Thanks for your reply. Weijie and I discussed these two issues offline,
> >> and here are the results of our discussion:
> >> 1. When the user utilizes the hash lookup join hint introduced by
> FLIP-204[1],
> >> the `SupportsLookupCustomShuffle` interface should be ignored. This is
> because
> >> the hash lookup join hint is directly specified by the user through a
> SQL HINT,
> >> which is more in line with user intuition. WDYT?
> >> 2. We agree with the introduction of the `isDeterministic` method. The
> >> `SupportsLookupCustomShuffle` interface introduces a custom shuffle,
> which
> >> can cause ADD/UPDATE_AFTER events (+I, +U) to appear
> >> after UPDATE_BEFORE/DELETE events (-D, -U), thus breaking the current
> >> limitations of the Flink Sink Operator[2]. If `isDeterministic` returns
> false and the
> >> changelog event type is not insert-only, the Planner should not apply
> the shuffle
> >> provided by `SupportsLookupCustomShuffle`.
> >>
> >>
> >> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
> >> [2]
> https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness
> >>
> >>
> >> Best,
> >> Wencong
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> At 2024-06-11 00:02:57, "Lincoln Lee"  wrote:
> >> >Hi Weijie,
> >> >
> >> >Thanks for your proposal, this will be a useful advanced optimization
> for
> >> >connector developers!
> >> >
> >> >I have two questions:
> >> >
> >> >1. FLIP-204[1] hash lookup join hint is mentioned in this FLIP, what's
> the
> >> >apply ordering of the two feature? For example, a connector that
> >> >implements the `SupportsLookupCustomShuffle` interface also has a
> >> >`SHUFFLE_HASH` lookup join hint specified by the user in sql, what's
> >> >the expected behavior?
> >> >
> >

Re: [DISCUSS] FLIP-462: Support Custom Data Distribution for Input Stream of Lookup Join

2024-06-10 Thread Lincoln Lee
Hi Weijie,

Thanks for your proposal, this will be a useful advanced optimization for
connector developers!

I have two questions:

1. The FLIP-204[1] hash lookup join hint is mentioned in this FLIP. In what
order are the two features applied? For example, if a connector that
implements the `SupportsLookupCustomShuffle` interface also has a
`SHUFFLE_HASH` lookup join hint specified by the user in SQL, what's
the expected behavior?

2. This FLIP considers the relationship with NDU processing, and I agree
with the current choice to prioritize NDU first. However, we should also
consider another issue: out-of-orderness of the changelog events in
streaming[2]. If the connector developer supplies a non-deterministic
partitioner, e.g., a random partitioner for anti-skew purposes, then it'll
break the assumption relied on by current SQL operators in streaming: the
ADD/UPDATE_AFTER events (+I, +U) always occur before their related
UPDATE_BEFORE/DELETE events (-D, -U), and they are always
processed by the same task even if a data shuffle is involved. So a
straightforward approach would be to add a method `isDeterministic` to
the `InputDataPartitioner` interface to explicitly tell the planner whether
the partitioner is deterministic or not (then the planner can reject the
non-deterministic custom partitioner for correctness requirements).
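
A minimal sketch of this idea, under stated assumptions: `InputDataPartitionerSketch` is a simplified, hypothetical stand-in for the FLIP's `InputDataPartitioner` (the join key is a plain `String` here), the default value of `true` for `isDeterministic` is assumed, and `CustomShuffleCheck` is an illustrative planner-side check rather than actual Flink planner code.

```java
import java.io.Serializable;
import java.util.concurrent.ThreadLocalRandom;

/** Simplified, hypothetical stand-in for the partitioner interface; keys are plain strings here. */
interface InputDataPartitionerSketch extends Serializable {

    /** Returns the target partition in the range [0, numPartitions). */
    int partition(String joinKey, int numPartitions);

    /** Assumed default: deterministic unless the connector developer overrides it. */
    default boolean isDeterministic() {
        return true;
    }
}

/** Hypothetical planner-side check, not Flink planner code. */
final class CustomShuffleCheck {

    /** Deterministic: the same key always maps to the same partition. */
    static InputDataPartitionerSketch hashPartitioner() {
        return (key, n) -> Math.floorMod(key.hashCode(), n);
    }

    /** Non-deterministic: spreads skewed keys randomly, so it must report false. */
    static InputDataPartitionerSketch randomPartitioner() {
        return new InputDataPartitionerSketch() {
            @Override
            public int partition(String joinKey, int numPartitions) {
                return ThreadLocalRandom.current().nextInt(numPartitions);
            }

            @Override
            public boolean isDeterministic() {
                return false;
            }
        };
    }

    /** A non-deterministic partitioner is only accepted when the input is insert-only. */
    static boolean canApply(InputDataPartitionerSketch partitioner, boolean inputIsInsertOnly) {
        return inputIsInsertOnly || partitioner.isDeterministic();
    }

    public static void main(String[] args) {
        System.out.println(canApply(hashPartitioner(), false));
        System.out.println(canApply(randomPartitioner(), false));
    }
}
```

The check lets a random partitioner through only for insert-only input, where no -U/-D events can overtake their related +I/+U events; for an updating stream the planner would skip the custom shuffle.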

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
[2]
https://www.ververica.com/blog/flink-sql-secrets-mastering-the-art-of-changelog-event-out-of-orderness


Best,
Lincoln Lee


Xintong Song wrote on Fri, Jun 7, 2024 at 13:53:

> +1 for this proposal.
>
> This FLIP will make it possible for each lookup join parallel task to only
> access and cache a subset of the data. This will significantly improve the
> performance and reduce the overhead when using Paimon for the dimension
> table. And it's general enough to also be leveraged by other connectors.
>
> Best,
>
> Xintong
>
>
>
> On Fri, Jun 7, 2024 at 10:01 AM weijie guo 
> wrote:
>
> > Hi devs,
> >
> >
> > I'd like to start a discussion about FLIP-462[1]: Support Custom Data
> > Distribution for Input Stream of Lookup Join.
> >
> >
> > Lookup Join is an important feature in Flink. It is typically used to
> > enrich a table with data that is queried from an external system.
> > If we interact with the external systems for each incoming record, we
> > incur significant network IO and RPC overhead.
> >
> > Therefore, most connectors introduce caching to reduce the per-record
> > level query overhead. However, because the data distribution of Lookup
> > Join's input stream is arbitrary, the cache hit rate is sometimes
> > unsatisfactory.
> >
> >
> > We want to introduce a mechanism for the connector to tell the Flink
> > planner its desired input stream data distribution or partitioning
> > strategy. This can significantly reduce the amount of cached data and
> > improve performance of Lookup Join.
> >
> >
> > You can find more details in this FLIP[1]. Looking forward to hearing
> > from you, thanks!
> >
> >
> > Best regards,
> >
> > Weijie
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-462+Support+Custom+Data+Distribution+for+Input+Stream+of+Lookup+Join
> >
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui

2024-06-05 Thread Lincoln Lee
Congratulations, Rui!

Best,
Lincoln Lee


Lijie Wang wrote on Thu, Jun 6, 2024 at 09:11:

> Congratulations, Rui!
>
> Best,
> Lijie
>
> Rodrigo Meneses wrote on Wed, Jun 5, 2024 at 21:35:
>
> > All the best
> >
> > On Wed, Jun 5, 2024 at 5:56 AM xiangyu feng 
> wrote:
> >
> > > Congratulations, Rui!
> > >
> > > Regards,
> > > Xiangyu Feng
> > >
> > > Feng Jin wrote on Wed, Jun 5, 2024 at 20:42:
> > >
> > > > Congratulations, Rui!
> > > >
> > > >
> > > > Best,
> > > > Feng Jin
> > > >
> > > > On Wed, Jun 5, 2024 at 8:23 PM Yanfei Lei 
> wrote:
> > > >
> > > > > Congratulations, Rui!
> > > > >
> > > > > Best,
> > > > > Yanfei
> > > > >
> > > > > Luke Chen wrote on Wed, Jun 5, 2024 at 20:08:
> > > > > >
> > > > > > Congrats, Rui!
> > > > > >
> > > > > > Luke
> > > > > >
> > > > > > On Wed, Jun 5, 2024 at 8:02 PM Jiabao Sun 
> > > > wrote:
> > > > > >
> > > > > > > Congrats, Rui. Well-deserved!
> > > > > > >
> > > > > > > Best,
> > > > > > > Jiabao
> > > > > > >
> > > > > > > Zhanghao Chen wrote on Wed, Jun 5, 2024 at 19:29:
> > > > > > >
> > > > > > > > Congrats, Rui!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Zhanghao Chen
> > > > > > > > 
> > > > > > > > From: Piotr Nowojski 
> > > > > > > > Sent: Wednesday, June 5, 2024 18:01
> > > > > > > > To: dev ; rui fan <
> 1996fan...@gmail.com>
> > > > > > > > Subject: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui
> > > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > On behalf of the PMC, I'm very happy to announce another new
> > > Apache
> > > > > Flink
> > > > > > > > PMC Member - Fan Rui.
> > > > > > > >
> > > > > > > > Rui has been active in the community since August 2019.
> During
> > > this
> > > > > time
> > > > > > > he
> > > > > > > > has contributed a lot of new features. Among others:
> > > > > > > >   - Decoupling Autoscaler from Kubernetes Operator, and
> > > supporting
> > > > > > > > Standalone Autoscaler
> > > > > > > >   - Improvements to checkpointing, flamegraphs, restart
> > > strategies,
> > > > > > > > watermark alignment, network shuffles
> > > > > > > >   - Optimizing the memory and CPU usage of large operators,
> > > greatly
> > > > > > > > reducing the risk and probability of TaskManager OOM
> > > > > > > >
> > > > > > > > He reviewed a significant amount of PRs and has been active
> > both
> > > on
> > > > > the
> > > > > > > > mailing lists and in Jira helping to both maintain and grow
> > > Apache
> > > > > > > Flink's
> > > > > > > > community. He is also our current Flink 1.20 release manager.
> > > > > > > >
> > > > > > > > In the last 12 months, Rui has been the most active
> contributor
> > > in
> > > > > the
> > > > > > > > Flink Kubernetes Operator project, while being the 2nd most
> > > active
> > > > > Flink
> > > > > > > > contributor at the same time.
> > > > > > > >
> > > > > > > > Please join me in welcoming and congratulating Fan Rui!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Piotrek (on behalf of the Flink PMC)
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Weijie Guo

2024-06-05 Thread Lincoln Lee
Congratulations, Weijie!

Best,
Lincoln Lee


ConradJam wrote on Wed, Jun 5, 2024 at 10:46:

> Congratulations!
>
> Biao Geng wrote on Wed, Jun 5, 2024 at 10:44:
>
> > Congratulations, Weijie!
> > Best,
> > Biao Geng
> >
> >
> > Yun Tang wrote on Wed, Jun 5, 2024 at 10:42:
> >
> > > Congratulations, Weijie!
> > >
> > > Best
> > > Yun Tang
> > > 
> > > From: Hangxiang Yu 
> > > Sent: Wednesday, June 5, 2024 10:00
> > > To: dev@flink.apache.org 
> > > Subject: Re: [ANNOUNCE] New Apache Flink PMC Member - Weijie Guo
> > >
> > > Congratulations, Weijie!
> > >
> > > On Tue, Jun 4, 2024 at 11:40 PM Zhanghao Chen <
> zhanghao.c...@outlook.com
> > >
> > > wrote:
> > >
> > > > Congrats, Weijie!
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > 
> > > > From: Hang Ruan 
> > > > Sent: Tuesday, June 4, 2024 16:37
> > > > To: dev@flink.apache.org 
> > > > Subject: Re: [ANNOUNCE] New Apache Flink PMC Member - Weijie Guo
> > > >
> > > > Congratulations Weijie!
> > > >
> > > > Best,
> > > > Hang
> > > >
> > > > Yanfei Lei wrote on Tue, Jun 4, 2024 at 16:24:
> > > >
> > > > > Congratulations!
> > > > >
> > > > > Best,
> > > > > Yanfei
> > > > >
> > > > > Leonard Xu wrote on Tue, Jun 4, 2024 at 16:20:
> > > > > >
> > > > > > Congratulations!
> > > > > >
> > > > > > Best,
> > > > > > Leonard
> > > > > >
> > > > > > > On Jun 4, 2024, at 4:02 PM, Yangze Guo wrote:
> > > > > > >
> > > > > > > Congratulations!
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Tue, Jun 4, 2024 at 4:00 PM Weihua Hu <
> huweihua@gmail.com
> > >
> > > > > wrote:
> > > > > > >>
> > > > > > >> Congratulations, Weijie!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Weihua
> > > > > > >>
> > > > > > >>
> > > > > > >> On Tue, Jun 4, 2024 at 3:03 PM Yuxin Tan <
> > tanyuxinw...@gmail.com>
> > > > > wrote:
> > > > > > >>
> > > > > > >>> Congratulations, Weijie!
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>> Yuxin
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Yuepeng Pan wrote on Tue, Jun 4, 2024 at 14:57:
> > > > > > >>>
> > > > > > >>>> Congratulations !
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> Best,
> > > > > > >>>> Yuepeng Pan
> > > > > > >>>>
> > > > > > >>>> At 2024-06-04 14:45:45, "Xintong Song" <
> tonysong...@gmail.com
> > >
> > > > > wrote:
> > > > > > >>>>> Hi everyone,
> > > > > > >>>>>
> > > > > > >>>>> On behalf of the PMC, I'm very happy to announce that
> Weijie
> > > Guo
> > > > > has
> > > > > > >>>> joined
> > > > > > >>>>> the Flink PMC!
> > > > > > >>>>>
> > > > > > >>>>> Weijie has been an active member of the Apache Flink
> > community
> > > > for
> > > > > many
> > > > > > >>>>> years. He has made significant contributions in many
> > > components,
> > > > > > >>> including
> > > > > > >>>>> runtime, shuffle, sdk, connectors, etc. He has driven /
> > > > > participated in
> > > > > > >>>>> many FLIPs, authored and reviewed hundreds of PRs, been
> > > > > consistently
> > > > > > >>>> active
> > > > > > >>>>> on mailing lists, and also helped with release management
> of
> > > 1.20
> > > > > and
> > > > > > >>>>> several other bugfix releases.
> > > > > > >>>>>
> > > > > > >>>>> Congratulations and welcome Weijie!
> > > > > > >>>>>
> > > > > > >>>>> Best,
> > > > > > >>>>>
> > > > > > >>>>> Xintong (on behalf of the Flink PMC)
> > > > > > >>>>
> > > > > > >>>
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best,
> > > Hangxiang.
> > >
> >
>
>
> --
> Best
>
> ConradJam
>


[jira] [Created] (FLINK-35498) Unexpected argument name conflict error when do extract method params from udf

2024-05-31 Thread lincoln lee (Jira)
lincoln lee created FLINK-35498:
---

 Summary: Unexpected argument name conflict error when do extract 
method params from udf
 Key: FLINK-35498
 URL: https://issues.apache.org/jira/browse/FLINK-35498
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.0, 1.20.0
Reporter: lincoln lee
Assignee: xuyang


Follow the steps to reproduce the error:

test case:

{code}
util.addTemporarySystemFunction("myudf", new TestXyz)
util.tableEnv.explainSql("select myudf(f1, f2) from t")
{code}

 

udf: TestXyz 

{code}
public class TestXyz extends ScalarFunction {
    public String eval(String s1, String s2) {
        String localV1;

        if (s1 == null) {
            if (s2 != null) {
                localV1 = s2;
            } else {
                localV1 = s2 + s1;
            }
        } else {
            if ("xx".equals(s2)) {
                localV1 = s1.length() >= s2.length() ? s1 : s2;
            } else {
                localV1 = s1;
            }
        }
        if (s1 == null) {
            return s2 + localV1;
        }
        if (s2 == null) {
            return s1;
        }
        return s1.length() >= s2.length() ? s1 + localV1 : s2;
    }
}
{code}

 

error stack:

{code}

Caused by: org.apache.flink.table.api.ValidationException: Unable to extract a type inference from method:
public java.lang.String org.apache.flink.table.planner.runtime.utils.TestXyz.eval(java.lang.String,java.lang.String)
    at org.apache.flink.table.types.extraction.ExtractionUtils.extractionError(ExtractionUtils.java:362)
    at org.apache.flink.table.types.extraction.BaseMappingExtractor.extractResultMappings(BaseMappingExtractor.java:154)
    at org.apache.flink.table.types.extraction.BaseMappingExtractor.extractOutputMapping(BaseMappingExtractor.java:100)
    ... 53 more
Caused by: org.apache.flink.table.api.ValidationException: Argument name conflict, there are at least two argument names that are the same.
    at org.apache.flink.table.types.extraction.ExtractionUtils.extractionError(ExtractionUtils.java:362)
    at org.apache.flink.table.types.extraction.ExtractionUtils.extractionError(ExtractionUtils.java:357)
    at org.apache.flink.table.types.extraction.FunctionSignatureTemplate.of(FunctionSignatureTemplate.java:73)
    at org.apache.flink.table.types.extraction.BaseMappingExtractor.lambda$createParameterSignatureExtraction$9(BaseMappingExtractor.java:381)
    at org.apache.flink.table.types.extraction.BaseMappingExtractor.putExtractedResultMappings(BaseMappingExtractor.java:298)
    at org.apache.flink.table.types.extraction.BaseMappingExtractor.collectMethodMappings(BaseMappingExtractor.java:244)
    at org.apache.flink.table.types.extraction.BaseMappingExtractor.extractResultMappings(BaseMappingExtractor.java:137)
    ... 54 more

{code}
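
For readers unfamiliar with the inner exception: its message suggests a uniqueness check over the extracted argument names. The following is only a minimal sketch of such a check, not Flink's code. The class ArgumentNameCheck and its methods are hypothetical, and the second input in main is purely illustrative, since the report does not say which names the extractor actually produced.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Flink: illustrates a uniqueness check on argument names.
class ArgumentNameCheck {

    /** Returns each name that occurs more than once, listed once, in order of first repeat. */
    static List<String> findDuplicates(List<String> argumentNames) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String name : argumentNames) {
            // Set.add returns false when the name was already present.
            if (!seen.add(name) && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    /** Throws if any argument name is repeated. */
    static void validate(List<String> argumentNames) {
        List<String> duplicates = findDuplicates(argumentNames);
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException("Argument name conflict: " + duplicates);
        }
    }

    public static void main(String[] args) {
        // The declared parameters of eval(String s1, String s2) are unique, so this passes.
        validate(List.of("s1", "s2"));
        try {
            // Illustrative only: an extraction step that yields the same name twice would fail.
            validate(List.of("localV1", "localV1"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Since eval declares two distinct parameters, a check of this kind can only fail if the extracted names differ from the declared ones, which is presumably why the error is reported as unexpected.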



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-457: Improve Table/SQL Configuration for Flink 2.0

2024-05-23 Thread Lincoln Lee
+1(binding)

Best,
Lincoln Lee


Jane Chan wrote on Fri, May 24, 2024 at 09:52:

> Hi all,
>
> I'd like to start a vote on FLIP-457[1] after reaching a consensus through
> the discussion thread[2].
>
> The vote will be open for at least 72 hours unless there is an objection or
> insufficient votes.
>
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=307136992
> [2] https://lists.apache.org/thread/1sthbv6q00sq52pp04n2p26d70w4fqj1
>
> Best,
> Jane
>


Re: [DISCUSSION] FLIP-457: Improve Table/SQL Configuration for Flink 2.0

2024-05-21 Thread Lincoln Lee
Hi Jane,

Thanks for the updates!

Just one small comment on the options in IncrementalAggregateRule
& RelNodeBlock, should we also change the API level from Experimental
to PublicEvolving?


Best,
Lincoln Lee


Jane Chan wrote on Tue, May 21, 2024 at 16:41:

> Hi all,
>
> Thanks for your valuable feedback!
>
> To @Xuannan
>
> For options to be moved to another module/package, I think we have to
> > mark the old option deprecated in 1.20 for it to be removed in 2.0,
> > according to the API compatibility guarantees[1]. We can introduce the
> > new option in 1.20 with the same option key in the intended class.
>
>
> Good point, fixed.
>
> To @Lincoln and @Benchao
>
> Thanks for sharing the insights into the historical context of which I was
> unaware. I've reorganized the sheet.
>
> 3. Regarding WindowEmitStrategy, IIUC it is currently unsupported on TVF
> > window, so it's recommended to keep it untouched for now and follow up in
> > FLINK-29692
>
>
> How to tackle the configuration is up to whether to remove the legacy
> window aggregate in 2.0, and I've updated the FLIP to leverage this part to
> FLINK-29692.
>
> Please let me know if that answers your questions or if you have other
> comments.
>
> Best,
> Jane
>
>
> On Mon, May 20, 2024 at 1:52 PM Ron Liu  wrote:
>
> > Hi, Lincoln
> >
> > >  2. Regarding the options in HashAggCodeGenerator, since this new
> feature
> > has gone
> > through a couple of release cycles and could be considered for
> > PublicEvolving now,
> > cc @Ron Liu   WDYT?
> >
> > Thanks for cc'ing me, +1 for making these options public now.
> >
> > Best,
> > Ron
> >
> > Benchao Li wrote on Mon, May 20, 2024 at 13:08:
> >
> > > I agree with Lincoln about the experimental features.
> > >
> > > Some of these configurations do not even have proper implementation,
> > > take 'table.exec.range-sort.enabled' as an example, there was a
> > > discussion[1] about it before.
> > >
> > > [1] https://lists.apache.org/thread/q5h3obx36pf9po28r0jzmwnmvtyjmwdr
> > >
> > > Lincoln Lee wrote on Mon, May 20, 2024 at 12:01:
> > > >
> > > > Hi Jane,
> > > >
> > > > Thanks for the proposal!
> > > >
> > > > +1 for the changes except for these annotated as experimental ones.
> > > >
> > > > For the options annotated as experimental,
> > > >
> > > > +1 for the moving of IncrementalAggregateRule & RelNodeBlock.
> > > >
> > > > For the rest of the options, there are some suggestions:
> > > >
> > > > 1. for the batch related parameters, it's recommended to either
> delete
> > > > them (leaving the necessary defaults value in place) or leave them as
> > > they
> > > > are. Including:
> > > > FlinkRelMdRowCount
> > > > FlinkRexUtil
> > > > BatchPhysicalSortRule
> > > > JoinDeriveNullFilterRule
> > > > BatchPhysicalJoinRuleBase
> > > > BatchPhysicalSortMergeJoinRule
> > > >
> > > > What I understand about the history of these options is that they
> were
> > > once
> > > > used for fine
> > > > tuning for tpc testing, and the current flink planner no longer
> relies
> > on
> > > > these internal
> > > > options when testing tpc[1]. In addition, these options are too
> obscure
> > > for
> > > > SQL users,
> > > > and some of them are actually magic numbers.
> > > >
> > > > 2. Regarding the options in HashAggCodeGenerator, since this new
> > feature
> > > > has gone
> > > > through a couple of release cycles and could be considered for
> > > > PublicEvolving now,
> > > > cc @Ron Liu   WDYT?
> > > >
> > > > 3. Regarding WindowEmitStrategy, IIUC it is currently unsupported on
> > TVF
> > > > window, so
> > > > it's recommended to keep it untouched for now and follow up in
> > > > FLINK-29692[2]. cc @Xuyang 
> > > >
> > > > [1]
> > > >
> > >
> >
> https://github.com/ververica/flink-sql-benchmark/blob/master/tools/common/flink-conf.yaml
> > > > [2] https://issues.apache.org/jira/browse/FLINK-29692
> > > >
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > >
> > > > Yubin Li wrote on Fri, May 17, 2024 at 10:49:
> > > >
> > > > > Hi Jane,
>

Re: [DISCUSS] Flink upgrade compatibility page not updated for Flink 1.19

2024-05-21 Thread Lincoln Lee
Thanks Aleksandr for fixing this missing change!

It was my oversight during the last release. Before rc1 I checked offline with
@Hangxiang Yu about the 1.19 compatibility changes, and we confirmed
that there were none (so your PR should be correct), but in the end the update
didn't make it into the subsequent RCs.

I checked the flink release wiki page[1] again, and this update is
mentioned in item 8 of the checklist in the "Creating Release Branches"
section, but it's not listed in jira[2] (I've added a note to it).

Also cc the 1.20 RM @weijie guo: please remember to include this
documentation update in the next release.

[1]
https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
[2] https://issues.apache.org/jira/browse/FLINK-34282

Best,
Lincoln Lee


Aleksandr Pilipenko wrote on Tue, May 21, 2024 at 18:28:

> Hi all,
>
> The current version of the documentation is missing savepoint compatibility
> data for Flink 1.19 [1].
> I have created a ticket [2] and PRs to address this, but wanted to clarify
> whether there are any changes that make savepoint compatibility different from
> previous releases. I did not find any changes related to compatibility in
> the release notes [3].
>
> Additionally, is an update to this page included in the process of minor
> version release?
>
> 1:
>
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/ops/upgrading/#compatibility-table
> 2: https://issues.apache.org/jira/browse/FLINK-35383
> 3:
>
> https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.19/#checkpoints
>
> Thanks,
> Aleksandr
>


Re: [DISCUSSION] FLIP-457: Improve Table/SQL Configuration for Flink 2.0

2024-05-19 Thread Lincoln Lee
Hi Jane,

Thanks for the proposal!

+1 for the changes except for these annotated as experimental ones.

For the options annotated as experimental,

+1 for the moving of IncrementalAggregateRule & RelNodeBlock.

For the rest of the options, there are some suggestions:

1. For the batch-related parameters, it's recommended to either delete
them (leaving the necessary default values in place) or leave them as they
are. Including:
FlinkRelMdRowCount
FlinkRexUtil
BatchPhysicalSortRule
JoinDeriveNullFilterRule
BatchPhysicalJoinRuleBase
BatchPhysicalSortMergeJoinRule

What I understand about the history of these options is that they were once
used for fine tuning for tpc testing, and the current flink planner no longer
relies on these internal options when testing tpc[1]. In addition, these
options are too obscure for SQL users, and some of them are actually magic
numbers.

2. Regarding the options in HashAggCodeGenerator, since this new feature has
gone through a couple of release cycles and could be considered for
PublicEvolving now, cc @Ron Liu WDYT?

3. Regarding WindowEmitStrategy, IIUC it is currently unsupported on TVF
window, so it's recommended to keep it untouched for now and follow up in
FLINK-29692[2]. cc @Xuyang

[1]
https://github.com/ververica/flink-sql-benchmark/blob/master/tools/common/flink-conf.yaml
[2] https://issues.apache.org/jira/browse/FLINK-29692


Best,
Lincoln Lee


Yubin Li wrote on Fri, May 17, 2024 at 10:49:

> Hi Jane,
>
> Thank Jane for driving this proposal !
>
> This makes sense for users, +1 for that.
>
> Best,
> Yubin
>
> On Thu, May 16, 2024 at 3:17 PM Jark Wu  wrote:
> >
> > Hi Jane,
> >
> > Thanks for the proposal. +1 from my side.
> >
> >
> > Best,
> > Jark
> >
> > On Thu, 16 May 2024 at 10:28, Xuannan Su  wrote:
> >
> > > Hi Jane,
> > >
> > > Thanks for driving this effort! And +1 for the proposed changes.
> > >
> > > I have one comment on the migration plan.
> > >
> > > For options to be moved to another module/package, I think we have to
> > > mark the old option deprecated in 1.20 for it to be removed in 2.0,
> > > according to the API compatibility guarantees[1]. We can introduce the
> > > new option in 1.20 with the same option key in the intended class.
> > > WDYT?
> > >
> > > Best,
> > > Xuannan
> > >
> > > [1]
> > >
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/upgrading/#api-compatibility-guarantees
> > >
> > >
> > >
> > > On Wed, May 15, 2024 at 6:20 PM Jane Chan 
> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I'd like to start a discussion on FLIP-457: Improve Table/SQL
> > > Configuration
> > > > for Flink 2.0 [1]. This FLIP revisited all Table/SQL configurations
> to
> > > > improve user-friendliness and maintainability as Flink moves toward
> 2.0.
> > > >
> > > > I am looking forward to your feedback.
> > > >
> > > > Best regards,
> > > > Jane
> > > >
> > > > [1]
> > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=307136992
> > >
>
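
The migration step suggested in this thread (deprecate the old option in 1.20 and introduce the new option under the same key in the intended class) could look roughly like the sketch below. This is an illustration under assumptions, not Flink code: BoolOption is a stand-in for a typed config option, and the class names and the key "table.example.enabled" are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch of moving an option to a new class while keeping its key.
class OptionMigrationSketch {

    /** Minimal stand-in for a typed boolean config option; not Flink's ConfigOption. */
    static final class BoolOption {
        final String key;
        final boolean defaultValue;

        BoolOption(String key, boolean defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }

        /** Reads the option from a string-keyed configuration, falling back to the default. */
        boolean read(Map<String, String> conf) {
            String raw = conf.get(key);
            return raw == null ? defaultValue : Boolean.parseBoolean(raw);
        }
    }

    /** Old location: kept for one release, marked deprecated. */
    static final class InternalRuleOptions {
        @Deprecated
        static final BoolOption EXAMPLE_ENABLED = new BoolOption("table.example.enabled", false);
    }

    /** New location: same key and default, so existing user configuration keeps working. */
    static final class PublicOptimizerOptions {
        static final BoolOption EXAMPLE_ENABLED = new BoolOption("table.example.enabled", false);
    }

    public static void main(String[] args) {
        Map<String, String> userConf = Map.of("table.example.enabled", "true");
        // Both declarations resolve the same user setting because the key is unchanged.
        System.out.println(InternalRuleOptions.EXAMPLE_ENABLED.read(userConf));
        System.out.println(PublicOptimizerOptions.EXAMPLE_ENABLED.read(userConf));
    }
}
```

Keeping the key identical means only code that references the old constant has to change; user configuration files are unaffected by the move.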


Re: [VOTE] FLIP-452: Allow Skipping Invocation of Function Calls While Constant-folding

2024-05-09 Thread Lincoln Lee
+1 (binding)


Best,
Lincoln Lee


Benchao Li wrote on Thu, May 9, 2024 at 13:02:

> +1 (binding)
>
> Timo Walther wrote on Wed, May 8, 2024 at 17:15:
> >
> > +1 (binding)
> >
> > Thanks,
> > Timo
> >
> > On 08.05.24 11:10, Stefan Richter wrote:
> > > Hi Alan,
> > >
> > > Thanks for this proposal, the ability to exclude functions from
> constant folding makes sense to me.
> > >
> > > +1 (binding)
> > >
> > > Best,
> > > Stefan
> > >
> > >> On 8. May 2024, at 02:01, Alan Sheinberg 
> > >> 
> wrote:
> > >>
> > >> Hi everyone,
> > >>
> > >> I'd like to start a vote on FLIP-452 [1]. It covers adding a new
> method
> > >> FunctionDefinition.supportsConstantFolding() as part of the Flink
> Table/SQL
> > >> API to allow skipping invocation of functions while constant-folding.
> It
> > >> has been discussed in this thread [2].
> > >>
> > >> I would like to start a vote.  The vote will be open for at least 72
> hours
> > >> unless there is an objection or insufficient votes.
> > >>
> > >> [1]
> > >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-452%3A+Allow+Skipping+Invocation+of+Function+Calls+While+Constant-folding
> > >>
> > >> [2]
> https://lists.apache.org/thread/ko5ndv5kr87nm011psll2hzzd0nn3ztz
> > >>
> > >> Thanks,
> > >> Alan
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>


Re: Re: [VOTE] FLIP-448: Introduce Pluggable Workflow Scheduler Interface for Materialized Table

2024-05-09 Thread Lincoln Lee
+1 (binding)

Best,
Lincoln Lee


Feng Jin wrote on Thu, May 9, 2024 at 19:45:

> +1 (non-binding)
>
>
> Best,
> Feng
>
>
> On Thu, May 9, 2024 at 7:37 PM Xuyang  wrote:
>
> > +1 (non-binding)
> >
> >
> > --
> >
> > Best!
> > Xuyang
> >
> >
> >
> >
> >
> > At 2024-05-09 13:57:07, "Ron Liu"  wrote:
> > >Sorry for the re-post, just to format this email content.
> > >
> > >Hi Dev
> > >
> > >Thank you to everyone for the feedback on FLIP-448: Introduce Pluggable
> > >Workflow Scheduler Interface for Materialized Table[1][2].
> > >I'd like to start a vote for it. The vote will be open for at least 72
> > >hours unless there is an objection or not enough votes.
> > >
> > >[1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-448%3A+Introduce+Pluggable+Workflow+Scheduler+Interface+for+Materialized+Table
> > >
> > >[2] https://lists.apache.org/thread/57xfo6p25rbrhcg01dhyok46zt6jc5q1
> > >
> > >Best,
> > >Ron
> > >
> > >Ron Liu wrote on Thu, May 9, 2024 at 13:52:
> > >
> > >> Hi Dev, Thank you to everyone for the feedback on FLIP-448: Introduce
> > >> Pluggable Workflow Scheduler Interface for Materialized Table[1][2].
> I'd
> > >> like to start a vote for it. The vote will be open for at least 72
> hours
> > >> unless there is an objection or not enough votes. [1]
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-448%3A+Introduce+Pluggable+Workflow+Scheduler+Interface+for+Materialized+Table
> > >>
> > >> [2] https://lists.apache.org/thread/57xfo6p25rbrhcg01dhyok46zt6jc5q1
> > >> Best, Ron
> > >>
> >
>


Re: [Discuss] FLIP-452: Allow Skipping Invocation of Function Calls While Constant-folding

2024-05-06 Thread Lincoln Lee
Thanks Alan for starting this flip!

+1 for the new `supportsConstantFolding`. We do need to reserve the
ability to not perform constant folding for certain cases.

Best,
Lincoln Lee


Alan Sheinberg wrote on Tue, May 7, 2024 at 08:32:

> Hi Muhammet, Timo
>
> The `isDeterministic()`[1] function is mentioned in the documentation,
> > I would suggest to add maybe a section for `supportsConstantFolding()`,
> > with short description and examples use cases (similar to the
> > motivation of the FLIP) where this could be useful in UDFs.
> >
> Thanks for the suggestion.  I'll do that.
>
>  Feel free to start a vote thread by tomorrow. If there are no objections?
>
> Sounds good. I'll do that tomorrow if I hear no objections by then.
>
> Thanks,
> Alan
>
> On Mon, May 6, 2024 at 8:12 AM Timo Walther  wrote:
>
> > Hi Alan,
> >
> > thanks for the update. From my side, the FLIP seems good for voting.
> >
> > Since it only touches a small API surface, I guess the proposal is not
> > very controversial.
> >
> > Feel free to start a vote thread by tomorrow. If there are no objections?
> >
> > Thanks,
> > Timo
> >
> >
> > On 02.05.24 09:45, Muhammet Orazov wrote:
> > > Hey Alan,
> > >
> > > Thanks for the proposal, +1!
> > >
> > > The `isDeterministic()`[1] function is mentioned in the documentation,
> > > I would suggest to add maybe a section for `supportsConstantFolding()`,
> > > with short description and examples use cases (similar to the
> > > motivation of the FLIP) where this could be useful in UDFs.
> > >
> > > Best,
> > > Muhammet
> > >
> > > [1]:
> > >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/functions/udfs/#evaluation-methods
> > >
> > >
> > > On 2024-04-29 22:57, Alan Sheinberg wrote:
> > >> I'd like to start a discussion of FLIP-452: Allow Skipping Invocation
> of
> > >> Function Calls While Constant-folding [1]
> > >>
> > >> This feature proposes adding a new
> > >> method FunctionDefinition.allowConstantFolding() as part of the Flink
> > >> Table/SQL API.  This would be used to determine whether an expression
> > >> containing this function should have constant-folding logic run on it,
> > >> invoking the function at planning time.
> > >>
> > >> The current behavior of always doing constant-folding on function
> > >> calls is
> > >> problematic for UDFs which invoke RPCs or have other side effects in
> > >> external systems.  In these cases, you either don’t want these actions
> > to
> > >> occur during planning time, or it may be important to happen on a per
> > >> result row basis.
> > >>
> > >> Note that this is a bit different than
> > >> FunctionDefinition.isDeterministic(), and can exist along-side it.
> > >>
> > >> Looking forward to your feedback and suggestions.
> > >>
> > >> [1]
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-452%3A+Allow+Skipping+Invocation+of+Function+Calls+While+Constant-folding
> > >>
> > >>
> > >> Thanks,
> > >> Alan
> > >
> >
> >
>
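
To make the proposal discussed above concrete, here is a small self-contained sketch of how a planner might consult such a flag. It is an assumption-laden illustration, not Flink's implementation: the FunctionDefinition interface below is a local stand-in, planCall is a hypothetical planner step, and the default value of supportsConstantFolding() is assumed to be true.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a planner step that folds constant calls only when the function allows it.
class ConstantFoldingSketch {

    /** Local stand-in for a function definition; not Flink's real interface. */
    interface FunctionDefinition {
        String call(String arg);

        default boolean isDeterministic() {
            return true;
        }

        /** Method name taken from the vote thread; the default is an assumption. */
        default boolean supportsConstantFolding() {
            return true;
        }
    }

    /** Counts invocations of the side-effecting function below. */
    static final AtomicInteger RPC_CALLS = new AtomicInteger();

    /** A pure function that is safe to evaluate at planning time. */
    static final FunctionDefinition UPPER = arg -> arg.toUpperCase();

    /** A function with an external side effect that opts out of constant folding. */
    static final FunctionDefinition RPC_LOOKUP = new FunctionDefinition() {
        @Override
        public String call(String arg) {
            RPC_CALLS.incrementAndGet(); // stands in for an RPC to an external system
            return "remote:" + arg;
        }

        @Override
        public boolean supportsConstantFolding() {
            return false;
        }
    };

    /** Evaluates the call during planning only if the function is deterministic and allows folding. */
    static String planCall(FunctionDefinition fn, String literalArg) {
        if (fn.isDeterministic() && fn.supportsConstantFolding()) {
            return "LITERAL(" + fn.call(literalArg) + ")";
        }
        return "CALL(" + literalArg + ")";
    }

    public static void main(String[] args) {
        System.out.println(planCall(UPPER, "abc"));      // folded at planning time
        System.out.println(planCall(RPC_LOOKUP, "abc")); // left as a runtime call
        System.out.println(RPC_CALLS.get());             // the side effect never ran during planning
    }
}
```

The sketch keeps isDeterministic() and supportsConstantFolding() as separate checks, reflecting the remark in the thread that the two can exist alongside each other.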


Re: [DISCUSS] FLIP-448: Introduce Pluggable Workflow Scheduler Interface for Materialized Table

2024-04-30 Thread Lincoln Lee
Thanks Ron for starting this flip! It will complete the user story for
flip-435[1].

Regarding the WorkflowOperation, I have a question about whether we
should add Delete/DropWorkflowOperation as well for when the
Materialized Table is dropped or refresh mode changed from full to
continuous?

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines?src=contextnavpagetreemode


Best,
Lincoln Lee


 wrote on Tue, Apr 30, 2024 at 15:37:

> Hello Ron, thank you for your detailed answers!
>
> For the Visitor pattern, I thought about it the other way around, so that
> operations visit the scheduler, and not vice-versa :) In this way
> operations can get the required information in order to be executed in a
> tailored way.
>
> Thank you for your effort, but, as you say:
> > Furthermore, I don't see the benefit at this time; here simpler is
> > better. Similar to the design of CatalogModificationEvent[2] and
> > CatalogModificationListener[3], the developer only needs an instanceof
> > check.
>
> In java, most of the times, `instanceof` is considered an anti-pattern,
> that's why I was also thinking about a command pattern (every operations
> defines an `execute` method). However, I also understand this part is not
> crucial for the FLIP under discussion, and the implementation details can
> simply wait for the PRs to come.
>
> > After discussing with Shengkai offline, there is no need for this REST
> API
> to support multiple tables to be refreshed at the same time, so it would be
> more appropriate to put the materialized table identifier in the path of
> the URL, thanks for the suggestion.
>
> Very good!
>
> Thank you!
> On Apr 29, 2024 at 05:04 +0200, Ron Liu , wrote:
> > Hi, Lorenzo
> >
> > > I have a question there: how can the gateway update the refreshHandler
> in
> > the Catalog before getting it from the scheduler?
> >
> > The refreshHandler in CatalogMaterializedTable is null before getting it
> from
> > the scheduler, you can look at the CatalogMaterializedTable.Builder[1]
> for
> > more details.
> >
> > > You have a typo here: WorkflowScheudler -> WorkflowScheduler :)
> >
> > Fix it now, thanks very much.
> >
> > > For the operations part, I still think that the FLIP would benefit from
> > providing a specific pattern for operations. You could either propose a
> > command pattern [1] or a visitor pattern (where the scheduler visits the
> > operation to get relevant info) [2] for those operations at your choice.
> >
> > Thank you for your input, I find it very useful. I tried to understand
> your
> > thinking through code and implemented the following pseudo code using the
> > visitor design pattern:
> > 1. first defined WorkflowOperationVisitor, providing several overloaded
> > visit methods.
> >
> > public interface WorkflowOperationVisitor {
> >
> >     <T extends RefreshHandler> T visit(CreateWorkflowOperation createWorkflowOperation);
> >
> >     void visit(ModifyWorkflowOperation operation);
> > }
> >
> > 2. then in the WorkflowOperation add the accept method.
> >
> > @PublicEvolving
> > public interface WorkflowOperation {
> >
> > void accept(WorkflowOperationVisitor visitor);
> > }
> >
> >
> > 3. in the WorkflowScheduler call the implementation class of
> > WorkflowOperationVisitor, complete the corresponding operations.
> >
> > I recognize this design pattern purely from a code design point of view,
> > but from the point of our specific scenario:
> > 1. For CreateWorkflowOperation, the visit method needs to return a
> > RefreshHandler; for ModifyWorkflowOperation, such as suspend and resume,
> > the visit method doesn't need to return one. Because the return type
> > currently differs across WorkflowOperations, WorkflowOperationVisitor#accept
> > can't be unified, so I think visitor may not be applicable here.
> >
> > 2. In addition, I think using the visitor pattern will add complexity to
> > the WorkflowScheduler implementer, which needs to implement one more
> > interface WorkflowOperationVisitor, this interface is not for the engine
> to
> > use, so I don't see any benefit from this design at the moment.
> >
> > 3. Furthermore, I don't see the benefit at this time; here simpler is
> > better. Similar to the design of CatalogModificationEvent[2] and
> > CatalogModificationListener[3], the developer only needs an instanceof
> > check.
> >
> > To summarize, I don't think there is a need to introduce command or
> visitor
>
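
The instanceof-based dispatch mentioned in this exchange can be sketched as follows. This is a hedged illustration only: the type names WorkflowOperation, CreateWorkflowOperation, ModifyWorkflowOperation and RefreshHandler come from the thread, but their bodies here are placeholders, and SimpleScheduler with its fields is hypothetical.

```java
// Hypothetical sketch of a scheduler that branches on the operation type via instanceof.
class WorkflowDispatchSketch {

    /** Placeholder marker type; the real interface is defined by the FLIP. */
    interface WorkflowOperation {}

    /** Placeholder for the handler returned when a workflow is created. */
    interface RefreshHandler {
        String asSummaryString();
    }

    static final class CreateWorkflowOperation implements WorkflowOperation {
        final String tableIdentifier;

        CreateWorkflowOperation(String tableIdentifier) {
            this.tableIdentifier = tableIdentifier;
        }
    }

    /** Stand-in for a modify operation such as suspend or resume. */
    static final class ModifyWorkflowOperation implements WorkflowOperation {
        final String action;

        ModifyWorkflowOperation(String action) {
            this.action = action;
        }
    }

    /** Hypothetical scheduler; records the last action so the effect is observable. */
    static final class SimpleScheduler {
        String lastAction = "none";

        /** Returns a handler for create operations and null for operations that yield none. */
        RefreshHandler handle(WorkflowOperation op) {
            if (op instanceof CreateWorkflowOperation) {
                String id = ((CreateWorkflowOperation) op).tableIdentifier;
                lastAction = "create";
                return () -> "workflow-for-" + id;
            } else if (op instanceof ModifyWorkflowOperation) {
                lastAction = ((ModifyWorkflowOperation) op).action;
                return null;
            }
            throw new IllegalArgumentException(
                    "Unsupported operation: " + op.getClass().getSimpleName());
        }
    }

    public static void main(String[] args) {
        SimpleScheduler scheduler = new SimpleScheduler();
        RefreshHandler handler = scheduler.handle(new CreateWorkflowOperation("db.mt"));
        System.out.println(handler.asSummaryString());
        scheduler.handle(new ModifyWorkflowOperation("suspend"));
        System.out.println(scheduler.lastAction);
    }
}
```

The trade-off raised in the thread shows up directly: the scheduler implements a single method, but every new operation type needs another branch, which is the concern behind the command and visitor suggestions.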

Re: [ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee

2024-04-18 Thread Lincoln Lee
Thanks everyone : )

Best,
Lincoln Lee


Feifan Wang wrote on Tue, Apr 16, 2024 at 12:44:

> Congratulations Lincoln !
>
>
> ——
>
> Best regards,
>
> Feifan Wang
>
>
>
>
> At 2024-04-12 15:59:00, "Jark Wu"  wrote:
> >Hi everyone,
> >
> >On behalf of the PMC, I'm very happy to announce that Lincoln Lee has
> >joined the Flink PMC!
> >
> >Lincoln has been an active member of the Apache Flink community for
> >many years. He mainly works on Flink SQL component and has driven
> >/pushed many FLIPs around SQL, including FLIP-282/373/415/435 in
> >the recent versions. He has a great technical vision of Flink SQL and
> >participated in plenty of discussions in the dev mailing list. Besides
> >that,
> >he is community-minded, such as being the release manager of 1.19,
> >verifying releases, managing release syncs, writing the release
> >announcement etc.
> >
> >Congratulations and welcome Lincoln!
> >
> >Best,
> >Jark (on behalf of the Flink PMC)
>


Re: [VOTE] FLIP-435: Introduce a New Materialized Table for Simplifying Data Pipelines

2024-04-17 Thread Lincoln Lee
+1(binding)

Best,
Lincoln Lee


Ferenc Csaky wrote on Wed, Apr 17, 2024 at 19:58:

> +1 (non-binding)
>
> Best,
> Ferenc
>
>
>
>
> On Wednesday, April 17th, 2024 at 10:26, Ahmed Hamdy 
> wrote:
>
> >
> >
> > + 1 (non-binding)
> >
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Wed, 17 Apr 2024 at 08:28, Yuepeng Pan panyuep...@apache.org wrote:
> >
> > > +1(non-binding).
> > >
> > > Best,
> > > Yuepeng Pan
> > >
> > > At 2024-04-17 14:27:27, "Ron liu" ron9@gmail.com wrote:
> > >
> > > > Hi Dev,
> > > >
> > > > Thank you to everyone for the feedback on FLIP-435: Introduce a New
> > > > Materialized Table for Simplifying Data Pipelines[1][2].
> > > >
> > > > I'd like to start a vote for it. The vote will be open for at least
> 72
> > > > hours unless there is an objection or not enough votes.
> > > >
> > > > [1]
> > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines
> > >
> > > > [2] https://lists.apache.org/thread/c1gnn3bvbfs8v1trlf975t327s4rsffs
> > > >
> > > > Best,
> > > > Ron
>


Re: [ANNOUNCE] New Apache Flink Committer - Zakelly Lan

2024-04-14 Thread Lincoln Lee
Congratulations, Zakelly!

Best,
Lincoln Lee


Yanfei Lei wrote on Mon, Apr 15, 2024 at 12:51:

> Congratulations, Zakelly!
>
> Best regards,
> Yanfei
>
> Junrui Lee wrote on Mon, Apr 15, 2024 at 12:06:
> >
> > Congratulations, Zakelly!
> >
> > Best,
> > Junrui
> >
> > yue ma wrote on Mon, Apr 15, 2024 at 12:00:
> >
> > > Congratulations Zakelly!
> > >
> > > Best,
> > > Yue
> > >
>


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-12 Thread Lincoln Lee
Thanks Jamie for your thoughtful inputs!

Regarding the necessity of supporting updates, I'd like to add a scenario
where the base table for the 'view' created by the user does not entirely
come from a database(under the relational model).

For example, in a specific scenario: personalized user recommendations
are generated based on the user's visit records on an e-commerce website
(stored in a message queue such as Kafka and retained for a period of time)
combined with other dimensional information to create content for user
recommendations (this computation process + the final result table is the
new object we hope to introduce).

If a user requests the deletion of their visit data, but the message queue
cannot directly provide data updates like a database and generate a changelog
to trigger the update of the final result table, then deleting the relevant
user's data in the result table directly based on DML, just like operating a
database table, is the most straightforward approach:
```
DELETE FROM target_table WHERE uid LIKE '...';
```
This is also one of the reasons we introduced FLIP-282.


Best,
Lincoln Lee


Jamie Grier wrote on Sat, Apr 13, 2024 at 03:34:

> >
> > In the SQL standard regarding the definition of View, there are the
> > following restrictions:
>
>
> 1. Partitioned view is not supported.
>
>
> I agree that partitioned "views" don't really make sense since there is no
> storage to partition.  However, a bit of a search reveals that partitioned
> "materialized views" seem common in database systems [1][2][3]
>
> 2. Modification of the data generated by views is not supported.
>
>
> Modification of data generated by materialized views should not be
> allowed.  I agree with this which is why I wanted to dig into that
> requirement more deeply.  The approach one would normally use here would be
> to update or delete the base tables and that would propagate to the
> materialized view.  I actually don't see why you would want to
> directly manipulate the data of an MV or a Dynamic Table (or whatever we
> call it)  for that matter.  It should always be equal to the query executed
> over the base tables.  I don't understand the use case where you'd want to
> manipulate it directly and thus violate this basic property.
>
> 3. Alteration of a View's schema, such as adding columns, is not supported.
>
> This seems like an issue to be solved regardless of what we call this.
> However schema evolution is handled I think the approach can be used
> regardless of what we call this object.
>
> I think the main objection/concern I have here is that I actually don't
> think we want to allow direct modification of these things whatever they
> are called, and with that requirement lifted I prefer the Materialized View
> since existing SQL users will already understand what it means.
>
> So, I still prefer Materialized View to any other proposal, and further
> don't think direct updates of this object should be supported regardless of
> what it's called.  My perspective is primarily from the point of view of
> keeping the system easy to understand and use for existing SQL users rather
> than Flink or stream processing experts.
>
> -Jamie
>
>
> --
> [1]
> https://docs.oracle.com/cd/B13789_01/server.101/b10736/advmv.htm#i1006635
> [2]
>
> https://cloud.google.com/bigquery/docs/materialized-views-create#partitioned_materialized_views
> [3]
>
> https://docs.cloudera.com/runtime/7.2.17/using-hiveql/topics/hive_create_partitioned_materialized_view.html
>
> On Fri, Apr 12, 2024 at 4:51 AM Ron liu  wrote:
>
> > Hi, jgrier
> >
> > Thanks for your insightful input.
> >
> > First of all, very much agree with you that it is a right direction that
> we
> > should strive towards making Flink SQL more user-friendly, including
> > simplifying the job execution parameters, execution modes, data
> processing
> > pipeline definitions and maintenance, and so on.
> > The goal of this proposal is also to simplify the data processing
> pipeline
> > by proposing a new Dynamic Table, by combining Dynamic Table +
> Continuous,
> > so that users can focus more on the business itself. Our goal is also not
> > to create new business scenarios, it's just that the current Table can't
> > support this goal, so we need to propose a new type of Dynamic Table.
> >
> > In the traditional Hive warehouse and Lakehouse scenario, the common
> > requirement from users begins with ingesting DB data such as MySQL and
> logs
> > in real-time into the ODS layer of the data warehouse. Then, defining a
> > series of ETL jobs to process and layer the raw data, with the general
> data
> > flow being ODS -> DWD -> DWS -> ADS, ulti

Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-04-09 Thread Lincoln Lee
Thanks Jeyhun for your reply!

Unfortunately, MiniBatchLocalGroupAggFunction only works for the local
aggregation of a two-phase aggregation, while the global aggregation (which is
actually handled by the KeyedMapBundleOperator) still relies on the
KeyedStream, meaning that consistency of the partitioner and the state key
selector is still required.
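
To illustrate the distinction, here is a minimal Python sketch. It is not Flink code: `local_agg`, `GlobalAgg`, and the bundle contents are hypothetical names chosen for illustration. The local phase pre-aggregates whatever records arrive in a bundle and needs no key ownership, whereas the global phase keeps per-key state, so every partial result for a key has to reach the same instance.

```python
from collections import defaultdict

def local_agg(bundle):
    """Pre-aggregate one bundle of (key, value) records; no keyed state is needed."""
    partial = defaultdict(int)
    for key, value in bundle:
        partial[key] += value
    return dict(partial)

class GlobalAgg:
    """Holds per-key state, so it must see every partial result for its keys."""
    def __init__(self):
        self.state = defaultdict(int)

    def merge(self, partial):
        for key, value in partial.items():
            self.state[key] += value

# Two upstream subtasks each pre-aggregate their own bundle.
partial_1 = local_agg([("a", 1), ("b", 2), ("a", 3)])
partial_2 = local_agg([("a", 10), ("b", 20)])

# Both partials for every key are merged by the same global instance.
global_op = GlobalAgg()
global_op.merge(partial_1)
global_op.merge(partial_2)
totals = dict(global_op.state)
```

If the partials for key "a" were routed to two different global instances, each would hold only part of the sum, which is why the global phase still depends on consistent key partitioning.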

Best,
Lincoln Lee


On Sat, Apr 6, 2024 at 5:11 AM Jeyhun Karimov wrote:

> Hi Lincoln,
>
> I did a bit of analysis on small PoC.
> Please find my comments below:
>
> - In general, current design supports streaming workloads. However, as you
> mentioned it comes with some (implementation-related) difficulties.
> One of them (as you also mentioned) is that most of the operators utilize
> keyed functions (e.g., Aggregate functions).
> As a result, we cannot directly, utilize these operators (e.g.,
> StreamPhysicalGroupbyAggregate) because they work on keyed inputs and their
> tasks
> utilize specific keyGroupRange.
>
> - As I mentioned above, my idea is to utilize similar approach
> to MiniBatchLocalGroupAggFunction that is not time based and supports also
> retractions.
> The existing implementation of this function already supports quite a big
> part of the scope. With this implementation, we utilize MapbundleFunction
> that is not bound to a specific key range.
>
> - As the next milestone, more generic optimization is required that
> introduces 1) new streaming distribution type as KEEP_INPUT_AS_IS,
> 2) utilization of a ForwardHashExchangeProcessor, 3) corresponding chaining
> strategy
>
> Currently, the plan is to first support this FLIP for batch workloads
> (e.g., files, pre-divided data and buckets). Next, support for streaming
> workloads.
>
> I hope I have answered your question.
>
> Regards,
> Jeyhun
>
> On Wed, Apr 3, 2024 at 4:33 PM Lincoln Lee  wrote:
>
> > Hi Jeyhun,
> >
> > Thanks for your quick response!
> >
> > In streaming scenario, shuffle commonly occurs before the stateful
> > operator, and there's a sanity check[1] when the stateful operator
> > accesses the state. This implies the consistency requirement of the
> > partitioner used for data shuffling and state key selector for state
> > accessing(see KeyGroupStreamPartitioner for more details),
> > otherwise, there may be state access errors. That is to say, in the
> > streaming scenario, it is not only the strict requirement described in
> > FlinkRelDistribution#requireStrict, but also the implied consistency of
> > hash calculation.
> >
> > Also, if this flip targets both streaming and batch scenarios, it is
> > recommended to do PoC validation for streaming as well.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29430
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > On Wed, Apr 3, 2024 at 2:25 PM Leonard Xu wrote:
> >
> > > Hey, Jeyhun
> > >
> > > Thanks for kicking off this discussion. I have two questions about
> > > streaming sources:
> > >
> > > (1)The FLIP  motivation section says Kafka broker is already
> partitioned
> > > w.r.t. some key[s] , Is this the main use case in Kafka world?
> > Partitioning
> > > by key fields is not the default partitioner of Kafka default
> > > partitioner[1] IIUC.
> > >
> > > (2) Considering the FLIP’s optimization scope aims to both Batch and
> > > Streaming pre-partitioned source, could you add a Streaming Source
> > example
> > > to help me understand the  FLIP better? I think Kafka Source is a good
> > > candidates for streaming source example, file source is a good one for
> > > batch source and it really helped me to follow-up the FLIP.
> > >
> > > Best,
> > > Leonard
> > > [1]
> > >
> >
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java#L31
> > >
> > >
> > >
> > > > 2024年4月3日 上午5:53,Jeyhun Karimov  写道:
> > > >
> > > > Hi Lincoln,
> > > >
> > > > Thanks a lot for your comments. Please find my answers below.
> > > >
> > > >
> > > > 1. Is this flip targeted only at batch scenarios or does it include
> > > >> streaming?
> > > >> (The flip and the discussion did not explicitly mention this, but in
> > the
> > > >> draft pr, I only
> > > >> saw the implementation for batch scenarios
> > > >>
> > > >>
> > >
> >
> https://github.com/apache/flink/pull/24437/files#diff-a6d71dd7d9bf0e7776404f54473b504e1de1240e93f820

Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-04-09 Thread Lincoln Lee
Thanks Ron and Timo for your proposal!

Here is my ranking:

1. Derived Table -> extends the persistent semantics of the derived table in
   the SQL standard, has a strong association with the query, and has industry
   precedents such as Google Looker.

2. Live Table -> an alternative to 'dynamic table'.

3. Materialized Table -> a combination of Materialized View and Table, but
   still a table that accepts data changes.

4. Materialized View -> requires extending the understanding of a view to
   accept data changes.

The reason for not adding 'Refresh Table' is I don't want to tell the user
to 'refresh a refresh table'.
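
The scoring rule from the voting email quoted below (5 points for the first rank down to 1 point for the fifth) can be sketched in Python as follows. This is only an illustration: the function name `score` is made up here, and the two ballots are the ranking above and Ron's ranking quoted below, so the totals are a partial tally and not the outcome of the vote.

```python
from collections import Counter

POINTS = [5, 4, 3, 2, 1]  # points for rank 1..5, as stated in the voting rules

def score(ballots):
    """Sum the points each candidate receives across all ranked ballots."""
    totals = Counter()
    for ranking in ballots:
        # zip stops at the shorter sequence, so partial rankings are allowed.
        for points, candidate in zip(POINTS, ranking):
            totals[candidate] += points
    return totals

# Only the two ballots visible in this message; not the final result.
ballots = [
    ["Derived Table", "Live Table", "Materialized Table", "Materialized View"],
    ["Derived Table", "Materialized Table", "Live Table", "Materialized View"],
]
totals = score(ballots)
```

A candidate left off a ballot simply receives no points from it, which matches the rule that voters may rank only some of the options.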


Best,
Lincoln Lee


On Tue, Apr 9, 2024 at 8:11 PM Ron liu wrote:

> Hi, Dev
>
> My rankings are:
>
> 1. Derived Table
> 2. Materialized Table
> 3. Live Table
> 4. Materialized View
>
> Best,
> Ron
>
>
>
> On Tue, Apr 9, 2024 at 8:07 PM Ron liu wrote:
>
> > Hi, Dev
> >
> > After several rounds of discussion, there is currently no consensus on
> the
> > name of the new concept. Timo has proposed that we decide the name
> through
> > a vote. This is a good solution when there is no clear preference, so we
> > will adopt this approach.
> >
> > Regarding the name of the new concept, there are currently five
> candidates:
> > 1. Derived Table -> taken by SQL standard
> > 2. Materialized Table -> similar to SQL materialized view but a table
> > 3. Live Table -> similar to dynamic tables
> > 4. Refresh Table -> states what it does
> > 5. Materialized View -> needs to extend the standard to support modifying
> > data
> >
> > For the above five candidates, everyone can give their rankings based on
> > their preferences. You can choose up to five options or only choose some of
> > them.
> > We will use a scoring rule, where the first rank gets 5 points, second
> > rank gets 4 points, third rank gets 3 points, fourth rank gets 2 points,
> > and fifth rank gets 1 point.
> > After the voting closes, I will score all the candidates based on
> > everyone's votes, and the candidate with the highest score will be chosen
> > as the name for the new concept.
> >
> > The voting will last up to 72 hours and is expected to close this Friday.
> > I look forward to everyone voting on the name in this thread. Of course,
> we
> > also welcome new input regarding the name.
> >
> > Best,
> > Ron
> >
> > On Tue, Apr 9, 2024 at 7:49 PM Ron liu wrote:
> >
> >> Hi, Dev
> >>
> >> Sorry for my previous statement was not quite accurate. We will hold a
> >> vote for the name within this thread.
> >>
> >> Best,
> >> Ron
> >>
> >>
> >> On Tue, Apr 9, 2024 at 7:29 PM Ron liu wrote:
> >>
> >>> Hi, Timo
> >>>
> >>> Thanks for your reply.
> >>>
> >>> I agree with you that sometimes naming is more difficult. When no one
> >>> has a clear preference, voting on the name is a good solution, so I'll
> send
> >>> a separate email for the vote, clarify the rules for the vote, then let
> >>> everyone vote.
> >>>
> >>> One other point to confirm, in your ranking there is an option for
> >>> Materialized View, does it stand for the UPDATING Materialized View
> that
> >>> you mentioned earlier in the discussion? If using Materialized View I
> think
> >>> it is needed to extend it.
> >>>
> >>> Best,
> >>> Ron
> >>>
> >>> On Tue, Apr 9, 2024 at 5:20 PM Timo Walther wrote:
> >>>
> >>>> Hi Ron,
> >>>>
> >>>> yes naming is hard. But it will have large impact on trainings,
> >>>> presentations, and the mental model of users. Maybe the easiest is to
> >>>> collect ranking by everyone with some short justification:
> >>>>
> >>>>
> >>>> My ranking (from good to not so good):
> >>>>
> >>>> 1. Refresh Table -> states what it does
> >>>> 2. Materialized Table -> similar to SQL materialized view but a table
> >>>> 3. Live Table -> nice buzzword, but maybe still too close to dynamic
> >>>> tables?
> >>>> 4. Materialized View -> a bit broader than standard but still very
> >>>> similar
> >>>> 5. Derived table -> taken by standard
> >>>>
> >>>> Regards,
> >>>> Timo
> >>>>
> >>>>
> >>>>
> >>>> On 07.04.24 11:34, Ron liu wrote:
> >>>> > Hi, Dev

Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-04-03 Thread Lincoln Lee
Hi Jeyhun,

Thanks for your quick response!

In the streaming scenario, a shuffle commonly occurs before a stateful
operator, and there is a sanity check [1] when the stateful operator
accesses its state. This implies a consistency requirement between the
partitioner used for data shuffling and the key selector used for state
access (see KeyGroupStreamPartitioner for more details); otherwise, there
may be state access errors. That is to say, in the streaming scenario there
is not only the strict requirement described in
FlinkRelDistribution#requireStrict, but also an implied consistency of the
hash calculation.
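
The consistency requirement can be sketched in Python as below. This is a simplified model under stated assumptions, not Flink's implementation: CRC32 stands in for the key hash, the parallelism values are arbitrary, and all function names are hypothetical. The point is that a partitioner which routes by key group always passes the ownership check, while a source that is pre-partitioned by a plain `hash % partitions` generally does not.

```python
import zlib

MAX_PARALLELISM = 128   # number of key groups (assumed value)
PARALLELISM = 4         # number of parallel subtasks (assumed value)

def stable_hash(key: str) -> int:
    # Stand-in for the hash applied to a key; CRC32 is used only because it
    # is deterministic across Python runs.
    return zlib.crc32(key.encode("utf-8"))

def key_group(key: str) -> int:
    return stable_hash(key) % MAX_PARALLELISM

def subtask_for_key_group(group: int) -> int:
    # Each subtask owns a contiguous range of key groups.
    return group * PARALLELISM // MAX_PARALLELISM

def state_partitioner(key: str) -> int:
    """Partitioner consistent with state access: routes by key group."""
    return subtask_for_key_group(key_group(key))

def source_partitioner(key: str) -> int:
    """A pre-partitioned source that simply uses hash % number of partitions."""
    return stable_hash(key) % PARALLELISM

def owns(subtask: int, key: str) -> bool:
    """Sanity check: does this subtask's key-group range contain the key?"""
    return subtask_for_key_group(key_group(key)) == subtask

keys = [f"user-{i}" for i in range(100)]
consistent = all(owns(state_partitioner(k), k) for k in keys)
mismatched = [k for k in keys if not owns(source_partitioner(k), k)]
```

Every key in `mismatched` would arrive at a subtask that does not own its key group, which is the situation the sanity check is meant to catch if the shuffle were removed without aligning the hash calculation.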

Also, if this flip targets both streaming and batch scenarios, it is
recommended to do PoC validation for streaming as well.

[1] https://issues.apache.org/jira/browse/FLINK-29430


Best,
Lincoln Lee


On Wed, Apr 3, 2024 at 2:25 PM Leonard Xu wrote:

> Hey, Jeyhun
>
> Thanks for kicking off this discussion. I have two questions about
> streaming sources:
>
> (1)The FLIP  motivation section says Kafka broker is already partitioned
> w.r.t. some key[s] , Is this the main use case in Kafka world? Partitioning
> by key fields is not the default partitioner of Kafka default
> partitioner[1] IIUC.
>
> (2) Considering the FLIP’s optimization scope aims to both Batch and
> Streaming pre-partitioned source, could you add a Streaming Source example
> to help me understand the  FLIP better? I think Kafka Source is a good
> candidates for streaming source example, file source is a good one for
> batch source and it really helped me to follow-up the FLIP.
>
> Best,
> Leonard
> [1]
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java#L31
>
>
>
> > On Apr 3, 2024, at 5:53 AM, Jeyhun Karimov wrote:
> >
> > Hi Lincoln,
> >
> > Thanks a lot for your comments. Please find my answers below.
> >
> >
> > 1. Is this flip targeted only at batch scenarios or does it include
> >> streaming?
> >> (The flip and the discussion did not explicitly mention this, but in the
> >> draft pr, I only
> >> saw the implementation for batch scenarios
> >>
> >>
> https://github.com/apache/flink/pull/24437/files#diff-a6d71dd7d9bf0e7776404f54473b504e1de1240e93f820214fa5d1f082fb30c8
> >> <
> >>
> https://github.com/apache/flink/pull/24437/files#diff-a6d71dd7d9bf0e7776404f54473b504e1de1240e93f820214fa5d1f082fb30c8%EF%BC%89
> >>>
> >> )
> >> If we expect this also apply to streaming, then we need to consider the
> >> stricter
> >> shuffle restrictions of streaming compared to batch (if support is
> >> considered,
> >> more discussion is needed here, let’s not expand for now). If it only
> >> applies to batch,
> >> it is recommended to clarify in the flip.
> >
> >
> > - The FLIP targets both streaming and batch scenarios.
> > Could you please elaborate more on what you mean by additional
> > restrictions?
> >
> >
> > 2. In the current implementation, the optimized plan seems to have some
> >> problems.
> >> As described in the class comments:
> >>
> >>
> https://github.com/apache/flink/blob/d6e3b51fdb9a2e565709e8d7bc619234b3768ed1/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/physical/batch/RemoveRedundantShuffleRule.java#L60
> >
> > BatchPhysicalHashAggregate (local)
> >
> >   +- BatchPhysicalLocalHashAggregate (local)
> >>  +- BatchPhysicalTableSourceScan
> >> The `BatchPhysicalLocalHashAggregate` here is redundant (in the case of
> >> one-phase
> >> hashAgg, localAgg is not necessary, which is the scenario currently
> handled
> >> by
> >> `RemoveRedundantLocalHashAggRule` and other rules)
> >
> >
> > - Yes, you are completely right. Note that the PR you referenced is just
> a
> > quick PoC.
> > Redundant operators you mentioned exist because
> > `RemoveRedundantShuffleRule` just removes the Exchange operator,
> > without modifying upstream/downstream operators.
> > As I mentioned, the implementation is just a PoC and the end
> implementation
> > will make sure that existing redundancy elimination rules remove
> redundant
> > operators.
> >
> >
> > Also, in the draft pr,
> >> the optimization of `testShouldEliminatePartitioning1` &
> >> `testShouldEliminatePartitioning2`
> >> seems didn't take effect?
> >>
> >>
> https://github.com/apache/flink/blob/d6e3b51fdb9a2e565709e8d7bc619234b3768ed1/flink-table/flink-table-planner/src/test/resources/org/apache/flink/connector/file/table/Bat

Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources

2024-04-02 Thread Lincoln Lee
Hi Jeyhun,

Thank you for driving this, it would be a very useful optimization!

Sorry for joining the discussion late (I originally planned to reply earlier,
but it happened to be during my vacation). I have two questions:

1. Is this FLIP targeted only at batch scenarios, or does it include
streaming? (The FLIP and the discussion did not explicitly mention this, but
in the draft PR I only saw the implementation for batch scenarios:
https://github.com/apache/flink/pull/24437/files#diff-a6d71dd7d9bf0e7776404f54473b504e1de1240e93f820214fa5d1f082fb30c8 )
If we expect this to also apply to streaming, then we need to consider the
stricter shuffle restrictions of streaming compared to batch (if support is
considered, more discussion is needed here; let's not expand on it for now).
If it only applies to batch, it is recommended to clarify that in the FLIP.

2. In the current implementation, the optimized plan seems to have some
problems. As described in the class comments:
https://github.com/apache/flink/blob/d6e3b51fdb9a2e565709e8d7bc619234b3768ed1/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/physical/batch/RemoveRedundantShuffleRule.java#L60

BatchPhysicalHashAggregate (local)
   +- BatchPhysicalLocalHashAggregate (local)
      +- BatchPhysicalTableSourceScan

The `BatchPhysicalLocalHashAggregate` here is redundant (in the case of
one-phase hashAgg, localAgg is not necessary, which is the scenario currently
handled by `RemoveRedundantLocalHashAggRule` and other rules). Also, in the
draft PR, the optimization of `testShouldEliminatePartitioning1` &
`testShouldEliminatePartitioning2` does not seem to have taken effect:
https://github.com/apache/flink/blob/d6e3b51fdb9a2e565709e8d7bc619234b3768ed1/flink-table/flink-table-planner/src/test/resources/org/apache/flink/connector/file/table/BatchFileSystemTableSourceTest.xml#L38

In conjunction with question 2, I am wondering if we have a better choice
(of course, not simply adding the current `PHYSICAL_OPT_RULES`'s
`RemoveRedundantLocalXXRule`s to the `PHYSICAL_REWRITE`).
For example, let the source actively provide some traits (including
`FlinkRelDistribution` and `RelCollation`) to the planner. The advantage of
doing this is that it directly reuses the current shuffle-removal
optimization (as implemented in `FlinkExpandConversionRule`), and, according
to the data distribution characteristics provided by the source, the planner
may choose a physical operator with a cheaper cost (for example, according
to `RelCollation`, the planner can use sortAgg with no need for a separate
local sort operation).
WDYT?
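
As a rough illustration of the idea, the Python sketch below models a planner that consults traits reported by the source. Everything here is hypothetical: `SourceTraits`, `plan_group_agg`, and the operator labels are illustrative stand-ins, not the planner's real classes or rules, and it assumes the non-strict case where a source hash-distributed on a subset of the grouping keys already satisfies the aggregate's requirement.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class SourceTraits:
    """Hypothetical stand-in for the distribution and collation a source reports."""
    hash_keys: Optional[Tuple[str, ...]] = None   # None: distribution unknown
    sort_keys: Tuple[str, ...] = ()               # empty: no ordering guaranteed

def plan_group_agg(group_keys: Tuple[str, ...], traits: SourceTraits) -> List[str]:
    """Return the operator chain (top first) for a GROUP BY over the source."""
    # The shuffle can be skipped only if the source is already hash-distributed
    # on a non-empty subset of the grouping keys.
    satisfied = (traits.hash_keys is not None
                 and len(traits.hash_keys) > 0
                 and set(traits.hash_keys) <= set(group_keys))
    if not satisfied:
        return ["HashAggregate", "Exchange(hash)", "TableSourceScan"]
    # With no shuffle in between, an ordering on the grouping keys survives,
    # so a sort-based aggregate needs no separate sort.
    sorted_on_keys = traits.sort_keys[:len(group_keys)] == group_keys
    agg = "SortAggregate" if sorted_on_keys else "HashAggregate"
    return [agg, "TableSourceScan"]

unknown = plan_group_agg(("a",), SourceTraits())
partitioned = plan_group_agg(("a", "b"), SourceTraits(hash_keys=("a",)))
partitioned_sorted = plan_group_agg(
    ("a",), SourceTraits(hash_keys=("a",), sort_keys=("a", "ts")))
```

In this model the redundant local aggregate never appears, because the decision is made from traits during planning rather than by removing an exchange afterwards.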


Best,
Lincoln Lee


On Mon, Apr 1, 2024 at 6:00 PM Jeyhun Karimov wrote:

> Hi everyone,
>
> Thanks for your valuable feedback!
>
> The discussion on this FLIP has been going on for a while.
> I would like to start a vote after 48 hours.
>
> Please let me know if you have any concerns or any further
> questions/comments.
>
> Regards,
> Jeyhun
>
>
> On Thu, Mar 21, 2024 at 6:01 PM Jeyhun Karimov 
> wrote:
>
> > Hi Lorenzo,
> >
> > Thanks a lot for your comments. Please find my answers below:
> >
> >
> > For the interface `SupportsPartitioning`, why returning `Optional`?
> >> If one decides to implement that, partitions must exist (at maximum,
> >> return and empty list). Returning `Optional` seem just to complicate the
> >> logic of the code using that interface.
> >
> >
> > - The reasoning behind the use of Optional is that sometimes (e.g., in
> > HiveTableSource) the partitioning info is in catalog.
> >   Therefore, we return Optional.empty(), so that the list of partitions
> is
> > queried from the catalog.
> >
> >
> > I foresee the using code doing something like: "if the source supports
> >> partitioning, get the partitions, but if they don't exist, raise a
> runtime
> >> exception". Let's simply make that safe at compile time and guarantee
> the
> >> code that partitions exist.
> >
> >
> > - Yes, once partitions cannot be found, neither from catalog nor from the
> > interface implementation, then we raise an exception during query compile
> > time.
> >
> >
> >  Another thing is that you show Hive-like partitioning in your FS
> >> structure, do you think it makes sense to add a note about
> auto-discovery
> >> of partitions?
> >
> >
> > - Yes, the FLIP contains just an example partitioning for filesystem
> > connector. Each connector already "knows" about autodiscovery of its
> > partitions. And we rely on this fact.
> >   For example, partition discovery is different between kafka and
> > filesystem sources. S

[jira] [Created] (FLINK-34972) MigrationTestsSnapshotGenerator does not work properly for all subclasses of SnapshotMigrationTestBase

2024-03-31 Thread lincoln lee (Jira)
lincoln lee created FLINK-34972:
---

 Summary: MigrationTestsSnapshotGenerator does not work properly 
for all subclasses of SnapshotMigrationTestBase
 Key: FLINK-34972
 URL: https://issues.apache.org/jira/browse/FLINK-34972
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.19.0
Reporter: lincoln lee


This issue was found during the follow-up work for the 1.19.0 release:
https://issues.apache.org/jira/browse/FLINK-34712.

{code}MigrationTestsSnapshotGenerator{code} did not actually generate the 
corresponding test files for the new version (1.19), and it produced no abnormal 
logs. The problem affects all subclasses of 
{code}SnapshotMigrationTestBase{code}.

more context: https://github.com/apache/flink/pull/24517



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Re: [ANNOUNCE] Apache Paimon is graduated to Top Level Project

2024-03-30 Thread Lincoln Lee
Congratulations!

Best,
Lincoln Lee


On Sat, Mar 30, 2024 at 10:13 PM Jark Wu wrote:

> Congratulations!
>
> Best,
> Jark
>
> On Fri, 29 Mar 2024 at 12:08, Yun Tang  wrote:
>
> > Congratulations to all Paimon guys!
> >
> > Glad to see a Flink sub-project has been graduated to an Apache top-level
> > project.
> >
> > Best
> > Yun Tang
> >
> > 
> > From: Hangxiang Yu 
> > Sent: Friday, March 29, 2024 10:32
> > To: dev@flink.apache.org 
> > Subject: Re: Re: [ANNOUNCE] Apache Paimon is graduated to Top Level
> Project
> >
> > Congratulations!
> >
> > On Fri, Mar 29, 2024 at 10:27 AM Benchao Li wrote:
> >
> > > Congratulations!
> > >
> > > On Fri, Mar 29, 2024 at 10:25 AM Zakelly Lan wrote:
> > > >
> > > > Congratulations!
> > > >
> > > >
> > > > Best,
> > > > Zakelly
> > > >
> > > > > On Thu, Mar 28, 2024 at 10:13 PM Jing Ge wrote:
> > > >
> > > > > Congrats!
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > > On Thu, Mar 28, 2024 at 1:27 PM Feifan Wang wrote:
> > > > >
> > > > > > Congratulations!——
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Feifan Wang
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > At 2024-03-28 20:02:43, "Yanfei Lei" wrote:
> > > > > > >Congratulations!
> > > > > > >
> > > > > > >Best,
> > > > > > >Yanfei
> > > > > > >
> > > > > > > >On Thu, Mar 28, 2024 at 7:59 PM Zhanghao Chen wrote:
> > > > > > >>
> > > > > > >> Congratulations!
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Zhanghao Chen
> > > > > > >> 
> > > > > > >> From: Yu Li 
> > > > > > >> Sent: Thursday, March 28, 2024 15:55
> > > > > > >> To: d...@paimon.apache.org 
> > > > > > >> Cc: dev ; user 
> > > > > > > >> Subject: Re: [ANNOUNCE] Apache Paimon is graduated to Top Level Project
> > > > > > > >>
> > > > > > > >> CC the Flink user and dev mailing list.
> > > > > > > >>
> > > > > > > >> Paimon originated within the Flink community, initially known as Flink
> > > > > > > >> Table Store, and all our incubating mentors are members of the Flink
> > > > > > > >> Project Management Committee. I am confident that the bonds of
> > > > > > > >> enduring friendship and close collaboration will continue to unite the
> > > > > > > >> two communities.
> > > > > > >>
> > > > > > >> And congratulations all!
> > > > > > >>
> > > > > > >> Best Regards,
> > > > > > >> Yu
> > > > > > >>
> > > > > > > >> On Wed, 27 Mar 2024 at 20:35, Guojun Li <gjli.schna...@gmail.com> wrote:
> > > > > > >> >
> > > > > > >> > Congratulations!
> > > > > > >> >
> > > > > > >> > Best,
> > > > > > >> > Guojun
> > > > > > >> >
> > > > > > > >> > On Wed, Mar 27, 2024 at 5:24 PM wulin wrote:
> > > > > > >> >
> > > > > > >> > > Congratulations~
> > > > > > >> > >
> > > > > > >> > > > 2024年3月27日 15:54,王刚 
> 写道:
> > > > > > >> > > >
> > > > > > >> > > > Congratulations~
> > > > > > >> > > >
> > > > > > >> > > >> 2024年3月26日 10:25,Jingsong Li 
> > 写道:
> > > > > > >> > > >>
> > > > > > >> > > >> Hi Paimon community,
> > > > > > >> > > >>
> > > > > > > >> > > >> I’m glad to announce that the ASF board has approved a resolution to
> > > > > > > >> > > >> graduate Paimon into a full Top Level Project. Thanks to everyone for
> > > > > > > >> > > >> your help to get to this point.
> > > > > > > >> > > >>
> > > > > > > >> > > >> I just created an issue to track the things we need to modify [2],
> > > > > > > >> > > >> please comment on it if you feel that something is missing. You can
> > > > > > > >> > > >> refer to apache documentation [1] too.
> > > > > > > >> > > >>
> > > > > > > >> > > >> And, we already completed the GitHub repo migration [3], please update
> > > > > > > >> > > >> your local git repo to track the new repo [4].
> > > > > > > >> > > >>
> > > > > > > >> > > >> You can run the following command to complete the remote repo tracking
> > > > > > > >> > > >> migration.
> > > > > > > >> > > >>
> > > > > > > >> > > >> git remote set-url origin https://github.com/apache/paimon.git
> > > > > > > >> > > >>
> > > > > > > >> > > >> If you have a different name, please change the 'origin' to your remote
> > > > > > > >> > > >> name.
> > > > > > > >> > > >>
> > > > > > > >> > > >> Please join me in celebrating!
> > > > > > > >> > > >>
> > > > > > > >> > > >> [1] https://incubator.apache.org/guides/transferring.html#life_after_graduation
> > > > > > >> > > >> [2] https://github.com/apache/paimon/issues/3091
> > > > > > >> > > >> [3] https://issues.apache.org/jira/browse/INFRA-25630
> > > > > > >> > > >> [4] https://github.com/apache/paimon
> > > > > > >> > > >>
> > > > > > >> > > >> Best,
> > > > > > >> > > >> Jingsong Lee
> > > > > > >> > >
> > > > > > >> > >
> > > > > >
> > > > >
> > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
> >
> >
> > --
> > Best,
> > Hangxiang.
> >
>


Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-03-22 Thread Lincoln Lee
Hi Timo,

Thanks for your thoughtful inputs!

Yes, expanding the MATERIALIZED VIEW (MV) could achieve the same function,
but our primary concern is that by using a view, we might limit future
opportunities to optimize queries through automatic materialization
rewriting [1], leveraging the support for MVs in physical storage. This is
because we would be breaking the intuitive semantics of a materialized view
(a materialized view represents the result of a query) by allowing data
modifications, thus losing the potential for such optimizations.

With these considerations in mind, we were inspired by Google Looker's
Persistent Derived Table [2]. PDT is designed for building Looker's automated
modeling, which aligns with our purpose for the stream-batch automatic
pipeline. Therefore, we are considering another candidate, Derived Table: the
term 'derive' suggests a query, and 'table' retains modifiability. This
approach would not disrupt our current concept of a dynamic table, and it
preserves the future utility of MVs.

Conceptually, a Derived Table is a Dynamic Table + Continuous Query.
Introducing the new concept of a Derived Table for this FLIP lets all
concepts play together nicely.

What do you think about this?

[1] https://calcite.apache.org/docs/materialized_views.html
[2]
https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables


Best,
Lincoln Lee


On Fri, Mar 22, 2024 at 5:54 PM Timo Walther wrote:

> Hi Ron,
>
> thanks for the detailed answer. Sorry, for my late reply, we had a
> conference that kept me busy.
>
>  > In the current concept[1], it actually includes: Dynamic Tables &
>  > & Continuous Query. Dynamic Table is just an abstract logical concept
>
> This explanation makes sense to me. But the docs also say "A continuous
> query is evaluated on the dynamic table yielding a new dynamic table.".
> So even our regular CREATE TABLEs are considered dynamic tables. This
> can also be seen in the diagram "Dynamic Table -> Continuous Query ->
> Dynamic Table". Currently, Flink queries can only be executed on Dynamic
> Tables.
>
>  > In essence, a materialized view represents the result of a query.
>
> Isn't that what your proposal does as well?
>
>  > the object of the suspend operation is the refresh task of the
> dynamic table
>
> I understand that Snowflake uses the term [1] to merge their concepts of
> STREAM, TASK, and TABLE into one piece of concept. But Flink has no
> concept of a "refresh task". Also, they already introduced MATERIALIZED
> VIEW. Flink is in the convenient position that the concept of
> materialized views is not taken (reserved maybe for exactly this use
> case?). And SQL standard concept could be "slightly adapted" to our
> needs. Looking at other vendors like Postgres[2], they also use
> `REFRESH` commands so why not adding additional commands such as DELETE
> or UPDATE. Oracle supports  "ON PREBUILT TABLE clause tells the database
> to use an existing table segment"[3] which comes closer to what we want
> as well.
>
>  > it is not intended to support data modification
>
> This is an argument that I understand. But we as Flink could allow data
> modifications. This way we are only extending the standard and don't
> introduce new concepts.
>
> If we can't agree on using MATERIALIZED VIEW concept. We should fix our
> syntax in a Flink 2.0 effort. Making regular tables bounded and dynamic
> tables unbounded. We would be closer to the SQL standard with this and
> pave the way for the future. I would actually support this if all
> concepts play together nicely.
>
>  > In the future, we can consider extending the statement set syntax to
> support the creation of multiple dynamic tables.
>
> It's good that we called the concept STATEMENT SET. This allows us to
> defined CREATE TABLE within. Even if it might look a bit confusing.
>
> Regards,
> Timo
>
> [1] https://docs.snowflake.com/en/user-guide/dynamic-tables-about
> [2]
> https://www.postgresql.org/docs/current/sql-creatematerializedview.html
> [3] https://oracle-base.com/articles/misc/materialized-views
>
> On 21.03.24 04:14, Feng Jin wrote:
> > Hi Ron and Lincoln
> >
> > Thanks for driving this discussion.  I believe it will greatly improve
> the
> > convenience of managing user real-time pipelines.
> >
> > I have some questions.
> >
> > *Regarding Limitations of Dynamic Table:*
> >
> >> Does not support modifying the select statement after the dynamic table
> > is created.
> >
> > Although currently we restrict users from modifying the query, I wonder
> if
> > we can provide a better way to help users rebuild it without affecting
> > downstream OLAP queries.
>

Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed

2024-03-20 Thread Lincoln Lee
Congrats, thanks for the great work!


Best,
Lincoln Lee


On Wed, Mar 20, 2024 at 10:48 PM Peter Huang wrote:

> Congratulations
>
>
> Best Regards
> Peter Huang
>
> On Wed, Mar 20, 2024 at 6:56 AM Huajie Wang  wrote:
>
>>
>> Congratulations
>>
>>
>>
>> Best,
>> Huajie Wang
>>
>>
>>
>> On Wed, Mar 20, 2024 at 9:36 PM Leonard Xu wrote:
>>
>>> Hi devs and users,
>>>
>>> We are thrilled to announce that the donation of Flink CDC as a
>>> sub-project of Apache Flink has completed. We invite you to explore the new
>>> resources available:
>>>
>>> - GitHub Repository: https://github.com/apache/flink-cdc
>>> - Flink CDC Documentation:
>>> https://nightlies.apache.org/flink/flink-cdc-docs-stable
>>>
>>> After Flink community accepted this donation[1], we have completed
>>> software copyright signing, code repo migration, code cleanup, website
>>> migration, CI migration and github issues migration etc.
>>> Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, Qingsheng
>>> Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their
>>> contributions and help during this process!
>>>
>>>
>>> For all previous contributors: The contribution process has slightly
>>> changed to align with the main Flink project. To report bugs or suggest new
>>> features, please open tickets
>>> Apache Jira (https://issues.apache.org/jira).  Note that we will no
>>> longer accept GitHub issues for these purposes.
>>>
>>>
>>> Welcome to explore the new repository and documentation. Your feedback
>>> and contributions are invaluable as we continue to improve Flink CDC.
>>>
>>> Thanks everyone for your support and happy exploring Flink CDC!
>>>
>>> Best,
>>> Leonard
>>> [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob
>>>
>>>


[jira] [Created] (FLINK-34746) Switching to the Apache CDN for Dockerfile

2024-03-19 Thread lincoln lee (Jira)
lincoln lee created FLINK-34746:
---

 Summary: Switching to the Apache CDN for Dockerfile
 Key: FLINK-34746
 URL: https://issues.apache.org/jira/browse/FLINK-34746
 Project: Flink
  Issue Type: Improvement
  Components: flink-docker
Reporter: lincoln lee


While publishing the official image, we received comments asking us to switch to the Apache CDN.

 

See

https://github.com/docker-library/official-images/pull/16114

https://github.com/docker-library/official-images/pull/16430

 

Reason for switching: [https://apache.org/history/mirror-history.html] (also 
[https://www.apache.org/dyn/closer.cgi] and [https://www.apache.org/mirrors])



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-19 Thread Lincoln Lee
Hi Yubin,

Sorry, please ignore my last reply (wrong context).
I also checked with Leonard; your proposal to extend `CatalogDescriptor`
should be okay.
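
To make the option discussed here concrete, below is a minimal sketch of adding an optional comment to a descriptor class instead of adding a method to a widely implemented interface. The class `SketchCatalogDescriptor` and its methods `of` and `withComment` are hypothetical, simplified stand-ins for illustration only; they are not Flink's actual `CatalogDescriptor` API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified stand-in; NOT Flink's actual CatalogDescriptor.
final class SketchCatalogDescriptor {
    private final String catalogName;
    private final Map<String, String> options;
    private final String comment; // null when the catalog has no comment

    private SketchCatalogDescriptor(
            String catalogName, Map<String, String> options, String comment) {
        this.catalogName = catalogName;
        // Defensive copy so later changes to the caller's map do not leak in.
        this.options = Collections.unmodifiableMap(new LinkedHashMap<>(options));
        this.comment = comment;
    }

    // Existing call sites keep working: a comment is not required.
    static SketchCatalogDescriptor of(String catalogName, Map<String, String> options) {
        return new SketchCatalogDescriptor(catalogName, options, null);
    }

    // Returns a copy of this descriptor that carries the given comment.
    SketchCatalogDescriptor withComment(String newComment) {
        return new SketchCatalogDescriptor(catalogName, options, newComment);
    }

    String getCatalogName() {
        return catalogName;
    }

    Map<String, String> getOptions() {
        return options;
    }

    // Empty when no comment was set, so callers never see null.
    Optional<String> getComment() {
        return Optional.ofNullable(comment);
    }
}
```

With this shape, catalog implementations (for example the JDBC, Paimon, and Hive ones mentioned in the thread) would not need to change, because the comment lives next to the name and options rather than on the interface they implement.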

Thank you for your update : ) !


Best,
Lincoln Lee


Lincoln Lee wrote on Wed, Mar 20, 2024 at 09:35:

> Hi Yubin,
>
> Thank you for the detailed explanation! I overlooked `CatalogBaseTable`; in
> fact, there is already a `String getComment();` interface similar to
> `database` and `table`.
> Can we continue the work on FLINK-21665 and complete its implementation?
> It seems to be very close.
>
> Best,
> Lincoln Lee
>
>
> Yubin Li wrote on Wed, Mar 20, 2024 at 01:42:
>
>> Hi Lincoln,
>>
>> Thanks for your detailed comments!
>>
>> Supporting comments for `Catalog` is a really helpful feature, I agree
>> with you to make it introduced in this FLIP, thank you for pointing
>> that out :)
>>
>> Concerning the implementation, I propose to introduce `getComment()`
>> method in `CatalogDescriptor`, and the reasons are as follows. WDYT?
>> 1. For the sake of design consistency, follow the design of FLIP-295
>> [1] which introduced `CatalogStore` component, `CatalogDescriptor`
>> includes names and attributes, both of which are used to describe the
>> catalog, and `comment` can be added smoothly.
>> 2. Extending the existing class rather than add new method to the
>> existing interface, Especially, the `Catalog` interface, as a core
>> interface, is used by a series of important components such as
>> `CatalogFactory`, `CatalogManager` and `FactoryUtil`, and is
>> implemented by a large number of connectors such as JDBC, Paimon, and
>> Hive. Adding methods to it will greatly increase the implementation
>> complexity, and more importantly, increase the cost of iteration,
>> maintenance, and verification.
>>
>> Please see FLIP doc [2] for details.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
>> [2]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
>>
>> Best,
>> Yubin
>>
>> On Tue, Mar 19, 2024 at 9:57 PM Lincoln Lee 
>> wrote:
>> >
>> > Hi Yubin,
>> >
>> > Thanks for your quick response!
>> >
>> > It would be better to support comments just like create `database` and
>> > `table` with comment.
>> > That is, add `String getComment();` to the current `Catalog` interface.
>> > WDYT?
>> >
>> > Best,
>> > Lincoln Lee
>> >
>> >
>> > Yubin Li wrote on Tue, Mar 19, 2024 at 21:44:
>> >
>> > > Hi Lincoln,
>> > >
>> > > Good catch. Thanks for your suggestions.
>> > >
>> > > I found that the creation statements of database and table both
>> > > support specifying "if not exists". For the sake of syntactic
>> > > consistency and user practicality, We could introduce the '[if not
>> > > exists]' clause to the 'create catalog' statement.
>> > >
>> > > As for the introduction of the `catalog comment` feature, it may
>> > > involve changes to the Catalog structure, which can be left for future
>> > > discussion.
>> > >
>> > > WDYT? Looking forward to your feedback :)
>> > >
>> > > Best,
>> > > Yubin
>> > >
>> > > On Tue, Mar 19, 2024 at 9:06 PM Lincoln Lee 
>> > > wrote:
>> > > >
>> > > > Hi Yubin,
>> > > >
>> > > > Big +1 for completing the catalog api!
>> > > > There's a minor addition[1] which does not affect the vote could
>> also be
>> > > > considered.
>> > > >
>> > > > [1] https://issues.apache.org/jira/browse/FLINK-21665
>> > > >
>> > > >
>> > > > Best,
>> > > > Lincoln Lee
>> > > >
>> > > >
>> > > > Yubin Li wrote on Mon, Mar 18, 2024 at 17:44:
>> > > >
>> > > > > Hi Jark,
>> > > > >
>> > > > > Thanks for your response, I have updated FLIP-436: Introduce
>> > > > > Catalog-related Syntax [1] as you suggested.
>> > > > >
>> > > > > If there are no more comments within 24 hours, I will start a
>> vote for
>> > > > > this, thanks :)
>> > > > >
>> > > > > Best,
>> > > > > Yubin
>> > > > >
>

Re: Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-19 Thread Lincoln Lee
Hi Yubin,

Thank you for the detailed explanation! I overlooked `CatalogBaseTable`; in
fact, there is already a `String getComment();` interface similar to
`database` and `table`.
Can we continue the work on FLINK-21665 and complete its implementation? It
seems to be very close.

Best,
Lincoln Lee


Yubin Li wrote on Wed, Mar 20, 2024 at 01:42:

> Hi Lincoln,
>
> Thanks for your detailed comments!
>
> Supporting comments for `Catalog` is a really helpful feature, I agree
> with you to make it introduced in this FLIP, thank you for pointing
> that out :)
>
> Concerning the implementation, I propose to introduce `getComment()`
> method in `CatalogDescriptor`, and the reasons are as follows. WDYT?
> 1. For the sake of design consistency, follow the design of FLIP-295
> [1] which introduced `CatalogStore` component, `CatalogDescriptor`
> includes names and attributes, both of which are used to describe the
> catalog, and `comment` can be added smoothly.
> 2. Extending the existing class rather than add new method to the
> existing interface, Especially, the `Catalog` interface, as a core
> interface, is used by a series of important components such as
> `CatalogFactory`, `CatalogManager` and `FactoryUtil`, and is
> implemented by a large number of connectors such as JDBC, Paimon, and
> Hive. Adding methods to it will greatly increase the implementation
> complexity, and more importantly, increase the cost of iteration,
> maintenance, and verification.
>
> Please see FLIP doc [2] for details.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
>
> Best,
> Yubin
>
> On Tue, Mar 19, 2024 at 9:57 PM Lincoln Lee 
> wrote:
> >
> > Hi Yubin,
> >
> > Thanks for your quick response!
> >
> > It would be better to support comments just like create `database` and
> > `table` with comment.
> > That is, add `String getComment();` to the current `Catalog` interface.
> > WDYT?
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Yubin Li wrote on Tue, Mar 19, 2024 at 21:44:
> >
> > > Hi Lincoln,
> > >
> > > Good catch. Thanks for your suggestions.
> > >
> > > I found that the creation statements of database and table both
> > > support specifying "if not exists". For the sake of syntactic
> > > consistency and user practicality, We could introduce the '[if not
> > > exists]' clause to the 'create catalog' statement.
> > >
> > > As for the introduction of the `catalog comment` feature, it may
> > > involve changes to the Catalog structure, which can be left for future
> > > discussion.
> > >
> > > WDYT? Looking forward to your feedback :)
> > >
> > > Best,
> > > Yubin
> > >
> > > On Tue, Mar 19, 2024 at 9:06 PM Lincoln Lee 
> > > wrote:
> > > >
> > > > Hi Yubin,
> > > >
> > > > Big +1 for completing the catalog api!
> > > > There's a minor addition[1] which does not affect the vote could
> also be
> > > > considered.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-21665
> > > >
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > >
> > > > Yubin Li wrote on Mon, Mar 18, 2024 at 17:44:
> > > >
> > > > > Hi Jark,
> > > > >
> > > > > Thanks for your response, I have updated FLIP-436: Introduce
> > > > > Catalog-related Syntax [1] as you suggested.
> > > > >
> > > > > If there are no more comments within 24 hours, I will start a vote
> for
> > > > > this, thanks :)
> > > > >
> > > > > Best,
> > > > > Yubin
> > > > >
> > > > > [1]
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
> > > > >
> > > > > On Mon, Mar 18, 2024 at 4:39 PM Jark Wu  wrote:
> > > > > >
> > > > > > Hi Yubin,
> > > > > >
> > > > > > Thanks for the quick response. The suggestion sounds good to me!
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > On Mon, 18 Mar 2024 at 13:06, Yubin Li 
> wrote:
> > > > > >

[SUMMARY] Flink 1.19 last sync summary on 03/19/2024

2024-03-19 Thread Lincoln Lee
Hi everyone,

Flink 1.19.0 was officially released yesterday [1].

I'd like to share some highlights of the last release sync of 1.19:

- Remaining work

The official Docker image is still in progress [2] and will be available once
the related PR has been merged [3].
In addition, some follow-up items are being processed [4], and the end of
support for lower versions will be discussed in a separate mail.

Thanks to all contributors for your great work on 1.19 and the support for
the release!

The new 1.20 release cycle [5] has started; you are welcome to keep contributing!

[1] https://lists.apache.org/thread/sofmxytbh6y20nwot1gywqqc2lqxn4hm
[2] https://issues.apache.org/jira/browse/FLINK-34701
[3] https://github.com/docker-library/official-images/pull/16430
[4] https://issues.apache.org/jira/browse/FLINK-34706
[5] https://lists.apache.org/thread/80h3nzk08v276xmllswbbbg1z7m3v70t


Best,
Yun, Jing, Martijn and Lincoln


Re: Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-19 Thread Lincoln Lee
Hi Yubin,

Thanks for your quick response!

It would be better to support comments, just as `database` and `table` can
be created with a comment.
That is, add `String getComment();` to the current `Catalog` interface.
WDYT?

Best,
Lincoln Lee


Yubin Li wrote on Tue, Mar 19, 2024 at 21:44:

> Hi Lincoln,
>
> Good catch. Thanks for your suggestions.
>
> I found that the creation statements of database and table both
> support specifying "if not exists". For the sake of syntactic
> consistency and user practicality, We could introduce the '[if not
> exists]' clause to the 'create catalog' statement.
>
> As for the introduction of the `catalog comment` feature, it may
> involve changes to the Catalog structure, which can be left for future
> discussion.
>
> WDYT? Looking forward to your feedback :)
>
> Best,
> Yubin
>
> On Tue, Mar 19, 2024 at 9:06 PM Lincoln Lee 
> wrote:
> >
> > Hi Yubin,
> >
> > Big +1 for completing the catalog api!
> > There's a minor addition[1] which does not affect the vote could also be
> > considered.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-21665
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Yubin Li wrote on Mon, Mar 18, 2024 at 17:44:
> >
> > > Hi Jark,
> > >
> > > Thanks for your response, I have updated FLIP-436: Introduce
> > > Catalog-related Syntax [1] as you suggested.
> > >
> > > If there are no more comments within 24 hours, I will start a vote for
> > > this, thanks :)
> > >
> > > Best,
> > > Yubin
> > >
> > > [1]
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
> > >
> > > On Mon, Mar 18, 2024 at 4:39 PM Jark Wu  wrote:
> > > >
> > > > Hi Yubin,
> > > >
> > > > Thanks for the quick response. The suggestion sounds good to me!
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Mon, 18 Mar 2024 at 13:06, Yubin Li  wrote:
> > > >
> > > > > Hi Jark,
> > > > >
> > > > > Good pointing! Thanks for your reply, there are some details to
> align
> > > :)
> > > > >
> > > > > 1. I think the purpose of DESCRIBE CATALOG is to display metadata
> > > > > > information including catalog name,
> > > > > > catalog comment (may be introduced in the future), catalog type,
> and
> > > > > > catalog properties (for example [1])
> > > > >
> > > > > Adopting { DESC | DESCRIBE } CATALOG [ EXTENDED ] xx as formal
> syntax,
> > > > > Producing rich and compatible results for future needs is very
> > > important.
> > > > > When
> > > > > specifying "extended" in the syntax, it will output the complete
> > > > > information including
> > > > > > properties. The complete output example is as follows:
> > > > >
> > > > >
> > >
> +-+---+
> > > > > | catalog_description_item | catalog_description_value
> > >|
> > > > >
> > > > >
> > >
> +-+---+
> > > > > |   Name | cat1
> > > > >   |
> > > > > |   Type   |
>  generic_in_memory
> > > > >|
> > > > > |   Comment   |
> > > > >   |
> > > > > |   Properties  |((k1,v1),
> (k2,v2))
> > > > > |
> > > > >
> > > > >
> > >
> +-+---+
> > > > >
> > > > > 2. Could you add support for ALTER CATALOG xxx UNSET ('mykey')?
> This is
> > > > > > also very useful in ALTER TABLE.
> > > > >
> > > > > I found that there is already an ALTER TABLE xxx RESET ('mykey')
> > > syntax [1]
> > > > > now,
> > > > > which will reset the myKey attribute of a certain table to the
> default
> > > > > value. For catalogs,
> > > > > it might be better to use ALTER CATALOG xxx RESET ('mykey') for the
> 

Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-03-19 Thread Lincoln Lee
Hi Yun,

Thank you very much for your valuable input!

Incremental mode is indeed an attractive idea, and we have discussed it. In
the current design, however, we first provide two refresh modes: CONTINUOUS
and FULL. Incremental mode can be introduced once the execution layer has
the capability.

My answers to the two questions:

1.
Yes, cascading is a good question. The current proposal provides a freshness
that defines a dynamic table's lag relative to its base table. If users need
to consider the end-to-end freshness of multiple cascaded dynamic tables,
they can split them manually for now. How to let multiple cascaded or
dependent dynamic tables complete the freshness definition in a simpler way
is something I think can be extended in the future.

2.
Cascading refresh is also a part we focused on in our discussions. In this
FLIP, we want to concentrate as much as possible on the core features (it
already involves a lot of things), so we did not directly introduce related
syntax. However, based on the current design, combined with the catalog and
lineage, users can theoretically also achieve cascading refresh.


Best,
Lincoln Lee


Yun Tang wrote on Tue, Mar 19, 2024 at 13:45:

> Hi Lincoln,
>
> Thanks for driving this discussion, and I am so excited to see this topic
> being discussed in the Flink community!
>
> From my point of view, instead of the work of unifying streaming and batch
> in DataStream API [1], this FLIP actually could make users benefit from one
> engine to rule batch & streaming.
>
> If we treat this FLIP as an open-source implementation of Snowflake's
> dynamic tables [2], we still lack an incremental refresh mode to make the
> ETL near real-time with a much cheaper computation cost. However, I think
> this could be done under the current design by introducing another refresh
> mode in the future. Although the extra work of incremental view maintenance
> would be much larger.
>
> For the FLIP itself, I have several questions below:
>
> 1. It seems this FLIP does not consider the lag of refreshes across ETL
> layers from ODS ---> DWD ---> APP [3]. We currently only consider the
> scheduler interval, which means we cannot use lag to automatically schedule
> the upfront micro-batch jobs to do the work.
> 2. To support the automagical refreshes, we should consider the lineage in
> the catalog or somewhere else.
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-134%3A+Batch+execution+for+the+DataStream+API
> [2] https://docs.snowflake.com/en/user-guide/dynamic-tables-about
> [3] https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh
>
> Best
> Yun Tang
>
>
> 
> From: Lincoln Lee 
> Sent: Thursday, March 14, 2024 14:35
> To: dev@flink.apache.org 
> Subject: Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for
> Simplifying Data Pipelines
>
> Hi Jing,
>
> Thanks for your attention to this flip! I'll try to answer the following
> questions.
>
> > 1. How to define query of dynamic table?
> > Use flink sql or introducing new syntax?
> > If use flink sql, how to handle the difference in SQL between streaming
> and
> > batch processing?
> > For example, a query including window aggregate based on processing time?
> > or a query including global order by?
>
> Similar to `CREATE TABLE AS query`, here the `query` also uses Flink sql
> and doesn't introduce a totally new syntax.
> We will not change the status with respect to the difference in
> functionality of flink sql itself on streaming and batch. For example,
> the proctime window agg on streaming and global sort on batch that you
> mentioned, in fact, do not work properly in the other mode, so when the
> user modifies the refresh mode of a dynamic table that is not supported,
> we will throw an exception.
>
> > 2. Whether modify the query of dynamic table is allowed?
> > Or we could only refresh a dynamic table based on the initial query?
>
> Yes, in the current design, the query definition of the dynamic table is
> not allowed to be modified, and you can only refresh the data based on the
> initial definition.
>
> > 3. How to use dynamic table?
> > The dynamic table seems to be similar to the materialized view.  Will we
> do
> > something like materialized view rewriting during the optimization?
>
> It's true that dynamic table and materialized view are similar in some
> ways, but as Ron explains there are differences. In terms of optimization,
> automated materialization discovery similar to that supported by calcite
> is also a potential possibility, perhaps with the addition of automated
> rewriting in the future.
>
>
>

Re: [VOTE] FLIP-436: Introduce Catalog-related Syntax

2024-03-19 Thread Lincoln Lee
+1 (binding)

Best,
Lincoln Lee


Feng Jin wrote on Tue, Mar 19, 2024 at 19:59:

> +1 (non-binding)
>
> Best,
> Feng
>
> On Tue, Mar 19, 2024 at 7:46 PM Ferenc Csaky 
> wrote:
>
> > +1 (non-binding).
> >
> > Best,
> > Ferenc
> >
> >
> >
> >
> > On Tuesday, March 19th, 2024 at 12:39, Jark Wu  wrote:
> >
> > >
> > >
> > > +1 (binding)
> > >
> > > Best,
> > > Jark
> > >
> > > On Tue, 19 Mar 2024 at 19:05, Yuepeng Pan panyuep...@apache.org wrote:
> > >
> > > > Hi, Yubin
> > > >
> > > > Thanks for driving it !
> > > >
> > > > +1 non-binding.
> > > >
> > > > Best,
> > > > Yuepeng Pan.
> > > >
> > > > At 2024-03-19 17:56:42, "Yubin Li" lyb5...@gmail.com wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > Thanks for all the feedback, I'd like to start a vote on the
> > FLIP-436:
> > > > > Introduce Catalog-related Syntax [1]. The discussion thread is here
> > > > > [2].
> > > > >
> > > > > The vote will be open for at least 72 hours unless there is an
> > > > > objection or insufficient votes.
> > > > >
> > > > > [1]
> > > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
> > > > > [2]
> https://lists.apache.org/thread/10k1bjb4sngyjwhmfqfky28lyoo7sv0z
> > > > >
> > > > > Best regards,
> > > > > Yubin
> >
>


Re: Re: [DISCUSS] FLIP-436: Introduce "SHOW CREATE CATALOG" Syntax

2024-03-19 Thread Lincoln Lee
Hi Yubin,

Big +1 for completing the catalog api!
There's a minor addition [1] that could also be considered; it does not
affect the vote.

[1] https://issues.apache.org/jira/browse/FLINK-21665


Best,
Lincoln Lee


Yubin Li wrote on Mon, Mar 18, 2024 at 17:44:

> Hi Jark,
>
> Thanks for your response, I have updated FLIP-436: Introduce
> Catalog-related Syntax [1] as you suggested.
>
> If there are no more comments within 24 hours, I will start a vote for
> this, thanks :)
>
> Best,
> Yubin
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-436%3A+Introduce+Catalog-related+Syntax
>
> On Mon, Mar 18, 2024 at 4:39 PM Jark Wu  wrote:
> >
> > Hi Yubin,
> >
> > Thanks for the quick response. The suggestion sounds good to me!
> >
> > Best,
> > Jark
> >
> > On Mon, 18 Mar 2024 at 13:06, Yubin Li  wrote:
> >
> > > Hi Jark,
> > >
> > > Good pointing! Thanks for your reply, there are some details to align
> :)
> > >
> > > 1. I think the purpose of DESCRIBE CATALOG is to display metadata
> > > > information including catalog name,
> > > > catalog comment (may be introduced in the future), catalog type, and
> > > > catalog properties (for example [1])
> > >
> > > Adopting { DESC | DESCRIBE } CATALOG [ EXTENDED ] xx as formal syntax,
> > > Producing rich and compatible results for future needs is very
> important.
> > > When
> > > specifying "extended" in the syntax, it will output the complete
> > > information including
> > > properties. The complete output example is as follows:
> > >
> > >
> > > +--------------------------+---------------------------+
> > > | catalog_description_item | catalog_description_value |
> > > +--------------------------+---------------------------+
> > > | Name                     | cat1                      |
> > > | Type                     | generic_in_memory         |
> > > | Comment                  |                           |
> > > | Properties               | ((k1,v1), (k2,v2))        |
> > > +--------------------------+---------------------------+
> > >
> > > 2. Could you add support for ALTER CATALOG xxx UNSET ('mykey')? This is
> > > > also very useful in ALTER TABLE.
> > >
> > > I found that there is already an ALTER TABLE xxx RESET ('mykey')
> syntax [1]
> > > now,
> > > which will reset the myKey attribute of a certain table to the default
> > > value. For catalogs,
> > > it might be better to use ALTER CATALOG xxx RESET ('mykey') for the
> sake of
> > > design
> > > consistency.
> > >
> > > WDYT? Looking forward to your suggestions.
> > >
> > > Best,
> > > Yubin
> > >
> > > [1]
> > >
> > >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/alter/#reset
> > >
> > >
> > > On Mon, Mar 18, 2024 at 11:49 AM Jark Wu  wrote:
> > >
> > > > Hi Yubin,
> > > >
> > > > Thanks for updating the FLIP. The updated version looks good in
> general.
> > > > I only have 2 minor comments.
> > > >
> > > > 1. I think the purpose of DESCRIBE CATALOG is to display metadata
> > > > information including catalog name,
> > > > catalog comment (may be introduced in the future), catalog type, and
> > > > catalog properties (for example [1]).
> > > > Expanding all properties may limit this syntax to include more
> metadata
> > > > information in the future.
> > > >
> > > > 2. Could you add support for ALTER CATALOG xxx UNSET ('mykey')? This
> is
> > > > also very useful in ALTER TABLE.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > [1]:
> > > >
> > > >
> > >
> https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-aux-describe-schema.html
> > > >
> > > >
> > > >
> > > > On Fri, 15 Mar 2024 at 12:06, Yubin Li  wrote:
> > > >
> > > > > Hi Xuyang,
> > > > >
> > > > > Thank you for pointing this out, The parser part of `describe
> catalog`
> 

[jira] [Created] (FLINK-34725) Dockerfiles for release publishing has incorrect config.yaml path

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34725:
---

 Summary: Dockerfiles for release publishing has incorrect 
config.yaml path
 Key: FLINK-34725
 URL: https://issues.apache.org/jira/browse/FLINK-34725
 Project: Flink
  Issue Type: Bug
  Components: flink-docker
Affects Versions: 1.19.0
Reporter: lincoln lee
 Fix For: 1.19.0


An issue was found while publishing the Docker image; an unexpected error message appeared:

{code}sed: can't read /config.yaml: No such file or directory{code}

This was also seen in the flink-docker/master daily "Publish SNAPSHOTs" action:
https://github.com/apache/flink-docker/actions/runs/8210534289/job/22458150514#step:8:588

[https://github.com/apache/flink-docker/actions/runs/8210534289/job/22458150322#step:8:549]

 

This is related to the changes made in https://issues.apache.org/jira/browse/FLINK-34205





[jira] [Created] (FLINK-34716) Build 1.19 docs in GitHub Action and mark 1.19 as stable in docs

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34716:
---

 Summary: Build 1.19 docs in GitHub Action and mark 1.19 as stable 
in docs
 Key: FLINK-34716
 URL: https://issues.apache.org/jira/browse/FLINK-34716
 Project: Flink
  Issue Type: Sub-task
Reporter: lincoln lee
Assignee: lincoln lee








[ANNOUNCE] Apache Flink 1.19.0 released

2024-03-18 Thread Lincoln Lee
The Apache Flink community is very happy to announce the release of Apache
Flink 1.19.0, which is the first release for the Apache Flink 1.19 series.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements
for this release:
https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353282

We would like to thank all contributors of the Apache Flink community who
made this release possible!


Best,
Yun, Jing, Martijn and Lincoln


[jira] [Created] (FLINK-34713) CLONE - Updates the docs stable version

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34713:
---

 Summary: CLONE - Updates the docs stable version
 Key: FLINK-34713
 URL: https://issues.apache.org/jira/browse/FLINK-34713
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin


Update docs to "stable" in {{docs/config.toml}} in the branch of the 
_just-released_ version:
 * Change {{Version}} from {{x.y-SNAPSHOT}} to {{x.y.z}}, i.e. 
{{1.6-SNAPSHOT}} to {{1.6.0}}
 * Change {{VersionTitle}} from {{x.y-SNAPSHOT}} to {{x.y}}, i.e. 
{{1.6-SNAPSHOT}} to {{1.6}}
 * Change Branch from {{master}} to {{release-x.y}}, i.e. {{master}} to 
{{release-1.6}}
 * Change {{baseURL}} from 
{{//ci.apache.org/projects/flink/flink-docs-master}}
 to 
{{//ci.apache.org/projects/flink/flink-docs-release-x.y}}
 * Change {{javadocs_baseurl}} from 
{{//ci.apache.org/projects/flink/flink-docs-master}}
 to 
{{//ci.apache.org/projects/flink/flink-docs-release-x.y}}
 * Change {{IsStable}} to {{true}}
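
The substitutions listed above follow mechanically from the released version number, so they can be derived rather than typed by hand. The following is a minimal sketch under that assumption; the class `StableDocsConfig` and its method `forRelease` are hypothetical names used only for illustration and are not part of the Flink release tooling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: derives the "stable" docs settings from a release version.
final class StableDocsConfig {

    // releaseVersion is expected in the form x.y.z, e.g. "1.6.0".
    static Map<String, String> forRelease(String releaseVersion) {
        String[] parts = releaseVersion.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected x.y.z but got: " + releaseVersion);
        }
        String series = parts[0] + "." + parts[1]; // x.y
        String docsUrl = "//ci.apache.org/projects/flink/flink-docs-release-" + series;

        // Insertion order mirrors the order of the checklist items.
        Map<String, String> values = new LinkedHashMap<>();
        values.put("Version", releaseVersion);
        values.put("VersionTitle", series);
        values.put("Branch", "release-" + series);
        values.put("baseURL", docsUrl);
        values.put("javadocs_baseurl", docsUrl);
        values.put("IsStable", "true");
        return values;
    }
}
```

For the example used in the list, `StableDocsConfig.forRelease("1.6.0")` yields `Version` = `1.6.0`, `VersionTitle` = `1.6`, and `Branch` = `release-1.6`.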





[jira] [Created] (FLINK-34712) CLONE - Update reference data for Migration Tests

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34712:
---

 Summary: CLONE - Update reference data for Migration Tests
 Key: FLINK-34712
 URL: https://issues.apache.org/jira/browse/FLINK-34712
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin
 Fix For: 1.19.0, 1.18.1


Update migration tests in master to cover migration from the new version. Since 
1.18, this step can be done automatically with the following steps. For more 
information please refer to [this 
page.|https://github.com/apache/flink/blob/master/flink-test-utils-parent/flink-migration-test-utils/README.md]
 # {*}On the published release tag (e.g., release-1.16.0){*}, run 
{{$ mvn clean package -Pgenerate-migration-test-data -Dgenerate.version=1.16 -nsu -Dfast -DskipTests}}

The version (1.16 in the command above) should be replaced with the target one.

 # Modify the content of the file 
[apache/flink:flink-test-utils-parent/flink-migration-test-utils/src/main/resources/most_recently_published_version|https://github.com/apache/flink/blob/master/flink-test-utils-parent/flink-migration-test-utils/src/main/resources/most_recently_published_version]
 to the latest version (it would be "v1_16" if sticking to the example where 
1.16.0 was released). 
 # Commit the modification in step a and b with "{_}[release] Generate 
reference data for state migration tests based on release-1.xx.0{_}" to the 
corresponding release branch (e.g. {{release-1.16}} in our example), replace 
"xx" with the actual version (in this example "16"). You should use the Jira 
issue ID in case of [release]  as the commit message's prefix if you have a 
dedicated Jira issue for this task.

 # Cherry-pick the commit to the master branch. 





[jira] [Created] (FLINK-34709) CLONE - Remove outdated versions

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34709:
---

 Summary: CLONE - Remove outdated versions
 Key: FLINK-34709
 URL: https://issues.apache.org/jira/browse/FLINK-34709
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin


h4. dist.apache.org

For a new major release remove all release files older than 2 versions, e.g., 
when releasing 1.7, remove all releases <= 1.5.

For a new bugfix version remove all release files for previous bugfix releases 
in the same series, e.g., when releasing 1.7.1, remove the 1.7.0 release.
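
The two removal rules above can be sketched as a small selection function. This is only an illustration under stated assumptions: versions are written as x.y.z, a release with z = 0 is treated as the "new major release" case, and all versions share the same first component. The class `OutdatedReleases` and its method `toRemove` are hypothetical and not part of any Flink tooling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: picks the published releases that the rules above mark for removal.
final class OutdatedReleases {

    // Parses "x.y.z" into three integers.
    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected x.y.z but got: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2])
        };
    }

    static List<String> toRemove(String newRelease, List<String> published) {
        int[] n = parse(newRelease);
        List<String> result = new ArrayList<>();
        for (String candidate : published) {
            int[] p = parse(candidate);
            boolean outdated;
            if (n[2] == 0) {
                // New x.y.0: drop everything two or more series behind (1.7 -> all <= 1.5).
                outdated = p[0] == n[0] && p[1] <= n[1] - 2;
            } else {
                // New bugfix x.y.z: drop earlier bugfix releases of the same series.
                outdated = p[0] == n[0] && p[1] == n[1] && p[2] < n[2];
            }
            if (outdated) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

The result is only a candidate list; the actual removal still happens through the Subversion steps that follow.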
# If you have not already, check out the Flink section of the {{release}} 
repository on {{[dist.apache.org|http://dist.apache.org/]}} via Subversion. In 
a fresh directory:
{code}
svn checkout https://dist.apache.org/repos/dist/release/flink --depth=immediates
cd flink
{code}
# Remove files for outdated releases and commit the changes.
{code}
svn remove flink-
svn commit
{code}
# Verify that files  are 
[removed|https://dist.apache.org/repos/dist/release/flink]
(!) Remember to remove the corresponding download links from the website.

h4. CI

Disable the cron job for the now-unsupported version from 
(tools/azure-pipelines/[build-apache-repo.yml|https://github.com/apache/flink/blob/master/tools/azure-pipelines/build-apache-repo.yml])
 in the respective branch.





[jira] [Created] (FLINK-34714) CLONE - Start End of Life discussion thread for now outdated Flink minor version

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34714:
---

 Summary: CLONE - Start End of Life discussion thread for now 
outdated Flink minor version
 Key: FLINK-34714
 URL: https://issues.apache.org/jira/browse/FLINK-34714
 Project: Flink
  Issue Type: Sub-task
Reporter: Matthias Pohl


The idea is to discuss whether we should do a final release for the now not 
supported minor version in the community. Such a minor release shouldn't be 
covered by the current minor version release managers. Their only 
responsibility is to trigger the discussion.

The intention of a final patch release for the now unsupported Flink minor 
version is to flush out all the fixes that didn't end up in the previous 
release.





[jira] [Created] (FLINK-34711) CLONE - Other announcements

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34711:
---

 Summary: CLONE - Other announcements
 Key: FLINK-34711
 URL: https://issues.apache.org/jira/browse/FLINK-34711
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin


h3. Recordkeeping

Use [reporter.apache.org|https://reporter.apache.org/addrelease.html?flink] to 
seed the information about the release into future project reports.

(Note: Only PMC members have access to report releases. If you do not have access, 
ask on the mailing list for assistance.)
h3. Flink blog

Major or otherwise important releases should have a blog post. Write one if 
needed for this particular release. Minor releases that don’t introduce new 
major functionality don’t necessarily need to be blogged (see [flink-web PR 
#581 for Flink 1.15.3|https://github.com/apache/flink-web/pull/581] as an 
example for a minor release blog post).

Please make sure that the release notes of the documentation (see section 
"Review and update documentation") are linked from the blog post of a major 
release.
We usually include the names of all contributors in the announcement blog post. 
Use the following command to get the list of contributors:
{code}
# LC_ALL=C is required so that sort orders uppercase names before lowercase ones
export LC_ALL=C
export FLINK_PREVIOUS_RELEASE_BRANCH=
export FLINK_CURRENT_RELEASE_BRANCH=
# e.g.
# export FLINK_PREVIOUS_RELEASE_BRANCH=release-1.17
# export FLINK_CURRENT_RELEASE_BRANCH=release-1.18
git log $(git merge-base master $FLINK_PREVIOUS_RELEASE_BRANCH)..$(git show-ref --hash ${FLINK_CURRENT_RELEASE_BRANCH}) \
  --pretty=format:"%an%n%cn" | sort -u | paste -sd, | sed "s/\,/\, /g"
{code}
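The formatting half of that pipeline can be tried in isolation. The following is a minimal sketch, assuming a POSIX shell; the names are hypothetical stand-ins for the `%an%n%cn` output of `git log`, not real contributors.

```shell
# Hypothetical names standing in for the output of
# `git log --pretty=format:"%an%n%cn"`; the duplicate is intentional.
export LC_ALL=C   # byte-wise ordering: uppercase sorts before lowercase

contributors=$(printf '%s\n' "bob" "Alice" "Carol" "Alice" \
  | sort -u \
  | paste -sd, - \
  | sed 's/,/, /g')

echo "$contributors"   # prints: Alice, Carol, bob
```

`sort -u` drops people who appear as both author and committer, and `paste -sd,` joins the remaining lines into the single comma-separated line used in the blog post.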
h3. Social media

Tweet, post on Facebook, LinkedIn, and other platforms. Ask other contributors 
to do the same.
h3. Flink Release Wiki page

Add a summary of things that went well or did not go so well during the 
release process. This can include feedback from contributors but also more 
general observations, such as the release having taken longer than initially 
anticipated (and why), to give a bit of context to the release process.





[jira] [Created] (FLINK-34710) CLONE - Apache mailing lists announcements

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34710:
---

 Summary: CLONE - Apache mailing lists announcements
 Key: FLINK-34710
 URL: https://issues.apache.org/jira/browse/FLINK-34710
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin


Announce on the {{dev@}} mailing list that the release has been finished.

Announce the release on the {{user@}} mailing list, listing major 
improvements and contributions.

Announce the release on the [annou...@apache.org|mailto:annou...@apache.org] 
mailing list.
{panel}
{panel}
|{{From: Release Manager}}
{{To: dev@flink.apache.org, u...@flink.apache.org, user...@flink.apache.org, 
annou...@apache.org}}
{{Subject: [ANNOUNCE] Apache Flink 1.2.3 released}}
 
{{The Apache Flink community is very happy to announce the release of Apache 
Flink 1.2.3, which is the third bugfix release for the Apache Flink 1.2 
series.}}
 
{{Apache Flink® is an open-source stream processing framework for distributed, 
high-performing, always-available, and accurate data streaming applications.}}
 
{{The release is available for download at:}}
{{[https://flink.apache.org/downloads.html]}}
 
{{Please check out the release blog post for an overview of the improvements 
for this bugfix release:}}
{{}}
 
{{The full release notes are available in Jira:}}
{{}}
 
{{We would like to thank all contributors of the Apache Flink community who 
made this release possible!}}
 
{{Feel free to reach out to the release managers (or respond to this thread) 
with feedback on the release process. Our goal is to constantly improve the 
release process. Feedback on what could be improved or things that didn't go so 
well is appreciated.}}
 
{{Regards,}}
{{Release Manager}}|





[jira] [Created] (FLINK-34708) CLONE - Merge website pull request

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34708:
---

 Summary: CLONE - Merge website pull request
 Key: FLINK-34708
 URL: https://issues.apache.org/jira/browse/FLINK-34708
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin


Merge the website pull request to [list the 
release|http://flink.apache.org/downloads.html]. Make sure to regenerate the 
website as well, as it isn't built automatically.





[jira] [Created] (FLINK-34707) CLONE - Update japicmp configuration

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34707:
---

 Summary: CLONE - Update japicmp configuration
 Key: FLINK-34707
 URL: https://issues.apache.org/jira/browse/FLINK-34707
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin
 Fix For: 1.19.0, 1.18.1


Update the japicmp reference version and wipe exclusions / enable API 
compatibility checks for {{@PublicEvolving}} APIs on the corresponding SNAPSHOT 
branch with the {{update_japicmp_configuration.sh}} script (see below).

For a new major release (x.y.0), run the same command also on the master branch 
for updating the japicmp reference version and removing out-dated exclusions in 
the japicmp configuration.

Make sure that all Maven artifacts are already pushed to Maven Central. 
Otherwise, there's a risk that CI fails due to missing reference artifacts.
{code:bash}
tools $ NEW_VERSION=$RELEASE_VERSION releasing/update_japicmp_configuration.sh
tools $ cd ..
$ git add *
$ git commit -m "Update japicmp configuration for $RELEASE_VERSION"
{code}
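
The Maven Central precondition mentioned above can be checked before running the script. This is only a sketch: it assumes that `flink-core` is a reasonable representative artifact and that `curl` and network access are available; the helper names are hypothetical and not part of the Flink tooling.

```shell
# Build the Maven Central URL of one reference artifact's POM.
# Assumption: flink-core is used here only as a representative artifact.
maven_central_pom_url() {
  version="$1"
  artifact="${2:-flink-core}"
  echo "https://repo1.maven.org/maven2/org/apache/flink/${artifact}/${version}/${artifact}-${version}.pom"
}

# Succeeds only when the POM can be fetched (requires network access).
artifact_is_published() {
  curl --silent --fail --head "$(maven_central_pom_url "$1")" > /dev/null
}
```

A release manager could then guard the update with something like `artifact_is_published "$RELEASE_VERSION" && NEW_VERSION=$RELEASE_VERSION releasing/update_japicmp_configuration.sh`.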





[jira] [Created] (FLINK-34706) Promote release 1.19

2024-03-18 Thread lincoln lee (Jira)
lincoln lee created FLINK-34706:
---

 Summary: Promote release 1.19
 Key: FLINK-34706
 URL: https://issues.apache.org/jira/browse/FLINK-34706
 Project: Flink
  Issue Type: New Feature
Affects Versions: 1.18.0
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Once the release has been finalized (FLINK-32920), the last step of the process 
is to promote the release within the project and beyond. Please wait for 24h 
after finalizing the release in accordance with the [ASF release 
policy|http://www.apache.org/legal/release-policy.html#release-announcements].

*Final checklist to declare this issue resolved:*
 # Website pull request to [list the 
release|http://flink.apache.org/downloads.html] merged
 # Release announced on the user@ mailing list.
 # Blog post published, if applicable.
 # Release recorded in 
[reporter.apache.org|https://reporter.apache.org/addrelease.html?flink].
 # Release announced on social media.
 # Completion declared on the dev@ mailing list.
 # Update Homebrew: [https://docs.brew.sh/How-To-Open-a-Homebrew-Pull-Request] 
(seems to be done automatically for both minor and major releases)
 # Updated the japicmp configuration
 ** corresponding SNAPSHOT branch japicmp reference version set to the just 
released version, and API compatibility checks for {{@PublicEvolving}} were 
enabled
 ** (minor version release only) master branch japicmp reference version set to 
the just released version
 ** (minor version release only) master branch japicmp exclusions have been 
cleared
 # Update the list of previous versions in {{docs/config.toml}} on the master 
branch.
 # Set {{show_outdated_warning: true}} in {{docs/config.toml}} in the branch of 
the _now deprecated_ Flink version (i.e. 1.16 if 1.18.0 is released)
 # Update stable and master alias in 
[https://github.com/apache/flink/blob/master/.github/workflows/docs.yml]
 # Open discussion thread for End of Life for Unsupported version (i.e. 1.16)





[jira] [Created] (FLINK-34698) CLONE - Deploy Python artifacts to PyPI

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34698:
---

 Summary: CLONE - Deploy Python artifacts to PyPI
 Key: FLINK-34698
 URL: https://issues.apache.org/jira/browse/FLINK-34698
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Release manager should create a PyPI account and ask the PMC to add this account 
to the pyflink collaborator list with the Maintainer role (the PyPI admin account 
info can be found here. NOTE, only visible to PMC members) to deploy the Python 
artifacts to PyPI. The artifacts can be uploaded using 
twine ([https://pypi.org/project/twine/]). To install twine, just run:
{code:java}
pip install --upgrade twine==1.12.0
{code}
Download the Python artifacts from dist.apache.org and upload them to pypi.org:
{code:java}
svn checkout \
  https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
cd flink-${RELEASE_VERSION}-rc${RC_NUM}

cd python

# upload wheels together with their signatures
for f in *.whl; do
  twine upload --repository-url https://upload.pypi.org/legacy/ "$f" "$f.asc"
done

# upload source packages together with their signatures
twine upload --repository-url https://upload.pypi.org/legacy/ \
  apache-flink-libraries-${RELEASE_VERSION}.tar.gz \
  apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc

twine upload --repository-url https://upload.pypi.org/legacy/ \
  apache-flink-${RELEASE_VERSION}.tar.gz \
  apache-flink-${RELEASE_VERSION}.tar.gz.asc
{code}
If the upload failed or is incorrect for some reason (e.g. network transmission 
problem), you need to delete the uploaded release package of the same version 
(if it exists) and rename the artifact to 
{{apache-flink-${RELEASE_VERSION}.post0.tar.gz}}, then re-upload.

(!) Note: re-uploading to pypi.org must be avoided as much as possible because 
it will cause some irreparable problems. If that happens, users cannot install 
the apache-flink package by explicitly specifying the package version, i.e. the 
following command "pip install apache-flink==${RELEASE_VERSION}" will fail. 
Instead they have to run "pip install apache-flink" or "pip install 
apache-flink==${RELEASE_VERSION}.post0" to install the apache-flink package.
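
The `.post0` file name described above can be derived with plain parameter expansion. This is a minimal sketch; `post0_name` is a hypothetical helper and only computes the new file name, it does not touch PyPI or the artifact's contents.

```shell
# Derive the re-upload name for a source package by inserting ".post0"
# before the ".tar.gz" suffix.
post0_name() {
  echo "${1%.tar.gz}.post0.tar.gz"
}

RELEASE_VERSION=1.19.0   # example value
post0_name "apache-flink-${RELEASE_VERSION}.tar.gz"
# prints: apache-flink-1.19.0.post0.tar.gz
```

The matching signature file would need the same treatment (for example `mv "$f.asc" "$(post0_name "$f").asc"`), since twine uploads both.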

 

h3. Expectations
 * Python artifacts released and indexed in the 
[PyPI|https://pypi.org/project/apache-flink/] Repository





[jira] [Created] (FLINK-34701) CLONE - Publish the Dockerfiles for the new release

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34701:
---

 Summary: CLONE - Publish the Dockerfiles for the new release
 Key: FLINK-34701
 URL: https://issues.apache.org/jira/browse/FLINK-34701
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Note: the official Dockerfiles fetch the binary distribution of the target 
Flink version from an Apache mirror. After publishing the binary release 
artifacts, mirrors can take some hours to start serving the new artifacts, so 
you may want to wait to do this step until you are ready to continue with the 
"Promote the release" steps in the follow-up Jira.

Follow the [release instructions in the flink-docker 
repo|https://github.com/apache/flink-docker#release-workflow] to build the new 
Dockerfiles and send an updated manifest to Docker Hub so the new images are 
built and published.

 

h3. Expectations

 * Dockerfiles in [flink-docker|https://github.com/apache/flink-docker] updated 
for the new Flink release and pull request opened on the Docker official-images 
with an updated manifest





[jira] [Created] (FLINK-34699) CLONE - Deploy artifacts to Maven Central Repository

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34699:
---

 Summary: CLONE - Deploy artifacts to Maven Central Repository
 Key: FLINK-34699
 URL: https://issues.apache.org/jira/browse/FLINK-34699
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Use the [Apache Nexus repository|https://repository.apache.org/] to release the 
staged binary artifacts to the Maven Central repository. In the Staging 
Repositories section, find the relevant release candidate orgapacheflink-XXX 
entry and click Release. Drop all other release candidates that are not being 
released.
h3. Deploy source and binary releases to dist.apache.org

Copy the source and binary releases from the dev repository to the release 
repository at [dist.apache.org|http://dist.apache.org/] using Subversion.
{code:java}
$ svn move -m "Release Flink ${RELEASE_VERSION}" \
    https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM} \
    https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION}
{code}
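Because the move acts directly on the shared repository, it can help to print the fully expanded command first and review it. The following dry-run sketch only echoes the command; `print_svn_move` and `DIST_BASE` are hypothetical names introduced for illustration.

```shell
# Print (do not run) the svn move that promotes an RC to the release repository.
DIST_BASE="https://dist.apache.org/repos/dist"

print_svn_move() {
  release_version="$1"
  rc_num="$2"
  echo "svn move -m \"Release Flink ${release_version}\"" \
    "${DIST_BASE}/dev/flink/flink-${release_version}-rc${rc_num}" \
    "${DIST_BASE}/release/flink/flink-${release_version}"
}

print_svn_move 1.19.0 2
```

For 1.19.0 rc2 this shows the dev URL ending in `flink-1.19.0-rc2` as the source and the release URL ending in `flink-1.19.0` as the target.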
(Note: Only PMC members have access to the release repository. If you do not 
have access, ask on the mailing list for assistance.)
h3. Remove old release candidates from [dist.apache.org|http://dist.apache.org/]

Remove the old release candidates from 
[https://dist.apache.org/repos/dist/dev/flink] using Subversion.
{code:java}
$ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
$ cd flink
$ svn remove flink-${RELEASE_VERSION}-rc*
$ svn commit -m "Remove old release candidates for Apache Flink ${RELEASE_VERSION}"
{code}
 

h3. Expectations
 * Maven artifacts released and indexed in the [Maven Central 
Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22]
 (usually takes about a day to show up)
 * Source & binary distributions available in the release repository of 
[https://dist.apache.org/repos/dist/release/flink/]
 * Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty
 * Website contains links to new release binaries and sources in download page
 * (for minor version updates) the front page references the correct new major 
release version and directs to the correct link





[jira] [Created] (FLINK-34700) CLONE - Create Git tag and mark version as released in Jira

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34700:
---

 Summary: CLONE - Create Git tag and mark version as released in 
Jira
 Key: FLINK-34700
 URL: https://issues.apache.org/jira/browse/FLINK-34700
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Create and push a new Git tag for the released version by copying the tag for 
the final release candidate, as follows:
{code:java}
$ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} \
    -m "Release Flink ${RELEASE_VERSION}"
$ git push  refs/tags/release-${RELEASE_VERSION}
{code}
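Since the release tag is meant to be a copy of the final release candidate's tag, both should resolve to the same commit. A small sketch of that check follows, assuming it runs inside the Flink clone; `tags_match` is a hypothetical helper, not an existing script.

```shell
# Succeeds when both refs (tags, branches, hashes) resolve to the same commit.
# "^{commit}" peels annotated or signed tags down to the commit they point at.
tags_match() {
  [ "$(git rev-parse "$1^{commit}")" = "$(git rev-parse "$2^{commit}")" ]
}
```

Usage, with the variables from the commands above: `tags_match "release-${RELEASE_VERSION}" "${TAG}" && echo "release tag matches the RC tag"`.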
In JIRA, inside [version 
management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions],
 hover over the current release and a settings menu will appear. Click Release, 
and select today’s date.

(Note: Only PMC members have access to the project administration. If you do 
not have access, ask on the mailing list for assistance.)

If PRs have been merged to the release branch after the last release 
candidate was tagged, make sure that the corresponding Jira tickets have the 
correct Fix Version set.

 

h3. Expectations
 * Release tagged in the source code repository
 * Release version finalized in JIRA. (Note: Not all committers have 
administrator access to JIRA. If you end up getting permissions errors ask on 
the mailing list for assistance)





[jira] [Created] (FLINK-34697) Finalize release 1.19.0

2024-03-15 Thread lincoln lee (Jira)
lincoln lee created FLINK-34697:
---

 Summary: Finalize release 1.19.0
 Key: FLINK-34697
 URL: https://issues.apache.org/jira/browse/FLINK-34697
 Project: Flink
  Issue Type: New Feature
Reporter: Sergey Nuyanzin
Assignee: Jing Ge


Once the release candidate has been reviewed and approved by the community, the 
release should be finalized. This involves the final deployment of the release 
candidate to the release repositories, merging of the website changes, etc.





[RESULT][VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Lincoln Lee
Hi everyone,

I'm happy to announce that we have unanimously approved this release.

There are 24 approving votes, 4 of which are binding:

- Xintong Song (binding)
- Jean-Baptiste Onofré (non binding)
- Ahmed Hamdy (non binding)
- Samrat Deb (non binding)
- Jeyhun Karimov (non binding)
- Hangxiang Yu (non binding)
- Yanfei Lei (non binding)
- Ron liu (non binding)
- Feng Jin (non binding)
- Hang Ruan (non binding)
- Xuannan Su (non binding)
- Jane Chan (non binding)
- Rui Fan (non binding)
- weijie guo (non binding)
- Benchao Li (non binding)
- Qingsheng Ren (binding)
- Yun Tang (non binding)
- Ferenc Csaky (non binding)
- gongzhongqiang (non binding)
- Martijn Visser (binding)
- Sergey Nuyanzin (non binding)
- Jing Ge (non binding)
- Leonard Xu (binding)
- Jiabao Sun (non binding)


There are no disapproving votes.

Thank you for verifying the release candidate. We will now proceed
to finalize the release and announce it once everything is published.


Best,
Yun, Jing, Martijn and Lincoln
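
The tally in this mail can be re-checked mechanically. The sketch below copies the vote list from the message into a here-document and counts lines with `grep`; the threshold of 3 binding votes is the one stated in the vote mail, everything else is plain text processing.

```shell
# Vote list copied from the result mail above.
votes=$(cat <<'EOF'
- Xintong Song (binding)
- Jean-Baptiste Onofré (non binding)
- Ahmed Hamdy (non binding)
- Samrat Deb (non binding)
- Jeyhun Karimov (non binding)
- Hangxiang Yu (non binding)
- Yanfei Lei (non binding)
- Ron liu (non binding)
- Feng Jin (non binding)
- Hang Ruan (non binding)
- Xuannan Su (non binding)
- Jane Chan (non binding)
- Rui Fan (non binding)
- weijie guo (non binding)
- Benchao Li (non binding)
- Qingsheng Ren (binding)
- Yun Tang (non binding)
- Ferenc Csaky (non binding)
- gongzhongqiang (non binding)
- Martijn Visser (binding)
- Sergey Nuyanzin (non binding)
- Jing Ge (non binding)
- Leonard Xu (binding)
- Jiabao Sun (non binding)
EOF
)

total=$(printf '%s\n' "$votes" | grep -c '^- ')
# "(binding)" as a fixed string does not match "(non binding)".
binding=$(printf '%s\n' "$votes" | grep -cF '(binding)')

echo "total=$total binding=$binding"   # prints: total=24 binding=4
[ "$binding" -ge 3 ] && echo "at least 3 binding +1 votes"
```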


Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Lincoln Lee
Hi Matthias,

Thanks for the update!
And once again, thank you for continuously tracking all unstable cases and
driving the resolution!

Best,
Lincoln Lee


On Thu, Mar 14, 2024 at 6:00 PM Lincoln Lee  wrote:

> Hi Jane,
>
> Thank you for raising this question. I saw the discussion in the Jira
> (including Matthias' point) and sought advice from several PMC members
> (including the previous RMs). The majority are in favor of merging the
> bugfix into the release branch even during the release candidate (RC)
> voting period, so we should accept all bugfixes (unless there is a
> specific community rule preventing it).
>
> Thanks again for contributing to the community!
>
> Best,
> Lincoln Lee
>
>
> On Thu, Mar 14, 2024 at 5:50 PM Matthias Pohl  wrote:
>
>> Update on FLINK-34227 [1] which I mentioned above: Chesnay helped identify
>> a concurrency issue in the JobMaster shutdown logic which seems to be in
>> the code for quite some time. I created a PR fixing the issue hoping that
>> the test instability is resolved with it.
>>
>> The concurrency issue doesn't really explain why it only started to appear
>> recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
>> hint in the git history indicating that it's caused by some newly
>> introduced change. That is why I wouldn't make FLINK-34227 a reason to
>> cancel rc2. Instead, the fix can be provided in subsequent patch releases.
>>
>> Matthias
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-34227
>>
>> On Thu, Mar 14, 2024 at 8:49 AM Jane Chan  wrote:
>>
>> > Hi Yun, Jing, Martijn and Lincoln,
>> >
>> > I'm seeking guidance on whether merging the bugfix[1][2] at this stage
>> is
>> > appropriate. I want to ensure that the actions align with the current
>> > release process and do not disrupt the ongoing preparations.
>> >
>> > [1] https://issues.apache.org/jira/browse/FLINK-29114
>> > [2] https://github.com/apache/flink/pull/24492
>> >
>> > Best,
>> > Jane
>> >
>> > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang  wrote:
>> >
>> > > +1 (non-binding)
>> > >
>> > >
>> > >   *
>> > > Verified the signature and checksum.
>> > >   *
>> > > Reviewed the release note PR
>> > >   *
>> > > Reviewed the web announcement PR
>> > >   *
>> > > Start a standalone cluster to submit the state machine example, which
>> > > works well.
>> > >   *
>> > > Checked the pre-built jars are generated via JDK8
>> > >   *
>> > > Verified the process profiler works well after setting
>> > > rest.profiling.enabled: true
>> > >
>> > > Best
>> > > Yun Tang
>> > >
>> > > 
>> > > From: Qingsheng Ren 
>> > > Sent: Wednesday, March 13, 2024 12:45
>> > > To: dev@flink.apache.org 
>> > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
>> > >
>> > > +1 (binding)
>> > >
>> > > - Verified signature and checksum
>> > > - Verified no binary in source
>> > > - Built from source
>> > > - Tested reading and writing Kafka with SQL client and Kafka connector
>> > > 3.1.0
>> > > - Verified source code tag
>> > > - Reviewed release note
>> > > - Reviewed web PR
>> > >
>> > > Thanks to all release managers and contributors for the awesome work!
>> > >
>> > > Best,
>> > > Qingsheng
>> > >
>> > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
>> > >  wrote:
>> > >
>> > > > I want to share an update on FLINK-34227 [1]: It's still not clear
>> > what's
>> > > > causing the test instability. So far, we agreed in today's release
>> sync
>> > > [2]
>> > > > that it's not considered a blocker because it is observed in 1.18
>> > nightly
>> > > > builds and it only appears in the GitHub Actions workflow. But I
>> still
>> > > have
>> > > > a bit of a concern that this is something that was introduced in
>> 1.19
>> > and
>> > > > backported to 1.18 after the 1.18.1 release (because the test
>> > instability
>> > > > started to appear more regularly in March; with one occurrence in
>> > > January).
>> > > > Additionally, I have no r

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-14 Thread Lincoln Lee
Hi Jane,

Thank you for raising this question. I saw the discussion in the Jira
(including Matthias' point) and sought advice from several PMC members
(including the previous RMs). The majority are in favor of merging the
bugfix into the release branch even during the release candidate (RC)
voting period, so we should accept all bugfixes (unless there is a
specific community rule preventing it).

Thanks again for contributing to the community!

Best,
Lincoln Lee


On Thu, Mar 14, 2024 at 5:50 PM Matthias Pohl  wrote:

> Update on FLINK-34227 [1] which I mentioned above: Chesnay helped identify
> a concurrency issue in the JobMaster shutdown logic which seems to be in
> the code for quite some time. I created a PR fixing the issue hoping that
> the test instability is resolved with it.
>
> The concurrency issue doesn't really explain why it only started to appear
> recently in a specific CI setup (GHA with AdaptiveScheduler). There is no
> hint in the git history indicating that it's caused by some newly
> introduced change. That is why I wouldn't make FLINK-34227 a reason to
> cancel rc2. Instead, the fix can be provided in subsequent patch releases.
>
> Matthias
>
> [1] https://issues.apache.org/jira/browse/FLINK-34227
>
> On Thu, Mar 14, 2024 at 8:49 AM Jane Chan  wrote:
>
> > Hi Yun, Jing, Martijn and Lincoln,
> >
> > I'm seeking guidance on whether merging the bugfix[1][2] at this stage is
> > appropriate. I want to ensure that the actions align with the current
> > release process and do not disrupt the ongoing preparations.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29114
> > [2] https://github.com/apache/flink/pull/24492
> >
> > Best,
> > Jane
> >
> > On Thu, Mar 14, 2024 at 1:33 PM Yun Tang  wrote:
> >
> > > +1 (non-binding)
> > >
> > >
> > >   *
> > > Verified the signature and checksum.
> > >   *
> > > Reviewed the release note PR
> > >   *
> > > Reviewed the web announcement PR
> > >   *
> > > Start a standalone cluster to submit the state machine example, which
> > > works well.
> > >   *
> > > Checked the pre-built jars are generated via JDK8
> > >   *
> > > Verified the process profiler works well after setting
> > > rest.profiling.enabled: true
> > >
> > > Best
> > > Yun Tang
> > >
> > > 
> > > From: Qingsheng Ren 
> > > Sent: Wednesday, March 13, 2024 12:45
> > > To: dev@flink.apache.org 
> > > Subject: Re: [VOTE] Release 1.19.0, release candidate #2
> > >
> > > +1 (binding)
> > >
> > > - Verified signature and checksum
> > > - Verified no binary in source
> > > - Built from source
> > > - Tested reading and writing Kafka with SQL client and Kafka connector
> > > 3.1.0
> > > - Verified source code tag
> > > - Reviewed release note
> > > - Reviewed web PR
> > >
> > > Thanks to all release managers and contributors for the awesome work!
> > >
> > > Best,
> > > Qingsheng
> > >
> > > On Wed, Mar 13, 2024 at 1:23 AM Matthias Pohl
> > >  wrote:
> > >
> > > > I want to share an update on FLINK-34227 [1]: It's still not clear
> > what's
> > > > causing the test instability. So far, we agreed in today's release
> sync
> > > [2]
> > > > that it's not considered a blocker because it is observed in 1.18
> > nightly
> > > > builds and it only appears in the GitHub Actions workflow. But I
> still
> > > have
> > > > a bit of a concern that this is something that was introduced in 1.19
> > and
> > > > backported to 1.18 after the 1.18.1 release (because the test
> > instability
> > > > started to appear more regularly in March; with one occurrence in
> > > January).
> > > > Additionally, I have no reason to believe, yet, that the instability
> is
> > > > caused by some GHA-related infrastructure issue.
> > > >
> > > > So, if someone else has some capacity to help looking into it; that
> > would
> > > > be appreciated. I will continue my investigation tomorrow.
> > > >
> > > > Best,
> > > > Matthias
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-34227
> > > > [2]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/1.19+Release#id-1.19Release-03/12/2024

Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

2024-03-14 Thread Lincoln Lee
Hi Jing,

Thanks for your attention to this flip! I'll try to answer the following
questions.

> 1. How to define query of dynamic table?
> Use flink sql or introducing new syntax?
> If use flink sql, how to handle the difference in SQL between streaming
> and batch processing?
> For example, a query including window aggregate based on processing time?
> or a query including global order by?

Similar to `CREATE TABLE AS query`, the `query` here also uses Flink SQL and
doesn't introduce a totally new syntax. We will not change the existing
differences in functionality of Flink SQL between streaming and batch. For
example, the proctime window aggregation on streaming and the global sort on
batch that you mentioned do not work properly in the other mode, so when a
user switches a dynamic table to a refresh mode that its query does not
support, we will throw an exception.

> 2. Whether modify the query of dynamic table is allowed?
> Or we could only refresh a dynamic table based on the initial query?

Yes, in the current design, the query definition of the dynamic table is not
allowed to be modified, and you can only refresh the data based on the
initial definition.

> 3. How to use dynamic table?
> The dynamic table seems to be similar to the materialized view.  Will we
> do something like materialized view rewriting during the optimization?

It's true that dynamic table and materialized view are similar in some ways,
but as Ron explains there are differences. In terms of optimization,
automated materialization discovery similar to that supported by Calcite is
also a potential possibility, perhaps with the addition of automated
rewriting in the future.



Best,
Lincoln Lee


On Thu, Mar 14, 2024 at 2:01 PM Ron liu  wrote:

> Hi, Timo
>
> Sorry for the late response, and thanks for your feedback.
> Regarding your questions:
>
> > Flink has introduced the concept of Dynamic Tables many years ago. How
> > does the term "Dynamic Table" fit into Flink's regular tables and also
> > how does it relate to Table API?
>
>
> > I fear that adding the DYNAMIC TABLE keyword could cause confusion for
> > users, because a term for regular CREATE TABLE (that can be "kind of
> > dynamic" as well and is backed by a changelog) is then missing. Also
> > given that we call our connectors for those tables, DynamicTableSource
> > and DynamicTableSink.
>
>
> > In general, I find it contradicting that a TABLE can be "paused" or
> > "resumed". From an English language perspective, this does sound
> > incorrect. In my opinion (without much research yet), a continuous
> > updating trigger should rather be modelled as a CREATE MATERIALIZED VIEW
> > (which users are familiar with?) or a new concept such as a CREATE TASK
> > (that can be paused and resumed?).
>
>
> 1. In the current concept[1], it actually includes: Dynamic Tables &
> Continuous Query. Dynamic Table is just an abstract logical concept,
> which in its physical form represents either a table or a changelog
> dynamic updates of the target table similar to a database’s
> Materialized View.
> We hope to upgrade the Dynamic Table to a real entity that users can
> operate, which combines the logical concepts of Dynamic Tables +
> Continuous Query. By integrating the definition of tables and queries,
> it can achieve functions similar to Materialized Views, simplifying
> users' data processing pipelines.
> So, the object of the suspend operation is the refresh task of the
> dynamic table. The command  `ALTER DYNAMIC TABLE table_name SUSPEND `
> is actually a shorthand for `ALTER DYNAMIC TABLE table_name SUSPEND
> REFRESH` (if written in full for clarity, we can also modify it).
>
> 2. Initially, we also considered Materialized Views, but ultimately
> decided against them. Materialized views are designed
> to enhance query performance for workloads that consist of common,
> repetitive query patterns. In essence, a materialized view represents
> the result of a query.
> However, it is not intended to support data modification. For
> Lakehouse scenarios, where the ability to delete or update data is
> crucial (such as compliance with GDPR, FLIP-2), materialized views
> fall short.
>
> 3. Compared to CREATE (regular) TABLE, CREATE DYNAMIC TABLE not only
> defines metadata in the catalog but also automatically initiates a
> data refresh task based on the query specified during table creation.
> It dynamically executes data updates. Users can focus on data
> dependencies and data generation logic.
>
> 4. The new dynamic table does not conflict with the existing
> DynamicTableSource and DynamicTableSink int

Re: [VOTE] Release 1.19.0, release candidate #2

2024-03-11 Thread Lincoln Lee
Hi Robin,

Thanks for helping verify the release notes [1]. FLINK-14879 should not
have been included. After confirming this, I moved all unresolved
non-blocker issues left over from 1.19.0 to 1.20.0 and reconfigured the
release notes [1].

Best,
Lincoln Lee

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282


On Mon, Mar 11, 2024 at 7:36 PM Robin Moffatt  wrote:

> Looking at the release notes [1] it lists `DESCRIBE DATABASE` (FLINK-14879)
> and `DESCRIBE CATALOG` (FLINK-14690).
> When I try these in 1.19 RC2 the behaviour is as in 1.18.1, i.e. it is not
> supported:
>
> ```
> [INFO] Execute statement succeed.
>
> Flink SQL> show catalogs;
> +-----------------+
> |    catalog name |
> +-----------------+
> |           c_new |
> | default_catalog |
> +-----------------+
> 2 rows in set
>
> Flink SQL> DESCRIBE CATALOG c_new;
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.calcite.sql.validate.SqlValidatorException: Column 'c_new' not
> found in any table
>
> Flink SQL> show databases;
> +------------------+
> |    database name |
> +------------------+
> | default_database |
> +------------------+
> 1 row in set
>
> Flink SQL> DESCRIBE DATABASE default_database;
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.calcite.sql.validate.SqlValidatorException: Column
> 'default_database' not found in
> any table
> ```
>
> Is this an error in the release notes, or my mistake in interpreting them?
>
> thanks, Robin.
>
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
>
> On Thu, 7 Mar 2024 at 10:01, Lincoln Lee  wrote:
>
> > Hi everyone,
> >
> > Please review and vote on the release candidate #2 for the version
> 1.19.0,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > The complete staging area is available for your review, which includes:
> >
> > * JIRA release notes [1], and the pull request adding release note for
> > users [2]
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [3], which are signed with the key with
> > fingerprint E57D30ABEE75CA06  [4],
> > * all artifacts to be deployed to the Maven Central Repository [5],
> > * source code tag "release-1.19.0-rc2" [6],
> > * website pull request listing the new release and adding announcement
> blog
> > post [7].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
> > [2] https://github.com/apache/flink/pull/24394
> > [3] https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc2/
> > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [5] https://repository.apache.org/content/repositories/orgapacheflink-1709
> > [6] https://github.com/apache/flink/releases/tag/release-1.19.0-rc2
> > [7] https://github.com/apache/flink-web/pull/721
> >
> >
> > Best,
> > Yun, Jing, Martijn and Lincoln
> >
>


[VOTE] Release 1.19.0, release candidate #2

2024-03-07 Thread Lincoln Lee
Hi everyone,

Please review and vote on the release candidate #2 for the version 1.19.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:

* JIRA release notes [1], and the pull request adding release note for
users [2]
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [3], which are signed with the key with
fingerprint E57D30ABEE75CA06  [4],
* all artifacts to be deployed to the Maven Central Repository [5],
* source code tag "release-1.19.0-rc2" [6],
* website pull request listing the new release and adding announcement blog
post [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
[2] https://github.com/apache/flink/pull/24394
[3] https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc2/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5] https://repository.apache.org/content/repositories/orgapacheflink-1709
[6] https://github.com/apache/flink/releases/tag/release-1.19.0-rc2
[7] https://github.com/apache/flink-web/pull/721


Best,
Yun, Jing, Martijn and Lincoln


[jira] [Created] (FLINK-34590) Build Release Candidate: 1.19.0-rc2

2024-03-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34590:
---

 Summary: Build Release Candidate: 1.19.0-rc2
 Key: FLINK-34590
 URL: https://issues.apache.org/jira/browse/FLINK-34590
 Project: Flink
  Issue Type: New Feature
Affects Versions: 1.19.0
Reporter: Lincoln Lee
Assignee: lincoln lee
 Fix For: 1.19.0


The core of the release process is the build-vote-fix cycle. Each cycle 
produces one release candidate. The Release Manager repeats this cycle until 
the community approves one release candidate, which is then finalized.

h4. Prerequisites
Set up a few environment variables to simplify the Maven commands that follow. 
They identify the release candidate being built. Start with {{RC_NUM}} equal to 1 
and increment it for each candidate:
{code}
RC_NUM="1"
TAG="release-${RELEASE_VERSION}-rc${RC_NUM}"
{code}
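
As a minimal sketch of how these variables combine, assuming {{RELEASE_VERSION}} is set beforehand (the values below are illustrative only, taken from the second candidate of 1.19.0):

```shell
# Illustrative values only; set them to match the candidate being built.
RELEASE_VERSION="1.19.0"
RC_NUM="2"
TAG="release-${RELEASE_VERSION}-rc${RC_NUM}"

# Print the derived tag so it can be checked before tagging.
echo "${TAG}"
```

With these values the derived tag name is release-1.19.0-rc2.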



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34592) CLONE - Stage source and binary releases on dist.apache.org

2024-03-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34592:
---

 Summary: CLONE - Stage source and binary releases on 
dist.apache.org
 Key: FLINK-34592
 URL: https://issues.apache.org/jira/browse/FLINK-34592
 Project: Flink
  Issue Type: Sub-task
Reporter: Lincoln Lee
Assignee: lincoln lee


Copy the source release to the dev repository of dist.apache.org:
# If you have not already, check out the Flink section of the dev repository on 
dist.apache.org via Subversion. In a fresh directory:
{code:bash}
$ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
{code}
# Make a directory for the new release and copy all the artifacts (Flink 
source/binary distributions, hashes, GPG signatures and the python 
subdirectory) into that newly created directory:
{code:bash}
$ mkdir flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
$ mv /tools/releasing/release/* 
flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
{code}
# Add and commit all the files.
{code:bash}
$ cd flink
flink $ svn add flink-${RELEASE_VERSION}-rc${RC_NUM}
flink $ svn commit -m "Add flink-${RELEASE_VERSION}-rc${RC_NUM}"
{code}
# Verify that files are present under 
[https://dist.apache.org/repos/dist/dev/flink|https://dist.apache.org/repos/dist/dev/flink].
# Push the release tag if not done already (the following command assumes it is 
run from within the apache/flink checkout):
{code:bash}
$ git push  refs/tags/release-${RELEASE_VERSION}-rc${RC_NUM}
{code}

 

h3. Expectations
 * Maven artifacts deployed to the staging repository of 
[repository.apache.org|https://repository.apache.org/content/repositories/]
 * Source distribution deployed to the dev repository of 
[dist.apache.org|https://dist.apache.org/repos/dist/dev/flink/]
 * Check hashes (e.g. shasum -c *.sha512)
 * Check signatures (e.g. {{{}gpg --verify 
flink-1.2.3-source-release.tar.gz.asc flink-1.2.3-source-release.tar.gz{}}})
 * {{grep}} for legal headers in each file.
 * If time allows, check in advance the NOTICE files of the modules whose 
dependencies have changed in this release, since license issues pop up from 
time to time during voting. See the "Checking License" section of 
[Verifying a Flink Release|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release].

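The hash and signature checks listed above could be scripted roughly as follows. This is only a sketch: the function name {{verify_release_dir}} is hypothetical, it assumes {{sha512sum}} and {{gpg}} are installed, that every artifact sits next to its {{.sha512}} and {{.asc}} file, and that the signing key from the KEYS file has already been imported.

```shell
# Hypothetical helper: verify every *.sha512 and *.asc file in one directory.
verify_release_dir() {
  (
    cd "$1" || exit 2
    failed=0
    # Check each hash file against the artifact it names.
    for f in *.sha512; do
      [ -e "$f" ] || continue
      sha512sum -c "$f" >/dev/null 2>&1 || { echo "hash mismatch: $f"; failed=1; }
    done
    # Check each detached signature against the artifact of the same name.
    for f in *.asc; do
      [ -e "$f" ] || continue
      gpg --verify "$f" "${f%.asc}" >/dev/null 2>&1 || { echo "bad signature: $f"; failed=1; }
    done
    exit "$failed"
  )
}
```

Calling {{verify_release_dir flink/flink-${RELEASE_VERSION}-rc${RC_NUM}}} returns a non-zero status if any check fails, so it can gate the next step of the process.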




[jira] [Created] (FLINK-34593) CLONE - Propose a pull request for website updates

2024-03-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34593:
---

 Summary: CLONE - Propose a pull request for website updates
 Key: FLINK-34593
 URL: https://issues.apache.org/jira/browse/FLINK-34593
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.19.0
Reporter: Lincoln Lee
Assignee: lincoln lee
 Fix For: 1.19.0


The final step of building the candidate is to propose a website pull request 
containing the following changes:
 # update 
[apache/flink-web:_config.yml|https://github.com/apache/flink-web/blob/asf-site/_config.yml]
 ## update {{FLINK_VERSION_STABLE}} and {{FLINK_VERSION_STABLE_SHORT}} as 
required
 ## update version references in quickstarts ({{{}q/{}}} directory) as required
 ## (major only) add a new entry to {{flink_releases}} for the release binaries 
and sources
 ## (minor only) update the entry for the previous release in the series in 
{{flink_releases}}
 ### Please pay attention to the IDs assigned to the download entries. They should 
be unique and reflect their corresponding version number.
 ## add a new entry to {{release_archive.flink}}
 # add a blog post announcing the release in _posts
 # add an organized release notes page under docs/content/release-notes and 
docs/content.zh/release-notes (like 
[https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/]).
 The page is based on the non-empty release notes collected from the issues, 
and only the issues that affect existing users should be included (rather 
than, e.g., new functionality). It should be in a separate PR since it is 
merged into the flink project.

(!) Don’t merge the PRs before finalizing the release.

 

h3. Expectations
 * Website pull request proposed to list the 
[release|http://flink.apache.org/downloads.html]
 * (major only) Check {{docs/config.toml}} to ensure that
 ** the version constants refer to the new version
 ** the {{baseurl}} does not point to {{flink-docs-master}}  but 
{{flink-docs-release-X.Y}} instead
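
The {{baseurl}} expectation could be checked with a small script. This is a sketch under assumptions: the helper name {{check_baseurl}} is made up, the key is matched case-insensitively because its exact spelling in {{docs/config.toml}} is not confirmed here, and only the first matching line is inspected.

```shell
# Hypothetical helper: succeed only if baseurl points at a release docs path.
check_baseurl() {
  config="$1"
  # Take the first line that assigns baseurl (case-insensitive match).
  line="$(grep -iE '^[[:space:]]*baseurl[[:space:]]*=' "$config" | head -n 1 || true)"
  case "$line" in
    *flink-docs-master*) echo "baseurl still points to flink-docs-master"; return 1 ;;
    *flink-docs-release-*) return 0 ;;
    *) echo "baseurl not found or not recognized"; return 1 ;;
  esac
}
```

Running {{check_baseurl docs/config.toml}} from the root of the checkout returns 0 only when the value contains {{flink-docs-release-}}.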





[jira] [Created] (FLINK-34594) CLONE - Vote on the release candidate

2024-03-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34594:
---

 Summary: CLONE - Vote on the release candidate
 Key: FLINK-34594
 URL: https://issues.apache.org/jira/browse/FLINK-34594
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.19.0
Reporter: Lincoln Lee
Assignee: lincoln lee
 Fix For: 1.19.0


Once you have built and individually reviewed the release candidate, please 
share it for the community-wide review. Please review foundation-wide [voting 
guidelines|http://www.apache.org/foundation/voting.html] for more information.

Start the review-and-vote thread on the dev@ mailing list. Here’s an email 
template; please adjust as you see fit.
{quote}From: Release Manager
To: dev@flink.apache.org
Subject: [VOTE] Release 1.2.3, release candidate #3

Hi everyone,
Please review and vote on the release candidate #3 for the version 1.2.3, as 
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
 * JIRA release notes [1],
 * the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org [2], which are signed with the key with fingerprint 
 [3],
 * all artifacts to be deployed to the Maven Central Repository [4],
 * source code tag "release-1.2.3-rc3" [5],
 * website pull request listing the new release and adding announcement blog 
post [6].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.

Thanks,
Release Manager

[1] link
[2] link
[3] [https://dist.apache.org/repos/dist/release/flink/KEYS]
[4] link
[5] link
[6] link
{quote}
*If there are any issues found in the release candidate, reply on the vote 
thread to cancel the vote.* There’s no need to wait 72 hours. Proceed to the 
Fix Issues step below and address the problem. However, some issues don’t 
require cancellation. For example, if an issue is found in the website pull 
request, just correct it on the spot and the vote can continue as-is.

For cancelling a release, the release manager needs to send an email to the 
release candidate thread, stating that the release candidate is officially 
cancelled. Next, all artifacts created specifically for the RC in the previous 
steps need to be removed:
 * Delete the staging repository in Nexus
 * Remove the source / binary RC files from dist.apache.org
 * Delete the source code tag in git
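
The last two removal steps can be sketched as a dry run that only prints the commands. The helper name {{print_rc_cleanup}} and the remote name {{origin}} are assumptions, and the Nexus staging repository is assumed to be removed by hand in the Nexus web UI, so it is not covered here.

```shell
# Hypothetical dry-run helper: print, but do not run, the cleanup commands.
print_rc_cleanup() {
  version="$1"; rc="$2"; remote="$3"
  tag="release-${version}-rc${rc}"
  # Remove the RC directory from the dev area of dist.apache.org.
  echo "svn delete https://dist.apache.org/repos/dist/dev/flink/flink-${version}-rc${rc} -m \"Remove flink-${version}-rc${rc}\""
  # Delete the tag locally, then on the remote.
  echo "git tag -d ${tag}"
  echo "git push ${remote} :refs/tags/${tag}"
}

print_rc_cleanup "1.19.0" "1" "origin"
```

Printing first makes it possible to review the exact URL and tag name before anything is deleted.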

*If there are no issues, reply on the vote thread to close the voting.* Then, 
tally the votes in a separate email. Here’s an email template; please adjust as 
you see fit.
{quote}From: Release Manager
To: dev@flink.apache.org
Subject: [RESULT] [VOTE] Release 1.2.3, release candidate #3

I'm happy to announce that we have unanimously approved this release.

There are XXX approving votes, XXX of which are binding:
 * approver 1
 * approver 2
 * approver 3
 * approver 4

There are no disapproving votes.

Thanks everyone!
{quote}
 

h3. Expectations
 * Community votes to release the proposed candidate, with at least three 
approving PMC votes

Any issues that are raised until the vote is over should be either resolved or 
moved into the next release (if applicable).
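
The adoption rule from the email template (majority approval with at least 3 PMC affirmative votes) can be written as a small check. This sketch assumes that only binding (PMC) votes count toward the outcome and that majority means more binding +1 than binding -1 votes. The function name is made up for illustration.

```shell
# Hypothetical helper: arguments are the binding +1 and binding -1 counts.
vote_passes() {
  plus="$1"; minus="$2"
  # At least three binding +1 votes, and more +1 than -1.
  [ "$plus" -ge 3 ] && [ "$plus" -gt "$minus" ]
}
```

For example, {{vote_passes 3 0}} succeeds, while {{vote_passes 2 0}} fails because fewer than three binding approvals were cast.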





[jira] [Created] (FLINK-34591) CLONE - Build and stage Java and Python artifacts

2024-03-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34591:
---

 Summary: CLONE - Build and stage Java and Python artifacts
 Key: FLINK-34591
 URL: https://issues.apache.org/jira/browse/FLINK-34591
 Project: Flink
  Issue Type: Sub-task
Reporter: Lincoln Lee
Assignee: lincoln lee


# Create a local release branch ((!) this step cannot be skipped for minor 
releases):
{code:bash}
$ cd ./tools
tools/ $ OLD_VERSION=$CURRENT_SNAPSHOT_VERSION NEW_VERSION=$RELEASE_VERSION 
RELEASE_CANDIDATE=$RC_NUM releasing/create_release_branch.sh
{code}
 # Tag the release commit:
{code:bash}
$ git tag -s ${TAG} -m "${TAG}"
{code}
 # We now need to do several things:
 ## Create the source release archive
 ## Deploy jar artefacts to the [Apache Nexus 
Repository|https://repository.apache.org/], which is the staging area for 
deploying the jars to Maven Central
 ## Build PyFlink wheel packages
You might want to create a directory on your local machine for collecting the 
various source and binary releases before uploading them. Creating the binary 
releases is a lengthy process but you can do this on another machine (for 
example, in the "cloud"). When doing this, you can skip signing the release 
files on the remote machine, download them to your local machine and sign them 
there.
 # Build the source release:
{code:bash}
tools $ RELEASE_VERSION=$RELEASE_VERSION releasing/create_source_release.sh
{code}
 # Stage the maven artifacts:
{code:bash}
tools $ releasing/deploy_staging_jars.sh
{code}
Review all staged artifacts ([https://repository.apache.org/]). They should 
contain all relevant parts for each module, including pom.xml, jar, test jar, 
source, test source, javadoc, etc. Carefully review any new artifacts.
 # Close the staging repository on Apache Nexus. When prompted for a 
description, enter “Apache Flink, version X, release candidate Y”.
Then, you need to build the PyFlink wheel packages (since 1.11):
 # Set up an Azure pipeline in your own Azure account. You can refer to [Azure 
Pipelines|https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-Tutorial:SettingupAzurePipelinesforaforkoftheFlinkrepository]
 for more details on how to set up an Azure pipeline for a fork of the Flink 
repository. Note that a Google Cloud mirror in Europe is used for downloading 
Maven artifacts, so it is recommended to set your [Azure organization 
region|https://docs.microsoft.com/en-us/azure/devops/organizations/accounts/change-organization-location]
 to Europe to speed up the downloads.
 # Push the release candidate branch to your forked personal Flink repository, 
e.g.
{code:bash}
tools $ git push  
refs/heads/release-${RELEASE_VERSION}-rc${RC_NUM}:release-${RELEASE_VERSION}-rc${RC_NUM}
{code}
 # Trigger the Azure Pipelines manually to build the PyFlink wheel packages
 ## Go to your Azure Pipelines Flink project → Pipelines
 ## Click the "New pipeline" button on the top right
 ## Select "GitHub" → your GitHub Flink repository → "Existing Azure Pipelines 
YAML file"
 ## Select your branch → Set path to "/azure-pipelines.yaml" → click on 
"Continue" → click on "Variables"
 ## Then click "New Variable" button, fill the name with "MODE", and the value 
with "release". Click "OK" to set the variable and the "Save" button to save 
the variables, then back on the "Review your pipeline" screen click "Run" to 
trigger the build.
 ## You should now see a build where only the "CI build (release)" is running
 # Download the PyFlink wheel packages from the build result page after the 
jobs of "build_wheels mac" and "build_wheels linux" have finished.
 ## Download the PyFlink wheel packages
 ### Open the build result page of the pipeline
 ### Go to the {{Artifacts}} page (build_wheels linux -> 1 artifact)
 ### Click {{wheel_Darwin_build_wheels mac}} and {{wheel_Linux_build_wheels 
linux}} separately to download the zip files
 ## Unzip these two zip files
{code:bash}
$ cd /path/to/downloaded_wheel_packages
$ unzip wheel_Linux_build_wheels\ linux.zip
$ unzip wheel_Darwin_build_wheels\ mac.zip{code}
 ## Create directory {{./dist}} under the directory of {{{}flink-python{}}}:
{code:bash}
$ cd 
$ mkdir flink-python/dist{code}
 ## Move the unzipped wheel packages to the directory of 
{{{}flink-python/dist{}}}:
{code:bash}
$ mv /path/to/wheel_Darwin_build_wheels\ mac/* flink-python/dist/
$ mv /path/to/wheel_Linux_build_wheels\ linux/* flink-python/dist/
$ cd tools{code}

Finally, we create the binary convenience release files:
{code:bash}
tools $ RELEASE_VERSION=$RELEASE_VERSION releasing/create_binary_release.sh
{code}
If you want to run this step in parallel on a remote machine you have to make 
the release commit available there (for example by p

Re: [VOTE] Release 1.19.0, release candidate #1

2024-03-06 Thread Lincoln Lee
Hi Yangze,

Thanks for letting us know. This has been discussed among the RMs, and we agree
to include this fix in 1.19.0.

Thanks also to Zhu Zhu for helping to confirm this issue!

Best,
Yun, Jing, Martijn and Lincoln


On Wed, Mar 6, 2024 at 21:03, Zhu Zhu wrote:

> +1 to include the fix of FLINK-34105 in 1.19.0.
> Existing large scale jobs, including streaming jobs and batch jobs,
> may become unstable without the fix.
>
> Thanks,
> Zhu
>
> On Wed, Mar 6, 2024 at 14:13, Yangze Guo wrote:
>
> > Hi, Lincoln,
> >
> > We are currently addressing FLINK-34105, where we have encountered
> > an Akka timeout during the TPC-DS benchmarks. We plan to revert the
> > changes made in FLINK-33532, and we anticipate that the corresponding
> > pull request will be ready today. Could you include this patch in
> > rc2?
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Mar 6, 2024 at 2:00 PM Lincoln Lee wrote:
> > >
> > > Hi everyone,
> > >
> > > This RC is officially cancelled because the Python wheel packages are
> > > incomplete: the 2 newly added py3.11 packages for macOS (both 10.9 and
> > > 11.0) are missing. This will be fixed in [1].
> > >
> > > A new rc will be created.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-34582
> > >
> > > Best,
> > > Yun, Jing, Martijn and Lincoln
> > >
> > > On Wed, Mar 6, 2024 at 00:24, Lincoln Lee wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Please review and vote on the release candidate #1 for the version
> > > > 1.19.0, as follows:
> > > > [ ] +1, Approve the release
> > > > [ ] -1, Do not approve the release (please provide specific comments)
> > > >
> > > > The complete staging area is available for your review, which includes:
> > > >
> > > > * JIRA release notes [1], and the pull request adding release note for
> > > > users [2]
> > > > * the official Apache source release and binary convenience releases to
> > > > be deployed to dist.apache.org [3], which are signed with the key with
> > > > fingerprint E57D30ABEE75CA06  [4],
> > > > * all artifacts to be deployed to the Maven Central Repository [5],
> > > > * source code tag "release-1.19.0-rc1" [6],
> > > > * website pull request listing the new release and adding announcement
> > > > blog post [7].
> > > >
> > > > The vote will be open for at least 72 hours. It is adopted by majority
> > > > approval, with at least 3 PMC affirmative votes.
> > > >
> > > > [1]
> > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
> > > > [2] https://github.com/apache/flink/pull/24394
> > > > [3] https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc1/
> > > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > > [5] https://repository.apache.org/content/repositories/orgapacheflink-1708
> > > > [6] https://github.com/apache/flink/releases/tag/release-1.19.0-rc1
> > > > [7] https://github.com/apache/flink-web/pull/721
> > > >
> > > > Best,
> > > > Yun, Jing, Martijn and Lincoln
> > > >
> >
>


[ANNOUNCE] Flink 1.19.0 rc1 cancelled & sync summary on 03/05/2024

2024-03-05 Thread Lincoln Lee
Hi devs,

As mentioned in the vote thread [1], rc1 has been cancelled because the Python
wheel packages are incomplete: the 2 newly added py3.11 packages for macOS (both
10.9 and 11.0) are missing. This will be fixed in [2].
If there are no new blockers, we'll start a new RC soon.

Here are some highlights from the release sync on 03/05/2024:

- Test instabilities
A new failure [3] came up and has been confirmed not to be a blocker.

- Release notes

The release note [4] and announcement [5] PRs are ready for review. Please
help review them before the 1.19 release, thanks!

- Sync meeting (https://meet.google.com/vcx-arzs-trv)

We've already switched to weekly release sync, so the next release sync
will be on Mar 12th, 2024. Feel free to join!


[1] https://lists.apache.org/thread/10bxy1zhzy6hycjyohyl3pzx3xs3zh34
[2] https://issues.apache.org/jira/browse/FLINK-34582
[3] https://issues.apache.org/jira/browse/FLINK-34571
[4] https://github.com/apache/flink/pull/24394
[5] https://github.com/apache/flink-web/pull/721

Best,
Yun, Jing, Martijn and Lincoln


Re: [VOTE] Release 1.19.0, release candidate #1

2024-03-05 Thread Lincoln Lee
Hi everyone,

This RC is officially cancelled because the Python wheel packages are
incomplete: the 2 newly added py3.11 packages for macOS (both 10.9 and
11.0) are missing. This will be fixed in [1].

A new rc will be created.

[1] https://issues.apache.org/jira/browse/FLINK-34582

Best,
Yun, Jing, Martijn and Lincoln

On Wed, Mar 6, 2024 at 00:24, Lincoln Lee wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 1.19.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
>
> * JIRA release notes [1], and the pull request adding release note for
> users [2]
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [3], which are signed with the key with
> fingerprint E57D30ABEE75CA06  [4],
> * all artifacts to be deployed to the Maven Central Repository [5],
> * source code tag "release-1.19.0-rc1" [6],
> * website pull request listing the new release and adding announcement
> blog post [7].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
> [2] https://github.com/apache/flink/pull/24394
> [3] https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc1/
> [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> [5] https://repository.apache.org/content/repositories/orgapacheflink-1708
> [6] https://github.com/apache/flink/releases/tag/release-1.19.0-rc1
> [7] https://github.com/apache/flink-web/pull/721
>
> Best,
> Yun, Jing, Martijn and Lincoln
>


[jira] [Created] (FLINK-34582) release build tools lost the newly added py3.11 packages for mac

2024-03-05 Thread lincoln lee (Jira)
lincoln lee created FLINK-34582:
---

 Summary: release build tools lost the newly added py3.11 packages 
for mac
 Key: FLINK-34582
 URL: https://issues.apache.org/jira/browse/FLINK-34582
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.19.0, 1.20.0
Reporter: lincoln lee
Assignee: Xingbo Huang


While building the 1.19.0-rc1 binaries via
tools/releasing/create_binary_release.sh, the 2 newly added py3.11 packages for
macOS were lost.
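
A check along these lines could catch missing packages before a candidate is announced. It is only a sketch under assumptions not stated in this issue: the wheels are expected in a single directory, their file names are assumed to carry a CPython tag such as cp311 and a platform tag containing macosx, and the helper names and the expected count of 2 per Python version (one each for macOS 10.9 and 11.0) are illustrative.

```shell
# Hypothetical helper: count macOS wheels for one CPython tag in a directory.
count_mac_wheels() {
  dir="$1"; pytag="$2"
  n=0
  for f in "$dir"/*-"$pytag"-*macosx*.whl; do
    # An unmatched glob stays literal, so only count files that exist.
    if [ -e "$f" ]; then n=$((n + 1)); fi
  done
  echo "$n"
}

# Hypothetical check: expect two macOS wheels (10.9 and 11.0) per Python tag.
check_mac_wheels() {
  dir="$1"; shift
  status=0
  for pytag in "$@"; do
    if [ "$(count_mac_wheels "$dir" "$pytag")" -ne 2 ]; then
      echo "missing macOS wheel(s) for ${pytag}"
      status=1
    fi
  done
  return "$status"
}
```

A call such as {{check_mac_wheels flink-python/dist cp311}} would then fail whenever fewer or more than two macOS wheels are found for that Python version.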





[VOTE] Release 1.19.0, release candidate #1

2024-03-05 Thread Lincoln Lee
Hi everyone,

Please review and vote on the release candidate #1 for the version 1.19.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:

* JIRA release notes [1], and the pull request adding release note for
users [2]
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [3], which are signed with the key with
fingerprint E57D30ABEE75CA06  [4],
* all artifacts to be deployed to the Maven Central Repository [5],
* source code tag "release-1.19.0-rc1" [6],
* website pull request listing the new release and adding announcement blog
post [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
[2] https://github.com/apache/flink/pull/24394
[3] https://dist.apache.org/repos/dist/dev/flink/flink-1.19.0-rc1/
[4] https://dist.apache.org/repos/dist/release/flink/KEYS
[5] https://repository.apache.org/content/repositories/orgapacheflink-1708
[6] https://github.com/apache/flink/releases/tag/release-1.19.0-rc1
[7] https://github.com/apache/flink-web/pull/721

Best,
Yun, Jing, Martijn and Lincoln


[ANNOUNCE] Flink 1.19 Cross-team testing completed & sync summary on 02/27/2024

2024-02-27 Thread Lincoln Lee
Hi devs,

I'd like to share some highlights from the release sync on 02/27/2024

- Cross-team testing

We've finished all of the testing work[1]. Huge thanks to all contributors
and volunteers for the effort on this!

- Blockers

Two API change merge requests [2][3] were discussed. There was agreement on
the second PR, as it fixes an unintended behavior newly introduced in 1.19
that we need to avoid releasing to users. For the first PR, we suggest
continuing the discussion separately.
So we will wait for [3] to be done and then create the first release candidate,
1.19.0-rc1 (expected within this week if there are no new blockers).

- Release notes

Revision of the draft release notes [4] has been closed, and the formal PR [5]
has been submitted. The release announcement PR will also be ready later this
week. Please continue to help review before the 1.19 release, thanks!

- Sync meeting (https://meet.google.com/vcx-arzs-trv)

We've already switched to weekly release sync, so the next release sync
will be on Mar 5th, 2024. Feel free to join!

[1] https://issues.apache.org/jira/browse/FLINK-34285
[2] https://lists.apache.org/thread/2llhhbkcx5w7chp3d6cthoqc8kwfvw6x
[3] https://github.com/apache/flink/pull/24387#pullrequestreview-1902749309
[4]
https://docs.google.com/document/d/1HLF4Nhvkln4zALKJdwRErCnPzufh7Z3BhhkWlk9Zh7w
[5] https://github.com/apache/flink/pull/24394

Best,
Yun, Jing, Martijn and Lincoln


Re: [DISCUSS] Move CheckpointingMode to flink-core

2024-02-27 Thread Lincoln Lee
Hi Zakelly,

Thanks for letting us 1.19 RMs know about this!

This change was discussed during today's release sync meeting, and we
suggest not merging it into 1.19.
We can continue discussing the removal in 2.x separately.

Best,
Lincoln Lee


On Tue, Feb 27, 2024 at 11:28, Hangxiang Yu wrote:

> Hi, Zakelly.
> Thanks for driving this.
> Moving this class to flink-core makes sense to me which could make the code
> path and configs clearer.
> It's marked as @Public from 1.0 and 1.20 should be the next long-term
> version, so 1.19 should have been a suitable version to do it.
> And also look forward to thoughts of other developers/RMs since 1.19 is
> currently under a feature freeze status.
>
> On Mon, Feb 26, 2024 at 6:42 PM Zakelly Lan  wrote:
>
> > Hi devs,
> >
> > When working on the FLIP-406[1], I realized that moving all options of
> > ExecutionCheckpointingOptions(flink-streaming-java) to
> > CheckpointingOptions(flink-core) depends on relocating the
> > enum CheckpointingMode(flink-streaming-java) to flink-core module. However,
> > the CheckpointingMode is annotated as @Public and used by datastream api
> > like 'CheckpointConfig#setCheckpointingMode'. So I'd like to start a
> > discussion on moving the CheckpointingMode to flink-core. It is in a little
> > bit of a hurry if we want the old enum to be entirely removed in Flink 2.x
> > series, since the deprecation should be shipped in the upcoming Flink 1.19.
> > I suggest not creating a dedicated FLIP and treating this as a sub-task of
> > FLIP-406.
> >
> > I prepared a minimal change of providing new APIs and deprecating the old
> > ones[2], which could be merged to 1.19 if we agree to do so.
> >
> > Looking forward to your thoughts! Also cc RMs of 1.19 about this.
> >
> > [1]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789560
> > [2]
> > https://github.com/apache/flink/commit/9bdd237d0322df8853f1b9e6ae658f77b9175237
> >
> > Best,
> > Zakelly
> >
>
>
> --
> Best,
> Hangxiang.
>


[jira] [Created] (FLINK-34534) CLONE - Vote on the release candidate

2024-02-27 Thread lincoln lee (Jira)
lincoln lee created FLINK-34534:
---

 Summary: CLONE - Vote on the release candidate
 Key: FLINK-34534
 URL: https://issues.apache.org/jira/browse/FLINK-34534
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.17.0
Reporter: Matthias Pohl
Assignee: Qingsheng Ren
 Fix For: 1.17.0


Once you have built and individually reviewed the release candidate, please 
share it for the community-wide review. Please review foundation-wide [voting 
guidelines|http://www.apache.org/foundation/voting.html] for more information.

Start the review-and-vote thread on the dev@ mailing list. Here’s an email 
template; please adjust as you see fit.
{quote}From: Release Manager
To: dev@flink.apache.org
Subject: [VOTE] Release 1.2.3, release candidate #3

Hi everyone,
Please review and vote on the release candidate #3 for the version 1.2.3, as 
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
 * JIRA release notes [1],
 * the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org [2], which are signed with the key with fingerprint 
 [3],
 * all artifacts to be deployed to the Maven Central Repository [4],
 * source code tag "release-1.2.3-rc3" [5],
 * website pull request listing the new release and adding announcement blog 
post [6].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.

Thanks,
Release Manager

[1] link
[2] link
[3] [https://dist.apache.org/repos/dist/release/flink/KEYS]
[4] link
[5] link
[6] link
{quote}
*If there are any issues found in the release candidate, reply on the vote 
thread to cancel the vote.* There’s no need to wait 72 hours. Proceed to the 
Fix Issues step below and address the problem. However, some issues don’t 
require cancellation. For example, if an issue is found in the website pull 
request, just correct it on the spot and the vote can continue as-is.

For cancelling a release, the release manager needs to send an email to the 
release candidate thread, stating that the release candidate is officially 
cancelled. Next, all artifacts created specifically for the RC in the previous 
steps need to be removed:
 * Delete the staging repository in Nexus
 * Remove the source / binary RC files from dist.apache.org
 * Delete the source code tag in git

*If there are no issues, reply on the vote thread to close the voting.* Then, 
tally the votes in a separate email. Here’s an email template; please adjust as 
you see fit.
{quote}From: Release Manager
To: dev@flink.apache.org
Subject: [RESULT] [VOTE] Release 1.2.3, release candidate #3

I'm happy to announce that we have unanimously approved this release.

There are XXX approving votes, XXX of which are binding:
 * approver 1
 * approver 2
 * approver 3
 * approver 4

There are no disapproving votes.

Thanks everyone!
{quote}
 

h3. Expectations
 * Community votes to release the proposed candidate, with at least three 
approving PMC votes

Any issues that are raised until the vote is over should be either resolved or 
moved into the next release (if applicable).





[jira] [Created] (FLINK-34533) CLONE - Propose a pull request for website updates

2024-02-27 Thread lincoln lee (Jira)
lincoln lee created FLINK-34533:
---

 Summary: CLONE - Propose a pull request for website updates
 Key: FLINK-34533
 URL: https://issues.apache.org/jira/browse/FLINK-34533
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.17.0
Reporter: Matthias Pohl
Assignee: Qingsheng Ren
 Fix For: 1.17.0


The final step of building the candidate is to propose a website pull request 
containing the following changes:
 # update 
[apache/flink-web:_config.yml|https://github.com/apache/flink-web/blob/asf-site/_config.yml]
 ## update {{FLINK_VERSION_STABLE}} and {{FLINK_VERSION_STABLE_SHORT}} as 
required
 ## update version references in quickstarts ({{{}q/{}}} directory) as required
 ## (major only) add a new entry to {{flink_releases}} for the release binaries 
and sources
 ## (minor only) update the entry for the previous release in the series in 
{{flink_releases}}
 ### Please pay attention to the IDs assigned to the download entries. They should 
be unique and reflect their corresponding version number.
 ## add a new entry to {{release_archive.flink}}
 # add a blog post announcing the release in _posts
 # add an organized release notes page under docs/content/release-notes and 
docs/content.zh/release-notes (like 
[https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/]).
 The page is based on the non-empty release notes collected from the issues, 
and only the issues that affect existing users should be included (rather 
than, e.g., new functionality). It should be in a separate PR since it is 
merged into the flink project.

(!) Don’t merge the PRs before finalizing the release.

 

h3. Expectations
 * Website pull request proposed to list the 
[release|http://flink.apache.org/downloads.html]
 * (major only) Check {{docs/config.toml}} to ensure that
 ** the version constants refer to the new version
 ** the {{baseurl}} does not point to {{flink-docs-master}}  but 
{{flink-docs-release-X.Y}} instead





[jira] [Created] (FLINK-34532) CLONE - Stage source and binary releases on dist.apache.org

2024-02-27 Thread lincoln lee (Jira)
lincoln lee created FLINK-34532:
---

 Summary: CLONE - Stage source and binary releases on 
dist.apache.org
 Key: FLINK-34532
 URL: https://issues.apache.org/jira/browse/FLINK-34532
 Project: Flink
  Issue Type: Sub-task
Reporter: Matthias Pohl
Assignee: Qingsheng Ren


Copy the source release to the dev repository of dist.apache.org:
# If you have not already, check out the Flink section of the dev repository on 
dist.apache.org via Subversion. In a fresh directory:
{code:bash}
$ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
{code}
# Make a directory for the new release and copy all the artifacts (Flink 
source/binary distributions, hashes, GPG signatures and the python 
subdirectory) into that newly created directory:
{code:bash}
$ mkdir flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
$ mv /tools/releasing/release/* 
flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
{code}
# Add and commit all the files.
{code:bash}
$ cd flink
flink $ svn add flink-${RELEASE_VERSION}-rc${RC_NUM}
flink $ svn commit -m "Add flink-${RELEASE_VERSION}-rc${RC_NUM}"
{code}
# Verify that files are present under 
[https://dist.apache.org/repos/dist/dev/flink|https://dist.apache.org/repos/dist/dev/flink].
# Push the release tag if not done already (the following command assumes it is 
called from within the apache/flink checkout):
{code:bash}
$ git push  refs/tags/release-${RELEASE_VERSION}-rc${RC_NUM}
{code}

 

h3. Expectations
 * Maven artifacts deployed to the staging repository of 
[repository.apache.org|https://repository.apache.org/content/repositories/]
 * Source distribution deployed to the dev repository of 
[dist.apache.org|https://dist.apache.org/repos/dist/dev/flink/]
 * Check hashes (e.g. shasum -c *.sha512)
 * Check signatures (e.g. {{{}gpg --verify 
flink-1.2.3-source-release.tar.gz.asc flink-1.2.3-source-release.tar.gz{}}})
 * {{grep}} for legal headers in each file.
 * If time allows, check in advance the NOTICE files of the modules whose 
dependencies have been changed in this release, since license issues pop up 
from time to time during voting. See the "Checking License" section of 
[Verifying a Flink 
Release|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release].

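The hash and signature checks listed above can be sketched as two small shell helpers. This is only an illustration: the function names and the assumption that all artifacts sit in one flat directory are hypothetical, and {{sha512sum}} is swapped in for the {{shasum}} call mentioned above (on macOS, {{shasum -a 512 -c}} should behave the same way).

```shell
# Sketch: verify every *.sha512 file found in a directory.
# The flat directory layout is an assumption for this example.
verify_hashes() {
  dir="$1"
  failed=0
  for sum_file in "$dir"/*.sha512; do
    [ -e "$sum_file" ] || continue
    # Run the check inside the directory so relative file names resolve.
    if ! (cd "$dir" && sha512sum -c "$(basename "$sum_file")" >/dev/null 2>&1); then
      echo "hash mismatch: $sum_file" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Sketch: verify every detached signature (*.asc) against its artifact.
# Requires the signer's public key in the local GPG keyring.
verify_signatures() {
  dir="$1"
  failed=0
  for asc in "$dir"/*.asc; do
    [ -e "$asc" ] || continue
    if ! gpg --verify "$asc" "${asc%.asc}" >/dev/null 2>&1; then
      echo "bad signature: $asc" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Calling {{verify_hashes <dir> && verify_signatures <dir>}} on the staged release directory would then return a non-zero status if any artifact fails either check.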




[jira] [Created] (FLINK-34531) CLONE - Build and stage Java and Python artifacts

2024-02-27 Thread lincoln lee (Jira)
lincoln lee created FLINK-34531:
---

 Summary: CLONE - Build and stage Java and Python artifacts
 Key: FLINK-34531
 URL: https://issues.apache.org/jira/browse/FLINK-34531
 Project: Flink
  Issue Type: Sub-task
Reporter: Matthias Pohl
Assignee: Qingsheng Ren


# Create a local release branch ((!) this step cannot be skipped for minor 
releases):
{code:bash}
$ cd ./tools
tools/ $ OLD_VERSION=$CURRENT_SNAPSHOT_VERSION NEW_VERSION=$RELEASE_VERSION 
RELEASE_CANDIDATE=$RC_NUM releasing/create_release_branch.sh
{code}
 # Tag the release commit:
{code:bash}
$ git tag -s ${TAG} -m "${TAG}"
{code}
 # We now need to do several things:
 ## Create the source release archive
 ## Deploy jar artefacts to the [Apache Nexus 
Repository|https://repository.apache.org/], which is the staging area for 
deploying the jars to Maven Central
 ## Build PyFlink wheel packages
You might want to create a directory on your local machine for collecting the 
various source and binary releases before uploading them. Creating the binary 
releases is a lengthy process but you can do this on another machine (for 
example, in the "cloud"). When doing this, you can skip signing the release 
files on the remote machine, download them to your local machine and sign them 
there.
 # Build the source release:
{code:bash}
tools $ RELEASE_VERSION=$RELEASE_VERSION releasing/create_source_release.sh
{code}
 # Stage the maven artifacts:
{code:bash}
tools $ releasing/deploy_staging_jars.sh
{code}
Review all staged artifacts ([https://repository.apache.org/]). They should 
contain all relevant parts for each module, including pom.xml, jar, test jar, 
source, test source, javadoc, etc. Carefully review any new artifacts.
 # Close the staging repository on Apache Nexus. When prompted for a 
description, enter “Apache Flink, version X, release candidate Y”.
Then, you need to build the PyFlink wheel packages (since 1.11):
 # Set up an azure pipeline in your own Azure account. You can refer to [Azure 
Pipelines|https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-Tutorial:SettingupAzurePipelinesforaforkoftheFlinkrepository]
 for more details on how to set up azure pipeline for a fork of the Flink 
repository. Note that a google cloud mirror in Europe is used for downloading 
maven artifacts, therefore it is recommended to set your [Azure organization 
region|https://docs.microsoft.com/en-us/azure/devops/organizations/accounts/change-organization-location]
 to Europe to speed up the downloads.
 # Push the release candidate branch to your forked personal Flink repository, 
e.g.
{code:bash}
tools $ git push  
refs/heads/release-${RELEASE_VERSION}-rc${RC_NUM}:release-${RELEASE_VERSION}-rc${RC_NUM}
{code}
 # Trigger the Azure Pipelines manually to build the PyFlink wheel packages
 ## Go to your Azure Pipelines Flink project → Pipelines
 ## Click the "New pipeline" button on the top right
 ## Select "GitHub" → your GitHub Flink repository → "Existing Azure Pipelines 
YAML file"
 ## Select your branch → Set path to "/azure-pipelines.yaml" → click on 
"Continue" → click on "Variables"
 ## Then click "New Variable" button, fill the name with "MODE", and the value 
with "release". Click "OK" to set the variable and the "Save" button to save 
the variables, then back on the "Review your pipeline" screen click "Run" to 
trigger the build.
 ## You should now see a build where only the "CI build (release)" is running
 # Download the PyFlink wheel packages from the build result page after the 
jobs of "build_wheels mac" and "build_wheels linux" have finished.
 ## Download the PyFlink wheel packages
 ### Open the build result page of the pipeline
 ### Go to the {{Artifacts}} page (build_wheels linux -> 1 artifact)
 ### Click {{wheel_Darwin_build_wheels mac}} and {{wheel_Linux_build_wheels 
linux}} separately to download the zip files
 ## Unzip these two zip files
{code:bash}
$ cd /path/to/downloaded_wheel_packages
$ unzip wheel_Linux_build_wheels\ linux.zip
$ unzip wheel_Darwin_build_wheels\ mac.zip{code}
 ## Create directory {{./dist}} under the directory of {{{}flink-python{}}}:
{code:bash}
$ cd 
$ mkdir flink-python/dist{code}
 ## Move the unzipped wheel packages to the directory of 
{{{}flink-python/dist{}}}:
{code:bash}
$ mv /path/to/wheel_Darwin_build_wheels\ mac/* flink-python/dist/
$ mv /path/to/wheel_Linux_build_wheels\ linux/* flink-python/dist/
$ cd tools{code}

Finally, we create the binary convenience release files:
{code:bash}
tools $ RELEASE_VERSION=$RELEASE_VERSION releasing/create_binary_release.sh
{code}
If you want to run this step in parallel on a remote machine you have to make 
the release commit available there (for exa

[jira] [Created] (FLINK-34530) Build Release Candidate: 1.19.0-rc1

2024-02-27 Thread lincoln lee (Jira)
lincoln lee created FLINK-34530:
---

 Summary: Build Release Candidate: 1.19.0-rc1
 Key: FLINK-34530
 URL: https://issues.apache.org/jira/browse/FLINK-34530
 Project: Flink
  Issue Type: New Feature
Affects Versions: 1.17.0
Reporter: Matthias Pohl
Assignee: Jing Ge
 Fix For: 1.17.0


The core of the release process is the build-vote-fix cycle. Each cycle 
produces one release candidate. The Release Manager repeats this cycle until 
the community approves one release candidate, which is then finalized.

h4. Prerequisites
Set up a few environment variables to simplify the Maven commands that follow. 
They identify the release candidate being built. Start with {{RC_NUM}} equal to 1 
and increment it for each candidate:
{code}
RC_NUM="1"
TAG="release-${RELEASE_VERSION}-rc${RC_NUM}"
{code}
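
As a rough illustration of how these two variables fit together, the following sketch derives the tag from a version and a candidate number. The helper name {{rc_tag}}, its input check, and the fallback version {{1.2.3}} are assumptions made for this example; they are not part of the release scripts.

```shell
# Hypothetical helper: print the release-candidate tag for a version and RC number.
rc_tag() {
  # $1 = release version (e.g. 1.19.0), $2 = candidate number (e.g. 1)
  case "$2" in
    ''|*[!0-9]*)
      echo "RC number must be a non-negative integer: '$2'" >&2
      return 1
      ;;
  esac
  printf 'release-%s-rc%s\n' "$1" "$2"
}

# RELEASE_VERSION is assumed to be exported earlier; 1.2.3 is only a placeholder.
RC_NUM="1"
TAG="$(rc_tag "${RELEASE_VERSION:-1.2.3}" "$RC_NUM")"
echo "$TAG"
```

For the next candidate, only {{RC_NUM}} changes and the tag is derived again, which keeps the tag and the candidate number from drifting apart.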





[jira] [Created] (FLINK-34523) org.apache.maven.surefire:surefire-shared-utils:jar:3.2.2 couldn't be downloaded

2024-02-26 Thread lincoln lee (Jira)
lincoln lee created FLINK-34523:
---

 Summary: org.apache.maven.surefire:surefire-shared-utils:jar:3.2.2 
couldn't be downloaded
 Key: FLINK-34523
 URL: https://issues.apache.org/jira/browse/FLINK-34523
 Project: Flink
  Issue Type: Bug
  Components: Build System / CI, BuildSystem / Shaded
Affects Versions: 1.19.0
Reporter: Matthias Pohl


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56728&view=logs&j=59a2b95a-736b-5c46-b3e0-cee6e587fd86&t=c301da75-e699-5c06-735f-778207c16f50&l=1368

{code}
Jan 23 00:55:44 00:55:44.078 [ERROR] Failed to execute goal on project 
flink-core: Could not resolve dependencies for project 
org.apache.flink:flink-core:jar:1.19-SNAPSHOT: Could not transfer artifact 
org.apache.flink:flink-shaded-jackson:jar:2.15.3-18.0 from/to 
google-maven-central 
(https://maven-central-eu.storage-download.googleapis.com/maven2/): GET request 
of: 
org/apache/flink/flink-shaded-jackson/2.15.3-18.0/flink-shaded-jackson-2.15.3-18.0.jar
 from google-maven-central failed: Connection reset -> [Help 1]
Jan 23 00:55:44 00:55:44.078 [ERROR] 
Jan 23 00:55:44 00:55:44.078 [ERROR] To see the full stack trace of the errors, 
re-run Maven with the -e switch.
Jan 23 00:55:44 00:55:44.078 [ERROR] Re-run Maven using the -X switch to enable 
full debug logging.
Jan 23 00:55:44 00:55:44.078 [ERROR] 
Jan 23 00:55:44 00:55:44.078 [ERROR] For more information about the errors and 
possible solutions, please read the following articles:
Jan 23 00:55:44 00:55:44.078 [ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
Jan 23 00:55:44 00:55:44.078 [ERROR] 
Jan 23 00:55:44 00:55:44.078 [ERROR] After correcting the problems, you can 
resume the build with the command
Jan 23 00:55:44 00:55:44.078 [ERROR]   mvn  -rf :flink-core
{code}





[ANNOUNCE] Flink 1.19 Cross-team testing & sync summary on 02/20/2024

2024-02-20 Thread Lincoln Lee
Hi devs,

I'd like to share some highlights from the release sync on 02/20/2024


*- Cross-team testing*
Only one test instruction remains to be finalized[1]. More than half of the
identified test tasks are completed: 8 of the 15 test work items are
finished, 3 are in progress, and volunteers have been identified for the
remaining 4.


*- Release notes*
We've drafted the release notes[2] based on the content of the 'Release Notes'
field of the JIRA tickets. Please use ‘Suggest edit’[3] mode to propose
modifications.
Revisions to the draft will continue until the next release sync on
02/27, after which a formal PR will be submitted for review.


*- CI issues*
There are some CI test or instability issues that are under evaluation[4].


*- Sync meeting (https://meet.google.com/vcx-arzs-trv)*
We've already switched to weekly release sync, so the next release sync
will be on Feb 27th, 2024. Feel free to join!

[1] https://issues.apache.org/jira/browse/FLINK-34305
[2]
https://docs.google.com/document/d/1HLF4Nhvkln4zALKJdwRErCnPzufh7Z3BhhkWlk9Zh7w/edit
[3]
https://support.google.com/docs/answer/6033474?hl=en&co=GENIE.Platform%3DDesktop
[4] https://cwiki.apache.org/confluence/display/FLINK/1.19+Release

Best,
Yun, Jing, Martijn and Lincoln


Re: [ANNOUNCE] New Apache Flink Committer - Jiabao Sun

2024-02-19 Thread Lincoln Lee
Congratulations Jiabao!

Best,
Lincoln Lee


On Mon, Feb 19, 2024 at 18:25, Leonard Xu wrote:

> Congratulations, Jiabao! Well deserved.
>
>
> Best,
> Leonard
>
>
> > On Feb 19, 2024, at 6:21 PM, David Radley wrote:
> >
> > Congratulations Jiabao!
> >
> > From: Swapnal Varma 
> > Date: Monday, 19 February 2024 at 10:14
> > To: dev@flink.apache.org 
> > Subject: [EXTERNAL] Re: [ANNOUNCE] New Apache Flink Committer - Jiabao
> Sun
> > Congratulations Jiabao!
> >
> > Best,
> > Swapnal
> >
> > On Mon, 19 Feb 2024, 15:37 weijie guo, 
> wrote:
> >
> >> Congratulations, Jiabao :)
> >>
> >> Best regards,
> >>
> >> Weijie
> >>
> >>
> >> On Mon, Feb 19, 2024 at 18:04, Hang Ruan wrote:
> >>
> >>> Congratulations, Jiabao!
> >>>
> >>> Best,
> >>> Hang
> >>>
> >>> On Mon, Feb 19, 2024 at 17:53, Qingsheng Ren wrote:
> >>>
> >>>> Hi everyone,
> >>>>
> >>>> On behalf of the PMC, I'm happy to announce Jiabao Sun as a new Flink
> >>>> Committer.
> >>>>
> >>>> Jiabao began contributing in August 2022 and has contributed 60+
> >> commits
> >>>> for Flink main repo and various connectors. His most notable
> >> contribution
> >>>> is being the core author and maintainer of MongoDB connector, which is
> >>>> fully functional in DataStream and Table/SQL APIs. Jiabao is also the
> >>>> author of FLIP-377 and the main contributor of JUnit 5 migration in
> >>> runtime
> >>>> and table planner modules.
> >>>>
> >>>> Beyond his technical contributions, Jiabao is an active member of our
> >>>> community, participating in the mailing list and consistently
> >>> volunteering
> >>>> for release verifications and code reviews with enthusiasm.
> >>>>
> >>>> Please join me in congratulating Jiabao for becoming an Apache Flink
> >>>> committer!
> >>>>
> >>>> Best,
> >>>> Qingsheng (on behalf of the Flink PMC)
> >>>>
> >>>
> >>
> >
> > Unless otherwise stated above:
> >
> > IBM United Kingdom Limited
> > Registered in England and Wales with number 741598
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>
>


[jira] [Created] (FLINK-34465) Python py38-cython: commands failed, Bash exited with code '1

2024-02-19 Thread lincoln lee (Jira)
lincoln lee created FLINK-34465:
---

 Summary: Python py38-cython: commands failed, Bash exited with 
code '1
 Key: FLINK-34465
 URL: https://issues.apache.org/jira/browse/FLINK-34465
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.18.0
Reporter: Sergey Nuyanzin
Assignee: Huang Xingbo


{noformat}
Apr 06 05:55:13 ___ summary 

Apr 06 05:55:13 ERROR:   py37-cython: commands failed
Apr 06 05:55:13   py38-cython: commands succeeded
Apr 06 05:55:13   py39-cython: commands succeeded
Apr 06 05:55:13   py310-cython: commands succeeded
{noformat}
The logs contain the following error for py37; it is not clear whether it is related:
{noformat}
Apr 06 04:26:30  ERROR at setup of 
ProcessWindowTests.test_count_sliding_window 
Apr 06 04:26:30 
Apr 06 04:26:30 cls = 
Apr 06 04:26:30 
Apr 06 04:26:30 @classmethod
Apr 06 04:26:30 def setUpClass(cls):
Apr 06 04:26:30 super(PyFlinkStreamingTestCase, cls).setUpClass()
Apr 06 04:26:30 cls.env.set_parallelism(2)
Apr 06 04:26:30 >   cls.env.set_runtime_mode(RuntimeExecutionMode.STREAMING)
Apr 06 04:26:30 
Apr 06 04:26:30 pyflink/testing/test_case_utils.py:193: 
Apr 06 04:26:30 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ 
 {noformat}
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=47960&view=logs&j=821b528f-1eed-5598-a3b4-7f748b13f261&t=6bb545dd-772d-5d8c-f258-f5085fba3295&l=24593]





Re: [DISCUSS] Alternative way of posting FLIPs

2024-02-07 Thread Lincoln Lee
Thanks Martijn for moving this forward!

+1 for the first solution. As of now this looks like a temporary measure,
and we're still looking forward to the improvement by ASF Infra. Once
contributors have access again, we can go back to the current workflow.

For solution 2, one visible downside is that it becomes inconvenient to
look for FLIPs (unless we permanently switch to GitHub Discussions).

Looking forward to hearing more thoughts.

Best,
Lincoln Lee


On Wed, Feb 7, 2024 at 21:51, Martijn Visser wrote:

> Hi all,
>
> ASF Infra has confirmed to me that only ASF committers can access the
> ASF Confluence site since a recent change. One of the results of this
> decision is that users can't signup and access Confluence, so only
> committers+ can create FLIPs.
>
> ASF Infra hopes to improve this situation when they move to the Cloud
> shortly (as in: some months), but they haven't committed on an actual
> date. The idea would be that we find a temporary solution until anyone
> can request access to Confluence.
>
> There are a couple of ways we could resolve this situation:
> 1. Contributors create a Google Doc and make that view-only, and post
> that Google Doc to the mailing list for a discussion thread. When the
> discussions have been resolved, the contributor ask on the Dev mailing
> list to a committer/PMC to copy the contents from the Google Doc, and
> create a FLIP number for them. The contributor can then use that FLIP
> to actually have a VOTE thread.
> 2. We could consider moving FLIPs to "Discussions" on Github, like
> Airflow does at https://github.com/apache/airflow/discussions
> 3. Perhaps someone else has another good idea.
>
> Looking forward to your thoughts.
>
> Best regards,
>
> Martijn
>


[ANNOUNCE] release-1.19 branch cut

2024-02-06 Thread Lincoln Lee
Hi devs,

The release-1.19 branch has been forked out from the master branch, with
commit ID 3d67a2b9b208ed662f81c1ef43d76d10a82a7d88.
The version on the master branch has been upgraded to 1.20-SNAPSHOT.

From now on, for PRs that should be presented in 1.19.0, please make sure:
* Merge the PR into both master and release-1.19 branches
* The JIRA ticket should be closed with the correct fix-versions (1.19.0).

Release testing has already started. There are several FLIPs for which we
still need to confirm whether cross-team testing is required[1].
Contributors are also encouraged to create tickets for any other features
that need cross-team testing (just create a new testing ticket titled
'Release Testing: Verify ...', without the 'Instructions' keyword).

We plan to finish all release testing within two weeks after the Chinese
New Year holiday (by Feb 29, 2024). Please update the “X-team
verified” column in the 1.19 release wiki page [2] in the meantime.

Also, we’d like to thank all contributors who put effort into stabilizing
the CI on the master branch in the past weeks, and look forward to
stabilizing new features in the coming weeks.

Good luck with your release testing!

[1]  https://issues.apache.org/jira/browse/FLINK-34285
[2] https://cwiki.apache.org/confluence/display/FLINK/1.19+Release

Best,
Yun, Jing, Martijn and Lincoln


[SUMMARY] Flink 1.19 Release Sync 02/06/2024

2024-02-06 Thread Lincoln Lee
Hi devs,

I'd like to share some highlights from the release sync on 02/06/2024

*- Cutting release branch*

We are waiting for the last blocker[1] fix to be completed. After
confirming that the fix has been merged, we will cut the release branch
(expected on 02/07 AM CET (UTC+1)), and we will make another
announcement on the dev mailing list then.



*- Cross-team testing*
Release testing has already started. In the meantime, there are still several
FLIPs for which we need to confirm whether cross-team testing is required[2].
The release managers have created tickets covering all the features listed on
the 1.19 wiki page[3] as well as other completed FLIPs.
Contributors are also encouraged to create tickets for any other features
that need cross-team testing (just create a new testing ticket titled
'Release Testing: Verify ...', without the 'Instructions' keyword).


*- Release notes* [Highlighted again]

For new features and behavior changes that still lack a 'Release Note', please
help fill in that field in JIRA (click the Edit button and scroll to the
middle of the page). This matters especially because 1.19 involves a lot of
deprecations, which are important for users and will be part of the release
announcement.


- *Sync meeting* (https://meet.google.com/vcx-arzs-trv)

We've already switched to weekly release sync, so the next release sync
will be on Feb 13th, 2024. Feel free to join!


[1] https://issues.apache.org/jira/browse/FLINK-34337
[2] https://issues.apache.org/jira/browse/FLINK-34285
[3] https://cwiki.apache.org/confluence/display/FLINK/1.19+Release


Best,
Yun, Jing, Martijn and Lincoln


[jira] [Created] (FLINK-34396) Release Testing Instructions: Verify FLINK-32775 Support yarn.provided.lib.dirs to add parent directory to classpath

2024-02-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34396:
---

 Summary: Release Testing Instructions: Verify FLINK-32775 Support 
yarn.provided.lib.dirs to add parent directory to classpath
 Key: FLINK-34396
 URL: https://issues.apache.org/jira/browse/FLINK-34396
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Yunfeng Zhou
 Fix For: 1.19.0








[jira] [Created] (FLINK-34395) Release Testing Instructions: Verify FLINK-32514 Support using larger checkpointing interval when source is processing backlog

2024-02-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34395:
---

 Summary: Release Testing Instructions: Verify FLINK-32514 Support 
using larger checkpointing interval when source is processing backlog
 Key: FLINK-34395
 URL: https://issues.apache.org/jira/browse/FLINK-34395
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Timo Walther
 Fix For: 1.19.0
 Attachments: screenshot-1.png







[jira] [Created] (FLINK-34394) Release Testing Instructions: Verify FLINK-33028 Make expanding behavior of virtual metadata columns configurable

2024-02-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34394:
---

 Summary: Release Testing Instructions: Verify FLINK-33028 Make 
expanding behavior of virtual metadata columns configurable
 Key: FLINK-34394
 URL: https://issues.apache.org/jira/browse/FLINK-34394
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Venkata krishnan Sowrirajan
 Fix For: 1.19.0








[jira] [Created] (FLINK-34393) Release Testing Instructions: Verify FLINK-20767 Add nested field support for SupportsFilterPushDown

2024-02-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34393:
---

 Summary: Release Testing Instructions: Verify FLINK-20767 Add 
nested field support for SupportsFilterPushDown
 Key: FLINK-34393
 URL: https://issues.apache.org/jira/browse/FLINK-34393
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Zhanghao Chen
 Fix For: 1.19.0








[jira] [Created] (FLINK-34392) Release Testing Instructions: Verify FLINK-33146 Unify the Representation of TaskManager Location in REST API and Web UI

2024-02-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34392:
---

 Summary: Release Testing Instructions: Verify FLINK-33146 Unify 
the Representation of TaskManager Location in REST API and Web UI
 Key: FLINK-34392
 URL: https://issues.apache.org/jira/browse/FLINK-34392
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: xiangyu feng
 Fix For: 1.19.0








[jira] [Created] (FLINK-34391) Release Testing Instructions: Verify FLINK-15959 Add min number of slots configuration to limit total number of slots

2024-02-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34391:
---

 Summary: Release Testing Instructions: Verify FLINK-15959 Add min 
number of slots configuration to limit total number of slots
 Key: FLINK-34391
 URL: https://issues.apache.org/jira/browse/FLINK-34391
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Alan Sheinberg
 Fix For: 1.19.0








[jira] [Created] (FLINK-34387) Release Testing: Verify FLINK-33695 Introduce TraceReporter and use it to create checkpointing and recovery traces

2024-02-06 Thread lincoln lee (Jira)
lincoln lee created FLINK-34387:
---

 Summary: Release Testing: Verify FLINK-33695 Introduce 
TraceReporter and use it to create checkpointing and recovery traces 
 Key: FLINK-34387
 URL: https://issues.apache.org/jira/browse/FLINK-34387
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Metrics
Affects Versions: 1.19.0
Reporter: lincoln lee
 Fix For: 1.19.0


This ticket covers testing three related features: FLINK-33695, FLINK-33735 and 
FLINK-33696.

Instructions:
#  Configure Flink to use 
[Slf4jTraceReporter|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/trace_reporters/#slf4j]
 with *INFO* level logging enabled (to the console or to a file; either is 
fine).
# Start a streaming job with checkpointing enabled.
# Let it run for a couple of checkpoints.
# Verify the presence of a single *JobInitialization* [1] trace logged just after 
job startup.
# Verify the presence of a couple of *Checkpoint* [1] traces logged after each 
successful or failed checkpoint.

[1] 
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/traces/#checkpointing-and-initialization
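
The two verification steps above could be partly automated with a small log check. This is only a sketch under stated assumptions: it assumes the trace names appear verbatim as whole words in the logged lines, it reads "a couple of" as "at least two", and the helper names and log path are hypothetical. A real Flink log may need a stricter pattern.

```shell
# Sketch: count log lines that contain a trace name as a whole word.
# Assumes the trace name is printed verbatim in each trace log line.
count_traces() {
  trace_name="$1"
  log_file="$2"
  # grep -c prints 0 and exits non-zero when nothing matches; keep the count.
  grep -cw -- "$trace_name" "$log_file" || true
}

# Sketch: succeed if exactly one JobInitialization trace and at least
# two Checkpoint traces were logged.
check_traces() {
  log_file="$1"
  init_count="$(count_traces JobInitialization "$log_file")"
  ckpt_count="$(count_traces Checkpoint "$log_file")"
  [ "$init_count" -eq 1 ] && [ "$ckpt_count" -ge 2 ]
}
```

For example, {{check_traces /path/to/flink.log}} (a placeholder path) returns a zero exit status only when both conditions hold.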





[jira] [Created] (FLINK-34384) Release Testing: Verify FLINK-33735 Improve the exponential-delay restart-strategy

2024-02-05 Thread lincoln lee (Jira)
lincoln lee created FLINK-34384:
---

 Summary: Release Testing: Verify FLINK-33735 Improve the 
exponential-delay restart-strategy 
 Key: FLINK-34384
 URL: https://issues.apache.org/jira/browse/FLINK-34384
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Rui Fan
 Fix For: 1.19.0








[jira] [Created] (FLINK-34382) Release Testing: Verify FLINK-33625 Support System out and err to be redirected to LOG or discarded

2024-02-05 Thread lincoln lee (Jira)
lincoln lee created FLINK-34382:
---

 Summary: Release Testing: Verify FLINK-33625 Support System out 
and err to be redirected to LOG or discarded
 Key: FLINK-34382
 URL: https://issues.apache.org/jira/browse/FLINK-34382
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Configuration
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Rui Fan
 Fix For: 1.19.0








[jira] [Created] (FLINK-34367) Release Testing Instructions: Verify FLINK-34027 AsyncScalarFunction for asynchronous scalar function support

2024-02-05 Thread lincoln lee (Jira)
lincoln lee created FLINK-34367:
---

 Summary: Release Testing Instructions: Verify FLINK-34027 
AsyncScalarFunction for asynchronous scalar function support
 Key: FLINK-34367
 URL: https://issues.apache.org/jira/browse/FLINK-34367
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / REST, Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Yu Chen
 Fix For: 1.19.0








Re: [ANNOUNCE] Flink 1.19 feature freeze & sync summary on 01/30/2024

2024-02-05 Thread Lincoln Lee
Thanks to Xintong for clarifying this!  @Rui Under the feature freeze
rule "Only bug fixes and documentation changes are allowed."[1], your
merge request has been discussed among the 1.19 RMs. We also agree not to
merge these PRs, which are purely cleanup work and bring no additional benefit
for users. Thank you for agreeing.

Also, a progress update:

*- Cutting release branch*
We're still working on two blockers[2][3]. We'll decide when to cut the 1.19
release branch after the discussion at the next release sync (Feb 6th).

[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Release+Management
[2] https://issues.apache.org/jira/browse/FLINK-34337
[3] https://issues.apache.org/jira/browse/FLINK-34007

Best,
Yun, Jing, Martijn and Lincoln

On Mon, Feb 5, 2024 at 13:42, Rui Fan <1996fan...@gmail.com> wrote:

> > My opinion would be to follow the process by default, and to make
> exceptions only if there're good reasons.
>
> Sounds make sense, I will merge it after 1.19 branch cutting.
>
> Thanks Xintong for the explanation! And sorry for bothering.
>
> Best,
> Rui
>
> On Mon, Feb 5, 2024 at 1:20 PM Xintong Song  wrote:
>
> > Thanks for the info.
> >
> > My opinion would be to follow the process by default, and to make
> > exceptions only if there're good reasons. From your description, it
> sounds
> > like merging the PR in or after 1.19 doesn't really make a difference. In
> > that case, I'd suggest to merge it for the next release (i.e. merge it
> into
> > master after the 1.19 branch cutting).
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Mon, Feb 5, 2024 at 12:52 PM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > Thanks Xintong for the reply.
> > >
> > > They are Flink internal classes, and they are not used anymore.
> > > So I think they don't affect users, the benefit of removing them
> > > is to simplify Flink's code and reduce maintenance costs.
> > >
> > > If we just merge some user-related PRs recently, I could merge
> > > it after 1.19. Thank you again~
> > >
> > > Best,
> > > Rui
> > >
> > > On Mon, Feb 5, 2024 at 12:21 PM Xintong Song 
> > > wrote:
> > >
> > > > Hi Rui,
> > > >
> > > > Quick question, would there be any downside if this PR doesn't go
> into
> > > > 1.19? Or any user benefit from getting it into this release?
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > >
> > > > On Sun, Feb 4, 2024 at 10:16 AM Rui Fan <1996fan...@gmail.com>
> wrote:
> > > >
> > > > > Hi release managers,
> > > > >
> > > > > > The feature freeze of 1.19 has started now. That means that no
> new
> > > > > features
> > > > > > or improvements should now be merged into the master branch
> unless
> > > you
> > > > > ask
> > > > > > the release managers first, which has already been done for PRs,
> or
> > > > > pending
> > > > > > on CI to pass. Bug fixes and documentation PRs can still be
> merged.
> > > > >
> > > > > I'm curious whether the code cleanup could be merged?
> > > > > FLINK-31449[1] removed DeclarativeSlotManager related logic.
> > > > > Some other classes are not used anymore after FLINK-31449.
> > > > > FLINK-34345[2][3] will remove them.
> > > > >
> > > > > I checked these classes are not used in the master branch.
> > > > > And the PR[3] is reviewed for now, could I merge it now or
> > > > > after flink-1.19?
> > > > >
> > > > > Looking forward to your feedback, thanks~
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-31449
> > > > > [2] https://issues.apache.org/jira/browse/FLINK-34345
> > > > > [3] https://github.com/apache/flink/pull/24257
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Wed, Jan 31, 2024 at 5:20 PM Lincoln Lee <
> lincoln.8...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> Hi Matthias,
> > > > >>
> > > > >> Thanks for letting us know! After discussed with 1.19 release
> > > managers,
> > > > we
> > > > >> agreed to merge these pr.
> > > > >>
> > >

[jira] [Created] (FLINK-34356) Release Testing: Verify FLINK-33768 Support dynamic source parallelism inference for batch jobs

2024-02-04 Thread lincoln lee (Jira)
lincoln lee created FLINK-34356:
---

 Summary: Release Testing: Verify FLINK-33768 Support dynamic 
source parallelism inference for batch jobs 
 Key: FLINK-34356
 URL: https://issues.apache.org/jira/browse/FLINK-34356
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: xingbe
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34355) Release Testing: Verify FLINK-34054 Support named parameters for functions and procedures

2024-02-04 Thread lincoln lee (Jira)
lincoln lee created FLINK-34355:
---

 Summary: Release Testing: Verify FLINK-34054 Support named 
parameters for functions and procedures
 Key: FLINK-34355
 URL: https://issues.apache.org/jira/browse/FLINK-34355
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: lincoln lee
Assignee: Feng Jin
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

