[jira] [Created] (FLINK-35977) Missing an import in datastream.md
guluo created FLINK-35977: - Summary: Missing an import in datastream.md Key: FLINK-35977 URL: https://issues.apache.org/jira/browse/FLINK-35977 Project: Flink Issue Type: Bug Affects Versions: 1.20.0 Reporter: guluo In the document datastream.md, an import for OpenContext is missing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
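For reference, a minimal sketch of the import and usage the docs snippet presumably needs, assuming the {{OpenContext}} introduced by FLIP-344 in {{org.apache.flink.api.common.functions}}:
{code:java}
import org.apache.flink.api.common.functions.OpenContext;
import org.apache.flink.api.common.functions.RichMapFunction;

public class MyMapper extends RichMapFunction<String, String> {

    @Override
    public void open(OpenContext openContext) throws Exception {
        // initialization logic; open(OpenContext) replaces the deprecated open(Configuration)
    }

    @Override
    public String map(String value) {
        return value;
    }
}
{code}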
[jira] [Created] (FLINK-35976) StreamPhysicalOverAggregate should handle column name conflicts
lincoln lee created FLINK-35976: --- Summary: StreamPhysicalOverAggregate should handle column name conflicts Key: FLINK-35976 URL: https://issues.apache.org/jira/browse/FLINK-35976 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.19.1, 1.20.0 Reporter: lincoln lee Assignee: lincoln lee Fix For: 2.0.0 A duplicate column name exception occurs when using a nested over-aggregate query, e.g., this repro case:
{code}
@Test
def testNestedOverAgg(): Unit = {
  util.addTable(s"""
    |CREATE TEMPORARY TABLE src (
    |  a STRING,
    |  b STRING,
    |  ts TIMESTAMP_LTZ(3),
    |  watermark FOR ts as ts
    |) WITH (
    |  'connector' = 'values'
    |)
    |""".stripMargin)
  util.verifyExecPlan(s"""
    |SELECT *
    |FROM (
    |  SELECT
    |    *, count(*) OVER (PARTITION BY a ORDER BY ts) AS c2
    |  FROM (
    |    SELECT
    |      *, count(*) OVER (PARTITION BY a,b ORDER BY ts) AS c1
    |    FROM src
    |  )
    |)
    |""".stripMargin)
}
{code}
{code}
org.apache.flink.table.api.ValidationException: Field names must be unique. Found duplicates: [w0$o0]
    at org.apache.flink.table.types.logical.RowType.validateFields(RowType.java:273)
    at org.apache.flink.table.types.logical.RowType.<init>(RowType.java:158)
    at org.apache.flink.table.types.logical.RowType.of(RowType.java:298)
    at org.apache.flink.table.types.logical.RowType.of(RowType.java:290)
    at org.apache.flink.table.planner.calcite.FlinkTypeFactory$.toLogicalRowType(FlinkTypeFactory.scala:678)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamPhysicalOverAggregate.translateToExecNode(StreamPhysicalOverAggregate.scala:57)
    at org.apache.flink.table.planner.plan.nodes.physical.FlinkPhysicalRel.translateToExecNode(FlinkPhysicalRel.scala:53)
    at org.apache.flink.table.planner.plan.nodes.physical.FlinkPhysicalRel.translateToExecNode$(FlinkPhysicalRel.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamPhysicalOverAggregateBase.translateToExecNode(StreamPhysicalOverAggregateBase.scala:35)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeGraphGenerator.generate(ExecNodeGraphGenerator.java:74)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNodeGraphGenerator.generate(ExecNodeGraphGenerator.java:54)
    at org.apache.flink.table.planner.delegation.PlannerBase.translateToExecNodeGraph(PlannerBase.scala:407)
    at org.apache.flink.table.planner.utils.TableTestUtilBase.assertPlanEquals(TableTestBase.scala:1076)
    at org.apache.flink.table.planner.utils.TableTestUtilBase.doVerifyPlan(TableTestBase.scala:920)
    at org.apache.flink.table.planner.utils.TableTestUtilBase.verifyExecPlan(TableTestBase.scala:675)
    at org.apache.flink.table.planner.plan.stream.sql.agg.OverAggregateTest.testNestedOverAgg(OverAggregateTest.scala:460)
{code}
This is similar to the case in https://issues.apache.org/jira/browse/FLINK-22121, but the fix missed the streaming over-aggregate scenario. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35975) MongoFilterPushDownVisitorTest fails
Jiabao Sun created FLINK-35975: -- Summary: MongoFilterPushDownVisitorTest fails Key: FLINK-35975 URL: https://issues.apache.org/jira/browse/FLINK-35975 Project: Flink Issue Type: Bug Components: Connectors / MongoDB Reporter: Jiabao Sun Assignee: Jiabao Sun Fix For: mongodb-1.3.0 https://github.com/apache/flink-connector-mongodb/actions/runs/10127471893/job/28005305311 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35974) PyFlink YARN per-job on Docker test failed because docker-compose command not found
Weijie Guo created FLINK-35974: -- Summary: PyFlink YARN per-job on Docker test failed because docker-compose command not found Key: FLINK-35974 URL: https://issues.apache.org/jira/browse/FLINK-35974 Project: Flink Issue Type: Improvement Reporter: Weijie Guo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35973) HeartbeatManagerTest.testHeartbeatTimeout failed
Weijie Guo created FLINK-35973: -- Summary: HeartbeatManagerTest.testHeartbeatTimeout failed Key: FLINK-35973 URL: https://issues.apache.org/jira/browse/FLINK-35973 Project: Flink Issue Type: Improvement Components: Build System / CI Affects Versions: 2.0.0 Reporter: Weijie Guo
Jul 26 03:37:22 java.lang.AssertionError
Jul 26 03:37:22     at org.junit.Assert.fail(Assert.java:87)
Jul 26 03:37:22     at org.junit.Assert.assertTrue(Assert.java:42)
Jul 26 03:37:22     at org.junit.Assert.assertFalse(Assert.java:65)
Jul 26 03:37:22     at org.junit.Assert.assertFalse(Assert.java:75)
Jul 26 03:37:22     at org.apache.flink.runtime.heartbeat.HeartbeatManagerTest.testHeartbeatTimeout(HeartbeatManagerTest.java:200)
Jul 26 03:37:22     at java.lang.reflect.Method.invoke(Method.java:498)
Jul 26 03:37:22     at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35972) Directly specify key and process runnable in async state execution model
Zakelly Lan created FLINK-35972: --- Summary: Directly specify key and process runnable in async state execution model Key: FLINK-35972 URL: https://issues.apache.org/jira/browse/FLINK-35972 Project: Flink Issue Type: Sub-task Reporter: Zakelly Lan Assignee: Zakelly Lan In the asynchronous execution model for state (FLIP-425), the key is managed by the framework and automatically assigned before {{processElement}} or a callback runs. However, in some cases there should be an internal interface that allows switching the key and starting a new processing run with that key. This is useful when implementing something like mini-batch, which triggers the real logic for a batch of gathered elements at one time instead of invoking {{processElement}} for each element. Additionally, processing invoked through this interface should have a higher priority than normal input, as the semantics of this processing method is to continue processing previously half-processed data. -- This message was sent by Atlassian Jira (v8.20.10#820010)
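A minimal sketch of what such an internal hook could look like (the interface and method names here are hypothetical, not the actual FLIP-425 API):
{code:java}
/**
 * Hypothetical internal hook for the capability described above: run a
 * processing closure under an explicitly supplied key. Implementations would
 * schedule it ahead of normal input, since it continues half-processed data.
 */
public interface AsyncKeyedProcessing<K> {

    /** Switches the current key to {@code key} and runs {@code processing} under it. */
    void asyncProcessWithKey(K key, Runnable processing);
}
{code}
A mini-batch operator could then buffer elements per key and, on trigger, call {{asyncProcessWithKey(key, batchLogic)}} for each buffered key.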
[jira] [Created] (FLINK-35971) Lose precision in PostgresParallelSource
wuzhenhua created FLINK-35971: - Summary: Lose precision in PostgresParallelSource Key: FLINK-35971 URL: https://issues.apache.org/jira/browse/FLINK-35971 Project: Flink Issue Type: Bug Components: Flink CDC Environment: Flink version 1.14.0 Flink CDC version 2.4.1 Database and its version PostgreSQL 10.23 Reporter: wuzhenhua Steps to reproduce:
{code:java}
CREATE TABLE IF NOT EXISTS s1.t1 (
    id bigint NOT NULL,
    tm time without time zone,
    CONSTRAINT t1_pkey PRIMARY KEY (id)
)
{code}
{code:java}
INSERT INTO s1.t1 VALUES(1, '10:33:23.660863')
{code}
{code:java}
val prop = new Properties()
val pgSource = PostgresSourceBuilder.PostgresIncrementalSource.builder[String]
  .hostname("localhost")
  .port(5432)
  .database("cdc_test")
  .schemaList("s1")
  .tableList("s1.t1")
  .username("postgres")
  .password("postgres")
  .deserializer(new JsonDebeziumDeserializationSchema)
  .slotName("aaa")
  .decodingPluginName("pgoutput")
  .debeziumProperties(prop)
  .build()

env.enableCheckpointing(3000)
env.fromSource(pgSource, WatermarkStrategy.noWatermarks[String](), "PostgresParallelSource").print()
env.execute("Print Postgres Snapshot + WAL")
{code}
Expected output:
{code:java}
"after":{"id":1,"tm":38003660863}
{code}
Actual output:
{code:java}
"after":{"id":1,"tm":3800366}
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
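As a sanity check on the expected value (an editor's note, not from the report), {{10:33:23.660863}} expressed as microseconds of the day is:
{code:java}
// (10 h * 3600 + 33 min * 60 + 23 s) seconds, in microseconds, plus the fractional part
long expectedMicros = (10L * 3600 + 33 * 60 + 23) * 1_000_000L + 660_863L; // = 38_003_660_863
{code}
which matches the expected output above, while the observed {{3800366}} has dropped the trailing digits.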
[jira] [Created] (FLINK-35970) Update documentation about FLINK-26050 (small file compaction)
Roman Khachatryan created FLINK-35970: - Summary: Update documentation about FLINK-26050 (small file compaction) Key: FLINK-35970 URL: https://issues.apache.org/jira/browse/FLINK-35970 Project: Flink Issue Type: Improvement Components: Documentation, Runtime / State Backends Affects Versions: 1.20.0 Reporter: Roman Khachatryan Fix For: 2.0.0 This is a follow-up on FLINK-26050 to update the documentation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35969) Remove deprecated dataset based API from State Processor API
Gabor Somogyi created FLINK-35969: - Summary: Remove deprecated dataset based API from State Processor API Key: FLINK-35969 URL: https://issues.apache.org/jira/browse/FLINK-35969 Project: Flink Issue Type: Improvement Components: API / State Processor Affects Versions: 2.0.0 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35968) Remove flink-cdc-runtime dependency from connectors
Hongshun Wang created FLINK-35968: - Summary: Remove flink-cdc-runtime dependency from connectors Key: FLINK-35968 URL: https://issues.apache.org/jira/browse/FLINK-35968 Project: Flink Issue Type: Improvement Components: Flink CDC Affects Versions: cdc-3.1.1 Reporter: Hongshun Wang Fix For: cdc-3.2.0 Currently, flink-cdc-source-connectors and flink-cdc-pipeline-connectors depend on flink-cdc-runtime, which is not ideal for the design and is redundant. This issue aims to remove it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35967) Shade other Flink connectors in CDC Pipeline connectors
yux created FLINK-35967: --- Summary: Shade other Flink connectors in CDC Pipeline connectors Key: FLINK-35967 URL: https://issues.apache.org/jira/browse/FLINK-35967 Project: Flink Issue Type: Bug Components: Flink CDC Reporter: yux Currently, CDC Pipeline connectors bundle other Flink source / sink connectors without shading / relocating them, which burdens users, since they have to ensure no incompatible connectors are loaded from Flink /lib or elsewhere. Shading these commonly used Flink connectors (including MySQL, StarRocks) would be a solution, just like what the Doris Pipeline connector does now. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35966) Introduce the TASKS for TaskManagerLoadBalanceMode enum
RocMarshal created FLINK-35966: -- Summary: Introduce the TASKS for TaskManagerLoadBalanceMode enum Key: FLINK-35966 URL: https://issues.apache.org/jira/browse/FLINK-35966 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: RocMarshal -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35965) Add ENDSWITH function
Dylan He created FLINK-35965: Summary: Add ENDSWITH function Key: FLINK-35965 URL: https://issues.apache.org/jira/browse/FLINK-35965 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He Add ENDSWITH function.

Returns whether {{expr}} ends with {{endExpr}}.

Example:
{code:sql}
> SELECT ENDSWITH('SparkSQL', 'SQL');
true
> SELECT ENDSWITH('SparkSQL', 'sql');
false
{code}

Syntax:
{code:sql}
ENDSWITH(expr, endExpr)
{code}

Arguments:
* {{expr}}: A STRING or BINARY expression.
* {{endExpr}}: A STRING or BINARY expression.

Returns:
A BOOLEAN. {{expr}} and {{endExpr}} must have the same type. If {{expr}} or {{endExpr}} is NULL, the result is NULL. If {{endExpr}} is empty, the result is true.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/endswith.html]
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35964) Add STARTSWITH function
Dylan He created FLINK-35964: Summary: Add STARTSWITH function Key: FLINK-35964 URL: https://issues.apache.org/jira/browse/FLINK-35964 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He Add STARTSWITH function.

Returns whether {{expr}} begins with {{startExpr}}.

Example:
{code:sql}
> SELECT STARTSWITH('SparkSQL', 'Spark');
true
> SELECT STARTSWITH('SparkSQL', 'spark');
false
{code}

Syntax:
{code:sql}
STARTSWITH(expr, startExpr)
{code}

Arguments:
* {{expr}}: A STRING or BINARY expression.
* {{startExpr}}: A STRING or BINARY expression.

Returns:
A BOOLEAN. {{expr}} and {{startExpr}} must have the same type. If {{expr}} or {{startExpr}} is NULL, the result is NULL. If {{startExpr}} is empty, the result is true.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/startswith.html]
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35963) Add REGEXP_SUBSTR function
Dylan He created FLINK-35963: Summary: Add REGEXP_SUBSTR function Key: FLINK-35963 URL: https://issues.apache.org/jira/browse/FLINK-35963 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He Add REGEXP_SUBSTR function.

Returns the first substring in {{str}} that matches {{regex}}.

Example:
{code:sql}
> SELECT REGEXP_SUBSTR('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
Steven
> SELECT REGEXP_SUBSTR('Mary had a little lamb', 'Ste(v|ph)en');
NULL
{code}

Syntax:
{code:sql}
REGEXP_SUBSTR(str, regex)
{code}

Arguments:
* {{str}}: A STRING expression to be matched.
* {{regex}}: A STRING expression with a pattern.

Returns:
A STRING. If {{regex}} is malformed, the function returns an error. If either argument is NULL or the pattern is not found, the result is NULL.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/regexp_substr.html]
* [MySQL|https://dev.mysql.com/doc/refman/8.4/en/regexp.html#function_regexp-substr]
* [PostgreSQL|https://www.postgresql.org/docs/16/functions-matching.html#FUNCTIONS-POSIX-REGEXP]
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35962) Add REGEXP_INSTR function
Dylan He created FLINK-35962: Summary: Add REGEXP_INSTR function Key: FLINK-35962 URL: https://issues.apache.org/jira/browse/FLINK-35962 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He Add REGEXP_INSTR function.

Returns the position of the first substring in {{str}} that matches {{regex}}.

Example:
{code:sql}
> SELECT REGEXP_INSTR('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
1
> SELECT REGEXP_INSTR('Mary had a little lamb', 'Ste(v|ph)en');
0
{code}

Syntax:
{code:sql}
REGEXP_INSTR(str, regex)
{code}

Arguments:
* {{str}}: A STRING expression to be matched.
* {{regex}}: A STRING expression with a pattern.

Returns:
An INTEGER. Result indexes begin at 1; 0 means there is no match. If {{regex}} is malformed, the function returns an error. If either argument is NULL, the result is NULL.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/regexp_instr.html]
* [MySQL|https://dev.mysql.com/doc/refman/8.4/en/regexp.html#function_regexp-instr]
* [PostgreSQL|https://www.postgresql.org/docs/16/functions-matching.html#FUNCTIONS-POSIX-REGEXP]
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35959) CLONE - Updates the docs stable version
Weijie Guo created FLINK-35959: -- Summary: CLONE - Updates the docs stable version Key: FLINK-35959 URL: https://issues.apache.org/jira/browse/FLINK-35959 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Update docs to "stable" in {{docs/config.toml}} in the branch of the _just-released_ version:
* Change {{Version}} from {{x.y-SNAPSHOT}} to {{x.y.z}}, i.e. {{1.6-SNAPSHOT}} to {{1.6.0}}
* Change {{VersionTitle}} from {{x.y-SNAPSHOT}} to {{x.y}}, i.e. {{1.6-SNAPSHOT}} to {{1.6}}
* Change {{Branch}} from {{master}} to {{release-x.y}}, i.e. {{master}} to {{release-1.6}}
* Change {{baseURL}} from {{//ci.apache.org/projects/flink/flink-docs-master}} to {{//ci.apache.org/projects/flink/flink-docs-release-x.y}}
* Change {{javadocs_baseurl}} from {{//ci.apache.org/projects/flink/flink-docs-master}} to {{//ci.apache.org/projects/flink/flink-docs-release-x.y}}
* Change {{IsStable}} to {{true}}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35960) CLONE - Start End of Life discussion thread for now outdated Flink minor version
Weijie Guo created FLINK-35960: -- Summary: CLONE - Start End of Life discussion thread for now outdated Flink minor version Key: FLINK-35960 URL: https://issues.apache.org/jira/browse/FLINK-35960 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo The idea is to discuss in the community whether we should do a final release for the now-unsupported minor version. Such a release shouldn't be covered by the current minor version's release managers; their only responsibility is to trigger the discussion. The intention of a final patch release for the now-unsupported Flink minor version is to flush out all the fixes that didn't end up in the previous release. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35957) CLONE - Other announcements
Weijie Guo created FLINK-35957: -- Summary: CLONE - Other announcements Key: FLINK-35957 URL: https://issues.apache.org/jira/browse/FLINK-35957 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: Lincoln Lee
h3. Recordkeeping

Use [reporter.apache.org|https://reporter.apache.org/addrelease.html?flink] to seed the information about the release into future project reports. (Note: Only PMC members have access to report releases. If you do not have access, ask on the mailing list for assistance.)

h3. Flink blog

Major or otherwise important releases should have a blog post. Write one if needed for this particular release. Minor releases that don't introduce major new functionality don't necessarily need to be blogged (see [flink-web PR #581 for Flink 1.15.3|https://github.com/apache/flink-web/pull/581] as an example of a minor release blog post).

Please make sure that the release notes of the documentation (see section "Review and update documentation") are linked from the blog post of a major release.

We usually include the names of all contributors in the announcement blog post. Use the following command to get the list of contributors:
{code}
# the first line is required to make sort put uppercase before lowercase
export LC_ALL=C
export FLINK_PREVIOUS_RELEASE_BRANCH=
export FLINK_CURRENT_RELEASE_BRANCH=
# e.g.
# export FLINK_PREVIOUS_RELEASE_BRANCH=release-1.17
# export FLINK_CURRENT_RELEASE_BRANCH=release-1.18
git log $(git merge-base master $FLINK_PREVIOUS_RELEASE_BRANCH)..$(git show-ref --hash ${FLINK_CURRENT_RELEASE_BRANCH}) --pretty=format:"%an%n%cn" | sort -u | paste -sd, | sed "s/\,/\, /g"
{code}

h3. Social media

Tweet, post on Facebook, LinkedIn, and other platforms. Ask other contributors to do the same.

h3. Flink Release Wiki page

Add a summary of things that went well or not so well during the release process. This can include feedback from contributors but also more generic things, like the release taking longer than initially anticipated (and why), to give a bit of context to the release process.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35958) CLONE - Update reference data for Migration Tests
Weijie Guo created FLINK-35958: -- Summary: CLONE - Update reference data for Migration Tests Key: FLINK-35958 URL: https://issues.apache.org/jira/browse/FLINK-35958 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Fix For: 1.20.0, 1.19.1 Update migration tests in master to cover migration from the new version. Since 1.18, this step can be done automatically with the following steps. For more information please refer to [this page|https://github.com/apache/flink/blob/master/flink-test-utils-parent/flink-migration-test-utils/README.md].
# {*}On the published release tag (e.g., release-1.16.0){*}, run:
{code}
$ mvn clean package -Pgenerate-migration-test-data -Dgenerate.version=1.16 -nsu -Dfast -DskipTests
{code}
The version (1.16 in the command above) should be replaced with the target one.
# Modify the content of the file [apache/flink:flink-test-utils-parent/flink-migration-test-utils/src/main/resources/most_recently_published_version|https://github.com/apache/flink/blob/master/flink-test-utils-parent/flink-migration-test-utils/src/main/resources/most_recently_published_version] to the latest version (it would be "v1_16" if sticking to the example where 1.16.0 was released).
# Commit the modifications from steps 1 and 2 with the message "{_}[release] Generate reference data for state migration tests based on release-1.xx.0{_}" to the corresponding release branch (e.g. {{release-1.16}} in our example), replacing "xx" with the actual version (in this example "16"). Use the Jira issue ID instead of [release] as the commit message prefix if you have a dedicated Jira issue for this task.
# Cherry-pick the commit to the master branch.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35954) CLONE - Merge website pull request
Weijie Guo created FLINK-35954: -- Summary: CLONE - Merge website pull request Key: FLINK-35954 URL: https://issues.apache.org/jira/browse/FLINK-35954 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Merge the website pull request to [list the release|http://flink.apache.org/downloads.html]. Make sure to regenerate the website as well, as it isn't built automatically. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35955) CLONE - Remove outdated versions
Weijie Guo created FLINK-35955: -- Summary: CLONE - Remove outdated versions Key: FLINK-35955 URL: https://issues.apache.org/jira/browse/FLINK-35955 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo
h4. dist.apache.org

For a new major release remove all release files older than 2 versions, e.g., when releasing 1.7, remove all releases <= 1.5. For a new bugfix version remove all release files for previous bugfix releases in the same series, e.g., when releasing 1.7.1, remove the 1.7.0 release.
# If you have not already, check out the Flink section of the {{release}} repository on [dist.apache.org|http://dist.apache.org/] via Subversion. In a fresh directory:
{code}
svn checkout https://dist.apache.org/repos/dist/release/flink --depth=immediates
cd flink
{code}
# Remove files for outdated releases and commit the changes:
{code}
svn remove flink-<version-to-remove>
svn commit
{code}
# Verify that the files are [removed|https://dist.apache.org/repos/dist/release/flink].
(!) Remember to remove the corresponding download links from the website.

h4. CI

Disable the cron job for the now-unsupported version in [tools/azure-pipelines/build-apache-repo.yml|https://github.com/apache/flink/blob/master/tools/azure-pipelines/build-apache-repo.yml] on the respective branch.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35961) CLONE - Build 1.19 docs in GitHub Action and mark 1.19 as stable in docs
Weijie Guo created FLINK-35961: -- Summary: CLONE - Build 1.19 docs in GitHub Action and mark 1.19 as stable in docs Key: FLINK-35961 URL: https://issues.apache.org/jira/browse/FLINK-35961 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35953) CLONE - Update japicmp configuration
Weijie Guo created FLINK-35953: -- Summary: CLONE - Update japicmp configuration Key: FLINK-35953 URL: https://issues.apache.org/jira/browse/FLINK-35953 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Fix For: 1.20.0, 1.19.1 Update the japicmp reference version and wipe exclusions / enable API compatibility checks for {{@PublicEvolving}} APIs on the corresponding SNAPSHOT branch with the {{update_japicmp_configuration.sh}} script (see below). For a new major release (x.y.0), run the same command also on the master branch to update the japicmp reference version and remove outdated exclusions in the japicmp configuration.

Make sure that all Maven artifacts are already pushed to Maven Central. Otherwise, there's a risk that CI fails due to missing reference artifacts.
{code:bash}
tools $ NEW_VERSION=$RELEASE_VERSION releasing/update_japicmp_configuration.sh
tools $ cd ..
$ git add *
$ git commit -m "Update japicmp configuration for $RELEASE_VERSION"
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35956) CLONE - Apache mailing lists announcements
Weijie Guo created FLINK-35956: -- Summary: CLONE - Apache mailing lists announcements Key: FLINK-35956 URL: https://issues.apache.org/jira/browse/FLINK-35956 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Announce on the {{dev@}} mailing list that the release has been finished. Announce the release on the {{user@}} mailing list, listing major improvements and contributions. Announce the release on the [annou...@apache.org|mailto:annou...@apache.org] mailing list.
{panel}
From: Release Manager
To: dev@flink.apache.org, u...@flink.apache.org, user...@flink.apache.org, annou...@apache.org
Subject: [ANNOUNCE] Apache Flink 1.2.3 released

The Apache Flink community is very happy to announce the release of Apache Flink 1.2.3, which is the third bugfix release for the Apache Flink 1.2 series.

Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements for this bugfix release:
<blog post link>

The full release notes are available in Jira:
<release notes link>

We would like to thank all contributors of the Apache Flink community who made this release possible!

Feel free to reach out to the release managers (or respond to this thread) with feedback on the release process. Our goal is to constantly improve the release process. Feedback on what could be improved or things that didn't go so well is appreciated.

Regards,
Release Manager
{panel}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35952) Promote release 1.20
Weijie Guo created FLINK-35952: -- Summary: Promote release 1.20 Key: FLINK-35952 URL: https://issues.apache.org/jira/browse/FLINK-35952 Project: Flink Issue Type: New Feature Affects Versions: 1.18.0 Reporter: Weijie Guo Assignee: lincoln lee Once the release has been finalized (FLINK-32920), the last step of the process is to promote the release within the project and beyond. Please wait for 24h after finalizing the release, in accordance with the [ASF release policy|http://www.apache.org/legal/release-policy.html#release-announcements].

*Final checklist to declare this issue resolved:*
# Website pull request to [list the release|http://flink.apache.org/downloads.html] merged.
# Release announced on the user@ mailing list.
# Blog post published, if applicable.
# Release recorded in [reporter.apache.org|https://reporter.apache.org/addrelease.html?flink].
# Release announced on social media.
# Completion declared on the dev@ mailing list.
# Homebrew updated: [https://docs.brew.sh/How-To-Open-a-Homebrew-Pull-Request] (seems to be done automatically, at least for minor releases).
# japicmp configuration updated:
** corresponding SNAPSHOT branch japicmp reference version set to the just-released version, and API compatibility checks for {{@PublicEvolving}} enabled
** (minor version release only) master branch japicmp reference version set to the just-released version
** (minor version release only) master branch japicmp exclusions cleared
# The list of previous versions in {{docs/config.toml}} on the master branch updated.
# {{show_outdated_warning: true}} set in {{docs/config.toml}} in the branch of the _now deprecated_ Flink version (i.e. 1.16 if 1.18.0 is released).
# Stable and master alias updated in [https://github.com/apache/flink/blob/master/.github/workflows/docs.yml].
# Discussion thread opened for End of Life for the unsupported version (i.e. 1.16).
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35951) Flink CDC startup exception using jar package
huxx created FLINK-35951: Summary: Flink CDC startup exception using jar package Key: FLINK-35951 URL: https://issues.apache.org/jira/browse/FLINK-35951 Project: Flink Issue Type: Bug Components: Flink CDC Affects Versions: cdc-3.1.1 Environment: jdk 17 flink 1.18 flinkcdc 3.0.0 Reporter: huxx Attachments: image-2024-08-01-15-01-25-795.png, image-2024-08-01-15-10-35-824.png I integrated Flink with Spring Boot in my project because I only use a small part of Flink's functionality, namely the data extraction feature of Flink CDC. I don't want to build and maintain a separate Flink environment; I want to integrate this feature with other business logic and manage it through Spring. Packaging and starting the project from the IDEA development tool works normally. When I package with the spring-boot-maven-plugin, start it via java -jar, and execute StreamExecutionEnvironment.execute(), it reports that a class cannot be found. However, this class does exist and is packaged. !image-2024-08-01-15-10-35-824.png!
{code:java}
org.apache.flink.runtime.client.JobInitializationException: Could not start the JobMaster.
    at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:97)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1773)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.util.concurrent.CompletionException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.flink.api.common.ExecutionConfig
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1770)
    ... 3 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.flink.api.common.ExecutionConfig
    at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedSupplier$4(FunctionUtils.java:114)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
    ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.api.common.ExecutionConfig
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:467)
    at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:78)
    at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2045)
    at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1909)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2235)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1744)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:514)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:472)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:539)
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:527)
    at org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:67)
    at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:101)
    at org.apache.flink.runtime.jobmaster.DefaultSlotPoolServiceSchedulerFactory.createScheduler(DefaultSlotPoolServiceSchedulerFactory.java:122)
    at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:379)
    at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:356
[jira] [Created] (FLINK-35950) CLONE - Publish the Dockerfiles for the new release
Weijie Guo created FLINK-35950: -- Summary: CLONE - Publish the Dockerfiles for the new release Key: FLINK-35950 URL: https://issues.apache.org/jira/browse/FLINK-35950 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Note: the official Dockerfiles fetch the binary distribution of the target Flink version from an Apache mirror. After publishing the binary release artifacts, mirrors can take some hours to start serving the new artifacts, so you may want to wait to do this step until you are ready to continue with the "Promote the release" steps in the follow-up Jira. Follow the [release instructions in the flink-docker repo|https://github.com/apache/flink-docker#release-workflow] to build the new Dockerfiles and send an updated manifest to Docker Hub so the new images are built and published. h3. Expectations * Dockerfiles in [flink-docker|https://github.com/apache/flink-docker] updated for the new Flink release and pull request opened on the Docker official-images with an updated manifest -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35948) CLONE - Deploy artifacts to Maven Central Repository
Weijie Guo created FLINK-35948: -- Summary: CLONE - Deploy artifacts to Maven Central Repository Key: FLINK-35948 URL: https://issues.apache.org/jira/browse/FLINK-35948 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Use the [Apache Nexus repository|https://repository.apache.org/] to release the staged binary artifacts to the Maven Central repository. In the Staging Repositories section, find the relevant release candidate orgapacheflink-XXX entry and click Release. Drop all other release candidates that are not being released.

h3. Deploy source and binary releases to dist.apache.org

Copy the source and binary releases from the dev repository to the release repository at [dist.apache.org|http://dist.apache.org/] using Subversion.
{code:java}
$ svn move -m "Release Flink ${RELEASE_VERSION}" https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM} https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION}
{code}
(Note: Only PMC members have access to the release repository. If you do not have access, ask on the mailing list for assistance.)

h3. Remove old release candidates from [dist.apache.org|http://dist.apache.org/]

Remove the old release candidates from [https://dist.apache.org/repos/dist/dev/flink] using Subversion.
{code:java}
$ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
$ cd flink
$ svn remove flink-${RELEASE_VERSION}-rc*
$ svn commit -m "Remove old release candidates for Apache Flink ${RELEASE_VERSION}"
{code}

h3. Expectations
* Maven artifacts released and indexed in the [Maven Central Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22] (usually takes about a day to show up)
* Source & binary distributions available in the release repository at [https://dist.apache.org/repos/dist/release/flink/]
* Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty
* Website contains links to the new release binaries and sources on the download page
* (for minor version updates) the front page references the correct new major release version and directs to the correct link
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35949) CLONE - Create Git tag and mark version as released in Jira
Weijie Guo created FLINK-35949: -- Summary: CLONE - Create Git tag and mark version as released in Jira Key: FLINK-35949 URL: https://issues.apache.org/jira/browse/FLINK-35949 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Create and push a new Git tag for the released version by copying the tag for the final release candidate, as follows:
{code:java}
$ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release Flink ${RELEASE_VERSION}"
$ git push <remote> refs/tags/release-${RELEASE_VERSION}
{code}
In JIRA, inside [version management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions], hover over the current release and a settings menu will appear. Click Release, and select today's date. (Note: Only PMC members have access to the project administration. If you do not have access, ask on the mailing list for assistance.)

If PRs have been merged to the release branch after the last release candidate was tagged, make sure that the corresponding Jira tickets have the correct Fix Version set.

h3. Expectations
* Release tagged in the source code repository
* Release version finalized in JIRA. (Note: Not all committers have administrator access to JIRA. If you end up getting permission errors, ask on the mailing list for assistance.)
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35947) CLONE - Deploy Python artifacts to PyPI
Weijie Guo created FLINK-35947: -- Summary: CLONE - Deploy Python artifacts to PyPI Key: FLINK-35947 URL: https://issues.apache.org/jira/browse/FLINK-35947 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee The release manager should create a PyPI account and ask the PMC to add this account to the pyflink collaborator list with the Maintainer role (the PyPI admin account info can be found here. NOTE: only visible to PMC members) in order to deploy the Python artifacts to PyPI. The artifacts can be uploaded using twine ([https://pypi.org/project/twine/]). To install twine, just run:
{code:java}
pip install --upgrade twine==1.12.0
{code}
Download the python artifacts from dist.apache.org and upload them to pypi.org:
{code:java}
svn checkout https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
cd flink-${RELEASE_VERSION}-rc${RC_NUM}
cd python

# upload wheels
for f in *.whl; do twine upload --repository-url https://upload.pypi.org/legacy/ $f $f.asc; done

# upload source packages
twine upload --repository-url https://upload.pypi.org/legacy/ apache-flink-libraries-${RELEASE_VERSION}.tar.gz apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc
twine upload --repository-url https://upload.pypi.org/legacy/ apache-flink-${RELEASE_VERSION}.tar.gz apache-flink-${RELEASE_VERSION}.tar.gz.asc
{code}
If the upload failed or was incorrect for some reason (e.g. a network transmission problem), you need to delete the uploaded release package of the same version (if it exists), rename the artifact to {{apache-flink-${RELEASE_VERSION}.post0.tar.gz}}, and then re-upload.

(!) Note: re-uploading to pypi.org must be avoided as much as possible because it can cause some irreparable problems. If that happens, users cannot install the apache-flink package by explicitly specifying the package version, i.e. the following command "pip install apache-flink==${RELEASE_VERSION}" will fail. Instead they have to run "pip install apache-flink" or "pip install apache-flink==${RELEASE_VERSION}.post0" to install the apache-flink package.

h3. Expectations
* Python artifacts released and indexed in the [PyPI|https://pypi.org/project/apache-flink/] repository
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35946) Finalize release 1.20.0
Weijie Guo created FLINK-35946: -- Summary: Finalize release 1.20.0 Key: FLINK-35946 URL: https://issues.apache.org/jira/browse/FLINK-35946 Project: Flink Issue Type: New Feature Reporter: Weijie Guo Assignee: lincoln lee Fix For: 1.19.0 Once the release candidate has been reviewed and approved by the community, the release should be finalized. This involves the final deployment of the release candidate to the release repositories, merging of the website changes, etc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35945) Consumer group information cannot be browsed in Kafka when using Flink with Kafka data sources without enabling checkpoints
任铭睿 created FLINK-35945: --- Summary: Consumer group information cannot be browsed in Kafka when using Flink with Kafka data sources without enabling checkpoints Key: FLINK-35945 URL: https://issues.apache.org/jira/browse/FLINK-35945 Project: Flink Issue Type: Improvement Components: API / DataStream Reporter: 任铭睿 When using Flink with Kafka data sources without enabling checkpoints, consumer group information cannot be browsed in Kafka. -- This message was sent by Atlassian Jira (v8.20.10#820010)
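For context (an editor's note, not from the ticket): the Kafka source commits offsets back to Kafka only on checkpoint completion, so with checkpointing disabled no offsets ever reach the consumer group. A workaround sketch using the source's Kafka property pass-through (topic and group names are made up):
{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;

// Sketch: with checkpointing disabled, let the Kafka client itself commit
// offsets periodically so the group becomes visible in Kafka tooling.
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("localhost:9092")
        .setTopics("input-topic")                       // hypothetical topic name
        .setGroupId("my-consumer-group")                // hypothetical group id
        .setProperty("enable.auto.commit", "true")
        .setProperty("auto.commit.interval.ms", "1000")
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();
{code}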
[jira] [Created] (FLINK-35944) Add CompiledPlan annotations to BatchExecUnion
Jim Hughes created FLINK-35944: -- Summary: Add CompiledPlan annotations to BatchExecUnion Key: FLINK-35944 URL: https://issues.apache.org/jira/browse/FLINK-35944 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for this operator. For this operator, the BatchExecHashAggregate operator must be annotated as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35943) Add CompiledPlan annotations to BatchExecHashJoin and BatchExecNestedLoopJoin
Jim Hughes created FLINK-35943: -- Summary: Add CompiledPlan annotations to BatchExecHashJoin and BatchExecNestedLoopJoin Key: FLINK-35943 URL: https://issues.apache.org/jira/browse/FLINK-35943 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for these two operators. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35942) Add CompiledPlan annotations to BatchExecSort
Jim Hughes created FLINK-35942: -- Summary: Add CompiledPlan annotations to BatchExecSort Key: FLINK-35942 URL: https://issues.apache.org/jira/browse/FLINK-35942 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for this operator. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35941) Add CompiledPlan annotations to BatchExecLimit
Jim Hughes created FLINK-35941: -- Summary: Add CompiledPlan annotations to BatchExecLimit Key: FLINK-35941 URL: https://issues.apache.org/jira/browse/FLINK-35941 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for this operator. Additionally, tests for the TableSource operator will be pulled into this work. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35940) Upgrade log4j dependencies to 2.23.1
Daniel Burrell created FLINK-35940: -- Summary: Upgrade log4j dependencies to 2.23.1 Key: FLINK-35940 URL: https://issues.apache.org/jira/browse/FLINK-35940 Project: Flink Issue Type: Improvement Components: API / Core Affects Versions: 1.19.1 Environment: N/A Reporter: Daniel Burrell There is a need to upgrade the log4j dependencies, as the current version has a vulnerability. I propose upgrading to 2.23.1, which is the latest 2.x version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35939) Do not set empty config values via ConfigUtils#encodeCollectionToConfig
Ferenc Csaky created FLINK-35939: Summary: Do not set empty config values via ConfigUtils#encodeCollectionToConfig Key: FLINK-35939 URL: https://issues.apache.org/jira/browse/FLINK-35939 Project: Flink Issue Type: Improvement Affects Versions: 1.19.1 Reporter: Ferenc Csaky Fix For: 2.0.0 The {{ConfigUtils#encodeCollectionToConfig}} function only skips setting a given {{ConfigOption}} value if that value is null. If the passed collection is empty, it will set that empty collection. I think this behavior is less logical and can cause undesired situations; we should only set a value if it is not empty AND not null. Furthermore, the method's [javadoc|https://github.com/apache/flink/blob/82b628d4730eef32b2f7a022e3b73cb18f950e6e/flink-core/src/main/java/org/apache/flink/configuration/ConfigUtils.java#L73] describes the logic I just mentioned above, which is in conflict with the actual implementation and tests, which set an empty collection. -- This message was sent by Atlassian Jira (v8.20.10#820010)
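A minimal sketch of the proposed guard (simplified signature; the real method also takes a mapper function):
{code:java}
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.Configuration;

public final class ConfigEncodeSketch {

    // Write the option only when the collection is non-null AND non-empty,
    // matching the javadoc cited in the ticket.
    static <T> void encodeCollectionToConfig(
            Configuration config, ConfigOption<List<T>> option, Collection<T> values) {
        if (values != null && !values.isEmpty()) {
            config.set(option, new ArrayList<>(values));
        }
    }
}
{code}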
[jira] [Created] (FLINK-35938) Avoid committing the same datafile again in Paimon Sink
LvYanquan created FLINK-35938: - Summary: Avoid committing the same datafile again in Paimon Sink Key: FLINK-35938 URL: https://issues.apache.org/jira/browse/FLINK-35938 Project: Flink Issue Type: Technical Debt Components: Flink CDC Affects Versions: cdc-3.1.0 Reporter: LvYanquan Fix For: cdc-3.2.0 Attachments: image-2024-07-31-19-45-14-153.png [Flink will re-commit committables|https://github.com/apache/flink/blob/82b628d4730eef32b2f7a022e3b73cb18f950e6e/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/GlobalCommitterOperator.java#L148] when a job restarts from failure. This may cause the same datafile to be added twice in the current PaimonCommitter. !image-2024-07-31-19-45-14-153.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35937) Helm RBAC cleanup
Tim created FLINK-35937: --- Summary: Helm RBAC cleanup Key: FLINK-35937 URL: https://issues.apache.org/jira/browse/FLINK-35937 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Affects Versions: 1.19.0 Environment: Kubernetes 1.29.4 Reporter: Tim This is a follow-up ticket to [https://issues.apache.org/jira/projects/FLINK/issues/FLINK-35310]. In further research and testing with Kyverno, I figured out that some apiGroups seem to be invalid, and I removed them with this PR. It seems that the "extensions" apiGroup does not exist on our recent cluster (Kubernetes 1.29.4). I'm not sure, but it might be related to these deprecation notices: https://kubernetes.io/docs/reference/using-api/deprecation-guide/#deployment-v116 The same holds for the "finalizers" resources. They do not seem to exist anymore and lead to problems with our deployment, so I also removed them. To complete the verb list, I also added "deleteCollections" where applicable. Please take a look and see if this makes sense to you. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35936) paimon cdc schema evolution failure when restarting job
MOBIN created FLINK-35936: - Summary: paimon cdc schema evolution failure when restarting job Key: FLINK-35936 URL: https://issues.apache.org/jira/browse/FLINK-35936 Project: Flink Issue Type: Bug Components: Flink CDC Affects Versions: cdc-3.2.0 Environment: Flink 1.19 cdc master Reporter: MOBIN Paimon CDC schema evolution fails when the job is restarted. Minimal steps to reproduce:
# stop the flink-cdc-mysql-to-paimon pipeline job
# alter the MySQL table schema, e.g. add a column
# start the pipeline job
# the newly added column is not synchronized to the Paimon table
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35935) CREATE TABLE AS doesn't work with LIMIT
Xingcan Cui created FLINK-35935: --- Summary: CREATE TABLE AS doesn't work with LIMIT Key: FLINK-35935 URL: https://issues.apache.org/jira/browse/FLINK-35935 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.18.1 Reporter: Xingcan Cui
{code:java}
CREATE TABLE WITH (foo) AS (SELECT * FROM bar LIMIT 5)
{code}
The above statement throws a "Caused by: java.lang.AssertionError: not a query: " exception. A workaround is to wrap the query in a CTE:
{code:java}
CREATE TABLE WITH (foo) AS (WITH R AS (SELECT * FROM bar LIMIT 5) SELECT * FROM R)
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35934) Add CompiledPlan annotations to BatchExecValues
Jim Hughes created FLINK-35934: -- Summary: Add CompiledPlan annotations to BatchExecValues Key: FLINK-35934 URL: https://issues.apache.org/jira/browse/FLINK-35934 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35933) Skip distributing maxAllowedWatermark if there are no subtasks
Roman Khachatryan created FLINK-35933: - Summary: Skip distributing maxAllowedWatermark if there are no subtasks Key: FLINK-35933 URL: https://issues.apache.org/jira/browse/FLINK-35933 Project: Flink Issue Type: Improvement Components: Runtime / Coordination Reporter: Roman Khachatryan Assignee: Roman Khachatryan Fix For: 1.20.0 On the JM, `SourceCoordinator.announceCombinedWatermark` runs unnecessarily if there are no subtasks to distribute maxAllowedWatermark to. This involves Heap and ConcurrentHashMap accesses and a lot of logging. -- This message was sent by Atlassian Jira (v8.20.10#820010)
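A minimal sketch of the guard this implies (class and field names are hypothetical, not the actual {{SourceCoordinator}} internals):
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed early return.
class WatermarkAnnouncer {
    private final Map<Integer, Long> subtaskWatermarks = new ConcurrentHashMap<>();

    void announceCombinedWatermark() {
        // Early return: with no subtasks there is nothing to aggregate or send,
        // so skip the heap/map accesses and logging entirely.
        if (subtaskWatermarks.isEmpty()) {
            return;
        }
        long min = subtaskWatermarks.values().stream().mapToLong(Long::longValue).min().getAsLong();
        // ... distribute 'min' as maxAllowedWatermark to each subtask ...
    }
}
{code}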
[jira] [Created] (FLINK-35932) Add REGEXP_COUNT function
Dylan He created FLINK-35932: Summary: Add REGEXP_COUNT function Key: FLINK-35932 URL: https://issues.apache.org/jira/browse/FLINK-35932 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He Add REGEXP_COUNT function.

Returns the number of times {{str}} matches the {{regexp}} pattern.

Example:
{code:sql}
> SELECT REGEXP_COUNT('Steven Jones and Stephen Smith are the best players', 'Ste(v|ph)en');
2
> SELECT REGEXP_COUNT('Mary had a little lamb', 'Ste(v|ph)en');
0
{code}

Syntax:
{code:sql}
REGEXP_COUNT(str, regexp)
{code}

Arguments:
* {{str}}: A STRING expression to be matched.
* {{regexp}}: A STRING expression with a matching pattern.

Returns:
An INTEGER. {{regexp}} must be a Java regular expression. When using literals, use `raw-literal` (`r` prefix) to avoid escape character pre-processing. If either argument is {{NULL}}, the result is {{NULL}}.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/regexp_count.html]
* [PostgreSQL|https://www.postgresql.org/docs/16/functions-string.html]
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35931) Add REGEXP_EXTRACT_ALL function
Dylan He created FLINK-35931: Summary: Add REGEXP_EXTRACT_ALL function Key: FLINK-35931 URL: https://issues.apache.org/jira/browse/FLINK-35931 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He Add REGEXP_EXTRACT_ALL function.

Extracts all of the strings in {{str}} that match the {{regexp}} expression and correspond to the regex group {{idx}}.

Example:
{code:sql}
> SELECT REGEXP_EXTRACT_ALL('100-200, 300-400', '(\\d+)-(\\d+)', 1);
[100, 300]
> SELECT REGEXP_EXTRACT_ALL('100-200, 300-400', r'(\d+)-(\d+)', 1);
["100","300"]
{code}

Syntax:
{code:sql}
REGEXP_EXTRACT_ALL(str, regexp[, idx])
{code}

Arguments:
* {{str}}: A STRING expression to be matched.
* {{regexp}}: A STRING expression with a matching pattern.
* {{idx}}: An optional INTEGER expression greater than or equal to 0, with default 1.

Returns:
An ARRAY. {{regexp}} must be a Java regular expression. When using literals, use `raw-literal` (`r` prefix) to avoid escape character pre-processing. {{regexp}} may contain multiple groups. {{idx}} indicates which regex group to extract; an {{idx}} of 0 means matching the entire regular expression.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/regexp_extract_all.html]
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35930) Add FORMAT_STRING function
Dylan He created FLINK-35930: Summary: Add FORMAT_STRING function Key: FLINK-35930 URL: https://issues.apache.org/jira/browse/FLINK-35930 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He Add FORMAT_STRING function, the same as in Spark. This function is a synonym for the PRINTF function.

Returns a formatted string from printf-style format strings.

Example:
{code:sql}
> SELECT FORMAT_STRING('Hello World %d %s', 100, 'days');
Hello World 100 days
{code}

Syntax:
{code:sql}
FORMAT_STRING(strfmt, obj...)
{code}

Arguments:
* {{strfmt}}: A STRING expression.
* {{obj}}: ANY expression.

Returns:
A STRING.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/format_string.html]
* [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-StringFunctions] printf
* [PostgreSQL|https://www.postgresql.org/docs/16/functions-string.html] format
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35929) In flink insert mode, it supports modifying the parallelism of jdbc sink when the parallelism of source and sink is the same.
Qiu created FLINK-35929: --- Summary: In flink insert mode, it supports modifying the parallelism of jdbc sink when the parallelism of source and sink is the same. Key: FLINK-35929 URL: https://issues.apache.org/jira/browse/FLINK-35929 Project: Flink Issue Type: Improvement Components: Connectors / JDBC Affects Versions: jdbc-3.1.2 Reporter: Qiu Attachments: image-2024-07-30-19-57-45-033.png, image-2024-07-30-19-57-50-868.png In insert mode, when the source and sink parallelism are consistent, reducing or increasing the JDBC sink parallelism makes SQL validation report an error. The error message is:
{code}
configured sink parallelism is: 8, while the input parallelism is: -1. Since configured parallelism is different from input parallelism and the changelog mode contains [INSERT,UPDATE_AFTER,DELETE], which is not INSERT_ONLY mode, primary key is required but no primary key is found!
{code}
{code:java}
// code placeholder (translated from the original Chinese comment)
// module: flink-connector-jdbc
// class: org.apache.flink.connector.jdbc.table.JdbcDynamicTableSink
public ChangelogMode getChangelogMode(ChangelogMode requestedMode) {
    validatePrimaryKey(requestedMode);
    ChangelogMode.Builder changelogModeBuilder =
            ChangelogMode.newBuilder().addContainedKind(RowKind.INSERT);
    if (tableSchema.getPrimaryKey().isPresent()) {
        changelogModeBuilder
                .addContainedKind(RowKind.UPDATE_AFTER)
                .addContainedKind(RowKind.DELETE);
    }
    return changelogModeBuilder.build();
}
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35928) ForSt supports compiling with RocksDB
Hangxiang Yu created FLINK-35928: Summary: ForSt supports compiling with RocksDB Key: FLINK-35928 URL: https://issues.apache.org/jira/browse/FLINK-35928 Project: Flink Issue Type: Sub-task Reporter: Hangxiang Yu Assignee: Hangxiang Yu -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35927) Support closing ForSt meta file when using object storage
Hangxiang Yu created FLINK-35927: Summary: Support closing ForSt meta file when using object storage Key: FLINK-35927 URL: https://issues.apache.org/jira/browse/FLINK-35927 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends Reporter: Hangxiang Yu Assignee: Hangxiang Yu -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35926) During rescale, jobmanager has incorrect judgment logic for the max parallelism.
yuanfenghu created FLINK-35926: -- Summary: During rescale, jobmanager has incorrect judgment logic for the max parallelism. Key: FLINK-35926 URL: https://issues.apache.org/jira/browse/FLINK-35926 Project: Flink Issue Type: Bug Components: Runtime / Task Affects Versions: 1.19.1 Environment: flink-1.19.1 There is a high probability that 1.18 has the same problem Reporter: yuanfenghu Attachments: image-2024-07-30-14-56-48-931.png, image-2024-07-30-14-59-26-976.png, image-2024-07-30-15-00-28-491.png

When I was using the adaptive scheduler and modified the job's parallelism through the REST API, incorrect decision logic caused the job to fail.

h2. Reproduce:

When I start a simple job with a parallelism of 128, the max parallelism of the job will be set to 256 (through Flink's internal calculation logic). Then I take a savepoint of the job, modify the parallelism of the job to 1, and restore the job from the savepoint. At this time, the max parallelism of the job is still 256:
!image-2024-07-30-14-56-48-931.png!
This is as expected. At this time I call the REST API to increase the parallelism to 129 (which is obviously reasonable, since it is below the max parallelism of 256), but the job throws an exception after restarting:
!image-2024-07-30-14-59-26-976.png!
At this time, when viewing the detailed information of the job, it is found that the max parallelism has changed to 128:
!image-2024-07-30-15-00-28-491.png!
This can be reproduced stably locally.

h3. Causes:

In AdaptiveScheduler we recalculate the job's `VertexParallelismStore`, which results in the restarted job having the wrong max parallelism. This seems to be related to FLINK-21844 and FLINK-22084.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
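As background on where the 256 comes from (an editor's note, not from the ticket): Flink's default max-parallelism heuristic, as in {{KeyGroupRangeAssignment#computeDefaultMaxParallelism}}, rounds 1.5x the operator parallelism up to the next power of two and clamps it to [128, 32768]; treat the following self-contained check as illustrative:
{code:java}
import org.apache.flink.util.MathUtils;

public class MaxParallelismCheck {
    public static void main(String[] args) {
        int parallelism = 128;
        // 1.5 * parallelism, rounded up to the next power of two, clamped to [2^7, 2^15]
        int defaultMax =
                Math.min(
                        Math.max(MathUtils.roundUpToPowerOfTwo(parallelism + parallelism / 2), 1 << 7),
                        1 << 15);
        System.out.println(defaultMax); // prints 256 for parallelism 128, matching the report
    }
}
{code}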
[jira] [Created] (FLINK-35925) Remove hive connector from Flink main repo
Zhenqiu Huang created FLINK-35925: - Summary: Remove hive connector from Flink main repo Key: FLINK-35925 URL: https://issues.apache.org/jira/browse/FLINK-35925 Project: Flink Issue Type: Technical Debt Components: Connectors / Hive Affects Versions: 2.0.0 Reporter: Zhenqiu Huang Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35924) Improve the SourceReaderBase to support the RecordsWithSplitIds share internal buffer from SplitReader.
Jiangjie Qin created FLINK-35924: Summary: Improve the SourceReaderBase to support the RecordsWithSplitIds share internal buffer from SplitReader. Key: FLINK-35924 URL: https://issues.apache.org/jira/browse/FLINK-35924 Project: Flink Issue Type: Improvement Components: Connectors / Common Affects Versions: 1.20.0 Reporter: Jiangjie Qin

Recently, we saw corrupted {{RecordsWithSplitIds}} in one of our Iceberg source implementations. The problem is the following sequence:
# The {{SourceReaderBase}} does not have anything in the element queue, and the current fetch has been drained.
# The main thread invokes {{SourceReaderBase.pollNext()}}, sees nothing to consume, and goes into the {{SourceReaderBase.finishedOrAvailableLater()}} call.
# A {{SplitFetcher}} thread just finished consuming from a split and put the last batch of records into the element queue. The SplitFetcher becomes idle after that, i.e. it has no assigned splits and no pending tasks.
# The main thread invokes {{SplitFetcherManager.maybeShutdownFinishedFetchers()}} to shut down idle fetchers. So the {{SplitFetcher}} in (3) is shut down. As a result the {{SplitReader}} is also closed.
# The main thread then sees a non-empty element queue, with the last batch put by the {{SplitFetcher}} which has just shut down.
# The main thread invokes {{SourceReaderBase.pollNext()}} again to process the last batch, but receives an exception, because the records contained in the batch have been released as a part of the {{SplitReader}} closure in (4).

The problem here is that the {{SourceReaderBase}} implementation assumes that once a batch of records (i.e. {{RecordsWithSplitIds}}) is generated, these records are available even after the {{SplitReader}} which generated them is closed. This assumption forces some of the connector implementations to copy the records fetched from the {{SplitReader}}, and hence introduces additional overhead. This patch aims to improve the {{SourceReaderBase}} to ensure that a {{SplitReader}} will not be closed until all the records it has emitted have been processed. There is also a somewhat orthogonal problem: the {{SplitFetchers}} are being closed too aggressively. We've seen that in most cases a dedicated SplitFetcher is created to handle a split and closed right away after the current split is finished, but before the next split arrives. I'll create a separate patch to introduce a short timeout before we declare a {{SplitFetcher}} thread idle.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
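One way to provide the "do not close until all emitted records are processed" guarantee is to reference-count the batches a reader has emitted and defer the actual close. A minimal sketch under that assumption (all names hypothetical; this is not the actual patch):

{code:java}
public final class RefCountedReaderHandle {
    private final AutoCloseable reader;
    private int outstandingBatches = 0;
    private boolean closeRequested = false;

    public RefCountedReaderHandle(AutoCloseable reader) {
        this.reader = reader;
    }

    /** Fetcher thread: called whenever a batch is put into the element queue. */
    public synchronized void batchEmitted() {
        outstandingBatches++;
    }

    /** Main thread: called once a batch has been fully consumed and recycled. */
    public synchronized void batchRecycled() throws Exception {
        if (--outstandingBatches == 0 && closeRequested) {
            reader.close();
        }
    }

    /** Instead of closing the reader directly, defer until nothing is outstanding. */
    public synchronized void requestClose() throws Exception {
        closeRequested = true;
        if (outstandingBatches == 0) {
            reader.close();
        }
    }
}
{code}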
[jira] [Created] (FLINK-35923) Add CompiledPlan annotations to BatchExecSort
Jim Hughes created FLINK-35923: -- Summary: Add CompiledPlan annotations to BatchExecSort Key: FLINK-35923 URL: https://issues.apache.org/jira/browse/FLINK-35923 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes Fix For: 2.0.0 In addition to the annotations, implement the BatchCompiledPlan test for this operator. Since this is the first operator, the exchange operator must be annotated as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35922) Add configuration options related to hive
Qiu created FLINK-35922: --- Summary: Add configuration options related to hive Key: FLINK-35922 URL: https://issues.apache.org/jira/browse/FLINK-35922 Project: Flink Issue Type: Improvement Components: Connectors / Hive Affects Versions: 1.19.1 Reporter: Qiu

Currently, we submit Flink jobs to YARN with the run-application target and need to specify some Hive-related configuration, because we use a distributed filesystem similar to Alibaba OSS to store resources; in this case we pass special configuration options and set them on the Hive configuration. To solve such problems, we can support configuration options prefixed with "flink.hadoop." (such as -Dflink.hadoop.xxx=yyy) and carry them into the HiveConf. A simple implementation is as follows:

{code:java}
// code placeholder
// module: flink-connectors/flink-connector-hive
// class: org.apache.flink.table.catalog.hive.HiveCatalog
public static HiveConf createHiveConf(@Nullable String hiveConfDir, @Nullable String hadoopConfDir) {
    ...
    String flinkConfDir = System.getenv(ConfigConstants.ENV_FLINK_CONF_DIR);
    if (flinkConfDir != null) {
        org.apache.flink.configuration.Configuration flinkConfiguration =
                GlobalConfiguration.loadConfiguration(flinkConfDir);
        // add all configuration keys with prefix 'flink.hadoop.' in Flink conf to Hadoop conf
        for (String key : flinkConfiguration.keySet()) {
            for (String prefix : FLINK_CONFIG_PREFIXES) {
                if (key.startsWith(prefix)) {
                    String newKey = key.substring(prefix.length());
                    String value = flinkConfiguration.getString(key, null);
                    hadoopConf.set(newKey, value);
                    LOG.debug(
                            "Adding Flink config entry for {} as {}={} to Hadoop config",
                            key, newKey, value);
                }
            }
        }
    }
}
{code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35921) Flink SQL physical operator replacement support
Wang Qilong created FLINK-35921: --- Summary: Flink SQL physical operator replacement support Key: FLINK-35921 URL: https://issues.apache.org/jira/browse/FLINK-35921 Project: Flink Issue Type: Bug Reporter: Wang Qilong Does Flink SQL provide any SPI implementations that support custom physical operators, such as customizing a FileSourceScanExec at the exec physical plan layer? I have been researching the Flink SQL source code recently to understand its execution process, and came up with the above question. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35920) Add PRINTF function
Dylan He created FLINK-35920: Summary: Add PRINTF function Key: FLINK-35920 URL: https://issues.apache.org/jira/browse/FLINK-35920 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He

Add the PRINTF function, the same as in Spark & Hive. It returns a formatted string from printf-style format strings.

Example:
{code:sql}
> SELECT PRINTF('Hello World %d %s', 100, 'days');
 Hello World 100 days
{code}

Syntax:
{code:sql}
PRINTF(strfmt[, obj1, ...])
{code}

Arguments:
* {{strfmt}}: A STRING expression.
* {{objN}}: A STRING or NUMERIC expression.

Returns: A STRING.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/printf.html]
* [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-StringFunctions]

-- This message was sent by Atlassian Jira (v8.20.10#820010)
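For intuition, the behavior matches Java's printf-style formatter, which is presumably what an implementation would delegate to; a tiny sketch of the example above:

{code:java}
public class PrintfExample {
    public static void main(String[] args) {
        // Java equivalent of: SELECT PRINTF('Hello World %d %s', 100, 'days');
        System.out.println(String.format("Hello World %d %s", 100, "days"));
        // prints: Hello World 100 days
    }
}
{code}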
[jira] [Created] (FLINK-35919) The rowkind keyword option for MySQL connector metadata should be changed to the op_type keyword
Thorne created FLINK-35919: -- Summary: The rowkind keyword option for MySQL connector metadata should be changed to the op_type keyword Key: FLINK-35919 URL: https://issues.apache.org/jira/browse/FLINK-35919 Project: Flink Issue Type: Improvement Components: Flink CDC Affects Versions: cdc-3.1.1 Reporter: Thorne Fix For: cdc-3.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35918) Migrate the Time to Duration for the flink-runtime module.
RocMarshal created FLINK-35918: -- Summary: Migrate the Time to Duration for the flink-runtime module. Key: FLINK-35918 URL: https://issues.apache.org/jira/browse/FLINK-35918 Project: Flink Issue Type: Technical Debt Affects Versions: 2.0.0 Reporter: RocMarshal -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35917) Failed to deserialize data of EventHeaderV4
wangkang created FLINK-35917: Summary: Failed to deserialize data of EventHeaderV4 Key: FLINK-35917 URL: https://issues.apache.org/jira/browse/FLINK-35917 Project: Flink Issue Type: Bug Components: Flink CDC Affects Versions: cdc-3.1.1 Environment: * Flink version : 1.15 * Flink CDC version: 3.1.1 * Database and version: mysql 5.7 Reporter: wangkang

*Environment:*
* Flink version: 1.15
* Flink CDC version: 3.1.1
* Database and version: mysql 5.7

{code}
[2024-07-28 20:26:33.471] [INFO] [flink-akka.actor.default-dispatcher-140] [org.apache.flink.runtime.executiongraph.ExecutionGraph ] >>> Job MySQL-Paimon Table Sync: vipdw_rt.sales_inv_hold_test (7cdbe1f87b1729196791b7c2506df5e0) switched from state FAILING to FAILED.
org.apache.flink.runtime.JobException: Recovery is suppressed by FailureRateRestartBackoffTimeStrategy(FailureRateRestartBackoffTimeStrategy(failuresIntervalMS=180,backoffTimeMS=1,maxFailuresPerInterval=3)
    at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:138) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:82) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:301) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:291) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:282) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:739) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:78) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:443) ~[flink-dist-1.15.3-vip.jar:1.15.3-vip]
    at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_262]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_262]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:304) ~[flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83) ~[flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:302) ~[flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217) ~[flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78) ~[flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163) ~[flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at akka.actor.Actor.aroundReceive(Actor.scala:537) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at akka.actor.Actor.aroundReceive$(Actor.scala:535) [flink-rpc-akka_e9ac714f-60da-4da5-93ce-bd11a5ce2af1.jar:1.15.3-vip]
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.
{code}
[jira] [Created] (FLINK-35916) AbstractJdbcRowConverter: type conversion to string
Evgeniy created FLINK-35916: --- Summary: AbstractJdbcRowConverter: type conversion to string Key: FLINK-35916 URL: https://issues.apache.org/jira/browse/FLINK-35916 Project: Flink Issue Type: Improvement Components: Connectors / JDBC Affects Versions: jdbc-3.2.0 Reporter: Evgeniy Fix For: jdbc-3.1.2, jdbc-3.2.0

*Problem:* When converting a value to a string, the Java object is explicitly cast to String, which generates a ClassCastException for non-String types.

*Effects:* Most Java types can be converted to a String one way or another. However, not all such types extend String or a similar class that can be cast to String. As Java suggests, any object can be converted to a string via Object#toString, unless its implementation implies the opposite. Explicit casting to String, however, makes this impossible for 99% of custom Java objects.

*Goal:* Call the Object#toString method for all objects that are supposed to be converted to a string.

*Solution:* To begin with, pay attention to the class that deals with type conversion, AbstractJdbcRowConverter, and specifically to the createInternalConverter method and the lambda expression that converts a value to a String type (CHAR, VARCHAR).

{code:java}
switch (type.getTypeRoot()) {
    ...
    case CHAR:
    case VARCHAR:
        return val -> StringData.fromString((String) val);
    ...
}
{code}

You can observe the (String) val cast, which is problematic. My intended solution is the following change:

{code:java}
return val -> StringData.fromString(val.toString());
{code}

val.toString() can give a new life to most objects that Flink cannot process by default, which solves at least one problem.

*For further research:* For example, consider the type used in the Postgres driver to wrap PostgreSQL types other than primitive ones, PGobject.

{code:java}
public class PGobject implements Serializable, Cloneable {
    protected @Nullable String type;
    protected @Nullable String value;
    ...
    public String toString() {
        return this.getValue();
    }
    ...
}
{code}

As you may have noticed, the PostgreSQL driver itself takes care of converting its types to a string, which is an excellent argument for using val.toString(). Perhaps in the future PGobject type processing can be added, as was done for PGarray, but it seems to me that the solution described above is the fastest and most optimal. Thanks!

-- This message was sent by Atlassian Jira (v8.20.10#820010)
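A minimal sketch of the proposed converter as a standalone helper (null handling added here as an assumption; the actual change would wire this into createInternalConverter):

{code:java}
import org.apache.flink.table.data.StringData;

public final class StringConversionSketch {
    // Uses Object#toString() instead of an explicit (String) cast, so
    // driver-specific types such as PGobject convert without a ClassCastException.
    public static StringData toStringData(Object val) {
        return val == null ? null : StringData.fromString(val.toString());
    }
}
{code}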
[jira] [Created] (FLINK-35915) Add MASK function
Dylan He created FLINK-35915: Summary: Add MASK function Key: FLINK-35915 URL: https://issues.apache.org/jira/browse/FLINK-35915 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He

Add the MASK function, the same as in Spark. It returns a masked version of the input {{str}}.

Example:
{code:sql}
> SELECT MASK('AaBb123-&^ % 서울 Ä');
 XxXxnnn-&^ % 서울 X
> SELECT MASK('AaBb123-&^ % 서울 Ä', 'Z', 'z', '9', 'X');
 ZzZz999XZ
{code}

Syntax:
{code:sql}
MASK(str[, upperChar[, lowerChar[, digitChar[, otherChar]]]])
{code}

Arguments:
* {{str}}: A STRING expression.
* {{upperChar}}: A single character STRING literal used to substitute upper case characters. The default is 'X'. If upperChar is NULL, upper case characters remain unmasked.
* {{lowerChar}}: A single character STRING literal used to substitute lower case characters. The default is 'x'. If lowerChar is NULL, lower case characters remain unmasked.
* {{digitChar}}: A single character STRING literal used to substitute digits. The default is 'n'. If digitChar is NULL, digits remain unmasked.
* {{otherChar}}: A single character STRING literal used to substitute any other character. The default is NULL, which leaves these characters unmasked.

Returns: A STRING.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/mask.html]

-- This message was sent by Atlassian Jira (v8.20.10#820010)
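A sketch of the masking semantics described above (plain Java, not the Flink implementation; per the argument descriptions, a NULL substitution character means "leave unmasked"):

{code:java}
public class MaskSketch {
    public static String mask(
            String str, Character upper, Character lower, Character digit, Character other) {
        if (str == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isUpperCase(c)) {
                sb.append(upper != null ? upper : c);
            } else if (Character.isLowerCase(c)) {
                sb.append(lower != null ? lower : c);
            } else if (Character.isDigit(c)) {
                sb.append(digit != null ? digit : c);
            } else {
                sb.append(other != null ? other : c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Defaults: 'X' for upper case, 'x' for lower case, 'n' for digits,
        // and other characters left unmasked.
        System.out.println(mask("AaBb123-&^", 'X', 'x', 'n', null)); // XxXxnnn-&^
    }
}
{code}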
[jira] [Created] (FLINK-35914) Flink 1.18 runtime not supporting java 17
sushil_karwasra created FLINK-35914: --- Summary: Flink 1.18 runtime not supporting java 17 Key: FLINK-35914 URL: https://issues.apache.org/jira/browse/FLINK-35914 Project: Flink Issue Type: Bug Components: API / Core Affects Versions: 1.18.1 Reporter: sushil_karwasra I have a Kinesis stream on AWS, and the runtime I'm using is Flink 1.18. When I compiled my application code with Java 17, deployed it, and started the Kinesis stream application, I encountered the following error: Caused by: java.lang.UnsupportedClassVersionError: KinesisToSqsStreamingJob has been compiled by a more recent version of the Java Runtime (class file version 61.0), this version of the Java Runtime only recognizes class file versions up to 55.0. (Class file version 61.0 corresponds to Java 17, while 55.0 corresponds to Java 11, so the cluster's JVM is running Java 11.) Does this mean that Flink does not yet support Java 17 compiled application code? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35913) The issue of using StreamExecutionEnvironment to execute multiple jobs in a program on Flink CDC
huxx created FLINK-35913: Summary: The issue of using StreamExecutionEnvironment to execute multiple jobs in a program on Flink CDC Key: FLINK-35913 URL: https://issues.apache.org/jira/browse/FLINK-35913 Project: Flink Issue Type: Bug Components: Flink CDC Reporter: huxx Attachments: image-2024-07-29-12-23-55-766.png, image-2024-07-29-12-25-19-146.png

I integrated Flink CDC with Spring Boot and attempted to listen to multiple data sources in a single program, using one StreamExecutionEnvironment to execute multiple jobs built from different data sources. However, when I started the second job, I received multiple warnings like this:

WARN 21828 --- [rce-coordinator] io.debezium.metrics.Metrics : Unable to register metrics as an old set with the same name exists, retrying in PT5S (attempt 1 out of 12)

The final result was that my second job was not executed. Does this mean my StreamExecutionEnvironment registered duplicate parameters? I don't know how to solve this problem. Can't a single StreamExecutionEnvironment execute multiple jobs? Thank you for your reply. !image-2024-07-29-12-23-55-766.png! !image-2024-07-29-12-25-19-146.png!

-- This message was sent by Atlassian Jira (v8.20.10#820010)
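For reference, submitting several independent jobs from one program is generally done with executeAsync(), which submits the pipeline built so far and resets the environment's builder state; a minimal sketch (the pipelines are hypothetical stand-ins for the CDC sources):

{code:java}
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MultiJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // First pipeline; executeAsync() submits it as its own job.
        env.fromElements(1, 2, 3).print();
        JobClient first = env.executeAsync("job-1");

        // Second, independent pipeline from the same environment.
        env.fromElements(4, 5, 6).print();
        JobClient second = env.executeAsync("job-2");
    }
}
{code}

The Debezium warning quoted above appears to come from duplicate metric names (two sources registering MBeans with the same name), which is a separate concern from how the jobs are submitted.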
[jira] [Created] (FLINK-35912) SqlServer CDC doesn't chunk UUID-typed columns correctly
yux created FLINK-35912: --- Summary: SqlServer CDC doesn't chunk UUID-typed columns correctly Key: FLINK-35912 URL: https://issues.apache.org/jira/browse/FLINK-35912 Project: Flink Issue Type: Bug Components: Flink CDC Reporter: yux As reported by GitHub user @LiPL2017, SqlServer CDC doesn't chunk UUID-typed columns correctly since UUID comparison isn't implemented correctly[1]. [1] https://learn.microsoft.com/en-us/sql/connect/ado-net/sql/compare-guid-uniqueidentifier-values?view=sql-server-ver16 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35911) Flink kafka-source connector
Gang Yang created FLINK-35911: - Summary: Flink kafka-source connector Key: FLINK-35911 URL: https://issues.apache.org/jira/browse/FLINK-35911 Project: Flink Issue Type: Bug Components: Connectors / Kafka Affects Versions: kafka-3.0.1 Reporter: Gang Yang

Problem description: The business reported data loss. Investigation showed that data in some partitions of the upstream Kafka source was never consumed; the job was also restarted from state several times, but the problem persisted.

Workaround: To troubleshoot, we later set pipeline.operator-chaining = 'false' and restarted the job from state, after which the job returned to normal.

Scenario: Kafka writes to HDFS. The source is defined as follows:

{code:sql}
-- code placeholder
CREATE TABLE `play_log_source` (
  `appName` VARCHAR,
  `appInfo.channel` VARCHAR,
  `channel_name` AS `appInfo.channel`,
  `appInfo.packageName` VARCHAR,
  `package_name` AS `appInfo.packageName`,
  `deviceInfo.deviceId` VARCHAR,
  `device_id` AS `deviceInfo.deviceId`,
  `deviceInfo.deviceName` VARCHAR
) WITH (
  'nested-json.key.fields.deserialize-min.enabled' = 'true',
  'connector' = 'kafka',
  'format' = 'nested-json'
);
{code}

Flink version: 1.18.1

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35910) Add BTRIM function
Dylan He created FLINK-35910: Summary: Add BTRIM function Key: FLINK-35910 URL: https://issues.apache.org/jira/browse/FLINK-35910 Project: Flink Issue Type: Sub-task Components: Table SQL / API Reporter: Dylan He

Add the BTRIM function, the same as in Spark, where 'B' stands for BOTH. It returns {{str}} with leading and trailing characters removed.

Syntax:
{code:sql}
BTRIM(str [, trimStr])
{code}

Arguments:
* {{str}}: A STRING expression.
* {{trimStr}}: An optional STRING expression with characters to be trimmed. The default is a space character.

Returns: A STRING.

See also:
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#string-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/btrim.html]

-- This message was sent by Atlassian Jira (v8.20.10#820010)
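A sketch of the trimming semantics, assuming (as in Spark) that {{trimStr}} is treated as a set of characters rather than a substring:

{code:java}
public class BtrimSketch {
    public static String btrim(String str, String trimStr) {
        if (str == null) {
            return null;
        }
        String trim = (trimStr == null) ? " " : trimStr;
        int begin = 0;
        int end = str.length();
        // Strip leading characters contained in trimStr.
        while (begin < end && trim.indexOf(str.charAt(begin)) >= 0) {
            begin++;
        }
        // Strip trailing characters contained in trimStr.
        while (end > begin && trim.indexOf(str.charAt(end - 1)) >= 0) {
            end--;
        }
        return str.substring(begin, end);
    }

    public static void main(String[] args) {
        System.out.println(btrim("  hello  ", null)); // "hello"
        System.out.println(btrim("xyhelloyx", "xy")); // "hello"
    }
}
{code}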
[jira] [Created] (FLINK-35909) JdbcFilterPushdownPreparedStatementVisitorTest fails
Rui Fan created FLINK-35909: --- Summary: JdbcFilterPushdownPreparedStatementVisitorTest fails Key: FLINK-35909 URL: https://issues.apache.org/jira/browse/FLINK-35909 Project: Flink Issue Type: Bug Components: Connectors / JDBC Reporter: Rui Fan JdbcFilterPushdownPreparedStatementVisitorTest fails after FLINK-35363 was merged. Hey [~eskabetxe], would you mind taking a look at this issue in your free time? Thanks~ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35908) Flink log can not output method name and line number.
yiheng tang created FLINK-35908: --- Summary: Flink log can not output method name and line number. Key: FLINK-35908 URL: https://issues.apache.org/jira/browse/FLINK-35908 Project: Flink Issue Type: Bug Reporter: yiheng tang

log4j.properties:
{code}
appender.main.name = MainAppender
appender.main.type = RollingFile
appender.main.append = false
appender.main.fileName = ${sys:log.file}
appender.main.filePattern = ${sys:log.file}.%i
appender.main.layout.type = PatternLayout
appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} [%t] %-5p %c{1} (%M:%L) - %m%n
appender.main.policies.type = Policies
appender.main.policies.size.type = SizeBasedTriggeringPolicy
appender.main.policies.size.size = 100MB
appender.main.strategy.type = DefaultRolloverStrategy
appender.main.strategy.max = 5
{code}

But the Flink JobManager log output is as follows:

{code:java}
2024-07-28 22:00:12,489 [flink-akka.actor.default-dispatcher-3] INFO ExecutionGraph (:) - Source_Custom_Source -> Flat_Map(9/16) - execution #0 (afde77b9b3ac94a2ffab46e5f96dc14c) switched from RECOVERING to RUNNING.
2024-07-28 22:00:12,489 [flink-akka.actor.default-dispatcher-21] INFO ExecutionGraph (:) - Source_Custom_Source -> Flat_Map(13/16) - execution #0 (e0e7ce36f7d40372b71f956f7c19dbd3) switched from RECOVERING to RUNNING.
2024-07-28 22:00:12,524 [flink-akka.actor.default-dispatcher-21] INFO ExecutionGraph (:) - Window_TumblingProce -> Map -> (Sink_Print_to_Std_Ou, Sink_Unnamed)(15/16) - execution #0 (dc904d83c235247271caa2c8ef7be1ed) switched from RECOVERING to RUNNING.
{code}

The Flink logs do not output the method name and line number, even though the pattern contains (%M:%L).

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35907) Inconsistent handling of bit and bit(n) types in Flink CDC PostgreSQL Connector
jiangcheng created FLINK-35907: -- Summary: Inconsistent handling of bit and bit(n) types in Flink CDC PostgreSQL Connector Key: FLINK-35907 URL: https://issues.apache.org/jira/browse/FLINK-35907 Project: Flink Issue Type: Bug Environment: I use the following Maven dependency:

{code:xml}
<dependency>
    <groupId>com.ververica</groupId>
    <artifactId>flink-connector-postgres-cdc</artifactId>
    <version>2.4-vvr-8.0-metrics-SNAPSHOT</version>
</dependency>
{code}

PostgreSQL Server version: I have tested on 14.0, 15.0, and 16.0. This error occurs regardless of the PG version. JDK: Java 8. Reporter: jiangcheng

I am encountering issues with the Flink CDC PostgreSQL Connector's handling of {{bit}} and {{bit(n)}} data types, which seem to deviate from expected behavior, leading to inconsistencies and errors during both the snapshot and incremental sync phases.

h1. Problems

h2. Problem 1: Misinterpretation of {{bit}} and {{bit(1)}}

During data retrieval, {{bit}} and {{bit(1)}} types are interpreted as {{org.apache.kafka.connect.data.Schema.BOOLEAN}}, mirroring the treatment of PostgreSQL's native {{bool}} or {{boolean}} types. This may lead to loss of precision when the actual intention is to preserve the bit pattern rather than a simple true/false value.

h2. Problem 2: Error with {{bit(n)}} during the snapshot phase

For {{bit(n)}} where {{n > 1}}, an error is encountered during the snapshot phase within {{PostgresScanFetchTask}}. The issue arises from attempting to read these values using {{PgResult#getBoolean}}, which is inappropriate for {{bit(n)}} types that do not represent a standard boolean state. Strangely, this error does not surface during the incremental phase, because {{PgResult#getBoolean}} is not invoked there.

h1. My Analysis

h2. BIT type is interpreted as BOOLEAN

Diving into the code reveals that in both scenarios the connector relies on {{PgResult#getObject}}, which internally identifies {{bit}} and {{bit(n)}} as {{java.sql.Type.BOOLEAN}}. This misclassification triggers the problematic usage of {{getBoolean}} for non-standard boolean representations like {{bit(n)}}.

h2. Inconsistency between snapshot and incremental phases

The discrepancy between the snapshot phase and the incremental phase is noteworthy. During the snapshot phase, errors manifest due to direct interaction with {{PgResult#getObject}} in {{PostgresScanFetchTask#createDataEventsForTable}}, where {{PgResult#getObject}} is further forwarded to {{PgResult#getBoolean}}. Conversely, in the incremental phase, {{bit(n)}} values are coerced into {{org.apache.kafka.connect.data.Schema.BYTES}}, resulting in a loss of the original {{n}} precision information. This forces consumers to assume an 8-bit byte array representation, obscuring the intended bit-width and potentially leading to incorrect interpretations (e.g., I insert a {{bit(10)}} value '000111' into the PostgreSQL server, and it is represented as a byte array of length 1 in SourceRecord whose first element is 127).

h1. My Opinion

From my perspective, the following approaches may solve this problem.

*Consistent handling:* The connector should uniformly and accurately handle all {{bit}} variants, respecting their distinct PostgreSQL definitions.

*Preserve precision:* For {{bit(n)}}, ensure that the precision {{n}} is maintained throughout processing, allowing consumers to correctly interpret the intended bit sequence without assuming a default byte size.

*Schema transparency:* Enhance metadata handling to communicate the original {{bit(n)}} schema accurately to downstream systems, enabling them to process the data with full knowledge of its intended format.

h1. Conclusion

Addressing these discrepancies will not only improve the reliability of the Flink CDC PostgreSQL Connector but also enhance its compatibility with a broader range of use cases that rely on precise {{bit}} data handling. I look forward to a resolution that ensures consistent and accurate processing across both snapshot and incremental modes. Thank you for considering this issue; I'm available to provide further details or assist in any way possible.

h1. Reproduce the Error

I use the following code to read records from the PostgreSQL connector by implementing the deserialize method in {{DebeziumDeserializationSchema}}:

{code:java}
public class PostgreSqlRecordSourceDeserializeSchema implements DebeziumDeserializationSchema {
    public void deserialize(SourceRecord sourceRecord, Collector out) throws Exception {
        // skipping irrelevant business logic ...
        Struct rowValue = ((Struct) sourceRecord.value()).getStruct(Envelope.FieldName.AFTER);
        for (Field field : rowValue.schema().fields()) {
            switch (field.schema().type()) {
                case BOOLEAN:
                    // handli
{code}
[jira] [Created] (FLINK-35906) Inconsistent timestamp precision handling in Flink CDC PostgreSQL Connector
jiangcheng created FLINK-35906: -- Summary: Inconsistent timestamp precision handling in Flink CDC PostgreSQL Connector Key: FLINK-35906 URL: https://issues.apache.org/jira/browse/FLINK-35906 Project: Flink Issue Type: Bug Components: Flink CDC Environment: I use the following Maven dependency:

{code:xml}
<dependency>
    <groupId>com.ververica</groupId>
    <artifactId>flink-connector-postgres-cdc</artifactId>
    <version>2.4-vvr-8.0-metrics-SNAPSHOT</version>
</dependency>
{code}

PostgreSQL Server version: I have tested on 14.0, 15.0, and 16.0. This error occurs regardless of the PG version. JDK: Java 8. Reporter: jiangcheng

I have encountered an inconsistency issue with the Flink CDC PostgreSQL Connector when it comes to handling different time types, specifically time, timetz, timestamp, and timestamptz. The problem revolves around the precision of these time-related values during both the snapshot and incremental phases.

h1. Issue Details

*time type:* During the snapshot phase, the precision is reduced to milliseconds (ms), whereas in the incremental phase the correct microsecond (micros) precision is maintained.

*timetz type:* This is where the biggest discrepancy arises. In the snapshot phase, the precision drops to seconds (s), and the time is interpreted according to the system's timezone, leading to potential misinterpretation of the actual stored value. Conversely, in the incremental phase, the precision increases to milliseconds (ms), but the timezone is fixed at 0 (UTC), causing further discrepancies. An illustrative example: inserting 10:13:02.264525+08 into PostgreSQL is retrieved as 10:13:02Z during the snapshot, while incrementally it appears as 02:13:02.264525Z. I use the code shown below to read the data.

*timestamp type:* In both snapshot and incremental modes, the precision is consistently at the microsecond level, which aligns with expectations.

*timestamptz type:* Contrary to expectations, both phases yield a precision reduced to milliseconds (ms), deviating from the native microsecond precision supported by PostgreSQL for this type.

h1. Expected Behavior

The Flink CDC PostgreSQL Connector should maintain the native precision provided by PostgreSQL for all time-related data types across both snapshot and incremental phases, ensuring that time and timestamptz types are accurately represented down to microseconds, and that timetz correctly handles timezone information alongside its precision.

h1. Reproduce the Error

I use the following code to read records from the PostgreSQL connector by implementing the deserialize method in {{DebeziumDeserializationSchema}}:

{code:java}
public class PostgreSqlRecordSourceDeserializeSchema implements DebeziumDeserializationSchema {
    public void deserialize(SourceRecord sourceRecord, Collector out) throws Exception {
        // skipping irrelevant business logic ...
        Struct rowValue = ((Struct) sourceRecord.value()).getStruct(Envelope.FieldName.AFTER);
        for (Field field : rowValue.schema().fields()) {
            switch (field.schema().type()) {
                case INT64:
                    if (StringUtils.equals(field.schema().name(), MicroTime.class.getName())) {
                        // handling time type
                        Long value = rowValue.getInt64(field.name());
                    } else if (StringUtils.equals(field.schema().name(), MicroTimestamp.class.getName())) {
                        // handling timestamp type
                        Long value = rowValue.getInt64(field.name());
                    } else {
                        // skipping irrelevant business logic ...
                    }
                    break;
                case STRING:
                    if (StringUtils.equals(field.schema().name(), ZonedTimestamp.class.getName())) {
                        // handling timestamptz type
                        String value = rowValue.getString(field.name());
                    } else if (StringUtils.equals(ZonedTime.class.getName(), field.schema().name())) {
                        // handling timetz type
                        String value = rowValue.getString(field.name());
                    } else {
                        // skipping irrelevant business logic ...
                    }
                    break;
                // skipping irrelevant business logic ...
            }
            // skipping irrelevant business logic ...
        }
    }
}
{code}

h1. Version

I use the Maven dependency and PostgreSQL/JDK versions listed in the Environment section above.

h1. Offer for Assistance

I am willing to provide additional test scenarios or results to help diagnose this issue further. Moreover, I am open to collaborating on reviewing potential fixes or providing any
[jira] [Created] (FLINK-35905) Flink physical operator replacement support
Wang Qilong created FLINK-35905: --- Summary: Flink physical operator replacement support Key: FLINK-35905 URL: https://issues.apache.org/jira/browse/FLINK-35905 Project: Flink Issue Type: Bug Components: API / Scala, Table SQL / API Affects Versions: 1.15.0 Environment: Flink 1.15 and so on Reporter: Wang Qilong I have been studying the Flink SQL source code recently and learning about its execution process, which has led to a question: does Flink SQL provide any SPI implementations that support custom physical operators, such as customizing a FileSourceScanExec at the exec physical plan layer? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35904) Test harness for async state processing operators
Zakelly Lan created FLINK-35904: --- Summary: Test harness for async state processing operators Key: FLINK-35904 URL: https://issues.apache.org/jira/browse/FLINK-35904 Project: Flink Issue Type: Sub-task Reporter: Zakelly Lan We need a test harness for async state processing operators, which could simulate the mailbox and the task thread. This is essential for further SQL operator development. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35903) Add CentralizedSlotAssigner support to adaptive scheduler
yuanfenghu created FLINK-35903: -- Summary: Add CentralizedSlotAssigner support to adaptive scheduler Key: FLINK-35903 URL: https://issues.apache.org/jira/browse/FLINK-35903 Project: Flink Issue Type: Improvement Components: Runtime / Coordination Reporter: yuanfenghu

h2. *Question:*

When using the adaptive scheduler, reducing a job's parallelism via the REST API triggers the job to restart. If the numberOfTaskSlots of my TaskManagers is greater than 1, some TaskManager slots may be left idle, preventing the TaskManager resources from being released.

h3. *How can this issue be resolved:*

We need a new SlotAssigner strategy. When a job triggers the above process, slots should be requested in a centralized way, concentrated on as few TaskManagers as possible, to ensure we can maximize the release of unnecessary resources.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35902) WebUI Metrics RangeError on huge parallelism
Sergey Paryshev created FLINK-35902: --- Summary: WebUI Metrics RangeError on huge parallelism Key: FLINK-35902 URL: https://issues.apache.org/jira/browse/FLINK-35902 Project: Flink Issue Type: Bug Components: Runtime / Web Frontend Affects Versions: 1.19.1, 1.18.1, 1.17.2, 1.20.0 Reporter: Sergey Paryshev Attachments: flink-high-parallelism-webui.png Displaying metrics for jobs with high parallelism results in a RangeError (the JavaScript analogue of a StackOverflow). To reproduce the error locally, it is enough to launch any job where one of the operators has a parallelism of 1600 or higher and open the metrics menu. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35901) Setting web.submit.enable to false prevents FlinkSessionJobs from working
Ralph Blaise created FLINK-35901: Summary: Setting web.submit.enable to false prevents FlinkSessionJobs from working Key: FLINK-35901 URL: https://issues.apache.org/jira/browse/FLINK-35901 Project: Flink Issue Type: Bug Components: API / Core Affects Versions: 1.19.1 Environment: Azure Kubernetes Service 1.28.5 Reporter: Ralph Blaise Setting web.submit.enable to false in a FlinkDeployment in Kubernetes prevents FlinkSessionJobs for it from working. It instead results in the error below:

{"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException: [POST request not allowed]"}

I have to re-enable web.submit.enable in order for FlinkSessionJobs to work. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35900) PyFlink: update version of Apache Beam dependency
Lydia L created FLINK-35900: --- Summary: PyFlink: update version of Apache Beam dependency Key: FLINK-35900 URL: https://issues.apache.org/jira/browse/FLINK-35900 Project: Flink Issue Type: Bug Affects Versions: 1.19.1 Reporter: Lydia L PyFlink relies on Apache Beam
[jira] [Created] (FLINK-35899) Accumulated TumblingProcessingTimeWindows in the state for a few days
Sergey Anokhovskiy created FLINK-35899: -- Summary: Accumulated TumblingProcessingTimeWindows in the state for a few days Key: FLINK-35899 URL: https://issues.apache.org/jira/browse/FLINK-35899 Project: Flink Issue Type: Bug Components: API / Core, Runtime / Checkpointing Affects Versions: 1.13.5 Reporter: Sergey Anokhovskiy

One of the 40 sub-tasks of the TumblingProcessingTimeWindows operator accumulated windows for over a day. The next restart of the job caused it to process the accumulated windows, which caused the checkpointing timeout. Once the sub-task had processed the old windows (which might take several hours), it worked normally again. *Could you please come up with ideas of what might cause the window operator sub-task to accumulate old windows for days?*

Here is more context: At Yelp we built a Flink-based connector to the database. We aimed to reduce the load on the database; that's why a time window with a reduce function was introduced, since only the latest version of a document matters to us. Here is the configuration of the window:

{code}
private def windowedStream(input: DataStream[FieldsToIndex]) = {
  input
    .keyBy(f => f.id)
    .window(TumblingProcessingTimeWindows.of(
      seconds(elasticPipeJobConfig.deduplicationWindowTimeInSec)))
    .reduce((e1, e2) => {
      if (e1.messageTimestamp > e2.messageTimestamp) e1 else e2
    })
}
{code}

It works as expected most of the time, but a few times per year one sub-task of the dedup_window operator gets stuck and causes checkpointing to fail. We took a look at the state data, added extra logging to the custom trigger, and here is what we found:
# It turned out that the state of the 17th (a different number every incident) sub-task is more than 100 times bigger than the others (see the txt file). It caused the job, and that particular sub-task, to initialize slowly (see the screenshot).
# Statistics of RocksDB tables: _timer_state/processing_window-timers was ~8MB, _timer_state/event_window-timers was 0, window-contents was ~20GB with ~960k entries and ~14k unique message ids. Counts by id were distributed (see ids_destribution.txt).
# Each window-contents value has an associated timer entry in _timer_state/processing_window-timers. The timers accumulated gradually; see timers.txt for counts bucketed by time (Pacific).
# The earliest entries are from 9:23am Pacific on June 26th, over a day before the incident. The Flink log showed that a TaskManager went away at 9:28, forcing a job restart (see taskmanager_exception.txt). The job came back up at ~9:41am.
# Debug logs in the custom trigger, in the onClear/onProcessingTime/onElement/onEventTime functions, confirmed that the job is busy processing the old windows.
# It seems that the sub-task was in a bad state after the restart. It is unclear if it was stuck not processing any window events, or if it was just not removing them from the state. The next time the job restarted it had to churn through a day's worth of writes, causing the delay.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35898) When running Flink on Yarn, if the log file path contains a space, the startup will fail.
yiheng tang created FLINK-35898: --- Summary: When running Flink on Yarn, if the log file path contains a space, the startup will fail. Key: FLINK-35898 URL: https://issues.apache.org/jira/browse/FLINK-35898 Project: Flink Issue Type: Bug Components: Deployment / YARN Affects Versions: 1.20.0 Reporter: yiheng tang We use Yarn to launch Flink jobs and tried to add custom parameters to configure the log output format in the YarnLogConfigUtil.getLog4jCommand method. However, when a parameter value contains spaces, startup fails. The start command in the Yarn container is as follows (omitting some irrelevant parameters):

```bash
/bin/bash -c "$JAVA_HOME/bin/java -Dlog.file='/data08/yarn/userlogs/application_1719804259110_16060/container_e43_1719804259110_16060_01_01/jobmanager.log' -Dlog4j.configuration=file.properties -Dlog4j.configurationFile=file.properties -Dlog4j2.isThreadContextMapInheritable=true -Dlog.layout.pattern='%d\{yyyy-MM-dd HH:mm} %-5p %-60c %-60t %x - %m%n %ex' -Dlog.level=INFO"
```

We discovered that when the value of log.layout.pattern contains spaces, the command is incorrectly truncated, causing the content after the space (`HH:mm} %-5p %-60c %-60t %x - %m%n %ex' -Dlog.level=INFO"`) to be unrecognized, so the Flink job cannot be launched in the Yarn container. Therefore, we suggest using single quotes (') instead of double quotes (") to wrap parameter values in the `getLogBackCommand` and `getLog4jCommand` methods of the `YarnLogConfigUtil` class. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35897) Some checkpoint files and localState files can't be cleanUp when checkpoint is aborted
Jinzhong Li created FLINK-35897: --- Summary: Some checkpoint files and localState files can't be cleanUp when checkpoint is aborted Key: FLINK-35897 URL: https://issues.apache.org/jira/browse/FLINK-35897 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing, Runtime / State Backends Reporter: Jinzhong Li

h2. Problem

When the job checkpoint is canceled ([AsyncSnapshotCallable.java#L129|https://github.com/apache/flink/blob/d4294c59e6f2ec8702f53916ea49cf23f6db8961/flink-runtime/src/main/java/org/apache/flink/runtime/state/AsyncSnapshotCallable.java#L129]), it is still possible for the asynchronous snapshot thread to continue executing and produce a completed checkpoint ([RocksIncrementalSnapshotStrategy.java#L324|https://github.com/apache/flink/blob/d4294c59e6f2ec8702f53916ea49cf23f6db8961/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java#L324]). In this case, no component is responsible for cleaning up the completed checkpoint: neither the async snapshot thread nor SubtaskCheckpointCoordinatorImpl.

h3. How to reproduce it

We can reproduce this issue by running the [DataGenWordCount example in my debug branch|https://github.com/ljz2051/flink/commit/33c0c55098a49a0b56c9404256a560da5069f26c], in which I've added some debug code.

h3. How to fix it

When the asynchronous snapshot thread completes a checkpoint, it needs to clean up the completed checkpoint if it finds that the checkpoint has been canceled.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
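A minimal sketch of that fix (all names hypothetical, and glossing over the finer synchronization of the real code): after materializing its result, the async thread re-checks the cancellation flag and discards the result instead of leaking its files.

{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

public final class CancellationAwareSnapshot<R extends CancellationAwareSnapshot.Discardable> {
    public interface Discardable {
        void discard() throws Exception;
    }

    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    public void cancel() {
        cancelled.set(true);
    }

    /** Runs on the async snapshot thread. */
    public R run(Callable<R> snapshot) throws Exception {
        R result = snapshot.call();
        if (cancelled.get()) {
            // The checkpoint was aborted while we were snapshotting; nobody will
            // register this result, so clean up its files here.
            result.discard();
            return null;
        }
        return result;
    }
}
{code}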
[jira] [Created] (FLINK-35896) eventHandler in RescaleApiScalingRealizer handle the event of scaling failure
yuanfenghu created FLINK-35896: -- Summary: eventHandler in RescaleApiScalingRealizer handle the event of scaling failure Key: FLINK-35896 URL: https://issues.apache.org/jira/browse/FLINK-35896 Project: Flink Issue Type: Improvement Components: Autoscaler Affects Versions: 2.0.0 Reporter: yuanfenghu When using flink-autoscaler-standalone, if a failure occurs during the scaling process, the failure event needs to be handled through the eventHandler. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35895) flink parquet vectorized reader should use readNextFilteredRowGroup() instead of readNextRowGroup()
Kai Chen created FLINK-35895: Summary: flink parquet vectorized reader should use readNextFilteredRowGroup() instead of readNextRowGroup() Key: FLINK-35895 URL: https://issues.apache.org/jira/browse/FLINK-35895 Project: Flink Issue Type: Bug Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.19.1, 1.18.1, 1.17.2, 1.16.3, 1.15.4 Reporter: Kai Chen reader.readNextRowGroup() should be changed to reader.readNextFilteredRowGroup(); https://github.com/apache/flink/blob/d4294c59e6f2ec8702f53916ea49cf23f6db8961/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetVectorizedInputFormat.java#L422 -- This message was sent by Atlassian Jira (v8.20.10#820010)
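For context, a sketch of the proposed one-line change (assuming {{reader}} is the {{ParquetFileReader}} used at the linked location); the filtered variant honors the filters pushed into the reader, such as column-index based filters:

{code:java}
import java.io.IOException;

import org.apache.parquet.column.page.PageReadStore;
import org.apache.parquet.hadoop.ParquetFileReader;

public final class FilteredRowGroupSketch {
    static PageReadStore nextRowGroup(ParquetFileReader reader) throws IOException {
        // was: reader.readNextRowGroup(), which ignores the configured filters
        return reader.readNextFilteredRowGroup();
    }
}
{code}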
[jira] [Created] (FLINK-35894) Add Elasticsearch Sink Connector for Flink CDC Pipeline
wuzexian created FLINK-35894: Summary: Add Elasticsearch Sink Connector for Flink CDC Pipeline Key: FLINK-35894 URL: https://issues.apache.org/jira/browse/FLINK-35894 Project: Flink Issue Type: Improvement Reporter: wuzexian I propose adding a new Elasticsearch sink connector to the Flink CDC pipeline. This connector will enable integration with Elasticsearch, allowing real-time data synchronization and indexing from data sources managed by Flink CDC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35893) Add state compatibility for Serializer of TableChangeInfo
LvYanquan created FLINK-35893: - Summary: Add state compatibility for Serializer of TableChangeInfo Key: FLINK-35893 URL: https://issues.apache.org/jira/browse/FLINK-35893 Project: Flink Issue Type: Technical Debt Components: Flink CDC Affects Versions: cdc-3.1.1 Reporter: LvYanquan Fix For: cdc-3.2.0 Version info of the serializer for TableChangeInfo was not included in [TransformSchemaOperator|https://github.com/apache/flink-cdc/blob/ea71b2302ddc5f9b7be65843dbf3f5bed4ca9d8e/flink-cdc-runtime/src/main/java/org/apache/flink/cdc/runtime/operators/transform/TransformSchemaOperator.java#L127], which may cause state incompatibility in the future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35892) Flink Kubernetes Operator docs lacks info on FlinkSessionJob submission
Sergey Kononov created FLINK-35892: -- Summary: Flink Kubernetes Operator docs lacks info on FlinkSessionJob submission Key: FLINK-35892 URL: https://issues.apache.org/jira/browse/FLINK-35892 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Affects Versions: kubernetes-operator-1.9.0 Reporter: Sergey Kononov Hello,

After the FLINK-26808 fix, job submission to a Flink session cluster via the Flink Kubernetes Operator (FKO) is still impossible without setting the config option web.submit.enable to true. If it is false, FKO produces an exception:

{{i.j.o.p.e.ReconciliationDispatcher [ERROR][fko/jobname] Error during event processing ExecutionScope\{ resource id: ResourceID{name='jobname', namespace='fko'}, version: 173807615} failed.}}
{{org.apache.flink.kubernetes.operator.exception.ReconciliationException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException: [POST request not allowed]}}

As I understand it, this is intended behavior: FKO uses the {{/jars/*}} REST endpoints to submit jobs to a session cluster, and they are not initialized if web.submit.enable is false. I would suggest mentioning in the FKO docs ([here|https://github.com/apache/flink-kubernetes-operator/blob/main/docs/content/docs/custom-resource/overview.md#flinksessionjob-spec-overview]) that the option web.submit.enable should be set to true if one intends to use FKO with a session cluster. I lost quite a lot of time figuring this out.

Best regards, Sergey -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35891) Support writing dynamic bucket table of Paimon
LvYanquan created FLINK-35891: - Summary: Support writing dynamic bucket table of Paimon Key: FLINK-35891 URL: https://issues.apache.org/jira/browse/FLINK-35891 Project: Flink Issue Type: Technical Debt Components: Flink CDC Affects Versions: cdc-3.2.0 Reporter: LvYanquan Support writing tables in [dynamic bucket|https://paimon.apache.org/docs/master/primary-key-table/data-distribution/#dynamic-bucket] mode. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35890) Select only the increment but still enter the SNAPSHOT,lead
sanqingleo created FLINK-35890: -- Summary: Select only the increment but still enter the SNAPSHOT,lead Key: FLINK-35890 URL: https://issues.apache.org/jira/browse/FLINK-35890 Project: Flink Issue Type: Bug Components: Flink CDC Affects Versions: cdc-3.1.1 Reporter: sanqingleo Attachments: image-2024-07-24-16-47-19-258.png

Hi, when I was using Flink CDC Postgres, I set 'debezium.snapshot.mode' = 'never' to skip the snapshot phase, but the code still entered the snapshot branch and printed this log: "Database snapshot phase can't perform checkpoint, acquired Checkpoint lock." Since our business creates the task first and then writes the data, the task cannot perform a checkpoint and its status is stuck in "in progress"; after the timeout, the task restarts. Normally, if you only choose incremental data synchronization, you should not enter this branch. The failing checkpoint is not necessarily the first one: several checkpoints may complete successfully before the job gets stuck, causing a timeout and a task restart. This problem always occurs. After it appeared in production, it also recurred many times in local testing. We then tested and found that if there is data flowing in, the task status stays normal. The SQL and conf look like this: !image-2024-07-24-16-47-19-258.png!

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35889) mongo cdc restore from expired resume token and job status still running but expect failed
Darren_Han created FLINK-35889: -- Summary: mongo cdc restore from expired resume token and job status still running but expect failed Key: FLINK-35889 URL: https://issues.apache.org/jira/browse/FLINK-35889 Project: Flink Issue Type: Bug Components: Flink CDC Affects Versions: cdc-3.1.0 Reporter: Darren_Han

Versions: mongodb 3.6, flink 1.14.6, mongo-cdc 2.4.2

When restarting a job from a savepoint/checkpoint that contains an expired resume token/point, the job status is always RUNNING but it does not capture change data, while printing logs continuously. Here are some example logs:

2024-07-23 11:11:04,214 INFO com.mongodb.kafka.connect.source.MongoSourceTask [] - An exception occurred when trying to get the next item from the Change Stream com.mongodb.MongoQueryException: Query failed with error code 280 and error message 'resume of change stream was not possible, as the resume token was not found. \{_data: BinData(0, "xx")}'

2024-07-23 17:53:27,330 INFO com.mongodb.kafka.connect.source.MongoSourceTask [] - An exception occurred when trying to get the next item from the Change Stream com.mongodb.MongoQueryException: Query failed with error code 280 and error message 'resume of change notification was not possible, as the resume point may no longer be in the oplog. ' on server

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35888) Add e2e test for paimon DataSink
LvYanquan created FLINK-35888: - Summary: Add e2e test for paimon DataSink Key: FLINK-35888 URL: https://issues.apache.org/jira/browse/FLINK-35888 Project: Flink Issue Type: Technical Debt Components: Flink CDC Affects Versions: cdc-3.2.0 Reporter: LvYanquan Fix For: cdc-3.2.0 The Paimon DataSink was already completed, but no e2e test was added. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35887) Null Pointer Exception in TypeExtractor.isRecord when trying to provide type info for interface
Jacob Jona Fahlenkamp created FLINK-35887: - Summary: Null Pointer Exception in TypeExtractor.isRecord when trying to provide type info for interface Key: FLINK-35887 URL: https://issues.apache.org/jira/browse/FLINK-35887 Project: Flink Issue Type: Bug Components: API / Type Serialization System Affects Versions: 1.19.1 Reporter: Jacob Jona Fahlenkamp The following code

{code:java}
import org.apache.flink.api.common.typeinfo.TypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInfoFactory;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.types.PojoTestUtils;
import org.junit.jupiter.api.Test;

import java.lang.reflect.Type;
import java.util.Map;

public class DebugTest {
    @TypeInfo(FooFactory.class)
    public interface Foo {}

    public static class FooFactory extends TypeInfoFactory<Foo> {
        @Override
        public TypeInformation<Foo> createTypeInfo(
                Type type, Map<String, TypeInformation<?>> map) {
            return Types.POJO(Foo.class, Map.of());
        }
    }

    @Test
    void test() {
        PojoTestUtils.assertSerializedAsPojo(Foo.class);
    }
}
{code}

throws this exception:

{code:java}
java.lang.NullPointerException: Cannot invoke "java.lang.Class.getName()" because the return value of "java.lang.Class.getSuperclass()" is null
    at org.apache.flink.api.java.typeutils.TypeExtractor.isRecord(TypeExtractor.java:2227)
    at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.<init>(PojoSerializer.java:125)
    at org.apache.flink.api.java.typeutils.PojoTypeInfo.createPojoSerializer(PojoTypeInfo.java:359)
    at org.apache.flink.api.java.typeutils.PojoTypeInfo.createSerializer(PojoTypeInfo.java:347)
    at org.apache.flink.types.PojoTestUtils.assertSerializedAsPojo(PojoTestUtils.java:48)
{code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35886) Incorrect watermark idleness timeout accounting when subtask is backpressured/blocked
Piotr Nowojski created FLINK-35886: -- Summary: Incorrect watermark idleness timeout accounting when subtask is backpressured/blocked Key: FLINK-35886 URL: https://issues.apache.org/jira/browse/FLINK-35886 Project: Flink Issue Type: Bug Components: API / DataStream, Runtime / Task Affects Versions: 1.19.1, 1.18.1, 1.20.0 Reporter: Piotr Nowojski

Currently, when using watermarks with idleness in Flink, idleness can be incorrectly detected when reading records from a source that is blocked by the runtime. For example, this can easily happen when the source is either backpressured or blocked by watermark alignment. In those cases, although there are more records to be read from the source (or the source's split), the runtime decides not to poll (or is unable to poll) those records. In such cases the idleness timeout can kick in, marking the source/source split as idle, which can lead to incorrect combined watermark calculations and the dropping of records incorrectly marked as late.

h4. Watermark alignment

Assume there are two source splits, A and B, and maxAllowedWatermarkDrift is set to 30s.
# Split A emitted watermark 1042s, while split B sits at watermark 1000s.
# {{1042s - 1000s > maxAllowedWatermarkDrift}}, so split A is blocked by the watermark alignment.
# For the duration of idleTimeout, split B is emitting some large batch of records that do not advance the watermark of that split by much. For example, the watermark for split B either stays at 1000s or is updated by a small amount, e.g. to 1005s.
# idleTimeout kicks in, marking split A as idle.
# Split B finishes emitting the large batch of those older records, and let's say there is now a gap in rowtimes: previously split B was emitting records with rowtime ~1000s, now it jumps to, for example, ~5000s.
# As split A is idle, the combined watermark jumps to ~5000s as well.
# Watermark alignment unblocks split A, and it continues emitting records with rowtime ~1042s. But now all of those records are dropped for being late.

h4. Backpressure

Assume there are two SourceOperators, A and B. Due to, for example, some data skew, it could happen that either only A gets backpressured, or A is backpressured sooner. Either way, while A is backpressured and B is not, B can bump the combined watermark high enough that when the backpressure recedes, fresh records from A are considered late, leading to incorrect results.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
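For context, a sketch of the configuration under which the reported problem shows up, combining an idleness timeout with watermark alignment (the event type and group name are hypothetical):

{code:java}
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;

public final class IdlenessWithAlignmentSketch {
    static final class MyEvent {}

    static WatermarkStrategy<MyEvent> strategy() {
        return WatermarkStrategy.<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                // The timeout that can incorrectly elapse while the split is
                // merely blocked by alignment or backpressure, not truly idle.
                .withIdleness(Duration.ofSeconds(60))
                // maxAllowedWatermarkDrift of 30s, matching the example above.
                .withWatermarkAlignment("alignment-group", Duration.ofSeconds(30));
    }
}
{code}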
[jira] [Created] (FLINK-35885) SlicingWindowOperator should not process watermark with proctime
Baozhu Zhao created FLINK-35885: --- Summary: SlicingWindowOperator should not process watermark with proctime Key: FLINK-35885 URL: https://issues.apache.org/jira/browse/FLINK-35885 Project: Flink Issue Type: Bug Components: Table SQL / Runtime Affects Versions: 1.17.2, 1.13.6 Environment: flink 1.13.6 or flink 1.17.2 Reporter: Baozhu Zhao

We have discovered an unexpected case where abnormal records with a count of 0 are produced when performing a proctime window aggregation on data that carries a watermark. The SQL is as follows:

{code:sql}
CREATE TABLE s1 (
    id INT,
    event_time TIMESTAMP(3),
    name STRING,
    proc_time AS PROCTIME(),
    WATERMARK FOR event_time AS event_time - INTERVAL '10' SECOND
) WITH (
    'connector' = 'my-source'
);

SELECT *
FROM (
    SELECT name, COUNT(id) AS total_count, window_start, window_end
    FROM TABLE(
        TUMBLE(TABLE s1, DESCRIPTOR(proc_time), INTERVAL '30' SECONDS)
    )
    GROUP BY window_start, window_end, name
)
WHERE total_count = 0;
{code}

For detailed test code, please refer to xxx
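For contrast, a sketch of the event-time variant over the same table, where reacting to watermarks is the expected behavior; the point of this report is that the proctime window above should not process watermarks at all:

{code:sql}
-- Event-time tumbling window: watermark-driven triggering is expected here,
-- unlike the proctime window above.
SELECT name, COUNT(id) AS total_count, window_start, window_end
FROM TABLE(
    TUMBLE(TABLE s1, DESCRIPTOR(event_time), INTERVAL '30' SECONDS)
)
GROUP BY window_start, window_end, name;
{code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)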
[jira] [Created] (FLINK-35884) Support
JunboWang created FLINK-35884: - Summary: Support Key: FLINK-35884 URL: https://issues.apache.org/jira/browse/FLINK-35884 Project: Flink Issue Type: Improvement Components: Flink CDC Reporter: JunboWang -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35883) Wildcard projection inserts column at wrong place
yux created FLINK-35883: --- Summary: Wildcard projection inserts column at wrong place Key: FLINK-35883 URL: https://issues.apache.org/jira/browse/FLINK-35883 Project: Flink Issue Type: Bug Components: Flink CDC Reporter: yux

Consider this case, where a wildcard projection is declared:

```yaml
transform:
  - projection: \*, 'extras' AS extras
```

For an upstream schema [a, b, c], the transform operator should send [a, b, c, extras] downstream. However, if another column d is later inserted at the end, the upstream schema becomes [a, b, c, d], and one might expect the transformed schema to be [a, b, c, d, extras]. Instead it is [a, b, c, extras, d], because `AddColumnEvent{d, position=LAST}` is applied to [a, b, c, extras] after the projection process.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35882) CLONE - [Release-1.20] Vote on the release candidate
Weijie Guo created FLINK-35882: -- Summary: CLONE - [Release-1.20] Vote on the release candidate Key: FLINK-35882 URL: https://issues.apache.org/jira/browse/FLINK-35882 Project: Flink Issue Type: Sub-task Affects Versions: 1.17.0 Reporter: Weijie Guo Assignee: Weijie Guo Fix For: 1.17.0

Once you have built and individually reviewed the release candidate, please share it for the community-wide review. Please review the foundation-wide [voting guidelines|http://www.apache.org/foundation/voting.html] for more information.

Start the review-and-vote thread on the dev@ mailing list. Here's an email template; please adjust as you see fit.

{quote}
From: Release Manager
To: dev@flink.apache.org
Subject: [VOTE] Release 1.2.3, release candidate #3

Hi everyone,

Please review and vote on the release candidate #3 for the version 1.2.3, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be deployed to dist.apache.org [2], which are signed with the key with fingerprint [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.2.3-rc3" [5],
* website pull request listing the new release and adding announcement blog post [6].

The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes.

Thanks,
Release Manager

[1] link
[2] link
[3] [https://dist.apache.org/repos/dist/release/flink/KEYS]
[4] link
[5] link
[6] link
{quote}

*If there are any issues found in the release candidate, reply on the vote thread to cancel the vote.* There is no need to wait 72 hours. Proceed to the Fix Issues step below and address the problem. However, some issues do not require cancellation. For example, if an issue is found in the website pull request, just correct it on the spot and the vote can continue as-is.

To cancel a release, the release manager needs to send an email to the release candidate thread, stating that the release candidate is officially cancelled. Next, all artifacts created specifically for the RC in the previous steps need to be removed:
* Delete the staging repository in Nexus
* Remove the source / binary RC files from dist.apache.org
* Delete the source code tag in git

*If there are no issues, reply on the vote thread to close the voting.* Then, tally the votes in a separate email. Here's an email template; please adjust as you see fit.

{quote}
From: Release Manager
To: dev@flink.apache.org
Subject: [RESULT] [VOTE] Release 1.2.3, release candidate #3

I'm happy to announce that we have unanimously approved this release.

There are XXX approving votes, XXX of which are binding:
* approver 1
* approver 2
* approver 3
* approver 4

There are no disapproving votes.

Thanks everyone!
{quote}

h3. Expectations
* Community votes to release the proposed candidate, with at least three approving PMC votes

Any issues raised before the vote is over should be either resolved or moved into the next release (if applicable).

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35880) CLONE - [Release-1.20] Stage source and binary releases on dist.apache.org
Weijie Guo created FLINK-35880: -- Summary: CLONE - [Release-1.20] Stage source and binary releases on dist.apache.org Key: FLINK-35880 URL: https://issues.apache.org/jira/browse/FLINK-35880 Project: Flink Issue Type: Sub-task Affects Versions: 1.20.0 Reporter: Weijie Guo Assignee: Weijie Guo Fix For: 1.20.0

Copy the source release to the dev repository of dist.apache.org:
# If you have not already, check out the Flink section of the dev repository on dist.apache.org via Subversion. In a fresh directory:
{code:bash}
$ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates
{code}
# Make a directory for the new release and copy all the artifacts (Flink source/binary distributions, hashes, GPG signatures and the python subdirectory) into that newly created directory:
{code:bash}
$ mkdir flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
$ mv /tools/releasing/release/* flink/flink-${RELEASE_VERSION}-rc${RC_NUM}
{code}
# Add and commit all the files:
{code:bash}
$ cd flink
flink $ svn add flink-${RELEASE_VERSION}-rc${RC_NUM}
flink $ svn commit -m "Add flink-${RELEASE_VERSION}-rc${RC_NUM}"
{code}
# Verify that the files are present under [https://dist.apache.org/repos/dist/dev/flink].
# Push the release tag, if not done already (the following command assumes it is called from within the apache/flink checkout):
{code:bash}
$ git push refs/tags/release-${RELEASE_VERSION}-rc${RC_NUM}
{code}

h3. Expectations
* Maven artifacts deployed to the staging repository of [repository.apache.org|https://repository.apache.org/content/repositories/]
* Source distribution deployed to the dev repository of [dist.apache.org|https://dist.apache.org/repos/dist/dev/flink/]
* Check hashes (e.g. {{shasum -c *.sha512}})
* Check signatures (e.g. {{gpg --verify flink-1.2.3-source-release.tar.gz.asc flink-1.2.3-source-release.tar.gz}}); a sketch combining both checks follows this list
* {{grep}} for legal headers in each file.
* If time allows, check in advance the NOTICE files of the modules whose dependencies have been changed in this release, since license issues pop up during voting from time to time. See the "Checking License" section of [Verifying a Flink Release|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release].
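As a concrete illustration of the hash and signature checks above, a sketch of a verification pass over a staged RC directory; the directory name is an example only, not taken from this issue:

{code:bash}
# Sketch: verify all staged artifacts in one pass (example RC directory).
cd flink/flink-1.20.0-rc2
for f in *.sha512; do shasum -c "$f"; done               # hashes
for f in *.asc; do gpg --verify "$f" "${f%.asc}"; done   # signatures
{code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)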
[jira] [Created] (FLINK-35881) CLONE - [Release-1.20] Propose a pull request for website updates
Weijie Guo created FLINK-35881: -- Summary: CLONE - [Release-1.20] Propose a pull request for website updates Key: FLINK-35881 URL: https://issues.apache.org/jira/browse/FLINK-35881 Project: Flink Issue Type: Sub-task Affects Versions: 1.17.0 Reporter: Weijie Guo Assignee: Weijie Guo Fix For: 1.20.0

The final step of building the candidate is to propose a website pull request containing the following changes:
# Update [apache/flink-web:_config.yml|https://github.com/apache/flink-web/blob/asf-site/_config.yml] (a sketch of these edits appears at the end of this message):
## update {{FLINK_VERSION_STABLE}} and {{FLINK_VERSION_STABLE_SHORT}} as required
## update version references in quickstarts ({{q/}} directory) as required
## (major only) add a new entry to {{flink_releases}} for the release binaries and sources
## (minor only) update the entry for the previous release in the series in {{flink_releases}}
### Please pay attention to the ids assigned to the download entries. They should be unique and reflect their corresponding version number.
## add a new entry to {{release_archive.flink}}
# Add a blog post announcing the release in _posts.
# Add an organized release-notes page under docs/content/release-notes and docs/content.zh/release-notes (like [https://nightlies.apache.org/flink/flink-docs-release-1.15/release-notes/flink-1.15/]). The page is based on the non-empty release notes collected from the issues, and only issues that affect existing users should be included (as opposed to new functionality). It should be in a separate PR, since it will be merged to the flink project.

(!) Don't merge the PRs before finalizing the release.

h3. Expectations
* Website pull request proposed to list the [release|http://flink.apache.org/downloads.html]
* (major only) Check {{docs/config.toml}} to ensure that
** the version constants refer to the new version
** the {{baseurl}} does not point to {{flink-docs-master}} but to {{flink-docs-release-X.Y}} instead
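As an illustration of the {{_config.yml}} edits in step 1, a sketch of the two version constants after the update; the values shown are examples for a 1.20.0 release, and the surrounding keys are omitted:

{code:yaml}
# Hypothetical excerpt of apache/flink-web:_config.yml after the update.
FLINK_VERSION_STABLE: 1.20.0        # full version of the new stable release
FLINK_VERSION_STABLE_SHORT: "1.20"  # short version used in docs links
{code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)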
[jira] [Created] (FLINK-35879) CLONE - [Release-1.20] Build and stage Java and Python artifacts
Weijie Guo created FLINK-35879: -- Summary: CLONE - [Release-1.20] Build and stage Java and Python artifacts Key: FLINK-35879 URL: https://issues.apache.org/jira/browse/FLINK-35879 Project: Flink Issue Type: Sub-task Affects Versions: 1.20.0 Reporter: Weijie Guo Assignee: Weijie Guo Fix For: 1.20.0

# Create a local release branch ((!) this step cannot be skipped for minor releases):
{code:bash}
$ cd ./tools
tools/ $ OLD_VERSION=$CURRENT_SNAPSHOT_VERSION NEW_VERSION=$RELEASE_VERSION RELEASE_CANDIDATE=$RC_NUM releasing/create_release_branch.sh
{code}
# Tag the release commit:
{code:bash}
$ git tag -s ${TAG} -m "${TAG}"
{code}
# We now need to do several things:
## Create the source release archive
## Deploy jar artefacts to the [Apache Nexus Repository|https://repository.apache.org/], which is the staging area for deploying the jars to Maven Central
## Build the PyFlink wheel packages

You might want to create a directory on your local machine for collecting the various source and binary releases before uploading them. Creating the binary releases is a lengthy process, but you can do this on another machine (for example, in the "cloud"). When doing this, you can skip signing the release files on the remote machine, download them to your local machine, and sign them there.

# Build the source release:
{code:bash}
tools $ RELEASE_VERSION=$RELEASE_VERSION releasing/create_source_release.sh
{code}
# Stage the maven artifacts:
{code:bash}
tools $ releasing/deploy_staging_jars.sh
{code}
Review all staged artifacts ([https://repository.apache.org/]). They should contain all relevant parts for each module, including pom.xml, jar, test jar, source, test source, javadoc, etc. Carefully review any new artifacts.
# Close the staging repository on Apache Nexus. When prompted for a description, enter "Apache Flink, version X, release candidate Y".

Then, you need to build the PyFlink wheel packages (since 1.11):
# Set up an Azure pipeline in your own Azure account. You can refer to [Azure Pipelines|https://cwiki.apache.org/confluence/display/FLINK/Azure+Pipelines#AzurePipelines-Tutorial:SettingupAzurePipelinesforaforkoftheFlinkrepository] for more details on how to set up an Azure pipeline for a fork of the Flink repository. Note that a Google Cloud mirror in Europe is used for downloading maven artifacts, so it is recommended to set your [Azure organization region|https://docs.microsoft.com/en-us/azure/devops/organizations/accounts/change-organization-location] to Europe to speed up the downloads.
# Push the release candidate branch to your forked personal Flink repository, e.g.
{code:bash}
tools $ git push refs/heads/release-${RELEASE_VERSION}-rc${RC_NUM}:release-${RELEASE_VERSION}-rc${RC_NUM}
{code}
# Trigger the Azure Pipelines manually to build the PyFlink wheel packages:
## Go to your Azure Pipelines Flink project → Pipelines
## Click the "New pipeline" button on the top right
## Select "GitHub" → your GitHub Flink repository → "Existing Azure Pipelines YAML file"
## Select your branch → set the path to "/azure-pipelines.yaml" → click "Continue" → click "Variables"
## Click the "New Variable" button, fill in the name "MODE" and the value "release". Click "OK" to set the variable and "Save" to save the variables, then back on the "Review your pipeline" screen click "Run" to trigger the build.
## You should now see a build where only the "CI build (release)" stage is running.
# Download the PyFlink wheel packages from the build result page after the "build_wheels mac" and "build_wheels linux" jobs have finished:
## Download the PyFlink wheel packages:
### Open the build result page of the pipeline
### Go to the {{Artifacts}} page (build_wheels linux -> 1 artifact)
### Click {{wheel_Darwin_build_wheels mac}} and {{wheel_Linux_build_wheels linux}} separately to download the zip files
## Unzip these two zip files:
{code:bash}
$ cd /path/to/downloaded_wheel_packages
$ unzip wheel_Linux_build_wheels\ linux.zip
$ unzip wheel_Darwin_build_wheels\ mac.zip
{code}
## Create the directory {{dist}} under the {{flink-python}} directory:
{code:bash}
$ cd
$ mkdir flink-python/dist
{code}
## Move the unzipped wheel packages to the {{flink-python/dist}} directory:
{code:bash}
$ mv /path/to/wheel_Darwin_build_wheels\ mac/* flink-python/dist/
$ mv /path/to/wheel_Linux_build_wheels\ linux/* flink-python/dist/
$ cd tools
{code}

Finally, we create the binary convenience release files:
{code:bash}
tools $ RELEASE_VERSION=$RELEASE_VERSION releasing/create_binary_release.sh
{code}
If you want to run this step in parallel on a remote
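Where signing was deferred from the remote build machine to the local one, as suggested above, a sketch of signing and hashing the downloaded release files; the file glob is an example and a configured GPG key is assumed:

{code:bash}
# Sketch: sign and hash release files downloaded from the remote build machine.
for f in flink-*.tgz; do
  gpg --armor --detach-sign "$f"    # produces $f.asc
  shasum -a 512 "$f" > "$f.sha512"  # produces $f.sha512
done
{code}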
[jira] [Created] (FLINK-35878) Build Release Candidate: 1.20.0-rc2
Weijie Guo created FLINK-35878: -- Summary: Build Release Candidate: 1.20.0-rc2 Key: FLINK-35878 URL: https://issues.apache.org/jira/browse/FLINK-35878 Project: Flink Issue Type: New Feature Affects Versions: 1.20.0 Reporter: Weijie Guo Assignee: Weijie Guo Fix For: 1.20.0

The core of the release process is the build-vote-fix cycle. Each cycle produces one release candidate. The Release Manager repeats this cycle until the community approves one release candidate, which is then finalized.

h4. Prerequisites

Set up a few environment variables to simplify the Maven commands that follow. These identify the release candidate being built. Start with {{RC_NUM}} equal to 1 and increment it for each candidate:
{code}
RC_NUM="1"
TAG="release-${RELEASE_VERSION}-rc${RC_NUM}"
{code}
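For completeness, a sketch of the full variable set referenced across these sub-tasks ({{RELEASE_VERSION}} and {{CURRENT_SNAPSHOT_VERSION}} drive the release scripts in the build step); the concrete values are examples for this candidate, not prescribed by the issue:

{code:bash}
# Example values for 1.20.0-rc2; adjust per release.
RELEASE_VERSION="1.20.0"
CURRENT_SNAPSHOT_VERSION="1.20-SNAPSHOT"
RC_NUM="2"
TAG="release-${RELEASE_VERSION}-rc${RC_NUM}"
{code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)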