[jira] [Updated] (FLINK-25283) End-to-end application modules create oversized jars
[ https://issues.apache.org/jira/browse/FLINK-25283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-25283: --- Fix Version/s: 2.0.0 (was: 1.20.0) > End-to-end application modules create oversized jars > > > Key: FLINK-25283 > URL: https://issues.apache.org/jira/browse/FLINK-25283 > Project: Flink > Issue Type: Technical Debt > Components: Tests >Affects Versions: 1.13.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 2.0.0 > > > Various modules that create jars for e2e tests (e.g., > flink-streaming-kinesis-test) create oversized jars (100mb+) because they > bundle their entire dependency tree, including many parts of Flink, along > with test dependencies and test resources. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-26942) Support SELECT clause in CREATE TABLE(CTAS)
[ https://issues.apache.org/jira/browse/FLINK-26942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-26942: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Support SELECT clause in CREATE TABLE(CTAS) > --- > > Key: FLINK-26942 > URL: https://issues.apache.org/jira/browse/FLINK-26942 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Planner >Reporter: tartarus >Priority: Major > Fix For: 2.0.0 > > > Support CTAS(CREATE TABLE AS SELECT) syntax > {code:java} > CREATE TABLE [ IF NOT EXISTS ] table_name > [ WITH ( table_properties ) ] > [ AS query_expression ] {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34445) Integrate new endpoint with Flink UI metrics section
[ https://issues.apache.org/jira/browse/FLINK-34445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-34445: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Integrate new endpoint with Flink UI metrics section > > > Key: FLINK-34445 > URL: https://issues.apache.org/jira/browse/FLINK-34445 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Reporter: Mason Chen >Assignee: Ruan Hang >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-35557) MemoryManager only reserves memory per consumer type once
[ https://issues.apache.org/jira/browse/FLINK-35557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-35557: --- Fix Version/s: 2.0.0 (was: 1.20.0) (was: 1.19.2) > MemoryManager only reserves memory per consumer type once > - > > Key: FLINK-35557 > URL: https://issues.apache.org/jira/browse/FLINK-35557 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends, Runtime / Task >Affects Versions: 1.16.3, 1.17.2, 1.19.0, 1.18.1, 1.20.0 >Reporter: Roman Khachatryan >Assignee: Roman Khachatryan >Priority: Major > Fix For: 2.0.0 > > > # In {{MemoryManager.getSharedMemoryResourceForManagedMemory}} we > [create|https://github.com/apache/flink/blob/57869c11687e0053a242c90623779c0c7336cd33/flink-runtime/src/main/java/org/apache/flink/runtime/memory/MemoryManager.java#L526] > a reserve function > # The function > [decrements|https://github.com/apache/flink/blob/57869c11687e0053a242c90623779c0c7336cd33/flink-runtime/src/main/java/org/apache/flink/runtime/memory/UnsafeMemoryBudget.java#L61] > the available Slot memory and fails if there's not enough memory > # We pass it to {{SharedResources.getOrAllocateSharedResource}} > # In {{SharedResources.getOrAllocateSharedResource}} , we check if the > resource (memory) was already reserved by some key (e.g. > {{{}state-rocks-managed-memory{}}}) > # If not, we create a new one and call the reserve function > # If the resource was already reserved (not null), we do NOT reserve the > memory again: > [https://github.com/apache/flink/blob/57869c11687e0053a242c90623779c0c7336cd33/flin[…]/main/java/org/apache/flink/runtime/memory/SharedResources.java|https://github.com/apache/flink/blob/57869c11687e0053a242c90623779c0c7336cd33/flink-runtime/src/main/java/org/apache/flink/runtime/memory/SharedResources.java#L71] > So there will be only one (first) memory reservation for rocksdb for example, > no matter how many state backends in a slot are created. 
This means that managed memory limits are not applied. -- This message was sent by Atlassian Jira (v8.20.10#820010)
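The reserve-once behaviour described above can be sketched with plain Java (the class and method names below are illustrative, not Flink's actual MemoryManager/SharedResources code): the reserve step only runs when the key is first inserted, so a second acquirer of the same key adds nothing to the reserved budget.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the pattern described in the ticket (hypothetical names,
// not Flink's actual classes): the reserve function runs only when the key is
// first inserted, so later acquirers of the same key never reserve memory.
public class SharedResourceSketch {
    private final Map<String, Long> resources = new HashMap<>();
    private long reservedBytes = 0; // stands in for the slot's memory budget

    public long getOrAllocate(String key, long size) {
        Long existing = resources.get(key);
        if (existing == null) {
            reservedBytes += size;      // "reserve function": called once per key
            resources.put(key, size);
            return size;
        }
        return existing;                // NOT reserved again, mirroring step 6 above
    }

    public long reservedBytes() {
        return reservedBytes;
    }
}
```

Under this sketch, two state backends acquiring the same key (e.g. "state-rocks-managed-memory") leave only the first reservation in the budget, which is the behaviour the ticket flags.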
[jira] [Updated] (FLINK-19254) Invalid UTF-8 start byte exception
[ https://issues.apache.org/jira/browse/FLINK-19254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-19254: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Invalid UTF-8 start byte exception > --- > > Key: FLINK-19254 > URL: https://issues.apache.org/jira/browse/FLINK-19254 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.11.0 >Reporter: Jun Zhang >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 2.0.0 > > > When reading non-UTF-8 data, JsonRowDeserializationSchema throws an exception: > {code:java} > Caused by: > org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonParseException: > Invalid UTF-8 start byte xxx > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
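For illustration, the same class of failure can be reproduced with the JDK's strict UTF-8 decoder alone (no Jackson involved; the class name is illustrative): a byte like 0xFF is never a valid start byte of a UTF-8 sequence, so a decoder configured to REPORT malformed input rejects it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Plain-JDK sketch of strict UTF-8 validation (illustrative name, not Flink
// code): REPORT makes the decoder throw on malformed bytes, analogous to the
// "Invalid UTF-8 start byte" JsonParseException in the ticket.
public class StrictUtf8 {
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false; // malformed input, e.g. an invalid start byte
        }
    }
}
```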
[jira] [Updated] (FLINK-25350) Verify stability guarantees of annotated classes
[ https://issues.apache.org/jira/browse/FLINK-25350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-25350: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Verify stability guarantees of annotated classes > > > Key: FLINK-25350 > URL: https://issues.apache.org/jira/browse/FLINK-25350 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.15.0 >Reporter: Till Rohrmann >Priority: Major > Fix For: 2.0.0 > > > In order to give API stability guarantees we should add a test which ensures > that an API class can at most give a stability guarantee of the weakest > annotated method a user needs to implement for this class. This will ensure > that we can extend API classes with methods that have weaker guarantees but > that must have default implementations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-12352) [FLIP-36] [Phase 1] Support cache() / invalidateCache() in Table with default ShuffleService and NetworkStack
[ https://issues.apache.org/jira/browse/FLINK-12352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-12352: --- Fix Version/s: 2.0.0 (was: 1.20.0) > [FLIP-36] [Phase 1] Support cache() / invalidateCache() in Table with default > ShuffleService and NetworkStack > - > > Key: FLINK-12352 > URL: https://issues.apache.org/jira/browse/FLINK-12352 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Jiangjie Qin >Priority: Major > Fix For: 2.0.0 > > > The goals of this phase are following: > * cache and release intermediate result with shuffle service. > * benefit from locality of default shuffle service -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-21765) Remove implementation-specific MetricGroup parents
[ https://issues.apache.org/jira/browse/FLINK-21765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-21765: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Remove implementation-specific MetricGroup parents > -- > > Key: FLINK-21765 > URL: https://issues.apache.org/jira/browse/FLINK-21765 > Project: Flink > Issue Type: Technical Debt > Components: Runtime / Metrics >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > stale-assigned, starter > Fix For: 2.0.0 > > > MetricGroups currently form a bi-directional graph, usually with explicit > requirements on the type of the parent. For example, an OperatorMG has a > hard requirement that its parent is a TaskMG. > As a result they are quite inflexible, which particularly shows in tests, as > you can't just create one metric group, but have to build an entire tree. > The end goal of this ticket is to remove AbstractMetricGroup#parent, and > along the way we'll decouple the various MG implementations from each other. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29547) Select a[1] which is array type for parquet complex type throw ClassCastException
[ https://issues.apache.org/jira/browse/FLINK-29547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-29547: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Select a[1] which is array type for parquet complex type throw > ClassCastException > -- > > Key: FLINK-29547 > URL: https://issues.apache.org/jira/browse/FLINK-29547 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.16.0 >Reporter: dalongliu >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Regarding the following SQL test in HiveTableSourceITCase, it will throw > ClassCastException. > {code:java} > batchTableEnv.executeSql( > "create table parquet_complex_type_test(" > + "a array, m map, s > struct) stored as parquet"); > String[] modules = batchTableEnv.listModules(); > // load hive module so that we can use array,map, named_struct function > // for convenient writing complex data > batchTableEnv.loadModule("hive", new HiveModule()); > batchTableEnv.useModules("hive", CoreModuleFactory.IDENTIFIER); > batchTableEnv > .executeSql( > "insert into parquet_complex_type_test" > + " select array(1, 2), map(1, 'val1', 2, 'val2')," > + " named_struct('f1', 1, 'f2', 2)") > .await(); > Table src = batchTableEnv.sqlQuery("select a[1] from > parquet_complex_type_test"); > List rows = CollectionUtil.iteratorToList(src.execute().collect());{code} > The exception stack: > Caused by: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast > to [Ljava.lang.Integer; > at BatchExecCalc$37.processElement(Unknown Source) > at > org.apache.flink.streaming.runtime.tasks.ChainingOutput.pushToOperator(ChainingOutput.java:99) > at > org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:80) > at > org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:39) > at > 
org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask$AsyncDataOutputToOutput.emitRecord(SourceOperatorStreamTask.java:313) > at > org.apache.flink.streaming.api.operators.source.NoOpTimestampsAndWatermarks$TimestampsOnlyOutput.collect(NoOpTimestampsAndWatermarks.java:98) > at > org.apache.flink.streaming.api.operators.source.NoOpTimestampsAndWatermarks$TimestampsOnlyOutput.collect(NoOpTimestampsAndWatermarks.java:92) > at > org.apache.flink.connector.file.src.impl.FileSourceRecordEmitter.emitRecord(FileSourceRecordEmitter.java:45) > at > org.apache.flink.connector.file.src.impl.FileSourceRecordEmitter.emitRecord(FileSourceRecordEmitter.java:35) > at > org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:144) > at > org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:401) > at > org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:542) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:831) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:780) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550) > at java.lang.Thread.run(Thread.java:748) > > After debugging the code, I found the root cause is that source operator > reads array data from parquet in the vectorized way, and it returns > ColumnarArrayData, 
then in the calc operator we convert it to > GenericArrayData, whose backing object array has type Object[] instead of Integer[]. Even > if we call the ArrayObjectArrayConverter#toExternal method to convert it to > Integer[], it still returns an Object[], and the subsequent forced cast to > Integer[] then throws the exception. -- This message was sent by Atlassian Jira (v8.20.10#820010)
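The root cause boils down to a basic Java array-typing rule, sketched here with plain JDK code (class and method names are illustrative, not Flink's): an Object[] that merely contains Integers cannot be cast to Integer[]; the array must be created with the Integer[] runtime type, e.g. by copying.

```java
import java.util.Arrays;

// Illustrative sketch (not Flink code) of the ClassCastException above:
// the runtime type of the array, not its contents, decides castability.
public class ArrayCastDemo {
    public static Integer[] badCast(Object[] boxed) {
        // Throws ClassCastException when boxed has runtime type Object[],
        // even if every element is an Integer.
        return (Integer[]) boxed;
    }

    public static Integer[] safeCopy(Object[] boxed) {
        // Copying into an Integer[]-typed array is one way a converter could
        // avoid the exception.
        return Arrays.copyOf(boxed, boxed.length, Integer[].class);
    }
}
```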
[jira] [Updated] (FLINK-30380) Look into enabling parallel compilation on CI
[ https://issues.apache.org/jira/browse/FLINK-30380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-30380: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Look into enabling parallel compilation on CI > - > > Key: FLINK-30380 > URL: https://issues.apache.org/jira/browse/FLINK-30380 > Project: Flink > Issue Type: Technical Debt > Components: Build System, Build System / CI >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-22926) IDLE FLIP-27 source should go ACTIVE when registering a new split
[ https://issues.apache.org/jira/browse/FLINK-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-22926: --- Fix Version/s: 2.0.0 (was: 1.20.0) > IDLE FLIP-27 source should go ACTIVE when registering a new split > - > > Key: FLINK-22926 > URL: https://issues.apache.org/jira/browse/FLINK-22926 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Reporter: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > When a FLIP-27 source is IDLE and registers a new split it does not go > immediately ACTIVE. We should consider watermarks from a newly registered > split immediately after registration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34503) Migrate JoinDeriveNullFilterRule
[ https://issues.apache.org/jira/browse/FLINK-34503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-34503: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Migrate JoinDeriveNullFilterRule > > > Key: FLINK-34503 > URL: https://issues.apache.org/jira/browse/FLINK-34503 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.20.0 >Reporter: Jacky Lau >Assignee: Jacky Lau >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-17826) Add missing custom query support on new jdbc connector
[ https://issues.apache.org/jira/browse/FLINK-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-17826: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Add missing custom query support on new jdbc connector > -- > > Key: FLINK-17826 > URL: https://issues.apache.org/jira/browse/FLINK-17826 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Reporter: Jark Wu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 2.0.0 > > > In FLINK-17361, we added custom query support on JDBC tables, but missed adding the > same ability to the new jdbc connector (i.e. > {{JdbcDynamicTableSourceSinkFactory}}). > In the new jdbc connector, maybe we should call it {{scan.query}} to stay > consistent with other scan options; besides, we need to make {{"table-name"}} > optional, but add validation that "table-name" and "scan.query" shouldn't > both be empty, and that "table-name" must not be empty when used as a sink. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-21125) Sum or Sum0 overflow quietly
[ https://issues.apache.org/jira/browse/FLINK-21125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-21125: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Sum or Sum0 overflow quietly > > > Key: FLINK-21125 > URL: https://issues.apache.org/jira/browse/FLINK-21125 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Reporter: Sebastian Liu >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 2.0.0 > > > Overflow is not handled correctly in the built-in sum function of the Blink > planner. > For an aggregate calculation such as: > {code:java} > CREATE TABLE TestTable ( > amount INT > ); > insert into TestTable (2147483647); > insert into TestTable (1); > SELECT sum(amount) FROM TestTable; > The result will be: -2147483648, which is an overflowed value, and no > exception was thrown. {code} > The overflow occurs quietly and is difficult to detect. > Compare the processing semantics of other systems: > * *mysql*: provides two SQL modes for handling overflow. If strict SQL mode is > enabled, MySQL rejects the out-of-range value with an error, and the insert > fails, in accordance with the SQL standard. If no restrictive modes are > enabled, MySQL clips the value to the appropriate endpoint of the column data > type range and stores the resulting value instead. FYI: > [https://dev.mysql.com/doc/refman/8.0/en/out-of-range-and-overflow.html] > * *presto*: all numeric types are automatically cast to Long type, and if > the long is out of range, an exception is thrown to prompt the user. > IMO, an exception is necessary instead of a quiet overflow. > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
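The silent wrap-around is plain Java int arithmetic. A minimal sketch (names are illustrative, not Flink's sum implementation) showing both the quiet overflow and the kind of checked addition (Math.addExact) a sum function could use to fail loudly, as the ticket suggests:

```java
// Illustrative sketch (not Flink code) of the two behaviours discussed above.
public class SumOverflow {
    public static int quietSum(int a, int b) {
        // Wraps around silently: Integer.MAX_VALUE + 1 == Integer.MIN_VALUE.
        return a + b;
    }

    public static int checkedSum(int a, int b) {
        // Throws ArithmeticException on overflow instead of wrapping,
        // mirroring MySQL's strict SQL mode / Presto's behaviour.
        return Math.addExact(a, b);
    }
}
```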
[jira] [Updated] (FLINK-15645) enable COPY TO/FROM in Postgres JDBC source/sink for faster batch processing
[ https://issues.apache.org/jira/browse/FLINK-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-15645: --- Fix Version/s: 2.0.0 (was: 1.20.0) > enable COPY TO/FROM in Postgres JDBC source/sink for faster batch processing > > > Key: FLINK-15645 > URL: https://issues.apache.org/jira/browse/FLINK-15645 > Project: Flink > Issue Type: New Feature > Components: Connectors / JDBC >Reporter: Bowen Li >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 2.0.0 > > > Postgres has its own SQL extension as COPY FROM/TO via JDBC for faster bulk > loading/reading [https://www.postgresql.org/docs/12/sql-copy.html] > Flink should be able to support that for batch use cases -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28042) Create an extension for resetting HiveConf
[ https://issues.apache.org/jira/browse/FLINK-28042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-28042: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Create an extension for resetting HiveConf > -- > > Key: FLINK-28042 > URL: https://issues.apache.org/jira/browse/FLINK-28042 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive, Tests >Reporter: Chesnay Schepler >Priority: Major > Fix For: 2.0.0 > > > The {{HiveConf}} is a singleton and modified by both production and test code. > We should think about writing an extension that prevents these changes from > leaking into other tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28428) Example in the Elasticsearch doc of fault tolerance section is missing
[ https://issues.apache.org/jira/browse/FLINK-28428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-28428: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Example in the Elasticsearch doc of fault tolerance section is missing > -- > > Key: FLINK-28428 > URL: https://issues.apache.org/jira/browse/FLINK-28428 > Project: Flink > Issue Type: Improvement > Components: Connectors / ElasticSearch, Documentation >Affects Versions: 1.15.1 >Reporter: Luning Wang >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > The example in English doc > {code:java} > final StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.enableCheckpointing(5000); // checkpoint every 5000 msecs{code} > > The example in Chinese doc > {code:java} > final StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.enableCheckpointing(5000); // 每 5000 毫秒执行一次 checkpoint > Elasticsearch6SinkBuilder sinkBuilder = new > Elasticsearch6SinkBuilder() > .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE) > .setHosts(new HttpHost("127.0.0.1", 9200, "http")) > .setEmitter( > (element, context, indexer) -> > indexer.add(createIndexRequest(element))); {code} > IMO, the example in Chinese doc is correct. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-16466) Group by on event time should produce insert only result
[ https://issues.apache.org/jira/browse/FLINK-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-16466: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Group by on event time should produce insert only result > > > Key: FLINK-16466 > URL: https://issues.apache.org/jira/browse/FLINK-16466 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.10.0 >Reporter: Kurt Young >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 2.0.0 > > > Currently when doing aggregation queries, we can output insert only results > only when grouping by windows. But when users defined event time and also > watermark, we can also support emit insert only results when grouping on > event time. To be more precise, it should only require event time is one of > the grouping keys. One can think of grouping by event time is kind of a > special window, with both window start and window end equals to event time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-35062) Migrate RewriteMultiJoinConditionRule
[ https://issues.apache.org/jira/browse/FLINK-35062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-35062: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Migrate RewriteMultiJoinConditionRule > - > > Key: FLINK-35062 > URL: https://issues.apache.org/jira/browse/FLINK-35062 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.20.0 >Reporter: Jacky Lau >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-21648) FLIP-151: Incremental snapshots for heap-based state backend
[ https://issues.apache.org/jira/browse/FLINK-21648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-21648: --- Fix Version/s: 2.0.0 (was: 1.20.0) > FLIP-151: Incremental snapshots for heap-based state backend > > > Key: FLINK-21648 > URL: https://issues.apache.org/jira/browse/FLINK-21648 > Project: Flink > Issue Type: New Feature > Components: Runtime / State Backends >Reporter: Roman Khachatryan >Assignee: Roman Khachatryan >Priority: Major > Labels: auto-deprioritized-major, auto-unassigned, stale-assigned > Fix For: 2.0.0 > > > Umbrella ticket for > [https://cwiki.apache.org/confluence/display/FLINK/FLIP-151%3A+Incremental+snapshots+for+heap-based+state+backend] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-15000) WebUI Metrics is very slow in large parallelism
[ https://issues.apache.org/jira/browse/FLINK-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-15000: --- Fix Version/s: 2.0.0 (was: 1.20.0) > WebUI Metrics is very slow in large parallelism > --- > > Key: FLINK-15000 > URL: https://issues.apache.org/jira/browse/FLINK-15000 > Project: Flink > Issue Type: Improvement > Components: Runtime / Web Frontend >Affects Versions: 1.9.0, 1.9.1 >Reporter: fa zheng >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Metrics in the web UI are very slow when parallelism is huge. It's hard to add > a metric and choose one metric. I ran carTopSpeedWindowingExample with the command > {code:java} > // code placeholder > flink run -m yarn-cluster -p 1200 examples/streaming/TopSpeedWindowing.jar > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-24386) JobMaster should guard against exceptions from OperatorCoordinator
[ https://issues.apache.org/jira/browse/FLINK-24386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-24386: --- Fix Version/s: 2.0.0 (was: 1.20.0) > JobMaster should guard against exceptions from OperatorCoordinator > -- > > Key: FLINK-24386 > URL: https://issues.apache.org/jira/browse/FLINK-24386 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.14.0, 1.13.2 >Reporter: David Morávek >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Original report from [~sewen]: > When the scheduler processes the call to trigger a _globalFailover_ > and something goes wrong in there, the _JobManager_ gets stuck. Concretely, > I have an _OperatorCoordinator_ that throws an exception in > _subtaskFailed()_, which gets called as part of processing the failover. > While this is a bug in that coordinator, the whole thing seems a bit > dangerous to me. If there is some bug in any part of the failover logic, we > have no safety net. No "hard crash" and let the process be restarted. We only > see a log line (below) and everything goes unresponsive. > {code:java} > ERROR org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor [] - Caught > exception while executing runnable in main thread. > {code} > Shouldn't we have some safety nets in place here? > * I am wondering if the place where that line is logged should actually > invoke the fatal error handler. If an exception propagates out of a main > thread action, we need to call off all bets and assume things have gotten > inconsistent. > * At the very least, the failover procedure itself should be guarded. If an > error happens while processing the global failover, then we need to treat > this as beyond redemption and declare a fatal error. > The fatal error would give us a log line and the user a container restart, > hopefully fixing things (unless it was a deterministic error). 
> [~dmvk] notes: > * OperatorCoordinator is part of a public API interface (part of JobGraph). > ** Can be provided by implementing CoordinatedOperatorFactory > ** This actually gives the issue higher priority than I initially thought. > * We should guard against flaws in user code: > ** There are two types of interfaces: > *** (CRITICAL) Public API for JobGraph construction / submission > *** Semi-public interfaces such as custom HA Services; this is for power > users, so I wouldn't be as concerned there. > ** We already do a good job of guarding against failures on the TM side > ** Considering the critical parts on the JM side, there are two places where the user > can "hook" in: > *** OperatorCoordinator > *** InitializeOnMaster, FinalizeOnMaster (batch sinks only, legacy from the > Hadoop world) > --- > We should audit all the calls to OperatorCoordinator and handle failures > accordingly. We want to avoid unnecessary JVM terminations as much as > possible (sometimes it's the only option though). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-25887) FLIP-193: Snapshots ownership follow ups
[ https://issues.apache.org/jira/browse/FLINK-25887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-25887: --- Fix Version/s: 2.0.0 (was: 1.20.0) > FLIP-193: Snapshots ownership follow ups > > > Key: FLINK-25887 > URL: https://issues.apache.org/jira/browse/FLINK-25887 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: Dawid Wysakowicz >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33895) Implement restore tests for PythonGroupWindowAggregate node
[ https://issues.apache.org/jira/browse/FLINK-33895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-33895: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Implement restore tests for PythonGroupWindowAggregate node > --- > > Key: FLINK-33895 > URL: https://issues.apache.org/jira/browse/FLINK-33895 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.19.0 >Reporter: Jacky Lau >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-26477) Finalize the window support in Python DataStream API
[ https://issues.apache.org/jira/browse/FLINK-26477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-26477: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Finalize the window support in Python DataStream API > > > Key: FLINK-26477 > URL: https://issues.apache.org/jira/browse/FLINK-26477 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Dian Fu >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > It has provided window support in Python DataStream API in FLINK-21842. > However, there are still a few functionalities missing, e.g. evictor, etc. > This is an umbrella JIRA to collect all the missing part of work. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-24951) Allow watch bookmarks to mitigate frequent watcher rebuilding
[ https://issues.apache.org/jira/browse/FLINK-24951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-24951: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Allow watch bookmarks to mitigate frequent watcher rebuilding > - > > Key: FLINK-24951 > URL: https://issues.apache.org/jira/browse/FLINK-24951 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.15.0 >Reporter: guoyangze#1 >Priority: Major > Fix For: 2.0.0 > > > In some production environments, massive numbers of pods are created and > deleted. Thus the global resource version is updated very quickly, which may > cause frequent watcher rebuilding because of "too old resource version" errors. To > avoid this, K8s provides a Bookmark mechanism[1]. > I propose enabling bookmarks by default. > [1] > https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34670) The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only create one worker thread
[ https://issues.apache.org/jira/browse/FLINK-34670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-34670: --- Fix Version/s: 2.0.0 (was: 1.20.0) > The asyncOperationsThreadPool in SubtaskCheckpointCoordinatorImpl can only > create one worker thread > --- > > Key: FLINK-34670 > URL: https://issues.apache.org/jira/browse/FLINK-34670 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / State Backends >Affects Versions: 1.18.0, 1.19.0 >Reporter: Jinzhong Li >Priority: Critical > Labels: pull-request-available > Fix For: 2.0.0 > > Attachments: image-2024-03-14-20-24-14-198.png, > image-2024-03-14-20-27-37-540.png, image-2024-03-14-20-33-28-851.png > > > Currently, the asyncOperations ThreadPoolExecutor of > SubtaskCheckpointCoordinatorImpl is created with a LinkedBlockingQueue and > zero corePoolSize. > !image-2024-03-14-20-24-14-198.png|width=614,height=198! > And in the ThreadPoolExecutor, except for the first time a task is > submitted, *no* new thread is created until the queue is full. But the > capacity of the LinkedBlockingQueue is Integer.MAX_VALUE. This means that there is > almost *only one thread* working in this thread pool, *even if* {*}there are > many concurrent checkpoint requests or checkpoint abort requests waiting to > be processed{*}. > !image-2024-03-14-20-27-37-540.png|width=614,height=175! > This problem can be verified by changing the ExecutorService implementation in the UT > SubtaskCheckpointCoordinatorTest#testNotifyCheckpointAbortedDuringAsyncPhase. > When the LinkedBlockingQueue and zero corePoolSize are configured, this UT > will deadlock because only one worker thread can be created. > !image-2024-03-14-20-33-28-851.png|width=606,height=235! -- This message was sent by Atlassian Jira (v8.20.10#820010)
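The queue/corePoolSize interaction described above can be reproduced with the JDK alone (the class and method names are illustrative, not Flink code): with corePoolSize = 0 and an unbounded LinkedBlockingQueue, the pool grows to a single worker and never beyond it, because additional workers are only created when the queue rejects a task, which an unbounded queue never does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative reproduction (not Flink code) of the single-worker behaviour:
// submit several blocking tasks and observe how many worker threads exist.
public class SingleWorkerDemo {
    public static int observedPoolSize(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>()); // unbounded: offer() never fails
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // block so tasks pile up in the queue
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int size = pool.getPoolSize(); // workers created so far
        release.countDown();
        pool.shutdown();
        return size;
    }
}
```

Only the very first execute() spawns a worker (because the pool is empty); every later task is queued, so the pool size stays at one no matter how many tasks are pending.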
[jira] [Updated] (FLINK-20920) Document how to connect to kerberized HMS
[ https://issues.apache.org/jira/browse/FLINK-20920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-20920: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Document how to connect to kerberized HMS > - > > Key: FLINK-20920 > URL: https://issues.apache.org/jira/browse/FLINK-20920 > Project: Flink > Issue Type: Improvement > Components: Connectors / Hive, Documentation >Reporter: Rui Li >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34094) Document new AsyncScalarFunction
[ https://issues.apache.org/jira/browse/FLINK-34094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-34094: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Document new AsyncScalarFunction > > > Key: FLINK-34094 > URL: https://issues.apache.org/jira/browse/FLINK-34094 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: Timo Walther >Assignee: Alan Sheinberg >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Write documentation in the user-defined functions page, maybe summarizing the > behavior of async calls in general. We could also think about adding a REST API > example in flink-table-examples. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-16776) Support schema evolution for Hive tables
[ https://issues.apache.org/jira/browse/FLINK-16776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-16776: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Support schema evolution for Hive tables > > > Key: FLINK-16776 > URL: https://issues.apache.org/jira/browse/FLINK-16776 > Project: Flink > Issue Type: New Feature > Components: Connectors / Hive >Affects Versions: 1.10.0 >Reporter: Rui Li >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28777) Add configure session API for sql gateway rest endpoint
[ https://issues.apache.org/jira/browse/FLINK-28777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-28777: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Add configure session API for sql gateway rest endpoint > --- > > Key: FLINK-28777 > URL: https://issues.apache.org/jira/browse/FLINK-28777 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Gateway >Reporter: Wencong Liu >Priority: Major > Fix For: 2.0.0 > > > During the development of version 1.16, we temporarily skipped the > configure session API in the SQL Gateway REST endpoint. Considering the > workload, and since the SQL Gateway does not yet need to be compatible with the SQL > Client, the relevant development work will be carried out in version 1.17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-12491) Incorrect documentation for directory path separators of CoreOptions.TMP_DIRS
[ https://issues.apache.org/jira/browse/FLINK-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-12491: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Incorrect documentation for directory path separators of CoreOptions.TMP_DIRS > - > > Key: FLINK-12491 > URL: https://issues.apache.org/jira/browse/FLINK-12491 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Affects Versions: 1.6.4, 1.7.2, 1.8.0, 1.9.0 >Reporter: Kezhu Wang >Priority: Minor > Labels: auto-unassigned, pull-request-available > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > {{CoreOptions.TMP_DIRS}} and {{ConfigConstants.TASK_MANAGER_TMP_DIR_KEY}} > both say that: > {quote} > The config parameter defining the directories for temporary files, separated > by >* ",", "|", or the system's \{@link java.io.File#pathSeparator}. > {quote} > But the parsing phase uses {{String.split}} with argument {{",|" + > File.pathSeparator}} eventually. However, in fact the sole parameter of > {{String.split}} is a regular expression, so the directory path separators > are "," or {{java.io.File#pathSeparator}}. After digging into history, I > found that the documentation was introduced in commit > {{a7c407ace4f6cbfbde3e247071cee5a755ae66db}} and inherited by > {{76abcaa55d0d6ab704b7ab8164718e8e2dcae2c4}}. So, I think it is safe to drop > "|" from documentation. > {code:title=ConfigurationUtils.java} > public class ConfigurationUtils { > private static String[] splitPaths(@Nonnull String separatedPaths) { > return separatedPaths.length() > 0 ? separatedPaths.split(",|" > + File.pathSeparator) : EMPTY; > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
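The distinction the ticket makes is easy to demonstrate with plain String.split; the sketch below mirrors the splitPaths logic (expected values in the comments assume a Unix-style ':' path separator):

```java
public class SplitSeparatorDemo {

    // The old docs claimed ",", "|", and File.pathSeparator all split the
    // directory list. In reality String.split takes a regex, and in the
    // regex ",|:" the '|' is alternation ("comma OR colon"), so a literal
    // '|' character is NOT a separator.
    public static String[] splitPaths(String separatedPaths) {
        return separatedPaths.split(",|" + java.io.File.pathSeparator);
    }

    public static void main(String[] args) {
        System.out.println(splitPaths("/tmp/a,/tmp/b").length); // 2: comma splits
        System.out.println(splitPaths("/tmp/a|/tmp/b").length); // 1: '|' does not split
    }
}
```

This confirms the ticket's conclusion: the actual separators are "," and File.pathSeparator, so dropping "|" from the documentation is safe.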
[jira] [Updated] (FLINK-14934) Remove error log statement in ES connector
[ https://issues.apache.org/jira/browse/FLINK-14934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-14934: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Remove error log statement in ES connector > -- > > Key: FLINK-14934 > URL: https://issues.apache.org/jira/browse/FLINK-14934 > Project: Flink > Issue Type: Improvement > Components: Connectors / ElasticSearch >Affects Versions: 1.9.1 >Reporter: Arvid Heise >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 2.0.0 > > > The ES connector currently uses the [log and throw > antipattern|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-elasticsearch-base/src/main/java/org/apache/flink/streaming/connectors/elasticsearch/ElasticsearchSinkBase.java#L406], > which doesn't allow users to ignore certain types of errors without getting > their logs spammed. > The log statement should be removed completely. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28398) CheckpointCoordinatorTriggeringTest.discardingTriggeringCheckpointWillExecuteNextCheckpointRequest( gets stuck
[ https://issues.apache.org/jira/browse/FLINK-28398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-28398: --- Fix Version/s: 2.0.0 (was: 1.20.0) > CheckpointCoordinatorTriggeringTest.discardingTriggeringCheckpointWillExecuteNextCheckpointRequest( > gets stuck > -- > > Key: FLINK-28398 > URL: https://issues.apache.org/jira/browse/FLINK-28398 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.16.0 >Reporter: Martijn Visser >Priority: Major > Labels: stale-assigned, test-stability > Fix For: 2.0.0 > > > {code:java} > Jul 01 02:16:55 "main" #1 prio=5 os_prio=0 tid=0x7fe41000b800 nid=0x5ca2 > in Object.wait() [0x7fe41a429000] > Jul 01 02:16:55java.lang.Thread.State: WAITING (on object monitor) > Jul 01 02:16:55 at java.lang.Object.wait(Native Method) > Jul 01 02:16:55 at java.lang.Object.wait(Object.java:502) > Jul 01 02:16:55 at > org.apache.flink.core.testutils.OneShotLatch.await(OneShotLatch.java:61) > Jul 01 02:16:55 - locked <0xf096ab58> (a java.lang.Object) > Jul 01 02:16:55 at > org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTriggeringTest.discardingTriggeringCheckpointWillExecuteNextCheckpointRequest(CheckpointCoordinatorTriggeringTest.java:731) > Jul 01 02:16:55 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Jul 01 02:16:55 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Jul 01 02:16:55 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Jul 01 02:16:55 at java.lang.reflect.Method.invoke(Method.java:498) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37433&view=logs&j=d89de3df-4600-5585-dadc-9bbc9a5e661c&t=be5a4b15-4b23-56b1-7582-795f58a645a2&l=15207 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-21876) Handle it properly when the returned value of Python UDF doesn't match the defined result type
[ https://issues.apache.org/jira/browse/FLINK-21876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-21876: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Handle it properly when the returned value of Python UDF doesn't match the > defined result type > -- > > Key: FLINK-21876 > URL: https://issues.apache.org/jira/browse/FLINK-21876 > Project: Flink > Issue Type: Improvement > Components: API / Python >Affects Versions: 1.10.0, 1.11.0, 1.12.0 >Reporter: Dian Fu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 2.0.0 > > > Currently, when the returned value of a Python UDF doesn't match its defined > result type, it will throw the following exception during > execution: > {code:java} > Caused by: java.io.EOFException > at java.io.DataInputStream.readFully(DataInputStream.java:197) > at java.io.DataInputStream.readFully(DataInputStream.java:169) > at > org.apache.flink.table.runtime.typeutils.StringDataSerializer.deserializeInternal(StringDataSerializer.java:88) > at > org.apache.flink.table.runtime.typeutils.StringDataSerializer.deserialize(StringDataSerializer.java:82) > at > org.apache.flink.table.runtime.typeutils.StringDataSerializer.deserialize(StringDataSerializer.java:34) > at > org.apache.flink.table.runtime.typeutils.serializers.python.MapDataSerializer.deserializeInternal(MapDataSerializer.java:129) > at > org.apache.flink.table.runtime.typeutils.serializers.python.MapDataSerializer.deserialize(MapDataSerializer.java:110) > at > org.apache.flink.table.runtime.typeutils.serializers.python.MapDataSerializer.deserialize(MapDataSerializer.java:46) > at > org.apache.flink.table.runtime.typeutils.serializers.python.RowDataSerializer.deserialize(RowDataSerializer.java:106) > at > org.apache.flink.table.runtime.typeutils.serializers.python.RowDataSerializer.deserialize(RowDataSerializer.java:49) > at > 
org.apache.flink.table.runtime.operators.python.scalar.RowDataPythonScalarFunctionOperator.emitResult(RowDataPythonScalarFunctionOperator.java:81) > at > org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.emitResults(AbstractPythonFunctionOperator.java:250) > at > org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.invokeFinishBundle(AbstractPythonFunctionOperator.java:273) > at > org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.processWatermark(AbstractPythonFunctionOperator.java:199) > at > org.apache.flink.streaming.runtime.tasks.ChainingOutput.emitWatermark(ChainingOutput.java:123) > at > org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask$AsyncDataOutputToOutput.emitWatermark(SourceOperatorStreamTask.java:170) > at > org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask.advanceToEndOfEventTime(SourceOperatorStreamTask.java:110) > at > org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask.afterInvoke(SourceOperatorStreamTask.java:116) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:589) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570) > at java.lang.Thread.run(Thread.java:748) > {code} > The exception isn't straightforward, and it's difficult for users > to figure out the root cause of the issue. > Since Python is a dynamic language, this case should be very common, and it would be > great if we could handle it properly. > See > [https://stackoverflow.com/questions/66687797/pyflink-java-io-eofexception-at-java-io-datainputstream-readfully] > for more details. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34129) MiniBatchGlobalGroupAggFunction will make -D as +I then make +I as -U when state expired
[ https://issues.apache.org/jira/browse/FLINK-34129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-34129: --- Fix Version/s: 2.0.0 (was: 1.20.0) > MiniBatchGlobalGroupAggFunction will make -D as +I then make +I as -U when > state expired > - > > Key: FLINK-34129 > URL: https://issues.apache.org/jira/browse/FLINK-34129 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.18.1 >Reporter: Hongshun Wang >Priority: Major > Fix For: 2.0.0 > > > Take sum for example: > When the state has expired and an update operation from the source happens, > MiniBatchGlobalGroupAggFunction takes -U[1, 20] and +U[1, 20] as input, but > will emit +I[1, -20] and -D[1, -20]. The sink will then delete the data from the > external database. > Let's see why this happens: > * When the state has expired and -U[1, 20] arrives, > MiniBatchGlobalGroupAggFunction will create a new sum accumulator and set > firstRow to true. > {code:java} > if (stateAcc == null) { > stateAcc = globalAgg.createAccumulators(); > firstRow = true; > } {code} > * The sum accumulator will then retract the sum value to -20. > * As this is the first row, MiniBatchGlobalGroupAggFunction will change -U to +I > and emit it downstream. > {code:java} > if (!recordCounter.recordCountIsZero(acc)) { > // if this was not the first row and we have to emit retractions > if (!firstRow) { > // ignore > } else { > // update acc to state > accState.update(acc); > > // this is the first, output new result > // prepare INSERT message for new row > resultRow.replace(currentKey, newAggValue).setRowKind(RowKind.INSERT); > out.collect(resultRow); > } {code} > * When the next +U[1, 20] arrives, the sum accumulator will retract the sum value to 0, > so RetractionRecordCounter#recordCountIsZero will return true. Because > firstRow is false now, it will change the +U to -D and emit it downstream. 
> {code:java} > if (!recordCounter.recordCountIsZero(acc)) { > // ignore > } else { > // we retracted the last record for this key > // if this is not first row sent out a DELETE message > if (!firstRow) { > // prepare DELETE message for previous row > resultRow.replace(currentKey, prevAggValue).setRowKind(RowKind.DELETE); > out.collect(resultRow); > } {code} > > So the sink will receive +I and -D after a source update operation, and the data > will be deleted. -- This message was sent by Atlassian Jira (v8.20.10#820010)
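The emitted change sequence can be simulated with a stripped-down model of the global aggregation. All names below are illustrative (not Flink APIs), and the accumulator is reduced to a running sum plus a record count; the point is only to show how an expired state turns a -U/+U pair into a spurious +I/-D pair:

```java
public class ExpiredStateAggDemo {
    // Minimal stand-in for the sum accumulator: running sum + record count.
    long sum = 0, count = 0;
    boolean stateExists = false, firstRow = false;

    String process(String kind, long value) {
        if (!stateExists) {          // state expired: start from scratch
            stateExists = true;
            firstRow = true;
        }
        long prevSum = sum;
        boolean retract = kind.startsWith("-");
        sum += retract ? -value : value;
        count += retract ? -1 : 1;
        if (count != 0) {
            // Record count non-zero: the "first" row for this key is
            // emitted as an INSERT, even though it was a retraction.
            if (firstRow) {
                firstRow = false;
                return "+I[" + sum + "]";
            }
            return "+U[" + sum + "]";
        }
        // Count dropped back to zero: firstRow is already false, so a
        // DELETE for the previous (wrong) result is sent downstream.
        if (!firstRow) {
            return "-D[" + prevSum + "]";
        }
        return "(drop)";
    }

    public static void main(String[] args) {
        ExpiredStateAggDemo agg = new ExpiredStateAggDemo();
        // An update from the source after state expiry: -U[1,20] then +U[1,20]
        System.out.println(agg.process("-U", 20)); // +I[-20]  (spurious INSERT)
        System.out.println(agg.process("+U", 20)); // -D[-20]  (spurious DELETE)
    }
}
```

The simulation reproduces the ticket's observation: the sink sees +I followed by -D for a row that was merely updated at the source, so the row ends up deleted in the external database.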
[jira] [Updated] (FLINK-32699) select typeof(proctime()); throw exception in sql-client
[ https://issues.apache.org/jira/browse/FLINK-32699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-32699: --- Fix Version/s: 2.0.0 (was: 1.20.0) > select typeof(proctime()); throw exception in sql-client > > > Key: FLINK-32699 > URL: https://issues.apache.org/jira/browse/FLINK-32699 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Jacky Lau >Priority: Major > Fix For: 2.0.0 > > > {code:java} > Flink SQL> select typeof(proctime()); > > [ERROR] Could not execute SQL statement. Reason: > org.apache.flink.table.planner.codegen.CodeGenException: Mismatch of > function's argument data type 'TIMESTAMP_LTZ(3) NOT NULL' and actual argument > type 'TIMESTAMP_LTZ(3)' > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [hotfix] [docs] Fix broken link to "Anatomy of the Flink distribution" in dev/configuration/connector [flink]
hlteoh37 merged PR #24906: URL: https://github.com/apache/flink/pull/24906 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33545][Connectors/Kafka] KafkaSink implementation can cause dataloss during broker issue when not using EXACTLY_ONCE if there's any batching [flink-connector-kafka]
AHeise commented on code in PR #70: URL: https://github.com/apache/flink-connector-kafka/pull/70#discussion_r1671715837 ## flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/sink/FlinkKafkaInternalProducer.java: ## @@ -72,12 +74,17 @@ private static Properties withTransactionalId( return props; } +public long getPendingRecordsCount() { +return pendingRecords.get(); +} + @Override public Future send(ProducerRecord record, Callback callback) { if (inTransaction) { hasRecordsInTransaction = true; } -return super.send(record, callback); +pendingRecords.incrementAndGet(); +return super.send(record, new TrackingCallback(callback)); Review Comment: I can't quite follow. I was proposing to use `return super.send(record, callbackCache.computeIfAbsent(callback, TrackingCallback::new));` So we have 3 cases: - New callback, wrap in `TackingCallback` and cache. - Existing callback (common case), retrieve existing callback and use it. - Remove existing `TackingCallback` from cache if full. In all cases, both the TackingCallback and the original callback will be invoked. The only difference to the code without cache is that we avoiding creating extra TrackingCallback instances around the same original callback. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
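The caching pattern proposed in the review comment can be sketched with plain JDK types. The cache type, size bound, and eviction policy below are assumptions for illustration, not the actual PR code; the point is that `computeIfAbsent` wraps each distinct original callback exactly once:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CallbackCacheDemo {
    interface Callback { void complete(); }

    // Wrapper that would decrement a pending-records counter, then delegate.
    static class TrackingCallback implements Callback {
        final Callback delegate;
        TrackingCallback(Callback delegate) { this.delegate = delegate; }
        public void complete() {
            // ... decrement pendingRecords here (omitted in this sketch) ...
            if (delegate != null) delegate.complete();
        }
    }

    static final int MAX_CACHE = 16;

    // Access-ordered LinkedHashMap as a simple LRU cache: the eldest entry
    // is evicted once the cache is full, matching the "remove existing
    // TrackingCallback from cache if full" case in the comment.
    static final Map<Callback, TrackingCallback> cache =
            new LinkedHashMap<Callback, TrackingCallback>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Callback, TrackingCallback> eldest) {
                    return size() > MAX_CACHE;
                }
            };

    static TrackingCallback wrap(Callback cb) {
        // New callback: wrap and cache. Existing callback (common case):
        // return the cached wrapper without allocating a new one.
        return cache.computeIfAbsent(cb, TrackingCallback::new);
    }

    public static void main(String[] args) {
        Callback user = () -> System.out.println("user callback");
        System.out.println(wrap(user) == wrap(user)); // same wrapper reused
    }
}
```

In all three cases both the wrapper and the original callback fire; caching only avoids allocating a fresh wrapper per `send` when the producer reuses one callback instance.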
[jira] [Updated] (FLINK-35435) [FLIP-451] Introduce timeout configuration to AsyncSink
[ https://issues.apache.org/jira/browse/FLINK-35435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-35435: --- Release Note: In Flink 1.20, We have introduced timeout configuration to `AsyncSink` with `retryOnTimeout` and `failOnTimeout` mechanisms to ensure the writer doesn't block on un-acknowledged requests. (was: In Flink 1.20, We have introduced timeout configuration to AsyncSink with retryOnTimeout and FailOnTimeout mechanisms to ensure the writer doesn't block on un-acknowledged requests.) > [FLIP-451] Introduce timeout configuration to AsyncSink > --- > > Key: FLINK-35435 > URL: https://issues.apache.org/jira/browse/FLINK-35435 > Project: Flink > Issue Type: Improvement > Components: Connectors / Common >Reporter: Ahmed Hamdy >Assignee: Ahmed Hamdy >Priority: Major > Labels: pull-request-available > Fix For: 1.20.0 > > Attachments: Screenshot 2024-05-24 at 11.06.30.png, Screenshot > 2024-05-24 at 12.06.20.png > > > Implementation Ticket for: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34111) Add JSON_QUOTE and JSON_UNQUOTE function
[ https://issues.apache.org/jira/browse/FLINK-34111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser reassigned FLINK-34111: -- Assignee: Anupam Aggarwal > Add JSON_QUOTE and JSON_UNQUOTE function > > > Key: FLINK-34111 > URL: https://issues.apache.org/jira/browse/FLINK-34111 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Martijn Visser >Assignee: Anupam Aggarwal >Priority: Major > Labels: pull-request-available > > Escapes or unescapes a JSON string removing traces of offending characters > that could prevent parsing. > Proposal: > - JSON_QUOTE: Quotes a string by wrapping it with double quote characters and > escaping interior quote and other characters, then returning the result as a > utf8mb4 string. Returns NULL if the argument is NULL. > - JSON_UNQUOTE: Unquotes value and returns the result as a string. Returns > NULL if the argument is NULL. An error occurs if the value starts and ends > with double quotes but is not a valid JSON string literal. > The following characters are reserved in JSON and must be properly escaped to > be used in strings: > Backspace is replaced with \b > Form feed is replaced with \f > Newline is replaced with \n > Carriage return is replaced with \r > Tab is replaced with \t > Double quote is replaced with \" > Backslash is replaced with \\ > This function exists in MySQL: > - > https://dev.mysql.com/doc/refman/8.0/en/json-creation-functions.html#function_json-quote > - > https://dev.mysql.com/doc/refman/8.0/en/json-modification-functions.html#function_json-unquote > It's still open in Calcite CALCITE-3130 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-35435) [FLIP-451] Introduce timeout configuration to AsyncSink
[ https://issues.apache.org/jira/browse/FLINK-35435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-35435: --- Release Note: In Flink 1.20, We have introduced timeout configuration to AsyncSink with retryOnTimeout and FailOnTimeout mechanisms to ensure the writer doesn't block on un-acknowledged requests. > [FLIP-451] Introduce timeout configuration to AsyncSink > --- > > Key: FLINK-35435 > URL: https://issues.apache.org/jira/browse/FLINK-35435 > Project: Flink > Issue Type: Improvement > Components: Connectors / Common >Reporter: Ahmed Hamdy >Assignee: Ahmed Hamdy >Priority: Major > Labels: pull-request-available > Fix For: 1.20.0 > > Attachments: Screenshot 2024-05-24 at 11.06.30.png, Screenshot > 2024-05-24 at 12.06.20.png > > > Implementation Ticket for: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34111) Add JSON_QUOTE and JSON_UNQUOTE function
[ https://issues.apache.org/jira/browse/FLINK-34111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser reassigned FLINK-34111: -- Assignee: (was: Anupam) > Add JSON_QUOTE and JSON_UNQUOTE function > > > Key: FLINK-34111 > URL: https://issues.apache.org/jira/browse/FLINK-34111 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Martijn Visser >Priority: Major > Labels: pull-request-available > > Escapes or unescapes a JSON string removing traces of offending characters > that could prevent parsing. > Proposal: > - JSON_QUOTE: Quotes a string by wrapping it with double quote characters and > escaping interior quote and other characters, then returning the result as a > utf8mb4 string. Returns NULL if the argument is NULL. > - JSON_UNQUOTE: Unquotes value and returns the result as a string. Returns > NULL if the argument is NULL. An error occurs if the value starts and ends > with double quotes but is not a valid JSON string literal. > The following characters are reserved in JSON and must be properly escaped to > be used in strings: > Backspace is replaced with \b > Form feed is replaced with \f > Newline is replaced with \n > Carriage return is replaced with \r > Tab is replaced with \t > Double quote is replaced with \" > Backslash is replaced with \\ > This function exists in MySQL: > - > https://dev.mysql.com/doc/refman/8.0/en/json-creation-functions.html#function_json-quote > - > https://dev.mysql.com/doc/refman/8.0/en/json-modification-functions.html#function_json-unquote > It's still open in Calcite CALCITE-3130 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34111) Add JSON_QUOTE and JSON_UNQUOTE function
[ https://issues.apache.org/jira/browse/FLINK-34111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser reassigned FLINK-34111: -- Assignee: Anupam (was: Jeyhun Karimov) > Add JSON_QUOTE and JSON_UNQUOTE function > > > Key: FLINK-34111 > URL: https://issues.apache.org/jira/browse/FLINK-34111 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Martijn Visser >Assignee: Anupam >Priority: Major > Labels: pull-request-available > > Escapes or unescapes a JSON string removing traces of offending characters > that could prevent parsing. > Proposal: > - JSON_QUOTE: Quotes a string by wrapping it with double quote characters and > escaping interior quote and other characters, then returning the result as a > utf8mb4 string. Returns NULL if the argument is NULL. > - JSON_UNQUOTE: Unquotes value and returns the result as a string. Returns > NULL if the argument is NULL. An error occurs if the value starts and ends > with double quotes but is not a valid JSON string literal. > The following characters are reserved in JSON and must be properly escaped to > be used in strings: > Backspace is replaced with \b > Form feed is replaced with \f > Newline is replaced with \n > Carriage return is replaced with \r > Tab is replaced with \t > Double quote is replaced with \" > Backslash is replaced with \\ > This function exists in MySQL: > - > https://dev.mysql.com/doc/refman/8.0/en/json-creation-functions.html#function_json-quote > - > https://dev.mysql.com/doc/refman/8.0/en/json-modification-functions.html#function_json-unquote > It's still open in Calcite CALCITE-3130 -- This message was sent by Atlassian Jira (v8.20.10#820010)
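The escaping rules listed in the proposal can be sketched in plain Java. This is an illustrative implementation of the JSON_QUOTE half only (not Flink's or MySQL's actual code), covering exactly the reserved characters the ticket enumerates:

```java
public class JsonQuoteDemo {

    // Quote: wrap in double quotes and escape the reserved characters
    // listed in the proposal (\b \f \n \r \t \" \\). Returns null for null,
    // matching the proposed NULL semantics.
    public static String jsonQuote(String s) {
        if (s == null) return null;
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println(jsonQuote("a\"b\nc")); // "a\"b\nc" as a JSON literal
    }
}
```

JSON_UNQUOTE would be the inverse, with the additional validation the proposal mentions (an error when the value is double-quoted but not a valid JSON string literal).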
[jira] [Commented] (FLINK-25217) FLIP-190: Support Version Upgrades for Table API & SQL Programs
[ https://issues.apache.org/jira/browse/FLINK-25217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864495#comment-17864495 ] Martijn Visser commented on FLINK-25217: [~fsk119] Since Flink 1.15 actually. > FLIP-190: Support Version Upgrades for Table API & SQL Programs > --- > > Key: FLINK-25217 > URL: https://issues.apache.org/jira/browse/FLINK-25217 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API, Table SQL / Planner >Reporter: Timo Walther >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > > Nowadays, the Table & SQL API is as important to Flink as the DataStream API. > It is one of the main abstractions for expressing pipelines that perform > stateful stream processing. Users expect the same backwards compatibility > guarantees when upgrading to a newer Flink version as with the DataStream API. > In particular, this means: > * once the operator topology is defined, it remains static and does not > change between Flink versions, unless resulting in better performance, > * business logic (defined using expressions and functions in queries) behaves > identical as before the version upgrade, > * the state of a Table & SQL API program can be restored from a savepoint of > a previous version, > * adding or removing stateful operators should be made possible in the > DataStream API. > The same query can remain up and running after upgrades. > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489 -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-35802][mysql] clean ChangeEventQueue to avoid deadlock when calling BinaryLogClient#disconnect method. [flink-cdc]
lvyanquan closed pull request #3463: [FLINK-35802][mysql] clean ChangeEventQueue to avoid deadlock when calling BinaryLogClient#disconnect method. URL: https://github.com/apache/flink-cdc/pull/3463 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-35802) Deadlock may happen after adding new tables
[ https://issues.apache.org/jira/browse/FLINK-35802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-35802: --- Labels: pull-request-available (was: ) > Deadlock may happen after adding new tables > --- > > Key: FLINK-35802 > URL: https://issues.apache.org/jira/browse/FLINK-35802 > Project: Flink > Issue Type: Bug > Components: Flink CDC >Affects Versions: cdc-3.1.0 >Reporter: LvYanquan >Priority: Major > Labels: pull-request-available > Fix For: cdc-3.2.0 > > Attachments: image-2024-07-10-13-44-49-972.png, > image-2024-07-10-13-45-52-450.png, image-2024-07-10-13-47-07-190.png > > > Problem Description: > 1.CDC originally consumed the full incremental data of a table, and > currently, the snapshot phase has ended, and it is in the binlog consumption > phase. > 2.Stop the job to add the full incremental data synchronization for a new > table. > 3.After the full phase of the new table ends, it fails to return to the > binlog consumption phase. > 4. Checking the thread that consumes the binlog, a deadlock situation is > discovered, and the specific thread stack is as follows. > 5. The likely cause is that after the Enumerator issues a > BinlogSplitUpdateRequestEvent, both the MysqlSplitReader and > MySqlBinlogSplitReadTask close the binlogClient connection but fail to > acquire the lock. > 6. The lock is held by the consumer thread, but the queue is full, waiting > for consumers to consume the data out, and yet there are no consumers, thus > causing a deadlock. > ThreadDump: > 1. MysqlSplitReader.pollSplitRecords method > !image-2024-07-10-13-44-49-972.png! > 2. MySqlStreamingChangeEventSource.execute method > !image-2024-07-10-13-45-52-450.png! > 3. MySqlBinlogSplitReadTask.handleEvent method > !image-2024-07-10-13-47-07-190.png! > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-35802][mysql] clean ChangeEventQueue to avoid deadlock when calling BinaryLogClient#disconnect method. [flink-cdc]
lvyanquan opened a new pull request, #3463: URL: https://github.com/apache/flink-cdc/pull/3463 See https://issues.apache.org/jira/browse/FLINK-35802 for more details. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-35753) ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/FLINK-35753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864493#comment-17864493 ] Weijie Guo edited comment on FLINK-35753 at 7/10/24 6:31 AM: - After checking the commits for the failed cases, I found that the {{Parquet does not support null keys in maps}} failure seems to be fixed by 64f745a5. Since all the failed AZP builds are before this fix, I'm going to close this issue. I will re-open it if it can still be reproduced. was (Author: weijie guo): After checking the commits for the failed cases, I found that The {{Parquet does not support null keys in maps}} seems fixed by 64f745a5. Since all the failed AZP build are before this fix, I will close this issue. I will re-open this if it still can be reproduced. > ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to > ArrayIndexOutOfBoundsException > > > Key: FLINK-35753 > URL: https://issues.apache.org/jira/browse/FLINK-35753 > Project: Flink > Issue Type: Bug > Components: Build System / CI, Connectors / Parent >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Priority: Major > Fix For: 1.20.0 > > > {code:java} > Jul 02 01:23:50 01:23:50.105 [ERROR] > org.apache.flink.formats.parquet.ParquetColumnarRowInputFormatTest.testContinuousRepetition(int)[2] > -- Time elapsed: 1.886 s <<< ERROR! 
> Jul 02 01:23:50 java.lang.ArrayIndexOutOfBoundsException: 500 > Jul 02 01:23:50 at > org.apache.flink.table.data.columnar.vector.heap.AbstractHeapVector.setNullAt(AbstractHeapVector.java:72) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.setFieldNullFalg(NestedColumnReader.java:251) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readArray(NestedColumnReader.java:221) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readData(NestedColumnReader.java:101) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readToVector(NestedColumnReader.java:90) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.nextBatch(ParquetVectorizedInputFormat.java:413) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.readBatch(ParquetVectorizedInputFormat.java:381) > Jul 02 01:23:50 at > org.apache.flink.connector.file.src.util.Utils.forEachRemaining(Utils.java:81) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60587&view=logs&j=1c002d28-a73d-5309-26ee-10036d8476b4&t=d1c117a6-8f13-5466-55f0-d48dbb767fcd&l=12002 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-35753) ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/FLINK-35753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo closed FLINK-35753. -- Fix Version/s: 1.20.0 Resolution: Fixed > ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to > ArrayIndexOutOfBoundsException > > > Key: FLINK-35753 > URL: https://issues.apache.org/jira/browse/FLINK-35753 > Project: Flink > Issue Type: Bug > Components: Build System / CI, Connectors / Parent >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Priority: Major > Fix For: 1.20.0 > > > {code:java} > Jul 02 01:23:50 01:23:50.105 [ERROR] > org.apache.flink.formats.parquet.ParquetColumnarRowInputFormatTest.testContinuousRepetition(int)[2] > -- Time elapsed: 1.886 s <<< ERROR! > Jul 02 01:23:50 java.lang.ArrayIndexOutOfBoundsException: 500 > Jul 02 01:23:50 at > org.apache.flink.table.data.columnar.vector.heap.AbstractHeapVector.setNullAt(AbstractHeapVector.java:72) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.setFieldNullFalg(NestedColumnReader.java:251) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readArray(NestedColumnReader.java:221) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readData(NestedColumnReader.java:101) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readToVector(NestedColumnReader.java:90) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.nextBatch(ParquetVectorizedInputFormat.java:413) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.readBatch(ParquetVectorizedInputFormat.java:381) > Jul 02 01:23:50 at > org.apache.flink.connector.file.src.util.Utils.forEachRemaining(Utils.java:81) > {code} > 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60587&view=logs&j=1c002d28-a73d-5309-26ee-10036d8476b4&t=d1c117a6-8f13-5466-55f0-d48dbb767fcd&l=12002 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35753) ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/FLINK-35753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864493#comment-17864493 ] Weijie Guo commented on FLINK-35753: After checking the commits for the failed cases, I found that the {{Parquet does not support null keys in maps}} failure seems to have been fixed by 64f745a5. Since all the exceptions predate this fix, I will close this issue. I will reopen it if it can still be reproduced. > ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to > ArrayIndexOutOfBoundsException > > > Key: FLINK-35753 > URL: https://issues.apache.org/jira/browse/FLINK-35753 > Project: Flink > Issue Type: Bug > Components: Build System / CI, Connectors / Parent >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Priority: Major > > {code:java} > Jul 02 01:23:50 01:23:50.105 [ERROR] > org.apache.flink.formats.parquet.ParquetColumnarRowInputFormatTest.testContinuousRepetition(int)[2] > -- Time elapsed: 1.886 s <<< ERROR! 
> Jul 02 01:23:50 java.lang.ArrayIndexOutOfBoundsException: 500 > Jul 02 01:23:50 at > org.apache.flink.table.data.columnar.vector.heap.AbstractHeapVector.setNullAt(AbstractHeapVector.java:72) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.setFieldNullFalg(NestedColumnReader.java:251) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readArray(NestedColumnReader.java:221) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readData(NestedColumnReader.java:101) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readToVector(NestedColumnReader.java:90) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.nextBatch(ParquetVectorizedInputFormat.java:413) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.readBatch(ParquetVectorizedInputFormat.java:381) > Jul 02 01:23:50 at > org.apache.flink.connector.file.src.util.Utils.forEachRemaining(Utils.java:81) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60587&view=logs&j=1c002d28-a73d-5309-26ee-10036d8476b4&t=d1c117a6-8f13-5466-55f0-d48dbb767fcd&l=12002 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-35753) ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/FLINK-35753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864493#comment-17864493 ] Weijie Guo edited comment on FLINK-35753 at 7/10/24 6:30 AM: - After checking the commits for the failed cases, I found that the {{Parquet does not support null keys in maps}} failure seems to have been fixed by 64f745a5. Since all the failed AZP builds predate this fix, I will close this issue. I will reopen it if it can still be reproduced. was (Author: weijie guo): After checking the commits for the failed cases, I found that the {{Parquet does not support null keys in maps}} failure seems to have been fixed by 64f745a5. Since all the exceptions predate this fix, I will close this issue. I will reopen it if it can still be reproduced. > ParquetColumnarRowInputFormatTest.testContinuousRepetition(int) failed due to > ArrayIndexOutOfBoundsException > > > Key: FLINK-35753 > URL: https://issues.apache.org/jira/browse/FLINK-35753 > Project: Flink > Issue Type: Bug > Components: Build System / CI, Connectors / Parent >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Priority: Major > Fix For: 1.20.0 > > > {code:java} > Jul 02 01:23:50 01:23:50.105 [ERROR] > org.apache.flink.formats.parquet.ParquetColumnarRowInputFormatTest.testContinuousRepetition(int)[2] > -- Time elapsed: 1.886 s <<< ERROR! 
> Jul 02 01:23:50 java.lang.ArrayIndexOutOfBoundsException: 500 > Jul 02 01:23:50 at > org.apache.flink.table.data.columnar.vector.heap.AbstractHeapVector.setNullAt(AbstractHeapVector.java:72) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.setFieldNullFalg(NestedColumnReader.java:251) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readArray(NestedColumnReader.java:221) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readData(NestedColumnReader.java:101) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.vector.reader.NestedColumnReader.readToVector(NestedColumnReader.java:90) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.nextBatch(ParquetVectorizedInputFormat.java:413) > Jul 02 01:23:50 at > org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.readBatch(ParquetVectorizedInputFormat.java:381) > Jul 02 01:23:50 at > org.apache.flink.connector.file.src.util.Utils.forEachRemaining(Utils.java:81) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60587&view=logs&j=1c002d28-a73d-5309-26ee-10036d8476b4&t=d1c117a6-8f13-5466-55f0-d48dbb767fcd&l=12002 -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-35749] Kafka sink component will lose data when kafka cluster is unavailable for a while [flink-connector-kafka]
AHeise commented on PR #107: URL: https://github.com/apache/flink-connector-kafka/pull/107#issuecomment-2219662290 There is an arch unit rule violation: ``` 2024-07-09T13:48:15.4851704Z [ERROR] Architecture Violation [Priority: MEDIUM] - Rule 'ITCASE tests should use a MiniCluster resource or extension' was violated (1 times): 2024-07-09T13:48:15.4853792Z org.apache.flink.connector.kafka.sink.KafkaWriterFaultToleranceITCase does not satisfy: only one of the following predicates match: 2024-07-09T13:48:15.4856292Z * reside in a package 'org.apache.flink.runtime.*' and contain any fields that are static, final, and of type InternalMiniClusterExtension and annotated with @RegisterExtension 2024-07-09T13:48:15.4859585Z * reside outside of package 'org.apache.flink.runtime.*' and contain any fields that are static, final, and of type MiniClusterExtension and annotated with @RegisterExtension or are , and of type MiniClusterTestEnvironment and annotated with @TestEnv 2024-07-09T13:48:15.4862798Z * reside in a package 'org.apache.flink.runtime.*' and is annotated with @ExtendWith with class InternalMiniClusterExtension 2024-07-09T13:48:15.4864813Z * reside outside of package 'org.apache.flink.runtime.*' and is annotated with @ExtendWith with class MiniClusterExtension 2024-07-09T13:48:15.4871237Z or contain any fields that are public, static, and of type MiniClusterWithClientResource and final and annotated with @ClassRule or contain any fields that is of type MiniClusterWithClientResource and public and final and not static and annotated with @Rule ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
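The quoted rule is hard to parse from the raw log. A toy, Python-only restatement of what it checks (the class and function names below are assumptions for illustration; the real rule is implemented with ArchUnit in Java and accepts several field/annotation shapes):

```python
# Toy version of the "ITCASE tests should use a MiniCluster resource or
# extension" rule: an *ITCase class must declare a class-level MiniCluster
# extension so its tests share one cluster instead of spinning up ad-hoc ones.

class MiniClusterExtension:  # stand-in for Flink's JUnit5 extension
    pass

def violates_minicluster_rule(test_class):
    """Return True if an *ITCase class has no class-level MiniClusterExtension."""
    if not test_class.__name__.endswith("ITCase"):
        return False  # rule only applies to integration test classes
    return not any(isinstance(v, MiniClusterExtension)
                   for v in vars(test_class).values())

class GoodWriterITCase:
    MINI_CLUSTER = MiniClusterExtension()  # satisfies the rule

class KafkaWriterFaultToleranceITCase:  # like the failing class in the log
    pass
```

The usual fix on the Java side is adding a `static final MiniClusterExtension` field annotated with `@RegisterExtension` (or extending a base class that has one).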
[jira] [Commented] (FLINK-35802) Deadlock may happen after adding new tables
[ https://issues.apache.org/jira/browse/FLINK-35802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864490#comment-17864490 ] ruofan commented on FLINK-35802: Please assign this to me, thanks. > Deadlock may happen after adding new tables > --- > > Key: FLINK-35802 > URL: https://issues.apache.org/jira/browse/FLINK-35802 > Project: Flink > Issue Type: Bug > Components: Flink CDC >Affects Versions: cdc-3.1.0 >Reporter: LvYanquan >Priority: Major > Fix For: cdc-3.2.0 > > Attachments: image-2024-07-10-13-44-49-972.png, > image-2024-07-10-13-45-52-450.png, image-2024-07-10-13-47-07-190.png > > > Problem Description: > 1. CDC originally consumed the full and incremental data of a table; currently the snapshot phase has ended and the job is in the binlog consumption phase. > 2. Stop the job to add full and incremental synchronization for a new table. > 3. After the full phase of the new table ends, the job fails to return to the binlog consumption phase. > 4. Checking the thread that consumes the binlog, a deadlock is discovered; the specific thread stacks are as follows. > 5. The likely cause is that after the Enumerator issues a BinlogSplitUpdateRequestEvent, both the MysqlSplitReader and MySqlBinlogSplitReadTask close the binlogClient connection but fail to acquire the lock. > 6. The lock is held by the consumer thread, but the queue is full, waiting for consumers to drain it; yet there are no consumers, thus causing a deadlock. > ThreadDump: > 1. MysqlSplitReader.pollSplitRecords method > !image-2024-07-10-13-44-49-972.png! > 2. MySqlStreamingChangeEventSource.execute method > !image-2024-07-10-13-45-52-450.png! > 3. MySqlBinlogSplitReadTask.handleEvent method > !image-2024-07-10-13-47-07-190.png! > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-21634) ALTER TABLE statement enhancement
[ https://issues.apache.org/jira/browse/FLINK-21634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-21634: --- Fix Version/s: 2.0.0 (was: 1.20.0) > ALTER TABLE statement enhancement > - > > Key: FLINK-21634 > URL: https://issues.apache.org/jira/browse/FLINK-21634 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API, Table SQL / Client >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > Labels: auto-unassigned, stale-assigned > Fix For: 2.0.0 > > > We already introduced ALTER TABLE statement in FLIP-69 [1], but only support > to rename table name and change table options. One useful feature of ALTER > TABLE statement is modifying schema. This is also heavily required by > integration with data lakes (e.g. iceberg). > Therefore, I propose to support following ALTER TABLE statements (except > {{SET}} and {{{}RENAME TO{}}}, others are all new introduced syntax): > {code:sql} > ALTER TABLE table_name { > ADD { | ( [, ...]) } > | MODIFY { | ( [, ...]) } > | DROP {column_name | (column_name, column_name, ) | PRIMARY KEY | > CONSTRAINT constraint_name | WATERMARK} > | RENAME old_column_name TO new_column_name > | RENAME TO new_table_name > | SET (key1=val1, ...) > | RESET (key1, ...) > } > :: > { | | } > :: > column_name [FIRST | AFTER column_name] > :: > [CONSTRAINT constraint_name] PRIMARY KEY (column_name, ...) 
NOT ENFORCED > :: > WATERMARK FOR rowtime_column_name AS watermark_strategy_expression > :: > { | | > } [COMMENT column_comment] > :: > column_type > :: > column_type METADATA [ FROM metadata_key ] [ VIRTUAL ] > :: > AS computed_column_expression > {code} > And some examples: > {code:sql} > -- add a new column > ALTER TABLE mytable ADD new_column STRING COMMENT 'new_column docs'; > -- add columns, constraint, and watermark > ALTER TABLE mytable ADD ( > log_ts STRING COMMENT 'log timestamp string' FIRST, > ts AS TO_TIMESTAMP(log_ts) AFTER log_ts, > PRIMARY KEY (id) NOT ENFORCED, > WATERMARK FOR ts AS ts - INTERVAL '3' SECOND > ); > -- modify a column type > ALTER TABLE prod.db.sample MODIFY measurement double COMMENT 'unit is bytes > per second' AFTER `id`; > -- modify the definitions of columns log_ts and ts, the primary key, and the watermark; they > must already exist in the table schema > ALTER TABLE mytable ADD ( > log_ts STRING COMMENT 'log timestamp string' AFTER `id`, -- reorder > columns > ts AS TO_TIMESTAMP(log_ts) AFTER log_ts, > PRIMARY KEY (id) NOT ENFORCED, > WATERMARK FOR ts AS ts - INTERVAL '3' SECOND > ); > -- drop an old column > ALTER TABLE prod.db.sample DROP measurement; > -- drop columns > ALTER TABLE prod.db.sample DROP (col1, col2, col3); > -- drop a watermark > ALTER TABLE prod.db.sample DROP WATERMARK; > -- rename column name > ALTER TABLE prod.db.sample RENAME `data` TO payload; > -- rename table name > ALTER TABLE mytable RENAME TO mytable2; > -- set options > ALTER TABLE kafka_table SET ( > 'scan.startup.mode' = 'specific-offsets', > 'scan.startup.specific-offsets' = 'partition:0,offset:42' > ); > -- reset options > ALTER TABLE kafka_table RESET ('scan.startup.mode', > 'scan.startup.specific-offsets'); > {code} > Note: we don't need to introduce new interfaces, because all the alter table > operations will be forwarded to the catalog through > {{{}Catalog#alterTable(tablePath, newTable, ignoreIfNotExists){}}}. 
> [1]: > [https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/alter/#alter-table] > [2]: [http://iceberg.apache.org/spark-ddl/#alter-table-alter-column] > [3]: [https://trino.io/docs/current/sql/alter-table.html] > [4]: [https://dev.mysql.com/doc/refman/8.0/en/alter-table.html] > [5]: [https://www.postgresql.org/docs/9.1/sql-altertable.html] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-14867) Move TextInputFormat & TextOutputFormat to flink-core
[ https://issues.apache.org/jira/browse/FLINK-14867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-14867: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Move TextInputFormat & TextOutputFormat to flink-core > - > > Key: FLINK-14867 > URL: https://issues.apache.org/jira/browse/FLINK-14867 > Project: Flink > Issue Type: Improvement > Components: API / DataSet, API / DataStream >Reporter: Zili Chen >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > This is one step to decouple the dependency from flink-streaming-java to > flink-java. We already have a package {{o.a.f.core.io}} for these formats. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-17011) Introduce builder to create AbstractStreamOperatorTestHarness for testing
[ https://issues.apache.org/jira/browse/FLINK-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-17011: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Introduce builder to create AbstractStreamOperatorTestHarness for testing > - > > Key: FLINK-17011 > URL: https://issues.apache.org/jira/browse/FLINK-17011 > Project: Flink > Issue Type: Improvement > Components: API / DataStream, Tests >Affects Versions: 1.10.0 >Reporter: Yun Tang >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The current \{{AbstractStreamOperatorTestHarness}} lacks a builder, which forces > us to keep adding constructors. Moreover, to set a customized component, we > might have to call \{{AbstractStreamOperatorTestHarness#setup}}, which might > be treated as a deprecated interface, before use. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-14410) Retrieve CharacterFilter once after registration
[ https://issues.apache.org/jira/browse/FLINK-14410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-14410: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Retrieve CharacterFilter once after registration > > > Key: FLINK-14410 > URL: https://issues.apache.org/jira/browse/FLINK-14410 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Chesnay Schepler >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 2.0.0 > > > Reporters can use a {{CharacterFilter}} to remove/replace characters when > assembling metric identifiers. Currently the reporters pass this filter in > every call to scope-related methods on the {{MetricGroup}}. We could > streamline the {{MetricScope}} API proposed in FLINK-14350 if instead we > retrieved this filter once after the reporter is initialized. > This would furthermore prevent subtle bugs if reporters use varying filters; > due to the caching of scopes only the first filter would truly be applied to > the scope. -- This message was sent by Atlassian Jira (v8.20.10#820010)
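The subtle bug described above comes from caching an assembled scope string under a single key even when different reporters supply different filters. A small sketch of scope caching keyed per filter (illustrative names, not Flink's API):

```python
# Sketch: cache the assembled metric scope per CharacterFilter, so two
# reporters with different filters each see their own filtering applied,
# instead of the first reporter's cached result leaking to the second.

class MetricGroup:
    def __init__(self, scope_components):
        self._components = scope_components
        self._cached = {}  # filter identity -> assembled scope string

    def scope_string(self, char_filter):
        key = id(char_filter)
        if key not in self._cached:
            self._cached[key] = ".".join(char_filter(c) for c in self._components)
        return self._cached[key]

# Two reporters with different filtering policies:
drop_colons = lambda s: s.replace(":", "_")
identity = lambda s: s
```

Retrieving the filter once at reporter registration (the ticket's proposal) makes this cache key explicit and fixed per reporter, rather than re-negotiated on every call.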
[jira] [Updated] (FLINK-18986) KubernetesSessionCli creates RestClusterClient for detached deployments
[ https://issues.apache.org/jira/browse/FLINK-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-18986: --- Fix Version/s: 2.0.0 (was: 1.20.0) > KubernetesSessionCli creates RestClusterClient for detached deployments > --- > > Key: FLINK-18986 > URL: https://issues.apache.org/jira/browse/FLINK-18986 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.11.0 >Reporter: Chesnay Schepler >Priority: Minor > Labels: auto-unassigned, pull-request-available, usability > Fix For: 2.0.0 > > > The {{KubernetesSessionCli}} creates a {{ClusterClient}} for retrieving the > {{clusterId}} if the cluster was just started. > However, this {{clusterId}} is only used in attached executions. > For detached deployments where {{kubernetes.rest-service.exposed.type}} is > set to {{ClusterIP}} this results in unnecessary error messages about the > {{RestClusterClient}} not being able to be created. > Given that there is no need to create the client in this situation, we should > skip this step. -- This message was sent by Atlassian Jira (v8.20.10#820010)
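A minimal sketch of the proposed control flow (hypothetical function names, not the actual `KubernetesSessionCli` code): build the REST client only for attached executions, so detached ClusterIP deployments never attempt it and never log the spurious errors:

```python
# Sketch of "skip client creation when detached": the cluster id is known
# without a client, and only attached executions need to talk to the cluster.

def deploy_session(attached, make_client):
    created = []  # track client creations to show when they happen

    def client_factory():
        c = make_client()
        created.append(c)
        return c

    cluster_id = "flink-session"  # placeholder id; known without a REST call
    if attached:
        client = client_factory()  # needed to interact with the new cluster
        return cluster_id, client, created
    # Detached: return immediately, no RestClusterClient is ever constructed.
    return cluster_id, None, created
```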
[jira] [Updated] (FLINK-21375) Refactor HybridMemorySegment
[ https://issues.apache.org/jira/browse/FLINK-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-21375: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Refactor HybridMemorySegment > > > Key: FLINK-21375 > URL: https://issues.apache.org/jira/browse/FLINK-21375 > Project: Flink > Issue Type: Technical Debt > Components: Runtime / Coordination >Reporter: Xintong Song >Priority: Minor > Labels: auto-deprioritized-major > Fix For: 2.0.0 > > > Per the discussion in [this PR|https://github.com/apache/flink/pull/14904], > we plan to refactor {{HybridMemorySegment}} as follows. > * Separate into memory type specific implementations: heap / direct / native > (unsafe) > * Remove {{wrap()}}, replacing with {{processAsByteBuffer()}} > * Remove native memory cleaner logic -- This message was sent by Atlassian Jira (v8.20.10#820010)
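The {{wrap()}} to {{processAsByteBuffer()}} change is essentially a move from handing out a long-lived buffer view to lending a view for the duration of a callback. A Python sketch of that pattern (illustrative only; these are not Flink's signatures):

```python
# Sketch: instead of wrap() returning a view whose lifetime is uncontrolled,
# process_as_byte_buffer() lends a view to a callback and releases it when
# the call returns, so callers cannot keep a dangling reference.

class MemorySegment:
    def __init__(self, size):
        self._buf = bytearray(size)

    def put(self, offset, data):
        self._buf[offset:offset + len(data)] = data

    def process_as_byte_buffer(self, offset, length, fn):
        # The memoryview is released when the with-block ends.
        with memoryview(self._buf)[offset:offset + length] as view:
            return fn(view)
```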
[jira] [Updated] (FLINK-18201) Support create_function in Python Table API
[ https://issues.apache.org/jira/browse/FLINK-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-18201: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Support create_function in Python Table API > --- > > Key: FLINK-18201 > URL: https://issues.apache.org/jira/browse/FLINK-18201 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Dian Fu >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > Fix For: 2.0.0 > > > There is an interface *createFunction* in the Java *TableEnvironment*. It's > used to register a Java UserDefinedFunction class as a catalog function. We > should align the Python Table API with Java and add such an interface in the > Python *TableEnvironment*. -- This message was sent by Atlassian Jira (v8.20.10#820010)
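The alignment being asked for is the catalog-vs-temporary split that the Java API already has. A toy model of that split (not the real PyFlink implementation; method names follow the Java side):

```python
# Toy model: create_temporary_function registers a session-scoped function,
# create_function persists into the catalog and refuses duplicates unless
# ignore_if_exists is set.

class TableEnvironment:
    def __init__(self):
        self.catalog = {}     # survives the session
        self.temporary = {}   # session-scoped

    def create_temporary_function(self, path, func):
        self.temporary[path] = func

    def create_function(self, path, func, ignore_if_exists=False):
        if path in self.catalog:
            if ignore_if_exists:
                return  # keep the existing catalog function
            raise ValueError(f"function {path} already exists in catalog")
        self.catalog[path] = func
```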
[jira] [Updated] (FLINK-22240) Sanity/phrasing pass for ExternalResourceUtils
[ https://issues.apache.org/jira/browse/FLINK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-22240: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Sanity/phrasing pass for ExternalResourceUtils > -- > > Key: FLINK-22240 > URL: https://issues.apache.org/jira/browse/FLINK-22240 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: Chesnay Schepler >Assignee: Yangze Guo >Priority: Minor > Labels: pull-request-available, stale-assigned > Fix For: 2.0.0 > > > We should do a pass over the {{ExternalResourceUtils}}, because various > phrases could use some refinement (e.g., "The amount of the {} should be > positive while finding {}. Will ignore that resource."), some log messages > are misleading (resources are logged as enabled on both the JM/TM, but this > actually doesn't happen on the JM; and on the TM the driver discovery may > subsequently fail), and we may want to rethink whether we shouldn't fail if a > driver could not be discovered. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35802) Deadlock may happen after adding new tables
LvYanquan created FLINK-35802: - Summary: Deadlock may happen after adding new tables Key: FLINK-35802 URL: https://issues.apache.org/jira/browse/FLINK-35802 Project: Flink Issue Type: Bug Components: Flink CDC Affects Versions: cdc-3.1.0 Reporter: LvYanquan Fix For: cdc-3.2.0 Attachments: image-2024-07-10-13-44-49-972.png, image-2024-07-10-13-45-52-450.png, image-2024-07-10-13-47-07-190.png Problem Description: 1. CDC originally consumed the full and incremental data of a table; currently the snapshot phase has ended and the job is in the binlog consumption phase. 2. Stop the job to add full and incremental synchronization for a new table. 3. After the full phase of the new table ends, the job fails to return to the binlog consumption phase. 4. Checking the thread that consumes the binlog, a deadlock is discovered; the specific thread stacks are as follows. 5. The likely cause is that after the Enumerator issues a BinlogSplitUpdateRequestEvent, both the MysqlSplitReader and MySqlBinlogSplitReadTask close the binlogClient connection but fail to acquire the lock. 6. The lock is held by the consumer thread, but the queue is full, waiting for consumers to drain it; yet there are no consumers, thus causing a deadlock. ThreadDump: 1. MysqlSplitReader.pollSplitRecords method !image-2024-07-10-13-44-49-972.png! 2. MySqlStreamingChangeEventSource.execute method !image-2024-07-10-13-45-52-450.png! 3. MySqlBinlogSplitReadTask.handleEvent method !image-2024-07-10-13-47-07-190.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
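The hazard in steps 5 and 6 boils down to "enqueue into a full bounded queue while holding a lock, with no consumer left". A distilled, non-blocking Python sketch of that state (no Flink or Debezium code involved; `put_nowait` stands in for the blocking `put()` that would hang forever):

```python
# Sketch of the deadlock ingredients: the reader thread holds the binlog
# lock while enqueueing into a bounded queue. If the only consumer is gone
# and the queue fills up, a blocking put() would wait forever while still
# holding the lock -- the deadlock described in the ticket.

import queue
import threading

event_queue = queue.Queue(maxsize=2)   # bounded, like the record queue
binlog_lock = threading.Lock()

def produce_events(events):
    """Reader thread: emits events while holding the binlog lock."""
    dropped = []
    with binlog_lock:
        for e in events:
            try:
                event_queue.put_nowait(e)   # a blocking put() here = deadlock
            except queue.Full:
                dropped.append(e)           # shown instead of hanging
    return dropped
```

The usual remedies are enqueueing outside the lock, using a timed/interruptible put, or guaranteeing the consumer drains the queue before the lock is needed elsewhere.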
Re: [PR] [FLINK-33730][doc] Update the Flink upgrade savepoint compatibility table doc for chinese version [flink]
flinkbot commented on PR #25065: URL: https://github.com/apache/flink/pull/25065#issuecomment-2219548572 ## CI report: * 4be52c4e48622e7c8b69ee11aa43973e1e8e76d5 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-32239) Unify TestJvmProcess and TestProcessBuilder
[ https://issues.apache.org/jira/browse/FLINK-32239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-32239: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Unify TestJvmProcess and TestProcessBuilder > --- > > Key: FLINK-32239 > URL: https://issues.apache.org/jira/browse/FLINK-32239 > Project: Flink > Issue Type: Technical Debt > Components: Test Infrastructure >Reporter: Chesnay Schepler >Assignee: Samrat Deb >Priority: Minor > Labels: stale-assigned, starter > Fix For: 2.0.0 > > > Both of these utility classes are used to spawn additional JVM processes > during tests, and contain a fair bit of duplicated logic. We can unify them > to ease maintenance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
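For a sense of what a unified helper could look like, here is a sketch (a hypothetical API, not either existing Flink class; Python's `subprocess` stands in for spawning a JVM): one small wrapper that launches the extra process and captures its exit code and output:

```python
# Sketch of a single process-spawning test utility: configure the child's
# entry point once, run it, and get (returncode, stdout) back -- the shared
# core that TestJvmProcess and TestProcessBuilder both duplicate today.

import subprocess
import sys

class TestProcess:
    def __init__(self, code):
        self._code = code  # the child's "main", as inline code here

    def run(self, timeout=10):
        result = subprocess.run(
            [sys.executable, "-c", self._code],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.returncode, result.stdout.strip()
```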
[jira] [Updated] (FLINK-26081) Update documentation because of the added Maven wrapper
[ https://issues.apache.org/jira/browse/FLINK-26081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-26081: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Update documentation because of the added Maven wrapper > - > > Key: FLINK-26081 > URL: https://issues.apache.org/jira/browse/FLINK-26081 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.15.0 >Reporter: Aiden Gong >Assignee: Aiden Gong >Priority: Minor > Fix For: 2.0.0 > > > Update the documentation because of the added Maven wrapper. > Related files: README.md, project-configuration.md, building.md, ide_setup.md. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33319) Add AverageTime metric to measure delta change in GC time
[ https://issues.apache.org/jira/browse/FLINK-33319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-33319: --- Fix Version/s: 2.0.0 (was: 1.20.0) > Add AverageTime metric to measure delta change in GC time > - > > Key: FLINK-33319 > URL: https://issues.apache.org/jira/browse/FLINK-33319 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Metrics >Reporter: Gyula Fora >Assignee: ouyangwulin >Priority: Minor > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
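Assuming "AverageTime" means the average GC pause over the last reporting window (an interpretation of the ticket title, not confirmed by it), the metric is just delta(total GC time) divided by delta(GC count) between two samples of the cumulative MXBean counters:

```python
# Sketch of a delta-based GC metric: the JVM exposes cumulative GC time and
# count; sampling both and dividing the deltas yields the average pause
# duration within the last window.

class GcDeltaMetric:
    def __init__(self):
        self._last_time_ms = 0
        self._last_count = 0

    def update(self, total_time_ms, total_count):
        """Feed the cumulative counters; return avg pause (ms) since last call."""
        dt = total_time_ms - self._last_time_ms
        dc = total_count - self._last_count
        self._last_time_ms, self._last_count = total_time_ms, total_count
        return dt / dc if dc > 0 else 0.0
```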
[jira] [Assigned] (FLINK-35683) [Release-1.20] Verify that no exclusions were erroneously added to the japicmp plugin
[ https://issues.apache.org/jira/browse/FLINK-35683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo reassigned FLINK-35683: -- Assignee: Weijie Guo > [Release-1.20] Verify that no exclusions were erroneously added to the > japicmp plugin > -- > > Key: FLINK-35683 > URL: https://issues.apache.org/jira/browse/FLINK-35683 > Project: Flink > Issue Type: Sub-task >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > > Verify that no exclusions were erroneously added to the japicmp plugin that > break compatibility guarantees. Check the exclusions for the > japicmp-maven-plugin in the root pom (see > [apache/flink:pom.xml:2175ff|https://github.com/apache/flink/blob/3856c49af77601cf7943a5072d8c932279ce46b4/pom.xml#L2175] > for exclusions that: > * For minor releases: break source compatibility for {{@Public}} APIs > * For patch releases: break source/binary compatibility for > {{@Public}}/{{@PublicEvolving}} APIs > Any such exclusion must be properly justified, in advance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-35680) [Release-1.20] Review Release Notes in JIRA
[ https://issues.apache.org/jira/browse/FLINK-35680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo closed FLINK-35680. -- Resolution: Done > [Release-1.20] Review Release Notes in JIRA > --- > > Key: FLINK-35680 > URL: https://issues.apache.org/jira/browse/FLINK-35680 > Project: Flink > Issue Type: Sub-task >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > Fix For: 1.20.0 > > > JIRA automatically generates Release Notes based on the {{Fix Version}} field > applied to issues. Release Notes are intended for Flink users (not Flink > committers/contributors). You should ensure that Release Notes are > informative and useful. > Open the release notes from the version status page by choosing the release > underway and clicking Release Notes. > You should verify that the issues listed automatically by JIRA are > appropriate to appear in the Release Notes. Specifically, issues should: > * Be appropriately classified as {{{}Bug{}}}, {{{}New Feature{}}}, > {{{}Improvement{}}}, etc. > * Represent noteworthy user-facing changes, such as new functionality, > backward-incompatible API changes, or performance improvements. > * Have occurred since the previous release; an issue that was introduced and > fixed between releases should not appear in the Release Notes. > * Have an issue title that makes sense when read on its own. > Adjust any of the above properties to improve the clarity and presentation of > the Release Notes. > Ensure that the JIRA release notes are also included in the release notes of > the documentation (see section "Review and update documentation"). > h4. 
Content of Release Notes field from JIRA tickets > To get the list of "release notes" fields from JIRA, you can run the following > script against the JIRA REST API (note that maxResults limits the number of > entries): > {code:bash} > curl -s > https://issues.apache.org/jira//rest/api/2/search?maxResults=200&jql=project%20%3D%20FLINK%20AND%20%22Release%20Note%22%20is%20not%20EMPTY%20and%20fixVersion%20%3D%20${RELEASE_VERSION} > | jq '.issues[]|.key,.fields.summary,.fields.customfield_12310192' | paste - > - - > {code} > {{jq}} is present in most Linux distributions and on macOS it can be installed > via brew. > > > h3. Expectations > * Release Notes in JIRA have been audited and adjusted -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [FLINK-33730][doc] Update the Flink upgrade savepoint compatibility table doc for chinese version [flink]
1996fanrui opened a new pull request, #25065: URL: https://github.com/apache/flink/pull/25065 ## What is the purpose of the change https://github.com/apache/flink/pull/23859 only updated the English version, this PR updates the zh doc. ## Brief change log [FLINK-33730][doc] Update the Flink upgrade savepoint compatibility table doc for chinese version -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-35680) [Release-1.20] Review Release Notes in JIRA
[ https://issues.apache.org/jira/browse/FLINK-35680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864474#comment-17864474 ] Weijie Guo commented on FLINK-35680: [~fanrui] and I have reviewed all JIRA tickets related to 1.20 and updated all release notes. I'm going to close this now. > [Release-1.20] Review Release Notes in JIRA > --- > > Key: FLINK-35680 > URL: https://issues.apache.org/jira/browse/FLINK-35680 > Project: Flink > Issue Type: Sub-task >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > Fix For: 1.20.0 > > > JIRA automatically generates Release Notes based on the {{Fix Version}} field > applied to issues. Release Notes are intended for Flink users (not Flink > committers/contributors). You should ensure that Release Notes are > informative and useful. > Open the release notes from the version status page by choosing the release > underway and clicking Release Notes. > You should verify that the issues listed automatically by JIRA are > appropriate to appear in the Release Notes. Specifically, issues should: > * Be appropriately classified as {{{}Bug{}}}, {{{}New Feature{}}}, > {{{}Improvement{}}}, etc. > * Represent noteworthy user-facing changes, such as new functionality, > backward-incompatible API changes, or performance improvements. > * Have occurred since the previous release; an issue that was introduced and > fixed between releases should not appear in the Release Notes. > * Have an issue title that makes sense when read on its own. > Adjust any of the above properties to improve the clarity and presentation of > the Release Notes. > Ensure that the JIRA release notes are also included in the release notes of > the documentation (see section "Review and update documentation"). > h4. 
Content of Release Notes field from JIRA tickets > To get the list of "release notes" field values from JIRA, you can run the following > script using the JIRA REST API (note that maxResults limits the number of > entries): > {code:bash} > curl -s > "https://issues.apache.org/jira/rest/api/2/search?maxResults=200&jql=project%20%3D%20FLINK%20AND%20%22Release%20Note%22%20is%20not%20EMPTY%20and%20fixVersion%20%3D%20${RELEASE_VERSION}" > | jq '.issues[]|.key,.fields.summary,.fields.customfield_12310192' | paste - > - - > {code} > {{jq}} is present in most Linux distributions and on macOS it can be installed > via brew. > > > h3. Expectations > * Release Notes in JIRA have been audited and adjusted -- This message was sent by Atlassian Jira (v8.20.10#820010)
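The quoted script is easy to get wrong in a shell: if the URL is left unquoted, the `&` sends curl to the background and the `jql` parameter is silently dropped. A minimal runnable sketch with proper quoting follows; the curl invocation is left commented out so the sketch has no network dependency, and `RELEASE_VERSION` is a placeholder.

```shell
# Build the JIRA search URL for issues whose "Release Note" field is non-empty.
# Quoting the URL is essential: an unquoted '&' would background the command.
RELEASE_VERSION="1.20.0"
URL="https://issues.apache.org/jira/rest/api/2/search?maxResults=200&jql=project%20%3D%20FLINK%20AND%20%22Release%20Note%22%20is%20not%20EMPTY%20and%20fixVersion%20%3D%20${RELEASE_VERSION}"
echo "$URL"
# Fetch and tabulate key, summary, and the release-note custom field:
# curl -s "$URL" | jq '.issues[]|.key,.fields.summary,.fields.customfield_12310192' | paste - - -
```

The `paste - - -` at the end joins every three jq output lines (key, summary, note) into one tab-separated row.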
[jira] [Updated] (FLINK-26050) Too many small sst files in rocksdb state backend when using time window created in ascending order
[ https://issues.apache.org/jira/browse/FLINK-26050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-26050: --- Release Note: In some cases, the number of files produced by the RocksDB state backend grows indefinitely. This might cause task state info (TDD and checkpoint ACK) to exceed the RPC message size and fail recovery/checkpoint, in addition to producing lots of small files. In Flink 1.20, you can manually merge such files in the background using the RocksDB API. > Too many small sst files in rocksdb state backend when using time window > created in ascending order > --- > > Key: FLINK-26050 > URL: https://issues.apache.org/jira/browse/FLINK-26050 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.10.2, 1.14.3 >Reporter: shen >Assignee: Roman Khachatryan >Priority: Major > Labels: pull-request-available > Fix For: 1.20.0 > > Attachments: image-2022-02-09-21-22-13-920.png, > image-2022-02-11-10-32-14-956.png, image-2022-02-11-10-36-46-630.png, > image-2022-02-14-13-04-52-325.png > > > When using processing or event time windows, in some workloads, there will be > a lot of small sst files (several KB) in the rocksdb local directory, which may > cause a "Too many files" error. > Use the rocksdb tool ldb to inspect the content of the sst files: > * the column family of these small sst files is "processing_window-timers". > * most sst files are in level-1. > * records in sst files are almost all kTypeDeletion. > * the creation times of the sst files correspond to the checkpoint interval. > These small sst files seem to be generated when a flink checkpoint is > triggered. Although all content in the sst files consists of delete tags, they are not > compacted and deleted in rocksdb compaction because they do not intersect with > each other (rocksdb [compaction trivial > move|https://github.com/facebook/rocksdb/wiki/Compaction-Trivial-Move]), and > there seems to be no chance to delete them because of their small size and lack of > intersection with other sst files. 
> > I will attach a simple program to reproduce the problem. > > Since timer in processing time window is generated in strictly ascending > order(both put and delete). So If workload of job happen to generate level-0 > sst files not intersect with each other.(for example: processing window size > much smaller than checkpoint interval, and no window content cross checkpoint > interval or no new data in window crossing checkpoint interval). There will > be many small sst files generated until job restored from savepoint, or > incremental checkpoint is disabled. > > May be similar problem exists when user use timer in operators with same > workload. > > Code to reproduce the problem: > {code:java} > package org.apache.flink.jira; > import lombok.extern.slf4j.Slf4j; > import org.apache.flink.configuration.Configuration; > import org.apache.flink.configuration.RestOptions; > import org.apache.flink.configuration.TaskManagerOptions; > import org.apache.flink.contrib.streaming.state.RocksDBStateBackend; > import org.apache.flink.streaming.api.TimeCharacteristic; > import org.apache.flink.streaming.api.checkpoint.ListCheckpointed; > import org.apache.flink.streaming.api.datastream.DataStreamSource; > import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > import org.apache.flink.streaming.api.functions.source.SourceFunction; > import > org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction; > import > org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows; > import org.apache.flink.streaming.api.windowing.time.Time; > import org.apache.flink.streaming.api.windowing.windows.TimeWindow; > import org.apache.flink.util.Collector; > import java.util.Collections; > import java.util.List; > import java.util.Random; > @Slf4j > public class StreamApp { > public static void main(String[] args) throws Exception { > Configuration config = new Configuration(); > config.set(RestOptions.ADDRESS, "127.0.0.1"); > 
config.set(RestOptions.PORT, 10086); > config.set(TaskManagerOptions.NUM_TASK_SLOTS, 6); > new > StreamApp().configureApp(StreamExecutionEnvironment.createLocalEnvironment(1, > config)); > } > public void configureApp(StreamExecutionEnvironment env) throws Exception { > env.enableCheckpointing(2); // 20sec > RocksDBStateBackend rocksDBStateBackend = > new > RocksDBStateBackend("file:///Users/shenjiaqi/Workspace/jira/flink-51/checkpoints/", > true); // need to be reconfigured > > rocksDBStateBackend.setDbStoragePath("/Users/shenjiaqi/Workspace/jira/flink-51/flink/rocksdb_local_db"); > // need to be reconfigured > env.setStateBackend(rock
[jira] [Updated] (FLINK-35680) [Release-1.20] Review Release Notes in JIRA
[ https://issues.apache.org/jira/browse/FLINK-35680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-35680: --- Fix Version/s: 1.20.0 > [Release-1.20] Review Release Notes in JIRA > --- > > Key: FLINK-35680 > URL: https://issues.apache.org/jira/browse/FLINK-35680 > Project: Flink > Issue Type: Sub-task >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > Fix For: 1.20.0 > > > JIRA automatically generates Release Notes based on the {{Fix Version}} field > applied to issues. Release Notes are intended for Flink users (not Flink > committers/contributors). You should ensure that Release Notes are > informative and useful. > Open the release notes from the version status page by choosing the release > underway and clicking Release Notes. > You should verify that the issues listed automatically by JIRA are > appropriate to appear in the Release Notes. Specifically, issues should: > * Be appropriately classified as {{{}Bug{}}}, {{{}New Feature{}}}, > {{{}Improvement{}}}, etc. > * Represent noteworthy user-facing changes, such as new functionality, > backward-incompatible API changes, or performance improvements. > * Have occurred since the previous release; an issue that was introduced and > fixed between releases should not appear in the Release Notes. > * Have an issue title that makes sense when read on its own. > Adjust any of the above properties to improve the clarity and presentation of > the Release Notes. > Ensure that the JIRA release notes are also included in the release notes of > the documentation (see section "Review and update documentation"). > h4. 
Content of Release Notes field from JIRA tickets > To get the list of "release notes" field values from JIRA, you can run the following > script using the JIRA REST API (note that maxResults limits the number of > entries): > {code:bash} > curl -s > "https://issues.apache.org/jira/rest/api/2/search?maxResults=200&jql=project%20%3D%20FLINK%20AND%20%22Release%20Note%22%20is%20not%20EMPTY%20and%20fixVersion%20%3D%20${RELEASE_VERSION}" > | jq '.issues[]|.key,.fields.summary,.fields.customfield_12310192' | paste - > - - > {code} > {{jq}} is present in most Linux distributions and on macOS it can be installed > via brew. > > > h3. Expectations > * Release Notes in JIRA have been audited and adjusted -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35680) [Release-1.20] Review Release Notes in JIRA
[ https://issues.apache.org/jira/browse/FLINK-35680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo reassigned FLINK-35680: -- Assignee: Weijie Guo (was: Ufuk Celebi) > [Release-1.20] Review Release Notes in JIRA > --- > > Key: FLINK-35680 > URL: https://issues.apache.org/jira/browse/FLINK-35680 > Project: Flink > Issue Type: Sub-task >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > > JIRA automatically generates Release Notes based on the {{Fix Version}} field > applied to issues. Release Notes are intended for Flink users (not Flink > committers/contributors). You should ensure that Release Notes are > informative and useful. > Open the release notes from the version status page by choosing the release > underway and clicking Release Notes. > You should verify that the issues listed automatically by JIRA are > appropriate to appear in the Release Notes. Specifically, issues should: > * Be appropriately classified as {{{}Bug{}}}, {{{}New Feature{}}}, > {{{}Improvement{}}}, etc. > * Represent noteworthy user-facing changes, such as new functionality, > backward-incompatible API changes, or performance improvements. > * Have occurred since the previous release; an issue that was introduced and > fixed between releases should not appear in the Release Notes. > * Have an issue title that makes sense when read on its own. > Adjust any of the above properties to improve the clarity and presentation of > the Release Notes. > Ensure that the JIRA release notes are also included in the release notes of > the documentation (see section "Review and update documentation"). > h4. 
Content of Release Notes field from JIRA tickets > To get the list of "release notes" field values from JIRA, you can run the following > script using the JIRA REST API (note that maxResults limits the number of > entries): > {code:bash} > curl -s > "https://issues.apache.org/jira/rest/api/2/search?maxResults=200&jql=project%20%3D%20FLINK%20AND%20%22Release%20Note%22%20is%20not%20EMPTY%20and%20fixVersion%20%3D%20${RELEASE_VERSION}" > | jq '.issues[]|.key,.fields.summary,.fields.customfield_12310192' | paste - > - - > {code} > {{jq}} is present in most Linux distributions and on macOS it can be installed > via brew. > > > h3. Expectations > * Release Notes in JIRA have been audited and adjusted -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35801) testSwitchFromEnablingToDisablingFileMerging failed in AZP
[ https://issues.apache.org/jira/browse/FLINK-35801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo reassigned FLINK-35801: -- Assignee: Zakelly Lan > testSwitchFromEnablingToDisablingFileMerging failed in AZP > -- > > Key: FLINK-35801 > URL: https://issues.apache.org/jira/browse/FLINK-35801 > Project: Flink > Issue Type: Bug > Components: Build System / CI >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Assignee: Zakelly Lan >Priority: Major > > {code:java} > Jul 09 15:12:51 15:12:51.261 [ERROR] Failures: > Jul 09 15:12:51 15:12:51.261 [ERROR] > SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:216->verifyCheckpointExist:288 > > Jul 09 15:12:51 expected: false > Jul 09 15:12:51 but was: true > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60787&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=9479 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-35801) testSwitchFromEnablingToDisablingFileMerging failed in AZP
[ https://issues.apache.org/jira/browse/FLINK-35801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-35801: --- Priority: Blocker (was: Major) > testSwitchFromEnablingToDisablingFileMerging failed in AZP > -- > > Key: FLINK-35801 > URL: https://issues.apache.org/jira/browse/FLINK-35801 > Project: Flink > Issue Type: Bug > Components: Build System / CI >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Assignee: Zakelly Lan >Priority: Blocker > > {code:java} > Jul 09 15:12:51 15:12:51.261 [ERROR] Failures: > Jul 09 15:12:51 15:12:51.261 [ERROR] > SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:216->verifyCheckpointExist:288 > > Jul 09 15:12:51 expected: false > Jul 09 15:12:51 but was: true > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60787&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=9479 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35624) Release Testing: Verify FLIP-306 Unified File Merging Mechanism for Checkpoints
[ https://issues.apache.org/jira/browse/FLINK-35624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864472#comment-17864472 ] Rui Fan commented on FLINK-35624: - Thanks for the quick feedback! Looking forward to your test result, and I hope it works well now :) > Release Testing: Verify FLIP-306 Unified File Merging Mechanism for > Checkpoints > --- > > Key: FLINK-35624 > URL: https://issues.apache.org/jira/browse/FLINK-35624 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Checkpointing >Reporter: Zakelly Lan >Assignee: Rui Fan >Priority: Blocker > Labels: release-testing > Fix For: 1.20.0 > > Attachments: image-2024-07-07-14-04-47-065.png, > image-2024-07-08-17-05-40-546.png > > > Follow up the test for https://issues.apache.org/jira/browse/FLINK-32070 > > 1.20 is the MVP version for FLIP-306. It is a little bit complex and should > be tested carefully. The main idea of FLIP-306 is to merge checkpoint files > on the TM side, and provide new {{{}StateHandle{}}}s to the JM. There will be a > TM-managed directory under the 'shared' checkpoint directory for each > subtask, and a TM-managed directory under the 'taskowned' checkpoint > directory for each Task Manager. Under these newly introduced directories, the > checkpoint files will be merged into a smaller file set. The following > scenarios need to be tested, including but not limited to: > # With the file merging enabled, periodic checkpoints perform properly, and > the failover, restore and rescale would also work well. > # With file merging switched on and off across jobs, checkpoints and recovery > also work properly. > # There will be no left-over TM-managed directory, especially when there is > no completed checkpoint before the job cancellation. > # File merging takes no effect in (native) savepoints. > Besides the behaviors above, it is better to validate the function of space > amplification control and metrics. 
All the config options can be found under > 'execution.checkpointing.file-merging'. -- This message was sent by Atlassian Jira (v8.20.10#820010)
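The release-testing scenarios above all hinge on toggling file merging on and off. A sketch of a flink-conf.yaml fragment for that toggle; only the 'execution.checkpointing.file-merging' prefix is quoted from the ticket, and the concrete key names below are assumptions to be checked against the 1.20 configuration reference.

```yaml
# Hedged sketch: enable FLIP-306 unified file merging for checkpoints.
# Key names beneath the quoted prefix are assumptions.
execution.checkpointing.file-merging.enabled: true
# Assumed cap on the size of a merged physical file.
execution.checkpointing.file-merging.max-file-size: 32mb
```

Flipping `enabled` to false between job submissions covers the "switch on and off across jobs" scenario from the list above.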
[jira] [Created] (FLINK-35801) testSwitchFromEnablingToDisablingFileMerging failed in AZP
Weijie Guo created FLINK-35801: -- Summary: testSwitchFromEnablingToDisablingFileMerging failed in AZP Key: FLINK-35801 URL: https://issues.apache.org/jira/browse/FLINK-35801 Project: Flink Issue Type: Bug Components: Build System / CI Affects Versions: 1.20.0 Reporter: Weijie Guo -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35624) Release Testing: Verify FLIP-306 Unified File Merging Mechanism for Checkpoints
[ https://issues.apache.org/jira/browse/FLINK-35624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864470#comment-17864470 ] Zakelly Lan commented on FLINK-35624: - [~fanrui] I'm investigating the GitHub Actions failure. In the meantime, we are about to finish the final round of testing. We will post the result here. Thanks > Release Testing: Verify FLIP-306 Unified File Merging Mechanism for > Checkpoints > --- > > Key: FLINK-35624 > URL: https://issues.apache.org/jira/browse/FLINK-35624 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Checkpointing >Reporter: Zakelly Lan >Assignee: Rui Fan >Priority: Blocker > Labels: release-testing > Fix For: 1.20.0 > > Attachments: image-2024-07-07-14-04-47-065.png, > image-2024-07-08-17-05-40-546.png > > > Follow up the test for https://issues.apache.org/jira/browse/FLINK-32070 > > 1.20 is the MVP version for FLIP-306. It is a little bit complex and should > be tested carefully. The main idea of FLIP-306 is to merge checkpoint files > on the TM side, and provide new {{{}StateHandle{}}}s to the JM. There will be a > TM-managed directory under the 'shared' checkpoint directory for each > subtask, and a TM-managed directory under the 'taskowned' checkpoint > directory for each Task Manager. Under these newly introduced directories, the > checkpoint files will be merged into a smaller file set. The following > scenarios need to be tested, including but not limited to: > # With the file merging enabled, periodic checkpoints perform properly, and > the failover, restore and rescale would also work well. > # With file merging switched on and off across jobs, checkpoints and recovery > also work properly. 
> Besides the behaviors above, it is better to validate the function of space > amplification control and metrics. All the config options can be found under > 'execution.checkpointing.file-merging'. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35801) testSwitchFromEnablingToDisablingFileMerging failed in AZP
[ https://issues.apache.org/jira/browse/FLINK-35801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864471#comment-17864471 ] Weijie Guo commented on FLINK-35801: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60781&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=9483 > testSwitchFromEnablingToDisablingFileMerging failed in AZP > -- > > Key: FLINK-35801 > URL: https://issues.apache.org/jira/browse/FLINK-35801 > Project: Flink > Issue Type: Bug > Components: Build System / CI >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Priority: Major > > {code:java} > Jul 09 15:12:51 15:12:51.261 [ERROR] Failures: > Jul 09 15:12:51 15:12:51.261 [ERROR] > SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:216->verifyCheckpointExist:288 > > Jul 09 15:12:51 expected: false > Jul 09 15:12:51 but was: true > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60787&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=9479 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-35801) testSwitchFromEnablingToDisablingFileMerging failed in AZP
[ https://issues.apache.org/jira/browse/FLINK-35801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo updated FLINK-35801: --- Description: {code:java} Jul 09 15:12:51 15:12:51.261 [ERROR] Failures: Jul 09 15:12:51 15:12:51.261 [ERROR] SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:216->verifyCheckpointExist:288 Jul 09 15:12:51 expected: false Jul 09 15:12:51 but was: true {code} https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60787&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=9479 > testSwitchFromEnablingToDisablingFileMerging failed in AZP > -- > > Key: FLINK-35801 > URL: https://issues.apache.org/jira/browse/FLINK-35801 > Project: Flink > Issue Type: Bug > Components: Build System / CI >Affects Versions: 1.20.0 >Reporter: Weijie Guo >Priority: Major > > {code:java} > Jul 09 15:12:51 15:12:51.261 [ERROR] Failures: > Jul 09 15:12:51 15:12:51.261 [ERROR] > SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:216->verifyCheckpointExist:288 > > Jul 09 15:12:51 expected: false > Jul 09 15:12:51 but was: true > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60787&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=9479 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33730) Update the compatibility table to only include last three released versions
[ https://issues.apache.org/jira/browse/FLINK-33730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864467#comment-17864467 ] Rui Fan commented on FLINK-33730: - Merged to master(1.19) via: a2cf8fca7b229dcfd49d0b2c258e12af9f0dda6a and c3e2d163a637dca5f49522721109161bd7ebb723 > Update the compatibility table to only include last three released versions > --- > > Key: FLINK-33730 > URL: https://issues.apache.org/jira/browse/FLINK-33730 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Jing Ge >Assignee: Jing Ge >Priority: Minor > Labels: pull-request-available > > Update the compatibility table > ([apache-flink:./docs/content/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content/docs/ops/upgrading.md#compatibility-table] > and > [apache-flink:./docs/content.zh/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content.zh/docs/ops/upgrading.md#compatibility-table]) > according to the discussion[1]. > > [1] https://lists.apache.org/thread/7yx396x5lmtws0s4t0sf9f2psgny11d6 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33730) Update the compatibility table to only include last three released versions
[ https://issues.apache.org/jira/browse/FLINK-33730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864466#comment-17864466 ] Rui Fan commented on FLINK-33730: - [https://github.com/apache/flink/pull/23859] only updated the English version; I will update the zh doc today. > Update the compatibility table to only include last three released versions > --- > > Key: FLINK-33730 > URL: https://issues.apache.org/jira/browse/FLINK-33730 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Jing Ge >Assignee: Jing Ge >Priority: Minor > Labels: pull-request-available > > Update the compatibility table > ([apache-flink:./docs/content/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content/docs/ops/upgrading.md#compatibility-table] > and > [apache-flink:./docs/content.zh/docs/ops/upgrading.md|https://github.com/apache/flink/blob/master/docs/content.zh/docs/ops/upgrading.md#compatibility-table]) > according to the discussion[1]. > > [1] https://lists.apache.org/thread/7yx396x5lmtws0s4t0sf9f2psgny11d6 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-25217) FLIP-190: Support Version Upgrades for Table API & SQL Programs
[ https://issues.apache.org/jira/browse/FLINK-25217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864464#comment-17864464 ] Shengkai Fang commented on FLINK-25217: --- [~twalthr] hello, I see most of the issues have been closed; does this mean Flink already supports version upgrades in 1.19 or 1.20? > FLIP-190: Support Version Upgrades for Table API & SQL Programs > --- > > Key: FLINK-25217 > URL: https://issues.apache.org/jira/browse/FLINK-25217 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API, Table SQL / Planner >Reporter: Timo Walther >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > > Nowadays, the Table & SQL API is as important to Flink as the DataStream API. > It is one of the main abstractions for expressing pipelines that perform > stateful stream processing. Users expect the same backwards compatibility > guarantees when upgrading to a newer Flink version as with the DataStream API. > In particular, this means: > * once the operator topology is defined, it remains static and does not > change between Flink versions, unless resulting in better performance, > * business logic (defined using expressions and functions in queries) behaves > identically to before the version upgrade, > * the state of a Table & SQL API program can be restored from a savepoint of > a previous version, > * adding or removing stateful operators should be made possible in the > DataStream API. > The same query can remain up and running after upgrades. > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-35624) Release Testing: Verify FLIP-306 Unified File Merging Mechanism for Checkpoints
[ https://issues.apache.org/jira/browse/FLINK-35624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864459#comment-17864459 ] Rui Fan edited comment on FLINK-35624 at 7/10/24 4:00 AM: -- Hi [~zakelly] , I saw FLINK-35778 and FLINK-35784 are resolved. But it seems FLINK-35784 causes the SnapshotFileMergingCompatibilityITCase to fail. Also, did you run flink jobs to test the file merging after these 2 fixes? Please let me know if your test is expected, I will do a final-round test as well, thanks :) It's needed to re-test the file merging after these 2 fixes to ensure the 1.20 can be released smoothly. was (Author: fanrui): Hi [~zakelly] , I saw FLINK-35778 and FLINK-35784 are resolved. But it seems FLINK-35784 causes the SnapshotFileMergingCompatibilityITCase to fail. Also, did you run flink jobs to test the file merging after these 2 fixes? If your test is expected, I will do a final-round test as well. It's needed to re-test the file merging after these 2 fixes to ensure the 1.20 can be released smoothly. > Release Testing: Verify FLIP-306 Unified File Merging Mechanism for > Checkpoints > --- > > Key: FLINK-35624 > URL: https://issues.apache.org/jira/browse/FLINK-35624 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Checkpointing >Reporter: Zakelly Lan >Assignee: Rui Fan >Priority: Blocker > Labels: release-testing > Fix For: 1.20.0 > > Attachments: image-2024-07-07-14-04-47-065.png, > image-2024-07-08-17-05-40-546.png > > > Follow up the test for https://issues.apache.org/jira/browse/FLINK-32070 > > 1.20 is the MVP version for FLIP-306. It is a little bit complex and should > be tested carefully. The main idea of FLIP-306 is to merge checkpoint files > in TM side, and provide new {{{}StateHandle{}}}s to the JM. There will be a > TM-managed directory under the 'shared' checkpoint directory for each > subtask, and a TM-managed directory under the 'taskowned' checkpoint > directory for each Task Manager. 
Under those new introduced directories, the > checkpoint files will be merged into smaller file set. The following > scenarios need to be tested, including but not limited to: > # With the file merging enabled, periodic checkpoints perform properly, and > the failover, restore and rescale would also work well. > # Switch the file merging on and off across jobs, checkpoints and recovery > also work properly. > # There will be no left-over TM-managed directory, especially when there is > no cp complete before the job cancellation. > # File merging takes no effect in (native) savepoints. > Besides the behaviors above, it is better to validate the function of space > amplification control and metrics. All the config options can be found under > 'execution.checkpointing.file-merging'. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35624) Release Testing: Verify FLIP-306 Unified File Merging Mechanism for Checkpoints
[ https://issues.apache.org/jira/browse/FLINK-35624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864459#comment-17864459 ] Rui Fan commented on FLINK-35624: - Hi [~zakelly] , I saw FLINK-35778 and FLINK-35784 are resolved. But it seems FLINK-35784 causes the SnapshotFileMergingCompatibilityITCase to fail. Also, did you run flink jobs to test the file merging after these 2 fixes? If your test is expected, I will do a final-round test as well. It's needed to re-test the file merging after these 2 fixes to ensure the 1.20 can be released smoothly. > Release Testing: Verify FLIP-306 Unified File Merging Mechanism for > Checkpoints > --- > > Key: FLINK-35624 > URL: https://issues.apache.org/jira/browse/FLINK-35624 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Checkpointing >Reporter: Zakelly Lan >Assignee: Rui Fan >Priority: Blocker > Labels: release-testing > Fix For: 1.20.0 > > Attachments: image-2024-07-07-14-04-47-065.png, > image-2024-07-08-17-05-40-546.png > > > Follow up the test for https://issues.apache.org/jira/browse/FLINK-32070 > > 1.20 is the MVP version for FLIP-306. It is a little bit complex and should > be tested carefully. The main idea of FLIP-306 is to merge checkpoint files > in TM side, and provide new {{{}StateHandle{}}}s to the JM. There will be a > TM-managed directory under the 'shared' checkpoint directory for each > subtask, and a TM-managed directory under the 'taskowned' checkpoint > directory for each Task Manager. Under those new introduced directories, the > checkpoint files will be merged into smaller file set. The following > scenarios need to be tested, including but not limited to: > # With the file merging enabled, periodic checkpoints perform properly, and > the failover, restore and rescale would also work well. > # Switch the file merging on and off across jobs, checkpoints and recovery > also work properly. 
> # There will be no left-over TM-managed directory, especially when there is > no cp complete before the job cancellation. > # File merging takes no effect in (native) savepoints. > Besides the behaviors above, it is better to validate the function of space > amplification control and metrics. All the config options can be found under > 'execution.checkpointing.file-merging'. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-35784) The cp file-merging directory not properly registered in SharedStateRegistry
[ https://issues.apache.org/jira/browse/FLINK-35784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864457#comment-17864457 ] Rui Fan edited comment on FLINK-35784 at 7/10/24 3:40 AM: -- https://github.com/apache/flink/actions/runs/9856947616/job/27215481394#step:10:9489 [https://github.com/apache/flink/actions/runs/9857042336/job/27215771351#step:10:9380] SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:214->verifyCheckpointExist:288 Jul 09 13:24:47 expected: false Jul 09 13:24:47 but was: true SnapshotFileMergingCompatibilityITCase fails after this fix is merged into the 1.20 branch. [~zakelly] It fails in 2 GitHub CI runs, so I guess it's easy to reproduce. cc [~Weijie Guo] was (Author: fanrui): [https://github.com/apache/flink/actions/runs/9857042336/job/27215771351#step:10:9380] SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:214->verifyCheckpointExist:288 Jul 09 13:24:47 expected: false Jul 09 13:24:47 but was: true SnapshotFileMergingCompatibilityITCase fails after this fix is merged in 1.20 branch. [~zakelly] cc [~Weijie Guo] > The cp file-merging directory not properly registered in SharedStateRegistry > > > Key: FLINK-35784 > URL: https://issues.apache.org/jira/browse/FLINK-35784 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.20.0 >Reporter: Zakelly Lan >Assignee: Zakelly Lan >Priority: Blocker > Labels: pull-request-available > Fix For: 1.20.0 > > > The {{OperatorSubtaskState}} only makes keyed state register with > {{SharedStateRegistry}}. However, the file-merging directories' handles are > wrapped in {{FileMergingOperatorStreamStateHandle}}, which is an > {{OperatorStreamStateHandle}}. That means {{#registerSharedStates}} is > never called, so the registry will never delete the directories. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
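The registration gap described in the issue above can be sketched as a small model. This is an illustrative sketch with hypothetical names, not Flink's actual classes: registration is dispatched only to keyed state handles, so a shared directory handle that is exposed through the operator-state type is never offered to the registry.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the reported bug (hypothetical names, not Flink's
// real API): only keyed state handles are registered, so a file-merging
// handle wrapped as operator state never has registerSharedStates() called.
interface StreamStateHandle {}

interface KeyedStateHandle extends StreamStateHandle {
    void registerSharedStates(List<String> registry);
}

class OperatorStreamStateHandle implements StreamStateHandle {}

// The file-merging handle carries a directory that must be tracked, but it
// extends the operator-state type, so the dispatch below skips it entirely.
class FileMergingOperatorStreamStateHandle extends OperatorStreamStateHandle {
    final String directory;
    FileMergingOperatorStreamStateHandle(String directory) { this.directory = directory; }
}

public class RegistrationSketch {
    static List<String> register(List<StreamStateHandle> subtaskState) {
        List<String> registry = new ArrayList<>();
        for (StreamStateHandle h : subtaskState) {
            if (h instanceof KeyedStateHandle) { // only keyed state is registered
                ((KeyedStateHandle) h).registerSharedStates(registry);
            }
        }
        return registry;
    }

    public static void main(String[] args) {
        List<StreamStateHandle> state = new ArrayList<>();
        state.add(new FileMergingOperatorStreamStateHandle("/cp/file-merging"));
        // The directory handle is never registered, so the registry can never
        // schedule its deletion: the leak described in the issue.
        System.out.println(register(state).size()); // prints 0
    }
}
```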
[jira] [Commented] (FLINK-35784) The cp file-merging directory not properly registered in SharedStateRegistry
[ https://issues.apache.org/jira/browse/FLINK-35784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864457#comment-17864457 ] Rui Fan commented on FLINK-35784: - [https://github.com/apache/flink/actions/runs/9857042336/job/27215771351#step:10:9380] SnapshotFileMergingCompatibilityITCase.testSwitchFromEnablingToDisablingFileMerging:83->testSwitchingFileMerging:214->verifyCheckpointExist:288 Jul 09 13:24:47 expected: false Jul 09 13:24:47 but was: true SnapshotFileMergingCompatibilityITCase fails after this fix was merged into the 1.20 branch. [~zakelly] cc [~Weijie Guo] > The cp file-merging directory not properly registered in SharedStateRegistry > > > Key: FLINK-35784 > URL: https://issues.apache.org/jira/browse/FLINK-35784 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.20.0 >Reporter: Zakelly Lan >Assignee: Zakelly Lan >Priority: Blocker > Labels: pull-request-available > Fix For: 1.20.0 > > > The {{OperatorSubtaskState}} only registers keyed state with > {{SharedStateRegistry}}. However, the file-merging directories' handles are > wrapped in {{FileMergingOperatorStreamStateHandle}}, which is an > {{OperatorStreamStateHandle}}. That means {{#registerSharedStates}} is > never called, so the registry will never delete the directories. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-35738) Release Testing: Verify FLINK-26050 Too many small sst files in rocksdb state backend when using time window created in ascending order
[ https://issues.apache.org/jira/browse/FLINK-35738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weijie Guo resolved FLINK-35738. Resolution: Done > Release Testing: Verify FLINK-26050 Too many small sst files in rocksdb state > backend when using time window created in ascending order > --- > > Key: FLINK-35738 > URL: https://issues.apache.org/jira/browse/FLINK-35738 > Project: Flink > Issue Type: Sub-task > Components: Runtime / State Backends >Affects Versions: 1.20.0 >Reporter: Rui Fan >Assignee: Samrat Deb >Priority: Major > Labels: release-testing > Fix For: 1.20.0 > > Attachments: Screenshot 2024-07-09 at 11.27.03 PM.png > > > The problem occurs when using RocksDB and specific queries/jobs (please see > the ticket for the detailed description). > To test the solution, run the following query with RocksDB as the state backend: > > {code:java} > INSERT INTO top_5_highest_view_time > SELECT * > FROM ( > SELECT *, > ROW_NUMBER() OVER (PARTITION BY window_start, > window_end ORDER BY view_time DESC) AS rownum > FROM ( > SELECT window_start, > window_end, > product_id, > SUM(view_time) AS view_time, > COUNT(*) AS cnt > FROM TABLE(TUMBLE(TABLE > `shoe_clickstream`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) > GROUP BY window_start, > window_end, > product_id)) > WHERE rownum <= 5;{code} > > With the feature disabled (default), the number of files in the RocksDB working > directory (as well as in the checkpoint) should grow indefinitely. > > With the feature enabled, the number of files should stay constant (as they > should get merged with each other). > To enable the feature, set > {code:java} > state.backend.rocksdb.manual-compaction.min-interval{code} > to 1 minute, for example. 
> > Please consult > [https://github.com/apache/flink/blob/e7d7db3b6f87e53d9bace2a16cf95e5f7a79087a/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/sstmerge/RocksDBManualCompactionOptions.java#L29] > for other options if necessary. -- This message was sent by Atlassian Jira (v8.20.10#820010)
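As a concrete illustration of the steps above, enabling the feature amounts to one entry in the Flink configuration. This is a sketch: the key is the one quoted in the comment above, and "1 min" is the example interval suggested there (Flink's usual duration syntax is assumed).

```yaml
# flink-conf.yaml (sketch): turn on manual SST compaction by setting a
# minimum interval between manual compaction rounds. With this set, small
# SST files in the RocksDB working directory should get merged instead of
# accumulating indefinitely.
state.backend.rocksdb.manual-compaction.min-interval: 1 min
```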
[jira] [Commented] (FLINK-35738) Release Testing: Verify FLINK-26050 Too many small sst files in rocksdb state backend when using time window created in ascending order
[ https://issues.apache.org/jira/browse/FLINK-35738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864455#comment-17864455 ] Weijie Guo commented on FLINK-35738: Thanks [~samrat007] for the careful verification! I'm going to close this then. > Release Testing: Verify FLINK-26050 Too many small sst files in rocksdb state > backend when using time window created in ascending order > --- > > Key: FLINK-35738 > URL: https://issues.apache.org/jira/browse/FLINK-35738 > Project: Flink > Issue Type: Sub-task > Components: Runtime / State Backends >Affects Versions: 1.20.0 >Reporter: Rui Fan >Assignee: Samrat Deb >Priority: Major > Labels: release-testing > Fix For: 1.20.0 > > Attachments: Screenshot 2024-07-09 at 11.27.03 PM.png > > > The problem occurs when using RocksDB and specific queries/jobs (please see > the ticket for the detailed description). > To test the solution, run the following query with RocksDB as the state backend: > > {code:java} > INSERT INTO top_5_highest_view_time > SELECT * > FROM ( > SELECT *, > ROW_NUMBER() OVER (PARTITION BY window_start, > window_end ORDER BY view_time DESC) AS rownum > FROM ( > SELECT window_start, > window_end, > product_id, > SUM(view_time) AS view_time, > COUNT(*) AS cnt > FROM TABLE(TUMBLE(TABLE > `shoe_clickstream`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES)) > GROUP BY window_start, > window_end, > product_id)) > WHERE rownum <= 5;{code} > > With the feature disabled (default), the number of files in the RocksDB working > directory (as well as in the checkpoint) should grow indefinitely. > > With the feature enabled, the number of files should stay constant (as they > should get merged with each other). > To enable the feature, set > {code:java} > state.backend.rocksdb.manual-compaction.min-interval{code} > to 1 minute, for example. 
> > Please consult > [https://github.com/apache/flink/blob/e7d7db3b6f87e53d9bace2a16cf95e5f7a79087a/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/sstmerge/RocksDBManualCompactionOptions.java#L29] > for other options if necessary. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-35793) BatchSQLTest.testBatchSQL failed in hybrid shuffle mode
[ https://issues.apache.org/jira/browse/FLINK-35793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864453#comment-17864453 ] Weijie Guo edited comment on FLINK-35793 at 7/10/24 3:30 AM: - Removed the hybrid shuffle ("hs") type via bbbaa7ac303421cb420b1bfc6beb3b79588ce042. I shall not close this, as we still have to fix it later. was (Author: weijie guo): remove hs type via bbbaa7ac303421cb420b1bfc6beb3b79588ce042. > BatchSQLTest.testBatchSQL failed in hybrid shuffle mode > --- > > Key: FLINK-35793 > URL: https://issues.apache.org/jira/browse/FLINK-35793 > Project: Flink > Issue Type: Bug > Components: Build System / CI >Affects Versions: 2.0.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > Labels: pull-request-available > > {code:java} > Jul 05 17:38:34 Caused by: java.lang.IllegalStateException: Result partition > is already released. > Jul 05 17:38:34 at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > Jul 05 17:38:34 at > org.apache.flink.runtime.io.network.partition.hybrid.HsResultPartition.finish(HsResultPartition.java:264) > Jul 05 17:38:34 at > org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:778) > Jul 05 17:38:34 at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > Jul 05 17:38:34 at java.lang.Thread.run(Thread.java:750) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60701&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=0f3adb59-eefa-51c6-2858-3654d9e0749d&l=13956 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35793) BatchSQLTest.testBatchSQL failed in hybrid shuffle mode
[ https://issues.apache.org/jira/browse/FLINK-35793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864453#comment-17864453 ] Weijie Guo commented on FLINK-35793: Removed the hybrid shuffle ("hs") type via bbbaa7ac303421cb420b1bfc6beb3b79588ce042. > BatchSQLTest.testBatchSQL failed in hybrid shuffle mode > --- > > Key: FLINK-35793 > URL: https://issues.apache.org/jira/browse/FLINK-35793 > Project: Flink > Issue Type: Bug > Components: Build System / CI >Affects Versions: 2.0.0 >Reporter: Weijie Guo >Assignee: Weijie Guo >Priority: Major > Labels: pull-request-available > > {code:java} > Jul 05 17:38:34 Caused by: java.lang.IllegalStateException: Result partition > is already released. > Jul 05 17:38:34 at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > Jul 05 17:38:34 at > org.apache.flink.runtime.io.network.partition.hybrid.HsResultPartition.finish(HsResultPartition.java:264) > Jul 05 17:38:34 at > org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:778) > Jul 05 17:38:34 at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > Jul 05 17:38:34 at java.lang.Thread.run(Thread.java:750) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60701&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=0f3adb59-eefa-51c6-2858-3654d9e0749d&l=13956 -- This message was sent by Atlassian Jira (v8.20.10#820010)
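The stack trace in the issue above comes from a lifecycle precondition guard: `finish()` is called on a result partition that has already been released. A minimal sketch of that guard pattern follows; the class name is hypothetical, not the actual `HsResultPartition` implementation.

```java
// Minimal sketch of the lifecycle guard behind the failure above
// (hypothetical class; not the real HsResultPartition).
public class PartitionLifecycleSketch {
    private boolean released = false;

    public void release() {
        released = true;
    }

    public void finish() {
        // Mirrors the Preconditions.checkState(...) idiom in the stack trace:
        // finishing an already-released partition is an illegal state transition.
        if (released) {
            throw new IllegalStateException("Result partition is already released.");
        }
    }

    public static void main(String[] args) {
        PartitionLifecycleSketch p = new PartitionLifecycleSketch();
        p.release();
        try {
            p.finish(); // a release/finish race hits exactly this path
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints: Result partition is already released.
        }
    }
}
```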
Re: [PR] [FLINK-35793][tests] Remove hybrid shuffle types from BatchSQLTest temporarily [flink]
reswqa merged PR #25060: URL: https://github.com/apache/flink/pull/25060 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-34111][table] Add support for json_quote, json_unquote, address PR feedback #24156 [flink]
anupamaggarwal commented on code in PR #24967: URL: https://github.com/apache/flink/pull/24967#discussion_r1671531397 ## flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/expressions/ScalarFunctionsTest.scala: ## @@ -604,6 +604,49 @@ class ScalarFunctionsTest extends ScalarTypesTestBase { "foothebar") } + @Test + def testJsonQuote(): Unit = { +testSqlApi("JSON_QUOTE('null')", "\"null\"") +testSqlApi("JSON_QUOTE('\"null\"')", "\"\\\"null\\\"\"") +testSqlApi("JSON_QUOTE('[1,2,3]')", "\"[1,2,3]\"") +testSqlApi("JSON_UNQUOTE(JSON_QUOTE('[1,2,3]'))", "[1,2,3]") +testSqlApi( + "JSON_QUOTE('This is a \\t test \\n with special characters: \" \\ \\b \\f \\r \\u0041')", + "\"This is a t test n with special characters: \\\" b f r u0041\"" +) +testSqlApi( + "JSON_QUOTE('\"special\": \"\\b\\f\\r\"')", + "\"\\\"special\\\": \\\"bfr\\\"\"") +testSqlApi( + "JSON_QUOTE('skipping backslash \')", + "\"skipping backslash \"" +) +testSqlApi( + "JSON_QUOTE('this will be quoted ≠')", + "\"this will be quoted \\u2260\"" +) + } + + @Test + def testJsonUnquote(): Unit = { Review Comment: thanks @fhueske for your review, will fix this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-34111][table] Add support for json_quote, json_unquote, address PR feedback #24156 [flink]
anupamaggarwal commented on PR #24967: URL: https://github.com/apache/flink/pull/24967#issuecomment-2219410170 thanks @snuyanzin for your review! I think both these cases are related to the inconsistent behavior (IIUC) of the `IS JSON` function, which is used in the json_unquote implementation for checking JSON validity. I had observed something similar earlier (see the `Open questions` section of the PR description). ``` SELECT JSON_UNQUOTE('"1""1"'); ``` As you pointed out, this should ideally throw an exception (as MySQL does); however, IS JSON returns true in this case (IIUC this should be a bug, since `1""1` is not valid JSON): ``` select '"1""1"' is json; //true ``` If we fix this by throwing an exception (preferable), it could be inconsistent with what IS JSON considers valid JSON. Should we care about compatibility with the IS JSON behavior here, or just fix this and file a JIRA for IS JSON? The same is the case with ``` SELECT JSON_UNQUOTE('"1""\u"') select '"1""\u"' is JSON; //true ``` However, in this case we throw an exception (from json_unquote), since \u is not a valid literal. For ``` SELECT JSON_UNQUOTE('"1\u"') ``` IS JSON returns FALSE and we return the input string back per the documentation. >SELECT JSON_UNQUOTE('"\"\ufffa"'); >it starts printing backslash as well which is NOT OK Hmm, I tested this through sql-client.sh, and I get `" ` which seems expected? (as "\"\ufffa" is valid JSON) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-34111][table] Add support for json_quote, json_unquote, address PR feedback #24156 [flink]
anupamaggarwal commented on code in PR #24967: URL: https://github.com/apache/flink/pull/24967#discussion_r1671499749 ## docs/data/sql_functions.yml: ## @@ -377,6 +377,12 @@ string: - sql: SUBSTR(string, integer1[, integer2]) table: STRING.substr(INTEGER1[, INTEGER2]) description: Returns a substring of string starting from position integer1 with length integer2 (to the end by default). + - sql: JSON_QUOTE(string) Review Comment: Created JIRA https://issues.apache.org/jira/browse/FLINK-35800 (and added a comment to original JIRA https://issues.apache.org/jira/browse/FLINK-34111) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-35800) Update Chinese documentation for json_(un)quote functions
Anupam Aggarwal created FLINK-35800: --- Summary: Update Chinese documentation for json_(un)quote functions Key: FLINK-35800 URL: https://issues.apache.org/jira/browse/FLINK-35800 Project: Flink Issue Type: Improvement Components: chinese-translation Reporter: Anupam Aggarwal Update the Chinese documentation corresponding to the `json_quote` and `json_unquote` functions, as per the instructions in [https://github.com/apache/flink/blob/master/docs/data/sql_functions_zh.yml] The changes are in PR [https://github.com/apache/flink/pull/24967] [https://github.com/apache/flink/blob/4267018323dc3bfa1d65ee9fcb49b024e03d/docs/data/sql_functions.yml#L380] Changes would be needed in the file [https://github.com/apache/flink/blob/master/docs/data/sql_functions_zh.yml] Jira for the functions - https://issues.apache.org/jira/browse/FLINK-34111 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34111) Add JSON_QUOTE and JSON_UNQUOTE function
[ https://issues.apache.org/jira/browse/FLINK-34111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864437#comment-17864437 ] Anupam Aggarwal commented on FLINK-34111: - Adding link to Jira for chinese translation https://issues.apache.org/jira/browse/FLINK-35800 > Add JSON_QUOTE and JSON_UNQUOTE function > > > Key: FLINK-34111 > URL: https://issues.apache.org/jira/browse/FLINK-34111 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Martijn Visser >Assignee: Jeyhun Karimov >Priority: Major > Labels: pull-request-available > > Escapes or unescapes a JSON string removing traces of offending characters > that could prevent parsing. > Proposal: > - JSON_QUOTE: Quotes a string by wrapping it with double quote characters and > escaping interior quote and other characters, then returning the result as a > utf8mb4 string. Returns NULL if the argument is NULL. > - JSON_UNQUOTE: Unquotes value and returns the result as a string. Returns > NULL if the argument is NULL. An error occurs if the value starts and ends > with double quotes but is not a valid JSON string literal. > The following characters are reserved in JSON and must be properly escaped to > be used in strings: > Backspace is replaced with \b > Form feed is replaced with \f > Newline is replaced with \n > Carriage return is replaced with \r > Tab is replaced with \t > Double quote is replaced with \" > Backslash is replaced with \\ > This function exists in MySQL: > - > https://dev.mysql.com/doc/refman/8.0/en/json-creation-functions.html#function_json-quote > - > https://dev.mysql.com/doc/refman/8.0/en/json-modification-functions.html#function_json-unquote > It's still open in Calcite CALCITE-3130 -- This message was sent by Atlassian Jira (v8.20.10#820010)
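The escape table in the proposal above can be sketched directly. This is an illustrative implementation of the listed replacement rules only, not the actual Flink built-in or MySQL's code; the class and method names are hypothetical.

```java
public class JsonQuoteSketch {
    // Applies exactly the escape table from the FLINK-34111 proposal:
    // \b, \f, \n, \r, \t, \" and \\ are escaped, everything else passes
    // through, and the result is wrapped in double quotes.
    static String jsonQuote(String s) {
        if (s == null) {
            return null; // JSON_QUOTE returns NULL for a NULL argument
        }
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\b': sb.append("\\b"); break;  // backspace
                case '\f': sb.append("\\f"); break;  // form feed
                case '\n': sb.append("\\n"); break;  // newline
                case '\r': sb.append("\\r"); break;  // carriage return
                case '\t': sb.append("\\t"); break;  // tab
                case '"':  sb.append("\\\""); break; // double quote
                case '\\': sb.append("\\\\"); break; // backslash
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println(jsonQuote("null"));      // prints: "null"
        System.out.println(jsonQuote("a\"b"));      // prints: "a\"b"
        System.out.println(jsonQuote("tab\there")); // prints: "tab\there"
    }
}
```

Unquoting would be the inverse walk over the escape table, with the additional validity check (a string that starts and ends with double quotes must be a valid JSON string literal) that the discussion above revolves around.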
Re: [PR] [hotfix][checkpoint] Rename file-merging options in documents [flink]
fredia merged PR #25048: URL: https://github.com/apache/flink/pull/25048 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-35791][kafka] add database and table info of canal/debezium json format for kafka sink. [flink-cdc]
lvyanquan commented on code in PR #3461: URL: https://github.com/apache/flink-cdc/pull/3461#discussion_r1671446357 ## flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-kafka/src/main/java/org/apache/flink/cdc/connectors/kafka/json/debezium/DebeziumJsonSerializationSchema.java: ## @@ -185,14 +190,21 @@ public byte[] serialize(Event event) { } } +/** + * Refer to https://debezium.io/documentation/reference/1.9/connectors/mysql.html Review Comment: Addressed it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org