[jira] [Comment Edited] (FLINK-31810) RocksDBException: Bad table magic number on checkpoint rescale

2023-05-14 Thread David Artiga (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722670#comment-17722670
 ] 

David Artiga edited comment on FLINK-31810 at 5/15/23 6:46 AM:
---

RocksDB files are stored on the local filesystem of the DataProc nodes; 
checkpoints/savepoints are then written to a Google Cloud Storage bucket.

We recently migrated from 1.14 (we had not seen this issue previously).


was (Author: david.artiga):
RocksDB files are stored in local filesystem on DataProc nodes, then 
checkpoints/savepoints are written into Google bucket.

We migrated recently from 1.14.

> RocksDBException: Bad table magic number on checkpoint rescale
> --
>
> Key: FLINK-31810
> URL: https://issues.apache.org/jira/browse/FLINK-31810
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.15.2
>Reporter: Robert Metzger
>Priority: Major
>
> While rescaling a job from checkpoint, I ran into this exception:
> {code:java}
> SinkMaterializer[7] -> rob-result[7]: Writer -> rob-result[7]: Committer 
> (4/4)#3 (c1b348f7eed6e1ce0e41ef75338ae754) switched from INITIALIZING to 
> FAILED with failure cause: java.lang.Exception: Exception while creating 
> StreamOperatorStateContext.
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:255)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265)
>   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:703)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:679)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:646)
>   at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
>   at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:917)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
>   at java.base/java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.flink.util.FlinkException: Could not restore keyed 
> state backend for 
> SinkUpsertMaterializer_7d9b7588bc2ff89baed50d7a4558caa4_(4/4) from any of the 
> 1 provided restore options.
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:160)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:346)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:164)
>   ... 11 more
> Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught 
> unexpected exception.
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:395)
>   at 
> org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend.createKeyedStateBackend(EmbeddedRocksDBStateBackend.java:483)
>   at 
> org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend.createKeyedStateBackend(EmbeddedRocksDBStateBackend.java:97)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:329)
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>   ... 13 more
> Caused by: java.io.IOException: Error while opening RocksDB instance.
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBOperationUtils.openDB(RocksDBOperationUtils.java:92)
>   at 
> org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreDBInstanceFromStateHandle(RocksDBIncrementalRestoreOperation.java:465)
>   at 
> org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreWithRescaling(RocksDBIncrementalRestoreOperation.java:321)
>   at 
> org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restore(RocksDBIncrementalRestoreOperatio

[jira] [Comment Edited] (FLINK-31810) RocksDBException: Bad table magic number on checkpoint rescale

2023-05-14 Thread David Artiga (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722670#comment-17722670
 ] 

David Artiga edited comment on FLINK-31810 at 5/15/23 6:46 AM:
---

RocksDB files are stored in local filesystem on DataProc nodes, then 
checkpoints/savepoints are written into Google bucket.

We migrated recently from 1.14.


was (Author: david.artiga):
RocksDB files are stored in local filesystem on DataProc nodes, then 
checkpoints/savepoints are written into Google bucket.


[jira] [Commented] (FLINK-31810) RocksDBException: Bad table magic number on checkpoint rescale

2023-05-14 Thread David Artiga (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722670#comment-17722670
 ] 

David Artiga commented on FLINK-31810:
--

RocksDB files are stored in local filesystem on DataProc nodes, then 
checkpoints/savepoints are written into Google bucket.

> Caused by: org.rocksdb.RocksDBException: Bad table magic number: expected 
> 9863518390377041911, found 4096 in 
> /tmp/job_0
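
For context on what RocksDB is checking here (a sketch only, not Flink or RocksDB code): block-based SST files end with a footer whose final 8 bytes hold a little-endian magic number, and "found 4096" suggests the tail of the restored file was zeroed or truncated rather than containing real SST data. Assuming that footer layout, a minimal validity probe looks like:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Sketch: probe the footer magic of a candidate SST file (assumed layout). */
public class SstFooterProbe {
    // The expected value from the error message above. It does not fit in a
    // signed long literal, so parse it as an unsigned decimal string.
    public static final long EXPECTED_MAGIC =
            Long.parseUnsignedLong("9863518390377041911");

    /** Read the last 8 bytes of the file as a little-endian 64-bit value. */
    public static long footerMagic(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            file.seek(file.length() - 8);
            // readLong() is big-endian; the footer stores the magic little-endian.
            return Long.reverseBytes(file.readLong());
        }
    }

    public static boolean looksLikeSst(String path) throws IOException {
        return footerMagic(path) == EXPECTED_MAGIC;
    }
}
```

A file reporting "found 4096" would return 4096 from footerMagic, i.e. its last eight bytes are 00 10 00 00 00 00 00 00, which is consistent with a partially written or zero-padded copy from checkpoint storage.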

[jira] [Comment Edited] (FLINK-31967) SQL with LAG function NullPointerException

2023-05-14 Thread Haojin Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722665#comment-17722665
 ] 

Haojin Wang edited comment on FLINK-31967 at 5/15/23 6:26 AM:
--

[~padavan] This problem is caused by Java's primitive data types, which cannot 
represent NULL; I will propose a PR to fix it.


was (Author: JIRAUSER300404):
This problem is caused by the basic data type of Java, and I will propose a PR 
to solve this problem in the future
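
To illustrate the failure mode (plain Java, not Flink's actual LagAggFunction code): LAG has no previous row for the first record of a key, so the accumulator field is SQL NULL, and reading it through a primitive-typed accessor, as the GenericRowData.getInt frame in the quoted stack trace does, unboxes null:

```java
/** Sketch of the NULL-vs-primitive mismatch behind the stack trace below. */
public class LagNullSketch {
    // Boxed field: can represent SQL NULL (the state before any previous row).
    static Integer previousCount = null;

    // A primitive return type forces auto-unboxing; with a NULL value this
    // throws NullPointerException, as GenericRowData.getInt does in the trace.
    static int getPreviousCount() {
        return previousCount;
    }

    public static void main(String[] args) {
        try {
            getPreviousCount();
        } catch (NullPointerException e) {
            System.out.println("NPE: a primitive int cannot hold SQL NULL");
        }
    }
}
```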

> SQL with LAG function NullPointerException
> --
>
> Key: FLINK-31967
> URL: https://issues.apache.org/jira/browse/FLINK-31967
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: padavan
>Priority: Major
> Attachments: image-2023-04-28-14-46-19-736.png, 
> image-2023-04-28-15-06-48-184.png, image-2023-04-28-15-14-58-788.png, 
> image-2023-04-28-15-17-49-144.png, image-2023-04-28-17-06-20-737.png, 
> simpleFlinkKafkaLag.zip
>
>
> I want to run a query with the LAG function, but I got a JobException without 
> any explanation.
>  
> *Code:*
> {code:java}
> private static void t1_LeadLag(DataStream ds, StreamExecutionEnvironment env) {
>     StreamTableEnvironment te = StreamTableEnvironment.create(env);
>     Table t = te.fromDataStream(ds,
>             Schema.newBuilder().columnByExpression("proctime", "proctime()").build());
>     te.createTemporaryView("users", t);
>     Table res = te.sqlQuery("SELECT userId, `count`,\n" +
>             " LAG(`count`) OVER (PARTITION BY userId ORDER BY proctime) AS prev_quantity\n" +
>             " FROM users");
>     te.toChangelogStream(res).print();
> }{code}
>  
> *Input:*
> {"userId":3,"count":0,"dt":"2023-04-28T07:44:21.551Z"}
>  
> *Exception:* I removed the generic JobExecutionException part and kept the 
> important portion (I think):
> {code:java}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.flink.table.data.GenericRowData.getInt(GenericRowData.java:149)
> at 
> org.apache.flink.table.data.RowData.lambda$createFieldGetter$245ca7d1$6(RowData.java:245)
> at 
> org$apache$flink$table$runtime$functions$aggregate$LagAggFunction$LagAcc$2$Converter.toExternal(Unknown
>  Source)
> at 
> org.apache.flink.table.data.conversion.StructuredObjectConverter.toExternal(StructuredObjectConverter.java:101)
> at UnboundedOverAggregateHelper$15.setAccumulators(Unknown Source)
> at 
> org.apache.flink.table.runtime.operators.over.ProcTimeUnboundedPrecedingFunction.processElement(ProcTimeUnboundedPrecedingFunction.java:92)
> at 
> org.apache.flink.table.runtime.operators.over.ProcTimeUnboundedPrecedingFunction.processElement(ProcTimeUnboundedPrecedingFunction.java:42)
> at 
> org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
> at 
> org.apache.flink.streaming.runtime.io.RecordProcessorUtils.lambda$getRecordProcessor$0(RecordProcessorUtils.java:60)
> at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:237)
> at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:146)
> at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
> at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
> at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
> at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
> at java.base/java.lang.Thread.run(Thread.java:829){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] clownxc closed pull request #22555: [FLINK-31706] [runtime] The default source parallelism should be the same as ex…

2023-05-14 Thread via GitHub


clownxc closed pull request #22555: [FLINK-31706] [runtime] The default source 
parallelism should be the same as ex…
URL: https://github.com/apache/flink/pull/22555


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-31967) SQL with LAG function NullPointerException

2023-05-14 Thread Haojin Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722665#comment-17722665
 ] 

Haojin Wang commented on FLINK-31967:
-

This problem is caused by Java's primitive data types, which cannot represent 
NULL; I will propose a PR to fix it.






[GitHub] [flink] luoyuxia merged pull request #22577: [FLINK-31882][sql-client] Fix sql client can't show result for DELETE…

2023-05-14 Thread via GitHub


luoyuxia merged PR #22577:
URL: https://github.com/apache/flink/pull/22577





[jira] [Commented] (FLINK-31810) RocksDBException: Bad table magic number on checkpoint rescale

2023-05-14 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722656#comment-17722656
 ] 

Yun Tang commented on FLINK-31810:
--

[~david.artiga] what distributed-file-system did you use to store the 
checkpoint?

> Caused by: org.rocksdb.RocksDBException: Bad table magic number: expected 
> 9863518390377041911, found 4096 in 
> /tmp/job__op_SinkUpsertMaterializer_7d9b

[GitHub] [flink] fsk119 commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-05-14 Thread via GitHub


fsk119 commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1547203381

   I asked ChatGPT about the idea. The answer is as follows:
   
   > Q: Is it a good idea to put the current catalog or database in the REST API headers?
   >
   > A: It can be a good idea to put the current catalog or database in the REST 
   > API headers, as it can simplify API calls and enhance security.
   >
   > By sending this information as headers, it eliminates the need to provide 
   > the catalog and database name in each API request. This can simplify API 
   > calls and reduce network load. Furthermore, including catalog and database 
   > names in headers can enhance the security of the API, as this information is 
   > easier to protect and limit access to.
   >
   > However, it is important to ensure that the information included in the 
   > headers is correct and that only authorized users can access it. 
   > Additionally, headers must be properly parsed and validated to avoid 
   > potential security vulnerabilities and data leakage.
   >
   > In summary, putting the current catalog or database in the REST API headers 
   > can be a good idea, but it is important to consider security and correctness.
   
   > Q: Does any system put the current catalog or database in the REST API headers?
   > A: Yes, there are many systems that put the current catalog or database in 
the REST API headers. The exact 
   > implementation can vary among different systems, but here are some 
examples:
   > 
   > MySQL: The current database can be specified in the X-Original-Database 
header.
   > 
   > PostgreSQL: The current schema can be specified in the currentSchema 
header.
   > 
   > Oracle: The current schema can be specified in the currentSchema header.
   > 
   > SQL Server: The current database can be specified in the Initial Catalog 
header.
   > 
   > MongoDB: The current database can be specified in the database header.
   > 
   > These headers are typically used to specify the catalog or database for a 
single API request. They can also be used to set the default catalog or 
database for subsequent API requests.
   > 
   > Overall, including the current catalog or database in REST API headers is 
a common practice, and many database systems support this functionality.
   
   
   
   





[jira] [Commented] (FLINK-32094) startScheduling.BATCH performance regression since May 11th

2023-05-14 Thread Zhilong Hong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722646#comment-17722646
 ] 

Zhilong Hong commented on FLINK-32094:
--

The extra cost may be caused by 
{{{}TieredStorageIdMappingUtils#convertId(ResultPartitionID 
resultPartitionId){}}}.

!image-2023-05-15-12-33-56-319.png|height=75%,width=75%!

> startScheduling.BATCH performance regression since May 11th
> ---
>
> Key: FLINK-32094
> URL: https://issues.apache.org/jira/browse/FLINK-32094
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Martijn Visser
>Assignee: Yuxin Tan
>Priority: Blocker
> Attachments: image-2023-05-14-22-58-00-886.png, 
> image-2023-05-15-12-33-56-319.png
>
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32094) startScheduling.BATCH performance regression since May 11th

2023-05-14 Thread Zhilong Hong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhilong Hong updated FLINK-32094:
-
Attachment: image-2023-05-15-12-33-56-319.png

> startScheduling.BATCH performance regression since May 11th
> ---
>
> Key: FLINK-32094
> URL: https://issues.apache.org/jira/browse/FLINK-32094
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Martijn Visser
>Assignee: Yuxin Tan
>Priority: Blocker
> Attachments: image-2023-05-14-22-58-00-886.png, 
> image-2023-05-15-12-33-56-319.png
>
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200





[jira] [Comment Edited] (FLINK-24696) Translate how to configure unaligned checkpoints into Chinese

2023-05-14 Thread lijie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722634#comment-17722634
 ] 

lijie edited comment on FLINK-24696 at 5/15/23 4:28 AM:


Hi Piotr Nowojski,

I like Flink, and I learned how to contribute to the community from WuChong; 
this is my first time commenting. I saw that ZhouYu Chen showed interest in 
this issue a long time ago, but it was never closed, so I guess he did not 
work on it. Can I work on this issue and gradually start joining the 
community? Could you assign it to me so that I can translate this page into 
Chinese?

Thank you.


was (Author: JIRAUSER300402):
Hi Piotr Nowojski,

I like Flink, and I learned how to contribute to the community from WuChong; 
this is my first time commenting. I saw that ZhouYu Chen showed interest in 
this issue a long time ago, but it was never closed, so I guess he did not 
work on it. Can I work on this issue and gradually start joining the 
community?

Thank you.

> Translate how to configure unaligned checkpoints into Chinese
> -
>
> Key: FLINK-24696
> URL: https://issues.apache.org/jira/browse/FLINK-24696
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation, Documentation
>Affects Versions: 1.14.1, 1.15.0
>Reporter: Piotr Nowojski
>Priority: Not a Priority
>
> As part of FLINK-24695 
> {{docs/content/docs/ops/state/checkpointing_under_backpressure.md}} and 
> {{docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md}} were 
> modified. Those modifications should be translated into Chinese





[GitHub] [flink] fsk119 commented on pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-05-14 Thread via GitHub


fsk119 commented on PR #17830:
URL: https://github.com/apache/flink/pull/17830#issuecomment-1547174183

   The Presto response headers contain the catalog/schema info [1], e.g. 
`X-Presto-Set-Catalog` and `X-Presto-Catalog`. 
   
   [1] 
https://prestodb.io/docs/current/develop/client-protocol.html#client-response-headers
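   A minimal standalone sketch (not Flink or Presto code) of the idea: the 
current catalog and schema ride along as plain request headers, in the spirit 
of Presto's `X-Presto-Catalog` header. The endpoint URI and header values 
below are invented for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/**
 * Sketch: carrying the current catalog/schema in REST request headers.
 * The URI and header values are hypothetical; only the mechanism matters.
 */
public class CatalogHeaderSketch {
    public static void main(String[] args) {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/v1/statement"))
                .header("X-Presto-Catalog", "hive")   // current catalog for this request
                .header("X-Presto-Schema", "default") // current schema for this request
                .POST(HttpRequest.BodyPublishers.ofString("SELECT 1"))
                .build();

        // The headers travel with the request; the server would resolve
        // unqualified table names against them.
        System.out.println(request.headers().firstValue("X-Presto-Catalog").orElse(""));
    }
}
```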
   





[jira] [Created] (FLINK-32095) HiveDialectITCase crashed with exit code 239

2023-05-14 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-32095:
--

 Summary: HiveDialectITCase crashed with exit code 239
 Key: FLINK-32095
 URL: https://issues.apache.org/jira/browse/FLINK-32095
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.17.1
Reporter: Weijie Guo


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48957&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=22740

May 13 02:10:09 [ERROR] Crashed tests:
May 13 02:10:09 [ERROR] org.apache.flink.connectors.hive.HiveDialectITCase
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:479)
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:322)
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:932)
May 13 02:10:09 [ERROR] at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
May 13 02:10:09 [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
May 13 02:10:09 [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
May 13 02:10:09 [ERROR] at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
May 13 02:10:09 [ERROR] at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
May 13 02:10:09 [ERROR] at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
May 13 02:10:09 [ERROR] at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
May 13 02:10:09 [ERROR] at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
May 13 02:10:09 [ERROR] at 
org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
May 13 02:10:09 [ERROR] at 
org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
May 13 02:10:09 [ERROR] at 
org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
May 13 02:10:09 [ERROR] at 
org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
May 13 02:10:09 [ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
May 13 02:10:09 [ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
May 13 02:10:09 [ERROR] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
May 13 02:10:09 [ERROR] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
May 13 02:10:09 [ERROR] at java.lang.reflect.Method.invoke(Method.java:498)
May 13 02:10:09 [ERROR] at 
org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
May 13 02:10:09 [ERROR] at 
org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
May 13 02:10:09 [ERROR] at 
org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
May 13 02:10:09 [ERROR] at 
org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
May 13 02:10:09 [ERROR] Caused by: 
org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM 
terminated without properly saying goodbye. VM crash or System.exit called?
May 13 02:10:09 [ERROR] Command was /bin/sh -c cd 
/__w/1/s/flink-connectors/flink-connector-hive && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -XX:+UseG1GC -Xms256m -Xmx1536m 
-jar 
/__w/1/s/flink-connectors/flink-connector-hive/target/surefire/surefirebooter2973058874035532114.jar
 /__w/1/s/flink-connectors/flink-connector-hive/target/surefire 
2023-05-13T01-46-05_580-jvmRun3 surefire1860158651016882706tmp 
surefire_277931085391834517755tmp






[jira] (FLINK-32065) Got NoSuchFileException when initialize source function.

2023-05-14 Thread Wencong Liu (Jira)


[ https://issues.apache.org/jira/browse/FLINK-32065 ]


Wencong Liu deleted comment on FLINK-32065:
-

was (Author: JIRAUSER281639):
Hello [~SpongebobZ], do you clean the tmp dir before the error happens? It 
seems that the file was deleted when it needed to be read.

> Got NoSuchFileException when initialize source function.
> 
>
> Key: FLINK-32065
> URL: https://issues.apache.org/jira/browse/FLINK-32065
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.14.4
>Reporter: Spongebob
>Priority: Major
> Attachments: image-2023-05-12-14-07-45-771.png, 
> image-2023-05-12-14-26-46-268.png, image-2023-05-12-17-37-09-002.png
>
>
> When I submit an application to a Flink standalone cluster, I get a 
> NoSuchFileException. I think it failed to create the tmp channel file, but 
> I am confused about the reason in this case.
> I found that the sub-directory `flink-netty-shuffle-xxx` did not exist, so 
> is this directory only used during that step of the application?
> BTW, this issue happens only occasionally.
> !image-2023-05-12-14-07-45-771.png!





[jira] [Commented] (FLINK-32065) Got NoSuchFileException when initialize source function.

2023-05-14 Thread Wencong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722638#comment-17722638
 ] 

Wencong Liu commented on FLINK-32065:
-

Hello [~SpongebobZ], do you clean the tmp dir before the error happens? It 
seems that the file was deleted when it needed to be read.

> Got NoSuchFileException when initialize source function.
> 
>
> Key: FLINK-32065
> URL: https://issues.apache.org/jira/browse/FLINK-32065
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.14.4
>Reporter: Spongebob
>Priority: Major
> Attachments: image-2023-05-12-14-07-45-771.png, 
> image-2023-05-12-14-26-46-268.png, image-2023-05-12-17-37-09-002.png
>
>
> When I submit an application to a Flink standalone cluster, I get a 
> NoSuchFileException. I think it failed to create the tmp channel file, but 
> I am confused about the reason in this case.
> I found that the sub-directory `flink-netty-shuffle-xxx` did not exist, so 
> is this directory only used during that step of the application?
> BTW, this issue happens only occasionally.
> !image-2023-05-12-14-07-45-771.png!





[jira] [Commented] (FLINK-30629) ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat is unstable

2023-05-14 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722639#comment-17722639
 ] 

Weijie Guo commented on FLINK-30629:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48972&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef&l=9703

> ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat is unstable
> -
>
> Key: FLINK-30629
> URL: https://issues.apache.org/jira/browse/FLINK-30629
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Xintong Song
>Assignee: Weijie Guo
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.17.0
>
> Attachments: ClientHeartbeatTestLog.txt
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44690&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef&l=10819
> {code:java}
> Jan 11 04:32:39 [ERROR] Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 21.02 s <<< FAILURE! - in 
> org.apache.flink.client.ClientHeartbeatTest
> Jan 11 04:32:39 [ERROR] 
> org.apache.flink.client.ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat
>   Time elapsed: 9.157 s  <<< ERROR!
> Jan 11 04:32:39 java.lang.IllegalStateException: MiniCluster is not yet 
> running or has already been shut down.
> Jan 11 04:32:39   at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:193)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.getDispatcherGatewayFuture(MiniCluster.java:1044)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.runDispatcherCommand(MiniCluster.java:917)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniCluster.getJobStatus(MiniCluster.java:841)
> Jan 11 04:32:39   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.getJobStatus(MiniClusterJobClient.java:91)
> Jan 11 04:32:39   at 
> org.apache.flink.client.ClientHeartbeatTest.testJobRunningIfClientReportHeartbeat(ClientHeartbeatTest.java:79)
> {code}





[jira] [Assigned] (FLINK-32094) startScheduling.BATCH performance regression since May 11th

2023-05-14 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo reassigned FLINK-32094:
--

Assignee: Yuxin Tan

> startScheduling.BATCH performance regression since May 11th
> ---
>
> Key: FLINK-32094
> URL: https://issues.apache.org/jira/browse/FLINK-32094
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Martijn Visser
>Assignee: Yuxin Tan
>Priority: Blocker
> Attachments: image-2023-05-14-22-58-00-886.png
>
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200





[jira] [Commented] (FLINK-32094) startScheduling.BATCH performance regression since May 11th

2023-05-14 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722637#comment-17722637
 ] 

Weijie Guo commented on FLINK-32094:


This may be caused by FLINK-31635. [~tanyuxin], would you mind taking a look 
at this?

> startScheduling.BATCH performance regression since May 11th
> ---
>
> Key: FLINK-32094
> URL: https://issues.apache.org/jira/browse/FLINK-32094
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Martijn Visser
>Priority: Blocker
> Attachments: image-2023-05-14-22-58-00-886.png
>
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200





[GitHub] [flink] fsk119 commented on a diff in pull request #17830: [FLINK-24893][Table SQL/Client][FLIP-189] SQL Client prompts customization

2023-05-14 Thread via GitHub


fsk119 commented on code in PR #17830:
URL: https://github.com/apache/flink/pull/17830#discussion_r1193281401


##
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/PromptHandler.java:
##
@@ -0,0 +1,186 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.client.cli;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.table.client.config.SqlClientOptions;
+import org.apache.flink.table.client.gateway.Executor;
+
+import org.jline.terminal.Terminal;
+import org.jline.utils.AttributedStringBuilder;
+import org.jline.utils.AttributedStyle;
+import org.jline.utils.StyleResolver;
+
+import java.io.PrintWriter;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.Locale;
+import java.util.Map;
+import java.util.function.Supplier;
+
+/**
+ * Prompt handler class which allows customization for the prompt shown at the 
start (left prompt)
+ * and the end (right prompt) of each line.
+ */
+public class PromptHandler {
+    private static final char ESCAPE_BACKSLASH = '\\';
+    private static final Map<String, SimpleDateFormat> FORMATTER_CACHE = new HashMap<>();
+
+    static {
+        FORMATTER_CACHE.put("D", new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT));
+        FORMATTER_CACHE.put("m", new SimpleDateFormat("mm", Locale.ROOT));
+        FORMATTER_CACHE.put("o", new SimpleDateFormat("MM", Locale.ROOT));
+        FORMATTER_CACHE.put("O", new SimpleDateFormat("MMM", Locale.ROOT));
+        FORMATTER_CACHE.put("P", new SimpleDateFormat("aa", Locale.ROOT));
+        FORMATTER_CACHE.put("r", new SimpleDateFormat("hh:mm", Locale.ROOT));
+        FORMATTER_CACHE.put("R", new SimpleDateFormat("HH:mm", Locale.ROOT));
+        FORMATTER_CACHE.put("s", new SimpleDateFormat("ss", Locale.ROOT));
+        FORMATTER_CACHE.put("w", new SimpleDateFormat("d", Locale.ROOT));
+        FORMATTER_CACHE.put("W", new SimpleDateFormat("E", Locale.ROOT));
+        FORMATTER_CACHE.put("y", new SimpleDateFormat("yy", Locale.ROOT));
+        FORMATTER_CACHE.put("Y", new SimpleDateFormat("yyyy", Locale.ROOT));
+    }
+
+    private static final StyleResolver STYLE_RESOLVER = new StyleResolver(s -> "");
+
+    private final Executor executor;
+    private final Supplier<Terminal> terminalSupplier;
+
+    public PromptHandler(Executor executor, Supplier<Terminal> terminalSupplier) {
+        this.executor = executor;
+        this.terminalSupplier = terminalSupplier;
+    }
+
+    public String getPrompt() {
+        return buildPrompt(
+                executor.getSessionConfig().get(SqlClientOptions.PROMPT),
+                SqlClientOptions.PROMPT.defaultValue());
+    }
+
+    public String getRightPrompt() {
+        return buildPrompt(
+                executor.getSessionConfig().get(SqlClientOptions.RIGHT_PROMPT),
+                SqlClientOptions.RIGHT_PROMPT.defaultValue());
+    }
+
+    private String buildPrompt(String pattern, String defaultValue) {

Review Comment:
   add nullable annotation.
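   For context, the escape-and-formatter-cache scheme from the diff above can 
be sketched standalone (this is not the actual Flink class; the `expand()` 
helper and the reduced token set are hypothetical, only the `\\`-escape plus 
`SimpleDateFormat` cache idea is taken from the snippet):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch of prompt-token expansion via a date-formatter cache. */
public class PromptSketch {
    private static final char ESCAPE = '\\';
    private static final Map<String, SimpleDateFormat> FORMATTERS = new HashMap<>();

    static {
        FORMATTERS.put("y", new SimpleDateFormat("yy", Locale.ROOT));
        FORMATTERS.put("Y", new SimpleDateFormat("yyyy", Locale.ROOT));
        FORMATTERS.put("R", new SimpleDateFormat("HH:mm", Locale.ROOT));
    }

    // Scan the pattern; after each escape char, replace a known token
    // with the formatted date, otherwise emit the character as-is.
    static String expand(String pattern, Date now) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == ESCAPE && i + 1 < pattern.length()) {
                SimpleDateFormat f = FORMATTERS.get(String.valueOf(pattern.charAt(++i)));
                out.append(f != null ? f.format(now) : pattern.charAt(i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Fixed date keeps the demo deterministic up to the JVM time zone,
        // so only the year token is shown here.
        System.out.println(expand("sql-\\Y> ", new Date(0)));
    }
}
```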






[jira] [Commented] (FLINK-24696) Translate how to configure unaligned checkpoints into Chinese

2023-05-14 Thread lijie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722634#comment-17722634
 ] 

lijie commented on FLINK-24696:
---

Hi Piotr Nowojski,

I like Flink, and I learned how to contribute to the community from WuChong; 
this is my first time commenting. I saw that ZhouYu Chen showed interest in 
this issue a long time ago, but it was never closed, so I guess he did not 
work on it. Can I work on this issue and gradually start joining the 
community?

Thank you.

> Translate how to configure unaligned checkpoints into Chinese
> -
>
> Key: FLINK-24696
> URL: https://issues.apache.org/jira/browse/FLINK-24696
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation, Documentation
>Affects Versions: 1.14.1, 1.15.0
>Reporter: Piotr Nowojski
>Priority: Not a Priority
>
> As part of FLINK-24695 
> {{docs/content/docs/ops/state/checkpointing_under_backpressure.md}} and 
> {{docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md}} were 
> modified. Those modifications should be translated into Chinese





[jira] [Commented] (FLINK-31033) UsingRemoteJarITCase.testUdfInRemoteJar failed with assertion

2023-05-14 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722632#comment-17722632
 ] 

Weijie Guo commented on FLINK-31033:


testCreateTemporarySystemFunctionUsingRemoteJar also reports the same error:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48964&view=logs&j=fb37c667-81b7-5c22-dd91-846535e99a97&t=011e961e-597c-5c96-04fe-7941c8b83f23&l=12631

> UsingRemoteJarITCase.testUdfInRemoteJar failed with assertion
> -
>
> Key: FLINK-31033
> URL: https://issues.apache.org/jira/browse/FLINK-31033
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.17.0, 1.16.1
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: test-stability
>
> {{UsingRemoteJarITCase.testUdfInRemoteJar}} failed with assertion:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46009&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=160c9ae5-96fd-516e-1c91-deb81f59292a&l=18050
> {code}
> Feb 10 15:28:15 [ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 249.499 s <<< FAILURE! - in 
> org.apache.flink.table.sql.codegen.UsingRemoteJarITCase
> Feb 10 15:28:15 [ERROR] UsingRemoteJarITCase.testUdfInRemoteJar  Time 
> elapsed: 40.786 s  <<< FAILURE!
> Feb 10 15:28:15 org.opentest4j.AssertionFailedError: Did not get expected 
> results before timeout, actual result: 
> [{"before":null,"after":{"user_name":"Bob","order_cnt":1},"op":"c"}, 
> {"before":null,"after":{"user_name":"Alice","order_cnt":1},"op":"c"}, 
> {"before":{"user_name":"Bob","order_cnt":1},"after":null,"op":"d"}, 
> {"before":null,"after":{"user_name":"Bob","order_cnt":2},"op":"c"}]. ==> 
> expected: <true> but was: <false>
> Feb 10 15:28:15   at 
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
> Feb 10 15:28:15   at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40)
> Feb 10 15:28:15   at 
> org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:210)
> Feb 10 15:28:15   at 
> org.apache.flink.table.sql.codegen.SqlITCaseBase.checkJsonResultFile(SqlITCaseBase.java:168)
> Feb 10 15:28:15   at 
> org.apache.flink.table.sql.codegen.SqlITCaseBase.runAndCheckSQL(SqlITCaseBase.java:111)
> Feb 10 15:28:15   at 
> org.apache.flink.table.sql.codegen.UsingRemoteJarITCase.testUdfInRemoteJar(UsingRemoteJarITCase.java:106)
> [...]
> {code}





[GitHub] [flink] xintongsong commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


xintongsong commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193288010


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/TieredStorageMemoryManager.java:
##
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+import org.apache.flink.runtime.io.network.buffer.BufferBuilder;
+import org.apache.flink.runtime.io.network.buffer.BufferPool;
+import org.apache.flink.runtime.io.network.buffer.LocalBufferPool;
+
+import java.util.List;
+
+/**
+ * The {@link TieredStorageMemoryManager} is to request or recycle buffers 
from {@link
+ * LocalBufferPool} for different memory owners, for example, the tiers, the 
buffer accumulator,
+ * etc. Note that the logic for requesting and recycling buffers is consistent 
for these owners.
+ *
+ * The memory managed by {@link TieredStorageMemoryManager} is categorized 
into two types:
+ * long-term occupied memory which cannot be immediately released and 
short-term occupied memory
+ * which can be reclaimed quickly and safely. Long-term occupied memory usage 
necessitates waiting
+ * for other operations to complete before releasing it, such as downstream 
consumption. On the
+ * other hand, short-term occupied memory can be freed up at any time, 
enabling rapid memory
+ * recycling for tasks such as flushing memory to disk or remote storage.
+ *
+ * This {@link TieredStorageMemoryManager} aims to streamline and harmonize 
memory management
+ * across various layers. Instead of tracking the number of buffers utilized 
by individual users, it
+ * dynamically calculates a user's maximum guaranteed amount based on the 
current status of the
+ * manager and the local buffer pool. Specifically, if a user is a long-term 
occupied memory user,
+ * the {@link TieredStorageMemoryManager} does not limit the user's memory 
usage, while if a user is
+ * a short-term occupied memory user, the current guaranteed buffers of the 
user is the left buffers
+ * in the buffer pool - guaranteed amount of other users (excluding the 
current user).

Review Comment:
   This is a bit hard to understand. I think the memory manager never limits 
users' memory usage, so it may not need to understand the difference between 
long-term and short-term usage.
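   The guarantee rule quoted in the javadoc ("guaranteed buffers of a 
short-term user = buffers left in the pool minus the guaranteed amounts of 
the other users") can be sketched as back-of-the-envelope arithmetic. This is 
not Flink code; the owner names and numbers are invented for illustration.

```java
import java.util.Map;

/** Sketch of the per-owner guaranteed-buffer calculation described above. */
public class GuaranteeSketch {
    // A short-term owner may use whatever remains in the local buffer pool
    // after reserving every *other* owner's guaranteed amount.
    static int guaranteedFor(String owner, int buffersLeftInPool, Map<String, Integer> guarantees) {
        int others = guarantees.entrySet().stream()
                .filter(e -> !e.getKey().equals(owner))
                .mapToInt(Map.Entry::getValue)
                .sum();
        return Math.max(0, buffersLeftInPool - others);
    }

    public static void main(String[] args) {
        // 100 buffers left; accumulator and disk tier are each guaranteed 20.
        Map<String, Integer> guarantees =
                Map.of("accumulator", 20, "diskTier", 20, "memoryTier", 10);
        System.out.println(guaranteedFor("memoryTier", 100, guarantees)); // 100 - (20 + 20) = 60
    }
}
```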



##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/TieredStorageMemoryManager.java:
##
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+import org.apache.flink.runtime.io.network.buffer.BufferBuilder;
+import org.apache.flink.runtime.io.network.buffer.BufferPool;
+import org.apache.flink.runtime.io.network.buffer.LocalBufferPool;
+
+import java.util.List;
+
+/**
+ * The {@link TieredStorageMemoryManager} is to request or recycle buffers 
from {@link
+ * LocalBufferPool} for different memory owners, for example, the tiers, the 
buffer accumulator,
+ * etc. Note that the logic for requesting and recycling buffers is consistent 
for these owners.
+ *
+ * The memory managed by {@link TieredStorageMemoryManager} is categorized 
into two types:
+ * long-term occupied memory which cannot be immediately released and 
short-term occupied memory
+ * which can be reclaimed quickly and safely. Long-

[jira] [Closed] (FLINK-20794) Support to select distinct columns in the Table API

2023-05-14 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu closed FLINK-20794.
---
Fix Version/s: (was: 1.18.0)
   Resolution: Not A Problem

> Support to select distinct columns in the Table API
> ---
>
> Key: FLINK-20794
> URL: https://issues.apache.org/jira/browse/FLINK-20794
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Dian Fu
>Assignee: Vishaal Selvaraj
>Priority: Major
> Attachments: screenshot-1.png
>
>
> Currently, there is no corresponding functionality in Table API for the 
> following SQL:
> {code:java}
> SELECT DISTINCT users FROM Orders
> {code}
> For example, for the following job:
> {code:java}
> table.select("distinct a")
> {code}
> It will throw the following exception:
> {code:java}
> org.apache.flink.table.api.ExpressionParserException: Could not parse 
> expression at column 10: ',' expected but 'a' found
> distinct a
>          ^
>  at 
> org.apache.flink.table.expressions.PlannerExpressionParserImpl$.throwError(PlannerExpressionParserImpl.scala:726)
>  at 
> org.apache.flink.table.expressions.PlannerExpressionParserImpl$.parseExpressionList(PlannerExpressionParserImpl.scala:710)
>  at 
> org.apache.flink.table.expressions.PlannerExpressionParserImpl.parseExpressionList(PlannerExpressionParserImpl.scala:47)
>  at 
> org.apache.flink.table.expressions.ExpressionParser.parseExpressionList(ExpressionParser.java:40)
>  at 
> org.apache.flink.table.api.internal.TableImpl.select(TableImpl.java:121){code}





[jira] [Closed] (FLINK-31638) Downstream supports reading buffers from tiered store

2023-05-14 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-31638.

Fix Version/s: 1.18.0
   Resolution: Done

master (1.18): 7d9027dbb3ae551a86c385f985e5fe9af2cbdbac

> Downstream supports reading buffers from tiered store
> -
>
> Key: FLINK-31638
> URL: https://issues.apache.org/jira/browse/FLINK-31638
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Wencong Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>






[jira] [Commented] (FLINK-20794) Support to select distinct columns in the Table API

2023-05-14 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722623#comment-17722623
 ] 

Dian Fu commented on FLINK-20794:
-

Thanks for the analysis. Since string based interfaces have been removed, I 
guess we could close this ticket~

> Support to select distinct columns in the Table API
> ---
>
> Key: FLINK-20794
> URL: https://issues.apache.org/jira/browse/FLINK-20794
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Dian Fu
>Assignee: Vishaal Selvaraj
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: screenshot-1.png
>
>
> Currently, there is no corresponding functionality in Table API for the 
> following SQL:
> {code:java}
> SELECT DISTINCT users FROM Orders
> {code}
> For example, for the following job:
> {code:java}
> table.select("distinct a")
> {code}
> It will throw the following exception:
> {code:java}
> org.apache.flink.table.api.ExpressionParserException: Could not parse expression at column 10: ',' expected but 'a' found
> distinct a
>          ^
>  at org.apache.flink.table.expressions.PlannerExpressionParserImpl$.throwError(PlannerExpressionParserImpl.scala:726)
>  at org.apache.flink.table.expressions.PlannerExpressionParserImpl$.parseExpressionList(PlannerExpressionParserImpl.scala:710)
>  at org.apache.flink.table.expressions.PlannerExpressionParserImpl.parseExpressionList(PlannerExpressionParserImpl.scala:47)
>  at org.apache.flink.table.expressions.ExpressionParser.parseExpressionList(ExpressionParser.java:40)
>  at org.apache.flink.table.api.internal.TableImpl.select(TableImpl.java:121)
> {code}



--


[GitHub] [flink] xintongsong closed pull request #22316: [FLINK-31638][network] Downstream supports reading buffers from tiered store

2023-05-14 Thread via GitHub


xintongsong closed pull request #22316: [FLINK-31638][network] Downstream 
supports reading buffers from tiered store
URL: https://github.com/apache/flink/pull/22316


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-31762) Subscribe to multiple Kafka topics may cause partition assignment skew

2023-05-14 Thread Liam (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722618#comment-17722618
 ] 

Liam commented on FLINK-31762:
--

Hi master [~martijnvisser] [~tzulitai] any comment?

> Subscribe to multiple Kafka topics may cause partition assignment skew
> --
>
> Key: FLINK-31762
> URL: https://issues.apache.org/jira/browse/FLINK-31762
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.13.0, 1.18.0
>Reporter: Liam
>Priority: Major
> Attachments: image-2023-04-11-08-00-16-054.png, 
> image-2023-04-11-08-12-24-115.png
>
>
> To simplify the demonstration, let us assume that there are two topics, and 
> each topic has four partitions. We have set the parallelism to eight to 
> consume these two topics. However, the current partition assignment method 
> may lead to some subtasks being assigned two partitions while others are left 
> with none.
> !image-2023-04-11-08-00-16-054.png|width=500,height=143!
> In my case, the situation is even worse as I have ten topics, each with 100 
> partitions. If I set the parallelism to 1000, some slots may be assigned 
> seven partitions while others remain unassigned.
> To address this issue, I propose a new partition assignment solution. In this 
> approach, round-robin assignment takes place between all topics, not just one.
> For example, the ideal assignment for the case mentioned above is presented 
> below:
>  
> !https://imgr.whimsical.com/object/A4jSJwgQNrc5mgpGddhghq|width=513,height=134!
> This new solution can also handle cases where each topic has more partitions.
> !image-2023-04-11-08-12-24-115.png|width=444,height=127!
> Let us work together to reach a consensus on this proposal. Thank you!
>  
> FYI: how partitions are currently assigned
> {code:java}
> public class KafkaTopicPartitionAssigner {
>     public static int assign(KafkaTopicPartition partition, int numParallelSubtasks) {
>         return assign(partition.getTopic(), partition.getPartition(), numParallelSubtasks);
>     }
>
>     public static int assign(String topic, int partition, int numParallelSubtasks) {
>         int startIndex = ((topic.hashCode() * 31) & 0x7FFFFFFF) % numParallelSubtasks;
>         // here, the assumption is that the id of Kafka partitions are always ascending
>         // starting from 0, and therefore can be used directly as the offset clockwise from the
>         // start index
>         return (startIndex + partition) % numParallelSubtasks;
>     }
> }
> {code}
> for Kafka Source, it's implemented in the KafkaSourceEnumerator as below
> {code:java}
>     static int getSplitOwner(TopicPartition tp, int numReaders) {
>         int startIndex = ((tp.topic().hashCode() * 31) & 0x7FFFFFFF) % numReaders;
>         // here, the assumption is that the id of Kafka partitions are always ascending
>         // starting from 0, and therefore can be used directly as the offset clockwise from the
>         // start index
>         return (startIndex + tp.partition()) % numReaders;
>     }
> {code}
>  
>  
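The proposed cross-topic round-robin can be sketched as follows. This is a simplified illustration, not the actual Flink implementation; all names, and the topic-to-partition-count map used as input, are made up for the example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: round-robin assignment across ALL topics, not per topic. */
public class RoundRobinAcrossTopicsAssigner {

    /**
     * Flattens every (topic, partition) pair into one deterministically ordered
     * list and deals it out round-robin, so with T topics of P partitions each
     * and T*P readers, every reader gets exactly one partition.
     */
    public static Map<Integer, List<String>> assign(
            Map<String, Integer> partitionsPerTopic, int numReaders) {
        List<String> all = new ArrayList<>();
        partitionsPerTopic.entrySet().stream()
                .sorted(Map.Entry.comparingByKey()) // same order on every reader
                .forEach(e -> {
                    for (int p = 0; p < e.getValue(); p++) {
                        all.add(e.getKey() + "-" + p);
                    }
                });
        Map<Integer, List<String>> assignment = new HashMap<>();
        for (int i = 0; i < all.size(); i++) {
            assignment.computeIfAbsent(i % numReaders, k -> new ArrayList<>())
                    .add(all.get(i));
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<String, Integer> topics = new HashMap<>();
        topics.put("topicA", 4);
        topics.put("topicB", 4);
        // With 8 readers, each reader receives exactly one partition.
        System.out.println(assign(topics, 8));
    }
}
```

Unlike the hash-based start index, this guarantees that no reader is left idle while another holds several partitions, at the cost of requiring a global view of all subscribed topics.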



--


[GitHub] [flink] flinkbot commented on pull request #22577: [FLINK-31882][sql-client] Fix sql client can't show result for DELETE…

2023-05-14 Thread via GitHub


flinkbot commented on PR #22577:
URL: https://github.com/apache/flink/pull/22577#issuecomment-1547107421

   
   ## CI report:
   
   * 498d9e8c2a576f2a9dbfcb5367b2e2d8beed8f83 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 



[GitHub] [flink] fsk119 commented on a diff in pull request #22533: [FLINK-31687][jdbc-driver] Get rid of flink-core for jdbc driver

2023-05-14 Thread via GitHub


fsk119 commented on code in PR #22533:
URL: https://github.com/apache/flink/pull/22533#discussion_r1193272280


##
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/Executor.java:
##
@@ -46,7 +46,7 @@ static Executor create(
  *
  * @return the session configuration.
  */
-ReadableConfig getSessionConfig();
+Map getSessionConfig();

Review Comment:
   Yes. I think external system may rely on this. Can we add another method, 
e.g. getSessionConfigMap?



-- 



[GitHub] [flink] luoyuxia opened a new pull request, #22577: [FLINK-31882][sql-client] Fix sql client can't show result for DELETE…

2023-05-14 Thread via GitHub


luoyuxia opened a new pull request, #22577:
URL: https://github.com/apache/flink/pull/22577

   … statement when delete is pushed down (#22488)
   
   
   
   ## What is the purpose of the change
   Backport of #22488
   


-- 



[jira] [Updated] (FLINK-31950) Introduce StateMetadata

2023-05-14 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-31950:
--
Summary: Introduce StateMetadata  (was: Introduce StateMetadata and 
StateMetadataJson SerDe)

> Introduce StateMetadata
> ---
>
> Key: FLINK-31950
> URL: https://issues.apache.org/jira/browse/FLINK-31950
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> According to the FLIP design, we're about to introduce
>  * StateMetadata, which describes the TTL attribute of the stateful stream 
> operator.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31950) Introduce StateMetadata and StateMetadataJson SerDe

2023-05-14 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-31950:
--
Description: 
According to the FLIP design, we're about to introduce
 * StateMetadata, which describes the TTL attribute of the stateful stream 
operator.

  was:
According to the FLIP design, we're about to introduce
 * StateMetadata, which describes the TTL attribute of the stateful stream 
operator.
 * StateMetadata SerDerializers.


> Introduce StateMetadata and StateMetadataJson SerDe
> ---
>
> Key: FLINK-31950
> URL: https://issues.apache.org/jira/browse/FLINK-31950
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> According to the FLIP design, we're about to introduce
>  * StateMetadata, which describes the TTL attribute of the stateful stream 
> operator.



--


[GitHub] [flink] dianfu commented on a diff in pull request #22571: [FLINK-32056][Python][Connector/Pulsar] Upgrade flink-connector-pulsar in flink-python to v4.0.0

2023-05-14 Thread via GitHub


dianfu commented on code in PR #22571:
URL: https://github.com/apache/flink/pull/22571#discussion_r1191830067


##
flink-python/pyflink/datastream/connectors/pulsar.py:
##
@@ -693,8 +668,7 @@ class PulsarSinkBuilder(object):
 ... .set_service_url(PULSAR_BROKER_URL) \\
 ... .set_admin_url(PULSAR_BROKER_HTTP_URL) \\
 ... .set_topics([TOPIC1, TOPIC2]) \\
-... .set_serialization_schema(
-... PulsarSerializationSchema.flink_schema(SimpleStringSchema())) \\
+... .set_value_serialization_schema(SimpleStringSchema()) \\

Review Comment:
   ```suggestion
   ... .set_serialization_schema(SimpleStringSchema()) \\
   ```



##
flink-python/pyflink/datastream/connectors/pulsar.py:
##
@@ -483,18 +438,17 @@ def set_bounded_stop_cursor(self, stop_cursor: StopCursor) -> 'PulsarSourceBuild
 
         self._j_pulsar_source_builder.setBoundedStopCursor(stop_cursor._j_stop_cursor)
         return self
 
-    def set_deserialization_schema(self,
-                                   pulsar_deserialization_schema: PulsarDeserializationSchema) \
+    def set_value_only_deserializer(self, deserialization_schema: DeserializationSchema) \
             -> 'PulsarSourceBuilder':
         """
-        DeserializationSchema is required for getting the Schema for deserialize message from
-        pulsar and getting the TypeInformation for message serialization in flink.
+        Sets the :class:`~pyflink.common.serialization.DeserializationSchema` for deserializing the
+        value of Pulsars message.
 
-        We have defined a set of implementations, using PulsarDeserializationSchema#flink_type_info
-        or PulsarDeserializationSchema#flink_schema for creating the desired schema.
+        :param deserialization_schema: the :class:`DeserializationSchema` to use for
+            deserialization.
+        :return: this PulsarSourceBuilder.
         """
-        self._j_pulsar_source_builder.setDeserializationSchema(
-            pulsar_deserialization_schema._j_pulsar_deserialization_schema)
+        self._j_builder.setValueOnlyDeserializer(deserialization_schema._j_deserialization_schema)

Review Comment:
   I'm curious why you are trying to call `setValueOnlyDeserializer`. Per my 
understanding, there is no such interface at all: 
https://github.com/apache/flink-connector-pulsar/blob/main/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/PulsarSourceBuilder.java



-- 



[GitHub] [flink] luoyuxia merged pull request #22488: [FLINK-31882][sql-client] Fix sql client can't show result for DELETE statement when delete is pushed down

2023-05-14 Thread via GitHub


luoyuxia merged PR #22488:
URL: https://github.com/apache/flink/pull/22488


-- 



[jira] [Closed] (FLINK-31950) Introduce StateMetadata and StateMetadataJson SerDe

2023-05-14 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He closed FLINK-31950.
--
Resolution: Fixed

Fixed in master: 62b11e2e117f874f073de93756c6b9889c464562

> Introduce StateMetadata and StateMetadataJson SerDe
> ---
>
> Key: FLINK-31950
> URL: https://issues.apache.org/jira/browse/FLINK-31950
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> According to the FLIP design, we're about to introduce
>  * StateMetadata, which describes the TTL attribute of the stateful stream 
> operator.
>  * StateMetadata SerDerializers.
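Given the field names this change introduces (index, ttl, name), a serialized StateMetadata entry in a compiled plan could plausibly look like the sketch below. This is illustrative only — the surrounding "state" key and the TTL formatting are assumptions, not actual plan output:

```json
{
  "state": [
    { "index": 0, "ttl": "3600000 ms", "name": "deduplicateState" }
  ]
}
```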



--


[GitHub] [flink] godfreyhe closed pull request #22492: [FLINK-31950][table-planner] Introduce StateMetadata and StateMetadataJson SerDe

2023-05-14 Thread via GitHub


godfreyhe closed pull request #22492: [FLINK-31950][table-planner] Introduce 
StateMetadata and StateMetadataJson SerDe
URL: https://github.com/apache/flink/pull/22492


-- 



[GitHub] [flink] godfreyhe commented on a diff in pull request #22492: [FLINK-31950][table-planner] Introduce StateMetadata and StateMetadataJson SerDe

2023-05-14 Thread via GitHub


godfreyhe commented on code in PR #22492:
URL: https://github.com/apache/flink/pull/22492#discussion_r1193258928


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/StateMetadata.java:
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.configuration.ReadableConfig;
+import org.apache.flink.table.api.config.ExecutionConfigOptions;
+import org.apache.flink.util.CollectionUtil;
+import org.apache.flink.util.Preconditions;
+import org.apache.flink.util.TimeUtils;
+
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.annotation.JsonCreator;
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.annotation.JsonIgnoreProperties;
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.time.Duration;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Objects;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+import java.util.stream.Stream;
+
+/**
+ * It is used to describe the state metadata of a stateful operator, which is
+ * serialized/deserialized into/from those {@link
+ * org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecNode}s that can generate stateful
+ * operators. For ExecNodes that generates {@link
+ * org.apache.flink.streaming.api.operators.TwoInputStreamOperator} or {@link
+ * org.apache.flink.streaming.api.operators.MultipleInputStreamOperator}, there will be multiple
+ * metadata describing information about each input's state.
+ *
+ * <p>The metadata describes the following attributes.
+ *
+ * <ul>
+ *   <li>{@code stateIndex}: annotates the state is from the i-th input, index based on zero
+ *   <li>{@code ttl}: annotates the state retention time for the i-th input's state, the time unit is ms.
+ *   <li>{@code name}: annotates the state description, such as deduplicate-state, join-left-state.
+ * </ul>
+ */
+@Internal
+@JsonIgnoreProperties(ignoreUnknown = true)
+public class StateMetadata {
+public static final String FIELD_NAME_STATE_INDEX = "index";
+public static final String FIELD_NAME_STATE_TTL = "ttl";
+public static final String FIELD_NAME_STATE_NAME = "name";
+
+@JsonProperty(value = FIELD_NAME_STATE_INDEX, index = 0)
+private final int stateIndex;
+
+@JsonProperty(value = FIELD_NAME_STATE_TTL, index = 1)
+private final Duration stateTtl;
+
+@JsonProperty(value = FIELD_NAME_STATE_NAME, index = 2)
+private final String stateName;
+
+@JsonCreator
+public StateMetadata(
+@JsonProperty(FIELD_NAME_STATE_INDEX) int stateIndex,
+@JsonProperty(FIELD_NAME_STATE_TTL) String stateTtl,
+@JsonProperty(FIELD_NAME_STATE_NAME) String stateName) {
+this(
+stateIndex,
+TimeUtils.parseDuration(
+Preconditions.checkNotNull(stateTtl, "state ttl should not be null")),
+Preconditions.checkNotNull(stateName, "state name should not be null"));
+}
+
+public StateMetadata(int stateIndex, @Nonnull Duration stateTtl, @Nonnull String stateName) {

Review Comment:
   nit: `@Nonnull` will check null value when running, you can improve it in 
the following prs
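Whichever way the nit is resolved, an explicit runtime check is the portable option, since `javax.annotation.Nonnull` is advisory for most toolchains and enforces nothing by itself at runtime. A minimal sketch of that pattern (a hypothetical class, not the actual Flink code):

```java
import java.util.Objects;

/** Sketch: explicit runtime null check instead of relying on @Nonnull alone. */
public class StateMetadataSketch {
    private final int stateIndex;
    private final String stateName;

    public StateMetadataSketch(int stateIndex, String stateName) {
        this.stateIndex = stateIndex;
        // Fails fast with a clear message, regardless of annotation tooling.
        this.stateName = Objects.requireNonNull(stateName, "state name should not be null");
    }

    public int getStateIndex() {
        return stateIndex;
    }

    public String getStateName() {
        return stateName;
    }

    public static void main(String[] args) {
        System.out.println(new StateMetadataSketch(0, "join-left-state").getStateName());
    }
}
```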



-- 



[jira] [Updated] (FLINK-32094) startScheduling.BATCH performance regression since May 11th

2023-05-14 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-32094:
---
Description: 
http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200
  (was: 
http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200

 !image-2023-05-14-22-58-00-886.png! )

> startScheduling.BATCH performance regression since May 11th
> ---
>
> Key: FLINK-32094
> URL: https://issues.apache.org/jira/browse/FLINK-32094
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Martijn Visser
>Priority: Blocker
> Attachments: image-2023-05-14-22-58-00-886.png
>
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200



--


[jira] [Updated] (FLINK-32094) startScheduling.BATCH performance regression since May 11th

2023-05-14 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-32094:
---
 Attachment: image-2023-05-14-22-58-00-886.png
Description: 
http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200

 !image-2023-05-14-22-58-00-886.png! 

  was:
http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200




> startScheduling.BATCH performance regression since May 11th
> ---
>
> Key: FLINK-32094
> URL: https://issues.apache.org/jira/browse/FLINK-32094
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Martijn Visser
>Priority: Blocker
> Attachments: image-2023-05-14-22-58-00-886.png
>
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200
>  !image-2023-05-14-22-58-00-886.png! 



--


[jira] [Created] (FLINK-32094) startScheduling.BATCH performance regression since May 11th

2023-05-14 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-32094:
--

 Summary: startScheduling.BATCH performance regression since May 
11th
 Key: FLINK-32094
 URL: https://issues.apache.org/jira/browse/FLINK-32094
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Reporter: Martijn Visser


http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200





--


[jira] [Commented] (FLINK-28842) Add client.id.prefix for the KafkaSink

2023-05-14 Thread Yaroslav Tkachenko (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722601#comment-17722601
 ] 

Yaroslav Tkachenko commented on FLINK-28842:


Hi Valentina, unfortunately I didn’t have a chance to address feedback, so feel 
free to provide an alternative PR.

> Add client.id.prefix for the KafkaSink
> --
>
> Key: FLINK-28842
> URL: https://issues.apache.org/jira/browse/FLINK-28842
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Kafka
>Affects Versions: 1.15.1
>Reporter: Yaroslav Tkachenko
>Assignee: Yaroslav Tkachenko
>Priority: Major
>  Labels: pull-request-available
>
> Currently, KafkaSink doesn't provide a way to configure a client.id.prefix 
> like KafkaSource does. client.id is as important for Kafka Producers, so it 
> makes sense to implement the missing logic for the KafkaSink. 
> A similar implementation that leverages subtaskId for uniqueness can be used 
> here.
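The ticket suggests mirroring the KafkaSource approach: combine a configured prefix with the subtask index so each parallel producer gets a distinct, stable client.id. A hypothetical sketch of that idea (not the actual KafkaSink API):

```java
/** Hypothetical helper: build a per-subtask Kafka client.id from a prefix. */
public class ClientIdBuilder {

    /**
     * The subtask index is unique within a running Flink job, so appending it
     * to the user-configured prefix yields a unique client.id per producer.
     */
    public static String buildClientId(String prefix, int subtaskId) {
        return prefix + "-" + subtaskId;
    }

    public static void main(String[] args) {
        System.out.println(buildClientId("my-sink", 3)); // prints "my-sink-3"
    }
}
```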



--


[GitHub] [flink-playgrounds] alainbrown commented on pull request #42: Create flink_data volume for operations playground.

2023-05-14 Thread via GitHub


alainbrown commented on PR #42:
URL: https://github.com/apache/flink-playgrounds/pull/42#issuecomment-1546974310

   Done!


-- 



[GitHub] [flink] flinkbot commented on pull request #22576: Remove savepoint and checkpoint mount instructions from flink playground.

2023-05-14 Thread via GitHub


flinkbot commented on PR #22576:
URL: https://github.com/apache/flink/pull/22576#issuecomment-1546972941

   
   ## CI report:
   
   * 7daebec55b3c0aadaf4e499ac61fe078b25b1eb6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 



[GitHub] [flink] alainbrown opened a new pull request, #22576: Remove savepoint and checkpoint mount instructions from flink playground.

2023-05-14 Thread via GitHub


alainbrown opened a new pull request, #22576:
URL: https://github.com/apache/flink/pull/22576

   ## What is the purpose of the change
   
   With the addition of a docker volume, this step is no longer needed.
   https://github.com/apache/flink-playgrounds/pull/42
   
   re:
   mkdir -p /tmp/flink-checkpoints-directory
   mkdir -p /tmp/flink-savepoints-directory
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   


-- 



[jira] [Commented] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722593#comment-17722593
 ] 

Gyula Fora commented on FLINK-32093:


Can you please share both the operator and the FlinkDeployment configuration?

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.5.0
>Reporter: Tamir Sagi
>Priority: Major
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.



--


[jira] [Commented] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722592#comment-17722592
 ] 

Gyula Fora commented on FLINK-32041:


Can you please share both the operator and FlinkDeployment configuration? I am 
a bit confused 

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]



--


[jira] [Commented] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722589#comment-17722589
 ] 

Tamir Sagi commented on FLINK-32093:


I'm talking about Operator HA not Job HA.

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.5.0
>Reporter: Tamir Sagi
>Priority: Major
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.



--


[jira] [Comment Edited] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722588#comment-17722588
 ] 

Tamir Sagi edited comment on FLINK-32041 at 5/14/23 5:12 PM:
-

I'm talking about Operator HA.

My point was when creating RestClusterClient, its constructor can take the 
ClientHighAvailabilityServicesFactory, if not provided it will look in 
classpath for implementation(based on cluster configurations). In my Flink 
configurations it is set to kubernetes. that's why I thought it might be 
related. (Noticed that in the operator it does pass the standalone HA service). 
so I might be wrong here.


was (Author: JIRAUSER283777):
I'm talking about Operator HA.

My point was when creating RestClusterClient, its constructor can take the 
ClientHighAvailabilityServicesFactory, if not provided it will look in 
classpath for implementation(based on cluster configurations). In my Flink 
configurations it is set to kubernetes. that's why I thought it might be 
related.

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]
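To make the description above concrete, the missing Lease permissions would need a namespaced Role and RoleBinding in the operator's own namespace rather than in the watchNamespaces. This is only a sketch of what such rules look like; the namespace, ServiceAccount, and object names are placeholders, not the actual Helm chart output:

```yaml
# Sketch only: Lease RBAC in the operator's own namespace.
# "flink-operator-ns" and "flink-operator" are placeholder names.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flink-operator-leases
  namespace: flink-operator-ns
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flink-operator-leases
  namespace: flink-operator-ns
subjects:
  - kind: ServiceAccount
    name: flink-operator
    namespace: flink-operator-ns
roleRef:
  kind: Role
  name: flink-operator-leases
  apiGroup: rbac.authorization.k8s.io
```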



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722588#comment-17722588
 ] 

Tamir Sagi commented on FLINK-32041:


I'm talking about Operator HA.

My point was that when creating a RestClusterClient, its constructor can take a 
ClientHighAvailabilityServicesFactory; if one is not provided, it looks up an 
implementation on the classpath (based on the cluster configuration). In my Flink 
configuration it is set to kubernetes, which is why I thought it might be 
related.

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]





[jira] [Updated] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-32093:
---
Labels:   (was: flink-kubernetes-operator)

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Tamir Sagi
>Priority: Major
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.





[jira] [Updated] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-32093:
---
Affects Version/s: kubernetes-operator-1.5.0

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.5.0
>Reporter: Tamir Sagi
>Priority: Major
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.





[jira] [Commented] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722586#comment-17722586
 ] 

Gyula Fora commented on FLINK-32093:


By HA here, again, do you mean Flink job HA or operator HA (leader election)?

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Tamir Sagi
>Priority: Major
>  Labels: flink-kubernetes-operator
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.





[jira] [Comment Edited] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722584#comment-17722584
 ] 

Tamir Sagi edited comment on FLINK-32093 at 5/14/23 4:37 PM:
-

There are not many logs from the idle pods:

either the stack trace I attached, or info logs about configuration, e.g.
[Info] {} [o.a.f.c.GlobalConfiguration]: Loading configuration property...

As for disabling HA, I will try tomorrow and get back to you.


was (Author: JIRAUSER283777):
I will try tomorrow and get back to you.

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Tamir Sagi
>Priority: Major
>  Labels: flink-kubernetes-operator
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.





[jira] [Commented] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722584#comment-17722584
 ] 

Tamir Sagi commented on FLINK-32093:


I will try tomorrow and get back to you.

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Tamir Sagi
>Priority: Major
>  Labels: flink-kubernetes-operator
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.





[jira] [Commented] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722585#comment-17722585
 ] 

Gyula Fora commented on FLINK-32041:


This may be completely unrelated to the original issue described in the ticket; 
that one is about Kubernetes operator HA, not Flink job HA.
It would be good to know whether this error occurs with operator 1.4.0 vs 1.5.0 
and with Flink 1.16 vs 1.17.

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]





[jira] [Comment Edited] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722583#comment-17722583
 ] 

Tamir Sagi edited comment on FLINK-32041 at 5/14/23 4:33 PM:
-

I need to try and get back to you with an answer (probably tomorrow), but it 
does seem connected to the k8s HA service, RestClient, and KubeClient.

RestClient uses the k8s client internally, which needs node-list permissions, but 
instead of reading from the service account it looks for a kube.config file.

[https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-config-file]

 
ClusterIP Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/ClusterIPService.java#L44-L53]
 
NodePort Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/NodePortService.java#L62]


was (Author: JIRAUSER283777):
I need to try and get back to you with an answer (probably tomorrow) . But it 
does seems connected to k8s HA service & RestClient & KubeClinet. 
 
RestClient uses k8s client internally which needs NodeList permissions but 
instead of reading from Service account it looks for kube.config file.


[https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-config-file]

 
ClusterIP Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/ClusterIPService.java#L44-L53]
 
NodePort Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/NodePortService.java#L62]

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]





[jira] [Comment Edited] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722583#comment-17722583
 ] 

Tamir Sagi edited comment on FLINK-32041 at 5/14/23 4:33 PM:
-

I need to try and get back to you with an answer (probably tomorrow), but it 
does seem connected to the k8s HA service, RestClient, and KubeClient.

RestClient uses the k8s client internally, which needs node-list permissions, but 
instead of reading from the service account it looks for a kube.config file.


[https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-config-file]

 
ClusterIP Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/ClusterIPService.java#L44-L53]
 
NodePort Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/NodePortService.java#L62]


was (Author: JIRAUSER283777):
I need to try and get back to you with an answer (probably tomorrow) . But it 
does seems connected to k8s HA service & RestClient & KubeClinet. 
 
RestClient uses k8s client internally which needs NodeList permissions but 
instead of reading from Service account it looks for kube.config file. [1]
 
ClusterIP Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/ClusterIPService.java#L44-L53]
 
NodePort Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/NodePortService.java#L62]

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]





[jira] [Commented] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722583#comment-17722583
 ] 

Tamir Sagi commented on FLINK-32041:


I need to try and get back to you with an answer (probably tomorrow), but it 
does seem connected to the k8s HA service, RestClient, and KubeClient.

RestClient uses the k8s client internally, which needs node-list permissions, but 
instead of reading from the service account it looks for a kube.config file. [1]
 
ClusterIP Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/ClusterIPService.java#L44-L53]
 
NodePort Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/NodePortService.java#L62]
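The node-list permission mentioned above is cluster-scoped in Kubernetes (nodes are not namespaced objects), so it requires a ClusterRole and ClusterRoleBinding rather than a namespaced Role. A minimal sketch with placeholder names, not taken from the actual operator chart:

```yaml
# Sketch only: cluster-scoped permission to list nodes, which a client
# needs e.g. to resolve node addresses for a NodePort rest service.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flink-node-reader
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: flink-node-reader
subjects:
  - kind: ServiceAccount
    name: flink-operator          # placeholder
    namespace: flink-operator-ns  # placeholder
roleRef:
  kind: ClusterRole
  name: flink-node-reader
  apiGroup: rbac.authorization.k8s.io
```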

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]





[jira] [Comment Edited] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722583#comment-17722583
 ] 

Tamir Sagi edited comment on FLINK-32041 at 5/14/23 4:32 PM:
-

I need to try and get back to you with an answer (probably tomorrow), but it 
does seem connected to the k8s HA service, RestClient, and KubeClient.

RestClient uses the k8s client internally, which needs node-list permissions, but 
instead of reading from the service account it looks for a kube.config file. [1]
 
ClusterIP Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/ClusterIPService.java#L44-L53]
 
NodePort Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/NodePortService.java#L62]


was (Author: JIRAUSER283777):
I need to try and get back to youwith an answer (probably tomorrow) . But it 
does seems connected to k8s HA service & RestClient & KubeClinet. 
 
RestClient uses k8s client internally which needs NodeList permissions but 
instead of reading from Service account it looks for kube.config file. [1]
 
ClusterIP Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/ClusterIPService.java#L44-L53]
 
NodePort Service
[https://github.com/apache/flink/blob/release-1.17.0/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/services/NodePortService.java#L62]

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]





[jira] [Comment Edited] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722581#comment-17722581
 ] 

Gyula Fora edited comment on FLINK-32093 at 5/14/23 4:25 PM:
-

Can you share the operator logs?

Does this also happen without HA?


was (Author: gyfora):
Can you share the operator logs?

This this also happen without HA?

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Tamir Sagi
>Priority: Major
>  Labels: flink-kubernetes-operator
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.





[jira] [Commented] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722582#comment-17722582
 ] 

Gyula Fora commented on FLINK-32041:


[~tamirsagi] does the same error occur without HA enabled?

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamepaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, For flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created to allow the operator's ServiceAccount in the operator's namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]





[jira] [Commented] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722581#comment-17722581
 ] 

Gyula Fora commented on FLINK-32093:


Can you share the operator logs?

Does this also happen without HA?

> Upon Delete deployment idle pods throw - java.lang.IllegalStateException: 
> Cannot receive event after a delete event received
> 
>
> Key: FLINK-32093
> URL: https://issues.apache.org/jira/browse/FLINK-32093
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Tamir Sagi
>Priority: Major
>  Labels: flink-kubernetes-operator
> Attachments: event-error.txt
>
>
> After a deployment is deleted , idle pods throw 
> java.lang.IllegalStateException: Cannot receive event after a delete event 
> received
> HA is enabled.





[jira] [Created] (FLINK-32093) Upon Delete deployment idle pods throw - java.lang.IllegalStateException: Cannot receive event after a delete event received

2023-05-14 Thread Tamir Sagi (Jira)
Tamir Sagi created FLINK-32093:
--

 Summary: Upon Delete deployment idle pods throw - 
java.lang.IllegalStateException: Cannot receive event after a delete event 
received
 Key: FLINK-32093
 URL: https://issues.apache.org/jira/browse/FLINK-32093
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Reporter: Tamir Sagi
 Attachments: event-error.txt

After a deployment is deleted, idle pods throw 
java.lang.IllegalStateException: Cannot receive event after a delete event 
received

HA is enabled.





[jira] [Comment Edited] (FLINK-32041) flink-kubernetes-operator RoleBinding for Leases not created in correct namespace when using watchNamespaces

2023-05-14 Thread Tamir Sagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721192#comment-17721192
 ] 

Tamir Sagi edited comment on FLINK-32041 at 5/14/23 4:18 PM:
-

Hey Gyula,

I also encountered something similar  (HA is enabled).

 
I checked the RoleBinding between the service account 
`dev-0-flink-clusters:dev-0-xsight-flink-operator-sa` and the corresponding 
role ({*}flink-operator{*}), which was created by the operator using 
*{{rbac.nodesRule.create=true}}*; they both look fine.

 

The operator watches 2 namespaces:
 # its own:  dev-0-flink-clusters
 # dev-0-flink-temp-clusters

!https://lists.apache.org/api/email.lua?attachment=true&id=61qtwrnxlh722pvok8dtnzdt7t7k7drb&file=fe69ed8d14240d73b73f68176ee7fa4f13f2b0ee303676f8eea92b7bdee9ceb3!

!https://lists.apache.org/api/email.lua?attachment=true&id=61qtwrnxlh722pvok8dtnzdt7t7k7drb&file=c8a40ca61528174bd1667e3fcf10ba39e2224700198a69e828db80c66315719d!

{{Then the following error is thrown:}}

{{org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException","message":"Failure
 executing: GET at: [https://172.20.0.1/api/v1/nodes]. Message: 
Forbidden!Configured service account doesn't have access. Service account may 
have been revoked. nodes is forbidden: User 
"system:serviceaccount:dev-0-flink-clusters:{*}dev-0-xsight-flink-operator-sa{*}"
 cannot list resource "nodes" in API group "" at the cluster scope."}}

Could it be related to {{kubernetes.rest-service.exposed.type}}?

 

EDIT: it seems to be resolved after changing 
{{kubernetes.rest-service.exposed.type}} from NodePort to ClusterIP.
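The workaround reported above comes down to one standard Flink option. As a hedged sketch (the option name is from Flink's Kubernetes configuration; the reasoning in the comment is the reporter's, not confirmed in this thread): with ClusterIP the client reaches the rest service via its in-cluster address, whereas NodePort resolution is what involves node addresses.

```yaml
# flink-conf.yaml fragment: expose the rest service as ClusterIP
# instead of NodePort.
kubernetes.rest-service.exposed.type: ClusterIP
```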


was (Author: JIRAUSER283777):
Hey Gyula,

I also encountered something similar  (HA is enabled).

 
I checked the rolebinding between the service account 
`dev-0-flink-clusters:dev-0-xsight-flink-operator-sa` and the corresponded 
role({*}flink-operator{*}) which has been created by the operator using 
*{{rbac.nodesRule.create=true, they both look fine.}}*

 

The operator watches 2 namespaces:
 # its own:  dev-0-flink-clusters
 # dev-0-flink-temp-clusters

!https://lists.apache.org/api/email.lua?attachment=true&id=61qtwrnxlh722pvok8dtnzdt7t7k7drb&file=fe69ed8d14240d73b73f68176ee7fa4f13f2b0ee303676f8eea92b7bdee9ceb3!

!https://lists.apache.org/api/email.lua?attachment=true&id=61qtwrnxlh722pvok8dtnzdt7t7k7drb&file=c8a40ca61528174bd1667e3fcf10ba39e2224700198a69e828db80c66315719d!

{{Then the following error is thrown:}}

{{org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException","message":"Failure
 executing: GET at: [https://172.20.0.1/api/v1/nodes]. Message: 
Forbidden!Configured service account doesn't have access. Service account may 
have been revoked. nodes is forbidden: User 
"system:serviceaccount:dev-0-flink-clusters:{*}dev-0-xsight-flink-operator-sa{*}"
 cannot list resource "nodes" in API group "" at the cluster scope."}}

{{Could it be related to kubernetes.rest-service.exposed.type?}}

> flink-kubernetes-operator RoleBinding for Leases not created in correct 
> namespace when using watchNamespaces
> 
>
> Key: FLINK-32041
> URL: https://issues.apache.org/jira/browse/FLINK-32041
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.4.0
>Reporter: Andrew Otto
>Assignee: Andrew Otto
>Priority: Major
>
> When enabling [HA for 
> flink-kubernetes-operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/#leader-election-and-high-availability]
>  RBAC rules must be created to allow the flink-operator to manage k8s Lease 
> resources.  When not using {{{}watchNamespaces{}}}, the RBAC rules are 
> created at the k8s cluster level scope, giving the flink-operator 
> ServiceAccount the ability to manage all needed k8s resources for all 
> namespaces.
> However, when using {{{}watchNamespaces{}}}, RBAC rules are only created in 
> the {{{}watchNamespaces{}}}.  For most rules, this is correct, as the operator 
> needs to manage resources like Flink pods and deployments in the 
> {{{}watchNamespaces{}}}.  
> However, for flink-kubernetes-operator HA, the Lease resource is managed in 
> the same namespace in which the operator is deployed.  
> The Helm chart should be fixed so that the proper RBAC rules for Leases are 
> created for the operator's ServiceAccount in the operator's own namespace.
> Mailing list discussion 
> [here.|https://lists.apache.org/thread/yq89jm0szkcodfocm5x7vqnqdmh0h1l0]
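As a sketch of the proposed fix (resource names here are illustrative, not the actual Helm chart output), the Lease rules would live in a namespaced Role and RoleBinding in the operator's own namespace rather than in the watch namespaces:

```yaml
# Illustrative only: a Role granting Lease access in the operator's own
# namespace, bound to the operator's ServiceAccount. Names are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flink-operator-leases        # hypothetical name
  namespace: flink-operator-ns       # the namespace the operator runs in
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flink-operator-leases
  namespace: flink-operator-ns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: flink-operator-leases
subjects:
  - kind: ServiceAccount
    name: flink-operator             # hypothetical ServiceAccount name
    namespace: flink-operator-ns
```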



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] xuzhiwen1255 commented on a diff in pull request #21971: [FLINK-31084][connectors/dataGen] Add default value for dataGen seque…

2023-05-14 Thread via GitHub


xuzhiwen1255 commented on code in PR #21971:
URL: https://github.com/apache/flink/pull/21971#discussion_r1193156542


##
flink-table/flink-table-api-java-bridge/src/test/java/org/apache/flink/table/factories/DataGenTableSourceFactoryTest.java:
##
@@ -264,57 +267,63 @@ void testSequenceCheckpointRestore() throws Exception {
 }
 
 @Test
-void testLackStartForSequence() {
-assertThatThrownBy(
-() -> {
-DescriptorProperties descriptor = new 
DescriptorProperties();
-descriptor.putString(FactoryUtil.CONNECTOR.key(), 
"datagen");
-descriptor.putString(
-DataGenConnectorOptionsUtil.FIELDS
-+ ".f0."
-+ DataGenConnectorOptionsUtil.KIND,
-DataGenConnectorOptionsUtil.SEQUENCE);
-descriptor.putLong(
-DataGenConnectorOptionsUtil.FIELDS
-+ ".f0."
-+ DataGenConnectorOptionsUtil.END,
-100);
+void testDefaultValueForSequence() {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] xuzhiwen1255 commented on pull request #21971: [FLINK-31084][connectors/dataGen] Add default value for dataGen seque…

2023-05-14 Thread via GitHub


xuzhiwen1255 commented on PR #21971:
URL: https://github.com/apache/flink/pull/21971#issuecomment-1546911786

   @XComp Thanks for your review; I have finished the changes. PTAL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] xuzhiwen1255 commented on a diff in pull request #21971: [FLINK-31084][connectors/dataGen] Add default value for dataGen seque…

2023-05-14 Thread via GitHub


xuzhiwen1255 commented on code in PR #21971:
URL: https://github.com/apache/flink/pull/21971#discussion_r1193156464


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/datagen/SequenceGenerator.java:
##
@@ -49,10 +50,18 @@
 /**
  * Creates a DataGenerator that emits all numbers from the given interval 
exactly once.
  *
+ * It requires that the end be greater than the start and that the
+ * total number of values cannot be greater than max - 1.
+ *

Review Comment:
   Done.



##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/datagen/SequenceGenerator.java:
##
@@ -111,11 +120,20 @@ public boolean hasNext() {
 return !this.valuesToEmit.isEmpty();
 }
 
-private static int safeDivide(long left, long right) {
-Preconditions.checkArgument(right > 0);
-Preconditions.checkArgument(left >= 0);
-Preconditions.checkArgument(left <= Integer.MAX_VALUE * right);
-return (int) (left / right);
+private static long safeDivide(long totalRows, long stepSize) {
+Preconditions.checkArgument(stepSize > 0, "stepSize must be greater than 0");
+Preconditions.checkArgument(totalRows >= 0, "totalRows cannot be less than 0");
+return totalRows / stepSize;
+}

Review Comment:
   Done.
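For reference, a minimal standalone sketch of the revised method (not the actual Flink class; the `Preconditions` calls are replaced by plain checks so it is self-contained): returning `long` and dropping the old int-range precondition means sequences larger than `Integer.MAX_VALUE` rows no longer fail the argument check.

```java
// Sketch only, assuming the semantics shown in the diff above.
public class SafeDivideSketch {
    static long safeDivide(long totalRows, long stepSize) {
        if (stepSize <= 0) {
            throw new IllegalArgumentException("stepSize must be greater than 0");
        }
        if (totalRows < 0) {
            throw new IllegalArgumentException("totalRows cannot be less than 0");
        }
        // Integer division; callers get the number of full steps.
        return totalRows / stepSize;
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(100L, 4L));            // prints 25
        System.out.println(safeDivide(10_000_000_000L, 3L)); // prints 3333333333
    }
}
```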



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-32092) Integrate snapshot file-merging with existing IT cases

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32092:
-
Affects Version/s: 1.18.0

> Integrate snapshot file-merging with existing IT cases
> --
>
> Key: FLINK-32092
> URL: https://issues.apache.org/jira/browse/FLINK-32092
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Rui Xia
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32092) Integrate snapshot file-merging with existing IT cases

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32092:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Integrate snapshot file-merging with existing IT cases
> --
>
> Key: FLINK-32092
> URL: https://issues.apache.org/jira/browse/FLINK-32092
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Rui Xia
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32091) Add necessary metrics for file-merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32091:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Add necessary metrics for file-merging
> --
>
> Key: FLINK-32091
> URL: https://issues.apache.org/jira/browse/FLINK-32091
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32091) Add necessary metrics for file-merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32091:
-
Affects Version/s: 1.18.0

> Add necessary metrics for file-merging
> --
>
> Key: FLINK-32091
> URL: https://issues.apache.org/jira/browse/FLINK-32091
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32090) Python API for enabling and configuring file merging snapshot

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32090:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Python API for enabling and configuring file merging snapshot
> -
>
> Key: FLINK-32090
> URL: https://issues.apache.org/jira/browse/FLINK-32090
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32087) Space amplification statistics of file merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32087:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Space amplification statistics of file merging
> --
>
> Key: FLINK-32087
> URL: https://issues.apache.org/jira/browse/FLINK-32087
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Rui Xia
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32090) Python API for enabling and configuring file merging snapshot

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32090:
-
Affects Version/s: 1.18.0

> Python API for enabling and configuring file merging snapshot
> -
>
> Key: FLINK-32090
> URL: https://issues.apache.org/jira/browse/FLINK-32090
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32089) Do fast copy in best-effort during first checkpoint after restoration

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32089:
-
Affects Version/s: 1.18.0

> Do fast copy in best-effort during first checkpoint after restoration
> -
>
> Key: FLINK-32089
> URL: https://issues.apache.org/jira/browse/FLINK-32089
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32089) Do fast copy in best-effort during first checkpoint after restoration

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32089:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Do fast copy in best-effort during first checkpoint after restoration
> -
>
> Key: FLINK-32089
> URL: https://issues.apache.org/jira/browse/FLINK-32089
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32088) Re-uploading in state file-merging for space amplification control

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32088:
-
Affects Version/s: 1.18.0

> Re-uploading in state file-merging for space amplification control
> --
>
> Key: FLINK-32088
> URL: https://issues.apache.org/jira/browse/FLINK-32088
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Han Yin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32087) Space amplification statistics of file merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32087:
-
Affects Version/s: 1.18.0

> Space amplification statistics of file merging
> --
>
> Key: FLINK-32087
> URL: https://issues.apache.org/jira/browse/FLINK-32087
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Rui Xia
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32088) Re-uploading in state file-merging for space amplification control

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32088:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Re-uploading in state file-merging for space amplification control
> --
>
> Key: FLINK-32088
> URL: https://issues.apache.org/jira/browse/FLINK-32088
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Han Yin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32086) Cleanup non-reported managed directory on exit of TM

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32086:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Cleanup non-reported managed directory on exit of TM
> 
>
> Key: FLINK-32086
> URL: https://issues.apache.org/jira/browse/FLINK-32086
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32086) Cleanup non-reported managed directory on exit of TM

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32086:
-
Affects Version/s: 1.18.0

> Cleanup non-reported managed directory on exit of TM
> 
>
> Key: FLINK-32086
> URL: https://issues.apache.org/jira/browse/FLINK-32086
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32085) Implement and migrate batch uploading in changelog files into the file merging framework

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32085:
-
Affects Version/s: 1.18.0

> Implement and migrate batch uploading in changelog files into the file 
> merging framework
> 
>
> Key: FLINK-32085
> URL: https://issues.apache.org/jira/browse/FLINK-32085
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32085) Implement and migrate batch uploading in changelog files into the file merging framework

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32085:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Implement and migrate batch uploading in changelog files into the file 
> merging framework
> 
>
> Key: FLINK-32085
> URL: https://issues.apache.org/jira/browse/FLINK-32085
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32083) Chinese translation of documentation of checkpoint file-merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32083:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Chinese translation of documentation of checkpoint file-merging
> ---
>
> Key: FLINK-32083
> URL: https://issues.apache.org/jira/browse/FLINK-32083
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32084) Migrate current file merging of channel state into the file merging framework

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32084:
-
Affects Version/s: 1.18.0

> Migrate current file merging of channel state into the file merging framework
> -
>
> Key: FLINK-32084
> URL: https://issues.apache.org/jira/browse/FLINK-32084
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32084) Migrate current file merging of channel state into the file merging framework

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32084:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Migrate current file merging of channel state into the file merging framework
> -
>
> Key: FLINK-32084
> URL: https://issues.apache.org/jira/browse/FLINK-32084
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32083) Chinese translation of documentation of checkpoint file-merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32083:
-
Affects Version/s: 1.18.0

> Chinese translation of documentation of checkpoint file-merging
> ---
>
> Key: FLINK-32083
> URL: https://issues.apache.org/jira/browse/FLINK-32083
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Hangxiang Yu
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32082) Documentation of checkpoint file-merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32082:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Documentation of checkpoint file-merging
> 
>
> Key: FLINK-32082
> URL: https://issues.apache.org/jira/browse/FLINK-32082
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32082) Documentation of checkpoint file-merging

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32082:
-
Affects Version/s: 1.18.0

> Documentation of checkpoint file-merging
> 
>
> Key: FLINK-32082
> URL: https://issues.apache.org/jira/browse/FLINK-32082
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


TanYuxin-tyx commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193122906


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/CacheFlushManager.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+import 
org.apache.flink.runtime.io.network.partition.hybrid.tiered.common.TieredStorageUtils;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.FatalExitExceptionHandler;
+
+import 
org.apache.flink.shaded.guava30.com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * A manager to check whether the cached buffers need to be flushed.
+ *
+ * Cached buffers refer to the buffers stored in a tier that can be 
released by flushing them to
+ * the disk or remote storage. If the requested buffers from the buffer pool 
reach the ratio limit,
+ * {@link CacheFlushManager} will trigger a process wherein all these buffers 
are flushed to the
+ * disk or remote storage to recycle these buffers.
+ *
+ * Note that when initializing a tier with cached buffers, the tier should 
register a {@link
+ * CacheBufferFlushTrigger}, as a listener to flush cached buffers.
+ */
+public class CacheFlushManager {

Review Comment:
   I have merged `TieredStorageMemoryManager` and `CacheFlushManager` into one 
class.
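The ratio-triggered flush described in the javadoc above can be sketched as follows (a toy illustration under stated assumptions, not the Flink implementation; the real class drives the check from a scheduled executor, which is omitted here so the logic is directly testable):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: notify registered flush triggers once requested buffers
// reach a configured ratio of the buffer pool.
public class FlushManagerSketch {
    interface FlushTrigger { void notifyFlush(); }

    private final float flushRatio;
    private final List<FlushTrigger> triggers = new ArrayList<>();

    FlushManagerSketch(float flushRatio) { this.flushRatio = flushRatio; }

    void register(FlushTrigger t) { triggers.add(t); }

    // Returns true if the usage ratio reached the limit and triggers fired.
    boolean checkShouldFlush(int requestedBuffers, int totalBuffers) {
        if (totalBuffers > 0 && (float) requestedBuffers / totalBuffers >= flushRatio) {
            triggers.forEach(FlushTrigger::notifyFlush);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FlushManagerSketch m = new FlushManagerSketch(0.8f);
        m.register(() -> System.out.println("flushing cached buffers"));
        System.out.println(m.checkShouldFlush(4, 10)); // prints false
        System.out.println(m.checkShouldFlush(9, 10)); // flushes, prints true
    }
}
```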



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


TanYuxin-tyx commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193122840


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/CacheFlushManager.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+import 
org.apache.flink.runtime.io.network.partition.hybrid.tiered.common.TieredStorageUtils;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.FatalExitExceptionHandler;
+
+import 
org.apache.flink.shaded.guava30.com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * A manager to check whether the cached buffers need to be flushed.
+ *
+ * Cached buffers refer to the buffers stored in a tier that can be 
released by flushing them to
+ * the disk or remote storage. If the requested buffers from the buffer pool 
reach the ratio limit,
+ * {@link CacheFlushManager} will trigger a process wherein all these buffers 
are flushed to the
+ * disk or remote storage to recycle these buffers.
+ *
+ * Note that when initializing a tier with cached buffers, the tier should 
register a {@link
+ * CacheBufferFlushTrigger}, as a listener to flush cached buffers.
+ */
+public class CacheFlushManager {
+private final float numBuffersTriggerFlushRatio;
+
+private final List flushTriggers;
+
+private TieredStorageMemoryManager storageMemoryManager;
+
+private final ScheduledExecutorService executor =
+Executors.newSingleThreadScheduledExecutor(
+new ThreadFactoryBuilder()
+.setNameFormat("cache flush trigger")
+
.setUncaughtExceptionHandler(FatalExitExceptionHandler.INSTANCE)
+.build());
+
+public CacheFlushManager(float numBuffersTriggerFlushRatio) {
+this.numBuffersTriggerFlushRatio = numBuffersTriggerFlushRatio;
+this.flushTriggers = new ArrayList<>();
+
+executor.scheduleWithFixedDelay(
+this::checkNeedTriggerFlushCachedBuffers, 10, 50, 
TimeUnit.MILLISECONDS);

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


TanYuxin-tyx commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193122816


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/CacheFlushManager.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+import 
org.apache.flink.runtime.io.network.partition.hybrid.tiered.common.TieredStorageUtils;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.FatalExitExceptionHandler;
+
+import 
org.apache.flink.shaded.guava30.com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * A manager to check whether the cached buffers need to be flushed.
+ *
+ * Cached buffers refer to the buffers stored in a tier that can be 
released by flushing them to
+ * the disk or remote storage. If the requested buffers from the buffer pool 
reach the ratio limit,
+ * {@link CacheFlushManager} will trigger a process wherein all these buffers 
are flushed to the
+ * disk or remote storage to recycle these buffers.
+ *
+ * Note that when initializing a tier with cached buffers, the tier should 
register a {@link
+ * CacheBufferFlushTrigger}, as a listener to flush cached buffers.
+ */
+public class CacheFlushManager {
+private final float numBuffersTriggerFlushRatio;
+
+private final List flushTriggers;
+
+private TieredStorageMemoryManager storageMemoryManager;
+
+private final ScheduledExecutorService executor =
+Executors.newSingleThreadScheduledExecutor(
+new ThreadFactoryBuilder()
+.setNameFormat("cache flush trigger")
+
.setUncaughtExceptionHandler(FatalExitExceptionHandler.INSTANCE)
+.build());
+
+public CacheFlushManager(float numBuffersTriggerFlushRatio) {
+this.numBuffersTriggerFlushRatio = numBuffersTriggerFlushRatio;
+this.flushTriggers = new ArrayList<>();
+
+executor.scheduleWithFixedDelay(
+this::checkNeedTriggerFlushCachedBuffers, 10, 50, 
TimeUnit.MILLISECONDS);
+}
+
+public void setup(TieredStorageMemoryManager storageMemoryManager) {
+this.storageMemoryManager = storageMemoryManager;
+}
+
+public void registerCacheBufferFlushTrigger(CacheBufferFlushTrigger 
cacheBufferFlushTrigger) {
+flushTriggers.add(cacheBufferFlushTrigger);
+}
+
+public void triggerFlushCachedBuffers() {
+
flushTriggers.forEach(CacheBufferFlushTrigger::notifyFlushCachedBuffers);
+}
+
+public void checkNeedTriggerFlushCachedBuffers() {

Review Comment:
   Fixed it, and I renamed it to `checkShouldFlushCachedBuffers`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


TanYuxin-tyx commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193122680


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/CacheFlushManager.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+import 
org.apache.flink.runtime.io.network.partition.hybrid.tiered.common.TieredStorageUtils;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.FatalExitExceptionHandler;
+
+import 
org.apache.flink.shaded.guava30.com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * A manager to check whether the cached buffers need to be flushed.
+ *
+ * <p>Cached buffers refer to the buffers stored in a tier that can be released by flushing them to
+ * the disk or remote storage. If the requested buffers from the buffer pool reach the ratio limit,
+ * {@link CacheFlushManager} will trigger a process wherein all these buffers are flushed to the
+ * disk or remote storage to recycle these buffers.
+ *
+ * <p>Note that when initializing a tier with cached buffers, the tier should register a {@link
+ * CacheBufferFlushTrigger} as a listener to flush cached buffers.
+ */
+public class CacheFlushManager {
+private final float numBuffersTriggerFlushRatio;
+
+private final List<CacheBufferFlushTrigger> flushTriggers;
+
+private TieredStorageMemoryManager storageMemoryManager;
+
+private final ScheduledExecutorService executor =
+Executors.newSingleThreadScheduledExecutor(
+new ThreadFactoryBuilder()
+.setNameFormat("cache flush trigger")
+
.setUncaughtExceptionHandler(FatalExitExceptionHandler.INSTANCE)
+.build());
+
+public CacheFlushManager(float numBuffersTriggerFlushRatio) {
+this.numBuffersTriggerFlushRatio = numBuffersTriggerFlushRatio;
+this.flushTriggers = new ArrayList<>();
+
+executor.scheduleWithFixedDelay(
+this::checkNeedTriggerFlushCachedBuffers, 10, 50, 
TimeUnit.MILLISECONDS);
+}
+
+public void setup(TieredStorageMemoryManager storageMemoryManager) {
+this.storageMemoryManager = storageMemoryManager;
+}
+
+public void registerCacheBufferFlushTrigger(CacheBufferFlushTrigger 
cacheBufferFlushTrigger) {
+flushTriggers.add(cacheBufferFlushTrigger);
+}
+
+public void triggerFlushCachedBuffers() {
+
flushTriggers.forEach(CacheBufferFlushTrigger::notifyFlushCachedBuffers);
+}
+
+public void checkNeedTriggerFlushCachedBuffers() {
+if (storageMemoryManager == null) {
+return;
+}
+
+if (TieredStorageUtils.needFlushCacheBuffers(
+storageMemoryManager, numBuffersTriggerFlushRatio)) {
+triggerFlushCachedBuffers();
+}
+}
+
+public int numFlushTriggers() {

Review Comment:
   Fixed.
   We have removed the two APIs `numTotalBuffers` and `numRequestedBuffers` 
because we have merged `TieredStorageMemoryManager` and `CacheFlushManager`, so 
we no longer need to expose them.
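The mechanism the quoted javadoc describes, notifying registered listeners once the requested buffers reach a configured share of the total, can be sketched in isolation as follows. This is a simplified standalone version for illustration only; `FlushRatioSketch`, `registerTrigger`, `shouldFlush`, and `checkAndFlush` are made-up names, not the Flink API:

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone sketch of a ratio-based cache flush trigger (illustrative, not Flink code). */
public class FlushRatioSketch {
    private final float triggerRatio;
    private final List<Runnable> flushTriggers = new ArrayList<>();

    public FlushRatioSketch(float triggerRatio) {
        this.triggerRatio = triggerRatio;
    }

    /** Registers a listener to be notified when cached buffers should be flushed. */
    public void registerTrigger(Runnable trigger) {
        flushTriggers.add(trigger);
    }

    /** True once the requested buffers reach the configured share of total buffers. */
    public boolean shouldFlush(int numRequested, int numTotal) {
        return numTotal > 0 && (float) numRequested / numTotal >= triggerRatio;
    }

    /** Notifies all registered listeners if the ratio limit has been reached. */
    public boolean checkAndFlush(int numRequested, int numTotal) {
        if (!shouldFlush(numRequested, numTotal)) {
            return false;
        }
        flushTriggers.forEach(Runnable::run);
        return true;
    }
}
```

In the PR itself this check runs periodically on a scheduled executor and the counts come from the memory manager; here the check is exposed as a plain method so the ratio logic can be exercised directly.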






[jira] [Updated] (FLINK-32081) Compatibility between file-merging on and off across job runs

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32081:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Compatibility between file-merging on and off across job runs
> -
>
> Key: FLINK-32081
> URL: https://issues.apache.org/jira/browse/FLINK-32081
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Han Yin
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32081) Compatibility between file-merging on and off across job runs

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32081:
-
Affects Version/s: 1.18.0

> Compatibility between file-merging on and off across job runs
> -
>
> Key: FLINK-32081
> URL: https://issues.apache.org/jira/browse/FLINK-32081
> Project: Flink
>  Issue Type: Sub-task
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Han Yin
>Priority: Major
> Fix For: 1.18.0
>
>






[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


TanYuxin-tyx commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193122499


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/CacheFlushManager.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+import 
org.apache.flink.runtime.io.network.partition.hybrid.tiered.common.TieredStorageUtils;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.FatalExitExceptionHandler;
+
+import 
org.apache.flink.shaded.guava30.com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+
+/**
+ * A manager to check whether the cached buffers need to be flushed.
+ *
+ * <p>Cached buffers refer to the buffers stored in a tier that can be released by flushing them to
+ * the disk or remote storage. If the requested buffers from the buffer pool reach the ratio limit,
+ * {@link CacheFlushManager} will trigger a process wherein all these buffers are flushed to the
+ * disk or remote storage to recycle these buffers.
+ *
+ * <p>Note that when initializing a tier with cached buffers, the tier should register a {@link
+ * CacheBufferFlushTrigger} as a listener to flush cached buffers.
+ */
+public class CacheFlushManager {
+private final float numBuffersTriggerFlushRatio;
+
+private final List<CacheBufferFlushTrigger> flushTriggers;
+
+private TieredStorageMemoryManager storageMemoryManager;
+
+private final ScheduledExecutorService executor =
+Executors.newSingleThreadScheduledExecutor(
+new ThreadFactoryBuilder()
+.setNameFormat("cache flush trigger")
+
.setUncaughtExceptionHandler(FatalExitExceptionHandler.INSTANCE)
+.build());
+
+public CacheFlushManager(float numBuffersTriggerFlushRatio) {
+this.numBuffersTriggerFlushRatio = numBuffersTriggerFlushRatio;
+this.flushTriggers = new ArrayList<>();
+
+executor.scheduleWithFixedDelay(
+this::checkNeedTriggerFlushCachedBuffers, 10, 50, 
TimeUnit.MILLISECONDS);
+}
+
+public void setup(TieredStorageMemoryManager storageMemoryManager) {
+this.storageMemoryManager = storageMemoryManager;
+}
+
+public void registerCacheBufferFlushTrigger(CacheBufferFlushTrigger 
cacheBufferFlushTrigger) {
+flushTriggers.add(cacheBufferFlushTrigger);
+}
+
+public void triggerFlushCachedBuffers() {

Review Comment:
   Fixed it, and I renamed it to `checkShouldFlushCachedBuffers`.






[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


TanYuxin-tyx commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193122339


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/CacheBufferFlushTrigger.java:
##
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage;
+
+/** Notify the specific tier to try flushing the cached buffers. */
+public interface CacheBufferFlushTrigger {

Review Comment:
   Removed it.






[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22352: [FLINK-31639][network] Introduce tiered store memory manager

2023-05-14 Thread via GitHub


TanYuxin-tyx commented on code in PR #22352:
URL: https://github.com/apache/flink/pull/22352#discussion_r1193122290


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/common/TieredStorageUtils.java:
##
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid.tiered.common;
+
+import 
org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManager;
+
+/** Utils for reading or writing to tiered store. */
+public class TieredStorageUtils {
+public static boolean needFlushCacheBuffers(
+TieredStorageMemoryManager storageMemoryManager, float 
numBuffersTriggerFlushRatio) {
+int numTotal = storageMemoryManager.numTotalBuffers();
+int numRequested = storageMemoryManager.numRequestedBuffers();

Review Comment:
   We have removed the two APIs `numTotalBuffers` and `numRequestedBuffers` 
because we have merged `TieredStorageMemoryManager` and `CacheFlushManager`, so 
we no longer need to expose them.
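The quoted constructor wires this check into a single-threaded scheduler via `scheduleWithFixedDelay(..., 10, 50, TimeUnit.MILLISECONDS)`. A minimal standalone version of that pattern follows; `PeriodicFlushCheck`, `needFlush`, and `flushAction` are illustrative names, not the Flink API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Standalone sketch: periodically re-evaluate a condition and fire a flush action. */
public class PeriodicFlushCheck implements AutoCloseable {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    public PeriodicFlushCheck(BooleanSupplier needFlush, Runnable flushAction) {
        // Same cadence as the quoted code: first check after 10 ms, then 50 ms
        // after each previous check completes.
        executor.scheduleWithFixedDelay(
                () -> {
                    if (needFlush.getAsBoolean()) {
                        flushAction.run();
                    }
                },
                10, 50, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

`scheduleWithFixedDelay` measures the delay from the end of the previous run rather than its start, so a slow flush cannot cause overlapping checks to pile up, which is presumably why the quoted code uses it instead of `scheduleAtFixedRate`.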






[jira] [Updated] (FLINK-32080) Restoration of FileMergingSnapshotManager

2023-05-14 Thread Yuan Mei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuan Mei updated FLINK-32080:
-
Component/s: Runtime / Checkpointing
 Runtime / State Backends

> Restoration of FileMergingSnapshotManager
> -
>
> Key: FLINK-32080
> URL: https://issues.apache.org/jira/browse/FLINK-32080
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
> Fix For: 1.18.0
>
>





