[jira] [Commented] (FLINK-29557) The SinkOperator's OutputFormat function is not recognized

2022-10-26 Thread Yun Gao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624867#comment-17624867
 ] 

Yun Gao commented on FLINK-29557:
-

Hi [~aitozi], LGTM, many thanks for tracking down the issue. Would you like to open a 
PR to fix it? Either side is OK to me. 

> The SinkOperator's OutputFormat function is not recognized
> --
>
> Key: FLINK-29557
> URL: https://issues.apache.org/jira/browse/FLINK-29557
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Table SQL / API
>Reporter: Aitozi
>Priority: Major
>
> In {{SimpleOperatorFactory#of}}, only {{StreamSink}} is handled and 
> registered as a {{SimpleOutputFormatOperatorFactory}}, so the output 
> format information in {{SinkOperator}} is lost. As a result, hook functions like 
> {{FinalizeOnMaster}} never get a chance to be executed.
> Because {{SinkOperator}} lives in the table module, it can't be referenced 
> directly from {{flink-streaming-java}}. So maybe we need to introduce an extra 
> common class, e.g. {{SinkFunctionOperator}}, to describe the {{Sink}} operator 
> and handle it individually.
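
A rough sketch of the dispatch described above (class and method shapes are simplified assumptions, not the exact Flink source):

{code:java}
// Sketch of SimpleOperatorFactory#of as described in this issue.
public static <OUT> SimpleOperatorFactory<OUT> of(StreamOperator<OUT> operator) {
    if (operator instanceof StreamSink
            && ((StreamSink<?>) operator).getUserFunction()
                    instanceof OutputFormatSinkFunction) {
        // Only this branch preserves the OutputFormat, so master hooks such as
        // FinalizeOnMaster can later be invoked for it.
        return new SimpleOutputFormatOperatorFactory<>((StreamSink<OUT>) operator);
    }
    // The table runtime's SinkOperator falls through to the generic branch,
    // and its OutputFormat information is dropped.
    return new SimpleOperatorFactory<>(operator);
}
{code}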





[GitHub] [flink-kubernetes-operator] testjudyzhu opened a new pull request, #416: Update architecture.md

2022-10-26 Thread GitBox


testjudyzhu opened a new pull request, #416:
URL: https://github.com/apache/flink-kubernetes-operator/pull/416

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request adds a new feature to periodically create 
and maintain savepoints through the `FlinkDeployment` custom resource.)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *Periodic savepoint trigger is introduced to the custom resource*
 - *The operator checks on reconciliation whether the required time has 
passed*
 - *The JobManager's dispose savepoint API is used to clean up obsolete 
savepoints*
   
   ## Verifying this change
   
   *(Please pick one of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., are there any changes to the `CustomResourceDescriptors`: 
(yes / no)
 - Core observer or reconciler logic that is regularly executed: (yes / no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[GitHub] [flink] gaoyunhaii commented on pull request #21077: [FLINK-29248] Add Scala Async Retry Strategies and ResultPredicates Helper Classes

2022-10-26 Thread GitBox


gaoyunhaii commented on PR #21077:
URL: https://github.com/apache/flink/pull/21077#issuecomment-1293023382

   Thanks @ericxiao251 for the PR! I'll have a look at the PR today. 





[jira] [Comment Edited] (FLINK-29157) Clarify the contract between CompletedCheckpointStore and SharedStateRegistry

2022-10-26 Thread Congxian Qiu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624825#comment-17624825
 ] 

Congxian Qiu edited comment on FLINK-29157 at 10/27/22 5:22 AM:


merged into master 63767c5ed91642c67f97d9f16ff2b8955f9ae421

1.16 be2bd93838548f7858baecf5e8beb469836081d5


was (Author: klion26):
merged into master 63767c5ed91642c67f97d9f16ff2b8955f9ae421

> Clarify the contract between CompletedCheckpointStore and SharedStateRegistry
> -
>
> Key: FLINK-29157
> URL: https://issues.apache.org/jira/browse/FLINK-29157
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Roman Khachatryan
>Assignee: Yanfei Lei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.15.3, 1.16.1
>
>
> After FLINK-24611, CompletedCheckpointStore is required to call 
> SharedStateRegistry.unregisterUnusedState() on checkpoint subsumption and 
> shutdown.
> Although it's not clear whether CompletedCheckpointStore is internal, there 
> are in fact external implementations (which weren't updated accordingly).
>  
> After FLINK-25872, CompletedCheckpointStore must also call 
> checkpointsCleaner.cleanSubsumedCheckpoints.
>  
> Another issue with a custom implementation was that it used different Java 
> objects for the state of the CheckpointStore and the SharedStateRegistry (after FLINK-24086). 
>  
> So it makes sense to:
>  * clarify the contract (different in 1.15 and 1.16)
>  * require using the same checkpoint objects by SharedStateRegistryFactory 
> and CompletedCheckpointStore
>  * mark the interface(s) as PublicEvolving
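
A minimal sketch of that contract, using simplified stand-in interfaces (the names follow this description, not the exact Flink APIs):

{code:java}
// Stand-in interfaces for illustration only; not the real Flink types.
interface SharedStateRegistry {
    void unregisterUnusedState(long lowestRetainedCheckpointId);
}

interface CheckpointsCleaner {
    void cleanSubsumedCheckpoints(long upToCheckpointId);
}

// A custom CompletedCheckpointStore honoring the contract described above.
class CustomCompletedCheckpointStore {
    private final SharedStateRegistry sharedStateRegistry;
    private final CheckpointsCleaner checkpointsCleaner;

    CustomCompletedCheckpointStore(
            SharedStateRegistry sharedStateRegistry, CheckpointsCleaner checkpointsCleaner) {
        this.sharedStateRegistry = sharedStateRegistry;
        this.checkpointsCleaner = checkpointsCleaner;
    }

    // Must run whenever older checkpoints are subsumed by a newer one.
    void onCheckpointSubsumed(long lowestRetainedCheckpointId) {
        // Required after FLINK-24611: release shared state no longer referenced.
        sharedStateRegistry.unregisterUnusedState(lowestRetainedCheckpointId);
        // Required after FLINK-25872: physically discard the subsumed checkpoints.
        checkpointsCleaner.cleanSubsumedCheckpoints(lowestRetainedCheckpointId);
    }

    // The same unregistration obligation applies on shutdown.
    void onShutdown(long lowestRetainedCheckpointId) {
        sharedStateRegistry.unregisterUnusedState(lowestRetainedCheckpointId);
    }
}
{code}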





[jira] [Updated] (FLINK-29157) Clarify the contract between CompletedCheckpointStore and SharedStateRegistry

2022-10-26 Thread Congxian Qiu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Congxian Qiu updated FLINK-29157:
-
Fix Version/s: (was: 1.15.3)

> Clarify the contract between CompletedCheckpointStore and SharedStateRegistry
> -
>
> Key: FLINK-29157
> URL: https://issues.apache.org/jira/browse/FLINK-29157
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Roman Khachatryan
>Assignee: Yanfei Lei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.16.1
>
>
> After FLINK-24611, CompletedCheckpointStore is required to call 
> SharedStateRegistry.unregisterUnusedState() on checkpoint subsumption and 
> shutdown.
> Although it's not clear whether CompletedCheckpointStore is internal, there 
> are in fact external implementations (which weren't updated accordingly).
>  
> After FLINK-25872, CompletedCheckpointStore must also call 
> checkpointsCleaner.cleanSubsumedCheckpoints.
>  
> Another issue with a custom implementation was that it used different Java 
> objects for the state of the CheckpointStore and the SharedStateRegistry (after FLINK-24086). 
>  
> So it makes sense to:
>  * clarify the contract (different in 1.15 and 1.16)
>  * require using the same checkpoint objects by SharedStateRegistryFactory 
> and CompletedCheckpointStore
>  * mark the interface(s) as PublicEvolving





[jira] [Commented] (FLINK-29315) HDFSTest#testBlobServerRecovery fails on CI

2022-10-26 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624861#comment-17624861
 ] 

Yang Wang commented on FLINK-29315:
---

I will have a look at the CI machine Alibaba001 today.

> HDFSTest#testBlobServerRecovery fails on CI
> ---
>
> Key: FLINK-29315
> URL: https://issues.apache.org/jira/browse/FLINK-29315
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / FileSystem, Tests
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Chesnay Schepler
>Priority: Blocker
>  Labels: pull-request-available
>
> The test started failing 2 days ago on different branches. I suspect 
> something's wrong with the CI infrastructure.
> {code:java}
> Sep 15 09:11:22 [ERROR] Failures: 
> Sep 15 09:11:22 [ERROR]   HDFSTest.testBlobServerRecovery Multiple Failures 
> (2 failures)
> Sep 15 09:11:22   java.lang.AssertionError: Test failed Error while 
> running command to get file permissions : java.io.IOException: Cannot run 
> program "ls": error=1, Operation not permitted
> Sep 15 09:11:22   at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell.runCommand(Shell.java:913)
> Sep 15 09:11:22   at org.apache.hadoop.util.Shell.run(Shell.java:869)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell.execCommand(Shell.java:1264)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell.execCommand(Shell.java:1246)
> Sep 15 09:11:22   at 
> org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1089)
> Sep 15 09:11:22   at 
> org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:697)
> Sep 15 09:11:22   at 
> org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:672)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:233)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:141)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2580)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2622)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2604)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2497)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1501)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:851)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:485)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444)
> Sep 15 09:11:22   at 
> org.apache.flink.hdfstests.HDFSTest.createHDFS(HDFSTest.java:93)
> Sep 15 09:11:22   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Sep 15 09:11:22   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Sep 15 09:11:22   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Sep 15 09:11:22   at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
> Sep 15 09:11:22   ... 67 more
> Sep 15 09:11:22 
> Sep 15 09:11:22   java.lang.NullPointerException: 
> {code}





[GitHub] [flink] flinkbot commented on pull request #21169: [WIP][FLINK-29770]hbase connector supports out of order data

2022-10-26 Thread GitBox


flinkbot commented on PR #21169:
URL: https://github.com/apache/flink/pull/21169#issuecomment-1293001756

   
   ## CI report:
   
   * 877470a67dacef1b3bfdb1ba1308fffeeca3c739 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-29770) hbase connector supports out-of-order data

2022-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-29770:
---
Labels: pull-request-available  (was: )

> hbase connector supports out-of-order data
> --
>
> Key: FLINK-29770
> URL: https://issues.apache.org/jira/browse/FLINK-29770
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / HBase
>Reporter: Bo Cui
>Priority: Major
>  Labels: pull-request-available
>
> The data may be out of order and have no timestamps. As a result, the data 
> written to HBase is incorrect.





[GitHub] [flink] cuibo01 opened a new pull request, #21169: [WIP][FLINK-29770]hbase connector supports out of order data

2022-10-26 Thread GitBox


cuibo01 opened a new pull request, #21169:
URL: https://github.com/apache/flink/pull/21169

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follow the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick one of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[jira] [Commented] (FLINK-29427) LookupJoinITCase failed with classloader problem

2022-10-26 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624852#comment-17624852
 ] 

Yun Tang commented on FLINK-29427:
--

Another instance: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42466=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4

> LookupJoinITCase failed with classloader problem
> 
>
> Key: FLINK-29427
> URL: https://issues.apache.org/jira/browse/FLINK-29427
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Assignee: Alexander Smirnov
>Priority: Critical
>  Labels: test-stability
>
> {code:java}
> 2022-09-27T02:49:20.9501313Z Sep 27 02:49:20 Caused by: 
> org.codehaus.janino.InternalCompilerException: Compiling 
> "KeyProjection$108341": Trying to access closed classloader. Please check if 
> you store classloaders directly or indirectly in static fields. If the 
> stacktrace suggests that the leak occurs in a third party library and cannot 
> be fixed immediately, you can disable this check with the configuration 
> 'classloader.check-leaked-classloader'.
> 2022-09-27T02:49:20.9502654Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:382)
> 2022-09-27T02:49:20.9503366Z Sep 27 02:49:20  at 
> org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:237)
> 2022-09-27T02:49:20.9504044Z Sep 27 02:49:20  at 
> org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:465)
> 2022-09-27T02:49:20.9504704Z Sep 27 02:49:20  at 
> org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:216)
> 2022-09-27T02:49:20.9505341Z Sep 27 02:49:20  at 
> org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:207)
> 2022-09-27T02:49:20.9505965Z Sep 27 02:49:20  at 
> org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80)
> 2022-09-27T02:49:20.9506584Z Sep 27 02:49:20  at 
> org.codehaus.commons.compiler.Cookable.cook(Cookable.java:75)
> 2022-09-27T02:49:20.9507261Z Sep 27 02:49:20  at 
> org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:104)
> 2022-09-27T02:49:20.9507883Z Sep 27 02:49:20  ... 30 more
> 2022-09-27T02:49:20.9509266Z Sep 27 02:49:20 Caused by: 
> java.lang.IllegalStateException: Trying to access closed classloader. Please 
> check if you store classloaders directly or indirectly in static fields. If 
> the stacktrace suggests that the leak occurs in a third party library and 
> cannot be fixed immediately, you can disable this check with the 
> configuration 'classloader.check-leaked-classloader'.
> 2022-09-27T02:49:20.9510835Z Sep 27 02:49:20  at 
> org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:184)
> 2022-09-27T02:49:20.9511760Z Sep 27 02:49:20  at 
> org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:192)
> 2022-09-27T02:49:20.9512456Z Sep 27 02:49:20  at 
> java.lang.Class.forName0(Native Method)
> 2022-09-27T02:49:20.9513014Z Sep 27 02:49:20  at 
> java.lang.Class.forName(Class.java:348)
> 2022-09-27T02:49:20.9513649Z Sep 27 02:49:20  at 
> org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:89)
> 2022-09-27T02:49:20.9514339Z Sep 27 02:49:20  at 
> org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:312)
> 2022-09-27T02:49:20.9514990Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:8556)
> 2022-09-27T02:49:20.9515659Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:6749)
> 2022-09-27T02:49:20.9516337Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:6594)
> 2022-09-27T02:49:20.9516989Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6573)
> 2022-09-27T02:49:20.9517632Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler.access$13900(UnitCompiler.java:215)
> 2022-09-27T02:49:20.9518319Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler$22$1.visitReferenceType(UnitCompiler.java:6481)
> 2022-09-27T02:49:20.9519018Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler$22$1.visitReferenceType(UnitCompiler.java:6476)
> 2022-09-27T02:49:20.9519680Z Sep 27 02:49:20  at 
> org.codehaus.janino.Java$ReferenceType.accept(Java.java:3928)
> 2022-09-27T02:49:20.9520386Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6476)
> 2022-09-27T02:49:20.9521042Z Sep 27 02:49:20  at 
> org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6469)
> 2022-09-27T02:49:20.9521677Z Sep 27 02:49:20  at 
> 
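
The error message itself names the escape hatch. A hedged sketch of applying it to a test cluster configuration (this silences the check; it does not fix the underlying leak):

{code:java}
import org.apache.flink.configuration.Configuration;

public class DisableLeakCheck {
    // Build a configuration with the leaked-classloader check disabled,
    // as suggested by the exception text above.
    public static Configuration testConfiguration() {
        Configuration conf = new Configuration();
        conf.setString("classloader.check-leaked-classloader", "false");
        return conf;
    }
}
{code}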

[GitHub] [flink] Myasuka commented on pull request #20917: [hotfix][doc] Refine code examples in data_stream_api

2022-10-26 Thread GitBox


Myasuka commented on PR #20917:
URL: https://github.com/apache/flink/pull/20917#issuecomment-1292986632

   The CI failed due to FLINK-29427
   
   @flinkbot run azure





[GitHub] [flink] reswqa commented on pull request #21137: [FLINK-29234][runtime] JobMasterServiceLeadershipRunner handle leader event in a separate executor to avoid dead lock

2022-10-26 Thread GitBox


reswqa commented on PR #21137:
URL: https://github.com/apache/flink/pull/21137#issuecomment-1292957867

   @flinkbot run azure





[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request, #415: [hotfix] Change default Helm operator configs to align with Flink

2022-10-26 Thread GitBox


gyfora opened a new pull request, #415:
URL: https://github.com/apache/flink-kubernetes-operator/pull/415

   ## What is the purpose of the change
   
   Minor change based on the mailing list discussion to align default configs with 
Flink for subtask number and parallelism.





[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #415: [hotfix] Change default Helm operator configs to align with Flink

2022-10-26 Thread GitBox


gyfora commented on PR #415:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/415#issuecomment-1292952390

   cc @bgeng777 





[jira] [Commented] (FLINK-29611) Fix flaky tests in CoBroadcastWithNonKeyedOperatorTest

2022-10-26 Thread Sopan Phaltankar (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624850#comment-17624850
 ] 

Sopan Phaltankar commented on FLINK-29611:
--

[~martijnvisser] Here is my PR: [https://github.com/apache/flink/pull/21151]

Let me know your feedback

> Fix flaky tests in CoBroadcastWithNonKeyedOperatorTest
> --
>
> Key: FLINK-29611
> URL: https://issues.apache.org/jira/browse/FLINK-29611
> Project: Flink
>  Issue Type: Bug
>Reporter: Sopan Phaltankar
>Priority: Minor
>  Labels: pull-request-available
>
> The test 
> _org.apache.flink.streaming.api.operators.co.CoBroadcastWithNonKeyedOperatorTest.testMultiStateSupport_
>  has the following failure:
> Failures:
> [ERROR]   CoBroadcastWithNonKeyedOperatorTest.testMultiStateSupport:74 
> Wrong Side Output: arrays first differed at element [0]; expected: 15 : 9:key.6->6> but was:5>
> I used the tool [NonDex|https://github.com/TestingResearchIllinois/NonDex] to 
> find this flaky test. 
> Command: mvn edu.illinois:nondex-maven-plugin:1.1.2:nondex -Dtest='Fully 
> Qualified Test Name'
> I analyzed the assertion failure and found that the root cause is that the 
> test method calls ctx.getBroadcastState(STATE_DESCRIPTOR).immutableEntries(), 
> which calls the entrySet() method of the underlying HashMap. entrySet() 
> returns the entries in a non-deterministic order, causing the test to be flaky. 
> The fix would be to change _HashMap_ to _LinkedHashMap_ where the map is 
> initialized.
> On further analysis, it was found that the map is initialized on line 
> 53 of the org.apache.flink.runtime.state.HeapBroadcastState class.
> After changing HashMap to LinkedHashMap, the above test passes.
> Edit: Upon making this change and running the CI, the tests 
> org.apache.flink.api.datastream.DataStreamBatchExecutionITCase.batchKeyedBroadcastExecution
>  and 
> org.apache.flink.api.datastream.DataStreamBatchExecutionITCase.batchBroadcastExecution
>  were failing. Upon further investigation, I found that these tests were also 
> flaky and depended on the change made earlier.
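
A minimal, self-contained illustration of the root cause (plain JDK, independent of Flink):

{code:java}
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class IterationOrderDemo {
    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        Map<String, Integer> linked = new LinkedHashMap<>();
        for (String key : new String[] {"key.5", "key.6", "key.9"}) {
            hash.put(key, key.length());
            linked.put(key, key.length());
        }
        // HashMap's iteration order is unspecified, and tools like NonDex
        // deliberately shuffle it -- the source of the flaky assertion.
        System.out.println(hash.entrySet());
        // LinkedHashMap always iterates in insertion order.
        System.out.println(linked.entrySet());
    }
}
{code}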





[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #409: [FLINK-29708] Convert CR Error field to JSON with enriched exception …

2022-10-26 Thread GitBox


gyfora commented on PR #409:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/409#issuecomment-1292948210

   Great work @darenwkt !





[GitHub] [flink-web] godfreyhe commented on pull request #574: Announcement blogpost for the 1.16 release

2022-10-26 Thread GitBox


godfreyhe commented on PR #574:
URL: https://github.com/apache/flink-web/pull/574#issuecomment-1292943915

   > Thanks a lot for putting together this big announcement, I mostly made 
some wording suggestions
   
   Thanks for the detailed comments, I have fixed them.





[GitHub] [flink] yuzelin closed pull request #21040: [FLINK-29486][sql-client] Implement a new ClientResult for sql client to wrap the result returned by sql gateway

2022-10-26 Thread GitBox


yuzelin closed pull request #21040: [FLINK-29486][sql-client] Implement a new 
ClientResult for sql client to wrap the result returned by sql gateway
URL: https://github.com/apache/flink/pull/21040





[GitHub] [flink] LadyForest commented on a diff in pull request #20652: [FLINK-22315][table] Support add/modify column/constraint/watermark for ALTER TABLE statement

2022-10-26 Thread GitBox


LadyForest commented on code in PR #20652:
URL: https://github.com/apache/flink/pull/20652#discussion_r1006384680


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/operations/AlterTableSchemaUtil.java:
##
@@ -0,0 +1,516 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.operations;
+
+import org.apache.flink.sql.parser.ddl.SqlAlterTableAdd;
+import org.apache.flink.sql.parser.ddl.SqlAlterTableModify;
+import org.apache.flink.sql.parser.ddl.SqlAlterTableSchema;
+import org.apache.flink.sql.parser.ddl.SqlTableColumn;
+import org.apache.flink.sql.parser.ddl.SqlWatermark;
+import org.apache.flink.sql.parser.ddl.constraint.SqlTableConstraint;
+import org.apache.flink.sql.parser.ddl.position.SqlTableColumnPosition;
+import org.apache.flink.table.api.Schema;
+import org.apache.flink.table.api.ValidationException;
+import org.apache.flink.table.catalog.ResolvedCatalogTable;
+import org.apache.flink.table.expressions.Expression;
+import org.apache.flink.table.expressions.SqlCallExpression;
+import org.apache.flink.table.planner.calcite.FlinkTypeFactory;
+import org.apache.flink.table.types.AbstractDataType;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.utils.TypeConversions;
+
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.sql.SqlCharStringLiteral;
+import org.apache.calcite.sql.SqlDataTypeSpec;
+import org.apache.calcite.sql.SqlIdentifier;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.validate.SqlValidator;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Consumer;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+/** Util to convert {@link SqlAlterTableSchema} with the source table to generate a new {@link Schema}. */
+public class AlterTableSchemaUtil {
+
+    private final SqlValidator sqlValidator;
+    private final Consumer<SqlTableConstraint> validateTableConstraint;
+    private final Function<SqlNode, String> escapeExpression;
+
+    AlterTableSchemaUtil(
+            SqlValidator sqlValidator,
+            Function<SqlNode, String> escapeExpression,
+            Consumer<SqlTableConstraint> validateTableConstraint) {
+        this.sqlValidator = sqlValidator;
+        this.validateTableConstraint = validateTableConstraint;
+        this.escapeExpression = escapeExpression;
+    }
+
+    public Schema convertSchema(
+            SqlAlterTableSchema alterTableSchema, ResolvedCatalogTable originalTable) {
+        UnresolvedSchemaBuilder builder =
+                new UnresolvedSchemaBuilder(
+                        originalTable,
+                        (FlinkTypeFactory) sqlValidator.getTypeFactory(),
+                        sqlValidator,
+                        validateTableConstraint,
+                        escapeExpression);
+        AlterSchemaStrategy strategy = computeAlterSchemaStrategy(alterTableSchema);
+        List<SqlNode> columnPositions = alterTableSchema.getColumnPositions().getList();
+        builder.addOrModifyColumns(strategy, columnPositions);
+        alterTableSchema
+                .getWatermark()
+                .ifPresent(sqlWatermark -> builder.addOrModifyWatermarks(strategy, sqlWatermark));
+        alterTableSchema
+                .getFullConstraint()
+                .ifPresent(
+                        (pk) ->
+                                builder.addOrModifyPrimaryKey(
+                                        strategy, pk, columnPositions.isEmpty()));
+        return builder.build();
+    }
+
+    private static class UnresolvedSchemaBuilder {

Review Comment:
   > Rename to SchemaConverter? Currently we only have `Schema` and 
`ResolvedSchema` in Flink. It's better we don't introduce a new concept.
   
   Agree with you on the principle that "we'd better not introduce a new 
concept". 
   
   But I think `UnresolvedSchema` is not a new concept; for instance, 
`CatalogBaseTable#getUnresolvedSchema` returns the `Schema` class. 
   Anyway, I'm fine with `SchemaBuilder`.




[jira] [Closed] (FLINK-29760) Introduce snapshots metadata table

2022-10-26 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-29760.

Resolution: Fixed

master: a58513e40eb0ffb50fb6eee357c4449a9d8483cd

> Introduce snapshots metadata table
> --
>
> Key: FLINK-29760
> URL: https://issues.apache.org/jira/browse/FLINK-29760
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
>
> Introduce snapshots metadata table to show snapshot history.
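
A hedged sketch of the intended usage, following the T$ naming used for Table Store metadata tables (the column names are illustrative):

SELECT * FROM T$snapshots;
SNAPSHOT_ID | COMMIT_KIND | COMMIT_TIME
... | ... | ...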





[GitHub] [flink-table-store] JingsongLi merged pull request #333: [FLINK-29760] Introduce snapshots metadata table

2022-10-26 Thread GitBox


JingsongLi merged PR #333:
URL: https://github.com/apache/flink-table-store/pull/333





[GitHub] [flink-table-store] JingsongLi commented on pull request #336: [FLINK-29773] Fix table store unstable tests PreAggregationITCase

2022-10-26 Thread GitBox


JingsongLi commented on PR #336:
URL: 
https://github.com/apache/flink-table-store/pull/336#issuecomment-1292939114

   How many times have you run it without a failure?





[jira] [Closed] (FLINK-29075) RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite is not stable

2022-10-26 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-29075.

Fix Version/s: table-store-0.3.0
   Resolution: Fixed

master: f3b1e813684194fa50d57663fea2f06d747f5f96

> RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite is not stable
> 
>
> Key: FLINK-29075
> URL: https://issues.apache.org/jira/browse/FLINK-29075
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
> Attachments: image-2022-08-23-15-35-59-499.png
>
>
> [https://github.com/apache/flink-table-store/runs/7964774584?check_suite_focus=true]
> !image-2022-08-23-15-35-59-499.png|width=576,height=370!





[GitHub] [flink-table-store] JingsongLi merged pull request #335: [FLINK-29075] Fix table store unstable test RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite

2022-10-26 Thread GitBox


JingsongLi merged PR #335:
URL: https://github.com/apache/flink-table-store/pull/335





[GitHub] [flink-kubernetes-operator] mbalassi commented on pull request #414: [FLINK-29771] Fix flaky rollback test

2022-10-26 Thread GitBox


mbalassi commented on PR #414:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/414#issuecomment-1292928578

   Thanks @darenwkt, makes sense. Could you please extract the sleep times into a 
single `ROLLBACK_DELAY` or similarly named constant?





[jira] [Updated] (FLINK-29773) PreAggregationITCase.LastValueAggregation and PreAggregationITCase.LastNonNullValueAggregation are unstable

2022-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-29773:
---
Labels: pull-request-available  (was: )

> PreAggregationITCase.LastValueAggregation and 
> PreAggregationITCase.LastNonNullValueAggregation are unstable
> ---
>
> Key: FLINK-29773
> URL: https://issues.apache.org/jira/browse/FLINK-29773
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Caizhi Weng
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
>
> {{PreAggregationITCase.LastValueAggregation}} and 
> {{PreAggregationITCase.LastNonNullValueAggregation}} need to make sure that 
> the order of their input data is deterministic. However, the default parallelism 
> of {{FileStoreTableITCase}} is 2, so the order of the input data might change 
> across tests.





[GitHub] [flink-table-store] tsreaper opened a new pull request, #336: [FLINK-29773] Fix table store unstable tests PreAggregationITCase

2022-10-26 Thread GitBox


tsreaper opened a new pull request, #336:
URL: https://github.com/apache/flink-table-store/pull/336

   `PreAggregationITCase.LastValueAggregation` and 
`PreAggregationITCase.LastNonNullValueAggregation` need to make sure that the 
order of their input data is deterministic. However, the default parallelism of 
`FileStoreTableITCase` is 2, so the order of the input data might change across 
tests.
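
   A minimal sketch of the idea behind the fix, assuming the test can pin the environment parallelism (illustrative; the actual change may differ):

```java
// With a single subtask there is only one writer, so input records reach
// the table in a deterministic order across test runs.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);
```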





[jira] [Created] (FLINK-29774) Introduce options metadata table

2022-10-26 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29774:


 Summary: Introduce options metadata table
 Key: FLINK-29774
 URL: https://issues.apache.org/jira/browse/FLINK-29774
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


SELECT * FROM T$options;
KEY | VALUE
... | ...





[jira] [Created] (FLINK-29773) PreAggregationITCase.LastValueAggregation and PreAggregationITCase.LastNonNullValueAggregation are unstable

2022-10-26 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-29773:
---

 Summary: PreAggregationITCase.LastValueAggregation and 
PreAggregationITCase.LastNonNullValueAggregation are unstable
 Key: FLINK-29773
 URL: https://issues.apache.org/jira/browse/FLINK-29773
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Affects Versions: table-store-0.3.0
Reporter: Caizhi Weng


{{PreAggregationITCase.LastValueAggregation}} and 
{{PreAggregationITCase.LastNonNullValueAggregation}} need to make sure that the 
order of their input data is deterministic. However, the default parallelism of 
{{FileStoreTableITCase}} is 2, so the order of the input data might change across 
tests.





[jira] [Assigned] (FLINK-29773) PreAggregationITCase.LastValueAggregation and PreAggregationITCase.LastNonNullValueAggregation are unstable

2022-10-26 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng reassigned FLINK-29773:
---

Assignee: Caizhi Weng

> PreAggregationITCase.LastValueAggregation and 
> PreAggregationITCase.LastNonNullValueAggregation are unstable
> ---
>
> Key: FLINK-29773
> URL: https://issues.apache.org/jira/browse/FLINK-29773
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Caizhi Weng
>Assignee: Caizhi Weng
>Priority: Major
>
> {{PreAggregationITCase.LastValueAggregation}} and 
> {{PreAggregationITCase.LastNonNullValueAggregation}} need to make sure that 
> the order of their input data is deterministic. However, the default parallelism 
> of {{FileStoreTableITCase}} is 2, so the order of the input data might change 
> across tests.





[jira] [Updated] (FLINK-29075) RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite is not stable

2022-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-29075:
---
Labels: pull-request-available  (was: )

> RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite is not stable
> 
>
> Key: FLINK-29075
> URL: https://issues.apache.org/jira/browse/FLINK-29075
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-08-23-15-35-59-499.png
>
>
> [https://github.com/apache/flink-table-store/runs/7964774584?check_suite_focus=true]
> !image-2022-08-23-15-35-59-499.png|width=576,height=370!





[GitHub] [flink-table-store] tsreaper opened a new pull request, #335: [FLINK-29075] Fix table store unstable test RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite

2022-10-26 Thread GitBox


tsreaper opened a new pull request, #335:
URL: https://github.com/apache/flink-table-store/pull/335

   `RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite` fails because 
some expected records in table `T4` do not appear in table `T3`. This is 
because the `stopJobSafely` method does not wait for the job to fully shut down.
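
   A hedged sketch of the kind of fix, assuming the test holds a `JobClient` for the running job (illustrative; the actual change may differ):

```java
// Request shutdown, then poll until the job reaches a globally terminal
// state so no in-flight writes race with the test's assertions.
jobClient.cancel().get();
while (!jobClient.getJobStatus().get().isGloballyTerminalState()) {
    Thread.sleep(100);
}
```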





[jira] [Assigned] (FLINK-29766) Speculative execution should also work in hybrid shuffle mode

2022-10-26 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-29766:


Assignee: Weijie Guo

> Speculative execution should also work in hybrid shuffle mode
> -
>
> Key: FLINK-29766
> URL: https://issues.apache.org/jira/browse/FLINK-29766
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.17.0
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Major
>  Labels: Umbrella
>
> Speculative execution is a very important piece for large-scale batch jobs, 
> but it currently only works in BLOCKING shuffle mode. I propose to also support 
> this feature for HYBRID shuffle mode to combine the power of hybrid shuffle 
> and speculative execution.





[jira] [Assigned] (FLINK-29767) Adaptive batch scheduler supports hybrid shuffle

2022-10-26 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-29767:


Assignee: Weijie Guo

> Adaptive batch scheduler supports hybrid shuffle
> 
>
> Key: FLINK-29767
> URL: https://issues.apache.org/jira/browse/FLINK-29767
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.17.0
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Major
>
> `SpeculativeScheduler` reuses the scheduling ability of `AdaptiveBatchScheduler`, 
> but the current `AdaptiveBatchScheduler` only supports ALL_EDGES_BLOCKING 
> batch jobs, so we should make it also support hybrid shuffle mode.





[jira] [Commented] (FLINK-29772) Kafka table source scan blocked

2022-10-26 Thread shizhengchao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624836#comment-17624836
 ] 

shizhengchao commented on FLINK-29772:
--

My case is a Kafka interval join; the deadlock occurs when using RocksDB.

> Kafka table source scan blocked
> ---
>
> Key: FLINK-29772
> URL: https://issues.apache.org/jira/browse/FLINK-29772
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.13.2
>Reporter: shizhengchao
>Priority: Major
>
> {code:java}
> //
> "Source: TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
> watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
> cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
> request_id, owner_id, service_id, content_id, sign_id, receiver_type, 
> msg_type, handle_type, reach_type, source_type, create_time, msg_id, imsi, 
> array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
> content_type, android_version, apk_version]) -> Calc(select=[data_type, 
> server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
> proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
> Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
> _UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS client_time, IF(((data_type = 
> _UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
> (code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
> send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
> where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
> CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER 
> SET "UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=77 
> BLOCKED on java.lang.Object@4aa3fe44 owned by "Legacy Source Thread - Source: 
> TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
> watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
> cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
> request_id, owner_id, service_id, content_id, sign_id, receiver_type, 
> msg_type, handle_type, reach_type, source_type, create_time, msg_id, imsi, 
> array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
> content_type, android_version, apk_version]) -> Calc(select=[data_type, 
> server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
> proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
> Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
> _UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS client_time, IF(((data_type = 
> _UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
> (code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
> send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
> where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
> CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER 
> SET "UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=87
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
> -blocked on java.lang.Object@4aa3fe44
> at 

[jira] (FLINK-29772) Kafka table source scan blocked

2022-10-26 Thread shizhengchao (Jira)


[ https://issues.apache.org/jira/browse/FLINK-29772 ]


shizhengchao deleted comment on FLINK-29772:
--

was (Author: tinny):
kafka interval join, deadlock occurs when using rocksdb

> Kafka table source scan blocked
> ---
>
> Key: FLINK-29772
> URL: https://issues.apache.org/jira/browse/FLINK-29772
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.13.2
>Reporter: shizhengchao
>Priority: Major
>
> {code:java}
> //
> "Source: TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
> watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
> cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
> request_id, owner_id, service_id, content_id, sign_id, receiver_type, 
> msg_type, handle_type, reach_type, source_type, create_time, msg_id, imsi, 
> array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
> content_type, android_version, apk_version]) -> Calc(select=[data_type, 
> server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
> proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
> Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
> _UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS client_time, IF(((data_type = 
> _UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
> (code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
> send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
> where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
> CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER 
> SET "UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=77 
> BLOCKED on java.lang.Object@4aa3fe44 owned by "Legacy Source Thread - Source: 
> TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
> watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
> cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
> request_id, owner_id, service_id, content_id, sign_id, receiver_type, 
> msg_type, handle_type, reach_type, source_type, create_time, msg_id, imsi, 
> array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
> content_type, android_version, apk_version]) -> Calc(select=[data_type, 
> server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
> proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
> Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
> _UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS client_time, IF(((data_type = 
> _UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
> (code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
> send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
> where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
> CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER 
> SET "UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=87
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
> -blocked on java.lang.Object@4aa3fe44
> at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
> 

[jira] [Commented] (FLINK-29772) Kafka table source scan blocked

2022-10-26 Thread shizhengchao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624834#comment-17624834
 ] 

shizhengchao commented on FLINK-29772:
--

Kafka interval join; the deadlock occurs when using RocksDB.

> Kafka table source scan blocked
> ---
>
> Key: FLINK-29772
> URL: https://issues.apache.org/jira/browse/FLINK-29772
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.13.2
>Reporter: shizhengchao
>Priority: Major
>
> {code:java}
> //
> "Source: TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
> watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
> cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
> request_id, owner_id, service_id, content_id, sign_id, receiver_type, 
> msg_type, handle_type, reach_type, source_type, create_time, msg_id, imsi, 
> array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
> content_type, android_version, apk_version]) -> Calc(select=[data_type, 
> server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
> proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
> Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
> _UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS client_time, IF(((data_type = 
> _UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
> (code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
> send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
> where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
> CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER 
> SET "UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=77 
> BLOCKED on java.lang.Object@4aa3fe44 owned by "Legacy Source Thread - Source: 
> TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
> watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
> cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
> request_id, owner_id, service_id, content_id, sign_id, receiver_type, 
> msg_type, handle_type, reach_type, source_type, create_time, msg_id, imsi, 
> array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
> content_type, android_version, apk_version]) -> Calc(select=[data_type, 
> server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
> proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
> Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
> _UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
> (_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
> SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
> CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
> 10) AS client_time, IF(((data_type = 
> _UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
> (code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
> send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
> where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
> CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER 
> SET "UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=87
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
> -blocked on java.lang.Object@4aa3fe44
> at 

[jira] [Commented] (FLINK-24119) KafkaITCase.testTimestamps fails due to "Topic xxx already exist"

2022-10-26 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624832#comment-17624832
 ] 

Weijie Guo commented on FLINK-24119:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42455=logs=aa18c3f6-13b8-5f58-86bb-c1cffb239496=502fb6c0-30a2-5e49-c5c2-a00fa3acb203=37386

> KafkaITCase.testTimestamps fails due to "Topic xxx already exist"
> -
>
> Key: FLINK-24119
> URL: https://issues.apache.org/jira/browse/FLINK-24119
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0, 1.15.0, 1.16.0
>Reporter: Xintong Song
>Assignee: Qingsheng Ren
>Priority: Critical
>  Labels: auto-deprioritized-critical, test-stability
> Fix For: 1.16.1
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23328=logs=c5f0071e-1851-543e-9a45-9ac140befc32=15a22db7-8faa-5b34-3920-d33c9f0ca23c=7419
> {code}
> Sep 01 15:53:20 [ERROR] Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 162.65 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.kafka.KafkaITCase
> Sep 01 15:53:20 [ERROR] testTimestamps  Time elapsed: 23.237 s  <<< FAILURE!
> Sep 01 15:53:20 java.lang.AssertionError: Create test topic : tstopic failed, 
> org.apache.kafka.common.errors.TopicExistsException: Topic 'tstopic' already 
> exists.
> Sep 01 15:53:20   at org.junit.Assert.fail(Assert.java:89)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.createTestTopic(KafkaTestEnvironmentImpl.java:226)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironment.createTestTopic(KafkaTestEnvironment.java:112)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestBase.createTestTopic(KafkaTestBase.java:212)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaITCase.testTimestamps(KafkaITCase.java:191)
> Sep 01 15:53:20   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Sep 01 15:53:20   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Sep 01 15:53:20   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Sep 01 15:53:20   at java.lang.reflect.Method.invoke(Method.java:498)
> Sep 01 15:53:20   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Sep 01 15:53:20   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> Sep 01 15:53:20   at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Sep 01 15:53:20   at java.lang.Thread.run(Thread.java:748)
> {code}
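
One way to make such tests robust against this failure mode is to tolerate the 
pre-existing topic instead of failing. A hedged sketch with the standard Kafka 
admin API; the broker address and topic name are placeholders, and this is not 
the code in {{KafkaTestEnvironmentImpl}}:

{code:java}
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.errors.TopicExistsException;

public class CreateTopicIfAbsent {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            try {
                admin.createTopics(Collections.singleton(new NewTopic("tstopic", 1, (short) 1)))
                        .all()
                        .get();
            } catch (ExecutionException e) {
                // A topic left over from an earlier run, or a broker-side retry,
                // surfaces as TopicExistsException; tolerate it rather than fail.
                if (!(e.getCause() instanceof TopicExistsException)) {
                    throw e;
                }
            }
        }
    }
}
{code}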



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #21168: [FLINK-27579][client] Make the client.timeout and parallelism.default…

2022-10-26 Thread GitBox


flinkbot commented on PR #21168:
URL: https://github.com/apache/flink/pull/21168#issuecomment-1292895192

   
   ## CI report:
   
   * 60f5c5556492722b05d6271731dff5f6430d22e6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29315) HDFSTest#testBlobServerRecovery fails on CI

2022-10-26 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624830#comment-17624830
 ] 

Weijie Guo commented on FLINK-29315:


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42455=logs=4eda0b4a-bd0d-521a-0916-8285b9be9bb5=2ff6d5fa-53a6-53ac-bff7-fa524ea361a9=46086

> HDFSTest#testBlobServerRecovery fails on CI
> ---
>
> Key: FLINK-29315
> URL: https://issues.apache.org/jira/browse/FLINK-29315
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / FileSystem, Tests
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Chesnay Schepler
>Priority: Blocker
>  Labels: pull-request-available
>
> The test started failing 2 days ago on different branches. I suspect 
> something's wrong with the CI infrastructure.
> {code:java}
> Sep 15 09:11:22 [ERROR] Failures: 
> Sep 15 09:11:22 [ERROR]   HDFSTest.testBlobServerRecovery Multiple Failures 
> (2 failures)
> Sep 15 09:11:22   java.lang.AssertionError: Test failed Error while 
> running command to get file permissions : java.io.IOException: Cannot run 
> program "ls": error=1, Operation not permitted
> Sep 15 09:11:22   at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell.runCommand(Shell.java:913)
> Sep 15 09:11:22   at org.apache.hadoop.util.Shell.run(Shell.java:869)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell.execCommand(Shell.java:1264)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.Shell.execCommand(Shell.java:1246)
> Sep 15 09:11:22   at 
> org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1089)
> Sep 15 09:11:22   at 
> org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:697)
> Sep 15 09:11:22   at 
> org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:672)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:233)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:141)
> Sep 15 09:11:22   at 
> org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2580)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2622)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2604)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2497)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1501)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:851)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:485)
> Sep 15 09:11:22   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444)
> Sep 15 09:11:22   at 
> org.apache.flink.hdfstests.HDFSTest.createHDFS(HDFSTest.java:93)
> Sep 15 09:11:22   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Sep 15 09:11:22   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Sep 15 09:11:22   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Sep 15 09:11:22   at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
> Sep 15 09:11:22   ... 67 more
> Sep 15 09:11:22 
> Sep 15 09:11:22   java.lang.NullPointerException: 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29772) Kafka table source scan blocked

2022-10-26 Thread shizhengchao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shizhengchao updated FLINK-29772:
-
Description: 
{code:java}
//
"Source: TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
request_id, owner_id, service_id, content_id, sign_id, receiver_type, msg_type, 
handle_type, reach_type, source_type, create_time, msg_id, imsi, 
array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
content_type, android_version, apk_version]) -> Calc(select=[data_type, 
server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
_UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS client_time, IF(((data_type = 
_UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
(code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER SET 
"UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=77 BLOCKED on 
java.lang.Object@4aa3fe44 owned by "Legacy Source Thread - Source: 
TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
request_id, owner_id, service_id, content_id, sign_id, receiver_type, msg_type, 
handle_type, reach_type, source_type, create_time, msg_id, imsi, 
array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
content_type, android_version, apk_version]) -> Calc(select=[data_type, 
server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
_UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS client_time, IF(((data_type = 
_UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
(code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER SET 
"UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=87
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
-blocked on java.lang.Object@4aa3fe44
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:344)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:330)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
at 

[jira] [Assigned] (FLINK-27579) The param client.timeout can not be set by dynamic properties when stopping the job

2022-10-26 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang reassigned FLINK-27579:
-

Assignee: Yao Zhang

> The param client.timeout can not be set by dynamic properties when stopping 
> the job 
> 
>
> Key: FLINK-27579
> URL: https://issues.apache.org/jira/browse/FLINK-27579
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission
>Affects Versions: 1.16.0
>Reporter: Liu
>Assignee: Yao Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> The default client.timeout value is one minute, which may be too short for 
> stop-with-savepoint on jobs with large state.
> When we stop the job with dynamic properties (-D, or -yD for YARN), the 
> client.timeout is not effective.
> From the code, we can see that the dynamic properties only take effect for the 
> run command. We should support them for the stop command as well.
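
Once dynamic properties are honored by the stop action, the intended usage 
would look roughly like this (the job id is a placeholder, and the flag shows 
the behavior this ticket asks for, not the current behavior):

{code}
# Raise client.timeout only for this stop-with-savepoint call,
# without editing flink-conf.yaml:
flink stop -Dclient.timeout=600s <jobId>
{code}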



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] liuyongvs opened a new pull request, #21168: [FLINK-27579][client] Make the client.timeout and parallelism.default…

2022-10-26 Thread GitBox


liuyongvs opened a new pull request, #21168:
URL: https://github.com/apache/flink/pull/21168

   ## What is the purpose of the change
   
   *backport FLINK-27579 to 1.15*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27579) The param client.timeout can not be set by dynamic properties when stopping the job

2022-10-26 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624827#comment-17624827
 ] 

Yang Wang commented on FLINK-27579:
---

While working on FLINK-29749, I realized that this ticket also needs to be 
backported to release-1.15. 

> The param client.timeout can not be set by dynamic properties when stopping 
> the job 
> 
>
> Key: FLINK-27579
> URL: https://issues.apache.org/jira/browse/FLINK-27579
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission
>Affects Versions: 1.16.0
>Reporter: Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> The default client.timeout value is one minute, which may be too short for 
> stop-with-savepoint on jobs with large state.
> When we stop the job with dynamic properties (-D, or -yD for YARN), the 
> client.timeout is not effective.
> From the code, we can see that the dynamic properties only take effect for the 
> run command. We should support them for the stop command as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29772) Kafka table source scan blocked

2022-10-26 Thread shizhengchao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shizhengchao updated FLINK-29772:
-
Description: 
{code:java}
//
"Source: TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
request_id, owner_id, service_id, content_id, sign_id, receiver_type, msg_type, 
handle_type, reach_type, source_type, create_time, msg_id, imsi, 
array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
content_type, android_version, apk_version]) -> Calc(select=[data_type, 
server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
_UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS client_time, IF(((data_type = 
_UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
(code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER SET 
"UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=77 
{color:red}BLOCKED on java.lang.Object@4aa3fe44 owned by{color} "Legacy Source 
Thread - Source: TableSourceScan(table=[[ostream, user_mart, 
dwd_ads_isms_msgmiddle, watermark=[-(toTimeStamps($2), 1:INTERVAL 
SECOND)]]], fields=[data_type, cluster_name, server_time, server_time_s, 
client_time, client_time_s, imei, request_id, owner_id, service_id, content_id, 
sign_id, receiver_type, msg_type, handle_type, reach_type, source_type, 
create_time, msg_id, imsi, array_info_imei, phone, channel_id, process_time, 
code, msg, receiver, content_type, android_version, apk_version]) -> 
Calc(select=[data_type, server_time, client_time, msg_id, array_info_imei, 
code, PROCTIME() AS proctime, Reinterpret(toTimeStamps(server_time)) AS 
rowtime]) -> Calc(select=[array_info_imei AS imei, REPLACE(msg_id, 
_UTF-16LE'#', _UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, 
Sarg[(-∞.._UTF-16LE'NULL'), (_UTF-16LE'NULL'.._UTF-16LE'null'), 
(_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER SET "UTF-16LE") AND server_time IS NOT 
NULL), server_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS client_time, IF(((data_type = 
_UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
(code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER SET 
"UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=87
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
-{color:red} blocked on java.lang.Object@4aa3fe44{color}
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:344)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:330)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:202)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
at 

[GitHub] [flink] fredia commented on pull request #21053: [FLINK-29157][doc] Clarify the contract between CompletedCheckpointStore and SharedStateRegistry

2022-10-26 Thread GitBox


fredia commented on PR #21053:
URL: https://github.com/apache/flink/pull/21053#issuecomment-1292888602

   Thanks a lot for reviewing. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-docker] HuangXingBo closed pull request #139: [hotfix] Remove 1.14

2022-10-26 Thread GitBox


HuangXingBo closed pull request #139: [hotfix] Remove 1.14
URL: https://github.com/apache/flink-docker/pull/139


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-29157) Clarify the contract between CompletedCheckpointStore and SharedStateRegistry

2022-10-26 Thread Congxian Qiu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Congxian Qiu resolved FLINK-29157.
--
Resolution: Fixed

merged into master 63767c5ed91642c67f97d9f16ff2b8955f9ae421

> Clarify the contract between CompletedCheckpointStore and SharedStateRegistry
> -
>
> Key: FLINK-29157
> URL: https://issues.apache.org/jira/browse/FLINK-29157
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Roman Khachatryan
>Assignee: Yanfei Lei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.15.3, 1.16.1
>
>
> After FLINK-24611, CompletedCheckpointStore is required to call 
> SharedStateRegistry.unregisterUnusedState() on checkpoint subsumption and 
> shutdown.
> Although it's not clear whether CompletedCheckpointStore is internal, there 
> are in fact external implementations (which weren't updated accordingly).
>  
> After FLINK-25872, CompletedCheckpointStore also must call 
> checkpointsCleaner.cleanSubsumedCheckpoints.
>  
> Another issue with a custom implementation was using different Java objects 
> for state in CheckpointStore and SharedStateRegistry (after FLINK-24086). 
>  
> So it makes sense to:
>  * clarify the contract (different in 1.15 and 1.16)
>  * require using the same checkpoint objects by SharedStateRegistryFactory 
> and CompletedCheckpointStore
>  * mark the interface(s) as PublicEvolving



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-29157) Clarify the contract between CompletedCheckpointStore and SharedStateRegistry

2022-10-26 Thread Congxian Qiu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Congxian Qiu reassigned FLINK-29157:


Assignee: Yanfei Lei  (was: Roman Khachatryan)

> Clarify the contract between CompletedCheckpointStore and SharedStateRegistry
> -
>
> Key: FLINK-29157
> URL: https://issues.apache.org/jira/browse/FLINK-29157
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.0, 1.15.2
>Reporter: Roman Khachatryan
>Assignee: Yanfei Lei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.15.3, 1.16.1
>
>
> After FLINK-24611, CompletedCheckpointStore is required to call 
> SharedStateRegistry.unregisterUnusedState() on checkpoint subsumption and 
> shutdown.
> Although it's not clear whether CompletedCheckpointStore is internal, there 
> are in fact external implementations (which weren't updated accordingly).
>  
> After FLINK-25872, CompletedCheckpointStore also must call 
> checkpointsCleaner.cleanSubsumedCheckpoints.
>  
> Another issue with a custom implementation was using different Java objects 
> for state in CheckpointStore and SharedStateRegistry (after FLINK-24086). 
>  
> So it makes sense to:
>  * clarify the contract (different in 1.15 and 1.16)
>  * require using the same checkpoint objects by SharedStateRegistryFactory 
> and CompletedCheckpointStore
>  * mark the interface(s) as PublicEvolving



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-29075) RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite is not stable

2022-10-26 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng reassigned FLINK-29075:
---

Assignee: Caizhi Weng

> RescaleBucketITCase#testSuspendAndRecoverAfterRescaleOverwrite is not stable
> 
>
> Key: FLINK-29075
> URL: https://issues.apache.org/jira/browse/FLINK-29075
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Assignee: Caizhi Weng
>Priority: Major
> Attachments: image-2022-08-23-15-35-59-499.png
>
>
> [https://github.com/apache/flink-table-store/runs/7964774584?check_suite_focus=true]
> !image-2022-08-23-15-35-59-499.png|width=576,height=370!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] wangyang0918 merged pull request #21164: [FLINK-29749][client] Make 'flink info' command could support dynamic…

2022-10-26 Thread GitBox


wangyang0918 merged PR #21164:
URL: https://github.com/apache/flink/pull/21164


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-web] HuangXingBo commented on a diff in pull request #574: Announcement blogpost for the 1.16 release

2022-10-26 Thread GitBox


HuangXingBo commented on code in PR #574:
URL: https://github.com/apache/flink-web/pull/574#discussion_r1006341102


##
_config.yml:
##
@@ -73,6 +73,22 @@ FLINK_TABLE_STORE_GITHUB_REPO_NAME: flink-table-store
 #  md1_url: 
https://repo.maven.apache.org/maven2/org/apache/flink/flink-metrics-prometheus_2.12/1.7.1/flink-metrics-prometheus_2.12-1.7.1.jar.sha1
 
 flink_releases:
+  - version_short: "1.16"

Review Comment:
   Only two major versions need to be present, so we can remove release 1.14.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-29660) Show all attempts of subtasks in WebUI

2022-10-26 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624818#comment-17624818
 ] 

Zhu Zhu edited comment on FLINK-29660 at 10/27/22 2:14 AM:
---

The exception history already shows all the root exceptions and the concurrent 
exceptions. You can also check any concurrent exception by filtering on the 
task name if you think the displayed root cause is not the true root.

Showing all attempts on the vertex tab may cause performance problems, because 
there can be too many attempts to retrieve and show at one time, given that 
tasks may fail over multiple times. The tab is already slow and hard to 
navigate when the parallelism is large, and I prefer not to add much extra 
burden to it.

Maybe a better way is to show the task manager id on the exception history 
page, with a link from the task manager id to the task manager page.



was (Author: zhuzh):
In the exception history, it already shows all the root exceptions and the 
concurrent exceptions. You can also check any concurrent exception by filtering 
the task name if you think the displayed root cause is not the true root.

Showing all attempts on the vertex tab may result in performance problem 
because there can be too many attempts to be retrieved and shown at one time, 
given that tasks may failover multiple times. At the moment, the tab can be 
slower and harder to navigate if the parallelism is large and I prefer not to 
add much extra burden to it.

Maybe a better way is to show the task manager id on the exception history 
page, and a link o the task manager id to jump to the task manager page.


> Show all attempts of subtasks in WebUI
> --
>
> Key: FLINK-29660
> URL: https://issues.apache.org/jira/browse/FLINK-29660
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Reporter: LI Mingkun
>Priority: Major
>
> The Web UI only shows subtask metrics and TM logs now.
> For batch jobs, it's very important to track the metrics of failed attempts and 
> to jump into the log stream of every attempt.
>  
> Feature needed: enable expanded rows of all attempts for subtasks



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29660) Show all attempts of subtasks in WebUI

2022-10-26 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624818#comment-17624818
 ] 

Zhu Zhu commented on FLINK-29660:
-

The exception history already shows all the root exceptions and the concurrent 
exceptions. You can also check any concurrent exception by filtering on the 
task name if you think the displayed root cause is not the true root.

Showing all attempts on the vertex tab may cause performance problems, because 
there can be too many attempts to retrieve and show at one time, given that 
tasks may fail over multiple times. The tab is already slow and hard to 
navigate when the parallelism is large, and I prefer not to add much extra 
burden to it.

Maybe a better way is to show the task manager id on the exception history 
page, with a link from the task manager id to the task manager page.


> Show all attempts of subtasks in WebUI
> --
>
> Key: FLINK-29660
> URL: https://issues.apache.org/jira/browse/FLINK-29660
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Reporter: LI Mingkun
>Priority: Major
>
> The Web UI only shows subtask metrics and TM logs now.
> For batch jobs, it's very important to track the metrics of failed attempts and 
> to jump into the log stream of every attempt.
>  
> Feature needed: enable expanded rows of all attempts for subtasks



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29772) Kafka table source scan blocked

2022-10-26 Thread shizhengchao (Jira)
shizhengchao created FLINK-29772:


 Summary: Kafka table source scan blocked
 Key: FLINK-29772
 URL: https://issues.apache.org/jira/browse/FLINK-29772
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.13.2
Reporter: shizhengchao


{code:java}
//
"Source: TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
request_id, owner_id, service_id, content_id, sign_id, receiver_type, msg_type, 
handle_type, reach_type, source_type, create_time, msg_id, imsi, 
array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
content_type, android_version, apk_version]) -> Calc(select=[data_type, 
server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
_UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS client_time, IF(((data_type = 
_UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
(code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER SET 
"UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=77 BLOCKED on 
java.lang.Object@4aa3fe44 owned by "Legacy Source Thread - Source: 
TableSourceScan(table=[[ostream, user_mart, dwd_ads_isms_msgmiddle, 
watermark=[-(toTimeStamps($2), 1:INTERVAL SECOND)]]], fields=[data_type, 
cluster_name, server_time, server_time_s, client_time, client_time_s, imei, 
request_id, owner_id, service_id, content_id, sign_id, receiver_type, msg_type, 
handle_type, reach_type, source_type, create_time, msg_id, imsi, 
array_info_imei, phone, channel_id, process_time, code, msg, receiver, 
content_type, android_version, apk_version]) -> Calc(select=[data_type, 
server_time, client_time, msg_id, array_info_imei, code, PROCTIME() AS 
proctime, Reinterpret(toTimeStamps(server_time)) AS rowtime]) -> 
Calc(select=[array_info_imei AS imei, REPLACE(msg_id, _UTF-16LE'#', 
_UTF-16LE'') AS msg_id, CASE((SEARCH(server_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND server_time IS NOT NULL), server_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS server_time, CASE((SEARCH(client_time, Sarg[(-∞.._UTF-16LE'NULL'), 
(_UTF-16LE'NULL'.._UTF-16LE'null'), (_UTF-16LE'null'..+∞)]:CHAR(4) CHARACTER 
SET "UTF-16LE") AND client_time IS NOT NULL), client_time, 
CAST(FROM_UNIXTIME(CAST(SUBSTRING(CAST(PROCTIME_MATERIALIZE(proctime)), 0, 
10) AS client_time, IF(((data_type = 
_UTF-16LE'sms-netmsg-send':VARCHAR(2147483647) CHARACTER SET "UTF-16LE") AND 
(code = _UTF-16LE'0':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), 1, 0) AS 
send_cnt, IF(((data_type = _UTF-16LE'sms-netmsg-callback':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE") AND (code = _UTF-16LE'0':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE")), 1, 0) AS reach_cnt, rowtime], 
where=[SEARCH(data_type, Sarg[_UTF-16LE'sms-netmsg-callback':VARCHAR(19) 
CHARACTER SET "UTF-16LE", _UTF-16LE'sms-netmsg-send':VARCHAR(19) CHARACTER SET 
"UTF-16LE"]:VARCHAR(19) CHARACTER SET "UTF-16LE")]) (22/24)#0" Id=87
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
- blocked on java.lang.Object@4aa3fe44
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:344)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:330)
at 

[jira] [Closed] (FLINK-29747) [UI] Refactor runtime web from module-based to standalone components

2022-10-26 Thread Junhan Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junhan Yang closed FLINK-29747.
---
Resolution: Fixed

> [UI] Refactor runtime web from module-based to standalone components
> 
>
> Key: FLINK-29747
> URL: https://issues.apache.org/jira/browse/FLINK-29747
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Reporter: Junhan Yang
>Assignee: Junhan Yang
>Priority: Major
>  Labels: pull-request-available
>
> From v14 onwards, Angular provides standalone components that can be 
> bootstrapped independently. This is a powerful feature for refactoring the 
> application to be lighter and structurally cleaner. It also enables 
> component-level lazy loading in routes, improving web performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29747) [UI] Refactor runtime web from module-based to standalone components

2022-10-26 Thread Junhan Yang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624817#comment-17624817
 ] 

Junhan Yang commented on FLINK-29747:
-

master: 77204ff7b92da132c1033a356344d8ccd6b443f2

> [UI] Refactor runtime web from module-based to standalone components
> 
>
> Key: FLINK-29747
> URL: https://issues.apache.org/jira/browse/FLINK-29747
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Reporter: Junhan Yang
>Assignee: Junhan Yang
>Priority: Major
>  Labels: pull-request-available
>
> From v14 onwards, Angular provides standalone components that can be 
> bootstrapped independently. This is a powerful feature for refactoring the 
> application to be lighter and structurally cleaner. It also enables 
> component-level lazy loading in routes, improving web performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29541) [JUnit5 Migration] Module: flink-table-planner

2022-10-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624816#comment-17624816
 ] 

Lijie Wang commented on FLINK-29541:


No, I'm not working on this, because I'm not familiar with flink-table-planner. 
I created this issue since the migration of this module blocks other work I'm 
doing (FLINK-27940). Feel free to take it if you want :) [~rskraba] 

> [JUnit5 Migration] Module: flink-table-planner
> --
>
> Key: FLINK-29541
> URL: https://issues.apache.org/jira/browse/FLINK-29541
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner, Tests
>Reporter: Lijie Wang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] yangjunhan merged pull request #21144: [FLINK-29747] refactor: module-based app to standalone components

2022-10-26 Thread GitBox


yangjunhan merged PR #21144:
URL: https://github.com/apache/flink/pull/21144


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] wangyang0918 commented on pull request #20779: [FLINK-28915] Flink Native k8s mode jar localtion support s3 schema.

2022-10-26 Thread GitBox


wangyang0918 commented on PR #20779:
URL: https://github.com/apache/flink/pull/20779#issuecomment-1292866740

   @SwimSweet Yes. The user jar for YARN application mode can only be an HDFS 
file. However, I believe that is enough, since using the YARN distributed cache 
is more appropriate than downloading via HTTP or a Flink filesystem directly. 
This also means that we only need to add support for standalone mode in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29572) Flink Task Manager skip loopback interface for resource manager registration

2022-10-26 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624813#comment-17624813
 ] 

Xintong Song commented on FLINK-29572:
--

bq. If this works, TM will report its address as 127.0.0.1:6223. JM can RPC 
this address as well. But as soon as you have multiple TMs, all of them will 
report their address as 127.0.0.1:6223. Obviously only one will succeed.

You said that it would work if there's only one TM using a 127.0.0.1 address. 
Then I don't understand why it couldn't work with multiple TMs using the same 
address but different ports. A {{TaskManagerLocation}} is uniquely identified 
by its address *and* port. Having multiple TMs using the same IP address with 
different ports should not confuse the JM.

> Flink Task Manager skip loopback interface for resource manager registration
> 
>
> Key: FLINK-29572
> URL: https://issues.apache.org/jira/browse/FLINK-29572
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.15.2
> Environment: Flink 1.15.2
> Kubernetes with Istio Proxy
>Reporter: Kevin Li
>Priority: Major
>
> Currently the Flink Task Manager tries different local interfaces to bind to 
> when connecting to the Resource Manager; the first one is the loopback 
> interface. Normally, if the Job Manager is running on a remote host/container, 
> connecting through the loopback interface fails and the Task Manager picks up 
> the correct IP address.
> However, if the Task Manager is running behind some proxy, the loopback 
> interface can connect to the remote host as well. This results in 127.0.0.1 
> being reported to the Resource Manager during registration even though the 
> Job Manager/Resource Manager runs on a remote host, and problems follow. For 
> us, only one Task Manager can register in this case.
> I suggest adding a configuration option to skip the loopback interface check 
> if we know the Job/Resource Manager is running on a remote host/container.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] saLeox commented on pull request #21149: [FLINK-29527][format] Make unknownFieldsIndices work for single ParquetReader

2022-10-26 Thread GitBox


saLeox commented on PR #21149:
URL: https://github.com/apache/flink/pull/21149#issuecomment-1292864583

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29629) FlameGraph is empty for Legacy Source Threads

2022-10-26 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624812#comment-17624812
 ] 

Zhu Zhu commented on FLINK-29629:
-

Unfortunately we do not have enough resources to make this improvement at the 
moment. 
[~pvary], not sure if you are interested and have time to do this. If yes, I 
can help with the design and code review. If not, we may close this ticket for 
now and pick it up in a later version.

> FlameGraph is empty for Legacy Source Threads
> -
>
> Key: FLINK-29629
> URL: https://issues.apache.org/jira/browse/FLINK-29629
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Reporter: Peter Vary
>Priority: Major
>
> Thread dump gets the stack trace for the {{Custom Source}} thread, but this 
> thread is always in {{TIMED_WAITING}}:
> {code}
> "Source: Custom Source -> A random source (1/2)#0" ...
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at jdk.internal.misc.Unsafe.park(java.base@11.0.16/Native Method)
>   - parking to wait for  <0xea775750> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.parkNanos()
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.take()
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:335)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
> [..]
> {code}
> The actual code is run in the {{Legacy Source Thread}}:
> {code}
> "Legacy Source Thread - Source: Custom Source -> A random source (1/2)#0" ...
>java.lang.Thread.State: RUNNABLE
> {code}
> This causes the WebUI FlameGraph to be empty of any useful data.
> This is an example code to reproduce:
> {code}
> DataStream inputStream = env.addSource(new 
> RandomRecordSource(recordSize));
> inputStream = inputStream.map(new CounterMapper());
> FlinkSink.forRowData(inputStream).tableLoader(loader).append();
> {code}
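
For completeness, the thread states can be confirmed with a plain JDK thread 
dump, independent of Flink's sampling code (this is not what the FlameGraph 
feature itself uses, just a self-contained way to observe, inside the 
TaskManager JVM, the behavior described above):

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

public class LegacySourceThreadDump {
    public static void main(String[] args) {
        for (ThreadInfo info :
                ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            // The mailbox thread of a legacy source shows TIMED_WAITING, while
            // the "Legacy Source Thread", where the user code runs, is RUNNABLE.
            if (info.getThreadName().contains("Source")) {
                System.out.println(info.getThreadName() + ": " + info.getThreadState());
            }
        }
    }
}
{code}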



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-ml] jiangxin369 commented on a diff in pull request #162: [FLINK-29593] Add QuantileSummary to help calculate approximate quant…

2022-10-26 Thread GitBox


jiangxin369 commented on code in PR #162:
URL: https://github.com/apache/flink-ml/pull/162#discussion_r1004217316


##
flink-ml-core/src/main/java/org/apache/flink/ml/util/QuantileSummary.java:
##
@@ -0,0 +1,359 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.util;
+
+import org.apache.flink.util.Preconditions;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+
+import static org.apache.flink.util.Preconditions.checkState;
+
+/**
+ * Helper class to compute an approximate quantile summary. This 
implementation is based on the
+ * algorithm proposed in the paper: "Space-efficient Online Computation of 
Quantile Summaries" by
+ * Greenwald, Michael and Khanna, Sanjeev. 
(https://doi.org/10.1145/375663.375670)
+ */
+public class QuantileSummary {
+
+private static final int BUFFER_SIZE = 1;

Review Comment:
   Of course `50000` is also acceptable; in that way each QuantileSummary needs 
`50000 * 8B + 10000 * 24B = 640KB` of memory, so a TM with 4GB of memory (task 
memory is only 1.5GB by default) can process about 2000 features. If that is 
exceeded, the parallelism needs to be expanded.
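   
   Spelled out, that estimate is (the figures are reconstructions of the 
garbled numbers above, so treat them as approximate):
   
   ```java
   // Back-of-envelope check of the memory estimate in the comment above.
   // The 50,000 / 10,000 figures are reconstructed, not read from the
   // QuantileSummary code.
   public class SummaryMemoryEstimate {
       public static void main(String[] args) {
           long bufferedDoubles = 50_000; // sampled values, 8 bytes each
           long statsTuples = 10_000;     // compressed (value, g, delta) entries, ~24 bytes each
           long bytes = bufferedDoubles * 8 + statsTuples * 24;
           System.out.println(bytes + " B per summary"); // 640000 B = 640 KB
       }
   }
   ```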



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-29764) Automatic judgment of parallelism of source

2022-10-26 Thread waywtdcc (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

waywtdcc closed FLINK-29764.

Resolution: Not A Problem

> Automatic judgment of parallelism of source
> ---
>
> Key: FLINK-29764
> URL: https://issues.apache.org/jira/browse/FLINK-29764
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.16.0, 1.15.2
>Reporter: waywtdcc
>Priority: Major
> Fix For: 1.17.0
>
>
> Automatically determine the parallelism of the source. The parallelism of the 
> source should not be fixed by the JobManager's adaptive batch scheduler alone: 
> the default source parallelism should be derived from the two configuration 
> options jobmanager.adaptive-batch-scheduler.min-parallelism and 
> jobmanager.adaptive-batch-scheduler.max-parallelism together with the number 
> of partitions.
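
For reference, the two options mentioned above, shown with illustrative values 
(the keys exist in recent Flink releases; check the docs of your version for 
the defaults):

{code}
# flink-conf.yaml (illustrative values)
jobmanager.adaptive-batch-scheduler.min-parallelism: 1
jobmanager.adaptive-batch-scheduler.max-parallelism: 128
{code}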



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29764) Automatic judgment of parallelism of source

2022-10-26 Thread waywtdcc (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624796#comment-17624796
 ] 

waywtdcc commented on FLINK-29764:
--

Oh, sorry, I misunderstood. 

> Automatic judgment of parallelism of source
> ---
>
> Key: FLINK-29764
> URL: https://issues.apache.org/jira/browse/FLINK-29764
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Planner
>Affects Versions: 1.16.0, 1.15.2
>Reporter: waywtdcc
>Priority: Major
> Fix For: 1.17.0
>
>
> Automatically determine the parallelism of the source. The parallelism of the 
> source should not be fixed by the JobManager's adaptive batch scheduler alone: 
> the default source parallelism should be derived from the two configuration 
> options jobmanager.adaptive-batch-scheduler.min-parallelism and 
> jobmanager.adaptive-batch-scheduler.max-parallelism together with the number 
> of partitions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29572) Flink Task Manager skip loopback interface for resource manager registration

2022-10-26 Thread Kevin Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624793#comment-17624793
 ] 

Kevin Li commented on FLINK-29572:
--

The sidecar proxy allows an application bound to 127.0.0.1 to connect to a remote 
IP address (where the Job Manager runs), which it shouldn't be able to do under 
normal circumstances. This makes the Task Manager report its IP as 127.0.0.1 to 
the Job Manager instead of its real IP, such as 1.2.3.4. It has nothing to do 
with the port.

In this situation all TMs report their IP as 127.0.0.1, which confuses the Job 
Manager, and eventually no TM can communicate with the JM.
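
A minimal sketch of the interface filtering such a configuration option could 
apply (the boolean flag is hypothetical and stands in for the option suggested 
in this issue; `NetworkInterface` is the standard JDK API):

{code:java}
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BindAddressCandidates {

    // "skipLoopback" stands in for the hypothetical config flag suggested in
    // this issue; when set, loopback NICs are never offered as bind candidates.
    static List<InetAddress> candidates(boolean skipLoopback) throws Exception {
        List<InetAddress> result = new ArrayList<>();
        for (NetworkInterface nic :
                Collections.list(NetworkInterface.getNetworkInterfaces())) {
            if (skipLoopback && nic.isLoopback()) {
                continue; // avoid ever reporting 127.0.0.1 to a remote JM
            }
            result.addAll(Collections.list(nic.getInetAddresses()));
        }
        return result;
    }
}
{code}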

> Flink Task Manager skip loopback interface for resource manager registration
> 
>
> Key: FLINK-29572
> URL: https://issues.apache.org/jira/browse/FLINK-29572
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.15.2
> Environment: Flink 1.15.2
> Kubernetes with Istio Proxy
>Reporter: Kevin Li
>Priority: Major
>
> Currently the Flink Task Manager tries different local interfaces to bind to 
> when connecting to the Resource Manager. The first one is the loopback 
> interface. Normally, if the Job Manager is running on a remote host/container, 
> connecting via the loopback interface will fail and the correct IP address will 
> be picked up.
> However, if the Task Manager is running behind some proxy, the loopback 
> interface can connect to the remote host as well. This results in 127.0.0.1 
> being reported to the Resource Manager during registration even though the Job 
> Manager/Resource Manager runs on a remote host, and problems follow. For us, 
> only one Task Manager can register in this case.
> I suggest adding a configuration option to skip the loopback interface check if 
> we know the Job/Resource Manager is running on a remote host/container.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29771) Fix flaky RollbackTest

2022-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-29771:
---
Labels: pull-request-available  (was: )

> Fix flaky RollbackTest
> --
>
> Key: FLINK-29771
> URL: https://issues.apache.org/jira/browse/FLINK-29771
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.0
>Reporter: Gyula Fora
>Assignee: Daren Wong
>Priority: Blocker
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] darenwkt opened a new pull request, #414: [FLINK-29771] Fix flaky rollback test

2022-10-26 Thread GitBox


darenwkt opened a new pull request, #414:
URL: https://github.com/apache/flink-kubernetes-operator/pull/414

   
   
   ## What is the purpose of the change
   
   This PR fixes a flaky unit test in RollbackTest that has been failing Github 
Actions in the past.
   
   
   ## Brief change log
   
   Increase the `deployment.readiness.timeout` used in the rollback test to 
simulate a rollback. The config is increased from 100ms to 400ms. 
   
   When it is set to 100ms, the Github Actions environment takes longer than 
100ms to run lines 334 to 339, triggering an undesired auto-rollback and failing 
the unit test. The test may not break in local development, where the environment 
can execute the code faster; however, it can easily be replicated locally by 
setting the config to 10ms.
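
   For illustration, the kind of adjustment involved (a sketch only: 
`deployment.readiness.timeout` is the key referenced above, shown here with an 
assumed `kubernetes.operator.` prefix; the actual ConfigOption constant and test 
wiring in the operator code base may differ):

   ```java
   import org.apache.flink.configuration.Configuration;

   class RollbackTestConfig {
       // Widen the readiness timeout so slower CI runners do not trip an
       // unintended auto-rollback before the test reaches its assertions.
       static Configuration widenReadinessTimeout() {
           Configuration conf = new Configuration();
           conf.setString("kubernetes.operator.deployment.readiness.timeout", "400 ms");
           return conf;
       }
   }
   ```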
   
   ## Verifying this change
   
   
   
   This change is already covered by existing tests, such as RollbackTest.
   
   It has been tested locally and in the fork's Github Actions.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): ( no)
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
( no)
 - Core observer or reconciler logic that is regularly executed: ( no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? ( no)
 - If yes, how is the feature documented? (not applicable )
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] snuyanzin commented on pull request #18927: [FLINK-26390] Use try resourses for CliClient#executeFile

2022-10-26 Thread GitBox


snuyanzin commented on PR #18927:
URL: https://github.com/apache/flink/pull/18927#issuecomment-1292626975

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] darenwkt commented on pull request #409: [FLINK-29708] Convert CR Error field to JSON with enriched exception …

2022-10-26 Thread GitBox


darenwkt commented on PR #409:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/409#issuecomment-1292538966

   > > Hi @darenwkt ideally we should be able to control the stack-trace per CR:
   > > ```
   > >   flinkConfiguration:
   > >     taskmanager.numberOfTaskSlots: "2"
   > >     kubernetes.operator.exception.stacktrace.enabled: "true"
   > > ```
   > > 
   > > With the current approach this is an operator-wide config only. @gyfora 
wdyt?
   > 
   > I am not completely sure about this. It might be better to let operator 
admins decide on this setting, at least the size limits for sure.
   > 
   > In any case there is some inconsistency in the current implementation: if 
something is put in the `KubernetesOperatorConfigurations` the ConfigOption 
should have the annotation `SECTION_SYSTEM` not dynamic (those are for options 
that can dynamically be overridden from the user flink config as Matyas 
suggested).
   
   @gyfora @morhidi 
   
   Thanks for confirming, I will push a new revision with the config placed in 
the SYSTEM section.
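
   For context, a sketch of what a SYSTEM-section option could look like (the 
key string is the one quoted in the discussion above; the class name, default 
and description are illustrative, and the section annotation itself is omitted 
because its exact form is operator-internal):

   ```java
   import org.apache.flink.configuration.ConfigOption;
   import org.apache.flink.configuration.ConfigOptions;

   class OperatorExceptionOptions {
       // An operator-wide option rather than a per-CR dynamic one.
       static final ConfigOption<Boolean> EXCEPTION_STACK_TRACE_ENABLED =
               ConfigOptions.key("kubernetes.operator.exception.stacktrace.enabled")
                       .booleanType()
                       .defaultValue(false)
                       .withDescription(
                               "Whether to include the full stack trace in the CR error field.");
   }
   ```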


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] darenwkt commented on pull request #409: [FLINK-29708] Convert CR Error field to JSON with enriched exception …

2022-10-26 Thread GitBox


darenwkt commented on PR #409:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/409#issuecomment-1292536844

   
   
   
   > > Hi @morhidi,
   > > I have updated the stackTrace to be a string and added a ConfigOption to 
limit the length of the stackTrace string.
   > > Regarding the concern on deserializing the error status into a valid 
json field, I have tested that deserialization back into the 
FlinkResourceException class works. My testing was done as follows:
   > > 
   > > 1. `kubectl get flinkdeployment basic-example -o yaml > test.yaml`
   > > 2. Tested deserializing the `test.yaml` back into FlinkResourceException 
class using the following code:
   > > 
   > > ```
   > > @Test
   > > public void testYamlDeserialization() throws IOException {
   > > 
   > >     Yaml yaml = new Yaml();
   > >     InputStream inputStream =
   > >             this.getClass().getClassLoader().getResourceAsStream("test.yaml");
   > >     Map<String, Object> obj = yaml.load(inputStream);
   > >     System.out.println("deserialized yaml: " + obj);
   > > 
   > >     ObjectMapper mapper = new ObjectMapper();
   > >     FlinkResourceException ex =
   > >             mapper.readValue(
   > >                     (String) ((Map<String, Object>) obj.get("status")).get("error"),
   > >                     FlinkResourceException.class);
   > >     System.out.println("deserialized json: " + ex);
   > > }
   > > ```
   > > 
   > > 3. Results of System.out.println are:
   > > 
   > > ```
   > > deserialized yaml: {apiVersion=flink.apache.org/v1beta1, 
kind=FlinkDeployment, 
metadata={annotations={kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"flink.apache.org/v1beta1","kind":"FlinkDeployment","metadata":{"annotations":{},"name":"basic-example","namespace":"default"},"spec":{"flinkConfiguration":{"taskmanager.numberOfTaskSlots":"2"},"flinkVersion":"v1_15","image":"flink:1.15","job":{"jarURI":"local:///opt/flink/examples/streaming/StateMachineExample.jar","parallelism":2,"upgradeMode":"stateless"},"jobManager":{"resource":{"cpu":1,"memory":"2048m"}},"podTemplate":{"apiVersion":"v1","kind":"Pod","metadata":{"name":"pod-template"},"spec":{"containers":[{"name":"flink-main-container","volumeMounts":[{"mountPath":"/opt/flink/log","name":"flink-logs"},{"mountPath":"/opt/flink/downloads","name":"downloads"}]}]}},"serviceAccount":"flink","taskManager":{"resource":{"cpu":1,"memory":"2048m"
   > > ```
   > > 
   > > ```
   > > deserialized json: 
FlinkResourceException(type=org.apache.flink.kubernetes.operator.exception.ReconciliationException,
 message=org.apache.flink.client.deployment.ClusterDeploymentException: Could 
not create Kubernetes cluster "basic-example"., stackTraceElements=null, 
additionalMetadata=null, 
throwableList=[FlinkResourceException(type=org.apache.flink.client.deployment.ClusterDeploymentException,
 message=Could not create Kubernetes cluster "basic-example"., 
stackTraceElements=null, additionalMetadata=null, throwableList=null), 
FlinkResourceException(type=org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException,
 message=Failure executing: POST at: 
https://10.96.0.1/apis/apps/v1/namespaces/default/deployments. Message: 
Deployment.apps "basic-example" is invalid: 
[spec.template.spec.containers[0].volumeMounts[0].name: Not found: 
"flink-logs", spec.template.spec.containers[0].volumeMounts[1].name: Not found: 
"downloads"]. Received status: Status(ap
 iVersion=v1, code=422, 
details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].volumeMounts[0].name,
 message=Not found: "flink-logs", reason=FieldValueNotFound, 
additionalProperties={}), 
StatusCause(field=spec.template.spec.containers[0].volumeMounts[1].name, 
message=Not found: "downloads", reason=FieldValueNotFound, 
additionalProperties={})], group=apps, kind=Deployment, name=basic-example, 
retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, 
message=Deployment.apps "basic-example" is invalid: 
[spec.template.spec.containers[0].volumeMounts[0].name: Not found: 
"flink-logs", spec.template.spec.containers[0].volumeMounts[1].name: Not found: 
"downloads"], metadata=ListMeta(_continue=null, remainingItemCount=null, 
resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, 
status=Failure, additionalProperties={})., stackTraceElements=null, 
additionalMetadata=null, throwableList=null)])
   > > ```
   > > 
   > > As a result, I can confirm that deserialization of the JSON works. The 
question now is whether we are OK with the current format in which the error 
field is shown in the CR yaml, which includes the escape field. I tried to search 

[jira] [Commented] (FLINK-29609) Clean up jobmanager deployment on suspend after recording savepoint info

2022-10-26 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624703#comment-17624703
 ] 

Gyula Fora commented on FLINK-29609:


This feature is only enabled for Application clusters. The current behaviour is 
fully intentional: it allows job manager access after job completion / shutdown / 
failure so that the operator can better track what's going on.

The improvement we are looking for here is that once the operator has actually 
recorded the final state in the CR status, we can shut down the resources, as we 
don't need to check anymore.

> Clean up jobmanager deployment on suspend after recording savepoint info
> 
>
> Key: FLINK-29609
> URL: https://issues.apache.org/jira/browse/FLINK-29609
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Sriram Ganesh
>Priority: Major
> Fix For: kubernetes-operator-1.3.0
>
>
> Currently, in case of suspending with a savepoint, the jobmanager pod will 
> linger there forever after cancelling the job.
> This is currently used to ensure consistency in case the 
> operator/cancel-with-savepoint operation fails.
> Once we are sure, however, that the savepoint has been recorded and the job is 
> shut down, we should clean up all the resources. Optionally we can make this 
> configurable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 7:24 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think what we can do is:
 # (Look below for examples from other databases.) Set a default data type when 
the empty array is created. Two options for the default data type:
 ## Use or create a new empty/void data type.
 ## Use an existing data type, e.g. Integer.
 # Teach operations such as `COALESCE` how to coerce from the default data type 
to the desired data type (a separate PR).

I think step 1 is sufficient to unblock users who need to create an empty array 
for the time being, but step 2 is required to allow a seamless SQL experience. 
Without step 2, users will most likely have to manually convert their empty 
array's data type.

With step 1:

 
{code:java}
SELECT COALESCE(int_column, CAST(ARRAY[] AS ARRAY<INT>)){code}
 

With step 1 and 2:
{code:java}
SELECT COALESCE(int_column, ARRAY[]) {code}
*Default Data Type*
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think what we can do is:
 # (Look below for examples from other databases) Set a default data type when 
the empty array is created: Two options for the default data type:
 ## Use or create a new empty/void data type.
 ## Use an existing data type i.e. Integer.
 # Teach operations such as `COALESCE` how to coerce from the default data type 
to the desired data type (a separate PR).

Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, 
> image-2022-10-26-14-42-57-579.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 7:20 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think what we can do is:
 # (Look below for examples from other databases) Set a default data type when 
the empty array is created: Two options for the default data type:
 ## Use or create a new empty/void data type.
 ## Use an existing data type i.e. Integer.
 # Teach operations such as `COALESCE` how to coerce from the default data type 
to the desired data type (a separate PR).

Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should try 
using that.
2. If we have no context, we use a default data type.

Path #1 - 
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, 
> image-2022-10-26-14-42-57-579.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-29771) Fix flaky RollbackTest

2022-10-26 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-29771:
--

Assignee: Daren Wong  (was: Gyula Fora)

> Fix flaky RollbackTest
> --
>
> Key: FLINK-29771
> URL: https://issues.apache.org/jira/browse/FLINK-29771
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.3.0
>Reporter: Gyula Fora
>Assignee: Daren Wong
>Priority: Blocker
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] gyfora closed pull request #413: [hotfix] Temporarily disable RollbackTest

2022-10-26 Thread GitBox


gyfora closed pull request #413: [hotfix] Temporarily disable RollbackTest
URL: https://github.com/apache/flink-kubernetes-operator/pull/413


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #413: [hotfix] Temporarily disable RollbackTest

2022-10-26 Thread GitBox


gyfora commented on PR #413:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/413#issuecomment-1292509377

   Daren Wong found the probable cause; he will fix it properly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 6:43 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should try 
using that.
2. If we have no context, we use a default data type.

Path #1 - 
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

Spark:

!image-2022-10-26-14-42-08-468.png|width=505,height=112!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!

!image-2022-10-26-14-42-57-579.png|width=373,height=125!
They use integer datatype


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should try 
using that.
2. If we have no context, we use a default data type.

Path #1 - 
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!!Screen Shot 
2022-10-25 at 10.50.42 PM.png|width=761,height=148!
They use integer datatype

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, 
> image-2022-10-26-14-42-57-579.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-16073) Translate "State & Fault Tolerance" pages into Chinese

2022-10-26 Thread Konstantin Knauf (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624685#comment-17624685
 ] 

Konstantin Knauf commented on FLINK-16073:
--

[~fuping.wang] Done

> Translate "State & Fault Tolerance" pages into Chinese
> --
>
> Key: FLINK-16073
> URL: https://issues.apache.org/jira/browse/FLINK-16073
> Project: Flink
>  Issue Type: New Feature
>  Components: chinese-translation, Documentation
>Affects Versions: 1.11.0
>Reporter: Yu Li
>Assignee: fuping.wang
>Priority: Major
>  Labels: auto-unassigned
> Fix For: 1.17.0
>
>
> Translate all "State & Fault Tolerance" related pages into Chinese, including 
> pages under `docs/dev/stream/state/` and `docs/ops/state`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-16073) Translate "State & Fault Tolerance" pages into Chinese

2022-10-26 Thread Konstantin Knauf (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Knauf reassigned FLINK-16073:


Assignee: fuping.wang

> Translate "State & Fault Tolerance" pages into Chinese
> --
>
> Key: FLINK-16073
> URL: https://issues.apache.org/jira/browse/FLINK-16073
> Project: Flink
>  Issue Type: New Feature
>  Components: chinese-translation, Documentation
>Affects Versions: 1.11.0
>Reporter: Yu Li
>Assignee: fuping.wang
>Priority: Major
>  Labels: auto-unassigned
> Fix For: 1.17.0
>
>
> Translate all "State & Fault Tolerance" related pages into Chinese, including 
> pages under `docs/dev/stream/state/` and `docs/ops/state`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 6:29 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should try 
using that.
2. If we have no context, we use a default data type.

Path #1 - 
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

!Screen Shot 2022-10-26 at 2.28.49 PM.png|width=957,height=340!

They use unknown datatype

BigQuery:
!Screen Shot 2022-10-25 at 10.50.47 PM.png|width=374,height=158!!Screen Shot 
2022-10-25 at 10.50.42 PM.png|width=761,height=148!
They use integer datatype


was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should try 
using that.
2. If we have no context, we use a default data type.

Path #1 - 
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

They use unknown datatype

BigQuery:

They use integer datatype

a similar query in Trino (Presto) and BigQuery and they use a data type Integer 
as the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, 
> Screen Shot 2022-10-26 at 2.28.49 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624154#comment-17624154
 ] 

Eric Xiao edited comment on FLINK-20578 at 10/26/22 6:27 PM:
-

Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should try 
using that.
2. If we have no context, we use a default data type.

Path #1 - 
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested in other query engines and these are the results I got:

Trino:

They use unknown datatype

BigQuery:

They use integer datatype

a similar query in Trino (Presto) and BigQuery and they use a data type Integer 
as the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 



was (Author: JIRAUSER295489):
Hi I wanted to get more involved in contributing to the Flink project and found 
this starter task - my team is working with the Table / SQL APIs, so I thought 
this would be a good beginning task to work on :). [~surahman] are you still 
working on this issue? If not I would love to take over.

> If Flink support empty array, which data type of elements in array should be 
> ? Does it cause new problems.
[~pensz] I think there are two paths:
1. If we are given more context on what the array type should be, we should try 
using that.
2. If we have no context, we use a default data type.

Path #1 - 
I can foresee queries such as `SELECT COALESCE(empty_str_column, ARRAY[])` where 
we could infer the data should be of string type and try to return that.

Path #2 - Default Data Type
I believe the query in the issue would qualify as a query with no context. I 
tested a similar query in Trino (Presto) and BigQuery and they use a data type 
Integer as the data type. This could be a good default behaviour?

!Screen Shot 2022-10-25 at 10.50.42 PM.png!
!Screen Shot 2022-10-25 at 10.50.47 PM.png! 
!Screen Shot 2022-10-25 at 11.01.06 PM.png! 


> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29609) Clean up jobmanager deployment on suspend after recording savepoint info

2022-10-26 Thread Sriram Ganesh (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624665#comment-17624665
 ] 

Sriram Ganesh commented on FLINK-29609:
---

Sure [~Miuler]. I have started exploring the issue. Just in case someone can 
help me with this information: is this issue happening in both application and 
session mode?

> Clean up jobmanager deployment on suspend after recording savepoint info
> 
>
> Key: FLINK-29609
> URL: https://issues.apache.org/jira/browse/FLINK-29609
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Sriram Ganesh
>Priority: Major
> Fix For: kubernetes-operator-1.3.0
>
>
> Currently, in case of suspending with a savepoint, the jobmanager pod will 
> linger there forever after cancelling the job.
> This is currently used to ensure consistency in case the 
> operator/cancel-with-savepoint operation fails.
> Once we are sure, however, that the savepoint has been recorded and the job is 
> shut down, we should clean up all the resources. Optionally we can make this 
> configurable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] Samrat002 commented on pull request #21154: [hotfix] Add icon for Flink in IntellijIdea and Toolbox

2022-10-26 Thread GitBox


Samrat002 commented on PR #21154:
URL: https://github.com/apache/flink/pull/21154#issuecomment-1292400696

   this is really cool  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-24119) KafkaITCase.testTimestamps fails due to "Topic xxx already exist"

2022-10-26 Thread Mason Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624645#comment-17624645
 ] 

Mason Chen commented on FLINK-24119:


[~martijnvisser]  Sure, let me take a look later in the week

> KafkaITCase.testTimestamps fails due to "Topic xxx already exist"
> -
>
> Key: FLINK-24119
> URL: https://issues.apache.org/jira/browse/FLINK-24119
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0, 1.15.0, 1.16.0
>Reporter: Xintong Song
>Assignee: Qingsheng Ren
>Priority: Critical
>  Labels: auto-deprioritized-critical, test-stability
> Fix For: 1.16.1
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23328=logs=c5f0071e-1851-543e-9a45-9ac140befc32=15a22db7-8faa-5b34-3920-d33c9f0ca23c=7419
> {code}
> Sep 01 15:53:20 [ERROR] Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 162.65 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.kafka.KafkaITCase
> Sep 01 15:53:20 [ERROR] testTimestamps  Time elapsed: 23.237 s  <<< FAILURE!
> Sep 01 15:53:20 java.lang.AssertionError: Create test topic : tstopic failed, 
> org.apache.kafka.common.errors.TopicExistsException: Topic 'tstopic' already 
> exists.
> Sep 01 15:53:20   at org.junit.Assert.fail(Assert.java:89)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.createTestTopic(KafkaTestEnvironmentImpl.java:226)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironment.createTestTopic(KafkaTestEnvironment.java:112)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestBase.createTestTopic(KafkaTestBase.java:212)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaITCase.testTimestamps(KafkaITCase.java:191)
> Sep 01 15:53:20   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Sep 01 15:53:20   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Sep 01 15:53:20   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Sep 01 15:53:20   at java.lang.reflect.Method.invoke(Method.java:498)
> Sep 01 15:53:20   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Sep 01 15:53:20   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> Sep 01 15:53:20   at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Sep 01 15:53:20   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] reswqa commented on pull request #21137: [FLINK-29234][runtime] JobMasterServiceLeadershipRunner handle leader event in a separate executor to avoid dead lock

2022-10-26 Thread GitBox


reswqa commented on PR #21137:
URL: https://github.com/apache/flink/pull/21137#issuecomment-1292379575

   > Thanks @reswqa for this PR. I'm wondering how executing the leadership 
granting/revocation call from within another thread would help fixing the 
issue. The locks might be still acquired concurrently in opposite orders 
leading to the deadlock situation.
   > 
   > The usecase that was described in 
[FLINK-29234](https://issues.apache.org/jira/browse/FLINK-29234) essentially 
happens because the Dispatcher is stopped (which, as a consequence, would stop 
`JobMasterServiceLeadershipRunner`) while the 
`JobMasterServiceLeadershipRunner` is granted leadership causing the locks to 
be acquired in the opposite order.
   > 
   > I think the problem is that we're still trying to acquire the lock in 
[JobMasterServiceLeadershipRunner#runIfStateRunning:453](https://github.com/apache/flink/blob/bfe4f9cc3d67d37a2258ab4226d70b6a7d24f22c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMasterServiceLeadershipRunner.java#L453)
 even though the `JobMasterServiceLeadershipRunner` is already switched to 
`STOPPED` state. I'm wondering whether we could make 
`JobMasterServiceLeadershipRunner#state` volatile and check the instance being 
in `RUNNING` state outside of the lock. But this wouldn't solve the issue 
entirely because there's still a slight chance that the state changes after the 
state check is processed but before entering the lock... Essentially, we would 
   > need another lock guarding the state field.
   
   Thanks @XComp for your feedback. I agree that the reason for the deadlock is 
that the `JobMasterServiceLeadershipRunner` was granted leadership while it was 
being closed. 
   At the beginning, I tried to move the 
`leaderContender.grantLeadership(issuedLeaderSessionID)` call in the 
`onGrantLeadership` method outside the lock of `DefaultLeaderElectionService` to 
avoid nesting locks. But later I found that there is another layer of lock 
nesting: when executing `isLeader` and other methods, the curator framework 
takes a lock itself, and the deadlock can still be reproduced.
   So I began to consider whether there was a way to completely avoid the 
nesting of locks when calling `grantLeadership` on the 
`JobMasterServiceLeadershipRunner`. Inspired by `ResourceManagerServiceImpl` 
(which executes leader events in a separate single-threaded executor), I thought 
the same approach could be used here. When we hand a piece of code to another 
thread for execution, the locks held by the current thread do not move to the 
other thread, so the `grantLeadership` logic in 
`JobMasterServiceLeadershipRunner` gets a chance to run free of the locks that 
may be held by the caller (whether from the leader election service or from 
Curator).
   Let us denote the lock in `DefaultLeaderElectionService` as A and the lock in 
`JobMasterServiceLeadershipRunner` as B.
   FLINK-29234's case is as follows: 
   Thread t1 invokes `JobMasterServiceLeadershipRunner#closeAsync`, holds lock 
B, and then attempts to call `DefaultLeaderElectionService#stop` to obtain lock 
A.
   Thread t2 invokes `DefaultLeaderElectionService#onGrantLeadership`, holds 
lock A, and then tries to call `JobMasterServiceLeadershipRunner#grantLeadership` 
to obtain lock B.
   After introducing an executor to process `grantLeadership`, t2 calling 
`onGrantLeadership` holds lock A, moves the code logic into the executor, and 
then releases lock A without ever attempting to acquire lock B. It is the 
executor thread's responsibility to obtain B and do the real work.
   If I have missed something, please tell me. I'm willing to listen to your 
opinion, because I'm not particularly familiar with this part of the code.
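
   To make the hand-off concrete, a minimal sketch of the pattern described 
above (class, lock and method names are illustrative, not the actual Flink 
code):

   ```java
   import java.util.UUID;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;

   class LeaderEventHandOff {
       private final Object lockB = new Object(); // the runner-internal lock
       private final ExecutorService leaderEventExecutor =
               Executors.newSingleThreadExecutor();

       // Called while the election service holds its own lock A. Instead of
       // acquiring lock B on the caller's thread (nesting A -> B), the real
       // work is scheduled on a separate thread, so A is released before B is
       // ever taken and the A/B vs B/A deadlock cannot form.
       void grantLeadership(UUID sessionId) {
           leaderEventExecutor.execute(
                   () -> {
                       synchronized (lockB) {
                           // start the job master services for this session
                       }
                   });
       }
   }
   ```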
   
   
   
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29711) Topic notification not present in metadata after 60000 ms.

2022-10-26 Thread Mason Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624591#comment-17624591
 ] 

Mason Chen commented on FLINK-29711:


Can you provide exact details of where you added the logs? I can reproduce this 
error message in a unit test.

> Topic notification not present in metadata after 60000 ms.
> --
>
> Key: FLINK-29711
> URL: https://issues.apache.org/jira/browse/FLINK-29711
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4, 1.14.6
>Reporter: Durgesh Mishra
>Priority: Major
>
> Failed to send data to Kafka null with 
> FlinkKafkaInternalProducer\{transactionalId='null', inTransaction=false, 
> closed=false}
> at 
> org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.throwException(KafkaWriter.java:405)
> at 
> org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.lambda$onCompletion$0(KafkaWriter.java:391)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
> at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:353)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:809)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:761)
> at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
> at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
> at java.base/java.lang.Thread.run(Unknown Source)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] SwimSweet commented on pull request #20779: [FLINK-28915] Flink Native k8s mode jar localtion support s3 schema.

2022-10-26 Thread GitBox


SwimSweet commented on PR #20779:
URL: https://github.com/apache/flink/pull/20779#issuecomment-1292306760

   @wangyang0918 I will work on it to support Yarn application mode and 
standalone mode.
   I found that Yarn application mode already has similar features: it provides 
the yarn.provided.lib.dirs and yarn.provided.usrlib.dir parameters. But it seems 
that this feature only supports the Hadoop file system?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29609) Clean up jobmanager deployment on suspend after recording savepoint info

2022-10-26 Thread Hector Miuler Malpica Gallegos (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624585#comment-17624585
 ] 

Hector Miuler Malpica Gallegos commented on FLINK-29609:


Please take into account stateless batch processes, which, once they have 
finished processing, should clean up all their resources.

> Clean up jobmanager deployment on suspend after recording savepoint info
> 
>
> Key: FLINK-29609
> URL: https://issues.apache.org/jira/browse/FLINK-29609
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Sriram Ganesh
>Priority: Major
> Fix For: kubernetes-operator-1.3.0
>
>
> Currently, in case of suspending with a savepoint, the jobmanager pod will 
> linger there forever after cancelling the job.
> This is currently used to ensure consistency in case the 
> operator/cancel-with-savepoint operation fails.
> Once we are sure, however, that the savepoint has been recorded and the job is 
> shut down, we should clean up all the resources. Optionally we can make this 
> configurable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29762) Can not create a standalone cluster with reactive mode using the operator

2022-10-26 Thread yuvipanda (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624581#comment-17624581
 ] 

yuvipanda commented on FLINK-29762:
---

Ah interesting! I'm using the portable runner with Beam to run Python 
pipelines, and am not entirely sure how to run it as an application as I'd 
imagine I'd have to submit a jar. But perhaps that belongs in the beam issue 
tracker?

> Can not create a standalone cluster with reactive mode using the operator
> -
>
> Key: FLINK-29762
> URL: https://issues.apache.org/jira/browse/FLINK-29762
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
> Environment: Kubernetes Version 1.22 on EKS.
> Flink Operator veresion 1.2.0
> Flink Veresion 1.15 (errors in 1.14 too)
>Reporter: yuvipanda
>Priority: Major
>
> I'm trying to create a minimal running flink cluster with reactive scaling 
> using the kubernetes operator (running v1.2.0), with the following YAML:
>  
> {{
> kind: FlinkDeployment
> metadata:
>   name: test-flink-cluster
> spec:
>   flinkConfiguration:
>     scheduler-mode: reactive
>   flinkVersion: v1_15
>   image: flink:1.15
>   jobManager:
>     replicas: 1
>     resource:
>       cpu: 0.2
>       memory: 1024m
>   mode: standalone
>   serviceAccount: flink
>   taskManager:
>     replicas: 1
>     resource:
>       cpu: 0.2
>       memory: 1024m}}
>  
> However, this causes the jobmanager to crash with the following:
>  
> {{sed: couldn't open temporary file /opt/flink/conf/sedLX7Jx8: Read-only file 
> system}}
> {{sed: couldn't open temporary file /opt/flink/conf/sed1vva8t: Read-only file 
> system}}
> {{/docker-entrypoint.sh: line 73: /opt/flink/conf/flink-conf.yaml: Read-only 
> file system}}
> {{/docker-entrypoint.sh: line 89: /opt/flink/conf/flink-conf.yaml.tmp: 
> Read-only file system}}
> {{Starting Job Manager}}
> {{Starting standalonesession as a console application on host 
> test-flink-cluster-58cd584fdd-xwbtf.}}
> {{2022-10-25 18:32:00,422 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - 
> }}
> {{2022-10-25 18:32:00,510 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  
> Preconfiguration: }}
> {{2022-10-25 18:32:00,512 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - }}
> {{RESOURCE_PARAMS extraction logs:}}
> {{jvm_params: -Xmx469762048 -Xms469762048 -XX:MaxMetaspaceSize=268435456}}
> {{dynamic_configs: -D jobmanager.memory.off-heap.size=134217728b -D 
> jobmanager.memory.jvm-overhead.min=201326592b -D 
> jobmanager.memory.jvm-metaspace.size=268435456b -D 
> jobmanager.memory.heap.size=469762048b -D 
> jobmanager.memory.jvm-overhead.max=201326592b}}
> {{logs: WARNING: sun.reflect.Reflection.getCallerClass is not supported. This 
> will impact performance.}}
> {{INFO  [] - Loading configuration property: blob.server.port, 6124}}
> {{INFO  [] - Loading configuration property: 
> kubernetes.jobmanager.annotations, 
> flinkdeployment.flink.apache.org/generation:1}}
> {{INFO  [] - Loading configuration property: kubernetes.jobmanager.replicas, 
> 1}}
> {{INFO  [] - Loading configuration property: scheduler-mode, reactive}}
> {{INFO  [] - Loading configuration property: 
> "kubernetes.operator.metrics.reporter.prom.port", ""}}
> {{INFO  [] - Loading configuration property: jobmanager.rpc.address, 
> test-flink-cluster.default}}
> {{INFO  [] - Loading configuration property: kubernetes.taskmanager.cpu, 0.2}}
> {{INFO  [] - Loading configuration property: "prometheus.io/port", ""}}
> {{INFO  [] - Loading configuration property: kubernetes.service-account, 
> flink}}
> {{INFO  [] - Loading configuration property: kubernetes.cluster-id, 
> test-flink-cluster}}
> {{INFO  [] - Loading configuration property: kubernetes.container.image, 
> flink:1.15}}
> {{INFO  [] - Loading configuration property: parallelism.default, 2}}
> {{INFO  [] - Loading configuration property: kubernetes.namespace, default}}
> {{INFO  [] - Loading configuration property: taskmanager.numberOfTaskSlots, 
> 2}}
> {{INFO  [] - Loading configuration property: 
> kubernetes.rest-service.exposed.type, ClusterIP}}
> {{INFO  [] - Loading configuration property: "prometheus.io/scrape", "true"}}
> {{INFO  [] - Loading configuration property: taskmanager.memory.process.size, 
> 1024m}}
> {{INFO  [] - Loading configuration property: 
> "kubernetes.operator.metrics.reporter.prom.class", 
> "org.apache.flink.metrics.prometheus.PrometheusReporter"}}
> {{INFO  [] - Loading configuration property: web.cancel.enable, false}}
> {{INFO  [] - Loading configuration property: execution.target, remote}}
> {{INFO  [] - Loading configuration property: 

[jira] [Commented] (FLINK-29364) Root cause of Exceptions thrown in the SourceReader start() method gets "swallowed".

2022-10-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624567#comment-17624567
 ] 

Lijie Wang commented on FLINK-29364:


-> But the execution falls through to 
[Task.java#L780|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L780]
  and discards the root cause of canceling the source invokable without 
recording the actual reason.

Is this really the root cause? I think the actual cause has already been 
recorded when {{transitionState}} is called 
([Task.java#L778|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L778]
 and 
[Task.java#L1090|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L1090]).
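
For illustration, the pattern under discussion (a sketch only, not the actual 
Task.java code): the Throwable caught around invoke() has to travel with the 
state transition, otherwise only a generic cancellation reason survives.

{code:java}
class FailureCauseSketch {
    enum State { RUNNING, FAILED }

    private volatile State state = State.RUNNING;
    private volatile Throwable failureCause;

    void run(Runnable invokable) {
        try {
            invokable.run();
        } catch (Throwable t) {
            // record the original exception as part of the transition so it
            // is not replaced later by a generic "task was cancelled" reason
            transitionState(State.FAILED, t);
        }
    }

    private void transitionState(State target, Throwable cause) {
        this.failureCause = cause;
        this.state = target;
    }
}
{code}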

> Root cause of Exceptions thrown in the SourceReader start() method gets 
> "swallowed".
> 
>
> Key: FLINK-29364
> URL: https://issues.apache.org/jira/browse/FLINK-29364
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.15.2
>Reporter: Alexander Fedulov
>Priority: Major
>
> If an exception is thrown in the {_}SourceReader{_}'s _start()_ method, its 
> root cause does not get captured.
> The details are still available here: 
> [Task.java#L758|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L758]
> But the execution falls through to 
> [Task.java#L780|https://github.com/apache/flink/blob/bccecc23067eb7f18e20bade814be73393401be5/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L780]
>   and discards the root cause of
> canceling the source invokable without recording the actual reason.
>  
> Hot to reproduce: 
> [DataGeneratorSourceITCase.java#L117|https://github.com/afedulov/flink/blob/3df7669fcc6ba08c5147195b80cc97ac1481ec8c/flink-tests/src/test/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceITCase.java#L117]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-20578) Cannot create empty array using ARRAY[]

2022-10-26 Thread Saad Ur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624549#comment-17624549
 ] 

Saad Ur Rahman commented on FLINK-20578:


Hi [~eric.xiao], it's all yours - have fun ;). I was unable to establish 
traction in the community for a solution.
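
For whoever picks this up: until {{ARRAY[]}} is supported, one workaround sketch (the function name {{EMPTY_INT_ARRAY}} is made up for illustration) is a scalar UDF that returns a typed empty array, which also sidesteps the question of which element type an empty literal should get:

{code:java}
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;

public class EmptyArrayWorkaround {

    /** Returns an empty ARRAY<INT>; the hint pins down the SQL type. */
    public static class EmptyIntArray extends ScalarFunction {
        public @DataTypeHint("ARRAY<INT>") Integer[] eval() {
            return new Integer[0];
        }
    }

    public static void main(String[] args) {
        TableEnvironment env = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        env.createTemporarySystemFunction("EMPTY_INT_ARRAY", EmptyIntArray.class);
        env.executeSql("SELECT EMPTY_INT_ARRAY()").print();
    }
}
{code}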

> Cannot create empty array using ARRAY[]
> ---
>
> Key: FLINK-20578
> URL: https://issues.apache.org/jira/browse/FLINK-20578
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: 1.17.0
>
> Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 
> 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png
>
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error 
> message.
> Is that the expected behavior?
> How can users create empty arrays?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] LadyForest commented on a diff in pull request #20652: [FLINK-22315][table] Support add/modify column/constraint/watermark for ALTER TABLE statement

2022-10-26 Thread GitBox


LadyForest commented on code in PR #20652:
URL: https://github.com/apache/flink/pull/20652#discussion_r1005830753


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/operations/AlterTableSchemaUtil.java:
##
@@ -0,0 +1,516 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.operations;
+
+import org.apache.flink.sql.parser.ddl.SqlAlterTableAdd;
+import org.apache.flink.sql.parser.ddl.SqlAlterTableModify;
+import org.apache.flink.sql.parser.ddl.SqlAlterTableSchema;
+import org.apache.flink.sql.parser.ddl.SqlTableColumn;
+import org.apache.flink.sql.parser.ddl.SqlWatermark;
+import org.apache.flink.sql.parser.ddl.constraint.SqlTableConstraint;
+import org.apache.flink.sql.parser.ddl.position.SqlTableColumnPosition;
+import org.apache.flink.table.api.Schema;
+import org.apache.flink.table.api.ValidationException;
+import org.apache.flink.table.catalog.ResolvedCatalogTable;
+import org.apache.flink.table.expressions.Expression;
+import org.apache.flink.table.expressions.SqlCallExpression;
+import org.apache.flink.table.planner.calcite.FlinkTypeFactory;
+import org.apache.flink.table.types.AbstractDataType;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.utils.TypeConversions;
+
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.sql.SqlCharStringLiteral;
+import org.apache.calcite.sql.SqlDataTypeSpec;
+import org.apache.calcite.sql.SqlIdentifier;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.validate.SqlValidator;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Consumer;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+/** Util to convert {@link SqlAlterTableSchema} with source table to generate new {@link Schema}. */
+public class AlterTableSchemaUtil {
+
+    private final SqlValidator sqlValidator;
+    private final Consumer<SqlTableConstraint> validateTableConstraint;
+    private final Function<SqlNode, String> escapeExpression;
+
+    AlterTableSchemaUtil(
+            SqlValidator sqlValidator,
+            Function<SqlNode, String> escapeExpression,
+            Consumer<SqlTableConstraint> validateTableConstraint) {
+        this.sqlValidator = sqlValidator;
+        this.validateTableConstraint = validateTableConstraint;
+        this.escapeExpression = escapeExpression;
+    }
+
+    public Schema convertSchema(
+            SqlAlterTableSchema alterTableSchema, ResolvedCatalogTable originalTable) {
+        UnresolvedSchemaBuilder builder =
+                new UnresolvedSchemaBuilder(
+                        originalTable,
+                        (FlinkTypeFactory) sqlValidator.getTypeFactory(),
+                        sqlValidator,
+                        validateTableConstraint,
+                        escapeExpression);
+        AlterSchemaStrategy strategy = computeAlterSchemaStrategy(alterTableSchema);
+        List<SqlNode> columnPositions = alterTableSchema.getColumnPositions().getList();
+        builder.addOrModifyColumns(strategy, columnPositions);
+        alterTableSchema
+                .getWatermark()
+                .ifPresent(sqlWatermark -> builder.addOrModifyWatermarks(strategy, sqlWatermark));
+        alterTableSchema
+                .getFullConstraint()
+                .ifPresent(
+                        (pk) ->
+                                builder.addOrModifyPrimaryKey(
+                                        strategy, pk, columnPositions.isEmpty()));
+        return builder.build();
+    }
+
+    private static class UnresolvedSchemaBuilder {
+
+        List<String> sortedColumnNames = new ArrayList<>();
+        Map<String, Schema.UnresolvedColumn> columns = new HashMap<>();
+        Map<String, Schema.UnresolvedWatermarkSpec> watermarkSpecs = new HashMap<>();
+        @Nullable Schema.UnresolvedPrimaryKey primaryKey = null;
+
+        // Intermediate state
+        Map<String, RelDataType> physicalFieldNamesToTypes = new HashMap<>();
+        Map<String, RelDataType> metadataFieldNamesToTypes = new HashMap<>();
+        Map<String, RelDataType> computedFieldNamesToTypes = new HashMap<>();
+
+        Function 

[jira] [Created] (FLINK-29771) Fix flaky RollbackTest

2022-10-26 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-29771:
--

 Summary: Fix flaky RollbackTest
 Key: FLINK-29771
 URL: https://issues.apache.org/jira/browse/FLINK-29771
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.3.0
Reporter: Gyula Fora
Assignee: Gyula Fora






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] XComp commented on pull request #21137: [FLINK-29234][runtime] JobMasterServiceLeadershipRunner handle leader event in a separate executor to avoid dead lock

2022-10-26 Thread GitBox


XComp commented on PR #21137:
URL: https://github.com/apache/flink/pull/21137#issuecomment-1292206053

   Thanks @reswqa for this PR. I'm wondering how executing the leadership 
granting/revocation from within another thread would help fix the issue. The 
locks might still be acquired concurrently in opposite orders, leading to the 
deadlock situation.
   
   The use case described in FLINK-29234 essentially happens because the 
Dispatcher is stopped (which, as a consequence, would stop the 
`JobMasterServiceLeadershipRunner`) while the `JobMasterServiceLeadershipRunner` 
is granted leadership, causing the locks to be acquired in the opposite order.
   
   I think the problem is that we're still trying to acquire the lock in 
[JobMasterServiceLeadershipRunner#runIfStateRunning:453](https://github.com/apache/flink/blob/bfe4f9cc3d67d37a2258ab4226d70b6a7d24f22c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMasterServiceLeadershipRunner.java#L453)
 even though the `JobMasterServiceLeadershipRunner` has already switched to the 
`STOPPED` state. I'm wondering whether we could make 
`JobMasterServiceLeadershipRunner#state` volatile and check that the instance is 
in the `RUNNING` state outside of the lock. But this wouldn't solve the issue 
entirely, because there's still a slight chance that the state changes after the 
state check is processed but before entering the lock... :thinking: 
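   
   To illustrate that remaining window, a minimal sketch (names assumed; this is not the actual `JobMasterServiceLeadershipRunner` code) of a volatile pre-check outside the lock:
   
   ```java
   // Sketch only: the volatile pre-check avoids taking the lock when the runner
   // is already STOPPED, but the state can still flip between the check and the
   // lock acquisition, so the lock (and a re-check inside it) is still needed.
   class LeadershipRunnerSketch {
   
       enum State { RUNNING, STOPPED }
   
       private volatile State state = State.RUNNING;
       private final Object lock = new Object();
   
       void runIfStateRunning(Runnable action) {
           if (state != State.RUNNING) {
               return; // cheap early-out without locking
           }
           synchronized (lock) { // deadlock risk remains: we still acquire the lock
               if (state == State.RUNNING) {
                   action.run();
               }
           }
       }
   }
   ```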


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-29759) Cast type in LEFT JOIN

2022-10-26 Thread Alexandre Decuq (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17624525#comment-17624525
 ] 

Alexandre Decuq commented on FLINK-29759:
-

[~martijnvisser] noted, we will try with 1.15 and keep you updated.

> Cast type in LEFT JOIN
> --
>
> Key: FLINK-29759
> URL: https://issues.apache.org/jira/browse/FLINK-29759
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.14.5
>Reporter: Alexandre Decuq
>Priority: Critical
>
> Hello,
> I would like to use LEFT JOIN in order to implement a non-blocking join 
> between two tables (the relationship is optional).
> There is one specificity: the key does not have the same type on both sides 
> (STRING vs INT).
> Here is a snippet of the code:
> Avro input:
>  
> {code:java}
> {
>   "name": "CompanyBankAccountMessage",
>   "type": "record",
>   "namespace": "com.kyriba.dataproduct.core.model.input",
>   "fields": [ 
> {
>   "name": "data",
>   "type": {
> "fields": [
>   {
> "name": "CURRENCY_ID",
> "type": [
>   "null",
>   "string"
> ],
> "default": null,
>   },
>   ...
> ]
>   }
> }
>   ]
> }{code}
>  
> Avro output:
>  
> {code:java}
> {
>   "name": "CurrencyMessage",
>   "type": "record",
>   "namespace": "com.kyriba.dataproduct.core.model.input", 
>   "fields": [
> {
>   "name": "data",
>   "type": {
> "fields": [
>   {
> "name": "CURRENCY_ID",
> "type": "int"
>   },
>   ...
> ]
>   }
> }
>   ]
> }{code}
>  
> Sql query:
>  
> {code:java}
> SELECT ...
> FROM `my.input.COMPANY_BANK_ACCOUNT.v1.avro` as COMPANY_BANK_ACCOUNT
> LEFT JOIN `my.input.CURRENCY.v1.avro` as CURRENCY
> ON CAST(COMPANY_BANK_ACCOUNT.CURRENCY_ID as INT) = CURRENCY.CURRENCY_ID{code}
> I got this exception:
>  
>  
> {code:java}
> Conversion to relational algebra failed to preserve datatypes:
> validated type:
>   RecordType(BIGINT currencyUid, ...)
> converted type: 
>   RecordType(BIGINT currencyUid NOT NULL, ...)
> rel:
>   LogicalProject(currencyUid=[CAST($116.CURRENCY_ID):BIGINT NOT NULL], ...)
> LogicalJoin(condition=[=($11, $117)], joinType=[left])
>       LogicalTableScan(table=[[data-platform, core, 
> kyriba.flink-sql-test.core.cdc.COMPANY_BANK_ACCOUNT.v1.avro]])
>       LogicalProject(the_port_key=[$0], data=[$1], $f2=[$1.CURRENCY_ID])
>         LogicalTableScan(table=[[data-platform, core, 
> kyriba.flink-sql-test.core.cdc.CURRENCY.v1.avro]])
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:467){code}
> Did I do something wrong, or is this a bug?
>  
> Note: it works well with this inner join:
> {code:java}
> FROM `my.input.COMPANY_BANK_ACCOUNT.v1.avro` as COMPANY_BANK_ACCOUNT,
>  `my.input.CURRENCY.v1.avro` as CURRENCY
> WHERE CAST(COMPANY_BANK_ACCOUNT.CURRENCY_ID as INT) = CURRENCY.CURRENCY_ID 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] ericxiao251 commented on pull request #21077: [FlINK-29248] Add Scala Async Retry Strategies and ResultPredicates Helper Classes

2022-10-26 Thread GitBox


ericxiao251 commented on PR #21077:
URL: https://github.com/apache/flink/pull/21077#issuecomment-1292163461

   @lincoln-lil and @gaoyunhaii, would you folks be able to review this PR? 
Hopefully it is short and sweet :).
   
   I have also fixed some doc examples in Java that must have been written 
against old code, so that they show the current way to use the result and 
exception predicates.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] klion26 commented on a diff in pull request #21136: [FLINK-29095][state] Improve logging in SharedStateRegistry

2022-10-26 Thread GitBox


klion26 commented on code in PR #21136:
URL: https://github.com/apache/flink/pull/21136#discussion_r1005746682


##
flink-runtime/src/main/java/org/apache/flink/runtime/state/SharedStateRegistryImpl.java:
##
@@ -79,12 +79,12 @@ public SharedStateRegistryImpl(Executor 
asyncDisposalExecutor) {
 
 @Override
 public StreamStateHandle registerReference(
-SharedStateRegistryKey registrationKey,
-StreamStateHandle state,
-long checkpointID,
-boolean preventDiscardingCreatedCheckpoint) {
+final SharedStateRegistryKey registrationKey,
+final StreamStateHandle newHandle,
+final long checkpointID,
+final boolean preventDiscardingCreatedCheckpoint) {
 
-checkNotNull(state);
+checkNotNull(newHandle);

Review Comment:
   minor: do we need to add an error message here?
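   
   For example (message wording assumed), something along the lines of:
   
   ```java
   checkNotNull(newHandle, "The state handle should not be null for key " + registrationKey);
   ```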



##
flink-runtime/src/main/java/org/apache/flink/runtime/state/SharedStateRegistryImpl.java:
##
@@ -95,60 +95,78 @@ public StreamStateHandle registerReference(
             entry = registeredStates.get(registrationKey);
 
             if (entry == null) {
-                // Additional check that should never fail, because only state handles that are not
-                // placeholders should
-                // ever be inserted to the registry.
                 checkState(
-                        !isPlaceholder(state),
+                        !isPlaceholder(newHandle),
                         "Attempt to reference unknown state: " + registrationKey);
 
-                entry = new SharedStateEntry(state, checkpointID);
+                LOG.trace(
+                        "Registered new shared state {} under key {}.", newHandle, registrationKey);
+                entry = new SharedStateEntry(newHandle, checkpointID);
                 registeredStates.put(registrationKey, entry);
-                LOG.trace("Registered new shared state {} under key {}.", entry, registrationKey);
 
-            } else {
-                // Delete if this is a real duplicate.
-                // Note that task (backend) is not required to re-upload state
-                // if the confirmation notification was missing.
-                // However, it's also not required to use exactly the same handle or placeholder
-                if (!Objects.equals(state, entry.stateHandle)) {
-                    if (entry.confirmed || isPlaceholder(state)) {
-                        scheduledStateDeletion = state;
-                    } else {
-                        // Old entry is not in a confirmed checkpoint yet, and the new one differs.
-                        // This might result from (omitted KG range here for simplicity):
-                        // 1. Flink recovers from a failure using a checkpoint 1
-                        // 2. State Backend is initialized to UID xyz and a set of SST: { 01.sst }
-                        // 3. JM triggers checkpoint 2
-                        // 4. TM sends handle: "xyz-002.sst"; JM registers it under "xyz-002.sst"
-                        // 5. TM crashes; everything is repeated from (2)
-                        // 6. TM recovers from CP 1 again: backend UID "xyz", SST { 01.sst }
-                        // 7. JM triggers checkpoint 3
-                        // 8. TM sends NEW state "xyz-002.sst"
-                        // 9. JM discards it as duplicate
-                        // 10. checkpoint completes, but a wrong SST file is used
-                        // So we use a new entry and discard the old one:
-                        scheduledStateDeletion = entry.stateHandle;
-                        entry.stateHandle = state;
-                    }
-                    LOG.trace(
-                            "Identified duplicate state registration under key {}. New state {} was determined to "
-                                    + "be an unnecessary copy of existing state {} and will be dropped.",
-                            registrationKey,
-                            state,
-                            entry.stateHandle);
-                }
+                // no further handling
+                return entry.stateHandle;
+
+            } else if (entry.stateHandle == newHandle) {
+                // might be a bug but state backend is not required to use a place-holder
+                LOG.debug(
+                        "Duplicated registration under key {} with the same object: {}",
+                        registrationKey,
+                        newHandle);
+            } else if (Objects.equals(entry.stateHandle, newHandle)) {
+                // might be a bug but state backend is not required to use a place-holder

Review Comment:
   ditto



##
flink-runtime/src/main/java/org/apache/flink/runtime/state/SharedStateRegistryImpl.java:
##
@@ -95,60 +95,78 @@ public StreamStateHandle registerReference(
 entry = 

[GitHub] [flink-table-store] LadyForest commented on a diff in pull request #333: [FLINK-29760] Introduce snapshots metadata table

2022-10-26 Thread GitBox


LadyForest commented on code in PR #333:
URL: https://github.com/apache/flink-table-store/pull/333#discussion_r1005760161


##
flink-table-store-connector/src/main/java/org/apache/flink/table/store/connector/source/FlinkTableSource.java:
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.connector.source;
+
+import org.apache.flink.table.connector.source.ScanTableSource;
+import org.apache.flink.table.connector.source.abilities.SupportsFilterPushDown;
+import org.apache.flink.table.connector.source.abilities.SupportsLimitPushDown;
+import org.apache.flink.table.connector.source.abilities.SupportsProjectionPushDown;
+import org.apache.flink.table.expressions.ResolvedExpression;
+import org.apache.flink.table.store.file.predicate.Predicate;
+import org.apache.flink.table.store.file.predicate.PredicateBuilder;
+import org.apache.flink.table.store.file.predicate.PredicateConverter;
+import org.apache.flink.table.store.table.Table;
+import org.apache.flink.table.types.logical.RowType;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/** A Flink {@link ScanTableSource} for table store. */
+public abstract class FlinkTableSource
+        implements ScanTableSource,
+                SupportsFilterPushDown,
+                SupportsProjectionPushDown,
+                SupportsLimitPushDown {
+
+    private final Table table;
+
+    @Nullable protected Predicate predicate;
+    @Nullable protected int[][] projectFields;
+    @Nullable protected Long limit;
+
+    public FlinkTableSource(Table table) {
+        this(table, null, null, null);
+    }
+
+    public FlinkTableSource(
+            Table table,
+            @Nullable Predicate predicate,
+            @Nullable int[][] projectFields,
+            @Nullable Long limit) {
+        this.table = table;
+        this.predicate = predicate;
+        this.projectFields = projectFields;
+        this.limit = limit;
+    }
+
+    @Override
+    public String asSummaryString() {
+        return "TableStoreSource";
+    }

Review Comment:
   Let `MetadataSource` and `TableStoreSource` override this method, 
respectively?
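   
   A sketch of that suggestion (subclass names taken from the review; the surrounding Flink interfaces are omitted for brevity):
   
   ```java
   // Sketch only: drop the shared implementation and let each concrete source
   // describe itself.
   abstract class FlinkTableSourceSketch {
       public abstract String asSummaryString();
   }
   
   class TableStoreSourceSketch extends FlinkTableSourceSketch {
       @Override
       public String asSummaryString() {
           return "TableStoreSource";
       }
   }
   
   class MetadataSourceSketch extends FlinkTableSourceSketch {
       @Override
       public String asSummaryString() {
           return "MetadataSource";
       }
   }
   ```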



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


