[jira] [Commented] (FLINK-27274) Job cannot be recovered, after restarting cluster

2022-04-20 Thread macdoor615 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524769#comment-17524769
 ] 

macdoor615 commented on FLINK-27274:


[~martijnvisser] All of my jobs have checkpoints. You can check my SQL file and 
flink-conf.yaml; this is why my cluster can be recovered with version 1.14. You 
can also check the log of new_cf_alarm_recover.yaml.sql: it is recovered from 
the checkpoint.

> Job cannot be recovered, after restarting cluster
> -
>
> Key: FLINK-27274
> URL: https://issues.apache.org/jira/browse/FLINK-27274
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.15.0
> Environment: Flink 1.15.0-rc3
> [https://github.com/apache/flink/archive/refs/tags/release-1.15.0-rc3.tar.gz] 
>Reporter: macdoor615
>Priority: Major
> Attachments: flink-conf.yaml, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.3.zip, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.zip, 
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log, log.recover.debug.zip, 
> new_cf_alarm_no_recover.yaml.sql
>
>
> 1. execute new_cf_alarm_no_recover.yaml.sql with sql-client.sh
> config file: flink-conf.yaml
> the job runs properly
> 2. restart cluster with command
> stop-cluster.sh
> start-cluster.sh
> 3. job cannot be recovered
> log files
> flink-gum-standalonesession-0-hb3-dev-flink-000.log
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log
> 4. not all jobs fail to recover: at the same time, some can be recovered and some cannot
> 5. all job can be recovered on Flink 1.14.4



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] snuyanzin commented on pull request #17961: [FLINK-25109][Table SQL/Client] Update jline3 to 3.21.0

2022-04-20 Thread GitBox


snuyanzin commented on PR #17961:
URL: https://github.com/apache/flink/pull/17961#issuecomment-1103542450

   @MartijnVisser sorry for the poke.
   Since you are one of the committers who deal with version updates, could 
you please take a look here?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27274) Job cannot be recovered, after restarting cluster

2022-04-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524770#comment-17524770
 ] 

Martijn Visser commented on FLINK-27274:


[~macdoor615] Yes, but Flink's checkpoint and savepoint mechanism have changed 
in Flink 1.15. See 
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/checkpoints_vs_savepoints/
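For reference, surviving a full stop-cluster.sh/start-cluster.sh cycle relies on retained (externalized) checkpoints. A minimal flink-conf.yaml sketch, assuming the Flink 1.15 option names (the directory below is a placeholder, not the reporter's attached configuration):

```yaml
# Persist completed checkpoints somewhere that survives a cluster restart
state.checkpoints.dir: hdfs:///flink/checkpoints   # placeholder path
execution.checkpointing.interval: 30s
# Keep the last checkpoint around even when the job is cancelled
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
```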






[GitHub] [flink-table-store] JingsongLi merged pull request #95: [FLINK-27283] Add end to end tests for table store

2022-04-20 Thread GitBox


JingsongLi merged PR #95:
URL: https://github.com/apache/flink-table-store/pull/95





[jira] [Comment Edited] (FLINK-27283) Add end to end tests for table store

2022-04-20 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524751#comment-17524751
 ] 

Jingsong Lee edited comment on FLINK-27283 at 4/20/22 7:17 AM:
---

master: 7dc3225c5390b8c00651f0272764268ea08f128c

release-0.1: e54e4bc310cbcf72bbec564d3e51556c11f69605


was (Author: lzljs3620320):
master: 7dc3225c5390b8c00651f0272764268ea08f128c

> Add end to end tests for table store
> 
>
> Key: FLINK-27283
> URL: https://issues.apache.org/jira/browse/FLINK-27283
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.1.0
>Reporter: Caizhi Weng
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.1.0
>
>
> End to end tests ensure that users can run table store as expected.





[jira] [Closed] (FLINK-27283) Add end to end tests for table store

2022-04-20 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-27283.

Resolution: Fixed






[GitHub] [flink-kubernetes-operator] wangyang0918 merged pull request #171: [FLINK-27289]Do some optimizations for "FlinkService#stopSessionCluster"

2022-04-20 Thread GitBox


wangyang0918 merged PR #171:
URL: https://github.com/apache/flink-kubernetes-operator/pull/171





[jira] [Updated] (FLINK-27289) Do some optimizations for "FlinkService#stopSessionCluster"

2022-04-20 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-27289:
--
Fix Version/s: kubernetes-operator-1.0.0

> Do some optimizations for "FlinkService#stopSessionCluster"
> ---
>
> Key: FLINK-27289
> URL: https://issues.apache.org/jira/browse/FLINK-27289
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: liuzhuo
>Assignee: liuzhuo
>Priority: Minor
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.0.0
>
>
> In the "FlinkService#stopSessionCluster" method, if 'deleteHaData=true', the 
> 'FlinkUtils#waitForClusterShutdown' method will be called twice
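The double invocation described above lends itself to a small illustration. Below is a toy Python model (not the operator's actual Java code; the method names merely mirror `FlinkService#stopSessionCluster` and `FlinkUtils#waitForClusterShutdown`) showing how making the wait idempotent keeps the expensive blocking step from running twice when `deleteHaData=true`:

```python
class FlinkServiceModel:
    """Toy model of FlinkService#stopSessionCluster (illustration only)."""

    def __init__(self):
        self.wait_calls = 0           # how many times the expensive wait actually ran
        self._already_waited = False

    def wait_for_cluster_shutdown(self):
        # Stand-in for FlinkUtils#waitForClusterShutdown; the guard makes it
        # idempotent, which is the gist of the optimization.
        if self._already_waited:
            return
        self._already_waited = True
        self.wait_calls += 1          # pretend this blocks until the cluster is gone

    def stop_session_cluster(self, delete_ha_data: bool):
        if delete_ha_data:
            # deleting HA data requires the cluster to be fully shut down first
            self.wait_for_cluster_shutdown()
        # the common teardown path waits as well
        self.wait_for_cluster_shutdown()


svc = FlinkServiceModel()
svc.stop_session_cluster(delete_ha_data=True)
print(svc.wait_calls)  # → 1 (without the guard it would be 2)
```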





[jira] [Closed] (FLINK-27289) Do some optimizations for "FlinkService#stopSessionCluster"

2022-04-20 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang closed FLINK-27289.
-
Resolution: Fixed

Fixed via:

main: dc1cca82c31feb97ea960a919b90d711f2a1de92






[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #173: [FLINK-27310] Fix FlinkOperatorITCase

2022-04-20 Thread GitBox


wangyang0918 commented on code in PR #173:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/173#discussion_r853808587


##
.github/workflows/ci.yml:
##
@@ -73,7 +73,7 @@ jobs:
   - name: Tests in flink-kubernetes-operator
 run: |
   cd flink-kubernetes-operator
-  mvn integration-test -Dit.skip=false
+  mvn verify -Dit.skip=false
   cd ..
   - name: Tests in flink-kubernetes-webhook

Review Comment:
   Could we also use `mvn verify` in "Tests in flink-kubernetes-webhook", 
even though we do not have an ITCase there yet?



##
.gitignore:
##
@@ -35,3 +35,4 @@ buildNumber.properties
 
 .idea
 *.iml
+*.DS_Store

Review Comment:
   nit: Keep the empty line at end of file.






[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #173: [FLINK-27310] Fix FlinkOperatorITCase

2022-04-20 Thread GitBox


wangyang0918 commented on code in PR #173:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/173#discussion_r853810148


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/FlinkOperatorITCase.java:
##
@@ -114,6 +114,7 @@ private static FlinkDeployment buildSessionCluster() {
 resource.setCpu(1);
 JobManagerSpec jm = new JobManagerSpec();
 jm.setResource(resource);
+jm.setReplicas(1);

Review Comment:
   Would you like to change the default value of `replicas` in this PR, or 
will it be done in a follow-up ticket?






[jira] [Closed] (FLINK-25547) [JUnit5 Migration] Module: flink-optimizer

2022-04-20 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-25547.

Resolution: Won't Fix

There's no benefit in migrating flink-optimizer as the dataset is already 
deprecated anyway.

> [JUnit5 Migration] Module: flink-optimizer
> --
>
> Key: FLINK-25547
> URL: https://issues.apache.org/jira/browse/FLINK-25547
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hang Ruan
>Priority: Minor
>  Labels: pull-request-available
>






[GitHub] [flink] masteryhx commented on pull request #19142: [FLINK-23252][changelog] Support recovery from checkpoint after disab…

2022-04-20 Thread GitBox


masteryhx commented on PR #19142:
URL: https://github.com/apache/flink/pull/19142#issuecomment-1103559768

   > I think the current PR is much clearer than before; I have not given it a 
deep review yet. Since this PR is quite large, can you split this PR into 
several commits:
   > 
   > * Modify `ChangelogStateBackend` to support restoring and switching only.
   > * Introduce `ChangelogRestoreTarget` and its implementations.
   > * Introduce `CheckpointBoundKeyedStateHandle#rebound`.
   
   I just tried to split it into four commits for convenience of review. 
   I will squash them once approved.
   cc @rkhachatryan 





[jira] [Comment Edited] (FLINK-25547) [JUnit5 Migration] Module: flink-optimizer

2022-04-20 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524774#comment-17524774
 ] 

Chesnay Schepler edited comment on FLINK-25547 at 4/20/22 7:25 AM:
---

There's no benefit in migrating flink-optimizer as the dataset api is already 
deprecated/approaching EOL anyway.


was (Author: zentol):
There's no benefit in migrating flink-optimizer as the dataset api is already 
deprecated anyway.







[jira] [Comment Edited] (FLINK-25547) [JUnit5 Migration] Module: flink-optimizer

2022-04-20 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524774#comment-17524774
 ] 

Chesnay Schepler edited comment on FLINK-25547 at 4/20/22 7:25 AM:
---

There's no benefit in migrating flink-optimizer as the dataset api is already 
deprecated anyway.


was (Author: zentol):
There's no benefit in migrating flink-optimizer as the dataset is already 
deprecated anyway.







[GitHub] [flink] zentol commented on pull request #19518: [FLINK-25547][optimizer][tests] Migrate tests to JUnit5 for module fl…

2022-04-20 Thread GitBox


zentol commented on PR #19518:
URL: https://github.com/apache/flink/pull/19518#issuecomment-1103564559

   Closing the PR, please see the JIRA for details.





[GitHub] [flink] zentol closed pull request #19518: [FLINK-25547][optimizer][tests] Migrate tests to JUnit5 for module fl…

2022-04-20 Thread GitBox


zentol closed pull request #19518: [FLINK-25547][optimizer][tests] Migrate 
tests to JUnit5 for module fl…
URL: https://github.com/apache/flink/pull/19518





[jira] [Assigned] (FLINK-26772) Application Mode does not wait for job cleanup during shutdown

2022-04-20 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl reassigned FLINK-26772:
-

Assignee: Matthias Pohl  (was: Mika Naylor)

> Application Mode does not wait for job cleanup during shutdown
> --
>
> Key: FLINK-26772
> URL: https://issues.apache.org/jira/browse/FLINK-26772
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.15.0
>Reporter: Mika Naylor
>Assignee: Matthias Pohl
>Priority: Critical
>  Labels: pull-request-available
> Attachments: FLINK-26772.standalone-job.log, 
> testcluster-599f4d476b-bghw5_log.txt
>
>
> We discovered that in Application Mode, when the application has completed, 
> the cluster is shut down even if there are ongoing resource cleanup events 
> happening in the background. For example, if HA cleanup fails, further 
> retries are not attempted because the cluster is shut down before they can happen.
>  
> We should also add a flag for the shutdown that will prevent further jobs 
> from being submitted.





[jira] [Commented] (FLINK-25470) Add/Expose/Differentiate metrics of checkpoint size between changelog size vs materialization size

2022-04-20 Thread Hangxiang Yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524776#comment-17524776
 ] 

Hangxiang Yu commented on FLINK-25470:
--

Yeah, I agree that we should put some metrics within the changelog part.

I'd like to work on it. Could you assign the ticket to me? [~yunta] Thanks! 

> Add/Expose/Differentiate metrics of checkpoint size between changelog size vs 
> materialization size
> --
>
> Key: FLINK-25470
> URL: https://issues.apache.org/jira/browse/FLINK-25470
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics, Runtime / State Backends
>Reporter: Yuan Mei
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: Screen Shot 2021-12-29 at 1.09.48 PM.png
>
>
> FLINK-25557 only resolves part of the problems. 
> Eventually, we should answer these questions:
>  * How much the data size increases/explodes
>  * When a checkpoint includes a new materialization
>  * Materialization size
>  * Changelog sizes since the last complete checkpoint (from which restore 
> time can be roughly inferred)
>  
>  





[jira] [Commented] (FLINK-27314) Support reactive mode for native Kubernetes integration in Flink Kubernetes Operator

2022-04-20 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524778#comment-17524778
 ] 

Yang Wang commented on FLINK-27314:
---

Maybe this needs to be done in the upstream Flink project. I am afraid the Flink 
Kubernetes operator cannot make the native Kubernetes integration work with 
reactive mode.

> Support reactive mode for native Kubernetes integration in Flink Kubernetes 
> Operator
> 
>
> Key: FLINK-27314
> URL: https://issues.apache.org/jira/browse/FLINK-27314
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Fuyao Li
>Priority: Major
>
> Generally, this is a low-priority task for now.
> Flink exposes some system-level metrics; the Flink Kubernetes operator can 
> detect these metrics and rescale automatically based on checkpoints (similar to 
> the standalone reactive mode) and a rescale policy configured by users.
> The rescale behavior can be based on CPU utilization or memory utilization.
>  # Before rescaling, the Flink operator should check whether the cluster has 
> enough resources; if not, the rescaling will be aborted.
>  # We can create an additional field to support this feature. The fields below 
> are just a rough suggestion.
> {code:java}
> reactiveScaling:
>   enabled: boolean
>   scaleMetric: enum ["CPU", "MEM"]
>   scaleDownThreshold:
>   scaleUpThreshold:
>   minimumLimit:
>   maximumLimit:
>   increasePolicy:
> {code}
>  
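A sketch of how such a rescale policy might evaluate (plain Python; the function and parameter names are invented here to mirror the suggested fields, and are not part of any operator API): given a utilization sample and the configured thresholds, it returns the new replica count clamped to the limits.

```python
def decide_replicas(current: int, utilization: float,
                    scale_up_threshold: float, scale_down_threshold: float,
                    minimum_limit: int, maximum_limit: int,
                    increase_step: int = 1) -> int:
    """Toy rescale policy: step up/down on a threshold breach, clamped to limits."""
    if utilization > scale_up_threshold:
        target = current + increase_step   # overloaded: add capacity
    elif utilization < scale_down_threshold:
        target = current - increase_step   # underused: release capacity
    else:
        target = current                   # within band: no change
    return max(minimum_limit, min(maximum_limit, target))


print(decide_replicas(3, 0.92, 0.8, 0.3, 1, 5))  # → 4 (scale up)
print(decide_replicas(3, 0.10, 0.8, 0.3, 1, 5))  # → 2 (scale down)
print(decide_replicas(5, 0.95, 0.8, 0.3, 1, 5))  # → 5 (clamped at maximumLimit)
```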





[GitHub] [flink] zentol opened a new pull request, #19523: [FLINK-27287][tests] Use random ports

2022-04-20 Thread GitBox


zentol opened a new pull request, #19523:
URL: https://github.com/apache/flink/pull/19523

   - migrates some tests to MiniClusterResources (where it was easy to do so)
   - introduces MiniClusterConfiguration#withRandomPorts
     - this gives us a single point where we can configure ports in the future
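For background on why random ports avoid the FLINK-27287 conflicts: asking the OS for port 0 yields a currently-free ephemeral port, so concurrently running test clusters cannot collide on a fixed port like 8081. A minimal sketch (Python used only for illustration; the PR itself works on `MiniClusterConfiguration`):

```python
import socket

def pick_free_port() -> int:
    """Bind to port 0 so the OS assigns a currently-free ephemeral port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

# Two "clusters" brought up at the same time get distinct ports, so they
# cannot collide the way two tests hard-coded to port 8081 do.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(("127.0.0.1", 0))
s2.bind(("127.0.0.1", 0))
p1, p2 = s1.getsockname()[1], s2.getsockname()[1]
print(p1 != p2)  # → True while both sockets are bound
s1.close()
s2.close()
```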





[jira] [Updated] (FLINK-27287) FileExecutionGraphInfoStoreTest unstable with "Could not start rest endpoint on any port in port range 8081"

2022-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27287:
---
Labels: pull-request-available  (was: )

> FileExecutionGraphInfoStoreTest unstable with "Could not start rest endpoint 
> on any port in port range 8081"
> 
>
> Key: FLINK-27287
> URL: https://issues.apache.org/jira/browse/FLINK-27287
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Zhilong Hong
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> In our CI we met the exception below in {{FileExecutionGraphInfoStoreTest}} 
> and {{MemoryExecutionGraphInfoStoreITCase}}:
> {code:java}
> org.apache.flink.util.FlinkException: Could not create the 
> DispatcherResourceManagerComponent.
>   at 
> org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:285)
>   at 
> org.apache.flink.runtime.dispatcher.ExecutionGraphInfoStoreTestUtils$PersistingMiniCluster.createDispatcherResourceManagerComponents(ExecutionGraphInfoStoreTestUtils.java:227)
>   at 
> org.apache.flink.runtime.minicluster.MiniCluster.setupDispatcherResourceManagerComponents(MiniCluster.java:489)
>   at 
> org.apache.flink.runtime.minicluster.MiniCluster.start(MiniCluster.java:433)
>   at 
> org.apache.flink.runtime.dispatcher.FileExecutionGraphInfoStoreTest.testPutSuspendedJobOnClusterShutdown(FileExecutionGraphInfoStoreTest.java:328)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
>   at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
>   at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
>   at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
> 

[GitHub] [flink] fapaul commented on a diff in pull request #19473: [FLINK-27199][Connector/Pulsar] Bump pulsar to 2.10.0

2022-04-20 Thread GitBox


fapaul commented on code in PR #19473:
URL: https://github.com/apache/flink/pull/19473#discussion_r853812193


##
docs/layouts/shortcodes/generated/pulsar_producer_configuration.html:
##
@@ -56,18 +56,6 @@
 Long
 The sequence id for avoiding the duplication, it's used when 
Pulsar doesn't have transaction.
 
-
-pulsar.producer.maxPendingMessages

Review Comment:
   Please also use a separate commit for the restructuring of the docs if they 
are unrelated to the version bump.



##
docs/layouts/shortcodes/generated/pulsar_client_configuration.html:
##
@@ -100,7 +100,7 @@
 
 
 pulsar.client.memoryLimitBytes
-0

Review Comment:
   I think this change deserves at least a separate commit. Is this changing 
the behavior or correcting the docs?



##
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/sink/PulsarSinkOptions.java:
##
@@ -128,6 +127,16 @@ private PulsarSinkOptions() {
 "The allowed transaction recommit times if we meet 
some retryable exception."
 + " This is used in Pulsar Transaction.");
 
+public static final ConfigOption 
PULSAR_MAX_PENDING_MESSAGES_ON_PARALLELISM =
+ConfigOptions.key(SINK_CONFIG_PREFIX + "maxPendingMessages")
+.intType()
+.defaultValue(1000)
+.withDescription(
+Description.builder()
+.text(
+"The maximum number of pending 
messages in on sink parallelism.")

Review Comment:
   Can you explain this configuration value? The current description is hard to 
understand.



##
flink-connectors/flink-connector-pulsar/src/test/java/org/apache/flink/connector/pulsar/testutils/runtime/PulsarRuntime.java:
##
@@ -50,22 +49,10 @@ static PulsarRuntime mock() {
 return new PulsarMockRuntime();
 }
 
-/**
- * Create a standalone Pulsar instance in test thread. We would start an 
embedded zookeeper and
- * bookkeeper. The stream storage for bookkeeper is disabled. The function 
worker is disabled on
- * Pulsar broker.
- *
- * This runtime would be faster than {@link #container()} and behaves 
the same as the {@link
- * #container()}.
- */
-static PulsarRuntime embedded() {

Review Comment:
   Does this have something to do with the version bump? If not, please use a 
separate commit to remove the embedded test environment.



##
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/sink/PulsarSinkOptions.java:
##
@@ -128,6 +127,16 @@ private PulsarSinkOptions() {
 "The allowed transaction recommit times if we meet 
some retryable exception."
 + " This is used in Pulsar Transaction.");
 
+public static final ConfigOption 
PULSAR_MAX_PENDING_MESSAGES_ON_PARALLELISM =

Review Comment:
   Can you add this configuration in a separate commit?






[jira] [Commented] (FLINK-27313) Flink Datadog reporter add custom and dynamic tags to custom metrics when job running

2022-04-20 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524784#comment-17524784
 ] 

Chesnay Schepler commented on FLINK-27313:
--

Can you expand a bit further on what exactly qualifies as custom/dynamic tags?
You can define additional tags via {{MetricGroup#addGroup(key, value)}}.
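For context, key/value groups created via `addGroup(key, value)` become scope variables that reporters such as the Datadog reporter can expose as tags. A toy model of that mechanism (plain Python, not Flink's actual classes):

```python
class MetricGroupModel:
    """Toy model of a Flink metric-group hierarchy (illustration only)."""

    def __init__(self, parent=None, key=None, value=None):
        self._parent = parent
        # addGroup(key, value) introduces a user-scope variable;
        # addGroup(key) alone only extends the scope, adding no variable.
        self._var = (key, value) if value is not None else None

    def add_group(self, key, value=None):
        return MetricGroupModel(self, key, value)

    def all_variables(self):
        # Collect variables from the root down; reporters can emit these as tags.
        tags = dict(self._parent.all_variables()) if self._parent else {}
        if self._var:
            tags[self._var[0]] = self._var[1]
        return tags


root = MetricGroupModel()
group = root.add_group("team", "payments").add_group("env", "prod")
print(group.all_variables())  # → {'team': 'payments', 'env': 'prod'}
```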

> Flink Datadog reporter add custom and dynamic tags to custom metrics when job 
> running
> -
>
> Key: FLINK-27313
> URL: https://issues.apache.org/jira/browse/FLINK-27313
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Metrics
>Reporter: Huibo Peng
>Priority: Major
>
> We use Datadog to receive our Flink jobs' metrics. We found a limitation: 
> Flink cannot add custom and dynamic tags to metrics while a Flink job is 
> running, other than by adding them in flink-conf.yaml. This feature is 
> important to us because we need to use these tags to filter metrics.





[GitHub] [flink] flinkbot commented on pull request #19523: [FLINK-27287][tests] Use random ports

2022-04-20 Thread GitBox


flinkbot commented on PR #19523:
URL: https://github.com/apache/flink/pull/19523#issuecomment-1103572810

   
   ## CI report:
   
   * c2ee118c9f6a01deb8994ecc82fbb862a679f62d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Created] (FLINK-27316) Prevent users from changing bucket number

2022-04-20 Thread Jane Chan (Jira)
Jane Chan created FLINK-27316:
-

 Summary: Prevent users from changing bucket number
 Key: FLINK-27316
 URL: https://issues.apache.org/jira/browse/FLINK-27316
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: table-store-0.1.0
Reporter: Jane Chan
 Fix For: table-store-0.1.0


Before we support this feature, we should throw a meaningful exception to 
prevent data corruption which is caused by
{code:sql}
 ALTER TABLE ... SET ('bucket' = '...');
 ALTER TABLE ... RESET ('bucket');{code}
 
h3. How to reproduce
{code:sql}

-- Suppose we defined a managed table like
CREATE TABLE IF NOT EXISTS managed_table (
  f0 INT,
  f1 STRING) WITH (
'path' = '...',
'bucket' = '3');

-- then write some data
INSERT INTO managed_table
VALUES (1, 'Sense and Sensibility'),
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'),
(6, 'The Mad Woman in the Attic'),
(7, 'Little Woman');

-- change bucket number
ALTER TABLE managed_table SET ('bucket' = '5');

-- write some data again
INSERT INTO managed_table 
VALUES (1, 'Sense and Sensibility'), 
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'), 
(6, 'The Mad Woman in the Attic'), 
(7, 'Little Woman'), 
(8, 'Jane Eyre');

-- change bucket number again
ALTER TABLE managed_table SET ('bucket' = '1');

-- then write some record with '-D' as changelog mode
-- E.g. changelogRow("-D", 7, "Little Woman"), changelogRow("-D", 2, "Pride and
-- Prejudice"), changelogRow("-D", 3, "Emma"), changelogRow("-D", 4, "Mansfield
-- Park"), changelogRow("-D", 5, "Northanger Abbey"), changelogRow("-D", 6, "The
-- Mad Woman in the Attic"), changelogRow("-D", 8, "Jane Eyre"),
-- changelogRow("-D", 1, "Sense and Sensibility"), changelogRow("-D", 1, "Sense
-- and Sensibility")

CREATE TABLE helper_source (
  f0 INT, 
  f1 STRING) WITH (
  'connector' = 'values', 
  'data-id' = '${register-id}', 
  'bounded' = 'false', 
  'changelog-mode' = 'I,UA,UB,D'
); 

INSERT INTO managed_table SELECT * FROM helper_source;

-- then read the snapshot

SELECT * FROM managed_table
{code}
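The corruption in the reproduction above comes from bucket routing: a record is written to the file of the bucket its key hashes into, so after the bucket count changes, a later `-D` (delete) row can land in a different bucket than the `+I` (insert) row it is supposed to cancel. A toy Python sketch of this mechanism, purely illustrative and not Table Store's actual hashing code:

```python
# Hypothetical sketch (NOT Flink Table Store's real partitioning logic):
# why rewriting the 'bucket' option corrupts data when old files are kept.

def bucket_of(key: int, num_buckets: int) -> int:
    # Records are routed to a bucket by hashing the key.
    return hash(key) % num_buckets

# With 'bucket' = '3', key 7 ("Little Woman") is routed to one bucket...
b_before = bucket_of(7, 3)

# ...but after ALTER TABLE ... SET ('bucket' = '5'), the same key may route
# to a different bucket, so the '-D' changelog row cannot cancel the '+I'
# row that still sits in the old bucket's files -> wrong snapshot results.
b_after = bucket_of(7, 5)
print(b_before, b_after)
```

This is why the ticket proposes rejecting `SET`/`RESET` on `'bucket'` outright until a proper re-bucketing path exists.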





[jira] [Updated] (FLINK-27316) Prevent users from changing bucket number

2022-04-20 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27316:
--
Description: 
Before we support this feature, we should throw a meaningful exception to 
prevent data corruption which is caused by
{code:sql}
 ALTER TABLE ... SET ('bucket' = '...');
 ALTER TABLE ... RESET ('bucket');{code}
 
h3. How to reproduce
{code:sql}
-- Suppose we defined a managed table like
CREATE TABLE IF NOT EXISTS managed_table (
  f0 INT,
  f1 STRING) WITH (
'path' = '...',
'bucket' = '3');

-- then write some data
INSERT INTO managed_table
VALUES (1, 'Sense and Sensibility'),
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'),
(6, 'The Mad Woman in the Attic'),
(7, 'Little Woman');

-- change bucket number
ALTER TABLE managed_table SET ('bucket' = '5');

-- write some data again
INSERT INTO managed_table 
VALUES (1, 'Sense and Sensibility'), 
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'), 
(6, 'The Mad Woman in the Attic'), 
(7, 'Little Woman'), 
(8, 'Jane Eyre');

-- change bucket number again
ALTER TABLE managed_table SET ('bucket' = '1');

-- then write some record with '-D' as rowkind
-- E.g. changelogRow("-D", 7, "Little Woman"),
-- changelogRow("-D", 2, "Pride and Prejudice"), 
-- changelogRow("-D", 3, "Emma"), 
-- changelogRow("-D", 4, "Mansfield Park"), 
-- changelogRow("-D", 5, "Northanger Abbey"), 
-- changelogRow("-D", 6, "The Mad Woman in the Attic"), 
-- changelogRow("-D", 8, "Jane Eyre"), 
-- changelogRow("-D", 1, "Sense and Sensibility"), 
-- changelogRow("-D", 1, "Sense and Sensibility") 

CREATE TABLE helper_source (
  f0 INT, 
  f1 STRING) WITH (
  'connector' = 'values', 
  'data-id' = '${register-id}', 
  'bounded' = 'false', 
  'changelog-mode' = 'I,UA,UB,D'
); 

INSERT INTO managed_table SELECT * FROM helper_source;

-- then read the snapshot

SELECT * FROM managed_table
{code}


[jira] [Updated] (FLINK-27316) Prevent users from changing bucket number

2022-04-20 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27316:
--
Description: 
Before supporting this feature, we should throw a meaningful exception to 
prevent data corruption which is caused by
{code:sql}
 ALTER TABLE ... SET ('bucket' = '...');
 ALTER TABLE ... RESET ('bucket');{code}
 
h3. How to reproduce
{code:sql}
-- Suppose we defined a managed table like
CREATE TABLE IF NOT EXISTS managed_table (
  f0 INT,
  f1 STRING) WITH (
'path' = '...',
'bucket' = '3');

-- then write some data
INSERT INTO managed_table
VALUES (1, 'Sense and Sensibility'),
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'),
(6, 'The Mad Woman in the Attic'),
(7, 'Little Woman');

-- change bucket number
ALTER TABLE managed_table SET ('bucket' = '5');

-- write some data again
INSERT INTO managed_table 
VALUES (1, 'Sense and Sensibility'), 
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'), 
(6, 'The Mad Woman in the Attic'), 
(7, 'Little Woman'), 
(8, 'Jane Eyre');

-- change bucket number again
ALTER TABLE managed_table SET ('bucket' = '1');

-- then write some record with '-D' as rowkind
-- E.g. changelogRow("-D", 7, "Little Woman"),
-- changelogRow("-D", 2, "Pride and Prejudice"), 
-- changelogRow("-D", 3, "Emma"), 
-- changelogRow("-D", 4, "Mansfield Park"), 
-- changelogRow("-D", 5, "Northanger Abbey"), 
-- changelogRow("-D", 6, "The Mad Woman in the Attic"), 
-- changelogRow("-D", 8, "Jane Eyre"), 
-- changelogRow("-D", 1, "Sense and Sensibility"), 
-- changelogRow("-D", 1, "Sense and Sensibility") 

CREATE TABLE helper_source (
  f0 INT, 
  f1 STRING) WITH (
  'connector' = 'values', 
  'data-id' = '${register-id}', 
  'bounded' = 'false', 
  'changelog-mode' = 'I,UA,UB,D'
); 

INSERT INTO managed_table SELECT * FROM helper_source;

-- then read the snapshot

SELECT * FROM managed_table
{code}

which will get wrong results


[jira] [Updated] (FLINK-27316) Prevent users from changing bucket number

2022-04-20 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27316:
--
Description: 
Before we support this feature, we should throw a meaningful exception to 
prevent data corruption which is caused by
{code:sql}
 ALTER TABLE ... SET ('bucket' = '...');
 ALTER TABLE ... RESET ('bucket');{code}
 
h3. How to reproduce
{code:sql}
-- Suppose we defined a managed table like
CREATE TABLE IF NOT EXISTS managed_table (
  f0 INT,
  f1 STRING) WITH (
'path' = '...',
'bucket' = '3');

-- then write some data
INSERT INTO managed_table
VALUES (1, 'Sense and Sensibility'),
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'),
(6, 'The Mad Woman in the Attic'),
(7, 'Little Woman');

-- change bucket number
ALTER TABLE managed_table SET ('bucket' = '5');

-- write some data again
INSERT INTO managed_table 
VALUES (1, 'Sense and Sensibility'), 
(2, 'Pride and Prejudice'), 
(3, 'Emma'), 
(4, 'Mansfield Park'), 
(5, 'Northanger Abbey'), 
(6, 'The Mad Woman in the Attic'), 
(7, 'Little Woman'), 
(8, 'Jane Eyre');

-- change bucket number again
ALTER TABLE managed_table SET ('bucket' = '1');

-- then write some record with '-D' as rowkind
-- E.g. changelogRow("-D", 7, "Little Woman"),
-- changelogRow("-D", 2, "Pride and Prejudice"), 
-- changelogRow("-D", 3, "Emma"), 
-- changelogRow("-D", 4, "Mansfield Park"), 
-- changelogRow("-D", 5, "Northanger Abbey"), 
-- changelogRow("-D", 6, "The Mad Woman in the Attic"), 
-- changelogRow("-D", 8, "Jane Eyre"), 
-- changelogRow("-D", 1, "Sense and Sensibility"), 
-- changelogRow("-D", 1, "Sense and Sensibility") 

CREATE TABLE helper_source (
  f0 INT, 
  f1 STRING) WITH (
  'connector' = 'values', 
  'data-id' = '${register-id}', 
  'bounded' = 'false', 
  'changelog-mode' = 'I,UA,UB,D'
); 

INSERT INTO managed_table SELECT * FROM helper_source;

-- then read the snapshot

SELECT * FROM managed_table
{code}

which will get wrong results


[GitHub] [flink] syhily commented on a diff in pull request #19473: [FLINK-27199][Connector/Pulsar] Bump pulsar to 2.10.0

2022-04-20 Thread GitBox


syhily commented on code in PR #19473:
URL: https://github.com/apache/flink/pull/19473#discussion_r853829901


##
docs/layouts/shortcodes/generated/pulsar_client_configuration.html:
##
@@ -100,7 +100,7 @@
 
 
 pulsar.client.memoryLimitBytes
-0

Review Comment:
   Yep






[GitHub] [flink] EchoLee5 commented on pull request #19519: [FLINK-27315] Fix the demo of MemoryStateBackendMigration

2022-04-20 Thread GitBox


EchoLee5 commented on PR #19519:
URL: https://github.com/apache/flink/pull/19519#issuecomment-1103577541

   > It seems you still did not get what I mean, I think you'd better create 
another PR based on your another forked branch instead of `EchoLee5:master` 
branch.
   
   Sorry, I misunderstood your point; I've closed this PR for now.





[GitHub] [flink] syhily commented on a diff in pull request #19473: [FLINK-27199][Connector/Pulsar] Bump pulsar to 2.10.0

2022-04-20 Thread GitBox


syhily commented on code in PR #19473:
URL: https://github.com/apache/flink/pull/19473#discussion_r853830134


##
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/sink/PulsarSinkOptions.java:
##
@@ -128,6 +127,16 @@ private PulsarSinkOptions() {
 "The allowed transaction recommit times if we meet 
some retryable exception."
 + " This is used in Pulsar Transaction.");
 
+public static final ConfigOption<Integer> PULSAR_MAX_PENDING_MESSAGES_ON_PARALLELISM =
+ConfigOptions.key(SINK_CONFIG_PREFIX + "maxPendingMessages")
+.intType()
+.defaultValue(1000)
+.withDescription(
+Description.builder()
+.text(
+"The maximum number of pending 
messages in on sink parallelism.")

Review Comment:
   OK






[GitHub] [flink] EchoLee5 closed pull request #19519: [FLINK-27315] Fix the demo of MemoryStateBackendMigration

2022-04-20 Thread GitBox


EchoLee5 closed pull request #19519: [FLINK-27315] Fix the demo of 
MemoryStateBackendMigration
URL: https://github.com/apache/flink/pull/19519





[jira] [Assigned] (FLINK-27316) Prevent users from changing bucket number

2022-04-20 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-27316:


Assignee: Jane Chan





[jira] [Created] (FLINK-27317) Snapshot deployment fails due to .scalafmt.conf not being found

2022-04-20 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-27317:


 Summary: Snapshot deployment fails due to .scalafmt.conf not being 
found
 Key: FLINK-27317
 URL: https://issues.apache.org/jira/browse/FLINK-27317
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System
Affects Versions: 1.16.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.16.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=34852&view=logs&j=eca6b3a6-1600-56cc-916a-c549b3cde3ff&t=e9844b5e-5aa3-546b-6c3e-5395c7c0cac7

Caused by the maven-source-plugin jar goal forking the build and apparently 
messing up the working directory.





[GitHub] [flink-kubernetes-operator] bgeng777 commented on a diff in pull request #173: [FLINK-27310] Fix FlinkOperatorITCase

2022-04-20 Thread GitBox


bgeng777 commented on code in PR #173:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/173#discussion_r853843673


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/FlinkOperatorITCase.java:
##
@@ -114,6 +114,7 @@ private static FlinkDeployment buildSessionCluster() {
 resource.setCpu(1);
 JobManagerSpec jm = new JobManagerSpec();
 jm.setResource(resource);
+jm.setReplicas(1);

Review Comment:
   I checked the usage of `replicas` and I think it is better to leave it to 
another ticket. The change is not big, but I hope to keep this PR focused on 
the CI workflow improvement.






[GitHub] [flink-web] joemoe commented on pull request #526: Announcement blogpost for the 1.15 release

2022-04-20 Thread GitBox


joemoe commented on PR #526:
URL: https://github.com/apache/flink-web/pull/526#issuecomment-1103592187

   @MartijnVisser good catch, I fixed it. Can you approve it now?





[GitHub] [flink-kubernetes-operator] bgeng777 commented on a diff in pull request #173: [FLINK-27310] Fix FlinkOperatorITCase

2022-04-20 Thread GitBox


bgeng777 commented on code in PR #173:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/173#discussion_r853845992


##
.github/workflows/ci.yml:
##
@@ -73,7 +73,7 @@ jobs:
   - name: Tests in flink-kubernetes-operator
 run: |
   cd flink-kubernetes-operator
-  mvn integration-test -Dit.skip=false
+  mvn verify -Dit.skip=false
   cd ..
   - name: Tests in flink-kubernetes-webhook

Review Comment:
   Yes, that is a good suggestion; I believe it is fine to use `verify` on the 
webhook module as well. I have updated the PR.






[GitHub] [flink] flinkbot commented on pull request #19524: [FLINK-27317][build] Don't fork build for attaching sources

2022-04-20 Thread GitBox


flinkbot commented on PR #19524:
URL: https://github.com/apache/flink/pull/19524#issuecomment-1103594009

   
   ## CI report:
   
   * 79068f68df81e47bfe6ef66ca24df4e61f6b7ce7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-27317) Snapshot deployment fails due to .scalafmt.conf not being found

2022-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27317:
---
Labels: pull-request-available  (was: )






[jira] [Commented] (FLINK-27274) Job cannot be recovered, after restarting cluster

2022-04-20 Thread macdoor615 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524794#comment-17524794
 ] 

macdoor615 commented on FLINK-27274:


[~martijnvisser] Let's talk about stop-cluster.sh. What is the purpose of this 
command?
 # stop cluster and store the state
 # stop cluster and clear the state
 # stop cluster and leave job in random state

1 or 2 or 3 ? 






[jira] [Commented] (FLINK-25872) Restoring from non-changelog checkpoint with changelog state-backend enabled in CLAIM mode discards state in use

2022-04-20 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524796#comment-17524796
 ] 

Dawid Wysakowicz commented on FLINK-25872:
--

Sorry for joining so late.

I'll try to rephrase the issue to make sure I understand it. First of all, the 
problem appears when changing the state backend, from a non-changelog state 
backend to changelog state backend. This does not work, because changelog state 
backend uses different state handles than other state backends (obviously). 
The first question for me is whether this is something that should be supported; 
judging from the guarantees we're giving, not necessarily. Changing the state 
backend is supported only via a savepoint. Having said that, I understand the 
answer is that you do want to support it nevertheless.

Question: Is the problem related to state handles registered in 
{{SharedStateRegistry}}? Or does it affect non-shared, private parts of the 
initial checkpoint? If I understand correctly, it also affects the originally 
private parts of the initial checkpoint, right?

There are two proposed solutions to the problem. On a high level:
# (Roman's) Treat all handles of the initial checkpoint as shared ones, 
irrespective of whether the changelog state backend is used or not.
# (Yanfei's) Add logic that converts the non-changelog checkpoint to a 
changelog checkpoint when restoring. This makes the JobManager aware of the 
changelog state backend.

I feel both solutions add additional logic and complexity to the JM. Sorry if I 
am boring/annoying, but are we sure it is the right decision to support this 
kind of state backend switching? BTW, does the switching in the other direction 
work as well? What happens if we want to disable previously enabled changelog 
state backend? Is this supported?
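Roman's proposal boils down to reference counting: if every handle of the initial checkpoint is registered as shared, subsuming that checkpoint only drops one reference, and the files are deleted only once no newer checkpoint still points at them. A minimal sketch under an assumed, simplified registry API (the real {{SharedStateRegistry}} differs):

```python
# Simplified sketch (assumed API, not Flink's actual SharedStateRegistry):
# reference-counted state handles so that subsuming the restored initial
# checkpoint cannot delete files still referenced by newer checkpoints.

class SharedStateRegistry:
    def __init__(self):
        self.ref_counts = {}

    def register(self, handle: str) -> None:
        # Each checkpoint referencing the handle bumps its count.
        self.ref_counts[handle] = self.ref_counts.get(handle, 0) + 1

    def unregister(self, handle: str) -> list:
        # Returns the handles whose last reference was dropped
        # (i.e., the ones now safe to physically delete).
        self.ref_counts[handle] -= 1
        if self.ref_counts[handle] == 0:
            del self.ref_counts[handle]
            return [handle]
        return []

registry = SharedStateRegistry()
registry.register("materialized-part")  # referenced by the initial checkpoint
registry.register("materialized-part")  # re-referenced by a newer checkpoint

# Subsuming the initial checkpoint drops one reference but deletes nothing:
first_drop = registry.unregister("materialized-part")
# Only when the last referencing checkpoint goes away is deletion allowed:
second_drop = registry.unregister("materialized-part")
```

The bug report describes exactly the failure mode this guards against: the restored checkpoint's private (non-registered) state being discarded on subsume while newer changelog checkpoints still depend on it.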

> Restoring from non-changelog checkpoint with changelog state-backend enabled 
> in CLAIM mode discards state in use
> 
>
> Key: FLINK-25872
> URL: https://issues.apache.org/jira/browse/FLINK-25872
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Reporter: Yun Tang
>Assignee: Yanfei Lei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> If we restore from checkpoint with changelog state-backend enabled in 
> snapshot CLAIM mode, the restored checkpoint would be discarded on subsume. 
> This invalidates newer/active checkpoints because their materialized part is 
> discarded (for incremental wrapped checkpoints, their private state is 
> discarded). This bug is like FLINK-25478.





[jira] [Closed] (FLINK-27029) DeploymentValidator should take default flink config into account during validation

2022-04-20 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-27029.
--
Resolution: Fixed

merged to main: 0c0ae05a4a2b73618e270f72a4a52a98fa4850c8

> DeploymentValidator should take default flink config into account during 
> validation
> ---
>
> Key: FLINK-27029
> URL: https://issues.apache.org/jira/browse/FLINK-27029
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Nicholas Jiang
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: kubernetes-operator-1.0.0
>
>
> Currently the DefaultDeploymentValidator only takes the FlinkDeployment 
> object into account.
> However in places where we validate the presence of config keys we should 
> also consider the default flink config which might already provide default 
> values for the required configs even if the deployment itself doesn't.
> We should make sure this works correctly both in the operator and the webhook





[jira] [Commented] (FLINK-27018) timestamp missing end zero when outputing to kafka

2022-04-20 Thread Yao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524799#comment-17524799
 ] 

Yao Zhang commented on FLINK-27018:
---

Hi [~jeff-zou] ,

We cannot simply use `LocalDateTime.toString()`, because per the current 
implementation the Flink JSON format should support datetimes in both ISO-8601 
and the SQL standard format.

 

It seems that ISO-8601 does not explicitly stipulate the precision of 
milliseconds or nanoseconds.

 

But personally I agree that the time format should be consistent.

 

See [JSON | Apache 
Flink|https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/connectors/table/formats/json/]
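
As a standalone illustration of the reported behaviour (this is plain `java.time`, not the actual Flink formatter code), the difference comes down to a variable-width versus a fixed-width fraction field: with a minimum width of 0, `appendFraction` drops trailing zeros, so `.260` becomes `.26`.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;

import static java.time.temporal.ChronoField.NANO_OF_SECOND;

public class FractionFormatDemo {
    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2022, 4, 2, 3, 34, 21, 260_000_000);

        // Variable-width fraction (minWidth = 0): trailing zeros are dropped.
        DateTimeFormatter variable = new DateTimeFormatterBuilder()
                .appendPattern("yyyy-MM-dd HH:mm:ss")
                .appendFraction(NANO_OF_SECOND, 0, 3, true)
                .toFormatter();

        // Fixed-width fraction (minWidth = 3): milliseconds keep their width.
        DateTimeFormatter fixed = new DateTimeFormatterBuilder()
                .appendPattern("yyyy-MM-dd HH:mm:ss")
                .appendFraction(NANO_OF_SECOND, 3, 3, true)
                .toFormatter();

        System.out.println(variable.format(ts)); // 2022-04-02 03:34:21.26
        System.out.println(fixed.format(ts));    // 2022-04-02 03:34:21.260
    }
}
```

Making the output consistent would mean fixing the fraction width to the declared precision of the TIMESTAMP type rather than letting the formatter minimize digits.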

> timestamp missing end  zero when outputing to kafka
> ---
>
> Key: FLINK-27018
> URL: https://issues.apache.org/jira/browse/FLINK-27018
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.13.5
>Reporter: jeff-zou
>Priority: Major
> Attachments: kafka.png
>
>
> the bug is described as follows:
>  
> {code:java}
> data in source:
>  2022-04-02 03:34:21.260
> but after sink by sql, data in kafka:
>  2022-04-02 03:34:21.26
> {code}
>  
> data miss end zero in kafka.
>  
> sql:
> {code:java}
> create kafka_table(stime stimestamp) with ('connector'='kafka','format' = 
> 'json');
> insert into kafka_table select stime from (values(timestamp '2022-04-02 
> 03:34:21.260')){code}
> the value in kafka is : \{"stime":"2022-04-02 03:34:21.26"}, missed end zero.





[GitHub] [flink] EchoLee5 opened a new pull request, #19525: [FLINK-27315] Fix the demo of MemoryStateBackendMigration

2022-04-20 Thread GitBox


EchoLee5 opened a new pull request, #19525:
URL: https://github.com/apache/flink/pull/19525

   
   
   ## What is the purpose of the change
   
   JobManagerStateBackend changed to JobManagerCheckpointStorage
   
   
   ## Brief change log
   
   JobManagerStateBackend changed to JobManagerCheckpointStorage
   
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] snuyanzin commented on pull request #19467: [FLINK-25548][flink-sql-parser] Migrate tests to JUnit5

2022-04-20 Thread GitBox


snuyanzin commented on PR #19467:
URL: https://github.com/apache/flink/pull/19467#issuecomment-1103609846

   @zentol sorry for the poke
   Since you are one of the committers who deals with migration to junit5, 
could you please have a look here once you have time?





[GitHub] [flink] EchoLee5 commented on pull request #19525: [FLINK-27315] Fix the demo of MemoryStateBackendMigration

2022-04-20 Thread GitBox


EchoLee5 commented on PR #19525:
URL: https://github.com/apache/flink/pull/19525#issuecomment-1103610714

   cc @Myasuka I resubmitted the pr, please review it, thank you.





[GitHub] [flink] flinkbot commented on pull request #19525: [FLINK-27315] Fix the demo of MemoryStateBackendMigration

2022-04-20 Thread GitBox


flinkbot commented on PR #19525:
URL: https://github.com/apache/flink/pull/19525#issuecomment-1103612405

   
   ## CI report:
   
   * 54b22b4614a0cb2d625710f836b1ca093e909ce5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] rkhachatryan commented on a diff in pull request #19441: [FLINK-27187][state/changelog] Add changelog storage metric totalAttemptsPerUpload

2022-04-20 Thread GitBox


rkhachatryan commented on code in PR #19441:
URL: https://github.com/apache/flink/pull/19441#discussion_r853867505


##
flink-dstl/flink-dstl-dfs/src/test/java/org/apache/flink/changelog/fs/ChangelogStorageMetricsTest.java:
##
@@ -295,6 +349,39 @@ public void close() {
 }
 }
 
+private static class FixedLatencyUploader implements StateChangeUploader {
+private final long latency;
+
+public FixedLatencyUploader(long latency) {
+this.latency = latency;
+}
+
+@Override
+public UploadTasksResult upload(Collection<UploadTask> tasks) throws 
IOException {
+Map> map = new HashMap<>();
+
+try {
+TimeUnit.MILLISECONDS.sleep(latency);

Review Comment:
   Thanks for the explanation. 
   In other words, `testAttemptsPerUpload` completes the last attempt and 
`testTotalAttemptsPerUpload` completes the first attempt, right?
   
   IMO it's better to keep these tests separate, because the scenarios and the 
assertions are different.
   
   As for the implementation of `testTotalAttemptsPerUpload`, how about waiting 
in the 1st attempt for the others to fail?
   Something like `CountDownLatch` or `CompletableFuture` + `AtomicInteger` ?
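
A minimal, self-contained sketch of that suggestion (hypothetical names, not the actual `ChangelogStorageMetricsTest` code): the first attempt parks on a `CountDownLatch` that the failing retries count down, which makes the attempt ordering deterministic without sleeps.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class AttemptCoordinationDemo {
    static final AtomicInteger attempts = new AtomicInteger();
    static final CountDownLatch failuresDone = new CountDownLatch(2);

    // The first attempt blocks until the expected number of retries has failed.
    static void firstAttempt() throws InterruptedException {
        attempts.incrementAndGet();
        failuresDone.await();
    }

    // Each retry records itself, counts the latch down, and fails.
    static void failingRetry() throws IOException {
        attempts.incrementAndGet();
        failuresDone.countDown();
        throw new IOException("simulated upload failure");
    }

    public static void main(String[] args) throws Exception {
        Thread first = new Thread(() -> {
            try {
                firstAttempt();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        first.start();

        for (int i = 0; i < 2; i++) {
            try {
                failingRetry();
            } catch (IOException expected) {
                // retries are expected to fail
            }
        }

        first.join(); // returns only after both retries have counted down
        System.out.println("total attempts = " + attempts.get()); // total attempts = 3
    }
}
```

By the time `join()` returns, exactly the first attempt plus both retries have run, so an assertion on the total attempt count cannot race.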






[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #174: [FLINK-27023] Unify flink and operator config handling

2022-04-20 Thread GitBox


wangyang0918 commented on code in PR #174:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/174#discussion_r853853234


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/metrics/OperatorMetricUtilsTest.java:
##
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics;
+
+import org.apache.flink.configuration.Configuration;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;

Review Comment:
   nit: Maybe we need to use `org.junit.jupiter.api.Assertions.assertEquals` 
since we are explicitly using junit 5.



##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/metrics/OperatorMetricUtilsTest.java:
##
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics;
+
+import org.apache.flink.configuration.Configuration;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;
+
+/** @link OperatorMetricUtils tests. */

Review Comment:
   ```suggestion
   /** {@link OperatorMetricUtils tests.} */
   ```






[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #174: [FLINK-27023] Unify flink and operator config handling

2022-04-20 Thread GitBox


gyfora commented on code in PR #174:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/174#discussion_r853871854


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/metrics/OperatorMetricUtilsTest.java:
##
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics;
+
+import org.apache.flink.configuration.Configuration;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;

Review Comment:
   good catch, will fix






[GitHub] [flink-kubernetes-operator] wangyang0918 merged pull request #173: [FLINK-27310] Fix FlinkOperatorITCase

2022-04-20 Thread GitBox


wangyang0918 merged PR #173:
URL: https://github.com/apache/flink-kubernetes-operator/pull/173





[jira] [Commented] (FLINK-27274) Job cannot be recovered, after restarting cluster

2022-04-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524804#comment-17524804
 ] 

Martijn Visser commented on FLINK-27274:


stop-cluster.sh doesn't have anything to do with state. It just stops 
TaskManagers and JobManagers (like the command says), so it would be 2. 

> Job cannot be recovered, after restarting cluster
> -
>
> Key: FLINK-27274
> URL: https://issues.apache.org/jira/browse/FLINK-27274
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.15.0
> Environment: Flink 1.15.0-rc3
> [https://github.com/apache/flink/archive/refs/tags/release-1.15.0-rc3.tar.gz] 
>Reporter: macdoor615
>Priority: Major
> Attachments: flink-conf.yaml, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.3.zip, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.zip, 
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log, log.recover.debug.zip, 
> new_cf_alarm_no_recover.yaml.sql
>
>
> 1. execute new_cf_alarm_no_recover.yaml.sql with sql-client.sh
> config file: flink-conf.yaml
> the job run properly
> 2. restart cluster with command
> stop-cluster.sh
> start-cluster.sh
> 3. job cannot be recovered
> log files
> flink-gum-standalonesession-0-hb3-dev-flink-000.log
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log
> 4. not all job can not be recovered, some can, some can not, at same time
> 5. all job can be recovered on Flink 1.14.4





[jira] [Closed] (FLINK-27310) FlinkOperatorITCase failure due to JobManager replicas less than one

2022-04-20 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang closed FLINK-27310.
-
Resolution: Fixed

Fixed via:

main:

e0e34cb481af597c2126760a0f4c81bacb7ea284

6e1f0e8c28986b515f7becb9968b9af25c8f4b6a

> FlinkOperatorITCase failure due to JobManager replicas less than one
> 
>
> Key: FLINK-27310
> URL: https://issues.apache.org/jira/browse/FLINK-27310
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Usamah Jassat
>Assignee: Biao Geng
>Priority: Minor
>  Labels: pull-request-available
>
> The FlinkOperatorITCase test is currently failing, even in the CI pipeline 
>  
> {code:java}
> INFO] Running org.apache.flink.kubernetes.operator.FlinkOperatorITCase
>   
>   
> 
> Error:  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 3.178 s <<< FAILURE! - in 
> org.apache.flink.kubernetes.operator.FlinkOperatorITCase
>   
>   
> 
> Error:  org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test  
> Time elapsed: 2.664 s  <<< ERROR!
>   
>   
> 
> io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: POST at: 
> https://192.168.49.2:8443/apis/flink.apache.org/v1beta1/namespaces/flink-operator-test/flinkdeployments.
>  Message: Forbidden! User minikube doesn't have permission. admission webhook 
> "vflinkdeployments.flink.apache.org" denied the request: JobManager replicas 
> should not be configured less than one..
>   
>   
> 
>   at 
> flink.kubernetes.operator@1.0-SNAPSHOT/org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test(FlinkOperatorITCase.java:86){code}
>  
> While the test is failing, the CI run is still passing; this should also be
> fixed so that CI fails on the test failure.





[jira] [Comment Edited] (FLINK-27274) Job cannot be recovered, after restarting cluster

2022-04-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524804#comment-17524804
 ] 

Martijn Visser edited comment on FLINK-27274 at 4/20/22 8:35 AM:
-

stop-cluster.sh doesn't have anything to do with state. It just stops 
TaskManagers and JobManagers (like the command says), so its result would be 2. 


was (Author: martijnvisser):
stop-cluster.sh doesn't have anything to do with state. It just stops 
TaskManagers and JobManagers (like the command says), so it would be 2. 

> Job cannot be recovered, after restarting cluster
> -
>
> Key: FLINK-27274
> URL: https://issues.apache.org/jira/browse/FLINK-27274
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.15.0
> Environment: Flink 1.15.0-rc3
> [https://github.com/apache/flink/archive/refs/tags/release-1.15.0-rc3.tar.gz] 
>Reporter: macdoor615
>Priority: Major
> Attachments: flink-conf.yaml, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.3.zip, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.zip, 
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log, log.recover.debug.zip, 
> new_cf_alarm_no_recover.yaml.sql
>
>
> 1. execute new_cf_alarm_no_recover.yaml.sql with sql-client.sh
> config file: flink-conf.yaml
> the job run properly
> 2. restart cluster with command
> stop-cluster.sh
> start-cluster.sh
> 3. job cannot be recovered
> log files
> flink-gum-standalonesession-0-hb3-dev-flink-000.log
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log
> 4. not all job can not be recovered, some can, some can not, at same time
> 5. all job can be recovered on Flink 1.14.4





[jira] [Commented] (FLINK-27310) FlinkOperatorITCase failure due to JobManager replicas less than one

2022-04-20 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524805#comment-17524805
 ] 

Yang Wang commented on FLINK-27310:
---

Thanks [~usamj] for reporting this issue and [~bgeng777] for fixing it.

> FlinkOperatorITCase failure due to JobManager replicas less than one
> 
>
> Key: FLINK-27310
> URL: https://issues.apache.org/jira/browse/FLINK-27310
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Usamah Jassat
>Assignee: Biao Geng
>Priority: Minor
>  Labels: pull-request-available
>
> The FlinkOperatorITCase test is currently failing, even in the CI pipeline 
>  
> {code:java}
> INFO] Running org.apache.flink.kubernetes.operator.FlinkOperatorITCase
>   
>   
> 
> Error:  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 3.178 s <<< FAILURE! - in 
> org.apache.flink.kubernetes.operator.FlinkOperatorITCase
>   
>   
> 
> Error:  org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test  
> Time elapsed: 2.664 s  <<< ERROR!
>   
>   
> 
> io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: POST at: 
> https://192.168.49.2:8443/apis/flink.apache.org/v1beta1/namespaces/flink-operator-test/flinkdeployments.
>  Message: Forbidden! User minikube doesn't have permission. admission webhook 
> "vflinkdeployments.flink.apache.org" denied the request: JobManager replicas 
> should not be configured less than one..
>   
>   
> 
>   at 
> flink.kubernetes.operator@1.0-SNAPSHOT/org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test(FlinkOperatorITCase.java:86){code}
>  
> While the test is failing, the CI run is still passing; this should also be
> fixed so that CI fails on the test failure.





[jira] [Updated] (FLINK-27310) FlinkOperatorITCase failure due to JobManager replicas less than one

2022-04-20 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-27310:
--
Fix Version/s: kubernetes-operator-1.0.0

> FlinkOperatorITCase failure due to JobManager replicas less than one
> 
>
> Key: FLINK-27310
> URL: https://issues.apache.org/jira/browse/FLINK-27310
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Usamah Jassat
>Assignee: Biao Geng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.0.0
>
>
> The FlinkOperatorITCase test is currently failing, even in the CI pipeline 
>  
> {code:java}
> INFO] Running org.apache.flink.kubernetes.operator.FlinkOperatorITCase
>   
>   
> 
> Error:  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 3.178 s <<< FAILURE! - in 
> org.apache.flink.kubernetes.operator.FlinkOperatorITCase
>   
>   
> 
> Error:  org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test  
> Time elapsed: 2.664 s  <<< ERROR!
>   
>   
> 
> io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: POST at: 
> https://192.168.49.2:8443/apis/flink.apache.org/v1beta1/namespaces/flink-operator-test/flinkdeployments.
>  Message: Forbidden! User minikube doesn't have permission. admission webhook 
> "vflinkdeployments.flink.apache.org" denied the request: JobManager replicas 
> should not be configured less than one..
>   
>   
> 
>   at 
> flink.kubernetes.operator@1.0-SNAPSHOT/org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test(FlinkOperatorITCase.java:86){code}
>  
> While the test is failing, the CI run is still passing; this should also be
> fixed so that CI fails on the test failure.





[GitHub] [flink] MartijnVisser merged pull request #19478: [FLINK-25694][FileSystems][S3] Upgrade Presto to resolve GSON/Alluxio Vulnerability

2022-04-20 Thread GitBox


MartijnVisser merged PR #19478:
URL: https://github.com/apache/flink/pull/19478





[jira] [Updated] (FLINK-25694) Upgrade Presto to resolve GSON/Alluxio Vulnerability

2022-04-20 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-25694:
---
Fix Version/s: 1.14.5
   1.15.1

> Upgrade Presto to resolve GSON/Alluxio Vulnerability
> 
>
> Key: FLINK-25694
> URL: https://issues.apache.org/jira/browse/FLINK-25694
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.14.2
>Reporter: David Perkins
>Assignee: David Perkins
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.14.5, 1.15.1
>
>
> GSON has a bug, which was fixed in 2.8.9, see 
> [https://github.com/google/gson/pull/1991|https://github.com/google/gson/pull/1991].
>  This results in the possibility for DOS attacks.
> GSON is included in the `flink-s3-fs-presto` plugin, because Alluxio includes 
> it in their shaded client. I've opened an issue in Alluxio: 
> [https://github.com/Alluxio/alluxio/issues/14868]. When that is fixed, the 
> plugin also needs to be updated.





[jira] [Commented] (FLINK-25694) Upgrade Presto to resolve GSON/Alluxio Vulnerability

2022-04-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524808#comment-17524808
 ] 

Martijn Visser commented on FLINK-25694:


Also fixed in:

release-1.14: 9a5d8b090f75262f216acbf89f240151327f9ecb
release-1.15: 229c5f053731f1af127971075ae8b052e6e48a83

Thanks again [~davidnperkins]

> Upgrade Presto to resolve GSON/Alluxio Vulnerability
> 
>
> Key: FLINK-25694
> URL: https://issues.apache.org/jira/browse/FLINK-25694
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.14.2
>Reporter: David Perkins
>Assignee: David Perkins
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.14.5, 1.15.1
>
>
> GSON has a bug, which was fixed in 2.8.9, see 
> [https://github.com/google/gson/pull/1991|https://github.com/google/gson/pull/1991].
>  This results in the possibility for DOS attacks.
> GSON is included in the `flink-s3-fs-presto` plugin, because Alluxio includes 
> it in their shaded client. I've opened an issue in Alluxio: 
> [https://github.com/Alluxio/alluxio/issues/14868]. When that is fixed, the 
> plugin also needs to be updated.





[GitHub] [flink] dmvk commented on a diff in pull request #19524: [FLINK-27317][build] Don't fork build for attaching sources

2022-04-20 Thread GitBox


dmvk commented on code in PR #19524:
URL: https://github.com/apache/flink/pull/19524#discussion_r853881875


##
pom.xml:
##
@@ -1224,12 +1224,12 @@ under the License.


org.apache.maven.plugins

maven-source-plugin
-   
2.2.1

Review Comment:
   is there any particular reason for the upgrade?






[GitHub] [flink-kubernetes-operator] SteNicholas commented on a diff in pull request #174: [FLINK-27023] Unify flink and operator config handling

2022-04-20 Thread GitBox


SteNicholas commented on code in PR #174:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/174#discussion_r853887878


##
helm/flink-kubernetes-operator/conf/flink-conf.yaml:
##
@@ -24,3 +25,16 @@ queryable-state.proxy.ports: 6125
 jobmanager.memory.process.size: 1600m
 taskmanager.memory.process.size: 1728m
 parallelism.default: 2
+
+# Flink operator related configs
+# kubernetes.operator.reconciler.reschedule.interval: 60 s
+# kubernetes.operator.reconciler.max.parallelism: 5
+# kubernetes.operator.reconciler.flink.cancel.job.timeout: 1min
+# kubernetes.operator.reconciler.flink.cluster.shutdown.timeout: 60s

Review Comment:
   ```suggestion
   # kubernetes.operator.reconciler.flink.cluster.shutdown.timeout: 60 s
   ```






[GitHub] [flink-kubernetes-operator] SteNicholas commented on a diff in pull request #174: [FLINK-27023] Unify flink and operator config handling

2022-04-20 Thread GitBox


SteNicholas commented on code in PR #174:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/174#discussion_r853888362


##
helm/flink-kubernetes-operator/conf/flink-conf.yaml:
##
@@ -24,3 +25,16 @@ queryable-state.proxy.ports: 6125
 jobmanager.memory.process.size: 1600m
 taskmanager.memory.process.size: 1728m
 parallelism.default: 2
+
+# Flink operator related configs
+# kubernetes.operator.reconciler.reschedule.interval: 60 s
+# kubernetes.operator.reconciler.max.parallelism: 5
+# kubernetes.operator.reconciler.flink.cancel.job.timeout: 1min

Review Comment:
   ```suggestion
   # kubernetes.operator.reconciler.flink.cancel.job.timeout: 1 min
   ```






[GitHub] [flink] zentol commented on pull request #19524: [FLINK-27317][build] Don't fork build for attaching sources

2022-04-20 Thread GitBox


zentol commented on PR #19524:
URL: https://github.com/apache/flink/pull/19524#issuecomment-1103652034

   Using `${project.basedir}` is a good idea. I'm not aware of any downsides in 
using `-no-fork`.





[GitHub] [flink] zentol commented on a diff in pull request #19524: [FLINK-27317][build] Don't fork build for attaching sources

2022-04-20 Thread GitBox


zentol commented on code in PR #19524:
URL: https://github.com/apache/flink/pull/19524#discussion_r853897149


##
pom.xml:
##
@@ -1224,12 +1224,12 @@ under the License.


org.apache.maven.plugins

maven-source-plugin
-   
2.2.1

Review Comment:
   No; it just felt like a good opportunity to do so since we're looking at the 
plugin anyway.






[GitHub] [flink] zentol commented on pull request #19524: [FLINK-27317][build] Don't fork build for attaching sources

2022-04-20 Thread GitBox


zentol commented on PR #19524:
URL: https://github.com/apache/flink/pull/19524#issuecomment-1103656845

   I would use the `-no-fork` goal though in any case because it saves some 
time.





[jira] [Commented] (FLINK-27274) Job cannot be recovered, after restarting cluster

2022-04-20 Thread macdoor615 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524814#comment-17524814
 ] 

macdoor615 commented on FLINK-27274:


[~martijnvisser] No, stop-cluster.sh doesn't clear job state. Please refer to 
log.recover.debug.zip.

The state of job 0f574f248180b8f8656cbab5916a151d is stored and recovered.

As [~zhuzh] said, stop-cluster.sh does not cancel the job; it should suspend the job.
> Job cannot be recovered, after restarting cluster
> -
>
> Key: FLINK-27274
> URL: https://issues.apache.org/jira/browse/FLINK-27274
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.15.0
> Environment: Flink 1.15.0-rc3
> [https://github.com/apache/flink/archive/refs/tags/release-1.15.0-rc3.tar.gz] 
>Reporter: macdoor615
>Priority: Major
> Attachments: flink-conf.yaml, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.3.zip, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.zip, 
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log, log.recover.debug.zip, 
> new_cf_alarm_no_recover.yaml.sql
>
>
> 1. execute new_cf_alarm_no_recover.yaml.sql with sql-client.sh
> config file: flink-conf.yaml
> the job run properly
> 2. restart cluster with command
> stop-cluster.sh
> start-cluster.sh
> 3. job cannot be recovered
> log files
> flink-gum-standalonesession-0-hb3-dev-flink-000.log
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log
> 4. not all job can not be recovered, some can, some can not, at same time
> 5. all job can be recovered on Flink 1.14.4



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] XComp commented on a diff in pull request #19444: [FLINK-27206][metrics] Deprecate reflection-based reporter instantiation

2022-04-20 Thread GitBox


XComp commented on code in PR #19444:
URL: https://github.com/apache/flink/pull/19444#discussion_r853878533


##
flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java:
##
@@ -360,6 +361,7 @@ private static List<ReporterSetup> setupReporters(
 return reporterSetups;
 }
 
+@SuppressWarnings("deprecation")
private static Optional<ReporterSetup> loadReporter(

Review Comment:
   What about adding a warning log message after the `factoryClassName != null` 
if-block stating that this option is deprecated, to make this more visible to the 
user through the logs? Or do users have to touch their reporter code anyway when 
updating Flink? Then they would see it through the `@deprecated` annotation of 
the `@InterceptInstantiationViaReflection` annotation...



##
flink-end-to-end-tests/flink-metrics-reporter-prometheus-test/src/test/java/org/apache/flink/metrics/prometheus/tests/PrometheusReporterEndToEndITCase.java:
##
@@ -324,37 +296,28 @@ private static void checkMetricAvailability(final 
OkHttpClient client, final Str
 static class TestParams {
 private final String jarLocationDescription;
 private final Consumer 
builderSetup;
-private final InstantiationType instantiationType;
 
 private TestParams(
 String jarLocationDescription,
-Consumer 
builderSetup,
-InstantiationType instantiationType) {
+Consumer 
builderSetup) {
 this.jarLocationDescription = jarLocationDescription;
 this.builderSetup = builderSetup;
-this.instantiationType = instantiationType;
 }
 
 public static TestParams from(
 String jarLocationDesription,
 Consumer 
builderSetup,
 InstantiationType instantiationType) {
-return new TestParams(jarLocationDesription, builderSetup, 
instantiationType);
+return new TestParams(jarLocationDesription, builderSetup);
 }
 
 public Consumer 
getBuilderSetup() {
 return builderSetup;
 }
 
-public InstantiationType getInstantiationType() {
-return instantiationType;
-}
-
 @Override
 public String toString() {
-return jarLocationDescription
-+ ", instantiated via "
-+ instantiationType.name().toLowerCase();
+return jarLocationDescription;
 }
 
 public enum InstantiationType {

Review Comment:
   The `REFLECTION` type is unused. I guess, we could get rid of the enum 
entirely now. No need to have this around...



##
flink-end-to-end-tests/flink-metrics-reporter-prometheus-test/src/test/java/org/apache/flink/metrics/prometheus/tests/PrometheusReporterEndToEndITCase.java:
##
@@ -324,37 +296,28 @@ private static void checkMetricAvailability(final 
OkHttpClient client, final Str
 static class TestParams {
 private final String jarLocationDescription;
 private final Consumer 
builderSetup;
-private final InstantiationType instantiationType;
 
 private TestParams(
 String jarLocationDescription,
-Consumer 
builderSetup,
-InstantiationType instantiationType) {
+Consumer 
builderSetup) {
 this.jarLocationDescription = jarLocationDescription;
 this.builderSetup = builderSetup;
-this.instantiationType = instantiationType;
 }
 
 public static TestParams from(
 String jarLocationDesription,
 Consumer 
builderSetup,
 InstantiationType instantiationType) {

Review Comment:
   unused



##
flink-core/src/main/java/org/apache/flink/configuration/MetricOptions.java:
##
@@ -67,8 +67,8 @@ public class MetricOptions {
 + " any of the names in the list will be 
started. Otherwise, all reporters that could be found in"
 + " the configuration will be started.");
 
-@Documentation.SuffixOption(NAMED_REPORTER_CONFIG_PREFIX)
-@Documentation.Section(value = Documentation.Sections.METRIC_REPORTERS, 
position = 1)
+/** @deprecated use {@link MetricOptions#REPORTER_FACTORY_CLASS} instead. 
*/
+@Deprecated
public static final ConfigOption<String> REPORTER_CLASS =

Review Comment:
   
[ConfigConstants#METRICS_REPORTER_CLASS_SUFFIX](https://github.com/apache/flink/blob/d834b271366ed508aba327aa94a14bb2b2e47a4c/flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java#L1116)
 is referring to this member. Just wanted to mention it but I guess it doesn't 
add any value...




[jira] [Updated] (FLINK-27315) [docs] Fix the demo of MemoryStateBackendMigration

2022-04-20 Thread Echo Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Echo Lee updated FLINK-27315:
-
Summary: [docs] Fix the demo of MemoryStateBackendMigration  (was: Fix the 
demo of MemoryStateBackendMigration)

> [docs] Fix the demo of MemoryStateBackendMigration
> --
>
> Key: FLINK-27315
> URL: https://issues.apache.org/jira/browse/FLINK-27315
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.14.4
>Reporter: Echo Lee
>Assignee: Echo Lee
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> There is a problem with the memorystatebackendmigration demo under [state 
> backends 
> doc|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#code-configuration]
> JobManagerStateBackend should be changed to JobManagerCheckpointStorage





[GitHub] [flink] zentol commented on a diff in pull request #19444: [FLINK-27206][metrics] Deprecate reflection-based reporter instantiation

2022-04-20 Thread GitBox


zentol commented on code in PR #19444:
URL: https://github.com/apache/flink/pull/19444#discussion_r853907099


##
flink-core/src/main/java/org/apache/flink/configuration/MetricOptions.java:
##
@@ -67,8 +67,8 @@ public class MetricOptions {
 + " any of the names in the list will be 
started. Otherwise, all reporters that could be found in"
 + " the configuration will be started.");
 
-@Documentation.SuffixOption(NAMED_REPORTER_CONFIG_PREFIX)
-@Documentation.Section(value = Documentation.Sections.METRIC_REPORTERS, 
position = 1)
+/** @deprecated use {@link MetricOptions#REPORTER_FACTORY_CLASS} instead. 
*/
+@Deprecated
public static final ConfigOption<String> REPORTER_CLASS =

Review Comment:
   they are both deprecated now, and are effectively the same thing. I think 
the current path is fine; someone using `METRICS_REPORTER_CLASS_SUFFIX` is led 
to `REPORTER_CLASS`, which then leads to `REPORTER_FACTORY_CLASS`.
   Sure, we could remove one jump, but I don't think we'd really get anything out 
of it.






[GitHub] [flink] fapaul commented on a diff in pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-20 Thread GitBox


fapaul commented on code in PR #19405:
URL: https://github.com/apache/flink/pull/19405#discussion_r853895298


##
flink-end-to-end-tests/flink-end-to-end-tests-elasticsearch-common/src/main/java/org/apache/flink/streaming/tests/KeyValue.java:
##
@@ -0,0 +1,74 @@
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.util.StringUtils;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/** A {@link Comparable} holder for key-value pairs. */
+public class KeyValue<K extends Comparable<K>, V extends Comparable<V>>
+implements Comparable<KeyValue<K, V>>, Serializable {
+
+private static final long serialVersionUID = 1L;
+
+/** The key of the key-value pair. */
+public K key;
+/** The value the key-value pair. */
+public V value;
+
+/** Creates a new key-value pair where all fields are null. */
+public KeyValue() {}
+
+private KeyValue(K key, V value) {
+this.key = key;
+this.value = value;
+}
+
+@Override
+public int compareTo(KeyValue<K, V> other) {

Review Comment:
   Do you need the `compareTo`? If you only need the comparison in a single 
place, it might be easier to use the existing `Tuple2` class and just pass in 
the respective comparator at the place where you want to compare things.
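As a minimal sketch of that suggestion (using java.util types in place of Flink's `Tuple2`, so everything here is illustrative, not the actual connector code): keep the pair type plain and pass an external comparator where the ordering is needed.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class PairCompareSketch {
    public static void main(String[] args) {
        // Plain pairs plus an external comparator, instead of making the
        // pair type itself Comparable (the idea suggested above).
        List<Map.Entry<String, Integer>> data = new ArrayList<>();
        data.add(new SimpleEntry<>("beta", 2));
        data.add(new SimpleEntry<>("alpha", 2));
        data.add(new SimpleEntry<>("alpha", 1));

        // Order by key first, then by value; the pair class stays ignorant
        // of how it is compared.
        Comparator<Map.Entry<String, Integer>> byKeyThenValue =
                Comparator.comparing((Map.Entry<String, Integer> e) -> e.getKey())
                        .thenComparing(e -> e.getValue());

        data.sort(byKeyThenValue);

        if (!data.get(0).equals(new SimpleEntry<>("alpha", 1))
                || !data.get(2).equals(new SimpleEntry<>("beta", 2))) {
            throw new AssertionError("unexpected sort order: " + data);
        }
        System.out.println(data);
    }
}
```

The ordering logic lives at the single call site that needs it, which is the trade-off the review comment is pointing at.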



##
flink-test-utils-parent/flink-connector-test-utils/src/main/java/org/apache/flink/connector/testframe/testsuites/SinkTestSuiteBase.java:
##
@@ -508,6 +509,7 @@ private void checkResultWithSemantic(
 
.matchesRecordsFromSource(Arrays.asList(sort(testData)), semantic);
 return true;
 } catch (Throwable t) {
+LOG.error("Ooops", t);

Review Comment:
   Please use a more descriptive error message.



##
flink-end-to-end-tests/flink-end-to-end-tests-elasticsearch6/pom.xml:
##
@@ -0,0 +1,127 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+	<modelVersion>4.0.0</modelVersion>
+
+	<parent>
+		<groupId>org.apache.flink</groupId>
+		<artifactId>flink-end-to-end-tests</artifactId>
+		<version>1.16-SNAPSHOT</version>
+		<relativePath>..</relativePath>
+	</parent>
+
+	<artifactId>flink-end-to-end-tests-elasticsearch6</artifactId>
+	<name>Flink : E2E Tests : Elasticsearch 6 Java</name>
+	<packaging>jar</packaging>
+
+	<properties>
+		<elasticsearch.version>6.8.20</elasticsearch.version>
+	</properties>
+
+	<dependencies>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-streaming-java</artifactId>
+			<version>${project.version}</version>
+			<scope>provided</scope>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-connector-elasticsearch6</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-end-to-end-tests-elasticsearch-common</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-end-to-end-tests-common</artifactId>
+			<version>${project.version}</version>
+			<scope>test</scope>
+		</dependency>
+	</dependencies>
+
+	<build>
+		<plugins>
+			<plugin>
+				<groupId>org.apache.maven.plugins</groupId>
+				<artifactId>maven-shade-plugin</artifactId>
+				<executions>
+					<execution>
+						<goals>
+							<goal>shade</goal>
Review Comment:
   Quick reminder why do we need shading here?



##
flink-end-to-end-tests/flink-end-to-end-tests-elasticsearch-common/src/main/java/org/apache/flink/streaming/tests/ElasticsearchDataReader.java:
##
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.connector.testframe.external.ExternalSystemDataReader;
+
+import java.time.Duration;
+import java.util.List;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/** Elasticsearch data reader. */
+public class ElasticsearchDataReader
+implements ExternalSys

[GitHub] [flink] zentol commented on a diff in pull request #19444: [FLINK-27206][metrics] Deprecate reflection-based reporter instantiation

2022-04-20 Thread GitBox


zentol commented on code in PR #19444:
URL: https://github.com/apache/flink/pull/19444#discussion_r853911263


##
flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java:
##
@@ -360,6 +361,7 @@ private static List<ReporterSetup> setupReporters(
 return reporterSetups;
 }
 
+@SuppressWarnings("deprecation")
private static Optional<ReporterSetup> loadReporter(

Review Comment:
   they generally don't have to touch reporters when upgrading. I made sure we 
log a warning whenever we don't use a factory.






[GitHub] [flink] XComp opened a new pull request, #19526: [BP-1.15][FLINK-24491][runtime] Make the job termination wait until the archiv…

2022-04-20 Thread GitBox


XComp opened a new pull request, #19526:
URL: https://github.com/apache/flink/pull/19526

   1.15 backport PR of parent PR #19275. Cherry-picking worked without 
conflicts.





[GitHub] [flink-kubernetes-operator] gyfora merged pull request #174: [FLINK-27023] Unify flink and operator config handling

2022-04-20 Thread GitBox


gyfora merged PR #174:
URL: https://github.com/apache/flink-kubernetes-operator/pull/174





[jira] [Closed] (FLINK-27023) Merge default flink and operator configuration settings for the operator

2022-04-20 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-27023.
--
Resolution: Fixed

merged to main: 805fef5849479b2777ad09258e42a474a32ce058

> Merge default flink and operator configuration settings for the operator
> 
>
> Key: FLINK-27023
> URL: https://issues.apache.org/jira/browse/FLINK-27023
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.0.0
>
>
> Based on the mailing list discussion : 
> [https://lists.apache.org/thread/pnf2gk9dgqv3qrtszqbfcdxf32t2gr3x]
> As a first step we can combine the operators default flink and operator 
> config.
> This includes the following changes:
>  # Get rid of the DefaultConfig class and replace with a single Configuration 
> object containing the settings for both.
>  # Rename OperatorConfigOptions -> KubernetesOperatorConfigOptions
>  # Prefix all options with `kubernetes` to get kubernetes.operator.
>  # In the helm chart combine the operatorConfiguration and 
> flinkDefaultConfiguration into a common defaultConfigurationSection. We 
> should still keep the logging settings separately for the two somehow





[jira] [Commented] (FLINK-27274) Job cannot be recovered, after restarting cluster

2022-04-20 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524827#comment-17524827
 ] 

Martijn Visser commented on FLINK-27274:


[~macdoor615] Once more: by default, stop-cluster.sh only stops a cluster. 
As I said, that means it stops the JobManagers and TaskManagers. I can imagine a 
race condition when running this command in an HA setup, causing a job 
sometimes to be cancelled and sometimes to show other behavior. That's because you're 
not following the correct process for restarting a cluster and/or job. What you're 
experiencing right now is not a bug, but the consequence of not following the 
correct steps. 

> Job cannot be recovered, after restarting cluster
> -
>
> Key: FLINK-27274
> URL: https://issues.apache.org/jira/browse/FLINK-27274
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.15.0
> Environment: Flink 1.15.0-rc3
> [https://github.com/apache/flink/archive/refs/tags/release-1.15.0-rc3.tar.gz] 
>Reporter: macdoor615
>Priority: Major
> Attachments: flink-conf.yaml, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.3.zip, 
> flink-gum-standalonesession-0-hb3-dev-flink-000.log.zip, 
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log, log.recover.debug.zip, 
> new_cf_alarm_no_recover.yaml.sql
>
>
> 1. execute new_cf_alarm_no_recover.yaml.sql with sql-client.sh
> config file: flink-conf.yaml
> the job run properly
> 2. restart cluster with command
> stop-cluster.sh
> start-cluster.sh
> 3. job cannot be recovered
> log files
> flink-gum-standalonesession-0-hb3-dev-flink-000.log
> flink-gum-taskexecutor-2-hb3-dev-flink-000.log
> 4. not all job can not be recovered, some can, some can not, at same time
> 5. all job can be recovered on Flink 1.14.4





[GitHub] [flink] flinkbot commented on pull request #19526: [BP-1.15][FLINK-24491][runtime] Make the job termination wait until the archiv…

2022-04-20 Thread GitBox


flinkbot commented on PR #19526:
URL: https://github.com/apache/flink/pull/19526#issuecomment-1103685638

   
   ## CI report:
   
   * f66b387d5acfe1cc03bc80b0c73c471b62d3892e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] XComp commented on a diff in pull request #19444: [FLINK-27206][metrics] Deprecate reflection-based reporter instantiation

2022-04-20 Thread GitBox


XComp commented on code in PR #19444:
URL: https://github.com/apache/flink/pull/19444#discussion_r853919505


##
flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java:
##
@@ -462,6 +462,15 @@ private static Optional<ReporterSetup> loadViaReflection(
 alternativeFactoryClassName);
 return loadViaFactory(
 alternativeFactoryClassName, reporterName, reporterConfig, 
reporterFactories);
+} else {
+LOG.warn(

Review Comment:
   This warning won't be printed if a user still uses 
`MetricOptions#REPORTER_CLASS` but has a matching factory class that can be 
used for `loadViaFactory`. I think we should move this log message up the 
call stack and add it to `loadReporter` instead...






[GitHub] [flink] zoltar9264 commented on a diff in pull request #19441: [FLINK-27187][state/changelog] Add changelog storage metric totalAttemptsPerUpload

2022-04-20 Thread GitBox


zoltar9264 commented on code in PR #19441:
URL: https://github.com/apache/flink/pull/19441#discussion_r853922622


##
flink-dstl/flink-dstl-dfs/src/test/java/org/apache/flink/changelog/fs/ChangelogStorageMetricsTest.java:
##
@@ -295,6 +349,39 @@ public void close() {
 }
 }
 
+private static class FixedLatencyUploader implements StateChangeUploader {
+private final long latency;
+
+public FixedLatencyUploader(long latency) {
+this.latency = latency;
+}
+
+@Override
+public UploadTasksResult upload(Collection tasks) throws 
IOException {
+Map> map = new HashMap<>();
+
+try {
+TimeUnit.MILLISECONDS.sleep(latency);

Review Comment:
   Thanks for your nice suggestion @rkhachatryan .
   
   First, yes, testTotalAttemptsPerUpload completes on the first attempt.
   
   Second, I agree with keeping these tests separate.
   
   Third, regarding the implementation of testTotalAttemptsPerUpload, I think you 
are right. And I think we can implement waiting on the 1st attempt for the others 
in MaxAttemptUploader. What do you think?
   

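The helper idea discussed here can be sketched generically (invented names, no Flink interfaces; purely an illustration, not the actual MaxAttemptUploader): a test uploader that can sleep on its first attempt and only succeeds on the N-th call, so a test can assert the total number of attempts.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical test double: fails until the N-th attempt and can add
 *  latency on the first attempt. Names are invented for illustration. */
public class AttemptingUploaderSketch {
    private final int succeedOnAttempt;
    private final long firstAttemptLatencyMs;
    private final AtomicInteger attempts = new AtomicInteger();

    public AttemptingUploaderSketch(int succeedOnAttempt, long firstAttemptLatencyMs) {
        this.succeedOnAttempt = succeedOnAttempt;
        this.firstAttemptLatencyMs = firstAttemptLatencyMs;
    }

    /** Returns true once the upload finally succeeds. */
    public boolean upload() throws IOException, InterruptedException {
        int attempt = attempts.incrementAndGet();
        if (attempt == 1 && firstAttemptLatencyMs > 0) {
            Thread.sleep(firstAttemptLatencyMs); // simulate a slow first attempt
        }
        if (attempt < succeedOnAttempt) {
            throw new IOException("simulated failure on attempt " + attempt);
        }
        return true;
    }

    public int totalAttempts() {
        return attempts.get();
    }

    public static void main(String[] args) throws Exception {
        AttemptingUploaderSketch uploader = new AttemptingUploaderSketch(3, 10);
        for (int i = 0; i < 3; i++) {
            try {
                if (uploader.upload()) {
                    break;
                }
            } catch (IOException retry) {
                // retry loop, as a retrying executor would do
            }
        }
        if (uploader.totalAttempts() != 3) {
            throw new AssertionError("expected 3 attempts, got " + uploader.totalAttempts());
        }
        System.out.println("total attempts: " + uploader.totalAttempts());
    }
}
```

A `totalAttemptsPerUpload` metric tested against such a double would be expected to report 3 for this configuration.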





[GitHub] [flink] zentol commented on a diff in pull request #19444: [FLINK-27206][metrics] Deprecate reflection-based reporter instantiation

2022-04-20 Thread GitBox


zentol commented on code in PR #19444:
URL: https://github.com/apache/flink/pull/19444#discussion_r853933046


##
flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java:
##
@@ -462,6 +462,15 @@ private static Optional<ReporterSetup> loadViaReflection(
 alternativeFactoryClassName);
 return loadViaFactory(
 alternativeFactoryClassName, reporterName, reporterConfig, 
reporterFactories);
+} else {
+LOG.warn(

Review Comment:
   👍 






[GitHub] [flink] afedulov commented on a diff in pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-20 Thread GitBox


afedulov commented on code in PR #19405:
URL: https://github.com/apache/flink/pull/19405#discussion_r853933757


##
flink-test-utils-parent/flink-connector-test-utils/src/main/java/org/apache/flink/connector/testframe/testsuites/SinkTestSuiteBase.java:
##
@@ -508,6 +509,7 @@ private void checkResultWithSemantic(
 
.matchesRecordsFromSource(Arrays.asList(sort(testData)), semantic);
 return true;
 } catch (Throwable t) {
+LOG.error("Ooops", t);

Review Comment:
   Whoops, this was not supposed to be merged. The way the test is designed, the 
exception is swallowed by default; this is a remnant of my debugging.






[GitHub] [flink] afedulov commented on a diff in pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-20 Thread GitBox


afedulov commented on code in PR #19405:
URL: https://github.com/apache/flink/pull/19405#discussion_r853934778


##
flink-end-to-end-tests/flink-end-to-end-tests-elasticsearch-common/src/main/java/org/apache/flink/streaming/tests/ElasticsearchDataReader.java:
##
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.connector.testframe.external.ExternalSystemDataReader;
+
+import java.time.Duration;
+import java.util.List;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/** Elasticsearch data reader. */
+public class ElasticsearchDataReader
+implements ExternalSystemDataReader> {
+private final ElasticsearchClient client;
+private final String indexName;
+private final int pageLength;
+private int from;
+
+public ElasticsearchDataReader(ElasticsearchClient client, String 
indexName, int pageLength) {
+this.client = checkNotNull(client);
+this.indexName = checkNotNull(indexName);
+this.pageLength = pageLength;
+}
+
+@Override
+public List> poll(Duration timeout) {
+client.refreshIndex(indexName);
+// TODO: Tests are flaky without this small delay.

Review Comment:
   https://github.com/apache/flink/pull/19405#discussion_r848945151






[GitHub] [flink] afedulov commented on a diff in pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-20 Thread GitBox


afedulov commented on code in PR #19405:
URL: https://github.com/apache/flink/pull/19405#discussion_r853936471


##
flink-end-to-end-tests/flink-end-to-end-tests-elasticsearch-common/src/main/java/org/apache/flink/streaming/tests/KeyValue.java:
##
@@ -0,0 +1,74 @@
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.util.StringUtils;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/** A {@link Comparable} holder for key-value pairs. */
+public class KeyValue<K extends Comparable<K>, V extends Comparable<V>>
+implements Comparable<KeyValue<K, V>>, Serializable {
+
+private static final long serialVersionUID = 1L;
+
+/** The key of the key-value pair. */
+public K key;
+/** The value the key-value pair. */
+public V value;
+
+/** Creates a new key-value pair where all fields are null. */
+public KeyValue() {}
+
+private KeyValue(K key, V value) {
+this.key = key;
+this.value = value;
+}
+
+@Override
+public int compareTo(KeyValue<K, V> other) {

Review Comment:
   This is baked into the testing framework:
   
https://github.com/apache/flink/blob/91c2b8d56bae03c2ce4a50b5c014cea842df9f74/flink-test-utils-parent/flink-connector-test-utils/src/main/java/org/apache/flink/connector/testframe/testsuites/SinkTestSuiteBase.java#L518






[jira] [Created] (FLINK-27318) KafkaSink: when init transaction, it take too long to get a producerId with epoch=0

2022-04-20 Thread Zhengqi Zhang (Jira)
Zhengqi Zhang created FLINK-27318:
-

 Summary: KafkaSink: when init transaction, it take too long to get 
a producerId with epoch=0
 Key: FLINK-27318
 URL: https://issues.apache.org/jira/browse/FLINK-27318
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Affects Versions: 1.14.4
Reporter: Zhengqi Zhang
 Attachments: image-2022-04-20-17-34-48-207.png

As we can see, the new KafkaSink aborts all transactions that have been created 
by a subtask in a previous run, and only returns once it gets a producerId that was 
unused before (epoch=0). This can take a long time, especially if the task has been 
started and cancelled many times before. In my tests, it even took {*}10 
minutes{*}. Is there a better way to solve this problem, or should we {*}do what 
FlinkKafkaProducer did{*}?

!image-2022-04-20-17-34-48-207.png|width=556,height=412!
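The cost described above grows with the number of previous runs: recovery probes transactional ids one by one until it reaches one whose producer epoch is 0. A toy, Kafka-free simulation of that linear probing (the id scheme and counts are invented for illustration; no Kafka APIs are involved):

```java
import java.util.HashSet;
import java.util.Set;

public class TransactionProbeSketch {
    public static void main(String[] args) {
        // Pretend each previous run of the job left one used transactional id behind.
        int previousRuns = 1000;
        Set<String> usedIds = new HashSet<>();
        for (int i = 0; i < previousRuns; i++) {
            usedIds.add("sink-0-" + i); // hypothetical "prefix-subtask-counter" scheme
        }

        // Recovery walks the ids in order, "aborting" each used one, until it
        // finds an id that was never used (epoch == 0). Each abort stands in
        // for a broker round trip in the real sink, hence the long startup.
        int aborts = 0;
        int counter = 0;
        while (usedIds.contains("sink-0-" + counter)) {
            aborts++;
            counter++;
        }

        if (aborts != previousRuns) {
            throw new AssertionError("expected " + previousRuns + " aborts, got " + aborts);
        }
        System.out.println("aborted " + aborts + " stale transactions before finding a fresh id");
    }
}
```

With one leftover id per run, the work is linear in the number of prior runs, which matches the reported multi-minute startup after many start/cancel cycles.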





[jira] [Created] (FLINK-27319) Duplicated "-t" option for savepoint format and deployment target

2022-04-20 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-27319:


 Summary: Duplicated "-t" option for savepoint format and 
deployment target
 Key: FLINK-27319
 URL: https://issues.apache.org/jira/browse/FLINK-27319
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client
Affects Versions: 1.15.0
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz
 Fix For: 1.15.0


The two options, savepoint format and deployment target, have the same short 
option, which causes a clash and makes the CLI fail.

I suggest dropping the short "-t" for the savepoint format.





[GitHub] [flink] XComp commented on a diff in pull request #19444: [FLINK-27206][metrics] Deprecate reflection-based reporter instantiation

2022-04-20 Thread GitBox


XComp commented on code in PR #19444:
URL: https://github.com/apache/flink/pull/19444#discussion_r853950240


##
flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java:
##
@@ -375,6 +377,15 @@ private static Optional<ReporterSetup> loadReporter(
 }
 
 if (reporterClassName != null) {
+LOG.warn(
+"The reporter configuration of '{}' configures the 
reporter class, which is a deprecated approach to configure reporters'."

Review Comment:
   ```suggestion
   "The reporter configuration of '{}' configures the 
reporter class, which is a deprecated approach to configure reporters."
   ```






[GitHub] [flink] dawidwys opened a new pull request, #19527: [FLINK-27319] Duplicated '-t' option for savepoint format and deployment target

2022-04-20 Thread GitBox


dawidwys opened a new pull request, #19527:
URL: https://github.com/apache/flink/pull/19527

   
   ## Verifying this change
   
   Run:
   ```
   ../flink-1.15.0/bin/flink savepoint --type canonical  savepoints
   ```
   check it runs correctly
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented) - the short option was undocumented
   





[jira] [Updated] (FLINK-27319) Duplicated "-t" option for savepoint format and deployment target

2022-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27319:
---
Labels: pull-request-available  (was: )

> Duplicated "-t" option for savepoint format and deployment target
> -
>
> Key: FLINK-27319
> URL: https://issues.apache.org/jira/browse/FLINK-27319
> Project: Flink
>  Issue Type: Bug
>  Components: Command Line Client
>Affects Versions: 1.15.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> The two options, savepoint format and deployment target, share the same short 
> option, which causes a clash and makes the CLI fail.
> I suggest dropping the short "-t" for the savepoint format.
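The clash can be illustrated with a tiny, self-contained sketch (this is not Flink's actual CLI code; the class and method names below are made up for illustration): a parser that binds short option names to long ones must reject, or silently overwrite, a second registration of `-t`.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only (hypothetical names, not Flink's CLI code):
 *  two options registering the same short name collide at setup time. */
class OptionRegistry {
    private final Map<String, String> shortToLong = new HashMap<>();

    /** Binds a short option name to its long form; rejects duplicates. */
    void add(String shortName, String longName) {
        String previous = shortToLong.putIfAbsent(shortName, longName);
        if (previous != null) {
            throw new IllegalArgumentException(
                    "Short option '-" + shortName + "' is already bound to '--" + previous + "'");
        }
    }

    String lookup(String shortName) {
        return shortToLong.get(shortName);
    }
}
```

Dropping the short `-t` for the savepoint format, as suggested, removes the second registration and leaves `-t` unambiguously bound to the deployment target.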





[jira] [Created] (FLINK-27320) When using KafkaSink EXACTLY_ONCE semantics, frequent OutOfOrderSequenceException anomalies

2022-04-20 Thread Zhengqi Zhang (Jira)
Zhengqi Zhang created FLINK-27320:
-

 Summary: When using KafkaSink EXACTLY_ONCE semantics, frequent 
OutOfOrderSequenceException anomalies
 Key: FLINK-27320
 URL: https://issues.apache.org/jira/browse/FLINK-27320
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.4
Reporter: Zhengqi Zhang
 Attachments: image-2022-04-20-17-48-37-149.png, 
image-2022-04-20-17-49-15-143.png

This problem does not occur when using EXACTLY_ONCE semantics in 
FlinkKafkaProducer, but occurs frequently when using KafkaSink.

!image-2022-04-20-17-48-37-149.png|width=573,height=220!

!image-2022-04-20-17-49-15-143.png|width=818,height=469!

This is ProducerConfig when using KafkaSink:
{code:java}
    acks = 1
    batch.size = 16384
    bootstrap.servers = [localhost:9092]
    buffer.memory = 33554432
    client.dns.lookup = default
    client.id = 
    compression.type = none
    connections.max.idle.ms = 54
    delivery.timeout.ms = 12
    enable.idempotence = false
    interceptor.classes = []
    key.serializer = class 
org.apache.kafka.common.serialization.ByteArraySerializer
    linger.ms = 0
    max.block.ms = 6
    max.in.flight.requests.per.connection = 5
    max.request.size = 1048576
    metadata.max.age.ms = 30
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 3
    partitioner.class = class 
org.apache.kafka.clients.producer.internals.DefaultPartitioner
    receive.buffer.bytes = 32768
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 3
    retries = 2147483647
    retry.backoff.ms = 100
    sasl.client.callback.handler.class = null
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 6
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.login.callback.handler.class = null
    sasl.login.class = null
    sasl.login.refresh.buffer.seconds = 300
    sasl.login.refresh.min.period.seconds = 60
    sasl.login.refresh.window.factor = 0.8
    sasl.login.refresh.window.jitter = 0.05
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    security.providers = null
    send.buffer.bytes = 131072
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    ssl.endpoint.identification.algorithm = https
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLS
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS
    transaction.timeout.ms = 30
    transactional.id = kafka-sink-0-36
    value.serializer = class 
org.apache.kafka.common.serialization.ByteArraySerializer{code}
This is ProducerConfig when using FlinkKafkaProducer:

 
{code:java}
    acks = 1
    batch.size = 16384
    bootstrap.servers = [localhost:9092]
    buffer.memory = 33554432
    client.dns.lookup = default
    client.id = 
    compression.type = none
    connections.max.idle.ms = 54
    delivery.timeout.ms = 12
    enable.idempotence = false
    interceptor.classes = []
    key.serializer = class 
org.apache.kafka.common.serialization.ByteArraySerializer
    linger.ms = 0
    max.block.ms = 6
    max.in.flight.requests.per.connection = 5
    max.request.size = 1048576
    metadata.max.age.ms = 30
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 3
    partitioner.class = class 
org.apache.kafka.clients.producer.internals.DefaultPartitioner
    receive.buffer.bytes = 32768
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 3
    retries = 2147483647
    retry.backoff.ms = 100
    sasl.client.callback.handler.class = null
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 6
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.login.callback.handler.class = null
    sasl.login.class = null
    sasl.login.refresh.buffer.seconds = 300
    sasl.login.refresh.min.period.seconds = 60
    sasl.login.refresh.window.factor = 0.8
    sasl.login.refresh.window.jitter = 0.05
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    security.providers = null
    send.buffer.bytes = 131072
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    ssl.endpoint.identification.algorithm

[jira] [Assigned] (FLINK-25470) Add/Expose/Differentiate metrics of checkpoint size between changelog size vs materialization size

2022-04-20 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang reassigned FLINK-25470:


Assignee: Hangxiang Yu

> Add/Expose/Differentiate metrics of checkpoint size between changelog size vs 
> materialization size
> --
>
> Key: FLINK-25470
> URL: https://issues.apache.org/jira/browse/FLINK-25470
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics, Runtime / State Backends
>Reporter: Yuan Mei
>Assignee: Hangxiang Yu
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: Screen Shot 2021-12-29 at 1.09.48 PM.png
>
>
> FLINK-25557 only resolves part of the problem. 
> Eventually, we should answer these questions:
>  * How much the data size increases/explodes
>  * When a checkpoint includes a new materialization
>  * Materialization size
>  * Changelog sizes since the last complete checkpoint (from which restore 
> time can roughly be inferred)
>  
>  





[jira] [Updated] (FLINK-27318) KafkaSink: when init transaction, it takes too long to get a producerId with epoch=0

2022-04-20 Thread Zhengqi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhengqi Zhang updated FLINK-27318:
--
 Attachment: image-2022-04-20-17-59-27-397.png
Description: 
As we can see, the new KafkaSink aborts all transactions that have been created 
by a subtask in a previous run, and only returns once it gets a producerId that 
was unused before (epoch=0). But this can take a long time, especially if the 
task has been started and cancelled many times before. In my tests, it even took 
{*}10 minutes{*}. Is there a better way to solve this problem, or should we 
{*}do what FlinkKafkaProducer did{*}?

!image-2022-04-20-17-59-27-397.png|width=534,height=256!

!image-2022-04-20-17-34-48-207.png|width=556,height=412!

  was:
as we can see, the new KafkaSink aborts all transactions that have been created 
by a subtask in a previous run, only return when get a producerId was unused 
before(epoch=0). But this can take a long time, especially if the task has been 
started and cancelled many times before. In my tests, it even took {*}10 
minutes{*}. Is there a better way to solve this problem, or {*}do what 
FlinkKafkaProducer did{*}.

!image-2022-04-20-17-34-48-207.png|width=556,height=412!


> KafkaSink: when init transaction, it takes too long to get a producerId with 
> epoch=0
> ---
>
> Key: FLINK-27318
> URL: https://issues.apache.org/jira/browse/FLINK-27318
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Zhengqi Zhang
>Priority: Major
> Attachments: image-2022-04-20-17-34-48-207.png, 
> image-2022-04-20-17-59-27-397.png
>
>
> As we can see, the new KafkaSink aborts all transactions that have been 
> created by a subtask in a previous run, and only returns once it gets a 
> producerId that was unused before (epoch=0). But this can take a long time, 
> especially if the task has been started and cancelled many times before. In 
> my tests, it even took {*}10 minutes{*}. Is there a better way to solve this 
> problem, or should we {*}do what FlinkKafkaProducer did{*}?
> !image-2022-04-20-17-59-27-397.png|width=534,height=256!
> !image-2022-04-20-17-34-48-207.png|width=556,height=412!
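The cost described in this ticket can be sketched with a small simulation (hypothetical code, not the real Kafka protocol or the actual KafkaSink implementation): each transactional id used by a previous run costs one extra init-transaction round trip before a producer id with epoch == 0 is found, so the recovery time grows with the number of prior attempts.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical simulation of the probing loop described above: the sink
 *  probes transactional ids "prefix-N" until the broker hands out a fresh
 *  producer id (epoch == 0); every prior run adds one more round trip. */
class EpochProbe {
    private final Map<String, Integer> brokerEpochs; // transactionalId -> epoch

    EpochProbe(Map<String, Integer> brokerEpochs) {
        this.brokerEpochs = brokerEpochs;
    }

    /** Returns the number of init-transaction round trips until epoch == 0. */
    int probesUntilFresh(String prefix) {
        int probes = 0;
        for (int n = 0; ; n++) {
            probes++;
            int epoch = brokerEpochs.getOrDefault(prefix + "-" + n, 0);
            if (epoch == 0) {
                return probes; // fresh producer id, recovery loop terminates
            }
        }
    }
}
```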





[jira] [Commented] (FLINK-25470) Add/Expose/Differentiate metrics of checkpoint size between changelog size vs materialization size

2022-04-20 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524854#comment-17524854
 ] 

Yun Tang commented on FLINK-25470:
--

[~masteryhx] Already assigned to you.

> Add/Expose/Differentiate metrics of checkpoint size between changelog size vs 
> materialization size
> --
>
> Key: FLINK-25470
> URL: https://issues.apache.org/jira/browse/FLINK-25470
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics, Runtime / State Backends
>Reporter: Yuan Mei
>Assignee: Hangxiang Yu
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: Screen Shot 2021-12-29 at 1.09.48 PM.png
>
>
> FLINK-25557 only resolves part of the problem. 
> Eventually, we should answer these questions:
>  * How much the data size increases/explodes
>  * When a checkpoint includes a new materialization
>  * Materialization size
>  * Changelog sizes since the last complete checkpoint (from which restore 
> time can roughly be inferred)
>  
>  





[GitHub] [flink] hehuiyuan commented on pull request #19069: [FLINK-24862][Connectors / Hive][backport]Fix user-defined hive udaf/udtf cannot be used normally in hive dialect for flink1.14

2022-04-20 Thread GitBox


hehuiyuan commented on PR #19069:
URL: https://github.com/apache/flink/pull/19069#issuecomment-1103742860

   Hi @luoyuxia, could you take a look at this?





[GitHub] [flink] flinkbot commented on pull request #19527: [FLINK-27319] Duplicated '-t' option for savepoint format and deployment target

2022-04-20 Thread GitBox


flinkbot commented on PR #19527:
URL: https://github.com/apache/flink/pull/19527#issuecomment-1103743892

   
   ## CI report:
   
   * 2665d353d823e878efe887ead9d3686be173f368 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Commented] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore

2022-04-20 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524857#comment-17524857
 ] 

Roman Khachatryan commented on FLINK-27155:
---

I think we should consider widening the scope of the cache, because a single 
file may be used by multiple tasks.

However, with such caching we'll have to deal with concurrent file accesses. If 
we assume that IO is the bottleneck, then the easiest and most efficient way to 
address it is to serialize all these accesses.

Another problem is reading the file out-of-order (leading to random IO). It can 
be solved by collecting the offsets from all tasks and then passing them to the 
reader (along with file names and callbacks). This can be done separately (as 
the next step) once the same file is read from one place.

 

Code-wise, StateChangelogStorage has exactly this scope: it is created on the TM 
per job, and then used for all tasks of this job.

Similar to how multiple FsStateChangelogWriter share the same 
StateChangeUploadScheduler, readers can share some cache manager.

 

To cleanup the cache, I think it's better to use something like reference 
counting:
 * increment on 1st use of file or just changelogStorage.createReader()
 * decrement on reader.close(), at the end of backend recovery

Using (only) timeouts can be less efficient (files kept unnecessarily long, 
which matters because of space amplification with the changelog) or unstable 
with small timeouts.

 

WDYT?
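The reference-counting idea above can be sketched roughly like this (a hypothetical class for illustration, not the actual StateChangelogStorage code): the count is incremented on first use of a file (or on createReader()) and decremented on reader.close(); the cached copy becomes eligible for deletion when the last reference is released.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch of reference-counted cache cleanup; hypothetical names. */
class RefCountedCache {
    private final Map<String, Integer> refCounts = new HashMap<>();

    /** Increment on first use of a file, e.g. in createReader(). */
    synchronized void retain(String file) {
        refCounts.merge(file, 1, Integer::sum);
    }

    /** Decrement on reader.close(); returns true when the cached copy
     *  can be deleted because no readers remain. */
    synchronized boolean release(String file) {
        int count = refCounts.merge(file, -1, Integer::sum);
        if (count <= 0) {
            refCounts.remove(file);
            return true; // last reference gone -> safe to evict
        }
        return false;
    }

    synchronized int activeReaders(String file) {
        return refCounts.getOrDefault(file, 0);
    }
}
```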

> Reduce multiple reads to the same Changelog file in the same taskmanager 
> during restore
> ---
>
> Key: FLINK-27155
> URL: https://issues.apache.org/jira/browse/FLINK-27155
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
> Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, State changes of different operators in the 
> same taskmanager may be written to the same changelog file, which effectively 
> reduces the number of files and requests to DFS.
> But on the other hand, the current implementation also reads the same 
> changelog file multiple times on recovery. More specifically, the number of 
> times the same changelog file is accessed is related to the number of 
> ChangeSets contained in it. And since each read needs to skip the preceding 
> bytes, this network traffic is also wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple 
> slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same 
> taskmanager during restore.
> One possible approach is to read the changelog file all at once on first 
> access and cache it in memory or in a local file for a period of time.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental 
> checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] .
> Hi [~ym] , [~roman], what do you think?





[GitHub] [flink] afedulov commented on a diff in pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-20 Thread GitBox


afedulov commented on code in PR #19405:
URL: https://github.com/apache/flink/pull/19405#discussion_r853933757


##
flink-test-utils-parent/flink-connector-test-utils/src/main/java/org/apache/flink/connector/testframe/testsuites/SinkTestSuiteBase.java:
##
@@ -508,6 +509,7 @@ private void checkResultWithSemantic(
 
.matchesRecordsFromSource(Arrays.asList(sort(testData)), semantic);
 return true;
 } catch (Throwable t) {
+LOG.error("Ooops", t);

Review Comment:
   Whoops, this was not supposed to be merged. The way the test is designed, the 
exception is swallowed by default; this is an accidental leftover from debugging.






[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #168: [FLINK-27161] Support fetch user jar from different source

2022-04-20 Thread GitBox


wangyang0918 commented on code in PR #168:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/168#discussion_r853929995


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/artifact/HttpArtifactFetcher.java:
##
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.artifact;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.commons.io.FilenameUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+
+/** Download the jar from the http resource. */
+public class HttpArtifactFetcher implements ArtifactFetcher {
+
+public static final Logger LOG = 
LoggerFactory.getLogger(HttpArtifactFetcher.class);
+public static final HttpArtifactFetcher INSTANCE = new 
HttpArtifactFetcher();
+
+@Override
+public File fetch(String uri, File targetDir) throws Exception {
+var start = System.currentTimeMillis();
+URL url = new URL(uri);
+String fileName = FilenameUtils.getName(url.getFile());
+File targetFile = new File(targetDir, fileName);
+FileUtils.copyToFile(new URL(uri).openStream(), targetFile);

Review Comment:
   Same as above.



##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/artifact/FileSystemBasedArtifactFetcher.java:
##
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.artifact;
+
+import org.apache.flink.core.fs.FileSystem;
+
+import org.apache.commons.io.FileUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+
+/** Leverage the flink filesystem plugin to fetch the artifact. */
+public class FileSystemBasedArtifactFetcher implements ArtifactFetcher {
+
+public static final Logger LOG = 
LoggerFactory.getLogger(FileSystemBasedArtifactFetcher.class);
+public static final FileSystemBasedArtifactFetcher INSTANCE =
+new FileSystemBasedArtifactFetcher();
+
+@Override
+public File fetch(String uri, File targetDir) throws Exception {
+org.apache.flink.core.fs.Path source = new 
org.apache.flink.core.fs.Path(uri);
+var start = System.currentTimeMillis();
+FileSystem fileSystem = source.getFileSystem();
+String fileName = source.getName();
+File targetFile = new File(targetDir, fileName);
+FileUtils.copyToFile(fileSystem.open(source), targetFile);

Review Comment:
   It seems that the opened stream is not closed correctly.
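A common way to address this is try-with-resources. The sketch below uses plain JDK I/O rather than the Flink FileSystem and commons-io calls from the patch, so it is an assumed shape of the fix, not the committed code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Sketch: the opened stream is always closed, even if the copy fails. */
class StreamCopy {
    static long copyAndClose(InputStream in, Path targetFile) throws IOException {
        // try-with-resources guarantees in.close() on both success and failure
        try (InputStream stream = in) {
            return Files.copy(stream, targetFile, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```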



##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/artifact/ArtifactManagerTest.java:
##
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either expr

[jira] [Commented] (FLINK-24960) YARNSessionCapacitySchedulerITCase.testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots ha

2022-04-20 Thread Biao Geng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524860#comment-17524860
 ] 

Biao Geng commented on FLINK-24960:
---

Hi [~mapohl], I tried to do some research, but in the latest 
[failure|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=34743&view=logs&j=298e20ef-7951-5965-0e79-ea664ddc435e&t=d4c90338-c843-57b0-3232-10ae74f00347&l=36026]
 posted by Yun, I did not find the {{Extracted hostname:port: }} line. In the 
description of this ticket, though, the old CI test shows {{Extracted 
hostname:port: 5718b812c7ab:38622}}, which seems to be correct. 
I plan to first verify that {{yarnSessionClusterRunner.sendStop();}} works fine 
(i.e. the session cluster is stopped normally), but I have not found a way to 
run only the cron test's "test_cron_jdk11 misc" test on my own [azure 
pipeline|https://dev.azure.com/samuelgeng7/Flink/_build/results?buildId=109&view=results],
 which makes the verification pretty slow and hard. Are there any guidelines 
for debugging the azure pipeline with some specific tests?

> YARNSessionCapacitySchedulerITCase.testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots
>  hangs on azure
> ---
>
> Key: FLINK-24960
> URL: https://issues.apache.org/jira/browse/FLINK-24960
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.15.0, 1.14.3
>Reporter: Yun Gao
>Assignee: Niklas Semmler
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> Nov 18 22:37:08 
> 
> Nov 18 22:37:08 Test 
> testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots(org.apache.flink.yarn.YARNSessionCapacitySchedulerITCase)
>  is running.
> Nov 18 22:37:08 
> 
> Nov 18 22:37:25 22:37:25,470 [main] INFO  
> org.apache.flink.yarn.YARNSessionCapacitySchedulerITCase [] - Extracted 
> hostname:port: 5718b812c7ab:38622
> Nov 18 22:52:36 
> ==
> Nov 18 22:52:36 Process produced no output for 900 seconds.
> Nov 18 22:52:36 
> ==
>  {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=26722&view=logs&j=f450c1a5-64b1-5955-e215-49cb1ad5ec88&t=cc452273-9efa-565d-9db8-ef62a38a0c10&l=36395





[GitHub] [flink] afedulov commented on a diff in pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-20 Thread GitBox


afedulov commented on code in PR #19405:
URL: https://github.com/apache/flink/pull/19405#discussion_r853933757


##
flink-test-utils-parent/flink-connector-test-utils/src/main/java/org/apache/flink/connector/testframe/testsuites/SinkTestSuiteBase.java:
##
@@ -508,6 +509,7 @@ private void checkResultWithSemantic(
 
.matchesRecordsFromSource(Arrays.asList(sort(testData)), semantic);
 return true;
 } catch (Throwable t) {
+LOG.error("Ooops", t);

Review Comment:
   This is not going to make it into the final commit; I left it in for the still-
open discussion of test stability without "sleep" in the reader. The exception 
is simply swallowed in the base tests.
   






[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #170: [FLINK-27235] Publish Flink k8s Operator Helm Charts via Github Actions

2022-04-20 Thread GitBox


wangyang0918 commented on code in PR #170:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/170#discussion_r853962112


##
.github/workflows/helm-charts.yaml:
##
@@ -0,0 +1,35 @@
+name: Release Charts
+
+on:
+  push:
+branches:
+  - main
+
+jobs:
+  release:
+runs-on: ubuntu-latest
+steps:
+  - name: Checkout
+uses: actions/checkout@v2
+with:
+  fetch-depth: 0
+
+  - name: Configure Git
+run: |
+  git config user.name "$GITHUB_ACTOR"
+  git config user.email "$github_ac...@users.noreply.github.com"
+
+  - name: Install Helm
+uses: azure/setup-helm@v1
+with:
+  version: v3.8.1
+
+  - name: Run chart-releaser
+uses: helm/chart-releaser-action@v1.4.0

Review Comment:
   Thanks. I get your point.





