[jira] [Commented] (FLINK-22139) Flink Jobmanager & Task Manager logs are not writing to the log files

2021-05-03 Thread Bhagi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338745#comment-17338745
 ] 

Bhagi commented on FLINK-22139:
---

Hi Till,

I was able to see the logs inside the container in non-HA mode. Let me check or 
reconfigure the Flink HA mode config file.

Thanks

> Flink Jobmanager & Task Manager logs are not writing to the log files
> -
>
> Key: FLINK-22139
> URL: https://issues.apache.org/jira/browse/FLINK-22139
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.12.2
> Environment: on kubernetes flink standalone deployment with 
> jobmanager HA is enabled.
>Reporter: Bhagi
>Priority: Major
>
> Hi Team,
> I am submitting jobs and restarting the job manager and task manager pods.
> Log files are created with the task manager and job manager names, but the
> job manager & task manager log files have size 0. I am not sure what
> configuration is missing or why logs are not being written to their log files.
> ### Task Manager pod ###
> flink@flink-taskmanager-85b6585b7-hhgl7:~$ ls -lart log/
> total 0
> -rw-r--r-- 1 flink flink  0 Apr  7 09:35 
> flink--taskexecutor-0-flink-taskmanager-85b6585b7-hhgl7.log
> flink@flink-taskmanager-85b6585b7-hhgl7:~$
> ### Jobmanager pod Logs ###
> flink@flink-jobmanager-f6db89b7f-lq4ps:~$
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:36 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-gtkx5.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:36 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-wnrfm.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:37 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-2b2fs.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:37 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-7kdhh.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 09:35 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-twhkt.log
> drwxrwxrwx 2 7148739 flink 35 Apr  7 09:35 .
> -rw-r--r-- 1 7148739 flink 0 Apr  7 09:35 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-lq4ps.log
> flink@flink-jobmanager-f6db89b7f-lq4ps:~$
> I configured log4j.properties for flink
> log4j.properties: |+
> monitorInterval=30
> rootLogger.level = INFO
> rootLogger.appenderRef.file.ref = MainAppender
> logger.flink.name = org.apache.flink
> logger.flink.level = INFO
> logger.akka.name = akka
> logger.akka.level = INFO
> appender.main.name = MainAppender
> appender.main.type = RollingFile
> appender.main.append = true
> appender.main.fileName = ${sys:log.file}
> appender.main.filePattern = ${sys:log.file}.%i
> appender.main.layout.type = PatternLayout
> appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
> appender.main.policies.type = Policies
> appender.main.policies.size.type = SizeBasedTriggeringPolicy
> appender.main.policies.size.size = 100MB
> appender.main.policies.startup.type = OnStartupTriggeringPolicy
> appender.main.strategy.type = DefaultRolloverStrategy
> appender.main.strategy.max = ${env:MAX_LOG_FILE_NUMBER:-10}
> logger.netty.name = 
> org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
> logger.netty.level = OFF
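
As a side note, the MainAppender above writes to ${sys:log.file}, which Flink's startup 
scripts normally pass to the JVM as -Dlog.file=...; if the running process does not pick 
up that property or this configuration, nothing gets appended to the files. A minimal 
sketch for checking this from inside the JVM (the class name is hypothetical; assumes 
log4j-core on the classpath):
{code:java}
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.core.LoggerContext;

public class LogSetupCheck {
    public static void main(String[] args) {
        // MainAppender resolves its file from ${sys:log.file}; print what this JVM sees.
        System.out.println("log.file = " + System.getProperty("log.file"));

        // List the appenders Log4j 2 actually built from the loaded configuration.
        LoggerContext ctx = (LoggerContext) LogManager.getContext(false);
        ctx.getConfiguration()
                .getAppenders()
                .forEach((name, appender) -> System.out.println(name + " -> " + appender));
    }
}
{code}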



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15823:
URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274


   
   ## CI report:
   
   * 323d87e5394e8968856741d0bfd48463e38ddc8a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17531)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15712:
URL: https://github.com/apache/flink/pull/15712#issuecomment-824189664


   
   ## CI report:
   
   * f2fe08ada02c5c9e20ed397163d3ab7a34594994 UNKNOWN
   * 5fdb0880e9465be2711e8ac4b9df01388fc5aa2f Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17532)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15712:
URL: https://github.com/apache/flink/pull/15712#issuecomment-824189664


   
   ## CI report:
   
   * 71ecabae057e85596265ea48d6b44662ad069bce Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17448)
 
   * f2fe08ada02c5c9e20ed397163d3ab7a34594994 UNKNOWN
   * 5fdb0880e9465be2711e8ac4b9df01388fc5aa2f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15712:
URL: https://github.com/apache/flink/pull/15712#issuecomment-824189664


   
   ## CI report:
   
   * 71ecabae057e85596265ea48d6b44662ad069bce Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17448)
 
   * f2fe08ada02c5c9e20ed397163d3ab7a34594994 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] hehuiyuan commented on a change in pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map

2021-05-03 Thread GitBox


hehuiyuan commented on a change in pull request #15712:
URL: https://github.com/apache/flink/pull/15712#discussion_r625459401



##
File path: 
flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/TableEnvHiveConnectorITCase.java
##
@@ -531,6 +532,52 @@ public void testLocationWithComma() throws Exception {
 }
 }
 
+@Test
+public void testReadHiveDataWithEmptyMapForHiveShim20X() throws Exception {
+TableEnvironment tableEnv = getTableEnvWithHiveCatalog();
+
+try {
+// Flink to write parquet file

Review comment:
   yes, parquet




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15789:
URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694


   
   ## CI report:
   
   * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN
   * 0bba0d971d9d4b8530cf90c6513ce68d91cad54e Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17526)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22555) LGPL-2.1 files in flink-python jars

2021-05-03 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338692#comment-17338692
 ] 

Chesnay Schepler commented on FLINK-22555:
--

Out of curiosity, how did you discover this?

> LGPL-2.1 files in flink-python jars
> ---
>
> Key: FLINK-22555
> URL: https://issues.apache.org/jira/browse/FLINK-22555
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.13.0, 1.12.3
>Reporter: Henri Yandell
>Assignee: Chesnay Schepler
>Priority: Blocker
> Fix For: 1.13.1, 1.12.4
>
>
> Looking at, for example, 
> [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] 
> the jar file contains three LGPL-2.1 source files:
>  * 
> flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml
>  * flink-python_2.11-1.13.0/schema/module-1_1.xsd
>  * flink-python_2.11-1.13.0/schema/module-1_0.xsd
> There's nothing in the DEPENDENCIES or licenses directory on the topic. It 
> looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the 
> Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules).
> The xsd files also appear to come from JBoss Modules. Given Apache's 
> position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22555) LGPL-2.1 files in flink-python jars

2021-05-03 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-22555:
-
Affects Version/s: 1.13.0
   1.12.3

> LGPL-2.1 files in flink-python jars
> ---
>
> Key: FLINK-22555
> URL: https://issues.apache.org/jira/browse/FLINK-22555
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.13.0, 1.12.3
>Reporter: Henri Yandell
>Assignee: Chesnay Schepler
>Priority: Blocker
>
> Looking at, for example, 
> [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] 
> the jar file contains three LGPL-2.1 source files:
>  * 
> flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml
>  * flink-python_2.11-1.13.0/schema/module-1_1.xsd
>  * flink-python_2.11-1.13.0/schema/module-1_0.xsd
> There's nothing in the DEPENDENCIES or licenses directory on the topic. It 
> looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the 
> Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules).
> The xsd files also appear to come from JBoss Modules. Given Apache's 
> position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22555) LGPL-2.1 files in flink-python jars

2021-05-03 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-22555:
-
Fix Version/s: 1.12.4
   1.13.1

> LGPL-2.1 files in flink-python jars
> ---
>
> Key: FLINK-22555
> URL: https://issues.apache.org/jira/browse/FLINK-22555
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.13.0, 1.12.3
>Reporter: Henri Yandell
>Assignee: Chesnay Schepler
>Priority: Blocker
> Fix For: 1.13.1, 1.12.4
>
>
> Looking at, for example, 
> [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] 
> the jar file contains three LGPL-2.1 source files:
>  * 
> flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml
>  * flink-python_2.11-1.13.0/schema/module-1_1.xsd
>  * flink-python_2.11-1.13.0/schema/module-1_0.xsd
> There's nothing in the DEPENDENCIES or licenses directory on the topic. It 
> looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the 
> Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules).
> The xsd files also appear to come from JBoss Modules. Given Apache's 
> position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22555) LGPL-2.1 files in flink-python jars

2021-05-03 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338687#comment-17338687
 ] 

Chesnay Schepler commented on FLINK-22555:
--

This seems to be bundled in beam-vendor-grpc-1_26_0-0.3; maybe we can just 
exclude these files. [~bayard] could you open an issue with the Beam project to 
remove said files from their releases?

> LGPL-2.1 files in flink-python jars
> ---
>
> Key: FLINK-22555
> URL: https://issues.apache.org/jira/browse/FLINK-22555
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Reporter: Henri Yandell
>Assignee: Chesnay Schepler
>Priority: Blocker
>
> Looking at, for example, 
> [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] 
> the jar file contains three LGPL-2.1 source files:
>  * 
> flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml
>  * flink-python_2.11-1.13.0/schema/module-1_1.xsd
>  * flink-python_2.11-1.13.0/schema/module-1_0.xsd
> There's nothing in the DEPENDENCIES or licenses directory on the topic. It 
> looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the 
> Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules).
> The xsd files also appear to come from JBoss Modules. Given Apache's 
> position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-22555) LGPL-2.1 files in flink-python jars

2021-05-03 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-22555:


Assignee: Chesnay Schepler

> LGPL-2.1 files in flink-python jars
> ---
>
> Key: FLINK-22555
> URL: https://issues.apache.org/jira/browse/FLINK-22555
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Reporter: Henri Yandell
>Assignee: Chesnay Schepler
>Priority: Blocker
>
> Looking at, for example, 
> [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] 
> the jar file contains three LGPL-2.1 source files:
>  * 
> flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml
>  * flink-python_2.11-1.13.0/schema/module-1_1.xsd
>  * flink-python_2.11-1.13.0/schema/module-1_0.xsd
> There's nothing in the DEPENDENCIES or licenses directory on the topic. It 
> looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the 
> Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules).
> The xsd files also appear to come from JBoss Modules. Given Apache's 
> position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22555) LGPL-2.1 files in flink-python jars

2021-05-03 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-22555:
-
Priority: Blocker  (was: Major)

> LGPL-2.1 files in flink-python jars
> ---
>
> Key: FLINK-22555
> URL: https://issues.apache.org/jira/browse/FLINK-22555
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Reporter: Henri Yandell
>Priority: Blocker
>
> Looking at, for example, 
> [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] 
> the jar file contains three LGPL-2.1 source files:
>  * 
> flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml
>  * flink-python_2.11-1.13.0/schema/module-1_1.xsd
>  * flink-python_2.11-1.13.0/schema/module-1_0.xsd
> There's nothing in the DEPENDENCIES or licenses directory on the topic. It 
> looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the 
> Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules).
> The xsd files also appear to come from JBoss Modules. Given Apache's 
> position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15823:
URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274


   
   ## CI report:
   
   * df97ad0958eaf4e9064303479b040fbb83a493e8 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17525)
 
   * 323d87e5394e8968856741d0bfd48463e38ddc8a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17531)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15728:
URL: https://github.com/apache/flink/pull/15728#issuecomment-824960796


   
   ## CI report:
   
   * 057695465c326acc5c40576cf35ee9f0283d6cac Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17524)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-22555) LGPL-2.1 files in flink-python jars

2021-05-03 Thread Henri Yandell (Jira)
Henri Yandell created FLINK-22555:
-

 Summary: LGPL-2.1 files in flink-python jars
 Key: FLINK-22555
 URL: https://issues.apache.org/jira/browse/FLINK-22555
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Reporter: Henri Yandell


Looking at, for example, 
[https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] the 
jar file contains three LGPL-2.1 source files:
 * 
flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml
 * flink-python_2.11-1.13.0/schema/module-1_1.xsd
 * flink-python_2.11-1.13.0/schema/module-1_0.xsd

There's nothing in the DEPENDENCIES or licenses directory on the topic. It 
looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the 
Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules).

The xsd files also appear to come from JBoss Modules. Given Apache's 
position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an 
issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress

2021-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-22554:
---
Labels: pull-request-available  (was: )

> Support Kafka Topic Patterns in Kafka Ingress
> -
>
> Key: FLINK-22554
> URL: https://issues.apache.org/jira/browse/FLINK-22554
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-3.1.0
>Reporter: Seth Wiesman
>Assignee: Seth Wiesman
>Priority: Major
>  Labels: pull-request-available
>
> Flink's Kafka source supports subscription patterns, where it will consume 
> all topics that match a specified regex and periodically monitor for new 
> topics. We should support this in the statefun Kafka ingress as it is 
> generally useful and would remove a source of cluster downtime (subscribing 
> to a new topic).
>  
> I propose something like this.
>  
> {code:java}
> topics:
> - topic-pattern: my-topic-* // some regex
>   discovery-interval: 10s  // some duration 
>   valueType: blah
>   targets: 
>   - blah{code}
> The Flink consumer can be configured with both a list of concrete topics + a 
> pattern so validation is simple.
>  
> cc [~igalshilman]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-statefun] sjwiesman opened a new pull request #231: [FLINK-22554] Add topic subscription to KafkaIngress

2021-05-03 Thread GitBox


sjwiesman opened a new pull request #231:
URL: https://github.com/apache/flink-statefun/pull/231


   Update the Kafka ingress to support Flink's Kafka topic discovery. This 
removes a common source of downtime for the statefun cluster: configuring new 
topics. Users can now configure their ingress to use a topic pattern, and the 
source will consume from all topics that match the regex and auto-discover new 
topics every `discoveryInterval`.
   
   ```yaml
version: "3.0"

module:
  meta:
type: remote
spec:
  ingresses:
  - ingress:
  meta:
type: io.statefun.kafka/ingress
id: com.example/users
  spec:
address: kafka-broker:9092
consumerGroupId: my-consumer-group
startupPosition:
  type: earliest
topicPattern:
   pattern: topic-*
   discoveryInterval: 30s
   valueType: com.example/User
   targets:
 - com.example.fns/greeter
   ```
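   
   For reference, the underlying Flink consumer feature can be exercised directly with a 
topic regex plus a partition-discovery interval; a rough sketch (broker address, group id, 
and regex are placeholders, assuming `flink-connector-kafka` is on the classpath):
   ```java
import java.util.Properties;
import java.util.regex.Pattern;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class TopicPatternExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka-broker:9092");
        props.setProperty("group.id", "my-consumer-group");
        // Re-check for new topics/partitions matching the pattern every 30s.
        props.setProperty("flink.partition-discovery.interval-millis", "30000");

        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>(
                        Pattern.compile("topic-.*"), new SimpleStringSchema(), props);

        env.addSource(consumer).print();
        env.execute("topic-pattern-example");
    }
}
   ```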


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15823:
URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274


   
   ## CI report:
   
   * df97ad0958eaf4e9064303479b040fbb83a493e8 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17525)
 
   * 323d87e5394e8968856741d0bfd48463e38ddc8a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15789:
URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694


   
   ## CI report:
   
   * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN
   * 46e1be2c4832080bf1cb48c509b77cd88872d024 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17468)
 
   * 0bba0d971d9d4b8530cf90c6513ce68d91cad54e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17526)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * 4303e9844bcaf301d79371a008525a8f8b944db1 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17523)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15823:
URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274


   
   ## CI report:
   
   * 9f9242917bedadece14b34ac0a7f98f91bdf4c82 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17509)
 
   * df97ad0958eaf4e9064303479b040fbb83a493e8 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17525)
 
   * 323d87e5394e8968856741d0bfd48463e38ddc8a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15823:
URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274


   
   ## CI report:
   
   * 9f9242917bedadece14b34ac0a7f98f91bdf4c82 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17509)
 
   * df97ad0958eaf4e9064303479b040fbb83a493e8 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15789:
URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694


   
   ## CI report:
   
   * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN
   * 46e1be2c4832080bf1cb48c509b77cd88872d024 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17468)
 
   * 0bba0d971d9d4b8530cf90c6513ce68d91cad54e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …

2021-05-03 Thread GitBox


rkhachatryan commented on a change in pull request #15728:
URL: https://github.com/apache/flink/pull/15728#discussion_r625298504



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculatorTest.java
##
@@ -149,18 +150,80 @@ public void testComputeWithMultipleLevels() throws 
Exception {
 }
 
 @Test
-public void testWithTriggeredTasksNotRunning() throws Exception {
+public void testNotRunningOneOfSourcesTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one RUNNING source and NOT 
RUNNING source.
+FunctionWithException 
twoSourcesBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id)
+.addJobVertex(new JobVertexID())
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(twoSourcesBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningSingleSourceTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one NOT RUNNING source and 
RUNNING not source task.
+FunctionWithException 
sourceAndNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id)
+.addJobVertex(new JobVertexID(), false)
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(sourceAndNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningOneOfNotSourcesTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with NOT RUNNING not source and 
RUNNING not source task.
+FunctionWithException 
twoNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id, false)
+.addJobVertex(new JobVertexID(), false)
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(twoNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningSingleNotSourceTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with NOT RUNNING not source and 
RUNNING source tasks.
+FunctionWithException 
sourceAndNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id, false)
+.addJobVertex(new JobVertexID())
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(sourceAndNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+private void runWithNotRunTask(
+FunctionWithException 
graphBuilder)

Review comment:
   Thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-22549) Flink HA with Zk in kubernetes standalone mode deployment is not working

2021-05-03 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias closed FLINK-22549.

Resolution: Won't Fix

This issue is already discussed on this [mailing list 
thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Re-Flink-1-12-2-scala-2-11-HA-with-Zk-in-kubernetes-standalone-mode-deployment-is-not-working-td43426.html].
 Let's keep it there for now to avoid having multiple parallel threads.

I'm closing this issue.

> Flink HA with Zk in kubernetes standalone mode deployment is not working
> 
>
> Key: FLINK-22549
> URL: https://issues.apache.org/jira/browse/FLINK-22549
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.12.2
> Environment: kubernetes standalone deployment
>Reporter: Bhagi
>Priority: Major
> Attachments: jobmanager.log, screenshot-1.png, taskmanager.log
>
>
> Hi Team,
> I deployed a Flink cluster as a Kubernetes standalone deployment with ZooKeeper HA, 
> but I am facing some issues. I have attached the taskmanager and jobmanager logs.
> Can you please see the logs and help me solve this issue.
> UI is throwing this error: {"errors":["Service temporarily unavailable due to 
> an ongoing leader election. Please refresh."]}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520)
 
   * 4303e9844bcaf301d79371a008525a8f8b944db1 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17523)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress

2021-05-03 Thread Seth Wiesman (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Seth Wiesman reassigned FLINK-22554:


Assignee: Seth Wiesman

> Support Kafka Topic Patterns in Kafka Ingress
> -
>
> Key: FLINK-22554
> URL: https://issues.apache.org/jira/browse/FLINK-22554
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-3.1.0
>Reporter: Seth Wiesman
>Assignee: Seth Wiesman
>Priority: Major
>
> Flink's Kafka source supports subscription patterns, where it will consume 
> all topics that match a specified regex and periodically monitor for new 
> topics. We should support this in the statefun Kafka ingress as it is 
> generally useful and would remove a source of cluster downtime (subscribing 
> to a new topic).
>  
> I propose something like this.
>  
> {code:java}
> topics:
> - topic-pattern: my-topic-* // some regex
>   discovery-interval: 10s  // some duration 
>   valueType: blah
>   targets: 
>   - blah{code}
> The Flink consumer can be configured with both a list of concrete topics + a 
> pattern so validation is simple.
>  
> cc [~igalshilman]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-22133) SplitEmumerator does not provide checkpoint id in snapshot

2021-05-03 Thread Thomas Weise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338487#comment-17338487
 ] 

Thomas Weise commented on FLINK-22133:
--

[~jqin] [~sewen] what do you think about backporting this change to 1.12? It 
would allow users/downstream projects to build on top of the FLIP-27 interfaces 
without facing breakage when going from 1.12 to 1.13.

The source API is experimental in 1.12, and we have already made incompatible 
changes to it in 1.12.x releases, haven't we?

 

> SplitEmumerator does not provide checkpoint id in snapshot
> --
>
> Key: FLINK-22133
> URL: https://issues.apache.org/jira/browse/FLINK-22133
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Affects Versions: 1.12.0
>Reporter: Brian Zhou
>Assignee: Stephan Ewen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> In ExternallyInducedSource API, the checkpoint trigger exposes the checkpoint 
> Id for the external client to identify the checkpoint. However, in the 
> FLIP-27 source, the SplitEmumerator::snapshot() is a no-arg method. The 
> connector cannot track the checkpoint ID from Flink
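
For illustration, a minimal sketch of an enumerator written against the changed interface, 
assuming the snapshotState(long checkpointId) signature that 1.13 introduces; the class and 
split type here are made up:
{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.flink.api.connector.source.SourceSplit;
import org.apache.flink.api.connector.source.SplitEnumerator;

public class CheckpointIdAwareEnumerator
        implements SplitEnumerator<CheckpointIdAwareEnumerator.DummySplit, Long> {

    /** Placeholder split type, just to satisfy the generic bound. */
    public static final class DummySplit implements SourceSplit {
        @Override
        public String splitId() {
            return "0";
        }
    }

    private long lastCheckpointId = -1L;

    @Override
    public void start() {}

    @Override
    public void handleSplitRequest(int subtaskId, String requesterHostname) {}

    @Override
    public void addSplitsBack(List<DummySplit> splits, int subtaskId) {}

    @Override
    public void addReader(int subtaskId) {}

    @Override
    public Long snapshotState(long checkpointId) throws Exception {
        // The enumerator now sees the checkpoint id and can correlate it with
        // an external system (e.g. an externally induced source).
        lastCheckpointId = checkpointId;
        return lastCheckpointId;
    }

    @Override
    public void close() throws IOException {}
}
{code}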



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress

2021-05-03 Thread Seth Wiesman (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Seth Wiesman updated FLINK-22554:
-
Description: 
Flink's Kafka source supports subscription patterns, where it will consume all 
topics that match a specified regex and periodically monitor for new topics. We 
should support this in the statefun Kafka ingress as it is generally useful and 
would remove a source of cluster downtime (subscribing to a new topic).

 

I propose something like this.

 
{code:java}
topics:
- topic-pattern: my-topic-* // some regex
  discovery-interval: 10s  // some duration 
  valueType: blah
  targets: 
  - blah{code}
The Flink consumer can be configured with both a list of concrete topics + a 
pattern so validation is simple.

 

cc [~igalshilman]

  was:
Flink's Kafka source supports subscription patterns, where it will consume all 
topics that match a specified regex and periodically monitor for new topics. We 
should support this in the statefun Kafka ingress as it is generally useful and 
would remove a source of cluster downtime (subscribing to a new topic).

 

I propose something like this.

 
{code:java}
topics:
- topic-pattern: my-topic-*
  valueType: blah
  targets: 
  - blah{code}
The Flink consumer can be configured with both a list of concrete topics + a 
pattern so validation is simple.

 

cc [~igalshilman]


> Support Kafka Topic Patterns in Kafka Ingress
> -
>
> Key: FLINK-22554
> URL: https://issues.apache.org/jira/browse/FLINK-22554
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-3.1.0
>Reporter: Seth Wiesman
>Priority: Major
>
> Flink's Kafka source supports subscription patterns, where it will consume 
> all topics that match a specified regex and periodically monitor for new 
> topics. We should support this in the statefun Kafka ingress as it is 
> generally useful and would remove a source of cluster downtime (subscribing 
> to a new topic).
>  
> I propose something like this.
>  
> {code:java}
> topics:
> - topic-pattern: my-topic-* // some regex
>   discovery-interval: 10s  // some duration 
>   valueType: blah
>   targets: 
>   - blah{code}
> The Flink consumer can be configured with both a list of concrete topics + a 
> pattern so validation is simple.
>  
> cc [~igalshilman]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress

2021-05-03 Thread Seth Wiesman (Jira)
Seth Wiesman created FLINK-22554:


 Summary: Support Kafka Topic Patterns in Kafka Ingress
 Key: FLINK-22554
 URL: https://issues.apache.org/jira/browse/FLINK-22554
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Affects Versions: statefun-3.1.0
Reporter: Seth Wiesman


Flink's Kafka source supports subscription patterns, where it will consume all 
topics that match a specified regex and periodically monitor for new topics. We 
should support this in the statefun Kafka ingress as it is generally useful and 
would remove a source of cluster downtime (subscribing to a new topic).

 

I propose something like this.

 
{code:java}
topics:
- topic-pattern: my-topic-*
  valueType: blah
  targets: 
  - blah{code}
The Flink consumer can be configured with both a list of concrete topics + a 
pattern so validation is simple.

 

cc [~igalshilman]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15825:
URL: https://github.com/apache/flink/pull/15825#issuecomment-831225399


   
   ## CI report:
   
   * e1a425301eee0b46c812866d12284125f5bde104 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17517)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15728:
URL: https://github.com/apache/flink/pull/15728#issuecomment-824960796


   
   ## CI report:
   
   * d4f85948ee5281b189ba948a8ec512f62587f979 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17458)
 
   * 057695465c326acc5c40576cf35ee9f0283d6cac Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17524)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15728:
URL: https://github.com/apache/flink/pull/15728#issuecomment-824960796


   
   ## CI report:
   
   * d4f85948ee5281b189ba948a8ec512f62587f979 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17458)
 
   * 057695465c326acc5c40576cf35ee9f0283d6cac UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-19481) Add support for a flink native GCS FileSystem

2021-05-03 Thread Ben Augarten (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338455#comment-17338455
 ] 

Ben Augarten commented on FLINK-19481:
--

Hey Robert and Galen, I appreciate you both weighing in. I got a chance to read 
through Galen's PR briefly and it does seem like it's mostly concerned with 
adding support for the RecoverableWriter interface, and their implementation of 
the RecoverableWriter interface does not have any explicit Hadoop dependencies. 
So, it seems like their implementation would be useful with either a native or 
Hadoop-based implementation of the Google Cloud Storage file system.

Our native implementation does have support for RecoverableWriter, but I didn't 
work directly on that and I don't believe it's being used in production right 
now. We've primarily been using our implementation for checkpointing, 
savepointing, and job graph storage.

The two paths forward I see are:

* As Galen proposed, keep two separate implementations of the GCS FileSystem, 
one that goes through the Hadoop stack and one that uses the GCS SDKs, both using 
the shared RecoverableWriter implementation.
* Consolidate down to a native GCS FileSystem implementation, using Galen's 
implementation of the RecoverableWriter.

To me, the second option makes the most sense, based on my experience as a user of 
Flink and my general impression of the desire to move away from Hadoop-based 
file systems.

To accomplish that, I think that Galen should continue working on their MR. I 
can open another MR once theirs lands on master, or open an MR on their WIP. 
Though I'd prefer to wait until the outstanding discussions are resolved.

> Add support for a flink native GCS FileSystem
> -
>
> Key: FLINK-19481
> URL: https://issues.apache.org/jira/browse/FLINK-19481
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.12.0
>Reporter: Ben Augarten
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, GCS is supported but only by using the hadoop connector[1]
>  
> The objective of this improvement is to add support for checkpointing to 
> Google Cloud Storage with the Flink File System,
>  
> This would allow the `gs://` scheme to be used for savepointing and 
> checkpointing. Long term, it would be nice if we could use the GCS FileSystem 
> as a source and sink in flink jobs as well. 
>  
> Long term, I hope that implementing a flink native GCS FileSystem will 
> simplify usage of GCS because the hadoop FileSystem ends up bringing in many 
> unshaded dependencies.
>  
> [1] 
> [https://github.com/GoogleCloudDataproc/hadoop-connectors]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516)
 
   * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520)
 
   * 4303e9844bcaf301d79371a008525a8f8b944db1 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17523)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] akalash commented on a change in pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …

2021-05-03 Thread GitBox


akalash commented on a change in pull request #15728:
URL: https://github.com/apache/flink/pull/15728#discussion_r625177400



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculatorTest.java
##
@@ -149,18 +150,80 @@ public void testComputeWithMultipleLevels() throws 
Exception {
 }
 
 @Test
-public void testWithTriggeredTasksNotRunning() throws Exception {
+public void testNotRunningOneOfSourcesTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one RUNNING source and NOT 
RUNNING source.
+FunctionWithException 
twoSourcesBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id)
+.addJobVertex(new JobVertexID())
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(twoSourcesBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningSingleSourceTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one NOT RUNNING source and 
RUNNING not source task.
+FunctionWithException 
sourceAndNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id)
+.addJobVertex(new JobVertexID(), false)
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(sourceAndNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningOneOfNotSourcesTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with NOT RUNNING not source and 
RUNNING not source task.
+FunctionWithException 
twoNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id, false)
+.addJobVertex(new JobVertexID(), false)
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(twoNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningSingleNotSourceTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with NOT RUNNING not source and 
RUNNING source tasks.
+FunctionWithException 
sourceAndNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id, false)
+.addJobVertex(new JobVertexID())
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(sourceAndNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+private void runWithNotRunTask(
+FunctionWithException 
graphBuilder)

Review comment:
   I've rewritten tests mostly as you suggested and I also collapsed them 
into one test. Please, take a look.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516)
 
   * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520)
 
   * 4303e9844bcaf301d79371a008525a8f8b944db1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] dawidwys merged pull request #442: [hotfix] Adjust phrasing for FlameGraph performance

2021-05-03 Thread GitBox


dawidwys merged pull request #442:
URL: https://github.com/apache/flink-web/pull/442


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] zentol opened a new pull request #442: [hotfix] Adjust phrasing for FlameGraph performance

2021-05-03 Thread GitBox


zentol opened a new pull request #442:
URL: https://github.com/apache/flink-web/pull/442


   The current 1.13 blog post contains a note about flame graphs saying that they are a) 
expensive and b) may overload the metric system.
   Point b) is simply incorrect, while a) is not generally the case, and the 
existing documentation provides better information.
   
   So I just replaced the current note with a link to the docs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] zentol merged pull request #441: [hotfix] fix bullet points list formatting

2021-05-03 Thread GitBox


zentol merged pull request #441:
URL: https://github.com/apache/flink-web/pull/441


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-19481) Add support for a flink native GCS FileSystem

2021-05-03 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338418#comment-17338418
 ] 

Galen Warren commented on FLINK-19481:
--

Hi all, I'm the author of the other 
[PR|https://github.com/apache/flink/pull/15599] that relates to Google Cloud 
Storage. [~xintongsong] has been working with me on this.

The main goal of my PR is to add support for the RecoverableWriter interface, 
so that one can write to GCS via a StreamingFileSink. The file system support 
goes through the Hadoop stack, as noted above, using Google's [cloud storage 
connector|https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage].

I have not personally had problems using the GCS connector and the Hadoop stack 
– it seems to write check/savepoints properly. I also use it to write job 
manager HA data to GCS, which seems to work fine.

However, if we do want to support a native implementation in addition to the 
Hadoop-based one, we could approach it similarly to what has been done for S3, 
i.e. have a shared base project (flink-gs-fs-base?) and then projects for each 
of the implementations ( flink-gs-fs-hadoop and flink-gs-fs-native?). The 
recoverable-writer code could go into the shared project so that both of the 
implementations could use it (assuming that the native implementation doesn't 
already have a recoverable-writer implementation).

I'll defer to the Flink experts on whether that's a worthwhile effort or not. 
At this point, from my perspective, it wouldn't be that much work to rework the 
project structure to support this.
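
To make the RecoverableWriter goal concrete, here is a minimal sketch of the 
intended end-user code once the gs:// scheme is backed by a RecoverableWriter 
(bucket and path are placeholders, and the surrounding job is deliberately 
trivial; this is not code from either PR):

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class GcsSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // StreamingFileSink only finalizes files on checkpoints.
        env.enableCheckpointing(60_000);

        DataStream<String> events = env.fromElements("a", "b", "c");

        // "gs://my-bucket/output" is a placeholder path; it only works once a
        // RecoverableWriter implementation backs the gs:// scheme.
        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("gs://my-bucket/output"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .build();

        events.addSink(sink);
        env.execute("gcs-sink-sketch");
    }
}
{code}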

 

> Add support for a flink native GCS FileSystem
> -
>
> Key: FLINK-19481
> URL: https://issues.apache.org/jira/browse/FLINK-19481
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.12.0
>Reporter: Ben Augarten
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, GCS is supported but only by using the hadoop connector[1]
>  
> The objective of this improvement is to add support for checkpointing to 
> Google Cloud Storage with the Flink File System,
>  
> This would allow the `gs://` scheme to be used for savepointing and 
> checkpointing. Long term, it would be nice if we could use the GCS FileSystem 
> as a source and sink in flink jobs as well. 
>  
> Long term, I hope that implementing a flink native GCS FileSystem will 
> simplify usage of GCS because the hadoop FileSystem ends up bringing in many 
> unshaded dependencies.
>  
> [1] 
> [https://github.com/GoogleCloudDataproc/hadoop-connectors|https://github.com/GoogleCloudDataproc/hadoop-connectors]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22553) Improve error reporting on TM connection failures

2021-05-03 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-22553:
-

 Summary: Improve error reporting on TM connection failures
 Key: FLINK-22553
 URL: https://issues.apache.org/jira/browse/FLINK-22553
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Reporter: Roman Khachatryan
 Fix For: 1.14.0


Connection failures reported by NettyPartitionRequestClient contain a misleading 
NPE in their stack trace, as reported for example in 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/remote-task-manager-netty-exception-td43401.html
{code}

org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
Connecting to remote task manager '/100.98.115.117:41245' has failed. This 
might indicate that the remote task manager has been lost. 
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:145)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connectWithRetries(PartitionRequestClientFactory.java:114)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:81)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:70)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:179)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.internalRequestPartitions(SingleInputGate.java:321)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:290)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.taskmanager.InputGateWithMetrics.requestPartitions(InputGateWithMetrics.java:94)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
 [flink-dist_2.12-1.12.2.jar:1.12.2] 
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) 
[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:297)
 [flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:189)
 [flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
 [flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581) 
[flink-dist_2.12-1.12.2.jar:1.12.2] 
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755) 
[flink-dist_2.12-1.12.2.jar:1.12.2] 
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570) 
[flink-dist_2.12-1.12.2.jar:1.12.2] 
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.NullPointerException 
at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59) 
~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.netty.NettyPartitionRequestClient.<init>(NettyPartitionRequestClient.java:74)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] 
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:136)
 ~[flink-dist_2.12-1.12.2.jar:1.12.2] ... 16 more
{code}
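
A rough sketch of the direction this could take (a standalone illustration, not 
the actual Flink code): fail the connection attempt with an exception that names 
the remote address and the likely cause, instead of letting a bare NPE from a 
precondition check bubble up.

{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;

final class ConnectionErrorReportingSketch {

    /** Illustrative check replacing a bare checkNotNull on the handler. */
    static void ensureHandlerPresent(Object clientHandler, InetSocketAddress address)
            throws IOException {
        if (clientHandler == null) {
            // Clearer than an NPE: says what is missing and for which peer.
            throw new IOException(
                    "Connection to '" + address + "' was established, but no network "
                            + "client handler is available; the remote task manager may "
                            + "have been lost or the channel was closed concurrently.");
        }
    }

    private ConnectionErrorReportingSketch() {}
}
{code}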



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-22173) UnalignedCheckpointRescaleITCase fails on azure

2021-05-03 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338400#comment-17338400
 ] 

Roman Khachatryan edited comment on FLINK-22173 at 5/3/21, 2:48 PM:


A recent failure (different stacktrace though):
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17468=logs=34f41360-6c0d-54d3-11a1-0292a2def1d9=2d56e022-1ace-542f-bf1a-b37dd63243f2=9772]
{code}
Apr 30 14:29:59 Caused by: org.apache.flink.shaded.netty4.io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain0(ReferenceCountUpdater.java:123)
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain(ReferenceCountUpdater.java:110)
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractReferenceCountedByteBuf.retain(AbstractReferenceCountedByteBuf.java:80)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:166)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:47)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.BufferConsumer.copy(BufferConsumer.java:143)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.BufferConsumer.toDebugString(BufferConsumer.java:202)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.logger.NetworkActionsLogger.traceRecover(NetworkActionsLogger.java:94)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.addRecovered(PipelinedSubpartition.java:142)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:195)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:144)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.ChannelStateChunkReader.readChunk(SequentialChannelStateReaderImpl.java:207)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readSequentially(SequentialChannelStateReaderImpl.java:107)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.read(SequentialChannelStateReaderImpl.java:93)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readOutputData(SequentialChannelStateReaderImpl.java:79)
Apr 30 14:29:59    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:571)
Apr 30 14:29:59    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
Apr 30 14:29:59    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:554)
Apr 30 14:29:59    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:757)
Apr 30 14:29:59    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:564)
Apr 30 14:29:59    at java.lang.Thread.run(Thread.java:748)
{code}

Commits from master up to 89c6c03660a88a648bbd13b4e6696124fe46d013


was (Author: roman_khachatryan):
A recent failure (with commits in master up to 
89c6c03660a88a648bbd13b4e6696124fe46d013):
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17468=logs=34f41360-6c0d-54d3-11a1-0292a2def1d9=2d56e022-1ace-542f-bf1a-b37dd63243f2=9772]
{code}
Apr 30 14:29:59 Caused by: org.apache.flink.shaded.netty4.io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain0(ReferenceCountUpdater.java:123)
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain(ReferenceCountUpdater.java:110)
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractReferenceCountedByteBuf.retain(AbstractReferenceCountedByteBuf.java:80)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:166)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:47)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.BufferConsumer.copy(BufferConsumer.java:143)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.BufferConsumer.toDebugString(BufferConsumer.java:202)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.logger.NetworkActionsLogger.traceRecover(NetworkActionsLogger.java:94)
Apr 30 14:29:59    at 

[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489)
 
   * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516)
 
   * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520)
 
   * 4303e9844bcaf301d79371a008525a8f8b944db1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22173) UnalignedCheckpointRescaleITCase fails on azure

2021-05-03 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338400#comment-17338400
 ] 

Roman Khachatryan commented on FLINK-22173:
---

A recent failure (with commits in master up to 
89c6c03660a88a648bbd13b4e6696124fe46d013):
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17468=logs=34f41360-6c0d-54d3-11a1-0292a2def1d9=2d56e022-1ace-542f-bf1a-b37dd63243f2=9772]
{code}
Apr 30 14:29:59 Caused by: org.apache.flink.shaded.netty4.io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain0(ReferenceCountUpdater.java:123)
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain(ReferenceCountUpdater.java:110)
Apr 30 14:29:59    at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractReferenceCountedByteBuf.retain(AbstractReferenceCountedByteBuf.java:80)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:166)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:47)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.BufferConsumer.copy(BufferConsumer.java:143)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.buffer.BufferConsumer.toDebugString(BufferConsumer.java:202)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.logger.NetworkActionsLogger.traceRecover(NetworkActionsLogger.java:94)
Apr 30 14:29:59    at org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.addRecovered(PipelinedSubpartition.java:142)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:195)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:144)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.ChannelStateChunkReader.readChunk(SequentialChannelStateReaderImpl.java:207)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readSequentially(SequentialChannelStateReaderImpl.java:107)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.read(SequentialChannelStateReaderImpl.java:93)
Apr 30 14:29:59    at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readOutputData(SequentialChannelStateReaderImpl.java:79)
Apr 30 14:29:59    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:571)
Apr 30 14:29:59    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
Apr 30 14:29:59    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:554)
Apr 30 14:29:59    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:757)
Apr 30 14:29:59    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:564)
Apr 30 14:29:59    at java.lang.Thread.run(Thread.java:748)
{code}


> UnalignedCheckpointRescaleITCase fails on azure
> ---
>
> Key: FLINK-22173
> URL: https://issues.apache.org/jira/browse/FLINK-22173
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Arvid Heise
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16232=logs=d8d26c26-7ec2-5ed2-772e-7a1a1eb8317c=be5fb08e-1ad7-563c-4f1a-a97ad4ce4865=9628
> {code}
> 2021-04-08T23:25:56.3131361Z [ERROR] Tests run: 31, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 839.623 s <<< FAILURE! - in 
> org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase
> 2021-04-08T23:25:56.3132784Z [ERROR] shouldRescaleUnalignedCheckpoint[no 
> scale union from 7 to 
> 7](org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase)  
> Time elapsed: 607.467 s  <<< ERROR!
> 2021-04-08T23:25:56.3133586Z 
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 2021-04-08T23:25:56.3134070Z  at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
> 2021-04-08T23:25:56.3134643Z  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointTestBase.execute(UnalignedCheckpointTestBase.java:168)
> 2021-04-08T23:25:56.3135577Z  at 
> 

[GitHub] [flink-web] afedulov opened a new pull request #441: [hotfix] fix bullet points list formatting

2021-05-03 Thread GitBox


afedulov opened a new pull request #441:
URL: https://github.com/apache/flink-web/pull/441


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-22552) Rebase StateFun on Flink 1.13

2021-05-03 Thread Igal Shilman (Jira)
Igal Shilman created FLINK-22552:


 Summary: Rebase StateFun on Flink 1.13
 Key: FLINK-22552
 URL: https://issues.apache.org/jira/browse/FLINK-22552
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Igal Shilman


Following the recent release of Flink 1.13, StateFun master needs to be rebased 
on that version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-22552) Rebase StateFun on Flink 1.13

2021-05-03 Thread Igal Shilman (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igal Shilman updated FLINK-22552:
-
Description: Following the recent release of Flink 1.13, StateFun main 
branch needs to be rebased on that version.  (was: Following the recent release 
of Flink 1.13, StateFun master needs to be rebased on that version.)

> Rebase StateFun on Flink 1.13
> -
>
> Key: FLINK-22552
> URL: https://issues.apache.org/jira/browse/FLINK-22552
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Reporter: Igal Shilman
>Priority: Major
>
> Following the recent release of Flink 1.13, StateFun main branch needs to be 
> rebased on that version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-web] dawidwys merged pull request #440: [hotfix] Correct formatting in 1.13 release blogpost.

2021-05-03 Thread GitBox


dawidwys merged pull request #440:
URL: https://github.com/apache/flink-web/pull/440


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] morsapaes opened a new pull request #440: [hotfix] Correct formatting in 1.13 release blogpost.

2021-05-03 Thread GitBox


morsapaes opened a new pull request #440:
URL: https://github.com/apache/flink-web/pull/440


   One of the Python code snippets renders incorrectly due to a missing 
paragraph.
   
   Screenshot: https://user-images.githubusercontent.com/23521087/116886475-6864b880-ac29-11eb-9c3a-4f5b98466a8e.png
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] dawidwys commented on pull request #436: Add Apache Flink release 1.13.0

2021-05-03 Thread GitBox


dawidwys commented on pull request #436:
URL: https://github.com/apache/flink-web/pull/436#issuecomment-831266920


   Merged...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] dawidwys closed pull request #436: Add Apache Flink release 1.13.0

2021-05-03 Thread GitBox


dawidwys closed pull request #436:
URL: https://github.com/apache/flink-web/pull/436


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15824: (1.11) [FLINK-20383][runtime] Fix race condition in notification.

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15824:
URL: https://github.com/apache/flink/pull/15824#issuecomment-831161678


   
   ## CI report:
   
   * f741e9ec3edf512b0adf4eb17548f9bf073f00ff Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17512)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489)
 
   * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516)
 
   * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-22535) Resource leak would happen if exception thrown during AbstractInvokable#restore of task life

2021-05-03 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-22535.
--
Fix Version/s: 1.14.0
   Resolution: Fixed

Merged to master as 5926c4ac22b^..5926c4ac22b
Merged to release-1.13 as 0077b6fb07c^..0077b6fb07c

> Resource leak would happen if exception thrown during 
> AbstractInvokable#restore of task life
> 
>
> Key: FLINK-22535
> URL: https://issues.apache.org/jira/browse/FLINK-22535
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.13.0
>Reporter: Yun Tang
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.13.1
>
>
> FLINK-17012 introduced a new initialization phase, 
> {{AbstractInvokable.restore}}. However, if 
> [invokable.restore()|https://github.com/apache/flink/blob/79a521e08df550d96f97bb6915191d8496bb29ea/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L754-L759]
>  throws an exception, {{StreamTask#cleanUpInvoke}} is never called, leading 
> to a resource leak.
> We internally use managed memory in a different way, by registering a 
> specific operator identifier in the memory manager; if the stream task 
> cleanup is skipped, the stream operator is never disposed and we face a 
> critical resource leak.
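
For reference, the failing pattern can be sketched in a few lines (the interface 
and method names below are illustrative, not the actual Task/StreamTask code): 
cleanup has to sit in a finally block so that it is reached even when restore 
fails.

{code:java}
// Simplified sketch of the lifecycle issue; names are illustrative.
final class TaskLifecycleSketch {

    interface Invokable {
        void restore() throws Exception;
        void invoke() throws Exception;
        void cleanUp() throws Exception;
    }

    static void run(Invokable invokable) throws Exception {
        try {
            invokable.restore(); // may fail, e.g. while acquiring managed memory
            invokable.invoke();
        } finally {
            // Reached on the happy path and on restore/invoke failures alike,
            // so registered resources (managed memory, operators) are released.
            invokable.cleanUp();
        }
    }

    private TaskLifecycleSketch() {}
}
{code}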



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] pnowojski merged pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…

2021-05-03 Thread GitBox


pnowojski merged pull request #15820:
URL: https://github.com/apache/flink/pull/15820


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489)
 
   * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516)
 
   * afc0a9002c425a95c032a94750a5e9ae45fae8d3 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-22530) RuntimeException after subsequent windowed grouping in TableAPI

2021-05-03 Thread Christopher Rost (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338361#comment-17338361
 ] 

Christopher Rost edited comment on FLINK-22530 at 5/3/21, 1:08 PM:
---

[~AHeise] and [~libenchao], it seems that you had a similar issue with two 
subsequent tumbling windows in 
https://issues.apache.org/jira/browse/FLINK-15494. Do you think it is the same 
problem here?


was (Author: chrizzz110):
@libenchao Seems that you had a similar issue with two subsequent tumbling 
windows.

> RuntimeException after subsequent windowed grouping in TableAPI
> ---
>
> Key: FLINK-22530
> URL: https://issues.apache.org/jira/browse/FLINK-22530
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.12.0
>Reporter: Christopher Rost
>Priority: Major
>
> After applying the following using the TableAPI v 1.12.0, an error is thrown: 
> {code:java}
> java.lang.RuntimeException: Error while applying rule 
> StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args 
> [rel#505:FlinkLogicalWindowAggregate.LOGICAL.any.None: 
> 0.[NONE].[NONE](input=RelSubset#504,group={1},window=TumblingGroupWindow('w2, 
> w1_rowtime, 1),properties=EXPR$1)]{code}
> The code snippet to reproduce:
> {code:java}
> Table table2 = table1
>   .window(Tumble.over(lit(10).seconds()).on($(EVENT_TIME)).as("w1"))
>   .groupBy($(ID), $(LABEL), $("w1"))
>   .select($(ID), $(LABEL), $("w1").rowtime().as("w1_rowtime"));
> // table2.execute().print(); --> work well
> Table table3 = table2
>   .window(Tumble.over(lit(10).seconds()).on($("w1_rowtime")).as("w2"))
>   .groupBy($(LABEL), $("w2"))
>   .select(
> $(LABEL).as("super_label"),
> lit(1).count().as("super_count"),
> $("w2").rowtime().as("w2_rowtime")
>   );
> // table3.execute().print(); //--> work well
>table3.select($("super_label"), $("w2_rowtime"))
>   .execute().print(); // --> throws exception
> {code}
> It seems that the alias "w1_rowtime" is no longer available for further 
> usages of table3, since the cause of the exception is: 
> {noformat}
> Caused by: java.lang.IllegalArgumentException: field [w1_rowtime] not found; 
> input fields are: [vertex_id, vertex_label, EXPR$0
> {noformat}
> {{The complete trace:}}
> {code:java}
> java.lang.RuntimeException: Error while applying rule 
> StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args 
> [rel#197:FlinkLogicalWindowAggregate.LOGICAL.any.None: 
> 0.[NONE].[NONE](input=RelSubset#196,group={1},window=TumblingGroupWindow('w2, 
> w1_rowtime, 1),properties=EXPR$1)]at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:256)
>   at 
> org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:312)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:64)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
>   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at 
> scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
>   at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)
>   at 
> org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:163)
>   at 
> org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:79)
>   at 
> org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
>   at 
> org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:286)
>  

[jira] [Commented] (FLINK-22530) RuntimeException after subsequent windowed grouping in TableAPI

2021-05-03 Thread Christopher Rost (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338361#comment-17338361
 ] 

Christopher Rost commented on FLINK-22530:
--

@libenchao Seems that you had a similar issue with two subsequent tumbling 
windows.

> RuntimeException after subsequent windowed grouping in TableAPI
> ---
>
> Key: FLINK-22530
> URL: https://issues.apache.org/jira/browse/FLINK-22530
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.12.0
>Reporter: Christopher Rost
>Priority: Major
>
> After applying the following using the TableAPI v 1.12.0, an error is thrown: 
> {code:java}
> java.lang.RuntimeException: Error while applying rule 
> StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args 
> [rel#505:FlinkLogicalWindowAggregate.LOGICAL.any.None: 
> 0.[NONE].[NONE](input=RelSubset#504,group={1},window=TumblingGroupWindow('w2, 
> w1_rowtime, 1),properties=EXPR$1)]{code}
> The code snippet to reproduce:
> {code:java}
> Table table2 = table1
>   .window(Tumble.over(lit(10).seconds()).on($(EVENT_TIME)).as("w1"))
>   .groupBy($(ID), $(LABEL), $("w1"))
>   .select($(ID), $(LABEL), $("w1").rowtime().as("w1_rowtime"));
> // table2.execute().print(); --> work well
> Table table3 = table2
>   .window(Tumble.over(lit(10).seconds()).on($("w1_rowtime")).as("w2"))
>   .groupBy($(LABEL), $("w2"))
>   .select(
> $(LABEL).as("super_label"),
> lit(1).count().as("super_count"),
> $("w2").rowtime().as("w2_rowtime")
>   );
> // table3.execute().print(); //--> work well
>table3.select($("super_label"), $("w2_rowtime"))
>   .execute().print(); // --> throws exception
> {code}
> It seems that the alias "w1_rowtime" is no longer available for further 
> usages of table3, since the cause of the exception is: 
> {noformat}
> Caused by: java.lang.IllegalArgumentException: field [w1_rowtime] not found; 
> input fields are: [vertex_id, vertex_label, EXPR$0
> {noformat}
> {{The complete trace:}}
> {code:java}
> java.lang.RuntimeException: Error while applying rule 
> StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args 
> [rel#197:FlinkLogicalWindowAggregate.LOGICAL.any.None: 
> 0.[NONE].[NONE](input=RelSubset#196,group={1},window=TumblingGroupWindow('w2, 
> w1_rowtime, 1),properties=EXPR$1)]at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:256)
>   at 
> org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:312)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:64)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
>   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at 
> scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
>   at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)
>   at 
> org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:163)
>   at 
> org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:79)
>   at 
> org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
>   at 
> org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:286)
>   at 
> org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:165)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1267)
>   at 
> 

[jira] [Resolved] (FLINK-22368) UnalignedCheckpointITCase hangs on azure

2021-05-03 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan resolved FLINK-22368.
---
Fix Version/s: 1.12.4
   1.14.0
   Resolution: Fixed

Merged into 1.12 as afc0a9002c425a95c032a94750a5e9ae45fae8d3.
Merged into 1.13 as b41c5fab3e3eebb04b891ec2ac0688bedc2d94eb.
Merged into master as 6e3ccd5a9613a2de06c8ec410ba41e6e0c6959be.

> UnalignedCheckpointITCase hangs on azure
> 
>
> Key: FLINK-22368
> URL: https://issues.apache.org/jira/browse/FLINK-22368
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.14.0, 1.13.1, 1.12.4
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16818=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=d13f554f-d4b9-50f8-30ee-d49c6fb0b3cc=10144



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] rkhachatryan merged pull request #15726: [BP-1.12][FLINK-22368] Deque channel after releasing on EndOfPartition

2021-05-03 Thread GitBox


rkhachatryan merged pull request #15726:
URL: https://github.com/apache/flink/pull/15726


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan merged pull request #15727: [BP-1.13][FLINK-22368] Deque channel after releasing on EndOfPartition

2021-05-03 Thread GitBox


rkhachatryan merged pull request #15727:
URL: https://github.com/apache/flink/pull/15727


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20217) More fine-grained timer processing

2021-05-03 Thread Nico Kruber (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338353#comment-17338353
 ] 

Nico Kruber commented on FLINK-20217:
-

I'd still consider this a major hurdle for which workarounds require quite some 
implementation effort by users.

> More fine-grained timer processing
> --
>
> Key: FLINK-20217
> URL: https://issues.apache.org/jira/browse/FLINK-20217
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.10.2, 1.11.2, 1.12.0
>Reporter: Nico Kruber
>Priority: Major
>
> Timers are currently processed in one big block under the checkpoint lock 
> (under {{InternalTimerServiceImpl#advanceWatermark}}). This can be problematic 
> in a number of scenarios while doing checkpointing which would lead to 
> checkpoints timing out (and even unaligned checkpoints would not help).
> If you have a huge number of timers to process when advancing the watermark 
> and the task is also back-pressured, the situation may actually be worse 
> since you would block on the checkpoint lock and also wait for 
> buffers/credits from the receiver.
> I propose to make this loop more fine-grained so that it is interruptible by 
> checkpoints, but maybe there is also some other way to improve here.
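
A possible shape for the proposal, sketched with illustrative names (this is not 
the InternalTimerServiceImpl code): fire due timers in bounded batches and give 
the caller a chance to yield, e.g. to a pending checkpoint, between batches.

{code:java}
import java.util.Queue;

final class BatchedTimerSketch {

    interface TimerAction<T> {
        void fire(T timer) throws Exception;
    }

    interface YieldCheck {
        boolean shouldYield();
    }

    /** Fires due timers in batches of at most maxBatchSize, yielding in between. */
    static <T> void advance(
            Queue<T> dueTimers, TimerAction<T> action, YieldCheck yieldCheck, int maxBatchSize)
            throws Exception {
        while (!dueTimers.isEmpty()) {
            for (int i = 0; i < maxBatchSize && !dueTimers.isEmpty(); i++) {
                action.fire(dueTimers.poll());
            }
            if (yieldCheck.shouldYield()) {
                return; // resume later, e.g. after the checkpoint has been handled
            }
        }
    }

    private BatchedTimerSketch() {}
}
{code}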



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20217) More fine-grained timer processing

2021-05-03 Thread Nico Kruber (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-20217:

Labels:   (was: auto-deprioritized-major)

> More fine-grained timer processing
> --
>
> Key: FLINK-20217
> URL: https://issues.apache.org/jira/browse/FLINK-20217
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.10.2, 1.11.2, 1.12.0
>Reporter: Nico Kruber
>Priority: Minor
>
> Timers are currently processed in one big block under the checkpoint lock 
> (under {{InternalTimerServiceImpl#advanceWatermark}}). This can be problematic 
> in a number of scenarios while doing checkpointing which would lead to 
> checkpoints timing out (and even unaligned checkpoints would not help).
> If you have a huge number of timers to process when advancing the watermark 
> and the task is also back-pressured, the situation may actually be worse 
> since you would block on the checkpoint lock and also wait for 
> buffers/credits from the receiver.
> I propose to make this loop more fine-grained so that it is interruptible by 
> checkpoints, but maybe there is also some other way to improve here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20217) More fine-grained timer processing

2021-05-03 Thread Nico Kruber (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-20217:

Priority: Major  (was: Minor)

> More fine-grained timer processing
> --
>
> Key: FLINK-20217
> URL: https://issues.apache.org/jira/browse/FLINK-20217
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.10.2, 1.11.2, 1.12.0
>Reporter: Nico Kruber
>Priority: Major
>
> Timers are currently processed in one big block under the checkpoint lock 
> (under {{InternalTimerServiceImpl#advanceWatermark}}). This can be problematic 
> in a number of scenarios while doing checkpointing which would lead to 
> checkpoints timing out (and even unaligned checkpoints would not help).
> If you have a huge number of timers to process when advancing the watermark 
> and the task is also back-pressured, the situation may actually be worse 
> since you would block on the checkpoint lock and also wait for 
> buffers/credits from the receiver.
> I propose to make this loop more fine-grained so that it is interruptible by 
> checkpoints, but maybe there is also some other way to improve here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15825:
URL: https://github.com/apache/flink/pull/15825#issuecomment-831225399


   
   ## CI report:
   
   * e1a425301eee0b46c812866d12284125f5bde104 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17517)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15823:
URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274


   
   ## CI report:
   
   * 9f9242917bedadece14b34ac0a7f98f91bdf4c82 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17509)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15820:
URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015


   
   ## CI report:
   
   * 4ba39e78592ba67fc5c51f11b8737337337805e5 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17510)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-16556) TopSpeedWindowing should implement checkpointing for its source

2021-05-03 Thread Nico Kruber (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-16556:

Labels: auto-deprioritized-major starter  (was: auto-deprioritized-major)

> TopSpeedWindowing should implement checkpointing for its source
> ---
>
> Key: FLINK-16556
> URL: https://issues.apache.org/jira/browse/FLINK-16556
> Project: Flink
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 1.10.0
>Reporter: Nico Kruber
>Priority: Minor
>  Labels: auto-deprioritized-major, starter
>
> {{org.apache.flink.streaming.examples.windowing.TopSpeedWindowing.CarSource}}
>  does not implement checkpointing of its state, namely the current speeds and 
> distances per car. The main problem with this is that the window trigger only 
> fires if the new distance has increased by at least 50 but after restore, it 
> will be reset to 0 and could thus not produce output for a while.
>  
> Either the distance calculation could use {{Math.abs}} or the source needs 
> proper checkpointing. Optionally with allowing the number of cars to 
> increase/decrease.
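
A hedged sketch of the checkpointing option (class and field names are 
illustrative; the real CarSource emits Tuple4 records and updates speeds 
randomly, and rescaling of the restored state is ignored here):

{code:java}
import java.util.Arrays;
import java.util.List;
import org.apache.flink.streaming.api.checkpoint.ListCheckpointed;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

class CheckpointedCarSourceSketch extends RichParallelSourceFunction<String>
        implements ListCheckpointed<double[]> {

    private volatile boolean running = true;
    private double[] speeds = {50d, 50d};
    private double[] distances = {0d, 0d};

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (running) {
            synchronized (ctx.getCheckpointLock()) {
                // update speeds/distances and emit one record per car here
            }
            Thread.sleep(100L);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }

    @Override
    public List<double[]> snapshotState(long checkpointId, long timestamp) throws Exception {
        // Snapshot both arrays; index 0 = speeds, index 1 = distances.
        return Arrays.asList(speeds.clone(), distances.clone());
    }

    @Override
    public void restoreState(List<double[]> state) throws Exception {
        // After restore the window trigger sees the previous distances again,
        // so output resumes without waiting for the distance to rebuild from 0.
        if (state.size() == 2) {
            speeds = state.get(0);
            distances = state.get(1);
        }
    }
}
{code}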



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] alpinegizmo commented on a change in pull request #15811: [FLINK-22253][docs] Update back pressure monitoring docs with new WebUI changes

2021-05-03 Thread GitBox


alpinegizmo commented on a change in pull request #15811:
URL: https://github.com/apache/flink/pull/15811#discussion_r625045499



##
File path: docs/content/docs/ops/monitoring/back_pressure.md
##
@@ -37,50 +37,47 @@ If you see a **back pressure warning** (e.g. `High`) for a 
task, this means that
 Take a simple `Source -> Sink` job as an example. If you see a warning for 
`Source`, this means that `Sink` is consuming data slower than `Source` is 
producing. `Sink` is back pressuring the upstream operator `Source`.
 
 
-## Sampling Back Pressure
+## Task performance metrics
 
-Back pressure monitoring works by repeatedly taking back pressure samples of 
your running tasks. The JobManager triggers repeated calls to 
`Task.isBackPressured()` for the tasks of your job.
+Every parallel instance of a task (subtask) is exposing a group of three 
metrics:
+- `backPressureTimeMsPerSecond`, time that subtask spent being back pressured
+- `idleTimeMsPerSecond`, time that subtask spent waiting for something to 
process
+- `busyTimeMsPerSecond`, time that subtask was busy doing some actual work
 
-{{< img src="/fig/back_pressure_sampling.png" class="img-responsive" >}}
-
-
-Internally, back pressure is judged based on the availability of output 
buffers. If there is no available buffer (at least one) for output, then it 
indicates that there is back pressure for the task.
-
-By default, the job manager triggers 100 samples every 50ms for each task in 
order to determine back pressure. The ratio you see in the web interface tells 
you how many of these samples were indicating back pressure, e.g. `0.01` 
indicates that only 1 in 100 was back pressured.
-
-- **OK**: 0 <= Ratio <= 0.10
-- **LOW**: 0.10 < Ratio <= 0.5
-- **HIGH**: 0.5 < Ratio <= 1
-
-In order to not overload the task managers with back pressure samples, the web 
interface refreshes samples only after 60 seconds.
-
-## Configuration
-
-You can configure the number of samples for the job manager with the following 
configuration keys:
-
-- `web.backpressure.refresh-interval`: Time after which available stats are 
deprecated and need to be refreshed (DEFAULT: 6, 1 min).
-- `web.backpressure.num-samples`: Number of samples to take to determine back 
pressure (DEFAULT: 100).
-- `web.backpressure.delay-between-samples`: Delay between samples to determine 
back pressure (DEFAULT: 50, 50 ms).
+Those metrics are being updated every couple of seconds and the reported value 
presents an average time
+that subtask has been back pressured (or idle or busy) in that last couple of 
seconds.
+Keep this in mind if your job has a varying load. Both, a subtask that has a 
constant load of 50% and a
+subtask that is alternating every second between fully loaded and idling, will 
have the same value
+of `busyTimeMsPerSecond` around `500ms`.

Review comment:
   ```suggestion
   Keep this in mind if your job has a varying load. For example, a subtask 
with a constant load of 50% and another subtask that is alternating every 
second between fully loaded and idling will both have the same value
   of `busyTimeMsPerSecond`: around `500ms`. 
   ```

##
File path: docs/content/docs/ops/monitoring/back_pressure.md
##
@@ -37,50 +37,47 @@ If you see a **back pressure warning** (e.g. `High`) for a 
task, this means that
 Take a simple `Source -> Sink` job as an example. If you see a warning for 
`Source`, this means that `Sink` is consuming data slower than `Source` is 
producing. `Sink` is back pressuring the upstream operator `Source`.
 
 
-## Sampling Back Pressure
+## Task performance metrics
 
-Back pressure monitoring works by repeatedly taking back pressure samples of 
your running tasks. The JobManager triggers repeated calls to 
`Task.isBackPressured()` for the tasks of your job.
+Every parallel instance of a task (subtask) is exposing a group of three 
metrics:
+- `backPressureTimeMsPerSecond`, time that subtask spent being back pressured
+- `idleTimeMsPerSecond`, time that subtask spent waiting for something to 
process
+- `busyTimeMsPerSecond`, time that subtask was busy doing some actual work
 
-{{< img src="/fig/back_pressure_sampling.png" class="img-responsive" >}}
-
-
-Internally, back pressure is judged based on the availability of output 
buffers. If there is no available buffer (at least one) for output, then it 
indicates that there is back pressure for the task.
-
-By default, the job manager triggers 100 samples every 50ms for each task in 
order to determine back pressure. The ratio you see in the web interface tells 
you how many of these samples were indicating back pressure, e.g. `0.01` 
indicates that only 1 in 100 was back pressured.
-
-- **OK**: 0 <= Ratio <= 0.10
-- **LOW**: 0.10 < Ratio <= 0.5
-- **HIGH**: 0.5 < Ratio <= 1
-
-In order to not overload the task managers with back pressure samples, the web 
interface refreshes samples only after 60 seconds.
-
-## Configuration
-
-You can configure the number of 

[GitHub] [flink] flinkbot commented on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase

2021-05-03 Thread GitBox


flinkbot commented on pull request #15825:
URL: https://github.com/apache/flink/pull/15825#issuecomment-831225399


   
   ## CI report:
   
   * e1a425301eee0b46c812866d12284125f5bde104 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15726: [BP-1.12][FLINK-22368] Deque channel after releasing on EndOfPartition

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15726:
URL: https://github.com/apache/flink/pull/15726#issuecomment-824941786


   
   ## CI report:
   
   * 5022df0e16a5b1f7798171c0cd95426df8ef5c64 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17508)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489)
 
   * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-22139) Flink Jobmanager & Task Manger logs are not writing to the logs files

2021-05-03 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338338#comment-17338338
 ] 

Till Rohrmann commented on FLINK-22139:
---

What is the side pod? Could you try deploying a non-HA cluster as described in 
the documentation and then log into the container of the {{JobManager}} and 
check whether {{/opt/flink/log}} contains the log file?

> Flink Jobmanager & Task Manger logs are not writing to the logs files
> -
>
> Key: FLINK-22139
> URL: https://issues.apache.org/jira/browse/FLINK-22139
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.12.2
> Environment: on kubernetes flink standalone deployment with 
> jobmanager HA is enabled.
>Reporter: Bhagi
>Priority: Major
>
> Hi Team,
> I am submitting the jobs and restarting the job manager and task manager 
> pods..  Log files are generating with the name task manager and job manager.
> but job manager & task manager log file size is '0', i am not sure any 
> configuration missed..why logs are not writing to their log files..
> # Task Manager pod###
> flink@flink-taskmanager-85b6585b7-hhgl7:~$ ls -lart log/
> total 0
> -rw-r--r-- 1 flink flink  0 Apr  7 09:35 
> flink--taskexecutor-0-flink-taskmanager-85b6585b7-hhgl7.log
> flink@flink-taskmanager-85b6585b7-hhgl7:~$
> ### Jobmanager pod Logs #
> flink@flink-jobmanager-f6db89b7f-lq4ps:~$
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:36 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-gtkx5.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:36 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-wnrfm.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:37 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-2b2fs.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 06:37 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-7kdhh.log
> -rw-r--r-- 1 7148739 flink 0 Apr  7 09:35 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-twhkt.log
> drwxrwxrwx 2 7148739 flink35 Apr  7 09:35 .
> -rw-r--r-- 1 7148739 flink 0 Apr  7 09:35 
> flink--standalonesession-0-flink-jobmanager-f6db89b7f-lq4ps.log
> flink@flink-jobmanager-f6db89b7f-lq4ps:~$
> I configured log4j.properties for flink
> log4j.properties: |+
> monitorInterval=30
> rootLogger.level = INFO
> rootLogger.appenderRef.file.ref = MainAppender
> logger.flink.name = org.apache.flink
> logger.flink.level = INFO
> logger.akka.name = akka
> logger.akka.level = INFO
> appender.main.name = MainAppender
> appender.main.type = RollingFile
> appender.main.append = true
> appender.main.fileName = ${sys:log.file}
> appender.main.filePattern = ${sys:log.file}.%i
> appender.main.layout.type = PatternLayout
> appender.main.layout.pattern = %d{-MM-dd HH:mm:ss,SSS} %-5p %-60c %x 
> - %m%n
> appender.main.policies.type = Policies
> appender.main.policies.size.type = SizeBasedTriggeringPolicy
> appender.main.policies.size.size = 100MB
> appender.main.policies.startup.type = OnStartupTriggeringPolicy
> appender.main.strategy.type = DefaultRolloverStrategy
> appender.main.strategy.max = ${env:MAX_LOG_FILE_NUMBER:-10}
> logger.netty.name = 
> org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
> logger.netty.level = OFF



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] tillrohrmann commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests

2021-05-03 Thread GitBox


tillrohrmann commented on a change in pull request #15741:
URL: https://github.com/apache/flink/pull/15741#discussion_r625038879



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/rpc/RpcEndpointTest.java
##
@@ -340,108 +319,54 @@ public void testExecute() throws InterruptedException, 
ExecutionException, Timeo
 }
 }
 
-/** Tests scheduling runnable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleRunnable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay2.getSize(),
-expectedDelay2.getUnit());
-final long actualDelayMs1 =
-actualDelayMsFuture1.get(
-expectedDelay1.getSize() * 2, 
expectedDelay1.getUnit());
-final long actualDelayMs2 =
-actualDelayMsFuture2.get(
-expectedDelay2.getSize() * 2, 
expectedDelay2.getUnit());
-assertTrue(actualDelayMs1 > expectedDelay1.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs1 < expectedDelay1.toMilliseconds() * 1.2);
-assertTrue(actualDelayMs2 > expectedDelay2.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs2 < expectedDelay2.toMilliseconds() * 1.2);
-} finally {
-RpcUtils.terminateRpcEndpoint(endpoint, TIMEOUT);
-}
+public void testScheduleRunnableWithDelayInMilliseconds() throws Exception 
{
+testScheduleWithDelay(
+(mainThreadExecutor, expectedDelay) ->
+mainThreadExecutor.schedule(
+() -> {}, expectedDelay.toMillis(), 
TimeUnit.MILLISECONDS));
 }
 
-/** Tests scheduling callable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleCallable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-final int expectedInt = 12345;
-final String expectedString = "Flink";
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-final ScheduledFuture intScheduleFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-return expectedInt;
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-final ScheduledFuture stringScheduledFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-   

[jira] [Updated] (FLINK-22551) checkpoints: strange behaviour

2021-05-03 Thread buom (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

buom updated FLINK-22551:
-
Description: 
* +*Case 1*:+ Works as expected

{code:java}
public class Example {

public static class ExampleSource extends RichSourceFunction<String>
implements CheckpointedFunction {

private volatile boolean isRunning = true;

@Override
public void open(Configuration parameters) throws Exception {
System.out.println("[source] invoke open()");
}

@Override
public void close() throws Exception {
isRunning = false;
System.out.println("[source] invoke close()");
}

@Override
public void run(SourceContext<String> ctx) throws Exception {
System.out.println("[source] invoke run()");
while (isRunning) {
ctx.collect("Flink");
Thread.sleep(500);
}
}

@Override
public void cancel() {
isRunning = false;
System.out.println("[source] invoke cancel()");
}

@Override
public void snapshotState(FunctionSnapshotContext context) throws 
Exception {
System.out.println("[source] invoke snapshotState()");
}

@Override
public void initializeState(FunctionInitializationContext context) 
throws Exception {
System.out.println("[source] invoke initializeState()");
}

}

public static class ExampleSink extends PrintSinkFunction<String>
implements CheckpointedFunction {

@Override
public void snapshotState(FunctionSnapshotContext context) throws 
Exception {
System.out.println("[sink] invoke snapshotState()");
}

@Override
public void initializeState(FunctionInitializationContext context) 
throws Exception {
System.out.println("[sink] invoke initializeState()");
}
}

public static void main(String[] args) throws Exception {
final StreamExecutionEnvironment env =

StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000);

DataStream<String> stream = env.addSource(new ExampleSource());
stream.addSink(new ExampleSink()).setParallelism(1);

env.execute();
}
}
{code}
{code:java}
$ java -jar ./example.jar

[sink] invoke initializeState()
[source] invoke initializeState()
[source] invoke open()
[source] invoke run()
Flink
[sink] invoke snapshotState()
[source] invoke snapshotState()
Flink
Flink
[sink] invoke snapshotState()
[source] invoke snapshotState()
Flink
Flink
[sink] invoke snapshotState()
[source] invoke snapshotState()
^C
{code}
 * *+Case 2:+* Does not work as expected (w/ _parallelism = 1_)

{code:java}
public class Example {

public static class ExampleSource extends RichSourceFunction<String>
implements CheckpointedFunction {

private volatile boolean isRunning = true;

@Override
public void open(Configuration parameters) throws Exception {
System.out.println("[source] invoke open()");
}

@Override
public void close() throws Exception {
isRunning = false;
System.out.println("[source] invoke close()");
}

@Override
public void run(SourceContext<String> ctx) throws Exception {
System.out.println("[source] invoke run()");
while (isRunning) {
ctx.collect("Flink");
Thread.sleep(500);
}
}

@Override
public void cancel() {
isRunning = false;
System.out.println("[source] invoke cancel()");
}

@Override
public void snapshotState(FunctionSnapshotContext context) throws 
Exception {
System.out.println("[source] invoke snapshotState()");
}

@Override
public void initializeState(FunctionInitializationContext context) 
throws Exception {
System.out.println("[source] invoke initializeState()");
}
}


public static void main(String[] args) throws Exception {
final StreamExecutionEnvironment env =

StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000);

DataStream<String> stream = env.addSource(new ExampleSource());

Properties properties = new Properties();
properties.setProperty("bootstrap.servers", "localhost:9092");
String topic = "my-topic";

FlinkKafkaProducer<String> kafkaProducer = new FlinkKafkaProducer<>(
topic,
(element, timestamp) -> {
byte[] value = element.getBytes(StandardCharsets.UTF_8);
return new ProducerRecord<>(topic, null, timestamp, null, 
value, null);
},
properties,

[jira] [Created] (FLINK-22551) checkpoints: strange behaviour

2021-05-03 Thread buom (Jira)
buom created FLINK-22551:


 Summary: checkpoints: strange behaviour 
 Key: FLINK-22551
 URL: https://issues.apache.org/jira/browse/FLINK-22551
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.13.0
 Environment: {code:java}
 java -version
openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
{code}
Reporter: buom


* +*Case 1*:+ Works as expected

{code:java}
public class Example {

public static class ExampleSource extends RichSourceFunction<String>
implements CheckpointedFunction {

private volatile boolean isRunning = true;

@Override
public void open(Configuration parameters) throws Exception {
System.out.println("[source] invoke open()");
}

@Override
public void close() throws Exception {
isRunning = false;
System.out.println("[source] invoke close()");
}

@Override
public void run(SourceContext<String> ctx) throws Exception {
System.out.println("[source] invoke run()");
while (isRunning) {
ctx.collect("Flink");
Thread.sleep(500);
}
}

@Override
public void cancel() {
isRunning = false;
System.out.println("[source] invoke cancel()");
}

@Override
public void snapshotState(FunctionSnapshotContext context) throws 
Exception {
System.out.println("[source] invoke snapshotState()");
}

@Override
public void initializeState(FunctionInitializationContext context) 
throws Exception {
System.out.println("[source] invoke initializeState()");
}

}

public static class ExampleSink extends PrintSinkFunction<String>
implements CheckpointedFunction {

@Override
public void snapshotState(FunctionSnapshotContext context) throws 
Exception {
System.out.println("[sink] invoke snapshotState()");
}

@Override
public void initializeState(FunctionInitializationContext context) 
throws Exception {
System.out.println("[sink] invoke initializeState()");
}
}

public static void main(String[] args) throws Exception {
final StreamExecutionEnvironment env =

StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000);

DataStream<String> stream = env.addSource(new ExampleSource());
stream.addSink(new ExampleSink()).setParallelism(1);

env.execute();
}
}
{code}
{code:java}
$ java -jar ./example.jar

[sink] invoke initializeState()
[source] invoke initializeState()
[source] invoke open()
[source] invoke run()
Flink
[sink] invoke snapshotState()
[source] invoke snapshotState()
Flink
Flink
[sink] invoke snapshotState()
[source] invoke snapshotState()
Flink
Flink
[sink] invoke snapshotState()
[source] invoke snapshotState()
^C
{code}
 * *+Case 2:+* Does not work as expected (w/ _parallelism = 1_)

{code:java}
public class Example {

public static class ExampleSource extends RichSourceFunction<String>
implements CheckpointedFunction {

private volatile boolean isRunning = true;

@Override
public void open(Configuration parameters) throws Exception {
System.out.println("[source] invoke open()");
}

@Override
public void close() throws Exception {
isRunning = false;
System.out.println("[source] invoke close()");
}

@Override
public void run(SourceContext<String> ctx) throws Exception {
System.out.println("[source] invoke run()");
while (isRunning) {
ctx.collect("Flink");
Thread.sleep(500);
}
}

@Override
public void cancel() {
isRunning = false;
System.out.println("[source] invoke cancel()");
}

@Override
public void snapshotState(FunctionSnapshotContext context) throws 
Exception {
System.out.println("[source] invoke snapshotState()");
}

@Override
public void initializeState(FunctionInitializationContext context) 
throws Exception {
System.out.println("[source] invoke initializeState()");
}
}


public static void main(String[] args) throws Exception {
final StreamExecutionEnvironment env =

StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000);

DataStream<String> stream = env.addSource(new ExampleSource());

Properties properties = new Properties();
properties.setProperty("bootstrap.servers", "localhost:9092");
String topic = "my-topic";

FlinkKafkaProducer<String> kafkaProducer = new FlinkKafkaProducer<>(
 

[GitHub] [flink-docker] AHeise commented on pull request #74: Updated maintainers

2021-05-03 Thread GitBox


AHeise commented on pull request #74:
URL: https://github.com/apache/flink-docker/pull/74#issuecomment-831207527


   Having the project is probably more future proof. Not sure why you need to 
list maintainers here in the first place...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15121:
URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240


   
   ## CI report:
   
   * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN
   * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN
   * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN
   * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN
   * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN
   * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN
   * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489)
 
   * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase

2021-05-03 Thread GitBox


flinkbot commented on pull request #15825:
URL: https://github.com/apache/flink/pull/15825#issuecomment-831205782


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit e1a425301eee0b46c812866d12284125f5bde104 (Mon May 03 
11:41:33 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-22406) Unstable test ReactiveModeITCase.testScaleDownOnTaskManagerLoss()

2021-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-22406:
---
Labels: pull-request-available test-stability  (was: test-stability)

> Unstable test ReactiveModeITCase.testScaleDownOnTaskManagerLoss()
> -
>
> Key: FLINK-22406
> URL: https://issues.apache.org/jira/browse/FLINK-22406
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.13.0
>Reporter: Stephan Ewen
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> The test is stalling on Azure CI.
> https://dev.azure.com/sewen0794/Flink/_build/results?buildId=292=logs=0a15d512-44ac-5ba5-97ab-13a5d066c22c=634cd701-c189-5dff-24cb-606ed884db87=4865



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zentol opened a new pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase

2021-05-03 Thread GitBox


zentol opened a new pull request #15825:
URL: https://github.com/apache/flink/pull/15825


   Stabilizes the ReactiveModeITCase by not checking what is actually being 
deployed, but just asking the JM with what parallelism it currently intends to 
run the job with.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol merged pull request #15749: [hotfix][docs] Fix typo.

2021-05-03 Thread GitBox


zentol merged pull request #15749:
URL: https://github.com/apache/flink/pull/15749


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol merged pull request #15800: [hotfix][python] Format exception message in ZipUtils

2021-05-03 Thread GitBox


zentol merged pull request #15800:
URL: https://github.com/apache/flink/pull/15800


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-22494) Avoid discarding checkpoints in case of failure

2021-05-03 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias reassigned FLINK-22494:


Assignee: Matthias

> Avoid discarding checkpoints in case of failure
> ---
>
> Key: FLINK-22494
> URL: https://issues.apache.org/jira/browse/FLINK-22494
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / Coordination
>Affects Versions: 1.13.0, 1.14.0, 1.12.3
>Reporter: Matthias
>Assignee: Matthias
>Priority: Critical
> Fix For: 1.14.0, 1.13.1, 1.12.4
>
>
> Both {{StateHandleStore}} implementations (i.e. 
> [KubernetesStateHandleStore:157|https://github.com/apache/flink/blob/c6997c97c575d334679915c328792b8a3067cfb5/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesStateHandleStore.java#L157]
>  and 
> [ZooKeeperStateHandleStore:170|https://github.com/apache/flink/blob/c6997c97c575d334679915c328792b8a3067cfb5/flink-runtime/src/main/java/org/apache/flink/runtime/zookeeper/ZooKeeperStateHandleStore.java#L170])
>  discard checkpoints if the checkpoint metadata wasn't written to the 
> backend. 
> This does not cover the cases where the data was actually written to the 
> backend but the call failed anyway (e.g. due to network issues). In such a 
> case, we might end up having a pointer in the backend pointing to a 
> checkpoint that was discarded.
> Instead of discarding the checkpoint data in this case, we might want to keep 
> it for this specific use case. Otherwise, we might run into Exceptions when 
> recovering from the Checkpoint later on. We might want to add a warning to 
> the user pointing to the possibly orphaned checkpoint data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zentol merged pull request #15805: Update japicmp configuration and docs for 1.12.3

2021-05-03 Thread GitBox


zentol merged pull request #15805:
URL: https://github.com/apache/flink/pull/15805


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests

2021-05-03 Thread GitBox


zentol commented on a change in pull request #15741:
URL: https://github.com/apache/flink/pull/15741#discussion_r625023950



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/rpc/RpcEndpointTest.java
##
@@ -340,108 +319,54 @@ public void testExecute() throws InterruptedException, 
ExecutionException, Timeo
 }
 }
 
-/** Tests scheduling runnable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleRunnable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay2.getSize(),
-expectedDelay2.getUnit());
-final long actualDelayMs1 =
-actualDelayMsFuture1.get(
-expectedDelay1.getSize() * 2, 
expectedDelay1.getUnit());
-final long actualDelayMs2 =
-actualDelayMsFuture2.get(
-expectedDelay2.getSize() * 2, 
expectedDelay2.getUnit());
-assertTrue(actualDelayMs1 > expectedDelay1.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs1 < expectedDelay1.toMilliseconds() * 1.2);
-assertTrue(actualDelayMs2 > expectedDelay2.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs2 < expectedDelay2.toMilliseconds() * 1.2);
-} finally {
-RpcUtils.terminateRpcEndpoint(endpoint, TIMEOUT);
-}
+public void testScheduleRunnableWithDelayInMilliseconds() throws Exception 
{
+testScheduleWithDelay(
+(mainThreadExecutor, expectedDelay) ->
+mainThreadExecutor.schedule(
+() -> {}, expectedDelay.toMillis(), 
TimeUnit.MILLISECONDS));
 }
 
-/** Tests scheduling callable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleCallable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-final int expectedInt = 12345;
-final String expectedString = "Flink";
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-final ScheduledFuture intScheduleFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-return expectedInt;
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-final ScheduledFuture stringScheduledFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-

[GitHub] [flink] flinkbot edited a comment on pull request #15727: [BP-1.13][FLINK-22368] Deque channel after releasing on EndOfPartition

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15727:
URL: https://github.com/apache/flink/pull/15727#issuecomment-824941920


   
   ## CI report:
   
   * 72cc560e0d4292d7de977ccc3aabca2c48d6d518 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17506)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests

2021-05-03 Thread GitBox


zentol commented on pull request #15741:
URL: https://github.com/apache/flink/pull/15741#issuecomment-831198755


   > For example, I think I could make the test pass if schedule completely 
ignores the delay and runs the result directly. Hence, there is no longer a 
guard preventing such a regression.
   
   How so? It would call the wrong method on the `MainThreadExecutable`, 
failing the test. Now if you break the actual scheduling behavior in the 
`AkkaInvocationHandler` then this can indeed happen, but that reveals further 
issues in these tests (i.e., that they are reliant on a specific `RpcService` 
implementation).
   
   > I fear that we are actually losing test coverage with these changes
   
   We do, of the `AkkaInvocationHandler`. I can try adding a test for the 
actual scheduling behavior, but I don't think we should revert the proposed 
changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests

2021-05-03 Thread GitBox


zentol commented on a change in pull request #15741:
URL: https://github.com/apache/flink/pull/15741#discussion_r625017325



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/rpc/RpcEndpointTest.java
##
@@ -340,108 +319,54 @@ public void testExecute() throws InterruptedException, 
ExecutionException, Timeo
 }
 }
 
-/** Tests scheduling runnable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleRunnable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay2.getSize(),
-expectedDelay2.getUnit());
-final long actualDelayMs1 =
-actualDelayMsFuture1.get(
-expectedDelay1.getSize() * 2, 
expectedDelay1.getUnit());
-final long actualDelayMs2 =
-actualDelayMsFuture2.get(
-expectedDelay2.getSize() * 2, 
expectedDelay2.getUnit());
-assertTrue(actualDelayMs1 > expectedDelay1.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs1 < expectedDelay1.toMilliseconds() * 1.2);
-assertTrue(actualDelayMs2 > expectedDelay2.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs2 < expectedDelay2.toMilliseconds() * 1.2);
-} finally {
-RpcUtils.terminateRpcEndpoint(endpoint, TIMEOUT);
-}
+public void testScheduleRunnableWithDelayInMilliseconds() throws Exception 
{
+testScheduleWithDelay(
+(mainThreadExecutor, expectedDelay) ->
+mainThreadExecutor.schedule(
+() -> {}, expectedDelay.toMillis(), 
TimeUnit.MILLISECONDS));
 }
 
-/** Tests scheduling callable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleCallable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-final int expectedInt = 12345;
-final String expectedString = "Flink";
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-final ScheduledFuture intScheduleFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-return expectedInt;
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-final ScheduledFuture stringScheduledFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-

[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests

2021-05-03 Thread GitBox


zentol commented on a change in pull request #15741:
URL: https://github.com/apache/flink/pull/15741#discussion_r625016221



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/rpc/RpcEndpointTest.java
##
@@ -340,108 +319,54 @@ public void testExecute() throws InterruptedException, 
ExecutionException, Timeo
 }
 }
 
-/** Tests scheduling runnable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleRunnable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay2.getSize(),
-expectedDelay2.getUnit());
-final long actualDelayMs1 =
-actualDelayMsFuture1.get(
-expectedDelay1.getSize() * 2, 
expectedDelay1.getUnit());
-final long actualDelayMs2 =
-actualDelayMsFuture2.get(
-expectedDelay2.getSize() * 2, 
expectedDelay2.getUnit());
-assertTrue(actualDelayMs1 > expectedDelay1.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs1 < expectedDelay1.toMilliseconds() * 1.2);
-assertTrue(actualDelayMs2 > expectedDelay2.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs2 < expectedDelay2.toMilliseconds() * 1.2);
-} finally {
-RpcUtils.terminateRpcEndpoint(endpoint, TIMEOUT);
-}
+public void testScheduleRunnableWithDelayInMilliseconds() throws Exception 
{
+testScheduleWithDelay(
+(mainThreadExecutor, expectedDelay) ->
+mainThreadExecutor.schedule(
+() -> {}, expectedDelay.toMillis(), 
TimeUnit.MILLISECONDS));
 }
 
-/** Tests scheduling callable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleCallable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-final int expectedInt = 12345;
-final String expectedString = "Flink";
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-final ScheduledFuture intScheduleFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-return expectedInt;
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-final ScheduledFuture stringScheduledFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-

[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests

2021-05-03 Thread GitBox


zentol commented on a change in pull request #15741:
URL: https://github.com/apache/flink/pull/15741#discussion_r625016153



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/rpc/RpcEndpointTest.java
##
@@ -340,108 +319,54 @@ public void testExecute() throws InterruptedException, 
ExecutionException, Timeo
 }
 }
 
-/** Tests scheduling runnable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleRunnable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-},
-expectedDelay2.getSize(),
-expectedDelay2.getUnit());
-final long actualDelayMs1 =
-actualDelayMsFuture1.get(
-expectedDelay1.getSize() * 2, 
expectedDelay1.getUnit());
-final long actualDelayMs2 =
-actualDelayMsFuture2.get(
-expectedDelay2.getSize() * 2, 
expectedDelay2.getUnit());
-assertTrue(actualDelayMs1 > expectedDelay1.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs1 < expectedDelay1.toMilliseconds() * 1.2);
-assertTrue(actualDelayMs2 > expectedDelay2.toMilliseconds() * 0.8);
-assertTrue(actualDelayMs2 < expectedDelay2.toMilliseconds() * 1.2);
-} finally {
-RpcUtils.terminateRpcEndpoint(endpoint, TIMEOUT);
-}
+public void testScheduleRunnableWithDelayInMilliseconds() throws Exception 
{
+testScheduleWithDelay(
+(mainThreadExecutor, expectedDelay) ->
+mainThreadExecutor.schedule(
+() -> {}, expectedDelay.toMillis(), 
TimeUnit.MILLISECONDS));
 }
 
-/** Tests scheduling callable with delay specified in number and TimeUnit. 
*/
 @Test
-public void testScheduleCallable()
-throws InterruptedException, ExecutionException, TimeoutException {
-final Time expectedDelay1 = Time.seconds(1);
-final Time expectedDelay2 = Time.milliseconds(500);
-final CompletableFuture actualDelayMsFuture1 = new 
CompletableFuture<>();
-final CompletableFuture actualDelayMsFuture2 = new 
CompletableFuture<>();
-final RpcEndpoint endpoint = new BaseEndpoint(rpcService);
-final int expectedInt = 12345;
-final String expectedString = "Flink";
-try {
-endpoint.start();
-final long startTime = System.currentTimeMillis();
-final ScheduledFuture intScheduleFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture1.complete(
-System.currentTimeMillis() - 
startTime);
-return expectedInt;
-},
-expectedDelay1.getSize(),
-expectedDelay1.getUnit());
-final ScheduledFuture stringScheduledFuture =
-endpoint.getMainThreadExecutor()
-.schedule(
-() -> {
-endpoint.validateRunsInMainThread();
-actualDelayMsFuture2.complete(
-System.currentTimeMillis() - 
startTime);
-

[GitHub] [flink] rkhachatryan commented on a change in pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …

2021-05-03 Thread GitBox


rkhachatryan commented on a change in pull request #15728:
URL: https://github.com/apache/flink/pull/15728#discussion_r625010241



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculatorTest.java
##
@@ -149,18 +150,80 @@ public void testComputeWithMultipleLevels() throws 
Exception {
 }
 
 @Test
-public void testWithTriggeredTasksNotRunning() throws Exception {
+public void testNotRunningOneOfSourcesTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one RUNNING source and NOT 
RUNNING source.
+FunctionWithException 
twoSourcesBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id)
+.addJobVertex(new JobVertexID())
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(twoSourcesBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningSingleSourceTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one NOT RUNNING source and 
RUNNING not source task.
+FunctionWithException 
sourceAndNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id)
+.addJobVertex(new JobVertexID(), false)
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(sourceAndNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningOneOfNotSourcesTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with NOT RUNNING not source and 
RUNNING not source task.
+FunctionWithException 
twoNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id, false)
+.addJobVertex(new JobVertexID(), false)
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(twoNotSourceBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningSingleNotSourceTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with NOT RUNNING not source and 
RUNNING source tasks.
+FunctionWithException 
sourceAndNotSourceBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id, false)
+.addJobVertex(new JobVertexID())
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))

Review comment:
   I think this line doesn't have any effect since later this vertex `id` 
transitions to some non-RUNNING state unconditionally.

##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculatorTest.java
##
@@ -149,18 +150,80 @@ public void testComputeWithMultipleLevels() throws 
Exception {
 }
 
 @Test
-public void testWithTriggeredTasksNotRunning() throws Exception {
+public void testNotRunningOneOfSourcesTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one RUNNING source and NOT 
RUNNING source.
+FunctionWithException 
twoSourcesBuilder =
+(id) ->
+new 
CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+.addJobVertex(id)
+.addJobVertex(new JobVertexID())
+.setTransitToRunning(vertex -> 
!vertex.getJobvertexId().equals(id))
+.build();
+
+// when: Creating the checkpoint plan.
+runWithNotRunTask(twoSourcesBuilder);
+
+// then: The plan failed because one task didn't have RUNNING state.
+}
+
+@Test
+public void testNotRunningSingleSourceTriggeredTasksNotRunning() throws 
Exception {
+// given: Execution graph builder with one NOT RUNNING 

[jira] [Updated] (FLINK-22550) Flink-docker dev-master works against 1.11-SNAPSHOT

2021-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-22550:
---
Labels: pull-request-available  (was: )

> Flink-docker dev-master works against 1.11-SNAPSHOT
> ---
>
> Key: FLINK-22550
> URL: https://issues.apache.org/jira/browse/FLINK-22550
> Project: Flink
>  Issue Type: Bug
>  Components: flink-docker
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
>
> The dev-master branch is supposed to work against the latest snapshot version 
> of Flink, but it currently points to 1.11.
> We also need to update the release guide to ensure this is updated when a 
> release branch is cut.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-docker] zentol commented on pull request #74: Updated maintainers

2021-05-03 Thread GitBox


zentol commented on pull request #74:
URL: https://github.com/apache/flink-docker/pull/74#issuecomment-831190343


   Do we need to list specific people, or can we just point to the Flink 
project as a whole?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-docker] zentol opened a new pull request #79: [FLINK-22550] Run tests against 1.14-SNAPSHOT

2021-05-03 Thread GitBox


zentol opened a new pull request #79:
URL: https://github.com/apache/flink-docker/pull/79


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-22550) Flink-docker dev-master works against 1.11-SNAPSHOT

2021-05-03 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-22550:


 Summary: Flink-docker dev-master works against 1.11-SNAPSHOT
 Key: FLINK-22550
 URL: https://issues.apache.org/jira/browse/FLINK-22550
 Project: Flink
  Issue Type: Bug
  Components: flink-docker
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler


The dev-master branch is supposed to work against the latest snapshot version 
of Flink, but it currently points to 1.11.

We also need to update the release guide to ensure this is updated when a 
release branch is cut.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15824: (1.11) [FLINK-20383][runtime] Fix race condition in notification.

2021-05-03 Thread GitBox


flinkbot edited a comment on pull request #15824:
URL: https://github.com/apache/flink/pull/15824#issuecomment-831161678


   
   ## CI report:
   
   * f741e9ec3edf512b0adf4eb17548f9bf073f00ff Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17512)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] MarkSfik commented on a change in pull request #438: Release blog post (update)

2021-05-03 Thread GitBox


MarkSfik commented on a change in pull request #438:
URL: https://github.com/apache/flink-web/pull/438#discussion_r624998785



##
File path: _posts/2021-04-22-release-1.13.0.md
##
@@ -0,0 +1,590 @@
+---
+layout: post 
+title:  "Apache Flink 1.13.0 Release Announcement"
+date: 2021-04-22T08:00:00.000Z 
+categories: news 
+authors:
+- stephan:
+  name: "Stephan Ewen"
+  twitter: "StephanEwen"
+- dwysakowicz:
+  name: "Dawid Wysakowicz"
+  twitter: "dwysakowicz"
+
+excerpt: The Apache Flink community is excited to announce the release of 
Flink 1.13.0! Around 200 contributors worked on over 1,000 issues to bring 
significant improvements to usability and observability as well as new features 
that improve elasticity of Flink’s Application-style deployments.
+---
+
+
+The Apache Flink community is excited to announce the release of Flink 1.13.0! 
More than 200
+contributors worked on over 1,000 issues for this new version.
+
+The release brings us a big step forward in one of our major efforts: **Making 
Stream Processing
+Applications as natural and as simple to manage as any other application.** 
The new *reactive scaling*
+mode means that scaling streaming applications in and out now works like in 
any other application,
+by just changing the number of parallel processes.
+
+The release also prominently features a **series of improvements that help 
users better understand the performance of
+applications.** When the streams don't flow as fast as you'd hope, these can 
help you to understand
+why: Load and *backpressure visualization* to identify bottlenecks, *CPU flame 
graphs* to identify hot
+code paths in your application, and *State Access Latencies* to see how the 
State Backends are keeping
+up.
+
+Beyond those features, the Flink community has added a ton of improvements all 
over the system,
+some of which we discuss in this article. We hope you enjoy the new release 
and features.
+Towards the end of the article, we describe changes to be aware of when 
upgrading
+from earlier versions of Apache Flink.
+
+{% toc %}
+
+We encourage you to [download the 
release](https://flink.apache.org/downloads.html) and share your
+feedback with the community through
+the [Flink mailing 
lists](https://flink.apache.org/community.html#mailing-lists)
+or [JIRA](https://issues.apache.org/jira/projects/FLINK/summary).
+
+
+
+# Notable Features
+
+## Reactive Scaling
+
+Reactive Scaling is the latest piece in Flink's initiative to make Stream 
Processing
+Applications as natural and as simple to manage as any other application.
+
+Flink has a dual nature when it comes to resource management and deployments: 
You can deploy
+Flink applications onto resource orchestrators like Kubernetes or Yarn in such 
a way that Flink actively manages
+the resources, and allocates and releases workers as needed. That is 
especially useful for jobs and
+applications that rapidly change their required resources, like batch 
applications and ad-hoc SQL
+queries. The application parallelism rules, the number of workers follows. In 
the context of Flink
+applications, we call this *active scaling*.
+
+For long running streaming applications, it is often a nicer model to just 
deploy them like any
+other long-running application: The application doesn't really need to know 
that it runs on K8s,
+EKS, Yarn, etc. and doesn't try to acquire a specific amount of workers; 
instead, it just uses the
+number of workers that is given to it. The number of workers rules, the 
application parallelism
adjusts to that. In the context of Flink, we call this *reactive scaling*.
+
+The [Application Deployment Mode]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/concepts/flink-architecture/#flink-application-execution)
+started this effort, making deployments more application-like (by avoiding two 
separate deployment
+steps to (1) start cluster and (2) submit application). The reactive scaling 
mode completes this,
+and you now don't have to use extra tools (scripts or a K8s operator) any more 
to keep the number
+of workers and the application parallelism settings in sync.
+
+You can now put an autoscaler around Flink applications like around other typical applications, as
+long as you are mindful of the cost of rescaling when configuring the autoscaler: Stateful
+streaming applications must move state around when scaling.
+
+To try the reactive-scaling mode, add the `scheduler-mode: reactive` config 
entry and deploy
+an application cluster ([standalone]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/deployment/resource-providers/standalone/overview/#application-mode)
 or [Kubernetes]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#deploy-application-cluster)).
 Check out [the reactive scaling docs]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/deployment/elastic_scaling/#reactive-mode) for 
more details.
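A minimal sketch of that configuration change (only the one entry is shown; the rest of `flink-conf.yaml` stays as it is):

```yaml
# flink-conf.yaml (excerpt): enable the reactive scheduler mode for a
# standalone application cluster; no other entries need to change.
scheduler-mode: reactive
```

Once this is set, scaling is driven purely by the number of workers (TaskManagers) available to the cluster; the application parallelism adjusts to match.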
+
+
+## Analyzing Application 
