[jira] [Commented] (FLINK-22139) Flink Jobmanager & Task Manager logs are not writing to the log files
[ https://issues.apache.org/jira/browse/FLINK-22139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338745#comment-17338745 ] Bhagi commented on FLINK-22139: --- Hi Till, I was able to see the logs inside the container in non-HA mode. Let me check or reconfigure the Flink HA mode config file. Thanks > Flink Jobmanager & Task Manager logs are not writing to the log files > - > > Key: FLINK-22139 > URL: https://issues.apache.org/jira/browse/FLINK-22139 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes > Affects Versions: 1.12.2 > Environment: on a Kubernetes Flink standalone deployment with > jobmanager HA enabled. > Reporter: Bhagi > Priority: Major > > Hi Team, > I am submitting jobs and restarting the job manager and task manager > pods. Log files are being created with the task manager and job manager names, > but the job manager & task manager log file size is '0'. I am not sure which > configuration is missing, or why logs are not being written to the log files. > # Task Manager pod### > flink@flink-taskmanager-85b6585b7-hhgl7:~$ ls -lart log/ > total 0 > -rw-r--r-- 1 flink flink 0 Apr 7 09:35 > flink--taskexecutor-0-flink-taskmanager-85b6585b7-hhgl7.log > flink@flink-taskmanager-85b6585b7-hhgl7:~$ > ### Jobmanager pod Logs # > flink@flink-jobmanager-f6db89b7f-lq4ps:~$ > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:36 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-gtkx5.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:36 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-wnrfm.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:37 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-2b2fs.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:37 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-7kdhh.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 09:35 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-twhkt.log > drwxrwxrwx 2 7148739 flink 35 Apr 7 09:35 . 
> -rw-r--r-- 1 7148739 flink 0 Apr 7 09:35 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-lq4ps.log > flink@flink-jobmanager-f6db89b7f-lq4ps:~$ > I configured log4j.properties for flink > log4j.properties: |+ > monitorInterval=30 > rootLogger.level = INFO > rootLogger.appenderRef.file.ref = MainAppender > logger.flink.name = org.apache.flink > logger.flink.level = INFO > logger.akka.name = akka > logger.akka.level = INFO > appender.main.name = MainAppender > appender.main.type = RollingFile > appender.main.append = true > appender.main.fileName = ${sys:log.file} > appender.main.filePattern = ${sys:log.file}.%i > appender.main.layout.type = PatternLayout > appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x > - %m%n > appender.main.policies.type = Policies > appender.main.policies.size.type = SizeBasedTriggeringPolicy > appender.main.policies.size.size = 100MB > appender.main.policies.startup.type = OnStartupTriggeringPolicy > appender.main.strategy.type = DefaultRolloverStrategy > appender.main.strategy.max = ${env:MAX_LOG_FILE_NUMBER:-10} > logger.netty.name = > org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline > logger.netty.level = OFF -- This message was sent by Atlassian Jira (v8.3.4#803005)
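A note on the configuration above: the `MainAppender` writes to `${sys:log.file}`, so if the `log.file` JVM system property is not set when the JobManager/TaskManager JVM starts (the Flink launch scripts normally pass `-Dlog.file=...`), Log4j has no usable target path, which is one plausible cause of empty log files. A rough stdlib-only sketch of how a `${sys:...}` lookup resolves against system properties (hypothetical helper, not Log4j's actual implementation):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SysLookup {
    // Matches ${sys:some.property.name}
    private static final Pattern SYS = Pattern.compile("\\$\\{sys:([^}]+)\\}");

    /**
     * Replace each ${sys:name} with System.getProperty(name).
     * If the property is unset, the placeholder is left unresolved,
     * so the appender never gets a real file path.
     */
    public static String resolve(String value) {
        Matcher m = SYS.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String prop = System.getProperty(m.group(1));
            m.appendReplacement(
                    out, Matcher.quoteReplacement(prop != null ? prop : m.group(0)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.setProperty("log.file", "/opt/flink/log/jobmanager.log");
        // Prints /opt/flink/log/jobmanager.log.%i
        System.out.println(resolve("${sys:log.file}.%i"));
    }
}
```

So a quick check inside the pod is whether the running JVM actually carries a `-Dlog.file=...` argument (e.g. via `ps aux`).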
[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql
flinkbot edited a comment on pull request #15823: URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274 ## CI report: * 323d87e5394e8968856741d0bfd48463e38ddc8a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17531) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map
flinkbot edited a comment on pull request #15712: URL: https://github.com/apache/flink/pull/15712#issuecomment-824189664 ## CI report: * f2fe08ada02c5c9e20ed397163d3ab7a34594994 UNKNOWN * 5fdb0880e9465be2711e8ac4b9df01388fc5aa2f Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17532) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map
flinkbot edited a comment on pull request #15712: URL: https://github.com/apache/flink/pull/15712#issuecomment-824189664 ## CI report: * 71ecabae057e85596265ea48d6b44662ad069bce Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17448) * f2fe08ada02c5c9e20ed397163d3ab7a34594994 UNKNOWN * 5fdb0880e9465be2711e8ac4b9df01388fc5aa2f UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map
flinkbot edited a comment on pull request #15712: URL: https://github.com/apache/flink/pull/15712#issuecomment-824189664 ## CI report: * 71ecabae057e85596265ea48d6b44662ad069bce Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17448) * f2fe08ada02c5c9e20ed397163d3ab7a34594994 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] hehuiyuan commented on a change in pull request #15712: [FLINK-22400][hive connect]fix NPE problem when convert flink object for Map
hehuiyuan commented on a change in pull request #15712: URL: https://github.com/apache/flink/pull/15712#discussion_r625459401 ## File path: flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/TableEnvHiveConnectorITCase.java ## @@ -531,6 +532,52 @@ public void testLocationWithComma() throws Exception { } } +@Test +public void testReadHiveDataWithEmptyMapForHiveShim20X() throws Exception { +TableEnvironment tableEnv = getTableEnvWithHiveCatalog(); + +try { +// Flink to write parquet file Review comment: yes,parquet -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
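For context on the fix under review: the NPE arises when a Hive `MAP` value comes back as `null` (e.g. an empty map in a Parquet file written through an older Hive shim) and is then converted to a Flink object. A hedged, stdlib-only sketch of the null-safe conversion idea (the helper name and signature are hypothetical, not the actual HiveShim code):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class HiveMapConversion {
    /**
     * Convert a possibly-null Hive map into a Flink-side map, applying a
     * per-entry converter. Returns an empty map instead of throwing an NPE
     * when the source map is null.
     */
    public static <K, V, K2, V2> Map<K2, V2> toFlinkMap(
            Map<K, V> hiveMap, Function<K, K2> keyConv, Function<V, V2> valConv) {
        if (hiveMap == null) {
            // The crux of the fix: treat a null Hive map as empty.
            return Collections.emptyMap();
        }
        Map<K2, V2> result = new LinkedHashMap<>(hiveMap.size());
        for (Map.Entry<K, V> e : hiveMap.entrySet()) {
            result.put(keyConv.apply(e.getKey()), valConv.apply(e.getValue()));
        }
        return result;
    }
}
```

The test added in the PR (`testReadHiveDataWithEmptyMapForHiveShim20X`) exercises exactly this path: write a table with an empty map via Flink, then read it back without a NullPointerException.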
[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources
flinkbot edited a comment on pull request #15789: URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694 ## CI report: * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN * 0bba0d971d9d4b8530cf90c6513ce68d91cad54e Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17526) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-22555) LGPL-2.1 files in flink-python jars
[ https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338692#comment-17338692 ] Chesnay Schepler commented on FLINK-22555: -- Out of curiosity, how did you discover this? > LGPL-2.1 files in flink-python jars > --- > > Key: FLINK-22555 > URL: https://issues.apache.org/jira/browse/FLINK-22555 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.13.0, 1.12.3 >Reporter: Henri Yandell >Assignee: Chesnay Schepler >Priority: Blocker > Fix For: 1.13.1, 1.12.4 > > > Looking at, for example, > [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] > the jar file contains three LGPL-2.1 source files: > * > flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml > * flink-python_2.11-1.13.0/schema/module-1_1.xsd > * flink-python_2.11-1.13.0/schema/module-1_0.xsd > There's nothing in the DEPENDENCIES or licenses directory on the topic. It > looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the > Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules). > The xsd files also appear to be coming from JBoss Modules. Given Apache's > position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22555) LGPL-2.1 files in flink-python jars
[ https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-22555: - Affects Version/s: 1.13.0 1.12.3 > LGPL-2.1 files in flink-python jars > --- > > Key: FLINK-22555 > URL: https://issues.apache.org/jira/browse/FLINK-22555 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.13.0, 1.12.3 >Reporter: Henri Yandell >Assignee: Chesnay Schepler >Priority: Blocker > > > Looking at, for example, > [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] > the jar file contains three LGPL-2.1 source files: > * > flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml > * flink-python_2.11-1.13.0/schema/module-1_1.xsd > * flink-python_2.11-1.13.0/schema/module-1_0.xsd > There's nothing in the DEPENDENCIES or licenses directory on the topic. It > looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the > Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules). > The xsd files also appear to be coming from JBoss Modules. Given Apache's > position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22555) LGPL-2.1 files in flink-python jars
[ https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-22555: - Fix Version/s: 1.12.4 1.13.1 > LGPL-2.1 files in flink-python jars > --- > > Key: FLINK-22555 > URL: https://issues.apache.org/jira/browse/FLINK-22555 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.13.0, 1.12.3 >Reporter: Henri Yandell >Assignee: Chesnay Schepler >Priority: Blocker > Fix For: 1.13.1, 1.12.4 > > > Looking at, for example, > [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] > the jar file contains three LGPL-2.1 source files: > * > flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml > * flink-python_2.11-1.13.0/schema/module-1_1.xsd > * flink-python_2.11-1.13.0/schema/module-1_0.xsd > There's nothing in the DEPENDENCIES or licenses directory on the topic. It > looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the > Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules). > The xsd files also appear to be coming from JBoss Modules. Given Apache's > position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-22555) LGPL-2.1 files in flink-python jars
[ https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338687#comment-17338687 ] Chesnay Schepler commented on FLINK-22555: -- This seems to be bundled in beam-vendor-grpc-1_26_0-0.3; maybe we can just exclude these files. [~bayard] could you open an issue with the Beam project to remove said files from their releases? > LGPL-2.1 files in flink-python jars > --- > > Key: FLINK-22555 > URL: https://issues.apache.org/jira/browse/FLINK-22555 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Henri Yandell >Assignee: Chesnay Schepler >Priority: Blocker > > > Looking at, for example, > [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] > the jar file contains three LGPL-2.1 source files: > * > flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml > * flink-python_2.11-1.13.0/schema/module-1_1.xsd > * flink-python_2.11-1.13.0/schema/module-1_0.xsd > There's nothing in the DEPENDENCIES or licenses directory on the topic. It > looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the > Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules). > The xsd files also appear to be coming from JBoss Modules. Given Apache's > position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-22555) LGPL-2.1 files in flink-python jars
[ https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-22555: Assignee: Chesnay Schepler > LGPL-2.1 files in flink-python jars > --- > > Key: FLINK-22555 > URL: https://issues.apache.org/jira/browse/FLINK-22555 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Henri Yandell >Assignee: Chesnay Schepler >Priority: Blocker > > > Looking at, for example, > [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] > the jar file contains three LGPL-2.1 source files: > * > flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml > * flink-python_2.11-1.13.0/schema/module-1_1.xsd > * flink-python_2.11-1.13.0/schema/module-1_0.xsd > There's nothing in the DEPENDENCIES or licenses directory on the topic. It > looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the > Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules). > The xsd files also appear to be coming from JBoss Modules. Given Apache's > position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22555) LGPL-2.1 files in flink-python jars
[ https://issues.apache.org/jira/browse/FLINK-22555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-22555: - Priority: Blocker (was: Major) > LGPL-2.1 files in flink-python jars > --- > > Key: FLINK-22555 > URL: https://issues.apache.org/jira/browse/FLINK-22555 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Henri Yandell >Priority: Blocker > > > Looking at, for example, > [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] > the jar file contains three LGPL-2.1 source files: > * > flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml > * flink-python_2.11-1.13.0/schema/module-1_1.xsd > * flink-python_2.11-1.13.0/schema/module-1_0.xsd > There's nothing in the DEPENDENCIES or licenses directory on the topic. It > looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the > Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules). > The xsd files also appear to be coming from JBoss Modules. Given Apache's > position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql
flinkbot edited a comment on pull request #15823: URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274 ## CI report: * df97ad0958eaf4e9064303479b040fbb83a493e8 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17525) * 323d87e5394e8968856741d0bfd48463e38ddc8a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17531) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …
flinkbot edited a comment on pull request #15728: URL: https://github.com/apache/flink/pull/15728#issuecomment-824960796 ## CI report: * 057695465c326acc5c40576cf35ee9f0283d6cac Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17524) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-22555) LGPL-2.1 files in flink-python jars
Henri Yandell created FLINK-22555: - Summary: LGPL-2.1 files in flink-python jars Key: FLINK-22555 URL: https://issues.apache.org/jira/browse/FLINK-22555 Project: Flink Issue Type: Bug Components: API / Python Reporter: Henri Yandell Looking at, for example, [https://repo1.maven.org/maven2/org/apache/flink/flink-python_2.11/1.13.0/] the jar file contains three LGPL-2.1 source files: * flink-python_2.11-1.13.0/META-INF/maven/org.jboss.modules/jboss-modules/pom.xml * flink-python_2.11-1.13.0/schema/module-1_1.xsd * flink-python_2.11-1.13.0/schema/module-1_0.xsd There's nothing in the DEPENDENCIES or licenses directory on the topic. It looks like Netty brings in the LGPL-2.1 JBoss dependency (depending on the Apache-2.0 JBoss Marshalling, which depends on the LGPL-2.1 JBoss Modules). The xsd files also appear to be coming from JBoss Modules. Given Apache's position on LGPL-2.1 dependency inclusion in ASF projects, this seems like an issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
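To Chesnay's question of how such files get discovered: findings like this are typically produced by scanning a published jar's entry paths for files belonging to known LGPL artifacts (e.g. anything under `org.jboss.modules` or the `module-1_*.xsd` schemas). A small stdlib-only sketch of that kind of audit step (hypothetical helper; real license scanners are far more thorough):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarLicenseScan {
    /**
     * Return all jar entries whose path contains any of the given markers,
     * e.g. "org.jboss.modules" or "schema/module-1_".
     */
    public static List<String> findEntries(String jarPath, List<String> markers)
            throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                for (String marker : markers) {
                    if (name.contains(marker)) {
                        hits.add(name);
                        break; // one hit per entry is enough
                    }
                }
            }
        }
        return hits;
    }
}
```

Running such a scan over `flink-python_2.11-1.13.0.jar` would surface the three paths listed in the report above.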
[jira] [Updated] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress
[ https://issues.apache.org/jira/browse/FLINK-22554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-22554: --- Labels: pull-request-available (was: ) > Support Kafka Topic Patterns in Kafka Ingress > - > > Key: FLINK-22554 > URL: https://issues.apache.org/jira/browse/FLINK-22554 > Project: Flink > Issue Type: Improvement > Components: Stateful Functions >Affects Versions: statefun-3.1.0 >Reporter: Seth Wiesman >Assignee: Seth Wiesman >Priority: Major > Labels: pull-request-available > > Flink's Kafka source supports subscription patterns, where it will consume > all topics that match a specified regex and periodically monitor for new > topics. We should support this in the statefun Kafka ingress as it is > generally useful and would remove a source of cluster downtime (subscribing > to a new topic). > > I propose something like this. > > {code:java} > topics: > - topic-pattern: my-topic-* // some regex > discovery-interval: 10s // some duration > valueType: blah > targets: > - blah{code} > The Flink consumer can be configured with both a list of concrete topics + a > pattern so validation is simple. > > cc [~igalshilman] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-statefun] sjwiesman opened a new pull request #231: [FLINK-22554] Add topic subscription to KafkaIngress
sjwiesman opened a new pull request #231: URL: https://github.com/apache/flink-statefun/pull/231 Update the Kafka ingress to support Flink's Kafka topic discovery. This removes a common source of downtime for the statefun cluster: configuring new topics. Users can now configure their ingress to use a topic pattern, and the source will consume from all topics that match the regex and auto-discover new topics every `discoveryInterval`.

```yaml
version: "3.0"
module:
  meta:
    type: remote
  spec:
    ingresses:
      - ingress:
          meta:
            type: io.statefun.kafka/ingress
            id: com.example/users
          spec:
            address: kafka-broker:9092
            consumerGroupId: my-consumer-group
            startupPosition:
              type: earliest
            topicPattern:
              pattern: topic-*
              discoveryInterval: 30s
            valueType: com.example/User
            targets:
              - com.example.fns/greeter
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
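At each discovery pass, pattern-based subscription reduces to matching the topic names known to the broker against the configured regex; newly created matching topics are picked up on the next pass. A stdlib-only sketch of that matching step (hypothetical helper; in practice Flink's Kafka partition discoverer does this work, treating the pattern as a `java.util.regex` expression):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TopicPatternFilter {
    /**
     * Given the topics currently known to the broker, return those the
     * ingress would subscribe to under the configured pattern.
     */
    public static List<String> matchingTopics(String topicPattern, List<String> allTopics) {
        Pattern p = Pattern.compile(topicPattern);
        return allTopics.stream()
                .filter(t -> p.matcher(t).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> discovered = List.of("topic-users", "topic-orders", "audit-log");
        // topic-users and topic-orders match; audit-log does not.
        System.out.println(matchingTopics("topic-.*", discovered));
    }
}
```

One subtlety worth noting: as a `java.util.regex` expression, `topic-.*` matches names like `topic-users`, whereas a glob-style `topic-*` would be read by the regex engine as `topic` followed by zero or more `-` characters.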
[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql
flinkbot edited a comment on pull request #15823: URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274 ## CI report: * df97ad0958eaf4e9064303479b040fbb83a493e8 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17525) * 323d87e5394e8968856741d0bfd48463e38ddc8a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources
flinkbot edited a comment on pull request #15789: URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694 ## CI report: * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN * 46e1be2c4832080bf1cb48c509b77cd88872d024 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17468) * 0bba0d971d9d4b8530cf90c6513ce68d91cad54e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17526) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * 4303e9844bcaf301d79371a008525a8f8b944db1 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17523) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql
flinkbot edited a comment on pull request #15823: URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274 ## CI report: * 9f9242917bedadece14b34ac0a7f98f91bdf4c82 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17509) * df97ad0958eaf4e9064303479b040fbb83a493e8 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17525) * 323d87e5394e8968856741d0bfd48463e38ddc8a UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql
flinkbot edited a comment on pull request #15823: URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274 ## CI report: * 9f9242917bedadece14b34ac0a7f98f91bdf4c82 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17509) * df97ad0958eaf4e9064303479b040fbb83a493e8 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15789: [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources
flinkbot edited a comment on pull request #15789: URL: https://github.com/apache/flink/pull/15789#issuecomment-827996694 ## CI report: * 4c5180310bf76e96f2665bf53531eccb1fa86421 UNKNOWN * 46e1be2c4832080bf1cb48c509b77cd88872d024 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17468) * 0bba0d971d9d4b8530cf90c6513ce68d91cad54e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rkhachatryan commented on a change in pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …
rkhachatryan commented on a change in pull request #15728: URL: https://github.com/apache/flink/pull/15728#discussion_r625298504 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculatorTest.java ## @@ -149,18 +150,80 @@ public void testComputeWithMultipleLevels() throws Exception { } @Test -public void testWithTriggeredTasksNotRunning() throws Exception { +public void testNotRunningOneOfSourcesTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with one RUNNING source and NOT RUNNING source. +FunctionWithException twoSourcesBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id) +.addJobVertex(new JobVertexID()) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) +.build(); + +// when: Creating the checkpoint plan. +runWithNotRunTask(twoSourcesBuilder); + +// then: The plan failed because one task didn't have RUNNING state. +} + +@Test +public void testNotRunningSingleSourceTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with one NOT RUNNING source and RUNNING not source task. +FunctionWithException sourceAndNotSourceBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id) +.addJobVertex(new JobVertexID(), false) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) +.build(); + +// when: Creating the checkpoint plan. +runWithNotRunTask(sourceAndNotSourceBuilder); + +// then: The plan failed because one task didn't have RUNNING state. +} + +@Test +public void testNotRunningOneOfNotSourcesTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with NOT RUNNING not source and RUNNING not source task. 
+FunctionWithException twoNotSourceBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id, false) +.addJobVertex(new JobVertexID(), false) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) +.build(); + +// when: Creating the checkpoint plan. +runWithNotRunTask(twoNotSourceBuilder); + +// then: The plan failed because one task didn't have RUNNING state. +} + +@Test +public void testNotRunningSingleNotSourceTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with NOT RUNNING not source and RUNNING source tasks. +FunctionWithException sourceAndNotSourceBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id, false) +.addJobVertex(new JobVertexID()) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) +.build(); + +// when: Creating the checkpoint plan. +runWithNotRunTask(sourceAndNotSourceBuilder); + +// then: The plan failed because one task didn't have RUNNING state. +} + +private void runWithNotRunTask( +FunctionWithException graphBuilder) Review comment: Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
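The four test cases above all exercise the same invariant from different angles: checkpoint plan calculation must fail when any task to be triggered is not RUNNING, regardless of whether the task is a source. A stdlib-only sketch of that invariant (hypothetical types; not Flink's actual CheckpointPlanCalculator):

```java
import java.util.Map;

public class CheckpointPlanCheck {
    public enum ExecutionState { CREATED, RUNNING, FINISHED }

    /**
     * Mirror of the invariant under test: every task that would be triggered
     * for a checkpoint must be RUNNING; otherwise plan calculation fails.
     */
    public static void checkAllRunning(Map<String, ExecutionState> tasks) {
        for (Map.Entry<String, ExecutionState> e : tasks.entrySet()) {
            if (e.getValue() != ExecutionState.RUNNING) {
                throw new IllegalStateException(
                        "Task " + e.getKey() + " is in state " + e.getValue()
                                + ", but RUNNING is required to trigger a checkpoint");
            }
        }
    }
}
```

Each test then only varies which vertex (source vs. non-source, alone vs. alongside a RUNNING peer) is left out of the RUNNING state.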
[jira] [Closed] (FLINK-22549) Flink HA with Zk in kubernetes standalone mode deployment is not working
[ https://issues.apache.org/jira/browse/FLINK-22549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias closed FLINK-22549. Resolution: Won't Fix This issue is already being discussed on this [mailing list thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Re-Flink-1-12-2-scala-2-11-HA-with-Zk-in-kubernetes-standalone-mode-deployment-is-not-working-td43426.html]. Let's keep it there for now to avoid having multiple parallel threads. I'm closing this issue. > Flink HA with Zk in kubernetes standalone mode deployment is not working > > > Key: FLINK-22549 > URL: https://issues.apache.org/jira/browse/FLINK-22549 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.2 > Environment: kubernetes standalone deployment >Reporter: Bhagi >Priority: Major > Attachments: jobmanager.log, screenshot-1.png, taskmanager.log > > > Hi Team, > I deployed a Kubernetes standalone Flink cluster with ZooKeeper HA, but I am facing some issues; I have attached the taskmanager and jobmanager logs. > Can you please review the logs and help me resolve this issue? > The UI is throwing this error: {"errors":["Service temporarily unavailable due to > an ongoing leader election. Please refresh."]} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520) * 4303e9844bcaf301d79371a008525a8f8b944db1 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17523) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress
[ https://issues.apache.org/jira/browse/FLINK-22554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seth Wiesman reassigned FLINK-22554: Assignee: Seth Wiesman > Support Kafka Topic Patterns in Kafka Ingress > - > > Key: FLINK-22554 > URL: https://issues.apache.org/jira/browse/FLINK-22554 > Project: Flink > Issue Type: Improvement > Components: Stateful Functions >Affects Versions: statefun-3.1.0 >Reporter: Seth Wiesman >Assignee: Seth Wiesman >Priority: Major > > Flink's Kafka source supports subscription patterns, where it will consume > all topics that match a specified regex and periodically monitor for new > topics. We should support this in the statefun Kafka ingress as it is > generally useful and would remove a source of cluster downtime (subscribing > to a new topic). > > I propose something like this. > > {code:java} > topics: > - topic-pattern: my-topic-* // some regex > discovery-interval: 10s // some duration > valueType: blah > targets: > - blah{code} > The Flink consumer can be configured with both a list of concrete topics + a > pattern so validation is simple. > > cc [~igalshilman] -- This message was sent by Atlassian Jira (v8.3.4#803005)
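The pattern-subscription behavior described above (consume every topic matching a regex, and periodically re-check for new topics) can be sketched with plain `java.util.regex`. The class and helper below (`TopicPatternDemo`, `discover`) are hypothetical illustrations, not statefun or Flink API; note the proposed `my-topic-*` is written here as the regex `my-topic-.*`.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TopicPatternDemo {
    // On each discovery interval, a pattern-subscribed consumer re-lists the
    // broker's topics and keeps the ones matching the subscription regex,
    // so newly created topics are picked up without restarting the job.
    static List<String> discover(List<String> brokerTopics, Pattern subscription) {
        return brokerTopics.stream()
                .filter(topic -> subscription.matcher(topic).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern subscription = Pattern.compile("my-topic-.*");
        List<String> before = List.of("my-topic-1", "other-topic");
        List<String> after = List.of("my-topic-1", "my-topic-2", "other-topic");
        System.out.println(discover(before, subscription)); // [my-topic-1]
        // A topic created later is matched on the next discovery pass.
        System.out.println(discover(after, subscription));  // [my-topic-1, my-topic-2]
    }
}
```

This also shows why validation in the proposal is simple: a concrete topic list and a pattern can be checked independently before the consumer is built.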
[jira] [Commented] (FLINK-22133) SplitEmumerator does not provide checkpoint id in snapshot
[ https://issues.apache.org/jira/browse/FLINK-22133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338487#comment-17338487 ] Thomas Weise commented on FLINK-22133: -- [~jqin] [~sewen] what do you think about back porting this change to 1.12? It would allow users/downstream projects to build on top of the FLIP-27 interfaces without facing breakage when going from 1.12 to 1.13. The source API is experimental in 1.12 and we already made incompatible changes to it in 1.12.x releases? > SplitEmumerator does not provide checkpoint id in snapshot > -- > > Key: FLINK-22133 > URL: https://issues.apache.org/jira/browse/FLINK-22133 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Affects Versions: 1.12.0 >Reporter: Brian Zhou >Assignee: Stephan Ewen >Priority: Major > Labels: pull-request-available > Fix For: 1.13.0 > > > In ExternallyInducedSource API, the checkpoint trigger exposes the checkpoint > Id for the external client to identify the checkpoint. However, in the > FLIP-27 source, the SplitEmumerator::snapshot() is a no-arg method. The > connector cannot track the checkpoint ID from Flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
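The fix discussed above passes the checkpoint id into the enumerator's snapshot method in 1.13. The sketch below is a hypothetical illustration (not Flink code) of why an externally induced source needs that id: state can be keyed by checkpoint id, correlated with the external system's checkpoint, and pruned on completion. The class name and pruning policy are assumptions for the example.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: enumerator-side bookkeeping keyed by checkpoint id,
// mirroring the snapshotState(long checkpointId) signature introduced in 1.13.
public class CheckpointAwareEnumeratorState {
    private final NavigableMap<Long, String> pendingByCheckpoint = new TreeMap<>();

    // Record the enumerator state under the id Flink passes in, so the
    // connector can tell the external system which checkpoint this is.
    void snapshotState(long checkpointId, String enumeratorState) {
        pendingByCheckpoint.put(checkpointId, enumeratorState);
    }

    // In this sketch, snapshots older than the completed checkpoint are dropped.
    void notifyCheckpointComplete(long checkpointId) {
        pendingByCheckpoint.headMap(checkpointId, false).clear();
    }

    String stateFor(long checkpointId) {
        return pendingByCheckpoint.get(checkpointId);
    }
}
```

Without the id argument, the connector has no way to tell which of several in-flight checkpoints a snapshot belongs to, which is exactly the gap this ticket describes.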
[jira] [Updated] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress
[ https://issues.apache.org/jira/browse/FLINK-22554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seth Wiesman updated FLINK-22554: - Description: Flink's Kafka source supports subscription patterns, where it will consume all topics that match a specified regex and periodically monitor for new topics. We should support this in the statefun Kafka ingress as it is generally useful and would remove a source of cluster downtime (subscribing to a new topic). I propose something like this. {code:java} topics: - topic-pattern: my-topic-* // some regex discovery-interval: 10s // some duration valueType: blah targets: - blah{code} The Flink consumer can be configured with both a list of concrete topics + a pattern so validation is simple. cc [~igalshilman] was: Flink's Kafka source supports subscription patterns, where it will consume all topics that match a specified regex and periodically monitor for new topics. We should support this in the statefun Kafka ingress as it is generally useful and would remove a source of cluster downtime (subscribing to a new topic). I propose something like this. {code:java} topics: - topic-pattern: my-topic-* valueType: blah targets: - blah{code} The Flink consumer can be configured with both a list of concrete topics + a pattern so validation is simple. cc [~igalshilman] > Support Kafka Topic Patterns in Kafka Ingress > - > > Key: FLINK-22554 > URL: https://issues.apache.org/jira/browse/FLINK-22554 > Project: Flink > Issue Type: Improvement > Components: Stateful Functions >Affects Versions: statefun-3.1.0 >Reporter: Seth Wiesman >Priority: Major > > Flink's Kafka source supports subscription patterns, where it will consume > all topics that match a specified regex and periodically monitor for new > topics. We should support this in the statefun Kafka ingress as it is > generally useful and would remove a source of cluster downtime (subscribing > to a new topic). > > I propose something like this. 
> > {code:java} > topics: > - topic-pattern: my-topic-* // some regex > discovery-interval: 10s // some duration > valueType: blah > targets: > - blah{code} > The Flink consumer can be configured with both a list of concrete topics + a > pattern so validation is simple. > > cc [~igalshilman] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22554) Support Kafka Topic Patterns in Kafka Ingress
Seth Wiesman created FLINK-22554: Summary: Support Kafka Topic Patterns in Kafka Ingress Key: FLINK-22554 URL: https://issues.apache.org/jira/browse/FLINK-22554 Project: Flink Issue Type: Improvement Components: Stateful Functions Affects Versions: statefun-3.1.0 Reporter: Seth Wiesman Flink's Kafka source supports subscription patterns, where it will consume all topics that match a specified regex and periodically monitor for new topics. We should support this in the statefun Kafka ingress as it is generally useful and would remove a source of cluster downtime (subscribing to a new topic). I propose something like this. {code:java} topics: - topic-pattern: my-topic-* valueType: blah targets: - blah{code} The Flink consumer can be configured with both a list of concrete topics + a pattern so validation is simple. cc [~igalshilman] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase
flinkbot edited a comment on pull request #15825: URL: https://github.com/apache/flink/pull/15825#issuecomment-831225399 ## CI report: * e1a425301eee0b46c812866d12284125f5bde104 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17517) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …
flinkbot edited a comment on pull request #15728: URL: https://github.com/apache/flink/pull/15728#issuecomment-824960796 ## CI report: * d4f85948ee5281b189ba948a8ec512f62587f979 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17458) * 057695465c326acc5c40576cf35ee9f0283d6cac Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17524) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …
flinkbot edited a comment on pull request #15728: URL: https://github.com/apache/flink/pull/15728#issuecomment-824960796 ## CI report: * d4f85948ee5281b189ba948a8ec512f62587f979 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17458) * 057695465c326acc5c40576cf35ee9f0283d6cac UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19481) Add support for a flink native GCS FileSystem
[ https://issues.apache.org/jira/browse/FLINK-19481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338455#comment-17338455 ] Ben Augarten commented on FLINK-19481: -- Hey Robert and Galen, I appreciate you both weighing in. I got a chance to read through Galen's PR briefly and it does seem like it's mostly concerned with adding support for the RecoverableWriter interface, and their implementation of the RecoverableWriter interface does not have any explicit hadoop dependencies. So, it seems like their implementation would be useful with either a native or hadoop based implementation of the google cloud storage file system. Our native implementation does have support for RecoverableWriter, but I didn't work directly on that and I don't believe it's being used in production right now. We've primarily been using our implementation for checkpointing, savepointing, and job graph storage. The two paths forward I see are: * As Galen proposed keep two separate implementations of the GCS FileSystem, one that goes through the hadoop stack and one that uses GCS SDKs, both using the shared RecoverableWriter implementations. * Consolidate down to a native GCS FileSystem implementation, using Galen's implementation of the RecoverableWriter. To me, the second option makes most sense, based on my experience as a user of flink and my general impression of the desire to move away from hadoop based file systems. To accomplish that, I think that Galen should continue working on their MR. I can open another MR once theirs lands on master, or open an MR on their WIP. Though, I'd prefer waiting until outstanding discussions are resolved. 
> Add support for a flink native GCS FileSystem > - > > Key: FLINK-19481 > URL: https://issues.apache.org/jira/browse/FLINK-19481 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem, FileSystems >Affects Versions: 1.12.0 >Reporter: Ben Augarten >Priority: Minor > Labels: auto-deprioritized-major > > Currently, GCS is supported but only by using the hadoop connector[1] > > The objective of this improvement is to add support for checkpointing to > Google Cloud Storage with the Flink File System, > > This would allow the `gs://` scheme to be used for savepointing and > checkpointing. Long term, it would be nice if we could use the GCS FileSystem > as a source and sink in flink jobs as well. > > Long term, I hope that implementing a flink native GCS FileSystem will > simplify usage of GCS because the hadoop FileSystem ends up bringing in many > unshaded dependencies. > > [1] > [https://github.com/GoogleCloudDataproc/hadoop-connectors|https://github.com/GoogleCloudDataproc/hadoop-connectors)] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516) * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520) * 4303e9844bcaf301d79371a008525a8f8b944db1 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17523) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] akalash commented on a change in pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …
akalash commented on a change in pull request #15728:
URL: https://github.com/apache/flink/pull/15728#discussion_r625177400

## File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculatorTest.java

## @@ -149,18 +150,80 @@ public void testComputeWithMultipleLevels() throws Exception {
     }
 
     @Test
-    public void testWithTriggeredTasksNotRunning() throws Exception {
+    public void testNotRunningOneOfSourcesTriggeredTasksNotRunning() throws Exception {
+        // given: Execution graph builder with one RUNNING source and NOT RUNNING source.
+        FunctionWithException twoSourcesBuilder =
+                (id) ->
+                        new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+                                .addJobVertex(id)
+                                .addJobVertex(new JobVertexID())
+                                .setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id))
+                                .build();
+
+        // when: Creating the checkpoint plan.
+        runWithNotRunTask(twoSourcesBuilder);
+
+        // then: The plan failed because one task didn't have RUNNING state.
+    }
+
+    @Test
+    public void testNotRunningSingleSourceTriggeredTasksNotRunning() throws Exception {
+        // given: Execution graph builder with one NOT RUNNING source and RUNNING not source task.
+        FunctionWithException sourceAndNotSourceBuilder =
+                (id) ->
+                        new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+                                .addJobVertex(id)
+                                .addJobVertex(new JobVertexID(), false)
+                                .setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id))
+                                .build();
+
+        // when: Creating the checkpoint plan.
+        runWithNotRunTask(sourceAndNotSourceBuilder);
+
+        // then: The plan failed because one task didn't have RUNNING state.
+    }
+
+    @Test
+    public void testNotRunningOneOfNotSourcesTriggeredTasksNotRunning() throws Exception {
+        // given: Execution graph builder with NOT RUNNING not source and RUNNING not source task.
+        FunctionWithException twoNotSourceBuilder =
+                (id) ->
+                        new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+                                .addJobVertex(id, false)
+                                .addJobVertex(new JobVertexID(), false)
+                                .setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id))
+                                .build();
+
+        // when: Creating the checkpoint plan.
+        runWithNotRunTask(twoNotSourceBuilder);
+
+        // then: The plan failed because one task didn't have RUNNING state.
+    }
+
+    @Test
+    public void testNotRunningSingleNotSourceTriggeredTasksNotRunning() throws Exception {
+        // given: Execution graph builder with NOT RUNNING not source and RUNNING source tasks.
+        FunctionWithException sourceAndNotSourceBuilder =
+                (id) ->
+                        new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder()
+                                .addJobVertex(id, false)
+                                .addJobVertex(new JobVertexID())
+                                .setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id))
+                                .build();
+
+        // when: Creating the checkpoint plan.
+        runWithNotRunTask(sourceAndNotSourceBuilder);
+
+        // then: The plan failed because one task didn't have RUNNING state.
+    }
+
+    private void runWithNotRunTask(
+            FunctionWithException graphBuilder)

Review comment:
       I've rewritten tests mostly as you suggested and I also collapsed them into one test. Please, take a look.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516) * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520) * 4303e9844bcaf301d79371a008525a8f8b944db1 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] dawidwys merged pull request #442: [hotfix] Adjust phrasing for FlameGraph performance
dawidwys merged pull request #442: URL: https://github.com/apache/flink-web/pull/442 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] zentol opened a new pull request #442: [hotfix] Adjust phrasing for FlameGraph performance
zentol opened a new pull request #442: URL: https://github.com/apache/flink-web/pull/442 The current 1.13 blog post contains a note for flamegraphs that they are a) expensive and b) may overload the metric system. Point b) is simply incorrect, while a) is not generally the case, and the existing documentation provides better information. So I just replaced the current note with a link to the docs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-web] zentol merged pull request #441: [hotfix] fix bullet points list formatting
zentol merged pull request #441: URL: https://github.com/apache/flink-web/pull/441 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19481) Add support for a flink native GCS FileSystem
[ https://issues.apache.org/jira/browse/FLINK-19481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338418#comment-17338418 ] Galen Warren commented on FLINK-19481: -- Hi all, I'm the author of the other [PR|https://github.com/apache/flink/pull/15599] that relates to Google Cloud Storage. [~xintongsong] has been working with me on this. The main goal of my PR is to add support for the RecoverableWriter interface, so that one can write to GCS via a StreamingFileSink. The file system support goes through the Hadoop stack, as noted above, using Google's [cloud storage connector|https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage]. I have not personally had problems using the GCS connector and the Hadoop stack – it seems to write check/savepoints properly. I also use it to write job manager HA data to GCS, which seems to work fine. However, if we do want to support a native implementation in addition to the Hadoop-based one, we could approach it similarly to what has been done for S3, i.e. have a shared base project (flink-gs-fs-base?) and then projects for each of the implementations ( flink-gs-fs-hadoop and flink-gs-fs-native?). The recoverable-writer code could go into the shared project so that both of the implementations could use it (assuming that the native implementation doesn't already have a recoverable-writer implementation). I'll defer to the Flink experts on whether that's a worthwhile effort or not. At this point, from my perspective, it wouldn't be that much work to rework the project structure to support this. 
> Add support for a flink native GCS FileSystem > - > > Key: FLINK-19481 > URL: https://issues.apache.org/jira/browse/FLINK-19481 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem, FileSystems >Affects Versions: 1.12.0 >Reporter: Ben Augarten >Priority: Minor > Labels: auto-deprioritized-major > > Currently, GCS is supported but only by using the hadoop connector[1] > > The objective of this improvement is to add support for checkpointing to > Google Cloud Storage with the Flink File System, > > This would allow the `gs://` scheme to be used for savepointing and > checkpointing. Long term, it would be nice if we could use the GCS FileSystem > as a source and sink in flink jobs as well. > > Long term, I hope that implementing a flink native GCS FileSystem will > simplify usage of GCS because the hadoop FileSystem ends up bringing in many > unshaded dependencies. > > [1] > [https://github.com/GoogleCloudDataproc/hadoop-connectors|https://github.com/GoogleCloudDataproc/hadoop-connectors)] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-22553) Improve error reporting on TM connection failures
Roman Khachatryan created FLINK-22553:
-----------------------------------------

             Summary: Improve error reporting on TM connection failures
                 Key: FLINK-22553
                 URL: https://issues.apache.org/jira/browse/FLINK-22553
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Network
            Reporter: Roman Khachatryan
             Fix For: 1.14.0


Connection failures reported by NettyPartitionRequestClient contain misleading NPE in their stacktrace, e.g. reported in http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/remote-task-manager-netty-exception-td43401.html

{code}
org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/100.98.115.117:41245' has failed. This might indicate that the remote task manager has been lost.
	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:145) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connectWithRetries(PartitionRequestClientFactory.java:114) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:81) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:70) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:179) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.internalRequestPartitions(SingleInputGate.java:321) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:290) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.taskmanager.InputGateWithMetrics.requestPartitions(InputGateWithMetrics.java:94) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:297) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:189) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570) [flink-dist_2.12-1.12.2.jar:1.12.2]
	at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.NullPointerException
	at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.netty.NettyPartitionRequestClient.<init>(NettyPartitionRequestClient.java:74) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:136) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	... 16 more
{code}


-- 
This message was sent by Atlassian Jira
(v8.3.4#803005)
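The ticket above asks to replace the bare NullPointerException from `checkNotNull` with an actionable error. The sketch below is a hypothetical illustration of that idea, not Flink code: the class and method names (`ConnectionGuard`, `checkConnected`) and the message wording are assumptions for the example.

```java
// Hypothetical sketch of the improvement requested in FLINK-22553: instead of
// letting a null channel surface as an opaque NullPointerException deep in a
// constructor, fail fast with a message naming the remote address.
public class ConnectionGuard {
    static <T> T checkConnected(T channel, String remoteAddress) {
        if (channel == null) {
            throw new IllegalStateException(
                    "Connection to remote task manager '" + remoteAddress
                            + "' was lost before the partition request client was set up.");
        }
        return channel;
    }

    public static void main(String[] args) {
        try {
            checkConnected(null, "/100.98.115.117:41245");
        } catch (IllegalStateException e) {
            // The user now sees the remote address instead of a bare NPE.
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the change is diagnostic: the exception type and message carry the failing address, so the "remote task manager has been lost" hypothesis no longer has to be inferred from an NPE in `Preconditions.checkNotNull`.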
[jira] [Comment Edited] (FLINK-22173) UnalignedCheckpointRescaleITCase fails on azure
[ https://issues.apache.org/jira/browse/FLINK-22173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338400#comment-17338400 ]

Roman Khachatryan edited comment on FLINK-22173 at 5/3/21, 2:48 PM:
--------------------------------------------------------------------

A recent failure (different stacktrace though):
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17468=logs=34f41360-6c0d-54d3-11a1-0292a2def1d9=2d56e022-1ace-542f-bf1a-b37dd63243f2=9772]

{code}
Apr 30 14:29:59 Caused by: org.apache.flink.shaded.netty4.io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain0(ReferenceCountUpdater.java:123)
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain(ReferenceCountUpdater.java:110)
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractReferenceCountedByteBuf.retain(AbstractReferenceCountedByteBuf.java:80)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:166)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:47)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.BufferConsumer.copy(BufferConsumer.java:143)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.BufferConsumer.toDebugString(BufferConsumer.java:202)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.logger.NetworkActionsLogger.traceRecover(NetworkActionsLogger.java:94)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.addRecovered(PipelinedSubpartition.java:142)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:195)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:144)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.ChannelStateChunkReader.readChunk(SequentialChannelStateReaderImpl.java:207)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readSequentially(SequentialChannelStateReaderImpl.java:107)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.read(SequentialChannelStateReaderImpl.java:93)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readOutputData(SequentialChannelStateReaderImpl.java:79)
Apr 30 14:29:59 	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:571)
Apr 30 14:29:59 	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
Apr 30 14:29:59 	at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:554)
Apr 30 14:29:59 	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:757)
Apr 30 14:29:59 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:564)
Apr 30 14:29:59 	at java.lang.Thread.run(Thread.java:748)
{code}

Commits from master up to 89c6c03660a88a648bbd13b4e6696124fe46d013

was (Author: roman_khachatryan):
A recent failure (with commits in master up to 89c6c03660a88a648bbd13b4e6696124fe46d013):
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17468=logs=34f41360-6c0d-54d3-11a1-0292a2def1d9=2d56e022-1ace-542f-bf1a-b37dd63243f2=9772]

{code}
Apr 30 14:29:59 Caused by: org.apache.flink.shaded.netty4.io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain0(ReferenceCountUpdater.java:123)
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain(ReferenceCountUpdater.java:110)
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractReferenceCountedByteBuf.retain(AbstractReferenceCountedByteBuf.java:80)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:166)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:47)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.BufferConsumer.copy(BufferConsumer.java:143)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.BufferConsumer.toDebugString(BufferConsumer.java:202)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.logger.NetworkActionsLogger.traceRecover(NetworkActionsLogger.java:94)
Apr 30 14:29:59 	at
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489) * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516) * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520) * 4303e9844bcaf301d79371a008525a8f8b944db1 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-22173) UnalignedCheckpointRescaleITCase fails on azure
[ https://issues.apache.org/jira/browse/FLINK-22173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338400#comment-17338400 ] Roman Khachatryan commented on FLINK-22173:
---
A recent failure (with commits in master up to 89c6c03660a88a648bbd13b4e6696124fe46d013):
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=17468=logs=34f41360-6c0d-54d3-11a1-0292a2def1d9=2d56e022-1ace-542f-bf1a-b37dd63243f2=9772]

{code}
Apr 30 14:29:59 Caused by: org.apache.flink.shaded.netty4.io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain0(ReferenceCountUpdater.java:123)
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.util.internal.ReferenceCountUpdater.retain(ReferenceCountUpdater.java:110)
Apr 30 14:29:59 	at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractReferenceCountedByteBuf.retain(AbstractReferenceCountedByteBuf.java:80)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:166)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.NetworkBuffer.retainBuffer(NetworkBuffer.java:47)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.BufferConsumer.copy(BufferConsumer.java:143)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.buffer.BufferConsumer.toDebugString(BufferConsumer.java:202)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.logger.NetworkActionsLogger.traceRecover(NetworkActionsLogger.java:94)
Apr 30 14:29:59 	at org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.addRecovered(PipelinedSubpartition.java:142)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:195)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.ResultSubpartitionRecoveredStateHandler.recover(RecoveredChannelStateHandler.java:144)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.ChannelStateChunkReader.readChunk(SequentialChannelStateReaderImpl.java:207)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readSequentially(SequentialChannelStateReaderImpl.java:107)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.read(SequentialChannelStateReaderImpl.java:93)
Apr 30 14:29:59 	at org.apache.flink.runtime.checkpoint.channel.SequentialChannelStateReaderImpl.readOutputData(SequentialChannelStateReaderImpl.java:79)
Apr 30 14:29:59 	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:571)
Apr 30 14:29:59 	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
Apr 30 14:29:59 	at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:554)
Apr 30 14:29:59 	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:757)
Apr 30 14:29:59 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:564)
Apr 30 14:29:59 	at java.lang.Thread.run(Thread.java:748)
{code}

> UnalignedCheckpointRescaleITCase fails on azure
> ---
>
> Key: FLINK-22173
> URL: https://issues.apache.org/jira/browse/FLINK-22173
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.13.0
> Reporter: Dawid Wysakowicz
> Assignee: Arvid Heise
> Priority: Critical
> Labels: test-stability
> Fix For: 1.13.0
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16232=logs=d8d26c26-7ec2-5ed2-772e-7a1a1eb8317c=be5fb08e-1ad7-563c-4f1a-a97ad4ce4865=9628
> {code}
> 2021-04-08T23:25:56.3131361Z [ERROR] Tests run: 31, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 839.623 s <<< FAILURE! - in org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase
> 2021-04-08T23:25:56.3132784Z [ERROR] shouldRescaleUnalignedCheckpoint[no scale union from 7 to 7](org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase)  Time elapsed: 607.467 s <<< ERROR!
> 2021-04-08T23:25:56.3133586Z org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 2021-04-08T23:25:56.3134070Z 	at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
> 2021-04-08T23:25:56.3134643Z 	at org.apache.flink.test.checkpointing.UnalignedCheckpointTestBase.execute(UnalignedCheckpointTestBase.java:168)
> 2021-04-08T23:25:56.3135577Z 	at
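The trace above fails inside `BufferConsumer.toDebugString`, which calls `retain()` on a buffer whose reference count has already dropped to zero. As a minimal, Flink-free model of that reference-counting contract (the class and method bodies here are invented for illustration; the real logic lives in Netty's `AbstractReferenceCountedByteBuf`/`ReferenceCountUpdater`):

```java
// Simplified model of Netty-style reference counting (illustration only,
// not the actual flink-shaded-netty implementation). Once the count drops
// to zero the buffer is considered deallocated, and any further retain()
// must fail rather than resurrect it -- the failure mode in the stack
// trace above (retain() on a buffer with refCnt 0).
final class RefCountedBuffer {
    private int refCnt = 1; // a newly allocated buffer starts at 1

    int refCnt() { return refCnt; }

    RefCountedBuffer retain() {
        if (refCnt == 0) {
            // Netty throws IllegalReferenceCountException here; plain
            // IllegalStateException keeps this sketch dependency-free.
            throw new IllegalStateException("refCnt: 0, increment: 1");
        }
        refCnt++;
        return this;
    }

    /** Returns true if this release dropped the count to zero (deallocated). */
    boolean release() {
        if (refCnt == 0) {
            throw new IllegalStateException("refCnt: 0, decrement: 1");
        }
        return --refCnt == 0;
    }
}
```

In this model, as in Netty, a released buffer cannot be revived: even purely diagnostic code such as debug-string rendering must not retain buffers it does not own.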
[GitHub] [flink-web] afedulov opened a new pull request #441: [hotfix] fix bullet points list formatting
afedulov opened a new pull request #441: URL: https://github.com/apache/flink-web/pull/441
[jira] [Created] (FLINK-22552) Rebase StateFun on Flink 1.13
Igal Shilman created FLINK-22552: Summary: Rebase StateFun on Flink 1.13 Key: FLINK-22552 URL: https://issues.apache.org/jira/browse/FLINK-22552 Project: Flink Issue Type: Improvement Components: Stateful Functions Reporter: Igal Shilman Following the recent release of Flink 1.13, StateFun master needs to be rebased on that version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22552) Rebase StateFun on Flink 1.13
[ https://issues.apache.org/jira/browse/FLINK-22552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igal Shilman updated FLINK-22552: - Description: Following the recent release of Flink 1.13, StateFun main branch needs to be rebased on that version. (was: Following the recent release of Flink 1.13, StateFun master needs to be rebased on that version.) > Rebase StateFun on Flink 1.13 > - > > Key: FLINK-22552 > URL: https://issues.apache.org/jira/browse/FLINK-22552 > Project: Flink > Issue Type: Improvement > Components: Stateful Functions >Reporter: Igal Shilman >Priority: Major > > Following the recent release of Flink 1.13, StateFun main branch needs to be > rebased on that version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-web] dawidwys merged pull request #440: [hotfix] Correct formatting in 1.13 release blogpost.
dawidwys merged pull request #440: URL: https://github.com/apache/flink-web/pull/440
[GitHub] [flink-web] morsapaes opened a new pull request #440: [hotfix] Correct formatting in 1.13 release blogpost.
morsapaes opened a new pull request #440: URL: https://github.com/apache/flink-web/pull/440 One of the Python code snippets renders incorrectly due to a missing paragraph. https://user-images.githubusercontent.com/23521087/116886475-6864b880-ac29-11eb-9c3a-4f5b98466a8e.png
[GitHub] [flink-web] dawidwys commented on pull request #436: Add Apache Flink release 1.13.0
dawidwys commented on pull request #436: URL: https://github.com/apache/flink-web/pull/436#issuecomment-831266920 Merged...
[GitHub] [flink-web] dawidwys closed pull request #436: Add Apache Flink release 1.13.0
dawidwys closed pull request #436: URL: https://github.com/apache/flink-web/pull/436
[GitHub] [flink] flinkbot edited a comment on pull request #15824: (1.11) [FLINK-20383][runtime] Fix race condition in notification.
flinkbot edited a comment on pull request #15824: URL: https://github.com/apache/flink/pull/15824#issuecomment-831161678 ## CI report: * f741e9ec3edf512b0adf4eb17548f9bf073f00ff Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17512) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489) * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516) * afc0a9002c425a95c032a94750a5e9ae45fae8d3 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17520) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-22535) Resource leak would happen if exception thrown during AbstractInvokable#restore of task life
[ https://issues.apache.org/jira/browse/FLINK-22535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-22535. -- Fix Version/s: 1.14.0 Resolution: Fixed Merged to master 5926c4ac22b^..5926c4ac22b Merged to release-1.13 as 0077b6fb07c^..0077b6fb07c > Resource leak would happen if exception thrown during > AbstractInvokable#restore of task life > > > Key: FLINK-22535 > URL: https://issues.apache.org/jira/browse/FLINK-22535 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.13.0 >Reporter: Yun Tang >Assignee: Anton Kalashnikov >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0, 1.13.1 > > > FLINK-17012 introduced new initialization phase such as > {{AbstractInvokable.restore}}, however, if > [invokable.restore()|https://github.com/apache/flink/blob/79a521e08df550d96f97bb6915191d8496bb29ea/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L754-L759] > throws exception out, no more {{StreamTask#cleanUpInvoke}} would be called, > leading to resource leak. > We internally leveraged another way to use managed memory by registering > specific operator identifier in memory manager, forgetting to call the stream > task cleanup would let stream operator not be disposed and we have to face > critical resource leak. -- This message was sent by Atlassian Jira (v8.3.4#803005)
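The fix described above comes down to a classic try/finally shape: cleanup must run even when the restore phase throws, otherwise resources acquired before the failure leak. A minimal sketch of that shape (all names here are invented for illustration; the actual change lives in Flink's `Task`/`StreamTask` lifecycle):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the task lifecycle fix: restore() may throw (e.g.
// while reading recovered channel state), but cleanUp() must still run.
final class RestorableTask {
    final AtomicBoolean cleanedUp = new AtomicBoolean(false);

    void run(Runnable restorePhase, Runnable invokePhase) {
        try {
            restorePhase.run();   // may throw -- the FLINK-22535 scenario
            invokePhase.run();    // normal processing
        } finally {
            cleanUp();            // always reached, even on restore failure
        }
    }

    // In the real code this would dispose operators and release managed memory.
    void cleanUp() { cleanedUp.set(true); }
}
```

With this shape, an exception thrown during the restore phase still propagates to the caller, but only after the cleanup hook has released the task's resources.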
[GitHub] [flink] pnowojski merged pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
pnowojski merged pull request #15820: URL: https://github.com/apache/flink/pull/15820
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489) * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516) * afc0a9002c425a95c032a94750a5e9ae45fae8d3 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-22530) RuntimeException after subsequent windowed grouping in TableAPI
[ https://issues.apache.org/jira/browse/FLINK-22530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338361#comment-17338361 ] Christopher Rost edited comment on FLINK-22530 at 5/3/21, 1:08 PM:
---
[~AHeise] and [~libenchao], it seems that you had a similar issue with two subsequent tumbling windows in https://issues.apache.org/jira/browse/FLINK-15494. Do you think it is the same problem here?

was (Author: chrizzz110):
@libenchao Seems that you had a similar issue with two subsequent tumbling windows.

> RuntimeException after subsequent windowed grouping in TableAPI
> ---
>
> Key: FLINK-22530
> URL: https://issues.apache.org/jira/browse/FLINK-22530
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.12.0
> Reporter: Christopher Rost
> Priority: Major
>
> After applying the following using the TableAPI v 1.12.0, an error is thrown:
> {code:java}
> java.lang.RuntimeException: Error while applying rule StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args [rel#505:FlinkLogicalWindowAggregate.LOGICAL.any.None: 0.[NONE].[NONE](input=RelSubset#504,group={1},window=TumblingGroupWindow('w2, w1_rowtime, 1),properties=EXPR$1)]{code}
> The code snippet to reproduce:
> {code:java}
> Table table2 = table1
>     .window(Tumble.over(lit(10).seconds()).on($(EVENT_TIME)).as("w1"))
>     .groupBy($(ID), $(LABEL), $("w1"))
>     .select($(ID), $(LABEL), $("w1").rowtime().as("w1_rowtime"));
> // table2.execute().print(); --> work well
>
> Table table3 = table2
>     .window(Tumble.over(lit(10).seconds()).on($("w1_rowtime")).as("w2"))
>     .groupBy($(LABEL), $("w2"))
>     .select(
>         $(LABEL).as("super_label"),
>         lit(1).count().as("super_count"),
>         $("w2").rowtime().as("w2_rowtime")
>     );
> // table3.execute().print(); //--> work well
>
> table3.select($("super_label"), $("w2_rowtime"))
>     .execute().print(); // --> throws exception
> {code}
> It seems that the alias "w1_rowtime" is no longer available for further usages of table3, since the cause of the exception is:
> {noformat}
> Caused by: java.lang.IllegalArgumentException: field [w1_rowtime] not found; input fields are: [vertex_id, vertex_label, EXPR$0
> {noformat}
> {{The complete trace:}}
> {code:java}
> java.lang.RuntimeException: Error while applying rule StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args [rel#197:FlinkLogicalWindowAggregate.LOGICAL.any.None: 0.[NONE].[NONE](input=RelSubset#196,group={1},window=TumblingGroupWindow('w2, w1_rowtime, 1),properties=EXPR$1)]
> 	at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:256)
> 	at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
> 	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
> 	at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:312)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:64)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
> 	at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
> 	at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
> 	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> 	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> 	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> 	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> 	at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
> 	at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)
> 	at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:163)
> 	at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:79)
> 	at org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
> 	at org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:286)
> 	at
[jira] [Commented] (FLINK-22530) RuntimeException after subsequent windowed grouping in TableAPI
[ https://issues.apache.org/jira/browse/FLINK-22530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338361#comment-17338361 ] Christopher Rost commented on FLINK-22530:
---
@libenchao Seems that you had a similar issue with two subsequent tumbling windows.

> RuntimeException after subsequent windowed grouping in TableAPI
> ---
>
> Key: FLINK-22530
> URL: https://issues.apache.org/jira/browse/FLINK-22530
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.12.0
> Reporter: Christopher Rost
> Priority: Major
>
> After applying the following using the TableAPI v 1.12.0, an error is thrown:
> {code:java}
> java.lang.RuntimeException: Error while applying rule StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args [rel#505:FlinkLogicalWindowAggregate.LOGICAL.any.None: 0.[NONE].[NONE](input=RelSubset#504,group={1},window=TumblingGroupWindow('w2, w1_rowtime, 1),properties=EXPR$1)]{code}
> The code snippet to reproduce:
> {code:java}
> Table table2 = table1
>     .window(Tumble.over(lit(10).seconds()).on($(EVENT_TIME)).as("w1"))
>     .groupBy($(ID), $(LABEL), $("w1"))
>     .select($(ID), $(LABEL), $("w1").rowtime().as("w1_rowtime"));
> // table2.execute().print(); --> work well
>
> Table table3 = table2
>     .window(Tumble.over(lit(10).seconds()).on($("w1_rowtime")).as("w2"))
>     .groupBy($(LABEL), $("w2"))
>     .select(
>         $(LABEL).as("super_label"),
>         lit(1).count().as("super_count"),
>         $("w2").rowtime().as("w2_rowtime")
>     );
> // table3.execute().print(); //--> work well
>
> table3.select($("super_label"), $("w2_rowtime"))
>     .execute().print(); // --> throws exception
> {code}
> It seems that the alias "w1_rowtime" is no longer available for further usages of table3, since the cause of the exception is:
> {noformat}
> Caused by: java.lang.IllegalArgumentException: field [w1_rowtime] not found; input fields are: [vertex_id, vertex_label, EXPR$0
> {noformat}
> {{The complete trace:}}
> {code:java}
> java.lang.RuntimeException: Error while applying rule StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args [rel#197:FlinkLogicalWindowAggregate.LOGICAL.any.None: 0.[NONE].[NONE](input=RelSubset#196,group={1},window=TumblingGroupWindow('w2, w1_rowtime, 1),properties=EXPR$1)]
> 	at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:256)
> 	at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
> 	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
> 	at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:312)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:64)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
> 	at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
> 	at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
> 	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
> 	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
> 	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> 	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> 	at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
> 	at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
> 	at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)
> 	at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:163)
> 	at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:79)
> 	at org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
> 	at org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:286)
> 	at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:165)
> 	at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1267)
> 	at
[jira] [Resolved] (FLINK-22368) UnalignedCheckpointITCase hangs on azure
[ https://issues.apache.org/jira/browse/FLINK-22368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Khachatryan resolved FLINK-22368. --- Fix Version/s: 1.12.4 1.14.0 Resolution: Fixed Merged into 1.12 as afc0a9002c425a95c032a94750a5e9ae45fae8d3. Merged into 1.13 as b41c5fab3e3eebb04b891ec2ac0688bedc2d94eb. Merged into master as 6e3ccd5a9613a2de06c8ec410ba41e6e0c6959be. > UnalignedCheckpointITCase hangs on azure > > > Key: FLINK-22368 > URL: https://issues.apache.org/jira/browse/FLINK-22368 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.13.0 >Reporter: Dawid Wysakowicz >Assignee: Roman Khachatryan >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.14.0, 1.13.1, 1.12.4 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=16818=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=d13f554f-d4b9-50f8-30ee-d49c6fb0b3cc=10144 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] rkhachatryan merged pull request #15726: [BP-1.12][FLINK-22368] Deque channel after releasing on EndOfPartition
rkhachatryan merged pull request #15726: URL: https://github.com/apache/flink/pull/15726
[GitHub] [flink] rkhachatryan merged pull request #15727: [BP-1.13][FLINK-22368] Deque channel after releasing on EndOfPartition
rkhachatryan merged pull request #15727: URL: https://github.com/apache/flink/pull/15727
[jira] [Commented] (FLINK-20217) More fine-grained timer processing
[ https://issues.apache.org/jira/browse/FLINK-20217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338353#comment-17338353 ] Nico Kruber commented on FLINK-20217: - I'd still consider this a major hurdle for which workarounds require quite some implementation effort by users. > More fine-grained timer processing > -- > > Key: FLINK-20217 > URL: https://issues.apache.org/jira/browse/FLINK-20217 > Project: Flink > Issue Type: Improvement > Components: API / DataStream >Affects Versions: 1.10.2, 1.11.2, 1.12.0 >Reporter: Nico Kruber >Priority: Major > > Timers are currently processed in one big block under the checkpoint lock > (under {{InternalTimerServiceImpl#advanceWatermark}}. This can be problematic > in a number of scenarios while doing checkpointing which would lead to > checkpoints timing out (and even unaligned checkpoints would not help). > If you have a huge number of timers to process when advancing the watermark > and the task is also back-pressured, the situation may actually be worse > since you would block on the checkpoint lock and also wait for > buffers/credits from the receiver. > I propose to make this loop more fine-grained so that it is interruptible by > checkpoints, but maybe there is also some other way to improve here. -- This message was sent by Atlassian Jira (v8.3.4#803005)
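The direction proposed in the issue, firing timers in bounded, interruptible chunks instead of one uninterruptible block, can be sketched in isolation as follows (illustrative only; Flink's real service is `InternalTimerServiceImpl`, and the chunk size and method names here are invented):

```java
import java.util.PriorityQueue;

// A sketch of chunked timer firing: the caller invokes advanceWatermark()
// repeatedly (e.g. via the task mailbox) until it returns 0, releasing the
// checkpoint lock between invocations so a checkpoint barrier can be
// processed even when millions of timers become eligible at once.
final class ChunkedTimerService {
    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    private long watermark = Long.MIN_VALUE;

    void registerTimer(long timestamp) { timers.add(timestamp); }

    /**
     * Fires at most {@code maxTimers} timers with timestamp <= the new
     * watermark and returns the number fired; 0 means the chunk loop is done.
     */
    int advanceWatermark(long newWatermark, int maxTimers) {
        watermark = Math.max(watermark, newWatermark);
        int fired = 0;
        while (fired < maxTimers
                && !timers.isEmpty()
                && timers.peek() <= watermark) {
            timers.poll(); // fire the timer (user callback elided in this sketch)
            fired++;
        }
        return fired;
    }
}
```

The key property is that between two `advanceWatermark` calls no timer state is in flight, so a checkpoint taken at that point sees a consistent picture of the remaining timers.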
[jira] [Updated] (FLINK-20217) More fine-grained timer processing
[ https://issues.apache.org/jira/browse/FLINK-20217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-20217: Labels: (was: auto-deprioritized-major) > More fine-grained timer processing > -- > > Key: FLINK-20217 > URL: https://issues.apache.org/jira/browse/FLINK-20217 > Project: Flink > Issue Type: Improvement > Components: API / DataStream >Affects Versions: 1.10.2, 1.11.2, 1.12.0 >Reporter: Nico Kruber >Priority: Minor > > Timers are currently processed in one big block under the checkpoint lock > (under {{InternalTimerServiceImpl#advanceWatermark}}. This can be problematic > in a number of scenarios while doing checkpointing which would lead to > checkpoints timing out (and even unaligned checkpoints would not help). > If you have a huge number of timers to process when advancing the watermark > and the task is also back-pressured, the situation may actually be worse > since you would block on the checkpoint lock and also wait for > buffers/credits from the receiver. > I propose to make this loop more fine-grained so that it is interruptible by > checkpoints, but maybe there is also some other way to improve here. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20217) More fine-grained timer processing
[ https://issues.apache.org/jira/browse/FLINK-20217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-20217: Priority: Major (was: Minor) > More fine-grained timer processing > -- > > Key: FLINK-20217 > URL: https://issues.apache.org/jira/browse/FLINK-20217 > Project: Flink > Issue Type: Improvement > Components: API / DataStream >Affects Versions: 1.10.2, 1.11.2, 1.12.0 >Reporter: Nico Kruber >Priority: Major > > Timers are currently processed in one big block under the checkpoint lock > (under {{InternalTimerServiceImpl#advanceWatermark}}. This can be problematic > in a number of scenarios while doing checkpointing which would lead to > checkpoints timing out (and even unaligned checkpoints would not help). > If you have a huge number of timers to process when advancing the watermark > and the task is also back-pressured, the situation may actually be worse > since you would block on the checkpoint lock and also wait for > buffers/credits from the receiver. > I propose to make this loop more fine-grained so that it is interruptible by > checkpoints, but maybe there is also some other way to improve here. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase
flinkbot edited a comment on pull request #15825: URL: https://github.com/apache/flink/pull/15825#issuecomment-831225399 ## CI report: * e1a425301eee0b46c812866d12284125f5bde104 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17517) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #15823: [FLINK-22462][tests] Set max_connections to 200 in pgsql
flinkbot edited a comment on pull request #15823: URL: https://github.com/apache/flink/pull/15823#issuecomment-831107274 ## CI report: * 9f9242917bedadece14b34ac0a7f98f91bdf4c82 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17509)
[GitHub] [flink] flinkbot edited a comment on pull request #15820: [FLINK-22535][runtime] CleanUp is invoked for task even when the task…
flinkbot edited a comment on pull request #15820: URL: https://github.com/apache/flink/pull/15820#issuecomment-830052015 ## CI report: * 4ba39e78592ba67fc5c51f11b8737337337805e5 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17510)
[jira] [Updated] (FLINK-16556) TopSpeedWindowing should implement checkpointing for its source
[ https://issues.apache.org/jira/browse/FLINK-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nico Kruber updated FLINK-16556: Labels: auto-deprioritized-major starter (was: auto-deprioritized-major) > TopSpeedWindowing should implement checkpointing for its source > --- > > Key: FLINK-16556 > URL: https://issues.apache.org/jira/browse/FLINK-16556 > Project: Flink > Issue Type: Bug > Components: Examples >Affects Versions: 1.10.0 >Reporter: Nico Kruber >Priority: Minor > Labels: auto-deprioritized-major, starter > > {{org.apache.flink.streaming.examples.windowing.TopSpeedWindowing.CarSource}} > does not implement checkpointing of its state, namely the current speeds and > distances per car. The main problem with this is that the window trigger only > fires if the new distance has increased by at least 50, but after restore, it > will be reset to 0 and could thus not produce output for a while. > > Either the distance calculation could use {{Math.abs}} or the source needs > proper checkpointing, optionally allowing the number of cars to > increase/decrease.
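A minimal, dependency-free sketch of the idea behind the proper-checkpointing option: snapshot and restore the per-car distances so the trigger's distance threshold keeps working after recovery. The real fix would implement `CheckpointedFunction` with `ListState` inside `CarSource`; the class and method names below are purely illustrative.

```java
import java.util.Arrays;

// Sketch of the state the ticket says should be checkpointed:
// per-car speeds and distances, snapshotted and restored instead of
// being reset to 0 on recovery. Names are illustrative, not Flink's.
public class CarSourceStateSketch {
    int[] speeds;
    double[] distances;

    CarSourceStateSketch(int cars) {
        speeds = new int[cars];
        distances = new double[cars];
    }

    // would correspond to CheckpointedFunction#snapshotState
    double[] snapshotDistances() {
        return Arrays.copyOf(distances, distances.length);
    }

    // would correspond to CheckpointedFunction#initializeState on restore
    void restoreDistances(double[] snapshot) {
        distances = Arrays.copyOf(snapshot, snapshot.length);
    }

    public static void main(String[] args) {
        CarSourceStateSketch source = new CarSourceStateSketch(2);
        source.distances[0] = 120.0; // car 0 has already driven 120 units
        double[] checkpoint = source.snapshotDistances();

        // after a restart, the restored source continues from 120,
        // so a trigger waiting for +50 distance is not starved
        CarSourceStateSketch restored = new CarSourceStateSketch(2);
        restored.restoreDistances(checkpoint);
        System.out.println(restored.distances[0]);
    }
}
```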
[GitHub] [flink] alpinegizmo commented on a change in pull request #15811: [FLINK-22253][docs] Update back pressure monitoring docs with new WebUI changes
alpinegizmo commented on a change in pull request #15811: URL: https://github.com/apache/flink/pull/15811#discussion_r625045499 ## File path: docs/content/docs/ops/monitoring/back_pressure.md ## @@ -37,50 +37,47 @@ If you see a **back pressure warning** (e.g. `High`) for a task, this means that Take a simple `Source -> Sink` job as an example. If you see a warning for `Source`, this means that `Sink` is consuming data slower than `Source` is producing. `Sink` is back pressuring the upstream operator `Source`. -## Sampling Back Pressure +## Task performance metrics -Back pressure monitoring works by repeatedly taking back pressure samples of your running tasks. The JobManager triggers repeated calls to `Task.isBackPressured()` for the tasks of your job. +Every parallel instance of a task (subtask) is exposing a group of three metrics: +- `backPressureTimeMsPerSecond`, time that subtask spent being back pressured +- `idleTimeMsPerSecond`, time that subtask spent waiting for something to process +- `busyTimeMsPerSecond`, time that subtask was busy doing some actual work -{{< img src="/fig/back_pressure_sampling.png" class="img-responsive" >}} - - -Internally, back pressure is judged based on the availability of output buffers. If there is no available buffer (at least one) for output, then it indicates that there is back pressure for the task. - -By default, the job manager triggers 100 samples every 50ms for each task in order to determine back pressure. The ratio you see in the web interface tells you how many of these samples were indicating back pressure, e.g. `0.01` indicates that only 1 in 100 was back pressured. - -- **OK**: 0 <= Ratio <= 0.10 -- **LOW**: 0.10 < Ratio <= 0.5 -- **HIGH**: 0.5 < Ratio <= 1 - -In order to not overload the task managers with back pressure samples, the web interface refreshes samples only after 60 seconds. 
- -## Configuration - -You can configure the number of samples for the job manager with the following configuration keys: - -- `web.backpressure.refresh-interval`: Time after which available stats are deprecated and need to be refreshed (DEFAULT: 6, 1 min). -- `web.backpressure.num-samples`: Number of samples to take to determine back pressure (DEFAULT: 100). -- `web.backpressure.delay-between-samples`: Delay between samples to determine back pressure (DEFAULT: 50, 50 ms). +Those metrics are being updated every couple of seconds and the reported value presents an average time +that subtask has been back pressured (or idle or busy) in that last couple of seconds. +Keep this in mind if your job has a varying load. Both, a subtask that has a constant load of 50% and a +subtask that is alternating every second between fully loaded and idling, will have the same value +of `busyTimeMsPerSecond` around `500ms`. Review comment: ```suggestion Keep this in mind if your job has a varying load. For example, a subtask with a constant load of 50% and another subtask that is alternating every second between fully loaded and idling will both have the same value of `busyTimeMsPerSecond`: around `500ms`. ``` ## File path: docs/content/docs/ops/monitoring/back_pressure.md ## @@ -37,50 +37,47 @@ If you see a **back pressure warning** (e.g. `High`) for a task, this means that Take a simple `Source -> Sink` job as an example. If you see a warning for `Source`, this means that `Sink` is consuming data slower than `Source` is producing. `Sink` is back pressuring the upstream operator `Source`. -## Sampling Back Pressure +## Task performance metrics -Back pressure monitoring works by repeatedly taking back pressure samples of your running tasks. The JobManager triggers repeated calls to `Task.isBackPressured()` for the tasks of your job. 
+Every parallel instance of a task (subtask) is exposing a group of three metrics: +- `backPressureTimeMsPerSecond`, time that subtask spent being back pressured +- `idleTimeMsPerSecond`, time that subtask spent waiting for something to process +- `busyTimeMsPerSecond`, time that subtask was busy doing some actual work -{{< img src="/fig/back_pressure_sampling.png" class="img-responsive" >}} - - -Internally, back pressure is judged based on the availability of output buffers. If there is no available buffer (at least one) for output, then it indicates that there is back pressure for the task. - -By default, the job manager triggers 100 samples every 50ms for each task in order to determine back pressure. The ratio you see in the web interface tells you how many of these samples were indicating back pressure, e.g. `0.01` indicates that only 1 in 100 was back pressured. - -- **OK**: 0 <= Ratio <= 0.10 -- **LOW**: 0.10 < Ratio <= 0.5 -- **HIGH**: 0.5 < Ratio <= 1 - -In order to not overload the task managers with back pressure samples, the web interface refreshes samples only after 60 seconds. - -## Configuration - -You can configure the number of
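As the revised docs explain, each reported metric is an average over the sampling window, which is why a subtask with a constant 50% load and one alternating each second between fully busy and idle both report `busyTimeMsPerSecond` around 500. A tiny sketch of that averaging effect, with illustrative values:

```java
// Sketch of why two very different load shapes report the same
// busyTimeMsPerSecond once averaged over the sampling window.
public class BusyTimeAveraging {
    static long avg(long[] busyMsPerSecond) {
        long sum = 0;
        for (long v : busyMsPerSecond) sum += v;
        return sum / busyMsPerSecond.length;
    }

    public static void main(String[] args) {
        // constant 50% load: 500 ms busy in every second
        long[] constantLoad = {500, 500, 500, 500};
        // alternating: fully busy one second, fully idle the next
        long[] alternating = {1000, 0, 1000, 0};
        System.out.println(avg(constantLoad) + " " + avg(alternating));
    }
}
```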
[GitHub] [flink] flinkbot commented on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase
flinkbot commented on pull request #15825: URL: https://github.com/apache/flink/pull/15825#issuecomment-831225399 ## CI report: * e1a425301eee0b46c812866d12284125f5bde104 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #15726: [BP-1.12][FLINK-22368] Deque channel after releasing on EndOfPartition
flinkbot edited a comment on pull request #15726: URL: https://github.com/apache/flink/pull/15726#issuecomment-824941786 ## CI report: * 5022df0e16a5b1f7798171c0cd95426df8ef5c64 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17508)
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489) * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17516)
[jira] [Commented] (FLINK-22139) Flink Jobmanager & Task Manger logs are not writing to the logs files
[ https://issues.apache.org/jira/browse/FLINK-22139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338338#comment-17338338 ] Till Rohrmann commented on FLINK-22139: --- What is the side pod? Could you try deploying a non-HA cluster as described in the documentation and then log into the container of the {{JobManager}} and check whether {{/opt/flink/log}} contains the log file? > Flink Jobmanager & Task Manger logs are not writing to the logs files > - > > Key: FLINK-22139 > URL: https://issues.apache.org/jira/browse/FLINK-22139 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.2 > Environment: on kubernetes flink standalone deployment with > jobmanager HA is enabled. >Reporter: Bhagi >Priority: Major > > Hi Team, > I am submitting the jobs and restarting the job manager and task manager > pods.. Log files are generating with the name task manager and job manager. > but job manager & task manager log file size is '0', i am not sure any > configuration missed..why logs are not writing to their log files.. > # Task Manager pod### > flink@flink-taskmanager-85b6585b7-hhgl7:~$ ls -lart log/ > total 0 > -rw-r--r-- 1 flink flink 0 Apr 7 09:35 > flink--taskexecutor-0-flink-taskmanager-85b6585b7-hhgl7.log > flink@flink-taskmanager-85b6585b7-hhgl7:~$ > ### Jobmanager pod Logs # > flink@flink-jobmanager-f6db89b7f-lq4ps:~$ > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:36 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-gtkx5.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:36 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-wnrfm.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:37 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-2b2fs.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 06:37 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-7kdhh.log > -rw-r--r-- 1 7148739 flink 0 Apr 7 09:35 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-twhkt.log > drwxrwxrwx 2 7148739 flink35 Apr 7 09:35 . 
> -rw-r--r-- 1 7148739 flink 0 Apr 7 09:35 > flink--standalonesession-0-flink-jobmanager-f6db89b7f-lq4ps.log > flink@flink-jobmanager-f6db89b7f-lq4ps:~$ > I configured log4j.properties for flink > log4j.properties: |+ > monitorInterval=30 > rootLogger.level = INFO > rootLogger.appenderRef.file.ref = MainAppender > logger.flink.name = org.apache.flink > logger.flink.level = INFO > logger.akka.name = akka > logger.akka.level = INFO > appender.main.name = MainAppender > appender.main.type = RollingFile > appender.main.append = true > appender.main.fileName = ${sys:log.file} > appender.main.filePattern = ${sys:log.file}.%i > appender.main.layout.type = PatternLayout > appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x > - %m%n > appender.main.policies.type = Policies > appender.main.policies.size.type = SizeBasedTriggeringPolicy > appender.main.policies.size.size = 100MB > appender.main.policies.startup.type = OnStartupTriggeringPolicy > appender.main.strategy.type = DefaultRolloverStrategy > appender.main.strategy.max = ${env:MAX_LOG_FILE_NUMBER:-10} > logger.netty.name = > org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline > logger.netty.level = OFF
[GitHub] [flink] tillrohrmann commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests
tillrohrmann commented on a change in pull request #15741: URL: https://github.com/apache/flink/pull/15741#discussion_r625038879 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/rpc/RpcEndpointTest.java ## @@ -340,108 +319,54 @@ public void testExecute() throws InterruptedException, ExecutionException, Timeo } } -/** Tests scheduling runnable with delay specified in number and TimeUnit. */ @Test -public void testScheduleRunnable() -throws InterruptedException, ExecutionException, TimeoutException { -final Time expectedDelay1 = Time.seconds(1); -final Time expectedDelay2 = Time.milliseconds(500); -final CompletableFuture actualDelayMsFuture1 = new CompletableFuture<>(); -final CompletableFuture actualDelayMsFuture2 = new CompletableFuture<>(); -final RpcEndpoint endpoint = new BaseEndpoint(rpcService); -try { -endpoint.start(); -final long startTime = System.currentTimeMillis(); -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture1.complete( -System.currentTimeMillis() - startTime); -}, -expectedDelay1.getSize(), -expectedDelay1.getUnit()); -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture2.complete( -System.currentTimeMillis() - startTime); -}, -expectedDelay2.getSize(), -expectedDelay2.getUnit()); -final long actualDelayMs1 = -actualDelayMsFuture1.get( -expectedDelay1.getSize() * 2, expectedDelay1.getUnit()); -final long actualDelayMs2 = -actualDelayMsFuture2.get( -expectedDelay2.getSize() * 2, expectedDelay2.getUnit()); -assertTrue(actualDelayMs1 > expectedDelay1.toMilliseconds() * 0.8); -assertTrue(actualDelayMs1 < expectedDelay1.toMilliseconds() * 1.2); -assertTrue(actualDelayMs2 > expectedDelay2.toMilliseconds() * 0.8); -assertTrue(actualDelayMs2 < expectedDelay2.toMilliseconds() * 1.2); -} finally { -RpcUtils.terminateRpcEndpoint(endpoint, TIMEOUT); -} +public void testScheduleRunnableWithDelayInMilliseconds() throws 
Exception { +testScheduleWithDelay( +(mainThreadExecutor, expectedDelay) -> +mainThreadExecutor.schedule( +() -> {}, expectedDelay.toMillis(), TimeUnit.MILLISECONDS)); } -/** Tests scheduling callable with delay specified in number and TimeUnit. */ @Test -public void testScheduleCallable() -throws InterruptedException, ExecutionException, TimeoutException { -final Time expectedDelay1 = Time.seconds(1); -final Time expectedDelay2 = Time.milliseconds(500); -final CompletableFuture actualDelayMsFuture1 = new CompletableFuture<>(); -final CompletableFuture actualDelayMsFuture2 = new CompletableFuture<>(); -final RpcEndpoint endpoint = new BaseEndpoint(rpcService); -final int expectedInt = 12345; -final String expectedString = "Flink"; -try { -endpoint.start(); -final long startTime = System.currentTimeMillis(); -final ScheduledFuture intScheduleFuture = -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture1.complete( -System.currentTimeMillis() - startTime); -return expectedInt; -}, -expectedDelay1.getSize(), -expectedDelay1.getUnit()); -final ScheduledFuture stringScheduledFuture = -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture2.complete( -System.currentTimeMillis() - startTime); -
[jira] [Updated] (FLINK-22551) checkpoints: strange behaviour
[ https://issues.apache.org/jira/browse/FLINK-22551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] buom updated FLINK-22551: - Description: * +*Case 1*:+ Work as expected {code:java} public class Example { public static class ExampleSource extends RichSourceFunction implements CheckpointedFunction { private volatile boolean isRunning = true; @Override public void open(Configuration parameters) throws Exception { System.out.println("[source] invoke open()"); } @Override public void close() throws Exception { isRunning = false; System.out.println("[source] invoke close()"); } @Override public void run(SourceContext ctx) throws Exception { System.out.println("[source] invoke run()"); while (isRunning) { ctx.collect("Flink"); Thread.sleep(500); } } @Override public void cancel() { isRunning = false; System.out.println("[source] invoke cancel()"); } @Override public void snapshotState(FunctionSnapshotContext context) throws Exception { System.out.println("[source] invoke snapshotState()"); } @Override public void initializeState(FunctionInitializationContext context) throws Exception { System.out.println("[source] invoke initializeState()"); } } public static class ExampleSink extends PrintSinkFunction implements CheckpointedFunction { @Override public void snapshotState(FunctionSnapshotContext context) throws Exception { System.out.println("[sink] invoke snapshotState()"); } @Override public void initializeState(FunctionInitializationContext context) throws Exception { System.out.println("[sink] invoke initializeState()"); } } public static void main(String[] args) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000); DataStream stream = env.addSource(new ExampleSource()); stream.addSink(new ExampleSink()).setParallelism(1); env.execute(); } } {code} {code:java} $ java -jar ./example.jar [sink] invoke initializeState() [source] invoke initializeState() [source] 
invoke open() [source] invoke run() Flink [sink] invoke snapshotState() [source] invoke snapshotState() Flink Flink [sink] invoke snapshotState() [source] invoke snapshotState() Flink Flink [sink] invoke snapshotState() [source] invoke snapshotState() ^C {code} * *+Case 2:+* Run as unexpected (w/ _parallelism = 1_) {code:java} public class Example { public static class ExampleSource extends RichSourceFunction implements CheckpointedFunction { private volatile boolean isRunning = true; @Override public void open(Configuration parameters) throws Exception { System.out.println("[source] invoke open()"); } @Override public void close() throws Exception { isRunning = false; System.out.println("[source] invoke close()"); } @Override public void run(SourceContext ctx) throws Exception { System.out.println("[source] invoke run()"); while (isRunning) { ctx.collect("Flink"); Thread.sleep(500); } } @Override public void cancel() { isRunning = false; System.out.println("[source] invoke cancel()"); } @Override public void snapshotState(FunctionSnapshotContext context) throws Exception { System.out.println("[source] invoke snapshotState()"); } @Override public void initializeState(FunctionInitializationContext context) throws Exception { System.out.println("[source] invoke initializeState()"); } } public static void main(String[] args) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000); DataStream stream = env.addSource(new ExampleSource()); Properties properties = new Properties(); properties.setProperty("bootstrap.servers", "localhost:9092"); String topic = "my-topic"; FlinkKafkaProducer kafkaProducer = new FlinkKafkaProducer<>( topic, (element, timestamp) -> { byte[] value = element.getBytes(StandardCharsets.UTF_8); return new ProducerRecord<>(topic, null, timestamp, null, value, null); }, properties,
[jira] [Created] (FLINK-22551) checkpoints: strange behaviour
buom created FLINK-22551: Summary: checkpoints: strange behaviour Key: FLINK-22551 URL: https://issues.apache.org/jira/browse/FLINK-22551 Project: Flink Issue Type: Bug Affects Versions: 1.13.0 Environment: {code:java} java -version openjdk version "11.0.2" 2019-01-15 OpenJDK Runtime Environment 18.9 (build 11.0.2+9) OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode) {code} Reporter: buom * +*Case 1*:+ Work as expected {code:java} public class Example { public static class ExampleSource extends RichSourceFunction implements CheckpointedFunction { private volatile boolean isRunning = true; @Override public void open(Configuration parameters) throws Exception { System.out.println("[source] invoke open()"); } @Override public void close() throws Exception { isRunning = false; System.out.println("[source] invoke close()"); } @Override public void run(SourceContext ctx) throws Exception { System.out.println("[source] invoke run()"); while (isRunning) { ctx.collect("Flink"); Thread.sleep(500); } } @Override public void cancel() { isRunning = false; System.out.println("[source] invoke cancel()"); } @Override public void snapshotState(FunctionSnapshotContext context) throws Exception { System.out.println("[source] invoke snapshotState()"); } @Override public void initializeState(FunctionInitializationContext context) throws Exception { System.out.println("[source] invoke initializeState()"); } } public static class ExampleSink extends PrintSinkFunction implements CheckpointedFunction { @Override public void snapshotState(FunctionSnapshotContext context) throws Exception { System.out.println("[sink] invoke snapshotState()"); } @Override public void initializeState(FunctionInitializationContext context) throws Exception { System.out.println("[sink] invoke initializeState()"); } } public static void main(String[] args) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000); DataStream 
stream = env.addSource(new ExampleSource()); stream.addSink(new ExampleSink()).setParallelism(1); env.execute(); } } {code} {code:java} $ java -jar ./example.jar [sink] invoke initializeState() [source] invoke initializeState() [source] invoke open() [source] invoke run() Flink [sink] invoke snapshotState() [source] invoke snapshotState() Flink Flink [sink] invoke snapshotState() [source] invoke snapshotState() Flink Flink [sink] invoke snapshotState() [source] invoke snapshotState() ^C {code} * *+Case 2:+* Run as unexpected (w/ _parallelism = 1_) {code:java} public class Example { public static class ExampleSource extends RichSourceFunction implements CheckpointedFunction { private volatile boolean isRunning = true; @Override public void open(Configuration parameters) throws Exception { System.out.println("[source] invoke open()"); } @Override public void close() throws Exception { isRunning = false; System.out.println("[source] invoke close()"); } @Override public void run(SourceContext ctx) throws Exception { System.out.println("[source] invoke run()"); while (isRunning) { ctx.collect("Flink"); Thread.sleep(500); } } @Override public void cancel() { isRunning = false; System.out.println("[source] invoke cancel()"); } @Override public void snapshotState(FunctionSnapshotContext context) throws Exception { System.out.println("[source] invoke snapshotState()"); } @Override public void initializeState(FunctionInitializationContext context) throws Exception { System.out.println("[source] invoke initializeState()"); } } public static void main(String[] args) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment().enableCheckpointing(1000); DataStream stream = env.addSource(new ExampleSource()); Properties properties = new Properties(); properties.setProperty("bootstrap.servers", "localhost:9092"); String topic = "my-topic"; FlinkKafkaProducer kafkaProducer = new FlinkKafkaProducer<>(
[GitHub] [flink-docker] AHeise commented on pull request #74: Updated maintainers
AHeise commented on pull request #74: URL: https://github.com/apache/flink-docker/pull/74#issuecomment-831207527 Having the project is probably more future proof. Not sure why you need to list maintainers here in the first place...
[GitHub] [flink] flinkbot edited a comment on pull request #15121: The method $(String) is undefined for the type TableExample
flinkbot edited a comment on pull request #15121: URL: https://github.com/apache/flink/pull/15121#issuecomment-793480240 ## CI report: * cf26ce895a7956d258acc073817f578558e78227 UNKNOWN * 709c4110370b845f42d916a06e56c6026cf2fac8 UNKNOWN * 4ea99332e1997eebca1f3f0a9d9229b8265fe32c UNKNOWN * 9a256ec6b644a0f795376076f4e3ff4e85aaad1e UNKNOWN * b7dca63a2e98f09c8c780a21091b5268d897b220 UNKNOWN * 2b7bb05a4dbbb0883db566589dfe1975fdbb279d UNKNOWN * 5ee991b8ab6c1645e3661f2ae8b5c20e92223954 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17489) * cc9e8ddee9ff28b3044488d0e73bbc7f5c0a73e5 UNKNOWN
[GitHub] [flink] flinkbot commented on pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase
flinkbot commented on pull request #15825: URL: https://github.com/apache/flink/pull/15825#issuecomment-831205782 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit e1a425301eee0b46c812866d12284125f5bde104 (Mon May 03 11:41:33 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-22406) Unstable test ReactiveModeITCase.testScaleDownOnTaskManagerLoss()
[ https://issues.apache.org/jira/browse/FLINK-22406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-22406: --- Labels: pull-request-available test-stability (was: test-stability) > Unstable test ReactiveModeITCase.testScaleDownOnTaskManagerLoss() > - > > Key: FLINK-22406 > URL: https://issues.apache.org/jira/browse/FLINK-22406 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.13.0 >Reporter: Stephan Ewen >Assignee: Chesnay Schepler >Priority: Critical > Labels: pull-request-available, test-stability > > The test is stalling on Azure CI. > https://dev.azure.com/sewen0794/Flink/_build/results?buildId=292=logs=0a15d512-44ac-5ba5-97ab-13a5d066c22c=634cd701-c189-5dff-24cb-606ed884db87=4865
[GitHub] [flink] zentol opened a new pull request #15825: [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase
zentol opened a new pull request #15825: URL: https://github.com/apache/flink/pull/15825 Stabilizes the ReactiveModeITCase by not checking what is actually being deployed, but instead asking the JM what parallelism it currently intends to run the job with.
[GitHub] [flink] zentol merged pull request #15749: [hotfix][docs] Fix typo.
zentol merged pull request #15749: URL: https://github.com/apache/flink/pull/15749
[GitHub] [flink] zentol merged pull request #15800: [hotfix][python] Format exception message in ZipUtils
zentol merged pull request #15800: URL: https://github.com/apache/flink/pull/15800
[jira] [Assigned] (FLINK-22494) Avoid discarding checkpoints in case of failure
[ https://issues.apache.org/jira/browse/FLINK-22494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias reassigned FLINK-22494: Assignee: Matthias > Avoid discarding checkpoints in case of failure > --- > > Key: FLINK-22494 > URL: https://issues.apache.org/jira/browse/FLINK-22494 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / Coordination >Affects Versions: 1.13.0, 1.14.0, 1.12.3 >Reporter: Matthias >Assignee: Matthias >Priority: Critical > Fix For: 1.14.0, 1.13.1, 1.12.4 > > > Both {{StateHandleStore}} implementations (i.e. > [KubernetesStateHandleStore:157|https://github.com/apache/flink/blob/c6997c97c575d334679915c328792b8a3067cfb5/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesStateHandleStore.java#L157] > and > [ZooKeeperStateHandleStore:170|https://github.com/apache/flink/blob/c6997c97c575d334679915c328792b8a3067cfb5/flink-runtime/src/main/java/org/apache/flink/runtime/zookeeper/ZooKeeperStateHandleStore.java#L170]) > discard checkpoints if the checkpoint metadata wasn't written to the > backend. > This does not cover the cases where the data was actually written to the > backend but the call failed anyway (e.g. due to network issues). In such a > case, we might end up having a pointer in the backend pointing to a > checkpoint that was discarded. > Instead of discarding the checkpoint data in this case, we might want to keep > it for this specific use case. Otherwise, we might run into Exceptions when > recovering from the Checkpoint later on. We might want to add a warning to > the user pointing to the possibly orphaned checkpoint data.
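The failure mode described in the ticket can be illustrated with a deliberately simplified store (hypothetical classes and method names, not Flink's actual `StateHandleStore` API): when the write reaches the backend but the call still throws, e.g. because the acknowledgement was lost, discarding on any failure leaves the backend pointing at checkpoint data that was just deleted.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified sketch of the race described above; not Flink's actual API. */
class SketchStateHandleStore {
    private final Map<String, String> backend = new HashMap<>();
    private final boolean failAfterWrite; // simulates a lost ack / network error

    SketchStateHandleStore(boolean failAfterWrite) {
        this.failAfterWrite = failAfterWrite;
    }

    void add(String key, String handle) throws Exception {
        backend.put(key, handle);            // data reaches the backend...
        if (failAfterWrite) {
            throw new Exception("ack lost"); // ...but the call still fails
        }
    }

    boolean contains(String key) { return backend.containsKey(key); }
}

public class CheckpointDiscardSketch {
    public static void main(String[] args) {
        SketchStateHandleStore store = new SketchStateHandleStore(true);
        boolean discarded = false;
        try {
            store.add("chk-42", "handle-to-checkpoint-data");
        } catch (Exception e) {
            discarded = true; // problematic behavior: discard on any failure
        }
        // The backend still references the checkpoint that was discarded:
        System.out.println(discarded && store.contains("chk-42")); // prints "true"
    }
}
```

The mitigation the ticket proposes is the inverse of that catch block: on an ambiguous failure, keep the checkpoint data and warn the user about possibly orphaned state instead of discarding it.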
[GitHub] [flink] zentol merged pull request #15805: Update japicmp configuration and docs for 1.12.3
zentol merged pull request #15805: URL: https://github.com/apache/flink/pull/15805
[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests
zentol commented on a change in pull request #15741: URL: https://github.com/apache/flink/pull/15741#discussion_r625023950 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/rpc/RpcEndpointTest.java ## @@ -340,108 +319,54 @@ public void testExecute() throws InterruptedException, ExecutionException, Timeo } } -/** Tests scheduling runnable with delay specified in number and TimeUnit. */ @Test -public void testScheduleRunnable() -throws InterruptedException, ExecutionException, TimeoutException { -final Time expectedDelay1 = Time.seconds(1); -final Time expectedDelay2 = Time.milliseconds(500); -final CompletableFuture actualDelayMsFuture1 = new CompletableFuture<>(); -final CompletableFuture actualDelayMsFuture2 = new CompletableFuture<>(); -final RpcEndpoint endpoint = new BaseEndpoint(rpcService); -try { -endpoint.start(); -final long startTime = System.currentTimeMillis(); -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture1.complete( -System.currentTimeMillis() - startTime); -}, -expectedDelay1.getSize(), -expectedDelay1.getUnit()); -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture2.complete( -System.currentTimeMillis() - startTime); -}, -expectedDelay2.getSize(), -expectedDelay2.getUnit()); -final long actualDelayMs1 = -actualDelayMsFuture1.get( -expectedDelay1.getSize() * 2, expectedDelay1.getUnit()); -final long actualDelayMs2 = -actualDelayMsFuture2.get( -expectedDelay2.getSize() * 2, expectedDelay2.getUnit()); -assertTrue(actualDelayMs1 > expectedDelay1.toMilliseconds() * 0.8); -assertTrue(actualDelayMs1 < expectedDelay1.toMilliseconds() * 1.2); -assertTrue(actualDelayMs2 > expectedDelay2.toMilliseconds() * 0.8); -assertTrue(actualDelayMs2 < expectedDelay2.toMilliseconds() * 1.2); -} finally { -RpcUtils.terminateRpcEndpoint(endpoint, TIMEOUT); -} +public void testScheduleRunnableWithDelayInMilliseconds() throws 
Exception { +testScheduleWithDelay( +(mainThreadExecutor, expectedDelay) -> +mainThreadExecutor.schedule( +() -> {}, expectedDelay.toMillis(), TimeUnit.MILLISECONDS)); } -/** Tests scheduling callable with delay specified in number and TimeUnit. */ @Test -public void testScheduleCallable() -throws InterruptedException, ExecutionException, TimeoutException { -final Time expectedDelay1 = Time.seconds(1); -final Time expectedDelay2 = Time.milliseconds(500); -final CompletableFuture actualDelayMsFuture1 = new CompletableFuture<>(); -final CompletableFuture actualDelayMsFuture2 = new CompletableFuture<>(); -final RpcEndpoint endpoint = new BaseEndpoint(rpcService); -final int expectedInt = 12345; -final String expectedString = "Flink"; -try { -endpoint.start(); -final long startTime = System.currentTimeMillis(); -final ScheduledFuture intScheduleFuture = -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture1.complete( -System.currentTimeMillis() - startTime); -return expectedInt; -}, -expectedDelay1.getSize(), -expectedDelay1.getUnit()); -final ScheduledFuture stringScheduledFuture = -endpoint.getMainThreadExecutor() -.schedule( -() -> { -endpoint.validateRunsInMainThread(); -actualDelayMsFuture2.complete( -System.currentTimeMillis() - startTime); -
[GitHub] [flink] flinkbot edited a comment on pull request #15727: [BP-1.13][FLINK-22368] Deque channel after releasing on EndOfPartition
flinkbot edited a comment on pull request #15727: URL: https://github.com/apache/flink/pull/15727#issuecomment-824941920 ## CI report: * 72cc560e0d4292d7de977ccc3aabca2c48d6d518 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17506) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] zentol commented on pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests
zentol commented on pull request #15741: URL: https://github.com/apache/flink/pull/15741#issuecomment-831198755 > For example, I think I could make the test pass if schedule completely ignores the delay and runs the result directly. Hence, there is no longer a guard preventing such a regression. How so? It would call the wrong method on the `MainThreadExecutable`, failing the test. Now if you break the actual scheduling behavior in the `AkkaInvocationHandler` then this can indeed happen, but that reveals further issues in these tests (i.e., that they are reliant on a specific `RpcService` implementation). > I fear that we are actually losing test coverage with these changes We do, of the `AkkaInvocationHandler`. I can try adding a test for the actual scheduling behavior, but I don't think we should revert the proposed changes.
[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests
zentol commented on a change in pull request #15741: URL: https://github.com/apache/flink/pull/15741#discussion_r625017325
[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests
zentol commented on a change in pull request #15741: URL: https://github.com/apache/flink/pull/15741#discussion_r625016221
[GitHub] [flink] zentol commented on a change in pull request #15741: [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests
zentol commented on a change in pull request #15741: URL: https://github.com/apache/flink/pull/15741#discussion_r625016153
[GitHub] [flink] rkhachatryan commented on a change in pull request #15728: [FLINK-22379][runtime] CheckpointCoordinator checks the state of all …
rkhachatryan commented on a change in pull request #15728: URL: https://github.com/apache/flink/pull/15728#discussion_r625010241 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculatorTest.java ## @@ -149,18 +150,80 @@ public void testComputeWithMultipleLevels() throws Exception { } @Test -public void testWithTriggeredTasksNotRunning() throws Exception { +public void testNotRunningOneOfSourcesTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with one RUNNING source and NOT RUNNING source. +FunctionWithException twoSourcesBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id) +.addJobVertex(new JobVertexID()) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) +.build(); + +// when: Creating the checkpoint plan. +runWithNotRunTask(twoSourcesBuilder); + +// then: The plan failed because one task didn't have RUNNING state. +} + +@Test +public void testNotRunningSingleSourceTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with one NOT RUNNING source and RUNNING not source task. +FunctionWithException sourceAndNotSourceBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id) +.addJobVertex(new JobVertexID(), false) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) +.build(); + +// when: Creating the checkpoint plan. +runWithNotRunTask(sourceAndNotSourceBuilder); + +// then: The plan failed because one task didn't have RUNNING state. +} + +@Test +public void testNotRunningOneOfNotSourcesTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with NOT RUNNING not source and RUNNING not source task. 
+FunctionWithException twoNotSourceBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id, false) +.addJobVertex(new JobVertexID(), false) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) +.build(); + +// when: Creating the checkpoint plan. +runWithNotRunTask(twoNotSourceBuilder); + +// then: The plan failed because one task didn't have RUNNING state. +} + +@Test +public void testNotRunningSingleNotSourceTriggeredTasksNotRunning() throws Exception { +// given: Execution graph builder with NOT RUNNING not source and RUNNING source tasks. +FunctionWithException sourceAndNotSourceBuilder = +(id) -> +new CheckpointCoordinatorTestingUtils.CheckpointExecutionGraphBuilder() +.addJobVertex(id, false) +.addJobVertex(new JobVertexID()) +.setTransitToRunning(vertex -> !vertex.getJobvertexId().equals(id)) Review comment: I think this line doesn't have any effect since later this vertex `id` transitions to some non-RUNNING state unconditionally.
[jira] [Updated] (FLINK-22550) Flink-docker dev-master works against 1.11-SNAPSHOT
[ https://issues.apache.org/jira/browse/FLINK-22550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-22550: --- Labels: pull-request-available (was: ) > Flink-docker dev-master works against 1.11-SNAPSHOT > --- > > Key: FLINK-22550 > URL: https://issues.apache.org/jira/browse/FLINK-22550 > Project: Flink > Issue Type: Bug > Components: flink-docker >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > > The dev-master branch is supposed to work against the latest snapshot version > of Flink, but it currently points to 1.11. > We also need to update the release guide to ensure this is updated when a > release branch is cut.
[GitHub] [flink-docker] zentol commented on pull request #74: Updated maintainers
zentol commented on pull request #74: URL: https://github.com/apache/flink-docker/pull/74#issuecomment-831190343 Do we need to list specific people, or can we just point to the Flink project as a whole?
[GitHub] [flink-docker] zentol opened a new pull request #79: [FLINK-22550] Run tests against 1.14-SNAPSHOT
zentol opened a new pull request #79: URL: https://github.com/apache/flink-docker/pull/79
[jira] [Created] (FLINK-22550) Flink-docker dev-master works against 1.11-SNAPSHOT
Chesnay Schepler created FLINK-22550: Summary: Flink-docker dev-master works against 1.11-SNAPSHOT Key: FLINK-22550 URL: https://issues.apache.org/jira/browse/FLINK-22550 Project: Flink Issue Type: Bug Components: flink-docker Reporter: Chesnay Schepler Assignee: Chesnay Schepler The dev-master branch is supposed to work against the latest snapshot version of Flink, but it currently points to 1.11. We also need to update the release guide to ensure this is updated when a release branch is cut.
[GitHub] [flink] flinkbot edited a comment on pull request #15824: (1.11) [FLINK-20383][runtime] Fix race condition in notification.
flinkbot edited a comment on pull request #15824: URL: https://github.com/apache/flink/pull/15824#issuecomment-831161678 ## CI report: * f741e9ec3edf512b0adf4eb17548f9bf073f00ff Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=17512)
[GitHub] [flink-web] MarkSfik commented on a change in pull request #438: Release blog post (update)
MarkSfik commented on a change in pull request #438: URL: https://github.com/apache/flink-web/pull/438#discussion_r624998785 ## File path: _posts/2021-04-22-release-1.13.0.md ## @@ -0,0 +1,590 @@ +--- +layout: post +title: "Apache Flink 1.13.0 Release Announcement" +date: 2021-04-22T08:00:00.000Z +categories: news +authors: +- stephan: + name: "Stephan Ewen" + twitter: "StephanEwen" +- dwysakowicz: + name: "Dawid Wysakowicz" + twitter: "dwysakowicz" + +excerpt: The Apache Flink community is excited to announce the release of Flink 1.13.0! Around 200 contributors worked on over 1,000 issues to bring significant improvements to usability and observability as well as new features that improve elasticity of Flink’s Application-style deployments. +--- + + +The Apache Flink community is excited to announce the release of Flink 1.13.0! More than 200 +contributors worked on over 1,000 issues for this new version. + +The release brings us a big step forward in one of our major efforts: **Making Stream Processing +Applications as natural and as simple to manage as any other application.** The new *reactive scaling* +mode means that scaling streaming applications in and out now works like in any other application, +by just changing the number of parallel processes. + +The release also prominently features a **series of improvements that help users better understand the performance of +applications.** When the streams don't flow as fast as you'd hope, these can help you to understand +why: Load and *backpressure visualization* to identify bottlenecks, *CPU flame graphs* to identify hot +code paths in your application, and *State Access Latencies* to see how the State Backends are keeping +up. + +Beyond those features, the Flink community has added a ton of improvements all over the system, +some of which we discuss in this article. We hope you enjoy the new release and features. 
+Towards the end of the article, we describe changes to be aware of when upgrading +from earlier versions of Apache Flink. + +{% toc %} + +We encourage you to [download the release](https://flink.apache.org/downloads.html) and share your +feedback with the community through +the [Flink mailing lists](https://flink.apache.org/community.html#mailing-lists) +or [JIRA](https://issues.apache.org/jira/projects/FLINK/summary). + + + +# Notable Features + +## Reactive Scaling + +Reactive Scaling is the latest piece in Flink's initiative to make Stream Processing +Applications as natural and as simple to manage as any other application. + +Flink has a dual nature when it comes to resource management and deployments: You can deploy +Flink applications onto resource orchestrators like Kubernetes or Yarn in such a way that Flink actively manages +the resources, and allocates and releases workers as needed. That is especially useful for jobs and +applications that rapidly change their required resources, like batch applications and ad-hoc SQL +queries. The application parallelism rules, the number of workers follows. In the context of Flink +applications, we call this *active scaling*. + +For long-running streaming applications, it is often a nicer model to just deploy them like any +other long-running application: The application doesn't really need to know that it runs on K8s, +EKS, Yarn, etc. and doesn't try to acquire a specific number of workers; instead, it just uses the +number of workers that is given to it. The number of workers rules, the application parallelism +adjusts to that. In the context of Flink, we call that *reactive scaling*. + +The [Application Deployment Mode]({{ site.DOCS_BASE_URL }}flink-docs-release-1.13/docs/concepts/flink-architecture/#flink-application-execution) +started this effort, making deployments more application-like (by avoiding two separate deployment +steps to (1) start cluster and (2) submit application).
The reactive scaling mode completes this, +and you no longer need extra tools (scripts or a K8s operator) to keep the number +of workers and the application parallelism settings in sync. + +You can now put an auto-scaler around Flink applications just as around other typical applications, as +long as you are mindful of the cost of rescaling when configuring the autoscaler: Stateful +streaming applications must move state around when scaling. + +To try the reactive scaling mode, add the `scheduler-mode: reactive` config entry and deploy +an application cluster ([standalone]({{ site.DOCS_BASE_URL }}flink-docs-release-1.13/docs/deployment/resource-providers/standalone/overview/#application-mode) or [Kubernetes]({{ site.DOCS_BASE_URL }}flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#deploy-application-cluster)). Check out [the reactive scaling docs]({{ site.DOCS_BASE_URL }}flink-docs-release-1.13/docs/deployment/elastic_scaling/#reactive-mode) for more details. + + +## Analyzing Application