[GitHub] [flink] flinkbot edited a comment on pull request #14361: [FLINK-19435][connectors/jdbc] Fix deadlock when loading different driver classes concurrently using Class.forName
flinkbot edited a comment on pull request #14361:
URL: https://github.com/apache/flink/pull/14361#issuecomment-742899629

## CI report:

* c58cf4c33c3c645048409a3ea8664c330c176a25 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10819)
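The deadlock named in the PR title is a known JVM failure mode: `Class.forName` holds a class-loader lock while a JDBC driver's static initializer calls into the synchronized `DriverManager`, so two threads loading *different* drivers concurrently can block each other. Below is a minimal sketch of one common mitigation, serializing driver loading through a single lock; the class and method names are hypothetical, and this is not necessarily the exact fix the PR implements.

```java
import java.sql.Driver;

// Hypothetical illustration of one mitigation -- not the PR's actual code.
public final class DriverLoader {

    // Funneling all driver loads through one lock prevents the
    // class-loader-lock vs. DriverManager-lock cycle that can arise when
    // two threads call Class.forName on different drivers at once.
    private static final Object DRIVER_LOAD_LOCK = new Object();

    private DriverLoader() {}

    public static Class<? extends Driver> loadDriver(String driverClassName, ClassLoader cl)
            throws ClassNotFoundException {
        synchronized (DRIVER_LOAD_LOCK) {
            return Class.forName(driverClassName, true, cl).asSubclass(Driver.class);
        }
    }
}
```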
[GitHub] [flink] flinkbot edited a comment on pull request #14370: [FLINK-20582][docs] Fix typos in `CREATE Statements` docs.
flinkbot edited a comment on pull request #14370:
URL: https://github.com/apache/flink/pull/14370#issuecomment-743701239

## CI report:

* 959c736db926a699ddd78202c44e4faec0394110 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10821)
[GitHub] [flink] flinkbot edited a comment on pull request #14370: [FLINK-20582][docs] Fix typos in `CREATE Statements` docs.
flinkbot edited a comment on pull request #14370:
URL: https://github.com/apache/flink/pull/14370#issuecomment-743701239

## CI report:

* 959c736db926a699ddd78202c44e4faec0394110 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10821)
[GitHub] [flink] flinkbot commented on pull request #14370: [FLINK-20582][docs] Fix typos in `CREATE Statements` docs.
flinkbot commented on pull request #14370:
URL: https://github.com/apache/flink/pull/14370#issuecomment-743701239

## CI report:

* 959c736db926a699ddd78202c44e4faec0394110 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #14362: [FLINK-20540][jdbc] Fix the defaultUrl for AbstractJdbcCatalog in JDBC connector
flinkbot edited a comment on pull request #14362:
URL: https://github.com/apache/flink/pull/14362#issuecomment-742906429

## CI report:

* 007bdcc0807864eda87a99c32eebd9b7f96b613c Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10784)
* aa34f060a06041b6e84afa54c8ae5507dba0342d Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10820)
[GitHub] [flink] flinkbot commented on pull request #14370: [FLINK-20582][docs] Fix typos in `CREATE Statements` docs.
flinkbot commented on pull request #14370:
URL: https://github.com/apache/flink/pull/14370#issuecomment-743698289

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

## Automated Checks

Last check on commit 959c736db926a699ddd78202c44e4faec0394110 (Sat Dec 12 04:11:13 UTC 2020)

**Warnings:**
* **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-20582).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work.

Mention the bot in a comment to re-run the automated checks.

## Review Progress

* ❓ 1. The [description] looks good.
* ❓ 2. There is [consensus] that the contribution should go into Flink.
* ❓ 3. Needs [attention] from.
* ❓ 4. The change fits into the overall [architecture].
* ❓ 5. Overall code [quality] is good.

Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot is tracking the review progress through labels, which are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands:
- `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until `architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Updated] (FLINK-20582) Fix typos in `CREATE Statements` docs
[ https://issues.apache.org/jira/browse/FLINK-20582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-20582:
-----------------------------------
    Labels: pull-request-available  (was: )

> Fix typos in `CREATE Statements` docs
> -------------------------------------
>
>                 Key: FLINK-20582
>                 URL: https://issues.apache.org/jira/browse/FLINK-20582
>             Project: Flink
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 1.12.0
>            Reporter: xiaozilong
>            Priority: Major
>              Labels: pull-request-available
[GitHub] [flink] V1ncentzzZ opened a new pull request #14370: [FLINK-20582][docs] Fix typos in `CREATE Statements` docs.
V1ncentzzZ opened a new pull request #14370:
URL: https://github.com/apache/flink/pull/14370

## What is the purpose of the change

Fix typos in the [CREATE Statements](https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/create.html) docs.

## Brief change log

*(for example:)*
- *The TaskInfo is stored in the blob store on job creation time as a persistent artifact*
- *Deployments RPC transmits only the blob storage reference*
- *TaskManagers retrieve the TaskInfo from the blob cache*

## Verifying this change

*(Please pick either of the following options)*

This change is a trivial rework / code cleanup without any test coverage.

*(or)*

This change is already covered by existing tests, such as *(please describe tests)*.

*(or)*

This change added tests and can be verified as follows:

*(example:)*
- *Added integration tests for end-to-end deployment with large payloads (100MB)*
- *Extended integration test for recovery after master (JobManager) failure*
- *Added test that validates that TaskInfo is transferred only once across recoveries*
- *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.*

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (yes / no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no)
- The serializers: (yes / no / don't know)
- The runtime per-record code paths (performance sensitive): (yes / no / don't know)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know)
- The S3 file system connector: (yes / no / don't know)

## Documentation

- Does this pull request introduce a new feature? (yes / no)
- If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
[jira] [Created] (FLINK-20582) Fix typos in `CREATE Statements` docs
xiaozilong created FLINK-20582:
----------------------------------

             Summary: Fix typos in `CREATE Statements` docs
                 Key: FLINK-20582
                 URL: https://issues.apache.org/jira/browse/FLINK-20582
             Project: Flink
          Issue Type: Bug
          Components: Documentation
    Affects Versions: 1.12.0
            Reporter: xiaozilong
[GitHub] [flink] flinkbot edited a comment on pull request #14362: [FLINK-20540][jdbc] Fix the defaultUrl for AbstractJdbcCatalog in JDBC connector
flinkbot edited a comment on pull request #14362:
URL: https://github.com/apache/flink/pull/14362#issuecomment-742906429

## CI report:

* 007bdcc0807864eda87a99c32eebd9b7f96b613c Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10784)
* aa34f060a06041b6e84afa54c8ae5507dba0342d UNKNOWN
[jira] [Issue Comment Deleted] (FLINK-20389) UnalignedCheckpointITCase failure caused by NullPointerException
[ https://issues.apache.org/jira/browse/FLINK-20389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Huang Xingbo updated FLINK-20389:
---------------------------------
    Comment: was deleted

(was: I'm reopening it due to the failure frequency.)

> UnalignedCheckpointITCase failure caused by NullPointerException
> ----------------------------------------------------------------
>
>                 Key: FLINK-20389
>                 URL: https://issues.apache.org/jira/browse/FLINK-20389
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.12.0, 1.13.0
>            Reporter: Matthias
>            Assignee: Matthias
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.12.0
>
>         Attachments: FLINK-20389-failure.log
>
> [Build|https://dev.azure.com/mapohl/flink/_build/results?buildId=118&view=results] failed due to {{UnalignedCheckpointITCase}} caused by a {{NullPointerException}}:
> {code:java}
> Test execute[Parallel cogroup, p = 10](org.apache.flink.test.checkpointing.UnalignedCheckpointITCase) failed with:
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>     at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
>     at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$2(MiniClusterJobClient.java:119)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>     at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
>     at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:229)
>     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
>     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>     at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
>     at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:996)
>     at akka.dispatch.OnComplete.internal(Future.scala:264)
>     at akka.dispatch.OnComplete.internal(Future.scala:261)
>     at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
>     at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
>     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
>     at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
>     at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
>     at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
>     at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
>     at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
>     at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
>     at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
>     at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
>     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
>     at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
>     at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
>     at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
>     at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
>     at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
>     at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
>     at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
>     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=5, backoffTimeMS=100)
>     at org.apache.flink.runtime.execut
[jira] [Reopened] (FLINK-20389) UnalignedCheckpointITCase failure caused by NullPointerException
[ https://issues.apache.org/jira/browse/FLINK-20389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Huang Xingbo reopened FLINK-20389:
----------------------------------

I'm reopening it due to the failure frequency.

> UnalignedCheckpointITCase failure caused by NullPointerException
[jira] [Commented] (FLINK-20389) UnalignedCheckpointITCase failure caused by NullPointerException
[ https://issues.apache.org/jira/browse/FLINK-20389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248280#comment-17248280 ]

Huang Xingbo commented on FLINK-20389:
--------------------------------------

I'm reopening it due to the failure frequency.

> UnalignedCheckpointITCase failure caused by NullPointerException
[jira] [Commented] (FLINK-20389) UnalignedCheckpointITCase failure caused by NullPointerException
[ https://issues.apache.org/jira/browse/FLINK-20389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248279#comment-17248279 ]

Huang Xingbo commented on FLINK-20389:
--------------------------------------

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10818&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc]

> UnalignedCheckpointITCase failure caused by NullPointerException
[jira] [Commented] (FLINK-20389) UnalignedCheckpointITCase failure caused by NullPointerException
[ https://issues.apache.org/jira/browse/FLINK-20389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248277#comment-17248277 ]

Huang Xingbo commented on FLINK-20389:
--------------------------------------

Similar case: [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10816&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=6dff16b1-bf54-58f3-23c6-76282f49a185]

> UnalignedCheckpointITCase failure caused by NullPointerException
[GitHub] [flink] flinkbot edited a comment on pull request #14361: [FLINK-19435][connectors/jdbc] Fix deadlock when loading different driver classes concurrently using Class.forName
flinkbot edited a comment on pull request #14361:
URL: https://github.com/apache/flink/pull/14361#issuecomment-742899629

## CI report:

* fc55e236955b84396bed5f18851cd9dd0425060a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10783)
* c58cf4c33c3c645048409a3ea8664c330c176a25 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10819)
[GitHub] [flink] flinkbot edited a comment on pull request #14361: [FLINK-19435][connectors/jdbc] Fix deadlock when loading different driver classes concurrently using Class.forName
flinkbot edited a comment on pull request #14361:
URL: https://github.com/apache/flink/pull/14361#issuecomment-742899629

## CI report:

* fc55e236955b84396bed5f18851cd9dd0425060a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10783)
* c58cf4c33c3c645048409a3ea8664c330c176a25 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #13879: [FLINK-19832][coordination] Do not schedule shared slot bulk if some slots have failed immediately
flinkbot edited a comment on pull request #13879:
URL: https://github.com/apache/flink/pull/13879#issuecomment-720415853

## CI report:

* 52a32b3e279a3c17dfe865e6d09e1b57d5b29bfc Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10815)
[GitHub] [flink] flinkbot edited a comment on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
flinkbot edited a comment on pull request #14312:
URL: https://github.com/apache/flink/pull/14312#issuecomment-738876739

## CI report:

* 9f3deb69af6892fc676aa87e1d1b55981eded305 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10814)
[GitHub] [flink] flinkbot edited a comment on pull request #13879: [FLINK-19832][coordination] Do not schedule shared slot bulk if some slots have failed immediately
flinkbot edited a comment on pull request #13879:
URL: https://github.com/apache/flink/pull/13879#issuecomment-720415853

## CI report:

* 38c6bf78fd788dce6916150e8c0d7a95bc51144f Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10700)
* 52a32b3e279a3c17dfe865e6d09e1b57d5b29bfc Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10815)
[GitHub] [flink] flinkbot edited a comment on pull request #13879: [FLINK-19832][coordination] Do not schedule shared slot bulk if some slots have failed immediately
flinkbot edited a comment on pull request #13879:
URL: https://github.com/apache/flink/pull/13879#issuecomment-720415853

## CI report:

* 38c6bf78fd788dce6916150e8c0d7a95bc51144f Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10700)
* 52a32b3e279a3c17dfe865e6d09e1b57d5b29bfc UNKNOWN
[jira] [Updated] (FLINK-20580) Missing null value handling for SerializedValue's getByteArray()
[ https://issues.apache.org/jira/browse/FLINK-20580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias updated FLINK-20580:
-----------------------------
    Description:
{{SerializedValue}} allows to wrap {{null}} values. Because of this, {{SerializedValue.getByteArray()}} might return {{null}}, which is not properly handled in different locations (it's probably best to use the IDE's "Find usages" to identify these locations). The most recent findings (for now) are listed in the comments.

We should add null handling in these cases and add tests for these cases.

  was:
{{SerializedValue}} allows to wrap {{null}} values. Because of this, {{SerializedValue.getByteArray()}} might return {{null}}, which is not properly handled in different locations.

We should add null handling in these cases and add tests for these cases.

> Missing null value handling for SerializedValue's getByteArray()
> -----------------------------------------------------------------
>
>                 Key: FLINK-20580
>                 URL: https://issues.apache.org/jira/browse/FLINK-20580
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Type Serialization System
>    Affects Versions: 1.13.0
>            Reporter: Matthias
>            Priority: Minor
>              Labels: starter
>
> {{SerializedValue}} allows to wrap {{null}} values. Because of this, {{SerializedValue.getByteArray()}} might return {{null}}, which is not properly handled in different locations (it's probably best to use the IDE's "Find usages" to identify these locations). The most recent findings (for now) are listed in the comments.
>
> We should add null handling in these cases and add tests for these cases.
[GitHub] [flink] tillrohrmann commented on pull request #13879: [FLINK-19832][coordination] Do not schedule shared slot bulk if some slots have failed immediately
tillrohrmann commented on pull request #13879:
URL: https://github.com/apache/flink/pull/13879#issuecomment-743332830

I've added a test case for the problem and rebased on the latest master. Once AZP gives green light, I will merge this PR.
[jira] [Commented] (FLINK-20575) flink application failed to restore from check-point
[ https://issues.apache.org/jira/browse/FLINK-20575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248056#comment-17248056 ]

Steven Zhen Wu commented on FLINK-20575:
----------------------------------------

Just from personal experience: because this is an interrupted exception, it may be a symptom of a job cancellation that got stuck and was later aborted forcefully. Maybe check whether there is any other exception before this one among all TMs.

> flink application failed to restore from check-point
> ----------------------------------------------------
>
>                 Key: FLINK-20575
>                 URL: https://issues.apache.org/jira/browse/FLINK-20575
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.9.1
>            Reporter: Yu Yang
>            Priority: Major
>
> Our flink application failed to restore from a check-point due to com.amazonaws.AbortedException (we use the s3a file system). Initially we thought that the s3 file had some issue. It turned out that we can download the s3 file fine. Any insights on this issue will be very welcome.
>
> 2020-12-11 07:02:40,018 ERROR org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder - Caught unexpected exception.
> java.io.InterruptedIOException: getFileStatus on s3a://mybucket/prod/checkpoints/u/tango/910d2ff2b2c7e01e99a9588d11385e92/shared/f245da83-fc01-424d-9719-d48b99a1ed35: org.apache.flink.fs.s3base.shaded.com.amazonaws.AbortedException:
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:340)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:171)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2187)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:699)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.open(FileSystem.java:950)
>     at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:120)
>     at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:37)
>     at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:85)
>     at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68)
>     at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForStateHandle(RocksDBStateDownloader.java:127)
>     at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.lambda$createDownloadRunnables$0(RocksDBStateDownloader.java:109)
>     at org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50)
>     at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
>     at org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)
>     at java.util.concurrent.CompletableFuture.asyncRunStage(CompletableFuture.java:1640)
>     at java.util.concurrent.CompletableFuture.runAsync(CompletableFuture.java:1858)
>     at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForAllStateHandles(RocksDBStateDownloader.java:83)
>     at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.transferAllStateDataToDirectory(RocksDBStateDownloader.java:66)
>     at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreDBInstanceFromStateHandle(RocksDBIncrementalRestoreOperation.java:406)
>     at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreWithRescaling(RocksDBIncrementalRestoreOperation.java:294)
>     at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restore(RocksDBIncrementalRestoreOperation.java:146)
>     at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:270)
>     at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:520)
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:291)
>     at org.apache.flin
[GitHub] [flink] tillrohrmann commented on a change in pull request #13883: [FLINK-19920][runtime] Remove legacy failover strategy
tillrohrmann commented on a change in pull request #13883:
URL: https://github.com/apache/flink/pull/13883#discussion_r541083866

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java

```diff
@@ -49,7 +49,7 @@
 import org.apache.flink.runtime.concurrent.ScheduledExecutorServiceAdapter;
 import org.apache.flink.runtime.entrypoint.ClusterEntryPointExceptionUtils;
 import org.apache.flink.runtime.execution.ExecutionState;
-import org.apache.flink.runtime.executiongraph.failover.FailoverStrategy;
+import org.apache.flink.runtime.executiongraph.failover.flip1.FailoverStrategy;
```

Review comment: I think this import is not really needed if we adjust the JavaDocs.
[GitHub] [flink] sjwiesman commented on a change in pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
sjwiesman commented on a change in pull request #14312:
URL: https://github.com/apache/flink/pull/14312#discussion_r541083223

## File path: docs/dev/datastream_execution_mode.md

@@ -237,6 +237,36 @@ (added section under review)

### Order of Processing

The order in which records are processed in operators or user defined functions (UDFs) can differ between `BATCH` and `STREAMING` execution.

In `STREAMING` mode, user defined functions should not make any assumptions about the order of incoming records. Records are processed immediately once they arrive in sources.

In `BATCH` execution mode, there are some operations where Flink guarantees order. The ordering can be a side effect of the special task scheduling, network shuffle, and state backend (see above) or it can be a conscious choice by the system.

There are three general types of input that we can differentiate:

- _keyed input_: input from a `KeyedStream`
- _broadcast input_: input from a broadcast stream (see also [Broadcast State]({% link dev/stream/state/broadcast_state.md %}))
- _regular input_: input that isn't any of the above types of input

These are the ordering rules for the different input types

- keyed inputs are processed after all other inputs
- broadcast inputs are processed before regular inputs

As mentioned above, the keyed input will be grouped and Flink will process all records of a keyed group consecutively before processing the next group.

Review comment: @aljoscha What about this?

```suggestion
### Order of Processing

The order in which records are processed in operators or user-defined functions (UDFs) can differ between `BATCH` and `STREAMING` execution.

In `STREAMING` mode, user-defined functions should not make any assumptions about incoming records' order. Data is processed as soon as it arrives.

In `BATCH` execution mode, there are some operations where Flink guarantees order. The ordering can be a side effect of the particular task scheduling, network shuffle, and state backend (see above), or a conscious choice by the system.

There are three general types of input that we can differentiate:

- _broadcast input_: input from a broadcast stream (see also [Broadcast State]({% link dev/stream/state/broadcast_state.md %}))
- _regular input_: input that isn't any of the above types of input
- _keyed input_: input from a `KeyedStream`

Functions, or Operators, that consume multiple input types will process them in the following order:

- broadcast inputs are processed first
- regular inputs are processed second
- keyed inputs are processed last

For functions that consume from multiple regular or broadcast inputs — such as a `CoProcessFunction` — Flink has the right to process data from any input of that type in any order.

For functions that consume from multiple keyed inputs — such as a `KeyedCoProcessFunction` — Flink processes all records for a single key from all keyed inputs before moving on to the next.
```
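The ordering rule in the suggested docs ("broadcast inputs are processed first") has a practical consequence worth illustrating: in `BATCH` mode a function connected to a broadcast stream sees the complete broadcast side before any keyed record. The sketch below assumes Flink's DataStream API (1.12+); the job layout and names are illustrative and not taken from the PR itself.

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class BroadcastBatchOrderingExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        MapStateDescriptor<String, String> rulesDesc =
                new MapStateDescriptor<>("rules", Types.STRING, Types.STRING);

        DataStream<String> events = env.fromElements("e1", "e2", "e3");
        BroadcastStream<String> rules = env.fromElements("rule-a", "rule-b").broadcast(rulesDesc);

        events.keyBy(e -> e)
                .connect(rules)
                .process(new KeyedBroadcastProcessFunction<String, String, String, String>() {
                    @Override
                    public void processBroadcastElement(String rule, Context ctx, Collector<String> out)
                            throws Exception {
                        // In BATCH mode this runs for all broadcast records
                        // before any keyed record is processed.
                        ctx.getBroadcastState(rulesDesc).put(rule, rule);
                    }

                    @Override
                    public void processElement(String event, ReadOnlyContext ctx, Collector<String> out)
                            throws Exception {
                        // Safe to assume the broadcast state is complete here.
                        out.collect(event + " / rules seen: "
                                + ctx.getBroadcastState(rulesDesc).immutableEntries());
                    }
                })
                .print();

        env.execute("broadcast-ordering-sketch");
    }
}
```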
[jira] [Commented] (FLINK-20580) Missing null value handling for SerializedValue's getByteArray()
[ https://issues.apache.org/jira/browse/FLINK-20580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248019#comment-17248019 ]

Till Rohrmann commented on FLINK-20580:
---------------------------------------

True, you are right [~jackwangcs]. I did not mention the {{AkkaRpcActor}} because I am fixing this problem via FLINK-20521.

> Missing null value handling for SerializedValue's getByteArray()
[jira] [Commented] (FLINK-20580) Missing null value handling for SerializedValue's getByteArray()
[ https://issues.apache.org/jira/browse/FLINK-20580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248007#comment-17248007 ]

Jie Wang commented on FLINK-20580:
----------------------------------

Hi [~trohrmann], I also find that {{SerializedValue.getByteArray()}} is called in {{AkkaRpcActor}} and {{RemoteRpcInvocation}}, and they both have some problematic code that checks the length of the serialized data via {{SerializedValue.getByteArray().length}}.

> Missing null value handling for SerializedValue's getByteArray()
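For illustration, a minimal sketch of the null-safe access the ticket asks for, assuming only what the ticket states: `SerializedValue.getByteArray()` can return `null` when a `null` value was wrapped. The helper class and method below are hypothetical, not existing Flink code.

```java
import org.apache.flink.util.SerializedValue;

// Hypothetical helper -- illustrates the missing null handling.
final class SerializedValues {

    private SerializedValues() {}

    /** Returns the serialized size, treating a wrapped null as zero bytes. */
    static <T> int safeLength(SerializedValue<T> value) {
        byte[] bytes = value.getByteArray();
        // getByteArray() may return null for a wrapped null value, so
        // callers must not call .length on it unconditionally.
        return bytes == null ? 0 : bytes.length;
    }
}
```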
[GitHub] [flink] flinkbot edited a comment on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
flinkbot edited a comment on pull request #14312:
URL: https://github.com/apache/flink/pull/14312#issuecomment-738876739

## CI report:

* ca3900171fb130a6e379e9bf8b4f4f4f5b4c48a4 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10808)
* 9f3deb69af6892fc676aa87e1d1b55981eded305 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10814)
[jira] [Commented] (FLINK-19970) State leak in CEP Operators (expired events/keys not removed from state)
[ https://issues.apache.org/jira/browse/FLINK-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247999#comment-17247999 ] Thomas Wozniakowski commented on FLINK-19970: - [~dwysakowicz] I've emailed you over the code + JAR. Please give me a shout here or over email if you need anything else from me. > State leak in CEP Operators (expired events/keys not removed from state) > > > Key: FLINK-19970 > URL: https://issues.apache.org/jira/browse/FLINK-19970 > Project: Flink > Issue Type: Bug > Components: Library / CEP >Affects Versions: 1.11.2 > Environment: Flink 1.11.2 run using the official docker containers in > AWS ECS Fargate. > 1 Job Manager, 1 Taskmanager with 2vCPUs and 8GB memory >Reporter: Thomas Wozniakowski >Priority: Critical > Attachments: image-2020-11-04-11-35-12-126.png, screenshot-1.png, > screenshot-2.png, screenshot-3.png > > > We have been observing instability in our production environment recently, > seemingly related to state backends. We ended up building a load testing > environment to isolate factors and have discovered that the CEP library > appears to have some serious problems with state expiry. > h2. Job Topology > Source: Kinesis (standard connector) -> keyBy() and forward to... > CEP: Array of simple Keyed CEP Pattern operators (details below) -> forward > output to... > Sink: SQS (custom connector) > The CEP Patterns in the test look like this: > {code:java} > Pattern.begin(SCANS_SEQUENCE, AfterMatchSkipStrategy.skipPastLastEvent()) > .times(20) > .subtype(ScanEvent.class) > .within(Duration.minutes(30)); > {code} > h2. Taskmanager Config > {code:java} > taskmanager.numberOfTaskSlots: $numberOfTaskSlots > taskmanager.data.port: 6121 > taskmanager.rpc.port: 6122 > taskmanager.exit-on-fatal-akka-error: true > taskmanager.memory.process.size: $memoryProcessSize > taskmanager.memory.jvm-metaspace.size: 256m > taskmanager.memory.managed.size: 0m > jobmanager.rpc.port: 6123 > blob.server.port: 6130 > rest.port: 8081 > web.submit.enable: true > fs.s3a.connection.maximum: 50 > fs.s3a.threads.max: 50 > akka.framesize: 250m > akka.watch.threshold: 14 > state.checkpoints.dir: s3://$savepointBucketName/checkpoints > state.savepoints.dir: s3://$savepointBucketName/savepoints > state.backend: filesystem > state.backend.async: true > s3.access-key: $s3AccessKey > s3.secret-key: $s3SecretKey > {code} > (the substitutions are controlled by terraform). > h2. Tests > h4. Test 1 (No key rotation) > 8192 actors (different keys) emitting 1 Scan Event every 10 minutes > indefinitely. Actors (keys) never rotate in or out. > h4. Test 2 (Constant key rotation) > 8192 actors that produce 2 Scan events 10 minutes apart, then retire and > never emit again. The setup creates new actors (keys) as soon as one finishes > so we always have 8192. This test basically constantly rotates the key space. > h2. Results > For both tests, the state size (checkpoint size) grows unbounded and linearly > well past the 30 minute threshold that should have caused old keys or events > to be discard from the state. In the chart below, the left (steep) half is > the 24 hours we ran Test 1, the right (shallow) half is Test 2. My > understanding is that the checkpoint size should level off after ~45 minutes > or so then stay constant. > !image-2020-11-04-11-35-12-126.png! > Could someone please assist us with this? Unless we have dramatically > misunderstood how the CEP library is supposed to function this seems like a > pretty severe bug. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20521) Null result values are being swallowed by RPC system
[ https://issues.apache.org/jira/browse/FLINK-20521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247995#comment-17247995 ] Till Rohrmann commented on FLINK-20521: --- I am not entirely sure whether this failure is caused by not being able to return a {{null}} value from the RPC. The {{TaskExecutor.submitTask}} method should not return a {{null}} value. > Null result values are being swallowed by RPC system > > > Key: FLINK-20521 > URL: https://issues.apache.org/jira/browse/FLINK-20521 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.12.0, 1.11.2 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Major > Labels: pull-request-available > Fix For: 1.13.0, 1.12.1 > > > If an RPC method returns a {{null}} value, then it seems that the request > future won't get completed as reported in FLINK-17921. > We should either not allow to return {{null}} values as responses or make > sure that a {{null}} value is properly transmitted to the caller. -- This message was sent by Atlassian Jira (v8.3.4#803005)
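For readers skimming the thread, here is a hedged sketch of the second option named in the ticket, making a null result representable on the wire; the method below is hypothetical and is not Flink's actual RPC plumbing:

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;

import org.apache.flink.util.SerializedValue;

/** Hypothetical sketch only; Flink's real response path lives in its Akka-based RPC classes. */
final class NullSafeResponses {

    // Wrapping the result keeps "payload is null" distinguishable from
    // "no response was sent", so the caller's request future still completes.
    static CompletableFuture<SerializedValue<Object>> respond(Object result) throws IOException {
        return CompletableFuture.completedFuture(new SerializedValue<>(result)); // result may be null
    }
}
```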
[jira] [Commented] (FLINK-20580) Missing null value handling for SerializedValue's getByteArray()
[ https://issues.apache.org/jira/browse/FLINK-20580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247991#comment-17247991 ] Till Rohrmann commented on FLINK-20580: --- The problematic code locations are {{BlobWriter}} and {{SerializedValueSerializer}}. > Missing null value handling for SerializedValue's getByteArray() > - > > Key: FLINK-20580 > URL: https://issues.apache.org/jira/browse/FLINK-20580 > Project: Flink > Issue Type: Bug > Components: API / Type Serialization System >Affects Versions: 1.13.0 >Reporter: Matthias >Priority: Minor > Labels: starter > > {{SerializedValue}} allows to wrap {{null}} values. Because of this, > {{SerializedValue.getByteArray()}} might return {{null}} which is not > properly handled in different locations. > We should add null handling in these cases and add tests for these cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tillrohrmann commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
tillrohrmann commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r541045997 ## File path: flink-core/src/main/java/org/apache/flink/util/SerializedValue.java ## @@ -63,6 +66,7 @@ public T deserializeValue(ClassLoader loader) throws IOException, ClassNotFoundE * * @return Serialized data. */ + @Nullable Review comment: Thanks @XComp. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on a change in pull request #13641: [FLINK-17760][tests] Rework tests to not rely on legacy scheduling codes in ExecutionGraph components
tillrohrmann commented on a change in pull request #13641: URL: https://github.com/apache/flink/pull/13641#discussion_r540934025 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/metrics/RestartTimeGaugeTest.java ## @@ -0,0 +1,83 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.executiongraph.metrics; + +import org.apache.flink.api.common.JobStatus; +import org.apache.flink.runtime.executiongraph.TestingJobStatusProvider; +import org.apache.flink.util.TestLogger; + +import org.junit.Test; + +import java.util.HashMap; +import java.util.Map; + +import static org.hamcrest.Matchers.greaterThan; +import static org.hamcrest.Matchers.is; +import static org.junit.Assert.assertThat; + +/** + * Tests for {@link RestartTimeGauge}. + */ +public class RestartTimeGaugeTest extends TestLogger { + + @Test + public void testNotRestarted() { + final RestartTimeGauge gauge = new RestartTimeGauge(new TestingJobStatusProvider(JobStatus.RUNNING, -1)); + assertThat(gauge.getValue(), is(0L)); + } + + @Test + public void testInRestarting() { + final Map statusTimestampMap = new HashMap<>(); + statusTimestampMap.put(JobStatus.RESTARTING, 1L); + + final RestartTimeGauge gauge = new RestartTimeGauge( + new TestingJobStatusProvider( + JobStatus.RESTARTING, + status -> statusTimestampMap.getOrDefault(status, -1L))); + // System.currentTimeMillis() is surely to be larger than 123L Review comment: The comment seems outdated. ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java ## @@ -82,202 +82,12 @@ private final TestingComponentMainThreadExecutor testMainThreadUtil = EXECUTOR_RESOURCE.getComponentMainThreadTestExecutor(); - /** -* Tests that slots are released if we cannot assign the allocated resource to the -* Execution. 
-*/ - @Test - public void testSlotReleaseOnFailedResourceAssignment() throws Exception { - final JobVertex jobVertex = createNoOpJobVertex(); - final JobVertexID jobVertexId = jobVertex.getID(); - - final CompletableFuture slotFuture = new CompletableFuture<>(); - final ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1); - slotProvider.addSlot(jobVertexId, 0, slotFuture); - - ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph( - slotProvider, - new NoRestartStrategy(), - jobVertex); - - executionGraph.start(ComponentMainThreadExecutorServiceAdapter.forMainThread()); - - ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(jobVertexId); - - final Execution execution = executionJobVertex.getTaskVertices()[0].getCurrentExecutionAttempt(); - - final SingleSlotTestingSlotOwner slotOwner = new SingleSlotTestingSlotOwner(); - - final LogicalSlot slot = createTestingLogicalSlot(slotOwner); - - final LogicalSlot otherSlot = new TestingLogicalSlotBuilder().createTestingLogicalSlot(); - - CompletableFuture allocationFuture = execution.allocateResourcesForExecution( - executionGraph.getSlotProviderStrategy(), - LocationPreferenceConstraint.ALL, - Collections.emptySet()); - - assertFalse(allocationFuture.isDone()); - - assertEquals(ExecutionState.SCHEDULED, execution.getState()); - - // assign a different resource to the execution - assertTrue(execution.tryAssignResource(otherSlot)); - - // completing now the future should cause the slot to be released - slotFuture.complete(slot); - - assertEquals(slot, slotOwner.getReturnedSlotFuture().get()); - } - private
[GitHub] [flink] flinkbot edited a comment on pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
flinkbot edited a comment on pull request #14359: URL: https://github.com/apache/flink/pull/14359#issuecomment-742715028 ## CI report: * cc251da64dd21ecbf22a7fbe4cf2a041ef38a317 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10812) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
flinkbot edited a comment on pull request #14312: URL: https://github.com/apache/flink/pull/14312#issuecomment-738876739 ## CI report: * ca3900171fb130a6e379e9bf8b4f4f4f5b4c48a4 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10808) * 9f3deb69af6892fc676aa87e1d1b55981eded305 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] aljoscha commented on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
aljoscha commented on pull request #14312: URL: https://github.com/apache/flink/pull/14312#issuecomment-743264798 @sjwiesman I'm also adding documentation here, it's a bit terse but still helpful, I think. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zhangmeng0426 commented on pull request #13470: [FLINK-19396][Connectors/Kafka]fix properties type cast error
zhangmeng0426 commented on pull request #13470: URL: https://github.com/apache/flink/pull/13470#issuecomment-743261103 Can anybody help? @wuchong This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13907: [FLINK-19942][Connectors / JDBC]Support sink parallelism configuration to JDBC connector
flinkbot edited a comment on pull request #13907: URL: https://github.com/apache/flink/pull/13907#issuecomment-721230727 ## CI report: * f4f1cf14d4d413c4c87881516215c7d5be64 UNKNOWN * 6acdd717e9e7a4f23bf7d798e72b8e50c5ff4e83 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10810) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong commented on pull request #13932: [FLINK-19947][Connectors / Common]Support sink parallelism configuration for Print connector
wuchong commented on pull request #13932: URL: https://github.com/apache/flink/pull/13932#issuecomment-743244942 The build compilation failed, please have a look @zhuxiaoshang This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
flinkbot edited a comment on pull request #14312: URL: https://github.com/apache/flink/pull/14312#issuecomment-738876739 ## CI report: * ca3900171fb130a6e379e9bf8b4f4f4f5b4c48a4 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10808) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13907: [FLINK-19942][Connectors / JDBC]Support sink parallelism configuration to JDBC connector
flinkbot edited a comment on pull request #13907: URL: https://github.com/apache/flink/pull/13907#issuecomment-721230727 ## CI report: * f4f1cf14d4d413c4c87881516215c7d5be64 UNKNOWN * 7137b6fcfa8f92478118215f47042927c49f4752 Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10805) * 6acdd717e9e7a4f23bf7d798e72b8e50c5ff4e83 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10810) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14338: [FLINK-20537][hive] Failed to call Hive UDF with string literal argum…
flinkbot edited a comment on pull request #14338: URL: https://github.com/apache/flink/pull/14338#issuecomment-740638158 ## CI report: * 872de3bf88035b995f466b2051d22416782c697e Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10809) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-20581) java.lang.NoClassDefFoundError: Could not initialize class org.apache.flink.util.JavaGcCleanerWrapper
huzeming created FLINK-20581: Summary: java.lang.NoClassDefFoundError: Could not initialize class org.apache.flink.util.JavaGcCleanerWrapper Key: FLINK-20581 URL: https://issues.apache.org/jira/browse/FLINK-20581 Project: Flink Issue Type: Bug Reporter: huzeming flink 1.10.2 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] XComp commented on a change in pull request #14367: [FLINK-20352][docs] Add PyFlink job submission section under the Advanced CLI section.
XComp commented on a change in pull request #14367: URL: https://github.com/apache/flink/pull/14367#discussion_r540949456 ## File path: docs/deployment/cli.md ## @@ -338,4 +338,74 @@ specified in the `config/flink-config.yaml`. For more details on the commands and the available options, please refer to the Resource Provider-specific pages of the documentation. +### Submitting PyFlink Jobs + +Currently, users are able to submit a PyFlink job via the CLI. It does not require to specify the +JAR file path or the entry main class, which is different from the Java job submission. + +Note When submitting Python job via `flink run`, Flink will run the command "python". Please run the following command to confirm that the python executable in current environment points to a supported Python version of 3.5+. +{% highlight bash %} +$ python --version +# the version printed here must be 3.5+ +{% endhighlight %} + +The following commands show different PyFlink job submission use-cases: + +- Run a Python Table job: +{% highlight bash %} +$ ./bin/flink run --python examples/python/table/batch/word_count.py +{% endhighlight %} + +- Run a Python Table job with pyFiles: +{% highlight bash %} +$ ./bin/flink run \ + --python examples/python/table/batch/word_count.py \ + --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt +{% endhighlight %} + +- Run a Python Table job with a JAR file: +{% highlight bash %} +$ ./bin/flink run \ + --python examples/python/table/batch/word_count.py \ + --jarfile +{% endhighlight %} + +- Run a Python Table job with pyFiles and pyModule: +{% highlight bash %} +$ ./bin/flink run \ + --pyModule batch.word_count \ + --pyFiles examples/python/table/batch +{% endhighlight %} + +- Submit a Python Table job on a specific JobManager running on host `` (adapt the command accordingly): +{% highlight bash %} +$ ./bin/flink run \ + --jobmanager :8081 \ + --python examples/python/table/batch/word_count.py +{% endhighlight %} + +- Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#run-a-single-flink-job-on-hadoop-yarn): Review comment: ```suggestion - Run a Python Table job using a [YARN cluster in Per-Job Mode]({% link deployment/resource-providers/yarn.md %}#per-job-cluster-mode %): ``` ## File path: docs/deployment/cli.md ## @@ -338,4 +338,74 @@ specified in the `config/flink-config.yaml`. For more details on the commands and the available options, please refer to the Resource Provider-specific pages of the documentation. +### Submitting PyFlink Jobs + +Currently, users are able to submit a PyFlink job via the CLI. It does not require to specify the +JAR file path or the entry main class, which is different from the Java job submission. + +Note When submitting Python job via `flink run`, Flink will run the command "python". Please run the following command to confirm that the python executable in current environment points to a supported Python version of 3.5+. +{% highlight bash %} +$ python --version +# the version printed here must be 3.5+ +{% endhighlight %} + +The following commands show different PyFlink job submission use-cases: + +- Run a Python Table job: Review comment: Thinking about it: Why do we call it Python Table job? Isn't PyFlink job the correct way of labelling it? This applies to all the occurrences below as well. ## File path: docs/deployment/cli.zh.md ## @@ -337,4 +337,74 @@ specified in the `config/flink-config.yaml`. 
For more details on the commands and the available options, please refer to the Resource Provider-specific pages of the documentation. +### Submitting PyFlink Jobs + +Currently, users are able to submit a PyFlink job via the CLI. It does not require to specify the +JAR file path or the entry main class, which is different from the Java job submission. + +Note When submitting Python job via `flink run`, Flink will run the command "python". Please run the following command to confirm that the python executable in current environment points to a supported Python version of 3.5+. +{% highlight bash %} +$ python --version +# the version printed here must be 3.5+ +{% endhighlight %} + +The following commands show different PyFlink job submission use-cases: + +- Run a Python Table job: +{% highlight bash %} +$ ./bin/flink run --python examples/python/table/batch/word_count.py +{% endhighlight %} + +- Run a Python Table job with pyFiles: +{% highlight bash %} +$ ./bin/flink run \ + --python examples/python/table/batch/word_count.py \ + --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt +{% endhighlight %} + +- Run a Python Table job with a JAR file: +{% highlight bash %} +$ ./bin/flink run \ + --python examples/python/table/batch/word_count.py \ + --jarfile +{% endhighlight %} + +- Run a Python Table job with pyFile
[jira] [Commented] (FLINK-20504) NPE when writing to hive and fail over happened
[ https://issues.apache.org/jira/browse/FLINK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247931#comment-17247931 ] Rui Li commented on FLINK-20504: One possible cause for this NPE is that the Hive {{recordWriter}} can be closed both in [finish|https://github.com/apache/flink/blob/release-1.11.1/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/write/HiveBulkWriterFactory.java#L77] and [dispose|https://github.com/apache/flink/blob/release-1.11.1/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/write/HiveBulkWriterFactory.java#L61]. Suppose {{finish}} is called first but then something goes wrong and {{dispose}} is invoked to do cleanup. Then the {{recordWriter}} is closed twice and leads to NPE. We can add some safety check to make sure this won't happen, i.e. by setting {{recordWriter=null}} after it's closed. But again, I'm not sure whether the problem causes job failure or just some error log. > NPE when writing to hive and fail over happened > --- > > Key: FLINK-20504 > URL: https://issues.apache.org/jira/browse/FLINK-20504 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.11.1 >Reporter: zhuxiaoshang >Priority: Major > > When writing to hive and fail over happened,I got the following exception > {code:java} > java.lang.NullPointerException > at > org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:165) > at > org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:114) > at > org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:165) > at > org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:103) > at > org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:120) > at > org.apache.flink.connectors.hive.write.HiveBulkWriterFactory$1.dispose(HiveBulkWriterFactory.java:61) > at > org.apache.flink.formats.hadoop.bulk.HadoopPathBasedPartFileWriter.dispose(HadoopPathBasedPartFileWriter.java:79) > at > org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.disposePartFile(Bucket.java:235) > at java.util.HashMap$Values.forEach(HashMap.java:981) > at > org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.close(Buckets.java:318) > at > org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.close(StreamingFileSinkHelper.java:108) > at > org.apache.flink.table.filesystem.stream.StreamingFileWriter.dispose(StreamingFileWriter.java:177) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:703) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:635) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:542) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) > at java.lang.Thread.run(Thread.java:748) > {code} > But it does not reproduce every time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
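A minimal sketch of the safety check proposed above; the class shape and method names are illustrative, not the actual HiveBulkWriterFactory code. Clearing the reference after the first close turns a later dispose() into a no-op instead of a double close:

```java
import java.io.IOException;

import org.apache.hadoop.hive.ql.exec.FileSinkOperator;

/** Illustrative guard only, not the real Flink/Hive writer classes. */
class GuardedHiveWriter {

    private FileSinkOperator.RecordWriter recordWriter; // null once closed

    void finish() throws IOException {
        closeWriter(false);
    }

    void dispose() throws IOException {
        closeWriter(true); // safe even if finish() already ran
    }

    private void closeWriter(boolean abort) throws IOException {
        if (recordWriter != null) {
            recordWriter.close(abort);
            recordWriter = null; // prevents the double close behind the NPE above
        }
    }
}
```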
[GitHub] [flink] XComp commented on a change in pull request #14346: [FLINK-20354] Rework standalone docs pages
XComp commented on a change in pull request #14346: URL: https://github.com/apache/flink/pull/14346#discussion_r540938493 ## File path: docs/deployment/resource-providers/standalone/docker.md ## @@ -590,11 +416,203 @@ docker service create \ taskmanager ``` -The *job artifacts* must be available in the *JobManager* container, as outlined [here](#start-a-job-cluster). +The *job artifacts* must be available in the *JobManager* container, as outlined [here](#application-mode-on-docker). See also [how to specify the JobManager arguments](#jobmanager-additional-command-line-arguments) to pass them to the `flink-jobmanager` container. The example assumes that you run the swarm locally and expects the *job artifacts* to be in `/host/path/to/job/artifacts`. It also mounts the host path with the artifacts as a volume to the container's path `/opt/flink/usrlib`. {% top %} + +## Flink on Docker Reference + +### Image tags + +The [Flink Docker repository](https://hub.docker.com/_/flink/) is hosted on Docker Hub and serves images of Flink version 1.2.1 and later. +The source for these images can be found in the [Apache flink-docker](https://github.com/apache/flink-docker) repository. + +Images for each supported combination of Flink and Scala versions are available, and +[tag aliases](https://hub.docker.com/_/flink?tab=tags) are provided for convenience. + +For example, you can use the following aliases: + +* `flink:latest` → `flink:-scala_` +* `flink:1.11` → `flink:1.11.-scala_2.11` + +Note It is recommended to always use an explicit version tag of the docker image that specifies both the needed Flink and Scala +versions (for example `flink:1.11-scala_2.12`). +This will avoid some class conflicts that can occur if the Flink and/or Scala versions used in the application are different +from the versions provided by the docker image. + +Note Prior to Flink 1.5 version, Hadoop dependencies were always bundled with Flink. +You can see that certain tags include the version of Hadoop, e.g. (e.g. `-hadoop28`). +Beginning with Flink 1.5, image tags that omit the Hadoop version correspond to Hadoop-free releases of Flink +that do not include a bundled Hadoop distribution. + + +### Passing configuration via environment variables + +When you run Flink image, you can also change its configuration options by setting the environment variable `FLINK_PROPERTIES`: + +```sh +FLINK_PROPERTIES="jobmanager.rpc.address: host +taskmanager.numberOfTaskSlots: 3 +blob.server.port: 6124 +" +docker run --env FLINK_PROPERTIES=${FLINK_PROPERTIES} flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} +``` + +The [`jobmanager.rpc.address`]({% link deployment/config.md %}#jobmanager-rpc-address) option must be configured, others are optional to set. + +The environment variable `FLINK_PROPERTIES` should contain a list of Flink cluster configuration options separated by new line, +the same way as in the `flink-conf.yaml`. `FLINK_PROPERTIES` takes precedence over configurations in `flink-conf.yaml`. + +### Provide custom configuration + +The configuration files (`flink-conf.yaml`, logging, hosts etc) are located in the `/opt/flink/conf` directory in the Flink image. 
+To provide a custom location for the Flink configuration files, you can + +* **either mount a volume** with the custom configuration files to this path `/opt/flink/conf` when you run the Flink image: + +```sh +docker run \ +--mount type=bind,src=/host/path/to/custom/conf,target=/opt/flink/conf \ +flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} +``` + +* or add them to your **custom Flink image**, build and run it: + + *Dockerfile*: + +```dockerfile +FROM flink +ADD /host/path/to/flink-conf.yaml /opt/flink/conf/flink-conf.yaml +ADD /host/path/to/log4j.properties /opt/flink/conf/log4j.properties +``` + +Warning! The mounted volume must contain all necessary configuration files. +The `flink-conf.yaml` file must have write permission so that the Docker entry point script can modify it in certain cases. + +### Using filesystem plugins + +As described in the [plugins]({% link deployment/filesystems/plugins.md %}) documentation page: in order to use plugins they must be +copied to the correct location in the Flink installation in the Docker container for them to work. + +If you want to enable plugins provided with Flink (in the `opt/` directory of the Flink distribution), you can pass the environment variable `ENABLE_BUILT_IN_PLUGINS` when you run the Flink image. +The `ENABLE_BUILT_IN_PLUGINS` should contain a list of plugin jar file names separated by `;`. A valid plugin name is for example `flink-s3-fs-hadoop-{{site.version}}.jar` + +```sh +docker run \ +--env ENABLE_BUILT_IN_PLUGINS=flink-plugin1.jar;flink-plugin2.jar \ +flink:{% if site.is_
[GitHub] [flink] flinkbot edited a comment on pull request #14346: [FLINK-20354] Rework standalone docs pages
flinkbot edited a comment on pull request #14346: URL: https://github.com/apache/flink/pull/14346#issuecomment-741680007 ## CI report: * 2ad8d2e75d7f717ecc0f8ec1f29189306f50a214 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10811) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] XComp commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
XComp commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r540935340 ## File path: flink-core/src/main/java/org/apache/flink/util/SerializedValue.java ## @@ -63,6 +66,7 @@ public T deserializeValue(ClassLoader loader) throws IOException, ClassNotFoundE * * @return Serialized data. */ + @Nullable Review comment: I created it: [FLINK-20580](https://issues.apache.org/jira/browse/FLINK-20580) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-20580) Missing null value handling for SerializedValue's getByteArray()
Matthias created FLINK-20580: Summary: Missing null value handling for SerializedValue's getByteArray() Key: FLINK-20580 URL: https://issues.apache.org/jira/browse/FLINK-20580 Project: Flink Issue Type: Bug Components: API / Type Serialization System Affects Versions: 1.13.0 Reporter: Matthias {{SerializedValue}} allows to wrap {{null}} values. Because of this, {{SerializedValue.getByteArray()}} might return {{null}} which is not properly handled in different locations. We should add null handling in these cases and add tests for these cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zentol commented on a change in pull request #14340: [FLINK-20533][datadog] Add Histogram support
zentol commented on a change in pull request #14340: URL: https://github.com/apache/flink/pull/14340#discussion_r540934164 ## File path: flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpClient.java ## @@ -113,8 +113,8 @@ public void send(DSeries request) throws Exception { client.newCall(r).enqueue(EmptyCallback.getEmptyCallback()); } - public static String serialize(Object obj) throws JsonProcessingException { - return MAPPER.writeValueAsString(obj); + public static String serialize(Object obj, ObjectMapper mapper) throws JsonProcessingException { + return mapper.writeValueAsString(obj); Review comment: Yeah I wasn't too happy with it either; it is especially finicky since every setting the reporter uses for the mapper is ignored by the test :/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol commented on a change in pull request #14340: [FLINK-20533][datadog] Add Histogram support
zentol commented on a change in pull request #14340: URL: https://github.com/apache/flink/pull/14340#discussion_r540933465 ## File path: flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DHistogram.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.metrics.datadog; + +import org.apache.flink.metrics.Histogram; +import org.apache.flink.metrics.HistogramStatistics; + +import java.util.List; + +/** + * Maps histograms to datadog gauges. + * + * Note: We cannot map them to datadog histograms because the HTTP API does not support them. + */ +public class DHistogram { + private final Histogram histogram; + private final MetricMetaData metaData; + + public DHistogram(Histogram histogram, String metricName, String host, List tags, Clock clock) { + this.histogram = histogram; + this.metaData = new MetricMetaData(MetricType.gauge, metricName, host, tags, clock); + } + + public void addTo(DSeries series) { + final HistogramStatistics statistics = histogram.getStatistics(); + + // this selection is based on https://docs.datadoghq.com/developers/metrics/types/?tab=histogram + // we only exclude 'sum' (which is optional), because we cannot compute it + // the semantics for count are also slightly different, because we don't reset it after a report + series.add(new StaticDMetric(statistics.getMean(), withMetricNameSuffix(metaData, "avg"))); + series.add(new StaticDMetric(histogram.getCount(), withMetricNameSuffix(metaData, "count"))); + series.add(new StaticDMetric(statistics.getQuantile(.5), withMetricNameSuffix(metaData, "median"))); + series.add(new StaticDMetric(statistics.getQuantile(.95), withMetricNameSuffix(metaData, "95percentile"))); + series.add(new StaticDMetric(statistics.getMin(), withMetricNameSuffix(metaData, "min"))); + series.add(new StaticDMetric(statistics.getMax(), withMetricNameSuffix(metaData, "max"))); Review comment: Yes, we should do this only once in the constructor; it's a waste of resources to do this every time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
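A sketch of the refactoring agreed on here, assuming `withMetricNameSuffix` joins the name and suffix with a dot and that `MetricMetaData` can be constructed as in the snippet above (same package assumed); the real fix may differ in detail:

```java
import org.apache.flink.metrics.Histogram;
import org.apache.flink.metrics.HistogramStatistics;

import java.util.List;

// Sketch only; the field set is abbreviated to two of the six suffixes.
public class DHistogram {
    private final Histogram histogram;
    private final MetricMetaData metaDataAvg;
    private final MetricMetaData metaDataCount;

    public DHistogram(Histogram histogram, String metricName, String host, List<String> tags, Clock clock) {
        this.histogram = histogram;
        // Allocate the suffixed metadata once here, instead of on every report cycle.
        this.metaDataAvg = new MetricMetaData(MetricType.gauge, metricName + ".avg", host, tags, clock);
        this.metaDataCount = new MetricMetaData(MetricType.gauge, metricName + ".count", host, tags, clock);
    }

    public void addTo(DSeries series) {
        final HistogramStatistics statistics = histogram.getStatistics();
        series.add(new StaticDMetric(statistics.getMean(), metaDataAvg));
        series.add(new StaticDMetric(histogram.getCount(), metaDataCount));
        // ... median, 95percentile, min, and max handled the same way
    }
}
```

This keeps the per-report path allocation-free apart from the `StaticDMetric` wrappers, which addresses the string-concatenation and metadata-allocation cost raised in the review.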
[jira] [Closed] (FLINK-10265) Configure checkpointing behavior for SQL Client
[ https://issues.apache.org/jira/browse/FLINK-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther closed FLINK-10265. Resolution: Not A Problem > Configure checkpointing behavior for SQL Client > --- > > Key: FLINK-10265 > URL: https://issues.apache.org/jira/browse/FLINK-10265 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Client >Reporter: Timo Walther >Priority: Major > > The SQL Client environment file should expose checkpointing related > properties: > - enable checkpointing > - checkpointing interval > - mode > - timeout > - etc. see {{org.apache.flink.streaming.api.environment.CheckpointConfig}} > Per-job selection of state backends and their configuration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (FLINK-10265) Configure checkpointing behavior for SQL Client
[ https://issues.apache.org/jira/browse/FLINK-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther reopened FLINK-10265: -- > Configure checkpointing behavior for SQL Client > --- > > Key: FLINK-10265 > URL: https://issues.apache.org/jira/browse/FLINK-10265 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Client >Reporter: Timo Walther >Priority: Major > > The SQL Client environment file should expose checkpointing related > properties: > - enable checkpointing > - checkpointing interval > - mode > - timeout > - etc. see {{org.apache.flink.streaming.api.environment.CheckpointConfig}} > Per-job selection of state backends and their configuration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-10265) Configure checkpointing behavior for SQL Client
[ https://issues.apache.org/jira/browse/FLINK-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther closed FLINK-10265. Assignee: (was: TANG Wen-hui) Resolution: Fixed This issue should not be necessary anymore as most Flink configuration parameters can now be set via the `configuration:` entry in the SQL Client YAML file. So the following should work and forwarded to Flink: {code} configuration: execution.checkpointing.interval: 42 {code} Please correct me if I'm wrong. > Configure checkpointing behavior for SQL Client > --- > > Key: FLINK-10265 > URL: https://issues.apache.org/jira/browse/FLINK-10265 > Project: Flink > Issue Type: New Feature > Components: Table SQL / Client >Reporter: Timo Walther >Priority: Major > > The SQL Client environment file should expose checkpointing related > properties: > - enable checkpointing > - checkpointing interval > - mode > - timeout > - etc. see {{org.apache.flink.streaming.api.environment.CheckpointConfig}} > Per-job selection of state backends and their configuration. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20521) Null result values are being swallowed by RPC system
[ https://issues.apache.org/jira/browse/FLINK-20521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247909#comment-17247909 ] Matthias commented on FLINK-20521: -- Just for documentation purposes as this issue is already in the process of being resolved: [This build|https://dev.azure.com/mapohl/flink/_build/results?buildId=137&view=logs&j=0a15d512-44ac-5ba5-97ab-13a5d066c22c&t=634cd701-c189-5dff-24cb-606ed884db87] failed due to a timeout which looks like being related to this issue. {code:java} 2020-12-11T04:08:07.7372503Z [ERROR] Tests run: 10, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 554.133 s <<< FAILURE! - in org.apache.flink.test.checkpointing.UnalignedCheckpointITCase 2020-12-11T04:08:07.7373566Z [ERROR] execute[Parallel cogroup, p = 10](org.apache.flink.test.checkpointing.UnalignedCheckpointITCase) Time elapsed: 300.162 s <<< ERROR! 2020-12-11T04:08:07.7374137Z org.junit.runners.model.TestTimedOutException: test timed out after 300 seconds 2020-12-11T04:08:07.7381510Zat sun.misc.Unsafe.park(Native Method) 2020-12-11T04:08:07.7381840Zat java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) 2020-12-11T04:08:07.7382291Zat java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707) 2020-12-11T04:08:07.7382782Zat java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) 2020-12-11T04:08:07.7383266Zat java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742) 2020-12-11T04:08:07.7383743Zat java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) 2020-12-11T04:08:07.7384367Zat org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1842) 2020-12-11T04:08:07.7384982Zat org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:70) 2020-12-11T04:08:07.7385641Zat org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1822) 2020-12-11T04:08:07.7386146Zat org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1804) 2020-12-11T04:08:07.7386810Zat org.apache.flink.test.checkpointing.UnalignedCheckpointTestBase.execute(UnalignedCheckpointTestBase.java:122) 2020-12-11T04:08:07.7387382Zat org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.execute(UnalignedCheckpointITCase.java:159) 2020-12-11T04:08:07.7387800Zat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 2020-12-11T04:08:07.7388165Zat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 2020-12-11T04:08:07.7388744Zat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2020-12-11T04:08:07.7389130Zat java.lang.reflect.Method.invoke(Method.java:498) 2020-12-11T04:08:07.7389538Zat org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) 2020-12-11T04:08:07.7389993Zat org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) 2020-12-11T04:08:07.7390446Zat org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) 2020-12-11T04:08:07.7391069Zat org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) 2020-12-11T04:08:07.7391463Zat org.junit.rules.Verifier$1.evaluate(Verifier.java:35) 2020-12-11T04:08:07.7391831Zat org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) 2020-12-11T04:08:07.7392483Zat 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) 2020-12-11T04:08:07.7393011Zat org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) 2020-12-11T04:08:07.7393443Zat java.util.concurrent.FutureTask.run(FutureTask.java:266) 2020-12-11T04:08:07.7393776Zat java.lang.Thread.run(Thread.java:748) 2020-12-11T04:08:07.7393952Z 2020-12-11T04:08:07.7789797Z [ERROR] execute[Parallel union, p = 10](org.apache.flink.test.checkpointing.UnalignedCheckpointITCase) Time elapsed: 118.969 s <<< ERROR! 2020-12-11T04:08:07.7790405Z org.apache.flink.runtime.client.JobExecutionException: Job execution failed. 2020-12-11T04:08:07.7790949Zat org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147) 2020-12-11T04:08:07.7791480Zat org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$2(MiniClusterJobClient.java:119) 2020-12-11T04:08:07.7791989Zat java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) 2020-12-11T04:08:07.7792447Zat java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) 2020-12-11T04:08:07.7792901Zat java.util.concurrent.CompletableFuture.postCo
[jira] [Created] (FLINK-20579) eash es sink will have
donglei created FLINK-20579: --- Summary: eash es sink will have Key: FLINK-20579 URL: https://issues.apache.org/jira/browse/FLINK-20579 Project: Flink Issue Type: Improvement Reporter: donglei

BulkProcessorListener.beforeBulk should assign the same route to every request in a bulk in order to speed up writes to ES. A bulk whose requests all share one route is sent to a single ES node, so each bulk costs one network I/O and one disk I/O: !http://km.oa.com/files/photos/captures/202007/1593922902_79_w1275_h710.png!

Therefore, we take the following approach: beforeBulk in ElasticsearchSinkBase assigns the same route to each batch, like this:

{code:java}
private class BulkProcessorListener implements BulkProcessor.Listener {
    @Override
    public void beforeBulk(long executionId, BulkRequest request) {
        if (routePreBulk) { // need to verify whether a route was already set upstream
            String routing = UUID.randomUUID() + "_" + executionId;
            List requests = request.requests();
            requests.forEach(x -> ((IndexRequest) x).routing(routing));
            LOG.info("start bulk actions: {}, routing: {}", request.numberOfActions(), routing);
        }
    }
}
{code}

The advantage is that, when there are many ES shards, every bulk is sent to a single ES node because all of its requests share the same route. This saves ES the time for splitting and landing the data and improves ES performance; preliminary estimates put the improvement at more than 2x.

The discussion points here are:

Q: Can we use keyBy with the same route value?
A: Since we use this feature to improve performance, setting the same route value after an upstream keyBy cannot guarantee that all data is sent in one batch. For example, 10,000 records may share one route value, but there is no guarantee that those 10,000 records end up in the same batch.

Q: How do we judge whether to add a route value?
A: Since oceanus cannot provide an external API interface, sampling is recommended here, for example checking whether a batch already contains a route; if none do, assume this sink does not need a route value.

Q: Is the data uniform?
A: This has been running for a long time. In this setting the ES data is uniform, because the route values across bulks are uniform.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-20528) Add Python building blocks to make sure the basic functionality of Stream Group Table Aggregation could work
[ https://issues.apache.org/jira/browse/FLINK-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu reassigned FLINK-20528: --- Assignee: Huang Xingbo > Add Python building blocks to make sure the basic functionality of Stream > Group Table Aggregation could work > - > > Key: FLINK-20528 > URL: https://issues.apache.org/jira/browse/FLINK-20528 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Huang Xingbo >Assignee: Huang Xingbo >Priority: Major > Fix For: 1.13.0 > > > Add Python building blocks to make sure the basic functionality of Stream > Group Table Aggregation could work -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-20507) Support Aggregate Operation in Python Table API
[ https://issues.apache.org/jira/browse/FLINK-20507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-20507. --- Resolution: Fixed Merged to master via 878edd7ed7c9df7e6d313ebdfe30bd5dcc73cf74 > Support Aggregate Operation in Python Table API > --- > > Key: FLINK-20507 > URL: https://issues.apache.org/jira/browse/FLINK-20507 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Huang Xingbo >Assignee: Huang Xingbo >Priority: Major > Labels: pull-request-available > Fix For: 1.13.0 > > > Support Python UDAF for Aggregate Operation in Python Table API > The usage: > {code:java} > t = ... # type: Table, table schema: [a: String, b: Int, c: Int] > # aggregate General Python UDAF > t_env.create_temporary_function("agg", GeneralPythonAggregateFunction()) > t.group_by(t.c).select("agg(a)") > # aggregate Pandas UDAF > mean_max_udaf = udaf(lambda a: Row(a.mean(), a.max()), > result_type=DataTypes.ROW( > [DataTypes.FIELD("a", DataTypes.FLOAT()), > DataTypes.FIELD("b", DataTypes.INT()), > func_type="pandas") > t.group_by(t.a).aggregate(mean_max_udaf(t.b).alias("d", "f")).select("a, d, > f"){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-19027) UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel failed because of test timeout
[ https://issues.apache.org/jira/browse/FLINK-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias closed FLINK-19027. Resolution: Fixed Closing the ticket again as this seems to be related to what's covered by FLINK-20521. > UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel > failed because of test timeout > > > Key: FLINK-19027 > URL: https://issues.apache.org/jira/browse/FLINK-19027 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.12.0 >Reporter: Dian Fu >Assignee: Arvid Heise >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5789&view=logs&j=119bbba7-f5e3-5e08-e72d-09f1529665de&t=ec103906-d047-5b8a-680e-05fc000dfca9] > {code} > 2020-08-22T21:13:05.5315459Z [ERROR] > shouldPerformUnalignedCheckpointOnParallelRemoteChannel(org.apache.flink.test.checkpointing.UnalignedCheckpointITCase) > Time elapsed: 300.075 s <<< ERROR! > 2020-08-22T21:13:05.5316451Z org.junit.runners.model.TestTimedOutException: > test timed out after 300 seconds > 2020-08-22T21:13:05.5317432Z at sun.misc.Unsafe.park(Native Method) > 2020-08-22T21:13:05.5317799Z at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > 2020-08-22T21:13:05.5318247Z at > java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707) > 2020-08-22T21:13:05.5318885Z at > java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) > 2020-08-22T21:13:05.5327035Z at > java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742) > 2020-08-22T21:13:05.5328114Z at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > 2020-08-22T21:13:05.5328869Z at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1719) > 2020-08-22T21:13:05.5329482Z at > org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:74) > 2020-08-22T21:13:05.5330138Z at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1699) > 2020-08-22T21:13:05.5330771Z at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1681) > 2020-08-22T21:13:05.5331351Z at > org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.execute(UnalignedCheckpointITCase.java:158) > 2020-08-22T21:13:05.5332015Z at > org.apache.flink.test.checkpointing.UnalignedCheckpointITCase.shouldPerformUnalignedCheckpointOnParallelRemoteChannel(UnalignedCheckpointITCase.java:140) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] dianfu closed pull request #14353: [FLINK-20507][python] Support Aggregate Operation in Python Table API
dianfu closed pull request #14353: URL: https://github.com/apache/flink/pull/14353 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
tillrohrmann commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r540909045 ## File path: flink-core/src/main/java/org/apache/flink/util/SerializedValue.java ## @@ -63,6 +66,7 @@ public T deserializeValue(ClassLoader loader) throws IOException, ClassNotFoundE * * @return Serialized data. */ + @Nullable Review comment: Yes, I think so. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
tillrohrmann commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r540908353 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/rpc/akka/AkkaRpcActorTest.java ## @@ -515,6 +542,29 @@ public void setFoobar(int value) { // + interface NullRespondingGateway extends DummyRpcGateway { Review comment: Would also be possible but I didn't want to implement `Integer synchronousFoobar()` for the `DummyRpcEndpoint`. Hence, I opted for a new interface. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] XComp commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
XComp commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r540908276 ## File path: flink-core/src/main/java/org/apache/flink/util/SerializedValue.java ## @@ -63,6 +66,7 @@ public T deserializeValue(ClassLoader loader) throws IOException, ClassNotFoundE * * @return Serialized data. */ + @Nullable Review comment: Does it make sense to create a Jira issue to cover the null handling in the classes related to this change? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-20577) Flink Temporal Join Hive Dim Error
[ https://issues.apache.org/jira/browse/FLINK-20577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-20577. --- Resolution: Duplicate > Flink Temporal Join Hive Dim Error > -- > > Key: FLINK-20577 > URL: https://issues.apache.org/jira/browse/FLINK-20577 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.12.0 > Environment: sql-client >Reporter: HideOnBush >Priority: Major > > Query SQL: > {code:java} > SELECT * FROM hive_catalog.flink_db_test.kfk_master_test AS kafk_tbl JOIN > hive_catalog.gauss.dim_extend_shop_info /*+ > OPTIONS('streaming-source.enable'='true', > 'streaming-source.partition.include' = 'latest', > 'streaming-source.monitor-interval' = '12 > h','streaming-source.partition-order' = 'partition-name') */ FOR SYSTEM_TIME > AS OF kafk_tbl.proctime AS dim ON kafk_tbl.groupID = dim.group_id; > {code} > Stack trace: > Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table > into cache after 3 retriesCaused by: > org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache > after 3 retries at > org.apache.flink.table.filesystem.FileSystemLookupFunction.checkCacheReload(FileSystemLookupFunction.java:143) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.filesystem.FileSystemLookupFunction.eval(FileSystemLookupFunction.java:103) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > LookupFunction$1577.flatMap(Unknown Source) ~[?:?] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] > > Caused by: java.lang.NullPointerException: bufferCaused by: > java.lang.NullPointerException: buffer at > org.apache.flink.core.memory.MemorySegment.(MemorySegment.java:161) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.HybridMemorySegment.(HybridMemorySegment.java:86) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.MemorySegmentFactory.wrap(MemorySegmentFactory.java:55) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.binary.BinaryStringData.fromBytes(BinaryStringData.java:98) > ~[flink-table_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.StringData.fromBytes(StringData.java:67) > ~[flink-table_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.ColumnarRowData.getString(ColumnarRowData.java:114) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.RowData.lambda$createFieldGetter$245ca7d1$1(RowData.java:243) > ~[flink-table_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.RowData.lambda$createFieldGetter$25774257$1(RowData.java:317) > ~[flink-table_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:166) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] XComp commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
XComp commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r540905070 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/rpc/akka/AkkaRpcActorTest.java ## @@ -515,6 +542,29 @@ public void setFoobar(int value) { // + interface NullRespondingGateway extends DummyRpcGateway { Review comment: Why do we introduce new classes/interfaces here? Can't we just use the already existing `DummyRpcGateway`/`NullRespondingEndpoint`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14307: [FLINK-20209][web] Add tolerable failed checkpoints config to web ui
flinkbot edited a comment on pull request #14307: URL: https://github.com/apache/flink/pull/14307#issuecomment-738548986 ## CI report: * 4a8bb6eeb532ce5e28869ab0c06c903552a8dccb Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10796) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on a change in pull request #14340: [FLINK-20533][datadog] Add Histogram support
rmetzger commented on a change in pull request #14340: URL: https://github.com/apache/flink/pull/14340#discussion_r540898612 ## File path: flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DHistogram.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.metrics.datadog; + +import org.apache.flink.metrics.Histogram; +import org.apache.flink.metrics.HistogramStatistics; + +import java.util.List; + +/** + * Maps histograms to datadog gauges. + * + * Note: We cannot map them to datadog histograms because the HTTP API does not support them. + */ +public class DHistogram { + private final Histogram histogram; + private final MetricMetaData metaData; + + public DHistogram(Histogram histogram, String metricName, String host, List<String> tags, Clock clock) { + this.histogram = histogram; + this.metaData = new MetricMetaData(MetricType.gauge, metricName, host, tags, clock); + } + + public void addTo(DSeries series) { + final HistogramStatistics statistics = histogram.getStatistics(); + + // this selection is based on https://docs.datadoghq.com/developers/metrics/types/?tab=histogram + // we only exclude 'sum' (which is optional), because we cannot compute it + // the semantics for count are also slightly different, because we don't reset it after a report + series.add(new StaticDMetric(statistics.getMean(), withMetricNameSuffix(metaData, "avg"))); + series.add(new StaticDMetric(histogram.getCount(), withMetricNameSuffix(metaData, "count"))); + series.add(new StaticDMetric(statistics.getQuantile(.5), withMetricNameSuffix(metaData, "median"))); + series.add(new StaticDMetric(statistics.getQuantile(.95), withMetricNameSuffix(metaData, "95percentile"))); + series.add(new StaticDMetric(statistics.getMin(), withMetricNameSuffix(metaData, "min"))); + series.add(new StaticDMetric(statistics.getMax(), withMetricNameSuffix(metaData, "max"))); Review comment: This runs in a separate thread, but seems quite expensive with the string concat and the MetricMetaData allocation each time we send the metric. Does it make sense to initialize the MetaData with the different name suffixes in the constructor? ## File path: flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpClient.java ## @@ -113,8 +113,8 @@ public void send(DSeries request) throws Exception { client.newCall(r).enqueue(EmptyCallback.getEmptyCallback()); } - public static String serialize(Object obj) throws JsonProcessingException { - return MAPPER.writeValueAsString(obj); + public static String serialize(Object obj, ObjectMapper mapper) throws JsonProcessingException { + return mapper.writeValueAsString(obj); Review comment: I understand you changed this for the tests, but it's a bit ugly.
Maybe add a comment? ## File path: flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpClient.java ## @@ -113,8 +113,8 @@ public void send(DSeries request) throws Exception { client.newCall(r).enqueue(EmptyCallback.getEmptyCallback()); } - public static String serialize(Object obj) throws JsonProcessingException { - return MAPPER.writeValueAsString(obj); + public static String serialize(Object obj, ObjectMapper mapper) throws JsonProcessingException { + return mapper.writeValueAsString(obj); Review comment: Strictly speaking, this mapper is only necessary for having deterministic test results. The proper solution would probably be parsing the JSON response and comparing the results, rather than comparing strings. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
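A minimal sketch of the optimization suggested in the first review comment above: precompute the per-suffix metadata once in the constructor so the reporting path allocates nothing. The class shape and the `.` suffix separator are assumptions, not the flink-metrics-datadog API:
```java
import java.util.Arrays;
import java.util.List;

public class PrecomputedHistogramMetaData {
    // Simplified stand-in for MetricMetaData; only the name is modeled here.
    static final class MetaData {
        final String metricName;
        MetaData(String metricName) { this.metricName = metricName; }
    }

    // One MetaData per suffix, built once in the constructor ...
    private final MetaData avg, count, median, p95, min, max;

    public PrecomputedHistogramMetaData(String metricName) {
        this.avg = new MetaData(metricName + ".avg");
        this.count = new MetaData(metricName + ".count");
        this.median = new MetaData(metricName + ".median");
        this.p95 = new MetaData(metricName + ".95percentile");
        this.min = new MetaData(metricName + ".min");
        this.max = new MetaData(metricName + ".max");
    }

    // ... so the hot reporting path only reads fields: no string
    // concatenation and no MetaData allocation per report.
    public List<String> namesForReport() {
        return Arrays.asList(avg.metricName, count.metricName, median.metricName,
                p95.metricName, min.metricName, max.metricName);
    }

    public static void main(String[] args) {
        System.out.println(new PrecomputedHistogramMetaData("flink.task.latency").namesForReport());
    }
}
```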
[GitHub] [flink] flinkbot edited a comment on pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
flinkbot edited a comment on pull request #14359: URL: https://github.com/apache/flink/pull/14359#issuecomment-742715028 ## CI report: * bb07e40930830e8e0ec15177b2454adf7ec8876e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10774) * cc251da64dd21ecbf22a7fbe4cf2a041ef38a317 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10812) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14346: [FLINK-20354] Rework standalone docs pages
flinkbot edited a comment on pull request #14346: URL: https://github.com/apache/flink/pull/14346#issuecomment-741680007 ## CI report: * 1feff43f3fba39641c0403e7968dfca592b400d9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10708) * 2ad8d2e75d7f717ecc0f8ec1f29189306f50a214 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10811) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-20578) Cannot create empty array using ARRAY[]
Fabian Hueske created FLINK-20578: - Summary: Cannot create empty array using ARRAY[] Key: FLINK-20578 URL: https://issues.apache.org/jira/browse/FLINK-20578 Project: Flink Issue Type: Sub-task Components: Table SQL / API Affects Versions: 1.11.2 Reporter: Fabian Hueske Calling the ARRAY function without an element (`ARRAY[]`) results in an error message. Is that the expected behavior? How can users create empty arrays? -- This message was sent by Atlassian Jira (v8.3.4#803005)
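A small repro sketch for this report through the Table API, assuming a standard batch `TableEnvironment`; only the two queries reflect the issue, the surrounding setup is illustrative:
```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class EmptyArrayRepro {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inBatchMode().build());

        // Works: the ARRAY constructor with at least one element.
        tEnv.sqlQuery("SELECT ARRAY[1, 2, 3]").execute().print();

        // Reported to fail with an error message: an empty ARRAY constructor.
        tEnv.sqlQuery("SELECT ARRAY[]").execute().print();
    }
}
```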
[GitHub] [flink] flinkbot edited a comment on pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
flinkbot edited a comment on pull request #14359: URL: https://github.com/apache/flink/pull/14359#issuecomment-742715028 ## CI report: * bb07e40930830e8e0ec15177b2454adf7ec8876e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10774) * cc251da64dd21ecbf22a7fbe4cf2a041ef38a317 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14346: [FLINK-20354] Rework standalone docs pages
flinkbot edited a comment on pull request #14346: URL: https://github.com/apache/flink/pull/14346#issuecomment-741680007 ## CI report: * 1feff43f3fba39641c0403e7968dfca592b400d9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10708) * 2ad8d2e75d7f717ecc0f8ec1f29189306f50a214 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13907: [FLINK-19942][Connectors / JDBC]Support sink parallelism configuration to JDBC connector
flinkbot edited a comment on pull request #13907: URL: https://github.com/apache/flink/pull/13907#issuecomment-721230727 ## CI report: * f4f1cf14d4d413c4c87881516215c7d5be64 UNKNOWN * 5ba8c4410ef2616561f46462d156949e8c6b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10745) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10705) * 7137b6fcfa8f92478118215f47042927c49f4752 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10805) * 6acdd717e9e7a4f23bf7d798e72b8e50c5ff4e83 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10810) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-20577) Flink Temporal Join Hive Dim Error
HideOnBush created FLINK-20577: -- Summary: Flink Temporal Join Hive Dim Error Key: FLINK-20577 URL: https://issues.apache.org/jira/browse/FLINK-20577 Project: Flink Issue Type: Bug Components: Table SQL / API Affects Versions: 1.12.0 Environment: sql-client Reporter: HideOnBush Query SQL: {code:java} SELECT * FROM hive_catalog.flink_db_test.kfk_master_test AS kafk_tbl JOIN hive_catalog.gauss.dim_extend_shop_info /*+ OPTIONS('streaming-source.enable'='true', 'streaming-source.partition.include' = 'latest', 'streaming-source.monitor-interval' = '12 h','streaming-source.partition-order' = 'partition-name') */ FOR SYSTEM_TIME AS OF kafk_tbl.proctime AS dim ON kafk_tbl.groupID = dim.group_id; {code} Stack trace: Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache after 3 retriesCaused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache after 3 retries at org.apache.flink.table.filesystem.FileSystemLookupFunction.checkCacheReload(FileSystemLookupFunction.java:143) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.filesystem.FileSystemLookupFunction.eval(FileSystemLookupFunction.java:103) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at LookupFunction$1577.flatMap(Unknown Source) ~[?:?] at org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] Caused by: java.lang.NullPointerException: bufferCaused by: java.lang.NullPointerException: buffer at org.apache.flink.core.memory.MemorySegment.<init>(MemorySegment.java:161) ~[flink-dist_2.11-1.12.0.jar:1.12.0] at org.apache.flink.core.memory.HybridMemorySegment.<init>(HybridMemorySegment.java:86) ~[flink-dist_2.11-1.12.0.jar:1.12.0] at org.apache.flink.core.memory.MemorySegmentFactory.wrap(MemorySegmentFactory.java:55) ~[flink-dist_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.data.binary.BinaryStringData.fromBytes(BinaryStringData.java:98) ~[flink-table_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.data.StringData.fromBytes(StringData.java:67) ~[flink-table_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.data.ColumnarRowData.getString(ColumnarRowData.java:114) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.data.RowData.lambda$createFieldGetter$245ca7d1$1(RowData.java:243) ~[flink-table_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.data.RowData.lambda$createFieldGetter$25774257$1(RowData.java:317) ~[flink-table_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:166) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20504) NPE when writing to hive and fail over happened
[ https://issues.apache.org/jira/browse/FLINK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247849#comment-17247849 ] Rui Li commented on FLINK-20504: If we look at [StreamTask::disposeAllOperators|https://github.com/apache/flink/blob/release-1.11.1/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L703] during {{cleanUpInvoke}}, this should only be an error log. So I wonder whether this causes a job failure or is just a log message? > NPE when writing to hive and fail over happened > --- > > Key: FLINK-20504 > URL: https://issues.apache.org/jira/browse/FLINK-20504 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.11.1 >Reporter: zhuxiaoshang >Priority: Major > > When writing to Hive and a failover happened, I got the following exception > {code:java} > java.lang.NullPointerException > at > org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:165) > at > org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:114) > at > org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:165) > at > org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:103) > at > org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:120) > at > org.apache.flink.connectors.hive.write.HiveBulkWriterFactory$1.dispose(HiveBulkWriterFactory.java:61) > at > org.apache.flink.formats.hadoop.bulk.HadoopPathBasedPartFileWriter.dispose(HadoopPathBasedPartFileWriter.java:79) > at > org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.disposePartFile(Bucket.java:235) > at java.util.HashMap$Values.forEach(HashMap.java:981) > at > org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.close(Buckets.java:318) > at > org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.close(StreamingFileSinkHelper.java:108) > at > org.apache.flink.table.filesystem.stream.StreamingFileWriter.dispose(StreamingFileWriter.java:177) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:703) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:635) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:542) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) > at java.lang.Thread.run(Thread.java:748) > {code} > But it does not reproduce every time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
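The behavior Rui Li points at boils down to a catch-and-log loop during cleanup; a simplified sketch (not Flink's exact code) that makes the "error log vs. job failure" question concrete:
```java
import java.util.Arrays;
import java.util.List;

public class DisposeAllOperatorsSketch {
    interface Operator {
        void dispose() throws Exception;
    }

    // A failing dispose() during cleanup is logged and the loop continues;
    // by itself the exception does not propagate and fail the task.
    static void disposeAllOperators(List<Operator> operators) {
        for (Operator op : operators) {
            try {
                op.dispose();
            } catch (Exception e) {
                System.err.println("Error during disposal of stream operator: " + e);
            }
        }
    }

    public static void main(String[] args) {
        disposeAllOperators(Arrays.asList(
                () -> System.out.println("disposed cleanly"),
                () -> { throw new NullPointerException("buffer"); },
                () -> System.out.println("still reached after the NPE")));
    }
}
```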
[jira] [Commented] (FLINK-20504) NPE when writing to hive and fail over happened
[ https://issues.apache.org/jira/browse/FLINK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247848#comment-17247848 ] zhuxiaoshang commented on FLINK-20504: -- [~lirui] 2.3.4 > NPE when writing to hive and fail over happened > --- > > Key: FLINK-20504 > URL: https://issues.apache.org/jira/browse/FLINK-20504 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive >Affects Versions: 1.11.1 >Reporter: zhuxiaoshang >Priority: Major > > When writing to Hive and a failover happened, I got the following exception > {code:java} > java.lang.NullPointerException > at > org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:165) > at > org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:114) > at > org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:165) > at > org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:103) > at > org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:120) > at > org.apache.flink.connectors.hive.write.HiveBulkWriterFactory$1.dispose(HiveBulkWriterFactory.java:61) > at > org.apache.flink.formats.hadoop.bulk.HadoopPathBasedPartFileWriter.dispose(HadoopPathBasedPartFileWriter.java:79) > at > org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.disposePartFile(Bucket.java:235) > at java.util.HashMap$Values.forEach(HashMap.java:981) > at > org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.close(Buckets.java:318) > at > org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.close(StreamingFileSinkHelper.java:108) > at > org.apache.flink.table.filesystem.stream.StreamingFileWriter.dispose(StreamingFileWriter.java:177) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:703) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:635) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:542) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) > at java.lang.Thread.run(Thread.java:748) > {code} > But it does not reproduce every time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] rmetzger commented on a change in pull request #14346: [FLINK-20354] Rework standalone docs pages
rmetzger commented on a change in pull request #14346: URL: https://github.com/apache/flink/pull/14346#discussion_r540870878 ## File path: docs/deployment/resource-providers/standalone/docker.md ## @@ -590,11 +416,203 @@ docker service create \ taskmanager ``` -The *job artifacts* must be available in the *JobManager* container, as outlined [here](#start-a-job-cluster). +The *job artifacts* must be available in the *JobManager* container, as outlined [here](#application-mode-on-docker). See also [how to specify the JobManager arguments](#jobmanager-additional-command-line-arguments) to pass them to the `flink-jobmanager` container. The example assumes that you run the swarm locally and expects the *job artifacts* to be in `/host/path/to/job/artifacts`. It also mounts the host path with the artifacts as a volume to the container's path `/opt/flink/usrlib`. {% top %} + +## Flink on Docker Reference + +### Image tags + +The [Flink Docker repository](https://hub.docker.com/_/flink/) is hosted on Docker Hub and serves images of Flink version 1.2.1 and later. +The source for these images can be found in the [Apache flink-docker](https://github.com/apache/flink-docker) repository. + +Images for each supported combination of Flink and Scala versions are available, and +[tag aliases](https://hub.docker.com/_/flink?tab=tags) are provided for convenience. + +For example, you can use the following aliases: + +* `flink:latest` → `flink:-scala_` +* `flink:1.11` → `flink:1.11.-scala_2.11` + +Note It is recommended to always use an explicit version tag of the docker image that specifies both the needed Flink and Scala +versions (for example `flink:1.11-scala_2.12`). +This will avoid some class conflicts that can occur if the Flink and/or Scala versions used in the application are different +from the versions provided by the docker image. + +Note Prior to Flink 1.5, Hadoop dependencies were always bundled with Flink. +You can see that certain tags include the version of Hadoop (e.g. `-hadoop28`). +Beginning with Flink 1.5, image tags that omit the Hadoop version correspond to Hadoop-free releases of Flink +that do not include a bundled Hadoop distribution. + + +### Passing configuration via environment variables + +When you run the Flink image, you can also change its configuration options by setting the environment variable `FLINK_PROPERTIES`: + +```sh +FLINK_PROPERTIES="jobmanager.rpc.address: host +taskmanager.numberOfTaskSlots: 3 +blob.server.port: 6124 +" +docker run --env FLINK_PROPERTIES=${FLINK_PROPERTIES} flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} +``` + +The [`jobmanager.rpc.address`]({% link deployment/config.md %}#jobmanager-rpc-address) option must be configured; the others are optional. + +The environment variable `FLINK_PROPERTIES` should contain a list of Flink cluster configuration options separated by new lines, +the same way as in the `flink-conf.yaml`. `FLINK_PROPERTIES` takes precedence over configurations in `flink-conf.yaml`. + +### Provide custom configuration + +The configuration files (`flink-conf.yaml`, logging, hosts, etc.) are located in the `/opt/flink/conf` directory in the Flink image.
+To provide a custom location for the Flink configuration files, you can + +* **either mount a volume** with the custom configuration files to this path `/opt/flink/conf` when you run the Flink image: + +```sh +docker run \ +--mount type=bind,src=/host/path/to/custom/conf,target=/opt/flink/conf \ +flink:{% if site.is_stable %}{{site.version}}-scala{{site.scala_version_suffix}}{% else %}latest{% endif %} +``` + +* or add them to your **custom Flink image**, build and run it: + + *Dockerfile*: + +```dockerfile +FROM flink +ADD /host/path/to/flink-conf.yaml /opt/flink/conf/flink-conf.yaml +ADD /host/path/to/log4j.properties /opt/flink/conf/log4j.properties +``` + +Warning! The mounted volume must contain all necessary configuration files. +The `flink-conf.yaml` file must have write permission so that the Docker entry point script can modify it in certain cases. + +### Using filesystem plugins + +As described in the [plugins]({% link deployment/filesystems/plugins.md %}) documentation page: in order to use plugins they must be +copied to the correct location in the Flink installation in the Docker container for them to work. + +If you want to enable plugins provided with Flink (in the `opt/` directory of the Flink distribution), you can pass the environment variable `ENABLE_BUILT_IN_PLUGINS` when you run the Flink image. +The `ENABLE_BUILT_IN_PLUGINS` should contain a list of plugin jar file names separated by `;`. A valid plugin name is for example `flink-s3-fs-hadoop-{{site.version}}.jar` + +```sh +docker run \ +--env ENABLE_BUILT_IN_PLUGINS=flink-plugin1.jar;flink-plugin2.jar \ +flink:{% if site.
[GitHub] [flink] pnowojski commented on a change in pull request #14057: [FLINK-19681][checkpointing] Timeout aligned checkpoints
pnowojski commented on a change in pull request #14057: URL: https://github.com/apache/flink/pull/14057#discussion_r540858619 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/AlternatingController.java ## @@ -114,6 +114,7 @@ public void barrierAnnouncement( lastSeenBarrier = barrier.getId(); firstBarrierArrivalTime = getArrivalTime(barrier); } + activeController = chooseController(barrier); Review comment: But doesn't it mean we should support timing out on every UC barrier? And it looks like we are doing that:
```
@Override
public Optional<CheckpointBarrier> barrierReceived(InputChannelInfo channelInfo, CheckpointBarrier barrier) throws IOException, CheckpointException {
    if (barrier.getCheckpointOptions().isUnalignedCheckpoint() && activeController == alignedController) {
        barrier = barrier.asUnaligned();
        switchToUnaligned(channelInfo, barrier);
        activeController.barrierReceived(channelInfo, barrier);
        return Optional.of(barrier);
    }
```
## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/AlternatingController.java ## @@ -146,28 +146,7 @@ private void switchToUnaligned( @Override public Optional postProcessLastBarrier(InputChannelInfo channelInfo, CheckpointBarrier barrier) throws IOException, CheckpointException { - Optional maybeTimeOut = asTimedOut(barrier); - if (maybeTimeOut.isPresent() && activeController == alignedController) { - switchToUnaligned(channelInfo, maybeTimeOut.get()); - checkState(activeController == unalignedController); - checkState(!activeController.postProcessLastBarrier(channelInfo, maybeTimeOut.orElse(barrier)).isPresent()); - return maybeTimeOut; - } - - barrier = maybeTimeOut.orElse(barrier); - if (barrier.getCheckpointOptions().isUnalignedCheckpoint()) { - checkState(activeController == unalignedController); - checkState(!activeController.postProcessLastBarrier(channelInfo, maybeTimeOut.orElse(barrier)).isPresent()); - return Optional.empty(); - } - else { - checkState(activeController == alignedController); - Optional triggerResult = activeController.postProcessLastBarrier( - channelInfo, - barrier); - checkState(triggerResult.isPresent()); - return triggerResult; - } Review comment: > Besides, why timeout alignment if it's the last barrier? This essentially means that alignment is done. The alignment is done, but that's just input. As we do not have code to time out outputs, it's better to time out to UC even if the alignment was completed (although too late). Think especially about a case with just a single input channel. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Resolved] (FLINK-20575) flink application failed to restore from check-point
[ https://issues.apache.org/jira/browse/FLINK-20575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang resolved FLINK-20575. - Release Note: I think that the issue is due to task cancelling caused by another exception. Resolving this issue as 'Not A Bug'. Resolution: Not A Bug > flink application failed to restore from check-point > > > Key: FLINK-20575 > URL: https://issues.apache.org/jira/browse/FLINK-20575 > Project: Flink > Issue Type: Bug >Affects Versions: 1.9.1 >Reporter: Yu Yang >Priority: Major > > Our flink application failed to restore from a check-point due to > com.amazonaws.AbortedException (we use s3a file system). Initially we > thought that the s3 file had some issue. It turned out that we can download > the s3 file fine. Any insights on this issue will be very welcome. > > 2020-12-11 07:02:40,018 ERROR > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder - > Caught unexpected exception. > java.io.InterruptedIOException: getFileStatus on > s3a://mybucket/prod/checkpoints/u/tango/910d2ff2b2c7e01e99a9588d11385e92/shared/f245da83-fc01-424d-9719-d48b99a1ed35: > org.apache.flink.fs.s3base.shaded.com.amazonaws.AbortedException: > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:340) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:171) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2187) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:699) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.open(FileSystem.java:950) > at > org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:120) > at > org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:37) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:85) > at > org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68) > at > org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForStateHandle(RocksDBStateDownloader.java:127) > at > org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.lambda$createDownloadRunnables$0(RocksDBStateDownloader.java:109) > at > org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) > at > org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211) > at > java.util.concurrent.CompletableFuture.asyncRunStage(CompletableFuture.java:1640) > at > java.util.concurrent.CompletableFuture.runAsync(CompletableFuture.java:1858) > at > org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForAllStateHandles(RocksDBStateDownloader.java:83) > at > org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.transferAllStateDataToDirectory(RocksDBStateDownloader.java:66) > at > 
org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreDBInstanceFromStateHandle(RocksDBIncrementalRestoreOperation.java:406) > at > org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreWithRescaling(RocksDBIncrementalRestoreOperation.java:294) > at > org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restore(RocksDBIncrementalRestoreOperation.java:146) > at > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:270) > at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:520) > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:291) > at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142) > at
[GitHub] [flink] tillrohrmann commented on pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
tillrohrmann commented on pull request #14359: URL: https://github.com/apache/flink/pull/14359#issuecomment-743121010 The reason I opted for supporting null values was that the RPC system already supported null return values for synchronous RPCs, for example `Integer foobar()` could return `null` w/o problems. Moreover, it is the more general solution. As a drawback, it comes with all the problems null values in the Java world cause. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
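A sketch of the two styles contrasted here, using an illustrative gateway interface rather than the actual RPC classes from the PR:
```java
import java.util.concurrent.CompletableFuture;

public class NullableRpcSketch {
    interface ExampleGateway {
        // Synchronous RPCs could already return null without problems.
        Integer foobar();

        // The change under review extends the same semantics to asynchronous
        // RPCs: the future may complete with a null value.
        CompletableFuture<Integer> foobarAsync();
    }

    public static void main(String[] args) {
        ExampleGateway gateway = new ExampleGateway() {
            @Override public Integer foobar() { return null; }
            @Override public CompletableFuture<Integer> foobarAsync() {
                return CompletableFuture.completedFuture(null);
            }
        };

        System.out.println("sync result: " + gateway.foobar());  // prints null
        gateway.foobarAsync().thenAccept(v ->
                System.out.println("async result: " + v));       // prints null
    }
}
```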
[GitHub] [flink] flinkbot edited a comment on pull request #14338: [FLINK-20537][hive] Failed to call Hive UDF with string literal argum…
flinkbot edited a comment on pull request #14338: URL: https://github.com/apache/flink/pull/14338#issuecomment-740638158 ## CI report: * 4b305a2fa2f5a901337555ac533b1d7fabfe16e2 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10710) * 872de3bf88035b995f466b2051d22416782c697e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10809) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
flinkbot edited a comment on pull request #14312: URL: https://github.com/apache/flink/pull/14312#issuecomment-738876739 ## CI report: * 6f0b8a4de3ba32b1cf407287add34dad903a68a2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10770) * ca3900171fb130a6e379e9bf8b4f4f4f5b4c48a4 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10808) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
tillrohrmann commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r540854148 ## File path: flink-core/src/main/java/org/apache/flink/util/SerializedValue.java ## @@ -63,6 +66,7 @@ public T deserializeValue(ClassLoader loader) throws IOException, ClassNotFoundE * * @return Serialized data. */ + @Nullable Review comment: Yes, there are indeed other problems with this class. I would consider these follow-up tasks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pnowojski commented on a change in pull request #14057: [FLINK-19681][checkpointing] Timeout aligned checkpoints
pnowojski commented on a change in pull request #14057: URL: https://github.com/apache/flink/pull/14057#discussion_r540854209 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/AlternatingController.java ## @@ -193,6 +202,11 @@ private CheckpointBarrierBehaviourController chooseController(CheckpointBarrier private boolean canTimeout(CheckpointBarrier barrier) { return barrier.getCheckpointOptions().isTimeoutable() && - barrier.getCheckpointOptions().getAlignmentTimeout() < (System.currentTimeMillis() - barrier.getTimestamp()); + barrier.getId() <= lastSeenBarrier && + barrier.getCheckpointOptions().getAlignmentTimeout() * 1_000_000 < (System.nanoTime() - firstBarrierArrivalTime); + } + + private long getArrivalTime(CheckpointBarrier announcedBarrier) { + return announcedBarrier.getCheckpointOptions().isTimeoutable() ? System.nanoTime() : Long.MAX_VALUE; Review comment: > Why, could you explain? On second thought, maybe it will partially, but not fully. I guess we still have the code that can time out to UC on the last processed barrier? So in the case of a single channel: 1. the announcement is processed (the first announcement will never time out in this version), so it won't time out; 2. the barrier is processed, and only it can time out. So a timeout on the first network exchange will work worse. That's a bit problematic, especially for simple jobs with, for example, just a single exchange. The previous version would cut the checkpointing time by half; this version will do worse than that, in a way that's hard for me to quantify. There is an extreme corner case: imagine there is heavy backpressure, but all CBs are processed at the same time. That means announcements in this version wouldn't cause a timeout (they would in my older proposal), and this version would need to wait for some CB to be processed (which can take a long time). An active timeout would alleviate this problem, though. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
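Since the diff above mixes units (a millisecond timeout compared against a nanosecond clock), a small standalone sketch of the arithmetic may help; the variable names mirror the diff, but the values are made up:
```java
public class AlignmentTimeoutArithmetic {
    public static void main(String[] args) throws InterruptedException {
        long alignmentTimeoutMillis = 50;                 // getAlignmentTimeout() is in ms
        long firstBarrierArrivalTime = System.nanoTime(); // arrival recorded in ns

        Thread.sleep(60); // simulate alignment taking longer than the timeout

        // The comparison from the diff: scale the timeout to nanoseconds
        // before comparing it with the elapsed nanoTime delta.
        long elapsedNanos = System.nanoTime() - firstBarrierArrivalTime;
        boolean canTimeout = alignmentTimeoutMillis * 1_000_000L < elapsedNanos;
        System.out.println("elapsed ns: " + elapsedNanos + ", canTimeout: " + canTimeout);
    }
}
```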
[GitHub] [flink] tillrohrmann commented on a change in pull request #14359: [FLINK-20521][rpc] Add support for sending null responses
tillrohrmann commented on a change in pull request #14359: URL: https://github.com/apache/flink/pull/14359#discussion_r540852240 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/akka/AkkaRpcActor.java ## @@ -342,10 +342,13 @@ private void sendAsyncResponse(CompletableFuture asyncResponse, String method promise.failure(serializedResult.right()); } } else { - promise.success(value); + promise.success(new Status.Success(value)); Review comment: No, it was logged as a failure by Akka. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-20576) Flink Temporal Join Hive Dim Error
[ https://issues.apache.org/jira/browse/FLINK-20576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20576: Fix Version/s: 1.13.0 > Flink Temporal Join Hive Dim Error > -- > > Key: FLINK-20576 > URL: https://issues.apache.org/jira/browse/FLINK-20576 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Table SQL / Ecosystem >Affects Versions: 1.12.0 >Reporter: HideOnBush >Priority: Major > Fix For: 1.13.0 > > > {code:java} > {code} > {noformat} > SELECT * FROM hive_catalog.flink_db_test.kfk_master_test AS kafk_tbl JOIN > hive_catalog.gauss.dim_extend_shop_info /*+ > OPTIONS('streaming-source.enable'='true', > 'streaming-source.partition.include' = 'latest', > 'streaming-source.monitor-interval' = '12 > h','streaming-source.partition-order' = 'partition-name') */ FOR SYSTEM_TIME > AS OF kafk_tbl.proctime AS dim ON kafk_tbl.groupID = dim.group_id where > kafk_tbl.groupID is not null; > {noformat} > When I execute the above statement, these stack error messages are returned > > Caused by: java.lang.NullPointerException: bufferCaused by: > java.lang.NullPointerException: buffer at > org.apache.flink.core.memory.MemorySegment.<init>(MemorySegment.java:161) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.HybridMemorySegment.<init>(HybridMemorySegment.java:86) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.MemorySegmentFactory.wrap(MemorySegmentFactory.java:55) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.binary.BinaryStringData.fromBytes(BinaryStringData.java:98) > ~[flink-table_2.11-1.12.0.jar:1.12.0] > > Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table > into cache after 3 retriesCaused by: > org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache > after 3 retries at > org.apache.flink.table.filesystem.FileSystemLookupFunction.checkCacheReload(FileSystemLookupFunction.java:143) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.filesystem.FileSystemLookupFunction.eval(FileSystemLookupFunction.java:103) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > LookupFunction$1577.flatMap(Unknown Source) ~[?:?] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20576) Flink Temporal Join Hive Dim Error
[ https://issues.apache.org/jira/browse/FLINK-20576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20576: Component/s: Table SQL / Ecosystem Connectors / Hive > Flink Temporal Join Hive Dim Error > -- > > Key: FLINK-20576 > URL: https://issues.apache.org/jira/browse/FLINK-20576 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Table SQL / Ecosystem >Affects Versions: 1.12.0 >Reporter: HideOnBush >Priority: Major > > {code:java} > {code} > {noformat} > SELECT * FROM hive_catalog.flink_db_test.kfk_master_test AS kafk_tbl JOIN > hive_catalog.gauss.dim_extend_shop_info /*+ > OPTIONS('streaming-source.enable'='true', > 'streaming-source.partition.include' = 'latest', > 'streaming-source.monitor-interval' = '12 > h','streaming-source.partition-order' = 'partition-name') */ FOR SYSTEM_TIME > AS OF kafk_tbl.proctime AS dim ON kafk_tbl.groupID = dim.group_id where > kafk_tbl.groupID is not null; > {noformat} > When I execute the above statement, these stack error messages are returned > > Caused by: java.lang.NullPointerException: bufferCaused by: > java.lang.NullPointerException: buffer at > org.apache.flink.core.memory.MemorySegment.<init>(MemorySegment.java:161) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.HybridMemorySegment.<init>(HybridMemorySegment.java:86) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.MemorySegmentFactory.wrap(MemorySegmentFactory.java:55) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.binary.BinaryStringData.fromBytes(BinaryStringData.java:98) > ~[flink-table_2.11-1.12.0.jar:1.12.0] > > Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table > into cache after 3 retriesCaused by: > org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache > after 3 retries at > org.apache.flink.table.filesystem.FileSystemLookupFunction.checkCacheReload(FileSystemLookupFunction.java:143) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.filesystem.FileSystemLookupFunction.eval(FileSystemLookupFunction.java:103) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > LookupFunction$1577.flatMap(Unknown Source) ~[?:?] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20556) Compile exception when using Scala package object as POJO
[ https://issues.apache.org/jira/browse/FLINK-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20556: Component/s: (was: Client / Job Submission) (was: API / Scala) > Compile exception when using Scala package object as POJO > - > > Key: FLINK-20556 > URL: https://issues.apache.org/jira/browse/FLINK-20556 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.11.2 > Environment: flink-1.11.2 > IntelliJ IDEA 2019.2 x64 > Windows 10 > scala >Reporter: Zezheng Qin >Priority: Major > Attachments: bug-log.log, flink-datahub-bug.zip > > > If I define a POJO class in a package object, the job fails to execute, > but the job executes successfully if I define the POJO class directly in the > package > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20556) Compile exception when using Scala package object as POJO
[ https://issues.apache.org/jira/browse/FLINK-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247824#comment-17247824 ] Jark Wu commented on FLINK-20556: - Hi [~zzh...@gmail.com], currently Flink SQL doesn't support using Scala package objects. You can use a normal Scala class instead. > Compile exception when using Scala package object as POJO > - > > Key: FLINK-20556 > URL: https://issues.apache.org/jira/browse/FLINK-20556 > Project: Flink > Issue Type: Bug > Components: API / Scala, Client / Job Submission, Table SQL / Runtime >Affects Versions: 1.11.2 > Environment: flink-1.11.2 > IntelliJ IDEA 2019.2 x64 > Windows 10 > scala >Reporter: Zezheng Qin >Priority: Major > Attachments: bug-log.log, flink-datahub-bug.zip > > > If I define a POJO class in a package object, the job fails to execute, > but the job executes successfully if I define the POJO class directly in the > package > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20556) Compile exception when using Scala package object as POJO
[ https://issues.apache.org/jira/browse/FLINK-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20556: Summary: Compile exception when using Scala package object as POJO (was: org.apache.flink.api.common.InvalidProgramException: Table program cannot be compiled. This is a bug. Please file an issue.) > Compile exception when using Scala package object as POJO > - > > Key: FLINK-20556 > URL: https://issues.apache.org/jira/browse/FLINK-20556 > Project: Flink > Issue Type: Bug > Components: API / Scala, Client / Job Submission, Table SQL / Runtime >Affects Versions: 1.11.2 > Environment: flink-1.11.2 > IntelliJ IDEA 2019.2 x64 > Windows 10 > scala >Reporter: Zezheng Qin >Priority: Major > Attachments: bug-log.log, flink-datahub-bug.zip > > > If I define a POJO class in a package object, the job fails to execute, > but the job executes successfully if I define the POJO class directly in the > package > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] kougazhang commented on pull request #14362: [FLINK-20540] Failed connecting to jdbc:postgresql://flink-postgres
kougazhang commented on pull request #14362: URL: https://github.com/apache/flink/pull/14362#issuecomment-743114728 > > Can we maybe change the code instead to handle this case more gracefully? > > Thanks @zentol , that's a better way. I rechecked the code and found it is problematic: it should use the formatted `this.baseUrl` rather than the function parameter `baseUrl`: > > ``` > this.baseUrl = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/"; > this.defaultUrl = baseUrl + defaultDatabase; => this.defaultUrl = this.baseUrl + defaultDatabase; > ``` > > And thus the document update is unnecessary; we should change the code and update the test, or add a new test to cover the change. > @kougazhang HDYT? @leonardBang , I think your suggestion is very good, and I'll change the code as you suggest. When I found this bug, I wanted to change the code in this way. This is the Jira link: [Addr](https://issues.apache.org/jira/browse/FLINK-20540?focusedCommentId=17246236&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17246236) I am glad to see we have the same idea. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
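A standalone sketch of the agreed fix, with the catalog fields mimicked by local variables and a made-up URL:
```java
public class JdbcBaseUrlNormalization {
    public static void main(String[] args) {
        String baseUrl = "jdbc:postgresql://flink-postgres:5432"; // no trailing '/'
        String defaultDatabase = "postgres";

        // The fix: normalize first, then build defaultUrl from the normalized
        // value instead of the raw constructor parameter.
        String normalizedBaseUrl = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        String defaultUrl = normalizedBaseUrl + defaultDatabase;

        // Prints jdbc:postgresql://flink-postgres:5432/postgres
        System.out.println(defaultUrl);
    }
}
```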
[jira] [Created] (FLINK-20576) Flink Temporal Join Hive Dim Error
HideOnBush created FLINK-20576: -- Summary: Flink Temporal Join Hive Dim Error Key: FLINK-20576 URL: https://issues.apache.org/jira/browse/FLINK-20576 Project: Flink Issue Type: Bug Affects Versions: 1.12.0 Reporter: HideOnBush {code:java} {code} {noformat} SELECT * FROM hive_catalog.flink_db_test.kfk_master_test AS kafk_tbl JOIN hive_catalog.gauss.dim_extend_shop_info /*+ OPTIONS('streaming-source.enable'='true', 'streaming-source.partition.include' = 'latest', 'streaming-source.monitor-interval' = '12 h','streaming-source.partition-order' = 'partition-name') */ FOR SYSTEM_TIME AS OF kafk_tbl.proctime AS dim ON kafk_tbl.groupID = dim.group_id where kafk_tbl.groupID is not null; {noformat} When I execute the above statement, these stack error messages are returned Caused by: java.lang.NullPointerException: bufferCaused by: java.lang.NullPointerException: buffer at org.apache.flink.core.memory.MemorySegment.<init>(MemorySegment.java:161) ~[flink-dist_2.11-1.12.0.jar:1.12.0] at org.apache.flink.core.memory.HybridMemorySegment.<init>(HybridMemorySegment.java:86) ~[flink-dist_2.11-1.12.0.jar:1.12.0] at org.apache.flink.core.memory.MemorySegmentFactory.wrap(MemorySegmentFactory.java:55) ~[flink-dist_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.data.binary.BinaryStringData.fromBytes(BinaryStringData.java:98) ~[flink-table_2.11-1.12.0.jar:1.12.0] Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache after 3 retriesCaused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache after 3 retries at org.apache.flink.table.filesystem.FileSystemLookupFunction.checkCacheReload(FileSystemLookupFunction.java:143) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.filesystem.FileSystemLookupFunction.eval(FileSystemLookupFunction.java:103) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at LookupFunction$1577.flatMap(Unknown Source) ~[?:?] at org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36) ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66) ~[flink-dist_2.11-1.12.0.jar:1.12.0] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #14338: [FLINK-20537][hive] Failed to call Hive UDF with string literal argum…
flinkbot edited a comment on pull request #14338: URL: https://github.com/apache/flink/pull/14338#issuecomment-740638158 ## CI report: * 4b305a2fa2f5a901337555ac533b1d7fabfe16e2 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10710) * 872de3bf88035b995f466b2051d22416782c697e UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode
flinkbot edited a comment on pull request #14312: URL: https://github.com/apache/flink/pull/14312#issuecomment-738876739 ## CI report: * 6f0b8a4de3ba32b1cf407287add34dad903a68a2 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10770) * ca3900171fb130a6e379e9bf8b4f4f4f5b4c48a4 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13907: [FLINK-19942][Connectors / JDBC]Support sink parallelism configuration to JDBC connector
flinkbot edited a comment on pull request #13907: URL: https://github.com/apache/flink/pull/13907#issuecomment-721230727 ## CI report: * f4f1cf14d4d413c4c87881516215c7d5be64 UNKNOWN * 5ba8c4410ef2616561f46462d156949e8c6b Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10745) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10705) * 7137b6fcfa8f92478118215f47042927c49f4752 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10805) * 6acdd717e9e7a4f23bf7d798e72b8e50c5ff4e83 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-20575) flink application failed to restore from check-point
[ https://issues.apache.org/jira/browse/FLINK-20575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Yang updated FLINK-20575:
----------------------------
    Description: 
Our flink application failed to restore from a check-point due to com.amazonaws.AbortedException (we use the s3a file system). Initially we thought that the s3 file had some issue; it turned out that we could download the s3 file fine. Any insights on this issue will be very welcome.

2020-12-11 07:02:40,018 ERROR org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder - Caught unexpected exception.
java.io.InterruptedIOException: getFileStatus on s3a://mybucket/prod/checkpoints/u/tango/910d2ff2b2c7e01e99a9588d11385e92/shared/f245da83-fc01-424d-9719-d48b99a1ed35: org.apache.flink.fs.s3base.shaded.com.amazonaws.AbortedException:
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:340)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:171)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2187)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:699)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.open(FileSystem.java:950)
	at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:120)
	at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:37)
	at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:85)
	at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForStateHandle(RocksDBStateDownloader.java:127)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.lambda$createDownloadRunnables$0(RocksDBStateDownloader.java:109)
	at org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50)
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
	at org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)
	at java.util.concurrent.CompletableFuture.asyncRunStage(CompletableFuture.java:1640)
	at java.util.concurrent.CompletableFuture.runAsync(CompletableFuture.java:1858)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForAllStateHandles(RocksDBStateDownloader.java:83)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.transferAllStateDataToDirectory(RocksDBStateDownloader.java:66)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreDBInstanceFromStateHandle(RocksDBIncrementalRestoreOperation.java:406)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreWithRescaling(RocksDBIncrementalRestoreOperation.java:294)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restore(RocksDBIncrementalRestoreOperation.java:146)
	at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:270)
	at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:520)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:291)
	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:307)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:135)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:253)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(Str
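Editor's note: the failing call chain is FileStateHandle.openInputStream() -> S3AFileSystem.getFileStatus()/open() during the RocksDB incremental state download. Since the reporter could fetch the object out of band, a minimal probe that exercises just that path through Flink's own FileSystem abstraction can help separate S3 reachability from the restore logic. The sketch below is hypothetical and not from this thread; it assumes flink-s3-fs-hadoop is on the classpath and the same S3 credentials as the job, and the class and path names are illustrative.

{code:java}
import org.apache.flink.core.fs.FSDataInputStream;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

// Hypothetical standalone probe that mirrors FileStateHandle.openInputStream():
// getFileStatus() on the checkpoint file, then open() and a first read.
public class CheckpointFileProbe {
    public static void main(String[] args) throws Exception {
        // Same URI shape as in the stack trace; substitute the failing handle.
        Path path = new Path("s3a://mybucket/prod/checkpoints/u/tango/"
                + "910d2ff2b2c7e01e99a9588d11385e92/shared/f245da83-fc01-424d-9719-d48b99a1ed35");
        FileSystem fs = path.getFileSystem();
        System.out.println("len=" + fs.getFileStatus(path).getLen());
        try (FSDataInputStream in = fs.open(path)) {
            byte[] buf = new byte[4096];
            int n = in.read(buf);
            System.out.println("first read returned " + n + " bytes");
        }
    }
}
{code}

Given that the topmost frame is S3AUtils.translateInterruptedException, the wrapped AbortedException typically indicates the downloading thread was interrupted (for example by task cancellation during restore) rather than that the object is unreadable, which would be consistent with the file downloading fine outside the job.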
[jira] [Created] (FLINK-20575) flink application failed to restore from check-point
Yu Yang created FLINK-20575:
--------------------------------

             Summary: flink application failed to restore from check-point
                 Key: FLINK-20575
                 URL: https://issues.apache.org/jira/browse/FLINK-20575
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.9.1
            Reporter: Yu Yang

Our flink application failed to restore from a check-point due to com.amazonaws.AbortedException (we use the s3a file system). Initially we thought that the s3 file had some issue; it turned out that we could download the s3 file fine. Any insights on this issue will be very welcome.

2020-12-11 07:02:40,018 ERROR org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder - Caught unexpected exception.
java.io.InterruptedIOException: getFileStatus on s3a://bucket/prod/checkpoints/u/tango/910d2ff2b2c7e01e99a9588d11385e92/shared/f245da83-fc01-424d-9719-d48b99a1ed35: org.apache.flink.fs.s3base.shaded.com.amazonaws.AbortedException:
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:340)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:171)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2187)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:699)
	at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.open(FileSystem.java:950)
	at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:120)
	at org.apache.flink.fs.s3.common.hadoop.HadoopFileSystem.open(HadoopFileSystem.java:37)
	at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:85)
	at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForStateHandle(RocksDBStateDownloader.java:127)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.lambda$createDownloadRunnables$0(RocksDBStateDownloader.java:109)
	at org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50)
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
	at org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)
	at java.util.concurrent.CompletableFuture.asyncRunStage(CompletableFuture.java:1640)
	at java.util.concurrent.CompletableFuture.runAsync(CompletableFuture.java:1858)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.downloadDataForAllStateHandles(RocksDBStateDownloader.java:83)
	at org.apache.flink.contrib.streaming.state.RocksDBStateDownloader.transferAllStateDataToDirectory(RocksDBStateDownloader.java:66)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreDBInstanceFromStateHandle(RocksDBIncrementalRestoreOperation.java:406)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restoreWithRescaling(RocksDBIncrementalRestoreOperation.java:294)
	at org.apache.flink.contrib.streaming.state.restore.RocksDBIncrementalRestoreOperation.restore(RocksDBIncrementalRestoreOperation.java:146)
	at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:270)
	at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:520)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:291)
	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:307)
	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:135)
	at
[jira] [Commented] (FLINK-20504) NPE when writing to hive and fail over happened
[ https://issues.apache.org/jira/browse/FLINK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247810#comment-17247810 ]

Rui Li commented on FLINK-20504:
--------------------------------

[~ZhuShang] Thanks for reporting the issue. What's your Hive version?

> NPE when writing to hive and fail over happened
> -----------------------------------------------
>
>                 Key: FLINK-20504
>                 URL: https://issues.apache.org/jira/browse/FLINK-20504
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>    Affects Versions: 1.11.1
>            Reporter: zhuxiaoshang
>            Priority: Major
>
> When writing to Hive and a failover happened, I got the following exception:
> {code:java}
> java.lang.NullPointerException
> 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:165)
> 	at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:114)
> 	at org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:165)
> 	at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:103)
> 	at org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.close(ParquetRecordWriterWrapper.java:120)
> 	at org.apache.flink.connectors.hive.write.HiveBulkWriterFactory$1.dispose(HiveBulkWriterFactory.java:61)
> 	at org.apache.flink.formats.hadoop.bulk.HadoopPathBasedPartFileWriter.dispose(HadoopPathBasedPartFileWriter.java:79)
> 	at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.disposePartFile(Bucket.java:235)
> 	at java.util.HashMap$Values.forEach(HashMap.java:981)
> 	at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.close(Buckets.java:318)
> 	at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.close(StreamingFileSinkHelper.java:108)
> 	at org.apache.flink.table.filesystem.stream.StreamingFileWriter.dispose(StreamingFileWriter.java:177)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:703)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:635)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:542)
> 	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
> But it does not reproduce every time.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
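Editor's note: pending the Hive version, one plausible but unconfirmed reading of the trace is a double close during failover cleanup: dispose() closing a Parquet writer whose InternalParquetRecordWriter.close() already ran once, so flushRowGroupToStore() dereferences internal state the first close tore down. The sketch below only illustrates the defensive pattern under that assumption; the wrapper class is hypothetical and not Flink or Parquet API.

{code:java}
// Hypothetical illustration only; not Flink or Parquet API.
// Guards a close-once resource so a second close() during cleanup
// becomes a no-op instead of re-entering teardown logic.
public final class IdempotentCloser implements AutoCloseable {

    private final AutoCloseable delegate;
    private boolean closed;

    public IdempotentCloser(AutoCloseable delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized void close() throws Exception {
        if (closed) {
            return; // e.g. dispose() running after close() already happened
        }
        closed = true; // mark first so a failing close is not re-attempted
        delegate.close();
    }
}
{code}

If that hypothesis holds, the NPE would be a symptom of cleanup ordering during failover rather than of the written data, which would also fit it not reproducing every time.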