[jira] [Commented] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment
[ https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505348#comment-14505348 ] Stephan Ewen commented on FLINK-1918: - Thank you for reporting this. From what I can see, this may happen in the case where a host lookup failed. This should definitely give a better error message, or fail earlier with "Unknown host". I'll prepare a patch for this... NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment - Key: FLINK-1918 URL: https://issues.apache.org/jira/browse/FLINK-1918 Project: Flink Issue Type: Bug Components: YARN Client Reporter: Zoltán Zvara Labels: yarn, yarn-client Trace:
{code}
Exception in thread "main" java.lang.NullPointerException
	at org.apache.flink.client.program.Client.<init>(Client.java:104)
	at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:86)
	at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:82)
	at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:70)
	at Wordcount.main(Wordcount.java:23)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
{code}
The constructor is trying to set the configuration parameter {{jobmanager.rpc.address}} with {{jobManagerAddress.getAddress().getHostAddress()}}, but {{jobManagerAddress.holder.addr}} is {{null}}. {{jobManagerAddress.holder.hostname}} and {{port}} hold the valid information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
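The behavior described (address {{null}} while hostname and port are intact) matches how the JDK represents an unresolved {{InetSocketAddress}}: when the host lookup fails, {{getAddress()}} returns {{null}} but {{getHostString()}} and {{getPort()}} still hold the configured values. A minimal, self-contained sketch of failing fast instead of hitting the NPE (class name and host are illustrative, not from the patch):

```java
import java.net.InetSocketAddress;

public class UnresolvedAddressDemo {

    // Returns a printable host address, falling back to the hostname
    // when DNS resolution failed (getAddress() is null in that case).
    static String hostAddressOf(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            return addr.getHostString(); // safe: never null
        }
        return addr.getAddress().getHostAddress();
    }

    public static void main(String[] args) {
        InetSocketAddress addr =
                InetSocketAddress.createUnresolved("no-such-host.invalid", 6123);

        System.out.println(addr.isUnresolved());       // true
        System.out.println(addr.getAddress() == null); // true -> NPE if dereferenced
        System.out.println(hostAddressOf(addr) + ":" + addr.getPort());
    }
}
```

Checking {{isUnresolved()}} up front is also where a clearer "Unknown host" error can be raised, which is what the comment above proposes.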
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509097#comment-14509097 ] Stephan Ewen commented on FLINK-1930: - Should give better error messages now. Please post if this happens again... NullPointerException in vertex-centric iteration Key: FLINK-1930 URL: https://issues.apache.org/jira/browse/FLINK-1930 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri Hello to my Squirrels, I came across this exception when having a vertex-centric iteration output followed by a group-by. I'm not sure what is causing it, since I saw this error in a rather large pipeline, but I managed to reproduce it with [this code example|https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] and a sufficiently large dataset, e.g. [this one|http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally). It seems to be a null Buffer in the RecordWriter. The exception message is the following:
{code}
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
	at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
	at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:745)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510649#comment-14510649 ] Stephan Ewen commented on FLINK-1930: - And this is the only exception that you see, it is not a follow-up exception? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510650#comment-14510650 ] Stephan Ewen commented on FLINK-1930: - Seems like a bug that the pools get released too early. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1804) flink-quickstart-scala tests fail on scala-2.11 build profile on travis
[ https://issues.apache.org/jira/browse/FLINK-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505645#comment-14505645 ] Stephan Ewen commented on FLINK-1804: - Do these consistently fail, or was that one spurious failure? flink-quickstart-scala tests fail on scala-2.11 build profile on travis --- Key: FLINK-1804 URL: https://issues.apache.org/jira/browse/FLINK-1804 Project: Flink Issue Type: Task Components: Build System, Quickstarts Affects Versions: 0.9 Reporter: Robert Metzger Travis builds on master started failing after the Scala 2.11 profile was added to Flink. For example: https://travis-ci.org/apache/flink/jobs/56312734 The error:
{code}
[INFO] [INFO] --- scala-maven-plugin:3.1.4:compile (default) @ testArtifact ---
[INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype
[INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype-apache
[INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype
[INFO] [INFO] artifact joda-time:joda-time: checking for updates from sonatype-apache
[INFO] [WARNING] Expected all dependencies to require Scala version: 2.10.4
[INFO] [WARNING] com.twitter:chill_2.10:0.5.2 requires scala version: 2.10.4
[INFO] [WARNING] com.twitter:chill-avro_2.10:0.5.2 requires scala version: 2.10.4
[INFO] [WARNING] com.twitter:chill-bijection_2.10:0.5.2 requires scala version: 2.10.4
[INFO] [WARNING] com.twitter:bijection-core_2.10:0.7.2 requires scala version: 2.10.4
[INFO] [WARNING] com.twitter:bijection-avro_2.10:0.7.2 requires scala version: 2.10.4
[INFO] [WARNING] org.scala-lang:scala-reflect:2.10.4 requires scala version: 2.10.4
[INFO] [WARNING] org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala version: 2.10.4
[INFO] [WARNING] org.apache.flink:flink-scala:0.9-SNAPSHOT requires scala version: 2.10.4
[INFO] [WARNING] org.scala-lang:scala-compiler:2.10.4 requires scala version: 2.10.4
[INFO] [WARNING] org.scalamacros:quasiquotes_2.10:2.0.1 requires scala version: 2.10.4
[INFO] [WARNING] org.apache.flink:flink-streaming-scala:0.9-SNAPSHOT requires scala version: 2.11.4
[INFO] [WARNING] Multiple versions of scala libraries detected!
[INFO] [INFO] /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala:-1: info: compiling
[INFO] [INFO] Compiling 3 source files to /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes at 1427650524446
[INFO] [ERROR] error:
[INFO] [INFO] while compiling: /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/src/main/scala/org/apache/flink/archetypetest/SocketTextStreamWordCount.scala
[INFO] [INFO] during phase: typer
[INFO] [INFO] library version: version 2.10.4
[INFO] [INFO] compiler version: version 2.10.4
[INFO] [INFO] reconstructed args: -d /home/travis/build/apache/flink/flink-quickstart/flink-quickstart-scala/target/test-classes/projects/testArtifact/project/testArtifact/target/classes -classpath
{code}
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507290#comment-14507290 ] Stephan Ewen commented on FLINK-1930: - I have a bit of code pending that may help to figure out whether this is a side-effect of cancelling, or a bug in the buffer pools. I hope I can commit that soon... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1907) Scala Interactive Shell
[ https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505618#comment-14505618 ] Stephan Ewen commented on FLINK-1907: - Yes, the shell needs to start a mini cluster and run against it with a remote executor. The local executor will not work; it assumes everything is in the system classloader. Scala Interactive Shell --- Key: FLINK-1907 URL: https://issues.apache.org/jira/browse/FLINK-1907 Project: Flink Issue Type: New Feature Components: Scala API Reporter: Nikolaas Steenbergen Assignee: Nikolaas Steenbergen Priority: Minor Build an interactive shell for the Scala API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1878) Add mode to Environments to deactivate sysout printing
[ https://issues.apache.org/jira/browse/FLINK-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506978#comment-14506978 ] Stephan Ewen commented on FLINK-1878: - Implemented in b70431239a5e18555866addb41ee6edf2b79ff60 Add mode to Environments to deactivate sysout printing -- Key: FLINK-1878 URL: https://issues.apache.org/jira/browse/FLINK-1878 Project: Flink Issue Type: Bug Components: Tests Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The test output is currently spoiled for debugging by all the sysout output from the RemoteEnvironment-based tests. The execution environment should offer a mode to activate/deactivate printing to System.out - for tests, we should deactivate this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
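The mode described above amounts to a boolean switch on the environment's configuration that gates status printing. A deliberately simplified sketch of that pattern (class and method names here are made up for illustration, not Flink's actual API):

```java
// Illustrative sketch of a sysout-printing switch on an execution
// environment's config; names are invented, not Flink's real API.
public class EnvConfigSketch {
    private boolean sysoutLoggingEnabled = true;

    public void disableSysoutLogging() { sysoutLoggingEnabled = false; }
    public void enableSysoutLogging()  { sysoutLoggingEnabled = true; }
    public boolean isSysoutLoggingEnabled() { return sysoutLoggingEnabled; }

    // Job-status updates reach System.out only while the flag is on,
    // so tests can keep their output clean by turning it off.
    void reportStatus(String message) {
        if (sysoutLoggingEnabled) {
            System.out.println(message);
        }
    }

    public static void main(String[] args) {
        EnvConfigSketch cfg = new EnvConfigSketch();
        cfg.reportStatus("Job execution switched to RUNNING");
        cfg.disableSysoutLogging(); // e.g. inside test setup
        cfg.reportStatus("this status line is suppressed");
    }
}
```

Tests would flip the switch once during setup; production runs keep the default and still see progress output.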
[jira] [Resolved] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment
[ https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1918. - Resolution: Fixed Fix Version/s: 0.9 Assignee: Stephan Ewen Fixed via 2b8db40ac40d70027ce331f3a04c6ca7aa562a84 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment
[ https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1918. --- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1918) NullPointerException at org.apache.flink.client.program.Client's constructor while using ExecutionEnvironment.createRemoteEnvironment
[ https://issues.apache.org/jira/browse/FLINK-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506983#comment-14506983 ] Stephan Ewen commented on FLINK-1918: - [~Ehnalis] It should be fixed in the latest master. You can compile your own build, or wait a few hours until Travis/Apache have synced the Maven repositories with the new version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging
[ https://issues.apache.org/jira/browse/FLINK-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1923. --- Replace asynchronous logging in JobManager with regular slf4j logging - Key: FLINK-1923 URL: https://issues.apache.org/jira/browse/FLINK-1923 Project: Flink Issue Type: Task Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Till Rohrmann Fix For: 0.9 It's hard to understand exactly what's going on in the JobManager because the log messages are sent asynchronously by a logging actor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1923) Replace asynchronous logging in JobManager with regular slf4j logging
[ https://issues.apache.org/jira/browse/FLINK-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1923. - Resolution: Fixed Fix Version/s: 0.9 Fixed via 5a2ca81912193552e74cd6a33637b7254e5a7174 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1922) Failed task deployment causes NPE on input split assignment
[ https://issues.apache.org/jira/browse/FLINK-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1922. --- Failed task deployment causes NPE on input split assignment --- Key: FLINK-1922 URL: https://issues.apache.org/jira/browse/FLINK-1922 Project: Flink Issue Type: Bug Components: JobManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Till Rohrmann Fix For: 0.9 The input split assignment code is returning {{null}} if the task has failed, which is causing an NPE. We should improve our error handling / reporting in that situation.
{code}
13:12:31,002 INFO  org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Status of job c0b47ce41e9a85a628a628a3977705ef (Flink Java Job at Tue Apr 21 13:10:36 UTC 2015) changed to FAILING Cannot deploy task - TaskManager not responding..
13:12:47,591 ERROR org.apache.flink.runtime.operators.RegularPactTask - Error in task code: CHAIN DataSource (at userMethod (org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at main(UserClass.java:111)) (20/50)
java.lang.RuntimeException: Requesting the next InputSplit failed.
	at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88)
	at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337)
	at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
	at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106)
	at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301)
	at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83)
	... 4 more
13:12:47,595 INFO  org.apache.flink.runtime.taskmanager.Task - CHAIN DataSource (at SomeMethod (org.apache.flink.api.java.io.AvroInputFormat)) - FlatMap (FlatMap at main(SomeClass.java:111)) (20/50) switched to FAILED :
java.lang.RuntimeException: Requesting the next InputSplit failed.
	at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:88)
	at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:337)
	at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:136)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
	at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106)
	at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:301)
	at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:83)
	... 4 more
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
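The root cause in the trace above is mechanical: the deserialization path hands the null split data to {{ByteArrayInputStream}}, whose constructor dereferences the array. A minimal reproduction of just that JDK behavior (variable name is illustrative):

```java
import java.io.ByteArrayInputStream;

public class NullSplitDemo {
    public static void main(String[] args) {
        byte[] serializedSplit = null; // what the split provider effectively received

        try {
            // ByteArrayInputStream's constructor reads buf.length,
            // so a null argument fails with an NPE immediately.
            new ByteArrayInputStream(serializedSplit);
        } catch (NullPointerException e) {
            System.out.println("NPE on null split data");
        }
    }
}
```

Which is why the fix is about error handling: checking for the null split (or the failed task state) before deserializing allows a meaningful error instead of this NPE.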
[jira] [Created] (FLINK-1953) Rework Checkpoint Coordinator
Stephan Ewen created FLINK-1953: --- Summary: Rework Checkpoint Coordinator Key: FLINK-1953 URL: https://issues.apache.org/jira/browse/FLINK-1953 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The checkpoint coordinator currently contains no tests and is vulnerable in a variety of situations. In particular, I propose to add:
- better configurability of which tasks receive the trigger-checkpoint messages, which tasks need to acknowledge the checkpoint, and which tasks need to receive confirmation messages
- checkpoint timeouts, such that incomplete checkpoints are guaranteed to be cleaned up after a while, regardless of successful checkpoints
- better sanity checking of messages and fields, to properly handle/ignore messages for old/expired checkpoints, or invalidly routed messages
- better handling of checkpoint attempts at points where the execution has just failed or is currently being canceled
- a good set of tests
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
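The timeout and stale-message points in the proposal above boil down to tracking each pending checkpoint with its trigger time, so that late acknowledgements can be ignored and stragglers expired. A deliberately simplified sketch of that bookkeeping (this is not Flink's actual CheckpointCoordinator; names and structure are invented for illustration):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative sketch of checkpoint-timeout bookkeeping, not Flink's
// real coordinator: pending checkpoints are stored with their trigger
// time so incomplete ones can be discarded after a configured timeout.
public class CheckpointTimeoutSketch {
    private final long timeoutMillis;
    private final Map<Long, Long> pendingTriggerTime = new HashMap<>();

    public CheckpointTimeoutSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    public void trigger(long checkpointId, long nowMillis) {
        pendingTriggerTime.put(checkpointId, nowMillis);
    }

    // Acknowledgements for unknown or already-expired checkpoints are
    // simply ignored (returns false) - the "sanity checking" point above.
    public boolean acknowledge(long checkpointId) {
        return pendingTriggerTime.remove(checkpointId) != null;
    }

    // Called periodically: drops every pending checkpoint older than the
    // timeout and reports how many were cleaned up.
    public int expire(long nowMillis) {
        int dropped = 0;
        Iterator<Map.Entry<Long, Long>> it = pendingTriggerTime.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > timeoutMillis) {
                it.remove();
                dropped++;
            }
        }
        return dropped;
    }
}
```

The key property: once expired, a checkpoint's state is gone, so a late acknowledgement cannot resurrect it - which is what guarantees cleanup regardless of whether later checkpoints succeed.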
[jira] [Commented] (FLINK-1952) Cannot run ConnectedComponents example: Could not allocate a slot on instance
[ https://issues.apache.org/jira/browse/FLINK-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516845#comment-14516845 ] Stephan Ewen commented on FLINK-1952: - That is helpful, thanks! Cannot run ConnectedComponents example: Could not allocate a slot on instance - Key: FLINK-1952 URL: https://issues.apache.org/jira/browse/FLINK-1952 Project: Flink Issue Type: Bug Components: Scheduler Affects Versions: 0.9 Reporter: Robert Metzger Steps to reproduce {code} ./bin/yarn-session.sh -n 350 {code} ... wait until they are connected ... {code} Number of connected TaskManagers changed to 266. Slots available: 266 Number of connected TaskManagers changed to 323. Slots available: 323 Number of connected TaskManagers changed to 334. Slots available: 334 Number of connected TaskManagers changed to 343. Slots available: 343 Number of connected TaskManagers changed to 350. Slots available: 350 {code} Start CC {code} ./bin/flink run -p 350 ./examples/flink-java-examples-0.9-SNAPSHOT-ConnectedComponents.jar {code} --- it runs Run KMeans, let it fail with {code} Failed to deploy the task Map (Map at main(KMeans.java:100)) (1/350) - execution #0 to slot SimpleSlot (2)(2)(0) - 182b7661ca9547a84591de940c47a200 - ALLOCATED/ALIVE: java.io.IOException: Insufficient number of network buffers: required 350, but only 254 available. The total number of network buffers is currently set to 2048. You can increase this number by setting the configuration key 'taskmanager.network.numberOfBuffers'. {code} ... as expected. 
(I've waited for 10 minutes between the two submissions) Starting CC now will fail: {code} ./bin/flink run -p 350 ./examples/flink-java-examples-0.9-SNAPSHOT-ConnectedComponents.jar {code} Error message(s): {code} Caused by: java.lang.IllegalStateException: Could not schedule consumer vertex IterationHead(WorksetIteration (Unnamed Delta Iteration)) (19/350) at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:479) at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:469) at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94) at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107) ... 4 more Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate a slot on instance 4a6d761cb084c32310ece1f849556faf @ cloud-19 - 1 slots - URL: akka.tcp://flink@130.149.21.23:51400/user/taskmanager, as required by the co-location constraint. at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:247) at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:110) at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:262) at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:436) at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:475) ... 9 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
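The "Insufficient number of network buffers" failure above is back-of-the-envelope arithmetic: with parallelism 350, a single receiving gate already needs more buffers than the 254 that remained free out of the 2048 configured. The sketch below is a simplification (assuming one buffer per parallel channel; the helper names and the exact formula are illustrative, not Flink's actual allocation logic):

```java
public class NetworkBufferCheck {

    // Simplified assumption: every parallel input channel of a gate needs
    // at least one network buffer.
    static long requiredBuffers(int parallelism, int gatesPerTask) {
        return (long) parallelism * gatesPerTask;
    }

    // Mirrors the shape of the error message from the log above.
    static void check(long required, long available, long total) {
        if (required > available) {
            throw new IllegalStateException(
                "Insufficient number of network buffers: required " + required
                + ", but only " + available + " available. The total number of"
                + " network buffers is currently set to " + total + ".");
        }
    }

    public static void main(String[] args) {
        // the numbers from the log: 350 channels, 254 free of 2048 total
        try {
            check(requiredBuffers(350, 1), 254, 2048);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Raising `taskmanager.network.numberOfBuffers` increases the total pool, which is why the error message points at that key.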
[jira] [Commented] (FLINK-1953) Rework Checkpoint Coordinator
[ https://issues.apache.org/jira/browse/FLINK-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519058#comment-14519058 ] Stephan Ewen commented on FLINK-1953: - First part implemented in 7f0ce1428bc32181d6d79ca6f1226b9e2e3d93be Rework Checkpoint Coordinator - Key: FLINK-1953 URL: https://issues.apache.org/jira/browse/FLINK-1953 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The checkpoint coordinator currently contains no tests and is vulnerable to a variety of situations. In particular, I propose to add: - Better configurability of which tasks receive the trigger checkpoint messages, which tasks need to acknowledge the checkpoint, and which tasks need to receive confirmation messages. - Checkpoint timeouts, such that incomplete checkpoints are guaranteed to be cleaned up after a while, regardless of successful checkpoints - Better sanity checking of messages and fields, to properly handle/ignore messages for old/expired checkpoints, or invalidly routed messages - Better handling of checkpoint attempts at points where the execution has just failed or is currently being canceled. - Add a good set of tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1925) Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD
[ https://issues.apache.org/jira/browse/FLINK-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1925. --- Split SubmitTask method up into two phases: Receive TDD and instantiation of TDD Key: FLINK-1925 URL: https://issues.apache.org/jira/browse/FLINK-1925 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 A user reported that a job times out while submitting tasks to the TaskManager. The reason is that the JobManager expects a TaskOperationResult response upon submitting a task to the TM. The TM then downloads the required jars from the JM, which blocks the actor thread and can take a very long time if many TMs download from the JM. Due to this, the SubmitTask future throws a TimeoutException. A possible solution could be that the TM eagerly acknowledges the reception of the SubmitTask message and executes the task initialization within a future. The future will upon completion send an UpdateTaskExecutionState message to the JM, which switches the state of the task from deploying to running. This means that the handler of the SubmitTask future in {{Execution}} won't change the state of the task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
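The two-phase submission proposed above can be sketched with a `CompletableFuture` (hypothetical names; the real implementation uses Akka actors): the TaskManager acknowledges the SubmitTask message immediately, performs the slow initialization off the actor thread, and only afterwards reports the DEPLOYING to RUNNING transition in a separate state-update message.

```java
import java.util.concurrent.CompletableFuture;

public class TwoPhaseSubmitSketch {

    // Holds the async initialization result so the caller can observe
    // the later state-update message (illustrative only).
    static CompletableFuture<String> lastStateUpdate;

    static String submitTask(String taskDeploymentDescriptor) {
        lastStateUpdate = CompletableFuture.supplyAsync(() -> {
            // slow part: download jars from the JM, instantiate the task, ...
            return "UpdateTaskExecutionState: RUNNING";
        });
        // eager acknowledgement: sent before initialization finishes,
        // so the JobManager's SubmitTask future cannot time out on it
        return "ACK";
    }

    public static void main(String[] args) {
        System.out.println(submitTask("tdd"));      // returned immediately
        System.out.println(lastStateUpdate.join()); // arrives asynchronously later
    }
}
```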
[jira] [Commented] (FLINK-1947) Make Avro and Tachyon test logging less verbose
[ https://issues.apache.org/jira/browse/FLINK-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513768#comment-14513768 ] Stephan Ewen commented on FLINK-1947: - You can avoid the sysout logging by using {{ExecutionEnvironment.getConfig().disableSysoutLogging()}}. Make Avro and Tachyon test logging less verbose --- Key: FLINK-1947 URL: https://issues.apache.org/jira/browse/FLINK-1947 Project: Flink Issue Type: Improvement Reporter: Till Rohrmann Priority: Minor Currently, the {{AvroExternalJarProgramITCase}} and the Tachyon test cases write the cluster status messages to stdout. I think these messages are not needed and only clutter the test output. Therefore, we should maybe suppress these messages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1950) Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants
[ https://issues.apache.org/jira/browse/FLINK-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514215#comment-14514215 ] Stephan Ewen commented on FLINK-1950: - 20% and 30% are both rather random magic numbers. Is there a better way to do this? Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants Key: FLINK-1950 URL: https://issues.apache.org/jira/browse/FLINK-1950 Project: Flink Issue Type: Task Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Priority: Blocker It seems that too many production users are facing issues with YARN killing containers due to resource overusage. We can mitigate the issue by using only 70% of the specified memory for the heap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1950) Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants
[ https://issues.apache.org/jira/browse/FLINK-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516694#comment-14516694 ] Stephan Ewen commented on FLINK-1950: - Given that our non-heap overhead has a minimal constant component as well (2 * netty direct buffers, a minimum of 256m perm gen space), I think having a minimum cutoff of 380 MB seems quite reasonable for Flink as well. After that, it should be proportional, as the native pool, the stack space, etc. are all relative to the heap, if I recall correctly. 10% seems a bit too little; my gut feeling would have been 15% or so, but if they ran a lot of experiments for that, it is probably a reasonable number. Increase default heap cutoff ratio from 20% to 30% and move default value to ConfigConstants Key: FLINK-1950 URL: https://issues.apache.org/jira/browse/FLINK-1950 Project: Flink Issue Type: Task Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Blocker It seems that too many production users are facing issues with YARN killing containers due to resource overusage. We can mitigate the issue by using only 70% of the specified memory for the heap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
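The scheme discussed in this thread (a 30% proportional cutoff with a fixed minimum of 380 MB) amounts to the following arithmetic. The method names are hypothetical; only the 30% ratio and the 380 MB floor come from the discussion above:

```java
public class HeapCutoffSketch {

    static final long MIN_CUTOFF_MB = 380;   // fixed floor from the discussion
    static final double CUTOFF_RATIO = 0.30; // proposed default ratio

    /** Heap the JVM may use inside a YARN container of the given size. */
    static long heapSizeMb(long containerMb) {
        long cutoff = Math.max(MIN_CUTOFF_MB, (long) (containerMb * CUTOFF_RATIO));
        return containerMb - cutoff;
    }

    public static void main(String[] args) {
        // small container: the fixed 380 MB floor dominates
        System.out.println(heapSizeMb(1024));
        // large container: the 30% proportional cutoff dominates
        System.out.println(heapSizeMb(8192));
    }
}
```

For a 1 GB container the floor leaves 644 MB of heap; for an 8 GB container the ratio leaves 5735 MB, which is what keeps YARN from killing containers for native-memory overuse.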
[jira] [Assigned] (FLINK-1968) Make Distributed Cache more robust
[ https://issues.apache.org/jira/browse/FLINK-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen reassigned FLINK-1968: --- Assignee: Stephan Ewen Make Distributed Cache more robust -- Key: FLINK-1968 URL: https://issues.apache.org/jira/browse/FLINK-1968 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen The distributed cache has a variety of issues at the moment. - It does not give a proper exception when a non-cached file is accessed - It swallows I/O exceptions that happen during file transfer and later only returns null - It keeps reference counts inconsistently and attempts to copy too often, resolving this via file collisions - Files are not properly removed on shutdown - No shutdown hook to remove files when the process is killed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
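Two of the bullets above (consistent reference counting, and a proper exception instead of null for non-cached files) can be sketched in plain Java. This is a hypothetical illustration of the desired behavior, not Flink's `FileCache` API:

```java
import java.util.HashMap;
import java.util.Map;

public class DistributedCacheSketch {

    private final Map<String, Integer> refCounts = new HashMap<>();

    /** Copy the file only on the first registration; afterwards just count. */
    public synchronized void register(String name) {
        refCounts.merge(name, 1, Integer::sum);
    }

    /** Fail loudly for non-cached files instead of returning null. */
    public synchronized String getFile(String name) {
        if (!refCounts.containsKey(name)) {
            throw new IllegalArgumentException(
                "File '" + name + "' is not in the distributed cache");
        }
        return "/tmp/flink-dist-cache/" + name; // placeholder local path
    }

    /** Returns true when the last reference is gone and the copy can be deleted. */
    public synchronized boolean release(String name) {
        Integer c = refCounts.get(name);
        if (c == null) return false;
        if (c == 1) { refCounts.remove(name); return true; }
        refCounts.put(name, c - 1);
        return false;
    }
}
```

The remaining bullets (not swallowing I/O exceptions, cleanup on shutdown via a shutdown hook) are orthogonal and omitted here.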
[jira] [Created] (FLINK-1972) Remove instanceID in streaming tasks
Stephan Ewen created FLINK-1972: --- Summary: Remove instanceID in streaming tasks Key: FLINK-1972 URL: https://issues.apache.org/jira/browse/FLINK-1972 Project: Flink Issue Type: Bug Components: Streaming Reporter: Stephan Ewen The streaming tasks have a weird instanceID that is a JVM-local counter. Tasks already have proper names, vertex IDs, and IDs scoped to the execution attempt. We should remove this counter ID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning
[ https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519542#comment-14519542 ] Stephan Ewen commented on FLINK-1959: - Can you try if this only happens with a parallelism greater than one, or also with parallelism one? Accumulators BROKEN after Partitioning -- Key: FLINK-1959 URL: https://issues.apache.org/jira/browse/FLINK-1959 Project: Flink Issue Type: Bug Components: Examples Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Critical Fix For: 0.8.1 While running the Accumulator example in https://github.com/Elbehery/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/EmptyFieldsCountAccumulator.java, I tried to alter the data flow with a PartitionByHash function before applying Filter, and the resulting accumulator was NULL. By debugging, I could see the accumulator in the runtime map. However, by retrieving the accumulator from the JobExecutionResult object, it was NULL. The line that caused the problem is file.partitionByHash(1).filter(new EmptyFieldFilter()) instead of file.filter(new EmptyFieldFilter()) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1690) ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521736#comment-14521736 ] Stephan Ewen commented on FLINK-1690: - This one is interesting. It is not test-specific; it is a vulnerability that all tests have: a TaskManager port conflict. Because the port is chosen not by netty but beforehand (randomly), two TaskManagers can attempt to open the same port (with a small probability). ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously fails on Travis -- Key: FLINK-1690 URL: https://issues.apache.org/jira/browse/FLINK-1690 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Priority: Minor I got the following error on Travis. {code} ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure:244 The program did not finish in time {code} I think we have to increase the timeouts for this test case to make it reliably run on Travis. The log of the failed Travis build can be found [here|https://api.travis-ci.org/jobs/53952486/log.txt?deansi=true] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
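The standard fix for the port-conflict race described above is to stop picking a random port up front and instead bind to port 0, letting the OS assign a free port that is then read back. A minimal sketch with `java.net.ServerSocket` (the class name here is illustrative):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortSketch {

    /**
     * Binds to an OS-assigned ephemeral port and reports it.
     * Unlike choosing a random port beforehand, this cannot race with
     * another process picking the same number: the bind itself reserves it.
     */
    static int bindToFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort(); // the port the OS actually assigned
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bound to port " + bindToFreePort());
    }
}
```

Note the remaining caveat: the port is only guaranteed free while the socket is held open, so a real server would keep the socket (or hand it to netty) rather than close and re-bind.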
[jira] [Commented] (FLINK-1962) Add Gelly Scala API
[ https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530313#comment-14530313 ] Stephan Ewen commented on FLINK-1962: - How about removing the generic type constraints from the Gelly Java API? In the core API, we have no such constraints. The Flink client checks during plan generation anyways whether the key types are suitable as keys, so you would still get a check in the pre-flight phase. Concerning serializability, this is mainly caused by the fact that the types are used in function closures. This is a limitation that would be good to get around in the near future anyways, so that we support all types that the Flink type stack can handle. BTW: Scala case classes are serializable, so they should actually work fine against the serializable bound. Add Gelly Scala API --- Key: FLINK-1962 URL: https://issues.apache.org/jira/browse/FLINK-1962 Project: Flink Issue Type: Task Components: Gelly, Scala API Affects Versions: 0.9 Reporter: Vasia Kalavri -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1973) JobManager log does not contain state change messages on INFO level
Stephan Ewen created FLINK-1973: --- Summary: JobManager log does not contain state change messages on INFO level Key: FLINK-1973 URL: https://issues.apache.org/jira/browse/FLINK-1973 Project: Flink Issue Type: Improvement Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The JobManager log contains only deployment messages and job state change messages on INFO log level. Messages about task vertices are only logged on DEBUG level, which makes debugging harder, as by default this info is only available in the TaskManager logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1996) Add output methods to Table API
[ https://issues.apache.org/jira/browse/FLINK-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538457#comment-14538457 ] Stephan Ewen commented on FLINK-1996: - Does this make a good starter issue? Add output methods to Table API --- Key: FLINK-1996 URL: https://issues.apache.org/jira/browse/FLINK-1996 Project: Flink Issue Type: Improvement Components: Table API Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Aljoscha Krettek Tables need to be converted to DataSets (or DataStreams) to write them out. It would be good to have a way to emit Table results directly for example to print, CSV, JDBC, HBase, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1996) Add output methods to Table API
[ https://issues.apache.org/jira/browse/FLINK-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538456#comment-14538456 ] Stephan Ewen commented on FLINK-1996: - I think it may actually be worth adding those methods, where possible. It really is much simpler that way... Their implementation could be slim, if they delegate to the data set code / output formats. Add output methods to Table API --- Key: FLINK-1996 URL: https://issues.apache.org/jira/browse/FLINK-1996 Project: Flink Issue Type: Improvement Components: Table API Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Aljoscha Krettek Tables need to be converted to DataSets (or DataStreams) to write them out. It would be good to have a way to emit Table results directly for example to print, CSV, JDBC, HBase, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1968) Make Distributed Cache more robust
[ https://issues.apache.org/jira/browse/FLINK-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1968. - Resolution: Fixed Fixed via 1c8d866a83065e3d1bc9707dab81117f24c9f678 Make Distributed Cache more robust -- Key: FLINK-1968 URL: https://issues.apache.org/jira/browse/FLINK-1968 Project: Flink Issue Type: Bug Components: Local Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The distributed cache has a variety of issues at the moment. - It does not give a proper exception when a non-cached file is accessed - It swallows I/O exceptions that happen during file transfer and later only returns null - It keeps reference counts inconsistently and attempts to copy too often, resolving this via file collisions - Files are not properly removed on shutdown - No shutdown hook to remove files when the process is killed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1968) Make Distributed Cache more robust
[ https://issues.apache.org/jira/browse/FLINK-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1968. --- Make Distributed Cache more robust -- Key: FLINK-1968 URL: https://issues.apache.org/jira/browse/FLINK-1968 Project: Flink Issue Type: Bug Components: Local Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The distributed cache has a variety of issues at the moment. - It does not give a proper exception when a non-cached file is accessed - It swallows I/O exceptions that happen during file transfer and later only returns null - It keeps reference counts inconsistently and attempts to copy too often, resolving this via file collisions - Files are not properly removed on shutdown - No shutdown hook to remove files when the process is killed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1874) Break up streaming connectors into submodules
[ https://issues.apache.org/jira/browse/FLINK-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538556#comment-14538556 ] Stephan Ewen commented on FLINK-1874: - I think we are about to merge the biggest change, so this issue should become available soon... Break up streaming connectors into submodules - Key: FLINK-1874 URL: https://issues.apache.org/jira/browse/FLINK-1874 Project: Flink Issue Type: Task Components: Build System, Streaming Affects Versions: 0.9 Reporter: Robert Metzger Labels: starter As per: http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Break-up-streaming-connectors-into-subprojects-td5001.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1672) Refactor task registration/unregistration
[ https://issues.apache.org/jira/browse/FLINK-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1672. - Resolution: Fixed Fix Version/s: 0.9 Assignee: Stephan Ewen Implemented in 8e61301452218e6d279b013beb7bbd02a7c2e3f9 Refactor task registration/unregistration - Key: FLINK-1672 URL: https://issues.apache.org/jira/browse/FLINK-1672 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Ufuk Celebi Assignee: Stephan Ewen Fix For: 0.9 h4. Current control flow for task registrations # JM submits a TaskDeploymentDescriptor to a TM ## TM registers the required JAR files with the LibraryCacheManager and returns the user code class loader ## TM creates a Task instance and registers the task in the runningTasks map ## TM creates a TaskInputSplitProvider ## TM creates a RuntimeEnvironment and sets it as the environment for the task ## TM registers the task with the network environment ## TM sends async msg to profiler to monitor tasks ## TM creates temporary files in file cache ## TM tries to start the task If any operation >= 1.2 fails: * TM calls task.failExternally() * TM removes temporary files from file cache * TM unregisters the task from the network environment * TM sends async msg to profiler to unmonitor tasks * TM calls unregisterMemoryManager on task If 1.1 fails, only unregister from LibraryCacheManager. h4. RuntimeEnvironment, Task, TaskManager separation The RuntimeEnvironment has references to certain components of the task manager, like the memory manager, which are accessed from the task. Furthermore it implements Runnable, and creates the executing task Thread. The Task instance essentially wraps the RuntimeEnvironment and allows asynchronous state management of the task (RUNNING, FINISHED, etc.). The way that the state updates affect the task is not that obvious: state changes trigger messages to the TM, which for final states further trigger a msg to unregister the task. 
The way that tasks are unregistered again depends on the state of the task. I would propose to refactor this to make the way the state handling/registration/unregistration is handled more transparent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1969) Remove old profile code
[ https://issues.apache.org/jira/browse/FLINK-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1969. - Resolution: Done Done in fbea2da26d01c470687a5ad217a5fd6ad1de89e4 Remove old profile code --- Key: FLINK-1969 URL: https://issues.apache.org/jira/browse/FLINK-1969 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The old profiler code is not instantiated any more and is basically dead. It has in parts been replaced by the metrics library already. The classes still get in the way during refactoring, which is why I suggest to remove them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning
[ https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538540#comment-14538540 ] Stephan Ewen commented on FLINK-1959: - I will try and look into this very soon... Accumulators BROKEN after Partitioning -- Key: FLINK-1959 URL: https://issues.apache.org/jira/browse/FLINK-1959 Project: Flink Issue Type: Bug Components: Examples Affects Versions: master Reporter: mustafa elbehery Priority: Critical Fix For: master While running the Accumulator example in https://github.com/Elbehery/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/EmptyFieldsCountAccumulator.java, I tried to alter the data flow with a PartitionByHash function before applying Filter, and the resulting accumulator was NULL. By debugging, I could see the accumulator in the runtime map. However, by retrieving the accumulator from the JobExecutionResult object, it was NULL. The line that caused the problem is file.partitionByHash(1).filter(new EmptyFieldFilter()) instead of file.filter(new EmptyFieldFilter()) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1962) Add Gelly Scala API
[ https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538571#comment-14538571 ] Stephan Ewen commented on FLINK-1962: - This seems like a flaw in the Scala type analysis. The Scala type system should really support the Java types properly. This will be a recurring issue if we consider it common practice to build Scala libraries on top of Java libraries. Add Gelly Scala API --- Key: FLINK-1962 URL: https://issues.apache.org/jira/browse/FLINK-1962 Project: Flink Issue Type: Task Components: Gelly, Scala API Affects Versions: 0.9 Reporter: Vasia Kalavri -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1969) Remove old profile code
[ https://issues.apache.org/jira/browse/FLINK-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1969. --- Remove old profile code --- Key: FLINK-1969 URL: https://issues.apache.org/jira/browse/FLINK-1969 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The old profiler code is not instantiated any more and is basically dead. It has in parts been replaced by the metrics library already. The classes still get in the way during refactoring, which is why I suggest to remove them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1672) Refactor task registration/unregistration
[ https://issues.apache.org/jira/browse/FLINK-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1672. --- Refactor task registration/unregistration - Key: FLINK-1672 URL: https://issues.apache.org/jira/browse/FLINK-1672 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Ufuk Celebi Assignee: Stephan Ewen Fix For: 0.9 h4. Current control flow for task registrations # JM submits a TaskDeploymentDescriptor to a TM ## TM registers the required JAR files with the LibraryCacheManager and returns the user code class loader ## TM creates a Task instance and registers the task in the runningTasks map ## TM creates a TaskInputSplitProvider ## TM creates a RuntimeEnvironment and sets it as the environment for the task ## TM registers the task with the network environment ## TM sends async msg to profiler to monitor tasks ## TM creates temporary files in file cache ## TM tries to start the task If any operation >= 1.2 fails: * TM calls task.failExternally() * TM removes temporary files from file cache * TM unregisters the task from the network environment * TM sends async msg to profiler to unmonitor tasks * TM calls unregisterMemoryManager on task If 1.1 fails, only unregister from LibraryCacheManager. h4. RuntimeEnvironment, Task, TaskManager separation The RuntimeEnvironment has references to certain components of the task manager, like the memory manager, which are accessed from the task. Furthermore it implements Runnable, and creates the executing task Thread. The Task instance essentially wraps the RuntimeEnvironment and allows asynchronous state management of the task (RUNNING, FINISHED, etc.). The way that the state updates affect the task is not that obvious: state changes trigger messages to the TM, which for final states further trigger a msg to unregister the task. The way that tasks are unregistered again depends on the state of the task. 
I would propose to refactor this to make the way the state handling/registration/unregistration is handled more transparent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1986) Group by fails on iterative data streams
[ https://issues.apache.org/jira/browse/FLINK-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538328#comment-14538328 ] Stephan Ewen commented on FLINK-1986: - [~gyfora] and [~senorcarbone] Are you using co-location constraints to make sure that head and tail of an iteration are co-located? Otherwise that is not guaranteed, but required by the backchannel broker. Group by fails on iterative data streams Key: FLINK-1986 URL: https://issues.apache.org/jira/browse/FLINK-1986 Project: Flink Issue Type: Bug Components: Streaming Reporter: Daniel Bali Labels: iteration, streaming Hello! When I try to run a `groupBy` on an IterativeDataStream I get a NullPointerException. Here is the code that reproduces the issue:
{code}
public Test() throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();

    DataStream<Tuple2<Long, Long>> edges = env
        .generateSequence(0, 7)
        .map(new MapFunction<Long, Tuple2<Long, Long>>() {
            @Override
            public Tuple2<Long, Long> map(Long v) throws Exception {
                return new Tuple2<>(v, (v + 1));
            }
        });

    IterativeDataStream<Tuple2<Long, Long>> iteration = edges.iterate();

    SplitDataStream<Tuple2<Long, Long>> step = iteration.groupBy(1)
        .map(new MapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>>() {
            @Override
            public Tuple2<Long, Long> map(Tuple2<Long, Long> tuple) throws Exception {
                return tuple;
            }
        })
        .split(new OutputSelector<Tuple2<Long, Long>>() {
            @Override
            public Iterable<String> select(Tuple2<Long, Long> tuple) {
                List<String> output = new ArrayList<>();
                output.add("iterate");
                return output;
            }
        });

    iteration.closeWith(step.select("iterate"));
    env.execute("Sandbox");
}
{code}
Moving the groupBy before the iteration solves the issue. e.g. this works:
{code}
... iteration = edges.groupBy(1).iterate(); iteration.map(...) ...
{code}
Here is the stack trace:
{code}
Exception in thread "main" java.lang.NullPointerException
	at org.apache.flink.streaming.api.graph.StreamGraph.addIterationTail(StreamGraph.java:207)
	at org.apache.flink.streaming.api.datastream.IterativeDataStream.closeWith(IterativeDataStream.java:72)
	at org.apache.flink.graph.streaming.example.Test.<init>(Test.java:73)
	at org.apache.flink.graph.streaming.example.Test.main(Test.java:79)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1989) Sorting of POJO data set from TableEnv yields NotSerializableException
[ https://issues.apache.org/jira/browse/FLINK-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538348#comment-14538348 ] Stephan Ewen commented on FLINK-1989: - Storing the type information breaks with the current design principles, where the {{TypeInformation}} is a pure pre-flight concept, and the {{TypeSerializer}} and {{TypeComparator}} are the runtime handles. Sorting of POJO data set from TableEnv yields NotSerializableException -- Key: FLINK-1989 URL: https://issues.apache.org/jira/browse/FLINK-1989 Project: Flink Issue Type: Bug Components: Table API Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Aljoscha Krettek Fix For: 0.9 Sorting or grouping (or probably any other key operation) on a POJO data set that was created by a {{TableEnvironment}} yields a {{NotSerializableException}} due to a non-serializable {{java.lang.reflect.Field}} object. I traced the error back to the {{ExpressionSelectFunction}}. I guess that a {{TypeInformation}} object is stored in the generated user-code function. A {{PojoTypeInfo}} holds Field objects, which cannot be serialized. The following test can be pasted into the {{SelectITCase}} and reproduces the problem. 
{code}
@Test
public void testGroupByAfterTable() throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    TableEnvironment tableEnv = new TableEnvironment();

    DataSet<Tuple3<Integer, Long, String>> ds = CollectionDataSets.get3TupleDataSet(env);
    Table in = tableEnv.toTable(ds, "a,b,c");

    Table result = in.select("a, b, c");

    DataSet<ABC> resultSet = tableEnv.toSet(result, ABC.class);
    resultSet
        .sortPartition("a", Order.DESCENDING)
        .writeAsText(resultPath, FileSystem.WriteMode.OVERWRITE);
    env.execute();

    expected = "1,1,Hi\n" + "2,2,Hello\n" + "3,2,Hello world\n" + "4,3,Hello world, " + "how are you?\n" +
        "5,3,I am fine.\n" + "6,3,Luke Skywalker\n" + "7,4," + "Comment#1\n" + "8,4,Comment#2\n" +
        "9,4,Comment#3\n" + "10,4,Comment#4\n" + "11,5," + "Comment#5\n" + "12,5,Comment#6\n" +
        "13,5,Comment#7\n" + "14,5,Comment#8\n" + "15,5," + "Comment#9\n" + "16,6,Comment#10\n" +
        "17,6,Comment#11\n" + "18,6,Comment#12\n" + "19," + "6,Comment#13\n" + "20,6,Comment#14\n" +
        "21,6,Comment#15\n";
}

public static class ABC {
    public int a;
    public long b;
    public String c;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
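The root cause described above (a {{PojoTypeInfo}} holding {{java.lang.reflect.Field}} objects inside a serialized function closure) can be reproduced without any Flink code, since `Field` does not implement `Serializable`. A minimal demonstration (class and helper names are hypothetical):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;

public class FieldSerializationDemo {

    // Stands in for a generated function that (transitively) holds type info.
    public static class Holder implements Serializable {
        public Field field; // java.lang.reflect.Field is NOT Serializable
    }

    /** Returns false iff Java serialization fails with NotSerializableException. */
    public static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static Holder holderWithReflectiveField() throws NoSuchFieldException {
        Holder h = new Holder();
        h.field = Holder.class.getDeclaredField("field");
        return h;
    }
}
```

An empty `Holder` serializes fine, but as soon as it carries a `Field` reference the write fails, which is consistent with the design principle in the comment: keep `TypeInformation` pre-flight only, and ship only the `TypeSerializer`/`TypeComparator` runtime handles.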
[jira] [Commented] (FLINK-1962) Add Gelly Scala API
[ https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539532#comment-14539532 ] Stephan Ewen commented on FLINK-1962: - In Scala, you usually use the Scala tuples. The Java tuples were introduced, because: - we wanted to keep the Scala dependencies out of the Java API - they can be used in a mutable way, which allows you to make some performance optimizations - we found that certain Java programmers were confused by the Scala tuples and the fields counting from _1 (where the tuple positions in the keys count from 0) That being said, you should be able to use Scala tuples in the Java API, if you prefer that... Add Gelly Scala API --- Key: FLINK-1962 URL: https://issues.apache.org/jira/browse/FLINK-1962 Project: Flink Issue Type: Task Components: Gelly, Scala API Affects Versions: 0.9 Reporter: Vasia Kalavri -- This message was sent by Atlassian JIRA (v6.3.4#6332)
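The mutability point above can be illustrated with a plain-Java stand-in for Flink's {{Tuple2}} (a hypothetical class for illustration, not the actual Flink type): one mutable instance is refilled for every record instead of allocating a fresh immutable tuple per record, and its fields count from 0 ({{f0}}, {{f1}}), matching key positions in the Java API.

```java
import java.util.ArrayList;
import java.util.List;

public class TupleReuse {

    // Plain-Java stand-in for Flink's mutable Tuple2 (illustrative only).
    // Fields count from 0, matching how key positions are addressed in the
    // Java API, while Scala tuples expose _1, _2, ...
    static final class MutableTuple2<A, B> {
        A f0;
        B f1;
    }

    public static void main(String[] args) {
        // A single instance is refilled per record instead of allocating a
        // new immutable tuple each time -- the optimization mentioned above.
        MutableTuple2<Integer, String> reuse = new MutableTuple2<>();
        List<String> out = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            reuse.f0 = i;
            reuse.f1 = "record-" + i;
            out.add(reuse.f0 + "/" + reuse.f1);
        }
        System.out.println(out); // [0/record-0, 1/record-1, 2/record-2]
    }
}
```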
[jira] [Closed] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
[ https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1857. --- Flakey SimpleRecoveryITCase#testRestartMultipleTimes test - Key: FLINK-1857 URL: https://issues.apache.org/jira/browse/FLINK-1857 Project: Flink Issue Type: Bug Components: Build System Affects Versions: master Reporter: Ufuk Celebi Fix For: 0.9 SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an unrelated change to the documentation): https://travis-ci.org/uce/incubator-flink/builds/57814190 https://travis-ci.org/uce/incubator-flink/jobs/57814194 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt https://travis-ci.org/uce/incubator-flink/jobs/57814195 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
[ https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1857. - Resolution: Done Fix Version/s: 0.9 Flakey SimpleRecoveryITCase#testRestartMultipleTimes test - Key: FLINK-1857 URL: https://issues.apache.org/jira/browse/FLINK-1857 Project: Flink Issue Type: Bug Components: Build System Affects Versions: master Reporter: Ufuk Celebi Fix For: 0.9 SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an unrelated change to the documentation): https://travis-ci.org/uce/incubator-flink/builds/57814190 https://travis-ci.org/uce/incubator-flink/jobs/57814194 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt https://travis-ci.org/uce/incubator-flink/jobs/57814195 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
[ https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541682#comment-14541682 ] Stephan Ewen commented on FLINK-1857: - I think this has been fixed by the combination of multiple fixes to timeouts, task initialization, and message robustness Flakey SimpleRecoveryITCase#testRestartMultipleTimes test - Key: FLINK-1857 URL: https://issues.apache.org/jira/browse/FLINK-1857 Project: Flink Issue Type: Bug Components: Build System Affects Versions: master Reporter: Ufuk Celebi SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an unrelated change to the documentation): https://travis-ci.org/uce/incubator-flink/builds/57814190 https://travis-ci.org/uce/incubator-flink/jobs/57814194 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt https://travis-ci.org/uce/incubator-flink/jobs/57814195 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1784) KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1784. - Resolution: Not A Problem KafkaITCase --- Key: FLINK-1784 URL: https://issues.apache.org/jira/browse/FLINK-1784 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Fabian Hueske Priority: Minor I observed a non-deterministic failure of the {{KafkaITCase}} on Travis: https://travis-ci.org/fhueske/flink/jobs/55815808 {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 03/25/2015 16:08:15 Job execution switched to status RUNNING. 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.KafkaSink.open(KafkaSink.java:118) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:202) at 
org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILING. 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to CANCELING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.simple.PersistentKafkaSource.open(PersistentKafkaSource.java:159) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:198) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILED. Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.774 sec FAILURE! - in org.apache.flink.streaming.connectors.kafka.KafkaITCase test(org.apache.flink.streaming.connectors.kafka.KafkaITCase) Time elapsed: 10.604 sec FAILURE! 
java.lang.AssertionError: Test failed with: null at org.junit.Assert.fail(Assert.java:88) at org.apache.flink.streaming.connectors.kafka.KafkaITCase.test(KafkaITCase.java:104) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1784) KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541689#comment-14541689 ] Stephan Ewen commented on FLINK-1784: - Subsumed by FLINK-2008 KafkaITCase --- Key: FLINK-1784 URL: https://issues.apache.org/jira/browse/FLINK-1784 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Fabian Hueske Priority: Minor I observed a non-deterministic failure of the {{KafkaITCase}} on Travis: https://travis-ci.org/fhueske/flink/jobs/55815808 {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 03/25/2015 16:08:15 Job execution switched to status RUNNING. 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.KafkaSink.open(KafkaSink.java:118) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at 
org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:202) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILING. 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to CANCELING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.simple.PersistentKafkaSource.open(PersistentKafkaSource.java:159) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:198) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILED. Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.774 sec FAILURE! - in org.apache.flink.streaming.connectors.kafka.KafkaITCase test(org.apache.flink.streaming.connectors.kafka.KafkaITCase) Time elapsed: 10.604 sec FAILURE! 
java.lang.AssertionError: Test failed with: null at org.junit.Assert.fail(Assert.java:88) at org.apache.flink.streaming.connectors.kafka.KafkaITCase.test(KafkaITCase.java:104) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1690) ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541692#comment-14541692 ] Stephan Ewen commented on FLINK-1690: - I think this one is stable by now... ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously fails on Travis -- Key: FLINK-1690 URL: https://issues.apache.org/jira/browse/FLINK-1690 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Priority: Minor I got the following error on Travis. {code} ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure:244 The program did not finish in time {code} I think we have to increase the timeouts for this test case to make it reliably run on Travis. The log of the failed Travis build can be found [here|https://api.travis-ci.org/jobs/53952486/log.txt?deansi=true] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541684#comment-14541684 ] Stephan Ewen commented on FLINK-1865: - With the rewrite of the PersistentKafkaSource, this has been subsumed by FLINK-2008 Unstable test KafkaITCase - Key: FLINK-1865 URL: https://issues.apache.org/jira/browse/FLINK-1865 Project: Flink Issue Type: Bug Components: Streaming, Tests Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Robert Metzger {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 04/10/2015 13:46:53 Job execution switched to status RUNNING. 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at 
org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54) at org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39) at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196) at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 4 more Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166) at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141) at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 9 more 04/10/2015 13:47:04 Job execution switched to status FAILING. 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELED 04/10/2015 13:47:04 Job execution switched to status FAILED. 04/10/2015 13:47:05 Job execution switched to status RUNNING. 
04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:15 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at
[jira] [Closed] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1865. --- Resolution: Not A Problem Unstable test KafkaITCase - Key: FLINK-1865 URL: https://issues.apache.org/jira/browse/FLINK-1865 Project: Flink Issue Type: Bug Components: Streaming, Tests Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Robert Metzger {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 04/10/2015 13:46:53 Job execution switched to status RUNNING. 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54) at 
org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39) at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196) at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 4 more Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166) at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141) at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 9 more 04/10/2015 13:47:04 Job execution switched to status FAILING. 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELED 04/10/2015 13:47:04 Job execution switched to status FAILED. 04/10/2015 13:47:05 Job execution switched to status RUNNING. 
04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:15 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at
[jira] [Commented] (FLINK-1983) Remove dependencies on Record APIs for Spargel
[ https://issues.apache.org/jira/browse/FLINK-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538180#comment-14538180 ] Stephan Ewen commented on FLINK-1983: - This seems doable without implications. Remove dependencies on Record APIs for Spargel -- Key: FLINK-1983 URL: https://issues.apache.org/jira/browse/FLINK-1983 Project: Flink Issue Type: Sub-task Components: Spargel Reporter: Henry Saputra Need to remove usage of Record API in Spargel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1982) Remove dependencies on Record for Flink runtime and core
[ https://issues.apache.org/jira/browse/FLINK-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14538156#comment-14538156 ] Stephan Ewen commented on FLINK-1982: - I think the runtime and optimizer specializations can be removed together with the API in one patch. The last blocker is probably that some of the runtime tests are implemented in the Record API. We would need to migrate some; many can probably be dropped, as they are redundant now. The only test that we cannot port is the terasort test, because the other APIs do not yet support range partitioning. Remove dependencies on Record for Flink runtime and core Key: FLINK-1982 URL: https://issues.apache.org/jira/browse/FLINK-1982 Project: Flink Issue Type: Sub-task Components: Core Reporter: Henry Saputra Seems like there are several uses of the Record API in the core and runtime modules that need to be updated before the Record API can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-2011) Improve error message when user-defined serialization logic is wrong
Stephan Ewen created FLINK-2011: --- Summary: Improve error message when user-defined serialization logic is wrong Key: FLINK-2011 URL: https://issues.apache.org/jira/browse/FLINK-2011 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-2010) Add test that verifies that chained tasks are properly checkpointed
Stephan Ewen created FLINK-2010: --- Summary: Add test that verifies that chained tasks are properly checkpointed Key: FLINK-2010 URL: https://issues.apache.org/jira/browse/FLINK-2010 Project: Flink Issue Type: Bug Components: Streaming Reporter: Stephan Ewen Because this is critical logic, we should have a unit test for the {{StreamingTask}} validating that the state from all chained tasks is taken into account. It needs to check - The trigger checkpoint method, and that the state handle handle in the checkpoint acknowledgement contains all the necessary state - The restore state method, that state is reset properly -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-2004) Memory leack in presence of failed checkpoints in KafkaSource
Stephan Ewen created FLINK-2004: --- Summary: Memory leack in presence of failed checkpoints in KafkaSource Key: FLINK-2004 URL: https://issues.apache.org/jira/browse/FLINK-2004 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Robert Metzger Priority: Critical Fix For: 0.9 Checkpoints that fail never send a commit message to the tasks. Maintaining a map of all pending checkpoints introduces a memory leak, as entries for failed checkpoints will never be removed. Approaches to fix this: - The source cleans up entries from older checkpoints once a checkpoint is committed (simple implementation in a linked hash map) - The commit message could include the optional state handle (the source need not maintain the map) - The checkpoint coordinator could send messages for failed checkpoints? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
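The first proposed fix (cleanup on commit via a linked hash map) could look roughly like this; all names here are a hypothetical sketch, not Flink's actual code. It relies on checkpoint ids being monotonically increasing, so committing checkpoint n can evict every pending entry with id <= n, including entries left behind by failed checkpoints:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not Flink's actual code) of the first proposed fix:
// pending checkpoint state is kept in insertion order, and every successful
// commit evicts all entries with an older or equal checkpoint id.
public class PendingCheckpoints<S> {

    private final LinkedHashMap<Long, S> pending = new LinkedHashMap<>();

    public void register(long checkpointId, S state) {
        pending.put(checkpointId, state);
    }

    public S commit(long checkpointId) {
        S committed = pending.get(checkpointId);
        // Checkpoint ids increase monotonically, so entries at the head of the
        // insertion-ordered map with a smaller id belong to failed or expired
        // checkpoints and can never be committed anymore -- drop them too.
        Iterator<Map.Entry<Long, S>> it = pending.entrySet().iterator();
        while (it.hasNext() && it.next().getKey() <= checkpointId) {
            it.remove();
        }
        return committed;
    }

    public int size() {
        return pending.size();
    }
}
```

Registering checkpoints 1, 2, 3 and then committing 3 leaves the map empty, so state from failed checkpoints 1 and 2 no longer leaks.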
[jira] [Updated] (FLINK-2004) Memory leak in presence of failed checkpoints in KafkaSource
[ https://issues.apache.org/jira/browse/FLINK-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen updated FLINK-2004: Summary: Memory leak in presence of failed checkpoints in KafkaSource (was: Memory leack in presence of failed checkpoints in KafkaSource) Memory leak in presence of failed checkpoints in KafkaSource Key: FLINK-2004 URL: https://issues.apache.org/jira/browse/FLINK-2004 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Robert Metzger Priority: Critical Fix For: 0.9 Checkpoints that fail never send a commit message to the tasks. Maintaining a map of all pending checkpoints introduces a memory leak, as entries for failed checkpoints will never be removed. Approaches to fix this: - The source cleans up entries from older checkpoints once a checkpoint is committed (simple implementation in a linked hash map) - The commit message could include the optional state handle (the source need not maintain the map) - The checkpoint coordinator could send messages for failed checkpoints? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning
[ https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14540610#comment-14540610 ] Stephan Ewen commented on FLINK-1959: - Great debugging, this is perfect! I will prepare a fix based on that... Accumulators BROKEN after Partitioning -- Key: FLINK-1959 URL: https://issues.apache.org/jira/browse/FLINK-1959 Project: Flink Issue Type: Bug Components: Examples Affects Versions: master Reporter: mustafa elbehery Priority: Critical Fix For: master while running the Accumulator example in https://github.com/Elbehery/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/EmptyFieldsCountAccumulator.java, I tried to alter the data flow with PartitionByHash function before applying Filter, and the resulted accumulator was NULL. By Debugging, I could see the accumulator in the RunTime Map. However, by retrieving the accumulator from the JobExecutionResult object, it was NULL. The line caused the problem is file.partitionByHash(1).filter(new EmptyFieldFilter()) instead of file.filter(new EmptyFieldFilter()) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1953) Rework Checkpoint Coordinator
[ https://issues.apache.org/jira/browse/FLINK-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1953. --- Rework Checkpoint Coordinator - Key: FLINK-1953 URL: https://issues.apache.org/jira/browse/FLINK-1953 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The checkpoint coordinator currently contains no tests and is vulnerable to a variety of situations. In particular, I propose to add: - Better configurability of which tasks receive the trigger checkpoint messages, which tasks need to acknowledge the checkpoint, and which tasks need to receive confirmation messages. - checkpoint timeouts, such that incomplete checkpoints are guaranteed to be cleaned up after a while, regardless of successful checkpoints - better sanity checking of messages and fields, to properly handle/ignore messages for old/expired checkpoints, or invalidly routed messages - Better handling of checkpoint attempts at points where the execution has just failed or is currently being canceled. - Add a good set of tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1953) Rework Checkpoint Coordinator
[ https://issues.apache.org/jira/browse/FLINK-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1953. - Resolution: Implemented Implemented in 9b7f8aa121e4a231632296d0809029aca9ebde6a Rework Checkpoint Coordinator - Key: FLINK-1953 URL: https://issues.apache.org/jira/browse/FLINK-1953 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The checkpoint coordinator currently contains no tests and is vulnerable to a variety of situations. In particular, I propose to add: - Better configurability of which tasks receive the trigger checkpoint messages, which tasks need to acknowledge the checkpoint, and which tasks need to receive confirmation messages. - checkpoint timeouts, such that incomplete checkpoints are guaranteed to be cleaned up after a while, regardless of successful checkpoints - better sanity checking of messages and fields, to properly handle/ignore messages for old/expired checkpoints, or invalidly routed messages - Better handling of checkpoint attempts at points where the execution has just failed or is currently being canceled. - Add a good set of tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1973) JobManager log does not contain state change messages on INFO level
[ https://issues.apache.org/jira/browse/FLINK-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1973. - Resolution: Fixed Fixed in ededb6b74a2d388ff1c54ddf0f2733c65675c9a0 JobManager log does not contain state change messages on INFO level --- Key: FLINK-1973 URL: https://issues.apache.org/jira/browse/FLINK-1973 Project: Flink Issue Type: Improvement Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The JobManager log has only deployment messages and job state change messages on INFO log level. Messages about task vertices are only logged on DEBUG level, which makes debugging harder, as by default this info is now only available in the TaskManager logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1973) JobManager log does not contain state change messages on INFO level
[ https://issues.apache.org/jira/browse/FLINK-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1973. --- JobManager log does not contain state change messages on INFO level --- Key: FLINK-1973 URL: https://issues.apache.org/jira/browse/FLINK-1973 Project: Flink Issue Type: Improvement Components: JobManager Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The JobManager log contains only deployment messages and job state change messages on INFO log level. Messages about task vertices are only logged on DEBUG level, which makes debugging harder, as by default this info is only available in the TaskManager logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1935) Reimplement PersistentKafkaSource using high level Kafka API
[ https://issues.apache.org/jira/browse/FLINK-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1935. - Resolution: Implemented Implemented and merged in 54e957614c38fed69baf726fc86059e9b11384cb Reimplement PersistentKafkaSource using high level Kafka API Key: FLINK-1935 URL: https://issues.apache.org/jira/browse/FLINK-1935 Project: Flink Issue Type: Improvement Components: Kafka Connector, Streaming Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.9 The current PersistentKafkaSource in Flink has some limitations that I seek to overcome by reimplementing it using Kafka's high level API (and manually committing the offsets to ZK). This approach only works when the offsets are committed to ZK directly. The current PersistentKafkaSource does not integrate with existing Kafka tools (for example for monitoring the lag). All the communication with Zookeeper is implemented manually in our current code. This is prone to errors and inefficiencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1935) Reimplement PersistentKafkaSource using high level Kafka API
[ https://issues.apache.org/jira/browse/FLINK-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1935. --- Reimplement PersistentKafkaSource using high level Kafka API Key: FLINK-1935 URL: https://issues.apache.org/jira/browse/FLINK-1935 Project: Flink Issue Type: Improvement Components: Kafka Connector, Streaming Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.9 The current PersistentKafkaSource in Flink has some limitations that I seek to overcome by reimplementing it using Kafka's high level API (and manually committing the offsets to ZK). This approach only works when the offsets are committed to ZK directly. The current PersistentKafkaSource does not integrate with existing Kafka tools (for example for monitoring the lag). All the communication with Zookeeper is implemented manually in our current code. This is prone to errors and inefficiencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1959) Accumulators BROKEN after Partitioning
[ https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1959. - Resolution: Pending Closed Fix Version/s: (was: master) 0.9 Assignee: Stephan Ewen Fixed in 73493335f4dbecbb4f1f9f954b08534a5e35ca90. Thank you for the debugging help! Accumulators BROKEN after Partitioning -- Key: FLINK-1959 URL: https://issues.apache.org/jira/browse/FLINK-1959 Project: Flink Issue Type: Bug Components: Examples Affects Versions: master Reporter: mustafa elbehery Assignee: Stephan Ewen Priority: Critical Fix For: 0.9 While running the Accumulator example in https://github.com/Elbehery/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/EmptyFieldsCountAccumulator.java, I tried to alter the data flow with a PartitionByHash function before applying Filter, and the resulting accumulator was NULL. By debugging, I could see the accumulator in the runtime map. However, when retrieving the accumulator from the JobExecutionResult object, it was NULL. The line causing the problem is file.partitionByHash(1).filter(new EmptyFieldFilter()) instead of file.filter(new EmptyFieldFilter()) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1959) Accumulators BROKEN after Partitioning
[ https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1959. --- Accumulators BROKEN after Partitioning -- Key: FLINK-1959 URL: https://issues.apache.org/jira/browse/FLINK-1959 Project: Flink Issue Type: Bug Components: Examples Affects Versions: master Reporter: mustafa elbehery Assignee: Stephan Ewen Priority: Critical Fix For: 0.9 While running the Accumulator example in https://github.com/Elbehery/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/EmptyFieldsCountAccumulator.java, I tried to alter the data flow with a PartitionByHash function before applying Filter, and the resulting accumulator was NULL. By debugging, I could see the accumulator in the runtime map. However, when retrieving the accumulator from the JobExecutionResult object, it was NULL. The line causing the problem is file.partitionByHash(1).filter(new EmptyFieldFilter()) instead of file.filter(new EmptyFieldFilter()) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2005) Remove dependencies on Record APIs for flink-jdbc module
[ https://issues.apache.org/jira/browse/FLINK-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14540955#comment-14540955 ] Stephan Ewen commented on FLINK-2005: - I think those can be simply removed. The regular API fully covers what we need. Remove dependencies on Record APIs for flink-jdbc module Key: FLINK-2005 URL: https://issues.apache.org/jira/browse/FLINK-2005 Project: Flink Issue Type: Sub-task Reporter: Henry Saputra Need to remove dependency on old Record API in the flink-jdbc module. Hopefully we could just move them to use common operators APIs instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-2041) Optimizer plans with memory for pipeline breakers, even though they are not used any more
Stephan Ewen created FLINK-2041: --- Summary: Optimizer plans with memory for pipeline breakers, even though they are not used any more Key: FLINK-2041 URL: https://issues.apache.org/jira/browse/FLINK-2041 Project: Flink Issue Type: Bug Components: Optimizer Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-1659) Rename classes and packages that contains Pact
[ https://issues.apache.org/jira/browse/FLINK-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1659. --- Rename classes and packages that contains Pact -- Key: FLINK-1659 URL: https://issues.apache.org/jira/browse/FLINK-1659 Project: Flink Issue Type: Task Reporter: Henry Saputra Assignee: Stephan Ewen Priority: Minor Fix For: 0.9 We have several class names that contain or start with Pact. Pact is the previous term for Flink's data model and user-defined functions/operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1659) Rename classes and packages that contains Pact
[ https://issues.apache.org/jira/browse/FLINK-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1659. - Resolution: Fixed Fix Version/s: 0.9 Solved and subsumed by FLINK-1611 Rename classes and packages that contains Pact -- Key: FLINK-1659 URL: https://issues.apache.org/jira/browse/FLINK-1659 Project: Flink Issue Type: Task Reporter: Henry Saputra Assignee: Stephan Ewen Priority: Minor Fix For: 0.9 We have several class names that contain or start with Pact. Pact is the previous term for Flink's data model and user-defined functions/operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
[ https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547967#comment-14547967 ] Stephan Ewen commented on FLINK-2006: - You are right, the test setup was not respecting the rule that the sender must be running before the receiver. I fixed it; it should be good now. TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED - Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
[ https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2006. --- TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED - Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
[ https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2006. - Resolution: Fixed Fixed (again, hopefully correctly this time) in a6f6a6777860a1b66840aafc53c28fc9c9bf9eef TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED - Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2027) Flink website does not provide link to source repo
[ https://issues.apache.org/jira/browse/FLINK-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548005#comment-14548005 ] Stephan Ewen commented on FLINK-2027: - Agreed, the canonical source repo should definitely be linked. We encourage making contributions through pull requests, so most contributors interact more with the GitHub mirror than with the original repo. That is why we placed it rather prominently on the previous website. Flink website does not provide link to source repo -- Key: FLINK-2027 URL: https://issues.apache.org/jira/browse/FLINK-2027 Project: Flink Issue Type: Bug Reporter: Sebb Priority: Critical As the subject says - I could not find a link to the source repo anywhere obvious on the website -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2035) Update 0.9 roadmap with ML issues
[ https://issues.apache.org/jira/browse/FLINK-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547971#comment-14547971 ] Stephan Ewen commented on FLINK-2035: - Can you also put the issue into this doc, for release 0.9 tracking? https://docs.google.com/document/d/1VOqEpHFWSHyQ1zIVDtYKNBC-iT3r4jQc8COTYcO2Q30 Update 0.9 roadmap with ML issues - Key: FLINK-2035 URL: https://issues.apache.org/jira/browse/FLINK-2035 Project: Flink Issue Type: Improvement Components: Machine Learning Library Reporter: Theodore Vasiloudis The [current list|https://issues.apache.org/jira/browse/FLINK-2001?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%200.9%20AND%20component%20%3D%20%22Machine%20Learning%20Library%22] of issues linked with the 0.9 release is quite limited. We should go through the current ML issues and assign fix versions, so that we have a clear view of what we expect to have in 0.9. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2029) DOAP has disappeared
[ https://issues.apache.org/jira/browse/FLINK-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547953#comment-14547953 ] Stephan Ewen commented on FLINK-2029: - I updated the entry in http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/files.xml DOAP has disappeared Key: FLINK-2029 URL: https://issues.apache.org/jira/browse/FLINK-2029 Project: Flink Issue Type: Bug Reporter: Sebb Assignee: Ufuk Celebi The DOAP was previously published at http://flink.apache.org/doap_flink.rdf but this URL is now giving a 404. As such, the projects data file: http://svn.apache.org/repos/asf/infrastructure/site-tools/trunk/projects/files.xml has been amended to exclude the DOAP. Please reinstate the DOAP and then reinstate the entry in files.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2027) Flink website does not provide link to source repo
[ https://issues.apache.org/jira/browse/FLINK-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547721#comment-14547721 ] Stephan Ewen commented on FLINK-2027: - I liked that the link was fairly prominent before. A lot of activity is reflected on GitHub, like pull requests Flink website does not provide link to source repo -- Key: FLINK-2027 URL: https://issues.apache.org/jira/browse/FLINK-2027 Project: Flink Issue Type: Bug Reporter: Sebb Priority: Critical As the subject says - I could not find a link to the source repo anywhere obvious on the website -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2020) Table API JoinITCase fails non-deterministically
[ https://issues.apache.org/jira/browse/FLINK-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547726#comment-14547726 ] Stephan Ewen commented on FLINK-2020: - I think we can lazily migrate the tests. I did this for a few tests in the past, whenever I encountered an issue and touched them anyways. Table API JoinITCase fails non-deterministically Key: FLINK-2020 URL: https://issues.apache.org/jira/browse/FLINK-2020 Project: Flink Issue Type: Bug Components: Java API, Table API Affects Versions: 0.9 Reporter: Fabian Hueske I observed the following failing test case in one of my Travis builds. {code} Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.526 sec FAILURE! - in org.apache.flink.api.scala.table.test.JoinITCase testJoinWithMultipleKeys[Execution mode = CLUSTER](org.apache.flink.api.scala.table.test.JoinITCase) Time elapsed: 0.265 sec FAILURE! java.lang.AssertionError: Different number of lines in expected and obtained result. expected:6 but was:5 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:258) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:244) at org.apache.flink.api.scala.table.test.JoinITCase.after(JoinITCase.scala:49) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2020) Table API JoinITCase fails non-deterministically
[ https://issues.apache.org/jira/browse/FLINK-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547670#comment-14547670 ] Stephan Ewen commented on FLINK-2020: - This may be a sporadic temp file failure. I have seen that before also for other tests. What I do not understand is why people still write tests with temp files. I think all new tests should use {{collect()}}. Table API JoinITCase fails non-deterministically Key: FLINK-2020 URL: https://issues.apache.org/jira/browse/FLINK-2020 Project: Flink Issue Type: Bug Components: Java API, Table API Affects Versions: 0.9 Reporter: Fabian Hueske I observed the following failing test case in one of my Travis builds. {code} Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.526 sec FAILURE! - in org.apache.flink.api.scala.table.test.JoinITCase testJoinWithMultipleKeys[Execution mode = CLUSTER](org.apache.flink.api.scala.table.test.JoinITCase) Time elapsed: 0.265 sec FAILURE! java.lang.AssertionError: Different number of lines in expected and obtained result. expected:6 but was:5 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:258) at org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:244) at org.apache.flink.api.scala.table.test.JoinITCase.after(JoinITCase.scala:49) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2025) Support booleans in CSV reader
[ https://issues.apache.org/jira/browse/FLINK-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547674#comment-14547674 ] Stephan Ewen commented on FLINK-2025: - Oh, I was always under the impression that it supports booleans. Can you not use {{Boolean.class}} in the {{types(...)}} method? Support booleans in CSV reader -- Key: FLINK-2025 URL: https://issues.apache.org/jira/browse/FLINK-2025 Project: Flink Issue Type: New Feature Components: Core Reporter: Sebastian Schelter It would be great if Flink allowed to read booleans from CSV files, e.g. 1 for true and 0 for false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2025) Support booleans in CSV reader
[ https://issues.apache.org/jira/browse/FLINK-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547675#comment-14547675 ] Stephan Ewen commented on FLINK-2025: - I think it parses similar to {{Boolean.parseBoolean()}}. Support booleans in CSV reader -- Key: FLINK-2025 URL: https://issues.apache.org/jira/browse/FLINK-2025 Project: Flink Issue Type: New Feature Components: Core Reporter: Sebastian Schelter It would be great if Flink allowed to read booleans from CSV files, e.g. 1 for true and 0 for false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2025) Support booleans in CSV reader
[ https://issues.apache.org/jira/browse/FLINK-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547715#comment-14547715 ] Stephan Ewen commented on FLINK-2025: - Just checked, you are right. Do you want to implement one? Should be fairly straightforward: - Implement a boolean parser (see for example IntParser) - Register the parser in the class {{FieldParser}} Support booleans in CSV reader -- Key: FLINK-2025 URL: https://issues.apache.org/jira/browse/FLINK-2025 Project: Flink Issue Type: New Feature Components: Core Reporter: Sebastian Schelter It would be great if Flink allowed to read booleans from CSV files, e.g. 1 for true and 0 for false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
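The two steps above could look roughly like this. This is a standalone sketch, not Flink's actual {{FieldParser}} API: the class name {{BooleanFieldParser}} and the {{parseField}} signature are hypothetical, and it accepts "true"/"false" as well as the "1"/"0" encoding requested in the issue.

```java
// Standalone sketch of the parsing step a boolean parser would need.
// Flink's real FieldParser subclasses have a different contract; this only
// illustrates the byte-level matching over a field of a CSV record.
class BooleanFieldParser {

    // Parses "true"/"false" (case-insensitive) and "1"/"0" from the byte range
    // [start, limit); returns null on a parse error.
    static Boolean parseField(byte[] bytes, int start, int limit) {
        String s = new String(bytes, start, limit - start,
                java.nio.charset.StandardCharsets.US_ASCII).trim();
        if (s.equalsIgnoreCase("true") || s.equals("1")) {
            return Boolean.TRUE;
        }
        if (s.equalsIgnoreCase("false") || s.equals("0")) {
            return Boolean.FALSE;
        }
        return null; // the real parser would record an error state instead
    }
}
```

In Flink the second step would be registering such a parser for {{Boolean.class}} so that {{types(...)}} can resolve it.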
[jira] [Closed] (FLINK-2001) DistanceMetric cannot be serialized
[ https://issues.apache.org/jira/browse/FLINK-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2001. --- DistanceMetric cannot be serialized --- Key: FLINK-2001 URL: https://issues.apache.org/jira/browse/FLINK-2001 Project: Flink Issue Type: Bug Components: Machine Learning Library Reporter: Chiwan Park Assignee: Chiwan Park Priority: Critical Labels: ML Fix For: 0.9 Because DistanceMeasure trait doesn't extend Serializable, The task using DistanceMeasure raises a following exception. {code} Task not serializable org.apache.flink.api.common.InvalidProgramException: Task not serializable at org.apache.flink.api.scala.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:179) at org.apache.flink.api.scala.ClosureCleaner$.clean(ClosureCleaner.scala:171) at org.apache.flink.api.scala.DataSet.clean(DataSet.scala:123) at org.apache.flink.api.scala.DataSet$$anon$10.init(DataSet.scala:691) at org.apache.flink.api.scala.DataSet.combineGroup(DataSet.scala:690) at org.apache.flink.ml.classification.KNNModel.transform(KNN.scala:78) at org.apache.flink.ml.classification.KNNITSuite$$anonfun$1.apply$mcV$sp(KNNSuite.scala:25) at org.apache.flink.ml.classification.KNNITSuite$$anonfun$1.apply(KNNSuite.scala:12) at org.apache.flink.ml.classification.KNNITSuite$$anonfun$1.apply(KNNSuite.scala:12) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FlatSpecLike$$anon$1.apply(FlatSpecLike.scala:1647) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FlatSpec.withFixture(FlatSpec.scala:1683) at org.scalatest.FlatSpecLike$class.invokeWithFixture$1(FlatSpecLike.scala:1644) at 
org.scalatest.FlatSpecLike$$anonfun$runTest$1.apply(FlatSpecLike.scala:1656) at org.scalatest.FlatSpecLike$$anonfun$runTest$1.apply(FlatSpecLike.scala:1656) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FlatSpecLike$class.runTest(FlatSpecLike.scala:1656) at org.apache.flink.ml.classification.KNNITSuite.org$scalatest$BeforeAndAfter$$super$runTest(KNNSuite.scala:9) at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200) at org.apache.flink.ml.classification.KNNITSuite.runTest(KNNSuite.scala:9) at org.scalatest.FlatSpecLike$$anonfun$runTests$1.apply(FlatSpecLike.scala:1714) at org.scalatest.FlatSpecLike$$anonfun$runTests$1.apply(FlatSpecLike.scala:1714) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:390) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:427) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FlatSpecLike$class.runTests(FlatSpecLike.scala:1714) at org.scalatest.FlatSpec.runTests(FlatSpec.scala:1683) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FlatSpec.org$scalatest$FlatSpecLike$$super$run(FlatSpec.scala:1683) at org.scalatest.FlatSpecLike$$anonfun$run$1.apply(FlatSpecLike.scala:1760) at org.scalatest.FlatSpecLike$$anonfun$run$1.apply(FlatSpecLike.scala:1760) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) 
at org.scalatest.FlatSpecLike$class.run(FlatSpecLike.scala:1760) at org.apache.flink.ml.classification.KNNITSuite.org$scalatest$BeforeAndAfter$$super$run(KNNSuite.scala:9) at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241) at org.apache.flink.ml.classification.KNNITSuite.run(KNNSuite.scala:9) at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:55) at
[jira] [Closed] (FLINK-785) Add Chained operators for AllReduce and AllGroupReduce
[ https://issues.apache.org/jira/browse/FLINK-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-785. -- Add Chained operators for AllReduce and AllGroupReduce -- Key: FLINK-785 URL: https://issues.apache.org/jira/browse/FLINK-785 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chesnay Schepler Labels: github-import Fix For: 0.9 Because the operators `AllReduce` and `AllGroupReduce` are used both for the pre-reduce (combiner side) and the final reduce, they would greatly benefit from a chained version. Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/785 Created by: [StephanEwen|https://github.com/StephanEwen] Labels: runtime, Milestone: Release 0.6 (unplanned) Created at: Sun May 11 17:41:12 CEST 2014 State: open -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1987) Broken links in the add_operator section of the documentation
[ https://issues.apache.org/jira/browse/FLINK-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1987. - Resolution: Pending Closed Fix Version/s: 0.9 Fixed and merged in cafb8769a22e21c1c6fe045670ed968bb1293f77 Thank you for the patch! Broken links in the add_operator section of the documentation - Key: FLINK-1987 URL: https://issues.apache.org/jira/browse/FLINK-1987 Project: Flink Issue Type: Bug Components: docs Affects Versions: 0.9 Reporter: Andra Lungu Assignee: Andra Lungu Priority: Trivial Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
[ https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2006. --- TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED - Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2011) Improve error message when user-defined serialization logic is wrong
[ https://issues.apache.org/jira/browse/FLINK-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2011. --- Improve error message when user-defined serialization logic is wrong Key: FLINK-2011 URL: https://issues.apache.org/jira/browse/FLINK-2011 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-2011) Improve error message when user-defined serialization logic is wrong
[ https://issues.apache.org/jira/browse/FLINK-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2011. - Resolution: Pending Closed Fixed in 113b20b7f8717b12c5f0dfa691da582d426fbae0 Improve error message when user-defined serialization logic is wrong Key: FLINK-2011 URL: https://issues.apache.org/jira/browse/FLINK-2011 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
[ https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2006. - Resolution: Pending Closed Fix Version/s: 0.9 Assignee: Stephan Ewen Fixed in 9da2f1f725473f1242067acb546607888e3a5015 TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED - Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2041) Optimizer plans with memory for pipeline breakers, even though they are not used any more
[ https://issues.apache.org/jira/browse/FLINK-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2041. --- Optimizer plans with memory for pipeline breakers, even though they are not used any more - Key: FLINK-2041 URL: https://issues.apache.org/jira/browse/FLINK-2041 Project: Flink Issue Type: Bug Components: Optimizer Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-2041) Optimizer plans with memory for pipeline breakers, even though they are not used any more
[ https://issues.apache.org/jira/browse/FLINK-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2041. - Resolution: Fixed Fixed in c12403e20547bcdc2104781e0c160525a48783c8 Optimizer plans with memory for pipeline breakers, even though they are not used any more - Key: FLINK-2041 URL: https://issues.apache.org/jira/browse/FLINK-2041 Project: Flink Issue Type: Bug Components: Optimizer Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2040) TaskTest.testCancelDuringInvoke:434-validateListenerMessage:712 expected:CANCELING but was:CANCELED
[ https://issues.apache.org/jira/browse/FLINK-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550035#comment-14550035 ] Stephan Ewen commented on FLINK-2040: - Funny, I just saw the same thing... TaskTest.testCancelDuringInvoke:434-validateListenerMessage:712 expected:CANCELING but was:CANCELED Key: FLINK-2040 URL: https://issues.apache.org/jira/browse/FLINK-2040 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Looks like an unstable test https://travis-ci.org/rmetzger/flink/jobs/63016938 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2040) TaskTest.testCancelDuringInvoke:434-validateListenerMessage:712 expected:CANCELING but was:CANCELED
[ https://issues.apache.org/jira/browse/FLINK-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550038#comment-14550038 ] Stephan Ewen commented on FLINK-2040: - The cause is the following: it may (very rarely) happen that this race occurs:
- OUTSIDE THREAD: call to cancel()
- OUTSIDE THREAD: atomic state change from running to canceling
- TASK THREAD: finishes, atomic change from canceling to canceled
- TASK THREAD: sends notification that state is canceled
- OUTSIDE THREAD: sends notification that state is canceling
In that case, the CANCELED notification comes before the CANCELING notification. The execution graph state machine accepts this, so it is actually okay for the system if that happens. I'll adjust the tests to tolerate that as well. TaskTest.testCancelDuringInvoke:434-validateListenerMessage:712 expected:CANCELING but was:CANCELED Key: FLINK-2040 URL: https://issues.apache.org/jira/browse/FLINK-2040 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Looks like an unstable test https://travis-ci.org/rmetzger/flink/jobs/63016938 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
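The interleaving described above can be replayed sequentially with an atomic state machine. The class and method names here are illustrative only, not Flink's actual Task code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative replay of the rare interleaving: the task thread's CANCELED
// notification overtakes the outside thread's CANCELING notification.
class CancelRace {
    enum State { RUNNING, CANCELING, CANCELED }

    static List<State> replay() {
        AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
        List<State> notifications = new ArrayList<>();

        // OUTSIDE THREAD: cancel() performs the atomic RUNNING -> CANCELING change ...
        state.compareAndSet(State.RUNNING, State.CANCELING);
        // TASK THREAD: finishes first, flips CANCELING -> CANCELED, and notifies.
        state.compareAndSet(State.CANCELING, State.CANCELED);
        notifications.add(State.CANCELED);
        // OUTSIDE THREAD: ... and only now sends its CANCELING notification.
        notifications.add(State.CANCELING);
        return notifications;
    }
}
```

Since both observers see a consistent final state of CANCELED, a test only has to accept either notification order, which is the fix described above.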
[jira] [Commented] (FLINK-2042) Increase font size for stack figure on website home
[ https://issues.apache.org/jira/browse/FLINK-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550058#comment-14550058 ] Stephan Ewen commented on FLINK-2042: - I would like to increase the entire figure, actually, not just the font size... Increase font size for stack figure on website home --- Key: FLINK-2042 URL: https://issues.apache.org/jira/browse/FLINK-2042 Project: Flink Issue Type: Improvement Components: Project Website, website Reporter: Fabian Hueske Priority: Minor Labels: starter The font of the stack figure on the website home is quite small and could be increased. The image is also a bit blurred. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2026) Error message in count() only jobs
[ https://issues.apache.org/jira/browse/FLINK-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550056#comment-14550056 ] Stephan Ewen commented on FLINK-2026: - +1 for a good error message. I agree with Fabian, people would forget the execute() call right away otherwise... Error message in count() only jobs -- Key: FLINK-2026 URL: https://issues.apache.org/jira/browse/FLINK-2026 Project: Flink Issue Type: Bug Components: Core Reporter: Sebastian Schelter Assignee: Maximilian Michels Priority: Minor If I run a job that only calls count() on a dataset (which is a valid data flow IMHO), Flink executes the job but complains that no sinks are defined. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
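A hypothetical sketch (these are not Flink's actual classes) of the situation under discussion: in a lazy dataflow API, execute() rejects plans without sinks, so a count()-only program is only valid if count() itself acts as an implicit sink and triggers execution.

```java
import java.util.Arrays;
import java.util.List;

public class CountOnlySketch {
    static class Plan {
        final List<Integer> data;
        boolean hasSink = false;

        Plan(List<Integer> data) { this.data = data; }

        long count() {
            hasSink = true;      // count() acts as an implicit sink...
            return data.size();  // ...and eagerly triggers execution.
        }

        void execute() {
            // Without the implicit sink above, a count()-only program
            // would hit this error.
            if (!hasSink) {
                throw new IllegalStateException("No data sinks defined.");
            }
        }
    }

    public static void main(String[] args) {
        Plan plan = new Plan(Arrays.asList(1, 2, 3));
        System.out.println(plan.count()); // no separate execute() needed
    }
}
```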
[jira] [Commented] (FLINK-2038) Extend headline of website
[ https://issues.apache.org/jira/browse/FLINK-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550066#comment-14550066 ] Stephan Ewen commented on FLINK-2038: - How about making the scalable part of the analytics, not the open source ;-) Apache Flink is an open source platform for scalable and unified batch and stream data processing. Otherwise +1 for such a change Extend headline of website -- Key: FLINK-2038 URL: https://issues.apache.org/jira/browse/FLINK-2038 Project: Flink Issue Type: Improvement Components: website Reporter: Fabian Hueske Priority: Minor The current headline of the website is {quote} Apache Flink is an open source platform for unified batch and stream processing. {quote} I think two aspects are missing. 1. Flink is a *data* processing platform 2. Flink is a *parallel* system How about something along the lines of {quote} Apache Flink is a scalable open source platform for unified batch and stream data processing. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2040) TaskTest.testCancelDuringInvoke:434-validateListenerMessage:712 expected:CANCELING but was:CANCELED
[ https://issues.apache.org/jira/browse/FLINK-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2040. --- TaskTest.testCancelDuringInvoke:434-validateListenerMessage:712 expected:CANCELING but was:CANCELED Key: FLINK-2040 URL: https://issues.apache.org/jira/browse/FLINK-2040 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 Looks like an unstable test https://travis-ci.org/rmetzger/flink/jobs/63016938 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1952) Cannot run ConnectedComponents example: Could not allocate a slot on instance
[ https://issues.apache.org/jira/browse/FLINK-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550201#comment-14550201 ] Stephan Ewen commented on FLINK-1952: - I cannot reproduce this error any more with these steps:
- ConnectedComponents (successful)
- KMeans (runs out of buffers)
- ConnectedComponents
The third program now succeeds as expected. I think the reason is that the task lifecycle is more robust now (with my changes from two weeks ago). Still, that does not mean the slot sharing groups are bug free, only that this situation no longer triggers the bug... Cannot run ConnectedComponents example: Could not allocate a slot on instance - Key: FLINK-1952 URL: https://issues.apache.org/jira/browse/FLINK-1952 Project: Flink Issue Type: Bug Components: Scheduler Affects Versions: 0.9 Reporter: Robert Metzger Priority: Blocker
Steps to reproduce:
{code}
./bin/yarn-session.sh -n 350
{code}
... wait until the TaskManagers are connected ...
{code}
Number of connected TaskManagers changed to 266. Slots available: 266
Number of connected TaskManagers changed to 323. Slots available: 323
Number of connected TaskManagers changed to 334. Slots available: 334
Number of connected TaskManagers changed to 343. Slots available: 343
Number of connected TaskManagers changed to 350. Slots available: 350
{code}
Start ConnectedComponents:
{code}
./bin/flink run -p 350 ./examples/flink-java-examples-0.9-SNAPSHOT-ConnectedComponents.jar
{code}
It runs. Run KMeans and let it fail, as expected, with:
{code}
Failed to deploy the task Map (Map at main(KMeans.java:100)) (1/350) - execution #0 to slot SimpleSlot (2)(2)(0) - 182b7661ca9547a84591de940c47a200 - ALLOCATED/ALIVE: java.io.IOException: Insufficient number of network buffers: required 350, but only 254 available. The total number of network buffers is currently set to 2048. You can increase this number by setting the configuration key 'taskmanager.network.numberOfBuffers'.
{code}
(I've waited for 10 minutes between the two submissions.) Starting ConnectedComponents again will now fail:
{code}
./bin/flink run -p 350 ./examples/flink-java-examples-0.9-SNAPSHOT-ConnectedComponents.jar
{code}
Error message(s):
{code}
Caused by: java.lang.IllegalStateException: Could not schedule consumer vertex IterationHead(WorksetIteration (Unnamed Delta Iteration)) (19/350)
	at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:479)
	at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:469)
	at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
	at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
	... 4 more
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate a slot on instance 4a6d761cb084c32310ece1f849556faf @ cloud-19 - 1 slots - URL: akka.tcp://flink@130.149.21.23:51400/user/taskmanager, as required by the co-location constraint.
	at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:247)
	at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:110)
	at org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:262)
	at org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:436)
	at org.apache.flink.runtime.executiongraph.Execution$3.call(Execution.java:475)
	... 9 more
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
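As the "Insufficient number of network buffers" message in the reproduction above suggests, the buffer pool can be enlarged in flink-conf.yaml. The key name comes from the error message itself; the value below is purely illustrative and should be sized to the job's parallelism and the available TaskManager memory:

```yaml
# flink-conf.yaml -- illustrative value, not a recommendation
taskmanager.network.numberOfBuffers: 4096
```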
[jira] [Created] (FLINK-2046) Quickstarts are missing on the new Website
Stephan Ewen created FLINK-2046: --- Summary: Quickstarts are missing on the new Website Key: FLINK-2046 URL: https://issues.apache.org/jira/browse/FLINK-2046 Project: Flink Issue Type: Bug Components: Project Website Reporter: Stephan Ewen -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1890) Add withReadFields or sth. similar to Scala API
[ https://issues.apache.org/jira/browse/FLINK-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499758#comment-14499758 ] Stephan Ewen commented on FLINK-1890: - I think the {{readFields}} are not evaluated by the optimizer or runtime at this point - maybe they never will be. It is probably not urgent to add them to the fluent API at this point. I would vote to postpone this issue. Add withReadFields or sth. similar to Scala API --- Key: FLINK-1890 URL: https://issues.apache.org/jira/browse/FLINK-1890 Project: Flink Issue Type: Wish Components: Java API, Scala API Reporter: Stefan Bunk Priority: Minor In the Scala API, you have the option to declare forwarded fields via the {{withForwardedFields}} method. It would be nice to have something similar for read fields, as otherwise one needs to create a class, which I personally try to avoid for readability. Maybe grouping all annotations in one function and having a first parameter indicate the type of annotation is also an option, if you plan on adding more annotations and want to keep the interface smaller. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1906) Add tip to work around plain Tuple return type of project operator
[ https://issues.apache.org/jira/browse/FLINK-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499655#comment-14499655 ] Stephan Ewen commented on FLINK-1906: - Makes sense. Add tip to work around plain Tuple return type of project operator -- Key: FLINK-1906 URL: https://issues.apache.org/jira/browse/FLINK-1906 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 0.9 Reporter: Fabian Hueske Priority: Minor Labels: starter The Java compiler is not able to infer the return type of the {{project}} operator and defaults to {{Tuple}}. This can cause problems if another operator is immediately called on the result of a {{project}} operator such as:
{code}
DataSet<Tuple5<String, String, String, String, String>> ds = ...;
DataSet<Tuple1<String>> ds2 = ds.project(0).distinct(0);
{code}
This problem can be overcome by hinting the return type of {{project}} like this:
{code}
DataSet<Tuple1<String>> ds2 = ds.<Tuple1<String>>project(0).distinct(0);
{code}
We should add this description to the documentation of the project operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
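The workaround relies on standard Java "type witness" syntax for generic method calls, which can be illustrated with the JDK alone (this is a plain-JDK analogy, not Flink code):

```java
import java.util.Collections;
import java.util.List;

public class TypeWitnessDemo {
    public static void main(String[] args) {
        // Inferred from the assignment target -- fine:
        List<String> a = Collections.emptyList();

        // Explicit type witness between the dot and the method name,
        // the same form as ds.<Tuple1<String>>project(0):
        List<String> b = Collections.<String>emptyList();

        System.out.println(a.equals(b));
    }
}
```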
[jira] [Commented] (FLINK-1868) EnvironmentInformationTest.testJavaMemory fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492076#comment-14492076 ] Stephan Ewen commented on FLINK-1868: - This may happen if, in the course of a GC within the test, the JVM heap size changes. I'll fix it... EnvironmentInformationTest.testJavaMemory fails on Travis - Key: FLINK-1868 URL: https://issues.apache.org/jira/browse/FLINK-1868 Project: Flink Issue Type: Bug Components: Tests Affects Versions: 0.9 Reporter: Robert Metzger Priority: Minor Example: https://travis-ci.org/apache/flink/jobs/57939551 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
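A sketch of why such a heap-size assertion can be flaky: Runtime.totalMemory() reports the JVM's *current* heap, which may grow or shrink around a GC, while maxMemory() is a stable upper bound (this illustrates the general JVM behavior, not the actual test code).

```java
public class HeapSizeSketch {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory();
        System.gc();                     // a GC may resize the heap...
        long after = rt.totalMemory();
        // ...so asserting before == after, or deriving "free" memory from a
        // single totalMemory() snapshot, can fail intermittently. Only the
        // maxMemory() bound is safe to assert against.
        System.out.println(before > 0 && after <= rt.maxMemory());
    }
}
```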