[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328795#comment-14328795 ] ASF GitHub Bot commented on FLINK-1501: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-75218017 This looks great! ^^ Integrate metrics library and report basic metrics to JobManager web interface -- Key: FLINK-1501 URL: https://issues.apache.org/jira/browse/FLINK-1501 Project: Flink Issue Type: Sub-task Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: pre-apache As per mailing list, the library: https://github.com/dropwizard/metrics The goal of this task is to get the basic infrastructure in place. Subsequent issues will integrate more features into the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328808#comment-14328808 ] Vasia Kalavri commented on FLINK-1587: -- Hi, you can easily reproduce this if you have an edge set with an invalid edge ID, as [~andralungu] says. This is a case that shouldn't happen, so I suggest we add a check and throw an exception. This should probably also be taken care of in the other two degree methods. [~andralungu] or [~cebe], would either of you like to take care of this? :) -V. coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt I am receiving the following exception when running a simple job that extracts the outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally; will try that in the next days.
{noformat}
02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED
java.util.NoSuchElementException
	at java.util.Collections$EmptyIterator.next(Collections.java:3006)
	at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665)
	at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130)
	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257)
	at java.lang.Thread.run(Thread.java:745)
02/20/2015 02:27:02: Job execution switched to status FAILING
...
{noformat}
The error occurs in Gelly's Graph.java at this line: https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636 Is there any valid case where a coGroup iterator may be empty? As far as I can see there is a bug somewhere. I'd like to write a test case to reproduce the issue. Where can I put such a test?
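A guard of the kind suggested above (checking the co-group side before calling next()) can be sketched with plain java.util iterators; `CoGroupGuard` and `countOrZero` are hypothetical stand-ins, not Gelly's actual classes:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

public class CoGroupGuard {
    /** Counts the neighbor side of a co-group, tolerating an empty iterator
     *  instead of failing with NoSuchElementException on next(). */
    static long countOrZero(Iterator<Integer> neighbors) {
        long count = 0;
        while (neighbors.hasNext()) { // guard before every next()
            neighbors.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // An invalid edge ID yields an empty vertex side, as in the stack trace above.
        System.out.println(countOrZero(Collections.emptyIterator()));      // 0
        System.out.println(countOrZero(Arrays.asList(1, 2, 3).iterator())); // 3
    }
}
```

Whether an empty side should count as zero or raise a descriptive exception is exactly the design question raised in the comment.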
[GitHub] flink pull request: [FLINK-1515]Splitted runVertexCentricIteration...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75220973 Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): ``` Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout ``` Is this fixed by #422? Shall I proceed? Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1515) [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328814#comment-14328814 ] ASF GitHub Bot commented on FLINK-1515: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75220973 Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): ``` Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout ``` Is this fixed by #422? Shall I proceed? Thanks! [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration --- Key: FLINK-1515 URL: https://issues.apache.org/jira/browse/FLINK-1515 Project: Flink Issue Type: Improvement Components: Gelly Reporter: Vasia Kalavri Assignee: Martin Kiefer Currently, aggregators and broadcast sets cannot be accessed through Gelly's {{runVertexCentricIteration}} method. The functionality is already present in the {{VertexCentricIteration}} and we just need to expose it. This could be done like this: We create a method {{createVertexCentricIteration}}, which will return a {{VertexCentricIteration}} object and we change {{runVertexCentricIteration}} to accept this as a parameter (and return the graph after running this iteration). The user can configure the {{VertexCentricIteration}} by directly calling the public methods {{registerAggregator}}, {{setName}}, etc.
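The two-step pattern the issue proposes (create the iteration object, configure it directly, then run it) can be sketched with stand-in classes; only the method names `createVertexCentricIteration`, `runVertexCentricIteration`, `registerAggregator`, and `setName` come from the issue text — everything else here is illustrative, not Gelly's API:

```java
import java.util.ArrayList;
import java.util.List;

public class IterationConfigSketch {
    static class VertexCentricIteration {
        String name = "unnamed";
        final List<String> aggregators = new ArrayList<>();
        void setName(String n) { name = n; }
        void registerAggregator(String n) { aggregators.add(n); }
    }

    static class Graph {
        // Step 1: hand out the configurable iteration object.
        VertexCentricIteration createVertexCentricIteration() {
            return new VertexCentricIteration();
        }
        // Step 2: accept the configured object, run it, return the result.
        String runVertexCentricIteration(VertexCentricIteration it) {
            return "ran '" + it.name + "' with aggregators " + it.aggregators;
        }
    }

    public static void main(String[] args) {
        Graph g = new Graph();
        VertexCentricIteration it = g.createVertexCentricIteration();
        it.setName("connected components");
        it.registerAggregator("componentCount");
        System.out.println(g.runVertexCentricIteration(it));
        // ran 'connected components' with aggregators [componentCount]
    }
}
```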
[jira] [Created] (FLINK-1588) Load flink configuration also from classloader
Robert Metzger created FLINK-1588: - Summary: Load flink configuration also from classloader Key: FLINK-1588 URL: https://issues.apache.org/jira/browse/FLINK-1588 Project: Flink Issue Type: New Feature Reporter: Robert Metzger The GlobalConfiguration object should also check if it finds the flink-config.yaml in the classpath and load it from there. This allows users to inject configuration files in local standalone or embedded environments.
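The lookup the issue proposes (check the classpath first, then fall back to the usual config directory) can be sketched with the JDK alone; the class and method names are illustrative, and the file name is taken verbatim from the issue text:

```java
import java.io.InputStream;

public class ClasspathConfigLoader {
    // File name as written in the issue.
    static final String CONFIG_NAME = "flink-config.yaml";

    /** Returns the config as a stream if it is on the classpath, else null. */
    static InputStream fromClasspath(ClassLoader cl) {
        return cl.getResourceAsStream(CONFIG_NAME);
    }

    public static void main(String[] args) {
        InputStream in = fromClasspath(ClasspathConfigLoader.class.getClassLoader());
        // A null result means: fall back to the configured directory (e.g. FLINK_CONF_DIR).
        System.out.println(in == null ? "not on classpath, using fallback" : "loaded from classpath");
    }
}
```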
[jira] [Created] (FLINK-1589) Add option to pass Configuration to LocalExecutor
Robert Metzger created FLINK-1589: - Summary: Add option to pass Configuration to LocalExecutor Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[jira] [Assigned] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-1589: - Assignee: Robert Metzger Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/427 [FLINK-1589] Add option to pass configuration to LocalExecutor Please review the changes. I'll add a test case and update the documentation later today. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink1589 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/427.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #427 commit b75b4c285f4810faa5d02d638b61dc7b8e125c8d Author: Robert Metzger rmetz...@apache.org Date: 2015-02-20T11:40:41Z [FLINK-1589] Add option to pass configuration to LocalExecutor
[jira] [Updated] (FLINK-1586) Add support for iteration visualization for Streaming programs
[ https://issues.apache.org/jira/browse/FLINK-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1586: -- Component/s: Streaming Add support for iteration visualization for Streaming programs -- Key: FLINK-1586 URL: https://issues.apache.org/jira/browse/FLINK-1586 Project: Flink Issue Type: Bug Components: Streaming Reporter: Gyula Fora Priority: Minor The plan visualizer currently does not support streaming programs containing iterations. There is no visualization at all due to an exception thrown in the visualizer script.
[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328841#comment-14328841 ] ASF GitHub Bot commented on FLINK-1589: --- GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/427 [FLINK-1589] Add option to pass configuration to LocalExecutor Please review the changes. I'll add a test case and update the documentation later today. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink1589 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/427.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #427 commit b75b4c285f4810faa5d02d638b61dc7b8e125c8d Author: Robert Metzger rmetz...@apache.org Date: 2015-02-20T11:40:41Z [FLINK-1589] Add option to pass configuration to LocalExecutor Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[jira] [Resolved] (FLINK-1584) Spurious failure of TaskManagerFailsITCase
[ https://issues.apache.org/jira/browse/FLINK-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1584. - Resolution: Fixed Fix Version/s: 0.9 Assignee: Till Rohrmann Fixed via 52b40debad6293cc0591eabb847eaac322d174aa Spurious failure of TaskManagerFailsITCase -- Key: FLINK-1584 URL: https://issues.apache.org/jira/browse/FLINK-1584 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 The {{TaskManagerFailsITCase}} fails spuriously on Travis. The reason might be that different test cases try to access the same {{JobManager}}.
[GitHub] flink pull request: [FLINK-1501] Add metrics library for monitorin...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-75209814 Indeed :-) What does the OS load mean? It would be really awesome to show the CPU load, too. I think this is a helpful indicator.
[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328723#comment-14328723 ] ASF GitHub Bot commented on FLINK-1501: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-75209814 Indeed :-) What does the OS load mean? It would be really awesome to show the CPU load, too. I think this is a helpful indicator. Integrate metrics library and report basic metrics to JobManager web interface -- Key: FLINK-1501 URL: https://issues.apache.org/jira/browse/FLINK-1501 Project: Flink Issue Type: Sub-task Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: pre-apache As per mailing list, the library: https://github.com/dropwizard/metrics The goal of this task is to get the basic infrastructure in place. Subsequent issues will integrate more features into the system.
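The "OS load" being asked about likely maps to the JVM-visible system load average; a minimal JDK-only way to read it (no metrics library required) is via the platform MXBean:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class OsLoadProbe {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // 1-minute system load average, or -1.0 if the platform cannot provide it.
        double load = os.getSystemLoadAverage();
        System.out.println("processors=" + os.getAvailableProcessors() + " load=" + load);
    }
}
```

Dividing the load average by the processor count gives a rough per-core utilization figure, which is one way a web interface could present "CPU load".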
[jira] [Resolved] (FLINK-1556) JobClient does not wait until a job failed completely if submission exception
[ https://issues.apache.org/jira/browse/FLINK-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1556. - Resolution: Fixed Fix Version/s: 0.9 Re-fixed in 4ff91cd7cd3e05637d70a56cfc6f5a4ba2f2501a JobClient does not wait until a job failed completely if submission exception - Key: FLINK-1556 URL: https://issues.apache.org/jira/browse/FLINK-1556 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 If an exception occurs during job submission, the {{JobClient}} receives a {{SubmissionFailure}}. Upon receiving this message, the {{JobClient}} terminates itself and returns the error to the {{Client}}. This indicates to the user that the job has completely failed, which is not necessarily true. If the user submits another job directly after such a failure, it might be the case that not all slots of the formerly failed job have been returned. This can lead to a {{NoRessourceAvailableException}}. We can solve this problem by waiting for the completion of the job failure in the {{JobClient}}.
[jira] [Resolved] (FLINK-1451) Enable parallel execution of streaming file sources
[ https://issues.apache.org/jira/browse/FLINK-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora resolved FLINK-1451. --- Resolution: Fixed Assignee: Gyula Fora (was: Márton Balassi) https://github.com/apache/flink/commit/c56e3f10b27e1e5be38b8a731f330891b190a268 Enable parallel execution of streaming file sources --- Key: FLINK-1451 URL: https://issues.apache.org/jira/browse/FLINK-1451 Project: Flink Issue Type: Task Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Gyula Fora Currently the streaming FileSourceFunction does not support parallelism greater than 1 as it does not implement the ParallelSourceFunction interface. Usage with distributed filesystems should be checked. In relation to this issue, a possible parallel solution for the FileMonitoringFunction should also be considered.
[jira] [Commented] (FLINK-1421) Implement a SAMOA Adapter for Flink Streaming
[ https://issues.apache.org/jira/browse/FLINK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328882#comment-14328882 ] Paris Carbone commented on FLINK-1421: -- Thanks for all the great feedback everyone! I believe the adapter is ready for a PR! :) The current Flink-Samoa adapter runs well for all existing Tasks in local and remote mode. The only external requirement, at least from the samoa executable script, is to have the Flink CLI and its dependencies installed and exported into a FLINK_HOME variable. [~StephanEwen] We should address the proper serialisation in the next patch. [~azaroth] We have rebased our current branch [1] with the incubator-samoa repository master and we also created a JIRA for the PR [2]. Do you think it is ok to do the PR now, or should we wait for some upcoming refactoring that affects the adapters? [1] https://github.com/senorcarbone/samoa/tree/flink-integration [2] https://issues.apache.org/jira/browse/SAMOA-16 Implement a SAMOA Adapter for Flink Streaming - Key: FLINK-1421 URL: https://issues.apache.org/jira/browse/FLINK-1421 Project: Flink Issue Type: New Feature Components: Streaming Reporter: Paris Carbone Assignee: Paris Carbone Original Estimate: 336h Remaining Estimate: 336h Yahoo's Samoa is an experimental incremental machine learning library that builds on an abstract compositional data streaming model to write streaming algorithms. The task is to provide an adapter from SAMOA topologies to Flink-streaming job graphs in order to support Flink as a backend engine for SAMOA tasks. A startup guide can be viewed here: https://docs.google.com/document/d/18glDJDYmnFGT1UGtZimaxZpGeeg1Ch14NgDoymhPk2A/pub The main working branch of the adapter: https://github.com/senorcarbone/samoa/tree/flink-integration
[jira] [Assigned] (FLINK-1555) Add utility to log the serializers of composite types
[ https://issues.apache.org/jira/browse/FLINK-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-1555: - Assignee: Robert Metzger Add utility to log the serializers of composite types - Key: FLINK-1555 URL: https://issues.apache.org/jira/browse/FLINK-1555 Project: Flink Issue Type: Improvement Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Users affected by poor performance might want to understand how Flink is serializing their data. Therefore, it would be cool to have a utility which logs the serializers, like this: {{SerializerUtils.getSerializers(TypeInformation<POJO> t);}} to get
{code}
PojoSerializer
 TupleSerializer
 IntSer
 DateSer
 GenericTypeSer(java.sql.Date)
 PojoSerializer
 GenericTypeSer(HashMap)
{code}
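A toy analogue of the proposed utility can be built with plain reflection: walk a POJO's fields recursively and record a nested type description, the way a PojoSerializer nests field serializers. All names here are illustrative; this is not Flink's serializer machinery:

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class TypeDescriber {
    /** Appends one indented line per type, recursing into user-defined types. */
    static void describe(Class<?> clazz, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) line.append("  ");
        out.add(line.append(clazz.getSimpleName()).toString());
        // Stop at primitives and JDK types, mirroring GenericTypeSer leaves.
        if (clazz.isPrimitive() || clazz.getName().startsWith("java.")) {
            return;
        }
        // Field order usually follows declaration order but is not guaranteed.
        for (Field f : clazz.getDeclaredFields()) {
            describe(f.getType(), depth + 1, out);
        }
    }

    static class Inner { int id; java.util.Date created; }
    static class Outer { Inner inner; String name; }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        describe(Outer.class, 0, out);
        out.forEach(System.out::println);
    }
}
```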
[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328963#comment-14328963 ] ASF GitHub Bot commented on FLINK-1589: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/427#issuecomment-75244751 I've added documentation and tests to the change. Let's see if Travis gives us a green light. Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files.
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/427#issuecomment-75244751 I've added documentation and tests to the change. Let's see if Travis gives us a green light.
[jira] [Created] (FLINK-1590) Log environment information also in YARN mode
Robert Metzger created FLINK-1590: - Summary: Log environment information also in YARN mode Key: FLINK-1590 URL: https://issues.apache.org/jira/browse/FLINK-1590 Project: Flink Issue Type: Improvement Reporter: Robert Metzger Priority: Minor
[jira] [Created] (FLINK-1591) Remove window merge before flatten as an optimization
Gyula Fora created FLINK-1591: - Summary: Remove window merge before flatten as an optimization Key: FLINK-1591 URL: https://issues.apache.org/jira/browse/FLINK-1591 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora After a Window Reduce or Map transformation there is always a merge step when the transformation is parallel or grouped. This merge step should be removed when the windowing operator is followed by flatten, to avoid unnecessary bottlenecks in the program. This feature should be added as an optimization step to the WindowingOptimizer class.
[jira] [Commented] (FLINK-1514) [Gelly] Add a Gather-Sum-Apply iteration method
[ https://issues.apache.org/jira/browse/FLINK-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328998#comment-14328998 ] ASF GitHub Bot commented on FLINK-1514: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/408#issuecomment-75249793 Hi @balidani! Thanks a lot for this PR! Gather-Sum-Apply will be an awesome addition to Gelly ^^ Here come my comments:
- There's no need for Gather, Sum and Apply functions to implement MapFunction, FlatJoinFunction, etc., since they are wrapped inside those in the GatherSumApplyIteration class. Actually, I would use the Rich* versions instead, so that we can have access to open() and close() methods. You can look at how `VertexCentricIteration` wraps the `VertexUpdateFunction` inside a `RichCoGroupFunction`.
- With this small change above, we could also allow access to aggregators and broadcast sets. This must be straight-forward to add (again, look at `VertexCentricIteration` for hints). We should also add `getName()`, `setName()`, `getParallelism()`, `setParallelism()` methods to `GatherSumApplyIteration`.
- Finally, it'd be great if you could add the tests you have as examples, i.e. one for Greedy Graph Coloring and one for GSAShortestPaths.
Let me know if you have any doubts! Thanks again :sunny:
[Gelly] Add a Gather-Sum-Apply iteration method --- Key: FLINK-1514 URL: https://issues.apache.org/jira/browse/FLINK-1514 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Daniel Bali This will be a method that implements the GAS computation model, but without the scatter step. The phases can be mapped into the following steps inside a delta iteration: gather: a map on each <srcVertex, edge, trgVertex> triple that produces a partial value; sum: a reduce that combines the partial values; apply: a join with the vertex set to update the vertex values using the results of sum and the previous state.
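The gather/sum/apply phases listed in the issue can be sketched for single-source shortest paths with the JDK alone; this toy superstep is illustrative and is not Gelly's API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class GsaSketch {
    /** One GSA superstep over edges {src, dst, weight} and a distance map. */
    static Map<Integer, Integer> superstep(Map<Integer, Integer> dist, int[][] edges) {
        // gather: per (srcVertex, edge, trgVertex) triple, a partial value;
        // sum: reduce the partial values per target vertex with min.
        Map<Integer, Integer> summed = new HashMap<>();
        for (int[] e : edges) {
            Integer d = dist.get(e[0]);
            if (d != null) summed.merge(e[1], d + e[2], Math::min);
        }
        // apply: join with the vertex set, keeping any improvement.
        Map<Integer, Integer> next = new HashMap<>(dist);
        summed.forEach((v, d) -> next.merge(v, d, Math::min));
        return next;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> dist = new HashMap<>();
        dist.put(1, 0); // source vertex
        int[][] edges = {{1, 2, 5}, {1, 3, 2}, {3, 2, 1}};
        System.out.println(new TreeMap<>(superstep(dist, edges))); // {1=0, 2=5, 3=2}
    }
}
```

Running the superstep until no distance changes would complete the delta-iteration picture sketched in the issue.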
[jira] [Created] (FLINK-1592) Refactor StreamGraph to store vertex IDs as Integers instead of Strings
Gyula Fora created FLINK-1592: - Summary: Refactor StreamGraph to store vertex IDs as Integers instead of Strings Key: FLINK-1592 URL: https://issues.apache.org/jira/browse/FLINK-1592 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Priority: Minor The vertex IDs are currently stored as Strings reflecting some deprecated usage. It should be refactored to use Integers instead.
[GitHub] flink pull request: [FLINK-1444][api-extending] Add support for sp...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/379
[jira] [Updated] (FLINK-1592) Refactor StreamGraph to store vertex IDs as Integers instead of Strings
[ https://issues.apache.org/jira/browse/FLINK-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1592: -- Labels: starter (was: ) Refactor StreamGraph to store vertex IDs as Integers instead of Strings --- Key: FLINK-1592 URL: https://issues.apache.org/jira/browse/FLINK-1592 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Priority: Minor Labels: starter The vertex IDs are currently stored as Strings reflecting some deprecated usage. It should be refactored to use Integers instead.
[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/428#issuecomment-75322938 We have rebased and tried it; it works fine for us, thank you. I will add several minor changes afterwards, but that's just for our convenience. So from my side: +1
[GitHub] flink pull request: [FLINK-1582][streaming]Allow SocketStream to r...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/424#issuecomment-75330500 I was thinking of a limit as high as a minute. But after reconsidering it, I am for your argument: trying to establish the connection every couple of seconds should not be a big overhead, so my whole idea might be overkill. Let us have your version for now and generalize it on request.
[jira] [Commented] (FLINK-1582) SocketStream gets stuck when socket closes
[ https://issues.apache.org/jira/browse/FLINK-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329670#comment-14329670 ] ASF GitHub Bot commented on FLINK-1582: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/424#issuecomment-75330500 I was thinking of a limit as high as a minute. But after reconsidering it, I am for your argument: trying to establish the connection every couple of seconds should not be a big overhead, so my whole idea might be overkill. Let us have your version for now and generalize it on request. SocketStream gets stuck when socket closes -- Key: FLINK-1582 URL: https://issues.apache.org/jira/browse/FLINK-1582 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.8, 0.9 Reporter: Márton Balassi Labels: starter When the server side of the socket closes, the socket stream reader does not terminate. When the socket is reinitiated, it does not reconnect; it just gets stuck. It would be nice to add options for how the reader should behave when the socket is down: terminate immediately (good for testing and examples) or wait a specified time - possibly forever.
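The reconnect policy discussed above (retry the connection every couple of seconds, up to a limit or forever) can be sketched as a generic bounded-retry helper; the names, the simulated socket, and the intervals are all illustrative, not Flink's API:

```java
import java.util.concurrent.Callable;

public class RetryConnect {
    /** Retries connect() up to maxAttempts times, sleeping between attempts;
     *  rethrows the last failure once the limit is reached. */
    static <T> T retry(Callable<T> connect, int maxAttempts, long sleepMillis) throws Exception {
        Exception last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return connect.call();
            } catch (Exception e) {
                last = e;
                Thread.sleep(sleepMillis);
            }
        }
        throw last; // terminate after the limit, as suggested for tests/examples
    }

    public static void main(String[] args) throws Exception {
        // Simulated socket that refuses twice before "connecting".
        int[] attempts = {0};
        String conn = retry(() -> {
            if (++attempts[0] < 3) throw new java.io.IOException("connection refused");
            return "connected";
        }, 5, 1);
        System.out.println(conn + " after " + attempts[0] + " attempts");
        // connected after 3 attempts
    }
}
```

A "wait forever" mode falls out of passing Integer.MAX_VALUE for maxAttempts; "terminate immediately" is maxAttempts = 1.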
[GitHub] flink pull request: [FLINK-1461][api-extending] Add SortPartition ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/381
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329046#comment-14329046 ] ASF GitHub Bot commented on FLINK-1484: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75254417 @tillrohrmann and @StephanEwen worked on some other reliability issues. Will the changes in this PR be subsumed by the upcoming changes? If not, we should merge this. :-) JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 A JobManager restart can happen due to an uncaught exception. However, connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager.
[jira] [Commented] (FLINK-1505) Separate buffer reader and channel consumption logic
[ https://issues.apache.org/jira/browse/FLINK-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329028#comment-14329028 ] ASF GitHub Bot commented on FLINK-1505: --- GitHub user uce opened a pull request: https://github.com/apache/flink/pull/428 [FLINK-1505] Separate reader API from result consumption @gyfora, can you please rebase on this branch and verify that everything is still working as expected for you? This PR separates the reader API (record and buffer readers) from result consumption (input gate). The buffer reader was a huge component with mixed responsibilities, both as the runtime component to set up input channels for intermediate result consumption and as a lower-level user API to consume buffers/events. The separation makes it easier for users of the API (e.g. flink-streaming) to extend the handling of low-level buffers and events. Gyula's initial feedback confirmed this. In view of FLINK-1568, this PR also makes it easier to test the result consumption logic for failure scenarios. I will rebase #356 on these changes.
You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/incubator-flink flink-1505-input_gate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/428.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #428 commit db1dc5be12427664a418ce6e4fb41de39838fac0 Author: Ufuk Celebi u...@apache.org Date: 2015-02-10T14:05:44Z [FLINK-1505] [distributed runtime] Separate reader API from result consumption Separate buffer reader and channel consumption logic Key: FLINK-1505 URL: https://issues.apache.org/jira/browse/FLINK-1505 Project: Flink Issue Type: Improvement Components: Distributed Runtime Affects Versions: master Reporter: Ufuk Celebi Priority: Minor Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is bloated. There is no separation between consumption of the input channels and the buffer readers. This was not the case up until release-0.8 and has been introduced by me with intermediate results. I think this was a mistake and we should separate this again. flink-streaming is currently the heaviest user of these lower-level APIs and I have received feedback from [~gyfora] to undo this as well.
[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/428#issuecomment-75257236 Thank you! I will try it over the weekend and give you feedback. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/427#discussion_r25080967 --- Diff: flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/ExecutionEnvironmentITCase.java --- @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.test.javaApiOperators; + +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.configuration.ConfigConstants; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.test.javaApiOperators.util.CollectionDataSets; +import org.apache.flink.test.util.MultipleProgramsTestBase; +import org.junit.Assert; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.Parameterized; + +import java.util.ArrayList; +import java.util.Collection; + + +@RunWith(Parameterized.class) +public class ExecutionEnvironmentITCase extends MultipleProgramsTestBase { + + + public ExecutionEnvironmentITCase(ExecutionMode mode) { + super(mode); + } + + @Parameterized.Parameters(name = "Execution mode = {0}") + public static Collection<ExecutionMode[]> executionModes(){ + Collection<ExecutionMode[]> c = new ArrayList<ExecutionMode[]>(1); + c.add(new ExecutionMode[] {ExecutionMode.CLUSTER}); + return c; + } + + + @Test + public void testLocalEnvironmentWithConfig() throws Exception { + IllegalArgumentException e = null; + try { + Configuration conf = new Configuration(); + conf.setBoolean(ConfigConstants.FILESYSTEM_DEFAULT_OVERWRITE_KEY, true); + conf.setString(ConfigConstants.TASK_MANAGER_TMP_DIR_KEY, "/tmp/thelikelyhoodthatthisdirectoryexisitsisreallylow"); --- End diff -- Are you sure?
[jira] [Commented] (FLINK-1461) Add sortPartition operator
[ https://issues.apache.org/jira/browse/FLINK-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329033#comment-14329033 ] ASF GitHub Bot commented on FLINK-1461: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/381 Add sortPartition operator -- Key: FLINK-1461 URL: https://issues.apache.org/jira/browse/FLINK-1461 Project: Flink Issue Type: New Feature Components: Java API, Local Runtime, Optimizer, Scala API Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor A {{sortPartition()}} operator can be used to * sort the input of a {{mapPartition()}} operator * enforce a certain sorting of the input of a given operator of a program. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
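The partition-local nature of such a sort can be illustrated without any Flink types: each partition is sorted independently, and no order holds across partitions. A minimal plain-Java sketch (names are illustrative, not the Flink API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortPartitionSketch {

    // Sorts each partition independently, mirroring what a sortPartition()
    // operator would guarantee before a mapPartition() function runs.
    static List<List<Integer>> sortPartitions(List<List<Integer>> partitions) {
        List<List<Integer>> sorted = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            List<Integer> copy = new ArrayList<>(partition);
            Collections.sort(copy); // local sort only; no global order
            sorted.add(copy);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<List<Integer>> input = Arrays.asList(
                Arrays.asList(3, 1, 2),
                Arrays.asList(9, 7, 8));
        System.out.println(sortPartitions(input)); // [[1, 2, 3], [7, 8, 9]]
    }
}
```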
[jira] [Commented] (FLINK-1505) Separate buffer reader and channel consumption logic
[ https://issues.apache.org/jira/browse/FLINK-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329061#comment-14329061 ] ASF GitHub Bot commented on FLINK-1505: --- Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/428#issuecomment-75257236 Thank you! I will try it over the weekend and give you feedback. Separate buffer reader and channel consumption logic Key: FLINK-1505 URL: https://issues.apache.org/jira/browse/FLINK-1505 Project: Flink Issue Type: Improvement Components: Distributed Runtime Affects Versions: master Reporter: Ufuk Celebi Priority: Minor Currently, the hierarchy of readers (f.a.o.runtime.io.network.api) is bloated. There is no separation between consumption of the input channels and the buffer readers. This was not the case up until release-0.8 and has been introduced by me with intermediate results. I think this was a mistake and we should separate this again. flink-streaming is currently the heaviest user of these lower-level APIs and I have received feedback from [~gyfora] to undo this as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1466) Add InputFormat to read HCatalog tables
[ https://issues.apache.org/jira/browse/FLINK-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1466. -- Resolution: Implemented Implemented with bed3da4a61a8637c0faa9632b8a05ccef8c5a6dc Add InputFormat to read HCatalog tables --- Key: FLINK-1466 URL: https://issues.apache.org/jira/browse/FLINK-1466 Project: Flink Issue Type: New Feature Components: Java API, Scala API Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor HCatalog is a metadata repository and InputFormat to make Hive tables accessible to other frameworks such as Pig. Adding support for HCatalog would give access to Hive managed data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1505] Separate reader API from result c...
GitHub user uce opened a pull request: https://github.com/apache/flink/pull/428 [FLINK-1505] Separate reader API from result consumption @gyfora, can you please rebase on this branch and verify that everything is still working as expected for you? This PR separates the reader API (record and buffer readers) from result consumption (input gate). The buffer reader was a huge component with mixed responsibilities both as the runtime component to set up input channels for intermediate result consumption and as a lower-level user API to consume buffers/events. The separation makes it easier for users of the API (e.g. flink-streaming) to extend the handling of low-level buffers and events. Gyula's initial feedback confirmed this. In view of FLINK-1568, this PR also makes it easier to test the result consumption logic for failure scenarios. I will rebase #356 on these changes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/incubator-flink flink-1505-input_gate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/428.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #428 commit db1dc5be12427664a418ce6e4fb41de39838fac0 Author: Ufuk Celebi u...@apache.org Date: 2015-02-10T14:05:44Z [FLINK-1505] [distributed runtime] Separate reader API from result consumption
[jira] [Commented] (FLINK-1466) Add InputFormat to read HCatalog tables
[ https://issues.apache.org/jira/browse/FLINK-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329032#comment-14329032 ] ASF GitHub Bot commented on FLINK-1466: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/411 Add InputFormat to read HCatalog tables --- Key: FLINK-1466 URL: https://issues.apache.org/jira/browse/FLINK-1466 Project: Flink Issue Type: New Feature Components: Java API, Scala API Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor HCatalog is a metadata repository and InputFormat to make Hive tables accessible to other frameworks such as Pig. Adding support for HCatalog would give access to Hive managed data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1444) Add data properties for data sources
[ https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329031#comment-14329031 ] ASF GitHub Bot commented on FLINK-1444: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/379 Add data properties for data sources Key: FLINK-1444 URL: https://issues.apache.org/jira/browse/FLINK-1444 Project: Flink Issue Type: New Feature Components: Java API, JobManager, Optimizer Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor This issue proposes to add support for attaching data properties to data sources. These data properties are defined with respect to input splits. Possible properties are: - partitioning across splits: all elements of the same key (combination) are contained in one split - sorting / grouping with splits: elements are sorted or grouped on certain keys within a split - key uniqueness: a certain key (combination) is unique for all elements of the data source. This property is not defined wrt. input splits. The optimizer can leverage this information to generate more efficient execution plans. The InputFormat will be responsible to generate input splits such that the promised data properties are actually in place. Otherwise, the program will produce invalid results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1594) DataStreams don't support self-join
Daniel Bali created FLINK-1594: -- Summary: DataStreams don't support self-join Key: FLINK-1594 URL: https://issues.apache.org/jira/browse/FLINK-1594 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Environment: flink-0.9.0-SNAPSHOT Reporter: Daniel Bali Trying to join a DataSet with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170) ... 20 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
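The failing precondition above can be illustrated in isolation: a union reader is only meaningful over two or more inputs, and a self-join that wires the same stream to both sides can collapse to a single input. A hedged plain-Java sketch of the check, not the actual UnionBufferReader:

```java
import java.util.Arrays;
import java.util.List;

public class UnionReaderSketch {

    // Mirrors the precondition seen in the stack trace: a union reader
    // needs at least two distinct inputs. Illustrative stand-in only.
    static int checkUnionInputs(List<String> readers) {
        if (readers.size() < 2) {
            throw new IllegalArgumentException(
                "Union buffer reader must be initialized with at least two individual buffer readers");
        }
        return readers.size();
    }

    public static void main(String[] args) {
        // A self-join that collapses both sides into one reader would
        // trip the check; two distinct inputs pass it.
        System.out.println(checkUnionInputs(Arrays.asList("left", "right"))); // 2
    }
}
```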
[jira] [Commented] (FLINK-1589) Add option to pass Configuration to LocalExecutor
[ https://issues.apache.org/jira/browse/FLINK-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329117#comment-14329117 ] ASF GitHub Bot commented on FLINK-1589: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/427#discussion_r25080967 --- Diff: flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/ExecutionEnvironmentITCase.java --- @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.test.javaApiOperators; + +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.configuration.ConfigConstants; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.test.javaApiOperators.util.CollectionDataSets; +import org.apache.flink.test.util.MultipleProgramsTestBase; +import org.junit.Assert; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.Parameterized; + +import java.util.ArrayList; +import java.util.Collection; + + +@RunWith(Parameterized.class) +public class ExecutionEnvironmentITCase extends MultipleProgramsTestBase { + + + public ExecutionEnvironmentITCase(ExecutionMode mode) { + super(mode); + } + + @Parameterized.Parameters(name = "Execution mode = {0}") + public static Collection<ExecutionMode[]> executionModes(){ + Collection<ExecutionMode[]> c = new ArrayList<ExecutionMode[]>(1); + c.add(new ExecutionMode[] {ExecutionMode.CLUSTER}); + return c; + } + + + @Test + public void testLocalEnvironmentWithConfig() throws Exception { + IllegalArgumentException e = null; + try { + Configuration conf = new Configuration(); + conf.setBoolean(ConfigConstants.FILESYSTEM_DEFAULT_OVERWRITE_KEY, true); + conf.setString(ConfigConstants.TASK_MANAGER_TMP_DIR_KEY, "/tmp/thelikelyhoodthatthisdirectoryexisitsisreallylow"); --- End diff -- Are you sure? Add option to pass Configuration to LocalExecutor - Key: FLINK-1589 URL: https://issues.apache.org/jira/browse/FLINK-1589 Project: Flink Issue Type: New Feature Reporter: Robert Metzger Assignee: Robert Metzger Right now it's not possible for users to pass custom configuration values to Flink when running it from within an IDE. It would be very convenient to be able to create a local execution environment that allows passing configuration files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329056#comment-14329056 ] ASF GitHub Bot commented on FLINK-1484: --- Github user tillrohrmann closed the pull request at: https://github.com/apache/flink/pull/368 JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 In case of a JobManager restart, which can happen due to an uncaught exception, connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user tillrohrmann closed the pull request at: https://github.com/apache/flink/pull/368
[jira] [Commented] (FLINK-1484) JobManager restart does not notify the TaskManager
[ https://issues.apache.org/jira/browse/FLINK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329055#comment-14329055 ] ASF GitHub Bot commented on FLINK-1484: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75256160 This PR has been merged as part of PR #423 JobManager restart does not notify the TaskManager -- Key: FLINK-1484 URL: https://issues.apache.org/jira/browse/FLINK-1484 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 0.9 In case of a JobManager restart, which can happen due to an uncaught exception, connected TaskManagers are not informed about the disconnection and continue sending messages to a JobManager with a reset state. TaskManagers should be informed about a possible restart and clean up their own state in such a case. Afterwards, they can try to reconnect to the restarted JobManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75256160 This PR has been merged as part of PR #423
[jira] [Updated] (FLINK-1594) DataStreams don't support self-join
[ https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Bali updated FLINK-1594: --- Description: Trying to join a DataSet with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.<init>(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:170) ... 20 more {noformat} was: Trying to join a DataSets with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at
[jira] [Resolved] (FLINK-1444) Add data properties for data sources
[ https://issues.apache.org/jira/browse/FLINK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-1444. -- Resolution: Implemented Implemented with f0a28bf5345084a0a43df16021e60078e322e087 Add data properties for data sources Key: FLINK-1444 URL: https://issues.apache.org/jira/browse/FLINK-1444 Project: Flink Issue Type: New Feature Components: Java API, JobManager, Optimizer Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Minor This issue proposes to add support for attaching data properties to data sources. These data properties are defined with respect to input splits. Possible properties are: - partitioning across splits: all elements of the same key (combination) are contained in one split - sorting / grouping within splits: elements are sorted or grouped on certain keys within a split - key uniqueness: a certain key (combination) is unique for all elements of the data source. This property is not defined wrt. input splits. The optimizer can leverage this information to generate more efficient execution plans. The InputFormat will be responsible to generate input splits such that the promised data properties are actually in place. Otherwise, the program will produce invalid results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
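The first of these properties (all elements of the same key contained in one split) reduces to a simple invariant that an InputFormat must uphold. A hedged plain-Java sketch of checking it, with splits modeled as lists of keys; the method name is illustrative, not part of Flink:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitPropertySketch {

    // Returns true if no key occurs in more than one split, i.e. the
    // source really is partitioned by key across its input splits.
    static boolean isPartitionedByKey(List<List<String>> splits) {
        Map<String, Integer> firstSplitOfKey = new HashMap<>();
        for (int i = 0; i < splits.size(); i++) {
            for (String key : splits.get(i)) {
                Integer seen = firstSplitOfKey.putIfAbsent(key, i);
                if (seen != null && seen != i) {
                    return false; // same key appears in two splits
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPartitionedByKey(Arrays.asList(
                Arrays.asList("a", "a"), Arrays.asList("b")))); // true
        System.out.println(isPartitionedByKey(Arrays.asList(
                Arrays.asList("a"), Arrays.asList("a")))); // false
    }
}
```

If the invariant is violated while the property is declared, the optimizer's assumptions no longer hold, which is exactly the "invalid results" caveat in the issue description.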
[GitHub] flink pull request: [FLINK-1484] Adds explicit disconnect message ...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/368#issuecomment-75254417 @tillrohrmann and @StephanEwen worked on some other reliability issues. Will the changes in this PR be subsumed by the upcoming changes? If not, we should merge this. :-)
[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/413#discussion_r25086186 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/GenericTypeInfo.java --- @@ -66,6 +71,9 @@ public boolean isKeyType() { @Override public TypeSerializer<T> createSerializer() { + if(useAvroSerializer) { + return new AvroSerializer<T>(this.typeClass); + } return new KryoSerializer<T>(this.typeClass); --- End diff -- If we enclose the ```return new KryoSerializer``` in an else block, then the control flow looks nicer imho.
[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329218#comment-14329218 ] Andra Lungu commented on FLINK-1587: If there are no time constraints, I will take it :) coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt I am receiving the following exception when running a simple job that extracts outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally. Will try that in the next days. {noformat} 02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED java.util.NoSuchElementException at java.util.Collections$EmptyIterator.next(Collections.java:3006) at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665) at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:745) 02/20/2015 02:27:02: Job execution switched to status FAILING ... {noformat} The error occurs in Gelly's Graph.java at this line: https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636 Is there any valid case where a coGroup Iterator may be empty? As far as I see there is a bug somewhere. I'd like to write a test case for this to reproduce the issue. Where can I put such a test? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
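The fix suggested in the discussion — check `hasNext()` before calling `next()` and fail with a descriptive exception instead of the opaque NoSuchElementException — can be sketched as follows. Names and the exact message are illustrative, not the final Gelly code:

```java
import java.util.Collections;
import java.util.Iterator;

public class DegreeCheckSketch {

    // Guard for the coGroup used in the degree methods: the vertex side
    // may be empty when the edge set references an ID with no matching
    // vertex; calling next() directly would throw the opaque
    // NoSuchElementException from the stack trace above.
    static String firstVertexOrFail(Iterator<String> vertices) {
        if (!vertices.hasNext()) {
            throw new IllegalStateException(
                "The edge set contains a vertex ID that does not exist in the vertex set");
        }
        return vertices.next();
    }

    public static void main(String[] args) {
        System.out.println(firstVertexOrFail(
                Collections.singletonList("v1").iterator())); // v1
    }
}
```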
[jira] [Commented] (FLINK-1515) [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329204#comment-14329204 ] ASF GitHub Bot commented on FLINK-1515: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75279454 I'm currently working on fixing this problem. You can ignore it for the moment. On Fri, Feb 20, 2015 at 11:58 AM, Vasia Kalavri notificati...@github.com wrote: Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout Is this fixed by #422 https://github.com/apache/flink/pull/422? Shall I proceed? Thanks! — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/402#issuecomment-75220973. [Gelly] Enable access to aggregators and broadcast sets in vertex-centric iteration --- Key: FLINK-1515 URL: https://issues.apache.org/jira/browse/FLINK-1515 Project: Flink Issue Type: Improvement Components: Gelly Reporter: Vasia Kalavri Assignee: Martin Kiefer Currently, aggregators and broadcast sets cannot be accessed through Gelly's {{runVertexCentricIteration}} method. The functionality is already present in the {{VertexCentricIteration}} and we just need to expose it. This could be done like this: We create a method {{createVertexCentricIteration}}, which will return a {{VertexCentricIteration}} object and we change {{runVertexCentricIteration}} to accept this as a parameter (and return the graph after running this iteration). 
The user can configure the {{VertexCentricIteration}} by directly calling the public methods {{registerAggregator}}, {{setName}}, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
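The create-configure-run shape proposed here can be sketched with plain-Java stand-ins. All names below are illustrative approximations of the proposal, not the final Gelly signatures:

```java
import java.util.ArrayList;
import java.util.List;

public class IterationConfigSketch {

    // Minimal stand-in for the proposed API shape: create the iteration,
    // configure it through public methods, then hand it back to run.
    static class VertexCentricIteration {
        String name;
        final List<String> aggregators = new ArrayList<>();

        void setName(String name) { this.name = name; }
        void registerAggregator(String id) { aggregators.add(id); }
    }

    static VertexCentricIteration createVertexCentricIteration() {
        return new VertexCentricIteration();
    }

    static String runVertexCentricIteration(VertexCentricIteration it) {
        // In the proposal this would run the iteration and return the
        // resulting graph; here we just report the configuration.
        return "ran '" + it.name + "' with " + it.aggregators.size() + " aggregator(s)";
    }

    public static void main(String[] args) {
        VertexCentricIteration it = createVertexCentricIteration();
        it.setName("connected components");
        it.registerAggregator("component-count");
        System.out.println(runVertexCentricIteration(it));
    }
}
```

The design point is that configuration (aggregators, broadcast sets, name) happens on the iteration object between creation and execution, so `runVertexCentricIteration` itself needs no extra parameters.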
[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes
[ https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329215#comment-14329215 ] ASF GitHub Bot commented on FLINK-1567: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/413#discussion_r25086186 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/GenericTypeInfo.java --- @@ -66,6 +71,9 @@ public boolean isKeyType() { @Override public TypeSerializer<T> createSerializer() { + if(useAvroSerializer) { + return new AvroSerializer<T>(this.typeClass); + } return new KryoSerializer<T>(this.typeClass); --- End diff -- If we enclose the ```return new KryoSerializer``` in an else block, then the control flow looks nicer imho. Add option to switch between Avro and Kryo serialization for GenericTypes - Key: FLINK-1567 URL: https://issues.apache.org/jira/browse/FLINK-1567 Project: Flink Issue Type: Improvement Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Allow users to switch the underlying serializer for GenericTypes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
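The shape of the reviewed change, with the reviewer's suggested explicit else branch, can be sketched like this. These are self-contained toy types (`*Sketch` names are made up, not Flink's `GenericTypeInfo` or its serializers); only the flag-based selection pattern is the point.

```java
// Toy serializer interface; real code returns Flink TypeSerializer<T>.
interface SerializerSketch<T> { String name(); }

class AvroSerializerSketch<T> implements SerializerSketch<T> {
    public String name() { return "avro"; }
}

class KryoSerializerSketch<T> implements SerializerSketch<T> {
    public String name() { return "kryo"; }
}

class GenericTypeInfoSketch<T> {
    private final boolean useAvroSerializer;

    GenericTypeInfoSketch(boolean useAvroSerializer) {
        this.useAvroSerializer = useAvroSerializer;
    }

    // The switch under review: both outcomes in an explicit if/else,
    // as suggested in the review comment.
    public SerializerSketch<T> createSerializer() {
        if (useAvroSerializer) {
            return new AvroSerializerSketch<T>();
        } else {
            return new KryoSerializerSketch<T>();
        }
    }
}
```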
[GitHub] flink pull request: [FLINK-1589] Add option to pass configuration ...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/427#issuecomment-75271706 I think we should rework the test case to check that the configuration is properly passed to the system. Right now the exception is thrown in ```ds.writeAsText(null)``` because we pass ```null```. I'd propose something like @StephanEwen did in the PR #410. We set the number of slots in the configuration and the job to ```PARALLELISM_AUTO_MAX```. With the special input format which produces only a single element per split, we can count the number of parallel tasks, given that every task receives only one input split. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-1555] Add serializer hierarchy debug ut...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/415#discussion_r25085047 --- Diff: flink-core/src/main/java/org/apache/flink/api/common/typeutils/CompositeType.java --- @@ -132,7 +133,31 @@ public CompositeType(Class<T> typeClass) { } return getNewComparator(config); } - + + // + + /** +* Debugging utility to understand the hierarchy of serializers created by the Java API. +*/ + public static <T> String getSerializerTree(TypeInformation<T> ti) { + return getSerializerTree(ti, 0); + } + + private static <T> String getSerializerTree(TypeInformation<T> ti, int indent) { + String ret = ""; + if(ti instanceof CompositeType) { + ret += ti.toString()+"\n"; --- End diff -- Should the ```toString``` method not already print the whole tree? Thus, the information would be redundant.
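The indentation-based recursion in the reviewed utility can be sketched in a self-contained form. This uses a hypothetical `TypeNode` toy type instead of Flink's `TypeInformation`/`CompositeType`; the recursion-with-deeper-indent structure is what the debug utility builds.

```java
import java.util.Arrays;
import java.util.List;

// Toy tree node standing in for TypeInformation; composite nodes have children.
class TypeNode {
    final String name;
    final List<TypeNode> children;

    TypeNode(String name, TypeNode... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

class SerializerTreePrinter {
    // Public entry point mirrors getSerializerTree(ti): start at indent 0.
    static String getSerializerTree(TypeNode node) {
        return getSerializerTree(node, 0);
    }

    // Each level prints its own name, then recurses into children two
    // spaces deeper, yielding a readable hierarchy.
    private static String getSerializerTree(TypeNode node, int indent) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < indent; i++) sb.append(' ');
        sb.append(node.name).append('\n');
        for (TypeNode child : node.children) {
            sb.append(getSerializerTree(child, indent + 2));
        }
        return sb.toString();
    }
}
```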
[jira] [Assigned] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andra Lungu reassigned FLINK-1587: -- Assignee: Andra Lungu coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt Assignee: Andra Lungu I am receiving the following exception when running a simple job that extracts the outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally. Will try that in the next days. {noformat} 02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED java.util.NoSuchElementException at java.util.Collections$EmptyIterator.next(Collections.java:3006) at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665) at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:745) 02/20/2015 02:27:02: Job execution switched to status FAILING ... {noformat} The error occurs in Gelly's Graph.java at this line: https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636 Is there any valid case where a coGroup Iterator may be empty? As far as I see there is a bug somewhere. I'd like to write a test case for this to reproduce the issue. Where can I put such a test? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
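The fix discussed on the list, checking for the empty group and throwing a descriptive exception instead of letting `iterator.next()` fail, can be sketched in plain Java. This is a hedged illustration only: `NeighborCountSketch` and `countOrFail` are made-up names, not the actual `CountNeighborsCoGroup` code.

```java
import java.util.Iterator;

class NeighborCountSketch {
    // Guard the vertex side of the coGroup: an empty vertex group means
    // the edge set references a vertex id that does not exist, so fail
    // with a descriptive message rather than a bare NoSuchElementException.
    static long countOrFail(Iterator<Long> vertexGroup, Iterator<Long> neighborIds) {
        if (!vertexGroup.hasNext()) {
            throw new IllegalStateException(
                "coGroup received an empty vertex group; "
                + "the edge set may reference a vertex id that does not exist");
        }
        vertexGroup.next(); // the (single) vertex record
        long count = 0;
        while (neighborIds.hasNext()) {
            neighborIds.next();
            count++;
        }
        return count;
    }
}
```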
[GitHub] flink pull request: [FLINK-1515]Splitted runVertexCentricIteration...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/402#issuecomment-75279454 I'm currently working on fixing this problem. You can ignore it for the moment. On Fri, Feb 20, 2015 at 11:58 AM, Vasia Kalavri notificati...@github.com wrote: Hi, if no more comments, I'd like to merge this. There is a failing check in Travis (not related to this PR): Tests in error: JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40 » Timeout JobManagerFailsITCase.run:40-org$scalatest$BeforeAndAfterAll$$super$run:40-org$scalatest$WordSpecLike$$super$run:40-runTests:40-runTest:40-withFixture:40-TestKit.within:707-TestKit.within:707 » Timeout Is this fixed by #422 https://github.com/apache/flink/pull/422? Shall I proceed? Thanks! — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/402#issuecomment-75220973.
[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes
[ https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329219#comment-14329219 ] ASF GitHub Bot commented on FLINK-1567: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/413#discussion_r25086445 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/runtime/AvroSerializer.java --- @@ -91,7 +98,22 @@ public T createInstance() { @Override public T copy(T from) { checkKryoInitialized(); - return this.kryo.copy(from); + try { + return this.kryo.copy(from); + } catch(KryoException ke) { + // kryo was unable to copy it, so we do it through serialization: + ByteArrayOutputStream baout = new ByteArrayOutputStream(); + Output output = new Output(baout); + + kryo.writeObject(output, from); + + output.close(); + + ByteArrayInputStream bain = new ByteArrayInputStream(baout.toByteArray()); + Input input = new Input(bain); + + return (T)kryo.readObject(input, from.getClass()); --- End diff -- Which cases do we cover with this method? Add option to switch between Avro and Kryo serialization for GenericTypes - Key: FLINK-1567 URL: https://issues.apache.org/jira/browse/FLINK-1567 Project: Flink Issue Type: Improvement Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Allow users to switch the underlying serializer for GenericTypes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
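The copy-through-serialization fallback under review can be illustrated with a self-contained sketch. Note the substitution: this uses plain `java.io` object serialization as a stand-in, whereas the reviewed code round-trips through Kryo's `Output`/`Input`; only the round-trip-as-deep-copy technique is the same.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CopyFallbackSketch {
    // When a direct copy is impossible, serialize the object to a byte
    // array and deserialize it again, producing an independent deep copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copyViaSerialization(T from) {
        try {
            ByteArrayOutputStream baout = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(baout)) {
                out.writeObject(from);
            }
            ByteArrayInputStream bain = new ByteArrayInputStream(baout.toByteArray());
            try (ObjectInputStream in = new ObjectInputStream(bain)) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException("copy through serialization failed", e);
        }
    }
}
```

The reviewer's question ("which cases do we cover?") targets exactly the objects a direct copy cannot handle; the round-trip works for anything the serializer itself supports.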
[GitHub] flink pull request: [FLINK-1567] Add switch to use the AvroSeriali...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/413#issuecomment-75283551 LGTM besides of my small comments.
[jira] [Commented] (FLINK-1567) Add option to switch between Avro and Kryo serialization for GenericTypes
[ https://issues.apache.org/jira/browse/FLINK-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329227#comment-14329227 ] ASF GitHub Bot commented on FLINK-1567: --- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/413#issuecomment-75283551 LGTM besides of my small comments. Add option to switch between Avro and Kryo serialization for GenericTypes - Key: FLINK-1567 URL: https://issues.apache.org/jira/browse/FLINK-1567 Project: Flink Issue Type: Improvement Affects Versions: 0.8, 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Allow users to switch the underlying serializer for GenericTypes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1595) Add a complex integration test for Streaming API
Gyula Fora created FLINK-1595: - Summary: Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora The streaming tests currently lack a sophisticated integration test that would test many API features at once. This should include different merging, partitioning, grouping, aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect by unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1595) Add a complex integration test for Streaming API
[ https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1595: -- Labels: Starter (was: ) Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Labels: Starter The streaming tests currently lack a sophisticated integration test that would test many API features at once. This should include different merging, partitioning, grouping, aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect by unit tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1594) DataStreams don't support self-join
[ https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-1594: -- Description: Trying to window-join a DataStream with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.init(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:170) ... 20 more {noformat} was: Trying to join a DataSet with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at
[jira] [Commented] (FLINK-1594) DataStreams don't support self-join
[ https://issues.apache.org/jira/browse/FLINK-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329268#comment-14329268 ] Gyula Fora commented on FLINK-1594: --- We will have to take a look at how self joins are handled in the batch runtime. DataStreams don't support self-join --- Key: FLINK-1594 URL: https://issues.apache.org/jira/browse/FLINK-1594 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Environment: flink-0.9.0-SNAPSHOT Reporter: Daniel Bali Trying to window-join a DataStream with itself will result in exceptions. I get the following stack trace: {noformat} java.lang.Exception: Error setting up runtime environment: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:173) at org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:419) at org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:44) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at 
akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.IllegalArgumentException: Union buffer reader must be initialized with at least two individual buffer readers at org.apache.flink.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:125) at org.apache.flink.runtime.io.network.api.reader.UnionBufferReader.init(UnionBufferReader.java:69) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setConfigInputs(CoStreamVertex.java:101) at org.apache.flink.streaming.api.streamvertex.CoStreamVertex.setInputsOutputs(CoStreamVertex.java:63) at org.apache.flink.streaming.api.streamvertex.StreamVertex.registerInputOutput(StreamVertex.java:65) at org.apache.flink.runtime.execution.RuntimeEnvironment.init(RuntimeEnvironment.java:170) ... 20 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...
Github user andralungu closed the pull request at: https://github.com/apache/flink/pull/414
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329381#comment-14329381 ] ASF GitHub Bot commented on FLINK-1522: --- GitHub user andralungu opened a pull request: https://github.com/apache/flink/pull/429 [FLINK-1522][gelly] Added test for SSSP Example You can merge this pull request into a Git repository by running: $ git pull https://github.com/andralungu/flink tidySSSP Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/429.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #429 commit 93cb052c4c432ccd28b008b2436e00d2382ea0ba Author: andralungu lungu.an...@gmail.com Date: 2015-02-20T19:11:42Z [FLINK-1522][gelly] Added test for SSSP Example Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329379#comment-14329379 ] ASF GitHub Bot commented on FLINK-1522: --- Github user andralungu closed the pull request at: https://github.com/apache/flink/pull/414 Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75310637 Btw, you can update a PR by pushing into your remote branch and don't need to close and open a new PR. If you want to change previous commits (including commit message) or rebase the PR, you can do a force push. However, when doing a force push all code comments get lost, so you should avoid that if somebody reviewed the code and made comments.
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329479#comment-14329479 ] ASF GitHub Bot commented on FLINK-1522: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75310637 Btw, you can update a PR by pushing into your remote branch and don't need to close and open a new PR. If you want to change previous commits (including commit message) or rebase the PR, you can do a force push. However, when doing a force push all code comments get lost, so you should avoid that if somebody reviewed the code and made comments. Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1522][gelly] Added test for SSSP Exampl...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75312569 Thank you for the tip! I am still learning the dirty insights of Git :) Next time I will just update the PR.
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14329497#comment-14329497 ] ASF GitHub Bot commented on FLINK-1522: --- Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/429#issuecomment-75312569 Thank you for the tip! I am still learning the dirty insights of Git :) Next time I will just update the PR. Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)