Re: Flink GZip support
Hi Karim,

also have a look at this old discussion from the user@ list: http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/read-gz-files-td760.html

On Sun, Feb 22, 2015 at 10:33 AM, Felix Neutatz neut...@googlemail.com wrote:

Hi Karim, you can use a Hadoop Input Format and read the files using flink-hadoop-compatibility classes like here: http://flink.apache.org/docs/0.7-incubating/hadoop_compatibility.html

Have a nice Sunday, Felix

2015-02-22 10:02 GMT+01:00 Karim Alaa karim.hame...@gmail.com:

Hi All, I’m currently working with Flink 0.8.0 and I would like to know if there is or will be any support for handling gzipped files. Thanks!
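As a quick illustration of what any gzip-aware input format has to do (decompress the stream, then split it into records), here is a minimal sketch using only Python's standard gzip module. It is not Flink or Hadoop API; the helper names read_gzip_lines and write_sample and the sample file are made up for this example.

```python
import gzip
import os
import tempfile


def read_gzip_lines(path):
    """Yield the text lines of a gzip-compressed file, without trailing newlines."""
    # "rt" opens the compressed stream in text mode; the encoding is explicit.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")


def write_sample(path, lines):
    """Write a small gzip file so the example is self-contained (hypothetical helper)."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


if __name__ == "__main__":
    sample = os.path.join(tempfile.mkdtemp(), "sample.txt.gz")
    write_sample(sample, ["first record", "second record"])
    for record in read_gzip_lines(sample):
        print(record)
```

The sketch reads the file front to back in a single reader, which is the simplest way to handle a .gz file because the whole file is one compressed stream.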
Re: Deprecated error building flink
Hi, but I’m using Oracle Java 8 (javac 1.8.0_05).

On Feb 22, 2015, at 6:32 PM, Robert Metzger rmetz...@apache.org wrote:

Hi Dulaj, you are using an unsupported compiler to compile Flink. You can compile Flink only with OpenJDK 6 and all JDKs above 6. The Oracle JDK 6's compiler contains a bug. You can run Flink with all JREs 6+ (including Oracle JDK 6). I would recommend that you upgrade your Java version to 7 anyway, because 6 doesn't receive any security updates anymore.

On Sun, Feb 22, 2015 at 12:55 PM, Dulaj Viduranga vidura...@icloud.com wrote:

Hi all, I’m new here. I had some problems building Flink on my Mac. Could someone please take a look and help me out?

Dulaj Viduranga.

[INFO] Scanning for projects... [WARNING] [WARNING] Some problems were encountered while building the effective model for org.apache.flink:flink-streaming-examples:jar:0.9-SNAPSHOT [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.apache.maven.plugins:maven-jar-plugin @ org.apache.flink:flink-streaming-examples:[unknown-version], /Users/Vidura/Documents/Development/flink/flink-staging/flink-streaming/flink-streaming-examples/pom.xml, line 462, column 12 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. 
[WARNING] [INFO] [INFO] Reactor Build Order: [INFO] [INFO] flink [INFO] flink-shaded [INFO] flink-core [INFO] flink-java [INFO] flink-runtime [INFO] flink-compiler [INFO] flink-clients [INFO] flink-test-utils [INFO] flink-scala [INFO] flink-examples [INFO] flink-java-examples [INFO] flink-scala-examples [INFO] flink-staging [INFO] flink-streaming [INFO] flink-streaming-core [INFO] flink-tests [INFO] flink-avro [INFO] flink-jdbc [INFO] flink-spargel [INFO] flink-hadoop-compatibility [INFO] flink-streaming-scala [INFO] flink-streaming-connectors [INFO] flink-streaming-examples [INFO] flink-hbase [INFO] flink-gelly [INFO] flink-hcatalog [INFO] flink-tachyon [INFO] flink-quickstart [INFO] flink-quickstart-java [INFO] flink-quickstart-scala [INFO] flink-contrib [INFO] flink-yarn [INFO] flink-dist [INFO] flink-yarn-tests [INFO] [INFO] [INFO] Building flink 0.9-SNAPSHOT [INFO] Downloading: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.pom Downloaded: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.pom (10 KB at 1.8 KB/sec) Downloading: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-project/0.10/apache-rat-project-0.10.pom Downloaded: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-project/0.10/apache-rat-project-0.10.pom (18 KB at 13.9 KB/sec) Downloading: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.jar Downloaded: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.jar (42 KB at 31.8 KB/sec) Downloading: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.pom Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.pom (8 KB at 8.0 KB/sec) Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.jar Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.jar (33 KB at 27.4 KB/sec) [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ flink-parent --- [INFO] Deleting /Users/Vidura/Documents/Development/flink/target [INFO] [INFO] --- maven-checkstyle-plugin:2.12.1:check (validate) @ flink-parent --- [INFO] [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-maven) @ flink-parent --- [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ flink-parent --- [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ flink-parent --- [INFO] [INFO] --- maven-shade-plugin:2.3:shade (default) @ flink-parent --- [INFO] Excluding org.apache.commons:commons-lang3:jar:3.3.2 from the shaded jar. [INFO] Excluding org.slf4j:slf4j-api:jar:1.7.7 from the shaded jar. [INFO] Excluding org.slf4j:slf4j-log4j12:jar:1.7.7 from the shaded jar. [INFO] Excluding log4j:log4j:jar:1.2.17 from the shaded jar. [INFO] Replacing original artifact with shaded artifact. [INFO] [INFO] ---
Re: Deprecated error building flink
Seems like you have to set the JAVA_HOME variable properly ( http://stackoverflow.com/questions/18813828/why-maven-use-jdk-1-6-but-my-java-version-is-1-7 )

On Sun, Feb 22, 2015 at 2:11 PM, Dulaj Viduranga vidura...@icloud.com wrote:

Oh yes. :) It runs on 1.6. How could I fix that?

On Feb 22, 2015, at 6:37 PM, Robert Metzger rmetz...@apache.org wrote:

Can you run mvn -version to verify that? Maybe Maven is using a different Java version?

On Sun, Feb 22, 2015 at 2:05 PM, Dulaj Viduranga vidura...@icloud.com wrote:

Hi, but I’m using Oracle Java 8 (javac 1.8.0_05).

On Feb 22, 2015, at 6:32 PM, Robert Metzger rmetz...@apache.org wrote:

Hi Dulaj, you are using an unsupported compiler to compile Flink. You can compile Flink only with OpenJDK 6 and all JDKs above 6. The Oracle JDK 6's compiler contains a bug. You can run Flink with all JREs 6+ (including Oracle JDK 6). I would recommend that you upgrade your Java version to 7 anyway, because 6 doesn't receive any security updates anymore.

On Sun, Feb 22, 2015 at 12:55 PM, Dulaj Viduranga vidura...@icloud.com wrote:

Hi all, I’m new here. I had some problems building Flink on my Mac. Could someone please take a look and help me out?

Dulaj Viduranga.

[INFO] Scanning for projects... [WARNING] [WARNING] Some problems were encountered while building the effective model for org.apache.flink:flink-streaming-examples:jar:0.9-SNAPSHOT [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.apache.maven.plugins:maven-jar-plugin @ org.apache.flink:flink-streaming-examples:[unknown-version], /Users/Vidura/Documents/Development/flink/flink-staging/flink-streaming/flink-streaming-examples/pom.xml, line 462, column 12 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. 
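The check suggested in this thread (run mvn -version and see which Java it reports) can be scripted. The sketch below is only an assumption-laden helper, not part of Maven or Flink: it relies on mvn -version printing a line that starts with "Java version:", and the function names java_version_from_mvn and major_version are invented here.

```python
import os
import re
import shutil
import subprocess


def java_version_from_mvn(output):
    """Return the version string from the 'Java version:' line, or None if absent."""
    match = re.search(r"^Java version:\s*([^,\s]+)", output, flags=re.MULTILINE)
    return match.group(1) if match else None


def major_version(version):
    """Map legacy '1.x' version strings to x, and newer strings to their first number."""
    parts = version.split(".")
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    return int(parts[0])


if __name__ == "__main__":
    # JAVA_HOME is what the linked answer suggests fixing, so print it for comparison.
    print("JAVA_HOME =", os.environ.get("JAVA_HOME"))
    mvn = shutil.which("mvn")
    if mvn is None:
        print("mvn not found on PATH")
    else:
        result = subprocess.run([mvn, "-version"], capture_output=True, text=True)
        version = java_version_from_mvn(result.stdout)
        if version is None:
            print("no 'Java version:' line found")
        else:
            print("Maven runs on Java", version, "(major", str(major_version(version)) + ")")
```

If the major version printed here differs from what javac -version reports, Maven is picking up a different JDK than the shell, which matches the situation described above.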
Re: Deprecated error building flink
Hi Dulaj, you are using an unsupported compiler to compile Flink. You can compile Flink only with OpenJDK 6 and all JDKs above 6. The Oracle JDK 6's compiler contains a bug. You can run Flink with all JREs 6+ (including Oracle JDK 6). I would recommend that you upgrade your Java version to 7 anyway, because 6 doesn't receive any security updates anymore.

On Sun, Feb 22, 2015 at 12:55 PM, Dulaj Viduranga vidura...@icloud.com wrote:

Hi all, I’m new here. I had some problems building Flink on my Mac. Could someone please take a look and help me out?

Dulaj Viduranga.

[INFO] Scanning for projects... [WARNING] [WARNING] Some problems were encountered while building the effective model for org.apache.flink:flink-streaming-examples:jar:0.9-SNAPSHOT [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.apache.maven.plugins:maven-jar-plugin @ org.apache.flink:flink-streaming-examples:[unknown-version], /Users/Vidura/Documents/Development/flink/flink-staging/flink-streaming/flink-streaming-examples/pom.xml, line 462, column 12 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. 
[WARNING] [INFO] [INFO] Reactor Build Order: [INFO] [INFO] flink [INFO] flink-shaded [INFO] flink-core [INFO] flink-java [INFO] flink-runtime [INFO] flink-compiler [INFO] flink-clients [INFO] flink-test-utils [INFO] flink-scala [INFO] flink-examples [INFO] flink-java-examples [INFO] flink-scala-examples [INFO] flink-staging [INFO] flink-streaming [INFO] flink-streaming-core [INFO] flink-tests [INFO] flink-avro [INFO] flink-jdbc [INFO] flink-spargel [INFO] flink-hadoop-compatibility [INFO] flink-streaming-scala [INFO] flink-streaming-connectors [INFO] flink-streaming-examples [INFO] flink-hbase [INFO] flink-gelly [INFO] flink-hcatalog [INFO] flink-tachyon [INFO] flink-quickstart [INFO] flink-quickstart-java [INFO] flink-quickstart-scala [INFO] flink-contrib [INFO] flink-yarn [INFO] flink-dist [INFO] flink-yarn-tests [INFO] [INFO] [INFO] Building flink 0.9-SNAPSHOT [INFO] Downloading: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.pom Downloaded: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.pom (10 KB at 1.8 KB/sec) Downloading: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-project/0.10/apache-rat-project-0.10.pom Downloaded: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-project/0.10/apache-rat-project-0.10.pom (18 KB at 13.9 KB/sec) Downloading: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.jar Downloaded: https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.jar (42 KB at 31.8 KB/sec) Downloading: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.pom Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.pom (8 KB at 8.0 KB/sec) Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.jar Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.jar (33 KB at 27.4 KB/sec) [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ flink-parent --- [INFO] Deleting /Users/Vidura/Documents/Development/flink/target [INFO] [INFO] --- maven-checkstyle-plugin:2.12.1:check (validate) @ flink-parent --- [INFO] [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-maven) @ flink-parent --- [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ flink-parent --- [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ flink-parent --- [INFO] [INFO] --- maven-shade-plugin:2.3:shade (default) @ flink-parent --- [INFO] Excluding org.apache.commons:commons-lang3:jar:3.3.2 from the shaded jar. [INFO] Excluding org.slf4j:slf4j-api:jar:1.7.7 from the shaded jar. [INFO] Excluding org.slf4j:slf4j-log4j12:jar:1.7.7 from the shaded jar. [INFO] Excluding log4j:log4j:jar:1.2.17 from the shaded jar. [INFO] Replacing original artifact with shaded artifact. [INFO] [INFO] --- maven-failsafe-plugin:2.17:integration-test (default) @ flink-parent --- [INFO] Tests are skipped. [INFO] [INFO] --- apache-rat-plugin:0.10:check (default) @
Re: Deprecated error building flink
Oh yes. :) It runs on 1.6. How could I fix that?

On Feb 22, 2015, at 6:37 PM, Robert Metzger rmetz...@apache.org wrote:

Can you run mvn -version to verify that? Maybe Maven is using a different Java version?

On Sun, Feb 22, 2015 at 2:05 PM, Dulaj Viduranga vidura...@icloud.com wrote:

Hi, but I’m using Oracle Java 8 (javac 1.8.0_05).

On Feb 22, 2015, at 6:32 PM, Robert Metzger rmetz...@apache.org wrote:

Hi Dulaj, you are using an unsupported compiler to compile Flink. You can compile Flink only with OpenJDK 6 and all JDKs above 6. The Oracle JDK 6's compiler contains a bug. You can run Flink with all JREs 6+ (including Oracle JDK 6). I would recommend that you upgrade your Java version to 7 anyway, because 6 doesn't receive any security updates anymore.

On Sun, Feb 22, 2015 at 12:55 PM, Dulaj Viduranga vidura...@icloud.com wrote:

Hi all, I’m new here. I had some problems building Flink on my Mac. Could someone please take a look and help me out?

Dulaj Viduranga.

[INFO] Scanning for projects... [WARNING] [WARNING] Some problems were encountered while building the effective model for org.apache.flink:flink-streaming-examples:jar:0.9-SNAPSHOT [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.apache.maven.plugins:maven-jar-plugin @ org.apache.flink:flink-streaming-examples:[unknown-version], /Users/Vidura/Documents/Development/flink/flink-staging/flink-streaming/flink-streaming-examples/pom.xml, line 462, column 12 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. 
Re: [DISCUSS] Gelly iteration abstractions
Hi,

yes, I was referring to the parallel Boruvka algorithm. There are several ways to implement this one in Flink and I believe that the one described in the paper (vertex-centric) is not the most elegant one :)

Andra is now working on an idea that uses the delta iteration abstraction and we believe that it will be both more efficient and easier to understand. It has the edges in the solution set and the vertices in the workset, so it follows the pattern I describe in (2) in my previous e-mail.

As a next step, we would like to see how having an iteration operator that could update the whole graph - what I describe as (3) - would make this even nicer.

Any ideas are highly welcome!

Cheers, V.

On 22 February 2015 at 16:32, Andra Lungu lungu.an...@gmail.com wrote:

Hi Alex, Vasia is talking about the second version (presented Friday) of Parallel Boruvka, which can be found here: https://github.com/TU-Berlin-DIMA/IMPRO-3.WS14/pull/59

I will propose the third, non-Pregel-like approach directly to Gelly soon. If you have additional questions, I will be happy to answer them.

Andra

On Sun, Feb 22, 2015 at 4:23 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote:

Hi Vasia, I am trying to look at the problem in more detail. Which version of the MST are you talking about? Right now in the Gelly repository I can only find the SSSP example (parallel Bellman-Ford) from Section 4.2 in [1]. However, it seems that the issues encountered by Andra are related to the implementation of Parallel Boruvka (Section 3.2 in [2]). Is that correct?

Regards, A.

[1] http://www.vldb.org/pvldb/vol7/p1047-han.pdf
[2] http://www.vldb.org/pvldb/vol7/p577-salihoglu.pdf

2015-02-19 21:03 GMT+01:00 Vasiliki Kalavri vasilikikala...@gmail.com:

Hello beautiful Flink people,

during the past few days, Andra and I have been discussing how to extend Gelly's iteration methods. Alexander's course (and his awesome students) has made it obvious that vertex-centric iterations are not the best fit for algorithms which don't follow the common propagate-update pattern. For example, Andra is working on an implementation of Minimum Spanning Tree, which requires branching inside an iteration and also requires a convergence check of an internal iteration. Others also reported similar issues [1, 2]. Trying to fit such algorithms to the vertex-centric model leads to long and ugly code, e.g. aggregators to keep track of algorithm phases, duplicating data, etc.

One limitation of the vertex-centric and the upcoming GAS model is that they both only allow the vertex values to be updated in each iteration. However, for some algorithms we need to update the edge values and in others we need to update both. In even more complex situations (like Andra's MST) in some iterations we need to update the vertex values and in some iterations we need to update the edge values.

Another problem is that we currently don't have a way to allow different computational phases inside an iteration. This is something that Giraph solves with master compute, a function that is executed once before each superstep and sets the computation function.

All that said, I believe that we can solve most of these issues if we nicely expose Flink's iteration operators in Gelly. I can see the following cases:

1. Bulk delta iterations where the solution set is the vertex dataset: this will be similar to vertex-centric and GAS, but will allow more flexible dataflows inside the iteration.
2. Bulk delta iterations where the solution set is the edge dataset: for the cases where we need to update edge values.
3. Bulk delta iterations where the solution set is the Graph: this will cover more complex cases, where the algorithm updates both vertices and edges or even adds/removes vertices/edges, i.e. updates the whole Graph.

What do you think? I can see 1 & 2 being very easy to implement, but I suspect 3 won't be that easy (but so awesome to have ^^). Would it work the way a Graph is represented now, i.e. with 2 DataSets?

Any comment, idea, pointer would be much appreciated! Thank you ^^

Cheers, -V.

[1]: http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/Can-a-master-class-control-the-superstep-in-Flink-Spargel-td733.html
[2]: http://issues.apache.org/jira/browse/FLINK-1552?focusedCommentId=14325769&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14325769
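To make the solution set / workset vocabulary in this thread concrete, here is a minimal sketch of the delta-iteration pattern for case 1 (the solution set is the vertex dataset). It is plain Python, not Flink's or Gelly's API. The algorithm (connected components by minimum-label propagation) and the name delta_iterate are chosen for illustration only. Vertex ids are assumed to be comparable, and every edge endpoint is assumed to be in the vertex list.

```python
def delta_iterate(vertices, edges, max_iterations=100):
    """Connected components by min-label propagation, delta-iteration style."""
    # Undirected adjacency, built once outside the loop.
    neighbours = {v: set() for v in vertices}
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    solution = {v: v for v in vertices}  # solution set: vertex -> component id
    workset = set(vertices)              # vertices whose value changed last round

    for _ in range(max_iterations):
        if not workset:
            break  # converged: nothing changed in the previous round
        # Each changed vertex offers its current label to its neighbours.
        candidates = {}
        for v in workset:
            label = solution[v]
            for n in neighbours[v]:
                if label < candidates.get(n, solution[n]):
                    candidates[n] = label
        # Apply only real improvements; the updated vertices form the next workset.
        workset = set()
        for v, label in candidates.items():
            if label < solution[v]:
                solution[v] = label
                workset.add(v)
    return solution


if __name__ == "__main__":
    print(delta_iterate([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)]))
```

Only vertices in the workset do any work in a round, which is the property that distinguishes this pattern from recomputing every vertex value in every superstep.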
Re: [DISCUSS] Gelly iteration abstractions
Hi Vasia, I am trying to look at the problem in more detail. Which version of the MST are you talking about? Right now in the Gelly repository I can only find the SSSP example (parallel Bellman-Ford) from Section 4.2 in [1]. However, it seems that the issues encountered by Andra are related to the implementation of Parallel Boruvka (Section 3.2 in [2]). Is that correct?

Regards, A.

[1] http://www.vldb.org/pvldb/vol7/p1047-han.pdf
[2] http://www.vldb.org/pvldb/vol7/p577-salihoglu.pdf

2015-02-19 21:03 GMT+01:00 Vasiliki Kalavri vasilikikala...@gmail.com:

Hello beautiful Flink people,

during the past few days, Andra and I have been discussing how to extend Gelly's iteration methods. Alexander's course (and his awesome students) has made it obvious that vertex-centric iterations are not the best fit for algorithms which don't follow the common propagate-update pattern. For example, Andra is working on an implementation of Minimum Spanning Tree, which requires branching inside an iteration and also requires a convergence check of an internal iteration. Others also reported similar issues [1, 2]. Trying to fit such algorithms to the vertex-centric model leads to long and ugly code, e.g. aggregators to keep track of algorithm phases, duplicating data, etc.

One limitation of the vertex-centric and the upcoming GAS model is that they both only allow the vertex values to be updated in each iteration. However, for some algorithms we need to update the edge values and in others we need to update both. In even more complex situations (like Andra's MST) in some iterations we need to update the vertex values and in some iterations we need to update the edge values.

Another problem is that we currently don't have a way to allow different computational phases inside an iteration. This is something that Giraph solves with master compute, a function that is executed once before each superstep and sets the computation function.

All that said, I believe that we can solve most of these issues if we nicely expose Flink's iteration operators in Gelly. I can see the following cases:

1. Bulk delta iterations where the solution set is the vertex dataset: this will be similar to vertex-centric and GAS, but will allow more flexible dataflows inside the iteration.
2. Bulk delta iterations where the solution set is the edge dataset: for the cases where we need to update edge values.
3. Bulk delta iterations where the solution set is the Graph: this will cover more complex cases, where the algorithm updates both vertices and edges or even adds/removes vertices/edges, i.e. updates the whole Graph.

What do you think? I can see 1 & 2 being very easy to implement, but I suspect 3 won't be that easy (but so awesome to have ^^). Would it work the way a Graph is represented now, i.e. with 2 DataSets?

Any comment, idea, pointer would be much appreciated! Thank you ^^

Cheers, -V.

[1]: http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/Can-a-master-class-control-the-superstep-in-Flink-Spargel-td733.html
[2]: http://issues.apache.org/jira/browse/FLINK-1552?focusedCommentId=14325769&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14325769
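For readers who have not seen Boruvka's algorithm, the two alternating phases and the convergence check mentioned in this thread are visible even in a small sequential sketch. This is a textbook-style version written for illustration, with a union-find structure standing in for the component bookkeeping. It is neither the parallel variant from the paper nor Andra's implementation, and the name boruvka_mst is hypothetical.

```python
def boruvka_mst(num_vertices, edges):
    """Return the edges of a minimum spanning forest.

    edges: list of (weight, u, v) tuples with 0 <= u, v < num_vertices.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    while True:
        # Phase 1: every component selects its cheapest outgoing edge.
        # Comparing whole tuples breaks weight ties consistently.
        cheapest = {}
        for edge in edges:
            _, u, v = edge
            root_u, root_v = find(u), find(v)
            if root_u == root_v:
                continue  # edge lies inside one component
            for root in (root_u, root_v):
                if root not in cheapest or edge < cheapest[root]:
                    cheapest[root] = edge
        # Convergence check: no component has an outgoing edge left.
        if not cheapest:
            break
        # Phase 2: merge components along the selected edges.
        for weight, u, v in cheapest.values():
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_u] = root_v
                mst.append((weight, u, v))
    return mst


if __name__ == "__main__":
    sample = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
    print(boruvka_mst(4, sample))
```

Phase 1 reads edge data per component while phase 2 rewrites the component structure, which is the kind of alternation that is hard to express when only vertex values may change in each superstep.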
[jira] [Created] (FLINK-1597) VertexCentricIterations create inefficient execution plans
Martin Kiefer created FLINK-1597:

Summary: VertexCentricIterations create inefficient execution plans
Key: FLINK-1597
URL: https://issues.apache.org/jira/browse/FLINK-1597
Project: Flink
Issue Type: Bug
Components: Gelly
Affects Versions: master
Reporter: Martin Kiefer

I did experiments with optimized versions of a graph algorithm that should utilize a secondary sort on the edges and a trade-off between the number of supersteps and I/O. To my surprise, the optimizations barely affected the execution times. I narrowed it down to inefficient execution plans.

I assumed that edge sets would be partitioned once at the beginning of a VertexCentricIteration and never be touched again, because they cannot change during the iteration. I think this should be the desired behavior. What actually happens is that the UDFs creating the edge set are pulled inside the iteration and are executed every superstep. This harms the performance of graph algorithms significantly.

As a simple example, have a look at the execution plan generated for the PageRankExample: https://gist.github.com/martinkiefer/28a63f953477e3987b5d

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
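The cost difference reported in this issue can be pictured with a toy loop: building the edge set once before the iteration versus rebuilding it in every superstep. The sketch below is plain Python and does not model Flink's optimizer or Gelly. The helper group_edges_by_source stands in for the UDFs that create the edge set, and all names are hypothetical.

```python
def group_edges_by_source(edges):
    """Stand-in for the UDFs that build the edge set (hypothetical helper)."""
    grouped = {}
    for src, dst in edges:
        grouped.setdefault(src, []).append(dst)
    return grouped


def propagate(values, edges, supersteps, hoist_edges):
    """Run rounds of 'send my value to my out-neighbours and sum what arrives'.

    Returns (final values, number of times the edge set was built).
    """
    builds = 0
    grouped = None
    if hoist_edges:
        grouped = group_edges_by_source(edges)  # built once, loop-invariant
        builds += 1
    for _ in range(supersteps):
        if not hoist_edges:
            grouped = group_edges_by_source(edges)  # rebuilt in every superstep
            builds += 1
        received = {v: 0 for v in values}
        for src, targets in grouped.items():
            for dst in targets:
                received[dst] += values[src]
        values = received
    return values, builds


if __name__ == "__main__":
    start = {1: 1, 2: 1, 3: 1}
    ring = [(1, 2), (2, 3), (3, 1)]
    print(propagate(start, ring, 5, hoist_edges=True))
    print(propagate(start, ring, 5, hoist_edges=False))
```

Both variants compute the same values; only the number of times the edge set is built differs, which is the overhead the issue attributes to the generated plans.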
Flink GZip support
Hi All, I’m currently working with Flink 0.8.0 and I would like to know if there is or will be any support for handling gzipped files. Thanks!