Re: Flink GZip support

2015-02-22 Thread Robert Metzger
Hi Karim,

also have a look at this old discussion from the user@ list:
http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/read-gz-files-td760.html



On Sun, Feb 22, 2015 at 10:33 AM, Felix Neutatz neut...@googlemail.com
wrote:

 Hi Karim,

 you can use a Hadoop Input Format and read the files
 using flink-hadoop-compatibility classes like here:
 http://flink.apache.org/docs/0.7-incubating/hadoop_compatibility.html

 Have a nice Sunday,

 Felix

 2015-02-22 10:02 GMT+01:00 Karim Alaa karim.hame...@gmail.com:

  Hi All,
 
  I’m currently working with Flink 0.8.0 and I would like to know whether
  there is, or will be, any support for handling gzipped files.
 
  Thanks!
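
For context on what reading a gzipped text file involves, here is a minimal
sketch that uses only the JDK's java.util.zip classes and no Flink or Hadoop
API. The names GzipLines, readLines and gzip are hypothetical helpers
introduced for illustration. A Hadoop input format, as suggested above, would
perform the equivalent decompression on the reader's behalf. Note that a gzip
stream cannot be split, so each file ends up being read by a single reader.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not a Flink or Hadoop class.
final class GzipLines {

    // Decompress a gzip stream and return its text content line by line.
    static List<String> readLines(InputStream compressed) {
        List<String> lines = new ArrayList<String>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Compress a string in memory so the example needs no input file.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = gzip("first line\nsecond line\n");
        for (String line : readLines(new ByteArrayInputStream(data))) {
            System.out.println(line);
        }
    }
}
```

To read a real file, the same readLines call could be given a
java.io.FileInputStream opened on the .gz file.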



Re: Deprecated error building flink

2015-02-22 Thread Dulaj Viduranga
Hi,
But I’m using Oracle Java 8 (javac 1.8.0_05).


Re: Deprecated error building flink

2015-02-22 Thread Robert Metzger
It seems you have to set the JAVA_HOME variable properly:
http://stackoverflow.com/questions/18813828/why-maven-use-jdk-1-6-but-my-java-version-is-1-7
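
The mismatch discussed in this thread (javac 1.8 installed, Maven running on
Java 1.6) can be made visible from inside whichever JVM a tool actually uses.
The following is a minimal sketch: JvmCheck and majorVersion are hypothetical
names, and the only real APIs involved are the standard java.version and
java.home system properties. The parser assumes version strings of the legacy
"1.x" form or of the newer "11.0.2" form.

```java
// Hypothetical helper for illustration; not part of Flink or Maven.
final class JvmCheck {

    // Extract the major version from a java.version string,
    // e.g. "1.6.0_65" gives 6, "1.8.0_05" gives 8, "11.0.2" gives 11.
    static int majorVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            // Legacy scheme: the major version is the second component.
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("java.version = " + version);
        if (majorVersion(version) < 7) {
            System.out.println("This JVM is Java 6 or older; check JAVA_HOME.");
        }
    }
}
```

On OS X, one common way to point JAVA_HOME at an installed Java 8 is
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8), after which mvn -version
should report the new Java version.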

On Sun, Feb 22, 2015 at 2:11 PM, Dulaj Viduranga vidura...@icloud.com
wrote:

 Oh yes. :) It runs on 1.6. How could I fix that?

  On Feb 22, 2015, at 6:37 PM, Robert Metzger rmetz...@apache.org wrote:
 
  Can you run mvn -version to verify that?
  Maybe Maven is using a different Java version?
 

Re: Deprecated error building flink

2015-02-22 Thread Robert Metzger
Hi Dulaj,

you are using an unsupported compiler to compile Flink. Flink can only be
compiled with OpenJDK 6 or any JDK newer than 6, because the Oracle JDK 6
compiler contains a bug.

You can run Flink on any JRE from version 6 onwards (including Oracle JDK 6).

I would recommend upgrading to Java 7 anyway, because Java 6 no longer
receives security updates.


On Sun, Feb 22, 2015 at 12:55 PM, Dulaj Viduranga vidura...@icloud.com
wrote:

 Hi all,
 I’m new here. I had some problems building Flink on my Mac. Could someone
 please take a look and help me out?

 Dulaj Viduranga.

 [INFO] Scanning for projects...
 [WARNING]
 [WARNING] Some problems were encountered while building the effective
 model for org.apache.flink:flink-streaming-examples:jar:0.9-SNAPSHOT
 [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but
 found duplicate declaration of plugin
 org.apache.maven.plugins:maven-jar-plugin @
 org.apache.flink:flink-streaming-examples:[unknown-version],
 /Users/Vidura/Documents/Development/flink/flink-staging/flink-streaming/flink-streaming-examples/pom.xml,
 line 462, column 12
 [WARNING]
 [WARNING] It is highly recommended to fix these problems because they
 threaten the stability of your build.
 [WARNING]
 [WARNING] For this reason, future Maven versions might no longer support
 building such malformed projects.
 [WARNING]
 [INFO]
 
 [INFO] Reactor Build Order:
 [INFO]
 [INFO] flink
 [INFO] flink-shaded
 [INFO] flink-core
 [INFO] flink-java
 [INFO] flink-runtime
 [INFO] flink-compiler
 [INFO] flink-clients
 [INFO] flink-test-utils
 [INFO] flink-scala
 [INFO] flink-examples
 [INFO] flink-java-examples
 [INFO] flink-scala-examples
 [INFO] flink-staging
 [INFO] flink-streaming
 [INFO] flink-streaming-core
 [INFO] flink-tests
 [INFO] flink-avro
 [INFO] flink-jdbc
 [INFO] flink-spargel
 [INFO] flink-hadoop-compatibility
 [INFO] flink-streaming-scala
 [INFO] flink-streaming-connectors
 [INFO] flink-streaming-examples
 [INFO] flink-hbase
 [INFO] flink-gelly
 [INFO] flink-hcatalog
 [INFO] flink-tachyon
 [INFO] flink-quickstart
 [INFO] flink-quickstart-java
 [INFO] flink-quickstart-scala
 [INFO] flink-contrib
 [INFO] flink-yarn
 [INFO] flink-dist
 [INFO] flink-yarn-tests
 [INFO]
 [INFO]
 
 [INFO] Building flink 0.9-SNAPSHOT
 [INFO]
 
 Downloading:
 https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.pom
 Downloaded:
 https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.pom
 (10 KB at 1.8 KB/sec)
 Downloading:
 https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-project/0.10/apache-rat-project-0.10.pom
 Downloaded:
 https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-project/0.10/apache-rat-project-0.10.pom
 (18 KB at 13.9 KB/sec)
 Downloading:
 https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.jar
 Downloaded:
 https://repo.maven.apache.org/maven2/org/apache/rat/apache-rat-plugin/0.10/apache-rat-plugin-0.10.jar
 (42 KB at 31.8 KB/sec)
 Downloading:
 https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.pom
 Downloaded:
 https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.pom
 (8 KB at 8.0 KB/sec)
 Downloading:
 https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.jar
 Downloaded:
 https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.5.1/maven-install-plugin-2.5.1.jar
 (33 KB at 27.4 KB/sec)
 [INFO]
 [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ flink-parent ---
 [INFO] Deleting /Users/Vidura/Documents/Development/flink/target
 [INFO]
 [INFO] --- maven-checkstyle-plugin:2.12.1:check (validate) @ flink-parent
 ---
 [INFO]
 [INFO]
 [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-maven) @
 flink-parent ---
 [INFO]
 [INFO] --- maven-remote-resources-plugin:1.5:process (default) @
 flink-parent ---
 [INFO]
 [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @
 flink-parent ---
 [INFO]
 [INFO] --- maven-shade-plugin:2.3:shade (default) @ flink-parent ---
 [INFO] Excluding org.apache.commons:commons-lang3:jar:3.3.2 from the
 shaded jar.
 [INFO] Excluding org.slf4j:slf4j-api:jar:1.7.7 from the shaded jar.
 [INFO] Excluding org.slf4j:slf4j-log4j12:jar:1.7.7 from the shaded jar.
 [INFO] Excluding log4j:log4j:jar:1.2.17 from the shaded jar.
 [INFO] Replacing original artifact with shaded artifact.
 [INFO]
 [INFO] --- maven-failsafe-plugin:2.17:integration-test (default) @
 flink-parent ---
 [INFO] Tests are skipped.
 [INFO]
 [INFO] --- apache-rat-plugin:0.10:check (default) @ 

Re: Deprecated error building flink

2015-02-22 Thread Dulaj Viduranga
Oh yes. :) It runs on 1.6. How could I fix that?

 On Feb 22, 2015, at 6:37 PM, Robert Metzger rmetz...@apache.org wrote:
 
 Can you run mvn -version to verify that?
 Maybe Maven is using a different Java version?
 

Re: [DISCUSS] Gelly iteration abstractions

2015-02-22 Thread Vasiliki Kalavri
Hi,

yes, I was referring to the parallel Boruvka algorithm. There are several
ways to implement it in Flink, and I believe that the one described in the
paper (vertex-centric) is not the most elegant :)

Andra is now working on an idea that uses the delta iteration abstraction,
and we believe that it will be both more efficient and easier to understand.
It keeps the edges in the solution set and the vertices in the workset, so it
follows the pattern I describe in (2) in my previous e-mail. As a next step,
we would like to see how an iteration operator that could update the whole
graph, which I describe as (3), would make this even nicer.

Any ideas are highly welcome!

Cheers,
V.
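
For readers unfamiliar with the algorithm, here is a sequential sketch of
Boruvka's MST in plain Java. It is not Andra's implementation and uses no
Flink or Gelly API; the class name, the edge representation and the
union-find bookkeeping are assumptions made for illustration. It is meant
only to show why the algorithm strains a vertex-centric model: each round
alternates between a phase that selects edges per component and a phase that
rewrites vertex state (component membership), and the loop ends on a global
convergence check.

```java
import java.util.Arrays;

// Hypothetical sequential sketch; not the Gelly or course implementation.
final class BoruvkaSketch {

    // Union-find lookup with path halving.
    static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];
            v = parent[v];
        }
        return v;
    }

    // Compare by weight, then by index, so that ties are broken consistently.
    static boolean better(int[][] edges, int i, int j) {
        return edges[i][2] < edges[j][2]
                || (edges[i][2] == edges[j][2] && i < j);
    }

    // edges[i] = {source, target, weight}. Returns the total weight of a
    // minimum spanning forest of the graph with vertices 0..n-1.
    static long mstWeight(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int v = 0; v < n; v++) {
            parent[v] = v;
        }
        long total = 0;
        boolean merged = true;
        while (merged) {                      // one round per loop pass
            merged = false;
            // Phase 1: every component picks its cheapest outgoing edge.
            int[] cheapest = new int[n];
            Arrays.fill(cheapest, -1);
            for (int i = 0; i < edges.length; i++) {
                int a = find(parent, edges[i][0]);
                int b = find(parent, edges[i][1]);
                if (a == b) {
                    continue;                 // edge lies inside one component
                }
                if (cheapest[a] == -1 || better(edges, i, cheapest[a])) {
                    cheapest[a] = i;
                }
                if (cheapest[b] == -1 || better(edges, i, cheapest[b])) {
                    cheapest[b] = i;
                }
            }
            // Phase 2: merge components along the selected edges.
            for (int c = 0; c < n; c++) {
                if (cheapest[c] == -1) {
                    continue;
                }
                int[] e = edges[cheapest[c]];
                int a = find(parent, e[0]);
                int b = find(parent, e[1]);
                if (a != b) {
                    parent[a] = b;
                    total += e[2];
                    merged = true;            // convergence check: any merge?
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 1}, {1, 2, 2}, {2, 3, 3}, {0, 3, 4}, {0, 2, 5}};
        System.out.println(mstWeight(4, edges));
    }
}
```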

On 22 February 2015 at 16:32, Andra Lungu lungu.an...@gmail.com wrote:

 Hi Alex,

 Vasia is talking about the second version (presented Friday) of Parallel
 Boruvka, which can be found here:
 https://github.com/TU-Berlin-DIMA/IMPRO-3.WS14/pull/59

 I will propose the third, non-Pregel-like approach directly to Gelly soon.

 If you have additional questions, I will be happy to answer them.

 Andra


Re: [DISCUSS] Gelly iteration abstractions

2015-02-22 Thread Alexander Alexandrov
Hi Vasia,

I am trying to look at the problem in more detail. Which version of the MST
are you talking about?

Right now in the Gelly repository I can only find the SSSP example
(parallel Bellman-Ford) from Section 4.2 in [1].

However, it seems that the issues encountered by Andra are related to the
implementation of Parallel Boruvka (Section 3.2 in [2]). Is that correct?

Regards,
A.

[1] http://www.vldb.org/pvldb/vol7/p1047-han.pdf
[2] http://www.vldb.org/pvldb/vol7/p577-salihoglu.pdf

2015-02-19 21:03 GMT+01:00 Vasiliki Kalavri vasilikikala...@gmail.com:

 Hello beautiful Flink people,

 during the past few days, Andra and I have been discussing how to
 extend Gelly's iteration methods.

 Alexander's course (and his awesome students) has made it obvious that
 vertex-centric iterations are not the best fit for algorithms which don't
 follow the common propagate-update pattern. For example, Andra is working
 on an implementation of Minimum Spanning Tree, which requires branching
 inside an iteration and also requires a convergence check of an internal
 iteration. Others also reported similar issues [1, 2]. Trying to fit such
 algorithms to the vertex-centric model leads to long and ugly code, e.g.
 aggregators to keep track of algorithm phases, duplicating data, etc.

 One limitation of the vertex-centric and the upcoming GAS model is that
 they both only allow the vertex values to be updated in each iteration.
 However, for some algorithms we need to update the edge values and in
 others we need to update both. In even more complex situations (like
 Andra's MST) in some iterations we need to update the vertex values and in
 some iterations we need to update the edge values.
 Another problem is that we currently don't have a way to allow different
 computational phases inside an iteration. This is something that Giraph
 solves with master compute, a function that is executed once before each
 superstep and sets the computation function.

 All that said, I believe that we can solve most of these issues if we
 nicely expose Flink's iteration operators in Gelly. I can see the following
 cases:

 1. Bulk & delta iterations where the solution set is the vertex dataset:
 this will be similar to vertex-centric and GAS, but will allow more
 flexible dataflows inside the iteration.
 2. Bulk & delta iterations where the solution set is the edge dataset: for
 the cases where we need to update edge values.
 3. Bulk & delta iterations where the solution set is the Graph: this will
 cover more complex cases, where the algorithm updates both vertices and
 edges or even adds/removes vertices/edges, i.e. updates the whole Graph.
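
 To make the solution set / workset terminology concrete, here is a plain-Java
 simulation of a delta iteration for case 1 (solution set keyed by vertex). It
 uses no Flink or Gelly classes: DeltaIterationSketch and its method are
 hypothetical, and connected components by minimum-label propagation was
 chosen only as a familiar example. Each superstep processes only the vertices
 in the workset, merges the changed values into the solution set, and stops
 when the workset is empty.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation for illustration; not the Flink iteration operator.
final class DeltaIterationSketch {

    // edges[i] = {u, v}, undirected. Returns vertex -> component label,
    // where the label is the smallest vertex ID in the component.
    static Map<Integer, Integer> connectedComponents(int n, int[][] edges) {
        Map<Integer, Set<Integer>> neighbors = new HashMap<Integer, Set<Integer>>();
        for (int v = 0; v < n; v++) {
            neighbors.put(v, new HashSet<Integer>());
        }
        for (int[] e : edges) {
            neighbors.get(e[0]).add(e[1]);
            neighbors.get(e[1]).add(e[0]);
        }

        // Solution set: the current label of every vertex.
        Map<Integer, Integer> solution = new HashMap<Integer, Integer>();
        for (int v = 0; v < n; v++) {
            solution.put(v, v);
        }

        // Workset: vertices whose label changed in the last superstep.
        Set<Integer> workset = new HashSet<Integer>(solution.keySet());

        while (!workset.isEmpty()) {
            // Compute the delta: smaller labels offered by workset vertices.
            Map<Integer, Integer> delta = new HashMap<Integer, Integer>();
            for (int v : workset) {
                int label = solution.get(v);
                for (int u : neighbors.get(v)) {
                    if (label < solution.get(u)) {
                        delta.merge(u, label, Integer::min);
                    }
                }
            }
            // Apply the delta; the changed vertices form the next workset.
            solution.putAll(delta);
            workset = delta.keySet();
        }
        return solution;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {3, 4}};
        System.out.println(connectedComponents(5, edges));
    }
}
```

 In cases 2 and 3 the solution set would instead be keyed by edge, or would
 hold both vertices and edges, while the loop structure stays the same.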

 What do you think? I can see 1 & 2 being very easy to implement, but I
 suspect 3 won't be that easy (but so awesome to have ^^).
 Would it work the way a Graph is represented now, i.e. with 2 DataSets?

 Any comment, idea, pointer would be much appreciated! Thank you ^^

 Cheers,
 -V.

 [1]:
 http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/Can-a-master-class-control-the-superstep-in-Flink-Spargel-td733.html
 [2]:
 http://issues.apache.org/jira/browse/FLINK-1552?focusedCommentId=14325769&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14325769



[jira] [Created] (FLINK-1597) VertexCentricIterations create inefficient execution plans

2015-02-22 Thread Martin Kiefer (JIRA)
Martin Kiefer created FLINK-1597:


             Summary: VertexCentricIterations create inefficient execution plans
                 Key: FLINK-1597
                 URL: https://issues.apache.org/jira/browse/FLINK-1597
             Project: Flink
          Issue Type: Bug
          Components: Gelly
    Affects Versions: master
            Reporter: Martin Kiefer


I ran experiments with optimized versions of a graph algorithm that should
utilize a secondary sort on the edges and a trade-off between the number of
supersteps and I/O. To my surprise, the optimizations barely affected the
execution times. I narrowed it down to inefficient execution plans.

I assumed that edge sets would be partitioned once at the beginning of a
VertexCentricIteration and never be touched again, because they cannot change
during the iteration. I think this should be the desired behavior. What
actually happens is that the UDFs creating the edge set are pulled inside the
iteration and are executed in every superstep. This harms the performance of
graph algorithms significantly.
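
The difference between the expected and the observed behavior can be
illustrated without Flink. In this hypothetical sketch, buildEdgeIndex stands
in for the UDFs that create and partition the edge set, and a counter records
how often it runs when it is evaluated inside the loop versus hoisted out of
it. All names and the toy computation are assumptions for illustration; this
is neither the Gelly implementation nor the plan shown in the gist.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of loop-invariant work; not Gelly code.
final class InvariantEdgesSketch {

    static int buildCalls = 0;   // counts invocations of the expensive step

    // Stand-in for the edge-creating UDFs: group edges by source vertex.
    static Map<Integer, List<Integer>> buildEdgeIndex(int[][] edges) {
        buildCalls++;
        Map<Integer, List<Integer>> index = new HashMap<Integer, List<Integer>>();
        for (int[] e : edges) {
            index.computeIfAbsent(e[0], k -> new ArrayList<Integer>()).add(e[1]);
        }
        return index;
    }

    // Placeholder for per-superstep work that only reads the edges.
    static int superstep(Map<Integer, List<Integer>> index) {
        int messages = 0;
        for (List<Integer> targets : index.values()) {
            messages += targets.size();
        }
        return messages;
    }

    // Observed behavior: the edge index is rebuilt in every superstep.
    static int runRebuilding(int[][] edges, int supersteps) {
        buildCalls = 0;
        for (int s = 0; s < supersteps; s++) {
            superstep(buildEdgeIndex(edges));
        }
        return buildCalls;
    }

    // Expected behavior: the edge index is built once and then reused.
    static int runHoisted(int[][] edges, int supersteps) {
        buildCalls = 0;
        Map<Integer, List<Integer>> index = buildEdgeIndex(edges);
        for (int s = 0; s < supersteps; s++) {
            superstep(index);
        }
        return buildCalls;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}};
        System.out.println("rebuilt: " + runRebuilding(edges, 10));
        System.out.println("hoisted: " + runHoisted(edges, 10));
    }
}
```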

As a simple example, have a look at the execution plan generated for the
PageRankExample:
https://gist.github.com/martinkiefer/28a63f953477e3987b5d





Flink GZip support

2015-02-22 Thread Karim Alaa
Hi All,

I’m currently working with Flink 0.8.0 and I would like to know whether there
is, or will be, any support for handling gzipped files.

Thanks!