[jira] [Created] (FLINK-1599) TypeComperator

2015-02-23 Thread Max Michels (JIRA)
Max Michels created FLINK-1599: Summary: TypeComperator Key: FLINK-1599 URL: https://issues.apache.org/jira/browse/FLINK-1599 Project: Flink Issue Type: Bug Components: Distributed

[jira] [Created] (FLINK-1598) Give better error messages when serializers run out of space.

2015-02-23 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-1598: Summary: Give better error messages when serializers run out of space. Key: FLINK-1598 URL: https://issues.apache.org/jira/browse/FLINK-1598 Project: Flink

Re: [DISCUSS] Iterative streaming example

2015-02-23 Thread Paris Carbone
Hello Peter, Streaming machine learning algorithms make use of iterations quite widely. One simple example is implementing distributed stream learners. There, in many cases you need some central model aggregator, distributed estimators to offload the central node and of course feedback loops

Re: [DISCUSS] Gelly iteration abstractions

2015-02-23 Thread Stephan Ewen
For loops are basically rolled out - they yield long execution plans. On Mon, Feb 23, 2015 at 2:44 PM, Vasiliki Kalavri vasilikikala...@gmail.com wrote: for-loop iterations could cover some cases, I guess, when the number of iterations is known beforehand. Are there currently any
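
A toy illustration of the point above about for loops being rolled out: this is not Flink code, only a sketch under the assumption that a plan can be modelled as a flat list of operator names. The function names `unrolled_plan` and `closed_loop_plan` and the operator labels are hypothetical. It shows why a driver-side for loop yields a plan that grows with the iteration count, while a closed-loop iteration keeps the step function in the plan once.

```python
def unrolled_plan(step_ops, iterations):
    """Model a driver-side for loop: every pass appends fresh copies
    of the step operators, so the plan grows with the iteration count."""
    plan = ["source"]
    for i in range(iterations):
        plan.extend(f"{op}#{i}" for op in step_ops)
    plan.append("sink")
    return plan


def closed_loop_plan(step_ops, iterations):
    """Model a closed-loop iteration: the step function appears once,
    wrapped in a single iterate node that carries the iteration count."""
    return ["source", ("iterate", iterations, list(step_ops)), "sink"]


step = ["join", "reduce"]  # hypothetical operators making up one step
print(len(unrolled_plan(step, 100)))     # 202 plan nodes
print(len(closed_loop_plan(step, 100)))  # 3 plan nodes
```

In this model the unrolled plan has 2 * 100 + 2 nodes, whereas the closed-loop plan stays at 3 regardless of the iteration count.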

Re: [DISCUSS] Gelly iteration abstractions

2015-02-23 Thread Vasiliki Kalavri
Hi Stephan, yes, this would work for the cases where an algorithm only updates the vertex values or only updates the edge values. What we would like to also support is (a) algorithms where both vertices and edges are updated in one iteration (b) algorithms where the graph structure changes from

Re: [DISCUSS] Iterative streaming example

2015-02-23 Thread Szabó Péter
Nice. Thank you guys! @Paris Are there any Flink implementations of this model? The GitHub doc is quite general. Peter 2015-02-23 14:05 GMT+01:00 Paris Carbone par...@kth.se: Hello Peter, Streaming machine learning algorithms make use of iterations quite widely. One simple example is

[jira] [Created] (FLINK-1601) Sometimes the YARNSessionFIFOITCase fails on Travis

2015-02-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-1601: Summary: Sometimes the YARNSessionFIFOITCase fails on Travis Key: FLINK-1601 URL: https://issues.apache.org/jira/browse/FLINK-1601 Project: Flink Issue

Re: [DISCUSS] Gelly iteration abstractions

2015-02-23 Thread Stephan Ewen
Closed-loop iterations are much more efficient right now. Long for loops suffer from memory fragmentation (an issue that is on the list to fix). Also, closed loops can be stateful (delta iterations) and do not require task re-deployment. On Mon, Feb 23, 2015 at 4:15 PM, Vasiliki Kalavri

Re: [DISCUSS] Gelly iteration abstractions

2015-02-23 Thread Vasiliki Kalavri
I see, thanks a lot for the answers! To rephrase my original question, would it make sense to define a closed-loop iteration where the state is the whole graph? If you want to take a look at the current implementation of DMST using delta iteration, Andra has made a PR [1]. At a high level, this

Could not build up connection to JobManager

2015-02-23 Thread Dulaj Viduranga
I’m getting “Could not build up connection to JobManager.” when I tried to run the WordCount example. Can anyone help? Dulaj

Re: Could not build up connection to JobManager

2015-02-23 Thread Robert Metzger
Hi, you said in the other email thread that the error only occurs for WordCount, not for KMeans. Can you send me the commands for both examples? I cannot really believe that there is a difference between the two jobs. Can you also send us the contents of the jobmanager log file? Best, Robert

Re: Stale Synchronous Parallel iterations in Flink

2015-02-23 Thread Stephan Ewen
Hey Tran Nam-Luc! Great post with some really cool thoughts. I just posted this answer to your LinkedIn post. Greetings, Stephan. Nice post, very cool idea! Your understanding of Flink in that respect is really good. I had not heard of SSP before,

Re: [DISCUSS] Gelly iteration abstractions

2015-02-23 Thread Stephan Ewen
As a workaround, it should always work to get the Edge and Vertex data set from the graph and use the regular Flink iteration operators? On Sun, Feb 22, 2015 at 4:53 PM, Vasiliki Kalavri vasilikikala...@gmail.com wrote: Hi, yes, I was referring to the parallel Boruvka algorithm. There are

[DISCUSS] Discourage using the same class names even though in different packages

2015-02-23 Thread Henry Saputra
Hi All, I am seeing some identical class names, albeit in different packages, that could confuse new contributors. One of the attractions of Spark is that its code structure is simpler to follow than Hadoop's (or Hive's, for that matter). For example, we have IntermediateResultPartition in

[jira] [Created] (FLINK-1603) Update how to contribute doc to include information to send Github PR instead of attaching patch

2015-02-23 Thread Henry Saputra (JIRA)
Henry Saputra created FLINK-1603: Summary: Update how to contribute doc to include information to send Github PR instead of attaching patch Key: FLINK-1603 URL: https://issues.apache.org/jira/browse/FLINK-1603