Re: Thoughts About Streaming

2015-06-25 Thread Paris Carbone
+1 for writing this down > On 25 Jun 2015, at 18:11, Aljoscha Krettek wrote: > > +1 go ahead > > On Thu, 25 Jun 2015 at 18:02 Stephan Ewen wrote: > >> Hey! >> >> This thread covers many different topics. Let's break this up into separate >> discussions. >> >> - Operator State is already driv

Re: Thoughts About Streaming

2015-06-25 Thread Aljoscha Krettek
+1 go ahead On Thu, 25 Jun 2015 at 18:02 Stephan Ewen wrote: > Hey! > > This thread covers many different topics. Let's break this up into separate > discussions. > > - Operator State is already driven by Gyula and Paris and happening on the > above mentioned pull request and the followup discus

Re: Thoughts About Streaming

2015-06-25 Thread Stephan Ewen
Hey! This thread covers many different topics. Let's break this up into separate discussions. - Operator State is already driven by Gyula and Paris and happening on the above mentioned pull request and the followup discussions. - For windowing, this discussion has brought some results that we s

Re: Thoughts About Streaming

2015-06-25 Thread Matthias J. Sax
Sure. I picked this up. Using the current model for "occurrence time semantics" does not work. I elaborated on this in the past many times (but nobody cared). It is important to make it clear to the user what semantics are supported. Claiming to support "sliding windows" doesn't mean anything; the

Re: Thoughts About Streaming

2015-06-25 Thread Aljoscha Krettek
Yes, I am aware of this requirement and it would also be supported in my proposed model. The problem is that the "custom timestamp" feature gives the impression that the elements would be windowed according to a user-timestamp. The results, however, are wrong because of the assumption about eleme

Re: [VOTE] Release additional convenience binaries for Flink 0.9.0

2015-06-25 Thread Ufuk Celebi
+1 - Checked hashes and signatures - Tested local mode for all versions - Tested Flink on YARN 2.4 with https://github.com/aljoscha/FliRTT and built-in data for all versions – Ufuk

Re: Thoughts About Streaming

2015-06-25 Thread Matthias J. Sax
Hi Aljoscha, I like that you are pushing in this direction. However, IMHO you misinterpret the current approach. It does not assume that tuples arrive in-order; the current approach has no notion of a predefined order (for example, the order in which the events are created). There is only the

Re: Thoughts About Streaming

2015-06-25 Thread Aljoscha Krettek
Yes, now this also processes about 3 million elements (Window Size 5 sec, Slide 1 sec) but it still fluctuates a lot between 1 million and 5 million. Performance is not my main concern, however. My concern is that the current model assumes elements to arrive in order, which is simply not true. In your code
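
The ordering concern above can be illustrated with a minimal sketch. It is plain Python, not Flink code and not the proposed model from this thread; the function names and sample timestamps are hypothetical. It assumes integer timestamps in seconds and windows that start at multiples of the slide. Each element is assigned to windows by its own timestamp, so the per-window counts do not depend on arrival order.

```python
from collections import defaultdict

def window_starts(timestamp, size, slide):
    """Start times of all sliding windows [start, start + size) that contain timestamp.

    Assumption for this sketch: windows start at multiples of `slide`.
    """
    last_start = timestamp - (timestamp % slide)
    return list(range(last_start, timestamp - size, -slide))

def count_per_window(timestamps, size, slide):
    """Count elements per window, keyed by window start, using element timestamps only."""
    counts = defaultdict(int)
    for ts in timestamps:
        for start in window_starts(ts, size, slide):
            counts[start] += 1
    return dict(counts)

# Hypothetical element timestamps: the same elements in two arrival orders.
in_order = [1, 2, 3, 6, 7]
out_of_order = [3, 1, 7, 2, 6]

same_result = count_per_window(in_order, 5, 1) == count_per_window(out_of_order, 5, 1)
```

With size 5 and slide 1, every element falls into five overlapping windows. A scheme that assigns elements by arrival position instead would give different results for the two orders.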

Re: Thoughts About Streaming

2015-06-25 Thread Gábor Gévay
I'm very sorry, I had a bug in the InversePreReducer. It should be fixed now. Can you please run it again? I also tried to reproduce some of your performance numbers, but I'm getting less than 1/10th of yours. For example, in the Tumbling case, Current/Reduce produces only ~10 for me. Do

[VOTE] Release additional convenience binaries for Flink 0.9.0

2015-06-25 Thread Robert Metzger
Hi, as per discussion from this list, I propose to release the files in: http://people.apache.org/~rmetzger/flink-0.9.0-hadoop-binaries/ as additional convenience binaries for the Flink 0.9.0 release. The binaries have been built from the "release-0.9.0-rc4

Re: Running Flink Job with dependencies

2015-06-25 Thread santosh_rajaguru
Yes. I have configured a product file and specified the required plugins. It generates the jar files and I deploy them into the lib folder. I then deployed the main class jar into the webclient and ran the job. -- View this message in context: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Ru

Re: Thoughts About Streaming

2015-06-25 Thread Aljoscha Krettek
Hi, I also ran the tests on top of PR 856 (inverse reducer) now. The results seem incorrect. When I insert a Thread.sleep(1) in the tuple source, all the previous tests reported around 3600 tuples (Size 5 sec, Slide 1 sec) (Theoretically there would be 5000 tuples in 5 seconds but this is due to ov

Re: Running Flink Job with dependencies

2015-06-25 Thread Flavio Pompermaier
So you've put the plugin jars in the lib dir of each task manager? It could work but I'll try to package the plugins as a jar and transitively collect the required dependency. However I remember that it's quite tricky to find the correct way to handle them as normal jars :) Maybe some good Flink

Re: Running Flink Job with dependencies

2015-06-25 Thread santosh_rajaguru
Thanks Flavio. Previously I packaged all the jars into a fat jar. Despite specifying the main class, it was unable to load some of the classes. Now it's running, as I have created plugin jars, deployed those to the Flink library, and just created a simple jar for the main class plugin. Now, it's loo

[jira] [Created] (FLINK-2276) Travis build error

2015-06-25 Thread Sachin Goel (JIRA)
Sachin Goel created FLINK-2276: -- Summary: Travis build error Key: FLINK-2276 URL: https://issues.apache.org/jira/browse/FLINK-2276 Project: Flink Issue Type: Bug Reporter: Sachin Goe

Re: Thoughts About Streaming

2015-06-25 Thread Gábor Gévay
Hello, Aljoscha, can you please try the performance test of Current/Reduce with the InversePreReducer in PR 856? (If you just call sum, it will use an InversePreReducer.) It would be an interesting test, because the inverse function optimization really depends on the stream being ordered, and I th
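
To illustrate why an inverse-function optimization depends on ordering, here is a minimal sketch of a time-based sliding sum. It is plain Python, not the InversePreReducer from PR 856; the function name, the `(timestamp, value)` element shape, and the sample data are assumptions for illustration. The running total is updated by adding each new value and subtracting values that leave the window, instead of re-summing the whole window.

```python
from collections import deque

def sliding_sums(elements, size):
    """Running sum over the window (ts - size, ts] for each arriving (ts, value).

    Assumption: timestamps arrive in non-decreasing order, so expired
    elements are always at the front of the queue.
    """
    window = deque()
    total = 0
    results = []
    for ts, value in elements:
        window.append((ts, value))
        total += value                      # apply the reduce function (sum)
        while window[0][0] <= ts - size:    # evict elements that left the window
            _, old_value = window.popleft()
            total -= old_value              # apply the inverse function
        results.append((ts, total))
    return results

# Hypothetical in-order input, window size 5.
sums = sliding_sums([(1, 10), (2, 20), (3, 30), (6, 40), (7, 50)], 5)
```

Eviction only inspects the front of the queue. If an element with an old timestamp arrived late, it would sit behind newer elements and stay in the total after it should have expired, which is the ordering dependency mentioned above.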

Re: Thoughts About Streaming

2015-06-25 Thread Ufuk Celebi
Thanks for writing this up and comparing to the current implementation. It's great to see that your mockup indicates correct/expected behaviour *and* better performance. :-) Regarding the results for the current mechanism: does this problem affect all window operators? – Ufuk On 25 Jun 2015,

Re: Thoughts About Streaming

2015-06-25 Thread Aljoscha Krettek
I think I'll have to elaborate a bit so I created a proof-of-concept implementation of my ideas and ran some throughput measurements to alleviate concerns about performance. First, though, I want to highlight again why the current approach does not work with out-of-order elements (which, again, oc

Re: Provide Hadoop pre-build Hadoop 2.4 and Hadoop 2.6 binaries

2015-06-25 Thread Robert Metzger
I'll create the binaries and start the vote On Thu, Jun 25, 2015 at 10:33 AM, Ufuk Celebi wrote: > @Robert, Ma: can one of you start the vote today? > > Anyone who is against this can give a -1 in the vote thread. ;) > > – Ufuk > > On 25 Jun 2015, at 10:24, Maximilian Michels wrote: > > > +1 fo

Re: Possible bug?

2015-06-25 Thread Stephan Ewen
Happens to everyone once in a while ;-) On Thu, Jun 25, 2015 at 10:33 AM, Matthias J. Sax < mj...@informatik.hu-berlin.de> wrote: > Stupid me. Thanks! Of course, it cannot work. I forgot to assign ds to > itself: > > ds = ds.x().distinct(); > result = ds.collect(); > > I guess it was too late in t

Re: Possible bug?

2015-06-25 Thread Matthias J. Sax
Stupid me. Thanks! Of course, it cannot work. I forgot to assign ds to itself: ds = ds.x().distinct(); result = ds.collect(); I guess it was too late in the night ;) -Matthias On 06/25/2015 07:58 AM, Chiwan Park wrote: > Although you run > `ds.map(blahblah).sortPartition(blahblah).mapPartition(
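
The mistake described above (calling a transformation without assigning its result) can be reproduced with a small stand-in. This is plain Python, not the Flink API: the `DataSet` class below is a hypothetical, simplified model that only mimics the property that a transformation returns a new data set and leaves the original untouched.

```python
class DataSet:
    """Hypothetical immutable data set; transformations return new instances."""

    def __init__(self, items):
        self._items = tuple(items)

    def distinct(self):
        # dict.fromkeys keeps the first occurrence of each item, in order.
        return DataSet(dict.fromkeys(self._items))

    def collect(self):
        return list(self._items)


ds = DataSet([1, 1, 2, 3, 3])

ds.distinct()                # result is discarded; ds itself is unchanged
unchanged = ds.collect()     # still contains the duplicates

ds = ds.distinct()           # assign the transformed data set back to ds
deduplicated = ds.collect()  # duplicates are gone
```

The first call has no visible effect because its return value is dropped, which matches the symptom reported in this thread.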

Re: Provide Hadoop pre-build Hadoop 2.4 and Hadoop 2.6 binaries

2015-06-25 Thread Ufuk Celebi
@Robert, Ma: can one of you start the vote today? Anyone who is against this can give a -1 in the vote thread. ;) – Ufuk On 25 Jun 2015, at 10:24, Maximilian Michels wrote: > +1 for different Hadoop bundles. Other projects do it as well. > > On Wed, Jun 24, 2015 at 2:25 PM, Vyacheslav Zholude

Re: Provide Hadoop pre-build Hadoop 2.4 and Hadoop 2.6 binaries

2015-06-25 Thread Maximilian Michels
+1 for different Hadoop bundles. Other projects do it as well. On Wed, Jun 24, 2015 at 2:25 PM, Vyacheslav Zholudev < vyacheslav.zholu...@gmail.com> wrote: > We were experiencing different kinds of issues with Flink on Hadoop 2.4. > When > rebuilding Flink with Hadoop 2.4 dependencies all issues