Re: [DISCUSS] Allowed Lateness in Flink

2016-07-18 Thread Chen Qin
BTW, do you have rough timeline in term of roll out it to production? Thanks, Chen On Mon, Jul 18, 2016 at 2:46 AM, Aljoscha Krettek wrote: > Hi, > Chen commented this on the doc (I'm mirroring here so everyone can follow): > "It would be cool to be able to access last snapshot of window state

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-18 Thread Chen Qin
Aljoscha, Yes, that would works for our case! Chen On Mon, Jul 18, 2016 at 2:46 AM, Aljoscha Krettek wrote: > Hi, > Chen commented this on the doc (I'm mirroring here so everyone can follow): > "It would be cool to be able to access last snapshot of window states > before it get purged. Pipel

DataStream.partitionCustom() - define parallelism

2016-07-18 Thread vanekjar
Hi all, I've got one question considering custom partitioning in DataStream API. Is it possible to define/change parallelism when doing 'partitionCustom' transformation? As far as I discovered there is no way how to call 'setParallelism()' on the 'PartitionTransformation' because it's hidden from

RE: [DISCUSS] FLIP-2 Extending Window Function Metadata

2016-07-18 Thread Radu Tudoran
Hi, Sorry - I made a mistake - I was thinking of getting access to the collection (mist-read :) collector) of events in the window buffer in order to be able to delete/evict some of them which are not necessary the last ones. Radu -Original Message- From: Aljoscha Krettek [mailto:alj

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

2016-07-18 Thread Aljoscha Krettek
What about the collector? This is only used for emitting elements to the downstream operation. On Mon, 18 Jul 2016 at 17:52 Radu Tudoran wrote: > Hi, > > I think it looks good and most importantly is that we can extend it in the > directions discussed so far. > > One question though regarding th

RE: [DISCUSS] FLIP-2 Extending Window Function Metadata

2016-07-18 Thread Radu Tudoran
Hi, I think it looks good and most importantly is that we can extend it in the directions discussed so far. One question though regarding the Collector - are we going to be able to delete random elements from the list if this is not exposed as a collection, at least to the evictor? If not, how

[jira] [Created] (FLINK-4231) Switch DistinctOperator from GroupReduceFunction to ReduceFunction

2016-07-18 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-4231: - Summary: Switch DistinctOperator from GroupReduceFunction to ReduceFunction Key: FLINK-4231 URL: https://issues.apache.org/jira/browse/FLINK-4231 Project: Flink I

[jira] [Created] (FLINK-4230) Session Windowing IT Case

2016-07-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-4230: - Summary: Session Windowing IT Case Key: FLINK-4230 URL: https://issues.apache.org/jira/browse/FLINK-4230 Project: Flink Issue Type: Test Componen

[jira] [Created] (FLINK-4229) Do not start Metrics Reporter by default

2016-07-18 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-4229: --- Summary: Do not start Metrics Reporter by default Key: FLINK-4229 URL: https://issues.apache.org/jira/browse/FLINK-4229 Project: Flink Issue Type: Impr

[jira] [Created] (FLINK-4228) RocksDB semi-async snapshot to S3AFileSystem fails

2016-07-18 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4228: -- Summary: RocksDB semi-async snapshot to S3AFileSystem fails Key: FLINK-4228 URL: https://issues.apache.org/jira/browse/FLINK-4228 Project: Flink Issue Type: Bug

Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

2016-07-18 Thread Aljoscha Krettek
I incorporated the changes. The proposed interface of ProcessWindowFunction is now this: public abstract class ProcessWindowFunction implements Function { public abstract void process(KEY key, Iterable elements, Context ctx) throws Exception; public abstract class Context { publ

Re: [DISCUSS] FLIP-3 - Organization of Documentation

2016-07-18 Thread Till Rohrmann
+1 for the FLIP and making streaming the common case. Very good proposal :-) On Mon, Jul 18, 2016 at 11:48 AM, Aljoscha Krettek wrote: > +1 I like it a lot! > > On Fri, 15 Jul 2016 at 18:43 Stephan Ewen wrote: > > > My take would be to take streaming as the common case and make special > > sect

Re: [DISCUSS] Commit tagging

2016-07-18 Thread Till Rohrmann
Then +1 :-) On Fri, Jul 15, 2016 at 7:07 PM, Ufuk Celebi wrote: > It was intended as Till said... a list of preferred tags. > > On Fri, Jul 15, 2016 at 6:52 PM, Till Rohrmann > wrote: > > I agree with Robert that it would be a nice to have but not strictly > > required. I think it would help to

Re: Evaluating Apache Flink

2016-07-18 Thread Ovidiu-Cristian MARCU
Hi Kevin, I have orchestrated an evaluation of Spark and Flink for various batch and graph processing workloads (no streaming, no sql) (this work has been accepted as a paper at Cluster and I will publish soon a report, for more details please contact me directly). Both engines did well, stabl

Re: [DISCUSS] Enhance Window Evictor in Flink

2016-07-18 Thread Vishnu Viswanath
Hi Aljoscha, Thanks! Yes, I have the create page option now in wiki. Regards, Vishnu Viswanath, On Mon, Jul 18, 2016 at 6:34 AM, Aljoscha Krettek wrote: > @Radu, addition of more window types and sorting should be part of another > design proposal. This is interesting stuff but I think we shou

Re: [DISCUSS] Enhance Window Evictor in Flink

2016-07-18 Thread Aljoscha Krettek
@Radu, addition of more window types and sorting should be part of another design proposal. This is interesting stuff but I think we should keep issues separated because things can get complicated very quickly. On Mon, 18 Jul 2016 at 12:32 Aljoscha Krettek wrote: > Hi, > about TimeEvictor, yes,

Re: [DISCUSS] Enhance Window Evictor in Flink

2016-07-18 Thread Aljoscha Krettek
Hi, about TimeEvictor, yes, I think there should be specific evictors for processing time and event time. Also, the current time should be retrievable from the EvictorContext. For the wiki you will need permissions. This was recently changed because there was too much spam. I gave you permission t

Re: Restructuring Javadoc and Scaladoc for libraries

2016-07-18 Thread Aljoscha Krettek
Hi, it's somewhat tricky to do Doc aggregation with ScalaDoc. At least it used to be back when I initially set it up just for the Scala Batch API. Maybe it's a bit easier now. Cheers, Aljoscha On Sat, 16 Jul 2016 at 05:21 Chiwan Park wrote: > Hi Robert, > > Thanks for clarifying! I’ve filed thi

Re: In AbstractRocksDBState, why write a byte 42 between key and namespace?

2016-07-18 Thread Aljoscha Krettek
Ah I see, Stephan and I had a quick chat and it's for cases where there are 42s around the edges of the key/namespace. On Mon, 18 Jul 2016 at 11:51 Aljoscha Krettek wrote: > In which cases is it not solved? Because then we should make sure to solve > it. > > On Mon, 18 Jul 2016 at 10:33 Stephan

Re: In AbstractRocksDBState, why write a byte 42 between key and namespace?

2016-07-18 Thread Aljoscha Krettek
In which cases is it not solved? Because then we should make sure to solve it. On Mon, 18 Jul 2016 at 10:33 Stephan Ewen wrote: > Got it. But the ambiguity is not really solved by that, just lessened. > > On Sun, Jul 17, 2016 at 2:10 PM, Aljoscha Krettek > wrote: > > > @Stephan It's not about t

Re: [DISCUSS] FLIP-3 - Organization of Documentation

2016-07-18 Thread Aljoscha Krettek
+1 I like it a lot! On Fri, 15 Jul 2016 at 18:43 Stephan Ewen wrote: > My take would be to take streaming as the common case and make special > sections for batch. > > We can still have a few streaming-only sections (end to end exactly once) > and a few batch-only sections (optimizer). > > On Fr

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-18 Thread Aljoscha Krettek
Hi, Chen commented this on the doc (I'm mirroring here so everyone can follow): "It would be cool to be able to access last snapshot of window states before it get purged. Pipeline author might consider put it to external storage and deal with late arriving events by restore corresponding window."

Re: Flink 1.1.0 Preview RC0

2016-07-18 Thread Aljoscha Krettek
Not yet, but Kostas is investigating. On Fri, 15 Jul 2016 at 18:21 Ufuk Celebi wrote: > Most actually have a pending PR, except: > https://issues.apache.org/jira/browse/FLINK-4207, > https://issues.apache.org/jira/browse/FLINK-4201. > > I will assign FLINK-4201 to myself and hope to fix it over

Re: In AbstractRocksDBState, why write a byte 42 between key and namespace?

2016-07-18 Thread Stephan Ewen
Got it. But the ambiguity is not really solved by that, just lessened. On Sun, Jul 17, 2016 at 2:10 PM, Aljoscha Krettek wrote: > @Stephan It's not about the serializers not being able to read the key. The > key/namespace are never read again. It's just about the serialized form > possibly being