答复: 答复: 答复: TumblingProcessingTimeWindow emits extra results for a same window

2018-07-15 Thread Yuan,Youjun
Hi Hequn, To my understand, a processing time window is fired at the last millisecond of the window(maxTimestamp). Then what will happen if more elements arrive at the last millisecond, but AFTER the window is fired? Thanks, Youjun 发件人: Hequn Cheng 发送时间: Friday, July 13, 2018 9:44 PM 收件人:

Re: Persisting Table in Flink API

2018-07-15 Thread Hequn Cheng
Hi Shivam, Currently, fink sql/table-api support window join and non-window join[1]. If your requirements are not being met by sql/table-api, you can also use the datastream to implement your own logic. You can refer to the non-window join implement as an example[2][3]. Best, Hequn [1]

Re: How to customize trigger for Count Time Window

2018-07-15 Thread Rong Rong
Hi Soheil, I don't think just overriding the window trigger function is sufficient, since your logic effectively changes the how elements are assigned to a window. Based on a quick scan I think your use case might be able to reuse the DynamicGapSessionWIndow [1], where you will have to create a

Persisting Table in Flink API

2018-07-15 Thread Shivam Sharma
Hi, We have one use case in which we need to persist Table in Flink which can be later used to join with other tables. This table can be huge so we need to store it in off-heap but faster access. Any suggestions regarding this? -- Shivam Sharma Data Engineer @ Goibibo Indian Institute Of

Re: Flink Query Optimizer

2018-07-15 Thread vino yang
Hi Albert, If you want to provide more feature about the query optimizer for Flink. I suggest you based on Apache Calcite, if Calcite's optimizer can not match your requirement. You can talk with Calcite community or just customize Calcite if you do not want to wait. Our inner Calcite version

Re: Real time streaming as a microservice

2018-07-15 Thread Mich Talebzadeh
Hi Deepak, I will put it there once all the bits and pieces come together. At the moment I am drawing the diagrams. I will let you know. Definitely everyone's contribution is welcome. Regards, Dr Mich Talebzadeh LinkedIn *

Re: Real time streaming as a microservice

2018-07-15 Thread Deepak Sharma
Is it on github Mich ? I would love to use the flink and spark edition and add some use cases from my side. Thanks Deepak On Sun, Jul 15, 2018, 13:38 Mich Talebzadeh wrote: > Hi all, > > I have now managed to deploy both ZooKeeper and Kafka as microservices > using docker images. > > The idea

Re: Real time streaming as a microservice

2018-07-15 Thread Mich Talebzadeh
Hi all, I have now managed to deploy both ZooKeeper and Kafka as microservices using docker images. The idea came to me as I wanted to create lightweight processes for both ZooKeeper and Kafka to be used as services for Flink and Spark simultaneously. In this design both Flink and Spark rely on

Re: Flink Query Optimizer

2018-07-15 Thread Hequn Cheng
Hi Albert, Apache Flink leverages Apache Calcite to optimize and translate queries. The optimization currently performed include projection and filter push-down, subquery decorrelation, and other kinds of query rewriting. Flink does not yet optimize the order of joins[1]. I agree with you it is

Re: reduce a data stream to other type

2018-07-15 Thread Hequn Cheng
Hi Soheil, Yes, reduce function doesn't allow this. A ReduceFunction specifies how two elements from the input are combined to produce an output element of the same type. You can use AggregateFunction or FoldFunction. More details here[1]. Best, Hequn [1]

Re: Filtering and mapping data after window opertator

2018-07-15 Thread Hequn Cheng
Hi Soheil, We can't apply FilterFunction or MapFunction on WindowedStream. It is recommended to do these operations on DataStream, for example, temp.filter().map().keyBy(0).timeWindow(). Best, Hequn On Sat, Jul 14, 2018 at 9:14 PM, Soheil Pourbafrani wrote: > Hi, I'm getting data stream from