Re: Timestamp Watermark Assigner bound question

2019-04-09 Thread Guowei Ma
Hi, 1. From the doc[1]: "A Watermark(t) declares that event time has reached time t in that stream, meaning that there should be no more elements from the stream with a timestamp t' <= t (i.e. events with timestamps older than or equal to the watermark)." So I think it might be counterintuitive that
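The inclusive bound quoted above can be shown with a minimal plain-Java sketch (no Flink dependencies; the class and method names are illustrative, not Flink API): an element whose timestamp exactly equals the current watermark already counts as late.

```java
// Sketch of the watermark contract discussed above: Watermark(t) promises
// no further elements with timestamp t' <= t, so such elements are late.
public class WatermarkContract {
    // True when an element with this timestamp violates the promise made
    // by the current watermark (note the inclusive comparison).
    public static boolean isLate(long elementTimestamp, long currentWatermark) {
        return elementTimestamp <= currentWatermark;
    }

    public static void main(String[] args) {
        long watermark = 100L;
        System.out.println(isLate(100L, watermark)); // element at exactly t is late
        System.out.println(isLate(101L, watermark)); // element after t is on time
    }
}
```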

Re: flink 1.7.2 freezes, waiting indefinitely for the buffer availability

2019-04-09 Thread Guowei Ma
Hi, could you jstack the downstream task (the window) and have a look at what the window operator is doing? Best, Guowei Rahul Jain wrote on Wed, Apr 10, 2019 at 1:04 PM: > We are also seeing something very similar. Looks like a bug. > > It seems to get stuck in LocalBufferPool forever and the job has to be

Re: flink 1.7.2 freezes, waiting indefinitely for the buffer availability

2019-04-09 Thread Rahul Jain
We are also seeing something very similar. Looks like a bug. It seems to get stuck in LocalBufferPool forever and the job has to be restarted. Is anyone else facing this too? On Tue, Apr 9, 2019 at 9:04 PM Indraneel R wrote: > Hi, > > We are trying to run a very simple flink pipeline, which

Re: Flink 1.7.2 UI : Jobs removed from Completed Jobs section

2019-04-09 Thread Jins George
Any input on this UI behavior? Thanks, Jins From: Timothy Victor Date: Monday, April 8, 2019 at 10:47 AM To: Jins George Cc: user Subject: Re: Flink 1.7.2 UI : Jobs removed from Completed Jobs section I face the same issue in Flink 1.7.1. Would be good to know a solution. Tim On Mon, Apr

Timestamp Watermark Assigner bound question

2019-04-09 Thread Vijay Balakrishnan
Hi, I have created a TimestampAssigner as follows. I want to use monitoring.getEventTimestamp() with event-time processing and collect aggregated stats over time window intervals of 5 secs, 5 mins, etc. Is this the right way to create the TimeWaterMarkAssigner with a bound? I want to collect
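The bounded-assigner logic behind this question can be sketched in plain Java (no Flink dependencies; the class and method names are illustrative, not Flink API): the watermark trails the highest timestamp seen so far by a fixed bound, so moderately out-of-order events are still assigned to their windows.

```java
import java.util.concurrent.TimeUnit;

// Sketch of bounded-out-of-orderness watermarking: track the maximum
// event timestamp seen and emit a watermark that lags it by `boundMillis`.
public class BoundedWatermarkSketch {
    private final long boundMillis;
    private long maxTimestamp = Long.MIN_VALUE;

    public BoundedWatermarkSketch(long boundMillis) {
        this.boundMillis = boundMillis;
    }

    // Called per element, e.g. with monitoring.getEventTimestamp().
    public long extractTimestamp(long eventTimestamp) {
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
        return eventTimestamp;
    }

    // The current watermark: max seen timestamp minus the allowed lateness.
    public long currentWatermark() {
        return maxTimestamp - boundMillis;
    }

    public static void main(String[] args) {
        BoundedWatermarkSketch ws = new BoundedWatermarkSketch(TimeUnit.SECONDS.toMillis(5));
        ws.extractTimestamp(10_000L);
        ws.extractTimestamp(8_000L); // out-of-order element does not move the watermark back
        System.out.println(ws.currentWatermark()); // 5000
    }
}
```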

Re: Details of the downsides of “falling back” to Kryo rather than using Flink’s built in serde

2019-04-09 Thread kb
Hi Konstantin, Thank you! This is exactly what I was looking for. Thanks Kevin -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Details of the downsides of “falling back” to Kryo rather than using Flink’s built in serde

2019-04-09 Thread Konstantin Knauf
Hi Kevin, there are no performance downsides to using Flink POJOs. You are just limited in the types you can use (e.g. you cannot use Collections). In Flink 1.7 you might want to use Avro (SpecificRecord) for your state objects to benefit from Flink's built-in state schema evolution
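For reference, a class meeting Flink's POJO rules looks like the sketch below: a public class with a public no-argument constructor and fields that are either public or reachable via getters/setters. The class and field names here are made up for illustration.

```java
// Illustrative class satisfying the POJO requirements mentioned above,
// so Flink's POJO serializer (rather than Kryo) can be used for it.
public class SensorReading {
    public String sensorId;
    public long timestamp;
    public double value;

    // A public no-arg constructor is required for POJO serialization.
    public SensorReading() {}

    public SensorReading(String sensorId, long timestamp, double value) {
        this.sensorId = sensorId;
        this.timestamp = timestamp;
        this.value = value;
    }
}
```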

Details of the downsides of “falling back” to Kryo rather than using Flink’s built in serde

2019-04-09 Thread kb
Hi all, I was looking at https://ci.apache.org/projects/flink/flink-docs-stable/dev/types_serialization.html, and https://www.youtube.com/watch?v=euFMWFDThiE to improve our state management and back pressure. Both of these resources mention ensuring objects used in the flow are valid POJOs to

Re: [Discuss] Semantics of event time for state TTL

2019-04-09 Thread aitozi
Hi Andrey, I think TTL state has another scenario: simulating the sliding window with a process function. A user can define a state to store the data from the latest 1 day and trigger a calculation on the state every 5 min. It is an operator similar to a sliding window, but I think it is more efficient than

[DISCUSS] State TTL support for EventTime

2019-04-09 Thread Yu Li
Hi all, as you may have noticed, in FLINK-12005[1] we proposed supporting EventTime[2]-based TTL[3] semantics in state. We would like to start a discussion/survey on this feature around the following questions: 1. Does your use case need EventTime-based state TTL? If so, please briefly describe the scenario and the reason for the requirement; this is key to deciding whether we develop the feature. 2. There are two key times in the TTL definition: the timestamp of the data, and the current time (currentTime) at which the data is checked for expiration. Due to the nature of stream processing, data may arrive "out of order", and Flink introduced the watermark concept to handle this. Therefore, for

Re: Flink 1.7.2 extremely unstable and losing jobs in prod

2019-04-09 Thread Bruno Aranda
Thanks Till, I will start separate threads for the two issues we are experiencing. Cheers, Bruno On Mon, 8 Apr 2019 at 15:27, Till Rohrmann wrote: > Hi Bruno, > > first of all good to hear that you could resolve some of the problems. > > Slots get removed if a TaskManager gets unregistered

RE: Re: Latency caused by Flink Checkpoint on EXACTLY_ONCE mode

2019-04-09 Thread min.tan
Many thanks for your quick reply. 1) My implementation has no commits. All commits are done in the FlinkKafkaProducer class, I envisage. KeyedSerializationSchemaWrapper keyedSerializationSchemaWrapper = new KeyedSerializationSchemaWrapper(new SimpleStringSchema()); new

Dynamic index(by day) to sink to elastic search

2019-04-09 Thread Jacky Yin 殷传旺
Hello there, we are using Flink SQL to build a streaming pipeline which reads data from Kafka, aggregates the data, and finally sinks to Elasticsearch. For the table sink to Elasticsearch, we expect to create an index per day (e.g. index1-2019-04-08, index1-2019-04-09…). Is this function supported?
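When a sink does not support dynamic index names out of the box, one common workaround is to derive the daily index name from the record's timestamp before handing the record to the sink. The sketch below is plain Java; the "index1" prefix and the UTC zone are assumptions for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Derive a per-day Elasticsearch index name (e.g. index1-2019-04-09)
// from an epoch-millisecond timestamp.
public class DailyIndex {
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    public static String indexFor(String prefix, long epochMillis) {
        LocalDate day = Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneOffset.UTC)
                .toLocalDate();
        return prefix + "-" + DAY.format(day);
    }

    public static void main(String[] args) {
        // 1554768000000L is 2019-04-09T00:00:00Z
        System.out.println(indexFor("index1", 1554768000000L)); // index1-2019-04-09
    }
}
```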

Re: Schema Evolution on Dynamic Schema

2019-04-09 Thread Fabian Hueske
Hi Shahar, Thanks! The approach of the UDAGG would be very manual. You could not reuse the built-in functions. There are several ways to achieve this. One approach could be to have a map-based UDAGG for each type of aggregation that you'd like to support (SUM, COUNT, ...) Let's say we have a
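The map-based UDAGG idea above can be sketched in plain Java (no Flink dependencies): one accumulator keeps a running value per dynamic field name, here for SUM. A real implementation would wrap this accumulator in a Flink AggregateFunction; the class and method names are illustrative, not Flink API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a map-based SUM accumulator: each dynamic field name maps to
// its running sum, so one aggregate handles a schema that varies per record.
public class MapSumAccumulator {
    private final Map<String, Double> sums = new HashMap<>();

    // Add one (fieldName, value) pair to the running sums.
    public void accumulate(String field, double value) {
        sums.merge(field, value, Double::sum);
    }

    // Current sum for a field, 0.0 if the field was never seen.
    public double getValue(String field) {
        return sums.getOrDefault(field, 0.0);
    }

    public static void main(String[] args) {
        MapSumAccumulator acc = new MapSumAccumulator();
        acc.accumulate("clicks", 3.0);
        acc.accumulate("clicks", 2.0);
        acc.accumulate("views", 7.0);
        System.out.println(acc.getValue("clicks")); // 5.0
    }
}
```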

Add flink gauge to serialiser ?

2019-04-09 Thread Frank Wilson
Is there a way to add a gauge to a flink serializer? I’d like to calculate and expose the total time to process a given tuple including the serialisation/deserialisation time. Or would it be a better idea to wrap the concrete sink function (e.g. kafka producer) with an ‘instrumented sink’
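The "instrumented wrapper" idea can be sketched in plain Java: wrap a serialization function, measure the elapsed nanoseconds, and expose the last measurement from whatever metric the surrounding operator registers (e.g. a Flink Gauge in a rich function). The names below are illustrative, not Flink API.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Wraps a serialization function and records how long the last call took,
// so the value can be reported through a gauge.
public class TimedSerializer {
    private volatile long lastSerializeNanos;
    private final Function<String, byte[]> delegate;

    public TimedSerializer(Function<String, byte[]> delegate) {
        this.delegate = delegate;
    }

    public byte[] serialize(String value) {
        long start = System.nanoTime();
        byte[] out = delegate.apply(value);
        lastSerializeNanos = System.nanoTime() - start;
        return out;
    }

    // Expose this from a gauge registered on the operator's metric group.
    public long getLastSerializeNanos() {
        return lastSerializeNanos;
    }

    public static void main(String[] args) {
        TimedSerializer ts =
                new TimedSerializer(s -> s.getBytes(StandardCharsets.UTF_8));
        byte[] bytes = ts.serialize("hello");
        System.out.println(bytes.length); // 5
        System.out.println(ts.getLastSerializeNanos() >= 0); // true
    }
}
```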

Re: [DISCUSS] Drop Elasticsearch 1 connector

2019-04-09 Thread vino yang
+1 to drop it. Stephan Ewen wrote on Sat, Apr 6, 2019 at 6:51 AM: > +1 to drop it > > Previously released versions are still available and compatible with newer > Flink versions anyways. > > On Fri, Apr 5, 2019 at 2:12 PM Bowen Li wrote: > > > +1 for dropping elasticsearch 1 connector. > > > > On Wed, Apr 3,

Re: [Discuss] Semantics of event time for state TTL

2019-04-09 Thread Aljoscha Krettek
I think so, I just wanted to bring it up again because the question was raised. > On 8. Apr 2019, at 22:56, Elias Levy wrote: > > Hasn't this been always the end goal? It's certainly what we have been > waiting on for job with very large TTLed state. Beyond timer storage, > timer processing

Re: Help needed: flink "Buffer pool is destroyed" exception

2019-04-09 Thread Yang Peng
Usually this problem is caused by insufficient memory. Job-seeking programmer, Beijing University of Posts and Telecommunications <13341000...@163.com> wrote on Tue, Apr 9, 2019 at 1:37 PM: > Hi everyone! > Today a "Buffer pool is destroyed" exception was thrown while running flink; the data source is kafka, and roughly 8 GB of data had accumulated in the kafka queue before consumption. The detailed error is as follows: > > > > org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: > Could not forward element to