Re: Flink 1.4: Queryable State Client

2018-10-13 Thread vino yang
Hi Seye, It seems that you have conducted an in-depth analysis of this issue. If you think it's a bug or needs improvement, please feel free to create a JIRA issue to track its status. Thanks, vino. On Sun, Oct 14, 2018 at 12:02 AM, Seye Jin wrote: > I recently upgraded to flink 1.4 from 1.3 and leverage

Re: When does Trigger.clear() get called?

2018-10-13 Thread Averell
Hello Hequn, Thanks for the answers. Regarding question no.2, I am now clear. Regarding question no.1, does your answer apply to those custom states as well? This concern of mine came from Flink's implementation of CountTrigger, in which a custom state is being cleared explicitly in

Re: When does Trigger.clear() get called?

2018-10-13 Thread Hequn Cheng
Hi Averell, > 1. Neither PURGE nor clear() removes the States (so the States must be explicitly cleared by the user). Both PURGE and clear() remove state. The PURGE action removes the window state, i.e. the aggregate value. clear() removes the window metadata, including the state in the Trigger. >

Re: Questions in sink exactly once implementation

2018-10-13 Thread Hequn Cheng
Hi Henry, > 1. I have heard of an idempotent way but I do not know how to implement it, would you please enlighten me about it with an example? It's a property of the result data. For example, you can overwrite old values with new ones using a primary key. > 2. If dirty data are *added* but not updated
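To illustrate the "overwrite old values using a primary key" idea, here is a minimal sketch in plain Java. It is not a Flink sink: the class `UpsertSinkSketch` and its in-memory map are hypothetical stand-ins for an external table with a primary key, assumed only for illustration. The point is that replaying the same records after a restart leaves the result unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for an external table with a primary key (not a Flink API). */
final class UpsertSinkSketch {
    private final Map<String, Long> table = new LinkedHashMap<>();

    /** Overwrites the row for the given primary key, so writing the same record twice is harmless. */
    void upsert(String primaryKey, long value) {
        table.put(primaryKey, value);
    }

    /** Returns a copy of the current table contents. */
    Map<String, Long> snapshot() {
        return new LinkedHashMap<>(table);
    }

    public static void main(String[] args) {
        UpsertSinkSketch sink = new UpsertSinkSketch();
        sink.upsert("user-1", 10L);
        sink.upsert("user-2", 7L);
        Map<String, Long> beforeReplay = sink.snapshot();

        // Simulate a replay after recovery: the same records are written again.
        sink.upsert("user-1", 10L);
        sink.upsert("user-2", 7L);

        // The table is identical before and after the replay.
        System.out.println(beforeReplay.equals(sink.snapshot()));
    }
}
```

An append-only write (for example, adding a new row per record) would not have this property, since the replay would produce duplicate rows.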

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-13 Thread Shuyi Chen
Welcome to the community and thanks for the great proposal, Xuefu! I think the proposal can be divided into 2 stages: making Flink support Hive features, and making Hive work with Flink. I agree with Timo on starting with a smaller scope, so we can make progress faster. As for [6], a

Re: When does Trigger.clear() get called?

2018-10-13 Thread Averell
Hello Fabian, So could I assume the following? 1. Neither PURGE nor clear() removes the States (so the States must be explicitly cleared by the user). 2. When an event for a window arrives after PURGE has been called, it is still processed, and is treated as the first event of that window.

Re: Identifying missing events in keyed streams

2018-10-13 Thread Averell
Thank you Fabian. Tried (2), and it's working well. I found one more benefit of (2) over (3): it allows me to easily raise multiple levels of alarms for each keyed stream (e.g. minor: missed 2 cycles, major: missed 5 cycles, ...). Thanks for your help. Regards, Averell -- Sent from:
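The multi-level alarm idea can be sketched in plain Java as a mapping from the number of missed cycles to an alarm level. This is only an illustration: the class `MissedCycleAlarmSketch`, the `Level` enum, and the "at least N cycles" reading of the thresholds are assumptions, and the code does not show how missed cycles are counted inside a Flink job.

```java
/** Hypothetical helper that maps a missed-cycle count to an alarm level (not a Flink API). */
final class MissedCycleAlarmSketch {
    enum Level { NONE, MINOR, MAJOR }

    // Thresholds taken from the message; treating them as "at least" is an assumption.
    static final int MINOR_THRESHOLD = 2;
    static final int MAJOR_THRESHOLD = 5;

    /** Returns the highest alarm level whose threshold the missed-cycle count has reached. */
    static Level classify(int missedCycles) {
        if (missedCycles >= MAJOR_THRESHOLD) {
            return Level.MAJOR;
        }
        if (missedCycles >= MINOR_THRESHOLD) {
            return Level.MINOR;
        }
        return Level.NONE;
    }

    public static void main(String[] args) {
        // Print the level for a few sample counts.
        for (int missed : new int[] {0, 2, 4, 5}) {
            System.out.println(missed + " missed cycles -> " + classify(missed));
        }
    }
}
```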

Flink 1.4: Queryable State Client

2018-10-13 Thread Seye Jin
I recently upgraded to flink 1.4 from 1.3 and leverage Queryable State client in my application. I have 1 jm and 5 tm all serviced behind kubernetes. A large state is built and distributed evenly across task managers and the client can query state for a specified key. Issue: if a task manager dies

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-13 Thread Bowen
Thank you Xuefu, for bringing up this awesome, detailed proposal! It will resolve lots of existing pain for users like me. In general, I totally agree that improving FlinkSQL's completeness would be a much better starting point than building 'Hive on Flink', as the Hive community is concerned

Re: FlinkKafkaProducer and Confluent Schema Registry

2018-10-13 Thread Olga Luganska
Any suggestions? Thank you. Sent from my iPhone. On Oct 9, 2018, at 9:28 PM, Olga Luganska <trebl...@hotmail.com> wrote: Hello, I would like to use Confluent Schema Registry in my streaming job. I was able to make it work with the help of generic Kafka producer and FlinkKafkaConsumer

Re: Questions in sink exactly once implementation

2018-10-13 Thread 徐涛
Hi Hequn, Thanks a lot for your response. I have a few questions about this topic. Would you please help me with them? 1. I have heard of an idempotent way but I do not know how to implement it, would you please enlighten me about it with an example? 2. If dirty data are added

Re: Are savepoints / checkpoints co-ordinated?

2018-10-13 Thread vino yang
Hi Anand, Regarding "Cancel with savepoint", Congxian is right. As for the duplicates, you should use Kafka producer transactions (available since Kafka 0.11), which provide EXACTLY_ONCE semantics [1]. Thanks, vino. [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/connectors/kafka.html#kafka-011

Re: Not all files are processed? Stream source with ContinuousFileMonitoringFunction

2018-10-13 Thread Juan Miguel Cejuela
I’m using both a local (Unix) file system and hdfs. I’m going to check those to get ideas, thank you! I’m also checking the internal code of the class and my own older patch code. On Fri 12. Oct 2018 at 21:32, Fabian Hueske wrote: > Hi, > > Which file system are you reading from? If you are

Re: Mapstatedescriptor

2018-10-13 Thread Dominik Wosiński
Hey, It's the name of the whole descriptor, not of the keys. It means that no other descriptor should be created with the same name. Best Regards, Dom. On Sat, Oct 13, 2018 at 09:50, Szymon wrote: > > > Hi, I have a question about MapStateDescriptor used to create MapState. > I have a

Mapstatedescriptor

2018-10-13 Thread Szymon
Hi, I have a question about MapStateDescriptor used to create MapState. I have a keyed stream and ProcessWindowFunction where I want to use MapState. And the question is that in MapStateDescriptor constructor public MapStateDescriptor(String name, Class<UK> keyClass,