Re: Re: Bootstrapping the state

2018-07-22 Thread Henri Heiskanen
lan about this, I would try to submit this idea > to the community. > > And about "how to bootstrap a state", what does that mean? Can you explain > this? > > Thanks, vino > > > On 2018-07-20 20:00, Henri Heiskanen wrote: > > Hi, > > Thanks. Just to cla

Re: Bootstrapping the state

2018-07-20 Thread Henri Heiskanen
mplement SourceFunction interface). > > For your requirement, you can track how long no data has arrived; once that idle time expires, > exit the source, and the job will then stop. > > You can also refer to the implementation of other source connectors. > > Thanks, vino. > > 2018-07-19 19:52 GMT+08:00

Bootstrapping the state

2018-07-19 Thread Henri Heiskanen
Hi, I've been looking into how to initialise large state and especially checked this presentation by Lyft referenced in this group as well: https://www.youtube.com/watch?v=WdMcyN5QZZQ In our use case we would like to load roughly 4 billion entries into this state and I believe loading this data f

Re: deduplication with streaming sql

2018-02-06 Thread Henri Heiskanen
should keep the state for at least 12 > hours but at most 14 hours, you only need to register a new timer > every 2 hours. > > Hope this helps, > Fabian > > 2018-02-06 15:47 GMT+01:00 Henri Heiskanen: > >> Hi, >> >> Thanks. >> >> Doing this

Re: deduplication with streaming sql

2018-02-06 Thread Henri Heiskanen
is not officially supported nor >> tested. I opened an issue for this [1]. >> >> Until this issue is fixed I would recommend implementing a custom >> aggregate function that keeps track of the values seen so far [2]. >> >> Regards, >> Timo >> >> [1] https://is

deduplication with streaming sql

2018-02-06 Thread Henri Heiskanen
Hi, I have a use case where I would like to find distinct rows over a certain period of time. The requirement is that a new row is emitted as soon as possible. Otherwise, the goal is mainly to filter out data so that downstream has a smaller dataset. I noticed that SELECT DISTINCT and a state retention time of 12

starting query server when running flink embedded

2017-09-28 Thread Henri Heiskanen
Hi, I would like to test queryable state just by running Flink embedded from my IDE. What is the easiest way to start it properly? If I run the below, I cannot see the query server listening on the given port. I found something about this, but it was about copying some base classes and post wa

Re: difference between checkpoints & savepoints

2017-08-10 Thread Henri Heiskanen
eckpoints in some cases if this feature is dropped. > > Right now, externalized checkpoints should offer all that you want. > > Best, > Stefan > > Am 10.08.2017 um 11:46 schrieb Henri Heiskanen: > > Hi, > > It would be super helpful if Flink would provide out

Re: difference between checkpoints & savepoints

2017-08-10 Thread Henri Heiskanen
Hi, It would be super helpful if Flink would provide out of the box functionality for writing automatic savepoints and then starting from the latest savepoint. If externalized checkpoints supported rescaling, the first requirement would be met, but one would still need to e.g. find the latest checkpoint f

Re: Flink streaming questions

2017-01-09 Thread Henri Heiskanen
ink/flink-docs- > release-1.2/dev/windows.html#windowfunction-with-incremental-aggregation > > 2017-01-03 12:32 GMT+01:00 Henri Heiskanen: > >> Hi, >> >> Actually it seems "Fold cannot be used with a merging WindowAssigner" and >> the workaround I found was

Re: Are heterogeneous DataStreams possible?

2017-01-09 Thread Henri Heiskanen
Hi, We have been using HashMap and it has been working fine so far. Br, Henkka On Mon, Jan 9, 2017 at 5:35 PM, Aljoscha Krettek wrote: > You could try using JSON for all your data, this might be slow, however. > The other route, which I would suggest, is to have your own custom > TypeSerializers

Re: Kafka 09 consumer does not commit offsets

2017-01-09 Thread Henri Heiskanen
Hi, We had the same problem when running the 0.9 consumer against Kafka 0.10. Upgrading the Flink Kafka connector to 0.10 fixed our issue. Br, Henkka On Mon, Jan 9, 2017 at 5:39 PM, Tzu-Li (Gordon) Tai wrote: > Hi, > > Not sure what might be going on here. I’m pretty certain that for > FlinkKafkaConsu

Re: Flink streaming questions

2017-01-03 Thread Henri Heiskanen
O): T = { >> if(!initialized){ >> doInitStuff() >> initialized = true >> } >> >> doNormalStuff() >> } >> } > > > #3 - One way to do this is as you've said which is to attach the profile > information to the event, using a mapper,

Flink streaming questions

2017-01-02 Thread Henri Heiskanen
Hi, I have a few questions related to Flink streaming. I am on 1.2-SNAPSHOT and what I would like to accomplish is to have a stream that reads data from multiple Kafka topics, identifies user sessions, uses an external user profile to enrich the data, evaluates a script to produce session aggr