Re: About KafkaConsumerBase

2017-08-02 Thread aitozi
Hi, thanks, you explained clearly!

Re: [POLL] Dropping savepoint format compatibility for 1.1.x in the Flink 1.4.0 release

2017-08-02 Thread Stefan Richter
+1

> On 28.07.2017 at 16:03, Stephan Ewen wrote:
> Seems like no one raised a concern so far about dropping the savepoint format compatibility for 1.1 in 1.4.
> Leaving this thread open for some more days, but from the sentiment, it seems like we should go ahead?
> On Wed, Jul 12,

Constant write stall warning with RocksDB state backend

2017-08-02 Thread Kien Truong
Hi, with the setting SPINNING_DISK_OPTIMIZED_HIGH_MEM, I'm getting a lot of warnings from RocksDB: "Stalling writes because we have 3 immutable memtables (waiting for flush), max_write_buffer_number is set to 4 rate 16777216". Increasing max_write_buffer_number causes the message to go away, but I
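
For anyone hitting the same warning: the tuning can be applied through the RocksDB state backend's options hook. This is a sketch assuming Flink 1.3's RocksDBStateBackend / OptionsFactory API; the checkpoint path and the value 6 are placeholders, not recommendations:

```java
import org.apache.flink.contrib.streaming.state.OptionsFactory;
import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Sketch: keep the predefined profile but raise max_write_buffer_number.
RocksDBStateBackend backend = new RocksDBStateBackend("hdfs:///flink/checkpoints"); // placeholder path
backend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);
backend.setOptions(new OptionsFactory() {
    @Override
    public DBOptions createDBOptions(DBOptions current) {
        return current; // leave DB-level options from the predefined profile unchanged
    }

    @Override
    public ColumnFamilyOptions createColumnFamilyOptions(ColumnFamilyOptions current) {
        // More memtables let RocksDB absorb write bursts before stalling,
        // at the cost of a higher memory footprint per column family.
        return current.setMaxWriteBufferNumber(6);
    }
});
env.setStateBackend(backend);
```

As noted later in this thread, each extra write buffer increases memory usage, so the number should be raised cautiously.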

Re: Odd flink behaviour

2017-08-02 Thread Fabian Hueske
FileInputFormat cannot know about the reached variable that you added in your class. So there is no way it could reset it to false. An alternative implementation without overriding open() could be to change the reachedEnd method to check if the stream is still at offset 0. 2017-08-01 20:22 GMT+02:
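
To make the suggestion concrete, here is a self-contained toy analogue (not Flink's actual FileInputFormat API; the class and method names are made up for illustration) where reachedEnd() is derived from the read position instead of a mutable flag, so nothing needs to be reset when the format is reopened for the next split:

```java
import java.io.IOException;
import java.io.InputStream;

// Toy analogue: the end-of-input condition is computed from the stream
// position rather than stored in a boolean field, so open() automatically
// "resets" it by starting at position 0.
class WholeStreamFormat {
    private InputStream stream;
    private long pos;

    void open(InputStream in) {
        this.stream = in;
        this.pos = 0; // fresh position for each split, no flag to forget
    }

    boolean reachedEnd() {
        return pos > 0; // done once the single record has been emitted
    }

    byte[] nextRecord() {
        try {
            byte[] all = new byte[stream.available()];
            int n = stream.read(all);
            pos += Math.max(n, 0);
            return all;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Reopening the format for a second split starts again at position 0, which is exactly the behavior the mutable-flag version lost.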

Re: S3 Write Exception

2017-08-02 Thread Fabian Hueske
Hi Aneesha, the logs would show whether Flink is going through a recovery cycle. Recovery means cancelling running tasks and starting them again. If you don't see something like that in the logs, Flink continues processing. I'm not familiar with the details of S3, so I can't tell if the exception ind

Re: multiple users per flink deployment

2017-08-02 Thread Tzu-Li (Gordon) Tai
Hi, there have been quite a few requests for this recently on the mailing lists, and it has also been mentioned by some users offline, so I think we should start planning to support this. I'm CC'ing Eron on this thread to see if he has any thoughts, as he was among the first authors driv

Re: Proper way to establish bucket counts

2017-08-02 Thread Fabian Hueske
Hi Robert, Flink collects many metrics by default, including the number of records / events that go into each operator (see [1], System Metrics, IO, "numRecordsIn"). So, you would only need to access that metric. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/monito

Storing POJO's to RocksDB state backend

2017-08-02 Thread Biplob Biswas
Hi, I had a simple question: how are POJOs stored in a state backend like RocksDB? Are they serialized internally (with a default serde, or do we have to specify something)? And if so, is Kryo the default serde? Thanks, Biplob

Re: Constant write stall warning with RocksDB state backend

2017-08-02 Thread Stefan Richter
Hi, there is some documentation on this topic here: https://github.com/facebook/rocksdb/wiki/Write-Stalls . Increasing the buffer size seems OK; the downside is a potentially higher memory footprint. Best, Stefan > On 02.08.2017 at 10:

Re: S3 Write Exception

2017-08-02 Thread Stephan Ewen
It is very important to point out that the BucketingSink currently can NOT work properly on S3. It assumes a consistent file system (i.e., that listing and renaming work consistently), and S3 is only eventually consistent. I assume that this eventual consistency of S3 is the cause of your error. There i

Re: Can I use a lot of keyed states or should I use 1 big keyed state?

2017-08-02 Thread shashank agarwal
If I am creating keyed state ("count by email id") and the keyed stream has 10 unique email ids, will it create 1 column family or hash table, or will it create 10? Can I have millions of unique email ids in that keyed state? On Tue, Aug 1, 2017 at 2:59 AM, shashank

Re: [POLL] Dropping savepoint format compatibility for 1.1.x in the Flink 1.4.0 release

2017-08-02 Thread Till Rohrmann
+1

> On Wed, Aug 2, 2017 at 9:12 AM, Stefan Richter wrote:
> +1
>
> On 28.07.2017 at 16:03, Stephan Ewen wrote:
> Seems like no one raised a concern so far about dropping the savepoint format compatibility for 1.1 in 1.4.
> Leaving this thread open for some more days, but from the sentimen

Re: [POLL] Dropping savepoint format compatibility for 1.1.x in the Flink 1.4.0 release

2017-08-02 Thread Kostas Kloudas
+1

> On Aug 2, 2017, at 3:16 PM, Till Rohrmann wrote:
> +1
>
> On Wed, Aug 2, 2017 at 9:12 AM, Stefan Richter wrote:
>> +1
>>
>> On 28.07.2017 at 16:03, Stephan Ewen wrote:
>> Seems like no one raised a concern so far about dropping the savepoint format compatibility for 1.1 i

Re: Storing POJO's to RocksDB state backend

2017-08-02 Thread Timo Walther
Hi Biplob, Flink ships with its own serializers. POJOs and other datatypes are analyzed automatically. Kryo is only the fallback option, used if your class does not meet the POJO criteria (see [1]). Usually, all serialization/deserialization to e.g. RocksDB happens internally and the user doesn't
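
To illustrate the POJO criteria: a sketch of a class Flink's type analyzer should accept (the Event type here is hypothetical): a public type with a public no-argument constructor whose fields are either public or reachable via getters/setters. Types failing these rules fall back to Kryo.

```java
class PojoDemo {
    // Meets Flink's POJO rules: public (static nested) type, public no-arg
    // constructor, and every field either public or exposed through a
    // getter/setter pair. Such types use Flink's PojoSerializer; anything
    // else is serialized with the Kryo fallback.
    public static class Event {
        public String id;         // public field: OK
        private long timestamp;   // private, but has getter/setter: also OK

        public Event() {}         // required public no-arg constructor

        public long getTimestamp() { return timestamp; }
        public void setTimestamp(long t) { this.timestamp = t; }
    }
}
```

Adding, say, a private field without accessors would push the whole type onto the Kryo fallback path.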

KafkaConsumerBase

2017-08-02 Thread aitozi
Hi, I have a question: when we use KafkaConsumerBase, we have to fetch data from different partitions in different parallel threads, as in the method shown in KafkaConsumerBase.java (version 1.2.0): protected static List assignPartitions( List allPartitio
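
For readers following along, the idea behind that method is simple round-robin assignment. This is a self-contained sketch of the modulo scheme (not the exact 1.2.0 source): each parallel subtask keeps the partitions whose list index maps to its subtask index modulo the parallelism.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionAssignment {
    // Sketch of the round-robin idea behind the old assignPartitions in
    // FlinkKafkaConsumerBase (1.2.x): subtask i out of n keeps every
    // partition whose index in the full list is congruent to i mod n.
    static <T> List<T> assignPartitions(List<T> allPartitions, int numSubtasks, int subtaskIndex) {
        List<T> mine = new ArrayList<>();
        for (int i = 0; i < allPartitions.size(); i++) {
            if (i % numSubtasks == subtaskIndex) {
                mine.add(allPartitions.get(i));
            }
        }
        return mine;
    }
}
```

Every partition is claimed by exactly one subtask, and the subtasks need no coordination: each computes its share independently from the same sorted partition list.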

Re: Eventime window

2017-08-02 Thread Timo Walther
Hi Govind, if the window is not triggered, this usually indicates that your timestamp and watermark assignment is not correct. According to your description, I don't think that you need a custom trigger/evictor. How often do events arrive from one device? There must be another event from the

Re: Eventime window

2017-08-02 Thread Govindarajan Srinivasaraghavan
Thanks Timo. The next message will arrive only after a minute or so. Is there a way to evict whatever is in the window buffer just after 10 seconds, irrespective of whether a new message arrives or not? Thanks, Govind > On Aug 2, 2017, at 6:56 AM, Timo Walther wrote: > > Hi Govind, > > if

Re: KafkaConsumerBase

2017-08-02 Thread Tzu-Li (Gordon) Tai
Hi!

> method shown in KafkaConsumerBase.java (version 1.2.0)

A lot has changed in FlinkKafkaConsumerBase since version 1.2.0. If I remember correctly, the `assignPartitions` method was no longer actually used in the code, and it was properly removed afterwards. The method f

Re: Eventime window

2017-08-02 Thread Timo Walther
The question is what defines your `10 seconds`. In event-time the incoming events determine when 10 seconds have passed. Your description sounds like you want to have results after 10 seconds wall-clock/processing-time. So either you use a processing-time window or you implement a custom trigge

Re: Eventime window

2017-08-02 Thread Timo Walther
I forgot about the AssignerWithPeriodicWatermarks [1]. I think it could solve your problem easily. Timo [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/event_timestamps_watermarks.html#with-periodic-watermarks Am 02.08.17 um 16:30 schrieb Timo Walther: The question is wha

Re: KafkaConsumerBase

2017-08-02 Thread aitozi
Hi Gordon, yes, I just re-read the code in the assignTopicPartitions method; it does subscribe each subtask to exactly the partitions it should own. I hadn't carefully read the for loop that builds subscribedPartitions for each subtask in assignTopicPartitions before: for (int i = getRuntimeCo

Re: Eventime window

2017-08-02 Thread Govindarajan Srinivasaraghavan
Thanks Timo. Basically my requirement is that the window has to be created based on event time, but the trigger has to fire either when the next element arrives more than 10s later or when 10s have passed, exactly the way you described. Let me try the AssignerWithPeriodicWatermarks approach. Thanks, Go

Can't find correct JobManager address, job fails with Queryable state

2017-08-02 Thread Biplob Biswas
When I start my Flink job I get the following warning. If I am not wrong, this is because it can't find the JobManager at the given address (localhost). I tried changing config.setString(JobManagerOptions.ADDRESS, "localhost"); to the LAN IP, 127.0.0.1, and localhost, but none of them seems to work. I am

Re: multiple users per flink deployment

2017-08-02 Thread Georg Heiler
Thanks for the overview. Currently a single Flink cluster seems to run all tasks as the same user. I would want to be able to run each Flink job as a separate user instead. The update for separate read/write users is nice though. Tzu-Li (Gordon) Tai wrote on Wed, Aug 2, 2017 at 10:59: > Hi,

Re: Odd flink behaviour

2017-08-02 Thread Mohit Anchlia
Thanks. I thought the purpose of the method below was to supply that information?

    @Override
    public boolean reachedEnd() throws IOException {
        logger.info("Reached " + reached);
        return reached;
    }

On Wed, Aug 2, 2017 at 1:43 AM, Fabian Hueske wrote: > FileInputFormat cannot know about th

Fwd: Flink -mesos-app master hang

2017-08-02 Thread Biswajit Das
Hi there, I posted this in the group a few days back, and since then I have been exchanging email with Eron; thanks to Eron for all the tips. Now I see this basic auth error, and I'm a little confused how the Job Manager launched fine while the Task Manager is failing to auth. Also, the Mesos doc says by

Re: Eventime window

2017-08-02 Thread Eron Wright
Note that the built-in `BoundedOutOfOrdernessTimestampExtractor` generates watermarks based only on the timestamp of incoming events. Without new events, `BoundedOutOfOrdernessTimestampExtractor` will not advance the event-time clock. That explains why the window doesn't trigger immediately after
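
Following up on that: the usual workaround is an assigner whose watermark also advances with processing time. The sketch below isolates just the watermark logic in a plain, testable class; a real implementation would implement Flink's AssignerWithPeriodicWatermarks interface, and maxLagMillis is an assumed parameter, not a Flink default.

```java
// Self-contained sketch of the logic for a periodic watermark assigner whose
// watermark never lags the wall clock by more than maxLagMillis, so an
// event-time window can still fire when no further events arrive. The wall
// clock is passed in explicitly (instead of System.currentTimeMillis())
// purely to keep the logic testable.
class ProcessingTimeBoundedWatermarks {
    private final long maxLagMillis;
    private long maxSeenTimestamp = Long.MIN_VALUE;

    ProcessingTimeBoundedWatermarks(long maxLagMillis) {
        this.maxLagMillis = maxLagMillis;
    }

    // Corresponds to extractTimestamp(element, previousElementTimestamp):
    // track the highest event timestamp seen so far.
    long extractTimestamp(long eventTimestamp) {
        maxSeenTimestamp = Math.max(maxSeenTimestamp, eventTimestamp);
        return eventTimestamp;
    }

    // Corresponds to getCurrentWatermark(); Flink invokes it periodically
    // (interval set via ExecutionConfig#setAutoWatermarkInterval).
    long currentWatermark(long wallClockMillis) {
        // Advance even without new events: bounded by the wall clock minus the lag.
        return Math.max(maxSeenTimestamp, wallClockMillis - maxLagMillis);
    }
}
```

With this scheme a 10-second event-time window closes at most maxLagMillis of wall-clock time after its end, whether or not another event arrives.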

replacement for KeyedStream.fold(..) ?

2017-08-02 Thread Peter Ertl
Hi folks, since KeyedStream.fold(..) is marked as @deprecated, what is the proper replacement for that kind of functionality? Are mapWithState() and flatMapWithState() a *full* replacement? Cheers, Peter
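
Conceptually, KeyedStream.fold maintains one accumulator per key; in Flink programs the common replacements are reduce/aggregate, or a RichFlatMapFunction holding a ValueState<ACC> per key. The toy model below (plain Java, not the Flink API) shows the per-key fold the deprecated operator performed:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

class KeyedFoldDemo {
    // Toy model of what KeyedStream.fold maintains: one accumulator per key,
    // updated by the fold function for every arriving (key, value) event.
    // In Flink itself, the per-key accumulator would live in a ValueState<ACC>
    // inside a RichFlatMapFunction on the keyed stream.
    static <K, V, ACC> Map<K, ACC> foldByKey(Iterable<? extends Map.Entry<K, V>> events,
                                             ACC initial,
                                             BiFunction<ACC, V, ACC> folder) {
        Map<K, ACC> state = new HashMap<>();
        for (Map.Entry<K, V> e : events) {
            ACC acc = state.getOrDefault(e.getKey(), initial);
            state.put(e.getKey(), folder.apply(acc, e.getValue()));
        }
        return state;
    }
}
```

Unlike reduce, the accumulator type can differ from the input type, which is why a stateful flatMap (or aggregate, where available) is the faithful substitute.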

Re: multiple users per flink deployment

2017-08-02 Thread Eron Wright
One of the key challenges is isolation, e.g. ensuring that one job cannot access the credentials of another. The easiest solution today is to use the YARN deployment mode, with a separate app per job. Meanwhile, improvements being made under the FLIP-6 banner for 1.4+ are laying the groundwork for a mu