Re: Flink on YARN: delegation token expired prevent job restart

2020-11-17 Thread Kien Truong
, Flink does exclude the HDFS_DELEGATION_TOKEN in the > HadoopModule when user provides the keytab and principal. I'll try to > do a deeper investigation to figure out is there any HDFS access > before the HadoopModule installed. > > Best, > Yangze Guo > > > On Tue, Nov

Re: Flink on YARN: delegation token expired prevent job restart

2020-11-17 Thread Kien Truong
fig the "security.kerberos.login.principal" and the "security.kerberos.login.keytab" together? If you only set the keytab, it will not take effect. Best, Yangze Guo On Tue, Nov 17, 2020 at 3:03 PM Kien Truong wrote: > > Hi all, > > We are having an issue where Flink Application Master is unable to

Flink on YARN: delegation token expired prevent job restart

2020-11-16 Thread Kien Truong
Hi all, We are having an issue where Flink Application Master is unable to automatically restart Flink job after its delegation token has expired. We are using Flink 1.11 with YARN 3.1.1 in single job per yarn-cluster mode. We have also add valid keytab configuration and taskmanagers are able to

Guide on writing Flink plugins

2020-10-05 Thread Kien Truong
Hi all, We want to write a Flink plugins to integrate Flink jobs with our in-house monitoring system. Are there any guide or tutorial that we can follow to write a Flink plugins ? The official documents are a bit bare bone. Regards, Kien

Re: [Flink 1.6] How to get current total number of processed events

2019-01-25 Thread Kien Truong
), I expect to obtain: oMaximum number of processed messages: 3 (corresponding to key k1) oMinimum number of processed messages: 1 (corresponding to keys 4 and 7) Do you have any idea to obtain this, please? Thank you so much ! Nhan *De :* Kien Truong [mailto:duckientru...@gmail.com] *Envoyé

Re: [Flink 1.6] How to get current total number of processed events

2019-01-24 Thread Kien Truong
, Thanh-Nhan Vo wrote: Hi Kien Truong, Thank you for your answer. I have another question, please ! If I count the number of messages processed for a given key j (denoted c_j), is there a way to retrieve max{c_j}, min{c_j}? Thanks *De :*Kien Truong [mailto:duckientru...@gmail.com] *Envoyé

Re: [Flink 1.6] How to get current total number of processed events

2019-01-23 Thread Kien Truong
Hi Nhan, Logically, the total number of processed events before an event cannot be accurately calculated unless events processing are synchronized. This is not scalable, so naturally I don't think Flink supports it. Although, I suppose you can get an approximate count by using a non-keyed

Re: How to trigger a Global Window with a different Message from the window message

2019-01-23 Thread Kien Truong
Hi Oliver, Try replacing Global Window with a KeyedProcessFunction. Store all the item received between CalcStart and CalcEnd inside a ListState the process them when CalcEnd is received. Regards, Kien On 1/17/2019 1:06 AM, Oliver Buckley-Salmon wrote: Hi, I have a Flink job where I

Re: When can the savepoint directory be deleted?

2019-01-23 Thread Kien Truong
Hi, As of Flink 1.7, the savepoint should not be deleted until after the first checkpoint has been successfully taken. https://ci.apache.org/projects/flink/flink-docs-release-1.7/release-notes/flink-1.7.html#savepoints-being-used-for-recovery Regards, Kien On 1/23/2019 6:57 PM, Ben Yan

Re: Flink Task Allocation on Nodes

2018-10-26 Thread Kien Truong
for each job a separate cluster? On Wed, Oct 24, 2018 at 3:23 PM Kien Truong <mailto:duckientru...@gmail.com>> wrote: Hi, You can have multiple Flink clusters on the same set of physical machines. In our experience, it's best to deploy a separate Flink cluster for

Re: Flink Task Allocation on Nodes

2018-10-24 Thread Kien Truong
it won't have hardware resource for other jobs. On Wed, Oct 24, 2018 at 2:20 PM Kien Truong wrote: > Hi, > > How are your task managers deploy ? > > If you cluster only have one task manager with one slot in each node, > then the job should be spread evenly. > > Regards, &

Re: Flink Task Allocation on Nodes

2018-10-24 Thread Kien Truong
Hi, How are your task managers deploy ? If you cluster only have one task manager with one slot in each node, then the job should be spread evenly. Regards, Kien On 10/24/2018 4:35 PM, Sayat Satybaldiyev wrote: Is there any way to indicate flink not to allocate all parallel tasks on one

Re: Size of Checkpoints increasing with time

2018-10-24 Thread Kien Truong
Hi, Do you use incremental checkpoint ? RocksDB is an append-only DB, so you will experience the steady increase in state size until a compaction occurs and old values of keys are garbage-collected. However, the average state size should stabilize after a while, if the load doesn't change.

Re: Checkpoint acknowledge takes too long

2018-10-24 Thread Kien Truong
Hi, In my experience, this is most likely due to one sub-task is blocked doing some long-running operation. Try to run the task manager with some profiler (like VisualVM) and check for hot spot. Regards, Kien On 10/24/2018 4:02 PM, 徐涛 wrote: Hi I am running a flink application

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-10-24 Thread Kien Truong
Hi, Since InputFormatSourceFunction is a subclass of RichParallelSourceFunction, your wrapper should also extend this class. In addition, remember to overwrite the methods defined in the AbstractRichFunction interface and proxy the call to the underlying InputFormatSourceFunction, in order

Flink does not checkpoint if operator is in Finished state

2018-10-15 Thread Kien Truong
Hi, As mentioned in the title, my Flink job will not check point if there are any finished source operator. Is this a bug or is it working as intended ? Regards, Kien

Re: How to clear keyed states periodically?

2018-09-14 Thread Kien Truong
Hi Paul, We have actually done something like this. Clearing a state with rocksdb state backend can actually be a very expensive operation, and block the operators for minutes with large states. To mitigate that, there are 2 approaches that we are using 1. Keeping the state small by increasing

Re: How to get past "bad" Kafka message, restart, keep state

2018-06-20 Thread Kien Truong
Hi, You can use FlatMap instead of Map, and only collect valid elements. Regards, Kien On 6/20/2018 7:57 AM, chrisr123 wrote: First time I'm trying to get this to work so bear with me. I'm trying to learn checkpointing with Kafka and handling "bad" messages, restarting without losing

Re: Multiple Task Slots support in Flink 1.5

2018-05-31 Thread Kien Truong
Hi, We're using multiple slots per TaskManager with legacy mode, and everything works fine. For the new default mode, it also seems to works for us, so I'm not sure what is not supported. May be someone from Flink team could clarify. Best regards, Kien On 5/31/2018 4:26 AM, Abdul

Re: Multiple hdfs

2018-05-22 Thread Kien Truong
Thanks 2018-05-22 15:00 GMT+01:00 Kien Truong <duckientru...@gmail.com <mailto:duckientru...@gmail.com>>: Hi, If your cluster are not high-availability clusters then just use the full path to the cluster. For example, to refer to directory

Re: Multiple hdfs

2018-05-22 Thread Kien Truong
Hi, If your cluster are not high-availability clusters then just use the full path to the cluster. For example, to refer to directory /checkpoint on cluster1, use hdfs://namenode1_ip:port/checkpoint Like wise, /data on cluster2 will be hdfs://namenode2_ip:port/data If your cluster is a

Re: Submitting Flink application on YARN parameter

2018-04-28 Thread Kien Truong
Hi, You have to enable CPU scheduling in YARN, otherwise it always shows that only 1 CPU is allocated for each container, regardless of how many Flink try to allocated. TaskManager memory is 1400MB, but Flink reserves some amount for for off-heap memory, so the actual heap size is smaller.

Re: debug for Flink

2018-04-20 Thread Kien Truong
during the system running? And then they can further be analyzed by category them by level? Best, Stephen On Apr 19, 2018, at 7:19 AM, Kien Truong <duckientru...@gmail.com> wrote: Hi, Our most useful tool when debugging Flink is actually the simple log files, because debugger just slow

Re: Substasks - Uneven allocation

2018-04-19 Thread Kien Truong
Hi Pedro, You can try to call either .rebalance() or|.shuffle()| || |before the Async operator. Shuffle might give a better result if you have fewer tasks than parallelism. Best regards, Kien | On 4/18/2018 11:10 PM, PedroMrChaves wrote: Hello, I have a job that has one async operational

Re: debug for Flink

2018-04-19 Thread Kien Truong
Hi, Our most useful tool when debugging Flink is actually the simple log files, because debugger just slow things down too much for us. However, having to re-deploy the entire cluster to change the logging level is a pain (we use YARN), so we would really like an easier method to change

Re: Consumer offsets not visible in Kafka

2018-04-19 Thread Kien Truong
Hi, That tool only shows active consumer-groups that make use of the automatic partitions assignment API. Flink use the manual partitions assignment API, so it will now show up there. The best way to monitor kafka offset with Flink is using Flink's own metrics system. Otherwise, you

Re: Error running on Hadoop 2.7

2018-03-22 Thread Kien Truong
Hi Ashish, Yeah, we also had this problem before. It can be solved by recompiling Flink with HDP version of Hadoop according to instruction here: https://ci.apache.org/projects/flink/flink-docs-release-1.4/start/building.html#vendor-specific-versions Regards, Kien On 3/22/2018 12:25 AM,

Re: Strange behavior on filter, group and reduce DataSets

2018-03-16 Thread Kien Truong
Hi, Just a guest, but string compare in Java should be using equals method, not == operator. Regards, Kien On 3/16/2018 9:47 PM, simone wrote: /subject.getField("field1") == "";// /

Cannot used managed keyed state in sink

2018-02-23 Thread Kien Truong
Hi, It seems that I can't used managed keyed state inside sink functions. Is this unsupported with Flink 1.4 or am I doing something wrong ? Regards, Kien ⁣Sent from TypeApp ​

Re: Java heap size in YARN

2018-02-15 Thread Kien Truong
<pawelbartosze...@gmail.com> wrote: >Thanks Kien. I will at least play with the setting :) We use hadoop >(s3) as >a chekpoint store. In our case off heap memory is around 300MB as >reported >on task manager statistic page. > >15 lut 2018 17:24 "Kien Truong" <duck

Re: Java heap size in YARN

2018-02-15 Thread Kien Truong
Hi, The relevant settings is: |containerized.heap-cutoff-ratio|: (Default 0.25) Percentage of heap space to remove from containers started by YARN. When a user requests a certain amount of memory for each TaskManager container (for example 4 GB), we can not pass this amount as the maximum

Re: Reduce parallelism without network transfer.

2018-02-05 Thread Kien Truong
work, please let us know. > >Piotrek > >> On 3 Feb 2018, at 04:39, Kien Truong <duckientru...@gmail.com> wrote: >> >> Hi, >> >> Assuming that I have a streaming job, using 30 task managers with 4 >slot each. I want to change the parallelism of 1 opera

Re: RocksDB / checkpoint questions

2018-02-02 Thread Kien Truong
⁣Sent from TypeApp ​ On Feb 3, 2018, 10:48, at 10:48, Kien Truong <duckientru...@gmail.com> wrote: >Hi, >Speaking from my experience, if the distributed disk fail, the >checkpoint will fail as well, but the job will continue running. The >checkpoint scheduler will keep run

Reduce parallelism without network transfer.

2018-02-02 Thread Kien Truong
Hi, Assuming that I have a streaming job, using 30 task managers with 4 slot each. I want to change the parallelism of 1 operator from 120 to 30. Are there anyway so that each subtask of this operator get data from 4 upstream subtasks running in the same task manager, thus avoiding network

Re: How does BucketingSink generate a SUCCESS file when a directory is finished

2018-02-01 Thread Kien Truong
Hi, I did not actually test this, but I think with Flink 1.4 you can extend BucketingSink and overwrite the invoke method to access the watermark Pseudo code: invoke(IN value, SinkFunction.Context context) {    long currentWatermark = context.watermark()    long taskIndex =

Re: Task Manager detached under load

2018-01-20 Thread Kien Truong
Hi, You should enable and check your garbage collection log. We've encountered case where Task Manager disassociated due to long GC pause. Regards, Kien On 1/20/2018 1:27 AM, ashish pok wrote: Hi All, We have hit some load related issues and was wondering if any one has some

Re: Flink on YARN

2018-01-20 Thread Kien Truong
Hi, You only need to install Flink on the node where you want to perform job submission. Regards, Kien On 1/20/2018 3:23 PM, Soheil Pourbafrani wrote: Hi, I have a YARN cluster(containing no Flink installation) that I want to run Flink application on that. I was wondering if it is

Re: Service discovery for flink-metrics-prometheus

2018-01-05 Thread Kien Truong
joscha Krettek <aljos...@apache.org> >wrote: > >> I'm not aware of how this is typically done but maybe Chesnay (cc'ed) >has >> an idea. >> >> > On 14. Dec 2017, at 16:55, Kien Truong <duckientru...@gmail.com> >wrote: >> > >> > Hi

Re: NullPointerException with Avro Serializer

2017-12-20 Thread Kien Truong
It turn out that our flink branch is out-of-date. Sorry for all the noise. :) Regards, Kien ⁣Sent from TypeApp ​ On Dec 20, 2017, 16:42, at 16:42, Kien Truong <duckientru...@gmail.com> wrote: >Upon further investigation, we found out that the reason: > >* The cluster was

Re: NullPointerException with Avro Serializer

2017-12-20 Thread Kien Truong
the default doesn't work for us. Best regards, Kien ⁣Sent from TypeApp ​ On Dec 20, 2017, 14:09, at 14:09, Kien Truong <duckientru...@gmail.com> wrote: >Hi, > >After upgrading to Flink 1.4, we encounter this exception > >Caused by: java.lang.NullPointerException: in >com.viett

NullPointerException with Avro Serializer

2017-12-19 Thread Kien Truong
Hi, After upgrading to Flink 1.4, we encounter this exception Caused by: java.lang.NullPointerException: in com.viettel.big4g.avro.LteSession in long null of long in field tmsi of com.viettel.big4g.avro.LteSession at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:161)

Service discovery for flink-metrics-prometheus

2017-12-14 Thread Kien Truong
Hi, Does anyone have recommendations about integrating flink-metrics-prometheus with some SD mechanism so that Prometheus can pick up the Task Manager's location dynamically ? Best regards, Kien

Re: [Docs] Can't add metrics to RichFilterFunction

2017-12-14 Thread Kien Truong
That syntax is incorrect, should be. @transient private var counter:Counter = _ Regards, Kien On 12/14/2017 8:03 PM, Julio Biason wrote: @transient private var counter:Counter

Re: Non-intrusive way to detect which type is using kryo ?

2017-12-01 Thread Kien Truong
that will make it easier to tell if a type is a POJO or not. I have some utility in mind like `ensurePojo(MyType.class)` that would throw an exception with a reason why this type must be treated as a generic type. Would this help in your case? Regards, Timo Am 11/28/17 um 2:40 AM schrieb Kien Truong: Hi

Non-intrusive way to detect which type is using kryo ?

2017-11-27 Thread Kien Truong
Hi, Are there any way to only log when Kryo serializer is used? It's a pain to disable generic type then try to solve the exception one by one. Best regards, Kien

Bad entry in block exception with RocksDB

2017-11-22 Thread Kien Truong
Hi, We are seeing this exception in one of our job, whenever a check point or save point is performed. java.lang.RuntimeException: Error while adding data to RocksDB at org.apache.flink.contrib.streaming.state.RocksDBListState.add(RocksDBListState.java:119) at

Re: Kafka consumer to sync topics by event time?

2017-11-22 Thread Kien Truong
Hi, When you join multiple stream with different watermarks, the resulting stream's watermark will be the smallest of the input watermark, as long as you don't explicitly assign a new watermarks generator. In your example, if small_topic has watermark at time t1, big_topic has watermark

Re: [Flink] merge-sort for a DataStream

2017-11-16 Thread Kien Truong
Hi Jiewen, Since a DataStream can have infinite number of elements, you can't globally sorted all the elements. If the number of element is finite, you can use the DataSet API, which will look smth like this DataSet a; DataSet aFlatten = a.flatMap(..); DataSet aSorted =

Re: Flink drops messages?

2017-11-13 Thread Kien Truong
Getting late elements from side-output is already available with Flink 1.3 :) Regards, Kien On 11/13/2017 5:00 PM, Fabian Hueske wrote: Hi Andrea, you are right. Flink's window operators can drop messages which are too late, i.e., have a timestamp smaller than the last watermark. This is

Re: Apache Flink - Question about thread safety for stateful collections (MapState)

2017-11-12 Thread Kien Truong
recommendation. Thanks again. On Sunday, November 12, 2017 1:16 AM, Jörn Franke <jornfra...@gmail.com> wrote: Be careful though with racing conditions . On 12. Nov 2017, at 02:47, Kien Truong <duckientru...@gmail.com <mailto:duckientru...@gmail.com>> wrote: Hi Mans, Th

Re: Apache Flink - Question about thread safety for stateful collections (MapState)

2017-11-11 Thread Kien Truong
Hi Mans, They're not executed in the same thread, but the methods that called them are synchronized[1] and therefore thread-safe. Best regards, Kien [1]

Re: Flink memory usage

2017-11-04 Thread Kien Truong
Hi, How did you measure the memory usage ? JVM processes tend to occupy the maximum memory allocated to them, regardless of whether those memory are actively in used or not. To correctly measure the memory usage, you should use Flink's metric system[1] Regards, Kien [1]

Re: Local combiner on each mapper in Flink

2017-10-26 Thread Kien Truong
Hi, For Streaming API, use a ProcessFunction as Fabian's suggestion. You can pretty much do anything with a ProcessFunction :) Best regards, Kien On 10/26/2017 8:01 PM, Le Xu wrote: Hi Kien: Is there a similar API for DataStream as well? Thanks! Le On Oct 26, 2017, at 7:58 AM, Kien

Re: Local combiner on each mapper in Flink

2017-10-26 Thread Kien Truong
Hi, For batch API, you can use GroupReduceFunction, which give you the same benefit as a MapReduce combiner. https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/dataset_transformations.html#combinable-groupreducefunctions Regards, Kien On 10/26/2017 7:37 PM, Le Xu wrote:

Watermark on connected stream

2017-10-19 Thread Kien Truong
Hi, If I connect two stream with different watermark, how are the watermark of the resulting stream determined ? Best regards, Kien

Re: Off heap memory issue

2017-10-18 Thread Kien Truong
Hi, We saw a similar issue in one of our job due to ByteBuffer memory leak[1]. We fixed it using the solution in the article, setting -Djdk.nio.maxCachedBufferSize This variable is available for Java > 8u102 Best regards, Kien [1]http://www.evanjones.ca/java-bytebuffer-leak.html On

Re: Garbage collection concerns with Task Manager memory

2017-10-18 Thread Kien Truong
Hi, Yes, GC is still a major concern. Even G1 has a hard time dealing with >64GB heap in our experience. To mitigate, we run multiple TMs with smaller heap per machine, and use RocksDBStateBackend. Best regards, Kien On 10/18/2017 4:40 PM, Marchant, Hayden wrote: I read in the Flink

Re: Timer coalescing necessary?

2017-10-13 Thread Kien Truong
and a namespace). Best, Aljoscha On 13. Oct 2017, at 13:56, Kien Truong <duckientru...@gmail.com> wrote: Hi Aljoscha, Could you clarify how the timer system works right now ? For example, let's say I have a function F, with 3 keys that are registered to execute at processing time T.

Re: Timer coalescing necessary?

2017-10-13 Thread Kien Truong
different keys would not be possible right now. Best, Aljoscha On 13. Oct 2017, at 06:37, Kien Truong <duckientru...@gmail.com> wrote: Hi, We are having a streaming job where we use timers to implement key timeout for stateful functions. Should we implement coalescing logic to reduce the

Timer coalescing necessary?

2017-10-12 Thread Kien Truong
Hi, We are having a streaming job where we use timers to implement key timeout for stateful functions. Should we implement coalescing logic to reduce the number of timer trigger, or it is not necessary with Flink? Best regards, Kien

Re: Delete save point when using incremental checkpoint

2017-10-11 Thread Kien Truong
on is unfortunately mixing both terms, please expand which >> you're referring to. >> >> Regards, >> Chesnay >> >> >> On 11.10.2017 10:31, Kien Truong wrote: >> >>> Hi, >>> >>> When using increment checkpoint mode, can I delete the save point >that >>> the job recovered from after sometime ? Or do I have to keep that >>> checkpoint forever because it's a part of the snapshot chain ? >>> >>> Best regards, >>> Kien >>> >> >> >>

Delete save point when using incremental checkpoint

2017-10-11 Thread Kien Truong
Hi, When using increment checkpoint mode, can I delete the save point that the job recovered from after sometime ? Or do I have to keep that checkpoint forever because it's a part of the snapshot chain ? Best regards, Kien

Re: RocksDB segfault inside timer when accessing/clearing state

2017-10-09 Thread Kien Truong
reworking how the timers events are executed or interact with normal processing. Best, Stefan Am 07.10.2017 um 05:44 schrieb Kien Truong <duckientru...@gmail.com>: Hi, We are using processing timer to implement some state clean up logic. After switching from FsStateBackend to Rocks

Task Manager segfault randomly with RocksDB

2017-09-12 Thread Kien Truong
Hi all, Our task managers are segfaulting randomly when using RocksDB state backend, any tips regarding how to debug this situation are much appreciated. We are using Flink 1.3.2  with Yarn on Centos 7 Regards, Kiên

Re: Does RocksDB need a dedicated CPU?

2017-09-05 Thread Kien Truong
Hi, In my experience, RocksDB uses very little CPU, and doesn't need a dedicated CPU. However, it's quite disk intensive. You'd need fast, ideally dedicated SSDs to achieve the best performance. Regards, Kien On 9/5/2017 1:15 PM, Bowen Li wrote: Hi guys, Does RocksDB need a dedicated

Re: Exception for Scala anonymous class when restoring from state

2017-08-18 Thread Kien Truong
ed classes that are used as keys? >Maybe what you could try doing, as a means to avoid that for now, is to >make sure that the key classes are untouched. > >Please keep us updated on how this works out for you, I’ll continue to >look into it. > >Thanks, >Gordon > >

Exception for Scala anonymous class when restoring from state

2017-08-16 Thread Kien Truong
Hi, After some refactoring: moving some operator to separate functions/file, I'm encountering a lot of exceptions like these. The logic of the application did not change, and all the refactored operators are stateless, e.g simple map/flatmap/filter. Does anyone know how to fix/avoid/work

Re: Distribute crawling of a URL list using Flink

2017-08-14 Thread Kien Truong
ails. > >You basically pack your URL download into the asynchronous part and >collect >the resulting string for further processing in your pipeline. > > > >Nico > > >[1] >https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/ >asyncio.html > >

Re: Distribute crawling of a URL list using Flink

2017-08-14 Thread Kien Truong
Hi, While this task is quite trivial to do with Flink Dataset API, using readTextFile to read the input and a flatMap function to perform the downloading, it might not be a good idea. The download process is I/O bound, and will block the synchronous flatMap function, so the throughput

Re: Error during Kafka connection

2017-08-11 Thread Kien Truong
Hi, You mentioned that your kafka broker is behind a proxy. This could be a problem, because when the client try to get the cluster's topology, it will get the brokers ' private addresses , which is not reachable. Regards, Kien On Aug 11, 2017, 18:18, at 18:18, "Tzu-Li (Gordon) Tai"

Re: Using latency markers

2017-08-10 Thread Kien Truong
Hi, I just want to say we're having the same issues. There's no metric for latency when we attempted to export the metrics through graphite either. Regards, Kien On 8/10/2017 7:36 PM, Aljoscha Krettek wrote: Hi, I must admit that I've never used this but I'll try and look into it.

Constant write stall warning with RocksDB state backend

2017-08-02 Thread Kien Truong
want to ask if there are any downside to it? Best regards, Kien Truong

Re: Memory Leak - Flink / RocksDB ?

2017-07-27 Thread Kien Truong
in advance. Regards Shashwat On 26-Jul-2017, at 12:10 PM, Kien Truong <duckientru...@gmail.com <mailto:duckientru...@gmail.com>> wrote: Hi, What're your task manager memory configuration ? Can you post the TaskManager's log ? Regards, Kien On 7/25/2017 8:41 PM, Shashwat Rastog

Re: Memory Leak - Flink / RocksDB ?

2017-07-26 Thread Kien Truong
Hi, What're your task manager memory configuration ? Can you post the TaskManager's log ? Regards, Kien On 7/25/2017 8:41 PM, Shashwat Rastogi wrote: Hi, We have several Flink jobs, all of which reads data from Kafka do some aggregations (over sliding windows of (1d, 1h)) and writes

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-26 Thread Kien Truong
Hi Pedro, As long as there's no OutOfMemoryError/long garbage collection pause, there's nothing to worry about keeping memory allocated. The memory should be garbage-collected by the JVM when necessary. Regards, Kien On 7/25/2017 10:53 PM, PedroMrChaves wrote: Hello, Thank you for the

Re: Purging Late stream data

2017-07-25 Thread Kien Truong
Hi, One method you can use is using a ProcessFunction. In the process function, you get the timer service through the function context, which can then be used to schedule a task to clean up late data. Check out the docs for ProcessFunction

Re: Split Streams not working

2017-07-24 Thread Kien Truong
Hi, I meant adding a select function between the two consecutive select. Or if you use Flink 1.3, you can use the new side output functionality. Regards, Kien On 7/25/2017 7:54 AM, Kien Truong wrote: Hi, I think you're hitting this bug https://issues.apache.org/jira/browse/FLINK-5031

Re: Split Streams not working

2017-07-24 Thread Kien Truong
Hi, I think you're hitting this bug https://issues.apache.org/jira/browse/FLINK-5031 Try the workaround mentioned in a bug: add a map function between map and select Regards, Kien On 7/25/2017 3:14 AM, smandrell wrote: Basically, we are not splitting the streams correctly because when we

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-21 Thread Kien Truong
Hi, From the log, it doesn't seem that the task manager use a lot of memory. Can you post the output of top. Regards, Kien On 7/20/2017 1:23 AM, PedroMrChaves wrote: Hello, Whenever I submit a job to Flink that retrieves data from Kafka the memory consumption continuously increases. I've

Accept Avro object in ListCheckpointed interface

2017-07-21 Thread Kien Truong
Hi, ListCheckpointed only accept Serializable object at the moment, which make it cumbersome to checkpoint avro objects (have to convert them to byte arrays first). Is there any plan to support avro object directly? Best regards, Kien

Re: How can i merge more than one flink stream

2017-07-19 Thread Kien Truong
Hi, To expand on Fabian's answer, there's a few API for join. * connect - you have to provide a CoprocessFunction. * window join/cogroup - you provide  key selector functions, a time window and a join/cogroup function. With the first method, you have to write more code, in exchange for much

Re: High back-pressure after recovering from a save point

2017-07-16 Thread Kien Truong
in Flink 1.3.x. >> `FlinkKafkaConsumer#setStartFromLatest()` would be it. >> >> On 15 July 2017 at 12:33:53 AM, Stephan Ewen (se...@apache.org) >wrote: >> >> Can you try starting from the savepoint, but telling Kafka to start >from >> the latest offset? >> &g

Re: High back-pressure after recovering from a save point

2017-07-14 Thread Kien Truong
>into backpressure situations? > >Thanks, >Stephan > > >On Fri, Jul 14, 2017 at 1:15 AM, Kien Truong <duckientru...@gmail.com> >wrote: > >> Hi Fabian, >> This happens to me even when the restore is immediate, so there's not >much >> data in Kafka to

Re: High back-pressure after recovering from a save point

2017-07-13 Thread Kien Truong
;Once the job caught-up, the backpressure should disappear. > >Best, Fabian > >2017-07-13 15:48 GMT+02:00 Kien Truong <duckientru...@gmail.com>: > >> Hi all, >> >> I have one job where back-pressure is significantly higher after >resuming >> from

High back-pressure after recovering from a save point

2017-07-13 Thread Kien Truong
Hi all, I have one job where back-pressure is significantly higher after resuming from a save point. Because that job makes heavy use of stateful functions with RocksDBStateBackend , I'm suspecting that this is the cause of performance degradation. Does anyone encounter simillar issues

Re: data loss after implementing checkpoint

2017-07-10 Thread Kien Truong
Hi, I think you need to create a savepoint and restore from there. https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/savepoints.html Checkpoint are for automatic recovery within the lifetime of a job, they're deleted when you stop the job manually. Regards, Kien On

Re: flink kafka consumer lag

2017-07-09 Thread Kien Truong
Hi, You should setup a metric reporter to collect Flink's metrics. https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/metrics.html There's a lot of useful information in the metrics, including the consumer lags. I'm using the Graphite reporter with InfluxDB for storage +

Re: Different CoGroup behavior inside DeltaIteration

2015-11-16 Thread Duc Kien Truong
iteration. Best, Kien Truong Sent using CloudMagic Email [https://cloudmagic.com/k/d/mailapp?ct=ta=8.0.55=5.1.1=email_footer_2] On Mon, Nov 16, 2015 at 11:02 PM, Stephan Ewen < se...@apache.org [se...@apache.org] > wrote: It is actually very important that the co group in delta iterations