Re: AsyncFunction vs Async Sink

2023-06-15 Thread Teoh, Hong
share the same underlying implementation and the features like batching and rate limiting can benefit both? Best Lu On Wed, Jun 14, 2023 at 2:20 PM Teoh, Hong mailto:lian...@amazon.co.uk>> wrote: Hi Lu, Thanks for your question. See below for my understanding. I would recommend using

Re: Interaction between idling sources and watermark alignment

2023-06-15 Thread Teoh, Hong
Hi Alexis, below is my understanding: > I see that, in Flink 1.17.1, watermark alignment will be supported (as beta) > within a single source's splits and across different sources. I don't see > this explicitly mentioned in the documentation, but I assume that the concept > of "maximal drift"

Re: Async IO operator to write to DB

2023-06-14 Thread Teoh, Hong
Hi Karthik, As an additional update, since Flink 1.15 we have introduced the asynchronous sink base, which allows easy writing of an asynchronous sink (you simply provide the client, request and response handling) and the sink base will handle the state management, retries and batching. See

Re: AsyncFunction vs Async Sink

2023-06-14 Thread Teoh, Hong
Hi Lu, Thanks for your question. See below for my understanding. I would recommend using the Async Sink if you are writing to the external service as the final output of your job graph, and if you don’t have the ordered requirement that updates to the external system must be done before

Re: OOM taskmanager

2023-01-26 Thread Teoh, Hong
Hi Marco, When you say OOM, I assume you mean TM pod being OOMKilled, is that correct? If so, this usually means that the TM is using more than the actual memory allocated to the pod. First I would check your memory configuration to figure out where this extra memory use is coming from. This

Re: How to write custom serializer for dynamodb connector

2022-11-08 Thread Teoh, Hong
Hi Matt, First of all, awesome that you are using the DynamoDB sink! To resolve your issue with serialization in the DDB sink, you are right, the issue only happens when you create the AttributeValue object in a previous operator and send it to the sink. The issue here is with serializing of

Re: Is Flink SQL a good fit for alerting?

2022-07-27 Thread Teoh, Hong
Re-pasting from Slack [cid:image001.png@01D8A1E9.DA582010] Hong Teoh 7 hours ago I can give some examples, but they are all using DataStream API

Re: Using RocksDBStateBackend and SSD to store states, application runs slower..

2022-07-21 Thread Teoh, Hong
Hi, I’d say it seems you are trying to identify bottlenecks in your job, and are currently looking at RocksDB Disk I/O as one of the bottlenecks. However, there are also other bottlenecks (e.g. CPU/memory/network/sink throttling), and from what you described, it’s possible that the HDFS sink

Re: Flink Job Manager unable to recognize Task Manager Available slots

2022-05-24 Thread Teoh, Hong
Hi Sunitha, Without more information about your setup, I would assume you are trying to return JobManager (and HA setup) into a stable state. A couple of questions: * Since your job is cancelled, I would assume that the current job’s HA state is not important, so we can delete the

RE: Batching in kinesis sink

2022-05-12 Thread Teoh, Hong
Hi Zain, For Flink 1.13, we use the KinesisProducerLibrary. If you are using aggregation, you can control the maximum size of aggregated records by configuring the AggregationMaxSize in the producer config when constructing the FlinkKinesisProducer. (See [1] for more docs)