Re: Elastic Block Store as checkpoint storage

2023-07-17 Thread Konstantin Knauf
Hi Prabhu, this should be possible, but is quite expensive in comparison to AWS S3 and you have to remount the EBS volumes to the new Taskmanagers in case of a failure which takes some non-trivial time, which slows down recovery. So, overall I don't think its peferrable compared to S3. We do use

Flink Table API + Jacoco Plugin

2023-07-17 Thread Brendan Cortez
Hi all! I'm trying to use the jacoco-maven-plugin and run unit-tests for Flink Table API, but they fail with an error (see file error_log_flink_17.txt for full error stacktrace in attachment): java.lang.IllegalArgumentException: fromIndex(2) > toIndex(0) ... I'm using: - Flink: - flink-table-ap

Re: Async IO For Cassandra

2023-07-17 Thread Giannis Polyzos
Hi Pritam.. since this is a look-up to an external system considering there is network i/o in place and also the time to get the results it might be normal to notice backpressure there. Also note that the queries in Cassandra highly depend on the data model, so data can be easy to find between the

Re: Async IO For Cassandra

2023-07-17 Thread Shammon FY
Hi Pritam, I'm sorry that I'm not familiar with Cassandra. If your async function is always the root cause for backpressure, I think you can check the latency for the async request in your function and log some metrics. By the way, I think you can add cache in your async function to speedup the l

Re: Checkpoint size smaller than Savepoint size

2023-07-17 Thread Shammon FY
Hi Neha, I think you can first check whether the options `state.backend` and `state.backend.incremental` you mentioned above exist in `JobManager`->`Configuration` in Flink webui. If they do not exist, you may be using the wrong conf file. Best, Shammon FY On Mon, Jul 17, 2023 at 5:04 PM Neha .

Unsubscribe

2023-07-17 Thread wang
Unsubscribe

Re: Async IO For Cassandra

2023-07-17 Thread Pritam Agarwala
Hi Team, Any input on this will be really helpful. Thanks! On Tue, Jul 11, 2023 at 12:04 PM Pritam Agarwala < pritamagarwala...@gmail.com> wrote: > Hi Team, > > > I am using "AsyncDataStream.unorderedWait" to connect to cassandra . The > cassandra lookup operators are becoming the busy opera

Re: Checkpoint size smaller than Savepoint size

2023-07-17 Thread Neha . via user
Hi Shammon, state.backend: rocksdb state.backend.incremental: true This is already set in the Flink-conf. Anything else that should be taken care of for the incremental checkpointing? Is there any related bug in Flink 1.16.1? we have recently migrated to Flink 1.16.1 from Flink 1.13.6. What can b

Re: TCP Socket stream scalability

2023-07-17 Thread Flavio Pompermaier
I had a similar situation with my Elasticsearch source where you don't know before executing the query (via scroll API for example) how many splits you will find. How should you handle those situation with new Source API? On Mon, Jul 17, 2023 at 10:09 AM Martijn Visser wrote: > Hi Kamal, > > It

Re: PyFlink SQL from Kafka to Iceberg issues

2023-07-17 Thread Martijn Visser
Hi Dani, Plugins need to be placed in a folder inside the plugins directory, I think that might be the problem. Best regards, Martijn On Sun, Jul 9, 2023 at 7:00 PM Dániel Pálma wrote: > Thanks for the tips Martijn! > > I've fixed the library versions to 1.16 everywhere and also decided to >

Re: Hadoop Error on ECS Fargate

2023-07-17 Thread Martijn Visser
Hi Mengxi Wang, Which Flink version are you using? Best regards, Martijn On Thu, Jul 13, 2023 at 3:21 PM Wang, Mengxi X via user < user@flink.apache.org> wrote: > Hi community, > > > > We got this kuerberos error with Hadoop as file system on ECS Fargate > deployment. > > > > Caused by: org.ap

Re: TCP Socket stream scalability

2023-07-17 Thread Martijn Visser
Hi Kamal, It would require you to find a way to create a TCP connection on task managers where you would only read the assigned part of the TCP connection. Looking at the protocol itself, that most likely would be an issue. A TCP connection would also be problematic in case of replays and checkpoi