RE: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Kirti Dhar Upadhyay K via user
Thanks Jiabao and Yaroslav for your quick responses. Regards, Kirti Dhar

Jobmanager restart after it has been requested to stop

2024-02-01 Thread Liting Liu (litiliu) via user
Hi, community: I'm running a Flink 1.14.3 job with flink-kubernetes-operator-1.6.0 on AWS. I found my Flink jobmanager container's thread restarted after this FlinkDeployment was requested to stop. Here is the log of the jobmanager: 2024-02-01 21:57:48,977

Re: Parallelism and Target TPS

2024-02-01 Thread Zhanghao Chen
Hi Patricia, Flink will create one Kafka consumer per parallel source instance; however, you'll need some testing to measure the capability of a single task. Usually, one consumer can consume at a much higher rate than 1 record per second. Best, Zhanghao Chen
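For illustration, a minimal sketch of wiring a KafkaSource with an explicit source parallelism; the broker address, topic, and group id are placeholder values, and the rate a single subtask can sustain still has to be measured for your record size and deserialization cost.

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("broker:9092")           // placeholder
            .setTopics("input-topic")                     // placeholder
            .setGroupId("tps-test")                       // placeholder
            .setStartingOffsets(OffsetsInitializer.latest())
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

    // One Kafka consumer is created per parallel source subtask;
    // parallelism beyond the topic's partition count leaves subtasks idle.
    env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
            .setParallelism(4);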

Re: Case for a Broadcast join + hint in Flink Streaming SQL

2024-02-01 Thread Prabhjot Bharaj via user
Hi Feng, Thanks for your prompt response. If we were to solve this in Flink, my higher-level viewpoint is: 1. First, implement a Broadcast join in Flink Streaming SQL that works across the Table API (e.g. via `left.join(right, , join_type="broadcast")`) 2. Then, support a Broadcast hint that would
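For reference, a small sketch of the BROADCAST join hint syntax that Flink SQL already accepts for batch jobs; the table and column names are made up, and the point of the thread is that this hint currently has no effect in streaming mode.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());

    // Hypothetical tables: a large fact table joined with a small dimension table.
    tEnv.executeSql(
            "SELECT /*+ BROADCAST(dim_location) */ o.order_id, d.region "
          + "FROM orders o "
          + "JOIN dim_location d ON o.location_id = d.location_id");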

Re: Case for a Broadcast join + hint in Flink Streaming SQL

2024-02-01 Thread Feng Jin
Hi Prabhjot, I think this is a reasonable scenario. If a regular join between a large table and a very small table is done without broadcasting, it can easily cause data skew. We have also encountered similar problems. Currently, we can only create multiple copies of the small table

Case for a Broadcast join + hint in Flink Streaming SQL

2024-02-01 Thread Prabhjot Bharaj via user
Hello folks, We have a use case with a few stream-stream joins that require joining a very large table with a much smaller table, essentially enriching the large table with a permutation of the smaller table (consider deriving all orders/sessions for a new location). Given the nature
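As a point of comparison, the DataStream API already offers the broadcast state pattern for exactly this kind of enrichment; a rough sketch follows, where `orderStream`, `locationStream`, and the Tuple2 record shapes are hypothetical stand-ins rather than anything from the original post.

    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.BroadcastStream;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
    import org.apache.flink.util.Collector;

    // Small side: (locationId, region). Large side: (orderId, locationId). Both hypothetical.
    DataStream<Tuple2<String, String>> locationStream = ...;
    DataStream<Tuple2<String, String>> orderStream = ...;

    MapStateDescriptor<String, String> locationsDesc =
            new MapStateDescriptor<>("locations", Types.STRING, Types.STRING);

    BroadcastStream<Tuple2<String, String>> broadcastLocations =
            locationStream.broadcast(locationsDesc);

    DataStream<String> enriched = orderStream
            .connect(broadcastLocations)
            .process(new BroadcastProcessFunction<Tuple2<String, String>, Tuple2<String, String>, String>() {
                @Override
                public void processElement(Tuple2<String, String> order,
                                           ReadOnlyContext ctx,
                                           Collector<String> out) throws Exception {
                    // Look up the broadcast (small) side for every element of the large side.
                    String region = ctx.getBroadcastState(locationsDesc).get(order.f1);
                    out.collect(order.f0 + "," + region);
                }

                @Override
                public void processBroadcastElement(Tuple2<String, String> location,
                                                    Context ctx,
                                                    Collector<String> out) throws Exception {
                    ctx.getBroadcastState(locationsDesc).put(location.f0, location.f1);
                }
            });

This keeps the small side fully replicated on every parallel instance of the join, which is effectively what a broadcast join hint would express in SQL.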

Re: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Yaroslav Tkachenko
The schema registry support is provided in the ConfluentRegistryAvroSerializationSchema class (flink-avro-confluent-registry package).
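A minimal sketch of plugging ConfluentRegistryAvroSerializationSchema into a KafkaSink; the broker address, topic, registry subject, registry URL, and Avro schema are placeholder values.

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.connector.base.DeliveryGuarantee;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;
    import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroSerializationSchema;

    Schema avroSchema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Event\","
          + "\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}");

    KafkaSink<GenericRecord> sink = KafkaSink.<GenericRecord>builder()
            .setBootstrapServers("broker:9092")                        // placeholder
            .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                    .setTopic("events")                                // placeholder
                    .setValueSerializationSchema(
                            ConfluentRegistryAvroSerializationSchema.forGeneric(
                                    "events-value",                    // registry subject
                                    avroSchema,
                                    "http://schema-registry:8081"))    // registry URL
                    .build())
            .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
            .build();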

Re: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Yaroslav Tkachenko
You can also implement a custom KafkaRecordSerializationSchema, which allows creating a ProducerRecord (see the "serialize" method) - you can set the message key, headers, etc. manually. It's supported in older versions.
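A hedged sketch of that custom-serializer approach for older Flink versions; the topic name and header key/value are made-up placeholders, and the element type is kept as String to keep the example short.

    import java.nio.charset.StandardCharsets;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.header.internals.RecordHeader;

    public class HeaderSettingSerializationSchema
            implements KafkaRecordSerializationSchema<String> {

        @Override
        public ProducerRecord<byte[], byte[]> serialize(
                String element, KafkaSinkContext context, Long timestamp) {
            ProducerRecord<byte[], byte[]> record = new ProducerRecord<>(
                    "events",                                   // placeholder topic
                    null,                                       // let Kafka choose the partition
                    timestamp,
                    element.getBytes(StandardCharsets.UTF_8),   // message key
                    element.getBytes(StandardCharsets.UTF_8));  // message value
            // Headers are attached directly on the ProducerRecord.
            record.headers().add(
                    new RecordHeader("source", "flink".getBytes(StandardCharsets.UTF_8)));
            return record;
        }
    }

The resulting schema is then passed to KafkaSink.builder().setRecordSerializer(...) as usual.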

RE: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Jiabao Sun
Sorry, I didn't notice the version information. This feature was completed in FLINK-31049[1] and will be released in version 3.1.0 of the Kafka connector. The release process[2] is currently underway and will be completed soon. However, version 3.1.0 does not promise support for Flink 1.16. If you need to
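Based on the thread, once the 3.1.0 connector is out the builder should accept a header provider; the sketch below is an assumption of how that might look (the setHeaderProvider method and the shape of the HeaderProvider callback are inferred from FLINK-31049, so check the released 3.1.0 API), with the topic and header values as placeholders.

    import java.nio.charset.StandardCharsets;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.kafka.common.header.internals.RecordHeader;
    import org.apache.kafka.common.header.internals.RecordHeaders;

    KafkaRecordSerializationSchema<String> schema =
            KafkaRecordSerializationSchema.<String>builder()
                    .setTopic("events")                               // placeholder
                    .setValueSerializationSchema(new SimpleStringSchema())
                    // Assumed builder API from FLINK-31049: one set of headers per record.
                    .setHeaderProvider(element -> new RecordHeaders()
                            .add(new RecordHeader("origin",
                                    "flink".getBytes(StandardCharsets.UTF_8))))
                    .build();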

RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Kirti Dhar Upadhyay K via user
Hi Jiabao, Thanks for the reply. Currently I am using Flink 1.16.1 and I am not able to find any HeaderProvider setter method in the KafkaRecordSerializationSchemaBuilder class. Although on GitHub I found this support here:

[ANNOUNCE] Apache flink-connector-opensearch 1.1.0 released

2024-02-01 Thread Danny Cranmer
The Apache Flink community is very happy to announce the release of Apache flink-connector-opensearch 1.1.0. This release supports Apache Flink 1.17 and 1.18. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data