Hi Karthick,

At a high level, this looks like a data skew issue. Do some partitions have
far more data than others?
How many devices do you have, and how many messages are you processing?
Most of what you describe above sounds like you are looking for
suggestions on load distribution for Kafka, i.e. the number of partitions,
how to distribute your device data, etc.
It would also be good to share what your Flink job is doing, as I don't see
anything mentioned about that. Are you observing backpressure in the
Flink UI?
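
To make the skew question concrete, here is a rough Python sketch of how you could estimate per-partition load from per-device message counts. It is only an illustration under assumptions: the device names and traffic numbers are made up, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to keyed records, so the exact partition assignments will differ from your cluster.

```python
import zlib

NUM_PARTITIONS = 10  # the topic in the original post has ten partitions


def partition_for(device_id, num_partitions=NUM_PARTITIONS):
    """Map a device key to a partition.

    Assumption: CRC32 is a stand-in hash. Kafka's default partitioner
    hashes the key bytes with murmur2, so real assignments will differ.
    """
    return zlib.crc32(device_id.encode("utf-8")) % num_partitions


def partition_load(message_counts, num_partitions=NUM_PARTITIONS):
    """Sum the message counts of all devices that land on each partition."""
    load = [0] * num_partitions
    for device_id, count in message_counts.items():
        load[partition_for(device_id, num_partitions)] += count
    return load


def skew_ratio(load):
    """Busiest partition divided by the mean; 1.0 means perfectly even."""
    mean = sum(load) / len(load)
    return max(load) / mean if mean else 0.0


# Hypothetical traffic: 1000 quiet devices plus one very chatty device.
traffic = {f"device-{i}": 100 for i in range(1000)}
traffic["device-hot"] = 50_000

load = partition_load(traffic)
hot_partition = partition_for("device-hot")

if __name__ == "__main__":
    for p, n in enumerate(load):
        marker = "  <- hot device lands here" if p == hot_partition else ""
        print(f"partition {p}: {n} messages{marker}")
    print(f"skew ratio (max / mean): {skew_ratio(load):.2f}")
```

Running the same calculation on your real per-device message counts would show whether one partition is carrying most of the traffic, which is the first thing to rule out before changing the partitioning strategy.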

Best

On Fri, Sep 15, 2023 at 3:46 PM Karthick <ibmkarthickma...@gmail.com> wrote:

> Dear Apache Flink Community,
>
>
>
> I am writing to urgently address a critical challenge we've encountered in
> our IoT platform that relies on Apache Kafka and real-time data processing.
> We believe this issue is of paramount importance and may have broad
> implications for the community.
>
>
>
> In our IoT ecosystem, we receive data streams from numerous devices, each
> uniquely identified. To maintain data integrity and ordering, we've
> meticulously configured a Kafka topic with ten partitions, ensuring that
> each device's data is directed to its respective partition based on its
> unique identifier. This architectural choice has proven effective in
> maintaining data order, but it has also unveiled a significant problem:
>
>
>
> *One device's data processing slowness is interfering with other devices'
> data, causing a detrimental ripple effect throughout our system.*
>
> To put it simply, when a single device experiences processing delays, it
> acts as a bottleneck within the Kafka partition, leading to delays in
> processing data from other devices sharing the same partition. This issue
> undermines the efficiency and scalability of our entire data processing
> pipeline.
>
> Additionally, I would like to highlight that we are currently using the
> default partitioner for choosing the partition of each device's data. If
> there are alternative partitioning strategies that can help alleviate this
> problem, we are eager to explore them.
>
> We are in dire need of a high-scalability solution that not only ensures
> each device's data processing is independent but also prevents any
> interference or collisions between devices' data streams. Our primary
> objectives are:
>
> 1. *Isolation and Independence:* We require a strategy that guarantees
> one device's processing speed does not affect other devices in the same
> Kafka partition. In other words, we need a solution that ensures the
> independent processing of each device's data.
>
>
> 2. *Open-Source Implementation:* We are actively seeking pointers to
> open-source implementations or references to working solutions that address
> this specific challenge within the Apache ecosystem. Any existing
> projects, libraries, or community-contributed solutions that align with our
> requirements would be immensely valuable.
>
> We recognize that many Apache Flink users face similar issues and may have
> already found innovative ways to tackle them. We implore you to share your
> knowledge and experiences on this matter. Specifically, we are interested
> in:
>
> *- Strategies or architectural patterns that ensure independent processing
> of device data.*
>
> *- Insights into load balancing, scalability, and efficient data
> processing across Kafka partitions.*
>
> *- Any existing open-source projects or implementations that address
> similar challenges.*
>
>
>
> We are confident that your contributions will not only help us resolve
> this critical issue but also assist the broader Apache Flink community
> facing similar obstacles.
>
>
>
> Please respond to this thread with your expertise, solutions, or any
> relevant resources. Your support will be invaluable to our team and the
> entire Apache Flink community.
>
> Thank you for your prompt attention to this matter.
>
>
> Thanks & Regards
>
> Karthick.
>
