Re: multi-threaded consumer configuration like stream threads?

Liam Clarke-Hutchinson Tue, 27 Oct 2020 02:21:02 -0700

Hi Pushkar,

No, there's no such setting. You'd need to combine a consumer with a thread
pool or similar, whichever you prefer. As the KafkaConsumer Javadoc says (from
https://kafka.apache.org/26/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
):


> We have intentionally avoided implementing a particular threading model for
> processing. This leaves several options for implementing multi-threaded
> processing of records.
> 1. One Consumer Per Thread
> A simple option is to give each thread its own consumer instance. Here are
> the pros and cons of this approach:
>
>    - *PRO*: It is the easiest to implement
>
>
>    - *PRO*: It is often the fastest as no inter-thread co-ordination is
>    needed
>
>
>    - *PRO*: It makes in-order processing on a per-partition basis very
>    easy to implement (each thread just processes messages in the order it
>    receives them).
>
>
>    - *CON*: More consumers means more TCP connections to the cluster (one
>    per thread). In general Kafka handles connections very efficiently so this
>    is generally a small cost.
>
>
>    - *CON*: Multiple consumers means more requests being sent to the
>    server and slightly less batching of data which can cause some drop in I/O
>    throughput.
>
>
>    - *CON*: The number of total threads across all processes will be
>    limited by the total number of partitions.
>
> 2. Decouple Consumption and Processing
> Another alternative is to have one or more consumer threads that do all
> data consumption and hand off ConsumerRecords
> <https://kafka.apache.org/26/javadoc/org/apache/kafka/clients/consumer/ConsumerRecords.html>
> instances to a blocking queue consumed by a pool of processor threads that
> actually handle the record processing. This option likewise has pros and cons:
>
>    - *PRO*: This option allows independently scaling the number of
>    consumers and processors. This makes it possible to have a single consumer
>    that feeds many processor threads, avoiding any limitation on partitions.
>
>
>    - *CON*: Guaranteeing order across the processors requires particular
>    care: because the threads execute independently, an earlier chunk of data
>    may actually be processed after a later chunk of data just due to the luck
>    of thread execution timing. For processing that has no ordering
>    requirements this is not a problem.
>
>
>    - *CON*: Manually committing the position becomes harder as it
>    requires that all threads co-ordinate to ensure that processing is complete
>    for that partition.
>
> There are many possible variations on this approach. For example each
> processor thread can have its own queue, and the consumer threads can hash
> into these queues using the TopicPartition to ensure in-order consumption
> and simplify commit.
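
To make that last variation concrete, here is a minimal sketch in plain Java
with no Kafka dependency. PartitionDispatcher and Rec are hypothetical names I
made up for illustration; Rec stands in for a ConsumerRecord and carries only a
partition number and an offset. Each single-thread executor plays the role of
"a processor thread with its own queue", and the polling thread picks the
executor from the partition, so records of one partition are always handled by
the same thread in the order they were handed over.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PartitionDispatcher {

    /** Hypothetical stand-in for a ConsumerRecord: only partition and offset. */
    public static final class Rec {
        public final int partition;
        public final long offset;

        public Rec(int partition, long offset) {
            this.partition = partition;
            this.offset = offset;
        }
    }

    // One single-thread executor per worker: a thread plus its own FIFO queue.
    private final List<ExecutorService> workers = new ArrayList<>();

    // For demonstration only: offsets per partition, in the order processed.
    private final Map<Integer, List<Long>> processed = new ConcurrentHashMap<>();

    public PartitionDispatcher(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    /** Called from the polling thread; routes by partition to keep order. */
    public void dispatch(Rec rec) {
        int index = Math.floorMod(rec.partition, workers.size());
        workers.get(index).submit(() -> process(rec));
    }

    // Placeholder for the real per-record work.
    private void process(Rec rec) {
        processed
            .computeIfAbsent(rec.partition,
                p -> Collections.synchronizedList(new ArrayList<>()))
            .add(rec.offset);
    }

    /** Stops accepting work, waits for the queues to drain, returns results. */
    public Map<Integer, List<Long>> shutdownAndGet() {
        for (ExecutorService worker : workers) {
            worker.shutdown();
        }
        try {
            for (ExecutorService worker : workers) {
                if (!worker.awaitTermination(10, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("worker did not finish in time");
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return processed;
    }
}
```

In a real application the dispatch() calls would come from the loop that
iterates over what consumer.poll() returns, and you would hash the
TopicPartition rather than a bare partition number. Note the executor queues
here are unbounded, so a slow processor would need some form of back-pressure
(pausing partitions on the consumer, for example).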

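On the "manually committing becomes harder" point: if several threads work on
records from the same partition, you can only commit up to the first offset
that is not yet finished. Below is a rough sketch of that bookkeeping for a
single partition. OffsetTracker is a hypothetical helper, not part of the Kafka
client, and it assumes the offsets handed to the workers are consecutive; real
partitions can have gaps (compaction, transaction markers), so a production
version would track the offsets it actually received instead.

```java
import java.util.TreeSet;

public class OffsetTracker {

    // First offset not yet known to be fully processed. In Kafka the committed
    // offset is the position of the next record to read, so this is the value
    // that would be committed for the partition.
    private long nextToCommit;

    // Offsets finished out of order, all greater than nextToCommit.
    private final TreeSet<Long> done = new TreeSet<>();

    public OffsetTracker(long startOffset) {
        this.nextToCommit = startOffset;
    }

    /** Called by a processor thread when it has finished one record. */
    public synchronized void markDone(long offset) {
        if (offset < nextToCommit) {
            return; // already covered by the committable position
        }
        done.add(offset);
        // Advance over the contiguous run of finished offsets, if any.
        while (!done.isEmpty() && done.first() == nextToCommit) {
            done.pollFirst();
            nextToCommit++;
        }
    }

    /** Called by the consumer thread before it commits. */
    public synchronized long committablePosition() {
        return nextToCommit;
    }
}
```

The consumer thread would keep one tracker per assigned partition and commit
committablePosition() for each of them from its own thread, since the consumer
itself is not safe for multi-threaded access.
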

Cheers,

Liam Clarke-Hutchinson

On Tue, Oct 27, 2020 at 8:04 PM Pushkar Deole <pdeole2...@gmail.com> wrote:

> Hi,
>
> Is there any configuration in kafka consumer to specify multiple threads
> the way it is there in kafka streams?
> Essentially, can we have a consumer with multiple threads where the threads
> would divide partitions of topic among them?
>
