
Re: Query regarding groupbykey in streams

2024-05-15 Thread Matthias J. Sax

If I read this correctly, your upstream producer, which writes into the 
input topic of your KS app, is using a custom partitioner?


If you do a `groupByKey()` and change the key upstream, it would result 
in a repartition step, which would fall back to the default partitioner.
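
To illustrate why that fallback matters, here is a minimal, dependency-free sketch. The custom rule (`customPartition`, routing by key length) is purely hypothetical, and `hashPartition` is only a simplified stand-in for a hash-based default partitioner (Kafka's actual default hashes the serialized key with murmur2, which is not reproduced here). The point is just that the two rules can place the same key on different partitions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PartitionMismatchDemo {

    // Hypothetical custom rule (an assumption, not from this thread): route by key length.
    static int customPartition(String key, int numPartitions) {
        return key.length() % numPartitions;
    }

    // Simplified stand-in for a hash-based default partitioner.
    // NOT Kafka's murmur2; Arrays.hashCode keeps the sketch self-contained.
    static int hashPartition(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Counts the keys that the two rules would send to different partitions.
    static long countMismatches(String[] keys, int numPartitions) {
        return Arrays.stream(keys)
                .filter(k -> customPartition(k, numPartitions) != hashPartition(k, numPartitions))
                .count();
    }

    public static void main(String[] args) {
        int numPartitions = 10; // partition count mentioned in the thread
        String[] keys = {"a", "order-1", "order-22", "user-333"}; // made-up sample keys
        for (String key : keys) {
            System.out.printf("key=%s custom=%d hash-based=%d%n",
                    key, customPartition(key, numPartitions), hashPartition(key, numPartitions));
        }
        System.out.println("mismatches: " + countMismatches(keys, numPartitions));
    }
}
```

Any key counted as a mismatch would sit on one partition in the input topic and on a different one in the repartition topic.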


If you want to use a custom partitioner in KS, you should implement 
`StreamPartitioner` instead of the producer partitioner interface, and 
pass it into the relevant methods.


`groupByKey()` does not allow setting a partitioner (this seems to be a 
gap we should close...). As a workaround, you could add `repartition()` 
before the grouping to pass in your custom partitioner.


For IQ, you also need to pass your `StreamPartitioner` to allow KS to 
find the correct partition to query.
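
A rough sketch of how the `repartition()` workaround and the IQ lookup could be wired together, written against the 3.7 API mentioned in this thread and requiring the `kafka-streams` dependency. The topic name, store name, repartition name, and the routing rule inside `PARTITIONER` are all placeholders I made up; the rule would need to mirror whatever your producer-side partitioner actually does. Note that in 3.7 the single abstract method of `StreamPartitioner` is the (deprecated) `partition(...)`, which is what the lambda implements; other versions may differ.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyQueryMetadata;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Repartitioned;
import org.apache.kafka.streams.processor.StreamPartitioner;

class CustomPartitionerSketch {

    // Hypothetical routing rule (assumption): replace with the same logic
    // your producer-side custom partitioner uses.
    static final StreamPartitioner<String, String> PARTITIONER =
            (topic, key, value, numPartitions) -> Math.floorMod(key.hashCode(), numPartitions);

    static StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String())) // placeholder topic
                // Explicit repartition step that carries the custom partitioner,
                // since groupByKey() itself offers no way to pass one.
                .repartition(Repartitioned.<String, String>as("by-custom-partitioner") // placeholder name
                        .withKeySerde(Serdes.String())
                        .withValueSerde(Serdes.String())
                        .withStreamPartitioner(PARTITIONER))
                .groupByKey()
                .count(Materialized.as("counts-store")); // placeholder store name
        return builder;
    }

    // For IQ: pass the same partitioner so the lookup resolves the partition
    // (and therefore the host) that the custom rule assigns to the key.
    static KeyQueryMetadata locate(KafkaStreams streams, String key) {
        return streams.queryMetadataForKey("counts-store", key, PARTITIONER);
    }
}
```

The important part is using one and the same partitioner instance (or at least the same logic) for both the repartition step and the metadata lookup.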


HTH.

-Matthias


On 5/13/24 4:35 PM, Dev Lover wrote:

Hi All,

I have a custom partitioner to distribute the data across partitions in my
cluster.

My setup looks like below
Version - 3.7.0
Kafka - 3 broker setup
Partition count - 10
Stream server pods - 2
Stream threads in each pod - 10
Deployed in Kubernetes
Custom partitioner on producer end.

I am doing a groupByKey. Is it correct to use it when I have a custom
partitioner on the producer end?
I recently migrated to 3.7 from 3.5.1. I am observing that partitions are
not evenly distributed across my 2 stream pods. Also, my remote query is
failing with the host being unavailable. But if I restart streams, it works
fine for a certain time and then starts erroring out again. Am I doing
something wrong?

Regards
