Hello folks,

I'd like to ask the community for its opinion on the partitioning
functions in Kafka.

With KAFKA-2091 <https://issues.apache.org/jira/browse/KAFKA-2091>
integrated we are now able to have custom partitioners in the producer.
The question now becomes *which* partitioners should ship with Kafka?
This issue arose in the context of KAFKA-2092
<https://issues.apache.org/jira/browse/KAFKA-2092>, which implements a
specific load-balanced partitioning. This partitioner, however, assumes some
stages of processing on top of it to make proper use of the data, i.e., it
envisions Kafka as a substrate for stream processing, and not only as the
I/O component.
Is this a direction that Kafka wants to move in? Or is this a role
better left to the internal communication systems of other stream
processing engines (e.g., Storm)?
And if the answer is the latter, how would something such as Samza (which
relies mostly on Kafka as its communication substrate) be able to implement
advanced partitioning schemes?

Cheers,
--
Gianmarco
