Hello folks, I'd like to ask the community for its opinion on the partitioning functions in Kafka.
With KAFKA-2091 <https://issues.apache.org/jira/browse/KAFKA-2091> integrated, we can now use custom partitioners in the producer. The question now becomes *which* partitioners should ship with Kafka.

This issue arose in the context of KAFKA-2092 <https://issues.apache.org/jira/browse/KAFKA-2092>, which implements a specific load-balanced partitioning. That partitioner, however, assumes some stages of processing on top of it to make proper use of the data; i.e., it envisions Kafka as a substrate for stream processing, and not only as the I/O component.

Is this a direction that Kafka wants to go in? Or is this a role better left to the internal communication systems of other stream processing engines (e.g., Storm)? And if the answer is the latter, how would something such as Samza (which relies mostly on Kafka as its communication substrate) be able to implement advanced partitioning schemes?

Cheers,
-- Gianmarco