Hi Druid community,

I would like to start a discussion on a new ingestion mode for Apache Druid
backed by Kafka 4.0 Share Groups (KIP-932 and subsequent iterations).

Tracking progress: https://github.com/apache/druid/issues/18439

Motivation & Vision:
A large class of Druid ingestion use cases is inherently task-queue-like, not 
stream-ordered:

- Distributed System Monitoring: Log lines from thousands of microservices. 
Whether a log from service A arrives before service B is irrelevant — the query 
is "total ERROR count in the last 5 minutes."
- IoT Fleet Analytics: Temperature readings from geographically dispersed 
sensors. Each reading is an independent unit of work. The relative arrival 
order of readings from a sensor in Singapore vs. one in Oslo carries no 
semantic meaning.
- Security Threat Detection: Netflow records analyzed for volume patterns. 
Threats are identified by aggregate attributes, not microsecond sequencing 
across network segments.
- API Observability: Billions of HTTP request records. The query is p99 
latency per endpoint per minute. The order of individual requests does not 
affect the answer.

For all of these, the correct primitive is a work queue: N workers pull items 
from a shared pool, process them independently, and signal completion. Kafka 
Share Groups implement exactly this at the broker level, removing the need for 
Druid to solve it through partition management.
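
To make the work-queue primitive concrete, here is a minimal in-memory sketch
in Python. It does not use Kafka or Druid at all: the queue stands in for the
shared pool, the threads stand in for ingestion workers, and `task_done()`
stands in for an acknowledgement. The function name `run_work_queue` and the
record fields are hypothetical, chosen only for illustration.

```python
import queue
import threading
from collections import Counter


def run_work_queue(records, num_workers=3):
    """Count records per log level using a shared pool of workers.

    In-memory stand-in for a work queue; not a Kafka or Druid API.
    """
    pool = queue.Queue()
    for record in records:
        pool.put(record)

    counts = Counter()
    lock = threading.Lock()

    def worker():
        while True:
            try:
                # Pull the next available item; any worker may take it.
                record = pool.get_nowait()
            except queue.Empty:
                return
            with lock:
                # Each record is processed independently of the others.
                counts[record["level"]] += 1
            # Signal completion, loosely analogous to an acknowledgement.
            pool.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


# Hypothetical log records from four services; every third one is an ERROR.
records = [
    {"service": f"svc-{i % 4}", "level": "ERROR" if i % 3 == 0 else "INFO"}
    for i in range(12)
]
result = run_work_queue(records)
```

The aggregate is the same regardless of how many workers run or which worker
handles which record, which is the property the use cases above rely on.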

I will share a draft design doc (and a draft PR) for further discussion.

Looking forward to feedback.

Thanks,
Shekhar Rajak
GitHub: @Shekharrajak