Hey! I was hoping I could get some input from people more experienced with Kafka Streams to determine whether it would be a good fit for my use case.
I have multi-tenant clients submitting data to a Kafka topic, which they want ETL'd to a third-party service. I'd like to batch and group the records by tenant over a time window of somewhere between 1 and 5 minutes, then, at the end of each window, issue one API request per tenant to the third-party service with that tenant's batch of data.

Other points of note:
- Ideally we'd have exactly-once semantics; sending data multiple times would typically be bad. But we'd also need to gracefully handle things like API request errors and third-party service outages.
- We currently use Storm for stream processing, but the long-running time windows and the potentially large amount of data held in memory make me nervous about using it for this.

Thoughts?

Thanks in advance!
Stephen
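For what it's worth, here's a rough sketch of the windowed, per-tenant batching I have in mind, written against the Kafka Streams DSL. Topic names, serdes, and the `sendBatch` call are placeholders I made up for illustration, not a working integration:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.TimeWindows;

public class TenantBatcher {

    public StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("tenant-events", Consumed.with(Serdes.String(), Serdes.String()))
            // Records are keyed by tenant id, so grouping by key
            // groups per tenant.
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            // 5-minute tumbling windows. Window state lives in a local
            // RocksDB store backed by a changelog topic, not just heap.
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
            // Collect each tenant's records into a batch. A real serde
            // for List<String> would be needed here (omitted).
            .aggregate(
                ArrayList::new,
                (tenant, value, batch) -> { batch.add(value); return batch; })
            // Emit only the final result when the window closes, so we
            // make one call per tenant per window instead of per update.
            .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
            .toStream()
            .foreach((windowedTenant, batch) ->
                sendBatch(windowedTenant.key(), batch));

        return builder;
    }

    private void sendBatch(String tenant, List<String> batch) {
        // Placeholder: call the third-party service here, with
        // retries/backoff for request errors and outages.
    }
}
```

My understanding is that Kafka Streams' exactly-once guarantee only covers the Kafka-to-Kafka path; the HTTP call is an external side effect, so I assume we'd still need idempotency keys or dedup on the third-party side.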