[ 
https://issues.apache.org/jira/browse/KAFKA-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364422#comment-17364422
 ] 

Sagar Rao commented on KAFKA-8295:
----------------------------------

Yeah that makes sense. I haven't looked at the count() APIs so can't comment at 
this moment. I will go through the implementation and also find the feasibility 
of it.

Other option that I was thinking was to expose a counter based State store? 
Basically, users use it as a counter for the keys. This would be agnostic of 
any time window. I guess currently, even to achieve that, users will have to do 
a read modify write.

 

But, this may not be an issue as all records against the same key would end up 
on the same stream processor thread, so technically multiple threads won't be 
accessing the key leading to race conditions.

Do you think a counter would be useful in state stores?

> Optimize count() using RocksDB merge operator
> ---------------------------------------------
>
>                 Key: KAFKA-8295
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8295
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Assignee: Sagar Rao
>            Priority: Major
>
> In addition to regular put/get/delete RocksDB provides a fourth operation, 
> merge. This essentially provides an optimized read/update/write path in a 
> single operation. One of the built-in (C++) merge operators exposed over the 
> Java API is a counter. We should be able to leverage this for a more 
> efficient implementation of count()
>  
> (Note: Unfortunately it seems unlikely we can use this to optimize general 
> aggregations, even if RocksJava allowed for a custom merge operator, unless 
> we provide a way for the user to specify and connect a C++ implemented 
> aggregator – otherwise we incur too much cost crossing the jni for a net 
> performance benefit)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to