[ 
https://issues.apache.org/jira/browse/KAFKA-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750785#comment-16750785
 ] 

ASF GitHub Bot commented on KAFKA-7652:
---------------------------------------

guozhangwang commented on pull request #6191: KAFKA-7652: Part III; Put to 
underlying before Flush
URL: https://github.com/apache/kafka/pull/6191
 
 
   This builds on the Part II PR and hence should only be reviewed once the 
Part II PR is merged.
   
   1) In the caching layer's flush listener call, we should always write to the 
underlying store before flushing (see point 4 of 
https://github.com/apache/kafka/pull/4331 for a detailed explanation). The fix 
in 4331 only touched KV stores, but it turns out that we should fix the window 
and session stores as well.
   
   2) Also apply the optimization that was already in the session store: when 
the new value bytes and old value bytes are both null (this is possible, e.g., 
if there is a put(K, V) followed by a remove(K) or put(K, null), and both 
operations only hit the cache), then upon flushing this means the underlying 
store does not have this value at all and no intermediate value has been sent 
downstream either. In this case we can skip both putting a null to the 
underlying store and calling the flush listener to send `null -> null`.
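
As an illustration only (this is not the code from the PR), the two points 
above can be sketched in plain Java. The class name, the `onDirtyEntry` method, 
the String-valued map standing in for the underlying store, and the `events` 
list are all hypothetical; the real stores work on bytes and use Kafka Streams 
internals that are not reproduced here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: models the order of side effects when a dirty cache
// entry is flushed. None of these names come from Kafka Streams itself.
final class FlushOrderSketch {
    // Stand-in for the persistent store beneath the cache.
    final Map<String, String> underlying = new HashMap<>();
    // Records side effects in order so the ordering can be inspected.
    final List<String> events = new ArrayList<>();

    // Called once per dirty cache entry on flush. oldValue is what the
    // underlying store held before the entry was cached.
    void onDirtyEntry(String key, String newValue, String oldValue) {
        // Point 2: nothing was ever stored or forwarded, so skip entirely.
        if (newValue == null && oldValue == null) {
            return;
        }
        // Point 1: write to the underlying store first ...
        if (newValue == null) {
            underlying.remove(key);
        } else {
            underlying.put(key, newValue);
        }
        events.add("put " + key);
        // ... and only then notify the flush listener (modelled as an event).
        events.add("forward " + key + ": " + oldValue + " -> " + newValue);
    }

    public static void main(String[] args) {
        FlushOrderSketch sketch = new FlushOrderSketch();
        sketch.onDirtyEntry("a", "v1", null);  // stored, then forwarded
        sketch.onDirtyEntry("b", null, null);  // skipped: null -> null
        System.out.println(sketch.events);
    }
}
```

In this sketch the put for a key is always recorded before its forward, and 
the `null -> null` entry leaves no trace in either the store or the event list.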
   
   Modifies the corresponding unit tests.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 


> Kafka Streams Session store performance degradation from 0.10.2.2 to 0.11.0.0
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-7652
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7652
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2, 0.11.0.3, 1.1.1, 2.0.0, 
> 2.0.1
>            Reporter: Jonathan Gordon
>            Priority: Major
>         Attachments: kafka_10_2_1_flushes.txt, kafka_11_0_3_flushes.txt
>
>
> I'm creating this issue in response to [~guozhang]'s request on the mailing 
> list:
> [https://lists.apache.org/thread.html/97d620f4fd76be070ca4e2c70e2fda53cafe051e8fc4505dbcca0321@%3Cusers.kafka.apache.org%3E]
> We are attempting to upgrade our Kafka Streams application from 0.10.2.1 but 
> are experiencing a severe performance degradation. Most of the CPU time 
> seems to be spent retrieving from the local cache. Here's an example thread 
> profile with 0.11.0.0:
> [https://i.imgur.com/l5VEsC2.png]
> When things are running smoothly we're gated by retrieving from the state 
> store with acceptable performance. Here's an example thread profile with 
> 0.10.2.1:
> [https://i.imgur.com/IHxC2cZ.png]
> Some investigation reveals that we appear to be performing about 3 orders 
> of magnitude more lookups on the NamedCache over a comparable time period. 
> I've attached the NamedCache flush logs for 0.10.2.1 and 0.11.0.3.
> We're using session windows and have the app configured for 
> commit.interval.ms = 30 * 1000 and cache.max.bytes.buffering = 10485760
> I'm happy to share more details if they would be helpful. Also happy to run 
> tests on our data.
> I also found this issue, which seems like it may be related:
> https://issues.apache.org/jira/browse/KAFKA-4904
>  


