Eno Thereska created KAFKA-3973:
-----------------------------------

             Summary: Investigate feasibility of caching bytes vs. records
                 Key: KAFKA-3973
                 URL: https://issues.apache.org/jira/browse/KAFKA-3973
             Project: Kafka
          Issue Type: Sub-task
          Components: streams
            Reporter: Eno Thereska
            Assignee: Bill Bejeck


Currently the cache stores and accounts for records, not bytes or objects. This 
investigation would measure the performance overhead of storing bytes versus 
storing objects. As an outcome we should know whether 1) we should store bytes 
or 2) we should store objects. 

If we store objects, the cache still needs to know their size so that it can 
tell whether an object fits in the allocated cache space (e.g., if the cache is 
100MB and each object is 10MB, we'd have space for 10 such objects). The 
investigation needs to determine how to compute the size of an object 
efficiently in Java.
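
As a rough illustration of the accounting described above, here is a minimal 
sketch of an LRU cache bounded by estimated bytes rather than entry count. 
The class name SizeBoundedCache and the caller-supplied sizeOf estimator are 
hypothetical and not part of Kafka Streams. How to produce that estimate 
cheaply is exactly the open question: 
java.lang.instrument.Instrumentation#getObjectSize is one option, but it needs 
a Java agent and only reports an implementation-specific shallow size.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch: an LRU cache whose capacity is a byte budget.
final class SizeBoundedCache<K, V> {
    private final long maxBytes;
    // Caller-supplied size estimate for a value (an assumption of this sketch).
    private final ToLongFunction<V> sizeOf;
    // accessOrder=true makes iteration start at the least recently used entry.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long usedBytes = 0;

    SizeBoundedCache(long maxBytes, ToLongFunction<V> sizeOf) {
        this.maxBytes = maxBytes;
        this.sizeOf = sizeOf;
    }

    void put(K key, V value) {
        // Drop any previous value for this key and release its bytes.
        V old = entries.remove(key);
        if (old != null) {
            usedBytes -= sizeOf.applyAsLong(old);
        }
        long size = sizeOf.applyAsLong(value);
        if (size > maxBytes) {
            return; // larger than the whole budget, so it is not cached at all
        }
        // Evict least recently used entries until the new value fits.
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (usedBytes + size > maxBytes && it.hasNext()) {
            usedBytes -= sizeOf.applyAsLong(it.next().getValue());
            it.remove();
        }
        entries.put(key, value);
        usedBytes += size;
    }

    V get(K key) {
        return entries.get(key);
    }

    int size() {
        return entries.size();
    }

    long usedBytes() {
        return usedBytes;
    }
}
```

With a budget of 100 and values of size 10 (mirroring the 100MB / 10MB example 
at a smaller scale), ten values fit and an eleventh put evicts the least 
recently used one.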

If we store bytes, then we are serialising an object into bytes before caching 
it, i.e., we take a serialisation cost. The investigation needs to measure how 
bad this cost can be, especially for the case when all objects fit in cache 
(and thus any extra serialisation cost would show).
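
A crude way to get a first feel for that cost is to time the same put/get 
workload against a map of objects and a map of serialised bytes, with 
everything fitting in memory. The sketch below is only that: SerdeCostSketch 
and its helpers are hypothetical names, the UTF-8 String encoding stands in 
for whatever serialiser the application actually configures, and 
System.nanoTime timing is noisy. A proper measurement would use a benchmark 
harness such as JMH.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch comparing an object cache with a bytes cache.
final class SerdeCostSketch {
    // Stand-in serialiser (assumption): UTF-8 encoding of a String.
    static byte[] serialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    static String deserialize(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Nanoseconds for one put+get per value when objects are cached directly.
    static long timeObjectCache(String[] values) {
        Map<Integer, String> cache = new HashMap<>();
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < values.length; i++) {
            cache.put(i, values[i]);
            checksum += cache.get(i).length();
        }
        long elapsed = System.nanoTime() - start;
        // Use the checksum so the loop body cannot be optimised away.
        if (checksum < 0) throw new IllegalStateException("unreachable");
        return elapsed;
    }

    // Same workload, but every put serialises and every get deserialises.
    static long timeBytesCache(String[] values) {
        Map<Integer, byte[]> cache = new HashMap<>();
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < values.length; i++) {
            cache.put(i, serialize(values[i]));
            checksum += deserialize(cache.get(i)).length();
        }
        long elapsed = System.nanoTime() - start;
        if (checksum < 0) throw new IllegalStateException("unreachable");
        return elapsed;
    }

    public static void main(String[] args) {
        String[] values = new String[100_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = "value-" + i;
        }
        // A few warm-up rounds so the JIT has compiled both paths.
        for (int round = 0; round < 5; round++) {
            timeObjectCache(values);
            timeBytesCache(values);
        }
        System.out.printf("objects: %d ns, bytes: %d ns%n",
                timeObjectCache(values), timeBytesCache(values));
    }
}
```

The difference between the two timings is a rough upper bound on what caching 
bytes would add when nothing is ever evicted. It ignores the serialisation 
that objects would need anyway once they are flushed or forwarded.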



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
