[ 
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082004#comment-15082004
 ] 

xingang commented on KAFKA-3062:
--------------------------------

Yes! Example: 

A large volume of data is produced to more than 60 partitions, and 15 
consumers work on this data.

10 of them are latency-sensitive and do near-real-time processing. It is 
better for them to read the data from the page cache. Sometimes even a little 
data loss can be tolerated, since they only display processing results in 
real time.

5 of them generate reports from the data. These can be hourly or even daily 
jobs, and they do not need to show their results quickly.

Consider what happens if the 5 report-processing consumers fall behind: they 
will read from disk and fill the page cache with historical data, since 
consuming historical data is N times faster than the producing rate. As a 
result, the 10 latency-sensitive consumers suffer, since they always see page 
cache misses once they fall even slightly behind.
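
To make the effect concrete, here is a toy simulation. It is only a sketch 
under loud assumptions: the page cache is modeled as a plain LRU over 
fixed-size pages (the real OS page cache is more sophisticated), and every 
name and number below (LruPageCache, simulate, capacity, lag) is hypothetical 
and not taken from Kafka.

```python
from collections import OrderedDict


class LruPageCache:
    """Toy stand-in for the OS page cache: plain LRU over page ids (assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def write(self, page):
        # A freshly produced page lands in the cache.
        self._touch(page)

    def read(self, page):
        # Returns True on a cache hit; on a miss the page is loaded "from disk".
        hit = page in self.pages
        self._touch(page)
        return hit

    def _touch(self, page):
        self.pages[page] = True
        self.pages.move_to_end(page)
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page


def simulate(catch_up_pages_per_tick, ticks=1000, capacity=100,
             realtime_lag=20, history=100_000):
    """Hit rate of a near-real-time consumer trailing the producer by realtime_lag pages.

    catch_up_pages_per_tick plays the role of "N": how many historical pages
    the lagging report consumers read for each page the producer appends.
    """
    cache = LruPageCache(capacity)
    hits = 0
    backlog_pos = 0  # lagging consumers start at the oldest page
    for tick in range(ticks):
        head = history + tick
        cache.write(head)  # producer appends one page per tick
        for _ in range(catch_up_pages_per_tick):
            cache.read(backlog_pos)  # historical read pulled in from disk
            backlog_pos += 1
        if cache.read(head - realtime_lag):  # near-real-time consumer
            hits += 1
    return hits / ticks


if __name__ == "__main__":
    print("no lagging consumers:  ", simulate(0))
    print("lagging consumers, N=10:", simulate(10))
```

In this model the near-real-time consumer hits the cache almost every time 
when nobody is catching up, and misses almost every time once the lagging 
consumers read 10 historical pages per produced page, because the recent 
pages are evicted before the slightly lagging reader reaches them.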


Thanks for your quick response!


> Read from kafka replication to get data likely Version based
> ------------------------------------------------------------
>
>                 Key: KAFKA-3062
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3062
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: xingang
>
> Kafka currently requires all reads to go through the leader for consistency.
> If reads could also be served from replicas, then for data with a number of 
> consumers, the consumers that are not latency-sensitive but are sensitive to 
> data loss could fetch their data from a replica. In that case, they would not 
> pollute the page cache for the other consumers, which are latency-sensitive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
