Consumers can only fetch data up to the committed offset, and the reason is reliability and durability: if a broker crashes, some consumers might have seen the new data while others have not, since the data was not yet committed and is lost. Data is committed when it is flushed, so if you delay the flushing, consumers won't see those messages until then.
Even though you flush periodically based on log.flush.interval.messages and log.flush.interval.ms, if the segment file is still in the page cache, consumers will benefit from it and the OS won't read it from disk again.

On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du <[email protected]> wrote:

> Hi,
>
> As I understand it, Kafka brokers will store the incoming messages in the
> pagecache as much as possible and then flush them to disk, right?
>
> But in my experiment where 90 producers are publishing data to 6 brokers,
> I see that the log directory on disk where the broker stores the data is
> constantly increasing (every second). Why is this happening? Does this
> have to do with the default "log.flush.interval" setting?
>
> I want the broker to write to disk less often when serving some on-line
> consumers, to reduce latency. I tested that the disk write speed on my
> broker is around 110MB/s.
>
> Thanks for any replies.
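
For reference, here is a rough sketch of those flush settings as they would appear in the broker's server.properties. The values are only illustrative (the right numbers depend on your workload), and as far as I know the defaults effectively disable both, leaving the OS to decide when to write dirty pages out:

  # server.properties -- illustrative values, not recommendations
  # Force a flush after this many messages accumulate in a partition's log
  log.flush.interval.messages=10000
  # Force a flush once a message has been sitting in the log this many ms
  log.flush.interval.ms=1000

Raising these (or leaving them at their defaults) means the broker flushes less often and lets the page cache absorb the writes, which is what you are after if you want fewer disk writes while serving on-line consumers.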
