On Thu, Aug 13, 2015 at 4:10 PM, Kishore Senji <kse...@gmail.com> wrote:

> Consumers can only fetch data up to the committed offset and the reason is
> reliability and durability on a broker crash (some consumers might get the
> new data and some may not as the data is not yet committed and lost). Data
> will be committed when it is flushed. So if you delay the flushing,
> consumers won't get those messages until that time.
>

As far as I know, this is not accurate.

A message is considered committed when all in-sync replicas (ISR) have
received it (this much is documented). Committing does not require writing to
disk; the flush to disk happens asynchronously.
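
For example, here is a minimal sketch of the settings that actually govern
the commit point (the values are only illustrative, not a recommendation):

  # producer side: wait until the full ISR has acknowledged the write
  acks=all

  # broker/topic side: minimum number of in-sync replicas that must
  # acknowledge a write for an acks=all produce request to succeed
  min.insync.replicas=2

The fsync to disk is controlled separately by the log.flush.* settings and is
left to the OS by default.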


>
> Even though you flush periodically based on log.flush.interval.messages and
> log.flush.interval.ms, if the segment file is in the pagecache, the
> consumers will still benefit from that pagecache and the OS won't read it
> again from disk.
>
> On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du <yuheng.du.h...@gmail.com>
> wrote:
>
> > Hi,
> >
> > As I understand it, Kafka brokers will store the incoming messages in the
> > pagecache as much as possible and then flush them to disk, right?
> >
> > But in my experiment, where 90 producers are publishing data to 6 brokers,
> > I see that the log directory on disk where the broker stores the data is
> > constantly increasing (every second). So why is this happening? Does this
> > have to do with the default "log.flush.interval" setting?
> >
> > I want the broker to write to disk less often when serving on-line
> > consumers, to reduce latency. I measured the disk write speed on my broker
> > to be around 110 MB/s.
> >
> > Thanks for any replies.
> >
>
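
On the log.flush.interval question quoted above: a rough sketch of the broker
properties involved (the values are only illustrative; by default both are
effectively unset, so flushing is left to the OS and durability comes from
replication):

  # force an fsync after this many accumulated messages
  log.flush.interval.messages=10000

  # force an fsync after this many milliseconds
  log.flush.interval.ms=1000

Also note that the segment files appear to grow continuously even before an
explicit flush, because appends go through the pagecache and the reported file
size grows as soon as messages are appended; these settings only control when
an fsync is forced.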
