[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15658875#comment-15658875
 ] 

ASF GitHub Bot commented on BOOKKEEPER-968:
-------------------------------------------

Github user dlg99 commented on the issue:

    https://github.com/apache/bookkeeper/pull/77
  
    @merlimat limiting the write rate won't help. One can write at a relatively 
low rate, but since the entry log only gets fileChannel.force() on log rotation, 
we end up with a lot of writes cached in memory. 
    Once the OS decides it is time to flush the data to disk (because the cache 
is too full or because of an explicit fileChannel.force()), we have no control 
over the rate of writes. It flushes at the maximum possible speed: flushing 
2048M on an SSD with 400M/sec write throughput takes about 5 sec if there are 
no other writes/reads. 
    For example, in our production env I have configured a flush to happen 
after every 10MB of entry log writes, and so far I am seeing far fewer read 
latency spikes. I'll get some charts to show during the meetup.
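
    The "flush every N bytes" idea can be sketched roughly as below. This is a 
minimal illustration only, not the code from the pull request: the class 
PeriodicFlushWriter, its flushIntervalBytes parameter and getForceCount() are 
hypothetical names; only FileChannel.force() comes from the discussion above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Hypothetical helper (not BookKeeper's entry logger): appends to a file and
 * calls FileChannel.force() once a configured number of bytes has been written,
 * so the amount of dirty data per flush stays bounded.
 */
public class PeriodicFlushWriter implements AutoCloseable {
    private final FileChannel channel;
    private final long flushIntervalBytes; // assumed knob, e.g. 10 * 1024 * 1024
    private long unflushedBytes = 0;
    private long forceCount = 0;

    public PeriodicFlushWriter(Path file, long flushIntervalBytes) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
        this.flushIntervalBytes = flushIntervalBytes;
    }

    /** Writes the whole buffer, then forces to disk if the threshold was reached. */
    public void write(ByteBuffer data) throws IOException {
        while (data.hasRemaining()) {
            unflushedBytes += channel.write(data);
        }
        if (unflushedBytes >= flushIntervalBytes) {
            channel.force(false); // false: file content only, metadata not required
            unflushedBytes = 0;
            forceCount++;
        }
    }

    /** Number of threshold-triggered force() calls so far (excludes close()). */
    public long getForceCount() {
        return forceCount;
    }

    @Override
    public void close() throws IOException {
        channel.force(true); // final flush, including metadata
        channel.close();
    }
}
```

    With flushIntervalBytes set to 10MB, each force() covers at most roughly 
10MB of entry log data, which at 400M/sec is on the order of 25 ms of disk time 
rather than about 5 sec for 2048M, ignoring other I/O.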
    
    One can experiment with the Linux config (vm.dirty_bytes / 
vm.dirty_background_bytes IIRC), but these are OS-wide settings (not per disk) 
and will affect other writes; e.g. I would not want to decrease these 
parameters to 10-50M for the rotational disk where we write application logs. 
Limiting them to the range of hundreds of MB does not help much with the 
specific problem at hand.



> Entry log flushes happen on log rotation and cause long spikes in IO 
> utilization
> --------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-968
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-968
>             Project: Bookkeeper
>          Issue Type: Improvement
>          Components: bookkeeper-server
>    Affects Versions: 4.5.0
>            Reporter: Andrey Yegorov
>            Assignee: Andrey Yegorov
>            Priority: Minor
>
> Caught this issue on servers with 128G of RAM. This is probably not an 
> issue on servers/VMs with less RAM.
> With the current implementation we end up with a single entry log flush 
> during log rotation.
> The OS tries to flush everything as fast as possible and saturates the disk. 
> This results in long periods of high latency (reads and writes).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
