On Tue, Mar 13, 2012 at 01:54:38PM +0100, Svatopluk Kraus wrote:
> On Mon, Mar 12, 2012 at 7:19 PM, Konstantin Belousov
> <kostik...@gmail.com> wrote:
> > On Mon, Mar 12, 2012 at 04:00:58PM +0100, Svatopluk Kraus wrote:
> >> Hi,
> >>
> >> I have solved a following problem. If a big file (according to
> >> 'hidirtybuffers') is being written, the write speed is very poor.
> >>
> >> It's observed on system with elan 486 and 32MB RAM (i.e., low speed
> >> CPU and not too much memory) running FreeBSD-9.
> >>
> >> Analysis: A file is being written. All or almost all dirty buffers
> >> belong to the file. The file vnode is almost all time locked by
> >> writing process. The buf_daemon() can not flush any dirty buffer as a
> >> chance to acquire the file vnode lock is very low. A number of dirty
> >> buffers grows up very slow and with each new dirty buffer slower,
> >> because buf_daemon() eats more and more CPU time by looping on dirty
> >> buffers queue (with very low or no effect).
> >>
> >> This slowing down effect is started by buf_daemon() itself, when
> >> 'numdirtybuffers' reaches 'lodirtybuffers' threshold and buf_daemon()
> >> is waked up by own timeout. The timeout fires at 'hz' period, but
> >> starts to fire at 'hz/10' immediately as buf_daemon() fails to reach
> >> 'lodirtybuffers' threshold. When 'numdirtybuffers' (now slowly)
> >> reaches ((lodirtybuffers + hidirtybuffers) / 2) threshold, the
> >> buf_daemon() can be waked up within bdwrite() too and it's much worse.
> >> Finally and with very slow speed, the 'hidirtybuffers' or
> >> 'dirtybufthresh' is reached, the dirty buffers are flushed, and
> >> everything starts from beginning...
> > Note that for some time, bufdaemon work is distributed among bufdaemon
> > thread itself and any thread that fails to allocate a buffer, esp.
> > a thread that owns vnode lock and covers long queue of dirty buffers.
> >
> However, the problem starts when numdirtybuffers reaches
> lodirtybuffers count and ends around hidirtybuffers count. There are
> still plenty of free buffers in system.
>
> >>
> >> On the system, a buffer size is 512 bytes and the default
> >> thresholds are following:
> >>
> >> vfs.hidirtybuffers = 134
> >> vfs.lodirtybuffers = 67
> >> vfs.dirtybufthresh = 120
> >>
> >> For example, a 2MB file is copied into flash disk in about 3
> >> minutes and 15 seconds. If dirtybufthresh is set to 40, the copy time
> >> is about 20 seconds.
> >>
> >> My solution is a mix of three things:
> >> 1. Suppression of buf_daemon() wakeup by setting bd_request to 1 in
> >> the main buf_daemon() loop.
> > I cannot understand this. Please provide a patch that shows what do
> > you mean there.
> >
>       curthread->td_pflags |= TDP_NORUNNINGBUF | TDP_BUFNEED;
>       mtx_lock(&bdlock);
>       for (;;) {
> -             bd_request = 0;
> +             bd_request = 1;
>               mtx_unlock(&bdlock);
Is this a complete patch? The change just causes lost wakeups for bufdaemon,
nothing more.
>
> I read description of bd_request variable. However, bd_request should
> serve as an indicator that buf_daemon() is in sleep. I.e., the
> following paradigm should be used:
>
> mtx_lock(&bdlock);
> bd_request = 0;    /* now, it's only time when wakeup() will be meaningful */
> sleep(&bd_request, ..., hz/10);
> bd_request = 1;    /* in case of timeout, we must set it (bd_wakeup()
>                       already set it) */
> mtx_unlock(&bdlock);
>
> My patch follows the paradigm. What happens without the patch in
> described problem: buf_daemon() fails in its job and goes to sleep
> with hz/10 period. It supposes that next early wakeup will do nothing
> too. bd_request is untouched but buf_daemon() doesn't know if its last
> wakeup was made by bd_wakeup() or by timeout. So, bd_request could be
> 0 and buf_daemon() can be waked up before hz/10 just by bd_wakeup().
> Moreover, setting bd_request to 0 when buf_daemon() is not in sleep
> can cause time consuming and useless wakeup() calls without effect.
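
To make the quoted proposal concrete, here is a minimal userland sketch of
the flag protocol it describes. It is not the kernel code and not the patch
itself: struct bd_model, model_bd_wakeup() and run_model() are hypothetical
names invented for illustration, and the model assumes that a waker calls
wakeup() only when bd_request is 0 and that the daemon clears bd_request
before every sleep (which the one-line diff above does not do on its own).
It only counts how many wakeup() calls would be issued while the modelled
daemon is awake, depending on the value stored at the top of the loop.

```c
#include <assert.h>

/* Hypothetical userland model of the bd_request flag; not FreeBSD code. */
struct bd_model {
    int bd_request;      /* 1: a wakeup is pending or the daemon is awake */
    int sleeping;        /* 1 while the modelled daemon is asleep */
    int useless_wakeups; /* wakeup() issued while the daemon was awake */
    int useful_wakeups;  /* wakeup() issued while the daemon was asleep */
};

/* Waker side (assumption): call wakeup() only if no request is pending. */
static void model_bd_wakeup(struct bd_model *m)
{
    if (m->bd_request == 0) {
        m->bd_request = 1;
        if (m->sleeping)
            m->useful_wakeups++;
        else
            m->useless_wakeups++;
    }
}

/*
 * Run `passes` daemon iterations. Wakers call model_bd_wakeup() `calls`
 * times during each flush pass and again during each sleep.
 * loop_top_value is what the daemon stores in bd_request at the loop top:
 * 0 models the existing code, 1 models the proposed change.
 */
static struct bd_model run_model(int loop_top_value, int passes, int calls)
{
    struct bd_model m = {0, 0, 0, 0};

    assert(passes >= 0 && calls >= 0);
    for (int p = 0; p < passes; p++) {
        m.bd_request = loop_top_value;  /* top of the daemon loop */
        m.sleeping = 0;
        for (int i = 0; i < calls; i++) /* wakers run during the flush */
            model_bd_wakeup(&m);

        m.bd_request = 0;               /* about to sleep: wakeups matter */
        m.sleeping = 1;
        for (int i = 0; i < calls; i++) /* wakers run during the sleep */
            model_bd_wakeup(&m);

        m.bd_request = 1;               /* awake again, by wakeup or timeout */
    }
    return m;
}
```

Under these assumptions, run_model(0, passes, calls) records one wakeup()
per pass that hits an already-running daemon, while run_model(1, passes,
calls) records none, and both record the same number of wakeups during
sleep. The model says nothing about the branch where the daemon sleeps
without clearing bd_request, which is where wakeups could be lost.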