On Thu, 05 Apr 2007 19:42:20 +0200 [EMAIL PROTECTED] wrote: > Only do the congestion wait when we actually encountered congestion.
The name congestion_wait() was accurate back in 2002, but it isn't accurate any more, and you got misled. It does not only wait for a queue to become uncongested. See clear_bdi_congested()'s callers. As long as the queue is in an uncongested state, we deliver wakeups to congestion_wait() blockers on every IO completion. As I said before, it is so that the MM's polling operations poll at a higher frequency when the IO system is working faster. (It is also to synchronise with end_page_writeback()'s feeding of clean pages to us via rotate_reclaimable_page()). Page reclaim can get into trouble without any request queue having entered a congested state. For example, think about a machine which has a single disk, and the operator has increased that disk's request queue size to 100,000. With your patch all the VM's throttling would be bypassed and we go into a busy loop and declare OOM instantly. There are probably other situations in which page reclaim gets into trouble without a request queue being congested. Minor point: bdi_congested() can be arbitrarily expensive - for DM stackups it is roughly proportional to the number of subdevices in the device. We need to be careful about how frequently we call it. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/