On Mon, 2007-04-16 at 16:24 +0800, Zhou Yingchao wrote: > When we run a two nfs client and a nfs server in the following way, we > met a livelock / starvation condition. > > MachineA MachineB > Client1 Client2 > Server > > As shown in the figure, we run a client and server on one machine, and > run another client on another machine. When Client1 and Client2 make > many writes at the same time, the Client1's request is blocked until > Client2's writes finished. > > We check the code, Client1 is blocked in generic_file_write-> ... > >balance_dirty_pages, balance_dirty_pages call writeback_inodes to > (only) flush data of the related fs. > > In nfs, we found that the Server has enhanced its dirty_thresh. So in > the loop of writeback_inodes, Client1 has no data to write out, and > the condition "ns_reclaimable+wbs.nr_writeback<=dirty_thresh" will not > be true until Client2 finishes its write request to Server. So the > loop will only end after Client2 finished its write job. > > The problem in this path is: why we write only pages of the related fs > in writeback_inodes but check the dirty thresh for total pages?
I am working on patches to fix this. Current version at (against -mm): http://programming.kicks-ass.net/kernel-patches/balance_dirty_pages/ However, after a rewrite of the BDI statistics work there are some funnies, which I haven't had time to analyse yet :-/ I hope to post a new version soonish... - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

