regression with poll(2)?

2012-08-15 Thread Sage Weil
I'm experiencing a stall with Ceph daemons communicating over TCP that occurs reliably with 3.6-rc1 (and linus/master) but not 3.5. The basic situation is: - the socket is two processes communicating over TCP on the same host, e.g. tcp0 2164849 10.214.132.38:6801 10.214.132.38:5

Re: regression with poll(2)

2012-08-20 Thread Andrew Morton
On Mon, 20 Aug 2012 11:30:59 +0200 Eric Dumazet wrote: > On Mon, 2012-08-20 at 10:04 +0100, Mel Gorman wrote: > > > Can the following patch be tested please? It is reported to fix an fio > > regression that may be similar to what you are experiencing but has not > > been picked up yet. > > > >

Re: regression with poll(2)

2012-08-20 Thread Eric Dumazet
On Mon, 2012-08-20 at 16:20 -0700, Andrew Morton wrote: > On Mon, 20 Aug 2012 11:30:59 +0200 > Eric Dumazet wrote: > > > On Mon, 2012-08-20 at 10:04 +0100, Mel Gorman wrote: > > > > > Can the following patch be tested please? It is reported to fix an fio > > > regression that may be similar to w

Re: regression with poll(2)

2012-08-21 Thread Mel Gorman
On Mon, Aug 20, 2012 at 09:54:59AM -0700, Sage Weil wrote: > > > > > > > I've retested several times and confirmed that this change leads to the > > > breakage, and also confirmed that reverting it on top of -rc1 also fixes > > > the problem. > > > > > > I've also added some additional instrum

Re: regression with poll(2)

2012-08-21 Thread Andrew Morton
On Mon, 20 Aug 2012 10:02:05 -0700 Linus Torvalds wrote: > On Mon, Aug 20, 2012 at 2:04 AM, Mel Gorman wrote: > > > > Can the following patch be tested please? It is reported to fix an fio > > regression that may be similar to what you are experiencing but has not > > been picked up yet. > > A

Re: regression with poll(2)?

2012-08-15 Thread Atchley, Scott
On Aug 15, 2012, at 3:46 PM, Sage Weil wrote: > I'm experiencing a stall with Ceph daemons communicating over TCP that > occurs reliably with 3.6-rc1 (and linus/master) but not 3.5. The basic > situation is: > > - the socket is two processes communicating over TCP on the same host, e.g. > >

Re: regression with poll(2)?

2012-08-15 Thread Sage Weil
On Wed, 15 Aug 2012, Atchley, Scott wrote: > On Aug 15, 2012, at 3:46 PM, Sage Weil wrote: > > > I'm experiencing a stall with Ceph daemons communicating over TCP that > > occurs reliably with 3.6-rc1 (and linus/master) but not 3.5. The basic > > situation is: > > > > - the socket is two proce

Re: regression with poll(2)

2012-08-19 Thread Sage Weil
I've bisected and identified this commit: netvm: propagate page->pfmemalloc to skb The skb->pfmemalloc flag gets set to true iff during the slab allocation of data in __alloc_skb that the PFMEMALLOC reserves were used. If the packet is fragmented, it is possible that page

Re: regression with poll(2)

2012-08-20 Thread Eric Dumazet
On Sun, 2012-08-19 at 11:49 -0700, Sage Weil wrote: > I've bisected and identified this commit: > > netvm: propagate page->pfmemalloc to skb > > The skb->pfmemalloc flag gets set to true iff during the slab allocation > of data in __alloc_skb that the PFMEMALLOC reserves were

Re: regression with poll(2)

2012-08-20 Thread Mel Gorman
On Sun, Aug 19, 2012 at 11:49:31AM -0700, Sage Weil wrote: > I've bisected and identified this commit: > > netvm: propagate page->pfmemalloc to skb > > The skb->pfmemalloc flag gets set to true iff during the slab allocation > of data in __alloc_skb that the PFMEMALLOC reserve

Re: regression with poll(2)

2012-08-20 Thread Eric Dumazet
On Mon, 2012-08-20 at 10:04 +0100, Mel Gorman wrote: > Can the following patch be tested please? It is reported to fix an fio > regression that may be similar to what you are experiencing but has not > been picked up yet. > > - This seems to help here. Boot your machine with "mem=768M" or a bit

Re: regression with poll(2)

2012-08-20 Thread Sage Weil
On Mon, 20 Aug 2012, Mel Gorman wrote: > On Sun, Aug 19, 2012 at 11:49:31AM -0700, Sage Weil wrote: > > I've bisected and identified this commit: > > > > netvm: propagate page->pfmemalloc to skb > > > > The skb->pfmemalloc flag gets set to true iff during the slab allocation > > o

Re: regression with poll(2)

2012-08-20 Thread Linus Torvalds
On Mon, Aug 20, 2012 at 2:04 AM, Mel Gorman wrote: > > Can the following patch be tested please? It is reported to fix an fio > regression that may be similar to what you are experiencing but has not > been picked up yet. Andrew, is this in your queue, or should I take this directly, or what? It