List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-13 Thread Mathias Krause
Hi, this is an attempt to resurrect the thread initially started here: http://thread.gmane.org/gmane.linux.network/353003 As that patch fixed the issue for the mentioned reproducer, it did not fix the bug for the production code Olivier is using. :( Changing the reproducer only slightly allow

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-29 Thread Mathias Krause
On 29 September 2015 at 21:09, Jason Baron wrote: > However, if we call connect on socket 's', to connect to a new socket 'o2', we > drop the reference on the original socket 'o'. Thus, we can now close socket > 'o' without unregistering from epoll. Then, when we either close the ep > or unregiste
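
The sequence Jason Baron describes in the quoted text can be sketched in user space as follows. This is a minimal illustration, not the reproducer from the thread. It assumes Linux, datagram AF_UNIX sockets, and an `EPOLLOUT` registration. The function names `autobound_dgram_socket` and `replay_reconnect_sequence` are hypothetical, and the roles `s`, `o` and `o2` follow the quoted message. On a kernel without the problem the function should simply return 0; whether an affected kernel reports list corruption depends on details not visible in this snippet.

```c
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a datagram AF_UNIX socket and let Linux autobind it to an
 * abstract address (bind() with addrlen == sizeof(sa_family_t)).
 * The assigned address is read back with getsockname(). */
static int autobound_dgram_socket(struct sockaddr_un *addr, socklen_t *len)
{
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    *len = sizeof(*addr);
    if (bind(fd, (struct sockaddr *)addr, sizeof(sa_family_t)) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Replay the steps from the quoted description.
 * Returns 0 if every call succeeded, -1 otherwise. */
int replay_reconnect_sequence(void)
{
    struct sockaddr_un o_addr, o2_addr;
    socklen_t o_len = 0, o2_len = 0;
    struct epoll_event ev = { .events = EPOLLOUT };
    int rc = -1;
    int s = socket(AF_UNIX, SOCK_DGRAM, 0);
    int o = autobound_dgram_socket(&o_addr, &o_len);
    int o2 = autobound_dgram_socket(&o2_addr, &o2_len);
    int ep = epoll_create1(0);

    if (s < 0 || o < 0 || o2 < 0 || ep < 0)
        goto out;
    ev.data.fd = s;

    /* 1. Connect 's' to 'o' and register 's' with epoll. */
    if (connect(s, (struct sockaddr *)&o_addr, o_len) < 0)
        goto out;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, s, &ev) < 0)
        goto out;

    /* 2. Reconnect 's' to 'o2'; 's' no longer references 'o'. */
    if (connect(s, (struct sockaddr *)&o2_addr, o2_len) < 0)
        goto out;

    /* 3. Close 'o' while 's' is still registered with epoll. */
    close(o);
    o = -1;

    /* 4. Unregister 's' from epoll. */
    if (epoll_ctl(ep, EPOLL_CTL_DEL, s, NULL) < 0)
        goto out;
    rc = 0;
out:
    if (ep >= 0)
        close(ep);
    if (o2 >= 0)
        close(o2);
    if (o >= 0)
        close(o);
    if (s >= 0)
        close(s);
    return rc;
}
```

Calling `replay_reconnect_sequence()` from a small `main` and checking the return value is enough to walk through the steps. Autobinding is used here only so the sketch needs no filesystem paths or fixed socket names.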

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Michal Kubecek
On Wed, Sep 30, 2015 at 07:54:29AM +0200, Mathias Krause wrote: > On 29 September 2015 at 21:09, Jason Baron wrote: > > However, if we call connect on socket 's', to connect to a new socket 'o2', > > we > > drop the reference on the original socket 'o'. Thus, we can now close socket > > 'o' witho

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Rainer Weikusat
Mathias Krause writes: > On 29 September 2015 at 21:09, Jason Baron wrote: >> However, if we call connect on socket 's', to connect to a new socket 'o2', >> we >> drop the reference on the original socket 'o'. Thus, we can now close socket >> 'o' without unregistering from epoll. Then, when we e

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Mathias Krause
On 30 September 2015 at 12:56, Rainer Weikusat wrote: > Mathias Krause writes: >> On 29 September 2015 at 21:09, Jason Baron wrote: >>> However, if we call connect on socket 's', to connect to a new socket 'o2', >>> we >>> drop the reference on the original socket 'o'. Thus, we can now close so

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Rainer Weikusat
Mathias Krause writes: > On 30 September 2015 at 12:56, Rainer Weikusat > wrote: >> Mathias Krause writes: >>> On 29 September 2015 at 21:09, Jason Baron wrote: >>>> However, if we call connect on socket 's', to connect to a new socket 'o2', we drop the reference on the original soc

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Mathias Krause
On 30 September 2015 at 15:25, Rainer Weikusat wrote: > Mathias Krause writes: >> On 30 September 2015 at 12:56, Rainer Weikusat >> wrote: >>> In case you want some information on this: This is a kernel warning I >>> could trigger (more than once) on the single day I could so far spend >>> look

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Rainer Weikusat
Mathias Krause writes: > On 30 September 2015 at 15:25, Rainer Weikusat > wrote: >> Mathias Krause writes: >>> On 30 September 2015 at 12:56, Rainer Weikusat >>> wrote: >>>> In case you want some information on this: This is a kernel warning I could trigger (more than once) on the single

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Jason Baron
On 09/30/2015 01:54 AM, Mathias Krause wrote: > On 29 September 2015 at 21:09, Jason Baron wrote: >> However, if we call connect on socket 's', to connect to a new socket 'o2', >> we >> drop the reference on the original socket 'o'. Thus, we can now close socket >> 'o' without unregistering from

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-30 Thread Jason Baron
On 09/30/2015 03:34 AM, Michal Kubecek wrote: > On Wed, Sep 30, 2015 at 07:54:29AM +0200, Mathias Krause wrote: >> On 29 September 2015 at 21:09, Jason Baron wrote: >>> However, if we call connect on socket 's', to connect to a new socket 'o2', >>> we >>> drop the reference on the original socket

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-10-01 Thread Rainer Weikusat
Jason Baron writes: > On 09/30/2015 01:54 AM, Mathias Krause wrote: >> On 29 September 2015 at 21:09, Jason Baron wrote: >>> However, if we call connect on socket 's', to connect to a new socket 'o2', >>> we >>> drop the reference on the original socket 'o'. Thus, we can now close socket >>> 'o'

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-10-01 Thread Rainer Weikusat
Rainer Weikusat writes: > Jason Baron writes: >> On 09/30/2015 01:54 AM, Mathias Krause wrote: >>> On 29 September 2015 at 21:09, Jason Baron wrote: >>>> However, if we call connect on socket 's', to connect to a new socket 'o2', we drop the reference on the original socket 'o'. Thus,

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-10-01 Thread Rainer Weikusat
Rainer Weikusat writes: > Rainer Weikusat writes: >> Jason Baron writes: >>> On 09/30/2015 01:54 AM, Mathias Krause wrote: >>>> On 29 September 2015 at 21:09, Jason Baron wrote: >>>>> However, if we call connect on socket 's', to connect to a new socket >>>>> 'o2', we >>>>> drop the reference

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-13 Thread Eric Wong
+cc Jason Baron since he might be able to provide more insight into epoll. Mathias Krause wrote: > Hi, > > this is an attempt to resurrect the thread initially started here: > > http://thread.gmane.org/gmane.linux.network/353003 > > As that patch fixed the issue for the mentioned reproducer,

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-29 Thread Mathias Krause
On 14 September 2015 at 04:39, Eric Wong wrote: > +cc Jason Baron since he might be able to provide more insight into > epoll. > > Mathias Krause wrote: >> Hi, >> >> this is an attempt to resurrect the thread initially started here: >> >> http://thread.gmane.org/gmane.linux.network/353003 >> >>

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-29 Thread Jason Baron
On 09/29/2015 02:09 PM, Mathias Krause wrote: > On 14 September 2015 at 04:39, Eric Wong wrote: >> +cc Jason Baron since he might be able to provide more insight into >> epoll. >> >> Mathias Krause wrote: >>> Hi, >>> >>> this is an attempt to resurrect the thread initially started here: >>> >>>

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-15 Thread Rainer Weikusat
Mathias Krause writes: > this is an attempt to resurrect the thread initially started here: > > http://thread.gmane.org/gmane.linux.network/353003 > > As that patch fixed the issue for the mentioned reproducer, it did not > fix the bug for the production code Olivier is using. :( > > Changing th

Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket

2015-09-15 Thread Mathias Krause
On Tue, Sep 15, 2015 at 06:07:05PM +0100, Rainer Weikusat wrote: > --- a/net/unix/af_unix.c > +++ b/net/unix/af_unix.c > @@ -2233,10 +2233,14 @@ static unsigned int > unix_dgram_poll(struct file *file, struct socket *sock, > writable = unix_writable(sk); > other = unix_peer_get(sk)