Re: svn commit: r265232 - head/sys/net

2014-05-02 Thread Alexander V. Chernikov

On 02.05.2014 20:24, Alan Somers wrote:

Author: asomers
Date: Fri May  2 16:24:09 2014
New Revision: 265232
URL: http://svnweb.freebsd.org/changeset/base/265232

Log:
   Fix a panic caused by doing ifconfig -am while a lagg is being destroyed.
   The thread that is destroying the lagg has already set sc-sc_psc=NULL when
   the ifconfig -am thread gets to lacp_req().  It tries to dereference
   sc-sc_psc and panics.  The solution is for lacp_req() to check the value of
   sc-sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
   zeros.  Full details in GNATS.

Sorry, it looks like I've missed -net@ discussion.

While this patch minimizes panics, they still can occur.
If one thread performs lagg_clone_destroy() - lagg_lacp_detach() (1) - 
lacp_detach(), executing


struct lacp_softc *lsc = LACP_SOFTC(sc); (e.g. one line before sc_psc = 
NULL assignment)


At the same moment, another thread (initiated by ifconfig) executes

struct lacp_softc *lsc = LACP_SOFTC(sc);

Then, thread #1 goes further to

LACP_LOCK_DESTROY(lsc);
free(lsc, M_DEVBUF);

After that, thread #2 performs

bzero(req, sizeof(struct lacp_opreq));

passes lsc check for NULL and crashes on destroyed mutex.


Briefly looking, we can remove WUNLOCK()  before lacp_detach() in 
lagg_lacp_detach() (I'm unsure about the reasons why do we need unlock 
here).
lacp_req() is already protected by at least LAGG_RLOCK() so we should 
get consistent view of sc-sc_psc.


   
   PR:		kern/189003

   Reviewed by: timeout on freebsd-net@
   MFC after:   3 weeks
   Sponsored by:Spectra Logic Corporation

Modified:
   head/sys/net/ieee8023ad_lacp.c

Modified: head/sys/net/ieee8023ad_lacp.c
==
--- head/sys/net/ieee8023ad_lacp.c  Fri May  2 16:15:34 2014
(r265231)
+++ head/sys/net/ieee8023ad_lacp.c  Fri May  2 16:24:09 2014
(r265232)
@@ -590,10 +590,20 @@ lacp_req(struct lagg_softc *sc, caddr_t
  {
struct lacp_opreq *req = (struct lacp_opreq *)data;
struct lacp_softc *lsc = LACP_SOFTC(sc);
-   struct lacp_aggregator *la = lsc-lsc_active_aggregator;
+   struct lacp_aggregator *la;
  
-	LACP_LOCK(lsc);

bzero(req, sizeof(struct lacp_opreq));
+   
+   /*
+* If the LACP softc is NULL, return with the opreq structure full of
+* zeros.  It is normal for the softc to be NULL while the lagg is
+* being destroyed.
+*/
+   if (NULL == lsc)
+   return;
+
+   la = lsc-lsc_active_aggregator;
+   LACP_LOCK(lsc);
if (la != NULL) {
req-actor_prio = ntohs(la-la_actor.lip_systemid.lsi_prio);
memcpy(req-actor_mac, la-la_actor.lip_systemid.lsi_mac,




___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r265232 - head/sys/net

2014-05-02 Thread Alan Somers
On Fri, May 2, 2014 at 11:01 AM, Alexander V. Chernikov
melif...@freebsd.org wrote:
 On 02.05.2014 20:24, Alan Somers wrote:

 Author: asomers
 Date: Fri May  2 16:24:09 2014
 New Revision: 265232
 URL: http://svnweb.freebsd.org/changeset/base/265232

 Log:
Fix a panic caused by doing ifconfig -am while a lagg is being
 destroyed.
The thread that is destroying the lagg has already set sc-sc_psc=NULL
 when
the ifconfig -am thread gets to lacp_req().  It tries to dereference
sc-sc_psc and panics.  The solution is for lacp_req() to check the
 value of
sc-sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
zeros.  Full details in GNATS.

 Sorry, it looks like I've missed -net@ discussion.

Thanks for the retroactive review; it's good too.


 While this patch minimizes panics, they still can occur.
 If one thread performs lagg_clone_destroy() - lagg_lacp_detach() (1) -
 lacp_detach(), executing

 struct lacp_softc *lsc = LACP_SOFTC(sc); (e.g. one line before sc_psc = NULL
 assignment)

 At the same moment, another thread (initiated by ifconfig) executes


 struct lacp_softc *lsc = LACP_SOFTC(sc);

 Then, thread #1 goes further to

 LACP_LOCK_DESTROY(lsc);
 free(lsc, M_DEVBUF);

 After that, thread #2 performs

 bzero(req, sizeof(struct lacp_opreq));

 passes lsc check for NULL and crashes on destroyed mutex.


 Briefly looking, we can remove WUNLOCK()  before lacp_detach() in
 lagg_lacp_detach() (I'm unsure about the reasons why do we need unlock
 here).
 lacp_req() is already protected by at least LAGG_RLOCK() so we should get
 consistent view of sc-sc_psc.

The WUNLOCK was added in r168561 without comment.  Removing it seems
like it would work.  It doesn't cause any new LORs or WITNESS
warnings.  And I can no longer reproduce the panic in lacp_req.
However, it doesn't fix another panic in lagg_port_ioctl line 825 (my
patch didn't either).  Do you have any good ideas what to do about
that?

#9  0x808cee81 in __mtx_lock_sleep (c=0xf800052bb648,
tid=18446735277704396800, opts=value optimized out,
file=value optimized out, line=value optimized out)
at /usr/home/alans/freebsd/head/sys/kern/kern_mutex.c:430
#10 0x808ced02 in __mtx_lock_flags (c=value optimized out, opts=0,
file=0x80f6dd8c
/usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c, line=407) at
/usr/home/alans/freebsd/head/sys/kern/kern_mutex.c:223
#11 0x808e0032 in _rm_rlock (rm=0xf800052bb608,
tracker=0xfe0097751918, trylock=value optimized out)
at /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c:407
#12 0x808e0812 in _rm_rlock_debug (rm=0xf800052bb608,
tracker=0xfe0097751918, trylock=0,
file=0x81c1dd13
/usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c,
line=825)
at /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c:659
#13 0x81c19f62 in lagg_port_ioctl (ifp=0xf8000590d800,
cmd=value optimized out, data=0xf80005582c00 tap0)
at /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c:825
#14 0x809ae407 in ifioctl (so=value optimized out, cmd=3225971084,
data=0xf80005582c00 tap0, td=value optimized out)
at /usr/home/alans/freebsd/head/sys/net/if.c:2643
#15 0x8093b9fb in kern_ioctl (td=value optimized out,
fd=value optimized out, com=value optimized out) at file.h:323
#16 0x8093b77c in sys_ioctl (td=0xf800053cc000,
uap=0xfe0097751b80)
at /usr/home/alans/freebsd/head/sys/kern/sys_generic.c:702
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r265232 - head/sys/net

2014-05-02 Thread Alexander V. Chernikov
On 03.05.2014 00:22, Alan Somers wrote:
 On Fri, May 2, 2014 at 11:01 AM, Alexander V. Chernikov
 melif...@freebsd.org wrote:
 On 02.05.2014 20:24, Alan Somers wrote:

 Author: asomers
 Date: Fri May  2 16:24:09 2014
 New Revision: 265232
 URL: http://svnweb.freebsd.org/changeset/base/265232

 Log:
Fix a panic caused by doing ifconfig -am while a lagg is being
 destroyed.
The thread that is destroying the lagg has already set sc-sc_psc=NULL
 when
the ifconfig -am thread gets to lacp_req().  It tries to dereference
sc-sc_psc and panics.  The solution is for lacp_req() to check the
 value of
sc-sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
zeros.  Full details in GNATS.

 Sorry, it looks like I've missed -net@ discussion.
 
 Thanks for the retroactive review; it's good too.
 

 While this patch minimizes panics, they still can occur.
 If one thread performs lagg_clone_destroy() - lagg_lacp_detach() (1) -
 lacp_detach(), executing

 struct lacp_softc *lsc = LACP_SOFTC(sc); (e.g. one line before sc_psc = NULL
 assignment)

 At the same moment, another thread (initiated by ifconfig) executes


 struct lacp_softc *lsc = LACP_SOFTC(sc);

 Then, thread #1 goes further to

 LACP_LOCK_DESTROY(lsc);
 free(lsc, M_DEVBUF);

 After that, thread #2 performs

 bzero(req, sizeof(struct lacp_opreq));

 passes lsc check for NULL and crashes on destroyed mutex.


 Briefly looking, we can remove WUNLOCK()  before lacp_detach() in
 lagg_lacp_detach() (I'm unsure about the reasons why do we need unlock
 here).
 lacp_req() is already protected by at least LAGG_RLOCK() so we should get
 consistent view of sc-sc_psc.
 
 The WUNLOCK was added in r168561 without comment.  Removing it seems
 like it would work.  It doesn't cause any new LORs or WITNESS
 warnings.  And I can no longer reproduce the panic in lacp_req.
 However, it doesn't fix another panic in lagg_port_ioctl line 825 (my
 patch didn't either).  Do you have any good ideas what to do about
 that?
Interesting.
Line numbers look a bit shifted. Is line 825 'LAGG_RUNLOCK(sc, tracker)' ?

Are the steps to reproduce it the same as in kern/189003?

 
 #9  0x808cee81 in __mtx_lock_sleep (c=0xf800052bb648,
 tid=18446735277704396800, opts=value optimized out,
 file=value optimized out, line=value optimized out)
 at /usr/home/alans/freebsd/head/sys/kern/kern_mutex.c:430
 #10 0x808ced02 in __mtx_lock_flags (c=value optimized out, opts=0,
 file=0x80f6dd8c
 /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c, line=407) at
 /usr/home/alans/freebsd/head/sys/kern/kern_mutex.c:223
 #11 0x808e0032 in _rm_rlock (rm=0xf800052bb608,
 tracker=0xfe0097751918, trylock=value optimized out)
 at /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c:407
 #12 0x808e0812 in _rm_rlock_debug (rm=0xf800052bb608,
 tracker=0xfe0097751918, trylock=0,
 file=0x81c1dd13
 /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c,
 line=825)
 at /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c:659
 #13 0x81c19f62 in lagg_port_ioctl (ifp=0xf8000590d800,
 cmd=value optimized out, data=0xf80005582c00 tap0)
 at 
 /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c:825
 #14 0x809ae407 in ifioctl (so=value optimized out, cmd=3225971084,
 data=0xf80005582c00 tap0, td=value optimized out)
 at /usr/home/alans/freebsd/head/sys/net/if.c:2643
 #15 0x8093b9fb in kern_ioctl (td=value optimized out,
 fd=value optimized out, com=value optimized out) at file.h:323
 #16 0x8093b77c in sys_ioctl (td=0xf800053cc000,
 uap=0xfe0097751b80)
 at /usr/home/alans/freebsd/head/sys/kern/sys_generic.c:702
 

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r265232 - head/sys/net

2014-05-02 Thread Alan Somers
On Fri, May 2, 2014 at 3:08 PM, Alexander V. Chernikov
melif...@freebsd.org wrote:
 On 03.05.2014 00:22, Alan Somers wrote:
 On Fri, May 2, 2014 at 11:01 AM, Alexander V. Chernikov
 melif...@freebsd.org wrote:
 On 02.05.2014 20:24, Alan Somers wrote:

 Author: asomers
 Date: Fri May  2 16:24:09 2014
 New Revision: 265232
 URL: http://svnweb.freebsd.org/changeset/base/265232

 Log:
Fix a panic caused by doing ifconfig -am while a lagg is being
 destroyed.
The thread that is destroying the lagg has already set sc-sc_psc=NULL
 when
the ifconfig -am thread gets to lacp_req().  It tries to dereference
sc-sc_psc and panics.  The solution is for lacp_req() to check the
 value of
sc-sc_psc.  If NULL, harmlessly return an lacp_opreq structure full of
zeros.  Full details in GNATS.

 Sorry, it looks like I've missed -net@ discussion.

 Thanks for the retroactive review; it's good too.


 While this patch minimizes panics, they still can occur.
 If one thread performs lagg_clone_destroy() - lagg_lacp_detach() (1) -
 lacp_detach(), executing

 struct lacp_softc *lsc = LACP_SOFTC(sc); (e.g. one line before sc_psc = NULL
 assignment)

 At the same moment, another thread (initiated by ifconfig) executes


 struct lacp_softc *lsc = LACP_SOFTC(sc);

 Then, thread #1 goes further to

 LACP_LOCK_DESTROY(lsc);
 free(lsc, M_DEVBUF);

 After that, thread #2 performs

 bzero(req, sizeof(struct lacp_opreq));

 passes lsc check for NULL and crashes on destroyed mutex.


 Briefly looking, we can remove WUNLOCK()  before lacp_detach() in
 lagg_lacp_detach() (I'm unsure about the reasons why do we need unlock
 here).
 lacp_req() is already protected by at least LAGG_RLOCK() so we should get
 consistent view of sc-sc_psc.

 The WUNLOCK was added in r168561 without comment.  Removing it seems
 like it would work.  It doesn't cause any new LORs or WITNESS
 warnings.  And I can no longer reproduce the panic in lacp_req.
 However, it doesn't fix another panic in lagg_port_ioctl line 825 (my
 patch didn't either).  Do you have any good ideas what to do about
 that?
 Interesting.
 Line numbers look a bit shifted. Is line 825 'LAGG_RUNLOCK(sc, tracker)' ?

Yes.


 Are the steps to reproduce it the same as in kern/189003?

Yes.  BTW, In kern/189003, I suggested reverting 253687.  In reality,
I didn't revert it; I commented out the sysctl lines it added in
lacp_attach().



 #9  0x808cee81 in __mtx_lock_sleep (c=0xf800052bb648,
 tid=18446735277704396800, opts=value optimized out,
 file=value optimized out, line=value optimized out)
 at /usr/home/alans/freebsd/head/sys/kern/kern_mutex.c:430
 #10 0x808ced02 in __mtx_lock_flags (c=value optimized out, opts=0,
 file=0x80f6dd8c
 /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c, line=407) at
 /usr/home/alans/freebsd/head/sys/kern/kern_mutex.c:223
 #11 0x808e0032 in _rm_rlock (rm=0xf800052bb608,
 tracker=0xfe0097751918, trylock=value optimized out)
 at /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c:407
 #12 0x808e0812 in _rm_rlock_debug (rm=0xf800052bb608,
 tracker=0xfe0097751918, trylock=0,
 file=0x81c1dd13
 /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c,
 line=825)
 at /usr/home/alans/freebsd/head/sys/kern/kern_rmlock.c:659
 #13 0x81c19f62 in lagg_port_ioctl (ifp=0xf8000590d800,
 cmd=value optimized out, data=0xf80005582c00 tap0)
 at 
 /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c:825
 #14 0x809ae407 in ifioctl (so=value optimized out, cmd=3225971084,
 data=0xf80005582c00 tap0, td=value optimized out)
 at /usr/home/alans/freebsd/head/sys/net/if.c:2643
 #15 0x8093b9fb in kern_ioctl (td=value optimized out,
 fd=value optimized out, com=value optimized out) at file.h:323
 #16 0x8093b77c in sys_ioctl (td=0xf800053cc000,
 uap=0xfe0097751b80)
 at /usr/home/alans/freebsd/head/sys/kern/sys_generic.c:702


___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org