[PATCH 1/3] Bridge: RCU barriers

2006-01-23 Thread Stephen Hemminger
Use RCU macro's to ensure barrier safety on the bridge
receive path.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>


--- br-fix.orig/net/bridge/br_if.c
+++ br-fix/net/bridge/br_if.c
@@ -124,7 +124,7 @@ static void del_nbp(struct net_bridge_po
struct net_bridge *br = p->br;
struct net_device *dev = p->dev;
 
-   dev->br_port = NULL;
+   rcu_assign_pointer(dev->br_port, NULL);
dev_set_promiscuity(dev, -1);
 
cancel_delayed_work(&p->carrier_check);
--- br-fix.orig/net/bridge/br_input.c
+++ br-fix/net/bridge/br_input.c
@@ -45,13 +45,20 @@ static void br_pass_frame_up(struct net_
 int br_handle_frame_finish(struct sk_buff *skb)
 {
const unsigned char *dest = eth_hdr(skb)->h_dest;
-   struct net_bridge_port *p = skb->dev->br_port;
-   struct net_bridge *br = p->br;
+   struct net_bridge_port *p;
+   struct net_bridge *br;
struct net_bridge_fdb_entry *dst;
int passedup = 0;
 
/* insert into forwarding database after filtering to avoid spoofing */
-   br_fdb_update(p->br, p, eth_hdr(skb)->h_source);
+   p = rcu_dereference(skb->dev->br_port);
+   if (unlikely(!p)) {
+   kfree_skb(skb);
+   goto out;
+   }
+
+   br = p->br;
+   br_fdb_update(br, p, eth_hdr(skb)->h_source);
 
if (p->state == BR_STATE_LEARNING) {
kfree_skb(skb);

--
Stephen Hemminger <[EMAIL PROTECTED]>
OSDL http://developer.osdl.org/~shemminger

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Herbert Xu
On Mon, Jan 23, 2006 at 12:05:26PM -0800, Stephen Hemminger wrote:
> Use RCU macro's to ensure barrier safety on the bridge
> receive path.

Thanks for working on this Stephen.

> --- br-fix.orig/net/bridge/br_if.c
> +++ br-fix/net/bridge/br_if.c
> @@ -124,7 +124,7 @@ static void del_nbp(struct net_bridge_po
>   struct net_bridge *br = p->br;
>   struct net_device *dev = p->dev;
>  
> - dev->br_port = NULL;
> + rcu_assign_pointer(dev->br_port, NULL);

I think this barrier is in the wrong place.  On the deletion path what
we want to achieve is separate the zeroing of dev->br_port and the
actual freeing of the br_port object through a quiescent state.  This
is already achieved by the RCU call further down.

On the other hand, when we set dev->br_port in the first instance,
we do need the barrier.  So we need to turn the br_port assignment
in new_nbp into an rcu_assign_pointer.  In fact, we also need to
rearrange the code there so that we're 100% sure that the bridge
port is fully initialised before we assign dev->br_port.  This is
because as soon as that assignment is made packets may start getting
delivered to the bridge via this device.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Stephen Hemminger
On Thu, 26 Jan 2006 19:45:06 +1100
Herbert Xu <[EMAIL PROTECTED]> wrote:

> On Mon, Jan 23, 2006 at 12:05:26PM -0800, Stephen Hemminger wrote:
> > Use RCU macro's to ensure barrier safety on the bridge
> > receive path.
> 
> Thanks for working on this Stephen.
> 
> > --- br-fix.orig/net/bridge/br_if.c
> > +++ br-fix/net/bridge/br_if.c
> > @@ -124,7 +124,7 @@ static void del_nbp(struct net_bridge_po
> > struct net_bridge *br = p->br;
> > struct net_device *dev = p->dev;
> >  
> > -   dev->br_port = NULL;
> > +   rcu_assign_pointer(dev->br_port, NULL);
> 
> I think this barrier is in the wrong place.  On the deletion path what
> we want to achieve is separate the zeroing of dev->br_port and the
> actual freeing of the br_port object through a quiescent state.  This
> is already achieved by the RCU call further down.

That breaks because of the race (found by Xen) where an interface
is being deleted from a bridge and the device is being removed.

br_del_if
rmmod device
netlink event
br_device_event
...

 
> On the other hand, when we set dev->br_port in the first instance,
> we do need the barrier.  So we need to turn the br_port assignment
> in new_nbp into an rcu_assign_pointer. In fact, we also need to
> rearrange the code there so that we're 100% sure that the bridge
> port is fully initialised before we assign dev->br_port.  This is
> because as soon as that assignment is made packets may start getting
> delivered to the bridge via this device.
> 
> Cheers,




-- 
Stephen Hemminger <[EMAIL PROTECTED]>
OSDL http://developer.osdl.org/~shemminger
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Herbert Xu
On Thu, Jan 26, 2006 at 09:41:00AM -0800, Stephen Hemminger wrote:
>
> > > --- br-fix.orig/net/bridge/br_if.c
> > > +++ br-fix/net/bridge/br_if.c
> > > @@ -124,7 +124,7 @@ static void del_nbp(struct net_bridge_po
> > >   struct net_bridge *br = p->br;
> > >   struct net_device *dev = p->dev;
> > >  
> > > - dev->br_port = NULL;
> > > + rcu_assign_pointer(dev->br_port, NULL);
> > 
> > I think this barrier is in the wrong place.  On the deletion path what
> > we want to achieve is separate the zeroing of dev->br_port and the
> > actual freeing of the br_port object through a quiescent state.  This
> > is already achieved by the RCU call further down.
> 
> That breaks because of the race (found by Xen) where an interface
> is being deleted from a bridge and the device is being removed.
> 
>   br_del_if
>   rmmod device
>   netlink event
>   br_device_event
>   ...

Sorry, I don't get it.  How does adding a barrier in del_nbp
fix this bug?

In fact, as far as I can see, you need to add a pair of
rcu_read_lock/rcu_read_unlock in br_device_event to make it
work.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Herbert Xu
On Fri, Jan 27, 2006 at 07:33:24AM +1100, herbert wrote:
>
> > That breaks because of the race (found by Xen) where an interface
> > is being deleted from a bridge and the device is being removed.
> > 
> > br_del_if
> > rmmod device
> > netlink event
> > br_device_event
> > ...
> 
> Sorry, I don't get it.  How does adding a barrier in del_nbp
> fix this bug?
> 
> In fact, as far as I can see, you need to add a pair of
> rcu_read_lock/rcu_read_unlock in br_device_event to make it
> work.

Actually, this race shouldn't exist anyway because both br_del_if
and br_device_event are meant to be called under the RTNL.  So
what's going on here?
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Stephen Hemminger
On Fri, 27 Jan 2006 09:29:12 +1100
Herbert Xu <[EMAIL PROTECTED]> wrote:

> On Fri, Jan 27, 2006 at 07:33:24AM +1100, herbert wrote:
> >
> > > That breaks because of the race (found by Xen) where an interface
> > > is being deleted from a bridge and the device is being removed.
> > > 
> > >   br_del_if
> > >   rmmod device
> > >   netlink event
> > >   br_device_event
> > >   ...
> > 
> > Sorry, I don't get it.  How does adding a barrier in del_nbp
> > fix this bug?
> > 
> > In fact, as far as I can see, you need to add a pair of
> > rcu_read_lock/rcu_read_unlock in br_device_event to make it
> > work.
> 
> Actually, this race shouldn't exist anyway because both br_del_if
> and br_device_event are meant to be called under the RTNL.  So
> what's going on here?


Well, it was before I changed del_nbp to set dev->br_port to NULL.
So what br_del_if would get called twice for same port.


-- 
Stephen Hemminger <[EMAIL PROTECTED]>
OSDL http://developer.osdl.org/~shemminger
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Stephen Hemminger
Plan B:
1. Use p->br (the back pointer) as the RCU sentinel
2. Keep dev->br_port set until after the RCU transition
3. Order operations so we don't have to worry about p->br being NULL
   in the case of STP timers.

Untested. Probably excessive use of rcu() macros. Since it is only
needed in cases where no locks held (ie.  !(RTNL | br_lock))


--- br-fix.orig/net/bridge/br_fdb.c
+++ br-fix/net/bridge/br_fdb.c
@@ -67,9 +67,9 @@ static __inline__ void fdb_delete(struct
br_fdb_put(f);
 }
 
-void br_fdb_changeaddr(struct net_bridge_port *p, const unsigned char *newaddr)
+void br_fdb_changeaddr(struct net_bridge *br, struct net_bridge_port *p,
+  const unsigned char *newaddr)
 {
-   struct net_bridge *br = p->br;
int i;

spin_lock_bh(&br->hash_lock);
--- br-fix.orig/net/bridge/br_if.c
+++ br-fix/net/bridge/br_if.c
@@ -79,24 +79,31 @@ static int port_cost(struct net_device *
  */
 static void port_carrier_check(void *arg)
 {
-   struct net_bridge_port *p = arg;
+   struct net_device *dev = arg;
+   struct net_bridge_port *p;
+   struct net_bridge *br;
 
rtnl_lock();
-   if (netif_carrier_ok(p->dev)) {
-   u32 cost = port_cost(p->dev);
+   if ((p = dev->br_port) == NULL ||
+   (br = rcu_dereference(p->br)) == NULL)
+   goto done;
 
-   spin_lock_bh(&p->br->lock);
+   if (netif_carrier_ok(dev)) {
+   u32 cost = port_cost(dev);
+
+   spin_lock_bh(&br->lock);
if (p->state == BR_STATE_DISABLED) {
p->path_cost = cost;
br_stp_enable_port(p);
}
-   spin_unlock_bh(&p->br->lock);
+   spin_unlock_bh(&br->lock);
} else {
-   spin_lock_bh(&p->br->lock);
+   spin_lock_bh(&br->lock);
if (p->state != BR_STATE_DISABLED)
br_stp_disable_port(p);
-   spin_unlock_bh(&p->br->lock);
+   spin_unlock_bh(&br->lock);
}
+done:
rtnl_unlock();
 }
 
@@ -104,7 +111,7 @@ static void destroy_nbp(struct net_bridg
 {
struct net_device *dev = p->dev;
 
-   p->br = NULL;
+   dev->br_port = NULL;
p->dev = NULL;
dev_put(dev);
 
@@ -119,29 +126,25 @@ static void destroy_nbp_rcu(struct rcu_h
 }
 
 /* called with RTNL */
-static void del_nbp(struct net_bridge_port *p)
+static void del_nbp(struct net_bridge *br, struct net_bridge_port *p)
 {
-   struct net_bridge *br = p->br;
-   struct net_device *dev = p->dev;
-
-   dev->br_port = NULL;
-   dev_set_promiscuity(dev, -1);
+   dev_set_promiscuity(p->dev, -1);
 
cancel_delayed_work(&p->carrier_check);
-   flush_scheduled_work();
 
spin_lock_bh(&br->lock);
br_stp_disable_port(p);
+   list_del_rcu(&p->list);
spin_unlock_bh(&br->lock);
 
br_fdb_delete_by_port(br, p);
 
-   list_del_rcu(&p->list);
-
del_timer_sync(&p->message_age_timer);
del_timer_sync(&p->forward_delay_timer);
del_timer_sync(&p->hold_timer);

+   rcu_assign_pointer(p->br, NULL);
+
call_rcu(&p->rcu, destroy_nbp_rcu);
 }
 
@@ -152,7 +155,7 @@ static void del_br(struct net_bridge *br
 
list_for_each_entry_safe(p, n, &br->port_list, list) {
br_sysfs_removeif(p);
-   del_nbp(p);
+   del_nbp(br, p);
}
 
del_timer_sync(&br->gc_timer);
@@ -249,7 +252,7 @@ static struct net_bridge_port *new_nbp(s
p->port_no = index;
br_init_port(p);
p->state = BR_STATE_DISABLED;
-   INIT_WORK(&p->carrier_check, port_carrier_check, p);
+   INIT_WORK(&p->carrier_check, port_carrier_check, dev);
kobject_init(&p->kobj);
 
return p;
@@ -386,7 +389,7 @@ int br_add_if(struct net_bridge *br, str
destroy_nbp(p);
  
else if ((err = br_sysfs_addif(p)))
-   del_nbp(p);
+   del_nbp(br, p);
else {
dev_set_promiscuity(dev, 1);
 
@@ -415,7 +418,7 @@ int br_del_if(struct net_bridge *br, str
return -EINVAL;
 
br_sysfs_removeif(p);
-   del_nbp(p);
+   del_nbp(br, p);
 
spin_lock_bh(&br->lock);
br_stp_recalculate_bridge_id(br);
--- br-fix.orig/net/bridge/br_input.c
+++ br-fix/net/bridge/br_input.c
@@ -46,17 +46,19 @@ int br_handle_frame_finish(struct sk_buf
 {
const unsigned char *dest = eth_hdr(skb)->h_dest;
struct net_bridge_port *p = skb->dev->br_port;
-   struct net_bridge *br = p->br;
+   struct net_bridge *br;
struct net_bridge_fdb_entry *dst;
int passedup = 0;
 
+   br = rcu_dereference(p->br);
+   if (!br)
+   goto drop;
+
/* insert into forwarding database after filtering to avoid spoofing */
-   br_fdb_update(p->br, p, eth_hdr(skb)->h_source);
+   br_fdb_update(b

Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Herbert Xu
On Thu, Jan 26, 2006 at 03:01:00PM -0800, Stephen Hemminger wrote:
> 
> Well, it was before I changed del_nbp to set dev->br_port to NULL.
> So what br_del_if would get called twice for same port.

Right.  I wasn't questioning the zeroing of dev->br_port.  I was
just saying that placing a barrier using rcu_assign_pointer when
you're setting it to NULL is useless because the mechanism that is
required here is the quiescent state.

On the other hand, when we set dev->br_port to a non-NULL value
we do need the barrier.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-26 Thread Herbert Xu
On Thu, Jan 26, 2006 at 03:05:54PM -0800, Stephen Hemminger wrote:
> Plan B:
> 1. Use p->br (the back pointer) as the RCU sentinel
> 2. Keep dev->br_port set until after the RCU transition
> 3. Order operations so we don't have to worry about p->br being NULL
>in the case of STP timers.

This sounds like a good plan.  I'll go over it later today.

Thanks Stephen,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-30 Thread Herbert Xu
On Thu, Jan 26, 2006 at 03:05:54PM -0800, Stephen Hemminger wrote:
> Plan B:
> 1. Use p->br (the back pointer) as the RCU sentinel

A thought just came to mind.  What about adding a dead flag (p->dead)?
That means you won't have to carry around all those extra pointers.

So on the deleting path, you set p->dead, wait for the state to roll
over, and proceed with the deletion.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-30 Thread Stephen Hemminger
On Tue, 31 Jan 2006 10:11:10 +1100
Herbert Xu <[EMAIL PROTECTED]> wrote:

> On Thu, Jan 26, 2006 at 03:05:54PM -0800, Stephen Hemminger wrote:
> > Plan B:
> > 1. Use p->br (the back pointer) as the RCU sentinel
> 
> A thought just came to mind.  What about adding a dead flag (p->dead)?
> That means you won't have to carry around all those extra pointers.
> 
> So on the deleting path, you set p->dead, wait for the state to roll
> over, and proceed with the deletion.
> 
> Cheers,

Or could use STP state, but the extra pointers aren't that much
anyway.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
OSDL http://developer.osdl.org/~shemminger
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] Bridge: RCU barriers

2006-01-30 Thread Herbert Xu
On Mon, Jan 30, 2006 at 03:44:15PM -0800, Stephen Hemminger wrote:
> Or could use STP state, but the extra pointers aren't that much
> anyway.

Well it'll make the patch smaller and easier to read at least :)
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html