Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-21 Thread Robert Olsson

David Miller writes:

 > Yes, this semaphore thing is highly problematic.  In the most crucial
 > areas where network driver consistency matters the most for ease of
 > understanding and debugging, the Intel drivers choose to be different
 > :-(
 > 
 > The way the napi_disable() logic breaks out from high packet load in
 > net_rx_action() is it simply returns, even leaving interrupts disabled,
 > when a napi_disable() is pending.
 > 
 > This is what trips up the semaphore logic.
 > 
 > Robert, give this patch a try.


 Yes, it works. e1000 tested for ~3 hours under very high load with the
 interface going up/down every 5 seconds. Without the patch the IRQs get
 disabled within a couple of seconds.

 A resolute way of handling the semaphores. :)
   
 Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
 
 Cheers
--ro


 > In the long term this semaphore should be completely eliminated,
 > there is no justification for it.
 > 
 > Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
 > 
 > diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
 > index 0c9a6f7..76c0fa6 100644
 > --- a/drivers/net/e1000/e1000_main.c
 > +++ b/drivers/net/e1000/e1000_main.c
 > @@ -632,6 +632,7 @@ e1000_down(struct e1000_adapter *adapter)
 >  
 >  #ifdef CONFIG_E1000_NAPI
 >  napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 >  #endif
 >  e1000_irq_disable(adapter);
 >  
 > diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
 > index 2ab3bfb..9cc5a6b 100644
 > --- a/drivers/net/e1000e/netdev.c
 > +++ b/drivers/net/e1000e/netdev.c
 > @@ -2183,6 +2183,7 @@ void e1000e_down(struct e1000_adapter *adapter)
 >  msleep(10);
 >  
 >  napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 >  e1000_irq_disable(adapter);
 >  
 >  del_timer_sync(&adapter->watchdog_timer);
 > diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
 > index d2fb88d..4f63839 100644
 > --- a/drivers/net/ixgb/ixgb_main.c
 > +++ b/drivers/net/ixgb/ixgb_main.c
 > @@ -296,6 +296,11 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog)
 >  {
 >  struct net_device *netdev = adapter->netdev;
 >  
 > +#ifdef CONFIG_IXGB_NAPI
 > +napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 > +#endif
 > +
 >  ixgb_irq_disable(adapter);
 >  free_irq(adapter->pdev->irq, netdev);
 >  
 > @@ -304,9 +309,7 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog)
 >  
 >  if(kill_watchdog)
 >  del_timer_sync(&adapter->watchdog_timer);
 > -#ifdef CONFIG_IXGB_NAPI
 > -napi_disable(&adapter->napi);
 > -#endif
 > +
 >  adapter->link_speed = 0;
 >  adapter->link_duplex = 0;
 >  netif_carrier_off(netdev);
 > diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
 > index de3f45e..a4265bc 100644
 > --- a/drivers/net/ixgbe/ixgbe_main.c
 > +++ b/drivers/net/ixgbe/ixgbe_main.c
 > @@ -1409,9 +1409,11 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
 >  IXGBE_WRITE_FLUSH(&adapter->hw);
 >  msleep(10);
 >  
 > +napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 > +
 >  ixgbe_irq_disable(adapter);
 >  
 > -napi_disable(&adapter->napi);
 >  del_timer_sync(&adapter->watchdog_timer);
 >  
 >  netif_carrier_off(netdev);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-18 Thread Robert Olsson

David Miller writes:

 > > eth0 e1000_irq_enable sem = 1    <- ifconfig eth0 down
 > > eth0 e1000_irq_disable sem = 2
 > > 
 > > **e1000_open <- ifconfig eth0 up
 > > eth0 e1000_irq_disable sem = 3  Dead. irq's can't be enabled
 > > e1000_irq_enable miss
 > > eth0 e1000_irq_enable sem = 2
 > > e1000_irq_enable miss
 > > eth0 e1000_irq_enable sem = 1
 > > ADDRCONF(NETDEV_UP): eth0: link is not ready
 > 
 > Yes, this semaphore thing is highly problematic.  In the most crucial
 > areas where network driver consistency matters the most for ease of
 > understanding and debugging, the Intel drivers choose to be different

 I don't understand the idea of a semaphore for enabling/disabling
 IRQs either; the overall logic must be safer/better without it.
 
 > The way the napi_disable() logic breaks out from high packet load in
 > net_rx_action() is it simply returns, even leaving interrupts disabled,
 > when a napi_disable() is pending.
 > 
 > This is what trips up the semaphore logic.
 > 
 > Robert, give this patch a try.
 > 
 > In the long term this semaphore should be completely eliminated,
 > there is no justification for it.

 It's on the testing list...

 Cheers
--ro


 > 
 > Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
 > 
 > diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
 > index 0c9a6f7..76c0fa6 100644
 > --- a/drivers/net/e1000/e1000_main.c
 > +++ b/drivers/net/e1000/e1000_main.c
 > @@ -632,6 +632,7 @@ e1000_down(struct e1000_adapter *adapter)
 >  
 >  #ifdef CONFIG_E1000_NAPI
 >  napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 >  #endif
 >  e1000_irq_disable(adapter);
 >  
 > diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
 > index 2ab3bfb..9cc5a6b 100644
 > --- a/drivers/net/e1000e/netdev.c
 > +++ b/drivers/net/e1000e/netdev.c
 > @@ -2183,6 +2183,7 @@ void e1000e_down(struct e1000_adapter *adapter)
 >  msleep(10);
 >  
 >  napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 >  e1000_irq_disable(adapter);
 >  
 >  del_timer_sync(&adapter->watchdog_timer);
 > diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
 > index d2fb88d..4f63839 100644
 > --- a/drivers/net/ixgb/ixgb_main.c
 > +++ b/drivers/net/ixgb/ixgb_main.c
 > @@ -296,6 +296,11 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog)
 >  {
 >  struct net_device *netdev = adapter->netdev;
 >  
 > +#ifdef CONFIG_IXGB_NAPI
 > +napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 > +#endif
 > +
 >  ixgb_irq_disable(adapter);
 >  free_irq(adapter->pdev->irq, netdev);
 >  
 > @@ -304,9 +309,7 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog)
 >  
 >  if(kill_watchdog)
 >  del_timer_sync(&adapter->watchdog_timer);
 > -#ifdef CONFIG_IXGB_NAPI
 > -napi_disable(&adapter->napi);
 > -#endif
 > +
 >  adapter->link_speed = 0;
 >  adapter->link_duplex = 0;
 >  netif_carrier_off(netdev);
 > diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
 > index de3f45e..a4265bc 100644
 > --- a/drivers/net/ixgbe/ixgbe_main.c
 > +++ b/drivers/net/ixgbe/ixgbe_main.c
 > @@ -1409,9 +1409,11 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
 >  IXGBE_WRITE_FLUSH(&adapter->hw);
 >  msleep(10);
 >  
 > +napi_disable(&adapter->napi);
 > +atomic_set(&adapter->irq_sem, 0);
 > +
 >  ixgbe_irq_disable(adapter);
 >  
 > -napi_disable(&adapter->napi);
 >  del_timer_sync(&adapter->watchdog_timer);
 >  
 >  netif_carrier_off(netdev);


Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread Robert Olsson

David Miller writes:
 > > On Wednesday 16 January 2008, David Miller wrote:
 > > > Ok, here is the patch I'll propose to fix this.  The goal is to make
 > > > it as simple as possible without regressing the thing we were trying
 > > > to fix.
 > > 
 > > Looks good to me. Tested with -rc8.
 > 
 > Thanks for testing.

 Yes that code looks nice. I'm using the patch but I've noticed another 
 phenomenon with the current e1000 driver. There is a race when taking a 
 device down at high traffic loads. I've tracked and instrumented it, and it 
 seems like occasionally irq_sem can get bumped up so interrupts can't be 
 enabled again.


eth0 e1000_irq_enable sem = 1    <- High netload
eth0 e1000_irq_enable sem = 1
eth0 e1000_irq_enable sem = 1
eth0 e1000_irq_enable sem = 1
eth0 e1000_irq_enable sem = 1
eth0 e1000_irq_enable sem = 1
eth0 e1000_irq_enable sem = 1    <- ifconfig eth0 down
eth0 e1000_irq_disable sem = 2

**e1000_open <- ifconfig eth0 up
eth0 e1000_irq_disable sem = 3  Dead. irq's can't be enabled
e1000_irq_enable miss
eth0 e1000_irq_enable sem = 2
e1000_irq_enable miss
eth0 e1000_irq_enable sem = 1
ADDRCONF(NETDEV_UP): eth0: link is not ready


Cheers
--ro

static void
e1000_irq_disable(struct e1000_adapter *adapter)
{
        atomic_inc(&adapter->irq_sem);
        E1000_WRITE_REG(&adapter->hw, IMC, ~0);
        E1000_WRITE_FLUSH(&adapter->hw);
        synchronize_irq(adapter->pdev->irq);

        if (adapter->netdev->ifindex == 3)
                printk("%s e1000_irq_disable sem = %d\n", adapter->netdev->name,
                       atomic_read(&adapter->irq_sem));
}

static void
e1000_irq_enable(struct e1000_adapter *adapter)
{
        if (likely(atomic_dec_and_test(&adapter->irq_sem))) {
                E1000_WRITE_REG(&adapter->hw, IMS, IMS_ENABLE_MASK);
                E1000_WRITE_FLUSH(&adapter->hw);
        }
        else
                printk("e1000_irq_enable miss\n");

        if (adapter->netdev->ifindex == 3)
                printk("%s e1000_irq_enable sem = %d\n", adapter->netdev->name,
                       atomic_read(&adapter->irq_sem));
}
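
The counting behaviour in the trace can be reproduced with a small user-space model. This is only a sketch under stated assumptions, not driver code: `model_adapter`, `model_init()`, `model_irq_disable()`, `model_irq_enable()` and `model_down_up()` are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and a boolean stands in for the IMS/IMC registers. The `reset_sem` flag models a down path that forces the count back to zero, in the spirit of the atomic_set(..., 0) approach discussed elsewhere in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct e1000_adapter (illustration only). */
struct model_adapter {
        atomic_int irq_sem;     /* number of outstanding disable calls */
        bool unmasked;          /* stands in for the IMS/IMC register state */
};

/* Start the model with a given count; interrupts are unmasked only at 0. */
static void model_init(struct model_adapter *a, int sem)
{
        assert(a != NULL);
        atomic_init(&a->irq_sem, sem);
        a->unmasked = (sem == 0);
}

/* Mirrors e1000_irq_disable(): bump the count and mask interrupts. */
static void model_irq_disable(struct model_adapter *a)
{
        atomic_fetch_add(&a->irq_sem, 1);
        a->unmasked = false;
}

/* Mirrors e1000_irq_enable(): unmask only when the count drops to zero. */
static bool model_irq_enable(struct model_adapter *a)
{
        /* atomic_fetch_sub() returns the old value: old == 1 means new == 0 */
        if (atomic_fetch_sub(&a->irq_sem, 1) == 1) {
                a->unmasked = true;
                return true;
        }
        return false;           /* the "e1000_irq_enable miss" case */
}

/* One down/up cycle; returns true if interrupts end up unmasked. */
static bool model_down_up(struct model_adapter *a, bool reset_sem)
{
        if (reset_sem)
                atomic_store(&a->irq_sem, 0);   /* assumption: forced reset */
        model_irq_disable(a);                   /* ifconfig down */
        return model_irq_enable(a);             /* ifconfig up */
}
```

Starting the model at a count of 1 (one disable that was never paired with an enable) and running `model_down_up()` without the reset leaves the count at 1 and interrupts masked; with the reset, the same cycle ends at 0 with interrupts unmasked.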


Re: [RFC] net: napi fix

2007-12-20 Thread Robert Olsson

David Miller writes:

 > > Is the netif_running() check even required?
 > 
 > No, it is not.
 > 
 > When a device is brought down, one of the first things
 > that happens is that we wait for all pending NAPI polls
 > to complete, then block any new polls from starting.

 Hello!

 Yes, but the reason for the check was to avoid waiting for all pending
 polls to complete, so that a server/router could be rebooted even under
 high load and DoS. We've experienced some nasty problems with this.

 Cheers.
--ro


Re: Memory leak in 2.6.11-rc1?

2005-01-27 Thread Robert Olsson

Oh. Linux version 2.6.11-rc2 was used.

Robert Olsson writes:
 > 
 > Andrew Morton writes:
 >  > Russell King <[EMAIL PROTECTED]> wrote:
 > 
 >  > >  ip_dst_cache   1292   1485    256   15    1
 > 
 >  > I guess we should find a way to make it happen faster.
 >  
 > Here is a route DoS attack. Pure routing, no NAT, no filter.
 > 
 > Start
 > =====
 > ip_dst_cache       5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0
 > 
 > After DoS
 > =========
 > ip_dst_cache   66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480
 > 
 > After some GC runs.
 > ===================
 > ip_dst_cache       2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
 > 
 > No problems here. I saw Martin talked about NAT...
 > 
 >  --ro


Re: Memory leak in 2.6.11-rc1?

2005-01-27 Thread Robert Olsson

Andrew Morton writes:
 > Russell King <[EMAIL PROTECTED]> wrote:

 > >  ip_dst_cache   1292   1485    256   15    1

 > I guess we should find a way to make it happen faster.
 
Here is a route DoS attack. Pure routing, no NAT, no filter.

Start
=====
ip_dst_cache       5     30    256   15    1 : tunables  120   60    8 : slabdata      2      2      0

After DoS
=========
ip_dst_cache   66045  76125    256   15    1 : tunables  120   60    8 : slabdata   5075   5075    480

After some GC runs.
===================
ip_dst_cache       2     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0

No problems here. I saw Martin talked about NAT...

--ro
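
For readers who want to check such figures mechanically, here is a minimal sketch of a parser for one /proc/slabinfo line in the 2.x layout (name, active_objs, num_objs, objsize, objperslab, pagesperslab, then the tunables and slabdata groups). The struct and function names (`slab_line`, `parse_slab_line()`, `slab_bytes()`, `slab_consistent()`) are hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fields of one /proc/slabinfo line (2.x layout). */
struct slab_line {
        char name[64];
        unsigned long active_objs, num_objs, objsize, objperslab, pagesperslab;
        unsigned long limit, batchcount, sharedfactor;
        unsigned long active_slabs, num_slabs, sharedavail;
};

/* Returns 0 on success, -1 if the line does not carry all 12 fields. */
static int parse_slab_line(const char *line, struct slab_line *s)
{
        int n;

        assert(line != NULL && s != NULL);
        n = sscanf(line,
                   "%63s %lu %lu %lu %lu %lu : tunables %lu %lu %lu"
                   " : slabdata %lu %lu %lu",
                   s->name, &s->active_objs, &s->num_objs, &s->objsize,
                   &s->objperslab, &s->pagesperslab,
                   &s->limit, &s->batchcount, &s->sharedfactor,
                   &s->active_slabs, &s->num_slabs, &s->sharedavail);
        return n == 12 ? 0 : -1;
}

/* Bytes held by all allocated objects, whether in use or not. */
static unsigned long slab_bytes(const struct slab_line *s)
{
        return s->num_objs * s->objsize;
}

/* Sanity check: every slab carries objperslab objects. */
static int slab_consistent(const struct slab_line *s)
{
        return s->num_objs == s->num_slabs * s->objperslab;
}
```

Reading the "After DoS" line as 76125 objects of 256 bytes in 5075 slabs of 15 objects each, `slab_consistent()` holds (5075 * 15 = 76125) and `slab_bytes()` gives 19488000 bytes, roughly 19 MB of dst cache at the peak.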


[BUG] 2.6.* pktgen doesn't set ethnet header properly

2005-01-21 Thread Robert Olsson

Hello!

Look at

pginfos[i].hh[12] = 0x08; /* fill in protocol.  Rest is filled in later. */
pginfos[i].hh[13] = 0x00;

--ro


Junfeng Yang writes:
 > Hi,
 > 
 > I tried to use pktgen module from 2.6.* kernels and found out that I
 > couldn't receive any packets generated by pktgen.  I did not even see a
 > "packet dropped by kernel" message.  It turned out that function
 > setup_inject in net/core/pktgen.c doesn't setup the ethernet header field
 > correctly.  Below is a patch that fixes the problem.
 > 
 > --- kernel-source-2.6.8-orig/net/core/pktgen.c   2004-08-13 22:37:26.0 -0700
 > +++ kernel-source-2.6.8/net/core/pktgen.c   2005-01-19 17:54:46.0 -0800
 > @@ -259,6 +259,9 @@
 > 
 >  /* Set up Dest MAC */
 >  memcpy(&(info->hh[0]), info->dst_mac, 6);
 > +
 > +/* Set up protocol */
 > +((struct ethhdr *)(info->hh))->h_proto = htons(ETH_P_IP);
 > 
 >  info->saddr_min = 0;
 >  info->saddr_max = 0;
 > 
 > -Junfeng
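
The two approaches in this exchange write the same two bytes of the Ethernet header: offsets 12 and 13 hold the EtherType, and 0x0800 (IPv4) in network byte order is 0x08 followed by 0x00. The user-space sketch below illustrates that equivalence only; it does not explain why packets were not seen. `MODEL_ETH_HLEN`, `MODEL_ETH_P_IP`, `set_proto_bytes()`, `set_proto_htons()` and `proto_methods_agree()` are hypothetical names, and a plain byte array stands in for the pktgen `hh` buffer.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_HLEN  14      /* dst MAC (6) + src MAC (6) + type (2) */
#define MODEL_ETH_P_IP  0x0800  /* IPv4 EtherType */

/* Byte-wise store, as in the pginfos[i].hh[12] / hh[13] lines. */
static void set_proto_bytes(uint8_t hh[MODEL_ETH_HLEN])
{
        assert(hh != NULL);
        hh[12] = 0x08;
        hh[13] = 0x00;
}

/* 16-bit store in network byte order, as in the h_proto = htons(...) patch. */
static void set_proto_htons(uint8_t hh[MODEL_ETH_HLEN])
{
        uint16_t proto = htons(MODEL_ETH_P_IP);

        assert(hh != NULL);
        memcpy(&hh[12], &proto, sizeof(proto));
}

/* Returns 1 when both methods leave identical headers. */
static int proto_methods_agree(void)
{
        uint8_t a[MODEL_ETH_HLEN] = {0};
        uint8_t b[MODEL_ETH_HLEN] = {0};

        set_proto_bytes(a);
        set_proto_htons(b);
        return memcmp(a, b, sizeof(a)) == 0;
}
```

Because htons() produces network byte order on any host, `proto_methods_agree()` returns 1 on both little-endian and big-endian machines.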


Re: How to optimize routing performance

2001-03-15 Thread Robert Olsson


Manfred Spraul writes:
 > >
 > > http://Linux/net-development/experiments/010313
 > >
 > The link is broken, and I couldn't find it at www.linux.com. Did you
 > forget the host?

 Yes Sir!

 The profile data from the Linux production router is at:
 
 http://robur.slu.se/Linux/net-development/experiments/010313

 Cheers.

--ro




Re: How to optimize routing performance

2001-03-15 Thread Robert Olsson



Jonathan Morton writes:

 > Nice.  Any chance of similar functionality finding its way outside the
 > Tulip driver, eg. to 3c509 or via-rhine?  I'd find those useful, since one
 > or two of my Macs appear to be capable of generating pseudo-DoS levels of
 > traffic under certain circumstances which totally lock a 486 (for the
 > duration) and heavily load a P166 - even though said Macs "only" have
 > 10baseT Ethernet.

 I'm not the one to tell. :-) 

 First, it's kind of experimental. Jamal has talked about putting together 
 a proposal for enhancing the RX process for inclusion in the 2.5 kernels. 
 There is a meeting soon about this.


 But why not experiment a bit?

 Cheers.

--ro



Re: How to optimize routing performance

2001-03-15 Thread Robert Olsson


[Sorry for the length]

Rik van Riel writes:
 > On Thu, 15 Mar 2001, Robert Olsson wrote:
 > 
 > >  CONFIG_NET_HW_FLOWCONTROL enables kernel code for it, but device
 > >  drivers have to support it. Unfortunately, very few drivers
 > >  do.
 > 
 > Isn't it possible to put something like this in the layer just
 > above the driver ?

 There is a dropping point in netif_rx. The problem is that knowledge
 of congestion has to be pushed back to the devices that are causing it.

 Alexey added netdev_dropping for drivers to check. And via netdev_wakeup()
 the driver's xon_method can be called when the backlog falls below a
 certain threshold.

 So from here the driver has to do the work: not investing any resources or
 interrupts in packets we will have to drop anyway. That is what happens at
 very high load, a kind of livelock. On routers, routing protocols will time
 out and we lose connectivity. But I would say it's important for all apps.
 
 In 2.4.0-test10 Jamal added sampling of the backlog queue so device
 drivers get the current congestion level. This opens new possibilities.
 

 > It probably won't work as well as putting it directly in the
 > driver, but it'll at least keep Linux from collapsing under
 > really heavy loads ...

 
 We have also done experiments with controlling interrupts and running
 RX at "lower" priority. The idea is to take the RX interrupt and immediately
 postpone the RX processing to a tasklet. The tasklet re-enables RX interrupts
 when it's done. This way dropping now occurs outside the box, and
 dropping becomes very undramatic.


 As a little example of this, I monitored a DoS attack on a Linux router
 equipped with this RX-tasklet driver.


Admin up        6 day(s) 13 hour(s) 37 min 54 sec
Last input      NOW
Last output     NOW
5min RX bit/s   22.4 M
5min TX bit/s   1.3 M
5min RX pkts/s  44079    <
5min TX pkts/s  877
5min TX errors  0
5min RX errors  0
5min RX dropped 49913    <

Fb: no 3127894088 low 154133938 mod 6 high 0 drp 0    < Congestion levels

Polling:  ON starts/pkts/tasklet_count 96545881/2768574948/1850259980
HW_flowcontrol xon's 0



 A bit of explanation. Above is output from the tulip driver. We are forwarding
 44079 and we are dropping 49913 packets per second! This box has
 full BGP. The DoS attack went on for about 30 minutes; BGP survived
 and the box was manageable. Under a heavy attack it still performs well.


 Cheers.

--ro
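
The RX-tasklet idea described above can be sketched as a small user-space model. Everything here is an assumption made for illustration: `model_rx_ring`, `model_arrive()`, `model_rx_interrupt()` and `model_rx_tasklet()` are hypothetical names, not the tulip driver's code, and the rule that interrupts are re-enabled only once the ring is drained is this sketch's simplification of "the tasklet opens for new RX interrupts when it is done".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a NIC RX ring (not the tulip driver's structures). */
struct model_rx_ring {
        unsigned int capacity;  /* descriptors in the ring */
        unsigned int pending;   /* received, not yet processed */
        unsigned long dropped;  /* lost at the ring: no CPU spent on them */
        bool rx_irq_enabled;
};

/* Hardware side: n packets arrive; overflow is dropped at the ring. */
static void model_arrive(struct model_rx_ring *r, unsigned int n)
{
        assert(r != NULL);
        while (n--) {
                if (r->pending < r->capacity)
                        r->pending++;
                else
                        r->dropped++;
        }
}

/*
 * Interrupt handler: mask further RX interrupts and defer the work.
 * Returns true when the deferred handler should be scheduled.
 */
static bool model_rx_interrupt(struct model_rx_ring *r)
{
        if (!r->rx_irq_enabled || r->pending == 0)
                return false;
        r->rx_irq_enabled = false;
        return true;
}

/*
 * Deferred handler: process at most budget packets and re-enable RX
 * interrupts only once the ring is empty (assumption of this sketch).
 */
static unsigned int model_rx_tasklet(struct model_rx_ring *r,
                                     unsigned int budget)
{
        unsigned int done = r->pending < budget ? r->pending : budget;

        r->pending -= done;
        if (r->pending == 0)
                r->rx_irq_enabled = true;
        return done;
}
```

With a ring of 4 descriptors and a burst of 10 packets, the model counts 6 drops at the ring while taking a single interrupt; the deferred handler then works through the remaining packets at its own pace, which is the "dropping outside the box" effect in miniature.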




Re: How to optimize routing performance

2001-03-15 Thread Robert Olsson


Rik van Riel writes:
 > On Thu, 15 Mar 2001, Mårten Wikström wrote:
 > 
 > > I've performed a test on the routing capacity of a Linux 2.4.2 box
 > > versus a FreeBSD 4.2 box. I used two Pentium Pro 200Mhz computers with
 > > 64Mb memory, and two DEC 100Mbit ethernet cards. I used a Smartbits
 > > test-tool to measure the packet throughput and the packet size was set
 > > to 64 bytes. Linux dropped no packets up to about 27000 packets/s, but
 > > then it started to drop packets at higher rates. Worse yet, the output
 > > rate actually decreased, so at the input rate of 4 packets/s
 
 It is a known problem, yes. And just as Rik says, it was first addressed
 in 2.1.x by Alexey.


 > > almost no packets got through. The behaviour of FreeBSD was different,
 > > it showed a steadily increased output rate up to about 7 packets/s
 > > before the output rate decreased. (Then the output rate was apprx.
 > > 4 packets/s).
 > 
 > > So, my question is: are these figures true, or is it possible to
 > > optimize the kernel somehow? The only changes I have made to the
 > > kernel config was to disable advanced routing.
 > 
 > There are some flow control options in the kernel which should
 > help. From your description, it looks like they aren't enabled
 > by default ...

 CONFIG_NET_HW_FLOWCONTROL enables kernel code for it. But device
 drivers have to support it, and unfortunately very few drivers do.

 Also, we have done experiments where we move the device RX processing
 to SoftIRQ rather than IRQ. With this, RX is in better balance with
 other kernel tasks and TX. Under very high load and under DoS
 attacks the system is now manageable. It's in practical use already.
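
 As a rough illustration of what flow control support means for a
 driver, here is a user-space sketch of a backlog queue that sets a
 "dropping" flag once it is full and calls a driver-supplied xon
 callback after it has drained to a lower level. All names and
 thresholds here are hypothetical; this is not the kernel's
 CONFIG_NET_HW_FLOWCONTROL code, only a model of the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*xon_fn)(void *dev);  /* hypothetical driver restart callback */

struct backlog {
    size_t len;          /* packets currently queued */
    size_t max;          /* at this length new packets are dropped */
    size_t xon_level;    /* at or below this, a throttled device is restarted */
    bool dropping;       /* a driver checks this before spending work on RX */
    xon_fn xon;          /* called once when the backlog has drained */
    void *dev;           /* opaque argument passed to xon */
    unsigned long drops; /* packets rejected because the queue was full */
};

/* Enqueue one packet; returns false if it had to be dropped. */
bool backlog_enqueue(struct backlog *q)
{
    assert(q->xon_level < q->max);
    if (q->len >= q->max) {
        q->dropping = true;  /* congestion is now visible to the driver */
        q->drops++;
        return false;
    }
    q->len++;
    return true;
}

/* Dequeue one packet; restart the device once the backlog is low enough. */
bool backlog_dequeue(struct backlog *q)
{
    if (q->len == 0)
        return false;
    q->len--;
    if (q->dropping && q->len <= q->xon_level) {
        q->dropping = false;
        if (q->xon)
            q->xon(q->dev);
    }
    return true;
}

/* Example callback: counts how often the device was restarted. */
void count_xon(void *dev)
{
    (*(unsigned long *)dev)++;
}
```

 The gap between max and xon_level gives hysteresis, so the device is
 not stopped and restarted for every single packet around the limit.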


 > At the NordU/USENIX conference in Stockholm (this february) I
 > saw a nice presentation on the flow control code in the Linux
 > networking code and how it improved networking performance.
 > I'm pretty convinced that flow control _should_ be saving your
 > system in this case.

 Thanks Rik. 

 This is work/experiments by Jamal and me with support from Gurus. :-)
 Jamal gave this presentation at OLS 2000. At NordU/USENIX I gave an
 updated version of it. The presentation is not yet available from
 the usenix web site, I think.

 It can be fetched by ftp from robur.slu.se:
 /pub/Linux/tmp/FF-NordUSENIX.pdf or .ps

 In summary, Linux is a very decent router: wire speed with small
 packets @ 100 Mbps, and capable of Gigabit routing (we tested with
 1440 pkts).
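
 For reference, "wire speed" with small packets can be worked out from
 the Ethernet framing. The sketch below assumes standard Ethernet
 overhead on the wire (7 bytes preamble, 1 byte start-of-frame
 delimiter, 12 bytes inter-frame gap); the function and constant names
 are made up for the illustration.

```c
#include <assert.h>

/* Bytes on the wire per frame that are not part of the frame itself:
 * 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap. */
enum { ETH_WIRE_OVERHEAD_BYTES = 20 };

/* Maximum frames per second for a given link speed and frame size. */
double eth_max_pps(double link_bits_per_sec, unsigned frame_bytes)
{
    assert(frame_bytes >= 64);  /* minimum Ethernet frame size */
    return link_bits_per_sec / (8.0 * (frame_bytes + ETH_WIRE_OVERHEAD_BYTES));
}
```

 For 64-byte frames at 100 Mbps, eth_max_pps(100e6, 64) comes to roughly
 148800 packets per second.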
 
 Also, if people are interested, we have done profiling on a Linux
 production router with full BGP at a pretty loaded site. This gives
 us costs for route lookup, skb malloc/free, interrupts etc.
 
 http://Linux/net-development/experiments/010313

 I'm on netdev but not the kernel list.

 Cheers.

--ro



Re: How to optimize routing performance

2001-03-15 Thread Robert Olsson



Jonathan Morton writes:

 > Nice.  Any chance of similar functionality finding its' way outside the
 > Tulip driver, eg. to 3c509 or via-rhine?  I'd find those useful, since one
 > or two of my Macs appear to be capable of generating pseudo-DoS levels of
 > traffic under certain circumstances which totally lock a 486 (for the
 > duration) and heavily load a P166 - even though said Macs "only" have
 > 10baseT Ethernet.

 I'm not the one to tell. :-) 

 First, it's kind of experimental. Jamal has talked about putting together
 a proposal for enhancing the RX process for inclusion in the 2.5 kernels.
 There is a meeting soon for this.


 But why not experiment a bit?

 Cheers.

--ro



Re: How to optimize routing performance

2001-03-15 Thread Robert Olsson


Manfred Spraul writes:
  
 > > http://Linux/net-development/experiments/010313
 >
 > The link is broken, and I couldn't find it at www.linux.com. Did you
 > forget the host?

 Yes Sir!

 The profile data from the Linux production router is at:
 
 http://robur.slu.se/Linux/net-development/experiments/010313

 Cheers.

--ro




Re: Preallocated skb's?

2000-09-14 Thread Robert Olsson



Yes !

The FF experiments with 2.1.X indicated an improvement factor of about 2-3
times with skb recycling. With the combination of FF and skb recycling we
could reach fast Ethernet wire-speed forwarding on a 400 MHz CPU, about
147 KPPS. As Jamal reported, the improvement is much less today, but the
forwarding performance is impressive even without FF and skb recycling.
Slab seems to do a good job, especially when the debug is disabled. :-)
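
For readers who have not seen the technique, buffer recycling can be
sketched in user space as a free list placed in front of the general
allocator. This is only an illustration: the names, the buffer size and
the cap are assumptions, and it is not the skb recycling code used in
the FF experiments.

```c
#include <assert.h>
#include <stdlib.h>

#define BUF_SIZE 2048  /* assumed size of one packet buffer */

struct buf {
    struct buf *next;              /* link used while on the free list */
    unsigned char data[BUF_SIZE];
};

struct buf_pool {
    struct buf *free_list;  /* recycled buffers, most recently freed first */
    size_t cached;          /* buffers currently on the free list */
    size_t max_cached;      /* cap so the pool cannot grow without bound */
    unsigned long reused;   /* allocations served from the free list */
    unsigned long fresh;    /* allocations that fell through to malloc() */
};

/* Get a buffer, preferring a recycled one over a fresh allocation. */
struct buf *pool_get(struct buf_pool *p)
{
    struct buf *b = p->free_list;

    if (b) {
        p->free_list = b->next;
        p->cached--;
        p->reused++;
    } else {
        b = malloc(sizeof(*b));
        if (!b)
            return NULL;
        p->fresh++;
    }
    b->next = NULL;
    return b;
}

/* Return a buffer; keep it for reuse unless the pool is already full. */
void pool_put(struct buf_pool *p, struct buf *b)
{
    assert(b != NULL);
    if (p->cached < p->max_cached) {
        b->next = p->free_list;
        p->free_list = b;
        p->cached++;
    } else {
        free(b);
    }
}
```

The free list is last-in, first-out, so the buffer handed out next is
the one freed most recently, which is also the one most likely to still
be in cache.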


--ro

Andi Kleen writes:
 > On Thu, Sep 14, 2000 at 11:59:32PM +1100, Andrew Morton wrote:
 > > That's 20 usec per interrupt, of which 1 usec could be saved by skb
 > > pooling.
 > 
 > FF usually runs with interrupt mitigation at higher rates (8-16 or even
 > more packets / interrupt). I agree though that it probably does not 
 > make too much difference.  alloc_skb could probably be made cheaper 
 > for the FF case by being more clever in the slab constructor (I think
 > there was some bitrot during 2.3 on the cache line usage -- 2.2 pretty
 > much only needed 2 cache lines in the header for a FF packet) 
 > 
 > 
 > -Andi


