[PATCH net-next] phy: make some bits preserved while setup forced mode

2016-04-14 Thread Weidong Wang
When tested the PHY SGMII Loopback:
1.set the LOOPBACK bit,
2.set the autoneg to AUTONEG_DISABLE, it calls the
genphy_setup_forced which will clear the bit.

The BMCR_LOOPBACK bit should be preserved.

As Florian pointed out that other bits should be preserved too.
So I make the BMCR_ISOLATE and BMCR_PDOWN as well.

Signed-off-by: Weidong Wang <wangweido...@huawei.com>
---
the patch is evoluted from "phy: keep the BCMR_LOOPBACK bit
while setup forced mode".

---
 drivers/net/phy/phy_device.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index d551df6..9b24d7e 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -818,8 +818,9 @@ static int genphy_config_advert(struct phy_device *phydev)
  */
 int genphy_setup_forced(struct phy_device *phydev)
 {
-   int ctl = 0;
+   int ctl = phy_read(phydev, MII_BMCR);

+   ctl &= BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN;
phydev->pause = 0;
phydev->asym_pause = 0;

-- 
2.7.0




Re: [RESEND PATCH net-next] phy: keep the BCMR_LOOPBACK bit while setup forced mode

2016-04-13 Thread Weidong Wang
On 2016/4/14 2:41, Florian Fainelli wrote:
> On 13/04/16 04:59, Weidong Wang wrote:
>> When tested the PHY SGMII Loopback,:
>> 1.set the LOOPBACK bit,
>> 2.set the autoneg to AUTONEG_DISABLE, it calls the
>> genphy_setup_forced which will clear the bit.
>>
>> So just keep the LOOPBACK bit while setup forced mode.
> 
> Humm, it makes sense why we want this one, but maybe we want other bits
> to be preserved too, like MII_ISOLATE for instance?
> 

I will add the  MII_ISOLATE as well.

> Or maybe we should have a separate way to put the PHY into loopback mode
> which is deterministic and takes care of forcing the link at the same time?
> 

I test with the loopback mode, and the link status is undeterminable.

>>
>> Signed-off-by: Weidong Wang <wangweido...@huawei.com>
>> ---
>>  drivers/net/phy/phy_device.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index e551f3a..8da4b80 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -1124,7 +1124,9 @@ static int genphy_config_advert(struct phy_device 
>> *phydev)
>>  int genphy_setup_forced(struct phy_device *phydev)
>>  {
>>  int ctl = 0;
>> +int val = phy_read(phydev, MII_BMCR);
>>
>> +ctl |= val & BMCR_LOOPBACK;
>>  phydev->pause = 0;
>>  phydev->asym_pause = 0;
>>
>> -- 2.7.0
>>
> 
> 




Re: [RESEND PATCH net-next] phy: keep the BCMR_LOOPBACK bit while setup forced mode

2016-04-13 Thread Weidong Wang
On 2016/4/13 22:19, Sergei Shtylyov wrote:
> Hello.
> 
> On 4/13/2016 2:59 PM, Weidong Wang wrote:
> 
>> When tested the PHY SGMII Loopback,:
>> 1.set the LOOPBACK bit,
>> 2.set the autoneg to AUTONEG_DISABLE, it calls the
>> genphy_setup_forced which will clear the bit.
>>
>> So just keep the LOOPBACK bit while setup forced mode.
>>
>> Signed-off-by: Weidong Wang <wangweido...@huawei.com>
>> ---
>>   drivers/net/phy/phy_device.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index e551f3a..8da4b80 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -1124,7 +1124,9 @@ static int genphy_config_advert(struct phy_device 
>> *phydev)
>>   int genphy_setup_forced(struct phy_device *phydev)
>>   {
>>   int ctl = 0;
>> +int val = phy_read(phydev, MII_BMCR);
> 
>Please place this declaration first, DaveM prefers declarations to be 
> sorted from longest to shortest.
> 
>>
>> +ctl |= val & BMCR_LOOPBACK;
> 
>Just =, removing the 'ctl' initializer, please.
> 
> [...]
> 
> MBR, Sergei
> 
Got it.

Regards,
Weidong
> 
> 




[RESEND PATCH net-next] phy: keep the BCMR_LOOPBACK bit while setup forced mode

2016-04-13 Thread Weidong Wang
When tested the PHY SGMII Loopback,:
1.set the LOOPBACK bit,
2.set the autoneg to AUTONEG_DISABLE, it calls the
genphy_setup_forced which will clear the bit.

So just keep the LOOPBACK bit while setup forced mode.

Signed-off-by: Weidong Wang <wangweido...@huawei.com>
---
 drivers/net/phy/phy_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index e551f3a..8da4b80 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1124,7 +1124,9 @@ static int genphy_config_advert(struct phy_device *phydev)
 int genphy_setup_forced(struct phy_device *phydev)
 {
int ctl = 0;
+   int val = phy_read(phydev, MII_BMCR);

+   ctl |= val & BMCR_LOOPBACK;
phydev->pause = 0;
phydev->asym_pause = 0;

-- 2.7.0



[PATCH net-next] phy: keep the BCMR_LOOPBACK bit while setup forced mode

2016-03-24 Thread Weidong Wang
When tested the PHY SGMII Loopback,:
1.set the LOOPBACK bit,
2.set the autoneg to AUTONEG_DISABLE, it calls the
genphy_setup_forced which will clear the bit.

So just keep the LOOPBACK bit while setup forced mode.

Signed-off-by: Weidong Wang <wangweido...@huawei.com>
---
 drivers/net/phy/phy_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index e551f3a..8da4b80 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1124,7 +1124,9 @@ static int genphy_config_advert(struct phy_device *phydev)
 int genphy_setup_forced(struct phy_device *phydev)
 {
int ctl = 0;
+   int val = phy_read(phydev, MII_BMCR);

+   ctl |= val & BMCR_LOOPBACK;
phydev->pause = 0;
phydev->asym_pause = 0;

-- 
2.7.0




[PATCH net-next] netfilter: nf_conntrack: remove the unneed check for *bucket

2016-01-30 Thread Weidong Wang
In the 'for(...) {}', the *bucket alwasy < net->ct.htable_size,
so remove the check

Signed-off-by: Weidong Wang <wangweido...@huawei.com>
---
 net/netfilter/nf_conntrack_core.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c 
b/net/netfilter/nf_conntrack_core.c
index 3cb3cb8..cd7d5c8 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1383,14 +1383,12 @@ get_next_corpse(struct net *net, int (*iter)(struct 
nf_conn *i, void *data),
lockp = _conntrack_locks[*bucket % CONNTRACK_LOCKS];
local_bh_disable();
spin_lock(lockp);
-   if (*bucket < net->ct.htable_size) {
-   hlist_nulls_for_each_entry(h, n, 
>ct.hash[*bucket], hnnode) {
-   if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
-   continue;
-   ct = nf_ct_tuplehash_to_ctrack(h);
-   if (iter(ct, data))
-   goto found;
-   }
+   hlist_nulls_for_each_entry(h, n, >ct.hash[*bucket], 
hnnode) {
+   if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
+   continue;
+   ct = nf_ct_tuplehash_to_ctrack(h);
+   if (iter(ct, data))
+   goto found;
}
spin_unlock(lockp);
local_bh_enable();
-- 
2.7.0




Re: [PATCH net-next] netfilter: nf_conntrack: remove the unneed check for *bucket

2016-01-30 Thread Weidong Wang
On 2016/1/31 5:30, Florian Westphal wrote:
> Weidong Wang <wangweido...@huawei.com> wrote:
>> In the 'for(...) {}', the *bucket alwasy < net->ct.htable_size,
>> so remove the check
>> @@ -1383,14 +1383,12 @@ get_next_corpse(struct net *net, int (*iter)(struct 
>> nf_conn *i, void *data),
>>  lockp = _conntrack_locks[*bucket % CONNTRACK_LOCKS];
>>  local_bh_disable();
>>  spin_lock(lockp);
>> -if (*bucket < net->ct.htable_size) {
> 
> AFAIU net->ct.htable_size can shrink between for-test and aquiring
> the bucket lockp, so this additional if-test is needed.
> 

ok, Got it.
So ignore this patch.

Regards,
Weidong

> .
> 




[PATCH net-next] BNX2: free temp_stats_blk on error path

2015-10-12 Thread Weidong Wang
In bnx2_init_board, missing free temp_stats_blk on error path when
some operations do failed. Just add the 'kfree' operation.

Signed-off-by: Wang Weidong <wangweido...@huawei.com>
---
 drivers/net/ethernet/broadcom/bnx2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c 
b/drivers/net/ethernet/broadcom/bnx2.c
index 6259064..8fc3f3c 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -8476,6 +8476,8 @@ err_out_disable:
pci_disable_device(pdev);

 err_out:
+   kfree(bp->temp_stats_blk);
+
return rc;
 }

-- 
1.9.0


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next v3] BNX2: fix a Null Pointer for stats_blk

2015-10-08 Thread Weidong Wang
we have two processes to do:
P1#: ifconfig eth0 down; which will call bnx2_close, then will
, and set Null to stats_blk
P2#: ifconfig eth0; which will call bnx2_get_stats64, it will
use stats_blk.
In one case:
--P1#--   --P2#--
  stats_blk(no null)
bnx2_free_mem
->bp->stats_blk = NULL
  GET_64BIT_NET_STATS

then it will cause 'NULL Pointer' Problem.
it is as well with 'ethtool -S ethx'.

Allocate the statistics block at probe time so that this problem is
impossible

Signed-off-by: Wang Weidong <wangweido...@huawei.com>
---
Change in v2:
 - Use Allocate the statistics block instead of spinlock, which
   suggested by David Miller.
 - Updating commit message according to changes.

Change in v3:
 - bnx2_alloc_stats_blk is just allocating the stats block.
 - use 'bp->status_blk' to store the status_blk which would used
   in other funcs, just to key the codes simplified.

---
 drivers/net/ethernet/broadcom/bnx2.c | 79 
 drivers/net/ethernet/broadcom/bnx2.h |  1 +
 2 files changed, 53 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c 
b/drivers/net/ethernet/broadcom/bnx2.c
index 2b66ef3..6259064 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -813,6 +813,46 @@ bnx2_alloc_rx_mem(struct bnx2 *bp)
 }

 static void
+bnx2_free_stats_blk(struct net_device *dev)
+{
+   struct bnx2 *bp = netdev_priv(dev);
+
+   if (bp->status_blk) {
+   dma_free_coherent(>pdev->dev, bp->status_stats_size,
+ bp->status_blk,
+ bp->status_blk_mapping);
+   bp->status_blk = NULL;
+   bp->stats_blk = NULL;
+   }
+}
+
+static int
+bnx2_alloc_stats_blk(struct net_device *dev)
+{
+   int status_blk_size;
+   void *status_blk;
+   struct bnx2 *bp = netdev_priv(dev);
+
+   /* Combine status and statistics blocks into one allocation. */
+   status_blk_size = L1_CACHE_ALIGN(sizeof(struct status_block));
+   if (bp->flags & BNX2_FLAG_MSIX_CAP)
+   status_blk_size = L1_CACHE_ALIGN(BNX2_MAX_MSIX_HW_VEC *
+BNX2_SBLK_MSIX_ALIGN_SIZE);
+   bp->status_stats_size = status_blk_size +
+   sizeof(struct statistics_block);
+   status_blk = dma_zalloc_coherent(>pdev->dev, bp->status_stats_size,
+>status_blk_mapping, GFP_KERNEL);
+   if (status_blk == NULL)
+   return -ENOMEM;
+
+   bp->status_blk = status_blk;
+   bp->stats_blk = status_blk + status_blk_size;
+   bp->stats_blk_mapping = bp->status_blk_mapping + status_blk_size;
+
+   return 0;
+}
+
+static void
 bnx2_free_mem(struct bnx2 *bp)
 {
int i;
@@ -829,37 +869,19 @@ bnx2_free_mem(struct bnx2 *bp)
bp->ctx_blk[i] = NULL;
}
}
-   if (bnapi->status_blk.msi) {
-   dma_free_coherent(>pdev->dev, bp->status_stats_size,
- bnapi->status_blk.msi,
- bp->status_blk_mapping);
+
+   if (bnapi->status_blk.msi)
bnapi->status_blk.msi = NULL;
-   bp->stats_blk = NULL;
-   }
 }

 static int
 bnx2_alloc_mem(struct bnx2 *bp)
 {
-   int i, status_blk_size, err;
+   int i, err;
struct bnx2_napi *bnapi;
-   void *status_blk;
-
-   /* Combine status and statistics blocks into one allocation. */
-   status_blk_size = L1_CACHE_ALIGN(sizeof(struct status_block));
-   if (bp->flags & BNX2_FLAG_MSIX_CAP)
-   status_blk_size = L1_CACHE_ALIGN(BNX2_MAX_MSIX_HW_VEC *
-BNX2_SBLK_MSIX_ALIGN_SIZE);
-   bp->status_stats_size = status_blk_size +
-   sizeof(struct statistics_block);
-
-   status_blk = dma_zalloc_coherent(>pdev->dev, bp->status_stats_size,
->status_blk_mapping, GFP_KERNEL);
-   if (status_blk == NULL)
-   goto alloc_mem_err;

bnapi = >bnx2_napi[0];
-   bnapi->status_blk.msi = status_blk;
+   bnapi->status_blk.msi = bp->status_blk;
bnapi->hw_tx_cons_ptr =
>status_blk.msi->status_tx_quick_consumer_index0;
bnapi->hw_rx_cons_ptr =
@@ -870,7 +892,7 @@ bnx2_alloc_mem(struct bnx2 *bp)

bnapi = >bnx2_napi[i];

-   sblk = (status_blk + BNX2_SBLK_MSIX_ALIGN_SIZE * i);
+   sblk = (bp->status_blk + BNX2_SBLK_MSIX_ALIGN_SIZE * i);
bnapi->status_blk.msix = sblk;
bnapi->hw_tx_cons_pt

[PATCH net-next v2] BNX2: fix a Null Pointer for stats_blk

2015-09-28 Thread Weidong Wang
we have two processes to do:
P1#: ifconfig eth0 down; which will call bnx2_close, then will
, and set Null to stats_blk
P2#: ifconfig eth0; which will call bnx2_get_stats64, it will
use stats_blk.
In one case:
--P1#--   --P2#--
  stats_blk(no null)
bnx2_free_mem
->bp->stats_blk = NULL
  GET_64BIT_NET_STATS

then it will cause 'NULL Pointer' Problem.
it is as well with 'ethtool -S ethx'.

Allocate the statistics block at probe time so that this problem is
impossible

Signed-off-by: Tianhong Ding 
---
Change in v2:
 - Use Allocate the statistics block instead of spinlock, which
   suggested by David Miller.
 - Updating commit message according to changes.

---
 drivers/net/ethernet/broadcom/bnx2.c | 61 +++-
 1 file changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c 
b/drivers/net/ethernet/broadcom/bnx2.c
index 2b66ef3..1f33982 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -813,22 +813,11 @@ bnx2_alloc_rx_mem(struct bnx2 *bp)
 }

 static void
-bnx2_free_mem(struct bnx2 *bp)
+bnx2_free_stats_blk(struct net_device *dev)
 {
-   int i;
+   struct bnx2 *bp = netdev_priv(dev);
struct bnx2_napi *bnapi = >bnx2_napi[0];

-   bnx2_free_tx_mem(bp);
-   bnx2_free_rx_mem(bp);
-
-   for (i = 0; i < bp->ctx_pages; i++) {
-   if (bp->ctx_blk[i]) {
-   dma_free_coherent(>pdev->dev, BNX2_PAGE_SIZE,
- bp->ctx_blk[i],
- bp->ctx_blk_mapping[i]);
-   bp->ctx_blk[i] = NULL;
-   }
-   }
if (bnapi->status_blk.msi) {
dma_free_coherent(>pdev->dev, bp->status_stats_size,
  bnapi->status_blk.msi,
@@ -839,11 +828,12 @@ bnx2_free_mem(struct bnx2 *bp)
 }

 static int
-bnx2_alloc_mem(struct bnx2 *bp)
+bnx2_alloc_stats_blk(struct net_device *dev)
 {
-   int i, status_blk_size, err;
+   int i, status_blk_size;
struct bnx2_napi *bnapi;
void *status_blk;
+   struct bnx2 *bp = netdev_priv(dev);

/* Combine status and statistics blocks into one allocation. */
status_blk_size = L1_CACHE_ALIGN(sizeof(struct status_block));
@@ -852,11 +842,10 @@ bnx2_alloc_mem(struct bnx2 *bp)
 BNX2_SBLK_MSIX_ALIGN_SIZE);
bp->status_stats_size = status_blk_size +
sizeof(struct statistics_block);
-
status_blk = dma_zalloc_coherent(>pdev->dev, bp->status_stats_size,
 >status_blk_mapping, GFP_KERNEL);
if (status_blk == NULL)
-   goto alloc_mem_err;
+   return -ENOMEM;

bnapi = >bnx2_napi[0];
bnapi->status_blk.msi = status_blk;
@@ -865,11 +854,10 @@ bnx2_alloc_mem(struct bnx2 *bp)
bnapi->hw_rx_cons_ptr =
>status_blk.msi->status_rx_quick_consumer_index0;
if (bp->flags & BNX2_FLAG_MSIX_CAP) {
-   for (i = 1; i < bp->irq_nvecs; i++) {
+   for (i = 1; i < BNX2_MAX_MSIX_HW_VEC; i++) {
struct status_block_msix *sblk;

bnapi = >bnx2_napi[i];
-
sblk = (status_blk + BNX2_SBLK_MSIX_ALIGN_SIZE * i);
bnapi->status_blk.msix = sblk;
bnapi->hw_tx_cons_ptr =
@@ -879,11 +867,35 @@ bnx2_alloc_mem(struct bnx2 *bp)
bnapi->int_num = i << 24;
}
}
-
bp->stats_blk = status_blk + status_blk_size;
-
bp->stats_blk_mapping = bp->status_blk_mapping + status_blk_size;

+   return 0;
+}
+
+static void
+bnx2_free_mem(struct bnx2 *bp)
+{
+   int i;
+
+   bnx2_free_tx_mem(bp);
+   bnx2_free_rx_mem(bp);
+
+   for (i = 0; i < bp->ctx_pages; i++) {
+   if (bp->ctx_blk[i]) {
+   dma_free_coherent(>pdev->dev, BNX2_PAGE_SIZE,
+ bp->ctx_blk[i],
+ bp->ctx_blk_mapping[i]);
+   bp->ctx_blk[i] = NULL;
+   }
+   }
+}
+
+static int
+bnx2_alloc_mem(struct bnx2 *bp)
+{
+   int i, err;
+
if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
bp->ctx_pages = 0x2000 / BNX2_PAGE_SIZE;
if (bp->ctx_pages == 0)
@@ -8330,6 +8342,11 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device 
*dev)

bp->phy_addr = 1;

+   /* allocate stats_blk */
+   rc = bnx2_alloc_stats_blk(dev);
+   if (rc)
+   goto err_out_unmap;
+
/* Disable WOL support if we are running on a SERDES chip. */
if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
bnx2_get_5709_media(bp);
@@ -8586,6 

[PATCH net-next v2 RESEND] BNX2: fix a Null Pointer for stats_blk

2015-09-28 Thread Weidong Wang
we have two processes to do:
P1#: ifconfig eth0 down; which will call bnx2_close, then will
, and set Null to stats_blk
P2#: ifconfig eth0; which will call bnx2_get_stats64, it will
use stats_blk.
In one case:
--P1#--   --P2#--
  stats_blk(no null)
bnx2_free_mem
->bp->stats_blk = NULL
  GET_64BIT_NET_STATS

then it will cause 'NULL Pointer' Problem.
it is as well with 'ethtool -S ethx'.

Allocate the statistics block at probe time so that this problem is
impossible

Signed-off-by: Weidong Wang <wangweido...@huawei.com>
---
Change in v2:
 - Use Allocate the statistics block instead of spinlock, which
   suggested by David Miller.
 - Updating commit message according to changes.

---
 drivers/net/ethernet/broadcom/bnx2.c | 61 +++-
 1 file changed, 40 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c 
b/drivers/net/ethernet/broadcom/bnx2.c
index 2b66ef3..1f33982 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -813,22 +813,11 @@ bnx2_alloc_rx_mem(struct bnx2 *bp)
 }

 static void
-bnx2_free_mem(struct bnx2 *bp)
+bnx2_free_stats_blk(struct net_device *dev)
 {
-   int i;
+   struct bnx2 *bp = netdev_priv(dev);
struct bnx2_napi *bnapi = >bnx2_napi[0];

-   bnx2_free_tx_mem(bp);
-   bnx2_free_rx_mem(bp);
-
-   for (i = 0; i < bp->ctx_pages; i++) {
-   if (bp->ctx_blk[i]) {
-   dma_free_coherent(>pdev->dev, BNX2_PAGE_SIZE,
- bp->ctx_blk[i],
- bp->ctx_blk_mapping[i]);
-   bp->ctx_blk[i] = NULL;
-   }
-   }
if (bnapi->status_blk.msi) {
dma_free_coherent(>pdev->dev, bp->status_stats_size,
  bnapi->status_blk.msi,
@@ -839,11 +828,12 @@ bnx2_free_mem(struct bnx2 *bp)
 }

 static int
-bnx2_alloc_mem(struct bnx2 *bp)
+bnx2_alloc_stats_blk(struct net_device *dev)
 {
-   int i, status_blk_size, err;
+   int i, status_blk_size;
struct bnx2_napi *bnapi;
void *status_blk;
+   struct bnx2 *bp = netdev_priv(dev);

/* Combine status and statistics blocks into one allocation. */
status_blk_size = L1_CACHE_ALIGN(sizeof(struct status_block));
@@ -852,11 +842,10 @@ bnx2_alloc_mem(struct bnx2 *bp)
 BNX2_SBLK_MSIX_ALIGN_SIZE);
bp->status_stats_size = status_blk_size +
sizeof(struct statistics_block);
-
status_blk = dma_zalloc_coherent(>pdev->dev, bp->status_stats_size,
 >status_blk_mapping, GFP_KERNEL);
if (status_blk == NULL)
-   goto alloc_mem_err;
+   return -ENOMEM;

bnapi = >bnx2_napi[0];
bnapi->status_blk.msi = status_blk;
@@ -865,11 +854,10 @@ bnx2_alloc_mem(struct bnx2 *bp)
bnapi->hw_rx_cons_ptr =
>status_blk.msi->status_rx_quick_consumer_index0;
if (bp->flags & BNX2_FLAG_MSIX_CAP) {
-   for (i = 1; i < bp->irq_nvecs; i++) {
+   for (i = 1; i < BNX2_MAX_MSIX_HW_VEC; i++) {
struct status_block_msix *sblk;

bnapi = >bnx2_napi[i];
-
sblk = (status_blk + BNX2_SBLK_MSIX_ALIGN_SIZE * i);
bnapi->status_blk.msix = sblk;
bnapi->hw_tx_cons_ptr =
@@ -879,11 +867,35 @@ bnx2_alloc_mem(struct bnx2 *bp)
bnapi->int_num = i << 24;
}
}
-
bp->stats_blk = status_blk + status_blk_size;
-
bp->stats_blk_mapping = bp->status_blk_mapping + status_blk_size;

+   return 0;
+}
+
+static void
+bnx2_free_mem(struct bnx2 *bp)
+{
+   int i;
+
+   bnx2_free_tx_mem(bp);
+   bnx2_free_rx_mem(bp);
+
+   for (i = 0; i < bp->ctx_pages; i++) {
+   if (bp->ctx_blk[i]) {
+   dma_free_coherent(>pdev->dev, BNX2_PAGE_SIZE,
+ bp->ctx_blk[i],
+ bp->ctx_blk_mapping[i]);
+   bp->ctx_blk[i] = NULL;
+   }
+   }
+}
+
+static int
+bnx2_alloc_mem(struct bnx2 *bp)
+{
+   int i, err;
+
if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
bp->ctx_pages = 0x2000 / BNX2_PAGE_SIZE;
if (bp->ctx_pages == 0)
@@ -8330,6 +8342,11 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device 
*dev)

bp->phy_addr = 1;

+   /* allocate stats_blk */
+   rc = bnx2_alloc_stats_blk(dev);
+   if (rc)
+   goto err_out_unmap;
+
  

Re: [PATCH net-next v2] BNX2: fix a Null Pointer for stats_blk

2015-09-28 Thread Weidong Wang
On 2015/9/28 15:01, Weidong Wang wrote:
> we have two processes to do:
> P1#: ifconfig eth0 down; which will call bnx2_close, then will
> , and set Null to stats_blk
> P2#: ifconfig eth0; which will call bnx2_get_stats64, it will
> use stats_blk.
> In one case:
> --P1#--   --P2#--
>   stats_blk(no null)
> bnx2_free_mem
> ->bp->stats_blk = NULL
>   GET_64BIT_NET_STATS
> 
> then it will cause 'NULL Pointer' Problem.
> it is as well with 'ethtool -S ethx'.
> 
> Allocate the statistics block at probe time so that this problem is
> impossible
> 
> Signed-off-by: Tianhong Ding <dingtianh...@huawei.com>
> ---

Sorry for that, The sob is Error. I will fixed it.

Just Ignore it.

Regards.
Weidong

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] BNX2: fix a Null Pointer for stats_blk

2015-09-24 Thread Weidong Wang
On 2015/9/24 13:34, David Miller wrote:
> From: Weidong Wang <wangweido...@huawei.com>
> Date: Thu, 24 Sep 2015 10:00:45 +0800
> 
>> It does affect the intention. Although, the problem exists then makes the
>> system panic within some case.
>>
>> Do you have any idea about it?
> 
> Allocate the statistics block at probe time so that this problem is
> impossible.
> 

It is a good idea.

Yet, what is the intention of the dynamic to alloc/free stats_block?
what will be affected by allocating the statistics block.

Best Regards,
Weidong

> 


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] BNX2: fix a Null Pointer for stats_blk

2015-09-23 Thread Weidong Wang
On 2015/9/24 6:31, David Miller wrote:
> From: Weidong Wang <wangweido...@huawei.com>
> Date: Tue, 22 Sep 2015 20:42:40 +0800
> 
>> @@ -880,6 +882,7 @@ bnx2_alloc_mem(struct bnx2 *bp)
>>  }
>>  }
>>
>> +spin_lock(>stats64_lock);
>>  bp->stats_blk = status_blk + status_blk_size;
>>
>>  bp->stats_blk_mapping = bp->status_blk_mapping + status_blk_size;
>> @@ -894,20 +897,23 @@ bnx2_alloc_mem(struct bnx2 *bp)
>>  >ctx_blk_mapping[i],
>>  GFP_KERNEL);
>>  if (bp->ctx_blk[i] == NULL)
>> -goto alloc_mem_err;
>> +goto free_stats64_lock;
>>  }
>>  }
>>
>>  err = bnx2_alloc_rx_mem(bp);
>>  if (err)
>> -goto alloc_mem_err;
>> +goto free_stats64_lock;
> 
> You're holding a spinlock while doing GFP_KERNEL allocations.
> 

hm, yep, I should move it after the allocations. Like this:

@@ -880,7 +882,9 @@ bnx2_alloc_mem(struct bnx2 *bp)
}
}

+   spin_lock(>stats64_lock);
bp->stats_blk = status_blk + status_blk_size;
+   spin_unlock(>stats64_lock);

the allocations won't use the stats_blk, so I shouldn't hold the
lock while doing allocations.

> Second of all, taking a spinlock in get_stats64() defeats the whole
> intention of making statistics acquisition as fast and as SMP scalable
> as possible.
> 

It does affect the intention. Although, the problem exists then makes the
system panic within some case.

Do you have any idea about it?

Best Regards,
Weidong

> .
> 


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next] BNX2: fix a Null Pointer for stats_blk

2015-09-22 Thread Weidong Wang
we have two processes to do:
P1#: ifconfig eth0 down; which will call bnx2_close, then will
, and set Null to stats_blk
P2#: ifconfig eth0; which will call bnx2_get_stats64, it will
use stats_blk.
In one case:
--P1#--   --P2#--
  stats_blk(no null)
bnx2_free_mem
->bp->stats_blk = NULL
  GET_64BIT_NET_STATS

then it will cause 'NULL Pointer' Problem.
it is as well with 'ethtool -S ethx'.

BTW, the other branch has this problem as well.

So we add a spin_lock to protect stats_blk.

Signed-off-by: Wang Weidong <wangweido...@huawei.com>
---
 drivers/net/ethernet/broadcom/bnx2.c | 42 
 drivers/net/ethernet/broadcom/bnx2.h |  1 +
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c 
b/drivers/net/ethernet/broadcom/bnx2.c
index 2b66ef3..aec4081 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -830,11 +830,13 @@ bnx2_free_mem(struct bnx2 *bp)
}
}
if (bnapi->status_blk.msi) {
+   spin_lock(>stats64_lock);
dma_free_coherent(>pdev->dev, bp->status_stats_size,
  bnapi->status_blk.msi,
  bp->status_blk_mapping);
bnapi->status_blk.msi = NULL;
bp->stats_blk = NULL;
+   spin_unlock(>stats64_lock);
}
 }

@@ -880,6 +882,7 @@ bnx2_alloc_mem(struct bnx2 *bp)
}
}

+   spin_lock(>stats64_lock);
bp->stats_blk = status_blk + status_blk_size;

bp->stats_blk_mapping = bp->status_blk_mapping + status_blk_size;
@@ -894,20 +897,23 @@ bnx2_alloc_mem(struct bnx2 *bp)
>ctx_blk_mapping[i],
GFP_KERNEL);
if (bp->ctx_blk[i] == NULL)
-   goto alloc_mem_err;
+   goto free_stats64_lock;
}
}

err = bnx2_alloc_rx_mem(bp);
if (err)
-   goto alloc_mem_err;
+   goto free_stats64_lock;

err = bnx2_alloc_tx_mem(bp);
if (err)
-   goto alloc_mem_err;
+   goto free_stats64_lock;

+   spin_unlock(>stats64_lock);
return 0;

+free_stats64_lock:
+   spin_unlock(>stats64_lock);
 alloc_mem_err:
bnx2_free_mem(bp);
return -ENOMEM;
@@ -6756,10 +6762,14 @@ bnx2_close(struct net_device *dev)
 static void
 bnx2_save_stats(struct bnx2 *bp)
 {
-   u32 *hw_stats = (u32 *) bp->stats_blk;
-   u32 *temp_stats = (u32 *) bp->temp_stats_blk;
+   u32 *hw_stats;
+   u32 *temp_stats;
int i;

+   spin_lock(>stats64_lock);
+   hw_stats = (u32 *) bp->stats_blk;
+   temp_stats = (u32 *) bp->temp_stats_blk;
+
/* The 1st 10 counters are 64-bit counters */
for (i = 0; i < 20; i += 2) {
u32 hi;
@@ -6775,6 +6785,8 @@ bnx2_save_stats(struct bnx2 *bp)

for ( ; i < sizeof(struct statistics_block) / 4; i++)
temp_stats[i] += hw_stats[i];
+
+   spin_unlock(>stats64_lock);
 }

 #define GET_64BIT_NET_STATS64(ctr) \
@@ -6793,8 +6805,11 @@ bnx2_get_stats64(struct net_device *dev, struct 
rtnl_link_stats64 *net_stats)
 {
struct bnx2 *bp = netdev_priv(dev);

-   if (bp->stats_blk == NULL)
+   spin_lock(>stats64_lock);
+   if (bp->stats_blk == NULL) {
+   spin_unlock(>stats64_lock);
return net_stats;
+   }

net_stats->rx_packets =
GET_64BIT_NET_STATS(stat_IfHCInUcastPkts) +
@@ -6858,6 +6873,7 @@ bnx2_get_stats64(struct net_device *dev, struct 
rtnl_link_stats64 *net_stats)
GET_32BIT_NET_STATS(stat_IfInMBUFDiscards) +
GET_32BIT_NET_STATS(stat_FwRxDrop);

+   spin_unlock(>stats64_lock);
return net_stats;
 }

@@ -7634,13 +7650,17 @@ bnx2_get_ethtool_stats(struct net_device *dev,
 {
struct bnx2 *bp = netdev_priv(dev);
int i;
-   u32 *hw_stats = (u32 *) bp->stats_blk;
-   u32 *temp_stats = (u32 *) bp->temp_stats_blk;
+   u32 *hw_stats;
+   u32 *temp_stats;
u8 *stats_len_arr = NULL;

+   spin_lock(>stats64_lock);
+   hw_stats = (u32 *) bp->stats_blk;
+   temp_stats = (u32 *) bp->temp_stats_blk;
+
if (hw_stats == NULL) {
memset(buf, 0, sizeof(u64) * BNX2_NUM_STATS);
-   return;
+   goto free_stats64_lock;
}

if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) ||
@@ -7673,6 +7693,9 @@ bnx2_get_ethtool_stats(struct net_device *dev,
 (((u64) *(temp_stats + offset)) << 32) +
  

Re: Fix ipOutNoRoutes counter error for TCP and UDP

2007-05-31 Thread weidong

Mr David

   Sorry to trouble many times, I will attention to this next time.

   I have made this patch again, and I tried, it can be patched to the recently
   kernel of linux-2.6.21.3. following is the patch, and I also attach this
   patch to the attachment.

   Signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -Nurp a/net/ipv4/datagram.c b/net/ipv4/datagram.c
--- a/net/ipv4/datagram.c   2007-06-01 13:22:39.0 +0800
+++ b/net/ipv4/datagram.c   2007-06-01 13:24:21.0 +0800
@@ -50,8 +50,12 @@ int ip4_datagram_connect(struct sock *sk
   RT_CONN_FLAGS(sk), oif,
   sk-sk_protocol,
   inet-sport, usin-sin_port, sk, 1);
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return err;
+   }
+
if ((rt-rt_flags  RTCF_BROADCAST)  !sock_flag(sk, SOCK_BROADCAST)) {
ip_rt_put(rt);
return -EACCES;
diff -Nurp a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
--- a/net/ipv4/tcp_ipv4.c   2007-06-01 13:22:39.0 +0800
+++ b/net/ipv4/tcp_ipv4.c   2007-06-01 13:25:07.0 +0800
@@ -192,8 +192,11 @@ int tcp_v4_connect(struct sock *sk, stru
   RT_CONN_FLAGS(sk), sk-sk_bound_dev_if,
   IPPROTO_TCP,
   inet-sport, usin-sin_port, sk, 1);
-   if (tmp  0)
+   if (tmp  0) {
+   if (tmp == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return tmp;
+   }

if (rt-rt_flags  (RTCF_MULTICAST | RTCF_BROADCAST)) {
ip_rt_put(rt);
diff -Nurp a/net/ipv4/udp.c b/net/ipv4/udp.c
--- a/net/ipv4/udp.c2007-06-01 13:22:39.0 +0800
+++ b/net/ipv4/udp.c2007-06-01 13:26:47.0 +0800
@@ -631,8 +631,11 @@ int udp_sendmsg(struct kiocb *iocb, stru
 .dport = dport } } };
security_sk_classify_flow(sk, fl);
err = ip_route_output_flow(rt, fl, sk, 1);
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
goto out;
+   }

err = -EACCES;
if ((rt-rt_flags  RTCF_BROADCAST) 



David Miller wrote:

From: Wei Dong [EMAIL PROTECTED]
Date: Thu, 31 May 2007 09:16:50 +0800


Hi Mr. David
  I have modified my patch according to you advice. I think -
EHOSTUNREACH is only for input path. In output path, we can just
simply check-ENETUNREACH  (^_^), the patch is shown in the end of this mail.

 I send this patch to you several weeks ago, but you have not replied to me.
This patch is not correctly?


Your email client is still corrupting the patch, it changes
tab characters into spaces.  This makes the patch not apply.

Please do not send the same unusable patch so many times.
Instead, I recommend to email the patch to yourself, and
you try to apply it, in the same way someone else receiving
your patch posting would.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


diff -Nurp a/net/ipv4/datagram.c b/net/ipv4/datagram.c
--- a/net/ipv4/datagram.c	2007-06-01 13:22:39.0 +0800
+++ b/net/ipv4/datagram.c	2007-06-01 13:24:21.0 +0800
@@ -50,8 +50,12 @@ int ip4_datagram_connect(struct sock *sk
 			   RT_CONN_FLAGS(sk), oif,
 			   sk-sk_protocol,
 			   inet-sport, usin-sin_port, sk, 1);
-	if (err)
+	if (err) {
+		if (err == -ENETUNREACH)
+			IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES); 
 		return err;
+	}
+
 	if ((rt-rt_flags  RTCF_BROADCAST)  !sock_flag(sk, SOCK_BROADCAST)) {
 		ip_rt_put(rt);
 		return -EACCES;
diff -Nurp a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
--- a/net/ipv4/tcp_ipv4.c	2007-06-01 13:22:39.0 +0800
+++ b/net/ipv4/tcp_ipv4.c	2007-06-01 13:25:07.0 +0800
@@ -192,8 +192,11 @@ int tcp_v4_connect(struct sock *sk, stru
 			   RT_CONN_FLAGS(sk), sk-sk_bound_dev_if,
 			   IPPROTO_TCP,
 			   inet-sport, usin-sin_port, sk, 1);
-	if (tmp  0)
+	if (tmp  0) {
+		if (tmp == -ENETUNREACH)
+			IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
 		return tmp;
+	}
 
 	if (rt-rt_flags  (RTCF_MULTICAST | RTCF_BROADCAST)) {
 		ip_rt_put(rt);
diff -Nurp a/net/ipv4/udp.c b/net/ipv4/udp.c
--- a/net/ipv4/udp.c	2007-06-01 13:22:39.0 +0800
+++ b/net/ipv4/udp.c	2007-06-01 13:26:47.0 +0800
@@ -631,8 +631,11 @@ int udp_sendmsg(struct kiocb *iocb, stru
 		 .dport = dport } } };
 		security_sk_classify_flow(sk, fl);
 		err = ip_route_output_flow(rt, fl, sk, 1);
-		if (err)
+		if (err) {
+			if (err == -ENETUNREACH)
+IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
 			goto out;

Re: Fix ipOutNoRoutes counter error for TCP and UDP

2007-04-25 Thread weidong
Hi Mr. David
I have modified my patch according to you advice. I think -
EHOSTUNREACH
is only for input path. In output path, we can just simply check
-ENETUNREACH  (^_^), the patch is shown in the end of this mail.

BTW: my E-mail has been changed to [EMAIL PROTECTED]


  Function need to fix:
  tcp_v4_connect(); ip4_datagram_connect(); udp_sendmsg();


  I think we need to make these checks more carefully.
 
  Route lookup can fail for several reasons other than
  no route being available.  Two examples are:
 
  1) Out of memory error while creating route
  2) IPSEC disallows communication to that flow ID
 
  As a result, we'll probably best limiting the counter
  increment when the error is either -EHOSTUNREACH or
  -ENETUNREACH.


signed-off-by: Wei Dong [EMAIL PROTECTED]


diff -ruNp a/net/ipv4/datagram.c b/net/ipv4/datagram.c
--- a/net/ipv4/datagram.c   2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/datagram.c   2007-04-25 15:21:42.0 +0800
@@ -50,8 +50,12 @@ int ip4_datagram_connect(struct sock *sk
   RT_CONN_FLAGS(sk), oif,
   sk-sk_protocol,
   inet-sport, usin-sin_port, sk);
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return err;
+   }
+
if ((rt-rt_flags  RTCF_BROADCAST)  !sock_flag(sk, SOCK_BROADCAST)) {
ip_rt_put(rt);
return -EACCES;
diff -ruNp a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
--- a/net/ipv4/tcp_ipv4.c   2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/tcp_ipv4.c   2007-04-25 15:21:42.0 +0800
@@ -192,8 +192,11 @@ int tcp_v4_connect(struct sock *sk, stru
   RT_CONN_FLAGS(sk), sk-sk_bound_dev_if,
   IPPROTO_TCP,
   inet-sport, usin-sin_port, sk);
-   if (tmp  0)
+   if (tmp  0) {
+   if (tmp == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return tmp;
+   }

if (rt-rt_flags  (RTCF_MULTICAST | RTCF_BROADCAST)) {
ip_rt_put(rt);
diff -ruNp a/net/ipv4/udp.c b/net/ipv4/udp.c
--- a/net/ipv4/udp.c2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/udp.c2007-04-25 15:21:42.0 +0800
@@ -630,8 +630,11 @@ int udp_sendmsg(struct kiocb *iocb, stru
 .dport = dport } } };
security_sk_classify_flow(sk, fl);
err = ip_route_output_flow(rt, fl, sk, 
!(msg-msg_flagsMSG_DONTWAIT));
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
goto out;
+   }

err = -EACCES;
if ((rt-rt_flags  RTCF_BROADCAST) 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fix ipOutNoRoutes counter error for TCP and UDP

2007-04-25 Thread weidong

Hi Mr. David
I have modified my patch according to you advice. I think -
EHOSTUNREACH is only for input path. In output path, we can just
simply check-ENETUNREACH  (^_^), the patch is shown in the end of this mail.

BTW: my E-mail has been changed to [EMAIL PROTECTED]



  Function need to fix:
  tcp_v4_connect(); ip4_datagram_connect(); udp_sendmsg();





  I think we need to make these checks more carefully.
 
  Route lookup can fail for several reasons other than
  no route being available.  Two examples are:
 
  1) Out of memory error while creating route
  2) IPSEC disallows communication to that flow ID
 
  As a result, we'll probably best limiting the counter
  increment when the error is either -EHOSTUNREACH or
  -ENETUNREACH.




signed-off-by: Wei Dong [EMAIL PROTECTED]


diff -ruNp a/net/ipv4/datagram.c b/net/ipv4/datagram.c
--- a/net/ipv4/datagram.c   2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/datagram.c   2007-04-25 15:21:42.0 +0800
@@ -50,8 +50,12 @@ int ip4_datagram_connect(struct sock *sk
   RT_CONN_FLAGS(sk), oif,
   sk-sk_protocol,
   inet-sport, usin-sin_port, sk);
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return err;
+   }
+
if ((rt-rt_flags  RTCF_BROADCAST)  !sock_flag(sk, SOCK_BROADCAST)) {
ip_rt_put(rt);
return -EACCES;
diff -ruNp a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
--- a/net/ipv4/tcp_ipv4.c   2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/tcp_ipv4.c   2007-04-25 15:21:42.0 +0800
@@ -192,8 +192,11 @@ int tcp_v4_connect(struct sock *sk, stru
   RT_CONN_FLAGS(sk), sk-sk_bound_dev_if,
   IPPROTO_TCP,
   inet-sport, usin-sin_port, sk);
-   if (tmp  0)
+   if (tmp  0) {
+   if (tmp == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return tmp;
+   }

if (rt-rt_flags  (RTCF_MULTICAST | RTCF_BROADCAST)) {
ip_rt_put(rt);
diff -ruNp a/net/ipv4/udp.c b/net/ipv4/udp.c
--- a/net/ipv4/udp.c2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/udp.c2007-04-25 15:21:42.0 +0800
@@ -630,8 +630,11 @@ int udp_sendmsg(struct kiocb *iocb, stru
 .dport = dport } } };
security_sk_classify_flow(sk, fl);
err = ip_route_output_flow(rt, fl, sk,
!(msg-msg_flagsMSG_DONTWAIT));
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
goto out;
+   }

err = -EACCES;
if ((rt-rt_flags  RTCF_BROADCAST) 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fix ipOutNoRoutes counter error for TCP and UDP

2007-03-27 Thread weidong
Hi Mr. David 
I have modified my patch according to you advice. I think -
EHOSTUNREACH 
is only for input path. In output path, we can just simply check 
-ENETUNREACH  (^_^), the patch is shown in the end of this mail. 

BTW: my E-mail has been changed to [EMAIL PROTECTED] 

 Function need to fix: 
 tcp_v4_connect(); ip4_datagram_connect(); udp_sendmsg(); 

 I think we need to make these checks more carefully. 
 
 Route lookup can fail for several reasons other than 
 no route being available.  Two examples are: 
 
 1) Out of memory error while creating route 
 2) IPSEC disallows communication to that flow ID 
 
 As a result, we'll probably best limiting the counter 
 increment when the error is either -EHOSTUNREACH or 
 -ENETUNREACH. 

signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -ruNp old/net/ipv4/datagram.c new/net/ipv4/datagram.c
--- old/net/ipv4/datagram.c 2007-03-27 18:15:56.0 +0800
+++ new/net/ipv4/datagram.c 2007-03-27 18:23:58.0 +0800
@@ -50,8 +50,12 @@ int ip4_datagram_connect(struct sock *sk
   RT_CONN_FLAGS(sk), oif,
   sk-sk_protocol,
   inet-sport, usin-sin_port, sk);
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return err;
+   }
+
if ((rt-rt_flags  RTCF_BROADCAST)  !sock_flag(sk, SOCK_BROADCAST)) {
ip_rt_put(rt);
return -EACCES;
diff -ruNp old/net/ipv4/tcp_ipv4.c new/net/ipv4/tcp_ipv4.c
--- old/net/ipv4/tcp_ipv4.c 2007-03-27 18:15:56.0 +0800
+++ new/net/ipv4/tcp_ipv4.c 2007-03-27 18:28:38.0 +0800
@@ -192,8 +192,11 @@ int tcp_v4_connect(struct sock *sk, stru
   RT_CONN_FLAGS(sk), sk-sk_bound_dev_if,
   IPPROTO_TCP,
   inet-sport, usin-sin_port, sk);
-   if (tmp  0)
+   if (tmp  0) {
+   if (tmp == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return tmp;
+   }
 
if (rt-rt_flags  (RTCF_MULTICAST | RTCF_BROADCAST)) {
ip_rt_put(rt);
diff -ruNp old/net/ipv4/udp.c new/net/ipv4/udp.c
--- old/net/ipv4/udp.c  2007-03-27 18:15:56.0 +0800
+++ new/net/ipv4/udp.c  2007-03-27 18:26:47.0 +0800
@@ -630,8 +630,11 @@ int udp_sendmsg(struct kiocb *iocb, stru
 .dport = dport } } };
security_sk_classify_flow(sk, fl);
err = ip_route_output_flow(rt, fl, sk, 
!(msg-msg_flagsMSG_DONTWAIT));
-   if (err)
+   if (err) {
+   if (err == -ENETUNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
goto out;
+   }
 
err = -EACCES;
if ((rt-rt_flags  RTCF_BROADCAST) 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fix bugs in Whether sock accept queue is full checking

2007-02-14 Thread weidong
Hi, All
when I use linux TCP socket, and find there is a bug in function
sk_acceptq_is_full(). 

When a new SYN comes, TCP module first checks its validation. If valid,
send SYN,ACK to the client and add the sock to the syn hash table. Next
time if received the valid ACK for SYN,ACK from the client. server will
accept this connection and increase the sk-sk_ack_backlog -- which is
done in function tcp_check_req().We check wether acceptq is full in
function tcp_v4_syn_recv_sock().

Consider an example:

 After listen(sockfd, 1) system call, sk-sk_max_ack_backlog is set to
1. As we know, sk-sk_ack_backlog is initialized to 0. Assuming accept()
system call is not invoked now.

1. 1st connection comes. invoke sk_acceptq_is_full(). sk-
sk_ack_backlog=0 sk-sk_max_ack_backlog=1, function return 0 accept
this connection. Increase the sk-sk_ack_backlog
2. 2nd connection comes. invoke sk_acceptq_is_full(). sk-
sk_ack_backlog=1 sk-sk_max_ack_backlog=1, function return 0 accept
this connection. Increase the sk-sk_ack_backlog
3. 3rd connection comes. invoke sk_acceptq_is_full(). sk-
sk_ack_backlog=2 sk-sk_max_ack_backlog=1, function return 1. Refuse
this connection.

I think it has bugs. after listen system call. sk-sk_max_ack_backlog=1
but now it can accept 2 connections. The following patch can fix this
problem.

signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -ruN old/include/net/sock.h new/include/net/sock.h
--- old/include/net/sock.h  2007-02-03 08:38:21.0 -0500
+++ new/include/net/sock.h  2007-02-03 08:38:30.0 -0500
@@ -426,7 +426,7 @@
 
 static inline int sk_acceptq_is_full(struct sock *sk)
 {
-   return sk-sk_ack_backlog  sk-sk_max_ack_backlog;
+   return sk-sk_ack_backlog = sk-sk_max_ack_backlog;
 }
 
 /*




-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fix ipOutNoRoutes counter error for TCP and UDP

2007-02-13 Thread weidong
Hi, All
  When I tested Linux-2.6.20 and found that counter ipOutNoRoutes can
not increase correctly.
  The criteria is RFC2011

ipOutNoRoutes OBJECT-TYPE
SYNTAX  Counter32
MAX-ACCESS  read-only
STATUS  current
DESCRIPTION
The number of IP datagrams discarded because no route could
be found to transmit them to their destination.  Note that
this counter includes any packets counted in ipForwDatagrams
which meet this `no-route' criterion.  Note that this
includes any datagrams which a host cannot route because all
of its default routers are down.
::= { ip 12 }

In current Linux TCP/IP stack, maybe we should not increase this counter
in input path, but only increase it in output path due to the TCP/IP
stack performance.

Now in output path, when TCP client tries to connect to an unreachable
server(net unreachable, so no route can be found), this counter has no
increment. When we use UDP sending UDP datagram to an net unreachable
address, this counter also has no increment.

Function need to fix:
tcp_v4_connect(); ip4_datagram_connect(); udp_sendmsg();

The following patch can fix the problems mentioned above

BR 
Wei Dong

signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -ruN old/net/ipv4/datagram.c new/net/ipv4/datagram.c
--- old/net/ipv4/datagram.c 2007-02-02 12:28:54.0 -0500
+++ new/net/ipv4/datagram.c 2007-02-02 12:29:01.0 -0500
@@ -50,8 +50,10 @@
   RT_CONN_FLAGS(sk), oif,
   sk-sk_protocol,
   inet-sport, usin-sin_port, sk);
-   if (err)
+   if (err) {
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return err;
+   }
if ((rt-rt_flags  RTCF_BROADCAST)  !sock_flag(sk, SOCK_BROADCAST)) {
ip_rt_put(rt);
return -EACCES;
diff -ruN old/net/ipv4/tcp_ipv4.c new/net/ipv4/tcp_ipv4.c
--- old/net/ipv4/tcp_ipv4.c 2007-02-02 12:28:54.0 -0500
+++ new/net/ipv4/tcp_ipv4.c 2007-02-02 12:29:01.0 -0500
@@ -192,8 +192,10 @@
   RT_CONN_FLAGS(sk), sk-sk_bound_dev_if,
   IPPROTO_TCP,
   inet-sport, usin-sin_port, sk);
-   if (tmp  0)
+   if (tmp  0) {
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
return tmp;
+   }
 
if (rt-rt_flags  (RTCF_MULTICAST | RTCF_BROADCAST)) {
ip_rt_put(rt);
diff -ruN old/net/ipv4/udp.c new/net/ipv4/udp.c
--- old/net/ipv4/udp.c  2007-02-02 12:28:54.0 -0500
+++ new/net/ipv4/udp.c  2007-02-02 12:29:01.0 -0500
@@ -630,8 +630,10 @@
 .dport = dport } } };
security_sk_classify_flow(sk, fl);
err = ip_route_output_flow(rt, fl, sk, 
!(msg-msg_flagsMSG_DONTWAIT));
-   if (err)
+   if (err) {
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
goto out;
+   }
 
err = -EACCES;
if ((rt-rt_flags  RTCF_BROADCAST) 




-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Patch][IPv6] Fix wrong routing mechanism for Link Local IPv6 packets

2007-01-30 Thread weidong
Hello, Mr yoshfuji:
Thanks for your reply.
The following is the figure.
||
|| 
| Router | 
|||---|
|   |--|   |--|  || other |
|   | eth0 |---| eth1 |--||network|
|   |--|  ||--|  ||---|
|--|--||-|
   |  v| 
   | fe80::20c:29ff:fe24:fa0a  | 
   |   | 
   |   | 
   |   | 
   |   | 
   |   | 
   |   v 
   |   fe80::20c:29ff:fe24:fa14 
   |   
   | 
|--|--| 
|  |--|   | 
|  | eth0 |---|---fe80::200:ff:fe00:100 
|  |--|   | 
| | 
|Host | 
|-| 

Host eth0: fe80::200:ff:fe00:100
Router eth0: fe80::20c:29ff:fe24:fa0a
Router eth1: fe80::20c:29ff:fe24:fa14

We ping6 from host's eth0 to Router's eth1. Echo Request's src addr =
fe80::200:ff:fe00:100, dst addr = fe80::20c:29ff:fe24:fa14. And Kernel
just send ICMPv6 redirect packet and then forward the Echo Request to
router's eth0. If we run tcpdump on Host eth0, we can receive the ICMPv6
Redirect packet. And if we send NA which advertises
fe80::20c:29ff:fe24:fa14 MAC address(this is very easy for v6eval tool),
we also can receive the forwarded Echo Request(src:fe80::200:ff:fe00:100
dst is fe80::20c:29ff:fe24:fa14). 

I dived into the kernel, and found that maybe function rt6_score_route()
has problems. In rt6_score_route(), if rt6_check_dev() return 0, and the
dst ipv6 addr is link local addr, rt6_socre_route() return -1 directly.
I think this is not correct, we should return -1 only if the entry is in
the route cache, and the dst addr is link local addr. Only entries in
cache may select wrong IPv6 Link Local NIC for a link local dst addr.
because they are copied from static IPv6 fib table entries.

  Hello, Mr yoshfuji
  Take ping6 for example. Asumming there is a router which has 2 NICs.
  eth0 on router has ipv6 addr fe80::20c:29ff:fe24:fa0a, eth1 on router has
  ipv6 addr fe80::20c:29ff:fe24:fa14. Also there is a host connected to
  router's eth0, and the host's ipv6 addr is fe80::200:ff:fe00:100. We ping6
 :
 
 I still need more precise figure.
 
 Please draw complete box for the 2-3 boxes (pinger, router (and the
 destination)), link(s) and interfaces.
 
 +-+
 |Router   |
 +---+-+---+
 eth0| |eth1
 | |
 eth0|
 +---+-+
 |Host1|
 +-+
 
 Host1  eth0: fe80:
 Router eth0: fe80:
 Router eth1: fe80:...
 
 Or, something like that
 
 I think you may use other tool such as tgif etc.
 
 --yoshfuji 
 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Patch][IPv6] Fix wrong routing mechanism for Link Local IPv6 packets

2007-01-30 Thread weidong
Hello, Mr yoshfuji
Thanks for your patch. I think maybe we checking oif first is better,
and WARN_ON in function rt6_score_route().
The following is my patch

Signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -ruN old/net/ipv6/route.c new/net/ipv6/route.c
--- old/net/ipv6/route.c2007-02-16 13:46:33.0 -0500
+++ new/net/ipv6/route.c2007-02-16 13:44:27.0 -0500
@@ -309,12 +309,21 @@
 static int inline rt6_check_dev(struct rt6_info *rt, int oif)
 {
struct net_device *dev = rt-rt6i_dev;
-   if (!oif || dev-ifindex == oif)
+   int ret = 0;
+
+   if (!oif)
return 2;
+
if ((dev-flags  IFF_LOOPBACK) 
rt-rt6i_idev  rt-rt6i_idev-dev-ifindex == oif)
-   return 1;
-   return 0;
+   ret = 1;
+   else
+   return 0;
+
+   if (dev-ifindex == oif)
+   return 2;
+
+   return ret;
 }
 
 static int inline rt6_check_neigh(struct rt6_info *rt)
@@ -339,8 +348,11 @@
int m, n;

m = rt6_check_dev(rt, oif);
-   if (!m  (strict  RT6_LOOKUP_F_IFACE))
+   if (!m  (strict  RT6_LOOKUP_F_IFACE)) {
+   WARN_ON(rt-rt6i_dev-flags  IFF_LOOPBACK);
return -1;
+   }
+
 #ifdef CONFIG_IPV6_ROUTER_PREF
m |= IPV6_DECODE_PREF(IPV6_EXTRACT_PREF(rt-rt6i_flags))  2;
 #endif


On Wed, 2007-01-31 at 13:00 +0900, Wei Dong wrote:
 In article [EMAIL PROTECTED] (at Wed, 21 Feb 2007 
 09:57:12 -0500), weidong [EMAIL PROTECTED] says:
 
  The following is the figure.
 :
  Host eth0: fe80::200:ff:fe00:100
  Router eth0: fe80::20c:29ff:fe24:fa0a
  Router eth1: fe80::20c:29ff:fe24:fa14
 
  Other network
   |
   | eth1
  +++
  | Router  |
  +++
   | eth0
   |
   | eth0
  +++
  |  Host   |
  +-+
 
  We ping6 from host's eth0 to Router's eth1. Echo Request's src addr =
  fe80::200:ff:fe00:100, dst addr = fe80::20c:29ff:fe24:fa14. And Kernel
  just send ICMPv6 redirect packet and then forward the Echo Request to
  router's eth0. If we run tcpdump on Host eth0, we can receive the ICMPv6
  Redirect packet. And if we send NA which advertises
 
 This is correct, and intended behavior.
 
  fe80::20c:29ff:fe24:fa14 MAC address(this is very easy for v6eval tool),
  we also can receive the forwarded Echo Request(src:fe80::200:ff:fe00:100
  dst is fe80::20c:29ff:fe24:fa14).
 
 Well, this is known issue, actually.
 
 While this cannot happen in normal operation, we should NOT accept
 such traffic. :-)
 
 Here is the (untested) fix.
 
 -
 [IPV6] ROUTE: Do not accept traffic for link-local address on different 
 interface.
 
 Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
 
 --- 
 diff --git a/net/ipv6/route.c b/net/ipv6/route.c
 index 5f0043c..a7468e0 100644
 --- a/net/ipv6/route.c
 +++ b/net/ipv6/route.c
 @@ -311,12 +311,19 @@ static inline void rt6_probe(struct rt6_info *rt)
  static int inline rt6_check_dev(struct rt6_info *rt, int oif)
  {
   struct net_device *dev = rt-rt6i_dev;
 + int ret = 0;
 +
 + if (dev-flags  IFF_LOOPBACK) {
 + if (!WARN_ON(rt-rt6i_idev == NULL) 
 + rt-rt6i_idev-dev-ifindex == oif)
 + ret = 1;
 + else
 + return 0;
 + }
   if (!oif || dev-ifindex == oif)
   return 2;
 - if ((dev-flags  IFF_LOOPBACK) 
 - rt-rt6i_idev  rt-rt6i_idev-dev-ifindex == oif)
 - return 1;
 - return 0;
 +
 + return ret;
  }
 
  static int inline rt6_check_neigh(struct rt6_info *rt)
 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch][IPv6] Fix wrong routing mechanism for Link Local IPv6 packets

2007-01-29 Thread weidong
Hi, all
When I tested linux-2.6.19.2, and found maybe there're some packet
routing bugs in linux kernel. My test topology is shown as the
following:

 eth0: fe80::20c:29ff:fe24:fa0a
| eth1: fe80::20c:29ff:fe24:fa14 
|  |

|  |
|  |
| LAN1 |LAN2
|  |
  -
^
|  
 Send Echo Request(src addr = fe80::200:ff:fe00:100)

Now we send Echo Request to eth1(ipv6 src addr is fe80::200:ff:fe00:100,
ipv6 dst addr is fe80::20c:29ff:fe24:fa14) on LAN1. Linux will send
ICMPv6 Redirect Packet to us. And the TargetAddress =
fe80::20c:29ff:fe24:fa14, DestinationAddress = fe80::20c:29ff:fe24:fa14.
Obviously, Linux considers that fe80::200:ff:fe00:100 and
fe80::20c:29ff:fe24:fa14 is on the SAME link(LAN0). In fact, kernel
invoke function ip6_foward().

When Linux decides whether or not making use of a rt6_info entry, it
will match the rt-rt6i_idev and rt-rt6_dev. This is done in function
rt6_check_dev(). If nothing matched, rt6_check_dev() return 0. Then
function rt6_score_route() will check whether the matched ipv6 addr
(fe80::20c:29ff:fe24:fa14 in our example) is a link local ipv6 address.
If it is a link local address, and rt-rt6i_idev rt-rt6_dev match
failed -- rt6_check_dev() return 0. Function rt6_score_route() return -1
directly. I think here is a problem. 

When kernel match eth1 addr with rt6_info entries, it will lookup in
local_table first. In rt6_check_dev() matching rt-rt6i_idev rt-
rt6_dev will fail. The reason is oif = 2 , rt-rt6i_idev-dev-ifindex
is 3 and rt-rt6i_dev-ifindex is 1. I think even this match failed,
rt6_score_route() should not return -1, but return 0. And I think check
for RT6_LOOKUP_F_IFACE flag isn't needed here. Checking for this flag is
only needed in route cache when matching dst addr.

Due to the reason mentioned above, all entries in local table matching
dst addr fe80::20c:29ff:fe24:fa14 are failed. And then kernel matches
main table. fe80::/64 entry in main table will match successfully. Later
ip6_rt_copy() will copy the function pointer rt-u.dst.input. Obviously
rt-u.dst.input in main table is ip6_forward().

The following is my patch.

signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -ruN old/net/ipv6/route.c new/net/ipv6/route.c
--- old/net/ipv6/route.c2007-01-10 14:10:37.0 -0500
+++ new/net/ipv6/route.c2007-01-17 18:24:51.336774016 -0500
@@ -343,7 +343,7 @@
int m, n;

m = rt6_check_dev(rt, oif);
-   if (!m  (strict  RT6_LOOKUP_F_IFACE))
+   if (!m  (rt-rt6i_flags  RTF_CACHE)  (strict  RT6_LOOKUP_F_IFACE))
return -1;
 #ifdef CONFIG_IPV6_ROUTER_PREF
m |= IPV6_DECODE_PREF(IPV6_EXTRACT_PREF(rt-rt6i_flags))  2;



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch]Fix wrong routing mechanism for Link Local IPv6 packets

2007-01-17 Thread weidong
Hi, all
When I tested linux-2.6.19.2, and found maybe there're a packet routing
bug in linux kernel. My test topology is shown as the following:

 eth0: fe80::20c:29ff:fe24:fa0a
| eth1: fe80::20c:29ff:fe24:fa14 
|  |

|  |
|  |
| LAN1 |LAN2
|  |
  -
^
|  
 Send Echo Request(src addr = fe80::200:ff:fe00:100)

Now we send Echo Request to eth1(ipv6 src addr is fe80::200:ff:fe00:100,
ipv6 dst addr is fe80::20c:29ff:fe24:fa14) on LAN1. Linux will send
ICMPv6 Redirect Packet to us. And the TargetAddress =
fe80::20c:29ff:fe24:fa14, DestinationAddress = fe80::20c:29ff:fe24:fa14.
Obviously, Linux considers that fe80::200:ff:fe00:100 and
fe80::20c:29ff:fe24:fa14 is on the SAME link(LAN1). In fact, kernel
invoke function ip6_foward().

When Linux decides whether or not making use of a rt6_info entry, it
will match the rt-rt6i_idev and rt-rt6_dev. This is done in function
rt6_check_dev(). If nothing matched, rt6_check_dev() return 0. Then
function rt6_score_route() will check whether the matched ipv6 addr
(fe80::20c:29ff:fe24:fa14 in our example) is a link local ipv6 address.
If it is a link local address, and rt-rt6i_idev rt-rt6_dev match
failed -- rt6_check_dev() return 0. Function rt6_score_route() return -1
directly. I think here is a problem. 

When kernel match eth1 addr with rt6_info entries, it will lookup in
local_table first. In rt6_check_dev() matching rt-rt6i_idev rt-
rt6_dev will fail. The reason is oif = 2 , rt-rt6i_idev-dev-ifindex
is 3 and rt-rt6i_dev-ifindex is 1. I think even this match failed,
rt6_score_route() should not return -1, but return 0. And I think check
for RT6_LOOKUP_F_IFACE flag isn't needed here. Checking for this flag is
only needed in route cache when matching dst addr.

Due to the reason mentioned above, all entries in local table matching
dst addr fe80::20c:29ff:fe24:fa14 are failed. And then kernel matches
main table. fe80::/64 entry in main table will match successfully. Later
ip6_rt_copy() will copy the function pointer rt-u.dst.input. Obviously
rt-u.dst.input in main table is ip6_forward().

The following is my patch.

signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -ruN old/net/ipv6/route.c new/net/ipv6/route.c
--- old/net/ipv6/route.c2007-01-10 14:10:37.0 -0500
+++ new/net/ipv6/route.c2007-01-17 18:24:51.336774016 -0500
@@ -343,7 +343,7 @@
int m, n;

m = rt6_check_dev(rt, oif);
-   if (!m  (strict  RT6_LOOKUP_F_IFACE))
+   if (!m  (rt-rt6i_flags  RTF_CACHE)  (strict  RT6_LOOKUP_F_IFACE))
return -1;
 #ifdef CONFIG_IPV6_ROUTER_PREF
m |= IPV6_DECODE_PREF(IPV6_EXTRACT_PREF(rt-rt6i_flags))  2;


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch] Fix SNMPv2 ipOutNoRoutes counter error

2006-12-03 Thread weidong
Hi All:
When I tested linux kernel 2.6.18.3, and find that kernel statistics
about IPSTATS_MIB_OUTNOROUTES which exsits in file /proc/net/snmp
doesn't increase correctly.  The criteria conform to RFC2011:

  ipOutNoRoutes OBJECT-TYPE
SYNTAX  Counter32
MAX-ACCESS  read-only
STATUS  current
DESCRIPTION
The number of IP datagrams discarded because no route could
be found to transmit them to their destination.  Note that
this counter includes any packets counted in ipForwDatagrams
which meet this `no-route' criterion.  Note that this
includes any datagrams which a host cannot route because all
of its default routers are down.
::= { ip 12 }

When a host received an IP packet, but the destination address is not
this host. The kernel just discards the IP packet but with no increment
for this counter.

When a router received an IP packet that this router can't forward due
to no route found. Kernel just simply invoke ip_error(), and send ICMP
packet. Also do nothing for this counter.

Signed-off-by: Wei Dong [EMAIL PROTECTED]

diff -ruN old/net/ipv4/icmp.c new/net/ipv4/icmp.c
--- old/net/ipv4/icmp.c 2006-09-20 00:42:06.0 -0300
+++ new/net/ipv4/icmp.c 2006-12-04 11:46:39.0 -0300
@@ -580,6 +580,9 @@
icmp_param.data_len = room;
icmp_param.head_len = sizeof(struct icmphdr);
 
+   if (code == ICMP_NET_UNREACH)
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
+
icmp_push_reply(icmp_param, ipc, rt);
 ende:
ip_rt_put(rt);
diff -ruN old/net/ipv4/ip_input.c new/net/ipv4/ip_input.c
--- old/net/ipv4/ip_input.c 2006-09-20 00:42:06.0 -0300
+++ new/net/ipv4/ip_input.c 2006-11-29 11:15:52.0 -0300
@@ -340,8 +340,10 @@
int err = ip_route_input(skb, iph-daddr, iph-saddr, iph-tos,
 skb-dev);
if (unlikely(err)) {
-   if (err == -EHOSTUNREACH)
+   if (err == -EHOSTUNREACH) {
IP_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS);
+   IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
+   }
goto drop; 
}
}


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


SNMPv2 ipFragFails counter error

2006-08-31 Thread weidong
Hi, All
  When I tested Linux kernel 2.6.17.7 about statistics
ipFragFails,found that this counter couldn't increase correctly. The
criteria is RFC2011:
RFC2011
  ipFragFails OBJECT-TYPE
SYNTAX  Counter32
MAX-ACCESS  read-only
STATUS  current
DESCRIPTION
The number of IP datagrams that have been discarded because
they needed to be fragmented at this entity but could not
be, e.g., because their Don't Fragment flag was set.
::= { ip 18 }

When I send big IP packet to a router with DF bit set to 1 which need to
be fragmented, and router just sends an ICMP error message
ICMP_FRAG_NEEDED but no increments for this counter(in the function
ip_fragment).

signed-off-by:Wei Dong [EMAIL PROTECTED]

diff -ruN old/net/ipv4/ip_output.c new/net/ipv4/ip_output.c
--- old/net/ipv4/ip_output.c2006-07-25 11:36:01.0 +0800
+++ new/net/ipv4/ip_output.c2006-08-30 14:30:49.0 +0800
@@ -441,6 +441,7 @@
iph = skb-nh.iph;
 
if (unlikely((iph-frag_off  htons(IP_DF))  !skb-local_df)) {
+   IP_INC_STATS(IPSTATS_MIB_FRAGFAILS);
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
  htonl(dst_mtu(rt-u.dst)));
kfree_skb(skb);



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 0/1]SNMPv2 ipv6IfStatsInHdrErrors counter error

2006-07-31 Thread weidong
Hi, All
  When I tested Linux kernel 2.6.17.7 about statistics
ipv6IfStatsInHdrErrors, found that this counter couldn't increase
correctly. The criteria is RFC2465:
  ipv6IfStatsInHdrErrors OBJECT-TYPE
  SYNTAX Counter3
  MAX-ACCESS read-only
  STATUS current
  DESCRIPTION
 The number of input datagrams discarded due to
 errors in their IPv6 headers, including version
 number mismatch, other format errors, hop count
 exceeded, errors discovered in processing their
 IPv6 options, etc.
  ::= { ipv6IfStatsEntry 2 }

When I send TTL=0 and TTL=1 a packet to a router which need to be
forwarded, router just sends an ICMPv6 message to tell the sender that
TIME_EXCEED and HOPLIMITS, but no increments for this counter(in the
function ip6_forward).

The following is the patch for this issue. 

diff -ruN old/net/ipv6/ip6_output.c new/net/ipv6/ip6_output.c
--- old/net/ipv6/ip6_output.c   2006-07-25 11:36:01.0 +0800
+++ new/net/ipv6/ip6_output.c   2006-07-31 16:16:13.0 +0800
@@ -356,6 +356,7 @@
skb-dev = dst-dev;
icmpv6_send(skb, ICMPV6_TIME_EXCEED, ICMPV6_EXC_HOPLIMIT,
0, skb-dev);
+   IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS);
 
kfree_skb(skb);
return -ETIMEDOUT;

signed-off-by:Wei Dong [EMAIL PROTECTED]

Regards
Wei Dong

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 1/1]SNMPv2 ipv6IfStatsOutFragCreates counter error

2006-07-31 Thread weidong
Hi, All
  When I tested linux kernel 2.6.71.7 about statistics
ipv6IfStatsOutFragCreates, and found that it couldn't increase
correctly. The criteria is RFC 2465:

  ipv6IfStatsOutFragCreates OBJECT-TYPE
  SYNTAX  Counter32
  MAX-ACCESS  read-only
  STATUS  current
  DESCRIPTION
 The number of output datagram fragments that have
 been generated as a result of fragmentation at
 this output interface.
  ::= { ipv6IfStatsEntry 15 }

I think there are two issues in Linux kernel. 
1st:
RFC2465 specifies the counter is The number of output datagram
fragments I think increasing this counter after output a fragment
successfully is better. And it should not be increased even though a
fragment is created but failed to output.

2nd:
If we send a big ICMP/ICMPv6 echo request to a host, and receive
ICMP/ICMPv6 echo reply consisted of some fragments. As we know that in
Linux kernel first fragmentation occurs in ICMP layer(maybe saying
transport layer is better), but this is not the real
fragmentation,just do some pre-fragment -- allocate space for date,
and form a frag_list, etc. The real fragmentation happens in IP layer
-- set offset and MF flag and so on. So I think in fast path for
ip_fragment/ip6_fragment, if we send a fragment which pre-fragment by
upper layer we should also increase ipv6IfStatsOutFragCreates.

The following is the patch for the issues mentioned above:

diff -ruN old/net/ipv4/ip_output.c new/net/ipv4/ip_output.c
--- old/net/ipv4/ip_output.c2006-07-25 11:36:01.0 +0800
+++ new/net/ipv4/ip_output.c2006-07-31 16:24:57.0 +0800
@@ -527,6 +527,8 @@
 
err = output(skb);
 
+   if (!err)
+   IP_INC_STATS(IPSTATS_MIB_FRAGCREATES);
if (err || !frag)
break;
 
@@ -650,9 +652,6 @@
/*
 *  Put this fragment into the sending queue.
 */
-
-   IP_INC_STATS(IPSTATS_MIB_FRAGCREATES);
-
iph-tot_len = htons(len + hlen);
 
ip_send_check(iph);
@@ -660,6 +659,8 @@
err = output(skb2);
if (err)
goto fail;
+
+   IP_INC_STATS(IPSTATS_MIB_FRAGCREATES);
}
kfree_skb(skb);
IP_INC_STATS(IPSTATS_MIB_FRAGOKS);
diff -ruN old/net/ipv6/ip6_output.c new/net/ipv6/ip6_output.c
--- old/net/ipv6/ip6_output.c   2006-07-25 11:36:01.0 +0800
+++ new/net/ipv6/ip6_output.c   2006-07-31 16:24:21.0 +0800
@@ -593,6 +593,9 @@
}

err = output(skb);
+   if(!err)
+   IP6_INC_STATS(IPSTATS_MIB_FRAGCREATES);
+   
if (err || !frag)
break;
 
@@ -704,12 +707,11 @@
/*
 *  Put this fragment into the sending queue.
 */
-
-   IP6_INC_STATS(IPSTATS_MIB_FRAGCREATES);
-
err = output(frag);
if (err)
goto fail;
+
+   IP6_INC_STATS(IPSTATS_MIB_FRAGCREATES);
}
kfree_skb(skb);
IP6_INC_STATS(IPSTATS_MIB_FRAGOKS);

signed-off-by: Wei Dong [EMAIL PROTECTED]
Regards
Wei Dong

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html