from:"Corinna Vinschen"

Re: [PATCH net] r8169: fix mac address change

2018-07-03 Thread Corinna Vinschen

On Jul  2 22:49, Heiner Kallweit wrote:
> Network core refuses to change mac address because flag
> IFF_LIVE_ADDR_CHANGE isn't set. Set this missing flag.
> 
> Fixes: 1f7aa2bc268e ("r8169: simplify rtl_set_mac_address")
> Reported-by: Corinna Vinschen 
> Signed-off-by: Heiner Kallweit 
> ---
>  drivers/net/ethernet/realtek/r8169.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c 
> b/drivers/net/ethernet/realtek/r8169.c
> index f80ac894..a390db27 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -7607,6 +7607,7 @@ static int rtl_init_one(struct pci_dev *pdev, const 
> struct pci_device_id *ent)
>   NETIF_F_HW_VLAN_CTAG_RX;
>   dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
>   NETIF_F_HIGHDMA;
> + dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
>  
>   tp->cp_cmd |= RxChkSum | RxVlan;
>  
> -- 
> 2.18.0

Tested-by: Corinna Vinschen 

This patch allows to change the MAC any time, just as before
1f7aa2bc268e, and thus also fixes the problem reported in
https://www.spinics.net/lists/netdev/msg510926.html


Thanks,
Corinna

Regression introduced by "r8169: simplify rtl_set_mac_address"

2018-07-02 Thread Corinna Vinschen

Hi,

the patch 1f7aa2bc268e, "r8169: simplify rtl_set_mac_address",
introduced a regression found by trying to team a r8169 NIC.

Try the following (assuming the r8169 NIC is eth0):

$ nmcli con add type team con-name team0 ifname nm-team config \
  '{"runner": {"name": "lacp"}, "link_watch": {"name": "ethtool"}}' \
  ipv4.method disable ipv6.method ignore
$ nmcli con add type ethernet ifname eth0 con-name team0.0 master nm-team
$ nmcli con up team0.0
$ teamdctl nm-team port present eth0
command call failed (No such device)

Bisecting turned up commit 1f7aa2bc268e, "r8169: simplify
rtl_set_mac_address" as the culprit.  Reverting this patch fixes
the issue and the teamdctl call succeeds.

The reason is apparently the usage of eth_mac_addr here.  eth_mac_addr
calls eth_prepare_mac_addr_change which checks for IFF_LIVE_ADDR_CHANGE.
Debugging shows this flag not being set on r8169, thus
eth_prepare_mac_addr_change returns -EBUSY (no idea why userspace claims
"No such device", rather than "Device or resource busy", but that's not
the point here).

Note that other devices like igb, don't call eth_mac_addr either, but
rather call memcpy by themselves to copy the new MAC, just as the
original r8169 code did, too.  Consequentially this problem is not
present on igb.

I suggest to revert this change in the first place, but I wonder if
we're not just missing to set IFF_LIVE_ADDR_CHANGE in a lot of drivers.


Thanks,
Corinna

Re: [Intel-wired-lan] [PATCH] igb: add VF trust infrastructure

2018-01-24 Thread Corinna Vinschen

On Jan 24 03:41, Brown, Aaron F wrote:
> > From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> > Behalf Of Alexander Duyck
> > Sent: Monday, January 22, 2018 2:58 PM
> > To: intel-wired-lan <intel-wired-...@lists.osuosl.org>; Netdev
> > <netdev@vger.kernel.org>
> > Cc: Corinna Vinschen <vinsc...@redhat.com>
> > Subject: Re: [Intel-wired-lan] [PATCH] igb: add VF trust infrastructure
> > 
> > Hi Corinna,
> > 
> > I've looked over the patch and didn't see any issues. My understanding
> > is that Jeff has pulled it into his tree and it should be going
> > through testing shortly.
> 
> Indeed it is / was ;)

Thanks guys!


Corinna

> 
> > 
> > Thanks.
> > 
> > - Alex
> > 
> > On Mon, Jan 22, 2018 at 12:54 AM, Corinna Vinschen
> > <vinsc...@redhat.com> wrote:
> > > Hi,
> > >
> > > Could somebody please review this patch?
> > >
> > >
> > > Thanks,
> > > Corinna
> > >
> > >
> > > On Jan 17 11:53, Corinna Vinschen wrote:
> > >> * Add a per-VF value to know if a VF is trusted, by default don't
> > >>   trust VFs.
> > >>
> > >> * Implement netdev op to trust VFs (igb_ndo_set_vf_trust) and add
> > >>   trust status to ndo_get_vf_config output.
> > >>
> > >> * Allow a trusted VF to change MAC and MAC filters even if MAC
> > >>   has been administratively set.
> > >>
> > >> Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
> > >> ---
> > >>  drivers/net/ethernet/intel/igb/igb.h  |  1 +
> > >>  drivers/net/ethernet/intel/igb/igb_main.c | 30
> > +++---
> > >>  2 files changed, 28 insertions(+), 3 deletions(-)
> 
> Tested-by: Aaron Brown <aaron.f.br...@intel.com>
> ___
> Intel-wired-lan mailing list
> intel-wired-...@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

Re: [PATCH] igb: add VF trust infrastructure

2018-01-22 Thread Corinna Vinschen

Hi,

Could somebody please review this patch?


Thanks,
Corinna


On Jan 17 11:53, Corinna Vinschen wrote:
> * Add a per-VF value to know if a VF is trusted, by default don't
>   trust VFs.
> 
> * Implement netdev op to trust VFs (igb_ndo_set_vf_trust) and add
>   trust status to ndo_get_vf_config output.
> 
> * Allow a trusted VF to change MAC and MAC filters even if MAC
>   has been administratively set.
> 
> Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
> ---
>  drivers/net/ethernet/intel/igb/igb.h  |  1 +
>  drivers/net/ethernet/intel/igb/igb_main.c | 30 +++---
>  2 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb.h 
> b/drivers/net/ethernet/intel/igb/igb.h
> index 1c6b8d9176a8..55d6f17d5799 100644
> --- a/drivers/net/ethernet/intel/igb/igb.h
> +++ b/drivers/net/ethernet/intel/igb/igb.h
> @@ -109,6 +109,7 @@ struct vf_data_storage {
>   u16 pf_qos;
>   u16 tx_rate;
>   bool spoofchk_enabled;
> + bool trusted;
>  };
>  
>  /* Number of unicast MAC filters reserved for the PF in the RAR registers */
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
> b/drivers/net/ethernet/intel/igb/igb_main.c
> index 749fb1f720b4..5cbec6cf2b2a 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -190,6 +190,8 @@ static int igb_ndo_set_vf_vlan(struct net_device *netdev,
>  static int igb_ndo_set_vf_bw(struct net_device *, int, int, int);
>  static int igb_ndo_set_vf_spoofchk(struct net_device *netdev, int vf,
>  bool setting);
> +static int igb_ndo_set_vf_trust(struct net_device *netdev, int vf,
> + bool setting);
>  static int igb_ndo_get_vf_config(struct net_device *netdev, int vf,
>struct ifla_vf_info *ivi);
>  static void igb_check_vf_rate_limit(struct igb_adapter *);
> @@ -2527,6 +2529,7 @@ static const struct net_device_ops igb_netdev_ops = {
>   .ndo_set_vf_vlan= igb_ndo_set_vf_vlan,
>   .ndo_set_vf_rate= igb_ndo_set_vf_bw,
>   .ndo_set_vf_spoofchk= igb_ndo_set_vf_spoofchk,
> + .ndo_set_vf_trust   = igb_ndo_set_vf_trust,
>   .ndo_get_vf_config  = igb_ndo_get_vf_config,
>  #ifdef CONFIG_NET_POLL_CONTROLLER
>   .ndo_poll_controller= igb_netpoll,
> @@ -6383,6 +6386,9 @@ static int igb_vf_configure(struct igb_adapter 
> *adapter, int vf)
>   /* By default spoof check is enabled for all VFs */
>   adapter->vf_data[vf].spoofchk_enabled = true;
>  
> + /* By default VFs are not trusted */
> + adapter->vf_data[vf].trusted = false;
> +
>   return 0;
>  }
>  
> @@ -6940,13 +6946,13 @@ static int igb_set_vf_mac_filter(struct igb_adapter 
> *adapter, const int vf,
>   }
>   break;
>   case E1000_VF_MAC_FILTER_ADD:
> - if (vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) {
> + if ((vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) &&
> + !vf_data->trusted) {
>   dev_warn(>dev,
>"VF %d requested MAC filter but is 
> administratively denied\n",
>vf);
>   return -EINVAL;
>   }
> -
>   if (!is_valid_ether_addr(addr)) {
>   dev_warn(>dev,
>"VF %d attempted to set invalid MAC filter\n",
> @@ -6998,7 +7004,8 @@ static int igb_set_vf_mac_addr(struct igb_adapter 
> *adapter, u32 *msg, int vf)
>   int ret = 0;
>  
>   if (!info) {
> - if (vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) {
> + if ((vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) &&
> + !vf_data->trusted) {
>   dev_warn(>dev,
>"VF %d attempted to override administratively 
> set MAC address\nReload the VF driver to resume operations\n",
>vf);
> @@ -8934,6 +8941,22 @@ static int igb_ndo_set_vf_spoofchk(struct net_device 
> *netdev, int vf,
>   return 0;
>  }
>  
> +static int igb_ndo_set_vf_trust(struct net_device *netdev, int vf, bool 
> setting)
> +{
> + struct igb_adapter *adapter = netdev_priv(netdev);
> +
> + if (vf >= adapter->vfs_allocated_count)
> + return -EINVAL;
> + if (adapter->vf_data[vf].trusted == setting)
> + return 0;
> +
> + adapter->vf_data[vf].trusted = setting;
> +
> + dev_info(>pdev->dev, &

[PATCH] igb: add VF trust infrastructure

2018-01-17 Thread Corinna Vinschen

* Add a per-VF value to know if a VF is trusted, by default don't
  trust VFs.

* Implement netdev op to trust VFs (igb_ndo_set_vf_trust) and add
  trust status to ndo_get_vf_config output.

* Allow a trusted VF to change MAC and MAC filters even if MAC
  has been administratively set.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb.h  |  1 +
 drivers/net/ethernet/intel/igb/igb_main.c | 30 +++---
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb.h 
b/drivers/net/ethernet/intel/igb/igb.h
index 1c6b8d9176a8..55d6f17d5799 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -109,6 +109,7 @@ struct vf_data_storage {
u16 pf_qos;
u16 tx_rate;
bool spoofchk_enabled;
+   bool trusted;
 };
 
 /* Number of unicast MAC filters reserved for the PF in the RAR registers */
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 749fb1f720b4..5cbec6cf2b2a 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -190,6 +190,8 @@ static int igb_ndo_set_vf_vlan(struct net_device *netdev,
 static int igb_ndo_set_vf_bw(struct net_device *, int, int, int);
 static int igb_ndo_set_vf_spoofchk(struct net_device *netdev, int vf,
   bool setting);
+static int igb_ndo_set_vf_trust(struct net_device *netdev, int vf,
+   bool setting);
 static int igb_ndo_get_vf_config(struct net_device *netdev, int vf,
 struct ifla_vf_info *ivi);
 static void igb_check_vf_rate_limit(struct igb_adapter *);
@@ -2527,6 +2529,7 @@ static const struct net_device_ops igb_netdev_ops = {
.ndo_set_vf_vlan= igb_ndo_set_vf_vlan,
.ndo_set_vf_rate= igb_ndo_set_vf_bw,
.ndo_set_vf_spoofchk= igb_ndo_set_vf_spoofchk,
+   .ndo_set_vf_trust   = igb_ndo_set_vf_trust,
.ndo_get_vf_config  = igb_ndo_get_vf_config,
 #ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller= igb_netpoll,
@@ -6383,6 +6386,9 @@ static int igb_vf_configure(struct igb_adapter *adapter, 
int vf)
/* By default spoof check is enabled for all VFs */
adapter->vf_data[vf].spoofchk_enabled = true;
 
+   /* By default VFs are not trusted */
+   adapter->vf_data[vf].trusted = false;
+
return 0;
 }
 
@@ -6940,13 +6946,13 @@ static int igb_set_vf_mac_filter(struct igb_adapter 
*adapter, const int vf,
}
break;
case E1000_VF_MAC_FILTER_ADD:
-   if (vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) {
+   if ((vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) &&
+   !vf_data->trusted) {
dev_warn(>dev,
 "VF %d requested MAC filter but is 
administratively denied\n",
 vf);
return -EINVAL;
}
-
if (!is_valid_ether_addr(addr)) {
dev_warn(>dev,
 "VF %d attempted to set invalid MAC filter\n",
@@ -6998,7 +7004,8 @@ static int igb_set_vf_mac_addr(struct igb_adapter 
*adapter, u32 *msg, int vf)
int ret = 0;
 
if (!info) {
-   if (vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) {
+   if ((vf_data->flags & IGB_VF_FLAG_PF_SET_MAC) &&
+   !vf_data->trusted) {
dev_warn(>dev,
 "VF %d attempted to override administratively 
set MAC address\nReload the VF driver to resume operations\n",
 vf);
@@ -8934,6 +8941,22 @@ static int igb_ndo_set_vf_spoofchk(struct net_device 
*netdev, int vf,
return 0;
 }
 
+static int igb_ndo_set_vf_trust(struct net_device *netdev, int vf, bool 
setting)
+{
+   struct igb_adapter *adapter = netdev_priv(netdev);
+
+   if (vf >= adapter->vfs_allocated_count)
+   return -EINVAL;
+   if (adapter->vf_data[vf].trusted == setting)
+   return 0;
+
+   adapter->vf_data[vf].trusted = setting;
+
+   dev_info(>pdev->dev, "VF %u is %strusted\n",
+vf, setting ? "" : "not ");
+   return 0;
+}
+
 static int igb_ndo_get_vf_config(struct net_device *netdev,
 int vf, struct ifla_vf_info *ivi)
 {
@@ -8947,6 +8970,7 @@ static int igb_ndo_get_vf_config(struct net_device 
*netdev,
ivi->vlan = adapter->vf_data[vf].pf_vlan;
ivi->qos = adapter->vf_data[vf].pf_qos;
ivi->spoofchk = adapter->vf_data[vf].spoofchk_enabled;
+   ivi->trusted = adapter->vf_data[vf].trusted;
return 0;
 }
 
-- 
2.14.3

[PATCH v3] igb: Allow to remove administratively set MAC on VFs

2017-04-10 Thread Corinna Vinschen

  Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
  for use by a virtual machine (either using vfio device assignment or
  macvtap passthru mode), it saves the current MAC address and vlan tag
  so that it can reset them to their original value when the guest is
  done.  Libvirt can't leave the VF MAC set to the value used by the
  now-defunct guest since it may be started again later using a
  different VF, but it certainly shouldn't just pick any random value,
  either. So it saves the state of everything prior to using the VF, and
  resets it to that.

  The igb driver initializes the MAC addresses of all VFs to
  00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
  netlink message, also visible in the list of VFs in the output of "ip
  link show"). But when libvirt attempts to restore the MAC address back
  to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
  responds with "Invalid argument".

  Forbidding a reset back to the original value leaves the VF MAC at the
  value set for the now-defunct virtual machine. Especially on a system
  with NetworkManager enabled, this has very bad consequences, since
  NetworkManager forces all interfacess to be IFF_UP all the time - if
  the same virtual machine is restarted using a different VF (or even on
  a different host), there will be multiple interfaces watching for
  traffic with the same MAC address.

  To allow libvirt to revert to the original state, we need a way to
  remove the administrative set MAC on a VF, to allow normal host
  operation again, and to reset/overwrite the VF MAC via VF netdev.

  This patch implements the outlined scenario by allowing to set the
  VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
  igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
  so it's possible to reset the VF MAC back to the original value via
  the VF netdev.

  Note: Recent patches to libvirt allow for a workaround if the NIC
  isn't capable of resetting the administrative MAC back to all 0, but
  in theory the NIC should allow resetting the MAC in the first place.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 42 +++
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 1cf74aa..dbb4e3c 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -8358,7 +8358,8 @@ static void igb_rar_set_index(struct igb_adapter 
*adapter, u32 index)
 
/* Indicate to hardware the Address is Valid. */
if (adapter->mac_table[index].state & IGB_MAC_STATE_IN_USE) {
-   rar_high |= E1000_RAH_AV;
+   if (is_valid_ether_addr(addr))
+   rar_high |= E1000_RAH_AV;
 
if (hw->mac.type == e1000_82575)
rar_high |= E1000_RAH_POOL_1 *
@@ -8396,17 +8397,36 @@ static int igb_set_vf_mac(struct igb_adapter *adapter,
 static int igb_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 {
struct igb_adapter *adapter = netdev_priv(netdev);
-   if (!is_valid_ether_addr(mac) || (vf >= adapter->vfs_allocated_count))
+
+   if (vf >= adapter->vfs_allocated_count)
+   return -EINVAL;
+
+   /* Setting the VF MAC to 0 reverts the IGB_VF_FLAG_PF_SET_MAC
+* flag and allows to overwrite the MAC via VF netdev.  This
+* is necessary to allow libvirt a way to restore the original
+* MAC after unbinding vfio-pci and reloading igbvf after shutting
+* down a VM.
+*/
+   if (is_zero_ether_addr(mac)) {
+   adapter->vf_data[vf].flags &= ~IGB_VF_FLAG_PF_SET_MAC;
+   dev_info(>pdev->dev,
+"remove administratively set MAC on VF %d\n",
+vf);
+   } else if (is_valid_ether_addr (mac)) {
+   adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
+   dev_info(>pdev->dev, "setting MAC %pM on VF %d\n",
+mac, vf);
+   dev_info(>pdev->dev,
+"Reload the VF driver to make this change effective.");
+   /* Generate additional warning if PF is down */
+   if (test_bit(__IGB_DOWN, >state)) {
+   dev_warn(>pdev->dev,
+"The VF MAC address has been set, but the PF 
device is not up.\n");
+   dev_warn(>pdev->dev,
+"Bring the PF device up before attempting to 
use the VF device.\n");
+   }
+   } else {
return -EINVAL;
-   adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
-   dev_

Re: [Intel-wired-lan] [PATCH v2] igb: Allow to remove administratively set MAC on VFs

2017-04-10 Thread Corinna Vinschen

On Apr  7 12:06, Jeff Kirsher wrote:
> On Wed, 2017-04-05 at 15:46 +0200, Corinna Vinschen wrote:
> >   Before libvirt modifies the MAC address and vlan tag for an SRIOV
> > VF
> >   for use by a virtual machine (either using vfio device assignment
> > or
> >   macvtap passthru mode), it saves the current MAC address and vlan
> > tag
> >   so that it can reset them to their original value when the guest is
> >   done.  Libvirt can't leave the VF MAC set to the value used by the
> >   now-defunct guest since it may be started again later using a
> >   different VF, but it certainly shouldn't just pick any random
> > value,
> >   either. So it saves the state of everything prior to using the VF,
> > and
> >   resets it to that.
> > 
> >   The igb driver initializes the MAC addresses of all VFs to
> >   00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
> >   netlink message, also visible in the list of VFs in the output of
> > "ip
> >   link show"). But when libvirt attempts to restore the MAC address
> > back
> >   to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the
> > kernel
> >   responds with "Invalid argument".
> > 
> >   Forbidding a reset back to the original value leaves the VF MAC at
> > the
> >   value set for the now-defunct virtual machine. Especially on a
> > system
> >   with NetworkManager enabled, this has very bad consequences, since
> >   NetworkManager forces all interfacess to be IFF_UP all the time -
> > if
> >   the same virtual machine is restarted using a different VF (or even
> > on
> >   a different host), there will be multiple interfaces watching for
> >   traffic with the same MAC address.
> > 
> >   To allow libvirt to revert to the original state, we need a way to
> >   remove the administrative set MAC on a VF, to allow normal host
> >   operation again, and to reset/overwrite the VF MAC via VF netdev.
> > 
> >   This patch implements the outlined scenario by allowing to set the
> >   VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
> >   igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
> >   so it's possible to reset the VF MAC back to the original value via
> >   the VF netdev.
> > 
> >   Note: Recent patches to libvirt allow for a workaround if the NIC
> >   isn't capable of resetting the administrative MAC back to all 0,
> > but
> >   in theory the NIC should allow resetting the MAC in the first
> > place.
> > 
> > Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
> > ---
> >  drivers/net/ethernet/intel/igb/igb_main.c | 42
> > +++
> >  1 file changed, 31 insertions(+), 11 deletions(-)
> 
> This patch does not apply (not even close).  Please make sure to base
> you patch off my dev-queue branch of my next-queue tree on kernel.org.

Right, I applied the patch against net-next.  Actually, only the
first hunk didn't apply to dev-queue because of the change from
igb_rar_set_qsel to igb_rar_set_index.

I'll send a v3 agains dev-queue in a bit.


Corinna


signature.asc
Description: PGP signature

[PATCH v2] igb: Allow to remove administratively set MAC on VFs

2017-04-05 Thread Corinna Vinschen

  Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
  for use by a virtual machine (either using vfio device assignment or
  macvtap passthru mode), it saves the current MAC address and vlan tag
  so that it can reset them to their original value when the guest is
  done.  Libvirt can't leave the VF MAC set to the value used by the
  now-defunct guest since it may be started again later using a
  different VF, but it certainly shouldn't just pick any random value,
  either. So it saves the state of everything prior to using the VF, and
  resets it to that.

  The igb driver initializes the MAC addresses of all VFs to
  00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
  netlink message, also visible in the list of VFs in the output of "ip
  link show"). But when libvirt attempts to restore the MAC address back
  to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
  responds with "Invalid argument".

  Forbidding a reset back to the original value leaves the VF MAC at the
  value set for the now-defunct virtual machine. Especially on a system
  with NetworkManager enabled, this has very bad consequences, since
  NetworkManager forces all interfacess to be IFF_UP all the time - if
  the same virtual machine is restarted using a different VF (or even on
  a different host), there will be multiple interfaces watching for
  traffic with the same MAC address.

  To allow libvirt to revert to the original state, we need a way to
  remove the administrative set MAC on a VF, to allow normal host
  operation again, and to reset/overwrite the VF MAC via VF netdev.

  This patch implements the outlined scenario by allowing to set the
  VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
  igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
  so it's possible to reset the VF MAC back to the original value via
  the VF netdev.

  Note: Recent patches to libvirt allow for a workaround if the NIC
  isn't capable of resetting the administrative MAC back to all 0, but
  in theory the NIC should allow resetting the MAC in the first place.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 42 +++
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 26a821f..c7673a2 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -8093,7 +8093,8 @@ static void igb_rar_set_qsel(struct igb_adapter *adapter, 
u8 *addr, u32 index,
rar_high = le16_to_cpup((__le16 *)(addr + 4));
 
/* Indicate to hardware the Address is Valid. */
-   rar_high |= E1000_RAH_AV;
+   if (is_valid_ether_addr(addr))
+   rar_high |= E1000_RAH_AV;
 
if (hw->mac.type == e1000_82575)
rar_high |= E1000_RAH_POOL_1 * qsel;
@@ -8125,17 +8126,36 @@ static int igb_set_vf_mac(struct igb_adapter *adapter,
 static int igb_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 {
struct igb_adapter *adapter = netdev_priv(netdev);
-   if (!is_valid_ether_addr(mac) || (vf >= adapter->vfs_allocated_count))
+
+   if (vf >= adapter->vfs_allocated_count)
+   return -EINVAL;
+
+   /* Setting the VF MAC to 0 reverts the IGB_VF_FLAG_PF_SET_MAC
+* flag and allows to overwrite the MAC via VF netdev.  This
+* is necessary to allow libvirt a way to restore the original
+* MAC after unbinding vfio-pci and reloading igbvf after shutting
+* down a VM.
+*/
+   if (is_zero_ether_addr(mac)) {
+   adapter->vf_data[vf].flags &= ~IGB_VF_FLAG_PF_SET_MAC;
+   dev_info(>pdev->dev,
+"remove administratively set MAC on VF %d\n",
+vf);
+   } else if (is_valid_ether_addr (mac)) {
+   adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
+   dev_info(>pdev->dev, "setting MAC %pM on VF %d\n",
+mac, vf);
+   dev_info(>pdev->dev,
+"Reload the VF driver to make this change effective.");
+   /* Generate additional warning if PF is down */
+   if (test_bit(__IGB_DOWN, >state)) {
+   dev_warn(>pdev->dev,
+"The VF MAC address has been set, but the PF 
device is not up.\n");
+   dev_warn(>pdev->dev,
+"Bring the PF device up before attempting to 
use the VF device.\n");
+   }
+   } else {
return -EINVAL;
-   adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
-   dev_info(>pdev->dev, "setting MAC %pM on VF %d

Re: [Intel-wired-lan] [PATCH] igb: Allow to remove administratively set MAC on VFs

2017-04-05 Thread Corinna Vinschen

Hi Alex,

On Apr  4 10:33, Alexander Duyck wrote:
> On Tue, Apr 4, 2017 at 10:16 AM, Duyck, Alexander H
> <alexander.h.du...@intel.com> wrote:
> >> -Original Message-
> >> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On
> >> Behalf Of Corinna Vinschen
> >> Sent: Tuesday, April 4, 2017 8:11 AM
> >> To: intel-wired-...@lists.osuosl.org
> >> Cc: netdev@vger.kernel.org
> >> Subject: [Intel-wired-lan] [PATCH] igb: Allow to remove administratively 
> >> set MAC
> >> on VFs
> >> [...]
> 
> So I just realized there is one other minor issue I just found. In
> igb_rar_set_qsel you should probably add a check for
> "is_valid_ether_addr(addr)" before you set the E1000_RAH_AV bit. For
> the zeroed MAC address it should be cleared so that we aren't
> filtering on a MAC address of all 0's for the VF.
> 
> - Alex

I see your point, but I'm a bit reluctant to do that because
igb_vf_configure() calls igb_set_vf_mac() with addr set to the
zeroed MAC explicitely:


  eth_zero_addr(mac_addr);
  igb_set_vf_mac(adapter, vf, mac_addr);

So in this case the zero MAC is already treated as valid address
and the E1000_RAH_AV bit is set.  Is that just a bug?


Corinna


signature.asc
Description: PGP signature

[PATCH] igb: Allow to remove administratively set MAC on VFs

2017-04-04 Thread Corinna Vinschen

  Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
  for use by a virtual machine (either using vfio device assignment or
  macvtap passthru mode), it saves the current MAC address and vlan tag
  so that it can reset them to their original value when the guest is
  done.  Libvirt can't leave the VF MAC set to the value used by the
  now-defunct guest since it may be started again later using a
  different VF, but it certainly shouldn't just pick any random value,
  either. So it saves the state of everything prior to using the VF, and
  resets it to that.

  The igb driver initializes the MAC addresses of all VFs to
  00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
  netlink message, also visible in the list of VFs in the output of "ip
  link show"). But when libvirt attempts to restore the MAC address back
  to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
  responds with "Invalid argument".

  Forbidding a reset back to the original value leaves the VF MAC at the
  value set for the now-defunct virtual machine. Especially on a system
  with NetworkManager enabled, this has very bad consequences, since
  NetworkManager forces all interfacess to be IFF_UP all the time - if
  the same virtual machine is restarted using a different VF (or even on
  a different host), there will be multiple interfaces watching for
  traffic with the same MAC address.

  To allow libvirt to revert to the original state, we need a way to
  remove the administrative set MAC on a VF, to allow normal host
  operation again, and to reset/overwrite the VF MAC via VF netdev.

  This patch implements the outlined scenario by allowing to set the
  VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
  igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
  so it's possible to reset the VF MAC back to the original value via
  the VF netdev.

  Note: Recent patches to libvirt allow for a workaround if the NIC
  isn't capable of resetting the administrative MAC back to all 0, but
  in theory the NIC should allow resetting the MAC in the fisr place.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 26a821f..e7a61b1 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -8125,12 +8125,27 @@ static int igb_set_vf_mac(struct igb_adapter *adapter,
 static int igb_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 {
struct igb_adapter *adapter = netdev_priv(netdev);
-   if (!is_valid_ether_addr(mac) || (vf >= adapter->vfs_allocated_count))
+
+   if (vf >= adapter->vfs_allocated_count)
+   return -EINVAL;
+   /* Setting the VF MAC to 0 reverts the IGB_VF_FLAG_PF_SET_MAC
+  flag and allows to overwrite the MAC via VF netdev.  This
+  is necessary to allow libvirt a way to restore the original
+  MAC after unbinding vfio-pci and reloading igbvf after shutting
+  down a VM. */
+   if (is_zero_ether_addr(mac)) {
+   adapter->vf_data[vf].flags &= ~IGB_VF_FLAG_PF_SET_MAC;
+   dev_info(>pdev->dev,
+"remove administratively set MAC on VF %d\n",
+vf);
+   } else if (is_valid_ether_addr (mac)) {
+   adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
+   dev_info(>pdev->dev, "setting MAC %pM on VF %d\n",
+mac, vf);
+   dev_info(>pdev->dev,
+"Reload the VF driver to make this change effective.");
+   } else
return -EINVAL;
-   adapter->vf_data[vf].flags |= IGB_VF_FLAG_PF_SET_MAC;
-   dev_info(>pdev->dev, "setting MAC %pM on VF %d\n", mac, vf);
-   dev_info(>pdev->dev,
-"Reload the VF driver to make this change effective.");
if (test_bit(__IGB_DOWN, >state)) {
dev_warn(>pdev->dev,
 "The VF MAC address has been set, but the PF device is 
not up.\n");
-- 
2.9.3

Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

2016-11-10 Thread Corinna Vinschen

On Nov 10 05:48, Hisashi T Fujinaka wrote:
> On Thu, 10 Nov 2016, Corinna Vinschen wrote:
> > On Nov  8 11:33, Alexander Duyck wrote:
> ...
> > > The question I would have is what is reading the device when it is in
> > > this state.  The watchdog and any other functions that would read the
> > > device should be disabled.
> > > 
> > > One possibility could be a race between a call to igb_close and the
> > > igb_suspend function.  We have seen some of those pop up recently on
> > > ixgbe and it looks like igb has the same bug.  We should probably be
> > > using the rtnl_lock to guarantee that netif_device_detach and the call
> > > to __igb_close are completed before igb_close could possibly be called
> > > by the network stack.
> > 
> > Do you have a pointer to the related ixgbe patch, by any chance?
> ...
> Here's the initial patch for igb I have, but it's on hold awaiting more
> changes in ixgbe regarding AER.

Thanks a lot!


Corinna


signature.asc
Description: PGP signature

Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

2016-11-10 Thread Corinna Vinschen

On Nov  8 11:33, Alexander Duyck wrote:
> On Tue, Nov 8, 2016 at 10:37 AM, Corinna Vinschen <vinsc...@redhat.com> wrote:
> > On Nov  8 09:16, Hisashi T Fujinaka wrote:
> >> On Tue, 8 Nov 2016, Corinna Vinschen wrote:
> >> > On Nov  8 15:06, Cao jin wrote:
> >> > > When running as guest, under certain condition, it will oops as 
> >> > > following.
> >> > > writel() in igb_configure_tx_ring() results in oops, because 
> >> > > hw->hw_addr
> >> > > is NULL. While other register access won't oops kernel because they use
> >> > > wr32/rd32 which have a defense against NULL pointer.
> >> > > [...]
> >> >
> >> > Incidentally we're just looking for a solution to that problem too.
> >> > Do three patches to fix the same problem at rougly the same time already
> >> > qualify as freak accident?
> >> >
> >> > FTR, I attached my current patch, which I was planning to submit after
> >> > some external testing.
> >> >
> >> > However, all three patches have one thing in common:  They workaround
> >> > a somewhat dubious resetting of the hardware address to NULL in case
> >> > reading from a register failed.
> >> >
> >> > That makes me wonder if setting the hardware address to NULL in
> >> > rd32/igb_rd32 is really such a good idea.  It's performed in a function
> >> > which return value is *never* tested for validity in the calling
> >> > functions and leads to subsequent crashes since no tests for hw_addr ==
> >> > NULL are performed.
> >> >
> >> > Maybe commit 22a8b2915 should be reconsidered?  Isn't there some more
> >> > graceful way to handle the "surprise removal"?
> >>
> >> Answering this from my home account because, well, work is Outlook.
> >>
> >> "Reconsidering" would be great. In fact, revert if if you'd like. I'm
> >> uncertain that the surprise removal code actually works the way I
> >> thought previously and I think I took a lot of it out of my local code.
> >>
> >> Unfortuantely I don't have any equipment that I can use to reproduce
> >> surprise removal any longer so that means I wouldn't be able to test
> >> anything. I have to defer to you or Cao Jin.
> >
> > I'm not too keen to rip out a PCIe NIC under power from my locale
> > desktop machine, but I think an actual surprise removal is not the
> > problem.
> >
> > As described in my git log entry, the error condition in igb_rd32 can be
> > triggered during a suspend.  The HW has been put into a sleep state but
> > some register read requests are apparently not guarded against that
> > situation.  Reading a register in this state returns -1, thus a suspend
> > is erroneously triggering the "surprise removal" sequence.
> 
> The question I would have is what is reading the device when it is in
> this state.  The watchdog and any other functions that would read the
> device should be disabled.
> 
> One possibility could be a race between a call to igb_close and the
> igb_suspend function.  We have seen some of those pop up recently on
> ixgbe and it looks like igb has the same bug.  We should probably be
> using the rtnl_lock to guarantee that netif_device_detach and the call
> to __igb_close are completed before igb_close could possibly be called
> by the network stack.

Do you have a pointer to the related ixgbe patch, by any chance?

> > Here's a raw idea:
> >
> > - Note that device is suspended in e1000_hw struct.  Don't trigger
> >   error sequence in igb_rd32 if so (...and return a 0 value???)
> 
> The thing is that a suspended device should not be accessed at all.
> If we are accessing it while it is suspended then that is a bug.  If
> you could throw a WARN_ON call in igb_rd32 to capture where this is
> being triggered that might be useful.
> 
> > - Otherwise assume it's actually a surprise removal.  In theory that
> >   should somehow trigger a device removal sequence, kind of like
> >   calling igb_remove, no?
> 
> Well a read of the MMIO region while suspended is more of a surprise
> read since there shouldn't be anything going on.  We need to isolate
> where that read is coming from and fix it.

That would be ideal, but the problem couldn't be reproduced yet apart
from at a customer's customer site.  It's not clear yet if we can access
the machine for further testing.


Corinna


signature.asc
Description: PGP signature

Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

2016-11-08 Thread Corinna Vinschen

On Nov  8 09:16, Hisashi T Fujinaka wrote:
> On Tue, 8 Nov 2016, Corinna Vinschen wrote:
> > On Nov  8 15:06, Cao jin wrote:
> > > When running as guest, under certain condition, it will oops as following.
> > > writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
> > > is NULL. While other register access won't oops kernel because they use
> > > wr32/rd32 which have a defense against NULL pointer.
> > > [...]
> > 
> > Incidentally we're just looking for a solution to that problem too.
> > Do three patches to fix the same problem at rougly the same time already
> > qualify as freak accident?
> > 
> > FTR, I attached my current patch, which I was planning to submit after
> > some external testing.
> > 
> > However, all three patches have one thing in common:  They workaround
> > a somewhat dubious resetting of the hardware address to NULL in case
> > reading from a register failed.
> > 
> > That makes me wonder if setting the hardware address to NULL in
> > rd32/igb_rd32 is really such a good idea.  It's performed in a function
> > which return value is *never* tested for validity in the calling
> > functions and leads to subsequent crashes since no tests for hw_addr ==
> > NULL are performed.
> > 
> > Maybe commit 22a8b2915 should be reconsidered?  Isn't there some more
> > graceful way to handle the "surprise removal"?
> 
> Answering this from my home account because, well, work is Outlook.
> 
> "Reconsidering" would be great. In fact, revert if if you'd like. I'm
> uncertain that the surprise removal code actually works the way I
> thought previously and I think I took a lot of it out of my local code.
> 
> Unfortuantely I don't have any equipment that I can use to reproduce
> surprise removal any longer so that means I wouldn't be able to test
> anything. I have to defer to you or Cao Jin.

I'm not too keen to rip out a PCIe NIC under power from my locale
desktop machine, but I think an actual surprise removal is not the
problem.

As described in my git log entry, the error condition in igb_rd32 can be
triggered during a suspend.  The HW has been put into a sleep state but
some register read requests are apparently not guarded against that
situation.  Reading a register in this state returns -1, thus a suspend
is erroneously triggering the "surprise removal" sequence.

Here's a raw idea:

- Note that device is suspended in e1000_hw struct.  Don't trigger
  error sequence in igb_rd32 if so (...and return a 0 value???)

- Otherwise assume it's actually a surprise removal.  In theory that
  should somehow trigger a device removal sequence, kind of like
  calling igb_remove, no?


Thanks,
Corinna


signature.asc
Description: PGP signature

Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

2016-11-08 Thread Corinna Vinschen

On Nov  8 15:06, Cao jin wrote:
> When running as guest, under certain condition, it will oops as following.
> writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
> is NULL. While other register access won't oops kernel because they use
> wr32/rd32 which have a defense against NULL pointer.
> 
> [  141.225449] pcieport :00:1c.0: AER: Multiple Uncorrected (Fatal)
> error received: id=0101
> [  141.225523] igb :01:00.1: PCIe Bus Error:
> severity=Uncorrected (Fatal), type=Unaccessible,
> id=0101(Unregistered Agent ID)
> [  141.299442] igb :01:00.1: broadcast error_detected message
> [  141.300539] igb :01:00.0 enp1s0f0: PCIe link lost, device now
> detached
> [  141.351019] igb :01:00.1 enp1s0f1: PCIe link lost, device now
> detached
> [  143.465904] pcieport :00:1c.0: Root Port link has been reset
> [  143.465994] igb :01:00.1: broadcast slot_reset message
> [  143.466039] igb :01:00.0: enabling device ( -> 0002)
> [  144.389078] igb :01:00.1: enabling device ( -> 0002)
> [  145.312078] igb :01:00.1: broadcast resume message
> [  145.322211] BUG: unable to handle kernel paging request at
> 3818
> [  145.361275] IP: []
> igb_configure_tx_ring+0x14d/0x280 [igb]
> [  145.400048] PGD 0
> [  145.438007] Oops: 0002 [#1] SMP
> 
> A similiar issue & solution could be found at:
> http://patchwork.ozlabs.org/patch/689592/
> 
> Signed-off-by: Cao jin <caoj.f...@cn.fujitsu.com>
> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
> b/drivers/net/ethernet/intel/igb/igb_main.c
> index edc9a6a..3f240ac 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -3390,7 +3390,7 @@ void igb_configure_tx_ring(struct igb_adapter *adapter,
>tdba & 0xULL);
>   wr32(E1000_TDBAH(reg_idx), tdba >> 32);
>  
> - ring->tail = hw->hw_addr + E1000_TDT(reg_idx);
> + ring->tail = adapter->io_addr + E1000_TDT(reg_idx);
>   wr32(E1000_TDH(reg_idx), 0);
>   writel(0, ring->tail);
>  
> @@ -3729,7 +3729,7 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
>ring->count * sizeof(union e1000_adv_rx_desc));
>  
>   /* initialize head and tail */
> - ring->tail = hw->hw_addr + E1000_RDT(reg_idx);
> + ring->tail = adapter->io_addr + E1000_RDT(reg_idx);
>   wr32(E1000_RDH(reg_idx), 0);
>   writel(0, ring->tail);
>  
> -- 
> 2.1.0

Incidentally we're just looking for a solution to that problem too.
Do three patches to fix the same problem at rougly the same time already
qualify as freak accident?

FTR, I attached my current patch, which I was planning to submit after
some external testing.

However, all three patches have one thing in common:  They workaround
a somewhat dubious resetting of the hardware address to NULL in case
reading from a register failed.

That makes me wonder if setting the hardware address to NULL in
rd32/igb_rd32 is really such a good idea.  It's performed in a function
which return value is *never* tested for validity in the calling
functions and leads to subsequent crashes since no tests for hw_addr ==
NULL are performed.

Maybe commit 22a8b2915 should be reconsidered?  Isn't there some more
graceful way to handle the "surprise removal"?


Thanks,
Corinna
From c6e80c04a7f20bdf3bde490ff842bcc1c800bf2a Mon Sep 17 00:00:00 2001
From: Corinna Vinschen <vinsc...@redhat.com>
Date: Mon, 24 Oct 2016 16:44:55 +0200
Subject: [RHEL7.4 PATCH] igb_resume: Fix up hw_addr on resume

A suspend hanging for too long can trigger spurious reads.
The device is already suspended so igb_rd32 fails to read
and sets hw->hw_addr to NULL.  If we don't fix that here,
subsequent code will fail to write to HW registers and igb
crashes eventually in a writel call.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 8e96c35..d1f9d34 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7577,6 +7577,9 @@ static int igb_resume(struct device *dev)
pci_enable_wake(pdev, PCI_D3hot, 0);
pci_enable_wake(pdev, PCI_D3cold, 0);
 
+   if (E1000_REMOVED(hw->hw_addr) && adapter->io_addr)
+   hw->hw_addr = adapter->io_addr;
+
if (igb_init_interrupt_scheme(adapter, true)) {
dev_err(>dev, "Unable to allocate memory for queues\n");
return -ENOMEM;
-- 
2.5.5



signature.asc
Description: PGP signature

Re: [PATCH] igb: Fix VLAN tag stripping on Intel i350

2016-01-28 Thread Corinna Vinschen

Hi Jeff,

On Jan 27 11:25, Jeff Kirsher wrote:
> On Wed, 2016-01-27 at 14:28 +0100, Corinna Vinschen wrote:
> > Problem: When switching off VLAN offloading on an i350, the VLAN
> > interface gets unusable.  For testing, set up a VLAN on an i350
> > and some remote machine, e.g.:
> > [...]
> I tried applying your patch to my tree for review and validation, but
> due to patches already applied against the igb driver in my tree, your
> patch does not apply cleanly.
> 
> Can you please update your patch to apply cleanly to my tree?
> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue.git
> dev-queue
> 
> (next-queue tree and dev-queue branch)

The attached patch is against the dev-queue branch of your next-queue tree.

I retested again using an i350 as well as a 82580.  On the i350 I tested
additionally adding and removing VFs while offloading was switched on or
off, and it worked as desired.

I'm just a bit unsure about the correctness of the igb_set_vf_vlan_strip
call from igb_vf_reset.  I think it's the right thing to do since
resetting the VF requires to reevaluate the STRVLAN flag.  I'd feel
better if you could double check, though.

Thanks,
Corinna
From ab27415c3d3be2beb978316007f84fb54d12b619 Mon Sep 17 00:00:00 2001
From: Corinna Vinschen <vinsc...@redhat.com>
Date: Thu, 28 Jan 2016 13:31:11 +0100
Subject: [PATCH] igb: Fix VLAN tag stripping on Intel i350

Problem: When switching off VLAN offloading on an i350, the VLAN
interface gets unusable.  For testing, set up a VLAN on an i350
and some remote machine, e.g.:

  $ ip link add link eth0 name eth0.42 type vlan id 42
  $ ip addr add 192.168.42.1/24 dev eth0.42
  $ ip link set dev eth0.42 up

Offloading is switched on by default:

  $ ethtool -k eth0 | grep vlan-offload
  rx-vlan-offload: on
  tx-vlan-offload: on

  $ ping -c 3 -I eth0.42 192.168.42.2
  [...works as usual...]

Now switch off VLAN offloading and try again:

  $ ethtool -K eth0 rxvlan off
  Actual changes:
  rx-vlan-offload: off
  tx-vlan-offload: off [requested on]
  $ ping -c 3 -I eth0.42 192.168.42.2
  PING 192.168.42.2 (192.168.42.2) from 192.168.42.1 eth0.42: 56(84) bytes of da
ta.

  --- 192.168.42.2 ping statistics ---
  3 packets transmitted, 0 received, 100% packet loss, time 1999ms

I can only reproduce it on an i350, the above works fine on a 82580.

While inspecting the igb source, I came across the code in igb_set_vmolr
which sets the E1000_VMOLR_STRVLAN/E1000_DVMOLR_STRVLAN flags once and
for all, and in all of the igb code there's no other place where the
STRVLAN is set or cleared.  Thus, VLAN stripping is enabled in igb
unconditionally, independently of the offloading setting.

I compared that to the latest Intel igb-5.3.3.5 driver from
http://sourceforge.net/projects/e1000/ which in fact sets and clears the
STRVLAN flag independently from igb_set_vmolr in its own function
igb_set_vf_vlan_strip, depending on the vlan settings.

So I included the STRVLAN handling from the igb-5.3.3.5 driver into our
current igb driver and tested the above scenario again.  This time ping
still works after switching off VLAN offloading.

Tested on i350, with and without addtional VFs, as well as on 82580
successfully.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 41 ---
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 95d6042..c414e41 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3699,6 +3699,28 @@ static inline int igb_set_vf_rlpml(struct igb_adapter 
*adapter, int size,
return 0;
 }

+static inline void igb_set_vf_vlan_strip(struct igb_adapter *adapter,
+int vfn, bool enable)
+{
+   struct e1000_hw *hw = >hw;
+   u32 val, reg;
+
+   if (hw->mac.type < e1000_82576)
+   return;
+
+   if (hw->mac.type == e1000_i350)
+   reg = E1000_DVMOLR(vfn);
+   else
+   reg = E1000_VMOLR(vfn);
+
+   val = rd32(reg);
+   if (enable)
+   val |= E1000_VMOLR_STRVLAN;
+   else
+   val &= ~(E1000_VMOLR_STRVLAN);
+   wr32(reg, val);
+}
+
 static inline void igb_set_vmolr(struct igb_adapter *adapter,
 int vfn, bool aupe)
 {
@@ -3712,14 +3734,6 @@ static inline void igb_set_vmolr(struct igb_adapter 
*adapter,
return;

vmolr = rd32(E1000_VMOLR(vfn));
-   vmolr |= E1000_VMOLR_STRVLAN; /* Strip vlan tags */
-   if (hw->mac.type == e1000_i350) {
-   u32 dvmolr;
-
-   dvmolr = rd32(E1000_DVMOLR(vfn));
-   dvmolr |= E1000_DVMOLR_STRVLAN;
-   wr32(E1000_DVMOLR(vfn), dvmolr);
-   }
if (aupe)

[PATCH] igb: Fix VLAN tag stripping on Intel i350

2016-01-27 Thread Corinna Vinschen

Problem: When switching off VLAN offloading on an i350, the VLAN
interface gets unusable.  For testing, set up a VLAN on an i350
and some remote machine, e.g.:

  $ ip link add link eth0 name eth0.42 type vlan id 42
  $ ip addr add 192.168.42.1/24 dev eth0.42
  $ ip link set dev eth0.42 up

Offloading is switched on by default:

  $ ethtool -k eth0 | grep vlan-offload
  rx-vlan-offload: on
  tx-vlan-offload: on

  $ ping -c 3 -I eth0.42 192.168.42.2
  [...works as usual...]

Now switch off VLAN offloading and try again:

  $ ethtool -K eth0 rxvlan off
  Actual changes:
  rx-vlan-offload: off
  tx-vlan-offload: off [requested on]
  $ ping -c 3 -I eth0.42 192.168.42.2
  PING 192.168.42.2 (192.168.42.2) from 192.168.42.1 eth0.42: 56(84) bytes of 
data.

  --- 192.168.42.2 ping statistics ---
  3 packets transmitted, 0 received, 100% packet loss, time 1999ms

I can only reproduce it on an i350, the above works fine on a 82580.

While inspecting the igb source, I came across the code in igb_set_vmolr
which sets the E1000_VMOLR_STRVLAN/E1000_DVMOLR_STRVLAN flags once and
for all, and in all of the igb code there's no other place where the
STRVLAN is set or cleared.  Thus, VLAN stripping is enabled in igb
unconditionally, independently of the offloading setting.

I compared that to the latest Intel igb-5.3.3.5 driver from
http://sourceforge.net/projects/e1000/ which in fact sets and clears the
STRVLAN flag independently from igb_set_vmolr in its own function
igb_set_vf_vlan_strip, depending on the vlan settings.

So I included the STRVLAN handling from the igb-5.3.3.5 driver into our
current igb driver and tested the above scenario again.  This time ping
still works after switching off VLAN offloading.

Tested on i350, with and without addtional VFs, as well as on 82580
successfully.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 39 ---
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 31e5f39..afd3990 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3580,6 +3580,28 @@ static void igb_rlpml_set(struct igb_adapter *adapter)
wr32(E1000_RLPML, max_frame_size);
 }
 
+static inline void igb_set_vf_vlan_strip(struct igb_adapter *adapter,
+int vfn, bool enable)
+{
+   struct e1000_hw *hw = >hw;
+   u32 val, reg;
+
+   if (hw->mac.type < e1000_82576)
+   return;
+
+   if (hw->mac.type == e1000_i350)
+   reg = E1000_DVMOLR(vfn);
+   else
+   reg = E1000_VMOLR(vfn);
+
+   val = rd32(reg);
+   if (enable)
+   val |= E1000_VMOLR_STRVLAN;
+   else
+   val &= ~(E1000_VMOLR_STRVLAN);
+   wr32(reg, val);
+}
+
 static inline void igb_set_vmolr(struct igb_adapter *adapter,
 int vfn, bool aupe)
 {
@@ -3593,14 +3615,6 @@ static inline void igb_set_vmolr(struct igb_adapter 
*adapter,
return;
 
vmolr = rd32(E1000_VMOLR(vfn));
-   vmolr |= E1000_VMOLR_STRVLAN; /* Strip vlan tags */
-   if (hw->mac.type == e1000_i350) {
-   u32 dvmolr;
-
-   dvmolr = rd32(E1000_DVMOLR(vfn));
-   dvmolr |= E1000_DVMOLR_STRVLAN;
-   wr32(E1000_DVMOLR(vfn), dvmolr);
-   }
if (aupe)
vmolr |= E1000_VMOLR_AUPE; /* Accept untagged packets */
else
@@ -5937,6 +5951,7 @@ static int igb_ndo_set_vf_vlan(struct net_device *netdev,
goto out;
igb_set_vmvir(adapter, vlan | (qos << VLAN_PRIO_SHIFT), vf);
igb_set_vmolr(adapter, vf, !vlan);
+   igb_set_vf_vlan_strip(adapter, vf, true);
adapter->vf_data[vf].pf_vlan = vlan;
adapter->vf_data[vf].pf_qos = qos;
dev_info(>pdev->dev,
@@ -5952,6 +5967,7 @@ static int igb_ndo_set_vf_vlan(struct net_device *netdev,
 false, vf);
igb_set_vmvir(adapter, vlan, vf);
igb_set_vmolr(adapter, vf, true);
+   igb_set_vf_vlan_strip(adapter, vf, false);
adapter->vf_data[vf].pf_vlan = 0;
adapter->vf_data[vf].pf_qos = 0;
}
@@ -5986,6 +6002,11 @@ static int igb_set_vf_vlan(struct igb_adapter *adapter, 
u32 *msgbuf, u32 vf)
int vid = (msgbuf[1] & E1000_VLVF_VLANID_MASK);
int err = 0;
 
+   if (vid)
+   igb_set_vf_vlan_strip(adapter, vf, true);
+   else
+   igb_set_vf_vlan_strip(adapter, vf, false);
+
/* If in promiscuous mode we need to make sure the PF also has
 * the VLAN filter set.
 */
@@ -7202,6 +7223,8 @@ static void igb_vlan_mode(struct net_device *netd

Re: [PATCH v2] r8169: Don't claim WoL works if LanWake flag is not set

2015-12-11 Thread Corinna Vinschen

On Dec 11 01:06, Francois Romieu wrote:
> Corinna Vinschen <vinsc...@redhat.com> :
> [...]
> > It's still a bit weird.  On the machines I tested this on, if I disable
> > LanWake and shutdown the machine, I can send, e.g., MagicPackets as much
> > as I like, the machined don't come up.  Isn't it a bit misleading then
> > if ethtool reports that some WoL method is enabled but it doesn't work?
> 
> Of course it is. :o(
> 
> I'm fine with Config5.LanWake changes if you have empirical evidences that
> it helps.
> 
> We have terse - outdated ? - documentation and some hint from
> http://marc.info/?l=linux-netdev=137654699802446. I'm unable to figure
> what an/the adequate change could be, especially a low level chance of
> regression one.

I think the problem here is that LanWake only switches off aspects of
the WoL capability which can't be reflected in a reliable way to the
kernel.  That's certainly one reason for the driver to enable/disable
LanWake always in lock-step with PMEnable.

So I wonder if we shouldn't just add some code to rtl_init_one (or
create a new function called from rtl_init_one) which checks the WoL
flags and if the PmConfig and LanWake flags are set inconsistently
(aka "differently") then set them to an equal value, either 0 (no WoL
method enabled) or 1 (any WoL method enabled).

Does that make sense?

Thanks,
Corinna

pgpN4_FLvqZpy.pgp
Description: PGP signature

Re: [PATCH v2] r8169: Don't claim WoL works if LanWake flag is not set

2015-12-10 Thread Corinna Vinschen

On Dec  9 23:43, Francois Romieu wrote:
> Corinna Vinschen <vinsc...@redhat.com> :
> [...]
> > This patch is supposed to fix this behaviour.  If LanWake is 0, the
> > function now returns 0.  Thus ethtool correctly reports "Wake-on: d".
> 
> Can you turn it into a DMI controlled one (something like
> drivers/net/ethernet/marvell/skge.c use of dmi_check_system)

I could do this (after I could lay my hands on such a board, that is),
but I'm not convinced that this makes a lot of sense for two reasons.

> in order to
> avoid a global change of behavior ?

1.  There is no global change in behaviour.  The usual way to handle the
WoL flags is to set the affected method flags and additionally set
LanWake if any of the method flags is set.  The fact that the method
flags don't enable WoL without also settting the LanWake flag is
documented.

__rtl8169_get_wol not reflecting this is a bug.  The code lazily
assumes that checking the WoL method flags is sufficient while in
fact it isn't.  __rtl8169_set_wol sets the LanWake flag accordingly,
but that doesn't mean the driver may assume that the flags haven't
been set differently.  I can easily hack the driver to set LanWake
to 0 and ethtool would still happily report WoL is active.  That's
plain wrong.

2. While we now know a single board which neglects to set the LanWake
   flag, that doesn't mean there aren't other boards out there doing the
   same.

   On top of that, the state of the NIC registers in terms of WoL are
   *not* board-specific.  They are regular NIC registers which are just
   set in a combination which the driver in it's current state evaluates
   wrongly.  It doesn't matter who and why the flags have been set that
   way.  The driver should reflect the actual state, and that requires
   to check for LanWake.

For those reasons I think that my fix is the right thing to do.

> Btw it's probably time to emit some warning during driver probe if wol
> bits are not consistent with LanWake.

That's a good idea.  I'll propose a followup patch with this addition.

Thanks,
Corinna

pgp6boYAPgk5h.pgp
Description: PGP signature

Re: [PATCH v2] r8169: Don't claim WoL works if LanWake flag is not set

2015-12-10 Thread Corinna Vinschen

On Dec 10 21:40, Francois Romieu wrote:
> Corinna Vinschen <vinsc...@redhat.com> :
> [...]
> > I could do this (after I could lay my hands on such a board, that is),
> > but I'm not convinced that this makes a lot of sense for two reasons.
> 
> Ok, let's get this change applied. Whatever happens should not be
> hard to manage (I'm thinking about other boards or BIOSes relying on the
> current - broken as it can be - behavior to work correctly).
> 
> [...]
> > 1.  There is no global change in behaviour.  The usual way to handle the
> > WoL flags is to set the affected method flags and additionally set
> > LanWake if any of the method flags is set.  The fact that the method
> > flags don't enable WoL without also settting the LanWake flag is
> > documented.
> 
> I see no such thing in "7.5 Power Management Function" of the 8168c
> registers datasheet. While Config3 states Magic Packet and Link Up
> dependencies on Config1.PMEn, it says nothing about Config5.LanWake.
> 
> On old 8169 chipsets LanWake is autoloaded from EEPROM.
> 
> Plausible for Config5.{B, M, U}WF ? Ok.
> 
> Documented ? I am genuinely curious to know where.

Ok, I reread the documentation I have, and I got that wrong it seems.
Apparently the LanWake flag enables or disables the LANWAKE/LANWAKEB pin
only but not the other possible PM events.

So, self-NACKed.

It's still a bit weird.  On the machines I tested this on, if I disable
LanWake and shutdown the machine, I can send, e.g., MagicPackets as much
as I like, the machined don't come up.  Isn't it a bit misleading then
if ethtool reports that some WoL method is enabled but it doesn't work?


Corinna


pgpdp6cN6Fxeq.pgp
Description: PGP signature

[PATCH v2] r8169: Don't claim WoL works if LanWake flag is not set

2015-12-09 Thread Corinna Vinschen

[v2 adds missing text to the patch comment]

The __rtl8169_get_wol function returns the state of the various WoL
method bits (MagicPaket, Unicast, etc.).  While the PMEnable bit is
tested at entry, the function doesn't check the LanWake flag.  Even if
any one of the WoL bits is set, WoL is deactivated as a whole if LanWake
isn't set.  The return value from __rtl8169_get_wol doesn't reflect
that.

Unfortunately BIOS exist (HP T620) which set the MagicPaket bit to 1 but
the LanWake flag to 0 when switching on WoL in the BIOS.  On those machines,
ethtool wrongly claims that WoL via MagicPaket is activated while in fact
it isn't, unless you explicitely called, e.g, `ethtool -s  wol g'.

This patch is supposed to fix this behaviour.  If LanWake is 0, the
function now returns 0.  Thus ethtool correctly reports "Wake-on: d".

The patch is against Dave's net-next tree.


Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/realtek/r8169.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index 79ef799..813ad2b 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1705,6 +1705,10 @@ static u32 __rtl8169_get_wol(struct rtl8169_private *tp)
if (!(options & PMEnable))
return 0;
 
+   options = RTL_R8(Config5);
+   if (!(options & LanWake))
+   return 0;
+
options = RTL_R8(Config3);
if (options & LinkUp)
wolopts |= WAKE_PHY;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] r8169: Don't claim WoL works if LanWake flag is not set

2015-12-09 Thread Corinna Vinschen

The __rtl8169_get_wol function returns the state of the various WoL
method bits (MagicPaket, Unicast, etc.).  While the PMEnable bit is
tested at entry, the function doesn't check the LanWake flag.  Even if
any one of the WoL bits is set, WoL is deactivated as a whole if LanWake
isn't set.  The return value from __rtl8169_get_wol doesn't reflect
that.

Unfortunately BIOS exist (HP T620) which set the MagicPaket bit to 1 but
the LanWake flag to 0 when switching on WoL in the BIOS.  On those
machines, ethtool wrongly claims that WoL via MagicPaket is activated
while in fact it isn't, unless you explicitely called

This patch is supposed to fix this behaviour.  If LanWake is 0, the
function now returns 0.  Thus ethtool correctly reports "Wake-on: d".

The patch is against Dave's net-next tree.


Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/realtek/r8169.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index 79ef799..813ad2b 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1705,6 +1705,10 @@ static u32 __rtl8169_get_wol(struct rtl8169_private *tp)
if (!(options & PMEnable))
return 0;
 
+   options = RTL_R8(Config5);
+   if (!(options & LanWake))
+   return 0;
+
options = RTL_R8(Config3);
if (options & LinkUp)
wolopts |= WAKE_PHY;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: kasan r8169 use-after-free trace.

2015-11-11 Thread Corinna Vinschen

On Nov 11 05:16, Eric Dumazet wrote:
> On Wed, 2015-11-11 at 10:19 +0100, Francois Romieu wrote:
> > Dave Jones <da...@codemonkey.org.uk> :
> > > This happens during boot, (and then there's a flood of traces that happen 
> > > so fast
> > > afterwards it completely overwhelms serial console; not sure if they're 
> > > the
> > > same/related or not).
> > > 
> > > ==
> > > BUG: KASAN: use-after-free in rtl8169_poll+0x4b6/0xb70 at addr 
> > > 8801d43b3288
> > > Read of size 1 by task kworker/0:3/188
> > > =
> > > BUG kmalloc-256 (Not tainted): kasan: bad access detected
> > > -
> > > 
> > > Disabling lock debugging due to kernel taint
> > > INFO: Slab 0xea000750ecc0 objects=16 used=16 fp=0x  (null) 
> > > flags=0x8080
> > > INFO: Object 0x8801d43b3200 @offset=512 fp=0x8801d43b3800
> > > 
> > > Bytes b4 8801d43b31f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> > > 00  
> > > Object 8801d43b3200: 00 38 3b d4 01 88 ff ff 00 00 00 00 00 00 00 00  
> > > .8;.
> > 
> > Does the patch below cure it ?
> > 
> > diff --git a/drivers/net/ethernet/realtek/r8169.c 
> > b/drivers/net/ethernet/realtek/r8169.c
> > index b4f2123..79ef799 100644
> > --- a/drivers/net/ethernet/realtek/r8169.c
> > +++ b/drivers/net/ethernet/realtek/r8169.c
> > @@ -7429,15 +7429,15 @@ process_pkt:
> >  
> > rtl8169_rx_vlan_tag(desc, skb);
> >  
> > +   if (skb->pkt_type == PACKET_MULTICAST)
> > +   dev->stats.multicast++;
> > +
> > napi_gro_receive(>napi, skb);
> >  
> > u64_stats_update_begin(>rx_stats.syncp);
> > tp->rx_stats.packets++;
> > tp->rx_stats.bytes += pkt_size;
> > u64_stats_update_end(>rx_stats.syncp);
> > -
> > -   if (skb->pkt_type == PACKET_MULTICAST)
> > -   dev->stats.multicast++;
> > }
> >  release_descriptor:
> > desc->opts2 = 0;
> 
> This looks obvious indeed, please submit this formally Francois ;)

Yes, please.  Thank you Francois.


> Fixes: d7d2d89d4b0af ("r8169: Add software counter for multicast packages")
> Acked-by: Eric Dumazet <eduma...@google.com>

Acked-by: Corinna Vinschen <vinsc...@redhat.com>


Corinna


pgpOTQv1S7P6H.pgp
Description: PGP signature

Re: [PATCH v2] r8169: Fix sleeping function called during get_stats64

2015-09-10 Thread Corinna Vinschen

On Sep  9 20:31, David Miller wrote:
> From: Corinna Vinschen <vinsc...@redhat.com>
> Date: Wed,  9 Sep 2015 23:16:40 +0200
> 
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
> > Fixes: 6e85d5ad36a26debc23a9a865c029cbe242b2dc8
> > 
> > Based on the discussion starting at
> > http://www.spinics.net/lists/netdev/msg342193.html
> > 
> > Tested locally on RTL8168evl/8111evl with various concurrent processes
> > accessing /proc/net/dev while changing the link state as well as
> > removing/reloading the r8169 module.
> > 
> > Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
> 
> Please address Francois's minor feedback, camelcase drives me nuts
> too FWIW :-)

Haha, with pleasure.  I don't like them much either.  I just used
them to align to the original code, which uses camel back for the
rx/tx descriptor ring buffers allocated the same way.

I'll send a v3 in a minute which fixes all of Francois points.


Thanks,
Corinna


pgpkZoFaiGneN.pgp
Description: PGP signature

[PATCH v3] r8169: Fix sleeping function called during get_stats64, v2

2015-09-10 Thread Corinna Vinschen

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
Fixes: 6e85d5ad36a26debc23a9a865c029cbe242b2dc8

Based on the discussion starting at
http://www.spinics.net/lists/netdev/msg342193.html

Tested locally on RTL8168evl/8111evl with various concurrent processes
accessing /proc/net/dev while changing the link state as well as
removing/reloading the r8169 module.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---

v2 -> v3: Rename CntPhysAddr -> counters_phys_addr
  Rename CntArray-> counters
  Fix whitespace issue
  Fix label names in rtl_init_one.

 drivers/net/ethernet/realtek/r8169.c | 137 ++-
 1 file changed, 54 insertions(+), 83 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index 24dcbe6..2b32e0c 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -833,7 +833,8 @@ struct rtl8169_private {
unsigned features;
 
struct mii_if_info mii;
-   struct rtl8169_counters counters;
+   dma_addr_t counters_phys_addr;
+   struct rtl8169_counters *counters;
struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
@@ -2190,53 +2191,37 @@ static int rtl8169_get_sset_count(struct net_device 
*dev, int sset)
}
 }
 
-static struct rtl8169_counters *rtl8169_map_counters(struct net_device *dev,
-dma_addr_t *paddr,
-u32 counter_cmd)
+DECLARE_RTL_COND(rtl_counters_cond)
 {
-   struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct device *d = >pci_dev->dev;
-   struct rtl8169_counters *counters;
-   u32 cmd;
 
-   counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
-   if (counters) {
-   RTL_W32(CounterAddrHigh, (u64)*paddr >> 32);
-   cmd = (u64)*paddr & DMA_BIT_MASK(32);
-   RTL_W32(CounterAddrLow, cmd);
-   RTL_W32(CounterAddrLow, cmd | counter_cmd);
-   }
-   return counters;
+   return RTL_R32(CounterAddrLow) & (CounterReset | CounterDump);
 }
 
-static void rtl8169_unmap_counters (struct net_device *dev,
-   dma_addr_t paddr,
-   struct rtl8169_counters *counters)
+static bool rtl8169_do_counters(struct net_device *dev, u32 counter_cmd)
 {
struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct device *d = >pci_dev->dev;
+   dma_addr_t paddr = tp->counters_phys_addr;
+   u32 cmd;
+   bool ret;
 
-   RTL_W32(CounterAddrLow, 0);
-   RTL_W32(CounterAddrHigh, 0);
+   RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+   cmd = (u64)paddr & DMA_BIT_MASK(32);
+   RTL_W32(CounterAddrLow, cmd);
+   RTL_W32(CounterAddrLow, cmd | counter_cmd);
 
-   dma_free_coherent(d, sizeof(*counters), counters, paddr);
-}
+   ret = rtl_udelay_loop_wait_low(tp, _counters_cond, 10, 1000);
 
-DECLARE_RTL_COND(rtl_reset_counters_cond)
-{
-   void __iomem *ioaddr = tp->mmio_addr;
+   RTL_W32(CounterAddrLow, 0);
+   RTL_W32(CounterAddrHigh, 0);
 
-   return RTL_R32(CounterAddrLow) & CounterReset;
+   return ret;
 }
 
 static bool rtl8169_reset_counters(struct net_device *dev)
 {
struct rtl8169_private *tp = netdev_priv(dev);
-   struct rtl8169_counters *counters;
-   dma_addr_t paddr;
-   bool ret = true;
 
/*
 * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
@@ -2245,32 +2230,13 @@ static bool rtl8169_reset_counters(struct net_device 
*dev)
if (tp->mac_version < RTL_GIGA_MAC_VER_19)
return true;
 
-   counters = rtl8169_map_counters(dev, , CounterReset);
-   if (!counters)
-   return false;
-
-   if (!rtl_udelay_loop_wait_low(tp, _reset_counters_cond, 10, 1000))
-   ret = false;
-
-   rtl8169_unmap_counters(dev, paddr, counters);
-
-   return ret;
-}
-
-DECLARE_RTL_COND(rtl_counters_cond)
-{
-   void __iomem *ioaddr = tp->mmio_addr;
-
-   return RTL_R32(CounterAddrLow) & CounterDump;
+   return rtl8169_do_counters(dev, CounterReset);
 }
 
 static bool rtl8169_update_counters(struct net_device *dev)
 {
struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct rtl8169_counters *counters;
-   dma_addr_t paddr;
-   bool ret = true;
 
/*
 * Some chips are unable to dump tally counters when the receiver
@@ -2279,23 +2245,13 @@ static bool rtl8169_update_counters(struct net_device 
*dev)
if ((RTL_R8(ChipCmd) & CmdRxEnb) == 0)
return tru

Re: [PATCH net 1/1] r8169: fix sleepable allocation during netdevice stats retrieval.

2015-09-10 Thread Corinna Vinschen

On Sep 10 01:51, poma wrote:
> [PATCH v2] r8169: Fix sleeping function called during get_stats64
> http://www.spinics.net/lists/netdev/msg342490.html
> - the noise is still present
> 
> Corinna, care to share kernel-core & kernel-modules you are testing with,
> to unravel the Gordian Knot?

Somehow your initrd seem to refuse to pick up the new kernel module.

Here's basically what I do:

- Test system is RHEL7

- git clone git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git

- edit drivers/net/ethernet/realtek/r8169.c

- make config; make -j42

- make install_modules && make install

- init 6

Alternatively, if you already did the above once:

- git am http://www.spinics.net/lists/netdev/msg342490.html>

- make modules

- cp drivers/net/ethernet/realtek/r8169.ko \
  /lib/modules/4.2.0+/kernel/drivers/net/ethernet/realtek/r8169.c

- dracut -f --kver '4.2.0+'

- init 6


Corinna


pgpVpl0dCHn6g.pgp
Description: PGP signature

Re: [PATCH net 1/1] r8169: fix sleepable allocation during netdevice stats retrieval.

2015-09-10 Thread Corinna Vinschen

On Sep 10 14:36, poma wrote:
> On 10.09.2015 10:47, Corinna Vinschen wrote:
> > On Sep 10 01:51, poma wrote:
> >> [PATCH v2] r8169: Fix sleeping function called during get_stats64
> >> http://www.spinics.net/lists/netdev/msg342490.html
> >> - the noise is still present
> >>
> >> Corinna, care to share kernel-core & kernel-modules you are testing with,
> >> to unravel the Gordian Knot?
> > 
> > Somehow your initrd seem to refuse to pick up the new kernel module.
> > [...etc...]
> 
> [PATCH v3] r8169: Fix sleeping function called during get_stats64, v2
> http://marc.info/?l=linux-netdev=144187488130444=raw
> 
> # dracut --omit-drivers r8169 -f --kver 4.3.0-0.rc0.git11.1.fc24.x86_64
> 
> - the noise is gone
> 
> Tested-by: poma <pomidorabelis...@gmail.com>

Thanks for testing!


Corinna


pgpSwWLxiLb3R.pgp
Description: PGP signature

Re: [PATCH net 3/3] r8169: increase the lifespan of the hardware counters dump area.

2015-09-09 Thread Corinna Vinschen

On Sep  9 01:27, Francois Romieu wrote:
> Corinna Vinschen <vinsc...@redhat.com> :
> [...]
> > - Alternatively I can still reproduce the SEGV in rtl_remove_one
> >   when trying to rmmod the module, I just don't have the stack dump
> >   handy while writing this mail.  I can show it if needed.
> 
> I see it too. 
> 
> > I debugged this on and off the entire day (tweaking, compiling, rebooting,
> > kernel crash, rinse and repeat).
> > 
> > And the result of my debugging is totally crazy:
> 
> My patch corrupts memory.

That's rather obvious, I just fail to see how.

I try to implement this from scratch, maybe I stumble over the cause
by accident...


Corinna


pgpRDJT_M8GnX.pgp
Description: PGP signature

Re: [GIT] Networking

2015-09-09 Thread Corinna Vinschen

On Sep  8 22:16, Konrad Rzeszutek Wilk wrote:
> On Tue, Sep 8, 2015 at 10:14 PM, Konrad Rzeszutek Wilk
> <kon...@kernel.org> wrote:
> >
> > (Removed Linus and Andrew from the To, added Corinna ..)
> 
> and resending again without HTML (sorry, thought I had HTML-emails
> disabled by default)
> >
> > On Thu, Sep 3, 2015 at 1:35 AM, David Miller <da...@davemloft.net> wrote:
> >>
> >>
> >> Another merge window, another set of networking changes.  I've heard
> >> rumblings that the lightweight tunnels infrastructure has been voted
> >> networking change of the year.  But what do I know?
> >>
> > .. snip..
> >>
> >> Corinna Vinschen (2):
> >>   r8169: Add values missing in @get_stats64 from HW counters

Thanks, this has been reported over the weekend in
https://bugzilla.kernel.org/show_bug.cgi?id=104031
Strangly this didn't occur during my tests.

Looking into it...


Corinna


pgpQbD31yZa48.pgp
Description: PGP signature

[PATCH net] r8169: Fix sleeping function called during get_stats64

2015-09-09 Thread Corinna Vinschen

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
Fixes: 6e85d5ad36a26debc23a9a865c029cbe242b2dc8

Based on the discussion starting at
http://www.spinics.net/lists/netdev/msg342193.html

Tested locally on RTL8168evl/8111evl with various concurrent processes
accessing /proc/net/dev while changing the link state as well as
removing/reloading the r8169 module at the same time.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/realtek/r8169.c | 140 ++-
 1 file changed, 56 insertions(+), 84 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index 24dcbe6..630811a 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -833,7 +833,8 @@ struct rtl8169_private {
unsigned features;
 
struct mii_if_info mii;
-   struct rtl8169_counters counters;
+   dma_addr_t CntPhysAddr;
+   struct rtl8169_counters *CntArray;
struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
@@ -2190,53 +2191,40 @@ static int rtl8169_get_sset_count(struct net_device 
*dev, int sset)
}
 }
 
-static struct rtl8169_counters *rtl8169_map_counters(struct net_device *dev,
-dma_addr_t *paddr,
-u32 counter_cmd)
+DECLARE_RTL_COND(rtl_counters_cond)
 {
-   struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct device *d = >pci_dev->dev;
-   struct rtl8169_counters *counters;
-   u32 cmd;
 
-   counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
-   if (counters) {
-   RTL_W32(CounterAddrHigh, (u64)*paddr >> 32);
-   cmd = (u64)*paddr & DMA_BIT_MASK(32);
-   RTL_W32(CounterAddrLow, cmd);
-   RTL_W32(CounterAddrLow, cmd | counter_cmd);
-   }
-   return counters;
+   return RTL_R32(CounterAddrLow) & (CounterReset | CounterDump);
 }
 
-static void rtl8169_unmap_counters (struct net_device *dev,
-   dma_addr_t paddr,
-   struct rtl8169_counters *counters)
+static bool rtl8169_do_counters(struct net_device *dev, u32 counter_cmd)
 {
struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct device *d = >pci_dev->dev;
+   dma_addr_t paddr = tp->CntPhysAddr;
+   u32 cmd;
+   bool ret;
 
-   RTL_W32(CounterAddrLow, 0);
-   RTL_W32(CounterAddrHigh, 0);
+   if (!paddr)
+   return false;
 
-   dma_free_coherent(d, sizeof(*counters), counters, paddr);
-}
+   RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+   cmd = (u64)paddr & DMA_BIT_MASK(32);
+   RTL_W32(CounterAddrLow, cmd);
+   RTL_W32(CounterAddrLow, cmd | counter_cmd);
 
-DECLARE_RTL_COND(rtl_reset_counters_cond)
-{
-   void __iomem *ioaddr = tp->mmio_addr;
+   ret = rtl_udelay_loop_wait_low(tp, _counters_cond, 10, 1000);
 
-   return RTL_R32(CounterAddrLow) & CounterReset;
+   RTL_W32(CounterAddrLow, 0);
+   RTL_W32(CounterAddrHigh, 0);
+
+   return ret;
 }
 
 static bool rtl8169_reset_counters(struct net_device *dev)
 {
struct rtl8169_private *tp = netdev_priv(dev);
-   struct rtl8169_counters *counters;
-   dma_addr_t paddr;
-   bool ret = true;
 
/*
 * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
@@ -2245,32 +2233,13 @@ static bool rtl8169_reset_counters(struct net_device 
*dev)
if (tp->mac_version < RTL_GIGA_MAC_VER_19)
return true;
 
-   counters = rtl8169_map_counters(dev, , CounterReset);
-   if (!counters)
-   return false;
-
-   if (!rtl_udelay_loop_wait_low(tp, _reset_counters_cond, 10, 1000))
-   ret = false;
-
-   rtl8169_unmap_counters(dev, paddr, counters);
-
-   return ret;
-}
-
-DECLARE_RTL_COND(rtl_counters_cond)
-{
-   void __iomem *ioaddr = tp->mmio_addr;
-
-   return RTL_R32(CounterAddrLow) & CounterDump;
+return rtl8169_do_counters(dev, CounterReset);
 }
 
 static bool rtl8169_update_counters(struct net_device *dev)
 {
struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct rtl8169_counters *counters;
-   dma_addr_t paddr;
-   bool ret = true;
 
/*
 * Some chips are unable to dump tally counters when the receiver
@@ -2279,23 +2248,13 @@ static bool rtl8169_update_counters(struct net_device 
*dev)
if ((RTL_R8(ChipCmd) & CmdRxEnb) == 0)
return true;
 
-   counters = rtl8169_map_counters(dev, , CounterDump);
-   if (!counters)
-   return false;
-
-   if (

Re: [PATCH net 1/1] r8169: fix sleepable allocation during netdevice stats retrieval.

2015-09-09 Thread Corinna Vinschen

On Sep  9 17:54, Corinna Vinschen wrote:
> On Sep  9 17:24, poma wrote:
> > [PATCH net] r8169: Fix sleeping function called during get_stats64
> > http://marc.info/?l=linux-netdev=144180123410135=raw
> > - the noise is still present
> 
> Are you really sure?  The entire dma_alloc/dma_free stuff has been moved
> away.  There's no locking or sleeping involved different from what was
> there before my original patch when calling .ndo_get_stats64.
> 
> How would I be able to reproduce this on the command line?

It would also be interesting to see the "noise" as it looks after
applying the above patch...


Corinna


pgpyx8KB9HKKJ.pgp
Description: PGP signature

Re: [PATCH net 1/1] r8169: fix sleepable allocation during netdevice stats retrieval.

2015-09-09 Thread Corinna Vinschen

On Sep  9 17:24, poma wrote:
> [PATCH net] r8169: Fix sleeping function called during get_stats64
> http://marc.info/?l=linux-netdev=144180123410135=raw
> - the noise is still present

Are you really sure?  The entire dma_alloc/dma_free stuff has been moved
away.  There's no locking or sleeping involved different from what was
there before my original patch when calling .ndo_get_stats64.

How would I be able to reproduce this on the command line?

Corinna

pgpcvEJOJJCGT.pgp
Description: PGP signature

Re: [PATCH net 1/1] r8169: fix sleepable allocation during netdevice stats retrieval.

2015-09-09 Thread Corinna Vinschen

On Sep  9 20:34, poma wrote:
> On 09/09/2015 05:55 PM, Corinna Vinschen wrote:
> > On Sep  9 17:54, Corinna Vinschen wrote:
> >> On Sep  9 17:24, poma wrote:
> >>> [PATCH net] r8169: Fix sleeping function called during get_stats64
> >>> http://marc.info/?l=linux-netdev=144180123410135=raw
> >>> - the noise is still present
> >>
> >> Are you really sure?  The entire dma_alloc/dma_free stuff has been moved
> >> away.  There's no locking or sleeping involved different from what was
> >> there before my original patch when calling .ndo_get_stats64.
> >>
> 
> 
> It is literally the kernel ring buffer output,
> so I really can not understand your question.

I'm asking because I'm wondering if you're actually testing the
right r8169.ko module, the one with the patch applied.  See below.

> >> How would I be able to reproduce this on the command line?
> 
> This I have already written, here's once more for you,
> 
>  This noise is induced via userspace, xfce4-netload-plugin,
>  http://goodies.xfce.org/projects/panel-plugins/xfce4-netload-plugin
> 
>  $ grep -i device .config/xfce4/panel/netload-16.rc
>  Network_Device=enp3s0
> 
>  $ ethtool -i enp3s0 | grep driver
>  driver: r8169
> 
> 
> Therefore, to try to reproduce this issue, 'xfce4-netload-plugin' must run 
> within 'xfce4-panel',
> moreover 'xfce4-netload-plugin' must be configured to monitor affected 
> network interface.

I'lll see if I can try this tomorrow.

> No command line this time, hombre.

If it has to be spanish, I'd prefer mujer, but whatever.

> > It would also be interesting to see the "noise" as it looks after
> > applying the above patch...
> 
> The "noise" after applying "r8169: Fix sleeping function called during 
> get_stats64":
> [...]
> [  215.049067] Call Trace:
> [  215.049078]  [] dump_stack+0x4b/0x63
> [  215.049090]  [] lockdep_rcu_suspicious+0xd7/0x110
> [  215.049099]  [] ___might_sleep+0xa7/0x230
> [  215.049107]  [] __might_sleep+0x49/0x80
> [  215.049121]  [] __alloc_pages_nodemask+0x2fe/0xb90
> [  215.049130]  [] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  215.049141]  [] ? sched_clock+0x9/0x10
> [  215.049149]  [] ? local_clock+0x1c/0x20
> [  215.049157]  [] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [  215.049168]  [] dma_generic_alloc_coherent+0x96/0x130
> [  215.049178]  [] x86_swiotlb_alloc_coherent+0x25/0x50
> [  215.049193]  [] dma_alloc_attrs+0x6d/0xe0
> [  215.049208]  [] rtl8169_map_counters+0x3e/0x70 [r8169]

This is very certainly not the r8169.ko driver with my patch applied.
There is no rtl8169_map_counters function anymore, just as with
Francois' patch.  I'm not sure what you're doing wrong there, but this
stack dump definitely cannot be produced with either Francois or my
patch, so you're apparently testing the unpatched upstream driver all
the time.

> ...
> etc.
>   etc.
> etc.
> 
> This looks the same as at the beginning, as if you were dealing with
> an entirely different problem, hombre.

No, sorry, it's you running the wrong kernel module, and a single
occurence of the stack dump would have been sufficient, but thanks
all the same.


Corinna


pgpskJpMKmMHs.pgp
Description: PGP signature

Re: [PATCH net] r8169: Fix sleeping function called during get_stats64

2015-09-09 Thread Corinna Vinschen

On Sep  9 22:23, Francois Romieu wrote:
> Corinna Vinschen <vinsc...@redhat.com> :
> [...]
> > diff --git a/drivers/net/ethernet/realtek/r8169.c 
> > b/drivers/net/ethernet/realtek/r8169.c
> > index 24dcbe6..630811a 100644
> > --- a/drivers/net/ethernet/realtek/r8169.c
> > +++ b/drivers/net/ethernet/realtek/r8169.c
> [...]
> > +   if (!paddr)
> > +   return false;
> 
> I guess this is the secret recipe.

Actually, no.  I started out stress testing this combined with a printk
to show if paddr can be NULL, but this never occurs.  ndo_get_stats64 is
apparently only called after registering the device, and this occurs
after calling dma_alloc_coherent in rtl_init_one.

Rather than removing the entire test, I acidentally only removed the
printk.  Sorry about that.

> [...]
> > @@ -8447,9 +8411,14 @@ static int rtl_init_one(struct pci_dev *pdev, const 
> > struct pci_device_id *ent)
> >  
> > tp->rtl_fw = RTL_FIRMWARE_UNKNOWN;
> >  
> > +   tp->CntArray = dma_alloc_coherent (>dev, sizeof(*tp->CntArray),
> > +  >CntPhysAddr, GFP_KERNEL);
> > +   if (!tp->CntArray)
> > +   goto err_out_cnt_4;
> > +
> 
> rc is still zero here so rtl_init_one will return success.

Thanks for catching!  I'm sending a fixed patch in a minute, removing
the paddr test and setting rc to -ENOMEM if dma_alloc_coherent fails.

Corinna

pgp_SzSXvSkJ2.pgp
Description: PGP signature

[PATCH v2] r8169: Fix sleeping function called during get_stats64

2015-09-09 Thread Corinna Vinschen

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
Fixes: 6e85d5ad36a26debc23a9a865c029cbe242b2dc8

Based on the discussion starting at
http://www.spinics.net/lists/netdev/msg342193.html

Tested locally on RTL8168evl/8111evl with various concurrent processes
accessing /proc/net/dev while changing the link state as well as
removing/reloading the r8169 module.

Signed-off-by: Corinna Vinschen <vinsc...@redhat.com>
---
 drivers/net/ethernet/realtek/r8169.c | 139 ++-
 1 file changed, 55 insertions(+), 84 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index 24dcbe6..c12 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -833,7 +833,8 @@ struct rtl8169_private {
unsigned features;
 
struct mii_if_info mii;
-   struct rtl8169_counters counters;
+   dma_addr_t CntPhysAddr;
+   struct rtl8169_counters *CntArray;
struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
@@ -2190,53 +2191,37 @@ static int rtl8169_get_sset_count(struct net_device 
*dev, int sset)
}
 }
 
-static struct rtl8169_counters *rtl8169_map_counters(struct net_device *dev,
-dma_addr_t *paddr,
-u32 counter_cmd)
+DECLARE_RTL_COND(rtl_counters_cond)
 {
-   struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct device *d = >pci_dev->dev;
-   struct rtl8169_counters *counters;
-   u32 cmd;
 
-   counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
-   if (counters) {
-   RTL_W32(CounterAddrHigh, (u64)*paddr >> 32);
-   cmd = (u64)*paddr & DMA_BIT_MASK(32);
-   RTL_W32(CounterAddrLow, cmd);
-   RTL_W32(CounterAddrLow, cmd | counter_cmd);
-   }
-   return counters;
+   return RTL_R32(CounterAddrLow) & (CounterReset | CounterDump);
 }
 
-static void rtl8169_unmap_counters (struct net_device *dev,
-   dma_addr_t paddr,
-   struct rtl8169_counters *counters)
+static bool rtl8169_do_counters(struct net_device *dev, u32 counter_cmd)
 {
struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct device *d = >pci_dev->dev;
+   dma_addr_t paddr = tp->CntPhysAddr;
+   u32 cmd;
+   bool ret;
 
-   RTL_W32(CounterAddrLow, 0);
-   RTL_W32(CounterAddrHigh, 0);
+   RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+   cmd = (u64)paddr & DMA_BIT_MASK(32);
+   RTL_W32(CounterAddrLow, cmd);
+   RTL_W32(CounterAddrLow, cmd | counter_cmd);
 
-   dma_free_coherent(d, sizeof(*counters), counters, paddr);
-}
+   ret = rtl_udelay_loop_wait_low(tp, _counters_cond, 10, 1000);
 
-DECLARE_RTL_COND(rtl_reset_counters_cond)
-{
-   void __iomem *ioaddr = tp->mmio_addr;
+   RTL_W32(CounterAddrLow, 0);
+   RTL_W32(CounterAddrHigh, 0);
 
-   return RTL_R32(CounterAddrLow) & CounterReset;
+   return ret;
 }
 
 static bool rtl8169_reset_counters(struct net_device *dev)
 {
struct rtl8169_private *tp = netdev_priv(dev);
-   struct rtl8169_counters *counters;
-   dma_addr_t paddr;
-   bool ret = true;
 
/*
 * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
@@ -2245,32 +2230,13 @@ static bool rtl8169_reset_counters(struct net_device 
*dev)
if (tp->mac_version < RTL_GIGA_MAC_VER_19)
return true;
 
-   counters = rtl8169_map_counters(dev, , CounterReset);
-   if (!counters)
-   return false;
-
-   if (!rtl_udelay_loop_wait_low(tp, _reset_counters_cond, 10, 1000))
-   ret = false;
-
-   rtl8169_unmap_counters(dev, paddr, counters);
-
-   return ret;
-}
-
-DECLARE_RTL_COND(rtl_counters_cond)
-{
-   void __iomem *ioaddr = tp->mmio_addr;
-
-   return RTL_R32(CounterAddrLow) & CounterDump;
+return rtl8169_do_counters(dev, CounterReset);
 }
 
 static bool rtl8169_update_counters(struct net_device *dev)
 {
struct rtl8169_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->mmio_addr;
-   struct rtl8169_counters *counters;
-   dma_addr_t paddr;
-   bool ret = true;
 
/*
 * Some chips are unable to dump tally counters when the receiver
@@ -2279,23 +2245,13 @@ static bool rtl8169_update_counters(struct net_device 
*dev)
if ((RTL_R8(ChipCmd) & CmdRxEnb) == 0)
return true;
 
-   counters = rtl8169_map_counters(dev, , CounterDump);
-   if (!counters)
-   return false;
-
-   if (rtl_udelay_loop_wait_low(tp, _counters_cond, 10, 1000))
-   me

Re: [PATCH net 3/3] r8169: increase the lifespan of the hardware counters dump area.

2015-09-08 Thread Corinna Vinschen

Hi David,

On Sep  7 17:00, David Miller wrote:
> From: Corinna Vinschen <vinsc...@redhat.com>
> Date: Mon, 7 Sep 2015 11:29:49 +0200
> 
> > Still wondering though.  Given that the driver never failed before if
> > the counter values couldn't be updated, and given that these counter
> > values only have statistical relevance, why should this suddenly result
> > in a fatal failure at open time?
> 
> Failing to allocate such a small buffer means we have much deeper issues
> at hand.  A GFP_KERNEL allocation of this size really should not fail.

I'm not talking about the allocation.  I agree with you on that score.

What I'm talking about is the situation where the NIC hardware fails to
reset or update its own counters for whatever reason.  Apparently the
mechanism is supposed to be performed within a given timeframe.  The
code sets some registers and then waits for a flag bit to be set to 0.
For that it utilizes a busy loop checking the flag bit up to 1000 times
with a delay of about 10 us.

The error condition is that the flag bit hasn't been set to 0 when the
loop exits, after roughly 10ms, and *this* part does not constitute a
fatal error which breaks the operation of the NIC.  So, from my
perspective a timeout while trying to wait for updated counter values
from the NIC at @ndo_open time should not be treated as fatal.

Corinna

pgpk7BRl5wT6Y.pgp
Description: PGP signature

Re: [PATCH net 3/3] r8169: increase the lifespan of the hardware counters dump area.

2015-09-08 Thread Corinna Vinschen

On Sep  7 23:52, Francois Romieu wrote:
> Corinna Vinschen <vinsc...@redhat.com> :
> [...]
> > I have a bit of a problem with this patch.  It crashes when loading the
> > driver manually w/ modprobe.  For some reason dev_get_stats is called
> > during rtl_init_one and at that time the counters pointer is NULL, so
> > the kernel gets a SEGV.
> >
> >  After I worked around that I got a SEGV in
> > rtl_remove_one, which I still have to find out why.  I didn't test with
> > the latest kernel, though, so I still have to check if that's the reason
> > for the crashes.  That takes a bit longer, I just wanted to let you
> > know.
> 
> Call me stupid: I forgot that it's perfectly fine to request the stats
> of a down interface. :o/
> 
> Updated patch is on the way.
> 
> > There's also a problem with rtl8169_cmd_counters:  It always calls
> > rtl_udelay_loop_wait_low w/ rtl_reset_counters_cond, even when called
> > from rtl8169_update_counters, where it should call
> > rtl_udelay_loop_wait_low w/ rtl_counters_cond to check for the
> > CounterDump flag, rather than for the CounterReset flag.
> > 
> > For now I applied the below patch, but I think it's a bit ugly to
> > conditionalize the rtl_cond struct by the incoming flag value.
> 
> 
> 
> Let's check both at once:
> 
>   return RTL_R32(CounterAddrLow) & (CounterDump | CounterReset);

If you're sure that's valid, then why not?  It's certainly cleaner
code.


Corinna


pgpNZQ0zHUR2P.pgp
Description: PGP signature

Re: [PATCH net 3/3] r8169: increase the lifespan of the hardware counters dump area.

2015-09-08 Thread Corinna Vinschen

On Sep  8 02:02, Francois Romieu wrote:
> Francois Romieu  :
> [...]
> > Updated patch is on the way.
> 
> Fixed memcpy in patch 0001, moved counters allocation from open() 
> to probe(), returned open() to its original state but something is
> still wrong: the link does not come up.

I tested and debugged the attached patches.  Just as you noticed, the
interfaces (my test machine has two) don't come up at boot time and
subsequently I can also reproduce two kinds of crashes:

- Calling `ip link ... up' crashes the kernel in rtl_open like this:

[  138.031190]  [] dump_stack+0x44/0x55
[  138.036311]  [] __setup_irq+0x515/0x580
[  138.041693]  [] ? rtl8169_gset_xmii+0x20/0x20 [r8169]
[  138.048284]  [] request_threaded_irq+0xf4/0x1a0
[  138.054357]  [] rtl_open+0x3a7/0xab4 [r8169]
[...]

- Alternatively I can still reproduce the SEGV in rtl_remove_one
  when trying to rmmod the module, I just don't have the stack dump
  handy while writing this mail.  I can show it if needed.

I debugged this on and off the entire day (tweaking, compiling, rebooting,
kernel crash, rinse and repeat).

And the result of my debugging is totally crazy:

If I disable the call to rtl_init_counter_offsets in rtl_open, as in

  #if 0
retval = rtl_init_counter_offsets(dev);
if (retval < 0)
netif_warn(tp, hw, dev, "counter reset/update failed\n");
  #endif

the interfaces come up just fine.

If I reenable the rtl_init_counter_offsets call in rtl_open, and reduce
the rtl_init_counter_offsets function to just this:

  static int rtl_init_counter_offsets(struct net_device *dev)
  {
  return 1;
  }

then the interfaces refuse to come up, and a subsequent `ip link ... up'
crashes the kernel.

No, I do not understand this :(

Corinna

pgp2h7zEl3YBC.pgp
Description: PGP signature

Re: [PATCH net 1/1] r8169: fix sleepable allocation during netdevice stats retrieval.

2015-09-07 Thread Corinna Vinschen

On Sep  7 09:13, poma wrote:
> On 06.09.2015 12:19, Corinna Vinschen wrote:
> > On Sep  4 22:59, Francois Romieu wrote:
> >>  Applies against davem's net as of 
> >> f1ccbfce2fc787981d1182d09c1f6b67766783a8.
> >>
> >>  drivers/net/ethernet/realtek/r8169.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/ethernet/realtek/r8169.c 
> >> b/drivers/net/ethernet/realtek/r8169.c
> >> index 24dcbe6..56829ea 100644
> >> --- a/drivers/net/ethernet/realtek/r8169.c
> >> +++ b/drivers/net/ethernet/realtek/r8169.c
> >> @@ -2200,7 +2200,7 @@ static struct rtl8169_counters 
> >> *rtl8169_map_counters(struct net_device *dev,
> >>struct rtl8169_counters *counters;
> >>u32 cmd;
> >>  
> >> -  counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
> >> +  counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_ATOMIC);
> >>if (counters) {
> >>RTL_W32(CounterAddrHigh, (u64)*paddr >> 32);
> >>cmd = (u64)*paddr & DMA_BIT_MASK(32);
> >> -- 
> >> 2.4.3
> > [...]
> 1. Reverted to r8169.c?id=eb78139
>- the noise is still present
> 2. Patches applied - Francois Romieu (3):
>r8169: decouple the counters data and the device private area.
>r8169: move rtl_reset_counters_cond before the hardware counters helpers.
>r8169: increase the lifespan of the hardware counters dump area.
>- the noise is still present

Sure you tested the right code?  I'm just asking because...

> [   70.016445]  [] dump_stack+0x4b/0x63
> [   70.016456]  [] lockdep_rcu_suspicious+0xd7/0x110
> [   70.016465]  [] ___might_sleep+0xa7/0x230
> [   70.016472]  [] __might_sleep+0x49/0x80
> [   70.016481]  [] __alloc_pages_nodemask+0x2fe/0xb90
> [   70.016490]  [] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [   70.016499]  [] ? sched_clock+0x9/0x10
> [   70.016507]  [] ? local_clock+0x1c/0x20
> [   70.016514]  [] ? debug_lockdep_rcu_enabled+0x1d/0x20
> [   70.016524]  [] dma_generic_alloc_coherent+0x96/0x130
> [   70.016534]  [] x86_swiotlb_alloc_coherent+0x25/0x50
> [   70.016541]  [] dma_alloc_attrs+0x6d/0xe0
> [   70.016555]  [] rtl8169_map_counters+0x3e/0x70 [r8169]

...rtl8169_map_counters only exists before Francois patch.  The patch
removes this function.


Corinna


pgpVJzCq1DXC1.pgp
Description: PGP signature

Re: [PATCH net 3/3] r8169: increase the lifespan of the hardware counters dump area.

2015-09-07 Thread Corinna Vinschen

On Sep  6 22:21, Francois Romieu wrote:
> Corinna Vinschen <vinsc...@redhat.com> :
> > On Sep  5 14:18, rom...@fr.zoreil.com wrote:
> [...]
> > > - rtl_reset_counters_cond induced failures in open() are also considered
> > >   fatal: it takes acceptable work to unwind comfortably.
> > 
> > Why?
> 
> 
> Crap, my description does not match the code wrt rtl_reset_counters_cond. :o/
> 
> s/rtl8169_reset_counters/rtl8169_update_counters/g
> 
> The code is right.
> 
> [...]
> > This returns -EINVAL even for older chip versions which are not capable
> > of resetting the counters.  The result is, this driver will not work at
> > all on chip versions prior to RTL_GIGA_MAC_VER_19 anymore, because
> > rtl_open will always fail.
> 
> No. My changelog was misleading. rtl_init_counter_offsets handles this part
> correctly.

Oh, right, I missed that rtl_init_counter_offsets checks for `if (!rc)',
not for `if (rc < 0)' as for the call to rtl8169_update_counters.

Still wondering though.  Given that the driver never failed before if
the counter values couldn't be updated, and given that these counter
values only have statistical relevance, why should this suddenly result
in a fatal failure at open time?

Corinna

pgpU8NPcDgH_n.pgp
Description: PGP signature

Re: [PATCH net 3/3] r8169: increase the lifespan of the hardware counters dump area.

2015-09-07 Thread Corinna Vinschen

On Sep  6 12:37, Corinna Vinschen wrote:
> On Sep  5 14:18, rom...@fr.zoreil.com wrote:
> > From: Francois Romieu <rom...@fr.zoreil.com>
> > 
> > net/core/net-sysfs.c::netstat_show retrieves stats with spinlock held.
> > 
> > This change avoids sleepable allocation and performs some housekeeping:
> > - receive ring, transmit ring and counters dump area allocation failures
> >   are all considered fatal during open()
> > - netif_warn is now redundant with rtl_reset_counters_cond built-in
> >   failure message.
> > - rtl_reset_counters_cond induced failures in open() are also considered
> >   fatal: it takes acceptable work to unwind comfortably.
> 
> Why?  The counter reset is not a fatal condition for the operation of
> the NIC.  Even if the counters show incorrect values, the normal
> operation of the NIC is not affected.  And to top that off, even if
> resetting the counters actually doesn't work, the driver keeps the
> offset values, so a failed reset won't even harm the statistics.  It
> just doesn't make sense to break the NIC operation for such a minor
> problem.  And worse:
> 
> > -static bool rtl8169_reset_counters(struct net_device *dev)
> > +static int rtl8169_reset_counters(struct net_device *dev)
> >  {
> > struct rtl8169_private *tp = netdev_priv(dev);
> > -   struct rtl8169_counters *counters;
> > -   dma_addr_t paddr;
> > -   bool ret = true;
> >  
> > /*
> >  * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
> >  * tally counters.
> >  */
> > if (tp->mac_version < RTL_GIGA_MAC_VER_19)
> > -   return true;
> > -
> > -   counters = rtl8169_map_counters(dev, , CounterReset);
> > -   if (!counters)
> > -   return false;
> > -
> > -   if (!rtl_udelay_loop_wait_low(tp, _reset_counters_cond, 10, 1000))
> > -   ret = false;
> > -
> > -   rtl8169_unmap_counters(dev, paddr, counters);
> > +   return -EINVAL;
> 
> This returns -EINVAL even for older chip versions which are not capable
> of resetting the counters.  The result is, this driver will not work at
> all on chip versions prior to RTL_GIGA_MAC_VER_19 anymore, because
> rtl_open will always fail.
> 
> Other then that, I can test the patch probably tomorrow.

I have a bit of a problem with this patch.  It crashes when loading the
driver manually w/ modprobe.  For some reason dev_get_stats is called
during rtl_init_one and at that time the counters pointer is NULL, so
the kernel gets a SEGV.  After I worked around that I got a SEGV in
rtl_remove_one, which I still have to find out why.  I didn't test with
the latest kernel, though, so I still have to check if that's the reason
for the crashes.  That takes a bit longer, I just wanted to let you
know.

There's also a problem with rtl8169_cmd_counters:  It always calls
rtl_udelay_loop_wait_low w/ rtl_reset_counters_cond, even when called
from rtl8169_update_counters, where it should call
rtl_udelay_loop_wait_low w/ rtl_counters_cond to check for the
CounterDump flag, rather than for the CounterReset flag.

For now I applied the below patch, but I think it's a bit ugly to
conditionalize the rtl_cond struct by the incoming flag value.


Corinna


diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index ae07567..cd7adbf 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -2200,6 +2200,13 @@ DECLARE_RTL_COND(rtl_reset_counters_cond)
return RTL_R32(CounterAddrLow) & CounterReset;
 }
 
+DECLARE_RTL_COND(rtl_counters_cond)
+{
+   void __iomem *ioaddr = tp->mmio_addr;
+
+   return RTL_R32(CounterAddrLow) & CounterDump;
+}
+
 static int rtl8169_cmd_counters(struct net_device *dev, u32 counter_cmd)
 {
struct rtl8169_private *tp = netdev_priv(dev);
@@ -2212,7 +2219,10 @@ static int rtl8169_cmd_counters(struct net_device *dev, 
u32 counter_cmd)
RTL_W32(CounterAddrLow, cmd);
RTL_W32(CounterAddrLow, cmd | counter_cmd);
 
-   return rtl_udelay_loop_wait_low(tp, _reset_counters_cond, 10, 1000);
+   return rtl_udelay_loop_wait_low(tp,
+   counter_cmd == CounterDump ? _counters_cond :
+_reset_counters_cond,
+   10, 1000);
 }
 
 static int rtl8169_reset_counters(struct net_device *dev)
@@ -2229,13 +2239,6 @@ static int rtl8169_reset_counters(struct net_device *dev)
return rtl8169_cmd_counters(dev, CounterReset);
 }
 
-DECLARE_RTL_COND(rtl_counters_cond)
-{
-   void __iomem *ioaddr = tp->mmio_addr;
-
-   return RTL_R32(CounterAddrLow) & CounterDump;
-}
-
 static int rtl8169_update_counters(struct net_device *dev)
 {
struct rtl8169_privat

Re: [PATCH net 1/1] r8169: fix sleepable allocation during netdevice stats retrieval.

2015-09-06 Thread Corinna Vinschen

On Sep  4 22:59, Francois Romieu wrote:
> net/core/net-sysfs.c::netstat_show fetches statistics under dev_base_lock.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104031
> Fixes: 6e85d5ad36a2 ("r8169: Add values missing in @get_stats64 from HW 
> counters")
> Signed-off-by: Francois Romieu <rom...@fr.zoreil.com>
> Cc: Corinna Vinschen <vinsc...@redhat.com>
> ---
> 
>  Applies against davem's net as of f1ccbfce2fc787981d1182d09c1f6b67766783a8.
> 
>  drivers/net/ethernet/realtek/r8169.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c 
> b/drivers/net/ethernet/realtek/r8169.c
> index 24dcbe6..56829ea 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -2200,7 +2200,7 @@ static struct rtl8169_counters 
> *rtl8169_map_counters(struct net_device *dev,
>   struct rtl8169_counters *counters;
>   u32 cmd;
>  
> - counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
> + counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_ATOMIC);
>   if (counters) {
>   RTL_W32(CounterAddrHigh, (u64)*paddr >> 32);
>   cmd = (u64)*paddr & DMA_BIT_MASK(32);
> -- 
> 2.4.3

I'll have a look tomorrow what I'm back to work, but for the time
being, two questions.

- The code worked for me in local testing without problem, so I'm rather
  puzzled.  How can I reproduce the problem?

- I'm pretty new to this stuff, so I don't understand this:

  The dma_alloc_coherent(...,GFP_KERNEL) call is not new in the code,
  it's there since at least 2010.  It appears to work fine in the
  context of @get_ethtool_stats.  Why does this not work in the context
  of @ndo_get_stats64?


Thanks,
Corinna


pgpdO7Z2CLpTf.pgp
Description: PGP signature

Re: [PATCH net 3/3] r8169: increase the lifespan of the hardware counters dump area.

2015-09-06 Thread Corinna Vinschen

On Sep  5 14:18, rom...@fr.zoreil.com wrote:
> From: Francois Romieu 
> 
> net/core/net-sysfs.c::netstat_show retrieves stats with spinlock held.
> 
> This change avoids sleepable allocation and performs some housekeeping:
> - receive ring, transmit ring and counters dump area allocation failures
>   are all considered fatal during open()
> - netif_warn is now redundant with rtl_reset_counters_cond built-in
>   failure message.
> - rtl_reset_counters_cond induced failures in open() are also considered
>   fatal: it takes acceptable work to unwind comfortably.

Why?  The counter reset is not a fatal condition for the operation of
the NIC.  Even if the counters show incorrect values, the normal
operation of the NIC is not affected.  And to top that off, even if
resetting the counters actually doesn't work, the driver keeps the
offset values, so a failed reset won't even harm the statistics.  It
just doesn't make sense to break the NIC operation for such a minor
problem.  And worse:

> -static bool rtl8169_reset_counters(struct net_device *dev)
> +static int rtl8169_reset_counters(struct net_device *dev)
>  {
>   struct rtl8169_private *tp = netdev_priv(dev);
> - struct rtl8169_counters *counters;
> - dma_addr_t paddr;
> - bool ret = true;
>  
>   /*
>* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
>* tally counters.
>*/
>   if (tp->mac_version < RTL_GIGA_MAC_VER_19)
> - return true;
> -
> - counters = rtl8169_map_counters(dev, , CounterReset);
> - if (!counters)
> - return false;
> -
> - if (!rtl_udelay_loop_wait_low(tp, _reset_counters_cond, 10, 1000))
> - ret = false;
> -
> - rtl8169_unmap_counters(dev, paddr, counters);
> + return -EINVAL;

This returns -EINVAL even for older chip versions which are not capable
of resetting the counters.  The result is, this driver will not work at
all on chip versions prior to RTL_GIGA_MAC_VER_19 anymore, because
rtl_open will always fail.

Other then that, I can test the patch probably tomorrow.

Corinna

pgpcwnfNo06nh.pgp
Description: PGP signature

Re: [PATCH net 1/3] r8169: decouple the counters data and the device private area.

2015-09-06 Thread Corinna Vinschen

On Sep  5 14:18, rom...@fr.zoreil.com wrote:
> From: Francois Romieu 
> @@ -2335,24 +2337,25 @@ static void rtl8169_get_ethtool_stats(struct 
> net_device *dev,
> struct ethtool_stats *stats, u64 *data)
>  {
>   struct rtl8169_private *tp = netdev_priv(dev);
> + struct rtl8169_counters *c = tp->counters;

Minor style nit: Shouldn't that local var ideally be called "counters"
as you did in the other functions?


Corinna


pgpRd7HUBOTob.pgp
Description: PGP signature

[PATCH net-next] r8169: Add software counter for multicast packages

2015-08-27 Thread Corinna Vinschen

The multicast hardware counter on 8168/8111 chips is only 32 bit while the
statistics in struct rtnl_link_stats64 are 64 bit.  Given that statistics
are requested on an irregular basis, an overflow of the hardware counter
can go unnoticed.  To count even very large numbers of multicast packets
reliably, add a software counter and remove previously applied code to
fill the multicast field requested by @rtl8169_get_stats64 with the values
read from the rx_multicast hardware counter.

Signed-off-by: Corinna Vinschen vinsc...@redhat.com
---
 drivers/net/ethernet/realtek/r8169.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index d6d39df..24dcbe6 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -754,7 +754,6 @@ struct rtl8169_tc_offsets {
boolinited;
__le64  tx_errors;
__le32  tx_multi_collision;
-   __le32  rx_multicast;
__le16  tx_aborted;
 };
 
@@ -2326,7 +2325,6 @@ static bool rtl8169_init_counter_offsets(struct 
net_device *dev)
 
tp-tc_offset.tx_errors = tp-counters.tx_errors;
tp-tc_offset.tx_multi_collision = tp-counters.tx_multi_collision;
-   tp-tc_offset.rx_multicast = tp-counters.rx_multicast;
tp-tc_offset.tx_aborted = tp-counters.tx_aborted;
tp-tc_offset.inited = true;
 
@@ -7480,6 +7478,9 @@ process_pkt:
tp-rx_stats.packets++;
tp-rx_stats.bytes += pkt_size;
u64_stats_update_end(tp-rx_stats.syncp);
+
+   if (skb-pkt_type == PACKET_MULTICAST)
+   dev-stats.multicast++;
}
 release_descriptor:
desc-opts2 = 0;
@@ -7790,7 +7791,6 @@ rtl8169_get_stats64(struct net_device *dev, struct 
rtnl_link_stats64 *stats)
stats-rx_bytes = tp-rx_stats.bytes;
} while (u64_stats_fetch_retry_irq(tp-rx_stats.syncp, start));
 
-
do {
start = u64_stats_fetch_begin_irq(tp-tx_stats.syncp);
stats-tx_packets = tp-tx_stats.packets;
@@ -7804,6 +7804,7 @@ rtl8169_get_stats64(struct net_device *dev, struct 
rtnl_link_stats64 *stats)
stats-rx_crc_errors= dev-stats.rx_crc_errors;
stats-rx_fifo_errors   = dev-stats.rx_fifo_errors;
stats-rx_missed_errors = dev-stats.rx_missed_errors;
+   stats-multicast= dev-stats.multicast;
 
/*
 * Fetch additonal counter values missing in stats collected by driver
@@ -7819,8 +7820,6 @@ rtl8169_get_stats64(struct net_device *dev, struct 
rtnl_link_stats64 *stats)
le64_to_cpu(tp-tc_offset.tx_errors);
stats-collisions = le32_to_cpu(tp-counters.tx_multi_collision) -
le32_to_cpu(tp-tc_offset.tx_multi_collision);
-   stats-multicast = le32_to_cpu(tp-counters.rx_multicast) -
-   le32_to_cpu(tp-tc_offset.rx_multicast);
stats-tx_aborted_errors = le16_to_cpu(tp-counters.tx_aborted) -
le16_to_cpu(tp-tc_offset.tx_aborted);
 
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-26 Thread Corinna Vinschen

On Aug 26 00:54, Francois Romieu wrote:
 David Miller da...@davemloft.net :
 [...]
  Your counter offsets should be read at probe time, not open time.
 
 It can be done but the CmdRxEnb / rx traffic must be enabled constraint
 will make it a major pita. 
 
 Reading counter offsets at the end of open() naturally solves this
 constraint (retentive error unwinding in opne() stops being completely
 trivial though :o/ ).
 
  Bringing the interface is brought down/up should not reset the
  counters.
 
 Afaiks rtl8169_tc_offsets.inited in rtl8169_init_counter_offsets
 takes care of it: it's set during the first open() after probe().
 
 Looking at it again, the patch directly stores 16 and 32 bit values
 in rtnl_link_stats64. Nobody should care about exact exceedingly high
 error count but rx_multicast ought to be accumulated.

I'll have a look into that for a followup patch.


Thanks,
Corinna


pgpgKxmrSJF6p.pgp
Description: PGP signature

Re: [PATCH v4 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-26 Thread Corinna Vinschen

On Aug 25 16:03, David Miller wrote:
 From: David Miller da...@davemloft.net
 Date: Tue, 25 Aug 2015 15:59:21 -0700 (PDT)

  From: Francois Romieu rom...@fr.zoreil.com
  Date: Wed, 26 Aug 2015 00:54:06 +0200

  Bringing the interface is brought down/up should not reset the
  counters.

  Afaiks rtl8169_tc_offsets.inited in rtl8169_init_counter_offsets
  takes care of it: it's set during the first open() after probe().

  Ok, then it's fine.

 And as such I've applied this patch, thanks.

Thanks,
Corinna

pgpWiElqoUtc5.pgp
Description: PGP signature

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-25 Thread Corinna Vinschen

On Aug 25 01:33, Francois Romieu wrote:
 Corinna Vinschen vinsc...@redhat.com :
  On Aug 22 13:23, Francois Romieu wrote:
 [...]
   Sorry, my english was really bad:
   
   the code should propagate failure when rtl8169_reset_counters and
   rtl8169_update_counters *simultaneously* fail.
  
  Uhm... sorry, but that still doesn't answer the question.  As you can
  see in my patch, the initalization at open time is already encapsulated
  in a function rtl8169_init_counter_offsets. 
 
 I have read your patch, I have already answered the question and I have
 already said that it wasn't a showstopper.

Ok.  I hope v4 of the patch is ok?


Thanks,
Corinna


pgpNpx7FA2Ait.pgp
Description: PGP signature

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-24 Thread Corinna Vinschen

On Aug 22 13:23, Francois Romieu wrote:
 Corinna Vinschen vinsc...@redhat.com :
 [...]
  That won't happen with the current patch because only
  rtl8169_reset_counters would print a log message, it's only called from
  open, and open occurs rather seldom.  Atop of that the code only tries
  to reset counters on HW supporting it, and only if resetting on the HW
  fails, there will be a log message at all.  There's no reasonable chance
  that failing to reset the counters will lead to log flooding.
 
 Thanks for reformulating it. We are in violent agreement here.
 
 [...]
  I'm not trying to avoid work, I'm trying to understand.
  
  As far as I see it failing to reset the counters has no impact on the
  viability of the code.  It's still working with offsets and if the
  offset is 0 or non-0, the user space won't see the difference in the
  values returned by @get_stats64.  Successful resetting the counters is
  just a bonus.
 
 Sorry, my english was really bad:
 
 the code should propagate failure when rtl8169_reset_counters and
 rtl8169_update_counters *simultaneously* fail.

Uhm... sorry, but that still doesn't answer the question.  As you can
see in my patch, the initalization at open time is already encapsulated
in a function rtl8169_init_counter_offsets. 

Assuming rtl8169_init_counter_offsets returns -1 if both functions,
rtl8169_reset_counters and rtl8169_update_counters fail.

Then... what?

Not being able to reset or update the counters is still not at all fatal
for the operation of the NIC as a whole and rtl_open in particular:

  rtl_open()
  {
[...]

/* This is non-fatal. */
if (!rtl8169_init_counter_offsets ()) {

  /* What to do here??? */

}

[...]
  }


Corinna


pgpg0O3EnYyyX.pgp
Description: PGP signature

[PATCH v3 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-24 Thread Corinna Vinschen

The r8169 driver collects statistical information returned by
@get_stats64 by counting them in the driver itself, even though many
(but not all) of the values are already collected by tally counters
(TCs) in the NIC.  Some of these TC values are not returned by
@get_stats64.  Especially the received multicast packages are missing
from /proc/net/dev.

Rectify this by fetching the TCs and returning them from
rtl8169_get_stats64.

The counters collected in the driver obviously disappear as soon as the
driver is unloaded so after a driver is loaded the counters always start
at 0. The TCs on the other hand are only reset by a power cycle.  Without
further considerations the values collected by the driver would not match
up against the TC values.

This patch introduces a new function rtl8169_reset_counters which
resets the TCs.  Also, since rtl8169_reset_counters shares most of
its code with rtl8169_update_counters, refactor the shared code into
two new functions  rtl8169_map_counters and rtl8169_unmap_counters.

Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow
to reset the TCs programatically.  Therefore introduce an addition to
the rtl8169_private struct and a function rtl8169_init_counter_offsets
to store the TCs at first rtl_open.  Use these values as offsets in
rtl8169_get_stats64.  Propagate a failure to reset *and* update the
counters up to rtl_open and emit a warning message, if so.

Signed-off-by: Corinna Vinschen vinsc...@redhat.com
---
 drivers/net/ethernet/realtek/r8169.c | 165 +++
 1 file changed, 150 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index f790f61..1e4322c 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -637,6 +637,9 @@ enum rtl_register_content {
/* _TBICSRBit */
TBILinkOK   = 0x0200,
 
+   /* ResetCounterCommand */
+   CounterReset= 0x1,
+
/* DumpCounterCommand */
CounterDump = 0x8,
 
@@ -747,6 +750,14 @@ struct rtl8169_counters {
__le16  tx_underun;
 };
 
+struct rtl8169_tc_offsets {
+   boolinited;
+   __le64  tx_errors;
+   __le32  tx_multi_collision;
+   __le32  rx_multicast;
+   __le16  tx_aborted;
+};
+
 enum rtl_flag {
RTL_FLAG_TASK_ENABLED,
RTL_FLAG_TASK_SLOW_PENDING,
@@ -824,6 +835,7 @@ struct rtl8169_private {
 
struct mii_if_info mii;
struct rtl8169_counters counters;
+   struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
 
@@ -2179,6 +2191,73 @@ static int rtl8169_get_sset_count(struct net_device 
*dev, int sset)
}
 }
 
+static struct rtl8169_counters *rtl8169_map_counters(struct net_device *dev,
+dma_addr_t *paddr,
+u32 counter_cmd)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   void __iomem *ioaddr = tp-mmio_addr;
+   struct device *d = tp-pci_dev-dev;
+   struct rtl8169_counters *counters;
+   u32 cmd;
+
+   counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
+   if (counters) {
+   RTL_W32(CounterAddrHigh, (u64)*paddr  32);
+   cmd = (u64)*paddr  DMA_BIT_MASK(32);
+   RTL_W32(CounterAddrLow, cmd);
+   RTL_W32(CounterAddrLow, cmd | counter_cmd);
+   }
+   return counters;
+}
+
+static void rtl8169_unmap_counters (struct net_device *dev,
+   dma_addr_t paddr,
+   struct rtl8169_counters *counters)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   void __iomem *ioaddr = tp-mmio_addr;
+   struct device *d = tp-pci_dev-dev;
+
+   RTL_W32(CounterAddrLow, 0);
+   RTL_W32(CounterAddrHigh, 0);
+
+   dma_free_coherent(d, sizeof(*counters), counters, paddr);
+}
+
+DECLARE_RTL_COND(rtl_reset_counters_cond)
+{
+   void __iomem *ioaddr = tp-mmio_addr;
+
+   return RTL_R32(CounterAddrLow)  CounterReset;
+}
+
+static bool rtl8169_reset_counters(struct net_device *dev)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   struct rtl8169_counters *counters;
+   dma_addr_t paddr;
+   bool ret = true;
+
+   /*
+* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
+* tally counters.
+*/
+   if (tp-mac_version  RTL_GIGA_MAC_VER_19)
+   return true;
+
+   counters = rtl8169_map_counters(dev, paddr, CounterReset);
+   if (!counters)
+   return false;
+
+   if (!rtl_udelay_loop_wait_low(tp, rtl_reset_counters_cond, 10, 1000))
+   ret = false;
+
+   rtl8169_unmap_counters(dev, paddr, counters);
+
+   return ret;
+}
+
 DECLARE_RTL_COND(rtl_counters_cond)
 {
void __iomem *ioaddr = tp-mmio_addr;
@@ -2186,38

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-24 Thread Corinna Vinschen

On Aug 24 09:33, Corinna Vinschen wrote:
 On Aug 22 13:23, Francois Romieu wrote:
  Corinna Vinschen vinsc...@redhat.com :
  [...]
   That won't happen with the current patch because only
   rtl8169_reset_counters would print a log message, it's only called from
   open, and open occurs rather seldom.  Atop of that the code only tries
   to reset counters on HW supporting it, and only if resetting on the HW
   fails, there will be a log message at all.  There's no reasonable chance
   that failing to reset the counters will lead to log flooding.
  
  Thanks for reformulating it. We are in violent agreement here.
  
  [...]
   I'm not trying to avoid work, I'm trying to understand.
   
   As far as I see it failing to reset the counters has no impact on the
   viability of the code.  It's still working with offsets and if the
   offset is 0 or non-0, the user space won't see the difference in the
   values returned by @get_stats64.  Successful resetting the counters is
   just a bonus.
  
  Sorry, my english was really bad:
  
  the code should propagate failure when rtl8169_reset_counters and
  rtl8169_update_counters *simultaneously* fail.
 
 Uhm... sorry, but that still doesn't answer the question.  As you can
 see in my patch, the initalization at open time is already encapsulated
 in a function rtl8169_init_counter_offsets. 
 
 Assuming rtl8169_init_counter_offsets returns -1 if both functions,
 rtl8169_reset_counters and rtl8169_update_counters fail.
 
 Then... what?
 
 Not being able to reset or update the counters is still not at all fatal
 for the operation of the NIC as a whole and rtl_open in particular:
 
   rtl_open()
   {
 [...]
 
 /* This is non-fatal. */
 if (!rtl8169_init_counter_offsets ()) {
 
 /* What to do here??? */
 
 }
 
 [...]
   }

Never mind, I rearranged the code so that the warning is now printed
in rtl_open.  All functions related to the counter reset/update are
now bool functions.  I just sent a v3 patch.


Corinna


pgp7HuqR5UDnR.pgp
Description: PGP signature

[PATCH v4 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-24 Thread Corinna Vinschen

The r8169 driver collects statistical information returned by
@get_stats64 by counting them in the driver itself, even though many
(but not all) of the values are already collected by tally counters
(TCs) in the NIC.  Some of these TC values are not returned by
@get_stats64.  Especially the received multicast packages are missing
from /proc/net/dev.

Rectify this by fetching the TCs and returning them from
rtl8169_get_stats64.

The counters collected in the driver obviously disappear as soon as the
driver is unloaded so after a driver is loaded the counters always start
at 0. The TCs on the other hand are only reset by a power cycle.  Without
further considerations the values collected by the driver would not match
up against the TC values.

This patch introduces a new function rtl8169_reset_counters which
resets the TCs.  Also, since rtl8169_reset_counters shares most of
its code with rtl8169_update_counters, refactor the shared code into
two new functions  rtl8169_map_counters and rtl8169_unmap_counters.

Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow
to reset the TCs programatically.  Therefore introduce an addition to
the rtl8169_private struct and a function rtl8169_init_counter_offsets
to store the TCs at first rtl_open.  Use these values as offsets in
rtl8169_get_stats64.  Propagate a failure to reset *and* update the
counters up to rtl_open and emit a warning message, if so.

Signed-off-by: Corinna Vinschen vinsc...@redhat.com
---

[v3 - v4]: Remove useless rtl8169_close chunk changing case of comment


 drivers/net/ethernet/realtek/r8169.c | 163 ---
 1 file changed, 149 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index f790f61..d6d39df 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -637,6 +637,9 @@ enum rtl_register_content {
/* _TBICSRBit */
TBILinkOK   = 0x0200,
 
+   /* ResetCounterCommand */
+   CounterReset= 0x1,
+
/* DumpCounterCommand */
CounterDump = 0x8,
 
@@ -747,6 +750,14 @@ struct rtl8169_counters {
__le16  tx_underun;
 };
 
+struct rtl8169_tc_offsets {
+   boolinited;
+   __le64  tx_errors;
+   __le32  tx_multi_collision;
+   __le32  rx_multicast;
+   __le16  tx_aborted;
+};
+
 enum rtl_flag {
RTL_FLAG_TASK_ENABLED,
RTL_FLAG_TASK_SLOW_PENDING,
@@ -824,6 +835,7 @@ struct rtl8169_private {
 
struct mii_if_info mii;
struct rtl8169_counters counters;
+   struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
 
@@ -2179,6 +2191,73 @@ static int rtl8169_get_sset_count(struct net_device 
*dev, int sset)
}
 }
 
+static struct rtl8169_counters *rtl8169_map_counters(struct net_device *dev,
+dma_addr_t *paddr,
+u32 counter_cmd)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   void __iomem *ioaddr = tp-mmio_addr;
+   struct device *d = tp-pci_dev-dev;
+   struct rtl8169_counters *counters;
+   u32 cmd;
+
+   counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
+   if (counters) {
+   RTL_W32(CounterAddrHigh, (u64)*paddr  32);
+   cmd = (u64)*paddr  DMA_BIT_MASK(32);
+   RTL_W32(CounterAddrLow, cmd);
+   RTL_W32(CounterAddrLow, cmd | counter_cmd);
+   }
+   return counters;
+}
+
+static void rtl8169_unmap_counters (struct net_device *dev,
+   dma_addr_t paddr,
+   struct rtl8169_counters *counters)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   void __iomem *ioaddr = tp-mmio_addr;
+   struct device *d = tp-pci_dev-dev;
+
+   RTL_W32(CounterAddrLow, 0);
+   RTL_W32(CounterAddrHigh, 0);
+
+   dma_free_coherent(d, sizeof(*counters), counters, paddr);
+}
+
+DECLARE_RTL_COND(rtl_reset_counters_cond)
+{
+   void __iomem *ioaddr = tp-mmio_addr;
+
+   return RTL_R32(CounterAddrLow)  CounterReset;
+}
+
+static bool rtl8169_reset_counters(struct net_device *dev)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   struct rtl8169_counters *counters;
+   dma_addr_t paddr;
+   bool ret = true;
+
+   /*
+* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
+* tally counters.
+*/
+   if (tp-mac_version  RTL_GIGA_MAC_VER_19)
+   return true;
+
+   counters = rtl8169_map_counters(dev, paddr, CounterReset);
+   if (!counters)
+   return false;
+
+   if (!rtl_udelay_loop_wait_low(tp, rtl_reset_counters_cond, 10, 1000))
+   ret = false;
+
+   rtl8169_unmap_counters(dev, paddr, counters);
+
+   return ret;
+}
+
 DECLARE_RTL_COND

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Corinna Vinschen

Sorry, I forgot to mention that I tested this patch on three different
chip versions, RTL_GIGA_MAC_VER_23, RTL_GIGA_MAC_VER_33 and
RTL_GIGA_MAC_VER_35.  I couldn't test on pre-RTL_GIGA_MAC_VER_19, but
the offset handling without counter reset already worked as expected on
later chip versions, so I'm pretty confident that older chip versions
should work accordingly.

On Aug 21 12:09, Corinna Vinschen wrote:
 The r8169 driver collects statistical information returned by
 @get_stats64 by counting them in the driver itself, even though many
 (but not all) of the values are already collected by tally counters
 (TCs) in the NIC.  Some of these TC values are not returned by
 @get_stats64.  Especially the received multicast packages are missing
 from /proc/net/dev.
 
 Rectify this by fetching the TCs and returning them from
 rtl8169_get_stats64.
 
 The counters collected in the driver obviously disappear as soon as the
 driver is unloaded so after a driver is loaded the counters always start
 at 0. The TCs on the other hand are only reset by a power cycle.  Without
 further considerations the values collected by the driver would not match
 up against the TC values.
 
 This patch introduces a new function rtl8169_reset_counters which
 resets the TCs.
 
 Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow
 to reset the TCs programatically.  Therefore introduce an addition to
 the rtl8169_private struct and a function rtl8169_init_counter_offsets
 to store the TCs at first rtl_open.  Use these values as offsets in
 rtl8169_get_stats64.
 
 Signed-off-by: Corinna Vinschen vinsc...@redhat.com
 ---
  drivers/net/ethernet/realtek/r8169.c | 107 
 +++
  1 file changed, 107 insertions(+)
 
 diff --git a/drivers/net/ethernet/realtek/r8169.c 
 b/drivers/net/ethernet/realtek/r8169.c
 index f790f61..f26a48d 100644
 --- a/drivers/net/ethernet/realtek/r8169.c
 +++ b/drivers/net/ethernet/realtek/r8169.c
 @@ -637,6 +637,9 @@ enum rtl_register_content {
   /* _TBICSRBit */
   TBILinkOK   = 0x0200,
  
 + /* ResetCounterCommand */
 + CounterReset= 0x1,
 +
   /* DumpCounterCommand */
   CounterDump = 0x8,
  
 @@ -747,6 +750,14 @@ struct rtl8169_counters {
   __le16  tx_underun;
  };
  
 +struct rtl8169_tc_offsets {
 + boolinited;
 + __le64  tx_errors;
 + __le32  tx_multi_collision;
 + __le32  rx_multicast;
 + __le16  tx_aborted;
 +};
 +
  enum rtl_flag {
   RTL_FLAG_TASK_ENABLED,
   RTL_FLAG_TASK_SLOW_PENDING,
 @@ -824,6 +835,7 @@ struct rtl8169_private {
  
   struct mii_if_info mii;
   struct rtl8169_counters counters;
 + struct rtl8169_tc_offsets tc_offset;
   u32 saved_wolopts;
   u32 opts1_mask;
  
 @@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device 
 *dev, int sset)
   }
  }
  
 +DECLARE_RTL_COND(rtl_reset_counters_cond)
 +{
 + void __iomem *ioaddr = tp-mmio_addr;
 +
 + return RTL_R32(CounterAddrLow)  CounterReset;
 +}
 +
 +static void rtl8169_reset_counters(struct net_device *dev)
 +{
 + struct rtl8169_private *tp = netdev_priv(dev);
 + void __iomem *ioaddr = tp-mmio_addr;
 + struct device *d = tp-pci_dev-dev;
 + struct rtl8169_counters *counters;
 + dma_addr_t paddr;
 + u32 cmd;
 +
 + /*
 +  * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
 +  * tally counters.
 +  */
 + if (tp-mac_version  RTL_GIGA_MAC_VER_19)
 + return;
 +
 + counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
 + if (!counters)
 + return;
 +
 + RTL_W32(CounterAddrHigh, (u64)paddr  32);
 + cmd = (u64)paddr  DMA_BIT_MASK(32);
 + RTL_W32(CounterAddrLow, cmd);
 + RTL_W32(CounterAddrLow, cmd | CounterReset);
 +
 + if (!rtl_udelay_loop_wait_low(tp, rtl_reset_counters_cond, 10, 1000))
 + netif_warn(tp, hw, dev, counter reset failed\n);
 +
 + RTL_W32(CounterAddrLow, 0);
 + RTL_W32(CounterAddrHigh, 0);
 +
 + dma_free_coherent(d, sizeof(*counters), counters, paddr);
 +}
 +
  DECLARE_RTL_COND(rtl_counters_cond)
  {
   void __iomem *ioaddr = tp-mmio_addr;
 @@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device 
 *dev)
   dma_free_coherent(d, sizeof(*counters), counters, paddr);
  }
  
 +static void rtl8169_init_counter_offsets(struct net_device *dev)
 +{
 + struct rtl8169_private *tp = netdev_priv(dev);
 +
 + /*
 +  * rtl8169_init_counter_offsets is called from rtl_open.  On chip
 +  * versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
 +  * reset by a power cycle, while the counter values collected by the
 +  * driver are reset at every driver unload/load cycle.
 +  *
 +  * To make sure the HW values returned by @get_stats64 match the SW
 +  * values, we collect the initial values at first open(*) and use them
 +  * as offsets to normalize the values

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Corinna Vinschen

On Aug 21 21:39, Francois Romieu wrote:
 Corinna Vinschen vinsc...@redhat.com :
 [...]
  diff --git a/drivers/net/ethernet/realtek/r8169.c 
  b/drivers/net/ethernet/realtek/r8169.c
  index f790f61..f26a48d 100644
  --- a/drivers/net/ethernet/realtek/r8169.c
  +++ b/drivers/net/ethernet/realtek/r8169.c
 [...]
  @@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device 
  *dev, int sset)
  }
   }
   
  +DECLARE_RTL_COND(rtl_reset_counters_cond)
  +{
  +   void __iomem *ioaddr = tp-mmio_addr;
  +
  +   return RTL_R32(CounterAddrLow)  CounterReset;
  +}
  +
  +static void rtl8169_reset_counters(struct net_device *dev)
  +{
 
 rtl8169_reset_counters duplicates most of rtl8169_update_counters. Please
 factor out the dma_alloc + parametrized CounterAddrLow write + cleanup.

Ok, will do.

  +   rtl8169_reset_counters(dev);
  +
  +   rtl8169_update_counters(dev);
 
 
 The code should propagate failure when both rtl8169_reset_counters and
 rtl8169_update_counters fail.

This one I don't understand.  Neither failing to reset the counters nor
failing to update the counters is fatal for the driver.  So far the
(unchanged) rtl8169_update_counters doesn't even print a log message,
while a failing reset in rtl8169_reset_counters now does.

Why is that not sufficent?


Thanks,
Corinna


pgpKL0AKrvb57.pgp
Description: PGP signature

[PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Corinna Vinschen

The r8169 driver collects statistical information returned by
@get_stats64 by counting them in the driver itself, even though many
(but not all) of the values are already collected by tally counters
(TCs) in the NIC.  Some of these TC values are not returned by
@get_stats64.  Especially the received multicast packages are missing
from /proc/net/dev.

Rectify this by fetching the TCs and returning them from
rtl8169_get_stats64.

The counters collected in the driver obviously disappear as soon as the
driver is unloaded so after a driver is loaded the counters always start
at 0. The TCs on the other hand are only reset by a power cycle.  Without
further considerations the values collected by the driver would not match
up against the TC values.

This patch introduces a new function rtl8169_reset_counters which
resets the TCs.

Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow
to reset the TCs programatically.  Therefore introduce an addition to
the rtl8169_private struct and a function rtl8169_init_counter_offsets
to store the TCs at first rtl_open.  Use these values as offsets in
rtl8169_get_stats64.

Signed-off-by: Corinna Vinschen vinsc...@redhat.com
---
 drivers/net/ethernet/realtek/r8169.c | 107 +++
 1 file changed, 107 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index f790f61..f26a48d 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -637,6 +637,9 @@ enum rtl_register_content {
/* _TBICSRBit */
TBILinkOK   = 0x0200,
 
+   /* ResetCounterCommand */
+   CounterReset= 0x1,
+
/* DumpCounterCommand */
CounterDump = 0x8,
 
@@ -747,6 +750,14 @@ struct rtl8169_counters {
__le16  tx_underun;
 };
 
+struct rtl8169_tc_offsets {
+   boolinited;
+   __le64  tx_errors;
+   __le32  tx_multi_collision;
+   __le32  rx_multicast;
+   __le16  tx_aborted;
+};
+
 enum rtl_flag {
RTL_FLAG_TASK_ENABLED,
RTL_FLAG_TASK_SLOW_PENDING,
@@ -824,6 +835,7 @@ struct rtl8169_private {
 
struct mii_if_info mii;
struct rtl8169_counters counters;
+   struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
 
@@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device 
*dev, int sset)
}
 }
 
+DECLARE_RTL_COND(rtl_reset_counters_cond)
+{
+   void __iomem *ioaddr = tp-mmio_addr;
+
+   return RTL_R32(CounterAddrLow)  CounterReset;
+}
+
+static void rtl8169_reset_counters(struct net_device *dev)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   void __iomem *ioaddr = tp-mmio_addr;
+   struct device *d = tp-pci_dev-dev;
+   struct rtl8169_counters *counters;
+   dma_addr_t paddr;
+   u32 cmd;
+
+   /*
+* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
+* tally counters.
+*/
+   if (tp-mac_version  RTL_GIGA_MAC_VER_19)
+   return;
+
+   counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL);
+   if (!counters)
+   return;
+
+   RTL_W32(CounterAddrHigh, (u64)paddr  32);
+   cmd = (u64)paddr  DMA_BIT_MASK(32);
+   RTL_W32(CounterAddrLow, cmd);
+   RTL_W32(CounterAddrLow, cmd | CounterReset);
+
+   if (!rtl_udelay_loop_wait_low(tp, rtl_reset_counters_cond, 10, 1000))
+   netif_warn(tp, hw, dev, counter reset failed\n);
+
+   RTL_W32(CounterAddrLow, 0);
+   RTL_W32(CounterAddrHigh, 0);
+
+   dma_free_coherent(d, sizeof(*counters), counters, paddr);
+}
+
 DECLARE_RTL_COND(rtl_counters_cond)
 {
void __iomem *ioaddr = tp-mmio_addr;
@@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device 
*dev)
dma_free_coherent(d, sizeof(*counters), counters, paddr);
 }
 
+static void rtl8169_init_counter_offsets(struct net_device *dev)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+
+   /*
+* rtl8169_init_counter_offsets is called from rtl_open.  On chip
+* versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
+* reset by a power cycle, while the counter values collected by the
+* driver are reset at every driver unload/load cycle.
+*
+* To make sure the HW values returned by @get_stats64 match the SW
+* values, we collect the initial values at first open(*) and use them
+* as offsets to normalize the values returned by @get_stats64.
+*
+* (*) We can't call rtl8169_init_counter_offsets from rtl_init_one
+* for the reason stated in rtl8169_update_counters; CmdRxEnb is only
+* set at open time by rtl_hw_start.
+*/
+
+   if (tp-tc_offset.inited)
+   return;
+
+   rtl8169_reset_counters(dev);
+
+   rtl8169_update_counters(dev);
+
+   tp-tc_offset.tx_errors = tp

Re: [PATCH] r8169: Add values missing in @get_stats64 from HW counters

2015-08-20 Thread Corinna Vinschen

On Aug 20 02:43, Hayes Wang wrote:
 Corinna Vinschen [mailto:vinsc...@redhat.com]
  Sent: Thursday, August 20, 2015 3:24 AM
 [...]
  +   /*
  +* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
  +* tally counters.
  +*/
  +   if (tp-mac_version = RTL_GIGA_MAC_VER_19) {
  +   RTL_W32(CounterAddrHigh, 0);
  +   RTL_W32(CounterAddrLow, CounterReset);
 
 I check these with our engineers, and they say the bit 6 ~ 63 should be the
 valid 64 byte alignment memory address. Although you don’t want to dump
 the counters, the hw may also clear the data in the memory which is indicated
 by bit 6 ~ 63, when you reset the counters.

Ok, that's easy enough to implement.  What about CmdRxEnb?  Are there
chips which need this flag set to perform the counter reset?


Thanks,
Corinna


pgpGrVZNY4vq8.pgp
Description: PGP signature

Re: [PATCH] r8169: Add values missing in @get_stats64 from HW counters

2015-08-19 Thread Corinna Vinschen

On Aug 19 09:31, Hayes Wang wrote:
 Corinna Vinschen [mailto:vinsc...@redhat.com]
  Sent: Wednesday, August 19, 2015 5:13 PM
 [...]
   It could be cleared by setting bit 0, such as rtl_tally_reset() of r8152.
  
  Is it safe to assume that this is implemented in all NICs covered by r8169? 
 
 It is supported from RTL8111C. That is, RTL_GIGA_MAC_VER_19 and later.

Thanks.  In that case I would prefer the same generic method for all
chip versions, so I'd opt for storing the offset values at rtl_open
time as my patch is doing right now.  Is that acceptable?

If so, wouldn't it make even more sense to use the hardware collected
information in @get_stats64 throughout, except for the numbers collected
*only* in software?  

I would be willing to propose a matching patch.


Thanks,
Corinna


pgpJydjgGbJe7.pgp
Description: PGP signature

Re: [PATCH] r8169: Add values missing in @get_stats64 from HW counters

2015-08-19 Thread Corinna Vinschen

On Aug 19 15:07, Corinna Vinschen wrote:
 On Aug 19 09:31, Hayes Wang wrote:
  Corinna Vinschen [mailto:vinsc...@redhat.com]
   Sent: Wednesday, August 19, 2015 5:13 PM
  [...]
It could be cleared by setting bit 0, such as rtl_tally_reset() of 
r8152.
   
   Is it safe to assume that this is implemented in all NICs covered by 
   r8169? 
  
  It is supported from RTL8111C. That is, RTL_GIGA_MAC_VER_19 and later.
 
 Thanks.  In that case I would prefer the same generic method for all
 chip versions, so I'd opt for storing the offset values at rtl_open
 time as my patch is doing right now.  Is that acceptable?
 
 If so, wouldn't it make even more sense to use the hardware collected
 information in @get_stats64 throughout, except for the numbers collected
 *only* in software?  
 
 I would be willing to propose a matching patch.

It just occured to me that the combination of resetting the counters on
post-RTL_GIGA_MAC_VER_19 chips plus offset handling would be quite
nice, because it would reset also the small 16 and 32 bit counters.

So I'd like to propose a patch which combines both techniques, if that's
an acceptable way to go forward.

Btw., does setting the reset bit in CounterAddrLow work the same way as
setting the CounterDump flag?  I.e, does the driver have to wait for the
hardware to set the bit to 0 again to be sure the reset is finished?


Thanks in advance,
Corinna


pgp8ULHO1_RPj.pgp
Description: PGP signature

Re: [PATCH] r8169: Add values missing in @get_stats64 from HW counters

2015-08-19 Thread Corinna Vinschen

On Aug 19 16:24, Corinna Vinschen wrote:
 On Aug 19 15:07, Corinna Vinschen wrote:
  On Aug 19 09:31, Hayes Wang wrote:
   Corinna Vinschen [mailto:vinsc...@redhat.com]
Sent: Wednesday, August 19, 2015 5:13 PM
   [...]
 It could be cleared by setting bit 0, such as rtl_tally_reset() of 
 r8152.

Is it safe to assume that this is implemented in all NICs covered by 
r8169? 
   
   It is supported from RTL8111C. That is, RTL_GIGA_MAC_VER_19 and later.
  
  Thanks.  In that case I would prefer the same generic method for all
  chip versions, so I'd opt for storing the offset values at rtl_open
  time as my patch is doing right now.  Is that acceptable?
  
  If so, wouldn't it make even more sense to use the hardware collected
  information in @get_stats64 throughout, except for the numbers collected
  *only* in software?  
  
  I would be willing to propose a matching patch.
 
 It just occured to me that the combination of resetting the counters on
 post-RTL_GIGA_MAC_VER_19 chips plus offset handling would be quite
 nice, because it would reset also the small 16 and 32 bit counters.
 
 So I'd like to propose a patch which combines both techniques, if that's
 an acceptable way to go forward.
 
 Btw., does setting the reset bit in CounterAddrLow work the same way as
 setting the CounterDump flag?  I.e, does the driver have to wait for the
 hardware to set the bit to 0 again to be sure the reset is finished?

Here's a preliminary implementation of the counter reset code.  It's
based on my previous patch.  It calls the new rtl8169_reset_counters
prior to rtl8169_update_counters from rtl_open.

I tested the patch successfully on a RTL8111f NIC (RTL_GIGA_MAC_VER_35).

This isn't an official patch submission, I'm just sending this for
further discussion.  I'll make a full patch submission if the code is
acceptable.  

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index e7c7955..204f7e7 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -637,6 +637,9 @@ enum rtl_register_content {
/* _TBICSRBit */
TBILinkOK   = 0x0200,
 
+   /* ResetCounterCommand */
+   CounterReset= 0x1,
+
/* DumpCounterCommand */
CounterDump = 0x8,
 
@@ -2188,6 +2191,31 @@ static int rtl8169_get_sset_count(struct net_device 
*dev, int sset)
}
 }
 
+DECLARE_RTL_COND(rtl_reset_counters_cond)
+{
+   void __iomem *ioaddr = tp-mmio_addr;
+
+   return RTL_R32(CounterAddrLow)  CounterReset;
+}
+
+static void rtl8169_reset_counters(struct net_device *dev)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   void __iomem *ioaddr = tp-mmio_addr;
+
+   /*
+* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
+* tally counters.
+*/
+   if (tp-mac_version = RTL_GIGA_MAC_VER_19) {
+   RTL_W32(CounterAddrHigh, 0);
+   RTL_W32(CounterAddrLow, CounterReset);
+   if (!rtl_udelay_loop_wait_low(tp, rtl_reset_counters_cond,
+ 10, 1000))
+   netif_warn(tp, hw, dev, counter reset failed\n);
+   }
+}
+
 DECLARE_RTL_COND(rtl_counters_cond)
 {
void __iomem *ioaddr = tp-mmio_addr;
@@ -2234,11 +2262,10 @@ static void rtl8169_init_tc_counter_offset(struct 
net_device *dev)
struct rtl8169_private *tp = netdev_priv(dev);
 
/*
-* rtl8169_init_tc_counter_offset is called from rtl_open.  The tally
-* counters are only reset by a power cycle, while the counter values
-* collected by the driver are reset at every driver unload/load cycle.
-* There's also no (documented?) way to reset the tally counters
-* programatically.
+* rtl8169_init_tc_counter_offset is called from rtl_open.  On chip
+* versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
+* reset by a power cycle, while the counter values collected by the
+* driver are reset at every driver unload/load cycle.
 *
 * To make sure the HW values returned by @get_stats64 match the SW
 * values, we collect the initial values at first open(*) and use them
@@ -2252,6 +2279,8 @@ static void rtl8169_init_tc_counter_offset(struct 
net_device *dev)
if (tp-tc_offset.inited)
return;
 
+   rtl8169_reset_counters(dev);
+
rtl8169_update_counters(dev);
 
tp-tc_offset.tx_errors = tp-counters.tx_errors;


Thanks,
Corinna


pgplEX60t_D0q.pgp
Description: PGP signature

Re: [PATCH] r8169: Add values missing in @get_stats64 from HW counters

2015-08-19 Thread Corinna Vinschen

On Aug 19 02:51, Hayes Wang wrote:
 [...]
   The TCs are only reset by a power cycle and there's no known way
   to reset them programatically.
 
 It could be cleared by setting bit 0, such as rtl_tally_reset() of r8152.

Thanks for this hint.  I'll give it a try.  Is it safe to assume that
this is implemented in all NICs covered by r8169?  Otherwise it's more
safe to use offset values as in my patch, I think.


Thanks,
Corinna


pgpLJr0cODr0S.pgp
Description: PGP signature

Re: [PATCH] r8169: Add values missing in @get_stats64 from HW counters

2015-08-19 Thread Corinna Vinschen

On Aug 18 19:05, David Miller wrote:
 From: Francois Romieu rom...@fr.zoreil.com
 Date: Tue, 18 Aug 2015 23:40:17 +0200

  Corinna Vinschen vinsc...@redhat.com :
  The r8169 driver collects statistical information returned by
  @get_stats64 by counting them in the driver itself, even though many
  (but not all) of the values are already collected by tally counters
  (TCs) in the NIC.  Some of these TC values are not returned by
  @get_stats64.  Especially the received multicast packages are missing
  from /proc/net/dev.
  Rectify this by fetching the TCs and returning them from
  rtl8169_get_stats64.

  It is not exactly a some / many / not all / confused [*] picture
  - get_stats64 provides software managed values
  - ethtool provides hardware only values

  [*] ok, rx_missed is kinda special.

  The counters collected in the driver obviously disappear as soon as the
  driver is unloaded so after a driver is loaded the counters always start
  at 0.  The TCs are only reset by a power cycle and there's no known way
  to reset them programatically. Without further considerations the values
  collected by the driver would not match up against the TC values.

  I'd rather:
  1. keep software and hardware stats separated
  2. push to userspace the responsibility to offset device specific hardware
 values jumps

  My 0.02 €.

 Any value that is provided by -get_stats64() should be provided by whatever
 mechanism is available and determined to be prudent for the device in
 question.  If the device is not tracking the event, this means pure software
 counters.

 If the device tracks the event, you should attempt to use the hardware
 facility to provide the -get_stats64() values.

 If the hardware provides counters not starting at zero on driver load,
 one option to handle that is to load the initial values of all of
 those counters and calculate diffs at -get_stats64() time.

That's what my code is doing, just for a limited set of values.

I was already wondering why some values available in the hardware tally
counters are additionally counted in software.  I was planning to
propose to use the hardware counters in @get_stats64 in the first place
and only count values in the driver which are not provided by the
hardware

Corinna

pgpv2lzWCTlZ1.pgp
Description: PGP signature

[PATCH] r8169: Add values missing in @get_stats64 from HW counters

2015-08-18 Thread Corinna Vinschen

The r8169 driver collects statistical information returned by
@get_stats64 by counting them in the driver itself, even though many
(but not all) of the values are already collected by tally counters
(TCs) in the NIC.  Some of these TC values are not returned by
@get_stats64.  Especially the received multicast packages are missing
from /proc/net/dev.

Rectify this by fetching the TCs and returning them from
rtl8169_get_stats64.

The counters collected in the driver obviously disappear as soon as the
driver is unloaded so after a driver is loaded the counters always start
at 0.  The TCs are only reset by a power cycle and there's no known way
to reset them programatically.  Without further considerations the values
collected by the driver would not match up against the TC values.

Therefore introduce an addition to the rtl8169_private struct and a
function rtl8169_init_tc_counter_offset to store the TCs at first
rtl_open.  Use these values as offsets in rtl8169_get_stats64.

Signed-off-by: Corinna Vinschen vinsc...@redhat.com
---
 drivers/net/ethernet/realtek/r8169.c | 62 
 1 file changed, 62 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index f790f61..e7c7955 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -747,6 +747,14 @@ struct rtl8169_counters {
__le16  tx_underun;
 };
 
+struct rtl8169_tc_offsets {
+   boolinited;
+   __le64  tx_errors;
+   __le32  tx_multi_collision;
+   __le32  rx_multicast;
+   __le16  tx_aborted;
+};
+
 enum rtl_flag {
RTL_FLAG_TASK_ENABLED,
RTL_FLAG_TASK_SLOW_PENDING,
@@ -824,6 +832,7 @@ struct rtl8169_private {
 
struct mii_if_info mii;
struct rtl8169_counters counters;
+   struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
 
@@ -2220,6 +2229,38 @@ static void rtl8169_update_counters(struct net_device 
*dev)
dma_free_coherent(d, sizeof(*counters), counters, paddr);
 }
 
+static void rtl8169_init_tc_counter_offset(struct net_device *dev)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+
+   /*
+* rtl8169_init_tc_counter_offset is called from rtl_open.  The tally
+* counters are only reset by a power cycle, while the counter values
+* collected by the driver are reset at every driver unload/load cycle.
+* There's also no (documented?) way to reset the tally counters
+* programatically.
+*
+* To make sure the HW values returned by @get_stats64 match the SW
+* values, we collect the initial values at first open(*) and use them
+* as offsets to normalize the values returned by @get_stats64.
+*
+* (*) We can't call rtl8169_init_tc_counter_offset from rtl_init_one
+* for the reason stated in rtl8169_update_counters; CmdRxEnb is only
+* set at open time by rtl_hw_start.
+*/
+
+   if (tp-tc_offset.inited)
+   return;
+
+   rtl8169_update_counters(dev);
+
+   tp-tc_offset.tx_errors = tp-counters.tx_errors;
+   tp-tc_offset.tx_multi_collision = tp-counters.tx_multi_collision;
+   tp-tc_offset.rx_multicast = tp-counters.rx_multicast;
+   tp-tc_offset.tx_aborted = tp-counters.tx_aborted;
+   tp-tc_offset.inited = true;
+}
+
 static void rtl8169_get_ethtool_stats(struct net_device *dev,
  struct ethtool_stats *stats, u64 *data)
 {
@@ -7631,6 +7672,8 @@ static int rtl_open(struct net_device *dev)
 
rtl_hw_start(dev);
 
+   rtl8169_init_tc_counter_offset(dev);
+
netif_start_queue(dev);
 
rtl_unlock_work(tp);
@@ -7689,6 +7732,25 @@ rtl8169_get_stats64(struct net_device *dev, struct 
rtnl_link_stats64 *stats)
stats-rx_fifo_errors   = dev-stats.rx_fifo_errors;
stats-rx_missed_errors = dev-stats.rx_missed_errors;
 
+   /*
+* Fetch additonal counter values missing in stats collected by driver
+* from tally counters.
+*/
+   rtl8169_update_counters(dev);
+
+   /*
+* Subtract values fetched during initalization.
+* See rtl8169_init_tc_counter_offset for a description why we do that.
+*/
+   stats-tx_errors = le64_to_cpu(tp-counters.tx_errors) -
+   le64_to_cpu(tp-tc_offset.tx_errors);
+   stats-collisions = le32_to_cpu(tp-counters.tx_multi_collision) -
+   le32_to_cpu(tp-tc_offset.tx_multi_collision);
+   stats-multicast = le32_to_cpu(tp-counters.rx_multicast) -
+   le32_to_cpu(tp-tc_offset.rx_multicast);
+   stats-tx_aborted_errors = le16_to_cpu(tp-counters.tx_aborted) -
+   le16_to_cpu(tp-tc_offset.tx_aborted);
+
return stats;
 }
 
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo

61 matches

Mail list logo