Re: net/arp: ARP cache aging failed.

2016-11-23 Thread Julian Anastasov

Hello,

On Wed, 23 Nov 2016, Eric Dumazet wrote:

> On Wed, 2016-11-23 at 15:37 +0100, Hannes Frederic Sowa wrote:
> 
> > Regardless of the question whether bonding should keep the MAC address
> > alive, a MAC address can certainly change below a TCP connection.
> 
> Of course ;)
> 
> > 
> > dst_entry is 1:n to neigh_entry and as such we can end up confirming an
> > aging neighbor while sending a reply with dst->pending_confirm set while
> > the confirming packet actually came from a different neighbor.
> > 
> > I agree with Julian, pending_confirm became useless in this way.
> 
> Let's kill it then ;)

It works for traffic via gateway. I now see that
we can even avoid the write in dst_confirm:

	if (!dst->pending_confirm)
		dst->pending_confirm = 1;

because it is called by non-dup TCP ACKs.

But for traffic to hosts on the LAN we need a different solution,
i.e. for cached dsts with rt_gateway = 0 (last entry below).

rt_uses_gateway  rt_gateway  DST_NOCACHE  Description

1                nh_gw       ANY          Traffic via gateway
0                LAN_host    1            FLOWI_FLAG_KNOWN_NH (nexthop
                                          set by IPVS, hdrincl, xt_TEE)
0                0           0            1 dst for many subnet hosts

Regards

--
Julian Anastasov 
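
A minimal user-space sketch of the check-before-write idea mentioned above
for dst_confirm(). The names toy_dst and toy_dst_confirm are hypothetical
stand-ins, not the kernel's definitions; the sketch only shows that the store
is skipped when the flag is already set, so a hot path does not dirty the
cache line again.

```c
/* Hypothetical stand-in for the one field of interest in struct dst_entry. */
struct toy_dst {
	unsigned short pending_confirm;
};

/*
 * Set the flag only when it is still clear. Returns 1 if a store was
 * performed and 0 if the write was avoided.
 */
static int toy_dst_confirm(struct toy_dst *dst)
{
	if (!dst->pending_confirm) {
		dst->pending_confirm = 1;
		return 1;
	}
	return 0;
}
```

Repeated calls after the first one only read the field, which is the effect
the message describes for frequent callers such as non-dup TCP ACKs.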


Re: wl1251 & mac address & calibration data

2016-11-23 Thread Pavel Machek
Hi!

> > "ifconfig hw ether XX" normally sets the address. I guess that's
> > ioctl?
> 
> This sets temporary address and it is ioctl. IIRC same as what ethtool 
> uses. (ifconfig is already deprecated).
> 
> > And I guess we should use similar mechanism for permanent
> > address.
> 
> I'm not sure here... Above ioctl ↑↑↑ is for changing temporary mac 
> address. But here we do not want to change permanent mac address. We 
> want to tell kernel driver current permanent mac address which is
> stored

Well... I'd still use similar mechanism :-).
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [scr265482] ip_tunnel.c

2016-11-23 Thread 于立洋1
Yes, I mean recreating the tunnel.
But I don't think the patch fixes the bug. It only lets the first packet be
received successfully; the packets that follow will still be dropped.
In function __gre_xmit, line 366:
  tunnel->o_seqno++;

If you restart from UINT_MAX, the 'o_seqno' of the second packet will return
to 0 again.

BTW:
   Can you read Chinese? :)

On Wed, Nov 23, 2016 at 6:47 PM, Liyang Yu (于立洋1)  wrote:
> Hi:
> I found that the GRE tunnel can, in some cases, cause an integer
> overflow in ip_tunnel.c:397
>
> Cause of the problem:
> When tpi->seq is less than tunnel->i_seqno, the packet is dropped.
>
> How to reproduce the problem:
> 1. Create a tunnel using the kernel GRE module.
> 2. Use the tunnel to send packets for a while.
> 3. Reboot one site of the tunnel.
> 4. Communication is interrupted.

What do you mean by "reboot one site of the tunnel"?

If you mean something like delete and create it again, it has nothing related 
to integer overflow, the tunnel->o_seqno will restart from 0 and the 
tunnel->i_seqno will remain as it is since we can't detect the interruption of 
the tunnel traffic.
If so, the following patch could help?


diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 5719d6b..2738ff2 
100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -277,6 +277,7 @@ static struct net_device *__ip_tunnel_create(struct net 
*net,
tunnel = netdev_priv(dev);
tunnel->parms = *parms;
tunnel->net = net;
+   tunnel->o_seqno = UINT_MAX;

err = register_netdevice(dev);
if (err)


Re: [scr265482] ip_tunnel.c

2016-11-23 Thread Cong Wang
On Wed, Nov 23, 2016 at 6:47 PM, Liyang Yu (于立洋1)  wrote:
> Hi:
> I found that the GRE tunnel can, in some cases, cause an integer overflow
> in ip_tunnel.c:397
>
> Cause of the problem:
> When tpi->seq is less than tunnel->i_seqno, the packet is dropped.
>
> How to reproduce the problem:
> 1. Create a tunnel using the kernel GRE module.
> 2. Use the tunnel to send packets for a while.
> 3. Reboot one site of the tunnel.
> 4. Communication is interrupted.

What do you mean by "reboot one site of the tunnel"?

If you mean something like delete and create it again,
it has nothing related to integer overflow, the tunnel->o_seqno
will restart from 0 and the tunnel->i_seqno will remain as it is
since we can't detect the interruption of the tunnel traffic.
If so, the following patch could help?


diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 5719d6b..2738ff2 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -277,6 +277,7 @@ static struct net_device
*__ip_tunnel_create(struct net *net,
tunnel = netdev_priv(dev);
tunnel->parms = *parms;
tunnel->net = net;
+   tunnel->o_seqno = UINT_MAX;

err = register_netdevice(dev);
if (err)
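
A small sketch of what an initial value of UINT_MAX means for the sender,
assuming the counter is pre-incremented before each packet is stamped (the
tunnel->o_seqno++ line cited elsewhere in this thread). The struct and
function names are hypothetical; only unsigned wraparound is modelled.

```c
#include <stdint.h>

/* Hypothetical sender state; only the sequence counter is modelled. */
struct toy_tunnel {
	uint32_t o_seqno;
};

/* Pre-increment, then use the new value for the outgoing packet. */
static uint32_t toy_next_seq(struct toy_tunnel *t)
{
	t->o_seqno++;		/* unsigned arithmetic wraps modulo 2^32 */
	return t->o_seqno;
}
```

Starting from UINT32_MAX the first stamped value is 0 and the second is 1;
starting from 0 the first stamped value is 1.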


[PATCH] adm80211: Removed unused 'io_addr' 'mem_addr' variables

2016-11-23 Thread Kirtika Ruchandani
Initial commit cc0b88cf5ecf ([PATCH] Add adm8211 802.11b wireless driver)
introduced variables mem_addr and io_addr in adm8211_probe() that are
set but not used. Compiling with W=1 gives the following warnings;
fix them.

drivers/net/wireless/admtek/adm8211.c: In function ‘adm8211_probe’:
drivers/net/wireless/admtek/adm8211.c:1769:15: warning: variable ‘io_addr’ set 
but not used [-Wunused-but-set-variable]
  unsigned int io_addr, io_len;
   ^
drivers/net/wireless/admtek/adm8211.c:1768:16: warning: variable ‘mem_addr’ set 
but not used [-Wunused-but-set-variable]
  unsigned long mem_addr, mem_len;
^

These are harmless warnings and are only being fixed to reduce the
noise with W=1 in the kernel. The calls to pci_resource_start do not
have any side-effects and are safe to remove.

Fixes: cc0b88cf5ecf ("[PATCH] Add adm8211 802.11b wireless driver")
Cc: Michael Wu 
Cc: John W. Linville 
Signed-off-by: Kirtika Ruchandani 
---
 drivers/net/wireless/admtek/adm8211.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/admtek/adm8211.c 
b/drivers/net/wireless/admtek/adm8211.c
index 70ecd82..70b4da0 100644
--- a/drivers/net/wireless/admtek/adm8211.c
+++ b/drivers/net/wireless/admtek/adm8211.c
@@ -1765,8 +1765,8 @@ static int adm8211_probe(struct pci_dev *pdev,
 {
struct ieee80211_hw *dev;
struct adm8211_priv *priv;
-   unsigned long mem_addr, mem_len;
-   unsigned int io_addr, io_len;
+   unsigned long mem_len;
+   unsigned int io_len;
int err;
u32 reg;
u8 perm_addr[ETH_ALEN];
@@ -1778,9 +1778,7 @@ static int adm8211_probe(struct pci_dev *pdev,
return err;
}

-   io_addr = pci_resource_start(pdev, 0);
io_len = pci_resource_len(pdev, 0);
-   mem_addr = pci_resource_start(pdev, 1);
mem_len = pci_resource_len(pdev, 1);
if (io_len < 256 || mem_len < 1024) {
printk(KERN_ERR "%s (adm8211): Too short PCI resources\n",
--
2.8.0.rc3.226.g39d4020


[net-next] neigh: fix the loop index error in neigh dump

2016-11-23 Thread Zhang Shengju
The loop index in the neigh dump functions is not updated correctly under
some circumstances; this patch fixes it.

Signed-off-by: Zhang Shengju 
---
 net/core/neighbour.c | 39 ++-
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 2ae929f..ce32e9c 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2256,6 +2256,16 @@ static bool neigh_ifindex_filtered(struct net_device 
*dev, int filter_idx)
return false;
 }
 
+static bool neigh_dump_filtered(struct net_device *dev, int filter_idx,
+   int filter_master_idx)
+{
+   if (neigh_ifindex_filtered(dev, filter_idx) ||
+   neigh_master_filtered(dev, filter_master_idx))
+   return true;
+
+   return false;
+}
+
 static int neigh_dump_table(struct neigh_table *tbl, struct sk_buff *skb,
struct netlink_callback *cb)
 {
@@ -2285,20 +2295,15 @@ static int neigh_dump_table(struct neigh_table *tbl, 
struct sk_buff *skb,
rcu_read_lock_bh();
nht = rcu_dereference_bh(tbl->nht);
 
-   for (h = s_h; h < (1 << nht->hash_shift); h++) {
-   if (h > s_h)
-   s_idx = 0;
+   for (h = s_h; h < (1 << nht->hash_shift); h++, s_idx = 0) {
for (n = rcu_dereference_bh(nht->hash_buckets[h]), idx = 0;
 n != NULL;
-n = rcu_dereference_bh(n->next)) {
-   if (!net_eq(dev_net(n->dev), net))
-   continue;
-   if (neigh_ifindex_filtered(n->dev, filter_idx))
+n = rcu_dereference_bh(n->next), idx++) {
+   if (idx < s_idx || !net_eq(dev_net(n->dev), net))
continue;
-   if (neigh_master_filtered(n->dev, filter_master_idx))
+   if (neigh_dump_filtered(n->dev, filter_idx,
+   filter_master_idx))
continue;
-   if (idx < s_idx)
-   goto next;
if (neigh_fill_info(skb, n, NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
RTM_NEWNEIGH,
@@ -2306,8 +2311,6 @@ static int neigh_dump_table(struct neigh_table *tbl, 
struct sk_buff *skb,
rc = -1;
goto out;
}
-next:
-   idx++;
}
}
rc = skb->len;
@@ -2328,14 +2331,10 @@ static int pneigh_dump_table(struct neigh_table *tbl, 
struct sk_buff *skb,
 
read_lock_bh(&tbl->lock);
 
-   for (h = s_h; h <= PNEIGH_HASHMASK; h++) {
-   if (h > s_h)
-   s_idx = 0;
-   for (n = tbl->phash_buckets[h], idx = 0; n; n = n->next) {
-   if (pneigh_net(n) != net)
+   for (h = s_h; h <= PNEIGH_HASHMASK; h++, s_idx = 0) {
+   for (n = tbl->phash_buckets[h], idx = 0; n; n = n->next, idx++) 
{
+   if (idx < s_idx || pneigh_net(n) != net)
continue;
-   if (idx < s_idx)
-   goto next;
if (pneigh_fill_info(skb, n, NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq,
RTM_NEWNEIGH,
@@ -2344,8 +2343,6 @@ static int pneigh_dump_table(struct neigh_table *tbl, 
struct sk_buff *skb,
rc = -1;
goto out;
}
-   next:
-   idx++;
}
}
 
-- 
1.8.3.1
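
A user-space sketch of the loop shape the patch above moves to, under stated
assumptions: the struct entry, struct dump_pos and dump_table names are
hypothetical and stand in for the neighbour hash table and the netlink dump
callback state. The entry index advances in the for clause, so a "continue"
for a filtered entry can never skip the increment, and the start index is
reset whenever the loop moves to the next bucket.

```c
#include <stddef.h>

/* Hypothetical stand-in for a neighbour entry; not the kernel struct. */
struct entry {
	int ifindex;
	struct entry *next;
};

#define NBUCKETS 4

/* Resume position carried between calls, like the saved dump state. */
struct dump_pos {
	int h;		/* bucket to resume in */
	int idx;	/* entry index inside that bucket */
};

/*
 * Copy up to 'room' matching ifindex values into out[], starting at *pos.
 * Returns the number written and updates *pos so that the next call
 * continues where this one stopped. A filter of 0 matches every entry.
 */
static int dump_table(struct entry *buckets[NBUCKETS], int filter_ifindex,
		      struct dump_pos *pos, int *out, int room)
{
	int s_h = pos->h, s_idx = pos->idx;
	int h, idx = 0, written = 0;
	struct entry *n;

	/* s_idx only applies to the first bucket visited: reset it per step. */
	for (h = s_h; h < NBUCKETS; h++, s_idx = 0) {
		/* idx advances in the for clause, so 'continue' cannot skip it. */
		for (n = buckets[h], idx = 0; n != NULL; n = n->next, idx++) {
			if (idx < s_idx)
				continue;
			if (filter_ifindex && n->ifindex != filter_ifindex)
				continue;
			if (written == room)
				goto out;	/* buffer full: remember h and idx */
			out[written++] = n->ifindex;
		}
	}
out:
	pos->h = h;
	pos->idx = idx;
	return written;
}
```

Calling dump_table() repeatedly with the same pos walks the whole table in
chunks of at most 'room' entries without visiting any entry twice.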





[PATCH] netdevice: fix sparse warning for HARD_TX_LOCK

2016-11-23 Thread Michael S. Tsirkin
sparse warns about context imbalance in any code
that uses HARD_TX_LOCK/UNLOCK - this is because it's
unable to determine that flags don't change so
lock and unlock are paired.

Seems easy enough to fix by adding __acquire/__release
calls.

With this patch af_packet.c is now sparse-clean.

Signed-off-by: Michael S. Tsirkin 
---

compile-tested only.

 include/linux/netdevice.h | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 91ee364..0a58a50 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3539,6 +3539,17 @@ static inline void __netif_tx_lock(struct netdev_queue 
*txq, int cpu)
txq->xmit_lock_owner = cpu;
 }
 
+static inline bool __netif_tx_acquire(struct netdev_queue *txq)
+{
+   __acquire(&txq->_xmit_lock);
+   return true;
+}
+
+static inline void __netif_tx_release(struct netdev_queue *txq)
+{
+   __release(&txq->_xmit_lock);
+}
+
 static inline void __netif_tx_lock_bh(struct netdev_queue *txq)
 {
spin_lock_bh(&txq->_xmit_lock);
@@ -3640,17 +3651,21 @@ static inline void netif_tx_unlock_bh(struct net_device 
*dev)
 #define HARD_TX_LOCK(dev, txq, cpu) {  \
if ((dev->features & NETIF_F_LLTX) == 0) {  \
__netif_tx_lock(txq, cpu);  \
+   } else {\
+   __netif_tx_acquire(txq);\
}   \
 }
 
 #define HARD_TX_TRYLOCK(dev, txq)  \
(((dev->features & NETIF_F_LLTX) == 0) ?\
__netif_tx_trylock(txq) :   \
-   true )
+   __netif_tx_acquire(txq))
 
 #define HARD_TX_UNLOCK(dev, txq) { \
if ((dev->features & NETIF_F_LLTX) == 0) {  \
__netif_tx_unlock(txq); \
+   } else {\
+   __netif_tx_release(txq);\
}   \
 }
 
-- 
MST
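
A stand-alone sketch of the annotation idea in the patch above, with
hypothetical names throughout (toy_acquire, toy_release, toy_queue,
toy_tx_lock, toy_tx_unlock). It assumes sparse's __context__() builtin is
available when __CHECKER__ is defined; in a normal build the annotations
expand to nothing. The point is that both branches of the conditional change
the tracked lock context by the same amount, so the checker sees lock and
unlock as paired even on the lockless path.

```c
/*
 * Modelled on how kernel-style annotations are usually defined: under
 * sparse the macros adjust the tracked context, otherwise they vanish.
 */
#ifdef __CHECKER__
# define toy_acquire(x)	__context__(x, 1)
# define toy_release(x)	__context__(x, -1)
#else
# define toy_acquire(x)	(void)0
# define toy_release(x)	(void)0
#endif

/* Hypothetical queue: 'lockless' plays the role of NETIF_F_LLTX. */
struct toy_queue {
	int lockless;
	int held;	/* stands in for the real spinlock state */
};

static void toy_tx_lock(struct toy_queue *q)
{
	if (!q->lockless) {
		q->held = 1;		/* real lock taken */
		toy_acquire(q);		/* a real lock primitive annotates itself */
	} else {
		toy_acquire(q);		/* annotation only, no runtime effect */
	}
}

static void toy_tx_unlock(struct toy_queue *q)
{
	if (!q->lockless) {
		q->held = 0;		/* real lock dropped */
		toy_release(q);
	} else {
		toy_release(q);		/* annotation only, no runtime effect */
	}
}
```

At run time the lockless queue is never marked as held; only the static
checker sees a difference.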


[PATCH net 1/1] tipc: improve sanity check for received domain records

2016-11-23 Thread Jon Maloy
In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we
added a data area to the link monitor STATE messages under the
assumption that previous versions did not use any such data area.

For versions older than Linux 4.3 this assumption is not correct. In
those versions, all STATE messages sent out from a node inadvertently
contain a 16 byte data area containing a string, a leftover from
previous RESET messages which were using this during the setup phase.
This string serves no purpose in STATE messages, and should not be there.

Unfortunately, this data area is delivered to the link monitor
framework, where a sanity check catches that it is not a correct domain
record, and drops it. It also issues a rate limited warning about the
event.

Since such events occur much more frequently than anticipated, we now
choose to remove the warning in order to not fill the kernel log with
useless contents. We also make the sanity check stricter, to further
reduce the risk that such data is inadvertently admitted.

Signed-off-by: Jon Maloy 
---
 net/tipc/monitor.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c
index ed97a58..9e109bb 100644
--- a/net/tipc/monitor.c
+++ b/net/tipc/monitor.c
@@ -455,14 +455,14 @@ void tipc_mon_rcv(struct net *net, void *data, u16 dlen, 
u32 addr,
int i, applied_bef;
 
state->probing = false;
-   if (!dlen)
-   return;
 
/* Sanity check received domain record */
-   if ((dlen < new_dlen) || ntohs(arrv_dom->len) != new_dlen) {
-   pr_warn_ratelimited("Received illegal domain record\n");
+   if (dlen < dom_rec_len(arrv_dom, 0))
+   return;
+   if (dlen != dom_rec_len(arrv_dom, new_member_cnt))
+   return;
+   if ((dlen < new_dlen) || ntohs(arrv_dom->len) != new_dlen)
return;
-   }
 
/* Synch generation numbers with peer if link just came up */
if (!state->synched) {
-- 
2.7.4
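
A simplified sketch of the kind of length check described above, not the
TIPC code itself. The record layout (toy_dom_rec), the helper toy_rec_len()
and the member limit are assumptions made for illustration, and byte-order
conversion is left out. A buffer is accepted only if it is long enough to
hold a header, its length matches exactly what the header's member count
implies, and the length field agrees; anything else is dropped silently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_MEMBERS 64

/* Hypothetical variable-length record: fixed header plus member array. */
struct toy_dom_rec {
	uint16_t len;		/* total record length claimed by the sender */
	uint16_t member_cnt;	/* number of valid entries in members[] */
	uint32_t members[TOY_MAX_MEMBERS];
};

/* Length of a record carrying 'member_cnt' members. */
static size_t toy_rec_len(uint16_t member_cnt)
{
	return offsetof(struct toy_dom_rec, members) +
	       (size_t)member_cnt * sizeof(uint32_t);
}

/* Returns 1 if the buffer looks like a well-formed record, else 0. */
static int toy_rec_is_sane(const void *data, size_t dlen)
{
	struct toy_dom_rec hdr;

	if (dlen < toy_rec_len(0))
		return 0;			/* too short for a header */
	memcpy(&hdr, data, toy_rec_len(0));	/* copy the header only */
	if (hdr.member_cnt > TOY_MAX_MEMBERS)
		return 0;
	if (dlen != toy_rec_len(hdr.member_cnt))
		return 0;			/* size must match member count */
	if (hdr.len != dlen)
		return 0;			/* length field must agree too */
	return 1;
}
```

A stray 16 byte string fails these checks because its first bytes do not
decode to a consistent length and member count.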



[PATCH 3/4] mac80211: Removed unused 'struct ieee80211_supported_band*' variable

2016-11-23 Thread Kirtika Ruchandani
Commit b1bce14a7954 (mac80211: update opmode when adding new station)
refactored ieee80211_vht_handle_opmode into __ieee80211_vht_handle_opmode
and ieee80211_vht_handle_opmode leaving a set but unused variable
(sband) in the former. Compiling with W=1 gives the following warning,
fix it.

net/mac80211/vht.c: In function ‘__ieee80211_vht_handle_opmode’:
net/mac80211/vht.c:424:35: warning: variable ‘sband’ set but not used 
[-Wunused-but-set-variable]

Remove 'struct ieee80211_local* local' as well, it was only used to
set sband.

This is a harmless warning, and is only being fixed to reduce the
noise with W=1 in the kernel.

Fixes: b1bce14a7954 ("mac80211: update opmode when adding new station")
Cc: Marek Kwaczynski 
Cc: Johannes Berg 
Signed-off-by: Kirtika Ruchandani 
---
 net/mac80211/vht.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/net/mac80211/vht.c b/net/mac80211/vht.c
index ee71576..14920e3 100644
--- a/net/mac80211/vht.c
+++ b/net/mac80211/vht.c
@@ -420,14 +420,10 @@ u32 __ieee80211_vht_handle_opmode(struct 
ieee80211_sub_if_data *sdata,
  struct sta_info *sta, u8 opmode,
  enum nl80211_band band)
 {
-   struct ieee80211_local *local = sdata->local;
-   struct ieee80211_supported_band *sband;
enum ieee80211_sta_rx_bandwidth new_bw;
u32 changed = 0;
u8 nss;
 
-   sband = local->hw.wiphy->bands[band];
-
/* ignore - no support for BF yet */
if (opmode & IEEE80211_OPMODE_NOTIF_RX_NSS_TYPE_BF)
return 0;
-- 
2.8.0.rc3.226.g39d4020



[PATCH 4/4] mac80211: Remove unused 'beaconint_us' variable

2016-11-23 Thread Kirtika Ruchandani
Commit 4a733ef1bea7 (mac80211: remove PM-QoS listener) removed all use
of 'beaconint_us' from ieee80211_recalc_ps() but left the variable
intact. Compiling with W=1 gives the following warning, fix it.
net/mac80211/mlme.c: In function ‘ieee80211_recalc_ps’:
net/mac80211/mlme.c:1481:7: warning: variable ‘beaconint_us’ set but not used 
[-Wunused-but-set-variable]

ieee80211_tu_to_usec has no side-effects and is safe to remove.

Fixes: 4a733ef1bea7 ("mac80211: remove PM-QoS listener")
Cc: Johannes Berg 
Signed-off-by: Kirtika Ruchandani 
---
 net/mac80211/mlme.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 7486f2d..e883345 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -1478,10 +1478,6 @@ void ieee80211_recalc_ps(struct ieee80211_local *local)
 
if (count == 1 && ieee80211_powersave_allowed(found)) {
u8 dtimper = found->u.mgd.dtim_period;
-   s32 beaconint_us;
-
-   beaconint_us = ieee80211_tu_to_usec(
-   found->vif.bss_conf.beacon_int);
 
timeout = local->dynamic_ps_forced_timeout;
if (timeout < 0)
-- 
2.8.0.rc3.226.g39d4020



[PATCH 1/4] mac80211: Removed unused 'i' variable

2016-11-23 Thread Kirtika Ruchandani
Commit 5bcae31d9 (mac80211: implement multi-vif in-place reservations)
introduced ieee80211_vif_use_reserved_switch() with a counter variable
'i' that is set but not used. Compiling with W=1 gives the following
warning, fix it.
net/mac80211/chan.c: In function ‘ieee80211_vif_use_reserved_switch’:
net/mac80211/chan.c:1273:6: warning: variable ‘i’ set but not used 
[-Wunused-but-set-variable]

This is a harmless warning, and is only being fixed to reduce the
noise obtained with W=1 in the kernel.

Fixes: 5bcae31d9 ("mac80211: implement multi-vif in-place reservations")
Cc: Michal Kazior 
Cc: Johannes Berg 
Signed-off-by: Kirtika Ruchandani 
---
 net/mac80211/chan.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/mac80211/chan.c b/net/mac80211/chan.c
index e75cbf6..7550fd2 100644
--- a/net/mac80211/chan.c
+++ b/net/mac80211/chan.c
@@ -1270,7 +1270,7 @@ static int ieee80211_vif_use_reserved_switch(struct 
ieee80211_local *local)
struct ieee80211_sub_if_data *sdata, *sdata_tmp;
struct ieee80211_chanctx *ctx, *ctx_tmp, *old_ctx;
struct ieee80211_chanctx *new_ctx = NULL;
-   int i, err, n_assigned, n_reserved, n_ready;
+   int err, n_assigned, n_reserved, n_ready;
int n_ctx = 0, n_vifs_switch = 0, n_vifs_assign = 0, n_vifs_ctxless = 0;
 
lockdep_assert_held(&local->mtx);
@@ -1391,8 +1391,6 @@ static int ieee80211_vif_use_reserved_switch(struct 
ieee80211_local *local)
 * Update all structures, values and pointers to point to new channel
 * context(s).
 */
-
-   i = 0;
list_for_each_entry(ctx, &local->chanctx_list, list) {
if (ctx->replace_state != IEEE80211_CHANCTX_REPLACES_OTHER)
continue;
-- 
2.8.0.rc3.226.g39d4020



[PATCH 2/4] mac80211: Remove unused 'len' variable

2016-11-23 Thread Kirtika Ruchandani
Commit 633e27132625 (mac80211: split sched scan IEs) introduced the
len variable to keep track of the return value of
ieee80211_build_preq_ies() but did not use it. Compiling with W=1
gives the following warning, fix it.

net/mac80211/scan.c: In function ‘__ieee80211_request_sched_scan_start’:
net/mac80211/scan.c:1123:9: warning: variable ‘len’ set but not used 
[-Wunused-but-set-variable]

This is a harmless warning and is only being fixed to reduce the noise
with W=1 in the kernel.

Fixes: 633e27132625 ("mac80211: split sched scan IEs")
Cc: David Spinadel 
Cc: Alexander Bondar 
Cc: Johannes Berg 
Signed-off-by: Kirtika Ruchandani 
---
 net/mac80211/scan.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
index 23d8ac8..faab3c4 100644
--- a/net/mac80211/scan.c
+++ b/net/mac80211/scan.c
@@ -1120,7 +1120,6 @@ int __ieee80211_request_sched_scan_start(struct 
ieee80211_sub_if_data *sdata,
u32 rate_masks[NUM_NL80211_BANDS] = {};
u8 bands_used = 0;
u8 *ie;
-   size_t len;
 
iebufsz = local->scan_ies_len + req->ie_len;
 
@@ -1145,10 +1144,9 @@ int __ieee80211_request_sched_scan_start(struct 
ieee80211_sub_if_data *sdata,
 
ieee80211_prepare_scan_chandef(&chandef, req->scan_width);
 
-   len = ieee80211_build_preq_ies(local, ie, num_bands * iebufsz,
-  &sched_scan_ies, req->ie,
-  req->ie_len, bands_used,
-  rate_masks, &chandef);
+   ieee80211_build_preq_ies(local, ie, num_bands * iebufsz,
+&sched_scan_ies, req->ie,
+req->ie_len, bands_used, rate_masks, &chandef);
 
ret = drv_sched_scan_start(local, sdata, req, &sched_scan_ies);
if (ret == 0) {
-- 
2.8.0.rc3.226.g39d4020



[PATCH 0/4] Fix -Wunused-but-set-variable in net/mac80211/

2016-11-23 Thread Kirtika Ruchandani
This patchset is part of the effort led by Arnd Bergmann to clean up
warnings in the kernel. This and following patchsets will focus on
"-Wunused-but-set-variable" as it is among the noisier ones. These were
found compiling with W=1.

Kirtika Ruchandani (4):
  mac80211: Removed unused 'i' variable
  mac80211: Remove unused 'len' variable
  mac80211: Removed unused 'struct ieee80211_supported_band*' variable
  mac80211: Remove unused 'beaconint_us' variable

 net/mac80211/chan.c | 4 +---
 net/mac80211/mlme.c | 4 
 net/mac80211/scan.c | 8 +++-
 net/mac80211/vht.c  | 4 
 4 files changed, 4 insertions(+), 16 deletions(-)

-- 
2.8.0.rc3.226.g39d4020
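
For readers unfamiliar with the warning class this series targets, here is
a generic sketch. All names (toy_len, toy_check) are made up for
illustration and are not taken from mac80211. The "before" form is shown
only in a comment so that the block itself compiles without warnings.

```c
/* Stand-in for a helper that has no side effects. */
static unsigned long toy_len(int which)
{
	return which == 0 ? 256UL : 1024UL;
}

/*
 * Before the cleanup the function also contained something like:
 *
 *	unsigned long unused;
 *	unused = toy_len(2);	(assigned, never read)
 *
 * which is what -Wunused-but-set-variable reports. Dropping both the
 * declaration and the assignment leaves the behaviour unchanged as long
 * as the called helper has no side effects.
 */
static int toy_check(void)
{
	unsigned long a = toy_len(0);
	unsigned long b = toy_len(1);

	if (a < 256 || b < 1024)
		return -1;
	return 0;
}
```

When the removed assignment calls a function, the patch descriptions above
note separately whether that call is free of side effects, since only then
is the removal purely cosmetic.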



linux-next: manual merge of the staging tree with the net-next tree

2016-11-23 Thread Stephen Rothwell
Hi Greg,

Today's linux-next merge of the staging tree got a conflict in:

  drivers/staging/unisys/include/iochannel.h

between commit:

  d0c2c9973ecd ("net: use core MTU range checking in virt drivers")

from the net-next tree and commit:

  b18f9c676f93 ("staging: unisys: include: fix pound defines")

from the staging tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/staging/unisys/include/iochannel.h
index 9081b3f8779c,c43da782f37e..
--- a/drivers/staging/unisys/include/iochannel.h
+++ b/drivers/staging/unisys/include/iochannel.h
@@@ -113,10 -117,12 +117,10 @@@ enum net_types 
  
  };
  
- #define   ETH_MIN_DATA_SIZE 46/* minimum eth data size */
- #define   ETH_MIN_PACKET_SIZE (ETH_HLEN + ETH_MIN_DATA_SIZE)
 -#define ETH_HEADER_SIZE 14/* size of ethernet header */
 -
+ #define ETH_MIN_DATA_SIZE 46  /* minimum eth data size */
 -#define ETH_MIN_PACKET_SIZE (ETH_HEADER_SIZE + ETH_MIN_DATA_SIZE)
++#define ETH_MIN_PACKET_SIZE (ETH_HLEN + ETH_MIN_DATA_SIZE)
  
- #define   VISOR_ETH_MAX_MTU 16384 /* maximum data size */
 -#define ETH_MAX_MTU 16384 /* maximum data size */
++#define VISOR_ETH_MAX_MTU 16384   /* maximum data size */
  
  #ifndef MAX_MACADDR_LEN
  #define MAX_MACADDR_LEN 6 /* number of bytes in MAC address */
@@@ -286,9 -304,9 +302,9 @@@ struct net_pkt_xmt 
int len;/* full length of data in the packet */
int num_frags;  /* number of fragments in frags containing data */
struct phys_info frags[MAX_PHYS_INFO];  /* physical page information */
 -  char ethhdr[ETH_HEADER_SIZE];   /* the ethernet header  */
 +  char ethhdr[ETH_HLEN];  /* the ethernet header  */
struct {
-   /* these are needed for csum at uisnic end */
+   /* These are needed for csum at uisnic end */
u8 valid;   /* 1 = struct is valid - else ignore */
u8 hrawoffv;/* 1 = hwrafoff is valid */
u8 nhrawoffv;   /* 1 = nhwrafoff is valid */
@@@ -321,29 -341,41 +339,41 @@@ struct net_pkt_xmtdone 
   */
  #define RCVPOST_BUF_SIZE 4032
  #define MAX_NET_RCV_CHAIN \
 -  ((ETH_MAX_MTU + ETH_HEADER_SIZE + RCVPOST_BUF_SIZE - 1) \
 +  ((VISOR_ETH_MAX_MTU + ETH_HLEN + RCVPOST_BUF_SIZE - 1) \
/ RCVPOST_BUF_SIZE)
  
+ /*
+  * rcv buf size must be large enough to include ethernet data len + ethernet
+  * header len - we are choosing 2K because it is guaranteed to be describable.
+  */
  struct net_pkt_rcvpost {
-   /* rcv buf size must be large enough to include ethernet data len +
-* ethernet header len - we are choosing 2K because it is guaranteed
-* to be describable
-*/
-   struct phys_info frag;  /* physical page information for the */
-   /* single fragment 2K rcv buf */
-   u64 unique_num;
-   /* unique_num ensure that receive posts are returned to */
-   /* the Adapter which we sent them originally. */
+   /* Physical page information for the single fragment 2K rcv buf */
+   struct phys_info frag;
+ 
+   /*
+* Ensures that receive posts are returned to the adapter which we sent
+* them from originally.
+*/
+   u64 unique_num;
+ 
  } __packed;
  
+ /*
+  * The number of rcvbuf that can be chained is based on max mtu and size of 
each
+  * rcvbuf.
+  */
  struct net_pkt_rcv {
-   /* the number of receive buffers that can be chained  */
-   /* is based on max mtu and size of each rcv buf */
-   u32 rcv_done_len;   /* length of received data */
-   u8 numrcvbufs;  /* number of receive buffers that contain the */
-   /* incoming data; guest end MUST chain these together. */
-   void *rcvbuf[MAX_NET_RCV_CHAIN];/* list of chained rcvbufs */
-   /* each entry is a receive buffer provided by NET_RCV_POST. */
+   u32 rcv_done_len; /* length of received data */
+ 
+   /*
+* numrcvbufs: contain the incoming data; guest side MUST chain these
+* together.
+*/
+   u8 numrcvbufs;
+ 
+   void *rcvbuf[MAX_NET_RCV_CHAIN]; /* list of chained rcvbufs */
+ 
+   /* Each entry is a receive buffer provided by NET_RCV_POST. */
/* NOTE: first rcvbuf in the chain will also be provided in net.buf. */
u64 unique_num;
u32 rcvs_dropped_delta;


RE: [PATCH net 1/2] r8152: fix the sw rx checksum is unavailable

2016-11-23 Thread Hayes Wang
Mark Lord [mailto:ml...@pobox.com]
> Sent: Thursday, November 24, 2016 3:30 AM
[...]
> Worth repeating: other dongles we have tried, eg. those using the asix driver,
> do not cause us any troubles here.  Only the r8152 dongles do.

I can't tell you why you see the problem. I have tested the
RTL8152 on a Raspberry Pi platform with iperf for more than 17 hours, and
I didn't see any invalid rx descriptor. I don't think it is really an
issue with our hw.

Best Regards,
Hayes



Re: [RFC net-next 1/3] net: bridge: Allow bridge master device to configure switch CPU port

2016-11-23 Thread Toshiaki Makita
On 2016/11/23 0:46, Vivien Didelot wrote:
> Hi Florian,
> 
> Florian Fainelli  writes:
> 
>> bridge vlan add vid 2 dev br0 self
>>  -> CPU port gets programmed
>> bridge vlan add vid 2 dev port0
>>  -> port0 (switch port 0) gets programmed
> 
> Although this is not specific to this patch, I'd like to point out that
> this seems not to be the behavior bridge expects.
> 
> The bridge manpage says:
> 
> bridge vlan add - add a new vlan filter entry
> ...
> 
>self   the vlan is configured on the specified physical device.
>   Required if the device is the bridge device.
> 
>master the vlan is configured on the software bridge (default).
> 
> So if I'm not mistaken, the switch chip must be programmed only when the
> bridge command is called with the "self" attribute. Without it, only
> software configuration must be made, like what happens when the driver
> returns -EOPNOTSUPP.
> 
> Currently, both commands below program the hardware:
> 
> # bridge vlan add vid 2 dev port0 [master]
> # bridge vlan add vid 2 dev port0 [master] self

Actually this is intended behavior, which keeps backward compatibility.
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7f1095394918c7058ff81c96c3bab3a897e97a9d

Thanks,
Toshiaki Makita




[scr265482] ip_tunnel.c

2016-11-23 Thread 于立洋1
Hi:
I found that the GRE tunnel can, in some cases, cause an integer overflow in
ip_tunnel.c:397

Cause of the problem:
When tpi->seq is less than tunnel->i_seqno, the packet is dropped.

How to reproduce the problem:
1. Create a tunnel using the kernel GRE module.
2. Use the tunnel to send packets for a while.
3. Reboot one site of the tunnel.
4. Communication is interrupted.


	if (tunnel->parms.i_flags&TUNNEL_SEQ) {
		if (!(tpi->flags&TUNNEL_SEQ) ||
		    (tunnel->i_seqno &&
		     (s32)(ntohl(tpi->seq) - tunnel->i_seqno) < 0)) { /* Here is the trouble code */
			tunnel->dev->stats.rx_fifo_errors++;
			tunnel->dev->stats.rx_errors++;
			goto drop;
		}
		tunnel->i_seqno = ntohl(tpi->seq) + 1;
	}

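The comparison in the excerpt above can be tried in isolation. This is a
user-space sketch with a hypothetical function name (toy_seq_dropped); it
assumes the usual two's complement conversion when the unsigned difference
is cast to a signed 32-bit value, as the kernel's (s32) cast does.

```c
#include <stdint.h>

/*
 * Returns 1 when a packet carrying 'seq' would be dropped by a receiver
 * whose stored value is 'i_seqno'. An i_seqno of 0 disables the check.
 */
static int toy_seq_dropped(uint32_t seq, uint32_t i_seqno)
{
	/* The signed difference is negative when seq is "behind" i_seqno. */
	return i_seqno && (int32_t)(seq - i_seqno) < 0;
}
```

With a stored i_seqno of 1001, a peer that restarts and sends seq 1 is
dropped, while a normal wrap from 0xFFFFFFF0 to 5 is still accepted because
the signed difference stays positive.
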

> Integer Overflow in ip_tunnel.c in Ubuntu Linux kernel GRE ALL kernel 
> version allows attacker to Denial of Service via reboot one end of the 
> tunnel

Could you please clarify whether this affects only Ubuntu, or potentially 
affects other Linux distributions? ip_tunnel.c is present in the Linux kernel 
in all distributions and is maintained at:

  
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/net/ipv4/ip_tunnel.c

You should provide your evidence of an integer overflow, such as source code or 
crash tracing.

If you are reporting an Ubuntu issue, please see:

  https://wiki.ubuntu.com/SecurityTeam/FAQ#Contact

about how to file a Private Security bug in Launchpad.

If you are reporting an issue affecting the Linux kernel in general, please 
contact:

  secur...@kernel.org

You can also include:

  netdev@vger.kernel.org

if the report is public. If you need to subscribe, see:

  http://vger.kernel.org/vger-lists.html#netdev

- --
CVE Assignment Team
M/S M300, 202 Burlington Road, Bedford, MA 01730 USA
[ A PGP key is available for encrypted communications at
  http://cve.mitre.org/cve/request_id.html ]


Re: [PATCH 12/20] net/iucv: Convert to hotplug state machine

2016-11-23 Thread Ursula Braun
Sebastian,

your patch looks good to me. I successfully ran some small tests with it.
I want to suggest a small change in iucv_init() to keep the uniform technique
of undo labels below. Do you agree?

Kind regards, Ursula

On 11/17/2016 07:35 PM, Sebastian Andrzej Siewior wrote:
> Install the callbacks via the state machine and let the core invoke the
> callbacks on the already online CPUs. The smp function calls in the
> online/downprep callbacks are not required as the callback is guaranteed to
> be invoked on the upcoming/outgoing cpu.
> 
> Cc: Ursula Braun 
> Cc: "David S. Miller" 
> Cc: linux-s...@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: Sebastian Andrzej Siewior 
> ---
>  include/linux/cpuhotplug.h |   1 +
>  net/iucv/iucv.c| 118 
> +
>  2 files changed, 45 insertions(+), 74 deletions(-)
> 
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index fd5598b8353a..69abf2c09f6c 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -63,6 +63,7 @@ enum cpuhp_state {
>   CPUHP_X86_THERM_PREPARE,
>   CPUHP_X86_CPUID_PREPARE,
>   CPUHP_X86_MSR_PREPARE,
> + CPUHP_NET_IUCV_PREPARE,
>   CPUHP_TIMERS_DEAD,
>   CPUHP_NOTF_ERR_INJ_PREPARE,
>   CPUHP_MIPS_SOC_PREPARE,
> diff --git a/net/iucv/iucv.c b/net/iucv/iucv.c
> index 88a2a3ba4212..f0d6afc5d4a9 100644
> --- a/net/iucv/iucv.c
> +++ b/net/iucv/iucv.c
> @@ -639,7 +639,7 @@ static void iucv_disable(void)
>   put_online_cpus();
>  }
>  
> -static void free_iucv_data(int cpu)
> +static int iucv_cpu_dead(unsigned int cpu)
>  {
>   kfree(iucv_param_irq[cpu]);
>   iucv_param_irq[cpu] = NULL;
> @@ -647,9 +647,10 @@ static void free_iucv_data(int cpu)
>   iucv_param[cpu] = NULL;
>   kfree(iucv_irq_data[cpu]);
>   iucv_irq_data[cpu] = NULL;
> + return 0;
>  }
>  
> -static int alloc_iucv_data(int cpu)
> +static int iucv_cpu_prepare(unsigned int cpu)
>  {
>   /* Note: GFP_DMA used to get memory below 2G */
>   iucv_irq_data[cpu] = kmalloc_node(sizeof(struct iucv_irq_data),
> @@ -671,58 +672,38 @@ static int alloc_iucv_data(int cpu)
>   return 0;
>  
>  out_free:
> - free_iucv_data(cpu);
> + iucv_cpu_dead(cpu);
>   return -ENOMEM;
>  }
>  
> -static int iucv_cpu_notify(struct notifier_block *self,
> -  unsigned long action, void *hcpu)
> +static int iucv_cpu_online(unsigned int cpu)
>  {
> - cpumask_t cpumask;
> - long cpu = (long) hcpu;
> -
> - switch (action) {
> - case CPU_UP_PREPARE:
> - case CPU_UP_PREPARE_FROZEN:
> - if (alloc_iucv_data(cpu))
> - return notifier_from_errno(-ENOMEM);
> - break;
> - case CPU_UP_CANCELED:
> - case CPU_UP_CANCELED_FROZEN:
> - case CPU_DEAD:
> - case CPU_DEAD_FROZEN:
> - free_iucv_data(cpu);
> - break;
> - case CPU_ONLINE:
> - case CPU_ONLINE_FROZEN:
> - case CPU_DOWN_FAILED:
> - case CPU_DOWN_FAILED_FROZEN:
> - if (!iucv_path_table)
> - break;
> - smp_call_function_single(cpu, iucv_declare_cpu, NULL, 1);
> - break;
> - case CPU_DOWN_PREPARE:
> - case CPU_DOWN_PREPARE_FROZEN:
> - if (!iucv_path_table)
> - break;
> - cpumask_copy(&cpumask, &iucv_buffer_cpumask);
> - cpumask_clear_cpu(cpu, &cpumask);
> - if (cpumask_empty(&cpumask))
> - /* Can't offline last IUCV enabled cpu. */
> - return notifier_from_errno(-EINVAL);
> - smp_call_function_single(cpu, iucv_retrieve_cpu, NULL, 1);
> - if (cpumask_empty(&iucv_irq_cpumask))
> - smp_call_function_single(
> - cpumask_first(&iucv_buffer_cpumask),
> - iucv_allow_cpu, NULL, 1);
> - break;
> - }
> - return NOTIFY_OK;
> + if (!iucv_path_table)
> + return 0;
> + iucv_declare_cpu(NULL);
> + return 0;
>  }
>  
> -static struct notifier_block __refdata iucv_cpu_notifier = {
> - .notifier_call = iucv_cpu_notify,
> -};
> +static int iucv_cpu_down_prep(unsigned int cpu)
> +{
> + cpumask_t cpumask;
> +
> + if (!iucv_path_table)
> + return 0;
> +
> + cpumask_copy(&cpumask, &iucv_buffer_cpumask);
> + cpumask_clear_cpu(cpu, &cpumask);
> + if (cpumask_empty(&cpumask))
> + /* Can't offline last IUCV enabled cpu. */
> + return -EINVAL;
> +
> + iucv_retrieve_cpu(NULL);
> + if (!cpumask_empty(&iucv_irq_cpumask))
> + return 0;
> + smp_call_function_single(cpumask_first(&iucv_buffer_cpumask),
> +  iucv_allow_cpu, NULL, 1);
> + return 0;
> +}
>  
>  /**
>   * iucv_sever_pathid
> @@ -2027,6 +2008,7 @@ struct iucv_interface iucv_if = {
>  };
>  E

Re: [PATCH net-next 1/1] ipv6: sr: add option to control lwtunnel support

2016-11-23 Thread Alexei Starovoitov
On Wed, Nov 23, 2016 at 10:28:29AM +0100, David Lebrun wrote:
> On 11/23/2016 08:34 AM, Roopa Prabhu wrote:
> > I can't seem to reproduce the problem you are seeing. still trying..
> > I don't have CONFIG_LWTUNNEL set nor any of the other SEG6 configs.
> > My CONFIG_IPV6 is on and compiled as a module. I have also tried disabling it.
> > If you can send me the config, I can try again. Looking back at the patches,
> > I do see a few things below ..but they may not fix your problem directly.
> > 
> > Though I had none of the ipv6 segment routing configs turned on,
> > I do see the "Segment Routing with IPv6" msg at bootup.
> > Was looking at david's patches again, and a few things (I had missed seeing the last version):
> > 
> > In my review comment I was hinting at CONFIG_IPV6_SEG6 to cover all of ipv6 segment routing,
> > including the lwtunnel bits.
> > 
> > something like below:
> > 
> > config IPV6_SEG6
> > bool "IPv6: Segment Routing Header encapsulation support"
> > depends on LWTUNNEL && IPV6
> > 
> > DavidL, do you see a problem doing it this way ?. with this 'seg6.o' will be part of CONFIG_IPV6_SEG6 and not
> > get initialized unless it is enabled..which seems like the right thing to do.
> 
> Can't reproduce the bug either, with CONFIG_IPV6=y, LWTUNNEL=n and all
> SEG6 disabled. Alexei, your .config and dmesg log could help.

I didn't save that .config and did a bisect of the other bug that
messed up my .config. Now I cannot reproduce it. Sorry for the noise.
Still weird though that ping prefers the ipv6 address now.
$ ping localhost
PING localhost(localhost.localdomain (::1)) 56 data bytes
64 bytes from localhost.localdomain (::1): icmp_seq=1 ttl=64 time=0.043 ms



[PATCH net 1/1] tipc: fix compatibility bug in link monitoring

2016-11-23 Thread Jon Maloy
commit 817298102b0b ("tipc: fix link priority propagation") introduced a
compatibility problem between TIPC versions newer than Linux 4.6 and
those older than Linux 4.4. In versions later than 4.4, link STATE
messages only contain a non-zero link priority value when the sender
wants the receiver to change its priority. This has the effect that the
receiver resets itself in order to apply the new priority. This works
well, and is consistent with the said commit.

However, in versions older than 4.4 a valid link priority is present in
all sent link STATE messages, leading to cyclic link establishment and
reset on the 4.6+ node.

We fix this by adding a test that the received value should not only
be valid, but also differ from the current value in order to cause the
receiving link endpoint to reset.

Reported-by: Amar Nv 
Signed-off-by: Jon Maloy 
---
 net/tipc/link.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 1055164..ecc12411 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1492,8 +1492,9 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
if (in_range(peers_tol, TIPC_MIN_LINK_TOL, TIPC_MAX_LINK_TOL))
l->tolerance = peers_tol;
 
-   if (peers_prio && in_range(peers_prio, TIPC_MIN_LINK_PRI,
-  TIPC_MAX_LINK_PRI)) {
+   /* Update own prio if peer indicates a different value */
+   if ((peers_prio != l->priority) &&
+   in_range(peers_prio, 1, TIPC_MAX_LINK_PRI)) {
l->priority = peers_prio;
rc = tipc_link_fsm_evt(l, LINK_FAILURE_EVT);
}
-- 
2.7.4
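
The condition described in the commit message above can be sketched in user space as follows. This is only an illustration, not the TIPC code: the `sketch_` names and the numeric bounds are assumptions made for the example, and the real limits come from the TIPC headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bounds only; the real constants live in the TIPC headers. */
#define SKETCH_MIN_VALID_PRI 1u
#define SKETCH_MAX_LINK_PRI  31u

/* Returns true when the receiving endpoint should adopt the peer's
 * priority and reset: the value must be valid AND differ from our own.
 */
bool sketch_prio_change_needed(uint32_t own_prio, uint32_t peers_prio)
{
	if (peers_prio == own_prio)
		return false;
	return peers_prio >= SKETCH_MIN_VALID_PRI &&
	       peers_prio <= SKETCH_MAX_LINK_PRI;
}

/* Usage example with a few representative inputs. */
void sketch_prio_examples(void)
{
	assert(!sketch_prio_change_needed(10, 10)); /* old peer repeating our value */
	assert(!sketch_prio_change_needed(10, 0));  /* zero means "no change wanted" */
	assert(sketch_prio_change_needed(10, 12));  /* genuine change request */
	assert(!sketch_prio_change_needed(10, 32)); /* out of range, ignored */
}
```

With only the range check, the first case (an older peer that always sends a valid priority) would trigger a reset on every STATE message, which matches the cyclic reset reported above.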



[Patch net-next] net_sched: move the empty tp check from ->destroy() to ->delete()

2016-11-23 Thread Cong Wang
Roi reported we could have a race condition where in ->classify() path
we dereference tp->root and meanwhile a parallel ->destroy() makes it
a NULL.

This is possible because ->destroy() could be called when deleting
a filter to check if we are the last one in tp, this tp is still
linked and visible at that time.

The root cause of this problem is the semantic of ->destroy(), it
does two things (for non-force case):

1) check if tp is empty
2) if tp is empty we could really destroy it

and its caller, if it cares, needs to check its return value to see if
it is really destroyed. Therefore we can't unlink tp unless we know
it is empty.

As suggested by Daniel, we could actually move the test logic to ->delete()
so that we can safely unlink tp after ->delete() tells us the last one is
just deleted and before ->destroy().

What's more, even if we unlink it before ->destroy(), it could still have
readers since we don't wait for a grace period here, so we should not modify
tp->root in ->destroy() either.

Fixes: 1e052be69d04 ("net_sched: destroy proto tp when all filters are gone")
Reported-by: Roi Dayan 
Cc: Daniel Borkmann 
Cc: John Fastabend 
Signed-off-by: Cong Wang 
---
 include/net/sch_generic.h |  6 ++--
 net/sched/cls_api.c   | 18 +++-
 net/sched/cls_basic.c | 11 +++-
 net/sched/cls_bpf.c   | 11 +++-
 net/sched/cls_cgroup.c| 12 ++--
 net/sched/cls_flow.c  | 11 +++-
 net/sched/cls_flower.c| 10 ++-
 net/sched/cls_fw.c| 30 +++-
 net/sched/cls_matchall.c  | 10 ++-
 net/sched/cls_route.c | 30 ++--
 net/sched/cls_rsvp.h  | 34 +++
 net/sched/cls_tcindex.c   | 15 +-
 net/sched/cls_u32.c   | 71 +++
 net/sched/sch_api.c   | 14 --
 14 files changed, 137 insertions(+), 146 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index e6aa0a2..27cd1bd 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -203,14 +203,14 @@ struct tcf_proto_ops {
const struct tcf_proto *,
struct tcf_result *);
int (*init)(struct tcf_proto*);
-   bool(*destroy)(struct tcf_proto*, bool);
+   void(*destroy)(struct tcf_proto*);
 
unsigned long   (*get)(struct tcf_proto*, u32 handle);
int (*change)(struct net *net, struct sk_buff *,
struct tcf_proto*, unsigned long,
u32 handle, struct nlattr **,
unsigned long *, bool);
-   int (*delete)(struct tcf_proto*, unsigned long);
+   int (*delete)(struct tcf_proto*, unsigned long, bool*);
void(*walk)(struct tcf_proto*, struct tcf_walker *arg);
 
/* rtnetlink specific */
@@ -405,7 +405,7 @@ struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
const struct Qdisc_ops *ops, u32 parentid);
 void __qdisc_calculate_pkt_len(struct sk_buff *skb,
   const struct qdisc_size_table *stab);
-bool tcf_destroy(struct tcf_proto *tp, bool force);
+void tcf_destroy(struct tcf_proto *tp);
 void tcf_destroy_chain(struct tcf_proto __rcu **fl);
 int skb_do_redirect(struct sk_buff *);
 
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8e93d4a..f159aeb 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -321,7 +321,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n)
 
tfilter_notify(net, skb, n, tp, fh,
   RTM_DELTFILTER, false);
-   tcf_destroy(tp, true);
+   tcf_destroy(tp);
err = 0;
goto errout;
}
@@ -331,25 +331,29 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n)
!(n->nlmsg_flags & NLM_F_CREATE))
goto errout;
} else {
+   bool last;
+
switch (n->nlmsg_type) {
case RTM_NEWTFILTER:
err = -EEXIST;
if (n->nlmsg_flags & NLM_F_EXCL) {
if (tp_created)
-   tcf_destroy(tp, true);
+   tcf_destroy(tp);
goto errout;
}
break;
case RTM_DELTFILTER:
-   err = tp->ops->delete(tp, fh);
+   err = tp->ops->delete(tp, fh, &last);
if (err == 0) {
-   struct tcf_proto *next = 
rtnl_derefe
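
The division of labour proposed in the commit message (->delete() reports whether the last filter is gone, the caller unlinks, and ->destroy() only tears down) can be modelled in user space roughly as below. It is a sketch under stated assumptions: `struct sketch_tp` and all `sketch_` functions are hypothetical stand-ins, and the RCU-deferred freeing a real classifier would need is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct tcf_proto. */
struct sketch_tp {
	int nfilters;    /* filters still attached to this tp */
	bool linked;     /* still reachable from the classifier chain */
	bool destroyed;  /* ->destroy() has run */
};

/* Models ->delete(): removes one filter and reports whether it was the last. */
int sketch_delete(struct sketch_tp *tp, bool *last)
{
	if (tp->nfilters == 0)
		return -ENOENT;
	tp->nfilters--;
	*last = (tp->nfilters == 0);
	return 0;
}

/* Models ->destroy(): unconditional, no emptiness check, no return value. */
void sketch_destroy(struct sketch_tp *tp)
{
	tp->destroyed = true;
}

/* Caller: unlink only once ->delete() says tp is empty, then destroy. */
int sketch_del_filter(struct sketch_tp *tp)
{
	bool last = false;
	int err = sketch_delete(tp, &last);

	if (err == 0 && last) {
		tp->linked = false;
		sketch_destroy(tp);
	}
	return err;
}

/* Usage example: two filters, three delete requests. */
void sketch_del_filter_examples(void)
{
	struct sketch_tp tp = { .nfilters = 2, .linked = true, .destroyed = false };

	assert(sketch_del_filter(&tp) == 0 && tp.linked && !tp.destroyed);
	assert(sketch_del_filter(&tp) == 0 && !tp.linked && tp.destroyed);
	assert(sketch_del_filter(&tp) == -ENOENT);
}
```

The point of the ordering is that tp is never probed for emptiness through ->destroy() while it is still linked and visible to ->classify().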

Re: [PATCH net] net/mlx4_en: Free netdev resources under state lock

2016-11-23 Thread David Miller
From: Tariq Toukan 
Date: Tue, 22 Nov 2016 16:20:39 +0200

> Make sure mlx4_en_free_resources is called under the netdev state lock.
> This is needed since RCU dereference of XDP prog should be protected.
> 
> Fixes: 326fe02d1ed6 ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock")
> Signed-off-by: Tariq Toukan 
> Reported-by: Sagi Grimberg 

Applied.


Re: [Patch net] net: revert "net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit"

2016-11-23 Thread David Miller
From: Cong Wang 
Date: Mon, 21 Nov 2016 23:24:43 -0800

> This reverts commit 7c6ae610a1f0, because l2tp_xmit_skb() never
> returns NET_XMIT_CN, it ignores the return value of l2tp_xmit_core().
> 
> Cc: Gao Feng 
> Signed-off-by: Cong Wang 

Applied.


Re: [PATCH net 1/1] driver: macvlan: Check if need rollback multicast setting in macvlan_open

2016-11-23 Thread David Miller
From: f...@ikuai8.com
Date: Tue, 22 Nov 2016 09:54:36 +0800

> From: Gao Feng 
> 
> When dev_set_promiscuity fails in macvlan_open, the error handler always
> invokes dev_set_allmulti without checking whether that is necessary.
> Now check the IFF_ALLMULTI flag first before rolling back the multicast
> setting in the error handler.
> 
> Signed-off-by: Gao Feng 

Applied.
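
A small user-space model of the rollback rule in the quoted commit message: undo the allmulti reference on the lower device only if the open path actually took one. The struct, the flag value and every `sketch_` name are assumptions for illustration and are not the macvlan code.

```c
#include <assert.h>

#define SKETCH_IFF_ALLMULTI 0x200u  /* illustrative flag bit */

/* Hypothetical model: flags of the upper device and the allmulti
 * reference count it holds on the lower device.
 */
struct sketch_dev {
	unsigned int flags;
	int lower_allmulti;
};

/* Open path: promisc_err simulates the result of the promiscuity call. */
int sketch_open(struct sketch_dev *dev, int promisc_err)
{
	if (dev->flags & SKETCH_IFF_ALLMULTI)
		dev->lower_allmulti++;

	if (promisc_err) {
		/* Roll back only what was actually done above. */
		if (dev->flags & SKETCH_IFF_ALLMULTI)
			dev->lower_allmulti--;
		return promisc_err;
	}
	return 0;
}

/* Usage example covering the three interesting cases. */
void sketch_open_examples(void)
{
	struct sketch_dev plain = { .flags = 0, .lower_allmulti = 0 };
	struct sketch_dev multi = { .flags = SKETCH_IFF_ALLMULTI, .lower_allmulti = 0 };

	assert(sketch_open(&plain, -1) == -1 && plain.lower_allmulti == 0);
	assert(sketch_open(&multi, -1) == -1 && multi.lower_allmulti == 0);
	assert(sketch_open(&multi, 0) == 0 && multi.lower_allmulti == 1);
}
```

Without the flag check in the error branch, the first case would leave the count at -1, which is the imbalance the patch description points at.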


Re: [PATCH] net: phy: micrel: fix KSZ8041FTL supported value

2016-11-23 Thread David Miller
From: Kirill Esipov 
Date: Mon, 21 Nov 2016 19:53:31 +0300

> Fix setting of SUPPORTED_FIBRE bit as it was not present in features
> of KSZ8041.
> 
> Signed-off-by: Kirill Esipov 

Applied.


Re: [PATCH] netdevice.h: fix kernel-doc warning

2016-11-23 Thread David Miller
From: Randy Dunlap 
Date: Mon, 21 Nov 2016 18:28:36 -0800

> From: Randy Dunlap 
> 
> Fix kernel-doc warning in <linux/netdevice.h> (missing ':'):
> 
> ..//include/linux/netdevice.h:1904: warning: No description found for parameter 'prio_tc_map[TC_BITMASK + 1]'
> 
> Signed-off-by: Randy Dunlap 

Applied.


Re: [PATCH] bnxt_en: Fix a VXLAN vs GENEVE issue

2016-11-23 Thread David Miller
From: Christophe JAILLET 
Date: Tue, 22 Nov 2016 06:14:40 +0100

> Knowing that:
>   #define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_VXLAN    (0x1UL << 0)
>   #define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_GENEVE   (0x5UL << 0)
> and that 'bnxt_hwrm_tunnel_dst_port_alloc()' is only called with one of
> these 2 constants, the TUNNEL_DST_PORT_ALLOC_REQ_TUNNEL_TYPE_GENEVE can not
> trigger.
> 
> Replace the overlapping bit test with an equality test, just as in
> 'bnxt_hwrm_tunnel_dst_port_free()' above.
> 
> Signed-off-by: Christophe JAILLET 

Applied.
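
Why a bit test cannot tell the two tunnel types apart can be checked with a few lines of C. The two values are the ones quoted in the commit message; the shortened macro names and the `sketch_` helpers are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted in the commit message; names shortened for the sketch. */
#define SKETCH_TYPE_VXLAN  (0x1UL << 0)
#define SKETCH_TYPE_GENEVE (0x5UL << 0)

/* Bit test: also true for GENEVE, because 0x5 has bit 0 set. */
bool sketch_is_vxlan_bit_test(unsigned long type)
{
	return (type & SKETCH_TYPE_VXLAN) != 0;
}

/* Equality test: true for VXLAN only. */
bool sketch_is_vxlan_equal(unsigned long type)
{
	return type == SKETCH_TYPE_VXLAN;
}

/* Usage example. */
void sketch_tunnel_type_examples(void)
{
	assert(sketch_is_vxlan_bit_test(SKETCH_TYPE_VXLAN));
	assert(sketch_is_vxlan_bit_test(SKETCH_TYPE_GENEVE)); /* misclassified */
	assert(sketch_is_vxlan_equal(SKETCH_TYPE_VXLAN));
	assert(!sketch_is_vxlan_equal(SKETCH_TYPE_GENEVE));
}
```

If the VXLAN bit test comes first in an if/else chain, the GENEVE branch is never reached, which is the "can not trigger" case described above.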


Re: [net] rtnetlink: fix the wrong minimal dump size getting from rtnl_calcit()

2016-11-23 Thread David Miller
From: Zhang Shengju 
Date: Tue, 22 Nov 2016 14:14:28 +0800

> For RT netlink, calcit() function should return the minimal size for
> netlink dump message. This will make sure that dump message for every
> network device can be stored.
> 
> Currently, the rtnl_calcit() function doesn't account for the size of the
> netlink message header; this patch fixes it.
> 
> Signed-off-by: Zhang Shengju 

Applied.
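
The size arithmetic behind the quoted description can be sketched as follows. The header struct is a local copy with the same field layout as the 16-byte netlink message header, and the `sketch_` helpers are assumptions for illustration rather than the kernel's own functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the netlink message header layout (16 bytes). */
struct sketch_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

/* Netlink aligns messages to 4-byte boundaries. */
size_t sketch_align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Payload only: what a calculation that forgets the header returns. */
size_t sketch_size_without_header(size_t payload)
{
	return sketch_align4(payload);
}

/* Header plus payload: the minimum a dump buffer actually needs. */
size_t sketch_size_with_header(size_t payload)
{
	return sketch_align4(sizeof(struct sketch_nlmsghdr) + payload);
}

/* Usage example with a 21-byte payload. */
void sketch_calcit_examples(void)
{
	assert(sizeof(struct sketch_nlmsghdr) == 16);
	assert(sketch_size_without_header(21) == 24);
	assert(sketch_size_with_header(21) == 40); /* 16 + 21 = 37, rounded up */
}
```

A buffer sized from the payload alone is short by the header length, so the largest per-device message might not fit.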


Re: [RFC 02/10] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) Bus driver

2016-11-23 Thread Vishwanathapura, Niranjana

On Tue, Nov 22, 2016 at 05:49:32PM -0700, Jason Gunthorpe wrote:
> > > > We could add a custom Interface between HFI1 driver and hfi_vnic drivers
> > > > without involving a bus.
> > >
> > > hfi is already registering on the infiniband class, just use that.
> >
> > I don't understand what you mean here?
>
> Get the struct ib_device for the hfi and then do something to get hfi
> specific function calls.
>
> Or work it backwards with a _register function..



OK, thanks for your feedback.
We can make the hfi_vnic module an ib client (which it is) like other ULPs,
and not have an in-built or custom bus for binding.
The hfi_vnic ULP will then identify the device as an hfi1 device by some mechanism
and will only serve that device.


In order to pass the hfi function pointers to the hfi_vnic ULP, I can,
a) Have hfi_vnic ULP define an interface API for hfi1 driver to call to 
register its callback (as you pointed). Unfortunately there will be a module 
dependency here.

Or,
b) Add a new member ‘struct vnic_ops’ either to the ib_device structure or 
ib_port_immutable structure. As it is hfi1 specific, only hfi1 driver will set 
it. No module dependency here.


And will move the hfi_vnic module under ‘drivers/infiniband/ulp/hfi_vnic’.
All this will remove undue complexity and fit the driver into the current design
framework, as per your suggestion.

Let me know your comments.

Niranjana



Jason


random thoughts on optimizing network namespace exit.

2016-11-23 Thread Eric W. Biederman

Fundamentally if we want things to get better we have to remove
unnecessary serialization.  It is entirely too easy to sleep when
cleaning up a networking subsystem and create long hold times on
net_mutex for no particular reasons.

What probably makes sense to do is to add the concept of a
non-serialized pernet_operation.  And then work through the networking
stack converting all of the pernet_operations.  That should allow
network namespace exits to overlap while they clean up, and it should
allow the net_mutex to be dropped at the same point we drop rtnl_lock
in cleanup_net.

It might be a touch tricky during the transition period to take
advantage of an early drop of net_mutex, but that is where I would
start.

Once net_mutex is no longer used to serialize initialization/cleanup
methods for a network namespace, we can look at other bottlenecks.

Eric








Re: net/arp: ARP cache aging failed.

2016-11-23 Thread Eric Dumazet
On Wed, 2016-11-23 at 15:37 +0100, Hannes Frederic Sowa wrote:

> Irregardless about the question if bonding should keep the MAC address
> alive, a MAC address can certainly change below a TCP connection.

Of course ;)

> 
> dst_entry is 1:n to neigh_entry and as such we can end up confirming an
> aging neighbor while sending a reply with dst->pending_confirm set while
> the confirming packet actually came from a different neighbor.
> 
> I agree with Julian, pending_confirm became useless in this way.

Let's kill it then ;)




RE: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

2016-11-23 Thread Brown, Aaron F
> From: Intel-wired-lan [intel-wired-lan-boun...@lists.osuosl.org] on behalf of Cao jin [caoj.f...@cn.fujitsu.com]
> Sent: Monday, November 07, 2016 11:06 PM
> To: linux-ker...@vger.kernel.org; netdev@vger.kernel.org
> Cc: izumi.t...@jp.fujitsu.com; intel-wired-...@lists.osuosl.org
> Subject: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr
> 
> When running as a guest, under certain conditions, the kernel will oops as follows.
> writel() in igb_configure_tx_ring() results in an oops because hw->hw_addr
> is NULL, while other register accesses don't oops because they use
> wr32/rd32, which guard against a NULL pointer.
> 
> [  141.225449] pcieport :00:1c.0: AER: Multiple Uncorrected (Fatal)
> error received: id=0101
> [  141.225523] igb :01:00.1: PCIe Bus Error:
> severity=Uncorrected (Fatal), type=Unaccessible,
> id=0101(Unregistered Agent ID)
> [  141.299442] igb :01:00.1: broadcast error_detected message
> [  141.300539] igb :01:00.0 enp1s0f0: PCIe link lost, device now
> detached
> [  141.351019] igb :01:00.1 enp1s0f1: PCIe link lost, device now
> detached
> [  143.465904] pcieport :00:1c.0: Root Port link has been reset
> [  143.465994] igb :01:00.1: broadcast slot_reset message
> [  143.466039] igb :01:00.0: enabling device ( -> 0002)
> [  144.389078] igb :01:00.1: enabling device ( -> 0002)
> [  145.312078] igb :01:00.1: broadcast resume message
> [  145.322211] BUG: unable to handle kernel paging request at
> 3818
> [  145.361275] IP: []
> igb_configure_tx_ring+0x14d/0x280 [igb]
> [  145.400048] PGD 0
> [  145.438007] Oops: 0002 [#1] SMP
> 
> A similar issue & solution can be found at:
> http://patchwork.ozlabs.org/patch/689592/
> 
> Signed-off-by: Cao jin 
> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Tested-by: Aaron Brown 
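
The difference between a guarded register write and a write through a separately kept mapping, as described in the quoted patch text, can be modelled in user space like this. Everything here is an assumption for illustration: the structs, the `sketch_` names and the plain array standing in for device registers are not the igb driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_hw {
	volatile uint32_t *hw_addr;  /* cleared when the device is detached */
};

struct sketch_adapter {
	volatile uint32_t *io_addr;  /* mapping kept independently of hw_addr */
	struct sketch_hw hw;
};

/* Guarded write: skip the access when hw_addr has been cleared. */
bool sketch_wr32(struct sketch_hw *hw, size_t reg, uint32_t val)
{
	volatile uint32_t *addr = hw->hw_addr;

	if (!addr)
		return false;
	addr[reg] = val;
	return true;
}

/* Write through io_addr, which does not depend on hw_addr being set. */
void sketch_write_via_io_addr(struct sketch_adapter *ad, size_t reg, uint32_t val)
{
	ad->io_addr[reg] = val;
}

/* Usage example: an array stands in for the register window. */
void sketch_igb_examples(void)
{
	uint32_t regs[4] = { 0 };
	struct sketch_adapter ad = { .io_addr = regs, .hw = { .hw_addr = NULL } };

	assert(!sketch_wr32(&ad.hw, 1, 0xabcd));   /* skipped, no crash */
	assert(regs[1] == 0);
	sketch_write_via_io_addr(&ad, 2, 0x1234);  /* still works */
	assert(regs[2] == 0x1234);
	ad.hw.hw_addr = regs;
	assert(sketch_wr32(&ad.hw, 1, 0xabcd) && regs[1] == 0xabcd);
}
```

An unguarded write through a NULL `hw_addr` plus a register offset is what produces the small faulting address seen in the oops above.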


[[PATCH net-next RFC] 1/4] net: dsa: mv88e6xxx: Implement mv88e6390 tag remap

2016-11-23 Thread Andrew Lunn
The mv88e6390 does not have the two registers to set the frame
priority map. Instead it has an indirection register for setting a
number of different priority maps. Refactor the old code into a
function, implement the mv88e6390 version, and use an op to call the
right one.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 37 +++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 10 ++
 drivers/net/dsa/mv88e6xxx/port.c  | 57 +++
 drivers/net/dsa/mv88e6xxx/port.h  |  2 ++
 4 files changed, 93 insertions(+), 13 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index bada6465af59..880e40288038 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2617,20 +2617,10 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
if (err)
return err;
}
+   }
 
-   /* Tag Remap: use an identity 802.1p prio -> switch
-* prio mapping.
-*/
-   err = mv88e6xxx_port_write(chip, port, PORT_TAG_REGMAP_0123,
-  0x3210);
-   if (err)
-   return err;
-
-   /* Tag Remap 2: use an identity 802.1p prio -> switch
-* prio mapping.
-*/
-   err = mv88e6xxx_port_write(chip, port, PORT_TAG_REGMAP_4567,
-  0x7654);
+   if (chip->info->ops->tag_remap) {
+   err = chip->info->ops->tag_remap(chip, port);
if (err)
return err;
}
@@ -3193,6 +3183,7 @@ static const struct mv88e6xxx_ops mv88e6085_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6095_ops = {
@@ -3221,6 +3212,7 @@ static const struct mv88e6xxx_ops mv88e6123_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6131_ops = {
@@ -3249,6 +3241,7 @@ static const struct mv88e6xxx_ops mv88e6161_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6165_ops = {
@@ -3263,6 +3256,7 @@ static const struct mv88e6xxx_ops mv88e6165_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6171_ops = {
@@ -3278,6 +3272,7 @@ static const struct mv88e6xxx_ops mv88e6171_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6172_ops = {
@@ -3295,6 +3290,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6175_ops = {
@@ -3310,6 +3306,7 @@ static const struct mv88e6xxx_ops mv88e6175_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6176_ops = {
@@ -3327,6 +3324,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .tag_remap = mv88e6095_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6185_ops = {
@@ -3357,6 +3355,7 @@ static const struct mv88e6xxx_ops mv88e6190_ops = {
.stats_get_sset_count = mv88e6320_stats_get_sset_count,
.stats_get_strings = mv88e6320_stats_get_strings,
.stats_get_stats = mv88e6390_stats_get_stats,
+   .tag_remap = mv88e6390_tag_remap,
 };
 
 static const struct mv88e6xxx_ops mv88e6190x_ops = {
@@ -3373,6 +3372,7 @@ static const struct mv88e6xxx_ops mv88e6190x_ops = {
.stats

[[PATCH net-next RFC] 2/4] net: dsa: mv88e6xxx: Monitor and Management tables

2016-11-23 Thread Andrew Lunn
The mv88e6390 changes the monitor control register into the Monitor
and Management control, which is an indirection register to various
registers. Move the existing code into global1.c, and add new code for
the mv88e6390.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 37 +--
 drivers/net/dsa/mv88e6xxx/global1.c   | 55 +++
 drivers/net/dsa/mv88e6xxx/global1.h   |  3 ++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  7 +
 4 files changed, 93 insertions(+), 9 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 880e40288038..a6fa3f81e11b 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2747,15 +2747,11 @@ static int mv88e6xxx_g1_setup(struct mv88e6xxx_chip *chip)
if (err)
return err;
 
-   /* Configure the upstream port, and configure it as the port to which
-* ingress and egress and ARP monitor frames are to be sent.
-*/
-   reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT;
-   err = mv88e6xxx_g1_write(chip, GLOBAL_MONITOR_CONTROL, reg);
-   if (err)
-   return err;
+   if (chip->info->ops->monitor_ctrl) {
+   err = chip->info->ops->monitor_ctrl(chip, upstream_port);
+   if (err)
+   return err;
+   }
 
/* Disable remote management, and set the switch's DSA device number. */
err = mv88e6xxx_g1_write(chip, GLOBAL_CONTROL_2,
@@ -3184,6 +3180,7 @@ static const struct mv88e6xxx_ops mv88e6085_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6095_ops = {
@@ -3198,6 +3195,7 @@ static const struct mv88e6xxx_ops mv88e6095_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6123_ops = {
@@ -3213,6 +3211,7 @@ static const struct mv88e6xxx_ops mv88e6123_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6131_ops = {
@@ -3227,6 +3226,7 @@ static const struct mv88e6xxx_ops mv88e6131_ops = {
.stats_get_sset_count = mv88e6095_stats_get_sset_count,
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6161_ops = {
@@ -3242,6 +3242,7 @@ static const struct mv88e6xxx_ops mv88e6161_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6165_ops = {
@@ -3257,6 +3258,7 @@ static const struct mv88e6xxx_ops mv88e6165_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6171_ops = {
@@ -3273,6 +3275,7 @@ static const struct mv88e6xxx_ops mv88e6171_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6172_ops = {
@@ -3291,6 +3294,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6175_ops = {
@@ -3307,6 +3311,7 @@ static const struct mv88e6xxx_ops mv88e6175_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ctrl = mv88e6095_monitor_ctrl,
 };
 
 static const struct mv88e6xxx_ops mv88e6176_ops = {
@@ -3325,6 +3330,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
+   .monitor_ct

[[PATCH net-next RFC] 0/4] MV88E6390 batch two

2016-11-23 Thread Andrew Lunn
RFC only. Not for committing. They will conflict with the mv88e6097
support.

This is the second batch of patches adding support for the
MV88e6390. They are not sufficient to make it work properly.

The mv88e6390 has a much expanded set of priority maps. Refactor the
existing code, and implement basic support for the new device.

Similarly, the monitor control register has been reworked.

The mv88e6390 has something odd in its EDSA tagging implementation,
which means it is not possible to use it. So we need to use DSA
tagging. This is the first device with EDSA support where we need to
use DSA, and the code does not support this. So two patches refactor
the existing code. The two different register definitions are
separated out, and using DSA on an EDSA capable device is added.

Andrew Lunn (4):
  net: dsa: mv88e6xxx: Implement mv88e6390 tag remap
  net: dsa: mv88e6xxx: Monitor and Management tables
  net: dsa: mv88e6xxx: Move the tagging protocol into info
  net: dsa: mv88e6xxx: Refactor CPU and DSA port setup

 drivers/net/dsa/mv88e6xxx/chip.c  | 195 +-
 drivers/net/dsa/mv88e6xxx/global1.c   |  55 ++
 drivers/net/dsa/mv88e6xxx/global1.h   |   3 +
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  40 ---
 drivers/net/dsa/mv88e6xxx/port.c  | 132 +++
 drivers/net/dsa/mv88e6xxx/port.h  |   7 ++
 6 files changed, 348 insertions(+), 84 deletions(-)

-- 
2.10.2



[[PATCH net-next RFC] 3/4] net: dsa: mv88e6xxx: Move the tagging protocol into info

2016-11-23 Thread Andrew Lunn
Older chips support a single tagging protocol, DSA. New chips support
both DSA and EDSA, an enhanced version. Having both as an option
changes the register layouts. Up until now, it has been assumed that
if EDSA is supported, it will be used. Hence the register layout has
been determined by which protocol should be used. However, mv88e6390
has a different implementation of EDSA, which requires us to use
DSA tagging. Hence separate the selection of the protocol from the
register layout.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 33 +++--
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 17 -
 2 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index a6fa3f81e11b..15ea1207b21a 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2482,7 +2482,7 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
PORT_CONTROL_USE_TAG | PORT_CONTROL_USE_IP |
PORT_CONTROL_STATE_FORWARDING;
if (dsa_is_cpu_port(ds, port)) {
-   if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_EDSA))
+   if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA)
reg |= PORT_CONTROL_FRAME_ETHER_TYPE_DSA |
PORT_CONTROL_FORWARD_UNKNOWN_MC;
else
@@ -2611,7 +2611,7 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
/* Port Ethertype: use the Ethertype DSA Ethertype
 * value.
 */
-   if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_EDSA)) {
+   if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA) {
err = mv88e6xxx_port_write(chip, port, PORT_ETH_TYPE,
   ETH_P_EDSA);
if (err)
@@ -3592,6 +3592,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 8,
+   .tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6097,
.ops = &mv88e6085_ops,
},
@@ -3606,6 +3607,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 8,
+   .tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6095,
.ops = &mv88e6095_ops,
},
@@ -3620,6 +3622,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 9,
+   .tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6165,
.ops = &mv88e6123_ops,
},
@@ -3634,6 +3637,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 9,
+   .tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6185,
.ops = &mv88e6131_ops,
},
@@ -3648,6 +3652,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 9,
+   .tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6165,
.ops = &mv88e6161_ops,
},
@@ -3662,6 +3667,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 9,
+   .tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6165,
.ops = &mv88e6165_ops,
},
@@ -3676,6 +3682,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 9,
+   .tag_protocol = DSA_TAG_PROTO_EDSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6351,
.ops = &mv88e6171_ops,
},
@@ -3690,6 +3697,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 9,
+   .tag_protocol = DSA_TAG_PROTO_EDSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6352,
.ops = &mv88e6172_ops,
},
@@ -3704,6 +3712,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.global1_addr = 0x1b,
.age_time_coeff = 15000,
.g1_irqs = 9,
+   .tag_protocol = DSA_TAG_PROTO_EDSA,
.flags = MV88E6XXX_FLAGS_FAMIL

[PATCH net-next RFC 4/4] net: dsa: mv88e6xxx: Refactor CPU and DSA port setup

2016-11-23 Thread Andrew Lunn
Older chips only support DSA tagging. Newer chips have both DSA and
EDSA tagging. Put these two different implementations into functions
which get called from the ops structure.

This results in the helper mv88e6xxx_6065_family() becoming unused, so
remove it.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 92 ++-
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  6 +++
 drivers/net/dsa/mv88e6xxx/port.c  | 75 
 drivers/net/dsa/mv88e6xxx/port.h  |  5 ++
 4 files changed, 133 insertions(+), 45 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 15ea1207b21a..28bd10d95750 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -677,11 +677,6 @@ static int mv88e6xxx_phy_ppu_write(struct mv88e6xxx_chip 
*chip, int addr,
return err;
 }
 
-static bool mv88e6xxx_6065_family(struct mv88e6xxx_chip *chip)
-{
-   return chip->info->family == MV88E6XXX_FAMILY_6065;
-}
-
 static bool mv88e6xxx_6095_family(struct mv88e6xxx_chip *chip)
 {
return chip->info->family == MV88E6XXX_FAMILY_6095;
@@ -2473,41 +2468,20 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip 
*chip, int port)
 * If this is the upstream port for this switch, enable
 * forwarding of unknown unicasts and multicasts.
 */
-   reg = 0;
-   if (mv88e6xxx_6352_family(chip) || mv88e6xxx_6351_family(chip) ||
-   mv88e6xxx_6165_family(chip) || mv88e6xxx_6097_family(chip) ||
-   mv88e6xxx_6095_family(chip) || mv88e6xxx_6065_family(chip) ||
-   mv88e6xxx_6185_family(chip) || mv88e6xxx_6320_family(chip))
-   reg = PORT_CONTROL_IGMP_MLD_SNOOP |
+   reg = PORT_CONTROL_IGMP_MLD_SNOOP |
PORT_CONTROL_USE_TAG | PORT_CONTROL_USE_IP |
PORT_CONTROL_STATE_FORWARDING;
+   err = mv88e6xxx_port_write(chip, port, PORT_CONTROL, reg);
+   if (err)
+   return err;
+
if (dsa_is_cpu_port(ds, port)) {
-   if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA)
-   reg |= PORT_CONTROL_FRAME_ETHER_TYPE_DSA |
-   PORT_CONTROL_FORWARD_UNKNOWN_MC;
-   else
-   reg |= PORT_CONTROL_DSA_TAG;
-   reg |= PORT_CONTROL_EGRESS_ADD_TAG |
-   PORT_CONTROL_FORWARD_UNKNOWN;
+   err = chip->info->ops->cpu_port_config(chip, port);
+   if (err)
+   return err;
}
if (dsa_is_dsa_port(ds, port)) {
-   if (mv88e6xxx_6095_family(chip) ||
-   mv88e6xxx_6185_family(chip))
-   reg |= PORT_CONTROL_DSA_TAG;
-   if (mv88e6xxx_6352_family(chip) ||
-   mv88e6xxx_6351_family(chip) ||
-   mv88e6xxx_6165_family(chip) ||
-   mv88e6xxx_6097_family(chip) ||
-   mv88e6xxx_6320_family(chip)) {
-   reg |= PORT_CONTROL_FRAME_MODE_DSA;
-   }
-
-   if (port == dsa_upstream_port(ds))
-   reg |= PORT_CONTROL_FORWARD_UNKNOWN |
-   PORT_CONTROL_FORWARD_UNKNOWN_MC;
-   }
-   if (reg) {
-   err = mv88e6xxx_port_write(chip, port, PORT_CONTROL, reg);
+   err = chip->info->ops->dsa_port_config(chip, port);
if (err)
return err;
}
@@ -2607,16 +2581,6 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip 
*chip, int port)
   0x);
if (err)
return err;
-
-   /* Port Ethertype: use the Ethertype DSA Ethertype
-* value.
-*/
-   if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA) {
-   err = mv88e6xxx_port_write(chip, port, PORT_ETH_TYPE,
-  ETH_P_EDSA);
-   if (err)
-   return err;
-   }
}
 
if (chip->info->ops->tag_remap) {
@@ -3181,6 +3145,8 @@ static const struct mv88e6xxx_ops mv88e6085_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.tag_remap = mv88e6095_tag_remap,
.monitor_ctrl = mv88e6095_monitor_ctrl,
+   .cpu_port_config = mv88e6351_cpu_port_config,
+   .dsa_port_config = mv88e6351_dsa_port_config,
 };
 
 static const struct mv88e6xxx_ops mv88e6095_ops = {
@@ -3196,6 +3162,8 @@ static const struct mv88e6xxx_ops mv88e6095_ops = {
.stats_get_strings = mv88e6095_stats_get_strings,
.stats_get_stats = mv88e6095_stats_get_stats,
.monitor_ctrl = mv88e6095_monitor_ctrl,
+   .cpu_port_config = mv88e6095_cpu_port_config,
+   .dsa_port_config = mv88e6095_dsa_port_config,
 };
 
 static const struct mv88e6xxx_ops mv

[PATCH v2 net] phy: fix error case of phy_led_triggers_(un)register

2016-11-23 Thread Woojung.Huh
From: Woojung Huh 

When phy_init_hw() fails at phy_attach_direct();
- phy_detach() calls phy_led_triggers_unregister() without
  previous call of phy_led_triggers_register().
- still call phy_led_triggers_register() and cause memory leak.

Fixes: 2e0bc452f472 ("net: phy: leds: add support for led triggers on phy link 
state change")
Signed-off-by: Woojung Huh 
---
 drivers/net/phy/phy_device.c   | 6 +++---
 drivers/net/phy/phy_led_triggers.c | 2 --
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 9e8f048..ba86c19 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -914,15 +914,15 @@ int phy_attach_direct(struct net_device *dev, struct 
phy_device *phydev,
 */
err = phy_init_hw(phydev);
if (err)
-   phy_detach(phydev);
-   else
-   phy_resume(phydev);
+   goto error;
 
+   phy_resume(phydev);
phy_led_triggers_register(phydev);
 
return err;
 
 error:
+   phy_detach(phydev);
put_device(d);
module_put(bus->owner);
return err;
diff --git a/drivers/net/phy/phy_led_triggers.c 
b/drivers/net/phy/phy_led_triggers.c
index cda600a..fa62bdf 100644
--- a/drivers/net/phy/phy_led_triggers.c
+++ b/drivers/net/phy/phy_led_triggers.c
@@ -130,7 +130,5 @@ void phy_led_triggers_unregister(struct phy_device *phy)
 
for (i = 0; i < phy->phy_num_led_triggers; i++)
phy_led_trigger_unregister(&phy->phy_led_triggers[i]);
-
-   devm_kfree(&phy->mdio.dev, phy->phy_led_triggers);
 }
 EXPORT_SYMBOL_GPL(phy_led_triggers_unregister);
-- 
2.7.4


Re: [PATCH net] phy: fix error case of phy_led_triggers_(un)register

2016-11-23 Thread Florian Fainelli
On 23/11/2016 at 13:39, woojung@microchip.com wrote:
> From: Woojung Huh 
> 
> When phy_init_hw() fails at phy_attach_direct();
> - phy_detach() calls phy_led_triggers_unregister() without 
>   previous call of phy_led_triggers_register().
> - still call phy_led_triggers_register() and cause memory leak.
> 
> Signed-off-by: Woojung Huh 

Since you probably have to resubmit this, can you also add a Fixes tag:

Fixes: 2e0bc452f472 ("net: phy: leds: add support for led triggers on
phy link state change")

Thanks!
-- 
Florian


[PATCH net] net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented

2016-11-23 Thread Andrew Lunn
The mvneta driver advertises it supports IFF_UNICAST_FLT. However, it
actually does not. The hardware probably does support it, but there is
no code to configure the filter. As a quick and simple fix, remove the
flag. This will cause the core to fall back to promiscuous mode.

Signed-off-by: Andrew Lunn 
Fixes: b50b72de2f2f ("net: mvneta: enable features before registering the 
driver")
---
 drivers/net/ethernet/marvell/mvneta.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index 5cb07c2017bf..0c0a45af950f 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -4151,7 +4151,7 @@ static int mvneta_probe(struct platform_device *pdev)
dev->features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
dev->hw_features |= dev->features;
dev->vlan_features |= dev->features;
-   dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE;
+   dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
dev->gso_max_segs = MVNETA_MAX_TSO_SEGS;
 
err = register_netdev(dev);
-- 
2.10.2



Re: [patch net-next v2 09/11] ipv4: fib: Add an API to request a FIB dump

2016-11-23 Thread Hannes Frederic Sowa
On 23.11.2016 20:53, Ido Schimmel wrote:
> On Wed, Nov 23, 2016 at 06:47:03PM +0100, Hannes Frederic Sowa wrote:
>> Hmm, I think you need to read the sequence counter under rtnl_lock to
>> have an ordering with the rest of the updates to the RCU trie. Otherwise
>> you don't know if the fib trie has the correct view regarding to the
>> incoming notifications as a whole. This is also necessary during restarts.
>
> I spent quite a lot of time thinking about this specific issue, but I
> couldn't convince myself that the read should be done under RTNL and I'm
> not sure I understand your reasoning. Can you please elaborate?
>
> If, before each notification sent, we call atomic_inc() and then call
> atomic_read() at the end, then how can we be tricked?

The race I am suspecting to happen is:

 fib_register()

 delete route by notifier
 enqueue delete cmd into ordered queue

 starts dump
 sees deleted route by CPU1 because route not yet removed from RCU
 enqueues route for addition

sometimes later in the ordered queue:

delete route -> route not in hw, nop
add route from dump -> route added to hardware

The result should actually have been that route isn't in hw.

Bye,
Hannes


Re: [PATCH v9 2/6] cgroup: add support for eBPF programs

2016-11-23 Thread Rami Rosen
Hi Daniel,

A minor comment:

> +/**
> + * __cgroup_bpf_update() - Update the pinned program of a cgroup, and
> + * propagate the change to descendants
> + * @cgrp: The cgroup which descendants to traverse
> + * @parent: The parent of @cgrp, or %NULL if @cgrp is the root
> + * @prog: A new program to pin
> + * @type: Type of pinning operation (ingress/egress)
> + *
> + * Each cgroup has a set of two pointers for bpf programs; one for eBPF
> + * programs it owns, and which is effective for execution.
> + *
The following section contains the same condition twice: "If @prog is
%NULL". Shouldn't the first case read "If @prog is not %NULL" instead
of "If @prog is %NULL"?

> + * If @prog is %NULL, this function attaches a new program to the cgroup and
> + * releases the one that is currently attached, if any. @prog is then made
> + * the effective program of type @type in that cgroup.
> + *
> + * If @prog is %NULL, the currently attached program of type @type is 
> released,
> + * and the effective program of the parent cgroup (if any) is inherited to
> + * @cgrp.
> + *


Regards,
Rami Rosen


Re: wl1251 & mac address & calibration data

2016-11-23 Thread Pali Rohár
On Wednesday 23 November 2016 23:23:35 Pavel Machek wrote:
> Hi!
> 
> > > > As wl1251.ko does not accept mac_address as module parameter,
> > > > such modprobe hook does not help -- as there is absolutely no
> > > > way from userspace to set or change (permanent) mac address.
> > > 
> > > Quoting modprobe.d manual:
> > > >   install modulename command...
> > > >   
> > > >   This command instructs modprobe to run your
> > > >   command instead of inserting the module in the
> > > >   kernel as normal. The command can be any shell
> > > >   command: this allows you to do any kind of
> > > >   complex processing you might wish. [...]
> > 
> > I know. But this do not allow me to send mac address to kernel --
> > as kernel does not support such command yet (reason for my first
> > question).
> 
> Plus, this does not really work for cases where wl1251 is not a
> module.

Yes, this is another problem.

> > > You can hook up a script that cooks up wl1251-nvs.bin (caldata,
> > > macaddr) and then insmod the actual wl1251.ko module. Or you can
> > > just cook up the nvs on first device boot and store it in
> > > /lib/firmware (possibly overwriting the "generic" wl1251 from
> > > linux-firmware).
> > 
> > This is what I would like to prevent -- overwriting (possible
> > readonly) system files with some device specific. It is really bad
> > idea!
> 
> Agreed.
> 
> "ifconfig hw ether XX" normally sets the address. I guess that's
> ioctl?

This sets temporary address and it is ioctl. IIRC same as what ethtool 
uses. (ifconfig is already deprecated).

> And I guess we should use similar mechanism for permanent
> address.

I'm not sure here... The ioctl above ↑↑↑ is for changing the temporary
mac address. But here we do not want to change the permanent mac
address. We want to tell the kernel driver the current permanent mac
address, which is stored in permanent mac address storage (in the N900
case, in mtd). So this acts as a userspace helper, because the kernel
driver does not have code which can read the permanent mac address.

-- 
Pali Rohár
pali.ro...@gmail.com




Re: [PATCH net-next v2] ethtool: Protect {get,set}_phy_tunable with PHY device mutex

2016-11-23 Thread Andrew Lunn
On Tue, Nov 22, 2016 at 01:55:31PM -0800, Florian Fainelli wrote:
> PHY drivers should be able to rely on the caller of {get,set}_tunable to
> have acquired the PHY device mutex, in order to both serialize against
> concurrent calls of these functions, but also against PHY state machine
> changes. All ethtool PHY-level functions do this, except
> {get,set}_tunable, so we make them consistent here as well.
> 
> We need to update the Microsemi PHY driver in the same commit to avoid
> introducing either deadlocks, or lack of proper locking.
> 
> Fixes: 968ad9da7e0e ("ethtool: Implements 
> ETHTOOL_PHY_GTUNABLE/ETHTOOL_PHY_STUNABLE")
> Fixes: 310d9ad57ae0 ("net: phy: Add downshift get/set support in Microsemi 
> PHYs driver")
> Signed-off-by: Florian Fainelli 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH kernel v3] PCI: Enable access to custom VPD for Chelsio devices (cxgb3)

2016-11-23 Thread Bjorn Helgaas
On Mon, Oct 24, 2016 at 06:04:17PM +1100, Alexey Kardashevskiy wrote:
> There is at least one Chelsio 10Gb card which uses VPD area to store
> some custom blocks (example below). However pci_vpd_size() returns
> the length of the first block only assuming that there can be only
> one VPD "End Tag" and VFIO blocks access beyond that offset
> (since 4e1a63555) which leads to the situation when the guest "cxgb3"
> driver fails to probe the device. The host system does not have this
> problem as the driver accesses the config space directly without
> pci_read_vpd()/...
> 
> This adds a quirk to override the VPD size to a bigger value.
> The maximum size is taken from EEPROMSIZE in
> drivers/net/ethernet/chelsio/cxgb3/common.h. We do not read the tag
> as the cxgb3 driver does as the driver supports writing to EEPROM/VPD
> and when it writes, it only checks for 8192 bytes boundary. The quirk
> is registered for all devices supported by the cxgb3 driver.
> 
> This adds a quirk to the PCI layer (not to the cxgb3 driver) as
> the cxgb3 driver itself accesses VPD directly and the problem only exists
> with the vfio-pci driver (when cxgb3 is not running on the host and
> may not be even loaded) which blocks accesses beyond the first block
> of VPD data. However vfio-pci itself does not have quirks mechanism so
> we add it to PCI.
> 
> This is the controller:
> Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port 
> Adapter [1425:0030]
> 
> This is what I parsed from its vpd:
> ===
> b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 
> FN\x0746K'
>   Large item 42 bytes; name 0x2 Identifier String
>   b'10 Gigabit Ethernet-SR PCI Express Adapter'
>  002d Large item 74 bytes; name 0x10
>   #00 [EC] len=7: b'D76809 '
>   #0a [FN] len=7: b'46K7897'
>   #14 [PN] len=7: b'46K7897'
>   #1e [MN] len=4: b'1037'
>   #25 [FC] len=4: b'5769'
>   #2c [SN] len=12: b'YL102035603V'
>   #3b [NA] len=12: b'00145E992ED1'
>  007a Small item 1 bytes; name 0xf End Tag
> 
>  0c00 Large item 16 bytes; name 0x2 Identifier String
>   b'S310E-SR-X  '
>  0c13 Large item 234 bytes; name 0x10
>   #00 [PN] len=16: b'TBD '
>   #13 [EC] len=16: b'110107730D2 '
>   #26 [SN] len=16: b'97YL102035603V  '
>   #39 [NA] len=12: b'00145E992ED1'
>   #48 [V0] len=6: b'175000'
>   #51 [V1] len=6: b'26'
>   #5a [V2] len=6: b'26'
>   #63 [V3] len=6: b'2000  '
>   #6c [V4] len=2: b'1 '
>   #71 [V5] len=6: b'c2'
>   #7a [V6] len=6: b'0 '
>   #83 [V7] len=2: b'1 '
>   #88 [V8] len=2: b'0 '
>   #8d [V9] len=2: b'0 '
>   #92 [VA] len=2: b'0 '
>   #97 [RV] len=80: 
> b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
>  0d00 Large item 252 bytes; name 0x11
>   #00 [VC] len=16: b'122310_1222 dp  '
>   #13 [VD] len=16: b'610-0001-00 H1\x00\x00'
>   #26 [VE] len=16: b'122310_1353 fp  '
>   #39 [VF] len=16: b'610-0001-00 H1\x00\x00'
>   #4c [RW] len=173: 
> b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
>  0dff Small item 0 bytes; name 0xf End Tag
> 
> 10f3 Large item 13315 bytes; name 0x62
> !!! unknown item name 98: 
> b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
> ===
> 
> Signed-off-by: Alexey Kardashevskiy 

Applied to pci/misc for v4.10, thanks, Alexey!

> ---
> Changes:
> v3:
> * unconditionally set VPD size to 8192
> 
> v2:
> * used pci_set_vpd_size() helper
> * added explicit list of IDs from cxgb3 driver
> * added a note in the commit log why the quirk is not in cxgb3
> ---
>  drivers/pci/quirks.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index c232729..bc7c541 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3255,6 +3255,25 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 
> PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
>   quirk_thunderbolt_hotplug_msi);
>  
> +static void quirk_chelsio_extend_vpd(struct pci_dev *dev)
> +{
> + pci_set_vpd_size(dev, 8192);
> +}
> +
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x20, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x21, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x22, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x23, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x24, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x25, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x26, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x30, 
> quirk_chelsio_extend_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x31, 
> quir

Re: wl1251 & mac address & calibration data

2016-11-23 Thread Pavel Machek
Hi!

> > > As wl1251.ko does not accept mac_address as module parameter, such
> > > modprobe hook does not help -- as there is absolutely no way from
> > > userspace to set or change (permanent) mac address.
> > 
> > Quoting modprobe.d manual:
> > >   install modulename command...
> > >   
> > >   This command instructs modprobe to run your
> > >   command instead of inserting the module in the
> > >   kernel as normal. The command can be any shell
> > >   command: this allows you to do any kind of
> > >   complex processing you might wish. [...]
> 
> I know. But this do not allow me to send mac address to kernel -- as 
> kernel does not support such command yet (reason for my first
> question).

Plus, this does not really work for cases where wl1251 is not a
module.

> > You can hook up a script that cooks up wl1251-nvs.bin (caldata,
> > macaddr) and then insmod the actual wl1251.ko module. Or you can just
> > cook up the nvs on first device boot and store it in /lib/firmware
> > (possibly overwriting the "generic" wl1251 from linux-firmware).
> 
> This is what I would like to prevent -- overwriting (possible readonly) 
> system files with some device specific. It is really bad idea!

Agreed.

"ifconfig hw ether XX" normally sets the address. I guess that's
ioctl? And I guess we should use similar mechanism for permanent
address.

Best regards,

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH net] phy: fix error case of phy_led_triggers_(un)register

2016-11-23 Thread Andrew Lunn
On Wed, Nov 23, 2016 at 09:39:37PM +0000, woojung@microchip.com wrote:
> From: Woojung Huh 
> 
> When phy_init_hw() fails at phy_attach_direct();
> - phy_detach() calls phy_led_triggers_unregister() without 
>   previous call of phy_led_triggers_register().
> - still call phy_led_triggers_register() and cause memory leak.
> 
> Signed-off-by: Woojung Huh 
> ---
>  drivers/net/phy/phy_device.c   | 6 +++---
>  drivers/net/phy/phy_led_triggers.c | 3 +++
>  2 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 9e8f048..094a959 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -915,10 +915,10 @@ int phy_attach_direct(struct net_device *dev, struct 
> phy_device *phydev,
>   err = phy_init_hw(phydev);
>   if (err)
>   phy_detach(phydev);
> - else
> + else {
>   phy_resume(phydev);
> -
> - phy_led_triggers_register(phydev);
> + phy_led_triggers_register(phydev);
> + }

Hi Woojung

The code layout is rather unusual, putting the success case inside the
else {}. It would be better to do a goto out: on error, and detach the
phy there.

>  
>   return err;
>  
> diff --git a/drivers/net/phy/phy_led_triggers.c 
> b/drivers/net/phy/phy_led_triggers.c
> index cda600a..3b0b726 100644
> --- a/drivers/net/phy/phy_led_triggers.c
> +++ b/drivers/net/phy/phy_led_triggers.c
> @@ -128,6 +128,9 @@ void phy_led_triggers_unregister(struct phy_device *phy)
>  {
>   int i;
>  
> + if (!phy->phy_num_led_triggers)
> + return;
> +
>   for (i = 0; i < phy->phy_num_led_triggers; i++)
>   phy_led_trigger_unregister(&phy->phy_led_triggers[i]);

And this seems to be the wrong fix. The point of devm_kzalloc() is
that you don't need to free it, it will happen automatically. So why
not just remove the devm_kfree(&phy->mdio.dev, phy->phy_led_triggers).

Andrew


Re: [PATCH net] phy: fix error case of phy_led_triggers_(un)register

2016-11-23 Thread Sergei Shtylyov

Hello.

On 11/24/2016 12:39 AM, woojung@microchip.com wrote:


> From: Woojung Huh 
>
> When phy_init_hw() fails at phy_attach_direct();
> - phy_detach() calls phy_led_triggers_unregister() without
>   previous call of phy_led_triggers_register().
> - still call phy_led_triggers_register() and cause memory leak.
>
> Signed-off-by: Woojung Huh 
> ---
>  drivers/net/phy/phy_device.c   | 6 +++---
>  drivers/net/phy/phy_led_triggers.c | 3 +++
>  2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 9e8f048..094a959 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -915,10 +915,10 @@ int phy_attach_direct(struct net_device *dev, struct 
> phy_device *phydev,
> err = phy_init_hw(phydev);
> if (err)
> phy_detach(phydev);
> -   else
> +   else {

   CodingStyle: all *if* branches should have {} if at least one has {}.

> phy_resume(phydev);
> -
> -   phy_led_triggers_register(phydev);
> +   phy_led_triggers_register(phydev);
> +   }
>
> return err;


[...]

MBR, Sergei



[PATCH net] phy: fix error case of phy_led_triggers_(un)register

2016-11-23 Thread Woojung.Huh
From: Woojung Huh 

When phy_init_hw() fails at phy_attach_direct();
- phy_detach() calls phy_led_triggers_unregister() without 
  previous call of phy_led_triggers_register().
- still call phy_led_triggers_register() and cause memory leak.

Signed-off-by: Woojung Huh 
---
 drivers/net/phy/phy_device.c   | 6 +++---
 drivers/net/phy/phy_led_triggers.c | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 9e8f048..094a959 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -915,10 +915,10 @@ int phy_attach_direct(struct net_device *dev, struct 
phy_device *phydev,
err = phy_init_hw(phydev);
if (err)
phy_detach(phydev);
-   else
+   else {
phy_resume(phydev);
-
-   phy_led_triggers_register(phydev);
+   phy_led_triggers_register(phydev);
+   }
 
return err;
 
diff --git a/drivers/net/phy/phy_led_triggers.c 
b/drivers/net/phy/phy_led_triggers.c
index cda600a..3b0b726 100644
--- a/drivers/net/phy/phy_led_triggers.c
+++ b/drivers/net/phy/phy_led_triggers.c
@@ -128,6 +128,9 @@ void phy_led_triggers_unregister(struct phy_device *phy)
 {
int i;
 
+   if (!phy->phy_num_led_triggers)
+   return;
+
for (i = 0; i < phy->phy_num_led_triggers; i++)
phy_led_trigger_unregister(&phy->phy_led_triggers[i]);
 
-- 
2.7.4


Re: [RFC PATCH v2 1/2] macb: Add 1588 support in Cadence GEM.

2016-11-23 Thread Richard Cochran
On Wed, Nov 23, 2016 at 02:34:03PM +0100, Andrei Pistirica wrote:
> From what I understand, your suggestion is:
> (ns | frac) * ppb = (total_ns | total_frac)
> (total_ns | total_frac) / 10^9 = (adj_ns | adj_frac)
> This is correct iff total_ns/10^9 >= 1, but the problem is that there are
> missed fractions due to the following approximation:
> frac*ppb =~ (ns*ppb+frac*ppb*2^16)*2^16-10^9*2^16*flor(ns*ppb+frac*ppb*2^16,
> 10^9).

-ENOPARSE;
 
> An example which uses values from a real test:
> let ppb=4891, ns=12 and frac=3158

That is a very strange example for nominal frequency.  The clock
period is 12.048187255859375 nanoseconds, and so the frequency is
83000037.99 Hz.

But hey, let's go with it...

> - using suggested algorithm, yields: adj_ns = 0 and adj_frac = 0
> - using in-place algorithm, yields: adj_ns = 0, adj_frac = 4
> You can check the calculus.

The test program, below, shows you what I meant.  (Of course, you
should adjust this to fit the adjfine() method.)

Unfortunately, this device has a very coarse frequency resolution.
Using a nominal period of ns=12 as an example, the resolution is
2^-16 / 12 or 1.27 ppm.  The 24 bit device is much better in this
respect.
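
Spelled out, the resolution figure is one LSB of the fraction divided by the nominal period. The 24 bit line is an assumption added here for comparison (same nominal 12 ns period, 24 fractional bits); it is not a number taken from a datasheet:

```latex
\[
\frac{2^{-16}\,\mathrm{ns}}{12\,\mathrm{ns}}
  = \frac{1}{12 \cdot 65536}
  \approx 1.27 \times 10^{-6} = 1.27\ \mathrm{ppm}
\]
\[
\frac{2^{-24}\,\mathrm{ns}}{12\,\mathrm{ns}}
  = \frac{1}{12 \cdot 16777216}
  \approx 4.97 \times 10^{-9} = 4.97\ \mathrm{ppb}
\]
```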

The output using your example numbers is:

   $ ./a.out 12 3158 4891
   ns=12 frac=3158
   ns=12 frac=3162

   $ ./a.out 12 3158 -4891
   ns=12 frac=3158
   ns=12 frac=3154

See how you get a result of +/- 4 with just one division?
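
For reference, the arithmetic behind the +/- 4, assuming the increment word is ns * 2^16 + frac and the product is rounded to the nearest integer after dividing by 10^9:

```latex
\[
\mathrm{word} = 12 \cdot 2^{16} + 3158 = 789590
\]
\[
\mathrm{diff}
  = \left\lfloor \frac{789590 \cdot 4891 + 5 \cdot 10^{8}}{10^{9}} \right\rfloor
  = \left\lfloor \frac{4361884690}{10^{9}} \right\rfloor
  = 4
\]
\[
\mathrm{frac}_{\pm} = 3158 \pm 4 \in \{3162,\ 3154\}
\]
```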

Thanks,
Richard

---
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static void adjfreq(uint32_t ns, uint32_t frac, int32_t ppb)
{
	uint64_t adj;
	uint32_t diff, word;
	int neg_adj = 0;

	printf("ns=%u frac=%u\n", ns, frac);

	if (ppb < 0) {
		neg_adj = 1;
		ppb = -ppb;
	}
	/* Combine ns and the 16 bit fraction into one fixed-point word. */
	word = (ns << 16) + frac;
	adj = word;
	adj *= ppb;
	/* Add half the divisor so the single division rounds to nearest. */
	adj += 500000000UL;
	diff = adj / 1000000000UL;

	word = neg_adj ? word - diff : word + diff;
	printf("ns=%u frac=%u\n", word >> 16, word & 0xffff);
}

int main(int argc, char *argv[])
{
	uint32_t ns, frac;
	int32_t ppb;

	if (argc != 4) {
		puts("need ns, frac, and ppb");
		return -1;
	}
	ns = atoi(argv[1]);
	frac = atoi(argv[2]);
	ppb = atoi(argv[3]);
	adjfreq(ns, frac, ppb);
	return 0;
}


[PATCH net-next v2 2/2] net: dsa: mv88e6xxx: add MV88E6097 switch

2016-11-23 Thread Stefan Eichenberger
Add support for the MV88E6097 switch. The change was tested on an Armada
based platform with a MV88E6097 switch.

Signed-off-by: Stefan Eichenberger 
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 28 
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index bada646..68eb8fc 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3209,6 +3209,20 @@ static const struct mv88e6xxx_ops mv88e6095_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
 };
 
+static const struct mv88e6xxx_ops mv88e6097_ops = {
+   /* MV88E6XXX_FAMILY_6097 */
+   .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+   .phy_read = mv88e6xxx_g2_smi_phy_read,
+   .phy_write = mv88e6xxx_g2_smi_phy_write,
+   .port_set_link = mv88e6xxx_port_set_link,
+   .port_set_duplex = mv88e6xxx_port_set_duplex,
+   .port_set_speed = mv88e6185_port_set_speed,
+   .stats_snapshot = mv88e6xxx_g1_stats_snapshot,
+   .stats_get_sset_count = mv88e6095_stats_get_sset_count,
+   .stats_get_strings = mv88e6095_stats_get_strings,
+   .stats_get_stats = mv88e6095_stats_get_stats,
+};
+
 static const struct mv88e6xxx_ops mv88e6123_ops = {
/* MV88E6XXX_FAMILY_6165 */
.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
@@ -3580,6 +3594,20 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.ops = &mv88e6095_ops,
},
 
+   [MV88E6097] = {
+   .prod_num = PORT_SWITCH_ID_PROD_NUM_6097,
+   .family = MV88E6XXX_FAMILY_6097,
+   .name = "Marvell 88E6097/88E6097F",
+   .num_databases = 4096,
+   .num_ports = 11,
+   .port_base_addr = 0x10,
+   .global1_addr = 0x1b,
+   .age_time_coeff = 15000,
+   .g1_irqs = 8,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6097,
+   .ops = &mv88e6097_ops,
+   },
+
[MV88E6123] = {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6123,
.family = MV88E6XXX_FAMILY_6165,
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index 3e69526..a2ff1fc 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -81,6 +81,7 @@
 #define PORT_SWITCH_ID 0x03
 #define PORT_SWITCH_ID_PROD_NUM_6085   0x04a
 #define PORT_SWITCH_ID_PROD_NUM_6095   0x095
+#define PORT_SWITCH_ID_PROD_NUM_6097   0x099
 #define PORT_SWITCH_ID_PROD_NUM_6131   0x106
 #define PORT_SWITCH_ID_PROD_NUM_6320   0x115
 #define PORT_SWITCH_ID_PROD_NUM_6123   0x121
@@ -378,6 +379,7 @@
 enum mv88e6xxx_model {
MV88E6085,
MV88E6095,
+   MV88E6097,
MV88E6123,
MV88E6131,
MV88E6161,
-- 
2.9.3



Re: [PATCH net-next v2 2/2] net: dsa: mv88e6xxx: add MV88E6097 switch

2016-11-23 Thread Vivien Didelot
Hi Stefan,

Stefan Eichenberger  writes:

> Add support for the MV88E6097 switch. The change was tested on an Armada
> based platform with a MV88E6097 switch.
>
> Signed-off-by: Stefan Eichenberger 

Reviewed-by: Vivien Didelot 

One day I'll understand Marvell products naming... ;-)

Thanks,

Vivien


[PATCH net-next v2 1/2] net: dsa: mv88e6xxx: enable EDSA on mv88e6097

2016-11-23 Thread Stefan Eichenberger
EDSA is currently disabled on mv88e6097 devices, this commit enables it.

Signed-off-by: Stefan Eichenberger 
Reviewed-by: Andrew Lunn 
Reviewed-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index 9298faa..3e69526 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -541,7 +541,8 @@ enum mv88e6xxx_cap {
 MV88E6XXX_FLAGS_MULTI_CHIP)
 
 #define MV88E6XXX_FLAGS_FAMILY_6097\
-   (MV88E6XXX_FLAG_G1_ATU_FID |\
+   (MV88E6XXX_FLAG_EDSA |  \
+MV88E6XXX_FLAG_G1_ATU_FID |\
 MV88E6XXX_FLAG_G1_VTU_FID |\
 MV88E6XXX_FLAG_GLOBAL2 |   \
 MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
-- 
2.9.3



[PATCH net-next v2 0/2] Add support for the MV88e6097

2016-11-23 Thread Stefan Eichenberger
This patchset will add support for the MV88E6097 DSA switch and enable
EDSA on MV88E6097 family devices.

Changes since v1:
- Add missing g1_irqs = 8
- Add missing comment after mv88e6097_ops
- Change patch order

Stefan Eichenberger (2):
  net: dsa: mv88e6xxx: enable EDSA on mv88e6097
  net: dsa: mv88e6xxx: add MV88E6097 switch

 drivers/net/dsa/mv88e6xxx/chip.c  | 28 
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  5 -
 2 files changed, 32 insertions(+), 1 deletion(-)

-- 
2.9.3



Re: [PATCH net-next v2 2/2] net: dsa: mv88e6xxx: add MV88E6097 switch

2016-11-23 Thread Andrew Lunn
On Wed, Nov 23, 2016 at 09:59:52PM +0100, Stefan Eichenberger wrote:
> Add support for the MV88E6097 switch. The change was tested on an Armada
> based platform with a MV88E6097 switch.
> 
> Signed-off-by: Stefan Eichenberger 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH 1/2] net: dsa: mv88e6xxx: add MV88E6097 switch

2016-11-23 Thread Stefan Eichenberger
On Wed, Nov 23, 2016 at 07:10:16PM +0100, Andrew Lunn wrote:
> > +   [MV88E6097] = {
> > +   .prod_num = PORT_SWITCH_ID_PROD_NUM_6097,
> > +   .family = MV88E6XXX_FAMILY_6097,
> > +   .name = "Marvell 88E6097/88E6097F",
> > +   .num_databases = 4096,
> > +   .num_ports = 11,
> > +   .port_base_addr = 0x10,
> > +   .global1_addr = 0x1b,
> > +   .age_time_coeff = 15000,
> > +   .flags = MV88E6XXX_FLAGS_FAMILY_6097,
> > +   .ops = &mv88e6097_ops,
> 
> Oops. Sorry, I missed something when you rebased onto net-next. You
> are missing .g1_irqs = . It is probably 9. You can check the
> datasheet, global 1, register 0. If bit 8 is AVBInt, you need 9. If
> bit 8 is reserved, then 8.

No problem, bits 8-10 are reserved, so I will put 8 in then.

Regards,
Stefan


[PATCH net-next 1/1] ptp: gianfar: Use high resolution frequency method.

2016-11-23 Thread Ulrik De Bie
This patch depends on commit d8d263541913 ("ptp: Introduce a high
resolution frequency adjustment method.")

The gianfar devices offer a frequency resolution of about 0.46 ppb
(this depends on the actual value of tmr_add; the calculation assumed
0x80000000). This patch lets users of the device benefit from the
increased frequency resolution when tuning the clock. Thanks to the
rounding, the maximum error between the requested frequency and the
applied frequency will then be about 0.23 ppb.

Tested on a v3.3.8 kernel on a real gianfar device. Verified compilation
on net-next (currently at v4.9-rc5).

Signed-off-by: Ulrik De Bie 
---
 drivers/net/ethernet/freescale/gianfar_ptp.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar_ptp.c 
b/drivers/net/ethernet/freescale/gianfar_ptp.c
index 3e8d1ff..721be13 100644
--- a/drivers/net/ethernet/freescale/gianfar_ptp.c
+++ b/drivers/net/ethernet/freescale/gianfar_ptp.c
@@ -280,21 +280,26 @@ static irqreturn_t isr(int irq, void *priv)
  * PTP clock operations
  */
 
-static int ptp_gianfar_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
+static int ptp_gianfar_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
 {
-   u64 adj;
-   u32 diff, tmr_add;
+   u64 adj, diff;
+   u32 tmr_add;
int neg_adj = 0;
struct etsects *etsects = container_of(ptp, struct etsects, caps);
 
-   if (ppb < 0) {
+   if (scaled_ppm < 0) {
neg_adj = 1;
-   ppb = -ppb;
+   scaled_ppm = -scaled_ppm;
}
tmr_add = etsects->tmr_add;
adj = tmr_add;
-   adj *= ppb;
-   diff = div_u64(adj, 1000000000ULL);
+
+   /* calculate diff as adj*(scaled_ppm/65536)/1000000
+* and round() to the nearest integer
+*/
+   adj *= scaled_ppm;
+   diff = div_u64(adj, 8000000);
+   diff = (diff >> 13) + ((diff >> 12) & 1);
 
tmr_add = neg_adj ? tmr_add - diff : tmr_add + diff;
 
@@ -415,7 +420,7 @@ static struct ptp_clock_info ptp_gianfar_caps = {
.n_per_out  = 0,
.n_pins = 0,
.pps= 1,
-   .adjfreq= ptp_gianfar_adjfreq,
+   .adjfine= ptp_gianfar_adjfine,
.adjtime= ptp_gianfar_adjtime,
.gettime64  = ptp_gianfar_gettime,
.settime64  = ptp_gianfar_settime,
-- 
2.10.2
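
The rounding in ptp_gianfar_adjfine() can be checked outside the kernel.
The following is a minimal userspace sketch, not the driver code: the
helper name gianfar_scale_tmr_add() is hypothetical, plain 64-bit
division stands in for div_u64(), and it assumes the divisor is 8000000,
so that 8000000 * 2^13 equals 65536 * 1000000.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the tmr_add update. scaled_ppm is ppm
 * with a 16-bit binary fraction, so 65536 means 1 ppm.
 */
static uint32_t gianfar_scale_tmr_add(uint32_t tmr_add, long scaled_ppm)
{
	int neg_adj = scaled_ppm < 0;
	uint64_t adj, diff;

	if (neg_adj)
		scaled_ppm = -scaled_ppm;

	adj = (uint64_t)tmr_add * (uint64_t)scaled_ppm;

	/* Divide by 8000000 first, which leaves 13 fractional bits ... */
	diff = adj / 8000000ULL;
	/* ... then drop them, adding bit 12 to round to nearest. */
	diff = (diff >> 13) + ((diff >> 12) & 1);

	return neg_adj ? tmr_add - (uint32_t)diff : tmr_add + (uint32_t)diff;
}
```

With tmr_add = 0x80000000, a request of 65536 (1 ppm) moves tmr_add by
2147, and a request of 49 (roughly 0.75 ppb) is rounded up to a change
of 2 rather than truncated to 1.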



Re: [PATCH net-next 1/1] ptp: gianfar: Use high resolution frequency method.

2016-11-23 Thread Richard Cochran
On Wed, Nov 23, 2016 at 09:11:04PM +0100, Ulrik De Bie wrote:
> This patch depends on commit d8d263541913 ("ptp: Introduce a high
> resolution frequency adjustment method.")
> 
> The gianfar devices offer a frequency resolution of about 0.46 ppb
> (this depends on the actual value of tmr_add; the calculation assumed
> 0x80000000). This patch lets users of the device benefit from the
> increased frequency resolution when tuning the clock. Thanks to the
> rounding, the maximum error between the requested frequency and the
> applied frequency will then be about 0.23 ppb.
> 
> Tested on a v3.3.8 kernel on a real gianfar device. Verified compilation
> on net-next (currently at v4.9-rc5).
> 
> Signed-off-by: Ulrik De Bie 

Acked-by: Richard Cochran 


Re: [PATCH v2] cpsw: ethtool: add support for getting/setting EEE registers

2016-11-23 Thread Florian Fainelli
On 11/23/2016 12:08 PM, Yegor Yefremov wrote:
> On Wed, Nov 23, 2016 at 6:33 PM, Florian Fainelli  
> wrote:
>> On 11/23/2016 06:38 AM, yegorsli...@googlemail.com wrote:
>>> From: Yegor Yefremov 
>>>
>>> Add the ability to query and set Energy Efficient Ethernet parameters
>>> via ethtool for applicable devices.
>>
>> Are you sure this is enough to actually enable EEE? I don't see where
>> phy_init_eee() is called here, nor is the cpsw Ethernet controller part
>> configured to enable/disable EEE. EEE is not just a PHY thing, it
>> usually also needs to be configured properly at the Ethernet MAC/switch
>> level as well.
>>
>> Just curious here.
> 
> I'm sure I want to disable EEE :-) So I need this patch in order to
> check and disable EEE advertising.

OK, so you need this to disable EEE advertisement, which is great, but
this also allows you to enable EEE, is it enough to just advertise EEE
with your link partner for cpsw to work correctly? Just wondering, since
your commit message is more than short.
-- 
Florian


Re: [PATCH v2] cpsw: ethtool: add support for getting/setting EEE registers

2016-11-23 Thread Yegor Yefremov
On Wed, Nov 23, 2016 at 6:33 PM, Florian Fainelli  wrote:
> On 11/23/2016 06:38 AM, yegorsli...@googlemail.com wrote:
>> From: Yegor Yefremov 
>>
>> Add the ability to query and set Energy Efficient Ethernet parameters
>> via ethtool for applicable devices.
>
> Are you sure this is enough to actually enable EEE? I don't see where
> phy_init_eee() is called here, nor is the cpsw Ethernet controller part
> configured to enable/disable EEE. EEE is not just a PHY thing, it
> usually also needs to be configured properly at the Ethernet MAC/switch
> level as well.
>
> Just curious here.

I'm sure I want to disable EEE :-) So I need this patch in order to
check and disable EEE advertising.

Yegor


Re: [PATCH net-next] tcp: enhance tcp_collapse_retrans() with skb_shift()

2016-11-23 Thread Yuchung Cheng
On Tue, Nov 15, 2016 at 12:51 PM, Eric Dumazet  wrote:
>
> From: Eric Dumazet 
>
> In commit 2331ccc5b323 ("tcp: enhance tcp collapsing"),
> we made a first step allowing copying right skb to left skb head.
>
> Since all skbs in socket write queue are headless (but possibly the very
> first one), this strategy often does not work.
>
> This patch extends tcp_collapse_retrans() to perform frag shifting,
> thanks to skb_shift() helper.
>
> This helper needs to not BUG on non headless skbs, as callers are ok
> with that.
>
> Tested:
>
> Following packetdrill test now passes :
>
> 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
>+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>+0 bind(3, ..., ...) = 0
>+0 listen(3, 1) = 0
>
>+0 < S 0:0(0) win 32792 
>+0 > S. 0:0(0) ack 1 
> +.100 < . 1:1(0) ack 1 win 257
>+0 accept(3, ..., ...) = 4
>
>+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>+0 write(4, ..., 200) = 200
>+0 > P. 1:201(200) ack 1
> +.001 write(4, ..., 200) = 200
>+0 > P. 201:401(200) ack 1
> +.001 write(4, ..., 200) = 200
>+0 > P. 401:601(200) ack 1
> +.001 write(4, ..., 200) = 200
>+0 > P. 601:801(200) ack 1
> +.001 write(4, ..., 200) = 200
>+0 > P. 801:1001(200) ack 1
> +.001 write(4, ..., 100) = 100
>+0 > P. 1001:1101(100) ack 1
> +.001 write(4, ..., 100) = 100
>+0 > P. 1101:1201(100) ack 1
> +.001 write(4, ..., 100) = 100
>+0 > P. 1201:1301(100) ack 1
> +.001 write(4, ..., 100) = 100
>+0 > P. 1301:1401(100) ack 1
>
> +.099 < . 1:1(0) ack 201 win 257
> +.001 < . 1:1(0) ack 201 win 257 
>+0 > P. 201:1001(800) ack 1
>
> Signed-off-by: Eric Dumazet 
> Cc: Neal Cardwell 
> Cc: Yuchung Cheng 
Acked-by: Yuchung Cheng 

Nice follow-up patch. This also works well with RACK loss detection
since RACK only cares about time (skb_mstamp) not sequence so
collapsing sequences is not a problem.
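
The copy-or-shift decision in the quoted patch can be modelled with byte
counts alone. This is only a toy sketch under stated assumptions: struct
toy_skb and the toy_* helpers are hypothetical, no real sk_buff is
involved, and the shift path ignores the cases where the real
skb_shift() runs out of frag slots.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an skb: only the byte counts matter here. */
struct toy_skb {
	size_t headlen;   /* bytes in the linear head */
	size_t fraglen;   /* bytes held in page frags */
	size_t tailroom;  /* free bytes after the linear data */
};

/* Like skb_availroom(): a buffer that already has frags offers no room. */
static size_t toy_availroom(const struct toy_skb *skb)
{
	return skb->fraglen ? 0 : skb->tailroom;
}

/* Returns true when all of next was merged into skb. */
static bool toy_collapse(struct toy_skb *skb, struct toy_skb *next)
{
	size_t next_size = next->headlen + next->fraglen;

	if (next_size == 0)
		return true;

	if (next_size <= toy_availroom(skb)) {
		/* Copy path: everything lands in the linear head. */
		skb->headlen += next_size;
		skb->tailroom -= next_size;
	} else if (next->headlen == 0) {
		/* Shift path: move the page frags over without copying. */
		skb->fraglen += next->fraglen;
	} else {
		/* A head on next means the shift is refused, not a BUG. */
		return false;
	}

	next->headlen = 0;
	next->fraglen = 0;
	return true;
}
```

Two headless buffers of 200 bytes each merge through the shift path even
though there is no tail room, which is the case the packetdrill test
exercises; a next buffer that carries a linear head and does not fit is
left alone.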

> ---
>  net/core/skbuff.c |4 +++-
>  net/ipv4/tcp_output.c |   22 +++---
>  2 files changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 
> 0b2a6e94af2de73ed638634c47a0fb71e2cbc1cb..a9cb81a10c4ba895587727aa4cf098e9a38424ea
>  100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2656,7 +2656,9 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, 
> int shiftlen)
> struct skb_frag_struct *fragfrom, *fragto;
>
> BUG_ON(shiftlen > skb->len);
> -   BUG_ON(skb_headlen(skb));   /* Would corrupt stream */
> +
> +   if (skb_headlen(skb))
> +   return 0;
>
> todo = shiftlen;
> from = 0;
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 
> f57b5aa51b59cf0a58975fe34a7dcdb886ea8c50..19105b46a30436ebb85fe97ee43089e77aa028bb
>  100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2514,7 +2514,7 @@ void tcp_skb_collapse_tstamp(struct sk_buff *skb,
>  }
>
>  /* Collapses two adjacent SKB's during retransmission. */
> -static void tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
> +static bool tcp_collapse_retrans(struct sock *sk, struct sk_buff *skb)
>  {
> struct tcp_sock *tp = tcp_sk(sk);
> struct sk_buff *next_skb = tcp_write_queue_next(sk, skb);
> @@ -2525,14 +2525,17 @@ static void tcp_collapse_retrans(struct sock *sk, 
> struct sk_buff *skb)
>
> BUG_ON(tcp_skb_pcount(skb) != 1 || tcp_skb_pcount(next_skb) != 1);
>
> +   if (next_skb_size) {
> +   if (next_skb_size <= skb_availroom(skb))
> +   skb_copy_bits(next_skb, 0, skb_put(skb, 
> next_skb_size),
> + next_skb_size);
> +   else if (!skb_shift(skb, next_skb, next_skb_size))
> +   return false;
> +   }
> tcp_highest_sack_combine(sk, next_skb, skb);
>
> tcp_unlink_write_queue(next_skb, sk);
>
> -   if (next_skb_size)
> -   skb_copy_bits(next_skb, 0, skb_put(skb, next_skb_size),
> - next_skb_size);
> -
> if (next_skb->ip_summed == CHECKSUM_PARTIAL)
> skb->ip_summed = CHECKSUM_PARTIAL;
>
> @@ -2561,6 +2564,7 @@ static void tcp_collapse_retrans(struct sock *sk, 
> struct sk_buff *skb)
> tcp_skb_collapse_tstamp(skb, next_skb);
>
> sk_wmem_free_skb(sk, next_skb);
> +   return true;
>  }
>
>  /* Check if coalescing SKBs is legal. */
> @@ -2610,16 +2614,12 @@ static void tcp_retrans_try_collapse(struct sock *sk, 
> struct sk_buff *to,
>
> if (space < 0)
> break;
> -   /* Punt if not enough space exists in the first SKB for
> -* the data in the second
> -*/
> -   if (skb->len > skb_availroom(to))
> -   break;
>
> if (after(TCP_SKB_CB(skb)->end_seq, tcp_wnd_end(tp)))
> break;
>
> -   tcp_collapse_retrans(sk, to);
> +   

Re: [patch net-next v2 09/11] ipv4: fib: Add an API to request a FIB dump

2016-11-23 Thread Ido Schimmel
On Wed, Nov 23, 2016 at 06:47:03PM +0100, Hannes Frederic Sowa wrote:
> Hmm, I think you need to read the sequence counter under rtnl_lock to
> have an ordering with the rest of the updates to the RCU trie. Otherwise
> you don't know if the fib trie has the correct view regarding to the
> incoming notifications as a whole. This is also necessary during restarts.

I spent quite a lot of time thinking about this specific issue, but I
couldn't convince myself that the read should be done under RTNL and I'm
not sure I understand your reasoning. Can you please elaborate?

If, before each notification sent, we call atomic_inc() and then call
atomic_read() at the end, then how can we be tricked?

Thanks for looking into this!
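
For reference, the counter scheme I have in mind looks roughly like
this. It is a userspace sketch with C11 atomics rather than the kernel
atomic_t API, all toy_* names are made up for illustration, and it says
nothing about whether the reads need to happen under RTNL, which is the
open question above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bumped once for every change notification that is sent. */
static atomic_uint toy_fib_seq;

static void toy_fib_notify(void)
{
	atomic_fetch_add(&toy_fib_seq, 1);
}

typedef void (*toy_dump_fn)(void);

/*
 * Run one dump and report whether it can be trusted: if the counter
 * moved while the dump ran, a notification raced with it and the caller
 * should discard the result and try again.
 */
static bool toy_fib_dump(toy_dump_fn dump)
{
	unsigned int before = atomic_load(&toy_fib_seq);

	dump();

	return atomic_load(&toy_fib_seq) == before;
}

/* Two stand-in dumps: one undisturbed, one that races with a change. */
static void toy_quiet_dump(void)
{
}

static void toy_racing_dump(void)
{
	toy_fib_notify();
}
```

A dump that sees no notification is accepted, while one that overlaps
with a notification is rejected and would be retried by the caller.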


Re: [patch net-next v2 10/11] mlxsw: spectrum_router: Request a dump of FIB tables during init

2016-11-23 Thread Jiri Pirko
Wed, Nov 23, 2016 at 08:22:30PM CET, ido...@idosch.org wrote:
>On Wed, Nov 23, 2016 at 06:08:23PM +0100, Hannes Frederic Sowa wrote:
>> On Wed, Nov 23, 2016, at 18:04, Jiri Pirko wrote:
>> > >Sure, but an abort function can be provided to the kernel anyway and the
>> > >driver can care about that.
>> > 
>> > Ok, how?
>> 
>> I think just a sysctl on top of this series is enough plus a pr_warn.
>> Rocker and mlxsw are responsible to loop for a maximum amount of time.
>
>Maybe, when the module requests a dump it can also provide a callback
>that is invoked following each failed dump?

That would make sense. Thanks.


[PATCH V3 for-next 04/11] IB/hns: add self loopback for CM

2016-11-23 Thread Salil Mehta
From: Lijun Ou 

This patch mainly adds self loopback support for CM.

Signed-off-by: Lijun Ou 
Signed-off-by: Peter Chen 
Reviewed-by: Wei Hu (Xavier) 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c |   11 +++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h |2 ++
 2 files changed, 13 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 959d5ca..e080dd6 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -32,6 +32,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include "hns_roce_common.h"
 #include "hns_roce_device.h"
@@ -72,6 +73,8 @@ int hns_roce_v1_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *wr,
int nreq = 0;
u32 ind = 0;
int ret = 0;
+   u8 *smac;
+   int loopback;
 
if (unlikely(ibqp->qp_type != IB_QPT_GSI &&
ibqp->qp_type != IB_QPT_RC)) {
@@ -129,6 +132,14 @@ int hns_roce_v1_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *wr,
   UD_SEND_WQE_U32_8_DMAC_5_M,
   UD_SEND_WQE_U32_8_DMAC_5_S,
   ah->av.mac[5]);
+
+   smac = (u8 *)hr_dev->dev_addr[qp->port];
+   loopback = ether_addr_equal_unaligned(ah->av.mac,
+ smac) ? 1 : 0;
+   roce_set_bit(ud_sq_wqe->u32_8,
+UD_SEND_WQE_U32_8_LOOPBACK_INDICATOR_S,
+loopback);
+
roce_set_field(ud_sq_wqe->u32_8,
   UD_SEND_WQE_U32_8_OPERATION_TYPE_M,
   UD_SEND_WQE_U32_8_OPERATION_TYPE_S,
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.h 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.h
index 6004c7f..cf28f1b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.h
@@ -440,6 +440,8 @@ struct hns_roce_ud_send_wqe {
 #define UD_SEND_WQE_U32_8_DMAC_5_M   \
(((1UL << 8) - 1) << UD_SEND_WQE_U32_8_DMAC_5_S)
 
+#define UD_SEND_WQE_U32_8_LOOPBACK_INDICATOR_S 22
+
 #define UD_SEND_WQE_U32_8_OPERATION_TYPE_S 16
 #define UD_SEND_WQE_U32_8_OPERATION_TYPE_M   \
(((1UL << 4) - 1) << UD_SEND_WQE_U32_8_OPERATION_TYPE_S)
-- 
1.7.9.5




[PATCH V3 for-next 01/11] IB/hns: Add the interface for querying QP1

2016-11-23 Thread Salil Mehta
From: Lijun Ou 

The old code only added the interface for querying non-specific
QPs. This patch mainly adds an interface for querying QP1.

Signed-off-by: Lijun Ou 
Reviewed-by: Wei Hu (Xavier) 
Signed-off-by: Salil Mehta  
---
Change Log

Patch V2: Addressed the comment provided by Anurup M
Link: https://patchwork.kernel.org/patch/9412855/
Patch V1: Initial Submit
---
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c |   83 +++-
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h |6 +-
 2 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 71232e5..7485514 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -2630,8 +2630,78 @@ static int hns_roce_v1_query_qpc(struct hns_roce_dev 
*hr_dev,
return ret;
 }
 
-int hns_roce_v1_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
-int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr)
+static int hns_roce_v1_q_sqp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
+int qp_attr_mask,
+struct ib_qp_init_attr *qp_init_attr)
+{
+   struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
+   struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
+   struct hns_roce_sqp_context context;
+   u32 addr;
+
+   mutex_lock(&hr_qp->mutex);
+
+   if (hr_qp->state == IB_QPS_RESET) {
+   qp_attr->qp_state = IB_QPS_RESET;
+   goto done;
+   }
+
+   addr = ROCEE_QP1C_CFG0_0_REG + 
+   hr_qp->port * sizeof(struct hns_roce_sqp_context);
+   context.qp1c_bytes_4 = roce_read(hr_dev, addr);
+   context.sq_rq_bt_l = roce_read(hr_dev, addr + 1);
+   context.qp1c_bytes_12 = roce_read(hr_dev, addr + 2);
+   context.qp1c_bytes_16 = roce_read(hr_dev, addr + 3);
+   context.qp1c_bytes_20 = roce_read(hr_dev, addr + 4);
+   context.cur_rq_wqe_ba_l = roce_read(hr_dev, addr + 5);
+   context.qp1c_bytes_28 = roce_read(hr_dev, addr + 6);
+   context.qp1c_bytes_32 = roce_read(hr_dev, addr + 7);
+   context.cur_sq_wqe_ba_l = roce_read(hr_dev, addr + 8);
+   context.qp1c_bytes_40 = roce_read(hr_dev, addr + 9);
+
+   hr_qp->state = roce_get_field(context.qp1c_bytes_4,
+ QP1C_BYTES_4_QP_STATE_M,
+ QP1C_BYTES_4_QP_STATE_S);
+   qp_attr->qp_state   = hr_qp->state;
+   qp_attr->path_mtu   = IB_MTU_256;
+   qp_attr->path_mig_state = IB_MIG_ARMED;
+   qp_attr->qkey   = QKEY_VAL;
+   qp_attr->rq_psn = 0;
+   qp_attr->sq_psn = 0;
+   qp_attr->dest_qp_num= 1;
+   qp_attr->qp_access_flags = 6;
+
+   qp_attr->pkey_index = roce_get_field(context.qp1c_bytes_20,
+QP1C_BYTES_20_PKEY_IDX_M,
+QP1C_BYTES_20_PKEY_IDX_S);
+   qp_attr->port_num = hr_qp->port + 1;
+   qp_attr->sq_draining = 0;
+   qp_attr->max_rd_atomic = 0;
+   qp_attr->max_dest_rd_atomic = 0;
+   qp_attr->min_rnr_timer = 0;
+   qp_attr->timeout = 0;
+   qp_attr->retry_cnt = 0;
+   qp_attr->rnr_retry = 0;
+   qp_attr->alt_timeout = 0;
+
+done:
+   qp_attr->cur_qp_state = qp_attr->qp_state;
+   qp_attr->cap.max_recv_wr = hr_qp->rq.wqe_cnt;
+   qp_attr->cap.max_recv_sge = hr_qp->rq.max_gs;
+   qp_attr->cap.max_send_wr = hr_qp->sq.wqe_cnt;
+   qp_attr->cap.max_send_sge = hr_qp->sq.max_gs;
+   qp_attr->cap.max_inline_data = 0;
+   qp_init_attr->cap = qp_attr->cap;
+   qp_init_attr->create_flags = 0;
+
+   mutex_unlock(&hr_qp->mutex);
+
+   return 0;
+}
+
+static int hns_roce_v1_q_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
+   int qp_attr_mask,
+   struct ib_qp_init_attr *qp_init_attr)
 {
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
@@ -2767,6 +2837,15 @@ int hns_roce_v1_query_qp(struct ib_qp *ibqp, struct 
ib_qp_attr *qp_attr,
return ret;
 }
 
+int hns_roce_v1_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
+int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr)
+{
+   struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
+
+   return hr_qp->doorbell_qpn <= 1 ?
+   hns_roce_v1_q_sqp(ibqp, qp_attr, qp_attr_mask, qp_init_attr) :
+   hns_roce_v1_q_qp(ibqp, qp_attr, qp_attr_mask, qp_init_attr);
+}
 static void hns_roce_v1_destroy_qp_common(struct hns_roce_dev *hr_dev,
  struct hns_roce_qp *hr_qp,
  int is_user)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.h 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.h
index 539b0a3b.

[PATCH V3 for-next 02/11] IB/hns: Add code for refreshing CQ CI using TPTR

2016-11-23 Thread Salil Mehta
From: "Wei Hu (Xavier)" 

This patch adds the code for refreshing the CQ CI using TPTR on the
hip06 SoC.

We send a doorbell to the hardware to refresh the CQ CI when the user
successfully polls a CQE. That fails if the doorbell has been blocked,
so when the CQ is almost full the hardware reads a special buffer
called TPTR to get the latest CI value.

This patch supports the special CI buffer as follows:
a) Allocate the memory for TPTR in hns_roce_tptr_init() and free it in
   hns_roce_tptr_free(); these two functions are called from the probe
   function and the remove function respectively.
b) Add the code for computing the offset (every CQ needs 2 bytes) and
   write the DMA address to every CQ context to notify the hardware, in
   hns_roce_v1_write_cqc().
c) Add code for mapping the TPTR buffer to user space in
   hns_roce_mmap(). The mapping distinguishes TPTR from the user-mode
   UAR by vm_pgoff (0: UAR, 1: TPTR, others: invalid) on hip06.
d) Add the code for refreshing the CQ CI using TPTR in
   hns_roce_v1_poll_cq().
e) Add some variable definitions to the related structures.
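
The offset computation in step b) can be sketched as follows. This is an
illustration only: toy_tptr_dma_addr() and TOY_TPTR_ENTRY_SIZE are
hypothetical names, and indexing the buffer by CQ number is an
assumption based on the "2 bytes per CQ" description above.

```c
#include <stdint.h>

#define TOY_TPTR_ENTRY_SIZE 2u /* bytes per CQ, per the description above */

/*
 * DMA address of the tail-pointer slot for one CQ, assuming the slots
 * are laid out back to back and indexed by CQ number.
 */
static uint64_t toy_tptr_dma_addr(uint64_t tptr_base, unsigned long cqn)
{
	return tptr_base + (uint64_t)cqn * TOY_TPTR_ENTRY_SIZE;
}
```

For a buffer based at 0x1000, CQ 0 gets the slot at 0x1000 and CQ 5 the
slot at 0x100a.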

Signed-off-by: Wei Hu (Xavier) 
Signed-off-by: Dongdong Huang(Donald) 
Signed-off-by: Lijun Ou 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_common.h |2 -
 drivers/infiniband/hw/hns/hns_roce_cq.c |9 +++
 drivers/infiniband/hw/hns/hns_roce_device.h |6 +-
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  |   79 ---
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  |9 +++
 drivers/infiniband/hw/hns/hns_roce_main.c   |   13 -
 6 files changed, 103 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h 
b/drivers/infiniband/hw/hns/hns_roce_common.h
index 2970161..0dcb620 100644
--- a/drivers/infiniband/hw/hns/hns_roce_common.h
+++ b/drivers/infiniband/hw/hns/hns_roce_common.h
@@ -253,8 +253,6 @@
 #define ROCEE_VENDOR_ID_REG0x0
 #define ROCEE_VENDOR_PART_ID_REG   0x4
 
-#define ROCEE_HW_VERSION_REG   0x8
-
 #define ROCEE_SYS_IMAGE_GUID_L_REG 0xC
 #define ROCEE_SYS_IMAGE_GUID_H_REG 0x10
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c 
b/drivers/infiniband/hw/hns/hns_roce_cq.c
index 0973659..5dc8d92 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cq.c
@@ -349,6 +349,15 @@ struct ib_cq *hns_roce_ib_create_cq(struct ib_device 
*ib_dev,
goto err_mtt;
}
 
+   /*
+* For the QP created by kernel space, tptr value should be initialized
+* to zero; For the QP created by user space, it will cause synchronous
+* problems if tptr is set to zero here, so we initialze it in user
+* space.
+*/
+   if (!context)
+   *hr_cq->tptr_addr = 0;
+
/* Get created cq handler and carry out event */
hr_cq->comp = hns_roce_ib_cq_comp;
hr_cq->event = hns_roce_ib_cq_event;
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 3417315..7242b14 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -37,6 +37,8 @@
 
 #define DRV_NAME "hns_roce"
 
+#define HNS_ROCE_HW_VER1   ('h' << 24 | 'i' << 16 | '0' << 8 | '6')
+
 #define MAC_ADDR_OCTET_NUM 6
 #define HNS_ROCE_MAX_MSG_LEN   0x8000
 
@@ -296,7 +298,7 @@ struct hns_roce_cq {
u32 cq_depth;
u32 cons_index;
void __iomem*cq_db_l;
-   void __iomem*tptr_addr;
+   u16 *tptr_addr;
unsigned long   cqn;
u32 vector;
atomic_trefcount;
@@ -553,6 +555,8 @@ struct hns_roce_dev {
 
int cmd_mod;
int loop_idc;
+   dma_addr_t  tptr_dma_addr; /*only for hw v1*/
+   u32 tptr_size; /*only for hw v1*/
struct hns_roce_hw  *hw;
 };
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 7485514..959d5ca 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -849,6 +849,45 @@ static void hns_roce_bt_free(struct hns_roce_dev *hr_dev)
priv->bt_table.qpc_buf.buf, priv->bt_table.qpc_buf.map);
 }
 
+static int hns_roce_tptr_init(struct hns_roce_dev *hr_dev)
+{
+   struct device *dev = &hr_dev->pdev->dev;
+   struct hns_roce_buf_list *tptr_buf;
+   struct hns_roce_v1_priv *priv;
+
+   priv = (struct hns_roce_v1_priv *)hr_dev->hw->priv;
+   tptr_buf = &priv->tptr_table.tptr_buf;
+
+   /*
+* This buffer will be used for CQ's tptr(tail pointer), also

[PATCH V3 for-next 08/11] IB/hns: Modify query info named port_num when querying RC QP

2016-11-23 Thread Salil Mehta
From: "Wei Hu (Xavier)" 

This patch modifies the reported qp_attr->port_num to fix a bug on
hip06: the value is now taken from hr_qp->port.

Signed-off-by: Wei Hu (Xavier) 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 509ea75..34b7898 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -2857,9 +2857,7 @@ static int hns_roce_v1_q_qp(struct ib_qp *ibqp, struct 
ib_qp_attr *qp_attr,
qp_attr->pkey_index = roce_get_field(context->qpc_bytes_12,
  QP_CONTEXT_QPC_BYTES_12_P_KEY_INDEX_M,
  QP_CONTEXT_QPC_BYTES_12_P_KEY_INDEX_S);
-   qp_attr->port_num = (u8)roce_get_field(context->qpc_bytes_156,
-QP_CONTEXT_QPC_BYTES_156_PORT_NUM_M,
-QP_CONTEXT_QPC_BYTES_156_PORT_NUM_S) + 1;
+   qp_attr->port_num = hr_qp->port + 1;
qp_attr->sq_draining = 0;
qp_attr->max_rd_atomic = roce_get_field(context->qpc_bytes_156,
 QP_CONTEXT_QPC_BYTES_156_INITIATOR_DEPTH_M,
-- 
1.7.9.5




[PATCH V3 for-next 05/11] IB/hns: Modify the condition of notifying hardware loopback

2016-11-23 Thread Salil Mehta
From: Lijun Ou 

This patch modifies the condition for notifying the hardware of
loopback.

On hip06 the RoCE engine has several ports, and each QP is bound to
one port. The hardware only supports loopback within the same port,
not between different ports.

So, for a QP bound to port N, if the dmac in the QP context equals
the smac of local port N, or loop_idc is 1, we should set the
loopback bit in the QP context to notify the hardware.

Signed-off-by: Wei Hu (Xavier) 
Signed-off-by: Lijun Ou 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c |   24 +++-
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index e080dd6..643a2ff 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -2244,24 +2244,14 @@ static int hns_roce_v1_m_qp(struct ib_qp *ibqp, const 
struct ib_qp_attr *attr,
 QP_CONTEXT_QPC_BYTE_32_SIGNALING_TYPE_S,
 hr_qp->sq_signal_bits);
 
-   for (port = 0; port < hr_dev->caps.num_ports; port++) {
-   smac = (u8 *)hr_dev->dev_addr[port];
-   dev_dbg(dev, "smac: %2x: %2x: %2x: %2x: %2x: %2x\n",
-   smac[0], smac[1], smac[2], smac[3], smac[4],
-   smac[5]);
-   if ((dmac[0] == smac[0]) && (dmac[1] == smac[1]) &&
-   (dmac[2] == smac[2]) && (dmac[3] == smac[3]) &&
-   (dmac[4] == smac[4]) && (dmac[5] == smac[5])) {
-   roce_set_bit(context->qpc_bytes_32,
-   QP_CONTEXT_QPC_BYTE_32_LOOPBACK_INDICATOR_S,
-   1);
-   break;
-   }
-   }
-
-   if (hr_dev->loop_idc == 0x1)
+   port = (attr_mask & IB_QP_PORT) ? (attr->port_num - 1) :
+   hr_qp->port;
+   smac = (u8 *)hr_dev->dev_addr[port];
+   /* when dmac equals smac or loop_idc is 1, it should loopback */
+   if (ether_addr_equal_unaligned(dmac, smac) ||
+   hr_dev->loop_idc == 0x1)
roce_set_bit(context->qpc_bytes_32,
-   QP_CONTEXT_QPC_BYTE_32_LOOPBACK_INDICATOR_S, 1);
+ QP_CONTEXT_QPC_BYTE_32_LOOPBACK_INDICATOR_S, 1);
 
roce_set_bit(context->qpc_bytes_32,
 QP_CONTEXT_QPC_BYTE_32_GLOBAL_HEADER_S,
-- 
1.7.9.5
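
The new per-port condition can be exercised in userspace with a small
sketch. Everything named toy_* here is hypothetical, memcmp() stands in
for ether_addr_equal_unaligned(), and the MAC addresses are made up.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6

/*
 * Loop back only when the destination MAC is the MAC of the port the QP
 * is bound to, or when loopback is forced through loop_idc.
 */
static bool toy_needs_loopback(const uint8_t port_macs[][TOY_ETH_ALEN],
			       unsigned int port,
			       const uint8_t dmac[TOY_ETH_ALEN],
			       int loop_idc)
{
	const uint8_t *smac = port_macs[port];

	return loop_idc == 1 || memcmp(dmac, smac, TOY_ETH_ALEN) == 0;
}

/* Made-up addresses for two local ports. */
static const uint8_t toy_port_macs[2][TOY_ETH_ALEN] = {
	{ 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 },
	{ 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 },
};
```

A QP on port 0 whose dmac is the MAC of port 1 no longer sets the
loopback bit, which is the behaviour change compared with the old loop
over all ports.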




[PATCH V3 for-next 00/11] Code improvements & fixes for HNS RoCE driver

2016-11-23 Thread Salil Mehta
This patchset introduces some code improvements and fixes
for the identified problems in the HNS RoCE driver.

Lijun Ou (4):
  IB/hns: Add the interface for querying QP1
  IB/hns: add self loopback for CM
  IB/hns: Modify the condition of notifying hardware loopback
  IB/hns: Fix the bug for qp state in hns_roce_v1_m_qp()

Salil Mehta (1):
  IB/hns: Fix for Checkpatch.pl comment style errors

Shaobo Xu (1):
  IB/hns: Implement the add_gid/del_gid and optimize the GIDs
management

Wei Hu (Xavier) (5):
  IB/hns: Add code for refreshing CQ CI using TPTR
  IB/hns: Optimize the logic of allocating memory using APIs
  IB/hns: Modify the macro for the timeout when cmd process
  IB/hns: Modify query info named port_num when querying RC QP
  IB/hns: Change qpn allocation to round-robin mode.

 drivers/infiniband/hw/hns/hns_roce_alloc.c  |   11 +-
 drivers/infiniband/hw/hns/hns_roce_cmd.c|8 +-
 drivers/infiniband/hw/hns/hns_roce_cmd.h|7 +-
 drivers/infiniband/hw/hns/hns_roce_common.h |2 -
 drivers/infiniband/hw/hns/hns_roce_cq.c |   17 +-
 drivers/infiniband/hw/hns/hns_roce_device.h |   45 ++--
 drivers/infiniband/hw/hns/hns_roce_eq.c |6 +-
 drivers/infiniband/hw/hns/hns_roce_hem.c|6 +-
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  |  267 +--
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  |   17 +-
 drivers/infiniband/hw/hns/hns_roce_main.c   |  311 +++
 drivers/infiniband/hw/hns/hns_roce_mr.c |   22 +-
 drivers/infiniband/hw/hns/hns_roce_pd.c |5 +-
 drivers/infiniband/hw/hns/hns_roce_qp.c |2 +-
 14 files changed, 364 insertions(+), 362 deletions(-)

-- 
1.7.9.5




[PATCH V3 for-next 07/11] IB/hns: Modify the macro for the timeout when cmd process

2016-11-23 Thread Salil Mehta
From: "Wei Hu (Xavier)" 

This patch replaces the per-class command timeout enum with a single
macro, as follows:
Before modification:
 enum {
HNS_ROCE_CMD_TIME_CLASS_A   = 1,
HNS_ROCE_CMD_TIME_CLASS_B   = 1,
HNS_ROCE_CMD_TIME_CLASS_C   = 1,
 };
After modification:
 #define HNS_ROCE_CMD_TIMEOUT_MSECS 1

Signed-off-by: Wei Hu (Xavier) 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_cmd.h   |7 +--
 drivers/infiniband/hw/hns/hns_roce_cq.c|4 ++--
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c |8 
 drivers/infiniband/hw/hns/hns_roce_mr.c|4 ++--
 4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.h 
b/drivers/infiniband/hw/hns/hns_roce_cmd.h
index e3997d3..ed14ad3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.h
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.h
@@ -34,6 +34,7 @@
 #define _HNS_ROCE_CMD_H
 
 #define HNS_ROCE_MAILBOX_SIZE  4096
+#define HNS_ROCE_CMD_TIMEOUT_MSECS 1
 
 enum {
/* TPT commands */
@@ -57,12 +58,6 @@ enum {
HNS_ROCE_CMD_QUERY_QP   = 0x22,
 };
 
-enum {
-   HNS_ROCE_CMD_TIME_CLASS_A   = 1,
-   HNS_ROCE_CMD_TIME_CLASS_B   = 1,
-   HNS_ROCE_CMD_TIME_CLASS_C   = 1,
-};
-
 struct hns_roce_cmd_mailbox {
void   *buf;
dma_addr_t  dma;
diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c 
b/drivers/infiniband/hw/hns/hns_roce_cq.c
index 5dc8d92..461a273 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cq.c
@@ -77,7 +77,7 @@ static int hns_roce_sw2hw_cq(struct hns_roce_dev *dev,
 unsigned long cq_num)
 {
return hns_roce_cmd_mbox(dev, mailbox->dma, 0, cq_num, 0,
-   HNS_ROCE_CMD_SW2HW_CQ, HNS_ROCE_CMD_TIME_CLASS_A);
+   HNS_ROCE_CMD_SW2HW_CQ, HNS_ROCE_CMD_TIMEOUT_MSECS);
 }
 
 static int hns_roce_cq_alloc(struct hns_roce_dev *hr_dev, int nent,
@@ -176,7 +176,7 @@ static int hns_roce_hw2sw_cq(struct hns_roce_dev *dev,
 {
return hns_roce_cmd_mbox(dev, 0, mailbox ? mailbox->dma : 0, cq_num,
 mailbox ? 0 : 1, HNS_ROCE_CMD_HW2SW_CQ,
-HNS_ROCE_CMD_TIME_CLASS_A);
+HNS_ROCE_CMD_TIMEOUT_MSECS);
 }
 
 static void hns_roce_free_cq(struct hns_roce_dev *hr_dev,
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index b835a55..509ea75 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -1871,12 +1871,12 @@ static int hns_roce_v1_qp_modify(struct hns_roce_dev 
*hr_dev,
if (op[cur_state][new_state] == HNS_ROCE_CMD_2RST_QP)
return hns_roce_cmd_mbox(hr_dev, 0, 0, hr_qp->qpn, 2,
 HNS_ROCE_CMD_2RST_QP,
-HNS_ROCE_CMD_TIME_CLASS_A);
+HNS_ROCE_CMD_TIMEOUT_MSECS);
 
if (op[cur_state][new_state] == HNS_ROCE_CMD_2ERR_QP)
return hns_roce_cmd_mbox(hr_dev, 0, 0, hr_qp->qpn, 2,
 HNS_ROCE_CMD_2ERR_QP,
-HNS_ROCE_CMD_TIME_CLASS_A);
+HNS_ROCE_CMD_TIMEOUT_MSECS);
 
mailbox = hns_roce_alloc_cmd_mailbox(hr_dev);
if (IS_ERR(mailbox))
@@ -1886,7 +1886,7 @@ static int hns_roce_v1_qp_modify(struct hns_roce_dev 
*hr_dev,
 
ret = hns_roce_cmd_mbox(hr_dev, mailbox->dma, 0, hr_qp->qpn, 0,
op[cur_state][new_state],
-   HNS_ROCE_CMD_TIME_CLASS_C);
+   HNS_ROCE_CMD_TIMEOUT_MSECS);
 
hns_roce_free_cmd_mailbox(hr_dev, mailbox);
return ret;
@@ -2681,7 +2681,7 @@ static int hns_roce_v1_query_qpc(struct hns_roce_dev 
*hr_dev,
 
ret = hns_roce_cmd_mbox(hr_dev, 0, mailbox->dma, hr_qp->qpn, 0,
HNS_ROCE_CMD_QUERY_QP,
-   HNS_ROCE_CMD_TIME_CLASS_A);
+   HNS_ROCE_CMD_TIMEOUT_MSECS);
if (!ret)
memcpy(hr_context, mailbox->buf, sizeof(*hr_context));
else
diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c 
b/drivers/infiniband/hw/hns/hns_roce_mr.c
index d87d189..a5bd645 100644
--- a/drivers/infiniband/hw/hns/hns_roce_mr.c
+++ b/drivers/infiniband/hw/hns/hns_roce_mr.c
@@ -53,7 +53,7 @@ static int hns_roce_sw2hw_mpt(struct hns_roce_dev *hr_dev,
 {
return hns_roce_cmd_mbox(hr_dev, mailbox->dma, 0, mpt_index, 0,
 HNS_ROCE_CMD_SW2HW_MPT,
-HNS_ROCE_CMD_TIME_CLASS_B);
+HNS_ROCE_CMD_TIMEOUT_MSECS);
 }
 
 static int hns_roce_

[PATCH V3 for-next 06/11] IB/hns: Fix the bug for qp state in hns_roce_v1_m_qp()

2016-11-23 Thread Salil Mehta
From: Lijun Ou 

The old code set the QP state field of the QPC from attr->qp_state.
That value may be invalid when (attr_mask & IB_QP_STATE) is zero, so
use new_state instead.

Signed-off-by: Lijun Ou 
Reviewed-by: Wei Hu (Xavier) 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c 
b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 643a2ff..b835a55 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -2571,7 +2571,7 @@ static int hns_roce_v1_m_qp(struct ib_qp *ibqp, const 
struct ib_qp_attr *attr,
/* Every status migrate must change state */
roce_set_field(context->qpc_bytes_144,
   QP_CONTEXT_QPC_BYTES_144_QP_STATE_M,
-  QP_CONTEXT_QPC_BYTES_144_QP_STATE_S, attr->qp_state);
+  QP_CONTEXT_QPC_BYTES_144_QP_STATE_S, new_state);
 
/* SW pass context to HW */
ret = hns_roce_v1_qp_modify(hr_dev, &hr_qp->mtt,
-- 
1.7.9.5




[PATCH V3 for-next 03/11] IB/hns: Optimize the logic of allocating memory using APIs

2016-11-23 Thread Salil Mehta
From: "Wei Hu (Xavier)" 

This patch modifies how the hns RoCE driver allocates the buddy
bitmaps. It uses kcalloc instead of kmalloc_array plus bitmap_zero,
and when kcalloc fails it falls back to vzalloc to allocate the
memory.

Signed-off-by: Wei Hu (Xavier) 
Signed-off-by: Ping Zhang 
Signed-off-by: Salil Mehta  
---
Change log:

PATCH V2: Addressed comment given by Leon
 Link: https://patchwork.kernel.org/patch/9412859/
PATCH V1: Initial Submit
---
 drivers/infiniband/hw/hns/hns_roce_mr.c |   16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c 
b/drivers/infiniband/hw/hns/hns_roce_mr.c
index fb87883..d87d189 100644
--- a/drivers/infiniband/hw/hns/hns_roce_mr.c
+++ b/drivers/infiniband/hw/hns/hns_roce_mr.c
@@ -137,11 +137,13 @@ static int hns_roce_buddy_init(struct hns_roce_buddy 
*buddy, int max_order)
 
for (i = 0; i <= buddy->max_order; ++i) {
s = BITS_TO_LONGS(1 << (buddy->max_order - i));
-   buddy->bits[i] = kmalloc_array(s, sizeof(long), GFP_KERNEL);
-   if (!buddy->bits[i])
-   goto err_out_free;
-
-   bitmap_zero(buddy->bits[i], 1 << (buddy->max_order - i));
+   buddy->bits[i] = kcalloc(s, sizeof(long), GFP_KERNEL |
+__GFP_NOWARN);
+   if (!buddy->bits[i]) {
+   buddy->bits[i] = vzalloc(s * sizeof(long));
+   if (!buddy->bits[i])
+   goto err_out_free;
+   }
}
 
set_bit(0, buddy->bits[buddy->max_order]);
@@ -151,7 +153,7 @@ static int hns_roce_buddy_init(struct hns_roce_buddy 
*buddy, int max_order)
 
 err_out_free:
for (i = 0; i <= buddy->max_order; ++i)
-   kfree(buddy->bits[i]);
+   kvfree(buddy->bits[i]);
 
 err_out:
kfree(buddy->bits);
@@ -164,7 +166,7 @@ static void hns_roce_buddy_cleanup(struct hns_roce_buddy 
*buddy)
int i;
 
for (i = 0; i <= buddy->max_order; ++i)
-   kfree(buddy->bits[i]);
+   kvfree(buddy->bits[i]);
 
kfree(buddy->bits);
kfree(buddy->num_free);
-- 
1.7.9.5
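
A minimal user-space sketch of the allocation pattern described above,
assuming nothing beyond what the diff shows. The helper names
(bits_to_longs, contig_zalloc, order_bitmap_alloc) and the "limit"
parameter are hypothetical; calloc() stands in for both kcalloc() and
vzalloc(), so this only illustrates the sizing and the fallback order,
not real kernel behaviour.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Number of longs needed to hold nbits bits (mirrors BITS_TO_LONGS). */
static size_t bits_to_longs(size_t nbits)
{
	return (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;
}

/*
 * Hypothetical stand-in for kcalloc(): refuses requests larger than
 * "limit" bytes to imitate a failed contiguous allocation.
 */
static void *contig_zalloc(size_t n, size_t size, size_t limit)
{
	if (size != 0 && n > limit / size)
		return NULL;
	return calloc(n, size);
}

/* Zeroed bitmap for one buddy order: contiguous path first, then fallback. */
static unsigned long *order_bitmap_alloc(int max_order, int order, size_t limit)
{
	size_t s = bits_to_longs((size_t)1 << (max_order - order));
	unsigned long *bits = contig_zalloc(s, sizeof(long), limit);

	if (!bits)
		bits = calloc(s, sizeof(long)); /* plays the role of vzalloc() */
	return bits;
}
```

Because either path may have produced the buffer, the kernel code frees
it with kvfree(); in this sketch plain free() covers both cases.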




[PATCH V3 for-next 09/11] IB/hns: Change qpn allocation to round-robin mode.

2016-11-23 Thread Salil Mehta
From: "Wei Hu (Xavier)" 

When using CM to establish connections, a qp number that was just
freed will be rejected by the ib core. To fix this problem, we change
qpn allocation to round-robin mode. We add a round-robin mode for
allocating resources from a bitmap, and use round-robin mode for qp
numbers and non-round-robin mode for other resources such as cq
numbers, pd numbers, etc.

Signed-off-by: Wei Hu (Xavier) 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_alloc.c  |   11 +++
 drivers/infiniband/hw/hns/hns_roce_cq.c |4 ++--
 drivers/infiniband/hw/hns/hns_roce_device.h |9 +++--
 drivers/infiniband/hw/hns/hns_roce_mr.c |2 +-
 drivers/infiniband/hw/hns/hns_roce_pd.c |5 +++--
 drivers/infiniband/hw/hns/hns_roce_qp.c |2 +-
 6 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_alloc.c 
b/drivers/infiniband/hw/hns/hns_roce_alloc.c
index 863a17a..605962f 100644
--- a/drivers/infiniband/hw/hns/hns_roce_alloc.c
+++ b/drivers/infiniband/hw/hns/hns_roce_alloc.c
@@ -61,9 +61,10 @@ int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap, 
unsigned long *obj)
return ret;
 }
 
-void hns_roce_bitmap_free(struct hns_roce_bitmap *bitmap, unsigned long obj)
+void hns_roce_bitmap_free(struct hns_roce_bitmap *bitmap, unsigned long obj,
+ int rr)
 {
-   hns_roce_bitmap_free_range(bitmap, obj, 1);
+   hns_roce_bitmap_free_range(bitmap, obj, 1, rr);
 }
 
 int hns_roce_bitmap_alloc_range(struct hns_roce_bitmap *bitmap, int cnt,
@@ -106,7 +107,8 @@ int hns_roce_bitmap_alloc_range(struct hns_roce_bitmap 
*bitmap, int cnt,
 }
 
 void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
-   unsigned long obj, int cnt)
+   unsigned long obj, int cnt,
+   int rr)
 {
int i;
 
@@ -116,7 +118,8 @@ void hns_roce_bitmap_free_range(struct hns_roce_bitmap 
*bitmap,
for (i = 0; i < cnt; i++)
clear_bit(obj + i, bitmap->table);
 
-   bitmap->last = min(bitmap->last, obj);
+   if (!rr)
+   bitmap->last = min(bitmap->last, obj);
bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
   & bitmap->mask;
spin_unlock(&bitmap->lock);
diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c 
b/drivers/infiniband/hw/hns/hns_roce_cq.c
index 461a273..c9f6c3d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cq.c
@@ -166,7 +166,7 @@ static int hns_roce_cq_alloc(struct hns_roce_dev *hr_dev, 
int nent,
hns_roce_table_put(hr_dev, &cq_table->table, hr_cq->cqn);
 
 err_out:
-   hns_roce_bitmap_free(&cq_table->bitmap, hr_cq->cqn);
+   hns_roce_bitmap_free(&cq_table->bitmap, hr_cq->cqn, BITMAP_NO_RR);
return ret;
 }
 
@@ -204,7 +204,7 @@ static void hns_roce_free_cq(struct hns_roce_dev *hr_dev,
spin_unlock_irq(&cq_table->lock);
 
hns_roce_table_put(hr_dev, &cq_table->table, hr_cq->cqn);
-   hns_roce_bitmap_free(&cq_table->bitmap, hr_cq->cqn);
+   hns_roce_bitmap_free(&cq_table->bitmap, hr_cq->cqn, BITMAP_NO_RR);
 }
 
 static int hns_roce_ib_get_cq_umem(struct hns_roce_dev *hr_dev,
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 7242b14..593a42a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -72,6 +72,9 @@
 #define HNS_ROCE_MAX_GID_NUM   16
 #define HNS_ROCE_GID_SIZE  16
 
+#define BITMAP_NO_RR   0
+#define BITMAP_RR  1
+
 #define MR_TYPE_MR 0x00
 #define MR_TYPE_DMA0x03
 
@@ -661,7 +664,8 @@ int hns_roce_buf_write_mtt(struct hns_roce_dev *hr_dev,
 void hns_roce_cleanup_qp_table(struct hns_roce_dev *hr_dev);
 
 int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap, unsigned long *obj);
-void hns_roce_bitmap_free(struct hns_roce_bitmap *bitmap, unsigned long obj);
+void hns_roce_bitmap_free(struct hns_roce_bitmap *bitmap, unsigned long obj,
+int rr);
 int hns_roce_bitmap_init(struct hns_roce_bitmap *bitmap, u32 num, u32 mask,
 u32 reserved_bot, u32 resetrved_top);
 void hns_roce_bitmap_cleanup(struct hns_roce_bitmap *bitmap);
@@ -669,7 +673,8 @@ int hns_roce_bitmap_init(struct hns_roce_bitmap *bitmap, 
u32 num, u32 mask,
 int hns_roce_bitmap_alloc_range(struct hns_roce_bitmap *bitmap, int cnt,
int align, unsigned long *obj);
 void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
-   unsigned long obj, int cnt);
+   unsigned long obj, int cnt,
+   int rr);
 
 struct ib_ah *hns_roce_create_ah(struct ib_pd *
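
The effect of the rr flag on reuse can be sketched in user space as
follows. This is a simplified model, not the driver code: the struct,
the SKETCH_* names and the fixed table size are hypothetical, and the
top/mask handling and locking of the real bitmap are left out.

```c
#include <stdbool.h>

#define SKETCH_MAX	8	/* hypothetical table size */
#define SKETCH_NO_RR	0
#define SKETCH_RR	1

struct sketch_bitmap {
	bool used[SKETCH_MAX];
	unsigned long last;	/* index where the next search starts */
};

/* Returns the allocated index, or -1 when the table is full. */
static long sketch_alloc(struct sketch_bitmap *bm)
{
	unsigned long i;

	for (i = 0; i < SKETCH_MAX; i++) {
		unsigned long obj = (bm->last + i) % SKETCH_MAX;

		if (!bm->used[obj]) {
			bm->used[obj] = true;
			bm->last = (obj + 1) % SKETCH_MAX;
			return (long)obj;
		}
	}
	return -1;
}

/* In non round-robin mode a free pulls the search start back to obj. */
static void sketch_free(struct sketch_bitmap *bm, unsigned long obj, int rr)
{
	bm->used[obj] = false;
	if (!rr && obj < bm->last)
		bm->last = obj;
}
```

With SKETCH_NO_RR a freed index is handed out again by the very next
allocation; with SKETCH_RR the search keeps moving forward, so a number
that was just freed is not reused until the search wraps around.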

[PATCH V3 for-next 11/11] IB/hns: Fix for Checkpatch.pl comment style errors

2016-11-23 Thread Salil Mehta
This patch corrects the comment style errors reported by the
checkpatch.pl script.

Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_cmd.c|8 ++--
 drivers/infiniband/hw/hns/hns_roce_device.h |   28 ++---
 drivers/infiniband/hw/hns/hns_roce_eq.c |6 +--
 drivers/infiniband/hw/hns/hns_roce_hem.c|6 +--
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  |   58 +--
 drivers/infiniband/hw/hns/hns_roce_main.c   |   28 ++---
 6 files changed, 67 insertions(+), 67 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c 
b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 2a0b6c0..8c1f7a6 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -216,10 +216,10 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev 
*hr_dev, u64 in_param,
goto out;
 
/*
-   * It is timeout when wait_for_completion_timeout return 0
-   * The return value is the time limit set in advance
-   * how many seconds showing
-   */
+* It is timeout when wait_for_completion_timeout return 0
+* The return value is the time limit set in advance
+* how many seconds showing
+*/
if (!wait_for_completion_timeout(&context->done,
 msecs_to_jiffies(timeout))) {
dev_err(dev, "[cmd]wait_for_completion_timeout timeout\n");
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 9ef1cc3..e48464d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -201,9 +201,9 @@ struct hns_roce_bitmap {
 /* Order = 0: bitmap is biggest, order = max bitmap is least (only a bit) */
 /* Every bit repesent to a partner free/used status in bitmap */
 /*
-* Initial, bits of other bitmap are all 0 except that a bit of max_order is 1
-* Bit = 1 represent to idle and available; bit = 0: not available
-*/
+ * Initial, bits of other bitmap are all 0 except that a bit of max_order is 1
+ * Bit = 1 represent to idle and available; bit = 0: not available
+ */
 struct hns_roce_buddy {
/* Members point to every order level bitmap */
unsigned long **bits;
@@ -365,25 +365,25 @@ struct hns_roce_cmdq {
struct mutexhcr_mutex;
struct semaphorepoll_sem;
/*
-   * Event mode: cmd register mutex protection,
-   * ensure to not exceed max_cmds and user use limit region
-   */
+* Event mode: cmd register mutex protection,
+* ensure to not exceed max_cmds and user use limit region
+*/
struct semaphoreevent_sem;
int max_cmds;
spinlock_t  context_lock;
int free_head;
struct hns_roce_cmd_context *context;
/*
-   * Result of get integer part
-   * which max_comds compute according a power of 2
-   */
+* Result of get integer part
+* which max_comds compute according a power of 2
+*/
u16 token_mask;
/*
-   * Process whether use event mode, init default non-zero
-   * After the event queue of cmd event ready,
-   * can switch into event mode
-   * close device, switch into poll mode(non event mode)
-   */
+* Process whether use event mode, init default non-zero
+* After the event queue of cmd event ready,
+* can switch into event mode
+* close device, switch into poll mode(non event mode)
+*/
u8  use_events;
u8  toggle;
 };
diff --git a/drivers/infiniband/hw/hns/hns_roce_eq.c 
b/drivers/infiniband/hw/hns/hns_roce_eq.c
index 21e21b0..50f8649 100644
--- a/drivers/infiniband/hw/hns/hns_roce_eq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_eq.c
@@ -371,9 +371,9 @@ static int hns_roce_aeq_ovf_int(struct hns_roce_dev *hr_dev,
int i = 0;
 
/**
-   * AEQ overflow ECC mult bit err CEQ overflow alarm
-   * must clear interrupt, mask irq, clear irq, cancel mask operation
-   */
+* AEQ overflow ECC mult bit err CEQ overflow alarm
+* must clear interrupt, mask irq, clear irq, cancel mask operation
+*/
aeshift_val = roce_read(hr_dev, ROCEE_CAEP_AEQC_AEQE_SHIFT_REG);
 
if (roce_get_bit(aeshift_val,
diff --git a/drivers/infiniband/hw/hns/hns_roce_hem.c 
b/drivers/infiniband/hw/hns/hns_roce_hem.c
index 250d8f2..c5104e0 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hem.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hem.c
@@ -80,9 +80,9 @@ struct hns_roce_hem *hns_roce_alloc_hem(struct hns_roce_dev 
*hr_dev, int npages,
--order;
 
/*
-   * Alloc memory one time. If failed, don't alloc small block
-   * memory, directly return fail.
- 

[PATCH V3 for-next 10/11] IB/hns: Implement the add_gid/del_gid and optimize the GIDs management

2016-11-23 Thread Salil Mehta
From: Shaobo Xu 

The IB core now calculates GIDs, manages the GID tables, and supplies
the query function for GIDs, so doing the same work in the RoCE
driver is redundant.

This patch implements add_gid/del_gid to set the GIDs in the RoCE
driver, removes the redundant calculation and management of GIDs from
the net device and inet notifier calls, and updates query_gid.

Signed-off-by: Shaobo Xu 
Reviewed-by: Wei Hu (Xavier) 
Signed-off-by: Salil Mehta  
---
 drivers/infiniband/hw/hns/hns_roce_device.h |2 -
 drivers/infiniband/hw/hns/hns_roce_main.c   |  270 +--
 2 files changed, 48 insertions(+), 224 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 593a42a..9ef1cc3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -429,8 +429,6 @@ struct hns_roce_ib_iboe {
struct net_device  *netdevs[HNS_ROCE_MAX_PORTS];
struct notifier_block   nb;
struct notifier_block   nb_inet;
-   /* 16 GID is shared by 6 port in v1 engine. */
-   union ib_gidgid_table[HNS_ROCE_MAX_GID_NUM];
u8  phy_port[HNS_ROCE_MAX_PORTS];
 };
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c 
b/drivers/infiniband/hw/hns/hns_roce_main.c
index 6770171..795ef97 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -35,52 +35,13 @@
 #include 
 #include 
 #include 
+#include 
 #include "hns_roce_common.h"
 #include "hns_roce_device.h"
 #include "hns_roce_user.h"
 #include "hns_roce_hem.h"
 
 /**
- * hns_roce_addrconf_ifid_eui48 - Get default gid.
- * @eui: eui.
- * @vlan_id:  gid
- * @dev:  net device
- * Description:
- *MAC convert to GID
- *gid[0..7] = fe80   
- *gid[8] = mac[0] ^ 2
- *gid[9] = mac[1]
- *gid[10] = mac[2]
- *gid[11] = ff(VLAN ID high byte (4 MS bits))
- *gid[12] = fe(VLAN ID low byte)
- *gid[13] = mac[3]
- *gid[14] = mac[4]
- *gid[15] = mac[5]
- */
-static void hns_roce_addrconf_ifid_eui48(u8 *eui, u16 vlan_id,
-struct net_device *dev)
-{
-   memcpy(eui, dev->dev_addr, 3);
-   memcpy(eui + 5, dev->dev_addr + 3, 3);
-   if (vlan_id < 0x1000) {
-   eui[3] = vlan_id >> 8;
-   eui[4] = vlan_id & 0xff;
-   } else {
-   eui[3] = 0xff;
-   eui[4] = 0xfe;
-   }
-   eui[0] ^= 2;
-}
-
-static void hns_roce_make_default_gid(struct net_device *dev, union ib_gid 
*gid)
-{
-   memset(gid, 0, sizeof(*gid));
-   gid->raw[0] = 0xFE;
-   gid->raw[1] = 0x80;
-   hns_roce_addrconf_ifid_eui48(&gid->raw[8], 0x, dev);
-}
-
-/**
  * hns_get_gid_index - Get gid index.
  * @hr_dev: pointer to structure hns_roce_dev.
  * @port:  port, value range: 0 ~ MAX
@@ -96,30 +57,6 @@ int hns_get_gid_index(struct hns_roce_dev *hr_dev, u8 port, 
int gid_index)
return gid_index * hr_dev->caps.num_ports + port;
 }
 
-static int hns_roce_set_gid(struct hns_roce_dev *hr_dev, u8 port, int 
gid_index,
-union ib_gid *gid)
-{
-   struct device *dev = &hr_dev->pdev->dev;
-   u8 gid_idx = 0;
-
-   if (gid_index >= hr_dev->caps.gid_table_len[port]) {
-   dev_err(dev, "gid_index %d illegal, port %d gid range: 0~%d\n",
-   gid_index, port, hr_dev->caps.gid_table_len[port] - 1);
-   return -EINVAL;
-   }
-
-   gid_idx = hns_get_gid_index(hr_dev, port, gid_index);
-
-   if (!memcmp(gid, &hr_dev->iboe.gid_table[gid_idx], sizeof(*gid)))
-   return -EINVAL;
-
-   memcpy(&hr_dev->iboe.gid_table[gid_idx], gid, sizeof(*gid));
-
-   hr_dev->hw->set_gid(hr_dev, port, gid_index, gid);
-
-   return 0;
-}
-
 static void hns_roce_set_mac(struct hns_roce_dev *hr_dev, u8 port, u8 *addr)
 {
u8 phy_port;
@@ -147,15 +84,44 @@ static void hns_roce_set_mtu(struct hns_roce_dev *hr_dev, 
u8 port, int mtu)
hr_dev->hw->set_mtu(hr_dev, phy_port, tmp);
 }
 
-static void hns_roce_update_gids(struct hns_roce_dev *hr_dev, int port)
+static int hns_roce_add_gid(struct ib_device *device, u8 port_num,
+   unsigned int index, const union ib_gid *gid,
+   const struct ib_gid_attr *attr, void **context)
+{
+   struct hns_roce_dev *hr_dev = to_hr_dev(device);
+   u8 port = port_num - 1;
+   unsigned long flags;
+
+   if (port >= hr_dev->caps.num_ports)
+   return -EINVAL;
+
+   spin_lock_irqsave(&hr_dev->iboe.lock, flags);
+
+   hr_dev->hw->set_gid(hr_dev, port, index, (union ib_gid *)gid);
+
+   spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
+
+   return 0;
+}
+
+static int

Re: [PATCH net 1/2] r8152: fix the sw rx checksum is unavailable

2016-11-23 Thread Mark Lord

On 16-11-23 10:12 AM, Hayes Wang wrote:

Mark Lord [ml...@pobox.com]
[...]

What does this code do:



static void r8153_set_rx_early_size(struct r8152 *tp)
{
   u32 mtu = tp->netdev->mtu;
   u32 ocp_data = (agg_buf_sz - mtu - VLAN_ETH_HLEN - VLAN_HLEN) / 4;

   ocp_write_word(tp, MCU_TYPE_USB, USB_RX_EARLY_SIZE, ocp_data);
}


This only applies to the RTL8153; however, you are using the RTL8152.
It works like a delayed completion: it reduces the CPU load by letting
each transfer carry more data, which reduces the number of
transfers.


How is ocp_data used by the hardware?
Shouldn't the calculation also include sizeof(rx_desc) in there somewhere?


The algorithm is from our hw engineers, and it should be

   (agg_buf_sz - packet size) / 8

You could refer to commit a59e6d815226 ("r8152: correct the rx early size").


Thanks.
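
As a worked example of the two formulas quoted above, the following
sketch computes the register value for both divisors. It is only an
illustration: the function name and the SKETCH_* constants are
hypothetical, the header sizes (18 and 4 bytes) are the usual values of
VLAN_ETH_HLEN and VLAN_HLEN, and the buffer size and MTU passed in are
up to the caller.

```c
#include <stdint.h>

#define SKETCH_VLAN_ETH_HLEN	18	/* Ethernet header plus one VLAN tag */
#define SKETCH_VLAN_HLEN	4

/* (agg_buf_sz - packet size) / divisor, using integer division as in the driver. */
static uint32_t rx_early_size(uint32_t agg_buf_sz, uint32_t mtu, uint32_t divisor)
{
	uint32_t pkt = mtu + SKETCH_VLAN_ETH_HLEN + SKETCH_VLAN_HLEN;

	return (agg_buf_sz - pkt) / divisor;
}
```

Assuming a 16384-byte aggregation buffer and an MTU of 1500, the
divide-by-4 form gives 3715 and the divide-by-8 form gives 1857.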

Right now I am working quite hard trying to narrow things down exactly.
You are correct that the driver does appear to be careful about accesses
beyond the filled portion of a URB buffer -- for some reason I thought
the original driver had issues there, but looking again it does not seem to.

One idea that is now looking more likely:
Things could be suffering from speculative CPU accesses to RAM
(the system here has non-coherent d-cache/RAM).
This could incorrectly pre-load data from adjacent URB buffers
into the d-cache, creating coherency issues.  I am testing now
with cacheline-sized guard zones between the buffers to see if
that is the issue or not.

Worth repeating: other dongles we have tried, eg. those using the asix driver,
do not cause us any troubles here.  Only the r8152 dongles do.

The other drivers do not use hardware checksums, so even if they did
incur similar bad packets, whatever the reason, those bad packets
would be detected/rejected by the Linux network stack (software checksums).
So everything appears to behave fine with them, as it does with
the r8152 driver when hardware checksums are disabled.

Still trying to understand exactly how these errors are happening.
It takes a very long time to do a conclusive test of anything here,
and I only have the hardware for a day or two a week.
So my apologies if I am slow in getting back to you on stuff.

Cheers





Re: [PATCH net-next] net/sched: cls_flower: verify root pointer before dereferncing it

2016-11-23 Thread Cong Wang
On Wed, Nov 23, 2016 at 3:29 AM, Daniel Borkmann  wrote:
>
> Can't we drop the 'force' parameter from tcf_destroy() and related cls
> destroy() callbacks, and change the logic roughly like this:
>
> [...]
> case RTM_DELTFILTER:
> err = tp->ops->delete(tp, fh, &drop_tp);
> if (err == 0) {
> struct tcf_proto *next = rtnl_dereference(tp->next);
>
> tfilter_notify(net, skb, n, tp,
>t->tcm_handle,
>RTM_DELTFILTER, false);
> if (drop_tp) {
> RCU_INIT_POINTER(*back, next);
> tcf_destroy(tp);
> }
> }
> goto errout;
> [...]
>
> This one was the only tcf_destroy() instance with force=false. Why can't
> the prior delete() callback make the decision whether the tp now has no
> further internal filters and thus can be dropped. Afaik, delete() and
> destroy() are protected by RTNL anyway. Thus, we could unlink the tp from
> the list before tcf_destroy(), which should then work with grace period
> as well. Given we remove the setting of tp->root to NULL, any outstanding
> readers for that grace period should either still execute the 'scheduled
> for removal' filter we just dropped, or find an empty list of filters.

This is exactly why I said "the semantic of ->destroy() needs to revise too".
This is a reasonable revision of course, but the change is still large because
we need to move that logic from ->destroy() to ->delete(). I was trying to find
a relatively small fix for -net and -stable; for -net-next we could make an
aggressive change as long as it's necessary. This is why I am still thinking
about it; perhaps there is no quick fix for this bug.


>
>> Hmm, perhaps we really have to switch to a doubly-linked list, that is
>> list_head. I need to double check. And also the semantic of ->destroy()
>> needs to revise too.
>
>
> Can you elaborate why double-linked list? Isn't the tp list always protected
> from modifications via RTNL in control path, and walked via
> rcu_dereference_bh() in data path?

At least two benefits we can get from using doubly-linked list:

1) No need to pass a 'prev' pointer if we want to remove tp in a RCU callback,
list_del_rcu(&tp->head) is just enough.

2) No need to worry about RCU pointers because list_head has RCU API's
already, much more readable to me.

Of course, the size of struct tcf_proto will grow a bit, but it doesn't seem to
be a problem.
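
Point 1) can be illustrated with a plain user-space doubly-linked list.
This is only a sketch: the dl_* names are hypothetical, and there is no
RCU here, so it shows why no 'prev' pointer has to be passed around, not
how list_del_rcu() orders its stores.

```c
#include <stddef.h>

struct dl_node {
	struct dl_node *next;
	struct dl_node *prev;
};

/* An empty circular list points at itself. */
static void dl_init(struct dl_node *head)
{
	head->next = head;
	head->prev = head;
}

static void dl_add_tail(struct dl_node *n, struct dl_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Removal needs only the node itself; its neighbours are reachable from it. */
static void dl_del(struct dl_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}
```

With a singly-linked list the caller would instead have to carry the
address of the previous node's next pointer, which is what the 'back'
variable does in the RTM_DELTFILTER snippet quoted above.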


Re: [PATCH net-next 1/2] openvswitch: Add a missing break statement.

2016-11-23 Thread Pravin Shelar
On Tue, Nov 22, 2016 at 8:09 PM, Jarno Rajahalme  wrote:
> Add a break statement to prevent fall-through from
> OVS_KEY_ATTR_ETHERNET to OVS_KEY_ATTR_TUNNEL.  Without the break
> actions setting ethernet addresses fail to validate with log messages
> complaining about invalid tunnel attributes.
>
> Fixes: 0a6410fbde ("openvswitch: netlink: support L3 packets")
> Signed-off-by: Jarno Rajahalme 
> ---
>  net/openvswitch/flow_netlink.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index d19044f..c87d359 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -2195,6 +2195,7 @@ static int validate_set(const struct nlattr *a,
> case OVS_KEY_ATTR_ETHERNET:
> if (mac_proto != MAC_PROTO_ETHERNET)
> return -EINVAL;
> +   break;
>
> case OVS_KEY_ATTR_TUNNEL:
> if (masked)

Thanks for tracking it down.

Acked-by: Pravin B Shelar 
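
For readers following along, the fall-through can be reproduced with a
small stand-alone sketch. The enum, the two function names and the
simplified checks are hypothetical and only mirror the shape of the
switch in the quoted hunk, not the real validate_set().

```c
#include <errno.h>

enum sketch_attr { SKETCH_ATTR_ETHERNET, SKETCH_ATTR_TUNNEL };

/* Without the break, an Ethernet attribute also runs the tunnel check. */
static int validate_buggy(enum sketch_attr type, int is_ethernet, int masked)
{
	switch (type) {
	case SKETCH_ATTR_ETHERNET:
		if (!is_ethernet)
			return -EINVAL;
		/* fall through - this is the missing break */
	case SKETCH_ATTR_TUNNEL:
		if (masked)
			return -EINVAL;
		break;
	}
	return 0;
}

/* With the break, each attribute is checked only against its own rules. */
static int validate_fixed(enum sketch_attr type, int is_ethernet, int masked)
{
	switch (type) {
	case SKETCH_ATTR_ETHERNET:
		if (!is_ethernet)
			return -EINVAL;
		break;
	case SKETCH_ATTR_TUNNEL:
		if (masked)
			return -EINVAL;
		break;
	}
	return 0;
}
```

In this model a masked Ethernet set is rejected by validate_buggy() for
a reason that belongs to the tunnel case, and accepted by
validate_fixed().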


Re: [patch net-next v2 10/11] mlxsw: spectrum_router: Request a dump of FIB tables during init

2016-11-23 Thread Ido Schimmel
On Wed, Nov 23, 2016 at 06:08:23PM +0100, Hannes Frederic Sowa wrote:
> On Wed, Nov 23, 2016, at 18:04, Jiri Pirko wrote:
> > >Sure, but an abort function can be provided to the kernel anyway and the
> > >driver can care about that.
> > 
> > Ok, how?
> 
> I think just a sysctl ontop of this series is enough plus a pr_warn.
> Rocker and mlxsw are responsible to loop for a maximum amount of time.

Maybe, when the module requests a dump it can also provide a callback
that is invoked following each failed dump?


[PATCH 2/2] net: dsa: mv88e6xxx: enable EDSA on mv88e6097

2016-11-23 Thread Stefan Eichenberger
EDSA is currently disabled on mv88e6097 devices, this commit enables it.

Signed-off-by: Stefan Eichenberger 
---
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index ab52c37..a2ff1fc 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -543,7 +543,8 @@ enum mv88e6xxx_cap {
 MV88E6XXX_FLAGS_MULTI_CHIP)
 
 #define MV88E6XXX_FLAGS_FAMILY_6097\
-   (MV88E6XXX_FLAG_G1_ATU_FID |\
+   (MV88E6XXX_FLAG_EDSA |  \
+MV88E6XXX_FLAG_G1_ATU_FID |\
 MV88E6XXX_FLAG_G1_VTU_FID |\
 MV88E6XXX_FLAG_GLOBAL2 |   \
 MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
-- 
2.9.3



Re: [PATCH 2/2] net: dsa: mv88e6xxx: enable EDSA on mv88e6097

2016-11-23 Thread Vivien Didelot
Hi Stefan,

Stefan Eichenberger  writes:

> EDSA is currently disabled on mv88e6097 devices, this commit enables it.
>
> Signed-off-by: Stefan Eichenberger 

Reviewed-by: Vivien Didelot 

(you can include our Reviewed-by tags directly in the commit message of
this patch for v3, right under your Signed-off-by tag.)

Thanks,

Vivien


Re: [PATCH v2] net: dsa: mv88e6xxx: forward unknown mc packets on mv88e6097

2016-11-23 Thread Vivien Didelot
Hi Andrew,

Andrew Lunn  writes:

> On Wed, Nov 23, 2016 at 12:52:52PM -0500, Vivien Didelot wrote:
>> Hi Andrew,
>> 
>> Andrew Lunn  writes:
>> 
>> > And if you have a recent version of tcpdump, it will decode
>> > the header.
>> 
>> Since d729eb4, thanks to you Andrew ;-)
>> 
>> I move up the cleanup of ports setup in my priority list.
>
> Hi Vivien
>
> Please take a look at my mv88e6390 branch. I already refactored this
> code, because the mv88e6390 does something slightly different...
>
> I hope to post another batch of mv88e6390 patches soon, and they will
> include this cleanup. Since they will clash with these patches, i will
> post them first as RFC.

Perfect. Please split an RFC only including this cleanup if
possible. Fewer patches will be easier to review, since the first port
registers differs a lot.

Thanks,

Vivien


Re: [PATCH net-next 4/5] net: phy: bcm7xxx: Add support for downshift/Wirespeed

2016-11-23 Thread Florian Fainelli
On 11/23/2016 06:46 AM, Andrew Lunn wrote:
>>>> Maybe we should think about this locking a bit. It is normal for the
>>>> lock to be held when using ops in the phy driver structure. The
>>>> exception is suspend/resume. Maybe we should also take the lock before
>>>> calling the phydev->drv->get_tunable() and phydev->drv->set_tunable()?
>>>
>>> Yes, that certainly seems like a good approach to me, let me cook a
>>> patch doing that.
>>
>> Just for my understanding (such that I will not make the same mistake 
>> again)...
>>
>> Why is it that phy functions such as get_wol needs to take the phy_lock and
>> others like get_tunable does not.
>>
>> I do understand the arguments on why the lock should be held by the caller of
>> get_tunable, but I do not understand why the same argument does not apply for
>> get_wol.
> 
> Hi Allan
> 
> phy_ethtool_get_wol and friends probably should take the
> phy_lock. This inconsistency is probably leading to locking
> bugs. e.g. at803x_set_wol() does a read-modify-write, and does not
> take the lock.
> 
> There is no comment in the patch adding phy_ethtool_set_wol() to say
> why the lock is not taken, and a quick look at the code does not
> suggest a reason why it could not be taken/released by
> phy_ethtool_set_wol().

Yes, this should happen. I don't see what prevents two user-space
processes from racing with each other here; see for instance
mv643xx_eth_get_wol and cpsw_get_wol.

> 
> I think it would be a good idea to change this.
> 
> phy_suspend()/phy_resume() might have good reasons to avoid the lock,
> i've no idea how it is supposed to work. Is there a danger something
> else is holding the lock and has already been suspended? I guess not,
> otherwise there is little hope suspend would work at all.

phy_suspend() and phy_resume() usually get called after phy_disconnect()
or phy_stop() have been invoked, and even then this is during the
Ethernet driver's suspend/resume path, so there is no room for
concurrency to occur (user space is quiesced, and the PHY state machine
is stopped/halted), but still, if we were to change the calling context
it would be a good idea to acquire phydev->lock.
-- 
Florian


Re: [PATCH net-next 0/2] Add support for the MV88e6097

2016-11-23 Thread Vivien Didelot
Hi Stefan,

Stefan Eichenberger  writes:

> This patchset will add support for the MV88E6097 DSA switch and enable
> EDSA on MV88E6097 family devices.
>
> Stefan Eichenberger (2):
>   net: dsa: mv88e6xxx: add MV88E6097 switch
>   net: dsa: mv88e6xxx: enable EDSA on mv88e6097
>
>  drivers/net/dsa/mv88e6xxx/chip.c  | 26 ++
>  drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  5 -
>  2 files changed, 30 insertions(+), 1 deletion(-)

Ideally I'd put 2/2 first, because right after 1/2 your switch won't
work as expected.

Thanks,

Vivien


Re: [PATCH net-next] net: properly flush delay-freed skbs

2016-11-23 Thread Jesper Dangaard Brouer
On Wed, 23 Nov 2016 09:12:50 -0800
Alexander Duyck  wrote:

> On Wed, Nov 23, 2016 at 8:44 AM, Eric Dumazet  wrote:
> > From: Eric Dumazet 
> >
> > Typical NAPI drivers use napi_consume_skb(skb) at TX completion time.
> > This put skb in a percpu special queue, napi_alloc_cache, to get bulk
> > frees.
> >
> > It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE
> > limit quite often, with skbs that were queued hundreds of usec earlier.
> > I measured this can take ~6000 nsec to perform one flush.
> >
> > __kfree_skb_flush() can be called from two points right now :
> >
> > 1) From net_tx_action(), but only for skbs that were queued to
> > sd->completion_queue.
> >  
> >  -> Irrelevant for NAPI drivers in normal operation.  
> >
> > 2) From net_rx_action(), but only under high stress or if RPS/RFS has a
> > pending action.
> >
> > This patch changes net_rx_action() to perform the flush in all cases and
> > after more urgent operations happened (like kicking remote CPUS for
> > RPS/RFS).
> >
> > Signed-off-by: Eric Dumazet 
> > Cc: Jesper Dangaard Brouer 
> > Cc: Alexander Duyck 
> > ---  
> 
> Yeah, we didn't intend the data to be sitting around that long.  The
> change looks good to me.
> 
> Acked-by: Alexander Duyck 

Also looks good to me! Thanks for catching this.

Acked-by: Jesper Dangaard Brouer 

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [PATCH 1/2] net: dsa: mv88e6xxx: add MV88E6097 switch

2016-11-23 Thread Andrew Lunn
> + [MV88E6097] = {
> + .prod_num = PORT_SWITCH_ID_PROD_NUM_6097,
> + .family = MV88E6XXX_FAMILY_6097,
> + .name = "Marvell 88E6097/88E6097F",
> + .num_databases = 4096,
> + .num_ports = 11,
> + .port_base_addr = 0x10,
> + .global1_addr = 0x1b,
> + .age_time_coeff = 15000,
> + .flags = MV88E6XXX_FLAGS_FAMILY_6097,
> + .ops = &mv88e6097_ops,

Upps. Sorry, i missed something when you rebased onto net-next. You
are missing .g1_irqs = . It is probably 9. You can check the
datasheet, global 1, register 0. If bit 8 is AVBInt, you need 9. If
bit 8 is reserved, then 8.

Andrew


Re: [PATCH 2/2] net: dsa: mv88e6xxx: enable EDSA on mv88e6097

2016-11-23 Thread Andrew Lunn
On Wed, Nov 23, 2016 at 06:55:46PM +0100, Stefan Eichenberger wrote:
> EDSA is currently disabled on mv88e6097 devices, this commit enables it.
> 
> Signed-off-by: Stefan Eichenberger 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH v2] net: dsa: mv88e6xxx: forward unknown mc packets on mv88e6097

2016-11-23 Thread Andrew Lunn
On Wed, Nov 23, 2016 at 12:52:52PM -0500, Vivien Didelot wrote:
> Hi Andrew,
> 
> Andrew Lunn  writes:
> 
> > And if you have a recent version of tcpdump, it will decode
> > the header.
> 
> Since d729eb4, thanks to you Andrew ;-)
> 
> I move up the cleanup of ports setup in my priority list.

Hi Vivien

Please take a look at my mv88e6390 branch. I already refactored this
code, because the mv88e6390 does something slightly different...

I hope to post another batch of mv88e6390 patches soon, and they will
include this cleanup. Since they will clash with these patches, I will
post them first as RFC.

  Andrew


[PATCH 1/2] net: dsa: mv88e6xxx: add MV88E6097 switch

2016-11-23 Thread Stefan Eichenberger
Add support for the MV88E6097 switch. The change was tested on an Armada
based platform with a MV88E6097 switch.

Signed-off-by: Stefan Eichenberger 
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 26 ++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  2 ++
 2 files changed, 28 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index bada646..b14b3d5 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3209,6 +3209,19 @@ static const struct mv88e6xxx_ops mv88e6095_ops = {
 	.stats_get_stats = mv88e6095_stats_get_stats,
 };
 
+static const struct mv88e6xxx_ops mv88e6097_ops = {
+	.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+	.phy_read = mv88e6xxx_g2_smi_phy_read,
+	.phy_write = mv88e6xxx_g2_smi_phy_write,
+	.port_set_link = mv88e6xxx_port_set_link,
+	.port_set_duplex = mv88e6xxx_port_set_duplex,
+	.port_set_speed = mv88e6185_port_set_speed,
+	.stats_snapshot = mv88e6xxx_g1_stats_snapshot,
+	.stats_get_sset_count = mv88e6095_stats_get_sset_count,
+	.stats_get_strings = mv88e6095_stats_get_strings,
+	.stats_get_stats = mv88e6095_stats_get_stats,
+};
+
 static const struct mv88e6xxx_ops mv88e6123_ops = {
 	/* MV88E6XXX_FAMILY_6165 */
 	.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
@@ -3580,6 +3593,19 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
 		.ops = &mv88e6095_ops,
 	},
 
+	[MV88E6097] = {
+		.prod_num = PORT_SWITCH_ID_PROD_NUM_6097,
+		.family = MV88E6XXX_FAMILY_6097,
+		.name = "Marvell 88E6097/88E6097F",
+		.num_databases = 4096,
+		.num_ports = 11,
+		.port_base_addr = 0x10,
+		.global1_addr = 0x1b,
+		.age_time_coeff = 15000,
+		.flags = MV88E6XXX_FLAGS_FAMILY_6097,
+		.ops = &mv88e6097_ops,
+	},
+
 	[MV88E6123] = {
 		.prod_num = PORT_SWITCH_ID_PROD_NUM_6123,
 		.family = MV88E6XXX_FAMILY_6165,
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index 9298faa..ab52c37 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -81,6 +81,7 @@
 #define PORT_SWITCH_ID 0x03
 #define PORT_SWITCH_ID_PROD_NUM_6085   0x04a
 #define PORT_SWITCH_ID_PROD_NUM_6095   0x095
+#define PORT_SWITCH_ID_PROD_NUM_6097   0x099
 #define PORT_SWITCH_ID_PROD_NUM_6131   0x106
 #define PORT_SWITCH_ID_PROD_NUM_6320   0x115
 #define PORT_SWITCH_ID_PROD_NUM_6123   0x121
@@ -378,6 +379,7 @@
 enum mv88e6xxx_model {
MV88E6085,
MV88E6095,
+   MV88E6097,
MV88E6123,
MV88E6131,
MV88E6161,
-- 
2.9.3



[PATCH net-next 0/2] Add support for the MV88e6097

2016-11-23 Thread Stefan Eichenberger
This patchset will add support for the MV88E6097 DSA switch and enable
EDSA on MV88E6097 family devices.

Stefan Eichenberger (2):
  net: dsa: mv88e6xxx: add MV88E6097 switch
  net: dsa: mv88e6xxx: enable EDSA on mv88e6097

 drivers/net/dsa/mv88e6xxx/chip.c  | 26 ++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  5 -
 2 files changed, 30 insertions(+), 1 deletion(-)

-- 
2.9.3



Re: [PATCH v2] net: dsa: mv88e6xxx: forward unknown mc packets on mv88e6097

2016-11-23 Thread Vivien Didelot
Hi Andrew,

Andrew Lunn  writes:

> And if you have a recent version of tcpdump, it will decode
> the header.

Since d729eb4, thanks to you Andrew ;-)

I will move the cleanup of the ports setup up in my priority list. The code is
quite cluttered at the moment and it's hard to read through it. We need
proper helpers for egress floods, (E)DSA setup, etc. like what is being
done for the other devices.

Thanks,

Vivien


Re: [PATCH v2] net: dsa: mv88e6xxx: forward unknown mc packets on mv88e6097

2016-11-23 Thread Stefan Eichenberger
On Wed, Nov 23, 2016 at 06:32:30PM +0100, Andrew Lunn wrote:
> On Wed, Nov 23, 2016 at 06:14:41PM +0100, Stefan Eichenberger wrote:
> > On Wed, Nov 23, 2016 at 05:59:49PM +0100, Andrew Lunn wrote:
> > > On Wed, Nov 23, 2016 at 05:54:40PM +0100, Stefan Eichenberger wrote:
> > > > Packets with unknown destination addresses are not forwarded to the cpu
> > > > port on mv88e6097 based switches (e.g. MV88E6097) at the moment. This
> > > > commit enables PORT_CONTROL_FORWARD_UNKNOWN_MC for this family.
> > > 
> > > Please try adding MV88E6XXX_FLAG_EDSA to
> > > MV88E6XXX_FLAGS_FAMILY_6097. That is the better fix if it works.
> > 
> > I was even wondering what EDSA means:) Thanks this solved the problem!
> 
> Great.
> 
> We should fix up a few minor issues and resubmit.
> 
> What is the status of the first patch, which added 6097 to the driver?
> I don't think David accepted it yet. So let's make one patchset
> containing the two patches.
> 
> The subject line of the patches needs to have net-next in it, e.g.
> 
> [PATCH net-next 0/2] Add support for the MV88e6097
> 
> Include a cover note, saying what the patchset as a whole does.
> This gets used as the merge commit message.
> 
> Then the two patches.

Perfect, thanks a lot for the help! The patchset will follow.

Thanks
Stefan


[PATCH net-next] mlx4: do not use priv->stats_lock in mlx4_en_auto_moderation()

2016-11-23 Thread Eric Dumazet
From: Eric Dumazet 

The per-RX-ring packets/bytes counters are not protected by the global
priv->stats_lock.

Better not to confuse the reader, and use READ_ONCE() to show that we
read these counters without surrounding synchronization.

Interrupt moderation is best effort, and we do not really care about
ultra-precise counters.
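
For readers unfamiliar with the idiom, here is a minimal user-space
sketch of a lockless counter read. READ_ONCE_SKETCH() is only a
stand-in for the kernel macro (it relies on the GCC/Clang __typeof__
extension), and struct rx_ring_sketch and rx_pkt_diff() are
hypothetical, simplified names rather than the real mlx4 code:

```c
/* User-space stand-in for the kernel's READ_ONCE(): force a single
 * load through a volatile-qualified lvalue. Needs __typeof__, a
 * GCC/Clang extension.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, simplified ring counters (not the real mlx4 structs). */
struct rx_ring_sketch {
	unsigned long packets;
	unsigned long bytes;
};

/* Packets received since the previous moderation pass. Unsigned
 * subtraction keeps the delta correct across counter wrap-around.
 */
static unsigned long rx_pkt_diff(struct rx_ring_sketch *ring,
				 unsigned long last_packets)
{
	unsigned long rx_packets = READ_ONCE_SKETCH(ring->packets);

	return rx_packets - last_packets;
}
```

The read may observe a value that is slightly stale, which is
acceptable for a best-effort heuristic such as interrupt moderation.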

Signed-off-by: Eric Dumazet 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 9a807e93c9fdd81e61e561208aa1480a244d0bdb..b964bdcd4ae509a7e693215e8b32f040218e252c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1391,10 +1391,8 @@ static void mlx4_en_auto_moderation(struct mlx4_en_priv *priv)
 		return;
 
 	for (ring = 0; ring < priv->rx_ring_num; ring++) {
-		spin_lock_bh(&priv->stats_lock);
-		rx_packets = priv->rx_ring[ring]->packets;
-		rx_bytes = priv->rx_ring[ring]->bytes;
-		spin_unlock_bh(&priv->stats_lock);
+		rx_packets = READ_ONCE(priv->rx_ring[ring]->packets);
+		rx_bytes = READ_ONCE(priv->rx_ring[ring]->bytes);
 
 		rx_pkt_diff = ((unsigned long) (rx_packets -
 				priv->last_moder_packets[ring]));




Re: [patch net-next v2 09/11] ipv4: fib: Add an API to request a FIB dump

2016-11-23 Thread Hannes Frederic Sowa
On 23.11.2016 15:34, Jiri Pirko wrote:
> From: Ido Schimmel 
> 
> Commit b90eb7549499 ("fib: introduce FIB notification infrastructure")
> introduced a new notification chain to notify listeners (f.e., switchdev
> drivers) about addition and deletion of routes.
> 
> However, upon registration to the chain the FIB tables can already be
> populated, which means potential listeners will have an incomplete view
> of the tables.
> 
> Solve that by adding an API to request a FIB dump. The dump itself is
> done using RCU in order not to starve consumers that need RTNL to make
> progress.
> 
> For each net namespace the integrity of the dump is ensured by reading
> the atomic change sequence counter before and after the dump. This
> allows us to avoid the problematic situation in which the dumping
> process sends a ENTRY_ADD notification following ENTRY_DEL generated by
> another process holding RTNL.
> 
> Signed-off-by: Ido Schimmel 
> Signed-off-by: Jiri Pirko 
> ---
>  include/net/ip_fib.h |   1 +
>  net/ipv4/fib_trie.c  | 117 +++
>  2 files changed, 118 insertions(+)
> 
> diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> index 6c67b93..c76303e 100644
> --- a/include/net/ip_fib.h
> +++ b/include/net/ip_fib.h
> @@ -221,6 +221,7 @@ enum fib_event_type {
>   FIB_EVENT_RULE_DEL,
>  };
>  
> +bool fib_notifier_dump(struct notifier_block *nb);
>  int register_fib_notifier(struct notifier_block *nb);
>  int unregister_fib_notifier(struct notifier_block *nb);
>  int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index b1d2d09..9770edfe 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -86,6 +86,67 @@
>  
>  static ATOMIC_NOTIFIER_HEAD(fib_chain);
>  
> +static int call_fib_notifier(struct notifier_block *nb, struct net *net,
> +  enum fib_event_type event_type,
> +  struct fib_notifier_info *info)
> +{
> + info->net = net;
> + return nb->notifier_call(nb, event_type, info);
> +}
> +
> +static void fib_rules_notify(struct net *net, struct notifier_block *nb,
> +  enum fib_event_type event_type)
> +{
> +#ifdef CONFIG_IP_MULTIPLE_TABLES
> + struct fib_notifier_info info;
> +
> + if (net->ipv4.fib_has_custom_rules)
> + call_fib_notifier(nb, net, event_type, &info);
> +#endif
> +}
> +
> +static void fib_notify(struct net *net, struct notifier_block *nb,
> +enum fib_event_type event_type);
> +
> +static int call_fib_entry_notifier(struct notifier_block *nb, struct net *net,
> +enum fib_event_type event_type, u32 dst,
> +int dst_len, struct fib_info *fi,
> +u8 tos, u8 type, u32 tb_id, u32 nlflags)
> +{
> + struct fib_entry_notifier_info info = {
> + .dst = dst,
> + .dst_len = dst_len,
> + .fi = fi,
> + .tos = tos,
> + .type = type,
> + .tb_id = tb_id,
> + .nlflags = nlflags,
> + };
> + return call_fib_notifier(nb, net, event_type, &info.info);
> +}
> +
> +bool fib_notifier_dump(struct notifier_block *nb)
> +{
> + struct net *net;
> + bool ret = true;



> + rcu_read_lock();
> + for_each_net_rcu(net) {
> + int fib_seq = atomic_read(&net->ipv4.fib_seq);
> +
> + fib_rules_notify(net, nb, FIB_EVENT_RULE_ADD);
> + fib_notify(net, nb, FIB_EVENT_ENTRY_ADD);
> + if (atomic_read(&net->ipv4.fib_seq) != fib_seq) {
> + ret = false;
> + goto out_unlock;
> + }

Hmm, I think you need to read the sequence counter under rtnl_lock to
have an ordering with the rest of the updates to the RCU trie. Otherwise
you don't know whether the fib trie has the correct view with regard to
the incoming notifications as a whole. This is also necessary during
restarts.

You could also try to register the notifier after the dump and check the
sequence number after registering the notifier; maybe that is easier
(a restart would then unregister and do the same again).
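
To make the before/after check concrete, here is a small user-space
sketch using C11 atomics. Everything in it (struct table_sketch,
table_dump() and the two dump callbacks) is hypothetical and only
illustrates the "read counter, dump, re-read counter" pattern from the
patch description. It does not model RTNL or RCU, so it says nothing
about the ordering concern above:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one namespace's FIB; not a kernel type. */
struct table_sketch {
	atomic_uint seq;	/* bumped by a writer on every change */
	int nr_routes;
};

/* Writer side: change the table, then advance the change counter. */
static void table_add_route(struct table_sketch *t)
{
	t->nr_routes++;
	atomic_fetch_add(&t->seq, 1);
}

/* Dump callback that sees a quiet table. */
static void dump_quiet(struct table_sketch *t, int *seen)
{
	*seen = t->nr_routes;
}

/* Dump callback that simulates a writer slipping in mid-dump. */
static void dump_raced(struct table_sketch *t, int *seen)
{
	*seen = t->nr_routes;
	table_add_route(t);
}

/* Returns true only if no change was recorded while dump() ran. */
static bool table_dump(struct table_sketch *t,
		       void (*dump)(struct table_sketch *, int *), int *seen)
{
	unsigned int before = atomic_load(&t->seq);

	dump(t, seen);

	return atomic_load(&t->seq) == before;
}
```

A caller that gets false back would discard what it collected and
retry the dump.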

Bye,
Hannes



Re: [PATCH v2] net: dsa: mv88e6xxx: forward unknown mc packets on mv88e6097

2016-11-23 Thread Andrew Lunn
On Wed, Nov 23, 2016 at 06:14:41PM +0100, Stefan Eichenberger wrote:
> On Wed, Nov 23, 2016 at 05:59:49PM +0100, Andrew Lunn wrote:
> > On Wed, Nov 23, 2016 at 05:54:40PM +0100, Stefan Eichenberger wrote:
> > > Packets with unknown destination addresses are not forwarded to the cpu
> > > port on mv88e6097 based switches (e.g. MV88E6097) at the moment. This
> > > commit enables PORT_CONTROL_FORWARD_UNKNOWN_MC for this family.
> > 
> > Please try adding MV88E6XXX_FLAG_EDSA to
> > MV88E6XXX_FLAGS_FAMILY_6097. That is the better fix if it works.
> 
> I was even wondering what EDSA means:) Thanks this solved the problem!

Plain DSA puts four bytes of header between the MAC source address and
the EtherType/Length.

EDSA puts in an 8 byte header, and includes an Ethertype value of
0xdada. Having that ethertype value makes it more obvious what is
going on. And if you have a recent version of tcpdump, it will decode
the header.
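
As an illustration of the two layouts, here is a small user-space
sketch. The names are made up for this example and are not the
kernel's tagging code; it assumes an untagged Ethernet frame where the
tag starts right after the two 6-byte MAC addresses:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_ADDRS_LEN_SKETCH	12	/* destination + source MAC */
#define DSA_HLEN_SKETCH		4	/* plain DSA tag */
#define EDSA_HLEN_SKETCH	8	/* EDSA tag, starts with 0xdada */
#define ETH_P_EDSA_SKETCH	0xdada

/* True if the two bytes after the MAC addresses carry the EDSA
 * EtherType. A plain DSA tag has no such marker, so the receiver has
 * to know the tagging mode from configuration.
 */
static bool frame_has_edsa_tag(const uint8_t *frame, size_t len)
{
	if (len < MAC_ADDRS_LEN_SKETCH + 2)
		return false;

	return ((frame[12] << 8) | frame[13]) == ETH_P_EDSA_SKETCH;
}

/* Byte offset of the original EtherType/Length field once the tag is
 * skipped.
 */
static size_t inner_ethertype_offset(bool edsa)
{
	return MAC_ADDRS_LEN_SKETCH +
	       (edsa ? EDSA_HLEN_SKETCH : DSA_HLEN_SKETCH);
}
```

With these constants the original EtherType sits at byte 16 for plain
DSA and at byte 20 for EDSA.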

Andrew


Re: [PATCH v2] cpsw: ethtool: add support for getting/setting EEE registers

2016-11-23 Thread Florian Fainelli
On 11/23/2016 06:38 AM, yegorsli...@googlemail.com wrote:
> From: Yegor Yefremov 
> 
> Add the ability to query and set Energy Efficient Ethernet parameters
> via ethtool for applicable devices.

Are you sure this is enough to actually enable EEE? I don't see where
phy_init_eee() is called here, nor is the cpsw Ethernet controller part
configured to enable/disable EEE. EEE is not just a PHY thing, it
usually also needs to be configured properly at the Ethernet MAC/switch
level as well.

Just curious here.
-- 
Florian


Re: [PATCH v2] net: dsa: mv88e6xxx: forward unknown mc packets on mv88e6097

2016-11-23 Thread Andrew Lunn
On Wed, Nov 23, 2016 at 06:14:41PM +0100, Stefan Eichenberger wrote:
> On Wed, Nov 23, 2016 at 05:59:49PM +0100, Andrew Lunn wrote:
> > On Wed, Nov 23, 2016 at 05:54:40PM +0100, Stefan Eichenberger wrote:
> > > Packets with unknown destination addresses are not forwarded to the cpu
> > > port on mv88e6097 based switches (e.g. MV88E6097) at the moment. This
> > > commit enables PORT_CONTROL_FORWARD_UNKNOWN_MC for this family.
> > 
> > Please try adding MV88E6XXX_FLAG_EDSA to
> > MV88E6XXX_FLAGS_FAMILY_6097. That is the better fix if it works.
> 
> I was even wondering what EDSA means:) Thanks this solved the problem!

Great.

We should fix up a few minor issues and resubmit.

What is the status of the first patch, which added 6097 to the driver?
I don't think David accepted it yet. So let's make one patchset
containing the two patches.

The subject line of the patches needs to have net-next in it, e.g.

[PATCH net-next 0/2] Add support for the MV88e6097

Include a cover note, saying what the patchset as a whole does.
This gets used as the merge commit message.

Then the two patches.

When posting the patchset, please start a new thread. A new version of
a patchset or patch should be a new thread.

Thanks
Andrew

