Re: [PATCH net 0/4] net/mlx4_en: fix stats

2016-05-25 Thread David Miller
From: Eric Dumazet 
Date: Wed, 25 May 2016 09:50:35 -0700

> mlx4 has various bugs in its ndo_get_stats() and related functions.
> This patch series addresses the obvious issues.
> Remaining ones will be discussed later.

Series applied, thanks.


Re: [PATCH net] sctp: fix double EPs display in sctp_diag

2016-05-25 Thread David Miller
From: Xin Long 
Date: Thu, 26 May 2016 03:09:23 +0800

> We have this situation: that EP hash table, contains only the EPs
> that are listening, while the transports one, has the opposite.
> We have to traverse both to dump all.
> 
> But when we traverse the transports one we will also get EPs that are
> in the EP hash if they are listening. In this case, the EP is dumped
> twice.
> 
> We will fix it by checking if the endpoint that is in the endpoint
> hash table contains any ep->asoc in there, as it means we will also
> find it via transport hash, and thus we can/should skip it, depending
> on the filters used, like 'ss -l'.
> 
> Still, we should NOT skip it if the user is listing only listening
> endpoints, because then we are not traversing the transport hash.
> so we have to check idiag_states there also.
> 
> Signed-off-by: Xin Long 

Applied.
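
For illustration, the skip condition described in the quoted commit message
can be modelled in userspace roughly as below. This is only a sketch under
assumptions: should_skip_ep() and its has_asocs parameter are made-up names
standing in for the check on the endpoint's association list, and
TCPF_LISTEN is redefined locally with the value it has in the kernel
(1 << TCP_LISTEN, with TCP_LISTEN being 10).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's TCPF_LISTEN (1 << TCP_LISTEN). */
#define TCPF_LISTEN (1u << 10)

/*
 * Hypothetical helper: decide whether an endpoint found in the EP hash
 * should be skipped. It is skipped only when states other than LISTEN
 * were requested (so the transport hash is traversed too) and the
 * endpoint has associations (so it will be found there).
 */
static bool should_skip_ep(uint32_t idiag_states, bool has_asocs)
{
	return (idiag_states & ~TCPF_LISTEN) && has_asocs;
}

/* Exercise the three cases from the commit message. */
static void check_skip_rules(void)
{
	/* 'ss -l' style: only listening requested, never skip. */
	assert(!should_skip_ep(TCPF_LISTEN, true));
	/* Other states requested and asocs present: dumped via transports. */
	assert(should_skip_ep(TCPF_LISTEN | (1u << 1), true));
	/* No asocs: the EP hash is the only place to find it. */
	assert(!should_skip_ep(TCPF_LISTEN | (1u << 1), false));
}
```

Calling check_skip_rules() from a test program runs the three cases.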


Re: [PATCH] net: arc: trivial: Replace comma with a semicolon

2016-05-25 Thread David Miller
From: Marek Vasut 
Date: Thu, 26 May 2016 00:40:05 +0200

> Fix a typo in the driver, replace comma with a semicolon at the end
> of statement. While using comma is a legal C here and probably does
> not even generate compiler warning, it was unlikely the intention.
> 
> Signed-off-by: Marek Vasut 

Applied.
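
As a side note on why the comma compiled cleanly: in C, the comma operator
joins the two assignments into a single expression statement, so both still
execute in order. The sketch below uses a made-up struct demo_bus and
demo_read(), not the driver's real types, to show that both forms leave the
structure in the same state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the bus structure in the patch. */
struct demo_bus {
	const char *name;
	int (*read)(int reg);
};

static int demo_read(int reg)
{
	return reg + 1;
}

/* Comma form: one expression statement containing two assignments. */
static void init_with_comma(struct demo_bus *bus)
{
	bus->name = "Synopsys MII Bus",
	bus->read = demo_read;
}

/* Semicolon form: two separate statements, as intended. */
static void init_with_semicolon(struct demo_bus *bus)
{
	bus->name = "Synopsys MII Bus";
	bus->read = demo_read;
}

/* Both forms leave the structure in the same state. */
static void check_both_forms(void)
{
	struct demo_bus a = { NULL, NULL }, b = { NULL, NULL };

	init_with_comma(&a);
	init_with_semicolon(&b);
	assert(a.name != NULL && b.name != NULL);
	assert(a.read == b.read);
	assert(a.read(1) == 2);
}
```

The semicolon form is still preferable, since a later edit that inserts a
line between the two assignments would silently change what the comma
expression covers.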


Re: [PATCH] net: stmmac: Fix incorrect memcpy source memory

2016-05-25 Thread David Miller
From: Andrew Lunn 
Date: Thu, 26 May 2016 02:26:25 +0200

> On Thu, May 26, 2016 at 12:40:23AM +0200, Marek Vasut wrote:
>> The memcpy() currently copies mdio_bus_data into new_bus->irq, which
>> makes no sense, since the mdio_bus_data structure contains more than
>> just irqs. The code was likely supposed to copy mdio_bus_data->irqs
>> into the new_bus->irq instead, so fix this.
>> 
>> Signed-off-by: Marek Vasut 
>> Cc: David S. Miller 
>> Cc: Giuseppe Cavallaro 
>> Cc: Alexandre Torgue 
> 
> Reviewed-by: Andrew Lunn 

Applied and queued up for -stable, thanks.
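
To illustrate the kind of bug being fixed, here is a minimal userspace
sketch. The struct layouts, DEMO_MAX_ADDR and demo_copy_irqs() are invented
for the example and do not match the real stmmac or mdio structures; the
only point is that the memcpy() source has to be the IRQ table the
structure points to, not the structure itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ADDR 4	/* hypothetical; stands in for the real array size */

/* Hypothetical platform data: the IRQ table is only one of its members. */
struct demo_bus_data {
	unsigned int phy_mask;
	int *irqs;
	int probed_phy_irq;
};

struct demo_bus {
	int irq[DEMO_MAX_ADDR];
};

/* Copy the IRQ table itself, not the structure that points to it. */
static void demo_copy_irqs(struct demo_bus *bus,
			   const struct demo_bus_data *data)
{
	if (data->irqs)
		memcpy(bus->irq, data->irqs, sizeof(bus->irq));
}

static void check_copy(void)
{
	int table[DEMO_MAX_ADDR] = { 10, 11, 12, 13 };
	struct demo_bus_data data = { 0x1, table, 0 };
	struct demo_bus bus = { { 0 } };

	demo_copy_irqs(&bus, &data);
	assert(bus.irq[0] == 10 && bus.irq[3] == 13);
}
```

Passing data instead of data->irqs as the source would copy the bytes of
phy_mask and the pointer value into the IRQ array, which is what the
original code did.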


RE: [v10, 7/7] mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0

2016-05-25 Thread Yangbo Lu
Hi Uffe,

Could we merge this patchset? ...
We have been waiting a long time for Arnd's response...
 
Thanks a lot.


Best regards,
Yangbo Lu


> -----Original Message-----
> From: Yangbo Lu
> Sent: Friday, May 20, 2016 2:06 PM
> To: 'Scott Wood'; Arnd Bergmann; linux-arm-ker...@lists.infradead.org
> Cc: linuxppc-...@lists.ozlabs.org; Mark Rutland;
> devicet...@vger.kernel.org; ulf.hans...@linaro.org; Russell King; Bhupesh
> Sharma; netdev@vger.kernel.org; Joerg Roedel; Kumar Gala; linux-
> m...@vger.kernel.org; linux-ker...@vger.kernel.org; Yang-Leo Li;
> io...@lists.linux-foundation.org; Rob Herring; linux-...@vger.kernel.org;
> Claudiu Manoil; Santosh Shilimkar; Xiaobo Xie; linux-...@vger.kernel.org;
> Qiang Zhao
> Subject: RE: [v10, 7/7] mmc: sdhci-of-esdhc: fix host version for T4240-
> R1.0-R2.0
> 
> Hi Arnd,
> 
> Any comments?
> Please reply when you see the email since we want this eSDHC issue to be
> fixed soon.
> 
> All the patches are Freescale-specific and fix the eSDHC driver.
> Could we get them merged first, if you were talking about a new way of
> abstracting how the SoC version is obtained?
> 
> 
> Thanks :)
> 
> 
> Best regards,
> Yangbo Lu
> 



Re: usbnet: smsc95xx: fix link detection for disabled autonegotiation

2016-05-25 Thread Andrew Lunn
On Thu, May 26, 2016 at 04:06:47AM +0200, Christoph Fritz wrote:
> To detect link status up/down for connections where autonegotiation is
> explicitly disabled, we don't get an irq but need to poll the status
> register for link up/down detection.
> This patch adds a workqueue to poll for link status.

Did you consider using the phylib? It probably does the needed polling
already, and it looks like the functions needed to implement an MDIO
bus are already in place.

Andrew


usbnet: smsc95xx: fix link detection for disabled autonegotiation

2016-05-25 Thread Christoph Fritz
To detect link status up/down for connections where autonegotiation is
explicitly disabled, we don't get an irq but need to poll the status
register for link up/down detection.
This patch adds a workqueue to poll for link status.

Signed-off-by: Christoph Fritz 
---
 drivers/net/usb/smsc95xx.c | 51 ++
 1 file changed, 51 insertions(+)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index d9d2806..dc989a8 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -61,6 +61,8 @@
 #define SUSPEND_ALLMODES   (SUSPEND_SUSPEND0 | SUSPEND_SUSPEND1 | \
 SUSPEND_SUSPEND2 | SUSPEND_SUSPEND3)
 
+#define CARRIER_CHECK_DELAY (2 * HZ)
+
 struct smsc95xx_priv {
u32 mac_cr;
u32 hash_hi;
@@ -69,6 +71,9 @@ struct smsc95xx_priv {
spinlock_t mac_cr_lock;
u8 features;
u8 suspend_flags;
+   bool link_ok;
+   struct delayed_work carrier_check;
+   struct usbnet *dev;
 };
 
 static bool turbo_mode = true;
@@ -624,6 +629,44 @@ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
intdata);
 }
 
+static void set_carrier(struct usbnet *dev, bool link)
+{
+   struct smsc95xx_priv *pdata = (struct smsc95xx_priv *)(dev->data[0]);
+
+   if (pdata->link_ok == link)
+   return;
+
+   pdata->link_ok = link;
+
+   if (link)
+   usbnet_link_change(dev, 1, 0);
+   else
+   usbnet_link_change(dev, 0, 0);
+}
+
+static void check_carrier(struct work_struct *work)
+{
+   struct smsc95xx_priv *pdata = container_of(work, struct smsc95xx_priv,
+   carrier_check.work);
+   struct usbnet *dev = pdata->dev;
+   int ret;
+
+   if (pdata->suspend_flags != 0)
+   return;
+
+   ret = smsc95xx_mdio_read(dev->net, dev->mii.phy_id, MII_BMSR);
+   if (ret < 0) {
+   netdev_warn(dev->net, "Failed to read MII_BMSR\n");
+   return;
+   }
+   if (ret & BMSR_LSTATUS)
+   set_carrier(dev, 1);
+   else
+   set_carrier(dev, 0);
+
+   schedule_delayed_work(&pdata->carrier_check, CARRIER_CHECK_DELAY);
+}
+
 /* Enable or disable Tx & Rx checksum offload engines */
 static int smsc95xx_set_features(struct net_device *netdev,
netdev_features_t features)
@@ -1165,13 +1208,20 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
dev->net->flags |= IFF_MULTICAST;
dev->net->hard_header_len += SMSC95XX_TX_OVERHEAD_CSUM;
dev->hard_mtu = dev->net->mtu + dev->net->hard_header_len;
+
+   pdata->dev = dev;
+   INIT_DELAYED_WORK(&pdata->carrier_check, check_carrier);
+   schedule_delayed_work(&pdata->carrier_check, CARRIER_CHECK_DELAY);
+
return 0;
 }
 
 static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
 {
struct smsc95xx_priv *pdata = (struct smsc95xx_priv *)(dev->data[0]);
+
if (pdata) {
+   cancel_delayed_work(&pdata->carrier_check);
netif_dbg(dev, ifdown, dev->net, "free pdata\n");
kfree(pdata);
pdata = NULL;
@@ -1695,6 +1745,7 @@ static int smsc95xx_resume(struct usb_interface *intf)
 
/* do this first to ensure it's cleared even in error case */
pdata->suspend_flags = 0;
+   schedule_delayed_work(&pdata->carrier_check, CARRIER_CHECK_DELAY);
 
if (suspend_flags & SUSPEND_ALLMODES) {
/* clear wake-up sources */
-- 
2.1.4
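
The change-detection part of the patch (report the link only when the
polled state differs from the cached one) can be tried out in userspace
with the sketch below. It is an approximation under assumptions: struct
demo_link, demo_set_carrier() and demo_poll_once() are made-up names, the
notifications counter stands in for the call to usbnet_link_change(), and
BMSR_LSTATUS is defined locally with the value used in linux/mii.h.

```c
#include <assert.h>
#include <stdbool.h>

#define BMSR_LSTATUS 0x0004	/* link status bit of the MII BMSR register */

/* Hypothetical stand-in for the link fields added to the private data. */
struct demo_link {
	bool link_ok;
	int notifications;	/* counts simulated link change reports */
};

/* Report a link change only when the state actually differs. */
static void demo_set_carrier(struct demo_link *s, bool link)
{
	if (s->link_ok == link)
		return;
	s->link_ok = link;
	s->notifications++;
}

/* One polling step; a negative bmsr models a failed register read. */
static int demo_poll_once(struct demo_link *s, int bmsr)
{
	if (bmsr < 0)
		return -1;
	demo_set_carrier(s, (bmsr & BMSR_LSTATUS) != 0);
	return 0;
}

static void check_polling(void)
{
	struct demo_link s = { false, 0 };

	assert(demo_poll_once(&s, BMSR_LSTATUS) == 0);
	assert(demo_poll_once(&s, BMSR_LSTATUS) == 0);
	assert(s.notifications == 1);	/* repeated "up" reported once */
}
```

In the patch itself the polling step is re-armed every CARRIER_CHECK_DELAY
through the delayed work item; the sketch leaves the scheduling out.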





[PATCH] brcmfmac: rework function picking free BSS index

2016-05-25 Thread Rafał Miłecki
The old implementation was overcomplicated and slightly buggy in some
corner cases.

Consider the following state of BSS-es (limited to 6 for simplification):
drvr->iflist[0]: { bsscfgidx:0, ndev->name:wlan1, }
drvr->iflist[1]:  (null)
drvr->iflist[2]: { bsscfgidx:2, ndev->name:wlan1-1, }
drvr->iflist[3]: { bsscfgidx:3, ndev->name:wlan1-2, }
drvr->iflist[4]:  (null)
drvr->iflist[5]:  (null)
In such a case the next AP interface should get bsscfgidx 4 (we don't use
1 as it's reserved for P2P).

With the old code the loop iterations were as follows:
[ifidx = 0] [bsscfgidx = 2] [highest = 2]
[ifidx = 1] [bsscfgidx = 2] [highest = 2] available = true
[ifidx = 2] [bsscfgidx = 2] [highest = 2] bsscfgidx = highest + 1
[ifidx = 3] [bsscfgidx = 3] [highest = 2] bsscfgidx = highest + 1
[ifidx = 4] [bsscfgidx = 3] [highest = 2] available = true
[ifidx = 5] [bsscfgidx = 3] [highest = 2] available = true
There were 2 obvious problems:
1) Having empty BSS at index 1 was resulting in available being always
   set to true, even if we would run out of BSS-es.
2) Calculated bsscfgidx was invalid (3 instead of 4) resulting in driver
   not being able to create the 4th AP interface.

New code is simpler, placed in file where it's really used, handles
running out of free BSS-es and allows using 4 interfaces at the same
time. It also looks for the first free BSS instead of one after the last
in use. It works well with current driver (which doesn't allow deleting
interfaces) and should be future proof (if we ever allow deleting).

Signed-off-by: Rafał Miłecki 
---
 .../broadcom/brcm80211/brcmfmac/cfg80211.c | 17 ++-
 .../wireless/broadcom/brcm80211/brcmfmac/core.c| 24 --
 .../wireless/broadcom/brcm80211/brcmfmac/core.h|  1 -
 3 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
index 3d09d23..d00eef8 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
@@ -541,6 +541,21 @@ brcmf_cfg80211_update_proto_addr_mode(struct wireless_dev *wdev)
ADDR_INDIRECT);
 }
 
+static int brcmf_get_first_free_bsscfgidx(struct brcmf_pub *drvr)
+{
+   int bsscfgidx;
+
+   for (bsscfgidx = 0; bsscfgidx < BRCMF_MAX_IFS; bsscfgidx++) {
+   /* bsscfgidx 1 is reserved for legacy P2P */
+   if (bsscfgidx == 1)
+   continue;
+   if (!drvr->iflist[bsscfgidx])
+   return bsscfgidx;
+   }
+
+   return -ENOMEM;
+}
+
 static int brcmf_cfg80211_request_ap_if(struct brcmf_if *ifp)
 {
struct brcmf_mbss_ssid_le mbss_ssid_le;
@@ -548,7 +563,7 @@ static int brcmf_cfg80211_request_ap_if(struct brcmf_if *ifp)
int err;
 
memset(&mbss_ssid_le, 0, sizeof(mbss_ssid_le));
-   bsscfgidx = brcmf_get_next_free_bsscfgidx(ifp->drvr);
+   bsscfgidx = brcmf_get_first_free_bsscfgidx(ifp->drvr);
if (bsscfgidx < 0)
return bsscfgidx;
 
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
index b590499..6a76480 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
@@ -753,30 +753,6 @@ void brcmf_remove_interface(struct brcmf_if *ifp)
brcmf_del_if(ifp->drvr, ifp->bsscfgidx);
 }
 
-int brcmf_get_next_free_bsscfgidx(struct brcmf_pub *drvr)
-{
-   int ifidx;
-   int bsscfgidx;
-   bool available;
-   int highest;
-
-   available = false;
-   bsscfgidx = 2;
-   highest = 2;
-   for (ifidx = 0; ifidx < BRCMF_MAX_IFS; ifidx++) {
-   if (drvr->iflist[ifidx]) {
-   if (drvr->iflist[ifidx]->bsscfgidx == bsscfgidx)
-   bsscfgidx = highest + 1;
-   else if (drvr->iflist[ifidx]->bsscfgidx > highest)
-   highest = drvr->iflist[ifidx]->bsscfgidx;
-   } else {
-   available = true;
-   }
-   }
-
-   return available ? bsscfgidx : -ENOMEM;
-}
-
 #ifdef CONFIG_INET
 #define ARPOL_MAX_ENTRIES  8
 static int brcmf_inetaddr_changed(struct notifier_block *nb,
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h
index 647d3cc..2a075c5 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h
@@ -217,7 +217,6 @@ int brcmf_net_attach(struct brcmf_if *ifp, bool rtnl_locked);
 struct brcmf_if *brcmf_add_if(struct brcmf_pub *drvr, s32 bsscfgidx, s32 ifidx,
  bool is_p2pdev, char *name, u8 *mac_addr);
 void 
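
The example state from the commit message can be replayed in userspace to
compare the two lookups. The sketch below copies the logic of the removed
and the added function from the diff, but struct demo_if, DEMO_MAX_IFS and
the function names are stand-ins invented for the example (the real driver
uses struct brcmf_if and BRCMF_MAX_IFS).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_IFS 6	/* limited to 6, as in the commit message */

struct demo_if {
	int bsscfgidx;
};

/* Logic of the removed brcmf_get_next_free_bsscfgidx(). */
static int demo_old_next_free(struct demo_if *iflist[DEMO_MAX_IFS])
{
	bool available = false;
	int bsscfgidx = 2;
	int highest = 2;
	int ifidx;

	for (ifidx = 0; ifidx < DEMO_MAX_IFS; ifidx++) {
		if (iflist[ifidx]) {
			if (iflist[ifidx]->bsscfgidx == bsscfgidx)
				bsscfgidx = highest + 1;
			else if (iflist[ifidx]->bsscfgidx > highest)
				highest = iflist[ifidx]->bsscfgidx;
		} else {
			available = true;
		}
	}
	return available ? bsscfgidx : -ENOMEM;
}

/* Logic of the added brcmf_get_first_free_bsscfgidx(). */
static int demo_new_first_free(struct demo_if *iflist[DEMO_MAX_IFS])
{
	int bsscfgidx;

	for (bsscfgidx = 0; bsscfgidx < DEMO_MAX_IFS; bsscfgidx++) {
		/* bsscfgidx 1 is reserved for legacy P2P */
		if (bsscfgidx == 1)
			continue;
		if (!iflist[bsscfgidx])
			return bsscfgidx;
	}
	return -ENOMEM;
}

static void check_example_state(void)
{
	struct demo_if if0 = { 0 }, if2 = { 2 }, if3 = { 3 };
	struct demo_if *iflist[DEMO_MAX_IFS] = {
		&if0, NULL, &if2, &if3, NULL, NULL
	};

	assert(demo_old_next_free(iflist) == 3);	/* already in use */
	assert(demo_new_first_free(iflist) == 4);	/* first free slot */
}
```

With every slot except the reserved index 1 occupied, the old logic still
reports an index because the empty slot 1 sets available, while the new
logic returns -ENOMEM.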

Re: [PATCH] virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv

2016-05-25 Thread Jason Wang



On 2016-05-25 20:33, wangyunjian wrote:

In virtnet_open() and virtnet_probe(), try_fill_recv() can be executed at 
the same time. The VQ in virtqueue_add() is not well protected, and a BUG_ON 
is triggered when virtio_net.ko is being removed.

Test Script:
for (( i=0; i<500; i=i+1 )); do
rmmod virtio_net
modprobe virtio_net
ifconfig eth0 up
done

[  302.336996] ------------[ cut here ]------------
[  302.338794] kernel BUG at virtio_ring.c:750!
[  302.340013] invalid opcode:  [#1] SMP
[  302.340013] last sysfs file: 
/sys/devices/pci:00/:00:03.0/virtio0/device
[  302.340013] CPU 0
[  302.340013] Modules linked in: virtio_balloon virtio_net(-) virtio_pci 
virtio_ring virtio ipv6 af_packet microcode acpiphp pci_hotplug fuse loop 
dm_mod rtc_cmos tpm_tis rtc_core tpm i2c_piix4 rtc_lib container button floppy 
pcspkr tpm_bios i2c_core joydev sg usbhid hid uhci_hcd ehci_hcd usbcore edd 
ext3 mbcache jbd fan processor ide_pci_generic piix ide_core ata_generic 
ata_piix libata thermal thermal_sys hwmon sd_mod crc_t10dif kvm_ivshmem(N) 
scsi_mod pv_channel(N) [last unloaded: virtio]
[  302.340013] Supported: Yes
[  302.340013] Pid: 8410, comm: rmmod Tainted: GN  2.6.32.12-0.7-default #1 
Standard PC (i440FX + PIIX, 1996)
[  302.340013] RIP: 0010:[]  [] 
virtqueue_detach_unused_buf+0xb9/0xc0 [virtio_ring]
[  302.340013] RSP: 0018:88000c7a9e08  EFLAGS: 00010283
[  302.340013] RAX: 00f4 RBX: 0100 RCX: 4d8e
[  302.340013] RDX: 880001e0 RSI: 0046 RDI: 81a71570
[  302.340013] RBP: 88000c987000 R08:  R09: 000a
[  302.340013] R10:  R11:  R12: 0400
[  302.340013] R13:  R14: 7fff92bc1758 R15: 0001
[  302.340013] FS:  7f8b3995d700() GS:880001e0() 
knlGS:
[  302.340013] CS:  0010 DS:  ES:  CR0: 8005003b
[  302.340013] CR2: 7fff196433e0 CR3: 0d07e000 CR4: 06f0
[  302.340013] DR0:  DR1:  DR2: 
[  302.340013] DR3:  DR6: 0ff0 DR7: 0400
[  302.340013] Process rmmod (pid: 8410, threadinfo 88000c7a8000, task 
88000c7aa200)
[  302.340013] Stack:
[  302.340013]  88000fbb3780  88000c987000 
a034b178
[  302.340013] <0> 88000fbb3850 88000fbb3780 a034efc0 
88000fbb3850
[  302.340013] <0>  a034b299 88000fbb3780 
a034b406
[  302.340013] Call Trace:
[  302.340013]  [] free_unused_bufs+0x88/0x120 [virtio_net]
[  302.340013]  [] remove_vq_common+0x19/0x30 [virtio_net]
[  302.340013]  [] virtnet_remove+0x46/0x80 [virtio_net]
[  302.340013]  [] virtio_dev_remove+0x15/0x60 [virtio]
[  302.340013]  [] __device_release_driver+0x91/0x110
[  302.340013]  [] driver_detach+0xa8/0xb0
[  302.340013]  [] bus_remove_driver+0x82/0x110
[  302.340013]  [] sys_delete_module+0x1c4/0x290
[  302.340013]  [] system_call_fastpath+0x16/0x1b
[  302.340013]  [<7f8b394c7de7>] 0x7f8b394c7de7
[  302.340013] Code: c3 01 49 83 c4 04 e8 30 10 07 e1 8b 55 38 39 da 77 d0 8b 75 2c 
31 c0 48 c7 c7 ba 4b 32 a0 e8 18 10 07 e1 8b 45 2c 3b 45 38 74 82 <0f> 0b eb fe 
0f 1f 00 48 83 ec 28 48 89 6c 24 08 48 89 1c 24 48
[  302.340013] RIP  [] virtqueue_detach_unused_buf+0xb9/0xc0 
[virtio_ring]
[  302.340013]  RSP 
[  302.438579] ---[ end trace 1e583bdb5b2644f1 ]---


Signed-off-by: Yunjian Wang 
---
  drivers/net/virtio_net.c | 4 
  1 file changed, 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 49d84e5..4528ef8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -818,10 +818,6 @@ static int virtnet_open(struct net_device *dev)
 int i;

 for (i = 0; i < vi->max_queue_pairs; i++) {
-   if (i < vi->curr_queue_pairs)
-   /* Make sure we have some buffers: if oom use wq. */
-   if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
-   schedule_delayed_work(&vi->refill, 0);
 virtnet_napi_enable(&vi->rq[i]);
 }

--
1.7.12.4



Thanks a lot for spotting the issue.

But the fix looks questionable: what if we increase the number of queues 
before we open it? I can think of two solutions:


- since ndo_open() is protected by rtnl_lock, how about protecting the 
refill in virtnet_probe() with rtnl_lock?
- or it looks like we can remove the refills in virtnet_probe(), since we 
will do it in .ndo_open()?


Btw, we usually use a net or net-next prefix in the patch subject. A patch 
like this should have something like "PATCH net" as a prefix. And since it 
needs to go to -stable, better add a short note after "---" to let David 
know about this.


Re: [PATCH] net: stmmac: Fix incorrect memcpy source memory

2016-05-25 Thread Andrew Lunn
On Thu, May 26, 2016 at 12:40:23AM +0200, Marek Vasut wrote:
> The memcpy() currently copies mdio_bus_data into new_bus->irq, which
> makes no sense, since the mdio_bus_data structure contains more than
> just irqs. The code was likely supposed to copy mdio_bus_data->irqs
> into the new_bus->irq instead, so fix this.
> 
> Signed-off-by: Marek Vasut 
> Cc: David S. Miller 
> Cc: Giuseppe Cavallaro 
> Cc: Alexandre Torgue 

Reviewed-by: Andrew Lunn 

 Andrew


Re: [PATCH] net: stmmac: Fix incorrect memcpy source memory

2016-05-25 Thread David Miller
From: Marek Vasut 
Date: Thu, 26 May 2016 00:40:23 +0200

> The memcpy() currently copies mdio_bus_data into new_bus->irq, which
> makes no sense, since the mdio_bus_data structure contains more than
> just irqs. The code was likely supposed to copy mdio_bus_data->irqs
> into the new_bus->irq instead, so fix this.
> 
> Signed-off-by: Marek Vasut 

Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core")

Andrew, please review.


Re: [PATCH v2] net: alx: use custom skb allocator

2016-05-25 Thread David Miller
From: Feng Tang 
Date: Wed, 25 May 2016 14:49:54 +0800

> This patch follows Eric Dumazet's commit 7b70176421 for Atheros
> atl1c driver to fix one exactly same bug in alx driver, that the
> network link will be lost in 1-5 minutes after the device is up.
> 
> My laptop Lenovo Y580 with Atheros AR8161 ethernet device hit the
> same problem with kernel 4.4, and it will be cured by Jarod Wilson's
> commit c406700c for alx driver which get merged in 4.5. But there
> are still some alx devices can't function well even with Jarod's
> patch, while this patch could make them work fine. More details on
>   https://bugzilla.kernel.org/show_bug.cgi?id=70761
> 
> The debug shows the issue is very likely to be related with the RX
> DMA address, specifically 0x...f80, if RX buffer get 0x...f80 several
> times, there will be RX overflow error and device will stop working.
> 
> For kernel 4.5.0 with Jarod's patch which works fine with my
> AR8161/Lenovo Y580, if I made some change to the
>   __netdev_alloc_skb
>   --> __alloc_page_frag()
> to make the allocated buffer can get an address with 0x...f80,
> then the same error happens. If I make it to 0x...f40 or 0xfc0,
> everything will be still fine. So I tend to believe that the
> 0x..f80 address cause the silicon to behave abnormally.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70761
> Cc: Eric Dumazet 
> Cc: Johannes Berg 
> Cc: Jarod Wilson 
> Signed-off-by: Feng Tang 
> Tested-by: Ole Lukoie 

Looks good, applied, thanks.

But now that we have at least two instances of this code we really
need to put a common version somewhere. :-/


[PATCH V3] brcmfmac: print error if p2p_ifadd firmware command fails

2016-05-25 Thread Rafał Miłecki
This is helpful for debugging, without this all I was getting from "iw"
command on device with BCM43602 was:
> command failed: Too many open files in system (-23)

Signed-off-by: Rafał Miłecki 
---
V2: s/in/if/ in commit message
V3: Add one more error message as suggested by Arend
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c
index 1652a48..bc26aec 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c
@@ -2031,7 +2031,7 @@ static int brcmf_p2p_request_p2p_if(struct brcmf_p2p_info *p2p,
err = brcmf_fil_iovar_data_set(ifp, "p2p_ifadd", &if_request,
   sizeof(if_request));
if (err)
-   return err;
+   brcmf_err("p2p_ifadd failed %d\n", err);
 
return err;
 }
@@ -2185,6 +2185,7 @@ struct wireless_dev *brcmf_p2p_add_vif(struct wiphy *wiphy, const char *name,
err = brcmf_p2p_request_p2p_if(&cfg->p2p, ifp, cfg->p2p.int_addr,
   iftype);
if (err) {
+   brcmf_err("Failed to request P2P virtual interface %s\n", name);
brcmf_cfg80211_arm_vif_event(cfg, NULL);
goto fail;
}
-- 
1.8.4.5



[PATCH] net: stmmac: Fix incorrect memcpy source memory

2016-05-25 Thread Marek Vasut
The memcpy() currently copies mdio_bus_data into new_bus->irq, which
makes no sense, since the mdio_bus_data structure contains more than
just irqs. The code was likely supposed to copy mdio_bus_data->irqs
into the new_bus->irq instead, so fix this.

Signed-off-by: Marek Vasut 
Cc: David S. Miller 
Cc: Giuseppe Cavallaro 
Cc: Alexandre Torgue 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
index 3f83c36..ec29585 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
@@ -297,7 +297,7 @@ int stmmac_mdio_register(struct net_device *ndev)
return -ENOMEM;
 
if (mdio_bus_data->irqs)
-   memcpy(new_bus->irq, mdio_bus_data, sizeof(new_bus->irq));
+   memcpy(new_bus->irq, mdio_bus_data->irqs, sizeof(new_bus->irq));
 
 #ifdef CONFIG_OF
if (priv->device->of_node)
-- 
2.7.0



[PATCH] net: arc: trivial: Replace comma with a semicolon

2016-05-25 Thread Marek Vasut
Fix a typo in the driver, replace comma with a semicolon at the end
of statement. While using comma is a legal C here and probably does
not even generate compiler warning, it was unlikely the intention.

Signed-off-by: Marek Vasut 
Cc: David S. Miller 
Cc: Caesar Wang 
Cc: Heiko Stuebner 
---
 drivers/net/ethernet/arc/emac_mdio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/arc/emac_mdio.c b/drivers/net/ethernet/arc/emac_mdio.c
index 16419f5..058460b 100644
--- a/drivers/net/ethernet/arc/emac_mdio.c
+++ b/drivers/net/ethernet/arc/emac_mdio.c
@@ -141,7 +141,7 @@ int arc_mdio_probe(struct arc_emac_priv *priv)
priv->bus = bus;
bus->priv = priv;
bus->parent = priv->dev;
-   bus->name = "Synopsys MII Bus",
+   bus->name = "Synopsys MII Bus";
bus->read = &arc_mdio_read;
bus->write = &arc_mdio_write;
bus->reset = &arc_mdio_reset;
-- 
2.7.0



Re: [PATCH] brcmfmac: fix setting AP channel with new firmwares

2016-05-25 Thread Rafał Miłecki
On 25 May 2016 at 23:08, Arend van Spriel  wrote:
> On 24-05-16 11:09, Rafał Miłecki wrote:
>> Firmware for new chipsets is based on a new major version of code
>> internally maintained at Broadcom. E.g. brcmfmac4366b-pcie.bin (used for
>> BCM4366B1) is based on 10.10.69.3309 while brcmfmac43602-pcie.ap.bin was
>> based on 7.35.177.56.
>>
>> Currently setting AP 5 GHz channel doesn't work reliably with BCM4366B1.
>> When setting e.g. 36 control channel with VHT80 (center channel 42)
>> firmware may randomly pick one of:
>> 1) 52 control channel with 58 as center one
>> 2) 100 control channel with 106 as center one
>> 3) 116 control channel with 122 as center one
>> 4) 149 control channel with 155 as center one
>>
>> It seems new firmwares require setting AP mode (BRCMF_C_SET_AP) before
>> specifying a channel. Changing an order of firmware calls fixes the
>> problem.
>>
>> This fix was verified with BCM4366B1 and tested for regressions on
>> BCM43602. It's unclear if it's needed (or correct at all) for P2P
>> interfaces so it leaves this code unaffected.
>
> In doing so the code reads a bit awkward so if P2P-GO works with the
> changed order that would be preferable.

I'd prefer to have one code path as well, but my device/firmware
doesn't support P2P so I couldn't test it.

Could you test it or check firmware code to see if it's safe to change
P2P path as well?

-- 
Rafał


Re: [PATCH net 1/4] net/mlx4_en: fix tx_dropped bug

2016-05-25 Thread Alexei Starovoitov
On Wed, May 25, 2016 at 09:50:36AM -0700, Eric Dumazet wrote:
> 1) mlx4_en_xmit() can increment priv->stats.tx_dropped, but this variable
> is overwritten in mlx4_en_DUMP_ETH_STATS().
> 
> 2) This increment was not SMP safe, as a port might have many TX queues.
> 
> Add a per TX ring tx_dropped to fix these issues.
> 
> This is u32 as mlx4_en_DUMP_ETH_STATS() will add a 32bit field.
> 
> So lets avoid bugs with SNMP agents having to cope with partial
> overwraps. (One of these agents being bond_fold_stats())
> 
> Signed-off-by: Eric Dumazet 
> Reported-by: Willem de Bruijn 
> Cc: Eugenia Emantayev 

this problem was bugging me as well. thank you for fixing it.
Acked-by: Alexei Starovoitov 
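
A rough userspace sketch of the per-ring counter idea follows. It assumes,
without the patch body at hand, that the reported tx_dropped becomes the
32-bit hardware counter plus the sum of the per-ring software counters;
struct demo_tx_ring and demo_total_tx_dropped() are names invented for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-ring state: each TX ring counts its own drops. */
struct demo_tx_ring {
	uint32_t tx_dropped;
};

/*
 * Fold the per-ring software counters into the hardware counter.
 * Everything is 32 bit wide, so the sum wraps consistently at 2^32.
 */
static uint32_t demo_total_tx_dropped(uint32_t hw_dropped,
				      const struct demo_tx_ring *rings,
				      size_t nrings)
{
	uint32_t total = hw_dropped;
	size_t i;

	for (i = 0; i < nrings; i++)
		total += rings[i].tx_dropped;
	return total;
}

static void check_totals(void)
{
	struct demo_tx_ring rings[3] = { { 1 }, { 2 }, { 3 } };

	assert(demo_total_tx_dropped(10, rings, 3) == 16);
}
```

Because each ring only ever updates its own counter, the increment on the
transmit path no longer touches a field shared between queues, and the
stats dump recomputes the total instead of overwriting it.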



Re: [PATCH] brcmfmac: fix setting AP channel with new firmwares

2016-05-25 Thread Arend van Spriel


On 24-05-16 11:09, Rafał Miłecki wrote:
> Firmware for new chipsets is based on a new major version of code
> internally maintained at Broadcom. E.g. brcmfmac4366b-pcie.bin (used for
> BCM4366B1) is based on 10.10.69.3309 while brcmfmac43602-pcie.ap.bin was
> based on 7.35.177.56.
> 
> Currently setting AP 5 GHz channel doesn't work reliably with BCM4366B1.
> When setting e.g. 36 control channel with VHT80 (center channel 42)
> firmware may randomly pick one of:
> 1) 52 control channel with 58 as center one
> 2) 100 control channel with 106 as center one
> 3) 116 control channel with 122 as center one
> 4) 149 control channel with 155 as center one
> 
> It seems new firmwares require setting AP mode (BRCMF_C_SET_AP) before
> specifying a channel. Changing an order of firmware calls fixes the
> problem.
> 
> This fix was verified with BCM4366B1 and tested for regressions on
> BCM43602. It's unclear if it's needed (or correct at all) for P2P
> interfaces so it leaves this code unaffected.

In doing so the code reads a bit awkward so if P2P-GO works with the
changed order that would be preferable.

> Signed-off-by: Rafał Miłecki 
> ---
>  .../net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c  | 16 
> 
>  1 file changed, 12 insertions(+), 4 deletions(-)


Re: [PATCH V2] brcmfmac: print error if p2p_ifadd firmware command fails

2016-05-25 Thread Arend van Spriel
On 24-05-16 23:05, Rafał Miłecki wrote:
> This is helpful for debugging, without this all I was getting from "iw"
> command on device with BCM43602 was:
>> command failed: Too many open files in system (-23)
> 
> Signed-off-by: Rafał Miłecki 
> ---
> V2: s/in/if/ in commit message
> ---
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c
> index 1652a48..f7b7e29 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c
> @@ -2031,7 +2031,7 @@ static int brcmf_p2p_request_p2p_if(struct brcmf_p2p_info *p2p,
>   err = brcmf_fil_iovar_data_set(ifp, "p2p_ifadd", &if_request,
>  sizeof(if_request));
>   if (err)
> - return err;
> + brcmf_err("p2p_ifadd failed %d\n", err);

I would prefer adding a more generic failure message including ifname
and type in brcmf_cfg80211_add_iface() in cfg80211.c.

Regards,
Arend

>  
>   return err;
>  }
> 


Re: [PATCH net] sctp: fix double EPs display in sctp_diag

2016-05-25 Thread Marcelo Ricardo Leitner
On Thu, May 26, 2016 at 03:09:23AM +0800, Xin Long wrote:
> We have this situation: that EP hash table, contains only the EPs
> that are listening, while the transports one, has the opposite.
> We have to traverse both to dump all.
> 
> But when we traverse the transports one we will also get EPs that are
> in the EP hash if they are listening. In this case, the EP is dumped
> twice.
> 
> We will fix it by checking if the endpoint that is in the endpoint
> hash table contains any ep->asoc in there, as it means we will also
> find it via transport hash, and thus we can/should skip it, depending
> on the filters used, like 'ss -l'.
> 
> Still, we should NOT skip it if the user is listing only listening
> endpoints, because then we are not traversing the transport hash.
> so we have to check idiag_states there also.
> 
> Signed-off-by: Xin Long 

Acked-by: Marcelo Ricardo Leitner 

> ---
>  net/sctp/sctp_diag.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
> index 8e3e769..1ce724b 100644
> --- a/net/sctp/sctp_diag.c
> +++ b/net/sctp/sctp_diag.c
> @@ -356,6 +356,9 @@ static int sctp_ep_dump(struct sctp_endpoint *ep, void *p)
>   if (cb->args[4] < cb->args[1])
>   goto next;
>  
> + if ((r->idiag_states & ~TCPF_LISTEN) && !list_empty(&ep->asocs))
> + goto next;
> +
>   if (r->sdiag_family != AF_UNSPEC &&
>   sk->sk_family != r->sdiag_family)
>   goto next;
> -- 
> 2.1.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Re: [PATCH net] sctp: sctp_diag should dump sctp socket type

2016-05-25 Thread Eric Dumazet
On Thu, 2016-05-26 at 03:14 +0800, Xin Long wrote:
> Now we cannot distinguish that one sk is a udp or sctp style when
> we use ss to dump sctp_info. it's necessary to dump it as well.
> 
> For sctp_diag, ss support is not officially available, thus there
> are no official users of this yet, so we can add this field in the
> middle of sctp_info without breaking user API.
> 
> Signed-off-by: Xin Long 
> ---
>  include/linux/sctp.h | 1 +
>  net/sctp/socket.c| 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/include/linux/sctp.h b/include/linux/sctp.h
> index dacb5e7..3a406af 100644
> --- a/include/linux/sctp.h
> +++ b/include/linux/sctp.h
> @@ -761,6 +761,7 @@ struct sctp_info {
>   __u32   sctpi_s_autoclose;
>   __u32   sctpi_s_adaptation_ind;
>   __u32   sctpi_s_pd_point;
> + __u32   sctpi_s_type;

Well, this is also adding a 4-byte padding at the end of the structure.


Basically, because of the 8-byte alignment caused by the __u64 fields,
adding a single __u32 adds a padding.

Don't you have another u32 info you'd like to publish ?
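
The padding effect is easy to check with sizeof. The structures below are
cut-down, made-up stand-ins (not the real struct sctp_info) with one 64-bit
member followed by 32-bit members; on ABIs where uint64_t is 8-byte
aligned, adding one 32-bit field costs the same as adding two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout before the change: no tail padding needed. */
struct demo_info_before {
	uint64_t a;
	uint32_t b;
	uint32_t c;
};

/* One extra 32-bit field: the size is rounded up to the alignment. */
struct demo_info_one_more {
	uint64_t a;
	uint32_t b;
	uint32_t c;
	uint32_t d;
};

/* Two extra 32-bit fields: the second one fills the padding. */
struct demo_info_two_more {
	uint64_t a;
	uint32_t b;
	uint32_t c;
	uint32_t d;
	uint32_t e;
};

static void check_padding(void)
{
	/* Only meaningful where uint64_t is 8-byte aligned (e.g. x86-64). */
	if (_Alignof(uint64_t) == 8) {
		assert(sizeof(struct demo_info_one_more) ==
		       sizeof(struct demo_info_two_more));
		assert(sizeof(struct demo_info_one_more) ==
		       sizeof(struct demo_info_before) + 8);
	}
}
```

On ABIs where 64-bit integers are only 4-byte aligned the sizes differ,
which is one more reason to make any padding explicit in a structure that
is exported to userspace.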





Re: [PATCH net v2] team: don't call netdev_change_features under team->lock

2016-05-25 Thread David Miller
From: Ivan Vecera 
Date: Wed, 25 May 2016 21:21:52 +0200

> The team_device_event() notifier calls team_compute_features() to fix
> vlan_features under team->lock to protect team->port_list. The problem is
> that subsequent __team_compute_features() calls netdev_change_features()
> to propagate vlan_features to upper vlan devices while team->lock is still
> taken. This can lead to deadlock when NETIF_F_LRO is modified on lower
> devices or team device itself.
> 
> Example:
> The team0 as active backup with eth0 and eth1 NICs. Both eth0 & eth1 are
> LRO capable and LRO is enabled. Thus LRO is also enabled on team0.
> 
> The command 'ethtool -K team0 lro off' now hangs due to this deadlock:
 ...
> The bug is present in team from the beginning but it appeared after the commit
> fd867d5 (net/core: generic support for disabling netdev features down stack)
> that adds synchronization of features with lower devices.
> 
> Fixes: fd867d5 (net/core: generic support for disabling netdev features down stack)
> Cc: Jiri Pirko 
> Signed-off-by: Ivan Vecera 

Applied and queued up for -stable, thanks.


Re: [PATCH net v2] team: don't call netdev_change_features under team->lock

2016-05-25 Thread Jiri Pirko
Wed, May 25, 2016 at 09:21:52PM CEST, ivec...@redhat.com wrote:
>The team_device_event() notifier calls team_compute_features() to fix
>vlan_features under team->lock to protect team->port_list. The problem is
>that subsequent __team_compute_features() calls netdev_change_features()
>to propagate vlan_features to upper vlan devices while team->lock is still
>taken. This can lead to deadlock when NETIF_F_LRO is modified on lower
>devices or team device itself.
>
>Example:
>The team0 as active backup with eth0 and eth1 NICs. Both eth0 & eth1 are
>LRO capable and LRO is enabled. Thus LRO is also enabled on team0.
>
>The command 'ethtool -K team0 lro off' now hangs due to this deadlock:
>
>dev_ethtool()
>-> ethtool_set_features()
> -> __netdev_update_features(team)
>  -> netdev_sync_lower_features()
>   -> netdev_update_features(lower_1)
>    -> __netdev_update_features(lower_1)
>    -> netdev_features_change(lower_1)
>     -> call_netdevice_notifiers(...)
>      -> team_device_event(lower_1)
>       -> team_compute_features(team) [TAKES team->lock]
>        -> netdev_change_features(team)
>         -> __netdev_update_features(team)
>          -> netdev_sync_lower_features()
>           -> netdev_update_features(lower_2)
>            -> __netdev_update_features(lower_2)
>            -> netdev_features_change(lower_2)
>             -> call_netdevice_notifiers(...)
>              -> team_device_event(lower_2)
>               -> team_compute_features(team) [DEADLOCK]
>
>The bug is present in team from the beginning but it appeared after the commit
>fd867d5 (net/core: generic support for disabling netdev features down stack)
>that adds synchronization of features with lower devices.
>
>Fixes: fd867d5 (net/core: generic support for disabling netdev features down stack)
>Cc: Jiri Pirko 
>Signed-off-by: Ivan Vecera 

Signed-off-by: Jiri Pirko 


Re: [PATCH net v2 0/3] Documentation: dsa: misc fixes

2016-05-25 Thread David Miller
From: Florian Fainelli 
Date: Tue, 24 May 2016 21:26:38 -0700

> Here are some miscellaneous documentation fixes for DSA, I targeted "net"
> because these are not functional code changes, but still documentation fixes
> per-se.
> 
> Changes in v2:
> 
> - reword what the port_vlan_filtering is about based on feedback from Vivien 
> and Ido

Series applied, thanks Florian.


Re: [PATCH net] sfc: on MC reset, clear PIO buffer linkage in TXQs

2016-05-25 Thread David Miller
From: Edward Cree 
Date: Tue, 24 May 2016 18:53:36 +0100

> Otherwise, if we fail to allocate new PIO buffers, our TXQs will try to
> use the old ones, which aren't there any more.
> 
> Fixes: 183233bec810 "sfc: Allocate and link PIO buffers; map them with write-combining"
> Signed-off-by: Edward Cree 

Applied and queued up for -stable.


Re: [PATCH net 0/2] Fix spinlock usage in HWBM

2016-05-25 Thread David Miller
From: Gregory CLEMENT 
Date: Tue, 24 May 2016 18:03:24 +0200

> these two patches fix spinlock related issues introduced in v4.6. They
> have been reported by Russell King and Jean-Jacques Hiblot.

Series applied and queued up for -stable.


[PATCH net v2] team: don't call netdev_change_features under team->lock

2016-05-25 Thread Ivan Vecera
The team_device_event() notifier calls team_compute_features() to fix
vlan_features under team->lock to protect team->port_list. The problem is
that subsequent __team_compute_features() calls netdev_change_features()
to propagate vlan_features to upper vlan devices while team->lock is still
taken. This can lead to deadlock when NETIF_F_LRO is modified on lower
devices or team device itself.

Example:
The team0 as active backup with eth0 and eth1 NICs. Both eth0 & eth1 are
LRO capable and LRO is enabled. Thus LRO is also enabled on team0.

The command 'ethtool -K team0 lro off' now hangs due to this deadlock:

dev_ethtool()
-> ethtool_set_features()
 -> __netdev_update_features(team)
  -> netdev_sync_lower_features()
   -> netdev_update_features(lower_1)
    -> __netdev_update_features(lower_1)
    -> netdev_features_change(lower_1)
     -> call_netdevice_notifiers(...)
      -> team_device_event(lower_1)
       -> team_compute_features(team) [TAKES team->lock]
        -> netdev_change_features(team)
         -> __netdev_update_features(team)
          -> netdev_sync_lower_features()
           -> netdev_update_features(lower_2)
            -> __netdev_update_features(lower_2)
            -> netdev_features_change(lower_2)
             -> call_netdevice_notifiers(...)
              -> team_device_event(lower_2)
               -> team_compute_features(team) [DEADLOCK]

The bug is present in team from the beginning but it appeared after the commit
fd867d5 (net/core: generic support for disabling netdev features down stack)
that adds synchronization of features with lower devices.

Fixes: fd867d5 (net/core: generic support for disabling netdev features down stack)
Cc: Jiri Pirko 
Signed-off-by: Ivan Vecera 
---
 drivers/net/team/team.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 718ceea..800a449 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -988,7 +988,7 @@ static void team_port_disable(struct team *team,
 #define TEAM_ENC_FEATURES  (NETIF_F_HW_CSUM | NETIF_F_SG | \
 NETIF_F_RXCSUM | NETIF_F_ALL_TSO)
 
-static void __team_compute_features(struct team *team)
+static void ___team_compute_features(struct team *team)
 {
struct team_port *port;
u32 vlan_features = TEAM_VLAN_FEATURES & NETIF_F_ALL_FOR_ALL;
@@ -1019,15 +1019,20 @@ static void __team_compute_features(struct team *team)
team->dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
if (dst_release_flag == (IFF_XMIT_DST_RELEASE | 
IFF_XMIT_DST_RELEASE_PERM))
team->dev->priv_flags |= IFF_XMIT_DST_RELEASE;
+}
 
+static void __team_compute_features(struct team *team)
+{
+   ___team_compute_features(team);
netdev_change_features(team->dev);
 }
 
 static void team_compute_features(struct team *team)
 {
mutex_lock(&team->lock);
-   __team_compute_features(team);
+   ___team_compute_features(team);
mutex_unlock(&team->lock);
+   netdev_change_features(team->dev);
 }
 
 static int team_port_enter(struct team *team, struct team_port *port)
-- 
2.7.3
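
The shape of the fix in this patch (compute under the lock, notify only after dropping it) can be sketched in user space. This is a minimal illustration, not the driver code: a pthread mutex stands in for team->lock, every name ending in `_sketch` is hypothetical, and `pthread_mutex_trylock()` is used to detect the re-entry instead of actually deadlocking.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct team_sketch {
    pthread_mutex_t lock;      /* stands in for team->lock */
    unsigned int features;     /* stands in for the computed feature mask */
    int reentry_blocked;       /* times the notifier found the lock held */
};

/* Stand-in for the notifier path that re-enters and needs the lock. */
static void notifier_sketch(struct team_sketch *t)
{
    if (pthread_mutex_trylock(&t->lock) == EBUSY) {
        /* A blocking lock here would never return: this is the deadlock. */
        t->reentry_blocked++;
        return;
    }
    pthread_mutex_unlock(&t->lock);
}

/* The part that really needs the lock: recompute the feature mask. */
static void compute_locked_sketch(struct team_sketch *t, unsigned int port_features)
{
    assert(t != NULL);
    t->features &= port_features;
}

/* Old shape: the notifier runs while the lock is still held. */
static void compute_features_old_sketch(struct team_sketch *t, unsigned int f)
{
    pthread_mutex_lock(&t->lock);
    compute_locked_sketch(t, f);
    notifier_sketch(t);
    pthread_mutex_unlock(&t->lock);
}

/* New shape: drop the lock first, then notify. */
static void compute_features_new_sketch(struct team_sketch *t, unsigned int f)
{
    pthread_mutex_lock(&t->lock);
    compute_locked_sketch(t, f);
    pthread_mutex_unlock(&t->lock);
    notifier_sketch(t);
}
```

In the old shape the notifier finds the lock already held by its own caller. In the new shape the same notifier takes and releases the lock without trouble.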



Re: [PATCH v2] tipc: fix potential null pointer dereferences in some compat functions

2016-05-25 Thread David Miller
From: Baozeng Ding 
Date: Tue, 24 May 2016 22:33:24 +0800

> Before calling the nla_parse_nested function, make sure the pointer to the
> attribute is not null. This patch fixes several potential null pointer
> dereference vulnerabilities in the tipc netlink functions.
> 
> Signed-off-by: Baozeng Ding 
> ---
> v2: declare local variable as reverse christmas tree format and make the
> commit log fit in 80 columns

Looks good, applied, thanks.


Re: [PATCH net 1/1] qed: Reset the enable flag for eth protocol.

2016-05-25 Thread David Miller
From: Sudarsana Reddy Kalluru 
Date: Tue, 24 May 2016 05:25:23 -0400

> This patch fixes the coding error in determining the enable flag for
> the application/protocol. The enable flag should be set for all protocols
> but the eth.
> 
> Signed-off-by: Sudarsana Reddy Kalluru 
> Signed-off-by: Yuval Mintz 

Applied.


Re: [RFC PATCH] ethtool: add support for 25G/50G/100G speed modes

2016-05-25 Thread David Miller
From: vi...@cumulusnetworks.com
Date: Sun, 22 May 2016 23:59:00 -0700

> From: Vidya Sagar Ravipati 
> 
> This patch enhances ethtool link mode bitmap to include
> 25G/50G/100G speed along with interface modes
> 
> Signed-off-by: Vidya Sagar Ravipati 

This looks fine, applied, thanks.


Re: [PATCH net] sctp: sctp_diag should dump sctp socket type

2016-05-25 Thread David Miller
From: Xin Long 
Date: Thu, 26 May 2016 03:14:28 +0800

> For sctp_diag, ss support is not officially available, thus there
> are no official users of this yet, so we can add this field in the
> middle of sctp_info without breaking user API.

This is not what matters.

What matters is if a released kernel went out with the existing API.


[PATCH net] sctp: sctp_diag should dump sctp socket type

2016-05-25 Thread Xin Long
Currently we cannot tell whether an sk is udp or sctp style when
we use ss to dump sctp_info, so it is necessary to dump the type as well.

For sctp_diag, ss support is not officially available, thus there
are no official users of this yet, so we can add this field in the
middle of sctp_info without breaking user API.

Signed-off-by: Xin Long 
---
 include/linux/sctp.h | 1 +
 net/sctp/socket.c| 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/linux/sctp.h b/include/linux/sctp.h
index dacb5e7..3a406af 100644
--- a/include/linux/sctp.h
+++ b/include/linux/sctp.h
@@ -761,6 +761,7 @@ struct sctp_info {
__u32   sctpi_s_autoclose;
__u32   sctpi_s_adaptation_ind;
__u32   sctpi_s_pd_point;
+   __u32   sctpi_s_type;
__u8sctpi_s_nodelay;
__u8sctpi_s_disable_fragments;
__u8sctpi_s_v4mapped;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 777d032..ebb8e41 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4216,6 +4216,7 @@ int sctp_get_sctp_info(struct sock *sk, struct 
sctp_association *asoc,
info->sctpi_s_autoclose = sp->autoclose;
info->sctpi_s_adaptation_ind = sp->adaptation_ind;
info->sctpi_s_pd_point = sp->pd_point;
+   info->sctpi_s_type = sp->type;
info->sctpi_s_nodelay = sp->nodelay;
info->sctpi_s_disable_fragments = sp->disable_fragments;
info->sctpi_s_v4mapped = sp->v4mapped;
-- 
2.1.0



[PATCH net] sctp: fix double EPs display in sctp_diag

2016-05-25 Thread Xin Long
We have this situation: that EP hash table, contains only the EPs
that are listening, while the transports one, has the opposite.
We have to traverse both to dump all.

But when we traverse the transports one we will also get EPs that are
in the EP hash if they are listening. In this case, the EP is dumped
twice.

We will fix it by checking if the endpoint that is in the endpoint
hash table contains any ep->asoc in there, as it means we will also
find it via transport hash, and thus we can/should skip it, depending
on the filters used, like 'ss -l'.

Still, we should NOT skip it if the user is listing only listening
endpoints, because then we are not traversing the transport hash.
so we have to check idiag_states there also.

Signed-off-by: Xin Long 
---
 net/sctp/sctp_diag.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
index 8e3e769..1ce724b 100644
--- a/net/sctp/sctp_diag.c
+++ b/net/sctp/sctp_diag.c
@@ -356,6 +356,9 @@ static int sctp_ep_dump(struct sctp_endpoint *ep, void *p)
if (cb->args[4] < cb->args[1])
goto next;
 
+   if ((r->idiag_states & ~TCPF_LISTEN) && !list_empty(&ep->asocs))
+   goto next;
+
if (r->sdiag_family != AF_UNSPEC &&
sk->sk_family != r->sdiag_family)
goto next;
-- 
2.1.0
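
The skip condition added by this patch can be restated as a small predicate. This is only a sketch: `LISTEN_BIT_SKETCH` is a hypothetical stand-in for TCPF_LISTEN, and `has_asocs` stands in for `!list_empty(&ep->asocs)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TCPF_LISTEN (one bit per socket state). */
#define LISTEN_BIT_SKETCH (1u << 10)

/*
 * Decide whether the endpoint-hash walk should skip an endpoint.
 * Skip only when the request also asks for non-listening states (so the
 * transport hash is walked as well) and the endpoint has associations
 * (so the transport walk is going to report it).
 */
static bool skip_in_ep_walk_sketch(uint32_t idiag_states, bool has_asocs)
{
    return (idiag_states & ~LISTEN_BIT_SKETCH) != 0 && has_asocs;
}
```

A listening-only request never skips, which matches the note in the commit message that the transport hash is not traversed in that case.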



Re: [RFC PATCH 00/29] net: VRF support

2016-05-25 Thread David Ahern

On 5/25/16 10:04 AM, Chenna wrote:
> David Ahern  gmail.com> writes:
>
>> Kernel patches are also available here:
>> https://github.com/dsahern/linux.git vrf-3.19
>>
>> iproute2 patches are also available here:
>> https://github.com/dsahern/iproute2 vrf-3.19
>
> Hello David,
>
> Do we have the similar support package for 3.10 kernel?


The VRF patches referenced above were not accepted upstream. An 
alternative implementation was accepted for the 4.3 kernel with various 
updates in all of the kernel versions since.


Users that want the VRF implementation in an older kernel (e.g., 3.10) 
will need to backport the kernel patches. Top of tree iproute2 can be 
used as is with older kernels.


Re: [RFC PATCH 00/29] net: VRF support

2016-05-25 Thread Chenna
David Ahern  gmail.com> writes:

> 
> Kernel patches are also available here:
> https://github.com/dsahern/linux.git vrf-3.19
> 
> iproute2 patches are also available here:
> https://github.com/dsahern/iproute2 vrf-3.19
> 


Hello David,

Do we have the similar support package for 3.10 kernel?

Thanks
-Chenna





Re: [PATCH v8 10/22] IB/hns: Add process flow to init RoCE engine

2016-05-25 Thread Leon Romanovsky
On Wed, May 25, 2016 at 11:05:13PM +0800, Lijun Ou wrote:
> This patch mainly initialized the RoCE engine. It is absolutely
> necessary to run RoCE. It mainly includes that configure DMAE
> user, initialize doorbell and raq operations, enable port.
> 
> Signed-off-by: Wei Hu 
> Signed-off-by: Nenglong Zhao 
> Signed-off-by: Lijun Ou 
> ---
>  drivers/infiniband/hw/hns/hns_roce_common.h | 107 +++
>  drivers/infiniband/hw/hns/hns_roce_device.h |  15 +
>  drivers/infiniband/hw/hns/hns_roce_hw_v1.c  | 477 
> 
>  drivers/infiniband/hw/hns/hns_roce_hw_v1.h  |  68 +++-
>  drivers/infiniband/hw/hns/hns_roce_main.c   |  20 ++
>  5 files changed, 686 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h 
> b/drivers/infiniband/hw/hns/hns_roce_common.h
> index 5998778..73c6220 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_common.h
> +++ b/drivers/infiniband/hw/hns/hns_roce_common.h
> @@ -53,6 +53,93 @@
>  #define roce_set_bit(origin, shift, val) \
>   roce_set_field((origin), (1ul << (shift)), (shift), (val))
>  
> +#define ROCEE_GLB_CFG_ROCEE_DB_SQ_MODE_S 3
> +#define ROCEE_GLB_CFG_ROCEE_DB_OTH_MODE_S 4
> +
> +#define ROCEE_GLB_CFG_SQ_EXT_DB_MODE_S 5
> +
> +#define ROCEE_GLB_CFG_OTH_EXT_DB_MODE_S 6
> +
> +#define ROCEE_GLB_CFG_ROCEE_PORT_ST_S 10
> +#define ROCEE_GLB_CFG_ROCEE_PORT_ST_M  \
> + (((1UL << 6) - 1) << ROCEE_GLB_CFG_ROCEE_PORT_ST_S)
> +
> +#define ROCEE_GLB_CFG_TRP_RAQ_DROP_EN_S 16
> +
> +#define ROCEE_DMAE_USER_CFG1_ROCEE_STREAM_ID_TB_CFG_S 0
> +#define ROCEE_DMAE_USER_CFG1_ROCEE_STREAM_ID_TB_CFG_M  \
> + (((1UL << 24) - 1) << ROCEE_DMAE_USER_CFG1_ROCEE_STREAM_ID_TB_CFG_S)
> +
> +#define ROCEE_DMAE_USER_CFG1_ROCEE_CACHE_TB_CFG_S 24
> +#define ROCEE_DMAE_USER_CFG1_ROCEE_CACHE_TB_CFG_M  \
> + (((1UL << 4) - 1) << ROCEE_DMAE_USER_CFG1_ROCEE_CACHE_TB_CFG_S)
> +
> +#define ROCEE_DMAE_USER_CFG2_ROCEE_STREAM_ID_PKT_CFG_S 0
> +#define ROCEE_DMAE_USER_CFG2_ROCEE_STREAM_ID_PKT_CFG_M   \
> + (((1UL << 24) - 1) << ROCEE_DMAE_USER_CFG2_ROCEE_STREAM_ID_PKT_CFG_S)
> +
> +#define ROCEE_DMAE_USER_CFG2_ROCEE_CACHE_PKT_CFG_S 24
> +#define ROCEE_DMAE_USER_CFG2_ROCEE_CACHE_PKT_CFG_M   \
> + (((1UL << 4) - 1) << ROCEE_DMAE_USER_CFG2_ROCEE_CACHE_PKT_CFG_S)
> +
> +#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_S 0
> +#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_M   \
> + (((1UL << 16) - 1) << ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_S)
> +
> +#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_EMPTY_S 16
> +#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_EMPTY_M   \
> + (((1UL << 16) - 1) << ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_EMPTY_S)
> +
> +#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_S 0
> +#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_M   \
> + (((1UL << 16) - 1) << ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_S)
> +
> +#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_EMPTY_S 16
> +#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_EMPTY_M   \
> + (((1UL << 16) - 1) << ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_EMPTY_S)
> +
> +#define ROCEE_RAQ_WL_ROCEE_RAQ_WL_S 0
> +#define ROCEE_RAQ_WL_ROCEE_RAQ_WL_M   \
> + (((1UL << 8) - 1) << ROCEE_RAQ_WL_ROCEE_RAQ_WL_S)
> +
> +#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_POL_TIME_INTERVAL_S 0
> +#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_POL_TIME_INTERVAL_M   \
> + (((1UL << 15) - 1) << \
> + ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_POL_TIME_INTERVAL_S)
> +
> +#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_CFG_S 16
> +#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_CFG_M   \
> + (((1UL << 4) - 1) << \
> + ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_CFG_S)
> +
> +#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_EN_S 20
> +
> +#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_EXT_RAQ_MODE 21
> +
> +#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_SHIFT_S 0
> +#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_SHIFT_M   \
> + (((1UL << 5) - 1) << ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_SHIFT_S)
> +
> +#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_BA_H_S 5
> +#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_BA_H_M   \
> + (((1UL << 5) - 1) << ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_BA_H_S)
> +
> +#define ROCEE_EXT_DB_OTH_H_EXT_DB_OTH_SHIFT_S 0
> +#define ROCEE_EXT_DB_OTH_H_EXT_DB_OTH_SHIFT_M   \
> + (((1UL << 5) - 1) << ROCEE_EXT_DB_OTH_H_EXT_DB_OTH_SHIFT_S)
> +
> +#define ROCEE_EXT_DB_SQ_H_EXT_DB_OTH_BA_H_S 5
> +#define ROCEE_EXT_DB_SQ_H_EXT_DB_OTH_BA_H_M   \
> + (((1UL << 5) - 1) << ROCEE_EXT_DB_SQ_H_EXT_DB_OTH_BA_H_S)
> +
> +#define ROCEE_EXT_RAQ_H_EXT_RAQ_SHIFT_S 0
> +#define ROCEE_EXT_RAQ_H_EXT_RAQ_SHIFT_M   \
> + (((1UL << 5) - 1) << ROCEE_EXT_RAQ_H_EXT_RAQ_SHIFT_S)
> +
> +#define ROCEE_EXT_RAQ_H_EXT_RAQ_BA_H_S 8
> +#define ROCEE_EXT_RAQ_H_EXT_RAQ_BA_H_M   \
> + (((1UL << 5) - 1) << ROCEE_EXT_RAQ_H_EXT_RAQ_BA_H_S)
> +
>  #define ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_S 0
>  #define ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_M   \
>   (((1UL << 19) - 1) << ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_S)
> @@ -120,6 +207,26 

[PATCH 3.19.y-ckt 38/40] VSOCK: do not disconnect socket when peer has shutdown SEND only

2016-05-25 Thread Kamal Mostafa
3.19.8-ckt22 -stable review patch.  If anyone has any objections, please let me 
know.

---8<

From: Ian Campbell 

[ Upstream commit dedc58e067d8c379a15a8a183c5db318201295bb ]

The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.

Looking at other socket family's stream_recvmsg callbacks doing a shutdown
here does not seem to be the norm and removing it does not seem to have
had any adverse effects that I can see.

I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
on the vmci transport.

Signed-off-by: Ian Campbell 
Cc: "David S. Miller" 
Cc: Stefan Hajnoczi 
Cc: Claudio Imbrenda 
Cc: Andy King 
Cc: Dmitry Torokhov 
Cc: Jorgen Hansen 
Cc: Adit Ranadive 
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/vmw_vsock/af_vsock.c | 21 +
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 1d0e39c..316e856 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1796,27 +1796,8 @@ vsock_stream_recvmsg(struct kiocb *kiocb,
else if (sk->sk_shutdown & RCV_SHUTDOWN)
err = 0;
 
-   if (copied > 0) {
-   /* We only do these additional bookkeeping/notification steps
-* if we actually copied something out of the queue pair
-* instead of just peeking ahead.
-*/
-
-   if (!(flags & MSG_PEEK)) {
-   /* If the other side has shutdown for sending and there
-* is nothing more to read, then modify the socket
-* state.
-*/
-   if (vsk->peer_shutdown & SEND_SHUTDOWN) {
-   if (vsock_stream_has_data(vsk) <= 0) {
-   sk->sk_state = SS_UNCONNECTED;
-   sock_set_flag(sk, SOCK_DONE);
-   sk->sk_state_change(sk);
-   }
-   }
-   }
+   if (copied > 0)
err = copied;
-   }
 
 out_wait:
finish_wait(sk_sleep(sk), &wait);
-- 
2.7.4
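
The client behaviour the commit message describes (send a request, shutdown(SHUT_WR), then still read the reply) can be reproduced with any stream socket. The sketch below uses an AF_UNIX socketpair rather than AF_VSOCK, which is an assumption made purely so it runs anywhere. The function name is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * One request/reply exchange over a socketpair where the client half-closes
 * after sending. Returns the number of reply bytes the client read, or -1.
 */
static ssize_t half_close_roundtrip_sketch(char *reply, size_t len)
{
    int sv[2];
    char req[8];
    ssize_t got = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Client: send the request, then shut down only the send direction. */
    if (write(sv[0], "ping", 4) != 4 || shutdown(sv[0], SHUT_WR) < 0)
        goto out;

    /* Server: read the request, then observe end-of-stream (read returns 0). */
    if (read(sv[1], req, sizeof(req)) != 4)
        goto out;
    if (read(sv[1], req, sizeof(req)) != 0)
        goto out;

    /* Server: the connection is still usable in the other direction. */
    if (write(sv[1], "pong", 4) != 4)
        goto out;

    /* Client: the reply arrives even though the client already did SHUT_WR. */
    got = read(sv[0], reply, len);
out:
    close(sv[0]);
    close(sv[1]);
    return got;
}
```

Tearing the socket down as soon as the peer's send side is shut down would break the last step, which is the failure mode the patch removes for vsock.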



[3.19.y-ckt stable] Patch "VSOCK: do not disconnect socket when peer has shutdown SEND only" has been added to the 3.19.y-ckt tree

2016-05-25 Thread Kamal Mostafa
This is a note to let you know that I have just added a patch titled

VSOCK: do not disconnect socket when peer has shutdown SEND only

to the linux-3.19.y-queue branch of the 3.19.y-ckt extended stable tree 
which can be found at:


https://git.launchpad.net/~canonical-kernel/linux/+git/linux-stable-ckt/log/?h=linux-3.19.y-queue

This patch is scheduled to be released in version 3.19.8-ckt22.

If you, or anyone else, feels it should not be added to this tree, please 
reply to this email.

For more information about the 3.19.y-ckt tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable

Thanks.
-Kamal

---8<

From 5175ed6b780a668391e00dba063cdf5ae175c4df Mon Sep 17 00:00:00 2001
From: Ian Campbell 
Date: Wed, 4 May 2016 14:21:53 +0100
Subject: VSOCK: do not disconnect socket when peer has shutdown SEND only

[ Upstream commit dedc58e067d8c379a15a8a183c5db318201295bb ]

The peer may be expecting a reply having sent a request and then done a
shutdown(SHUT_WR), so tearing down the whole socket at this point seems
wrong and breaks for me with a client which does a SHUT_WR.

Looking at other socket family's stream_recvmsg callbacks doing a shutdown
here does not seem to be the norm and removing it does not seem to have
had any adverse effects that I can see.

I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
on the vmci transport.

Signed-off-by: Ian Campbell 
Cc: "David S. Miller" 
Cc: Stefan Hajnoczi 
Cc: Claudio Imbrenda 
Cc: Andy King 
Cc: Dmitry Torokhov 
Cc: Jorgen Hansen 
Cc: Adit Ranadive 
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/vmw_vsock/af_vsock.c | 21 +
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 1d0e39c..316e856 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1796,27 +1796,8 @@ vsock_stream_recvmsg(struct kiocb *kiocb,
else if (sk->sk_shutdown & RCV_SHUTDOWN)
err = 0;

-   if (copied > 0) {
-   /* We only do these additional bookkeeping/notification steps
-* if we actually copied something out of the queue pair
-* instead of just peeking ahead.
-*/
-
-   if (!(flags & MSG_PEEK)) {
-   /* If the other side has shutdown for sending and there
-* is nothing more to read, then modify the socket
-* state.
-*/
-   if (vsk->peer_shutdown & SEND_SHUTDOWN) {
-   if (vsock_stream_has_data(vsk) <= 0) {
-   sk->sk_state = SS_UNCONNECTED;
-   sock_set_flag(sk, SOCK_DONE);
-   sk->sk_state_change(sk);
-   }
-   }
-   }
+   if (copied > 0)
err = copied;
-   }

 out_wait:
finish_wait(sk_sleep(sk), &wait);
--
2.7.4



Re: [PATCH net-next 1/2] net: vrf: Fix dst reference counting

2016-05-25 Thread David Ahern
I failed to update the auto-generated subject prefix in the patches. 
Both of these are for 4.5 stable not net-next.


On 5/25/16 10:35 AM, David Ahern wrote:
> commit 9ab179d83b4e31ea277a123492e419067c2f129a upstream
>
> Vivek reported a kernel exception deleting a VRF with an active
> connection through it. The root cause is that the socket has a cached
> reference to a dst that is destroyed. Converting the dst_destroy to
> dst_release and letting proper reference counting kick in does not
> work as the dst has a reference to the device which needs to be released
> as well.
>
> I talked to Hannes about this at netdev and he pointed out the ipv4 and
> ipv6 dst handling has dst_ifdown for just this scenario. Rather than
> continuing with the reinvented dst wheel in VRF just remove it and
> leverage the ipv4 and ipv6 versions.
>
> Fixes: 193125dbd8eb2 ("net: Introduce VRF device driver")
> Fixes: 35402e3136634 ("net: Add IPv6 support to VRF device")
>
> Signed-off-by: David Ahern 
> Signed-off-by: David S. Miller 




[PATCH net 4/4] net/mlx4_en: get rid of private net_device_stats

2016-05-25 Thread Eric Dumazet
We can simply use the standard net_device stats.

We do not need to clear fields that are already 0.

Signed-off-by: Eric Dumazet 
Cc: Willem de Bruijn 
Cc: Eugenia Emantayev 
---
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c |  2 +-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c  |  3 +--
 drivers/net/ethernet/mellanox/mlx4/en_port.c| 14 +++---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h|  1 -
 4 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index c761194bb323..fc95affaf76b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -362,7 +362,7 @@ static void mlx4_en_get_ethtool_stats(struct net_device 
*dev,
 
for (i = 0; i < NUM_MAIN_STATS; i++, bitmap_iterator_inc())
if (bitmap_iterator_test())
-   data[index++] = ((unsigned long *)&priv->stats)[i];
+   data[index++] = ((unsigned long *)&dev->stats)[i];
 
for (i = 0; i < NUM_PORT_STATS; i++, bitmap_iterator_inc())
if (bitmap_iterator_test())
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index a4fc6e966505..19ceced6736c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1302,7 +1302,7 @@ mlx4_en_get_stats64(struct net_device *dev, struct 
rtnl_link_stats64 *stats)
struct mlx4_en_priv *priv = netdev_priv(dev);
 
spin_lock_bh(&priv->stats_lock);
-   netdev_stats_to_stats64(stats, &priv->stats);
+   netdev_stats_to_stats64(stats, &dev->stats);
spin_unlock_bh(&priv->stats_lock);
 
return stats;
@@ -1877,7 +1877,6 @@ static void mlx4_en_clear_stats(struct net_device *dev)
if (mlx4_en_DUMP_ETH_STATS(mdev, priv->port, 1))
en_dbg(HW, priv, "Failed dumping statistics\n");
 
-   memset(&priv->stats, 0, sizeof(priv->stats));
memset(&priv->pstats, 0, sizeof(priv->pstats));
memset(&priv->pkstats, 0, sizeof(priv->pkstats));
memset(&priv->port_stats, 0, sizeof(priv->port_stats));
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_port.c 
b/drivers/net/ethernet/mellanox/mlx4/en_port.c
index 3df8690154b1..5aa8b751f417 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
@@ -152,8 +152,9 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 
port, u8 reset)
struct mlx4_counter tmp_counter_stats;
struct mlx4_en_stat_out_mbox *mlx4_en_stats;
struct mlx4_en_stat_out_flow_control_mbox *flowstats;
-   struct mlx4_en_priv *priv = netdev_priv(mdev->pndev[port]);
-   struct net_device_stats *stats = &priv->stats;
+   struct net_device *dev = mdev->pndev[port];
+   struct mlx4_en_priv *priv = netdev_priv(dev);
+   struct net_device_stats *stats = &dev->stats;
struct mlx4_cmd_mailbox *mailbox;
u64 in_mod = reset << 8 | port;
int err;
@@ -239,20 +240,11 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 
port, u8 reset)
stats->multicast = en_stats_adder(&mlx4_en_stats->MCAST_prio_0,
  &mlx4_en_stats->MCAST_prio_1,
  NUM_PRIORITIES);
-   stats->collisions = 0;
stats->rx_dropped = be32_to_cpu(mlx4_en_stats->RDROP) +
sw_rx_dropped;
stats->rx_length_errors = be32_to_cpu(mlx4_en_stats->RdropLength);
-   stats->rx_over_errors = 0;
stats->rx_crc_errors = be32_to_cpu(mlx4_en_stats->RCRC);
-   stats->rx_frame_errors = 0;
stats->rx_fifo_errors = be32_to_cpu(mlx4_en_stats->RdropOvflw);
-   stats->rx_missed_errors = 0;
-   stats->tx_aborted_errors = 0;
-   stats->tx_carrier_errors = 0;
-   stats->tx_fifo_errors = 0;
-   stats->tx_heartbeat_errors = 0;
-   stats->tx_window_errors = 0;
stats->tx_dropped += be32_to_cpu(mlx4_en_stats->TDROP);
 
/* RX stats */
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index bfda84df8aae..467d47ed2c39 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -483,7 +483,6 @@ struct mlx4_en_priv {
struct mlx4_en_port_profile *prof;
struct net_device *dev;
unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
-   struct net_device_stats stats;
struct mlx4_en_port_state port_state;
spinlock_t stats_lock;
struct ethtool_flow_id ethtool_rules[MAX_NUM_OF_FS_RULES];
-- 
2.8.0.rc3.226.g39d4020



[PATCH net 3/4] net/mlx4_en: get rid of ret_stats

2016-05-25 Thread Eric Dumazet
mlx4 uses a private struct net_device_stats in a vain attempt
to avoid races.

This is buggy because multiple cpus could call mlx4_en_get_stats()
at the same time, so ret_stats can not guarantee stable results.

To fix this, we need to switch to ndo_get_stats64() as this
method provides per-thread storage.

This allows to reduce mlx4_en_priv bloat.

Signed-off-by: Eric Dumazet 
Cc: Willem de Bruijn 
Cc: Eugenia Emantayev 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 11 ++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  1 -
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index bd637c4eff13..a4fc6e966505 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1296,15 +1296,16 @@ static void mlx4_en_tx_timeout(struct net_device *dev)
 }
 
 
-static struct net_device_stats *mlx4_en_get_stats(struct net_device *dev)
+static struct rtnl_link_stats64 *
+mlx4_en_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
struct mlx4_en_priv *priv = netdev_priv(dev);
 
spin_lock_bh(&priv->stats_lock);
-   memcpy(&priv->ret_stats, &priv->stats, sizeof(priv->stats));
+   netdev_stats_to_stats64(stats, &priv->stats);
spin_unlock_bh(&priv->stats_lock);
 
-   return &priv->ret_stats;
+   return stats;
 }
 
 static void mlx4_en_set_default_moderation(struct mlx4_en_priv *priv)
@@ -2487,7 +2488,7 @@ static const struct net_device_ops mlx4_netdev_ops = {
.ndo_stop   = mlx4_en_close,
.ndo_start_xmit = mlx4_en_xmit,
.ndo_select_queue   = mlx4_en_select_queue,
-   .ndo_get_stats  = mlx4_en_get_stats,
+   .ndo_get_stats64= mlx4_en_get_stats64,
.ndo_set_rx_mode= mlx4_en_set_rx_mode,
.ndo_set_mac_address= mlx4_en_set_mac,
.ndo_validate_addr  = eth_validate_addr,
@@ -2519,7 +2520,7 @@ static const struct net_device_ops mlx4_netdev_ops_master 
= {
.ndo_stop   = mlx4_en_close,
.ndo_start_xmit = mlx4_en_xmit,
.ndo_select_queue   = mlx4_en_select_queue,
-   .ndo_get_stats  = mlx4_en_get_stats,
+   .ndo_get_stats64= mlx4_en_get_stats64,
.ndo_set_rx_mode= mlx4_en_set_rx_mode,
.ndo_set_mac_address= mlx4_en_set_mac,
.ndo_validate_addr  = eth_validate_addr,
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 9a9124031fc7..bfda84df8aae 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -484,7 +484,6 @@ struct mlx4_en_priv {
struct net_device *dev;
unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
struct net_device_stats stats;
-   struct net_device_stats ret_stats;
struct mlx4_en_port_state port_state;
spinlock_t stats_lock;
struct ethtool_flow_id ethtool_rules[MAX_NUM_OF_FS_RULES];
-- 
2.8.0.rc3.226.g39d4020
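
The difference between returning a pointer to one shared snapshot buffer and filling storage supplied by the caller can be sketched in user space. The types and names below are hypothetical and the pthread mutex stands in for stats_lock. This is an illustration of the idea, not the driver code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct stats_sketch {
    uint64_t tx_packets;
    uint64_t tx_bytes;
};

struct priv_sketch {
    pthread_mutex_t stats_lock;   /* stands in for priv->stats_lock */
    struct stats_sketch stats;    /* live counters, updated elsewhere */
};

/*
 * Copy the live counters into storage owned by the caller. Each caller has
 * its own destination, so concurrent readers cannot overwrite each other's
 * snapshot the way a single shared ret_stats buffer could.
 */
static struct stats_sketch *get_stats_sketch(struct priv_sketch *priv,
                                             struct stats_sketch *out)
{
    pthread_mutex_lock(&priv->stats_lock);
    *out = priv->stats;
    pthread_mutex_unlock(&priv->stats_lock);
    return out;
}
```

A snapshot taken this way stays stable after the function returns, whatever other readers or the updater do next.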



[PATCH net 2/4] net/mlx4_en: clear some TX ring stats in mlx4_en_clear_stats()

2016-05-25 Thread Eric Dumazet
mlx4_en_clear_stats() clears almost everything, but a few TX ring
fields are missing:
- queue_stopped, wake_queue, tso_packets, xmit_more

Signed-off-by: Eric Dumazet 
Cc: Willem de Bruijn 
Cc: Eugenia Emantayev 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index cfd50206f7c3..bd637c4eff13 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1893,6 +1893,10 @@ static void mlx4_en_clear_stats(struct net_device *dev)
priv->tx_ring[i]->packets = 0;
priv->tx_ring[i]->tx_csum = 0;
priv->tx_ring[i]->tx_dropped = 0;
+   priv->tx_ring[i]->queue_stopped = 0;
+   priv->tx_ring[i]->wake_queue = 0;
+   priv->tx_ring[i]->tso_packets = 0;
+   priv->tx_ring[i]->xmit_more = 0;
}
for (i = 0; i < priv->rx_ring_num; i++) {
priv->rx_ring[i]->bytes = 0;
-- 
2.8.0.rc3.226.g39d4020
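
Why a missed per-ring field matters can be shown with a small sketch. It assumes port-level totals are rebuilt by summing the per-ring counters. The struct and function names are hypothetical and only the four field names come from the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-ring counters; field names follow the patch. */
struct tx_ring_sketch {
    unsigned long queue_stopped;
    unsigned long wake_queue;
    unsigned long tso_packets;
    unsigned long xmit_more;
};

/* Zero every per-ring counter, ring by ring. */
static void clear_ring_stats_sketch(struct tx_ring_sketch *rings, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        rings[i].queue_stopped = 0;
        rings[i].wake_queue = 0;
        rings[i].tso_packets = 0;
        rings[i].xmit_more = 0;
    }
}

/* Rebuild one port-level total by summing a counter over all rings. */
static unsigned long fold_queue_stopped_sketch(const struct tx_ring_sketch *rings,
                                               size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += rings[i].queue_stopped;
    return total;
}
```

If a per-ring field is left out of the clear loop, the next fold brings the old value straight back into the port total.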



[PATCH net 1/4] net/mlx4_en: fix tx_dropped bug

2016-05-25 Thread Eric Dumazet
1) mlx4_en_xmit() can increment priv->stats.tx_dropped, but this variable
is overwritten in mlx4_en_DUMP_ETH_STATS().

2) This increment was not SMP safe, as a port might have many TX queues.

Add a per TX ring tx_dropped to fix these issues.

This is u32 as mlx4_en_DUMP_ETH_STATS() will add a 32bit field.

So let's avoid bugs with SNMP agents having to cope with partial
overwraps. (One of these agents being bond_fold_stats())

Signed-off-by: Eric Dumazet 
Reported-by: Willem de Bruijn 
Cc: Eugenia Emantayev 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 1 +
 drivers/net/ethernet/mellanox/mlx4/en_port.c   | 4 +++-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   | 1 +
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 92e0624f4cf0..cfd50206f7c3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1892,6 +1892,7 @@ static void mlx4_en_clear_stats(struct net_device *dev)
priv->tx_ring[i]->bytes = 0;
priv->tx_ring[i]->packets = 0;
priv->tx_ring[i]->tx_csum = 0;
+   priv->tx_ring[i]->tx_dropped = 0;
}
for (i = 0; i < priv->rx_ring_num; i++) {
priv->rx_ring[i]->bytes = 0;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_port.c 
b/drivers/net/ethernet/mellanox/mlx4/en_port.c
index 20b6c2e678b8..3df8690154b1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
@@ -188,6 +188,7 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 
port, u8 reset)
}
stats->tx_packets = 0;
stats->tx_bytes = 0;
+   stats->tx_dropped = 0;
priv->port_stats.tx_chksum_offload = 0;
priv->port_stats.queue_stopped = 0;
priv->port_stats.wake_queue = 0;
@@ -199,6 +200,7 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 
port, u8 reset)
 
stats->tx_packets += ring->packets;
stats->tx_bytes += ring->bytes;
+   stats->tx_dropped += ring->tx_dropped;
priv->port_stats.tx_chksum_offload += ring->tx_csum;
priv->port_stats.queue_stopped += ring->queue_stopped;
priv->port_stats.wake_queue+= ring->wake_queue;
@@ -251,7 +253,7 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_en_dev *mdev, u8 
port, u8 reset)
stats->tx_fifo_errors = 0;
stats->tx_heartbeat_errors = 0;
stats->tx_window_errors = 0;
-   stats->tx_dropped = be32_to_cpu(mlx4_en_stats->TDROP);
+   stats->tx_dropped += be32_to_cpu(mlx4_en_stats->TDROP);
 
/* RX stats */
priv->pkstats.rx_multicast_packets = stats->multicast;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c 
b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index f6e61570cb2c..76aa4d27183c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -726,12 +726,12 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct 
net_device *dev)
bool inline_ok;
u32 ring_cons;
 
-   if (!priv->port_up)
-   goto tx_drop;
-
tx_ind = skb_get_queue_mapping(skb);
ring = priv->tx_ring[tx_ind];
 
+   if (!priv->port_up)
+   goto tx_drop;
+
/* fetch ring->cons far ahead before needing it to avoid stall */
ring_cons = ACCESS_ONCE(ring->cons);
 
@@ -1030,7 +1030,7 @@ tx_drop_unmap:
 
 tx_drop:
dev_kfree_skb_any(skb);
-   priv->stats.tx_dropped++;
+   ring->tx_dropped++;
return NETDEV_TX_OK;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index cc84e09f324a..9a9124031fc7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -270,6 +270,7 @@ struct mlx4_en_tx_ring {
unsigned long   tx_csum;
unsigned long   tso_packets;
unsigned long   xmit_more;
+   unsigned inttx_dropped;
struct mlx4_bf  bf;
unsigned long   queue_stopped;
 
-- 
2.8.0.rc3.226.g39d4020
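
As a rough userspace sketch of the idea in patch 1/4 (not the driver code): each TX ring keeps its own drop counter, written only by the path that owns that ring, and the stats reader rebuilds the total from zero on every dump before adding the device's 32-bit drop count. The names `struct ring`, `struct port`, `ring_drop()` and `fold_tx_dropped()` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RINGS 4

/* Hypothetical stand-in for one TX ring; only the counter matters here. */
struct ring {
	uint32_t tx_dropped;	/* bumped only by the xmit path owning this ring */
};

struct port {
	struct ring rings[NUM_RINGS];
	uint32_t hw_tdrop;	/* 32-bit drop count reported by the device */
};

/* xmit side: count a drop on the ring the packet was mapped to */
static void ring_drop(struct port *p, size_t ring_idx)
{
	assert(ring_idx < NUM_RINGS);
	p->rings[ring_idx].tx_dropped++;
}

/* stats side: rebuild the total from scratch so nothing gets overwritten */
static uint32_t fold_tx_dropped(const struct port *p)
{
	uint32_t total = 0;
	size_t i;

	for (i = 0; i < NUM_RINGS; i++)
		total += p->rings[i].tx_dropped;

	/* every term is 32 bits wide, so the sum wraps cleanly modulo 2^32 */
	return total + p->hw_tdrop;
}
```

Keeping the software counters at the same width as the hardware field means the folded value wraps as a whole, which is presumably what the commit message means by avoiding partial wraps.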



[PATCH net 0/4] net/mlx4_en: fix stats

2016-05-25 Thread Eric Dumazet
mlx4 has various bugs in its ndo_get_stats() and related functions.
This patch series addresses the obvious issues.
Remaining ones will be discussed later.

Eric Dumazet (4):
  net/mlx4_en: fix tx_dropped bug
  net/mlx4_en: clear some TX ring stats in mlx4_en_clear_stats()
  net/mlx4_en: get rid of ret_stats
  net/mlx4_en: get rid of private net_device_stats

 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c |  2 +-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c  | 17 +++--
 drivers/net/ethernet/mellanox/mlx4/en_port.c| 18 ++
 drivers/net/ethernet/mellanox/mlx4/en_tx.c  |  8 
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h|  3 +--
 5 files changed, 23 insertions(+), 25 deletions(-)

-- 
2.8.0.rc3.226.g39d4020



[PATCH net-next 2/2] net: vrf: protect changes to private data with rcu

2016-05-25 Thread David Ahern
commit b0e95ccdd77591f108c938bbc702b57554a1665d upstream

One cpu can be processing packets which includes using the cached route
entries in the vrf device's private data and on another cpu the device
gets deleted which releases the routes and sets the pointers in net_vrf
to NULL. This results in datapath dereferencing a NULL pointer.

Fix by protecting access to dst's with rcu.

Fixes: 193125dbd8eb ("net: Introduce VRF device driver")
Fixes: 35402e313663 ("net: Add IPv6 support to VRF device")
Signed-off-by: David Ahern 
Signed-off-by: David S. Miller 
---
 drivers/net/vrf.c | 68 +--
 1 file changed, 46 insertions(+), 22 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index d8197f93e56c..bfc9febfb022 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -47,8 +47,8 @@
((struct net_device *)rcu_dereference(dev->rx_handler_data))
 
 struct net_vrf {
-   struct rtable   *rth;
-   struct rt6_info *rt6;
+   struct rtable __rcu *rth;
+   struct rt6_info __rcu   *rt6;
u32 tb_id;
 };
 
@@ -355,10 +355,15 @@ static int vrf_output6(struct net *net, struct sock *sk, 
struct sk_buff *skb)
!(IP6CB(skb)->flags & IP6SKB_REROUTED));
 }
 
+/* holding rtnl */
 static void vrf_rt6_release(struct net_vrf *vrf)
 {
-   dst_release(&vrf->rt6->dst);
-   vrf->rt6 = NULL;
+   struct rt6_info *rt6 = rtnl_dereference(vrf->rt6);
+
+   rcu_assign_pointer(vrf->rt6, NULL);
+
+   if (rt6)
+   dst_release(&rt6->dst);
 }
 
 static int vrf_rt6_create(struct net_device *dev)
@@ -376,7 +381,8 @@ static int vrf_rt6_create(struct net_device *dev)
rt6->dst.output = vrf_output6;
rt6->rt6i_table = fib6_get_table(net, vrf->tb_id);
dst_hold(&rt6->dst);
-   vrf->rt6 = rt6;
+   rcu_assign_pointer(vrf->rt6, rt6);
+
rc = 0;
 out:
return rc;
@@ -450,26 +456,32 @@ static int vrf_output(struct net *net, struct sock *sk, 
struct sk_buff *skb)
!(IPCB(skb)->flags & IPSKB_REROUTED));
 }
 
+/* holding rtnl */
 static void vrf_rtable_release(struct net_vrf *vrf)
 {
-   struct dst_entry *dst = (struct dst_entry *)vrf->rth;
+   struct rtable *rth = rtnl_dereference(vrf->rth);
+
+   rcu_assign_pointer(vrf->rth, NULL);
 
-   dst_release(dst);
-   vrf->rth = NULL;
+   if (rth)
+   dst_release(&rth->dst);
 }
 
-static struct rtable *vrf_rtable_create(struct net_device *dev)
+static int vrf_rtable_create(struct net_device *dev)
 {
struct net_vrf *vrf = netdev_priv(dev);
struct rtable *rth;
 
rth = rt_dst_alloc(dev, 0, RTN_UNICAST, 1, 1, 0);
-   if (rth) {
-   rth->dst.output = vrf_output;
-   rth->rt_table_id = vrf->tb_id;
-   }
+   if (!rth)
+   return -ENOMEM;
 
-   return rth;
+   rth->dst.output = vrf_output;
+   rth->rt_table_id = vrf->tb_id;
+
+   rcu_assign_pointer(vrf->rth, rth);
+
+   return 0;
 }
 
 / device handling /
@@ -573,8 +585,7 @@ static int vrf_dev_init(struct net_device *dev)
goto out_nomem;
 
/* create the default dst which points back to us */
-   vrf->rth = vrf_rtable_create(dev);
-   if (!vrf->rth)
+   if (vrf_rtable_create(dev) != 0)
goto out_stats;
 
if (vrf_rt6_create(dev) != 0)
@@ -617,8 +628,13 @@ static struct rtable *vrf_get_rtable(const struct 
net_device *dev,
if (!(fl4->flowi4_flags & FLOWI_FLAG_L3MDEV_SRC)) {
struct net_vrf *vrf = netdev_priv(dev);
 
-   rth = vrf->rth;
-   dst_hold(&rth->dst);
+   rcu_read_lock();
+
+   rth = rcu_dereference(vrf->rth);
+   if (likely(rth))
+   dst_hold(&rth->dst);
+
+   rcu_read_unlock();
}
 
return rth;
@@ -663,16 +679,24 @@ static int vrf_get_saddr(struct net_device *dev, struct 
flowi4 *fl4)
 static struct dst_entry *vrf_get_rt6_dst(const struct net_device *dev,
 const struct flowi6 *fl6)
 {
-   struct rt6_info *rt = NULL;
+   struct dst_entry *dst = NULL;
 
if (!(fl6->flowi6_flags & FLOWI_FLAG_L3MDEV_SRC)) {
struct net_vrf *vrf = netdev_priv(dev);
+   struct rt6_info *rt;
+
+   rcu_read_lock();
+
+   rt = rcu_dereference(vrf->rt6);
+   if (likely(rt)) {
+   dst = &rt->dst;
+   dst_hold(dst);
+   }
 
-   rt = vrf->rt6;
-   dst_hold(&rt->dst);
+   rcu_read_unlock();
}
 
-   return (struct dst_entry *)rt;
+   return dst;
 }
 #endif
 
-- 
2.1.4
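
The ordering that patch 2/2 relies on can be illustrated with a small userspace sketch. This is not RCU: it shows only the two halves visible in the diff, namely that teardown unpublishes the pointer before dropping its reference, and that a reader checks for NULL before taking one. `struct obj`, `struct holder` and the `holder_*()` helpers are hypothetical names. The sketch is safe only when run single-threaded, because nothing here stops an object from being freed between the reader's load and its increment; in the kernel that window is closed by `rcu_read_lock()` together with deferred freeing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a cached dst entry. */
struct obj {
	atomic_int refcnt;
};

/* Shared slot standing in for vrf->rth / vrf->rt6. */
struct holder {
	_Atomic(struct obj *) cached;
};

static int obj_freed;	/* test hook: counts objects that were freed */

/* drop one reference and free on the last one */
static void obj_put(struct obj *o)
{
	assert(atomic_load(&o->refcnt) > 0);
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		free(o);
		obj_freed++;
	}
}

/* create: the slot itself owns one reference */
static int holder_create(struct holder *h)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return -1;
	atomic_init(&o->refcnt, 1);
	atomic_store(&h->cached, o);
	return 0;
}

/* reader: may see NULL once teardown has started, so check before holding */
static struct obj *holder_get(struct holder *h)
{
	struct obj *o = atomic_load(&h->cached);

	if (o)
		atomic_fetch_add(&o->refcnt, 1);
	return o;
}

/* teardown: unpublish first, then drop the slot's reference */
static void holder_release(struct holder *h)
{
	struct obj *o = atomic_exchange(&h->cached, (struct obj *)NULL);

	if (o)
		obj_put(o);
}
```

A reader that obtained a reference before teardown keeps the object alive until its own `obj_put()`, while a reader arriving afterwards simply gets NULL instead of dereferencing a stale pointer.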



[PATCH net-next 1/2] net: vrf: Fix dst reference counting

2016-05-25 Thread David Ahern
commit 9ab179d83b4e31ea277a123492e419067c2f129a upstream

Vivek reported a kernel exception deleting a VRF with an active
connection through it. The root cause is that the socket has a cached
reference to a dst that is destroyed. Converting the dst_destroy to
dst_release and letting proper reference counting kick in does not
work as the dst has a reference to the device which needs to be released
as well.

I talked to Hannes about this at netdev and he pointed out the ipv4 and
ipv6 dst handling has dst_ifdown for just this scenario. Rather than
continuing with the reinvented dst wheel in VRF just remove it and
leverage the ipv4 and ipv6 versions.

Fixes: 193125dbd8eb2 ("net: Introduce VRF device driver")
Fixes: 35402e3136634 ("net: Add IPv6 support to VRF device")

Signed-off-by: David Ahern 
Signed-off-by: David S. Miller 
---
 drivers/net/vrf.c   | 177 +---
 include/net/ip6_route.h |   3 +
 include/net/route.h |   3 +
 net/ipv4/route.c|   7 +-
 net/ipv6/route.c|   7 +-
 5 files changed, 30 insertions(+), 167 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index bdcf617a9d52..d8197f93e56c 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -61,41 +61,6 @@ struct pcpu_dstats {
struct u64_stats_sync   syncp;
 };
 
-static struct dst_entry *vrf_ip_check(struct dst_entry *dst, u32 cookie)
-{
-   return dst;
-}
-
-static int vrf_ip_local_out(struct net *net, struct sock *sk, struct sk_buff 
*skb)
-{
-   return ip_local_out(net, sk, skb);
-}
-
-static unsigned int vrf_v4_mtu(const struct dst_entry *dst)
-{
-   /* TO-DO: return max ethernet size? */
-   return dst->dev->mtu;
-}
-
-static void vrf_dst_destroy(struct dst_entry *dst)
-{
-   /* our dst lives forever - or until the device is closed */
-}
-
-static unsigned int vrf_default_advmss(const struct dst_entry *dst)
-{
-   return 65535 - 40;
-}
-
-static struct dst_ops vrf_dst_ops = {
-   .family = AF_INET,
-   .local_out  = vrf_ip_local_out,
-   .check  = vrf_ip_check,
-   .mtu= vrf_v4_mtu,
-   .destroy= vrf_dst_destroy,
-   .default_advmss = vrf_default_advmss,
-};
-
 /* neighbor handling is done with actual device; do not want
  * to flip skb->dev for those ndisc packets. This really fails
  * for multiple next protocols (e.g., NEXTHDR_HOP). But it is
@@ -350,46 +315,6 @@ static netdev_tx_t vrf_xmit(struct sk_buff *skb, struct 
net_device *dev)
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-static struct dst_entry *vrf_ip6_check(struct dst_entry *dst, u32 cookie)
-{
-   return dst;
-}
-
-static struct dst_ops vrf_dst_ops6 = {
-   .family = AF_INET6,
-   .local_out  = ip6_local_out,
-   .check  = vrf_ip6_check,
-   .mtu= vrf_v4_mtu,
-   .destroy= vrf_dst_destroy,
-   .default_advmss = vrf_default_advmss,
-};
-
-static int init_dst_ops6_kmem_cachep(void)
-{
-   vrf_dst_ops6.kmem_cachep = kmem_cache_create("vrf_ip6_dst_cache",
-sizeof(struct rt6_info),
-0,
-SLAB_HWCACHE_ALIGN,
-NULL);
-
-   if (!vrf_dst_ops6.kmem_cachep)
-   return -ENOMEM;
-
-   return 0;
-}
-
-static void free_dst_ops6_kmem_cachep(void)
-{
-   kmem_cache_destroy(vrf_dst_ops6.kmem_cachep);
-}
-
-static int vrf_input6(struct sk_buff *skb)
-{
-   skb->dev->stats.rx_errors++;
-   kfree_skb(skb);
-   return 0;
-}
-
 /* modelled after ip6_finish_output2 */
 static int vrf_finish_output6(struct net *net, struct sock *sk,
  struct sk_buff *skb)
@@ -430,67 +355,34 @@ static int vrf_output6(struct net *net, struct sock *sk, 
struct sk_buff *skb)
!(IP6CB(skb)->flags & IP6SKB_REROUTED));
 }
 
-static void vrf_rt6_destroy(struct net_vrf *vrf)
+static void vrf_rt6_release(struct net_vrf *vrf)
 {
-   dst_destroy(&vrf->rt6->dst);
-   free_percpu(vrf->rt6->rt6i_pcpu);
+   dst_release(&vrf->rt6->dst);
vrf->rt6 = NULL;
 }
 
 static int vrf_rt6_create(struct net_device *dev)
 {
struct net_vrf *vrf = netdev_priv(dev);
-   struct dst_entry *dst;
+   struct net *net = dev_net(dev);
struct rt6_info *rt6;
-   int cpu;
int rc = -ENOMEM;
 
-   rt6 = dst_alloc(&vrf_dst_ops6, dev, 0,
-   DST_OBSOLETE_NONE,
-   (DST_HOST | DST_NOPOLICY | DST_NOXFRM));
+   rt6 = ip6_dst_alloc(net, dev,
+   DST_HOST | DST_NOPOLICY | DST_NOXFRM | DST_NOCACHE);
if (!rt6)
goto out;
 
-   dst = &rt6->dst;
-
-   rt6->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, GFP_KERNEL);
-   if (!rt6->rt6i_pcpu) {
-  

[PATCH 4.5-stable 0/2] net: vrf: 4.5 backports

2016-05-25 Thread David Ahern
Backports of 2 vrf patches to 4.5.5 stable tree.

Dave: not sure if you want me to send to stable maintainers or if you
  want to add them to your stable queue. I have the 4.4.11 versions
  ready as well.

David Ahern (2):
  net: vrf: Fix dst reference counting
  net: vrf: protect changes to private data with rcu

 drivers/net/vrf.c   | 235 
 include/net/ip6_route.h |   3 +
 include/net/route.h |   3 +
 net/ipv4/route.c|   7 +-
 net/ipv6/route.c|   7 +-
 5 files changed, 71 insertions(+), 184 deletions(-)

-- 
2.1.4



[PATCH percpu/for-4.7-fixes 2/2] percpu: fix synchronization between synchronous map extension and chunk destruction

2016-05-25 Thread Tejun Heo
For non-atomic allocations, pcpu_alloc() can try to extend the area
map synchronously after dropping pcpu_lock; however, the extension
wasn't synchronized against chunk destruction and the chunk might get
freed while extension is in progress.

This patch fixes the bug by putting most of non-atomic allocations
under pcpu_alloc_mutex to synchronize against pcpu_balance_work which
is responsible for async chunk management including destruction.

Signed-off-by: Tejun Heo 
Reported-and-tested-by: Alexei Starovoitov 
Reported-by: Vlastimil Babka 
Reported-by: Sasha Levin 
Cc: sta...@vger.kernel.org # v3.18+
Fixes: 1a4d76076cda ("percpu: implement asynchronous chunk population")
---
Hello,

I'll send both patches mainline in a couple days through the percpu
tree.

Thanks.

 mm/percpu.c |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -162,7 +162,7 @@ static struct pcpu_chunk *pcpu_reserved_
 static int pcpu_reserved_chunk_limit;
 
 static DEFINE_SPINLOCK(pcpu_lock); /* all internal data structures */
-static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop */
+static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop, map 
ext */
 
 static struct list_head *pcpu_slot __read_mostly; /* chunk list slots */
 
@@ -444,6 +444,8 @@ static int pcpu_extend_area_map(struct p
size_t old_size = 0, new_size = new_alloc * sizeof(new[0]);
unsigned long flags;
 
+   lockdep_assert_held(&pcpu_alloc_mutex);
+
new = pcpu_mem_zalloc(new_size);
if (!new)
return -ENOMEM;
@@ -890,6 +892,9 @@ static void __percpu *pcpu_alloc(size_t
return NULL;
}
 
+   if (!is_atomic)
+   mutex_lock(&pcpu_alloc_mutex);
+
spin_lock_irqsave(&pcpu_lock, flags);
 
/* serve reserved allocations from the reserved chunk if available */
@@ -962,12 +967,9 @@ restart:
if (is_atomic)
goto fail;
 
-   mutex_lock(&pcpu_alloc_mutex);
-
if (list_empty(&pcpu_slot[pcpu_nr_slots - 1])) {
chunk = pcpu_create_chunk();
if (!chunk) {
-   mutex_unlock(&pcpu_alloc_mutex);
err = "failed to allocate new chunk";
goto fail;
}
@@ -978,7 +980,6 @@ restart:
spin_lock_irqsave(&pcpu_lock, flags);
}
 
-   mutex_unlock(&pcpu_alloc_mutex);
goto restart;
 
 area_found:
@@ -988,8 +989,6 @@ area_found:
if (!is_atomic) {
int page_start, page_end, rs, re;
 
-   mutex_lock(&pcpu_alloc_mutex);
-
page_start = PFN_DOWN(off);
page_end = PFN_UP(off + size);
 
@@ -1000,7 +999,6 @@ area_found:
 
spin_lock_irqsave(&pcpu_lock, flags);
if (ret) {
-   mutex_unlock(&pcpu_alloc_mutex);
pcpu_free_area(chunk, off, &occ_pages);
err = "failed to populate";
goto fail_unlock;
@@ -1040,6 +1038,8 @@ fail:
/* see the flag handling in pcpu_blance_workfn() */
pcpu_atomic_alloc_failed = true;
pcpu_schedule_balance_work();
+   } else {
+   mutex_unlock(&pcpu_alloc_mutex);
}
return NULL;
 }
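
The locking shape of patch 2/2 can be sketched in plain C. The sketch assumes a toy lock that only records whether it is held (it is not a real mutex and gives no mutual exclusion), and `toy_alloc()`, `extend_map()`, `pool` and `alloc_mutex` are hypothetical names. It shows the general idea only: a non-atomic caller takes the mutex once on entry, so any synchronous map extension runs with it held, and every exit path drops it again. The real pcpu_alloc() has several exit labels rather than the single one used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock that only records whether it is held; not a real mutex. */
struct toy_mutex {
	bool held;
};

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

static struct toy_mutex alloc_mutex;
static char pool[64];
static size_t pool_used;

/* extending is only legal with alloc_mutex held, as lockdep would assert */
static bool extend_map(void)
{
	assert(alloc_mutex.held);
	return true;
}

static void *toy_alloc(size_t size, bool is_atomic)
{
	void *ptr = NULL;

	/* sleeping callers serialize against teardown for the whole call */
	if (!is_atomic)
		toy_lock(&alloc_mutex);

	if (size == 0 || size > sizeof(pool) - pool_used)
		goto out;

	/* only the non-atomic path may do the synchronous extension */
	if (!is_atomic && !extend_map())
		goto out;

	ptr = pool + pool_used;
	pool_used += size;
out:
	/* single exit: success and failure both drop the mutex */
	if (!is_atomic)
		toy_unlock(&alloc_mutex);
	return ptr;
}
```

Atomic callers never touch the mutex, which matches the `if (!is_atomic)` guards in the diff.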


[PATCH percpu/for-4.7-fixes 1/2] percpu: fix synchronization between chunk->map_extend_work and chunk destruction

2016-05-25 Thread Tejun Heo
Atomic allocations can trigger async map extensions which are serviced
by chunk->map_extend_work.  pcpu_balance_work which is responsible for
destroying idle chunks wasn't synchronizing properly against
chunk->map_extend_work and may end up freeing the chunk while the work
item is still in flight.

This patch fixes the bug by rolling async map extension operations
into pcpu_balance_work.

Signed-off-by: Tejun Heo 
Reported-and-tested-by: Alexei Starovoitov 
Reported-by: Vlastimil Babka 
Reported-by: Sasha Levin 
Cc: sta...@vger.kernel.org # v3.18+
Fixes: 9c824b6a172c ("percpu: make sure chunk->map array has available space")
---
 mm/percpu.c |   57 -
 1 file changed, 36 insertions(+), 21 deletions(-)

--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -112,7 +112,7 @@ struct pcpu_chunk {
int map_used;   /* # of map entries used before 
the sentry */
int map_alloc;  /* # of map entries allocated */
int *map;   /* allocation map */
-   struct work_struct  map_extend_work;/* async ->map[] extension */
+   struct list_headmap_extend_list;/* on pcpu_map_extend_chunks */
 
void*data;  /* chunk data */
int first_free; /* no free below this */
@@ -166,6 +166,9 @@ static DEFINE_MUTEX(pcpu_alloc_mutex);  /
 
 static struct list_head *pcpu_slot __read_mostly; /* chunk list slots */
 
+/* chunks which need their map areas extended, protected by pcpu_lock */
+static LIST_HEAD(pcpu_map_extend_chunks);
+
 /*
  * The number of empty populated pages, protected by pcpu_lock.  The
  * reserved chunk doesn't contribute to the count.
@@ -395,13 +398,19 @@ static int pcpu_need_to_extend(struct pc
 {
int margin, new_alloc;
 
+   lockdep_assert_held(&pcpu_lock);
+
if (is_atomic) {
margin = 3;
 
if (chunk->map_alloc <
-   chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW &&
-   pcpu_async_enabled)
-   schedule_work(&chunk->map_extend_work);
+   chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW) {
+   if (list_empty(&chunk->map_extend_list)) {
+   list_add_tail(&chunk->map_extend_list,
+ &pcpu_map_extend_chunks);
+   pcpu_schedule_balance_work();
+   }
+   }
} else {
margin = PCPU_ATOMIC_MAP_MARGIN_HIGH;
}
@@ -467,20 +476,6 @@ out_unlock:
return 0;
 }
 
-static void pcpu_map_extend_workfn(struct work_struct *work)
-{
-   struct pcpu_chunk *chunk = container_of(work, struct pcpu_chunk,
-   map_extend_work);
-   int new_alloc;
-
-   spin_lock_irq(&pcpu_lock);
-   new_alloc = pcpu_need_to_extend(chunk, false);
-   spin_unlock_irq(&pcpu_lock);
-
-   if (new_alloc)
-   pcpu_extend_area_map(chunk, new_alloc);
-}
-
 /**
  * pcpu_fit_in_area - try to fit the requested allocation in a candidate area
  * @chunk: chunk the candidate area belongs to
@@ -740,7 +735,7 @@ static struct pcpu_chunk *pcpu_alloc_chu
chunk->map_used = 1;
 
INIT_LIST_HEAD(&chunk->list);
-   INIT_WORK(&chunk->map_extend_work, pcpu_map_extend_workfn);
+   INIT_LIST_HEAD(&chunk->map_extend_list);
chunk->free_size = pcpu_unit_size;
chunk->contig_hint = pcpu_unit_size;
 
@@ -1129,6 +1124,7 @@ static void pcpu_balance_workfn(struct w
if (chunk == list_first_entry(free_head, struct pcpu_chunk, 
list))
continue;
 
+   list_del_init(&chunk->map_extend_list);
list_move(&chunk->list, &to_free);
}
 
@@ -1146,6 +1142,25 @@ static void pcpu_balance_workfn(struct w
pcpu_destroy_chunk(chunk);
}
 
+   /* service chunks which requested async area map extension */
+   do {
+   int new_alloc = 0;
+
+   spin_lock_irq(&pcpu_lock);
+
+   chunk = list_first_entry_or_null(&pcpu_map_extend_chunks,
+   struct pcpu_chunk, map_extend_list);
+   if (chunk) {
+   list_del_init(&chunk->map_extend_list);
+   new_alloc = pcpu_need_to_extend(chunk, false);
+   }
+
+   spin_unlock_irq(&pcpu_lock);
+
+   if (new_alloc)
+   pcpu_extend_area_map(chunk, new_alloc);
+   } while (chunk);
+
/*
 * Ensure there are certain number of free populated pages for
 * atomic allocs.  Fill up from the most packed so that atomic
@@ -1644,7 +1659,7 @@ int __init pcpu_setup_first_chunk(const
 */
schunk = memblock_virt_alloc(pcpu_chunk_struct_size, 0);
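
The queueing scheme that patch 1/2 describes (requests go onto a pending list, and the one worker that also destroys chunks drains it) can be sketched in userspace C. The list helpers below are a minimal imitation of the kernel's `list_head`, and `struct chunk`, `request_extend()`, `chunk_retire()` and `drain_pending()` are hypothetical names. Locking is left out entirely; in the patch the list is protected by pcpu_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* unlink and re-initialise, so list_is_empty(n) is true again afterwards */
static void list_del_init_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical chunk: the embedded node doubles as the "already queued" flag. */
struct chunk {
	struct list_node extend_node;
	int map_alloc;
};

static struct list_node pending = { &pending, &pending };

/* request side: queue the chunk at most once */
static void request_extend(struct chunk *c)
{
	assert(c != NULL);
	if (list_is_empty(&c->extend_node))
		list_add_tail_node(&c->extend_node, &pending);
}

/* destroy side: a dying chunk must leave the pending list first */
static void chunk_retire(struct chunk *c)
{
	list_del_init_node(&c->extend_node);
}

/* worker side: pop and service entries until the list is empty */
static int drain_pending(void)
{
	int serviced = 0;

	while (!list_is_empty(&pending)) {
		struct list_node *n = pending.next;
		struct chunk *c = (struct chunk *)((char *)n -
					offsetof(struct chunk, extend_node));

		list_del_init_node(n);
		c->map_alloc *= 2;	/* stand-in for the real map extension */
		serviced++;
	}
	return serviced;
}
```

Because destruction and extension are driven from the same worker, a chunk retired before the drain runs is never touched by it, which is the property the separate work item could not guarantee.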

Re: [PATCH v8 22/22] MAINTAINERS: Add maintainers for HiSilicon RoCE driver

2016-05-25 Thread Joe Perches
On Wed, 2016-05-25 at 23:05 +0800, Lijun Ou wrote:
> This patch added maintainers for RoCE driver.

Please add sections in alphabetic order.

> diff --git a/MAINTAINERS b/MAINTAINERS
[]
> @@ -10121,6 +10121,14 @@ W:   http://www.emulex.com
>  S:   Supported
>  F:   drivers/infiniband/hw/ocrdma/
>  
> +HISILICON ROCE DRIVER
> +M:   Wei Hu(Xavier) 
> +M:   Lijun Ou 
> +L:   linux-r...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/infiniband/hw/hns/
> +F:   Documentation/devicetree/bindings/infiniband/hisilicon-hns-roce.txt
> +
>  SFC NETWORK DRIVER
>  M:   Solarflare linux maintainers 
>  M:   Edward Cree 


[PATCH v8 19/22] IB/hns: Add memory region operations support

2016-05-25 Thread Lijun Ou
This patch implements memory region (MR) support.
Memory registration provides mechanisms that allow consumers
to describe a set of virtually contiguous memory locations or
a set of physically contiguous memory locations.
The MR operations are as follows:
1. get DMA MR in kernel mode
2. get MR in user mode
3. deregister MR
The locations of some functions were also adjusted in
some files.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
Signed-off-by: Salil Mehta 
---
 drivers/infiniband/hw/hns/hns_roce_cmd.h|   9 +
 drivers/infiniband/hw/hns/hns_roce_device.h |  45 +
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  | 157 +
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  | 103 +++
 drivers/infiniband/hw/hns/hns_roce_icm.h|   1 +
 drivers/infiniband/hw/hns/hns_roce_main.c   |   7 +
 drivers/infiniband/hw/hns/hns_roce_mr.c | 253 
 drivers/infiniband/hw/hns/hns_roce_qp.c |   1 +
 8 files changed, 576 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.h 
b/drivers/infiniband/hw/hns/hns_roce_cmd.h
index cb3e85a..7b37bea 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.h
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.h
@@ -36,6 +36,14 @@
 #include 
 
 enum {
+   /* TPT commands */
+   HNS_ROCE_CMD_SW2HW_MPT  = 0xd,
+   HNS_ROCE_CMD_HW2SW_MPT  = 0xf,
+
+   /* CQ commands */
+   HNS_ROCE_CMD_SW2HW_CQ   = 0x16,
+   HNS_ROCE_CMD_HW2SW_CQ   = 0x17,
+
/* QP/EE commands */
HNS_ROCE_CMD_RST2INIT_QP= 0x19,
HNS_ROCE_CMD_INIT2RTR_QP= 0x1a,
@@ -51,6 +59,7 @@ enum {
 
 enum {
HNS_ROCE_CMD_TIME_CLASS_A   = 1,
+   HNS_ROCE_CMD_TIME_CLASS_B   = 1,
HNS_ROCE_CMD_TIME_CLASS_C   = 1,
 };
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 09076e9..d08b699 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -57,6 +57,10 @@
 #define HNS_ROCE_MIN_CQE_NUM   0x40
 #define HNS_ROCE_MIN_WQE_NUM   0x20
 
+/* Hardware specification only for v1 engine */
+#define HNS_ROCE_MAX_INNER_MTPT_NUM0x7
+#define HNS_ROCE_MAX_MTPT_PBL_NUM  0x10
+
 #define HNS_ROCE_MAX_IRQ_NUM   34
 
 #define HNS_ROCE_COMP_VEC_NUM  32
@@ -73,9 +77,17 @@
 #define HNS_ROCE_MAX_GID_NUM   16
 #define HNS_ROCE_GID_SIZE  16
 
+#define MR_TYPE_MR 0x00
+#define MR_TYPE_DMA0x03
+
 #define PKEY_ID0x
 #define NODE_DESC_SIZE 64
 
+#define SERV_TYPE_RC   0
+#define SERV_TYPE_RD   1
+#define SERV_TYPE_UC   2
+#define SERV_TYPE_UD   3
+
 /* Address shift 12bit with the special hardware address operation of RoCEE */
 #define ADDR_SHIFT_12  12
 
@@ -85,7 +97,10 @@
 /* Address shift 44bit with the special hardware address operation of RoCEE */
 #define ADDR_SHIFT_44  44
 
+#define PAGES_SHIFT_8  8
 #define PAGES_SHIFT_16 16
+#define PAGES_SHIFT_24 24
+#define PAGES_SHIFT_32 32
 
 enum hns_roce_qp_state {
HNS_ROCE_QP_STATE_RST= 0,
@@ -229,6 +244,23 @@ struct hns_roce_mtt {
int page_shift;
 };
 
+/* Only support 4K page size for mr register */
+#define MR_SIZE_4K 0
+
+struct hns_roce_mr {
+   struct ib_mribmr;
+   struct ib_umem  *umem;
+   u64 iova; /* MR's virtual original addr */
+   u64 size; /* Address range of MR */
+   u32 key; /* Key of MR */
+   u32 pd;   /* PD num of MR */
+   u32 access;/* Access permission of MR */
+   int enabled; /* MR's active status */
+   int type;   /* MR's register type */
+   u64 *pbl_buf;/* MR's PBL space */
+   dma_addr_t  pbl_dma_addr;   /* MR's PBL space PA */
+};
+
 struct hns_roce_mr_table {
struct hns_roce_bitmap  mtpt_bitmap;
struct hns_roce_buddy   mtt_buddy;
@@ -499,6 +531,8 @@ struct hns_roce_hw {
void (*set_mac)(struct hns_roce_dev *hr_dev, u8 phy_port, u8 *addr);
void (*set_mtu)(struct hns_roce_dev *hr_dev, u8 phy_port,
enum ib_mtu mtu);
+   int (*write_mtpt)(void *mb_buf, struct hns_roce_mr *mr,
+ unsigned long 

[PATCH v8 10/22] IB/hns: Add process flow to init RoCE engine

2016-05-25 Thread Lijun Ou
This patch initializes the RoCE engine, which is required
before RoCE can run. It configures the DMAE user, initializes
the doorbell and raq operations, and enables the port.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_common.h | 107 +++
 drivers/infiniband/hw/hns/hns_roce_device.h |  15 +
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  | 477 
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  |  68 +++-
 drivers/infiniband/hw/hns/hns_roce_main.c   |  20 ++
 5 files changed, 686 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h 
b/drivers/infiniband/hw/hns/hns_roce_common.h
index 5998778..73c6220 100644
--- a/drivers/infiniband/hw/hns/hns_roce_common.h
+++ b/drivers/infiniband/hw/hns/hns_roce_common.h
@@ -53,6 +53,93 @@
 #define roce_set_bit(origin, shift, val) \
roce_set_field((origin), (1ul << (shift)), (shift), (val))
 
+#define ROCEE_GLB_CFG_ROCEE_DB_SQ_MODE_S 3
+#define ROCEE_GLB_CFG_ROCEE_DB_OTH_MODE_S 4
+
+#define ROCEE_GLB_CFG_SQ_EXT_DB_MODE_S 5
+
+#define ROCEE_GLB_CFG_OTH_EXT_DB_MODE_S 6
+
+#define ROCEE_GLB_CFG_ROCEE_PORT_ST_S 10
+#define ROCEE_GLB_CFG_ROCEE_PORT_ST_M  \
+   (((1UL << 6) - 1) << ROCEE_GLB_CFG_ROCEE_PORT_ST_S)
+
+#define ROCEE_GLB_CFG_TRP_RAQ_DROP_EN_S 16
+
+#define ROCEE_DMAE_USER_CFG1_ROCEE_STREAM_ID_TB_CFG_S 0
+#define ROCEE_DMAE_USER_CFG1_ROCEE_STREAM_ID_TB_CFG_M  \
+   (((1UL << 24) - 1) << ROCEE_DMAE_USER_CFG1_ROCEE_STREAM_ID_TB_CFG_S)
+
+#define ROCEE_DMAE_USER_CFG1_ROCEE_CACHE_TB_CFG_S 24
+#define ROCEE_DMAE_USER_CFG1_ROCEE_CACHE_TB_CFG_M  \
+   (((1UL << 4) - 1) << ROCEE_DMAE_USER_CFG1_ROCEE_CACHE_TB_CFG_S)
+
+#define ROCEE_DMAE_USER_CFG2_ROCEE_STREAM_ID_PKT_CFG_S 0
+#define ROCEE_DMAE_USER_CFG2_ROCEE_STREAM_ID_PKT_CFG_M   \
+   (((1UL << 24) - 1) << ROCEE_DMAE_USER_CFG2_ROCEE_STREAM_ID_PKT_CFG_S)
+
+#define ROCEE_DMAE_USER_CFG2_ROCEE_CACHE_PKT_CFG_S 24
+#define ROCEE_DMAE_USER_CFG2_ROCEE_CACHE_PKT_CFG_M   \
+   (((1UL << 4) - 1) << ROCEE_DMAE_USER_CFG2_ROCEE_CACHE_PKT_CFG_S)
+
+#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_S 0
+#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_M   \
+   (((1UL << 16) - 1) << ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_S)
+
+#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_EMPTY_S 16
+#define ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_EMPTY_M   \
+   (((1UL << 16) - 1) << ROCEE_DB_SQ_WL_ROCEE_DB_SQ_WL_EMPTY_S)
+
+#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_S 0
+#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_M   \
+   (((1UL << 16) - 1) << ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_S)
+
+#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_EMPTY_S 16
+#define ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_EMPTY_M   \
+   (((1UL << 16) - 1) << ROCEE_DB_OTHERS_WL_ROCEE_DB_OTH_WL_EMPTY_S)
+
+#define ROCEE_RAQ_WL_ROCEE_RAQ_WL_S 0
+#define ROCEE_RAQ_WL_ROCEE_RAQ_WL_M   \
+   (((1UL << 8) - 1) << ROCEE_RAQ_WL_ROCEE_RAQ_WL_S)
+
+#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_POL_TIME_INTERVAL_S 0
+#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_POL_TIME_INTERVAL_M   \
+   (((1UL << 15) - 1) << \
+   ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_POL_TIME_INTERVAL_S)
+
+#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_CFG_S 16
+#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_CFG_M   \
+   (((1UL << 4) - 1) << \
+   ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_CFG_S)
+
+#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_RAQ_TIMEOUT_CHK_EN_S 20
+
+#define ROCEE_WRMS_POL_TIME_INTERVAL_WRMS_EXT_RAQ_MODE 21
+
+#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_SHIFT_S 0
+#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_SHIFT_M   \
+   (((1UL << 5) - 1) << ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_SHIFT_S)
+
+#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_BA_H_S 5
+#define ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_BA_H_M   \
+   (((1UL << 5) - 1) << ROCEE_EXT_DB_SQ_H_EXT_DB_SQ_BA_H_S)
+
+#define ROCEE_EXT_DB_OTH_H_EXT_DB_OTH_SHIFT_S 0
+#define ROCEE_EXT_DB_OTH_H_EXT_DB_OTH_SHIFT_M   \
+   (((1UL << 5) - 1) << ROCEE_EXT_DB_OTH_H_EXT_DB_OTH_SHIFT_S)
+
+#define ROCEE_EXT_DB_SQ_H_EXT_DB_OTH_BA_H_S 5
+#define ROCEE_EXT_DB_SQ_H_EXT_DB_OTH_BA_H_M   \
+   (((1UL << 5) - 1) << ROCEE_EXT_DB_SQ_H_EXT_DB_OTH_BA_H_S)
+
+#define ROCEE_EXT_RAQ_H_EXT_RAQ_SHIFT_S 0
+#define ROCEE_EXT_RAQ_H_EXT_RAQ_SHIFT_M   \
+   (((1UL << 5) - 1) << ROCEE_EXT_RAQ_H_EXT_RAQ_SHIFT_S)
+
+#define ROCEE_EXT_RAQ_H_EXT_RAQ_BA_H_S 8
+#define ROCEE_EXT_RAQ_H_EXT_RAQ_BA_H_M   \
+   (((1UL << 5) - 1) << ROCEE_EXT_RAQ_H_EXT_RAQ_BA_H_S)
+
 #define ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_S 0
 #define ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_M   \
(((1UL << 19) - 1) << ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_S)
@@ -120,6 +207,26 @@
 #define ROCEE_ECC_CERR_ALM2_REG0xB48
 
 #define ROCEE_ACK_DELAY_REG0x14
+#define ROCEE_GLB_CFG_REG  0x18
+
+#define ROCEE_DMAE_USER_CFG1_REG   0x40
+#define 
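
The `_S` / `_M` pairs in the hunk above follow a common shift-and-mask convention: `_S` is the bit position of a field and `_M` is a run of ones as wide as the field, shifted into place. A minimal sketch of how such a pair is typically consumed is shown here. `FIELD_MASK()`, `set_field()` and `get_field()` are hypothetical helpers written for this example and are not the driver's `roce_set_field()`, whose definition is not shown in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask of `width` bits starting at bit `shift` (width < 32). */
#define FIELD_MASK(width, shift)	(((UINT32_C(1) << (width)) - 1) << (shift))

/* Hypothetical field: 6 bits wide starting at bit 10, like PORT_ST above. */
#define PORT_ST_S	10
#define PORT_ST_M	FIELD_MASK(6, PORT_ST_S)

/* Clear the field, then OR in the new value, truncated to the field width. */
static uint32_t set_field(uint32_t origin, uint32_t mask, unsigned int shift,
			  uint32_t val)
{
	assert(shift < 32);
	return (origin & ~mask) | ((val << shift) & mask);
}

/* Extract the field and shift it back down to bit 0. */
static uint32_t get_field(uint32_t origin, uint32_t mask, unsigned int shift)
{
	return (origin & mask) >> shift;
}
```

Masking the shifted value keeps an oversized `val` from spilling into neighbouring fields of the same register.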

[PATCH v8 01/22] net: hns: Add reset function support for RoCE driver

2016-05-25 Thread Lijun Ou
This patch adds a reset function for the RoCE driver. RoCE is a feature
of hns. In the hip06 SoC, the RoCE reset process needs to configure the
dsaf channel reset and the port and sl map info. The RoCE reset function
is located in the dsaf module; the RoCE driver only calls it when needed.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
Signed-off-by: Sheng Li 
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 90 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h | 32 +++-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 57 --
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h  | 15 +++-
 4 files changed, 181 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
index 1c2ddb2..da7bb92 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1295,9 +1296,9 @@ static int hns_dsaf_init_hw(struct dsaf_device *dsaf_dev)
dev_dbg(dsaf_dev->dev,
"hns_dsaf_init_hw begin %s !\n", dsaf_dev->ae_dev.name);
 
-   hns_dsaf_rst(dsaf_dev, 0);
+   hns_dsaf_rst(dsaf_dev, false);
mdelay(10);
-   hns_dsaf_rst(dsaf_dev, 1);
+   hns_dsaf_rst(dsaf_dev, true);
 
hns_dsaf_comm_init(dsaf_dev);
 
@@ -1325,7 +1326,7 @@ static int hns_dsaf_init_hw(struct dsaf_device *dsaf_dev)
 static void hns_dsaf_remove_hw(struct dsaf_device *dsaf_dev)
 {
/*reset*/
-   hns_dsaf_rst(dsaf_dev, 0);
+   hns_dsaf_rst(dsaf_dev, false);
 }
 
 /**
@@ -2685,6 +2686,89 @@ static struct platform_driver g_dsaf_driver = {
 
 module_platform_driver(g_dsaf_driver);
 
+/**
+ * hns_dsaf_roce_reset - reset dsaf and roce
+ * @dsaf_fwnode: Pointer to firmware node for the dsaf
+ * @enable: false - request reset, true - drop reset
+ * return 0 - success, negative - fail
+ */
+int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool enable)
+{
+   struct dsaf_device *dsaf_dev;
+   struct platform_device *pdev;
+   unsigned int mp;
+   unsigned int sl;
+   unsigned int credit;
+   int i;
+   const u32 port_map[DSAF_ROCE_CREDIT_CHN][DSAF_ROCE_CHAN_MODE_NUM] = {
+   {DSAF_ROCE_PORT_0, DSAF_ROCE_PORT_0, DSAF_ROCE_PORT_0},
+   {DSAF_ROCE_PORT_1, DSAF_ROCE_PORT_0, DSAF_ROCE_PORT_0},
+   {DSAF_ROCE_PORT_2, DSAF_ROCE_PORT_1, DSAF_ROCE_PORT_0},
+   {DSAF_ROCE_PORT_3, DSAF_ROCE_PORT_1, DSAF_ROCE_PORT_0},
+   {DSAF_ROCE_PORT_4, DSAF_ROCE_PORT_2, DSAF_ROCE_PORT_1},
+   {DSAF_ROCE_PORT_4, DSAF_ROCE_PORT_2, DSAF_ROCE_PORT_1},
+   {DSAF_ROCE_PORT_5, DSAF_ROCE_PORT_3, DSAF_ROCE_PORT_1},
+   {DSAF_ROCE_PORT_5, DSAF_ROCE_PORT_3, DSAF_ROCE_PORT_1},
+   };
+   const u32 sl_map[DSAF_ROCE_CREDIT_CHN][DSAF_ROCE_CHAN_MODE_NUM] = {
+   {DSAF_ROCE_SL_0, DSAF_ROCE_SL_0, DSAF_ROCE_SL_0},
+   {DSAF_ROCE_SL_0, DSAF_ROCE_SL_1, DSAF_ROCE_SL_1},
+   {DSAF_ROCE_SL_0, DSAF_ROCE_SL_0, DSAF_ROCE_SL_2},
+   {DSAF_ROCE_SL_0, DSAF_ROCE_SL_1, DSAF_ROCE_SL_3},
+   {DSAF_ROCE_SL_0, DSAF_ROCE_SL_0, DSAF_ROCE_SL_0},
+   {DSAF_ROCE_SL_1, DSAF_ROCE_SL_1, DSAF_ROCE_SL_1},
+   {DSAF_ROCE_SL_0, DSAF_ROCE_SL_0, DSAF_ROCE_SL_2},
+   {DSAF_ROCE_SL_1, DSAF_ROCE_SL_1, DSAF_ROCE_SL_3},
+   };
+
+   if (!is_of_node(dsaf_fwnode)) {
+   pr_err("hisi_dsaf: Only support DT node!\n");
+   return -EINVAL;
+   }
+   pdev = of_find_device_by_node(to_of_node(dsaf_fwnode));
+   dsaf_dev = dev_get_drvdata(&pdev->dev);
+   if (AE_IS_VER1(dsaf_dev->dsaf_ver)) {
+   dev_err(dsaf_dev->dev, "%s v1 chip do not support roce!\n",
+   dsaf_dev->ae_dev.name);
+   return -ENODEV;
+   }
+
+   if (!enable) {
+   /* Reset rocee-channels in dsaf and rocee */
+   hns_dsaf_srst_chns(dsaf_dev, DSAF_CHNS_MASK, false);
+   hns_dsaf_roce_srst(dsaf_dev, false);
+   } else {
+   /* Configure dsaf tx roce correspond to port map and sl map */
+   mp = dsaf_read_dev(dsaf_dev, DSAF_ROCE_PORT_MAP_REG);
+   for (i = 0; i < DSAF_ROCE_CREDIT_CHN; i++)
+   dsaf_set_field(mp, 7 << i * 3, i * 3,
+  port_map[i][DSAF_ROCE_6PORT_MODE]);
+   dsaf_set_field(mp, 3 << i * 3, i * 3, 0);
+   dsaf_write_dev(dsaf_dev, DSAF_ROCE_PORT_MAP_REG, mp);
+
+   sl = dsaf_read_dev(dsaf_dev, DSAF_ROCE_SL_MAP_REG);
+   for (i = 0; i < DSAF_ROCE_CREDIT_CHN; i++)
+   dsaf_set_field(sl, 3 << i * 2, i 

[PATCH v8 03/22] IB/hns: Add initial main frame driver and get cfg info

2016-05-25 Thread Lijun Ou
This patch adds the initial bare main driver, which reads the
relevant configuration information from the net node.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  72 ++
 drivers/infiniband/hw/hns/hns_roce_main.c   | 201 
 2 files changed, 273 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_device.h
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_main.c

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
new file mode 100644
index 000..f9de8e4
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -0,0 +1,72 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _HNS_ROCE_DEVICE_H
+#define _HNS_ROCE_DEVICE_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_NAME "hns_roce"
+
+#define HNS_ROCE_MAX_IRQ_NUM   34
+#define HNS_ROCE_MAX_PORTS 6
+
+struct hns_roce_ib_iboe {
+   struct net_device  *netdevs[HNS_ROCE_MAX_PORTS];
+   u8  phy_port[HNS_ROCE_MAX_PORTS];
+};
+
+struct hns_roce_caps {
+   u8  num_ports;
+};
+
+struct hns_roce_dev {
+   struct ib_device        ib_dev;
+   struct platform_device  *pdev;
+   struct hns_roce_ib_iboe iboe;
+
+   int irq[HNS_ROCE_MAX_IRQ_NUM];
+   u8 __iomem  *reg_base;
+   struct hns_roce_caps    caps;
+
+   int cmd_mod;
+   int loop_idc;
+};
+
+#endif /* _HNS_ROCE_DEVICE_H */
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c 
b/drivers/infiniband/hw/hns/hns_roce_main.c
new file mode 100644
index 000..6137339
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -0,0 +1,201 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 

[PATCH v8 22/22] MAINTAINERS: Add maintainers for HiSilicon RoCE driver

2016-05-25 Thread Lijun Ou
This patch adds maintainers for the RoCE driver.

Signed-off-by: Wei Hu 
Signed-off-by: Lijun Ou 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5f83015..ba23a81 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10121,6 +10121,14 @@ W: http://www.emulex.com
 S: Supported
 F: drivers/infiniband/hw/ocrdma/
 
+HISILICON ROCE DRIVER
+M: Wei Hu(Xavier) 
+M: Lijun Ou 
+L: linux-r...@vger.kernel.org
+S: Maintained
+F: drivers/infiniband/hw/hns/
+F: Documentation/devicetree/bindings/infiniband/hisilicon-hns-roce.txt
+
 SFC NETWORK DRIVER
 M: Solarflare linux maintainers 
 M: Edward Cree 
-- 
1.9.1



[PATCH v8 11/22] IB/hns: Add IB device registration

2016-05-25 Thread Lijun Ou
This patch registers the IB device when the driver is loaded, and
unregisters it when the driver is removed.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_main.c | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c 
b/drivers/infiniband/hw/hns/hns_roce_main.c
index 589a9f7..788fe69 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -62,6 +62,41 @@
 #include "hns_roce_device.h"
 #include "hns_roce_icm.h"
 
+void hns_roce_unregister_device(struct hns_roce_dev *hr_dev)
+{
+   ib_unregister_device(&hr_dev->ib_dev);
+}
+
+int hns_roce_register_device(struct hns_roce_dev *hr_dev)
+{
+   int ret;
+   struct hns_roce_ib_iboe *iboe = NULL;
+   struct ib_device *ib_dev = NULL;
+   struct device *dev = &hr_dev->pdev->dev;
+
+   iboe = &hr_dev->iboe;
+
+   ib_dev = &hr_dev->ib_dev;
+   strlcpy(ib_dev->name, "hisi_%d", IB_DEVICE_NAME_MAX);
+
+   ib_dev->owner   = THIS_MODULE;
+   ib_dev->node_type   = RDMA_NODE_IB_CA;
+   ib_dev->dma_device  = dev;
+
+   ib_dev->phys_port_cnt   = hr_dev->caps.num_ports;
+   ib_dev->local_dma_lkey  = hr_dev->caps.reserved_lkey;
+   ib_dev->num_comp_vectors= hr_dev->caps.num_comp_vectors;
+   ib_dev->uverbs_abi_ver  = 1;
+
+   ret = ib_register_device(ib_dev, NULL);
+   if (ret) {
+   dev_err(dev, "ib_register_device failed!\n");
+   return ret;
+   }
+
+   return 0;
+}
+
 int hns_roce_get_cfg(struct hns_roce_dev *hr_dev)
 {
int i;
@@ -367,6 +402,17 @@ static int hns_roce_probe(struct platform_device *pdev)
goto error_failed_engine_init;
}
 
+   ret = hns_roce_register_device(hr_dev);
+   if (ret) {
+   dev_err(dev, "register_device failed!\n");
+   goto error_failed_register_device;
+   }
+
+   return 0;
+
+error_failed_register_device:
+   hns_roce_engine_uninit(hr_dev);
+
 error_failed_engine_init:
hns_roce_cleanup_bitmap(hr_dev);
 
@@ -402,6 +448,7 @@ static int hns_roce_remove(struct platform_device *pdev)
 {
struct hns_roce_dev *hr_dev = platform_get_drvdata(pdev);
 
+   hns_roce_unregister_device(hr_dev);
hns_roce_engine_uninit(hr_dev);
hns_roce_cleanup_bitmap(hr_dev);
hns_roce_cleanup_icm(hr_dev);
-- 
1.9.1
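
For illustration, the unwind order that this patch extends in
hns_roce_probe() and hns_roce_remove() can be sketched in plain userspace C.
This is a minimal sketch under stated assumptions, not the driver code:
struct demo_dev, demo_step(), demo_probe() and demo_remove() are hypothetical
names, and the three steps only stand in for the bitmap, engine and
register-device calls.

```c
#include <assert.h>

/* Hypothetical device state; each flag stands in for one initialised resource. */
struct demo_dev {
	int bitmap_ready;
	int engine_ready;
	int registered;
	int fail_at;	/* step number that should fail, 0 for none */
};

/* Hypothetical init step: fails only when asked to. */
static int demo_step(const struct demo_dev *d, int step)
{
	return d->fail_at == step ? -1 : 0;
}

static int demo_probe(struct demo_dev *d)
{
	int ret;

	ret = demo_step(d, 1);		/* stands in for the bitmap init */
	if (ret)
		return ret;
	d->bitmap_ready = 1;

	ret = demo_step(d, 2);		/* stands in for the engine init */
	if (ret)
		goto err_engine_init;
	d->engine_ready = 1;

	ret = demo_step(d, 3);		/* stands in for device registration */
	if (ret)
		goto err_register;
	d->registered = 1;

	return 0;

	/* Labels fall through, so resources are released in reverse order. */
err_register:
	d->engine_ready = 0;
err_engine_init:
	d->bitmap_ready = 0;
	return ret;
}

/* Remove mirrors probe: the last thing set up is the first torn down. */
static void demo_remove(struct demo_dev *d)
{
	d->registered = 0;
	d->engine_ready = 0;
	d->bitmap_ready = 0;
}
```

The point of the ordering is that a failure at the newest step undoes only
the steps that already succeeded, and that the remove path unregisters the
device before releasing anything the registered device may still use.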



Re: [PATCH net] team: don't call netdev_change_features under team->lock

2016-05-25 Thread Jiri Pirko
Wed, May 25, 2016 at 04:55:49PM CEST, ivec...@redhat.com wrote:
>The team_device_event() notifier calls team_compute_features() to fix
>vlan_features under team->lock to protect team->port_list. The problem is
>that subsequent __team_compute_features() calls netdev_change_features()
>to propagate vlan_features to upper vlan devices while team->lock is still
>taken. This can lead to deadlock when NETIF_F_LRO is modified on lower
>devices or team device itself.
>
>Example:
>The team0 as active backup with eth0 and eth1 NICs. Both eth0 & eth1 are
>LRO capable and LRO is enabled. Thus LRO is also enabled on team0.
>
>The command 'ethtool -K team0 lro off' now hangs due to this deadlock:
>
>dev_ethtool()
>-> ethtool_set_features()
> -> __netdev_update_features(team)
>  -> netdev_sync_lower_features()
>   -> netdev_update_features(lower_1)
>    -> __netdev_update_features(lower_1)
>    -> netdev_features_change(lower_1)
>     -> call_netdevice_notifiers(...)
>      -> team_device_event(lower_1)
>       -> team_compute_features(team) [TAKES team->lock]
>        -> netdev_change_features(team)
>         -> __netdev_update_features(team)
>          -> netdev_sync_lower_features()
>           -> netdev_update_features(lower_2)
>            -> __netdev_update_features(lower_2)
>            -> netdev_features_change(lower_2)
>             -> call_netdevice_notifiers(...)
>              -> team_device_event(lower_2)
>               -> team_compute_features(team) [DEADLOCK]
>
>Cc: Jiri Pirko 
>
>Signed-off-by: Ivan Vecera 

Please add "Fixes:" line.
Thanks!

Signed-off-by: Jiri Pirko 


[PATCH v8 05/22] IB/hns: Add initial profile resource

2016-05-25 Thread Lijun Ou
This patch mainly configures some profile resources, for example
vendor_id, hardware version, and some data structure sizes.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_common.h | 47 +
 drivers/infiniband/hw/hns/hns_roce_device.h | 58 -
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  | 78 +
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  | 38 +-
 drivers/infiniband/hw/hns/hns_roce_main.c   |  8 +++
 5 files changed, 227 insertions(+), 2 deletions(-)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_common.h

diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h 
b/drivers/infiniband/hw/hns/hns_roce_common.h
new file mode 100644
index 000..164546d
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_common.h
@@ -0,0 +1,47 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _HNS_ROCE_COMMON_H
+#define _HNS_ROCE_COMMON_H
+
+/* ROCEE_REG DEFINITION */
+#define ROCEE_VENDOR_ID_REG        0x0
+#define ROCEE_VENDOR_PART_ID_REG   0x4
+
+#define ROCEE_HW_VERSION_REG   0x8
+
+#define ROCEE_SYS_IMAGE_GUID_L_REG 0xC
+#define ROCEE_SYS_IMAGE_GUID_H_REG 0x10
+
+#define ROCEE_ACK_DELAY_REG        0x14
+
+#endif /* _HNS_ROCE_COMMON_H */
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index 2e18488..dce30c8 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -45,19 +45,69 @@
 #define DRV_NAME "hns_roce"
 
 #define HNS_ROCE_MAX_IRQ_NUM   34
+
+#define HNS_ROCE_COMP_VEC_NUM  32
+
+#define HNS_ROCE_AEQE_VEC_NUM  1
+#define HNS_ROCE_AEQE_OF_VEC_NUM   1
+
 #define HNS_ROCE_MAX_PORTS 6
 
+/* Address shift 32bit with the special hardware address operation of RoCEE */
+#define ADDR_SHIFT_32  32
+
 struct hns_roce_ib_iboe {
struct net_device  *netdevs[HNS_ROCE_MAX_PORTS];
u8  phy_port[HNS_ROCE_MAX_PORTS];
 };
 
 struct hns_roce_caps {
-   u8  num_ports;
+   u64 fw_ver;
+   u8  num_ports;
+   int gid_table_len[HNS_ROCE_MAX_PORTS];
+   int pkey_table_len[HNS_ROCE_MAX_PORTS];
+   int local_ca_ack_delay;
+   int num_uars;
+   u32 phy_num_uars;
+   u32 max_sq_sg;  /* 2 */
+   u32 max_sq_inline;  /* 32 */
+   u32 max_rq_sg;  /* 2 */
+   int num_qps;/* 256k */
+   u32 max_wqes;   /* 16k */
+   u32 max_sq_desc_sz; /* 64 */
+   u32 max_rq_desc_sz; /* 64 */
+   int max_qp_init_rdma;
+   int max_qp_dest_rdma;
+   int sqp_start;
+   int num_cqs;
+   int max_cqes;
+   int reserved_cqs;
+   int num_aeq_vectors;/* 1 */
+   int num_comp_vectors;   /* 32 ceq */
+   int num_other_vectors;
+   int num_mtpts;
+   u32 num_mtt_segs;
+   int reserved_mtts;
+   int reserved_mrws;
+   int reserved_uars;

[PATCH v8 06/22] IB/hns: Add initial cmd operation

2016-05-25 Thread Lijun Ou
This patch adds the cmd operations, along with functions for
initializing the eq table and selecting the cmd mode.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_cmd.c| 119 
 drivers/infiniband/hw/hns/hns_roce_cmd.h|  42 ++
 drivers/infiniband/hw/hns/hns_roce_common.h |   2 +
 drivers/infiniband/hw/hns/hns_roce_device.h |  41 ++
 drivers/infiniband/hw/hns/hns_roce_main.c   |  13 +++
 5 files changed, 217 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_cmd.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_cmd.h

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c 
b/drivers/infiniband/hw/hns/hns_roce_cmd.c
new file mode 100644
index 000..3badf86
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hns_roce_common.h"
+#include "hns_roce_device.h"
+#include "hns_roce_cmd.h"
+
+#define CMD_MAX_NUM 32
+
+int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
+{
+   struct device *dev = &hr_dev->pdev->dev;
+
+   mutex_init(&hr_dev->cmd.hcr_mutex);
+   sema_init(&hr_dev->cmd.poll_sem, 1);
+   hr_dev->cmd.use_events = 0;
+   hr_dev->cmd.toggle = 1;
+   hr_dev->cmd.max_cmds = CMD_MAX_NUM;
+   hr_dev->cmd.hcr = hr_dev->reg_base + ROCEE_MB1_REG;
+   hr_dev->cmd.pool = dma_pool_create("hns_roce_cmd", dev,
+  HNS_ROCE_MAILBOX_SIZE,
+  HNS_ROCE_MAILBOX_SIZE, 0);
+   if (!hr_dev->cmd.pool) {
+   dev_err(dev, "Couldn't create mailbox pool for cmd.\n");
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+void hns_roce_cmd_cleanup(struct hns_roce_dev *hr_dev)
+{
+   dma_pool_destroy(hr_dev->cmd.pool);
+}
+
+int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
+{
+   struct hns_roce_cmdq *hr_cmd = &hr_dev->cmd;
+   int i;
+
+   hr_cmd->context = kmalloc(hr_cmd->max_cmds *
+ sizeof(struct hns_roce_cmd_context),
+ GFP_KERNEL);
+   if (!hr_cmd->context)
+   return -ENOMEM;
+
+   for (i = 0; i < hr_cmd->max_cmds; ++i) {
+   hr_cmd->context[i].token = i;
+   hr_cmd->context[i].next = i + 1;
+   }
+
+   hr_cmd->context[hr_cmd->max_cmds - 1].next = -1;
+   hr_cmd->free_head = 0;
+
+   sema_init(&hr_cmd->event_sem, hr_cmd->max_cmds);
+   spin_lock_init(&hr_cmd->context_lock);
+
+   for (hr_cmd->token_mask = 1; hr_cmd->token_mask < hr_cmd->max_cmds;
+hr_cmd->token_mask <<= 1)
+   ;
+   --hr_cmd->token_mask;
+   hr_cmd->use_events = 1;
+
+   down(&hr_cmd->poll_sem);
+
+   return 0;
+}
+
+void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
+{
+   struct hns_roce_cmdq *hr_cmd = &hr_dev->cmd;
+   int i;
+
+   hr_cmd->use_events = 0;
+
+   for (i = 0; i < hr_cmd->max_cmds; ++i)
+   down(&hr_cmd->event_sem);
+
+   kfree(hr_cmd->context);
+   up(&hr_cmd->poll_sem);
+}
diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.h 
b/drivers/infiniband/hw/hns/hns_roce_cmd.h
new file mode 100644
index 000..6759402
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.h
@@ -0,0 +1,42 @@
+/*
+ * Copyright (c) 2016 
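
As a side note on hns_roce_cmd_use_events() above: the empty-bodied for loop
rounds max_cmds up to a power of two and then subtracts one to get
token_mask, and the context array is chained into a free list terminated by
-1. A minimal userspace sketch of both steps follows. The names
demo_token_mask(), struct demo_ctx and demo_init_free_list() are hypothetical
and the field types are assumptions, since the real struct definitions are
not shown in this mail.

```c
#include <assert.h>

/*
 * Same loop shape as in the patch: find the smallest power of two that is
 * >= max_cmds, then subtract one to get an all-ones mask.
 */
static unsigned int demo_token_mask(unsigned int max_cmds)
{
	unsigned int mask;

	for (mask = 1; mask < max_cmds; mask <<= 1)
		;
	return mask - 1;
}

/* Hypothetical stand-in for one command context slot. */
struct demo_ctx {
	int next;		/* index of the next free slot, -1 at the end */
	unsigned int token;	/* initial token equals the slot index */
};

/*
 * Chain n slots (n >= 1) into a free list: 0 -> 1 -> ... -> n-1 -> -1.
 * Returns the index of the list head.
 */
static int demo_init_free_list(struct demo_ctx *ctx, int n)
{
	int i;

	for (i = 0; i < n; ++i) {
		ctx[i].token = (unsigned int)i;
		ctx[i].next = i + 1;
	}
	ctx[n - 1].next = -1;

	return 0;
}
```

With CMD_MAX_NUM set to 32 the mask comes out as 31, so masking a token with
it yields an index in the range 0..31.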

[PATCH v8 00/22] Add HiSilicon RoCE driver

2016-05-25 Thread Lijun Ou
The HiSilicon Network Subsystem (HNS) is a long term evolution IP which is
supposed to be used in HiSilicon ICT SoCs. HNS also has hardware support
for performing RDMA with RoCEE.
The driver for the HiSilicon RoCEE (RoCE Engine) is a platform driver and
will support multiple versions of SoCs in the future. This version of the
driver is meant to support the Hip06 SoC (which conforms to the RoCEEv1
hardware specifications).

Changes v7 -> v8:
1. add a verbs operation named get_port_immutable. It is an 
   independent patch.
2. add a comment for the definition of ADDR_SHIFT_n, n are 12,32
   and 44.
3. restructures the code to align with naming convention of the Linux
   according to the review of Doug Ledford.
4. modify the state for all .c and .h files.

Changes v6 -> v7:
1. modify some type of parameter, use bool replace the original type.
2. add the Signed-off-by signatures in the first patch.
3. delete the improper print sentence in hns_roce_create_eq.

Changes v5 -> v6:
1. modify the type of obj for unsigned long according the reviews, and
   modify the same questions in RoCE module.
2. fix the spelling error.
3. fix the Signed-off-by signatures.

Changes v4 -> v5:
1. redesign the patchset for RoCE modules in order to split the huge
   patch into small patches.
2. fix the directory path for RoCE module. Delete the hisilicon level.
3. modify the name of roce_v1_hw into roce_hw_v1.

Changes v3 -> v4:
1. modify roce.o into hns-roce.o in Makefile and Kconfig file.

Changes v2 -> v3:
1. modify the formats of RoCE driver code base v2 by the experts 
   reviewing. also, it used kmalloc_array instead of kmalloc, kcalloc
   instead of kzalloc, when refer to memory allocation for array
2. remove some functions without use and unconnected macros
3. modify the binding document with RoCE DT base v2 which added
   interrupt-names
4. redesign the port_map and sl_map in hns_dsaf_roce_reset
5. add HiSilicon RoCE driver maintainers introduction in MAINTAINERS
   document

Changes v1 -> v2:
1. modify the formats of roce driver code by the experts reviewing
2. modify the bindings file with roce dts. add the attribute named
   interrupt-names.
3. modify the way of defining port mode in hns_dsaf_main.c
4. move the Kconfig file into the hns directory and send it with roce

Lijun Ou (22):
  net: hns: Add reset function support for RoCE driver
  devicetree: bindings: IB: Add binding document for HiSilicon RoCE
  IB/hns: Add initial main frame driver and get cfg info
  IB/hns: Add RoCE engine reset function
  IB/hns: Add initial profile resource
  IB/hns: Add initial cmd operation
  IB/hns: Add event queue support
  IB/hns: Add icm support
  IB/hns: Add hca support
  IB/hns: Add process flow to init RoCE engine
  IB/hns: Add IB device registration
  IB/hns: Set mtu and gid support
  IB/hns: Add interface of the protocol stack registration
  IB/hns: Add operations support for IB device and port
  IB/hns: Add PD operations support
  IB/hns: Add ah operations support
  IB/hns: Add QP operations support
  IB/hns: Add CQ operations support
  IB/hns: Add memory region operations support
  IB/hns: Add operation for getting immutable port
  IB/hns: Kconfig and Makefile for RoCE module
  MAINTAINERS: Add maintainers for HiSilicon RoCE driver

 .../bindings/infiniband/hisilicon-hns-roce.txt |  107 +
 MAINTAINERS|8 +
 drivers/infiniband/Kconfig |1 +
 drivers/infiniband/hw/Makefile |1 +
 drivers/infiniband/hw/hns/Kconfig  |   10 +
 drivers/infiniband/hw/hns/Makefile |9 +
 drivers/infiniband/hw/hns/hns_roce_ah.c|  132 +
 drivers/infiniband/hw/hns/hns_roce_alloc.c |  262 ++
 drivers/infiniband/hw/hns/hns_roce_cmd.c   |  390 +++
 drivers/infiniband/hw/hns/hns_roce_cmd.h   |   84 +
 drivers/infiniband/hw/hns/hns_roce_common.h|  325 +++
 drivers/infiniband/hw/hns/hns_roce_cq.c|  460 
 drivers/infiniband/hw/hns/hns_roce_device.h|  760 ++
 drivers/infiniband/hw/hns/hns_roce_eq.c|  778 ++
 drivers/infiniband/hw/hns/hns_roce_eq.h|  131 +
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c | 2831 
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h |  989 +++
 drivers/infiniband/hw/hns/hns_roce_icm.c   |  607 +
 drivers/infiniband/hw/hns/hns_roce_icm.h   |  136 +
 drivers/infiniband/hw/hns/hns_roce_main.c  | 1112 
 drivers/infiniband/hw/hns/hns_roce_mr.c|  624 +
 drivers/infiniband/hw/hns/hns_roce_pd.c|  150 ++
 drivers/infiniband/hw/hns/hns_roce_qp.c|  859 ++
 drivers/infiniband/hw/hns/hns_roce_user.h  |   53 +
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c |   90 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h |   32 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c |   57 +-
 

[PATCH v8 15/22] IB/hns: Add PD operations support

2016-05-25 Thread Lijun Ou
This patch adds the verbs that operate on a PD, mainly the
functions for allocating and deallocating a PD.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_device.h | 17 
 drivers/infiniband/hw/hns/hns_roce_main.c   |  8 +++-
 drivers/infiniband/hw/hns/hns_roce_pd.c | 62 +
 3 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h 
b/drivers/infiniband/hw/hns/hns_roce_device.h
index c404e55..97f117e 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -143,6 +143,11 @@ struct hns_roce_ucontext {
struct hns_roce_uar uar;
 };
 
+struct hns_roce_pd {
+   struct ib_pd    ibpd;
+   unsigned long   pdn;
+};
+
 struct hns_roce_bitmap {
/* Bitmap Traversal last a bit which is 1 */
unsigned long   last;
@@ -411,6 +416,11 @@ static inline struct hns_roce_ucontext
return container_of(ibucontext, struct hns_roce_ucontext, ibucontext);
 }
 
+static inline struct hns_roce_pd *to_hr_pd(struct ib_pd *ibpd)
+{
+   return container_of(ibpd, struct hns_roce_pd, ibpd);
+}
+
 static inline void hns_roce_write64_k(__be32 val[2], void __iomem *dest)
 {
__raw_writeq(*(u64 *) val, dest);
@@ -458,6 +468,13 @@ int hns_roce_bitmap_alloc_range(struct hns_roce_bitmap 
*bitmap, int cnt,
 void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
unsigned long obj, int cnt);
 
+struct ib_pd *hns_roce_alloc_pd(struct ib_device *ib_dev,
+   struct ib_ucontext *context,
+   struct ib_udata *udata);
+int hns_roce_pd_alloc(struct hns_roce_dev *hr_dev, unsigned long *pdn);
+void hns_roce_pd_free(struct hns_roce_dev *hr_dev, unsigned long pdn);
+int hns_roce_dealloc_pd(struct ib_pd *pd);
+
 void hns_roce_cq_completion(struct hns_roce_dev *hr_dev, u32 cqn);
 void hns_roce_cq_event(struct hns_roce_dev *hr_dev, u32 cqn, int event_type);
 void hns_roce_qp_event(struct hns_roce_dev *hr_dev, u32 qpn, int event_type);
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c 
b/drivers/infiniband/hw/hns/hns_roce_main.c
index b93800f..0732d0d 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -604,7 +604,9 @@ int hns_roce_register_device(struct hns_roce_dev *hr_dev)
ib_dev->uverbs_cmd_mask =
(1ULL << IB_USER_VERBS_CMD_GET_CONTEXT) |
(1ULL << IB_USER_VERBS_CMD_QUERY_DEVICE) |
-   (1ULL << IB_USER_VERBS_CMD_QUERY_PORT);
+   (1ULL << IB_USER_VERBS_CMD_QUERY_PORT) |
+   (1ULL << IB_USER_VERBS_CMD_ALLOC_PD) |
+   (1ULL << IB_USER_VERBS_CMD_DEALLOC_PD);
 
/* HCA||device||port */
ib_dev->modify_device   = hns_roce_modify_device;
@@ -618,6 +620,10 @@ int hns_roce_register_device(struct hns_roce_dev *hr_dev)
ib_dev->dealloc_ucontext= hns_roce_dealloc_ucontext;
ib_dev->mmap= hns_roce_mmap;
 
+   /* PD */
+   ib_dev->alloc_pd= hns_roce_alloc_pd;
+   ib_dev->dealloc_pd  = hns_roce_dealloc_pd;
+
ret = ib_register_device(ib_dev, NULL);
if (ret) {
dev_err(dev, "ib_register_device failed!\n");
diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c 
b/drivers/infiniband/hw/hns/hns_roce_pd.c
index 6ad38f2..f7f8fc0 100644
--- a/drivers/infiniband/hw/hns/hns_roce_pd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
@@ -40,6 +40,28 @@
 #include "hns_roce_common.h"
 #include "hns_roce_device.h"
 
+int hns_roce_pd_alloc(struct hns_roce_dev *hr_dev, unsigned long *pdn)
+{
+   struct device *dev = &hr_dev->pdev->dev;
+   unsigned long pd_number;
+   int ret = 0;
+
+   ret = hns_roce_bitmap_alloc(&hr_dev->pd_bitmap, &pd_number);
+   if (ret == -1) {
+   dev_err(dev, "alloc pdn from pdbitmap failed\n");
+   return -ENOMEM;
+   }
+
+   *pdn = pd_number;
+
+   return 0;
+}
+
+void hns_roce_pd_free(struct hns_roce_dev *hr_dev, unsigned long pdn)
+{
+   hns_roce_bitmap_free(&hr_dev->pd_bitmap, pdn);
+}
+
 int hns_roce_init_pd_table(struct hns_roce_dev *hr_dev)
 {
return hns_roce_bitmap_init(&hr_dev->pd_bitmap, hr_dev->caps.num_pds,
@@ -52,6 +74,46 @@ void hns_roce_cleanup_pd_table(struct hns_roce_dev *hr_dev)
hns_roce_bitmap_cleanup(&hr_dev->pd_bitmap);
 }
 
+struct ib_pd *hns_roce_alloc_pd(struct ib_device *ib_dev,
+   struct ib_ucontext *context,
+   struct ib_udata *udata)
+{
+   struct hns_roce_dev *hr_dev = to_hr_dev(ib_dev);
+   struct device *dev = &hr_dev->pdev->dev;
+   struct hns_roce_pd *pd;
+   int ret;
+
+  
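
A note on the to_hr_pd() helper added above: it relies on container_of() to
get from the embedded struct ib_pd back to the enclosing struct hns_roce_pd.
The following is a minimal userspace sketch of that pointer arithmetic. The
demo_ names are hypothetical, demo_container_of() is a simplified version of
the kernel macro without its type checking, and the embedded member is
deliberately placed second so that the offset is non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the core's struct ib_pd. */
struct demo_ib_pd {
	int usecnt;
};

/* Hypothetical stand-in for the driver's PD, embedding the core object. */
struct demo_hr_pd {
	unsigned long pdn;
	struct demo_ib_pd ibpd;
};

/* Same shape as to_hr_pd(): core pointer in, driver pointer out. */
static struct demo_hr_pd *demo_to_hr_pd(struct demo_ib_pd *ibpd)
{
	return demo_container_of(ibpd, struct demo_hr_pd, ibpd);
}
```

This is why the verbs can take a plain struct ib_pd pointer: the driver
recovers its private data, such as pdn, without storing a back-pointer.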

[PATCH net] team: don't call netdev_change_features under team->lock

2016-05-25 Thread Ivan Vecera
The team_device_event() notifier calls team_compute_features() to fix
vlan_features under team->lock to protect team->port_list. The problem is
that subsequent __team_compute_features() calls netdev_change_features()
to propagate vlan_features to upper vlan devices while team->lock is still
taken. This can lead to deadlock when NETIF_F_LRO is modified on lower
devices or team device itself.

Example:
The team0 as active backup with eth0 and eth1 NICs. Both eth0 & eth1 are
LRO capable and LRO is enabled. Thus LRO is also enabled on team0.

The command 'ethtool -K team0 lro off' now hangs due to this deadlock:

dev_ethtool()
-> ethtool_set_features()
 -> __netdev_update_features(team)
  -> netdev_sync_lower_features()
   -> netdev_update_features(lower_1)
    -> __netdev_update_features(lower_1)
    -> netdev_features_change(lower_1)
     -> call_netdevice_notifiers(...)
      -> team_device_event(lower_1)
       -> team_compute_features(team) [TAKES team->lock]
        -> netdev_change_features(team)
         -> __netdev_update_features(team)
          -> netdev_sync_lower_features()
           -> netdev_update_features(lower_2)
            -> __netdev_update_features(lower_2)
            -> netdev_features_change(lower_2)
             -> call_netdevice_notifiers(...)
              -> team_device_event(lower_2)
               -> team_compute_features(team) [DEADLOCK]

Cc: Jiri Pirko 

Signed-off-by: Ivan Vecera 
---
 drivers/net/team/team.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 718ceea..800a449 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -988,7 +988,7 @@ static void team_port_disable(struct team *team,
 #define TEAM_ENC_FEATURES  (NETIF_F_HW_CSUM | NETIF_F_SG | \
 NETIF_F_RXCSUM | NETIF_F_ALL_TSO)
 
-static void __team_compute_features(struct team *team)
+static void ___team_compute_features(struct team *team)
 {
struct team_port *port;
u32 vlan_features = TEAM_VLAN_FEATURES & NETIF_F_ALL_FOR_ALL;
@@ -1019,15 +1019,20 @@ static void __team_compute_features(struct team *team)
team->dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
if (dst_release_flag == (IFF_XMIT_DST_RELEASE | 
IFF_XMIT_DST_RELEASE_PERM))
team->dev->priv_flags |= IFF_XMIT_DST_RELEASE;
+}
 
+static void __team_compute_features(struct team *team)
+{
+   ___team_compute_features(team);
netdev_change_features(team->dev);
 }
 
 static void team_compute_features(struct team *team)
 {
mutex_lock(&team->lock);
-   __team_compute_features(team);
+   ___team_compute_features(team);
mutex_unlock(&team->lock);
+   netdev_change_features(team->dev);
 }
 
 static int team_port_enter(struct team *team, struct team_port *port)
-- 
2.7.3
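
To illustrate the ordering change in team_compute_features(), here is a
minimal userspace sketch. It is not the team driver: struct demo_team and the
demo_ functions are hypothetical, and the non-recursive mutex is modelled as
a plain flag so that the would-be deadlock can be counted instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct team; the lock is modelled as a flag. */
struct demo_team {
	bool locked;
	unsigned int features;
	int deadlocks;	/* times the notifier found the lock already held */
};

/* Models the notifier path, which needs the lock to recompute features. */
static void demo_notifier(struct demo_team *t)
{
	if (t->locked) {
		/* A real mutex_lock() would block forever at this point. */
		t->deadlocks++;
		return;
	}
	t->locked = true;
	/* ... recompute features from the port list ... */
	t->locked = false;
}

/* Old ordering: the change is propagated while the lock is still held. */
static void demo_compute_notify_locked(struct demo_team *t, unsigned int f)
{
	t->locked = true;
	t->features = f;
	demo_notifier(t);
	t->locked = false;
}

/* New ordering: update under the lock, propagate only after dropping it. */
static void demo_compute_then_notify(struct demo_team *t, unsigned int f)
{
	t->locked = true;
	t->features = f;
	t->locked = false;
	demo_notifier(t);
}
```

In the sketch the old ordering records one re-entry on a held lock, while the
new ordering lets the notifier take the lock normally.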



[PATCH v8 09/22] IB/hns: Add hca support

2016-05-25 Thread Lijun Ou
This patch mainly sets up the hca for RoCE. It performs a series of
initialization steps, as follows:
1. init uar table, allocate uar resource
2. init pd table
3. init cq table
4. init mr table
5. init qp table

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_alloc.c  | 128 +
 drivers/infiniband/hw/hns/hns_roce_cq.c |  25 
 drivers/infiniband/hw/hns/hns_roce_device.h |  69 +
 drivers/infiniband/hw/hns/hns_roce_icm.c|  89 
 drivers/infiniband/hw/hns/hns_roce_icm.h|   7 +
 drivers/infiniband/hw/hns/hns_roce_main.c   |  79 +++
 drivers/infiniband/hw/hns/hns_roce_mr.c | 211 
 drivers/infiniband/hw/hns/hns_roce_pd.c |  88 
 drivers/infiniband/hw/hns/hns_roce_qp.c |  30 
 9 files changed, 726 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_alloc.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_mr.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_pd.c

diff --git a/drivers/infiniband/hw/hns/hns_roce_alloc.c 
b/drivers/infiniband/hw/hns/hns_roce_alloc.c
new file mode 100644
index 000..d2932c1
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_alloc.c
@@ -0,0 +1,128 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hns_roce_device.h"
+
+int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap, unsigned long *obj)
+{
+   int ret = 0;
+
+   spin_lock(&bitmap->lock);
+   *obj = find_next_zero_bit(bitmap->table, bitmap->max, bitmap->last);
+   if (*obj >= bitmap->max) {
+   bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
+  & bitmap->mask;
+   *obj = find_first_zero_bit(bitmap->table, bitmap->max);
+   }
+
+   if (*obj < bitmap->max) {
+   set_bit(*obj, bitmap->table);
+   bitmap->last = (*obj + 1);
+   if (bitmap->last == bitmap->max)
+   bitmap->last = 0;
+   *obj |= bitmap->top;
+   } else {
+   ret = -1;
+   }
+
+   spin_unlock(&bitmap->lock);
+
+   return ret;
+}
+
+void hns_roce_bitmap_free(struct hns_roce_bitmap *bitmap, unsigned long obj)
+{
+   hns_roce_bitmap_free_range(bitmap, obj, 1);
+}
+
+void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
+   unsigned long obj, int cnt)
+{
+   int i;
+
+   obj &= bitmap->max + bitmap->reserved_top - 1;
+
+   spin_lock(&bitmap->lock);
+   for (i = 0; i < cnt; i++)
+   clear_bit(obj + i, bitmap->table);
+
+   bitmap->last = min(bitmap->last, obj);
+   bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
+  & bitmap->mask;
+   spin_unlock(&bitmap->lock);
+}
+
+int hns_roce_bitmap_init(struct hns_roce_bitmap *bitmap, u32 num, u32 mask,
+u32 reserved_bot, u32 reserved_top)
+{
+   u32 i;
+
+   if (num != roundup_pow_of_two(num))
+   return -EINVAL;
+
+   bitmap->last = 0;
+   bitmap->top = 0;
+   bitmap->max = num - reserved_top;
+   bitmap->mask = mask;
+   bitmap->reserved_top = reserved_top;
+   spin_lock_init(&bitmap->lock);
+   bitmap->table = kcalloc(BITS_TO_LONGS(bitmap->max), sizeof(long),
+   

[PATCH v8 04/22] IB/hns: Add RoCE engine reset function

2016-05-25 Thread Lijun Ou
This patch mainly adds the reset flow of the RoCE engine to the RoCE
driver. The reset is needed when the RoCE driver is loaded and removed.
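
The flow in this patch asserts reset first and releases it again only when
the engine is being enabled. Below is a small userspace sketch of that
ordering. struct engine_sketch and drive_reset_line() are hypothetical
stand-ins, not the DSAF call used by the driver, and the settle delay is
reduced to a comment.

```c
#include <stdbool.h>

/* Toy model of the engine; the fields are illustrative only. */
struct engine_sketch {
	bool in_reset;		/* true while the reset line is asserted */
	int line_writes;	/* how many times the reset line was driven */
};

/* Stand-in for the platform call that drives the reset line. */
static int drive_reset_line(struct engine_sketch *e, bool release)
{
	e->in_reset = !release;
	e->line_writes++;
	return 0;
}

/*
 * enable == true:  assert reset, then release it (engine ready for use).
 * enable == false: assert reset and leave the engine held in reset.
 */
static int engine_reset_sketch(struct engine_sketch *e, bool enable)
{
	int ret;

	ret = drive_reset_line(e, false);
	if (ret || !enable)
		return ret;

	/* A real driver would wait here for the hardware to settle. */
	return drive_reset_line(e, true);
}
```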

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  7 +++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  | 72 +
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  | 40 
 drivers/infiniband/hw/hns/hns_roce_main.c   | 16 ++-
 4 files changed, 134 insertions(+), 1 deletion(-)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_hw_v1.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_hw_v1.h

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index f9de8e4..2e18488 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -56,6 +56,10 @@ struct hns_roce_caps {
u8  num_ports;
 };
 
+struct hns_roce_hw {
+   int (*reset)(struct hns_roce_dev *hr_dev, bool enable);
+};
+
 struct hns_roce_dev {
struct ib_deviceib_dev;
struct platform_device  *pdev;
@@ -67,6 +71,9 @@ struct hns_roce_dev {
 
int cmd_mod;
int loop_idc;
+   struct hns_roce_hw  *hw;
 };
 
+extern struct hns_roce_hw hns_roce_hw_v1;
+
 #endif /* _HNS_ROCE_DEVICE_H */
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
new file mode 100644
index 000..198be3b
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -0,0 +1,72 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hns_roce_device.h"
+#include "hns_roce_hw_v1.h"
+
+/**
+ * hns_roce_v1_reset - reset roce
+ * @hr_dev: roce device struct pointer
+ * @enable: true -- drop reset, false -- reset
+ * return 0 - success, negative - fail
+ */
+int hns_roce_v1_reset(struct hns_roce_dev *hr_dev, bool enable)
+{
+   struct device_node *dsaf_node;
+   struct device *dev = &hr_dev->pdev->dev;
+   struct device_node *np = dev->of_node;
+   int ret;
+
+   dsaf_node = of_parse_phandle(np, "dsaf-handle", 0);
+
+   if (!enable) {
+   ret = hns_dsaf_roce_reset(&dsaf_node->fwnode, false);
+   } else {
+   ret = hns_dsaf_roce_reset(&dsaf_node->fwnode, false);
+   if (ret)
+   return ret;
+
+   msleep(SLEEP_TIME_INTERVAL);
+   ret = hns_dsaf_roce_reset(&dsaf_node->fwnode, true);
+   }
+
+   return ret;
+}
+
+struct hns_roce_hw hns_roce_hw_v1 = {
+   .reset = hns_roce_v1_reset,
+};
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.h b/drivers/infiniband/hw/hns/hns_roce_hw_v1.h
new file mode 100644
index 000..ca69d0b
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ * 

[PATCH v8 08/22] IB/hns: Add icm support

2016-05-25 Thread Lijun Ou
This patch mainly adds ICM support for RoCE. It initializes the ICM,
which manages the memory blocks used by RoCE. The RoCE data
structures are located in it, for example the CQ table, the
QP table, the MTPT table and so on.
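
Each such table holds num_obj objects of obj_size bytes, backed by an array
of memory chunks. The body of hns_roce_icm.c is not shown in this message,
so the lookup below is only a sketch under stated assumptions: the chunk
size (SKETCH_CHUNK_SIZE) and the helper names are hypothetical, and only
the num_obj and obj_size field names are taken from struct
hns_roce_icm_table.

```c
#define SKETCH_CHUNK_SIZE (256UL * 1024)	/* assumed chunk size in bytes */

struct icm_table_sketch {
	unsigned long num_obj;	/* total objects, assumed to be a power of two */
	unsigned long obj_size;	/* size of one object in bytes */
};

/* Byte offset of an object from the start of the table. */
static unsigned long icm_byte_offset(const struct icm_table_sketch *t,
				     unsigned long obj)
{
	return (obj & (t->num_obj - 1)) * t->obj_size;
}

/* Index of the chunk that backs the object. */
static unsigned long icm_chunk_index(const struct icm_table_sketch *t,
				     unsigned long obj)
{
	return icm_byte_offset(t, obj) / SKETCH_CHUNK_SIZE;
}

/* Offset of the object inside its chunk. */
static unsigned long icm_chunk_offset(const struct icm_table_sketch *t,
				      unsigned long obj)
{
	return icm_byte_offset(t, obj) % SKETCH_CHUNK_SIZE;
}
```

With 256-byte objects, 1024 objects fit in one assumed chunk, so object
1024 is the first entry of chunk 1.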

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_common.h |  19 ++
 drivers/infiniband/hw/hns/hns_roce_device.h |  30 ++
 drivers/infiniband/hw/hns/hns_roce_icm.c| 462 
 drivers/infiniband/hw/hns/hns_roce_icm.h| 119 +++
 drivers/infiniband/hw/hns/hns_roce_main.c   |  84 +
 5 files changed, 714 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_icm.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_icm.h

diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h b/drivers/infiniband/hw/hns/hns_roce_common.h
index 5b0ca43..5998778 100644
--- a/drivers/infiniband/hw/hns/hns_roce_common.h
+++ b/drivers/infiniband/hw/hns/hns_roce_common.h
@@ -53,6 +53,22 @@
 #define roce_set_bit(origin, shift, val) \
roce_set_field((origin), (1ul << (shift)), (shift), (val))
 
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_S 0
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_M   \
+   (((1UL << 19) - 1) << ROCEE_BT_CMD_H_ROCEE_BT_CMD_IN_MDF_S)
+
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_S 19
+
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_MDF_S 20
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_MDF_M   \
+   (((1UL << 2) - 1) << ROCEE_BT_CMD_H_ROCEE_BT_CMD_MDF_S)
+
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_BA_H_S 22
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_BA_H_M   \
+   (((1UL << 5) - 1) << ROCEE_BT_CMD_H_ROCEE_BT_CMD_BA_H_S)
+
+#define ROCEE_BT_CMD_H_ROCEE_BT_CMD_HW_SYNS_S 31
+
 #define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_S 0
 #define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_M   \
(((1UL << 2) - 1) << ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_S)
@@ -93,6 +109,8 @@
 #define ROCEE_SYS_IMAGE_GUID_L_REG 0xC
 #define ROCEE_SYS_IMAGE_GUID_H_REG 0x10
 
+#define ROCEE_BT_CMD_H_REG 0x204
+
 #define ROCEE_CAEP_AEQE_CONS_IDX_REG   0x3AC
 #define ROCEE_CAEP_CEQC_CONS_IDX_0_REG 0x3BC
 
@@ -105,6 +123,7 @@
 
 #define ROCEE_CAEP_CE_INTERVAL_CFG_REG 0x190
 #define ROCEE_CAEP_CE_BURST_NUM_CFG_REG0x194
+#define ROCEE_BT_CMD_L_REG 0x200
 
 #define ROCEE_MB1_REG  0x210
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 6dda4b7..9538248 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -120,6 +120,26 @@ enum {
HNS_ROCE_CMD_SUCCESS= 1,
 };
 
+struct hns_roce_icm_table {
+   /* ICM type: 0 = qpc 1 = mtt 2 = cqc 3 = srq 4 = other */
+   u32 type;
+   /* ICM array element num */
+   unsigned long   num_icm;
+   /* ICM entry record obj total num */
+   unsigned long   num_obj;
+   /*Single obj size */
+   unsigned long   obj_size;
+   int lowmem;
+   int coherent;
+   struct mutexmutex;
+   struct hns_roce_icm **icm;
+};
+
+struct hns_roce_mr_table {
+   struct hns_roce_icm_table   mtt_table;
+   struct hns_roce_icm_table   mtpt_table;
+};
+
 struct hns_roce_buf_list {
void*buf;
dma_addr_t  map;
@@ -135,11 +155,14 @@ struct hns_roce_cq {
 
 struct hns_roce_qp_table {
spinlock_t  lock;
+   struct hns_roce_icm_table   qp_table;
+   struct hns_roce_icm_table   irrl_table;
 };
 
 struct hns_roce_cq_table {
spinlock_t  lock;
struct radix_tree_root  tree;
+   struct hns_roce_icm_table   table;
 };
 
 struct hns_roce_cmd_context {
@@ -268,6 +291,7 @@ struct hns_roce_hw {
 struct hns_roce_dev {
struct ib_deviceib_dev;
struct platform_device  *pdev;
+   spinlock_t  bt_cmd_lock;
struct hns_roce_ib_iboe iboe;
 
int irq[HNS_ROCE_MAX_IRQ_NUM];
@@ -282,6 +306,7 @@ struct hns_roce_dev {
u32 hw_rev;
 
struct hns_roce_cmdqcmd;
+   struct hns_roce_mr_table  mr_table;
struct hns_roce_cq_table  cq_table;
struct hns_roce_qp_table  qp_table;
struct hns_roce_eq_table  eq_table;
@@ -291,6 +316,11 @@ struct hns_roce_dev {
struct hns_roce_hw  *hw;
 };
 
+static inline void hns_roce_write64_k(__be32 val[2], void __iomem *dest)
+{
+   __raw_writeq(*(u64 *) val, dest);
+}
+
 static inline struct hns_roce_qp
*__hns_roce_qp_lookup(struct hns_roce_dev *hr_dev, u32 qpn)
 {
diff --git a/drivers/infiniband/hw/hns/hns_roce_icm.c b/drivers/infiniband/hw/hns/hns_roce_icm.c
new file mode 100644
index 

[PATCH v8 14/22] IB/hns: Add operations support for IB device and port

2016-05-25 Thread Lijun Ou
This patch mainly registers the relevant verbs with the kernel.
These operations are invoked on behalf of the user. For example:
1. modify device
2. query device
3. query_port
4. modify_port
and so on.
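
Each verb receives the generic device pointer and recovers the driver's own
structure from it, which this patch does with container_of() in
to_hr_dev(). A userspace sketch of the same idea follows; the macro and the
two structs are hypothetical and only mirror the embedding of the generic
struct inside the driver struct.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(), for illustration. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct generic_dev {
	int id;
};

struct driver_dev {
	int hw_rev;			/* driver-private data */
	struct generic_dev base;	/* embedded generic object */
};

/* Recover the enclosing driver structure from the embedded member. */
static struct driver_dev *to_driver_dev(struct generic_dev *g)
{
	return container_of_sketch(g, struct driver_dev, base);
}

/* A "verb" that only sees the generic pointer but needs private data. */
static int query_hw_rev(struct generic_dev *g)
{
	return to_driver_dev(g)->hw_rev;
}
```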

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_common.h |   4 +
 drivers/infiniband/hw/hns/hns_roce_device.h |  21 +++
 drivers/infiniband/hw/hns/hns_roce_main.c   | 228 
 drivers/infiniband/hw/hns/hns_roce_user.h   |  40 +
 4 files changed, 293 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_user.h

diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h b/drivers/infiniband/hw/hns/hns_roce_common.h
index 60d361c..d6832c5 100644
--- a/drivers/infiniband/hw/hns/hns_roce_common.h
+++ b/drivers/infiniband/hw/hns/hns_roce_common.h
@@ -33,6 +33,10 @@
 #ifndef _HNS_ROCE_COMMON_H
 #define _HNS_ROCE_COMMON_H
 
+#ifndef assert
+#define assert(cond)
+#endif
+
 #define roce_writel(value, addr) writel((value), (addr))
 #define roce_readl(addr)readl((addr))
 #define roce_raw_write(value, addr) \
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 62b5924..c404e55 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -45,6 +45,7 @@
 #define DRV_NAME "hns_roce"
 
 #define MAC_ADDR_OCTET_NUM 6
+#define HNS_ROCE_MAX_MSG_LEN   0x8000
 
 #define HNS_ROCE_BA_SIZE   (32 * 4096)
 
@@ -57,6 +58,10 @@
 
 #define HNS_ROCE_MAX_PORTS 6
 #define HNS_ROCE_MAX_GID_NUM   16
+#define HNS_ROCE_GID_SIZE  16
+
+#define PKEY_ID0x
+#define NODE_DESC_SIZE 64
 
 /* Address shift 12bit with the special hardware address operation of RoCEE */
 #define ADDR_SHIFT_12  12
@@ -133,6 +138,11 @@ struct hns_roce_uar {
unsigned long   index;
 };
 
+struct hns_roce_ucontext {
+   struct ib_ucontext  ibucontext;
+   struct hns_roce_uar uar;
+};
+
 struct hns_roce_bitmap {
/* Bitmap Traversal last a bit which is 1 */
unsigned long   last;
@@ -390,6 +400,17 @@ struct hns_roce_dev {
struct hns_roce_hw  *hw;
 };
 
+static inline struct hns_roce_dev *to_hr_dev(struct ib_device *ib_dev)
+{
+   return container_of(ib_dev, struct hns_roce_dev, ib_dev);
+}
+
+static inline struct hns_roce_ucontext
+   *to_hr_ucontext(struct ib_ucontext *ibucontext)
+{
+   return container_of(ibucontext, struct hns_roce_ucontext, ibucontext);
+}
+
 static inline void hns_roce_write64_k(__be32 val[2], void __iomem *dest)
 {
__raw_writeq(*(u64 *) val, dest);
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 66fbae2..b93800f 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -60,6 +60,7 @@
 #include 
 #include "hns_roce_common.h"
 #include "hns_roce_device.h"
+#include "hns_roce_user.h"
 #include "hns_roce_icm.h"
 
 /**
@@ -360,6 +361,217 @@ int hns_roce_setup_mtu_gids(struct hns_roce_dev  *hr_dev)
return ret;
 }
 
+static int hns_roce_query_device(struct ib_device *ib_dev,
+struct ib_device_attr *props,
+struct ib_udata *uhw)
+{
+   struct hns_roce_dev *hr_dev = to_hr_dev(ib_dev);
+
+   memset(props, 0, sizeof(*props));
+
+   props->fw_ver = hr_dev->fw_ver;
+   props->sys_image_guid = hr_dev->sys_image_guid;
+   props->max_mr_size = (u64)(~(0ULL));
+   props->page_size_cap = hr_dev->caps.page_size_cap;
+   props->vendor_id = hr_dev->vendor_id;
+   props->vendor_part_id = hr_dev->vendor_part_id;
+   props->hw_ver = hr_dev->hw_rev;
+   props->max_qp = hr_dev->caps.num_qps;
+   props->max_qp_wr = hr_dev->caps.max_wqes;
+   props->device_cap_flags = IB_DEVICE_PORT_ACTIVE_EVENT |
+ IB_DEVICE_RC_RNR_NAK_GEN |
+ IB_DEVICE_LOCAL_DMA_LKEY;
+   props->max_sge = hr_dev->caps.max_sq_sg;
+   props->max_sge_rd = 1;
+   props->max_cq = hr_dev->caps.num_cqs;
+   props->max_cqe = hr_dev->caps.max_cqes;
+   props->max_mr = hr_dev->caps.num_mtpts;
+   props->max_pd = hr_dev->caps.num_pds;
+   props->max_qp_rd_atom = hr_dev->caps.max_qp_dest_rdma;
+   props->max_qp_init_rd_atom = hr_dev->caps.max_qp_init_rdma;
+   props->atomic_cap = IB_ATOMIC_NONE;
+   props->max_pkeys = 1;
+   props->local_ca_ack_delay = hr_dev->caps.local_ca_ack_delay;
+
+   return 0;
+}
+
+static int hns_roce_query_port(struct ib_device *ib_dev, u8 port_num,
+   

[PATCH v8 12/22] IB/hns: Set mtu and gid support

2016-05-25 Thread Lijun Ou
This patch mainly sets the MTU and GID resources. These resources
are used to set up network transmission between nodes.
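
The MTU is stored as a 4-bit field at bit 16 of the SMAC_H register, per
the ROCEE_SMAC_H_ROCEE_PORT_MTU_S/_M macros in this patch. The sketch below
shows a read-modify-write of such a field. set_field_sketch() and
get_field_sketch() are hypothetical function versions written for
illustration, not the driver's roce_set_field() macro.

```c
#include <stdint.h>

/* Shift and mask follow the ROCEE_SMAC_H_ROCEE_PORT_MTU_* macros above. */
#define PORT_MTU_S 16
#define PORT_MTU_M (((1UL << 4) - 1) << PORT_MTU_S)

/* Clear the field, then insert val, leaving all other bits untouched. */
static uint32_t set_field_sketch(uint32_t origin, uint32_t mask,
				 unsigned int shift, uint32_t val)
{
	origin &= ~mask;
	origin |= (val << shift) & mask;
	return origin;
}

/* Extract the field value from a register image. */
static uint32_t get_field_sketch(uint32_t origin, uint32_t mask,
				 unsigned int shift)
{
	return (origin & mask) >> shift;
}
```

Masking the shifted value keeps an out-of-range val from spilling into
neighbouring fields.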

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_common.h |  16 
 drivers/infiniband/hw/hns/hns_roce_device.h |  12 +++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  |  64 +++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  |   1 +
 drivers/infiniband/hw/hns/hns_roce_main.c   | 123 
 5 files changed, 216 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h b/drivers/infiniband/hw/hns/hns_roce_common.h
index 73c6220..60d361c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_common.h
+++ b/drivers/infiniband/hw/hns/hns_roce_common.h
@@ -156,6 +156,14 @@
 
 #define ROCEE_BT_CMD_H_ROCEE_BT_CMD_HW_SYNS_S 31
 
+#define ROCEE_SMAC_H_ROCEE_SMAC_H_S 0
+#define ROCEE_SMAC_H_ROCEE_SMAC_H_M   \
+   (((1UL << 16) - 1) << ROCEE_SMAC_H_ROCEE_SMAC_H_S)
+
+#define ROCEE_SMAC_H_ROCEE_PORT_MTU_S 16
+#define ROCEE_SMAC_H_ROCEE_PORT_MTU_M   \
+   (((1UL << 4) - 1) << ROCEE_SMAC_H_ROCEE_PORT_MTU_S)
+
 #define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_S 0
 #define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_M   \
(((1UL << 2) - 1) << ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_S)
@@ -196,8 +204,16 @@
 #define ROCEE_SYS_IMAGE_GUID_L_REG 0xC
 #define ROCEE_SYS_IMAGE_GUID_H_REG 0x10
 
+#define ROCEE_PORT_GID_L_0_REG 0x50
+#define ROCEE_PORT_GID_ML_0_REG0x54
+#define ROCEE_PORT_GID_MH_0_REG0x58
+#define ROCEE_PORT_GID_H_0_REG 0x5C
+
 #define ROCEE_BT_CMD_H_REG 0x204
 
+#define ROCEE_SMAC_L_0_REG 0x240
+#define ROCEE_SMAC_H_0_REG 0x244
+
 #define ROCEE_CAEP_AEQE_CONS_IDX_REG   0x3AC
 #define ROCEE_CAEP_CEQC_CONS_IDX_0_REG 0x3BC
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index fa906c1..5b714bf 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -44,6 +44,8 @@
 
 #define DRV_NAME "hns_roce"
 
+#define MAC_ADDR_OCTET_NUM 6
+
 #define HNS_ROCE_BA_SIZE   (32 * 4096)
 
 #define HNS_ROCE_MAX_IRQ_NUM   34
@@ -54,6 +56,7 @@
 #define HNS_ROCE_AEQE_OF_VEC_NUM   1
 
 #define HNS_ROCE_MAX_PORTS 6
+#define HNS_ROCE_MAX_GID_NUM   16
 
 /* Address shift 12bit with the special hardware address operation of RoCEE */
 #define ADDR_SHIFT_12  12
@@ -266,6 +269,8 @@ struct hns_roce_qp {
 
 struct hns_roce_ib_iboe {
struct net_device  *netdevs[HNS_ROCE_MAX_PORTS];
+   /* 16 GIDs are shared by 6 ports in the v1 engine. */
+   union ib_gidgid_table[HNS_ROCE_MAX_GID_NUM];
u8  phy_port[HNS_ROCE_MAX_PORTS];
 };
 
@@ -340,6 +345,11 @@ struct hns_roce_hw {
void (*hw_profile)(struct hns_roce_dev *hr_dev);
int (*hw_init)(struct hns_roce_dev *hr_dev);
void (*hw_uninit)(struct hns_roce_dev *hr_dev);
+   void (*set_gid)(struct hns_roce_dev *hr_dev, u8 port, int gid_index,
+   union ib_gid *gid);
+   void (*set_mac)(struct hns_roce_dev *hr_dev, u8 phy_port, u8 *addr);
+   void (*set_mtu)(struct hns_roce_dev *hr_dev, u8 phy_port,
+   enum ib_mtu mtu);
void*priv;
 };
 
@@ -357,6 +367,7 @@ struct hns_roce_dev {
struct hns_roce_capscaps;
struct radix_tree_root  qp_table_tree;
 
+   unsigned char   dev_addr[HNS_ROCE_MAX_PORTS][MAC_ADDR_OCTET_NUM];
u64 fw_ver;
u64 sys_image_guid;
u32 vendor_id;
@@ -426,6 +437,7 @@ void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
 void hns_roce_cq_completion(struct hns_roce_dev *hr_dev, u32 cqn);
 void hns_roce_cq_event(struct hns_roce_dev *hr_dev, u32 cqn, int event_type);
 void hns_roce_qp_event(struct hns_roce_dev *hr_dev, u32 qpn, int event_type);
+int hns_get_gid_index(struct hns_roce_dev *hr_dev, u8 port, int gid_index);
 
 extern struct hns_roce_hw hns_roce_hw_v1;
 
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 9cd6c1a..29c6903 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -619,9 +619,73 @@ void hns_roce_v1_uninit(struct hns_roce_dev *hr_dev)
hns_roce_db_free(hr_dev);
 }
 
+void hns_roce_v1_set_gid(struct hns_roce_dev *hr_dev, u8 port, int gid_index,
+union ib_gid *gid)
+{
+   u32 *p = NULL;
+   u8 gid_idx = 0;
+
+   gid_idx = hns_get_gid_index(hr_dev, port, 

[PATCH v8 21/22] IB/hns: Kconfig and Makefile for RoCE module

2016-05-25 Thread Lijun Ou
This patch added Kconfig and Makefile for building RoCE module.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/Kconfig |  1 +
 drivers/infiniband/hw/Makefile |  1 +
 drivers/infiniband/hw/hns/Kconfig  | 10 ++
 drivers/infiniband/hw/hns/Makefile |  9 +
 4 files changed, 21 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/Kconfig
 create mode 100644 drivers/infiniband/hw/hns/Makefile

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 6425c0e..726a4ca 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -74,6 +74,7 @@ source "drivers/infiniband/hw/mlx5/Kconfig"
 source "drivers/infiniband/hw/nes/Kconfig"
 source "drivers/infiniband/hw/ocrdma/Kconfig"
 source "drivers/infiniband/hw/usnic/Kconfig"
+source "drivers/infiniband/hw/hns/Kconfig"
 
 source "drivers/infiniband/ulp/ipoib/Kconfig"
 
diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile
index c7ad0a4..223eb78 100644
--- a/drivers/infiniband/hw/Makefile
+++ b/drivers/infiniband/hw/Makefile
@@ -8,3 +8,4 @@ obj-$(CONFIG_MLX5_INFINIBAND)   += mlx5/
 obj-$(CONFIG_INFINIBAND_NES)   += nes/
 obj-$(CONFIG_INFINIBAND_OCRDMA)+= ocrdma/
 obj-$(CONFIG_INFINIBAND_USNIC) += usnic/
+obj-$(CONFIG_INFINIBAND_HISILICON_HNS) += hns/
diff --git a/drivers/infiniband/hw/hns/Kconfig b/drivers/infiniband/hw/hns/Kconfig
new file mode 100644
index 000..c47c168
--- /dev/null
+++ b/drivers/infiniband/hw/hns/Kconfig
@@ -0,0 +1,10 @@
+config INFINIBAND_HISILICON_HNS
+   tristate "Hisilicon Hns ROCE Driver"
+   depends on NET_VENDOR_HISILICON
+   depends on ARM64 && HNS && HNS_DSAF && HNS_ENET
+   ---help---
+ This is a ROCE/RDMA driver for the Hisilicon RoCE engine. The engine
+ is used in the Hisilicon Hi1610 and later ICT SoCs.
+
+ To compile this driver as a module, choose M here: the module
+ will be called hns-roce.
diff --git a/drivers/infiniband/hw/hns/Makefile b/drivers/infiniband/hw/hns/Makefile
new file mode 100644
index 000..404a700
--- /dev/null
+++ b/drivers/infiniband/hw/hns/Makefile
@@ -0,0 +1,9 @@
+#
+# Makefile for the HISILICON RoCE drivers.
+#
+
+obj-$(CONFIG_INFINIBAND_HISILICON_HNS) += hns-roce.o
+hns-roce-objs := hns_roce_main.o hns_roce_cmd.o hns_roce_eq.o hns_roce_pd.o \
+   hns_roce_ah.o hns_roce_icm.o hns_roce_mr.o hns_roce_qp.o \
+   hns_roce_cq.o hns_roce_alloc.o hns_roce_hw_v1.o
+
-- 
1.9.1



[PATCH v8 17/22] IB/hns: Add QP operations support

2016-05-25 Thread Lijun Ou
This patch implements queue pair (QP) operations. A QP consists
of a Send Work Queue and a Receive Work Queue. Send and receive
queues are always created as a pair and remain that way throughout
their lifetime. A Queue Pair is identified by its Queue Pair Number.
The QP operations are as follows:
1. create QP. When a QP is created, a complete set of initial
   attributes must be specified by the Consumer.
2. query QP. Returns the attribute list and current values for
   the specified QP.
3. modify QP. Modifies the attributes of the specified QP.
4. destroy QP. When a QP is destroyed, any outstanding Work
   Requests are no longer considered to be in the scope of
   the Channel Interface. It is the responsibility of the
   Consumer to be able to clean up any resources.
5. post send request. Builds one or more WQEs for the Send Queue
   in the specified QP.
6. post receive request. Builds one or more WQEs for the Receive
   Queue in the specified QP.
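
Posting to a work queue amounts to claiming slots in a ring and refusing
requests that would overrun entries not yet completed. A minimal sketch of
that bookkeeping is shown here, assuming a power-of-two queue depth. struct
wq_sketch and its helpers are hypothetical and do not reproduce the WQE
layout or the doorbell handling of the driver.

```c
#include <stdint.h>

struct wq_sketch {
	uint32_t head;		/* count of WQEs posted so far */
	uint32_t tail;		/* count of WQEs completed so far */
	uint32_t wqe_cnt;	/* queue depth, assumed to be a power of two */
};

/* Nonzero if posting nreq more WQEs would exceed the queue depth. */
static int wq_overflow(const struct wq_sketch *wq, uint32_t nreq)
{
	return wq->head - wq->tail + nreq > wq->wqe_cnt;
}

/* Claim nreq slots; returns the index of the first slot, or -1 if full. */
static int wq_post(struct wq_sketch *wq, uint32_t nreq)
{
	uint32_t first;

	if (wq_overflow(wq, nreq))
		return -1;

	first = wq->head & (wq->wqe_cnt - 1);
	wq->head += nreq;
	return (int)first;
}

/* Retire n completed WQEs, freeing their slots for reuse. */
static void wq_complete(struct wq_sketch *wq, uint32_t n)
{
	wq->tail += n;
}
```

Keeping head and tail as free-running counters and masking only when
indexing lets the difference head - tail give the number of outstanding
WQEs even after the counters wrap.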

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
Signed-off-by: Salil Mehta 
---
 drivers/infiniband/hw/hns/hns_roce_alloc.c  |  134 +++
 drivers/infiniband/hw/hns/hns_roce_cmd.c|  249 
 drivers/infiniband/hw/hns/hns_roce_cmd.h|   33 +
 drivers/infiniband/hw/hns/hns_roce_common.h |   58 +
 drivers/infiniband/hw/hns/hns_roce_device.h |  167 +++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  | 1643 +++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  |  626 ++
 drivers/infiniband/hw/hns/hns_roce_icm.c|   56 +
 drivers/infiniband/hw/hns/hns_roce_icm.h|9 +
 drivers/infiniband/hw/hns/hns_roce_main.c   |   14 +-
 drivers/infiniband/hw/hns/hns_roce_mr.c |  160 +++
 drivers/infiniband/hw/hns/hns_roce_qp.c |  765 +
 drivers/infiniband/hw/hns/hns_roce_user.h   |   13 +
 13 files changed, 3926 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_alloc.c b/drivers/infiniband/hw/hns/hns_roce_alloc.c
index d2932c1..786385a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_alloc.c
+++ b/drivers/infiniband/hw/hns/hns_roce_alloc.c
@@ -71,6 +71,45 @@ void hns_roce_bitmap_free(struct hns_roce_bitmap *bitmap, unsigned long obj)
hns_roce_bitmap_free_range(bitmap, obj, 1);
 }
 
+int hns_roce_bitmap_alloc_range(struct hns_roce_bitmap *bitmap, int cnt,
+   int align, unsigned long *obj)
+{
+   int ret = 0;
+   int i;
+
+   if (likely(cnt == 1 && align == 1))
+   return hns_roce_bitmap_alloc(bitmap, obj);
+
+   spin_lock(&bitmap->lock);
+
+   *obj = bitmap_find_next_zero_area(bitmap->table, bitmap->max,
+ bitmap->last, cnt, align - 1);
+   if (*obj >= bitmap->max) {
+   bitmap->top = (bitmap->top + bitmap->max + bitmap->reserved_top)
+  & bitmap->mask;
+   *obj = bitmap_find_next_zero_area(bitmap->table, bitmap->max, 0,
+ cnt, align - 1);
+   }
+
+   if (*obj < bitmap->max) {
+   for (i = 0; i < cnt; i++)
+   set_bit(*obj + i, bitmap->table);
+
+   if (*obj == bitmap->last) {
+   bitmap->last = (*obj + cnt);
+   if (bitmap->last >= bitmap->max)
+   bitmap->last = 0;
+   }
+   *obj |= bitmap->top;
+   } else {
+   ret = -1;
+   }
+
+   spin_unlock(&bitmap->lock);
+
+   return ret;
+}
+
 void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
unsigned long obj, int cnt)
 {
@@ -118,6 +157,101 @@ void hns_roce_bitmap_cleanup(struct hns_roce_bitmap *bitmap)
kfree(bitmap->table);
 }
 
+void hns_roce_buf_free(struct hns_roce_dev *hr_dev, u32 size,
+  struct hns_roce_buf *buf)
+{
+   int i;
+   struct device *dev = &hr_dev->pdev->dev;
+   u32 bits_per_long = BITS_PER_LONG;
+
+   if (buf->nbufs == 1) {
+   dma_free_coherent(dev, size, buf->direct.buf, buf->direct.map);
+   } else {
+   if (bits_per_long == 64)
+   vunmap(buf->direct.buf);
+
+   for (i = 0; i < buf->nbufs; ++i)
+   if (buf->page_list[i].buf)
+   dma_free_coherent(&hr_dev->pdev->dev, PAGE_SIZE,
+ buf->page_list[i].buf,
+ buf->page_list[i].map);
+   kfree(buf->page_list);
+   }
+}
+
+int hns_roce_buf_alloc(struct hns_roce_dev *hr_dev, u32 size, u32 max_direct,
+  struct hns_roce_buf *buf)
+{
+   int i = 0;
+   dma_addr_t t;
+   struct page **pages;
+   struct device *dev = 

[PATCH v8 02/22] devicetree: bindings: IB: Add binding document for HiSilicon RoCE

2016-05-25 Thread Lijun Ou
This patch added DTS binding document for HiSilicon RoCE driver.

Signed-off-by: Wei Hu 
Signed-off-by: Lijun Ou 
---
 .../bindings/infiniband/hisilicon-hns-roce.txt | 107 +
 1 file changed, 107 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/infiniband/hisilicon-hns-roce.txt

diff --git a/Documentation/devicetree/bindings/infiniband/hisilicon-hns-roce.txt b/Documentation/devicetree/bindings/infiniband/hisilicon-hns-roce.txt
new file mode 100644
index 000..2c59ed9
--- /dev/null
+++ b/Documentation/devicetree/bindings/infiniband/hisilicon-hns-roce.txt
@@ -0,0 +1,107 @@
+HiSilicon RoCE DT description
+
+The HiSilicon RoCE engine is part of the network subsystem.
+It depends on other parts of the network subsystem, such as gmac and
+dsa fabric.
+
+Additional properties are described here:
+
+Required properties:
+- compatible: Should contain "hisilicon,hns-roce-v1".
+- reg: Physical base address of the roce driver and
+length of memory mapped region.
+- eth-handle: phandle, specifies a reference to a node
+representing an ethernet device.
+- dsaf-handle: phandle, specifies a reference to a node
+representing a dsaf device.
+- #address-cells: must be 2
+- #size-cells: must be 2
+Optional properties:
+- dma-coherent: Present if DMA operations are coherent.
+- interrupt-parent: the interrupt parent of this device.
+- interrupts: should contain 32 completion event irqs, 1 async event irq
+and 1 event overflow irq.
+- interrupt-names: should be one of the 34 irq names for the roce device
+  - hns-roce-comp-0 ~ hns-roce-comp-31: 32 completion event irqs
+  - hns-roce-async: 1 async event irq
+  - hns-roce-common: common exception warning irq
+Example:
+   infiniband@c400 {
+   compatible = "hisilicon,hns-roce-v1";
+   reg = <0x0 0xc400 0x0 0x10>;
+   dma-coherent;
+   eth-handle = < >;
+   dsaf-handle = <_dsa>;
+   #address-cells = <2>;
+   #size-cells = <2>;
+   interrupt-parent = <_dsa>;
+   interrupts = <722 1>,
+   <723 1>,
+   <724 1>,
+   <725 1>,
+   <726 1>,
+   <727 1>,
+   <728 1>,
+   <729 1>,
+   <730 1>,
+   <731 1>,
+   <732 1>,
+   <733 1>,
+   <734 1>,
+   <735 1>,
+   <736 1>,
+   <737 1>,
+   <738 1>,
+   <739 1>,
+   <740 1>,
+   <741 1>,
+   <742 1>,
+   <743 1>,
+   <744 1>,
+   <745 1>,
+   <746 1>,
+   <747 1>,
+   <748 1>,
+   <749 1>,
+   <750 1>,
+   <751 1>,
+   <752 1>,
+   <753 1>,
+   <785 1>,
+   <754 4>;
+
+   interrupt-names = "hns-roce-comp-0",
+   "hns-roce-comp-1",
+   "hns-roce-comp-2",
+   "hns-roce-comp-3",
+   "hns-roce-comp-4",
+   "hns-roce-comp-5",
+   "hns-roce-comp-6",
+   "hns-roce-comp-7",
+   "hns-roce-comp-8",
+   "hns-roce-comp-9",
+   "hns-roce-comp-10",
+   "hns-roce-comp-11",
+   "hns-roce-comp-12",
+   "hns-roce-comp-13",
+   "hns-roce-comp-14",
+   "hns-roce-comp-15",
+   "hns-roce-comp-16",
+   "hns-roce-comp-17",
+   "hns-roce-comp-18",
+

[PATCH v8 16/22] IB/hns: Add ah operations support

2016-05-25 Thread Lijun Ou
This patch implements address handle (AH) operations.
It includes three verbs: create AH, query AH and destroy
AH. They are completed independently by the RoCE driver.
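
When creating an address handle, the patch folds the service level into the
top bits of the VLAN tag, but only if the tag holds a valid VLAN ID (below
0x1000). A small sketch of that step follows. The mask and shift values
match HNS_ROCE_VLAN_SL_BIT_MASK and HNS_ROCE_VLAN_SL_SHIFT from the patch,
while fold_sl_into_vlan() is a hypothetical helper name and the "no VLAN"
value is left to the caller.

```c
#include <stdint.h>

#define SL_BIT_MASK 7	/* HNS_ROCE_VLAN_SL_BIT_MASK in the patch */
#define SL_SHIFT 13	/* HNS_ROCE_VLAN_SL_SHIFT in the patch */

/* Place the low three bits of sl in bits 13..15 of a valid VLAN tag. */
static uint16_t fold_sl_into_vlan(uint16_t vlan_tag, uint8_t sl)
{
	if (vlan_tag < 0x1000)
		vlan_tag |= (uint16_t)((sl & SL_BIT_MASK) << SL_SHIFT);
	return vlan_tag;
}
```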

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_ah.c | 132 
 drivers/infiniband/hw/hns/hns_roce_device.h |  30 +++
 drivers/infiniband/hw/hns/hns_roce_main.c   |   5 ++
 3 files changed, 167 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_ah.c

diff --git a/drivers/infiniband/hw/hns/hns_roce_ah.c b/drivers/infiniband/hw/hns/hns_roce_ah.c
new file mode 100644
index 000..9397614
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_ah.c
@@ -0,0 +1,132 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hns_roce_common.h"
+#include "hns_roce_device.h"
+
+#define HNS_ROCE_PORT_NUM_SHIFT 24
+#define HNS_ROCE_VLAN_SL_BIT_MASK  7
+#define HNS_ROCE_VLAN_SL_SHIFT 13
+
+struct ib_ah *hns_roce_create_ah(struct ib_pd *ibpd, struct ib_ah_attr *ah_attr)
+{
+   struct hns_roce_dev *hr_dev = to_hr_dev(ibpd->device);
+   struct device *dev = &hr_dev->pdev->dev;
+   struct ib_gid_attr gid_attr;
+   struct hns_roce_ah *ah;
+   u16 vlan_tag = 0x;
+   struct in6_addr in6;
+   union ib_gid sgid;
+   int ret;
+
+   ah = kzalloc(sizeof(*ah), GFP_ATOMIC);
+   if (!ah)
+   return ERR_PTR(-ENOMEM);
+
+   /* Get mac address */
+   memcpy(&in6, ah_attr->grh.dgid.raw, sizeof(ah_attr->grh.dgid.raw));
+   if (rdma_is_multicast_addr(&in6))
+   rdma_get_mcast_mac(&in6, ah->av.mac);
+   else
+   memcpy(ah->av.mac, ah_attr->dmac, sizeof(ah_attr->dmac));
+
+   /* Get source gid */
+   ret = ib_get_cached_gid(ibpd->device, ah_attr->port_num,
+   ah_attr->grh.sgid_index, &sgid, &gid_attr);
+   if (ret) {
+   dev_err(dev, "get sgid failed! ret = %d\n", ret);
+   kfree(ah);
+   return ERR_PTR(ret);
+   }
+
+   if (gid_attr.ndev) {
+   if (is_vlan_dev(gid_attr.ndev))
+   vlan_tag = vlan_dev_vlan_id(gid_attr.ndev);
+   dev_put(gid_attr.ndev);
+   }
+
+   if (vlan_tag < 0x1000)
+   vlan_tag |= (ah_attr->sl & HNS_ROCE_VLAN_SL_BIT_MASK) <<
+HNS_ROCE_VLAN_SL_SHIFT;
+
+   ah->av.port_pd = cpu_to_be32(to_hr_pd(ibpd)->pdn | (ah_attr->port_num <<
+HNS_ROCE_PORT_NUM_SHIFT));
+   ah->av.gid_index = ah_attr->grh.sgid_index;
+   ah->av.vlan = cpu_to_le16(vlan_tag);
+   dev_dbg(dev, "gid_index = 0x%x,vlan = 0x%x\n", ah->av.gid_index,
+   ah->av.vlan);
+
+   if (ah_attr->static_rate)
+   ah->av.stat_rate = IB_RATE_10_GBPS;
+
+   memcpy(ah->av.dgid, ah_attr->grh.dgid.raw, HNS_ROCE_GID_SIZE);
+   ah->av.sl_tclass_flowlabel = cpu_to_le32(ah_attr->sl <<
+HNS_ROCE_SL_SHIFT);
+
+   return &ah->ibah;
+}
+
+int hns_roce_query_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr)
+{
+   struct hns_roce_ah *ah = to_hr_ah(ibah);
+
+   memset(ah_attr, 0, sizeof(*ah_attr));
+
+   ah_attr->sl = le32_to_cpu(ah->av.sl_tclass_flowlabel) >>
+ HNS_ROCE_SL_SHIFT;
+   ah_attr->port_num = le32_to_cpu(ah->av.port_pd) >>
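Note on the VLAN handling in hns_roce_create_ah() above: when the GID's
netdev is a VLAN device, the service level is folded into the top three
bits of the 16-bit tag (the 802.1Q priority field). A minimal user-space
sketch of that packing follows. The DEMO_* names and demo_vlan_tag() are
hypothetical and only mirror the mask and shift values from the patch;
this is not the driver code itself.

```c
#include <stdint.h>

/* Same values as HNS_ROCE_VLAN_SL_BIT_MASK / HNS_ROCE_VLAN_SL_SHIFT. */
#define DEMO_VLAN_SL_BIT_MASK 7
#define DEMO_VLAN_SL_SHIFT    13

/*
 * Fold a 3-bit service level into a VLAN tag.
 * A vlan_id of 0x1000 or above is treated as "no VLAN" and returned
 * unchanged, matching the "vlan_tag < 0x1000" check in the patch.
 */
static uint16_t demo_vlan_tag(uint16_t vlan_id, uint8_t sl)
{
	uint16_t tag = vlan_id;

	if (tag < 0x1000)
		tag |= (uint16_t)((sl & DEMO_VLAN_SL_BIT_MASK) <<
				  DEMO_VLAN_SL_SHIFT);
	return tag;
}
```

For instance, VLAN ID 100 (0x064) with SL 5 packs to 0xA064, while a
tag that is not a valid VLAN ID is left alone.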

[PATCH v8 20/22] IB/hns: Add operation for getting immutable port

2016-05-25 Thread Lijun Ou
This patch adds a new verb for getting the immutable port attributes.
The verb was added in the 4.5 kernel and is present in later ones. It
is needed to fix failures when registering the IB device.

Signed-off-by: Wei Hu 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_main.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 9286b07..8355763 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -572,6 +572,25 @@ static int hns_roce_mmap(struct ib_ucontext *context,
return 0;
 }
 
+static int hns_roce_port_immutable(struct ib_device *ib_dev, u8 port_num,
+  struct ib_port_immutable *immutable)
+{
+   struct ib_port_attr attr;
+   int ret;
+
+   ret = hns_roce_query_port(ib_dev, port_num, &attr);
+   if (ret)
+   return ret;
+
+   immutable->pkey_tbl_len = attr.pkey_tbl_len;
+   immutable->gid_tbl_len = attr.gid_tbl_len;
+
+   immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+   immutable->max_mad_size = IB_MGMT_MAD_SIZE;
+
+   return 0;
+}
+
 void hns_roce_unregister_device(struct hns_roce_dev *hr_dev)
 {
struct hns_roce_ib_iboe *iboe = &hr_dev->iboe;
@@ -657,6 +676,9 @@ int hns_roce_register_device(struct hns_roce_dev *hr_dev)
ib_dev->reg_user_mr = hns_roce_reg_user_mr;
ib_dev->dereg_mr= hns_roce_dereg_mr;
 
+   /* OTHERS */
+   ib_dev->get_port_immutable  = hns_roce_port_immutable;
+
ret = ib_register_device(ib_dev, NULL);
if (ret) {
dev_err(dev, "ib_register_device failed!\n");
-- 
1.9.1



[PATCH v8 07/22] IB/hns: Add event queue support

2016-05-25 Thread Lijun Ou
This patch adds event queue support to the RoCE driver. It is used
for RoCE interrupts. RoCE includes 32 synchronous event irqs, 1
asynchronous event irq and 1 common overflow irq.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_cmd.c|  22 +
 drivers/infiniband/hw/hns/hns_roce_common.h |  72 +++
 drivers/infiniband/hw/hns/hns_roce_cq.c |  77 +++
 drivers/infiniband/hw/hns/hns_roce_device.h | 142 +
 drivers/infiniband/hw/hns/hns_roce_eq.c | 778 
 drivers/infiniband/hw/hns/hns_roce_eq.h | 131 +
 drivers/infiniband/hw/hns/hns_roce_main.c   |  24 +
 drivers/infiniband/hw/hns/hns_roce_qp.c |  63 +++
 8 files changed, 1309 insertions(+)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_cq.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_eq.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_eq.h
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_qp.c

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 3badf86..72f287a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -45,6 +45,28 @@
 
 #define CMD_MAX_NUM 32
 
+static int hns_roce_status_to_errno(u8 orig_status)
+{
+   if (orig_status == HNS_ROCE_CMD_SUCCESS)
+   return 0;
+   else
+   return -EIO;
+}
+
+void hns_roce_cmd_event(struct hns_roce_dev *hr_dev, u16 token, u8 status,
+   u64 out_param)
+{
+   struct hns_roce_cmd_context
+   *context = &hr_dev->cmd.context[token & hr_dev->cmd.token_mask];
+
+   if (token != context->token)
+   return;
+
+   context->result = hns_roce_status_to_errno(status);
+   context->out_param = out_param;
+   complete(&context->done);
+}
+
 int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
 {
struct device *dev = &hr_dev->pdev->dev;
diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h b/drivers/infiniband/hw/hns/hns_roce_common.h
index 4c90cd4..5b0ca43 100644
--- a/drivers/infiniband/hw/hns/hns_roce_common.h
+++ b/drivers/infiniband/hw/hns/hns_roce_common.h
@@ -33,6 +33,57 @@
 #ifndef _HNS_ROCE_COMMON_H
 #define _HNS_ROCE_COMMON_H
 
+#define roce_writel(value, addr) writel((value), (addr))
+#define roce_readl(addr) readl((addr))
+#define roce_raw_write(value, addr) \
+   __raw_writel((__force u32)cpu_to_le32(value), (addr))
+
+#define roce_get_field(origin, mask, shift) \
+   (((origin) & (mask)) >> (shift))
+
+#define roce_get_bit(origin, shift) \
+   roce_get_field((origin), (1ul << (shift)), (shift))
+
+#define roce_set_field(origin, mask, shift, val) \
+   do { \
+   (origin) &= (~(mask)); \
+   (origin) |= (((u32)(val) << (shift)) & (mask)); \
+   } while (0)
+
+#define roce_set_bit(origin, shift, val) \
+   roce_set_field((origin), (1ul << (shift)), (shift), (val))
+
+#define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_S 0
+#define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_M   \
+   (((1UL << 2) - 1) << ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_STATE_S)
+
+#define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_AEQE_SHIFT_S 8
+#define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_AEQE_SHIFT_M   \
+   (((1UL << 4) - 1) << ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQC_AEQE_SHIFT_S)
+
+#define ROCEE_CAEP_AEQC_AEQE_SHIFT_CAEP_AEQ_ALM_OVF_INT_ST_S 17
+
+#define ROCEE_CAEP_AEQE_CUR_IDX_CAEP_AEQ_BT_H_S 0
+#define ROCEE_CAEP_AEQE_CUR_IDX_CAEP_AEQ_BT_H_M   \
+   (((1UL << 5) - 1) << ROCEE_CAEP_AEQE_CUR_IDX_CAEP_AEQ_BT_H_S)
+
+#define ROCEE_CAEP_AEQE_CUR_IDX_CAEP_AEQE_CUR_IDX_S 16
+#define ROCEE_CAEP_AEQE_CUR_IDX_CAEP_AEQE_CUR_IDX_M   \
+   (((1UL << 16) - 1) << ROCEE_CAEP_AEQE_CUR_IDX_CAEP_AEQE_CUR_IDX_S)
+
+#define ROCEE_CAEP_AEQE_CONS_IDX_CAEP_AEQE_CONS_IDX_S 0
+#define ROCEE_CAEP_AEQE_CONS_IDX_CAEP_AEQE_CONS_IDX_M   \
+   (((1UL << 16) - 1) << ROCEE_CAEP_AEQE_CONS_IDX_CAEP_AEQE_CONS_IDX_S)
+
+#define ROCEE_CAEP_CEQC_SHIFT_CAEP_CEQ_ALM_OVF_INT_ST_S 16
+#define ROCEE_CAEP_CE_IRQ_MASK_CAEP_CEQ_ALM_OVF_MASK_S 1
+#define ROCEE_CAEP_CEQ_ALM_OVF_CAEP_CEQ_ALM_OVF_S 0
+
+#define ROCEE_CAEP_AE_MASK_CAEP_AEQ_ALM_OVF_MASK_S 0
+#define ROCEE_CAEP_AE_MASK_CAEP_AE_IRQ_MASK_S 1
+
+#define ROCEE_CAEP_AE_ST_CAEP_AEQ_ALM_OVF_S 0
+
 /* ROCEE_REG DEFINITION */
 #define ROCEE_VENDOR_ID_REG 0x0
 #define ROCEE_VENDOR_PART_ID_REG   0x4
@@ -42,8 +93,29 @@
 #define ROCEE_SYS_IMAGE_GUID_L_REG 0xC
 #define ROCEE_SYS_IMAGE_GUID_H_REG 0x10
 
+#define ROCEE_CAEP_AEQE_CONS_IDX_REG   0x3AC
+#define ROCEE_CAEP_CEQC_CONS_IDX_0_REG 0x3BC
+
+#define ROCEE_ECC_UCERR_ALM1_REG   0xB38
+#define ROCEE_ECC_UCERR_ALM2_REG   0xB3C
+#define ROCEE_ECC_CERR_ALM1_REG
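Note on the roce_get_field()/roce_set_field() helpers added to
hns_roce_common.h above: each register field is described by a shift
(..._S) and a mask (..._M), and the helpers clear or extract only the
bits under the mask. A minimal user-space sketch is given below, assuming
u32 can be stood in for by uint32_t. The DEMO_* names and the two wrapper
functions are hypothetical; the field geometry (4 bits at bit 8) copies
the AEQE_SHIFT pair from the patch.

```c
#include <stdint.h>

/* User-space copies of the helpers from the patch (u32 -> uint32_t). */
#define roce_get_field(origin, mask, shift) \
	(((origin) & (mask)) >> (shift))

#define roce_set_field(origin, mask, shift, val) \
	do { \
		(origin) &= (~(mask)); \
		(origin) |= (((uint32_t)(val) << (shift)) & (mask)); \
	} while (0)

/* A 4-bit field starting at bit 8, same shape as the ..._S/..._M pairs. */
#define DEMO_FIELD_S 8
#define DEMO_FIELD_M (((1UL << 4) - 1) << DEMO_FIELD_S)

/* Return reg with the demo field replaced by val; other bits are kept. */
static uint32_t demo_write_field(uint32_t reg, uint32_t val)
{
	roce_set_field(reg, DEMO_FIELD_M, DEMO_FIELD_S, val);
	return reg;
}

/* Extract the demo field from reg. */
static uint32_t demo_read_field(uint32_t reg)
{
	return (uint32_t)roce_get_field(reg, DEMO_FIELD_M, DEMO_FIELD_S);
}
```

Values wider than the field are silently truncated by the mask, so
writing 0x1F into the 4-bit field stores 0xF.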

[PATCH v8 13/22] IB/hns: Add interface of the protocol stack registration

2016-05-25 Thread Lijun Ou
This patch mainly adds the module in which netif notifiers are
registered with the protocol stack. It includes the following
interface functions:
1. The RoCE callback that is executed when a netlink event
   registered with the protocol stack is generated.
2. The RoCE callback that is executed when an IP address
   registered with the protocol stack is changed.
In addition, it frees the related resources when RoCE is
removed.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
---
 drivers/infiniband/hw/hns/hns_roce_device.h |   3 +
 drivers/infiniband/hw/hns/hns_roce_main.c   | 210 
 2 files changed, 213 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 5b714bf..62b5924 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -268,7 +268,10 @@ struct hns_roce_qp {
 };
 
 struct hns_roce_ib_iboe {
+   spinlock_t  lock;
struct net_device  *netdevs[HNS_ROCE_MAX_PORTS];
+   struct notifier_block   nb;
+   struct notifier_block   nb_inet;
/* 16 GID is shared by 6 port in v1 engine. */
union ib_gid gid_table[HNS_ROCE_MAX_GID_NUM];
u8  phy_port[HNS_ROCE_MAX_PORTS];
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 5cfd9cf..66fbae2 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -63,6 +63,46 @@
 #include "hns_roce_icm.h"
 
 /**
+ * hns_roce_addrconf_ifid_eui48 - Get default gid.
+ * @eui: eui.
+ * @vlan_id:  gid
+ * @dev:  net device
+ * Description:
+ *MAC convert to GID
+ *gid[0..7] = fe80   
+ *gid[8] = mac[0] ^ 2
+ *gid[9] = mac[1]
+ *gid[10] = mac[2]
+ *gid[11] = ff(VLAN ID high byte (4 MS bits))
+ *gid[12] = fe(VLAN ID low byte)
+ *gid[13] = mac[3]
+ *gid[14] = mac[4]
+ *gid[15] = mac[5]
+ */
+static void hns_roce_addrconf_ifid_eui48(u8 *eui, u16 vlan_id,
+struct net_device *dev)
+{
+   memcpy(eui, dev->dev_addr, 3);
+   memcpy(eui + 5, dev->dev_addr + 3, 3);
+   if (vlan_id < 0x1000) {
+   eui[3] = vlan_id >> 8;
+   eui[4] = vlan_id & 0xff;
+   } else {
+   eui[3] = 0xff;
+   eui[4] = 0xfe;
+   }
+   eui[0] ^= 2;
+}
+
+void hns_roce_make_default_gid(struct net_device *dev, union ib_gid *gid)
+{
+   memset(gid, 0, sizeof(*gid));
+   gid->raw[0] = 0xFE;
+   gid->raw[1] = 0x80;
+   hns_roce_addrconf_ifid_eui48(&gid->raw[8], 0x, dev);
+}
+
+/**
  * hns_get_gid_index - Get gid index.
  * @hr_dev: pointer to structure hns_roce_dev.
  * @port:  port, value range: 0 ~ MAX
@@ -140,6 +180,152 @@ void hns_roce_update_gids(struct hns_roce_dev *hr_dev, int port)
ib_dispatch_event();
 }
 
+static int handle_en_event(struct hns_roce_dev *hr_dev, u8 port,
+  unsigned long event)
+{
+   struct device *dev = &hr_dev->pdev->dev;
+   struct net_device *netdev;
+   unsigned long flags;
+   union ib_gid gid;
+   int ret = 0;
+
+   netdev = hr_dev->iboe.netdevs[port];
+   if (!netdev) {
+   dev_err(dev, "port(%d) can't find netdev\n", port);
+   return -ENODEV;
+   }
+
+   spin_lock_irqsave(&hr_dev->iboe.lock, flags);
+
+   switch (event) {
+   case NETDEV_UP:
+   case NETDEV_CHANGE:
+   case NETDEV_REGISTER:
+   case NETDEV_CHANGEADDR:
+   hns_roce_set_mac(hr_dev, port, netdev->dev_addr);
+   hns_roce_make_default_gid(netdev, &gid);
+   ret = hns_roce_set_gid(hr_dev, port, 0, &gid);
+   if (!ret)
+   hns_roce_update_gids(hr_dev, port);
+   break;
+   case NETDEV_DOWN:
+   /*
+   * In v1 engine, only support all ports closed together.
+   */
+   break;
+   default:
+   dev_dbg(dev, "NETDEV event = 0x%x!\n", (u32)(event));
+   break;
+   }
+
+   spin_unlock_irqrestore(&hr_dev->iboe.lock, flags);
+   return ret;
+}
+
+static int hns_roce_netdev_event(struct notifier_block *self,
+unsigned long event, void *ptr)
+{
+   struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+   struct hns_roce_ib_iboe *iboe = NULL;
+   struct hns_roce_dev *hr_dev = NULL;
+   u8 port = 0;
+   int ret = 0;
+
+   hr_dev = container_of(self, struct hns_roce_dev, iboe.nb);
+   iboe = &hr_dev->iboe;
+
+   for (port = 0; port < hr_dev->caps.num_ports; port++) {
+   if (dev == iboe->netdevs[port]) {
+   
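Note on the MAC-to-GID conversion documented in the
hns_roce_addrconf_ifid_eui48() comment above: the default GID is a
link-local prefix (fe80::/64) followed by a modified EUI-64 built from the
MAC, with either the VLAN ID or the ff:fe filler in the middle. A minimal
user-space sketch of that layout follows. demo_mac_to_gid() is a
hypothetical name, and passing 0xffff for "no VLAN" is an assumption (the
patch only requires a value of 0x1000 or above to take the ff:fe path).

```c
#include <stdint.h>
#include <string.h>

/*
 * Build a 16-byte default GID from a 6-byte MAC address.
 * vlan_id < 0x1000: bytes 11..12 carry the VLAN ID.
 * otherwise:        bytes 11..12 are ff:fe.
 */
static void demo_mac_to_gid(const uint8_t mac[6], uint16_t vlan_id,
			    uint8_t gid[16])
{
	uint8_t *eui = gid + 8;		/* interface ID: gid[8..15] */

	memset(gid, 0, 16);
	gid[0] = 0xfe;			/* link-local prefix fe80::/64 */
	gid[1] = 0x80;

	memcpy(eui, mac, 3);		/* gid[8..10]  = mac[0..2] */
	memcpy(eui + 5, mac + 3, 3);	/* gid[13..15] = mac[3..5] */
	if (vlan_id < 0x1000) {
		eui[3] = (uint8_t)(vlan_id >> 8);
		eui[4] = (uint8_t)(vlan_id & 0xff);
	} else {
		eui[3] = 0xff;
		eui[4] = 0xfe;
	}
	eui[0] ^= 2;			/* flip the universal/local bit */
}
```

With MAC 00:11:22:33:44:55 and no VLAN this yields
fe80::211:22ff:fe33:4455.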

Re: [RESEND] Re: updating carl9170-1.fw in linux-firmware.git

2016-05-25 Thread Ben Hutchings
On Wed, 2016-04-20 at 23:11 +0200, Christian Lamparter wrote:
> On Wednesday, April 20, 2016 10:59:44 AM Kalle Valo wrote:
> > Christian Lamparter  writes:
> > 
> > > On Monday, April 18, 2016 07:42:05 PM Kalle Valo wrote:
> > > > Christian Lamparter  writes:
> > > > 
> > > > > On Monday, April 18, 2016 06:45:09 PM Kalle Valo wrote:
> > > > > 
> > > > > > Why even mention anything about a "special firmware" as the 
> > > > > > firmware is
> > > > > > already available from linux-firmware.git? 
> > > > > 
> > > > > Yes and no. 1.9.6 is in linux-firmware.git. I've tried to add 1.9.9 
> > > > > too
> > > > > but that failed.
> > > > > 
> > > > 
> > > > Rick's comment makes sense to me, better just to provide the latest
> > > > version. No need to unnecessarily confuse the users. And if someone really
> > > > wants to use an older version then she can retrieve it from the git
> > > > history.
> > > 
> > > Part of the fun here is that firmware is GPLv2. The linux-firmware.git has
> > > to point to or add the firmware source to their tree. They have added 
> > > every
> > > single source file to it instead of "packaging" it in a tar.bz2/gz/xz
> > > like you normally do for release sources.
> > > 
> > > If you want to read more about it:
> > > 
> > 
> > Yeah, that's more work. I get that. But I'm still not understanding
> > what's the actual problem which prevents us from updating carl9170
> > firmware in linux-firmware.
> I'm not sure, but why not ask? I've added the cc'ed Linux Firmware
> Maintainers. So for those people reading the fw list:
> 
> What would it take to update the carl9170-1.fw firmware file in your
> repository to the latest version?
> 
> Who has to send the firmware update? Does it have to be the person who
> sent the first request? (Xose)? The maintainer of the firmware (me)?
> someone from Qualcomm Atheros? Or someone else (specific)? (the 
> firmware is licensed as GPLv2 - in theory anyone should be able to
> do that)

Given the licence, I don't particularly care.

> How should the firmware source update be handled? Currently the latest
> .tar.xz of the firmware has ~130kb. The formatted patches from 1.9.6 to
> latest are about ~100kb (182 individual patches).

Either patches that 'git am' can handle, or a git branch.

> How does linux-firmware handle new binary firmware images and new 
> sources? What if carl9170fw-2.bin is added. Do we need another
> source directory for this in the current tree then? Because 
> carl9170fw-1.bin will still be needed for backwards compatibility
> so we basically need to duplicate parts of the source?

We still need to include the old binary for compatibility, and the old
source for GPL compliance.  (If there was a source version that could
build firmware for both ABI versions, then we could update both
binaries and have one set of source files.  But it doesn't sound like
that's the case.)

> Also, how's the situation with ath9k_htc? The 1.4.0 image contains
> some GPLv2 code as well?

I didn't realise that.

> So, why is there no source in the tree, but just the link to it?

An oversight which we need to fix.

> Because, I would like to do basically the same
> for carl9170fw and just add a link to the carl9170fw repository and
> save everyone this source update "song and dance".

Merely linking to upstream source doesn't satisfy GPLv2 source
requirements, at least not in case of commercial distribution.  Linux
distributors should be able to use a snapshot of linux-firmware as the
upstream source for a package, without worrying about whether there are
extra sources they need to include.

(I'm aware that there are several files that we don't actually have
clear licences for.  But those are at least called out in WHENCE, and were
previously distributed as part of the kernel sources for years.)

Ben.

-- 
Ben Hutchings
Time is nature's way of making sure that everything doesn't happen at
once.




[PATCH 2/2] fou: add Kconfig options for IPv6 support

2016-05-25 Thread Arnd Bergmann
A previous patch added the fou6.ko module, but that failed to link
in a couple of configurations:

net/built-in.o: In function `ip6_tnl_encap_add_fou_ops':
net/ipv6/fou6.c:88: undefined reference to `ip6_tnl_encap_add_ops'
net/ipv6/fou6.c:94: undefined reference to `ip6_tnl_encap_add_ops'
net/ipv6/fou6.c:97: undefined reference to `ip6_tnl_encap_del_ops'
net/built-in.o: In function `ip6_tnl_encap_del_fou_ops':
net/ipv6/fou6.c:106: undefined reference to `ip6_tnl_encap_del_ops'
net/ipv6/fou6.c:107: undefined reference to `ip6_tnl_encap_del_ops'

If CONFIG_IPV6=m, ip6_tnl_encap_add_ops/ip6_tnl_encap_del_ops
are in a module, but fou6.c can still be built-in, and that
obviously fails to link.

Also, if CONFIG_IPV6=y, but CONFIG_IPV6_TUNNEL=m or
CONFIG_IPV6_TUNNEL=n, the same problem happens for a different
reason.

This adds two new silent Kconfig symbols to work around both
problems:

- CONFIG_IPV6_FOU is now always set to 'm' if either CONFIG_NET_FOU=m
  or CONFIG_IPV6=m
- CONFIG_IPV6_FOU_TUNNEL is set implicitly when IPV6_FOU is enabled
  and NET_FOU_IP_TUNNELS is also turned on, and it will ensure
  that CONFIG_IPV6_TUNNEL is also available.

The options could be made user-visible as well, to give additional
room for configuration, but it seems easier not to bother users
with more choice here.

Signed-off-by: Arnd Bergmann 
Fixes: aa3463d65e7b ("fou: Add encap ops for IPv6 tunnels")
---
 net/ipv6/Kconfig  | 9 +
 net/ipv6/Makefile | 2 +-
 net/ipv6/fou6.c   | 2 +-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 3f8411328de5..994608263260 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -232,6 +232,15 @@ config IPV6_GRE
 
  Saying M here will produce a module called ip6_gre. If unsure, say N.
 
+config IPV6_FOU
+   tristate
+   default NET_FOU && IPV6
+
+config IPV6_FOU_TUNNEL
+   tristate
+   default NET_FOU_IP_TUNNELS && IPV6_FOU
+   select INET6_TUNNEL
+
 config IPV6_MULTIPLE_TABLES
bool "IPv6: Multiple Routing Tables"
select FIB_RULES
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 7ec3129c9ace..d42ca3d1197f 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -42,7 +42,7 @@ obj-$(CONFIG_IPV6_VTI) += ip6_vti.o
 obj-$(CONFIG_IPV6_SIT) += sit.o
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
 obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
-obj-$(CONFIG_NET_FOU) += fou6.o
+obj-$(CONFIG_NET_FOU_IPV6_TUNNELS) += fou6.o
 
 obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
 obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
diff --git a/net/ipv6/fou6.c b/net/ipv6/fou6.c
index c972d0b52579..9ea249b9451e 100644
--- a/net/ipv6/fou6.c
+++ b/net/ipv6/fou6.c
@@ -69,7 +69,7 @@ int gue6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
 }
 EXPORT_SYMBOL(gue6_build_header);
 
-#ifdef CONFIG_NET_FOU_IP_TUNNELS
+#if IS_ENABLED(CONFIG_IPV6_FOU_TUNNEL)
 
 static const struct ip6_tnl_encap_ops fou_ip6tun_ops = {
.encap_hlen = fou_encap_hlen,
-- 
2.7.0
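Note on the "default NET_FOU && IPV6" line in the Kconfig hunk above: in
Kconfig, tristate values are ordered n < m < y and "A && B" evaluates to
the smaller of the two, which is why the silent IPV6_FOU symbol drops to
'm' as soon as either input is modular. A minimal sketch of that rule in
C follows. The enum and function names are hypothetical and only model
the expression; they ignore any further dependencies Kconfig would apply.

```c
/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig "A && B" on tristates: the minimum of the two values. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* Models "default NET_FOU && IPV6" for the silent IPV6_FOU symbol. */
static enum tristate ipv6_fou_default(enum tristate net_fou,
				      enum tristate ipv6)
{
	return tri_and(net_fou, ipv6);
}
```

So NET_FOU=y with IPV6=m gives IPV6_FOU=m, which is the case the
original link failure came from.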



[PATCH v8 18/22] IB/hns: Add CQ operations support

2016-05-25 Thread Lijun Ou
This patch implements Completion Queue (CQ) operations.
A CQ can be used to multiplex work completions from multiple work
queues across queue pairs on the same HCA. The CQ serves as the
notification mechanism for Work Request completions.
The CQ operations are as follows:
1. create CQ. CQs are created through the Channel Interface.
   The maximum number of Completion Queue Entries (CQEs) that
   may be outstanding on a CQ must be specified when the CQ
   is created.
2. destroy CQ. Destroys the specified CQ. Resources allocated
   by the Channel Interface to implement the CQ must be
   deallocated during the destroy operation.
3. request completion notification. Requests the CQ event handler
   be called when the next completion entry of the specified type
   is added to the specified CQ.
4. poll CQ. Polls the specified CQ for a Work Completion.
   A Work Completion indicates that a Work Request for a Work
   Queue associated with the CQ is done.

Signed-off-by: Wei Hu 
Signed-off-by: Nenglong Zhao 
Signed-off-by: Lijun Ou 
Signed-off-by: Salil Mehta 
---
 drivers/infiniband/hw/hns/hns_roce_cq.c | 358 
 drivers/infiniband/hw/hns/hns_roce_device.h |  33 +++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c  | 340 ++
 drivers/infiniband/hw/hns/hns_roce_hw_v1.h  | 117 +
 drivers/infiniband/hw/hns/hns_roce_main.c   |   9 +
 5 files changed, 857 insertions(+)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c b/drivers/infiniband/hw/hns/hns_roce_cq.c
index 7553179..abbe4fd 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cq.c
@@ -34,6 +34,364 @@
 #include 
 #include 
 #include "hns_roce_device.h"
+#include "hns_roce_cmd.h"
+#include "hns_roce_icm.h"
+#include "hns_roce_user.h"
+#include "hns_roce_common.h"
+
+static void hns_roce_ib_cq_comp(struct hns_roce_cq *hr_cq)
+{
+   struct ib_cq *ibcq = &hr_cq->ib_cq;
+
+   ibcq->comp_handler(ibcq, ibcq->cq_context);
+}
+
+static void hns_roce_ib_cq_event(struct hns_roce_cq *hr_cq,
+enum hns_roce_event event_type)
+{
+   struct hns_roce_dev *hr_dev;
+   struct ib_event event;
+   struct ib_cq *ibcq;
+
+   ibcq = &hr_cq->ib_cq;
+   hr_dev = to_hr_dev(ibcq->device);
+
+   if (event_type != HNS_ROCE_EVENT_TYPE_CQ_ID_INVALID &&
+   event_type != HNS_ROCE_EVENT_TYPE_CQ_ACCESS_ERROR &&
+   event_type != HNS_ROCE_EVENT_TYPE_CQ_OVERFLOW) {
+   dev_err(&hr_dev->pdev->dev,
+   "hns_roce_ib: Unexpected event type 0x%x on CQ %06lx\n",
+   event_type, hr_cq->cqn);
+   return;
+   }
+
+   if (ibcq->event_handler) {
+   event.device = ibcq->device;
+   event.event = IB_EVENT_CQ_ERR;
+   event.element.cq = ibcq;
+   ibcq->event_handler(&event, ibcq->cq_context);
+   }
+}
+
+static int hns_roce_sw2hw_cq(struct hns_roce_dev *dev,
+struct hns_roce_cmd_mailbox *mailbox,
+unsigned long cq_num)
+{
+   return hns_roce_cmd_mbox(dev, mailbox->dma, 0, cq_num, 0,
+   HNS_ROCE_CMD_SW2HW_CQ, HNS_ROCE_CMD_TIME_CLASS_A);
+}
+
+static int hns_roce_cq_alloc(struct hns_roce_dev *hr_dev, int nent,
+struct hns_roce_mtt *hr_mtt,
+struct hns_roce_uar *hr_uar,
+struct hns_roce_cq *hr_cq, int vector,
+int collapsed)
+{
+   struct hns_roce_cmd_mailbox *mailbox = NULL;
+   struct hns_roce_cq_table *cq_table = NULL;
+   struct device *dev = &hr_dev->pdev->dev;
+   dma_addr_t dma_handle;
+   u64 *mtts = NULL;
+   int ret = 0;
+
+   cq_table = &hr_dev->cq_table;
+
+   /* Get the physical address of cq buf */
+   mtts = hns_roce_table_find(&hr_dev->mr_table.mtt_table,
+  hr_mtt->first_seg, &dma_handle);
+   if (!mtts) {
+   dev_err(dev, "CQ alloc.Failed to find cq buf addr.\n");
+   return -EINVAL;
+   }
+
+   if (vector >= hr_dev->caps.num_comp_vectors) {
+   dev_err(dev, "CQ alloc.Invalid vector.\n");
+   return -EINVAL;
+   }
+   hr_cq->vector = vector;
+
+   ret = hns_roce_bitmap_alloc(&cq_table->bitmap, &hr_cq->cqn);
+   if (ret == -1) {
+   dev_err(dev, "CQ alloc.Failed to alloc index.\n");
+   return -ENOMEM;
+   }
+
+   /* Get CQC memory icm table */
+   ret = hns_roce_table_get(hr_dev, &cq_table->table, hr_cq->cqn);
+   if (ret) {
+   dev_err(dev, "CQ alloc.Failed to get context mem.\n");
+   goto err_out;
+   }
+
+   /* The cq insert radix tree */
+   spin_lock_irq(&cq_table->lock);
+   /* 

[PATCH 1/2] ipv6: hide ip6_encap_hlen/ip6_tnl_encap definitions

2016-05-25 Thread Arnd Bergmann
A recent cleanup moved MAX_IPTUN_ENCAP_OPS along with some other
definitions, but it is now invisible when CONFIG_INET is
not defined, but still referenced from ip6_tunnel.h:

In file included from net/xfrm/xfrm_input.c:17:0:
include/net/ip6_tunnel.h:67:17: error: 'MAX_IPTUN_ENCAP_OPS' undeclared here (not in a function)
   ip6tun_encaps[MAX_IPTUN_ENCAP_OPS];
 ^~~

This hides the ip6_encap_hlen and ip6_tnl_encap functions inside
of CONFIG_INET so we don't run into the problem.

Alternatively we could move the macro out of the #ifdef again to
restore the previous behavior.

Signed-off-by: Arnd Bergmann 
Fixes: 55c2bc143224 ("net: Cleanup encap items in ip_tunnels.h")
---
 include/net/ip6_tunnel.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index d325c81332e3..43a5a0e4524c 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -63,6 +63,8 @@ struct ip6_tnl_encap_ops {
u8 *protocol, struct flowi6 *fl6);
 };
 
+#ifdef CONFIG_INET
+
 extern const struct ip6_tnl_encap_ops __rcu *
ip6tun_encaps[MAX_IPTUN_ENCAP_OPS];
 
@@ -138,7 +140,6 @@ struct net *ip6_tnl_get_link_net(const struct net_device *dev);
 int ip6_tnl_get_iflink(const struct net_device *dev);
 int ip6_tnl_change_mtu(struct net_device *dev, int new_mtu);
 
-#ifdef CONFIG_INET
 static inline void ip6tunnel_xmit(struct sock *sk, struct sk_buff *skb,
  struct net_device *dev)
 {
-- 
2.7.0



Re: [ethtool 0/3][pull request] Intel Wired LAN Driver Updates 2016-05-03

2016-05-25 Thread Ben Hutchings
On Tue, 2016-05-24 at 16:47 -0700, Jeff Kirsher wrote:
> On Wed, 2016-05-04 at 09:44 -0700, Jeff Kirsher wrote:
> > This series contains updates to ixgbe in ethtool.
> > 
> > Preethi adds missing device IDs and mac_type definitions, also updated
> > the display registers for x550, x550em_x/a.  Cleaned up the format string
> > storage by taking advantage of "for" loops.
> > 
> > The following are changes since commit
> > deb1c6613ec14fd828d321e38c7bea45fe559bd5:
> >   Release version 4.5.
> > and are available in the git repository at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/ethtool master
> > 
> > Preethi Banala (3):
> >   ethtool/ixgbe: Add device ID and mac_type definitions
> >   ethtool/ixgbe: Correct offsets and support x550, x550em_x, x550em_a
> >   ethtool/ixgbe: Reduce format string storage
> > 
> >  ixgbe.c | 173 +++---
> > --
> >  1 file changed, 95 insertions(+), 78 deletions(-)
> > 
> 
> Is Ben still maintaining ethtool?

I am - barely.

> I ask because I have this series which I
> sent out earlier this month, with no word and I know there is at least one
> other ethtool patch series that has had no response or committal from
> whoever is maintaining ethtool.

I'm going to do one more release and then look for a new maintainer.

Ben.

> I know we discussed last netconf that we should look at possibly a new tool
> to address the shortcomings of ethtool, but I was not aware we had
> abandoned maintaining the current ethtool already before any replacement
> tool has been developed.
-- 
Ben Hutchings
Time is nature's way of making sure that everything doesn't happen at
once.




Re: [PATCH net v2 3/3] Documentation: networking: dsa: Describe port_vlan_filtering

2016-05-25 Thread Vivien Didelot
Florian Fainelli  writes:

> Describe what the port_vlan_filtering function is supposed to
> accomplish.
>
> Fixes: fb2dabad69f0 ("net: dsa: support VLAN filtering switchdev attr")
> Signed-off-by: Florian Fainelli 

Reviewed-by: Vivien Didelot 


Re: Davicom DM9162 PHY supported in the kernel?

2016-05-25 Thread Amr Bekhit
Hi Andrew,

I've had another play around with the DTS and appear to have solved
the problem. I've changed the ethernet definitions from:

compatible = "cdns,at32ap7000-macb", "cdns,macb";

to

compatible = "cdns,at91sam9260-macb";

and everything seems to work fine now - I can finally send pings.

I never suspected the dts because the file comes from the
manufacturer-provided software package, and in their package the
ethernet works fine.

Anyway, thanks for all your help, I appreciate it.

On 25 May 2016 at 14:20, Amr Bekhit  wrote:
> Hi Andrew,
>
> I've uploaded the device tree to http://pastebin.com/tNp2PnW4.
>
> On 25 May 2016 at 13:39, Andrew Lunn  wrote:
>> On Wed, May 25, 2016 at 11:28:37AM +0100, Amr Bekhit wrote:
>>> Hi Andrew,
>>>
>>> I added the following line to genphy_read_status to print out the value of 
>>> BMCR:
>>>
>>>  phydev->lp_advertising = 0;
>>>
>>> +printk(KERN_DEBUG "MII_BMCR: 0x%04X\n", phy_read(phydev, MII_BMCR));
>>>
>>>  if (AUTONEG_ENABLE == phydev->autoneg) {
>>>
>>> After booting up the kernel and running ifconfig eth0 up, I get the
>>> following in kernel log:
>>>
>>> [   83.890625] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
>>> [   85.328125] MII_BMCR: 0x1000
>>> [   86.328125] MII_BMCR: 0x1000
>>> [   87.328125] MII_BMCR: 0x3100
>>> [   87.328125] macb f802c000.ethernet eth0: link up (100/Full)
>>> [   87.328125] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>> [   88.328125] MII_BMCR: 0x3100
>>> [   89.328125] MII_BMCR: 0x3100
>>> [   90.328125] MII_BMCR: 0x3100
>>>
>>> So it appears that after bringing the interface up, the PHY is
>>> configured for 100Mbps, autoneg enabled and full-duplex mode. The PHY is
>>> not isolated or powered down.
>>
>> So this all looks good. So it suggests your problem is between the MAC
>> and the PHY, since the PHY is able to talk to the PHY on the other end
>> of the cable.
>>
>> What do you have in device tree?
>>
>>  Andrew


Re: Davicom DM9162 PHY supported in the kernel?

2016-05-25 Thread Andrew Lunn
On Wed, May 25, 2016 at 02:20:58PM +0100, Amr Bekhit wrote:
> Hi Andrew,
> 
> I've uploaded the device tree to http://pastebin.com/tNp2PnW4.

This is the decompiled blob, making it hard to read.

I do however notice that you have two clocks, "hclk", "pclk". The
binding talks of a third optional clock. Is it required for your
hardware?

Also, have you seen:

https://lkml.org/lkml/2015/3/5/499

Are you sure you have the correct compatible string?

Andrew


Re: [PATCH RESEND 7/8] pipe: account to kmemcg

2016-05-25 Thread Vladimir Davydov
On Tue, May 24, 2016 at 01:04:33PM -0700, Eric Dumazet wrote:
> On Tue, 2016-05-24 at 19:13 +0300, Vladimir Davydov wrote:
> > On Tue, May 24, 2016 at 05:59:02AM -0700, Eric Dumazet wrote:
> > ...
> > > > +static int anon_pipe_buf_steal(struct pipe_inode_info *pipe,
> > > > +  struct pipe_buffer *buf)
> > > > +{
> > > > +   struct page *page = buf->page;
> > > > +
> > > > +   if (page_count(page) == 1) {
> > > 
> > > This looks racy : some cpu could have temporarily elevated page count.
> > 
> > All pipe operations (pipe_buf_operations->get, ->release, ->steal) are
> > supposed to be called under pipe_lock. So, if we see a pipe_buffer->page
> > with refcount of 1 in ->steal, that means that we are the only its user
> > and it can't be spliced to another pipe.
> > 
> > In fact, I just copied the code from generic_pipe_buf_steal, adding
> > kmemcg related checks along the way, so it should be fine.
> 
> So you guarantee that no other cpu might have done
> get_page_unless_zero() right before this test ?

Each pipe_buffer holds a reference to its page. If we find page's
refcount to be 1 here, then it can be referenced only by our
pipe_buffer. And the refcount cannot be increased by a parallel thread,
because we hold pipe_lock, which rules out splice, and otherwise it's
impossible to reach the page as it is not on the LRU. So yes, I think
this is guaranteed to be safe.

Thanks,
Vladimir
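
The argument above can be illustrated with a toy userspace model. It is
only a sketch: toy_page, toy_pipe and toy_steal() are invented names, the
refcount is a plain int rather than the kernel's atomic page count, and
the return convention (0 on success, 1 on failure) is an assumption. What
it shows is that a steal is only attempted while the pipe lock is held,
and only succeeds when the buffer's own reference is the last one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; these are NOT the kernel's struct page or pipe types. */
struct toy_page {
	int refcount;		/* number of holders of this page */
};

struct toy_pipe {
	bool locked;		/* models "pipe_lock is held" */
	struct toy_page *buf_page;	/* page referenced by the pipe buffer */
};

/* Returns 0 if the page can be stolen, 1 otherwise (assumed convention). */
static int toy_steal(struct toy_pipe *pipe)
{
	/* The reasoning only holds if the caller holds the pipe lock. */
	assert(pipe->locked);

	/*
	 * The pipe buffer itself accounts for one reference. A count of
	 * exactly 1 therefore means nobody else holds the page.
	 */
	if (pipe->buf_page->refcount == 1)
		return 0;

	return 1;
}
```

In this model a second holder (refcount 2) makes the steal fail, which is
the case a concurrent splice would create if it were not excluded by the
lock.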


Re: Davicom DM9162 PHY supported in the kernel?

2016-05-25 Thread Amr Bekhit
Hi Andrew,

I've uploaded the device tree to http://pastebin.com/tNp2PnW4.

On 25 May 2016 at 13:39, Andrew Lunn  wrote:
> On Wed, May 25, 2016 at 11:28:37AM +0100, Amr Bekhit wrote:
>> Hi Andrew,
>>
>> I added the following line to genphy_read_status to print out the value of 
>> BMCR:
>>
>>  phydev->lp_advertising = 0;
>>
>> +printk(KERN_DEBUG "MII_BMCR: 0x%04X\n", phy_read(phydev, MII_BMCR));
>>
>>  if (AUTONEG_ENABLE == phydev->autoneg) {
>>
>> After booting up the kernel and running ifconfig eth0 up, I get the
>> following in kernel log:
>>
>> [   83.890625] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
>> [   85.328125] MII_BMCR: 0x1000
>> [   86.328125] MII_BMCR: 0x1000
>> [   87.328125] MII_BMCR: 0x3100
>> [   87.328125] macb f802c000.ethernet eth0: link up (100/Full)
>> [   87.328125] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>> [   88.328125] MII_BMCR: 0x3100
>> [   89.328125] MII_BMCR: 0x3100
>> [   90.328125] MII_BMCR: 0x3100
>>
>> So it appears that after bringing the interface up, the PHY is
>> configured for 100Mbps, autoneg enabled and duplex mode. The PHY is
>> not isolated or powered down.
>
> So this all looks good. So it suggests your problem is between the MAC
> and the PHY, since the PHY is able to talk to the PHY on the other end
> of the cable.
>
> What do you have in device tree?
>
>  Andrew


Re: Davicom DM9162 PHY supported in the kernel?

2016-05-25 Thread Andrew Lunn
On Wed, May 25, 2016 at 11:28:37AM +0100, Amr Bekhit wrote:
> Hi Andrew,
> 
> I added the following line to genphy_read_status to print out the value of 
> BMCR:
> 
>  phydev->lp_advertising = 0;
> 
> +printk(KERN_DEBUG "MII_BMCR: 0x%04X\n", phy_read(phydev, MII_BMCR));
> 
>  if (AUTONEG_ENABLE == phydev->autoneg) {
> 
> After booting up the kernel and running ifconfig eth0 up, I get the
> following in kernel log:
> 
> [   83.890625] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
> [   85.328125] MII_BMCR: 0x1000
> [   86.328125] MII_BMCR: 0x1000
> [   87.328125] MII_BMCR: 0x3100
> [   87.328125] macb f802c000.ethernet eth0: link up (100/Full)
> [   87.328125] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [   88.328125] MII_BMCR: 0x3100
> [   89.328125] MII_BMCR: 0x3100
> [   90.328125] MII_BMCR: 0x3100
> 
> So it appears that after bringing the interface up, the PHY is
> configured for 100Mbps, autoneg enabled and duplex mode. The PHY is
> not isolated or powered down.

So this all looks good. So it suggests your problem is between the MAC
and the PHY, since the PHY is able to talk to the PHY on the other end
of the cable.

What do you have in device tree?

 Andrew


[PATCH] virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv

2016-05-25 Thread wangyunjian
virtnet_open() and virtnet_probe() can both execute try_fill_recv() at
the same time. The VQ in virtqueue_add() is not well protected, and a
BUG_ON is triggered when virtio_net.ko is being removed.

Test Script:
for (( i=0; i<500; i=i+1 )); do
rmmod virtio_net
modprobe virtio_net
ifconfig eth0 up
done

[  302.336996] [ cut here ]
[  302.338794] kernel BUG at virtio_ring.c:750!
[  302.340013] invalid opcode:  [#1] SMP
[  302.340013] last sysfs file: 
/sys/devices/pci:00/:00:03.0/virtio0/device
[  302.340013] CPU 0
[  302.340013] Modules linked in: virtio_balloon virtio_net(-) virtio_pci 
virtio_ring virtio ipv6 af_packet microcode acpiphp pci_hotplug fuse loop 
dm_mod rtc_cmos tpm_tis rtc_core tpm i2c_piix4 rtc_lib container button floppy 
pcspkr tpm_bios i2c_core joydev sg usbhid hid uhci_hcd ehci_hcd usbcore edd 
ext3 mbcache jbd fan processor ide_pci_generic piix ide_core ata_generic at 
a_piix libata thermal thermal_sys hwmon sd_mod crc_t10dif kvm_ivshmem(N) 
scsi_mod pv_channel(N) [last unloaded: virtio]
[  302.340013] Supported: Yes
[  302.340013] Pid: 8410, comm: rmmod Tainted: GN  2.6.32.12-0.7-default #1 
Standard PC (i440FX + PIIX, 1996)
[  302.340013] RIP: 0010:[]  [] 
virtqueue_detach_unused_buf+0xb9/0xc0 [virtio_ring]
[  302.340013] RSP: 0018:88000c7a9e08  EFLAGS: 00010283
[  302.340013] RAX: 00f4 RBX: 0100 RCX: 4d8e
[  302.340013] RDX: 880001e0 RSI: 0046 RDI: 81a71570
[  302.340013] RBP: 88000c987000 R08:  R09: 000a
[  302.340013] R10:  R11:  R12: 0400
[  302.340013] R13:  R14: 7fff92bc1758 R15: 0001
[  302.340013] FS:  7f8b3995d700() GS:880001e0() 
knlGS:
[  302.340013] CS:  0010 DS:  ES:  CR0: 8005003b
[  302.340013] CR2: 7fff196433e0 CR3: 0d07e000 CR4: 06f0
[  302.340013] DR0:  DR1:  DR2: 
[  302.340013] DR3:  DR6: 0ff0 DR7: 0400
[  302.340013] Process rmmod (pid: 8410, threadinfo 88000c7a8000, task 
88000c7aa200)
[  302.340013] Stack:
[  302.340013]  88000fbb3780  88000c987000 
a034b178
[  302.340013] <0> 88000fbb3850 88000fbb3780 a034efc0 
88000fbb3850
[  302.340013] <0>  a034b299 88000fbb3780 
a034b406
[  302.340013] Call Trace:
[  302.340013]  [] free_unused_bufs+0x88/0x120 [virtio_net]
[  302.340013]  [] remove_vq_common+0x19/0x30 [virtio_net]
[  302.340013]  [] virtnet_remove+0x46/0x80 [virtio_net]
[  302.340013]  [] virtio_dev_remove+0x15/0x60 [virtio]
[  302.340013]  [] __device_release_driver+0x91/0x110
[  302.340013]  [] driver_detach+0xa8/0xb0
[  302.340013]  [] bus_remove_driver+0x82/0x110
[  302.340013]  [] sys_delete_module+0x1c4/0x290
[  302.340013]  [] system_call_fastpath+0x16/0x1b
[  302.340013]  [<7f8b394c7de7>] 0x7f8b394c7de7
[  302.340013] Code: c3 01 49 83 c4 04 e8 30 10 07 e1 8b 55 38 39 da 77 d0 8b 
75 2c 31 c0 48 c7 c7 ba 4b 32 a0 e8 18 10 07 e1 8b 45 2c 3b 45 38 74 82 <0f> 0b 
eb fe 0f 1f 00 48 83 ec 28 48 89 6c 24 08 48 89 1c 24 48
[  302.340013] RIP  [] virtqueue_detach_unused_buf+0xb9/0xc0 
[virtio_ring]
[  302.340013]  RSP 
[  302.438579] ---[ end trace 1e583bdb5b2644f1 ]---


Signed-off-by: Yunjian Wang 
---
 drivers/net/virtio_net.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 49d84e5..4528ef8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -818,10 +818,6 @@ static int virtnet_open(struct net_device *dev)
int i;

for (i = 0; i < vi->max_queue_pairs; i++) {
-   if (i < vi->curr_queue_pairs)
-   /* Make sure we have some buffers: if oom use wq. */
-   if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
-   schedule_delayed_work(&vi->refill, 0);
virtnet_napi_enable(&vi->rq[i]);
}

--
1.7.12.4



Re: Davicom DM9162 PHY supported in the kernel?

2016-05-25 Thread Amr Bekhit
Hi Andrew,

I added the following line to genphy_read_status to print out the value of BMCR:

 phydev->lp_advertising = 0;

+printk(KERN_DEBUG "MII_BMCR: 0x%04X\n", phy_read(phydev, MII_BMCR));

 if (AUTONEG_ENABLE == phydev->autoneg) {

After booting up the kernel and running ifconfig eth0 up, I get the
following in kernel log:

[   83.890625] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   85.328125] MII_BMCR: 0x1000
[   86.328125] MII_BMCR: 0x1000
[   87.328125] MII_BMCR: 0x3100
[   87.328125] macb f802c000.ethernet eth0: link up (100/Full)
[   87.328125] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   88.328125] MII_BMCR: 0x3100
[   89.328125] MII_BMCR: 0x3100
[   90.328125] MII_BMCR: 0x3100

So it appears that after bringing the interface up, the PHY is
configured for 100Mbps, autoneg enabled and duplex mode. The PHY is
not isolated or powered down.
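
As a cross-check of the reading above, here is a minimal userspace sketch
that decodes the two BMCR values seen in the log. The bit masks follow the
standard MII basic mode control register layout (the same values the
kernel defines in linux/mii.h); the SKETCH_ prefixed names, the
bmcr_fields struct and the bmcr_decode()/bmcr_print() helpers are made up
for this illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Standard MII BMCR bit masks (same values as in linux/mii.h). */
#define SKETCH_BMCR_SPEED100	0x2000u	/* 100Mbps selected */
#define SKETCH_BMCR_ANENABLE	0x1000u	/* autonegotiation enabled */
#define SKETCH_BMCR_PDOWN	0x0800u	/* PHY powered down */
#define SKETCH_BMCR_ISOLATE	0x0400u	/* PHY isolated from the MII */
#define SKETCH_BMCR_FULLDPLX	0x0100u	/* full duplex */

struct bmcr_fields {
	bool speed100;
	bool autoneg;
	bool power_down;
	bool isolate;
	bool full_duplex;
};

/* Split a raw BMCR value into the fields discussed in this thread. */
static struct bmcr_fields bmcr_decode(uint16_t bmcr)
{
	struct bmcr_fields f;

	f.speed100 = (bmcr & SKETCH_BMCR_SPEED100) != 0;
	f.autoneg = (bmcr & SKETCH_BMCR_ANENABLE) != 0;
	f.power_down = (bmcr & SKETCH_BMCR_PDOWN) != 0;
	f.isolate = (bmcr & SKETCH_BMCR_ISOLATE) != 0;
	f.full_duplex = (bmcr & SKETCH_BMCR_FULLDPLX) != 0;
	return f;
}

/* Print one line per value, in the spirit of the debug printk. */
static void bmcr_print(uint16_t bmcr)
{
	struct bmcr_fields f = bmcr_decode(bmcr);

	printf("MII_BMCR 0x%04X: speed100=%d autoneg=%d pdown=%d isolate=%d fulldplx=%d\n",
	       (unsigned int)bmcr, f.speed100, f.autoneg, f.power_down,
	       f.isolate, f.full_duplex);
}
```

With these masks, 0x1000 has only the autoneg bit set, while 0x3100 has
the 100Mbps, autoneg and full-duplex bits set and neither the isolate nor
the power-down bit.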

On 24 May 2016 at 17:30, Andrew Lunn  wrote:
> On Tue, May 24, 2016 at 05:09:35PM +0100, Amr Bekhit wrote:
>> Hi Andrew,
>>
>> > How about adding a printk() in genphy_read_status().
>>
>> Would you be able to point me to some more information about these
>> status bits, please?
>
> You can get the datasheet from here:
>
> http://www.davicom.com.tw/userfile/24247/DM9162_DM9162I-12-MCO-DS-F01_08062014.pdf
>
> I would start by looking at the BMCR register. The
> genphy_read_status() function should get called about once per second,
> so that is an O.K. place to add debug prints, and there is example
> code for reading MII_BMCR.
>
>  Andrew


[iproute2 PATCH 1/1] tc filter u32: Coding style fixes

2016-05-25 Thread Jamal Hadi Salim
From: Jamal Hadi Salim 

"handle" was being used several times for different things.
Fix the 80 character limit abuse and other little issues while at it.

Signed-off-by: Jamal Hadi Salim 
---
 tc/f_u32.c | 66 +-
 1 file changed, 40 insertions(+), 26 deletions(-)

diff --git a/tc/f_u32.c b/tc/f_u32.c
index 6299515..1962dfe 100644
--- a/tc/f_u32.c
+++ b/tc/f_u32.c
@@ -171,7 +171,8 @@ static int pack_key16(struct tc_u32_sel *sel, __u32 key, 
__u32 mask,
return pack_key(sel, key, mask, off, offmask);
 }
 
-static int pack_key8(struct tc_u32_sel *sel, __u32 key, __u32 mask, int off, int offmask)
+static int pack_key8(struct tc_u32_sel *sel, __u32 key, __u32 mask, int off,
+int offmask)
 {
if (key > 0xFF || mask > 0xFF)
return -1;
@@ -835,16 +836,19 @@ static void print_ipv4(FILE *f, const struct tc_u32_key 
*key)
case 0:
switch (ntohl(key->mask)) {
case 0x0f00:
-   fprintf(f, "\n  match IP ihl %u", ntohl(key->val) >> 24);
+   fprintf(f, "\n  match IP ihl %u",
+   ntohl(key->val) >> 24);
return;
case 0x00ff:
-   fprintf(f, "\n  match IP dsfield %#x", ntohl(key->val) >> 16);
+   fprintf(f, "\n  match IP dsfield %#x",
+   ntohl(key->val) >> 16);
return;
}
break;
case 8:
if (ntohl(key->mask) == 0x00ff) {
-   fprintf(f, "\n  match IP protocol %d", ntohl(key->val) >> 16);
+   fprintf(f, "\n  match IP protocol %d",
+   ntohl(key->val) >> 16);
return;
}
break;
@@ -892,16 +896,19 @@ static void print_ipv6(FILE *f, const struct tc_u32_key 
*key)
case 0:
switch (ntohl(key->mask)) {
case 0x0f00:
-   fprintf(f, "\n  match IP ihl %u", ntohl(key->val) >> 24);
+   fprintf(f, "\n  match IP ihl %u",
+   ntohl(key->val) >> 24);
return;
case 0x00ff:
-   fprintf(f, "\n  match IP dsfield %#x", ntohl(key->val) >> 16);
+   fprintf(f, "\n  match IP dsfield %#x",
+   ntohl(key->val) >> 16);
return;
}
break;
case 8:
if (ntohl(key->mask) == 0x00ff) {
-   fprintf(f, "\n  match IP protocol %d", ntohl(key->val) >> 16);
+   fprintf(f, "\n  match IP protocol %d",
+   ntohl(key->val) >> 16);
return;
}
break;
@@ -1031,14 +1038,14 @@ static int u32_parse_opt(struct filter_util *qu, char 
*handle,
continue;
} else if (matches(*argv, "classid") == 0 ||
   strcmp(*argv, "flowid") == 0) {
-   unsigned int handle;
+   unsigned int flowid;
 
NEXT_ARG();
-   if (get_tc_classid(&handle, *argv)) {
+   if (get_tc_classid(&flowid, *argv)) {
fprintf(stderr, "Illegal \"classid\"\n");
return -1;
}
-   addattr_l(n, MAX_MSG, TCA_U32_CLASSID, &handle, 4);
+   addattr_l(n, MAX_MSG, TCA_U32_CLASSID, &flowid, 4);
sel.sel.flags |= TC_U32_TERMINAL;
} else if (matches(*argv, "divisor") == 0) {
unsigned int divisor;
@@ -1058,34 +1065,34 @@ static int u32_parse_opt(struct filter_util *qu, char 
*handle,
return -1;
}
} else if (strcmp(*argv, "link") == 0) {
-   unsigned int handle;
+   unsigned int linkid;
 
NEXT_ARG();
-   if (get_u32_handle(&handle, *argv)) {
+   if (get_u32_handle(&linkid, *argv)) {
fprintf(stderr, "Illegal \"link\"\n");
return -1;
}
-   if (handle && TC_U32_NODE(handle)) {
+   if (linkid && TC_U32_NODE(linkid)) {
fprintf(stderr, "\"link\" must be a hash table.\n");
return -1;
}
addattr_l(n, MAX_MSG, TCA_U32_LINK, , 4);
} else if (strcmp(*argv, "ht") == 0) {
-   unsigned int handle;
+  

Re: [iproute2 PATH 2/2] tc action policer: enable timestamp display

2016-05-25 Thread Jamal Hadi Salim

Stephen,
This requires you pull Dave's -net headers.

cheers,
jamal

On 16-05-25 06:05 AM, Jamal Hadi Salim wrote:

From: Jamal Hadi Salim 

Signed-off-by: Jamal Hadi Salim 
---
  tc/m_police.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tc/m_police.c b/tc/m_police.c
index 8752d4f..cb17c9e 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -379,7 +379,16 @@ int print_police(struct action_util *a, FILE *f, struct 
rtattr *arg)
linklayer = (p->rate.linklayer & TC_LINKLAYER_MASK);
if (linklayer > TC_LINKLAYER_ETHERNET || show_details)
fprintf(f, "linklayer %s ", sprint_linklayer(linklayer, b2));
-   fprintf(f, "\n\tref %d bind %d\n", p->refcnt, p->bindcnt);
+   fprintf(f, "\n\tref %d bind %d", p->refcnt, p->bindcnt);
+   if (show_stats) {
+   if (tb[TCA_POLICE_TM]) {
+   struct tcf_t *tm = RTA_DATA(tb[TCA_POLICE_TM]);
+
+   print_tm(f, tm);
+   }
+   }
+   fprintf(f, "\n");
+

return 0;
  }





[iproute2 PATH 1/2] tc action policer: Avoid nonsensical input

2016-05-25 Thread Jamal Hadi Salim
From: Jamal Hadi Salim 

The user must specify at least one of token bucket policing, ewma
policing, or a late binding index. TB policing requires at minimum
a rate and a burst.

In addition fix formatting issues (80 chars etc).
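
The input checks described above can be sketched as a standalone function.
This is only an illustration: police_check_args() and the police_check
enum are hypothetical names, and the four plain integer arguments stand in
for the parsed rate, avrate, index and burst buffer values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum police_check {
	POLICE_OK,
	POLICE_ERR_BOTH_RATES,	/* "rate" and "avrate" given together */
	POLICE_ERR_NOTHING,	/* no rate, no avrate, no index */
	POLICE_ERR_NO_BURST,	/* token bucket rate without a burst */
};

static enum police_check police_check_args(uint32_t rate, uint32_t avrate,
					    uint32_t index, uint32_t buffer)
{
	/* Token bucket and ewma policing are not accepted together. */
	if (rate && avrate)
		return POLICE_ERR_BOTH_RATES;

	/* Must at least do late binding, use TB or ewma policing. */
	if (!rate && !avrate && !index)
		return POLICE_ERR_NOTHING;

	/*
	 * When the TB policer is used, burst is required. avrate is
	 * already known to be zero here whenever rate is non-zero.
	 */
	if (rate && !buffer)
		return POLICE_ERR_NO_BURST;

	return POLICE_OK;
}
```

Under this sketch an index alone is accepted (late binding), an avrate
alone is accepted, and a rate is accepted only together with a burst.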

Signed-off-by: Jamal Hadi Salim 
---
 tc/m_police.c | 45 ++---
 1 file changed, 30 insertions(+), 15 deletions(-)

diff --git a/tc/m_police.c b/tc/m_police.c
index 97558bd..8752d4f 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -99,7 +99,6 @@ static int police_action_a2n(const char *arg, int *result)
return 0;
 }
 
-
 static int get_police_result(int *action, int *result, char *arg)
 {
char *p = strchr(arg, '/');
@@ -121,8 +120,8 @@ static int get_police_result(int *action, int *result, char 
*arg)
return 0;
 }
 
-
-int act_parse_police(struct action_util *a, int *argc_p, char ***argv_p, int tca_id, struct nlmsghdr *n)
+int act_parse_police(struct action_util *a, int *argc_p, char ***argv_p,
+int tca_id, struct nlmsghdr *n)
 {
int argc = *argc_p;
char **argv = *argv_p;
@@ -258,10 +257,21 @@ int act_parse_police(struct action_util *a, int *argc_p, 
char ***argv_p, int tca
if (!ok)
return -1;
 
-   if (p.rate.rate && !buffer) {
+   if (p.rate.rate && avrate)
+   return -1;
+
+   /* Must at least do late binding, use TB or ewma policing */
+   if (!p.rate.rate && !avrate && !p.index) {
+   fprintf(stderr, "\"rate\" or \"avrate\" MUST be specified.\n");
+   return -1;
+   }
+
+   /* When the TB policer is used, burst is required */
+   if (p.rate.rate && !buffer && !avrate) {
fprintf(stderr, "\"burst\" requires \"rate\".\n");
return -1;
}
+
if (p.peakrate.rate) {
if (!p.rate.rate) {
fprintf(stderr, "\"peakrate\" requires \"rate\".\n");
@@ -276,8 +286,9 @@ int act_parse_police(struct action_util *a, int *argc_p, 
char ***argv_p, int tca
if (p.rate.rate) {
p.rate.mpu = mpu;
p.rate.overhead = overhead;
-   if (tc_calc_rtable(&p.rate, rtab, Rcell_log, mtu, linklayer) < 0) {
-   fprintf(stderr, "TBF: failed to calculate rate table.\n");
+   if (tc_calc_rtable(&p.rate, rtab, Rcell_log, mtu,
+  linklayer) < 0) {
+   fprintf(stderr, "POLICE: failed to calculate rate table.\n");
return -1;
}
p.burst = tc_calc_xmittime(p.rate.rate, buffer);
@@ -286,7 +297,8 @@ int act_parse_police(struct action_util *a, int *argc_p, 
char ***argv_p, int tca
if (p.peakrate.rate) {
p.peakrate.mpu = mpu;
p.peakrate.overhead = overhead;
-   if (tc_calc_rtable(&p.peakrate, ptab, Pcell_log, mtu, linklayer) < 0) {
+   if (tc_calc_rtable(&p.peakrate, ptab, Pcell_log, mtu,
+  linklayer) < 0) {
fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
return -1;
}
@@ -317,8 +329,7 @@ int parse_police(int *argc_p, char ***argv_p, int tca_id, 
struct nlmsghdr *n)
return act_parse_police(NULL, argc_p, argv_p, tca_id, n);
 }
 
-int
-print_police(struct action_util *a, FILE *f, struct rtattr *arg)
+int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
 {
SPRINT_BUF(b1);
SPRINT_BUF(b2);
@@ -354,22 +365,26 @@ print_police(struct action_util *a, FILE *f, struct 
rtattr *arg)
if (p->peakrate.rate)
fprintf(f, "peakrate %s ", sprint_rate(p->peakrate.rate, b1));
if (tb[TCA_POLICE_AVRATE])
-   fprintf(f, "avrate %s ", sprint_rate(rta_getattr_u32(tb[TCA_POLICE_AVRATE]), b1));
-   fprintf(f, "action %s", police_action_n2a(p->action, b1, sizeof(b1)));
+   fprintf(f, "avrate %s ",
+   sprint_rate(rta_getattr_u32(tb[TCA_POLICE_AVRATE]),
+   b1));
+   fprintf(f, "action %s",
+   police_action_n2a(p->action, b1, sizeof(b1)));
if (tb[TCA_POLICE_RESULT]) {
-   fprintf(f, "/%s ", police_action_n2a(*(int *)RTA_DATA(tb[TCA_POLICE_RESULT]), b1, sizeof(b1)));
+   fprintf(f, "/%s",
+   police_action_n2a(*(int *)RTA_DATA(tb[TCA_POLICE_RESULT]), b1, sizeof(b1)));
} else
fprintf(f, " ");
fprintf(f, "overhead %ub ", p->rate.overhead);
linklayer = (p->rate.linklayer & TC_LINKLAYER_MASK);
if (linklayer > TC_LINKLAYER_ETHERNET || show_details)
fprintf(f, "linklayer %s ", sprint_linklayer(linklayer, b2));
-   fprintf(f, "\nref %d bind %d\n", p->refcnt, p->bindcnt);
+   fprintf(f, "\n\tref %d bind %d\n", p->refcnt, p->bindcnt);
 
  

[iproute2 PATH 2/2] tc action policer: enable timestamp display

2016-05-25 Thread Jamal Hadi Salim
From: Jamal Hadi Salim 

Signed-off-by: Jamal Hadi Salim 
---
 tc/m_police.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tc/m_police.c b/tc/m_police.c
index 8752d4f..cb17c9e 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -379,7 +379,16 @@ int print_police(struct action_util *a, FILE *f, struct 
rtattr *arg)
linklayer = (p->rate.linklayer & TC_LINKLAYER_MASK);
if (linklayer > TC_LINKLAYER_ETHERNET || show_details)
fprintf(f, "linklayer %s ", sprint_linklayer(linklayer, b2));
-   fprintf(f, "\n\tref %d bind %d\n", p->refcnt, p->bindcnt);
+   fprintf(f, "\n\tref %d bind %d", p->refcnt, p->bindcnt);
+   if (show_stats) {
+   if (tb[TCA_POLICE_TM]) {
+   struct tcf_t *tm = RTA_DATA(tb[TCA_POLICE_TM]);
+
+   print_tm(f, tm);
+   }
+   }
+   fprintf(f, "\n");
+
 
return 0;
 }
-- 
1.9.1



[PATCH v2] net: alx: use custom skb allocator

2016-05-25 Thread Feng Tang
This patch follows Eric Dumazet's commit 7b70176421 for the Atheros
atl1c driver to fix exactly the same bug in the alx driver: the
network link is lost 1-5 minutes after the device comes up.

My Lenovo Y580 laptop with an Atheros AR8161 ethernet device hit the
same problem with kernel 4.4, and it is cured by Jarod Wilson's
commit c406700c for the alx driver, which was merged in 4.5. But there
are still some alx devices that do not function well even with Jarod's
patch, while this patch makes them work fine. More details at
https://bugzilla.kernel.org/show_bug.cgi?id=70761

Debugging shows the issue is very likely related to the RX
DMA address, specifically 0x...f80: if an RX buffer gets 0x...f80 several
times, there will be an RX overflow error and the device will stop working.

For kernel 4.5.0 with Jarod's patch, which works fine with my
AR8161/Lenovo Y580, if I change
__netdev_alloc_skb
--> __alloc_page_frag()
so that the allocated buffer can get an address ending in 0x...f80,
then the same error happens. If I make it 0x...f40 or 0x...fc0,
everything is still fine. So I tend to believe that the
0x...f80 address causes the silicon to behave abnormally.
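
To make the allocation scheme easier to follow, here is a small userspace
model of how a per-device page can be carved into fixed-size RX fragments,
which determines where in the page each buffer starts. It is a sketch
under assumptions: rx_carver and carve_next() are illustrative names, the
page size is fixed at 4096, and the fragment size is assumed to divide the
page size evenly (how the driver picks the real fragment size is not
modeled here).

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

struct rx_carver {
	int have_page;		/* models "rx_page is not NULL" */
	unsigned int offset;	/* models rx_page_offset */
	unsigned int frag_size;	/* models rx_frag_size */
	unsigned int pages_used;	/* how many pages were started */
};

/* Return the offset within the current page of the next RX fragment. */
static unsigned int carve_next(struct rx_carver *c)
{
	unsigned int start;

	/* Model assumption: fragments tile the page exactly. */
	assert(c->frag_size > 0 && MODEL_PAGE_SIZE % c->frag_size == 0);

	/* Start a fresh page when none is currently being carved. */
	if (!c->have_page) {
		c->have_page = 1;
		c->offset = 0;
		c->pages_used++;
	}

	start = c->offset;
	c->offset += c->frag_size;

	/* Page fully consumed: the next call starts a new one. */
	if (c->offset >= MODEL_PAGE_SIZE)
		c->have_page = 0;

	return start;
}
```

With a 2048-byte fragment, for example, the model hands out offsets 0 and
2048 from one page and then moves on to a new page.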

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70761
Cc: Eric Dumazet 
Cc: Johannes Berg 
Cc: Jarod Wilson 
Signed-off-by: Feng Tang 
Tested-by: Ole Lukoie 
---
 drivers/net/ethernet/atheros/alx/alx.h  |  4 +++
 drivers/net/ethernet/atheros/alx/main.c | 48 -
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/atheros/alx/alx.h 
b/drivers/net/ethernet/atheros/alx/alx.h
index 8fc93c5..d02c424 100644
--- a/drivers/net/ethernet/atheros/alx/alx.h
+++ b/drivers/net/ethernet/atheros/alx/alx.h
@@ -96,6 +96,10 @@ struct alx_priv {
unsigned int rx_ringsz;
unsigned int rxbuf_size;
 
+   struct page  *rx_page;
+   unsigned int rx_page_offset;
+   unsigned int rx_frag_size;
+
struct napi_struct napi;
struct alx_tx_queue txq;
struct alx_rx_queue rxq;
diff --git a/drivers/net/ethernet/atheros/alx/main.c 
b/drivers/net/ethernet/atheros/alx/main.c
index 9fe8b5e..c98acdc 100644
--- a/drivers/net/ethernet/atheros/alx/main.c
+++ b/drivers/net/ethernet/atheros/alx/main.c
@@ -70,6 +70,35 @@ static void alx_free_txbuf(struct alx_priv *alx, int entry)
}
 }
 
+static struct sk_buff *alx_alloc_skb(struct alx_priv *alx, gfp_t gfp)
+{
+   struct sk_buff *skb;
+   struct page *page;
+
+   if (alx->rx_frag_size > PAGE_SIZE)
+   return __netdev_alloc_skb(alx->dev, alx->rxbuf_size, gfp);
+
+   page = alx->rx_page;
+   if (!page) {
+   alx->rx_page = page = alloc_page(gfp);
+   if (unlikely(!page))
+   return NULL;
+   alx->rx_page_offset = 0;
+   }
+
+   skb = build_skb(page_address(page) + alx->rx_page_offset,
+   alx->rx_frag_size);
+   if (likely(skb)) {
+   alx->rx_page_offset += alx->rx_frag_size;
+   if (alx->rx_page_offset >= PAGE_SIZE)
+   alx->rx_page = NULL;
+   else
+   get_page(page);
+   }
+   return skb;
+}
+
+
 static int alx_refill_rx_ring(struct alx_priv *alx, gfp_t gfp)
 {
struct alx_rx_queue *rxq = &alx->rxq;
@@ -86,7 +115,7 @@ static int alx_refill_rx_ring(struct alx_priv *alx, gfp_t 
gfp)
while (!cur_buf->skb && next != rxq->read_idx) {
struct alx_rfd *rfd = &rxq->rfd[cur];
 
-   skb = __netdev_alloc_skb(alx->dev, alx->rxbuf_size, gfp);
+   skb = alx_alloc_skb(alx, gfp);
if (!skb)
break;
dma = dma_map_single(&alx->hw.pdev->dev,
@@ -124,6 +153,7 @@ static int alx_refill_rx_ring(struct alx_priv *alx, gfp_t 
gfp)
alx_write_mem16(&alx->hw, ALX_RFD_PIDX, cur);
}
 
+
return count;
 }
 
@@ -592,6 +622,11 @@ static void alx_free_rings(struct alx_priv *alx)
kfree(alx->txq.bufs);
kfree(alx->rxq.bufs);
 
+   if (alx->rx_page) {
+   put_page(alx->rx_page);
+   alx->rx_page = NULL;
+   }
+
dma_free_coherent(&alx->hw.pdev->dev,
  alx->descmem.size,
  alx->descmem.virt,
@@ -646,6 +681,7 @@ static int alx_request_irq(struct alx_priv *alx)
  alx->dev->name, alx);
if (!err)
goto out;
+
/* fall back to legacy interrupt */
pci_disable_msi(alx->hw.pdev);
}
@@ -689,6 +725,7 @@ static int alx_init_sw(struct alx_priv *alx)
struct pci_dev *pdev = alx->hw.pdev;
struct alx_hw *hw = &alx->hw;
int err;
+   unsigned int head_size;
 
err =