Re: [PATCH 0/7] net: stmmac: Fixes and Tegra186 support

2017-02-26 Thread Thierry Reding
On Thu, Feb 23, 2017 at 12:57:05PM -0500, David Miller wrote:
> 
> The net-next tree is closed, therefore it is not appropriate to submit
> feature patches or cleanups at this time.
> 
> Please wait for the merge window to be finished and the net-next tree
> to open back up before resubmitting this patch series.

Okay, I'll resend this after the merge window. In the meantime, surely
it's okay for others to review patches?

Thierry




Re: [PATCH v5 5/6] 6lowpan: Use netdev addr_len to determine lladdr len

2017-02-26 Thread Alexander Aring
Hi,

On 02/24/2017 01:14 PM, Luiz Augusto von Dentz wrote:
> From: Luiz Augusto von Dentz 
> 
> This allows technologies such as Bluetooth to use their native lladdr, which
> is eui48 instead of the eui64 that was expected by functions like
> lowpan_header_decompress and lowpan_header_compress.
> 
> Signed-off-by: Luiz Augusto von Dentz 
> Reviewed-by: Stefan Schmidt 
> ---
>  include/net/6lowpan.h   | 19 +++
>  net/6lowpan/iphc.c      | 49 ++---
>  net/bluetooth/6lowpan.c | 42 ++
>  3 files changed, 63 insertions(+), 47 deletions(-)
> 
> diff --git a/include/net/6lowpan.h b/include/net/6lowpan.h
> index 5ab4c99..c5792cb 100644
> --- a/include/net/6lowpan.h
> +++ b/include/net/6lowpan.h
> @@ -198,6 +198,25 @@ static inline void lowpan_iphc_uncompress_eui64_lladdr(struct in6_addr *ipaddr,
>   ipaddr->s6_addr[8] ^= 0x02;
>  }
>  
> +static inline void lowpan_iphc_uncompress_eui48_lladdr(struct in6_addr *ipaddr,
> +                                                       const void *lladdr)
> +{
> + /* fe:80:::XXff:feXX:
> +  *\_/
> +  *  hwaddr
> +  */
> + ipaddr->s6_addr[0] = 0xFE;
> + ipaddr->s6_addr[1] = 0x80;
> + memcpy(&ipaddr->s6_addr[8], lladdr, 3);
> + ipaddr->s6_addr[11] = 0xFF;
> + ipaddr->s6_addr[12] = 0xFE;
> + memcpy(&ipaddr->s6_addr[13], lladdr + 3, 3);
> + /* second bit-flip (Universal/Local)
> +  * is done according to RFC 2464
> +  */
> + ipaddr->s6_addr[8] ^= 0x02;
> +}
> +

Same thing here. I think you don't need the u/l bitflip here; you already
argued in another patch that the IID is without it, or?

btw: if you make this a static inline function, then remove the link-local
setting here first. Then we can use this function for IPv6 stateful/stateless
IPHC compression/decompression to generate the IID.
And better with a function before it that evaluates lltype (or dev->addr_len,
see below why not).

another thing is:
__ipv6_addr_set_half(_addr32[0], htonl(0xFE80), 0);

should be used here, but this needs to be cleaned up everywhere in the
6lowpan code. :-)

---

What I mean is that such a function, placed in the 6lowpan header, should
only set the IID of an ipaddr.
Such a function can then also be used in IPv6 IID generation.

And DON'T make such handling depend on the address size; this is in my
opinion wrong, because the link-layer 6lowpan adaptation RFC describes how
to generate the IID, and that does not depend on an address size.
This means another link layer may e.g. have eui48 but still set a u/l
bitflip here.
You should use the lltype of the 6lowpan netdev private area for that.

This also means the name "eui48" in the function is semantically wrong,
from my point of view.
(Okay, I don't care about function names right now).

Anyway, I agree that it doesn't matter currently because we have only two
adaptations right now.
... and yes, I have known (for more than ~one year already) that BTLE
6LoWPAN is broken (races/rfc stuff) and I am happy that somebody is fixing
that now.
So I would also ack patches which make it depend on dev->addr_len.
Otherwise broken things will never be fixed...

- Alex
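
The split suggested above can be sketched in userspace C. This is only an illustration under stated assumptions: the enum ll_type, the helper names and the per-type flip policy are hypothetical and are not taken from the 6lowpan code. The helper fills only the IID from a 48-bit link-layer address and decides the Universal/Local flip from the link-layer type, not from the address length; the link-local prefix is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical link-layer types; real code would use the lltype stored
 * in the 6lowpan netdev private area.
 */
enum ll_type {
	LL_TYPE_FLIP_UL,	/* adaptation layer that inverts the U/L bit */
	LL_TYPE_NO_FLIP,	/* adaptation layer that uses the address as-is */
};

/* The policy lives in one place and depends on the type, not on addr_len. */
static bool ll_type_flips_ul(enum ll_type type)
{
	return type == LL_TYPE_FLIP_UL;
}

/* Fill only the interface identifier (bytes 8..15 of an IPv6 address)
 * from a 48-bit link-layer address: XX:XX:XX:ff:fe:XX:XX:XX.
 * The prefix (e.g. fe80::/64) is set by the caller.
 */
static void iid_from_eui48(uint8_t iid[8], const uint8_t lladdr[6],
			   enum ll_type type)
{
	assert(iid && lladdr);

	memcpy(&iid[0], &lladdr[0], 3);
	iid[3] = 0xFF;
	iid[4] = 0xFE;
	memcpy(&iid[5], &lladdr[3], 3);

	if (ll_type_flips_ul(type))
		iid[0] ^= 0x02;
}
```

With this shape, a second link layer that also carries 48-bit addresses only needs its own ll_type value; the byte layout code stays shared.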



Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE

2017-02-26 Thread Xin Long
On Sat, Feb 25, 2017 at 4:41 PM, Xin Long  wrote:
> On Fri, Feb 24, 2017 at 6:14 PM, David Laight  wrote:
>>
>> From: Xin Long
>> > Sent: 24 February 2017 06:44
>> ...
>> > > IIRC sctp_packet_can_append_data() is called for the first queued
>> > > data chunk in order to decide whether to generate a message that
>> > > consists only of data chunks.
>> > > If it returns SCTP_XMIT_OK then a message is built collecting the
>> > > rest of the queued data chunks (until the window fills).
>> > >
>> > > So if I send a message with MSG_MORE set (on an idle connection)
>> > > SCTP_XMIT_DELAY is returned and a message isn't sent.
>> > >
>> > > I now send a second small message, this time with MSG_MORE clear.
>> > > The message is queued, then the code looks to see if it can send anything.
>> > >
>> > > sctp_packet_can_append_data() is called for the first queued chunk.
>> > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no
>> > > message is built.
>> > > The second message isn't even looked at.
>> > You're right. I can see the problem now.
>> >
>> > What I expected is it should work like:
>> >
>> > 1. send 3 small chunks with MSG_MORE set, the queue is:
>> >   chk3 [set] -> chk2 [set] -> chk1 [set]
>>
>> Strange way to write a queue! chk1 points to chk2 :-)
> haha, just  a model.
>
>>
>> > 2. send 1 more chunk with MSG_MORE clear, the queue is:
>> >   chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear]
>>
>> I don't think processing the entire queue is a good idea.
>> Both from execution time and the effects on the data cache.
>> The SCTP code is horrid enough as it is.
> If you check the code in my last email, it's not processing the entire queue.
>
> 1). It only runs when the queue has a delayed chunk inside (checked via
> queue->has_delay) and the current chunk has the msg_more flag.
>
> 2). It will break on the first chunk in the queue with the flag clear.
>
> But yes, in 2), more work has to be done than before, though not much.
>
>>
>> > 3. then if user send more small chunks with MSG_MORE set,
>> > the queue is like:
>> >   chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear]
>> > so that the new small chunks' flag will not affect the other chunks' bundling.
>>
>> That isn't really necessary.
>> The user can't expect to have absolute control over which chunks get bundled
>> together.
>> If the above chunks still aren't big enough to fill a frame the code might
>> as well wait for the next chunk instead of building a packet that contains
>> chk1 through to chkB.
>>
>> Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent.
>> As soon as data can be sent, if the first chunk has MSG_MORE clear all of the
>> queued chunks will be sent.
>>
>> So immediately after your (3) the application is expected to send a chunk
>> with MSG_MORE clear - at that point all the queued chunks can be sent in
>> a single packet.
> understand this.
>
> what I'm worried about is if the msg_more is saved in assoc:
>  chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear]
> then when you send a small chkA with MSG_MORE,
> the queue will be like:
>  chkA [set] -> chk4[set] -> chk3 [set] -> chk2 [set] -> chk1 [set]
> because msg_more is saved in assoc, every chunk can look at it.
> chk1 - chk4 are big enough to be packed into a packet, they were
> not sent last time because a lot of chunks are in the retransmit
> queue.
>
> But now even if retransmit queue is null, chk1-chk4 are still blocked.
>
> can you accept that chkA may block the old chunks ?
It may even block the retransmit chunks as well.

>
>>
>> So just save the last MSG_MORE on the association as I did.
> I will think about it, thanks.
>
>>
>> David
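
The "save the last MSG_MORE on the association" idea discussed above can be modelled with a small toy in C. This is a sketch only, not the SCTP code: struct toy_assoc, toy_sendmsg() and toy_can_send() are hypothetical names, and the rule that a queue large enough to fill a packet is sent regardless of the flag is an assumption made for this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum xmit_result { XMIT_OK, XMIT_DELAY };

/* Toy association: only the state the discussion needs. */
struct toy_assoc {
	size_t queued_bytes;	/* bytes waiting in the out queue */
	size_t frame_size;	/* payload that fills one packet */
	bool msg_more;		/* MSG_MORE state of the most recent send */
};

/* Queue one user message and remember only the latest MSG_MORE flag. */
static void toy_sendmsg(struct toy_assoc *asoc, size_t len, bool msg_more)
{
	assert(asoc);
	asoc->queued_bytes += len;
	asoc->msg_more = msg_more;
}

/* Delay only while the application promised more data and the queued
 * data cannot fill a packet on its own.
 */
static enum xmit_result toy_can_send(const struct toy_assoc *asoc)
{
	if (asoc->queued_bytes >= asoc->frame_size)
		return XMIT_OK;
	return asoc->msg_more ? XMIT_DELAY : XMIT_OK;
}
```

In this model a late small send with MSG_MORE set delays older small chunks, which is the trade-off being debated; no per-chunk flag and no queue walk is needed.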


[PATCH v2 net-next 0/6] drivers: net: xgene-v2: Add RGMII based 1G driver

2017-02-26 Thread Iyappan Subramanian
This patch set adds support for RGMII based 1GbE hardware which uses a
linked-list DMA descriptor architecture (v2) for APM X-Gene SoCs.

Signed-off-by: Iyappan Subramanian 
---
v2: Address review comments from v1
- moved create_desc_ring and delete_desc_ring to open() and close()
  respectively
- changed to use dma_zalloc APIs
- fixed tx_timeout()
- removed tx completion polling upper bound
- added error checking on rx packets
- added netif_stop_queue() and netif_wake_queue()

v1:
- Initial version
---

Iyappan Subramanian (6):
  drivers: net: xgene-v2: Add DMA descriptor
  drivers: net: xgene-v2: Add mac configuration
  drivers: net: xgene-v2: Add ethernet hardware configuration
  drivers: net: xgene-v2: Add base driver
  drivers: net: xgene-v2: Add transmit and receive
  MAINTAINERS: Add entry for APM X-Gene SoC Ethernet (v2) driver

 MAINTAINERS|   6 +
 drivers/net/ethernet/apm/Kconfig   |   1 +
 drivers/net/ethernet/apm/Makefile  |   1 +
 drivers/net/ethernet/apm/xgene-v2/Kconfig  |  11 +
 drivers/net/ethernet/apm/xgene-v2/Makefile |   6 +
 drivers/net/ethernet/apm/xgene-v2/enet.c   |  71 +++
 drivers/net/ethernet/apm/xgene-v2/enet.h   |  43 ++
 drivers/net/ethernet/apm/xgene-v2/mac.c| 116 +
 drivers/net/ethernet/apm/xgene-v2/mac.h| 110 +
 drivers/net/ethernet/apm/xgene-v2/main.c   | 756 +
 drivers/net/ethernet/apm/xgene-v2/main.h   |  75 +++
 drivers/net/ethernet/apm/xgene-v2/ring.c   |  81 
 drivers/net/ethernet/apm/xgene-v2/ring.h   | 119 +
 13 files changed, 1396 insertions(+)
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/Kconfig
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/Makefile
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/enet.c
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/enet.h
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/mac.c
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/mac.h
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/main.c
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/main.h
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/ring.c
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/ring.h

-- 
1.9.1
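
As an illustration of the linked list of DMA descriptors mentioned in the cover letter, here is a minimal userspace sketch of how a circular chain can be built so that the last descriptor points back to the first. NUM_DESC, DESC_SIZE and struct toy_desc are assumptions made for this example; the driver's real descriptor layout and sizes are defined in its ring.h.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DESC	8u	/* assumed; must be a power of two */
#define DESC_SIZE	16u	/* assumed size of one descriptor in bytes */

struct toy_desc {
	uint64_t next_addr;	/* bus address of the next descriptor */
};

/* Link descriptors 0..NUM_DESC-1 so that the last points to the first. */
static void toy_setup_ring(struct toy_desc ring[NUM_DESC], uint64_t base)
{
	unsigned int i;

	assert(ring);
	for (i = 0; i < NUM_DESC; i++) {
		/* Masking with NUM_DESC - 1 wraps the index without a branch. */
		unsigned int next = (i + 1) & (NUM_DESC - 1);

		ring[i].next_addr = base + (uint64_t)next * DESC_SIZE;
	}
}
```

The mask only works as a modulo when the descriptor count is a power of two, which is why that is stated as an assumption next to the constant.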



[PATCH v2 net-next 5/6] drivers: net: xgene-v2: Add transmit and receive

2017-02-26 Thread Iyappan Subramanian
This patch adds,
- Transmit
- Transmit completion poll
- Receive poll
- NAPI handler

and enables the driver.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Keyur Chudgar 
---
 drivers/net/ethernet/apm/Kconfig   |   1 +
 drivers/net/ethernet/apm/Makefile  |   1 +
 drivers/net/ethernet/apm/xgene-v2/Kconfig  |  11 ++
 drivers/net/ethernet/apm/xgene-v2/Makefile |   6 +
 drivers/net/ethernet/apm/xgene-v2/main.c   | 248 -
 drivers/net/ethernet/apm/xgene-v2/main.h   |   1 +
 6 files changed, 267 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/Kconfig
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/Makefile

diff --git a/drivers/net/ethernet/apm/Kconfig b/drivers/net/ethernet/apm/Kconfig
index ec63d70..59efe5b 100644
--- a/drivers/net/ethernet/apm/Kconfig
+++ b/drivers/net/ethernet/apm/Kconfig
@@ -1 +1,2 @@
 source "drivers/net/ethernet/apm/xgene/Kconfig"
+source "drivers/net/ethernet/apm/xgene-v2/Kconfig"
diff --git a/drivers/net/ethernet/apm/Makefile b/drivers/net/ethernet/apm/Makefile
index 65ce32a..946b2a4 100644
--- a/drivers/net/ethernet/apm/Makefile
+++ b/drivers/net/ethernet/apm/Makefile
@@ -3,3 +3,4 @@
 #
 
 obj-$(CONFIG_NET_XGENE) += xgene/
+obj-$(CONFIG_NET_XGENE_V2) += xgene-v2/
diff --git a/drivers/net/ethernet/apm/xgene-v2/Kconfig b/drivers/net/ethernet/apm/xgene-v2/Kconfig
new file mode 100644
index 000..1205861
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/Kconfig
@@ -0,0 +1,11 @@
+config NET_XGENE_V2
+   tristate "APM X-Gene SoC Ethernet-v2 Driver"
+   depends on HAS_DMA
+   depends on ARCH_XGENE || COMPILE_TEST
+   help
+ This is the Ethernet driver for the on-chip ethernet interface
+ which uses a linked list of DMA descriptor architecture (v2) for
+ APM X-Gene SoCs.
+
+ To compile this driver as a module, choose M here. This module will
+ be called xgene-enet-v2.
diff --git a/drivers/net/ethernet/apm/xgene-v2/Makefile b/drivers/net/ethernet/apm/xgene-v2/Makefile
new file mode 100644
index 000..735309c
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/Makefile
@@ -0,0 +1,6 @@
+#
+# Makefile for APM X-Gene Ethernet v2 driver
+#
+
+xgene-enet-v2-objs := main.o mac.o enet.o ring.o
+obj-$(CONFIG_NET_XGENE_V2) += xgene-enet-v2.o
diff --git a/drivers/net/ethernet/apm/xgene-v2/main.c b/drivers/net/ethernet/apm/xgene-v2/main.c
index c96b4cc..f613a78 100644
--- a/drivers/net/ethernet/apm/xgene-v2/main.c
+++ b/drivers/net/ethernet/apm/xgene-v2/main.c
@@ -113,7 +113,7 @@ static int xge_refill_buffers(struct net_device *ndev, u32 nbuf)
raw_desc->m1 = cpu_to_le64(SET_BITS(NEXT_DESC_ADDRL, addr_lo) |
   SET_BITS(NEXT_DESC_ADDRH, addr_hi) |
   SET_BITS(PKT_ADDRH,
-   dma_addr >> PKT_ADDRL_LEN));
+   upper_32_bits(dma_addr)));
 
dma_wmb();
raw_desc->m0 = cpu_to_le64(SET_BITS(PKT_ADDRL, dma_addr) |
@@ -177,6 +177,194 @@ static void xge_free_irq(struct net_device *ndev)
devm_free_irq(dev, pdata->resources.irq, pdata);
 }
 
+static bool is_tx_slot_available(struct xge_raw_desc *raw_desc)
+{
+   if (GET_BITS(E, le64_to_cpu(raw_desc->m0)) &&
+   (GET_BITS(PKT_SIZE, le64_to_cpu(raw_desc->m0)) == SLOT_EMPTY))
+   return true;
+
+   return false;
+}
+
+static netdev_tx_t xge_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+{
+   struct xge_pdata *pdata = netdev_priv(ndev);
+   struct device *dev = &pdata->pdev->dev;
+   static dma_addr_t dma_addr;
+   struct xge_desc_ring *tx_ring;
+   struct xge_raw_desc *raw_desc;
+   u64 addr_lo, addr_hi;
+   void *pkt_buf;
+   u8 tail;
+   u16 len;
+
+   tx_ring = pdata->tx_ring;
+   tail = tx_ring->tail;
+   len = skb_headlen(skb);
+   raw_desc = &tx_ring->raw_desc[tail];
+
+   if (!is_tx_slot_available(raw_desc)) {
+   netif_stop_queue(ndev);
+   return NETDEV_TX_BUSY;
+   }
+
+   /* Packet buffers should be 64B aligned */
+   pkt_buf = dma_zalloc_coherent(dev, XGENE_ENET_STD_MTU, &dma_addr,
+ GFP_ATOMIC);
+   if (unlikely(!pkt_buf)) {
+   dev_kfree_skb_any(skb);
+   return NETDEV_TX_OK;
+   }
+   memcpy(pkt_buf, skb->data, len);
+
+   addr_hi = GET_BITS(NEXT_DESC_ADDRH, le64_to_cpu(raw_desc->m1));
+   addr_lo = GET_BITS(NEXT_DESC_ADDRL, le64_to_cpu(raw_desc->m1));
+   raw_desc->m1 = cpu_to_le64(SET_BITS(NEXT_DESC_ADDRL, addr_lo) |
+  SET_BITS(NEXT_DESC_ADDRH, addr_hi) |
+  SET_BITS(PKT_ADDRH,
+   
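
The hunk above replaces an open-coded shift with upper_32_bits(). A minimal userspace sketch of that split and the matching reassembly follows; the toy_* helpers are stand-ins written for this example, and the 32-bit width of the low address field is an assumption, since PKT_ADDRL_LEN is not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's upper_32_bits()/lower_32_bits(). */
static uint32_t toy_upper_32(uint64_t addr)
{
	/* Two 16-bit shifts mirror the kernel macro, which stays well
	 * defined even when the address type is only 32 bits wide.
	 */
	return (uint32_t)((addr >> 16) >> 16);
}

static uint32_t toy_lower_32(uint64_t addr)
{
	return (uint32_t)(addr & 0xffffffffu);
}

/* Rebuild the address from the two halves, as a completion path would
 * after reading them back from a descriptor.
 */
static uint64_t toy_join(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```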

[PATCH v2 net-next 1/6] drivers: net: xgene-v2: Add DMA descriptor

2017-02-26 Thread Iyappan Subramanian
This patch adds DMA descriptor setup and interrupt enable/disable
functions.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Keyur Chudgar 
---
 drivers/net/ethernet/apm/xgene-v2/main.h |  74 +++
 drivers/net/ethernet/apm/xgene-v2/ring.c |  81 +
 drivers/net/ethernet/apm/xgene-v2/ring.h | 119 +++
 3 files changed, 274 insertions(+)
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/main.h
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/ring.c
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/ring.h

diff --git a/drivers/net/ethernet/apm/xgene-v2/main.h b/drivers/net/ethernet/apm/xgene-v2/main.h
new file mode 100644
index 000..a2f8712
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/main.h
@@ -0,0 +1,74 @@
+/*
+ * Applied Micro X-Gene SoC Ethernet v2 Driver
+ *
+ * Copyright (c) 2017, Applied Micro Circuits Corporation
+ * Author(s): Iyappan Subramanian 
+ *   Keyur Chudgar 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#ifndef __XGENE_ENET_V2_MAIN_H__
+#define __XGENE_ENET_V2_MAIN_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "mac.h"
+#include "enet.h"
+#include "ring.h"
+
+#define XGENE_ENET_V2_VERSION  "v1.0"
+#define XGENE_ENET_STD_MTU 1536
+#define XGENE_ENET_MIN_FRAME   60
+#define IRQ_ID_SIZE 16
+
+struct xge_resource {
+   void __iomem *base_addr;
+   int phy_mode;
+   u32 irq;
+};
+
+struct xge_stats {
+   u64 tx_packets;
+   u64 tx_bytes;
+   u64 rx_packets;
+   u64 rx_bytes;
+};
+
+/* ethernet private data */
+struct xge_pdata {
+   struct xge_resource resources;
+   struct xge_desc_ring *tx_ring;
+   struct xge_desc_ring *rx_ring;
+   struct platform_device *pdev;
+   char irq_name[IRQ_ID_SIZE];
+   struct net_device *ndev;
+   struct napi_struct napi;
+   struct xge_stats stats;
+   int phy_speed;
+   u8 nbufs;
+};
+
+#endif /* __XGENE_ENET_V2_MAIN_H__ */
diff --git a/drivers/net/ethernet/apm/xgene-v2/ring.c b/drivers/net/ethernet/apm/xgene-v2/ring.c
new file mode 100644
index 000..3881082
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/ring.c
@@ -0,0 +1,81 @@
+/*
+ * Applied Micro X-Gene SoC Ethernet v2 Driver
+ *
+ * Copyright (c) 2017, Applied Micro Circuits Corporation
+ * Author(s): Iyappan Subramanian 
+ *   Keyur Chudgar 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include "main.h"
+
+/* create circular linked list of descriptors */
+void xge_setup_desc(struct xge_desc_ring *ring)
+{
+   struct xge_raw_desc *raw_desc;
+   dma_addr_t dma_h, next_dma;
+   u16 offset;
+   int i;
+
+   for (i = 0; i < XGENE_ENET_NUM_DESC; i++) {
+   raw_desc = &ring->raw_desc[i];
+
+   offset = (i + 1) & (XGENE_ENET_NUM_DESC - 1);
+   next_dma = ring->dma_addr + (offset * XGENE_ENET_DESC_SIZE);
+
+   raw_desc->m0 = cpu_to_le64(SET_BITS(E, 1) |
+  SET_BITS(PKT_SIZE, SLOT_EMPTY));
+   dma_h = upper_32_bits(next_dma);
+   raw_desc->m1 = cpu_to_le64(SET_BITS(NEXT_DESC_ADDRL, next_dma) |
+  SET_BITS(NEXT_DESC_ADDRH, dma_h));
+   }
+}
+
+void xge_update_tx_desc_addr(struct xge_pdata *pdata)
+{
+   struct xge_desc_ring *ring = pdata->tx_ring;
+   dma_addr_t dma_addr = ring->dma_addr;
+
+   xge_wr_csr(pdata, DMATXDESCL, dma_addr);
+   xge_wr_csr(pdata, DMATXDESCH, 

[PATCH v2 net-next 2/6] drivers: net: xgene-v2: Add mac configuration

2017-02-26 Thread Iyappan Subramanian
This patch adds functions to configure and control mac.  This
patch also adds helper functions to get/set registers.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Keyur Chudgar 
---
 drivers/net/ethernet/apm/xgene-v2/mac.c | 116 
 drivers/net/ethernet/apm/xgene-v2/mac.h | 110 ++
 2 files changed, 226 insertions(+)
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/mac.c
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/mac.h

diff --git a/drivers/net/ethernet/apm/xgene-v2/mac.c b/drivers/net/ethernet/apm/xgene-v2/mac.c
new file mode 100644
index 000..9c3d32d
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/mac.c
@@ -0,0 +1,116 @@
+/*
+ * Applied Micro X-Gene SoC Ethernet v2 Driver
+ *
+ * Copyright (c) 2017, Applied Micro Circuits Corporation
+ * Author(s): Iyappan Subramanian 
+ *   Keyur Chudgar 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include "main.h"
+
+void xge_mac_reset(struct xge_pdata *pdata)
+{
+   xge_wr_csr(pdata, MAC_CONFIG_1, SOFT_RESET);
+   xge_wr_csr(pdata, MAC_CONFIG_1, 0);
+}
+
+static void xge_mac_set_speed(struct xge_pdata *pdata)
+{
+   u32 icm0, icm2, ecm0, mc2;
+   u32 intf_ctrl, rgmii;
+
+   icm0 = xge_rd_csr(pdata, ICM_CONFIG0_REG_0);
+   icm2 = xge_rd_csr(pdata, ICM_CONFIG2_REG_0);
+   ecm0 = xge_rd_csr(pdata, ECM_CONFIG0_REG_0);
+   rgmii = xge_rd_csr(pdata, RGMII_REG_0);
+   mc2 = xge_rd_csr(pdata, MAC_CONFIG_2);
+   intf_ctrl = xge_rd_csr(pdata, INTERFACE_CONTROL);
+   icm2 |= CFG_WAITASYNCRD_EN;
+
+   switch (pdata->phy_speed) {
+   case SPEED_10:
+   SET_REG_BITS(&mc2, INTF_MODE, 1);
+   SET_REG_BITS(&intf_ctrl, HD_MODE, 0);
+   SET_REG_BITS(&icm0, CFG_MACMODE, 0);
+   SET_REG_BITS(&icm2, CFG_WAITASYNCRD, 500);
+   SET_REG_BIT(&rgmii, CFG_SPEED_125, 0);
+   break;
+   case SPEED_100:
+   SET_REG_BITS(&mc2, INTF_MODE, 1);
+   SET_REG_BITS(&intf_ctrl, HD_MODE, 1);
+   SET_REG_BITS(&icm0, CFG_MACMODE, 1);
+   SET_REG_BITS(&icm2, CFG_WAITASYNCRD, 80);
+   SET_REG_BIT(&rgmii, CFG_SPEED_125, 0);
+   break;
+   default:
+   SET_REG_BITS(&mc2, INTF_MODE, 2);
+   SET_REG_BITS(&intf_ctrl, HD_MODE, 2);
+   SET_REG_BITS(&icm0, CFG_MACMODE, 2);
+   SET_REG_BITS(&icm2, CFG_WAITASYNCRD, 16);
+   SET_REG_BIT(&rgmii, CFG_SPEED_125, 1);
+   break;
+   }
+
+   mc2 |= FULL_DUPLEX | CRC_EN | PAD_CRC;
+   SET_REG_BITS(&ecm0, CFG_WFIFOFULLTHR, 0x32);
+
+   xge_wr_csr(pdata, MAC_CONFIG_2, mc2);
+   xge_wr_csr(pdata, INTERFACE_CONTROL, intf_ctrl);
+   xge_wr_csr(pdata, RGMII_REG_0, rgmii);
+   xge_wr_csr(pdata, ICM_CONFIG0_REG_0, icm0);
+   xge_wr_csr(pdata, ICM_CONFIG2_REG_0, icm2);
+   xge_wr_csr(pdata, ECM_CONFIG0_REG_0, ecm0);
+}
+
+void xge_mac_set_station_addr(struct xge_pdata *pdata)
+{
+   u32 addr0, addr1;
+   u8 *dev_addr = pdata->ndev->dev_addr;
+
+   addr0 = (dev_addr[3] << 24) | (dev_addr[2] << 16) |
+   (dev_addr[1] << 8) | dev_addr[0];
+   addr1 = (dev_addr[5] << 24) | (dev_addr[4] << 16);
+
+   xge_wr_csr(pdata, STATION_ADDR0, addr0);
+   xge_wr_csr(pdata, STATION_ADDR1, addr1);
+}
+
+void xge_mac_init(struct xge_pdata *pdata)
+{
+   xge_mac_reset(pdata);
+   xge_mac_set_speed(pdata);
+   xge_mac_set_station_addr(pdata);
+}
+
+void xge_mac_enable(struct xge_pdata *pdata)
+{
+   u32 data;
+
+   data = xge_rd_csr(pdata, MAC_CONFIG_1);
+   data |= TX_EN | RX_EN;
+   xge_wr_csr(pdata, MAC_CONFIG_1, data);
+
+   data = xge_rd_csr(pdata, MAC_CONFIG_1);
+}
+
+void xge_mac_disable(struct xge_pdata *pdata)
+{
+   u32 data;
+
+   data = xge_rd_csr(pdata, MAC_CONFIG_1);
+   data &= ~(TX_EN | RX_EN);
+   xge_wr_csr(pdata, MAC_CONFIG_1, data);
+}
diff --git a/drivers/net/ethernet/apm/xgene-v2/mac.h b/drivers/net/ethernet/apm/xgene-v2/mac.h
new file mode 100644
index 000..0fce6ae
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/mac.h
@@ -0,0 +1,110 @@
+/*
+ * Applied Micro X-Gene SoC Ethernet v2 Driver
+ *
+ * Copyright (c) 2017, Applied Micro Circuits Corporation
+ * Author(s): Iyappan 
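
The byte packing done by xge_mac_set_station_addr() in the patch above can be checked in userspace with a small sketch. The function name toy_pack_station_addr() is hypothetical; the byte order is copied from the patch, and the explicit uint32_t casts are an addition of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 6-byte MAC address into two 32-bit register values using the
 * same byte order as xge_mac_set_station_addr() in the patch.
 */
static void toy_pack_station_addr(const uint8_t dev_addr[6],
				  uint32_t *addr0, uint32_t *addr1)
{
	assert(dev_addr && addr0 && addr1);

	/* Cast before shifting: a uint8_t promotes to int, and shifting a
	 * value >= 0x80 left by 24 would overflow a signed int.
	 */
	*addr0 = ((uint32_t)dev_addr[3] << 24) | ((uint32_t)dev_addr[2] << 16) |
		 ((uint32_t)dev_addr[1] << 8) | dev_addr[0];
	*addr1 = ((uint32_t)dev_addr[5] << 24) | ((uint32_t)dev_addr[4] << 16);
}
```

For the address 00:11:22:33:44:55 this yields 0x33221100 for the first register and 0x55440000 for the second.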

[PATCH v2 net-next 6/6] MAINTAINERS: Add entry for APM X-Gene SoC Ethernet (v2) driver

2017-02-26 Thread Iyappan Subramanian
This patch adds a MAINTAINERS entry for the ethernet driver for
the on-chip ethernet interface which uses a linked-list DMA
descriptor architecture (v2) for APM X-Gene SoCs.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Keyur Chudgar 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4b03c47..359fc34 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -902,6 +902,12 @@ F: drivers/net/phy/mdio-xgene.c
 F: Documentation/devicetree/bindings/net/apm-xgene-enet.txt
 F: Documentation/devicetree/bindings/net/apm-xgene-mdio.txt
 
+APPLIED MICRO (APM) X-GENE SOC ETHERNET (V2) DRIVER
+M: Iyappan Subramanian 
+M: Keyur Chudgar 
+S: Supported
+F: drivers/net/ethernet/apm/xgene-v2/
+
 APPLIED MICRO (APM) X-GENE SOC PMU
 M: Tai Nguyen 
 S: Supported
-- 
1.9.1



Re: [PATCH net-next 4/6] drivers: net: xgene-v2: Add base driver

2017-02-26 Thread Iyappan Subramanian
Hi Andrew,

On Tue, Jan 31, 2017 at 12:01 PM, Andrew Lunn  wrote:
>> + phy_mode = device_get_phy_mode(dev);
>> + if (phy_mode < 0) {
>> + dev_err(dev, "Unable to get phy-connection-type\n");
>> + return phy_mode;
>> + }
>> + pdata->resources.phy_mode = phy_mode;
>> +
>> + if (pdata->resources.phy_mode != PHY_INTERFACE_MODE_RGMII) {
>> + dev_err(dev, "Incorrect phy-connection-type specified\n");
>> + return -ENODEV;
>> + }
>
> This seems a bit limiting. What if you need to use:
>
> PHY_INTERFACE_MODE_RGMII_ID,
> PHY_INTERFACE_MODE_RGMII_RXID,
> PHY_INTERFACE_MODE_RGMII_TXID,
>
> in order to set the RGMII delays.

This version of the driver doesn't support setting delays.  The delay
support will be added in the future.

>
>Andrew
>
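
The check Andrew asks about can be sketched as a helper that accepts plain RGMII and the three internal-delay variants, instead of comparing against a single mode. The enum below is a local stand-in for the kernel's phy_interface_t values named in the review, with arbitrary numeric values, and toy_mode_is_rgmii() is a hypothetical name. Accepting a mode here says nothing about the delays being programmed, which this version of the driver does not do.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the PHY interface modes named in the review. */
enum toy_phy_mode {
	TOY_MODE_OTHER,
	TOY_MODE_RGMII,
	TOY_MODE_RGMII_ID,
	TOY_MODE_RGMII_RXID,
	TOY_MODE_RGMII_TXID,
};

/* True for plain RGMII and for the three internal-delay variants. */
static bool toy_mode_is_rgmii(enum toy_phy_mode mode)
{
	switch (mode) {
	case TOY_MODE_RGMII:
	case TOY_MODE_RGMII_ID:
	case TOY_MODE_RGMII_RXID:
	case TOY_MODE_RGMII_TXID:
		return true;
	default:
		return false;
	}
}
```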


Re: [PATCH net-next 4/6] drivers: net: xgene-v2: Add base driver

2017-02-26 Thread Iyappan Subramanian
Hi Florian,

On Tue, Jan 31, 2017 at 12:31 PM, Florian Fainelli  wrote:
> On 01/31/2017 11:03 AM, Iyappan Subramanian wrote:
>> This patch adds,
>>
>>  - probe, remove, shutdown
>>  - open, close and stats
>>  - create and delete ring
>>  - request and delete irq
>>
>> Signed-off-by: Iyappan Subramanian 
>> Signed-off-by: Keyur Chudgar 
>> ---
>
>> +static void xge_delete_desc_rings(struct net_device *ndev)
>> +{
>> + struct xge_pdata *pdata = netdev_priv(ndev);
>> + struct device *dev = &pdata->pdev->dev;
>> + struct xge_desc_ring *ring;
>> +
>> + ring = pdata->tx_ring;
>> + if (ring) {
>> + if (ring->skbs)
>> + devm_kfree(dev, ring->skbs);
>> + if (ring->pkt_bufs)
>> + devm_kfree(dev, ring->pkt_bufs);
>> + devm_kfree(dev, ring);
>> + }
>
> The very fact that you have to do the devm_kfree suggests that the way
> you manage the lifetime of the ring is not appropriate, and in fact, if
> we look at how xge_create_desc_ring() is called, in the driver's probe
> function indicates that if the network interface is never opened, we
> are just wasting memory sitting there and doing nothing. You should
> consider moving this to the ndo_open(), resp. ndo_close() functions to
> optimize memory consumption wrt. the network interface state.

I will move these to open and close functions and will use dma_zalloc() APIs.

>
>> +
>> + ring = pdata->rx_ring;
>> + if (ring) {
>> + if (ring->skbs)
>> + devm_kfree(dev, ring->skbs);
>> + devm_kfree(dev, ring);
>> + }
>> +}
>> +
>> +static struct xge_desc_ring *xge_create_desc_ring(struct net_device *ndev)
>> +{
>> + struct xge_pdata *pdata = netdev_priv(ndev);
>> + struct device *dev = &pdata->pdev->dev;
>> + struct xge_desc_ring *ring;
>> + u16 size;
>> +
>> + ring = devm_kzalloc(dev, sizeof(struct xge_desc_ring), GFP_KERNEL);
>> + if (!ring)
>> + return NULL;
>> +
>> + ring->ndev = ndev;
>> +
>> + size = XGENE_ENET_DESC_SIZE * XGENE_ENET_NUM_DESC;
>> + ring->desc_addr = dmam_alloc_coherent(dev, size, &ring->dma_addr,
>> +   GFP_KERNEL | __GFP_ZERO);
>
> There is no dmam_zalloc_coherent()? Then again, that seems to be a
> candidate for dma_zalloc_coherent() and moving this to the ndo_open()
> function.
>
>> + if (!ring->desc_addr) {
>> + devm_kfree(dev, ring);
>> + return NULL;
>> + }
>> +
>> + xge_setup_desc(ring);
>> +
>> + return ring;
>> +}
>> +
>> +static int xge_refill_buffers(struct net_device *ndev, u32 nbuf)
>> +{
>> + struct xge_pdata *pdata = netdev_priv(ndev);
>> + struct xge_desc_ring *ring = pdata->rx_ring;
>> + const u8 slots = XGENE_ENET_NUM_DESC - 1;
>> + struct device *dev = &pdata->pdev->dev;
>> + struct xge_raw_desc *raw_desc;
>> + u64 addr_lo, addr_hi;
>> + u8 tail = ring->tail;
>> + struct sk_buff *skb;
>> + dma_addr_t dma_addr;
>> + u16 len;
>> + int i;
>> +
>> + for (i = 0; i < nbuf; i++) {
>> + raw_desc = &ring->raw_desc[tail];
>> +
>> + len = XGENE_ENET_STD_MTU;
>> + skb = netdev_alloc_skb(ndev, len);
>> + if (unlikely(!skb))
>> + return -ENOMEM;
>
> Are not you leaving holes in your RX ring if you do that?

No.  The probe will fail and clean up the unused buffers.

>
>> +
>> + dma_addr = dma_map_single(dev, skb->data, len, DMA_FROM_DEVICE);
>> + if (dma_mapping_error(dev, dma_addr)) {
>> + netdev_err(ndev, "DMA mapping error\n");
>> + dev_kfree_skb_any(skb);
>> + return -EINVAL;
>> + }
>
>
>> +static void xge_timeout(struct net_device *ndev)
>> +{
>> + struct xge_pdata *pdata = netdev_priv(ndev);
>> + struct netdev_queue *txq;
>> +
>> + xge_mac_reset(pdata);
>> +
>> + txq = netdev_get_tx_queue(ndev, 0);
>> + txq->trans_start = jiffies;
>> + netif_tx_start_queue(txq);
>
> It most likely is not that simple, don't you want to walk the list of
> pending transmitted SKBs and free them all?

I'll add more exhaustive clean up and restart of Tx hardware.

>
>> +}
>> +
>> +static void xge_get_stats64(struct net_device *ndev,
>> + struct rtnl_link_stats64 *storage)
>> +{
>> + struct xge_pdata *pdata = netdev_priv(ndev);
>> + struct xge_stats *stats = &pdata->stats;
>> +
>> + storage->tx_packets += stats->tx_packets;
>> + storage->tx_bytes += stats->tx_bytes;
>> +
>> + storage->rx_packets += stats->rx_packets;
>> + storage->rx_bytes += stats->rx_bytes;
>
> Pretty sure you need some synchronization primitives here for non 64-bit
> architectures (maybe this driver is not used outside of 64-bit, but still).

Synchronization primitives are not required for this 

Re: [PATCH net-next 5/6] drivers: net: xgene-v2: Add transmit and receive

2017-02-26 Thread Iyappan Subramanian
Hi Florian,

On Tue, Jan 31, 2017 at 12:33 PM, Florian Fainelli  wrote:
> On 01/31/2017 11:03 AM, Iyappan Subramanian wrote:
>> This patch adds,
>> - Transmit
>> - Transmit completion poll
>> - Receive poll
>> - NAPI handler
>>
>> and enables the driver.
>>
>> Signed-off-by: Iyappan Subramanian 
>> Signed-off-by: Keyur Chudgar 
>> ---
>
>> +
>> + tx_ring = pdata->tx_ring;
>> + tail = tx_ring->tail;
>> + len = skb_headlen(skb);
>> + raw_desc = &tx_ring->raw_desc[tail];
>> +
>> + /* Tx descriptor not available */
>> + if (!GET_BITS(E, le64_to_cpu(raw_desc->m0)) ||
>> + GET_BITS(PKT_SIZE, le64_to_cpu(raw_desc->m0)))
>> + return NETDEV_TX_BUSY;
>> +
>> + /* Packet buffers should be 64B aligned */
>> + pkt_buf = dma_alloc_coherent(dev, XGENE_ENET_STD_MTU, &dma_addr,
>> +  GFP_ATOMIC);
>> + if (unlikely(!pkt_buf))
>> + goto out;
>
> Can't you obtain a DMA-API mapping for skb->data and pass it down to the
> hardware? This copy here is inefficient.

This hardware requires 64-byte alignment.

>
>> +
>> + memcpy(pkt_buf, skb->data, len);
>> +
>> + addr_hi = GET_BITS(NEXT_DESC_ADDRH, le64_to_cpu(raw_desc->m1));
>> + addr_lo = GET_BITS(NEXT_DESC_ADDRL, le64_to_cpu(raw_desc->m1));
>> + raw_desc->m1 = cpu_to_le64(SET_BITS(NEXT_DESC_ADDRL, addr_lo) |
>> +SET_BITS(NEXT_DESC_ADDRH, addr_hi) |
>> +SET_BITS(PKT_ADDRH,
>> + dma_addr >> PKT_ADDRL_LEN));
>> +
>> + dma_wmb();
>> +
>> + raw_desc->m0 = cpu_to_le64(SET_BITS(PKT_ADDRL, dma_addr) |
>> +SET_BITS(PKT_SIZE, len) |
>> +SET_BITS(E, 0));
>> +
>> + skb_tx_timestamp(skb);
>> + xge_wr_csr(pdata, DMATXCTRL, 1);
>> +
>> + pdata->stats.tx_packets++;
>> + pdata->stats.tx_bytes += skb->len;
>
> This is both racy and incorrect. Racy because after you wrote DMATXCTRL,
> your TX completion can run, and it can do that while interrupting your
> CPU presumably, and free the SKB, therefore making you access a freed
> SKB (or it should, if it does not), it's also incorrect, because before
> you get signaled a TX completion, there is no guarantee that the packets
> did actually make it through, you must update your stats in the TX
> completion handler.

Thanks.  I'll move the tx stats part to Tx completion.

>
>> +
>> + tx_ring->skbs[tail] = skb;
>> + tx_ring->pkt_bufs[tail] = pkt_buf;
>> + tx_ring->tail = (tail + 1) & (XGENE_ENET_NUM_DESC - 1);
>> +
>> +out:
>> + dev_kfree_skb_any(skb);
>
> Don't do this, remember a pointer to the SKB, free the SKB in TX
> completion handler, preferably in NAPI context.

I'll implement this.

>
>> +
>> + return NETDEV_TX_OK;
>> +}
>> +
>> +static void xge_txc_poll(struct net_device *ndev, unsigned int budget)
>> +{
>> + struct xge_pdata *pdata = netdev_priv(ndev);
>> + struct device *dev = &pdata->pdev->dev;
>> + struct xge_desc_ring *tx_ring;
>> + struct xge_raw_desc *raw_desc;
>> + u64 addr_lo, addr_hi;
>> + dma_addr_t dma_addr;
>> + void *pkt_buf;
>> + bool pktsent;
>> + u32 data;
>> + u8 head;
>> + int i;
>> +
>> + tx_ring = pdata->tx_ring;
>> + head = tx_ring->head;
>> +
>> + data = xge_rd_csr(pdata, DMATXSTATUS);
>> + pktsent = data & TX_PKT_SENT;
>> + if (unlikely(!pktsent))
>> + return;
>> +
>> + for (i = 0; i < budget; i++) {
>
> TX completion handlers should run unbound and free the entire TX ring,
> don't make it obey to an upper bound.

I'll do as suggested.

>
>> + raw_desc = &tx_ring->raw_desc[head];
>> +
>> + if (!GET_BITS(E, le64_to_cpu(raw_desc->m0)))
>> + break;
>> +
>> + dma_rmb();
>> +
>> + addr_hi = GET_BITS(PKT_ADDRH, le64_to_cpu(raw_desc->m1));
>> + addr_lo = GET_BITS(PKT_ADDRL, le64_to_cpu(raw_desc->m0));
>> + dma_addr = (addr_hi << PKT_ADDRL_LEN) | addr_lo;
>> +
>> + pkt_buf = tx_ring->pkt_bufs[head];
>> +
>> + /* clear pktstart address and pktsize */
>> + raw_desc->m0 = cpu_to_le64(SET_BITS(E, 1) |
>> +SET_BITS(PKT_SIZE, 0));
>> + xge_wr_csr(pdata, DMATXSTATUS, 1);
>> +
>> + dma_free_coherent(dev, XGENE_ENET_STD_MTU, pkt_buf, dma_addr);
>> +
>> + head = (head + 1) & (XGENE_ENET_NUM_DESC - 1);
>> + }
>> +
>> + tx_ring->head = head;
>> +}
>> +
>> +static int xge_rx_poll(struct net_device *ndev, unsigned int budget)
>> +{
>> + struct xge_pdata *pdata = netdev_priv(ndev);
>> + struct device *dev = &pdata->pdev->dev;
>> + dma_addr_t addr_hi, addr_lo, dma_addr;
>> + struct xge_desc_ring *rx_ring;
>> + struct xge_raw_desc *raw_desc;
>> + struct 

[PATCH v2 net-next 3/6] drivers: net: xgene-v2: Add ethernet hardware configuration

2017-02-26 Thread Iyappan Subramanian
This patch adds functions to configure ethernet hardware.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Keyur Chudgar 
---
 drivers/net/ethernet/apm/xgene-v2/enet.c | 71 
 drivers/net/ethernet/apm/xgene-v2/enet.h | 43 +++
 2 files changed, 114 insertions(+)
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/enet.c
 create mode 100644 drivers/net/ethernet/apm/xgene-v2/enet.h

diff --git a/drivers/net/ethernet/apm/xgene-v2/enet.c 
b/drivers/net/ethernet/apm/xgene-v2/enet.c
new file mode 100644
index 000..b49edee
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/enet.c
@@ -0,0 +1,71 @@
+/*
+ * Applied Micro X-Gene SoC Ethernet v2 Driver
+ *
+ * Copyright (c) 2017, Applied Micro Circuits Corporation
+ * Author(s): Iyappan Subramanian 
+ *   Keyur Chudgar 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "main.h"
+
+void xge_wr_csr(struct xge_pdata *pdata, u32 offset, u32 val)
+{
+   void __iomem *addr = pdata->resources.base_addr + offset;
+
+   iowrite32(val, addr);
+}
+
+u32 xge_rd_csr(struct xge_pdata *pdata, u32 offset)
+{
+   void __iomem *addr = pdata->resources.base_addr + offset;
+
+   return ioread32(addr);
+}
+
+int xge_port_reset(struct net_device *ndev)
+{
+   struct xge_pdata *pdata = netdev_priv(ndev);
+
+   xge_wr_csr(pdata, ENET_SRST, 0x3);
+   xge_wr_csr(pdata, ENET_SRST, 0x2);
+   xge_wr_csr(pdata, ENET_SRST, 0x0);
+
+   xge_wr_csr(pdata, ENET_SHIM, DEVM_ARAUX_COH | DEVM_AWAUX_COH);
+
+   return 0;
+}
+
+static void xge_traffic_resume(struct net_device *ndev)
+{
+   struct xge_pdata *pdata = netdev_priv(ndev);
+
+   xge_wr_csr(pdata, CFG_FORCE_LINK_STATUS_EN, 1);
+   xge_wr_csr(pdata, FORCE_LINK_STATUS, 1);
+
+   xge_wr_csr(pdata, CFG_LINK_AGGR_RESUME, 1);
+   xge_wr_csr(pdata, RX_DV_GATE_REG, 1);
+}
+
+int xge_port_init(struct net_device *ndev)
+{
+   struct xge_pdata *pdata = netdev_priv(ndev);
+
+   pdata->phy_speed = SPEED_1000;
+   xge_mac_init(pdata);
+   xge_traffic_resume(ndev);
+
+   return 0;
+}
diff --git a/drivers/net/ethernet/apm/xgene-v2/enet.h 
b/drivers/net/ethernet/apm/xgene-v2/enet.h
new file mode 100644
index 000..40371cf
--- /dev/null
+++ b/drivers/net/ethernet/apm/xgene-v2/enet.h
@@ -0,0 +1,43 @@
+/*
+ * Applied Micro X-Gene SoC Ethernet v2 Driver
+ *
+ * Copyright (c) 2017, Applied Micro Circuits Corporation
+ * Author(s): Iyappan Subramanian 
+ *   Keyur Chudgar 
+ *
+ * This program is free software; you can redistribute  it and/or modify it
+ * under  the terms of  the GNU General  Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __XGENE_ENET_V2_ENET_H__
+#define __XGENE_ENET_V2_ENET_H__
+
+#define ENET_CLKEN 0xc008
+#define ENET_SRST  0xc000
+#define ENET_SHIM  0xc010
+#define CFG_MEM_RAM_SHUTDOWN   0xd070
+#define BLOCK_MEM_RDY  0xd074
+
+#define DEVM_ARAUX_COH BIT(19)
+#define DEVM_AWAUX_COH BIT(3)
+
+#define CFG_FORCE_LINK_STATUS_EN   0x229c
+#define FORCE_LINK_STATUS  0x22a0
+#define CFG_LINK_AGGR_RESUME   0x27c8
+#define RX_DV_GATE_REG 0x2dfc
+
+void xge_wr_csr(struct xge_pdata *pdata, u32 offset, u32 val);
+u32 xge_rd_csr(struct xge_pdata *pdata, u32 offset);
+int xge_port_reset(struct net_device *ndev);
+
+#endif  /* __XGENE_ENET_V2_ENET_H__ */
-- 
1.9.1



Re: [PATCH net-next 5/6] drivers: net: xgene-v2: Add transmit and receive

2017-02-26 Thread Iyappan Subramanian
On Wed, Feb 1, 2017 at 3:09 AM, David Laight  wrote:
> From Florian Fainelli
>> Sent: 31 January 2017 20:33
>> On 01/31/2017 11:03 AM, Iyappan Subramanian wrote:
>> > This patch adds,
>> > - Transmit
>> > - Transmit completion poll
>> > - Receive poll
>> > - NAPI handler
>> >
>> > and enables the driver.
>> >
>> > Signed-off-by: Iyappan Subramanian 
>> > Signed-off-by: Keyur Chudgar 
>> > ---
>>
>> > +
>> > +   tx_ring = pdata->tx_ring;
>> > +   tail = tx_ring->tail;
>> > +   len = skb_headlen(skb);
>> > +   raw_desc = &tx_ring->raw_desc[tail];
>> > +
>> > +   /* Tx descriptor not available */
>> > +   if (!GET_BITS(E, le64_to_cpu(raw_desc->m0)) ||
>> > +   GET_BITS(PKT_SIZE, le64_to_cpu(raw_desc->m0)))
>> > +   return NETDEV_TX_BUSY;
>
> Aren't you supposed to detect 'ring full' and stop the code
> giving you packets to transmit?

I'll add stop queue and wake queue.
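
As a rough userspace sketch of what "stop queue and wake queue" means
here (not the driver code; struct txq and its helpers are hypothetical,
and the comments only draw an analogy to netif_stop_queue() and
netif_wake_queue()):

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_DESC 4

struct txq {
	unsigned int head, tail;	/* free-running; tail - head = slots in use */
	bool stopped;
};

static unsigned int txq_used(const struct txq *q)
{
	unsigned int used = q->tail - q->head;

	assert(used <= NUM_DESC);
	return used;
}

/* Transmit: refuse when stopped or full, and stop the queue as soon as
 * the last free slot has been taken (analogous to netif_stop_queue()).
 */
static bool txq_xmit(struct txq *q)
{
	if (q->stopped || txq_used(q) == NUM_DESC)
		return false;
	q->tail++;
	if (txq_used(q) == NUM_DESC)
		q->stopped = true;
	return true;
}

/* Completion: reclaim one slot and let the stack transmit again
 * (analogous to netif_wake_queue()).
 */
static void txq_complete(struct txq *q)
{
	if (txq_used(q) == 0)
		return;
	q->head++;
	q->stopped = false;
}
```

With the queue stopped when the ring fills, the transmit function should
rarely, if ever, need to return a busy status.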

>
>> > +
>> > +   /* Packet buffers should be 64B aligned */
>
> Is that really a requirement of the hardware?
> Almost all ethernet frames are 4n+2 aligned.

Yes, it's a hardware requirement.

>
>> > +   pkt_buf = dma_alloc_coherent(dev, XGENE_ENET_STD_MTU, &dma_addr,
>> > +GFP_ATOMIC);
>> > +   if (unlikely(!pkt_buf))
>> > +   goto out;
>>
>> Can't you obtain a DMA-API mapping for skb->data and pass it down to the
>> hardware? This copy here is inefficient.
>>
>> > +
>> > +   memcpy(pkt_buf, skb->data, len);
>
> You really need to verify that the len <= XGENE_ENET_STD_MTU.

This version of the driver doesn't support jumbo frames, so the
check is not required.

>
> Isn't this code only transmitting the 'head' of the packet?
> What about the fragments??

This driver doesn't enable SG yet.

> ...
> David
>


Re: [PATCH net] l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv

2017-02-26 Thread David Miller
From: Paul Hüber 
Date: Sun, 26 Feb 2017 17:58:19 +0100

> l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
> The return value is passed up to ip_local_deliver_finish, which treats
> negative values as an IP protocol number for resubmission.
> 
> Signed-off-by: Paul Hüber 

Applied and queued up for -stable, thanks.


[PATCH net] net: solve a NAPI race

2017-02-26 Thread Eric Dumazet
From: Eric Dumazet 

While playing with mlx4 hardware timestamping of RX packets, I found
that some packets were received by TCP stack with a ~200 ms delay...

Since the timestamp was provided by the NIC, and my probe was added
in tcp_v4_rcv() while in BH handler, I was confident it was not
a sender issue, or a drop in the network.

This would happen with a very low probability, but hurting RPC
workloads.

A NAPI driver normally arms the IRQ after the napi_complete_done(),
after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
it.

Problem is that if another point in the stack grabs NAPI_STATE_SCHED bit
while IRQ are not disabled, we might have later an IRQ firing and
finding this bit set, right before napi_complete_done() clears it.

This can happen with busy polling users, or if gro_flush_timeout is
used. But some other uses of napi_schedule() in drivers can cause this
as well.

This patch adds a new NAPI_STATE_MISSED bit, that napi_schedule_prep()
can set if it could not grab NAPI_STATE_SCHED.

Then napi_complete_done() properly reschedules the napi to make sure
we do not miss something.

Since we manipulate multiple bits at once, use cmpxchg() like in
sk_busy_loop() to provide proper transactions.

Signed-off-by: Eric Dumazet 
---
 include/linux/netdevice.h |   29 +++
 net/core/dev.c|   44 ++--
 2 files changed, 51 insertions(+), 22 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 
f40f0ab3847a8caaf46bd4d5f224c65014f501cc..97456b2539e46d6232dda804f6a434db6fd7134f
 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -330,6 +330,7 @@ struct napi_struct {
 
 enum {
NAPI_STATE_SCHED,   /* Poll is scheduled */
+   NAPI_STATE_MISSED,  /* reschedule a napi poll */
NAPI_STATE_DISABLE, /* Disable pending */
NAPI_STATE_NPSVC,   /* Netpoll - don't dequeue from poll_list */
NAPI_STATE_HASHED,  /* In NAPI hash (busy polling possible) */
@@ -338,12 +339,13 @@ enum {
 };
 
 enum {
-   NAPIF_STATE_SCHED= (1UL << NAPI_STATE_SCHED),
-   NAPIF_STATE_DISABLE  = (1UL << NAPI_STATE_DISABLE),
-   NAPIF_STATE_NPSVC= (1UL << NAPI_STATE_NPSVC),
-   NAPIF_STATE_HASHED   = (1UL << NAPI_STATE_HASHED),
-   NAPIF_STATE_NO_BUSY_POLL = (1UL << NAPI_STATE_NO_BUSY_POLL),
-   NAPIF_STATE_IN_BUSY_POLL = (1UL << NAPI_STATE_IN_BUSY_POLL),
+   NAPIF_STATE_SCHED= BIT(NAPI_STATE_SCHED),
+   NAPIF_STATE_MISSED   = BIT(NAPI_STATE_MISSED),
+   NAPIF_STATE_DISABLE  = BIT(NAPI_STATE_DISABLE),
+   NAPIF_STATE_NPSVC= BIT(NAPI_STATE_NPSVC),
+   NAPIF_STATE_HASHED   = BIT(NAPI_STATE_HASHED),
+   NAPIF_STATE_NO_BUSY_POLL = BIT(NAPI_STATE_NO_BUSY_POLL),
+   NAPIF_STATE_IN_BUSY_POLL = BIT(NAPI_STATE_IN_BUSY_POLL),
 };
 
 enum gro_result {
@@ -414,20 +416,7 @@ static inline bool napi_disable_pending(struct napi_struct 
*n)
return test_bit(NAPI_STATE_DISABLE, &n->state);
 }
 
-/**
- * napi_schedule_prep - check if NAPI can be scheduled
- * @n: NAPI context
- *
- * Test if NAPI routine is already running, and if not mark
- * it as running.  This is used as a condition variable to
- * insure only one NAPI poll instance runs.  We also make
- * sure there is no pending NAPI disable.
- */
-static inline bool napi_schedule_prep(struct napi_struct *n)
-{
-   return !napi_disable_pending(n) &&
-   !test_and_set_bit(NAPI_STATE_SCHED, &n->state);
-}
+bool napi_schedule_prep(struct napi_struct *n);
 
 /**
  * napi_schedule - schedule NAPI poll
diff --git a/net/core/dev.c b/net/core/dev.c
index 
304f2deae5f9897e60a79ed8b69d6ef208295ded..82d868049ba78260a5f28376842657b72bd31994
 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4884,6 +4884,32 @@ void __napi_schedule(struct napi_struct *n)
 EXPORT_SYMBOL(__napi_schedule);
 
 /**
+ * napi_schedule_prep - check if napi can be scheduled
+ * @n: napi context
+ *
+ * Test if NAPI routine is already running, and if not mark
+ * it as running.  This is used as a condition variable to
+ * ensure only one NAPI poll instance runs.  We also make
+ * sure there is no pending NAPI disable.
+ */
+bool napi_schedule_prep(struct napi_struct *n)
+{
+   unsigned long val, new;
+
+   do {
+   val = READ_ONCE(n->state);
+   if (unlikely(val & NAPIF_STATE_DISABLE))
+   return false;
+   new = val | NAPIF_STATE_SCHED;
+   if (unlikely(val & NAPIF_STATE_SCHED))
+   new |= NAPIF_STATE_MISSED;
+   } while (cmpxchg(&n->state, val, new) != val);
+
+   return !(val & NAPIF_STATE_SCHED);
+}
+EXPORT_SYMBOL(napi_schedule_prep);
+
+/**
  * __napi_schedule_irqoff - schedule for receive
  * @n: entry to schedule
  *
@@ -4897,7 +4923,7 @@ EXPORT_SYMBOL(__napi_schedule_irqoff);
 
 

Re: [PATCH net] xfrm: provide correct dst in xfrm_neigh_lookup

2017-02-26 Thread David Miller
From: Julian Anastasov 
Date: Sat, 25 Feb 2017 17:57:43 +0200

> Fix xfrm_neigh_lookup to provide dst->path to the
> neigh_lookup dst_ops method.
> 
> When skb is provided, the IP address in packet should already
> match the dst->path address family. But for the non-skb case,
> we should consider the last tunnel address as nexthop address.
> 
> Fixes: f894cbf847c9 ("net: Add optional SKB arg to dst_ops->neigh_lookup().")
> Signed-off-by: Julian Anastasov 

This looks good to me.

Steffen, I applied this directly to my tree, I hope you don't mind.


Re: [PATCH net 1/1] net sched actions: decrement module reference count after table flush.

2017-02-26 Thread David Miller
From: Roman Mashak 
Date: Fri, 24 Feb 2017 11:00:32 -0500

> When tc actions are loaded as a module and no actions have been installed,
> flushing them would result in actions removed from the memory, but modules
> reference count not being decremented, so that the modules would not be
> unloaded.
...
> Signed-off-by: Roman Mashak 
> Signed-off-by: Jamal Hadi Salim 

Applied and queued up for -stable, thanks.


Re: [PATCH net] rxrpc: Kernel calls get stuck in recvmsg

2017-02-26 Thread David Miller
From: David Howells 
Date: Fri, 24 Feb 2017 21:57:13 +

> Calls made through the in-kernel interface can end up getting stuck because
> of a missed variable update in a loop in rxrpc_recvmsg_data().  The problem
> is like this:
> 
>  (1) A new packet comes in and doesn't cause a notification to be given to
>  the client as there's still another packet in the ring - the
>  assumption being that the client will keep drawing off data until
>  the ring is empty.
> 
>  (2) The client is in rxrpc_recvmsg_data(), inside the big while loop that
>  iterates through the packets.  This copies the window pointers into
>  variables rather than using the information in the call struct
>  because:
> 
>  (a) MSG_PEEK might be in effect;
> 
>  (b) we need a barrier after reading call->rx_top to pair with the
>barrier in the softirq routine that loads the buffer.
> 
>  (3) The reading of call->rx_top is done outside of the loop, and top is
>  never updated whilst we're in the loop.  This means that even though
>  there's a new packet available, we don't see it and may return -EFAULT
>  to the caller - who will happily return to the scheduler and await the
>  next notification.
> 
>  (4) No further notifications are forthcoming until there's an abort as the
>  ring isn't empty.
> 
> The fix is to move the read of call->rx_top inside the loop - but it needs
> to be done before the condition is checked.
> 
> Reported-by: Marc Dionne 
> Signed-off-by: David Howells 
> Tested-by: Marc Dionne 

Applied, thanks.
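
A small userspace model of the bug shape and the fix described above,
for illustration only. It assumes a simplified consumer with no
sequence-number wrap handling; struct call, inject_once(),
drain_stale() and drain_fresh() are hypothetical names, not rxrpc code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct call {
	atomic_uint rx_top;	/* highest sequence number queued by the producer */
	unsigned int hard_ack;	/* highest sequence number consumed */
	int injected;		/* hook state: has the late packet arrived yet? */
};

/* Simulates a packet arriving while the consumer is inside its loop. */
static void inject_once(struct call *c)
{
	assert(c != NULL);
	if (!c->injected) {
		c->injected = 1;
		atomic_fetch_add_explicit(&c->rx_top, 1, memory_order_release);
	}
}

/* Buggy shape: top is sampled once, so the late packet is never seen. */
static unsigned int drain_stale(struct call *c)
{
	unsigned int top = atomic_load_explicit(&c->rx_top, memory_order_acquire);
	unsigned int seq = c->hard_ack + 1, n = 0;

	while (seq <= top) {
		inject_once(c);
		c->hard_ack = seq++;
		n++;
	}
	return n;
}

/* Fixed shape: top is re-read before each evaluation of the condition. */
static unsigned int drain_fresh(struct call *c)
{
	unsigned int seq = c->hard_ack + 1, n = 0;
	unsigned int top;

	while (top = atomic_load_explicit(&c->rx_top, memory_order_acquire),
	       seq <= top) {
		inject_once(c);
		c->hard_ack = seq++;
		n++;
	}
	return n;
}
```

Starting with one packet queued, drain_stale() consumes one packet and
leaves the late arrival in the ring, whereas drain_fresh() consumes both.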


Re: [bug report] rhashtable: Add nested tables

2017-02-26 Thread David Miller
From: Herbert Xu 
Date: Sat, 25 Feb 2017 22:38:11 +0800

> Subject: rhashtable: Fix use before NULL check in bucket_table_free
> 
> Dan Carpenter reported a use before NULL check bug in the function
> bucket_table_free.  In fact we don't need the NULL check at all as
> no caller can provide a NULL argument.  So this patch fixes this by
> simply removing it.
> 
> Reported-by: Dan Carpenter 
> Signed-off-by: Herbert Xu 

Also applied, thanks Herbert.


Re: [rhashtable] 5d60de5ff1 [ INFO: suspicious RCU usage. ]

2017-02-26 Thread David Miller
From: Herbert Xu 
Date: Sat, 25 Feb 2017 22:39:50 +0800

> Subject: rhashtable: Fix RCU dereference annotation in rht_bucket_nested
> 
> The current annotation is wrong as it says that we're only called
> under spinlock.  In fact it should be marked as under either
> spinlock or RCU read lock.
> 
> Fixes: da20420f83ea ("rhashtable: Add nested tables")
> Reported-by: Fengguang Wu 
> Signed-off-by: Herbert Xu 

Applied.


Re: [PATCH net] ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt

2017-02-26 Thread David Miller
From: Xin Long 
Date: Fri, 24 Feb 2017 16:29:06 +0800

> Commit 5e1859fbcc3c ("ipv4: ipmr: various fixes and cleanups") fixed
> the issue for ipv4 ipmr:
> 
>   ip_mroute_setsockopt() & ip_mroute_getsockopt() should not
>   access/set raw_sk(sk)->ipmr_table before making sure the socket
>   is a raw socket, and protocol is IGMP
> 
> The same fix should be done for ipv6 ipmr as well.
> 
> This patch can fix the panic caused by overwriting the same offset
> as ipmr_table as in raw_sk(sk) when accessing other type's socket
> by ip_mroute_setsockopt().
> 
> Signed-off-by: Xin Long 

Applied and queued up for -stable, thanks.


Re: [PATCH net] sctp: set sin_port for addr param when checking duplicate address

2017-02-26 Thread David Miller
From: Xin Long 
Date: Fri, 24 Feb 2017 15:18:46 +0800

> Commit b8607805dd15 ("sctp: not copying duplicate addrs to the assoc's
> bind address list") tried to check for duplicate address before copying
> to asoc's bind_addr list from global addr list.
> 
> But all the addrs' sin_ports in global addr list are 0 while the addrs'
> sin_ports are bp->port in asoc's bind_addr list. It means even if it's
> a duplicate address, af->cmp_addr will still return 0 as their
> sin_ports are different.
> 
> This patch is to fix it by setting the sin_port for addr param with
> bp->port before comparing the addrs.
> 
> Fixes: b8607805dd15 ("sctp: not copying duplicate addrs to the assoc's bind 
> address list")
> Reported-by: Wei Chen 
> Signed-off-by: Xin Long 

Applied and queued up for -stable, thanks.


Re: [PATCH net 1/1] net sched actions: do not overwrite status of action creation.

2017-02-26 Thread David Miller
From: Roman Mashak 
Date: Fri, 24 Feb 2017 17:36:58 -0500

> nla_memdup_cookie was overwriting err value, declared at function
> scope and earlier initialized with result of ->init(). At success
> nla_memdup_cookie() returns 0, and thus module refcnt decremented,
> although the action was installed.
 ...
> Fixes: 1045ba77a ("net sched actions: Add support for user cookies")
> Signed-off-by: Roman Mashak 
> Signed-off-by: Jamal Hadi Salim 

Applied, thanks.


Re: [PATCH] lib: Allow compile-testing of parman

2017-02-26 Thread David Miller
From: Geert Uytterhoeven 
Date: Fri, 24 Feb 2017 11:25:55 +0100

> This allows enabling and running the accompanying test (test_parman)
> without dependencies on other users of parman.
> 
> Signed-off-by: Geert Uytterhoeven 

Applied, thanks.


[bpf] 9d876e79df: BUG: unable to handle kernel paging request at 653a8346

2017-02-26 Thread kernel test robot
[   24.490615]  fput+0xd/0x10
[   24.492610]  task_work_run+0x57/0x80
[   24.494621]  do_exit+0x213/0x9e0
[   24.507034]  ? ___might_sleep+0xa1/0x140
[   24.509082]  do_group_exit+0x33/0x90
[   24.511027]  SyS_exit_group+0x16/0x20
[   24.513158]  do_fast_syscall_32+0x9a/0x160
[   24.515430]  entry_SYSENTER_32+0x4c/0x7b
[   24.527175] EIP: 0xb77c5cc5
[   24.528695] EFLAGS: 0292 CPU: 0
[   24.530581] EAX: ffda EBX:  ECX: 002d EDX: b77bc8ac
[   24.533601] ESI:  EDI: 0001 EBP: bfd1fb58 ESP: bfd1fa6c
[   24.536691]  DS: 007b ES: 007b FS:  GS:  SS: 007b
[   24.550841] Code: 55 89 e5 e8 48 0e 7f 00 5d c3 66 90 66 90 66 90 55 89 e5 
57 56 53 83 ec 10 e8 32 0e 7f 00 89 c7 89 45 ec 8b 00 89 55 f0 89 4d e8 <8b> 10 
39 c7 8d 58 f4 8d 72 f4 75 0b eb 3b 8d b4 26 00 00 00 00
[   24.560353] EIP: __wake_up_common+0x1b/0x70 SS:ESP: 0068:d6bb9e48
[   24.571547] CR2: 653a8346
[   24.573551] ---[ end trace d208903f8b9ffa11 ]---
[   24.576150] Kernel panic - not syncing: Fatal exception

git bisect start 36fd98883ef26e06ac5e1f99569930f19d59da0a 
7089db84e356562f8ba737c29e472cc42d530dbc --
git bisect  bad 91908381ef0f5509e823d518c3e7c97141620db3  # 11:58 17- 
18  Merge 
'linux-review/Codrut-Grosu/ASoC-ux500-Added-to-the-next-line/20170226-035023' 
into devel-catchup-201702260425
git bisect  bad 0838cbbd5637d7bb585c370d073f80008760c339  # 12:25  8-  
9  Merge 
'linux-review/Codrut-Grosu/ASoC-ux500-Added-blank-line-after-declarations/20170226-040437'
 into devel-catchup-201702260425
git bisect  bad b3dac24e69442026a2e46d0a770eff184bebf037  # 13:04  2-  
4  Merge 'linux-review/John-Fastabend/XDP-for-ixgbe/20170226-013816' into 
devel-catchup-201702260425
git bisect good 200859542c312ac76fa786c31d0fbf0adfb5d5ce  # 16:24206+  
0  0day base guard for 'devel-catchup-201702260425'
git bisect good 661091093918657ab544fb8ca91a5ab172a986dc  # 16:53203+  
0  net: ipv4: remove fib_lookup.h from devinet.c include list
git bisect good dfcb7a14866b8c34b2d3a74ae31631e1d4e7f591  # 17:11210+  
0  Merge branch 'ipvtap'
git bisect good 3b4735281f67b0aa62bf74c8a1a7758c17f7158d  # 17:30210+  
0  nfp: Use PCI_DEVICE_ID_NETRONOME_NFP* defines
git bisect good bd5ca062ba7d24bcc28f637aa90056f642a35dfa  # 17:50210+  
0  nfp: report NSP ABI version in ethtool FW version
git bisect  bad 1faaa78f36cb2915ae89138ba5846f87ade85dcb  # 17:58102- 
47  bnxt_en: use eth_hw_addr_random()
git bisect good daf1f1e7841138cb0e48d52c8573a5f064d8f495  # 18:28308+  
0  bnxt_en: Fix NULL pointer dereference in a failure path during open.
git bisect good ab42676af052e6d3502b31c2dc6b07af08ff126f  # 18:41307+  
0  net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
git bisect good 0e0372816b9cbd22c82e3e7cd36e8e74c58ba641  # 21:18310+  
0  net: mvpp2: switch to build_skb() in the RX path
git bisect good 29869d66870a715177bfb505f66a7e0e8bcc89c3  # 21:37310+  
0  tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
git bisect good d2852a2240509e512712e25de2d0796cda435ecb  # 21:57310+  
0  arch: add ARCH_HAS_SET_MEMORY config
git bisect  bad d54fef315399e0b16f8ae2b41167f34f8df12e88  # 22:25 10- 
11  Merge branch 'bpf-unlocking-fix'
git bisect  bad 9d876e79df6a2f364b9f2737eacd72ceb27da53a  # 08:58 38- 
20  bpf: fix unlocking of jited image when module ronx not set
# first bad commit: [9d876e79df6a2f364b9f2737eacd72ceb27da53a] bpf: fix 
unlocking of jited image when module ronx not set
git bisect good d2852a2240509e512712e25de2d0796cda435ecb  # 10:01910+  
0  arch: add ARCH_HAS_SET_MEMORY config
# extra tests on HEAD of linux-devel/devel-catchup-201702260425
git bisect  bad 36fd98883ef26e06ac5e1f99569930f19d59da0a  # 10:07 20- 
26  0day head guard for 'devel-catchup-201702260425'
# extra tests on tree/branch linus/master
git bisect  bad e5d56efc97f8240d0b5d66c03949382b6d7e5570  # 10:07  0-  
4  Merge tag 'watchdog-for-linus-v4.11' of 
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
# extra tests on tree/branch linux-next/master
git bisect  bad 3e7350242c6f3d41d28e03418bd781cc1b7bad5f  # 10:14  5-  
7  Add linux-next specific files for 20170224

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/lkp  Intel Corporation


dmesg-quantal-intel12-44:20170226223341:i386-randconfig-r0-201709:4.10.0-rc8-02017-g9d876e7:59.gz
Description: application/gzip
#!/bin/bash

kernel=$1
initrd=quantal-core-i386.cgz

wget --no-clobber 
https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd

kvm=(
qemu-system-x86_64
-enable-kvm
-cpu kvm64
-kernel $kernel
-initrd $initrd
-m 369
-smp 2
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-n

[PATCH net/ipv6] net/ipv6: avoid possible dead locking on addr_gen_mode sysctl

2017-02-26 Thread Felix Jia
The addr_gen_mode variable can be accessed by both sysctl and netlink.
Replace rtnl_lock() with rtnl_trylock() to protect the sysctl operation and
avoid a possible deadlock.

Signed-off-by: Felix Jia 
---
 net/ipv6/addrconf.c | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 3a2025f5bf2c..cfc485a8e1c0 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -5692,13 +5692,18 @@ static int addrconf_sysctl_addr_gen_mode(struct 
ctl_table *ctl, int write,
struct inet6_dev *idev = (struct inet6_dev *)ctl->extra1;
struct net *net = (struct net *)ctl->extra2;
 
+   if (!rtnl_trylock())
+   return restart_syscall();
+
ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
 
if (write) {
new_val = *((int *)ctl->data);
 
-   if (check_addr_gen_mode(new_val) < 0)
-   return -EINVAL;
+   if (check_addr_gen_mode(new_val) < 0) {
+   ret = -EINVAL;
+   goto out;
+   }
 
/* request for default */
if (&net->ipv6.devconf_dflt->addr_gen_mode == ctl->data) {
@@ -5707,20 +5712,23 @@ static int addrconf_sysctl_addr_gen_mode(struct 
ctl_table *ctl, int write,
/* request for individual net device */
} else {
if (!idev)
-   return ret;
+   goto out;
 
-   if (check_stable_privacy(idev, net, new_val) < 0)
-   return -EINVAL;
+   if (check_stable_privacy(idev, net, new_val) < 0) {
+   ret = -EINVAL;
+   goto out;
+   }
 
if (idev->cnf.addr_gen_mode != new_val) {
idev->cnf.addr_gen_mode = new_val;
-   rtnl_lock();
addrconf_dev_config(idev->dev);
-   rtnl_unlock();
}
}
}
 
+out:
+   rtnl_unlock();
+
return ret;
 }
 
-- 
2.11.0
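
The locking shape of the patch above (try the lock once at entry, route
every exit through a single unlock) can be sketched in userspace with a
pthread mutex. This is an analogy under stated assumptions, not the
kernel code: cfg_lock, mode and set_mode() are hypothetical, the valid
range 0..3 is invented for the example, and -EAGAIN stands in for what
restart_syscall() achieves in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static int mode;

static int set_mode(int new_val)
{
	int ret = 0;

	/* Never block on the lock; ask the caller to retry instead. */
	if (pthread_mutex_trylock(&cfg_lock) != 0)
		return -EAGAIN;

	if (new_val < 0 || new_val > 3) {
		ret = -EINVAL;
		goto out;	/* error paths must not return with the lock held */
	}

	mode = new_val;

out:
	pthread_mutex_unlock(&cfg_lock);
	return ret;
}
```

Taking the lock before any state is read or written is what removes the
window in which the netlink path could change the same variable.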



[PATCH] net: sgi: ioc3-eth: use new api ethtool_{get|set}_link_ksettings

2017-02-26 Thread Philippe Reynes
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/sgi/ioc3-eth.c |   14 --
 1 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c 
b/drivers/net/ethernet/sgi/ioc3-eth.c
index 57e6cef..52ead55 100644
--- a/drivers/net/ethernet/sgi/ioc3-eth.c
+++ b/drivers/net/ethernet/sgi/ioc3-eth.c
@@ -1558,25 +1558,27 @@ static void ioc3_get_drvinfo (struct net_device *dev,
strlcpy(info->bus_info, pci_name(ip->pdev), sizeof(info->bus_info));
 }
 
-static int ioc3_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+static int ioc3_get_link_ksettings(struct net_device *dev,
+  struct ethtool_link_ksettings *cmd)
 {
struct ioc3_private *ip = netdev_priv(dev);
int rc;
 
spin_lock_irq(&ip->ioc3_lock);
-   rc = mii_ethtool_gset(&ip->mii, cmd);
+   rc = mii_ethtool_get_link_ksettings(&ip->mii, cmd);
spin_unlock_irq(&ip->ioc3_lock);
 
return rc;
 }
 
-static int ioc3_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+static int ioc3_set_link_ksettings(struct net_device *dev,
+  const struct ethtool_link_ksettings *cmd)
 {
struct ioc3_private *ip = netdev_priv(dev);
int rc;
 
spin_lock_irq(&ip->ioc3_lock);
-   rc = mii_ethtool_sset(&ip->mii, cmd);
+   rc = mii_ethtool_set_link_ksettings(&ip->mii, cmd);
spin_unlock_irq(&ip->ioc3_lock);
 
return rc;
@@ -1608,10 +1610,10 @@ static u32 ioc3_get_link(struct net_device *dev)
 
 static const struct ethtool_ops ioc3_ethtool_ops = {
.get_drvinfo= ioc3_get_drvinfo,
-   .get_settings   = ioc3_get_settings,
-   .set_settings   = ioc3_set_settings,
.nway_reset = ioc3_nway_reset,
.get_link   = ioc3_get_link,
+   .get_link_ksettings = ioc3_get_link_ksettings,
+   .set_link_ksettings = ioc3_set_link_ksettings,
 };
 
 static int ioc3_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
-- 
1.7.4.4



Re: [PATCH net] net/mlx4_en: fix overflow in mlx4_en_init_timestamp()

2017-02-26 Thread David Miller
From: Eric Dumazet 
Date: Thu, 23 Feb 2017 15:22:43 -0800

> From: Eric Dumazet 
> 
> The cited commit does a great job of finding optimal shift/multiplier
> values assuming a 10 seconds wrap around, but forgot to change the
> overflow_period computation.
> 
> It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms,
> which is silly.
> 
> Let's simply use 5 seconds, no need to recompute this, given how it is
> supposed to work.
> 
> Later, we will use a timer instead of a work queue, since the new RX
> allocation scheme will no longer need mlx4_en_recover_from_oom() and the
> service_task firing every 250 ms.
> 
> Fixes: 31c128b66e5b ("net/mlx4_en: Choose time-stamping shift value according 
> to HW frequency")
> Signed-off-by: Eric Dumazet 

Applied, thanks.


[PATCH][V2] rtlwifi: rtl8192de: fix spelling mistake: "althougth" -> "though"

2017-02-26 Thread Colin King
From: Colin Ian King 

trivial fix to spelling mistake in RT_TRACE message

Signed-off-by: Colin Ian King 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
index de98d88..dcb5d83 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
@@ -812,7 +812,7 @@ bool rtl92d_phy_config_rf_with_headerfile(struct 
ieee80211_hw *hw,
 * pathA or mac1 has to set phy0 pathA */
if ((content == radiob_txt) && (rfpath == RF90_PATH_A)) {
RT_TRACE(rtlpriv, COMP_INIT, DBG_LOUD,
-" ===> althougth Path A, we load radiob.txt\n");
+" ===> though Path A, we load radiob.txt\n");
radioa_arraylen = radiob_arraylen;
radioa_array_table = radiob_array_table;
}
-- 
2.10.2



Re: [PATCH][V2] rtlwifi: rtl8192de: fix spelling mistake: "althougth" -> "though"

2017-02-26 Thread Larry Finger

On 02/26/2017 12:52 PM, Colin King wrote:

From: Colin Ian King 

trivial fix to spelling mistake in RT_TRACE message

Signed-off-by: Colin Ian King 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


Acked-by: Larry Finger 

Thanks,

Larry



diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
index de98d88..dcb5d83 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
@@ -812,7 +812,7 @@ bool rtl92d_phy_config_rf_with_headerfile(struct 
ieee80211_hw *hw,
 * pathA or mac1 has to set phy0 pathA */
if ((content == radiob_txt) && (rfpath == RF90_PATH_A)) {
RT_TRACE(rtlpriv, COMP_INIT, DBG_LOUD,
-" ===> althougth Path A, we load radiob.txt\n");
+" ===> though Path A, we load radiob.txt\n");
radioa_arraylen = radiob_arraylen;
radioa_array_table = radiob_array_table;
}





Re: [PATCH][V2] rtlwifi: rtl8192de: fix spelling mistake: "althougth" -> "though"

2017-02-26 Thread Colin Ian King
On 26/02/17 18:52, Colin King wrote:
> From: Colin Ian King 
> 
> trivial fix to spelling mistake in RT_TRACE message
> 
> Signed-off-by: Colin Ian King 
> ---
>  drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c 
> b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
> index de98d88..dcb5d83 100644
> --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
> +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
> @@ -812,7 +812,7 @@ bool rtl92d_phy_config_rf_with_headerfile(struct 
> ieee80211_hw *hw,
>* pathA or mac1 has to set phy0 pathA */
>   if ((content == radiob_txt) && (rfpath == RF90_PATH_A)) {
>   RT_TRACE(rtlpriv, COMP_INIT, DBG_LOUD,
> -  " ===> althougth Path A, we load radiob.txt\n");
> +  " ===> though Path A, we load radiob.txt\n");
>   radioa_arraylen = radiob_arraylen;
>   radioa_array_table = radiob_array_table;
>   }
> 
OOPS, ignore that.


Re: [PATCH net] net/mlx4_en: fix overflow in mlx4_en_init_timestamp()

2017-02-26 Thread Tariq Toukan



On 24/02/2017 1:22 AM, Eric Dumazet wrote:

From: Eric Dumazet 

The cited commit makes a great job of finding optimal shift/multiplier
values assuming a 10 seconds wrap around, but forgot to change the
overflow_period computation.

It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms,
which is silly.

Let's simply use 5 seconds, no need to recompute this, given how it is
supposed to work.

Later, we will use a timer instead of a work queue, since the new RX
allocation scheme will no longer need mlx4_en_recover_from_oom() and the
service_task firing every 250 ms.

Fixes: 31c128b66e5b ("net/mlx4_en: Choose time-stamping shift value according to HW 
frequency")
Signed-off-by: Eric Dumazet 
Cc: Tariq Toukan 
Cc: Eugenia Emantayev 
---
 drivers/net/ethernet/mellanox/mlx4/en_clock.c |   18 +++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |1
 2 files changed, 8 insertions(+), 11 deletions(-)



Reviewed-by: Tariq Toukan 

Thanks for your patch.
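
For readers following along, the kind of overflow described in the commit message can be sketched as below. This is only an illustration, assuming a simplified cycles-to-nanoseconds conversion of the form (cycles * mult) >> shift computed in 64 bits. The names toy_cyc2ns() and toy_cyc2ns_would_overflow() are hypothetical and are not the kernel helpers, and the input values used with them are made up rather than taken from the mlx4 hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified conversion: the product is computed in 64 bits,
 * so it silently wraps once cycles * mult no longer fits.
 */
uint64_t toy_cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	assert(shift < 64);	/* shifting by 64 or more is undefined in C */
	return (cycles * (uint64_t)mult) >> shift;
}

/* Returns 1 when cycles * mult does not fit in 64 bits, 0 otherwise. */
int toy_cyc2ns_would_overflow(uint64_t cycles, uint32_t mult)
{
	return mult != 0 && cycles > UINT64_MAX / mult;
}
```

With a large multiplier, a cycle count that corresponds to several seconds can exceed the 64-bit range, and the wrapped product then yields a much shorter period than intended.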



[PATCH] net: rocker: use new api ethtool_{get|set}_link_ksettings

2017-02-26 Thread Philippe Reynes
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/rocker/rocker_main.c |   55 +
 1 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker_main.c 
b/drivers/net/ethernet/rocker/rocker_main.c
index 0f63a44..b712ec2 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -1115,7 +1115,7 @@ int rocker_cmd_exec(struct rocker_port *rocker_port, bool 
nowait,
  const struct rocker_desc_info 
*desc_info,
  void *priv)
 {
-   struct ethtool_cmd *ecmd = priv;
+   struct ethtool_link_ksettings *ecmd = priv;
const struct rocker_tlv *attrs[ROCKER_TLV_CMD_MAX + 1];
const struct rocker_tlv *info_attrs[ROCKER_TLV_CMD_PORT_SETTINGS_MAX + 
1];
u32 speed;
@@ -1137,13 +1137,14 @@ int rocker_cmd_exec(struct rocker_port *rocker_port, 
bool nowait,
duplex = 
rocker_tlv_get_u8(info_attrs[ROCKER_TLV_CMD_PORT_SETTINGS_DUPLEX]);
autoneg = 
rocker_tlv_get_u8(info_attrs[ROCKER_TLV_CMD_PORT_SETTINGS_AUTONEG]);
 
-   ecmd->transceiver = XCVR_INTERNAL;
-   ecmd->supported = SUPPORTED_TP;
-   ecmd->phy_address = 0xff;
-   ecmd->port = PORT_TP;
-   ethtool_cmd_speed_set(ecmd, speed);
-   ecmd->duplex = duplex ? DUPLEX_FULL : DUPLEX_HALF;
-   ecmd->autoneg = autoneg ? AUTONEG_ENABLE : AUTONEG_DISABLE;
+   ethtool_link_ksettings_zero_link_mode(ecmd, supported);
+   ethtool_link_ksettings_add_link_mode(ecmd, supported, TP);
+
+   ecmd->base.phy_address = 0xff;
+   ecmd->base.port = PORT_TP;
+   ecmd->base.speed = speed;
+   ecmd->base.duplex = duplex ? DUPLEX_FULL : DUPLEX_HALF;
+   ecmd->base.autoneg = autoneg ? AUTONEG_ENABLE : AUTONEG_DISABLE;
 
return 0;
 }
@@ -1250,7 +1251,7 @@ struct port_name {
  struct rocker_desc_info *desc_info,
  void *priv)
 {
-   struct ethtool_cmd *ecmd = priv;
+   struct ethtool_link_ksettings *ecmd = priv;
struct rocker_tlv *cmd_info;
 
if (rocker_tlv_put_u16(desc_info, ROCKER_TLV_CMD_TYPE,
@@ -1263,13 +1264,13 @@ struct port_name {
   rocker_port->pport))
return -EMSGSIZE;
if (rocker_tlv_put_u32(desc_info, ROCKER_TLV_CMD_PORT_SETTINGS_SPEED,
-  ethtool_cmd_speed(ecmd)))
+  ecmd->base.speed))
return -EMSGSIZE;
if (rocker_tlv_put_u8(desc_info, ROCKER_TLV_CMD_PORT_SETTINGS_DUPLEX,
- ecmd->duplex))
+ ecmd->base.duplex))
return -EMSGSIZE;
if (rocker_tlv_put_u8(desc_info, ROCKER_TLV_CMD_PORT_SETTINGS_AUTONEG,
- ecmd->autoneg))
+ ecmd->base.autoneg))
return -EMSGSIZE;
rocker_tlv_nest_end(desc_info, cmd_info);
return 0;
@@ -1347,8 +1348,9 @@ struct port_name {
return 0;
 }
 
-static int rocker_cmd_get_port_settings_ethtool(struct rocker_port 
*rocker_port,
-   struct ethtool_cmd *ecmd)
+static int
+rocker_cmd_get_port_settings_ethtool(struct rocker_port *rocker_port,
+struct ethtool_link_ksettings *ecmd)
 {
return rocker_cmd_exec(rocker_port, false,
   rocker_cmd_get_port_settings_prep, NULL,
@@ -1373,12 +1375,17 @@ static int rocker_cmd_get_port_settings_mode(struct 
rocker_port *rocker_port,
   rocker_cmd_get_port_settings_mode_proc, p_mode);
 }
 
-static int rocker_cmd_set_port_settings_ethtool(struct rocker_port 
*rocker_port,
-   struct ethtool_cmd *ecmd)
+static int
+rocker_cmd_set_port_settings_ethtool(struct rocker_port *rocker_port,
+const struct ethtool_link_ksettings *ecmd)
 {
+   struct ethtool_link_ksettings copy_ecmd;
+
+   memcpy(&copy_ecmd, ecmd, sizeof(copy_ecmd));
+
return rocker_cmd_exec(rocker_port, false,
   rocker_cmd_set_port_settings_ethtool_prep,
-  ecmd, NULL, NULL);
+  &copy_ecmd, NULL, NULL);
 }
 
 static int rocker_cmd_set_port_settings_macaddr(struct rocker_port 
*rocker_port,
@@ -2237,16 +2244,18 @@ static int rocker_router_fib_event(struct 
notifier_block *nb,
  * ethtool interface
  /
 
-static int rocker_port_get_settings(struct net_device *dev,
-   struct ethtool_cmd *ecmd)
+static int

Re: [PATCH net] net/mlx4_en: reception NAPI/IRQ race breaker

2017-02-26 Thread Eric Dumazet
On Sun, 2017-02-26 at 09:40 -0800, Eric Dumazet wrote:
> NAPI_STATE_SCHED
> 
> Actually we could use an additional bit for that, that the driver would
> set even if NAPI_STATE_SCHED could not be grabbed.

Just to be clear :

Drivers would require no change, this would be done in
existing helpers.





Re: [PATCH net] net/mlx4_en: reception NAPI/IRQ race breaker

2017-02-26 Thread Eric Dumazet
On Sun, 2017-02-26 at 09:34 -0800, Eric Dumazet wrote:

> I do not believe this bug is mlx4 specific.
> 
> Anything doing the following while hard irq were not masked :
> 
> local_bh_disable();
> napi_reschedule(&priv->rx_cq[ring]->napi);
> local_bh_enable();
> 
> Like in mlx4_en_recover_from_oom()
> 
> Can trigger the issue really.
> 
> Unfortunately I do not see how core layer can handle this.
> Only the driver hard irq could possibly know that it could not grab
> NAPI_STATE_SCHED

Actually we could use an additional bit for that, that the driver would
set even if NAPI_STATE_SCHED could not be grabbed.

Let me try something.






Re: [PATCH net] net/mlx4_en: reception NAPI/IRQ race breaker

2017-02-26 Thread Eric Dumazet
On Sun, 2017-02-26 at 18:32 +0200, Saeed Mahameed wrote:
> On Sat, Feb 25, 2017 at 4:22 PM, Eric Dumazet  wrote:
> > From: Eric Dumazet 
> >
> > While playing with hardware timestamping of RX packets, I found
> > that some packets were received by TCP stack with a ~200 ms delay...
> >
> > Since the timestamp was provided by the NIC, and my probe was added
> > in tcp_v4_rcv() while in BH handler, I was confident it was not
> > a sender issue, or a drop in the network.
> >
> > This would happen with a very low probability, but hurting RPC
> > workloads.
> >
> > I could track this down to the cx-3 (mlx4 driver) card that was
> > apparently sending an IRQ before we would arm it.
> >
> 
> Hi Eric,
> 
> This is highly unlikely, the hardware should not do that, and if this
> is really the case
> we need to hunt down the root cause and not work around it.

Well, I definitely see the interrupt coming while the napi bit is not
available.


> 
> > A NAPI driver normally arms the IRQ after the napi_complete_done(),
> > after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
> > it.
> >
> > This patch adds a new rx_irq_miss field that is incremented every time
> > the hard IRQ handler could not grab NAPI_STATE_SCHED.
> >
> > Then, mlx4_en_poll_rx_cq() is able to detect that rx_irq was incremented
> > and attempts to read more packets, if it can re-acquire NAPI_STATE_SCHED
> 
> Are you sure this is not some kind of race condition with the busy
> polling mechanism
> Introduced in ("net/mlx4_en: use napi_complete_done() return value") ?
> Maybe the busy polling thread when it detects that it wants to yield,
> it arms the CQ too early (when napi is not ready)?
> 

Good question.

No busy polling in my tests.

I can trigger it on a 2x10 Gbit host

(bonding of two 10Gbit mlx4 ports)


ethtool -C eth1 adaptive-rx on rx-usecs-low 0
ethtool -C eth2 adaptive-rx on rx-usecs-low 0
./super_netperf 9 -H lpaa6 -t TCP_RR -l 1 -- -r 100,100 &


4 rx and tx queues per NIC, fq packet scheduler on them.

TCP timestamps are on. (this might be important to get the last packet
of a given size. In my case 912 bytes, with a PSH flag)


> Anyway Tariq and I would like to further investigate the fired IRQ
> while CQ is not armed.  It smells
> like  a bug in the driver/napi level, it is not a HW expected behavior.
> 
> Any pointers on how to reproduce ?  how often would the "rx_irq_miss"
> counter advance on a linerate RX load ?


About 1000 times per second on my hosts, receiving about 1.2 Mpps.
But most of these misses are not an issue because next packet is
arriving maybe less than 10 usec later.

Note that the bug is hard to notice, because TCP would fast retransmit,
or the next packet was coming soon enough.

You have to be unlucky enough that the RX queue that missed the NAPI
schedule receive no more packets before the 200 ms RTO timer, and the
packet stuck in the RX ring is the last packet of the RPC.


I believe part of the problem is that NAPI_STATE_SCHED can be grabbed by
a process while hard irqs were not disabled.

I do not believe this bug is mlx4 specific.

Anything doing the following while hard irq were not masked :

local_bh_disable();
napi_reschedule(&priv->rx_cq[ring]->napi);
local_bh_enable();

Like in mlx4_en_recover_from_oom()

Can trigger the issue really.

Unfortunately I do not see how core layer can handle this.
Only the driver hard irq could possibly know that it could not grab
NAPI_STATE_SCHED
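
The race discussed above can be modelled in user space roughly as follows. This is a toy sketch with C11 atomics, not kernel code: struct toy_napi, toy_schedule_prep(), toy_hard_irq() and toy_complete() are hypothetical stand-ins for the NAPI_STATE_SCHED bit, the hard IRQ handler and the rx_irq_missed counter mentioned in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_napi {
	atomic_bool sched;	/* stands in for NAPI_STATE_SCHED */
	atomic_uint irq_missed;	/* stands in for rx_irq_missed */
};

/* Try to take ownership of polling; only one caller can succeed. */
bool toy_schedule_prep(struct toy_napi *n)
{
	bool expected = false;

	assert(n != NULL);
	return atomic_compare_exchange_strong(&n->sched, &expected, true);
}

/* Hard IRQ handler: if the bit is already held, record the missed event. */
bool toy_hard_irq(struct toy_napi *n)
{
	if (toy_schedule_prep(n))
		return true;
	atomic_fetch_add(&n->irq_missed, 1);
	return false;
}

/* Poll completion: release the bit so the next IRQ can grab it. */
void toy_complete(struct toy_napi *n)
{
	atomic_store(&n->sched, false);
}
```

If process context grabs the bit first (as napi_reschedule() would), an interrupt arriving in that window cannot schedule the poll, and only the counter records that an event was missed.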






Re: [PATCH] rtlwifi: rtl8192de: ix spelling mistake: "althougth" -> "although"

2017-02-26 Thread Larry Finger

On 02/26/2017 09:19 AM, Colin King wrote:

From: Colin Ian King 

trivial fix to spelling mistake in RT_TRACE message

Signed-off-by: Colin Ian King 


Bad fix. It should be althougth => through. Please read the context.

NACK.

Larry


---
 drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
index de98d88..dcb5d83 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
@@ -812,7 +812,7 @@ bool rtl92d_phy_config_rf_with_headerfile(struct 
ieee80211_hw *hw,
 * pathA or mac1 has to set phy0 pathA */
if ((content == radiob_txt) && (rfpath == RF90_PATH_A)) {
RT_TRACE(rtlpriv, COMP_INIT, DBG_LOUD,
-" ===> althougth Path A, we load radiob.txt\n");
+" ===> although Path A, we load radiob.txt\n");
radioa_arraylen = radiob_arraylen;
radioa_array_table = radiob_array_table;
}





[PATCH net] l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv

2017-02-26 Thread Paul Hüber
l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.

Signed-off-by: Paul Hüber 
---
 net/l2tp/l2tp_ip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 28c21546d5b6..3ed30153a6f5 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -381,7 +381,7 @@ static int l2tp_ip_backlog_recv(struct sock *sk, struct 
sk_buff *skb)
 drop:
IP_INC_STATS(sock_net(sk), IPSTATS_MIB_INDISCARDS);
kfree_skb(skb);
-   return -1;
+   return 0;
 }
 
 /* Userspace will call sendmsg() on the tunnel socket to send L2TP
-- 
2.11.1
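
To make the failure mode in the commit message concrete, here is a minimal sketch of how a negative handler return value can be read as a protocol number. It assumes the delivery path negates a negative return and resubmits the packet to that protocol. toy_resubmit_protocol() and TOY_NO_RESUBMIT are hypothetical names used for illustration only.

```c
#include <assert.h>

#define TOY_NO_RESUBMIT (-1)	/* hypothetical marker: delivery is finished */

/* Returns the protocol number the packet would be resubmitted to,
 * or TOY_NO_RESUBMIT when the handler consumed the packet.
 */
int toy_resubmit_protocol(int handler_ret)
{
	assert(handler_ret > -256);	/* an IP protocol number fits in 8 bits */
	if (handler_ret < 0)
		return -handler_ret;	/* -1 is read as protocol 1 (ICMP) */
	return TOY_NO_RESUBMIT;
}
```

Under that assumption, returning -1 after kfree_skb() would hand an already freed skb to another protocol handler, whereas returning 0 ends delivery.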



Re: [PATCH] net: s2io: fix typo argumnet argument

2017-02-26 Thread David Miller
From: Corentin Labbe 
Date: Sat, 25 Feb 2017 21:12:41 +0100

> This commit fix the typo argumnet/argument
> 
> Signed-off-by: Corentin Labbe 

Applied.


Re: [PATCH] net: vxge: fix typo argumnet argument

2017-02-26 Thread David Miller
From: Corentin Labbe 
Date: Sat, 25 Feb 2017 21:08:57 +0100

> This commit fix the typo argumnet/argument
> 
> Signed-off-by: Corentin Labbe 

Applied.


Re: [PATCH net] net/mlx4_en: reception NAPI/IRQ race breaker

2017-02-26 Thread Saeed Mahameed
On Sat, Feb 25, 2017 at 4:22 PM, Eric Dumazet  wrote:
> From: Eric Dumazet 
>
> While playing with hardware timestamping of RX packets, I found
> that some packets were received by TCP stack with a ~200 ms delay...
>
> Since the timestamp was provided by the NIC, and my probe was added
> in tcp_v4_rcv() while in BH handler, I was confident it was not
> a sender issue, or a drop in the network.
>
> This would happen with a very low probability, but hurting RPC
> workloads.
>
> I could track this down to the cx-3 (mlx4 driver) card that was
> apparently sending an IRQ before we would arm it.
>

Hi Eric,

This is highly unlikely, the hardware should not do that, and if this
is really the case
we need to hunt down the root cause and not work around it.

> A NAPI driver normally arms the IRQ after the napi_complete_done(),
> after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
> it.
>
> This patch adds a new rx_irq_miss field that is incremented every time
> the hard IRQ handler could not grab NAPI_STATE_SCHED.
>
> Then, mlx4_en_poll_rx_cq() is able to detect that rx_irq was incremented
> and attempts to read more packets, if it can re-acquire NAPI_STATE_SCHED

Are you sure this is not some kind of race condition with the busy
polling mechanism
Introduced in ("net/mlx4_en: use napi_complete_done() return value") ?
Maybe the busy polling thread when it detects that it wants to yield,
it arms the CQ too early (when napi is not ready)?

Anyway Tariq and I would like to further investigate the fired IRQ
while CQ is not armed.  It smells
like  a bug in the driver/napi level, it is not a HW expected behavior.

Any pointers on how to reproduce ?  how often would the "rx_irq_miss"
counter advance on a linerate RX load ?

>
> Note that this work around would probably not work if the IRQ is spread
> over many cpus, since it assume the hard irq and softirq are handled by
> the same cpu. This kind of setup is buggy anyway because of reordering
> issues.
>
> Signed-off-by: Eric Dumazet 
> Cc: Tariq Toukan 
> Cc: Saeed Mahameed 
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c   |   32 +
>  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |1
>  2 files changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c 
> b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index 
> 867292880c07a15124a0cf099d1fcda09926548e..7c262dc6a9971ca99a890bc3cff49289f0ef43fc
>  100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -1132,10 +1132,14 @@ void mlx4_en_rx_irq(struct mlx4_cq *mcq)
> struct mlx4_en_cq *cq = container_of(mcq, struct mlx4_en_cq, mcq);
> struct mlx4_en_priv *priv = netdev_priv(cq->dev);
>
> -   if (likely(priv->port_up))
> -   napi_schedule_irqoff(&cq->napi);
> -   else
> +   if (likely(priv->port_up)) {
> +   if (napi_schedule_prep(&cq->napi))
> +   __napi_schedule_irqoff(&cq->napi);
> +   else
> +   cq->rx_irq_missed++;
> +   } else {
> mlx4_en_arm_cq(priv, cq);
> +   }
>  }
>
>  /* Rx CQ polling - called by NAPI */
> @@ -1144,9 +1148,12 @@ int mlx4_en_poll_rx_cq(struct napi_struct *napi, int 
> budget)
> struct mlx4_en_cq *cq = container_of(napi, struct mlx4_en_cq, napi);
> struct net_device *dev = cq->dev;
> struct mlx4_en_priv *priv = netdev_priv(dev);
> -   int done;
> +   int done = 0;
> +   u32 rx_irq_missed;
>
> -   done = mlx4_en_process_rx_cq(dev, cq, budget);
> +again:
> +   rx_irq_missed = READ_ONCE(cq->rx_irq_missed);
> +   done += mlx4_en_process_rx_cq(dev, cq, budget - done);
>
> /* If we used up all the quota - we're probably not done yet... */
> if (done == budget) {
> @@ -1171,10 +1178,21 @@ int mlx4_en_poll_rx_cq(struct napi_struct *napi, int 
> budget)
>  */
> if (done)
> done--;
> +   if (napi_complete_done(napi, done))
> +   mlx4_en_arm_cq(priv, cq);
> +   return done;
> }
> -   /* Done for now */
> -   if (napi_complete_done(napi, done))
> +   if (unlikely(READ_ONCE(cq->rx_irq_missed) != rx_irq_missed))
> +   goto again;
> +
> +   if (napi_complete_done(napi, done)) {
> mlx4_en_arm_cq(priv, cq);
> +
> +   /* We might have received an interrupt too soon */
> +   if (unlikely(READ_ONCE(cq->rx_irq_missed) != rx_irq_missed) &&
> +   napi_reschedule(napi))
> +   goto again;
> +   }
> return done;
>  }
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
> b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> index 
> 

Re: [PATCH] lib: fix spelling mistake: "actualy" -> "actually"

2017-02-26 Thread David Miller
From: Colin King 
Date: Sun, 26 Feb 2017 12:10:12 +

> From: Colin Ian King 
> 
> trivial fix to spelling mistake in pr_err message
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH net] ipv4: add missing initialization for flowi4_uid

2017-02-26 Thread David Miller
From: Julian Anastasov 
Date: Sun, 26 Feb 2017 15:50:52 +0200

> Avoid matching of random stack value for uid when rules
> are looked up on input route or when RP filter is used.
> Problem should affect only setups that use ip rules with
> uid range.
> 
> Fixes: 622ec2c9d524 ("net: core: add UID to flows, rules, and routes")
> Signed-off-by: Julian Anastasov 

Applied and queued up for -stable.


Re: [PATCH net] ipv4: mask tos for input route

2017-02-26 Thread David Miller
From: Julian Anastasov 
Date: Sun, 26 Feb 2017 17:14:35 +0200

> Restore the lost masking of TOS in input route code to
> allow ip rules to match it properly.
> 
> Problem [1] noticed by Shmulik Ladkani 
> 
> [1] http://marc.info/?t=137331755300040&r=1&w=2
> 
> Fixes: 89aef8921bfb ("ipv4: Delete routing cache.")
> Signed-off-by: Julian Anastasov 

Our TOS handling in the routing code is certainly more subtle than it
needs to be.

Applied and queued up for -stable.


[PATCH] rtlwifi: rtl8192de: ix spelling mistake: "althougth" -> "although"

2017-02-26 Thread Colin King
From: Colin Ian King 

trivial fix to spelling mistake in RT_TRACE message

Signed-off-by: Colin Ian King 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
index de98d88..dcb5d83 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
@@ -812,7 +812,7 @@ bool rtl92d_phy_config_rf_with_headerfile(struct 
ieee80211_hw *hw,
 * pathA or mac1 has to set phy0 pathA */
if ((content == radiob_txt) && (rfpath == RF90_PATH_A)) {
RT_TRACE(rtlpriv, COMP_INIT, DBG_LOUD,
-" ===> althougth Path A, we load radiob.txt\n");
+" ===> although Path A, we load radiob.txt\n");
radioa_arraylen = radiob_arraylen;
radioa_array_table = radiob_array_table;
}
-- 
2.10.2



[PATCH net] ipv4: mask tos for input route

2017-02-26 Thread Julian Anastasov
Restore the lost masking of TOS in input route code to
allow ip rules to match it properly.

Problem [1] noticed by Shmulik Ladkani 

[1] http://marc.info/?t=137331755300040&r=1&w=2

Fixes: 89aef8921bfb ("ipv4: Delete routing cache.")
Signed-off-by: Julian Anastasov 
---
 net/ipv4/route.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 584ed66..8471dd1 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2009,6 +2009,7 @@ int ip_route_input_noref(struct sk_buff *skb, __be32 
daddr, __be32 saddr,
 {
int res;
 
+   tos &= IPTOS_RT_MASK;
rcu_read_lock();
 
/* Multicast recognition logic is moved from route cache to here.
-- 
1.9.3
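
The effect of the one-line masking can be sketched as follows. The TOY_ constants are assumed to mirror the kernel's IPTOS_TOS_MASK (0x1E) and IPTOS_RT_MASK (the TOS mask with the two low bits cleared). toy_route_tos() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_IPTOS_TOS_MASK 0x1E				/* assumed header value */
#define TOY_IPTOS_RT_MASK  (TOY_IPTOS_TOS_MASK & ~3)	/* evaluates to 0x1C */

/* Keep only the TOS bits that routing rules are expected to match on. */
uint8_t toy_route_tos(uint8_t tos)
{
	assert((TOY_IPTOS_RT_MASK & 3) == 0);	/* the two low bits never survive */
	return tos & TOY_IPTOS_RT_MASK;
}
```

Without the mask, a packet whose TOS byte carries extra bits would not compare equal to the value configured in an ip rule.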



Re: [PATCH] iproute2: show network device dependency tree

2017-02-26 Thread Jiri Pirko
Sun, Feb 26, 2017 at 03:00:14PM CET, zaboj.camp...@post.cz wrote:
>On Sun, 2017-02-26 at 08:56 +0100, Jiri Pirko wrote:
>> Sat, Feb 25, 2017 at 09:22:22PM CET, zaboj.camp...@post.cz wrote:
>> > On Sat, 2017-02-25 at 18:39 +0100, Jiri Pirko wrote:
>> > > > Sat, Feb 25, 2017 at 05:59:00PM CET, zaboj.camp...@post.cz
>> > > > wrote:
>> > > > Add the argument '-tree' to ip-link to show network devices
>> > > > dependency tree.
>> > > > 
>> > > > Example:
>> > > > 
>> > > > $ ip -tree link
>> > > > eth0
>> > > >    bond0
>> > > > eth1
>> > > >    bond0
>> > > > eth2
>> > > >    bond1
>> > > > eth3
>> > > >    bond1
>> > > 
>> > > 
>> > > Hmm, what is this good for? I'm probably missing something...
>> > 
>> > I consider this kind of output useful when troubleshooting a complex
>> > configuration with many interfaces. It may show relations among
>> > interfaces.
>> 
>> Did you see https://github.com/jbenc/plotnetcfg ?
>> 
>
>Thanks for the link. I haven't seen plotnetcfg and I like it.
>It is handy when the analyzed system has GUI.

You can also run it remotely. Also I believe that you can catch the
state into some dump file and process it later on. Not 100% sure though.
Ccing Jiri Benc who is the original author of plotnetcfg.


Re: [PATCH] lib: fix spelling mistake: "actualy" -> "actually"

2017-02-26 Thread Jiri Pirko
Sun, Feb 26, 2017 at 01:10:12PM CET, colin.k...@canonical.com wrote:
>From: Colin Ian King 
>
>trivial fix to spelling mistake in pr_err message
>
>Signed-off-by: Colin Ian King 

Acked-by: Jiri Pirko 


>---
> lib/test_parman.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/lib/test_parman.c b/lib/test_parman.c
>index fe9f3a7..35e3224 100644
>--- a/lib/test_parman.c
>+++ b/lib/test_parman.c
>@@ -334,7 +334,7 @@ static int test_parman_check_array(struct test_parman 
>*test_parman,
>   last_priority = item->prio->priority;
> 
>   if (item->parman_item.index != i) {
>-  pr_err("Item has different index in compare to where it 
>actualy is (%lu != %d)\n",
>+  pr_err("Item has different index in compare to where it 
>actually is (%lu != %d)\n",
>  item->parman_item.index, i);
>   return -EINVAL;
>   }
>-- 
>2.10.2
>


RE: FOR YOUR INFORMATION

2017-02-26 Thread Eddie Concepcion



From: Eddie Concepcion
Sent: Sunday, February 26, 2017 9:11 AM
To: Eddie Concepcion
Subject: FOR YOUR INFORMATION



Hi,

You have received a donation of 1,000,000USD; reply to 
merlebutler...@gmail.com, for details.


CONFIDENTIALITY NOTICE: This email message, including any attachments, is for 
the sole use of the intended recipient(s) and may contain confidential and 
privileged information.  Any unauthorized use, disclosure or distribution is 
prohibited.  If you are not the intended recipient, please discard the message 
immediately and inform the sender that the message was sent in error.



[PATCH net] ipv4: add missing initialization for flowi4_uid

2017-02-26 Thread Julian Anastasov
Avoid matching of random stack value for uid when rules
are looked up on input route or when RP filter is used.
Problem should affect only setups that use ip rules with
uid range.

Fixes: 622ec2c9d524 ("net: core: add UID to flows, rules, and routes")
Signed-off-by: Julian Anastasov 
---
 net/ipv4/fib_frontend.c | 6 +++---
 net/ipv4/route.c| 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

I'm not sure if this is the correct way to initialize the uid. I see
other places that simply do memset and use 0 for uid.

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 7db2ad2..b39a791 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -319,7 +319,7 @@ static int __fib_validate_source(struct sk_buff *skb, 
__be32 src, __be32 dst,
int ret, no_addr;
struct fib_result res;
struct flowi4 fl4;
-   struct net *net;
+   struct net *net = dev_net(dev);
bool dev_match;
 
fl4.flowi4_oif = 0;
@@ -332,6 +332,7 @@ static int __fib_validate_source(struct sk_buff *skb, 
__be32 src, __be32 dst,
fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
fl4.flowi4_tun_key.tun_id = 0;
fl4.flowi4_flags = 0;
+   fl4.flowi4_uid = sock_net_uid(net, NULL);
 
no_addr = idev->ifa_list == NULL;
 
@@ -339,13 +340,12 @@ static int __fib_validate_source(struct sk_buff *skb, 
__be32 src, __be32 dst,
 
trace_fib_validate_source(dev, &fl4);
 
-   net = dev_net(dev);
if (fib_lookup(net, &fl4, &res, 0))
goto last_resort;
if (res.type != RTN_UNICAST &&
(res.type != RTN_LOCAL || !IN_DEV_ACCEPT_LOCAL(idev)))
goto e_inval;
-   if (!rpf && !fib_num_tclassid_users(dev_net(dev)) &&
+   if (!rpf && !fib_num_tclassid_users(net) &&
(dev->ifindex != oif || !IN_DEV_TX_REDIRECTS(idev)))
goto last_resort;
fib_combine_itag(itag, &res);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index cb494a5..584ed66 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1876,6 +1876,7 @@ static int ip_route_input_slow(struct sk_buff *skb, 
__be32 daddr, __be32 saddr,
fl4.flowi4_flags = 0;
fl4.daddr = daddr;
fl4.saddr = saddr;
+   fl4.flowi4_uid = sock_net_uid(net, NULL);
err = fib_lookup(net, &fl4, &res, 0);
if (err != 0) {
if (!IN_DEV_FORWARD(in_dev))
-- 
1.9.3



Re: [PATCH] iproute2: show network device dependency tree

2017-02-26 Thread Zaboj Campula
On Sun, 2017-02-26 at 08:56 +0100, Jiri Pirko wrote:
> Sat, Feb 25, 2017 at 09:22:22PM CET, zaboj.camp...@post.cz wrote:
> > On Sat, 2017-02-25 at 18:39 +0100, Jiri Pirko wrote:
> > > > Sat, Feb 25, 2017 at 05:59:00PM CET, zaboj.camp...@post.cz
> > > > wrote:
> > > > Add the argument '-tree' to ip-link to show network devices
> > > > dependency tree.
> > > > 
> > > > Example:
> > > > 
> > > > $ ip -tree link
> > > > eth0
> > > >    bond0
> > > > eth1
> > > >    bond0
> > > > eth2
> > > >    bond1
> > > > eth3
> > > >    bond1
> > > 
> > > 
> > > Hmm, what is this good for? I'm probably missing something...
> > 
> > I consider this kind of output useful when troubleshooting a complex
> > configuration with many interfaces. It may show relations among
> > interfaces.
> 
> Did you see https://github.com/jbenc/plotnetcfg ?
> 

Thanks for the link. I haven't seen plotnetcfg and I like it.
It is handy when the analyzed system has GUI.


> 
> > 
> > 
> > > 
> > > 
> > > 
> > > > 
> > > > Signed-off-by: Zaboj Campula 
> > > > 
> > > > ---
> > > > include/utils.h |  1 +
> > > > ip/ip.c |  5 ++-
> > > > ip/ipaddress.c  | 97 
> > > > -
> > > > 3 files changed, 87 insertions(+), 16 deletions(-)
> > > > 
> > > > diff --git a/include/utils.h b/include/utils.h
> > > > index 22369e0..f1acf4d 100644
> > > > --- a/include/utils.h
> > > > +++ b/include/utils.h
> > > > @@ -20,6 +20,7 @@ extern int show_raw;
> > > > extern int resolve_hosts;
> > > > extern int oneline;
> > > > extern int brief;
> > > > +extern int tree;;
> > > > extern int timestamp;
> > > > extern int timestamp_short;
> > > > extern const char * _SL_;
> > > > diff --git a/ip/ip.c b/ip/ip.c
> > > > index 07050b0..29747a5 100644
> > > > --- a/ip/ip.c
> > > > +++ b/ip/ip.c
> > > > @@ -33,6 +33,7 @@ int show_details;
> > > > int resolve_hosts;
> > > > int oneline;
> > > > int brief;
> > > > +int tree;
> > > > int timestamp;
> > > > const char *_SL_;
> > > > int force;
> > > > @@ -57,7 +58,7 @@ static void usage(void)
> > > > "-h[uman-readable] | -iec |\n"
> > > > "-f[amily] { inet | inet6 | ipx | dnet | mpls | 
> > > > bridge | link } |\n"
> > > > "-4 | -6 | -I | -D | -B | -0 |\n"
> > > > -"-l[oops] { maximum-addr-flush-attempts } | 
> > > > -br[ief] |\n"
> > > > +"-l[oops] { maximum-addr-flush-attempts } | 
> > > > -br[ief] | -tr[ee] |\n"
> > > > "-o[neline] | -t[imestamp] | -ts[hort] | -b[atch] 
> > > > [filename] |\n"
> > > > "-rc[vbuf] [size] | -n[etns] name | -a[ll] | 
> > > > -c[olor]}\n");
> > > > exit(-1);
> > > > @@ -257,6 +258,8 @@ int main(int argc, char **argv)
> > > > batch_file = argv[1];
> > > > } else if (matches(opt, "-brief") == 0) {
> > > > ++brief;
> > > > +   } else if (matches(opt, "-tree") == 0) {
> > > > +   ++tree;
> > > > } else if (matches(opt, "-rcvbuf") == 0) {
> > > > unsigned int size;
> > > > 
> > > > diff --git a/ip/ipaddress.c b/ip/ipaddress.c
> > > > index 242c6ea..5ebcb1a 100644
> > > > --- a/ip/ipaddress.c
> > > > +++ b/ip/ipaddress.c
> > > > @@ -1534,6 +1534,69 @@ static int iplink_filter_req(struct nlmsghdr 
> > > > *nlh, int reqlen)
> > > > return 0;
> > > > }
> > > > 
> > > > +static int has_master(struct nlmsg_chain *linfo, int index)
> > > > +{
> > > > +   struct nlmsg_list *l;
> > > > +   struct rtattr *tb[IFLA_MAX+1];
> > > > +   int len;
> > > > +   for (l = linfo->head; l; l = l->next) {
> > > > +   struct ifinfomsg *ifi = NLMSG_DATA(&l->h);
> > > > +   len = l->h.nlmsg_len;
> > > > +   len -= NLMSG_LENGTH(sizeof(*ifi));
> > > > +   parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len);
> > > > +   if (tb[IFLA_MASTER] && *(int *)RTA_DATA(tb[IFLA_MASTER]) == index)
> > > > +   return 1;
> > > > +   }
> > > > +   return 0;
> > > > 
> > > > +}
> > > > +
> > > > +static struct nlmsg_list *get_master(struct nlmsg_chain *linfo, struct 
> > > > rtattr **tb)
> > > > +{
> > > > +   struct nlmsg_list *l;
> > > > +   if (tb[IFLA_MASTER]) {
> > > > +   int master = *(int *)RTA_DATA(tb[IFLA_MASTER]);
> > > > +   for (l = linfo->head; l; l = l->next) 

[PATCH] lib: fix spelling mistake: "actualy" -> "actually"

2017-02-26 Thread Colin King
From: Colin Ian King 

trivial fix to spelling mistake in pr_err message

Signed-off-by: Colin Ian King 
---
 lib/test_parman.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/test_parman.c b/lib/test_parman.c
index fe9f3a7..35e3224 100644
--- a/lib/test_parman.c
+++ b/lib/test_parman.c
@@ -334,7 +334,7 @@ static int test_parman_check_array(struct test_parman 
*test_parman,
last_priority = item->prio->priority;
 
if (item->parman_item.index != i) {
-   pr_err("Item has different index in compare to where it 
actualy is (%lu != %d)\n",
+   pr_err("Item has different index in compare to where it 
actually is (%lu != %d)\n",
   item->parman_item.index, i);
return -EINVAL;
}
-- 
2.10.2



Re: [Intel-wired-lan] [PATCH] e1000e: fix timing for 82579 Gigabit Ethernet controller

2017-02-26 Thread Neftin, Sasha

On 2/19/2017 14:55, Neftin, Sasha wrote:

On 2/16/2017 20:42, Bernd Faust wrote:

After an upgrade to Linux kernel v4.x the hardware timestamps of the
82579 Gigabit Ethernet Controller are different than expected.
The values that are being read are almost four times as big as before
the kernel upgrade.

The difference is that after the upgrade the driver sets the clock
frequency to 25MHz, where before the upgrade it was set to 96MHz. Intel
confirmed that the correct frequency for this network adapter is 96MHz.

Signed-off-by: Bernd Faust 
---
  drivers/net/ethernet/intel/e1000e/netdev.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c

index 7017281..8b7113d 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3511,6 +3511,12 @@ s32 e1000e_get_base_timinca(struct 
e1000_adapter *adapter, u32 *timinca)


  switch (hw->mac.type) {
  case e1000_pch2lan:
+/* Stable 96MHz frequency */
+incperiod = INCPERIOD_96MHz;
+incvalue = INCVALUE_96MHz;
+shift = INCVALUE_SHIFT_96MHz;
+adapter->cc.shift = shift + INCPERIOD_SHIFT_96MHz;
+break;
  case e1000_pch_lpt:
  if (er32(TSYNCRXCTL) & E1000_TSYNCRXCTL_SYSCFI) {
  /* Stable 96MHz frequency */
--
2.7.4
___
Intel-wired-lan mailing list
intel-wired-...@lists.osuosl.org
http://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Hello,

e1000_pch2lan mac type corresponds to 82579LM and 82579V network 
adapters. System clock frequency indication (SYSCFI) for these devices 
supports both 25MHz and 96MHz frequency. By default TSYNCRXCTL.SYSCFI 
is set to 1 and that means 96MHz frequency is picked.


It is better to keep the current implementation as it covers all options.

Thanks,

Sasha


Hello,

During last couple of weeks I saw few  complaints from community on same 
timing problem with 82579. I will double check clock definition with HW 
architecture.


Sasha