date:20170323

[regression], wlcore, wl1835mod random firmware load failures

2017-03-23 Thread Mika Penttilä


With 4.11-rc+ we have randomly seen:
 wlcore: ERROR could not allocate memory for the firmware

when trying to load a firmware bundled in the kernel at boot time. Any ideas?

[PATCH net] net: phy: Export mdiobus_register_board_info()

2017-03-23 Thread Florian Fainelli

We can build modular code that uses mdiobus_register_board_info() which would
lead to linking failure since this symbol is not exported.

Fixes: 648ea0134069 ("net: phy: Allow pre-declaration of MDIO devices")
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/mdio-boardinfo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/phy/mdio-boardinfo.c b/drivers/net/phy/mdio-boardinfo.c
index 6b988f77da08..61941e29daae 100644
--- a/drivers/net/phy/mdio-boardinfo.c
+++ b/drivers/net/phy/mdio-boardinfo.c
@@ -84,3 +84,4 @@ int mdiobus_register_board_info(const struct mdio_board_info *info,
 
return 0;
 }
+EXPORT_SYMBOL(mdiobus_register_board_info);
-- 
2.9.3

Re: [PATCH net-next v2 5/5] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-23 Thread Andrew Lunn

> +static int
> +mt7623_trgmii_write(struct mt7530_priv *priv,  u32 reg, u32 val)
> +{
> + int ret;
> +
> + ret =  regmap_write(priv->ethernet, TRGMII_BASE(reg), val);
> + if (ret < 0)
> + dev_err(priv->dev,
> + "failed to priv write register\n");
> + return ret;
> +}
> +
> +static u32
> +mt7623_trgmii_read(struct mt7530_priv *priv, u32 reg)
> +{
> + int ret;
> + u32 val;
> +
> + ret = regmap_read(priv->ethernet, TRGMII_BASE(reg), &val);
> + if (ret < 0) {
> + dev_err(priv->dev,
> + "failed to priv read register\n");
> + return ret;
> + }
> +
> + return val;
> +}

Hi Sean

These appear to be the only two accessors which use the regmap.

> +static int
> +mt7530_regmap_read(void *ctx, uint32_t reg, uint32_t *val)
> +{
> + struct mt7530_priv *priv = (struct mt7530_priv *)ctx;
> +
> + /* BIT(15) is used as indication for pseudo registers
> +  * which would be translated into the general MDIO
> +  * access to leverage the unique regmap sys interface.
> +  */
> + if (reg & BIT(15))
> + *val = mdiobus_read_nested(priv->bus,
> +(reg & 0xf00) >> 8,
> +(reg & 0xff) >> 2);
> + else
> + *val = mt7530_read(priv, reg);
> +
> + return 0;
> +}

.

> +static const struct regmap_range mt7530_readable_ranges[] = {
> + regmap_reg_range(0x0000, 0x00ac), /* Global control */
> + regmap_reg_range(0x2000, 0x202c), /* Port Control - P0 */
> + regmap_reg_range(0x2100, 0x212c), /* Port Control - P1 */
> + regmap_reg_range(0x2200, 0x222c), /* Port Control - P2 */
> + regmap_reg_range(0x2300, 0x232c), /* Port Control - P3 */
> + regmap_reg_range(0x2400, 0x242c), /* Port Control - P4 */
> + regmap_reg_range(0x2500, 0x252c), /* Port Control - P5 */
> + regmap_reg_range(0x2600, 0x262c), /* Port Control - P6 */
> + regmap_reg_range(0x30e0, 0x30f8), /* Port MAC - SYS */
> + regmap_reg_range(0x3000, 0x3014), /* Port MAC - P0 */
> + regmap_reg_range(0x3100, 0x3114), /* Port MAC - P1 */
> + regmap_reg_range(0x3200, 0x3214), /* Port MAC - P2*/
> + regmap_reg_range(0x3300, 0x3314), /* Port MAC - P3*/
> + regmap_reg_range(0x3400, 0x3414), /* Port MAC - P4 */
> + regmap_reg_range(0x3500, 0x3514), /* Port MAC - P5 */
> + regmap_reg_range(0x3600, 0x3614), /* Port MAC - P6 */
> + regmap_reg_range(0x4000, 0x40d4), /* MIB - P0 */
> + regmap_reg_range(0x4100, 0x41d4), /* MIB - P1 */
> + regmap_reg_range(0x4200, 0x42d4), /* MIB - P2 */
> + regmap_reg_range(0x4300, 0x43d4), /* MIB - P3 */
> + regmap_reg_range(0x4400, 0x44d4), /* MIB - P4 */
> + regmap_reg_range(0x4500, 0x45d4), /* MIB - P5 */
> + regmap_reg_range(0x4600, 0x46d4), /* MIB - P6 */
> + regmap_reg_range(0x4fe0, 0x4ff4), /* SYS */
> + regmap_reg_range(0x7000, 0x700c), /* SYS 2 */
> + regmap_reg_range(0x7018, 0x7028), /* SYS 3 */
> + regmap_reg_range(0x7800, 0x7830), /* SYS 4 */
> + regmap_reg_range(0x7a00, 0x7a7c), /* TRGMII */
> + regmap_reg_range(0x8000, 0x8078), /* Psedo address for Phy - P0 */
> + regmap_reg_range(0x8100, 0x8178), /* Psedo address for Phy - P1 */
> + regmap_reg_range(0x8200, 0x8278), /* Psedo address for Phy - P2 */
> + regmap_reg_range(0x8300, 0x8378), /* Psedo address for Phy - P3 */
> + regmap_reg_range(0x8400, 0x8478), /* Psedo address for Phy - P4 */
> +};

It looks like your regmap accessors are only used for 0x7a00 to 0x7a7c.

It is not clear why you even bother with a regmap. If you have it, why
not use it for all registers within the regmap?

Andrew
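
For readers following the BIT(15) convention in the quoted mt7530_regmap_read(),
the address split can be sketched in plain userspace C. This is only an
illustration: struct pseudo_reg and decode_pseudo_reg() are hypothetical names
that do not appear in the patch, and the masks and shifts are copied from the
quoted code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as BIT(15) in the quoted patch. */
#define PSEUDO_REG_FLAG (UINT32_C(1) << 15)

/* Hypothetical helper type: the result of splitting a regmap address. */
struct pseudo_reg {
	bool is_phy;       /* true when the address selects a PHY via MDIO */
	uint32_t phy_addr; /* (reg & 0xf00) >> 8, as in the quoted code */
	uint32_t phy_reg;  /* (reg & 0xff) >> 2, as in the quoted code */
};

/* Split a regmap address into its pseudo-register parts, if any. */
static struct pseudo_reg decode_pseudo_reg(uint32_t reg)
{
	struct pseudo_reg out = { false, 0, 0 };

	if (reg & PSEUDO_REG_FLAG) {
		out.is_phy = true;
		out.phy_addr = (reg & 0xf00) >> 8;
		out.phy_reg = (reg & 0xff) >> 2;
	}
	return out;
}
```

With these masks, 0x8104 decodes to PHY address 1, register 1, and 0x8478 (the
end of the last range in the table) decodes to PHY address 4, register 30. An
address such as 0x7a00 has bit 15 clear and is treated as an ordinary switch
register.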

Re: [PATCH net] net: phy: Export mdiobus_register_board_info()

2017-03-23 Thread Andrew Lunn

On Wed, Mar 22, 2017 at 10:40:30PM -0700, Florian Fainelli wrote:
> We can build modular code that uses mdiobus_register_board_info() which would
> lead to linking failure since this symbol is not exported.
> 
> Fixes: 648ea0134069 ("net: phy: Allow pre-declaration of MDIO devices")
> Signed-off-by: Florian Fainelli 

Reviewed-by: Andrew Lunn 

Andrew

Re: [PATCH net-next 1/8] ptr_ring: introduce batch dequeuing

2017-03-23 Thread Jason Wang




On 2017-03-22 21:43, Michael S. Tsirkin wrote:
> On Tue, Mar 21, 2017 at 12:04:40PM +0800, Jason Wang wrote:
>> Signed-off-by: Jason Wang
>> ---
>>   include/linux/ptr_ring.h | 65
>>   1 file changed, 65 insertions(+)
>>
>> diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
>> index 6c70444..4771ded 100644
>> --- a/include/linux/ptr_ring.h
>> +++ b/include/linux/ptr_ring.h
>> @@ -247,6 +247,22 @@ static inline void *__ptr_ring_consume(struct ptr_ring *r)
>> return ptr;
>>   }
>>
>> +static inline int __ptr_ring_consume_batched(struct ptr_ring *r,
>> +void **array, int n)
>> +{
>> +   void *ptr;
>> +   int i = 0;
>> +
>> +   while (i < n) {
>> +   ptr = __ptr_ring_consume(r);
>> +   if (!ptr)
>> +   break;
>> +   array[i++] = ptr;
>> +   }
>> +
>> +   return i;
>> +}
>> +
>>   /*
>>    * Note: resize (below) nests producer lock within consumer lock, so if you
>>    * call this in interrupt or BH context, you must disable interrupts/BH when
>
> This ignores the comment above that function:
>
> /* Note: callers invoking this in a loop must use a compiler barrier,
>  * for example cpu_relax().
>  */

Yes, __ptr_ring_swap_queue() ignores this too.

> Also - it looks like it shouldn't matter if reads are reordered but I wonder.
> Thoughts? Including some reasoning about it in commit log would be nice.

Yes, I think it doesn't matter in this case, it matters only for batched
producing.

Thanks

>> @@ -297,6 +313,55 @@ static inline void *ptr_ring_consume_bh(struct ptr_ring *r)
>> return ptr;
>>   }
>>
>> +static inline int ptr_ring_consume_batched(struct ptr_ring *r,
>> +  void **array, int n)
>> +{
>> +   int ret;
>> +
>> +   spin_lock(&r->consumer_lock);
>> +   ret = __ptr_ring_consume_batched(r, array, n);
>> +   spin_unlock(&r->consumer_lock);
>> +
>> +   return ret;
>> +}
>> +
>> +static inline int ptr_ring_consume_batched_irq(struct ptr_ring *r,
>> +  void **array, int n)
>> +{
>> +   int ret;
>> +
>> +   spin_lock_irq(&r->consumer_lock);
>> +   ret = __ptr_ring_consume_batched(r, array, n);
>> +   spin_unlock_irq(&r->consumer_lock);
>> +
>> +   return ret;
>> +}
>> +
>> +static inline int ptr_ring_consume_batched_any(struct ptr_ring *r,
>> +  void **array, int n)
>> +{
>> +   unsigned long flags;
>> +   int ret;
>> +
>> +   spin_lock_irqsave(&r->consumer_lock, flags);
>> +   ret = __ptr_ring_consume_batched(r, array, n);
>> +   spin_unlock_irqrestore(&r->consumer_lock, flags);
>> +
>> +   return ret;
>> +}
>> +
>> +static inline int ptr_ring_consume_batched_bh(struct ptr_ring *r,
>> + void **array, int n)
>> +{
>> +   int ret;
>> +
>> +   spin_lock_bh(&r->consumer_lock);
>> +   ret = __ptr_ring_consume_batched(r, array, n);
>> +   spin_unlock_bh(&r->consumer_lock);
>> +
>> +   return ret;
>> +}
>> +
>>   /* Cast to structure type and call a function without discarding from FIFO.
>>    * Function must return a value.
>>    * Callers must take consumer_lock.
>> --
>> 2.7.4
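
The batched dequeue loop discussed in this message can be tried out in
userspace with a toy ring. The sketch below is an assumption-laden model, not
the kernel's ptr_ring: struct toy_ring and the toy_ring_*() helpers are
hypothetical names, the ring is single-threaded, and it has no locking or
barriers (which is exactly the part the review comments are about). Only the
loop shape of the batched consumer follows the patch.

```c
#include <stddef.h>

#define TOY_RING_SIZE 8

/* Toy stand-in for a pointer ring: a NULL slot means "empty". */
struct toy_ring {
	void *queue[TOY_RING_SIZE];
	int producer;
	int consumer;
};

/* Returns 0 on success, -1 if ptr is NULL or the next slot is still in use. */
static int toy_ring_produce(struct toy_ring *r, void *ptr)
{
	if (!ptr || r->queue[r->producer])
		return -1;
	r->queue[r->producer] = ptr;
	r->producer = (r->producer + 1) % TOY_RING_SIZE;
	return 0;
}

/* Take one entry, or return NULL when the ring is empty. */
static void *toy_ring_consume(struct toy_ring *r)
{
	void *ptr = r->queue[r->consumer];

	if (ptr) {
		r->queue[r->consumer] = NULL;
		r->consumer = (r->consumer + 1) % TOY_RING_SIZE;
	}
	return ptr;
}

/* Same loop shape as the batched helper in the patch: fill array with up
 * to n entries and return how many were actually dequeued. */
static int toy_ring_consume_batched(struct toy_ring *r, void **array, int n)
{
	int i = 0;

	while (i < n) {
		void *ptr = toy_ring_consume(r);

		if (!ptr)
			break;
		array[i++] = ptr;
	}
	return i;
}
```

The caller gets a short count when the ring runs dry, so a return value
smaller than n is the normal "ring is empty now" signal rather than an error.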

Re: EINVAL when using connect() for udp sockets

2017-03-23 Thread Eric Dumazet

On Thu, 2017-03-23 at 13:22 +1100, Daurnimator wrote:
> On 9 March 2017 at 14:10, Daurnimator  wrote:
> > When debugging https://github.com/daurnimator/lua-http/issues/73 which
> > uses https://github.com/wahern/dns we ran into an issue where modern
> > linux kernels return EINVAL if you try and re-use a udp socket.
> > The issue seems to occur if you go from a local destination ip to a
> > non-local one.
> 
> Did anyone get a chance to look into this issue?

I believe the man page is incomplete.

A disconnect is needed before another connect(),

e.g.:

#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

int main() {
int fd = socket(PF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP);
if (fd == -1)
exit(1);
if (bind(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(57997),
.sin_addr=inet_addr("0.0.0.0")}, 16))
exit(2);
if (connect(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(53),
.sin_addr=inet_addr("127.0.0.2")}, 16))
exit(3);
if (-1 == sendto(fd, "test", 4, 0, NULL, 0))
exit(4);
char buf[200];
if (-1 != recvfrom(fd, buf, 200, 0, 0, 0) && errno != ECONNREFUSED)
exit(5);
/* okay, try next server... */

/* disconnect */
if (connect(fd, (struct sockaddr*)&(struct 
sockaddr_in){.sin_family=AF_UNSPEC}, 16))
exit(6);

if (connect(fd, (struct sockaddr*)&(struct
sockaddr_in){.sin_family=AF_INET, .sin_port=htons(53),
.sin_addr=inet_addr("8.8.8.8")}, 16))
exit(7);
exit(0);
}
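
The disconnect step in the program above can be pulled out into a small helper.
This is a sketch under the assumption that the target is Linux, where
connect() with an AF_UNSPEC address dissolves a datagram socket's association;
udp_disconnect() is a hypothetical name, not an existing libc function.

```c
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: drop a UDP socket's current peer so that a later
 * connect() to a different destination starts from a clean state.
 * Returns 0 on success, -1 with errno set on failure. */
static int udp_disconnect(int fd)
{
	struct sockaddr sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_family = AF_UNSPEC;
	return connect(fd, &sa, sizeof(sa));
}
```

A resolver that walks a list of name servers on one socket would call
udp_disconnect(fd) between attempts, as the "disconnect" step in the program
above does inline.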

Re: [PATCH net-next v2 5/5] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-23 Thread Sean Wang

Hi Andrew,

The purpose of the registered regmap table is to provide a way
to look up a specific register on the switch through regmap-debugfs.

Not all register ranges are defined, so I only include the
meaningful ones, sparsely, in the table.

Sean


On Thu, 2017-03-23 at 08:22 +0100, Andrew Lunn wrote:
> > +static int
> > +mt7623_trgmii_write(struct mt7530_priv *priv,  u32 reg, u32 val)
> > +{
> > +   int ret;
> > +
> > +   ret =  regmap_write(priv->ethernet, TRGMII_BASE(reg), val);
> > +   if (ret < 0)
> > +   dev_err(priv->dev,
> > +   "failed to priv write register\n");
> > +   return ret;
> > +}
> > +
> > +static u32
> > +mt7623_trgmii_read(struct mt7530_priv *priv, u32 reg)
> > +{
> > +   int ret;
> > +   u32 val;
> > +
> > +   ret = regmap_read(priv->ethernet, TRGMII_BASE(reg), &val);
> > +   if (ret < 0) {
> > +   dev_err(priv->dev,
> > +   "failed to priv read register\n");
> > +   return ret;
> > +   }
> > +
> > +   return val;
> > +}
> 
> Hi Sean
> 
> These appear to be the only two accessors which use the regmap.
> 
> > +static int
> > +mt7530_regmap_read(void *ctx, uint32_t reg, uint32_t *val)
> > +{
> > +   struct mt7530_priv *priv = (struct mt7530_priv *)ctx;
> > +
> > +   /* BIT(15) is used as indication for pseudo registers
> > +* which would be translated into the general MDIO
> > +* access to leverage the unique regmap sys interface.
> > +*/
> > +   if (reg & BIT(15))
> > +   *val = mdiobus_read_nested(priv->bus,
> > +  (reg & 0xf00) >> 8,
> > +  (reg & 0xff) >> 2);
> > +   else
> > +   *val = mt7530_read(priv, reg);
> > +
> > +   return 0;
> > +}
> 
> .
> 
> > +static const struct regmap_range mt7530_readable_ranges[] = {
> > +   regmap_reg_range(0x0000, 0x00ac), /* Global control */
> > +   regmap_reg_range(0x2000, 0x202c), /* Port Control - P0 */
> > +   regmap_reg_range(0x2100, 0x212c), /* Port Control - P1 */
> > +   regmap_reg_range(0x2200, 0x222c), /* Port Control - P2 */
> > +   regmap_reg_range(0x2300, 0x232c), /* Port Control - P3 */
> > +   regmap_reg_range(0x2400, 0x242c), /* Port Control - P4 */
> > +   regmap_reg_range(0x2500, 0x252c), /* Port Control - P5 */
> > +   regmap_reg_range(0x2600, 0x262c), /* Port Control - P6 */
> > +   regmap_reg_range(0x30e0, 0x30f8), /* Port MAC - SYS */
> > +   regmap_reg_range(0x3000, 0x3014), /* Port MAC - P0 */
> > +   regmap_reg_range(0x3100, 0x3114), /* Port MAC - P1 */
> > +   regmap_reg_range(0x3200, 0x3214), /* Port MAC - P2*/
> > +   regmap_reg_range(0x3300, 0x3314), /* Port MAC - P3*/
> > +   regmap_reg_range(0x3400, 0x3414), /* Port MAC - P4 */
> > +   regmap_reg_range(0x3500, 0x3514), /* Port MAC - P5 */
> > +   regmap_reg_range(0x3600, 0x3614), /* Port MAC - P6 */
> > +   regmap_reg_range(0x4000, 0x40d4), /* MIB - P0 */
> > +   regmap_reg_range(0x4100, 0x41d4), /* MIB - P1 */
> > +   regmap_reg_range(0x4200, 0x42d4), /* MIB - P2 */
> > +   regmap_reg_range(0x4300, 0x43d4), /* MIB - P3 */
> > +   regmap_reg_range(0x4400, 0x44d4), /* MIB - P4 */
> > +   regmap_reg_range(0x4500, 0x45d4), /* MIB - P5 */
> > +   regmap_reg_range(0x4600, 0x46d4), /* MIB - P6 */
> > +   regmap_reg_range(0x4fe0, 0x4ff4), /* SYS */
> > +   regmap_reg_range(0x7000, 0x700c), /* SYS 2 */
> > +   regmap_reg_range(0x7018, 0x7028), /* SYS 3 */
> > +   regmap_reg_range(0x7800, 0x7830), /* SYS 4 */
> > +   regmap_reg_range(0x7a00, 0x7a7c), /* TRGMII */
> > +   regmap_reg_range(0x8000, 0x8078), /* Psedo address for Phy - P0 */
> > +   regmap_reg_range(0x8100, 0x8178), /* Psedo address for Phy - P1 */
> > +   regmap_reg_range(0x8200, 0x8278), /* Psedo address for Phy - P2 */
> > +   regmap_reg_range(0x8300, 0x8378), /* Psedo address for Phy - P3 */
> > +   regmap_reg_range(0x8400, 0x8478), /* Psedo address for Phy - P4 */
> > +};
> 
> It looks like your regmap accessors are only used for 0x7a00 to 0x7a7c.
> 
> It is not clear why you even bother with a regmap. If you have it, why
> not use it for all registers within the regmap?
> 
> Andrew

virtnet_xdp_set induces WARNING at drivers/pci/msi.c:1261 pci_irq_vector+0xd4/0xe0

2017-03-23 Thread Brenden Blanco

Hi netdev,

I was using the new xdp support in virtio to test some idea that I have,
and am seeing the WARN_ON_ONCE in the end of this email when I add the
first xdp program. After this, the VM seems to operate just fine.

I know that y'all were testing with a particular qemu command line,
which I have adapted to a libvirt xml definition that is close but not
quite exactly like what was suggested. The VM is 4-vCPU, with a single
nic that looks something like:

...
-smp 4,sockets=4,cores=1,threads=1
...
-netdev tap,fds=26:28:29:30:31:32:33:34:35,id=hostnet0,vhost=on,vhostfds=36:37:38:39:40:41:42:43:44
-device virtio-net-pci,guest_tso4=off,guest_tso6=off,guest_ecn=off,guest_ufo=off,mq=on,vectors=20,netdev=hostnet0,id=net0,mac=52:54:00:18:17:f4,bus=pci.0,addr=0x3
...

This comes from a libvirt definition of:

   ...
   
  
  
  
  
  

  
  
  

...

I couldn't find an xml parameter to drive qemu's queues=X field as
tested by John, and I believe the vectors=Y parameter comes from a 2N+2
calculation from libvirt's vhost queues setting.
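
As a quick arithmetic check of that guess: assuming the 2N+2 rule stated above
really is what libvirt applies (this is the poster's belief, not something
confirmed in this thread), 9 queues would give the vectors=20 seen on the qemu
command line. The function name below is hypothetical.

```c
/* Assumed rule from the message above: vectors = 2 * N + 2 for N queues. */
static unsigned int vectors_for_queues(unsigned int queues)
{
	return 2 * queues + 2;
}
```

With queues = 9 this yields 2 * 9 + 2 = 20, which is consistent with both the
vectors=20 option and the nine tap fds listed in the -netdev argument.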

I tested with many different queues settings, and 9 was the only one
that XDP liked enough to allow my program. The resulting ethtool
channels looked like:

Channel parameters for ens3:
Pre-set maximums:
RX: 0
TX: 0
Other:  0
Combined:   9
Current hardware settings:
RX: 0
TX: 0
Other:  0
Combined:   8

Kernel is v4.11-rc2 based.

So, my question would be two-fold:

1. Is there a way to coerce libvirt to create the right command line,
and/or is this some bug in libvirt (and who should I email if so)?

2. Is there anything to fix in the driver side or is this warning
exactly what should be happening with the quirky configuration that I
have?

Thanks,
Brenden


[  131.424391] WARNING: CPU: 2 PID: 3218 at drivers/pci/msi.c:1261 
pci_irq_vector+0xd4/0xe0
[  131.424392] Modules linked in: pktgen ipt_MASQUERADE nf_nat_masquerade_ipv4 
xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack 
br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bufio 
dm_bio_prison binfmt_misc ppdev snd_hda_codec_generic snd_hda_intel 
snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd parport_pc 
intel_rapl_perf soundcore input_leds serio_raw i2c_piix4 parport qemu_fw_cfg 
mac_hid ib_iser rdma_cm configfs iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp 
libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 libcrc32c 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 
raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
[  131.424416]  pcbc qxl drm_kms_helper aesni_intel syscopyarea sysfillrect 
aes_x86_64 sysimgblt crypto_simd fb_sys_fops cryptd ttm psmouse glue_helper drm 
virtio_net pata_acpi floppy
[  131.424423] CPU: 2 PID: 3218 Comm: PktGen Not tainted 4.11.0-rc2-03152017+ #4
[  131.424424] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-20170228_101828-anatol 04/01/2014
[  131.424424] Call Trace:
[  131.424427]  dump_stack+0x63/0x90
[  131.424430]  __warn+0xcb/0xf0
[  131.424431]  warn_slowpath_null+0x1d/0x20
[  131.424433]  pci_irq_vector+0xd4/0xe0
[  131.424435]  vp_synchronize_vectors+0x46/0x60
[  131.424436]  vp_reset+0x37/0x40
[  131.424438]  virtnet_xdp+0x240/0x3f0 [virtio_net]
[  131.424441]  dev_change_xdp_fd+0x91/0x150
[  131.424443]  do_setlink+0xd34/0xd60
[  131.424445]  ? memcg_kmem_charge_memcg+0x76/0x90
[  131.424448]  ? new_slab+0x31d/0x710
[  131.424449]  ? nla_parse+0xa0/0x100
[  131.424450]  rtnl_setlink+0x100/0x140
[  131.424453]  ? ns_capable_common+0x60/0x80
[  131.424454]  rtnetlink_rcv_msg+0xe6/0x220
[  131.424455]  ? __kmalloc_node_track_caller+0x1ef/0x2c0
[  131.424457]  ? __alloc_skb+0x87/0x1c0
[  131.424458]  ? rtnl_newlink+0x8a0/0x8a0
[  131.424459]  netlink_rcv_skb+0xa4/0xc0
[  131.424460]  rtnetlink_rcv+0x28/0x30
[  131.424461]  netlink_unicast+0x18c/0x240
[  131.424462]  netlink_sendmsg+0x2fb/0x3a0
[  131.424463]  sock_sendmsg+0x38/0x50
[  131.424464]  SYSC_sendto+0x101/0x190
[  131.424467]  ? fput+0xe/0x10
[  131.424468]  ? task_work_run+0x83/0xa0
[  131.424469]  SyS_sendto+0xe/0x10
[  131.424472]  entry_SYSCALL_64_fastpath+0x1e/0xad
[  131.424473] RIP: 0033:0x7fbe4ebda99d
[  131.424473] RSP: 002b:7ffc257d3ef8 EFLAGS: 0246 ORIG_RAX: 
002c
[  131.424474] RAX: ffda RBX: 0006 RCX: 7fbe4ebda99d
[  131.424474] RDX: 002c RSI: 7ffc257d3f10 RDI: 0005
[  131.424475] RBP: 0005 R08:  R09: 
[  131.424475] R10:  R11: 0246 R12: 7ffc257d5300
[  131.424476] R13: 7ffc257d5300 R14: 7ffc257d5540 R15:

Re: [PATCH] net: usbnet: support 64bit stats in qmi_wwan driver

2017-03-23 Thread Bjørn Mork

Greg Ungerer  writes:

> Add support for the net stats64 counters to the usbnet core and then to
> the qmi_wwan driver.
>
> This is a straightforward addition of 64bit counters for RX and TX packets
> and byte counts. It is done in the same style as for the other net drivers
> that support stats64.
>
> The bulk of the change is to the usbnet core. Then it is trivial to use
> that in the qmi_wwan.c driver. It would be very simple to extend this
> support to other usbnet based drivers.
>
> The motivation to add this is that it is not particularly difficult to
> get the RX and TX byte counts to wrap on 32bit platforms.

You must have a higher quota than me :)

But the patch does not apply to current net-next due to a conflict with
the ethtool_{get|set}_link_ksettings changes.


> +void usbnet_get_stats64(struct net_device *net, struct rtnl_link_stats64 *stats)
> +{
> + struct usbnet *dev = netdev_priv(net);
> + unsigned int start;
> +
> + netdev_stats_to_stats64(stats, &net->stats);
> +
> + do {
> + start = u64_stats_fetch_begin_irq(&dev->stats.syncp);
> + stats->rx_packets = dev->stats.rx_packets;
> + stats->rx_bytes = dev->stats.rx_bytes;
> + stats->tx_packets = dev->stats.tx_packets;
> + stats->tx_bytes = dev->stats.tx_bytes;
> + } while (u64_stats_fetch_retry_irq(&dev->stats.syncp, start));
> +}
> +

And I believe EXPORT_SYMBOL is missing here?



Bjørn
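
The fetch_begin/fetch_retry loop in the quoted usbnet_get_stats64() follows the
usual sequence-counter pattern: the reader retries until it sees the same, even
sequence number before and after copying the counters. Below is a userspace
sketch of that pattern using C11 atomics. It is not the kernel's u64_stats
implementation; struct toy_stats and the toy_stats_*() functions are
hypothetical names, and a single writer is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Toy counter pair guarded by a sequence number (odd = update in progress). */
struct toy_stats {
	atomic_uint seq;
	_Atomic uint64_t rx_packets;
	_Atomic uint64_t rx_bytes;
};

/* Writer side (single writer assumed): bump seq to odd, update the
 * counters, then bump seq back to even. */
static void toy_stats_add_rx(struct toy_stats *s, uint64_t bytes)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint64_t pkts = atomic_load_explicit(&s->rx_packets, memory_order_relaxed);
	uint64_t byts = atomic_load_explicit(&s->rx_bytes, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->rx_packets, pkts + 1, memory_order_relaxed);
	atomic_store_explicit(&s->rx_bytes, byts + bytes, memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: copy both counters and retry if a writer was active or
 * finished an update while we were reading. */
static void toy_stats_read(struct toy_stats *s, uint64_t *packets,
			   uint64_t *bytes)
{
	unsigned int start;

	do {
		start = atomic_load_explicit(&s->seq, memory_order_acquire);
		*packets = atomic_load_explicit(&s->rx_packets,
						memory_order_relaxed);
		*bytes = atomic_load_explicit(&s->rx_bytes,
					      memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((start & 1) ||
		 atomic_load_explicit(&s->seq, memory_order_relaxed) != start);
}
```

The point of the retry loop is that the reader never reports a packet count
from one update paired with a byte count from another.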

Re: race condition in kernel/padata.c

2017-03-23 Thread Steffen Klassert

On Thu, Mar 23, 2017 at 12:03:43AM +0100, Jason A. Donenfeld wrote:
> Hey Steffen,
> 
> WireGuard makes really heavy use of padata, feeding it units of work
> from different cores in different contexts all at the same time. For
> the most part, everything has been fine, but one particular user has
> consistently run into mysterious bugs. He's using a rather old dual
> core CPU, which have a tendency to bring out race conditions
> sometimes. After struggling with getting a good backtrace, we finally
> managed to extract this from list debugging:
> 
> [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
> __list_add+0xae/0x130
> [87487.301868] list_add corruption. prev->next should be next
> (b17abfc043d0), but was 8dba70872c80. (prev=8dba70872b00).
> [87487.339011]  [] dump_stack+0x68/0xa3
> [87487.342198]  [] ? console_unlock+0x281/0x6d0
> [87487.345364]  [] __warn+0xff/0x140
> [87487.348513]  [] warn_slowpath_fmt+0x4a/0x50
> [87487.351659]  [] __list_add+0xae/0x130
> [87487.354772]  [] ? _raw_spin_lock+0x64/0x70
> [87487.357915]  [] padata_reorder+0x1e6/0x420
> [87487.361084]  [] padata_do_serial+0xa5/0x120
> 
> padata_reorder calls list_add_tail with the list to which it's adding
> locked, which seems correct:
> 
> spin_lock(&squeue->serial.lock);
> list_add_tail(&padata->list, &squeue->serial.list);
> spin_unlock(&squeue->serial.lock);
> 
> This therefore leaves only one place where such inconsistency could occur:
> if padata->list is added at the same time on two different threads.
> This padata pointer comes from the function call to
> padata_get_next(pd), which has in it the following block:
> 
> next_queue = per_cpu_ptr(pd->pqueue, cpu);
> padata = NULL;
> reorder = &next_queue->reorder;
> if (!list_empty(&reorder->list)) {
>padata = list_entry(reorder->list.next,
>struct padata_priv, list);
>spin_lock(&reorder->lock);
>list_del_init(&padata->list);
>atomic_dec(&pd->reorder_objects);
>spin_unlock(&reorder->lock);
> 
>pd->processed++;
> 
>goto out;
> }
> out:
> return padata;
> 
> I strongly suspect that the problem here is that two threads can race
> on reorder list. Even though the deletion is locked, call to
> list_entry is not locked, which means it's feasible that two threads
> pick up the same padata object and subsequently call list_add_tail on
> them at the same time. The fix would thus be to hoist that lock
> outside of that block.

Yes, looks like we should lock the whole list handling block here.

Thanks!
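
The fix agreed on here, taking the lock before looking at the list rather than
only around the deletion, can be illustrated with a small userspace model. This
is a sketch, not the padata code: struct toy_item, struct toy_reorder_queue and
toy_reorder_pop() are hypothetical names, and a pthread mutex stands in for the
kernel spinlock.

```c
#include <pthread.h>
#include <stddef.h>

/* Toy singly linked queue entry. */
struct toy_item {
	struct toy_item *next;
	int seq_nr;
};

/* Toy reorder queue: head pointer protected by lock. */
struct toy_reorder_queue {
	pthread_mutex_t lock;
	struct toy_item *head;
};

/* Pop the first item, or return NULL when the queue is empty.
 * The emptiness check, the pick and the unlink all happen under the lock,
 * so two callers cannot end up holding the same item. */
static struct toy_item *toy_reorder_pop(struct toy_reorder_queue *q)
{
	struct toy_item *item;

	pthread_mutex_lock(&q->lock);
	item = q->head;
	if (item) {
		q->head = item->next;
		item->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return item;
}
```

In the racy variant described in the quoted report, the equivalent of
"item = q->head" ran before the lock was taken, which is what allowed two
threads to pick the same entry.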

Re: [PATCH] net: usbnet: support 64bit stats in qmi_wwan driver

2017-03-23 Thread Oliver Neukum

On Thursday, 23.03.2017 at 11:25 +1000, Greg Ungerer wrote:
> Add support for the net stats64 counters to the usbnet core and then to
> the qmi_wwan driver.
> 
> This is a straightforward addition of 64bit counters for RX and TX packets
> and byte counts. It is done in the same style as for the other net drivers
> that support stats64.
> 
> The bulk of the change is to the usbnet core. Then it is trivial to use
> that in the qmi_wwan.c driver. It would be very simple to extend this
> support to other usbnet based drivers.
> 
> The motivation to add this is that it is not particularly difficult to
> get the RX and TX byte counts to wrap on 32bit platforms.

Hi,

you need to export the symbol usbnet_get_stats64
Other than that it looks good.

Regards
Oliver

Re: [PATCH net-next 7/8] vhost_net: try batch dequing from skb array

2017-03-23 Thread Jason Wang




On 2017-03-22 22:16, Michael S. Tsirkin wrote:
> On Tue, Mar 21, 2017 at 12:04:46PM +0800, Jason Wang wrote:
>> We used to dequeue one skb during recvmsg() from skb_array, this could
>> be inefficient because of the bad cache utilization and spinlock
>> touching for each packet. This patch tries to batch them by calling
>> batch dequeuing helpers explicitly on the exported skb array and pass
>> the skb back through msg_control for underlayer socket to finish the
>> userspace copying.
>>
>> Tests were done by XDP1:
>> - small buffer:
>>    Before: 1.88Mpps
>>    After : 2.25Mpps (+19.6%)
>> - mergeable buffer:
>>    Before: 1.83Mpps
>>    After : 2.10Mpps (+14.7%)
>>
>> Signed-off-by: Jason Wang
>> ---
>>   drivers/vhost/net.c | 64 +
>>   1 file changed, 60 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index 9b51989..53f09f2 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -28,6 +28,8 @@
>>   #include
>>   #include
>>   #include
>> +#include
>> +#include
>>
>>   #include
>>
>> @@ -85,6 +87,7 @@ struct vhost_net_ubuf_ref {
>> struct vhost_virtqueue *vq;
>>   };
>>
>> +#define VHOST_RX_BATCH 64
>>   struct vhost_net_virtqueue {
>> struct vhost_virtqueue vq;
>> size_t vhost_hlen;
>> @@ -99,6 +102,10 @@ struct vhost_net_virtqueue {
>> /* Reference counting for outstanding ubufs.
>>  * Protected by vq mutex. Writers must also take device mutex. */
>> struct vhost_net_ubuf_ref *ubufs;
>> +   struct skb_array *rx_array;
>> +   void *rxq[VHOST_RX_BATCH];
>> +   int rt;
>> +   int rh;
>>   };
>>
>>   struct vhost_net {
>> @@ -201,6 +208,8 @@ static void vhost_net_vq_reset(struct vhost_net *n)
>> n->vqs[i].ubufs = NULL;
>> n->vqs[i].vhost_hlen = 0;
>> n->vqs[i].sock_hlen = 0;
>> +   n->vqs[i].rt = 0;
>> +   n->vqs[i].rh = 0;
>> }
>>
>>   }
>> @@ -503,13 +512,30 @@ static void handle_tx(struct vhost_net *net)
>> mutex_unlock(&vq->mutex);
>>   }
>>
>> -static int peek_head_len(struct sock *sk)
>> +static int peek_head_len_batched(struct vhost_net_virtqueue *rvq)
>
> Pls rename to say what it actually does: fetch skbs

Ok.

>> +{
>> +   if (rvq->rh != rvq->rt)
>> +   goto out;
>> +
>> +   rvq->rh = rvq->rt = 0;
>> +   rvq->rt = skb_array_consume_batched_bh(rvq->rx_array, rvq->rxq,
>> +   VHOST_RX_BATCH);
>
> A comment explaining why it is -bh would be helpful.

Ok.

Thanks

[PATCH net-next] sched: act_csum: don't mangle TCP and UDP GSO packets

2017-03-23 Thread Davide Caratti

after act_csum computes the checksum on skbs carrying GSO TCP/UDP packets,
subsequent segmentation fails because skb_needs_check(skb, true) returns
true. Because of that, skb_warn_bad_offload() is invoked and the following
message is displayed:

WARNING: CPU: 3 PID: 28 at net/core/dev.c:2553 skb_warn_bad_offload+0xf0/0xfd
<...>

  [] skb_warn_bad_offload+0xf0/0xfd
  [] __skb_gso_segment+0xec/0x110
  [] validate_xmit_skb+0x12d/0x2b0
  [] validate_xmit_skb_list+0x42/0x70
  [] sch_direct_xmit+0xd0/0x1b0
  [] __qdisc_run+0x120/0x270
  [] __dev_queue_xmit+0x23d/0x690
  [] dev_queue_xmit+0x10/0x20

Since GSO is able to compute checksum on individual segments of such skbs,
we can simply skip mangling the packet.

Signed-off-by: Davide Caratti 
---
 net/sched/act_csum.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index e978ccd4..6c319a4 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -181,6 +181,9 @@ static int tcf_csum_ipv4_tcp(struct sk_buff *skb, unsigned int ihl,
struct tcphdr *tcph;
const struct iphdr *iph;
 
+   if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+   return 1;
+
tcph = tcf_csum_skb_nextlayer(skb, ihl, ipl, sizeof(*tcph));
if (tcph == NULL)
return 0;
@@ -202,6 +205,9 @@ static int tcf_csum_ipv6_tcp(struct sk_buff *skb, unsigned int ihl,
struct tcphdr *tcph;
const struct ipv6hdr *ip6h;
 
+   if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+   return 1;
+
tcph = tcf_csum_skb_nextlayer(skb, ihl, ipl, sizeof(*tcph));
if (tcph == NULL)
return 0;
@@ -225,6 +231,9 @@ static int tcf_csum_ipv4_udp(struct sk_buff *skb, unsigned int ihl,
const struct iphdr *iph;
u16 ul;
 
+   if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+   return 1;
+
/*
 * Support both UDP and UDPLITE checksum algorithms, Don't use
 * udph->len to get the real length without any protocol check,
@@ -278,6 +287,9 @@ static int tcf_csum_ipv6_udp(struct sk_buff *skb, unsigned int ihl,
const struct ipv6hdr *ip6h;
u16 ul;
 
+   if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+   return 1;
+
/*
 * Support both UDP and UDPLITE checksum algorithms, Don't use
 * udph->len to get the real length without any protocol check,
-- 
2.7.4
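
The early-return test added in each hunk above has the form "is this a GSO
packet, and is it of the protocol this handler mangles". The sketch below
models that test in userspace with the bitwise AND parenthesised explicitly.
Everything in it is a stand-in: struct toy_skb, the TOY_GSO_* values and
toy_skip_csum() are hypothetical, and treating a non-zero gso_size as "is GSO"
is an assumption of this model.

```c
#include <stdbool.h>

/* Illustrative flag values only; the kernel defines its own SKB_GSO_* bits. */
#define TOY_GSO_TCPV4 (1u << 0)
#define TOY_GSO_UDP   (1u << 1)

/* Minimal stand-in for the GSO fields the patch looks at. */
struct toy_skb {
	unsigned int gso_size; /* non-zero means "GSO packet" in this model */
	unsigned int gso_type; /* bitmask of TOY_GSO_* flags */
};

/* True when checksum mangling should be skipped for this protocol flag. */
static bool toy_skip_csum(const struct toy_skb *skb, unsigned int proto_flag)
{
	return skb->gso_size != 0 && (skb->gso_type & proto_flag) != 0;
}
```

In C, "a && b & c" already parses as "a && (b & c)", so the unparenthesised
form in the patch evaluates the same way; the extra parentheses here are only
for readability.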

Re: [PATCH v2 1/2] net: phy: Fix PHY AN done state machine for interrupt driven PHYs

2017-03-23 Thread Sergei Shtylyov


Hello!

On 3/22/2017 2:02 PM, Roger Quadros wrote:

> he ethernet link on an interrupt driven PHY was not coming up if the

   s/he/The/?

> ethernet cable was plugged before the ethernet interface was brought up.

   Also, my spell checker trips on "ethernet", perhaps should be capitalized?

> The PHY state machine seems to be stuck from RUNNING to AN state
> with no new interrupts from the PHY. So it doesn't know when the
> PHY Auto-negotiation has been completed and doesn't transition to RUNNING
> state with ANEG done thus netif_carrier_on() is never called.
>
> NOTE: genphy_config_aneg() will not restart PHY Auto-negotiation of
> advertisement parameters didn't change.
>
> Fix this by scheduling the PHY state machine in phy_start_aneg().
>
> Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
> Cc: stable  # v4.9+
> Signed-off-by: Roger Quadros

[...]

MBR, Sergei

stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Corentin Labbe

Hello

Using next-20170323 produces a huge performance regression on my sunxi boards.
On dwmac-sun8i, iperf goes from 94 Mbit/s to 37 when sending.

On cubieboard2 (dwmac-sunxi), iperf makes the kernel flood the log with
"ndesc_get_rx_status: Oversized frame spanned multiple buffers"
and the network is lost afterwards.

Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
I am still trying to find which part of this patch lowers the performance.

Regards
Corentin Labbe

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Joao Pinto


Hi Corentin,

At 10:08 AM on 3/23/2017, Corentin Labbe wrote:
> Hello
> 
> Using next-20170323 produces a huge performance regression on my sunxi boards.
> On dwmac-sun8i, iperf goes from 94 Mbit/s to 37 when sending.
> 
> On cubieboard2 (dwmac-sunxi), iperf makes the kernel flood the log with
> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
> and the network is lost afterwards.
> 
> Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
> I am still trying to find which part of this patch lowers the performance.
> 
> Regards
> Corentin Labbe
> 

I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
Could you please share the iperf commands you are using so that I can reproduce
this on my side?

@stmmac users: It would be great if people that have a setup could also perform
the same iperf test in order to clear it up for everyone.

Thanks,
Joao

[patch net-next] mlxsw: Remove debugfs interface

2017-03-23 Thread Jiri Pirko

From: Ido Schimmel 

We don't use it during development and we can't extend it either, so
remove it.

Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/core.c | 177 -
 drivers/net/ethernet/mellanox/mlxsw/pci.c  | 138 --
 2 files changed, 315 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index a4c0784..fb8187d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -40,9 +40,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
 #include 
 #include 
 #include 
@@ -74,23 +71,9 @@ static DEFINE_SPINLOCK(mlxsw_core_driver_list_lock);
 
 static const char mlxsw_core_driver_name[] = "mlxsw_core";
 
-static struct dentry *mlxsw_core_dbg_root;
-
 static struct workqueue_struct *mlxsw_wq;
 static struct workqueue_struct *mlxsw_owq;
 
-struct mlxsw_core_pcpu_stats {
-   u64 trap_rx_packets[MLXSW_TRAP_ID_MAX];
-   u64 trap_rx_bytes[MLXSW_TRAP_ID_MAX];
-   u64 port_rx_packets[MLXSW_PORT_MAX_PORTS];
-   u64 port_rx_bytes[MLXSW_PORT_MAX_PORTS];
-   struct u64_stats_sync   syncp;
-   u32 trap_rx_dropped[MLXSW_TRAP_ID_MAX];
-   u32 port_rx_dropped[MLXSW_PORT_MAX_PORTS];
-   u32 trap_rx_invalid;
-   u32 port_rx_invalid;
-};
-
 struct mlxsw_core_port {
struct devlink_port devlink_port;
void *port_driver_priv;
@@ -121,12 +104,6 @@ struct mlxsw_core {
spinlock_t trans_list_lock; /* protects trans_list writes */
bool use_emad;
} emad;
-   struct mlxsw_core_pcpu_stats __percpu *pcpu_stats;
-   struct dentry *dbg_dir;
-   struct {
-   struct debugfs_blob_wrapper vsd_blob;
-   struct debugfs_blob_wrapper psid_blob;
-   } dbg;
struct {
u8 *mapping; /* lag_id+port_index to local_port mapping */
} lag;
@@ -703,91 +680,6 @@ static int mlxsw_emad_reg_access(struct mlxsw_core *mlxsw_core,
  * Core functions
  */
 
-static int mlxsw_core_rx_stats_dbg_read(struct seq_file *file, void *data)
-{
-   struct mlxsw_core *mlxsw_core = file->private;
-   struct mlxsw_core_pcpu_stats *p;
-   u64 rx_packets, rx_bytes;
-   u64 tmp_rx_packets, tmp_rx_bytes;
-   u32 rx_dropped, rx_invalid;
-   unsigned int start;
-   int i;
-   int j;
-   static const char hdr[] =
-   " NUM   RX_PACKETS RX_BYTES RX_DROPPED\n";
-
-   seq_printf(file, hdr);
-   for (i = 0; i < MLXSW_TRAP_ID_MAX; i++) {
-   rx_packets = 0;
-   rx_bytes = 0;
-   rx_dropped = 0;
-   for_each_possible_cpu(j) {
-   p = per_cpu_ptr(mlxsw_core->pcpu_stats, j);
-   do {
-   start = u64_stats_fetch_begin(&p->syncp);
-   tmp_rx_packets = p->trap_rx_packets[i];
-   tmp_rx_bytes = p->trap_rx_bytes[i];
-   } while (u64_stats_fetch_retry(&p->syncp, start));
-
-   rx_packets += tmp_rx_packets;
-   rx_bytes += tmp_rx_bytes;
-   rx_dropped += p->trap_rx_dropped[i];
-   }
-   seq_printf(file, "trap %3d %12llu %12llu %10u\n",
-  i, rx_packets, rx_bytes, rx_dropped);
-   }
-   rx_invalid = 0;
-   for_each_possible_cpu(j) {
-   p = per_cpu_ptr(mlxsw_core->pcpu_stats, j);
-   rx_invalid += p->trap_rx_invalid;
-   }
-   seq_printf(file, "trap INV   %10u\n",
-  rx_invalid);
-
-   for (i = 0; i < MLXSW_PORT_MAX_PORTS; i++) {
-   rx_packets = 0;
-   rx_bytes = 0;
-   rx_dropped = 0;
-   for_each_possible_cpu(j) {
-   p = per_cpu_ptr(mlxsw_core->pcpu_stats, j);
-   do {
-   start = u64_stats_fetch_begin(&p->syncp);
-   tmp_rx_packets = p->port_rx_packets[i];
-   tmp_rx_bytes = p->port_rx_bytes[i];
-   } while (u64_stats_fetch_retry(&p->syncp, start));
-
-   rx_packets += tmp_rx_packets;
-   rx_bytes += tmp_rx_bytes;
-   rx_dropped += p->port_rx_dropped[i];
-   }
-   seq_printf(file, "port %3d %12llu %12llu %10u\n",
-  i, rx_packets, rx_bytes, rx_dropped);
-   }
-   rx_invalid = 0;
-   for_each_possible_cpu(j) {
-   p = per_cpu_ptr(mlxsw_core->pcpu_stats, j);
-   rx_invalid +

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Corentin Labbe

On Thu, Mar 23, 2017 at 10:12:18AM +, Joao Pinto wrote:
> 
> Hi Corentin,
> 
> On 3/23/2017 at 10:08 AM, Corentin Labbe wrote:
> > Hello
> > 
> > Using next-20170323 produces a huge performance regression on my sunxi
> > boards.
> > On dwmac-sun8i, iperf goes from 94 Mb/s to 37 when sending.
> > 
> > On cubieboard2 (dwmac-sunxi), iperf made the kernel flood with
> > "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
> > and the network is lost afterwards.
> > 
> > Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
> > I am still trying to find which part of this patch lowers the performance.
> > 
> > Regards
> > Corentin Labbe
> > 
> 
> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
> Could you please share the iperf cmds you are using in order for me to 
> reproduce
> in my side?

A simple "iperf -c serverip" on both boards.

[PATCH net-next 0/2] net: bridge: allow user-space to add ext learned entries

2017-03-23 Thread Nikolay Aleksandrov

Hi,
This set adds the ability to add externally learned entries from
user-space. For symmetry and proper function we need to allow SW entries
to take over HW learned ones (similar to how HW can take over SW entries
currently) which is needed for our use case (evpn) where we have pure SW
ports and HW ports mixed in a single bridge. This does not play well with
switchdev devices currently because there's no feedback when the entry is
taken over, but this case has never worked anyway and feedback can be
easily added when needed.
Patch 02 simply allows to use NTF_EXT_LEARNED from user-space, we already
have Quagga patches that make use of this functionality.

Thanks,
 Nik

Nikolay Aleksandrov (2):
  net: bridge: allow SW learn to take over HW fdb entries
  net: bridge: allow to add externally learned entries from user-space

 net/bridge/br_fdb.c | 5 +
 1 file changed, 5 insertions(+)

-- 
2.1.4

[PATCH net-next 2/2] net: bridge: allow to add externally learned entries from user-space

2017-03-23 Thread Nikolay Aleksandrov

The NTF_EXT_LEARNED flag was added for switchdev and externally learned
entries, but it can also be used for entries learned by software in
user-space that requires dynamic entries which do not expire.
One such case that we have is quagga with evpn, which needs dynamic
entries but must also age them itself.

Signed-off-by: Nikolay Aleksandrov 
---
 net/bridge/br_fdb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e4c8adf517ea..7e5902e69f85 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -857,6 +857,8 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
br_fdb_update(br, p, addr, vid, true);
rcu_read_unlock();
local_bh_enable();
+   } else if (ndm->ndm_flags & NTF_EXT_LEARNED) {
+   err = br_fdb_external_learn_add(br, p, addr, vid);
} else {
spin_lock_bh(&br->hash_lock);
err = fdb_add_entry(br, p, addr, ndm->ndm_state,
-- 
2.1.4

[PATCH net-next 1/2] net: bridge: allow SW learn to take over HW fdb entries

2017-03-23 Thread Nikolay Aleksandrov

Allow to take over an entry which was previously learned via HW when it
shows up from a SW port. This is analogous to how HW takes over SW learned
entries already.

Suggested-by: Roopa Prabhu 
Signed-off-by: Nikolay Aleksandrov 
---
 net/bridge/br_fdb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 4f598dc2d916..e4c8adf517ea 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -594,6 +594,9 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
fdb->updated = now;
if (unlikely(added_by_user))
fdb->added_by_user = 1;
+   /* Take over HW learned entry */
+   if (unlikely(fdb->added_by_external_learn))
+   fdb->added_by_external_learn = 0;
if (unlikely(fdb_modified))
fdb_notify(br, fdb, RTM_NEWNEIGH);
}
-- 
2.1.4

Re: stmmac still supporting spear600 ?

2017-03-23 Thread Giuseppe CAVALLARO


Hello Thomas

On 3/21/2017 3:50 PM, Thomas Petazzoni wrote:

Hello,

On Thu, 9 Mar 2017 15:56:31 +0100, Giuseppe CAVALLARO wrote:


On 3/9/2017 10:32 AM, Thomas Petazzoni wrote:


OK, I'll have a look. However, I'm still confused by this DMA_RESET bit
that never clears, contrary to what the datasheet says. Are there some
erratas?

I suggest you to take a look at the tx/rx clocks from PHY.
You have to provide these otherwise you cannot reset the engine.

Thanks for the hint.


you are welcome


Further research has revealed that everything is working fine on a
platform with a Gigabit PHY connected via GMII.

However, on a different platform (which I'm using) with a 10/100 PHY
connected via MII, DMA_RESET never clears, and networking doesn't work.
The SMSC PHY LAN8700 is also supposed to be providing the clock through
its TX_CLK pin. I double checked, and both the MAC and PHY are in MII
mode, but still no luck so far.

Of course, if you have any suggestion or hint, I'm all ears :)


I can only suggest that you keep the focus on the clock configuration. I
tested the SMSC PHY LAN8700 without any issues on several platforms. In MII
mode both rx_clk and tx_clk are provided by the PHY, and if you have an
external oscillator this should be safe enough.
Another check you can do is on the reset time. Maybe you need to change
something when resetting the SMSC transceiver; try increasing the delay
(if you use a GPIO to reset it).


Regards
Peppe



Thanks,

Thomas

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Joao Pinto

On 3/23/2017 at 10:20 AM, Corentin Labbe wrote:
> On Thu, Mar 23, 2017 at 10:12:18AM +, Joao Pinto wrote:
>>
>> Hi Corentin,
>>
>> On 3/23/2017 at 10:08 AM, Corentin Labbe wrote:
>>> Hello
>>>
>>> Using next-20170323 produces a huge performance regression on my sunxi
>>> boards.
>>> On dwmac-sun8i, iperf goes from 94 Mb/s to 37 when sending.
>>>
>>> On cubieboard2 (dwmac-sunxi), iperf made the kernel flood with
>>> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"
>>> and the network is lost afterwards.
>>>
>>> Reverting aff3d9eff84399e433c4aca65a9bb236581bc082 fixes the issue.
>>> I am still trying to find which part of this patch lowers the performance.
>>>
>>> Regards
>>> Corentin Labbe
>>>
>>
>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>> Could you please share the iperf cmds you are using in order for me to 
>> reproduce
>> in my side?
> 
> A simple "iperf -c serverip" on both boards.
> 

Ok, I am going to run my tests with a fresh net-next and come back to you soon.

Thanks,
Joao

Re: [PATCH 4.11] genetlink: fix counting regression on ctrl_dumpfamily()

2017-03-23 Thread poma

On 22.03.2017 16:08, Stanislaw Gruszka wrote:
> Commit 2ae0f17df1cd ("genetlink: use idr to track families") replaced
> 
>   if (++n < fams_to_skip)
>   continue;
> into:
> 
>   if (n++ < fams_to_skip)
>   continue;
> 
> This subtle change causes a retried ctrl_dumpfamily() call to omit the
> one family that failed ctrl_fill_info() on the previous call, because
> cb->args[0] = n also counts the family that failed
> ctrl_fill_info().
> 
> To fix the problem and avoid confusion in the future, just decrease the
> n counter when ctrl_fill_info() fails.
> 
> The user-visible problem caused by this bug is failure to get access to
> some genetlink families, e.g. nl80211. However, the problem is reproducible
> only if the number of registered genetlink families is big enough to
> cause a second call of ctrl_dumpfamily().
> 
> Cc: Xose Vazquez Perez 
> Cc: Larry Finger 
> Cc: Johannes Berg 
> Fixes: 2ae0f17df1cd ("genetlink: use idr to track families")
> Signed-off-by: Stanislaw Gruszka 
> ---
> Dave, please also target this for 4.10+ -stable.
> 
>  net/netlink/genetlink.c |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
> index fb6e10f..92e0981 100644
> --- a/net/netlink/genetlink.c
> +++ b/net/netlink/genetlink.c
> @@ -783,8 +783,10 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb)
>  
>   if (ctrl_fill_info(rt, NETLINK_CB(cb->skb).portid,
>  cb->nlh->nlmsg_seq, NLM_F_MULTI,
> -skb, CTRL_CMD_NEWFAMILY) < 0)
> +skb, CTRL_CMD_NEWFAMILY) < 0) {
> + n--;
>   break;
> + }
>   }
>  
>   cb->args[0] = n;
> 


Thanks Stanislaw, Larry!

Tested-by: poma 

Ref.
https://bugzilla.redhat.com/show_bug.cgi?id=1422247
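
Archive note: the off-by-one described in the quoted commit message can be reproduced outside the kernel. The following is a minimal user-space sketch, not the kernel code: `fill_info()`, `dump_pass()`, `dump_all()` and `NUM_FAMILIES` are hypothetical stand-ins for ctrl_fill_info(), one ctrl_dumpfamily() call, the netlink dump loop and the registered family count. The `skip` argument plays the role of cb->args[0].

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_FAMILIES 5

/* Hypothetical stand-in for ctrl_fill_info(): fails once the message is full. */
static int fill_info(int *room)
{
	if (*room == 0)
		return -1;
	(*room)--;
	return 0;
}

/*
 * One dump pass. 'skip' is the resume counter saved by the previous pass,
 * 'room' is how many entries fit into this pass. Returns the counter to
 * save for the next pass.
 */
static int dump_pass(int skip, int room, bool seen[NUM_FAMILIES], bool fixed)
{
	int n = 0;

	assert(skip >= 0);
	for (int id = 0; id < NUM_FAMILIES; id++) {
		if (n++ < skip)
			continue;
		if (fill_info(&room) < 0) {
			if (fixed)
				n--;	/* family was not emitted, retry it next pass */
			break;
		}
		seen[id] = true;
	}
	return n;
}

/* Run enough passes to finish the dump; return how many families were emitted. */
static int dump_all(int room_per_pass, bool fixed)
{
	bool seen[NUM_FAMILIES] = { false };
	int skip = 0, count = 0;

	for (int pass = 0; pass < NUM_FAMILIES + 1; pass++)
		skip = dump_pass(skip, room_per_pass, seen, fixed);
	for (int id = 0; id < NUM_FAMILIES; id++)
		count += seen[id];
	return count;
}
```

With room for two entries per pass, `dump_all(2, false)` loses the family that hit the full buffer, while `dump_all(2, true)` emits all five.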

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Giuseppe CAVALLARO

Hello

On 3/23/2017 11:20 AM, Corentin Labbe wrote:

> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
> Could you please share the iperf cmds you are using in order for me to
> reproduce in my side?

Joao, you have a really powerful HW integration with multiple channels
for both RX and TX.
Often this is not the case for other setups where usually just DMA0 is
present or, sometimes, there is just one extra RX channel.

My question is: what happens on this kind of configuration? Are we
still guaranteeing the best performance?

Also, we have to guarantee that TSO and SG are always working.
Another point is the buffer sizes, which can differ among platforms.

The problem reported by Corentin below pushes me to think that there is a
bug, so we should understand when it was introduced and whether it can be
fixed by some configuration we are not taking care of right now.

"ndesc_get_rx_status: Oversized frame spanned multiple buffers"

Best Regards
Peppe

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Giuseppe CAVALLARO

On 3/23/2017 11:48 AM, Giuseppe CAVALLARO wrote:

> Hello
>
> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>> Could you please share the iperf cmds you are using in order for me
>>> to reproduce in my side?
>
> Joao, you have a really powerful HW integration with multiple channels
> for both RX and TX.
> Often this is not the case for other setups where usually just DMA0
> is present or, sometimes, there is just one extra RX channel.
>
> My question is: what happens on this kind of configuration? Are we
> still guaranteeing the best performance?
>
> Also, we have to guarantee that TSO and SG are always working.
> Another point is the buffer sizes, which can differ among platforms.
>
> The problem reported by Corentin below pushes me to think that there is
> a bug, so we should understand when it was introduced and whether it can
> be fixed by some configuration we are not taking care of right now.
>
> "ndesc_get_rx_status: Oversized frame spanned multiple buffers"

I wonder if this could be easily triggered by getting a big file via
FTP, so it may not be strictly related to performance benchmarks.

peppe

> Best Regards
> Peppe

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Joao Pinto


Hi Peppe,

On 3/23/2017 at 10:48 AM, Giuseppe CAVALLARO wrote:
> Hello
> 
> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>> Could you please share the iperf cmds you are using in order for me to
>>> reproduce in my side?
> 
> Joao, you have a really powerful HW integration with multiple channels for
> both RX and TX.
> Often this is not the same for other setup where, usually just a DMA0 is
> present or, sometime, there is just one RX extra channel.

My opinion is that we should not have problems, since the majority of the
features introduced are used only if you configure rx queues > 1 or tx
queues > 1, so if you use the default (=1) those configurations will not
take place.

> 
> My question is, what happens on this kind of configurations? Are we still
> guarantying the best performances?
> 
> Also we have to guarantee, that the TSO and SG are always working. Another
> point is the buffer sizes that can be different among platforms.

We have to pay attention to the RX buffer size, since I had problems with DHCP
messages not being received because the buffer size was too small.
Currently the TX buffer size is not configurable, and in the future it would
be useful to make it configurable too.

> 
> The problem below reported by Corentin push me to think that there is a bug,
> so we should understand when this has been introduced and if likely fixed by
> some configuration we are not take care right now.

Of course.

> 
> ndesc_get_rx_status: Oversized frame spanned multiple buffers"
> 
> 
> Best Regards
> Peppe

Thanks,
Joao

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Joao Pinto

On 3/23/2017 at 10:51 AM, Giuseppe CAVALLARO wrote:
> On 3/23/2017 11:48 AM, Giuseppe CAVALLARO wrote:
>> Hello
>>
>> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>>> Could you please share the iperf cmds you are using in order for me to
>>>> reproduce in my side?
>>
>> Joao, you have a really powerful HW integration with multiple channels for
>> both RX and TX.
>> Often this is not the same for other setup where, usually just a DMA0 is
>> present or, sometime, there
>> is just one RX extra channel.
>>
>> My question is, what happens on this kind of configurations? Are we still
>> guarantying the best performances?
>>
>> Also we have to guarantee, that the TSO and SG are always working. Another
>> point is the buffer sizes that
>> can be different among platforms.
>>
>> The problem  below reported by Corentin push me to think that there is a bug,
>> so we should
>> understand when this has been introduced and if likely fixed by some
>> configuration we are
>> not take care right now.
>>
>> ndesc_get_rx_status: Oversized frame spanned multiple buffers"
> 
> I wonder if this could be easily triggered by getting a big file via FTP. So
> not properly related on performance benchs

I am going to do that test and check it out, and also run iperf a couple of
times. I expect to do this today and send you the results later. If
anyone gets results sooner, please share.

> 
> peppe
> 
>>
>>
>> Best Regards
>> Peppe
>>
> 

Thanks.

[PATCH net] r8152: prevent the driver from transmitting packets with carrier off

2017-03-23 Thread Hayes Wang

The link status may change during autosuspend. After autoresume, the
driver may then try to transmit packets while the device is carrier
off, because the interrupt transfer has not updated the link status
yet. If the device is in ALDPS mode, it would stop working.

Another similar case is:
 1. unplug the cable.
 2. the interrupt transfer queues a work item for the link change.
 3. the device enters ALDPS mode.
 4. a tx occurs before the work item runs.

Signed-off-by: Hayes Wang 
---
 drivers/net/usb/r8152.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 0b1b918..c34df33 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -1294,6 +1294,7 @@ static void intr_callback(struct urb *urb)
}
} else {
if (netif_carrier_ok(tp->netdev)) {
+   netif_stop_queue(tp->netdev);
set_bit(RTL8152_LINK_CHG, &tp->flags);
schedule_delayed_work(&tp->schedule, 0);
}
@@ -3169,6 +3170,9 @@ static void set_carrier(struct r8152 *tp)
napi_enable(&tp->napi);
netif_wake_queue(netdev);
netif_info(tp, link, netdev, "carrier on\n");
+   } else if (netif_queue_stopped(netdev) &&
+  skb_queue_len(&tp->tx_queue) < tp->tx_qlen) {
+   netif_wake_queue(netdev);
}
} else {
if (netif_carrier_ok(netdev)) {
@@ -3702,8 +3706,18 @@ static int rtl8152_resume(struct usb_interface *intf)
tp->rtl_ops.autosuspend_en(tp, false);
napi_disable(&tp->napi);
set_bit(WORK_ENABLE, &tp->flags);
-   if (netif_carrier_ok(tp->netdev))
-   rtl_start_rx(tp);
+
+   if (netif_carrier_ok(tp->netdev)) {
+   if (rtl8152_get_speed(tp) & LINK_STATUS) {
+   rtl_start_rx(tp);
+   } else {
+   netif_carrier_off(tp->netdev);
+   tp->rtl_ops.disable(tp);
+   netif_info(tp, link, tp->netdev,
+  "linking down\n");
+   }
+   }
+
napi_enable(&tp->napi);
clear_bit(SELECTIVE_SUSPEND, &tp->flags);
smp_mb__after_atomic();
-- 
2.7.4

[PATCH v3 net 2/5] net:ethernet:aquantia: Fix packet type detection (TCP/UDP) for IPv6.

2017-03-23 Thread Pavel Belous

From: Pavel Belous 

In order for the checksum offloads to work correctly we need to set the
packet type bit (TCP/UDP) in the TX context buffer.

Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")

Signed-off-by: Pavel Belous 
---
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index ee78444..db2b51d 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -510,10 +510,22 @@ static unsigned int aq_nic_map_skb(struct aq_nic_s *self,
if (skb->ip_summed == CHECKSUM_PARTIAL) {
dx_buff->is_ip_cso = (htons(ETH_P_IP) == skb->protocol) ?
1U : 0U;
-   dx_buff->is_tcp_cso =
-   (ip_hdr(skb)->protocol == IPPROTO_TCP) ? 1U : 0U;
-   dx_buff->is_udp_cso =
-   (ip_hdr(skb)->protocol == IPPROTO_UDP) ? 1U : 0U;
+
+   if (ip_hdr(skb)->version == 4) {
+   dx_buff->is_tcp_cso =
+   (ip_hdr(skb)->protocol == IPPROTO_TCP) ?
+   1U : 0U;
+   dx_buff->is_udp_cso =
+   (ip_hdr(skb)->protocol == IPPROTO_UDP) ?
+   1U : 0U;
+   } else if (ip_hdr(skb)->version == 6) {
+   dx_buff->is_tcp_cso =
+   (ipv6_hdr(skb)->nexthdr == NEXTHDR_TCP) ?
+   1U : 0U;
+   dx_buff->is_udp_cso =
+   (ipv6_hdr(skb)->nexthdr == NEXTHDR_UDP) ?
+   1U : 0U;
+   }
}
 
for (; nr_frags--; ++frag_count) {
-- 
2.7.4
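
Archive note: the idea behind this patch, reading the L4 protocol from the field that matches the IP version, can be sketched in plain user-space C. This is an illustration under stated assumptions, not the driver code: `l4_protocol()` and the `L4_*` constants are hypothetical names, the offsets are the standard IPv4 protocol byte (offset 9) and IPv6 next-header byte (offset 6), and, like the patch, the sketch does not walk IPv6 extension headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IANA protocol numbers for the two offloadable L4 protocols. */
enum { L4_TCP = 6, L4_UDP = 17 };

/*
 * Return the L4 protocol number carried by an L3 header, or -1 if the
 * header is too short or is neither IPv4 nor IPv6.
 */
static int l4_protocol(const uint8_t *l3, size_t len)
{
	assert(l3 != NULL);
	if (len < 1)
		return -1;

	switch (l3[0] >> 4) {		/* version nibble is shared by v4 and v6 */
	case 4:
		return len >= 20 ? l3[9] : -1;	/* IPv4 'protocol' field */
	case 6:
		return len >= 40 ? l3[6] : -1;	/* IPv6 'next header' field */
	default:
		return -1;
	}
}
```

A caller would then set the TCP or UDP checksum-offload bit by comparing the result with `L4_TCP` or `L4_UDP`, independent of the IP version.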

[PATCH v3 net 1/5] net:ethernet:aquantia: Remove adapter re-opening when MTU changed.

2017-03-23 Thread Pavel Belous

From: Pavel Belous 

Closing/opening the adapter is not needed at all.
The new MTU settings take effect immediately.

Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")

Signed-off-by: Pavel Belous 
---
 drivers/net/ethernet/aquantia/atlantic/aq_main.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_main.c b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
index d05fbfd..5d6c40d 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_main.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
@@ -100,11 +100,6 @@ static int aq_ndev_change_mtu(struct net_device *ndev, int new_mtu)
goto err_exit;
ndev->mtu = new_mtu;
 
-   if (netif_running(ndev)) {
-   aq_ndev_close(ndev);
-   aq_ndev_open(ndev);
-   }
-
 err_exit:
return err;
 }
-- 
2.7.4

[PATCH v3 net 0/5] net:ethernet:aquantia: Misc fixes for atlantic driver.

2017-03-23 Thread Pavel Belous

From: Pavel Belous 

The following patchset contains several fixes for the aQuantia AQtion driver
for the net tree: a couple of fixes for IPv6 and other fixes.

v1->v2: Fix compilation error (using HW_ATL_A0_TXD_CTL_CMD_IPV6 instead of
HW_ATL_B0_TXD_CTL_CMD_IPV6).
v2->v3: Added "Fixes" tags.

Pavel Belous (5):
  net:ethernet:aquantia: Remove adapter re-opening when MTU changed.
  net:ethernet:aquantia: Fix packet type detection (TCP/UDP) for IPv6.
  net:ethernet:aquantia: Missing spinlock initialization.
  net:ethernet:aquantia: Fix for LSO with IPv6.
  net:ethernet:aquantia: Reset is_gso flag when EOP reached.

 drivers/net/ethernet/aquantia/atlantic/aq_main.c   |  5 -
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c| 23 ++
 drivers/net/ethernet/aquantia/atlantic/aq_ring.c   |  1 +
 drivers/net/ethernet/aquantia/atlantic/aq_ring.h   |  3 ++-
 .../ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c  |  4 
 .../ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c  |  4 
 6 files changed, 30 insertions(+), 10 deletions(-)

-- 
2.7.4

[PATCH v3 net 3/5] net:ethernet:aquantia: Missing spinlock initialization.

2017-03-23 Thread Pavel Belous

From: Pavel Belous 

Fix missing initialization of the aq_ring header.lock spinlock.

Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code")

Signed-off-by: Pavel Belous 
---
 drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
index 0358e607..3a8a4aa 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -101,6 +101,7 @@ int aq_ring_init(struct aq_ring_s *self)
self->hw_head = 0;
self->sw_head = 0;
self->sw_tail = 0;
+   spin_lock_init(&self->header.lock);
return 0;
 }
 
-- 
2.7.4

[PATCH v3 net 4/5] net:ethernet:aquantia: Fix for LSO with IPv6.

2017-03-23 Thread Pavel Belous

From: Pavel Belous 

Fix Context Command bit: L3 type = "0" for IPv4, "1" for IPv6.

Fixes: bab6de8fd180 ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions.")

Signed-off-by: Pavel Belous 
---
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c   | 3 +++
 drivers/net/ethernet/aquantia/atlantic/aq_ring.h  | 3 ++-
 drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c | 3 +++
 drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c | 3 +++
 4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index db2b51d..cdb0299 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -487,6 +487,9 @@ static unsigned int aq_nic_map_skb(struct aq_nic_s *self,
dx_buff->mss = skb_shinfo(skb)->gso_size;
dx_buff->is_txc = 1U;
 
+   dx_buff->is_ipv6 =
+   (ip_hdr(skb)->version == 6) ? 1U : 0U;
+
dx = aq_ring_next_dx(ring, dx);
dx_buff = &ring->buff_ring[dx];
++ret;
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.h b/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
index 2572546..eecd6d1 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
@@ -58,7 +58,8 @@ struct __packed aq_ring_buff_s {
u8 len_l2;
u8 len_l3;
u8 len_l4;
-   u8 rsvd2;
+   u8 is_ipv6:1;
+   u8 rsvd2:7;
u32 len_pkt;
};
};
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
index a2b746a..a536875 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
@@ -433,6 +433,9 @@ static int hw_atl_a0_hw_ring_tx_xmit(struct aq_hw_s *self,
buff->len_l3 +
buff->len_l2);
is_gso = true;
+
+   if (buff->is_ipv6)
+   txd->ctl |= HW_ATL_A0_TXD_CTL_CMD_IPV6;
} else {
buff_pa_len = buff->len;
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
index cab2931..69488c9 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
@@ -471,6 +471,9 @@ static int hw_atl_b0_hw_ring_tx_xmit(struct aq_hw_s *self,
buff->len_l3 +
buff->len_l2);
is_gso = true;
+
+   if (buff->is_ipv6)
+   txd->ctl |= HW_ATL_B0_TXD_CTL_CMD_IPV6;
} else {
buff_pa_len = buff->len;
 
-- 
2.7.4

[PATCH v3 net 5/5] net:ethernet:aquantia: Reset is_gso flag when EOP reached.

2017-03-23 Thread Pavel Belous

From: Pavel Belous 

We need to reset the is_gso flag when EOP is reached (the entire LSO packet
has been processed).

Fixes: bab6de8fd180 ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions.")

Signed-off-by: Pavel Belous 
---
 drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c | 1 +
 drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
index a536875..4ee15ff 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
@@ -461,6 +461,7 @@ static int hw_atl_a0_hw_ring_tx_xmit(struct aq_hw_s *self,
if (unlikely(buff->is_eop)) {
txd->ctl |= HW_ATL_A0_TXD_CTL_EOP;
txd->ctl |= HW_ATL_A0_TXD_CTL_CMD_WB;
+   is_gso = false;
}
}
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
index 69488c9..4215070 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
@@ -499,6 +499,7 @@ static int hw_atl_b0_hw_ring_tx_xmit(struct aq_hw_s *self,
if (unlikely(buff->is_eop)) {
txd->ctl |= HW_ATL_B0_TXD_CTL_EOP;
txd->ctl |= HW_ATL_B0_TXD_CTL_CMD_WB;
+   is_gso = false;
}
}
 
-- 
2.7.4
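
Archive note: why a per-packet flag must be cleared at end-of-packet can be shown with a small stand-alone sketch. It assumes a simplified descriptor model in which a context descriptor marks the start of an LSO packet and the data descriptors that follow inherit the flag; `struct tx_buf` and `mark_lso()` are hypothetical names and do not mirror the driver's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified TX buffer: a context entry announces an LSO packet. */
struct tx_buf {
	bool is_ctx;	/* context descriptor, starts an LSO packet */
	bool is_eop;	/* last data descriptor of a packet */
	bool lso;	/* output: descriptor is sent with LSO enabled */
};

/* Walk a batch of buffers and decide which data descriptors get LSO. */
static void mark_lso(struct tx_buf *b, size_t n)
{
	bool is_gso = false;

	assert(b != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (b[i].is_ctx) {
			is_gso = true;
			continue;
		}
		b[i].lso = is_gso;
		if (b[i].is_eop)
			is_gso = false;	/* without this, later packets inherit LSO */
	}
}
```

If the reset on `is_eop` is removed, a plain packet queued in the same batch after an LSO packet would also be marked, which is the behaviour the patch guards against.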

Re: [PATCH] net: usbnet: support 64bit stats in qmi_wwan driver

2017-03-23 Thread kbuild test robot

Hi Greg,

[auto build test ERROR on net/master]
[also build test ERROR on v4.11-rc3]
[cannot apply to net-next/master next-20170323]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Greg-Ungerer/net-usbnet-support-64bit-stats-in-qmi_wwan-driver/20170323-171629
config: x86_64-rhel (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

>> ERROR: "usbnet_get_stats64" [drivers/net/usb/qmi_wwan.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

[PATCH] padata: avoid race in reordering

2017-03-23 Thread Jason A. Donenfeld

Under extremely heavy uses of padata, crashes occur, and with list
debugging turned on, this happens instead:

[87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
__list_add+0xae/0x130
[87487.301868] list_add corruption. prev->next should be next
(b17abfc043d0), but was 8dba70872c80. (prev=8dba70872b00).
[87487.339011]  [] dump_stack+0x68/0xa3
[87487.342198]  [] ? console_unlock+0x281/0x6d0
[87487.345364]  [] __warn+0xff/0x140
[87487.348513]  [] warn_slowpath_fmt+0x4a/0x50
[87487.351659]  [] __list_add+0xae/0x130
[87487.354772]  [] ? _raw_spin_lock+0x64/0x70
[87487.357915]  [] padata_reorder+0x1e6/0x420
[87487.361084]  [] padata_do_serial+0xa5/0x120

padata_reorder calls list_add_tail with the list to which it's adding
locked, which seems correct:

spin_lock(&squeue->serial.lock);
list_add_tail(&padata->list, &squeue->serial.list);
spin_unlock(&squeue->serial.lock);

This therefore leaves only one place where such inconsistency could occur:
if padata->list is added at the same time on two different threads.
This padata pointer comes from the function call to
padata_get_next(pd), which has in it the following block:

next_queue = per_cpu_ptr(pd->pqueue, cpu);
padata = NULL;
reorder = &next_queue->reorder;
if (!list_empty(&reorder->list)) {
   padata = list_entry(reorder->list.next,
   struct padata_priv, list);
   spin_lock(&reorder->lock);
   list_del_init(&padata->list);
   atomic_dec(&pd->reorder_objects);
   spin_unlock(&reorder->lock);

   pd->processed++;

   goto out;
}
out:
return padata;

I strongly suspect that the problem here is that two threads can race
on the reorder list. Even though the deletion is locked, the call to
list_entry is not locked, which means it's feasible that two threads
pick up the same padata object and subsequently call list_add_tail on
it at the same time. The fix is thus to hoist that lock outside of
that block.

Signed-off-by: Jason A. Donenfeld 
---
 kernel/padata.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 05316c9f32da..3202aa17492c 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -186,19 +186,20 @@ static struct padata_priv *padata_get_next(struct parallel_data *pd)
 
reorder = &next_queue->reorder;
 
+   spin_lock(&reorder->lock);
if (!list_empty(&reorder->list)) {
padata = list_entry(reorder->list.next,
struct padata_priv, list);
 
-   spin_lock(&reorder->lock);
list_del_init(&padata->list);
atomic_dec(&pd->reorder_objects);
-   spin_unlock(&reorder->lock);
 
pd->processed++;
 
+   spin_unlock(&reorder->lock);
goto out;
}
+   spin_unlock(&reorder->lock);
 
if (__this_cpu_read(pd->pqueue->cpu_index) == next_queue->cpu_index) {
padata = ERR_PTR(-ENODATA);
-- 
2.11.1
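
Archive note: the pattern the patch applies, taking the lock before looking at the list head rather than only around the removal, can be sketched in portable C11. This is an illustration, not the padata code: `struct item`, `struct reorder_queue` and the `queue_*()` helpers are hypothetical, and a busy-wait lock built on `atomic_flag` stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct item {
	struct item *next;
	int seq;
};

struct reorder_queue {
	atomic_flag lock;	/* stand-in for a spinlock */
	struct item *head;
	struct item *tail;
};

static void queue_init(struct reorder_queue *q)
{
	assert(q != NULL);
	atomic_flag_clear(&q->lock);
	q->head = q->tail = NULL;
}

static void queue_lock(struct reorder_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void queue_unlock(struct reorder_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

static void queue_push(struct reorder_queue *q, struct item *it)
{
	queue_lock(q);
	it->next = NULL;
	if (q->tail)
		q->tail->next = it;
	else
		q->head = it;
	q->tail = it;
	queue_unlock(q);
}

/*
 * Peek and remove under one critical section, so two callers can never
 * both observe the same head element.
 */
static struct item *queue_pop(struct reorder_queue *q)
{
	struct item *it;

	queue_lock(q);
	it = q->head;
	if (it) {
		q->head = it->next;
		if (!q->head)
			q->tail = NULL;
		it->next = NULL;
	}
	queue_unlock(q);
	return it;
}
```

Had `queue_pop()` read `q->head` before calling `queue_lock()`, two threads could fetch the same element and both hand it on, which matches the double list_add_tail() suspected above.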

Re: Problem: net: mvneta: auto-negotiation with disconnected network cable

2017-03-23 Thread Maxime Morin

Hi,
Thank you very much for your help and your quick response! See my answers below:

> 22.03.2017 16:23, Maxime Morin wrote:
> > Hi all,
> >
> > I work on an embedded platform based on the Marvell Armada 88F6707, that is 
> > connected to a Marvell Alaska 88E1512 ethernet transceiver. A defect has 
> > appeared recently, and it turns out to be a regression on the network part. 
> > There is a complete loss of the network when following these steps:
> >  1) boot the board with the network cable disconnected
> >  2) run the following commands (or equivalent):
> >  ip link set eth0 up
> >  ip addr add 10.0.0.80/24 dev eth0
> >  ethtool -s eth0 autoneg on #this is the command that really breaks 
> > the network
> Why do you call it a regression, if previously
> this command did nothing at all?

I called that a regression because we still pass through the function
phy_ethtool_sset(), which I thought was also doing something about the
auto-negotiation. But apparently not.

> 
> >  3) plug the network cable
> >  => there is no network, and no way to have it back except by 
> > rebooting
> > If I do not launch the "ethtool" command, when I plug the network cable it 
> > works, so it really seems to be related to the auto-negotiation set to "on" 
> > when the network cable has never been connected.
> So if you do that with the cable plugged in, there
> is no breakage?
> When you do "ethtool -s eth0 autoneg off" it doesn't
> revive?

Unfortunately no, it does not. I tried many things with ethtool, but it never 
gets back.

> > I did a "git bisect" to find when the regression was introduced, because it 
> > previously worked with kernel 4.4, but not with the recent ones. The commit 
> > that introduced the issue is this one: 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0c0744fc1dd5b
> > If I remove on mvneta.c the part that was added on this commit on the 
> > function mvneta_ethtool_set_link_ksettings (mvneta_ethtool_set_settings at 
> > that time), I do not have the issue, but I can't call that a fix...
> > So, could it be a driver issue, or maybe a wrong configuration somewhere? 
> > If you need additional information to reproduce the problem please ask me, 
> > I will be as responsive as possible.
> It seems mvneta_set_autoneg() does some non-symmetric
> things. It clears
> MVNETA_GMAC_FORCE_LINK_PASS |
> MVNETA_GMAC_FORCE_LINK_DOWN |
> MVNETA_GMAC_AN_FLOW_CTRL_EN
> when enabling autoneg, and does not restore these flags
> when disabling it. Try to make it set or restore these
> flags and see if that makes "ethtool -s eth0 autoneg off"
> bring the network back to life.

As you suggested, I just set these three flags when the function is called with 
"enable" set to 0. And it works!
Actually, I did not even have to set autoneg to off. When the module is probed, 
the default parameters are applied (mvneta_defaults_set()), and 
mvneta_set_autoneg() is called with "enable" set to 0, which seems to fix 
everything. I then tested it 
by setting autoneg to off/on, booting with or without the cable plugged, and I 
failed to break it. It seems to be fixed. Should I submit a patch? (would be 
the first time...)

Again, thanks a lot for your help.
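
A minimal sketch of the symmetric handling described above, written as plain user-space C on a shadow copy of the register. The three flag names follow the thread, but the bit positions, the `DEMO_` prefix and the `demo_set_autoneg()` helper are assumptions made for illustration; the real mvneta_set_autoneg() touches other bits as well and writes to hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit positions, NOT taken from mvneta.c. */
#define DEMO_GMAC_FORCE_LINK_DOWN	(1u << 0)
#define DEMO_GMAC_FORCE_LINK_PASS	(1u << 1)
#define DEMO_GMAC_AN_FLOW_CTRL_EN	(1u << 2)

#define DEMO_AN_MASK	(DEMO_GMAC_FORCE_LINK_DOWN | \
			 DEMO_GMAC_FORCE_LINK_PASS | \
			 DEMO_GMAC_AN_FLOW_CTRL_EN)

/*
 * Clear the three flags when autoneg is enabled and set the same
 * three flags when it is disabled, so the two paths mirror each other.
 * All other bits in the shadow value are left untouched.
 */
static void demo_set_autoneg(uint32_t *shadow, bool enable)
{
	assert(shadow != NULL);

	if (enable)
		*shadow &= ~DEMO_AN_MASK;
	else
		*shadow |= DEMO_AN_MASK;
}
```

Whether setting all three flags on the disable path is the right final behaviour for the hardware is a question for the driver maintainers; the sketch only shows the experiment reported in this message.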

Re: [PATCH] net: usbnet: support 64bit stats in qmi_wwan driver

2017-03-23 Thread Greg Ungerer


Hi Bjorn,

On 23/03/17 18:33, Bjørn Mork wrote:

Greg Ungerer  writes:


Add support for the net stats64 counters to the usbnet core and then to
the qmi_wwan driver.

This is a straightforward addition of 64bit counters for RX and TX packets
and byte counts. It is done in the same style as for the other net drivers
that support stats64.

The bulk of the change is to the usbnet core. Then it is trivial to use
that in the qmi_wwan.c driver. It would be very simple to extend this
support to other usbnet based drivers.

The motivation to add this is that it is not particularly difficult to
get the RX and TX byte counts to wrap on 32bit platforms.


You must have a higher quota than me :)


Well, not me personally :-)



But the patch does not apply to current net-next due to a conflict with
the ethtool_{get|set}_link_ksettings changes.


Ok, will respin against net-next. I generated this against 4.11-rc2.



+void usbnet_get_stats64(struct net_device *net, struct rtnl_link_stats64 
*stats)
+{
+   struct usbnet *dev = netdev_priv(net);
+   unsigned int start;
+
+   netdev_stats_to_stats64(stats, &net->stats);
+
+   do {
+   start = u64_stats_fetch_begin_irq(&dev->stats.syncp);
+   stats->rx_packets = dev->stats.rx_packets;
+   stats->rx_bytes = dev->stats.rx_bytes;
+   stats->tx_packets = dev->stats.tx_packets;
+   stats->tx_bytes = dev->stats.tx_bytes;
+   } while (u64_stats_fetch_retry_irq(&dev->stats.syncp, start));
+}
+


And I believe EXPORT_SYMBOL is missing here?


Yep, will fix that too. Thanks.

Regards
Greg
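
To put a number on the motivation quoted above: a 32-bit byte counter wraps after 2^32 bytes (4 GiB). The sketch below is a small user-space calculation of how long that takes at a sustained rate; `seconds_until_wrap()` is a hypothetical helper written for this note and is not part of usbnet.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Whole seconds until a byte counter of the given width wraps when
 * traffic flows at a sustained bytes_per_sec. The width is limited to
 * 63 bits so that the shift below stays within a uint64_t.
 */
static uint64_t seconds_until_wrap(unsigned int counter_bits,
				   uint64_t bytes_per_sec)
{
	assert(counter_bits >= 1 && counter_bits <= 63);
	assert(bytes_per_sec > 0);

	return (UINT64_C(1) << counter_bits) / bytes_per_sec;
}
```

For example, `seconds_until_wrap(32, 12500000)` (100 Mbit/s expressed in bytes per second) gives 343, i.e. the 32-bit counter wraps in under six minutes at that rate.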

Re: [PATCH] net: usbnet: support 64bit stats in qmi_wwan driver

2017-03-23 Thread Greg Ungerer


Hi Oliver,

On 23/03/17 18:46, Oliver Neukum wrote:

On Thursday, 23.03.2017, at 11:25 +1000, Greg Ungerer wrote:

Add support for the net stats64 counters to the usbnet core and then to
the qmi_wwan driver.

This is a straightforward addition of 64bit counters for RX and TX packets
and byte counts. It is done in the same style as for the other net drivers
that support stats64.

The bulk of the change is to the usbnet core. Then it is trivial to use
that in the qmi_wwan.c driver. It would be very simple to extend this
support to other usbnet based drivers.

The motivation to add this is that it is not particularly difficult to
get the RX and TX byte counts to wrap on 32bit platforms.


Hi,

you need to export the symbol usbnet_get_stats64
Other than that it looks good.


Thanks. I will respin a v2 with that fixed.

Regards
Greg

Re: [PATCH net-next v3] net: Add sysctl to toggle early demux for tcp and udp

2017-03-23 Thread kbuild test robot

Hi Subash,

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Subash-Abhinov-Kasiviswanathan/net-Add-sysctl-to-toggle-early-demux-for-tcp-and-udp/20170323-182822
config: x86_64-kexec (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   net/built-in.o: In function `proc_tcp_early_demux':
   sysctl_net_ipv4.c:(.text+0x7fe04): undefined reference to 
`tcp_v6_early_demux_configure'
   net/built-in.o: In function `proc_udp_early_demux':
>> sysctl_net_ipv4.c:(.text+0x7fe3d): undefined reference to 
>> `udp_v6_early_demux_configure'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

net/kcm: double free of kcm inode

2017-03-23 Thread Dmitry Vyukov

Hello,

I've got the following report while running the syzkaller fuzzer. Note the
preceding injected kmem_cache_alloc failure; it is most likely the root
cause.

FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 1 PID: 21839 Comm: syz-executor4 Not tainted 4.11.0-rc3+ #364
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x1b8/0x28d lib/dump_stack.c:52
 fail_dump lib/fault-inject.c:45 [inline]
 should_fail+0x78a/0x870 lib/fault-inject.c:154
 should_failslab+0xec/0x120 mm/failslab.c:31
 slab_pre_alloc_hook mm/slab.h:434 [inline]
 slab_alloc mm/slab.c:3394 [inline]
 kmem_cache_alloc+0x200/0x720 mm/slab.c:3570
 sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1331
 sk_alloc+0x8c/0x710 net/core/sock.c:1393
 kcm_clone net/kcm/kcmsock.c:1655 [inline]
 kcm_ioctl+0xb65/0x17e0 net/kcm/kcmsock.c:1713
 sock_do_ioctl+0x65/0xb0 net/socket.c:895
 sock_ioctl+0x2c2/0x440 net/socket.c:993
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1af/0x16d0 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x445b79
RSP: 002b:7f05eb28e858 EFLAGS: 0286 ORIG_RAX: 0010
RAX: ffda RBX: 00708000 RCX: 00445b79
RDX: 20001000 RSI: 89e2 RDI: 0005
RBP: 0086 R08:  R09: 
R10:  R11: 0286 R12: 004a7e31
R13:  R14: 7f05eb28e618 R15: 7f05eb28e788
==
BUG: KASAN: use-after-free in __fput+0x6b0/0x7f0 fs/file_table.c:211
at addr 880037a25670
Read of size 2 by task syz-executor4/21839
CPU: 1 PID: 21839 Comm: syz-executor4 Not tainted 4.11.0-rc3+ #364
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x1b8/0x28d lib/dump_stack.c:52
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:166
 print_address_description mm/kasan/report.c:210 [inline]
 kasan_report_error mm/kasan/report.c:294 [inline]
 kasan_report.part.2+0x1be/0x480 mm/kasan/report.c:316
 kasan_report mm/kasan/report.c:335 [inline]
 __asan_report_load2_noabort+0x29/0x30 mm/kasan/report.c:335
 __fput+0x6b0/0x7f0 fs/file_table.c:211
 fput+0x15/0x20 fs/file_table.c:245
 task_work_run+0x1a4/0x270 kernel/task_work.c:116
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x24d/0x2d0 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:191 [inline]
 syscall_return_slowpath+0x3bd/0x460 arch/x86/entry/common.c:260
 entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x445b79
RSP: 002b:7f05eb28e858 EFLAGS: 0286 ORIG_RAX: 0010
RAX: fff4 RBX: 00708000 RCX: 00445b79
RDX: 20001000 RSI: 89e2 RDI: 0005
RBP: 2170 R08:  R09: 
R10:  R11: 0286 R12: 006e0230
R13: 89e2 R14: 20001000 R15: 0005
Object at 880037a25640, in cache sock_inode_cache size: 944
Allocated:
PID = 21839
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:517
 set_track mm/kasan/kasan.c:529 [inline]
 kasan_kmalloc+0xbc/0xf0 mm/kasan/kasan.c:620
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:559
 kmem_cache_alloc+0x110/0x720 mm/slab.c:3572
 sock_alloc_inode+0x70/0x300 net/socket.c:250
 alloc_inode+0x65/0x180 fs/inode.c:207
 new_inode_pseudo+0x69/0x190 fs/inode.c:889
 sock_alloc+0x41/0x270 net/socket.c:565
 kcm_clone net/kcm/kcmsock.c:1634 [inline]
 kcm_ioctl+0x990/0x17e0 net/kcm/kcmsock.c:1713
 sock_do_ioctl+0x65/0xb0 net/socket.c:895
 sock_ioctl+0x2c2/0x440 net/socket.c:993
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1af/0x16d0 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 21839
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:517
 set_track mm/kasan/kasan.c:529 [inline]
 kasan_slab_free+0x81/0xc0 mm/kasan/kasan.c:593
 __cache_free mm/slab.c:3514 [inline]
 kmem_cache_free+0x71/0x240 mm/slab.c:3774
 sock_destroy_inode+0x56/0x70 net/socket.c:280
 destroy_inode+0x15d/0x200 fs/inode.c:264
 evict+0x57e/0x920 fs/inode.c:570
 iput_final fs/inode.c:1515 [inline]
 iput+0x62b/0xa20 fs/inode.c:1542
 sock_release+0x168/0x1e0 net/socket.c:607
 sock_close+0x16/0x20 net/socket.c:1061
 __fput+0x327/0x7f0 fs/file_table.c:209
 fput+0x15/0x20 fs/file_table.c:245
 task_work_run+0x1a4/0x270 kernel/task_work.c:116
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x24d/0x2d0 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/co

Re: [PATCH v3 net 2/5] net:ethernet:aquantia: Fix packet type detection (TCP/UDP) for IPv6.

2017-03-23 Thread David Arcari

On 03/23/2017 07:19 AM, Pavel Belous wrote:
> From: Pavel Belous 
> 
> In order for the checksum offloads to work correctly we need to set the
> packet type bit (TCP/UDP) in the TX context buffer.
> 
> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
> 
> Signed-off-by: Pavel Belous 
> ---
>  drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 20 
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c 
> b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> index ee78444..db2b51d 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> @@ -510,10 +510,22 @@ static unsigned int aq_nic_map_skb(struct aq_nic_s 
> *self,
>   if (skb->ip_summed == CHECKSUM_PARTIAL) {
>   dx_buff->is_ip_cso = (htons(ETH_P_IP) == skb->protocol) ?
>   1U : 0U;
> - dx_buff->is_tcp_cso =
> - (ip_hdr(skb)->protocol == IPPROTO_TCP) ? 1U : 0U;
> - dx_buff->is_udp_cso =
> - (ip_hdr(skb)->protocol == IPPROTO_UDP) ? 1U : 0U;
> +
> + if (ip_hdr(skb)->version == 4) {
> + dx_buff->is_tcp_cso =
> + (ip_hdr(skb)->protocol == IPPROTO_TCP) ?
> + 1U : 0U;
> + dx_buff->is_udp_cso =
> + (ip_hdr(skb)->protocol == IPPROTO_UDP) ?
> + 1U : 0U;
> + } else if (ip_hdr(skb)->version == 6) {
> + dx_buff->is_tcp_cso =
> + (ipv6_hdr(skb)->nexthdr == NEXTHDR_TCP) ?
> + 1U : 0U;
> + dx_buff->is_udp_cso =
> + (ipv6_hdr(skb)->nexthdr == NEXTHDR_UDP) ?
> + 1U : 0U;
> + }
>   }
>  
>   for (; nr_frags--; ++frag_count) {
> 

Fixes tcp/ipv6

Tested-by: David Arcari
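
The version check in the patch above can be sketched in user space on raw header bytes. This is an illustration under stated assumptions: `classify_l4()` and `enum l4_type` are names invented for this note, the offsets are the standard positions of the IPv4 "protocol" and IPv6 "next header" fields, and, like the patch, the sketch does not walk IPv6 extension headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>	/* IPPROTO_TCP, IPPROTO_UDP */

enum l4_type { L4_OTHER, L4_TCP, L4_UDP };

/*
 * l3 points at the first byte of the IP header. The version nibble
 * selects which field carries the L4 protocol number.
 */
static enum l4_type classify_l4(const uint8_t *l3, size_t len)
{
	uint8_t version, proto;

	assert(l3 != NULL);

	if (len < 1)
		return L4_OTHER;

	version = l3[0] >> 4;
	if (version == 4 && len >= 20)
		proto = l3[9];	/* IPv4 "protocol" field */
	else if (version == 6 && len >= 40)
		proto = l3[6];	/* IPv6 "next header" field */
	else
		return L4_OTHER;

	if (proto == IPPROTO_TCP)
		return L4_TCP;
	if (proto == IPPROTO_UDP)
		return L4_UDP;
	return L4_OTHER;
}
```

Reading the IPv4 field on an IPv6 packet, as the pre-patch code effectively did, looks at a byte inside the IPv6 source address, so the result depends on the address rather than on the payload type.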

Re: [PATCH net-next 0/2] net: bridge: allow user-space to add ext learned entries

2017-03-23 Thread Nikolay Aleksandrov

On 23/03/17 12:27, Nikolay Aleksandrov wrote:
> Hi,
> This set adds the ability to add externally learned entries from
> user-space. For symmetry and proper function we need to allow SW entries
> to take over HW learned ones (similar to how HW can take over SW entries
> currently) which is needed for our use case (evpn) where we have pure SW
> ports and HW ports mixed in a single bridge. This does not play well with
> switchdev devices currently because there's no feedback when the entry is
> taken over, but this case has never worked anyway and feedback can be
> easily added when needed.
> Patch 02 simply allows the use of NTF_EXT_LEARNED from user-space; we already
> have Quagga patches that make use of this functionality.
> 
> Thanks,
>  Nik
> 

A minor clarification: the functionality is necessary so the aging can
be handled by user-space software (e.g. quagga w/ evpn) when the entry
has been learned externally, but can also dynamically move to a local
bridge port and then the aging should be done by the bridge.

Re: [PATCH net-next v3] net: Add sysctl to toggle early demux for tcp and udp

2017-03-23 Thread kbuild test robot

Hi Subash,

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Subash-Abhinov-Kasiviswanathan/net-Add-sysctl-to-toggle-early-demux-for-tcp-and-udp/20170323-182822
config: alpha-defconfig (attached as .config)
compiler: alpha-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=alpha 

All errors (new ones prefixed by >>):

   net/built-in.o: In function `proc_tcp_early_demux':
   net/ipv4/sysctl_net_ipv4.c:308: undefined reference to 
`tcp_v6_early_demux_configure'
   net/ipv4/sysctl_net_ipv4.c:308: undefined reference to 
`tcp_v6_early_demux_configure'
   net/built-in.o: In function `proc_udp_early_demux':
>> net/ipv4/sysctl_net_ipv4.c:325: undefined reference to 
>> `udp_v6_early_demux_configure'
>> net/ipv4/sysctl_net_ipv4.c:325: undefined reference to 
>> `udp_v6_early_demux_configure'

vim +325 net/ipv4/sysctl_net_ipv4.c

   302  ret = proc_dointvec(table, write, buffer, lenp, ppos);
   303  
   304  if (write && !ret) {
   305  int enabled = init_net.ipv4.sysctl_tcp_early_demux;
   306  
   307  tcp_v4_early_demux_configure(enabled);
 > 308  tcp_v6_early_demux_configure(enabled);
   309  }
   310  
   311  return ret;
   312  }
   313  
   314  static int proc_udp_early_demux(struct ctl_table *table, int write,
   315  void __user *buffer, size_t *lenp, 
loff_t *ppos)
   316  {
   317  int ret = 0;
   318  
   319  ret = proc_dointvec(table, write, buffer, lenp, ppos);
   320  
   321  if (write && !ret) {
   322  int enabled = init_net.ipv4.sysctl_udp_early_demux;
   323  
   324  udp_v4_early_demux_configure(enabled);
 > 325  udp_v6_early_demux_configure(enabled);
   326  }
   327  
   328  return ret;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip
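
The undefined references above arise because built-in IPv4 code calls functions that are only linked in when the IPv6 code is built in. One common pattern for this class of error is a no-op stub selected at compile time, sketched below in user space. `DEMO_HAVE_IPV6` stands in for a kernel config check and every `demo_` name is hypothetical; this is not the fix that was applied to the patch, and a stub alone does not cover the case where IPv6 is built as a module.

```c
#include <assert.h>

/* Stand-in for a config option; leave undefined to get the stub. */
#ifdef DEMO_HAVE_IPV6
void demo_v6_early_demux_configure(int enable);	/* provided elsewhere */
#else
/* No-op stub so callers still link when IPv6 support is absent. */
static inline void demo_v6_early_demux_configure(int enable)
{
	(void)enable;
}
#endif

static int demo_v4_early_demux_enabled;

static void demo_v4_early_demux_configure(int enable)
{
	demo_v4_early_demux_enabled = enable;
}

/* Mirrors the shape of the sysctl handler body quoted in the report. */
static void demo_early_demux_update(int enabled)
{
	assert(enabled == 0 || enabled == 1);

	demo_v4_early_demux_configure(enabled);
	demo_v6_early_demux_configure(enabled);	/* stub when IPv6 is off */
}
```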

netlink: NULL timer crash

2017-03-23 Thread Dmitry Vyukov

Hello,

The following program triggers a call to a NULL timer function:

https://gist.githubusercontent.com/dvyukov/c210d01c74b911273469a93862ea7788/raw/2a3182772a6a6e20af3e71c02c2a1c2895d803fb/gistfile1.txt


BUG: unable to handle kernel NULL pointer dereference at   (null)
IP:   (null)
PGD 0
Oops: 0010 [#1] SMP KASAN
Modules linked in:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3+ #365
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 88006c634300 task.stack: 88006c64
RIP: 0010:  (null)
RSP: 0018:88006d1077c8 EFLAGS: 00010246
RAX: dc00 RBX: 880062bddb00 RCX: 8154e161
RDX: 1090c1f1 RSI:  RDI: 880062bddb00
RBP: 88006d1077e8 R08: fbfff0a936a8 R09: 0001
R10: 0001 R11: fbfff0a936a7 R12: 84860f80
R13:  R14: 880062bddb60 R15: 11000da20f05
FS:  () GS:88006d10() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2:  CR3: 04e21000 CR4: 001406e0
Call Trace:
 
 neigh_timer_handler+0x365/0xd40 net/core/neighbour.c:944
 call_timer_fn+0x232/0x8c0 kernel/time/timer.c:1268
 expire_timers kernel/time/timer.c:1307 [inline]
 __run_timers+0x6f7/0xbd0 kernel/time/timer.c:1601
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
 __do_softirq+0x2d6/0xb54 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1b1/0x1e0 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:657 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
RSP: 0018:88006c647dc0 EFLAGS: 0286 ORIG_RAX: ff10
RAX: dc00 RBX: 11000d8c8fbb RCX: 
RDX: 109d8ed4 RSI: 0001 RDI: 84ec76a0
RBP: 88006c647dc0 R08: ed000d8c6861 R09: 
R10:  R11:  R12: fbfff09d8ed2
R13: 88006c647e78 R14: 84ec7690 R15: 0002
 
 arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
 default_idle+0xba/0x450 arch/x86/kernel/process.c:275
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:266
 default_idle_call+0x37/0x80 kernel/sched/idle.c:97
 cpuidle_idle_call kernel/sched/idle.c:155 [inline]
 do_idle+0x230/0x380 kernel/sched/idle.c:244
 cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:346
 start_secondary+0x2a7/0x340 arch/x86/kernel/smpboot.c:275
 start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
Code:  Bad RIP value.
RIP:   (null) RSP: 88006d1077c8
CR2: 
---[ end trace 845120b8a0d21411 ]---

On commit 093b995e3b55a0ae0670226ddfcb05bfbf0099ae

Re: stmmac: Performance regression after commit aff3d9eff843 "net: stmmac: enable multiple buffers"

2017-03-23 Thread Joao Pinto

On 3/23/2017 at 10:56 AM, Joao Pinto wrote:
> On 3/23/2017 at 10:51 AM, Giuseppe CAVALLARO wrote:
>> On 3/23/2017 11:48 AM, Giuseppe CAVALLARO wrote:
>>> Hello
>>>
>>> On 3/23/2017 11:20 AM, Corentin Labbe wrote:
>>>>> I have a 4.21 QoS Core with 4 RX + 4 TX and detected no regression.
>>>>>> Could you please share the iperf cmds you are using in order for me to
>>>>> reproduce
>>>>>> in my side?
>>>

HW Version: 4.21 QoS Core in HAPS DX7 (FPGA)
The connection between the FPGA and PC where stmmac is running is PCIe.
My configurations are done in stmmac_pci. Here they are:

@@ -68,10 +70,52 @@ static void stmmac_default_data(struct plat_stmmacenet_data
*plat)
 {
plat->bus_id = 1;
plat->phy_addr = 0;
-   plat->interface = PHY_INTERFACE_MODE_GMII;
-   plat->clk_csr = 2;  /* clk_csr_i = 20-35MHz & MDC = clk_csr_i/16 */
-   plat->has_gmac = 1;
-   plat->force_sf_dma_mode = 1;
+   plat->interface = PHY_INTERFACE_MODE_SGMII;
+   plat->clk_csr = 0x5;
+   plat->has_gmac = 0;
+   plat->has_gmac4 = 1;
+   plat->force_sf_dma_mode = 0;
+
+   plat->rx_queues_to_use = 4;
+   plat->tx_queues_to_use = 4;
+
+   plat->rx_sched_algorithm = MTL_RX_ALGORITHM_SP;
+
+   plat->rx_queues_cfg[0].mode_to_use = MTL_QUEUE_AVB;
+   plat->rx_queues_cfg[1].mode_to_use = MTL_QUEUE_DCB;
+   plat->rx_queues_cfg[2].mode_to_use = MTL_QUEUE_DCB;
+   plat->rx_queues_cfg[3].mode_to_use = MTL_QUEUE_DCB;
+
+   plat->tx_queues_cfg[0].mode_to_use = MTL_QUEUE_DCB;
+   plat->tx_queues_cfg[1].mode_to_use = MTL_QUEUE_AVB;
+   plat->tx_queues_cfg[2].mode_to_use = MTL_QUEUE_DCB;
+   plat->tx_queues_cfg[3].mode_to_use = MTL_QUEUE_DCB;
+
+   plat->tx_queues_cfg[1].send_slope = 0xCCC;
+   plat->tx_queues_cfg[1].idle_slope = 0x1333;
+   plat->tx_queues_cfg[1].high_credit = 0x4B;
+   plat->tx_queues_cfg[1].low_credit = 0xFFB5;
+
+   plat->rx_queues_cfg[0].chan = 0;
+   plat->rx_queues_cfg[1].chan = 1;
+   plat->rx_queues_cfg[2].chan = 2;
+   plat->rx_queues_cfg[3].chan = 3;
+
+   plat->tx_sched_algorithm = MTL_TX_ALGORITHM_WRR;
+   plat->tx_queues_cfg[0].weight = 0x10;
+   plat->tx_queues_cfg[1].weight = 0x11;
+   plat->tx_queues_cfg[2].weight = 0x12;
+   plat->tx_queues_cfg[3].weight = 0x13;
+
+   /* Disable Priority config by default */
+   plat->tx_queues_cfg[0].use_prio = false;
+   plat->rx_queues_cfg[0].use_prio = false;
+
+   /* Disable RX queues routing by default */
+   plat->rx_queues_cfg[0].pkt_route = 0x0;
+   plat->rx_queues_cfg[1].pkt_route = 0x0;
+   plat->rx_queues_cfg[2].pkt_route = 0x0;
+   plat->rx_queues_cfg[3].pkt_route = 0x0;

plat->mdio_bus_data->phy_reset = NULL;
plat->mdio_bus_data->phy_mask = 0;
@@ -83,22 +127,14 @@ static void stmmac_default_data(struct plat_stmmacenet_data
*plat)
/* Set default value for multicast hash bins */
plat->multicast_filter_bins = HASH_TABLE_SIZE;

+   plat->dma_cfg->fixed_burst = 0;
+   plat->dma_cfg->aal = 0;
+
/* Set default value for unicast filter entries */
plat->unicast_filter_entries = 1;

/* Set the maxmtu to a default of JUMBO_LEN */
plat->maxmtu = JUMBO_LEN;
-
-   /* Set default number of RX and TX queues to use */
-   plat->tx_queues_to_use = 1;
-   plat->rx_queues_to_use = 1;
-
-   /* Disable Priority config by default */
-   plat->tx_queues_cfg[0].use_prio = false;
-   plat->rx_queues_cfg[0].use_prio = false;
-
-   /* Disable RX queues routing by default */
-   plat->rx_queues_cfg[0].pkt_route = 0x0;
 }


*** TESTS ***


*TEST 1: File (linux-next tarball) transfer of ~1.4G by scp to the DUT*

scp net-next-20170323.tar.gz x@XXX:/home/synopsys/
The authenticity of host 'X' can't be established.
ECDSA key fingerprint is SHA256:/XX.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'XX' (ECDSA) to the list of known hosts.
XX@X's password:
net-next20170323.tar.gz

 100% 1366MB  79.3MB/s   00:17

ifconfig after transfer:

eth1  Link encap:Ethernet  HWaddr 
  inet addr:  Bcast:  Mask:
  inet6 addr: X Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:1026614 errors:0 dropped:0 overruns:0 frame:0
  TX packets:56804 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:1502856063 (1.5 GB)  TX bytes:4224767 (4.2 MB)
  Interrupt:16

*stmmac Log after transfer:

#:~/tem

Re: net/kcm: use-after-free in kcm_wq

2017-03-23 Thread Dmitry Vyukov

On Fri, Mar 3, 2017 at 9:03 PM, Cong Wang  wrote:
> On Fri, Mar 3, 2017 at 2:11 AM, Dmitry Vyukov  wrote:
>> Also like this one:
>>
>> ==
>> BUG: KASAN: use-after-free in atomic_long_read
>> include/linux/compiler.h:254 [inline] at addr 8800538aba60
>> BUG: KASAN: use-after-free in get_work_pool+0x2f2/0x340
>> kernel/workqueue.c:709 at addr 8800538aba60
>> Read of size 8 by task syz-executor6/7965
>> CPU: 2 PID: 7965 Comm: syz-executor6 Not tainted 4.10.0+ #248
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:166
>>  print_address_description mm/kasan/report.c:204 [inline]
>>  kasan_report_error mm/kasan/report.c:288 [inline]
>>  kasan_report.part.2+0x198/0x440 mm/kasan/report.c:310
>>  kasan_report mm/kasan/report.c:331 [inline]
>>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:331
>>  atomic_long_read include/linux/compiler.h:254 [inline]
>>  get_work_pool+0x2f2/0x340 kernel/workqueue.c:709
>>  __queue_work+0x2b3/0x1210 kernel/workqueue.c:1401
>>  queue_work_on+0x2e9/0x330 kernel/workqueue.c:1486
>>  queue_work include/linux/workqueue.h:487 [inline]
>>  strp_check_rcv+0x25/0x30 net/strparser/strparser.c:494
>
>
> It is not kcm_wq, it is strp_wq, and the work struct is strp->rx_work
> which lives in struct kcm_psock. The work is cancelled by strp_done(),
> but it seems to get queued again after strp_done()...


on 093b995e3b55a0ae0670226ddfcb05bfbf0099ae:

==
BUG: KASAN: use-after-free in worker_thread+0x1024/0x1340
kernel/workqueue.c:2229 at addr 88006d164ae0
Read of size 8 by task kworker/u8:3/25139
CPU: 2 PID: 25139 Comm: kworker/u8:3 Not tainted 4.11.0-rc3+ #364
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x1b8/0x28d lib/dump_stack.c:52
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:166
 print_address_description mm/kasan/report.c:210 [inline]
 kasan_report_error mm/kasan/report.c:294 [inline]
 kasan_report.part.2+0x1be/0x480 mm/kasan/report.c:316
 kasan_report mm/kasan/report.c:337 [inline]
 __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:337
 worker_thread+0x1024/0x1340 kernel/workqueue.c:2229
 kthread+0x359/0x420 kernel/kthread.c:229
 ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
Object at 88006d1649c0, in cache kcm_psock_cache size: 616
Allocated:
PID = 25123
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:517
 set_track mm/kasan/kasan.c:529 [inline]
 kasan_kmalloc+0xbc/0xf0 mm/kasan/kasan.c:620
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:559
 kmem_cache_alloc+0x110/0x720 mm/slab.c:3572
 kmem_cache_zalloc include/linux/slab.h:653 [inline]
 kcm_attach net/kcm/kcmsock.c:1386 [inline]
 kcm_attach_ioctl net/kcm/kcmsock.c:1457 [inline]
 kcm_ioctl+0x2bc/0x17e0 net/kcm/kcmsock.c:1692
 sock_do_ioctl+0x65/0xb0 net/socket.c:895
 sock_ioctl+0x2c2/0x440 net/socket.c:993
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1af/0x16d0 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 25139
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:517
 set_track mm/kasan/kasan.c:529 [inline]
 kasan_slab_free+0x81/0xc0 mm/kasan/kasan.c:593
 __cache_free mm/slab.c:3514 [inline]
 kmem_cache_free+0x71/0x240 mm/slab.c:3774
 unreserve_psock+0x5d4/0x7b0 net/kcm/kcmsock.c:547
 kcm_write_msgs+0xba6/0x1ba0 net/kcm/kcmsock.c:590
 kcm_tx_work+0x32/0x1f0 net/kcm/kcmsock.c:731
 process_one_work+0xb20/0x1b40 kernel/workqueue.c:2097
 worker_thread+0x1b4/0x1340 kernel/workqueue.c:2231
 kthread+0x359/0x420 kernel/kthread.c:229
 ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
Memory state around the buggy address:
 88006d164980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 88006d164a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>88006d164a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ^
 88006d164b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 88006d164b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==

[PATCH] isdn: use setup_timer

2017-03-23 Thread Geliang Tang

Use setup_timer() instead of init_timer() to simplify the code.

Signed-off-by: Geliang Tang 
---
 drivers/isdn/divert/isdn_divert.c   |  9 +++--
 drivers/isdn/hardware/eicon/divasi.c|  5 ++---
 drivers/isdn/hardware/mISDN/hfcmulti.c  | 10 --
 drivers/isdn/hardware/mISDN/hfcpci.c|  9 +++--
 drivers/isdn/hardware/mISDN/mISDNipac.c |  5 ++---
 drivers/isdn/hardware/mISDN/mISDNisar.c | 10 --
 drivers/isdn/hardware/mISDN/w6692.c |  5 ++---
 drivers/isdn/hisax/amd7930_fn.c |  4 +---
 drivers/isdn/hisax/arcofi.c |  4 +---
 drivers/isdn/hisax/diva.c   |  5 ++---
 drivers/isdn/hisax/elsa.c   |  4 +---
 drivers/isdn/hisax/fsm.c|  4 +---
 drivers/isdn/hisax/hfc4s8s_l1.c |  5 ++---
 drivers/isdn/hisax/hfc_2bds0.c  |  4 +---
 drivers/isdn/hisax/hfc_pci.c|  8 ++--
 drivers/isdn/hisax/hfc_sx.c |  8 ++--
 drivers/isdn/hisax/hfc_usb.c|  8 ++--
 drivers/isdn/hisax/hfcscard.c   |  4 +---
 drivers/isdn/hisax/icc.c|  4 +---
 drivers/isdn/hisax/ipacx.c  |  4 +---
 drivers/isdn/hisax/isac.c   |  4 +---
 drivers/isdn/hisax/isar.c   | 10 --
 drivers/isdn/hisax/isdnl3.c |  4 +---
 drivers/isdn/hisax/teleint.c|  4 +---
 drivers/isdn/hisax/w6692.c  |  5 ++---
 drivers/isdn/i4l/isdn_ppp.c |  5 ++---
 drivers/isdn/i4l/isdn_tty.c |  5 ++---
 drivers/isdn/mISDN/dsp_core.c   |  4 +---
 drivers/isdn/mISDN/fsm.c|  4 +---
 drivers/isdn/mISDN/l1oip_core.c |  4 +---
 30 files changed, 54 insertions(+), 114 deletions(-)

diff --git a/drivers/isdn/divert/isdn_divert.c 
b/drivers/isdn/divert/isdn_divert.c
index 50749a7..060d357 100644
--- a/drivers/isdn/divert/isdn_divert.c
+++ b/drivers/isdn/divert/isdn_divert.c
@@ -157,10 +157,8 @@ int cf_command(int drvid, int mode,
/* allocate mem for information struct */
if (!(cs = kmalloc(sizeof(struct call_struc), GFP_ATOMIC)))
return (-ENOMEM); /* no memory */
-   init_timer(&cs->timer);
+   setup_timer(&cs->timer, deflect_timer_expire, (ulong)cs);
cs->info[0] = '\0';
-   cs->timer.function = deflect_timer_expire;
-   cs->timer.data = (ulong) cs; /* pointer to own structure */
cs->ics.driver = drvid;
cs->ics.command = ISDN_CMD_PROT_IO; /* protocol specific io */
cs->ics.arg = DSS1_CMD_INVOKE; /* invoke supplementary service */
@@ -452,10 +450,9 @@ static int isdn_divert_icall(isdn_ctrl *ic)
return (0); /* no external deflection 
needed */
if (!(cs = kmalloc(sizeof(struct call_struc), 
GFP_ATOMIC)))
return (0); /* no memory */
-   init_timer(&cs->timer);
+   setup_timer(&cs->timer, deflect_timer_expire,
+   (ulong)cs);
cs->info[0] = '\0';
-   cs->timer.function = deflect_timer_expire;
-   cs->timer.data = (ulong) cs; /* pointer to own 
structure */
 
cs->ics = *ic; /* copy incoming data */
if (!cs->ics.parm.setup.phone[0]) 
strcpy(cs->ics.parm.setup.phone, "0");
diff --git a/drivers/isdn/hardware/eicon/divasi.c 
b/drivers/isdn/hardware/eicon/divasi.c
index cb88090..c610495 100644
--- a/drivers/isdn/hardware/eicon/divasi.c
+++ b/drivers/isdn/hardware/eicon/divasi.c
@@ -300,9 +300,8 @@ static int um_idi_open_adapter(struct file *file, int 
adapter_nr)
p_os = (diva_um_idi_os_context_t *) diva_um_id_get_os_context(e);
init_waitqueue_head(&p_os->read_wait);
init_waitqueue_head(&p_os->close_wait);
-   init_timer(&p_os->diva_timer_id);
-   p_os->diva_timer_id.function = (void *) diva_um_timer_function;
-   p_os->diva_timer_id.data = (unsigned long) p_os;
+   setup_timer(&p_os->diva_timer_id, (void *)diva_um_timer_function,
+   (unsigned long)p_os);
p_os->aborted = 0;
p_os->adapter_nr = adapter_nr;
return (1);
diff --git a/drivers/isdn/hardware/mISDN/hfcmulti.c 
b/drivers/isdn/hardware/mISDN/hfcmulti.c
index 480c2d7..961c07e 100644
--- a/drivers/isdn/hardware/mISDN/hfcmulti.c
+++ b/drivers/isdn/hardware/mISDN/hfcmulti.c
@@ -3878,9 +3878,8 @@ hfcmulti_initmode(struct dchannel *dch)
if (hc->dnum[pt]) {
mode_hfcmulti(hc, dch->slot, dch->dev.D.protocol,
  -1, 0, -1, 0);
-   dch->timer.function = (void *) hfcmulti_dbusy_timer;
-   dch->timer.data = (long) dch;
-   init_timer(&dch->timer);
+   setup_timer(&dch->timer, (void *)hfcmulti_dbusy_timer,
+   (long)dch);
}
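
The conversion repeated throughout this patch replaces three statements (init, assign function, assign data) with a single call. The sketch below models that in user space; `struct demo_timer` and the `demo_` helpers are simplified stand-ins assumed for illustration and are not the kernel's struct timer_list or its API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of a timer: a callback plus an opaque argument. */
struct demo_timer {
	void (*function)(unsigned long);
	unsigned long data;
};

/* Old style, step 1: initialise, leaving callback and data unset. */
static void demo_init_timer(struct demo_timer *t)
{
	assert(t != NULL);
	t->function = NULL;
	t->data = 0;
}

/* New style: initialise and fill in callback and data in one call. */
static void demo_setup_timer(struct demo_timer *t,
			     void (*fn)(unsigned long), unsigned long data)
{
	demo_init_timer(t);
	t->function = fn;
	t->data = data;
}

/* Pretend the timer expired: invoke the callback with its argument. */
static void demo_fire(struct demo_timer *t)
{
	assert(t != NULL && t->function != NULL);
	t->function(t->data);
}

/* Example callback: data carries a pointer to an int counter. */
static void demo_count_expiry(unsigned long data)
{
	++*(int *)data;
}
```

Besides being shorter, the single call makes it harder to arm a timer whose callback was never assigned, since the function pointer is a required argument.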

Re: [PATCH net 4/5] net:ethernet:aquantia: Fix for LSO with IPv6.

2017-03-23 Thread kbuild test robot

Hi Pavel,

[auto build test ERROR on net/master]

url:
https://github.com/0day-ci/linux/commits/Pavel-Belous/net-ethernet-aquantia-Misc-fixes-for-atlantic-driver/20170323-191314
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c: In function 
'hw_atl_a0_hw_ring_tx_xmit':
>> drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c:438:17: error: 
>> 'HW_ATL_B0_TXD_CTL_CMD_IPV6' undeclared (first use in this function)
txd->ctl |= HW_ATL_B0_TXD_CTL_CMD_IPV6;
^~
   drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c:438:17: note: each 
undeclared identifier is reported only once for each function it appears in

vim +/HW_ATL_B0_TXD_CTL_CMD_IPV6 +438 
drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c

   432  pkt_len -= (buff->len_l4 +
   433  buff->len_l3 +
   434  buff->len_l2);
   435  is_gso = true;
   436  
   437  if (buff->is_ipv6)
 > 438  txd->ctl |= HW_ATL_B0_TXD_CTL_CMD_IPV6;
   439  } else {
   440  buff_pa_len = buff->len;
   441  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation



[PATCH v2] net: stmmac: add set_mac to the stmmac_ops

2017-03-23 Thread Corentin Labbe

Two different set_mac functions exist, but stmmac_dwmac4_set_mac() is
only used for enabling and never for disabling.
So on dwmac4, the MAC RX/TX is never disabled.

This patch adds a generic function pointer set_mac() to stmmac_ops and
replaces all calls to stmmac_set_mac/stmmac_dwmac4_set_mac with a call to
this pointer.

Since dwmac4_ops is const, set_mac cannot be modified afterwards, and so
dwmac4_ops is duplicated like dwmac4_dma_ops.

Signed-off-by: Corentin Labbe 
---
 drivers/net/ethernet/stmicro/stmmac/common.h   |  2 ++
 .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   |  1 +
 .../net/ethernet/stmicro/stmmac/dwmac100_core.c|  1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c  | 39 --
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  | 11 +++---
 5 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h 
b/drivers/net/ethernet/stmicro/stmmac/common.h
index 572cf8b..90d28bc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -474,6 +474,8 @@ struct mac_device_info;
 struct stmmac_ops {
/* MAC core initialization */
void (*core_init)(struct mac_device_info *hw, int mtu);
+   /* Enable the MAC RX/TX */
+   void (*set_mac)(void __iomem *ioaddr, bool enable);
/* Enable and verify that the IPC module is supported */
int (*rx_ipc)(struct mac_device_info *hw);
/* Enable RX Queues */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
index 7f78f77..f3d9305 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
@@ -490,6 +490,7 @@ static void dwmac1000_debug(void __iomem *ioaddr, struct 
stmmac_extra_stats *x,
 
 static const struct stmmac_ops dwmac1000_ops = {
.core_init = dwmac1000_core_init,
+   .set_mac = stmmac_set_mac,
.rx_ipc = dwmac1000_rx_ipc_enable,
.dump_regs = dwmac1000_dump_regs,
.host_irq_status = dwmac1000_irq_status,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
index 524135e..1b36091 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
@@ -150,6 +150,7 @@ static void dwmac100_pmt(struct mac_device_info *hw, 
unsigned long mode)
 
 static const struct stmmac_ops dwmac100_ops = {
.core_init = dwmac100_core_init,
+   .set_mac = stmmac_set_mac,
.rx_ipc = dwmac100_rx_ipc_enable,
.dump_regs = dwmac100_dump_mac_regs,
.host_irq_status = dwmac100_irq_status,
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
index 40ce202..48793f2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
@@ -669,6 +669,38 @@ static void dwmac4_debug(void __iomem *ioaddr, struct 
stmmac_extra_stats *x,
 
 static const struct stmmac_ops dwmac4_ops = {
.core_init = dwmac4_core_init,
+   .set_mac = stmmac_set_mac,
+   .rx_ipc = dwmac4_rx_ipc_enable,
+   .rx_queue_enable = dwmac4_rx_queue_enable,
+   .rx_queue_prio = dwmac4_rx_queue_priority,
+   .tx_queue_prio = dwmac4_tx_queue_priority,
+   .rx_queue_routing = dwmac4_tx_queue_routing,
+   .prog_mtl_rx_algorithms = dwmac4_prog_mtl_rx_algorithms,
+   .prog_mtl_tx_algorithms = dwmac4_prog_mtl_tx_algorithms,
+   .set_mtl_tx_queue_weight = dwmac4_set_mtl_tx_queue_weight,
+   .map_mtl_to_dma = dwmac4_map_mtl_dma,
+   .config_cbs = dwmac4_config_cbs,
+   .dump_regs = dwmac4_dump_regs,
+   .host_irq_status = dwmac4_irq_status,
+   .host_mtl_irq_status = dwmac4_irq_mtl_status,
+   .flow_ctrl = dwmac4_flow_ctrl,
+   .pmt = dwmac4_pmt,
+   .set_umac_addr = dwmac4_set_umac_addr,
+   .get_umac_addr = dwmac4_get_umac_addr,
+   .set_eee_mode = dwmac4_set_eee_mode,
+   .reset_eee_mode = dwmac4_reset_eee_mode,
+   .set_eee_timer = dwmac4_set_eee_timer,
+   .set_eee_pls = dwmac4_set_eee_pls,
+   .pcs_ctrl_ane = dwmac4_ctrl_ane,
+   .pcs_rane = dwmac4_rane,
+   .pcs_get_adv_lp = dwmac4_get_adv_lp,
+   .debug = dwmac4_debug,
+   .set_filter = dwmac4_set_filter,
+};
+
+static const struct stmmac_ops dwmac410_ops = {
+   .core_init = dwmac4_core_init,
+   .set_mac = stmmac_dwmac4_set_mac,
.rx_ipc = dwmac4_rx_ipc_enable,
.rx_queue_enable = dwmac4_rx_queue_enable,
.rx_queue_prio = dwmac4_rx_queue_priority,
@@ -715,8 +747,6 @@ struct mac_device_info *dwmac4_setup(void __iomem *ioaddr, 
int mcbins,
if (mac->multicast_filter_bins)
mac->mcast_bits_log2 = ilog2(mac->multicast_filter_bins);
 
-   mac->mac = &dwmac4_ops;
-
mac->link.port = GMAC_CONFIG_PS;

Re: Page allocator order-0 optimizations merged

2017-03-23 Thread Jesper Dangaard Brouer

On Wed, 22 Mar 2017 23:40:04 +
Mel Gorman  wrote:

> On Wed, Mar 22, 2017 at 07:39:17PM +0200, Tariq Toukan wrote:
> > > > > This modification may slow allocations from IRQ context slightly
> > > > > but the
> > > > > main gain from the per-cpu allocator is that it scales better for
> > > > > allocations from multiple contexts.  There is an implicit
> > > > > assumption that
> > > > > intensive allocations from IRQ contexts on multiple CPUs from a single
> > > > > NUMA node are rare  
> > Hi Mel, Jesper, and all.
> > 
> > This assumption contradicts regular multi-stream traffic that is naturally
> > handled
> > over close numa cores.  I compared iperf TCP multistream (8 streams)
> > over CX4 (mlx5 driver) with kernels v4.10 (before this series) vs
> > kernel v4.11-rc1 (with this series).
> > I disabled the page-cache (recycle) mechanism to stress the page allocator,
> > and see a drastic degradation in BW, from 47.5 G in v4.10 to 31.4 G in
> > v4.11-rc1 (34% drop).
> > I noticed queued_spin_lock_slowpath occupies 62.87% of CPU time.  
> 
> Can you get the stack trace for the spin lock slowpath to confirm it's
> from IRQ context?

AFAIK allocations happen in softirq.  Argh, and during review I missed
that in_interrupt() also covers softirq.  To Mel, can we use an in_irq()
check instead?
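
The difference between the two checks can be modelled in user space. This is
a simplified sketch: the bit layout below is assumed for illustration and is
not copied from any particular kernel version, and model_in_irq() and
model_in_interrupt() are hypothetical names, not the kernel macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout of a preempt_count-style word (illustrative only). */
#define MODEL_SOFTIRQ_SHIFT 8
#define MODEL_HARDIRQ_SHIFT 16
#define MODEL_SOFTIRQ_MASK  (0xffu << MODEL_SOFTIRQ_SHIFT)
#define MODEL_HARDIRQ_MASK  (0x0fu << MODEL_HARDIRQ_SHIFT)

static_assert((MODEL_SOFTIRQ_MASK & MODEL_HARDIRQ_MASK) == 0,
	      "softirq and hardirq fields must not overlap");

/* True only while a hard interrupt handler is running. */
static bool model_in_irq(unsigned int count)
{
	return (count & MODEL_HARDIRQ_MASK) != 0;
}

/* True in hard interrupt context and also in softirq context. */
static bool model_in_interrupt(unsigned int count)
{
	return (count & (MODEL_HARDIRQ_MASK | MODEL_SOFTIRQ_MASK)) != 0;
}
```

With a count that only has a softirq bit set, the broader check fires while
the narrower one does not, which is the case relevant to network RX
allocations.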

(p.s. just landed and got home)
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[PATCH net-next 0/6] qed: Management interaction & feature changes

2017-03-23 Thread Yuval Mintz

All patches in this series either affect direct interaction with the
management firmware, or change logic relating to some values retrieved
from it.

Patch #1 revises the basic logic for sending messages to the management
firmware and their completion, and is the most significant [at least
code-wise] of the bunch.

Patch #2 changes infrastructure in a way that should better protect us from
mistakes leading to stack corruption such as was fixed in
bb4802428432 ("qed: Prevent stack corruption on MFW interaction").

Patch #3 corrects some update API endian issue [sent here as it would
create conflicts with #2, and because its lack would create a rather
insignificant problem].

Patch #4 removes some unnecessary logging, allowing cleaner forward
compatibility with future management firmware versions.

Patches #5, #6 slightly change the number of possible L2 queues in some
scenarios, leading to the possibility of having more queues / VFs.

Dave,

Please consider applying this series to 'net-next'.

Thanks,
Yuval


Tomer Tayar (2):
  qed: Revise MFW command locking
  qed: Pass src/dst sizes when interacting with MFW

Yuval Mintz (4):
  qed: Correct endian order of MAC passed to MFW
  qed: Reduce verbosity of unimplemented MFW messages
  qed: Don't waste SBs unused by RoCE
  qed: Reserve VF feature before PF

 drivers/net/ethernet/qlogic/qed/qed_dev.c |  38 ++-
 drivers/net/ethernet/qlogic/qed/qed_mcp.c | 525 +++---
 drivers/net/ethernet/qlogic/qed/qed_mcp.h |  17 +-
 3 files changed, 373 insertions(+), 207 deletions(-)

-- 
1.9.3

[PATCH net-next 2/6] qed: Pass src/dst sizes when interacting with MFW

2017-03-23 Thread Yuval Mintz

From: Tomer Tayar 

The driver interaction with management firmware involves a union
of all the data-members relating to the commands the driver prepares.

Current interface assumes the caller always passes such a union -
but that's cumbersome as well as risky [chancing a stack corruption
in case the caller accidentally passes a smaller member instead of the union].

Change the implementation so that the caller can pass a pointer to any
of the members instead of the union.
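
A minimal user-space sketch of the size-aware copy follows. The types
model_mac and model_union_data and the helper model_fill_union() are
hypothetical; they only mirror the idea of passing an explicit source size
and rejecting anything larger than the union.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical member and union, standing in for the driver's types. */
struct model_mac {
	uint32_t mac_upper;
	uint32_t mac_lower;
};

union model_union_data {
	struct model_mac wol_mac;
	uint8_t raw[64];
};

/* Copy src_size bytes of a (possibly smaller) member into a zeroed union.
 * Returns 0 on success, -1 if the source would not fit into the union. */
static int model_fill_union(union model_union_data *dst, const void *src,
			    size_t src_size)
{
	assert(dst != NULL);

	if (src_size > sizeof(*dst))
		return -1;

	memset(dst, 0, sizeof(*dst));
	if (src != NULL && src_size != 0)
		memcpy(dst, src, src_size);

	return 0;
}
```

Only src_size bytes are ever read from the caller's buffer, so passing a
small member can no longer cause a read past the end of a stack object.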

Signed-off-by: Tomer Tayar 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_mcp.c | 119 --
 drivers/net/ethernet/qlogic/qed/qed_mcp.h |   6 +-
 2 files changed, 66 insertions(+), 59 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c 
b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 0d157de..8d18102 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -368,12 +368,12 @@ static bool qed_mcp_has_pending_cmd(struct qed_hwfn 
*p_hwfn)
p_mb_params->mcp_param = DRV_MB_RD(p_hwfn, p_ptt, fw_mb_param);
 
/* Get the union data */
-   if (p_mb_params->p_data_dst != NULL) {
+   if (p_mb_params->p_data_dst != NULL && p_mb_params->data_dst_size) {
u32 union_data_addr = p_hwfn->mcp_info->drv_mb_addr +
  offsetof(struct public_drv_mb,
   union_data);
qed_memcpy_from(p_hwfn, p_ptt, p_mb_params->p_data_dst,
-   union_data_addr, sizeof(union drv_union_data));
+   union_data_addr, p_mb_params->data_dst_size);
}
 
p_cmd_elem->b_is_completed = true;
@@ -394,9 +394,9 @@ static void __qed_mcp_cmd_and_union(struct qed_hwfn *p_hwfn,
union_data_addr = p_hwfn->mcp_info->drv_mb_addr +
  offsetof(struct public_drv_mb, union_data);
memset(&union_data, 0, sizeof(union_data));
-   if (p_mb_params->p_data_src != NULL)
+   if (p_mb_params->p_data_src != NULL && p_mb_params->data_src_size)
memcpy(&union_data, p_mb_params->p_data_src,
-  sizeof(union_data));
+  p_mb_params->data_src_size);
qed_memcpy_to(p_hwfn, p_ptt, union_data_addr, &union_data,
  sizeof(union_data));
 
@@ -519,6 +519,7 @@ static int qed_mcp_cmd_and_union(struct qed_hwfn *p_hwfn,
 struct qed_ptt *p_ptt,
 struct qed_mcp_mb_params *p_mb_params)
 {
+   size_t union_data_size = sizeof(union drv_union_data);
u32 max_retries = QED_DRV_MB_MAX_RETRIES;
u32 delay = CHIP_MCP_RESP_ITER_US;
 
@@ -528,6 +529,15 @@ static int qed_mcp_cmd_and_union(struct qed_hwfn *p_hwfn,
return -EBUSY;
}
 
+   if (p_mb_params->data_src_size > union_data_size ||
+   p_mb_params->data_dst_size > union_data_size) {
+   DP_ERR(p_hwfn,
+  "The provided size is larger than the union data size 
[src_size %u, dst_size %u, union_data_size %zu]\n",
+  p_mb_params->data_src_size,
+  p_mb_params->data_dst_size, union_data_size);
+   return -EINVAL;
+   }
+
return _qed_mcp_cmd_and_union(p_hwfn, p_ptt, p_mb_params, max_retries,
  delay);
 }
@@ -540,11 +550,10 @@ int qed_mcp_cmd(struct qed_hwfn *p_hwfn,
u32 *o_mcp_param)
 {
struct qed_mcp_mb_params mb_params;
-   union drv_union_data data_src;
+   struct mcp_mac wol_mac;
int rc;
 
memset(&mb_params, 0, sizeof(mb_params));
-   memset(&data_src, 0, sizeof(data_src));
mb_params.cmd = cmd;
mb_params.param = param;
 
@@ -553,17 +562,18 @@ int qed_mcp_cmd(struct qed_hwfn *p_hwfn,
(p_hwfn->cdev->wol_config == QED_OV_WOL_ENABLED)) {
u8 *p_mac = p_hwfn->cdev->wol_mac;
 
-   data_src.wol_mac.mac_upper = p_mac[0] << 8 | p_mac[1];
-   data_src.wol_mac.mac_lower = p_mac[2] << 24 | p_mac[3] << 16 |
-p_mac[4] << 8 | p_mac[5];
+   memset(&wol_mac, 0, sizeof(wol_mac));
+   wol_mac.mac_upper = p_mac[0] << 8 | p_mac[1];
+   wol_mac.mac_lower = p_mac[2] << 24 | p_mac[3] << 16 |
+   p_mac[4] << 8 | p_mac[5];
 
DP_VERBOSE(p_hwfn,
   (QED_MSG_SP | NETIF_MSG_IFDOWN),
   "Setting WoL MAC: %pM --> [%08x,%08x]\n",
-  p_mac, data_src.wol_mac.mac_upper,
-  data_src.wol_mac.mac_lower);
+  p_mac, wol_mac.mac_upper, wol_mac.mac_lower);
 
-   mb_params.p_data_src = &data_src;
+   mb_params.p_data_src = &wol_mac;
+   mb_params.data_src_size = sizeof(wo

[PATCH net-next 4/6] qed: Reduce verbosity of unimplemented MFW messages

2017-03-23 Thread Yuval Mintz

Management firmware and driver are meant to be both backward and forward
compatible with each other.

If a new management firmware would work with an older driver,
it's possible that the driver would receive indications which are meaningless
to it. That's perfectly acceptable from the firmware's side - so there is no
need to log such messages at default verbosity; that would only serve to
confuse users.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_mcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c 
b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index d1fcd87..ccea0ea 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -1114,7 +1114,7 @@ int qed_mcp_handle_events(struct qed_hwfn *p_hwfn,
qed_mcp_update_bw(p_hwfn, p_ptt);
break;
default:
-   DP_NOTICE(p_hwfn, "Unimplemented MFW message %d\n", i);
+   DP_INFO(p_hwfn, "Unimplemented MFW message %d\n", i);
rc = -EINVAL;
}
}
-- 
1.9.3

[PATCH net-next 3/6] qed: Correct endian order of MAC passed to MFW

2017-03-23 Thread Yuval Mintz

The management firmware is running on a Big Endian processor,
and when running on an LE platform the HW is configured to swap access
to memory shared between management firmware and driver at
32-bit granularity.

As a result, for simplicity most of the APIs between
driver and management firmware are based on 32-bit variables.
MAC settings are one exception, as the driver needs to fill a byte
array when indicating to management firmware that the primary MAC
has changed.
Due to the swap, the driver must make sure that the MAC that was
provided in byte order is translated into native order,
otherwise after the swap the management firmware would read
it swapped.
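
The packing step can be sketched stand-alone as below. model_pack_mac() is a
hypothetical helper name; the arithmetic follows the two assignments in the
diff of this patch, with explicit casts added so that a first byte of 0x80 or
above is not shifted into the sign bit of a promoted int.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 6-byte MAC (network byte order) into two native 32-bit words:
 * out[0] holds bytes 0..3, out[1] holds bytes 4..5 in its upper half. */
static void model_pack_mac(const uint8_t mac[6], uint32_t out[2])
{
	assert(mac != 0 && out != 0);

	out[0] = (uint32_t)mac[0] << 24 | (uint32_t)mac[1] << 16 |
		 (uint32_t)mac[2] << 8 | (uint32_t)mac[3];
	out[1] = (uint32_t)mac[4] << 24 | (uint32_t)mac[5] << 16;
}
```

For the address 00:11:22:33:44:55 this yields the words 0x00112233 and
0x44550000, independent of the host's endianness.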

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_mcp.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c 
b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 8d18102..d1fcd87 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -1600,6 +1600,7 @@ int qed_mcp_ov_update_mac(struct qed_hwfn *p_hwfn,
  struct qed_ptt *p_ptt, u8 *mac)
 {
struct qed_mcp_mb_params mb_params;
+   u32 mfw_mac[2];
int rc;
 
memset(&mb_params, 0, sizeof(mb_params));
@@ -1608,8 +1609,16 @@ int qed_mcp_ov_update_mac(struct qed_hwfn *p_hwfn,
  DRV_MSG_CODE_VMAC_TYPE_SHIFT;
mb_params.param |= MCP_PF_ID(p_hwfn);
 
-   mb_params.p_data_src = mac;
-   mb_params.data_src_size = 6;
+   /* MCP is BE, and on LE platforms PCI would swap access to SHMEM
+* in 32-bit granularity.
+* So the MAC has to be set in native order [and not byte order],
+* otherwise it would be read incorrectly by MFW after swap.
+*/
+   mfw_mac[0] = mac[0] << 24 | mac[1] << 16 | mac[2] << 8 | mac[3];
+   mfw_mac[1] = mac[4] << 24 | mac[5] << 16;
+
+   mb_params.p_data_src = (u8 *)mfw_mac;
+   mb_params.data_src_size = 8;
rc = qed_mcp_cmd_and_union(p_hwfn, p_ptt, &mb_params);
if (rc)
DP_ERR(p_hwfn, "Failed to send mac address, rc = %d\n", rc);
-- 
1.9.3

[PATCH net 0/3] s390/qeth patches for net

2017-03-23 Thread Ursula Braun

Hi Dave,

here are 2 s390/qeth patches built for net fixing a problem with AF_IUCV
traffic through HiperSockets.
And we come up with an update for the MAINTAINERS file to establish
Julian as Co-Maintainer for drivers/s390/net and net/iucv.

Thanks, Ursula

Julian Wiedmann (2):
  s390/qeth: size calculation outbound buffers
  s390/qeth: no ETH header for outbound AF_IUCV

Ursula Braun (1):
  MAINTAINERS: add Julian Wiedmann

 MAINTAINERS   |  2 ++
 drivers/s390/net/qeth_core.h  |  3 ++-
 drivers/s390/net/qeth_core_main.c |  5 +++--
 drivers/s390/net/qeth_l2_main.c   |  5 +++--
 drivers/s390/net/qeth_l3_main.c   | 20 +++-
 5 files changed, 17 insertions(+), 18 deletions(-)

--
2.8.4

[PATCH net 3/3] MAINTAINERS: add Julian Wiedmann

2017-03-23 Thread Ursula Braun

Add Julian Wiedmann as additional maintainer for drivers/s390/net
and net/iucv.

Signed-off-by: Ursula Braun 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c45c02b..e48678d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10808,6 +10808,7 @@ F:  drivers/s390/block/dasd*
 F: block/partitions/ibm.c
 
 S390 NETWORK DRIVERS
+M: Julian Wiedmann 
 M: Ursula Braun 
 L: linux-s...@vger.kernel.org
 W: http://www.ibm.com/developerworks/linux/linux390/
@@ -10838,6 +10839,7 @@ S:  Supported
 F: drivers/s390/scsi/zfcp_*
 
 S390 IUCV NETWORK LAYER
+M: Julian Wiedmann 
 M: Ursula Braun 
 L: linux-s...@vger.kernel.org
 W: http://www.ibm.com/developerworks/linux/linux390/
-- 
2.8.4

[PATCH net-next 1/6] qed: Revise MFW command locking

2017-03-23 Thread Yuval Mintz

From: Tomer Tayar 

Interaction of driver -> management firmware is based
on a one-pending mailbox [per interface], and various
mailbox commands need to be synchronized.

The current scheme is messy and difficult to extend,
as it deals differently with various commands and
makes assumptions about the required behavior for load/unload
requests.

Replace the current scheme with a completion-list-based approach;
each flow tries sending its command when possible,
allowing one flow to complete another flow's command and
relieve the mailbox before sending its own command.
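
The bookkeeping behind this approach can be sketched in user space as below.
It is a simplified model only: the names are hypothetical, a hand-rolled
singly linked list replaces the kernel list helpers, and the spinlock that
must protect the list in the driver is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One pending mailbox command, identified by its sequence number. */
struct model_cmd_elem {
	struct model_cmd_elem *next;
	uint16_t expected_seq_num;
	bool is_completed;
};

struct model_mailbox {
	struct model_cmd_elem *head;
};

static struct model_cmd_elem *model_cmd_add(struct model_mailbox *mb,
					    uint16_t seq)
{
	struct model_cmd_elem *e;

	assert(mb != NULL);
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	e->expected_seq_num = seq;
	e->next = mb->head;
	mb->head = e;
	return e;
}

static struct model_cmd_elem *model_cmd_get(struct model_mailbox *mb,
					    uint16_t seq)
{
	struct model_cmd_elem *e;

	for (e = mb->head; e; e = e->next)
		if (e->expected_seq_num == seq)
			return e;
	return NULL;
}

static void model_cmd_del(struct model_mailbox *mb,
			  struct model_cmd_elem *elem)
{
	struct model_cmd_elem **pp = &mb->head;

	while (*pp && *pp != elem)
		pp = &(*pp)->next;
	if (*pp) {
		*pp = elem->next;
		free(elem);
	}
}

/* Any flow that sees a response marks the matching element as completed,
 * even when a different flow sent that command. */
static bool model_handle_response(struct model_mailbox *mb, uint16_t seq)
{
	struct model_cmd_elem *e = model_cmd_get(mb, seq);

	if (!e)
		return false;
	e->is_completed = true;
	return true;
}
```

A waiting flow then only has to poll its own element's completion flag and
remove the element once it is set.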

Signed-off-by: Tomer Tayar 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_mcp.c | 405 --
 drivers/net/ethernet/qlogic/qed/qed_mcp.h |  11 +-
 2 files changed, 280 insertions(+), 136 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c 
b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 87fde20..0d157de 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -111,12 +111,71 @@ void qed_mcp_read_mb(struct qed_hwfn *p_hwfn, struct 
qed_ptt *p_ptt)
}
 }
 
+struct qed_mcp_cmd_elem {
+   struct list_head list;
+   struct qed_mcp_mb_params *p_mb_params;
+   u16 expected_seq_num;
+   bool b_is_completed;
+};
+
+/* Must be called while cmd_lock is acquired */
+static struct qed_mcp_cmd_elem *
+qed_mcp_cmd_add_elem(struct qed_hwfn *p_hwfn,
+struct qed_mcp_mb_params *p_mb_params,
+u16 expected_seq_num)
+{
+   struct qed_mcp_cmd_elem *p_cmd_elem = NULL;
+
+   p_cmd_elem = kzalloc(sizeof(*p_cmd_elem), GFP_ATOMIC);
+   if (!p_cmd_elem)
+   goto out;
+
+   p_cmd_elem->p_mb_params = p_mb_params;
+   p_cmd_elem->expected_seq_num = expected_seq_num;
+   list_add(&p_cmd_elem->list, &p_hwfn->mcp_info->cmd_list);
+out:
+   return p_cmd_elem;
+}
+
+/* Must be called while cmd_lock is acquired */
+static void qed_mcp_cmd_del_elem(struct qed_hwfn *p_hwfn,
+struct qed_mcp_cmd_elem *p_cmd_elem)
+{
+   list_del(&p_cmd_elem->list);
+   kfree(p_cmd_elem);
+}
+
+/* Must be called while cmd_lock is acquired */
+static struct qed_mcp_cmd_elem *qed_mcp_cmd_get_elem(struct qed_hwfn *p_hwfn,
+u16 seq_num)
+{
+   struct qed_mcp_cmd_elem *p_cmd_elem = NULL;
+
+   list_for_each_entry(p_cmd_elem, &p_hwfn->mcp_info->cmd_list, list) {
+   if (p_cmd_elem->expected_seq_num == seq_num)
+   return p_cmd_elem;
+   }
+
+   return NULL;
+}
+
 int qed_mcp_free(struct qed_hwfn *p_hwfn)
 {
if (p_hwfn->mcp_info) {
+   struct qed_mcp_cmd_elem *p_cmd_elem, *p_tmp;
+
kfree(p_hwfn->mcp_info->mfw_mb_cur);
kfree(p_hwfn->mcp_info->mfw_mb_shadow);
+
+   spin_lock_bh(&p_hwfn->mcp_info->cmd_lock);
+   list_for_each_entry_safe(p_cmd_elem,
+p_tmp,
+&p_hwfn->mcp_info->cmd_list, list) {
+   qed_mcp_cmd_del_elem(p_hwfn, p_cmd_elem);
+   }
+   spin_unlock_bh(&p_hwfn->mcp_info->cmd_lock);
}
+
kfree(p_hwfn->mcp_info);
 
return 0;
@@ -160,7 +219,7 @@ static int qed_load_mcp_offsets(struct qed_hwfn *p_hwfn, 
struct qed_ptt *p_ptt)
p_info->drv_pulse_seq = DRV_MB_RD(p_hwfn, p_ptt, drv_pulse_mb) &
DRV_PULSE_SEQ_MASK;
 
-   p_info->mcp_hist = (u16)qed_rd(p_hwfn, p_ptt, MISCS_REG_GENERIC_POR_0);
+   p_info->mcp_hist = qed_rd(p_hwfn, p_ptt, MISCS_REG_GENERIC_POR_0);
 
return 0;
 }
@@ -176,6 +235,12 @@ int qed_mcp_cmd_init(struct qed_hwfn *p_hwfn, struct 
qed_ptt *p_ptt)
goto err;
p_info = p_hwfn->mcp_info;
 
+   /* Initialize the MFW spinlock */
+   spin_lock_init(&p_info->cmd_lock);
+   spin_lock_init(&p_info->link_lock);
+
+   INIT_LIST_HEAD(&p_info->cmd_list);
+
if (qed_load_mcp_offsets(p_hwfn, p_ptt) != 0) {
DP_NOTICE(p_hwfn, "MCP is not initialized\n");
/* Do not free mcp_info here, since public_base indicate that
@@ -190,10 +255,6 @@ int qed_mcp_cmd_init(struct qed_hwfn *p_hwfn, struct 
qed_ptt *p_ptt)
if (!p_info->mfw_mb_shadow || !p_info->mfw_mb_addr)
goto err;
 
-   /* Initialize the MFW spinlock */
-   spin_lock_init(&p_info->lock);
-   spin_lock_init(&p_info->link_lock);
-
return 0;
 
 err:
@@ -201,68 +262,39 @@ int qed_mcp_cmd_init(struct qed_hwfn *p_hwfn, struct 
qed_ptt *p_ptt)
return -ENOMEM;
 }
 
-/* Locks the MFW mailbox of a PF to ensure a single access.
- * The lock is achieved in most cases by holding a spinlock, causing other
- * threads to wait till a previous access is done.
- * In some cases (currently when a [UN]LOAD_REQ

[PATCH net-next 6/6] qed: Reserve VF feature before PF

2017-03-23 Thread Yuval Mintz

Align the driver feature distribution with the flow utilized
by the management firmware - first reserve L2 queues for
VFs and use all the remaining for the PF.

The current distribution might lead to PFs with an enormous
amount of queues, but at the same time leave us with insufficient
resources for starting all VFs.
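
The new ordering amounts to the following arithmetic, shown here as a
user-space sketch. The struct and function names are hypothetical, and the
inputs stand for the resource counts that the driver reads through RESC_NUM()
and qed_int_get_num_sbs().

```c
#include <assert.h>
#include <stdint.h>

static uint32_t model_min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

struct model_l2_split {
	uint32_t vf_queues;
	uint32_t pf_queues;
};

/* Reserve L2 queues for the VFs first, then give the PF what is left,
 * capped by the status blocks not claimed by non-L2 features. */
static struct model_l2_split model_split_l2(uint32_t num_sbs,
					    uint32_t non_l2_sbs,
					    uint32_t num_l2_queues,
					    uint32_t sb_iov_cnt)
{
	struct model_l2_split s;

	assert(non_l2_sbs <= num_sbs);

	s.vf_queues = model_min_u32(num_l2_queues, sb_iov_cnt);
	s.pf_queues = model_min_u32(num_sbs - non_l2_sbs,
				    num_l2_queues - s.vf_queues);
	return s;
}
```

With 64 status blocks, 100 L2 queues and 80 VF status blocks, the VFs get 80
queues and the PF the remaining 20, instead of the PF taking 64 first.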

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c 
b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 6bdac4f..11e45f0 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -1565,17 +1565,22 @@ static void qed_hw_set_feat(struct qed_hwfn *p_hwfn)
non_l2_sbs = feat_num[QED_RDMA_CNQ];
}
 
-   feat_num[QED_PF_L2_QUE] = min_t(u32,
-   RESC_NUM(p_hwfn, QED_SB) -
-   non_l2_sbs,
-   RESC_NUM(p_hwfn, QED_L2_QUEUE));
-
-   memset(&sb_cnt_info, 0, sizeof(sb_cnt_info));
-   qed_int_get_num_sbs(p_hwfn, &sb_cnt_info);
-   feat_num[QED_VF_L2_QUE] =
-   min_t(u32,
- RESC_NUM(p_hwfn, QED_L2_QUEUE) -
- FEAT_NUM(p_hwfn, QED_PF_L2_QUE), sb_cnt_info.sb_iov_cnt);
+   if (p_hwfn->hw_info.personality == QED_PCI_ETH_ROCE ||
+   p_hwfn->hw_info.personality == QED_PCI_ETH) {
+   /* Start by allocating VF queues, then PF's */
+   memset(&sb_cnt_info, 0, sizeof(sb_cnt_info));
+   qed_int_get_num_sbs(p_hwfn, &sb_cnt_info);
+   feat_num[QED_VF_L2_QUE] = min_t(u32,
+   RESC_NUM(p_hwfn, QED_L2_QUEUE),
+   sb_cnt_info.sb_iov_cnt);
+   feat_num[QED_PF_L2_QUE] = min_t(u32,
+   RESC_NUM(p_hwfn, QED_SB) -
+   non_l2_sbs,
+   RESC_NUM(p_hwfn,
+QED_L2_QUEUE) -
+   FEAT_NUM(p_hwfn,
+QED_VF_L2_QUE));
+   }
 
DP_VERBOSE(p_hwfn,
   NETIF_MSG_PROBE,
-- 
1.9.3

[PATCH net-next 5/6] qed: Don't waste SBs unused by RoCE

2017-03-23 Thread Yuval Mintz

When RoCE is enabled on a given L2 interface, the interrupt lines
are divided equally between L2 and RoCE.
But in case the number of lines needed for RoCE is limited by the number
of available CNQs, we can utilize the additional lines for L2.
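
A small user-space comparison of the old and new split, assuming RoCE is
enabled. All function names are hypothetical; the formulas follow the min_t()
expressions in the diff of this patch.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t model_min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* RoCE gets at most half of the status blocks, further limited by the
 * number of CNQs that are actually available. */
static uint32_t model_cnq_count(uint32_t num_sbs, uint32_t num_cnq_ram)
{
	return model_min_u32(num_sbs / 2, num_cnq_ram);
}

/* Old behaviour: L2 is always limited to half of the status blocks. */
static uint32_t model_pf_l2_old(uint32_t num_sbs, uint32_t num_l2_queues)
{
	return model_min_u32(num_sbs / 2, num_l2_queues);
}

/* New behaviour: L2 may use every status block RoCE did not take. */
static uint32_t model_pf_l2_new(uint32_t num_sbs, uint32_t num_cnq,
				uint32_t num_l2_queues)
{
	assert(num_cnq <= num_sbs);
	return model_min_u32(num_sbs - num_cnq, num_l2_queues);
}
```

With 64 status blocks and only 8 CNQs, the old formula leaves L2 with 32
lines, while the new one hands it 56.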

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c 
b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 8b5df71..6bdac4f 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -1550,7 +1550,7 @@ static void qed_hw_set_feat(struct qed_hwfn *p_hwfn)
 {
u32 *feat_num = p_hwfn->hw_info.feat_num;
struct qed_sb_cnt_info sb_cnt_info;
-   int num_features = 1;
+   u32 non_l2_sbs = 0;
 
if (IS_ENABLED(CONFIG_QED_RDMA) &&
p_hwfn->hw_info.personality == QED_PCI_ETH_ROCE) {
@@ -1558,15 +1558,16 @@ static void qed_hw_set_feat(struct qed_hwfn *p_hwfn)
 * the status blocks equally between L2 / RoCE but with
 * consideration as to how many l2 queues / cnqs we have.
 */
-   num_features++;
-
feat_num[QED_RDMA_CNQ] =
-   min_t(u32, RESC_NUM(p_hwfn, QED_SB) / num_features,
+   min_t(u32, RESC_NUM(p_hwfn, QED_SB) / 2,
  RESC_NUM(p_hwfn, QED_RDMA_CNQ_RAM));
+
+   non_l2_sbs = feat_num[QED_RDMA_CNQ];
}
 
-   feat_num[QED_PF_L2_QUE] = min_t(u32, RESC_NUM(p_hwfn, QED_SB) /
-   num_features,
+   feat_num[QED_PF_L2_QUE] = min_t(u32,
+   RESC_NUM(p_hwfn, QED_SB) -
+   non_l2_sbs,
RESC_NUM(p_hwfn, QED_L2_QUEUE));
 
memset(&sb_cnt_info, 0, sizeof(sb_cnt_info));
@@ -1578,11 +1579,11 @@ static void qed_hw_set_feat(struct qed_hwfn *p_hwfn)
 
DP_VERBOSE(p_hwfn,
   NETIF_MSG_PROBE,
-  "#PF_L2_QUEUES=%d VF_L2_QUEUES=%d #ROCE_CNQ=%d #SBS=%d 
num_features=%d\n",
+  "#PF_L2_QUEUES=%d VF_L2_QUEUES=%d #ROCE_CNQ=%d #SBS=%d\n",
   (int)FEAT_NUM(p_hwfn, QED_PF_L2_QUE),
   (int)FEAT_NUM(p_hwfn, QED_VF_L2_QUE),
   (int)FEAT_NUM(p_hwfn, QED_RDMA_CNQ),
-  RESC_NUM(p_hwfn, QED_SB), num_features);
+  RESC_NUM(p_hwfn, QED_SB));
 }
 
 static enum resource_id_enum qed_hw_get_mfw_res_id(enum qed_resources res_id)
-- 
1.9.3

[PATCH net 2/3] s390/qeth: no ETH header for outbound AF_IUCV

2017-03-23 Thread Ursula Braun

From: Julian Wiedmann 

With AF_IUCV traffic, the skb passed to hard_start_xmit() has a 14 byte
slot at skb->data, intended for an ETH header. qeth_l3_fill_af_iucv_hdr()
fills this ETH header... and then immediately moves it to the
skb's headroom, where it disappears and is never seen again.

But it's still possible for us to return NETDEV_TX_BUSY after the skb has
been modified. Since we didn't get a private copy of the skb, the next
time the skb is delivered to hard_start_xmit() it no longer has the
expected layout (we moved the ETH header to the headroom, so skb->data
now starts at the IUCV_TRANS header). So when qeth_l3_fill_af_iucv_hdr()
does another round of rebuilding, the resulting qeth header ends up
all wrong. On transmission, the buffer is then rejected by
the HiperSockets device with SBALF15 = x'04'.
When this error is passed back to af_iucv as TX_NOTIFY_UNREACHABLE, it
tears down the offending socket.

As the ETH header for AF_IUCV serves no purpose, just align the code to
what we do for IP traffic on L3 HiperSockets: keep the ETH header at
skb->data, and pass down data_offset = ETH_HLEN to qeth_fill_buffer().
When mapping the payload into the SBAL elements, the ETH header is then
stripped off. This avoids the skb manipulations in
qeth_l3_fill_af_iucv_hdr(), and any buffer re-entering hard_start_xmit()
after NETDEV_TX_BUSY is now processed properly.
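
Why the offset-based variant survives a requeue can be shown with a toy
model. Everything here is hypothetical and heavily simplified: model_skb
tracks only a data offset and a length, and the "destructive" helper stands
in for the net effect of the removed skb manipulations.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ETH_HLEN 14

/* Toy stand-in for an skb: where the data starts and how long it is. */
struct model_skb {
	size_t data_off;
	size_t len;
};

/* Old style: consume the ETH slot by moving the data start. Running this
 * again on a requeued buffer skips another 14 bytes. */
static size_t model_payload_off_destructive(struct model_skb *skb)
{
	assert(skb->len >= MODEL_ETH_HLEN);
	skb->data_off += MODEL_ETH_HLEN;
	skb->len -= MODEL_ETH_HLEN;
	return skb->data_off;
}

/* New style: leave the buffer untouched and describe the payload by an
 * offset, so repeated calls give the same answer. */
static size_t model_payload_off_stable(const struct model_skb *skb)
{
	assert(skb->len >= MODEL_ETH_HLEN);
	return skb->data_off + MODEL_ETH_HLEN;
}

static size_t model_payload_len_stable(const struct model_skb *skb)
{
	assert(skb->len >= MODEL_ETH_HLEN);
	return skb->len - MODEL_ETH_HLEN;
}
```

Calling the stable helpers twice on the same buffer returns the same offset
and length both times, which is the property a retry after NETDEV_TX_BUSY
depends on.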

Signed-off-by: Julian Wiedmann 
Signed-off-by: Ursula Braun 
---
 drivers/s390/net/qeth_l3_main.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index 72aa953..653f0fb 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -2609,17 +2609,13 @@ static void qeth_l3_fill_af_iucv_hdr(struct qeth_card 
*card,
char daddr[16];
struct af_iucv_trans_hdr *iucv_hdr;
 
-   skb_pull(skb, 14);
-   card->dev->header_ops->create(skb, card->dev, 0,
- card->dev->dev_addr, card->dev->dev_addr,
- card->dev->addr_len);
-   skb_pull(skb, 14);
-   iucv_hdr = (struct af_iucv_trans_hdr *)skb->data;
memset(hdr, 0, sizeof(struct qeth_hdr));
hdr->hdr.l3.id = QETH_HEADER_TYPE_LAYER3;
hdr->hdr.l3.ext_flags = 0;
-   hdr->hdr.l3.length = skb->len;
+   hdr->hdr.l3.length = skb->len - ETH_HLEN;
hdr->hdr.l3.flags = QETH_HDR_IPV6 | QETH_CAST_UNICAST;
+
+   iucv_hdr = (struct af_iucv_trans_hdr *) (skb->data + ETH_HLEN);
memset(daddr, 0, sizeof(daddr));
daddr[0] = 0xfe;
daddr[1] = 0x80;
@@ -2823,10 +2819,7 @@ static int qeth_l3_hard_start_xmit(struct sk_buff *skb, 
struct net_device *dev)
if ((card->info.type == QETH_CARD_TYPE_IQD) &&
!skb_is_nonlinear(skb)) {
new_skb = skb;
-   if (new_skb->protocol == ETH_P_AF_IUCV)
-   data_offset = 0;
-   else
-   data_offset = ETH_HLEN;
+   data_offset = ETH_HLEN;
hdr = kmem_cache_alloc(qeth_core_header_cache, GFP_ATOMIC);
if (!hdr)
goto tx_drop;
-- 
2.8.4

Re: [PATCH net-next v2 5/5] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-23 Thread Felix Fietkau

On 2017-03-23 09:06, Sean Wang wrote:
> Hi Andrew,
> 
> The purpose for the regmap table registered is to 
> 
> provide a way which helps us to look up a specific 
> 
> register on the switch through regmap-debugfs.
> 
> 
> And not all ranges of register is defined
> 
> so I only include the meaningful ones in a sparse way 
> 
> for the table.
I think in that case it might be nice to make regmap support optional in
order to avoid pulling in bloat on platforms that don't need it.

- Felix

Re: [PATCH net-next v2 5/5] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-23 Thread John Crispin




On 23/03/17 15:09, Felix Fietkau wrote:
> On 2017-03-23 09:06, Sean Wang wrote:
>> Hi Andrew,
>>
>> The purpose for the regmap table registered is to
>> provide a way which helps us to look up a specific
>> register on the switch through regmap-debugfs.
>>
>> And not all ranges of register is defined
>> so I only include the meaningful ones in a sparse way
>> for the table.
> I think in that case it might be nice to make regmap support optional in
> order to avoid pulling in bloat on platforms that don't need it.
>
> - Felix

The 2 relevant platforms are mips/ralink and arm/mediatek. both require 
regmap for the eth_sysctl syscon if they want to utilize the mtk_soc_eth 
driver which is a prereq for mt7530. so regmap cannot be optional here.


John





Re: [PATCH net-next v2 5/5] net-next: dsa: add dsa support for Mediatek MT7530 switch

2017-03-23 Thread Felix Fietkau

On 2017-03-23 15:25, John Crispin wrote:
> 
> 
> On 23/03/17 15:09, Felix Fietkau wrote:
>> On 2017-03-23 09:06, Sean Wang wrote:
>>> Hi Andrew,
>>>
>>> The purpose for the regmap table registered is to
>>>
>>> provide a way which helps us to look up a specific
>>>
>>> register on the switch through regmap-debugfs.
>>>
>>>
>>> And not all ranges of register is defined
>>>
>>> so I only include the meaningful ones in a sparse way
>>>
>>> for the table.
>> I think in that case it might be nice to make regmap support optional in
>> order to avoid pulling in bloat on platforms that don't need it.
>>
>> - Felix
>>
> The 2 relevant platforms are mips/ralink and arm/mediatek. both require 
> regmap for the eth_sysctl syscon if they want to utilize the mtk_soc_eth 
> driver which is a prereq for mt7530. so regmap cannot be optional here.
Makes sense, thanks.

- Felix

[PATCH net 1/3] s390/qeth: size calculation outbound buffers

2017-03-23 Thread Ursula Braun

From: Julian Wiedmann 

Depending on the device type, hard_start_xmit() builds different output
buffer formats. For instance with HiperSockets, on both L2 and L3 we
strip the ETH header from the skb - L3 doesn't need it, and L2 carries
it in the buffer's header element.
For this, we pass data_offset = ETH_HLEN all the way down to
__qeth_fill_buffer(), where skb->data is then adjusted accordingly.
But the initial size calculation still considers the *full* skb length
(including the ETH header). So qeth_get_elements_no() can erroneously
reject a skb as too big, even though it would actually fit into an
output buffer once the ETH header has been trimmed off later.

Fix this by passing an additional offset to qeth_get_elements_no(),
that indicates where in the skb the on-wire data actually begins.
Since the current code uses data_offset=-1 for some special handling
on OSA, we need to clamp data_offset to 0...

On HiperSockets this helps when sending ~MTU-size skbs with weird page
alignment. No change for OSA or AF_IUCV.
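
To illustrate the size calculation described above, here is a minimal
user-space sketch, not the driver code. It assumes 4 KiB pages and that
one buffer element covers at most one page. elements_for_range() is a
hypothetical stand-in for qeth_get_elements_for_range(), and the
addresses used with it are made up.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Hypothetical helper: number of pages touched by [start, end). */
static unsigned long elements_for_range(uintptr_t start, uintptr_t end)
{
	assert(start <= end);
	if (start == end)
		return 0;
	/* index of the last touched page minus the first, plus one */
	return (end - 1) / SKETCH_PAGE_SIZE - start / SKETCH_PAGE_SIZE + 1;
}

/*
 * Count elements for the linear part of a packet, skipping the first
 * data_offset bytes (e.g. an Ethernet header that is stripped later).
 * A negative offset is clamped to 0, as in the patch.
 */
static unsigned long linear_elements(uintptr_t data, unsigned long headlen,
				     int data_offset)
{
	unsigned long off = data_offset > 0 ? (unsigned long)data_offset : 0;

	return elements_for_range(data + off, data + headlen);
}
```

With a linear part that starts 14 bytes before a page boundary and ends
exactly one page later, the full range touches two pages, while the
range that skips the 14-byte header touches only one.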

Signed-off-by: Julian Wiedmann 
Signed-off-by: Ursula Braun 
---
 drivers/s390/net/qeth_core.h  | 3 ++-
 drivers/s390/net/qeth_core_main.c | 5 +++--
 drivers/s390/net/qeth_l2_main.c   | 5 +++--
 drivers/s390/net/qeth_l3_main.c   | 5 +++--
 4 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
index e7addea..d9561e3 100644
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -961,7 +961,8 @@ int qeth_bridgeport_query_ports(struct qeth_card *card,
 int qeth_bridgeport_setrole(struct qeth_card *card, enum qeth_sbp_roles role);
 int qeth_bridgeport_an_set(struct qeth_card *card, int enable);
 int qeth_get_priority_queue(struct qeth_card *, struct sk_buff *, int, int);
-int qeth_get_elements_no(struct qeth_card *, struct sk_buff *, int);
+int qeth_get_elements_no(struct qeth_card *card, struct sk_buff *skb,
+int extra_elems, int data_offset);
 int qeth_get_elements_for_frags(struct sk_buff *);
 int qeth_do_send_packet_fast(struct qeth_card *, struct qeth_qdio_out_q *,
struct sk_buff *, struct qeth_hdr *, int, int, int);
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 315d8a2..9a5f99c 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -3837,6 +3837,7 @@ EXPORT_SYMBOL_GPL(qeth_get_elements_for_frags);
  * @card:  qeth card structure, to check max. elems.
  * @skb:   SKB address
  * @extra_elems:   extra elems needed, to check against max.
+ * @data_offset:   range starts at skb->data + data_offset
  *
  * Returns the number of pages, and thus QDIO buffer elements, needed to cover
  * skb data, including linear part and fragments. Checks if the result plus
@@ -3844,10 +3845,10 @@ EXPORT_SYMBOL_GPL(qeth_get_elements_for_frags);
  * Note: extra_elems is not included in the returned result.
  */
 int qeth_get_elements_no(struct qeth_card *card,
-struct sk_buff *skb, int extra_elems)
+struct sk_buff *skb, int extra_elems, int data_offset)
 {
int elements = qeth_get_elements_for_range(
-   (addr_t)skb->data,
+   (addr_t)skb->data + data_offset,
(addr_t)skb->data + skb_headlen(skb)) +
qeth_get_elements_for_frags(skb);
 
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index bea4833..af4e6a6 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -849,7 +849,7 @@ static int qeth_l2_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 * chaining we can not send long frag lists
 */
if ((card->info.type != QETH_CARD_TYPE_IQD) &&
-   !qeth_get_elements_no(card, new_skb, 0)) {
+   !qeth_get_elements_no(card, new_skb, 0, 0)) {
int lin_rc = skb_linearize(new_skb);
 
if (card->options.performance_stats) {
@@ -894,7 +894,8 @@ static int qeth_l2_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
}
}
 
-   elements = qeth_get_elements_no(card, new_skb, elements_needed);
+   elements = qeth_get_elements_no(card, new_skb, elements_needed,
+   (data_offset > 0) ? data_offset : 0);
if (!elements) {
if (data_offset >= 0)
kmem_cache_free(qeth_core_header_cache, hdr);
diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index 06d0add..72aa953 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -2867,7 +2867,7 @@ static int qeth_l3_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 */
if ((card->info.type != QETH_CARD_TYPE_IQD) &&

Re: [PATCH 3/7] ath9k: Add support for reading the EEPROM data using the nvmem API

2017-03-23 Thread Alban

On Tue, 14 Mar 2017 00:53:55 +0100
Christian Lamparter  wrote:

> On Monday, March 13, 2017 10:05:11 PM CET Alban wrote:
> > Currently SoC platforms use a firmware request to get the EEPROM data.
> > This is mostly a hack and relies on user-helper scripts, which are
> > deprecated. A nicer alternative is to use the nvmem API, which was
> > designed for this kind of task.
> > 
> > Furthermore we let CONFIG_ATH9K_AHB select CONFIG_NVMEM as such
> > devices will generally use this method for loading the EEPROM data.
> > 
> > Signed-off-by: Alban 
> > ---
> > @@ -654,6 +656,25 @@ static int ath9k_init_softc(u16 devid, struct ath_softc *sc,
> > if (ret)
> > return ret;
> >  
> > +   /* If the EEPROM hasn't been retrieved via firmware request
> > +* use the nvmem API instead.
> > +*/
> > +   if (!ah->eeprom_blob) {
> > +   struct nvmem_cell *eeprom_cell;
> > +
> > +   eeprom_cell = nvmem_cell_get(ah->dev, "eeprom");
> > +   if (!IS_ERR(eeprom_cell)) {
> > +   ah->eeprom_data = nvmem_cell_read(
> > +   eeprom_cell, &ah->eeprom_size);
> > +   nvmem_cell_put(eeprom_cell);
> > +
> > +   if (IS_ERR(ah->eeprom_data)) {
> > +   dev_err(ah->dev, "failed to read eeprom");
> > +   return PTR_ERR(ah->eeprom_data);
> > +   }
> > +   }
> > +   }
> > +
> > if (ath9k_led_active_high != -1)
> > ah->config.led_active_high = ath9k_led_active_high == 1;
> >
> Are you sure this works with AR93XX SoCs that have the calibration data
> in the OTP?

I only tested this on an ar9132 platform, so it might well be that a
few things are missing for the newer SoCs. However, this shouldn't break
anything existing, as a platform needs to define an nvmem cell on the
ath9k device for this code to be used at all.

Alban



Re: Page allocator order-0 optimizations merged

2017-03-23 Thread Mel Gorman

On Thu, Mar 23, 2017 at 02:43:47PM +0100, Jesper Dangaard Brouer wrote:
> On Wed, 22 Mar 2017 23:40:04 +
> Mel Gorman  wrote:
> 
> > On Wed, Mar 22, 2017 at 07:39:17PM +0200, Tariq Toukan wrote:
> > > > > > This modification may slow allocations from IRQ context slightly
> > > > > > but the main gain from the per-cpu allocator is that it scales
> > > > > > better for allocations from multiple contexts.  There is an
> > > > > > implicit assumption that intensive allocations from IRQ contexts
> > > > > > on multiple CPUs from a single NUMA node are rare
> > > Hi Mel, Jesper, and all.
> > > 
> > > This assumption contradicts regular multi-stream traffic that is
> > > naturally handled over close NUMA cores.  I compared iperf TCP
> > > multistream (8 streams) over CX4 (mlx5 driver) with kernels v4.10
> > > (before this series) vs kernel v4.11-rc1 (with this series).
> > > I disabled the page-cache (recycle) mechanism to stress the page
> > > allocator, and see a drastic degradation in BW, from 47.5 G in v4.10
> > > to 31.4 G in v4.11-rc1 (34% drop).
> > > I noticed queued_spin_lock_slowpath occupies 62.87% of CPU time.
> > 
> > Can you get the stack trace for the spin lock slowpath to confirm it's
> > from IRQ context?
> 
> AFAIK allocations happen in softirq.  Argh, during review I missed
> that in_interrupt() also covers softirq.  Mel, can we use an in_irq()
> check instead?
> 
> (p.s. just landed and got home)

Not built or even boot tested. I'm unable to run tests at the moment.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6cbde310abed..f82225725bc1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2481,7 +2481,7 @@ void free_hot_cold_page(struct page *page, bool cold)
unsigned long pfn = page_to_pfn(page);
int migratetype;
 
-   if (in_interrupt()) {
+   if (in_irq()) {
__free_pages_ok(page, 0);
return;
}
@@ -2647,7 +2647,7 @@ static struct page *__rmqueue_pcplist(struct zone *zone, int migratetype,
 {
struct page *page;
 
-   VM_BUG_ON(in_interrupt());
+   VM_BUG_ON(in_irq());
 
do {
if (list_empty(list)) {
@@ -2704,7 +2704,7 @@ struct page *rmqueue(struct zone *preferred_zone,
unsigned long flags;
struct page *page;
 
-   if (likely(order == 0) && !in_interrupt()) {
+   if (likely(order == 0) && !in_irq()) {
page = rmqueue_pcplist(preferred_zone, zone, order,
gfp_flags, migratetype);
goto out;

-- 
Mel Gorman
SUSE Labs
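
For readers following the in_interrupt() versus in_irq() distinction
discussed in this mail, a rough user-space model may help. It is only a
sketch: the bit layout is an assumption modeled loosely on
include/linux/preempt.h, and every sk_* name is hypothetical. The point
it shows is that a softirq context counts as "in interrupt" but not as
"in hard IRQ".

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit layout, modeled loosely on include/linux/preempt.h. */
#define SK_SOFTIRQ_OFFSET 0x00000100u
#define SK_SOFTIRQ_MASK   0x0000ff00u
#define SK_HARDIRQ_MASK   0x000f0000u
#define SK_NMI_MASK       0x00100000u

/* Model of in_irq(): true only while servicing a hard interrupt. */
static bool sk_in_irq(unsigned int count)
{
	return (count & SK_HARDIRQ_MASK) != 0;
}

/* Model of in_interrupt(): hard IRQ, softirq (or BH disabled), or NMI. */
static bool sk_in_interrupt(unsigned int count)
{
	return (count & (SK_HARDIRQ_MASK | SK_SOFTIRQ_MASK | SK_NMI_MASK)) != 0;
}

/* Enter softirq processing: bump the softirq field by one unit. */
static unsigned int sk_softirq_enter(unsigned int count)
{
	assert((count & SK_SOFTIRQ_MASK) != SK_SOFTIRQ_MASK); /* no overflow */
	return count + SK_SOFTIRQ_OFFSET;
}
```

In this model, network RX allocations running in softirq would be
turned away by a check based on sk_in_interrupt(), but would pass one
based on sk_in_irq().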

Re: netlink: NULL timer crash

2017-03-23 Thread Eric Dumazet

On Thu, Mar 23, 2017 at 5:55 AM, Dmitry Vyukov  wrote:
> Hello,
>
> The following program triggers call of NULL timer func:
>
> https://gist.githubusercontent.com/dvyukov/c210d01c74b911273469a93862ea7788/raw/2a3182772a6a6e20af3e71c02c2a1c2895d803fb/gistfile1.txt
>
>
> BUG: unable to handle kernel NULL pointer dereference at   (null)
> IP:   (null)
> PGD 0
> Oops: 0010 [#1] SMP KASAN
> Modules linked in:
> CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3+ #365
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: 88006c634300 task.stack: 88006c64
> RIP: 0010:  (null)
> RSP: 0018:88006d1077c8 EFLAGS: 00010246
> RAX: dc00 RBX: 880062bddb00 RCX: 8154e161
> RDX: 1090c1f1 RSI:  RDI: 880062bddb00
> RBP: 88006d1077e8 R08: fbfff0a936a8 R09: 0001
> R10: 0001 R11: fbfff0a936a7 R12: 84860f80
> R13:  R14: 880062bddb60 R15: 11000da20f05
> FS:  () GS:88006d10() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 04e21000 CR4: 001406e0
> Call Trace:
>  
>  neigh_timer_handler+0x365/0xd40 net/core/neighbour.c:944
>  call_timer_fn+0x232/0x8c0 kernel/time/timer.c:1268
>  expire_timers kernel/time/timer.c:1307 [inline]
>  __run_timers+0x6f7/0xbd0 kernel/time/timer.c:1601
>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>  __do_softirq+0x2d6/0xb54 kernel/softirq.c:284
>  invoke_softirq kernel/softirq.c:364 [inline]
>  irq_exit+0x1b1/0x1e0 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:657 [inline]
>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
> RSP: 0018:88006c647dc0 EFLAGS: 0286 ORIG_RAX: ff10
> RAX: dc00 RBX: 11000d8c8fbb RCX: 
> RDX: 109d8ed4 RSI: 0001 RDI: 84ec76a0
> RBP: 88006c647dc0 R08: ed000d8c6861 R09: 
> R10:  R11:  R12: fbfff09d8ed2
> R13: 88006c647e78 R14: 84ec7690 R15: 0002
>  
>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>  default_idle+0xba/0x450 arch/x86/kernel/process.c:275
>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:266
>  default_idle_call+0x37/0x80 kernel/sched/idle.c:97
>  cpuidle_idle_call kernel/sched/idle.c:155 [inline]
>  do_idle+0x230/0x380 kernel/sched/idle.c:244
>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:346
>  start_secondary+0x2a7/0x340 arch/x86/kernel/smpboot.c:275
>  start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
> Code:  Bad RIP value.
> RIP:   (null) RSP: 88006d1077c8
> CR2: 
> ---[ end trace 845120b8a0d21411 ]---
>
> On commit 093b995e3b55a0ae0670226ddfcb05bfbf0099ae

Nice !

Looks like neigh->ops->solicit is NULL

[PATCH 1/3] net: hns: avoid gcc-7.0.1 warning for uninitialized data

2017-03-23 Thread Arnd Bergmann

hns_dsaf_set_mac_key() calls dsaf_set_field() on an uninitialized field,
which will then change only a few of its bits, causing a warning with
the latest gcc:

hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_mac_uc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
   (origin) &= (~(mask)); \
^~
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_mac_mc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_add_mac_mc_port':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_del_mac_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_rm_mac_addr':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_del_mac_mc_port':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_get_mac_uc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_get_mac_mc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may 
be used uninitialized in this function [-Werror=maybe-uninitialized]

The code is actually correct since we always set all 16 bits of the
port_vlan field, but gcc correctly points out that the first
access does contain uninitialized data.

This initializes the field to zero first before setting the
individual bits.
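
As a hedged illustration of why the first access reads the old value,
the following stand-alone sketch mimics the read-modify-write pattern
of dsaf_set_field(). The mask and shift values are hypothetical (the
real DSAF_TBL_TCAM_KEY_* definitions are not shown in this mail), and
the sk_* names exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Read-modify-write of one bit field, in the style of dsaf_set_field(). */
static uint16_t sk_set_field(uint16_t origin, uint16_t mask,
			     unsigned int shift, uint16_t val)
{
	assert(shift < 16);
	origin &= (uint16_t)~mask;                   /* reads the old value */
	origin |= (uint16_t)((val << shift) & mask); /* merges the new bits */
	return origin;
}

/* Hypothetical split: low 12 bits VLAN, high 4 bits port. */
#define SK_VLAN_M 0x0fffu
#define SK_VLAN_S 0
#define SK_PORT_M 0xf000u
#define SK_PORT_S 12

static uint16_t sk_make_port_vlan(uint16_t vlan, uint16_t port)
{
	uint16_t port_vlan = 0; /* defined starting point, as in the patch */

	port_vlan = sk_set_field(port_vlan, SK_VLAN_M, SK_VLAN_S, vlan);
	port_vlan = sk_set_field(port_vlan, SK_PORT_M, SK_PORT_S, port);
	return port_vlan;
}
```

Because the two masks together cover all 16 bits, the final value does
not depend on the starting value, but the first call still has to read
it, which is what the compiler complains about.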

Fixes: 5483bfcb169c ("net: hns: modify tcam table and set mac key")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
index 90dbda792614..1ec6d4bb6044 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
@@ -1519,6 +1519,7 @@ static void hns_dsaf_set_mac_key(
mac_key->high.bits.mac_3 = addr[3];
mac_key->low.bits.mac_4 = addr[4];
mac_key->low.bits.mac_5 = addr[5];
+   mac_key->low.bits.port_vlan = 0;
dsaf_set_field(mac_key->low.bits.port_vlan, DSAF_TBL_TCAM_KEY_VLAN_M,
   DSAF_TBL_TCAM_KEY_VLAN_S, vlan_id);
dsaf_set_field(mac_key->low.bits.port_vlan, DSAF_TBL_TCAM_KEY_PORT_M,
-- 
2.9.0

[PATCH 2/3] net: hns: fix uninitialized data use

2017-03-23 Thread Arnd Bergmann

When dev_dbg() is enabled, we print uninitialized data, as gcc-7.0.1
now points out:

ethernet/hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_promisc_tcam':
ethernet/hisilicon/hns/hns_dsaf_main.c:2947:75: error: 'tbl_tcam_data.low.val' 
may be used uninitialized in this function [-Werror=maybe-uninitialized]
ethernet/hisilicon/hns/hns_dsaf_main.c:2947:75: error: 'tbl_tcam_data.high.val' 
may be used uninitialized in this function [-Werror=maybe-uninitialized]

We also pass the data into hns_dsaf_tcam_mc_cfg(), which might later
use it (not sure about that), so it seems safer to just always initialize
the tbl_tcam_data structure.

Fixes: 1f5fa2dd1cfa ("net: hns: fix for promisc mode in HNS driver")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
index 1ec6d4bb6044..403ea9db6dbd 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
@@ -2925,10 +2925,11 @@ void hns_dsaf_set_promisc_tcam(struct dsaf_device *dsaf_dev,
/* find the tcam entry index for promisc */
entry_index = dsaf_promisc_tcam_entry(port);
 
+   memset(&tbl_tcam_data, 0, sizeof(tbl_tcam_data));
+   memset(&tbl_tcam_mask, 0, sizeof(tbl_tcam_mask));
+
/* config key mask */
if (enable) {
-   memset(&tbl_tcam_data, 0, sizeof(tbl_tcam_data));
-   memset(&tbl_tcam_mask, 0, sizeof(tbl_tcam_mask));
dsaf_set_field(tbl_tcam_data.low.bits.port_vlan,
   DSAF_TBL_TCAM_KEY_PORT_M,
   DSAF_TBL_TCAM_KEY_PORT_S, port);
-- 
2.9.0

Re: netlink: NULL timer crash

2017-03-23 Thread Eric Dumazet

On Thu, 2017-03-23 at 07:53 -0700, Eric Dumazet wrote:

> Nice !
> 
> Looks like neigh->ops->solicit is NULL

Apparently we allow admins to do really stupid things with neighbours
on tunnels.

The following patch should avoid the crash.

Does anyone have better ideas?


 net/ipv4/arp.c   |5 +
 net/ipv6/ndisc.c |4 
 2 files changed, 9 insertions(+)

diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 51b27ae09fbd725bcd8030982e5850215ac4ce5c..963191b12e28041bf5df6f37f222a7155f83a414 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -146,8 +146,13 @@ static const struct neigh_ops arp_hh_ops = {
.connected_output = neigh_resolve_output,
 };
 
+static void arp_no_solicit(struct neighbour *neigh, struct sk_buff *skb)
+{
+}
+
 static const struct neigh_ops arp_direct_ops = {
.family =   AF_INET,
+   .solicit =  arp_no_solicit,
.output =   neigh_direct_output,
.connected_output = neigh_direct_output,
 };
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 7ebac630d3c603186be2fc0dcbaac7d7e74bfde6..86f290b749d5ca0db4310b17ebeff35d847540c7 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -99,9 +99,13 @@ static const struct neigh_ops ndisc_hh_ops = {
.connected_output = neigh_resolve_output,
 };
 
+static void ndisc_no_solicit(struct neighbour *neigh, struct sk_buff *skb)
+{
+}
 
 static const struct neigh_ops ndisc_direct_ops = {
.family =   AF_INET6,
+   .solicit =  ndisc_no_solicit,
.output =   neigh_direct_output,
.connected_output = neigh_direct_output,
 };
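
The idea in the patch above, filling an optional hook with a no-op
instead of leaving it NULL, can be sketched in plain C as follows. This
is a user-space model only: struct sk_neigh, struct sk_neigh_ops and
sk_timer_fire() are hypothetical stand-ins, not the kernel structures
or the real timer handler.

```c
#include <assert.h>
#include <stddef.h>

struct sk_neigh; /* hypothetical stand-in for struct neighbour */

struct sk_neigh_ops {
	void (*solicit)(struct sk_neigh *n);
};

struct sk_neigh {
	const struct sk_neigh_ops *ops;
	int probes;
};

/* No-op stub: there is nothing to solicit on a "direct" neighbour. */
static void sk_no_solicit(struct sk_neigh *n)
{
	(void)n;
}

static const struct sk_neigh_ops sk_direct_ops = {
	.solicit = sk_no_solicit,
};

/* Timer-style caller that invokes the hook unconditionally. */
static int sk_timer_fire(struct sk_neigh *n)
{
	assert(n->ops != NULL);
	n->ops->solicit(n); /* safe: the pointer is never NULL */
	return ++n->probes;
}

/* Small driver for the example: one timer expiry on a direct neighbour. */
static int sk_demo(void)
{
	struct sk_neigh n = { .ops = &sk_direct_ops, .probes = 0 };

	return sk_timer_fire(&n);
}
```

The alternative would be a NULL check at every call site; the stub
keeps the caller unconditional at the cost of one indirect call that
does nothing.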

[patch net-next] net: sched: choke: remove dead filter classify code

2017-03-23 Thread Jiri Pirko

From: Jiri Pirko 

sch_choke is a classless qdisc, so it does not define cl_ops. Therefore
filter_list can never be changed and is NULL all the time.
The reason is this check in tc_ctl_tfilter:

/* Is it classful? */
cops = q->ops->cl_ops;
if (!cops)
return -EINVAL;

So remove this dead code.

Signed-off-by: Jiri Pirko 
---
 net/sched/sch_choke.c | 51 ---
 1 file changed, 51 deletions(-)

diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c
index 3b86a97..03ce895 100644
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -58,7 +58,6 @@ struct choke_sched_data {
 
 /* Variables */
struct red_vars  vars;
-   struct tcf_proto __rcu *filter_list;
struct {
u32 prob_drop;  /* Early probability drops */
u32 prob_mark;  /* Early probability marks */
@@ -152,11 +151,6 @@ static inline void choke_set_classid(struct sk_buff *skb, u16 classid)
choke_skb_cb(skb)->classid = classid;
 }
 
-static u16 choke_get_classid(const struct sk_buff *skb)
-{
-   return choke_skb_cb(skb)->classid;
-}
-
 /*
  * Compare flow of two packets
  *  Returns true only if source and destination address and port match.
@@ -188,40 +182,6 @@ static bool choke_match_flow(struct sk_buff *skb1,
 }
 
 /*
- * Classify flow using either:
- *  1. pre-existing classification result in skb
- *  2. fast internal classification
- *  3. use TC filter based classification
- */
-static bool choke_classify(struct sk_buff *skb,
-  struct Qdisc *sch, int *qerr)
-
-{
-   struct choke_sched_data *q = qdisc_priv(sch);
-   struct tcf_result res;
-   struct tcf_proto *fl;
-   int result;
-
-   fl = rcu_dereference_bh(q->filter_list);
-   result = tc_classify(skb, fl, &res, false);
-   if (result >= 0) {
-#ifdef CONFIG_NET_CLS_ACT
-   switch (result) {
-   case TC_ACT_STOLEN:
-   case TC_ACT_QUEUED:
-   *qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN;
-   case TC_ACT_SHOT:
-   return false;
-   }
-#endif
-   choke_set_classid(skb, TC_H_MIN(res.classid));
-   return true;
-   }
-
-   return false;
-}
-
-/*
  * Select a packet at random from queue
  * HACK: since queue can have holes from previous deletion; retry several
  *   times to find a random skb but then just give up and return the head
@@ -257,9 +217,6 @@ static bool choke_match_random(const struct choke_sched_data *q,
return false;
 
oskb = choke_peek_random(q, pidx);
-   if (rcu_access_pointer(q->filter_list))
-   return choke_get_classid(nskb) == choke_get_classid(oskb);
-
return choke_match_flow(oskb, nskb);
 }
 
@@ -270,12 +227,6 @@ static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct choke_sched_data *q = qdisc_priv(sch);
const struct red_parms *p = &q->parms;
 
-   if (rcu_access_pointer(q->filter_list)) {
-   /* If using external classifiers, get result and record it. */
-   if (!choke_classify(skb, sch, &ret))
-   goto other_drop;/* Packet was eaten by filter */
-   }
-
choke_skb_cb(skb)->keys_valid = 0;
/* Compute average queue usage (see RED) */
q->vars.qavg = red_calc_qavg(p, &q->vars, sch->q.qlen);
@@ -340,7 +291,6 @@ static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch,
qdisc_drop(skb, sch, to_free);
return NET_XMIT_CN;
 
-other_drop:
if (ret & __NET_XMIT_BYPASS)
qdisc_qstats_drop(sch);
__qdisc_drop(skb, to_free);
@@ -538,7 +488,6 @@ static void choke_destroy(struct Qdisc *sch)
 {
struct choke_sched_data *q = qdisc_priv(sch);
 
-   tcf_destroy_chain(&q->filter_list);
choke_free(q->tab);
 }
 
-- 
2.7.4

Re: [iproute2 net-next 3/3] ip netconf: Show all families on dev request

2017-03-23 Thread Nicolas Dichtel

On 22/03/2017 at 22:59, David Ahern wrote:
> Currently, specifying a device to ip netconf dumps only values
> for IPv4. Change this to dump data for all families unless a specific
> family is given.
> 
> Signed-off-by: David Ahern 
> ---
>  ip/ipnetconf.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/ip/ipnetconf.c b/ip/ipnetconf.c
> index dc0851025223..ab4c1d9db7c8 100644
> --- a/ip/ipnetconf.c
> +++ b/ip/ipnetconf.c
> @@ -56,6 +56,7 @@ int print_netconf(const struct sockaddr_nl *who, struct rtnl_ctrl_data *ctrl,
>   struct netconfmsg *ncm = NLMSG_DATA(n);
>   int len = n->nlmsg_len;
>   struct rtattr *tb[NETCONFA_MAX+1];
> + int ifindex = 0;
>  
>   if (n->nlmsg_type == NLMSG_ERROR)
>   return -1;
> @@ -77,6 +78,12 @@ int print_netconf(const struct sockaddr_nl *who, struct rtnl_ctrl_data *ctrl,
>   parse_rtattr(tb, NETCONFA_MAX, netconf_rta(ncm),
>NLMSG_PAYLOAD(n, sizeof(*ncm)));
>  
> + if (tb[NETCONFA_IFINDEX])
> + ifindex = *((int *)rta_getattr_str(tb[NETCONFA_IFINDEX]));
This line is only moved, but rta_getattr_u32() would probably be more correct here.
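
A small stand-alone sketch of the difference, under stated assumptions:
it does not use the real libnetlink API, and struct sk_attr and the
sk_attr_* helpers are hypothetical. It shows a typed accessor that
checks the payload size and copies out a 32-bit value, rather than
casting a string-style pointer to int *.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute: a length followed by raw payload bytes. */
struct sk_attr {
	uint16_t len;          /* payload length in bytes */
	unsigned char data[8]; /* payload, host byte order */
};

/* Typed accessor: checks the size, then copies out a 32-bit value. */
static uint32_t sk_attr_get_u32(const struct sk_attr *a)
{
	uint32_t v;

	assert(a->len >= sizeof(v));
	memcpy(&v, a->data, sizeof(v)); /* avoids alignment/aliasing issues */
	return v;
}

/* Helper for the example: store an interface index as a u32 payload. */
static struct sk_attr sk_attr_put_u32(uint32_t v)
{
	struct sk_attr a = { .len = sizeof(v) };

	memcpy(a.data, &v, sizeof(v));
	return a;
}

/* Store a value and read it back through the typed accessor. */
static uint32_t sk_roundtrip(uint32_t v)
{
	struct sk_attr a = sk_attr_put_u32(v);

	return sk_attr_get_u32(&a);
}
```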


Regards,
Nicolas

net/sched: GPF in qdisc_hash_add

2017-03-23 Thread Dmitry Vyukov

Hello,

I've hit the following GPF while running syzkaller on commit
093b995e3b55a0ae0670226ddfcb05bfbf0099ae.  Note the preceding injected
kmalloc failure; most likely it is the root cause.

FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 2 PID: 12732 Comm: syz-executor6 Not tainted 4.11.0-rc3+ #365
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x1b8/0x28d lib/dump_stack.c:52
 fail_dump lib/fault-inject.c:52 [inline]
 should_fail+0x804/0x8c0 lib/fault-inject.c:154
 should_failslab+0xec/0x120 mm/failslab.c:31
 slab_pre_alloc_hook mm/slab.h:434 [inline]
 slab_alloc_node mm/slab.c:3315 [inline]
 kmem_cache_alloc_node_trace+0x200/0x720 mm/slab.c:3679
 __do_kmalloc_node mm/slab.c:3699 [inline]
 __kmalloc_node+0x33/0x70 mm/slab.c:3707
 kmalloc_node include/linux/slab.h:532 [inline]
 kzalloc_node include/linux/slab.h:674 [inline]
 qdisc_alloc+0xf4/0x670 net/sched/sch_generic.c:604
 qdisc_create_dflt+0x59/0x160 net/sched/sch_generic.c:652
 attach_one_default_qdisc net/sched/sch_generic.c:767 [inline]
 netdev_for_each_tx_queue include/linux/netdevice.h:1948 [inline]
 attach_default_qdiscs net/sched/sch_generic.c:786 [inline]
 dev_activate+0x58d/0x920 net/sched/sch_generic.c:829
 __dev_open+0x25b/0x360 net/core/dev.c:1348
 __dev_change_flags+0x159/0x3d0 net/core/dev.c:6460
 dev_change_flags+0x88/0x140 net/core/dev.c:6525
 dev_ifsioc+0x51f/0x9b0 net/core/dev_ioctl.c:254
 dev_ioctl+0x1fe/0x1030 net/core/dev_ioctl.c:532
 sock_do_ioctl+0x94/0xb0 net/socket.c:902
 sock_ioctl+0x2c2/0x440 net/socket.c:993
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1af/0x16d0 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x445b79
RSP: 002b:7f68665cf858 EFLAGS: 0286 ORIG_RAX: 0010
RAX: ffda RBX: 00708000 RCX: 00445b79
RDX: 2000 RSI: 8914 RDI: 0019
RBP: 0086 R08:  R09: 
R10:  R11: 0286 R12: 004a7e31
R13:  R14: 7f68665cf618 R15: 7f68665cf788
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault:  [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 12732 Comm: syz-executor6 Not tainted 4.11.0-rc3+ #365
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: 880062b7a2c0 task.stack: 88003348
RIP: 0010:qdisc_hash_add.part.19+0xb6/0x3c0 net/sched/sch_api.c:280
RSP: 0018:880033487820 EFLAGS: 00010202
RAX: dc00 RBX: 85357e00 RCX: c90002b24000
RDX: 007a RSI: 835a523a RDI: 03d0
RBP: 8800334878b8 R08: fbfff0a6afeb R09: fbfff0a6afeb
R10: 0001 R11: fbfff0a6afea R12: 85357e48
R13: 110006690f06 R14: 880033487890 R15: 
FS:  7f68665d0700() GS:88006e20() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 004c2d44 CR3: 3c6f8000 CR4: 26e0
Call Trace:
 qdisc_hash_add+0x76/0x90 net/sched/sch_api.c:279
 attach_default_qdiscs net/sched/sch_generic.c:798 [inline]
 dev_activate+0x6ca/0x920 net/sched/sch_generic.c:829
 __dev_open+0x25b/0x360 net/core/dev.c:1348
 __dev_change_flags+0x159/0x3d0 net/core/dev.c:6460
 dev_change_flags+0x88/0x140 net/core/dev.c:6525
 dev_ifsioc+0x51f/0x9b0 net/core/dev_ioctl.c:254
 dev_ioctl+0x1fe/0x1030 net/core/dev_ioctl.c:532
 sock_do_ioctl+0x94/0xb0 net/socket.c:902
 sock_ioctl+0x2c2/0x440 net/socket.c:993
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1af/0x16d0 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x445b79
RSP: 002b:7f68665cf858 EFLAGS: 0286 ORIG_RAX: 0010
RAX: ffda RBX: 00708000 RCX: 00445b79
RDX: 2000 RSI: 8914 RDI: 0019
RBP: 0086 R08:  R09: 
R10:  R11: 0286 R12: 004a7e31
R13:  R14: 7f68665cf618 R15: 7f68665cf788
Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 5a 02 00 00 4d 8b 3f 48 b8
00 00 00 00 00 fc ff df 49 8d bf d0 03 00 00 48 89 fa 48 c1 ea 03 <80>
3c 02 00 0f 85 c2 02 00 00 49 81 bf d0 03 00 00 00 7e 35 85
RIP: qdisc_hash_add.part.19+0xb6/0x3c0 net/sched/sch_api.c:280 RSP:
880033487820
---[ end trace 1529d12967754f9c ]---

[PATCH] bna: avoid writing uninitialized data into hw registers

2017-03-23 Thread Arnd Bergmann

The latest gcc-7 snapshot warns about bfa_ioc_send_enable/bfa_ioc_send_disable
writing undefined values into the hardware registers:

drivers/net/ethernet/brocade/bna/bfa_ioc.c: In function 
'bfa_iocpf_sm_disabling_entry':
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+4)' is used 
uninitialized in this function [-Werror=uninitialized]
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+8)' is used 
uninitialized in this function [-Werror=uninitialized]

The two functions look like they should do the same thing, but only one
of them initializes the time stamp and clscode field. The fact that we
only get a warning for one of the two functions seems to be arbitrary,
based on the inlining decisions in the compiler.

To address this, I'm making both functions do the same thing:

- set the clscode from the ioc structure in both
- set the time stamp from ktime_get_real_seconds (which also
  avoids the signed-integer overflow in 2038 and extends the
  well-defined behavior until 2106).
- zero-fill the reserved field
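
The 2038 and 2106 limits mentioned above follow from the field widths.
As a rough user-space sketch (the sk_* helper names are hypothetical
and this is not the driver code): a signed 32-bit seconds counter stops
fitting after 2^31 - 1 seconds, while an unsigned 32-bit field keeps
working until it wraps at 2^32 seconds.

```c
#include <assert.h>
#include <stdint.h>

/* Truncate 64-bit seconds since 1970 to an unsigned 32-bit field. */
static uint32_t sk_secs_to_u32(int64_t secs)
{
	assert(secs >= 0);
	return (uint32_t)secs; /* wraps modulo 2^32 */
}

/* True if the value still fits in a signed 32-bit seconds field. */
static int sk_fits_s32(int64_t secs)
{
	return secs <= INT32_MAX;
}
```

2^31 seconds after the epoch falls in January 2038 and 2^32 seconds in
February 2106, which is where the unsigned field would wrap to zero.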

Fixes: 8b230ed8ec96 ("bna: Brocade 10Gb Ethernet device driver")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/brocade/bna/bfa_ioc.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/brocade/bna/bfa_ioc.c b/drivers/net/ethernet/brocade/bna/bfa_ioc.c
index 9e59663a6ead..0f6811860ad5 100644
--- a/drivers/net/ethernet/brocade/bna/bfa_ioc.c
+++ b/drivers/net/ethernet/brocade/bna/bfa_ioc.c
@@ -1930,13 +1930,13 @@ static void
 bfa_ioc_send_enable(struct bfa_ioc *ioc)
 {
struct bfi_ioc_ctrl_req enable_req;
-   struct timeval tv;
 
bfi_h2i_set(enable_req.mh, BFI_MC_IOC, BFI_IOC_H2I_ENABLE_REQ,
bfa_ioc_portid(ioc));
enable_req.clscode = htons(ioc->clscode);
-   do_gettimeofday(&tv);
-   enable_req.tv_sec = ntohl(tv.tv_sec);
+   enable_req.rsvd = htons(0);
+   /* overflow in 2106 */
+   enable_req.tv_sec = ntohl(ktime_get_real_seconds());
bfa_ioc_mbox_send(ioc, &enable_req, sizeof(struct bfi_ioc_ctrl_req));
 }
 
@@ -1947,6 +1947,10 @@ bfa_ioc_send_disable(struct bfa_ioc *ioc)
 
bfi_h2i_set(disable_req.mh, BFI_MC_IOC, BFI_IOC_H2I_DISABLE_REQ,
bfa_ioc_portid(ioc));
+   disable_req.clscode = htons(ioc->clscode);
+   disable_req.rsvd = htons(0);
+   /* overflow in 2106 */
+   disable_req.tv_sec = ntohl(ktime_get_real_seconds());
bfa_ioc_mbox_send(ioc, &disable_req, sizeof(struct bfi_ioc_ctrl_req));
 }
 
-- 
2.9.0

RE: SCTP MSG_MORE code

2017-03-23 Thread David Laight

From: Xin Long
> Sent: 21 March 2017 06:01
> On Tue, Mar 21, 2017 at 1:49 AM, David Laight  wrote:
> > Something needs to be done with SCTP MSG_MORE before the end of the rc
> > cycle.
> > The current code is definitely broken.
> agreed.
> 
> >
> > I objected to the last 'fix' patch because it clears the flag in a place
> > where I don't think it is necessary to do so - so could generate extra
> > ethernet frames.
> >
> Sorry, can you double check the last 'fix' patch ?
> I could not get 'generate extra ethernet frames'.

It would require the connection be 'flow controlled' and/or have
retransmissions.
Otherwise 'data chunk only' ethernet frames are only generated in response
to send() so would always see the value from the last send().

> if we keep sending data with "MSG_MORE", after one ethernet frame
> is sent, "followed by a second ethernet frame with 1 chunk in it" will NOT
> happen, as in this loop the asoc's msg_more flag is still set, and this flush
> is called by sctp_sendmsg(the function msg_more should care more).
> 
> 
> If your point about "generate extra ethernet frames" is right, sure, I will
> change the way to fix that. But before this, please check it again; I would
> appreciate it.

I won't be able to test this in the short term.

David

Re: Extending socket timestamping API for NTP

2017-03-23 Thread Miroslav Lichvar

On Thu, Feb 09, 2017 at 12:09:41PM +0100, Miroslav Lichvar wrote:
> On Thu, Feb 09, 2017 at 09:02:42AM +0100, Richard Cochran wrote:
> > On Tue, Feb 07, 2017 at 03:01:44PM +0100, Miroslav Lichvar wrote:
> > > 5) new SO_TIMESTAMPING options to get transposed RX timestamps
> > > 
> > >PTP uses preamble RX timestamps, but NTP works with trailer RX
> > >timestamps. This means NTP implementations currently need to
> > >transpose HW RX timestamps. The calculation requires the link speed
> > >and the length of the packet at layer 2. It seems this can be
> > >reliably done only using raw sockets. It would be very nice if the
> > >kernel could transpose the timestamps automatically.
> > 
> > Impossible, because the link speed may change between the time when
> > the MAC receives the data and the time the kernel gets around to
> > calculating the time stamp.
> 
> I think that would be an acceptable limitation. The application
> certainly couldn't do a better job than the kernel and it won't have
> to use raw sockets.

After becoming a bit more familiar with the code I don't think this is
a good idea anymore :). I suspect there would be a noticeable
performance impact if each timestamped packet could trigger reading of
the current link speed. If the value had to be cached it would make
more sense to do it in the application.

With no option to get transposed timestamps the point 6 can be
scratched too.

A better approach might be a control message that would provide the
original interface index together with the length of the packet, so
the application could transpose the HW timestamp and map the HW
interface to the PHC.
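
To make the application-side transposition concrete, here is a minimal
sketch under stated assumptions. It assumes the hardware timestamp is
taken at the start of the frame, that the link speed is known in Mbit/s
and did not change in the meantime, and that l2_len already contains
exactly the bytes that follow the timestamping point (whether that
includes the FCS depends on the hardware). The sk_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time on the wire for `bytes` bytes at `mbps` Mbit/s, in nanoseconds.
 * 8 bits per byte and 1000 ns per bit at 1 Mbit/s give
 * bytes * 8000 / mbps.
 */
static uint64_t sk_wire_time_ns(uint32_t bytes, uint32_t mbps)
{
	assert(mbps > 0);
	return (uint64_t)bytes * 8000u / mbps;
}

/* Shift a start-of-frame RX timestamp to the end of the frame. */
static uint64_t sk_transpose_rx_ns(uint64_t preamble_ts_ns,
				   uint32_t l2_len, uint32_t mbps)
{
	return preamble_ts_ns + sk_wire_time_ns(l2_len, mbps);
}
```

For example, 100 bytes at 1000 Mbit/s take 800 ns on the wire, so the
trailer timestamp would be the preamble timestamp plus 800 ns.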

The two values could be saved in the skb_shared_info structure. Now
my question is if they could be useful also for other things than
timestamping and if it should be a new socket option which would work
on any socket independently from timestamping, or if it should rather
be a new flag for the SO_TIMESTAMPING option. If the latter, would it
make sense to put them in the skb_shared_hwtstamps structure and
modify all drivers to set the values when a HW timestamp is captured
instead of adding more code to __netif_receive_skb_core() or similar?

What do you think?

> > > 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
> > > 
> > >With bridges, bonding and other things it's difficult to determine
> > >which PHC timestamped the packet. It would be very useful if the
> > >PHC index was provided with each HW timestamp.

-- 
Miroslav Lichvar

Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE

2017-03-23 Thread Marcelo Ricardo Leitner

On Thu, Mar 23, 2017 at 12:35:46PM +0800, Xin Long wrote:
> On Thu, Mar 23, 2017 at 1:33 AM, Marcelo Ricardo Leitner
>  wrote:
> > On Wed, Mar 22, 2017 at 02:07:37PM +, David Laight wrote:
> >> Regardless of the MSG_MORE flags associated with any specific send()
> >> request there will always be protocol effects (like retransmissions
> >> or flow control 'on') that will generate different 'chunking'.
> >
> > Yes, those are the ones that may lead to some confusion on how it
> > actually works, and mangling them is not really desired for the
> > side effects that it might have.
> >
> > Sooner or later we could have bug reports like "hey this chunk shouldn't
> > have been packed with that." if we stick with the initial proposition,
> > while with David's view, we are only promising to not send packets with
> > a single chunk, as long as the application sends more data fast enough.
> >
> > David, are we on the same page now? ;-)
> >
> > Xin, what do you think?
> If we insist that MSG_MORE means not to send ANY data, I compromise.
> Does ANY include retransmission DATA? Should MSG_MORE block
> retransmission?

That's not really what he meant by that, I think. That "ANY" in there is
a way to refer to the entire buffer, not just the msg that sendmsg is sending.
Later I explained what I got from his explanation, which should be more
like:
"If MSG_MORE was used, and there are no packets in flight, do not send a
packet right away because the application is going to send more data."
Would have to handle the (Not-)Nagle situation too:
"If not using Nagle and using MSG_MORE, try to not generate a packet
right away." (because this may send packets with a single chunk even if
in_flight != 0)
In both cases, if the flush is generated by other triggers, it's okay.

Because if there are chunks already queued, they will be sent as soon as
in_flight reaches 0 or some other break is lifted (flow control).
Holding the chunk that was queued with MSG_MORE and sending a partial
packet in this case because of MSG_MORE is not good, it's possibly not
saving anything.
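
To make the two quoted rules concrete, here is a rough sketch of the decision as I read it. The struct and function names are invented for illustration and do not correspond to the SCTP code; "other_trigger" stands for any flush that is caused by something other than this sendmsg() call (a SACK arriving, a timer, flow control being lifted).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of the sender state, illustrative names only */
struct send_state {
	bool msg_more;		/* MSG_MORE was set on this sendmsg() call */
	bool nagle;		/* Nagle is enabled on the association */
	unsigned int in_flight;	/* amount of data sent but not yet acked */
	bool other_trigger;	/* flush was requested by another event */
};

/* Return true if MSG_MORE should hold back the packet for now */
static bool defer_flush(const struct send_state *s)
{
	assert(s);

	/* Flushes generated by other triggers are always allowed */
	if (s->other_trigger)
		return false;

	/* Without MSG_MORE there is nothing extra to hold back */
	if (!s->msg_more)
		return false;

	/*
	 * Rule 1: nothing in flight, the application promised more data.
	 * Rule 2: Nagle is off, so a lone chunk would otherwise go out
	 *         even with data in flight.
	 * With Nagle on and data in flight, Nagle's own rules apply and
	 * MSG_MORE adds nothing, hence false.
	 */
	return s->in_flight == 0 || !s->nagle;
}
```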

  Marcelo

[PATCH net-next v3 2/3] net: phy: MDIO_BCM_UNIMAC should depend on OF_MDIO

2017-03-23 Thread Florian Fainelli

The Broadcom MDIO UniMAC driver uses routines provided by of_mdio.c which is
guarded by CONFIG_OF_MDIO.

Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 8dbd59baa34d..7ab4b14a43b7 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -40,7 +40,7 @@ config MDIO_BCM_IPROC
 
 config MDIO_BCM_UNIMAC
tristate "Broadcom UniMAC MDIO bus controller"
-   depends on HAS_IOMEM
+   depends on HAS_IOMEM && OF_MDIO
help
  This module provides a driver for the Broadcom UniMAC MDIO busses.
  This hardware can be found in the Broadcom GENET Ethernet MAC
-- 
2.9.3

[PATCH net-next v3 3/3] net: phy: Allow splitting MDIO bus/device support from PHYs

2017-03-23 Thread Florian Fainelli

Introduce a new configuration symbol: MDIO_DEVICE which allows building
the MDIO devices and bus code, without pulling in the entire Ethernet
PHY library and devices code.

PHYLIB now selects MDIO_DEVICE and the relevant Makefiles are
updated to reflect that.

When MDIO_DEVICE (MDIO bus/device only) is selected, but not PHYLIB, we
have mdio-bus.ko as a loadable module, and it does not have a
module_exit() function because the safety of removing a bus class is
unclear.

When both MDIO_DEVICE and PHYLIB are enabled, we need to assemble
everything into a common loadable module: libphy.ko because of nasty
circular dependencies between phy.c, phy_device.c and mdio_bus.c which
are really tough to untangle.

Signed-off-by: Florian Fainelli 
---
 drivers/net/Makefile |  2 +-
 drivers/net/phy/Kconfig  | 60 +++-
 drivers/net/phy/Makefile | 13 +++--
 drivers/net/phy/mdio-boardinfo.c |  1 +
 drivers/net/phy/mdio_bus.c   |  9 ++
 include/linux/phy.h  | 21 --
 6 files changed, 76 insertions(+), 30 deletions(-)

diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 98ed4d96987c..55f75aea283c 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -18,7 +18,7 @@ obj-$(CONFIG_MII) += mii.o
 obj-$(CONFIG_MDIO) += mdio.o
 obj-$(CONFIG_NET) += Space.o loopback.o
 obj-$(CONFIG_NETCONSOLE) += netconsole.o
-obj-$(CONFIG_PHYLIB) += phy/
+obj-$(CONFIG_MDIO_DEVICE) += phy/
 obj-$(CONFIG_RIONET) += rionet.o
 obj-$(CONFIG_NET_TEAM) += team/
 obj-$(CONFIG_TUN) += tun.o
diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 7ab4b14a43b7..60ffc9da6a28 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -2,33 +2,12 @@
 # PHY Layer Configuration
 #
 
-menuconfig PHYLIB
-   tristate "PHY Device support and infrastructure"
-   depends on NETDEVICES
+menuconfig MDIO_DEVICE
+   tristate "MDIO bus device drivers"
help
- Ethernet controllers are usually attached to PHY
- devices.  This option provides infrastructure for
- managing PHY devices.
-
-if PHYLIB
-
-config SWPHY
-   bool
-
-config LED_TRIGGER_PHY
-   bool "Support LED triggers for tracking link state"
-   depends on LEDS_TRIGGERS
-   ---help---
- Adds support for a set of LED trigger events per-PHY.  Link
- state change will trigger the events, for consumption by an
- LED class driver.  There are triggers for each link speed currently
- supported by the phy, and are of the form:
-  ::
-
- Where speed is in the form:
-   Mbps or Gbps
+  MDIO devices and driver infrastructure code.
 
-comment "MDIO bus device drivers"
+if MDIO_DEVICE
 
 config MDIO_BCM_IPROC
tristate "Broadcom iProc MDIO bus controller"
@@ -49,6 +28,7 @@ config MDIO_BCM_UNIMAC
 
 config MDIO_BITBANG
tristate "Bitbanged MDIO buses"
+   depends on !(MDIO_DEVICE=y && PHYLIB=m)
help
  This module implements the MDIO bus protocol in software,
  for use by low level drivers that export the ability to
@@ -160,6 +140,36 @@ config MDIO_XGENE
  This module provides a driver for the MDIO busses found in the
  APM X-Gene SoC's.
 
+endif
+
+menuconfig PHYLIB
+   tristate "PHY Device support and infrastructure"
+   depends on NETDEVICES
+   select MDIO_DEVICE
+   help
+ Ethernet controllers are usually attached to PHY
+ devices.  This option provides infrastructure for
+ managing PHY devices.
+
+if PHYLIB
+
+config SWPHY
+   bool
+
+config LED_TRIGGER_PHY
+   bool "Support LED triggers for tracking link state"
+   depends on LEDS_TRIGGERS
+   ---help---
+ Adds support for a set of LED trigger events per-PHY.  Link
+ state change will trigger the events, for consumption by an
+ LED class driver.  There are triggers for each link speed currently
+ supported by the phy, and are of the form:
+  ::
+
+ Where speed is in the form:
+   Mbps or Gbps
+
+
 comment "MII PHY device drivers"
 
 config AMD_PHY
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 82d915614646..0e1ec0438c23 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -1,7 +1,16 @@
 # Makefile for Linux PHY drivers and MDIO bus drivers
 
-libphy-y   := phy.o phy_device.o mdio_bus.o mdio_device.o \
-  mdio-boardinfo.o phy-core.o
+libphy-y   := phy.o phy-core.o phy_device.o
+mdio-bus-y += mdio_bus.o mdio_device.o mdio-boardinfo.o
+
+# PHYLIB implies MDIO_DEVICE, in that case, we have a bunch of circular
+# dependencies that does not make it possible to split mdio-bus objects into a
+# dedicated loadable module, so we bundle them all together into libphy.ko
+ifdef CONFIG_PHYLIB
+libphy-y

[PATCH net-next v3 0/3] net: phy: Allow splitting MDIO bus/device support

2017-03-23 Thread Florian Fainelli

Hi all,

This patch series allows building support for MDIO bus controllers which
are sometimes usable and necessary in cases where there are no Ethernet PHYs.

Changes in v3:
- corrected of_mdio compile guards for prototypes vs. stubs
- added a missing OF_MDIO dependency for MDIO_BCM_UNIMAC
- fixed Kbuild bot reported errors against mdio-bitbang

Changes in v2:
- implement Russell's feedback
- solve the circular dependency in the CONFIG_MDIO_DEVICE + CONFIG_PHYLIB case


Florian Fainelli (3):
  of_mdio: Correct check against CONFIG_OF
  net: phy: MDIO_BCM_UNIMAC should depend on OF_MDIO
  net: phy: Allow splitting MDIO bus/device support from PHYs

 drivers/net/Makefile |  2 +-
 drivers/net/phy/Kconfig  | 62 +++-
 drivers/net/phy/Makefile | 13 +++--
 drivers/net/phy/mdio-boardinfo.c |  1 +
 drivers/net/phy/mdio_bus.c   |  9 ++
 include/linux/of_mdio.h  |  4 +--
 include/linux/phy.h  | 21 --
 7 files changed, 79 insertions(+), 33 deletions(-)

-- 
2.9.3

[PATCH net-next v3 1/3] of_mdio: Correct check against CONFIG_OF

2017-03-23 Thread Florian Fainelli

CONFIG_OF_MDIO is actually what triggers the build of drivers/of/of_mdio.c, so
the choice between prototypes and inline stubs should be based on that symbol as
well.

Signed-off-by: Florian Fainelli 
---
 include/linux/of_mdio.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/of_mdio.h b/include/linux/of_mdio.h
index a58cca8bcb29..ba35ba520487 100644
--- a/include/linux/of_mdio.h
+++ b/include/linux/of_mdio.h
@@ -12,7 +12,7 @@
 #include 
 #include 
 
-#ifdef CONFIG_OF
+#if IS_ENABLED(CONFIG_OF_MDIO)
 extern int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np);
 extern struct phy_device *of_phy_find_device(struct device_node *phy_np);
 extern struct phy_device *of_phy_connect(struct net_device *dev,
@@ -32,7 +32,7 @@ extern int of_phy_register_fixed_link(struct device_node *np);
 extern void of_phy_deregister_fixed_link(struct device_node *np);
 extern bool of_phy_is_fixed_link(struct device_node *np);
 
-#else /* CONFIG_OF */
+#else /* CONFIG_OF_MDIO */
 static inline int of_mdiobus_register(struct mii_bus *mdio, struct device_node 
*np)
 {
/*
-- 
2.9.3
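
The header pattern this patch adjusts can be sketched in isolation as below. This is only an illustration: DEMO_HAVE_IMPL, struct demo_bus and demo_bus_register() are hypothetical names, and the -ENOSYS return value of the stub is an assumption, not what the of_mdio.h stubs necessarily return. The point is that the guard must be the symbol that actually builds the implementation; otherwise callers can see a prototype for a function that is never linked in.

```c
#include <assert.h>
#include <errno.h>

struct demo_bus;	/* opaque, hypothetical type */

/*
 * DEMO_HAVE_IMPL plays the role of the config symbol that compiles the
 * implementation file. It is left undefined here, so the stub is used.
 */
#ifdef DEMO_HAVE_IMPL
/* Implementation is built elsewhere: expose only the prototype */
int demo_bus_register(struct demo_bus *bus);
#else
/* Implementation is not built: an inline stub keeps callers linking */
static inline int demo_bus_register(struct demo_bus *bus)
{
	(void)bus;
	return -ENOSYS;	/* assumed error code for "not available" */
}
#endif
```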

Re: [PATCH net-next v4] net: Add sysctl to toggle early demux for tcp and udp

2017-03-23 Thread kbuild test robot

Hi Subash,

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Subash-Abhinov-Kasiviswanathan/net-Add-sysctl-to-toggle-early-demux-for-tcp-and-udp/20170323-205131
config: x86_64-kexec (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   net/built-in.o: In function `proc_tcp_early_demux':
>> sysctl_net_ipv4.c:(.text+0x7fe04): undefined reference to 
>> `tcp_v6_early_demux_configure'
   net/built-in.o: In function `proc_udp_early_demux':
>> sysctl_net_ipv4.c:(.text+0x7fe3d): undefined reference to 
>> `udp_v6_early_demux_configure'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

Re: [v2,net-next,1/3] net: stmmac: enable multiple buffers

2017-03-23 Thread Thierry Reding

On Fri, Mar 17, 2017 at 04:11:05PM +, Joao Pinto wrote:
> This patch creates 2 new structures (stmmac_tx_queue and stmmac_rx_queue)
> in include/linux/stmmac.h, enabling that each RX and TX queue has its
> own buffers and data.
> 
> Signed-off-by: Joao Pinto 
> ---
> changes v1->v2:
> - just to keep up version
> 
>  drivers/net/ethernet/stmicro/stmmac/chain_mode.c  |   45 +-
>  drivers/net/ethernet/stmicro/stmmac/ring_mode.c   |   46 +-
>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |   49 +-
>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1306 
> ++---
>  4 files changed, 973 insertions(+), 473 deletions(-)

Hi Joao,

This seems to break support on Tegra186 again. I've gone through this
patch multiple times and I can't figure out what could be causing it.
Any ideas?

What I'm seeing is that the transmit queue 0 times out:

[  101.121774] Sending DHCP requests ...
[  111.841763] NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 0 
timed out

and then I also see this:

[  112.252024] dwc-eth-dwmac 249.ethernet: DMA-API: device driver 
tries to free DMA memory it has not allocated [device 
address=0x57ac6e9d] [size=0 bytes]
[  112.266606] [ cut here ]
[  112.271220] WARNING: CPU: 0 PID: 0 at 
/home/thierry.reding/src/kernel/linux-tegra.git/lib/dma-debug.c:1106 
check_unmap+0x7b0/0x930
[  112.282934] Modules linked in:
[  112.285985]
[  112.287474] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G S  W   
4.11.0-rc3-next-20170323-00060-g2eab4557749b-dirty #400
[  112.298581] Hardware name: NVIDIA Tegra186 P2771- Development 
Board (DT)
[  112.305615] task: 08f87b00 task.stack: 08f7
[  112.311523] PC is at check_unmap+0x7b0/0x930
[  112.315785] LR is at check_unmap+0x7b0/0x930
[  112.320046] pc : [] lr : [] 
pstate: 6145
[  112.327426] sp : 8001f5e50c50
[  112.330733] x29: 8001f5e50c50 x28: 08f75180
[  112.336042] x27: 08f87b00 x26: 0020
[  112.341351] x25: 0140 x24: 08f81000
[  112.346660] x23: 8001ec4b0810 x22: 57ac6e9d
[  112.351969] x21: 57ac6e9d x20: 8001f5e50cb0
[  112.357277] x19: 8001ec4b0810 x18: 0010
[  112.362586] x17: 262ea01f x16: 0f48bf67
[  112.367895] x15: 0006 x14: 5d64396536636137
[  112.373203] x13: 3530303030303030 x12: 3078303d73736572
[  112.378511] x11: 6464612065636976 x10: 65645b2064657461
[  112.383819] x9 : 0852c238 x8 : 01fb
[  112.389126] x7 :  x6 : 0810ad58
[  112.394434] x5 :  x4 : 
[  112.399743] x3 :  x2 : 08f99258
[  112.405050] x1 : 08f87b00 x0 : 0097
[  112.410358]
[  112.411846] ---[ end trace 48028f96a0e990fb ]---
[  112.416453] Call trace:
[  112.418895] Exception stack(0x8001f5e50a80 to 0x8001f5e50bb0)
[  112.425324] 0a80: 8001ec4b0810 0001 8001f5e50c50 
083d75f0
[  112.433139] 0aa0: 01c0   
08d1c0c0
[  112.440954] 0ac0: 8001f5e50c50 8001f5e50c50 8001f5e50c10 
ffc8
[  112.448769] 0ae0: 8001f5e50b10 0810c3a8 8001f5e50c50 
8001f5e50c50
[  112.456585] 0b00: 8001f5e50c10 ffc8 8001f5e50bc0 
08178388
[  112.464399] 0b20: 0097 08f87b00 08f99258 

[  112.472215] 0b40:   0810ad58 

[  112.480030] 0b60: 01fb 0852c238 65645b2064657461 
6464612065636976
[  112.487845] 0b80: 3078303d73736572 3530303030303030 5d64396536636137 
0006
[  112.495659] 0ba0: 0f48bf67 262ea01f
[  112.500528] [] check_unmap+0x7b0/0x930
[  112.505830] [] debug_dma_unmap_page+0x68/0x70
[  112.511744] [] 
stmmac_free_tx_buffers.isra.1+0x114/0x198
[  112.518604] [] stmmac_tx_err+0x7c/0x160
[  112.523993] [] stmmac_tx_timeout+0x34/0x50
[  112.529642] [] dev_watchdog+0x270/0x2a8
[  112.535032] [] call_timer_fn+0x64/0xd0
[  112.540334] [] expire_timers+0xb0/0xc0
[  112.545636] [] run_timer_softirq+0x80/0xc0
[  112.551284] [] __do_softirq+0x10c/0x218
[  112.556673] [] irq_exit+0xc8/0x118
[  112.561629] [] __handle_domain_irq+0x60/0xb8
[  112.567450] [] gic_handle_irq+0x54/0xa8
[  112.572837] Exception stack(0x08f73dd0 to 0x08f73f00)
[  112.5

Re: [v2,net-next,1/3] net: stmmac: enable multiple buffers

2017-03-23 Thread Joao Pinto

Hi Thierry,

At 5:17 PM on 3/23/2017, Thierry Reding wrote:
> On Fri, Mar 17, 2017 at 04:11:05PM +, Joao Pinto wrote:
>> This patch creates 2 new structures (stmmac_tx_queue and stmmac_rx_queue)
>> in include/linux/stmmac.h, enabling that each RX and TX queue has its
>> own buffers and data.
>>
>> Signed-off-by: Joao Pinto 
>> ---
>> changes v1->v2:
>> - just to keep up version
>>
>>  drivers/net/ethernet/stmicro/stmmac/chain_mode.c  |   45 +-
>>  drivers/net/ethernet/stmicro/stmmac/ring_mode.c   |   46 +-
>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |   49 +-
>>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1306 
>> ++---
>>  4 files changed, 973 insertions(+), 473 deletions(-)
> 
> Hi Joao,
> 
> This seems to break support on Tegra186 again. I've gone through this
> patch multiple times and I can't figure out what could be causing it.
> Any ideas?
> 
> What I'm seeing is that the transmit queue 0 times out:
> 
>   [  101.121774] Sending DHCP requests ...
>   [  111.841763] NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 0 
> timed out

Are you using a GMAC or GMAC4 (aka QoS)?

> 
> and then I also see this:
> 
>   [  112.252024] dwc-eth-dwmac 249.ethernet: DMA-API: device driver 
> tries to free DMA memory it has not allocated [device 
> address=0x57ac6e9d] [size=0 bytes]

Humm... Something in stmmac_free_tx_buffers... I'll need to check.

>   [  112.266606] [ cut here ]
>   [  112.271220] WARNING: CPU: 0 PID: 0 at 
> /home/thierry.reding/src/kernel/linux-tegra.git/lib/dma-debug.c:1106 
> check_unmap+0x7b0/0x930
>   [  112.282934] Modules linked in:
>   [  112.285985]
>   [  112.287474] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G S  W   
> 4.11.0-rc3-next-20170323-00060-g2eab4557749b-dirty #400
>   [  112.298581] Hardware name: NVIDIA Tegra186 P2771- Development 
> Board (DT)
>   [  112.305615] task: 08f87b00 task.stack: 08f7
>   [  112.311523] PC is at check_unmap+0x7b0/0x930
>   [  112.315785] LR is at check_unmap+0x7b0/0x930
>   [  112.320046] pc : [] lr : [] 
> pstate: 6145
>   [  112.327426] sp : 8001f5e50c50
>   [  112.330733] x29: 8001f5e50c50 x28: 08f75180
>   [  112.336042] x27: 08f87b00 x26: 0020
>   [  112.341351] x25: 0140 x24: 08f81000
>   [  112.346660] x23: 8001ec4b0810 x22: 57ac6e9d
>   [  112.351969] x21: 57ac6e9d x20: 8001f5e50cb0
>   [  112.357277] x19: 8001ec4b0810 x18: 0010
>   [  112.362586] x17: 262ea01f x16: 0f48bf67
>   [  112.367895] x15: 0006 x14: 5d64396536636137
>   [  112.373203] x13: 3530303030303030 x12: 3078303d73736572
>   [  112.378511] x11: 6464612065636976 x10: 65645b2064657461
>   [  112.383819] x9 : 0852c238 x8 : 01fb
>   [  112.389126] x7 :  x6 : 0810ad58
>   [  112.394434] x5 :  x4 : 
>   [  112.399743] x3 :  x2 : 08f99258
>   [  112.405050] x1 : 08f87b00 x0 : 0097
>   [  112.410358]
>   [  112.411846] ---[ end trace 48028f96a0e990fb ]---
>   [  112.416453] Call trace:
>   [  112.418895] Exception stack(0x8001f5e50a80 to 0x8001f5e50bb0)
>   [  112.425324] 0a80: 8001ec4b0810 0001 8001f5e50c50 
> 083d75f0
>   [  112.433139] 0aa0: 01c0   
> 08d1c0c0
>   [  112.440954] 0ac0: 8001f5e50c50 8001f5e50c50 8001f5e50c10 
> ffc8
>   [  112.448769] 0ae0: 8001f5e50b10 0810c3a8 8001f5e50c50 
> 8001f5e50c50
>   [  112.456585] 0b00: 8001f5e50c10 ffc8 8001f5e50bc0 
> 08178388
>   [  112.464399] 0b20: 0097 08f87b00 08f99258 
> 
>   [  112.472215] 0b40:   0810ad58 
> 
>   [  112.480030] 0b60: 01fb 0852c238 65645b2064657461 
> 6464612065636976
>   [  112.487845] 0b80: 3078303d73736572 3530303030303030 5d64396536636137 
> 0006
>   [  112.495659] 0ba0: 0f48bf67 262ea01f
>   [  112.500528] [] check_unmap+0x7b0/0x930
>   [  112.505830] [] debug_dma_unmap_page+0x68/0x70
>   [  112.511744] [] 
> stmmac_free_tx_buffers.isra.1+0x114/0x198
>   [  112.518604] [] stmmac_tx_err+0x7c/0x160
>   [  112.523993] [] s

[PATCH net-next 2/3] net: systemport: Clear status to reduce spurious interrupts

2017-03-23 Thread Florian Fainelli

Do something similar to commit d5810ca3252a ("net: bcmgenet: clear
status to reduce spurious interrupts") and clear interrupts right before
servicing them. This reduces the interrupt rate by 10K interrupts/sec
for a 1 Gbit/sec TX TCP session.

Signed-off-by: Florian Fainelli 
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c 
b/drivers/net/ethernet/broadcom/bcmsysport.c
index 986fb05529fc..c915bcfae0af 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -669,6 +669,9 @@ static unsigned int bcm_sysport_desc_rx(struct 
bcm_sysport_priv *priv,
u16 len, status;
struct bcm_rsb *rsb;
 
+   /* Clear status before servicing to reduce spurious interrupts */
+   intrl2_0_writel(priv, INTRL2_0_RDMA_MBDONE, INTRL2_CPU_CLEAR);
+
/* Determine how much we should process since last call, SYSTEMPORT Lite
 * groups the producer and consumer indexes into the same 32-bit
 * which we access using RDMA_CONS_INDEX
@@ -814,6 +817,13 @@ static unsigned int __bcm_sysport_tx_reclaim(struct 
bcm_sysport_priv *priv,
struct bcm_sysport_cb *cb;
u32 hw_ind;
 
+   /* Clear status before servicing to reduce spurious interrupts */
+   if (!ring->priv->is_lite)
+   intrl2_1_writel(ring->priv, BIT(ring->index), INTRL2_CPU_CLEAR);
+   else
+   intrl2_0_writel(ring->priv, BIT(ring->index +
+   INTRL2_0_TDMA_MBDONE_SHIFT), INTRL2_CPU_CLEAR);
+
/* Compute how many descriptors have been processed since last call */
hw_ind = tdma_readl(priv, TDMA_DESC_RING_PROD_CONS_INDEX(ring->index));
c_index = (hw_ind >> RING_CONS_INDEX_SHIFT) & RING_CONS_INDEX_MASK;
-- 
2.9.3

[PATCH net-next 1/3] net: systemport: Track per TX ring statistics

2017-03-23 Thread Florian Fainelli

bcm_sysport_tx_reclaim_one() is currently summing TX bytes/packets in a
way that is not SMP friendly: multiple CPUs could run
bcm_sysport_tx_reclaim_one() independently and still update
stats->tx_bytes and stats->tx_packets, clobbering the other CPUs'
statistics.

Fix this by tracking bytes, packets, dropped and error statistics per
TX ring, and provide a bcm_sysport_get_nstats() function which
aggregates everything and returns a consistent output.
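
As a rough userspace sketch of the idea (not the driver code): each ring owns its counters, the reclaim path of a ring touches only that ring, and totals are computed on demand. All demo_* names are hypothetical, and no locking or u64 stats synchronisation is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ring: only its own reclaim path updates it */
struct demo_tx_ring {
	uint64_t packets;
	uint64_t bytes;
};

struct demo_totals {
	uint64_t packets;
	uint64_t bytes;
};

/* Account one completed packet against a single ring */
static void demo_ring_account(struct demo_tx_ring *ring, uint32_t skb_len)
{
	ring->packets++;
	ring->bytes += skb_len;
}

/* Sum the per-ring counters instead of sharing one global counter */
static struct demo_totals demo_sum_rings(const struct demo_tx_ring *rings,
					 size_t n)
{
	struct demo_totals t = { 0, 0 };
	size_t i;

	assert(rings || n == 0);
	for (i = 0; i < n; i++) {
		t.packets += rings[i].packets;
		t.bytes += rings[i].bytes;
	}
	return t;
}
```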

Signed-off-by: Florian Fainelli 
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 65 ++
 drivers/net/ethernet/broadcom/bcmsysport.h |  5 +++
 2 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c 
b/drivers/net/ethernet/broadcom/bcmsysport.c
index a68d4889f5db..986fb05529fc 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -284,6 +284,7 @@ static const struct bcm_sysport_stats 
bcm_sysport_gstrings_stats[] = {
STAT_MIB_SOFT("alloc_rx_buff_failed", mib.alloc_rx_buff_failed),
STAT_MIB_SOFT("rx_dma_failed", mib.rx_dma_failed),
STAT_MIB_SOFT("tx_dma_failed", mib.tx_dma_failed),
+   /* Per TX-queue statistics are dynamically appended */
 };
 
 #define BCM_SYSPORT_STATS_LEN  ARRAY_SIZE(bcm_sysport_gstrings_stats)
@@ -338,7 +339,8 @@ static int bcm_sysport_get_sset_count(struct net_device 
*dev, int string_set)
continue;
j++;
}
-   return j;
+   /* Include per-queue statistics */
+   return j + dev->num_tx_queues * NUM_SYSPORT_TXQ_STAT;
default:
return -EOPNOTSUPP;
}
@@ -349,6 +351,7 @@ static void bcm_sysport_get_strings(struct net_device *dev,
 {
struct bcm_sysport_priv *priv = netdev_priv(dev);
const struct bcm_sysport_stats *s;
+   char buf[128];
int i, j;
 
switch (stringset) {
@@ -363,6 +366,18 @@ static void bcm_sysport_get_strings(struct net_device *dev,
   ETH_GSTRING_LEN);
j++;
}
+
+   for (i = 0; i < dev->num_tx_queues; i++) {
+   snprintf(buf, sizeof(buf), "txq%d_packets", i);
+   memcpy(data + j * ETH_GSTRING_LEN, buf,
+  ETH_GSTRING_LEN);
+   j++;
+
+   snprintf(buf, sizeof(buf), "txq%d_bytes", i);
+   memcpy(data + j * ETH_GSTRING_LEN, buf,
+  ETH_GSTRING_LEN);
+   j++;
+   }
break;
default:
break;
@@ -418,6 +433,7 @@ static void bcm_sysport_get_stats(struct net_device *dev,
  struct ethtool_stats *stats, u64 *data)
 {
struct bcm_sysport_priv *priv = netdev_priv(dev);
+   struct bcm_sysport_tx_ring *ring;
int i, j;
 
if (netif_running(dev))
@@ -436,6 +452,22 @@ static void bcm_sysport_get_stats(struct net_device *dev,
data[j] = *(unsigned long *)p;
j++;
}
+
+   /* For SYSTEMPORT Lite since we have holes in our statistics, j would
+* be equal to BCM_SYSPORT_STATS_LEN at the end of the loop, but it
+* needs to point to how many total statistics we have minus the
+* number of per TX queue statistics
+*/
+   j = bcm_sysport_get_sset_count(dev, ETH_SS_STATS) -
+   dev->num_tx_queues * NUM_SYSPORT_TXQ_STAT;
+
+   for (i = 0; i < dev->num_tx_queues; i++) {
+   ring = &priv->tx_rings[i];
+   data[j] = ring->packets;
+   j++;
+   data[j] = ring->bytes;
+   j++;
+   }
 }
 
 static void bcm_sysport_get_wol(struct net_device *dev,
@@ -746,26 +778,26 @@ static unsigned int bcm_sysport_desc_rx(struct 
bcm_sysport_priv *priv,
return processed;
 }
 
-static void bcm_sysport_tx_reclaim_one(struct bcm_sysport_priv *priv,
+static void bcm_sysport_tx_reclaim_one(struct bcm_sysport_tx_ring *ring,
   struct bcm_sysport_cb *cb,
   unsigned int *bytes_compl,
   unsigned int *pkts_compl)
 {
+   struct bcm_sysport_priv *priv = ring->priv;
struct device *kdev = &priv->pdev->dev;
-   struct net_device *ndev = priv->netdev;
 
if (cb->skb) {
-   ndev->stats.tx_bytes += cb->skb->len;
+   ring->bytes += cb->skb->len;
*bytes_compl += cb->skb->len;
dma_unmap_single(kdev, dma_unmap_addr(cb, dma_addr),
 dma_unmap_len(cb, dma_len),
 DMA_TO_DEVICE);
-   ndev->stats.tx_packets++;
+   ring->packets++;
(*pkts_compl)++;

[PATCH net-next 0/3] net: systemport: TX/NAPI improvements

2017-03-23 Thread Florian Fainelli

Hi all,

This patch series builds on Doug's latest changes done in BCMGENET to reduce
the number of spurious interrupts in NAPI, simplify circular pointer
arithmetic, and finally track per TX ring statistics to be SMP friendly.

Florian Fainelli (3):
  net: systemport: Track per TX ring statistics
  net: systemport: Clear status to reduce spurious interrupts
  net: systemport: Simplify circular pointer arithmetic

 drivers/net/ethernet/broadcom/bcmsysport.c | 81 +-
 drivers/net/ethernet/broadcom/bcmsysport.h |  5 ++
 2 files changed, 74 insertions(+), 12 deletions(-)

-- 
2.9.3

[PATCH net-next 3/3] net: systemport: Simplify circular pointer arithmetic

2017-03-23 Thread Florian Fainelli

Similar to c298ede2fe21 ("net: bcmgenet: simplify circular pointer
arithmetic") we don't need complex arithmetic since we always have a
ring size that is a power of 2.

Signed-off-by: Florian Fainelli 
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c 
b/drivers/net/ethernet/broadcom/bcmsysport.c
index c915bcfae0af..61e26c6b26ab 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -682,11 +682,7 @@ static unsigned int bcm_sysport_desc_rx(struct 
bcm_sysport_priv *priv,
p_index = rdma_readl(priv, RDMA_CONS_INDEX);
p_index &= RDMA_PROD_INDEX_MASK;
 
-   if (p_index < priv->rx_c_index)
-   to_process = (RDMA_CONS_INDEX_MASK + 1) -
-   priv->rx_c_index + p_index;
-   else
-   to_process = p_index - priv->rx_c_index;
+   to_process = (p_index - priv->rx_c_index) & RDMA_CONS_INDEX_MASK;
 
netif_dbg(priv, rx_status, ndev,
  "p_index=%d rx_c_index=%d to_process=%d\n",
-- 
2.9.3
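
The equivalence the patch relies on can be checked outside the driver with a small sketch. DEMO_RING_SIZE and the two function names are placeholders chosen for this example; the only assumption carried over from the commit message is that the ring size is a power of 2.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE	0x10000u	/* hypothetical size, power of 2 */
#define DEMO_INDEX_MASK	(DEMO_RING_SIZE - 1)

_Static_assert((DEMO_RING_SIZE & (DEMO_RING_SIZE - 1)) == 0,
	       "ring size must be a power of 2");

/* Branching form, mirroring the lines the patch removes */
static uint32_t to_process_branch(uint32_t p_index, uint32_t c_index)
{
	if (p_index < c_index)
		return (DEMO_INDEX_MASK + 1) - c_index + p_index;
	return p_index - c_index;
}

/* Masked form: unsigned wrap-around plus the mask yields the same distance */
static uint32_t to_process_masked(uint32_t p_index, uint32_t c_index)
{
	return (p_index - c_index) & DEMO_INDEX_MASK;
}
```

Both forms agree for any pair of indexes below the ring size, including the case where the producer index has wrapped past the consumer index.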

[PATCH V8 1/3] irq: Add flags to request_percpu_irq function

2017-03-23 Thread Daniel Lezcano

In the next changes, we track the interrupts but discard timer interrupts,
since tracking them does not make sense: the next interrupt on a timer is
predictable.

But the request_percpu_irq API does not allow passing a flag, and hence
specifying whether the interrupt is a timer interrupt.

Solve this by passing a 'flags' parameter to the function and change all the
callers to pass IRQF_TIMER when the interrupt is a timer interrupt, zero
otherwise.

For now, in order to prevent misuse of this parameter, only the IRQF_TIMER
flag is a valid parameter to be passed to the request_percpu_irq function.
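
The restriction described above amounts to a check along these lines. This is a standalone sketch, not the code added to kernel/irq/manage.c: DEMO_IRQF_TIMER is a placeholder bit standing in for IRQF_TIMER, and demo_check_percpu_flags() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_IRQF_TIMER	0x200u	/* placeholder bit, stands in for IRQF_TIMER */

/* Accept either no flags or exactly the timer flag, reject anything else */
static int demo_check_percpu_flags(unsigned long flags)
{
	if (flags && flags != DEMO_IRQF_TIMER)
		return -EINVAL;
	return 0;
}
```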

Signed-off-by: Daniel Lezcano 
---
 arch/arc/kernel/perf_event.c |  2 +-
 arch/arc/kernel/smp.c|  2 +-
 arch/arm/kernel/smp_twd.c|  3 ++-
 arch/arm/xen/enlighten.c |  2 +-
 drivers/clocksource/arc_timer.c  |  2 +-
 drivers/clocksource/arm_arch_timer.c | 15 +--
 drivers/clocksource/arm_global_timer.c   |  2 +-
 drivers/clocksource/exynos_mct.c |  2 +-
 drivers/clocksource/qcom-timer.c |  2 +-
 drivers/clocksource/time-armada-370-xp.c |  2 +-
 drivers/clocksource/timer-nps.c  |  2 +-
 drivers/net/ethernet/marvell/mvneta.c|  2 +-
 drivers/perf/arm_pmu.c   |  2 +-
 include/linux/interrupt.h|  5 +++--
 kernel/irq/manage.c  | 11 ---
 virt/kvm/arm/arch_timer.c|  5 +++--
 virt/kvm/arm/vgic/vgic-init.c|  2 +-
 17 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 2ce24e7..2a90c7a 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -525,7 +525,7 @@ static int arc_pmu_device_probe(struct platform_device 
*pdev)
arc_pmu->irq = irq;
 
/* intc map function ensures irq_set_percpu_devid() called */
-   request_percpu_irq(irq, arc_pmu_intr, "ARC perf counters",
+   request_percpu_irq(irq, 0, arc_pmu_intr, "ARC perf counters",
   this_cpu_ptr(&arc_pmu_cpu));
 
on_each_cpu(arc_cpu_pmu_irq_init, &irq, 1);
diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
index f462671..5cdd3c9 100644
--- a/arch/arc/kernel/smp.c
+++ b/arch/arc/kernel/smp.c
@@ -381,7 +381,7 @@ int smp_ipi_irq_setup(int cpu, irq_hw_number_t hwirq)
if (!cpu) {
int rc;
 
-   rc = request_percpu_irq(virq, do_IPI, "IPI Interrupt", dev);
+   rc = request_percpu_irq(virq, 0, do_IPI, "IPI Interrupt", dev);
if (rc)
panic("Percpu IRQ request failed for %u\n", virq);
}
diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c
index 895ae51..988f9b9 100644
--- a/arch/arm/kernel/smp_twd.c
+++ b/arch/arm/kernel/smp_twd.c
@@ -332,7 +332,8 @@ static int __init twd_local_timer_common_register(struct 
device_node *np)
goto out_free;
}
 
-   err = request_percpu_irq(twd_ppi, twd_handler, "twd", twd_evt);
+   err = request_percpu_irq(twd_ppi, IRQF_TIMER, twd_handler, "twd",
+twd_evt);
if (err) {
pr_err("twd: can't register interrupt %d (%d)\n", twd_ppi, err);
goto out_free;
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 81e3217..2897f94 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -400,7 +400,7 @@ static int __init xen_guest_init(void)
 
xen_init_IRQ();
 
-   if (request_percpu_irq(xen_events_irq, xen_arm_callback,
+   if (request_percpu_irq(xen_events_irq, 0, xen_arm_callback,
   "events", &xen_vcpu)) {
pr_err("Error request IRQ %d\n", xen_events_irq);
return -EINVAL;
diff --git a/drivers/clocksource/arc_timer.c b/drivers/clocksource/arc_timer.c
index 7517f95..e78e306 100644
--- a/drivers/clocksource/arc_timer.c
+++ b/drivers/clocksource/arc_timer.c
@@ -301,7 +301,7 @@ static int __init arc_clockevent_setup(struct device_node 
*node)
}
 
/* Needs apriori irq_set_percpu_devid() done in intc map function */
-   ret = request_percpu_irq(arc_timer_irq, timer_irq_handler,
+   ret = request_percpu_irq(arc_timer_irq, IRQF_TIMER, timer_irq_handler,
 "Timer0 (per-cpu-tick)", evt);
if (ret) {
pr_err("clockevent: unable to request irq\n");
diff --git a/drivers/clocksource/arm_arch_timer.c 
b/drivers/clocksource/arm_arch_timer.c
index 7a8a411..11398ff 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -768,16 +768,19 @@ static int __init arch_timer_register(void)
ppi = arch_timer_ppi[arch_timer_uses_ppi];
switch (arch_timer_uses_ppi) {
case VIRT_PPI:
-   err = request_percpu_irq(ppi, arch_timer_handler_virt,
-

[Patch net] kcm: return immediately after copy_from_user() failure

2017-03-23 Thread Cong Wang

There is no reason to continue after a copy_from_user()
failure.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Cc: Tom Herbert 
Signed-off-by: Cong Wang 
---
 net/kcm/kcmsock.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 309062f..31762f7 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1687,7 +1687,7 @@ static int kcm_ioctl(struct socket *sock, unsigned int 
cmd, unsigned long arg)
struct kcm_attach info;
 
if (copy_from_user(&info, (void __user *)arg, sizeof(info)))
-   err = -EFAULT;
+   return -EFAULT;
 
err = kcm_attach_ioctl(sock, &info);
 
@@ -1697,7 +1697,7 @@ static int kcm_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
struct kcm_unattach info;
 
if (copy_from_user(&info, (void __user *)arg, sizeof(info)))
-   err = -EFAULT;
+   return -EFAULT;
 
err = kcm_unattach_ioctl(sock, &info);
 
@@ -1708,7 +1708,7 @@ static int kcm_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
struct socket *newsock = NULL;
 
if (copy_from_user(&info, (void __user *)arg, sizeof(info)))
-   err = -EFAULT;
+   return -EFAULT;
 
err = kcm_clone(sock, &info, &newsock);
 
-- 
2.5.5
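
For readers following the patch above, here is a minimal user-space sketch of why the early return matters. It is not the kernel code: fake_copy_from_user(), fake_attach(), struct attach_info, ioctl_old() and ioctl_new() are hypothetical stand-ins invented for this illustration. The point is only that assigning err = -EFAULT and then falling through lets the next assignment overwrite the error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kcm_attach; not the kernel type. */
struct attach_info {
	int fd;
};

/* Mimics copy_from_user(): returns the number of bytes NOT copied. */
static size_t fake_copy_from_user(void *dst, const void *src, size_t n)
{
	assert(dst != NULL);
	if (!src)
		return n;	/* simulate a faulting user pointer */
	memcpy(dst, src, n);
	return 0;
}

/* Hypothetical stand-in for kcm_attach_ioctl(). */
static int fake_attach(const struct attach_info *info)
{
	return info->fd >= 0 ? 0 : -EBADF;
}

/* Flow before the patch: the -EFAULT is stored, then overwritten. */
static int ioctl_old(const void *arg)
{
	/* Zeroed here only so the sketch has defined behaviour. */
	struct attach_info info = { .fd = 0 };
	int err = 0;

	if (fake_copy_from_user(&info, arg, sizeof(info)))
		err = -EFAULT;

	err = fake_attach(&info);	/* clobbers the earlier error */
	return err;
}

/* Flow after the patch: bail out as soon as the copy fails. */
static int ioctl_new(const void *arg)
{
	struct attach_info info = { .fd = 0 };

	if (fake_copy_from_user(&info, arg, sizeof(info)))
		return -EFAULT;

	return fake_attach(&info);
}
```

With a NULL argument standing in for a bad user pointer, ioctl_old() reports whatever fake_attach() returns for the uncopied structure, while ioctl_new() returns -EFAULT without touching it.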

Re: net/kcm: double free of kcm inode

2017-03-23 Thread Cong Wang

On Thu, Mar 23, 2017 at 5:09 AM, Dmitry Vyukov  wrote:
> Hello,
>
> I've got the following report while running syzkaller fuzzer. Note the
> preceding kmem_cache_alloc injected failure, it's most likely the root
> cause.
>
> FAULT_INJECTION: forcing a failure.
> name failslab, interval 1, probability 0, space 0, times 0
> CPU: 1 PID: 21839 Comm: syz-executor4 Not tainted 4.11.0-rc3+ #364
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x1b8/0x28d lib/dump_stack.c:52
>  fail_dump lib/fault-inject.c:45 [inline]
>  should_fail+0x78a/0x870 lib/fault-inject.c:154
>  should_failslab+0xec/0x120 mm/failslab.c:31
>  slab_pre_alloc_hook mm/slab.h:434 [inline]
>  slab_alloc mm/slab.c:3394 [inline]
>  kmem_cache_alloc+0x200/0x720 mm/slab.c:3570
>  sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1331
>  sk_alloc+0x8c/0x710 net/core/sock.c:1393
>  kcm_clone net/kcm/kcmsock.c:1655 [inline]
>  kcm_ioctl+0xb65/0x17e0 net/kcm/kcmsock.c:1713
>  sock_do_ioctl+0x65/0xb0 net/socket.c:895
>  sock_ioctl+0x2c2/0x440 net/socket.c:993
>  vfs_ioctl fs/ioctl.c:45 [inline]
>  do_vfs_ioctl+0x1af/0x16d0 fs/ioctl.c:685
>  SYSC_ioctl fs/ioctl.c:700 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
>  entry_SYSCALL_64_fastpath+0x1f/0xc2

I don't know if this patch could fix this bug or not:
https://patchwork.ozlabs.org/patch/742860/

This is why I don't add your Reported-by. But it could be related.

Thanks.

> RIP: 0033:0x445b79
> RSP: 002b:7f05eb28e858 EFLAGS: 0286 ORIG_RAX: 0010
> RAX: ffda RBX: 00708000 RCX: 00445b79
> RDX: 20001000 RSI: 89e2 RDI: 0005
> RBP: 0086 R08:  R09: 
> R10:  R11: 0286 R12: 004a7e31
> R13:  R14: 7f05eb28e618 R15: 7f05eb28e788
> ==
> BUG: KASAN: use-after-free in __fput+0x6b0/0x7f0 fs/file_table.c:211
> at addr 880037a25670
> Read of size 2 by task syz-executor4/21839
> CPU: 1 PID: 21839 Comm: syz-executor4 Not tainted 4.11.0-rc3+ #364
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x1b8/0x28d lib/dump_stack.c:52
>  kasan_object_err+0x1c/0x70 mm/kasan/report.c:166
>  print_address_description mm/kasan/report.c:210 [inline]
>  kasan_report_error mm/kasan/report.c:294 [inline]
>  kasan_report.part.2+0x1be/0x480 mm/kasan/report.c:316
>  kasan_report mm/kasan/report.c:335 [inline]
>  __asan_report_load2_noabort+0x29/0x30 mm/kasan/report.c:335
>  __fput+0x6b0/0x7f0 fs/file_table.c:211
>  fput+0x15/0x20 fs/file_table.c:245
>  task_work_run+0x1a4/0x270 kernel/task_work.c:116
>  tracehook_notify_resume include/linux/tracehook.h:191 [inline]
>  exit_to_usermode_loop+0x24d/0x2d0 arch/x86/entry/common.c:161
>  prepare_exit_to_usermode arch/x86/entry/common.c:191 [inline]
>  syscall_return_slowpath+0x3bd/0x460 arch/x86/entry/common.c:260
>  entry_SYSCALL_64_fastpath+0xc0/0xc2
> RIP: 0033:0x445b79
> RSP: 002b:7f05eb28e858 EFLAGS: 0286 ORIG_RAX: 0010
> RAX: fff4 RBX: 00708000 RCX: 00445b79
> RDX: 20001000 RSI: 89e2 RDI: 0005
> RBP: 2170 R08:  R09: 
> R10:  R11: 0286 R12: 006e0230
> R13: 89e2 R14: 20001000 R15: 0005
> Object at 880037a25640, in cache sock_inode_cache size: 944
> Allocated:
> PID = 21839
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:517
>  set_track mm/kasan/kasan.c:529 [inline]
>  kasan_kmalloc+0xbc/0xf0 mm/kasan/kasan.c:620
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:559
>  kmem_cache_alloc+0x110/0x720 mm/slab.c:3572
>  sock_alloc_inode+0x70/0x300 net/socket.c:250
>  alloc_inode+0x65/0x180 fs/inode.c:207
>  new_inode_pseudo+0x69/0x190 fs/inode.c:889
>  sock_alloc+0x41/0x270 net/socket.c:565
>  kcm_clone net/kcm/kcmsock.c:1634 [inline]
>  kcm_ioctl+0x990/0x17e0 net/kcm/kcmsock.c:1713
>  sock_do_ioctl+0x65/0xb0 net/socket.c:895
>  sock_ioctl+0x2c2/0x440 net/socket.c:993
>  vfs_ioctl fs/ioctl.c:45 [inline]
>  do_vfs_ioctl+0x1af/0x16d0 fs/ioctl.c:685
>  SYSC_ioctl fs/ioctl.c:700 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> Freed:
> PID = 21839
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:517
>  set_track mm/kasan/kasan.c:529 [inline]
>  kasan_slab_free+0x81/0xc0 mm/kasan/kasan.c:593
>  __cache_free mm/slab.c:3514 [inline]
>  kmem_cache_free+0x71/0x240 mm/slab.c:3774
>  sock_destroy_inode+0x56/0x70 net/socket.c:280
>  destroy_inode+0x15d/0x200 fs/inode.c:264
>  evict+0x57e/0x920 fs/inode.c:570
>  iput_final fs/inode.c:

Re: [v2,net-next,1/3] net: stmmac: enable multiple buffers

2017-03-23 Thread Thierry Reding

On Thu, Mar 23, 2017 at 05:27:08PM +, Joao Pinto wrote:
> Hi Thierry,
> 
> On 3/23/2017 at 5:17 PM, Thierry Reding wrote:
> > On Fri, Mar 17, 2017 at 04:11:05PM +, Joao Pinto wrote:
> >> This patch creates 2 new structures (stmmac_tx_queue and stmmac_rx_queue)
> >> in include/linux/stmmac.h, enabling that each RX and TX queue has its
> >> own buffers and data.
> >>
> >> Signed-off-by: Joao Pinto 
> >> ---
> >> changes v1->v2:
> >> - just to keep up version
> >>
> >>  drivers/net/ethernet/stmicro/stmmac/chain_mode.c  |   45 +-
> >>  drivers/net/ethernet/stmicro/stmmac/ring_mode.c   |   46 +-
> >>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |   49 +-
> >>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1306 
> >> ++---
> >>  4 files changed, 973 insertions(+), 473 deletions(-)
> > 
> > Hi Joao,
> > 
> > This seems to break support on Tegra186 again. I've gone through this
> > patch multiple times and I can't figure out what could be causing it.
> > Any ideas?
> > 
> > What I'm seeing is that the transmit queue 0 times out:
> > 
> > [  101.121774] Sending DHCP requests ...
> > [  111.841763] NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 0 
> > timed out
> 
> You are using a GMAC or GMAC4 aka QoS?

Yes. It's called EQOS (or EAVB) on Tegra186.

> > and then I also see this:
> > 
> > [  112.252024] dwc-eth-dwmac 249.ethernet: DMA-API: device driver 
> > tries to free DMA memory it has not allocated [device 
> > address=0x57ac6e9d] [size=0 bytes]
> 
> Humm... Something in stmmac_free_tx_buffers... I'll need to check.
> 
> > [  112.266606] [ cut here ]
> > [  112.271220] WARNING: CPU: 0 PID: 0 at 
> > /home/thierry.reding/src/kernel/linux-tegra.git/lib/dma-debug.c:1106 
> > check_unmap+0x7b0/0x930
> > [  112.282934] Modules linked in:
> > [  112.285985]
> > [  112.287474] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G S  W   
> > 4.11.0-rc3-next-20170323-00060-g2eab4557749b-dirty #400
> > [  112.298581] Hardware name: NVIDIA Tegra186 P2771- Development 
> > Board (DT)
> > [  112.305615] task: 08f87b00 task.stack: 08f7
> > [  112.311523] PC is at check_unmap+0x7b0/0x930
> > [  112.315785] LR is at check_unmap+0x7b0/0x930
> > [  112.320046] pc : [] lr : [] 
> > pstate: 6145
> > [  112.327426] sp : 8001f5e50c50
> > [  112.330733] x29: 8001f5e50c50 x28: 08f75180
> > [  112.336042] x27: 08f87b00 x26: 0020
> > [  112.341351] x25: 0140 x24: 08f81000
> > [  112.346660] x23: 8001ec4b0810 x22: 57ac6e9d
> > [  112.351969] x21: 57ac6e9d x20: 8001f5e50cb0
> > [  112.357277] x19: 8001ec4b0810 x18: 0010
> > [  112.362586] x17: 262ea01f x16: 0f48bf67
> > [  112.367895] x15: 0006 x14: 5d64396536636137
> > [  112.373203] x13: 3530303030303030 x12: 3078303d73736572
> > [  112.378511] x11: 6464612065636976 x10: 65645b2064657461
> > [  112.383819] x9 : 0852c238 x8 : 01fb
> > [  112.389126] x7 :  x6 : 0810ad58
> > [  112.394434] x5 :  x4 : 
> > [  112.399743] x3 :  x2 : 08f99258
> > [  112.405050] x1 : 08f87b00 x0 : 0097
> > [  112.410358]
> > [  112.411846] ---[ end trace 48028f96a0e990fb ]---
> > [  112.416453] Call trace:
> > [  112.418895] Exception stack(0x8001f5e50a80 to 0x8001f5e50bb0)
> > [  112.425324] 0a80: 8001ec4b0810 0001 8001f5e50c50 
> > 083d75f0
> > [  112.433139] 0aa0: 01c0   
> > 08d1c0c0
> > [  112.440954] 0ac0: 8001f5e50c50 8001f5e50c50 8001f5e50c10 
> > ffc8
> > [  112.448769] 0ae0: 8001f5e50b10 0810c3a8 8001f5e50c50 
> > 8001f5e50c50
> > [  112.456585] 0b00: 8001f5e50c10 ffc8 8001f5e50bc0 
> > 08178388
> > [  112.464399] 0b20: 0097 08f87b00 08f99258 
> > 
> > [  112.472215] 0b40:   0810ad58 
> > 
> > [  112.480030] 0b60: 01fb 085

Re: [PATCH net-next v4] net: Add sysctl to toggle early demux for tcp and udp

2017-03-23 Thread kbuild test robot

Hi Subash,

[auto build test ERROR on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Subash-Abhinov-Kasiviswanathan/net-Add-sysctl-to-toggle-early-demux-for-tcp-and-udp/20170323-205131
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm

All errors (new ones prefixed by >>):

   net/built-in.o: In function `proc_tcp_early_demux':
>> ncsi-manage.c:(.text+0xdffd4): undefined reference to `tcp_v6_early_demux_configure'
   net/built-in.o: In function `proc_udp_early_demux':
>> ncsi-manage.c:(.text+0xe0040): undefined reference to `udp_v6_early_demux_configure'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


Re: [PATCH V8 1/3] irq: Add flags to request_percpu_irq function

2017-03-23 Thread Vineet Gupta

On 03/23/2017 10:42 AM, Daniel Lezcano wrote:
> In the next changes, we track the interrupts but we discard the timers as
> that does not make sense. The next interrupt on a timer is predictable.
>
> But, the API request_percpu_irq does not allow to pass a flag, hence
> specifying if the interrupt type is a timer.
>
> Solve this by passing a 'flags' parameter to the function and change all the
> callers to pass IRQF_TIMER when the interrupt is a timer interrupt, zero
> otherwise.
>
> For now, in order to prevent a misusage of this parameter, only the IRQF_TIMER
> flag is a valid parameter to be passed to the request_percpu_irq function.
>
> Signed-off-by: Daniel Lezcano 

Acked-by: Vineet Gupta  # for arch/arc, arc_timer bits

Re: [PATCH net-next] liquidio: allocate RX buffers in OOM conditions in PF and VF

2017-03-23 Thread Burla, Satananda

The 03/22/2017 19:37, David Miller wrote:
> From: Felix Manlunas 
> Date: Wed, 22 Mar 2017 11:31:13 -0700
> 
> > From: Satanand Burla 
> >
> > Add workqueue that is periodically run to try to allocate RX buffers in OOM
> > conditions in PF and VF.
> >
> > Signed-off-by: Satanand Burla 
> > Signed-off-by: Felix Manlunas 
> 
> Applied, but I'm really not so sure you want to poll these queue states
> 4 times a second all the time.
> 
> Why don't you trigger the workqueue when you actually get an allocation
> failure?
That is certainly a better option. We will incorporate that in the
coming series.
-- 
Thanks
Satanand
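
As a rough illustration of the suggestion quoted above (start the refill work only when an allocation actually fails, instead of polling four times a second), here is a user-space sketch. Nothing in it is liquidio code: struct rx_ring, rx_alloc_buffer(), rx_refill(), rx_refill_worker() and the alloc_should_fail switch are all hypothetical names made up for this example, and the worker is called directly rather than through a kernel workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical RX ring state for this sketch only. */
struct rx_ring {
	int  missing;		/* buffers still to be allocated */
	bool refill_pending;	/* set only after an allocation failure */
	int  worker_runs;	/* how often the worker did real work */
};

/* Test knob that simulates an out-of-memory condition. */
static bool alloc_should_fail;

static void *rx_alloc_buffer(void)
{
	return alloc_should_fail ? NULL : malloc(64);
}

/* Normal refill path; flags the ring when an allocation fails. */
static void rx_refill(struct rx_ring *ring)
{
	assert(ring->missing >= 0);
	while (ring->missing > 0) {
		void *buf = rx_alloc_buffer();

		if (!buf) {
			/* A real driver would presumably queue deferred
			 * work here rather than just set a flag. */
			ring->refill_pending = true;
			return;
		}
		free(buf);	/* a real driver would post it to hardware */
		ring->missing--;
	}
	ring->refill_pending = false;
}

/* Deferred worker: does nothing unless a failure was recorded. */
static void rx_refill_worker(struct rx_ring *ring)
{
	if (!ring->refill_pending)
		return;
	ring->worker_runs++;
	rx_refill(ring);
}
```

In this arrangement the worker costs nothing while allocations succeed, and it retries only after rx_refill() has recorded a failure.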

Re: Extending socket timestamping API for NTP

2017-03-23 Thread Denny Page

[Resend as plain text for netdev]

> On Mar 23, 2017, at 09:21, Miroslav Lichvar  wrote:
> 
> After becoming a bit more familiar with the code I don't think this is
> a good idea anymore :). I suspect there would be a noticeable
> performance impact if each timestamped packet could trigger reading of
> the current link speed. If the value had to be cached it would make
> more sense to do it in the application.

I am very surprised at this. The application caching approach requires the
application to retrieve the value via a system call. The system call overhead is
huge in comparison to everything else. More importantly, the application cached 
value may be wrong. If the application takes a sample every 5 seconds, there 
are 5 seconds of timestamps that can be wildly wrong.

At the driver level, if the speed check is done on packet receive, retrieving 
the link speed is a single register read which is a small overhead compared 
with processing the timestamp. The alternative approach of caching still makes 
more sense in the driver rather than the application. The driver receives an 
interrupt when negotiation happens, and it's trivial to cache the value at that
point. And a cached value by the driver will always be correct. Implementing it 
in the driver also allows for hardware to provide the functionality where 
available. Yes, there is only one chip that provides this currently, but if 
there is sufficient demand others will appear. There is no way to take 
advantage of this functionality unless this is handled by the driver.

I think it makes a lot of sense to leave this to the driver developer.

Re: [PATCH 1/3] net: hns: avoid gcc-7.0.1 warning for uninitialized data

2017-03-23 Thread David Miller


Arnd, I only see 2 patches out of 3 and no series header posting.

Re: [PATCH V8 1/3] irq: Add flags to request_percpu_irq function

2017-03-23 Thread Mark Rutland

Hi Daniel,

On Thu, Mar 23, 2017 at 06:42:01PM +0100, Daniel Lezcano wrote:
> In the next changes, we track the interrupts but we discard the timers as
> that does not make sense. The next interrupt on a timer is predictable.

Sorry, but I could not parse this. 

[...]

> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 9612b84..0f5ab4a 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -661,7 +661,7 @@ static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)
>  
>   irq = platform_get_irq(pmu_device, 0);
>   if (irq > 0 && irq_is_percpu(irq)) {
> - err = request_percpu_irq(irq, handler, "arm-pmu",
> + err = request_percpu_irq(irq, 0, handler, "arm-pmu",
>&hw_events->percpu_pmu);
>   if (err) {
>   pr_err("unable to request IRQ%d for ARM PMU counters\n",

Please Cc myself and Will Deacon when modifying the arm_pmu driver, as
per MAINTAINERS. I only spotted this patch by chance.

This conflicts with arm_pmu changes I have queued for v4.12 [1].

So, can we leave the prototype of request_percpu_irq() as-is?

Why not add a new request_percpu_irq_flags() function, and leave
request_percpu_irq() as a wrapper for that? e.g.

static inline int
request_percpu_irq(unsigned int irq, irq_handler_t handler,
		   const char *devname, void __percpu *percpu_dev_id)
{
	return request_percpu_irq_flags(irq, handler, devname,
					percpu_dev_id, 0);
}

... that would avoid having to touch any non-timer driver for now.

[...]

> -request_percpu_irq(unsigned int irq, irq_handler_t handler,
> -const char *devname, void __percpu *percpu_dev_id);
> +request_percpu_irq(unsigned int irq, unsigned long flags,
> +irq_handler_t handler,  const char *devname,
> +void __percpu *percpu_dev_id);
>  

Looking at request_irq, the prototype is:

int __must_check
request_irq(unsigned int irq, irq_handler_t handler,
	    unsigned long flags, const char *name,
	    void *dev);

... surely it would be better to share the same argument order? i.e.

int __must_check
request_percpu_irq(unsigned int irq, irq_handler_t handler,
   unsigned long flags, const char *devname,
   void __percpu *percpu_dev_id);

Thanks,
Mark.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm/perf/refactoring
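
The two suggestions in this reply (keep the existing prototype as a thin wrapper, and use the request_irq() argument order for the flags variant) can be combined in a user-space mock. This is only a sketch: every mock_ name, MOCK_IRQF_TIMER and its value, last_flags and dummy_handler() are placeholders invented here, and the rule that only the timer flag is accepted is taken from the patch description quoted in the thread, not from any merged kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder flag value for this mock; not the kernel's IRQF_TIMER. */
#define MOCK_IRQF_TIMER 0x1UL

typedef int (*mock_irq_handler_t)(unsigned int irq, void *dev_id);

/* Records the flags of the last successful request, for inspection. */
static unsigned long last_flags;

/* Flags variant, with arguments in the same order as request_irq(). */
static int mock_request_percpu_irq_flags(unsigned int irq,
					 mock_irq_handler_t handler,
					 unsigned long flags,
					 const char *devname,
					 void *percpu_dev_id)
{
	(void)irq;
	(void)percpu_dev_id;
	assert(devname != NULL);

	if (!handler)
		return -EINVAL;
	/* Reject anything other than the timer flag. */
	if (flags & ~MOCK_IRQF_TIMER)
		return -EINVAL;

	last_flags = flags;
	return 0;
}

/* Existing prototype kept as a wrapper, so callers need no changes. */
static inline int mock_request_percpu_irq(unsigned int irq,
					  mock_irq_handler_t handler,
					  const char *devname,
					  void *percpu_dev_id)
{
	return mock_request_percpu_irq_flags(irq, handler, 0, devname,
					     percpu_dev_id);
}

/* Trivial handler used when exercising the mock. */
static int dummy_handler(unsigned int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return 0;
}
```

Non-timer callers keep using the four-argument form and implicitly pass zero, while timer drivers would call the flags variant with the timer flag.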
