date:20150707

Re: [PATCH 3/3] ipmr_free_table() should be called under taken rtnl_lock

2015-07-07 Thread Cong Wang

On Tue, Jul 7, 2015 at 10:25 AM, Vasily Averin v...@virtuozzo.com wrote:
 On 07.07.2015 20:13, Cong Wang wrote:
 On Tue, Jul 7, 2015 at 8:53 AM, Vasily Averin v...@virtuozzo.com wrote:
 ipmr_free_table() calls unregister_netdevice_many() inside
 and changes net_todo_list protected by rtnl_lock

 Did you see any real bug?

 No, it was the result of manual code review.

 ipmr_free_table() is called in failure path, in this case there is no
 device registered yet, so unregister should be just a nop?

 However, maybe it's better to mark this place for the future anyway?

Then add a comment there. ;)
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 02/22] fjes: Hardware initialization routine

2015-07-07 Thread Yasuaki Ishimatsu


On Wed, 24 Jun 2015 11:55:34 +0900
Taku Izumi izumi.t...@jp.fujitsu.com wrote:

 This patch adds hardware initialization routine to be
 invoked at driver's .probe routine.
 
 Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
 ---
  drivers/net/fjes/Makefile    |   2 +-
  drivers/net/fjes/fjes.h      |   2 +-
  drivers/net/fjes/fjes_hw.c   | 297 +++
  drivers/net/fjes/fjes_hw.h   | 250
  drivers/net/fjes/fjes_regs.h | 102 +++
  5 files changed, 651 insertions(+), 2 deletions(-)
  create mode 100644 drivers/net/fjes/fjes_hw.c
  create mode 100644 drivers/net/fjes/fjes_hw.h
  create mode 100644 drivers/net/fjes/fjes_regs.h
 
 diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile
 index 98e59cb..a67f65d8 100644
 --- a/drivers/net/fjes/Makefile
 +++ b/drivers/net/fjes/Makefile
 @@ -27,5 +27,5 @@
  
  obj-$(CONFIG_FUJITSU_ES) += fjes.o
  
 -fjes-objs := fjes_main.o
 +fjes-objs := fjes_main.o fjes_hw.o
  
 diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
 index 4622da1..15ded96 100644
 --- a/drivers/net/fjes/fjes.h
 +++ b/drivers/net/fjes/fjes.h
 @@ -28,6 +28,6 @@
  
  extern char fjes_driver_name[];
  extern char fjes_driver_version[];
 -extern u32 fjes_support_mtu[];
 +extern const u32 fjes_support_mtu[];
  
  #endif /* FJES_H_ */
 diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
 new file mode 100644
 index 000..68ef4d3
 --- /dev/null
 +++ b/drivers/net/fjes/fjes_hw.c
 @@ -0,0 +1,297 @@
 +/*
 + *  FUJITSU Extended Socket Network Device driver
 + *  Copyright (c) 2015 FUJITSU LIMITED
 + *
 + * This program is free software; you can redistribute it and/or modify it
 + * under the terms and conditions of the GNU General Public License,
 + * version 2, as published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope it will be useful, but WITHOUT
 + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
 + * more details.
 + *
 + * You should have received a copy of the GNU General Public License along with
 + * this program; if not, see http://www.gnu.org/licenses/.
 + *
 + * The full GNU General Public License is included in this distribution in
 + * the file called COPYING.
 + *
 + */
 +
 +#include "fjes_hw.h"
 +#include "fjes.h"
 +
 +/* supported MTU list */
 +const u32 fjes_support_mtu[] = {
 + FJES_MTU_DEFINE(8 * 1024),
 + FJES_MTU_DEFINE(16 * 1024),
 + FJES_MTU_DEFINE(32 * 1024),
 + FJES_MTU_DEFINE(64 * 1024),
 + 0
 +};
 +
 +u32 fjes_hw_rd32(struct fjes_hw *hw, u32 reg)
 +{
 + u8 *base = hw->base;
 + u32 value = 0;
 +
 + value = readl(&base[reg]);
 +
 + return value;
 +}
 +
 +static u8 *fjes_hw_iomap(struct fjes_hw *hw)
 +{
 + u8 *base;
 +
 + if (!request_mem_region(hw->hw_res.start, hw->hw_res.size,
 + fjes_driver_name)) {
 + pr_err("request_mem_region failed\n");
 + return NULL;
 + }
 +
 + base = (u8 *)ioremap_nocache(hw->hw_res.start, hw->hw_res.size);
 +
 + return base;
 +}
 +
 +int fjes_hw_reset(struct fjes_hw *hw)
 +{
 + int timeout;
 + union REG_DCTL dctl;
 +
 + dctl.reg = 0;
 + dctl.bits.reset = 1;
 + wr32(XSCT_DCTL, dctl.reg);
 +
 + timeout = FJES_DEVICE_RESET_TIMEOUT * 1000;
 + dctl.reg = rd32(XSCT_DCTL);
 + while ((dctl.bits.reset == 1) && (timeout > 0)) {
 + msleep(1000);
 + dctl.reg = rd32(XSCT_DCTL);
 + timeout -= 1000;
 + }
 +
 + return timeout > 0 ? 0 : -EIO;
 +}
 +
 +static int fjes_hw_get_max_epid(struct fjes_hw *hw)
 +{
 + union REG_MAX_EP info;
 +
 + info.reg = rd32(XSCT_MAX_EP);
 +
 + return info.bits.maxep;
 +}
 +
 +static int fjes_hw_get_my_epid(struct fjes_hw *hw)
 +{
 + union REG_OWNER_EPID info;
 +
 + info.reg = rd32(XSCT_OWNER_EPID);
 +
 + return info.bits.epid;
 +}
 +
 +static int fjes_hw_alloc_shared_status_region(struct fjes_hw *hw)
 +{
 + size_t size;
 +
 + size = sizeof(struct fjes_device_shared_info) +
 + (sizeof(u8) * hw->max_epid);
 + hw->hw_info.share = kzalloc(size, GFP_KERNEL);
 + if (!hw->hw_info.share)
 + return -ENOMEM;
 +
 + hw->hw_info.share->epnum = hw->max_epid;
 +
 + return 0;
 +}
 +
 +static int fjes_hw_alloc_epbuf(struct epbuf_handler *epbh)
 +{
 + void *mem;
 +
 + mem = vzalloc(EP_BUFFER_SIZE);
 + if (!mem)
 + return -ENOMEM;
 +
 + epbh->buffer = mem;
 + epbh->size = EP_BUFFER_SIZE;
 +
 + epbh->info = (union ep_buffer_info *)mem;
 + epbh->ring = (u8 *)(mem + sizeof(union ep_buffer_info));
 +
 + return 0;
 +}
 +
 +void fjes_hw_setup_epbuf(struct epbuf_handler *epbh, u8 *mac_addr, u32 mtu)
 +{
 + union ep_buffer_info *info = epbh->info;
 + int i;
 + u16 vlan_id[EP_BUFFER_SUPPORT_VLAN_MAX];
 +
 + for (i = 0; i

[RFC PATCH net-next] sctp: fix src address selection if using secondary addresses

2015-07-07 Thread Marcelo Ricardo Leitner

Hi folks,

This is an attempt to better choose a src address for sctp packets as
peers with rp_filter could be dropping our packets in some situations.
With this patch, we try to respect and use a src address that belongs to
the interface we are putting the packet out.

I have the feeling that there is a better way to do this, but I
just couldn't see it.

This patch has been tested with and without gateways between the peers
and also just two peers connected via two subnets and results were
pretty good.

One could think that this limits the address combinations we can use, but
such combinations are probably bogus anyway. For example, if you have a
host with addresses A1 and B1 and another with A2 and B2, you cannot
expect the first host to use A1 to reach B2 through subnet B, because the
return path would be via the other link, which we consider broken when
this switch happens.

Thanks,
Marcelo

---8---

In short, sctp is likely to incorrectly choose src address if socket is
bound to secondary addresses. This patch fixes it by adding a new check
that tries to anticipate if the src address would be expected by the
next hop/peer on this interface by doing reverse routing.

Also took the chance to reduce the indentation level on this code.

Details:

Currently, sctp will do a routing attempt without specifying the src
address and compare the returned value (preferred source) with the
addresses that the socket is bound to. When using secondary addresses,
this will not match.

Then it will try specifying each of the addresses that the socket is
bound to and re-routing, checking if that address is valid as src for
that dst. Thing is, this check alone is weak:

# ip r l
192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147

# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
       valid_lft 2160sec preferred_lft 2160sec
    inet 192.168.122.148/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe15:186a/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
       valid_lft 2162sec preferred_lft 2162sec
    inet 192.168.100.148/24 scope global secondary eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb3:9146/64 scope link
       valid_lft forever preferred_lft forever
4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5054:ff:fe05:47ee/64 scope link
       valid_lft forever preferred_lft forever

# ip r g 192.168.100.193 from 192.168.122.148
192.168.100.193 from 192.168.122.148 dev eth1
cache

Even if you specify an interface:

# ip r g 192.168.100.193 from 192.168.122.148 oif eth1
192.168.100.193 from 192.168.122.148 dev eth1
cache

Although this would be valid, peers using rp_filter will drop such
packets as their src doesn't match the routes for that interface.

So we fix this by adding an extra check: we do the reverse routing and
check whether the interface used would be the same. If not, we skip that
address; if so, we use it.
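
The reverse-routing check can be illustrated with a small userspace
sketch. This is only a toy model, not the kernel code from the patch: the
names toy_route, toy_lookup, toy_src_usable, IP4 and example_routes are
made up for illustration, and the table simply mirrors the two routes
shown in the `ip r l` output above (interface index 3 for eth1, 2 for
eth0). A candidate source address is accepted only if routing back to it
would use the same interface as routing out to the destination.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-order IPv4 address from four octets (illustrative helper). */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Toy route entry: network prefix, prefix length, outgoing interface. */
struct toy_route {
	uint32_t prefix;
	int plen;
	int ifindex;
};

/* Mirrors the two connected routes from the example host. */
static const struct toy_route example_routes[] = {
	{ IP4(192, 168, 100, 0), 24, 3 },	/* eth1 */
	{ IP4(192, 168, 122, 0), 24, 2 },	/* eth0 */
};

/* Longest-prefix match; returns the interface index, or -1 if no route. */
static int toy_lookup(const struct toy_route *tbl, size_t n, uint32_t addr)
{
	int best_len = -1, best_if = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t mask = tbl[i].plen ?
				~(uint32_t)0 << (32 - tbl[i].plen) : 0;

		if ((addr & mask) == (tbl[i].prefix & mask) &&
		    tbl[i].plen > best_len) {
			best_len = tbl[i].plen;
			best_if = tbl[i].ifindex;
		}
	}
	return best_if;
}

/* Returns 1 if the route back to src uses the same interface as the
 * route out to dst, 0 otherwise.
 */
static int toy_src_usable(const struct toy_route *tbl, size_t n,
			  uint32_t src, uint32_t dst)
{
	int out_if = toy_lookup(tbl, n, dst);
	int rev_if = toy_lookup(tbl, n, src);

	return out_if >= 0 && out_if == rev_if;
}
```

With this table, 192.168.122.148 is rejected as a source for
192.168.100.193 (the reverse lookup points at eth0 while the packet
leaves via eth1), whereas 192.168.100.148 is accepted.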

Signed-off-by: Marcelo Ricardo Leitner marcelo.leit...@gmail.com
---
 net/sctp/protocol.c | 55 +++--
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -53,6 +53,7 @@
 #include <net/net_namespace.h>
 #include <net/protocol.h>
 #include <net/ip.h>
+#include <net/ip_fib.h>
 #include <net/ipv6.h>
 #include <net/route.h>
 #include <net/sctp/sctp.h>
@@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
 */
rcu_read_lock();
list_for_each_entry_rcu(laddr, &bp->address_list, list) {
+   struct flowi4 in;
+   struct fib_result res;
+
if (!laddr->valid)
continue;
-   if ((laddr->state == SCTP_ADDR_SRC) &&
-   (AF_INET == laddr->a.sa.sa_family)) {
-   fl4->fl4_sport =

Re: [PATCH 3/3] ipmr_free_table() should be called under taken rtnl_lock

2015-07-07 Thread Vasily Averin

On 07.07.2015 20:30, Cong Wang wrote:
 On Tue, Jul 7, 2015 at 10:25 AM, Vasily Averin v...@virtuozzo.com wrote:
 On 07.07.2015 20:13, Cong Wang wrote:
 On Tue, Jul 7, 2015 at 8:53 AM, Vasily Averin v...@virtuozzo.com wrote:
 ipmr_free_table() calls unregister_netdevice_many() inside
 and changes net_todo_list protected by rtnl_lock

 Did you see any real bug?

 No, it was the result of manual code review.

 ipmr_free_table() is called in failure path, in this case there is no
 device registered yet, so unregister should be just a nop?

 However, maybe it's better to mark this place for the future anyway?
 
 Then add a comment there. ;)

As you can see I'm not familiar with this code,
so I would like to ask you to do it. :)

Re: [PATCH v3 1/3] net: dsa: mv88e6xxx: add debugfs interface for VTU

2015-07-07 Thread Vivien Didelot

Hi Andrew,

On Jul 6, 2015, at 10:08 PM, Andrew Lunn and...@lunn.ch wrote:

 +static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds, u16 vid,
 +  struct mv88e6xxx_vtu_entry *entry)
 +{
 +int ret, i;
 +
 +ret = _mv88e6xxx_vtu_wait(ds);
 +if (ret < 0)
 +return ret;
 +
 +ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID,
 +   vid & GLOBAL_VTU_VID_MASK);
 +if (ret < 0)
 +return ret;
 +
 +ret = _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_VTU_GET_NEXT);
 +if (ret < 0)
 +return ret;
 +
 +ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_VID);
 +if (ret < 0)
 +return ret;
 +
 +entry->vid = ret & GLOBAL_VTU_VID_MASK;
 +entry->valid = !!(ret & GLOBAL_VTU_VID_VALID);
 +
 +if (entry->valid) {
 +/* Ports 0-3, offsets 0, 4, 8, 12 */
 +ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3);
 +if (ret < 0)
 +return ret;
 +
 +for (i = 0; i < 4; ++i)
 +entry->tags[i] = (ret >> (i * 4)) & 3;
 +
 +/* Ports 4-6, offsets 0, 4, 8 */
 +ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7);
 +if (ret < 0)
 +return ret;
 +
 +for (i = 4; i < 7; ++i)
 +entry->tags[i] = (ret >> ((i - 4) * 4)) & 3;
 
 Hi Vivien
 
 It looks like you still have up to 7 ports, rather than use
 ps-num_ports. I have a ten port switch i would like to use VLANs with
 :-)
 
 +
 +if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
 +mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds)) {
 +ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
 +  GLOBAL_VTU_FID);
 +if (ret < 0)
 +return ret;
 +
 +entry->fid = ret & GLOBAL_VTU_FID_MASK;
 +
 +ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
 +  GLOBAL_VTU_SID);
 +if (ret < 0)
 +return ret;
 +
 +entry->sid = ret & GLOBAL_VTU_SID_MASK;
 +} else {
 +entry->fid = 0;
 +entry->sid = 0;
 +}
 +}
 +
 +return 0;
 +}
 +
 +static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
 +struct mv88e6xxx_vtu_entry *entry)
 +{
 +u16 data = 0;
 +int ret, i;
 +
 +ret = _mv88e6xxx_vtu_wait(ds);
 +if (ret < 0)
 +return ret;
 +
 +if (entry->valid) {
 +/* Set Data Register, ports 0-3, offsets 0, 4, 8, 12 */
 +for (data = i = 0; i < 4; ++i)
 +data |= entry->tags[i] << (i * 4);
 +ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3,
 +   data);
 +if (ret < 0)
 +return ret;
 +
 +/* Set Data Register, ports 4-6, offsets 0, 4, 8 */
 +for (data = 0, i = 4; i < 7; ++i)
 +data |= entry->tags[i] << ((i - 4) * 4);
 +ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7,
 +   data);
 
 Same again here.
 
  Andrew

Indeed, I intentionally kept it as is, since the 88E6352 datasheet is not
really clear about this. I see that the register 0x09 (called
GLOBAL_VTU_DATA_8_11 in mv88e6xxx.h) only contains VID priority related bits in
15:12, in my case. As bits 11:0 are reserved, I suspect that the offsets of
Member TagP7, Member TagP8 and Member TagP9 are respectively 0, 4 and 8 for
you. Can you confirm? If so, I'll make that generic with these values.
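
For reference, one way a port-count-independent layout could be
expressed is sketched below. It assumes that each data register holds
four 4-bit slots, as the register names GLOBAL_VTU_DATA_0_3, _4_7 and
_8_11 suggest; under that assumption port 7 would land at offset 12 of
the second register, which differs from the 0/4/8 placement suspected
above, and nothing in this thread confirms either layout. The helper
names and macros (vtu_data_reg_index, vtu_data_shift, vtu_unpack_tag,
vtu_pack_tag, VTU_*) are hypothetical and not part of the driver.

```c
#include <stdint.h>

/* Assumption: four 4-bit slots per 16-bit VTU data register. */
#define VTU_PORTS_PER_REG	4
#define VTU_BITS_PER_PORT	4
/* The quoted code keeps only the low two bits of each slot. */
#define VTU_TAG_MASK		0x3

/* 0 -> DATA_0_3, 1 -> DATA_4_7, 2 -> DATA_8_11 (under the assumption). */
static int vtu_data_reg_index(int port)
{
	return port / VTU_PORTS_PER_REG;
}

/* Bit offset of the port's slot inside its register: 0, 4, 8 or 12. */
static int vtu_data_shift(int port)
{
	return (port % VTU_PORTS_PER_REG) * VTU_BITS_PER_PORT;
}

/* Extract the member tag of `port` from the value of its data register. */
static uint8_t vtu_unpack_tag(uint16_t regval, int port)
{
	return (regval >> vtu_data_shift(port)) & VTU_TAG_MASK;
}

/* Return `regval` with the member tag of `port` replaced by `tag`. */
static uint16_t vtu_pack_tag(uint16_t regval, int port, uint8_t tag)
{
	int shift = vtu_data_shift(port);

	regval &= (uint16_t)~(VTU_TAG_MASK << shift);
	regval |= (uint16_t)((tag & VTU_TAG_MASK) << shift);
	return regval;
}
```

A loop over the real port count could then pick the register with
vtu_data_reg_index(i) and the slot with vtu_data_shift(i) instead of
hard-coding the 0-3 and 4-6 ranges.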

Thanks,
-v

[PATCH 3/3] ipmr_free_table() should be called under taken rtnl_lock

2015-07-07 Thread Vasily Averin

ipmr_free_table() calls unregister_netdevice_many() inside
and changes net_todo_list protected by rtnl_lock

Signed-off-by: Vasily Averin v...@virtuozzo.com
---
 net/ipv6/ip6mr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 74ceb73..9108636 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -250,7 +250,9 @@ static int __net_init ip6mr_rules_init(struct net *net)
return 0;
 
 err2:
+   rtnl_lock();
ip6mr_free_table(mrt);
+   rtnl_unlock();
 err1:
fib_rules_unregister(ops);
return err;
-- 
1.9.1


Performance bottleneck with ndo_start_xmit

2015-07-07 Thread Jason A. Donenfeld

Hi folks,

I'm writing a kernel module that creates a virtual network device with
rtnl_link_register. At initialization time, it creates a UDP socket
with sock_create_kern. On ndo_start_xmit, it passes the data of the
skb to the UDP socket's sendmsg, after some minimal crypto and
processing. The device's MTU takes things into account properly. In
other words: it's a UDP-based tunnel device. And it works.

But I'm hitting a bottleneck in the send path (ndo_start_xmit) that I
can't seem to figure out. None of the aforementioned crypto or
processing contributes significantly. I boot up two virtual machines,
configure the tunnel on them, and run iperf to test bandwidth. Using
the tunnel device I get around 450 Mbps. Without using the tunnel
device, I get around 5 Gbps. These performance characteristics remain
the same for 1 CPU and for 4 CPUs and for 8 CPUs.

When it maxes out at ~5 Gbps without using the tunnel device, the CPU
is at around 80%. When it maxes out at ~450 Mbps using the tunnel
device, the CPU is at 100%. Running perf top indicates that most of the
kernel time is spent in e1000_xmit, or the xmit function of whichever
driver underlies the UDP socket. Very little time is spent
in any functions related to my module or even inside UDP's sendmsg
call tree.

I'm stumped. I've tried workqueues, tasklets, all sorts of deferral.
I've tried not using a UDP _socket_ and instead constructing an
Ethernet, IP, and UDP header myself, checksumming it, computing the
flowi4s, getting the MACs, and passing it to dev_queue_xmit. But in
all cases, the bandwidth stays the same: 450 Mbps at 100% CPU
utilization with the e1000_xmit (or vmxnet3_xmit if I'm using that
driver instead) function at the top of the list in perf top.

I can confirm that the receive path never reaches 100% CPU
utilization, and hence the bottleneck is in the send path, described
above.

Can anyone help? Or point me in the right direction of where to learn?
I have exhausted all of the documentation resources I've been able to
find, and my eyes hurt from reading tens of thousands of lines of
kernel code trying to figure this out. I'm at a loss.

Any pointers would be greatly appreciated.

Regards,
Jason

Re: [PATCH 3/3] ipmr_free_table() should be called under taken rtnl_lock

2015-07-07 Thread Cong Wang

On Tue, Jul 7, 2015 at 8:53 AM, Vasily Averin v...@virtuozzo.com wrote:
 ipmr_free_table() calls unregister_netdevice_many() inside
 and changes net_todo_list protected by rtnl_lock


Did you see any real bug?

ipmr_free_table() is called in failure path, in this case there is no
device registered yet, so unregister should be just a nop?

[PATCH 2/3] missing rtnl_unlock in igb_sriov_reinit()

2015-07-07 Thread Vasily Averin

Signed-off-by: Vasily Averin v...@virtuozzo.com
---
 drivers/net/ethernet/intel/igb/igb_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 2f70a9b..5881458 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7538,6 +7538,7 @@ static int igb_sriov_reinit(struct pci_dev *dev)
igb_init_queue_configuration(adapter);
 
if (igb_init_interrupt_scheme(adapter, true)) {
+   rtnl_unlock();
dev_err(&pdev->dev, "Unable to allocate memory for queues\n");
return -ENOMEM;
}
-- 
1.9.1


[PATCH 0/3] rtnl_lock fixes

2015-07-07 Thread Vasily Averin

While investigating possible causes of net_todo_list corruption
on a RHEL6-based OpenVZ kernel, I found a few places not fixed in mainline.

Vasily Averin (3):
  missing rtnl_unlock in i40evf_resume()
  missing rtnl_unlock in igb_sriov_reinit()
  ipmr_free_table() should be called under taken rtnl_lock

 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 1 +
 drivers/net/ethernet/intel/igb/igb_main.c       | 1 +
 net/ipv6/ip6mr.c                                | 2 ++
 3 files changed, 4 insertions(+)
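
The class of bug addressed by the first two patches can be shown with a
minimal userspace sketch. It is not driver code: the lock is modelled by
a plain counter, and all names (toy_lock, toy_unlock, toy_resume_leaky,
toy_resume_fixed) are invented for illustration. The leaky variant
returns from its error branch while the lock is still held; the fixed
variant releases it before every return, which is what adding
rtnl_unlock() to the error path achieves.

```c
/* Stand-in for the lock state: 0 means unlocked. */
static int toy_lock_depth;

static void toy_lock(void)
{
	toy_lock_depth++;
}

static void toy_unlock(void)
{
	toy_lock_depth--;
}

/* Error path before the fix: returns with the lock still held. */
static int toy_resume_leaky(int fail)
{
	toy_lock();
	if (fail)
		return -1;
	toy_unlock();
	return 0;
}

/* Error path after the fix: every return is preceded by an unlock. */
static int toy_resume_fixed(int fail)
{
	toy_lock();
	if (fail) {
		toy_unlock();
		return -1;
	}
	toy_unlock();
	return 0;
}
```

After a failing call, toy_lock_depth stays at 0 for the fixed variant and
is left at 1 by the leaky one, so the next caller that takes the lock
would block forever in a real system.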

-- 
1.9.1


[PATCH 1/3] missing rtnl_unlock in i40evf_resume()

2015-07-07 Thread Vasily Averin

Signed-off-by: Vasily Averin v...@virtuozzo.com
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 4ab4ebb..dd6a428 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2402,6 +2402,7 @@ static int i40evf_resume(struct pci_dev *pdev)
rtnl_lock();
err = i40evf_set_interrupt_capability(adapter);
if (err) {
+   rtnl_unlock();
dev_err(&pdev->dev, "Cannot enable MSI-X interrupts.\n");
return err;
}
-- 
1.9.1


Re: [PATCH v3 0/3] net: dsa: mv88e6xxx: add support for VLAN Table Unit

2015-07-07 Thread Vivien Didelot

Hi Andrew, Scott,

On Jul 6, 2015, at 11:46 PM, Scott Feldman sfel...@gmail.com wrote:

 On Mon, Jul 6, 2015 at 7:00 PM, Andrew Lunn and...@lunn.ch wrote:
 On Tue, Jul 07, 2015 at 01:38:04AM +0200, Andrew Lunn wrote:
 On Sun, Jul 05, 2015 at 10:14:50PM -0400, Vivien Didelot wrote:
  Hi all,
 
  This patchset brings full support for hardware VLANs in DSA, and the
  Marvell 88E6xxx compatible switch chips.

 Hi Vivien

 I just booted these patches on my board, and i'm getting WARNINGS:

 [   61.111302] WARNING: CPU: 0 PID: 2751 at net/switchdev/switchdev.c:265
 switchdev_port_obj_add+0xd4/0xdc()

 Hi Vivien

 I debugged this a bit.

 The problem comes from:

 static int dsa_slave_port_obj_add(struct net_device *dev,
   struct switchdev_obj *obj)
 {
 int err;

 /*
  * Skip the prepare phase, since currently the DSA drivers don't need to
  * allocate any memory for operations and they will not fail to HW
  * (unless something horrible goes wrong on the MDIO bus, in which case
  * the prepare phase wouldn't have been able to predict anyway).
  */
 if (obj->trans != SWITCHDEV_TRANS_COMMIT)
 return 0;

 switch (obj->id) {
 case SWITCHDEV_OBJ_PORT_VLAN:
 err = dsa_slave_port_vlans_add(dev, obj);
 break;
 default:
 err = -EOPNOTSUPP;
 break;
 }

 return err;
 }

 It is being called with obj->id of 2, which is
 SWITCHDEV_OBJ_IPV4_FIB. This function is called twice. The first time
 it is with SWITCHDEV_TRANS_PREPARE and we are allowed to return an
 error. The second time, with SWITCHDEV_TRANS_COMMIT, errors are not
 allowed.

 EOPNOTSUPP is considered an error, so since we don't support
 SWITCHDEV_OBJ_IPV4_FIB we error out the COMMIT phase.

 Not sure which is cleaner. Test to see if we support the object during
 the prepare, or allow the commit to accept EOPNOTSUPP as not being an
 error?
 
 I think we should return EOPNOTSUPP on PREPARE, so move the trans !=
 COMMIT test inside the case for PORT_VLAN.  That would future-proof
 the func when new objects are added to switchdev (and not supported by
 dsa_slave).
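
The suggested restructuring can be sketched in a small userspace model.
This is not the fixup linked below and not the DSA code itself: the
toy_* names, the enum values and the counter are hypothetical stand-ins
for the switchdev types quoted above. The point it illustrates is that
the "skip unless COMMIT" test lives inside the supported case, so an
unsupported object is rejected already in the prepare phase.

```c
#include <errno.h>

/* Hypothetical stand-ins for the switchdev transaction phases and ids. */
enum toy_trans { TOY_TRANS_PREPARE, TOY_TRANS_COMMIT };
enum toy_obj_id { TOY_OBJ_PORT_VLAN = 1, TOY_OBJ_IPV4_FIB = 2 };

struct toy_obj {
	enum toy_obj_id id;
	enum toy_trans trans;
};

/* Counts how many times the VLAN add was really carried out. */
static int toy_vlans_added;

static int toy_port_vlans_add(const struct toy_obj *obj)
{
	(void)obj;
	toy_vlans_added++;
	return 0;
}

static int toy_port_obj_add(const struct toy_obj *obj)
{
	switch (obj->id) {
	case TOY_OBJ_PORT_VLAN:
		/* Nothing to reserve while preparing; act on commit only. */
		if (obj->trans != TOY_TRANS_COMMIT)
			return 0;
		return toy_port_vlans_add(obj);
	default:
		/* Unsupported objects fail in both phases, so the caller
		 * sees the error during prepare and never commits.
		 */
		return -EOPNOTSUPP;
	}
}
```

In this model a TOY_OBJ_IPV4_FIB object returns -EOPNOTSUPP during
prepare, while a TOY_OBJ_PORT_VLAN object returns 0 during prepare and
only touches the (simulated) hardware on commit.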

Does this fixup http://ix.io/jxq look good to you?

Thanks,
-v

Re: [PULL] virtio/vhost: cross endian support

2015-07-07 Thread Thomas Huth

On Thu, 2 Jul 2015 11:32:52 +0200
Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, Jul 02, 2015 at 11:12:56AM +0200, Greg Kurz wrote:
  On Thu, 2 Jul 2015 08:01:28 +0200
  Michael S. Tsirkin m...@redhat.com wrote:
...
   Yea, well - support for legacy BE guests on the new LE hosts is
   exactly the motivation for this.
   
   I dislike it too, but there are two redeeming properties that
   made me merge this:
   
   1.  It's a trivial amount of code: since we wrap host/guest accesses
   anyway, almost all of it is well hidden from drivers.
   
   2.  Sane platforms would never set flags like VHOST_CROSS_ENDIAN_LEGACY -
   and when it's clear, there's zero overhead (at some point it was
   tested by compiling with and without the patches, got the same
   stripped binary).
   
   Maybe we could create a Kconfig symbol to enforce point (2): prevent
   people from enabling it e.g. on x86. I will look into this - but it can
   be done by a patch on top, so I think this can be merged as is.
   
  
  This cross-endian *oddity* is targeting PowerPC book3s_64 processors... I
  am not aware of any other users. Maybe create a symbol that would
  be only selected by PPC_BOOK3S_64 ?
 
 I think some ARM systems are trying to support cross-endian
 configurations as well.
 
 Besides that, yes, this is more or less what I had in mind.

Would something simple like this already do the job:

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -35,6 +35,7 @@ config VHOST
 
 config VHOST_CROSS_ENDIAN_LEGACY
bool "Cross-endian support for vhost"
+   depends on KVM_BOOK3S_64 || KVM_ARM_HOST
default n
---help---
  This option allows vhost to support guests with a different byte

?

If that looks acceptable, I can submit a proper patch if you like.

 Thomas

[PATCH net-next v2] ipv4: add support for linkdown sysctl to netconf

2015-07-07 Thread Andy Gospodarek

This kernel patch exports the value of the new
ignore_routes_with_linkdown via netconf.

v2: changes to notify userspace via netlink when sysctl values change
and proposed for 'net' since this could be considered a bugfix
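
As a side note on the size accounting in the diff below: each exported
value is a 4-byte netlink attribute, and a netlink attribute carries a
4-byte header and is padded to a 4-byte boundary. The following
userspace sketch reproduces that arithmetic only; the TOY_* macros and
toy_nla_total_size() are illustrative re-implementations, not the kernel
helpers themselves.

```c
/* Netlink attributes are aligned to 4 bytes. */
#define TOY_NLA_ALIGNTO		4
#define TOY_NLA_ALIGN(len) \
	(((len) + TOY_NLA_ALIGNTO - 1) & ~(TOY_NLA_ALIGNTO - 1))
/* Attribute header: 16-bit length plus 16-bit type. */
#define TOY_NLA_HDRLEN		TOY_NLA_ALIGN(4)

/* Bytes needed for one attribute with `payload` bytes of data,
 * including the header and trailing padding.
 */
static int toy_nla_total_size(int payload)
{
	return TOY_NLA_ALIGN(TOY_NLA_HDRLEN + payload);
}
```

So every additional s32 attribute, such as the new linkdown one, grows
the message size estimate by 8 bytes.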

Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
Suggested-by: Nicolas Dichtel nicolas.dich...@6wind.com
---
I realize two of these changes result in lines over 80 chars, but this is to
keep the coding-style used by the surrounding code.

There are multiple ways to resolve this, but one would be to shorten the
defines used for this feature as IGNORE_ROUTES_WITH_LINKDOWN is a
mouthful.  I would propose dropping _WITH_ from all the defines and
have only IGNORE_ROUTES_LINKDOWN everywhere.  Doing this now in 'net'
would be ideal before a release happens and it cannot be changed.

 include/uapi/linux/netconf.h |  1 +
 net/ipv4/devinet.c           | 13 +
 2 files changed, 14 insertions(+)

diff --git a/include/uapi/linux/netconf.h b/include/uapi/linux/netconf.h
index 669a1f0..23cbd34 100644
--- a/include/uapi/linux/netconf.h
+++ b/include/uapi/linux/netconf.h
@@ -15,6 +15,7 @@ enum {
NETCONFA_RP_FILTER,
NETCONFA_MC_FORWARDING,
NETCONFA_PROXY_NEIGH,
+   NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
__NETCONFA_MAX
 };
 #define NETCONFA_MAX   (__NETCONFA_MAX - 1)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 7498716..e813196 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1740,6 +1740,8 @@ static int inet_netconf_msgsize_devconf(int type)
size += nla_total_size(4);
if (type == -1 || type == NETCONFA_PROXY_NEIGH)
size += nla_total_size(4);
+   if (type == -1 || type == NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN)
+   size += nla_total_size(4);
 
return size;
 }
@@ -1780,6 +1782,10 @@ static int inet_netconf_fill_devconf(struct sk_buff *skb, int ifindex,
nla_put_s32(skb, NETCONFA_PROXY_NEIGH,
IPV4_DEVCONF(*devconf, PROXY_ARP)) < 0)
goto nla_put_failure;
+   if ((type == -1 || type == NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN) &&
+   nla_put_s32(skb, NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
+   IPV4_DEVCONF(*devconf, IGNORE_ROUTES_WITH_LINKDOWN)) < 0)
+   goto nla_put_failure;
 
nlmsg_end(skb, nlh);
return 0;
@@ -1819,6 +1825,7 @@ static const struct nla_policy devconf_ipv4_policy[NETCONFA_MAX+1] = {
[NETCONFA_FORWARDING]   = { .len = sizeof(int) },
[NETCONFA_RP_FILTER]= { .len = sizeof(int) },
[NETCONFA_PROXY_NEIGH]  = { .len = sizeof(int) },
+   [NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN]  = { .len = sizeof(int) },
 };
 
 static int inet_netconf_get_devconf(struct sk_buff *in_skb,
@@ -2048,6 +2055,12 @@ static int devinet_conf_proc(struct ctl_table *ctl, int write,
inet_netconf_notify_devconf(net, NETCONFA_PROXY_NEIGH,
ifindex, cnf);
}
+   if (i == IPV4_DEVCONF_IGNORE_ROUTES_WITH_LINKDOWN - 1 &&
+   new_value != old_value) {
+   ifindex = devinet_conf_ifindex(net, cnf);
+   inet_netconf_notify_devconf(net, NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
+   ifindex, cnf);
+   }
}
 
return ret;
-- 
1.9.3


Re: [PATCH v3 0/3] net: dsa: mv88e6xxx: add support for VLAN Table Unit

2015-07-07 Thread Andrew Lunn

On Tue, Jul 07, 2015 at 12:17:57PM -0400, Vivien Didelot wrote:
 Hi Andrew, Scott,
 Does this fixup http://ix.io/jxq look good to you?

Yes, that looks good.

 Andrew

Re: Broadcom BCM54610 Linux support

2015-07-07 Thread Florian Fainelli

(adding Michael)

On 07/07/15 03:58, Markus Pargmann wrote:
 Hi,
 
 I found the phy driver which supports broadcom BCM5461. But I am not
 sure if this driver does support BCM54610 in fiber mode as well? Or if
 there are any open datasheets which could be used to write a mainline
 driver for it. I would appreciate any information about this.

There are no publicly available datasheets as far as I can tell, and the
current driver does not support anything but copper modes.

If you have reference code from somewhere else (e.g. a bootloader or a
Broadcom SDK), I would be inclined to port it over to the Linux PHY driver.
-- 
Florian

Re: [PATCH 3/3] ipmr_free_table() should be called under taken rtnl_lock

2015-07-07 Thread Vasily Averin

On 07.07.2015 20:13, Cong Wang wrote:
 On Tue, Jul 7, 2015 at 8:53 AM, Vasily Averin v...@virtuozzo.com wrote:
 ipmr_free_table() calls unregister_netdevice_many() inside
 and changes net_todo_list protected by rtnl_lock
 
 Did you see any real bug?

No, it was the result of manual code review.

 ipmr_free_table() is called in failure path, in this case there is no
 device registered yet, so unregister should be just a nop?

However, maybe it's better to mark this place for the future anyway?

Re: [PATCH iproute2] include: add copy of tipc.h

2015-07-07 Thread Michal Kubecek

On Mon, Jul 06, 2015 at 02:46:49PM -0700, Stephen Hemminger wrote:
 On Mon, 29 Jun 2015 10:53:15 +0200 (CEST)
 Michal Kubecek mkube...@suse.cz wrote:
 
  Copy of kernel include/uapi/linux/tipc.h is needed to build on systems
  with pre-3.16 kernel headers.
  
  Signed-off-by: Michal Kubecek mkube...@suse.cz
 
 Ok, I applied (and fixed) this.
 The headers you want are not directly from include/uapi/linux, instead
 you need to run:
  $ make headers_install
 
 Then copy the result out of usr/include/linux/ when getting a sanitized
 header.

Thank you for the explanation. I was looking for some script automating
the header sanitization but I didn't realize I should have looked into
kernel git, not iproute2. I'll remember for the future.

 Michal Kubecek


Re: [PATCH net-next] net: add support for linkdown sysctl to netconf

2015-07-07 Thread Nicolas Dichtel


Le 06/07/2015 20:21, Andy Gospodarek a écrit :

This kernel patch exports the value of the new
ignore_routes_with_linkdown via netconf.

Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
Suggested-by: Nicolas Dichtel nicolas.dich...@6wind.com
---

You also need to patch devinet_conf_proc() so that a netlink message is
sent when the user updates the sysctl entry.


Regards,
Nicolas

Re: [PATCH v2] add stealth mode

2015-07-07 Thread Clemens Ladisch

valdis.kletni...@vt.edu wrote:
 On Thu, 02 Jul 2015 10:56:01 +0200, Matteo Croce said:
 Add option to disable any reply not related to a listening socket

 2) You *do* realize that this isn't anywhere near sufficient in order
 to actually make your machine invisible, right?  (Hint: What *other*
 packets can be sent to a machine to provoke a response?)

Even worse: if you want to pretend that the entire machine is not there,
you must make the router in front of you reply with an ICMP destination
unreachable message.


Regards,
Clemens

[RFC net-next] xfrm: refactory to avoid state tasklet scheduling errors

2015-07-07 Thread Giuseppe Cantavenera

The SA state is managed by a tasklet whose scheduling relies on the wall clock.
Previous changes have already tried to address bugs that appear
when the system time is changed, but some error conditions still exist
because the logic is still coupled to the wall time.

If the time is changed between the creation of the SA and the first start
of the tasklet timer, the SA scheduling will be broken:
either the SA will expire and never be recreated, or it will expire at
an unexpected time.  The reason is that x->curlft.add_time will not be valid
when the 'next' variable is computed for the very first time
in xfrm_timer_handler().

Fix this behaviour by no longer relying on the system time.
Stick to relative time intervals and fully decouple the logic
from the wall time.

Based on another patch written and published by
Fan Du (fan...@intel.com) in 2013 but never merged:
part of the code is preserved, some is rewritten and improved.
The logic accounting for the use_time expiration is changed:
here we allow both add_time and use_time expirations to be set.

Cc: Steffen Klassert steffen.klass...@secunet.com
Cc: David S. Miller da...@davemloft.net
Cc: Fan Du fan...@intel.com
Cc: Alexander Sverdlin alexander.sverd...@nokia.com
Cc: Matija Glavinic Pecotic matija.glavinic-pecotic@nokia.com
Signed-off-by: Giuseppe Cantavenera giuseppe.cantavenera@nokia.com
Signed-off-by: Nicholas Faustini nicholas.faustini@nokia.com
---

Hello,

we have hit the same bug Fan Du ran into a while ago.
Two solutions were proposed in the past:
either forcibly mark all of the keys as expired every time the clock is set,
or replace the existing timers with relative ones.

The former would introduce unexpected behaviour
(the keys would keep expiring when they shouldn't) and does not address the
real problem: THE COUPLING between the SA scheduling and the wall clock.
It actually introduces even more of that.

The latter is robust, extremely lightweight and maintainable, and preserves the
expected behaviour; that's why we preferred it.

Any feedback or any other idea is greatly appreciated.

Thanks,
Regards,
Giuseppe

 include/net/xfrm.h|  10 ++-
 net/xfrm/xfrm_state.c | 181 --
 2 files changed, 123 insertions(+), 68 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 721e9c3..a1335cf 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -212,8 +212,8 @@ struct xfrm_state {
 	struct xfrm_lifetime_cur curlft;
 	struct tasklet_hrtimer	mtimer;
 
-	/* used to fix curlft->add_time when changing date */
-	long		saved_tmo;
+	/* seconds between hard and software expiration */
+	long		tmo;
 
 	/* Last used time */
 	unsigned long		lastused;
@@ -240,7 +240,11 @@ static inline struct net *xs_net(struct xfrm_state *x)
 
 /* xflags - make enum if more show up */
 #define XFRM_TIME_DEFER	1
-#define XFRM_SOFT_EXPIRE 2
+#define XFRM_SA_ADD_MODE 2
+#define XFRM_SA_USE_MODE 4
+#define XFRM_SA_ACQ_MODE 8
+#define XFRM_SA_SOFT_STAGE 16
+#define XFRM_SA_HARD_STAGE 32
 
 enum {
 	XFRM_STATE_VOID,
XFRM_STATE_VOID,
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 0ab5413..2c6a5a5 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -387,78 +387,38 @@ static enum hrtimer_restart xfrm_timer_handler(struct hrtimer *me)
 {
 	struct tasklet_hrtimer *thr = container_of(me, struct tasklet_hrtimer, timer);
 	struct xfrm_state *x = container_of(thr, struct xfrm_state, mtimer);
-	unsigned long now = get_seconds();
 	long next = LONG_MAX;
-	int warn = 0;
 	int err = 0;
+	int exp_reason_unknown = 0;
 
 	spin_lock(&x->lock);
 	if (x->km.state == XFRM_STATE_DEAD)
 		goto out;
 	if (x->km.state == XFRM_STATE_EXPIRED)
 		goto expired;
-	if (x->lft.hard_add_expires_seconds) {
-		long tmo = x->lft.hard_add_expires_seconds +
-			x->curlft.add_time - now;
-		if (tmo <= 0) {
-			if (x->xflags & XFRM_SOFT_EXPIRE) {
-				/* enter hard expire without soft expire first?!
-				 * setting a new date could trigger this.
-				 * workarbound: fix x->curflt.add_time by below:
-				 */
-				x->curlft.add_time = now - x->saved_tmo - 1;
-				tmo = x->lft.hard_add_expires_seconds - x->saved_tmo;
-			} else
-				goto expired;
-		}
-		if (tmo < next)
-			next = tmo;
-	}
-	if (x->lft.hard_use_expires_seconds) {
-		long tmo = x->lft.hard_use_expires_seconds +
-			(x->curlft.use_time ? : now) - now;
-		if (tmo <= 0)
-			goto expired;
-		if (tmo < next)
-
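
For illustration, the difference between wall-clock based and interval based expiry can be modelled in userspace. This is a minimal sketch under stated assumptions, not the code from this patch: struct sketch_sa and both helper functions are hypothetical, and the monotonic timestamp stands in for whatever relative timer the kernel side uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SA lifetime bookkeeping, all values in seconds. */
struct sketch_sa {
	int64_t add_time_wall;	/* wall-clock timestamp taken at creation */
	int64_t add_time_mono;	/* monotonic timestamp taken at creation */
	int64_t hard_lifetime;	/* configured lifetime */
};

/* Wall-clock based: the result depends on the current date, so a clock
 * step moves the expiry. */
int64_t remaining_wall(const struct sketch_sa *sa, int64_t now_wall)
{
	assert(sa != NULL);
	return sa->hard_lifetime + sa->add_time_wall - now_wall;
}

/* Interval based: only elapsed monotonic time matters, so setting the
 * date has no effect on the result. */
int64_t remaining_relative(const struct sketch_sa *sa, int64_t now_mono)
{
	assert(sa != NULL);
	return sa->hard_lifetime - (now_mono - sa->add_time_mono);
}
```

With a one-hour lifetime, stepping the wall clock forward by a day ten seconds after creation makes remaining_wall() negative (immediate expiry), while remaining_relative() still reports 3590 seconds.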

Re: [PATCH v2] add stealth mode

2015-07-07 Thread Hannes Frederic Sowa



On Mon, Jul 6, 2015, at 21:44, Matteo Croce wrote:
 2015-07-06 12:49 GMT+02:00  valdis.kletni...@vt.edu:
  On Thu, 02 Jul 2015 10:56:01 +0200, Matteo Croce said:
  Add option to disable any reply not related to a listening socket,
  like RST/ACK for TCP and ICMP Port-Unreachable for UDP.
  Also disables ICMP replies to echo request and timestamp.
  The stealth mode can be enabled selectively for a single interface.
 
  A few notes.
 
  1) Do you have an actual use case where an iptables '-j DROP' isn't usable?
 
 If you mean using a default DROP policy and allowing only the traffic
 you want, then the use case is where the port can change at runtime and
 you may not want to update the firewall every time

Can't you use socket match in netfilter to accomplish exactly that?

Re: [PATCH net] rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver

2015-07-07 Thread Vlad Zolotarov




On 07/07/15 01:07, Daniel Borkmann wrote:

Jason Gunthorpe reported that since commit c02db8c6290b (rtnetlink: make
SR-IOV VF interface symmetric), we don't verify IFLA_VF_INFO attributes
anymore with respect to their policy, that is, ifla_vfinfo_policy[].

Before, they were part of ifla_policy[], but they have been nested since
placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO,
which is another nested attribute for the actual VF attributes such as
IFLA_VF_MAC, IFLA_VF_VLAN, etc.

Despite the policy being split out from ifla_policy[] in this commit,
it's never applied anywhere. nla_for_each_nested() only does basic nla_ok()
testing for struct nlattr, but it doesn't know about the data context and
their requirements.

Fix, on top of Jason's initial work, does 1) parsing of the attributes
with the right policy, and 2) using the resulting parsed attribute table
from 1) instead of the nla_for_each_nested() loop (just like we used to
do when still part of ifla_policy[]).

Reference: http://thread.gmane.org/gmane.linux.network/368913
Fixes: c02db8c6290b (rtnetlink: make SR-IOV VF interface symmetric)
Reported-by: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Chris Wright chr...@sous-sol.org
Cc: Sucheta Chakraborty sucheta.chakrabo...@qlogic.com
Cc: Greg Rose gregory.v.r...@intel.com
Cc: Jeff Kirsher jeffrey.t.kirs...@intel.com
Cc: Rony Efraim ro...@mellanox.com
Cc: Vlad Zolotarov vl...@cloudius-systems.com
Cc: Nicolas Dichtel nicolas.dich...@6wind.com
Cc: Thomas Graf tg...@suug.ch
Signed-off-by: Jason Gunthorpe jguntho...@obsidianresearch.com
Signed-off-by: Daniel Borkmann dan...@iogearbox.net


Acked-by: Vlad Zolotarov vl...@cloudius-systems.com


---
  net/core/rtnetlink.c | 187 ++-
  1 file changed, 96 insertions(+), 91 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 01ced4a..9e433d5 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1328,10 +1328,6 @@ static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
 	[IFLA_INFO_SLAVE_DATA]	= { .type = NLA_NESTED },
 };
 
-static const struct nla_policy ifla_vfinfo_policy[IFLA_VF_INFO_MAX+1] = {
-	[IFLA_VF_INFO]		= { .type = NLA_NESTED },
-};
-
 static const struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = {
 	[IFLA_VF_MAC]		= { .len = sizeof(struct ifla_vf_mac) },
 	[IFLA_VF_VLAN]		= { .len = sizeof(struct ifla_vf_vlan) },
@@ -1488,96 +1484,98 @@ static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[])
 	return 0;
 }
 
-static int do_setvfinfo(struct net_device *dev, struct nlattr *attr)
+static int do_setvfinfo(struct net_device *dev, struct nlattr **tb)
 {
-	int rem, err = -EINVAL;
-	struct nlattr *vf;
 	const struct net_device_ops *ops = dev->netdev_ops;
+	int err = -EINVAL;
 
-	nla_for_each_nested(vf, attr, rem) {
-		switch (nla_type(vf)) {
-		case IFLA_VF_MAC: {
-			struct ifla_vf_mac *ivm;
-			ivm = nla_data(vf);
-			err = -EOPNOTSUPP;
-			if (ops->ndo_set_vf_mac)
-				err = ops->ndo_set_vf_mac(dev, ivm->vf,
-							  ivm->mac);
-			break;
-		}
-		case IFLA_VF_VLAN: {
-			struct ifla_vf_vlan *ivv;
-			ivv = nla_data(vf);
-			err = -EOPNOTSUPP;
-			if (ops->ndo_set_vf_vlan)
-				err = ops->ndo_set_vf_vlan(dev, ivv->vf,
-							   ivv->vlan,
-							   ivv->qos);
-			break;
-		}
-		case IFLA_VF_TX_RATE: {
-			struct ifla_vf_tx_rate *ivt;
-			struct ifla_vf_info ivf;
-			ivt = nla_data(vf);
-			err = -EOPNOTSUPP;
-			if (ops->ndo_get_vf_config)
-				err = ops->ndo_get_vf_config(dev, ivt->vf,
-							     &ivf);
-			if (err)
-				break;
-			err = -EOPNOTSUPP;
-			if (ops->ndo_set_vf_rate)
-				err = ops->ndo_set_vf_rate(dev, ivt->vf,
-							   ivf.min_tx_rate,
-							   ivt->rate);
-			break;
-		}
-		case IFLA_VF_RATE: {
-			struct ifla_vf_rate *ivt;
-			ivt = nla_data(vf);
-			err = -EOPNOTSUPP;
-			if (ops->ndo_set_vf_rate)
-				err =
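
To illustrate why a bare walk over nested attributes is not enough, here is a minimal userspace sketch of policy-driven parsing. It is not the kernel's netlink API: struct sketch_attr, sketch_policy and sketch_parse() are hypothetical, and the sketch ignores the 4-byte alignment padding that real netlink attributes carry. The point it shows is the one made in the commit message: bounds checks on the attribute header alone say nothing about whether the payload is long enough for the structure the consumer will read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute header, loosely modelled on a netlink TLV. */
struct sketch_attr {
	uint16_t len;	/* header + payload, in bytes */
	uint16_t type;
};

#define SKETCH_TYPE_MAX 2

/* Per-type minimum payload length; 0 means "no requirement". */
static const uint16_t sketch_policy[SKETCH_TYPE_MAX + 1] = {
	[1] = 8,	/* e.g. a fixed-size struct */
	[2] = 4,
};

/* Walks the buffer, validates every attribute against the policy and
 * stores a pointer to its payload in tb[type]. Returns 0 on success and
 * -1 on malformed or too-short input. */
int sketch_parse(const uint8_t *buf, size_t buflen,
		 const uint8_t *tb[SKETCH_TYPE_MAX + 1])
{
	size_t off = 0;

	assert(buf != NULL && tb != NULL);
	for (int i = 0; i <= SKETCH_TYPE_MAX; i++)
		tb[i] = NULL;

	while (off < buflen) {
		struct sketch_attr hdr;

		if (buflen - off < sizeof(hdr))
			return -1;		/* truncated header */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > buflen - off)
			return -1;		/* length out of bounds */
		if (hdr.type <= SKETCH_TYPE_MAX) {
			if (hdr.len - sizeof(hdr) < sketch_policy[hdr.type])
				return -1;	/* payload shorter than policy */
			tb[hdr.type] = buf + off + sizeof(hdr);
		}
		off += hdr.len;
	}
	return 0;
}
```

A consumer then indexes tb[] by attribute type, in the same spirit as the parsed attribute table the fix switches to, instead of switching on the type inside a loop.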

[PATCH net v2] Revert "dev: set iflink to 0 for virtual interfaces"

2015-07-07 Thread Nicolas Dichtel

This reverts commit e1622baf54df8cc958bf29d71de5ad545ea7d93c.

A side effect of this commit is that 'ip link' shows '@NONE' after each
virtual interface name. This may break existing scripts.

Reported-by: Oliver Hartkopp socket...@hartkopp.net
Signed-off-by: Nicolas Dichtel nicolas.dich...@6wind.com
Tested-by: Oliver Hartkopp socket...@hartkopp.net
---

v2: fix Oliver's first name (sorry for the typo)

 net/core/dev.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 6778ad52..72e0a4331154 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -677,10 +677,6 @@ int dev_get_iflink(const struct net_device *dev)
 	if (dev->netdev_ops && dev->netdev_ops->ndo_get_iflink)
 		return dev->netdev_ops->ndo_get_iflink(dev);
 
-	/* If dev->rtnl_link_ops is set, it's a virtual interface. */
-	if (dev->rtnl_link_ops)
-		return 0;
-
 	return dev->ifindex;
 }
 EXPORT_SYMBOL(dev_get_iflink);
-- 
2.4.2


Re: [PATCH v2] bonding: primary_reselect with failure is not working properly

2015-07-07 Thread GMAIL


On Monday 06 July 2015 09:02 PM, Andy Gospodarek wrote:


On Mon, Jul 06, 2015 at 05:34:01PM +0530, GMAIL wrote:

On Friday 03 July 2015 11:46 PM, Jay Vosburgh wrote:

GMAILranamazh...@gmail.com  wrote:

[...]

Looks good, added cosmetic changes for more readability,
it might save some instructions :)


diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 19eb990..317a494 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -689,40 +689,57 @@ out:
  }
-static bool bond_should_change_active(struct bonding *bond)
+static struct slave *bond_choose_primary_or_current(struct bonding *bond)
  {
 	struct slave *prim = rtnl_dereference(bond->primary_slave);
 	struct slave *curr = rtnl_dereference(bond->curr_active_slave);

Probably a good idea to add back a blank line here.

Otherwise this logic appears to be proper to resolve your issue and
Jay's additions appear to handle the case where primary_slave is NULL.


It was there; I don't know, maybe it's a mail client issue.


-	if (!prim || !curr || curr->link != BOND_LINK_UP)
-		return true;
+	if (!prim || prim->link != BOND_LINK_UP) {
+		if (!curr || curr->link != BOND_LINK_UP)
+			return NULL;
+		return curr;
+	}
+
 	if (bond->force_primary) {
 		bond->force_primary = false;
-		return true;
+		return prim;
+	}
+
+	if (!curr || curr->link != BOND_LINK_UP)
+		return prim;
+
+	/* At this point, prim and curr are both up */
+	switch (bond->params.primary_reselect) {
+	case BOND_PRI_RESELECT_ALWAYS:
+		return prim;
+	case BOND_PRI_RESELECT_BETTER:
+		if (prim->speed < curr->speed)
+			return curr;
+		if (prim->speed == curr->speed && prim->duplex <= curr->duplex)
+			return curr;
+		return prim;
+	case BOND_PRI_RESELECT_FAILURE:
+		return curr;
+	default:
+		netdev_err(bond->dev, "impossible primary_reselect %d\n",
+			   bond->params.primary_reselect);
+		return curr;
 	}
-	if (bond->params.primary_reselect == BOND_PRI_RESELECT_BETTER &&
-	    (prim->speed < curr->speed ||
-	     (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
-		return false;
-	if (bond->params.primary_reselect == BOND_PRI_RESELECT_FAILURE)
-		return false;
-	return true;
  }

  /**
- * find_best_interface - select the best available slave to be the active one
+ * bond_find_best_slave - select the best available slave to be the active one
   * @bond: our bonding struct
   */
  static struct slave *bond_find_best_slave(struct bonding *bond)
  {
-	struct slave *slave, *bestslave = NULL, *primary;
+	struct slave *slave, *bestslave = NULL;
 	struct list_head *iter;
 	int mintime = bond->params.updelay;

-	primary = rtnl_dereference(bond->primary_slave);
-	if (primary && primary->link == BOND_LINK_UP &&
-	    bond_should_change_active(bond))
-		return primary;
+	slave = bond_choose_primary_or_current(bond);
+	if (slave)
+		return slave;

 	bond_for_each_slave(bond, slave, iter) {
 		if (slave->link == BOND_LINK_UP)
---

Regards,
Mazhar Rana


I am sending an updated version of the patch (v3) separately; it will
incorporate both my changes and Jay's.

Regards,
Mazhar Rana
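
The selection logic discussed in this thread can be exercised in userspace with a small model. This is a sketch only: struct sketch_slave, enum sketch_reselect and sketch_choose() are hypothetical stand-ins that mirror the shape of the proposed bond_choose_primary_or_current(), not the driver's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; names do not match the driver's definitions. */
enum sketch_reselect { RESELECT_ALWAYS, RESELECT_BETTER, RESELECT_FAILURE };

struct sketch_slave {
	bool link_up;
	int speed;	/* e.g. Mb/s */
	int duplex;	/* 0 = half, 1 = full */
};

/* Returns the slave that should be active among primary and current,
 * or NULL if neither is usable and the caller must search the others. */
const struct sketch_slave *
sketch_choose(const struct sketch_slave *prim, const struct sketch_slave *curr,
	      enum sketch_reselect policy, bool *force_primary)
{
	bool curr_ok = curr && curr->link_up;

	assert(force_primary != NULL);

	if (!prim || !prim->link_up)
		return curr_ok ? curr : NULL;

	if (*force_primary) {
		*force_primary = false;
		return prim;
	}

	if (!curr_ok)
		return prim;

	/* Both primary and current are up: apply the reselect policy. */
	switch (policy) {
	case RESELECT_ALWAYS:
		return prim;
	case RESELECT_BETTER:
		if (prim->speed < curr->speed)
			return curr;
		if (prim->speed == curr->speed && prim->duplex <= curr->duplex)
			return curr;
		return prim;
	case RESELECT_FAILURE:
	default:
		return curr;
	}
}
```

With both links up and the failure policy, the model keeps the current slave; the primary is chosen only when the current slave is down or a forced reselect is pending.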


Re: [PATCH net-next] tcp: always send a quick ack when quickacks are enabled

2015-07-07 Thread Eric Dumazet

On Tue, 2015-07-07 at 14:22 +1000, Jon Maxwell wrote:


 @@ -4887,6 +4884,7 @@ static inline void tcp_data_snd_check(struct sock *sk)
  static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
  {
   struct tcp_sock *tp = tcp_sk(sk);
 + const struct dst_entry *dst = __sk_dst_get(sk);
  
   /* More than one full frame received... */
   if (((tp->rcv_nxt - tp->rcv_wup) > inet_csk(sk)->icsk_ack.rcv_mss &&
 @@ -4896,6 +4894,8 @@ static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
        __tcp_select_window(sk) >= tp->rcv_wnd) ||
   /* We ACK each frame or... */
   tcp_in_quickack_mode(sk) ||
 + /* quickack on dst */
 + (dst && dst_metric(dst, RTAX_QUICKACK)) ||
   /* We have out of order data. */
   (ofo_possible && skb_peek(&tp->out_of_order_queue))) {

This logic should be moved to tcp_in_quickack_mode() ?

Note I placed the dst test before others, to reduce jump prediction
misses.

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 684f095d196e..69ec8d25a2e5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -196,11 +196,13 @@ static void tcp_enter_quickack_mode(struct sock *sk)
  * and the session is not interactive.
  */
 
-static inline bool tcp_in_quickack_mode(const struct sock *sk)
+static bool tcp_in_quickack_mode(struct sock *sk)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
+	const struct dst_entry *dst = __sk_dst_get(sk);
 
-	return icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong;
+	return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
+	       (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
 }
 
 static void tcp_ecn_queue_cwr(struct tcp_sock *tp)
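
The combined predicate can be checked in isolation with a userspace model. This is a sketch under assumptions: struct sketch_ack_state and struct sketch_route are hypothetical stand-ins for the socket and route state, and the boolean quickack_metric models a non-zero dst_metric(dst, RTAX_QUICKACK).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket state the kernel helper reads. */
struct sketch_ack_state {
	unsigned int quick;	/* remaining quick ACKs */
	bool pingpong;		/* session looks interactive */
};

struct sketch_route {
	bool quickack_metric;	/* models a non-zero RTAX_QUICKACK metric */
};

/* Mirrors the shape of the proposed tcp_in_quickack_mode(): the per-route
 * setting is tested first, then the per-socket quickack state. A NULL
 * route models a socket without a cached dst. */
bool sketch_in_quickack_mode(const struct sketch_ack_state *ack,
			     const struct sketch_route *route)
{
	assert(ack != NULL);
	return (route && route->quickack_metric) ||
	       (ack->quick && !ack->pingpong);
}
```

The route setting wins regardless of the socket's pingpong state, which is the behaviour the subject line asks for.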





[PATCH] net: ec_bhf: Use module_pci_driver

2015-07-07 Thread Vaishali Thakkar

Use module_pci_driver for drivers whose init and exit functions
only register and unregister, respectively.

A simplified version of the Coccinelle semantic patch that performs
this transformation is as follows:

@a@
identifier f, x;
@@
-static f(...) { return pci_register_driver(&x); }

@b depends on a@
identifier e, a.x;
@@
-static e(...) { pci_unregister_driver(&x); }

@c depends on a && b@
identifier a.f;
declarer name module_init;
@@
-module_init(f);

@d depends on a && b && c@
identifier b.e, a.x;
declarer name module_exit;
declarer name module_pci_driver;
@@
-module_exit(e);
+module_pci_driver(x);

Signed-off-by: Vaishali Thakkar vthakkar1...@gmail.com
---
 drivers/net/ethernet/ec_bhf.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/ec_bhf.c b/drivers/net/ethernet/ec_bhf.c
index d101750..f7b4248 100644
--- a/drivers/net/ethernet/ec_bhf.c
+++ b/drivers/net/ethernet/ec_bhf.c
@@ -604,19 +604,7 @@ static struct pci_driver pci_driver = {
 	.probe		= ec_bhf_probe,
 	.remove		= ec_bhf_remove,
 };
-
-static int __init ec_bhf_init(void)
-{
-	return pci_register_driver(&pci_driver);
-}
-
-static void __exit ec_bhf_exit(void)
-{
-	pci_unregister_driver(&pci_driver);
-}
-
-module_init(ec_bhf_init);
-module_exit(ec_bhf_exit);
+module_pci_driver(pci_driver);
 
 module_param(polling_frequency, long, S_IRUGO);
 MODULE_PARM_DESC(polling_frequency, "Polling timer frequency in ns");
-- 
1.9.1
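
The idea behind the conversion, a macro that generates the register/unregister boilerplate, can be imitated in userspace. This is only a sketch: sketch_module_driver(), struct sketch_driver and the register helpers are hypothetical and are not the kernel's module_pci_driver() or PCI core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace registry standing in for the PCI core. */
struct sketch_driver {
	const char *name;
	int registered;
};

int sketch_register_driver(struct sketch_driver *drv)
{
	assert(drv != NULL);
	drv->registered = 1;
	return 0;
}

void sketch_unregister_driver(struct sketch_driver *drv)
{
	assert(drv != NULL);
	drv->registered = 0;
}

/* Generates the init/exit pair that would otherwise be written by hand.
 * The generated names are <driver>_init and <driver>_exit. */
#define sketch_module_driver(drv)					\
	int drv##_init(void) { return sketch_register_driver(&(drv)); }	\
	void drv##_exit(void) { sketch_unregister_driver(&(drv)); }

struct sketch_driver demo_driver = { "demo", 0 };

sketch_module_driver(demo_driver)
```

One macro line replaces the two hand-written functions, which is why the diffstat above shows 13 deletions against a single insertion.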


Re: [PATCH net-next] tcp: always send a quick ack when quickacks are enabled

2015-07-07 Thread Jonathan Maxwell

 On Tue, 2015-07-07 at 14:22 +1000, Jon Maxwell wrote:


  @@ -4887,6 +4884,7 @@ static inline void tcp_data_snd_check(struct sock *sk)
   static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
   {
   struct tcp_sock *tp = tcp_sk(sk);
  +const struct dst_entry *dst = __sk_dst_get(sk);
 
   /* More than one full frame received... */
   if (((tp->rcv_nxt - tp->rcv_wup) > inet_csk(sk)->icsk_ack.rcv_mss &&
  @@ -4896,6 +4894,8 @@ static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
        __tcp_select_window(sk) >= tp->rcv_wnd) ||
   /* We ACK each frame or... */
   tcp_in_quickack_mode(sk) ||
  +/* quickack on dst */
  +(dst && dst_metric(dst, RTAX_QUICKACK)) ||
   /* We have out of order data. */
   (ofo_possible && skb_peek(&tp->out_of_order_queue))) {

 This logic should be moved to tcp_in_quickack_mode() ?

Yes agreed that's a better place for the check seeing that we
already check the other quickack conditions there as well.


 Note I placed the dst test before others, to reduce jump prediction
 misses.

 diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
 index 684f095d196e..69ec8d25a2e5 100644
 --- a/net/ipv4/tcp_input.c
 +++ b/net/ipv4/tcp_input.c
 @@ -196,11 +196,13 @@ static void tcp_enter_quickack_mode(struct sock *sk)
   * and the session is not interactive.
   */

 -static inline bool tcp_in_quickack_mode(const struct sock *sk)
 +static bool tcp_in_quickack_mode(struct sock *sk)
  {
  const struct inet_connection_sock *icsk = inet_csk(sk);
 +const struct dst_entry *dst = __sk_dst_get(sk);

 -return icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong;
 +return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
 +   (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
  }

  static void tcp_ecn_queue_cwr(struct tcp_sock *tp)




Re: Performance bottleneck with ndo_start_xmit

2015-07-07 Thread Stephen Hemminger

On Tue, 7 Jul 2015 18:32:22 +0200
Jason A. Donenfeld ja...@zx2c4.com wrote:

 I'm writing a kernel module that creates a virtual network device with
 rtnl_link_register.

Is it open source, is the source available to look at?
If not, please solve your own problems.

Re: [PATCH] vmxnet3: prevent receive getting out of sequence on napi poll

2015-07-07 Thread Andy Gospodarek

On Tue, Jul 07, 2015 at 02:02:18PM -0400, Neil Horman wrote:
 vmxnet3's current napi path is built to count every rx descriptor we receive,
 and use that as a count of the napi budget.  That means it's possible to return
 from a napi poll halfway through receiving a fragmented packet across multiple
 dma descriptors.  If that happens, the next napi poll will start with the
 descriptor ring in an improper state (e.g. the first descriptor we look at may
 have the end-of-packet bit set), which will cause a BUG halt in the driver.
 
 Fix the issue by only counting whole received packets in the napi poll and
 returning that value, rather than the descriptor count.
 
 Tested by the reporter and myself, successfully
 
 Signed-off-by: Neil Horman nhor...@tuxdriver.com
 CC: Shreyas Bhatewara sbhatew...@vmware.com
 CC: David S. Miller da...@davemloft.net

Looks good.  I'm now curious how widespread something like this might be
for drivers that use a similar EOP marker

Acked-by: Andy Gospodarek go...@cumulusnetworks.com

[PATCH] 3c59x: Fix shared IRQ handling

2015-07-07 Thread Denys Vlasenko

As its first order of business, boomerang_interrupt() checks whether
the device really has any pending interrupts. If it does not,
it does nothing and returns, but it still returns IRQ_HANDLED.

This is wrong: interrupt was not handled, IRQ handlers of other
devices sharing this IRQ line need to be called.

vortex_interrupt() has it right: it returns IRQ_NONE in this case
via IRQ_RETVAL(0).

Do the same in boomerang_interrupt().

Signed-off-by: Denys Vlasenko dvlas...@redhat.com
CC: David S. Miller da...@davemloft.net
CC: linux-ker...@vger.kernel.org
CC: netdev@vger.kernel.org
---
 drivers/net/ethernet/3com/3c59x.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index 41095eb..c11d6fc 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -2382,6 +2384,7 @@ boomerang_interrupt(int irq, void *dev_id)
 	void __iomem *ioaddr;
 	int status;
 	int work_done = max_interrupt_work;
+	int handled = 0;
 
 	ioaddr = vp->ioaddr;
 
@@ -2400,6 +2403,7 @@ boomerang_interrupt(int irq, void *dev_id)
 
 	if ((status & IntLatch) == 0)
 		goto handler_exit;	/* No interrupt: shared IRQs can cause this */
+	handled = 1;
 
 	if (status == 0xffff) {	/* h/w no longer present (hotplug)? */
 		if (vortex_debug > 1)
@@ -2501,7 +2505,7 @@ boomerang_interrupt(int irq, void *dev_id)
 handler_exit:
 	vp->handling_irq = 0;
 	spin_unlock(&vp->lock);
-	return IRQ_HANDLED;
+	return IRQ_RETVAL(handled);
 }
 
 static int vortex_rx(struct net_device *dev)
-- 
1.8.1.4
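
The convention described in the commit message can be modelled in userspace. This is a minimal sketch, assuming a simplified shared line where every handler is called in turn; enum irq_result, struct sketch_dev and both functions are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's irqreturn_t values (assumed names). */
enum irq_result { SKETCH_IRQ_NONE = 0, SKETCH_IRQ_HANDLED = 1 };

/* A hypothetical device on a shared line: 'pending' models its latch. */
struct sketch_dev {
	int pending;
	int serviced;
};

/* Handler in the style the patch asks for: claim the IRQ only if the
 * device actually had something pending. */
enum irq_result sketch_handler(struct sketch_dev *dev)
{
	int handled = 0;

	assert(dev != NULL);
	if (!dev->pending)
		goto out;	/* not ours: shared IRQs can cause this */
	handled = 1;
	dev->pending = 0;
	dev->serviced++;
out:
	return handled ? SKETCH_IRQ_HANDLED : SKETCH_IRQ_NONE;
}

/* Calls every handler on the line and returns how many claimed the IRQ.
 * A result of 0 is what lets a dispatcher notice a spurious interrupt. */
int sketch_dispatch(struct sketch_dev *devs, size_t n)
{
	int claimed = 0;

	for (size_t i = 0; i < n; i++)
		if (sketch_handler(&devs[i]) == SKETCH_IRQ_HANDLED)
			claimed++;
	return claimed;
}
```

A handler that always reported "handled" would make sketch_dispatch() return a non-zero count even when no device had work, hiding the spurious case.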


Re: [PATCH v2] net/bridge: Use __in6_dev_get rather than in6_dev_get in br_validate_ipv6

2015-07-07 Thread Stephen Hemminger

On Tue, 7 Jul 2015 15:55:21 +0100
Julien Grall julien.gr...@citrix.com wrote:

 The commit efb6de9b4ba0092b2c55f6a52d16294a8a698edd "netfilter: bridge:
 forward IPv6 fragmented packets" introduced a new function,
 br_validate_ipv6, which takes a reference on the inet6 device. However,
 the reference is never released.
 
 As a result, it becomes impossible to destroy any netdevice using
 ipv6 and bridge.
 
 It's possible to directly retrieve the inet6 device without taking a
 reference, as all netfilter hooks are protected by rcu_read_lock via
 nf_hook_slow.
 
 Spotted while trying to destroy a Xen guest on the upstream Linux:
 unregister_netdevice: waiting for vif1.0 to become free. Usage count = 1
 
 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Cc: Bernhard Thaler bernhard.tha...@wvnet.at
 Cc: Pablo Neira Ayuso pa...@netfilter.org
 Cc: f...@strlen.de
 Cc: ian.campb...@citrix.com
 Cc: wei.l...@citrix.com
 Cc: Bob Liu bob@oracle.com
 
 ---
 Note that it's impossible to create new guest after this message.
 I'm not sure if it's normal.
 
 Changes in v2:
 - Don't take a reference to inet6.
 - This was net/bridge: Add missing in6_dev_put in
 br_validate_ipv6 [0]
 
 [0] https://lkml.org/lkml/2015/7/3/443
 ---
  net/bridge/br_netfilter_ipv6.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

I like this simple solution

Acked-by: Stephen Hemminger step...@networkplumber.org
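
The leak and the fix can be pictured with a toy reference counter. This is a sketch under assumptions: struct sketch_idev and the sketch_* helpers are hypothetical, and "borrowing" stands in for a lookup that relies on an outer guarantee (RCU in the kernel case) instead of a reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for an inet6 device. */
struct sketch_idev {
	int refcnt;
};

/* Takes a reference; the caller must pair it with sketch_put(). */
struct sketch_idev *sketch_get(struct sketch_idev *idev)
{
	assert(idev != NULL);
	idev->refcnt++;
	return idev;
}

void sketch_put(struct sketch_idev *idev)
{
	assert(idev != NULL && idev->refcnt > 0);
	idev->refcnt--;
}

/* Borrowed access: no reference is taken, so nothing needs releasing.
 * Only safe while something else keeps the object alive. */
struct sketch_idev *sketch_borrow(struct sketch_idev *idev)
{
	return idev;
}

/* Models the leak: a reference is taken on every call and never dropped. */
void validate_leaky(struct sketch_idev *idev)
{
	(void)sketch_get(idev);
}

/* Models the v2 approach: look at the object without taking a reference. */
void validate_borrowed(struct sketch_idev *idev)
{
	(void)sketch_borrow(idev);
}
```

Each call to validate_leaky() raises the count by one for good, which is the userspace analogue of a usage count that never returns to zero.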


Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Stephen Hemminger

On Mon,  6 Jul 2015 07:47:29 -0700
Dexuan Cui de...@microsoft.com wrote:

 Hyper-V VM sockets (hvsock) supplies a byte-stream based communication
 mechanism between the host and a guest. It's kind of TCP over VMBus, but
 the transportation layer (VMBus) is much simpler than IP. With Hyper-V VM
 Sockets, applications between the host and a guest can talk with each
 other directly by the traditional BSD-style socket APIs.
 
 Hyper-V VM Sockets is only available on Windows 10 host and later. The
 patch implements the necessary support in the guest side by introducing
 a new socket address family AF_HYPERV.
 
 Signed-off-by: Dexuan Cui de...@microsoft.com

Is there any chance that AF_VSOCK could be used with a different transport
for VMware and Hyper-V? It would be better to make guest applications host independent.

Re: linux-4.2-rc1/samples/bpf/sockex3_kern.c: bad expression ?

2015-07-07 Thread Alexei Starovoitov

On Tue, Jul 07, 2015 at 11:27:55AM +, David Binderman wrote:
 Hello there,
 
 [linux-4.2-rc1/samples/bpf/sockex3_kern.c:268]: (style) Expression '(X & 0xf0) == 0x4' is always false.
 
 Source code is
 
    if ((verlen & 0xF0) == 4)
 
 Maybe
 
    if ((verlen & 0xF0) == 0x40)

oops, yes. Could you please send a patch to fix it. Thx
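
A quick way to convince oneself of the report is to test all 256 byte values. This is a userspace sketch with hypothetical helper names, assuming verlen holds a single byte whose high nibble is the field being tested: masking with 0xF0 clears the low nibble, so the result can never equal 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the comparison as currently written.
 * (verlen & 0xF0) always has a zero low nibble, so it never equals 4. */
bool version_check_as_written(uint8_t verlen)
{
	return (verlen & 0xF0) == 4;
}

/* Hypothetical helper: the suggested fix, comparing against 0x40. */
bool version_check_fixed(uint8_t verlen)
{
	return (verlen & 0xF0) == 0x40;
}

/* Counts how many of the 256 possible byte values satisfy a predicate. */
int count_matches(bool (*pred)(uint8_t))
{
	int n = 0;

	assert(pred != NULL);
	for (int v = 0; v < 256; v++)
		if (pred((uint8_t)v))
			n++;
	return n;
}
```

The written form matches no byte value at all, while the fixed form matches the 16 values 0x40 through 0x4F.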


Re: [PATCH v3 0/3] net: dsa: mv88e6xxx: add support for VLAN Table Unit

2015-07-07 Thread Scott Feldman

On Tue, Jul 7, 2015 at 9:17 AM, Vivien Didelot
vivien.dide...@savoirfairelinux.com wrote:
 Hi Andrew, Scott,
 Does this fixup http://ix.io/jxq look good to you?

Yes.

[PATCH] vmxnet3: prevent receive getting out of sequence on napi poll

2015-07-07 Thread Neil Horman

vmxnet3's current napi path is built to count every rx descriptor we receive,
and use that as a count of the napi budget.  That means it's possible to return
from a napi poll halfway through receiving a fragmented packet across multiple
dma descriptors.  If that happens, the next napi poll will start with the
descriptor ring in an improper state (e.g. the first descriptor we look at may
have the end-of-packet bit set), which will cause a BUG halt in the driver.

Fix the issue by only counting whole received packets in the napi poll and
returning that value, rather than the descriptor count.

Tested by the reporter and myself, successfully

Signed-off-by: Neil Horman nhor...@tuxdriver.com
CC: Shreyas Bhatewara sbhatew...@vmware.com
CC: David S. Miller da...@davemloft.net
---
 drivers/net/vmxnet3/vmxnet3_drv.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c 
b/drivers/net/vmxnet3/vmxnet3_drv.c
index da11bb5..46f4cad 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1216,7 +1216,7 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
static const u32 rxprod_reg[2] = {
VMXNET3_REG_RXPROD, VMXNET3_REG_RXPROD2
};
-   u32 num_rxd = 0;
+   u32 num_pkts = 0;
bool skip_page_frags = false;
struct Vmxnet3_RxCompDesc *rcd;
struct vmxnet3_rx_ctx *ctx = &rq->rx_ctx;
@@ -1235,13 +1235,12 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
struct Vmxnet3_RxDesc *rxd;
u32 idx, ring_idx;
struct vmxnet3_cmd_ring *ring = NULL;
-   if (num_rxd >= quota) {
+   if (num_pkts >= quota) {
/* we may stop even before we see the EOP desc of
 * the current pkt
 */
break;
}
-   num_rxd++;
BUG_ON(rcd->rqID != rq->qid && rcd->rqID != rq->qid2);
idx = rcd->rxdIdx;
ring_idx = rcd->rqID < adapter->num_rx_queues ? 0 : 1;
@@ -1413,6 +1412,7 @@ not_lro:
napi_gro_receive(&rq->napi, skb);
 
ctx->skb = NULL;
+   num_pkts++;
}
 
 rcd_done:
@@ -1443,7 +1443,7 @@ rcd_done:
  
&rq->comp_ring.base[rq->comp_ring.next2proc].rcd, &rxComp);
}
 
-   return num_rxd;
+   return num_pkts;
 }
 
 
-- 
2.1.0


Re: [PATCH] vmxnet3: prevent receive getting out of sequence on napi poll

2015-07-07 Thread Neil Horman

On Tue, Jul 07, 2015 at 02:10:50PM -0400, Andy Gospodarek wrote:
 On Tue, Jul 07, 2015 at 02:02:18PM -0400, Neil Horman wrote:
  vmxnet3's current napi path is built to count every rx descriptor we receive,
  and use that as a count of the napi budget.  That means it's possible to return
  from a napi poll halfway through receiving a fragmented packet across multiple
  dma descriptors.  If that happens, the next napi poll will start with the
  descriptor ring in an improper state (e.g. the first descriptor we look at may
  have the end-of-packet bit set), which will cause a BUG halt in the driver.
  
  Fix the issue by only counting whole received packets in the napi poll and
  returning that value, rather than the descriptor count.
  
  Tested by the reporter and myself, successfully
  
  Signed-off-by: Neil Horman nhor...@tuxdriver.com
  CC: Shreyas Bhatewara sbhatew...@vmware.com
  CC: David S. Miller da...@davemloft.net
 
 Looks good.  I'm now curious how widespread something like this might be
 for drivers that use a similar EOP marker
 
That's a fair question, though it manifests pretty clearly in any driver that
does any sort of strict state checking.  I think several drivers just punt if
they see an EOP descriptor before an SOP descriptor, toss it and keep going, so
you might lose a few extra frames while the driver drains the queue.
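
The "punt and keep going" behaviour could look roughly like the sketch below. It is not taken from any particular driver: the descriptor layout and the function name are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor carrying start-of-packet and end-of-packet flags. */
struct demo_rx_desc {
	bool sop;
	bool eop;
};

/* Skip descriptors until one that can start a packet is found.
 * Returns the number of descriptors dropped; *pos ends up at the first
 * SOP descriptor, or at len if none is left. */
static size_t demo_resync_to_sop(const struct demo_rx_desc *ring, size_t len,
				 size_t *pos)
{
	size_t dropped = 0;

	while (*pos < len && !ring[*pos].sop) {
		dropped++;
		(*pos)++;
	}
	return dropped;
}
```

A driver resyncing this way loses the tail of the interrupted packet instead of halting, which matches the "lose a few extra frames" trade-off mentioned above.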


 Acked-by: Andy Gospodarek go...@cumulusnetworks.com
 

RE: linux-4.2-rc1/samples/bpf/sockex3_kern.c: bad expression ?

2015-07-07 Thread David Binderman

Hello there Alexei,


 oops, yes. Could you please send a patch to fix it. 

Sorry, but I gave up trying to do kernel patches many years ago.

Regards

David Binderman


Re: [PATCH] freescale:Make the function gfar_configure_coalescing_all static

2015-07-07 Thread Uwe Kleine-König

On Tue, Jul 07, 2015 at 03:49:33PM -0400, Nicholas Krause wrote:
 This makes the function gfar_configure_coalescing_all static
 and removes its function prototype from the header file, gianfar.h
really? Looking at the diffstat I dare to disagree.

 due to this function only ever being called in its definition and
 declaration file of gianfar.c.
 
 Signed-off-by: Nicholas Krause xerofo...@gmail.com
 ---
  drivers/net/ethernet/freescale/gianfar.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

[ANNOUNCE] iproute2 4.1.1

2015-07-07 Thread Stephen Hemminger

Maintenance release of iproute2 for Linux 4.1

This fixes issues affecting some environments with iproute2 4.1.
The MPLS support was missing one patch, and the introduction of
TIPC made build difficult in environments without proper libmnl.

Source:
  http://www.kernel.org/pub/linux/utils/net/iproute2/iproute2-4.1.1.tar.gz

Repository:
  git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

Report problems (or enhancements) to the netdev@vger.kernel.org mailing list.

---
Gustavo Zacarias (1):
  tipc: make build conditional on having libmnl

Jan Engelhardt (1):
  build: must honor pkg-config flags for libmnl

Michal Kubeček (1):
  include: add copy of tipc.h

Roopa Prabhu (1):
  mpls: always set type RTN_UNICAST and scope RT_SCOPE_UNIVERSE for

Stephen Hemminger (1):
  v4.1.1


Re: Performance bottleneck with ndo_start_xmit

2015-07-07 Thread Jason A. Donenfeld

On Tue, Jul 7, 2015 at 8:10 PM, Stephen Hemminger
step...@networkplumber.org wrote:
 Is it open source, is the source available to look at?
 If not, please solve your own problems.

Yes it is. Right now the repo is under password because it's supposed
to keep your data secure, but I haven't audited it yet, and I don't
want someone to rely on the software erroneously before I've made sure
it's safe. If my general question here doesn't turn up any good
pointers, I'll take the password off the repo and just add some
massive "do not use!" warnings.

Re: [PATCH v2 06/22] fjes: buffer address regist/unregistration routine

2015-07-07 Thread Yasuaki Ishimatsu


On Wed, 24 Jun 2015 11:55:38 +0900
Taku Izumi izumi.t...@jp.fujitsu.com wrote:

 This patch adds buffer address regist/unregistration routine.
 
 This function is mainly invoked on network device
 activation (open) and deactivation (close)
 in order to register/unregister the shared buffer address.
 
 Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
 ---
  drivers/net/fjes/fjes_hw.c | 187 +
  drivers/net/fjes/fjes_hw.h |   9 ++-
  2 files changed, 195 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
 index 1ffa62e..4451e70 100644
 --- a/drivers/net/fjes/fjes_hw.c
 +++ b/drivers/net/fjes/fjes_hw.c
 @@ -453,6 +453,193 @@ int fjes_hw_request_info(struct fjes_hw *hw)
   return result;
  }
  
 +int fjes_hw_register_buff_addr(struct fjes_hw *hw, int dest_epid,
 +struct ep_share_mem_info *buf_pair)
 +{
 + union fjes_device_command_req *req_buf = hw->hw_info.req_buf;
 + union fjes_device_command_res *res_buf = hw->hw_info.res_buf;
 + enum fjes_dev_command_response_e ret;
 + int i, idx;
 + int page_count;
 + void *addr;
 + int timeout;
 + int result;
 +
 + if (test_bit(dest_epid, &hw->hw_info.buffer_share_bit))
 + return 0;
 +
 + memset(req_buf, 0, hw->hw_info.req_buf_size);
 + memset(res_buf, 0, hw->hw_info.res_buf_size);
 +
 + req_buf->share_buffer.length =
 + FJES_DEV_COMMAND_SHARE_BUFFER_REQ_LEN(buf_pair->tx.size,
 +   buf_pair->rx.size);
 + req_buf->share_buffer.epid = dest_epid;
 +
 + idx = 0;
 + req_buf->share_buffer.buffer[idx++] = buf_pair->tx.size;
 + page_count = buf_pair->tx.size / EP_BUFFER_INFO_SIZE;
 + for (i = 0; i < page_count; i++) {
 + addr = ((u8 *)(buf_pair->tx.buffer)) +
 + (i * EP_BUFFER_INFO_SIZE);
 + req_buf->share_buffer.buffer[idx++] =
 + (__le64)(page_to_phys(vmalloc_to_page(addr)) +
 + offset_in_page(addr));
 + }
 +
 + req_buf->share_buffer.buffer[idx++] = buf_pair->rx.size;
 + page_count = buf_pair->rx.size / EP_BUFFER_INFO_SIZE;
 + for (i = 0; i < page_count; i++) {
 + addr = ((u8 *)(buf_pair->rx.buffer)) +
 + (i * EP_BUFFER_INFO_SIZE);
 + req_buf->share_buffer.buffer[idx++] =
 + (__le64)(page_to_phys(vmalloc_to_page(addr)) +
 + offset_in_page(addr));
 + }
 +

 + res_buf->share_buffer.length = 0;
 + res_buf->share_buffer.code = 0;
 +
 + ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_SHARE_BUFFER);
 +
 + timeout = FJES_COMMAND_REQ_BUFF_TIMEOUT * 1000;
 + while ((ret == FJES_CMD_STATUS_NORMAL) &&
 +(res_buf->share_buffer.length ==
 + FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN) &&
 +(res_buf->share_buffer.code == FJES_CMD_REQ_RES_CODE_BUSY) &&
 +(timeout > 0)) {
 + msleep(200 + hw->my_epid * 20);
 + timeout -= (200 + hw->my_epid * 20);
 +
 + res_buf->share_buffer.length = 0;
 + res_buf->share_buffer.code = 0;
 +
 + ret =
 + fjes_hw_issue_request_command(hw,
 +   FJES_CMD_REQ_SHARE_BUFFER);
 + }
 +
 + result = 0;
 +
 + if (res_buf->share_buffer.length !=
 + FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN)
 + result = -ENOMSG;
 + else if (ret == FJES_CMD_STATUS_NORMAL) {
 + switch (res_buf->share_buffer.code) {
 + case FJES_CMD_REQ_RES_CODE_NORMAL:
 + result = 0;
 + set_bit(dest_epid, &hw->hw_info.buffer_share_bit);
 + break;
 + case FJES_CMD_REQ_RES_CODE_BUSY:
 + result = -EBUSY;
 + break;
 + default:
 + result = -EPERM;
 + break;
 + }
 + } else {
 + switch (ret) {
 + case FJES_CMD_STATUS_UNKNOWN:
 + result = -EPERM;
 + break;
 + case FJES_CMD_STATUS_TIMEOUT:
 + result = -EBUSY;
 + break;
 + case FJES_CMD_STATUS_ERROR_PARAM:
 + case FJES_CMD_STATUS_ERROR_STATUS:
 + default:
 + result = -EPERM;
 + break;
 + }
 + }
 +
 + return result;

fjes_hw_unregister_buff_addr() has the same implementation.
How about adding a new function to unify them?

Thanks,
Yasuaki Ishimatsu
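
One possible shape for such a shared helper is sketched below. It only covers the translation of the command status and response code into an errno value, following the branches of the quoted function. All `demo_` names are hypothetical stand-ins for the driver's own enums, and the sketch is written as plain user-space C rather than kernel code.

```c
#include <errno.h>

/* Hypothetical stand-ins for the driver's command status values. */
enum demo_cmd_status {
	DEMO_STATUS_UNKNOWN,
	DEMO_STATUS_NORMAL,
	DEMO_STATUS_TIMEOUT,
	DEMO_STATUS_ERROR_PARAM,
	DEMO_STATUS_ERROR_STATUS
};

/* Hypothetical stand-ins for the response codes. */
enum demo_res_code {
	DEMO_RES_NORMAL,
	DEMO_RES_BUSY,
	DEMO_RES_OTHER
};

/* Translate the outcome of a share/unshare request into an errno value,
 * mirroring the if/switch ladder of the quoted register routine. */
static int demo_share_result(enum demo_cmd_status ret, int res_len,
			     int expected_len, enum demo_res_code code)
{
	if (res_len != expected_len)
		return -ENOMSG;

	if (ret == DEMO_STATUS_NORMAL) {
		switch (code) {
		case DEMO_RES_NORMAL:
			return 0;
		case DEMO_RES_BUSY:
			return -EBUSY;
		default:
			return -EPERM;
		}
	}

	/* Only a timeout is reported as busy; every other failure is -EPERM. */
	return ret == DEMO_STATUS_TIMEOUT ? -EBUSY : -EPERM;
}
```

With a helper like this, the register and unregister paths would differ only in the request they build and in whether they set or clear the share bit when the result is 0.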

 +}
 +
 +int fjes_hw_unregister_buff_addr(struct fjes_hw *hw, int dest_epid)
 +{
 + union fjes_device_command_req *req_buf = hw->hw_info.req_buf;
 + union

Re: [PULL] virtio/vhost: cross endian support

2015-07-07 Thread Michael S. Tsirkin

On Tue, Jul 07, 2015 at 06:36:53PM +0200, Thomas Huth wrote:
 On Thu, 2 Jul 2015 11:32:52 +0200
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Thu, Jul 02, 2015 at 11:12:56AM +0200, Greg Kurz wrote:
   On Thu, 2 Jul 2015 08:01:28 +0200
   Michael S. Tsirkin m...@redhat.com wrote:
 ...
Yea, well - support for legacy BE guests on the new LE hosts is
exactly the motivation for this.

I dislike it too, but there are two redeeming properties that
made me merge this:

1.  It's a trivial amount of code: since we wrap host/guest accesses
anyway, almost all of it is well hidden from drivers.

2.  Sane platforms would never set flags like VHOST_CROSS_ENDIAN_LEGACY -
and when it's clear, there's zero overhead (at some point it was
tested by compiling with and without the patches, got the same
stripped binary).

Maybe we could create a Kconfig symbol to enforce point (2): prevent
people from enabling it e.g. on x86. I will look into this - but it can
be done by a patch on top, so I think this can be merged as is.

   
   This cross-endian *oddity* is targeting PowerPC book3s_64 processors... I
   am not aware of any other users. Maybe create a symbol that would
   be only selected by PPC_BOOK3S_64 ?
  
  I think some ARM systems are trying to support cross-endian
  configurations as well.
  
  Besides that, yes, this is more or less what I had in mind.
 
 Would something simple like this already do the job:
 
 diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
 --- a/drivers/vhost/Kconfig
 +++ b/drivers/vhost/Kconfig
 @@ -35,6 +35,7 @@ config VHOST
  
  config VHOST_CROSS_ENDIAN_LEGACY
   bool "Cross-endian support for vhost"
 + depends on KVM_BOOK3S_64 || KVM_ARM_HOST
   default n
   ---help---
 This option allows vhost to support guests with a different byte
 
 ?

Do all ARM hosts support this dynamic endian-ness?

 If that looks acceptable, I can submit a proper patch if you like.
 
  Thomas

I think I prefer some kind of symbol defined by these arches,
so I don't get to maintain an arch list in vhost.

-- 
MST
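
The "wrapped accesses" mentioned earlier in the thread can be pictured with a small user-space sketch. This is not the vhost code: the function name and the flag are assumptions for the example, and the real helpers differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 16-bit field as stored in guest memory. The caller says which
 * byte order the guest uses; the result is a native integer regardless of
 * the host's own endianness. */
static uint16_t demo_guest16_to_cpu(const uint8_t raw[2],
				    bool guest_is_little_endian)
{
	if (guest_is_little_endian)
		return (uint16_t)(raw[0] | (raw[1] << 8));
	return (uint16_t)((raw[0] << 8) | raw[1]);
}
```

If the flag is a compile-time constant, as it would be with the option disabled, a compiler can drop the unused branch, which is in line with the zero-overhead observation quoted above.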

[PATCH net 1/1] drivers/net/usb: add device id for NVIDIA Tegra USB 3.0 Ethernet

2015-07-07 Thread Zheng Liu

This device is sold as 'NVIDIA Tegra USB 3.0 Ethernet'.
Chipset is RTL8153 and works with r8152.
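
The patch touches two tables: it adds an entry with `.driver_info = 0` to the blacklist part of the cdc_ether table, and a regular entry to the r8152 table. The sketch below illustrates how a first-match ID table with such "decline" entries can work. It is a user-space model with made-up types and names, not the USB core's matching code; only the vendor and product IDs come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ID table entry. A NULL driver_info means "listed only so
 * that this driver declines the device". An ID of 0 matches anything. */
struct demo_usb_id {
	uint16_t vendor;
	uint16_t product;
	const char *driver_info;
};

static const struct demo_usb_id demo_generic_table[] = {
	{ 0x0955, 0x09ff, NULL },	/* IDs from the patch: decline */
	{ 0x0000, 0x0000, "generic" },	/* hypothetical catch-all entry */
};

/* Return the driver_info of the first matching entry, or NULL when the
 * device is declined or not listed at all. */
static const char *demo_generic_probe(uint16_t vendor, uint16_t product)
{
	size_t i;
	size_t n = sizeof(demo_generic_table) / sizeof(demo_generic_table[0]);

	for (i = 0; i < n; i++) {
		const struct demo_usb_id *id = &demo_generic_table[i];

		if ((id->vendor == 0 || id->vendor == vendor) &&
		    (id->product == 0 || id->product == product))
			return id->driver_info;
	}
	return NULL;
}
```

Because the specific entry comes before the catch-all, the generic driver steps aside for 0x0955:0x09ff and the dedicated driver's own table can claim the device.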

Signed-off-by: Zheng Liu zh...@nvidia.com
---
 drivers/net/usb/cdc_ether.c | 8 
 drivers/net/usb/r8152.c | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
index 4545e78840b0..35a2bffe848a 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -523,6 +523,7 @@ static const struct driver_info wwan_info = {
 #define REALTEK_VENDOR_ID  0x0bda
 #define SAMSUNG_VENDOR_ID  0x04e8
 #define LENOVO_VENDOR_ID   0x17ef
+#define NVIDIA_VENDOR_ID   0x0955
 
 static const struct usb_device_id  products[] = {
 /* BLACKLIST !!
@@ -710,6 +711,13 @@ static const struct usb_device_id  products[] = {
.driver_info = 0,
 },
 
+/* NVIDIA Tegra USB 3.0 Ethernet Adapters (based on Realtek RTL8153) */
+{
+   USB_DEVICE_AND_INTERFACE_INFO(NVIDIA_VENDOR_ID, 0x09ff, USB_CLASS_COMM,
+   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   .driver_info = 0,
+},
+
 /* WHITELIST!!!
  *
  * CDC Ether uses two interfaces, not necessarily consecutive.
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index aafa1a1898e4..7f6419ebb5e1 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -494,6 +494,7 @@ enum rtl8152_flags {
 #define VENDOR_ID_REALTEK  0x0bda
 #define VENDOR_ID_SAMSUNG  0x04e8
 #define VENDOR_ID_LENOVO   0x17ef
+#define VENDOR_ID_NVIDIA   0x0955
 
 #define MCU_TYPE_PLA   0x0100
 #define MCU_TYPE_USB   0x
@@ -4117,6 +4118,7 @@ static struct usb_device_id rtl8152_table[] = {
{REALTEK_USB_DEVICE(VENDOR_ID_SAMSUNG, 0xa101)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x7205)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x304f)},
+   {REALTEK_USB_DEVICE(VENDOR_ID_NVIDIA,  0x09ff)},
{}
 };
 
-- 
2.1.4


[PATCH v4 0/3] net: dsa: mv88e6xxx: add support for VLAN Table Unit

2015-07-07 Thread Vivien Didelot

Hi all,

This patchset brings full support for hardware VLANs in DSA, and the Marvell
88E6xxx compatible switch chips.

The first patch adds the VTU operations to the mv88e6xxx code, as well as a
vtu debugfs file to read and modify the hardware VLAN table.

The second patch adds the glue between DSA and the switchdev VLAN objects.

The third patch finally implements the necessary functions in the mv88e6xxx
code to interact with the hardware VLAN through switchdev, from userspace
commands such as bridge vlan.

Below is an example of what can be done with this patchset.

VID 550: 1t 3u
VID 1000: 2t
VID 1200: 2t 4t

The VLAN setup above can be achieved with the following bridge commands:

bridge vlan add vid 550 dev swp1 master
bridge vlan add vid 550 dev swp3 master untagged pvid
bridge vlan add vid 1000 dev swp2 master
bridge vlan add vid 1200 dev swp2 master
bridge vlan add vid 1200 dev swp4 master

Removing the port 1 from VLAN 550 is done with:

bridge vlan del vid 550 dev swp1

The bridge command would output the following setup:

# bridge vlan
port    vlan ids
swp0    None
swp0
swp1    None
swp1
swp2     1000
         1200

swp2     1000
         1200

swp3     550 PVID Egress Untagged

swp3     550 PVID Egress Untagged

swp4     1200

swp4     1200

br0     None

Assuming that swp5 is the CPU port, the vtu debugfs file would show:

# cat /sys/kernel/debug/dsa0/vtu
 VID  FID  SID  0  1  2  3  4  5  6
 550  550    0  x  x  x  u  x  t  x
1000 1000    0  x  x  t  x  x  t  x
1200 1200    0  x  x  t  x  t  t  x
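
For readers who want to see how one row of such a table maps onto an entry, here is a small formatting sketch. The struct and function are hypothetical and written for user space, not the driver's debugfs code. The port count is a parameter, in the spirit of the v4 change away from a hardcoded 7.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX_PORTS 12

/* Hypothetical in-memory view of one VTU row. */
struct demo_vtu_entry {
	unsigned int vid, fid, sid;
	char tags[DEMO_MAX_PORTS];	/* 't', 'u', 'x' or '-' per port */
};

/* Format one row as "VID FID SID tag tag ...". Returns the number of
 * characters written, or -1 if the buffer is too small or num_ports is
 * out of range. */
static int demo_vtu_format_row(const struct demo_vtu_entry *e, int num_ports,
			       char *buf, size_t len)
{
	int n, m, port;

	if (num_ports < 0 || num_ports > DEMO_MAX_PORTS)
		return -1;

	n = snprintf(buf, len, "%4u %4u %4u", e->vid, e->fid, e->sid);
	if (n < 0 || (size_t)n >= len)
		return -1;

	for (port = 0; port < num_ports; port++) {
		m = snprintf(buf + n, len - (size_t)n, "  %c", e->tags[port]);
		if (m < 0 || (size_t)m >= len - (size_t)n)
			return -1;
		n += m;
	}
	return n;
}
```

Feeding it VID 550, FID 550, SID 0 and the tags x x x u x t x for seven ports yields a line with the same columns as the first row of the table.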

v4: return -EOPNOTSUPP in switchdev prepare phase for unsupported objects;
handle num_ports in VTU GetNext / LoadPurge operations, instead of hardcoded 7.

Cheers,
  -v

Vivien Didelot (3):
  net: dsa: mv88e6xxx: add debugfs interface for VTU
  net: dsa: add support for switchdev VLAN objects
  net: dsa: mv88e6xxx: add switchdev VLAN operations

 drivers/net/dsa/mv88e6123_61_65.c |   3 +
 drivers/net/dsa/mv88e6131.c   |   3 +
 drivers/net/dsa/mv88e6171.c   |   3 +
 drivers/net/dsa/mv88e6352.c   |   3 +
 drivers/net/dsa/mv88e6xxx.c   | 476 ++
 drivers/net/dsa/mv88e6xxx.h   |  36 +++
 include/net/dsa.h |   9 +
 net/dsa/dsa_priv.h|   6 +
 net/dsa/slave.c   | 142 
 9 files changed, 681 insertions(+)

-- 
2.4.5


[PATCH v4 1/3] net: dsa: mv88e6xxx: add debugfs interface for VTU

2015-07-07 Thread Vivien Didelot

Implement the Get Next and Load Purge operations for the VLAN Table
Unit, and a vtu debugfs file to read and write the hardware VLANs.

A populated VTU looks like this:

# cat /sys/kernel/debug/dsa0/vtu
 VID  FID  SID  0  1  2  3  4  5  6
 550  562    0  x  x  x  u  x  t  x
1000 1012    0  x  x  t  x  x  t  x
1200 1212    0  x  x  t  x  t  t  x

Where "t", "u", "x" and "-" mean, respectively, that the port is tagged,
untagged, excluded or unmodified for a given VLAN entry.

VTU entries can be added by echoing the same format:

echo 1300 1312 0 x x t x t t x > vtu

and can be deleted by echoing only the VID:

echo 1000 > vtu
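
A parser for this line format might look like the following. This is a user-space sketch with hypothetical names, not the debugfs write handler from the patch. It accepts either a bare VID (delete) or VID, FID, SID followed by one tag character per port.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_PORTS 12

/* Hypothetical parsed command. num_tags == 0 means "delete this VID". */
struct demo_vtu_cmd {
	unsigned int vid, fid, sid;
	int num_tags;
	char tags[DEMO_MAX_PORTS];
};

/* Parse "VID" or "VID FID SID tag tag ...". Returns 0 on success and -1
 * on malformed input. */
static int demo_vtu_parse(const char *line, struct demo_vtu_cmd *cmd)
{
	const char *p;
	int used = 0;

	memset(cmd, 0, sizeof(*cmd));

	if (sscanf(line, "%u%n", &cmd->vid, &used) != 1 || cmd->vid > 4095)
		return -1;

	p = line + used;
	p += strspn(p, " \t\n");
	if (*p == '\0')
		return 0;	/* VID only: delete request */

	if (sscanf(p, "%u %u%n", &cmd->fid, &cmd->sid, &used) != 2)
		return -1;
	p += used;

	for (;;) {
		p += strspn(p, " \t\n");
		if (*p == '\0')
			break;
		/* Each remaining character must be one of the four tags. */
		if (!strchr("tux-", *p) || cmd->num_tags == DEMO_MAX_PORTS)
			return -1;
		cmd->tags[cmd->num_tags++] = *p++;
	}

	/* An add request needs at least one port tag. */
	return cmd->num_tags ? 0 : -1;
}
```

The sketch is lenient about whitespace between tags. A stricter parser could also check that the number of tags equals the switch's port count.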

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 drivers/net/dsa/mv88e6xxx.c | 322 
 drivers/net/dsa/mv88e6xxx.h |  31 +
 2 files changed, 353 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 8c130c0..049553c 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2,6 +2,9 @@
  * net/dsa/mv88e6xxx.c - Marvell 88e6xxx switch chip support
  * Copyright (c) 2008 Marvell Semiconductor
  *
+ * Copyright (c) 2015 CMC Electronics, Inc.
+ * Added support for 802.1Q VLAN Table Unit
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -1366,6 +1369,192 @@ static void mv88e6xxx_bridge_work(struct work_struct *work)
}
 }
 
+static int _mv88e6xxx_vtu_wait(struct dsa_switch *ds)
+{
+   return _mv88e6xxx_wait(ds, REG_GLOBAL, GLOBAL_VTU_OP,
+  GLOBAL_VTU_OP_BUSY);
+}
+
+static int _mv88e6xxx_vtu_cmd(struct dsa_switch *ds, u16 op)
+{
+   int ret;
+
+   ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_OP, op);
+   if (ret < 0)
+   return ret;
+
+   return _mv88e6xxx_vtu_wait(ds);
+}
+
+static int _mv88e6xxx_stu_loadpurge(struct dsa_switch *ds, u8 sid, bool valid)
+{
+   int ret, data;
+
+   ret = _mv88e6xxx_vtu_wait(ds);
+   if (ret < 0)
+   return ret;
+
+   data = sid & GLOBAL_VTU_SID_MASK;
+   if (valid)
+   data |= GLOBAL_VTU_VID_VALID;
+
+   ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID, data);
+   if (ret < 0)
+   return ret;
+
+   /* Unused (yet) data registers */
+   ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3, 0);
+   if (ret < 0)
+   return ret;
+
+   ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7, 0);
+   if (ret < 0)
+   return ret;
+
+   ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_8_11, 0);
+   if (ret < 0)
+   return ret;
+
+   return _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_STU_LOAD_PURGE);
+}
+
+static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds, u16 vid,
+ struct mv88e6xxx_vtu_entry *entry)
+{
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+   struct mv88e6xxx_vtu_entry next = { 0 };
+   int ret;
+
+   ret = _mv88e6xxx_vtu_wait(ds);
+   if (ret < 0)
+   return ret;
+
+   ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID,
+  vid & GLOBAL_VTU_VID_MASK);
+   if (ret < 0)
+   return ret;
+
+   ret = _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_VTU_GET_NEXT);
+   if (ret < 0)
+   return ret;
+
+   ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_VID);
+   if (ret < 0)
+   return ret;
+
+   next.vid = ret & GLOBAL_VTU_VID_MASK;
+   next.valid = !!(ret & GLOBAL_VTU_VID_VALID);
+
+   if (next.valid) {
+   u16 data[3];
+   int port;
+
+   ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3);
+   if (ret < 0)
+   return ret;
+   data[0] = ret;
+   ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7);
+   if (ret < 0)
+   return ret;
+   data[1] = ret;
+   ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_8_11);
+   if (ret < 0)
+   return ret;
+   data[2] = ret;
+
+   for (port = 0; port < ps->num_ports; ++port) {
+   int reg = data[port / 4];
+
+   next.tags[port] =
+   GLOBAL_VTU_DATA_MEMBER_TAG_UNMASK(port, reg);
+   }
+
+   if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
+   mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds)) {
+   ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
+ GLOBAL_VTU_FID);
+   if (ret < 0)
+

[PATCH v4 2/3] net: dsa: add support for switchdev VLAN objects

2015-07-07 Thread Vivien Didelot

This patch adds the glue between DSA and switchdev operations to add,
delete and dump SWITCHDEV_OBJ_PORT_VLAN objects.

This is a first step to link the bridge vlan command with hardware
entries for DSA compatible switch chips.
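
The core of that glue is a loop over a VID range that calls the driver hook and records membership in a per-port bitmap. A stand-alone sketch of that pattern follows. It uses user-space stand-ins for the kernel's bitmap helpers, and every `demo_` name, including the stub hardware callback, is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_VLAN_N_VID 4096
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITMAP_LONGS \
	((DEMO_VLAN_N_VID + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Hypothetical per-port state: which VLANs the port is a member of. */
struct demo_port {
	unsigned long vlan_bitmap[DEMO_BITMAP_LONGS];
};

typedef int (*demo_vlan_add_fn)(int port, int vid);

static void demo_set_bit(unsigned long *map, unsigned int nr)
{
	map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

static int demo_test_bit(const unsigned long *map, unsigned int nr)
{
	return (int)((map[nr / DEMO_BITS_PER_LONG] >>
		      (nr % DEMO_BITS_PER_LONG)) & 1UL);
}

/* Stub "hardware" hook: rejects the reserved VIDs 0 and 4095. */
static int demo_hw_vlan_add(int port, int vid)
{
	(void)port;
	return (vid <= 0 || vid >= 4095) ? -EINVAL : 0;
}

/* Add every VID in [vid_begin, vid_end], stopping at the first error.
 * VIDs added before the failure stay recorded in the bitmap. */
static int demo_port_vlans_add(struct demo_port *p, int port,
			       int vid_begin, int vid_end,
			       demo_vlan_add_fn hw_add)
{
	int vid, err = 0;

	if (!hw_add)
		return -EOPNOTSUPP;
	if (vid_begin < 0 || vid_end >= DEMO_VLAN_N_VID)
		return -ERANGE;

	for (vid = vid_begin; vid <= vid_end; ++vid) {
		err = hw_add(port, vid);
		if (err)
			break;
		demo_set_bit(p->vlan_bitmap, (unsigned int)vid);
	}
	return err;
}
```

Note the consequence of stopping at the first error: a range request can be applied partially, and the sketch makes no attempt to roll back.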

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 include/net/dsa.h  |   9 
 net/dsa/dsa_priv.h |   6 +++
 net/dsa/slave.c| 142 +
 3 files changed, 157 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fbca63b..cabf2a5 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -302,6 +302,15 @@ struct dsa_switch_driver {
   const unsigned char *addr, u16 vid);
int (*fdb_getnext)(struct dsa_switch *ds, int port,
   unsigned char *addr, bool *is_static);
+
+   /*
+* VLAN support
+*/
+   int (*port_vlan_add)(struct dsa_switch *ds, int port, u16 vid,
+u16 bridge_flags);
+   int (*port_vlan_del)(struct dsa_switch *ds, int port, u16 vid);
+   int (*port_vlan_dump)(struct dsa_switch *ds, int port, u16 vid,
+ u16 *bridge_flags);
 };
 
 void register_switch_driver(struct dsa_switch_driver *type);
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index d5f1f9b..9029717 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -13,6 +13,7 @@
 
 #include <linux/phy.h>
 #include <linux/netdevice.h>
+#include <linux/if_vlan.h>
 
 struct dsa_device_ops {
netdev_tx_t (*xmit)(struct sk_buff *skb, struct net_device *dev);
@@ -47,6 +48,11 @@ struct dsa_slave_priv {
int old_duplex;
 
struct net_device   *bridge_dev;
+
+   /*
+* Which VLANs this port is a member of.
+*/
+   DECLARE_BITMAP(vlan_bitmap, VLAN_N_VID);
 };
 
 /* dsa.c */
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 04ffad3..1da861e 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -18,6 +18,7 @@
 #include <net/rtnetlink.h>
 #include <net/switchdev.h>
 #include <linux/if_bridge.h>
+#include <linux/if_vlan.h>
 #include "dsa_priv.h"
 
 /* slave mii_bus handling ***/
@@ -363,6 +364,141 @@ static int dsa_slave_port_attr_set(struct net_device *dev,
return ret;
 }
 
+static int dsa_slave_port_vlans_add(struct net_device *dev,
+struct switchdev_obj *obj)
+{
+   struct switchdev_obj_vlan *vlan = &obj->u.vlan;
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_switch *ds = p->parent;
+   int vid, err = 0;
+
+   if (!ds->drv->port_vlan_add)
+   return -EOPNOTSUPP;
+
+   for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
+   err = ds->drv->port_vlan_add(ds, p->port, vid, vlan->flags);
+   if (err)
+   break;
+   set_bit(vid, p->vlan_bitmap);
+   }
+
+   return err;
+}
+
+static int dsa_slave_port_obj_add(struct net_device *dev,
+ struct switchdev_obj *obj)
+{
+   int err = -EOPNOTSUPP;
+
+   /*
+* The DSA drivers don't need to allocate any memory for operations on
+* prepare phase, and they won't fail to HW on commit phase (unless
+* something terrible goes wrong on the MDIO bus, in which case the
+* commit phase wouldn't have been able to predict anyway).
+*
+* If an object is supported, skip the prepare phase by returning 0,
+* otherwise return -EOPNOTSUPP.
+*/
+
+   switch (obj->id) {
+   case SWITCHDEV_OBJ_PORT_VLAN:
+   if (obj->trans == SWITCHDEV_TRANS_PREPARE)
+   return 0;
+
+   if (obj->trans == SWITCHDEV_TRANS_COMMIT)
+   err = dsa_slave_port_vlans_add(dev, obj);
+   break;
+   default:
+   err = -EOPNOTSUPP;
+   break;
+   }
+
+   return err;
+}
+
+static int dsa_slave_port_vlans_del(struct net_device *dev,
+struct switchdev_obj *obj)
+{
+   struct switchdev_obj_vlan *vlan = &obj->u.vlan;
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_switch *ds = p->parent;
+   int vid, err = 0;
+
+   if (!ds->drv->port_vlan_del)
+   return -EOPNOTSUPP;
+
+   for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
+   err = ds->drv->port_vlan_del(ds, p->port, vid);
+   if (err)
+   break;
+   clear_bit(vid, p->vlan_bitmap);
+   }
+
+   return err;
+}
+
+static int dsa_slave_port_obj_del(struct net_device *dev,
+ struct switchdev_obj *obj)
+{
+   int err;
+
+   switch (obj->id) {
+   case SWITCHDEV_OBJ_PORT_VLAN:
+   err = dsa_slave_port_vlans_del(dev, obj);
+   break;
+   default:
+

[PATCH v4 3/3] net: dsa: mv88e6xxx: add switchdev VLAN operations

2015-07-07 Thread Vivien Didelot

This commit implements the switchdev operations to add, delete and dump
VLANs for the Marvell 88E6352 and compatible switch chips.

This allows access to the switch VLAN Table Unit from standard userspace
commands such as "bridge vlan".

A configuration like "1t 2t 3t 4u" for VLAN 10 is achieved like this:

# bridge vlan add dev swp1 vid 10 master
# bridge vlan add dev swp2 vid 10 master
# bridge vlan add dev swp3 vid 10 master
# bridge vlan add dev swp4 vid 10 master untagged pvid

This calls port_vlan_add() for each command. Removing the port 3 from
VLAN 10 is done with:

# bridge vlan del dev swp3 vid 10

This calls port_vlan_del() for port 3. Dumping VLANs is done with:

# bridge vlan show
port    vlan ids
swp0    None
swp0
swp1     10

swp1     10

swp2     10

swp2     10

swp3    None
swp3
swp4     10 PVID Egress Untagged

swp4     10 PVID Egress Untagged

br0     None

This calls port_vlan_dump() for each port.
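
The diff below looks up a VLAN by asking for the entry that follows `vid - 1` (or 0xfff when `vid` is 0) and then checking that the returned VID is the wanted one. The sketch here models that lookup over a plain array. It assumes, based only on that `prev_vid` expression, that starting from 0xfff makes the search begin at the lowest VID. All names are hypothetical and nothing here touches hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of a "get next" query. */
struct demo_vlan {
	uint16_t vid;
	bool valid;
};

/* Model of "get next": the lowest VID in the table that is greater than
 * vid. Starting from 0xfff is assumed to mean "start from the lowest". */
static struct demo_vlan demo_getnext(const uint16_t *table, size_t n,
				     uint16_t vid)
{
	struct demo_vlan next = { 0xfff, false };
	bool from_start = (vid == 0xfff);
	size_t i;

	for (i = 0; i < n; i++) {
		if (!from_start && table[i] <= vid)
			continue;
		if (!next.valid || table[i] < next.vid) {
			next.vid = table[i];
			next.valid = true;
		}
	}
	return next;
}

/* Exact lookup built on top of "get next", as in the prev_vid trick. */
static bool demo_vlan_exists(const uint16_t *table, size_t n, uint16_t vid)
{
	uint16_t prev_vid = vid ? (uint16_t)(vid - 1) : 0xfff;
	struct demo_vlan e = demo_getnext(table, n, prev_vid);

	return e.valid && e.vid == vid;
}
```

When the lookup misses, the caller knows the VLAN is absent and can build a fresh entry, which is what the add path does before loading it.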

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 drivers/net/dsa/mv88e6123_61_65.c |   3 +
 drivers/net/dsa/mv88e6131.c   |   3 +
 drivers/net/dsa/mv88e6171.c   |   3 +
 drivers/net/dsa/mv88e6352.c   |   3 +
 drivers/net/dsa/mv88e6xxx.c   | 154 ++
 drivers/net/dsa/mv88e6xxx.h   |   5 ++
 6 files changed, 171 insertions(+)

diff --git a/drivers/net/dsa/mv88e6123_61_65.c b/drivers/net/dsa/mv88e6123_61_65.c
index 71a29a7..8e679ff 100644
--- a/drivers/net/dsa/mv88e6123_61_65.c
+++ b/drivers/net/dsa/mv88e6123_61_65.c
@@ -134,6 +134,9 @@ struct dsa_switch_driver mv88e6123_61_65_switch_driver = {
 #endif
.get_regs_len   = mv88e6xxx_get_regs_len,
.get_regs   = mv88e6xxx_get_regs,
+   .port_vlan_add  = mv88e6xxx_port_vlan_add,
+   .port_vlan_del  = mv88e6xxx_port_vlan_del,
+   .port_vlan_dump = mv88e6xxx_port_vlan_dump,
 };
 
 MODULE_ALIAS("platform:mv88e6123");
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 32f4a08..c4d914b 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -182,6 +182,9 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
+   .port_vlan_add  = mv88e6xxx_port_vlan_add,
+   .port_vlan_del  = mv88e6xxx_port_vlan_del,
+   .port_vlan_dump = mv88e6xxx_port_vlan_dump,
 };
 
 MODULE_ALIAS("platform:mv88e6085");
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 1c78084..7701ce6 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -119,6 +119,9 @@ struct dsa_switch_driver mv88e6171_switch_driver = {
.fdb_add= mv88e6xxx_port_fdb_add,
.fdb_del= mv88e6xxx_port_fdb_del,
.fdb_getnext= mv88e6xxx_port_fdb_getnext,
+   .port_vlan_add  = mv88e6xxx_port_vlan_add,
+   .port_vlan_del  = mv88e6xxx_port_vlan_del,
+   .port_vlan_dump = mv88e6xxx_port_vlan_dump,
 };
 
 MODULE_ALIAS("platform:mv88e6171");
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 632815c..b981be4a 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -392,6 +392,9 @@ struct dsa_switch_driver mv88e6352_switch_driver = {
.fdb_add= mv88e6xxx_port_fdb_add,
.fdb_del= mv88e6xxx_port_fdb_del,
.fdb_getnext= mv88e6xxx_port_fdb_getnext,
+   .port_vlan_add  = mv88e6xxx_port_vlan_add,
+   .port_vlan_del  = mv88e6xxx_port_vlan_del,
+   .port_vlan_dump = mv88e6xxx_port_vlan_dump,
 };
 
 MODULE_ALIAS("platform:mv88e6352");
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 049553c..c7cd5f4 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -1555,6 +1555,160 @@ static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
return _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_VTU_LOAD_PURGE);
 }
 
+int mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port, u16 vid,
+   u16 bridge_flags)
+{
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+   struct mv88e6xxx_vtu_entry entry = { 0 };
+   int prev_vid = vid ? vid - 1 : 0xfff;
+   int i, ret;
+
+   mutex_lock(&ps->smi_mutex);
+   ret = _mv88e6xxx_vtu_getnext(ds, prev_vid, &entry);
+   if (ret < 0)
+   goto unlock;
+
+   /* If the VLAN does not exist, re-initialize the entry for addition */
+   if (entry.vid != vid || !entry.valid) {
+   memset(&entry, 0, sizeof(entry));
+   entry.valid = true;
+   entry.vid = vid;
+   entry.fid = vid; /* We use one

Re: [PATCH V2] cdc_ncm: Add support for moving NDP to end of NCM frame

2015-07-07 Thread Enrico Mioso


Hi Oliver, hello to whoever is reading this message.

I was re-reading the code and the oops without understanding what the
problem is. Still, what struck me is that at some point you see a NULL
pointer dereference in unrelated code (fbcon). Is it possible that the
portion of memory that ctx->delayed_ndp16 points to (172 bytes if the device
is affected by the NCM errata, and mine is) is somehow moved or thrown away
at some point?
That doesn't make sense, because otherwise even accesses to the ctx variable
would cause problems, and they don't.
Looking around, I see that kzalloc() / kmalloc() (kzalloc = kmalloc | __GFP_ZERO)
are used to allocate memory of any size (the only requirement being that it is
small). In rndis_host.c 1025 bytes (not 1024) are allocated, so I am ruling out
any kind of alignment problem here.


Thank you,
Enrico

Re: [PATCH v2] add stealth mode

2015-07-07 Thread Matteo Croce

2015-07-07 9:01 GMT+02:00 Clemens Ladisch clem...@ladisch.de:
 valdis.kletni...@vt.edu wrote:
 On Thu, 02 Jul 2015 10:56:01 +0200, Matteo Croce said:
 Add option to disable any reply not related to a listening socket

 2) You *do* realize that this isn't anywhere near sufficient in order
 to actually make your machine invisible, right?  (Hint: What *other*
 packets can be sent to a machine to provoke a response?)

 Even worse: if you want to pretend that the entire machine is not there,
 you must make the router in front of you reply with an ICMP destination
 unreachable message.

Sometimes you can't, e.g. on DSL lines where the router in front of
you is an ISP-owned DSLAM.

-- 
Matteo Croce
OpenWrt Developer

Re: [PATCH v2] add stealth mode

2015-07-07 Thread Matteo Croce

2015-07-07 10:07 GMT+02:00 Hannes Frederic Sowa han...@stressinduktion.org:


 On Mon, Jul 6, 2015, at 21:44, Matteo Croce wrote:
 2015-07-06 12:49 GMT+02:00  valdis.kletni...@vt.edu:
  On Thu, 02 Jul 2015 10:56:01 +0200, Matteo Croce said:
  Add option to disable any reply not related to a listening socket,
  like RST/ACK for TCP and ICMP Port-Unreachable for UDP.
  Also disables ICMP replies to echo request and timestamp.
  The stealth mode can be enabled selectively for a single interface.
 
  A few notes.
 
  1) Do you have an actual use case where an iptables '-j DROP' isn't usable?

 If you mean using a default DROP policy and allowing only the traffic
 you want, then the use case is where the port can change at runtime
 and you may not want to update the firewall every time.

 Can't you use socket match in netfilter to accomplish exactly that?

You mean the owner --uid match?
Yes, sort of, but my goal was different: I just want to disable any
kind of reply from a specific interface (usually WAN) unless there is
a listening socket, to mitigate port scanning and flood attacks
without having a firewall.

Obviously you can do it with a firewall,
but why do we have /proc/sys/net/ipv4/icmp_echo_ignore_all when we can
drop ICMP echoes?

-- 
Matteo Croce
OpenWrt Developer

[PATCH net-next] tcp: v1 always send a quick ack when quickacks are enabled

2015-07-07 Thread Jon Maxwell

V1 of this patch contains Eric Dumazet's suggestion to move the per
dst RTAX_QUICKACK check into tcp_in_quickack_mode(). Thanks Eric.

I ran some tests and after setting the ip route change quickack 1
knob there were still many delayed ACKs sent. This occurred
because when icsk_ack.quick=0 the !icsk_ack.pingpong value is
subsequently ignored as tcp_in_quickack_mode() checks both these
values. The condition for a quick ack to trigger requires
that both icsk_ack.quick != 0 and icsk_ack.pingpong=0. Currently
only icsk_ack.pingpong is controlled by the knob. But the
icsk_ack.quick value changes dynamically depending on heuristics.
The crux of the matter is that delayed acks still cannot be entirely
disabled even with the RTAX_QUICKACK per dst knob enabled. This
patch ensures that a quick ack is always sent when the RTAX_QUICKACK
per dst knob is turned on.

The ip route change quickack 1 knob was recently added to enable
quickacks. It was modeled around the TCP_QUICKACK setsockopt() option.
The issue is that even with ip route change quickack 1 enabled
we still see delayed ACKs under some conditions. It would be nice
to be able to completely disable delayed ACKs.

Here is an example:

# netstat -s|grep dela
3 delayed acks sent

For all routes enable the knob

# ip route change quickack 1

Generate some traffic across a slow link and we still see the delayed
acks.

# netstat -s|grep dela
106 delayed acks sent
1 delayed acks further delayed because of locked socket

The issue is that both the ip route change quickack 1 knob and
the TCP_QUICKACK option set the icsk_ack.pingpong variable to 0.
However at the business end in the __tcp_ack_snd_check() routine,
tcp_in_quickack_mode() checks that both icsk_ack.quick != 0
and icsk_ack.pingpong=0 in order to trigger a quickack. As
icsk_ack.quick is determined by heuristics it can be 0. When
that occurs the icsk_ack.pingpong value is ignored and a delayed
ACK is sent regardless.

This patch moves the RTAX_QUICKACK per dst check into the
tcp_in_quickack_mode() routine which ensures that a quickack is
always sent when the quickack knob is enabled for that dst.
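
The change in the predicate can be modelled in userspace as a rough
sketch. The struct and function names below are hypothetical stand-ins
(not kernel API): quick models icsk_ack.quick, pingpong models
icsk_ack.pingpong, and route_quickack models a non-zero
dst_metric(dst, RTAX_QUICKACK).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state the patch consults. */
struct ack_state {
    unsigned int quick;   /* models icsk_ack.quick (set by heuristics) */
    bool pingpong;        /* models icsk_ack.pingpong (interactive session) */
    bool route_quickack;  /* models dst_metric(dst, RTAX_QUICKACK) != 0 */
};

/* Before the patch: the route knob is not consulted here at all. */
bool in_quickack_mode_old(const struct ack_state *s)
{
    return s->quick && !s->pingpong;
}

/* After the patch: the route knob overrides the heuristics. */
bool in_quickack_mode_new(const struct ack_state *s)
{
    return s->route_quickack || (s->quick && !s->pingpong);
}
```

With quick == 0 and the knob set, the old predicate is false (delayed
ACK) and the new one is true (quick ACK), which is the case described
above.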

Signed-off-by: Jon Maxwell jmaxwel...@gmail.com
---
 net/ipv4/tcp_input.c  | 11 +--
 net/ipv4/tcp_output.c |  6 ++
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 684f095..b9da527 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -196,11 +196,13 @@ static void tcp_enter_quickack_mode(struct sock *sk)
  * and the session is not interactive.
  */
 
-static inline bool tcp_in_quickack_mode(const struct sock *sk)
+static bool tcp_in_quickack_mode(struct sock *sk)
 {
const struct inet_connection_sock *icsk = inet_csk(sk);
+   const struct dst_entry *dst = __sk_dst_get(sk);
 
-   return icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong;
+   return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
+          (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
 }
 
 static void tcp_ecn_queue_cwr(struct tcp_sock *tp)
@@ -3948,7 +3950,6 @@ void tcp_reset(struct sock *sk)
 static void tcp_fin(struct sock *sk)
 {
struct tcp_sock *tp = tcp_sk(sk);
-   const struct dst_entry *dst;
 
inet_csk_schedule_ack(sk);
 
@@ -3960,9 +3961,7 @@ static void tcp_fin(struct sock *sk)
case TCP_ESTABLISHED:
/* Move to CLOSE_WAIT */
tcp_set_state(sk, TCP_CLOSE_WAIT);
-   dst = __sk_dst_get(sk);
-   if (!dst || !dst_metric(dst, RTAX_QUICKACK))
-       inet_csk(sk)->icsk_ack.pingpong = 1;
+   inet_csk(sk)->icsk_ack.pingpong = 1;
break;
 
case TCP_CLOSE_WAIT:
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b1c218d..7105784 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -163,7 +163,6 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
 {
struct inet_connection_sock *icsk = inet_csk(sk);
const u32 now = tcp_time_stamp;
-   const struct dst_entry *dst = __sk_dst_get(sk);
 
if (sysctl_tcp_slow_start_after_idle &&
    (!tp->packets_out && (s32)(now - tp->lsndtime) > icsk->icsk_rto))
@@ -174,9 +173,8 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
/* If it is a reply for ato after last received
 * packet, enter pingpong mode.
 */
-   if ((u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato &&
-       (!dst || !dst_metric(dst, RTAX_QUICKACK)))
-           icsk->icsk_ack.pingpong = 1;
+   if ((u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato)
+       icsk->icsk_ack.pingpong = 1;
 }
 
 /* Account for an ACK we sent. */
-- 
1.8.3.1


Re: [PATCH net,v2] ip_tunnel: fix ipv4 pmtu check to honor inner ip header df

2015-07-07 Thread Pravin Shelar

On Mon, Jul 6, 2015 at 10:34 PM, Timo Teräs timo.te...@iki.fi wrote:
 Frag needed should be sent only if the inner header asked
 to not fragment. Currently fragmentation is broken if the
 tunnel has df set, but df was not asked in the original
 packet. The tunnel's df needs to be still checked to update
 internally the pmtu cache.

 Commit 23a3647bc4f93bac broke it, and this commit fixes
 the ipv4 df check back to the way it was.

 Fixes: 23a3647bc4f93bac ("ip_tunnels: Use skb->len to PMTU check.")
 Cc: Pravin B Shelar pshe...@nicira.com
 Signed-off-by: Timo Teräs timo.te...@iki.fi
 ---
 Should go to -stable queues (3.12.y and newer).

 v2: revised commit message wording a bit, and added
 signed-off-by line that was forgotten accidentally.


Acked-by: Pravin B Shelar pshe...@nicira.com

Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Paul Bolle

Just two nits.

On ma, 2015-07-06 at 07:47 -0700, Dexuan Cui wrote:
 --- /dev/null
 +++ b/net/hv_sock/Kconfig

 +config HYPERV_SOCK
 + tristate "Microsoft Hyper-V Socket (EXPERIMENTAL)"
 + depends on HYPERV
 + default m
 + help
 +   Hyper-V Socket is a socket protocol similar to TCP, allowing
 +   communication between a Linux guest and the host.
 +
 +   To compile this driver as a module, choose M here: the module
 +   will be called hv_sock. If unsure, say N.

It's a bit odd to advise to say N if one is unsure and set the default
to 'm' at the same time.

 --- /dev/null
 +++ b/net/hv_sock/af_hvsock.c

 +static int hvsock_init(void)
 +{
 + [...]
 +}
 +
 +static void hvsock_exit(void)
 +{
 + [...]
 +}
 +
 +module_init(hvsock_init);
 +module_exit(hvsock_exit);

Any specific reason not to mark these functions __init and __exit?

Thanks,


Paul Bolle

RE: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Dexuan Cui

 -Original Message-
 From: Olaf Hering [mailto:o...@aepfle.de]
 Sent: Tuesday, July 7, 2015 18:10
 To: Dexuan Cui; Paul Bolle
 Cc: gre...@linuxfoundation.org; da...@davemloft.net;
 netdev@vger.kernel.org; linux-ker...@vger.kernel.org; driverdev-
 de...@linuxdriverproject.org; a...@canonical.com; jasow...@redhat.com; KY
 Srinivasan; Haiyang Zhang
 Subject: Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature
 
 On Tue, Jul 07, Paul Bolle wrote:
 
  On ma, 2015-07-06 at 07:47 -0700, Dexuan Cui wrote:
   --- /dev/null
   +++ b/net/hv_sock/Kconfig
 
   +config HYPERV_SOCK
   + tristate Microsoft Hyper-V Socket (EXPERIMENTAL)
   + depends on HYPERV
   + default m
 
  It's a bit odd to advise to say N if one is unsure and set the default
  to 'm' at the same time.
 
 The 'default' line has to be removed IMO.
 
 Olaf

OK, removing the line seems better than 'default n', though both produce
the same "# CONFIG_HYPERV_SOCK is not set" line.

-- Dexuan

[PATCH net 0/3] sfc: compat for lack of VADAPTOR_SET_MAC in adaptor_firmware <= 4.1.1.1023

2015-07-07 Thread Shradha Shah

This patch series resolves an incompatibility with legacy
firmware due to the lack of MC_CMD_VADAPTOR_SET_MAC in
adaptor_firmware <= 4.1.1.1023

Unless this patch series is applied there will be a compatibility
issue between the driver and Solarflare adapters running older
firmware. 

Tested with and without CONFIG_SFC_SRIOV

Daniel Pieczko (3):
  sfc: refactor code in efx_ef10_set_mac_address()
  sfc: add legacy method for changing a PF's MAC address
  sfc: suppress handled MCDI failures when changing the MAC address

 drivers/net/ethernet/sfc/ef10.c   | 172 --
 drivers/net/ethernet/sfc/ef10_sriov.c |  42 -
 drivers/net/ethernet/sfc/ef10_sriov.h |   6 ++
 3 files changed, 151 insertions(+), 69 deletions(-)


RE: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Dexuan Cui

 -Original Message-
 From: Paul Bolle
 Sent: Tuesday, July 7, 2015 17:38
 To: Dexuan Cui
 Subject: Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature
 
 Just two nits.
 
 On ma, 2015-07-06 at 07:47 -0700, Dexuan Cui wrote:
  --- /dev/null
  +++ b/net/hv_sock/Kconfig
 
  +config HYPERV_SOCK
  +   tristate "Microsoft Hyper-V Socket (EXPERIMENTAL)"
  +   depends on HYPERV
  +   default m
  +   help
  + Hyper-V Socket is a socket protocol similar to TCP, allowing
  + communication between a Linux guest and the host.
  +
  + To compile this driver as a module, choose M here: the module
  + will be called hv_sock. If unsure, say N.
 
 It's a bit odd to advise to say N if one is unsure and set the default
 to 'm' at the same time.
Hi Paul,
Thanks for the suggestion!
I'll change the 'default' to n in V2.

  --- /dev/null
  +++ b/net/hv_sock/af_hvsock.c
 
  +static int hvsock_init(void)
  +{
  +   [...]
  +}
  +
  +static void hvsock_exit(void)
  +{
  +   [...]
  +}
  +
  +module_init(hvsock_init);
  +module_exit(hvsock_exit);
 
 Any specific reason not to mark these functions __init and __exit?
 
 Paul Bolle
Thanks for pointing this out -- I missed that. 
I'll add __init and __exit in V2.

Thanks,
-- Dexuan

Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Paul Bolle

On di, 2015-07-07 at 10:20 +, Dexuan Cui wrote:
 OK, removing the line seems better than 'default n', though both 
 reproduce the same # CONFIG_HYPERV_SOCK is not set.

Speaking from memory (so chances are I'm forgetting some silly detail)
that is because
# CONFIG_FOO is not set

will be printed if FOO's dependencies are met and FOO either has a
prompt or a default of 'n'.

Hope this helps,


Paul Bolle

linux-4.2-rc1/samples/bpf/sockex3_kern.c: bad expression ?

2015-07-07 Thread David Binderman

Hello there,

[linux-4.2-rc1/samples/bpf/sockex3_kern.c:268]: (style) Expression '(X & 0xf0) 
== 0x4' is always false.

Source code is

   if ((verlen & 0xF0) == 4)

Maybe

   if ((verlen & 0xF0) == 0x40)
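
A small userspace sketch of why the reported expression can never be
true. It assumes verlen holds the first byte of an IPv4 header (version
in the high nibble, header length in the low nibble); the function names
are hypothetical and not taken from sockex3_kern.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* The check as reported: the mask clears the low nibble, so the
 * result is always a multiple of 16 and can never equal 4. */
bool is_ipv4_reported(uint8_t verlen)
{
    return (verlen & 0xF0) == 4;
}

/* Suggested form: compare against 4 placed in the high nibble. */
bool is_ipv4_masked(uint8_t verlen)
{
    return (verlen & 0xF0) == 0x40;
}

/* Equivalent form: shift the version nibble down before comparing. */
bool is_ipv4_shifted(uint8_t verlen)
{
    return (verlen >> 4) == 4;
}
```

For the common first byte 0x45 the masked and shifted forms both return
true, while the reported form returns false for every possible byte.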

Regards

David Binderman


  --

[PATCH net 3/3] sfc: suppress handled MCDI failures when changing the MAC address

2015-07-07 Thread Shradha Shah

From: Daniel Pieczko dpiec...@solarflare.com

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index e0cb361..605cc89 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -3933,8 +3933,8 @@ static int efx_ef10_set_mac_address(struct efx_nic *efx)
efx->net_dev->dev_addr);
MCDI_SET_DWORD(inbuf, VADAPTOR_SET_MAC_IN_UPSTREAM_PORT_ID,
   nic_data->vport_id);
-   rc = efx_mcdi_rpc(efx, MC_CMD_VADAPTOR_SET_MAC, inbuf,
- sizeof(inbuf), NULL, 0, NULL);
+   rc = efx_mcdi_rpc_quiet(efx, MC_CMD_VADAPTOR_SET_MAC, inbuf,
+   sizeof(inbuf), NULL, 0, NULL);
 
efx_ef10_filter_table_probe(efx);
up_write(&efx->filter_sem);
@@ -3986,6 +3986,9 @@ static int efx_ef10_set_mac_address(struct efx_nic *efx)
 * MCFW do not support VFs.
 */
rc = efx_ef10_vport_set_mac_address(efx);
+   } else {
+   efx_mcdi_display_error(efx, MC_CMD_VADAPTOR_SET_MAC,
+  sizeof(inbuf), NULL, 0, rc);
}
 
return rc;

[PATCH net 2/3] sfc: add legacy method for changing a PF's MAC address

2015-07-07 Thread Shradha Shah

From: Daniel Pieczko dpiec...@solarflare.com

Some versions of MCFW do not support the MC_CMD_VADAPTOR_SET_MAC
command, and ENOSYS will be returned.

If the PF created its own vport, the function's datapath must be
stopped and the vport can be reconfigured to reflect the new MAC
address.

If the MCFW created the vport for the PF (which is the case when
nic_data->vport_mac is blank), nothing further needs to be
done as the vport is not under the control of the PF.

This only applies to PFs because the MCFW in question does not
support VFs.
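
The fallback flow described above, including the rollback to the old
MAC, can be modelled in userspace roughly as follows. This is only a
sketch under stated assumptions: struct fake_fw, its fail_* flags and
legacy_set_mac() are hypothetical and stand in for the MCDI calls; the
datapath stop/start and filter table handling are left out.

```c
#include <stdbool.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical fake firmware state; fail_* force a step to fail. */
struct fake_fw {
    bool fail_del;
    bool fail_add_new;
    unsigned char vport_mac[MAC_LEN]; /* all zero: vport created by MCFW */
    bool vadaptor_present;
};

static bool mac_is_zero(const unsigned char *m)
{
    for (int i = 0; i < MAC_LEN; i++)
        if (m[i])
            return false;
    return true;
}

int legacy_set_mac(struct fake_fw *fw, const unsigned char *new_mac)
{
    unsigned char old[MAC_LEN];
    int rc = 0;

    /* Only reconfigure a vport that the PF created itself. */
    if (mac_is_zero(fw->vport_mac))
        return 0;

    fw->vadaptor_present = false;          /* free the vadaptor */
    memcpy(old, fw->vport_mac, MAC_LEN);   /* keep old MAC for rollback */

    if (fw->fail_del) {                    /* delete the old MAC */
        rc = -1;
        goto restore_vadaptor;
    }
    memset(fw->vport_mac, 0, MAC_LEN);

    if (!fw->fail_add_new) {               /* add the new MAC */
        memcpy(fw->vport_mac, new_mac, MAC_LEN);
    } else {
        rc = -1;
        memcpy(fw->vport_mac, old, MAC_LEN); /* roll back to old MAC */
    }

restore_vadaptor:
    fw->vadaptor_present = true;           /* re-create the vadaptor */
    return rc;
}
```

In this model a failed add leaves the old MAC in place and still
restores the vadaptor; the real patch additionally resets the NIC when
the rollback itself fails.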

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c   | 120 ++
 drivers/net/ethernet/sfc/ef10_sriov.c |  42 
 drivers/net/ethernet/sfc/ef10_sriov.h |   6 ++
 3 files changed, 126 insertions(+), 42 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 9740cd0..e0cb361 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -101,6 +101,11 @@ static unsigned int efx_ef10_mem_map_size(struct efx_nic 
*efx)
return resource_size(&efx->pci_dev->resource[bar]);
 }
 
+static bool efx_ef10_is_vf(struct efx_nic *efx)
+{
+   return efx->type->is_vf;
+}
+
 static int efx_ef10_get_pf_index(struct efx_nic *efx)
 {
MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_FUNCTION_INFO_OUT_LEN);
@@ -677,6 +682,48 @@ static int efx_ef10_probe_pf(struct efx_nic *efx)
return efx_ef10_probe(efx);
 }
 
+int efx_ef10_vadaptor_alloc(struct efx_nic *efx, unsigned int port_id)
+{
+   MCDI_DECLARE_BUF(inbuf, MC_CMD_VADAPTOR_ALLOC_IN_LEN);
+
+   MCDI_SET_DWORD(inbuf, VADAPTOR_ALLOC_IN_UPSTREAM_PORT_ID, port_id);
+   return efx_mcdi_rpc(efx, MC_CMD_VADAPTOR_ALLOC, inbuf, sizeof(inbuf),
+   NULL, 0, NULL);
+}
+
+int efx_ef10_vadaptor_free(struct efx_nic *efx, unsigned int port_id)
+{
+   MCDI_DECLARE_BUF(inbuf, MC_CMD_VADAPTOR_FREE_IN_LEN);
+
+   MCDI_SET_DWORD(inbuf, VADAPTOR_FREE_IN_UPSTREAM_PORT_ID, port_id);
+   return efx_mcdi_rpc(efx, MC_CMD_VADAPTOR_FREE, inbuf, sizeof(inbuf),
+   NULL, 0, NULL);
+}
+
+int efx_ef10_vport_add_mac(struct efx_nic *efx,
+  unsigned int port_id, u8 *mac)
+{
+   MCDI_DECLARE_BUF(inbuf, MC_CMD_VPORT_ADD_MAC_ADDRESS_IN_LEN);
+
+   MCDI_SET_DWORD(inbuf, VPORT_ADD_MAC_ADDRESS_IN_VPORT_ID, port_id);
+   ether_addr_copy(MCDI_PTR(inbuf, VPORT_ADD_MAC_ADDRESS_IN_MACADDR), mac);
+
+   return efx_mcdi_rpc(efx, MC_CMD_VPORT_ADD_MAC_ADDRESS, inbuf,
+   sizeof(inbuf), NULL, 0, NULL);
+}
+
+int efx_ef10_vport_del_mac(struct efx_nic *efx,
+  unsigned int port_id, u8 *mac)
+{
+   MCDI_DECLARE_BUF(inbuf, MC_CMD_VPORT_DEL_MAC_ADDRESS_IN_LEN);
+
+   MCDI_SET_DWORD(inbuf, VPORT_DEL_MAC_ADDRESS_IN_VPORT_ID, port_id);
+   ether_addr_copy(MCDI_PTR(inbuf, VPORT_DEL_MAC_ADDRESS_IN_MACADDR), mac);
+
+   return efx_mcdi_rpc(efx, MC_CMD_VPORT_DEL_MAC_ADDRESS, inbuf,
+   sizeof(inbuf), NULL, 0, NULL);
+}
+
 #ifdef CONFIG_SFC_SRIOV
 static int efx_ef10_probe_vf(struct efx_nic *efx)
 {
@@ -3804,6 +3851,72 @@ static void efx_ef10_filter_sync_rx_mode(struct efx_nic 
*efx)
WARN_ON(remove_failed);
 }
 
+static int efx_ef10_vport_set_mac_address(struct efx_nic *efx)
+{
+   struct efx_ef10_nic_data *nic_data = efx->nic_data;
+   u8 mac_old[ETH_ALEN];
+   int rc, rc2;
+
+   /* Only reconfigure a PF-created vport */
+   if (is_zero_ether_addr(nic_data->vport_mac))
+       return 0;
+
+   efx_device_detach_sync(efx);
+   efx_net_stop(efx->net_dev);
+   down_write(&efx->filter_sem);
+   efx_ef10_filter_table_remove(efx);
+   up_write(&efx->filter_sem);
+
+   rc = efx_ef10_vadaptor_free(efx, nic_data->vport_id);
+   if (rc)
+       goto restore_filters;
+
+   ether_addr_copy(mac_old, nic_data->vport_mac);
+   rc = efx_ef10_vport_del_mac(efx, nic_data->vport_id,
+                               nic_data->vport_mac);
+   if (rc)
+       goto restore_vadaptor;
+
+   rc = efx_ef10_vport_add_mac(efx, nic_data->vport_id,
+                               efx->net_dev->dev_addr);
+   if (!rc) {
+       ether_addr_copy(nic_data->vport_mac, efx->net_dev->dev_addr);
+   } else {
+       rc2 = efx_ef10_vport_add_mac(efx, nic_data->vport_id, mac_old);
+       if (rc2) {
+           /* Failed to add original MAC, so clear vport_mac */
+           eth_zero_addr(nic_data->vport_mac);
+           goto reset_nic;
+       }
+   }
+
+restore_vadaptor:
+   rc2 = efx_ef10_vadaptor_alloc(efx, nic_data->vport_id);
+   if (rc2)
+       goto reset_nic;
+restore_filters:
+   down_write(&efx->filter_sem);
+   rc2 = efx_ef10_filter_table_probe(efx);
+   up_write(&efx->filter_sem);
+   if (rc2)

[PATCH net-next 2/4] cxgb4: Update register ranges for T6 adapter

2015-07-07 Thread Hariprasad Shenai

Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c 
b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 2b52aae..ba2be1e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -1345,9 +1345,9 @@ void t4_get_regs(struct adapter *adap, void *buf, size_t 
buf_size)
0x5a80, 0x5a9c,
0x5b94, 0x5bfc,
0x5c10, 0x5ec0,
-   0x5ec8, 0x5ec8,
+   0x5ec8, 0x5ecc,
0x6000, 0x6040,
-   0x6058, 0x6154,
+   0x6058, 0x615c,
0x7700, 0x7798,
0x77c0, 0x7880,
0x78cc, 0x78fc,
@@ -1371,20 +1371,22 @@ void t4_get_regs(struct adapter *adap, void *buf, 
size_t buf_size)
0x9f00, 0x9f6c,
0x9f80, 0xa020,
0xd004, 0xd03c,
+   0xd100, 0xd118,
+   0xd200, 0xd31c,
0xdfc0, 0xdfe0,
0xe000, 0xf008,
0x11000, 0x11014,
0x11048, 0x0,
0x8, 0x1117c,
-   0x11190, 0x11260,
+   0x11190, 0x11264,
0x11300, 0x1130c,
-   0x12000, 0x1205c,
+   0x12000, 0x1206c,
0x19040, 0x1906c,
0x19078, 0x19080,
0x1908c, 0x19124,
0x19150, 0x191b0,
0x191d0, 0x191e8,
-   0x19238, 0x192b8,
+   0x19238, 0x192bc,
0x193f8, 0x19474,
0x19490, 0x194cc,
0x194f0, 0x194f8,
@@ -1466,7 +1468,7 @@ void t4_get_regs(struct adapter *adap, void *buf, size_t 
buf_size)
0x30200, 0x30318,
0x30400, 0x3052c,
0x30540, 0x3061c,
-   0x30800, 0x3088c,
+   0x30800, 0x30890,
0x308c0, 0x30908,
0x30910, 0x309b8,
0x30a00, 0x30a04,
@@ -1544,7 +1546,7 @@ void t4_get_regs(struct adapter *adap, void *buf, size_t 
buf_size)
0x34200, 0x34318,
0x34400, 0x3452c,
0x34540, 0x3461c,
-   0x34800, 0x3488c,
+   0x34800, 0x34890,
0x348c0, 0x34908,
0x34910, 0x349b8,
0x34a00, 0x34a04,
-- 
2.3.4


[PATCH net-next 1/4] cxgb4: Don't use entire L2T table, use only its slice

2015-07-07 Thread Hariprasad Shenai

The driver was retrieving the parameters for the bounds of its
slice of the L2T from the firmware and then throwing those away and
using the entire table. This corrects that problem.
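
As a rough userspace illustration of hashing into a slice whose size is
only known at runtime: IPv4 keys land in the first half, IPv6 keys in
the second half, and the slice start is added to get the hardware index.
struct l2t_slice and mix() are hypothetical; the driver uses
jhash_2words() and its own l2t_data structure.

```c
#include <stdint.h>

/* Hypothetical stand-in for the bounds obtained from firmware. */
struct l2t_slice {
    unsigned int start;  /* first hardware index of our slice */
    unsigned int size;   /* entries in the slice, assumed >= 2 */
};

/* Hypothetical mixing function used only for this sketch. */
static uint32_t mix(uint32_t a, uint32_t b)
{
    uint32_t h = (a * 2654435761u) ^ (b * 40503u);

    return h ^ (h >> 15);
}

/* IPv4 neighbours hash into the first half of the slice. */
unsigned int v4_bucket(const struct l2t_slice *t, uint32_t addr, int ifindex)
{
    unsigned int half = t->size / 2;

    return mix(addr, (uint32_t)ifindex) % half;
}

/* IPv6 neighbours hash into the second half of the slice. */
unsigned int v6_bucket(const struct l2t_slice *t, const uint32_t addr[4],
                       int ifindex)
{
    unsigned int half = t->size / 2;
    uint32_t x = addr[0] ^ addr[1] ^ addr[2] ^ addr[3];

    return half + mix(x, (uint32_t)ifindex) % half;
}

/* Index within the slice -> index within the whole hardware table. */
unsigned int hw_index(const struct l2t_slice *t, unsigned int idx)
{
    return t->start + idx;
}
```

Using a modulo rather than a mask keeps the buckets in range even when
the slice size is not a power of two.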

Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/l2t.c| 94 ++---
 drivers/net/ethernet/chelsio/cxgb4/l2t.h| 18 -
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h  |  1 -
 4 files changed, 71 insertions(+), 44 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index c64b5a9..324244a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4760,7 +4760,7 @@ static int init_one(struct pci_dev *pdev, const struct 
pci_device_id *ent)
 */
cfg_queues(adapter);
 
-   adapter->l2t = t4_init_l2t();
+   adapter->l2t = t4_init_l2t(adapter->l2t_start, adapter->l2t_end);
if (!adapter->l2t) {
/* We tolerate a lack of L2T, giving up some functionality */
dev_warn(&pdev->dev, "could not allocate L2T, continuing\n");
diff --git a/drivers/net/ethernet/chelsio/cxgb4/l2t.c 
b/drivers/net/ethernet/chelsio/cxgb4/l2t.c
index 252efc2..ac27898 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/l2t.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/l2t.c
@@ -51,24 +51,17 @@
 #define VLAN_NONE 0xfff
 
 /* identifies sync vs async L2T_WRITE_REQs */
-#define F_SYNC_WR    (1 << 12)
-
-enum {
-   L2T_STATE_VALID,  /* entry is up to date */
-   L2T_STATE_STALE,  /* entry may be used but needs revalidation */
-   L2T_STATE_RESOLVING,  /* entry needs address resolution */
-   L2T_STATE_SYNC_WRITE, /* synchronous write of entry underway */
-
-   /* when state is one of the below the entry is not hashed */
-   L2T_STATE_SWITCHING,  /* entry is being used by a switching filter */
-   L2T_STATE_UNUSED  /* entry not in use */
-};
+#define SYNC_WR_S    12
+#define SYNC_WR_V(x) ((x) << SYNC_WR_S)
+#define SYNC_WR_F    SYNC_WR_V(1)
 
 struct l2t_data {
+   unsigned int l2t_start; /* start index of our piece of the L2T */
+   unsigned int l2t_size;  /* number of entries in l2tab */
rwlock_t lock;
atomic_t nfree; /* number of free entries */
struct l2t_entry *rover;/* starting point for next allocation */
-   struct l2t_entry l2tab[L2T_SIZE];
+   struct l2t_entry l2tab[0];  /* MUST BE LAST */
 };
 
 static inline unsigned int vlan_prio(const struct l2t_entry *e)
@@ -85,29 +78,36 @@ static inline void l2t_hold(struct l2t_data *d, struct 
l2t_entry *e)
 /*
  * To avoid having to check address families we do not allow v4 and v6
  * neighbors to be on the same hash chain.  We keep v4 entries in the first
- * half of available hash buckets and v6 in the second.
+ * half of available hash buckets and v6 in the second.  We need at least two
+ * entries in our L2T for this scheme to work.
  */
 enum {
-   L2T_SZ_HALF = L2T_SIZE / 2,
-   L2T_HASH_MASK = L2T_SZ_HALF - 1
+   L2T_MIN_HASH_BUCKETS = 2,
 };
 
-static inline unsigned int arp_hash(const u32 *key, int ifindex)
+static inline unsigned int arp_hash(struct l2t_data *d, const u32 *key,
+   int ifindex)
 {
-   return jhash_2words(*key, ifindex, 0) & L2T_HASH_MASK;
+   unsigned int l2t_size_half = d->l2t_size / 2;
+
+   return jhash_2words(*key, ifindex, 0) % l2t_size_half;
 }
 
-static inline unsigned int ipv6_hash(const u32 *key, int ifindex)
+static inline unsigned int ipv6_hash(struct l2t_data *d, const u32 *key,
+int ifindex)
 {
+   unsigned int l2t_size_half = d->l2t_size / 2;
u32 xor = key[0] ^ key[1] ^ key[2] ^ key[3];
 
-   return L2T_SZ_HALF + (jhash_2words(xor, ifindex, 0) & L2T_HASH_MASK);
+   return (l2t_size_half +
+   (jhash_2words(xor, ifindex, 0) % l2t_size_half));
 }
 
-static unsigned int addr_hash(const u32 *addr, int addr_len, int ifindex)
+static unsigned int addr_hash(struct l2t_data *d, const u32 *addr,
+ int addr_len, int ifindex)
 {
-   return addr_len == 4 ? arp_hash(addr, ifindex) :
-  ipv6_hash(addr, ifindex);
+   return addr_len == 4 ? arp_hash(d, addr, ifindex) :
+  ipv6_hash(d, addr, ifindex);
 }
 
 /*
@@ -139,6 +139,8 @@ static void neigh_replace(struct l2t_entry *e, struct 
neighbour *n)
  */
 static int write_l2e(struct adapter *adap, struct l2t_entry *e, int sync)
 {
+   struct l2t_data *d = adap->l2t;
+   unsigned int l2t_idx = e->idx + d->l2t_start;
struct sk_buff *skb;
struct cpl_l2t_write_req *req;
 
@@ -150,10 +152,10 @@ static int write_l2e(struct adapter *adap, struct 
l2t_entry *e, int sync)
INIT_TP_WR(req, 0);

[PATCH net-next 4/4] cxgb4: Enable cim_la dump to support T6

2015-07-07 Thread Hariprasad Shenai

Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 54 --
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 484eb8c..42d48dd 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -151,6 +151,45 @@ static int cim_la_show_3in1(struct seq_file *seq, void *v, 
int idx)
return 0;
 }
 
+static int cim_la_show_t6(struct seq_file *seq, void *v, int idx)
+{
+   if (v == SEQ_START_TOKEN) {
+       seq_puts(seq, "Status   InstData  PC LS0Stat  "
+                "LS0Addr  LS0Data  LS1Stat  LS1Addr  LS1Data\n");
+   } else {
+       const u32 *p = v;
+
+       seq_printf(seq, "  %02x   %04x%04x %04x%04x %04x%04x %08x %08x %08x %08x %08x %08x\n",
+                  (p[9] >> 16) & 0xff,       /* Status */
+                  p[9] & 0xffff, p[8] >> 16, /* Inst */
+                  p[8] & 0xffff, p[7] >> 16, /* Data */
+                  p[7] & 0xffff, p[6] >> 16, /* PC */
+                  p[2], p[1], p[0],  /* LS0 Stat, Addr and Data */
+                  p[5], p[4], p[3]); /* LS1 Stat, Addr and Data */
+   }
+   return 0;
+}
+
+static int cim_la_show_pc_t6(struct seq_file *seq, void *v, int idx)
+{
+   if (v == SEQ_START_TOKEN) {
+       seq_puts(seq, "Status   InstData  PC\n");
+   } else {
+       const u32 *p = v;
+
+       seq_printf(seq, "  %02x   %08x %08x %08x\n",
+                  p[3] & 0xff, p[2], p[1], p[0]);
+       seq_printf(seq, "  %02x   %02x%06x %02x%06x %02x%06x\n",
+                  (p[6] >> 8) & 0xff, p[6] & 0xff, p[5] >> 8,
+                  p[5] & 0xff, p[4] >> 8, p[4] & 0xff, p[3] >> 8);
+       seq_printf(seq, "  %02x   %04x%04x %04x%04x %04x%04x\n",
+                  (p[9] >> 16) & 0xff, p[9] & 0xffff, p[8] >> 16,
+                  p[8] & 0xffff, p[7] >> 16, p[7] & 0xffff,
+                  p[6] >> 16);
+   }
+   return 0;
+}
+
+
 static int cim_la_open(struct inode *inode, struct file *file)
 {
int ret;
@@ -162,9 +201,18 @@ static int cim_la_open(struct inode *inode, struct file 
*file)
if (ret)
return ret;
 
-   p = seq_open_tab(file, adap->params.cim_la_size / 8, 8 * sizeof(u32), 1,
-                    cfg & UPDBGLACAPTPCONLY_F ?
-                    cim_la_show_3in1 : cim_la_show);
+   if (is_t6(adap->params.chip)) {
+       /* +1 to account for integer division of CIMLA_SIZE/10 */
+       p = seq_open_tab(file, (adap->params.cim_la_size / 10) + 1,
+                        10 * sizeof(u32), 1,
+                        cfg & UPDBGLACAPTPCONLY_F ?
+                        cim_la_show_pc_t6 : cim_la_show_t6);
+   } else {
+       p = seq_open_tab(file, adap->params.cim_la_size / 8,
+                        8 * sizeof(u32), 1,
+                        cfg & UPDBGLACAPTPCONLY_F ? cim_la_show_3in1 :
+                        cim_la_show);
+   }
if (!p)
return -ENOMEM;
 
-- 
2.3.4


[PATCH net-next 0/4] Cleanup, T6 changes and register range update

2015-07-07 Thread Hariprasad Shenai

Hi,

This patch series adds the following:
Don't use entire L2T table, update register ranges for T6 adapter,
read stats for only available channels for T6 and enable cim_la dump for
T6 adapter also.

This patch series has been created against net-next tree and includes
patches on cxgb4 driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.

Thanks

Hariprasad Shenai (4):
  cxgb4: Don't use entire L2T table, use only its slice
  cxgb4: Update register ranges for T6 adapter
  cxgb4: Read stats for only available channels
  cxgb4: Enable cim_la dump to support T6

 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 54 -
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c|  2 +-
 drivers/net/ethernet/chelsio/cxgb4/l2t.c   | 94 +-
 drivers/net/ethernet/chelsio/cxgb4/l2t.h   | 18 -
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 89 
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h |  1 -
 6 files changed, 157 insertions(+), 101 deletions(-)

-- 
2.3.4


[PATCH net-next 3/4] cxgb4: Read stats for only available channels

2015-07-07 Thread Hariprasad Shenai

Update the driver to read stats only for the available channels. T6 and
later adapters have only 2 channels.
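
The idea can be sketched in userspace as reading a runtime channel count
worth of counters instead of a hard-coded number. All names here are
hypothetical: MAX_NCHAN is an assumed array capacity, and
read_indirect() merely copies from a fake MIB array rather than
performing indirect register reads.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NCHAN 4  /* assumption: capacity for the largest adapter */

/* Hypothetical stand-in for one per-channel counter group. */
struct err_stats {
    uint32_t mac_in_errs[MAX_NCHAN];
};

/* Hypothetical reader: copies nvals counters from a fake MIB. */
static void read_indirect(const uint32_t *mib, uint32_t *vals, size_t nvals)
{
    for (size_t i = 0; i < nvals; i++)
        vals[i] = mib[i];
}

/* Read only as many per-channel counters as the adapter has channels. */
void get_err_stats(const uint32_t *mib, unsigned int nchan,
                   struct err_stats *st)
{
    if (nchan > MAX_NCHAN)
        nchan = MAX_NCHAN;  /* never write past the array */
    read_indirect(mib, st->mac_in_errs, nchan);
}
```

With nchan set to 2, the last two array slots are left untouched, which
mirrors the intent of not reading channels the adapter does not have.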

Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 73 +++---
 1 file changed, 26 insertions(+), 47 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c 
b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index ba2be1e..1e6597d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -3926,43 +3926,25 @@ void t4_tp_get_tcp_stats(struct adapter *adap, struct 
tp_tcp_stats *v4,
  */
 void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st)
 {
-   /* T6 and later has 2 channels */
-   if (adap->params.arch.nchan == NCHAN) {
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->mac_in_errs, 12, TP_MIB_MAC_IN_ERR_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->tnl_cong_drops, 8,
-                        TP_MIB_TNL_CNG_DROP_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->tnl_tx_drops, 4,
-                        TP_MIB_TNL_DROP_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->ofld_vlan_drops, 4,
-                        TP_MIB_OFD_VLN_DROP_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->tcp6_in_errs, 4,
-                        TP_MIB_TCP_V6IN_ERR_0_A);
-   } else {
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->mac_in_errs, 2, TP_MIB_MAC_IN_ERR_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->hdr_in_errs, 2, TP_MIB_HDR_IN_ERR_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->tcp_in_errs, 2, TP_MIB_TCP_IN_ERR_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->tnl_cong_drops, 2,
-                        TP_MIB_TNL_CNG_DROP_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->ofld_chan_drops, 2,
-                        TP_MIB_OFD_CHN_DROP_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->tnl_tx_drops, 2, TP_MIB_TNL_DROP_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->ofld_vlan_drops, 2,
-                        TP_MIB_OFD_VLN_DROP_0_A);
-       t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
-                        st->tcp6_in_errs, 2, TP_MIB_TCP_V6IN_ERR_0_A);
-   }
+   int nchan = adap->params.arch.nchan;
+
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->mac_in_errs, nchan, TP_MIB_MAC_IN_ERR_0_A);
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->hdr_in_errs, nchan, TP_MIB_HDR_IN_ERR_0_A);
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->tcp_in_errs, nchan, TP_MIB_TCP_IN_ERR_0_A);
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->tnl_cong_drops, nchan, TP_MIB_TNL_CNG_DROP_0_A);
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->ofld_chan_drops, nchan, TP_MIB_OFD_CHN_DROP_0_A);
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->tnl_tx_drops, nchan, TP_MIB_TNL_DROP_0_A);
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->ofld_vlan_drops, nchan, TP_MIB_OFD_VLN_DROP_0_A);
+   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
+                    st->tcp6_in_errs, nchan, TP_MIB_TCP_V6IN_ERR_0_A);
+
t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A,
 &st->ofld_no_neigh, 2, TP_MIB_OFD_ARP_DROP_A);
 }
@@ -3976,16 +3958,13 @@ void t4_tp_get_err_stats(struct adapter *adap, struct tp_err_stats *st)
  */
 void t4_tp_get_cpl_stats(struct adapter *adap, struct tp_cpl_stats *st)
 {
-   /* T6 and later has 2 channels */
-   if (adap->params.arch.nchan == NCHAN) {
-   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A, st->req,
-       8, TP_MIB_CPL_IN_REQ_0_A);
-   } else {
-   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A, st->req,
-       2, TP_MIB_CPL_IN_REQ_0_A);
-   t4_read_indirect(adap, TP_MIB_INDEX_A, TP_MIB_DATA_A, st->rsp,
-       2, TP_MIB_CPL_OUT_RSP_0_A);
-   }
+   int nchan =

[PATCH v3] bonding: primary_reselect with failure is not working properly

2015-07-07 Thread Mazhar Rana

From: Mazhar Rana mazhar.r...@cyberoam.com

When primary_reselect is set to failure, the primary interface should
not become active until the current active slave is down. But if the first
member of the bond device is set as the primary interface and
primary_reselect is set to failure, then whenever the primary interface's
link comes back up it becomes the active slave even if the current active
slave is still up.

With this patch, bond_find_best_slave will not traverse members if the
primary interface is not a candidate for failover/reselection and the
current active slave is still up.

Signed-off-by: Mazhar Rana mazhar.r...@cyberoam.com
Signed-off-by: Jay Vosburgh j.vosbu...@gmail.com
---

v1->v2:
return curr instead of bond->curr_active_slave.

v2->v3:
To make the code clearer, replaced function bond_should_change_active
with bond_choose_primary_or_current, which returns a slave device.

 drivers/net/bonding/bond_main.c | 51 +++--
 1 file changed, 34 insertions(+), 17 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 19eb990..317a494 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -689,40 +689,57 @@ out:
 
 }
 
-static bool bond_should_change_active(struct bonding *bond)
+static struct slave *bond_choose_primary_or_current(struct bonding *bond)
 {
struct slave *prim = rtnl_dereference(bond->primary_slave);
struct slave *curr = rtnl_dereference(bond->curr_active_slave);
 
-   if (!prim || !curr || curr->link != BOND_LINK_UP)
-   return true;
+   if (!prim || prim->link != BOND_LINK_UP) {
+   if (!curr || curr->link != BOND_LINK_UP)
+   return NULL;
+   return curr;
+   }
+
if (bond->force_primary) {
bond->force_primary = false;
-   return true;
+   return prim;
+   }
+
+   if (!curr || curr->link != BOND_LINK_UP)
+   return prim;
+
+   /* At this point, prim and curr are both up */
+   switch (bond->params.primary_reselect) {
+   case BOND_PRI_RESELECT_ALWAYS:
+   return prim;
+   case BOND_PRI_RESELECT_BETTER:
+   if (prim->speed < curr->speed)
+   return curr;
+   if (prim->speed == curr->speed && prim->duplex <= curr->duplex)
+   return curr;
+   return prim;
+   case BOND_PRI_RESELECT_FAILURE:
+   return curr;
+   default:
+   netdev_err(bond->dev, "impossible primary_reselect %d\n",
+  bond->params.primary_reselect);
+   return curr;
}
-   if (bond->params.primary_reselect == BOND_PRI_RESELECT_BETTER &&
-   (prim->speed < curr->speed ||
-(prim->speed == curr->speed && prim->duplex <= curr->duplex)))
-   return false;
-   if (bond->params.primary_reselect == BOND_PRI_RESELECT_FAILURE)
-   return false;
-   return true;
 }
 
 /**
- * find_best_interface - select the best available slave to be the active one
+ * bond_find_best_slave - select the best available slave to be the active one
  * @bond: our bonding struct
  */
 static struct slave *bond_find_best_slave(struct bonding *bond)
 {
-   struct slave *slave, *bestslave = NULL, *primary;
+   struct slave *slave, *bestslave = NULL;
struct list_head *iter;
int mintime = bond->params.updelay;
 
-   primary = rtnl_dereference(bond->primary_slave);
-   if (primary && primary->link == BOND_LINK_UP &&
-   bond_should_change_active(bond))
-   return primary;
+   slave = bond_choose_primary_or_current(bond);
+   if (slave)
+   return slave;
 
bond_for_each_slave(bond, slave, iter) {
if (slave->link == BOND_LINK_UP)
-- 
1.9.1
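
The decision table implemented by bond_choose_primary_or_current() can be exercised outside the kernel. What follows is only a minimal userspace sketch of that policy, not the patch itself: the names `model_slave`, `reselect`, and `choose_primary_or_current` are hypothetical stand-ins for the kernel types, and `force_primary` is passed by pointer so that the one-shot clearing stays visible to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types. */
enum reselect { RESELECT_ALWAYS, RESELECT_BETTER, RESELECT_FAILURE };

struct model_slave {
	bool up;	/* models link == BOND_LINK_UP */
	int speed;
	int duplex;
};

/* Returns the slave to activate, or NULL when the caller has to scan
 * all slaves because neither primary nor current is usable.
 */
static const struct model_slave *
choose_primary_or_current(const struct model_slave *prim,
			  const struct model_slave *curr,
			  bool *force_primary, enum reselect policy)
{
	if (!prim || !prim->up)
		return (curr && curr->up) ? curr : NULL;

	if (*force_primary) {
		*force_primary = false;	/* honoured once, then cleared */
		return prim;
	}

	if (!curr || !curr->up)
		return prim;

	/* Both primary and current are up: the policy decides. */
	switch (policy) {
	case RESELECT_ALWAYS:
		return prim;
	case RESELECT_BETTER:
		if (prim->speed < curr->speed)
			return curr;
		if (prim->speed == curr->speed && prim->duplex <= curr->duplex)
			return curr;
		return prim;
	case RESELECT_FAILURE:
	default:
		return curr;
	}
}
```

With both links up and the failure policy selected, the model returns the current slave, which is the behaviour the commit message asks for.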

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH net 1/3] sfc: refactor code in efx_ef10_set_mac_address()

2015-07-07 Thread Shradha Shah

From: Daniel Pieczko dpiec...@solarflare.com

Re-organize the structure of error handling to avoid having
to duplicate the netif_err() around the ifdefs.

The only change to the behaviour of the error-handling is that
the PF's data structure to record VF details should only be
updated if the original command succeeded.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c | 45 ++---
 1 file changed, 20 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 8476434..9740cd0 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -3829,38 +3829,27 @@ static int efx_ef10_set_mac_address(struct efx_nic *efx)
efx_net_open(efx->net_dev);
netif_device_attach(efx->net_dev);
 
-#if !defined(CONFIG_SFC_SRIOV)
-   if (rc == -EPERM)
-   netif_err(efx, drv, efx->net_dev,
- "Cannot change MAC address; use sfboot to enable mac-spoofing"
- " on this interface\n");
-#else
-   if (rc == -EPERM) {
+#ifdef CONFIG_SFC_SRIOV
+   if (efx->pci_dev->is_virtfn && efx->pci_dev->physfn) {
struct pci_dev *pci_dev_pf = efx->pci_dev->physfn;
 
-   /* Switch to PF and change MAC address on vport */
-   if (efx->pci_dev->is_virtfn && pci_dev_pf) {
-   struct efx_nic *efx_pf = pci_get_drvdata(pci_dev_pf);
+   if (rc == -EPERM) {
+   struct efx_nic *efx_pf;
 
-   if (!efx_ef10_sriov_set_vf_mac(efx_pf,
-  nic_data->vf_index,
-  efx->net_dev->dev_addr))
-   return 0;
-   }
-   netif_err(efx, drv, efx->net_dev,
- "Cannot change MAC address; use sfboot to enable mac-spoofing"
- " on this interface\n");
-   } else if (efx->pci_dev->is_virtfn) {
-   /* Successfully changed by VF (with MAC spoofing), so update the
-* parent PF if possible.
-*/
-   struct pci_dev *pci_dev_pf = efx->pci_dev->physfn;
+   /* Switch to PF and change MAC address on vport */
+   efx_pf = pci_get_drvdata(pci_dev_pf);
 
-   if (pci_dev_pf) {
+   rc = efx_ef10_sriov_set_vf_mac(efx_pf,
+  nic_data->vf_index,
+  efx->net_dev->dev_addr);
+   } else if (!rc) {
struct efx_nic *efx_pf = pci_get_drvdata(pci_dev_pf);
struct efx_ef10_nic_data *nic_data = efx_pf->nic_data;
unsigned int i;
 
+   /* MAC address successfully changed by VF (with MAC
+* spoofing) so update the parent PF if possible.
+*/
for (i = 0; i < efx_pf->vf_count; ++i) {
struct ef10_vf *vf = nic_data->vf + i;
 
@@ -3871,8 +3860,14 @@ static int efx_ef10_set_mac_address(struct efx_nic *efx)
}
}
}
-   }
+   } else
 #endif
+   if (rc == -EPERM) {
+   netif_err(efx, drv, efx->net_dev,
+ "Cannot change MAC address; use sfboot to enable"
+ " mac-spoofing on this interface\n");
+   }
+
return rc;
 }
 


Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Olaf Hering

On Tue, Jul 07, Dexuan Cui wrote:

 OK, removing the line seems better than 'default n', though both reproduce
 the same "# CONFIG_HYPERV_SOCK is not set".

Perhaps "default VMBUS" (or whatever syntax is needed) may be the way to
enable it conditionally.

Olaf

Broadcom BCM54610 Linux support

2015-07-07 Thread Markus Pargmann

Hi,

I found the PHY driver which supports the Broadcom BCM5461. But I am not
sure if this driver supports the BCM54610 in fiber mode as well. Or are
there any open datasheets which could be used to write a mainline
driver for it? I would appreciate any information about this.

Thanks,

Markus

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |



Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Olaf Hering

On Tue, Jul 07, Paul Bolle wrote:

 On ma, 2015-07-06 at 07:47 -0700, Dexuan Cui wrote:
  --- /dev/null
  +++ b/net/hv_sock/Kconfig
 
  +config HYPERV_SOCK
  +   tristate "Microsoft Hyper-V Socket (EXPERIMENTAL)"
  +   depends on HYPERV
  +   default m

 It's a bit odd to advise to say N if one is unsure and set the default
 to 'm' at the same time.

The 'default' line has to be removed IMO.

Olaf

Stable request for gso feature flag and error handling fixes

2015-07-07 Thread Jay Vosburgh


Please consider commit

commit 1e16aa3ddf863c6b9f37eddf52503230a62dedb3
Author: Florian Westphal f...@strlen.de
Date:   Mon Oct 20 13:49:16 2014 +0200

net: gso: use feature flag argument in all protocol gso handlers

and, at your discretion, the related commit

commit 330966e501ffe282d7184fde4518d5e0c24bc7f8
Author: Florian Westphal f...@strlen.de
Date:   Mon Oct 20 13:49:17 2014 +0200

net: make skb_gso_segment error handling more robust

for -stable kernels prior to 3.18 back to 3.10.

We have observed kernel panics when an openvswitch bridge is
populated with virtual devices (veth, for example) that have expansive
feature sets that include NETIF_F_GSO_GRE.

The failure occurs when foreign GRE encapsulated traffic
(explicitly not including the initial packets of a connection) arrives
at the system (likely via a switch flood event).  The packets are GRO
accumulated, and passed to the OVS receive processing.  As the
connection is not in the OVS kernel datapath table, the call path is:

ovs_dp_upcall ->
 queue_gso_packets ->
  __skb_gso_segment(skb, NETIF_F_SG, false)

Without the first patch cited above, __skb_gso_segment returns
NULL, as the features from the device (including GSO_GRE) are used in
place of the _SG feature supplied to the call.

Without the second patch cited above, the kernel panics when it
later dereferences the NULL skb pointer in queue_userspace_packet.

Strictly speaking, with the first patch applied the panic is
avoided (as the NULL return does not occur), but including the second
patch may still be prudent.
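
The second failure mode is a caller that checks only for an error pointer and then dereferences a NULL result. The following is a rough userspace sketch of checking both cases, assuming a simplified model of the kernel's error-pointer convention; every `model_` name is hypothetical and none of this is the code from either commit.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Userspace model of an "error pointer": a small negative errno value
 * stored in the pointer itself.
 */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct model_seg {
	int len;
};

/* Hypothetical caller: validate the segmenter's result for both an
 * error pointer and NULL before touching it.
 */
static int model_queue_segments(const struct model_seg *segs)
{
	if (model_is_err(segs))
		return (int)model_ptr_err(segs);
	if (segs == NULL)
		return -EINVAL;
	return segs->len;
}
```

A caller that performed only the first check would reach `segs->len` with a NULL pointer, which corresponds to the panic described above.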

Thanks,

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH net-next v2] ipv6: sysctl to restrict candidate source addresses

2015-07-07 Thread Lorenzo Colitti

On Mon, Jul 6, 2015 at 12:05 PM, Erik Kline e...@google.com wrote:
 Per RFC 6724, section 4, Candidate Source Addresses:

 It is RECOMMENDED that the candidate source addresses be the set
 of unicast addresses assigned to the interface that will be used
 to send to the destination (the outgoing interface).

 Add a sysctl to enable this behaviour.

 Signed-off-by: Erik Kline e...@google.com

I think this is useful, because it ensures that devices with a working
IPv6 configuration on interface A, and a partial IPv6 configuration on
interface B do not attempt to send packets on interface B using
interface A's source address.

Example: there are home routers in the wild that send out an IPv6
router advertisement that configures a default route but no IPv6
address. This change makes it so that the host does not attempt to use
an IPv6 address from another network (e.g., a cellular data
connection) on the home network.

It is also what the RFC recommends.

Acked-by: Lorenzo Colitti lore...@google.com

Re: [PATCH,v2 net-next] ipvs: skb_orphan in case of forwarding

2015-07-07 Thread Simon Horman

On Sun, Jul 05, 2015 at 09:14:38PM -0700, Alex Gartrell wrote:
 On Sun, Jul 5, 2015 at 8:50 PM, Simon Horman ho...@verge.net.au wrote:
  Is it possible to get a 'Fixes:' tag?
 
 I suppose it'd be appropriate to say
 
 Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.")
 
 As that is what introduces tcp early_demux, but that's just a guess as
 I haven't bisected it (not even sure my test would run on that code
 base).

Thanks. The reason that I am asking about this is to ease getting
this fix, or a derivative of it, into the appropriate stable trees.

Is there any possibility you could investigate which stable trees are
affected by this bug?

Re: Stable request for gso feature flag and error handling fixes

2015-07-07 Thread David Miller

From: Jay Vosburgh jay.vosbu...@canonical.com
Date: Tue, 07 Jul 2015 17:38:50 -0700

   Please consider commit

When you ask me to consider commits for -stable you have to tell
me what -stable releases you want me to submit them for.

Currently I am only doing -stable submissions for 4.1.x, 3.18.x
and 3.14.x

Re: [PATCH net-next] net: add support for linkdown sysctl to netconf

2015-07-07 Thread Andy Gospodarek

On Tue, Jul 07, 2015 at 09:57:57AM +0200, Nicolas Dichtel wrote:
 Le 06/07/2015 20:21, Andy Gospodarek a écrit :
 This kernel patch exports the value of the new
 ignore_routes_with_linkdown via netconf.
 
 Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
 Suggested-by: Nicolas Dichtel nicolas.dich...@6wind.com
 ---
 You need also to patch devinet_conf_proc() so that a netlink message is
 sent when the user updates the sysctl entry.

Doh!  I had that change in a different topic branch, but didn't
cherry-pick correctly before posting.  Thanks for catching this!


[net PATCH 1/1] drivers: net: cpsw: fix crash while accessing second slave ethernet interface

2015-07-07 Thread Mugunthan V N

When cpsw's number of slaves is set to 1 in the device tree, accessing
the second slave's ndev and priv in cpsw_tx_interrupt() causes a
kernel crash. This is because cpsw_get_slave_priv() does not verify
the number of slaves while retrieving the netdev priv and returns an
invalid memory region. Fix the issue by introducing a number-of-slaves
check in cpsw_get_slave_priv() and cpsw_get_slave_ndev().
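
The kind of check described here can be sketched in plain C. This is an illustration only, with hypothetical `model_` types rather than the driver's real structures: the accessor refuses any index at or beyond the configured slave count instead of returning whatever happens to sit in the array.

```c
#include <stddef.h>

#define MODEL_MAX_SLAVES 2

struct model_slave_priv {
	int id;
};

struct model_cpsw {
	int n_slaves;	/* slaves actually configured, e.g. from device tree */
	struct model_slave_priv *slaves[MODEL_MAX_SLAVES];
};

/* Return the private data for slave idx, or NULL when that slave is
 * not configured, so callers never see stale or uninitialized entries.
 */
static struct model_slave_priv *
model_get_slave_priv(const struct model_cpsw *cpsw, int idx)
{
	if (idx < 0 || idx >= cpsw->n_slaves)
		return NULL;
	return cpsw->slaves[idx];
}
```

An interrupt handler built on such an accessor would test the result for NULL before using the second slave.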

[   15.879589] Unable to handle kernel paging request at virtual address 0f0e142c
[   15.888540] pgd = ed374000
[   15.891359] [0f0e142c] *pgd=
[   15.895105] Internal error: Oops: 5 [#1] SMP ARM
[   15.899936] Modules linked in:
[   15.903139] CPU: 0 PID: 593 Comm: udhcpc Tainted: GW   4.1.0-12205-gfda8b18-dirty #10
[   15.912386] Hardware name: Generic AM43 (Flattened Device Tree)
[   15.918557] task: ed2a2e00 ti: ed3fe000 task.ti: ed3fe000
[   15.924187] PC is at cpsw_tx_interrupt+0x30/0x44
[   15.929008] LR is at _raw_spin_unlock_irqrestore+0x40/0x44
[   15.934726] pc : [c048b9cc]lr : [c05ef4f4]psr: 2193
[   15.934726] sp : ed3ffc08  ip : ed2a2e40  fp : 
[   15.946685] r10: c0969ce8  r9 : c0969cfc  r8 : 
[   15.952129] r7 : 00c6  r6 : ee54ab00  r5 : ee169c64  r4 : ee534e00
[   15.958932] r3 : 0f0e0d0c  r2 :   r1 : ed3ffbc0  r0 : 0001
[   15.965735] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   15.973261] Control: 10c5387d  Table: ad374059  DAC: 0015
[   15.979246] Process udhcpc (pid: 593, stack limit = 0xed3fe218)
[   15.985414] Stack: (0xed3ffc08 to 0xed40)
[   15.989954] fc00:   ee54ab00 c009928c c0a9e648 6193 
32e4 ee169c00
[   15.998478] fc20: ee169c64 ee169c00 ee169c64 ee54ab00 0001 0001 
ee67e268 ee008800
[   16.006995] fc40: ee534800 c009946c ee169c00 ee169c64 c08bd660 c009c370 
c009c2a4 00c6
[   16.015513] fc60: c08b75c4 c08b0854  c0098b3c 00c6 c0098c50 
ed3ffcb0 003a
[   16.024033] fc80: ed3ffcb0 fa24010c c08b7800 fa240100 ee7e9880 c00094c4 
c05ef4e8 6013
[   16.032556] fca0:  ed3ffce4 ee7e9880 c05ef964 0001 ed2a33d8 
 ed2a2e00
[   16.041080] fcc0: 6013 ee536bf8 6013 ee51b800 ee7e9880 ee67e268 
ee7e9880 ee534800
[   16.049603] fce0: c0ad0768 ed3ffcf8 c008e910 c05ef4e8 6013  
0001 0001
[   16.058121] fd00: ee536bf8 c0487a04   ee534800  
0156 c048c990
[   16.066645] fd20:   c0969f40   c05000e8 
0001 
[   16.075167] fd40:  c051eefc  ee67e268   
ee51b800 ed3ffd9c
[   16.083690] fd60:  ee67e200 ee51b800 ee7e9880 ee67e268  
 ee67e200
[   16.092211] fd80: ee51b800 ee7e9880 ee67e268 ee534800 ee67e200 c051eedc 
ee67e268 0010
[   16.100727] fda0:   ee7e9880 ee534800  ee67e268 
ee51b800 c05006fc
[   16.109247] fdc0: ee67e268 0001 c0500488 0156 ee7e9880  
ed3fe000 fff4
[   16.117771] fde0: ed3fff1c ee7e9880 ee534800 0148  ed1f8340 
 
[   16.126289] fe00:  c05a9054   0156 c0ab62a8 
0010 ed3e7000
[   16.134812] fe20:  0008 edcfb700 ed3fff1c c0fb5f94 ed2a2e00 
c0fb5f64 05d8
[   16.143336] fe40: c0a9b3b8  ed3e7070    
9f40 
[   16.151858] fe60:  00020022 00110008   43004400 
 
[   16.160374] fe80:       
 
[   16.168898] fea0: edcfb700 bee5f380 0014  ed3fe000  
4400 c04e2b64
[   16.177415] fec0: 0002 c04e3b00 ed3ffeec 0001 011a  
 bee5f394
[   16.185937] fee0: 0148 ed3fff10 0014 0001   
ed3ffee4 
[   16.194459] ff00:    c04e3664 00080011 0002 
0600 
[   16.202980] ff20:    c008dd54 ee5a6f08 ee636e80 
c096972d c0089c14
[   16.211499] ff40:  6013 ee5a6f40 6013  ee5a6f40 
0002 0006
[   16.220023] ff60:  edcfb700 0001 ed2a2e00 c000f60c 0001 
011a c008ea34
[   16.228540] ff80: 0006  bee5f380 0014 bee5f380 0014 
bee5f380 0122
[   16.237059] ffa0: c000f7c4 c000f5e0 bee5f380 0014 0006 bee5f394 
0148 
[   16.245581] ffc0: bee5f380 0014 bee5f380 0122 fd6e 4300 
4800 4400
[   16.254104] ffe0: bee5f378 bee5f36c 000307ec b6f39044 4010 0006 
ed36fa40 
[   16.262642] [c048b9cc] (cpsw_tx_interrupt) from [c009928c] 
(handle_irq_event_percpu+0x64/0x204)
[   16.272076] [c009928c] (handle_irq_event_percpu) from [c009946c] 
(handle_irq_event+0x40/0x64)
[   16.281330] [c009946c] (handle_irq_event) from [c009c370] 
(handle_fasteoi_irq+0xcc/0x1a8)
[   16.290220] [c009c370] (handle_fasteoi_irq) from [c0098b3c] 
(generic_handle_irq+0x20/0x30)
[   16.299197] [c0098b3c]

[PATCH] net/tipc: initialize security state for new connection socket

2015-07-07 Thread Stephen Smalley

Calling connect() with an AF_TIPC socket would trigger a series
of error messages from SELinux along the lines of:
SELinux: Invalid class 0
type=AVC msg=audit(1434126658.487:34500): avc:  denied  { unprintable }
  for pid=292 comm=kworker/u16:5 scontext=system_u:system_r:kernel_t:s0
  tcontext=system_u:object_r:unlabeled_t:s0 tclass=unprintable
  permissive=0

This was due to a failure to initialize the security state of the new
connection sock by the tipc code, leaving it with junk in the security
class field and an unlabeled secid.  Add a call to security_sk_clone()
to inherit the security state from the parent socket.

Reported-by: Tim Shearer tim.shea...@overturenetworks.com
Signed-off-by: Stephen Smalley s...@tycho.nsa.gov
Acked-by: Paul Moore p...@paul-moore.com
---
 net/tipc/socket.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 46b6ed5..3a7567f 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2007,6 +2007,7 @@ static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags)
res = tipc_sk_create(sock_net(sock->sk), new_sock, 0, 1);
if (res)
goto exit;
+   security_sk_clone(sock->sk, new_sock->sk);
 
new_sk = new_sock->sk;
new_tsock = tipc_sk(new_sk);
-- 
2.1.0


[PATCH net] bridge: mdb: zero out the local br_ip variable before use

2015-07-07 Thread Nikolay Aleksandrov

Since commit b0e9a30dd669 ("bridge: Add vlan id to multicast groups")
there's a check in br_ip_equal() for a matching vlan id, but the mdb
functions were not modified to set it (or at least zero it), so when an
entry was added it would have a garbage vlan id (from the local br_ip
variable in __br_mdb_add/del), which would prevent it from being
matched and also deleted. So zero out the whole local ip var to protect
ourselves from future changes and also to fix the current bug; since
there's no vlan id support in the mdb uapi, always use vlan id 0.
Example before patch:
root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
root@debian:~# bridge mdb
dev br0 port eth1 grp 239.0.0.1 permanent
root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent
RTNETLINK answers: Invalid argument

After patch:
root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
root@debian:~# bridge mdb
dev br0 port eth1 grp 239.0.0.1 permanent
root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent
root@debian:~# bridge mdb
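
The mismatch can be reproduced in miniature. The sketch below assumes a simplified key layout and hypothetical `model_` names, not the bridge's actual struct br_ip: the key is built on top of garbage memory, and only because the builder zeroes it first does the comparison against the stored entry succeed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical group key with a field (vid) the caller never sets. */
struct model_br_ip {
	uint32_t ip4;
	uint16_t proto;
	uint16_t vid;
};

static bool model_br_ip_equal(const struct model_br_ip *a,
			      const struct model_br_ip *b)
{
	return a->proto == b->proto && a->vid == b->vid && a->ip4 == b->ip4;
}

/* Build a lookup key. Zeroing the whole struct first pins vid to 0 and
 * also covers any field added to the struct later.
 */
static void model_make_key(struct model_br_ip *key, uint16_t proto,
			   uint32_t ip4)
{
	memset(key, 0, sizeof(*key));
	key->proto = proto;
	key->ip4 = ip4;
}
```

Dropping the memset() from `model_make_key()` leaves `vid` holding whatever was in memory, and the lookup then fails in the same way as the "bridge mdb del" above.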

Signed-off-by: Nikolay Aleksandrov ra...@blackwall.org
Fixes: b0e9a30dd669 ("bridge: Add vlan id to multicast groups")
---
 net/bridge/br_mdb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index e29ad70b3000..cc00066c0622 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -371,6 +371,7 @@ static int __br_mdb_add(struct net *net, struct net_bridge *br,
if (!p || p->br != br || p->state == BR_STATE_DISABLED)
return -EINVAL;
 
+   memset(&ip, 0, sizeof(ip));
ip.proto = entry->addr.proto;
if (ip.proto == htons(ETH_P_IP))
ip.u.ip4 = entry->addr.u.ip4;
@@ -417,6 +418,7 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry)
if (!netif_running(br->dev) || br->multicast_disabled)
return -EINVAL;
 
+   memset(&ip, 0, sizeof(ip));
ip.proto = entry->addr.proto;
if (ip.proto == htons(ETH_P_IP)) {
if (timer_pending(&br->ip4_other_query.timer))
-- 
1.9.3


Re: [PATCH net-next 1/3] tcp: introduce TCP experimental option for SMC

2015-07-07 Thread Ursula Braun

Eric,

understood, would it be acceptable if the SMC-specific hooks in the
TCP-code are enclosed with #ifdef CONFIG_SMC ... #endif?

Regards, Ursula

On Mon, 2015-07-06 at 18:08 +0200, Eric Dumazet wrote:
 On Mon, 2015-07-06 at 17:11 +0200, Ursula Braun wrote:
  From: Ursula Braun ursula.br...@de.ibm.com
  
  The SMC-R protocol defines dynamic discovery of peers. This is done by
  implementing experimental TCP options as defined in RFC6994. The TCP code
  needs to be extended to support RFC6994.
  
  Setting the TCP experimental option for SMC-R [2] will be triggered from
  kernel exploiters like the new SMC-R socket family by setting a new
  flag syn_smc on struct tcp_sock of the connecting and the listening
  socket. If the client peer is SMC-R capable, flag syn_smc is kept on the
  connecting socket after the 3-way TCP handshake, otherwise it is reset.
  If the server peer is SMC-R capable, the new connected TCP socket has
  the new flag set, otherwise not.
  
  Code snippet client:
tcp_sk(sock->sk)->syn_smc = 1;
rc = kernel_connect(sock, addr, alen, flags);
if (tcp_sk(sock->sk)->syn_smc) {
/* switch to smc for this connection */
  
  Code snippet server:
tcp_sk(sock->sk)->syn_smc = 1;
rc = kernel_listen(sock, backlog);
rc = kernel_accept(sock, newsock, 0);
if (tcp_sk(newsock->sk)->syn_smc) {
/* switch to smc for this connection */
  
  References:
  [1] Shared Use of TCP Experimental Options RFC 6994:
  https://tools.ietf.org/rfc/rfc6994.txt
  [2] IANA ExID SMCR:
  
  http://www.iana.org/assignments/tcp-parameters/tcp-parameters.xhtml#tcp-exids
  
  This patch has already been posted in June 2013, but Dave Miller has
  postponed applying till the user of the new flags, ie. the entire SMC-R
  protocol stack is implemented.
  
  Signed-off-by: Ursula Braun ubr...@linux.vnet.ibm.com
 
 
   struct tcp_out_options {
  u16 options;/* bit field of OPTION_* */
  @@ -544,6 +545,14 @@ static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
  }
  ptr += (len + 3) >> 2;
  }
  +
  +   if (unlikely(OPTION_SMC & options)) {
  +   *ptr++ = htonl((TCPOPT_NOP << 24) |
  +  (TCPOPT_NOP << 16) |
  +  (TCPOPT_EXP << 8) |
  +  (TCPOLEN_EXP_SMC_BASE));
  +   *ptr++ = htonl(TCPOPT_SMC_MAGIC);
  +   }
   }
 
 
 I am concerned about adding an additional conditional branch in TCP
 write fast path, on all hosts, while SMC seems to be available only for
 some hardware class.
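
For reference, the first word written by the quoted hunk packs four option bytes into one 32-bit value. The following standalone sketch shows that packing in host byte order; the `MODEL_` names are hypothetical, the NOP and experimental option kinds are the registered values 1 and 254, and the length of 6 (kind, length, and a 4-byte ExID) is an assumption made for illustration, not a value taken from the patch.

```c
#include <stdint.h>

/* Option kinds from the TCP option registry. */
#define MODEL_TCPOPT_NOP 1
#define MODEL_TCPOPT_EXP 254
/* Assumed for illustration: kind + length + 4-byte ExID. */
#define MODEL_TCPOLEN_EXP_SMC_BASE 6

/* Pack NOP, NOP, EXP, length into one 32-bit word in host order; the
 * kernel code converts the result to network order with htonl().
 */
static uint32_t model_smc_option_word(void)
{
	return ((uint32_t)MODEL_TCPOPT_NOP << 24) |
	       ((uint32_t)MODEL_TCPOPT_NOP << 16) |
	       ((uint32_t)MODEL_TCPOPT_EXP << 8) |
	       (uint32_t)MODEL_TCPOLEN_EXP_SMC_BASE;
}
```

The two leading NOPs pad the 6-byte option so that the ExID which follows starts on a 4-byte boundary.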
 
 
 
 



Re: [PATCH v2 1/5] ipvlan: remove counters of ipv4 and ipv6 addresses

2015-07-07 Thread Mahesh Bandewar

On Fri, Jul 3, 2015 at 5:58 AM, Konstantin Khlebnikov
khlebni...@yandex-team.ru wrote:
 They are unused after commit f631c44bbe15 ("ipvlan: Always set broadcast bit in
 multicast filter").

 Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
 ---
  drivers/net/ipvlan/ipvlan.h  |2 -
  drivers/net/ipvlan/ipvlan_main.c |   65 
 +++---
  2 files changed, 26 insertions(+), 41 deletions(-)


 diff --git a/drivers/net/ipvlan/ipvlan_main.c 
 b/drivers/net/ipvlan/ipvlan_main.c
 index 1acc283160d9..62577b3f01f2 100644
 --- a/drivers/net/ipvlan/ipvlan_main.c
 +++ b/drivers/net/ipvlan/ipvlan_main.c
 @@ -627,8 +622,9 @@ static int ipvlan_add_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 memcpy(&addr->ip6addr, ip6_addr, sizeof(struct in6_addr));
 addr->atype = IPVL_IPV6;
 list_add_tail(&addr->anode, &ipvlan->addrs);
 -   ipvlan->ipv6cnt++;
 -   /* If the interface is not up, the address will be added to the hash
 +
 +   /*
 +* If the interface is not up, the address will be added to the hash
Why? Preferred commenting style in net is
  /* multi-line
   * comment
   */
  * list by ipvlan_open.
  */
 if (netif_running(ipvlan->dev))
 @@ -642,16 +638,11 @@ static void ipvlan_del_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 struct ipvl_addr *addr;

 addr = ipvlan_find_addr(ipvlan, ip6_addr, true);
 -   if (!addr)
 -   return;
 -
 -   ipvlan_ht_addr_del(addr, true);
 -   list_del(&addr->anode);
 -   ipvlan->ipv6cnt--;
 -   WARN_ON(ipvlan->ipv6cnt < 0);
 -   kfree_rcu(addr, rcu);
 -
 -   return;
 +   if (addr) {
 +   ipvlan_ht_addr_del(addr, true);
 +   list_del(&addr->anode);
 +   kfree_rcu(addr, rcu);
 +   }
This delta is unnecessarily big and can be reduced to deleting just two lines.
  }

  static int ipvlan_addr6_event(struct notifier_block *unused,
 @@ -699,8 +690,9 @@ static int ipvlan_add_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr)
 memcpy(&addr->ip4addr, ip4_addr, sizeof(struct in_addr));
 addr->atype = IPVL_IPV4;
 list_add_tail(&addr->anode, &ipvlan->addrs);
 -   ipvlan->ipv4cnt++;
 -   /* If the interface is not up, the address will be added to the hash
 +
 +   /*
 +* If the interface is not up, the address will be added to the hash
same here (multi line comments)
  * list by ipvlan_open.
  */
 if (netif_running(ipvlan->dev))
 @@ -714,16 +706,11 @@ static void ipvlan_del_addr4(struct ipvl_dev *ipvlan, struct in_addr *ip4_addr)
 struct ipvl_addr *addr;

 addr = ipvlan_find_addr(ipvlan, ip4_addr, false);
 -   if (!addr)
 -   return;
 -
 -   ipvlan_ht_addr_del(addr, true);
 -   list_del(&addr->anode);
 -   ipvlan->ipv4cnt--;
 -   WARN_ON(ipvlan->ipv4cnt < 0);
 -   kfree_rcu(addr, rcu);
 -
 -   return;
 +   if (addr) {
 +   ipvlan_ht_addr_del(addr, true);
 +   list_del(&addr->anode);
 +   kfree_rcu(addr, rcu);
 +   }
This delta is unnecessarily big and can be reduced to deleting just two lines.
  }

  static int ipvlan_addr4_event(struct notifier_block *unused,


Re: Stable request for gso feature flag and error handling fixes

2015-07-07 Thread Jay Vosburgh

David Miller da...@davemloft.net wrote:

From: Jay Vosburgh jay.vosbu...@canonical.com
Date: Tue, 07 Jul 2015 17:38:50 -0700

  Please consider commit

When you ask me to consider commits for -stable you have to tell
me what -stable releases you want me to submit them for.

Currently I am only doing -stable submissions for 4.1.x, 3.18.x
and 3.14.x

I did say:

   for -stable kernels prior to 3.18 back to 3.10.

So, this would be just for 3.14.x.  My apologies if I buried
that too far into the message.

Are the other -stable tree maintainers picking up patches after
you've submitted to 4.1/3.18/3.14, or is it necessary to make separate
requests?

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com

Re: Broadcom BCM54610 Linux support

2015-07-07 Thread Markus Pargmann

Hi,

On Tue, Jul 07, 2015 at 10:34:56AM -0700, Florian Fainelli wrote:
 (adding Michael)
 
 On 07/07/15 03:58, Markus Pargmann wrote:
  Hi,
  
  I found the PHY driver which supports the Broadcom BCM5461. But I am not
  sure if this driver supports the BCM54610 in fiber mode as well. Or are
  there any open datasheets which could be used to write a mainline
  driver for it? I would appreciate any information about this.
 
 There are not publicly available datasheets as far as I can tell, the
 current driver does not support anything but copper modes.
 
 If you have reference code from somewhere else (e.g: bootloader or a
 Broadcom SDK), I would be inclined to port it over the Linux PHY driver.

Thanks. Unfortunately I don't have any reference code that I could use
to support BCM54610 in fiber mode. So it's probably better to use a
different PHY with more public information.

Thanks,

Markus

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |



RE: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Dexuan Cui

 -Original Message-
 From: Stephen Hemminger
 Sent: Wednesday, July 8, 2015 2:31
 Subject: Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

 On Mon,  6 Jul 2015 07:47:29 -0700
 Dexuan Cui de...@microsoft.com wrote:

  Hyper-V VM sockets (hvsock) supplies a byte-stream based communication
  mechanism between the host and a guest. It's kind of TCP over VMBus, but
  the transportation layer (VMBus) is much simpler than IP. With Hyper-V VM
  Sockets, applications between the host and a guest can talk with each
  other directly by the traditional BSD-style socket APIs.

  Hyper-V VM Sockets is only available on Windows 10 host and later. The
  patch implements the necessary support in the guest side by introducing
  a new socket address family AF_HYPERV.

  Signed-off-by: Dexuan Cui de...@microsoft.com

 Is there any chance that AF_VSOCK could be used with different transport
 for VMware and Hyper-V. Better to make guest applications host independent.

Hi Stephen,
Thanks for the question. I tried to do that (since AF_HYPERV and AF_VSOCK
are conceptually similar), but I found it would be impractical: I listed the
reasons in my cover letter of the patchset:
https://lkml.org/lkml/2015/7/6/431

IMO the biggest difference is the size of the endpoint (u128 vs. u32):
u32 ContextID, u32 Port in AF_VSOCK
vs.
u128 GUID_VM_ID, u128 GUID_ServiceID in AF_HYPERV.

In the current code of AF_VSOCK and the related transport layer (the wrapper
ops of VMware's VMCI), the size is widely used by struct sockaddr_vm (this
struct is also exported to the user space).

So, anyway, the user space application has to explicitly handle the different
endpoint size.

And in the driver side, I'm afraid there is no way to directly reuse the code of
AF_VSOCK with trivial change :-( , because we would have to make the
AF_VSOCK code be able to know the real sockaddr type (sockaddr_vm or
sockaddr_hv? The two structs have different layout and different field names)
at runtime and behave differently. This would make the code a mess, IMO.

That's why I think it would be better to introduce a new address family.
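
To make the size difference concrete, here is a rough userspace sketch.
Both layouts are illustrative only: sketch_sockaddr_vm is simplified from
the "u32 ContextID, u32 Port" description above, and sketch_sockaddr_hv
(including its field names) is a hypothetical layout modeled on
"u128 GUID_VM_ID, u128 GUID_ServiceID". Neither is the real kernel
definition.

```c
#include <stddef.h>
#include <stdint.h>

/* 128-bit GUID stored as raw bytes (illustrative type, not a kernel one). */
struct sketch_guid {
	uint8_t b[16];
};

/* Simplified AF_VSOCK-style endpoint: two 32-bit identifiers. */
struct sketch_sockaddr_vm {
	uint16_t family;
	uint16_t reserved;
	uint32_t cid;	/* u32 ContextID */
	uint32_t port;	/* u32 Port */
};

/* Hypothetical AF_HYPERV-style endpoint: two 128-bit GUIDs. */
struct sketch_sockaddr_hv {
	uint16_t family;
	uint16_t reserved;
	struct sketch_guid vm_id;	/* u128 GUID_VM_ID */
	struct sketch_guid service_id;	/* u128 GUID_ServiceID */
};

/* Bytes needed to name one endpoint (both identifiers) per family. */
static size_t vm_endpoint_bytes(void)
{
	return sizeof(((struct sketch_sockaddr_vm *)0)->cid) +
	       sizeof(((struct sketch_sockaddr_vm *)0)->port);
}

static size_t hv_endpoint_bytes(void)
{
	return sizeof(((struct sketch_sockaddr_hv *)0)->vm_id) +
	       sizeof(((struct sketch_sockaddr_hv *)0)->service_id);
}
```

Under these assumptions an endpoint takes 8 bytes in one family and 32 in
the other, so any code that copies or compares addresses by a fixed size
would need a per-family branch.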

Thanks,
-- Dexuan

Re: [PATCH v2 4/5] ipvlan: protect addresses with internal spinlock

2015-07-07 Thread Mahesh Bandewar

On Fri, Jul 3, 2015 at 5:58 AM, Konstantin Khlebnikov
khlebni...@yandex-team.ru wrote:
 Inet6addr notifier is atomic and runs in bh context without RTNL when
 ipv6 receives router advertisement packet and performs autoconfiguration.

 This patch adds ipvl_port->addr_lock and helpers: ipvlan_addr_lock_bh,
 ipvlan_addr_unlock_bh for protecting ipvlan addresses and hash table.

Frankly I'm not comfortable adding spin-locks all over. I think any
config that mostly takes place with RTNL makes sense but this
inet6addr needs to be thought through and implanting spin-locks in
IPvlan is a work-around for a problem somewhere else.
Why can't a work-queue that takes RTNL to call inet6addr-notifier be
implemented when called from bh?

 Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
 ---
  drivers/net/ipvlan/ipvlan.h  |   11 +++
  drivers/net/ipvlan/ipvlan_core.c |2 --
  drivers/net/ipvlan/ipvlan_main.c |   33 ++---
  3 files changed, 41 insertions(+), 5 deletions(-)

[snip]

Re: [PATCH] net/tipc: initialize security state for new connection socket

2015-07-07 Thread Ying Xue

On 07/07/2015 09:43 PM, Stephen Smalley wrote:
 Calling connect() with an AF_TIPC socket would trigger a series
 of error messages from SELinux along the lines of:
 SELinux: Invalid class 0
 type=AVC msg=audit(1434126658.487:34500): avc:  denied  { unprintable }
   for pid=292 comm=kworker/u16:5 scontext=system_u:system_r:kernel_t:s0
   tcontext=system_u:object_r:unlabeled_t:s0 tclass=unprintable
   permissive=0
 
 This was due to a failure to initialize the security state of the new
 connection sock by the tipc code, leaving it with junk in the security
 class field and an unlabeled secid.  Add a call to security_sk_clone()
 to inherit the security state from the parent socket.
 
 Reported-by: Tim Shearer tim.shea...@overturenetworks.com
 Signed-off-by: Stephen Smalley s...@tycho.nsa.gov
 Acked-by: Paul Moore p...@paul-moore.com

Acked-by: Ying Xue ying@windriver.com

 ---
  net/tipc/socket.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/net/tipc/socket.c b/net/tipc/socket.c
 index 46b6ed5..3a7567f 100644
 --- a/net/tipc/socket.c
 +++ b/net/tipc/socket.c
 @@ -2007,6 +2007,7 @@ static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags)
  	res = tipc_sk_create(sock_net(sock->sk), new_sock, 0, 1);
  	if (res)
  		goto exit;
 +	security_sk_clone(sock->sk, new_sock->sk);
  
  	new_sk = new_sock->sk;
  	new_tsock = tipc_sk(new_sk);
 


[PATCH] net: systemport: Use eth_hw_addr_random

2015-07-07 Thread Vaishali Thakkar

Use eth_hw_addr_random() instead of calling random_ether_addr().
Here, this change is setting addr_assign_type to NET_ADDR_RANDOM.
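
For readers unfamiliar with the two helpers, the difference can be modeled
in userspace roughly as below. This is a sketch, not kernel code: the
sketch_ names and the struct are stand-ins invented for illustration, and
rand() replaces the kernel's random source. It assumes the random address
is made unicast and locally administered, and that the only extra step in
the newer helper is recording the assignment type.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ETH_ALEN        6
#define SKETCH_NET_ADDR_PERM   0	/* address came from hardware/firmware */
#define SKETCH_NET_ADDR_RANDOM 1	/* address was randomly generated */

/* Minimal stand-in for the two net_device fields involved. */
struct sketch_net_device {
	uint8_t dev_addr[SKETCH_ETH_ALEN];
	uint8_t addr_assign_type;
};

/* Fill addr with a random unicast, locally administered MAC address. */
static void sketch_random_ether_addr(uint8_t *addr)
{
	int i;

	for (i = 0; i < SKETCH_ETH_ALEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);
	addr[0] &= 0xfe;	/* clear the multicast bit */
	addr[0] |= 0x02;	/* set the locally administered bit */
}

/* Same address generation, but also records that the address is random. */
static void sketch_eth_hw_addr_random(struct sketch_net_device *dev)
{
	dev->addr_assign_type = SKETCH_NET_ADDR_RANDOM;
	sketch_random_ether_addr(dev->dev_addr);
}
```

In this model, calling only sketch_random_ether_addr() leaves
addr_assign_type untouched, which is the behavior change noted above.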

The Coccinelle semantic patch that performs this transformation
is as follows:

@@
identifier a,b;
@@

-random_ether_addr(a->b);
+eth_hw_addr_random(a);

Signed-off-by: Vaishali Thakkar vthakkar1...@gmail.com
---
Note that this patch is compile tested only and I have used file
drivers/net/ethernet/hisilicon/hix5hd2_gmac.c as a reference.
Also, original call didn't make assignment to NET_ADDR_RANDOM. So,
it would be good if someone can test this change.
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 909ad7a..4566cdf 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1793,7 +1793,7 @@ static int bcm_sysport_probe(struct platform_device *pdev)
 	macaddr = of_get_mac_address(dn);
 	if (!macaddr || !is_valid_ether_addr(macaddr)) {
 		dev_warn(&pdev->dev, "using random Ethernet MAC\n");
-		random_ether_addr(dev->dev_addr);
+		eth_hw_addr_random(dev);
 	} else {
 		ether_addr_copy(dev->dev_addr, macaddr);
 	}
-- 
1.9.1


[PATCH] dsa: mv88e6352/mv88e6xxx: Add support for Marvell 88E6320 and 88E6321

2015-07-07 Thread Guenter Roeck

From: Aleksey S. Kazantsev io...@yandex.ru

MV88E6320 and MV88E6321 are largely compatible with MV88E6352,
but are members of a different chip family.

Signed-off-by: Aleksey S. Kazantsev io...@yandex.ru
Signed-off-by: Guenter Roeck li...@roeck-us.net
---
 drivers/net/dsa/Kconfig |  6 +++---
 drivers/net/dsa/mv88e6352.c | 31 +--
 drivers/net/dsa/mv88e6xxx.c | 42 ++
 drivers/net/dsa/mv88e6xxx.h |  8 +++-
 4 files changed, 65 insertions(+), 22 deletions(-)

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 7ad0a4d8e475..4c483d937481 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -46,13 +46,13 @@ config NET_DSA_MV88E6171
 	  ethernet switches chips.
 
 config NET_DSA_MV88E6352
-	tristate "Marvell 88E6172/88E6176/88E6352 ethernet switch chip support"
+	tristate "Marvell 88E6172/6176/6320/6321/6352 ethernet switch chip support"
 	depends on NET_DSA
 	select NET_DSA_MV88E6XXX
 	select NET_DSA_TAG_EDSA
 	---help---
-	  This enables support for the Marvell 88E6172, 88E6176 and 88E6352
-	  ethernet switch chips.
+	  This enables support for the Marvell 88E6172, 88E6176, 88E6320,
+	  88E6321 and 88E6352 ethernet switch chips.
 
 config NET_DSA_BCM_SF2
 	tristate "Broadcom Starfighter 2 Ethernet switch support"
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 632815c10a40..cfece5ae9d5f 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -36,6 +36,18 @@ static char *mv88e6352_probe(struct device *host_dev, int sw_addr)
 			return "Marvell 88E6172";
 		if ((ret & 0xfff0) == PORT_SWITCH_ID_6176)
 			return "Marvell 88E6176";
+		if (ret == PORT_SWITCH_ID_6320_A1)
+			return "Marvell 88E6320 (A1)";
+		if (ret == PORT_SWITCH_ID_6320_A2)
+			return "Marvell 88e6320 (A2)";
+		if ((ret & 0xfff0) == PORT_SWITCH_ID_6320)
+			return "Marvell 88E6320";
+		if (ret == PORT_SWITCH_ID_6321_A1)
+			return "Marvell 88E6321 (A1)";
+		if (ret == PORT_SWITCH_ID_6321_A2)
+			return "Marvell 88e6321 (A2)";
+		if ((ret & 0xfff0) == PORT_SWITCH_ID_6321)
+			return "Marvell 88E6321";
 		if (ret == PORT_SWITCH_ID_6352_A0)
 			return "Marvell 88E6352 (A0)";
 		if (ret == PORT_SWITCH_ID_6352_A1)
@@ -84,11 +96,12 @@ static int mv88e6352_setup_global(struct dsa_switch *ds)
 
 static int mv88e6352_get_temp(struct dsa_switch *ds, int *temp)
 {
+	int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
 	int ret;
 
 	*temp = 0;
 
-	ret = mv88e6xxx_phy_page_read(ds, 0, 6, 27);
+	ret = mv88e6xxx_phy_page_read(ds, phy, 6, 27);
 	if (ret < 0)
 		return ret;
 
@@ -99,11 +112,12 @@ static int mv88e6352_get_temp(struct dsa_switch *ds, int *temp)
 
 static int mv88e6352_get_temp_limit(struct dsa_switch *ds, int *temp)
 {
+	int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
 	int ret;
 
 	*temp = 0;
 
-	ret = mv88e6xxx_phy_page_read(ds, 0, 6, 26);
+	ret = mv88e6xxx_phy_page_read(ds, phy, 6, 26);
 	if (ret < 0)
 		return ret;
 
@@ -114,23 +128,25 @@ static int mv88e6352_get_temp_limit(struct dsa_switch *ds, int *temp)
 
 static int mv88e6352_set_temp_limit(struct dsa_switch *ds, int temp)
 {
+	int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
 	int ret;
 
-	ret = mv88e6xxx_phy_page_read(ds, 0, 6, 26);
+	ret = mv88e6xxx_phy_page_read(ds, phy, 6, 26);
 	if (ret < 0)
 		return ret;
 	temp = clamp_val(DIV_ROUND_CLOSEST(temp, 5) + 5, 0, 0x1f);
-	return mv88e6xxx_phy_page_write(ds, 0, 6, 26,
+	return mv88e6xxx_phy_page_write(ds, phy, 6, 26,
 					(ret & 0xe0ff) | (temp << 8));
 }
 
 static int mv88e6352_get_temp_alarm(struct dsa_switch *ds, bool *alarm)
 {
+	int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
 	int ret;
 
 	*alarm = false;
 
-	ret = mv88e6xxx_phy_page_read(ds, 0, 6, 26);
+	ret = mv88e6xxx_phy_page_read(ds, phy, 6, 26);
 	if (ret < 0)
 		return ret;
 
@@ -394,5 +410,8 @@ struct dsa_switch_driver mv88e6352_switch_driver = {
 	.fdb_getnext	= mv88e6xxx_port_fdb_getnext,
 };
 
-MODULE_ALIAS("platform:mv88e6352");
 MODULE_ALIAS("platform:mv88e6172");
+MODULE_ALIAS("platform:mv88e6176");
+MODULE_ALIAS("platform:mv88e6320");
+MODULE_ALIAS("platform:mv88e6321");
+MODULE_ALIAS("platform:mv88e6352");
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index fd8547c2b79d..f394e4d4d9e0 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -517,6 +517,18 @@ static bool mv88e6xxx_6185_family(struct dsa_switch

IPsec maintenance during the next weeks

2015-07-07 Thread Steffen Klassert

David,

I'll be off without mail access for the next two and a half weeks.
Can you please take urgent IPsec patches directly into the net
tree during this time? 

I'll let you know as soon as I'm back.

Thanks!

RE: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-07 Thread Dexuan Cui

 -Original Message-
 From: Olaf Hering
 Sent: Tuesday, July 7, 2015 18:59
 Subject: Re: [PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature

 On Tue, Jul 07, Dexuan Cui wrote:

  OK, removing the line seems better than 'default n', though both reproduce
  the same # CONFIG_HYPERV_SOCK is not set.

 Perhaps default VMBUS (or whatever syntax is needed) may be the way to
 enable it conditionally.

 Olaf
Thanks, Olaf!
I think we can use "default m if HYPERV".

Paul, I'll remove the sentence "If unsure, say N."

 Thanks,
-- Dexuan

Re: [PATCH net-next 1/3] tcp: introduce TCP experimental option for SMC

2015-07-07 Thread Eric Dumazet

On Tue, 2015-07-07 at 15:57 +0200, Ursula Braun wrote:
 Eric,
 
 understood, would it be acceptable if the SMC-specific hooks in the
 TCP-code are enclosed with #ifdef CONFIG_SMC ... #endif?

If this CONFIG_SMC is enabled only on relevant builds, I guess it would
be ok. (Try to use helpers in include files to avoid spreading new
#ifdef in C files)
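
The helper pattern suggested here can be sketched as follows. Everything
in the example is hypothetical and for illustration only: the sketch_
names, the struct, and the option sizes are invented, and the "header"
and "C file" parts are shown in one block so that it compiles on its own.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the socket type; not the kernel's struct sock. */
struct sketch_sock {
	bool smc_requested;
};

/* ---- would live in a header, e.g. a hypothetical sketch_smc.h ---- */
#ifdef CONFIG_SMC
static inline bool sketch_smc_wanted(const struct sketch_sock *sk)
{
	return sk->smc_requested;
}
#else
/* Stub: always false when CONFIG_SMC is off, so callers need no #ifdef. */
static inline bool sketch_smc_wanted(const struct sketch_sock *sk)
{
	(void)sk;
	return false;
}
#endif

/* ---- would live in the .c file: no #ifdef at the call site ---- */
static int sketch_option_len(const struct sketch_sock *sk)
{
	int len = 20;	/* arbitrary base length for the example */

	if (sketch_smc_wanted(sk))
		len += 4;	/* arbitrary size for the extra option */
	return len;
}
```

With the stub returning a constant false, the compiler can drop the
branch entirely in builds without CONFIG_SMC, and the C file stays free
of conditional compilation.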




Re: Stable request for gso feature flag and error handling fixes

2015-07-07 Thread David Miller

From: Jay Vosburgh jay.vosbu...@canonical.com
Date: Tue, 07 Jul 2015 22:10:53 -0700

   Are the other -stable tree maintainers picking up patches after
 you've submitted to 4.1/3.18/3.14, or is it necessary to make separate
 requests?

I don't know and I frankly don't care.

Re: IPsec maintenance during the next weeks

2015-07-07 Thread David Miller

From: Steffen Klassert steffen.klass...@secunet.com
Date: Wed, 8 Jul 2015 07:04:32 +0200

 I'll be off without mail access for the next two and a half weeks.
 Can you please take urgent IPsec patches directly into the net
 tree during this time? 

 I'll let you know as soon as I'm back.

Sure, thanks for letting me know.

[PATCH net-next] net: igb: implement high frequency periodic output signals

2015-07-07 Thread Richard Cochran

In addition to interrupt driven target time output events, the i210
also has two programmable clock outputs.  These clocks support periods
between 16 nanoseconds and 140 milliseconds.  This patch implements
the periodic output function using the clock outputs when possible,
falling back to the target time for longer periods.
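
As a rough illustration of the selection rule described above (not the
driver code itself), the choice between the two mechanisms can be written
as a small pure function. The enum, the macro names and the function are
hypothetical; only the 16 ns and 140 ms bounds come from the description.
Note that the sketch compares full periods, whereas the patch works on
half periods.

```c
#include <stdint.h>

enum sketch_perout_mode {
	SKETCH_PEROUT_INVALID,		/* period too short for the hardware */
	SKETCH_PEROUT_FREQ_OUT,		/* use a programmable clock output */
	SKETCH_PEROUT_TARGET_TIME,	/* fall back to target time events */
};

#define SKETCH_MIN_PERIOD_NS		16LL		/* 16 nanoseconds */
#define SKETCH_MAX_FREQ_PERIOD_NS	140000000LL	/* 140 milliseconds */

/* Pick the output mechanism for a requested full period in nanoseconds. */
static enum sketch_perout_mode sketch_select_mode(int64_t period_ns)
{
	if (period_ns < SKETCH_MIN_PERIOD_NS)
		return SKETCH_PEROUT_INVALID;
	if (period_ns <= SKETCH_MAX_FREQ_PERIOD_NS)
		return SKETCH_PEROUT_FREQ_OUT;
	return SKETCH_PEROUT_TARGET_TIME;
}
```

For example, a 1 microsecond period would select the clock output, while
a 1 second period would fall back to the target time mechanism.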

Signed-off-by: Richard Cochran richardcoch...@gmail.com
---
 drivers/net/ethernet/intel/igb/e1000_regs.h |  2 +
 drivers/net/ethernet/intel/igb/igb_ptp.c| 72 +
 2 files changed, 54 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_regs.h b/drivers/net/ethernet/intel/igb/e1000_regs.h
index 6f0490d..4af2870 100644
--- a/drivers/net/ethernet/intel/igb/e1000_regs.h
+++ b/drivers/net/ethernet/intel/igb/e1000_regs.h
@@ -104,6 +104,8 @@
 #define E1000_TRGTTIMH0  0x0B648 /* Target Time Register 0 High - RW */
 #define E1000_TRGTTIML1  0x0B64C /* Target Time Register 1 Low  - RW */
 #define E1000_TRGTTIMH1  0x0B650 /* Target Time Register 1 High - RW */
+#define E1000_FREQOUT0   0x0B654 /* Frequency Out 0 Control Register - RW */
+#define E1000_FREQOUT1   0x0B658 /* Frequency Out 1 Control Register - RW */
 #define E1000_AUXSTMPL0  0x0B65C /* Auxiliary Time Stamp 0 Register Low  - RO */
 #define E1000_AUXSTMPH0  0x0B660 /* Auxiliary Time Stamp 0 Register High - RO */
 #define E1000_AUXSTMPL1  0x0B664 /* Auxiliary Time Stamp 1 Register Low  - RO */
diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index c3a9392c..5982f28 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -405,7 +405,7 @@ static void igb_pin_extts(struct igb_adapter *igb, int chan, int pin)
 	wr32(E1000_CTRL_EXT, ctrl_ext);
 }
 
-static void igb_pin_perout(struct igb_adapter *igb, int chan, int pin)
+static void igb_pin_perout(struct igb_adapter *igb, int chan, int pin, int freq)
 {
 	static const u32 aux0_sel_sdp[IGB_N_SDP] = {
 		AUX0_SEL_SDP0, AUX0_SEL_SDP1, AUX0_SEL_SDP2, AUX0_SEL_SDP3,
@@ -424,6 +424,14 @@ static void igb_pin_perout(struct igb_adapter *igb, int chan, int pin)
 		TS_SDP0_SEL_TT1, TS_SDP1_SEL_TT1,
 		TS_SDP2_SEL_TT1, TS_SDP3_SEL_TT1,
 	};
+	static const u32 ts_sdp_sel_fc0[IGB_N_SDP] = {
+		TS_SDP0_SEL_FC0, TS_SDP1_SEL_FC0,
+		TS_SDP2_SEL_FC0, TS_SDP3_SEL_FC0,
+	};
+	static const u32 ts_sdp_sel_fc1[IGB_N_SDP] = {
+		TS_SDP0_SEL_FC1, TS_SDP1_SEL_FC1,
+		TS_SDP2_SEL_FC1, TS_SDP3_SEL_FC1,
+	};
 	static const u32 ts_sdp_sel_clr[IGB_N_SDP] = {
 		TS_SDP0_SEL_FC1, TS_SDP1_SEL_FC1,
 		TS_SDP2_SEL_FC1, TS_SDP3_SEL_FC1,
@@ -445,11 +453,17 @@ static void igb_pin_perout(struct igb_adapter *igb, int chan, int pin)
 		tssdp &= ~AUX1_TS_SDP_EN;
 
 	tssdp &= ~ts_sdp_sel_clr[pin];
-	if (chan == 1)
-		tssdp |= ts_sdp_sel_tt1[pin];
-	else
-		tssdp |= ts_sdp_sel_tt0[pin];
-
+	if (freq) {
+		if (chan == 1)
+			tssdp |= ts_sdp_sel_fc1[pin];
+		else
+			tssdp |= ts_sdp_sel_fc0[pin];
+	} else {
+		if (chan == 1)
+			tssdp |= ts_sdp_sel_tt1[pin];
+		else
+			tssdp |= ts_sdp_sel_tt0[pin];
+	}
 	tssdp |= ts_sdp_en[pin];
 
 	wr32(E1000_TSSDP, tssdp);
@@ -463,10 +477,10 @@ static int igb_ptp_feature_enable_i210(struct ptp_clock_info *ptp,
 	struct igb_adapter *igb =
 		container_of(ptp, struct igb_adapter, ptp_caps);
 	struct e1000_hw *hw = &igb->hw;
-	u32 tsauxc, tsim, tsauxc_mask, tsim_mask, trgttiml, trgttimh;
+	u32 tsauxc, tsim, tsauxc_mask, tsim_mask, trgttiml, trgttimh, freqout;
 	unsigned long flags;
 	struct timespec ts;
-	int pin = -1;
+	int use_freq = 0, pin = -1;
 	s64 ns;
 
 	switch (rq->type) {
@@ -511,40 +525,58 @@ static int igb_ptp_feature_enable_i210(struct ptp_clock_info *ptp,
 		ts.tv_nsec = rq->perout.period.nsec;
 		ns = timespec_to_ns(&ts);
 		ns = ns >> 1;
-		if (on && ns < 500000LL) {
-			/* 2k interrupts per second is an awful lot. */
-			return -EINVAL;
+		if (on && ns <= 70000000LL) {
+			if (ns < 8LL)
+				return -EINVAL;
+			use_freq = 1;
 		}
 		ts = ns_to_timespec(ns);
 		if (rq->perout.index == 1) {
-			tsauxc_mask = TSAUXC_EN_TT1;
-			tsim_mask = TSINTR_TT1;
+			if (use_freq) {
+				tsauxc_mask = TSAUXC_EN_CLK1 | TSAUXC_ST1;
+				tsim_mask = 0;
+			} else {
+


[PATCH v2] net/bridge: Use __in6_dev_get rather than in6_dev_get in br_validate_ipv6

2015-07-07 Thread Julien Grall

Commit efb6de9b4ba0092b2c55f6a52d16294a8a698edd ("netfilter: bridge:
forward IPv6 fragmented packets") introduced a new function,
br_validate_ipv6, which takes a reference on the inet6 device. However,
the reference is never released.

As a result, it becomes impossible to destroy any netdevice that uses
both ipv6 and bridge.

It's possible to directly retrieve the inet6 device without taking a
reference as all netfilter hooks are protected by rcu_read_lock via
nf_hook_slow.
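
The leak and the fix can be illustrated with a toy userspace model. The
sketch_ names and the struct are invented for this example and only mimic
the get/put convention; the real functions operate on struct inet6_dev
and rely on RCU, which this model does not attempt to reproduce.

```c
#include <stddef.h>

/* Toy model of a reference-counted device; not the kernel's inet6_dev. */
struct sketch_idev {
	int refcnt;
};

/* Like a "get" that takes a reference: the caller must later put it. */
static struct sketch_idev *sketch_dev_get(struct sketch_idev *idev)
{
	idev->refcnt++;
	return idev;
}

/* Drop a reference taken by sketch_dev_get(). */
static void sketch_dev_put(struct sketch_idev *idev)
{
	idev->refcnt--;
}

/* Like a "__get": plain lookup; the caller relies on outside protection. */
static struct sketch_idev *sketch_dev_get_noref(struct sketch_idev *idev)
{
	return idev;
}

/* Buggy pattern: takes a reference and returns without dropping it. */
static int sketch_validate_leaky(struct sketch_idev *idev)
{
	struct sketch_idev *p = sketch_dev_get(idev);

	return p != NULL;
}

/* Fixed pattern: no reference is taken, so none can leak. */
static int sketch_validate_fixed(struct sketch_idev *idev)
{
	struct sketch_idev *p = sketch_dev_get_noref(idev);

	return p != NULL;
}
```

In this model every call to the leaky variant leaves the count one
higher, which is the same kind of stuck usage count the log message
below reports.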

Spotted while trying to destroy a Xen guest on the upstream Linux:
unregister_netdevice: waiting for vif1.0 to become free. Usage count = 1

Signed-off-by: Julien Grall julien.gr...@citrix.com
Cc: Bernhard Thaler bernhard.tha...@wvnet.at
Cc: Pablo Neira Ayuso pa...@netfilter.org
Cc: f...@strlen.de
Cc: ian.campb...@citrix.com
Cc: wei.l...@citrix.com
Cc: Bob Liu bob@oracle.com

---
Note that it's impossible to create new guest after this message.
I'm not sure if it's normal.

Changes in v2:
- Don't take a reference to inet6.
- This was net/bridge: Add missing in6_dev_put in
br_validate_ipv6 [0]

[0] https://lkml.org/lkml/2015/7/3/443
---
 net/bridge/br_netfilter_ipv6.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
index 6d12d26..13b7d1e 100644
--- a/net/bridge/br_netfilter_ipv6.c
+++ b/net/bridge/br_netfilter_ipv6.c
@@ -104,7 +104,7 @@ int br_validate_ipv6(struct sk_buff *skb)
 {
 	const struct ipv6hdr *hdr;
 	struct net_device *dev = skb->dev;
-	struct inet6_dev *idev = in6_dev_get(skb->dev);
+	struct inet6_dev *idev = __in6_dev_get(skb->dev);
 	u32 pkt_len;
 	u8 ip6h_len = sizeof(struct ipv6hdr);
 
-- 
2.1.4

