Re: [PATCH v2] stmmac: avoid ipq806x constant overflow warning

2015-11-13 Thread Geert Uytterhoeven
Hi Joe,

On Fri, Nov 13, 2015 at 8:52 AM, Joe Perches  wrote:
> On Fri, 2015-11-13 at 08:37 +0100, Geert Uytterhoeven wrote:
>> On Thu, Nov 12, 2015 at 10:03 PM, Arnd Bergmann 
>> wrote:
>> > Building dwmac-ipq806x on a 64-bit architecture produces a harmless
>> > warning from gcc:
>> >
>> > stmmac/dwmac-ipq806x.c: In function 'ipq806x_gmac_probe':
>> > include/linux/bitops.h:6:19: warning: overflow in implicit constant
>> > conversion [-Woverflow]
>> >   val = QSGMII_PHY_CDR_EN |
>> > stmmac/dwmac-ipq806x.c:333:8: note: in expansion of macro
>> > 'QSGMII_PHY_CDR_EN'
>> >  #define QSGMII_PHY_CDR_EN   BIT(0)
>> >  #define BIT(nr)   (1UL << (nr))
>> >
>> > This is a result of the type conversion rules in C, when we take
>> > the
>> > bitwise OR of multiple different types. In particular, we have
>> > an unsigned long
>> >
>> > QSGMII_PHY_CDR_EN == BIT(0) == (1ul << 0) ==
>> > 0x0001ul
>> >
>> > and a signed int
>> >
>> > 0xC << QSGMII_PHY_TX_DRV_AMP_OFFSET == 0xc000
>> >
>> > which together gives a signed long value
>> >
>> > 0xc001l
>> >
>> > and when this is passed into a function that takes an unsigned int
>> > type,
>> > gcc warns about the signed overflow and the loss of the upper
>> > 32 bits that
>> > are all ones.
>> >
>> > This patch adds 'ul' type modifiers to the literal numbers passed
>> > in
>> > here, so now the expression remains an 'unsigned long' with the
>> > upper
>> > bits all zero, and that avoids the signed overflow and the warning.
>>
>> FWIW, the 64-bitness of BIT() on 64-bit platforms is also causing
>> subtle
>> warnings in other places, e.g. when inverting them to create bit
>> mask, cfr.
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a9efeca613a8fe5281d7c91f5c8c9ea46f2312f6
>
> I still think specific length BIT macros
> can be useful.
>
> https://lkml.org/lkml/2015/10/16/852

Yeah!

I only recently started liking the BIT() macro (before I preferred hex, too).
Perhaps because Renesas datasheets use bit numbers all over the place ;-)
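
For reference, the promotion behaviour Arnd describes can be reproduced in plain C. The sketch below is only an illustration: BIT64(), AMP_SHIFT and the shift value 28 are assumptions made for the example, not the driver's definitions, and fixed-width types are used so the result does not depend on the size of long on the build host.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the driver's definitions. */
#define BIT64(nr)  (UINT64_C(1) << (nr))  /* what BIT() amounts to on a 64-bit build */
#define AMP_SHIFT  28                     /* assumed offset, for illustration only */

/* An int32_t holding the bit pattern 0xC0000000, i.e. what a signed
 * "0xC << 28" ends up as. It is written as a negative literal so the
 * sketch itself contains no signed-shift overflow. */
static const int32_t amp_signed = -0x40000000;

/* Mixing the signed 32-bit value with an unsigned 64-bit one converts it
 * to 64 bits first, which sign-extends it: the upper 32 bits become ones. */
static uint64_t or_with_signed(void)
{
	return BIT64(0) | amp_signed;
}

/* With an unsigned 64-bit literal the upper bits stay zero. */
static uint64_t or_with_unsigned(void)
{
	return BIT64(0) | (UINT64_C(0xC) << AMP_SHIFT);
}

/* Inverting a 64-bit BIT() and storing it in 32 bits drops the upper half,
 * which is the kind of truncation gcc warns about when it is implicit. */
static uint32_t inverted_mask32(void)
{
	return (uint32_t)~BIT64(0);
}
```

The first helper yields a value whose upper half is all ones, the second one does not; that difference is what the 'ul' suffixes in the patch are meant to achieve.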

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: OVS VXLAN decap rule has full match on TTL for the outer headers?

2015-11-13 Thread Joe Stringer
On 11 November 2015 at 22:34, Or Gerlitz  wrote:
> On Thu, Nov 12, 2015 at 12:44 AM, Jesse Gross  wrote:
>> On Wed, Nov 11, 2015 at 6:47 AM, Or Gerlitz  wrote:
>>> Hi Joe/Jesse,
>>>
>>> We've noticed that VXLAN decap rules set by OVS in the below trivial VXLAN
>>> config contain full match on TTL=64 for the outer headers, can you explain
>>> the reasoning behind it? Is that just a typo in dumping the flow?
>>
>> Looking at the code, this does indeed seem to be the case. To the best
>> of my knowledge, there was no particular reasoning behind it - it
>> simply was never an issue so the wildcard generation wasn't improved.
>
> so all of the multi-million vxlan packets that were ever decapsulated by OVS
> arrived with TTL=X=64, this is impossible... when tunneled traffic goes
> through routers, they change TTL of the outer packet, right?
>
> so if there was a router between my hosts, the packet would arrive with
> TTL=63, we still miss something here.

I don't follow the logic. You observed one flow which matched on
TTL=64, therefore all vxlan packets terminated at OVS have TTL=64?

If OVS received packets with different TTLs, they would miss and
ovs-vswitchd would generate flows to match that traffic too. If that
becomes an issue, presumably the wildcard generation can be improved.
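
To make the miss-and-reinstall behaviour concrete, here is a rough sketch of masked flow matching. It is not the OVS data structure or API; struct flow_key, struct flow and flow_matches() are hypothetical and reduced to two fields. A flow with a full mask on TTL only hits packets with exactly that TTL, while a zero mask wildcards the field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced flow key: outer TTL and UDP destination port. */
struct flow_key {
	uint8_t  ttl;
	uint16_t udp_dst;
};

/* A set bit in the mask means "this bit must match"; a zero bit is a wildcard. */
struct flow {
	struct flow_key key;
	struct flow_key mask;
};

/* A packet hits the flow when all unmasked bits are equal. */
static bool flow_matches(const struct flow *f, const struct flow_key *pkt)
{
	return ((pkt->ttl ^ f->key.ttl) & f->mask.ttl) == 0 &&
	       ((pkt->udp_dst ^ f->key.udp_dst) & f->mask.udp_dst) == 0;
}

/* Exact match on TTL=64, as seen in the dumped decap rule. */
static const struct flow exact_ttl = { { 64, 4789 }, { 0xff, 0xffff } };

/* Same flow with the TTL wildcarded. */
static const struct flow wild_ttl = { { 64, 4789 }, { 0x00, 0xffff } };
```

With exact_ttl, a packet arriving with TTL 63 misses and would need its own flow; with wild_ttl, both packets hit the same entry.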

> 08:09:32.704276 00:02:c9:e9:bf:32 > f4:52:14:01:da:82, ethertype IPv4
> (0x0800), length 148: (tos 0x0, ttl 64, id 31703, offset 0, flags
> [DF], proto UDP (17), length 134)
> 192.168.31.17.51757 > 192.168.31.18.4789: [no cksum] UDP, length 106
> 08:09:32.704323 f4:52:14:01:da:82 > 00:02:c9:e9:bf:32, ethertype IPv4
> (0x0800), length 148: (tos 0x0, ttl 64, id 45301, offset 0, flags
> [DF], proto UDP (17), length 134)
> 192.168.31.18.53633 > 192.168.31.17.4789: [no cksum] UDP, length 106
>
>
>>> I also noticed that on my systems (upstream kernel 4.3.0-rc6+, veth
>>> emulating a VM network 192.168.52/24 and host network 192.168.31/24, ovs
>>> user-space 2.3.2) something is broken in the encap rule reporting, traffic
>>> goes fine (below)
>
>> That is pretty strange, I have never seen that before. It seems that
>> the key to be set is being reported as UNSPEC, which is weird if
>> traffic is flowing normally. I guess you could try a newer version of
>> userspace but I'm not aware of any patches that went in that would
>> have fixed a bug in this area.
>
> Seems like a clear regression here w.r.t. inter-operability between OVS
> user/kernel,
> user space 2.3.2 which is listed in the web site as the latest release
> on the LTS branch reports junk,
> user space 2.0.90 crashes...

I agree that this UNSPEC issue on v2.3 could do with a bit of a closer
look. I'll see if I can find some time for it. Alternatively if you're
willing and have bandwidth, I'd be curious if it's related to the
masked set field feature introduced in Linux-4.0.

OVS 2.0.90 isn't an actual version number, it typically means master
at some point between v2.0 and v2.1 (unknown), and furthermore it
sounds like that issue is fixed in v2.3 so upgrading is the best
option.

> looking in the kernel logs when I did the runs with 2.3.2 I do see
> this prints few times
>
> openvswitch: netlink: Key type 62 is out of range max 26

This particular issue looks like the experimental MPLS support in
OVS 2.3.x which was later reshuffled when it was merged upstream. At
that time there was no way to silently probe for features, so the MPLS
feature probe ends up causing a log message like this when you pair
OVS 2.3.x with any kernel module that didn't also have the same
experimental MPLS support (which, I'm not sure if there's any official
version released anywhere). It is harmless.

> and later the below "Dropping previously announced user features" warning

In this case it looks like you created the datapath using a newer
version of the userspace utilities, then without deleting the
datapath, attempted to reuse the datapath with an older version of the
userspace utilities. This is fine, but it warns you because it drops
particular user features which the newer userspace supported (because
the older userspace doesn't support them). Sure, it's not the most
graceful, but it doesn't look fatal in and of itself. Comment from the
code below for context:

/* An outdated user space instance that does not understand
* the concept of user_features has attempted to create a new
* datapath and is likely to reuse it. Drop all user features.
*/

> So I now moved to 2.4.0, and things aren't much better... can you give
> a quick try on
> your systems for upstream kernel against upstream OVS w.r.t. a simple
> VXLAN config?

What do you mean by "not much better"? Do you mean that you still
observe one of the above three issues, or you see a different issue?
In particular I'd be curious if you observe the UNSPEC issue.

FWIW I've been regularly running a very trivial vxlan test with the
upstream (master) OVS and at least v4.3, and I haven't noticed
anything particularly unusual. That said I haven'

Re: [PATCH] net: phy: vitesse: add support for VSC8601

2015-11-13 Thread Mason
On 12/11/2015 19:41, Mans Rullgard wrote:

> + .phy_id = PHY_ID_VSC8601,
> + .name   = "Vitesse VSC8601",
> + .phy_id_mask= 0x0000,
> + .features   = PHY_GBIT_FEATURES,
> + .flags  = PHY_HAS_INTERRUPT,
> + .config_init= &genphy_config_init,
> + .config_aneg= &genphy_config_aneg,
> + .read_status= &genphy_read_status,
> + .ack_interrupt  = &vsc824x_ack_interrupt,
> + .config_intr= &vsc82xx_config_intr,

I expected Documentation/CodingStyle to forbid taking the address
of functions.

Regards.



Re: [PATCH net-next RFC V3 0/3] basic busy polling support for vhost_net

2015-11-13 Thread Jason Wang


On 11/12/2015 08:02 PM, Felipe Franciosi wrote:
> Hi Jason,
>
> I understand your busy loop timeout is quite conservative at 50us. Did you 
> try any other values?

I've also tried 20us. The results show 50us was better in:

- very small packet tx (e.g 64bytes at most 46% improvement)
- TCP_RR (at most 11% improvement)

But I will test bigger values. In fact, for net itself, we can be even
more aggressive: make vhost poll forever, but I haven't tried this.

>
> Also, did you measure how polling affects many VMs talking to each other 
> (e.g. 20 VMs on each host, perhaps with several vNICs each, transmitting to a 
> corresponding VM/vNIC pair on another host)?

Not yet; it's on my todo list.

>
>
> On a complete separate experiment (busy waiting on storage I/O rings on Xen), 
> I have observed that bigger timeouts gave bigger benefits. On the other hand, 
> all cases that contended for CPU were badly hurt with any sort of polling.
>
> The cases that contended for CPU consisted of many VMs generating workload 
> over very fast I/O devices (in that case, several NVMe devices on a single 
> host). And the metric that got affected was aggregate throughput from all VMs.
>
> The solution was to determine whether to poll depending on the host's overall 
> CPU utilisation at that moment. That gave me the best of both worlds as 
> polling made everything faster without slowing down any other metric.

You mean a threshold and exit polling when it exceeds this? I use a
simpler method: just exit the busy loop when there is more than one
process in the running state. I tested this method in the past for socket
busy read (http://www.gossamer-threads.com/lists/linux/kernel/1997531)
and it seemed to solve the issue, but I haven't tested it for vhost
polling. I will run some simple tests (e.g. pin two vhost threads to one
host cpu) and see how well it performs.
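
A rough sketch of the exit condition described above, written as plain C rather than vhost code. Everything here is hypothetical: struct poll_env and its hooks stand in for whatever the kernel would actually query, and the small simulated environment exists only so the loop can be tried out (each clock read advances time by one microsecond).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks; none of these are vhost or scheduler APIs. */
struct poll_env {
	uint64_t (*now_us)(void *ctx);          /* monotonic clock, microseconds */
	bool (*work_pending)(void *ctx);        /* new descriptor or socket data? */
	unsigned int (*nr_running)(void *ctx);  /* runnable tasks on this CPU */
	void *ctx;
};

enum poll_result { POLL_GOT_WORK, POLL_TIMEOUT, POLL_CPU_CONTENDED };

/* Spin until work shows up, the timeout expires, or another task wants the CPU. */
static enum poll_result busy_poll(const struct poll_env *env, uint64_t timeout_us)
{
	uint64_t start = env->now_us(env->ctx);

	for (;;) {
		if (env->work_pending(env->ctx))
			return POLL_GOT_WORK;
		if (env->nr_running(env->ctx) > 1)
			return POLL_CPU_CONTENDED;
		if (env->now_us(env->ctx) - start >= timeout_us)
			return POLL_TIMEOUT;
	}
}

/* Simulated environment: every clock read advances time by one microsecond. */
struct sim {
	uint64_t t_us;
	uint64_t work_at_us;
	unsigned int running;
};

static uint64_t sim_now(void *ctx)
{
	struct sim *s = ctx;
	return s->t_us++;
}

static bool sim_pending(void *ctx)
{
	struct sim *s = ctx;
	return s->t_us >= s->work_at_us;
}

static unsigned int sim_running(void *ctx)
{
	struct sim *s = ctx;
	return s->running;
}

static enum poll_result simulate(uint64_t work_at_us, unsigned int running,
				 uint64_t timeout_us)
{
	struct sim s = { 0, work_at_us, running };
	struct poll_env env = { sim_now, sim_pending, sim_running, &s };

	return busy_poll(&env, timeout_us);
}
```

For example, simulate(10, 1, 50) finds work before the 50us budget runs out, while simulate(1000, 2, 50) gives up immediately because a second runnable task is present.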

Thanks

>
> Thanks,
> Felipe
>
>
>
> On 12/11/2015 10:20, "kvm-ow...@vger.kernel.org on behalf of Jason Wang" 
>  wrote:
>
>>
>> On 11/12/2015 06:16 PM, Jason Wang wrote:
>>> Hi all:
>>>
>>> This series tries to add basic busy polling for vhost net. The idea is
>>> simple: at the end of tx/rx processing, busy poll for newly added tx
>>> descriptors and for rx socket data for a while. The maximum amount of
>>> time (in us) that can be spent on busy polling is specified via ioctl.
>>>
>>> Tests were done with:
>>>
>>> - 50 us as busy loop timeout
>>> - Netperf 2.6
>>> - Two machines with back to back connected ixgbe
>>> - Guest with 1 vcpu and 1 queue
>>>
>>> Results:
>>> - For stream workload, ioexits were reduced dramatically in medium
>>>   size (1024-2048) of tx (at most -39%) and almost all rx (at most
>>>   -79%) as a result of polling. This compensates for the possible
>>>   wasted cpu cycles more or less. That is probably why we can still see
>>>   some increase in the normalized throughput in some cases.
>>> - Throughput of tx was increased (at most 105%) except for the huge
>>>   write (16384). And we can send more packets in that case (+tpkts were
>>>   increased).
>>> - Very minor rx regression in some cases.
>>> - Improvement on TCP_RR (at most 16%).
>> Forgot to mention, the following test results by order are:
>>
>> 1) Guest TX
>> 2) Guest RX
>> 3) TCP_RR
>>
>>> size/session/+thu%/+normalize%/+tpkts%/+rpkts%/+ioexits%/
>>>    64/ 1/   +9%/  -17%/   +5%/  +10%/   -2%
>>>    64/ 2/   +8%/  -18%/   +6%/  +10%/   -1%
>>>    64/ 4/   +4%/  -21%/   +6%/  +10%/   -1%
>>>    64/ 8/   +9%/  -17%/   +6%/   +9%/   -2%
>>>   256/ 1/  +20%/   -1%/  +15%/  +11%/   -9%
>>>   256/ 2/  +15%/   -6%/  +15%/   +8%/   -8%
>>>   256/ 4/  +17%/   -4%/  +16%/   +8%/   -8%
>>>   256/ 8/  -61%/  -69%/  +16%/  +10%/  -10%
>>>   512/ 1/  +15%/   -3%/  +19%/  +18%/  -11%
>>>   512/ 2/  +19%/    0%/  +19%/  +13%/  -10%
>>>   512/ 4/  +18%/   -2%/  +18%/  +15%/  -10%
>>>   512/ 8/  +17%/   -1%/  +18%/  +15%/  -11%
>>>  1024/ 1/  +25%/   +4%/  +27%/  +16%/  -21%
>>>  1024/ 2/  +28%/   +8%/  +25%/  +15%/  -22%
>>>  1024/ 4/  +25%/   +5%/  +25%/  +14%/  -21%
>>>  1024/ 8/  +27%/   +7%/  +25%/  +16%/  -21%
>>>  2048/ 1/  +32%/  +12%/  +31%/  +22%/  -38%
>>>  2048/ 2/  +33%/  +12%/  +30%/  +23%/  -36%
>>>  2048/ 4/  +31%/  +10%/  +31%/  +24%/  -37%
>>>  2048/ 8/ +105%/  +75%/  +33%/  +23%/  -39%
>>> 16384/ 1/    0%/  -14%/   +2%/    0%/  +19%
>>> 16384/ 2/    0%/  -13%/  +19%/  -13%/  +17%
>>> 16384/ 4/    0%/  -12%/   +3%/    0%/   +2%
>>> 16384/ 8/    0%/  -11%/   -2%/   +1%/   +1%
>>> size/session/+thu%/+normalize%/+tpkts%/+rpkts%/+ioexits%/
>>>    64/ 1/   -7%/  -23%/   +4%/   +6%/  -74%
>>>    64/ 2/   -2%/  -12%/   +2%/   +2%/  -55%
>>>    64/ 4/   +2%/   -5%/  +10%/   -2%/  -43%
>>>    64/ 8/   -5%/   -5%/  +11%/  -34%/  -59%
>>>   256/ 1/   -6%/  -16%/   +9%/  +11%/  -60%
>>>   256/ 2/   +3%/   -4%/   +6%/   -3%/  -28%
>>>   256/ 4/    0%/   -5%/   -9%/   -9%/  -10%
>>>   256/

[PATCH net] be2net: check properly status in lancer_cmd_get_file_len()

2015-11-13 Thread Ivan Vecera
The lancer_cmd_get_file_len() calls lancer_cmd_read_object() to get
the current size of registers for ethtool registers dump. The size
is stored in data_read only when the returned status is 0; otherwise
it is left uninitialized and the returned length is random.

Signed-off-by: Ivan Vecera 
---
 drivers/net/ethernet/emulex/benet/be_ethtool.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
index f4cb8e4..26b6192 100644
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
+++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
@@ -248,6 +248,8 @@ static u32 lancer_cmd_get_file_len(struct be_adapter *adapter, u8 *file_name)
status = lancer_cmd_read_object(adapter, &data_len_cmd, 0, 0,
file_name, &data_read, &eof,
&addn_status);
+   if (status)
+   return 0;
 
return data_read;
 }
-- 
2.4.10
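
The pattern being fixed can be shown in isolation. This is a minimal sketch, not the driver code: read_object() and get_file_len() are hypothetical stand-ins, and the value 4096 is made up. The point is only that an output parameter left untouched on failure must not be returned.

```c
#include <stdint.h>

/* Hypothetical stand-in for lancer_cmd_read_object(): on failure it returns
 * a non-zero status and leaves *data_read untouched. */
static int read_object(int fail, uint32_t *data_read)
{
	if (fail)
		return -1;
	*data_read = 4096; /* made-up length for the example */
	return 0;
}

static uint32_t get_file_len(int fail)
{
	uint32_t data_read; /* deliberately uninitialized, as in the driver */
	int status = read_object(fail, &data_read);

	if (status)
		return 0; /* without this check, garbage would be returned */

	return data_read;
}
```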



Re: [PATCH] net: phy: vitesse: add support for VSC8601

2015-11-13 Thread Måns Rullgård
Mason  writes:

> On 12/11/2015 19:41, Mans Rullgard wrote:
>
>> +.phy_id = PHY_ID_VSC8601,
>> +.name   = "Vitesse VSC8601",
>> +.phy_id_mask= 0x0000,
>> +.features   = PHY_GBIT_FEATURES,
>> +.flags  = PHY_HAS_INTERRUPT,
>> +.config_init= &genphy_config_init,
>> +.config_aneg= &genphy_config_aneg,
>> +.read_status= &genphy_read_status,
>> +.ack_interrupt  = &vsc824x_ack_interrupt,
>> +.config_intr= &vsc82xx_config_intr,
>
> I expected Documentation/CodingStyle to forbid taking the address
> of functions.

I can't find anything to that effect.  That said, it's not something I
would normally do, but all the other phy_driver entries in that file
look like that.
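
For what it's worth, the two spellings are equivalent in C: a function designator converts to a pointer to the function, so the '&' is optional. A small sketch with a hypothetical ops table (demo_ops and demo_config_init are not phylib names):

```c
#include <stdbool.h>

/* Hypothetical callback, loosely shaped like a driver hook. */
static int demo_config_init(int dev)
{
	return dev + 1;
}

struct demo_ops {
	int (*config_init)(int dev);
};

/* Both initializers store the same pointer. */
static const struct demo_ops with_ampersand = {
	.config_init = &demo_config_init,
};

static const struct demo_ops without_ampersand = {
	.config_init = demo_config_init,
};

static bool same_callback(void)
{
	return with_ampersand.config_init == without_ampersand.config_init;
}
```

So the choice is purely one of style and of consistency with the surrounding file.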

-- 
Måns Rullgård
m...@mansr.com


[PATCH net] ravb: Fix int mask value overwritten issue

2015-11-13 Thread Yoshihiro Kaneko
From: Masaru Nagai 

When RX/TX interrupt for Network Control queue and Best Effort queue
is issued at the same time, the interrupt mask of Network Control
queue will be reset when the mask of Best Effort queue is set.
This patch fixes this problem.

Signed-off-by: Masaru Nagai 
Signed-off-by: Yoshihiro Kaneko 
---

This patch is based on the master branch of David Miller's networking tree.

 drivers/net/ethernet/renesas/ravb_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index aa7b208..7180e26 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -733,8 +733,10 @@ static irqreturn_t ravb_interrupt(int irq, void *dev_id)
((tis  & tic)  & BIT(q))) {
if (napi_schedule_prep(&priv->napi[q])) {
/* Mask RX and TX interrupts */
-   ravb_write(ndev, ric0 & ~BIT(q), RIC0);
-   ravb_write(ndev, tic  & ~BIT(q), TIC);
+   ric0 &= ~BIT(q);
+   tic &= ~BIT(q);
+   ravb_write(ndev, ric0, RIC0);
+   ravb_write(ndev, tic, TIC);
__napi_schedule(&priv->napi[q]);
} else {
netdev_warn(ndev,
-- 
1.9.1
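
The effect of the change can be modelled without the hardware. In this sketch the register is just a variable, and the queue indices (Q_BE = 0, Q_NC = 1) and the loop order are assumptions made for illustration; only the difference between writing from a stale snapshot and writing from an updated local copy is taken from the patch.

```c
#include <stdint.h>

#define BIT(nr) (UINT32_C(1) << (nr))

enum { Q_BE = 0, Q_NC = 1 }; /* hypothetical queue indices */

/* Old behaviour: every write is derived from the original snapshot, so the
 * second write sets the bit that the first write had just cleared. */
static uint32_t mask_from_snapshot(uint32_t ric0, uint32_t pending)
{
	uint32_t reg = ric0; /* simulated RIC0 register */
	int q;

	for (q = Q_NC; q >= Q_BE; q--)
		if (pending & BIT(q))
			reg = ric0 & ~BIT(q);
	return reg;
}

/* Patched behaviour: the local copy accumulates every cleared bit. */
static uint32_t mask_accumulated(uint32_t ric0, uint32_t pending)
{
	uint32_t reg = ric0;
	int q;

	for (q = Q_NC; q >= Q_BE; q--)
		if (pending & BIT(q)) {
			ric0 &= ~BIT(q);
			reg = ric0;
		}
	return reg;
}
```

With both queues enabled and both pending, the snapshot variant ends with the Network Control bit set again, while the accumulated variant leaves both bits cleared.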



Re: [PATCH v2 net-next] net/core: ensure features get disabled on new lower devs

2015-11-13 Thread Jiri Pirko
Fri, Nov 13, 2015 at 01:26:18AM CET, f.faine...@gmail.com wrote:
>On 04/11/15 18:56, David Miller wrote:
>>> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev 
>>> features down stack")
>>  ...
>>> Reported-by: Nikolay Aleksandrov 
>>> Signed-off-by: Jarod Wilson 
>>> ---
>>> v2: Based on suggestions from Alex, and with not changing err to ret, this
>>> patch actually becomes quite minimal and doesn't ugly up the code much.
>> 
>> Applied, thanks.
>
>This causes some warnings to be displayed for DSA stacked devices:
>
>[1.272297] brcm-sf2 f0b0.ethernet_switch: Starfighter 2 top:
>4.00, core: 2.00 base: 0xf0c8, IRQs: 68, 69
>[1.283181] libphy: dsa slave smi: probed
>[1.344088] f0b403c0.mdio:05: Broadcom BCM7445 PHY revision: 0xd0,
>patch: 3
>[1.658917] brcm-sf2 f0b0.ethernet_switch gphy (uninitialized):
>attached PHY at address 5 [Broadcom BCM7445]
>[1.669414] brcm-sf2 f0b0.ethernet_switch gphy: set_features()
>failed (-1); wanted 0x4020, left 0x4820
>[1.734202] brcm-sf2 f0b0.ethernet_switch rgmii_1
>(uninitialized): attached PHY at address 0 [Generic PHY]
>[1.744486] brcm-sf2 f0b0.ethernet_switch rgmii_1: set_features()
>failed (-1); wanted 0x4020, left 0x4820
>[1.809091] brcm-sf2 f0b0.ethernet_switch rgmii_2
>(uninitialized): attached PHY at address 1 [Generic PHY]
>[1.819364] brcm-sf2 f0b0.ethernet_switch rgmii_2: set_features()
>failed (-1); wanted 0x4020, left 0x4820
>[1.884090] brcm-sf2 f0b0.ethernet_switch moca (uninitialized):
>attached PHY at address 2 [Generic PHY]
>[1.894109] brcm-sf2 f0b0.ethernet_switch moca: set_features()
>failed (-1); wanted 0x4020, left 0x4820
>
>DSA slave network devices are not associated with their master network
>device using the typical lower/upper netdev helpers.
>
>I do not have a good fix to come up with yet, but if you see something
>obvious with net/dsa/slave.c, feel free to send patches for testing, I
>can boot net-next on this platform.

I'm having similar issues with bridge, with Linus's git now:


...
[   14.354362] br0: set_features() failed (-1); wanted 0x00801fd978a9, left 0x00801fff78e9
[   14.430480] br0: set_features() failed (-1); wanted 0x00801fd978a9, left 0x00801fff78e9
[   14.430550] IPv6: ADDRCONF(NETDEV_UP): br0: link is not ready
[   17.938637] tg3 :01:00.0 eno1: Link is up at 1000 Mbps, full duplex
[   17.938647] tg3 :01:00.0 eno1: Flow control is off for TX and off for RX
[   17.938651] tg3 :01:00.0 eno1: EEE is disabled
[   17.938669] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
[   17.938753] br0: port 1(eno1) entered forwarding state
[   17.938762] br0: port 1(eno1) entered forwarding state
[   17.938834] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
[   29.763514] FS-Cache: Loaded
[   29.917680] FS-Cache: Netfs 'nfs' registered for caching
[   29.936739] Key type dns_resolver registered
[   30.637482] NFS: Registering the id_resolver key type
[   30.637502] Key type id_resolver registered
[   30.637504] Key type id_legacy registered
[   31.286444] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   31.403005] Ebtables v2.0 registered
[   31.630354] tun: Universal TUN/TAP device driver, 1.6
[   31.630358] tun: (C) 1999-2004 Max Krasnyansky 
[   31.630824] virbr0-nic: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[   31.677764] virbr0-nic: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[   31.677855] device virbr0-nic entered promiscuous mode
[   31.677898] virbr0: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
[   31.904892] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[   32.087094] virbr0: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
[   32.087196] virbr0: port 1(virbr0-nic) entered listening state
[   32.087205] virbr0: port 1(virbr0-nic) entered listening state
[   32.093676] br0: set_features() failed (-1); wanted 0x00801fd978a9, left 0x00801fff78e9
[   32.093786] virbr0: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
[   32.093872] virbr0-nic: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[   32.093966] virbr0: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
[   32.094051] virbr0-nic: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[   32.094132] virbr0: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
[   32.124341] virbr0: port 1(virbr0-nic) entered disabled state



[PATCH net-next 2/2] be2net: replace hardcoded values with existing define

2015-11-13 Thread Ivan Vecera
Signed-off-by: Ivan Vecera 
---
 drivers/net/ethernet/emulex/benet/be_ethtool.c | 3 ++-
 drivers/net/ethernet/emulex/benet/be_main.c| 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
index 3d8c6c1..2362304 100644
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
+++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
@@ -1116,7 +1116,8 @@ static int be_set_rss_hash_opts(struct be_adapter *adapter,
return 0;
 
status = be_cmd_rss_config(adapter, adapter->rss_info.rsstable,
-  rss_flags, 128, adapter->rss_info.rss_hkey);
+  rss_flags, RSS_INDIR_TABLE_LEN,
+  adapter->rss_info.rss_hkey);
if (!status)
adapter->rss_info.rss_flags = rss_flags;
 
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index eb48a97..b6ad029 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3518,7 +3518,7 @@ static int be_rx_qs_create(struct be_adapter *adapter)
 
netdev_rss_key_fill(rss_key, RSS_HASH_KEY_LEN);
rc = be_cmd_rss_config(adapter, rss->rsstable, rss->rss_flags,
-  128, rss_key);
+  RSS_INDIR_TABLE_LEN, rss_key);
if (rc) {
rss->rss_flags = RSS_ENABLE_NONE;
return rc;
-- 
2.4.10



[PATCH net-next 1/2] be2net: remove unused local rsstable array

2015-11-13 Thread Ivan Vecera
Remove rsstable array and its initialization from be_set_rss_hash_opts().
The array became unused after "e255787 be2net: Support for configurable
RSS hash key". The initial RSS table is now filled and stored for later
usage during Rx queue creation.

Signed-off-by: Ivan Vecera 
---
 drivers/net/ethernet/emulex/benet/be_ethtool.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
index 26b6192..3d8c6c1 100644
--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
+++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
@@ -1064,9 +1064,7 @@ static int be_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
 static int be_set_rss_hash_opts(struct be_adapter *adapter,
struct ethtool_rxnfc *cmd)
 {
-   struct be_rx_obj *rxo;
-   int status = 0, i, j;
-   u8 rsstable[128];
+   int status;
u32 rss_flags = adapter->rss_info.rss_flags;
 
if (cmd->data != L3_RSS_FLAGS &&
@@ -1115,17 +1113,7 @@ static int be_set_rss_hash_opts(struct be_adapter *adapter,
}
 
if (rss_flags == adapter->rss_info.rss_flags)
-   return status;
-
-   if (be_multi_rxq(adapter)) {
-   for (j = 0; j < 128; j += adapter->num_rss_qs) {
-   for_all_rss_queues(adapter, rxo, i) {
-   if ((j + i) >= 128)
-   break;
-   rsstable[j + i] = rxo->rss_id;
-   }
-   }
-   }
+   return 0;
 
status = be_cmd_rss_config(adapter, adapter->rss_info.rsstable,
   rss_flags, 128, adapter->rss_info.rss_hkey);
-- 
2.4.10



Re: [PATCH 11/14] mm: memcontrol: do not account memory+swap on unified hierarchy

2015-11-13 Thread Michal Hocko
On Thu 12-11-15 18:41:30, Johannes Weiner wrote:
> The unified hierarchy memory controller doesn't expose the memory+swap
> counter to userspace, but its accounting is hardcoded in all charge
> paths right now, including the per-cpu charge cache ("the stock").
> 
> To avoid adding yet more pointless memory+swap accounting with the
> socket memory support in unified hierarchy, disable the counter
> altogether when in unified hierarchy mode.
> 
> Signed-off-by: Johannes Weiner 

Acked-by: Michal Hocko 

> ---
>  mm/memcontrol.c | 44 +++++++++++++++++++++++++-------------------
>  1 file changed, 25 insertions(+), 19 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 658bef2..e7f1a79 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -87,6 +87,12 @@ int do_swap_account __read_mostly;
>  #define do_swap_account  0
>  #endif
>  
> +/* Whether legacy memory+swap accounting is active */
> +static bool do_memsw_account(void)
> +{
> + return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && do_swap_account;
> +}
> +
>  static const char * const mem_cgroup_stat_names[] = {
>   "cache",
>   "rss",
> @@ -1177,7 +1183,7 @@ static unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)
>   if (count < limit)
>   margin = limit - count;
>  
> - if (do_swap_account) {
> + if (do_memsw_account()) {
>   count = page_counter_read(&memcg->memsw);
>   limit = READ_ONCE(memcg->memsw.limit);
>   if (count <= limit)
> @@ -1280,7 +1286,7 @@ void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
>   pr_cont(":");
>  
>   for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
> - if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
> + if (i == MEM_CGROUP_STAT_SWAP && !do_memsw_account())
>   continue;
>   pr_cont(" %s:%luKB", mem_cgroup_stat_names[i],
>   K(mem_cgroup_read_stat(iter, i)));
> @@ -1903,7 +1909,7 @@ static void drain_stock(struct memcg_stock_pcp *stock)
>  
>   if (stock->nr_pages) {
>   page_counter_uncharge(&old->memory, stock->nr_pages);
> - if (do_swap_account)
> + if (do_memsw_account())
>   page_counter_uncharge(&old->memsw, stock->nr_pages);
>   css_put_many(&old->css, stock->nr_pages);
>   stock->nr_pages = 0;
> @@ -2033,11 +2039,11 @@ retry:
>   if (consume_stock(memcg, nr_pages))
>   return 0;
>  
> - if (!do_swap_account ||
> + if (!do_memsw_account() ||
>   page_counter_try_charge(&memcg->memsw, batch, &counter)) {
>   if (page_counter_try_charge(&memcg->memory, batch, &counter))
>   goto done_restock;
> - if (do_swap_account)
> + if (do_memsw_account())
>   page_counter_uncharge(&memcg->memsw, batch);
>   mem_over_limit = mem_cgroup_from_counter(counter, memory);
>   } else {
> @@ -2124,7 +2130,7 @@ force:
>* temporarily by force charging it.
>*/
>   page_counter_charge(&memcg->memory, nr_pages);
> - if (do_swap_account)
> + if (do_memsw_account())
>   page_counter_charge(&memcg->memsw, nr_pages);
>   css_get_many(&memcg->css, nr_pages);
>  
> @@ -2161,7 +2167,7 @@ static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
>   return;
>  
>   page_counter_uncharge(&memcg->memory, nr_pages);
> - if (do_swap_account)
> + if (do_memsw_account())
>   page_counter_uncharge(&memcg->memsw, nr_pages);
>  
>   css_put_many(&memcg->css, nr_pages);
> @@ -2441,7 +2447,7 @@ void __memcg_kmem_uncharge(struct page *page, int order)
>  
>   page_counter_uncharge(&memcg->kmem, nr_pages);
>   page_counter_uncharge(&memcg->memory, nr_pages);
> - if (do_swap_account)
> + if (do_memsw_account())
>   page_counter_uncharge(&memcg->memsw, nr_pages);
>  
>   page->mem_cgroup = NULL;
> @@ -3154,7 +3160,7 @@ static int memcg_stat_show(struct seq_file *m, void *v)
>   BUILD_BUG_ON(ARRAY_SIZE(mem_cgroup_lru_names) != NR_LRU_LISTS);
>  
>   for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
> - if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
> + if (i == MEM_CGROUP_STAT_SWAP && !do_memsw_account())
>   continue;
>   seq_printf(m, "%s %lu\n", mem_cgroup_stat_names[i],
>  mem_cgroup_read_stat(memcg, i) * PAGE_SIZE);
> @@ -3176,14 +3182,14 @@ static int memcg_stat_show(struct seq_file *m, void *v)
>   }
>   seq_printf(m, "hierarchical_memory_limit %llu\n",
>  (u64)memory * PAGE_SIZE);
> - if (do_swap_account)
> + if (do_memsw_account())
>   seq_printf(m, "hierarchical_memsw_limit %llu\n",
>

[PATCH v2] net ipv4: use preferred log methods

2015-11-13 Thread Bastian Stender
Replace printk calls with preferred unconditional log method calls to keep
kernel messages clean. Conditional printks were left untouched to avoid
change in behaviour.

Furthermore a newline was added to the "too small MTU" message.



[PATCH] net ipv4: use preferred log methods

2015-11-13 Thread Bastian Stender
Replace printk calls with preferred unconditional log method calls to keep
kernel messages clean.

Added newline to "too small MTU" message.

Signed-off-by: Bastian Stender 
---
 net/ipv4/ipconfig.c| 73 ++
 net/ipv4/netfilter/arp_tables.c|  6 +--
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |  2 +-
 net/ipv4/netfilter/nf_nat_snmp_basic.c | 22 
 4 files changed, 44 insertions(+), 59 deletions(-)

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 0bc7412..e86e8a9 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -65,15 +65,6 @@
 #include 
 #include 
 
-/* Define this to allow debugging output */
-#undef IPCONFIG_DEBUG
-
-#ifdef IPCONFIG_DEBUG
-#define DBG(x) printk x
-#else
-#define DBG(x) do { } while(0)
-#endif
-
 #if defined(CONFIG_IP_PNP_DHCP)
 #define IPCONFIG_DHCP
 #endif
@@ -227,7 +218,7 @@ static int __init ic_open_devs(void)
if (dev->mtu >= 364)
able |= IC_BOOTP;
else
-   pr_warn("DHCP/BOOTP: Ignoring device %s, MTU %d too small",
+   pr_warn("DHCP/BOOTP: Ignoring device %s, MTU %d too small\n",
dev->name, dev->mtu);
if (!(dev->flags & IFF_NOARP))
able |= IC_RARP;
@@ -254,8 +245,8 @@ static int __init ic_open_devs(void)
else
d->xid = 0;
ic_proto_have_if |= able;
-   DBG(("IP-Config: %s UP (able=%d, xid=%08x)\n",
-   dev->name, able, d->xid));
+   pr_debug("IP-Config: %s UP (able=%d, xid=%08x)\n",
+dev->name, able, d->xid);
}
}
 
@@ -311,7 +302,7 @@ static void __init ic_close_devs(void)
next = d->next;
dev = d->dev;
if (dev != ic_dev && !netdev_uses_dsa(dev)) {
-   DBG(("IP-Config: Downing %s\n", dev->name));
+   pr_debug("IP-Config: Downing %s\n", dev->name);
dev_change_flags(dev, d->flags);
}
kfree(d);
@@ -464,7 +455,8 @@ static int __init ic_defaults(void)
   &ic_myaddr);
return -1;
}
-   printk("IP-Config: Guessing netmask %pI4\n", &ic_netmask);
+   pr_notice("IP-Config: Guessing netmask %pI4\n",
+ &ic_netmask);
}
 
return 0;
@@ -675,9 +667,7 @@ ic_dhcp_init_options(u8 *options)
u8 *e = options;
int len;
 
-#ifdef IPCONFIG_DEBUG
-   printk("DHCP: Sending message type %d\n", mt);
-#endif
+   pr_debug("DHCP: Sending message type %d\n", mt);
 
memcpy(e, ic_bootp_cookie, 4);  /* RFC1048 Magic Cookie */
e += 4;
@@ -847,7 +837,8 @@ static void __init ic_bootp_send_if(struct ic_device *d, unsigned long jiffies_d
else if (dev->type == ARPHRD_FDDI)
b->htype = ARPHRD_ETHER;
else {
-   printk("Unknown ARP type 0x%04x for device %s\n", dev->type, dev->name);
+   pr_warn("Unknown ARP type 0x%04x for device %s\n", dev->type,
+   dev->name);
b->htype = dev->type; /* can cause undefined behavior */
}
 
@@ -904,14 +895,12 @@ static void __init ic_do_bootp_ext(u8 *ext)
int i;
__be16 mtu;
 
-#ifdef IPCONFIG_DEBUG
u8 *c;
 
-   printk("DHCP/BOOTP: Got extension %d:",*ext);
+   pr_debug("DHCP/BOOTP: Got extension %d:", *ext);
for (c=ext+2; cyour_ip;
ic_servaddr = server_id;
-#ifdef IPCONFIG_DEBUG
-   printk("DHCP: Offered address %pI4 by server %pI4\n",
-  &ic_myaddr, &b->iph.saddr);
-#endif
+   pr_debug("DHCP: Offered address %pI4 by server %pI4\n",
+&ic_myaddr, &b->iph.saddr);
/* The DHCP indicated server address takes
 * precedence over the bootp header one if
 * they are different.
@@ -1254,13 +1239,13 @@ static int __init ic_dynamic(void)
(ic_proto_enabled & IC_USE_DHCP) &&
ic_dhcp_msgtype != DHCPACK) {
ic_got_reply = 0;
-   pr_cont(",");
+   pr_notice(",");
continue;
}
 #endif /* IPCONFIG_DHCP */
 
if (ic_got_reply) {
-   pr_cont(" OK\n");
+   pr_notice(" OK\n");
break;
}
 
@@ -1268,7 +1253,7 @@ static int __init ic_dynamic(void)
  

Re: [PATCH 10/14] mm: memcontrol: generalize the socket accounting jump label

2015-11-13 Thread Michal Hocko
On Thu 12-11-15 18:41:29, Johannes Weiner wrote:
> The unified hierarchy memory controller is going to use this jump
> label as well to control the networking callbacks. Move it to the
> memory controller code and give it a more generic name.
> 
> Signed-off-by: Johannes Weiner 

Yes it makes more sense in memcg proper
Acked-by: Michal Hocko 

> ---
>  include/linux/memcontrol.h | 4 
>  include/net/sock.h | 7 ---
>  mm/memcontrol.c| 3 +++
>  net/core/sock.c| 5 -
>  net/ipv4/tcp_memcontrol.c  | 4 ++--
>  5 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 1c71f27..4cf5afa 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -693,6 +693,8 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
>  
>  #if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
>  struct sock;
> +extern struct static_key memcg_sockets_enabled_key;
> +#define mem_cgroup_sockets_enabled static_key_false(&memcg_sockets_enabled_key)
>  void sock_update_memcg(struct sock *sk);
>  void sock_release_memcg(struct sock *sk);
>  bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
> @@ -701,6 +703,8 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
>  {
>   return memcg->tcp_mem.memory_pressure;
>  }
> +#else
> +#define mem_cgroup_sockets_enabled 0
>  #endif /* CONFIG_INET && CONFIG_MEMCG_KMEM */
>  
>  #ifdef CONFIG_MEMCG_KMEM
> diff --git a/include/net/sock.h b/include/net/sock.h
> index b439dcc..bf1b901 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1065,13 +1065,6 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
>  #define sk_refcnt_debug_release(sk) do { } while (0)
>  #endif /* SOCK_REFCNT_DEBUG */
>  
> -#if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_NET)
> -extern struct static_key memcg_socket_limit_enabled;
> -#define mem_cgroup_sockets_enabled static_key_false(&memcg_socket_limit_enabled)
> -#else
> -#define mem_cgroup_sockets_enabled 0
> -#endif
> -
>  static inline bool sk_stream_memory_free(const struct sock *sk)
>  {
>   if (sk->sk_wmem_queued >= sk->sk_sndbuf)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 89b1d9e..658bef2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -291,6 +291,9 @@ static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
>  /* Writing them here to avoid exposing memcg's inner layout */
>  #if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
>  
> +struct static_key memcg_sockets_enabled_key;
> +EXPORT_SYMBOL(memcg_sockets_enabled_key);
> +
>  void sock_update_memcg(struct sock *sk)
>  {
>   struct mem_cgroup *memcg;
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 6486b0d..c5435b5 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -201,11 +201,6 @@ EXPORT_SYMBOL(sk_net_capable);
>  static struct lock_class_key af_family_keys[AF_MAX];
>  static struct lock_class_key af_family_slock_keys[AF_MAX];
>  
> -#if defined(CONFIG_MEMCG_KMEM)
> -struct static_key memcg_socket_limit_enabled;
> -EXPORT_SYMBOL(memcg_socket_limit_enabled);
> -#endif
> -
>  /*
>   * Make lock validator output more readable. (we pre-construct these
>   * strings build-time, so that runtime initialization of socket
> diff --git a/net/ipv4/tcp_memcontrol.c b/net/ipv4/tcp_memcontrol.c
> index 47addc3..17df9dd 100644
> --- a/net/ipv4/tcp_memcontrol.c
> +++ b/net/ipv4/tcp_memcontrol.c
> @@ -34,7 +34,7 @@ void tcp_destroy_cgroup(struct mem_cgroup *memcg)
>   return;
>  
>   if (test_bit(MEMCG_SOCK_ACTIVATED, &memcg->tcp_mem.flags))
> - static_key_slow_dec(&memcg_socket_limit_enabled);
> + static_key_slow_dec(&memcg_sockets_enabled_key);
>  }
>  
>  static int tcp_update_limit(struct mem_cgroup *memcg, unsigned long nr_pages)
> @@ -73,7 +73,7 @@ static int tcp_update_limit(struct mem_cgroup *memcg, unsigned long nr_pages)
>*/
>   if (!test_and_set_bit(MEMCG_SOCK_ACTIVATED,
> &memcg->tcp_mem.flags))
> - static_key_slow_inc(&memcg_socket_limit_enabled);
> + static_key_slow_inc(&memcg_sockets_enabled_key);
>   set_bit(MEMCG_SOCK_ACTIVE, &memcg->tcp_mem.flags);
>   }
>  
> -- 
> 2.6.2

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net] ravb: Fix int mask value overwritten issue

2015-11-13 Thread Geert Uytterhoeven
Hi Kaneko-san,

On Fri, Nov 13, 2015 at 11:24 AM, Yoshihiro Kaneko
 wrote:
> From: Masaru Nagai 
>
> When RX/TX interrupt for Network Control queue and Best Effort queue
> is issued at the same time, the interrupt mask of Network Control
> queue will be reset when the mask of Best Effort queue is set.

Nice catch!

At first I was a bit puzzled why this would make a difference, but
the key is "will be reset in the next iteration of the for loop", which
falls outside of the visible context.

> This patch fixes this problem.
>
> Signed-off-by: Masaru Nagai 
> Signed-off-by: Yoshihiro Kaneko 

Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v2 net-next] net/core: ensure features get disabled on new lower devs

2015-11-13 Thread Nikolay Aleksandrov
On 11/13/2015 11:29 AM, Jiri Pirko wrote:
> Fri, Nov 13, 2015 at 01:26:18AM CET, f.faine...@gmail.com wrote:
>> On 04/11/15 18:56, David Miller wrote:
>>>> Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
>>>  ...
>>>> Reported-by: Nikolay Aleksandrov
>>>> Signed-off-by: Jarod Wilson
>>>> ---
>>>> v2: Based on suggestions from Alex, and with not changing err to ret, this
>>>> patch actually becomes quite minimal and doesn't ugly up the code much.
>>>
>>> Applied, thanks.
>>
>> This causes some warnings to be displayed for DSA stacked devices:
>>
>> [1.272297] brcm-sf2 f0b0.ethernet_switch: Starfighter 2 top:
>> 4.00, core: 2.00 base: 0xf0c8, IRQs: 68, 69
>> [1.283181] libphy: dsa slave smi: probed
>> [1.344088] f0b403c0.mdio:05: Broadcom BCM7445 PHY revision: 0xd0,
>> patch: 3
>> [1.658917] brcm-sf2 f0b0.ethernet_switch gphy (uninitialized):
>> attached PHY at address 5 [Broadcom BCM7445]
>> [1.669414] brcm-sf2 f0b0.ethernet_switch gphy: set_features()
>> failed (-1); wanted 0x4020, left 0x4820
>> [1.734202] brcm-sf2 f0b0.ethernet_switch rgmii_1
>> (uninitialized): attached PHY at address 0 [Generic PHY]
>> [1.744486] brcm-sf2 f0b0.ethernet_switch rgmii_1: set_features()
>> failed (-1); wanted 0x4020, left 0x4820
>> [1.809091] brcm-sf2 f0b0.ethernet_switch rgmii_2
>> (uninitialized): attached PHY at address 1 [Generic PHY]
>> [1.819364] brcm-sf2 f0b0.ethernet_switch rgmii_2: set_features()
>> failed (-1); wanted 0x4020, left 0x4820
>> [1.884090] brcm-sf2 f0b0.ethernet_switch moca (uninitialized):
>> attached PHY at address 2 [Generic PHY]
>> [1.894109] brcm-sf2 f0b0.ethernet_switch moca: set_features()
>> failed (-1); wanted 0x4020, left 0x4820
>>
>> DSA slave network devices are not associated with their master network
>> device using the typical lower/upper netdev helpers.
>>
>> I do not have a good fix to come up with yet, but if you see something
>> obvious with net/dsa/slave.c, feel free to send patches for testing, I
>> can boot net-next on this platform.
> 
> I'm having similar issues with bridge, with linus's git now:
> 
[snip]

Hmm, I think it's because the bridge and dsa/slave don't have ndo_set_features(),
so err is left as -1 and an error is reported even though nothing actually failed.
Previously the features would simply get set in this case, so could you please try
the following patch?


diff --git a/net/core/dev.c b/net/core/dev.c
index ab9b8d0d115e..4a1d198dbbff 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6426,6 +6426,8 @@ int __netdev_update_features(struct net_device *dev)
 
if (dev->netdev_ops->ndo_set_features)
err = dev->netdev_ops->ndo_set_features(dev, features);
+   else
+   err = 0;
 
if (unlikely(err < 0)) {
netdev_err(dev,


[PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP

2015-11-13 Thread Ido Schimmel
When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns EOPNOTSUPP.
In this case we should not emit errors and warnings to the kernel log.

Reported-by: Sander Eikelenboom 
Fixes: 0bc05d585d38 ("switchdev: allow caller to explicitly request attr_set as deferred")
Fixes: 6ac311ae8bfb ("Adding switchdev ageing notification on port bridged")
Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 net/bridge/br_stp.c| 2 +-
 net/bridge/br_stp_if.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index f7e8dee..5f3f645 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -48,7 +48,7 @@ void br_set_state(struct net_bridge_port *p, unsigned int state)
 
p->state = state;
err = switchdev_port_attr_set(p->dev, &attr);
-   if (err)
+   if (err && err != -EOPNOTSUPP)
br_warn(p->br, "error setting offload STP state on port %u(%s)\n",
(unsigned int) p->port_no, p->dev->name);
 }
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index fa53d7a..5396ff08 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -50,7 +50,7 @@ void br_init_port(struct net_bridge_port *p)
p->config_pending = 0;
 
err = switchdev_port_attr_set(p->dev, &attr);
-   if (err)
+   if (err && err != -EOPNOTSUPP)
netdev_err(p->dev, "failed to set HW ageing time\n");
 }
 
-- 
2.4.10
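
The check added in both hunks can be tried outside the kernel. What follows is a minimal userspace sketch, not code from the patch: the helper name offload_error_is_reportable is hypothetical, and the only assumption is that errno.h defines EOPNOTSUPP, as it does on Linux.

```c
#include <errno.h>

/*
 * Hypothetical helper, not part of the patch: returns nonzero only for
 * errors that are worth logging. -EOPNOTSUPP merely means that no
 * switchdev offload is available, so it is treated like success here.
 */
static int offload_error_is_reportable(int err)
{
	return err && err != -EOPNOTSUPP;
}
```

With this predicate, 0 and -EOPNOTSUPP stay silent, while any other error code, for example -EINVAL, would still reach the log.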



Re: [PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP

2015-11-13 Thread Sander Eikelenboom

On 2015-11-13 12:06, Ido Schimmel wrote:
> When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns EOPNOTSUPP.
> In this case we should not emit errors and warnings to the kernel log.

Hi Ido,

Thanks for your patch!

It fixes these:
[  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
[  207.245443] xen_bridge: error setting offload STP state on port1(vif1.0)

But I still have these:
[  335.412194] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.412204] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.412311] vif19.0-emu: set_features() failed (-1); wanted 0x008248c9, left 0x0080001b48c9
[  335.412319] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.412326] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
[  335.535955] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[  335.535965] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[  335.615392] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
[  335.615401] xen_bridge: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9


--
Sander


> Reported-by: Sander Eikelenboom
> Fixes: 0bc05d585d38 ("switchdev: allow caller to explicitly request
> attr_set as deferred")
> Fixes: 6ac311ae8bfb ("Adding switchdev ageing notification on port
> bridged")
> Signed-off-by: Ido Schimmel
> Signed-off-by: Jiri Pirko
> ---
>  net/bridge/br_stp.c| 2 +-
>  net/bridge/br_stp_if.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> index f7e8dee..5f3f645 100644
> --- a/net/bridge/br_stp.c
> +++ b/net/bridge/br_stp.c
> @@ -48,7 +48,7 @@ void br_set_state(struct net_bridge_port *p, unsigned int state)
>
> p->state = state;
> err = switchdev_port_attr_set(p->dev, &attr);
> -   if (err)
> +   if (err && err != -EOPNOTSUPP)
> br_warn(p->br, "error setting offload STP state on port %u(%s)\n",
> (unsigned int) p->port_no, p->dev->name);
>  }
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index fa53d7a..5396ff08 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -50,7 +50,7 @@ void br_init_port(struct net_bridge_port *p)
> p->config_pending = 0;
>
> err = switchdev_port_attr_set(p->dev, &attr);
> -   if (err)
> +   if (err && err != -EOPNOTSUPP)
> netdev_err(p->dev, "failed to set HW ageing time\n");
>  }



Re: [PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP

2015-11-13 Thread Ido Schimmel
Fri, Nov 13, 2015 at 02:34:45PM IST, li...@eikelenboom.it wrote:
>On 2015-11-13 12:06, Ido Schimmel wrote:
>> When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns 
>> EOPNOTSUPP.
>> In this case we should not emit errors and warnings to the kernel log.
>
>Hi Ido,
>
>Thanks for your patch!
>
>It fixes these:
>[  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
>[  207.245443] xen_bridge: error setting offload STP state on port1(vif1.0)
>
>But i still have these:
>[  335.412194] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>[  335.412204] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>[  335.412311] vif19.0-emu: set_features() failed (-1); wanted 0x008248c9, left 0x0080001b48c9
>[  335.412319] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>[  335.412326] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>[  335.535955] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>[  335.535965] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>[  335.615392] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>[  335.615401] xen_bridge: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
>

Yes, this is a different issue and I see that Nik is already working on
it. Can you please try his patch?

http://patchwork.ozlabs.org/patch/544242/


Re: [PATCH net] switchdev: bridge: Check return code is not EOPNOTSUPP

2015-11-13 Thread Sander Eikelenboom

On 2015-11-13 13:43, Ido Schimmel wrote:
> Fri, Nov 13, 2015 at 02:34:45PM IST, li...@eikelenboom.it wrote:
>> On 2015-11-13 12:06, Ido Schimmel wrote:
>>> When NET_SWITCHDEV=n, switchdev_port_attr_set simply returns
>>> EOPNOTSUPP.
>>> In this case we should not emit errors and warnings to the kernel log.
>>
>> Hi Ido,
>>
>> Thanks for your patch!
>>
>> It fixes these:
>> [  207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
>> [  207.245443] xen_bridge: error setting offload STP state on port1(vif1.0)
>>
>> But i still have these:
>> [  335.412194] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.412204] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.412311] vif19.0-emu: set_features() failed (-1); wanted 0x008248c9, left 0x0080001b48c9
>> [  335.412319] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.412326] vif19.0-emu: set_features() failed (-1); wanted 0x008048c1, left 0x0080001b48c9
>> [  335.535955] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [  335.535965] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [  335.615392] vif vif-19-0 vif19.0: set_features() failed (-1); wanted 0x00044803, left 0x000400114813
>> [  335.615401] xen_bridge: set_features() failed (-1); wanted 0x00801fdb78c9, left 0x00801fff78e9
>
> Yes, this is a different issue and I see that Nik is already working on
> it. Can you please try his patch?
>
> http://patchwork.ozlabs.org/patch/544242/


Yeah, that suppresses the warning, thanks!

--
Sander


[PATCH net] net: fix feature changes on devices without ndo_set_features

2015-11-13 Thread Nikolay Aleksandrov
From: Nikolay Aleksandrov 

When __netdev_update_features() was updated to ensure some features are
disabled on new lower devices, an error was introduced for devices which
don't have the ndo_set_features() method set. Before we'll just set the
new features, but now we return an error and don't set them. Fix this by
returning the old behaviour and setting err to 0 when ndo_set_features
is not present.

Fixes: e7868a85e1b2 ("net/core: ensure features get disabled on new lower devs")
CC: Jarod Wilson 
CC: Jiri Pirko 
CC: Ido Schimmel 
CC: Sander Eikelenboom 
CC: Andy Gospodarek 
CC: Florian Fainelli 
Signed-off-by: Nikolay Aleksandrov 
---
Sander please feel free to give your Tested-by.

 net/core/dev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index ab9b8d0d115e..4a1d198dbbff 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6426,6 +6426,8 @@ int __netdev_update_features(struct net_device *dev)
 
if (dev->netdev_ops->ndo_set_features)
err = dev->netdev_ops->ndo_set_features(dev, features);
+   else
+   err = 0;
 
if (unlikely(err < 0)) {
netdev_err(dev,
-- 
2.4.3
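
The effect of the two added lines can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct model_ops, model_update() and the apply_fix switch are hypothetical stand-ins rather than kernel API, and the initial err = -1 follows the description earlier in the thread that err is "left as -1" when the hook is missing.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net_device_ops; only one hook is modelled. */
struct model_ops {
	int (*set_features)(unsigned long features);
};

/*
 * Simplified model of the update path. apply_fix selects between the
 * behaviour before (0) and after (1) the patch. Returns -1 on failure
 * and 1 when the feature set was changed.
 */
static int model_update(const struct model_ops *ops, unsigned long *cur,
			unsigned long wanted, int apply_fix)
{
	int err = -1;

	if (ops->set_features)
		err = ops->set_features(wanted);
	else if (apply_fix)
		err = 0;	/* no hook: nothing can fail, just accept */

	if (err < 0)
		return -1;	/* features are left untouched */

	*cur = wanted;
	return 1;
}
```

For a device without the hook, calling model_update() with apply_fix set to 0 leaves the feature word unchanged and reports failure, while apply_fix set to 1 stores the wanted value, which is the behaviour the commit message describes as the old one.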



Re: [PATCH net] net: fix feature changes on devices without ndo_set_features

2015-11-13 Thread Jiri Pirko
Fri, Nov 13, 2015 at 02:54:01PM CET, ra...@blackwall.org wrote:
>From: Nikolay Aleksandrov 
>
>When __netdev_update_features() was updated to ensure some features are
>disabled on new lower devices, an error was introduced for devices which
>don't have the ndo_set_features() method set. Before we'll just set the
>new features, but now we return an error and don't set them. Fix this by
>returning the old behaviour and setting err to 0 when ndo_set_features
>is not present.
>
>Fixes: e7868a85e1b2 ("net/core: ensure features get disabled on new lower devs")
>CC: Jarod Wilson 
>CC: Jiri Pirko 
>CC: Ido Schimmel 
>CC: Sander Eikelenboom 
>CC: Andy Gospodarek 
>CC: Florian Fainelli 
>Signed-off-by: Nikolay Aleksandrov 

Reviewed-by: Jiri Pirko 

Thanks.


Re: [PATCH net] net: fix feature changes on devices without ndo_set_features

2015-11-13 Thread Andy Gospodarek
On Fri, Nov 13, 2015 at 02:54:01PM +0100, Nikolay Aleksandrov wrote:
> From: Nikolay Aleksandrov 
> 
> When __netdev_update_features() was updated to ensure some features are
> disabled on new lower devices, an error was introduced for devices which
> don't have the ndo_set_features() method set. Before we'll just set the
> new features, but now we return an error and don't set them. Fix this by
> returning the old behaviour and setting err to 0 when ndo_set_features
> is not present.

Thanks for the quick turnaround.

> Fixes: e7868a85e1b2 ("net/core: ensure features get disabled on new lower devs")
> CC: Jarod Wilson 
> CC: Jiri Pirko 
> CC: Ido Schimmel 
> CC: Sander Eikelenboom 
> CC: Andy Gospodarek 
> CC: Florian Fainelli 
> Signed-off-by: Nikolay Aleksandrov 

Reviewed-by: Andy Gospodarek 


[patch] netlink: fix a limit in NETLINK_LIST_MEMBERSHIPS

2015-11-13 Thread Dan Carpenter
This condition doesn't work when len is smaller than expected and not a
multiple of 4.  In that situation "len - pos" is negative and type
promoted to a high unsigned value and we do not break out of the loop.
It causes the program calling it to crash.

Fixes: b42be38b2778 ('netlink: add API to retrieve all group memberships')
Signed-off-by: Dan Carpenter 

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 59651af..76a8466 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2373,7 +2373,7 @@ static int netlink_getsockopt(struct socket *sock, int level, int optname,
err = 0;
netlink_lock_table();
for (pos = 0; pos * 8 < nlk->ngroups; pos += sizeof(u32)) {
-   if (len - pos < sizeof(u32))
+   if (len < pos + sizeof(u32))
break;
 
idx = pos / sizeof(unsigned long);
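
The type promotion described in the commit message can be reproduced in isolation. The sketch below assumes that len and pos are both plain int, and the function names are hypothetical. It only demonstrates the arithmetic of the two comparisons, not whether the kernel loop can actually reach a state where len is smaller than pos.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Old form: the int difference is converted to size_t before the
 * comparison, so a negative result becomes a very large value.
 */
static int old_check_breaks(int len, int pos)
{
	return len - pos < sizeof(uint32_t);
}

/* New form: both operands are non-negative before the conversion. */
static int new_check_breaks(int len, int pos)
{
	return len < pos + sizeof(uint32_t);
}
```

For len = 2 and pos = 4 the old form does not break out of the loop, while the new form does. For in-range values such as len = 6 and pos = 4 both forms agree.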


[PATCH net] net: fix __netdev_update_features return on ndo_set_features failure

2015-11-13 Thread Nikolay Aleksandrov
From: Nikolay Aleksandrov 

If ndo_set_features fails __netdev_update_features() will return -1 but
this is wrong because it is expected to return 0 if no features were
changed (see netdev_update_features()), which will cause a netdev
notifier to be called without any actual changes. Fix this by returning
0 if ndo_set_features fails.

Fixes: 6cb6a27c45ce ("net: Call netdev_features_change() from netdev_update_features()")
CC: Michał Mirosław 
Signed-off-by: Nikolay Aleksandrov 
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 4a1d198dbbff..1974aee005a6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6433,7 +6433,7 @@ int __netdev_update_features(struct net_device *dev)
netdev_err(dev,
"set_features() failed (%d); wanted %pNF, left %pNF\n",
err, &features, &dev->features);
-   return -1;
+   return 0;
}
 
 sync_lower:
-- 
2.4.3
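
Why a return value of -1 matters can be seen from the shape of the caller, which treats any nonzero result as "features changed". The following is a userspace sketch with hypothetical names (model_features_change, model_netdev_update_features, and a counter standing in for the notifier chain), assuming the caller has the form the commit message refers to.

```c
/* Counts how often the (modelled) notifier would have been invoked. */
static int notifications;

static void model_features_change(void)
{
	notifications++;
}

/*
 * Mirrors the shape of netdev_update_features(): any nonzero result
 * from the update step triggers the notifier, including -1.
 */
static void model_netdev_update_features(int update_result)
{
	if (update_result)
		model_features_change();
}
```

Passing -1 bumps the counter just as 1 does, so a failed update would be announced as a change. Returning 0 on failure avoids that.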



Re: [PATCH net] net: fix feature changes on devices without ndo_set_features

2015-11-13 Thread Jarod Wilson

Nikolay Aleksandrov wrote:
> From: Nikolay Aleksandrov
>
> When __netdev_update_features() was updated to ensure some features are
> disabled on new lower devices, an error was introduced for devices which
> don't have the ndo_set_features() method set. Before we'll just set the
> new features, but now we return an error and don't set them. Fix this by
> returning the old behaviour and setting err to 0 when ndo_set_features
> is not present.
>
> Fixes: e7868a85e1b2 ("net/core: ensure features get disabled on new lower devs")
> CC: Jarod Wilson
> CC: Jiri Pirko
> CC: Ido Schimmel
> CC: Sander Eikelenboom
> CC: Andy Gospodarek
> CC: Florian Fainelli
> Signed-off-by: Nikolay Aleksandrov
> ---
> Sander please feel free to give your Tested-by.


Ah, good catch, thank you for cleaning up my mess.

Reviewed-by: Jarod Wilson 

--
Jarod Wilson
ja...@redhat.com




Re: [patch] netlink: fix a limit in NETLINK_LIST_MEMBERSHIPS

2015-11-13 Thread David Herrmann
Hi

On Fri, Nov 13, 2015 at 3:20 PM, Dan Carpenter  wrote:
> This condition doesn't work when len is smaller than expected and not a
> multiple of 4.  In that situation "len - pos" is negative and type
> promoted to a high unsigned value and we do not break out of the loop.
> It causes the program calling it to crash.

Could you give an example how this can happen? The loop-invariant
should be "len >= pos", as such, this shouldn't happen. "pos" starts
out as 0, "len" is guaranteed to be >=0. "pos" is only incremented by
4, if "len - pos >= 4".

What am I missing?

Thanks
David

> Fixes: b42be38b2778 ('netlink: add API to retrieve all group memberships')
> Signed-off-by: Dan Carpenter 
>
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 59651af..76a8466 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -2373,7 +2373,7 @@ static int netlink_getsockopt(struct socket *sock, int level, int optname,
> err = 0;
> netlink_lock_table();
> for (pos = 0; pos * 8 < nlk->ngroups; pos += sizeof(u32)) {
> -   if (len - pos < sizeof(u32))
> +   if (len < pos + sizeof(u32))
> break;
>
> idx = pos / sizeof(unsigned long);


Re: [PATCH] net: hisilicon: fix binding document of mdio

2015-11-13 Thread Rob Herring
On Fri, Nov 13, 2015 at 10:23:44AM +0800, huangdaode wrote:
> This patch fixes explain the occasion of "hisilcon,mdio" according to
> Arnd's comments. specify it is only used for hip04.
> 
> First, please give your commnents.
> 
> Signed-off-by: huangdaode 
> ---
>  Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt 
> b/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
> index 9c23fdf..f650e78 100644
> --- a/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
> +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-mdio.txt
> @@ -1,7 +1,9 @@
>  Hisilicon MDIO bus controller
>  
>  Properties:
> -- compatible: "hisilicon,mdio","hisilicon,hns-mdio".
> +- compatible: can be one of "hisilicon,hns-mdio","hisilicon,mdio",
> +  for hip04 board, please use "hisilicon,mdio",
> +  other boards, "hisilicon,hns-mdio" is OK.

Please reformat like this:

- compatible: can be one of:
        "hisilicon,hns-mdio"
        "hisilicon,mdio"
  For hip04 board, must be "hisilicon,mdio".
  Otherwise, must be "hisilicon,hns-mdio".

>  - reg: The base address of the MDIO bus controller register bank.
>  - #address-cells: Must be <1>.
>  - #size-cells: Must be <0>.  MDIO addresses have no size component.
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: OVS VXLAN decap rule has full match on TTL for the outer headers?

2015-11-13 Thread Or Gerlitz
On Fri, Nov 13, 2015 at 10:14 AM, Joe Stringer  wrote:

> I don't follow the logic. You observed one flow which matched on
> TTL=64, therefore all vxlan packets terminated at OVS have TTL=64?

> If OVS received packets with different TTLs, they would miss and
> ovs-vswitchd would generate flows to match that traffic too.

ok, that makes things a bit better, but (see next)

> If that becomes an issue, presumably the wildcard generation can be improved.

Is there a deep reason for VXLAN "learned flows" to actually match, with
or without wildcards, on TTLs? For non-tunneled flows I don't see this
happening.


> I agree that this UNSPEC issue on v2.3 could do with a bit of a closer
> look. I'll see if I can find some time for it. Alternatively if you're
> willing and have bandwith, I'd be curious if it's related to the
> masked set field feature introduced in Linux-4.0.

so what would you suggest here? run with 3.19 or earlier?


> In this case it looks like you created the datapath using a newer
> version of the userspace utilities, then without deleting the
> datapath, attempted to reuse the datapath with an older version of the
> userspace utilities. This is fine, but it warns you because it drops
> particular user features which the newer userspace supported (because
> the older userspace doesn't support them). Sure, it's not the most
> graceful, but it doesn't look fatal in and of itself. Comment from the
> code below for context:
>
> /* An outdated user space instance that does not understand
> * the concept of user_features has attempted to create a new
> * datapath and is likely to reuse it. Drop all user features.
> */

Thanks for the clarifications. Yes, I tried a few user-space versions
one after the other, possibly without deleting the datapath.

>> So I now moved to 2.4.0, and things aren't much better... can you give
>> a quick try on
>> your systems for upstream kernel against upstream OVS w.r.t to simple
>> VXLAN config?

> What do you mean by "not much better"? Do you mean that you still
> observe one of the above three issues, or you see a different issue?
> In particular I'd be curious if you observe the UNSPEC issue.

I mean this is printed by ovs-dpctl dump-flows for the encap rule

recirc_id(0),in_port(3),eth(src=02:d3:6e:35:59:35,dst=9e:1e:90:87:27:1a),eth_type(0x0800),ipv4(tos=0/0x3,frag=no),
packets:111, bytes:10878, used:0.569s, actions:set(unspec(bad key
length 8, expected 0)(00 00 00 00 00 00 00 62)),2

Or.


Re: [PATCH 01/14] mm: memcontrol: export root_mem_cgroup

2015-11-13 Thread David Miller
From: Johannes Weiner 
Date: Thu, 12 Nov 2015 18:41:20 -0500

> A later patch will need this symbol in files other than memcontrol.c,
> so export it now and replace mem_cgroup_root_css at the same time.
> 
> Signed-off-by: Johannes Weiner 
> Acked-by: Michal Hocko 

Acked-by: David S. Miller 


Re: [PATCH 02/14] mm: vmscan: simplify memcg vs. global shrinker invocation

2015-11-13 Thread David Miller
From: Johannes Weiner 
Date: Thu, 12 Nov 2015 18:41:21 -0500

> Letting shrink_slab() handle the root_mem_cgroup, and implicitly the
> !CONFIG_MEMCG case, allows shrink_zone() to invoke the shrinkers
> unconditionally from within the memcg iteration loop.
> 
> Signed-off-by: Johannes Weiner 
> Acked-by: Michal Hocko 

Acked-by: David S. Miller 


Re: [PATCH 04/14] net: tcp_memcontrol: remove bogus hierarchy pressure propagation

2015-11-13 Thread David Miller
From: Johannes Weiner 
Date: Thu, 12 Nov 2015 18:41:23 -0500

> When a cgroup currently breaches its socket memory limit, it enters
> memory pressure mode for itself and its *ancestors*. This throttles
> transmission in unrelated sibling and cousin subtrees that have
> nothing to do with the breached limit.
> 
> On the contrary, breaching a limit should make that group and its
> *children* enter memory pressure mode. But this happens already,
> albeit lazily: if an ancestor limit is breached, siblings will enter
> memory pressure on their own once the next packet arrives for them.
> 
> So no additional hierarchy code is needed. Remove the bogus stuff.
> 
> Signed-off-by: Johannes Weiner 

Acked-by: David S. Miller 


Re: [PATCH 03/14] net: tcp_memcontrol: properly detect ancestor socket pressure

2015-11-13 Thread David Miller
From: Johannes Weiner 
Date: Thu, 12 Nov 2015 18:41:22 -0500

> When charging socket memory, the code currently checks only the local
> page counter for excess to determine whether the memcg is under socket
> pressure. But even if the local counter is fine, one of the ancestors
> could have breached its limit, which should also force this child to
> enter socket pressure. This currently doesn't happen.
> 
> Fix this by using page_counter_try_charge() first. If that fails, it
> means that either the local counter or one of the ancestors are in
> excess of their limit, and the child should enter socket pressure.
> 
> Signed-off-by: Johannes Weiner 

Acked-by: David S. Miller 
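
For readers following along, the ancestor check described in the quoted
patch can be sketched roughly as below. This is only an illustration of
a hierarchical try-charge, not the kernel code: struct counter and
counter_try_charge() are hypothetical stand-ins for struct page_counter
and page_counter_try_charge(), and locking/atomics are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page_counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
	struct counter *parent;	/* NULL at the root */
};

/*
 * Charge nr_pages to c and to every ancestor. On failure nothing stays
 * charged and *fail points at the counter whose limit would be exceeded.
 */
bool counter_try_charge(struct counter *c, unsigned long nr_pages,
			struct counter **fail)
{
	struct counter *pos, *undo;

	assert(c != NULL && fail != NULL);

	for (pos = c; pos; pos = pos->parent) {
		if (pos->usage + nr_pages > pos->limit) {
			*fail = pos;
			/* Roll back the levels that were already charged. */
			for (undo = c; undo != pos; undo = undo->parent)
				undo->usage -= nr_pages;
			return false;
		}
		pos->usage += nr_pages;
	}
	return true;
}
```

A failed charge therefore means "the local counter or some ancestor is
over its limit", which is the condition the patch description uses to
put the child into socket pressure.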


Re: [PATCH 05/14] net: tcp_memcontrol: protect all tcp_memcontrol calls by jump-label

2015-11-13 Thread David Miller
From: Johannes Weiner 
Date: Thu, 12 Nov 2015 18:41:24 -0500

> Move the jump-label from sock_update_memcg() and sock_release_memcg()
> to the callsite, and so eliminate those function calls when socket
> accounting is not enabled.
> 
> This also eliminates the need for dummy functions because the calls
> will be optimized away if the Kconfig options are not enabled.
> 
> Signed-off-by: Johannes Weiner 

Acked-by: David S. Miller 


Re: [net-next 04/17] drivers/net/intel: use napi_complete_done()

2015-11-13 Thread Alexander Duyck

On 11/12/2015 09:18 PM, Eric Dumazet wrote:

On Thu, 2015-10-15 at 14:43 -0700, Jeff Kirsher wrote:

From: Jesse Brandeburg 

As per Eric Dumazet's previous patches:
(see commit (24d2e4a50737) - tg3: use napi_complete_done())

Quoting verbatim:
Using napi_complete_done() instead of napi_complete() allows
us to use /sys/class/net/ethX/gro_flush_timeout

GRO layer can aggregate more packets if the flush is delayed a bit,
without having to set too big coalescing parameters that impact
latencies.


Tested
configuration: low latency via ethtool -C ethx adaptive-rx off
rx-usecs 10 adaptive-tx off tx-usecs 15
workload: streaming rx using netperf TCP_MAERTS

igb:
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () 
port 0 AF_INET : demo
...
Interim result:  941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589

Alignment      Offset         Bytes       Bytes       Recvs    Bytes       Sends
Local  Remote  Local  Remote  Xfered      Per                  Per
Recv   Send    Recv   Send                Recv (avg)           Send (avg)
    8       8      0       0  1176930056  1475.36     797726   16384.00    71905

MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () 
port 0 AF_INET : demo
...
Interim result:  941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763

Alignment      Offset         Bytes       Bytes       Recvs    Bytes       Sends
Local  Remote  Local  Remote  Xfered      Per                  Per
Recv   Send    Recv   Send                Recv (avg)           Send (avg)
    8       8      0       0  1175182320  50476.00    23282    16384.00    71816

i40e:
Hard to test because the traffic is incoming so fast (24Gb/s) that GRO
always receives 87kB, even at the highest interrupt rate.

Other drivers were only compile tested.

Signed-off-by: Jesse Brandeburg 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 


Hi guys

I am not sure the ixgbe part is working :

ixgbe_qv_unlock_napi() does :

/* flush any outstanding Rx frames */
if (q_vector->napi.gro_list)
 napi_gro_flush(&q_vector->napi, false);

And it is called before napi_complete_done(napi, work_done);


Yes, I'm pretty certain you cannot use this napi_complete_done with 
anything that supports busy poll sockets.  The problem is you need to 
flush any existing lists before yielding to the socket polling in order 
to avoid packet ordering issues between the NAPI polling routine and the 
socket polling routine.


- Alex


Re: [PATCH 06/14] net: tcp_memcontrol: remove dead per-memcg count of allocated sockets

2015-11-13 Thread David Miller
From: Johannes Weiner 
Date: Thu, 12 Nov 2015 18:41:25 -0500

> The number of allocated sockets is used for calculations in the soft
> limit phase, where packets are accepted but the socket is under memory
> pressure. Since there is no soft limit phase in tcp_memcontrol, and
> memory pressure is only entered when packets are already dropped, this
> is actually dead code. Remove it.
> 
> As this is the last user of parent_cg_proto(), remove that too.
> 
> Signed-off-by: Johannes Weiner 

Acked-by: David S. Miller 


Re: [PATCH] net: hisilicon: fix binding document of mdio

2015-11-13 Thread Arnd Bergmann
On Friday 13 November 2015 08:44:57 Rob Herring wrote:
> 
> Please reformat like this:
> 
> - compatible: can be one of:
> "hisilicon,hns-mdio"
> "hisilicon,mdio"
>   For hip04 board, must be "hisilicon,mdio".
>   Otherwise, must be "hisilicon,hns-mdio".

Should we recommend the use of "hisilicon,hns-mdio" unconditionally
and only list "hisilicon,mdio" for backwards compatibility?

Arnd


Re: [net-next 04/17] drivers/net/intel: use napi_complete_done()

2015-11-13 Thread Eric Dumazet
On Fri, 2015-11-13 at 08:06 -0800, Alexander Duyck wrote:

> Yes, I'm pretty certain you cannot use this napi_complete_done with 
> anything that supports busy poll sockets.  The problem is you need to 
> flush any existing lists before yielding to the socket polling in order 
> to avoid packet ordering issues between the NAPI polling routine and the 
> socket polling routine.

My plan is to make busy poll independent of GRO / RPS / RFS, and generic
if possible, for all NAPI drivers. (There would be no absolute need to
provide ndo_busy_poll().)

I really do not see GRO being a problem for low latency: RPC messages
are terminated by a PSH flag, which takes care of flushing the GRO engine.

For mixed use, (low latency and other kind of flows), GRO is a win.

With the following sk_busy_loop(), we:

- allow tunneling traffic to use busy poll as well as native traffic.
- allow RFS/RPS to be used (sending IPIs to other cpus if needed)
- use the "let's burn cpu cycles" time to do useful work (like TX
  completions, RCU callbacks...)
- implement busy poll for all NAPI drivers.

	rcu_read_lock();
	napi = napi_by_id(sk->sk_napi_id);
	if (!napi)
		goto out;
	ops = napi->dev->netdev_ops;

	for (;;) {
		local_bh_disable();
		rc = 0;
		if (ops->ndo_busy_poll) {
			rc = ops->ndo_busy_poll(napi);
		} else if (napi_schedule_prep(napi)) {
			rc = napi->poll(napi, 4);
			if (rc == 4) {
				napi_complete_done(napi, rc);
				napi_schedule(napi);
			}
		}
		if (rc > 0)
			NET_ADD_STATS_BH(sock_net(sk),
					 LINUX_MIB_BUSYPOLLRXPACKETS, rc);
		local_bh_enable();

		if (rc == LL_FLUSH_FAILED ||
		    nonblock ||
		    !skb_queue_empty(&sk->sk_receive_queue) ||
		    need_resched() ||
		    busy_loop_timeout(end_time))
			break;

		cpu_relax();
	}
	rcu_read_unlock();






via-velocity skb_over_panic

2015-11-13 Thread Timo Teras
Hi,

I recently saw a via-velocity skb_over_panic() at one of my locations.
The panic happened on two separate hardware devices, so it appears to
be network related, not broken hardware.

I did not get the actual over_panic printk, as I only got a screenshot
of the monitor. But the visible part of the call trace says:
 
 skb_put
 velocity_poll
 net_rx_action
 __do_softirq
 irq_exit
 common_interrupt
 

The panic was recurring every few hours, so I patched via-velocity with
the following after looking at the code a bit:

--- a/drivers/net/ethernet/via/via-velocity.c
+++ b/drivers/net/ethernet/via/via-velocity.c
@@ -2060,6 +2060,11 @@ static int velocity_receive_frame(struct velocity_info 
*vptr, int idx)
stats->rx_length_errors++;
return -EINVAL;
}
+   if (pkt_len < 4 || pkt_len > vptr->rx.buf_sz) {
+   VELOCITY_PRT(MSG_LEVEL_VERBOSE, KERN_ERR " %s : the received 
frame size %d is inconsistent.\n", vptr->netdev->name, pkt_len);
+   stats->rx_length_errors++;
+   return -EINVAL;
+   }
 
if (rd->rdesc0.RSR & RSR_MAR)
stats->multicast++;

This seems to have fixed the panics. I do see the
in_range_length_errors counter that ethtool reports for one of the NICs
increasing once in a while. For some reason I don't see the above debug
message, though, so I'm not sure which pkt_len values trigger it.

In any case, the code a bit later on unconditionally does:
skb_put(skb, pkt_len - 4);

So it's possible that some bad packets make the NIC return unexpected
packet sizes, and the current code can panic on it.
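
To spell out the check outside of the driver, here is a minimal
standalone sketch of the same bounds test. The helper names and the
FCS_LEN constant are made up for illustration, and I am assuming the
"- 4" strips the trailing frame check sequence. A pkt_len below 4 would
turn into a huge unsigned length in skb_put(), and one above the buffer
size would run past the end of the buffer; either way skb_put() panics.

```c
#include <assert.h>
#include <stdbool.h>

#define FCS_LEN 4	/* assumed: trailing FCS bytes stripped by the driver */

/*
 * Return true if a descriptor-reported length can be used safely:
 * at least the FCS bytes, and no more than the receive buffer holds.
 */
bool rx_len_is_sane(int pkt_len, int buf_sz)
{
	assert(buf_sz >= FCS_LEN);
	return pkt_len >= FCS_LEN && pkt_len <= buf_sz;
}

/* Payload length to hand to the skb_put() equivalent, or -1 to drop. */
int rx_payload_len(int pkt_len, int buf_sz)
{
	if (!rx_len_is_sane(pkt_len, buf_sz))
		return -1;
	return pkt_len - FCS_LEN;
}
```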

Any suggestions for a better fix?

Thanks,
Timo


[PATCH 1/1] net: usb: cdc_ether: add Dell DW5580 as a mobile broadband adapter

2015-11-13 Thread Daniele Palmas
Since Dell DW5580 is a 3G modem, this patch adds the device as a
mobile broadband adapter

Signed-off-by: Daniele Palmas 
---
 drivers/net/usb/cdc_ether.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
index 35a2bff..5e92076 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -764,6 +764,11 @@ static const struct usb_device_id  products[] = {
USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
.driver_info = (kernel_ulong_t) &wwan_info,
 }, {
+   /* Dell DW5580 modules */
+   USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x81ba, USB_CLASS_COMM,
+   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   .driver_info = (kernel_ulong_t)&wwan_info,
+}, {
USB_INTERFACE_INFO(USB_CLASS_COMM, USB_CDC_SUBCLASS_ETHERNET,
USB_CDC_PROTO_NONE),
.driver_info = (unsigned long) &cdc_info,
-- 
1.9.1



[iproute PATCH 00/12] smaller iptunnel and ip6tunnel review

2015-11-13 Thread Phil Sutter
In an effort to try and merge iptunnel and ip6tunnel support code, I found a
few things worth changing, mainly by comparing the two files.

Please note that I did not test functionality of all supported tunnel modes,
but since the changes are fairly small and obvious I hopefully didn't introduce
too many bugs.

Phil Sutter (12):
  ip{,6}tunnel: get rid of extraneous whitespace when printing
  ip/tunnel: introduce tnl_parse_key()
  ip{,6}tunnel: unify behaviour if physical device is not found
  iptunnel: use ll_name_to_index() for physical interface lookup
  ip{,6}tunnel: align do_tunnels_list() a bit
  ip6tunnel: print local/remote addresses like iptunnel does
  ip6tunnel: fix coding style: no newline between brace and else
  iptunnel: share common code when setting tunnel mode
  iptunnel: simplify parsing TTL, allow 'hlim' as identifier
  iptunnel: share common code when determining the default interface
name
  iptunnel: sanitize copying tunnel name
  ip{,6}tunnel: put spaces around non-unary operators

 ip/ip6tunnel.c |  83 +++-
 ip/iptunnel.c  | 239 ++---
 ip/tunnel.c|  15 
 ip/tunnel.h|   1 +
 4 files changed, 135 insertions(+), 203 deletions(-)

-- 
2.1.2



[iproute PATCH 08/12] iptunnel: share common code when setting tunnel mode

2015-11-13 Thread Phil Sutter
Signed-off-by: Phil Sutter 
---
 ip/iptunnel.c | 39 ++-
 1 file changed, 14 insertions(+), 25 deletions(-)

diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index e323c1f..92edb34 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -47,6 +47,15 @@ static void usage(void)
exit(-1);
 }
 
+static void set_tunnel_proto(struct ip_tunnel_parm *p, int proto)
+{
+   if (p->iph.protocol && p->iph.protocol != proto) {
+   fprintf(stderr,"You managed to ask for more than one tunnel 
mode.\n");
+   exit(-1);
+   }
+   p->iph.protocol = proto;
+}
+
 static int parse_args(int argc, char **argv, int cmd, struct ip_tunnel_parm *p)
 {
int count = 0;
@@ -68,38 +77,18 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
NEXT_ARG();
if (strcmp(*argv, "ipip") == 0 ||
strcmp(*argv, "ip/ip") == 0) {
-   if (p->iph.protocol && p->iph.protocol != 
IPPROTO_IPIP) {
-   fprintf(stderr,"You managed to ask for 
more than one tunnel mode.\n");
-   exit(-1);
-   }
-   p->iph.protocol = IPPROTO_IPIP;
+   set_tunnel_proto(p, IPPROTO_IPIP);
} else if (strcmp(*argv, "gre") == 0 ||
   strcmp(*argv, "gre/ip") == 0) {
-   if (p->iph.protocol && p->iph.protocol != 
IPPROTO_GRE) {
-   fprintf(stderr,"You managed to ask for 
more than one tunnel mode.\n");
-   exit(-1);
-   }
-   p->iph.protocol = IPPROTO_GRE;
+   set_tunnel_proto(p, IPPROTO_GRE);
} else if (strcmp(*argv, "sit") == 0 ||
   strcmp(*argv, "ipv6/ip") == 0) {
-   if (p->iph.protocol && p->iph.protocol != 
IPPROTO_IPV6) {
-   fprintf(stderr,"You managed to ask for 
more than one tunnel mode.\n");
-   exit(-1);
-   }
-   p->iph.protocol = IPPROTO_IPV6;
+   set_tunnel_proto(p, IPPROTO_IPV6);
} else if (strcmp(*argv, "isatap") == 0) {
-   if (p->iph.protocol && p->iph.protocol != 
IPPROTO_IPV6) {
-   fprintf(stderr, "You managed to ask for 
more than one tunnel mode.\n");
-   exit(-1);
-   }
-   p->iph.protocol = IPPROTO_IPV6;
+   set_tunnel_proto(p, IPPROTO_IPV6);
isatap++;
} else if (strcmp(*argv, "vti") == 0) {
-   if (p->iph.protocol && p->iph.protocol != 
IPPROTO_IPIP) {
-   fprintf(stderr, "You managed to ask for 
more than one tunnel mode.\n");
-   exit(-1);
-   }
-   p->iph.protocol = IPPROTO_IPIP;
+   set_tunnel_proto(p, IPPROTO_IPIP);
p->i_flags |= VTI_ISVTI;
} else {
fprintf(stderr,"Unknown tunnel mode \"%s\"\n", 
*argv);
-- 
2.1.2



[iproute PATCH 07/12] ip6tunnel: fix coding style: no newline between brace and else

2015-11-13 Thread Phil Sutter
Signed-off-by: Phil Sutter 
---
 ip/ip6tunnel.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index 9eb5b2f..d8957f0 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -262,8 +262,7 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip6_tnl_parm2 *p)
} else {
if (strcmp(*argv, "name") == 0) {
NEXT_ARG();
-   }
-   else if (matches(*argv, "help") == 0)
+   } else if (matches(*argv, "help") == 0)
usage();
if (p->name[0])
duparg2("name", *argv);
-- 
2.1.2



[iproute PATCH 03/12] ip{,6}tunnel: unify behaviour if physical device is not found

2015-11-13 Thread Phil Sutter
Make ip6tunnel print an error message as well. While there, get rid of
unnecessary line breaking.

Signed-off-by: Phil Sutter 
---
 ip/ip6tunnel.c | 4 +++-
 ip/iptunnel.c  | 3 +--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index 8b842b6..410276f 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -278,8 +278,10 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip6_tnl_parm2 *p)
}
if (medium[0]) {
p->link = ll_name_to_index(medium);
-   if (p->link == 0)
+   if (p->link == 0) {
+   fprintf(stderr, "Cannot find device \"%s\"\n", medium);
return -1;
+   }
}
return 0;
 }
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 9c9dc54..803bb83 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -228,8 +228,7 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
if (medium[0]) {
p->link = if_nametoindex(medium);
if (p->link == 0) {
-   fprintf(stderr, "Cannot find device \"%s\"\n",
-   medium);
+   fprintf(stderr, "Cannot find device \"%s\"\n", medium);
return -1;
}
}
-- 
2.1.2



[iproute PATCH 02/12] ip/tunnel: introduce tnl_parse_key()

2015-11-13 Thread Phil Sutter
Instead of duplicating the same code six times (key, ikey and okey in
iptunnel and ip6tunnel), have a common parsing routine. This has the
added benefit of having the same verbose error message in ip6tunnel as
well as iptunnel.

I'm not sure if parsing an IPv4 address as key makes sense for
ip6tunnel, but the code was there before so this patch at least doesn't
make it worse.
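
As a rough standalone illustration of what such a shared routine does
(this is not the tnl_parse_key() added to ip/tunnel.c): the sketch below
uses a hypothetical parse_key_sketch() built on inet_pton() and
strtoul() instead of iproute2's get_addr32() and get_unsigned(), and it
returns an error code where the real code prints a message and exits.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a tunnel key given either as a dotted quad or as an unsigned
 * integer. Stores the key in network byte order in *key and returns 0,
 * or returns -1 if the string is not a valid key.
 */
int parse_key_sketch(const char *arg, uint32_t *key)
{
	struct in_addr addr;
	unsigned long val;
	char *end;

	assert(arg != NULL && key != NULL);

	if (strchr(arg, '.')) {
		/* Dotted notation: inet_pton() yields network byte order. */
		if (inet_pton(AF_INET, arg, &addr) != 1)
			return -1;
		*key = addr.s_addr;
		return 0;
	}

	/* strtoul() would silently accept a minus sign, so reject it. */
	if (strchr(arg, '-'))
		return -1;

	errno = 0;
	val = strtoul(arg, &end, 0);
	if (errno || end == arg || *end != '\0' || val > UINT32_MAX)
		return -1;
	*key = htonl((uint32_t)val);
	return 0;
}
```

With one helper like this, the key, ikey and okey branches each reduce
to a single call, which is what the diff below does six times over.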

Signed-off-by: Phil Sutter 
---
 ip/ip6tunnel.c | 33 +++--
 ip/iptunnel.c  | 33 +++--
 ip/tunnel.c| 15 +++
 ip/tunnel.h|  1 +
 4 files changed, 22 insertions(+), 60 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index 07010d3..8b842b6 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -230,45 +230,18 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip6_tnl_parm2 *p)
invarg("not inherit", *argv);
p->flags |= IP6_TNL_F_RCV_DSCP_COPY;
} else if (strcmp(*argv, "key") == 0) {
-   unsigned uval;
NEXT_ARG();
p->i_flags |= GRE_KEY;
p->o_flags |= GRE_KEY;
-   if (strchr(*argv, '.'))
-   p->i_key = p->o_key = get_addr32(*argv);
-   else {
-   if (get_unsigned(&uval, *argv, 0) < 0) {
-   fprintf(stderr, "invalid value of 
\"key\"\n");
-   exit(-1);
-   }
-   p->i_key = p->o_key = htonl(uval);
-   }
+   p->i_key = p->o_key = tnl_parse_key("key", *argv);
} else if (strcmp(*argv, "ikey") == 0) {
-   unsigned uval;
NEXT_ARG();
p->i_flags |= GRE_KEY;
-   if (strchr(*argv, '.'))
-   p->i_key = get_addr32(*argv);
-   else {
-   if (get_unsigned(&uval, *argv, 0)<0) {
-   fprintf(stderr, "invalid value of 
\"ikey\"\n");
-   exit(-1);
-   }
-   p->i_key = htonl(uval);
-   }
+   p->i_key = tnl_parse_key("ikey", *argv);
} else if (strcmp(*argv, "okey") == 0) {
-   unsigned uval;
NEXT_ARG();
p->o_flags |= GRE_KEY;
-   if (strchr(*argv, '.'))
-   p->o_key = get_addr32(*argv);
-   else {
-   if (get_unsigned(&uval, *argv, 0)<0) {
-   fprintf(stderr, "invalid value of 
\"okey\"\n");
-   exit(-1);
-   }
-   p->o_key = htonl(uval);
-   }
+   p->o_key = tnl_parse_key("okey", *argv);
} else if (strcmp(*argv, "seq") == 0) {
p->i_flags |= GRE_SEQ;
p->o_flags |= GRE_SEQ;
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 36534f2..9c9dc54 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -106,45 +106,18 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
exit(-1);
}
} else if (strcmp(*argv, "key") == 0) {
-   unsigned uval;
NEXT_ARG();
p->i_flags |= GRE_KEY;
p->o_flags |= GRE_KEY;
-   if (strchr(*argv, '.'))
-   p->i_key = p->o_key = get_addr32(*argv);
-   else {
-   if (get_unsigned(&uval, *argv, 0)<0) {
-   fprintf(stderr, "invalid value for 
\"key\": \"%s\"; it should be an unsigned integer\n", *argv);
-   exit(-1);
-   }
-   p->i_key = p->o_key = htonl(uval);
-   }
+   p->i_key = p->o_key = tnl_parse_key("key", *argv);
} else if (strcmp(*argv, "ikey") == 0) {
-   unsigned uval;
NEXT_ARG();
p->i_flags |= GRE_KEY;
-   if (strchr(*argv, '.'))
-   p->i_key = get_addr32(*argv);
-   else {
-   if (get_unsigned(&uval, *argv, 0)<0) {
-   fprintf(stderr, "invalid value for 
\"ikey\": \"%s\"; it should be an unsigned integer\n", *argv);
-   

[iproute PATCH 12/12] ip{,6}tunnel: put spaces around non-unary operators

2015-11-13 Thread Phil Sutter
Signed-off-by: Phil Sutter 
---
 ip/ip6tunnel.c | 16 
 ip/iptunnel.c  | 40 
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index d8957f0..320d253 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -110,22 +110,22 @@ static void print_tunnel(struct ip6_tnl_parm2 *p)
printf(" dscp inherit");
 
if (p->proto == IPPROTO_GRE) {
-   if ((p->i_flags&GRE_KEY) && (p->o_flags&GRE_KEY) && p->o_key == 
p->i_key)
+   if ((p->i_flags & GRE_KEY) && (p->o_flags & GRE_KEY) && 
p->o_key == p->i_key)
printf(" key %u", ntohl(p->i_key));
-   else if ((p->i_flags|p->o_flags)&GRE_KEY) {
-   if (p->i_flags&GRE_KEY)
+   else if ((p->i_flags | p->o_flags) & GRE_KEY) {
+   if (p->i_flags & GRE_KEY)
printf(" ikey %u", ntohl(p->i_key));
-   if (p->o_flags&GRE_KEY)
+   if (p->o_flags & GRE_KEY)
printf(" okey %u", ntohl(p->o_key));
}
 
-   if (p->i_flags&GRE_SEQ)
+   if (p->i_flags & GRE_SEQ)
printf("%s  Drop packets out of sequence.", _SL_);
-   if (p->i_flags&GRE_CSUM)
+   if (p->i_flags & GRE_CSUM)
printf("%s  Checksum in received packet is required.", 
_SL_);
-   if (p->o_flags&GRE_SEQ)
+   if (p->o_flags & GRE_SEQ)
printf("%s  Sequence packets on output.", _SL_);
-   if (p->o_flags&GRE_CSUM)
+   if (p->o_flags & GRE_CSUM)
printf("%s  Checksum output packets.", _SL_);
}
 }
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index b377a5b..b9552ed 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -139,7 +139,7 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
p->iph.saddr = htonl(INADDR_ANY);
} else if (strcmp(*argv, "dev") == 0) {
NEXT_ARG();
-   strncpy(medium, *argv, IFNAMSIZ-1);
+   strncpy(medium, *argv, IFNAMSIZ - 1);
} else if (strcmp(*argv, "ttl") == 0 ||
   strcmp(*argv, "hoplimit") == 0 ||
   strcmp(*argv, "hlim") == 0) {
@@ -336,14 +336,14 @@ static void print_tunnel(struct ip_tunnel_parm *p)
if (p->iph.tos) {
SPRINT_BUF(b1);
printf(" tos");
-   if (p->iph.tos&1)
+   if (p->iph.tos & 1)
printf(" inherit");
-   if (p->iph.tos&~1)
-   printf("%c%s ", p->iph.tos&1 ? '/' : ' ',
-  rtnl_dsfield_n2a(p->iph.tos&~1, b1, sizeof(b1)));
+   if (p->iph.tos & ~1)
+   printf("%c%s ", p->iph.tos & 1 ? '/' : ' ',
+  rtnl_dsfield_n2a(p->iph.tos & ~1, b1, 
sizeof(b1)));
}
 
-   if (!(p->iph.frag_off&htons(IP_DF)))
+   if (!(p->iph.frag_off & htons(IP_DF)))
printf(" nopmtudisc");
 
if (p->iph.protocol == IPPROTO_IPV6 && !tnl_ioctl_get_6rd(p->name, 
&ip6rd) && ip6rd.prefixlen) {
@@ -357,22 +357,22 @@ static void print_tunnel(struct ip_tunnel_parm *p)
}
}
 
-   if ((p->i_flags&GRE_KEY) && (p->o_flags&GRE_KEY) && p->o_key == 
p->i_key)
+   if ((p->i_flags & GRE_KEY) && (p->o_flags & GRE_KEY) && p->o_key == 
p->i_key)
printf(" key %u", ntohl(p->i_key));
-   else if ((p->i_flags|p->o_flags)&GRE_KEY) {
-   if (p->i_flags&GRE_KEY)
+   else if ((p->i_flags | p->o_flags) & GRE_KEY) {
+   if (p->i_flags & GRE_KEY)
printf(" ikey %u", ntohl(p->i_key));
-   if (p->o_flags&GRE_KEY)
+   if (p->o_flags & GRE_KEY)
printf(" okey %u", ntohl(p->o_key));
}
 
-   if (p->i_flags&GRE_SEQ)
+   if (p->i_flags & GRE_SEQ)
printf("%s  Drop packets out of sequence.", _SL_);
-   if (p->i_flags&GRE_CSUM)
+   if (p->i_flags & GRE_CSUM)
printf("%s  Checksum in received packet is required.", _SL_);
-   if (p->o_flags&GRE_SEQ)
+   if (p->o_flags & GRE_SEQ)
printf("%s  Sequence packets on output.", _SL_);
-   if (p->o_flags&GRE_CSUM)
+   if (p->o_flags & GRE_CSUM)
printf("%s  Checksum output packets.", _SL_);
 }
 
@@ -592,19 +592,19 @@ int do_iptunnel(int argc, char **argv)
 
if (argc > 0) {
if (matches(*argv, "add") == 0)
-   return do_add(SIOCADDTUNNEL, argc-1, argv+1);
+   return do_add(SIOCADDTUNNEL, argc - 1, argv + 1);
if (matches(*argv, "change") ==

[iproute PATCH 04/12] iptunnel: use ll_name_to_index() for physical interface lookup

2015-11-13 Thread Phil Sutter
Although the cache is only initialized in do_show(), this way it is at
least consistent with ip6tunnel.

Signed-off-by: Phil Sutter 
---
 ip/iptunnel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 803bb83..a547852 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -226,7 +226,7 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
}
 
if (medium[0]) {
-   p->link = if_nametoindex(medium);
+   p->link = ll_name_to_index(medium);
if (p->link == 0) {
fprintf(stderr, "Cannot find device \"%s\"\n", medium);
return -1;
-- 
2.1.2



[iproute PATCH 11/12] iptunnel: sanitize copying tunnel name

2015-11-13 Thread Phil Sutter
Since p->name is only IFNAMSIZ bytes, do not copy more than IFNAMSIZ - 1
bytes into it so there remains at least a single null byte in the end.

Signed-off-by: Phil Sutter 
---
 ip/iptunnel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 3b46a15..b377a5b 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -175,7 +175,7 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
usage();
if (p->name[0])
duparg2("name", *argv);
-   strncpy(p->name, *argv, IFNAMSIZ);
+   strncpy(p->name, *argv, IFNAMSIZ - 1);
if (cmd == SIOCCHGTUNNEL && count == 0) {
struct ip_tunnel_parm old_p;
memset(&old_p, 0, sizeof(old_p));
-- 
2.1.2



[iproute PATCH 10/12] iptunnel: share common code when determining the default interface name

2015-11-13 Thread Phil Sutter
Signed-off-by: Phil Sutter 
---
 ip/iptunnel.c | 70 +--
 1 file changed, 25 insertions(+), 45 deletions(-)

diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 8c05f6f..3b46a15 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -239,10 +239,26 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
return 0;
 }
 
+static const char *tnl_defname(const struct ip_tunnel_parm *p)
+{
+   switch (p->iph.protocol) {
+   case IPPROTO_IPIP:
+   if (p->i_flags & VTI_ISVTI)
+   return "ip_vti0";
+   else
+   return "tunl0";
+   case IPPROTO_GRE:
+   return "gre0";
+   case IPPROTO_IPV6:
+   return "sit0";
+   }
+   return NULL;
+}
 
 static int do_add(int cmd, int argc, char **argv)
 {
struct ip_tunnel_parm p;
+   const char *basedev;
 
if (parse_args(argc, argv, cmd, &p) < 0)
return -1;
@@ -252,21 +268,12 @@ static int do_add(int cmd, int argc, char **argv)
return -1;
}
 
-   switch (p.iph.protocol) {
-   case IPPROTO_IPIP:
-   if (p.i_flags & VTI_ISVTI)
-   return tnl_add_ioctl(cmd, "ip_vti0", p.name, &p);
-   else
-   return tnl_add_ioctl(cmd, "tunl0", p.name, &p);
-   case IPPROTO_GRE:
-   return tnl_add_ioctl(cmd, "gre0", p.name, &p);
-   case IPPROTO_IPV6:
-   return tnl_add_ioctl(cmd, "sit0", p.name, &p);
-   default:
+   if (!(basedev = tnl_defname(&p))) {
fprintf(stderr, "cannot determine tunnel mode (ipip, gre, vti 
or sit)\n");
return -1;
}
-   return -1;
+
+   return tnl_add_ioctl(cmd, basedev, p.name, &p);
 }
 
 static int do_del(int argc, char **argv)
@@ -276,20 +283,7 @@ static int do_del(int argc, char **argv)
if (parse_args(argc, argv, SIOCDELTUNNEL, &p) < 0)
return -1;
 
-   switch (p.iph.protocol) {
-   case IPPROTO_IPIP:
-   if (p.i_flags & VTI_ISVTI)
-   return tnl_del_ioctl("ip_vti0", p.name, &p);
-   else
-   return tnl_del_ioctl("tunl0", p.name, &p);
-   case IPPROTO_GRE:
-   return tnl_del_ioctl("gre0", p.name, &p);
-   case IPPROTO_IPV6:
-   return tnl_del_ioctl("sit0", p.name, &p);
-   default:
-   return tnl_del_ioctl(p.name, p.name, &p);
-   }
-   return -1;
+   return tnl_del_ioctl(tnl_defname(&p) ? : p.name, p.name, &p);
 }
 
 static void print_tunnel(struct ip_tunnel_parm *p)
@@ -462,31 +456,17 @@ static int do_tunnels_list(struct ip_tunnel_parm *p)
 
 static int do_show(int argc, char **argv)
 {
-   int err;
struct ip_tunnel_parm p;
+   const char *basedev;
 
ll_init_map(&rth);
if (parse_args(argc, argv, SIOCGETTUNNEL, &p) < 0)
return -1;
 
-   switch (p.iph.protocol) {
-   case IPPROTO_IPIP:
-   if (p.i_flags & VTI_ISVTI)
-   err = tnl_get_ioctl(p.name[0] ? p.name : "ip_vti0", &p);
-   else
-   err = tnl_get_ioctl(p.name[0] ? p.name : "tunl0", &p);
-   break;
-   case IPPROTO_GRE:
-   err = tnl_get_ioctl(p.name[0] ? p.name : "gre0", &p);
-   break;
-   case IPPROTO_IPV6:
-   err = tnl_get_ioctl(p.name[0] ? p.name : "sit0", &p);
-   break;
-   default:
-   do_tunnels_list(&p);
-   return 0;
-   }
-   if (err)
+   if (!(basedev = tnl_defname(&p)))
+   return do_tunnels_list(&p);
+
+   if (tnl_get_ioctl(p.name[0] ? p.name : basedev, &p))
return -1;
 
print_tunnel(&p);
-- 
2.1.2



[iproute PATCH 01/12] ip{,6}tunnel: get rid of extraneous whitespace when printing

2015-11-13 Thread Phil Sutter
Put whitespace in the beginning of optional parts, not as suffix
anywhere. Also drop double whitespaces in between words.

Signed-off-by: Phil Sutter 
---
 ip/ip6tunnel.c |  4 ++--
 ip/iptunnel.c  | 16 
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index 9884efd..07010d3 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -111,9 +111,9 @@ static void print_tunnel(struct ip6_tnl_parm2 *p)
printf(" key %u", ntohl(p->i_key));
else if ((p->i_flags|p->o_flags)&GRE_KEY) {
if (p->i_flags&GRE_KEY)
-   printf(" ikey %u ", ntohl(p->i_key));
+   printf(" ikey %u", ntohl(p->i_key));
if (p->o_flags&GRE_KEY)
-   printf(" okey %u ", ntohl(p->o_key));
+   printf(" okey %u", ntohl(p->o_key));
}
 
if (p->i_flags&GRE_SEQ)
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 78fa988..36534f2 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -343,7 +343,7 @@ static void print_tunnel(struct ip_tunnel_parm *p)
/* Do not use format_host() for local addr,
 * symbolic name will not be useful.
 */
-   printf("%s: %s/ip  remote %s  local %s ",
+   printf("%s: %s/ip remote %s local %s",
   p->name,
   tnl_strproto(p->iph.protocol),
   p->iph.daddr ? format_host(AF_INET, 4, &p->iph.daddr, s1, 
sizeof(s1)) : "any",
@@ -371,13 +371,13 @@ static void print_tunnel(struct ip_tunnel_parm *p)
if (p->link) {
const char *n = ll_index_to_name(p->link);
if (n)
-   printf(" dev %s ", n);
+   printf(" dev %s", n);
}
 
if (p->iph.ttl)
-   printf(" ttl %d ", p->iph.ttl);
+   printf(" ttl %d", p->iph.ttl);
else
-   printf(" ttl inherit ");
+   printf(" ttl inherit");
 
if (p->iph.tos) {
SPRINT_BUF(b1);
@@ -393,11 +393,11 @@ static void print_tunnel(struct ip_tunnel_parm *p)
printf(" nopmtudisc");
 
if (p->iph.protocol == IPPROTO_IPV6 && !tnl_ioctl_get_6rd(p->name, 
&ip6rd) && ip6rd.prefixlen) {
-   printf(" 6rd-prefix %s/%u ",
+   printf(" 6rd-prefix %s/%u",
   inet_ntop(AF_INET6, &ip6rd.prefix, s1, sizeof(s1)),
   ip6rd.prefixlen);
if (ip6rd.relay_prefix) {
-   printf("6rd-relay_prefix %s/%u ",
+   printf(" 6rd-relay_prefix %s/%u",
   format_host(AF_INET, 4, &ip6rd.relay_prefix, s1, 
sizeof(s1)),
   ip6rd.relay_prefixlen);
}
@@ -407,9 +407,9 @@ static void print_tunnel(struct ip_tunnel_parm *p)
printf(" key %u", ntohl(p->i_key));
else if ((p->i_flags|p->o_flags)&GRE_KEY) {
if (p->i_flags&GRE_KEY)
-   printf(" ikey %u ", ntohl(p->i_key));
+   printf(" ikey %u", ntohl(p->i_key));
if (p->o_flags&GRE_KEY)
-   printf(" okey %u ", ntohl(p->o_key));
+   printf(" okey %u", ntohl(p->o_key));
}
 
if (p->i_flags&GRE_SEQ)
-- 
2.1.2
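
A minimal sketch of the convention this patch applies, where every optional field carries its own leading space and nothing emits a trailing one. The helper name format_tunnel_line() and its parameters are hypothetical and not part of iproute2; the real code prints directly with printf().

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not iproute2 code): build one tunnel line.
 * Each optional part starts with a space, so the result never has
 * doubled or trailing whitespace regardless of which parts are present.
 * Returns the string length, or -1 if the buffer is too small.
 */
static int format_tunnel_line(char *buf, size_t len, const char *name,
			      const char *dev, int ttl)
{
	size_t off;
	int n;

	n = snprintf(buf, len, "%s: ip/ip", name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	if (dev) {
		/* optional part: space goes in front, not behind */
		n = snprintf(buf + off, len - off, " dev %s", dev);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	if (ttl)
		n = snprintf(buf + off, len - off, " ttl %d", ttl);
	else
		n = snprintf(buf + off, len - off, " ttl inherit");
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	off += (size_t)n;

	return (int)off;
}
```

With this layout, adding or dropping an optional field never changes the spacing of its neighbours.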



[iproute PATCH 06/12] ip6tunnel: print local/remote addresses like iptunnel does

2015-11-13 Thread Phil Sutter
This makes output consistent with iptunnel, also supporting reverse DNS
lookup for remote address if requested.

Signed-off-by: Phil Sutter 
---
 ip/ip6tunnel.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index ba92518..9eb5b2f 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -68,14 +68,17 @@ static void usage(void)
 
 static void print_tunnel(struct ip6_tnl_parm2 *p)
 {
-   char remote[64];
-   char local[64];
-
-   inet_ntop(AF_INET6, &p->raddr, remote, sizeof(remote));
-   inet_ntop(AF_INET6, &p->laddr, local, sizeof(local));
+   char s1[1024];
+   char s2[1024];
 
+   /* Do not use format_host() for local addr,
+* symbolic name will not be useful.
+*/
printf("%s: %s/ipv6 remote %s local %s",
-  p->name, tnl_strproto(p->proto), remote, local);
+  p->name,
+  tnl_strproto(p->proto),
+  format_host(AF_INET6, 16, &p->raddr, s1, sizeof(s1)),
+  rt_addr_n2a(AF_INET6, 16, &p->laddr, s2, sizeof(s2)));
if (p->link) {
const char *n = ll_index_to_name(p->link);
if (n)
-- 
2.1.2



[iproute PATCH 09/12] iptunnel: simplify parsing TTL, allow 'hlim' as identifier

2015-11-13 Thread Phil Sutter
Instead of parsing an unsigned integer and checking boundaries, simply
parse u8. This and the added ttl alias 'hlim' provide consistency with
ip6tunnel.

Signed-off-by: Phil Sutter 
---
 ip/iptunnel.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 92edb34..8c05f6f 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -141,14 +141,13 @@ static int parse_args(int argc, char **argv, int cmd, 
struct ip_tunnel_parm *p)
NEXT_ARG();
strncpy(medium, *argv, IFNAMSIZ-1);
} else if (strcmp(*argv, "ttl") == 0 ||
-  strcmp(*argv, "hoplimit") == 0) {
-   unsigned uval;
+  strcmp(*argv, "hoplimit") == 0 ||
+  strcmp(*argv, "hlim") == 0) {
+   __u8 uval;
NEXT_ARG();
if (strcmp(*argv, "inherit") != 0) {
-   if (get_unsigned(&uval, *argv, 0))
+   if (get_u8(&uval, *argv, 0))
invarg("invalid TTL\n", *argv);
-   if (uval > 255)
-   invarg("TTL must be <=255\n", *argv);
p->iph.ttl = uval;
}
} else if (strcmp(*argv, "tos") == 0 ||
-- 
2.1.2
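
Why parsing straight into an 8-bit value makes the explicit "> 255" check redundant can be shown with a small stand-alone parser. This is a sketch only: parse_u8() is a hypothetical name and is not claimed to match the implementation of iproute2's get_u8().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical u8 parser: returns 0 and stores the value on success,
 * -1 on empty input, trailing garbage, or a value outside 0..255.
 * The range check lives here, so callers need no separate bound test.
 */
static int parse_u8(uint8_t *val, const char *arg, int base)
{
	unsigned long res;
	char *end;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	res = strtoul(arg, &end, base);
	if (errno || end == arg || *end != '\0' || res > UINT8_MAX)
		return -1;

	*val = (uint8_t)res;
	return 0;
}
```

A TTL of "256" or "12x" is rejected by the parser itself, which is what lets the caller drop its own boundary test.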



[iproute PATCH 05/12] ip{,6}tunnel: align do_tunnels_list() a bit

2015-11-13 Thread Phil Sutter
In iptunnel, declare loop variables inside the loop as done in
ip6tunnel.

Fix and simplify goto logic in ip6tunnel:
- Failure to read over header lines would have left fp opened.
- By returning directly upon fopen() failure, fp can be closed
  unconditionally in the end.

Use the same goto logic in iptunnel, as well.

Signed-off-by: Phil Sutter 
---
 ip/ip6tunnel.c |  8 +++-
 ip/iptunnel.c  | 25 +
 2 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c
index 410276f..ba92518 100644
--- a/ip/ip6tunnel.c
+++ b/ip/ip6tunnel.c
@@ -326,14 +326,14 @@ static int do_tunnels_list(struct ip6_tnl_parm2 *p)
FILE *fp = fopen("/proc/net/dev", "r");
if (fp == NULL) {
perror("fopen");
-   goto end;
+   return -1;
}
 
/* skip two lines at the begenning of the file */
if (!fgets(buf, sizeof(buf), fp) ||
!fgets(buf, sizeof(buf), fp)) {
fprintf(stderr, "/proc/net/dev read error\n");
-   return -1;
+   goto end;
}
 
while (fgets(buf, sizeof(buf), fp) != NULL) {
@@ -395,10 +395,8 @@ static int do_tunnels_list(struct ip6_tnl_parm2 *p)
printf("\n");
}
err = 0;
-
  end:
-   if (fp)
-   fclose(fp);
+   fclose(fp);
return err;
 }
 
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index a547852..e323c1f 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -396,14 +396,8 @@ static void print_tunnel(struct ip_tunnel_parm *p)
 
 static int do_tunnels_list(struct ip_tunnel_parm *p)
 {
-   char name[IFNAMSIZ];
-   unsigned long  rx_bytes, rx_packets, rx_errs, rx_drops,
-   rx_fifo, rx_frame,
-   tx_bytes, tx_packets, tx_errs, tx_drops,
-   tx_fifo, tx_colls, tx_carrier, rx_multi;
-   struct ip_tunnel_parm p1;
-
char buf[512];
+   int err = -1;
FILE *fp = fopen("/proc/net/dev", "r");
if (fp == NULL) {
perror("fopen");
@@ -414,19 +408,24 @@ static int do_tunnels_list(struct ip_tunnel_parm *p)
if (!fgets(buf, sizeof(buf), fp) ||
!fgets(buf, sizeof(buf), fp)) {
fprintf(stderr, "/proc/net/dev read error\n");
-   fclose(fp);
-   return -1;
+   goto end;
}
 
while (fgets(buf, sizeof(buf), fp) != NULL) {
+   char name[IFNAMSIZ];
int index, type;
+   unsigned long rx_bytes, rx_packets, rx_errs, rx_drops,
+   rx_fifo, rx_frame,
+   tx_bytes, tx_packets, tx_errs, tx_drops,
+   tx_fifo, tx_colls, tx_carrier, rx_multi;
+   struct ip_tunnel_parm p1;
char *ptr;
+
buf[sizeof(buf) - 1] = 0;
if ((ptr = strchr(buf, ':')) == NULL ||
(*ptr++ = 0, sscanf(buf, "%s", name) != 1)) {
fprintf(stderr, "Wrong format for /proc/net/dev. Giving 
up.\n");
-   fclose(fp);
-   return -1;
+   goto end;
}
if (sscanf(ptr, "%ld%ld%ld%ld%ld%ld%ld%*d%ld%ld%ld%ld%ld%ld%ld",
   &rx_bytes, &rx_packets, &rx_errs, &rx_drops,
@@ -467,8 +466,10 @@ static int do_tunnels_list(struct ip_tunnel_parm *p)
}
printf("\n");
}
+   err = 0;
+ end:
fclose(fp);
-   return 0;
+   return err;
 }
 
 static int do_show(int argc, char **argv)
-- 
2.1.2
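
The goto logic described in the commit message can be sketched outside of iproute2 as follows. The function count_ifaces() is a hypothetical example that only counts entries in a /proc/net/dev-style file; the point is the control flow: return directly when fopen() fails, and route every later error through a single label that closes the file unconditionally.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical example: count interface lines in a file laid out
 * like /proc/net/dev (two header lines, then "name: stats...").
 * Returns the number of entries, or -1 on any error.
 */
static int count_ifaces(const char *path)
{
	char buf[512];
	int count = 0;
	int err = -1;
	FILE *fp = fopen(path, "r");

	if (fp == NULL) {
		perror("fopen");
		return -1;	/* nothing to close yet */
	}

	/* skip the two header lines */
	if (!fgets(buf, sizeof(buf), fp) ||
	    !fgets(buf, sizeof(buf), fp)) {
		fprintf(stderr, "%s: read error\n", path);
		goto end;	/* fp is open, so it must be closed */
	}

	while (fgets(buf, sizeof(buf), fp) != NULL) {
		if (strchr(buf, ':') == NULL) {
			fprintf(stderr, "%s: wrong format\n", path);
			goto end;
		}
		count++;
	}
	err = count;
end:
	fclose(fp);	/* fp is known to be non-NULL here */
	return err;
}
```

Because the only path that skips the label is the one where fp is NULL, the cleanup needs no "if (fp)" guard.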



Re: [PATCH 2/2] arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS

2015-11-13 Thread Shi, Yang

On 11/12/2015 7:28 PM, Z Lim wrote:
> On Thu, Nov 12, 2015 at 1:57 PM, Yang Shi  wrote:
>> Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
>> in prologue in order to get the correct stack backtrace.
>>
>> However, ARM64 JIT used FP (x29) as eBPF fp register, FP is subjected to
>> change during function call so it may cause the BPF prog stack base address
>> change too.
>>
>> Use x25 to replace FP as BPF stack base register (fp). Since x25 is callee
>> saved register, so it will keep intact during function call.
>> It is initialized in BPF prog prologue when BPF prog is started to run
>> everytime. When BPF prog exits, it could be just tossed.
>>
>> So, the BPF stack layout looks like:
>>
>>                          high
>>   original A64_SP =>   0:+-----+ BPF prologue
>>                          |     | FP/LR and callee saved registers
>>   BPF fp register => -64:+-----+
>>                          |     |
>>                          | ... | BPF prog stack
>>                          |     |
>>                          |     |
>>   current A64_SP/FP =>   +-----+
>>                          |     |
>>                          | ... | Function call stack
>>                          |     |
>>                          +-----+
>>                            low
>>
>
> Yang, for stack unwinding to work, shouldn't it be something like the following?

Yes, thanks for catching this. v3 will be posted soon.

Yang

>
>           | LR |
> A64_FP => | FP |
>           | .. |





RE: [iproute PATCH 05/12] ip{,6}tunnel: align do_tunnels_list() a bit

2015-11-13 Thread David Laight
From: Phil Sutter
> Sent: 13 November 2015 17:09
> In iptunnel, declare loop variables inside the loop as done in
> ip6tunnel.
...
> @@ -396,14 +396,8 @@ static void print_tunnel(struct ip_tunnel_parm *p)
> 
>  static int do_tunnels_list(struct ip_tunnel_parm *p)
>  {
> - char name[IFNAMSIZ];
> - unsigned long  rx_bytes, rx_packets, rx_errs, rx_drops,
> - rx_fifo, rx_frame,
> - tx_bytes, tx_packets, tx_errs, tx_drops,
> - tx_fifo, tx_colls, tx_carrier, rx_multi;
> - struct ip_tunnel_parm p1;
> -
...
>   while (fgets(buf, sizeof(buf), fp) != NULL) {
> + char name[IFNAMSIZ];
>   int index, type;
> + unsigned long rx_bytes, rx_packets, rx_errs, rx_drops,
> + rx_fifo, rx_frame,
> + tx_bytes, tx_packets, tx_errs, tx_drops,
> + tx_fifo, tx_colls, tx_carrier, rx_multi;
> + struct ip_tunnel_parm p1;
>   char *ptr;
> +

Personally I find that just makes it harder to find where the
variables are defined.
Since the linux kernel cannot be compiled with -Wshadow declaring
variables in inner scopes can easily lead to very strange bugs.

David


Re: [iproute PATCH 05/12] ip{,6}tunnel: align do_tunnels_list() a bit

2015-11-13 Thread Phil Sutter
On Fri, Nov 13, 2015 at 05:30:10PM +, David Laight wrote:
> From: Phil Sutter
> > Sent: 13 November 2015 17:09
> > In iptunnel, declare loop variables inside the loop as done in
> > ip6tunnel.
> ...
> > @@ -396,14 +396,8 @@ static void print_tunnel(struct ip_tunnel_parm *p)
> > 
> >  static int do_tunnels_list(struct ip_tunnel_parm *p)
> >  {
> > -   char name[IFNAMSIZ];
> > -   unsigned long  rx_bytes, rx_packets, rx_errs, rx_drops,
> > -   rx_fifo, rx_frame,
> > -   tx_bytes, tx_packets, tx_errs, tx_drops,
> > -   tx_fifo, tx_colls, tx_carrier, rx_multi;
> > -   struct ip_tunnel_parm p1;
> > -
> ...
> > while (fgets(buf, sizeof(buf), fp) != NULL) {
> > +   char name[IFNAMSIZ];
> > int index, type;
> > +   unsigned long rx_bytes, rx_packets, rx_errs, rx_drops,
> > +   rx_fifo, rx_frame,
> > +   tx_bytes, tx_packets, tx_errs, tx_drops,
> > +   tx_fifo, tx_colls, tx_carrier, rx_multi;
> > +   struct ip_tunnel_parm p1;
> > char *ptr;
> > +
> 
> Personally I find that just makes it harder to find where the
> variables are defined.

Well, the above aligns the code with ip/ip6tunnel.c in that particular
matter. I'm neither a friend of the old nor the new version, so if
everyone thinks it is better without this patch, I'm fine with changing
ip/ip6tunnel.c accordingly as well.

Looking at the code again, maybe the better option overall would be to
export the whole file reading and stats printing code into a shared
function.

> Since the linux kernel cannot be compiled with -Wshadow declaring
> variables in inner scopes can easily lead to very strange bugs.

Well, since this is not kernel code but iproute2 one, we *could* compile
it with -Wshadow.

Thanks, Phil
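
The kind of bug that -Wshadow is meant to catch can be sketched like this. Both functions are hypothetical and unrelated to the iproute2 sources; they only show how a declaration in an inner scope can silently hide an outer variable of the same name.

```c
#include <stddef.h>

/*
 * Buggy: the inner "idx" shadows the outer one, so the outer value
 * is never updated. gcc -Wshadow warns about the inner declaration.
 */
static int first_negative_buggy(const int *v, size_t n)
{
	int idx = -1;

	for (size_t i = 0; i < n; i++) {
		if (v[i] < 0) {
			int idx = (int)i;	/* shadows outer idx */

			(void)idx;
			break;
		}
	}
	return idx;	/* always -1 */
}

/* Fixed: assign to the outer variable instead of redeclaring it. */
static int first_negative_fixed(const int *v, size_t n)
{
	int idx = -1;

	for (size_t i = 0; i < n; i++) {
		if (v[i] < 0) {
			idx = (int)i;
			break;
		}
	}
	return idx;
}
```

Declaring loop variables inside the loop is harmless as long as no outer variable shares the name, which is the condition -Wshadow checks at compile time.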


Re: [PATCH 00/10] Netfilter fixes for net

2015-11-13 Thread Josh Boyer
On Wed, Nov 11, 2015 at 12:33 PM, Pablo Neira Ayuso  wrote:
> Jozsef Kadlecsik (3):
>   netfilter: ipset: Fix extension alignment
>   netfilter: ipset: Fix hash:* type expiration
>   netfilter: ipset: Fix hash type expire: release empty hash bucket block

Should these three go to stable?  We've had reports in Fedora about
ipset crashing on e.g. ARM architectures with 4.2.y kernels.  If not
all three, then perhaps just the alignment fix?

josh


Re: linux-next: Tree for Nov 13 (netfilter)

2015-11-13 Thread Randy Dunlap

On 11/12/15 18:23, Stephen Rothwell wrote:
> Hi all,
> 
> Please do *not* add any material intended for v4.5 to your linux-next
> included branches until after v4.4-rc1 has been released.
> 
> Changes since 20151112:
> 

on x86_64:

net/built-in.o: In function `tee_tg6':
xt_TEE.c:(.text+0x4b28b): undefined reference to `nf_dup_ipv6'


Full randconfig file is attached.


-- 
~Randy



#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.3.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx 
-fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 
-fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
# CONFIG_KERNEL_BZIP2 is not set
CONFIG_KERNEL_LZMA=y
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
# CONFIG_FHANDLE is not set
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_CHIP=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set

#
# RCU Subsystem
#
CONFIG_TINY_RCU=y
CONFIG_RCU_EXPERT=y
CONFIG_SRCU=y
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_KTHREAD_PRIO=0
# CONFIG_RCU_EXPEDITE_BOOT is not set
# CONFIG_BUILD_BIN2C is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_SUPPORTS_INT128=y
CONFIG_CGROUPS=y
# CONFIG_CGROUP_DEBUG is not set
# CONFIG_CGROUP_FREEZER is not set
CONFIG_CGROUP_PIDS=y
# CONFIG_CGROUP_DEVICE is not set
# CONFIG_CPUSETS is not set
CONFIG_CGROUP_CPUACCT=y
CONFIG_PAGE_COUNTER=y
CONFIG_MEMCG=y
CONFIG_MEMCG_SWAP=y
CONFIG_MEMCG_SWAP_ENABLED=y
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_CFS_BANDWIDTH=y
# CONFIG_RT_GROUP_SCHED is not set
CONFIG_BLK_CGROUP=y
CONFIG_DEBUG_BLK_CGROUP=y
CONFIG_CGROUP_WRITEBACK=y
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_SCHED_AUTOGROUP=y
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
# CONFIG_RD_LZO is not set
CONFIG_RD_LZ4=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
CONFIG_EXPERT=y
# CONFIG_MULTIUSER is not set
# CONFIG_SGETMASK_SYSCALL is not set
# CONFIG_SYSFS_SYSCALL

[PATCH V3 2/2] arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS

2015-11-13 Thread Yang Shi
Save and restore FP/LR in the BPF prog prologue and epilogue, and save SP
to FP in the prologue, in order to get a correct stack backtrace.

However, the ARM64 JIT used FP (x29) as the eBPF fp register. FP is subject
to change during function calls, so the BPF prog stack base address may
change too.

Use x25 to replace FP as the BPF stack base register (fp). Since x25 is a
callee-saved register, it stays intact across function calls.
It is initialized in the BPF prog prologue every time the BPF prog starts
to run. When the BPF prog exits, it can simply be discarded.

So, the BPF stack layout looks like:

                        high
 original A64_SP =>   0:+-----+ BPF prologue
                        |FP/LR|
 current A64_FP =>  -16:+-----+
                        | ... | callee saved registers
 BPF fp register => -64:+-----+
                        |     |
                        | ... | BPF prog stack
                        |     |
                        |     |
 current A64_SP =>      +-----+
                        |     |
                        | ... | Function call stack
                        |     |
                        +-----+
                          low

CC: Zi Shen Lim 
CC: Xi Wang 
Signed-off-by: Yang Shi 
---
V3 --> V2:
* Make FP point to FP'
* Fix a compile warning

 arch/arm64/net/bpf_jit_comp.c | 37 +++--
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index ac8b548..c131e38 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -50,7 +50,7 @@ static const int bpf2a64[] = {
[BPF_REG_8] = A64_R(21),
[BPF_REG_9] = A64_R(22),
/* read-only frame pointer to access stack */
-   [BPF_REG_FP] = A64_FP,
+   [BPF_REG_FP] = A64_R(25),
/* temporary register for internal BPF JIT */
[TMP_REG_1] = A64_R(23),
[TMP_REG_2] = A64_R(24),
@@ -155,16 +155,42 @@ static void build_prologue(struct jit_ctx *ctx)
stack_size += 4; /* extra for skb_copy_bits buffer */
stack_size = STACK_ALIGN(stack_size);
 
+   /*
+* BPF prog stack layout
+*
+* high
+* original A64_SP =>   0:+-+ BPF prologue
+*|FP/LR|
+* current A64_FP =>  -16:+-+
+*| ... | callee saved registers
+* BPF fp register => -64:+-+
+*| |
+*| ... | BPF prog stack
+*| |
+*| |
+* current A64_SP =>  +-+
+*| |
+*| ... | Function call stack
+*| |
+*+-+
+*  low
+*
+*/
+
+   /* Save FP and LR registers to stay align with ARM64 AAPCS */
+   emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
+   emit(A64_MOV(1, A64_FP, A64_SP), ctx);
+
/* Save callee-saved register */
emit(A64_PUSH(r6, r7, A64_SP), ctx);
emit(A64_PUSH(r8, r9, A64_SP), ctx);
if (ctx->tmp_used)
emit(A64_PUSH(tmp1, tmp2, A64_SP), ctx);
 
-   /* Set up frame pointer */
+   /* Set up BPF prog stack base register (x25) */
emit(A64_MOV(1, fp, A64_SP), ctx);
 
-   /* Set up BPF stack */
+   /* Set up function call stack */
emit(A64_SUB_I(1, A64_SP, A64_SP, stack_size), ctx);
 
/* Clear registers A and X */
@@ -179,7 +205,6 @@ static void build_epilogue(struct jit_ctx *ctx)
const u8 r7 = bpf2a64[BPF_REG_7];
const u8 r8 = bpf2a64[BPF_REG_8];
const u8 r9 = bpf2a64[BPF_REG_9];
-   const u8 fp = bpf2a64[BPF_REG_FP];
const u8 tmp1 = bpf2a64[TMP_REG_1];
const u8 tmp2 = bpf2a64[TMP_REG_2];
int stack_size = MAX_BPF_STACK;
@@ -196,8 +221,8 @@ static void build_epilogue(struct jit_ctx *ctx)
emit(A64_POP(r8, r9, A64_SP), ctx);
emit(A64_POP(r6, r7, A64_SP), ctx);
 
-   /* Restore frame pointer */
-   emit(A64_MOV(1, fp, A64_SP), ctx);
+   /* Restore FP/LR registers */
+   emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx);
 
/* Set return value */
emit(A64_MOV(1, A64_R(0), r0), ctx);
-- 
2.0.2



Re: [PATCH net] net: fix feature changes on devices without ndo_set_features

2015-11-13 Thread Florian Fainelli
On 13/11/15 05:54, Nikolay Aleksandrov wrote:
> From: Nikolay Aleksandrov 
> 
> When __netdev_update_features() was updated to ensure some features are
> disabled on new lower devices, an error was introduced for devices which
> don't have the ndo_set_features() method set. Before we'll just set the
> new features, but now we return an error and don't set them. Fix this by
> returning the old behaviour and setting err to 0 when ndo_set_features
> is not present.
> 
> Fixes: e7868a85e1b2 ("net/core: ensure features get disabled on new lower 
> devs")
> CC: Jarod Wilson 
> CC: Jiri Pirko 
> CC: Ido Schimmel 
> CC: Sander Eikelenboom 
> CC: Andy Gospodarek 
> CC: Florian Fainelli 
> Signed-off-by: Nikolay Aleksandrov 

Tested-by: Florian Fainelli 

Thanks everyone!

> ---
> Sander please feel free to give your Tested-by.
> 
>  net/core/dev.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ab9b8d0d115e..4a1d198dbbff 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6426,6 +6426,8 @@ int __netdev_update_features(struct net_device *dev)
>  
>   if (dev->netdev_ops->ndo_set_features)
>   err = dev->netdev_ops->ndo_set_features(dev, features);
> + else
> + err = 0;
>  
>   if (unlikely(err < 0)) {
>   netdev_err(dev,
> 


-- 
Florian
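
The behaviour the fix restores can be sketched in plain C with stand-in types. Everything here (struct toy_ops, struct toy_dev, toy_reject(), toy_update_features()) is hypothetical and only mirrors the shape of the quoted hunk: a missing callback counts as success, so the new features are still stored.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver ops table and the device. */
struct toy_ops {
	int (*set_features)(unsigned int features);	/* may be NULL */
};

struct toy_dev {
	const struct toy_ops *ops;
	unsigned int features;
};

/* Example callback that always refuses the change. */
static int toy_reject(unsigned int features)
{
	(void)features;
	return -22;	/* stands in for -EINVAL */
}

/*
 * Returns 1 if features changed, 0 if nothing to do, -1 on error.
 * Without the "else" branch, a device lacking the callback would
 * keep err == -1 and the update would be reported as failed.
 */
static int toy_update_features(struct toy_dev *dev, unsigned int features)
{
	int err = -1;

	if (dev->features == features)
		return 0;

	if (dev->ops->set_features)
		err = dev->ops->set_features(features);
	else
		err = 0;	/* no callback: accept the new features */

	if (err < 0)
		return -1;

	dev->features = features;
	return 1;
}
```

A device with no callback then accepts the change, while a device whose callback fails keeps its old feature set.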


Re: [PATCH] bonding: Offloading bonds to hardware

2015-11-13 Thread Florian Fainelli
On 12/11/15 08:02, Premkumar Jonnala wrote:
> Packet forwarding to/from bond interfaces is done in software.
> 
> This patch enables certain platforms to bridge traffic to/from
> bond interfaces in hardware.  Notifications are sent out when 
> the "active" slave set for a bond interface is updated in 
> software.  Platforms use the notifications to program the 
> hardware accordingly.  The changes have been verified to work 
> with configured and 802.3ad bond interfaces.

This is a good explanation of why you want the changes, and how this is
implemented in a system utilizing that, but this is not documenting why
you are making these changes to the bonding code, nor how they are
supposed to be used by an implementor driver, since there is no such
user posted (yet?).

You introduce two new NDOs which are not documented in the commit
message which would be nice to explain, in particular, why adding new
NDOs and not switchdev attributes and methods for instance?

Also, is it possible to move some of the logic into a notifier instead
of having to maintain an array of slaves and an array of slaves to discard?

> 
> Signed-off-by: Premkumar Jonnala 
> 
> ---
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index b4351ca..4b53733 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -3759,6 +3759,101 @@ err:
>   bond_slave_arr_work_rearm(bond, 1);
>  }
>  
> +static int slave_present(struct slave *slave, struct bond_up_slave *arr)
> +{
> + int i;
> +
> + if (!arr)
> + return 0;
> +
> + for (i = 0; i < arr->count; i++) {
> + if (arr->arr[i] == slave)
> + return 1;
> + }
> + return 0;
> +}
> +
> +/* Send notification to clear/remove slaves for 'bond' in 'arr' except for
> + * slaves in 'ignore_arr'.
> + */
> +static int bond_slave_arr_clear_notify(struct bonding *bond,
> + struct bond_up_slave *arr,
> + struct bond_up_slave *ignore_arr)
> +{
> + struct slave *slave;
> + struct net_device *slave_dev;
> + int i, rv;
> + const struct net_device_ops *ops;
> +
> + if (!bond->dev || !arr)
> + return -EINVAL;
> +
> + rv = 0;
> + for (i = 0; i < arr->count; i++) {
> + slave = arr->arr[i];
> + if (!slave || !slave->dev)
> + continue;
> +
> + slave_dev = slave->dev;
> + if (slave_present(slave, ignore_arr)) {
> + netdev_dbg(bond->dev, "ignoring clear of slave %s\n",
> + slave_dev->name);
> + continue;
> + }
> + ops = slave_dev->netdev_ops;
> + if (!ops || !ops->ndo_bond_slave_discard) {
> + netdev_dbg(bond->dev, "No slave discard ops for %s\n",
> + slave_dev->name);
> + continue;
> + }
> + rv = ops->ndo_bond_slave_discard(slave_dev, bond->dev);
> + if (rv < 0)
> + return rv;
> + }
> + return rv;
> +}
> +
> +/* Send notification about updated slaves for 'bond' except for slaves in
> + * 'ignore_arr'.
> + */
> +static int bond_slave_arr_set_notify(struct bonding *bond,
> + struct bond_up_slave *ignore_arr)
> +{
> + struct slave *slave;
> + struct net_device *slave_dev;
> + struct bond_up_slave *arr;
> + int i, rv;
> + const struct net_device_ops *ops;
> +
> + if (!bond || !bond->dev)
> + return -EINVAL;
> + rv = 0;
> +
> + arr = rtnl_dereference(bond->slave_arr);
> + if (!arr)
> + return -EINVAL;
> +
> + for (i = 0; i < arr->count; i++) {
> + slave = arr->arr[i];
> + slave_dev = slave->dev;
> + if (slave_present(slave, ignore_arr)) {
> + netdev_dbg(bond->dev, "ignoring add of slave %s\n",
> + slave->dev->name);
> + continue;
> + }
> + ops = slave_dev->netdev_ops;
> + if (!ops || !ops->ndo_bond_slave_add) {
> + netdev_dbg(bond->dev, "No slave add ops for %s\n",
> + slave_dev->name);
> + continue;
> + }
> + rv = ops->ndo_bond_slave_add(slave_dev, bond->dev);
> + if (rv < 0)
> + return rv;
> + }
> + return rv;
> +}
> +
>  /* Build the usable slaves array in control path for modes that use xmit-hash
>   * to determine the slave interface -
>   * (a) BOND_MODE_8023AD
> @@ -3771,7 +3866,7 @@ int bond_update_slave_arr(struct bonding *bond, struct 
> slave *skipslave)
>  {
>   struct slave *slave;
>   struct list_head *iter;
> - struct bond_up_slave *new_arr, *old_arr;
> + struct bond_up_slave *new_arr, *old_arr, *discard_arr = 0

Re: [net-next 04/17] drivers/net/intel: use napi_complete_done()

2015-11-13 Thread Alexander Duyck

On 11/13/2015 08:49 AM, Eric Dumazet wrote:
> On Fri, 2015-11-13 at 08:06 -0800, Alexander Duyck wrote:
>
>> Yes, I'm pretty certain you cannot use this napi_complete_done with
>> anything that support busy poll sockets.  The problem is you need to
>> flush any existing lists before yielding to the socket polling in order
>> to avoid packet ordering issues between the NAPI polling routine and the
>> socket polling routine.
>
> My plan is to make busy poll independent of GRO / RPS / RFS, and generic
> if possible, for all NAPI drivers. (No need to absolutely provide
> ndo_busy_poll()
>
> I really do not see GRO being a problem for low latency : RPC messages
> are terminated by PSH flag that take care of flushing GRO engine.

Right.  I wasn't thinking so much about GRO delaying the frames as the
fact that ixgbe will call netif_receive_skb if busy polling instead of
napi_gro_receive.  So you might have frames left in the GRO list that
would get bypassed if pulled out during busy polling.

> For mixed use, (low latency and other kind of flows), GRO is a win.

Agreed.

> With the following sk_busy_loop() , we :
>
> - allow tunneling traffic to use busy poll as well as native traffic.
> - allow RFS/RPS being used (sending IPI to other cpus if needed)
> - use the 'lets burn cpu cycles' to do useful work (like TX completions, RCU callbacks...)
> - Implement busy poll for all NAPI drivers.
>
> 	rcu_read_lock();
> 	napi = napi_by_id(sk->sk_napi_id);
> 	if (!napi)
> 		goto out;
> 	ops = napi->dev->netdev_ops;
>
> 	for (;;) {
> 		local_bh_disable();
> 		rc = 0;
> 		if (ops->ndo_busy_poll) {
> 			rc = ops->ndo_busy_poll(napi);
> 		} else if (napi_schedule_prep(napi)) {
> 			rc = napi->poll(napi, 4);
> 			if (rc == 4) {
> 				napi_complete_done(napi, rc);
> 				napi_schedule(napi);
> 			}
> 		}
> 		if (rc > 0)
> 			NET_ADD_STATS_BH(sock_net(sk),
> 					 LINUX_MIB_BUSYPOLLRXPACKETS, rc);
> 		local_bh_enable();
>
> 		if (rc == LL_FLUSH_FAILED ||
> 		    nonblock ||
> 		    !skb_queue_empty(&sk->sk_receive_queue) ||
> 		    need_resched() ||
> 		    busy_loop_timeout(end_time))
> 			break;
>
> 		cpu_relax();
> 	}
> 	rcu_read_unlock();

Sounds good.

- Alex


Re: [PATCH 00/10] Netfilter fixes for net

2015-11-13 Thread Jozsef Kadlecsik
On Fri, 13 Nov 2015, Josh Boyer wrote:

> On Wed, Nov 11, 2015 at 12:33 PM, Pablo Neira Ayuso  
> wrote:
> > Jozsef Kadlecsik (3):
> >   netfilter: ipset: Fix extension alignment
> >   netfilter: ipset: Fix hash:* type expiration
> >   netfilter: ipset: Fix hash type expire: release empty hash bucket 
> > block
> 
> Should these three go to stable?  We've had reports in Fedora about
> ipset crashing on e.g. ARM architectures with 4.2.y kernels.  If not
> all three, then perhaps just the alignment fix?

Yes, those should definitely go to stable, at least the first two ones.

Best regards,
Jozsef
-
E-mail  : kad...@blackhole.kfki.hu, kadlecsik.joz...@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
  H-1525 Budapest 114, POB. 49, Hungary


Re: [PATCH] unix: avoid use-after-free in ep_remove_wait_queue

2015-11-13 Thread Rainer Weikusat
An AF_UNIX datagram socket being the client in an n:1 association with
some server socket is only allowed to send messages to the server if the
receive queue of this socket contains at most sk_max_ack_backlog
datagrams. This implies that prospective writers might be forced to go
to sleep despite none of the message presently enqueued on the server
receive queue were sent by them. In order to ensure that these will be
woken up once space becomes again available, the present unix_dgram_poll
routine does a second sock_poll_wait call with the peer_wait wait queue
of the server socket as queue argument (unix_dgram_recvmsg does a wake
up on this queue after a datagram was received). This is inherently
problematic because the server socket is only guaranteed to remain alive
for as long as the client still holds a reference to it. In case the
connection is dissolved via connect or by the dead peer detection logic
in unix_dgram_sendmsg, the server socket may be freed even though "the
polling mechanism" (in particular, epoll) still has a pointer to the
corresponding peer_wait queue. There's no way to forcibly deregister a
wait queue with epoll.

Based on an idea by Jason Baron, the patch below changes the code such
that a wait_queue_t belonging to the client socket is enqueued on the
peer_wait queue of the server whenever the peer receive queue full
condition is detected by either a sendmsg or a poll. A wake up on the
peer queue is then relayed to the ordinary wait queue of the client
socket via wake function. The connection to the peer wait queue is again
dissolved if either a wake up is about to be relayed or the client
socket reconnects or a dead peer is detected or the client socket is
itself closed. This enables removing the second sock_poll_wait from
unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
that no blocked writer sleeps forever.

Signed-off-by: Rainer Weikusat 
---
"Believed to be least buggy version"

- disconnect from former peer in _dgram_connect

- use unix_state_double_lock in _dgram_sendmsg to ensure
  recv_ready/ wake_me preconditions are met (noted by Jason
  Baron)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index b36d837..2a91a05 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -62,6 +62,7 @@ struct unix_sock {
 #define UNIX_GC_CANDIDATE  0
 #define UNIX_GC_MAYBE_CYCLE    1
    struct socket_wq    peer_wq;
+   wait_queue_t        peer_wake;
 };
 
 static inline struct unix_sock *unix_sk(const struct sock *sk)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 94f6582..30e7c56 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -326,6 +326,122 @@ found:
return s;
 }
 
+/* Support code for asymmetrically connected dgram sockets
+ *
+ * If a datagram socket is connected to a socket not itself connected
+ * to the first socket (eg, /dev/log), clients may only enqueue more
+ * messages if the present receive queue of the server socket is not
+ * "too large". This means there's a second writeability condition
+ * poll and sendmsg need to test. The dgram recv code will do a wake
+ * up on the peer_wait wait queue of a socket upon reception of a
+ * datagram which needs to be propagated to sleeping would-be writers
+ * since these might not have sent anything so far. This can't be
+ * accomplished via poll_wait because the lifetime of the server
+ * socket might be less than that of its clients if these break their
+ * association with it or if the server socket is closed while clients
+ * are still connected to it and there's no way to inform "a polling
+ * implementation" that it should let go of a certain wait queue
+ *
+ * In order to propagate a wake up, a wait_queue_t of the client
+ * socket is enqueued on the peer_wait queue of the server socket
+ * whose wake function does a wake_up on the ordinary client socket
+ * wait queue. This connection is established whenever a write (or
+ * poll for write) hit the flow control condition and broken when the
+ * association to the server socket is dissolved or after a wake up
+ * was relayed.
+ */
+
+static int unix_dgram_peer_wake_relay(wait_queue_t *q, unsigned mode, int flags,
+                                      void *key)
+{
+   struct unix_sock *u;
+   wait_queue_head_t *u_sleep;
+
+   u = container_of(q, struct unix_sock, peer_wake);
+
+   __remove_wait_queue(&unix_sk(u->peer_wake.private)->peer_wait,
+   q);
+   u->peer_wake.private = NULL;
+
+   /* relaying can only happen while the wq still exists */
+   u_sleep = sk_sleep(&u->sk);
+   if (u_sleep)
+   wake_up_interruptible_poll(u_sleep, key);
+
+   return 0;
+}
+
+static int unix_dgram_peer_wake_connect(struct sock *sk, struct sock *other)
+{
+   struct unix_sock *u, *u_other;
+   int rc;
+
+   u = unix_sk(sk);
+   u_other = unix_sk(other);
+   rc = 0;
+
+   spin_lock(

Re: [PATCH] unix: avoid use-after-free in ep_remove_wait_queue

2015-11-13 Thread Rainer Weikusat
Hannes Frederic Sowa  writes:
> On Wed, Nov 11, 2015, at 17:12, Rainer Weikusat wrote:
>> Hannes Frederic Sowa  writes:
>> > On Tue, Nov 10, 2015, at 22:55, Rainer Weikusat wrote:
>> >> An AF_UNIX datagram socket being the client in an n:1 association with
>> >> some server socket is only allowed to send messages to the server if the
>> >> receive queue of this socket contains at most sk_max_ack_backlog
>> >> datagrams.
>> 
>> [...]
>> 
>> > This whole patch seems pretty complicated to me.
>> >
>> > Can't we just remove the unix_recvq_full checks altogether and unify
>> > unix_dgram_poll with unix_poll?
>> >
>> > If we want to be cautious we could simply make unix_max_dgram_qlen limit
>> > the number of skbs which are in flight from a sending socket. The skb
>> > destructor can then decrement this. This seems much simpler.
>> >
>> > Would this work?
>> 
>> In the way this is intended to work, cf
>> 
>> http://marc.info/?t=11562760602&r=1&w=2
>
> Oh, I see, we don't limit closed but still referenced sockets. This
> actually makes sense on how fd handling is implemented, just as a range
> check.
>
> Have you checked if we can somehow deregister the socket in the poll
> event framework? You wrote that it does not provide such a function but
> maybe it would be easy to add?

I thought about this but this would amount to adding a general interface
for the sole purpose of enabling the af_unix code to talk to the
eventpoll code and I don't really like this idea: IMHO, there should be
at least two users (preferably three) before creating any kind of
'abstract interface'. An even more ideal "castle in the air"
(hypothetical) solution would be "change the eventpoll.c code such that
it won't be affected if a wait queue just goes away". That's at least
theoretically possible (although it might not be in practice).

I wouldn't mind doing that (assuming it was possible) if it was just
for the kernels my employer uses because I'm aware of the uses these
will be put to and in control of the corresponding userland code. But
for "general Linux code", changing epoll in order to help the af_unix
code is more potential trouble than it's worth: Exchanging a relatively
unimportant bug in some module for a much more visibly damaging bug in a
central facility would be a bad tradeoff.


Re: [PATCH] bonding: Offloading bonds to hardware

2015-11-13 Thread David Miller
From: Florian Fainelli 
Date: Fri, 13 Nov 2015 10:38:48 -0800

> This is a good explanation of why you want the changes, and how this is
> implemented in a system utilizing that, but this is not documenting why
> you are making these changes to the bonding code, nor how they are
> supposed to be used by an implementor driver, since there is no such
> user posted (yet?).

I am basically not even going to look at proposals for new device ops
that don't also show an actual user of the new interfaces.

That's not how we do development, and not providing a real example
user of a new set of interfaces makes it impossible to properly review
such changes.


Re: [patch] netlink: fix a limit in NETLINK_LIST_MEMBERSHIPS

2015-11-13 Thread Dan Carpenter
Oh.  Crap...  My mistake.  Sorry for the noise.  The original code is
fine.

regards,
dan carpenter



[PATCH -stable] multiple backports requested

2015-11-13 Thread Charles (Chas) Williams
Dave, could you please add the following backports?

For the 3.14.y stable queue:
commit 77751427a1ff25b27d47a4c36b12c3c8667855ac
ipv6: addrconf: validate new MTU before applying it
This addresses CVE-2015-0272

commit 74e98eb085889b0d2d4908f59f6e00026063014f
RDS: verify the underlying transport exists before creating a connection
This addresses CVE-2015-6937

For the 4.1.y stable queue:

commit 74e98eb085889b0d2d4908f59f6e00026063014f
RDS: verify the underlying transport exists before creating a connection
This addresses CVE-2015-6937

Thanks!




Re: via-velocity skb_over_panic

2015-11-13 Thread David Miller
From: Timo Teras 
Date: Fri, 13 Nov 2015 18:49:47 +0200

> + if (pkt_len < 4 || pkt_len > vptr->rx.buf_sz) {
> + VELOCITY_PRT(MSG_LEVEL_VERBOSE, KERN_ERR " %s : the received frame size %d is inconsistent.\n", vptr->netdev->name, pkt_len);
 ...
> This seems to have fixed the panics. And I do see one of the NIC's
> ethtool report's in_range_length_errors increasing once in a while. For
> some reason I don't see the above debug message though, so I'm not sure
> on what pkt_len triggers it.

You have to set the driver message level >= MSG_LEVEL_VERBOSE (3) to
see it.
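
For readers following along, the bounds check in the quoted hunk can be sketched in isolation. This is a minimal userspace model and not the driver code: rx_len_ok and its parameters are hypothetical names, and the lower bound of 4 is simply taken from the quoted condition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring the quoted condition
 * "pkt_len < 4 || pkt_len > vptr->rx.buf_sz": a frame is accepted only
 * if its reported length lies between the lower bound and the size of
 * the receive buffer it was written into. */
static bool rx_len_ok(int pkt_len, size_t buf_sz)
{
	if (pkt_len < 4)
		return false;
	return (size_t)pkt_len <= buf_sz;
}
```

A frame that fails this test would be dropped and counted as a length error instead of being handed to skb_put(), which is where an oversized length would otherwise trigger the panic.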


Re: [PATCH -stable] multiple backports requested

2015-11-13 Thread David Miller
From: "Charles (Chas) Williams" <3ch...@gmail.com>
Date: Fri, 13 Nov 2015 15:13:11 -0500

> Dave, could you please add the following backports?
> 
> For the 3.14.y stable queue:

I am no longer handling 3.14.y -stable submissions, sorry...

> For the 4.1.y stable queue:
> 
>   commit 74e98eb085889b0d2d4908f59f6e00026063014f
>   RDS: verify the underlying transport exists before creating a connection
>   This addresses CVE-2015-6937

Queued up for 3.18, 4.1, and 4.2.


Re: [PATCH] bonding: Offloading bonds to hardware

2015-11-13 Thread Andrew Lunn
On Thu, Nov 12, 2015 at 04:02:18PM +, Premkumar Jonnala wrote:
> Packet forwarding to/from bond interfaces is done in software.
> 
> This patch enables certain platforms to bridge traffic to/from
> bond interfaces in hardware.  Notifications are sent out when 
> the "active" slave set for a bond interface is updated in 
> software.  Platforms use the notifications to program the 
> hardware accordingly.

Hi Premkumar

I can think of three use cases of bonding with hardware offload
engines:

1) External user ports of the switch are bonded together into a trunk.

2) Host Ethernet ports connected to the switch are bonded together so
you get double the bandwidth between the host and the switch.

3) In DSA setups where you have a cluster of switches, some switch
ports connect to other switch ports, so forming the cluster. You can
bond ports together to double the bandwidth between switches in the
cluster.

The requirements here are quite different in each case. 
Which of these use cases are you trying to address?

  Thanks
Andrew


Re: [PATCH stable <= 3.18] net: add length argument to skb_copy_and_csum_datagram_iovec

2015-11-13 Thread David Miller
From: Sabrina Dubroca 
Date: Thu, 12 Nov 2015 10:48:22 +0100

> 2015-11-10, 16:03:52 -0800, Greg Kroah-Hartman wrote:
>> On Tue, Nov 10, 2015 at 05:59:26PM -0600, Josh Hunt wrote:
>> > On Thu, Oct 29, 2015 at 5:00 AM, Sabrina Dubroca  
>> > wrote:
>> > > 2015-10-15, 14:25:03 +0200, Sabrina Dubroca wrote:
>> > >> Without this length argument, we can read past the end of the iovec in
>> > >> memcpy_toiovec because we have no way of knowing the total length of the
>> > >> iovec's buffers.
>> > >>
>> > >> This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb
>> > >> csum races when peeking") has been backported but that don't have the
>> > >> ioviter conversion, which is almost all the stable trees <= 3.18.
>> > >>
>> > >> This also fixes a kernel crash for NFS servers when the client uses
>> > >>  -onfsvers=3,proto=udp to mount the export.
>> > >>
>> > >> Signed-off-by: Sabrina Dubroca 
>> > >> Reviewed-by: Hannes Frederic Sowa 
>> > >
>> > > Fixes CVE-2015-8019.
>> > > http://www.openwall.com/lists/oss-security/2015/10/29/1
>> > >
>> > > --
>> > > Sabrina
>> > > --
>> > > To unsubscribe from this list: send the line "unsubscribe netdev" in
>> > > the body of a message to majord...@vger.kernel.org
>> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> > 
>> > Greg
>> > 
>> > Do you have this in your queue? I saw a few other stables pick this
>> > up, but haven't seen it in 3.14 or 3.18 yet. It wasn't clear to me if
>> > this had been fully reviewed yet.
>> 
>> I rely on Dave to package up networking stable patches and forward them
>> on to me, that's why you haven't seen it be picked up yet.
>> 
>> thanks,
>> 
>> greg k-h
> 
> David, can you queue this up?

This doesn't even apply to v3.18.24, the patched call site in
net/rxrpc/ar-recvmsg.c doesn't even exist.

Once you fix this up just submit it to -stable directly, I'm
fine with that for this.  I'm only handling submissions back
to v3.18 (4 releases) anyways.
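
As background for the quoted changelog ("we have no way of knowing the total length of the iovec's buffers"), the idea of passing an explicit total length can be sketched in userspace. This is only an illustration under stated assumptions: copy_to_iovec_bounded and iov_total are hypothetical names, and the code is not the kernel helper being patched.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Copy 'len' bytes from 'src' into the buffers described by 'iov'.
 * 'iov_total' is the caller-supplied sum of all iov_len fields and plays
 * the role of the added length argument: without it the loop cannot tell
 * where the iovec's buffers end. Returns the number of bytes copied, or
 * -1 if 'len' does not fit. */
static long copy_to_iovec_bounded(const struct iovec *iov, size_t iov_total,
				  const void *src, size_t len)
{
	const char *p = src;
	size_t copied = 0;

	if (len > iov_total)
		return -1;

	while (copied < len) {
		size_t n = len - copied;

		if (n > iov->iov_len)
			n = iov->iov_len;
		memcpy(iov->iov_base, p + copied, n);
		copied += n;
		iov++;
	}
	return (long)copied;
}
```

The up-front comparison is the whole point of the sketch: a request larger than the destination is refused before any byte is written.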

Thanks.


4.3.0+ breaks software VPN

2015-11-13 Thread Jens Axboe

Hi,

Tried to connect to sw vpn today, and it isn't working. Running git 
as-of yesterday. In dmesg:


[23703.921542] vpn0: set_features() failed (-1); wanted 
0x008048c1, left 0x0080001b48c9


Reverting:

fd867d51f889
5ba3f7d61a3a
e7868a85e1b2

in reverse order makes it work again. How do we get this fixed so that 
4.4-rc1 doesn't break basic VPN support?


--
Jens Axboe



Re: kernel panic in 4.2.3, rb_erase in sch_fq

2015-11-13 Thread Denys Fedoryshchenko
I can confirm, after patch this issue never appeared again. So maybe 
good to push it to stable and etc :) Thanks a lot Eric, you saved me 
again.



Still i have some weird panic issues, maybe related to conntrack, but 
they are rare even on high load, so i am slowly gathering data, and i 
found at least one more person with similar conntrack crashes on latest 
kernels.



On 2015-11-04 06:46, Eric Dumazet wrote:

On Wed, 2015-11-04 at 06:25 +0200, Denys Fedoryshchenko wrote:

On 2015-11-04 00:06, Cong Wang wrote:
> On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko
>  wrote:
>> Hi!
>>
>> Actually seems i was getting this panic for a while (once per week) on
>> loaded pppoe server, but just now was able to get full panic message.
>> After checking commit logs on sch_fq.c i didnt seen any fixes, so
>> probably
>> upgrading to newer kernel wont help?
>
>
> Can you share your `tc qdisc show dev ` with us? And how to
> reproduce
> it? I tried to setup htb+fq and then flip the interface back and forth
> but I don't
> see any crash.
My guess it wont be easy to reproduce, it is happening on box with 4.5k
interfaces, that constantly create/delete interfaces,
and even with that this problem may happen once per day, or may not
happen for 1 week.

Here is script that is being fired after new ppp interface detected. But
pppoe process are independent from
process that are "establishing" shapers.



It is probably a generic bug. sch_fq seems OK to me.

Somehow nobody tries to change qdisc hundred times per second ;)

Could you try following patch ?

It seems to 'fix' the issue for me.

diff --git a/net/core/dev.c b/net/core/dev.c
index 8ce3f74cd6b9..bf136103bc7b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2880,6 +2880,12 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
spin_lock(&q->busylock);

spin_lock(root_lock);
+   if (unlikely(q != rcu_dereference_bh(txq->qdisc))) {
+   pr_err_ratelimited("Arg, qdisc changed ! state %lx\n", q->state);
+   kfree_skb(skb);
+   rc = NET_XMIT_DROP;
+   goto end;
+   }
if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
kfree_skb(skb);
rc = NET_XMIT_DROP;
@@ -2913,6 +2919,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
__qdisc_run(q);
}
}
+end:
spin_unlock(root_lock);
if (unlikely(contended))
spin_unlock(&q->busylock);



Re: 4.3.0+ breaks software VPN

2015-11-13 Thread Dave Jones
On Fri, Nov 13, 2015 at 02:37:00PM -0700, Jens Axboe wrote:
 > Hi,
 > 
 > Tried to connect to sw vpn today, and it isn't working. Running git 
 > as-of yesterday. In dmesg:
 > 
 > [23703.921542] vpn0: set_features() failed (-1); wanted 
 > 0x008048c1, left 0x0080001b48c9
 > 
 > Reverting:
 > 
 > fd867d51f889
 > 5ba3f7d61a3a
 > e7868a85e1b2
 > 
 > in reverse order makes it work again. How do we get this fixed so that 
 > 4.4-rc1 doesn't break basic VPN support?

Possibly related:
I see those set_features warnings have started spewing in my 2-nic bonding
setup too.

[   51.595169] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[   51.595647] bond0: set_features() failed (-1); wanted 0x0fc0f388, 
left 0x0fd9fbe9
[   51.597168] bond0: Setting MII monitoring interval to 100
[   51.600782] bond0: Adding slave eth0
[   51.831790] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   51.832078] bond0: set_features() failed (-1); wanted 0x0fd9fba9, 
left 0x0fd9fbe9
[   51.832190] bond0: Enslaving eth0 as an active interface with a down link
[   51.836657] bond0: Adding slave eth1
[   52.039515] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[   52.039655] bond0: set_features() failed (-1); wanted 0x0fd9fba9, 
left 0x0fd9fbe9
[   52.039735] bond0: Enslaving eth1 as an active interface with a down link


Dave


Re: 4.3.0+ breaks software VPN

2015-11-13 Thread Nikolay Aleksandrov
On 11/13/2015 10:37 PM, Jens Axboe wrote:
> Hi,
> 
> Tried to connect to sw vpn today, and it isn't working. Running git as-of 
> yesterday. In dmesg:
> 
> [23703.921542] vpn0: set_features() failed (-1); wanted 0x008048c1, 
> left 0x0080001b48c9
> 
> Reverting:
> 
> fd867d51f889
> 5ba3f7d61a3a
> e7868a85e1b2
> 
> in reverse order makes it work again. How do we get this fixed so that 
> 4.4-rc1 doesn't break basic VPN support?
> 

Today I've posted a patch that should fix this,
http://patchwork.ozlabs.org/patch/544307/

More testing is always welcome.

Cheers,
 Nik



Re: 4.3.0+ breaks software VPN

2015-11-13 Thread Jens Axboe

On 11/13/2015 02:44 PM, Nikolay Aleksandrov wrote:

On 11/13/2015 10:37 PM, Jens Axboe wrote:

Hi,

Tried to connect to sw vpn today, and it isn't working. Running git as-of 
yesterday. In dmesg:

[23703.921542] vpn0: set_features() failed (-1); wanted 0x008048c1, 
left 0x0080001b48c9

Reverting:

fd867d51f889
5ba3f7d61a3a
e7868a85e1b2

in reverse order makes it work again. How do we get this fixed so that 4.4-rc1 
doesn't break basic VPN support?



Today I've posted a patch that should fix this,
http://patchwork.ozlabs.org/patch/544307/

More testing is always welcome.


Thanks, that works!

Tested-by: Jens Axboe 


--
Jens Axboe



Re: [PATCH] unix: avoid use-after-free in ep_remove_wait_queue

2015-11-13 Thread Jason Baron


On 11/13/2015 01:51 PM, Rainer Weikusat wrote:

[...]

>  
> - if (unix_peer(other) != sk && unix_recvq_full(other)) {
> - if (!timeo) {
> - err = -EAGAIN;
> - goto out_unlock;
> - }
> + if (unix_peer(sk) == other && !unix_dgram_peer_recv_ready(sk, other)) {

Remind me why the 'unix_peer(sk) == other' is added here? If the remote
is not connected we still want to make sure that we don't overflow the
the remote rcv queue, right?

As for the added 'double' lock on both sk and other, where
previously we just held the 'other' lock: I think we could continue to
just hold the 'other' lock unless the remote queue is full, so something
like:

	if (unix_peer(other) != sk && unix_recvq_full(other)) {
		bool need_wakeup = false;

		skipping the blocking case...

		err = -EAGAIN;
		if (!other_connected)
			goto out_unlock;
		unix_state_unlock(other);
		unix_state_lock(sk);

		/* if remote peer has changed under us, the connect()
		   will wake up any pending waiter, just return -EAGAIN */

		if (unix_peer(sk) == other) {
			/* In case we see there is space available
			   queue the wakeup and we will try again. This
			   should be an unlikely condition */
			if (!unix_dgram_peer_wake_me(sk, other))
				need_wakeup = true;
		}
		unix_state_unlock(sk);
		if (need_wakeup)
			wake_up_interruptible_poll(sk_sleep(sk), POLLOUT
					| POLLWRNORM | POLLWRBAND);
		goto out_free;
	}

So I'm not sure if the 'double' lock really affects any workload, but
the above might be a way to avoid it.

Also - it might be helpful to add a 'Fixes:' tag referencing where this
issue started, in the changelog.

Worth mentioning too is that this patch should improve the polling case
here dramatically, as we currently wake the entire queue on every remote
read even when we have room in the rcv buffer. So this patch will cut
down on ctxt switching rate dramatically from what we currently have.

Thanks,

-Jason


Re: [PATCH v2 net-next] net/core: ensure features get disabled on new lower devs

2015-11-13 Thread Laura Abbott

On 11/13/2015 02:51 AM, Nikolay Aleksandrov wrote:

On 11/13/2015 11:29 AM, Jiri Pirko wrote:

Fri, Nov 13, 2015 at 01:26:18AM CET, f.faine...@gmail.com wrote:

On 04/11/15 18:56, David Miller wrote:

Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")

  ...

Reported-by: Nikolay Aleksandrov 
Signed-off-by: Jarod Wilson 
---
v2: Based on suggestions from Alex, and with not changing err to ret, this
patch actually becomes quite minimal and doesn't ugly up the code much.


Applied, thanks.


This causes some warnings to be displayed for DSA stacked devices:

[1.272297] brcm-sf2 f0b0.ethernet_switch: Starfighter 2 top:
4.00, core: 2.00 base: 0xf0c8, IRQs: 68, 69
[1.283181] libphy: dsa slave smi: probed
[1.344088] f0b403c0.mdio:05: Broadcom BCM7445 PHY revision: 0xd0,
patch: 3
[1.658917] brcm-sf2 f0b0.ethernet_switch gphy (uninitialized):
attached PHY at address 5 [Broadcom BCM7445]
[1.669414] brcm-sf2 f0b0.ethernet_switch gphy: set_features()
failed (-1); wanted 0x4020, left 0x4820
[1.734202] brcm-sf2 f0b0.ethernet_switch rgmii_1
(uninitialized): attached PHY at address 0 [Generic PHY]
[1.744486] brcm-sf2 f0b0.ethernet_switch rgmii_1: set_features()
failed (-1); wanted 0x4020, left 0x4820
[1.809091] brcm-sf2 f0b0.ethernet_switch rgmii_2
(uninitialized): attached PHY at address 1 [Generic PHY]
[1.819364] brcm-sf2 f0b0.ethernet_switch rgmii_2: set_features()
failed (-1); wanted 0x4020, left 0x4820
[1.884090] brcm-sf2 f0b0.ethernet_switch moca (uninitialized):
attached PHY at address 2 [Generic PHY]
[1.894109] brcm-sf2 f0b0.ethernet_switch moca: set_features()
failed (-1); wanted 0x4020, left 0x4820

DSA slave network devices are not associated with their master network
device using the typical lower/upper netdev helpers.

I do not have a good fix to come up with yet, but if you see something
obvious with net/dsa/slave.c, feel free to send patches for testing, I
can boot net-next on this platform.


I'm having similar issues with bridge, with linus's git now:


[snip]

Hmm, I think it's because the bridge and dsa/slave don't have ndo_set_features()
so err is left as -1 and thus an error is reported which isn't actually true.
Before in this case the features would just get set, so could you please try
the following patch ?


diff --git a/net/core/dev.c b/net/core/dev.c
index ab9b8d0d115e..4a1d198dbbff 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6426,6 +6426,8 @@ int __netdev_update_features(struct net_device *dev)

if (dev->netdev_ops->ndo_set_features)
err = dev->netdev_ops->ndo_set_features(dev, features);
+   else
+   err = 0;

if (unlikely(err < 0)) {
netdev_err(dev,



The patch seems to be working for at least one person who reported the
problem in Fedora rawhide https://bugzilla.redhat.com/show_bug.cgi?id=1281674
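
The failure mode Nikolay describes (err staying at -1 when a driver has no ndo_set_features) can be reproduced in a few lines of plain C. This is a hedged userspace model: struct dev_ops, update_features and the 'fixed' switch are hypothetical names invented for illustration, not kernel interfaces.

```c
#include <stddef.h>

struct dev_ops {
	/* Optional hook; NULL models a driver without ndo_set_features. */
	int (*set_features)(unsigned long features);
};

/* Returns 0 on success, negative on failure. With 'fixed' == 0 the
 * pessimistic initial value survives when no hook exists, which models
 * the spurious "set_features() failed (-1)" report. With 'fixed' != 0
 * the missing hook is treated as success, as in the proposed patch. */
static int update_features(const struct dev_ops *ops, unsigned long features,
			   int fixed)
{
	int err = -1;

	if (ops->set_features)
		err = ops->set_features(features);
	else if (fixed)
		err = 0;

	return err;
}

/* Example hook that accepts any feature set. */
static int accept_all(unsigned long features)
{
	(void)features;
	return 0;
}
```

Drivers that do provide the hook are unaffected either way; only the no-hook path changes from a reported error to silent success.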

Thanks,
Laura



Re: via-velocity skb_over_panic

2015-11-13 Thread Francois Romieu
Timo Teras  :
[...]
> So it's possible that some bad packets make the NIC return unexpected
> packet sizes, and the current code can panic on it.
> 
> Any suggestions for better fix?

if (vptr->flags & VELOCITY_FLAGS_VAL_PKT_LEN) {
if (rd->rdesc0.RSR & RSR_RL) {

Huh ?

[...]
velocity_set_bool_opt(&opts->flags, ValPktLen[index], VAL_PKT_LEN_DEF, 
VELOCITY_FLAGS_VAL_PKT_LEN, "ValPktLen", devname);
[...]
#define VAL_PKT_LEN_DEF 0
/* ValPktLen[] is used for setting the checksum offload ability of NIC.
   0: Receive frame with invalid layer 2 length (Default)
   1: Drop frame with invalid layer 2 length
*/
VELOCITY_PARAM(ValPktLen, "Receiving or Drop invalid 802.3 frame");

*spleen*

RSR_RL should be set on packet length error. You can imnsvho remove the
VELOCITY_FLAGS_VAL_PKT_LEN and VAL_PKT_LEN_DEF stuff altogether.
He who cares about this option should add NETIF_F_RXALL support to the
via-velocity driver.

-- 
Ueimor
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next] Driver: Vmxnet3: Fix use of mfTableLen for big endian architectures

2015-11-13 Thread Shrikrishna Khare
Signed-off-by: Shrikrishna Khare 
Reported-by: Masao Uebayashi 
Signed-off-by: Bhavesh Davda 
---
 drivers/net/vmxnet3/vmxnet3_drv.c | 7 ++++---
 drivers/net/vmxnet3/vmxnet3_int.h | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 46f4cad..899ea42 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -2157,12 +2157,13 @@ vmxnet3_set_mc(struct net_device *netdev)
if (!netdev_mc_empty(netdev)) {
new_table = vmxnet3_copy_mc(netdev);
if (new_table) {
-   rxConf->mfTableLen = cpu_to_le16(
-   netdev_mc_count(netdev) * ETH_ALEN);
+   size_t sz = netdev_mc_count(netdev) * ETH_ALEN;
+
+   rxConf->mfTableLen = cpu_to_le16(sz);
new_table_pa = dma_map_single(
&adapter->pdev->dev,
new_table,
-   rxConf->mfTableLen,
+   sz,
PCI_DMA_TODEVICE);
}
 
diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h
index 3f859a5..4c58c83 100644
--- a/drivers/net/vmxnet3/vmxnet3_int.h
+++ b/drivers/net/vmxnet3/vmxnet3_int.h
@@ -69,10 +69,10 @@
 /*
  * Version numbers
  */
-#define VMXNET3_DRIVER_VERSION_STRING   "1.4.3.0-k"
+#define VMXNET3_DRIVER_VERSION_STRING   "1.4.4.0-k"
 
 /* a 32-bit int, each byte encode a verion number in VMXNET3_DRIVER_VERSION */
-#define VMXNET3_DRIVER_VERSION_NUM  0x01040300
+#define VMXNET3_DRIVER_VERSION_NUM  0x01040400
 
 #if defined(CONFIG_PCI_MSI)
/* RSS only makes sense if MSI-X is supported. */
-- 
1.8.5.6
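
Reading the diff, the old code passed the already converted rxConf->mfTableLen back to dma_map_single() as a length. The sketch below shows why that only misbehaves on big-endian hosts. It is a userspace model: swab16 and cpu_to_le16_on_be are stand-ins written for this example, not the kernel's byte-order helpers.

```c
#include <stdint.h>

/* Byte-swap a 16-bit value. */
static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Model of cpu_to_le16() as seen by a big-endian CPU: the value has to
 * be byte-swapped so that the device reads it as little-endian. On a
 * little-endian CPU the real helper leaves the value unchanged, which is
 * why the problem stays hidden there. */
static uint16_t cpu_to_le16_on_be(uint16_t v)
{
	return swab16(v);
}
```

With two multicast addresses the table is 2 * ETH_ALEN = 12 bytes; after conversion on a big-endian host the stored value reads back as 0x0c00 (3072) when used as a native integer, so it is unsuitable as a mapping length. Keeping the native size in a separate variable, as the patch does with sz, avoids the mix-up.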



Re: [PATCH net] ravb: Fix int mask value overwritten issue

2015-11-13 Thread Simon Horman
On Fri, Nov 13, 2015 at 11:45:28AM +0100, Geert Uytterhoeven wrote:
> Hi Kaneko-san,
> 
> On Fri, Nov 13, 2015 at 11:24 AM, Yoshihiro Kaneko
>  wrote:
> > From: Masaru Nagai 
> >
> > When RX/TX interrupt for Network Control queue and Best Effort queue
> > is issued at the same time, the interrupt mask of Network Control
> > queue will be reset when the mask of Best Effort queue is set.
> 
> Nice catch!
> 
> At first I was a bit puzzled why this would make a difference, but
> the key is "will be reset in the next iteration of the for loop", which
> falls outside of the visible context.
> 
> > This patch fixes this problem.
> >
> > Signed-off-by: Masaru Nagai 
> > Signed-off-by: Yoshihiro Kaneko 
> 
> Acked-by: Geert Uytterhoeven 

Acked-by: Simon Horman 


[PATCH net] bpf, arm: start flushing icache range from header

2015-11-13 Thread Daniel Borkmann
During review I noticed that the icache range we're flushing should
start at header already and not at ctx.image.

Reason is that after 55309dd3d4cd ("net: bpf: arm: address randomize
and write protect JIT code"), we also want to make sure to flush the
random-sized trap in front of the start of the actual program (analogous
to x86). No operational differences from user side.

Signed-off-by: Daniel Borkmann 
Tested-by: Nicolas Schichan 
Cc: Alexei Starovoitov 
---
 ( As arm32 fixes usually go via Dave's tree, targeting -net. )

 arch/arm/net/bpf_jit_32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 2f4b14c..591f9db 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1061,7 +1061,7 @@ void bpf_jit_compile(struct bpf_prog *fp)
}
build_epilogue(&ctx);
 
-   flush_icache_range((u32)ctx.target, (u32)(ctx.target + ctx.idx));
+   flush_icache_range((u32)header, (u32)(ctx.target + ctx.idx));
 
 #if __LINUX_ARM_ARCH__ < 7
if (ctx.imm_count)
-- 
1.9.3
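
The effect of the one-line change can be pictured with a small address calculation. This is a sketch under assumptions: struct flush_range and jit_flush_range are hypothetical names, and the code assumes (as the diff suggests) that the target pointer addresses 32-bit instructions and idx counts them.

```c
#include <stddef.h>
#include <stdint.h>

struct flush_range {
	uintptr_t start;
	uintptr_t end;
};

/* 'header' is the start of the allocation, 'target' the start of the
 * generated instructions somewhere after it, and 'idx' the number of
 * 32-bit instructions emitted. Starting the range at 'header' also
 * covers the variable-sized gap in front of the program. */
static struct flush_range jit_flush_range(const void *header,
					  const uint32_t *target, size_t idx)
{
	struct flush_range r;

	r.start = (uintptr_t)header;
	r.end = (uintptr_t)(target + idx);
	return r;
}
```

If the program starts 5 words into the allocation and is 10 instructions long, the range spans 15 words instead of 10, so the trap fill in front of the program is flushed as well.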



Re: [PATCH V3 2/2] arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS

2015-11-13 Thread Z Lim
Yang, I noticed another thing...

On Fri, Nov 13, 2015 at 10:09 AM, Yang Shi  wrote:
> Save and restore FP/LR in BPF prog prologue and epilogue, save SP to FP
> in prologue in order to get the correct stack backtrace.
>
> However, the ARM64 JIT used FP (x29) as the eBPF fp register. FP is subject
> to change during function calls, so the BPF prog stack base address may
> change too.
>
> Use x25 to replace FP as the BPF stack base register (fp). Since x25 is a
> callee-saved register, it stays intact across function calls.

Can you please add save/restore for x25 also? :)

> It is initialized in the BPF prog prologue each time the BPF prog starts
> running. When the BPF prog exits, it can simply be discarded.
>
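
The save/restore request above follows from x25 being callee-saved: a JITed program is itself a callee, so it has to hand x25 back unchanged. A toy model of that contract, with a made-up register file and stack (struct cpu, push(), pop() and the prologue/epilogue helpers are illustrative names, not the arm64 JIT's emit code):

```c
#include <assert.h>
#include <stdint.h>

#define NREGS       32
#define STACK_WORDS 64

/* Toy machine: a register file plus a downward-growing stack of words. */
struct cpu {
    uint64_t x[NREGS];
    uint64_t stack[STACK_WORDS];
    int sp;                     /* index of the lowest used stack word */
};

static void cpu_init(struct cpu *c)
{
    for (int i = 0; i < NREGS; i++)
        c->x[i] = 0;
    c->sp = STACK_WORDS;        /* empty stack */
}

static void push(struct cpu *c, uint64_t v)
{
    assert(c->sp > 0);
    c->stack[--c->sp] = v;
}

static uint64_t pop(struct cpu *c)
{
    assert(c->sp < STACK_WORDS);
    return c->stack[c->sp++];
}

/* Prologue: save the caller's x25, then reuse x25 as the stack base. */
static void prog_prologue(struct cpu *c)
{
    push(c, c->x[25]);
    c->x[25] = (uint64_t)c->sp;
}

/* Epilogue: restore the caller's x25 before returning. */
static void prog_epilogue(struct cpu *c)
{
    c->x[25] = pop(c);
}
```

Without the push/pop pair, whatever the caller kept in x25 would be lost once the program overwrites it with its own stack base.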


Test. Please ignore

2015-11-13 Thread team
Test message. Please ignore.


Re: eth0: set_features() failed (-1); wanted 0x0000000000004000, left 0x0000000000004800

2015-11-13 Thread Arthur Marsh



Arthur Marsh wrote on 14/11/15 13:46:

Hi, I'm not sure if this is an actual error or just an informational
message, but on this PC (with a single-core AMD Athlon(tm) 64 Processor
3200+) I've been getting the following:


Nov 13 18:16:12 localhost kernel: [0.938025] via-rhine :00:12.0
eth0: set_features() failed (-1); wanted 0x0000000000004000, left
0x0000000000004800
Nov 13 18:16:12 localhost kernel: [0.938574] via-rhine :00:12.0
eth0: VIA Rhine II at 0x1e000, 00:13:d4:cc:9b:57, IRQ 23
Nov 13 18:16:12 localhost kernel: [0.939418] via-rhine :00:12.0
eth0: MII PHY found at address 1, status 0x786d advertising 01e1 Link 45e1

The Ethernet card still works nonetheless.
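
To see which feature bit the message is complaining about, the two masks from the subject line can be XORed. A minimal sketch (the helper names are made up here; mapping a bit number to a feature name depends on the kernel version's netdev_features.h and is deliberately not attempted):

```c
#include <assert.h>
#include <stdint.h>

/* Bits that differ between the requested and the resulting feature mask. */
static uint64_t features_diff(uint64_t wanted, uint64_t left)
{
    return wanted ^ left;
}

/* Index of the lowest set bit; the mask must be non-zero. */
static int lowest_bit(uint64_t mask)
{
    int bit = 0;

    assert(mask != 0);
    while (!(mask & 1)) {
        mask >>= 1;
        bit++;
    }
    return bit;
}

/* Number of set bits in the mask. */
static int count_bits(uint64_t mask)
{
    int n = 0;

    for (; mask; mask &= mask - 1)
        n++;
    return n;
}
```

For wanted 0x4000 and left 0x4800 the difference is 0x800, a single bit at index 11, so exactly one feature was left enabled beyond what was requested.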

git-bisect showed that the:

eth0: set_features() failed (-1); wanted 0x0000000000004000, left
0x0000000000004800

messages started with the following commit:

  git bisect good
e7868a85e1b26bcb2e71088841eec1d310a97ac9 is the first bad commit
commit e7868a85e1b26bcb2e71088841eec1d310a97ac9
Author: Jarod Wilson 
Date:   Tue Nov 3 23:09:32 2015 -0500

 net/core: ensure features get disabled on new lower devs

 With moving netdev_sync_lower_features() after the .ndo_set_features
 calls, I neglected to verify that devices added *after* a flag had been
 disabled on an upper device were properly added with that flag disabled as
 well. This currently happens, because we exit __netdev_update_features()
 when we see dev->features == features for the upper dev. We can retain the
 optimization of leaving without calling .ndo_set_features with a bit of
 tweaking and a goto here.


Nikolay Aleksandrov's patch:

http://patchwork.ozlabs.org/patch/544307/

fixed the issue for me, thanks.

Arthur.


Re: OVS VXLAN decap rule has full match on TTL for the outer headers?

2015-11-13 Thread Joe Stringer
On 13 November 2015 at 06:46, Or Gerlitz wrote:
> On Fri, Nov 13, 2015 at 10:14 AM, Joe Stringer wrote:
>
>> I don't follow the logic. You observed one flow which matched on
>> TTL=64, therefore all vxlan packets terminated at OVS have TTL=64?
>
>> If OVS received packets with different TTLs, they would miss and
>> ovs-vswitchd would generate flows to match that traffic too.
>
> ok, that makes things a bit better, but (see next)
>
>> If that becomes an issue, presumably the wildcard generation can be improved.
>
> is there a deep reason for vxlan "learned flows" to actually match, with
> or without wildcards, on TTLs? for non-tunneled flows I don't see this
> happening.

No deep reason I'm aware of.
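
For illustration, the difference between an exact TTL match and a wildcarded one comes down to the mask applied per field. A reduced sketch, not OVS code (struct tun_key, struct tun_flow and both helpers are hypothetical, and a real flow key carries many more fields):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced tunnel flow key. */
struct tun_key {
    uint32_t vni;
    uint8_t  ttl;
};

struct tun_flow {
    struct tun_key key;
    struct tun_key mask;    /* 1 bits must match, 0 bits are wildcarded */
};

/* A packet matches when every masked field equals the masked key. */
static int flow_matches(const struct tun_flow *f, const struct tun_key *pkt)
{
    return ((pkt->vni & f->mask.vni) == (f->key.vni & f->mask.vni)) &&
           ((pkt->ttl & f->mask.ttl) == (f->key.ttl & f->mask.ttl));
}

/* Returns the index of the first matching flow, or -1 on a miss.
 * In this model a miss stands for the case where a new flow would
 * have to be generated for the packet. */
static int table_lookup(const struct tun_flow *tbl, int n,
                        const struct tun_key *pkt)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (flow_matches(&tbl[i], pkt))
            return i;
    return -1;
}
```

With a TTL mask of 0xff, a flow installed for TTL=64 misses a packet with TTL=63 and a second flow is needed; with a TTL mask of 0x00, one flow covers both.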

>> I agree that this UNSPEC issue on v2.3 could do with a bit of a closer
>> look. I'll see if I can find some time for it. Alternatively if you're
>> willing and have bandwidth, I'd be curious if it's related to the
>> masked set field feature introduced in Linux-4.0.
>
> so what would you suggest here? run with 3.19 or earlier?

On second thought, I think if it were related to that then userspace
v2.4 would not exhibit the problem. I'd have to dig in to see why it
occurs.