from:"Lorenzo Bianconi"

[PATCH net] net: thunderx: fix NULL pointer dereference in nic_remove

2018-11-26 Thread Lorenzo Bianconi

Fix a possible NULL pointer dereference in nic_remove routine
removing the nicpf module if nic_probe fails.
The issue can be triggered with the following reproducer:

$rmmod nicvf
$rmmod nicpf

[  521.412008] Unable to handle kernel access to user memory outside uaccess 
routines at virtual address 0014
[  521.422777] Mem abort info:
[  521.425561]   ESR = 0x9604
[  521.428624]   Exception class = DABT (current EL), IL = 32 bits
[  521.434535]   SET = 0, FnV = 0
[  521.437579]   EA = 0, S1PTW = 0
[  521.440730] Data abort info:
[  521.443603]   ISV = 0, ISS = 0x0004
[  521.447431]   CM = 0, WnR = 0
[  521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 72a3da42
[  521.457022] [0014] pgd=
[  521.461916] Internal error: Oops: 9604 [#1] SMP
[  521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[  521.518664] pstate: 8045 (Nzcv daif +PAN -UAO)
[  521.523451] pc : nic_remove+0x24/0x88 [nicpf]
[  521.527808] lr : pci_device_remove+0x48/0xd8
[  521.532066] sp : 13433cc0
[  521.535370] x29: 13433cc0 x28: 810f6ac5
[  521.540672] x27:  x26: 
[  521.545974] x25: 5600 x24: 0015
[  521.551274] x23: 8007ff89a110 x22: 01667070
[  521.556576] x21: 8007ffb170b0 x20: 8007ffb17000
[  521.561877] x19:  x18: 0025
[  521.567178] x17:  x16: 010ffc33ff98 x8 : 

[  521.593683] x7 :  x6 : 0001
[  521.598983] x5 : 0002 x4 : 0003
[  521.604284] x3 : 8007ffb17184 x2 : 8007ffb17184
[  521.609585] x1 : 01662118 x0 : 08557be0
[  521.614887] Process rmmod (pid: 1897, stack limit = 0x859535c3)
[  521.621490] Call trace:
[  521.623928]  nic_remove+0x24/0x88 [nicpf]
[  521.627927]  pci_device_remove+0x48/0xd8
[  521.631847]  device_release_driver_internal+0x1b0/0x248
[  521.637062]  driver_detach+0x50/0xc0
[  521.640628]  bus_remove_driver+0x60/0x100
[  521.644627]  driver_unregister+0x34/0x60
[  521.648538]  pci_unregister_driver+0x24/0xd8
[  521.652798]  nic_cleanup_module+0x14/0x111c [nicpf]
[  521.657672]  __arm64_sys_delete_module+0x150/0x218
[  521.662460]  el0_svc_handler+0x94/0x110
[  521.666287]  el0_svc+0x8/0xc
[  521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660)

Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network 
controller")
Signed-off-by: Lorenzo Bianconi 
---
 drivers/net/ethernet/cavium/thunder/nic_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c 
b/drivers/net/ethernet/cavium/thunder/nic_main.c
index 55af04fa03a7..6c8dcb65ff03 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -1441,6 +1441,9 @@ static void nic_remove(struct pci_dev *pdev)
 {
struct nicpf *nic = pci_get_drvdata(pdev);
 
+   if (!nic)
+   return;
+
if (nic->flags & NIC_SRIOV_ENABLED)
pci_disable_sriov(pdev);
 
-- 
2.19.1

Re: [PATCH net] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue

2018-11-24 Thread Lorenzo Bianconi

> From: Lorenzo Bianconi 
> Date: Fri, 23 Nov 2018 18:28:01 +0100
>
> > Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine
> > since it is used to check if tso dma descriptor queue has been previously
> > allocated. The issue can be triggered with the following reproducer:
> >
> > $ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o
> > $ip link set dev enP2p1s0v0 xdpdrv off
>  ...
> > where xdp_dummy.c is a simple bpf program that forwards the incoming
> > frames to the network stack (available here:
> > https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c)
> >
> > Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
> > Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network
> > controller")
> >
> > Signed-off-by: Lorenzo Bianconi 
>
> Applied and queued up for -stable, but please in the future:
>
> 1) Do not break up long "Fixes: " tag lines, it must be keep as a single
>uninterrupted line for grep'ability etc.
>
> 2) Do not put an empty line between "Fixes: " and other tags.  All tags
>are equal, and appear in a straight uninterrupted sequence of lines.
>

Hi David,

ack, will do next time.

Regards,
Lorenzo

> Thank you.

[PATCH net] net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue

2018-11-23 Thread Lorenzo Bianconi

Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine
since it is used to check if tso dma descriptor queue has been previously
allocated. The issue can be triggered with the following reproducer:

$ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o
$ip link set dev enP2p1s0v0 xdpdrv off

[  341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 
__vunmap+0x98/0xe0
[  341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
[  341.521874] pstate: 6045 (nZCv daif +PAN -UAO)
[  341.526654] pc : __vunmap+0x98/0xe0
[  341.530132] lr : __vunmap+0x98/0xe0
[  341.533609] sp : 1c5db860
[  341.536913] x29: 1c5db860 x28: 0002
[  341.542214] x27: 810feb5090b0 x26: 17e57000
[  341.547515] x25:  x24: fbd0
[  341.552816] x23:  x22: 810feb5090b0
[  341.558117] x21:  x20: 
[  341.563418] x19: 17e57000 x18: 
[  341.568719] x17:  x16: 
[  341.574020] x15: 0010 x14: 
[  341.579321] x13: 8985eb27 x12: 0985eb2f
[  341.584622] x11: 096b3000 x10: 1c5db510
[  341.589923] x9 : ffd0 x8 : 086868e8
[  341.595224] x7 : 3430303030303030 x6 : 06ef
[  341.600525] x5 : 003f x4 : 
[  341.605825] x3 :  x2 : 
[  341.611126] x1 : 096b3728 x0 : 0038
[  341.616428] Call trace:
[  341.618866]  __vunmap+0x98/0xe0
[  341.621997]  vunmap+0x3c/0x50
[  341.624961]  arch_dma_free+0x68/0xa0
[  341.628534]  dma_direct_free+0x50/0x80
[  341.632285]  nicvf_free_resources+0x160/0x2d8 [nicvf]
[  341.637327]  nicvf_config_data_transfer+0x174/0x5e8 [nicvf]
[  341.642890]  nicvf_stop+0x298/0x340 [nicvf]
[  341.647066]  __dev_close_many+0x9c/0x108
[  341.650977]  dev_close_many+0xa4/0x158
[  341.654720]  rollback_registered_many+0x140/0x530
[  341.659414]  rollback_registered+0x54/0x80
[  341.663499]  unregister_netdevice_queue+0x9c/0xe8
[  341.668192]  unregister_netdev+0x28/0x38
[  341.672106]  nicvf_remove+0xa4/0xa8 [nicvf]
[  341.676280]  nicvf_shutdown+0x20/0x30 [nicvf]
[  341.680630]  pci_device_shutdown+0x44/0x88
[  341.684720]  device_shutdown+0x144/0x250
[  341.688640]  kernel_restart_prepare+0x44/0x50
[  341.692986]  kernel_restart+0x20/0x68
[  341.696638]  __se_sys_reboot+0x210/0x238
[  341.700550]  __arm64_sys_reboot+0x24/0x30
[  341.704555]  el0_svc_handler+0x94/0x110
[  341.708382]  el0_svc+0x8/0xc
[  341.711252] ---[ end trace 3f4019c8439959c9 ]---
[  341.715874] page:7e0003ef4000 count:0 mapcount:0 
mapping: index:0x4
[  341.723872] flags: 0x1fffe0()
[  341.727527] raw: 001fffe0 7e0003f1a008 7e0003ef4048 

[  341.735263] raw: 0004   

[  341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)

where xdp_dummy.c is a simple bpf program that forwards the incoming
frames to the network stack (available here:
https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c)

Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Fixes: 4863dea3fab0 ("net: Adding support for Cavium ThunderX network
controller")

Signed-off-by: Lorenzo Bianconi 
---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c 
b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 187a249ff2d1..fcaf18fa3904 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -585,10 +585,12 @@ static void nicvf_free_snd_queue(struct nicvf *nic, 
struct snd_queue *sq)
if (!sq->dmem.base)
return;
 
-   if (sq->tso_hdrs)
+   if (sq->tso_hdrs) {
dma_free_coherent(>pdev->dev,
  sq->dmem.q_len * TSO_HEADER_SIZE,
  sq->tso_hdrs, sq->tso_hdrs_phys);
+   sq->tso_hdrs = NULL;
+   }
 
/* Free pending skbs in the queue */
smp_rmb();
-- 
2.19.1

Re: net: thunderx: nicvf_xdp_setup error code path

2018-11-22 Thread Lorenzo Bianconi

> 
> Sunil did review this, so please resubmit formally.

Hi David,

I have already posted a formal patch:
https://patchwork.ozlabs.org/patch/1001213/

Reviewing the mail I have not explicitly added you in cc, sorry

Regards,
Lorenzo

[PATCH net] net: thunderx: set xdp_prog to NULL if bpf_prog_add fails

2018-11-21 Thread Lorenzo Bianconi

Set xdp_prog pointer to NULL if bpf_prog_add fails since that routine
reports the error code instead of NULL in case of failure and xdp_prog
pointer value is used in the driver to verify if XDP is currently
enabled.
Moreover report the error code to userspace if nicvf_xdp_setup fails

Fixes: 05c773f52b96 ("net: thunderx: Add basic XDP support")
Signed-off-by: Lorenzo Bianconi 
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c 
b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 768f584f8392..88f8a8fa93cd 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1784,6 +1784,7 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct 
bpf_prog *prog)
bool if_up = netif_running(nic->netdev);
struct bpf_prog *old_prog;
bool bpf_attached = false;
+   int ret = 0;
 
/* For now just support only the usual MTU sized frames */
if (prog && (dev->mtu > 1500)) {
@@ -1817,8 +1818,12 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct 
bpf_prog *prog)
if (nic->xdp_prog) {
/* Attach BPF program */
nic->xdp_prog = bpf_prog_add(nic->xdp_prog, nic->rx_queues - 1);
-   if (!IS_ERR(nic->xdp_prog))
+   if (!IS_ERR(nic->xdp_prog)) {
bpf_attached = true;
+   } else {
+   ret = PTR_ERR(nic->xdp_prog);
+   nic->xdp_prog = NULL;
+   }
}
 
/* Calculate Tx queues needed for XDP and network stack */
@@ -1830,7 +1835,7 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct 
bpf_prog *prog)
netif_trans_update(nic->netdev);
}
 
-   return 0;
+   return ret;
 }
 
 static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
-- 
2.19.1

net: thunderx: nicvf_xdp_setup error code path

2018-11-20 Thread Lorenzo Bianconi

Hi all,

looking at thunderx XDP support I noticed that nic->xdp_prog pointer in
nicvf_xdp_setup is not actually set to NULL if bpf_prog_add fails but it
is initialized with bpf_prog_add error code. xdp_prog pointer value is used in
the driver to verify if XDP is currently enabled.
Moreover nicvf_xdp_setup does not report to the userspace any error code in
case of failure.
I wrote the following patch to fix the reported issues. Please note I just
compiled it, not actually tested since I have no thunderx nic at the moment.

@Sunil: could you please give it a whirl? If it is ok I will post a formal
patch, thanks

Regards,
Lorenzo

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c 
b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 768f584f8392..88f8a8fa93cd 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -1784,6 +1784,7 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct 
bpf_prog *prog)
bool if_up = netif_running(nic->netdev);
struct bpf_prog *old_prog;
bool bpf_attached = false;
+   int ret = 0;
 
/* For now just support only the usual MTU sized frames */
if (prog && (dev->mtu > 1500)) {
@@ -1817,8 +1818,12 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct 
bpf_prog *prog)
if (nic->xdp_prog) {
/* Attach BPF program */
nic->xdp_prog = bpf_prog_add(nic->xdp_prog, nic->rx_queues - 1);
-   if (!IS_ERR(nic->xdp_prog))
+   if (!IS_ERR(nic->xdp_prog)) {
bpf_attached = true;
+   } else {
+   ret = PTR_ERR(nic->xdp_prog);
+   nic->xdp_prog = NULL;
+   }
}
 
/* Calculate Tx queues needed for XDP and network stack */
@@ -1830,7 +1835,7 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct 
bpf_prog *prog)
netif_trans_update(nic->netdev);
}
 
-   return 0;
+   return ret;
 }
 
 static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp)

Re: [PATCH iproute 2/2] utils: fix get_rtnl_link_stats_rta stats parsing

2018-10-11 Thread Lorenzo Bianconi

> > iproute2 walks through the list of available tunnels using netlink
> > protocol in order to get device info instead of reading
> > them from proc filesystem. However the kernel reports device statistics
> > using IFLA_INET6_STATS/IFLA_INET6_ICMP6STATS attributes nested in
> > IFLA_PROTINFO one but iproutes expects these info in
> > IFLA_STATS64/IFLA_STATS attributes.
> > The issue can be triggered with the following reproducer:
> > 
> > $ip link add ip6d0 type ip6tnl mode ip6ip6 local ::1 remote ::1
> > $ip -6 -d -s tunnel show ip6d0
> > ip6d0: ipv6/ipv6 remote ::1 local ::1 encaplimit 4 hoplimit 64
> > tclass 0x00 flowlabel 0x0 (flowinfo 0x)
> > Dump terminated
> > 
> > Fix the issue introducing IFLA_INET6_STATS attribute parsing
> > 
> > Fixes: 3e953938717f ("iptunnel/ip6tunnel: Use netlink to walk through
> > tunnels list")
> > 
> > Signed-off-by: Lorenzo Bianconi 
> 
> Can't we fix the kernel to report statistics properly, rather than
> starting iproute2 doing more /proc interfaces.
> 

Hi Stephen,

sorry, I did not get what you mean. Current iproute implementation
walks through tunnels list using netlink protocol and parses device
statistics in the kernel netlink message. However it does not take
into account the actual netlink message layout since the statistic
attribute is nested in IFLA_PROTINFO one.
Moreover AFAIU the related kernel code has not changed since iproute
commit 3e953938717f, so I guess we should fix the issue in iproute code
instead in the kernel one. Do you agree?

Regards,
Lorenzo

[PATCH iproute 2/2] utils: fix get_rtnl_link_stats_rta stats parsing

2018-10-10 Thread Lorenzo Bianconi

iproute2 walks through the list of available tunnels using netlink
protocol in order to get device info instead of reading
them from proc filesystem. However the kernel reports device statistics
using IFLA_INET6_STATS/IFLA_INET6_ICMP6STATS attributes nested in
IFLA_PROTINFO one but iproutes expects these info in
IFLA_STATS64/IFLA_STATS attributes.
The issue can be triggered with the following reproducer:

$ip link add ip6d0 type ip6tnl mode ip6ip6 local ::1 remote ::1
$ip -6 -d -s tunnel show ip6d0
ip6d0: ipv6/ipv6 remote ::1 local ::1 encaplimit 4 hoplimit 64
tclass 0x00 flowlabel 0x0 (flowinfo 0x)
Dump terminated

Fix the issue introducing IFLA_INET6_STATS attribute parsing

Fixes: 3e953938717f ("iptunnel/ip6tunnel: Use netlink to walk through
tunnels list")

Signed-off-by: Lorenzo Bianconi 
---
 lib/utils.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/lib/utils.c b/lib/utils.c
index e87ecf31..7be2d6be 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1549,6 +1550,24 @@ static void copy_rtnl_link_stats64(struct 
rtnl_link_stats64 *stats64,
*a++ = *b++;
 }
 
+#define IPSTATS_MIB_MAX_LEN(__IPSTATS_MIB_MAX * sizeof(__u64))
+static void get_snmp_counters(struct rtnl_link_stats64 *stats64,
+ struct rtattr *s)
+{
+   __u64 *mib = (__u64 *)RTA_DATA(s);
+
+   memset(stats64, 0, sizeof(*stats64));
+
+   stats64->rx_packets = mib[IPSTATS_MIB_INPKTS];
+   stats64->rx_bytes = mib[IPSTATS_MIB_INOCTETS];
+   stats64->tx_packets = mib[IPSTATS_MIB_OUTPKTS];
+   stats64->tx_bytes = mib[IPSTATS_MIB_OUTOCTETS];
+   stats64->rx_errors = mib[IPSTATS_MIB_INDISCARDS];
+   stats64->tx_errors = mib[IPSTATS_MIB_OUTDISCARDS];
+   stats64->multicast = mib[IPSTATS_MIB_INMCASTPKTS];
+   stats64->rx_frame_errors = mib[IPSTATS_MIB_CSUMERRORS];
+}
+
 int get_rtnl_link_stats_rta(struct rtnl_link_stats64 *stats64,
struct rtattr *tb[])
 {
@@ -1565,6 +1584,14 @@ int get_rtnl_link_stats_rta(struct rtnl_link_stats64 
*stats64,
rta = tb[IFLA_STATS];
size = sizeof(struct rtnl_link_stats);
s = 
+   } else if (tb[IFLA_PROTINFO]) {
+   struct rtattr *ptb[IPSTATS_MIB_MAX_LEN + 1];
+
+   parse_rtattr_nested(ptb, IPSTATS_MIB_MAX_LEN,
+   tb[IFLA_PROTINFO]);
+   if (ptb[IFLA_INET6_STATS])
+   get_snmp_counters(stats64, ptb[IFLA_INET6_STATS]);
+   return sizeof(*stats64);
} else {
return -1;
}
-- 
2.17.1

[PATCH iproute 1/2] uapi: add snmp header file

2018-10-10 Thread Lorenzo Bianconi

Introduce snmp header file. It will be used in subsequent patch in
order to parse device statistics reported in
IFLA_INET6_STATS/IFLA_INET6_ICMP6STATS netlink attributes

Signed-off-by: Lorenzo Bianconi 
---
 include/uapi/linux/snmp.h | 323 ++
 1 file changed, 323 insertions(+)
 create mode 100644 include/uapi/linux/snmp.h

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
new file mode 100644
index ..f80135e5
--- /dev/null
+++ b/include/uapi/linux/snmp.h
@@ -0,0 +1,323 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Definitions for MIBs
+ *
+ * Author: Hideaki YOSHIFUJI 
+ */
+
+#ifndef _LINUX_SNMP_H
+#define _LINUX_SNMP_H
+
+/* ipstats mib definitions */
+/*
+ * RFC 1213:  MIB-II
+ * RFC 2011 (updates 1213):  SNMPv2-MIB-IP
+ * RFC 2863:  Interfaces Group MIB
+ * RFC 2465:  IPv6 MIB: General Group
+ * draft-ietf-ipv6-rfc2011-update-10.txt: MIB for IP: IP Statistics Tables
+ */
+enum
+{
+   IPSTATS_MIB_NUM = 0,
+/* frequently written fields in fast path, kept in same cache line */
+   IPSTATS_MIB_INPKTS, /* InReceives */
+   IPSTATS_MIB_INOCTETS,   /* InOctets */
+   IPSTATS_MIB_INDELIVERS, /* InDelivers */
+   IPSTATS_MIB_OUTFORWDATAGRAMS,   /* OutForwDatagrams */
+   IPSTATS_MIB_OUTPKTS,/* OutRequests */
+   IPSTATS_MIB_OUTOCTETS,  /* OutOctets */
+/* other fields */
+   IPSTATS_MIB_INHDRERRORS,/* InHdrErrors */
+   IPSTATS_MIB_INTOOBIGERRORS, /* InTooBigErrors */
+   IPSTATS_MIB_INNOROUTES, /* InNoRoutes */
+   IPSTATS_MIB_INADDRERRORS,   /* InAddrErrors */
+   IPSTATS_MIB_INUNKNOWNPROTOS,/* InUnknownProtos */
+   IPSTATS_MIB_INTRUNCATEDPKTS,/* InTruncatedPkts */
+   IPSTATS_MIB_INDISCARDS, /* InDiscards */
+   IPSTATS_MIB_OUTDISCARDS,/* OutDiscards */
+   IPSTATS_MIB_OUTNOROUTES,/* OutNoRoutes */
+   IPSTATS_MIB_REASMTIMEOUT,   /* ReasmTimeout */
+   IPSTATS_MIB_REASMREQDS, /* ReasmReqds */
+   IPSTATS_MIB_REASMOKS,   /* ReasmOKs */
+   IPSTATS_MIB_REASMFAILS, /* ReasmFails */
+   IPSTATS_MIB_FRAGOKS,/* FragOKs */
+   IPSTATS_MIB_FRAGFAILS,  /* FragFails */
+   IPSTATS_MIB_FRAGCREATES,/* FragCreates */
+   IPSTATS_MIB_INMCASTPKTS,/* InMcastPkts */
+   IPSTATS_MIB_OUTMCASTPKTS,   /* OutMcastPkts */
+   IPSTATS_MIB_INBCASTPKTS,/* InBcastPkts */
+   IPSTATS_MIB_OUTBCASTPKTS,   /* OutBcastPkts */
+   IPSTATS_MIB_INMCASTOCTETS,  /* InMcastOctets */
+   IPSTATS_MIB_OUTMCASTOCTETS, /* OutMcastOctets */
+   IPSTATS_MIB_INBCASTOCTETS,  /* InBcastOctets */
+   IPSTATS_MIB_OUTBCASTOCTETS, /* OutBcastOctets */
+   IPSTATS_MIB_CSUMERRORS, /* InCsumErrors */
+   IPSTATS_MIB_NOECTPKTS,  /* InNoECTPkts */
+   IPSTATS_MIB_ECT1PKTS,   /* InECT1Pkts */
+   IPSTATS_MIB_ECT0PKTS,   /* InECT0Pkts */
+   IPSTATS_MIB_CEPKTS, /* InCEPkts */
+   IPSTATS_MIB_REASM_OVERLAPS, /* ReasmOverlaps */
+   __IPSTATS_MIB_MAX
+};
+
+/* icmp mib definitions */
+/*
+ * RFC 1213:  MIB-II ICMP Group
+ * RFC 2011 (updates 1213):  SNMPv2 MIB for IP: ICMP group
+ */
+enum
+{
+   ICMP_MIB_NUM = 0,
+   ICMP_MIB_INMSGS,/* InMsgs */
+   ICMP_MIB_INERRORS,  /* InErrors */
+   ICMP_MIB_INDESTUNREACHS,/* InDestUnreachs */
+   ICMP_MIB_INTIMEEXCDS,   /* InTimeExcds */
+   ICMP_MIB_INPARMPROBS,   /* InParmProbs */
+   ICMP_MIB_INSRCQUENCHS,  /* InSrcQuenchs */
+   ICMP_MIB_INREDIRECTS,   /* InRedirects */
+   ICMP_MIB_INECHOS,   /* InEchos */
+   ICMP_MIB_INECHOREPS,/* InEchoReps */
+   ICMP_MIB_INTIMESTAMPS,  /* InTimestamps */
+   ICMP_MIB_INTIMESTAMPREPS,   /* InTimestampReps */
+   ICMP_MIB_INADDRMASKS,   /* InAddrMasks */
+   ICMP_MIB_INADDRMASKREPS,/* InAddrMaskReps */
+   ICMP_MIB_OUTMSGS,   /* OutMsgs */
+   ICMP_MIB_OUTERRORS, /* OutErrors */
+   ICMP_MIB_OUTDESTUNREACHS,   /* OutDestUnreachs */
+   ICMP_MIB_OUTTIMEEXCDS,  /* OutTimeExcds */
+   ICMP_MIB_OUTPARMPROBS,  /* OutParmProbs */
+   ICMP_MIB_OUTSRCQUENCHS, /* OutSrcQuenchs */
+   ICMP_MIB_OUTREDIRECTS

[PATCH iproute 0/2] introduce IFLA_INET6_STATS attribute parsing

2018-10-10 Thread Lorenzo Bianconi

Add IFLA_INET6_STATS netlink attribute parsing in order to fix an issue
triggered dumping device statistics.
Introduce snmp header as helper to walks through attribute subfields

Lorenzo Bianconi (2):
  uapi: add snmp header file
  utils: fix get_rtnl_link_stats_rta stats parsing

 include/uapi/linux/snmp.h | 323 ++
 lib/utils.c   |  27 
 2 files changed, 350 insertions(+)
 create mode 100644 include/uapi/linux/snmp.h

-- 
2.17.1

Re: [PATCH iproute2] iplink_vxlan: take into account preferred_family creating vxlan device

2018-09-21 Thread Lorenzo Bianconi

> On Fri, 21 Sep 2018 15:34:25 +0200
> Lorenzo Bianconi  wrote:
> 
> > Take into account the configured preferred_family if neither saddr or
> > daddr are provided since otherwise vxlan kernel module will use IPv4 as
> > default remote inet family neglecting the one provided by userspace.
> > This behaviour was originally in commit 97d564b90ccb ("vxlan: use
> > preferred address family when neither group or remote is specified").
> > The issue can be triggered with the following reproducer:
> > 
> > $ip -6 link add vxlan1 type vxlan id 42 dev enp0s2 \
> >  proxy nolearning l2miss l3miss
> > $bridge fdb add 46:47:1f:a7:1c:25 dev vxlan1 dst 2000::2
> > RTNETLINK answers: Address family not supported by protocol
> > 
> > Fixes: 1e9b8072de2c ("iplink_vxlan: Get rid of inet_get_addr()")
> > Signed-off-by: Lorenzo Bianconi 
> 
> This patch was already in the queue. Is this a new version (if so please mark 
> it as v2).

Hi Stephen,

yes, this is a v2. I missed v2 in the subject and changes respect to v1
(I have just modified the commit log actually). Please drop v1 and sorry for
the noise.

Regards,
Lorenzo

>

[PATCH iproute2] iplink_vxlan: take into account preferred_family creating vxlan device

2018-09-21 Thread Lorenzo Bianconi

Take into account the configured preferred_family if neither saddr or
daddr are provided since otherwise vxlan kernel module will use IPv4 as
default remote inet family neglecting the one provided by userspace.
This behaviour was originally in commit 97d564b90ccb ("vxlan: use
preferred address family when neither group or remote is specified").
The issue can be triggered with the following reproducer:

$ip -6 link add vxlan1 type vxlan id 42 dev enp0s2 \
 proxy nolearning l2miss l3miss
$bridge fdb add 46:47:1f:a7:1c:25 dev vxlan1 dst 2000::2
RTNETLINK answers: Address family not supported by protocol

Fixes: 1e9b8072de2c ("iplink_vxlan: Get rid of inet_get_addr()")
Signed-off-by: Lorenzo Bianconi 
---
 ip/iplink_vxlan.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/ip/iplink_vxlan.c b/ip/iplink_vxlan.c
index 2bc253fc..831f39a2 100644
--- a/ip/iplink_vxlan.c
+++ b/ip/iplink_vxlan.c
@@ -82,6 +82,7 @@ static int vxlan_parse_opt(struct link_util *lu, int argc, 
char **argv,
__u64 attrs = 0;
bool set_op = (n->nlmsg_type == RTM_NEWLINK &&
   !(n->nlmsg_flags & NLM_F_CREATE));
+   bool selected_family = false;
 
saddr.family = daddr.family = AF_UNSPEC;
 
@@ -356,12 +357,26 @@ static int vxlan_parse_opt(struct link_util *lu, int 
argc, char **argv,
int type = (saddr.family == AF_INET) ? IFLA_VXLAN_LOCAL
 : IFLA_VXLAN_LOCAL6;
addattr_l(n, 1024, type, saddr.data, saddr.bytelen);
+   selected_family = true;
}
 
if (is_addrtype_inet()) {
int type = (daddr.family == AF_INET) ? IFLA_VXLAN_GROUP
 : IFLA_VXLAN_GROUP6;
addattr_l(n, 1024, type, daddr.data, daddr.bytelen);
+   selected_family = true;
+   }
+
+   if (!selected_family) {
+   if (preferred_family == AF_INET) {
+   get_addr(, "default", AF_INET);
+   addattr_l(n, 1024, IFLA_VXLAN_GROUP,
+ daddr.data, daddr.bytelen);
+   } else if (preferred_family == AF_INET6) {
+   get_addr(, "default", AF_INET6);
+   addattr_l(n, 1024, IFLA_VXLAN_GROUP6,
+ daddr.data, daddr.bytelen);
+   }
}
 
if (!set_op || VXLAN_ATTRSET(attrs, IFLA_VXLAN_LEARNING))
-- 
2.17.1

iproute2: fail to add fdb entries to ipv6 vxlan device

2018-09-17 Thread Lorenzo Bianconi

Hi all,

while working on IPv6 vlxan driver I figured out that with recent version
of iproute2 it is no longer possible to configure an IPv6 vxlan device without
endpoint info (local ip, remote ip or group ip) and later add entries in
the vxlan fdb. This issue can be triggered with the following reproducer:

$ip -6 link add vxlan0 type vxlan id 42 dev enp0s2 \
proxy nolearning l2miss l3miss
$bridge fdb add 46:47:1f:a7:1c:25 dev vxlan1 dst 2000::2
RTNETLINK answers: Address family not supported by protocol

Starting from commit 1e9b8072de2c ("iplink_vxlan: Get rid of inet_get_addr()")
the preferred_family is no longer taken into account if neither saddr or daddr
are provided and vxlan kernel module will use IPv4 as default remote inet
family neglecting the one provided by userspace.
I guess we can fix that issue in two ways:

1- add a new netlink attribute to vxlan driver in order to specify the
preferred_family to use

2- restore the behaviour introduced in commit 97d564b90ccb ("vxlan: use
preferred address family when neither group or remote is specified") with
the following patch

-- >8 --
diff --git a/ip/iplink_vxlan.c b/ip/iplink_vxlan.c
index 2bc253fc..831f39a2 100644
--- a/ip/iplink_vxlan.c
+++ b/ip/iplink_vxlan.c
@@ -82,6 +82,7 @@ static int vxlan_parse_opt(struct link_util *lu, int argc, 
char **argv,
__u64 attrs = 0;
bool set_op = (n->nlmsg_type == RTM_NEWLINK &&
   !(n->nlmsg_flags & NLM_F_CREATE));
+   bool selected_family = false;
 
saddr.family = daddr.family = AF_UNSPEC;
 
@@ -356,12 +357,26 @@ static int vxlan_parse_opt(struct link_util *lu, int 
argc, char **argv,
int type = (saddr.family == AF_INET) ? IFLA_VXLAN_LOCAL
 : IFLA_VXLAN_LOCAL6;
addattr_l(n, 1024, type, saddr.data, saddr.bytelen);
+   selected_family = true;
}
 
if (is_addrtype_inet()) {
int type = (daddr.family == AF_INET) ? IFLA_VXLAN_GROUP
 : IFLA_VXLAN_GROUP6;
addattr_l(n, 1024, type, daddr.data, daddr.bytelen);
+   selected_family = true;
+   }
+
+   if (!selected_family) {
+   if (preferred_family == AF_INET) {
+   get_addr(, "default", AF_INET);
+   addattr_l(n, 1024, IFLA_VXLAN_GROUP,
+ daddr.data, daddr.bytelen);
+   } else if (preferred_family == AF_INET6) {
+   get_addr(, "default", AF_INET6);
+   addattr_l(n, 1024, IFLA_VXLAN_GROUP6,
+ daddr.data, daddr.bytelen);
+   }
}
 
if (!set_op || VXLAN_ATTRSET(attrs, IFLA_VXLAN_LEARNING))
-- 8< --

What is the best way to proceed?

Regards,
Lorenzo

Re: [PATCH net-next] veth: report NEWLINK event when moving the peer device in a new namespace

2018-09-01 Thread Lorenzo Bianconi

>
> > On 8/31/18 10:19 AM, Lorenzo Bianconi wrote:
> > >> On 8/31/18 5:43 AM, Lorenzo Bianconi wrote:
> > >>> When moving a veth device to another namespace, userspace receives a
> > >>> RTM_DELLINK message indicating the device has been removed from current
> > >>> netns. However, the other peer does not receive a netlink event
> > >>> containing new values for IFLA_LINK_NETNSID and IFLA_LINK veth
> > >>> attributes.
> > >>> Fix that behaviour sending to userspace a RTM_NEWLINK message in the 
> > >>> peer
> > >>> namespace to report new IFLA_LINK_NETNSID/IFLA_LINK values
> > >>>
> > >>
> > >> A newlink message is generated in the new namespace. What information is
> > >> missing from that message?
> > >>
> > >
> > > Hi David,
> > >
> > > let's assume we have two veth paired devices (veth0 and veth1) on inet
> > > namespace. When moving a veth1 to another namespace, userspace is notified
> > > with RTM_DELLINK event on inet namespace to indicate that veth1 has been
> > > moved to another namespace. However some userspace applications
> > > (e.g. NetworkManager), listening for events on inet namespace, are 
> > > interested
> > > in veth1 ifindex in the new namespace. This patch sends a new RTM_NEWLINK 
> > > event
> > > in inet namespace to provide new values for IFLA_LINK_NETNSID/IFLA_LINK
> >
> > This is in init namespace
> > $ ip li set veth2 netns foo
> >
> > $ ip monitor
> > Deleted 20: veth2@veth1:  mtu 1500 qdisc
> > noop state DOWN group default
> > link/ether c6:d0:d6:c5:23:7d brd ff:ff:ff:ff:ff:ff new-netns foo
> > new-ifindex 20
> >
> > It shows the new namespace in the delete message.
>

Hi David,

I was thinking about the commit 38e01b30563a and then I realized I
misread the code
yesterday. The commit 38e01b30563a provides all relevant info but it
emits the event
for veth1 (the device moved in the new namespace).
An userspace application will not receive that message if it filters
events for just
a specific device (veth0 in this case) despite that some device
properties have changed
(since veth0 and veth1 are paired devices). To fix that behavior in
veth_notify routine
I emits a RTM_NEWLINK event for veth0.

Regards,
Lorenzo

Re: [PATCH net-next] veth: report NEWLINK event when moving the peer device in a new namespace

2018-08-31 Thread Lorenzo Bianconi

> On 8/31/18 10:19 AM, Lorenzo Bianconi wrote:
> >> On 8/31/18 5:43 AM, Lorenzo Bianconi wrote:
> >>> When moving a veth device to another namespace, userspace receives a
> >>> RTM_DELLINK message indicating the device has been removed from current
> >>> netns. However, the other peer does not receive a netlink event
> >>> containing new values for IFLA_LINK_NETNSID and IFLA_LINK veth
> >>> attributes.
> >>> Fix that behaviour sending to userspace a RTM_NEWLINK message in the peer
> >>> namespace to report new IFLA_LINK_NETNSID/IFLA_LINK values
> >>>
> >>
> >> A newlink message is generated in the new namespace. What information is
> >> missing from that message?
> >>
> > 
> > Hi David,
> > 
> > let's assume we have two veth paired devices (veth0 and veth1) on inet
> > namespace. When moving a veth1 to another namespace, userspace is notified
> > with RTM_DELLINK event on inet namespace to indicate that veth1 has been
> > moved to another namespace. However some userspace applications
> > (e.g. NetworkManager), listening for events on inet namespace, are 
> > interested
> > in veth1 ifindex in the new namespace. This patch sends a new RTM_NEWLINK 
> > event
> > in inet namespace to provide new values for IFLA_LINK_NETNSID/IFLA_LINK 
> 
> This is in init namespace
> $ ip li set veth2 netns foo
> 
> $ ip monitor
> Deleted 20: veth2@veth1:  mtu 1500 qdisc
> noop state DOWN group default
> link/ether c6:d0:d6:c5:23:7d brd ff:ff:ff:ff:ff:ff new-netns foo
> new-ifindex 20
> 
> It shows the new namespace in the delete message.

Ops, I have not noticed this info has been already introduced in
the commit 38e01b30563a ("dev: advertise the new ifindex when the netns
iface changes"). Thanks for the hint.

DaveM please drop this patch.

Regards,
Lorenzo

Re: [PATCH net-next] veth: report NEWLINK event when moving the peer device in a new namespace

2018-08-31 Thread Lorenzo Bianconi

> On 8/31/18 5:43 AM, Lorenzo Bianconi wrote:
> > When moving a veth device to another namespace, userspace receives a
> > RTM_DELLINK message indicating the device has been removed from current
> > netns. However, the other peer does not receive a netlink event
> > containing new values for IFLA_LINK_NETNSID and IFLA_LINK veth
> > attributes.
> > Fix that behaviour sending to userspace a RTM_NEWLINK message in the peer
> > namespace to report new IFLA_LINK_NETNSID/IFLA_LINK values
> > 
> 
> A newlink message is generated in the new namespace. What information is
> missing from that message?
> 

Hi David,

let's assume we have two veth paired devices (veth0 and veth1) on inet
namespace. When moving a veth1 to another namespace, userspace is notified
with RTM_DELLINK event on inet namespace to indicate that veth1 has been
moved to another namespace. However some userspace applications
(e.g. NetworkManager), listening for events on inet namespace, are interested
in veth1 ifindex in the new namespace. This patch sends a new RTM_NEWLINK event
in inet namespace to provide new values for IFLA_LINK_NETNSID/IFLA_LINK 

Regards,
Lorenzo

[PATCH net-next] veth: report NEWLINK event when moving the peer device in a new namespace

2018-08-31 Thread Lorenzo Bianconi

When moving a veth device to another namespace, userspace receives a
RTM_DELLINK message indicating the device has been removed from current
netns. However, the other peer does not receive a netlink event
containing new values for IFLA_LINK_NETNSID and IFLA_LINK veth
attributes.
Fix that behaviour sending to userspace a RTM_NEWLINK message in the peer
namespace to report new IFLA_LINK_NETNSID/IFLA_LINK values

Reviewed-by: Stefano Brivio 
Signed-off-by: Lorenzo Bianconi 
---
Changes since RFC:
- export rtmsg_ifinfo_build_skb() symbol and do not use rtmsg_ifinfo_newnet()
  directly
---
 drivers/net/veth.c   | 66 +++-
 net/core/rtnetlink.c |  1 +
 2 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 8d679c8b7f25..0caffc93d74d 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1242,18 +1242,82 @@ static struct rtnl_link_ops veth_link_ops = {
.get_link_net   = veth_get_link_net,
 };
 
+static int veth_notify(struct notifier_block *this,
+  unsigned long event, void *ptr)
+{
+   struct net_device *peer, *dev = netdev_notifier_info_to_dev(ptr);
+   struct net *peer_net, *net = dev_net(dev);
+   int nsid, ret = NOTIFY_DONE;
+   struct veth_priv *priv;
+   struct sk_buff *skb;
+
+   if (dev->netdev_ops != _netdev_ops)
+   return NOTIFY_DONE;
+
+   if (event != NETDEV_REGISTER)
+   return NOTIFY_DONE;
+
+   priv = netdev_priv(dev);
+
+   rcu_read_lock();
+
+   peer = rcu_dereference(priv->peer);
+   if (!peer)
+   goto out;
+
+   peer_net = dev_net(peer);
+   /* do not forward events if both veth devices
+* are in the same namespace
+*/
+   if (peer_net == net)
+   goto out;
+
+   /* notify on peer namespace new IFLA_LINK_NETNSID
+* and IFLA_LINK values
+*/
+   nsid = peernet2id_alloc(peer_net, net);
+   skb = rtmsg_ifinfo_build_skb(RTM_NEWLINK, peer, ~0U,
+IFLA_EVENT_NONE, GFP_ATOMIC,
+, dev->ifindex);
+   if (skb) {
+   rtnl_notify(skb, peer_net, 0, RTNLGRP_LINK, NULL,
+   GFP_ATOMIC);
+   ret = NOTIFY_OK;
+   }
+
+out:
+   rcu_read_unlock();
+
+   return ret;
+}
+
+static struct notifier_block veth_notifier = {
+   .notifier_call = veth_notify,
+};
+
 /*
  * init/fini
  */
 
 static __init int veth_init(void)
 {
-   return rtnl_link_register(_link_ops);
+   int err;
+
+   err = register_netdevice_notifier(_notifier);
+   if (err < 0)
+   return err;
+
+   err = rtnl_link_register(_link_ops);
+   if (err < 0)
+   unregister_netdevice_notifier(_notifier);
+
+   return err;
 }
 
 static __exit void veth_exit(void)
 {
rtnl_link_unregister(_link_ops);
+   unregister_netdevice_notifier(_notifier);
 }
 
 module_init(veth_init);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 24431e578310..b2df84db7670 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3326,6 +3326,7 @@ struct sk_buff *rtmsg_ifinfo_build_skb(int type, struct 
net_device *dev,
rtnl_set_sk_err(net, RTNLGRP_LINK, err);
return NULL;
 }
+EXPORT_SYMBOL(rtmsg_ifinfo_build_skb);
 
 void rtmsg_ifinfo_send(struct sk_buff *skb, struct net_device *dev, gfp_t 
flags)
 {
-- 
2.17.1

[RFC net-next] veth: report NEWLINK event when moving the peer device in a new namespace

2018-08-29 Thread Lorenzo Bianconi

When moving a veth device to another namespace, userspace receives a
RTM_DELLINK message indicating the device has been removed from current
netns. However, the other peer does not receive a netlink event
containing new values for IFLA_LINK_NETNSID and IFLA_LINK veth
attributes.
Fix that behaviour sending to userspace a RTM_NEWLINK message in the peer
namespace to report new IFLA_LINK_NETNSID/IFLA_LINK values

Reviewed-by: Stefano Brivio 
Signed-off-by: Lorenzo Bianconi 
---
 drivers/net/veth.c | 60 +-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 8d679c8b7f25..b27d46d8084a 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1242,18 +1242,76 @@ static struct rtnl_link_ops veth_link_ops = {
.get_link_net   = veth_get_link_net,
 };
 
+static int veth_notify(struct notifier_block *this,
+  unsigned long event, void *ptr)
+{
+   struct net_device *peer, *dev = netdev_notifier_info_to_dev(ptr);
+   struct net *peer_net, *net = dev_net(dev);
+   int nsid, ret = NOTIFY_DONE;
+   struct veth_priv *priv;
+
+   if (dev->netdev_ops != _netdev_ops)
+   return NOTIFY_DONE;
+
+   if (event != NETDEV_REGISTER)
+   return NOTIFY_DONE;
+
+   priv = netdev_priv(dev);
+
+   rcu_read_lock();
+
+   peer = rcu_dereference(priv->peer);
+   if (!peer)
+   goto out;
+
+   peer_net = dev_net(peer);
+   /* do not forward events if both veth devices
+* are in the same namespace
+*/
+   if (peer_net == net)
+   goto out;
+
+   /* notify on peer namespace new IFLA_LINK_NETNSID
+* and IFLA_LINK values
+*/
+   nsid = peernet2id_alloc(peer_net, net);
+   rtmsg_ifinfo_newnet(RTM_NEWLINK, peer, ~0U, GFP_ATOMIC,
+   , dev->ifindex);
+   ret = NOTIFY_OK;
+
+out:
+   rcu_read_unlock();
+
+   return ret;
+}
+
+static struct notifier_block veth_notifier = {
+   .notifier_call = veth_notify,
+};
+
 /*
  * init/fini
  */
 
 static __init int veth_init(void)
 {
-   return rtnl_link_register(_link_ops);
+   int err;
+
+   err = register_netdevice_notifier(_notifier);
+   if (err < 0)
+   return err;
+
+   err = rtnl_link_register(_link_ops);
+   if (err < 0)
+   unregister_netdevice_notifier(_notifier);
+
+   return err;
 }
 
 static __exit void veth_exit(void)
 {
rtnl_link_unregister(_link_ops);
+   unregister_netdevice_notifier(_notifier);
 }
 
 module_init(veth_init);
-- 
2.17.1

Re: [PATCH 3.18.y] Fix compilation error backporting upstream commit 9fc12023d6f5

2018-08-04 Thread Lorenzo Bianconi

> On Sat, Aug 04, 2018 at 02:27:41PM +0200, Lorenzo Bianconi wrote:
> > Fix following compilation error backporting upstream commit
> > 9fc12023d6f5 ("ipv4: remove BUG_ON() from fib_compute_spec_dst)
> >
> > net/ipv4/fib_frontend.c: In function 'fib_compute_spec_dst':
> > net/ipv4/fib_frontend.c:225:3: error: expected expression before 'if'
> >if (!fib_lookup(net, , ))
> >
> > net/ipv4/fib_frontend.c:200:20: warning: unused variable 'res'
> >   struct fib_result res;
> >
> > Fixes: 0e46da6c6fac ("ipv4: remove BUG_ON() from fib_compute_spec_dst")
> > Signed-off-by: Lorenzo Bianconi 
> > ---
> >  net/ipv4/fib_frontend.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> > index 924db4bedd88..0b29649627a7 100644
> > --- a/net/ipv4/fib_frontend.c
> > +++ b/net/ipv4/fib_frontend.c
> > @@ -221,7 +221,7 @@ __be32 fib_compute_spec_dst(struct sk_buff *skb)
> >   fl4.saddr = 0;
> >   fl4.flowi4_tos = RT_TOS(ip_hdr(skb)->tos);
> >   fl4.flowi4_scope = scope;
> > - fl4.flowi4_mark = vmark ? skb->mark : 0,
> > + fl4.flowi4_mark = vmark ? skb->mark : 0;
> >   if (!fib_lookup(net, , ))
> >   return FIB_RES_PREFSRC(net, res);
> >   } else {
>
> Thanks, but I fixed this about 6 hours ago, right?
>

Ops, right. Sorry for the noise :)

Regards,
Lorenzo

> greg k-h

[PATCH 3.18.y] Fix compilation error backporting upstream commit 9fc12023d6f5

2018-08-04 Thread Lorenzo Bianconi

Fix following compilation error backporting upstream commit
9fc12023d6f5 ("ipv4: remove BUG_ON() from fib_compute_spec_dst)

net/ipv4/fib_frontend.c: In function 'fib_compute_spec_dst':
net/ipv4/fib_frontend.c:225:3: error: expected expression before 'if'
   if (!fib_lookup(net, , ))

net/ipv4/fib_frontend.c:200:20: warning: unused variable 'res'
  struct fib_result res;

Fixes: 0e46da6c6fac ("ipv4: remove BUG_ON() from fib_compute_spec_dst")
Signed-off-by: Lorenzo Bianconi 
---
 net/ipv4/fib_frontend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 924db4bedd88..0b29649627a7 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -221,7 +221,7 @@ __be32 fib_compute_spec_dst(struct sk_buff *skb)
fl4.saddr = 0;
fl4.flowi4_tos = RT_TOS(ip_hdr(skb)->tos);
fl4.flowi4_scope = scope;
-   fl4.flowi4_mark = vmark ? skb->mark : 0,
+   fl4.flowi4_mark = vmark ? skb->mark : 0;
if (!fib_lookup(net, , ))
return FIB_RES_PREFSRC(net, res);
} else {
-- 
2.18.0

[PATCH net] ipv4: remove BUG_ON() from fib_compute_spec_dst

2018-07-27 Thread Lorenzo Bianconi

Remove BUG_ON() from fib_compute_spec_dst routine and check
in_dev pointer during flowi4 data structure initialization.
fib_compute_spec_dst routine can be run concurrently with device removal
where ip_ptr net_device pointer is set to NULL. This can happen
if userspace enables pkt info on UDP rx socket and the device
is removed while traffic is flowing

Fixes: 35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper")
Signed-off-by: Lorenzo Bianconi 
---
 net/ipv4/fib_frontend.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index e46cdd310e5f..2998b0e47d4b 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -292,19 +292,19 @@ __be32 fib_compute_spec_dst(struct sk_buff *skb)
return ip_hdr(skb)->daddr;
 
in_dev = __in_dev_get_rcu(dev);
-   BUG_ON(!in_dev);
 
net = dev_net(dev);
 
scope = RT_SCOPE_UNIVERSE;
if (!ipv4_is_zeronet(ip_hdr(skb)->saddr)) {
+   bool vmark = in_dev && IN_DEV_SRC_VMARK(in_dev);
struct flowi4 fl4 = {
.flowi4_iif = LOOPBACK_IFINDEX,
.flowi4_oif = l3mdev_master_ifindex_rcu(dev),
.daddr = ip_hdr(skb)->saddr,
.flowi4_tos = RT_TOS(ip_hdr(skb)->tos),
.flowi4_scope = scope,
-   .flowi4_mark = IN_DEV_SRC_VMARK(in_dev) ? skb->mark : 0,
+   .flowi4_mark = vmark ? skb->mark : 0,
};
if (!fib_lookup(net, , , 0))
return FIB_RES_PREFSRC(net, res);
-- 
2.17.1

Re: [PATCH net-next] ipv6: send netlink notifications for manually configured addresses

2018-04-19 Thread Lorenzo Bianconi

> From: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
> Date: Tue, 17 Apr 2018 11:54:39 +0200
>
>> Send a netlink notification when userspace adds a manually configured
>> address if DAD is enabled and optimistic flag isn't set.
>> Moreover send RTM_DELADDR notifications for tentative addresses.
>>
>> Some userspace applications (e.g. NetworkManager) are interested in
>> addr netlink events albeit the address is still in tentative state,
>> however events are not sent if DAD process is not completed.
>> If the address is added and immediately removed userspace listeners
>> are not notified. This behaviour can be easily reproduced by using
>> veth interfaces:
>>
>> $ ip -b - <>> link add dev vm1 type veth peer name vm2
>>> link set dev vm1 up
>>> link set dev vm2 up
>>> addr add 2001:db8:a:b:1:2:3:4/64 dev vm1
>>> addr del 2001:db8:a:b:1:2:3:4/64 dev vm1
>> EOF
>>
>> This patch reverts the behaviour introduced by the commit f784ad3d79e5
>> ("ipv6: do not send RTM_DELADDR for tentative addresses")
>>
>> Suggested-by: Thomas Haller <thal...@redhat.com>
>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
>
> Ok, applied to net-next.
>
> It would be really nice if we clearly documented somewhere the exact
> situations where we desire ipv6 address netlink notifications to be
> sent out.
>
> Maybe even a common function that guards the event emission, where we
> encode the rules.  Or a comment somewhere prominent.  Something like
> that.
>
> Right now this isn't spelled out clearly anywhere, and that's probably
> why things keep being adjusted back and forth like this.

Sounds good, I will post a patch. Thanks

Regards,
Lorenzo

>
> Thank you.

[PATCH net-next] ipv6: send netlink notifications for manually configured addresses

2018-04-17 Thread Lorenzo Bianconi

Send a netlink notification when userspace adds a manually configured
address if DAD is enabled and optimistic flag isn't set.
Moreover send RTM_DELADDR notifications for tentative addresses.

Some userspace applications (e.g. NetworkManager) are interested in
addr netlink events albeit the address is still in tentative state,
however events are not sent if DAD process is not completed.
If the address is added and immediately removed userspace listeners
are not notified. This behaviour can be easily reproduced by using
veth interfaces:

$ ip -b - < link add dev vm1 type veth peer name vm2
> link set dev vm1 up
> link set dev vm2 up
> addr add 2001:db8:a:b:1:2:3:4/64 dev vm1
> addr del 2001:db8:a:b:1:2:3:4/64 dev vm1
EOF

This patch reverts the behaviour introduced by the commit f784ad3d79e5
("ipv6: do not send RTM_DELADDR for tentative addresses")

Suggested-by: Thomas Haller <thal...@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/ipv6/addrconf.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 62b97722722c..b2c0175125db 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2901,6 +2901,11 @@ static int inet6_addr_add(struct net *net, int ifindex,
  expires, flags);
}
 
+   /* Send a netlink notification if DAD is enabled and
+* optimistic flag is not set
+*/
+   if (!(ifp->flags & (IFA_F_OPTIMISTIC | IFA_F_NODAD)))
+   ipv6_ifa_notify(0, ifp);
/*
 * Note that section 3.1 of RFC 4429 indicates
 * that the Optimistic flag should not be set for
@@ -5028,14 +5033,6 @@ static void inet6_ifa_notify(int event, struct 
inet6_ifaddr *ifa)
struct net *net = dev_net(ifa->idev->dev);
int err = -ENOBUFS;
 
-   /* Don't send DELADDR notification for TENTATIVE address,
-* since NEWADDR notification is sent only after removing
-* TENTATIVE flag, if DAD has not failed.
-*/
-   if (ifa->flags & IFA_F_TENTATIVE && !(ifa->flags & IFA_F_DADFAILED) &&
-   event == RTM_DELADDR)
-   return;
-
skb = nlmsg_new(inet6_ifaddr_msgsize(), GFP_ATOMIC);
if (!skb)
goto errout;
-- 
2.14.3

Re: [PATCH] ipv6: remove unnecessary check in addrconf_prefix_rcv_add_addr()

2018-04-16 Thread Lorenzo Bianconi

> Remove unnecessary check on update_lft variable in
> addrconf_prefix_rcv_add_addr routine since it is always set to 0.
> Moreover remove update_lft re-initialization to 0
>
> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
> ---
>  net/ipv6/addrconf.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index dffa38004c13..b2c0175125db 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -2529,7 +2529,6 @@ int addrconf_prefix_rcv_add_addr(struct net *net, 
> struct net_device *dev,
> if (IS_ERR_OR_NULL(ifp))
> return -1;
>
> -   update_lft = 0;
> create = 1;
> spin_lock_bh(>lock);
> ifp->flags |= IFA_F_MANAGETEMPADDR;
> @@ -2551,7 +2550,7 @@ int addrconf_prefix_rcv_add_addr(struct net *net, 
> struct net_device *dev,
> stored_lft = ifp->valid_lft - (now - ifp->tstamp) / 
> HZ;
> else
> stored_lft = 0;
> -   if (!update_lft && !create && stored_lft) {
> +   if (!create && stored_lft) {
> const u32 minimum_lft = min_t(u32,
> stored_lft, MIN_VALID_LIFETIME);
> valid_lft = max(valid_lft, minimum_lft);
> --
> 2.14.3
>

I forgot 'net-next' tag in the subject. Dave should I send a v2?

Regards,
Lorenzo

[PATCH] ipv6: remove unnecessary check in addrconf_prefix_rcv_add_addr()

2018-04-16 Thread Lorenzo Bianconi

Remove unnecessary check on update_lft variable in
addrconf_prefix_rcv_add_addr routine since it is always set to 0.
Moreover remove update_lft re-initialization to 0

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/ipv6/addrconf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index dffa38004c13..b2c0175125db 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2529,7 +2529,6 @@ int addrconf_prefix_rcv_add_addr(struct net *net, struct 
net_device *dev,
if (IS_ERR_OR_NULL(ifp))
return -1;
 
-   update_lft = 0;
create = 1;
spin_lock_bh(>lock);
ifp->flags |= IFA_F_MANAGETEMPADDR;
@@ -2551,7 +2550,7 @@ int addrconf_prefix_rcv_add_addr(struct net *net, struct 
net_device *dev,
stored_lft = ifp->valid_lft - (now - ifp->tstamp) / HZ;
else
stored_lft = 0;
-   if (!update_lft && !create && stored_lft) {
+   if (!create && stored_lft) {
const u32 minimum_lft = min_t(u32,
stored_lft, MIN_VALID_LIFETIME);
valid_lft = max(valid_lft, minimum_lft);
-- 
2.14.3

[RFC net-next] ipv6: send netlink notifications for manually configured addresses

2018-04-12 Thread Lorenzo Bianconi

Send a netlink notification when userspace adds a manually configured
address if DAD is enabled and optimistic flag isn't set.
Moreover send RTM_DELADDR notifications for tentative addresses.

Some userspace applications (e.g. NetworkManager) are interested in
addr netlink events albeit the address is still in tentative state,
however events are not sent if DAD process is not completed.
If the address is added and immediately removed userspace listeners
are not notified. This behaviour can be easily reproduced by using
veth interfaces:

$ ip -b - < link add dev vm1 type veth peer name vm2
> link set dev vm1 up
> link set dev vm2 up
> addr add 2001:db8:a:b:1:2:3:4/64 dev vm1
> addr del 2001:db8:a:b:1:2:3:4/64 dev vm1
EOF

This patch reverts the behaviour introduced by the commit f784ad3d79e5
("ipv6: do not send RTM_DELADDR for tentative addresses")

Suggested-by: Thomas Haller <thal...@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/ipv6/addrconf.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 78cef00c9596..dffa38004c13 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2902,6 +2902,11 @@ static int inet6_addr_add(struct net *net, int ifindex,
  expires, flags);
}
 
+   /* Send a netlink notification if DAD is enabled and
+* optimistic flag is not set
+*/
+   if (!(ifp->flags & (IFA_F_OPTIMISTIC | IFA_F_NODAD)))
+   ipv6_ifa_notify(0, ifp);
/*
 * Note that section 3.1 of RFC 4429 indicates
 * that the Optimistic flag should not be set for
@@ -5029,14 +5034,6 @@ static void inet6_ifa_notify(int event, struct 
inet6_ifaddr *ifa)
struct net *net = dev_net(ifa->idev->dev);
int err = -ENOBUFS;
 
-   /* Don't send DELADDR notification for TENTATIVE address,
-* since NEWADDR notification is sent only after removing
-* TENTATIVE flag, if DAD has not failed.
-*/
-   if (ifa->flags & IFA_F_TENTATIVE && !(ifa->flags & IFA_F_DADFAILED) &&
-   event == RTM_DELADDR)
-   return;
-
skb = nlmsg_new(inet6_ifaddr_msgsize(), GFP_ATOMIC);
if (!skb)
goto errout;
-- 
2.14.3

[PATCH v2 net-next 0/2] do not allow adding routes if disable_ipv6 is enabled

2018-03-29 Thread Lorenzo Bianconi

Do not allow userspace to add static ipv6 routes if disable_ipv6 is enabled.
Update disable_ipv6 documentation according to that change

Changes since v1:
- added an extack message telling the user that IPv6 is disabled on the nexthop
  device
- rebased on-top of net-next

Lorenzo Bianconi (2):
  ipv6: do not set routes if disable_ipv6 has been enabled
  Documentation: ip-sysctl.txt: clarify disable_ipv6

 Documentation/networking/ip-sysctl.txt | 4 +++-
 net/ipv6/route.c   | 6 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

-- 
2.14.3

[PATCH v2 net-next 1/2] ipv6: do not set routes if disable_ipv6 has been enabled

2018-03-29 Thread Lorenzo Bianconi

Do not allow setting ipv6 routes from userspace if disable_ipv6 has been
enabled. The issue can be triggered using the following reproducer:

- sysctl net.ipv6.conf.all.disable_ipv6=1
- ip -6 route add a:b:c:d::/64 dev em1
- ip -6 route show
  a:b:c:d::/64 dev em1 metric 1024 pref medium

Fix it checking disable_ipv6 value in ip6_route_info_create routine

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/ipv6/route.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index ba8d5df50ebe..e461ef1158b6 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2917,6 +2917,12 @@ static struct rt6_info *ip6_route_info_create(struct 
fib6_config *cfg,
if (!dev)
goto out;
 
+   if (idev->cnf.disable_ipv6) {
+   NL_SET_ERR_MSG(extack, "IPv6 is disabled on nexthop device");
+   err = -EACCES;
+   goto out;
+   }
+
if (!(dev->flags & IFF_UP)) {
NL_SET_ERR_MSG(extack, "Nexthop device is not up");
err = -ENETDOWN;
-- 
2.14.3

[PATCH v2 net-next 2/2] Documentation: ip-sysctl.txt: clarify disable_ipv6

2018-03-29 Thread Lorenzo Bianconi

Clarify that when disable_ipv6 is enabled even the ipv6 routes
are deleted for the selected interface and from now it will not
be possible to add addresses/routes to that interface

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 Documentation/networking/ip-sysctl.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt 
b/Documentation/networking/ip-sysctl.txt
index 1d1120753ae8..33f35f049ad5 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1703,7 +1703,9 @@ disable_ipv6 - BOOLEAN
interface and start Duplicate Address Detection, if necessary.
 
When this value is changed from 0 to 1 (IPv6 is being disabled),
-   it will dynamically delete all address on the given interface.
+   it will dynamically delete all addresses and routes on the given
+   interface. From now on it will not possible to add addresses/routes
+   to the selected interface.
 
 accept_dad - INTEGER
Whether to accept DAD (Duplicate Address Detection).
-- 
2.14.3

Re: [PATCH net-next 1/2] ipv6: do not set routes if disable_ipv6 has been enabled

2018-03-29 Thread Lorenzo Bianconi

> On 3/27/18 11:11 AM, Lorenzo Bianconi wrote:
> > Do not allow to set ipv6 routes from userspace if disable_ipv6 has been
> > enabled. The issue can be triggered using the following reproducer:
> > 
> > - sysctl net.ipv6.conf.all.disable_ipv6=1
> > - ip -6 route add a:b:c:d::/64 dev em1
> > - ip -6 route show
> >   a:b:c:d::/64 dev em1 metric 1024 pref medium
> > 
> > Fix it checking disable_ipv6 value in ip6_route_info_create routine
> > 
> > Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
> > ---
> >  net/ipv6/route.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> > index 1d0eaa69874d..672fd7fdb037 100644
> > --- a/net/ipv6/route.c
> > +++ b/net/ipv6/route.c
> > @@ -2917,6 +2917,11 @@ static struct rt6_info *ip6_route_info_create(struct 
> > fib6_config *cfg,
> > if (!dev)
> > goto out;
> >  
> > +   if (idev->cnf.disable_ipv6) {
> > +   err = -EACCES;
> 
> you need an extack message telling the user that IPv6 is disabled on the
> nexthop device.
> 

Ack, will do in v2.

Regards,
Lorenzo

> > +   goto out;
> > +   }
> > +
> > if (!(dev->flags & IFF_UP)) {
> > NL_SET_ERR_MSG(extack, "Nexthop device is not up");
> > err = -ENETDOWN;
> > 
>

[PATCH net-next 1/2] ipv6: do not set routes if disable_ipv6 has been enabled

2018-03-27 Thread Lorenzo Bianconi

Do not allow to set ipv6 routes from userspace if disable_ipv6 has been
enabled. The issue can be triggered using the following reproducer:

- sysctl net.ipv6.conf.all.disable_ipv6=1
- ip -6 route add a:b:c:d::/64 dev em1
- ip -6 route show
  a:b:c:d::/64 dev em1 metric 1024 pref medium

Fix it checking disable_ipv6 value in ip6_route_info_create routine

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/ipv6/route.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 1d0eaa69874d..672fd7fdb037 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2917,6 +2917,11 @@ static struct rt6_info *ip6_route_info_create(struct 
fib6_config *cfg,
if (!dev)
goto out;
 
+   if (idev->cnf.disable_ipv6) {
+   err = -EACCES;
+   goto out;
+   }
+
if (!(dev->flags & IFF_UP)) {
NL_SET_ERR_MSG(extack, "Nexthop device is not up");
err = -ENETDOWN;
-- 
2.14.3

[PATCH net-next 2/2] Documentation: ip-sysctl.txt: clarify disable_ipv6

2018-03-27 Thread Lorenzo Bianconi

Clarify that when disable_ipv6 is enabled even the ipv6 routes
are deleted for the selected interface and from now it will not
be possible to add addresses/routes to that interface

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 Documentation/networking/ip-sysctl.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt 
b/Documentation/networking/ip-sysctl.txt
index 1d1120753ae8..33f35f049ad5 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1703,7 +1703,9 @@ disable_ipv6 - BOOLEAN
interface and start Duplicate Address Detection, if necessary.
 
When this value is changed from 0 to 1 (IPv6 is being disabled),
-   it will dynamically delete all address on the given interface.
+   it will dynamically delete all addresses and routes on the given
+   interface. From now on it will not possible to add addresses/routes
+   to the selected interface.
 
 accept_dad - INTEGER
Whether to accept DAD (Duplicate Address Detection).
-- 
2.14.3

[PATCH net-next 0/2] do not allow to add routes if disable_ipv6 is enabled

2018-03-27 Thread Lorenzo Bianconi

Do not allow userspace to add static ipv6 routes if disable_ipv6 is enabled.
Update disable_ipv6 documentation according to that change

Lorenzo Bianconi (2):
  ipv6: do not set routes if disable_ipv6 has been enabled
  Documentation: ip-sysctl.txt: clarify disable_ipv6

 Documentation/networking/ip-sysctl.txt | 4 +++-
 net/ipv6/route.c   | 5 +
 2 files changed, 8 insertions(+), 1 deletion(-)

-- 
2.14.3

[PATCH net v2] ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option()

2018-03-08 Thread Lorenzo Bianconi

Fix the following slab-out-of-bounds kasan report in
ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
linear and the accessed data are not in the linear data region of orig_skb.

[ 1503.122508] 
==
[ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990
[ 1503.123036] Read of size 1184 at addr 8800298ab6b0 by task netperf/1932

[ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124
[ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.10.2-2.fc27 04/01/2014
[ 1503.123527] Call Trace:
[ 1503.123579]  
[ 1503.123638]  print_address_description+0x6e/0x280
[ 1503.123849]  kasan_report+0x233/0x350
[ 1503.123946]  memcpy+0x1f/0x50
[ 1503.124037]  ndisc_send_redirect+0x94e/0x990
[ 1503.125150]  ip6_forward+0x1242/0x13b0
[...]
[ 1503.153890] Allocated by task 1932:
[ 1503.153982]  kasan_kmalloc+0x9f/0xd0
[ 1503.154074]  __kmalloc_track_caller+0xb5/0x160
[ 1503.154198]  __kmalloc_reserve.isra.41+0x24/0x70
[ 1503.154324]  __alloc_skb+0x130/0x3e0
[ 1503.154415]  sctp_packet_transmit+0x21a/0x1810
[ 1503.154533]  sctp_outq_flush+0xc14/0x1db0
[ 1503.154624]  sctp_do_sm+0x34e/0x2740
[ 1503.154715]  sctp_primitive_SEND+0x57/0x70
[ 1503.154807]  sctp_sendmsg+0xaa6/0x1b10
[ 1503.154897]  sock_sendmsg+0x68/0x80
[ 1503.154987]  ___sys_sendmsg+0x431/0x4b0
[ 1503.155078]  __sys_sendmsg+0xa4/0x130
[ 1503.155168]  do_syscall_64+0x171/0x3f0
[ 1503.155259]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 1503.155436] Freed by task 1932:
[ 1503.155527]  __kasan_slab_free+0x134/0x180
[ 1503.155618]  kfree+0xbc/0x180
[ 1503.155709]  skb_release_data+0x27f/0x2c0
[ 1503.155800]  consume_skb+0x94/0xe0
[ 1503.155889]  sctp_chunk_put+0x1aa/0x1f0
[ 1503.155979]  sctp_inq_pop+0x2f8/0x6e0
[ 1503.156070]  sctp_assoc_bh_rcv+0x6a/0x230
[ 1503.156164]  sctp_inq_push+0x117/0x150
[ 1503.156255]  sctp_backlog_rcv+0xdf/0x4a0
[ 1503.156346]  __release_sock+0x142/0x250
[ 1503.156436]  release_sock+0x80/0x180
[ 1503.156526]  sctp_sendmsg+0xbb0/0x1b10
[ 1503.156617]  sock_sendmsg+0x68/0x80
[ 1503.156708]  ___sys_sendmsg+0x431/0x4b0
[ 1503.156799]  __sys_sendmsg+0xa4/0x130
[ 1503.156889]  do_syscall_64+0x171/0x3f0
[ 1503.156980]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 1503.157158] The buggy address belongs to the object at 8800298ab600
which belongs to the cache kmalloc-1024 of size 1024
[ 1503.157444] The buggy address is located 176 bytes inside of
1024-byte region [8800298ab600, 8800298aba00)
[ 1503.157702] The buggy address belongs to the page:
[ 1503.157820] page:eaa62a00 count:1 mapcount:0 
mapping: index:0x0 compound_mapcount: 0
[ 1503.158053] flags: 0x40008100(slab|head)
[ 1503.158171] raw: 40008100   
0001800e000e
[ 1503.158350] raw: dead0100 dead0200 880036002600 

[ 1503.158523] page dumped because: kasan: bad access detected

[ 1503.158698] Memory state around the buggy address:
[ 1503.158816]  8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00
[ 1503.158988]  8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00
[ 1503.159165] >8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc
[ 1503.159338]^
[ 1503.159436]  8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[ 1503.159610]  8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[ 1503.159785] 
==
[ 1503.159964] Disabling lock debugging due to kernel taint

The test scenario to trigger the issue consists of 4 devices:
- H0: data sender, connected to LAN0
- H1: data receiver, connected to LAN1
- GW0 and GW1: routers between LAN0 and LAN1. Both of them have an
  ethernet connection on LAN0 and LAN1
On H{0,1} set GW0 as default gateway while on GW0 set GW1 as next hop for
data from LAN0 to LAN1.
Moreover create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent
data streams (TCP/UDP/SCTP) from H0 to H1 through ip6ip6 tunnel (send
buffer size is set to 16K). While data streams are active flush the route
cache on HA multiple times.
I have not been able to identify a given commit that introduced the issue
since, using the reproducer described above, the kasan report has been
triggered from 4.14 and I have not gone back further.

Reported-by: Jianlin Shi <ji...@redhat.com>
Reviewed-by: Stefano Brivio <sbri...@redhat.com>
Reviewed-by: Eric Dumazet <eduma...@google.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
Changes since v1:
- improve commit log
- target the patch for net and not for net-next
---
 net/ipv6/ndisc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index f61a5b613b52..ba5e04c6ae17 100644
--- a/net/ipv6/ndi

Re: [PATCH net-next] ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option()

2018-03-05 Thread Lorenzo Bianconi

> From: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
> Date: Fri,  2 Mar 2018 11:53:06 +0100
>
>> Fix the following slab-out-of-bounds kasan report in
>> ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
>> linear and the accessed data are not in the linear data region of orig_skb
>  ...
>> Reported-by: Jianlin Shi <ji...@redhat.com>
>> Reviewed-by: Stefano Brivio <sbri...@redhat.com>
>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
>
> As a bug fix this should be targetting 'net' not 'net-next'.
>
> Furthermore, we need an appropriate Fixes: tag so we know when this
> problem existed.
>
> If you go far back and it seems like the problem has always been
> there, say so and mention how far back you checked.

Hi Dave,

Yes sorry, that is exactly the case. Bisecting the issue, I am able to
trigger the kasan report on 4.14 and I have not gone back further.
Moreover the issue is much more hard to trigger on net-next (or net)
respect to 4.14.

>
> It also helps to explain exactly how the condition is created
> ("X creates packet with Y bytes of header space, Z fragments
> it at byte N, and that's how we end up here with such a packet")
> because such a description aids understanding and might help
> suggest alternative (less expensive, cleaner) ways to fix the
> problem.
>
> Thanks.

The test scenario consists of 4 devices (simulated using veth and namespaces):
- H0: data sender, connected to LAN0
- H1: data receiver, connected to LAN1
- GW0 and GW1: routers between LAN0 and LAN1. Both of them have an
ethernet connection on LAN{0,1}
- The default route on H{0,1} is GW0
- using ip command on GW0, set GW1 as next hop for LAN1 from LAN0
- create a ip6ip6 tunnel between H0 and H1
- send 3 concurrent data streams (TCP/UDP/SCTP) from H0 to H1 through
ip6ip6 tunnel (buffer size is set to 16K)
- while data streams are active flush the route cache on HA multiple times.

Regards,
Lorenzo

[PATCH net-next] ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option()

2018-03-02 Thread Lorenzo Bianconi

Fix the following slab-out-of-bounds kasan report in
ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
linear and the accessed data are not in the linear data region of orig_skb

[ 1503.122508] 
==
[ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990
[ 1503.123036] Read of size 1184 at addr 8800298ab6b0 by task netperf/1932

[ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124
[ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.10.2-2.fc27 04/01/2014
[ 1503.123527] Call Trace:
[ 1503.123579]  
[ 1503.123638]  print_address_description+0x6e/0x280
[ 1503.123849]  kasan_report+0x233/0x350
[ 1503.123946]  memcpy+0x1f/0x50
[ 1503.124037]  ndisc_send_redirect+0x94e/0x990
[ 1503.125150]  ip6_forward+0x1242/0x13b0
[...]
[ 1503.153890] Allocated by task 1932:
[ 1503.153982]  kasan_kmalloc+0x9f/0xd0
[ 1503.154074]  __kmalloc_track_caller+0xb5/0x160
[ 1503.154198]  __kmalloc_reserve.isra.41+0x24/0x70
[ 1503.154324]  __alloc_skb+0x130/0x3e0
[ 1503.154415]  sctp_packet_transmit+0x21a/0x1810
[ 1503.154533]  sctp_outq_flush+0xc14/0x1db0
[ 1503.154624]  sctp_do_sm+0x34e/0x2740
[ 1503.154715]  sctp_primitive_SEND+0x57/0x70
[ 1503.154807]  sctp_sendmsg+0xaa6/0x1b10
[ 1503.154897]  sock_sendmsg+0x68/0x80
[ 1503.154987]  ___sys_sendmsg+0x431/0x4b0
[ 1503.155078]  __sys_sendmsg+0xa4/0x130
[ 1503.155168]  do_syscall_64+0x171/0x3f0
[ 1503.155259]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 1503.155436] Freed by task 1932:
[ 1503.155527]  __kasan_slab_free+0x134/0x180
[ 1503.155618]  kfree+0xbc/0x180
[ 1503.155709]  skb_release_data+0x27f/0x2c0
[ 1503.155800]  consume_skb+0x94/0xe0
[ 1503.155889]  sctp_chunk_put+0x1aa/0x1f0
[ 1503.155979]  sctp_inq_pop+0x2f8/0x6e0
[ 1503.156070]  sctp_assoc_bh_rcv+0x6a/0x230
[ 1503.156164]  sctp_inq_push+0x117/0x150
[ 1503.156255]  sctp_backlog_rcv+0xdf/0x4a0
[ 1503.156346]  __release_sock+0x142/0x250
[ 1503.156436]  release_sock+0x80/0x180
[ 1503.156526]  sctp_sendmsg+0xbb0/0x1b10
[ 1503.156617]  sock_sendmsg+0x68/0x80
[ 1503.156708]  ___sys_sendmsg+0x431/0x4b0
[ 1503.156799]  __sys_sendmsg+0xa4/0x130
[ 1503.156889]  do_syscall_64+0x171/0x3f0
[ 1503.156980]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 1503.157158] The buggy address belongs to the object at 8800298ab600
which belongs to the cache kmalloc-1024 of size 1024
[ 1503.157444] The buggy address is located 176 bytes inside of
1024-byte region [8800298ab600, 8800298aba00)
[ 1503.157702] The buggy address belongs to the page:
[ 1503.157820] page:eaa62a00 count:1 mapcount:0 
mapping: index:0x0 compound_mapcount: 0
[ 1503.158053] flags: 0x40008100(slab|head)
[ 1503.158171] raw: 40008100   
0001800e000e
[ 1503.158350] raw: dead0100 dead0200 880036002600 

[ 1503.158523] page dumped because: kasan: bad access detected

[ 1503.158698] Memory state around the buggy address:
[ 1503.158816]  8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00
[ 1503.158988]  8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00
[ 1503.159165] >8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc
[ 1503.159338]^
[ 1503.159436]  8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[ 1503.159610]  8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb
[ 1503.159785] 
==
[ 1503.159964] Disabling lock debugging due to kernel taint

Reported-by: Jianlin Shi <ji...@redhat.com>
Reviewed-by: Stefano Brivio <sbri...@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/ipv6/ndisc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 0a19ce3a6f7f..afd8c15827cd 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1554,7 +1554,8 @@ static void ndisc_fill_redirect_hdr_option(struct sk_buff 
*skb,
*(opt++) = (rd_len >> 3);
opt += 6;
 
-   memcpy(opt, ipv6_hdr(orig_skb), rd_len - 8);
+   skb_copy_bits(orig_skb, skb_network_offset(orig_skb), opt,
+ rd_len - 8);
 }
 
 void ndisc_send_redirect(struct sk_buff *skb, const struct in6_addr *target)
-- 
2.14.3

[PATCH net-next] l2tp: remove switch block in l2tp_nl_cmd_session_create()

2018-01-17 Thread Lorenzo Bianconi

Remove the switch block in l2tp_nl_cmd_session_create() that
checks pseudowire-specific parameters since just L2TP_PWTYPE_ETH and
L2TP_PWTYPE_PPP are currently supported and no actual checks are
performed. Moreover the L2TP_PWTYPE_IP/default case presents a harmless
issue in error handling (break instead of goto out_tunnel)

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_netlink.c | 21 -
 1 file changed, 21 deletions(-)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index 405a5341ed1e..e7ea9c4b89ff 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -620,27 +620,6 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, 
struct genl_info *inf
goto out_tunnel;
}
 
-   /* Check that pseudowire-specific params are present */
-   switch (cfg.pw_type) {
-   case L2TP_PWTYPE_NONE:
-   break;
-   case L2TP_PWTYPE_ETH_VLAN:
-   if (!info->attrs[L2TP_ATTR_VLAN_ID]) {
-   ret = -EINVAL;
-   goto out_tunnel;
-   }
-   break;
-   case L2TP_PWTYPE_ETH:
-   break;
-   case L2TP_PWTYPE_PPP:
-   case L2TP_PWTYPE_PPP_AC:
-   break;
-   case L2TP_PWTYPE_IP:
-   default:
-   ret = -EPROTONOSUPPORT;
-   break;
-   }
-
ret = l2tp_nl_cmd_ops[cfg.pw_type]->session_create(net, tunnel,
   session_id,
   peer_session_id,
-- 
2.13.6

[PATCH v3 net-next 2/4] l2tp: remove l2specific_len dependency in l2tp_core

2018-01-16 Thread Lorenzo Bianconi

Remove l2specific_len dependency while building l2tpv3 header or
parsing the received frame since default L2-Specific Sublayer is
always four bytes long and we don't need to rely on a user supplied
value.
Moreover in l2tp netlink code there are no sanity checks to
enforce the relation between l2specific_len and l2specific_type,
so sending a malformed netlink message is possible to set
l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even
L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than
4 leaking memory on the wire and sending corrupted frames.

Reviewed-by: Guillaume Nault <g.na...@alphalink.fr>
Tested-by: Guillaume Nault <g.na...@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_core.c | 34 --
 net/l2tp/l2tp_core.h | 11 +++
 2 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 62285fc6eb59..88efb8b845ca 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -730,11 +730,9 @@ void l2tp_recv_common(struct l2tp_session *session, struct 
sk_buff *skb,
 "%s: recv data ns=%u, session nr=%u\n",
 session->name, ns, session->nr);
}
+   ptr += 4;
}
 
-   /* Advance past L2-specific header, if present */
-   ptr += session->l2specific_len;
-
if (L2TP_SKB_CB(skb)->has_seq) {
/* Received a packet with sequence numbers. If we're the LNS,
 * check if we sre sending sequence numbers and if not,
@@ -1048,21 +1046,20 @@ static int l2tp_build_l2tpv3_header(struct l2tp_session 
*session, void *buf)
memcpy(bufp, >cookie[0], session->cookie_len);
bufp += session->cookie_len;
}
-   if (session->l2specific_len) {
-   if (session->l2specific_type == L2TP_L2SPECTYPE_DEFAULT) {
-   u32 l2h = 0;
-   if (session->send_seq) {
-   l2h = 0x4000 | session->ns;
-   session->ns++;
-   session->ns &= 0xff;
-   l2tp_dbg(session, L2TP_MSG_SEQ,
-"%s: updated ns to %u\n",
-session->name, session->ns);
-   }
+   if (session->l2specific_type == L2TP_L2SPECTYPE_DEFAULT) {
+   u32 l2h = 0;
 
-   *((__be32 *) bufp) = htonl(l2h);
+   if (session->send_seq) {
+   l2h = 0x4000 | session->ns;
+   session->ns++;
+   session->ns &= 0xff;
+   l2tp_dbg(session, L2TP_MSG_SEQ,
+"%s: updated ns to %u\n",
+session->name, session->ns);
}
-   bufp += session->l2specific_len;
+
+   *((__be32 *)bufp) = htonl(l2h);
+   bufp += 4;
}
 
return bufp - optr;
@@ -1719,7 +1716,7 @@ int l2tp_session_delete(struct l2tp_session *session)
 EXPORT_SYMBOL_GPL(l2tp_session_delete);
 
 /* We come here whenever a session's send_seq, cookie_len or
- * l2specific_len parameters are set.
+ * l2specific_type parameters are set.
  */
 void l2tp_session_set_header_len(struct l2tp_session *session, int version)
 {
@@ -1728,7 +1725,8 @@ void l2tp_session_set_header_len(struct l2tp_session 
*session, int version)
if (session->send_seq)
session->hdr_len += 4;
} else {
-   session->hdr_len = 4 + session->cookie_len + 
session->l2specific_len;
+   session->hdr_len = 4 + session->cookie_len;
+   session->hdr_len += l2tp_get_l2specific_len(session);
if (session->tunnel->encap == L2TP_ENCAPTYPE_UDP)
session->hdr_len += 4;
}
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index c2e9bbd79b35..7bef304de4f0 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -302,6 +302,17 @@ static inline void l2tp_session_dec_refcount(struct 
l2tp_session *session)
l2tp_session_free(session);
 }
 
+static inline int l2tp_get_l2specific_len(struct l2tp_session *session)
+{
+   switch (session->l2specific_type) {
+   case L2TP_L2SPECTYPE_DEFAULT:
+   return 4;
+   case L2TP_L2SPECTYPE_NONE:
+   default:
+   return 0;
+   }
+}
+
 #define l2tp_printk(ptr, type, func, fmt, ...) \
 do {   \
if (((ptr)->debug) & (type))\
-- 
2.13.6

[PATCH v3 net-next 3/4] l2tp: remove l2specific_len configurable parameter

2018-01-16 Thread Lorenzo Bianconi

Remove l2specific_len configuration parameter since now L2-Specific
Sublayer length is computed according to l2specific_type provided by
userspace.

Reviewed-by: Guillaume Nault <g.na...@alphalink.fr>
Tested-by: Guillaume Nault <g.na...@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_core.c| 1 -
 net/l2tp/l2tp_core.h| 2 --
 net/l2tp/l2tp_debugfs.c | 2 +-
 net/l2tp/l2tp_netlink.c | 4 
 4 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 88efb8b845ca..194a7483bb93 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1777,7 +1777,6 @@ struct l2tp_session *l2tp_session_create(int priv_size, 
struct l2tp_tunnel *tunn
session->lns_mode = cfg->lns_mode;
session->reorder_timeout = cfg->reorder_timeout;
session->l2specific_type = cfg->l2specific_type;
-   session->l2specific_len = cfg->l2specific_len;
session->cookie_len = cfg->cookie_len;
memcpy(>cookie[0], >cookie[0], 
cfg->cookie_len);
session->peer_cookie_len = cfg->peer_cookie_len;
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index 7bef304de4f0..9bbee90e9963 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -59,7 +59,6 @@ struct l2tp_session_cfg {
int debug;  /* bitmask of debug message
 * categories */
u16 vlan_id;/* VLAN pseudowire only */
-   u16 l2specific_len; /* Layer 2 specific length */
u16 l2specific_type; /* Layer 2 specific type */
u8  cookie[8];  /* optional cookie */
int cookie_len; /* 0, 4 or 8 bytes */
@@ -85,7 +84,6 @@ struct l2tp_session {
int cookie_len;
u8  peer_cookie[8];
int peer_cookie_len;
-   u16 l2specific_len;
u16 l2specific_type;
u16 hdr_len;
u32 nr; /* session NR state (receive) */
diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c
index 2c30587d1a14..72e713da4733 100644
--- a/net/l2tp/l2tp_debugfs.c
+++ b/net/l2tp/l2tp_debugfs.c
@@ -181,7 +181,7 @@ static void l2tp_dfs_seq_session_show(struct seq_file *m, 
void *v)
   session->debug,
   jiffies_to_msecs(session->reorder_timeout));
seq_printf(m, "   offset 0 l2specific %hu/%hu\n",
-  session->l2specific_type, session->l2specific_len);
+  session->l2specific_type, l2tp_get_l2specific_len(session));
if (session->cookie_len) {
seq_printf(m, "   cookie %02x%02x%02x%02x",
   session->cookie[0], session->cookie[1],
diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index 9ba2b8a68f65..405a5341ed1e 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -561,10 +561,6 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, 
struct genl_info *inf
cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
}
 
-   cfg.l2specific_len = 4;
-   if (info->attrs[L2TP_ATTR_L2SPEC_LEN])
-   cfg.l2specific_len = 
nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_LEN]);
-
if (info->attrs[L2TP_ATTR_COOKIE]) {
u16 len = nla_len(info->attrs[L2TP_ATTR_COOKIE]);
if (len > 8) {
-- 
2.13.6

[PATCH v3 net-next 4/4] l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used

2018-01-16 Thread Lorenzo Bianconi

Reviewed-by: Guillaume Nault <g.na...@alphalink.fr>
Tested-by: Guillaume Nault <g.na...@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 include/uapi/linux/l2tp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/l2tp.h b/include/uapi/linux/l2tp.h
index 71e62795104d..7d570c7bd117 100644
--- a/include/uapi/linux/l2tp.h
+++ b/include/uapi/linux/l2tp.h
@@ -97,7 +97,7 @@ enum {
L2TP_ATTR_OFFSET,   /* u16 (not used) */
L2TP_ATTR_DATA_SEQ, /* u16 */
L2TP_ATTR_L2SPEC_TYPE,  /* u8, enum l2tp_l2spec_type */
-   L2TP_ATTR_L2SPEC_LEN,   /* u8, enum l2tp_l2spec_type */
+   L2TP_ATTR_L2SPEC_LEN,   /* u8 (not used) */
L2TP_ATTR_PROTO_VERSION,/* u8 */
L2TP_ATTR_IFNAME,   /* string */
L2TP_ATTR_CONN_ID,  /* u32 */
-- 
2.13.6

[PATCH v3 net-next 1/4] l2tp: double-check l2specific_type provided by userspace

2018-01-16 Thread Lorenzo Bianconi

Add sanity check on l2specific_type provided by userspace in
l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and
L2TP_L2SPECTYPE_NONE are currently supported.
Moreover explicitly set l2specific_type to L2TP_L2SPECTYPE_DEFAULT
only if the userspace does not provide a value for it

Reviewed-by: Guillaume Nault <g.na...@alphalink.fr>
Tested-by: Guillaume Nault <g.na...@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_netlink.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index e1ca29f79821..9ba2b8a68f65 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -550,9 +550,16 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, 
struct genl_info *inf
if (info->attrs[L2TP_ATTR_DATA_SEQ])
cfg.data_seq = 
nla_get_u8(info->attrs[L2TP_ATTR_DATA_SEQ]);
 
-   cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
-   if (info->attrs[L2TP_ATTR_L2SPEC_TYPE])
+   if (info->attrs[L2TP_ATTR_L2SPEC_TYPE]) {
cfg.l2specific_type = 
nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_TYPE]);
+   if (cfg.l2specific_type != L2TP_L2SPECTYPE_DEFAULT &&
+   cfg.l2specific_type != L2TP_L2SPECTYPE_NONE) {
+   ret = -EINVAL;
+   goto out_tunnel;
+   }
+   } else {
+   cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
+   }
 
cfg.l2specific_len = 4;
if (info->attrs[L2TP_ATTR_L2SPEC_LEN])
-- 
2.13.6

[PATCH v3 net-next 0/4] l2tp: set l2specific_len based on l2specific_type

2018-01-16 Thread Lorenzo Bianconi

Do not rely on l2specific_len value provided by userspace but set sublayer
length according to l2specific_type.
Mark L2TP_ATTR_L2SPEC_LEN attribute as not used

Changes since v2:
- drop the patch related to a fix in the switch default case in
  l2tp_nl_cmd_session_create()
- use L2SPECTYPE_NONE as default case in l2tp_get_l2specific_len()

Changes since v1:
- remove l2specific_len parameter
- add sanity check on l2specific_type provided by userspace

Lorenzo Bianconi (4):
  l2tp: double-check l2specific_type provided by userspace
  l2tp: remove l2specific_len dependency in l2tp_core
  l2tp: remove l2specific_len configurable parameter
  l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used

 include/uapi/linux/l2tp.h |  2 +-
 net/l2tp/l2tp_core.c  | 35 ---
 net/l2tp/l2tp_core.h  | 13 +++--
 net/l2tp/l2tp_debugfs.c   |  2 +-
 net/l2tp/l2tp_netlink.c   | 15 +--
 5 files changed, 38 insertions(+), 29 deletions(-)

-- 
2.13.6

Re: [PATCH v2 net-next 0/5] l2tp: set l2specific_len based on l2specific_type

2018-01-16 Thread Lorenzo Bianconi

> On Sun, Jan 14, 2018 at 03:50:53PM +0100, Lorenzo Bianconi wrote:
>> Do not rely on l2specific_len value provided by userspace but set sublayer
>> length according to l2specific_type.
>> Fix a harmless issue in the switch default case in
>> l2tp_nl_cmd_session_create().
>>
>> Changes since v1:
>> - remove l2specific_len parameter
>> - add sanity check on l2specific_type provided by userspace
>>
> Thanks for working on this Lorenzo.
>
> Whatever the discussions on patches 1 and 3 lead to:
>
> Reviewed-by: Guillaume Nault <g.na...@alphalink.fr>
> Tested-by: Guillaume Nault <g.na...@alphalink.fr>

I will send a v3 later today dropping the first patch (I will send a
single one removing the switch later) and including your suggestion
for default case in  l2tp_get_l2specific_len().
Thanks for reviewing the patch :)

Regards,
Lorenzo

Re: [PATCH v2 net-next 1/5] l2tp: fix switch default error handling in l2tp_nl_cmd_session_create()

2018-01-15 Thread Lorenzo Bianconi

> On 15 January 2018 at 21:18, Lorenzo Bianconi
> <lorenzo.bianc...@redhat.com> wrote:
>>> On Sun, Jan 14, 2018 at 03:50:54PM +0100, Lorenzo Bianconi wrote:
>>>> Although this issue is harmless since that code path is protected by the
>>>> check on l2tp_nl_cmd_ops[]/l2tp_nl_cmd_ops[]->session_create(), fix error
>>>> handling for L2TP_PWTYPE_IP/default case in l2tp_nl_cmd_session_create()
>>>>
>>>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
>>>> ---
>>>>  net/l2tp/l2tp_netlink.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
>>>> index e1ca29f79821..48b5bf30ec50 100644
>>>> --- a/net/l2tp/l2tp_netlink.c
>>>> +++ b/net/l2tp/l2tp_netlink.c
>>>> @@ -635,7 +635,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff 
>>>> *skb, struct genl_info *inf
>>>>   case L2TP_PWTYPE_IP:
>>>>   default:
>>>>   ret = -EPROTONOSUPPORT;
>>>> - break;
>>>> + goto out_tunnel;
>>>>   }
>>>>
>>> Not sure if this change is really worthwhile. As you noted, this is
>>> unreachable code. And this switch should better be removed entirely:
>>> it doesn't do anything for supported pseudo-wires.
>>>
>>> And if PWTYPE_ETH_VLAN were to be implemented, it should perform its
>>> configuration consistency checking in its own PW specific code, not in
>>> the genl handler.
>>>
>>
>> Personally I would prefer to not remove some code that could be useful
>> for a future implementation, but just fix it if it presents issues to
>> address.
>> Anyway we can simply drop this patch from the series and I can send a
>> new one to remove the switch entirely.
>>
>> @James what do you think?
>
> Keep the patch series focused. If you read the patch series as a set,
> this patch stands out as not fitting the purpose of the series. I
> agree with Guillaume.
>

Ok, fine. What about the fix? Do you prefer to remove the switch or just fix it?

>>
>> Regards,
>> Lorenzo
>>
>>> Anyway, removing this switch isn't the purpose of this series, so I
>>> think you can drop this patch.
>
> I agree.

Re: [PATCH v2 net-next 1/5] l2tp: fix switch default error handling in l2tp_nl_cmd_session_create()

2018-01-15 Thread Lorenzo Bianconi

> On Sun, Jan 14, 2018 at 03:50:54PM +0100, Lorenzo Bianconi wrote:
>> Although this issue is harmless since that code path is protected by the
>> check on l2tp_nl_cmd_ops[]/l2tp_nl_cmd_ops[]->session_create(), fix error
>> handling for L2TP_PWTYPE_IP/default case in l2tp_nl_cmd_session_create()
>>
>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
>> ---
>>  net/l2tp/l2tp_netlink.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
>> index e1ca29f79821..48b5bf30ec50 100644
>> --- a/net/l2tp/l2tp_netlink.c
>> +++ b/net/l2tp/l2tp_netlink.c
>> @@ -635,7 +635,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff 
>> *skb, struct genl_info *inf
>>   case L2TP_PWTYPE_IP:
>>   default:
>>   ret = -EPROTONOSUPPORT;
>> - break;
>> + goto out_tunnel;
>>   }
>>
> Not sure if this change is really worthwhile. As you noted, this is
> unreachable code. And this switch should better be removed entirely:
> it doesn't do anything for supported pseudo-wires.
>
> And if PWTYPE_ETH_VLAN were to be implemented, it should perform its
> configuration consistency checking in its own PW specific code, not in
> the genl handler.
>

Personally I would prefer to not remove some code that could be useful
for a future implementation, but just fix it if it presents issues to
address.
Anyway we can simply drop this patch from the series and I can send a
new one to remove the switch entirely.

@James what do you think?

Regards,
Lorenzo

> Anyway, removing this switch isn't the purpose of this series, so I
> think you can drop this patch.

Re: [PATCH v2 net-next 3/5] l2tp: remove l2specific_len dependency in l2tp_core

2018-01-15 Thread Lorenzo Bianconi

> On Mon, Jan 15, 2018 at 07:43:18PM +0100, Lorenzo Bianconi wrote:
>> > On Sun, Jan 14, 2018 at 03:50:56PM +0100, Lorenzo Bianconi wrote:
>> >> --- a/net/l2tp/l2tp_core.h
>> >> +++ b/net/l2tp/l2tp_core.h
>> >> @@ -302,6 +302,17 @@ static inline void l2tp_session_dec_refcount(struct 
>> >> l2tp_session *session)
>> >>   l2tp_session_free(session);
>> >>  }
>> >>
>> >> +static inline int l2tp_get_l2specific_len(struct l2tp_session *session)
>> >> +{
>> >> + switch (session->l2specific_type) {
>> >> + case L2TP_L2SPECTYPE_NONE:
>> >> + return 0;
>> >> + case L2TP_L2SPECTYPE_DEFAULT:
>> >> + default:
>> >> + return 4;
>> >> + }
>> >> +}
>> >>
>> > The data path only compares ->l2specific_type to L2SPECTYPE_DEFAULT and
>> > treats any other value as L2SPECTYPE_NONE. Therefore, we should keep
>> > this logic here and return 0 for unknown types.
>>
>> The data path only compares l2specific_type to L2SPECTYPE_DEFAULT
>> since in the other supported case (L2SPECTYPE_NONE) there is no action
>> required. Moreover L2SPECTYPE_DEFAULT is default configured value if
>> the user does not provide any value for l2specific_type so there are
>> no 'unknown' types and I thought L2TP_L2SPECTYPE_DEFAULT was a better
>> choice for default value
>>
> Yes, but what I meant is that the data patch treats unknow values as
> L2SPECTYPE_NONE, while l2tp_get_l2specific_len() now treats them as
> L2TP_L2SPECTYPE_DEFAULT. I'd just prefer to avoid that inconsistency;
> it makes it easier to reason about the code.
>
> But if you really prefer L2SPECTYPE_DEFAULT, then fine. Unless someone
> messes with new l2spec types, we should never reach this case.

Yes, there are now way to hit just default case so there are no
difference to use L2SPECTYPE_DEFAULT or L2SPECTYPE_NONE as default
actually.
Moreover from now on there are no 'unknow' values for l2specific_type.
Anyway if you think that change is important I can respin a v3, no
issue from my side.

Regards,
Lorenzo

Re: [PATCH v2 net-next 3/5] l2tp: remove l2specific_len dependency in l2tp_core

2018-01-15 Thread Lorenzo Bianconi

> On Sun, Jan 14, 2018 at 03:50:56PM +0100, Lorenzo Bianconi wrote:
>> --- a/net/l2tp/l2tp_core.h
>> +++ b/net/l2tp/l2tp_core.h
>> @@ -302,6 +302,17 @@ static inline void l2tp_session_dec_refcount(struct 
>> l2tp_session *session)
>>   l2tp_session_free(session);
>>  }
>>
>> +static inline int l2tp_get_l2specific_len(struct l2tp_session *session)
>> +{
>> + switch (session->l2specific_type) {
>> + case L2TP_L2SPECTYPE_NONE:
>> + return 0;
>> + case L2TP_L2SPECTYPE_DEFAULT:
>> + default:
>> + return 4;
>> + }
>> +}
>>
> The data path only compares ->l2specific_type to L2SPECTYPE_DEFAULT and
> treats any other value as L2SPECTYPE_NONE. Therefore, we should keep
> this logic here and return 0 for unknown types.

The data path only compares l2specific_type to L2SPECTYPE_DEFAULT
since in the other supported case (L2SPECTYPE_NONE) there is no action
required. Moreover L2SPECTYPE_DEFAULT is default configured value if
the user does not provide any value for l2specific_type so there are
no 'unknown' types and I thought L2TP_L2SPECTYPE_DEFAULT was a better
choice for default value

Re: [PATCH v2 net-next 2/5] l2tp: double-check l2specific_type provided by userspace

2018-01-15 Thread Lorenzo Bianconi

> On Sun, Jan 14, 2018 at 03:50:55PM +0100, Lorenzo Bianconi wrote:
>> Add sanity check on l2specific_type provided by userspace in
>> l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and
>> L2TP_L2SPECTYPE_NONE are currently supported.
>> Moreover do not always initialize l2specific_type if userspace requests
>> a given l2-specific sublayer type
>>
> I don't understand your last sentence. l2specific_type is always
> initialised in your patch (or session creation is aborted).
>

I mean to explicitly set l2specific_type to L2TP_L2SPECTYPE_DEFAULT
only if the userspace does not provide a value for it (I moved the
'default' initialization in the 'else' case)

>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
>> ---
>>  net/l2tp/l2tp_netlink.c | 11 +--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
>> index 48b5bf30ec50..711cf208f23a 100644
>> --- a/net/l2tp/l2tp_netlink.c
>> +++ b/net/l2tp/l2tp_netlink.c
>> @@ -550,9 +550,16 @@ static int l2tp_nl_cmd_session_create(struct sk_buff 
>> *skb, struct genl_info *inf
>>   if (info->attrs[L2TP_ATTR_DATA_SEQ])
>>   cfg.data_seq = 
>> nla_get_u8(info->attrs[L2TP_ATTR_DATA_SEQ]);
>>
>> - cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
>> - if (info->attrs[L2TP_ATTR_L2SPEC_TYPE])
>> + if (info->attrs[L2TP_ATTR_L2SPEC_TYPE]) {
>>   cfg.l2specific_type = 
>> nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_TYPE]);
>> + if (cfg.l2specific_type != L2TP_L2SPECTYPE_DEFAULT &&
>> + cfg.l2specific_type != L2TP_L2SPECTYPE_NONE) {
>> + ret = -EINVAL;
>> + goto out_tunnel;
>> + }
>> + } else {
>> + cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
>> + }
>>
>>   cfg.l2specific_len = 4;
>>   if (info->attrs[L2TP_ATTR_L2SPEC_LEN])
>> --
>> 2.13.6
>>

[PATCH v2 net-next 1/5] l2tp: fix switch default error handling in l2tp_nl_cmd_session_create()

2018-01-14 Thread Lorenzo Bianconi

Although this issue is harmless since that code path is protected by the
check on l2tp_nl_cmd_ops[]/l2tp_nl_cmd_ops[]->session_create(), fix error
handling for L2TP_PWTYPE_IP/default case in l2tp_nl_cmd_session_create()

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index e1ca29f79821..48b5bf30ec50 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -635,7 +635,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, 
struct genl_info *inf
case L2TP_PWTYPE_IP:
default:
ret = -EPROTONOSUPPORT;
-   break;
+   goto out_tunnel;
}
 
ret = l2tp_nl_cmd_ops[cfg.pw_type]->session_create(net, tunnel,
-- 
2.13.6

[PATCH v2 net-next 2/5] l2tp: double-check l2specific_type provided by userspace

2018-01-14 Thread Lorenzo Bianconi

Add sanity check on l2specific_type provided by userspace in
l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and
L2TP_L2SPECTYPE_NONE are currently supported.
Moreover do not always initialize l2specific_type if userspace requests
a given l2-specific sublayer type

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_netlink.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index 48b5bf30ec50..711cf208f23a 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -550,9 +550,16 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, 
struct genl_info *inf
if (info->attrs[L2TP_ATTR_DATA_SEQ])
cfg.data_seq = 
nla_get_u8(info->attrs[L2TP_ATTR_DATA_SEQ]);
 
-   cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
-   if (info->attrs[L2TP_ATTR_L2SPEC_TYPE])
+   if (info->attrs[L2TP_ATTR_L2SPEC_TYPE]) {
cfg.l2specific_type = 
nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_TYPE]);
+   if (cfg.l2specific_type != L2TP_L2SPECTYPE_DEFAULT &&
+   cfg.l2specific_type != L2TP_L2SPECTYPE_NONE) {
+   ret = -EINVAL;
+   goto out_tunnel;
+   }
+   } else {
+   cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
+   }
 
cfg.l2specific_len = 4;
if (info->attrs[L2TP_ATTR_L2SPEC_LEN])
-- 
2.13.6

[PATCH v2 net-next 0/5] l2tp: set l2specific_len based on l2specific_type

2018-01-14 Thread Lorenzo Bianconi

Do not rely on l2specific_len value provided by userspace but set sublayer
length according to l2specific_type.
Fix a harmless issue in the switch default case in
l2tp_nl_cmd_session_create().

Changes since v1:
- remove l2specific_len parameter
- add sanity check on l2specific_type provided by userspace

Lorenzo Bianconi (5):
  l2tp: fix switch default error handling in
l2tp_nl_cmd_session_create()
  l2tp: double-check l2specific_type provided by userspace
  l2tp: remove l2specific_len dependency in l2tp_core
  l2tp: remove l2specific_len configurable parameter
  l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used

 include/uapi/linux/l2tp.h |  2 +-
 net/l2tp/l2tp_core.c  | 35 ---
 net/l2tp/l2tp_core.h  | 13 +++--
 net/l2tp/l2tp_debugfs.c   |  2 +-
 net/l2tp/l2tp_netlink.c   | 17 ++---
 5 files changed, 39 insertions(+), 30 deletions(-)

-- 
2.13.6

[PATCH v2 net-next 3/5] l2tp: remove l2specific_len dependency in l2tp_core

2018-01-14 Thread Lorenzo Bianconi

Remove l2specific_len dependency while building l2tpv3 header or
parsing the received frame since default L2-Specific Sublayer is
always four bytes long and we don't need to rely on a user supplied
value.
Moreover in l2tp netlink code there are no sanity checks to
enforce the relation between l2specific_len and l2specific_type,
so sending a malformed netlink message is possible to set
l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even
L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than
4 leaking memory on the wire and sending corrupted frames.

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_core.c | 34 --
 net/l2tp/l2tp_core.h | 11 +++
 2 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 62285fc6eb59..88efb8b845ca 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -730,11 +730,9 @@ void l2tp_recv_common(struct l2tp_session *session, struct 
sk_buff *skb,
 "%s: recv data ns=%u, session nr=%u\n",
 session->name, ns, session->nr);
}
+   ptr += 4;
}
 
-   /* Advance past L2-specific header, if present */
-   ptr += session->l2specific_len;
-
if (L2TP_SKB_CB(skb)->has_seq) {
/* Received a packet with sequence numbers. If we're the LNS,
 * check if we sre sending sequence numbers and if not,
@@ -1048,21 +1046,20 @@ static int l2tp_build_l2tpv3_header(struct l2tp_session 
*session, void *buf)
memcpy(bufp, >cookie[0], session->cookie_len);
bufp += session->cookie_len;
}
-   if (session->l2specific_len) {
-   if (session->l2specific_type == L2TP_L2SPECTYPE_DEFAULT) {
-   u32 l2h = 0;
-   if (session->send_seq) {
-   l2h = 0x4000 | session->ns;
-   session->ns++;
-   session->ns &= 0xff;
-   l2tp_dbg(session, L2TP_MSG_SEQ,
-"%s: updated ns to %u\n",
-session->name, session->ns);
-   }
+   if (session->l2specific_type == L2TP_L2SPECTYPE_DEFAULT) {
+   u32 l2h = 0;
 
-   *((__be32 *) bufp) = htonl(l2h);
+   if (session->send_seq) {
+   l2h = 0x4000 | session->ns;
+   session->ns++;
+   session->ns &= 0xff;
+   l2tp_dbg(session, L2TP_MSG_SEQ,
+"%s: updated ns to %u\n",
+session->name, session->ns);
}
-   bufp += session->l2specific_len;
+
+   *((__be32 *)bufp) = htonl(l2h);
+   bufp += 4;
}
 
return bufp - optr;
@@ -1719,7 +1716,7 @@ int l2tp_session_delete(struct l2tp_session *session)
 EXPORT_SYMBOL_GPL(l2tp_session_delete);
 
 /* We come here whenever a session's send_seq, cookie_len or
- * l2specific_len parameters are set.
+ * l2specific_type parameters are set.
  */
 void l2tp_session_set_header_len(struct l2tp_session *session, int version)
 {
@@ -1728,7 +1725,8 @@ void l2tp_session_set_header_len(struct l2tp_session 
*session, int version)
if (session->send_seq)
session->hdr_len += 4;
} else {
-   session->hdr_len = 4 + session->cookie_len + 
session->l2specific_len;
+   session->hdr_len = 4 + session->cookie_len;
+   session->hdr_len += l2tp_get_l2specific_len(session);
if (session->tunnel->encap == L2TP_ENCAPTYPE_UDP)
session->hdr_len += 4;
}
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index c2e9bbd79b35..06128a159a3c 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -302,6 +302,17 @@ static inline void l2tp_session_dec_refcount(struct 
l2tp_session *session)
l2tp_session_free(session);
 }
 
+static inline int l2tp_get_l2specific_len(struct l2tp_session *session)
+{
+   switch (session->l2specific_type) {
+   case L2TP_L2SPECTYPE_NONE:
+   return 0;
+   case L2TP_L2SPECTYPE_DEFAULT:
+   default:
+   return 4;
+   }
+}
+
 #define l2tp_printk(ptr, type, func, fmt, ...) \
 do {   \
if (((ptr)->debug) & (type))\
-- 
2.13.6

[PATCH v2 net-next 5/5] l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used

2018-01-14 Thread Lorenzo Bianconi

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 include/uapi/linux/l2tp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/l2tp.h b/include/uapi/linux/l2tp.h
index 71e62795104d..7d570c7bd117 100644
--- a/include/uapi/linux/l2tp.h
+++ b/include/uapi/linux/l2tp.h
@@ -97,7 +97,7 @@ enum {
L2TP_ATTR_OFFSET,   /* u16 (not used) */
L2TP_ATTR_DATA_SEQ, /* u16 */
L2TP_ATTR_L2SPEC_TYPE,  /* u8, enum l2tp_l2spec_type */
-   L2TP_ATTR_L2SPEC_LEN,   /* u8, enum l2tp_l2spec_type */
+   L2TP_ATTR_L2SPEC_LEN,   /* u8 (not used) */
L2TP_ATTR_PROTO_VERSION,/* u8 */
L2TP_ATTR_IFNAME,   /* string */
L2TP_ATTR_CONN_ID,  /* u32 */
-- 
2.13.6

[PATCH v2 net-next 4/5] l2tp: remove l2specific_len configurable parameter

2018-01-14 Thread Lorenzo Bianconi

Remove l2specific_len configuration parameter since now L2-Specific
Sublayer length is computed according to l2specific_type provided by
userspace.

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_core.c| 1 -
 net/l2tp/l2tp_core.h| 2 --
 net/l2tp/l2tp_debugfs.c | 2 +-
 net/l2tp/l2tp_netlink.c | 4 
 4 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 88efb8b845ca..194a7483bb93 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1777,7 +1777,6 @@ struct l2tp_session *l2tp_session_create(int priv_size, 
struct l2tp_tunnel *tunn
session->lns_mode = cfg->lns_mode;
session->reorder_timeout = cfg->reorder_timeout;
session->l2specific_type = cfg->l2specific_type;
-   session->l2specific_len = cfg->l2specific_len;
session->cookie_len = cfg->cookie_len;
memcpy(>cookie[0], >cookie[0], 
cfg->cookie_len);
session->peer_cookie_len = cfg->peer_cookie_len;
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index 06128a159a3c..f5768fa1d98f 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -59,7 +59,6 @@ struct l2tp_session_cfg {
int debug;  /* bitmask of debug message
 * categories */
u16 vlan_id;/* VLAN pseudowire only */
-   u16 l2specific_len; /* Layer 2 specific length */
u16 l2specific_type; /* Layer 2 specific type */
u8  cookie[8];  /* optional cookie */
int cookie_len; /* 0, 4 or 8 bytes */
@@ -85,7 +84,6 @@ struct l2tp_session {
int cookie_len;
u8  peer_cookie[8];
int peer_cookie_len;
-   u16 l2specific_len;
u16 l2specific_type;
u16 hdr_len;
u32 nr; /* session NR state (receive) */
diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c
index 2c30587d1a14..72e713da4733 100644
--- a/net/l2tp/l2tp_debugfs.c
+++ b/net/l2tp/l2tp_debugfs.c
@@ -181,7 +181,7 @@ static void l2tp_dfs_seq_session_show(struct seq_file *m, 
void *v)
   session->debug,
   jiffies_to_msecs(session->reorder_timeout));
seq_printf(m, "   offset 0 l2specific %hu/%hu\n",
-  session->l2specific_type, session->l2specific_len);
+  session->l2specific_type, l2tp_get_l2specific_len(session));
if (session->cookie_len) {
seq_printf(m, "   cookie %02x%02x%02x%02x",
   session->cookie[0], session->cookie[1],
diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index 711cf208f23a..f36f0b5f950f 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -561,10 +561,6 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, 
struct genl_info *inf
cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
}
 
-   cfg.l2specific_len = 4;
-   if (info->attrs[L2TP_ATTR_L2SPEC_LEN])
-   cfg.l2specific_len = 
nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_LEN]);
-
if (info->attrs[L2TP_ATTR_COOKIE]) {
u16 len = nla_len(info->attrs[L2TP_ATTR_COOKIE]);
if (len > 8) {
-- 
2.13.6

Re: [PATCH net-next 2/3] l2tp: configure l2specific_len according to l2specific_type

2018-01-10 Thread Lorenzo Bianconi

> On Wed, Jan 10, 2018 at 06:20:41PM +0100, Lorenzo Bianconi wrote:
>> Set l2specific_len according to the L2-Specific Sublayer type specified
>> by the userspace and not rely on a user supplied value for sublayer len
>> since invalid usage of L2TP_ATTR_L2SPEC_LEN allows leaking memory on the
>> network and sending corrupted frames.
>> Moreover add a sanity check on l2specific_type provided by userspace
>>
>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
>> ---
>>  net/l2tp/l2tp_netlink.c | 21 -
>>  1 file changed, 16 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
>> index 48b5bf30ec50..404e5482c4e7 100644
>> --- a/net/l2tp/l2tp_netlink.c
>> +++ b/net/l2tp/l2tp_netlink.c
>> @@ -550,13 +550,24 @@ static int l2tp_nl_cmd_session_create(struct sk_buff 
>> *skb, struct genl_info *inf
>>   if (info->attrs[L2TP_ATTR_DATA_SEQ])
>>   cfg.data_seq = 
>> nla_get_u8(info->attrs[L2TP_ATTR_DATA_SEQ]);
>>
>> - cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
>> - if (info->attrs[L2TP_ATTR_L2SPEC_TYPE])
>> + if (info->attrs[L2TP_ATTR_L2SPEC_TYPE]) {
>>   cfg.l2specific_type = 
>> nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_TYPE]);
>>
>> - cfg.l2specific_len = 4;
>> - if (info->attrs[L2TP_ATTR_L2SPEC_LEN])
>> - cfg.l2specific_len = 
>> nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_LEN]);
>> + switch (cfg.l2specific_type) {
>> + case L2TP_L2SPECTYPE_DEFAULT:
>> + cfg.l2specific_len = 4;
>> + break;
>> + case L2TP_L2SPECTYPE_NONE:
>> + cfg.l2specific_len = 0;
>> + break;
>> + default:
>> + ret = -EINVAL;
>> + goto out_tunnel;
>> + }
>> + } else {
>> + cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
>> + cfg.l2specific_len = 4;
>> + }
>>
>>   if (info->attrs[L2TP_ATTR_COOKIE]) {
>>   u16 len = nla_len(info->attrs[L2TP_ATTR_COOKIE]);
>>
> I think we can go one step further and remove .l2specific_len from
> struct l2tp_session and struct l2tp_session_cfg. We never need this
> information. Then we can start ignoring the L2TP_ATTR_L2SPEC_LEN
> attribute just like we've done with L2TP_ATTR_OFFSET.

Hi Guillaume,

I was thinking about it, let's way for some feedbacks and then I can
respin a v2.
Regards,

Lorenzo

[PATCH net-next 1/3] l2tp: fix switch default error handling in l2tp_nl_cmd_session_create()

2018-01-10 Thread Lorenzo Bianconi

Although this issue is harmless since that code path is protected by the
check on l2tp_nl_cmd_ops[]/l2tp_nl_cmd_ops[]->session_create(), fix error
handling for L2TP_PWTYPE_IP/default case in l2tp_nl_cmd_session_create()

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index e1ca29f79821..48b5bf30ec50 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -635,7 +635,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, 
struct genl_info *inf
case L2TP_PWTYPE_IP:
default:
ret = -EPROTONOSUPPORT;
-   break;
+   goto out_tunnel;
}
 
ret = l2tp_nl_cmd_ops[cfg.pw_type]->session_create(net, tunnel,
-- 
2.13.6

[PATCH net-next 3/3] l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used

2018-01-10 Thread Lorenzo Bianconi

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 include/uapi/linux/l2tp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/l2tp.h b/include/uapi/linux/l2tp.h
index 71e62795104d..ec609a017691 100644
--- a/include/uapi/linux/l2tp.h
+++ b/include/uapi/linux/l2tp.h
@@ -97,7 +97,7 @@ enum {
L2TP_ATTR_OFFSET,   /* u16 (not used) */
L2TP_ATTR_DATA_SEQ, /* u16 */
L2TP_ATTR_L2SPEC_TYPE,  /* u8, enum l2tp_l2spec_type */
-   L2TP_ATTR_L2SPEC_LEN,   /* u8, enum l2tp_l2spec_type */
+   L2TP_ATTR_L2SPEC_LEN,   /* u8, (not used) */
L2TP_ATTR_PROTO_VERSION,/* u8 */
L2TP_ATTR_IFNAME,   /* string */
L2TP_ATTR_CONN_ID,  /* u32 */
-- 
2.13.6

[PATCH net-next 2/3] l2tp: configure l2specific_len according to l2specific_type

2018-01-10 Thread Lorenzo Bianconi

Set l2specific_len according to the L2-Specific Sublayer type specified
by the userspace and not rely on a user supplied value for sublayer len
since invalid usage of L2TP_ATTR_L2SPEC_LEN allows leaking memory on the
network and sending corrupted frames.
Moreover add a sanity check on l2specific_type provided by userspace

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_netlink.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index 48b5bf30ec50..404e5482c4e7 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -550,13 +550,24 @@ static int l2tp_nl_cmd_session_create(struct sk_buff 
*skb, struct genl_info *inf
if (info->attrs[L2TP_ATTR_DATA_SEQ])
cfg.data_seq = 
nla_get_u8(info->attrs[L2TP_ATTR_DATA_SEQ]);
 
-   cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
-   if (info->attrs[L2TP_ATTR_L2SPEC_TYPE])
+   if (info->attrs[L2TP_ATTR_L2SPEC_TYPE]) {
cfg.l2specific_type = 
nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_TYPE]);
 
-   cfg.l2specific_len = 4;
-   if (info->attrs[L2TP_ATTR_L2SPEC_LEN])
-   cfg.l2specific_len = 
nla_get_u8(info->attrs[L2TP_ATTR_L2SPEC_LEN]);
+   switch (cfg.l2specific_type) {
+   case L2TP_L2SPECTYPE_DEFAULT:
+   cfg.l2specific_len = 4;
+   break;
+   case L2TP_L2SPECTYPE_NONE:
+   cfg.l2specific_len = 0;
+   break;
+   default:
+   ret = -EINVAL;
+   goto out_tunnel;
+   }
+   } else {
+   cfg.l2specific_type = L2TP_L2SPECTYPE_DEFAULT;
+   cfg.l2specific_len = 4;
+   }
 
if (info->attrs[L2TP_ATTR_COOKIE]) {
u16 len = nla_len(info->attrs[L2TP_ATTR_COOKIE]);
-- 
2.13.6

[PATCH net-next 0/3] set l2specific_len based on l2specific_type

2018-01-10 Thread Lorenzo Bianconi

Do not rely on l2specific_len value provided by userspace but set sublayer
length according to l2specific_type.

Lorenzo Bianconi (3):
  l2tp: fix switch default error handling in
l2tp_nl_cmd_session_create()
  l2tp: configure l2specific_len according to l2specific_type
  l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used

 include/uapi/linux/l2tp.h |  2 +-
 net/l2tp/l2tp_netlink.c   | 23 +--
 2 files changed, 18 insertions(+), 7 deletions(-)

-- 
2.13.6

Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter

2018-01-08 Thread Lorenzo Bianconi

> On Wed, Jan 03, 2018 at 04:06:28PM +0100, Lorenzo Bianconi wrote:
>> I agree to remove offset parameter in this case. What about (as
>> already suggested by James) to take into account possible alignment
>> issues with previous version of L2TPv3 protocol using 'L2 specific
>> sublayer'?
>>
> I think James was refering to the general architecture of L2TPv3, where
> such issues should be handled by pseudo-wire specific headers. I don't
> think he was talking about implementing arbitrary padding using an
> L2 specific sublayer. None of the standardised headers allow arbitrary
> padding. And implementing our own would make us imcompatible with any
> other implementation.
>
>> I guess, on the kernel side (we will need to patch iproute2 on
>> userspace side), we need just to properly initialized the 'l2specific'
>> field to 0 since otherwise we will have the same memleak issue there
>> if assume we can have l2specific_len != {0,4}.
>>
> That would produce the same frame format as what the 'offset' option
> was supposed to produce (if it did properly initialise its padding
> bits). That is, we'd have an arbitrary number of padding bits inserted
> between the l2-specific header and the l2 frame (L2TP's payload). These
> frames are invalid, and doing that brings us nothing compared to
> keeping the offset option.
>
>> Moreover does it worth to add some sanity checks in netlink code to
>> enforce the relation between l2specific_len and l2specific_type?
>>
> Yes, certainly.
>
>> At
>> the moment there are no guarantee that if l2specific_type is set to
>> L2TP_L2SPECTYPE_DEFAULT, l2specific_len will be grater or equal than
>> 4.
>>
> Thanks for pointing this out. That needs to be fixed. We don't support
> anything but the default l2spec layer (or no l2spec layer at all). So
> we should only accept L2TP_L2SPECTYPE_NONE and L2TP_L2SPECTYPE_DEFAULT.
>
> And we shouldn't need to use l2specific_len at all. The default l2spec
> header is always four bytes long, so we don't need to rely on a user
> supplied value. Looking at the code, it seems that invalid usage of
> L2TP_ATTR_L2SPEC_LEN allows leaking memory on the network or sending
> corrupted frames (depending on if its value is too small or too big).
>
> Do you want to take care of this?

Ack, I am working on it.
Regards,

Lorenzo

Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter

2018-01-03 Thread Lorenzo Bianconi

> On Tue, Jan 02, 2018 at 08:28:03PM +0100, Lorenzo Bianconi wrote:
>> Perhaps I am little bit polarized on UABI issue, but I was rethinking
>> about it and maybe removing offset parameter would lead to an
>> interoperability issue for device running L2TPv3 since offset
>> parameter is there and it is not a nope.
>> Please consider this setup:
>> - 2 endpoint running L2TPv3, the first running net-next and the second
>> running 4.14
>> - both endpoint are configured using iproute2 in this way:
>>
>>   - ip l2tp add tunnel local  remote  tunnel_id 
>> peer_tunnel_id  udp_sport  udp_dport 
>>   - ip l2tp add tunnel local  remote  tunnel_id 
>> peer_tunnel_id  udp_sport  udp_dport 
>>   - ip l2tp add session name l2tp0 tunnel_id  session_id 
>> peer_session_id  offset 8
>>   - ip l2tp add session name l2tp0 tunnel_id  session_id 
>> peer_session_id  offset 8
>>
>> Can we assume offset is never used for L2TPv3?
>>
> That's what I think. You're right worrying about ABI issues. And I
> wouldn't dare proposing such a removal if I had doubts about breaking a
> user setup.
>
> Considering the lack of use cases and the absence of interoperability
> of this feature, I hardly can imagine it being used.
> But it's not only that: the feature has been buggy for years without
> anyone noticing. And this bug wasn't difficult to spot (one just needs
> to look at an L2TPv3 header in a network packet dump).
>
> It's really the combination of these three issues (buggy, no use case
> and not producing valid L2TPv3 frames) that makes me propose a removal.

Hi Guillaume, James,

I agree to remove offset parameter in this case. What about (as
already suggested by James) to take into account possible alignment
issues with previous version of L2TPv3 protocol using 'L2 specific
sublayer'?
I guess, on the kernel side (we will need to patch iproute2 on
userspace side), we need just to properly initialized the 'l2specific'
field to 0 since otherwise we will have the same memleak issue there
if assume we can have l2specific_len != {0,4}.
Moreover does it worth to add some sanity checks in netlink code to
enforce the relation between l2specific_len and l2specific_type? At
the moment there are no guarantee that if l2specific_type is set to
L2TP_L2SPECTYPE_DEFAULT, l2specific_len will be grater or equal than
4.

Regards,
Lorenzo

Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter

2018-01-02 Thread Lorenzo Bianconi

>> > Lorenzo, is this being added to fix interoperability with another L2TPv3
>> > implementation? If so, can you share more details?
>> >
>>
>> Hi James,
>>
>> I introduced peer_offset parameter to fix a specific setup where
>> tunnel endpoints
>> running L2TPv3 would use different values for tx offset (since in
>> iproute2 there is no
>> restriction on it), not to fix a given an interoperability issue.
>>
> Yes, but was it just to test iproute2's peer_offset option? Or is there
> a plan to use it for real?
>
>> I introduced this feature since:
>>  - offset has been added for long time to L2TPv3 implementation
>>(commit f7faffa3ff8ef6ae712ef16312b8a2aa7a1c95fe and
>>commit 309795f4bec2d69cd507a631f82065c2198a0825) and I wanted to
>> preserve UABI
>>  - have the same degree of freedom for offset parameter we have in
>> L2TPv2 and fix the issue
>>described above
>>
> AFAIU, the current L2TPv2 implementation never sets the offset field
> and nobody ever realised.
>

Perhaps I am little bit polarized on UABI issue, but I was rethinking
about it and maybe removing offset parameter would lead to an
interoperability issue for device running L2TPv3 since offset
parameter is there and it is not a nope.
Please consider this setup:
- 2 endpoint running L2TPv3, the first running net-next and the second
running 4.14
- both endpoint are configured using iproute2 in this way:

  - ip l2tp add tunnel local  remote  tunnel_id 
peer_tunnel_id  udp_sport  udp_dport 
  - ip l2tp add tunnel local  remote  tunnel_id 
peer_tunnel_id  udp_sport  udp_dport 
  - ip l2tp add session name l2tp0 tunnel_id  session_id 
peer_session_id  offset 8
  - ip l2tp add session name l2tp0 tunnel_id  session_id 
peer_session_id  offset 8

Can we assume offset is never used for L2TPv3?

Regards,
Lorenzo

>> Now what we can do I guess is:
>> - as suggested by Guillaume drop completely the offset support without 
>> removing
>>   netlink attribute in order to not break UABI
>> - fix offset support initializing properly padding bits
>>
> I'd go for the first one. I just wonder if that looks acceptable to
> David an James.

Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter

2017-12-29 Thread Lorenzo Bianconi

> Sorry for only just seeing this (vacation).
>
>
> On 28/12/17 19:45, Guillaume Nault wrote:
>>
>> On Thu, Dec 28, 2017 at 07:23:48PM +0100, Lorenzo Bianconi wrote:
>>>
>>> On Dec 28, Guillaume Nault wrote:
>>>>
>>>> After a quick review of L2TPv3 and pseudowires RFCs, I still don't see
>>>> how adding some padding between the L2TPv3 header and the payload could
>>>> constitute a valid frame. Of course, the base feature is already there,
>>>> but after a quick test, it looks like the padding bits aren't
>>>> initialised and leak memory.
>>>
>>> Do you mean for L2TPv2 or L2TPv3? For L2TPv3 offset/peer_offset are
>>> initialized
>>> in l2tp_nl_cmd_session_create()
>>>
>> That's the offsets stored in the l2tp_session_cfg structure. But I was
>> talking about the xmit path: l2tp_build_l2tpv3_header() doesn't
>> initialise the padding between the header and the payload. So when
>> someone activates this option, then every transmitted packet leaks
>> memory on the wire.
>>
>>> Setting session data offset is already supported in L2TP kernel module
>>> (and could be already used by userspace applications);
>>> for L2TPv2 there is an optional 16-bit value in the header while for
>>> L2TPv3
>>> the offset is configured by userspace.
>>> At the moment the kernel (for L2TPv3) uses offset for both tx and rx
>>> side.
>>> Userspace (iproute2) allows to distinguish tx offset (offset) from rx one
>>> (peer_offset) but since the rx part is not handled at the moment
>>> (I fixed peer_offset support in iproute2, I have not sent the patch
>>> upstream yet, attached below)
>>> this leads to a misalignment between tunnel endpoints.
>>> You can easily reproduce the issue using this setup (and the below patch
>>> for iproute2):
>>>
>>> ip l2tp add tunnel local  remote  tunnel_id 
>>> peer_tunnel_id  udp_sport  udp_dport 
>>> ip l2tp add tunnel local  remote  tunnel_id 
>>> peer_tunnel_id  udp_sport  udp_dport 
>>>
>>> ip l2tp add session name l2tp0 tunnel_id  session_id 
>>> peer_session_id  offset 8 peer_offset 16
>>> ip l2tp add session name l2tp0 tunnel_id  session_id 
>>> peer_session_id  offset 16 peer_offset 8
>>>
>> Yes, I'm well aware of that. And thanks for having worked on a full
>> solution including iproute2. But does one really need to set
>> asymetrical offset values? It doesn't look wrong to require setting the
>> same value on both sides. Other options need this, like "l2spec_type".
>>
>> Here we have an option that:
>>* creates invalid packets (AFAIK),
>>* is buggy and leaks memory on the network,
>>* doesn't seem to have any use case (even the manpage
>>  says "This is hardly ever used").
>>
>> So I'm sorry, but I don't see the point in expanding this option to
>> allow even stranger setups. If there's a use case, then fine.
>> Otherwise, let's just acknowledge that the "peer_offset" option of
>> iproute2 is a noop (and maybe remove it from the manpage).
>>
>> And the kernel "offset" option needs to be fixed. Actually, I wouldn't
>> mind if it was converted to be a noop, or even rejected. L2TP already
>> has its share of unused features that complicate the code and hamper
>> evolution and bug fixing. As I said earlier, if it's buggy, unused and
>> can't even produce valid packets, then why bothering with it?
>>
>> But that's just my point of view. James, do you have an opinion on
>> this?
>
>
> I agree, Guillaume.
>
> The L2TPv3 protocol RFC dropped the configurable offset of L2TPv2 - instead,
> the Layer-2-Specific-Sublayer is supposed to handle any transport-specific
> data alignment requirements. I think a configurable offset has found its way
> into iproute2 l2tp commands by mistake, perhaps because the netlink API
> defines an attribute for it, but which was only intended for use with
> L2TPv2. For L2TPv2, we only configure the offset for transmitted packets. In
> received packets, the offset (if present) is obtained from the L2TPv2 header
> in each received packet. There is no need to add a peer-offset netlink
> attribute to set the offset expected in received packets.
>
> Lorenzo, is this being added to fix interoperability with another L2TPv3
> implementation? If so, can you share more details?
>

Hi James,

I introduced peer_offset parameter to fix a specific setup where
tunnel endpoints
running L2TPv3 would use different values for tx offs

Re: [PATCH net-next 2/2] l2tp: add peer_offset parameter

2017-12-28 Thread Lorenzo Bianconi

On Dec 28, Guillaume Nault wrote:
> On Fri, Dec 22, 2017 at 03:10:18PM +0100, Lorenzo Bianconi wrote:
> > Introduce peer_offset parameter in order to add the capability
> > to specify two different values for payload offset on tx/rx side.
> > If just offset is provided by userspace use it for rx side as well
> > in order to maintain compatibility with older l2tp versions
> > 
> Sorry for being late on this, I originally missed this patchset and
> only noticed it yesterday.
> 
> Lorenzo, can you give some context around this new feature?
> Quite frankly I can't see the point of it. I've never heard of offsets
> in L2TPv3, and for L2TPv2, the offset value is already encoded in the
> header.

Hi Guillaume,

thanks for your feedback.

> 
> After a quick review of L2TPv3 and pseudowires RFCs, I still don't see
> how adding some padding between the L2TPv3 header and the payload could
> constitute a valid frame. Of course, the base feature is already there,
> but after a quick test, it looks like the padding bits aren't
> initialised and leak memory.

Do you mean for L2TPv2 or L2TPv3? For L2TPv3 offset/peer_offset are initialized
in l2tp_nl_cmd_session_create()

> 
> So, unless I missed something, setting offsets in L2TPv3 is
> non-compliant, the current implementation is buggy and most likely
> unused. I'd really prefer getting the implementation fixed, or even
> removed entirely. Extending it to allow asymmetrical offset values
> looks wrong to me, unless you have a use case in mind.
> 
> Regards,
> 
> Guillaume
> 
> PS: I also noticed that iproute2 has a "peer_offset" option, but it's a
> noop.

Setting session data offset is already supported in L2TP kernel module
(and could be already used by userspace applications);
for L2TPv2 there is an optional 16-bit value in the header while for L2TPv3
the offset is configured by userspace.
At the moment the kernel (for L2TPv3) uses offset for both tx and rx side.
Userspace (iproute2) allows to distinguish tx offset (offset) from rx one
(peer_offset) but since the rx part is not handled at the moment
(I fixed peer_offset support in iproute2, I have not sent the patch upstream 
yet, attached below)
this leads to a misalignment between tunnel endpoints.
You can easily reproduce the issue using this setup (and the below patch for 
iproute2):

ip l2tp add tunnel local  remote  tunnel_id  peer_tunnel_id 
 udp_sport  udp_dport 
ip l2tp add tunnel local  remote  tunnel_id  peer_tunnel_id 
 udp_sport  udp_dport 

ip l2tp add session name l2tp0 tunnel_id  session_id  peer_session_id 
 offset 8 peer_offset 16
ip l2tp add session name l2tp0 tunnel_id  session_id  peer_session_id 
 offset 16 peer_offset 8

commit ee1b976f22fbea530c94a5233ac8c7cd8d643ae9
Author: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
Date:   Thu Dec 21 14:41:39 2017 +0100

ip: l2tp: add peer_offset netlink callback

diff --git a/include/uapi/linux/l2tp.h b/include/uapi/linux/l2tp.h
index 472e9924..21223df7 100644
--- a/include/uapi/linux/l2tp.h
+++ b/include/uapi/linux/l2tp.h
@@ -127,6 +127,7 @@ enum {
L2TP_ATTR_UDP_ZERO_CSUM6_TX,/* flag */
L2TP_ATTR_UDP_ZERO_CSUM6_RX,/* flag */
L2TP_ATTR_PAD,
+   L2TP_ATTR_PEER_OFFSET,  /* u16 */
__L2TP_ATTR_MAX,
 };
 
diff --git a/ip/ipl2tp.c b/ip/ipl2tp.c
index 7c5ed313..a3220a8b 100644
--- a/ip/ipl2tp.c
+++ b/ip/ipl2tp.c
@@ -176,6 +176,8 @@ static int create_session(struct l2tp_parm *p)
  p->reorder_timeout);
if (p->offset)
addattr16(, 1024, L2TP_ATTR_OFFSET, p->offset);
+   if (p->peer_offset)
+   addattr16(, 1024, L2TP_ATTR_PEER_OFFSET, p->peer_offset);
if (p->cookie_len)
addattr_l(, 1024, L2TP_ATTR_COOKIE,
  p->cookie, p->cookie_len);
@@ -316,6 +318,8 @@ static int get_response(struct nlmsghdr *n, void *arg)
p->encap = rta_getattr_u16(attrs[L2TP_ATTR_ENCAP_TYPE]);
if (attrs[L2TP_ATTR_OFFSET])
p->offset = rta_getattr_u16(attrs[L2TP_ATTR_OFFSET]);
+   if (attrs[L2TP_ATTR_PEER_OFFSET])
+   p->peer_offset = rta_getattr_u16(attrs[L2TP_ATTR_PEER_OFFSET]);
if (attrs[L2TP_ATTR_DATA_SEQ])
p->data_seq = rta_getattr_u16(attrs[L2TP_ATTR_DATA_SEQ]);
if (attrs[L2TP_ATTR_CONN_ID])


Regards,
Lorenzo

[PATCH net-next 2/2] l2tp: add peer_offset parameter

2017-12-22 Thread Lorenzo Bianconi

Introduce peer_offset parameter in order to add the capability
to specify two different values for payload offset on tx/rx side.
If just offset is provided by userspace use it for rx side as well
in order to maintain compatibility with older l2tp versions

Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 include/uapi/linux/l2tp.h |  1 +
 net/l2tp/l2tp_core.c  |  3 ++-
 net/l2tp/l2tp_core.h  | 13 ++---
 net/l2tp/l2tp_debugfs.c   |  8 +---
 net/l2tp/l2tp_netlink.c   | 21 -
 5 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/l2tp.h b/include/uapi/linux/l2tp.h
index d84ce5c1c9aa..d6fee55dbded 100644
--- a/include/uapi/linux/l2tp.h
+++ b/include/uapi/linux/l2tp.h
@@ -127,6 +127,7 @@ enum {
L2TP_ATTR_UDP_ZERO_CSUM6_TX,/* flag */
L2TP_ATTR_UDP_ZERO_CSUM6_RX,/* flag */
L2TP_ATTR_PAD,
+   L2TP_ATTR_PEER_OFFSET,  /* u16 */
__L2TP_ATTR_MAX,
 };
 
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 115918ad8eca..6ff64717da1e 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -792,7 +792,7 @@ void l2tp_recv_common(struct l2tp_session *session, struct 
sk_buff *skb,
ptr += 2 + offset;
}
} else
-   ptr += session->offset;
+   ptr += session->peer_offset;
 
offset = ptr - optr;
if (!pskb_may_pull(skb, offset))
@@ -1785,6 +1785,7 @@ struct l2tp_session *l2tp_session_create(int priv_size, 
struct l2tp_tunnel *tunn
session->lns_mode = cfg->lns_mode;
session->reorder_timeout = cfg->reorder_timeout;
session->offset = cfg->offset;
+   session->peer_offset = cfg->peer_offset;
session->l2specific_type = cfg->l2specific_type;
session->l2specific_len = cfg->l2specific_len;
session->cookie_len = cfg->cookie_len;
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index 9534e16965cc..c6fe7cc42a05 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -59,7 +59,8 @@ struct l2tp_session_cfg {
int debug;  /* bitmask of debug message
 * categories */
u16 vlan_id;/* VLAN pseudowire only */
-   u16 offset; /* offset to payload */
+   u16 offset; /* offset to tx payload */
+   u16 peer_offset;/* offset to rx payload */
u16 l2specific_len; /* Layer 2 specific length */
u16 l2specific_type; /* Layer 2 specific type */
u8  cookie[8];  /* optional cookie */
@@ -86,8 +87,14 @@ struct l2tp_session {
int cookie_len;
u8  peer_cookie[8];
int peer_cookie_len;
-   u16 offset; /* offset from end of L2TP 
header
-  to beginning of data */
+   u16 offset; /* offset from end of L2TP
+* header to beginning of
+* tx data
+*/
+   u16 peer_offset;/* offset from end of L2TP
+* header to beginning of
+* rx data
+*/
u16 l2specific_len;
u16 l2specific_type;
u16 hdr_len;
diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c
index eb69411bcb47..4cc30b38aba4 100644
--- a/net/l2tp/l2tp_debugfs.c
+++ b/net/l2tp/l2tp_debugfs.c
@@ -180,8 +180,9 @@ static void l2tp_dfs_seq_session_show(struct seq_file *m, 
void *v)
   session->lns_mode ? "LNS" : "LAC",
   session->debug,
   jiffies_to_msecs(session->reorder_timeout));
-   seq_printf(m, "   offset %hu l2specific %hu/%hu\n",
-  session->offset, session->l2specific_type, 
session->l2specific_len);
+   seq_printf(m, "   offset %hu peer_offset %hu l2specific %hu/%hu\n",
+  session->offset, session->peer_offset,
+  session->l2specific_type, session->l2specific_len);
if (session->cookie_len) {
seq_printf(m, "   cookie %02x%02x%02x%02x",
   session->cookie[0], session->cookie[1],
@@ -228,7 +229,8 @@ static int l2tp_dfs_

[PATCH net-next 0/2] l2tp: fix offset/peer_offset conf parameters

2017-12-22 Thread Lorenzo Bianconi

This patchset add peer_offset configuration parameter in order to
specify two different values for payload offset on tx/rx side.
Moreover fix missing print session offset info

Hangbin Liu (1):
  l2tp: fix missing print session offset info

Lorenzo Bianconi (1):
  l2tp: add peer_offset parameter

 include/uapi/linux/l2tp.h |  1 +
 net/l2tp/l2tp_core.c  |  3 ++-
 net/l2tp/l2tp_core.h  | 13 ++---
 net/l2tp/l2tp_debugfs.c   |  8 +---
 net/l2tp/l2tp_netlink.c   | 23 ++-
 5 files changed, 40 insertions(+), 8 deletions(-)

-- 
2.13.6

[PATCH net-next 1/2] l2tp: fix missing print session offset info

2017-12-22 Thread Lorenzo Bianconi

From: Hangbin Liu <liuhang...@gmail.com>

Report offset parameter in L2TP_CMD_SESSION_GET command if
it has been configured by userspace

Fixes: 309795f4bec ("l2tp: Add netlink control API for L2TP")
Reported-by: Jianlin Shi <ji...@redhat.com>
Signed-off-by: Hangbin Liu <liuhang...@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
---
 net/l2tp/l2tp_netlink.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index a1f24fb2be98..7e9c50125556 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -761,6 +761,8 @@ static int l2tp_nl_session_send(struct sk_buff *skb, u32 
portid, u32 seq, int fl
 
if ((session->ifname[0] &&
 nla_put_string(skb, L2TP_ATTR_IFNAME, session->ifname)) ||
+   (session->offset &&
+nla_put_u16(skb, L2TP_ATTR_OFFSET, session->offset)) ||
(session->cookie_len &&
 nla_put(skb, L2TP_ATTR_COOKIE, session->cookie_len,
 >cookie[0])) ||
-- 
2.13.6

Re: [PATCHv2 net] l2tp: fix missing print session offset info

2017-12-22 Thread Lorenzo Bianconi

> Fixes: 309795f4bec ("l2tp: Add netlink control API for L2TP")
> Reported-by: Jianlin Shi 
> Signed-off-by: Hangbin Liu 
> ---
>  net/l2tp/l2tp_netlink.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
> index a1f24fb..36378b4 100644
> --- a/net/l2tp/l2tp_netlink.c
> +++ b/net/l2tp/l2tp_netlink.c
> @@ -761,6 +761,8 @@ static int l2tp_nl_session_send(struct sk_buff *skb, u32 
> portid, u32 seq, int fl
>
> if ((session->ifname[0] &&
>  nla_put_string(skb, L2TP_ATTR_IFNAME, session->ifname)) ||
> +   (session->offset &&
> +nla_put_u16(skb, L2TP_ATTR_OFFSET, session->offset)) ||
> (session->cookie_len &&
>  nla_put(skb, L2TP_ATTR_COOKIE, session->cookie_len,
>  >cookie[0])) ||
> --
> 2.5.5
>

Hi David,

please hold on a while with this patch, I will send it in a patchset
that addresses some offset related stuff.
Regards,

Lorenzo

Re: [GIT] Networking

2015-09-05 Thread Lorenzo Bianconi

Hi all,

> On Wed, Sep 2, 2015 at 10:35 PM, David Miller  wrote:
>>
>> Another merge window, another set of networking changes.  I've heard
>> rumblings that the lightweight tunnels infrastructure has been voted
>> networking change of the year.
>
> .. and others say that the most notable feature is the idiotic bugs
> that it introduces, and the compiler even complains about.
>
> Christ, people. Learn C, instead of just stringing random characters
> together until it compiles (with warnings).
>
> This:
>
>   static bool rate_control_cap_mask(struct ieee80211_sub_if_data *sdata,
>struct ieee80211_supported_band *sband,
>struct ieee80211_sta *sta, u32 *mask,
>u8 mcs_mask[IEEE80211_HT_MCS_MASK_LEN])
>
> is horribly broken to begin with, because array arguments in C don't
> actually exist. Sadly, compilers accept it for various bad historical
> reasons, and silently turn it into just a pointer argument. There are
> arguments for them, but they are from weak minds.
>
> But happily gcc has a really really valid warning (kudos - I often end
> up ragging on the bad warnings gcc has, but this one is a keeper),
> because a few lines down the mistake then turns into pure and utter
> garbage.
>
> It's garbage that was basically encouraged by the first mistake
> (thinking that C allows array arguments), namely:
>
>   for (i = 0; i < sizeof(mcs_mask); i++)
>

I moved rate_control_apply_mask() logic in rate_control_cap_mask() in
order to be applied in multiple code points (i.e.
 rate_control_apply_mask_ratetbl()). Since I was using gcc version
4.9.2, the warning did not show up. Sorry for that bug.

> the "sizeof(mcs_mask)" is _shit_. Since array arguments don't actually
> exist in C, it is the size of the pointer, not the array. The first
> mistake makes the bug look like reasonable code. Although I'd argue
> that the code would actually be bad regardless, since "sizeof" is the
> size in bytes, and the code actually wants the number of entries (and
> we do have ARRAY_SIZE() for that).
>
> Sure, in this case the entries are just one byte each, so it would
> have *worked* had it not been for the array argument issue, but it's
> misleading and the code is just fundamentally buggy and nonsensical in
> two entirely different ways that fed on each other.
>
> That line should read
>
>   for (i = 0; i < IEEE80211_HT_MCS_MASK_LEN; i++)
>
> and the argument should just have been declared as the pointer it actually is.
>
> A later patch then added onto the pile of manure by adding *another*
> broken array argument, but at least that one then used the proper loop
> for traversal of that array.
>
> The fact that I notice this bug from a very basic "let's just compile
> each pull request and make sure it isn't complete crap" is sad.
>
> Now, it *looks* like the code was just moved, and the "sizeof()" was
> initially correct (because it was a size of an actual array). Well, it
> was "correct" in the sense that it generated the right code, even if
> the whole confusion between "number of entries" and "size in bytes"
> was still there. Then it got moved and turned from "confused but
> happens to generate correct code" into "buggy pile of bovine manure".
> See commit 90c66bd2232a ("mac80211: remove ieee80211_tx_rate
> dependency in rate mask code").
>
> So I can see how this bug happened, and I am only slightly upset with
> Lorenzo who is the author of that commit.
>
> What I can't see is why the code has existed in at least two
> maintainer trees (Johannes' and David's) for a couple of weeks, and
> nobody cared about the new compiler warnings? And it was sent to me
> despite that new warning?
>
> I realy want people to take a really hard look at functions that use
> arrays as arguments. It really is very misleading, even if it can look
> "prettier", and some people will argue that it's "documentation" about
> how the pointer is a particular size. But it's neither. It's basically
> just lying about what is going on, and the only thing it documents is
> "I don't know how to C". Misleading documentation isn't documentation,
> it's a mistake.
>
> I see it in that file for at least the functions rate_idx_match_mask()
> and rate_control_cap_mask(). I tried - and failed - to come up with a
> reasonable grep pattern to try to see how common it is, and I'm too
> lazy to add some sparse check for it.
>
> Please people. When I see these kinds of obviously bogus code
> problems, that just makes me very upset. Because it makes me worry
> about all the non-obvious stuff that I miss.  Sadly, this time I had
> pushed out the merge early (because I wanted to test the wireless
> changes on my laptop), so now the bug is out there.
>
> I'm not sure what the practical *impact* of the bug is. Yes, it only
> traverses four or eight rate entries (depending on 32-bit or
> 64-bitness of the kernel) out

69 matches

Mail list logo