date:20180312

Re: [PATCH] src: fix parsing for set handle attributes

2018-03-12 Thread Eckl, Máté

Hi,

I am new here, but isn't it strange that you mask the flags with the
HANDLE attribute?

Regards,
Máté

2018-03-11 14:18 GMT+01:00 Harsha Sharma :
> Correct one typo for parsing set handles.
>
> Signed-off-by: Harsha Sharma 
> ---
>  src/set.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/set.c b/src/set.c
> index 0889b00..d2a7589 100644
> --- a/src/set.c
> +++ b/src/set.c
> @@ -368,7 +368,7 @@ void nftnl_set_nlmsg_build_payload(struct nlmsghdr *nlh, struct nftnl_set *s)
> mnl_attr_put_strz(nlh, NFTA_SET_TABLE, s->table);
> if (s->flags & (1 << NFTNL_SET_NAME))
> mnl_attr_put_strz(nlh, NFTA_SET_NAME, s->name);
> -   if (s->handle & (1 << NFTNL_SET_HANDLE))
> +   if (s->flags & (1 << NFTNL_SET_HANDLE))
> mnl_attr_put_u64(nlh, NFTA_SET_HANDLE, htobe64(s->handle));
> if (s->flags & (1 << NFTNL_SET_FLAGS))
> mnl_attr_put_u32(nlh, NFTA_SET_FLAGS, htonl(s->set_flags));
> --
> 2.14.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
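
To make the question concrete, here is a minimal standalone sketch of the presence-bitmask idiom the quoted patch relies on. The struct and enum names (demo_set, DEMO_SET_*) are hypothetical stand-ins and not libnftnl definitions; only the pattern "flags & (1 << ATTR)" is taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute indices, standing in for NFTNL_SET_*. */
enum demo_attr {
	DEMO_SET_TABLE = 0,
	DEMO_SET_NAME,
	DEMO_SET_HANDLE,
};

/* Hypothetical object: 'flags' records which attributes were set. */
struct demo_set {
	uint32_t flags;
	uint64_t handle;
};

/* Store the handle and mark the attribute as present. */
static void demo_set_handle(struct demo_set *s, uint64_t handle)
{
	assert(s != NULL);
	s->handle = handle;
	s->flags |= (1u << DEMO_SET_HANDLE);
}

/* Post-patch test: consult the presence bitmask. */
static bool demo_has_handle(const struct demo_set *s)
{
	return (s->flags & (1u << DEMO_SET_HANDLE)) != 0;
}

/* Pre-patch test: masks the handle *value* with the attribute bit. */
static bool demo_has_handle_buggy(const struct demo_set *s)
{
	return (s->handle & (1ull << DEMO_SET_HANDLE)) != 0;
}
```

In this sketch a handle of 3 is reported as absent by the pre-patch test even though it was set, while a handle of 4 passes only because its value happens to share a bit with the attribute index.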

Re: Port triggering

2018-03-12 Thread Stéphane Veyret

Partially answering my own question: here is a good starting point for
nftables development:
https://zasdfgbnm.github.io/2017/09/07/Extending-nftables/

Re: Port triggering

2018-03-12 Thread Florian Westphal

Stéphane Veyret  wrote:
> A few words on the specs I imagined for the port triggering:
> 
> table ip trigger {
>  chain postrouting {
>   type filter hook postrouting priority 0;
>   # Open the trigger named rtsp if a packet arrives for port 554. The
>   # trigger will close in 300s if not refreshed. This records the
>   # source (client) and target (server) addresses
>   ip dport 554 trigger open rtsp timeout 300
>  }
> }
> 
> table ip nat {
>  chain prerouting {
>   type nat hook prerouting priority 0;
>   # If the trigger is open and the source is the recorded server
>   # address, DNAT the packet to the recorded client address
>   ip dport 6970-7170 trigger dnat rtsp
>  }
> }

You might already be able to do this with maps; however, it looks
like it might be better to just allow setting conntrack expectations from
the nftables rules/packet path instead.

(Or I still fail to understand what you want to do; it does
 sound exactly like expectations, e.g. for the ftp data channel in
 response to a PASV command on the ftp control channel.)

Something like:

chain postrouting {
	type filter hook postrouting priority 0;
	# tell kernel to install an expectation
	# arriving on udp ports 6970-7170
	# expectation will follow whatever NAT transformation
	# is active on master connection
	# expectation is removed after 5 minutes
	# (we could of course also allow installing an expectation
	# for 'foreign' addresses as well but I don't think it's needed
	# yet)
	ip dport 554 ct expectation set udp dport 6970-7170 timeout 5m
}

table ip filter {
  chain forward {
   ip dport 6970-7170 ct status expected accept
  }
}


[PATCH nft] src: install table skeleton files to sysconfdir/nftables

2018-03-12 Thread Florian Westphal

commit 6c9230e79339ca ("nftables: rearrange files and examples")
removed the install hook for the old 'iptables table skeleton rulesets'.

This restores the install hook for some of these.

Reported-by: Duncan Roe 
Cc: Arturo Borrero Gonzalez 
Signed-off-by: Florian Westphal 
---
 Makefile.am                |  1 +
 configure.ac               |  2 ++
 files/Makefile.am          |  1 +
 files/examples/Makefile.am | 18 ++
 4 files changed, 22 insertions(+)
 create mode 100644 files/Makefile.am
 create mode 100644 files/examples/Makefile.am

diff --git a/Makefile.am b/Makefile.am
index 5ef61be6dfec..f33da9dbd181 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -2,6 +2,7 @@ ACLOCAL_AMFLAGS = -I m4
 
 SUBDIRS =  src \
include \
+   files   \
doc
 
 EXTRA_DIST =   tests   \
diff --git a/configure.ac b/configure.ac
index 6c6b9b3a4c4b..fb2175a55656 100644
--- a/configure.ac
+++ b/configure.ac
@@ -119,6 +119,8 @@ AC_CONFIG_FILES([   \
include/linux/netfilter_ipv4/Makefile   \
include/linux/netfilter_ipv6/Makefile   \
doc/Makefile\
+   files/Makefile  \
+   files/examples/Makefile \
])
 AC_OUTPUT
 
diff --git a/files/Makefile.am b/files/Makefile.am
new file mode 100644
index 000000000000..aee2d7baa2ad
--- /dev/null
+++ b/files/Makefile.am
@@ -0,0 +1 @@
+SUBDIRS = examples
diff --git a/files/examples/Makefile.am b/files/examples/Makefile.am
new file mode 100644
index 000000000000..21e8be1bd388
--- /dev/null
+++ b/files/examples/Makefile.am
@@ -0,0 +1,18 @@
+
+pkgsysconfdir = ${sysconfdir}/nftables
+dist_pkgsysconf_DATA = arp-filter.nft  \
+   bridge-filter.nft   \
+   inet-filter.nft \
+   ipv4-filter.nft \
+   ipv4-mangle.nft \
+   ipv4-nat.nft\
+   ipv4-raw.nft\
+   ipv6-filter.nft \
+   ipv6-mangle.nft \
+   ipv6-nat.nft\
+   ipv6-raw.nft\
+   netdev-ingress.nft
+
+
+install-data-hook:
+   ${SED} -i 's|@sbindir[@]|${sbindir}/|g' ${DESTDIR}${pkgsysconfdir}/*
-- 
2.16.1


Re: [nft] nftables: Fixing Bug 1219 - handle rt0 and rt2 properly

2018-03-12 Thread Ahmed Abdelsalam


On Sun, 11 Mar 2018 23:00:41 +0100
Pablo Neira Ayuso  wrote:

> On Tue, Feb 27, 2018 at 07:25:14AM +0100, Ahmed Abdelsalam wrote:
> > Type 0 and 2 of the IPv6 Routing extension header are not handled
> > properly by exthdr_init_raw() in src/exthdr.c
> > 
> > In order to fix the bug, we extended the "enum nft_exthdr_op" to
> > differentiate between rt, rt0, and rt2.
> > 
> > This patch should fix the bug. We tested the patch against the
> > same configuration reported in the bug and the output is as
> > shown below.
> > 
> > table ip6 filter {
> > chain input {
> > type filter hook input priority 0; policy accept;
> > rt0 addr[1] a::2
> > }
> > }
> 
> Applied, thanks Ahmed.
> 
> Would you also update tests/py to cover this? Thanks.
Thanks Pablo!
I will send you a patch with the required tests.

-- 
Ahmed Abdelsalam 

Re: [PATCH nft] src: install table skeleton files to sysconfdir/nftables

2018-03-12 Thread Arturo Borrero Gonzalez

On 12 March 2018 at 12:36, Florian Westphal  wrote:
> +
> +install-data-hook:
> +   ${SED} -i 's|@sbindir[@]|${sbindir}/|g' ${DESTDIR}${pkgsysconfdir}/*
> --

The shebang in those files is static now (#!/usr/sbin/nft -f)

Perhaps we should differentiate between files we use for development
and example files for the tarball (downstream users)

[PATCH nft] netlink: use nftnl_flowtable_get/set

2018-03-12 Thread Florian Westphal

The '_array' variant is just a wrapper for the get/set API; using
get/set directly allows the array variant to be removed from libnftnl.

Signed-off-by: Florian Westphal 
---
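
A side note on the idiom touched by this patch: the device list is handled as a NULL-terminated array of strings, and the loop now tests the pointer itself rather than comparing it with '\0'. The following is a minimal standalone sketch of counting and copying such an array. The helper names (strv_len, strv_dup, strv_free) are made up for illustration and are neither nft nor libnftnl functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Count the entries of a NULL-terminated array of strings. */
static size_t strv_len(const char * const *v)
{
	size_t len = 0;

	assert(v != NULL);
	while (v[len])
		len++;
	return len;
}

/* Free an array produced by strv_dup(); accepts NULL. */
static void strv_free(char **v)
{
	size_t i;

	if (!v)
		return;
	for (i = 0; v[i]; i++)
		free(v[i]);
	free(v);
}

/* Deep-copy the array, keeping the terminating NULL slot. */
static char **strv_dup(const char * const *v)
{
	size_t len = strv_len(v), i;
	char **copy = calloc(len + 1, sizeof(*copy));

	if (!copy)
		return NULL;
	for (i = 0; i < len; i++) {
		size_t n = strlen(v[i]) + 1;

		copy[i] = malloc(n);
		if (!copy[i]) {
			/* Remaining slots are NULL, so this frees what exists. */
			strv_free(copy);
			return NULL;
		}
		memcpy(copy[i], v[i], n);
	}
	return copy;
}
```

The "const char * const *" parameter type mirrors the declaration the patch switches to: neither the strings nor the array slots are modified by the reader.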
 src/netlink.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/netlink.c b/src/netlink.c
index a74dc2551e88..bfa30502a2b2 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -1588,7 +1588,7 @@ int netlink_add_flowtable(struct netlink_ctx *ctx, const struct handle *h,
dev_array[i++] = expr->identifier;
 
dev_array[i] = NULL;
-   nftnl_flowtable_set_array(flo, NFTNL_FLOWTABLE_DEVICES, dev_array);
+   nftnl_flowtable_set(flo, NFTNL_FLOWTABLE_DEVICES, dev_array);
 
netlink_dump_flowtable(flo, ctx);
 
@@ -1678,7 +1678,7 @@ netlink_delinearize_flowtable(struct netlink_ctx *ctx,
  struct nftnl_flowtable *nlo)
 {
struct flowtable *flowtable;
-   const char **dev_array;
+   const char * const *dev_array;
int len = 0, i;
 
flowtable = flowtable_alloc(&netlink_location);
@@ -1688,8 +1688,8 @@ netlink_delinearize_flowtable(struct netlink_ctx *ctx,
xstrdup(nftnl_flowtable_get_str(nlo, NFTNL_FLOWTABLE_TABLE));
flowtable->handle.flowtable =
xstrdup(nftnl_flowtable_get_str(nlo, NFTNL_FLOWTABLE_NAME));
-   dev_array = nftnl_flowtable_get_array(nlo, NFTNL_FLOWTABLE_DEVICES);
-   while (dev_array[len] != '\0')
+   dev_array = nftnl_flowtable_get(nlo, NFTNL_FLOWTABLE_DEVICES);
+   while (dev_array[len])
len++;
 
flowtable->dev_array = calloc(1, len * sizeof(char *));
-- 
2.16.1


Re: [PATCH nft] src: install table skeleton files to sysconfdir/nftables

2018-03-12 Thread Florian Westphal

Arturo Borrero Gonzalez  wrote:
> On 12 March 2018 at 12:36, Florian Westphal  wrote:
> > +
> > +install-data-hook:
> > +   ${SED} -i 's|@sbindir[@]|${sbindir}/|g' ${DESTDIR}${pkgsysconfdir}/*
> > --
> 
> The shebang in those files is static now (#!/usr/sbin/nft -f)

Indeed.  I would change this back to the replace-variant.

> Perhaps we should differentiate between files we use for development
> and example files for the tarball (downstream users)

Right, the example files could provide real examples, the files
in /etc are just the 'iptables tables' in nft (i.e. only
base chain hooks, no rules).

[PATCH] Net: netfilter: Replace printk() with pr_*() and define pr_fmt()

2018-03-12 Thread Arushi Singhal

Using pr_<level>() is more concise than printk(KERN_<LEVEL>).
This patch:
* Replaces printk() calls that have a log level with the appropriate
  pr_*() macros.
* Defines pr_fmt() to include the relevant module name.
* Removes redundant prefixes from pr_*() calls.
* Indents the code where possible.
* Removes useless output messages.
* Removes trailing periods from messages.

Signed-off-by: Arushi Singhal 
---
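
For readers unfamiliar with the mechanism, here is a small userspace sketch of how a pr_fmt() definition ends up prefixing every message. It is not kernel code: pr_err() is emulated with snprintf() into a buffer, the KBUILD_MODNAME value "demo_mod" is a made-up placeholder for whatever kbuild would supply, and "##__VA_ARGS__" is a GNU extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Placeholder for the name kbuild would provide; hypothetical value. */
#define KBUILD_MODNAME "demo_mod"

/* Must be defined before any pr_*() macro is expanded. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Buffer standing in for the kernel log in this sketch. */
static char demo_log[128];

/* Userspace stand-in for pr_err(): the prefix is pasted at compile time. */
#define pr_err(fmt, ...) \
	snprintf(demo_log, sizeof(demo_log), pr_fmt(fmt), ##__VA_ARGS__)

/* Emit one message the way the converted call sites do. */
static const char *demo_register_failed(void)
{
	int n = pr_err("can't register to sysctl\n");

	assert(n > 0 && (size_t)n < sizeof(demo_log));
	return demo_log;
}
```

Because the prefix comes from adjacent string literal concatenation, the call site only carries the message text, which is why the patch can drop the hand-written prefixes.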
 net/netfilter/nf_conntrack_acct.c      |  6 --
 net/netfilter/nf_conntrack_ecache.c    |  6 --
 net/netfilter/nf_conntrack_timestamp.c |  6 --
 net/netfilter/nf_nat_core.c            |  4 +++-
 net/netfilter/nf_nat_ftp.c             |  7 ---
 net/netfilter/nf_nat_irc.c             |  7 ---
 net/netfilter/nfnetlink_queue.c        | 14 +++---
 net/netfilter/xt_time.c                | 13 +++--
 8 files changed, 37 insertions(+), 26 deletions(-)

diff --git a/net/netfilter/nf_conntrack_acct.c b/net/netfilter/nf_conntrack_acct.c
index 8669167..1d66de5 100644
--- a/net/netfilter/nf_conntrack_acct.c
+++ b/net/netfilter/nf_conntrack_acct.c
@@ -8,6 +8,8 @@
  * published by the Free Software Foundation.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -80,7 +82,7 @@ static int nf_conntrack_acct_init_sysctl(struct net *net)
net->ct.acct_sysctl_header = register_net_sysctl(net, "net/netfilter",
 table);
if (!net->ct.acct_sysctl_header) {
-   printk(KERN_ERR "nf_conntrack_acct: can't register to sysctl.\n");
+   pr_err("can't register to sysctl\n");
goto out_register;
}
return 0;
@@ -125,7 +127,7 @@ int nf_conntrack_acct_init(void)
 {
int ret = nf_ct_extend_register(&acct_extend);
if (ret < 0)
-   pr_err("nf_conntrack_acct: Unable to register extension\n");
+   pr_err("Unable to register extension\n");
return ret;
 }
 
diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c
index caac41a..c11822a 100644
--- a/net/netfilter/nf_conntrack_ecache.c
+++ b/net/netfilter/nf_conntrack_ecache.c
@@ -11,6 +11,8 @@
  * published by the Free Software Foundation.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -372,7 +374,7 @@ static int nf_conntrack_event_init_sysctl(struct net *net)
net->ct.event_sysctl_header =
register_net_sysctl(net, "net/netfilter", table);
if (!net->ct.event_sysctl_header) {
-   printk(KERN_ERR "nf_ct_event: can't register to sysctl.\n");
+   pr_err("can't register to sysctl\n");
goto out_register;
}
return 0;
@@ -419,7 +421,7 @@ int nf_conntrack_ecache_init(void)
 {
int ret = nf_ct_extend_register(&event_extend);
if (ret < 0)
-   pr_err("nf_ct_event: Unable to register event extension.\n");
+   pr_err("Unable to register event extension\n");
 
BUILD_BUG_ON(__IPCT_MAX >= 16); /* ctmask, missed use u16 */
 
diff --git a/net/netfilter/nf_conntrack_timestamp.c b/net/netfilter/nf_conntrack_timestamp.c
index 4c4734b..56766cb 100644
--- a/net/netfilter/nf_conntrack_timestamp.c
+++ b/net/netfilter/nf_conntrack_timestamp.c
@@ -6,6 +6,8 @@
  * published by the Free Software Foundation (or any later at your option).
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -58,7 +60,7 @@ static int nf_conntrack_tstamp_init_sysctl(struct net *net)
net->ct.tstamp_sysctl_header = register_net_sysctl(net, "net/netfilter",
   table);
if (!net->ct.tstamp_sysctl_header) {
-   printk(KERN_ERR "nf_ct_tstamp: can't register to sysctl.\n");
+   pr_err("can't register to sysctl\n");
goto out_register;
}
return 0;
@@ -104,7 +106,7 @@ int nf_conntrack_tstamp_init(void)
int ret;
ret = nf_ct_extend_register(&tstamp_extend);
if (ret < 0)
-   pr_err("nf_ct_tstamp: Unable to register extension\n");
+   pr_err("Unable to register extension\n");
return ret;
 }
 
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 6c38421..617693f 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -8,6 +8,8 @@
  * published by the Free Software Foundation.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -814,7 +816,7 @@ static int __init nf_nat_init(void)
ret = nf_ct_extend_register(&nat_extend);
if (ret < 0) {
nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
-   printk(KERN_ERR "nf_nat_core: Unable to register extension\n");
+   pr_err("Unable to register extension\n");
return ret;
}
 
diff --git a/net/netfilter/nf_nat_ftp.c b/net/netfilter/nf_nat_f

[PATCH] net: drivers/net: Remove unnecessary skb_copy_expand OOM messages

2018-03-12 Thread Joe Perches

skb_copy_expand without __GFP_NOWARN already does a dump_stack
on OOM so these messages are redundant.

Signed-off-by: Joe Perches 
---
 drivers/net/ethernet/qualcomm/qca_spi.c | 1 -
 drivers/net/usb/lg-vl600.c              | 6 +-
 drivers/net/wimax/i2400m/usb-rx.c       | 3 ---
 drivers/net/wireless/ti/wl1251/tx.c     | 4 +---
 drivers/usb/gadget/function/f_eem.c     | 1 -
 net/mac80211/rx.c                       | 5 +
 net/netfilter/nfnetlink_queue.c         | 5 +
 7 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/qca_spi.c b/drivers/net/ethernet/qualcomm/qca_spi.c
index 9c236298fe21..5803cd6db406 100644
--- a/drivers/net/ethernet/qualcomm/qca_spi.c
+++ b/drivers/net/ethernet/qualcomm/qca_spi.c
@@ -705,7 +705,6 @@ qcaspi_netdev_xmit(struct sk_buff *skb, struct net_device *dev)
tskb = skb_copy_expand(skb, QCAFRM_HEADER_LEN,
   QCAFRM_FOOTER_LEN + pad_len, GFP_ATOMIC);
if (!tskb) {
-   netdev_dbg(qca->net_dev, "could not allocate tx_buff\n");
qca->stats.out_of_mem++;
return NETDEV_TX_BUSY;
}
diff --git a/drivers/net/usb/lg-vl600.c b/drivers/net/usb/lg-vl600.c
index dbabd7ca5268..257916f172cd 100644
--- a/drivers/net/usb/lg-vl600.c
+++ b/drivers/net/usb/lg-vl600.c
@@ -157,12 +157,8 @@ static int vl600_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
 
s->current_rx_buf = skb_copy_expand(skb, 0,
le32_to_cpup(&frame->len), GFP_ATOMIC);
-   if (!s->current_rx_buf) {
-   netif_err(dev, ifup, dev->net, "Reserving %i bytes "
-   "for packet assembly failed.\n",
-   le32_to_cpup(&frame->len));
+   if (!s->current_rx_buf)
dev->net->stats.rx_errors++;
-   }
 
return 0;
}
diff --git a/drivers/net/wimax/i2400m/usb-rx.c b/drivers/net/wimax/i2400m/usb-rx.c
index b78ee676e102..5b64bda7d9e7 100644
--- a/drivers/net/wimax/i2400m/usb-rx.c
+++ b/drivers/net/wimax/i2400m/usb-rx.c
@@ -263,9 +263,6 @@ struct sk_buff *i2400mu_rx(struct i2400mu *i2400mu, struct sk_buff *rx_skb)
new_skb = skb_copy_expand(rx_skb, 0, rx_size - rx_skb->len,
  GFP_KERNEL);
if (new_skb == NULL) {
-   if (printk_ratelimit())
-   dev_err(dev, "RX: Can't reallocate skb to %d; "
-   "RX dropped\n", rx_size);
kfree_skb(rx_skb);
rx_skb = NULL;
goto out;   /* drop it...*/
diff --git a/drivers/net/wireless/ti/wl1251/tx.c b/drivers/net/wireless/ti/wl1251/tx.c
index de2fa6705574..12ed14ebc307 100644
--- a/drivers/net/wireless/ti/wl1251/tx.c
+++ b/drivers/net/wireless/ti/wl1251/tx.c
@@ -221,10 +221,8 @@ static int wl1251_tx_send_packet(struct wl1251 *wl, struct sk_buff *skb,
struct sk_buff *newskb = skb_copy_expand(skb, 0, 3,
 GFP_KERNEL);
 
-   if (unlikely(newskb == NULL)) {
-   wl1251_error("Can't allocate skb!");
+   if (unlikely(newskb == NULL))
return -EINVAL;
-   }
 
tx_hdr = (struct tx_double_buffer_desc *) newskb->data;
 
diff --git a/drivers/usb/gadget/function/f_eem.c b/drivers/usb/gadget/function/f_eem.c
index 37557651b600..c13befa31110 100644
--- a/drivers/usb/gadget/function/f_eem.c
+++ b/drivers/usb/gadget/function/f_eem.c
@@ -507,7 +507,6 @@ static int eem_unwrap(struct gether *port,
0,
GFP_ATOMIC);
if (unlikely(!skb3)) {
-   DBG(cdev, "unable to realign EEM packet\n");
dev_kfree_skb_any(skb2);
continue;
}
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index d01743234cf6..9c898a3688c6 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -2549,11 +2549,8 @@ ieee80211_rx_h_mesh_fwding(struct ieee80211_rx_data *rx)
 
fwd_skb = skb_copy_expand(skb, local->tx_headroom +
   sdata->encrypt_headroom, 0, GFP_ATOMIC);
-   if (!fwd_skb) {
-   net_info_ratelimited("%s: failed to clone mesh frame\n",
-   sdata->name);
+   if (!fwd_skb)
goto out;
-   }
 
fwd_hdr =  (struct ieee80211_hdr *) fwd_skb->data;
fwd_hdr->frame_control &= ~cpu_to_le16(IEEE80211_FCTL_RETRY);
diff --git a/net/netfilter/nfnetlink_q

Re: Port triggering

2018-03-12 Thread Stéphane Veyret

Thank you for your help.

2018-03-12 12:25 GMT+01:00 Florian Westphal :
> (Or i still fail to understand what you want to do, it does
>  sound exactly like expectations, e.g. for ftp data channel in
>  response to PASV command on ftp control channel).

No, what I would like to have is more like an FTP *active* connection. The
(in-LAN) client initiates a connection to the server. The server
replies and then initiates a new connection (the data connection for FTP)
on a new port. I want this new connection to be associated with the first
one. This is also what we have with the rtsp or battle-net protocols.

> Something like:
>
> chain postrouting {
> type filter hook postrouting priority 0;
> # tell kernel to install an expectation
> # arriving on udp ports 6970-7170
> # expectation will follow whatever NAT transformation
> # is active on master connection
> # expectation is removed after 5 minutes
> # (we could of course also allow to install an expectation
> # for 'foreign' addresses as well but I don't think its needed
> # yet
> ip dport 554 ct expectation set udp dport 6970-7170 timeout 5m
> }

It may be what I'm looking for. But I couldn't find any documentation
about this “ct expectation” command. Or do you mean I should create a
conntrack helper module for that?

-- 
Bien cordialement, / Plej kore,

Stéphane Veyret

Re: Port triggering

2018-03-12 Thread Florian Westphal

Stéphane Veyret  wrote:
> 2018-03-12 12:25 GMT+01:00 Florian Westphal :
> > (Or i still fail to understand what you want to do, it does
> >  sound exactly like expectations, e.g. for ftp data channel in
> >  response to PASV command on ftp control channel).
> 
> No, what I would like to have is more like FTP *active* connexion.

That's what I meant :-/

(PORT command, not PASV).

> > Something like:
> >
> > chain postrouting {
> > type filter hook postrouting priority 0;
> > # tell kernel to install an expectation
> > # arriving on udp ports 6970-7170
> > # expectation will follow whatever NAT transformation
> > # is active on master connection
> > # expectation is removed after 5 minutes
> > # (we could of course also allow to install an expectation
> > # for 'foreign' addresses as well but I don't think its needed
> > # yet
> > ip dport 554 ct expectation set udp dport 6970-7170 timeout 5m
> > }
> 
> It may be what I'm looking for. But I couldn't find any documentation
> about this “ct expectation” command. Or do you mean I should create a
> conntrack helper module for that?

Right, this doesn't exist yet.

I think we (you) should consider extending net/netfilter/nft_ct.c to
support a new NFT_CT_EXPECT attribute in the nft_ct_set_eval() function.

This would then install a new expectation based on what userspace told
us.

You can look at
net/netfilter/nf_conntrack_ftp.c
and search for nf_ct_expect_alloc() to see where the ftp helper installs
the expectation.

The main difference would be that with nft_ct.c, most properties of
the new expectation would be determined by netlink attributes which were
set by the nftables ruleset.
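
To illustrate the idea discussed in this thread, here is a toy userspace model of a single expectation. It is only a sketch under loud assumptions: struct demo_expect and both functions are hypothetical, nothing here is kernel or conntrack API, and the matching rule (recorded server address, port range, timeout, DNAT to the recorded client) is taken from the port-triggering description earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one expectation; not a kernel structure. */
struct demo_expect {
	uint32_t client;   /* LAN host that opened the master connection */
	uint32_t server;   /* remote host allowed to open the related flow */
	uint16_t port_min; /* first expected destination port */
	uint16_t port_max; /* last expected destination port */
	uint64_t expires;  /* absolute expiry time in seconds */
};

/* Record an expectation when the trigger packet is seen. */
static void demo_expect_set(struct demo_expect *e, uint32_t client,
			    uint32_t server, uint16_t port_min,
			    uint16_t port_max, uint64_t now, uint64_t timeout)
{
	assert(e != NULL && port_min <= port_max);
	e->client = client;
	e->server = server;
	e->port_min = port_min;
	e->port_max = port_max;
	e->expires = now + timeout;
}

/*
 * Return true if a packet from 'src' to 'dport' at time 'now' matches a
 * live expectation; on a match, *dnat_to receives the recorded client.
 */
static bool demo_expect_match(const struct demo_expect *e, uint32_t src,
			      uint16_t dport, uint64_t now, uint32_t *dnat_to)
{
	assert(e != NULL && dnat_to != NULL);
	if (now >= e->expires)
		return false;
	if (src != e->server)
		return false;
	if (dport < e->port_min || dport > e->port_max)
		return false;
	*dnat_to = e->client;
	return true;
}
```

In the real kernel the expectation would be allocated and registered through the conntrack core, and most of these fields would come from netlink attributes set by the ruleset, as described above.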

Re: Port triggering

2018-03-12 Thread Stéphane Veyret

2018-03-12 16:53 GMT+01:00 Florian Westphal :
>> It may be what I'm looking for. But I couldn't find any documentation
>> about this “ct expectation” command. Or do you mean I should create a
>> conntrack helper module for that?
>
> Right, this doesn't exist yet.
>
> I think we (you) should consider to extend net/netfilter/nft_ct.c, to
> support a new NFT_CT_EXPECT attribute in nft_ct_set_eval() function.
>
> This would then install a new expectation based on what userspace told
> us.
>
> You can look at
> net/netfilter/nf_conntrack_ftp.c
> and search for nf_ct_expect_alloc() to see where the ftp helper installs
> the expectation.
>
> The main difference would be that with nft_ct.c, most properties of
> the new expectation would be determined by netlink attributes which were
> set by the nftables ruleset.

Thank you, I'll do that… :-)

-- 
Bien cordialement, / Plej kore,

Stéphane Veyret

Re: [PATCH nft] netlink: use nftnl_flowtable_get/set

2018-03-12 Thread Pablo Neira Ayuso

On Mon, Mar 12, 2018 at 01:00:17PM +0100, Florian Westphal wrote:
> the '_array' variant is just a wrapper for get/set api; this
> allows the array variant to be removed from libnftnl.

LGTM, thanks Florian!

> Signed-off-by: Florian Westphal 
> ---
>  src/netlink.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/netlink.c b/src/netlink.c
> index a74dc2551e88..bfa30502a2b2 100644
> --- a/src/netlink.c
> +++ b/src/netlink.c
> @@ -1588,7 +1588,7 @@ int netlink_add_flowtable(struct netlink_ctx *ctx, 
> const struct handle *h,
>   dev_array[i++] = expr->identifier;
>  
>   dev_array[i] = NULL;
> - nftnl_flowtable_set_array(flo, NFTNL_FLOWTABLE_DEVICES, dev_array);
> + nftnl_flowtable_set(flo, NFTNL_FLOWTABLE_DEVICES, dev_array);
>  
>   netlink_dump_flowtable(flo, ctx);
>  
> @@ -1678,7 +1678,7 @@ netlink_delinearize_flowtable(struct netlink_ctx *ctx,
> struct nftnl_flowtable *nlo)
>  {
>   struct flowtable *flowtable;
> - const char **dev_array;
> + const char * const *dev_array;
>   int len = 0, i;
>  
>   flowtable = flowtable_alloc(&netlink_location);
> @@ -1688,8 +1688,8 @@ netlink_delinearize_flowtable(struct netlink_ctx *ctx,
>   xstrdup(nftnl_flowtable_get_str(nlo, NFTNL_FLOWTABLE_TABLE));
>   flowtable->handle.flowtable =
>   xstrdup(nftnl_flowtable_get_str(nlo, NFTNL_FLOWTABLE_NAME));
> - dev_array = nftnl_flowtable_get_array(nlo, NFTNL_FLOWTABLE_DEVICES);
> - while (dev_array[len] != '\0')
> + dev_array = nftnl_flowtable_get(nlo, NFTNL_FLOWTABLE_DEVICES);
> + while (dev_array[len])
>   len++;
>  
>   flowtable->dev_array = calloc(1, len * sizeof(char *));
> -- 
> 2.16.1
> 

[PATCH 1/5] netfilter: nft_set_hash: skip fixed hash if timeout is specified

2018-03-12 Thread Pablo Neira Ayuso

The fixed hash does not support timeouts, so skip it when a timeout is
specified. Otherwise, userspace hits EOPNOTSUPP.

Fixes: 6c03ae210ce3 ("netfilter: nft_set_hash: add non-resizable hashtable implementation")
Signed-off-by: Pablo Neira Ayuso 
---
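
As a reading aid, the condition changed by this patch can be sketched in isolation as below. Everything here is a hypothetical stand-in: struct demo_set_desc, DEMO_SET_TIMEOUT and demo_use_fixed_hash() are illustration names, and the flag value is an assumption rather than a quote from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timeout flag bit, standing in for NFT_SET_TIMEOUT. */
#define DEMO_SET_TIMEOUT 0x10u

/* Hypothetical subset of a set description. */
struct demo_set_desc {
	uint32_t size; /* number of elements announced by userspace, 0 if unknown */
	uint32_t klen; /* key length in bytes */
};

/*
 * The fixed-size hash is only eligible when the set size is known and
 * no timeout was requested; otherwise another backend must be used.
 */
static bool demo_use_fixed_hash(const struct demo_set_desc *desc,
				uint32_t flags)
{
	assert(desc != NULL);
	return desc->size && !(flags & DEMO_SET_TIMEOUT);
}
```

With a known size and no timeout flag the fixed backend is chosen as before; adding the timeout flag now falls through to the other backend instead of failing later.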
 net/netfilter/nft_set_hash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index 3f1624ee056f..d40591fe1b2f 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -674,7 +674,7 @@ static const struct nft_set_ops *
 nft_hash_select_ops(const struct nft_ctx *ctx, const struct nft_set_desc *desc,
u32 flags)
 {
-   if (desc->size) {
+   if (desc->size && !(flags & NFT_SET_TIMEOUT)) {
switch (desc->klen) {
case 4:
return &nft_hash_fast_ops;
-- 
2.11.0


[PATCH 0/5] Netfilter fixes for net

2018-03-12 Thread Pablo Neira Ayuso

Hi David,

The following patchset contains Netfilter fixes for your net tree, they are:

1) The fixed hashtable representation doesn't support the timeout flag;
   skip it, otherwise rules that add elements from the packet path
   bogusly fail with EOPNOTSUPP.

2) Fix bogus error with 32-bit ebtables userspace and 64-bit kernel,
   patch from Florian Westphal.

3) Sanitize proc names in several x_tables extensions, also from Florian.

4) Add sanitization to ebt_among wormhash logic, from Florian.

5) Missing release of hook array in flowtable.


You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!



The following changes since commit ce380619fab99036f5e745c7a865b21c59f005f6:

  Merge tag 'please-pull-ia64_misc' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux (2018-03-05 20:31:14 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to c04a3f730021c304c7cc4bc30ee57ee70ad98d57:

  netfilter: nf_tables: release flowtable hooks (2018-03-11 21:24:56 +0100)


Florian Westphal (3):
  netfilter: ebtables: fix erroneous reject of last rule
  netfilter: x_tables: add and use xt_check_proc_name
  netfilter: bridge: ebt_among: add more missing match size checks

Pablo Neira Ayuso (2):
  netfilter: nft_set_hash: skip fixed hash if timeout is specified
  netfilter: nf_tables: release flowtable hooks

 include/linux/netfilter/x_tables.h |  2 ++
 net/bridge/netfilter/ebt_among.c   | 34 ++
 net/bridge/netfilter/ebtables.c    |  6 +-
 net/netfilter/nf_tables_api.c      |  1 +
 net/netfilter/nft_set_hash.c       |  2 +-
 net/netfilter/x_tables.c           | 30 ++
 net/netfilter/xt_hashlimit.c       | 16 ++--
 net/netfilter/xt_recent.c          |  6 +++---
 8 files changed, 86 insertions(+), 11 deletions(-)

[PATCH 5/5] netfilter: nf_tables: release flowtable hooks

2018-03-12 Thread Pablo Neira Ayuso

Otherwise we leak this array.

Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_tables_api.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 558593e6a0a3..c4acc7340eb1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5423,6 +5423,7 @@ static void nf_tables_flowtable_notify(struct nft_ctx *ctx,
 static void nf_tables_flowtable_destroy(struct nft_flowtable *flowtable)
 {
cancel_delayed_work_sync(&flowtable->data.gc_work);
+   kfree(flowtable->ops);
kfree(flowtable->name);
flowtable->data.type->free(&flowtable->data);
rhashtable_destroy(&flowtable->data.rhashtable);
-- 
2.11.0


[PATCH 2/5] netfilter: ebtables: fix erroneous reject of last rule

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

The last rule in the blob has a next_entry offset that is equal to the
total size. This made "ebtables32 -A OUTPUT -d de:ad:be:ef:01:02" fail on
a 64-bit kernel.

Fixes: b71812168571fa ("netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets")
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
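
The adjusted check can be sketched outside the kernel as follows. This is a standalone rendering of the loop in the hunk below, with a hypothetical function name (demo_check_offsets) and the offsets passed in directly: only the last of the four offsets may equal the total size, and the offsets must not decrease.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Validate four entry-relative offsets against the total entry size.
 * Returns 0 if they are acceptable and -EINVAL otherwise.
 */
static int demo_check_offsets(const unsigned int offsets[4],
			      unsigned int total)
{
	size_t i;

	assert(offsets != NULL);
	for (i = 0; i < 4; i++) {
		/* No offset may point past the end of the entry. */
		if (offsets[i] > total)
			return -EINVAL;
		/* Only the final offset may sit exactly at the end. */
		if (i < 3 && offsets[i] == total)
			return -EINVAL;
		/* Offsets must be in non-decreasing order. */
		if (i > 0 && offsets[i - 1] > offsets[i])
			return -EINVAL;
	}
	return 0;
}
```

With the old ">=" comparison the first array in a case like { 8, 16, 24, 112 } with total 112 would have been rejected, which matches the failure described in the commit message.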
 net/bridge/netfilter/ebtables.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 254ef9f49567..a94d23b0a9af 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -2119,8 +2119,12 @@ static int size_entry_mwt(struct ebt_entry *entry, const unsigned char *base,
 * offsets are relative to beginning of struct ebt_entry (i.e., 0).
 */
for (i = 0; i < 4 ; ++i) {
-   if (offsets[i] >= *total)
+   if (offsets[i] > *total)
return -EINVAL;
+
+   if (i < 3 && offsets[i] == *total)
+   return -EINVAL;
+
if (i == 0)
continue;
if (offsets[i-1] > offsets[i])
-- 
2.11.0


[PATCH 3/5] netfilter: x_tables: add and use xt_check_proc_name

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

recent and hashlimit both create /proc files, but only check that
name is 0 terminated.

This can trigger WARN() from procfs when name is "" or "/".
Add helper for this and then use it for both.

Cc: Eric Dumazet 
Reported-by: Eric Dumazet 
Reported-by: 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
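
For experimenting with the rule set outside the kernel, the helper added below can be approximated in userspace like this. The function name demo_check_proc_name is hypothetical, and memchr() is swapped in for the strnlen() test used in the patch; the accepted and rejected names are meant to be the same.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Check that 'name', stored in a buffer of 'size' bytes, could serve as
 * a /proc file name: non-empty, NUL terminated within the buffer, not
 * "." or "..", and free of '/'.
 * Returns 0 if usable, a negative errno value otherwise.
 */
static int demo_check_proc_name(const char *name, size_t size)
{
	assert(name != NULL);

	if (name[0] == '\0')
		return -EINVAL;

	/* No terminator inside the buffer: same case as strnlen() == size. */
	if (memchr(name, '\0', size) == NULL)
		return -ENAMETOOLONG;

	if (strcmp(name, ".") == 0 ||
	    strcmp(name, "..") == 0 ||
	    strchr(name, '/'))
		return -EINVAL;

	return 0;
}
```

The termination check runs before the strcmp()/strchr() calls so that those never read past an unterminated buffer.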
 include/linux/netfilter/x_tables.h |  2 ++
 net/netfilter/x_tables.c           | 30 ++
 net/netfilter/xt_hashlimit.c       | 16 ++--
 net/netfilter/xt_recent.c          |  6 +++---
 4 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index 1313b35c3ab7..14529511c4b8 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -285,6 +285,8 @@ unsigned int *xt_alloc_entry_offsets(unsigned int size);
 bool xt_find_jump_offset(const unsigned int *offsets,
 unsigned int target, unsigned int size);
 
+int xt_check_proc_name(const char *name, unsigned int size);
+
 int xt_check_match(struct xt_mtchk_param *, unsigned int size, u_int8_t proto,
   bool inv_proto);
 int xt_check_target(struct xt_tgchk_param *, unsigned int size, u_int8_t proto,
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index fa1655aff8d3..4aa01c90e9d1 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -423,6 +423,36 @@ textify_hooks(char *buf, size_t size, unsigned int mask, uint8_t nfproto)
return buf;
 }
 
+/**
+ * xt_check_proc_name - check that name is suitable for /proc file creation
+ *
+ * @name: file name candidate
+ * @size: length of buffer
+ *
+ * some x_tables modules wish to create a file in /proc.
+ * This function makes sure that the name is suitable for this
+ * purpose, it checks that name is NUL terminated and isn't a 'special'
+ * name, like "..".
+ *
+ * returns negative number on error or 0 if name is useable.
+ */
+int xt_check_proc_name(const char *name, unsigned int size)
+{
+   if (name[0] == '\0')
+   return -EINVAL;
+
+   if (strnlen(name, size) == size)
+   return -ENAMETOOLONG;
+
+   if (strcmp(name, ".") == 0 ||
+   strcmp(name, "..") == 0 ||
+   strchr(name, '/'))
+   return -EINVAL;
+
+   return 0;
+}
+EXPORT_SYMBOL(xt_check_proc_name);
+
 int xt_check_match(struct xt_mtchk_param *par,
   unsigned int size, u_int8_t proto, bool inv_proto)
 {
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 66f5aca62a08..3360f13dc208 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -917,8 +917,9 @@ static int hashlimit_mt_check_v1(const struct xt_mtchk_param *par)
struct hashlimit_cfg3 cfg = {};
int ret;
 
-   if (info->name[sizeof(info->name) - 1] != '\0')
-   return -EINVAL;
+   ret = xt_check_proc_name(info->name, sizeof(info->name));
+   if (ret)
+   return ret;
 
ret = cfg_copy(&cfg, (void *)&info->cfg, 1);
 
@@ -935,8 +936,9 @@ static int hashlimit_mt_check_v2(const struct xt_mtchk_param *par)
struct hashlimit_cfg3 cfg = {};
int ret;
 
-   if (info->name[sizeof(info->name) - 1] != '\0')
-   return -EINVAL;
+   ret = xt_check_proc_name(info->name, sizeof(info->name));
+   if (ret)
+   return ret;
 
ret = cfg_copy(&cfg, (void *)&info->cfg, 2);
 
@@ -950,9 +952,11 @@ static int hashlimit_mt_check_v2(const struct xt_mtchk_param *par)
 static int hashlimit_mt_check(const struct xt_mtchk_param *par)
 {
struct xt_hashlimit_mtinfo3 *info = par->matchinfo;
+   int ret;
 
-   if (info->name[sizeof(info->name) - 1] != '\0')
-   return -EINVAL;
+   ret = xt_check_proc_name(info->name, sizeof(info->name));
+   if (ret)
+   return ret;
 
return hashlimit_mt_check_common(par, &info->hinfo, &info->cfg,
 info->name, 3);
diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c
index 6d232d18faff..81ee1d6543b2 100644
--- a/net/netfilter/xt_recent.c
+++ b/net/netfilter/xt_recent.c
@@ -361,9 +361,9 @@ static int recent_mt_check(const struct xt_mtchk_param *par,
info->hit_count, XT_RECENT_MAX_NSTAMPS - 1);
return -EINVAL;
}
-   if (info->name[0] == '\0' ||
-   strnlen(info->name, XT_RECENT_NAME_LEN) == XT_RECENT_NAME_LEN)
-   return -EINVAL;
+   ret = xt_check_proc_name(info->name, sizeof(info->name));
+   if (ret)
+   return ret;
 
if (ip_pkt_list_tot && info->hit_count < ip_pkt_list_tot)
nstamp_mask = roundup_pow_of_two(ip_pkt_list_tot) - 1;
-- 
2.11.0


[PATCH 4/5] netfilter: bridge: ebt_among: add more missing match size checks

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.

commit c4585a2823edf ("bridge: ebt_among: add missing match size checks")
added validation for pool size, but missed the fact that the macros
ebt_among_wh_src/dst can already return an out-of-bounds result because
they do not check the value of wh_src/dst_ofs (an offset) against the size
of the match that userspace gave us.

v2:
check that offset has correct alignment.
Paolo Abeni points out that we should also check that src/dst
wormhash arrays do not overlap, and src + length lines up with
start of dst (or vice versa).
v3: compact wormhash_sizes_valid() part

NB: The Fixes tag is intentionally wrong; this bug has existed since day
one, when the match was added for the 2.6 kernel. The tag is there so
stable maintainers will notice this one too.

Tested with same rules from the earlier patch.

Fixes: c4585a2823edf ("bridge: ebt_among: add missing match size checks")
Reported-by: 
Signed-off-by: Florian Westphal 
Reviewed-by: Eric Dumazet 
Signed-off-by: Pablo Neira Ayuso 
---
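The offset check described above can be sketched in userspace as
follows. struct info_hdr and struct table_hdr are made-up stand-ins for
the fixed match header and the variable-length table header; they are
not the kernel definitions, and only the offset rule is shown (the
adjacency check between the two tables is left out).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's structures. */
struct info_hdr { int dst_ofs; int src_ofs; int bitmask; };
struct table_hdr { int table[257]; int poolsize; };

/* Returns true when `off` cannot be the start of a table_hdr inside a
 * blob of `len` bytes. An offset of 0 means "table not present". */
bool table_offset_invalid(int off, unsigned int len)
{
	if (off == 0)
		return false;
	/* must lie behind the fixed header and be suitably aligned */
	if (off < (int)sizeof(struct info_hdr) ||
	    (size_t)off % _Alignof(struct table_hdr) != 0)
		return true;
	/* the table header itself must fit inside the blob */
	return (size_t)off + sizeof(struct table_hdr) > len;
}
```
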
 net/bridge/netfilter/ebt_among.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/net/bridge/netfilter/ebt_among.c b/net/bridge/netfilter/ebt_among.c
index c5afb4232ecb..620e54f08296 100644
--- a/net/bridge/netfilter/ebt_among.c
+++ b/net/bridge/netfilter/ebt_among.c
@@ -177,6 +177,28 @@ static bool poolsize_invalid(const struct ebt_mac_wormhash *w)
return w && w->poolsize >= (INT_MAX / sizeof(struct ebt_mac_wormhash_tuple));
 }
 
+static bool wormhash_offset_invalid(int off, unsigned int len)
+{
+   if (off == 0) /* not present */
+   return false;
+
+   if (off < (int)sizeof(struct ebt_among_info) ||
+   off % __alignof__(struct ebt_mac_wormhash))
+   return true;
+
+   off += sizeof(struct ebt_mac_wormhash);
+
+   return off > len;
+}
+
+static bool wormhash_sizes_valid(const struct ebt_mac_wormhash *wh, int a, int b)
+{
+   if (a == 0)
+   a = sizeof(struct ebt_among_info);
+
+   return ebt_mac_wormhash_size(wh) + a == b;
+}
+
 static int ebt_among_mt_check(const struct xt_mtchk_param *par)
 {
const struct ebt_among_info *info = par->matchinfo;
@@ -189,6 +211,10 @@ static int ebt_among_mt_check(const struct xt_mtchk_param *par)
if (expected_length > em->match_size)
return -EINVAL;
 
+   if (wormhash_offset_invalid(info->wh_dst_ofs, em->match_size) ||
+   wormhash_offset_invalid(info->wh_src_ofs, em->match_size))
+   return -EINVAL;
+
wh_dst = ebt_among_wh_dst(info);
if (poolsize_invalid(wh_dst))
return -EINVAL;
@@ -201,6 +227,14 @@ static int ebt_among_mt_check(const struct xt_mtchk_param *par)
if (poolsize_invalid(wh_src))
return -EINVAL;
 
+   if (info->wh_src_ofs < info->wh_dst_ofs) {
+   if (!wormhash_sizes_valid(wh_src, info->wh_src_ofs, info->wh_dst_ofs))
+   return -EINVAL;
+   } else {
+   if (!wormhash_sizes_valid(wh_dst, info->wh_dst_ofs, info->wh_src_ofs))
+   return -EINVAL;
+   }
+
expected_length += ebt_mac_wormhash_size(wh_src);
 
if (em->match_size != EBT_ALIGN(expected_length)) {
-- 
2.11.0


Re: [PATCH 0/5] Netfilter fixes for net

2018-03-12 Thread David Miller

From: Pablo Neira Ayuso 
Date: Mon, 12 Mar 2018 17:15:59 +0100

> The following patchset contains Netfilter fixes for your net tree, they are:
> 
> 1) Fixed hashtable representation doesn't support timeout flag, skip it
>    otherwise rules to add elements from the packet bogusly fail with
>    EOPNOTSUPP.
> 
> 2) Fix bogus error with 32-bit ebtables userspace and 64-bit kernel,
>    patch from Florian Westphal.
> 
> 3) Sanitize proc names in several x_tables extensions, also from Florian.
> 
> 4) Add sanitization to ebt_among wormhash logic, from Florian.
> 
> 5) Missing release of hook array in flowtable.

Pulled, thank you.

[PATCH 00/30] Netfilter/IPVS updates for net-next

2018-03-12 Thread Pablo Neira Ayuso

Hi David,

The following patchset contains Netfilter/IPVS updates for your net-next
tree. This batch comes with more input sanitization for xtables to
address bug reports from fuzzers, preparation work for the flowtable
infrastructure and assorted updates. In no particular order, they are:

1) Make sure userspace provides a valid standard target verdict, from
   Florian Westphal.

2) Sanitize error target size, also from Florian.

3) Validate that last rule in basechain matches underflow/policy since
   userspace assumes this when decoding the ruleset blob that comes
   from the kernel, from Florian.

4) Consolidate hook entry checks through xt_check_table_hooks(),
   patch from Florian.

5) Cap ruleset allocations at 512 mbytes, 134217728 rules and reject
   very large compat offset arrays, so we have a reasonable upper limit
   and fuzzers don't exercise the oom-killer. Patches from Florian.

6) Several WARN_ON checks on xtables mutex helper, from Florian.

7) xt_rateest now has a hashtable per net, from Cong Wang.

8) Consolidate counter allocation in xt_counters_alloc(), from Florian.

9) Earlier xt_table_unlock() call in {ip,ip6,arp,eb}tables, patch
   from Xin Long.

10) Set FLOW_OFFLOAD_DIR_* to IP_CT_DIR_* definitions, patch from
Felix Fietkau.

11) Consolidate code through flow_offload_fill_dir(), also from Felix.

12) Inline ip6_dst_mtu_forward() just like ip_dst_mtu_maybe_forward()
to remove a dependency with flowtable and ipv6.ko, from Felix.

13) Cache mtu size in flow_offload_tuple object, this is safe for
forwarding as f87c10a8aa1e describes, from Felix.

14) Rename nf_flow_table.c to nf_flow_table_core.c, to simplify the overly
modular infrastructure, from Felix.

15) Add rt0, rt2 and rt4 IPv6 routing extension support, patch from
Ahmed Abdelsalam.

16) Remove unused parameter in nf_conncount_count(), from Yi-Hung Wei.

17) Add count-only support to the nf_conncount infrastructure, patch
from Yi-Hung Wei.

18) Add strict NFT_CT_{SRC_IP,DST_IP,SRC_IP6,DST_IP6} key datatypes
to nft_ct.

19) Use boolean as return value from ipt_ah and from IPVS too, patch
from Gustavo A. R. Silva.

20) Remove useless parameters in nfnl_acct_overquota() and
nf_conntrack_broadcast_help(), from Taehee Yoo.

21) Use ipv6_addr_is_multicast() from xt_cluster, also from Taehee Yoo.

22) Statify nf_tables_obj_lookup_byhandle, patch from Fengguang Wu.

23) Fix typo in xt_limit, from Geert Uytterhoeven.
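
The two limits in item 5 appear to be related: 134217728 is what you get
when a 512 MiB budget is divided into 4-byte slots. The sketch below
only reproduces that arithmetic; MAX_TABLE_BYTES and max_elems() are
illustrative names, and the 4-byte slot size is an assumption (one
unsigned int per rule).

```c
#include <stddef.h>

/* Illustrative constant: the 512 MiB cap from item 5. */
#define MAX_TABLE_BYTES (512u * 1024u * 1024u)

/* Largest element count whose array still fits under the cap. */
size_t max_elems(size_t elem_size)
{
	return elem_size ? MAX_TABLE_BYTES / elem_size : 0;
}
```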

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks!



The following changes since commit ef3f6c256f0b4711a3ef1489797b95820be5ab01:

  Merge branch 'mvpp2-jumbo-frames-support' (2018-03-05 12:55:55 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD

for you to fetch changes up to 90eee0957b655339d659c0b9bba64f5c90b2233b:

  netfilter: nft_ct: add NFT_CT_{SRC,DST}_{IP,IP6} (2018-03-11 22:17:57 +0100)


Ahmed Abdelsalam (1):
  netfilter: nf_tables: handle rt0 and rt2 properly

Cong Wang (1):
  netfilter: make xt_rateest hash table per net

Felix Fietkau (5):
  netfilter: nf_flow_table: use IP_CT_DIR_* values for FLOW_OFFLOAD_DIR_*
  netfilter: nf_flow_table: clean up flow_offload_alloc
  ipv6: make ip6_dst_mtu_forward inline
  netfilter: nf_flow_table: cache mtu in struct flow_offload_tuple
  netfilter: nf_flow_table: rename nf_flow_table.c to nf_flow_table_core.c

Florian Westphal (12):
  netfilter: x_tables: check standard verdicts in core
  netfilter: x_tables: check error target size too
  netfilter: x_tables: move hook entry checks into core
  netfilter: x_tables: enforce unique and ascending entry points
  netfilter: x_tables: cap allocations at 512 mbyte
  netfilter: x_tables: limit allocation requests for blob rule heads
  netfilter: x_tables: add counters allocation wrapper
  netfilter: compat: prepare xt_compat_init_offsets to return errors
  netfilter: compat: reject huge allocation requests
  netfilter: x_tables: make sure compat af mutex is held
  netfilter: x_tables: ensure last rule in base chain matches underflow/policy
  netfilter: x_tables: fix build with CONFIG_COMPAT=n

Geert Uytterhoeven (1):
  netfilter: xt_limit: Spelling s/maxmum/maximum/

Gustavo A. R. Silva (2):
  netfilter: ipt_ah: return boolean instead of integer
  ipvs: use true and false for boolean values

Pablo Neira Ayuso (1):
  netfilter: nft_ct: add NFT_CT_{SRC,DST}_{IP,IP6}

Taehee Yoo (3):
  netfilter: nfnetlink_acct: remove useless parameter
  netfilter: xt_cluster: get rid of xt_cluster_ipv6_is_multicast
  netfilter: nf_conntrack_broadcast: remove useless parameter

Xin Long (1):
  netfilter: unlock xt_table earlier in __do_replace

Yi-Hung Wei (2):

[PATCH 15/30] netfilter: compat: reject huge allocation requests

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

There is no need to even try to allocate huge compat offset arrays;
such a ruleset is rejected later on anyway because we refuse to allocate
overly large rule blobs.

However, compat translation happens before blob allocation, so we should
add a check there too.

This is supposed to help with fuzzing by avoiding oom-killer.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
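The change moves the allocation out of the "add" path and into the
"init" path, where the requested size can be checked once. A simplified
userspace sketch of that two-step pattern follows. All names
(delta_table, TABLE_CAP_BYTES, ...) are illustrative, malloc() stands in
for vmalloc(), and the delta accumulation done by the real add function
is omitted.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define TABLE_CAP_BYTES (512u * 1024u * 1024u) /* illustrative cap */

struct delta { unsigned int offset; int delta; };

struct delta_table {
	struct delta *tab;
	unsigned int number; /* slots reserved by init */
	unsigned int cur;    /* slots used so far */
};

/* Size-check and allocate the table once, before any entry is added. */
int delta_table_init(struct delta_table *t, unsigned int number)
{
	size_t mem;

	if (number == 0 || number > INT_MAX / sizeof(struct delta))
		return -EINVAL;
	if (t->tab != NULL) /* already initialised */
		return -EINVAL;
	mem = sizeof(struct delta) * number;
	if (mem > TABLE_CAP_BYTES)
		return -ENOMEM;
	t->tab = malloc(mem);
	if (t->tab == NULL)
		return -ENOMEM;
	t->number = number;
	t->cur = 0;
	return 0;
}

/* Adding never allocates; it only fills slots reserved by init. */
int delta_table_add(struct delta_table *t, unsigned int offset, int delta)
{
	if (t->tab == NULL)
		return -ENOMEM;
	if (t->cur >= t->number)
		return -EINVAL;
	t->tab[t->cur].offset = offset;
	t->tab[t->cur].delta = delta;
	t->cur++;
	return 0;
}
```
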
 net/netfilter/x_tables.c | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index e878c85a9268..33724b08b8f0 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -582,14 +582,8 @@ int xt_compat_add_offset(u_int8_t af, unsigned int offset, int delta)
 {
struct xt_af *xp = &xt[af];
 
-   if (!xp->compat_tab) {
-   if (!xp->number)
-   return -EINVAL;
-   xp->compat_tab = vmalloc(sizeof(struct compat_delta) * xp->number);
-   if (!xp->compat_tab)
-   return -ENOMEM;
-   xp->cur = 0;
-   }
+   if (WARN_ON(!xp->compat_tab))
+   return -ENOMEM;
 
if (xp->cur >= xp->number)
return -EINVAL;
@@ -634,6 +628,22 @@ EXPORT_SYMBOL_GPL(xt_compat_calc_jump);
 
 int xt_compat_init_offsets(u8 af, unsigned int number)
 {
+   size_t mem;
+
+   if (!number || number > (INT_MAX / sizeof(struct compat_delta)))
+   return -EINVAL;
+
+   if (WARN_ON(xt[af].compat_tab))
+   return -EINVAL;
+
+   mem = sizeof(struct compat_delta) * number;
+   if (mem > XT_MAX_TABLE_SIZE)
+   return -ENOMEM;
+
+   xt[af].compat_tab = vmalloc(mem);
+   if (!xt[af].compat_tab)
+   return -ENOMEM;
+
xt[af].number = number;
xt[af].cur = 0;
 
-- 
2.11.0


[PATCH 05/30] netfilter: ipt_ah: return boolean instead of integer

2018-03-12 Thread Pablo Neira Ayuso

From: "Gustavo A. R. Silva" 

Return statements in functions returning bool should use
true/false instead of 1/0.

This issue was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/ipt_ah.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/netfilter/ipt_ah.c b/net/ipv4/netfilter/ipt_ah.c
index a787d07f6cb7..7c6c20eaf4db 100644
--- a/net/ipv4/netfilter/ipt_ah.c
+++ b/net/ipv4/netfilter/ipt_ah.c
@@ -47,7 +47,7 @@ static bool ah_mt(const struct sk_buff *skb, struct xt_action_param *par)
 */
pr_debug("Dropping evil AH tinygram.\n");
par->hotdrop = true;
-   return 0;
+   return false;
}
 
return spi_match(ahinfo->spis[0], ahinfo->spis[1],
-- 
2.11.0


[PATCH 13/30] netfilter: x_tables: add counters allocation wrapper

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

Allows size checks to be kept in a single spot.
This is supposed to reduce OOM situations when fuzz-testing xtables.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
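A userspace sketch of such a wrapper, assuming a 512 MiB cap and using
calloc() in place of vzalloc(). struct counter, TABLE_CAP_BYTES and
counters_alloc() are illustrative names, not the kernel's.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_CAP_BYTES (512u * 1024u * 1024u) /* illustrative cap */

struct counter { uint64_t pcnt, bcnt; };

/* Return a zeroed array of `n` counters, or NULL if `n` is zero,
 * would overflow the size computation, or exceeds the cap. */
struct counter *counters_alloc(unsigned int n)
{
	if (n == 0 || n > INT_MAX / sizeof(struct counter))
		return NULL;
	if ((size_t)n * sizeof(struct counter) > TABLE_CAP_BYTES)
		return NULL;
	return calloc(n, sizeof(struct counter));
}
```

Callers keep their existing NULL check, so every caller gets the same
limits without repeating them.
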
 include/linux/netfilter/x_tables.h |  1 +
 net/ipv4/netfilter/arp_tables.c|  2 +-
 net/ipv4/netfilter/ip_tables.c |  2 +-
 net/ipv6/netfilter/ip6_tables.c|  2 +-
 net/netfilter/x_tables.c   | 15 +++
 5 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index fa0c19c328f1..0bd93c589a8c 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -301,6 +301,7 @@ int xt_data_to_user(void __user *dst, const void *src,
 
 void *xt_copy_counters_from_user(const void __user *user, unsigned int len,
 struct xt_counters_info *info, bool compat);
+struct xt_counters *xt_counters_alloc(unsigned int counters);
 
 struct xt_table *xt_register_table(struct net *net,
   const struct xt_table *table,
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index be5821215ea0..82ba09b50fdb 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -883,7 +883,7 @@ static int __do_replace(struct net *net, const char *name,
struct arpt_entry *iter;
 
ret = 0;
-   counters = vzalloc(num_counters * sizeof(struct xt_counters));
+   counters = xt_counters_alloc(num_counters);
if (!counters) {
ret = -ENOMEM;
goto out;
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 29bda9484a33..4901ca6c3e09 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -1045,7 +1045,7 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
struct ipt_entry *iter;
 
ret = 0;
-   counters = vzalloc(num_counters * sizeof(struct xt_counters));
+   counters = xt_counters_alloc(num_counters);
if (!counters) {
ret = -ENOMEM;
goto out;
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index ba3776a4d305..e84cec49b60f 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1063,7 +1063,7 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
struct ip6t_entry *iter;
 
ret = 0;
-   counters = vzalloc(num_counters * sizeof(struct xt_counters));
+   counters = xt_counters_alloc(num_counters);
if (!counters) {
ret = -ENOMEM;
goto out;
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 01f8e122e74e..82b1f8f52ac6 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1290,6 +1290,21 @@ static int xt_jumpstack_alloc(struct xt_table_info *i)
return 0;
 }
 
+struct xt_counters *xt_counters_alloc(unsigned int counters)
+{
+   struct xt_counters *mem;
+
+   if (counters == 0 || counters > INT_MAX / sizeof(*mem))
+   return NULL;
+
+   counters *= sizeof(*mem);
+   if (counters > XT_MAX_TABLE_SIZE)
+   return NULL;
+
+   return vzalloc(counters);
+}
+EXPORT_SYMBOL(xt_counters_alloc);
+
 struct xt_table_info *
 xt_replace_table(struct xt_table *table,
  unsigned int num_counters,
-- 
2.11.0


[PATCH 29/30] netfilter: conncount: Support count only use case

2018-03-12 Thread Pablo Neira Ayuso

From: Yi-Hung Wei 

Currently, nf_conncount_count() counts the number of connections that
match key and inserts a conntrack 'tuple' with the same key into the
accounting data structure.  This patch supports a second use case that
only counts the connections, where 'tuple' is not provided.  To that end,
nf_conncount_count() is changed to handle the case where 'tuple' is NULL.
This could be useful for querying statistics or for debugging purposes.

Signed-off-by: Yi-Hung Wei 
Acked-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
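The "count, and insert only if a tuple is given" behaviour can be
illustrated with a toy list. This is a sketch only: conn_list and
conn_count() are made-up names, plain ints stand in for conntrack
tuples, and the expiry of stale entries done by the real code is left
out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONNS 8

struct conn_list {
	int tuples[MAX_CONNS]; /* stand-in for conntrack tuples */
	size_t len;
};

/* Count stored entries. If `tuple` is non-NULL and not yet stored,
 * insert it and include it in the count; if NULL, only count. */
size_t conn_count(struct conn_list *l, const int *tuple)
{
	bool add = tuple != NULL;
	size_t i;

	for (i = 0; i < l->len; i++)
		if (tuple && l->tuples[i] == *tuple)
			add = false; /* already tracked */

	if (add && l->len < MAX_CONNS)
		l->tuples[l->len++] = *tuple;

	return l->len;
}
```
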
 net/netfilter/nf_conncount.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 9305a08b4422..153e690e2893 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -104,7 +104,7 @@ static unsigned int check_hlist(struct net *net,
struct nf_conn *found_ct;
unsigned int length = 0;
 
-   *addit = true;
+   *addit = tuple ? true : false;
 
/* check the saved connections */
hlist_for_each_entry_safe(conn, n, head, node) {
@@ -117,7 +117,7 @@ static unsigned int check_hlist(struct net *net,
 
found_ct = nf_ct_tuplehash_to_ctrack(found);
 
-   if (nf_ct_tuple_equal(&conn->tuple, tuple)) {
+   if (tuple && nf_ct_tuple_equal(&conn->tuple, tuple)) {
/*
 * Just to be sure we have it only once in the list.
 * We should not see tuples twice unless someone hooks
@@ -220,6 +220,9 @@ count_tree(struct net *net, struct rb_root *root,
goto restart;
}
 
+   if (!tuple)
+   return 0;
+
/* no match, need to insert new node */
rbconn = kmem_cache_alloc(conncount_rb_cachep, GFP_ATOMIC);
if (rbconn == NULL)
@@ -242,6 +245,9 @@ count_tree(struct net *net, struct rb_root *root,
return 1;
 }
 
+/* Count and return number of conntrack entries in 'net' with particular 'key'.
+ * If 'tuple' is not null, insert it into the accounting data structure.
+ */
 unsigned int nf_conncount_count(struct net *net,
struct nf_conncount_data *data,
const u32 *key,
-- 
2.11.0


[PATCH 30/30] netfilter: nft_ct: add NFT_CT_{SRC,DST}_{IP,IP6}

2018-03-12 Thread Pablo Neira Ayuso

All existing keys except NFT_CT_SRC and NFT_CT_DST are assumed to
have strict datatypes. This causes problems with sets and
concatenations, since the specific length of these two keys is not known.

Signed-off-by: Pablo Neira Ayuso 
Acked-by: Florian Westphal 
---
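To illustrate why typed keys help: with the generic keys the value
length depends on the address family, while the new keys have a fixed
length on their own. The enum and function names below are simplified
stand-ins, not the uapi definitions.

```c
#include <stddef.h>

enum ct_key { CT_SRC, CT_DST, CT_SRC_IP, CT_DST_IP, CT_SRC_IP6, CT_DST_IP6 };
enum l3proto { L3_IPV4, L3_IPV6 };

/* Value length for an address key. The generic keys need the family
 * to size the value; the typed keys do not. */
size_t ct_key_len(enum ct_key key, enum l3proto family)
{
	switch (key) {
	case CT_SRC_IP:
	case CT_DST_IP:
		return 4;  /* IPv4 address */
	case CT_SRC_IP6:
	case CT_DST_IP6:
		return 16; /* IPv6 address */
	case CT_SRC:
	case CT_DST:
		return family == L3_IPV4 ? 4 : 16;
	}
	return 0;
}
```
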
 include/uapi/linux/netfilter/nf_tables.h | 12 --
 net/netfilter/nft_ct.c   | 38 
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index bb2135c8ad73..09f4eb1928f0 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -912,8 +912,8 @@ enum nft_rt_attributes {
  * @NFT_CT_EXPIRATION: relative conntrack expiration time in ms
  * @NFT_CT_HELPER: connection tracking helper assigned to conntrack
  * @NFT_CT_L3PROTOCOL: conntrack layer 3 protocol
- * @NFT_CT_SRC: conntrack layer 3 protocol source (IPv4/IPv6 address)
- * @NFT_CT_DST: conntrack layer 3 protocol destination (IPv4/IPv6 address)
+ * @NFT_CT_SRC: conntrack layer 3 protocol source (IPv4/IPv6 address, deprecated)
+ * @NFT_CT_DST: conntrack layer 3 protocol destination (IPv4/IPv6 address, deprecated)
  * @NFT_CT_PROTOCOL: conntrack layer 4 protocol
  * @NFT_CT_PROTO_SRC: conntrack layer 4 protocol source
  * @NFT_CT_PROTO_DST: conntrack layer 4 protocol destination
@@ -923,6 +923,10 @@ enum nft_rt_attributes {
  * @NFT_CT_AVGPKT: conntrack average bytes per packet
  * @NFT_CT_ZONE: conntrack zone
  * @NFT_CT_EVENTMASK: ctnetlink events to be generated for this conntrack
+ * @NFT_CT_SRC_IP: conntrack layer 3 protocol source (IPv4 address)
+ * @NFT_CT_DST_IP: conntrack layer 3 protocol destination (IPv4 address)
+ * @NFT_CT_SRC_IP6: conntrack layer 3 protocol source (IPv6 address)
+ * @NFT_CT_DST_IP6: conntrack layer 3 protocol destination (IPv6 address)
  */
 enum nft_ct_keys {
NFT_CT_STATE,
@@ -944,6 +948,10 @@ enum nft_ct_keys {
NFT_CT_AVGPKT,
NFT_CT_ZONE,
NFT_CT_EVENTMASK,
+   NFT_CT_SRC_IP,
+   NFT_CT_DST_IP,
+   NFT_CT_SRC_IP6,
+   NFT_CT_DST_IP6,
 };
 
 /**
diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 6ab274b14484..ea737fd789e8 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -196,6 +196,26 @@ static void nft_ct_get_eval(const struct nft_expr *expr,
case NFT_CT_PROTO_DST:
nft_reg_store16(dest, (__force u16)tuple->dst.u.all);
return;
+   case NFT_CT_SRC_IP:
+   if (nf_ct_l3num(ct) != NFPROTO_IPV4)
+   goto err;
+   *dest = tuple->src.u3.ip;
+   return;
+   case NFT_CT_DST_IP:
+   if (nf_ct_l3num(ct) != NFPROTO_IPV4)
+   goto err;
+   *dest = tuple->dst.u3.ip;
+   return;
+   case NFT_CT_SRC_IP6:
+   if (nf_ct_l3num(ct) != NFPROTO_IPV6)
+   goto err;
+   memcpy(dest, tuple->src.u3.ip6, sizeof(struct in6_addr));
+   return;
+   case NFT_CT_DST_IP6:
+   if (nf_ct_l3num(ct) != NFPROTO_IPV6)
+   goto err;
+   memcpy(dest, tuple->dst.u3.ip6, sizeof(struct in6_addr));
+   return;
default:
break;
}
@@ -419,6 +439,20 @@ static int nft_ct_get_init(const struct nft_ctx *ctx,
return -EAFNOSUPPORT;
}
break;
+   case NFT_CT_SRC_IP:
+   case NFT_CT_DST_IP:
+   if (tb[NFTA_CT_DIRECTION] == NULL)
+   return -EINVAL;
+
+   len = FIELD_SIZEOF(struct nf_conntrack_tuple, src.u3.ip);
+   break;
+   case NFT_CT_SRC_IP6:
+   case NFT_CT_DST_IP6:
+   if (tb[NFTA_CT_DIRECTION] == NULL)
+   return -EINVAL;
+
+   len = FIELD_SIZEOF(struct nf_conntrack_tuple, src.u3.ip6);
+   break;
case NFT_CT_PROTO_SRC:
case NFT_CT_PROTO_DST:
if (tb[NFTA_CT_DIRECTION] == NULL)
@@ -588,6 +622,10 @@ static int nft_ct_get_dump(struct sk_buff *skb, const struct nft_expr *expr)
switch (priv->key) {
case NFT_CT_SRC:
case NFT_CT_DST:
+   case NFT_CT_SRC_IP:
+   case NFT_CT_DST_IP:
+   case NFT_CT_SRC_IP6:
+   case NFT_CT_DST_IP6:
case NFT_CT_PROTO_SRC:
case NFT_CT_PROTO_DST:
if (nla_put_u8(skb, NFTA_CT_DIRECTION, priv->dir))
-- 
2.11.0


[PATCH 28/30] netfilter: Refactor nf_conncount

2018-03-12 Thread Pablo Neira Ayuso

From: Yi-Hung Wei 

Remove the 'family' parameter from nf_conncount_count() and count_tree().
The parameter is no longer useful after commit 625c556118f3
("netfilter: connlimit: split xt_connlimit into front and backend").

Signed-off-by: Yi-Hung Wei 
Acked-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_conntrack_count.h | 1 -
 net/netfilter/nf_conncount.c   | 4 +---
 net/netfilter/xt_connlimit.c   | 4 ++--
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_count.h b/include/net/netfilter/nf_conntrack_count.h
index adf8db44cf86..e61184fbfb71 100644
--- a/include/net/netfilter/nf_conntrack_count.h
+++ b/include/net/netfilter/nf_conntrack_count.h
@@ -11,7 +11,6 @@ void nf_conncount_destroy(struct net *net, unsigned int family,
 unsigned int nf_conncount_count(struct net *net,
struct nf_conncount_data *data,
const u32 *key,
-   unsigned int family,
const struct nf_conntrack_tuple *tuple,
const struct nf_conntrack_zone *zone);
 #endif
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 6d65389e308f..9305a08b4422 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -158,7 +158,6 @@ static void tree_nodes_free(struct rb_root *root,
 static unsigned int
 count_tree(struct net *net, struct rb_root *root,
   const u32 *key, u8 keylen,
-  u8 family,
   const struct nf_conntrack_tuple *tuple,
   const struct nf_conntrack_zone *zone)
 {
@@ -246,7 +245,6 @@ count_tree(struct net *net, struct rb_root *root,
 unsigned int nf_conncount_count(struct net *net,
struct nf_conncount_data *data,
const u32 *key,
-   unsigned int family,
const struct nf_conntrack_tuple *tuple,
const struct nf_conntrack_zone *zone)
 {
@@ -259,7 +257,7 @@ unsigned int nf_conncount_count(struct net *net,
 
spin_lock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]);
 
-   count = count_tree(net, root, key, data->keylen, family, tuple, zone);
+   count = count_tree(net, root, key, data->keylen, tuple, zone);
 
spin_unlock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]);
 
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index b1b17b9353e1..6275106ccf50 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -67,8 +67,8 @@ connlimit_mt(const struct sk_buff *skb, struct xt_action_param *par)
key[1] = zone->id;
}
 
-   connections = nf_conncount_count(net, info->data, key,
-xt_family(par), tuple_ptr, zone);
+   connections = nf_conncount_count(net, info->data, key, tuple_ptr,
+zone);
if (connections == 0)
/* kmalloc failed, drop it entirely */
goto hotdrop;
-- 
2.11.0


[PATCH 27/30] netfilter: nf_tables: handle rt0 and rt2 properly

2018-03-12 Thread Pablo Neira Ayuso

From: Ahmed Abdelsalam 

This fixes Netfilter's bugzilla #1219.

Type 0 and 2 of the IPv6 Routing extension header are not handled
properly by exthdr_init_raw() in src/exthdr.c.

To fix the bug, we extended "enum nft_exthdr_op" to
differentiate between rt, rt0, and rt2.

This patch extends the kernel implementation of nf_tables to
recognize the new options.

Signed-off-by: Ahmed Abdelsalam 
Signed-off-by: Pablo Neira Ayuso 
---
 include/uapi/linux/netfilter/nf_tables.h | 3 +++
 net/netfilter/nft_exthdr.c   | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 66dceee0ae30..bb2135c8ad73 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -731,6 +731,9 @@ enum nft_exthdr_flags {
 enum nft_exthdr_op {
NFT_EXTHDR_OP_IPV6,
NFT_EXTHDR_OP_TCPOPT,
+   NFT_EXTHDR_OP_RT0,
+   NFT_EXTHDR_OP_RT2,
+   NFT_EXTHDR_OP_RT4,
__NFT_EXTHDR_OP_MAX
 };
 #define NFT_EXTHDR_OP_MAX  (__NFT_EXTHDR_OP_MAX - 1)
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index 47ec1046ad11..bbc1be2b3b73 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -399,6 +399,9 @@ nft_exthdr_select_ops(const struct nft_ctx *ctx,
return &nft_exthdr_tcp_ops;
break;
case NFT_EXTHDR_OP_IPV6:
+   case NFT_EXTHDR_OP_RT0:
+   case NFT_EXTHDR_OP_RT2:
+   case NFT_EXTHDR_OP_RT4:
if (tb[NFTA_EXTHDR_DREG])
return &nft_exthdr_ipv6_ops;
break;
-- 
2.11.0


[PATCH 24/30] netfilter: nf_flow_table: rename nf_flow_table.c to nf_flow_table_core.c

2018-03-12 Thread Pablo Neira Ayuso

From: Felix Fietkau 

Preparation for adding more code to the same module.

Signed-off-by: Felix Fietkau 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/Makefile  | 2 ++
 net/netfilter/{nf_flow_table.c => nf_flow_table_core.c} | 0
 2 files changed, 2 insertions(+)
 rename net/netfilter/{nf_flow_table.c => nf_flow_table_core.c} (100%)

diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 5d9b8b959e58..138db16d59ed 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -112,6 +112,8 @@ obj-$(CONFIG_NFT_FWD_NETDEV)+= nft_fwd_netdev.o
 
 # flow table infrastructure
 obj-$(CONFIG_NF_FLOW_TABLE)+= nf_flow_table.o
+nf_flow_table-objs := nf_flow_table_core.o
+
 obj-$(CONFIG_NF_FLOW_TABLE_INET) += nf_flow_table_inet.o
 
 # generic X tables 
diff --git a/net/netfilter/nf_flow_table.c b/net/netfilter/nf_flow_table_core.c
similarity index 100%
rename from net/netfilter/nf_flow_table.c
rename to net/netfilter/nf_flow_table_core.c
-- 
2.11.0


[PATCH 23/30] netfilter: nf_flow_table: cache mtu in struct flow_offload_tuple

2018-03-12 Thread Pablo Neira Ayuso

From: Felix Fietkau 

Reduces the number of cache lines touched in the offload forwarding
path. This is safe because PMTU limits are bypassed for the forwarding
path (see commit f87c10a8aa1e for more details).

Signed-off-by: Felix Fietkau 
Signed-off-by: Pablo Neira Ayuso 
---
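The idea is to resolve the MTU once when the flow is set up and to read
the cached copy on the fast path. A simplified userspace sketch follows;
the structures and function names are hypothetical stand-ins, GSO
handling is omitted, and clamping to 16 bits is a choice made for this
sketch only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct route { uint32_t mtu; };
struct flow_tuple {
	const struct route *dst_cache;
	uint16_t mtu; /* copied once when the flow is set up */
};
struct packet { uint32_t len; bool dont_fragment; };

/* Setup path: look up the route MTU once and store it in the tuple. */
void flow_tuple_fill(struct flow_tuple *t, const struct route *rt)
{
	t->dst_cache = rt;
	t->mtu = (uint16_t)(rt->mtu > UINT16_MAX ? UINT16_MAX : rt->mtu);
}

/* Fast path: compare against the cached value, no route dereference. */
bool flow_exceeds_mtu(const struct packet *p, const struct flow_tuple *t)
{
	if (p->len <= t->mtu)
		return false;
	/* oversized packets that may be fragmented are not counted */
	return p->dont_fragment;
}
```
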
 include/net/netfilter/nf_flow_table.h   |  2 ++
 net/ipv4/netfilter/nf_flow_table_ipv4.c | 17 +++--
 net/ipv6/netfilter/nf_flow_table_ipv6.c | 17 +++--
 net/netfilter/nf_flow_table.c   |  8 ++--
 4 files changed, 14 insertions(+), 30 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 09ba67598991..76ee5c81b752 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -55,6 +55,8 @@ struct flow_offload_tuple {
 
int oifidx;
 
+   u16 mtu;
+
struct dst_entry*dst_cache;
 };
 
diff --git a/net/ipv4/netfilter/nf_flow_table_ipv4.c b/net/ipv4/netfilter/nf_flow_table_ipv4.c
index 25d2975da156..e17ef57b0df4 100644
--- a/net/ipv4/netfilter/nf_flow_table_ipv4.c
+++ b/net/ipv4/netfilter/nf_flow_table_ipv4.c
@@ -177,7 +177,7 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev,
 }
 
 /* Based on ip_exceeds_mtu(). */
-static bool __nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
+static bool nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
 {
if (skb->len <= mtu)
return false;
@@ -191,17 +191,6 @@ static bool __nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
return true;
 }
 
-static bool nf_flow_exceeds_mtu(struct sk_buff *skb, const struct rtable *rt)
-{
-   u32 mtu;
-
-   mtu = ip_dst_mtu_maybe_forward(&rt->dst, true);
-   if (__nf_flow_exceeds_mtu(skb, mtu))
-   return true;
-
-   return false;
-}
-
 unsigned int
 nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb,
const struct nf_hook_state *state)
@@ -232,9 +221,9 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb,
 
dir = tuplehash->tuple.dir;
flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
-
rt = (const struct rtable *)flow->tuplehash[dir].tuple.dst_cache;
-   if (unlikely(nf_flow_exceeds_mtu(skb, rt)))
+
+   if (unlikely(nf_flow_exceeds_mtu(skb, flow->tuplehash[dir].tuple.mtu)))
return NF_ACCEPT;
 
if (skb_try_make_writable(skb, sizeof(*iph)))
diff --git a/net/ipv6/netfilter/nf_flow_table_ipv6.c b/net/ipv6/netfilter/nf_flow_table_ipv6.c
index d346705d6ee6..f530efd3e378 100644
--- a/net/ipv6/netfilter/nf_flow_table_ipv6.c
+++ b/net/ipv6/netfilter/nf_flow_table_ipv6.c
@@ -173,7 +173,7 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev,
 }
 
 /* Based on ip_exceeds_mtu(). */
-static bool __nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
+static bool nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
 {
if (skb->len <= mtu)
return false;
@@ -184,17 +184,6 @@ static bool __nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
return true;
 }
 
-static bool nf_flow_exceeds_mtu(struct sk_buff *skb, const struct rt6_info *rt)
-{
-   u32 mtu;
-
-   mtu = ip6_dst_mtu_forward(&rt->dst);
-   if (__nf_flow_exceeds_mtu(skb, mtu))
-   return true;
-
-   return false;
-}
-
 unsigned int
 nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb,
  const struct nf_hook_state *state)
@@ -225,9 +214,9 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb,
 
dir = tuplehash->tuple.dir;
flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
-
rt = (struct rt6_info *)flow->tuplehash[dir].tuple.dst_cache;
-   if (unlikely(nf_flow_exceeds_mtu(skb, rt)))
+
+   if (unlikely(nf_flow_exceeds_mtu(skb, flow->tuplehash[dir].tuple.mtu)))
return NF_ACCEPT;
 
if (skb_try_make_writable(skb, sizeof(*ip6h)))
diff --git a/net/netfilter/nf_flow_table.c b/net/netfilter/nf_flow_table.c
index db0673a40b97..7403a0dfddf7 100644
--- a/net/netfilter/nf_flow_table.c
+++ b/net/netfilter/nf_flow_table.c
@@ -4,6 +4,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -23,6 +25,7 @@ flow_offload_fill_dir(struct flow_offload *flow, struct nf_conn *ct,
 {
struct flow_offload_tuple *ft = &flow->tuplehash[dir].tuple;
struct nf_conntrack_tuple *ctt = &ct->tuplehash[dir].tuple;
+   struct dst_entry *dst = route->tuple[dir].dst;
 
ft->dir = dir;
 
@@ -30,10 +33,12 @@ flow_offload_fill_dir(struct flow_offload *flow, struct nf_conn *ct,
case NFPROTO_IPV4:
ft->src_v4 = ctt->src.u3.in;
ft->dst_v4 = ctt->dst.u3.in;
+

[PATCH 26/30] ipvs: use true and false for boolean values

2018-03-12 Thread Pablo Neira Ayuso

From: "Gustavo A. R. Silva" 

Assign true or false to boolean variables instead of an integer value.

This issue was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva 
Signed-off-by: Simon Horman 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/ipvs/ip_vs_lblc.c  | 4 ++--
 net/netfilter/ipvs/ip_vs_lblcr.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 6a340c94c4b8..942e835caf7f 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -238,7 +238,7 @@ static void ip_vs_lblc_flush(struct ip_vs_service *svc)
int i;
 
spin_lock_bh(&svc->sched_lock);
-   tbl->dead = 1;
+   tbl->dead = true;
for (i = 0; i < IP_VS_LBLC_TAB_SIZE; i++) {
hlist_for_each_entry_safe(en, next, &tbl->bucket[i], list) {
ip_vs_lblc_del(en);
@@ -369,7 +369,7 @@ static int ip_vs_lblc_init_svc(struct ip_vs_service *svc)
tbl->max_size = IP_VS_LBLC_TAB_SIZE*16;
tbl->rover = 0;
tbl->counter = 1;
-   tbl->dead = 0;
+   tbl->dead = false;
tbl->svc = svc;
 
/*
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index 0627881128da..a5acab25c36b 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -404,7 +404,7 @@ static void ip_vs_lblcr_flush(struct ip_vs_service *svc)
struct hlist_node *next;
 
spin_lock_bh(&svc->sched_lock);
-   tbl->dead = 1;
+   tbl->dead = true;
for (i = 0; i < IP_VS_LBLCR_TAB_SIZE; i++) {
hlist_for_each_entry_safe(en, next, &tbl->bucket[i], list) {
ip_vs_lblcr_free(en);
@@ -532,7 +532,7 @@ static int ip_vs_lblcr_init_svc(struct ip_vs_service *svc)
tbl->max_size = IP_VS_LBLCR_TAB_SIZE*16;
tbl->rover = 0;
tbl->counter = 1;
-   tbl->dead = 0;
+   tbl->dead = false;
tbl->svc = svc;
 
/*
-- 
2.11.0


[PATCH 25/30] netfilter: x_tables: fix build with CONFIG_COMPAT=n

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

I placed the helpers inside the CONFIG_COMPAT section; move them
outside so they are also built with CONFIG_COMPAT=n.
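
For readers without the tree at hand, here is a standalone sketch of what
verdict_ok() in the diff accepts. It has no compat dependency, which is
why it can live outside the #ifdef. The constant values are assumptions
copied from the uapi netfilter headers, and the comment on positive
verdicts is an interpretation, not something this patch states.

```c
#include <stdbool.h>

/* Values assumed to match the kernel uapi headers. */
#define NF_DROP   0
#define NF_ACCEPT 1
#define NF_QUEUE  3
#define NF_REPEAT 4
#define XT_RETURN (-NF_REPEAT - 1)

static bool verdict_ok(int verdict)
{
	if (verdict > 0)	/* positive values are treated as jump offsets */
		return true;

	if (verdict == XT_RETURN)
		return true;

	if (verdict < 0) {
		/* standard verdicts are encoded as -(NF_x) - 1 */
		switch (-verdict - 1) {
		case NF_ACCEPT:
		case NF_DROP:
		case NF_QUEUE:
			return true;
		default:
			break;
		}
	}

	return false;	/* covers 0 and every other negative value */
}
```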

Fixes: 472ebdcd15ebdb ("netfilter: x_tables: check error target size too")
Fixes: 07a9da51b4b6ae ("netfilter: x_tables: check standard verdicts in core")
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/x_tables.c | 62 
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 7521e8a72c06..bac932f1c582 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -577,6 +577,37 @@ int xt_check_table_hooks(const struct xt_table_info *info, unsigned int valid_ho
 }
 EXPORT_SYMBOL(xt_check_table_hooks);
 
+static bool verdict_ok(int verdict)
+{
+   if (verdict > 0)
+   return true;
+
+   if (verdict < 0) {
+   int v = -verdict - 1;
+
+   if (verdict == XT_RETURN)
+   return true;
+
+   switch (v) {
+   case NF_ACCEPT: return true;
+   case NF_DROP: return true;
+   case NF_QUEUE: return true;
+   default:
+   break;
+   }
+
+   return false;
+   }
+
+   return false;
+}
+
+static bool error_tg_ok(unsigned int usersize, unsigned int kernsize,
+   const char *msg, unsigned int msglen)
+{
+   return usersize == kernsize && strnlen(msg, msglen) < msglen;
+}
+
 #ifdef CONFIG_COMPAT
 int xt_compat_add_offset(u_int8_t af, unsigned int offset, int delta)
 {
@@ -736,37 +767,6 @@ struct compat_xt_error_target {
char errorname[XT_FUNCTION_MAXNAMELEN];
 };
 
-static bool verdict_ok(int verdict)
-{
-   if (verdict > 0)
-   return true;
-
-   if (verdict < 0) {
-   int v = -verdict - 1;
-
-   if (verdict == XT_RETURN)
-   return true;
-
-   switch (v) {
-   case NF_ACCEPT: return true;
-   case NF_DROP: return true;
-   case NF_QUEUE: return true;
-   default:
-   break;
-   }
-
-   return false;
-   }
-
-   return false;
-}
-
-static bool error_tg_ok(unsigned int usersize, unsigned int kernsize,
-   const char *msg, unsigned int msglen)
-{
-   return usersize == kernsize && strnlen(msg, msglen) < msglen;
-}
-
 int xt_compat_check_entry_offsets(const void *base, const char *elems,
  unsigned int target_offset,
  unsigned int next_offset)
-- 
2.11.0


[PATCH 21/30] netfilter: nf_flow_table: clean up flow_offload_alloc

2018-03-12 Thread Pablo Neira Ayuso

From: Felix Fietkau 

Reduce code duplication and make the function much easier to read.
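
A reduced sketch of the refactoring pattern, assuming the helper is called
once per direction (the call sites are cut off in this archive). All type
and function names here are hypothetical; only the ifindex handling from
the diff is modelled, including taking the output interface from the
opposite direction via !dir.

```c
enum dir { DIR_ORIGINAL = 0, DIR_REPLY = 1, DIR_MAX = 2 };

/* Hypothetical, trimmed-down versions of the route and flow structures. */
struct route_sketch { int ifindex[DIR_MAX]; };
struct tuple_sketch { enum dir dir; int iifidx; int oifidx; };
struct flow_sketch { struct tuple_sketch tuple[DIR_MAX]; };

/* One helper fills the tuple for a single direction. */
static void fill_dir(struct flow_sketch *flow,
		     const struct route_sketch *route, enum dir dir)
{
	struct tuple_sketch *ft = &flow->tuple[dir];

	ft->dir = dir;
	ft->iifidx = route->ifindex[dir];
	ft->oifidx = route->ifindex[!dir];	/* egress is the other side's interface */
}

/* The caller no longer spells out every field twice. */
static void fill_both(struct flow_sketch *flow,
		      const struct route_sketch *route)
{
	fill_dir(flow, route, DIR_ORIGINAL);
	fill_dir(flow, route, DIR_REPLY);
}
```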

Signed-off-by: Felix Fietkau 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_flow_table.c | 93 ---
 1 file changed, 34 insertions(+), 59 deletions(-)

diff --git a/net/netfilter/nf_flow_table.c b/net/netfilter/nf_flow_table.c
index ec410cae9307..db0673a40b97 100644
--- a/net/netfilter/nf_flow_table.c
+++ b/net/netfilter/nf_flow_table.c
@@ -16,6 +16,38 @@ struct flow_offload_entry {
struct rcu_head rcu_head;
 };
 
+static void
+flow_offload_fill_dir(struct flow_offload *flow, struct nf_conn *ct,
+ struct nf_flow_route *route,
+ enum flow_offload_tuple_dir dir)
+{
+   struct flow_offload_tuple *ft = &flow->tuplehash[dir].tuple;
+   struct nf_conntrack_tuple *ctt = &ct->tuplehash[dir].tuple;
+
+   ft->dir = dir;
+
+   switch (ctt->src.l3num) {
+   case NFPROTO_IPV4:
+   ft->src_v4 = ctt->src.u3.in;
+   ft->dst_v4 = ctt->dst.u3.in;
+   break;
+   case NFPROTO_IPV6:
+   ft->src_v6 = ctt->src.u3.in6;
+   ft->dst_v6 = ctt->dst.u3.in6;
+   break;
+   }
+
+   ft->l3proto = ctt->src.l3num;
+   ft->l4proto = ctt->dst.protonum;
+   ft->src_port = ctt->src.u.tcp.port;
+   ft->dst_port = ctt->dst.u.tcp.port;
+
+   ft->iifidx = route->tuple[dir].ifindex;
+   ft->oifidx = route->tuple[!dir].ifindex;
+
+   ft->dst_cache = route->tuple[dir].dst;
+}
+
 struct flow_offload *
 flow_offload_alloc(struct nf_conn *ct, struct nf_flow_route *route)
 {
@@ -40,65 +72,8 @@ flow_offload_alloc(struct nf_conn *ct, struct nf_flow_route *route)
 
entry->ct = ct;
 
-   switch (ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.l3num) {
-   case NFPROTO_IPV4:
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.src_v4 =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.in;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.dst_v4 =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.in;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.src_v4 =
-   ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.u3.in;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.dst_v4 =
-   ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u3.in;
-   break;
-   case NFPROTO_IPV6:
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.src_v6 =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.in6;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.dst_v6 =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.in6;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.src_v6 =
-   ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.u3.in6;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.dst_v6 =
-   ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u3.in6;
-   break;
-   }
-
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.l3proto =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.l3num;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.l4proto =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.l3proto =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.l3num;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.l4proto =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum;
-
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.dst_cache =
- route->tuple[FLOW_OFFLOAD_DIR_ORIGINAL].dst;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.dst_cache =
- route->tuple[FLOW_OFFLOAD_DIR_REPLY].dst;
-
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.src_port =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.tcp.port;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.dst_port =
-   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.tcp.port;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.src_port =
-   ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.u.tcp.port;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.dst_port =
-   ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.tcp.port;
-
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.dir =
-   FLOW_OFFLOAD_DIR_ORIGINAL;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.dir =
-   FLOW_OFFLOAD_DIR_REPLY;
-
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.iifidx =
-   route->tuple[FLOW_OFFLOAD_DIR_ORIGINAL].ifindex;
-   flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.oifidx =
-   route->tuple[FLOW_OFFLOAD_DIR_REPLY].

[PATCH 22/30] ipv6: make ip6_dst_mtu_forward inline

2018-03-12 Thread Pablo Neira Ayuso

From: Felix Fietkau 

Needed to remove a direct dependency on ipv6.ko from flowtable
infrastructure. Make it inline like ip_dst_mtu_maybe_forward().
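
The selection order implemented by the function can be sketched in
userspace as below. A static inline in a header is compiled into each
caller, so no exported symbol from ipv6.ko is needed. dst_sketch and its
fields are hypothetical stand-ins for the kernel accessors named in the
comments; the RCU locking is left out.

```c
#define IPV6_MIN_MTU 1280	/* the IPv6 minimum link MTU */

/* Hypothetical flattened view of what the kernel helpers would return. */
struct dst_sketch {
	int mtu_locked;		/* stands in for dst_metric_locked(dst, RTAX_MTU) */
	unsigned int raw_mtu;	/* stands in for dst_metric_raw(dst, RTAX_MTU) */
	int has_idev;		/* stands in for __in6_dev_get(dst->dev) != NULL */
	unsigned int dev_mtu6;	/* stands in for idev->cnf.mtu6 */
};

static inline unsigned int mtu_forward_sketch(const struct dst_sketch *d)
{
	/* 1. A locked, non-zero route MTU wins. */
	if (d->mtu_locked && d->raw_mtu)
		return d->raw_mtu;

	/* 2. Otherwise use the device's IPv6 MTU, 3. else the protocol minimum. */
	return d->has_idev ? d->dev_mtu6 : IPV6_MIN_MTU;
}
```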

Signed-off-by: Felix Fietkau 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/ip6_route.h | 21 +
 include/net/ipv6.h  |  2 --
 net/ipv6/ip6_output.c   | 22 --
 3 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index ce2abc0ff102..18ef8c9890e2 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -271,4 +271,25 @@ static inline bool rt6_duplicate_nexthop(struct rt6_info *a, struct rt6_info *b)
   !lwtunnel_cmp_encap(a->dst.lwtstate, b->dst.lwtstate);
 }
 
+static inline unsigned int ip6_dst_mtu_forward(const struct dst_entry *dst)
+{
+   unsigned int mtu;
+   struct inet6_dev *idev;
+
+   if (dst_metric_locked(dst, RTAX_MTU)) {
+   mtu = dst_metric_raw(dst, RTAX_MTU);
+   if (mtu)
+   return mtu;
+   }
+
+   mtu = IPV6_MIN_MTU;
+   rcu_read_lock();
+   idev = __in6_dev_get(dst->dev);
+   if (idev)
+   mtu = idev->cnf.mtu6;
+   rcu_read_unlock();
+
+   return mtu;
+}
+
 #endif
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index cabd3cdd4015..a4089cebe8d3 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -970,8 +970,6 @@ static inline struct sk_buff *ip6_finish_skb(struct sock *sk)
  &inet6_sk(sk)->cork);
 }
 
-unsigned int ip6_dst_mtu_forward(const struct dst_entry *dst);
-
 int ip6_dst_lookup(struct net *net, struct sock *sk, struct dst_entry **dst,
   struct flowi6 *fl6);
 struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6,
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index a6eb0e699b15..2f1de4e8132a 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -378,28 +378,6 @@ static inline int ip6_forward_finish(struct net *net, struct sock *sk,
return dst_output(net, sk, skb);
 }
 
-unsigned int ip6_dst_mtu_forward(const struct dst_entry *dst)
-{
-   unsigned int mtu;
-   struct inet6_dev *idev;
-
-   if (dst_metric_locked(dst, RTAX_MTU)) {
-   mtu = dst_metric_raw(dst, RTAX_MTU);
-   if (mtu)
-   return mtu;
-   }
-
-   mtu = IPV6_MIN_MTU;
-   rcu_read_lock();
-   idev = __in6_dev_get(dst->dev);
-   if (idev)
-   mtu = idev->cnf.mtu6;
-   rcu_read_unlock();
-
-   return mtu;
-}
-EXPORT_SYMBOL_GPL(ip6_dst_mtu_forward);
-
 static bool ip6_pkt_too_big(const struct sk_buff *skb, unsigned int mtu)
 {
if (skb->len <= mtu)
-- 
2.11.0


[PATCH 20/30] netfilter: nf_flow_table: use IP_CT_DIR_* values for FLOW_OFFLOAD_DIR_*

2018-03-12 Thread Pablo Neira Ayuso

From: Felix Fietkau 

Simplifies further code cleanups
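
Why this helps, as a small sketch: once both enums share the same values,
a single direction index can address the conntrack side and the flow
side. The ip_conntrack_dir values below are assumed to match the kernel
header; pair_sketch and copy_dir are hypothetical.

```c
/* Values assumed to match enum ip_conntrack_dir in the kernel headers. */
enum ip_conntrack_dir {
	IP_CT_DIR_ORIGINAL,
	IP_CT_DIR_REPLY,
	IP_CT_DIR_MAX
};

enum flow_offload_tuple_dir {
	FLOW_OFFLOAD_DIR_ORIGINAL = IP_CT_DIR_ORIGINAL,
	FLOW_OFFLOAD_DIR_REPLY = IP_CT_DIR_REPLY,
	FLOW_OFFLOAD_DIR_MAX = IP_CT_DIR_MAX
};

/* Hypothetical pair of per-direction arrays, one per subsystem. */
struct pair_sketch {
	int ct_side[IP_CT_DIR_MAX];
	int flow_side[FLOW_OFFLOAD_DIR_MAX];
};

/* With identical values, one index is valid for both arrays. */
static void copy_dir(struct pair_sketch *p, enum flow_offload_tuple_dir dir)
{
	p->flow_side[dir] = p->ct_side[dir];
}
```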

Signed-off-by: Felix Fietkau 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_flow_table.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 833752dd0c58..09ba67598991 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct nf_flowtable;
@@ -27,11 +28,10 @@ struct nf_flowtable {
 };
 
 enum flow_offload_tuple_dir {
-   FLOW_OFFLOAD_DIR_ORIGINAL,
-   FLOW_OFFLOAD_DIR_REPLY,
-   __FLOW_OFFLOAD_DIR_MAX  = FLOW_OFFLOAD_DIR_REPLY,
+   FLOW_OFFLOAD_DIR_ORIGINAL = IP_CT_DIR_ORIGINAL,
+   FLOW_OFFLOAD_DIR_REPLY = IP_CT_DIR_REPLY,
+   FLOW_OFFLOAD_DIR_MAX = IP_CT_DIR_MAX
 };
-#define FLOW_OFFLOAD_DIR_MAX   (__FLOW_OFFLOAD_DIR_MAX + 1)
 
 struct flow_offload_tuple {
union {
-- 
2.11.0


[PATCH 14/30] netfilter: compat: prepare xt_compat_init_offsets to return errors

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

This should have no impact: the function still always returns 0.
This patch only exists to ease review.
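
The caller-side pattern the diff introduces, as a userspace sketch: check
the return value and leave through the unlock label so the lock is never
held on return. The names are hypothetical, the flag stands in for the
per-family compat mutex, and the -EINVAL on a zero count is an assumption
about what the function will do once it can actually fail.

```c
#include <errno.h>

static int locked;	/* stand-in for the per-family compat mutex */

/* Hypothetical init step; assumed to reject an empty table. */
static int init_offsets_sketch(unsigned int number)
{
	return number ? 0 : -EINVAL;
}

static int translate_sketch(unsigned int number)
{
	int ret;

	locked = 1;
	ret = init_offsets_sketch(number);
	if (ret)
		goto out_unlock;	/* must not return with the lock held */

	/* ... walking the entries would happen here ... */

out_unlock:
	locked = 0;
	return ret;
}
```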

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 include/linux/netfilter/x_tables.h |  2 +-
 net/bridge/netfilter/ebtables.c| 10 --
 net/ipv4/netfilter/arp_tables.c| 10 +++---
 net/ipv4/netfilter/ip_tables.c |  8 ++--
 net/ipv6/netfilter/ip6_tables.c| 10 +++---
 net/netfilter/x_tables.c   |  4 +++-
 6 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index 0bd93c589a8c..7bd896dc78df 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -510,7 +510,7 @@ void xt_compat_unlock(u_int8_t af);
 
 int xt_compat_add_offset(u_int8_t af, unsigned int offset, int delta);
 void xt_compat_flush_offsets(u_int8_t af);
-void xt_compat_init_offsets(u_int8_t af, unsigned int number);
+int xt_compat_init_offsets(u8 af, unsigned int number);
 int xt_compat_calc_jump(u_int8_t af, unsigned int offset);
 
 int xt_compat_match_offset(const struct xt_match *match);
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 02c4b409d317..217aa79f7b2a 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1819,10 +1819,14 @@ static int compat_table_info(const struct ebt_table_info *info,
 {
unsigned int size = info->entries_size;
const void *entries = info->entries;
+   int ret;
 
newinfo->entries_size = size;
 
-   xt_compat_init_offsets(NFPROTO_BRIDGE, info->nentries);
+   ret = xt_compat_init_offsets(NFPROTO_BRIDGE, info->nentries);
+   if (ret)
+   return ret;
+
return EBT_ENTRY_ITERATE(entries, size, compat_calc_entry, info,
entries, newinfo);
 }
@@ -2245,7 +2249,9 @@ static int compat_do_replace(struct net *net, void __user *user,
 
xt_compat_lock(NFPROTO_BRIDGE);
 
-   xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries);
+   ret = xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries);
+   if (ret < 0)
+   goto out_unlock;
ret = compat_copy_entries(entries_tmp, tmp.entries_size, &state);
if (ret < 0)
goto out_unlock;
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 82ba09b50fdb..aaafdbd15ad3 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -769,7 +769,9 @@ static int compat_table_info(const struct xt_table_info *info,
memcpy(newinfo, info, offsetof(struct xt_table_info, entries));
newinfo->initial_entries = 0;
loc_cpu_entry = info->entries;
-   xt_compat_init_offsets(NFPROTO_ARP, info->number);
+   ret = xt_compat_init_offsets(NFPROTO_ARP, info->number);
+   if (ret)
+   return ret;
xt_entry_foreach(iter, loc_cpu_entry, info->size) {
ret = compat_calc_entry(iter, info, loc_cpu_entry, newinfo);
if (ret != 0)
@@ -1156,7 +1158,7 @@ static int translate_compat_table(struct xt_table_info **pinfo,
struct compat_arpt_entry *iter0;
struct arpt_replace repl;
unsigned int size;
-   int ret = 0;
+   int ret;
 
info = *pinfo;
entry0 = *pentry0;
@@ -1165,7 +1167,9 @@ static int translate_compat_table(struct xt_table_info **pinfo,
 
j = 0;
xt_compat_lock(NFPROTO_ARP);
-   xt_compat_init_offsets(NFPROTO_ARP, compatr->num_entries);
+   ret = xt_compat_init_offsets(NFPROTO_ARP, compatr->num_entries);
+   if (ret)
+   goto out_unlock;
/* Walk through entries, checking offsets. */
xt_entry_foreach(iter0, entry0, compatr->size) {
ret = check_compat_entry_size_and_hooks(iter0, info, &size,
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 4901ca6c3e09..f9063513f9d1 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -933,7 +933,9 @@ static int compat_table_info(const struct xt_table_info *info,
memcpy(newinfo, info, offsetof(struct xt_table_info, entries));
newinfo->initial_entries = 0;
loc_cpu_entry = info->entries;
-   xt_compat_init_offsets(AF_INET, info->number);
+   ret = xt_compat_init_offsets(AF_INET, info->number);
+   if (ret)
+   return ret;
xt_entry_foreach(iter, loc_cpu_entry, info->size) {
ret = compat_calc_entry(iter, info, loc_cpu_entry, newinfo);
if (ret != 0)
@@ -1407,7 +1409,9 @@ translate_compat_table(struct net *net,
 
j = 0;
xt_compat_lock(AF_INET);
-   xt_compat_init_offsets(AF_INET, compatr->num_entries);
+   ret = xt_compat_init_offsets(AF_INET, compatr->num_entries);
+   if (ret)
+   goto out_unlock;
/* Walk

[PATCH 18/30] netfilter: make xt_rateest hash table per net

2018-03-12 Thread Pablo Neira Ayuso

From: Cong Wang 

As suggested by Eric, we need to make the xt_rateest
hash table and its lock per netns to reduce lock
contention.

Cc: Florian Westphal 
Cc: Eric Dumazet 
Cc: Pablo Neira Ayuso 
Signed-off-by: Cong Wang 
Reviewed-by: Eric Dumazet 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/xt_rateest.h |  4 +-
 net/netfilter/xt_RATEEST.c | 91 +++---
 net/netfilter/xt_rateest.c | 10 ++---
 3 files changed, 72 insertions(+), 33 deletions(-)

diff --git a/include/net/netfilter/xt_rateest.h b/include/net/netfilter/xt_rateest.h
index b1db13772554..832ab69efda5 100644
--- a/include/net/netfilter/xt_rateest.h
+++ b/include/net/netfilter/xt_rateest.h
@@ -21,7 +21,7 @@ struct xt_rateest {
struct net_rate_estimator __rcu *rate_est;
 };
 
-struct xt_rateest *xt_rateest_lookup(const char *name);
-void xt_rateest_put(struct xt_rateest *est);
+struct xt_rateest *xt_rateest_lookup(struct net *net, const char *name);
+void xt_rateest_put(struct net *net, struct xt_rateest *est);
 
 #endif /* _XT_RATEEST_H */
diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c
index 141c295191f6..dec843cadf46 100644
--- a/net/netfilter/xt_RATEEST.c
+++ b/net/netfilter/xt_RATEEST.c
@@ -14,15 +14,21 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 
-static DEFINE_MUTEX(xt_rateest_mutex);
-
 #define RATEEST_HSIZE  16
-static struct hlist_head rateest_hash[RATEEST_HSIZE] __read_mostly;
+
+struct xt_rateest_net {
+   struct mutex hash_lock;
+   struct hlist_head hash[RATEEST_HSIZE];
+};
+
+static unsigned int xt_rateest_id;
+
 static unsigned int jhash_rnd __read_mostly;
 
 static unsigned int xt_rateest_hash(const char *name)
@@ -31,21 +37,23 @@ static unsigned int xt_rateest_hash(const char *name)
   (RATEEST_HSIZE - 1);
 }
 
-static void xt_rateest_hash_insert(struct xt_rateest *est)
+static void xt_rateest_hash_insert(struct xt_rateest_net *xn,
+  struct xt_rateest *est)
 {
unsigned int h;
 
h = xt_rateest_hash(est->name);
-   hlist_add_head(&est->list, &rateest_hash[h]);
+   hlist_add_head(&est->list, &xn->hash[h]);
 }
 
-static struct xt_rateest *__xt_rateest_lookup(const char *name)
+static struct xt_rateest *__xt_rateest_lookup(struct xt_rateest_net *xn,
+ const char *name)
 {
struct xt_rateest *est;
unsigned int h;
 
h = xt_rateest_hash(name);
-   hlist_for_each_entry(est, &rateest_hash[h], list) {
+   hlist_for_each_entry(est, &xn->hash[h], list) {
if (strcmp(est->name, name) == 0) {
est->refcnt++;
return est;
@@ -55,20 +63,23 @@ static struct xt_rateest *__xt_rateest_lookup(const char *name)
return NULL;
 }
 
-struct xt_rateest *xt_rateest_lookup(const char *name)
+struct xt_rateest *xt_rateest_lookup(struct net *net, const char *name)
 {
+   struct xt_rateest_net *xn = net_generic(net, xt_rateest_id);
struct xt_rateest *est;
 
-   mutex_lock(&xt_rateest_mutex);
-   est = __xt_rateest_lookup(name);
-   mutex_unlock(&xt_rateest_mutex);
+   mutex_lock(&xn->hash_lock);
+   est = __xt_rateest_lookup(xn, name);
+   mutex_unlock(&xn->hash_lock);
return est;
 }
 EXPORT_SYMBOL_GPL(xt_rateest_lookup);
 
-void xt_rateest_put(struct xt_rateest *est)
+void xt_rateest_put(struct net *net, struct xt_rateest *est)
 {
-   mutex_lock(&xt_rateest_mutex);
+   struct xt_rateest_net *xn = net_generic(net, xt_rateest_id);
+
+   mutex_lock(&xn->hash_lock);
if (--est->refcnt == 0) {
hlist_del(&est->list);
gen_kill_estimator(&est->rate_est);
@@ -78,7 +89,7 @@ void xt_rateest_put(struct xt_rateest *est)
 */
kfree_rcu(est, rcu);
}
-   mutex_unlock(&xt_rateest_mutex);
+   mutex_unlock(&xn->hash_lock);
 }
 EXPORT_SYMBOL_GPL(xt_rateest_put);
 
@@ -98,6 +109,7 @@ xt_rateest_tg(struct sk_buff *skb, const struct xt_action_param *par)
 
 static int xt_rateest_tg_checkentry(const struct xt_tgchk_param *par)
 {
+   struct xt_rateest_net *xn = net_generic(par->net, xt_rateest_id);
struct xt_rateest_target_info *info = par->targinfo;
struct xt_rateest *est;
struct {
@@ -108,10 +120,10 @@ static int xt_rateest_tg_checkentry(const struct xt_tgchk_param *par)
 
net_get_random_once(&jhash_rnd, sizeof(jhash_rnd));
 
-   mutex_lock(&xt_rateest_mutex);
-   est = __xt_rateest_lookup(info->name);
+   mutex_lock(&xn->hash_lock);
+   est = __xt_rateest_lookup(xn, info->name);
if (est) {
-   mutex_unlock(&xt_rateest_mutex);
+   mutex_unlock(&xn->hash_lock);
/*
 * If estimator parameters are specified, they must match the
 * existing estimator.
@@ -119,7 +131,7 @@

[PATCH 19/30] netfilter: xt_limit: Spelling s/maxmum/maximum/

2018-03-12 Thread Pablo Neira Ayuso

From: Geert Uytterhoeven 

Signed-off-by: Geert Uytterhoeven 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/xt_limit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
index 55d18cd67635..9f098ecb2449 100644
--- a/net/netfilter/xt_limit.c
+++ b/net/netfilter/xt_limit.c
@@ -46,7 +46,7 @@ MODULE_ALIAS("ip6t_limit");
 
See Alexey's formal explanation in net/sched/sch_tbf.c.
 
-   To get the maxmum range, we multiply by this factor (ie. you get N
+   To get the maximum range, we multiply by this factor (ie. you get N
credits per jiffy).  We want to allow a rate as low as 1 per day
(slowest userspace tool allows), which means
CREDITS_PER_JIFFY*HZ*60*60*24 < 2^32. ie. */
-- 
2.11.0


[PATCH 16/30] netfilter: x_tables: make sure compat af mutex is held

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/x_tables.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 33724b08b8f0..7521e8a72c06 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -582,6 +582,8 @@ int xt_compat_add_offset(u_int8_t af, unsigned int offset, int delta)
 {
struct xt_af *xp = &xt[af];
 
+   WARN_ON(!mutex_is_locked(&xt[af].compat_mutex));
+
if (WARN_ON(!xp->compat_tab))
return -ENOMEM;
 
@@ -599,6 +601,8 @@ EXPORT_SYMBOL_GPL(xt_compat_add_offset);
 
 void xt_compat_flush_offsets(u_int8_t af)
 {
+   WARN_ON(!mutex_is_locked(&xt[af].compat_mutex));
+
if (xt[af].compat_tab) {
vfree(xt[af].compat_tab);
xt[af].compat_tab = NULL;
@@ -630,6 +634,8 @@ int xt_compat_init_offsets(u8 af, unsigned int number)
 {
size_t mem;
 
+   WARN_ON(!mutex_is_locked(&xt[af].compat_mutex));
+
if (!number || number > (INT_MAX / sizeof(struct compat_delta)))
return -EINVAL;
 
-- 
2.11.0


[PATCH 17/30] netfilter: x_tables: ensure last rule in base chain matches underflow/policy

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

This is harmless from the kernel's point of view, but again iptables assumes
that this is true when decoding a ruleset coming from the kernel.

If a (syzkaller generated) ruleset doesn't have the underflow/policy
stored as the last rule in the base chain, then iptables will abort()
because it doesn't find the chain policy.

libiptc assumes that the policy is the last rule in the basechain, which
is only true for iptables-generated rulesets.

Unfortunately this needs code duplication -- the functions need the
struct layout of the rule head, but that is different for
ip/ip6/arptables.

NB: pr_warn could be pr_debug, but in case this breaks rulesets somehow it's
useful to know why the blob was rejected.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/arp_tables.c | 17 -
 net/ipv4/netfilter/ip_tables.c  | 17 -
 net/ipv6/netfilter/ip6_tables.c | 17 -
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index aaafdbd15ad3..f366ff1cfc19 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -309,10 +309,13 @@ static int mark_source_chains(const struct xt_table_info *newinfo,
for (hook = 0; hook < NF_ARP_NUMHOOKS; hook++) {
unsigned int pos = newinfo->hook_entry[hook];
struct arpt_entry *e = entry0 + pos;
+   unsigned int last_pos, depth;
 
if (!(valid_hooks & (1 << hook)))
continue;
 
+   depth = 0;
+   last_pos = pos;
/* Set initial back pointer. */
e->counters.pcnt = pos;
 
@@ -343,6 +346,8 @@ static int mark_source_chains(const struct xt_table_info *newinfo,
pos = e->counters.pcnt;
e->counters.pcnt = 0;
 
+   if (depth)
+   --depth;
/* We're at the start. */
if (pos == oldpos)
goto next;
@@ -367,6 +372,9 @@ static int mark_source_chains(const struct xt_table_info *newinfo,
if (!xt_find_jump_offset(offsets, newpos,
newinfo->number))
return 0;
+
+   if (entry0 + newpos != arpt_next_entry(e))
+   ++depth;
} else {
/* ... this is a fallthru */
newpos = pos + e->next_offset;
@@ -377,8 +385,15 @@ static int mark_source_chains(const struct xt_table_info *newinfo,
e->counters.pcnt = pos;
pos = newpos;
}
+   if (depth == 0)
+   last_pos = pos;
+   }
+next:
+   if (last_pos != newinfo->underflow[hook]) {
+   pr_err_ratelimited("last base chain position %u doesn't match underflow %u (hook %u)\n",
+  last_pos, newinfo->underflow[hook], hook);
+   return 0;
}
-next:  ;
}
return 1;
 }
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index f9063513f9d1..2362ca2c9e0c 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -378,10 +378,13 @@ mark_source_chains(const struct xt_table_info *newinfo,
for (hook = 0; hook < NF_INET_NUMHOOKS; hook++) {
unsigned int pos = newinfo->hook_entry[hook];
struct ipt_entry *e = entry0 + pos;
+   unsigned int last_pos, depth;
 
if (!(valid_hooks & (1 << hook)))
continue;
 
+   depth = 0;
+   last_pos = pos;
/* Set initial back pointer. */
e->counters.pcnt = pos;
 
@@ -410,6 +413,8 @@ mark_source_chains(const struct xt_table_info *newinfo,
pos = e->counters.pcnt;
e->counters.pcnt = 0;
 
+   if (depth)
+   --depth;
/* We're at the start. */
if (pos == oldpos)
goto next;
@@ -434,6 +439,9 @@ mark_source_chains(const struct xt_table_info *newinfo,
if (!xt_find_jump_offset(offsets, newpos,

[PATCH 10/30] netfilter: x_tables: enforce unique and ascending entry points

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

Entry points that are not unique and ascending are harmless from the
kernel's point of view, but iptables assumes that they are unique and
ascending when decoding a ruleset.

iptables walks the blob dumped from the kernel and, for each entry that
creates a new chain, prints out rule/chain information.
Base chains (hook entry points) are thus only shown when they appear
in the rule blob.  A base chain that is referenced multiple times
in the hook blob is then only printed once.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/x_tables.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 5d8ba89a8da8..4e6cbb38e616 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -529,10 +529,15 @@ static int xt_check_entry_match(const char *match, const char *target,
  */
 int xt_check_table_hooks(const struct xt_table_info *info, unsigned int valid_hooks)
 {
-   unsigned int i;
+   const char *err = "unsorted underflow";
+   unsigned int i, max_uflow, max_entry;
+   bool check_hooks = false;
 
BUILD_BUG_ON(ARRAY_SIZE(info->hook_entry) != ARRAY_SIZE(info->underflow));
 
+   max_entry = 0;
+   max_uflow = 0;
+
for (i = 0; i < ARRAY_SIZE(info->hook_entry); i++) {
if (!(valid_hooks & (1 << i)))
continue;
@@ -541,9 +546,33 @@ int xt_check_table_hooks(const struct xt_table_info *info, unsigned int valid_ho
return -EINVAL;
if (info->underflow[i] == 0xFFFFFFFF)
return -EINVAL;
+
+   if (check_hooks) {
+   if (max_uflow > info->underflow[i])
+   goto error;
+
+   if (max_uflow == info->underflow[i]) {
+   err = "duplicate underflow";
+   goto error;
+   }
+   if (max_entry > info->hook_entry[i]) {
+   err = "unsorted entry";
+   goto error;
+   }
+   if (max_entry == info->hook_entry[i]) {
+   err = "duplicate entry";
+   goto error;
+   }
+   }
+   max_entry = info->hook_entry[i];
+   max_uflow = info->underflow[i];
+   check_hooks = true;
}
 
return 0;
+error:
+   pr_err_ratelimited("%s at hook %d\n", err, i);
+   return -EINVAL;
 }
 EXPORT_SYMBOL(xt_check_table_hooks);
 
-- 
2.11.0
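
The ordering check added by this patch can be tried out in userspace. The
following is a minimal sketch, not the kernel code: struct hook_layout,
NUM_HOOKS and check_hook_order() are hypothetical names introduced for
illustration, and the function returns the error string instead of logging it.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_HOOKS 5 /* hypothetical; stands in for ARRAY_SIZE(info->hook_entry) */

/* Simplified stand-in for the two offset arrays in struct xt_table_info. */
struct hook_layout {
	unsigned int hook_entry[NUM_HOOKS];
	unsigned int underflow[NUM_HOOKS];
};

/*
 * Returns NULL when the entry points and underflows of all valid hooks are
 * strictly ascending, otherwise a short description of the first violation.
 */
static const char *check_hook_order(const struct hook_layout *info,
				    unsigned int valid_hooks)
{
	unsigned int max_entry = 0, max_uflow = 0;
	bool have_prev = false;
	size_t i;

	for (i = 0; i < NUM_HOOKS; i++) {
		if (!(valid_hooks & (1u << i)))
			continue; /* hook not used by this table */

		if (have_prev) {
			if (max_uflow > info->underflow[i])
				return "unsorted underflow";
			if (max_uflow == info->underflow[i])
				return "duplicate underflow";
			if (max_entry > info->hook_entry[i])
				return "unsorted entry";
			if (max_entry == info->hook_entry[i])
				return "duplicate entry";
		}
		max_entry = info->hook_entry[i];
		max_uflow = info->underflow[i];
		have_prev = true;
	}
	return NULL;
}
```

As in the patch, the first valid hook is only recorded and every later valid
hook is compared against its predecessor, so hooks that are masked out by
valid_hooks never take part in the comparison.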


[PATCH 11/30] netfilter: x_tables: cap allocations at 512 mbyte

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

The limit is arbitrary, but it still allows huge rulesets
(> 1 million rules).  This helps with automated fuzzers, as it prevents
oom-killer invocation.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/x_tables.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 4e6cbb38e616..dc68ac49614a 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -40,6 +40,7 @@ MODULE_AUTHOR("Harald Welte ");
 MODULE_DESCRIPTION("{ip,ip6,arp,eb}_tables backend module");
 
 #define XT_PCPU_BLOCK_SIZE 4096
+#define XT_MAX_TABLE_SIZE  (512 * 1024 * 1024)
 
 struct compat_delta {
unsigned int offset; /* offset in kernel */
@@ -1117,7 +1118,7 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
struct xt_table_info *info = NULL;
size_t sz = sizeof(*info) + size;
 
-   if (sz < sizeof(*info))
+   if (sz < sizeof(*info) || sz >= XT_MAX_TABLE_SIZE)
return NULL;
 
/* __GFP_NORETRY is not fully supported by kvmalloc but it should
-- 
2.11.0
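
A small userspace sketch of the size check may help when reasoning about the
two conditions. It assumes a hypothetical struct table_header in place of
struct xt_table_info; MAX_TABLE_SIZE mirrors XT_MAX_TABLE_SIZE from the patch,
and table_size_ok() is a name made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TABLE_SIZE (512u * 1024 * 1024) /* mirrors XT_MAX_TABLE_SIZE */

/* Hypothetical fixed-size header that precedes the rule blob. */
struct table_header {
	unsigned int size;
	unsigned int number;
};

/* True if a blob of 'size' bytes plus the header may be allocated. */
static bool table_size_ok(unsigned int size)
{
	size_t sz = sizeof(struct table_header) + size;

	/* The addition wrapped around (possible where size_t is 32 bits). */
	if (sz < sizeof(struct table_header))
		return false;

	/* Refuse anything at or above the 512 MiB cap. */
	return sz < MAX_TABLE_SIZE;
}
```

The first test only matters where size_t is as narrow as unsigned int; on
64-bit builds oversized requests are rejected by the cap alone.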


[PATCH 12/30] netfilter: x_tables: limit allocation requests for blob rule heads

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

This is a very conservative limit (134217728 rules), but good
enough to not trigger frequent oom from syzkaller.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/x_tables.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index dc68ac49614a..01f8e122e74e 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -894,6 +894,9 @@ EXPORT_SYMBOL(xt_check_entry_offsets);
  */
 unsigned int *xt_alloc_entry_offsets(unsigned int size)
 {
+   if (size > XT_MAX_TABLE_SIZE / sizeof(unsigned int))
+   return NULL;
+
return kvmalloc_array(size, sizeof(unsigned int), GFP_KERNEL | __GFP_ZERO);
 
 }
-- 
2.11.0
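
The figure of 134217728 in the description is the 512 MiB cap divided by the
size of one offset slot, assuming a 4-byte unsigned int. A userspace sketch of
the same guard, with calloc() standing in for kvmalloc_array() and
alloc_entry_offsets() as a hypothetical name:

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_TABLE_SIZE (512u * 1024 * 1024) /* mirrors XT_MAX_TABLE_SIZE */

/* Largest number of offset slots that still fits under the cap. */
#define MAX_OFFSETS (MAX_TABLE_SIZE / sizeof(unsigned int))

/* Returns a zeroed array of 'count' offsets, or NULL if count is too large. */
static unsigned int *alloc_entry_offsets(size_t count)
{
	if (count > MAX_OFFSETS)
		return NULL;

	/* calloc() is the userspace stand-in for kvmalloc_array(... __GFP_ZERO). */
	return calloc(count, sizeof(unsigned int));
}
```

With a 4-byte unsigned int, MAX_OFFSETS evaluates to 536870912 / 4 = 134217728,
which matches the number quoted in the commit message.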


[PATCH 09/30] netfilter: x_tables: move hook entry checks into core

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

Allow a follow-up patch to change one location instead of three.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 include/linux/netfilter/x_tables.h |  2 ++
 net/ipv4/netfilter/arp_tables.c| 13 +++--
 net/ipv4/netfilter/ip_tables.c | 13 +++--
 net/ipv6/netfilter/ip6_tables.c| 13 +++--
 net/netfilter/x_tables.c   | 29 +
 5 files changed, 40 insertions(+), 30 deletions(-)

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index 1313b35c3ab7..fa0c19c328f1 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -281,6 +281,8 @@ int xt_check_entry_offsets(const void *base, const char *elems,
   unsigned int target_offset,
   unsigned int next_offset);
 
+int xt_check_table_hooks(const struct xt_table_info *info, unsigned int valid_hooks);
+
 unsigned int *xt_alloc_entry_offsets(unsigned int size);
 bool xt_find_jump_offset(const unsigned int *offsets,
 unsigned int target, unsigned int size);
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index c9ffa884a4ee..be5821215ea0 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -555,16 +555,9 @@ static int translate_table(struct xt_table_info *newinfo, void *entry0,
if (i != repl->num_entries)
goto out_free;
 
-   /* Check hooks all assigned */
-   for (i = 0; i < NF_ARP_NUMHOOKS; i++) {
-   /* Only hooks which are valid */
-   if (!(repl->valid_hooks & (1 << i)))
-   continue;
-   if (newinfo->hook_entry[i] == 0xFFFFFFFF)
-   goto out_free;
-   if (newinfo->underflow[i] == 0xFFFFFFFF)
-   goto out_free;
-   }
+   ret = xt_check_table_hooks(newinfo, repl->valid_hooks);
+   if (ret)
+   goto out_free;
 
if (!mark_source_chains(newinfo, repl->valid_hooks, entry0, offsets)) {
ret = -ELOOP;
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index c9b57a6bf96a..29bda9484a33 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -702,16 +702,9 @@ translate_table(struct net *net, struct xt_table_info *newinfo, void *entry0,
if (i != repl->num_entries)
goto out_free;
 
-   /* Check hooks all assigned */
-   for (i = 0; i < NF_INET_NUMHOOKS; i++) {
-   /* Only hooks which are valid */
-   if (!(repl->valid_hooks & (1 << i)))
-   continue;
-   if (newinfo->hook_entry[i] == 0xFFFFFFFF)
-   goto out_free;
-   if (newinfo->underflow[i] == 0xFFFFFFFF)
-   goto out_free;
-   }
+   ret = xt_check_table_hooks(newinfo, repl->valid_hooks);
+   if (ret)
+   goto out_free;
 
if (!mark_source_chains(newinfo, repl->valid_hooks, entry0, offsets)) {
ret = -ELOOP;
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index f46954221933..ba3776a4d305 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -720,16 +720,9 @@ translate_table(struct net *net, struct xt_table_info *newinfo, void *entry0,
if (i != repl->num_entries)
goto out_free;
 
-   /* Check hooks all assigned */
-   for (i = 0; i < NF_INET_NUMHOOKS; i++) {
-   /* Only hooks which are valid */
-   if (!(repl->valid_hooks & (1 << i)))
-   continue;
-   if (newinfo->hook_entry[i] == 0xFFFFFFFF)
-   goto out_free;
-   if (newinfo->underflow[i] == 0xFFFFFFFF)
-   goto out_free;
-   }
+   ret = xt_check_table_hooks(newinfo, repl->valid_hooks);
+   if (ret)
+   goto out_free;
 
if (!mark_source_chains(newinfo, repl->valid_hooks, entry0, offsets)) {
ret = -ELOOP;
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index f045bb4f7063..5d8ba89a8da8 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -518,6 +518,35 @@ static int xt_check_entry_match(const char *match, const char *target,
return 0;
 }
 
+/** xt_check_table_hooks - check hook entry points are sane
+ *
+ * @info xt_table_info to check
+ * @valid_hooks - hook entry points that we can enter from
+ *
+ * Validates that the hook entry and underflows points are set up.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int xt_check_table_hooks(const struct xt_table_info *info, unsigned int valid_hooks)
+{
+   unsigned int i;
+
+   BUILD_BUG_ON(ARRAY_SIZE(info->hook_entry) != ARRAY_SIZE(info->underflow));
+
+   for (i = 0; i <

[PATCH 06/30] netfilter: unlock xt_table earlier in __do_replace

2018-03-12 Thread Pablo Neira Ayuso

From: Xin Long 

Currently cleanup_entry is called for oldinfo under the xt_table lock,
but that is not really necessary. After the replacement is done in
xt_replace_table, oldinfo is no longer used anywhere else, so it can
safely be freed without holding the xt_table lock.

More importantly, rtnl_lock is taken in some xt_target destroy
callbacks, which means that rtnl_lock, a big lock, is acquired while
holding the xt_table lock, a smaller one. This kind of lock nesting
is a common cause of deadlocks.

Besides, all xt_target/match checkentry calls are already made outside
the xt_table lock. It is better to move all cleanup_entry calls out of
the xt_table lock as well, just as do_replace_finish does for ebtables.

Signed-off-by: Xin Long 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/arp_tables.c | 3 ++-
 net/ipv4/netfilter/ip_tables.c  | 3 ++-
 net/ipv6/netfilter/ip6_tables.c | 3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index c36ffce3c812..a0c7ce76879c 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -925,6 +925,8 @@ static int __do_replace(struct net *net, const char *name,
(newinfo->number <= oldinfo->initial_entries))
module_put(t->me);
 
+   xt_table_unlock(t);
+
get_old_counters(oldinfo, counters);
 
/* Decrease module usage counts and free resource */
@@ -939,7 +941,6 @@ static int __do_replace(struct net *net, const char *name,
net_warn_ratelimited("arptables: counters copy to user failed while replacing table\n");
}
vfree(counters);
-   xt_table_unlock(t);
return ret;
 
  put_module:
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index d4f7584d2dbe..4f7153e25e0b 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -1087,6 +1087,8 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
(newinfo->number <= oldinfo->initial_entries))
module_put(t->me);
 
+   xt_table_unlock(t);
+
get_old_counters(oldinfo, counters);
 
/* Decrease module usage counts and free resource */
@@ -1100,7 +1102,6 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
net_warn_ratelimited("iptables: counters copy to user failed while replacing table\n");
}
vfree(counters);
-   xt_table_unlock(t);
return ret;
 
  put_module:
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 4de8ac1e5af4..6c44033decab 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1105,6 +1105,8 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
(newinfo->number <= oldinfo->initial_entries))
module_put(t->me);
 
+   xt_table_unlock(t);
+
get_old_counters(oldinfo, counters);
 
/* Decrease module usage counts and free resource */
@@ -1118,7 +1120,6 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
net_warn_ratelimited("ip6tables: counters copy to user failed while replacing table\n");
}
vfree(counters);
-   xt_table_unlock(t);
return ret;
 
  put_module:
-- 
2.11.0
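
The reordering in this patch follows a common pattern: swap the pointer while
holding the lock, drop the lock, and only then run the potentially expensive
cleanup. Below is a single-threaded sketch of that ordering. Everything in it
is hypothetical: the "lock" is a plain flag rather than a real mutex, and
struct ruleset, struct table, cleanup_ruleset() and replace_ruleset() only
stand in for the kernel objects named in the description.

```c
#include <stdbool.h>
#include <stdlib.h>

struct ruleset { int nrules; }; /* hypothetical stand-in for xt_table_info */

struct table {
	bool locked;            /* toy lock: a flag, not a real mutex */
	struct ruleset *info;   /* currently active ruleset */
};

/* Records whether the last cleanup ran with the table lock released. */
static bool cleanup_ran_unlocked;

/* Stand-in for cleanup_entry(): may be slow or take other locks. */
static void cleanup_ruleset(const struct table *t, struct ruleset *old)
{
	cleanup_ran_unlocked = !t->locked;
	free(old);
}

/* Installs newinfo and releases the old ruleset outside the lock. */
static void replace_ruleset(struct table *t, struct ruleset *newinfo)
{
	struct ruleset *oldinfo;

	t->locked = true;       /* take the table lock */
	oldinfo = t->info;
	t->info = newinfo;      /* the only step that needs the lock */
	t->locked = false;      /* release before cleaning up */

	cleanup_ruleset(t, oldinfo);
}
```

Once the pointer has been swapped, nothing else can reach the old ruleset, so
freeing it needs no protection and any lock taken during cleanup is no longer
nested inside the table lock.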


[PATCH 04/30] netfilter: nf_conntrack_broadcast: remove useless parameter

2018-03-12 Thread Pablo Neira Ayuso

From: Taehee Yoo 

The protoff parameter of nf_conntrack_broadcast_help is not used anywhere.

Signed-off-by: Taehee Yoo 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_conntrack_helper.h | 3 +--
 net/netfilter/nf_conntrack_broadcast.c  | 1 -
 net/netfilter/nf_conntrack_netbios_ns.c | 5 +++--
 net/netfilter/nf_conntrack_snmp.c   | 5 +++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index fc39bbaf107c..32c2a94a219d 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -132,8 +132,7 @@ void nf_conntrack_helper_pernet_fini(struct net *net);
 int nf_conntrack_helper_init(void);
 void nf_conntrack_helper_fini(void);
 
-int nf_conntrack_broadcast_help(struct sk_buff *skb, unsigned int protoff,
-   struct nf_conn *ct,
+int nf_conntrack_broadcast_help(struct sk_buff *skb, struct nf_conn *ct,
enum ip_conntrack_info ctinfo,
unsigned int timeout);
 
diff --git a/net/netfilter/nf_conntrack_broadcast.c b/net/netfilter/nf_conntrack_broadcast.c
index ecc3ab784633..a1086bdec242 100644
--- a/net/netfilter/nf_conntrack_broadcast.c
+++ b/net/netfilter/nf_conntrack_broadcast.c
@@ -20,7 +20,6 @@
 #include 
 
 int nf_conntrack_broadcast_help(struct sk_buff *skb,
-   unsigned int protoff,
struct nf_conn *ct,
enum ip_conntrack_info ctinfo,
unsigned int timeout)
diff --git a/net/netfilter/nf_conntrack_netbios_ns.c b/net/netfilter/nf_conntrack_netbios_ns.c
index 496ce173f0c1..a4a59dc7cf17 100644
--- a/net/netfilter/nf_conntrack_netbios_ns.c
+++ b/net/netfilter/nf_conntrack_netbios_ns.c
@@ -41,9 +41,10 @@ static struct nf_conntrack_expect_policy exp_policy = {
 };
 
 static int netbios_ns_help(struct sk_buff *skb, unsigned int protoff,
-  struct nf_conn *ct, enum ip_conntrack_info ctinfo)
+  struct nf_conn *ct,
+  enum ip_conntrack_info ctinfo)
 {
-   return nf_conntrack_broadcast_help(skb, protoff, ct, ctinfo, timeout);
+   return nf_conntrack_broadcast_help(skb, ct, ctinfo, timeout);
 }
 
 static struct nf_conntrack_helper helper __read_mostly = {
diff --git a/net/netfilter/nf_conntrack_snmp.c b/net/netfilter/nf_conntrack_snmp.c
index 87b95a2c270c..2d0f8e010821 100644
--- a/net/netfilter/nf_conntrack_snmp.c
+++ b/net/netfilter/nf_conntrack_snmp.c
@@ -36,11 +36,12 @@ int (*nf_nat_snmp_hook)(struct sk_buff *skb,
 EXPORT_SYMBOL_GPL(nf_nat_snmp_hook);
 
 static int snmp_conntrack_help(struct sk_buff *skb, unsigned int protoff,
-   struct nf_conn *ct, enum ip_conntrack_info ctinfo)
+  struct nf_conn *ct,
+  enum ip_conntrack_info ctinfo)
 {
typeof(nf_nat_snmp_hook) nf_nat_snmp;
 
-   nf_conntrack_broadcast_help(skb, protoff, ct, ctinfo, timeout);
+   nf_conntrack_broadcast_help(skb, ct, ctinfo, timeout);
 
nf_nat_snmp = rcu_dereference(nf_nat_snmp_hook);
if (nf_nat_snmp && ct->status & IPS_NAT_MASK)
-- 
2.11.0


[PATCH 08/30] netfilter: x_tables: check error target size too

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

Check that the userspace ERROR target (used for custom user-defined
chains) matches the expected format and that the chain name is
NUL-terminated.

This is irrelevant for the kernel, but iptables itself relies on sane input
when it dumps rules from the kernel.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/x_tables.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 2e4d423e58e6..f045bb4f7063 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -654,6 +654,11 @@ struct compat_xt_standard_target {
compat_uint_t verdict;
 };
 
+struct compat_xt_error_target {
+   struct compat_xt_entry_target t;
+   char errorname[XT_FUNCTION_MAXNAMELEN];
+};
+
 static bool verdict_ok(int verdict)
 {
if (verdict > 0)
@@ -679,6 +684,12 @@ static bool verdict_ok(int verdict)
return false;
 }
 
+static bool error_tg_ok(unsigned int usersize, unsigned int kernsize,
+   const char *msg, unsigned int msglen)
+{
+   return usersize == kernsize && strnlen(msg, msglen) < msglen;
+}
+
 int xt_compat_check_entry_offsets(const void *base, const char *elems,
  unsigned int target_offset,
  unsigned int next_offset)
@@ -708,6 +719,12 @@ int xt_compat_check_entry_offsets(const void *base, const char *elems,
 
if (!verdict_ok(st->verdict))
return -EINVAL;
+   } else if (strcmp(t->u.user.name, XT_ERROR_TARGET) == 0) {
+   const struct compat_xt_error_target *et = (const void *)t;
+
+   if (!error_tg_ok(t->u.target_size, sizeof(*et),
+et->errorname, sizeof(et->errorname)))
+   return -EINVAL;
}
 
/* compat_xt_entry match has less strict alignment requirements,
@@ -796,6 +813,12 @@ int xt_check_entry_offsets(const void *base,
 
if (!verdict_ok(st->verdict))
return -EINVAL;
+   } else if (strcmp(t->u.user.name, XT_ERROR_TARGET) == 0) {
+   const struct xt_error_target *et = (const void *)t;
+
+   if (!error_tg_ok(t->u.target_size, sizeof(*et),
+et->errorname, sizeof(et->errorname)))
+   return -EINVAL;
}
 
return xt_check_entry_match(elems, base + target_offset,
-- 
2.11.0
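
The two conditions tested by error_tg_ok() can be exercised in userspace. The
sketch below uses a simplified, hypothetical struct error_target (the real one
embeds a struct xt_entry_target header) and memchr() in place of strnlen();
both answer the same question, namely whether a NUL byte occurs inside the
buffer.

```c
#include <stdbool.h>
#include <string.h>

#define NAME_MAX_LEN 30 /* hypothetical; stands in for XT_FUNCTION_MAXNAMELEN */

/* Simplified stand-in for struct xt_error_target. */
struct error_target {
	unsigned short target_size;      /* size claimed by userspace */
	char errorname[NAME_MAX_LEN];    /* chain name, must be NUL-terminated */
};

/* True if the claimed size matches and the name is terminated in-buffer. */
static bool error_target_ok(const struct error_target *et)
{
	return et->target_size == sizeof(*et) &&
	       memchr(et->errorname, '\0', sizeof(et->errorname)) != NULL;
}
```

Scanning at most sizeof(errorname) bytes guarantees the check itself never
reads past the end of the buffer, even for a name with no terminator.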


[PATCH 07/30] netfilter: x_tables: check standard verdicts in core

2018-03-12 Thread Pablo Neira Ayuso

From: Florian Westphal 

Userspace must provide a valid verdict to the standard target.

The verdict can be either a jump (signed int > 0), or a return code.

Allowed return codes are RETURN (pop from stack), NF_ACCEPT, NF_DROP
and NF_QUEUE (the latter is allowed for legacy reasons).

Jump offsets (verdict > 0) are checked in more detail later on when
loop-detection is performed.
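
The verdict rules above can be sketched as a standalone userspace function.
This is an illustration rather than the kernel code: the constants are
redefined locally so that it compiles without kernel headers, with the values
believed to match the uapi definitions, and the negation is written as
-(verdict + 1) so that INT_MIN does not overflow in plain C.

```c
#include <limits.h>
#include <stdbool.h>

/* Local copies of the netfilter verdict values (assumed to match uapi). */
enum { NF_DROP = 0, NF_ACCEPT = 1, NF_QUEUE = 3, NF_REPEAT = 4 };

/* RETURN is encoded like any other negative verdict: -value - 1. */
#define XT_RETURN (-NF_REPEAT - 1)

/* True if 'verdict' is a jump offset or one of the allowed return codes. */
static bool verdict_ok(int verdict)
{
	if (verdict > 0)
		return true;            /* jump; offset is validated later */

	if (verdict == XT_RETURN)
		return true;            /* pop from stack */

	/* Decode -value - 1 without overflowing for INT_MIN. */
	switch (-(verdict + 1)) {
	case NF_ACCEPT:
	case NF_DROP:
	case NF_QUEUE:
		return true;
	default:
		return false;           /* includes verdict == 0 */
	}
}
```

A verdict of 0 is rejected because it decodes to -1, which is not a verdict
value, and it is not a positive jump offset either.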

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/arp_tables.c |  5 -
 net/ipv4/netfilter/ip_tables.c  |  5 -
 net/ipv6/netfilter/ip6_tables.c |  5 -
 net/netfilter/x_tables.c| 49 -
 4 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index a0c7ce76879c..c9ffa884a4ee 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -334,11 +334,6 @@ static int mark_source_chains(const struct xt_table_info *newinfo,
 t->verdict < 0) || visited) {
unsigned int oldpos, size;
 
-   if ((strcmp(t->target.u.user.name,
-   XT_STANDARD_TARGET) == 0) &&
-   t->verdict < -NF_MAX_VERDICT - 1)
-   return 0;
-
/* Return: backtrack through the last
 * big jump.
 */
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 4f7153e25e0b..c9b57a6bf96a 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -402,11 +402,6 @@ mark_source_chains(const struct xt_table_info *newinfo,
 t->verdict < 0) || visited) {
unsigned int oldpos, size;
 
-   if ((strcmp(t->target.u.user.name,
-   XT_STANDARD_TARGET) == 0) &&
-   t->verdict < -NF_MAX_VERDICT - 1)
-   return 0;
-
/* Return: backtrack through the last
   big jump. */
do {
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 6c44033decab..f46954221933 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -420,11 +420,6 @@ mark_source_chains(const struct xt_table_info *newinfo,
 t->verdict < 0) || visited) {
unsigned int oldpos, size;
 
-   if ((strcmp(t->target.u.user.name,
-   XT_STANDARD_TARGET) == 0) &&
-   t->verdict < -NF_MAX_VERDICT - 1)
-   return 0;
-
/* Return: backtrack through the last
   big jump. */
do {
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index d9deebe599ec..2e4d423e58e6 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -654,6 +654,31 @@ struct compat_xt_standard_target {
compat_uint_t verdict;
 };
 
+static bool verdict_ok(int verdict)
+{
+   if (verdict > 0)
+   return true;
+
+   if (verdict < 0) {
+   int v = -verdict - 1;
+
+   if (verdict == XT_RETURN)
+   return true;
+
+   switch (v) {
+   case NF_ACCEPT: return true;
+   case NF_DROP: return true;
+   case NF_QUEUE: return true;
+   default:
+   break;
+   }
+
+   return false;
+   }
+
+   return false;
+}
+
 int xt_compat_check_entry_offsets(const void *base, const char *elems,
  unsigned int target_offset,
  unsigned int next_offset)
@@ -675,9 +700,15 @@ int xt_compat_check_entry_offsets(const void *base, const char *elems,
if (target_offset + t->u.target_size > next_offset)
return -EINVAL;
 
-   if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0 &&
-   COMPAT_XT_ALIGN(target_offset + sizeof(struct compat_xt_standard_target)) != next_offset)
-   return -EINVAL;
+   if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0) {
+   const struct compat_xt_standard_target *st = (const void *)t;
+
+   if (COMPAT_XT_ALIGN(target_offset + sizeof(*st)) != next_offset)
+   return -EINVAL;
+
+   if (!verdict_ok(st->verdict))
+   return -EINVAL;
+   }
 
/* compat_xt_entry match has less strict alignment requirements,
 * otherwise they

[PATCH 01/30] netfilter: nf_tables: nf_tables_obj_lookup_byhandle() can be static

2018-03-12 Thread Pablo Neira Ayuso

From: kbuild test robot 

Fixes: 3ecbfd65f50e ("netfilter: nf_tables: allocate handle and delete objects via handle")
Signed-off-by: Fengguang Wu 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_tables_api.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 8b9fe30de0cd..8cc7fc970f0c 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4328,9 +4328,9 @@ struct nft_object *nf_tables_obj_lookup(const struct nft_table *table,
 }
 EXPORT_SYMBOL_GPL(nf_tables_obj_lookup);
 
-struct nft_object *nf_tables_obj_lookup_byhandle(const struct nft_table *table,
-const struct nlattr *nla,
-u32 objtype, u8 genmask)
+static struct nft_object *nf_tables_obj_lookup_byhandle(const struct nft_table *table,
+   const struct nlattr *nla,
+   u32 objtype, u8 genmask)
 {
struct nft_object *obj;
 
@@ -4850,7 +4850,7 @@ struct nft_flowtable *nf_tables_flowtable_lookup(const struct nft_table *table,
 }
 EXPORT_SYMBOL_GPL(nf_tables_flowtable_lookup);
 
-struct nft_flowtable *
+static struct nft_flowtable *
 nf_tables_flowtable_lookup_byhandle(const struct nft_table *table,
const struct nlattr *nla, u8 genmask)
 {
-- 
2.11.0


[PATCH 02/30] netfilter: nfnetlink_acct: remove useless parameter

2018-03-12 Thread Pablo Neira Ayuso

From: Taehee Yoo 

The skb parameter of nfnl_acct_overquota is not used anywhere.

Signed-off-by: Taehee Yoo 
Signed-off-by: Pablo Neira Ayuso 
---
 include/linux/netfilter/nfnetlink_acct.h | 3 +--
 net/netfilter/nfnetlink_acct.c   | 3 +--
 net/netfilter/xt_nfacct.c| 2 +-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/linux/netfilter/nfnetlink_acct.h b/include/linux/netfilter/nfnetlink_acct.h
index b4d741195c28..beee8bffe49e 100644
--- a/include/linux/netfilter/nfnetlink_acct.h
+++ b/include/linux/netfilter/nfnetlink_acct.h
@@ -16,6 +16,5 @@ struct nf_acct;
 struct nf_acct *nfnl_acct_find_get(struct net *net, const char *filter_name);
 void nfnl_acct_put(struct nf_acct *acct);
 void nfnl_acct_update(const struct sk_buff *skb, struct nf_acct *nfacct);
-int nfnl_acct_overquota(struct net *net, const struct sk_buff *skb,
-   struct nf_acct *nfacct);
+int nfnl_acct_overquota(struct net *net, struct nf_acct *nfacct);
 #endif /* _NFNL_ACCT_H */
diff --git a/net/netfilter/nfnetlink_acct.c b/net/netfilter/nfnetlink_acct.c
index 88d427f9f9e6..b9505bcd3827 100644
--- a/net/netfilter/nfnetlink_acct.c
+++ b/net/netfilter/nfnetlink_acct.c
@@ -467,8 +467,7 @@ static void nfnl_overquota_report(struct net *net, struct nf_acct *nfacct)
  GFP_ATOMIC);
 }
 
-int nfnl_acct_overquota(struct net *net, const struct sk_buff *skb,
-   struct nf_acct *nfacct)
+int nfnl_acct_overquota(struct net *net, struct nf_acct *nfacct)
 {
u64 now;
u64 *quota;
diff --git a/net/netfilter/xt_nfacct.c b/net/netfilter/xt_nfacct.c
index c8674deed4eb..6b56f4170860 100644
--- a/net/netfilter/xt_nfacct.c
+++ b/net/netfilter/xt_nfacct.c
@@ -28,7 +28,7 @@ static bool nfacct_mt(const struct sk_buff *skb, struct xt_action_param *par)
 
nfnl_acct_update(skb, info->nfacct);
 
-   overquota = nfnl_acct_overquota(xt_net(par), skb, info->nfacct);
+   overquota = nfnl_acct_overquota(xt_net(par), info->nfacct);
 
return overquota == NFACCT_UNDERQUOTA ? false : true;
 }
-- 
2.11.0


[PATCH 03/30] netfilter: xt_cluster: get rid of xt_cluster_ipv6_is_multicast

2018-03-12 Thread Pablo Neira Ayuso

From: Taehee Yoo 

If we use ipv6_addr_is_multicast instead of xt_cluster_ipv6_is_multicast,
we can reduce code size.

Signed-off-by: Taehee Yoo 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/xt_cluster.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/net/netfilter/xt_cluster.c b/net/netfilter/xt_cluster.c
index 0068688995c8..dfbdbb2fc0ed 100644
--- a/net/netfilter/xt_cluster.c
+++ b/net/netfilter/xt_cluster.c
@@ -60,13 +60,6 @@ xt_cluster_hash(const struct nf_conn *ct,
 }
 
 static inline bool
-xt_cluster_ipv6_is_multicast(const struct in6_addr *addr)
-{
-   __be32 st = addr->s6_addr32[0];
-   return ((st & htonl(0xFF000000)) == htonl(0xFF000000));
-}
-
-static inline bool
 xt_cluster_is_multicast_addr(const struct sk_buff *skb, u_int8_t family)
 {
bool is_multicast = false;
@@ -76,8 +69,7 @@ xt_cluster_is_multicast_addr(const struct sk_buff *skb, u_int8_t family)
is_multicast = ipv4_is_multicast(ip_hdr(skb)->daddr);
break;
case NFPROTO_IPV6:
-   is_multicast =
-   xt_cluster_ipv6_is_multicast(&ipv6_hdr(skb)->daddr);
+   is_multicast = ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr);
break;
default:
WARN_ON(1);
-- 
2.11.0
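
Both the removed helper and the generic one it is replaced with test whether
the address lies in ff00::/8, that is, whether its first byte is 0xff. A
byte-wise userspace sketch of that test, using a hypothetical struct addr6
instead of struct in6_addr so that no network headers are needed:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct in6_addr: 16 bytes in network order. */
struct addr6 {
	uint8_t bytes[16];
};

/* True if the address is in the IPv6 multicast range ff00::/8. */
static bool addr6_is_multicast(const struct addr6 *a)
{
	return a->bytes[0] == 0xff;
}
```

Masking the first 32-bit word with htonl(0xFF000000), as the kernel code does,
selects the same leading byte regardless of host endianness.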


Re: [PATCH 00/30] Netfilter/IPVS updates for net-next

2018-03-12 Thread David Miller

From: Pablo Neira Ayuso 
Date: Mon, 12 Mar 2018 18:58:50 +0100

> The following patchset contains Netfilter/IPVS updates for your net-next
> tree. This batch comes with more input sanitization for xtables to
> address bug reports from fuzzers, preparation works to the flowtable
> infrastructure and assorted updates. In no particular order, they are:

Sorry, I've seen enough.  I'm not pulling this.

What is the story with this flow table stuff?  I tried to ask you
about this before, but the response I was given was extremely vague
and did not answer my question at all.

This is a lot of code, and a lot of infrastructure, yet I see
no device using the infrastructure to offload conntrack.

Nor can I see how this can possibly be even useful for such an
application.  What conntrack offload needs are things completely
outside of what the flow table stuff provides.  Mainly, they
require that the SKB is completely abstracted away from all of
the conntrack code paths, and that the conntrack infrastructure
operates on an abstract packet metadata concept.

If you are targeting one specific piece of hardware with TCAMs
that you are familiar with, I'd like you to stop right there.
Because if that is all that this infrastructure can actually
be used for, it is definitely designed wrong.

This, as has been the case in the past, is what is wrong with the
netfilter approach to supporting offloading.  We see all of this
infrastructure before an actual working use case is provided for a
specific piece of hardware for a specific driver in the tree.

Nobody can evaluate whether the approach is good or not without
a clear driver change implementing support for it.

No other area of networking puts the cart before the horse like this.

I do not agree at all with the flow table infrastructure and I
therefore do not want to pull any more flow table changes into my tree
until there is an actual user of this stuff in that pull request which
actually works in a way which is useful for people.  It is completely
dead and useless code currently.

If you disagree you have to not just say it, but show it with a driver
that successfully and cleanly uses this code to offload conntrack.

Meanwhile, remove the flow table commits from this pull request out of
your tree and ask me to pull in the rest.

Thanks.

Re: [PATCH 00/30] Netfilter/IPVS updates for net-next

2018-03-12 Thread Felix Fietkau

On 2018-03-12 19:58, David Miller wrote:
> From: Pablo Neira Ayuso 
> Date: Mon, 12 Mar 2018 18:58:50 +0100
> 
>> The following patchset contains Netfilter/IPVS updates for your net-next
>> tree. This batch comes with more input sanitization for xtables to
>> address bug reports from fuzzers, preparation works to the flowtable
>> infrastructure and assorted updates. In no particular order, they are:
> 
> Sorry, I've seen enough.  I'm not pulling this.
> 
> What is the story with this flow table stuff?  I tried to ask you
> about this before, but the response I was given was extremely vague
> and did not answer my question at all.
> 
> This is a lot of code, and a lot of infrastructure, yet I see
> no device using the infrastructure to offload conntrack.
> 
> Nor can I see how this can possibly be even useful for such an
> application.  What conntrack offload needs are things completely
> outside of what the flow table stuff provides.  Mainly, they
> require that the SKB is completely abstracted away from all of
> the conntrack code paths, and that the conntrack infrastructure
> operates on an abstract packet metadata concept.
> 
> If you are targeting one specific piece of hardware with TCAMs
> that you are familiar with, I'd like you to stop right there.
> Because if that is all that this infrastructure can actually
> be used for, it is definitely designed wrong.
> 
> This, as has been the case in the past, is what is wrong with
> netfilter approach to supporting offloading.  We see all of this
> infrastructure before an actual working use case is provided for a
> specific piece of hardware for a specific driver in the tree.
> 
> Nobody can evaluate whether the approach is good or not without
> a clear driver change implementing support for it.
> 
> No other area of networking puts the cart before the horse like this.
> 
> I do not agree at all with the flow table infrastructure and I
> therefore do not want to pull any more flow table changes into my tree
> until there is an actual user of this stuff in that pull request which
> actually works in a way which is useful for people.  It is completely
> dead and useless code currently.
It's not dead and useless. In its current state, it has a software fast
path that significantly improves nftables routing/NAT throughput,
especially on embedded devices.
On some devices, I've seen "only" 20% throughput improvement (along with
CPU usage reduction), on others it's quite a bit more. This is
without any extra drivers or patches aside from what's posted.

Within OpenWrt, I'm working on a patch that makes the same available to
legacy netfilter as well. This is the reason for a lot of the core
refactoring that I did.

Hardware offload is still being worked on, not sure when we will have
the first driver ready. But as it stands now, the code is already very
useful and backported to OpenWrt for testing.

I think that in a couple of weeks this code will be ready to be enabled
by default in OpenWrt, which means that a lot of users' setups will get
a lot faster with no configuration change at all.

- Felix

Re: [PATCH 00/30] Netfilter/IPVS updates for net-next

2018-03-12 Thread David Miller

From: Felix Fietkau 
Date: Mon, 12 Mar 2018 20:30:01 +0100

> It's not dead and useless. In its current state, it has a software fast
> path that significantly improves nftables routing/NAT throughput,
> especially on embedded devices.
> On some devices, I've seen "only" 20% throughput improvement (along with
> CPU usage reduction), on others it's quite a bit more. This is
> without any extra drivers or patches aside from what's posted.

I wonder if this software fast path has the exploitability problems that
things like the ipv4 routing cache and the per-cpu flow cache both had,
which is the reason both were removed.

I don't see how you can avoid this problem.

I'm willing to be shown otherwise :-)

Re: [PATCH 00/30] Netfilter/IPVS updates for net-next

2018-03-12 Thread Felix Fietkau

On 2018-03-12 21:01, David Miller wrote:
> From: Felix Fietkau 
> Date: Mon, 12 Mar 2018 20:30:01 +0100
> 
>> It's not dead and useless. In its current state, it has a software fast
>> path that significantly improves nftables routing/NAT throughput,
>> especially on embedded devices.
>> On some devices, I've seen "only" 20% throughput improvement (along with
>> CPU usage reduction), on others it's quite a bit more. This is
>> without any extra drivers or patches aside from what's posted.
> 
> I wonder if this software fast path has the exploitability problems that
> things like the ipv4 routing cache and the per-cpu flow cache both had,
> which is the reason both were removed.
> 
> I don't see how you can avoid this problem.
> 
> I'm willing to be shown otherwise :-)
I don't think it suffers from the same issues, and if it does, it's a
lot easier to mitigate. The ruleset can easily be configured to only
offload connections that transferred a certain amount of data, handling
only bulk flows.

It's easy to put an upper limit on the number of offloaded connections,
and there's nothing in the code that just creates an offload entry per
packet or per lookup or something like that.

If you have other concerns, I'm sure we can address them with follow-up
patches, but as it stands, I think the code is already quite useful.

- Felix
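
[Archive note] The two mitigations described in this reply (offload only bulk flows, cap the number of offloaded connections) can be pictured with a toy model. This is only a sketch under stated assumptions: the struct names, the thresholds and the function toy_account_packet() are hypothetical and are not part of the flow table code or of nftables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy knobs; names and values are illustrative only. */
#define OFFLOAD_MIN_BYTES   (1u << 20)  /* only bulk flows qualify */
#define OFFLOAD_MAX_ENTRIES 4u          /* hard cap on fast-path entries */

struct toy_conn {
	uint64_t bytes;      /* bytes seen so far on the slow path */
	bool     offloaded;  /* already has a fast-path entry */
};

struct toy_table {
	size_t entries;      /* current number of offloaded connections */
};

/* Account one packet; returns true if the connection is (now) offloaded. */
static bool toy_account_packet(struct toy_table *t, struct toy_conn *c,
			       uint32_t len)
{
	assert(t != NULL && c != NULL);

	if (c->offloaded)
		return true;

	c->bytes += len;
	if (c->bytes < OFFLOAD_MIN_BYTES)
		return false;   /* short flow: stays on the slow path */
	if (t->entries >= OFFLOAD_MAX_ENTRIES)
		return false;   /* table full: no new entry is created */

	c->offloaded = true;
	t->entries++;
	return true;
}
```

In this model a flood of short-lived connections never creates a fast-path entry, and the table cannot grow past the cap, which is the property the reply argues for.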

Re: iptables-save - suggest patch to add functionality

2018-03-12 Thread Alban Vidal

Package: iptables

Dear Maintainers,

On 11/03/2018 at 21:57, Pablo Neira Ayuso wrote:
> Hi Alban,
>
> On Tue, Jan 23, 2018 at 11:44:22AM +0100, Alban Vidal wrote:
>> 1) Adding -z or --zero option: Reset to zero counters of the chains.
> I have no objections to this -z feature, but better use -Z uppercase
> instead, so we match it with the existing -Z in iptables that only
> refers to chains too.
>
> A single patch for this new feature is preferred.
> Could you also update xtables-save BTW? This is the compat tool to
> save iptables-compat listings from nftables.

The first patch is attached. I have switched to the uppercase -Z option and
updated the man page.
xtables-save is also updated.

Output examples:

iptables-save -Z
# Generated by iptables-save v1.6.2 on Mon Mar 12 23:30:16 2018
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
(...)

xtables-multi save4 -Z
# Generated by iptables-save v1.6.2 on Mon Mar 12 23:30:42 2018
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
(...)

>> 2) Adding -h or --help option: print help/usage (inspired by manpage)
> Fine, but place this in a separated patch, no need for common file.
> Don't bother about copy and paste.

I will send the second patch, for the -h option, once you have pushed the first.

>> diff --git a/iptables/ip6tables-save.c b/iptables/ip6tables-save.c
>> index 8e3a6afd..466ce0ce 100644
>> --- a/iptables/ip6tables-save.c
>> +++ b/iptables/ip6tables-save.c
>> @@ -3,6 +3,8 @@
>>   * Original code: iptables-save
>>   * Authors: Paul 'Rusty' Russel  and
>>   *  Harald Welte 
>> + * Contributor: Alban Vidal 
> These days, git already registers this, previous lines are just there
> for historical reasons. So please, remove this.
Done! Removed from the source code.

Best regards,
Alban Vidal
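
[Archive note] The option handling discussed above can be sketched in isolation. This is a minimal sketch, not the iptables source: struct save_opts, parse_save_opts() and format_policy() are hypothetical names, and only getopt_long() and snprintf() are real library calls. As in the posted diff, -Z here only changes the counters that are printed.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stdio.h>

struct save_opts {
	bool show_counters;   /* -c, --counters */
	bool zero_counters;   /* -Z, --zero */
};

static const struct option long_opts[] = {
	{ .name = "counters", .has_arg = no_argument, .val = 'c' },
	{ .name = "zero",     .has_arg = no_argument, .val = 'Z' },
	{ 0 },
};

/* Parse argv into opts; returns 0 on success, -1 on an unknown option. */
static int parse_save_opts(int argc, char *argv[], struct save_opts *opts)
{
	int c;

	assert(opts != NULL);
	optind = 1;   /* restart scanning so the function can be called again */
	while ((c = getopt_long(argc, argv, "cZ", long_opts, NULL)) != -1) {
		switch (c) {
		case 'c':
			opts->show_counters = true;
			break;
		case 'Z':
			opts->zero_counters = true;
			break;
		default:
			return -1;
		}
	}
	return 0;
}

/* Format one chain policy line; with -Z the printed counters are zero. */
static int format_policy(char *buf, size_t len, const char *chain,
			 const char *policy, unsigned long long pkts,
			 unsigned long long bytes,
			 const struct save_opts *opts)
{
	if (opts->zero_counters)
		pkts = bytes = 0;
	return snprintf(buf, len, ":%s %s [%llu:%llu]",
			chain, policy, pkts, bytes);
}
```

Called with "-Z", format_policy() produces lines of the same shape as the ":INPUT ACCEPT [0:0]" output shown in the message.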


iptables-save_patch1.tar.gz
Description: application/gzip
diff --git a/iptables/ip6tables-save.c b/iptables/ip6tables-save.c
index 8e3a6afd..a94beffc 100644
--- a/iptables/ip6tables-save.c
+++ b/iptables/ip6tables-save.c
@@ -19,11 +19,15 @@
 #include "ip6tables.h"
 #include "ip6tables-multi.h"
 
-static int show_counters;
+static int show_counters = false;
+
+/* if true (opt -Z, --zero): Reset to zero counters of the chains */
+static int rst_chain_counters = false;
 
 static const struct option options[] = {
 	{.name = "counters", .has_arg = false, .val = 'c'},
 	{.name = "dump", .has_arg = false, .val = 'd'},
+	{.name = "zero", .has_arg = false, .val = 'Z'},
 	{.name = "table",.has_arg = true,  .val = 't'},
 	{.name = "modprobe", .has_arg = true,  .val = 'M'},
 	{.name = "file", .has_arg = true,  .val = 'f'},
@@ -96,7 +100,13 @@ static int do_output(const char *tablename)
 			struct xt_counters count;
 			printf("%s ",
 			   ip6tc_get_policy(chain, &count, h));
-			printf("[%llu:%llu]\n", (unsigned long long)count.pcnt, (unsigned long long)count.bcnt);
+			if (!rst_chain_counters) {
+				/* Default value, print count */
+				printf("[%llu:%llu]\n", (unsigned long long)count.pcnt, (unsigned long long)count.bcnt);
+			} else {
+				/* Reset to zero counters of the chains */
+				printf("[0:0]\n");
+			}
 		} else {
 			printf("- [0:0]\n");
 		}
@@ -146,15 +156,17 @@ int ip6tables_save_main(int argc, char *argv[])
 	init_extensions6();
 #endif
 
-	while ((c = getopt_long(argc, argv, "bcdt:M:f:", options, NULL)) != -1) {
+	while ((c = getopt_long(argc, argv, "bcZdt:M:f:", options, NULL)) != -1) {
 		switch (c) {
 		case 'b':
 			fprintf(stderr, "-b/--binary option is not implemented\n");
 			break;
 		case 'c':
-			show_counters = 1;
+			show_counters = true;
+			break;
+		case 'Z':
+			rst_chain_counters = true;
 			break;
-
 		case 't':
 			/* Select specific table. */
 			tablename = optarg;
diff --git a/iptables/iptables-save.8.in b/iptables/iptables-save.8.in
index 51e11f3e..200d6448 100644
--- a/iptables/iptables-save.8.in
+++ b/iptables/iptables-save.8.in
@@ -24,10 +24,10 @@ iptables-save \(em dump iptables rules
 ip6tables-save \(em dump iptables rules
 .SH SYNOPSIS
 \fBiptables\-save\fP [\fB\-M\fP \fImodprobe\fP] [\fB\-c\fP]
-[\fB\-t\fP \fItable\fP] [\fB\-f\fP \fIfilename\fP]
+[\fB\-Z\fP] [\fB\-t\fP \fItable\fP] [\fB\-f\fP \fIfilename\fP]
 .P
 \fBip6tables\-save\fP [\fB\-M\fP \fImodprobe\fP] [\fB\-c\fP]
-[\fB\-t\fP \fItable\fP] [\fB\-f\fP \fIfilename\fP]
+[\fB\-Z\fP] [\fB\-t\fP \fItable\fP] [\fB\-f\fP \fIfilename\fP]
 .SH DESCRIPTION
 .PP
 .B iptables-save
@@ -45,19 +45,24 @@ Specify a filename to log the output to. If not specified, iptables-save
 will log to STDOUT.
 .TP
 \fB\-c\fR, \fB\-\-counters\fR
-include the current values of all packet and byte counters in the output
+Include the current values of all packet and byte counters in the output.
+.TP
+\fB\-Z\fR, \fB\-\-zero\fR
+Reset to zero counters of the chains.
 .TP
 \fB\-t\fR, \fB\-\-table\fR \fItablename\fP
-restrict output to only one table. If not specified, output includes all
+Restrict output to only one table. If not specified,

[PATCH] netfilter: cttimeout: remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva

In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.

From a security viewpoint, the use of Variable Length Arrays can be
a vector for stack overflow attacks. Also, in general, as the code
evolves it is easy to lose track of how big a VLA can get. Thus, we
can end up having segfaults that are hard to debug.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 
---
 net/netfilter/nfnetlink_cttimeout.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/net/netfilter/nfnetlink_cttimeout.c b/net/netfilter/nfnetlink_cttimeout.c
index 6819300..dcd7bd3 100644
--- a/net/netfilter/nfnetlink_cttimeout.c
+++ b/net/netfilter/nfnetlink_cttimeout.c
@@ -51,19 +51,27 @@ ctnl_timeout_parse_policy(void *timeouts,
  const struct nf_conntrack_l4proto *l4proto,
  struct net *net, const struct nlattr *attr)
 {
+   struct nlattr **tb;
int ret = 0;
 
-   if (likely(l4proto->ctnl_timeout.nlattr_to_obj)) {
-   struct nlattr *tb[l4proto->ctnl_timeout.nlattr_max+1];
+   if (!l4proto->ctnl_timeout.nlattr_to_obj)
+   return 0;
 
-   ret = nla_parse_nested(tb, l4proto->ctnl_timeout.nlattr_max,
-  attr, l4proto->ctnl_timeout.nla_policy,
-  NULL);
-   if (ret < 0)
-   return ret;
+   tb = kcalloc(l4proto->ctnl_timeout.nlattr_max + 1, sizeof(*tb),
+GFP_KERNEL);
 
-   ret = l4proto->ctnl_timeout.nlattr_to_obj(tb, net, timeouts);
-   }
+   if (!tb)
+   return -ENOMEM;
+
+   ret = nla_parse_nested(tb, l4proto->ctnl_timeout.nlattr_max, attr,
+  l4proto->ctnl_timeout.nla_policy, NULL);
+   if (ret < 0)
+   goto err;
+
+   ret = l4proto->ctnl_timeout.nlattr_to_obj(tb, net, timeouts);
+
+err:
+   kfree(tb);
return ret;
 }
 
-- 
2.7.4
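
[Archive note] The pattern in this patch (a stack VLA replaced by a heap allocation that is released on a single exit path) can be shown with a userspace analogue. This is a sketch under stated assumptions: calloc()/free() stand in for kcalloc()/kfree(), and fill_table() and sum_with_heap_table() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parser that fills a table of max + 1 slots. */
static int fill_table(const int **tb, size_t max, const int *src)
{
	for (size_t i = 0; i <= max; i++)
		tb[i] = &src[i];
	return 0;
}

/* Sum src[0..max] through a temporary pointer table kept on the heap. */
static int sum_with_heap_table(const int *src, size_t max, long *out)
{
	const int **tb;
	int ret;

	assert(src != NULL && out != NULL);

	/* Before: const int *tb[max + 1];  (a VLA sized by runtime data) */
	tb = calloc(max + 1, sizeof(*tb));
	if (!tb)
		return -ENOMEM;

	ret = fill_table(tb, max, src);
	if (ret < 0)
		goto out;

	*out = 0;
	for (size_t i = 0; i <= max; i++)
		*out += *tb[i];
out:
	free(tb);   /* one exit path frees the table on success and on error */
	return ret;
}
```

The trade-off is an allocation that can fail and must be freed on every path, in exchange for a stack frame whose size no longer depends on runtime data.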


Re: [PATCH] netfilter: cttimeout: remove VLA usage

2018-03-12 Thread Joe Perches

On Mon, 2018-03-12 at 18:14 -0500, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wvla, remove VLA and replace it
> with dynamic memory allocation.
> 
> From a security viewpoint, the use of Variable Length Arrays can be
> a vector for stack overflow attacks. Also, in general, as the code
> evolves it is easy to lose track of how big a VLA can get. Thus, we
> can end up having segfaults that are hard to debug.
> 
> Also, fixed as part of the directive to remove all VLAs from
[]
> diff --git a/net/netfilter/nfnetlink_cttimeout.c b/net/netfilter/nfnetlink_cttimeout.c
[]
> @@ -51,19 +51,27 @@ ctnl_timeout_parse_policy(void *timeouts,
> const struct nf_conntrack_l4proto *l4proto,
> struct net *net, const struct nlattr *attr)
>  {
> + struct nlattr **tb;
>   int ret = 0;
>  
> - if (likely(l4proto->ctnl_timeout.nlattr_to_obj)) {
> - struct nlattr *tb[l4proto->ctnl_timeout.nlattr_max+1];
> + if (!l4proto->ctnl_timeout.nlattr_to_obj)
> + return 0;

Why not
if unlikely(!...)

>  
> - ret = nla_parse_nested(tb, l4proto->ctnl_timeout.nlattr_max,
> -attr, l4proto->ctnl_timeout.nla_policy,
> -NULL);
> - if (ret < 0)
> - return ret;
> + tb = kcalloc(l4proto->ctnl_timeout.nlattr_max + 1, sizeof(*tb),
> +  GFP_KERNEL);

kmalloc_array?

>  
> - ret = l4proto->ctnl_timeout.nlattr_to_obj(tb, net, timeouts);
> - }
> + if (!tb)
> + return -ENOMEM;
> +
> + ret = nla_parse_nested(tb, l4proto->ctnl_timeout.nlattr_max, attr,
> +l4proto->ctnl_timeout.nla_policy, NULL);
> + if (ret < 0)
> + goto err;
> +
> + ret = l4proto->ctnl_timeout.nlattr_to_obj(tb, net, timeouts);
> +
> +err:
> + kfree(tb);
>   return ret;
>  }
>  
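
[Archive note] The kmalloc_array suggestion above is presumably about skipping a redundant zeroing pass when the callee initializes the table itself. A userspace sketch of that idea follows, with loud assumptions: malloc_array() is a hypothetical analogue of kmalloc_array() (overflow-checked, not zeroed), and parse_into() is a hypothetical parser that is assumed to clear the table before filling it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace analogue of kmalloc_array(): overflow-checked, NOT zeroed. */
static void *malloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;   /* n * size would overflow */
	return malloc(n * size);
}

/* Hypothetical parser that clears the whole table before filling it. */
static void parse_into(int **tb, size_t n, int *first)
{
	assert(tb != NULL);
	memset(tb, 0, n * sizeof(*tb));
	if (n > 0)
		tb[0] = first;
}

/* Returns 1 if only slot 0 is set after parsing, 0 if not, -1 on OOM. */
static int demo_unzeroed_table(size_t n)
{
	int value = 42;
	int **tb = malloc_array(n, sizeof(*tb));
	int ok = 1;

	if (!tb)
		return -1;
	parse_into(tb, n, &value);
	if (n > 0 && tb[0] != &value)
		ok = 0;
	for (size_t i = 1; i < n; i++)
		if (tb[i] != NULL)
			ok = 0;
	free(tb);
	return ok;
}
```

If the callee did not clear the table, the zeroing variant (calloc here, kcalloc in the patch) would be the safer choice.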

[PATCH] netfilter: nfnetlink_cthelper: Remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva

In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.

From a security viewpoint, the use of Variable Length Arrays can be
a vector for stack overflow attacks. Also, in general, as the code
evolves it is easy to lose track of how big a VLA can get. Thus, we
can end up having segfaults that are hard to debug.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 
---
 net/netfilter/nfnetlink_cthelper.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nfnetlink_cthelper.c b/net/netfilter/nfnetlink_cthelper.c
index d33ce6d..4a4b293 100644
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -314,23 +314,30 @@ nfnl_cthelper_update_policy_one(const struct nf_conntrack_expect_policy *policy,
 static int nfnl_cthelper_update_policy_all(struct nlattr *tb[],
   struct nf_conntrack_helper *helper)
 {
-   struct nf_conntrack_expect_policy new_policy[helper->expect_class_max + 1];
+   struct nf_conntrack_expect_policy *new_policy;
struct nf_conntrack_expect_policy *policy;
-   int i, err;
+   int i, ret = 0;
+
+   new_policy = kmalloc_array(helper->expect_class_max + 1,
+  sizeof(*new_policy), GFP_KERNEL);
+   if (!new_policy)
+   return -ENOMEM;
 
/* Check first that all policy attributes are well-formed, so we don't
 * leave things in inconsistent state on errors.
 */
for (i = 0; i < helper->expect_class_max + 1; i++) {
 
-   if (!tb[NFCTH_POLICY_SET + i])
-   return -EINVAL;
+   if (!tb[NFCTH_POLICY_SET + i]) {
+   ret = -EINVAL;
+   goto err;
+   }
 
-   err = nfnl_cthelper_update_policy_one(&helper->expect_policy[i],
+   ret = nfnl_cthelper_update_policy_one(&helper->expect_policy[i],
  &new_policy[i],
  tb[NFCTH_POLICY_SET + i]);
-   if (err < 0)
-   return err;
+   if (ret < 0)
+   goto err;
}
/* Now we can safely update them. */
for (i = 0; i < helper->expect_class_max + 1; i++) {
@@ -340,7 +347,9 @@ static int nfnl_cthelper_update_policy_all(struct nlattr *tb[],
policy->timeout = new_policy->timeout;
}
 
-   return 0;
+err:
+   kfree(new_policy);
+   return ret;
 }
 
 static int nfnl_cthelper_update_policy(struct nf_conntrack_helper *helper,
-- 
2.7.4


[PATCH] netfilter: nf_tables: remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva

In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.

From a security viewpoint, the use of Variable Length Arrays can be
a vector for stack overflow attacks. Also, in general, as the code
evolves it is easy to lose track of how big a VLA can get. Thus, we
can end up having segfaults that are hard to debug.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 
---
 net/netfilter/nf_tables_api.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3f815b6..ea76903 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4357,16 +4357,20 @@ static struct nft_object *nft_obj_init(const struct nft_ctx *ctx,
   const struct nft_object_type *type,
   const struct nlattr *attr)
 {
-   struct nlattr *tb[type->maxattr + 1];
+   struct nlattr **tb;
const struct nft_object_ops *ops;
struct nft_object *obj;
-   int err;
+   int err = -ENOMEM;
+
+   tb = kcalloc(type->maxattr + 1, sizeof(*tb), GFP_KERNEL);
+   if (!tb)
+   goto err1;
 
if (attr) {
err = nla_parse_nested(tb, type->maxattr, attr, type->policy,
   NULL);
if (err < 0)
-   goto err1;
+   goto err2;
} else {
memset(tb, 0, sizeof(tb[0]) * (type->maxattr + 1));
}
@@ -4375,7 +4379,7 @@ static struct nft_object *nft_obj_init(const struct nft_ctx *ctx,
ops = type->select_ops(ctx, (const struct nlattr * const *)tb);
if (IS_ERR(ops)) {
err = PTR_ERR(ops);
-   goto err1;
+   goto err2;
}
} else {
ops = type->ops;
@@ -4383,18 +4387,21 @@ static struct nft_object *nft_obj_init(const struct nft_ctx *ctx,
 
err = -ENOMEM;
obj = kzalloc(sizeof(*obj) + ops->size, GFP_KERNEL);
-   if (obj == NULL)
-   goto err1;
+   if (!obj)
+   goto err2;
 
err = ops->init(ctx, (const struct nlattr * const *)tb, obj);
if (err < 0)
-   goto err2;
+   goto err3;
 
obj->ops = ops;
 
+   kfree(tb);
return obj;
-err2:
+err3:
kfree(obj);
+err2:
+   kfree(tb);
 err1:
return ERR_PTR(err);
 }
-- 
2.7.4
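
[Archive note] The err1/err2/err3 renumbering in this patch follows the usual stacked-label unwinding idiom: each label releases one resource and falls through to the labels for resources acquired earlier. A minimal userspace sketch, assuming hypothetical names (struct toy_obj, toy_obj_init(), and the fail_at knob used only to force each error path):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_obj {
	int value;
};

/*
 * Allocate a scratch table and an object. fail_at forces a failure at one
 * stage (1 = table allocation, 2 = object allocation, 3 = init, 0 = none).
 * Returns the object, or NULL with the error code stored in *err.
 */
static struct toy_obj *toy_obj_init(int fail_at, int *err)
{
	struct toy_obj *obj;
	int *tb;

	assert(err != NULL);

	*err = -ENOMEM;
	tb = fail_at == 1 ? NULL : calloc(4, sizeof(*tb));
	if (!tb)
		goto err1;      /* nothing to release yet */

	obj = fail_at == 2 ? NULL : calloc(1, sizeof(*obj));
	if (!obj)
		goto err2;      /* release tb only */

	if (fail_at == 3) {
		*err = -EINVAL;
		goto err3;      /* release obj, then tb */
	}
	obj->value = tb[0] + 1;

	*err = 0;
	free(tb);               /* scratch table is not needed on success */
	return obj;
err3:
	free(obj);
err2:
	free(tb);
err1:
	return NULL;
}
```

Adding a resource at the front of the function, as the patch does with tb, shifts every later label by one, which is why the diff touches each goto.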


Re: [PATCH 2/2] ebtables: Add string filter

2018-03-12 Thread Bernie Harris

Hi Pablo, thanks for the reply. Just wanted to clarify your first comment below:

On Mon, Mar 12, 2018 at 09:41:00AM +0100, Pablo Neira Ayuso wrote:
> To: Bernie Harris
> Cc: netfilter-devel@vger.kernel.org; kad...@blackhole.kfki.hu; 
> f...@strlen.de; da...@davemloft.net
> Subject: Re: [PATCH 2/2] ebtables: Add string filter
> 
> Hi Bernie,
> 
> A few comments below.
> 
> On Tue, Feb 27, 2018 at 10:58:35AM +1300, Bernie Harris wrote:
> > This patch is part of a proposal to add a string filter to
> > ebtables, which would be similar to the string filter in
> > iptables.
> >
> > Like iptables, the ebtables filter uses the xt_string module,
> > however some modifications have been made for this to work
> > correctly.
> >
> > Currently ebtables assumes that the revision number of all
> > match modules is 0. The xt_string module doesn't register a match
> > with revision 0 so the solution is to modify ebtables to allow
> > extensions to specify a revision number, similar to iptables.
> > This gets passed down to the kernel, which is then able to find
> > the match module correctly.
> >
> > Signed-off-by: Bernie Harris 
> > ---
> >  include/uapi/linux/netfilter_bridge/ebtables.h |  5 -
> >  net/bridge/netfilter/ebtables.c| 12 
> >  net/netfilter/xt_string.c  |  1 +
> >  3 files changed, 13 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/uapi/linux/netfilter_bridge/ebtables.h b/include/uapi/linux/netfilter_bridge/ebtables.h
> > index 9ff57c0a0199..2143d5623d3b 100644
> > --- a/include/uapi/linux/netfilter_bridge/ebtables.h
> > +++ b/include/uapi/linux/netfilter_bridge/ebtables.h
> > @@ -120,7 +120,10 @@ struct ebt_entries {
> >
> >  struct ebt_entry_match {
> >   union {
> > - char name[EBT_FUNCTION_MAXNAMELEN];
> > + struct {
> > + char name[EBT_FUNCTION_MAXNAMELEN];
> > + uint8_t revision;
> 
> EBT_FUNCTION_MAXNAMELEN needs to be adjusted too to scratch this
> revision byte field. Otherwise, we break backward binary
> compatibility.
> 

By this did you mean reduce EBT_FUNCTION_MAXNAMELEN by 1? Though I assume that
would break a small number of existing setups. Alternatively, is there some way
of adding a new ebt_entry_match_v2 structure that includes a revision field?

Thanks
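
[Archive note] The binary-compatibility concern can be checked with a small layout experiment. This is a sketch of one reading of the review comment, not the kernel header: the toy_* unions are hypothetical, void * stands in for struct xt_match *, and the value 32 for the name length is an assumption about EBT_FUNCTION_MAXNAMELEN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FUNCTION_MAXNAMELEN 32   /* assumed EBT_FUNCTION_MAXNAMELEN */

/* Layout before the patch: the name fills the whole union. */
union toy_match_old {
	char name[TOY_FUNCTION_MAXNAMELEN];
	void *match;
};

/* Layout as posted: a revision byte is appended, so the union grows. */
union toy_match_grown {
	struct {
		char name[TOY_FUNCTION_MAXNAMELEN];
		uint8_t revision;
	};
	void *match;
};

/* Alternative: shorten the name by one byte so the size is unchanged. */
union toy_match_same {
	struct {
		char name[TOY_FUNCTION_MAXNAMELEN - 1];
		uint8_t revision;
	};
	void *match;
};

static_assert(sizeof(union toy_match_grown) > sizeof(union toy_match_old),
	      "appending a byte changes the size seen by old binaries");
static_assert(sizeof(union toy_match_same) == sizeof(union toy_match_old),
	      "taking the byte from the name keeps the size");

/* Offset of the revision byte in the size-preserving layout. */
static size_t toy_revision_offset(void)
{
	return offsetof(union toy_match_same, revision);
}
```

In the size-preserving layout the revision occupies what used to be the last byte of the name, so an old binary that zero-fills a shorter name would be read as revision 0, while a name that used all 31 characters plus the terminator would no longer fit.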

> > + };
> >   struct xt_match *match;
> >   } u;
> >   /* size of data */
> > diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
> > index 02c4b409d317..6e55f3437fc8 100644
> > --- a/net/bridge/netfilter/ebtables.c
> > +++ b/net/bridge/netfilter/ebtables.c
> > @@ -358,12 +358,12 @@ ebt_check_match(struct ebt_entry_match *m, struct xt_mtchk_param *par,
> >   left - sizeof(struct ebt_entry_match) < m->match_size)
> >   return -EINVAL;
> >
> > - match = xt_find_match(NFPROTO_BRIDGE, m->u.name, 0);
> > + match = xt_find_match(NFPROTO_BRIDGE, m->u.name, m->u.revision);
> >   if (IS_ERR(match) || match->family != NFPROTO_BRIDGE) {
> >   if (!IS_ERR(match))
> >   module_put(match->me);
> >   request_module("ebt_%s", m->u.name);
> > - match = xt_find_match(NFPROTO_BRIDGE, m->u.name, 0);
> > + match = xt_find_match(NFPROTO_BRIDGE, m->u.name, m->u.revision);
> >   }
> >   if (IS_ERR(match))
> >   return PTR_ERR(match);
> > @@ -1604,7 +1604,10 @@ struct compat_ebt_replace {
> >  /* struct ebt_entry_match, _target and _watcher have same layout */
> >  struct compat_ebt_entry_mwt {
> >   union {
> > - char name[EBT_FUNCTION_MAXNAMELEN];
> > + struct {
> > + char name[EBT_FUNCTION_MAXNAMELEN];
> > + u8 revision;
> > + };
> >   compat_uptr_t ptr;
> >   } u;
> >   compat_uint_t match_size;
> > @@ -1948,7 +1951,8 @@ static int compat_mtw_from_user(struct compat_ebt_entry_mwt *mwt,
> >
> >   switch (compat_mwt) {
> >   case EBT_COMPAT_MATCH:
> > - match = xt_request_find_match(NFPROTO_BRIDGE, name, 0);
> > + match = xt_request_find_match(NFPROTO_BRIDGE, name,
> > +   mwt->u.revision);
> >   if (IS_ERR(match))
> >   return PTR_ERR(match);
> >
> 
> Could you split this in two patches? One to add basic revision
> infrastructure to ebtables; and another one - oneliner patch
> containing the chunk below - to string matching support.
> 
> Thanks!
> 
> > diff --git a/net/netfilter/xt_string.c b/net/netfilter/xt_string.c
> > index 423293ee57c2..be1feddadcf0 100644
> > --- a/net/netfilter/xt_string.c
> > +++ b/net/netfilter/xt_string.c
> > @@ -21,6 +21,7 @@ MODULE_DESCRIPTION("Xtables: string-based matching");
> >  MODULE_LICENSE("GPL");
> >  MODULE_ALIAS("ipt_string");
> >  MODULE_ALIAS("ip6t_string");
> > +MODULE_ALIAS("ebt_strin

[PATCH v2] netfilter: nf_tables: remove VLA usage

2018-03-12 Thread Gustavo A. R. Silva

In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.

From a security viewpoint, the use of Variable Length Arrays can be
a vector for stack overflow attacks. Also, in general, as the code
evolves it is easy to lose track of how big a VLA can get. Thus, we
can end up having segfaults that are hard to debug.

Also, fixed as part of the directive to remove all VLAs from
the kernel: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Gustavo A. R. Silva 
---
Changes in v2:
 - Use kmalloc_array instead of kcalloc.

 net/netfilter/nf_tables_api.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3f815b6..5a42e97 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4357,16 +4357,20 @@ static struct nft_object *nft_obj_init(const struct nft_ctx *ctx,
   const struct nft_object_type *type,
   const struct nlattr *attr)
 {
-   struct nlattr *tb[type->maxattr + 1];
+   struct nlattr **tb;
const struct nft_object_ops *ops;
struct nft_object *obj;
-   int err;
+   int err = -ENOMEM;
+
+   tb = kmalloc_array(type->maxattr + 1, sizeof(*tb), GFP_KERNEL);
+   if (!tb)
+   goto err1;
 
if (attr) {
err = nla_parse_nested(tb, type->maxattr, attr, type->policy,
   NULL);
if (err < 0)
-   goto err1;
+   goto err2;
} else {
memset(tb, 0, sizeof(tb[0]) * (type->maxattr + 1));
}
@@ -4375,7 +4379,7 @@ static struct nft_object *nft_obj_init(const struct nft_ctx *ctx,
ops = type->select_ops(ctx, (const struct nlattr * const *)tb);
if (IS_ERR(ops)) {
err = PTR_ERR(ops);
-   goto err1;
+   goto err2;
}
} else {
ops = type->ops;
@@ -4383,18 +4387,21 @@ static struct nft_object *nft_obj_init(const struct nft_ctx *ctx,
 
err = -ENOMEM;
obj = kzalloc(sizeof(*obj) + ops->size, GFP_KERNEL);
-   if (obj == NULL)
-   goto err1;
+   if (!obj)
+   goto err2;
 
err = ops->init(ctx, (const struct nlattr * const *)tb, obj);
if (err < 0)
-   goto err2;
+   goto err3;
 
obj->ops = ops;
 
+   kfree(tb);
return obj;
-err2:
+err3:
kfree(obj);
+err2:
+   kfree(tb);
 err1:
return ERR_PTR(err);
 }
-- 
2.7.4


Re: [PATCH v3 0/17] netfilter: nf_flow_table: refactoring, TCP state tracking, sending flows to slow path

2018-03-12 Thread Rafał Miłecki


On Mon, 5 Mar 2018 23:11:38 +0100, Pablo Neira Ayuso wrote:
> On Mon, Feb 26, 2018 at 10:15:07AM +0100, Felix Fietkau wrote:
> > Fixes issues with connections hanging after >30 seconds idle time.
> >
> > Changes since v2:
> > - Include the previous patch series
> > - Rebase to current nf.git
> > - Provide longer description for the teardown state and the changes
> >   for passing flows back to the slow path
> >
> > Changes since v1:
> > - Fix up connection tracking state earlier to improve processing of TCP
> >   FIN/RST that trigger the bump to the slow path.
> > - Fix the value of ct->proto.tcp.state, reset the window values to force
> >   the tcp window check to resync
>
> Series applied, thanks Felix.

Hi Pablo,

I just noticed net-next.git already got net.git merged and contains
Felix's DNAT fix.

Just letting you know, in case you have a moment to look at remaining
patches. Thanks a lot for taking care of Felix's work! I'm really
excited about this feature hitting OpenWrt/LEDE :)
