Pablo Neira Ayuso wrote:
> On Mon, Apr 24, 2017 at 03:37:38PM +0200, Florian Westphal wrote:
> > low hanging fruits to speed up netns cleanup in netfilter.
> > We're way too happy to issue expensive synchronize_rcu() all
> > over the place.
> >
> > On my test vm 8 processes doing 32 unshare each
On Wed, Apr 19, 2017 at 11:39:20AM -0700, Matthias Kaehlcke wrote:
> Not all parameters passed to ctnetlink_parse_tuple() and
> ctnetlink_exp_dump_tuple() match the enum type in the signatures of these
> functions. Since this is intended, change the argument type to be an int
> value.
>
> Signe
From: Pablo Neira Ayuso
Date: Mon, 1 May 2017 12:53:53 +0200
> On Mon, May 01, 2017 at 12:46:27PM +0200, Pablo Neira Ayuso wrote:
>> Hi David,
>>
>> The following patchset contains Netfilter updates for your net-next
>> tree. A large bunch of code cleanups, XXX they are:
>
> Oh well, in case yo
From: Pablo Neira Ayuso
Date: Mon, 1 May 2017 12:46:27 +0200
> The following patchset contains Netfilter updates for your net-next
> tree. A large bunch of code cleanups, XXX they are:
...
> You can pull these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git
Hello,
The following patches are rediffs for "ipvs: SNAT packet replies
only for NATed connections" for different stable kernels:
The official patch for the net tree (will come from Simon) works for:
4.9.25, 4.10.13, 4.11, net tree
Patch 1: 3.2.88, 3.4.113
Patch 2: 3.10.105, 3.12.73, 3.1
We do not check if packet from real server is for NAT
connection before performing SNAT. This causes problems
for setups that use DR/TUN and allow local clients to
access the real server directly, for example:
- local client in director creates IPVS-DR/TUN connection
CIP->VIP and the request packe
We do not check if packet from real server is for NAT
connection before performing SNAT. This causes problems
for setups that use DR/TUN and allow local clients to
access the real server directly, for example:
- local client in director creates IPVS-DR/TUN connection
CIP->VIP and the request packe
We do not check if packet from real server is for NAT
connection before performing SNAT. This causes problems
for setups that use DR/TUN and allow local clients to
access the real server directly, for example:
- local client in director creates IPVS-DR/TUN connection
CIP->VIP and the request packe
If no NLM_F_EXCL is set and the element already exists in the set, make
sure that both elements have the same extensions.
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nf_tables_api.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_t
BTW, I'll be routing this patchset through nf.git, this depends on a
fix from Liping Zhang. Just to keep it simple, this batch is
nevertheless coming with two fixes.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
M
On Mon, May 01, 2017 at 12:46:27PM +0200, Pablo Neira Ayuso wrote:
> Hi David,
>
> The following patchset contains Netfilter updates for your net-next
> tree. A large bunch of code cleanups, XXX they are:
Oh well, in case you can amend the XXX thing there, David. This should be
instead...
The follow
From: Arushi Singhal
These comments are obsolete and should go, as there is no set of rules
per CPU anymore.
Signed-off-by: Arushi Singhal
---
net/ipv6/netfilter/ip6_tables.c | 9 -
1 file changed, 9 deletions(-)
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_ta
From: Gao Feng
The expect check function __nf_ct_expect_check() requires master_help to
be present. So it is unnecessary to go ahead in ctnetlink_alloc_expect
when there is no help.
Actually the commit bc01befdcf3e ("netfilter: ctnetlink: add support for
user-space expectation helpers") permits c
From: Florian Westphal
Check for the NAT status bits, they are set once conntrack needs NAT in
source or reply direction, this is slightly faster than nfct_nat() as that
has to check the extension area.
Signed-off-by: Florian Westphal
---
net/netfilter/ipvs/ip_vs_ftp.c | 2 +-
1 file changed
From: Florian Westphal
successful insert into the bysource hash sets IPS_SRC_NAT_DONE status bit
so we can check that instead of presence of nat extension which requires
extra deref.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nf_conntrack_core.c | 2 +-
From: Arushi Singhal
Remove & from function pointers to conform to the style found elsewhere
in the file. Done using the following semantic patch
//
@r@
identifier f;
@@
f(...) { ... }
@@
identifier r.f;
@@
- &f
+ f
//
Signed-off-by: Arushi Singhal
Signed-off-by: Pablo Neira Ayuso
---
ne
From: Gao Feng
nf_nat_mangle_{udp,tcp}_packet() returns int. However, it is used as
bool type in many spots. Fix this by consistently handling this return
value as a boolean.
Signed-off-by: Gao Feng
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_nat_helper.h | 36 +++-
From: Florian Westphal
add a 32 byte scratch area in the helper struct instead of relying
on variable sized helpers plus compile-time asserts to let us know
if 32 bytes aren't enough anymore.
Not having variable sized helpers will later allow to add BUILD_BUG_ON
for the total size of conntrack e
From: Gao Feng
1. Remove the lone !events condition check so that a missed event is
delivered even when no new event has happened.
Consider this case:
1) nf_ct_deliver_cached_events is invoked for the first time, the
event fails to deliver, then missed is set.
2) nf_ct_deliver_cached_events
From: Taehee Yoo
__nf_nat_decode_session is called from nf_nat_decode_session as decodefn.
Before calling decodefn, it already takes rcu_read_lock, so rcu_read_lock in
__nf_nat_decode_session can be removed.
Signed-off-by: Taehee Yoo
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nf_nat_cor
From: Gao Feng
There are two nf_conntrack_l4proto_udp4 declarations in the header file
nf_conntrack_ipv4/6.h. Remove the one that is not enclosed by the macro
CONFIG_NF_CT_PROTO_UDPLITE.
Signed-off-by: Gao Feng
---
include/net/netfilter/ipv4/nf_conntrack_ipv4.h | 1 -
include/net/netfilter/ipv6
From: Florian Westphal
By default the kernel emits all ctnetlink events for a connection.
This allows to select the types of events to generate.
This can be used to e.g. only send DESTROY events but no NEW/UPDATE ones
and will work even if sysctl net.netfilter.nf_conntrack_events is set to 0.
T
From: Florian Westphal
its definition is not needed in nf_conntrack.h.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_conntrack.h| 19 ---
include/net/netfilter/nf_conntrack_helper.h | 17 +
2 files changed
From: Florian Westphal
This function is now obsolete and always returns false.
This change has no effect on generated code.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/net/ip_vs.h | 4 ++--
include/net/netfilter/nf_conntrack.h | 5
From: Florian Westphal
If insertion of a new conntrack fails because the table is full, the kernel
searches the next buckets of the hash slot where the new connection
was supposed to be inserted at for an entry that hasn't seen traffic
in reply direction (non-assured), if it finds one, that entry
From: Aaron Conole
Signed-off-by: Aaron Conole
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nf_tables_api.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 22e191ad4468..91e9191a43d8 100644
--- a/net/netfilter/nf_ta
From: Florian Westphal
get rid of the (now unused) nf_ct_ext_add_length define and also
rename the function to plain nf_ct_ext_add().
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_conntrack_extend.h | 8 +---
net/netfilter/nf_conntrack_exte
From: Florian Westphal
The commit ab8bc7ed864b9c4f1fcb00a22bbe4e0f66ce8003
("netfilter: remove nf_ct_is_untracked")
changed the line
if (ct && !nf_ct_is_untracked(ct) && nfct_nat(ct)) {
to
if (ct && nfct_nat(ct)) {
meanwhile, the commit 41390895e50bc4f28abe384c6b35ac27464a20ec
(
From: Florian Westphal
Userspace should not abuse the kernel to store large amounts of data,
reject requests larger than the private area can accommodate.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nfnetlink_cthelper.c | 10 --
1 file changed, 8
From: Aaron Conole
The protonet pointer will unconditionally be rewritten, so just do the
needed assignment first.
Signed-off-by: Aaron Conole
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nf_conntrack_proto.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/netf
From: Florian Westphal
commit 223b02d923ecd7c84cf9780bb3686f455d279279
("netfilter: nf_conntrack: reserve two bytes for nf_ct_ext->len")
had to increase size of the extension offsets because total size of the
extensions had increased to a point where u8 did overflow.
3 years later we've managed
From: Florian Westphal
No need to track this for inkernel helpers anymore as
NF_CT_HELPER_BUILD_BUG_ON checks do this now.
All inkernel helpers know what kind of structure they
stored in helper->data.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/
From: Florian Westphal
resurrect an old patch from Pablo Neira to remove the untracked objects.
Currently, there are four possible states of an skb wrt. conntrack.
1. No conntrack attached, ct is NULL.
2. Normal (kmem cache allocated) ct attached.
3. a template (kmalloc'd), not in any hash tabl
From: Florian Westphal
Defer registration of the synproxy hooks until the first SYNPROXY rule is
added. Also means we only register hooks in namespaces that need it.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_conntrack_synproxy.h | 2 +
net
From: Gao Feng
The window scale may be enlarged from 14 to 15 according to the IETF
draft https://tools.ietf.org/html/draft-nishida-tcpm-maxwin-03.
Use the macro TCP_MAX_WSCALE to support it easily with TCP stack in
the future.
Signed-off-by: Gao Feng
Signed-off-by: Pablo Neira Ayuso
---
net
From: Florian Westphal
nf_(un)register_hooks has to maintain an internal hook list to add/remove
those hooks from net namespaces as they are added/deleted.
ipvs already uses pernet_ops, so we can switch to the (more recent)
pernet hook api instead.
Compile tested only.
Signed-off-by: Florian W
From: Florian Westphal
Only "cache" needs to use ulong (it's used with set_bit()), missed can use
u16. Also add build-time assertion to ensure event bits fit.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_conntrack_ecache.h| 4 ++--
incl
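The build-time check described in this patch can be sketched in userspace C11 roughly as follows. The enum values, struct layout, and helper name are hypothetical stand-ins rather than the kernel's actual definitions, and the kernel would use BUILD_BUG_ON where this sketch uses _Static_assert.

```c
#include <stdint.h>

/* Hypothetical event numbering; the real kernel enum differs. */
enum demo_event {
	DEMO_EVENT_NEW,
	DEMO_EVENT_UPDATE,
	DEMO_EVENT_DESTROY,
	DEMO_EVENT_MAX		/* number of event bits in use */
};

struct demo_ecache {
	unsigned long cache;	/* full word: meant for word-sized bit operations */
	uint16_t missed;	/* 16 bits are enough for the events above */
};

/* Fail the build if an event bit would not fit into 'missed'. */
_Static_assert(DEMO_EVENT_MAX <= 16, "event bits do not fit in u16");

/* Record an event that could not be delivered. */
static inline void demo_mark_missed(struct demo_ecache *e, enum demo_event ev)
{
	e->missed |= (uint16_t)(1u << ev);
}
```

If a seventeenth event were ever added to the enum, compilation would stop at the assertion instead of silently truncating the bit.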
From: Florian Westphal
looks like decnet isn't namespacified in the first place, so restrict hook
registration to the initial namespace.
Prepares for eventual removal of legacy nf_register_hook() api.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
net/decnet/netfilter/dn_rt
From: Florian Westphal
Similar to ip_register_table, pass nf_hook_ops to ebt_register_table().
This allows to handle hook registration also via pernet_ops and allows
us to avoid use of legacy register_hook api.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/linux
From: Gao Feng
The current SYNPROXY code returns NF_DROP during normal TCP handshaking,
which is not friendly to the caller, because nf_hook_slow treats
NF_DROP as an error and returns -EPERM.
As a result, the top-level caller may think it hit an error.
For example, the following codes a
From: Florian Westphal
krealloc(NULL, ..) is same as kmalloc(), so we can avoid special-casing
the initial allocation after the prealloc removal (we had to use
->alloc_len as the initial allocation size).
This also means we do not zero the preallocated memory anymore; only
offsets[]. Existing c
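The krealloc(NULL, ..) point has a direct userspace analogue: the C standard defines realloc(NULL, n) to behave like malloc(n). The sketch below uses a hypothetical struct demo_ext and helper (not the kernel's struct nf_ct_ext) to show how the first allocation then needs no separate code path. Zeroing the newly added bytes is a choice made for this example only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable area; not the kernel's struct nf_ct_ext. */
struct demo_ext {
	unsigned char *data;	/* NULL until the first extension is added */
	size_t len;
};

/*
 * Grow the area by 'add' bytes and return a pointer to the new region.
 * realloc(NULL, n) behaves like malloc(n), so the very first call goes
 * through the same path as every later one.
 */
static void *demo_ext_add(struct demo_ext *ext, size_t add)
{
	unsigned char *p = realloc(ext->data, ext->len + add);

	if (!p)
		return NULL;	/* the old area is still valid on failure */

	memset(p + ext->len, 0, add);	/* zero only the newly added part */
	ext->data = p;
	ext->len += add;
	return p + ext->len - add;
}
```

A caller starts with `struct demo_ext ext = { NULL, 0 };` and calls demo_ext_add() for each extension, freeing ext.data when done.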
From: Florian Westphal
It was used by the nat extension, but since commit
7c9664351980 ("netfilter: move nat hlist_head to nf_conn") it's only needed
for connections that use MASQUERADE target or a nat helper.
Also it seems a lot easier to preallocate a fixed size instead.
With default settings,
From: Florian Westphal
make sure nat extension gets added if the master conntrack is subject to
NAT. This will be required once the nat core stops adding it by default.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
net/ipv4/netfilter/nf_nat_pptp.c | 25 +++
From: Florian Westphal
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_conntrack_extend.h | 4 ++--
net/netfilter/nf_conntrack_acct.c | 2 +-
net/netfilter/nf_conntrack_ecache.c | 2 +-
net/netfilter/nf_conntrack_extend.c
From: Florian Westphal
Currently the nat extension is always attached as soon as nat module is
loaded. However, most NAT uses do not need the nat extension anymore.
Prepare to remove the add-nat-by-default by making those places that need
it attach it if it's not present yet.
Signed-off-by: Flo
From: Florian Westphal
nowadays the NAT extension only stores the interface index
(used to purge connections that got masqueraded when interface goes down)
and pptp nat information.
Previous patches moved nf_ct_nat_ext_add to those places that need it.
Signed-off-by: Florian Westphal
Signed-of
From: Aaron Conole
There are no in-tree callers of this function and it isn't exported.
Signed-off-by: Aaron Conole
Signed-off-by: Simon Horman
---
include/net/ip_vs.h | 2 --
net/netfilter/ipvs/ip_vs_proto.c | 22 --
2 files changed, 24 deletions(-)
diff --
From: Liping Zhang
For NF_NAT_MANIP_SRC, we will insert the ct to the nat_bysource_table,
then remove it from the nat_bysource_table via nat_extend->destroy.
But now, the nat extension is attached on demand, so if the nat extension
is not attached, we will not be notified when the ct is destroye
From: Florian Westphal
synchronize_net is expensive and slows down netns cleanup a lot.
We have two APIs to unregister a hook:
nf_unregister_net_hook (which calls synchronize_net())
and
nf_unregister_net_hooks (calls nf_unregister_net_hook in a loop)
Make nf_unregister_net_hook a wrapper around
From: Florian Westphal
nf_log_unregister() (which is what gets called in the logger backends
module exit paths) does a (required, module is removed) synchronize_rcu().
But nf_log_unset() is only called from pernet exit handlers. It doesn't
free any memory so there appears to be no need to call s
From: Florian Westphal
nf_unregister_net_hook(s) can avoid a second call to synchronize_net,
provided there is no nfqueue active in that net namespace (which is
the common case).
This also gets rid of the extra arg to nf_queue_nf_hook_drop(), normally
this gets called during netns cleanup so no
From: Florian Westphal
net/ipv4/netfilter/nf_nat_snmp_basic.c:1158:1: warning: the frame size
of 1160 bytes is larger than 1024 bytes
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
---
net/ipv4/netfilter/nf_nat_snmp_basic.c | 12 ++--
1 file changed, 6 insertions(+),
From: Aaron Conole
The sync_refresh_period variable is unsigned, so it can never be < 0.
Signed-off-by: Aaron Conole
Signed-off-by: Simon Horman
---
net/netfilter/ipvs/ip_vs_sync.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfil
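The reasoning in this patch can be illustrated with a small standalone C sketch. The helper names are hypothetical and unrelated to the IPVS code; they only show that a `< 0` test on an unsigned value is dead code, and what a negative value turns into once stored.

```c
#include <limits.h>

/* Hypothetical helper: 'period < 0' can never be true for unsigned. */
static int demo_is_negative(unsigned int period)
{
	return period < 0;	/* always 0; compilers may warn about this */
}

/* Storing a negative value in an unsigned variable wraps around. */
static unsigned int demo_store(int value)
{
	return (unsigned int)value;
}
```

With these helpers, demo_store(-1) yields UINT_MAX, and demo_is_negative() returns 0 for every input.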
From: Aaron Conole
There are no in-tree callers.
Signed-off-by: Aaron Conole
Acked-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/ipset/ip_set_core.c | 8
1 file changed, 8 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_core.c
b/net/netfilter/ipset/i
From: Gao Feng
The __nf_nat_alloc_null_binding invokes nf_nat_setup_info which may
return NF_DROP when memory is exhausted, so convert NF_DROP to -ENOMEM
to make ctnetlink happy. Otherwise ctnetlink_setup_nat treats it as a
success when the NF_DROP error actually happens.
Signed-off-by: Gao Feng
Signed
From: simran singhal
This patch replaces list_entry with list_prev_entry as it makes the
code clearer to read.
Signed-off-by: simran singhal
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nf_tables_api.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/n
From: simran singhal
For string without format specifiers, use seq_puts(). For
seq_printf("\n"), use seq_putc('\n').
Signed-off-by: simran singhal
Acked-by: Simon Horman
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/ipvs/ip_vs_ctl.c | 8
net/netfilter/nf_conntrack_expe
Add and use nfnl_msg_type() function to replace opencoded nfnetlink
message type. I suggested this change, Arushi Singhal made an initial
patch to address this but was missing several spots.
Signed-off-by: Pablo Neira Ayuso
---
include/linux/netfilter/nfnetlink.h | 5 +
net/netfilter/ipset
From: Arushi Singhal
This patch uses the following coccinelle script to remove
a variable that was simply used to store the return
value of a function call before returning it:
@@
identifier len,f;
@@
-int len;
... when != len
when strict
-len =
+return
f(...);
-return len;
Signe
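The effect of the semantic patch above can be seen in a before/after pair in plain C. The function names are hypothetical and chosen for illustration; they are not taken from the patched files.

```c
#include <string.h>

/* Before: 'ret' only carries the value to the return statement. */
static int demo_cmp_before(const char *a, const char *b)
{
	int ret;

	ret = strcmp(a, b);
	return ret;
}

/* After: the call is returned directly and the variable is gone. */
static int demo_cmp_after(const char *a, const char *b)
{
	return strcmp(a, b);
}
```

Both versions behave identically; the rewrite only removes a variable that served no other purpose.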
From: Gao Feng
Because the type of expecting, the member of nf_conn_help, is u8, it
would overflow after reaching U8_MAX (255). So it doesn't work when
max_expected is configured to exceed 255 in the expect policy.
Now add a check for max_expected and return -EINVAL when it exceeds
the limit.
Sig
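The overflow described in this patch, and the kind of check it adds, can be sketched in userspace C as follows. The struct and function names are hypothetical stand-ins; only the u8 width and the -EINVAL return come from the description above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the u8 'expecting' counter. */
struct demo_help {
	uint8_t expecting;
};

/* Reject limits that the u8 counter cannot represent. */
static int demo_check_max_expected(unsigned int max_expected)
{
	if (max_expected > UINT8_MAX)
		return -EINVAL;
	return 0;
}

/* Incrementing past 255 wraps the counter back to 0. */
static uint8_t demo_expect_inc(struct demo_help *h)
{
	h->expecting++;
	return h->expecting;
}
```

Without the check, a policy limit of, say, 300 could never be reached, because the counter wraps to 0 after 255.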
From: Gao Feng
Removing one expect needs three statements, and that sequence is
duplicated in several places in the current code. So add one common
function, nf_ct_remove_expect, to consolidate this.
Signed-off-by: Gao Feng
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_conntrack_expect
From: simran singhal
The following Coccinelle script was used to detect this:
@r@
expression x;
void* e;
type T;
identifier f;
@@
(
*((T *)e)
|
((T *)x)[...]
|
((T*)x)->f
|
- (T*)
e
)
Unnecessary parentheses are also removed.
Signed-off-by: simran singhal
Reviewed-by: Stephen Hemminger
This new helper function allows us to check if this is a basechain.
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_tables.h | 5 +
net/netfilter/nf_tables_api.c | 30 +++---
net/netfilter/nf_tables_netdev.c | 2 +-
net/netfilter/nft_compat.c
Hi David,
The following patchset contains Netfilter updates for your net-next
tree. A large bunch of code cleanups, XXX they are:
1) Check for ct->status bit instead of using nfct_nat() from IPVS and
Netfilter codebase, patch from Florian Westphal.
2) Use kcalloc() wherever possible in the IP
From: Varsha Rao
Replace kzalloc with kcalloc, as kcalloc is preferred over kzalloc for
allocating an array. This patch fixes the checkpatch issue.
Signed-off-by: Varsha Rao
---
net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ne
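One reason an array allocator is preferred is that it can take the element count and element size separately and check their product for overflow. A userspace sketch of that idea, using a hypothetical helper built on calloc() rather than the kernel allocator itself:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Userspace analogue of an array allocator: refuse element counts whose
 * total size would overflow size_t, and return zeroed memory otherwise.
 */
static void *demo_alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would overflow */
	return calloc(n, size);
}
```

An open-coded `n * size` passed to a single-size allocator would wrap silently in the overflow case and return a buffer that is too small.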
On Sat, Apr 29, 2017 at 09:59:49PM +0800, Liping Zhang wrote:
> From: Liping Zhang
>
> For NF_NAT_MANIP_SRC, we will insert the ct to the nat_bysource_table,
> then remove it from the nat_bysource_table via nat_extend->destroy.
>
> But now, the nat extension is attached on demand, so if the nat
On Fri, Apr 28, 2017 at 12:11:57PM +0200, Simon Horman wrote:
> Hi Pablo,
>
> please consider these enhancements to IPVS for v4.12.
> If it is too late for v4.12 then please consider them for v4.13.
>
> * Remove unused function
> * Correct comparison of unsigned value
Pulled, thanks Simon.
On Thu, Apr 27, 2017 at 04:39:43PM +0200, Florian Westphal wrote:
> net/ipv4/netfilter/nf_nat_snmp_basic.c:1158:1: warning: the frame size
> of 1160 bytes is larger than 1024 bytes
Applied, thanks.
On Mon, Apr 24, 2017 at 03:37:38PM +0200, Florian Westphal wrote:
> low hanging fruits to speed up netns cleanup in netfilter.
> We're way too happy to issue expensive synchronize_rcu() all
> over the place.
>
> On my test vm 8 processes doing 32 unshare each finish in ~3 minutes,
> with these pat
On Wed, Apr 26, 2017 at 01:32:38PM +0200, Arturo Borrero Gonzalez wrote:
> On 25 April 2017 at 15:18, Pablo Neira Ayuso wrote:
> >>
> >> Yes. The timer based approach is... timer based (async).
> >>
> >> It doesn't fit in an environment where you need to sync events as soon
> >> as they happen.
>
Several spots in the code use goto statements to return the error,
remove them.
Signed-off-by: Pablo Neira Ayuso
---
net/netfilter/nf_tables_api.c | 93 +++
1 file changed, 42 insertions(+), 51 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net
Only nft_dynset needs not to release NFT_SET_EXT_EXPR, add
nft_dynset_elem_destroy() that just releases what we need.
Signed-off-by: Pablo Neira Ayuso
---
include/net/netfilter/nf_tables.h | 3 +--
net/netfilter/nf_tables_api.c | 11 +--
net/netfilter/nft_dynset.c| 13 ++
Do not assume userspace always sends us NFT_DATA_VALUE for bitwise and
cmp expressions. Although NFT_DATA_VERDICT does not make any sense, it
is still possible to handcraft a netlink message using this incorrect
data type.
Signed-off-by: Pablo Neira Ayuso
---
This patch replaces [nf-next,2/4] net
Andreas reports that the following incremental update using our commit
protocol doesn't work.
# nft -f incremental-update.nft
delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 }
delete chain ip filter CIn_1
... Error: Could not process rule: Device or resource busy
The existi