Re: [BUG] moving fq back to clock monotonic breaks my setup
On Fri, Jan 11, 2019 at 10:35 AM Eric Dumazet wrote:
> On Thu, Jan 10, 2019 at 12:55 AM Paolo Abeni wrote:
> > On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:
> > > This works, and so does:
> > > https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> > >
> > > Pointed out by Paolo (tested both separately)
> >
> > Note: I cleared the tstamp in br_forward_finish() instead of
> > br_dev_queue_push_xmit() because I think the latter could be called
> > also in the local xmit path, via br_nf_post_routing.
> >
> > We must preserve the tstamp in output path, right?
> >
>
> I was not aware of your patch, SGTM, thanks.

And you can add
Tested-by: ian.kuml...@gmail.com
Re: [BUG] moving fq back to clock monotonic breaks my setup
On Thu, Jan 10, 2019 at 12:55 AM Paolo Abeni wrote:
>
> On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:
> > This works, and so does:
> > https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> >
> > Pointed out by Paolo (tested both separately)
>
> Note: I cleared the tstamp in br_forward_finish() instead of
> br_dev_queue_push_xmit() because I think the latter could be called
> also in the local xmit path, via br_nf_post_routing.
>
> We must preserve the tstamp in output path, right?
>

I was not aware of your patch, SGTM, thanks.
Re: [BUG] moving fq back to clock monotonic breaks my setup
On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:
> On Thu, Jan 10, 2019 at 6:53 AM Eric Dumazet wrote:
> > On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien wrote:
> > > Hi,
> > >
> > > Just been through ~5+ hours of bisecting and eventually actually found
> > > the culprit =)
> > >
> > > commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> > > Author: Eric Dumazet
> > > Date:   Fri Sep 28 10:28:44 2018 -0700
> > >
> > >     tcp/fq: move back to CLOCK_MONOTONIC
> > >
> > > [--8<--]
> > >
> > > So this might be because my setup might be "odd".
> > >
> > > Basically I have a firewall with four NICs that uses two of those NICs
> > > to handle my normal internet connection (firewall/MASQ/NAT), and the
> > > other two are assigned to one bridge each.
> > >
> > > The firewall is also my local caching DNS server and DHCP server,
> > > which is also used by the VMs...
> > > But with 4.20 DHCP replies disappeared before entering the bridge - I
> > > couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)
> > >
> > > I'm currently running a kernel with that patch reverted, but I'm also
> > > wondering about possible ways forward since I'm reverting a fix from
> > > someone else...
> >
> > I suggest you use the netdev@ mailing list instead of lkml.
> >
> > Then, we probably need to clear skb->tstamp in more paths (you are
> > mentioning bridge ...)
> >
> > See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.
> >
> > Can you try:
> >
> > diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> > index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
> > --- a/net/bridge/br_forward.c
> > +++ b/net/bridge/br_forward.c
> > @@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
> >                 skb_set_network_header(skb, depth);
> >         }
> >
> > +       skb->tstamp = 0;
> >         dev_queue_xmit(skb);
> >
> >         return 0;
>
> This works, and so does:
> https://marc.info/?l=linux-netdev&m=154696956604748&w=2
>
> Pointed out by Paolo (tested both separately)

Note: I cleared the tstamp in br_forward_finish() instead of
br_dev_queue_push_xmit() because I think the latter could also be called
in the local xmit path, via br_nf_post_routing.

We must preserve the tstamp in the output path, right?

Thanks,

Paolo
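[Editorial sketch, not part of the thread.] Paolo's point about where to clear the timestamp can be illustrated with a toy model. Everything below is hypothetical: `struct toy_skb` and the `toy_*` functions are stand-ins invented for illustration, not the kernel's `struct sk_buff` or the real bridge functions. The only assumption taken from the message is that a shared transmit step is reachable both from the forwarding path and from the local output path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy packet carrying only the field under discussion (hypothetical type). */
struct toy_skb {
    int64_t tstamp; /* 0 means "no departure time requested" */
};

/* Shared transmit step, reached by both paths. It leaves tstamp untouched
 * and returns the value a pacing qdisc would see. */
static int64_t toy_queue_push_xmit(const struct toy_skb *skb)
{
    assert(skb != NULL);
    return skb->tstamp;
}

/* Forwarding path: the packet was received, so any timestamp on it is a
 * stale receive stamp. Clear it before entering the shared step. */
static int64_t toy_forward_finish(struct toy_skb *skb)
{
    skb->tstamp = 0;
    return toy_queue_push_xmit(skb);
}

/* Local output path: enters the shared step directly, so a departure time
 * set by the local sender is preserved. */
static int64_t toy_local_output(struct toy_skb *skb)
{
    return toy_queue_push_xmit(skb);
}
```

In this model, clearing inside `toy_queue_push_xmit()` would also wipe the locally set departure time, which is the concern raised above; clearing in the forwarding-only function avoids that.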
Re: [BUG] moving fq back to clock monotonic breaks my setup
On Thu, Jan 10, 2019 at 6:53 AM Eric Dumazet wrote:
> On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien wrote:
> >
> > Hi,
> >
> > Just been through ~5+ hours of bisecting and eventually actually found
> > the culprit =)
> >
> > commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> > Author: Eric Dumazet
> > Date:   Fri Sep 28 10:28:44 2018 -0700
> >
> >     tcp/fq: move back to CLOCK_MONOTONIC
> >
> > [--8<--]
> >
> > So this might be because my setup might be "odd".
> >
> > Basically I have a firewall with four NICs that uses two of those NICs
> > to handle my normal internet connection (firewall/MASQ/NAT), and the
> > other two are assigned to one bridge each.
> >
> > The firewall is also my local caching DNS server and DHCP server,
> > which is also used by the VMs...
> > But with 4.20 DHCP replies disappeared before entering the bridge - I
> > couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)
> >
> > I'm currently running a kernel with that patch reverted, but I'm also
> > wondering about possible ways forward since I'm reverting a fix from
> > someone else...
>
> I suggest you use the netdev@ mailing list instead of lkml.
>
> Then, we probably need to clear skb->tstamp in more paths (you are
> mentioning bridge ...)
>
> See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.
>
> Can you try:
>
> diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
> --- a/net/bridge/br_forward.c
> +++ b/net/bridge/br_forward.c
> @@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
>                 skb_set_network_header(skb, depth);
>         }
>
> +       skb->tstamp = 0;
>         dev_queue_xmit(skb);
>
>         return 0;

This works, and so does:
https://marc.info/?l=linux-netdev&m=154696956604748&w=2

Pointed out by Paolo (tested both separately)
Re: [BUG] moving fq back to clock monotonic breaks my setup
On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien wrote:
>
> Hi,
>
> Just been through ~5+ hours of bisecting and eventually actually found
> the culprit =)
>
> commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> Author: Eric Dumazet
> Date:   Fri Sep 28 10:28:44 2018 -0700
>
>     tcp/fq: move back to CLOCK_MONOTONIC
>
> [--8<--]
>
> So this might be because my setup might be "odd".
>
> Basically I have a firewall with four NICs that uses two of those NICs
> to handle my normal internet connection (firewall/MASQ/NAT), and the
> other two are assigned to one bridge each.
>
> The firewall is also my local caching DNS server and DHCP server,
> which is also used by the VMs...
> But with 4.20 DHCP replies disappeared before entering the bridge - I
> couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)
>
> I'm currently running a kernel with that patch reverted, but I'm also
> wondering about possible ways forward since I'm reverting a fix from
> someone else...

I suggest you use the netdev@ mailing list instead of lkml.

Then, we probably need to clear skb->tstamp in more paths (you are
mentioning bridge ...)

See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.

Can you try:

diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
                skb_set_network_header(skb, depth);
        }

+       skb->tstamp = 0;
        dev_queue_xmit(skb);

        return 0;

Thanks.
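[Editorial sketch, not part of the thread.] The message above does not spell out why a leftover timestamp makes packets vanish. One plausible reading, stated here as an assumption, is that a receive timestamp taken from the wall clock is later interpreted by the pacing qdisc as a departure time on the monotonic clock, which places it far in the future. The userspace C sketch below illustrates that mismatch only; `clock_ns()` and `pacing_delay_ns()` are hypothetical helpers written for this note and are not kernel code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a POSIX clock as nanoseconds (hypothetical helper). */
static int64_t clock_ns(clockid_t id)
{
    struct timespec ts;
    int rc = clock_gettime(id, &ts);

    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical stand-in for a pacing decision: a zero timestamp means
 * "send now"; any other value is treated as a departure time on the
 * monotonic clock. Returns how long the packet would be held, in ns. */
static int64_t pacing_delay_ns(int64_t tstamp, int64_t now_mono)
{
    int64_t time_to_send = tstamp ? tstamp : now_mono;

    return time_to_send > now_mono ? time_to_send - now_mono : 0;
}
```

Calling `pacing_delay_ns(clock_ns(CLOCK_REALTIME), clock_ns(CLOCK_MONOTONIC))` on a machine with a correctly set wall clock yields a delay measured in decades, while passing 0 as the timestamp yields no delay. Under the stated assumption, that is the effect the one-line `skb->tstamp = 0` change is after.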
[BUG] moving fq back to clock monotonic breaks my setup
Hi,

Just been through ~5+ hours of bisecting and eventually actually found
the culprit =)

commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
Author: Eric Dumazet
Date:   Fri Sep 28 10:28:44 2018 -0700

    tcp/fq: move back to CLOCK_MONOTONIC

[--8<--]

So this might be because my setup might be "odd".

Basically I have a firewall with four NICs that uses two of those NICs
to handle my normal internet connection (firewall/MASQ/NAT), and the
other two are assigned to one bridge each.

The firewall is also my local caching DNS server and DHCP server,
which is also used by the VMs...
But with 4.20 DHCP replies disappeared before entering the bridge - I
couldn't even see them in tcpdump! (All NICs are ixgbe on an Atom SoC.)

I'm currently running a kernel with that patch reverted, but I'm also
wondering about possible ways forward since I'm reverting a fix from
someone else...