Re: [BUG] moving fq back to clock monotonic breaks my setup

2019-01-11 Thread Ian Kumlien
On Fri, Jan 11, 2019 at 10:35 AM Eric Dumazet wrote:
> On Thu, Jan 10, 2019 at 12:55 AM Paolo Abeni wrote:
> > On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:
>
>
> > > This works, and so does: 
> > > https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> > >
> > > Pointed out by Paolo (tested both separately)
> >
> > Note: I cleared the tstamp in br_forward_finish() instead of
> > br_dev_queue_push_xmit() because I think the latter could be called
> > also in the local xmit path, via br_nf_post_routing.
> >
> > We must preserve the tstamp in output path, right?
> >
>
>  I was not aware of your patch, SGTM, thanks.

And you can add Tested-by: ian.kuml...@gmail.com


Re: [BUG] moving fq back to clock monotonic breaks my setup

2019-01-11 Thread Eric Dumazet
On Thu, Jan 10, 2019 at 12:55 AM Paolo Abeni wrote:
>
> On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:


> > This works, and so does: 
> > https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> >
> > Pointed out by Paolo (tested both separately)
>
> Note: I cleared the tstamp in br_forward_finish() instead of
> br_dev_queue_push_xmit() because I think the latter could be called
> also in the local xmit path, via br_nf_post_routing.
>
> We must preserve the tstamp in output path, right?
>

 I was not aware of your patch, SGTM, thanks.


Re: [BUG] moving fq back to clock monotonic breaks my setup

2019-01-10 Thread Paolo Abeni
On Thu, 2019-01-10 at 09:25 +0100, Ian Kumlien wrote:
> On Thu, Jan 10, 2019 at 6:53 AM Eric Dumazet wrote:
> > On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien wrote:
> > > Hi,
> > > 
> > > > Just been through ~5+ hours of bisecting and eventually actually found
> > > the culprit =)
> > > 
> > > commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> > > Author: Eric Dumazet 
> > > Date:   Fri Sep 28 10:28:44 2018 -0700
> > > 
> > > tcp/fq: move back to CLOCK_MONOTONIC
> > > 
> > > [--8<--]
> > > 
> > > So this might be because my setup might be "odd".
> > > 
> > > > Basically I have a firewall with four NICs that uses two of those NICs
> > > > to handle my normal internet connection (firewall/MASQ/NAT) and the
> > > > other two are assigned to one bridge each.
> > > >
> > > > The firewall is also my local caching DNS server and DHCP server,
> > > > which is also used by the VMs...
> > > > But with 4.20 DHCP replies disappeared before entering the bridge - I
> > > > couldn't even see them in tcpdump! (all NICs are ixgbe on an Atom SoC)
> > > >
> > > > I'm currently running a kernel with that patch reverted but I'm also
> > > > wondering about possible ways forward since I'm reverting a fix from
> > > > someone else...
> > 
> > I suggest you use netdev@ mailing list instead of lkml
> > 
> > Then, we probably need to clear skb->tstamp in more paths (you are
> > mentioning bridge ...)
> > 
> > See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.
> > 
> > Can you try :
> > 
> > > diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> > > index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
> > > --- a/net/bridge/br_forward.c
> > > +++ b/net/bridge/br_forward.c
> > > @@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
> > >                 skb_set_network_header(skb, depth);
> > >         }
> > >
> > > +       skb->tstamp = 0;
> > >         dev_queue_xmit(skb);
> > >
> > >         return 0;
> 
> This works, and so does: 
> https://marc.info/?l=linux-netdev&m=154696956604748&w=2
> 
> Pointed out by Paolo (tested both separately)

Note: I cleared the tstamp in br_forward_finish() instead of
br_dev_queue_push_xmit() because I think the latter could be called
also in the local xmit path, via br_nf_post_routing.

We must preserve the tstamp in output path, right?

Thanks,

Paolo


Re: [BUG] moving fq back to clock monotonic breaks my setup

2019-01-10 Thread Ian Kumlien
On Thu, Jan 10, 2019 at 6:53 AM Eric Dumazet wrote:
> On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien wrote:
> >
> > Hi,
> >
> > Just been through ~5+ hours of bisecting and eventually actually found
> > the culprit =)
> >
> > commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> > Author: Eric Dumazet 
> > Date:   Fri Sep 28 10:28:44 2018 -0700
> >
> > tcp/fq: move back to CLOCK_MONOTONIC
> >
> > [--8<--]
> >
> > So this might be because my setup might be "odd".
> >
> > Basically I have a firewall with four NICs that uses two of those NICs
> > to handle my normal internet connection (firewall/MASQ/NAT) and the
> > other two are assigned to one bridge each.
> >
> > The firewall is also my local caching DNS server and DHCP server,
> > which is also used by the VMs...
> > But with 4.20 DHCP replies disappeared before entering the bridge - I
> > couldn't even see them in tcpdump! (all NICs are ixgbe on an Atom SoC)
> >
> > I'm currently running a kernel with that patch reverted but I'm also
> > wondering about possible ways forward since I'm reverting a fix from
> > someone else...
>
> I suggest you use netdev@ mailing list instead of lkml
>
> Then, we probably need to clear skb->tstamp in more paths (you are
> mentioning bridge ...)
>
> See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.
>
> Can you try :
>
> diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
> --- a/net/bridge/br_forward.c
> +++ b/net/bridge/br_forward.c
> @@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
>                 skb_set_network_header(skb, depth);
>         }
>
> +       skb->tstamp = 0;
>         dev_queue_xmit(skb);
>
>         return 0;

This works, and so does: https://marc.info/?l=linux-netdev&m=154696956604748&w=2

Pointed out by Paolo (tested both separately)


Re: [BUG] moving fq back to clock monotonic breaks my setup

2019-01-09 Thread Eric Dumazet
On Wed, Jan 9, 2019 at 4:48 PM Ian Kumlien wrote:
>
> Hi,
>
> Just been through ~5+ hours of bisecting and eventually actually found
> the culprit =)
>
> commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> Author: Eric Dumazet 
> Date:   Fri Sep 28 10:28:44 2018 -0700
>
> tcp/fq: move back to CLOCK_MONOTONIC
>
> [--8<--]
>
> So this might be because my setup might be "odd".
>
> Basically I have a firewall with four NICs that uses two of those NICs
> to handle my normal internet connection (firewall/MASQ/NAT) and the
> other two are assigned to one bridge each.
>
> The firewall is also my local caching DNS server and DHCP server,
> which is also used by the VMs...
> But with 4.20 DHCP replies disappeared before entering the bridge - I
> couldn't even see them in tcpdump! (all NICs are ixgbe on an Atom SoC)
>
> I'm currently running a kernel with that patch reverted but I'm also
> wondering about possible ways forward since I'm reverting a fix from
> someone else...

I suggest you use netdev@ mailing list instead of lkml

Then, we probably need to clear skb->tstamp in more paths (you are
mentioning bridge ...)

See commit 8203e2d844d34af247a151d8ebd68553a6e91785 for reference.

Can you try :

diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 5372e2042adfe20d3cd039c29057535b2413be61..bd4fa141420c92a44716bd93fcd8aa3d3310203a 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -53,6 +53,7 @@ int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb
                skb_set_network_header(skb, depth);
        }
 
+       skb->tstamp = 0;
        dev_queue_xmit(skb);
 
        return 0;

Thanks.


[BUG] moving fq back to clock monotonic breaks my setup

2019-01-09 Thread Ian Kumlien
Hi,

Just been through ~5+ hours of bisecting and eventually actually found
the culprit =)

commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
Author: Eric Dumazet 
Date:   Fri Sep 28 10:28:44 2018 -0700

tcp/fq: move back to CLOCK_MONOTONIC

[--8<--]

So this might be because my setup might be "odd".

Basically I have a firewall with four NICs that uses two of those NICs
to handle my normal internet connection (firewall/MASQ/NAT) and the
other two are assigned to one bridge each.

The firewall is also my local caching DNS server and DHCP server,
which is also used by the VMs...
But with 4.20 DHCP replies disappeared before entering the bridge - I
couldn't even see them in tcpdump! (all NICs are ixgbe on an Atom SoC)

I'm currently running a kernel with that patch reverted but I'm also
wondering about possible ways forward since I'm reverting a fix from
someone else...