Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1

2018-08-16 Thread Marc Haber
000
[   11.999732] 1fc0: be8dcf80 b6f19ce8 0186aea0 0128  0062 
0186b6e8 
[   12.007902] 1fe0: 0128 be8dcf50 b6e003e3 b6e01346
[   12.012957] Code: e3130010 e1a0c000 1a30 e35c (e584900c) 
[   12.019056] Internal error: Oops: a06 [#4] SMP ARM
[   12.019171] ---[ end trace 1b60255ae59ac008 ]---


-- 
-----
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

2017-11-13 Thread Marc Haber
Hi,

those four patches never made it into any 4.13 release.

0001-net-call-sk_reuseport_match-if-we-are-a-reusesock.patch
0001-net-don-t-fast-patch-mismatched-sockets-in-STRICT-mo.patch
0001-net-use-inet6_rcv_saddr-to-compare-sockets.patch
0001-net-set-tb-fast_sk_family.patch

And I have just seen that the first two are not even in 4.14. What does
that mean for libvirt users on systems running a 4.14 kernel?

The third and fourth patch
(0001-net-use-inet6_rcv_saddr-to-compare-sockets.patch and
0001-net-set-tb-fast_sk_family.patch) seem to be in 4.14.

Greetings
Marc

On Mon, Sep 18, 2017 at 10:02:32AM +0200, Marc Haber wrote:
> On Sun, Sep 17, 2017 at 09:17:13AM -0400, Cole Robinson wrote:
> > On 09/15/2017 01:51 PM, Josef Bacik wrote:
> > > Finally got access to a box to run this down myself.  This patch on top 
> > > of the other patches fixes the problem for me, could you verify it works 
> > > for you?  Thanks,
> > > 
> > 
> > Yup I can confirm that patch fixes things when applied on top of the
> > previous 3 patches. Thanks! Please tag those patches for stable releases
> > if appropriate, this is affecting a decent amount of libvirt users
> 
> I can also confirm that these four patches fix things for me (on
> Debian) as well. Thanks!
> 
> I would love to have this in one of Greg's next 4.13 releases.



Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

2017-09-18 Thread Marc Haber
On Sun, Sep 17, 2017 at 09:17:13AM -0400, Cole Robinson wrote:
> On 09/15/2017 01:51 PM, Josef Bacik wrote:
> > Finally got access to a box to run this down myself.  This patch on top of 
> > the other patches fixes the problem for me, could you verify it works for 
> > you?  Thanks,
> > 
> 
> Yup I can confirm that patch fixes things when applied on top of the
> previous 3 patches. Thanks! Please tag those patches for stable releases
> if appropriate, this is affecting a decent amount of libvirt users

I can also confirm that these four patches fix things for me (on
Debian) as well. Thanks!

I would love to have this in one of Greg's next 4.13 releases.

Greetings
Marc



Re: After a while of system running no incoming UDP any more?

2017-08-11 Thread Marc Haber
On Fri, Aug 11, 2017 at 04:34:53PM +0200, Marc Haber wrote:
> On Fri, Jul 28, 2017 at 02:14:34PM +0200, Marc Haber wrote:
> > I can confirm that these two changes make a system in bad state work
> > again immediately. Will try the patch on 4.12.4 later today.
> 
> After upgrading my test systems to 4.12.5, the issue reappeared. This
> shows me that the patch indeed helped (my patched 4.12.4 kernels didn't
> show the bad behavior), and that the patch didn't make its way into
> 4.12.5. The patch applied to 4.12.5, kernels are building.

It seems to be in the freshly released 4.12.6.

Greetings
Marc



Re: After a while of system running no incoming UDP any more?

2017-08-11 Thread Marc Haber
On Fri, Jul 28, 2017 at 02:14:34PM +0200, Marc Haber wrote:
> I can confirm that these two changes make a system in bad state work
> again immediately. Will try the patch on 4.12.4 later today.

After upgrading my test systems to 4.12.5, the issue reappeared. This
shows me that the patch indeed helped (my patched 4.12.4 kernels didn't
show the bad behavior), and that the patch didn't make its way into
4.12.5. The patch applied to 4.12.5, kernels are building.

The run-time fix works as well.

Greetings
Marc



Re: After a while of system running no incoming UDP any more?

2017-07-28 Thread Marc Haber
On Fri, Jul 28, 2017 at 10:07:57AM +0200, Paolo Abeni wrote:
> As a workaround you can disable UDP early demux:
> 
> echo 0 > /proc/sys/net/ipv4/udp_early_demux
> 
> (will affect both ipv4 and ipv6).
> 
> and (if the system is already in the bad state) increase the UDP
> accounted memory limit by writing values greater than the current ones
> to /proc/sys/net/ipv4/udp_mem (the actual values depend on the system's
> total memory).

I can confirm that these two changes make a system in bad state work
again immediately. Will try the patch on 4.12.4 later today.
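
For reference, the two run-time changes suggested above can be scripted.
What follows is only a sketch in Python under stated assumptions: it has
to run as root, the helper name apply_udp_workaround and its proc_root
parameter are made up for illustration, and the doubling factor is an
arbitrary choice, since the advice above only says that the new udp_mem
values have to be greater than the current ones.

```python
from pathlib import Path


def apply_udp_workaround(factor=2, proc_root="/proc/sys/net/ipv4"):
    """Disable UDP early demux and scale the three udp_mem limits.

    `factor` is an illustrative assumption, not a recommended value.
    Returns the old and the new udp_mem triples.
    """
    root = Path(proc_root)

    # Equivalent of: echo 0 > /proc/sys/net/ipv4/udp_early_demux
    (root / "udp_early_demux").write_text("0\n")

    # udp_mem holds three whitespace-separated integers (min, pressure, max).
    udp_mem = root / "udp_mem"
    old = [int(value) for value in udp_mem.read_text().split()]
    new = [value * factor for value in old]
    udp_mem.write_text(" ".join(str(value) for value in new) + "\n")
    return old, new
```

Neither setting survives a reboot when written this way, so this only
helps a system that is already in the bad state.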

Thanks for helping!

Greetings
Marc



Re: After a while of system running no incoming UDP any more?

2017-07-28 Thread Marc Haber
ms  1  0.0
UdpInErrors         50     0.0
UdpOutDatagrams     47     0.0
UdpIgnoredMulti     1      0.0
Ip6InReceives       75     0.0
Ip6InDelivers       73     0.0
Ip6OutRequests      64     0.0
Ip6InMcastPkts      2      0.0
Ip6InOctets         7837   0.0
Ip6OutOctets        11876  0.0
Ip6InMcastOctets    279    0.0
Ip6InNoECTPkts      75     0.0
Udp6InErrors        3      0.0
IpExtInBcastPkts    1      0.0
IpExtInOctets       18447  0.0
IpExtOutOctets      3478   0.0
IpExtInBcastOctets  183    0.0
IpExtInNoECTPkts    59     0.0

; <<>> DiG 9.10.3-P4-Debian <<>> +time=2 @8.8.8.8 zugschlus.de mx
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
Recv-Q Send-Q Local Address:Port      Peer Address:Port
0      0      216.231.132.60:7879     202.12.27.33:domain
0      0      216.231.132.60:32711    202.12.27.33:domain
0      0      216.231.132.60:54238    202.12.27.33:domain
0      0      216.231.132.60:30948    192.228.79.201:domain
0      0      216.231.132.60:4106     202.12.27.33:domain
0      0      216.231.132.60:6667     202.12.27.33:domain
0      0      216.231.132.60:2090     192.228.79.201:domain
0      0      216.231.132.60:60459    192.228.79.201:domain
0      0      216.231.132.60:16427    202.12.27.33:domain
0      0      216.231.132.60:9019     202.12.27.33:domain
0      0      216.231.132.60:2113     202.12.27.33:domain
0      0      216.231.132.60:34907    202.12.27.33:domain
0      0      216.231.132.60:34654    202.12.27.33:domain
0      0      216.231.132.60:47725    202.12.27.33:domain
0      0      216.231.132.60:35774    202.12.27.33:domain
IpInReceives        38     0.0
IpInDelivers        38     0.0
IpOutRequests       38     0.0
UdpInDatagrams      2      0.0
UdpInErrors         34     0.0
UdpOutDatagrams     36     0.0
Ip6InReceives       14     0.0
Ip6InDelivers       13     0.0
Ip6OutRequests      13     0.0
Ip6InMcastPkts      1      0.0
Ip6InOctets         1046   0.0
Ip6OutOctets        6277   0.0
Ip6InMcastOctets    133    0.0
Ip6InNoECTPkts      13     0.0
Ip6InECT0Pkts       1      0.0
Udp6InDatagrams     1      0.0
Udp6OutDatagrams    1      0.0
IpExtInOctets       15963  0.0
IpExtOutOctets      2397   0.0
IpExtInNoECTPkts    37     0.0
IpExtInECT0Pkts     1      0.0
[20/1079]mh@impetus:~ $

I am afraid I cannot keep this state for much longer than a few
additional hours as this is an authoritative name server...
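
The counters that matter most in the dump above are UdpInDatagrams and
UdpInErrors. Here is a small Python sketch of how such nstat output could
be reduced to a single number. The function names are illustrative, the
sample is limited to three lines taken from the second dump above, and
treating errors / (errors + delivered) as a health indicator is my own
simplification, not something the tools define.

```python
def parse_nstat(text):
    """Parse nstat-style lines ("Name count rate") into {name: count}."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a counter row.
        if len(fields) != 3 or not fields[1].isdigit():
            continue
        counters[fields[0]] = int(fields[1])
    return counters


def udp_error_ratio(counters):
    """Share of incoming UDP events that were counted as errors."""
    delivered = counters.get("UdpInDatagrams", 0)
    errors = counters.get("UdpInErrors", 0)
    total = delivered + errors
    return errors / total if total else 0.0


SAMPLE = """\
UdpInDatagrams      2      0.0
UdpInErrors         34     0.0
UdpOutDatagrams     36     0.0
"""

print(udp_error_ratio(parse_nstat(SAMPLE)))
```

With these numbers, 34 of 36 incoming UDP events in the sampling interval
ended up in UdpInErrors.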

Greetings
Marc


Re: After a while of system running no incoming UDP any more?

2017-07-26 Thread Marc Haber
On Tue, Jul 25, 2017 at 02:17:52PM +0200, Paolo Abeni wrote:
> On Tue, 2017-07-25 at 13:57 +0200, Marc Haber wrote:
> > On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> > > Once a system enters the buggy state, do the packets reach the
> > > relevant socket's queue?
> > > 
> > > ss -u
> > 
> > That one only shows table headers on an unaffected system in normal
> > operation, right?
> 
> This one shows the current length of the socket receive queue (Recv-Q,
> the first column). If the packets land in the skbuff (and the user
> space reader for some reason is not woken up) that value will grow over
> time.

Only that there is no value:
[4/4992]mh@swivel:~ $ ss -u
Recv-Q Send-Q Local Address:Port Peer Address:Port  
[5/4992]mh@swivel:~ $

(Is that the intended behavior on a system that is not affected by the
issue?)
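
Watching whether Recv-Q grows, as described in the quoted text, means
comparing two samples of the ss output. Below is a rough Python sketch,
assuming the four-column layout shown in this thread (Recv-Q, Send-Q,
local, peer); other iproute2 versions or options such as -a add columns,
and the function names are hypothetical.

```python
def recv_queues(ss_output):
    """Map "local -> peer" to Recv-Q for every socket row of `ss -u`."""
    queues = {}
    for line in ss_output.splitlines():
        fields = line.split()
        # Socket rows start with two integers; this also skips the header.
        if len(fields) < 4:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        queues[f"{fields[2]} -> {fields[3]}"] = int(fields[0])
    return queues


def growing(before, after):
    """Sockets whose Recv-Q is larger in the second sample."""
    return sorted(key for key, value in after.items()
                  if value > before.get(key, 0))
```

An empty result from growing() on an affected system would suggest that
the packets never reach the socket queue at all.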

> > > nstat |grep -e Udp -e Ip
> > > 
> > > will help checking that.
> > 
> > An unaffected system will show UdpInDatagrams, right?
> > 
> > But where is the connection to the relevant socket's queue?
> 
> If the socket queue length (as reported above) does not increase,
> IP/UDP stats could give a hint of where and why the packets stop
> traversing the network stack.

We'll see. Still waiting for the phenomenon to show up again.

> Beyond that, you can try using perf probes or kprobe/systemtap to [try
> to] track the relevant packets inside the kernel.

That's way beyond my kernel foo, I'm afraid.

Thanks for helping, I'll report back.

Greetings
Marc



Re: After a while of system running no incoming UDP any more?

2017-07-25 Thread Marc Haber
Hi Paolo,

thanks for your answer. I appreciate that.

On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> On Mon, 2017-07-24 at 14:09 +0200, Marc Haber wrote:
> > Before I begin running older kernels on production systems, I would like
> > to ask whether there have been recent changes in the 4.11 => 4.12
> > development cycle that might cause an issue like that.
> 
> While there has been some activity regarding the UDP protocol lately,
> almost nothing touched UDP in the 4.11 release cycle.

4.11 is good, 4.12 is bad.

> The issue you describe looks similar to the bug fixed by the commit
> 9bd780f5e066 ("udp: fix poll()"), but the bugged code is only in later
> kernels. 

That one is in v4.13-rc1 and v4.13-rc2, but it doesn't apply to my 4.12
trees.

> > Any idea what might be happening here and what else I could try?
> 
> Once a system enters the buggy state, do the packets reach the
> relevant socket's queue?
> 
> ss -u

That one only shows table headers on an unaffected system in normal
operation, right?

> nstat |grep -e Udp -e Ip
> 
> will help checking that.

An unaffected system will show UdpInDatagrams, right?

But where is the connection to the relevant socket's queue?

Greetings
Marc



After a while of system running no incoming UDP any more?

2017-07-24 Thread Marc Haber
Hi,

I am running ~ 50 servers, most of them as KVM guests, some of them as
Xen guests, and even fewer of them on hardware, and have recently updated
to Debian stretch. I usually use kernels locally built from the latest
vanilla stable release.

Roughly since the upgrade to Debian stretch and kernel 4.12, some of my
systems have begun failing to deliver incoming UDP packets (such as DNS
replies) to user space. When this happens, I see the packet coming in
on tcpdump -p, but the application never sees it and eventually times
out. An strace on the process shows it waiting in the select() syscall,
and nothing happens when the system receives the UDP packet. I also see
the same phenomenon with ntp. A reboot always fixes the issue.
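
The symptom described above (a process sitting in select() while the
datagram is visible on the wire) can be turned into a small self-test.
This is a sketch in Python with hypothetical function names; it sends one
datagram to itself over 127.0.0.1, and whether loopback traffic is hit by
the problem at all is an assumption that has not been verified.

```python
import select
import socket


def wait_for_datagram(sock, timeout):
    """Return the next datagram on sock, or None if select() times out."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    data, _addr = sock.recvfrom(65535)
    return data


def udp_loopback_probe(timeout=2.0):
    """Send one datagram to ourselves; True if it reaches user space."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))  # let the kernel pick a free port
        tx.sendto(b"probe", rx.getsockname())
        return wait_for_datagram(rx, timeout) == b"probe"
```

Run periodically from cron, a False result from udp_loopback_probe()
could flag a machine before the first user-visible DNS timeout, provided
the assumption about loopback holds.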

Running Wireshark on a pcap file obtained on an affected system shows
all checksums to be in order. Both IPv4 and IPv6 are affected, and
in the DNS case, switching dig/drill or even the system resolver to TCP
also fixes the issue.

This happens only after the system has been running for a few days, and
I have seen this happen on both KVM and Xen guests, but not (yet) on
real hardware. In my zoo of servers, this happens - over the entire
sample - about twice a week, often enough to be annoying and seldom
enough to make debugging really difficult, since you never know in
advance which system will show the issue next.

I have therefore been reluctant to downgrade kernel or system since that
would mean days of work. Bisecting is probably out of the question since
you'll never know when "git bisect good" is a sufficiently safe
assumption.

Before I begin running older kernels on production systems, I would like
to ask whether there have been recent changes in the 4.11 => 4.12
development cycle that might cause an issue like that.

Since I have never seen the issue on stretch systems when they were
still running 4.11.8 (the latest 4.11 kernel that I had deployed before
switching over to 4.12), I do really suspect the kernel, and I do also
suspect that network interface offloading is probably not the culprit.

On the KVM guests, I use virtio-net, and I had that one high on my list
until one of the two Xen guests that doesn't show any network modules
loaded has been showing the phenomenon as well.

That Xen guest outputs the following to lshw -C network:

  *-network
   description: Ethernet interface
   physical id: 1
   logical name: eth0
   serial: 0e:06:5f:74:48:97
   capabilities: ethernet physical
   configuration: broadcast=yes driver=vif ip= link=yes multicast=yes

So I assume that this one is not using virtio-net, so virtio-net seems
safe as well.

Any idea what might be happening here and what else I could try?

Greetings
Marc



Re: [PATCH (net.git) 2/3] Revert "stmmac: Fix 'eth0: No PHY found' regression"

2016-05-11 Thread Marc Haber
On Wed, Apr 13, 2016 at 05:44:25PM +0200, Marc Haber wrote:
> On Fri, Apr 01, 2016 at 09:07:15AM +0200, Giuseppe Cavallaro wrote:
> > This reverts commit 88f8b1bb41c6208f81b6a480244533ded7b59493.
> > due to problems on GeekBox and Banana Pi M1 board when
> > connected to a real transceiver instead of a switch via
> > fixed-link.
> 
> This reversal is still needed in Linux 4.5.1 on Banana Pi.
> 
> Please consider including it in Linux 4.5.2.

This reversal is still needed in Linux 4.5.4 on Banana Pi.

Please consider including it in Linux 4.5.5.

Greetings
Marc



> 
> > 
> > Signed-off-by: Giuseppe Cavallaro <peppe.cavall...@st.com>
> > Cc: Gabriel Fernandez <gabriel.fernan...@linaro.org>
> > Cc: Andreas Färber <afaer...@suse.de>
> > Cc: Frank Schäfer <fschaefer@googlemail.com>
> > Cc: Dinh Nguyen <dinh.li...@gmail.com>
> > Cc: David S. Miller <da...@davemloft.net>
> > ---
> >  drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c  |   11 ++-
> >  .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |9 +
> >  include/linux/stmmac.h |1 -
> >  3 files changed, 11 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
> > index ea76129..af09ced 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
> > @@ -199,12 +199,21 @@ int stmmac_mdio_register(struct net_device *ndev)
> > struct stmmac_priv *priv = netdev_priv(ndev);
> > struct stmmac_mdio_bus_data *mdio_bus_data = priv->plat->mdio_bus_data;
> > int addr, found;
> > -   struct device_node *mdio_node = priv->plat->mdio_node;
> > +   struct device_node *mdio_node = NULL;
> > +   struct device_node *child_node = NULL;
> >  
> > if (!mdio_bus_data)
> > return 0;
> >  
> > if (IS_ENABLED(CONFIG_OF)) {
> > +   for_each_child_of_node(priv->device->of_node, child_node) {
> > +   if (of_device_is_compatible(child_node,
> > +   "snps,dwmac-mdio")) {
> > +   mdio_node = child_node;
> > +   break;
> > +   }
> > +   }
> > +
> > if (mdio_node) {
> > netdev_dbg(ndev, "FOUND MDIO subnode\n");
> > } else {
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> > index dcbd2a1..9cf181f 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> > @@ -146,7 +146,6 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac)
> > struct device_node *np = pdev->dev.of_node;
> > struct plat_stmmacenet_data *plat;
> > struct stmmac_dma_cfg *dma_cfg;
> > -   struct device_node *child_node = NULL;
> >  
> > plat = devm_kzalloc(&pdev->dev, sizeof(*plat), GFP_KERNEL);
> > if (!plat)
> > @@ -177,19 +176,13 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac)
> > plat->phy_node = of_node_get(np);
> > }
> >  
> > -   for_each_child_of_node(np, child_node)
> > -   if (of_device_is_compatible(child_node, "snps,dwmac-mdio")) {
> > -   plat->mdio_node = child_node;
> > -   break;
> > -   }
> > -
> > /* "snps,phy-addr" is not a standard property. Mark it as deprecated
> >  * and warn of its use. Remove this when phy node support is added.
> >  */
> > if (of_property_read_u32(np, "snps,phy-addr", &plat->phy_addr) == 0)
> > dev_warn(&pdev->dev, "snps,phy-addr property is deprecated\n");
> >  
> > -   if ((plat->phy_node && !of_phy_is_fixed_link(np)) || !plat->mdio_node)
> > +   if ((plat->phy_node && !of_phy_is_fixed_link(np)) || plat->phy_bus_name)
> >     plat->mdio_bus_data = NULL;
> > else
> > plat->mdio_bus_data =
> > diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
> > index 4bcf5a6..6e53fa8 100644
> > --- a/include/linux/stmmac.h
> > +++ b/include/linux/stmmac.h
> > @@ -114,7 +114,6 @@ struct plat_stmmacenet_data {
> >  

Re: [PATCH (net.git) 2/3] Revert "stmmac: Fix 'eth0: No PHY found' regression"

2016-04-13 Thread Marc Haber
On Fri, Apr 01, 2016 at 09:07:15AM +0200, Giuseppe Cavallaro wrote:
> This reverts commit 88f8b1bb41c6208f81b6a480244533ded7b59493.
> due to problems on GeekBox and Banana Pi M1 board when
> connected to a real transceiver instead of a switch via
> fixed-link.

This reversal is still needed in Linux 4.5.1 on Banana Pi.

Please consider including it in Linux 4.5.2.

Greetings
Marc

> 
> Signed-off-by: Giuseppe Cavallaro <peppe.cavall...@st.com>
> Cc: Gabriel Fernandez <gabriel.fernan...@linaro.org>
> Cc: Andreas Färber <afaer...@suse.de>
> Cc: Frank Schäfer <fschaefer@googlemail.com>
> Cc: Dinh Nguyen <dinh.li...@gmail.com>
> Cc: David S. Miller <da...@davemloft.net>
> ---
>  drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c  |   11 ++-
>  .../net/ethernet/stmicro/stmmac/stmmac_platform.c  |9 +
>  include/linux/stmmac.h |1 -
>  3 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
> index ea76129..af09ced 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
> @@ -199,12 +199,21 @@ int stmmac_mdio_register(struct net_device *ndev)
>   struct stmmac_priv *priv = netdev_priv(ndev);
>   struct stmmac_mdio_bus_data *mdio_bus_data = priv->plat->mdio_bus_data;
>   int addr, found;
> - struct device_node *mdio_node = priv->plat->mdio_node;
> + struct device_node *mdio_node = NULL;
> + struct device_node *child_node = NULL;
>  
>   if (!mdio_bus_data)
>   return 0;
>  
>   if (IS_ENABLED(CONFIG_OF)) {
> + for_each_child_of_node(priv->device->of_node, child_node) {
> + if (of_device_is_compatible(child_node,
> + "snps,dwmac-mdio")) {
> + mdio_node = child_node;
> + break;
> + }
> + }
> +
>   if (mdio_node) {
>   netdev_dbg(ndev, "FOUND MDIO subnode\n");
>   } else {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> index dcbd2a1..9cf181f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> @@ -146,7 +146,6 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac)
>   struct device_node *np = pdev->dev.of_node;
>   struct plat_stmmacenet_data *plat;
>   struct stmmac_dma_cfg *dma_cfg;
> - struct device_node *child_node = NULL;
>  
>   plat = devm_kzalloc(&pdev->dev, sizeof(*plat), GFP_KERNEL);
>   if (!plat)
> @@ -177,19 +176,13 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac)
>   plat->phy_node = of_node_get(np);
>   }
>  
> - for_each_child_of_node(np, child_node)
> - if (of_device_is_compatible(child_node, "snps,dwmac-mdio")) {
> - plat->mdio_node = child_node;
> - break;
> - }
> -
>   /* "snps,phy-addr" is not a standard property. Mark it as deprecated
>    * and warn of its use. Remove this when phy node support is added.
>    */
>   if (of_property_read_u32(np, "snps,phy-addr", &plat->phy_addr) == 0)
>   dev_warn(&pdev->dev, "snps,phy-addr property is deprecated\n");
>  
> - if ((plat->phy_node && !of_phy_is_fixed_link(np)) || !plat->mdio_node)
> + if ((plat->phy_node && !of_phy_is_fixed_link(np)) || plat->phy_bus_name)
>   plat->mdio_bus_data = NULL;
>   else
>   plat->mdio_bus_data =
> diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
> index 4bcf5a6..6e53fa8 100644
> --- a/include/linux/stmmac.h
> +++ b/include/linux/stmmac.h
> @@ -114,7 +114,6 @@ struct plat_stmmacenet_data {
>   int interface;
>   struct stmmac_mdio_bus_data *mdio_bus_data;
>   struct device_node *phy_node;
> - struct device_node *mdio_node;
>   struct stmmac_dma_cfg *dma_cfg;
>   int clk_csr;
>   int has_gmac;
> -- 
> 1.7.4.4
> 



Re: IPv6 route to gateway on fe80::1%eth0 when I have fe80::1%br0 locally

2016-02-23 Thread Marc Haber
On Tue, Feb 23, 2016 at 10:03:28AM +0100, Hannes Frederic Sowa wrote:
> Thanks for letting me know. Hopefully this also fixes
> https://bugzilla.kernel.org/show_bug.cgi?id=110721.

As far as I have understood the systemd release logs, the code
handling IPv6 RAs was added in systemd 229, which was released on
February 11. So, #110721, filed in January, seems to be "safe" from
this issue unless a development snapshot of systemd was used here.

Greetings
Marc



Re: IPv6 route to gateway on fe80::1%eth0 when I have fe80::1%br0 locally

2016-02-22 Thread Marc Haber
On Mon, Feb 22, 2016 at 05:15:41PM +0100, Hannes Frederic Sowa wrote:
> On 22.02.2016 16:47, Marc Haber wrote:
> >Can you reproduce the behavior with accept_ra_from_local=0 as well?
> >Unfortunately, the debugging VM I built works fine, it's just the
> >physical host showing this behavior. This is really strange.
> 
> Same here. Debugging VM didn't show this error at all and other systems
> didn't show this symptom either (4.4.2 as well as net-next).
> 
> With which kernel did you see this behavior for the first time and what was
> the last working version?

Thanks for motivating me to investigate this further.

I have to apologize. It is not a kernel issue.

It has turned out that systemd, starting with version 229, has placed
a "Not invented here" stamp on route advertisement processing in the
kernel and has implemented its own userspace code to handle router
advertisements.

And, of course, they did it wrong.

Setting IPv6AcceptRouterAdvertisements=0 in eth0.network seems to
disable enough code that this issue does not show any more.

Sorry for the rumble, I debugged the wrong piece of software. Bugs in
Debian are filed, #815582, #815586. I don't file bugs with systemd
upstream any more since I got silenced on systemd-devel for losing my
temper.

Greetings
Marc




Re: IPv6 route to gateway on fe80::1%eth0 when I have fe80::1%br0 locally

2016-02-22 Thread Marc Haber
On Mon, Feb 22, 2016 at 04:12:36PM +0100, Hannes Frederic Sowa wrote:
> On 22.02.2016 16:04, Marc Haber wrote:
> >In prose:
> >
> >The host is a host for KVM VMs. It receives IPv6 connectivity via RA
> >on eth0, where the default gateway announces its address as fe80::1.
> >It also provides IPv6 connectivity to the VMs via the br0 interface.
> >It is running radvd on br0, and for statically configured VMs it also
> >has fe80::1 on br0.
> >
> >If accept_ra_from_local on eth0 were 0, the system would not accept
> >the RA from the default gateway and end up with no IPv6 since fe80::1
> >is locally configured on br0.
> 
> Isn't this behavior fixed with
> 
> commit c1a9a291cee0890eb0f435243f3fb84fefb04348
> Author: Hannes Frederic Sowa <han...@stressinduktion.org>
> Date:   Wed Dec 23 22:44:37 2015 +0100
> 
> ipv6: honor ifindex in case we receive ll addresses in router
> advertisements
> 
> $ git describe --contains c1a9a291cee0890eb0f435243f3fb84fefb04348
> v4.4-rc8~5^2~10
> 
> ?
> 
> If you don't have fe80::1%br0 bound on exactly that interface, it should
> work, no? So, no need for accept_ra_from_local, which has dubious semantics
> anyway.

I have accept_ra_from_local set to 0 on all interfaces now, and I
still get the dubious default route on eth0.

> >If accept_ra_from_local on eth0 is 1, the system accepts both the RA
> >from the default gateway on eth0 _AND_ its own RA sent out and
> >received on br0, and, making things worse, is setting the IP address
> >and default route not on br0, but on eth0.
> 
> Understood. Thanks, I was just able to easily reproduce it. Was already
> wondering why someone would enable accept_ra_from_local besides only
> testing. I check it out, thanks!

Can you reproduce the behavior with accept_ra_from_local=0 as well?
Unfortunately, the debugging VM I built works fine, it's just the
physical host showing this behavior. This is really strange.

Greetings
Marc



Re: IPv6 route to gateway on fe80::1%eth0 when I have fe80::1%br0 locally

2016-02-22 Thread Marc Haber
Hi Hannes,

On Tue, Dec 22, 2015 at 10:50:04PM +0100, Hannes Frederic Sowa wrote:
> Thanks but no need to do that, I already cooked a patch and will submit
> tomorrow after some testing. We don't need to enhance the sysctl,
> default should be to simply check the interface too if a route with
> link-local address is received.

Kernel bugzilla #112751 is related to this.

The following is snipped to the relevant parts and was obtained on a
Debian system running kernel 4.4.2

[1/501]mh@fan:~$ for f in /proc/sys/net/ipv6/conf/*/{accept_ra,accept_ra_from_local,forwarding}; do echo $f; cat $f; done
/proc/sys/net/ipv6/conf/all/accept_ra
1
/proc/sys/net/ipv6/conf/br0/accept_ra
0
/proc/sys/net/ipv6/conf/default/accept_ra
1
/proc/sys/net/ipv6/conf/eth0/accept_ra
2
/proc/sys/net/ipv6/conf/all/accept_ra_from_local
0
/proc/sys/net/ipv6/conf/br0/accept_ra_from_local
0
/proc/sys/net/ipv6/conf/default/accept_ra_from_local
0
/proc/sys/net/ipv6/conf/eth0/accept_ra_from_local
1
[2/502]mh@fan:~$ ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet6 2a01:238:4071:328d:5604:a6ff:fe82:2100/64 scope global mngtmpaddr noprefixroute dynamic
       valid_lft 86038sec preferred_lft 14038sec
    inet6 2a01:238:4071:3282:5604:a6ff:fe82:2100/64 scope global mngtmpaddr noprefixroute dynamic
       valid_lft 86372sec preferred_lft 14372sec
3: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet6 2a01:238:4071:328d::1d:153/64 scope global
       valid_lft forever preferred_lft forever
    inet6 2a01:238:4071:328d::1d:100/64 scope global
       valid_lft forever preferred_lft forever
[3/503]mh@fan:~$ ip -6 r
default via fe80::1 dev eth0  proto ra  metric 1024  pref medium
default via fe80::c4f4:98ff:fedc:5e21 dev eth0  proto ra  metric 1024  pref medium
[4/504]mh@fan:~$

In prose:

The host is a host for KVM VMs. It receives IPv6 connectivity via RA
on eth0, where the default gateway announces its address as fe80::1.
It also provides IPv6 connectivity to the VMs via the br0 interface.
It is running radvd on br0, and for statically configured VMs it also
has fe80::1 on br0.

If accept_ra_from_local on eth0 were 0, the system would not accept
the RA from the default gateway and end up with no IPv6 since fe80::1
is locally configured on br0.

If accept_ra_from_local on eth0 is 1, the system accepts both the RA
from the default gateway on eth0 _AND_ its own RA sent out and
received on br0, and, making things worse, is setting the IP address
and default route not on br0, but on eth0.
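
When comparing the physical host with a test VM, the three sysctls
discussed here are easier to diff as one structure per interface. Below
is a minimal Python sketch; the function name ipv6_ra_settings and the
conf_root parameter are illustrative and not part of any existing tool.

```python
from pathlib import Path

KEYS = ("accept_ra", "accept_ra_from_local", "forwarding")


def ipv6_ra_settings(conf_root="/proc/sys/net/ipv6/conf"):
    """Return {interface: {key: int}} for the RA-related sysctls."""
    settings = {}
    for iface_dir in sorted(Path(conf_root).iterdir()):
        if not iface_dir.is_dir():
            continue
        values = {}
        for key in KEYS:
            path = iface_dir / key
            if path.exists():
                values[key] = int(path.read_text().strip())
        settings[iface_dir.name] = values
    return settings
```

The result includes the pseudo-interfaces "all" and "default", which is
useful here because their values interact with the per-interface ones.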

Greetings
Marc



Re: IPv6 route to gateway on fe80::1%eth0 when I have fe80::1%br0 locally

2015-12-22 Thread Marc Haber
Hi Hannes,

thanks for your mail.

On Tue, Dec 22, 2015 at 04:15:14PM +0100, Hannes Frederic Sowa wrote:
> On 12.12.2015 20:58, Marc Haber wrote:
> > Any hints would be appreciated.
> 
> This sysctl should help:
> 
> accept_ra_from_local - BOOLEAN
> Accept RA with source-address that is found on local machine
> if the RA is otherwise proper and able to be accepted.
> Default is to NOT accept these as it may be an un-intended
> network loop.
> 
> Functional default:
>enabled if accept_ra_from_local is enabled
>on a specific interface.
>disabled if accept_ra_from_local is disabled
>on a specific interface.
> 
> Anyway, this has to be fixed up in a clean way and should work by default.

The clean way would be:

accept_ra_from_local=0: never accept RA with source-address that is
  found on local machine
accept_ra_from_local=1: always accept RA with source-address that is
  found on local machine. Dangerous.
accept_ra_from_local=2: only accept RA with link local source-address
  that is found on local machine, and not if received RA points to an
  address that is locally configured on the same interface. Default.
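
To make the three proposed values concrete, here is a toy model in
Python. It is not kernel code and only encodes my reading of the
proposal above: value 2 accepts an RA from a locally known source only
if that source is link-local and is not configured on the interface the
RA arrived on. The function name and the local_addrs structure are
invented for the illustration.

```python
import ipaddress


def accept_ra(mode, src, rx_iface, local_addrs):
    """Decide whether an RA would be accepted under the proposed semantics.

    mode        -- proposed accept_ra_from_local value (0, 1 or 2)
    src         -- source address of the RA
    rx_iface    -- interface the RA was received on
    local_addrs -- {interface: set of locally configured addresses}
    """
    src_ip = ipaddress.ip_address(src)
    owners = {
        iface for iface, addrs in local_addrs.items()
        if src_ip in {ipaddress.ip_address(a) for a in addrs}
    }
    if not owners:
        return True   # source is not a local address: sysctl does not apply
    if mode == 0:
        return False  # never accept a locally known source
    if mode == 1:
        return True   # always accept (dangerous)
    if mode == 2:
        # Link-local sources only, and never our own address on rx_iface.
        return src_ip.is_link_local and rx_iface not in owners
    raise ValueError(f"unknown mode {mode!r}")
```

With fe80::1 configured on br0 only, an RA from fe80::1 arriving on eth0
is accepted under value 2, while the same RA arriving on br0 is not.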

Shall I file a bug for this in bugzilla?

Greetings
Marc



IPv6 route to gateway on fe80::1%eth0 when I have fe80::1%br0 locally

2015-12-12 Thread Marc Haber
Hi,

one of my systems (Debian unstable, kernel 4.3.2) serves as a host to
virtualize other systems. It therefore has a bridge interface br0 to
talk to the virtual machines. To keep configuration simple, I have
configured fe80::1 on br0, and the VMs use that as a default gateway
(in case SLAAC is turned off on the target machines).

|[1/498]mh@fan:~$ ip -6 addr show dev br0
|3: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
|inet6 2a01:238:4071:328d:c4f4:98ff:fedc:5e21/64 scope global mngtmpaddr dynamic
|   valid_lft 86400sec preferred_lft 14400sec
|inet6 2a01:238:4071:328d::1d:153/64 scope global
|   valid_lft forever preferred_lft forever
|inet6 2a01:238:4071:328d::1d:100/64 scope global
|   valid_lft forever preferred_lft forever
|inet6 fec0:0:0:::3/128 scope site
|   valid_lft forever preferred_lft forever
|inet6 fec0:0:0:::2/128 scope site
|   valid_lft forever preferred_lft forever
|inet6 fec0:0:0:::1/128 scope site
|   valid_lft forever preferred_lft forever
|inet6 fe80::1/64 scope link
|   valid_lft forever preferred_lft forever
|inet6 fe80::c4f4:98ff:fedc:5e21/64 scope link
|   valid_lft forever preferred_lft forever
|[2/499]mh@fan:~$

The machine itself does, of course, have an uplink to the Internet. I
would like to have it do SLAAC on that uplink interface so that it
learns the gateway automatically.
/proc/sys/net/ipv6/conf/{all,eth0}/forwarding is 1,
/proc/sys/net/ipv6/conf/{all,eth0}/accept_ra is 2.
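
For reference, a small Python sketch of how these two settings
interact. The helper names read_ipv6_conf and ra_processed are made up
for illustration; the rule itself follows the kernel's documented
accept_ra values (0: never accept, 1: accept only if forwarding is
disabled, 2: accept even if forwarding is enabled).

```python
from pathlib import Path


def read_ipv6_conf(iface, key, root="/proc/sys/net/ipv6/conf"):
    """Read one integer IPv6 sysctl, e.g. ("eth0", "accept_ra")."""
    return int((Path(root) / iface / key).read_text().strip())


def ra_processed(accept_ra, forwarding):
    """Return True if RAs are processed with these sysctl values."""
    if accept_ra == 2:
        # 2 overrules forwarding: RAs are accepted on a router as well.
        return True
    return accept_ra == 1 and forwarding == 0


if __name__ == "__main__":
    iface = "eth0"  # assumption: the uplink interface from this mail
    print(ra_processed(read_ipv6_conf(iface, "accept_ra"),
                       read_ipv6_conf(iface, "forwarding")))
```

With forwarding at 1, only accept_ra=2 keeps SLAAC alive on the
uplink, which is why it is set that way here.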

On older machines, this setup works fine.

Here is the result of SLAAC:
|[2/499]mh@fan:~$ ip -6 addr show dev eth0
|2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
|inet6 2a01:238:4071:3282:5604:a6ff:fe82:2100/64 scope global mngtmpaddr dynamic
|   valid_lft 86318sec preferred_lft 14318sec
|inet6 fe80::5604:a6ff:fe82:2100/64 scope link
|   valid_lft forever preferred_lft forever
|[3/500]mh@fan:~$

Please note that eth0 does _not_ have an fe80::1 address.

The gateway that is reachable on the physical eth0 is a Linux system
as well. It's running radvd 1.9.1, and it has fe80::1 configured on
its inner interface, for the same reason of ease of configuration on
systems not running SLAAC. It also has an auto-configured link local
address, fe80::7c79:61ff:fe31:5528/64.

For some strange reason, the radvd running on the router now
announces fe80::1 as the gateway address and not the auto-configured
link local address that older radvd versions (such as the 1.8 in
Debian oldstable) use.

Fan, the system in question, uses this as an excuse to honor only the
prefix announcement in the RA coming in from the router and to ignore
the gateway, presumably because the same IP address is bound to one
of its other interfaces.

In IPv6, this setup is, however, perfectly valid and common. fe80::1
is commonly used on interfaces that serve as gateway towards the
Internet so that the local admin does not need to think when manually
setting a default route. That the setup itself is sound is easily
shown by manually setting the route ("ip -6 route add default via
fe80::1 dev eth0"), which makes the entire setup work immediately.

Cross-checking with fe80::1 removed from br0, things are fine as
well: prefix and route are both learned on eth0 in this case.

I find the kernel's behavior perfectly valid for IPv4, where it makes
sense not to accept routes pointing towards locally bound IP
addresses. In IPv6, however, link local addresses depend on the
interface, and thus only the combination of IP address and interface
should be used for this extra check.

It should be possible to have fe80::1%br0 locally while having a route
point towards fe80::1%eth0. That this does not work is, in my opinion,
a kernel bug.
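
The check I have in mind can be sketched in a few lines of Python.
This is only an illustration of the argument, not the kernel's actual
code path; the names split_scope and gateway_is_local are made up, and
addresses are written in the usual address%interface notation.

```python
import ipaddress


def split_scope(addr):
    """Split 'fe80::1%eth0' into (IPv6Address('fe80::1'), 'eth0')."""
    host, _, zone = addr.partition("%")
    return ipaddress.IPv6Address(host), zone


def gateway_is_local(gateway, local):
    """Return True if the gateway refers to one of our own addresses.

    Link local addresses only count as the same address when the
    interface matches as well; for all other addresses the interface
    is ignored.
    """
    gw_addr, gw_zone = split_scope(gateway)
    for entry in local:
        addr, zone = split_scope(entry)
        if addr != gw_addr:
            continue
        if not addr.is_link_local or zone == gw_zone:
            return True
    return False


local = [
    "fe80::1%br0",
    "fe80::c4f4:98ff:fedc:5e21%br0",
    "fe80::5604:a6ff:fe82:2100%eth0",
]
print(gateway_is_local("fe80::1%eth0", local))  # the router's fe80::1
print(gateway_is_local("fe80::1%br0", local))   # our own fe80::1
```

Under this comparison, fe80::1%eth0 is not treated as local even
though fe80::1%br0 is, so the default route from the RA could be
installed.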

I am open to arguments why the kernel's behavior is correct this way,
and would like to know what to do on my systems to (a) have SLAAC
working on my "routing" VM host and to (b) keep ease of configuration
on non-SLAAC systems on both the physical and the virtual network.

Any hints would be appreciated.

Greetings
Marc

-- 
-----
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421