On Fri, Apr 10, 2020 at 12:14:17PM +0200, Martin Pieuchot wrote:
> On 10/04/20(Fri) 11:18, Claudio Jeker wrote:
> > On Fri, Apr 10, 2020 at 10:47:53AM +0200, Martin Pieuchot wrote:
> > > On 09/04/20(Thu) 20:22, Laurent Salle wrote:
> > > > On 08/04/2020 06.52, Martin Pieuchot wrote:
> > > >
> > > > > It's the same bug as reported by sthen@. Two interfaces in the same
> > > > > subnet have two identical cloning routes:
> > > >
> > > > I've been able to reproduce systematically the problem with an OpenBSD
> > > > virtual machine running the latest snapshot and two vio interface with
> > > > different priority connected to the same lan with dhcp.
> > >
> > > Thanks for the report! Diff below seems to fix the issue here, could
> > > you try it?
> >
> > I'm not convinced that this is the right solution. In your diff you insert
> > the MAC received on one interface into the arp node of another interface.
> > This feels wrong, arp entries should never cross over interfaces.
> > For example if for some reasons the two interfaces have the same gateway
> > IP but use different MACs for that IP then this breaks.
>
> Makes sense.
>
> Well it looks like when the default route on if0 tries to use the L2
> route underneath it, the ARP layer resolve the entry on if1 instead of
> on if0.
>
> The route on if0 is being used because it has higher priority, however
> the L2 entry on if1 has been inserted first. I haven't debugged
> further.

Yes, this comes from the fact that rtalloc() finds the gateway route of the
wrong interface and does not clone a new entry from the other interface, so
the rt_gwroute cache ends up all messed up.

I fear we need some special functions to fix these issues with rtalloc().
ARP should do its lookups with an interface and ignore routes that are not
from that interface. Then the function needs to walk the routing table to
find the cloning route. So it kind of needs to rtable_iterate() and then
also backtrack upwards until it finds the right route.

Side note: rtalloc() always returns the best matching route without
considering the state of the route. Because of this, rerouting because of a
down interface does not really work either. Again, rtalloc() would need to
backtrack upwards if a route is not valid.

-- 
:wq Claudio
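
To make the idea above concrete, here is a rough userland sketch of an
interface-restricted lookup. It is a toy model only: struct toy_route and
every toy_*() name are hypothetical and are not the kernel's rtentry or
rtable code. A flat array stands in for the routing table, and skipping
unsuitable candidates while keeping the longest remaining prefix stands in
for "backtracking upwards". toy_lookup_best() mimics the behaviour
described for rtalloc() (best match, regardless of interface or state);
toy_lookup_if() ignores routes that are on another interface, are down, or
are not cloning routes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy route entry, NOT the kernel's struct rtentry. */
struct toy_route {
	uint32_t	dst;		/* network address, host byte order */
	int		plen;		/* prefix length, 0..32 */
	unsigned int	ifidx;		/* interface the route belongs to */
	int		cloning;	/* nonzero for a cloning route */
	int		up;		/* nonzero if the route is usable */
};

/* Build a netmask from a prefix length; /0 is special-cased to avoid
 * shifting a 32-bit value by 32. */
static uint32_t
toy_mask(int plen)
{
	assert(plen >= 0 && plen <= 32);
	return (plen == 0) ? 0 : (uint32_t)(0xffffffffu << (32 - plen));
}

/* Does addr fall inside the prefix covered by rt? */
static int
toy_match(const struct toy_route *rt, uint32_t addr)
{
	uint32_t mask = toy_mask(rt->plen);

	return ((addr & mask) == (rt->dst & mask));
}

/* Longest-prefix match that looks at nothing else. On a tie the entry
 * that comes first in the table wins. */
static const struct toy_route *
toy_lookup_best(const struct toy_route *tbl, size_t n, uint32_t addr)
{
	const struct toy_route *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!toy_match(&tbl[i], addr))
			continue;
		if (best == NULL || tbl[i].plen > best->plen)
			best = &tbl[i];
	}
	return (best);
}

/* Interface-aware variant: only usable cloning routes on ifidx are
 * candidates; among those the longest prefix wins. Returns NULL when
 * no such route exists. */
static const struct toy_route *
toy_lookup_if(const struct toy_route *tbl, size_t n, uint32_t addr,
    unsigned int ifidx)
{
	const struct toy_route *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!toy_match(&tbl[i], addr))
			continue;
		if (tbl[i].ifidx != ifidx || !tbl[i].up || !tbl[i].cloning)
			continue;	/* "backtrack": skip this candidate */
		if (best == NULL || tbl[i].plen > best->plen)
			best = &tbl[i];
	}
	return (best);
}
```

With two identical /24 cloning routes, one per interface, the plain lookup
returns whichever entry sits first in the table, while the interface-aware
one returns the entry that belongs to the interface asking. A real fix
would have to do this on the actual routing table structure, which this
sketch does not attempt to model.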