I suspect that the netifd changes are related, since that looks like the
only relevant area of major activity in the past month, which is when this
began happening. Then again, the timing-sensitive nature means that the
underlying problem may have been present for a while, and was only exposed
by the recent netifd changes.
I've noticed that it's possible for the MAC address to change during a
DHCP client transaction when OpenWrt is configured to obtain a DHCP lease
on a bridged interface. In my example, I have a WNDR3700 configured as
follows in /etc/config/network:
    config interface lan
        option ifname eth0.1
        option type bridge
        option proto dhcp
wlan0 and wlan1 are also configured to join this bridged interface, both
configured in /etc/config/wireless with:
    config wifi-iface
        option network lan
eth0.1's MAC address is c6:xx:xx:xx:xx:01 (with the locally-administered
[LA] bit set). wlan0's is c4:xx:xx:xx:xx:01, and wlan1's is
c4:xx:xx:xx:xx:03. On the DHCP server (also OpenWrt running dnsmasq), I
observe the DHCP transaction when the WNDR3700 boots:
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPDISCOVER(br-lan) c6:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPOFFER(br-lan) 192.168.1.211 c6:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPREQUEST(br-lan) 192.168.1.211 c4:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPACK(br-lan) 192.168.1.211 c4:xx:xx:xx:xx:01
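For anyone who wants to check their own server logs for the same symptom, here is a rough Python sketch that pulls the message type and client MAC out of dnsmasq-dhcp lines like the ones above and reports whether more than one MAC shows up. The regular expression and the function names are my own assumptions, written against the log format shown here only; the pattern accepts "x" as a hex digit purely because the addresses in this mail are masked.

```python
import re

# Matches dnsmasq-dhcp lines such as
#   "DHCPREQUEST(br-lan) 192.168.1.211 c4:xx:xx:xx:xx:01"
# The IP address is optional because DHCPDISCOVER lines do not carry one.
# 'x' is accepted as a digit only because the MACs in this mail are masked.
LINE_RE = re.compile(
    r"(DHCP[A-Z]+)\([^)]*\)\s+(?:\d+\.\d+\.\d+\.\d+\s+)?"
    r"((?:[0-9a-fx]{2}:){5}[0-9a-fx]{2})"
)


def macs_by_message(log_text):
    """Return (message type, client MAC) pairs in the order they were logged."""
    return LINE_RE.findall(log_text)


def mac_changed(log_text):
    """True if more than one distinct client MAC appears in the log text."""
    return len({mac for _, mac in macs_by_message(log_text)}) > 1


LOG = """\
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPDISCOVER(br-lan) c6:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPOFFER(br-lan) 192.168.1.211 c6:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPREQUEST(br-lan) 192.168.1.211 c4:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPACK(br-lan) 192.168.1.211 c4:xx:xx:xx:xx:01
"""

for message, mac in macs_by_message(LOG):
    print(message, mac)
print("MAC changed during transaction:", mac_changed(LOG))
```

This only compares MACs within whatever text it is handed, so it assumes the input has already been narrowed down to a single client's transaction.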
Note that the MAC address changes from having the LA bit set
(c6:xx:xx:xx:xx:01) when the client sends its DHCPDISCOVER to not having
the LA bit set (c4:xx:xx:xx:xx:01) when it sends its DHCPREQUEST. This is
because the br-lan interface begins life taking the MAC address of its
only initial underlying interface, eth0.1, which has the LA bit set. When
wlan0 is added to this bridge, the bridge winds up taking that interface's
MAC address (c4:xx:xx:xx:xx:01) instead. I have observed that this happens
reproducibly, right in the middle of the DHCP transaction intended to
assign the interface's address. The logs from the client side confirm
this:
Dec 31 19:00:12 ap2 kern.debug kernel: [ 12.890000] ar71xx: pll_reg 0xb8050010: 0x11110000
Dec 31 19:00:12 ap2 kern.info kernel: [ 12.890000] eth0: link up (1000Mbps/Full duplex)
Dec 31 19:00:12 ap2 kern.info kernel: [ 12.890000] device eth0.1 entered promiscuous mode
Dec 31 19:00:12 ap2 kern.info kernel: [ 12.900000] device eth0 entered promiscuous mode
Dec 31 19:00:12 ap2 kern.info kernel: [ 12.920000] br-lan: port 1(eth0.1) entered forwarding state
Dec 31 19:00:12 ap2 kern.info kernel: [ 12.920000] br-lan: port 1(eth0.1) entered forwarding state
Dec 31 19:00:13 ap2 daemon.notice netifd: lan (628): udhcpc (v1.19.4) started
Dec 31 19:00:13 ap2 daemon.notice netifd: lan (628): Sending discover...
Dec 31 19:00:14 ap2 kern.info kernel: [ 14.700000] device wlan0 entered promiscuous mode
Dec 31 19:00:14 ap2 kern.info kernel: [ 14.920000] br-lan: port 1(eth0.1) entered forwarding state
Dec 31 19:00:14 ap2 kern.info kernel: [ 14.960000] br-lan: port 2(wlan0) entered forwarding state
Dec 31 19:00:14 ap2 kern.info kernel: [ 14.960000] br-lan: port 2(wlan0) entered forwarding state
Dec 31 19:00:15 ap2 daemon.notice netifd: lan (628): Sending select for 192.168.1.211...
Dec 31 19:00:15 ap2 daemon.notice netifd: lan (628): Lease of 192.168.1.211 obtained, lease time 43200
Dec 31 19:00:15 ap2 daemon.notice netifd: Interface 'lan' is now up
Jun 26 12:21:58 ap2 kern.info kernel: [ 16.960000] br-lan: port 2(wlan0) entered forwarding state
Jun 26 12:21:58 ap2 kern.info kernel: [ 17.480000] device wlan1 entered promiscuous mode
Jun 26 12:22:01 ap2 kern.info kernel: [ 20.250000] br-lan: port 3(wlan1) entered forwarding state
Jun 26 12:22:01 ap2 kern.info kernel: [ 20.260000] br-lan: port 3(wlan1) entered forwarding state
Jun 26 12:22:03 ap2 kern.info kernel: [ 22.260000] br-lan: port 3(wlan1) entered forwarding state
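To make the address change easier to reason about, here is a small Python sketch of what I believe is going on. It assumes the bridge behaves the way the Linux bridge normally does when no MAC has been set on it explicitly, namely that it adopts the numerically lowest MAC among its current ports. That rule is my assumption about the cause, not something the logs prove. The concrete addresses are hypothetical: I have replaced the masked "xx" octets with 00 so the code can run.

```python
def mac_to_bytes(mac):
    """Convert 'c6:00:00:00:00:01' into a bytes object for comparison."""
    return bytes(int(octet, 16) for octet in mac.split(":"))


def is_locally_administered(mac):
    """The LA bit is the second-lowest bit (0x02) of the first octet."""
    return bool(mac_to_bytes(mac)[0] & 0x02)


def bridge_mac(port_macs):
    """Assumed rule: the bridge uses the lowest MAC among its ports."""
    return min(port_macs, key=mac_to_bytes)


# Hypothetical stand-ins for the masked addresses in this mail.
ETH0_1 = "c6:00:00:00:00:01"
WLAN0 = "c4:00:00:00:00:01"
WLAN1 = "c4:00:00:00:00:03"

ports = []
history = []
for mac in (ETH0_1, WLAN0, WLAN1):  # order in which ports join br-lan
    ports.append(mac)
    history.append(bridge_mac(ports))

for step, mac in enumerate(history, start=1):
    print(step, mac, "LA bit set" if is_locally_administered(mac) else "LA bit clear")
```

Under that assumption the bridge starts out with eth0.1's address and switches to wlan0's as soon as wlan0 joins, because 0xc4 sorts below 0xc6. Adding wlan1 changes nothing further, since its address is higher than wlan0's.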
Using a variable hardware address for a DHCP transaction is no good.
Having the hardware address be unpredictable is also a problem. I
discovered this problem when debugging why a static DHCP address
assignment ("config host" in /etc/config/dhcp on the server) was not
effective. If I use c6:xx:xx:xx:xx:01 on the server, then the client won't
be able to DHCPREQUEST the desired address once it begins using MAC address
c4:xx:xx:xx:xx:01. If I use c4:xx:xx:xx:xx:01 on the server, then the
server won't send a DHCPOFFER for the desired address because the
DHCPDISCOVER will have a different MAC address.
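The two failure cases can be sketched with a toy model in Python. This is not dnsmasq's actual logic, just a deliberately simplified server that offers the reserved address when the DHCPDISCOVER MAC has a static entry, falls back to a pool address otherwise, and refuses a DHCPREQUEST for an address reserved for some other MAC. The function name and the reserved address 192.168.1.10 are made up for illustration.

```python
def simulate(static_hosts, pool_address, discover_mac, request_mac):
    """Toy model: return (offered address, whether the request is accepted).

    static_hosts maps a MAC to its reserved address. The offer is chosen
    from the MAC seen in DHCPDISCOVER; the request is refused if the
    offered address is reserved for a MAC other than the requester's.
    """
    offered = static_hosts.get(discover_mac, pool_address)
    owner = next((mac for mac, ip in static_hosts.items() if ip == offered), None)
    accepted = owner is None or owner == request_mac
    return offered, accepted


C6 = "c6:xx:xx:xx:xx:01"   # bridge MAC at DHCPDISCOVER time
C4 = "c4:xx:xx:xx:xx:01"   # bridge MAC at DHCPREQUEST time
WANTED = "192.168.1.10"    # hypothetical reserved address
POOL = "192.168.1.211"     # dynamic address seen in the logs

# Static entry keyed on c6: the offer is right, but the request comes from c4.
print(simulate({C6: WANTED}, POOL, discover_mac=C6, request_mac=C4))

# Static entry keyed on c4: the discover comes from c6, so the pool address is offered.
print(simulate({C4: WANTED}, POOL, discover_mac=C6, request_mac=C4))

# What should happen if the MAC stayed the same for the whole transaction.
print(simulate({C4: WANTED}, POOL, discover_mac=C4, request_mac=C4))
```

In the first case the desired address is offered but cannot be confirmed; in the second the transaction completes but with the wrong address. Only a MAC that stays constant across the transaction gets the reserved address.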
Ultimately, it may just be a bad idea to have a bridge's MAC address
change once established, at least as long as the bridge still contains an
underlying interface that "owns" the MAC address it's using.