Jason Zhou created MESOS-10243:
----------------------------------
Summary: MAC Address changes from link::setMAC may not stick
Key: MESOS-10243
URL: https://issues.apache.org/jira/browse/MESOS-10243
Project: Mesos
Issue Type: Bug
Reporter: Jason Zhou
In some scenarios, Mesos containers cannot communicate with agents because
their MAC addresses are set incorrectly, which leads to dropped packets. A
workaround is to verify that the MAC address is set correctly after the ioctl
call, and to retry setting the address if it is not.
In our tests, this workaround appears to reduce the frequency of the issue, but
it does not prevent all such failures.
Review Board request for the workaround: [https://reviews.apache.org/r/75057/]
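For illustration, the check-and-retry idea described above can be sketched as
follows. This is a minimal sketch and not the code from the review: the names
setMacWithVerify, MacSetter, MacGetter and FlakyLink are hypothetical, and the
setter/getter hooks stand in for the ioctl set call and a read-back of the
address.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>

using Mac = std::array<uint8_t, 6>;

// Hypothetical hooks (not Mesos APIs): in a real agent these would wrap
// the ioctl that sets the address and a read-back of the current address.
using MacSetter = std::function<bool(const Mac&)>;
using MacGetter = std::function<std::optional<Mac>()>;

// Sets `desired`, reads the address back, and retries until the read-back
// matches or `maxAttempts` is exhausted. Returns the number of attempts
// used on success, or std::nullopt if the address never stuck.
std::optional<int> setMacWithVerify(
    const MacSetter& set,
    const MacGetter& get,
    const Mac& desired,
    int maxAttempts)
{
  assert(maxAttempts > 0);

  for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
    if (!set(desired)) {
      continue; // The set call itself failed; try again.
    }

    const std::optional<Mac> actual = get();
    if (actual.has_value() && *actual == desired) {
      return attempt;
    }
  }

  return std::nullopt;
}

// Test double that silently drops the first `drops` writes, imitating a
// set call that reports success but does not stick.
struct FlakyLink
{
  int drops;
  Mac current{};

  bool set(const Mac& mac)
  {
    if (drops > 0) {
      --drops;
      return true; // Claims success without changing the address.
    }
    current = mac;
    return true;
  }

  std::optional<Mac> get() const { return current; }
};
```

Note that a check like this only covers the window inside the set function. It
cannot catch an address that is overwritten after the function has returned.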
Observed scenarios with incorrectly assigned MAC addresses:
1. ioctl returns the correct MAC address, but net::mac does not.
2. ioctl and net::mac return the same MAC address, but it is the wrong one.
3. We have not observed a no-op: in no case did ioctl/net::mac come back
with the same MAC address as before the set.
4. The ioctl and net::mac results can disagree with each other even before
we attempt to set our desired MAC address. We therefore check that the
results agree before we set, and log a warning if we find a mismatch.
5. The MAC address we set can be overwritten by a garbage value after setMAC
has already completed and verified that the MAC address was set correctly.
Because this happens after the function has returned, setMAC can neither
detect nor log it, so our workaround cannot deal with this scenario.
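The pre-set agreement check from scenario 4 could look roughly like the sketch
below. It assumes Linux, reads the address once via ioctl(SIOCGIFHWADDR) and
once via getifaddrs() (which we assume is close to what net::mac does), and
classifies the two results. The names macViaIoctl, macViaGetifaddrs,
compareReads, toString and Agreement are hypothetical and are not Mesos APIs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <optional>
#include <string>

#include <ifaddrs.h>
#include <net/if.h>
#include <netpacket/packet.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

using Mac = std::array<uint8_t, 6>;

// Reads the hardware address of `link` with ioctl(SIOCGIFHWADDR).
std::optional<Mac> macViaIoctl(const std::string& link)
{
  if (link.size() >= IFNAMSIZ) return std::nullopt;

  const int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return std::nullopt;

  struct ifreq ifr;
  std::memset(&ifr, 0, sizeof(ifr));
  std::strncpy(ifr.ifr_name, link.c_str(), IFNAMSIZ - 1);

  const int rc = ::ioctl(fd, SIOCGIFHWADDR, &ifr);
  ::close(fd);
  if (rc < 0) return std::nullopt;

  Mac mac;
  std::memcpy(mac.data(), ifr.ifr_hwaddr.sa_data, mac.size());
  return mac;
}

// Reads the hardware address from the AF_PACKET entry of getifaddrs().
std::optional<Mac> macViaGetifaddrs(const std::string& link)
{
  struct ifaddrs* list = nullptr;
  if (::getifaddrs(&list) != 0) return std::nullopt;

  std::optional<Mac> result;
  for (struct ifaddrs* ifa = list; ifa != nullptr; ifa = ifa->ifa_next) {
    if (ifa->ifa_addr == nullptr ||
        ifa->ifa_addr->sa_family != AF_PACKET ||
        link != ifa->ifa_name) {
      continue;
    }

    const auto* ll =
      reinterpret_cast<const struct sockaddr_ll*>(ifa->ifa_addr);
    if (ll->sll_halen == 6) {
      Mac mac;
      std::memcpy(mac.data(), ll->sll_addr, mac.size());
      result = mac;
    }
    break;
  }

  ::freeifaddrs(list);
  return result;
}

enum class Agreement { Match, Mismatch, Unreadable };

// Classifies two independent reads of the same link's address.
Agreement compareReads(
    const std::optional<Mac>& a,
    const std::optional<Mac>& b)
{
  if (!a.has_value() || !b.has_value()) return Agreement::Unreadable;
  return *a == *b ? Agreement::Match : Agreement::Mismatch;
}

// Formats an address as lower-case, colon-separated hex for log messages.
std::string toString(const Mac& mac)
{
  char buf[18];
  const int n = std::snprintf(
      buf, sizeof(buf), "%02x:%02x:%02x:%02x:%02x:%02x",
      mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
  assert(n == 17);
  (void) n;
  return std::string(buf);
}
```

A caller would log a warning, including both toString() values, whenever
compareReads(macViaIoctl(name), macViaGetifaddrs(name)) returns
Agreement::Mismatch for the link in question.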
Notes:
1. So far, we have observed this behavior only on CentOS 9 systems.
We have tried kernels 5.15.147, 5.15.160, and 5.15.161, all of which have
this issue.
CentOS 7 systems do not seem to have this issue with setMAC.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)