[
https://issues.apache.org/jira/browse/MESOS-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866134#comment-17866134
]
Benjamin Mahler commented on MESOS-10243:
-----------------------------------------
Landed the fix for the host network namespace veth<pid> interface.
Let's leave this open and mark it as fixed once we also set the container network
namespace eth0 interface's MAC address on creation / update the script to stop
setting it.
> MAC Address changes from link::setMAC may not stick, leading to container
> launch failure with port mapping isolator.
> --------------------------------------------------------------------------------------------------------------------
>
> Key: MESOS-10243
> URL: https://issues.apache.org/jira/browse/MESOS-10243
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 1.11.0
> Reporter: Jason Zhou
> Assignee: Jason Zhou
> Priority: Major
>
> It seems that there are scenarios where Mesos containers cannot communicate
> with agents because their MAC addresses are set incorrectly, leading to
> dropped packets. A workaround for this behavior is to check that the MAC
> address is set correctly after the ioctl call, and retry the address setting
> if necessary.
> In our tests, this workaround appears to reduce the frequency of this issue,
> but does not seem to prevent all such failures.
> Reviewboard ticket for the workaround: [https://reviews.apache.org/r/75057/]
> Observed scenarios with incorrectly assigned MAC addresses:
> 1. ioctl returns the correct MAC address, but net::mac does not.
> 2. Both net::mac and ioctl return the same MAC address, but both are wrong.
> 3. There are no cases where ioctl/net::mac come back with the same MAC
> address as before setting, i.e. no no-op has been observed.
> 4. There is a possibility that ioctl/net::mac results disagree with each
> other even before attempting to set our desired MAC address. As such, we
> check that the results agree before we set, and log a warning if we find
> a mismatch.
> 5. There is a possibility that the MAC address we set ends up overwritten by
> a garbage value after setMAC has already completed and checked that the
> MAC address was set correctly. Since this error happens after this
> function has finished, we can neither log nor detect it in setMAC. Our
> workaround cannot deal with this scenario as it occurs outside setMAC.
> Notes:
> 1. We have observed this behavior only on CentOS 9 systems at the moment.
> We have tried kernels 5.15.147, 5.15.160, and 5.15.161, which all have this
> issue.
> CentOS 7 systems do not seem to have this issue with setMAC.
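The "check after the ioctl call and retry" workaround described above can be sketched roughly as follows. This is a minimal illustration only, not the patch posted at https://reviews.apache.org/r/75057/ and not the actual link::setMAC code: the names `Mac`, `getMacIoctl`, `setMacIoctl` and `setMacWithVerify`, as well as the default of three attempts, are hypothetical. It assumes a Linux host and reads the address back only through ioctl (SIOCGIFHWADDR), so it does not cover the net::mac cross-check from scenario 4.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

#include <net/if.h>
#include <net/if_arp.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper type: a 6-byte Ethernet hardware address.
using Mac = std::array<uint8_t, 6>;

// Reads the hardware address of `link` with SIOCGIFHWADDR.
// Returns false if the socket or the ioctl call fails.
bool getMacIoctl(const std::string& link, Mac* out)
{
  int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    return false;
  }

  struct ifreq ifr;
  std::memset(&ifr, 0, sizeof(ifr));
  std::strncpy(ifr.ifr_name, link.c_str(), IFNAMSIZ - 1);

  bool ok = ::ioctl(fd, SIOCGIFHWADDR, &ifr) == 0;
  if (ok) {
    std::memcpy(out->data(), ifr.ifr_hwaddr.sa_data, out->size());
  }

  ::close(fd);
  return ok;
}

// Writes the hardware address of `link` with SIOCSIFHWADDR.
// This normally requires CAP_NET_ADMIN.
bool setMacIoctl(const std::string& link, const Mac& mac)
{
  int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    return false;
  }

  struct ifreq ifr;
  std::memset(&ifr, 0, sizeof(ifr));
  std::strncpy(ifr.ifr_name, link.c_str(), IFNAMSIZ - 1);
  ifr.ifr_hwaddr.sa_family = ARPHRD_ETHER;
  std::memcpy(ifr.ifr_hwaddr.sa_data, mac.data(), mac.size());

  bool ok = ::ioctl(fd, SIOCSIFHWADDR, &ifr) == 0;

  ::close(fd);
  return ok;
}

// Sets the address, reads it back, and retries while the value read back
// differs from `desired`. The setter and getter are passed in so that the
// retry logic can be exercised without a real interface.
bool setMacWithVerify(
    const Mac& desired,
    const std::function<bool(const Mac&)>& set,
    const std::function<bool(Mac*)>& get,
    int maxAttempts = 3)
{
  assert(maxAttempts > 0);

  for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
    Mac actual{};
    if (set(desired) && get(&actual) && actual == desired) {
      return true;
    }
  }

  return false;
}
```

A caller would bind the two ioctl helpers to a specific interface name, for example with lambdas that capture the name, and pass them to `setMacWithVerify`. As scenario 5 notes, a check of this kind cannot catch an address that is overwritten after the function has returned.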