Hi Dmitry,

This problem needs to be addressed with topology information, so that the scheduler framework can use it to request affinity constraints.
We started looking into this when we were required to expose GPU hardware information. It would be good to introduce a generic topology structure so that generic interconnects and the associated resource topology can be expressed. Please have a look at https://issues.apache.org/jira/browse/MESOS-7080

--
Vikram

-----Original Message-----
From: Dmitry Zhuk [mailto:dz...@twopensource.com]
Sent: Wednesday, March 22, 2017 6:49 AM
To: dev@mesos.apache.org
Subject: CPU affinity

Hi,

Is anyone working on MESOS-314 <https://issues.apache.org/jira/browse/MESOS-314> “Support the cgroups 'cpusets' subsystem” or related functionality? I found other related tickets in JIRA, but there seems to be no recent progress on them: MESOS-5342 <https://issues.apache.org/jira/browse/MESOS-5342>, MESOS-5358 <https://issues.apache.org/jira/browse/MESOS-5358>. There’s also a mention of the idea of exposing cpusets in a way similar to network ports.

I’d like to propose an alternative approach for adding CPU affinity support and would be interested in any feedback on it. If the community is interested in this approach, I can work on a design document and an implementation.

The basic idea is to let frameworks specify affinity requirements in ContainerInfo using the following structure:

message AffinityInfo {
  enum ProcessingUnit {
    THREAD = 1;
    CORE = 2;
    SOCKET = 3;
    NUMA_NODE = 4;
  }

  // Indicates that the container should be bound to units of the specified type.
  // For example: bind = NUMA_NODE indicates that the process
  // can run on any thread of some NUMA node.
  required ProcessingUnit bind = 1;

  // Indicates that the assigned processing units must not be shared with
  // other containers.
  optional bool exclusive = 2 [default = false];
}

message ContainerInfo {
  …
  optional AffinityInfo affinity_info = …;
}

In the future, this can be extended to require exclusive NUMA node memory access, proximity to devices, etc.
This also requires exposing hardware topology information (such as the number of CPUs per node) to frameworks so that they can evaluate offer suitability, and giving frameworks visibility into failures to assign CPUs according to the requirements, but both can be left out of scope for the MVP.

Thanks
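
For illustration only, here is a minimal Python sketch of how an agent might resolve the proposed AffinityInfo into a set of CPUs. It is not Mesos code. The Thread record, unit_key, assign_cpuset, EXAMPLE_TOPOLOGY, and the rule of picking exactly one unit per container are all assumptions made for this example; only the ProcessingUnit values and the bind/exclusive semantics come from the proposal above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple


class ProcessingUnit(Enum):
    # Values mirror the enum in the proposed AffinityInfo message.
    THREAD = 1
    CORE = 2
    SOCKET = 3
    NUMA_NODE = 4


@dataclass(frozen=True)
class Thread:
    # Hypothetical description of one hardware thread (logical CPU).
    cpu_id: int
    core: int       # core index within its socket
    socket: int
    numa_node: int


def unit_key(thread: Thread, unit: ProcessingUnit) -> Tuple[int, ...]:
    """Return an identifier of the unit of the given type that owns the thread."""
    if unit is ProcessingUnit.THREAD:
        return (thread.cpu_id,)
    if unit is ProcessingUnit.CORE:
        # Core indices repeat across sockets, so qualify them by socket.
        return (thread.socket, thread.core)
    if unit is ProcessingUnit.SOCKET:
        return (thread.socket,)
    return (thread.numa_node,)


def assign_cpuset(
    topology: Iterable[Thread],
    bind: ProcessingUnit,
    exclusive: bool,
    shared_in_use: Set[int],
    exclusive_in_use: Set[int],
) -> Optional[FrozenSet[int]]:
    """Pick the first unit of type `bind` that satisfies the request.

    Assumption: a container is bound to exactly one unit. Returns the CPU ids
    of that unit, or None if no unit qualifies.
    """
    groups: Dict[Tuple[int, ...], Set[int]] = {}
    for thread in topology:
        groups.setdefault(unit_key(thread, bind), set()).add(thread.cpu_id)

    for key in sorted(groups):
        cpus = groups[key]
        if cpus & exclusive_in_use:
            continue  # some CPU is already owned exclusively by another container
        if exclusive and cpus & shared_in_use:
            continue  # an exclusive request cannot take CPUs that others already use
        return frozenset(cpus)
    return None


# Hypothetical machine: 2 sockets, 2 cores per socket, 2 threads per core,
# with one NUMA node per socket (CPU ids 0-3 on node 0, 4-7 on node 1).
EXAMPLE_TOPOLOGY = [
    Thread(cpu_id=s * 4 + c * 2 + t, core=c, socket=s, numa_node=s)
    for s in range(2)
    for c in range(2)
    for t in range(2)
]
```

With this example topology, an exclusive NUMA_NODE request made while CPU 0 is in shared use skips node 0 and resolves to CPUs 4-7. A None result corresponds to the "failure to assign CPUs" case that frameworks would need visibility into.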