Great to hear! Thanks for the update.

On Thu, Jan 14, 2021 at 5:18 PM Charles-François Natali <cf.nat...@gmail.com>
wrote:

> It's a bit old but in case it could help, we recently implemented this
> at work - here's how we did it:
> - the NUMA topology is exposed via agent custom resources
> - the framework does the allocation of the corresponding resources to
> the tasks according to the NUMA topology: e.g. if the task requests 2
> CPUs within the same NUMA node, the framework would allocate them
> - a custom executor then implements the CPU affinity/cpuset using the
> resources provided by the framework
>
> It works really nicely.
>
> Cheers,
>
> Charles
>
>
> On Tue, Jul 7, 2020 at 6:12 PM Milind Chabbi <mil...@uber.com> wrote:
> >
> > Grégoire, thanks for your reply. This is super helpful for making a stronger case around the affinity benefits.
> > Would you be able to share the additional details you mentioned? I am definitely interested.
> > Is your isolator source code publicly available?
> >
> > -Milind
> >
> > On Tue, Jul 7, 2020 at 3:14 AM Grégoire Seux <g.s...@criteo.com> wrote:
> >>
> >> Hello,
> >>
> >> I'd like to share our experience, since we worked on this last year.
> >> We used CFS bandwidth isolation for several years and encountered many issues (lack of predictability, bugs present in old Linux kernels, and lack of cache/memory locality). At some point, we implemented a custom isolator to manage cpusets (using https://github.com/criteo/mesos-command-modules/ as a base to write an isolator in a scripting language).
> >>
> >> The isolator's behavior was very simple: when a new task starts, look at which CPUs are not already in a cpuset cgroup, select (if possible) CPUs from the same NUMA node, and create a cpuset cgroup for the starting task.
> >> In practice, it gave a general decrease in CPU consumption (up to 8% for some CPU-intensive applications) and made the CPU isolation model easier to reason about.
> >> The allocation is optimistic: it tries to use CPUs from the same NUMA node, but if that's not possible, the task is spread across nodes. In practice this happens very rarely, thanks to one small optimization: CPUs are assigned from the most loaded NUMA node, which decreases fragmentation of available CPUs across NUMA nodes.
> >>
> >> I'd be glad to give more details if you are interested.
> >>
> >> --
> >> Grégoire
>
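For anyone reading this thread later, the allocation policy Grégoire describes (keep a task on one NUMA node when it fits, take CPUs from the most loaded such node, otherwise spread across nodes) could be sketched roughly as below. This is only an illustration under stated assumptions, not the Criteo isolator: pick_cpus and free_by_node are hypothetical names, "most loaded" is assumed to mean "fewest free CPUs" (which only holds if the nodes are the same size), and the order used when a task has to be spread is an assumption the thread does not specify.

```python
from typing import Dict, List, Set


def pick_cpus(free_by_node: Dict[int, Set[int]], wanted: int) -> List[int]:
    """Pick `wanted` free CPU ids, preferring a single NUMA node.

    `free_by_node` maps a NUMA node id to the CPU ids on that node that are
    not yet part of any cpuset cgroup (hypothetical input, see lead-in).
    """
    total_free = sum(len(cpus) for cpus in free_by_node.values())
    if wanted <= 0 or wanted > total_free:
        raise ValueError("cannot satisfy request for %d CPUs" % wanted)

    # Nodes that can host the whole task on their own.
    fitting = [n for n, cpus in free_by_node.items() if len(cpus) >= wanted]
    if fitting:
        # Most loaded node first (assumed: fewest free CPUs), ties by node id.
        # Filling busy nodes first leaves large free blocks on the others.
        node = min(fitting, key=lambda n: (len(free_by_node[n]), n))
        return sorted(free_by_node[node])[:wanted]

    # Fallback: spread across nodes. Taking from the emptiest nodes first
    # keeps the number of nodes touched small (assumed policy).
    picked: List[int] = []
    for node in sorted(free_by_node, key=lambda n: (-len(free_by_node[n]), n)):
        need = wanted - len(picked)
        if need == 0:
            break
        picked.extend(sorted(free_by_node[node])[:need])
    return picked
```

With free_by_node = {0: {0, 1, 2, 3}, 1: {4, 5}}, a 2-CPU task lands on node 1 (the busier node that still fits it), which keeps node 0 intact for a later 3- or 4-CPU task. The function only chooses CPU ids; creating the cpuset cgroup is left out.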

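The last step in Charles's setup, an executor applying the CPU affinity for the CPUs the framework handed out, might look something like the following on Linux. It is a minimal sketch, not the executor from the thread: pin_to_cpus is a hypothetical helper, and it assumes the CPU ids have already been extracted from the task's custom resources as plain integers.

```python
import os
from typing import Iterable, Set


def pin_to_cpus(cpus: Iterable[int], pid: int = 0) -> Set[int]:
    """Restrict `pid` (0 means the calling process) to the given CPU ids.

    Returns the affinity mask actually in effect afterwards.
    """
    wanted = set(cpus)
    if not wanted:
        raise ValueError("empty CPU list")
    # Linux-only call; raises OSError if none of the requested CPUs is
    # usable by the target process.
    os.sched_setaffinity(pid, wanted)
    return os.sched_getaffinity(pid)
```

An executor could call this in the child process right before exec'ing the task command, so the task and all of its threads inherit the mask. Writing the same ids into a cpuset cgroup would be the alternative if the restriction must not be changeable by the task itself.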