Adding GPU support to the docker containerizer is not very hard to do. The
past choice *not* to build GPU support for the docker containerizer was a
conscious one, made in order to get people to move over to the UCR instead.
All of the other innovations we work on are prioritised for the UCR, and we
didn't see a compelling reason to make an exception for GPU support.
Building on nvidia-docker would have required only minimal changes to
Mesos, but it would have introduced yet another dependency into the system,
which we didn't want.

However, this was a decision made over 3 years ago, and maybe it's time to
revisit it.

The next docker release will include an integrated `--gpus` flag, bypassing
the need for nvidia-docker entirely:

https://github.com/docker/cli/pull/1714

With this in place it really would be trivial to add support for GPUs to
the docker containerizer, since there would be no requirement for users to
do any external setup for nvidia-docker.
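
To make that concrete, here is a rough sketch in Python of the only change
that would really be needed: appending `--gpus` to the `docker run` command
line when a task has GPUs allocated. This is purely illustrative and not how
the containerizer is implemented; `docker_run_argv` and its parameters are
hypothetical names, and the `device=...` value assumes the syntax the flag
ends up shipping with.

```python
from typing import List, Optional, Sequence


def docker_run_argv(
    image: str,
    command: Sequence[str],
    gpu_devices: Optional[Sequence[int]] = None,
    docker: str = "docker",
) -> List[str]:
    """Build a `docker run` argv, adding `--gpus` only when GPUs are allocated.

    Hypothetical helper for illustration; not part of Mesos.
    """
    argv = [docker, "run", "--rm"]
    if gpu_devices:
        ids = ",".join(str(i) for i in gpu_devices)
        # Assumption: the flag takes a `device=<ids>` value. The inner double
        # quotes keep the comma-separated list together as a single value
        # when the docker CLI parses the flag.
        argv += ["--gpus", f'"device={ids}"']
    argv.append(image)
    argv.extend(command)
    return argv
```

A task with no GPUs gets exactly the command line it gets today, so nothing
changes for existing users of the docker containerizer.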

What do people think? Has the landscape changed enough that it now makes
sense to add GPU support to the docker containerizer, given the upcoming
`--gpus` flag?

Kevin

On Fri, Apr 5, 2019 at 6:58 PM Benjamin Mahler <bmah...@apache.org> wrote:

> +Kevin Klues
>
>
> On Fri, Apr 5, 2019 at 1:24 AM Huadong Liu <h...@yelp.com.invalid> wrote:
>
>> Hi Ben, thanks for pointing me to the docker containerizer ticket. I do
>> see the value of UCR.
>>
>> Since nvidia-docker already takes care of mounting the driver etc., if we
>> use the "--docker=nvidia-docker" agent option to replace the docker
>> command with the nvidia-docker command, GPU support with the docker
>> containerizer seems trivial. Did I miss anything?
>>
>> On Thu, Apr 4, 2019 at 8:00 PM Benjamin Mahler <bmah...@apache.org> wrote:
>>
>> > The "UCR" (aka mesos containerizer) and "Docker containerizer" are two
>> > different containerizers that users tend to choose between. UCR is what
>> > many of our serious users rely on and so we made the investment there
>> > first. GPU support for the docker containerizer was also something that
>> > was planned, but hasn't been prioritized:
>> > https://issues.apache.org/jira/browse/MESOS-5795
>> >
>> > These days, many of our users use Docker images with UCR (i.e. bypassing
>> > the need for the docker daemon).
>> >
>> > Maybe the containerization devs can chime in here if I'm saying anything
>> > inaccurate, or to shed some light on where things are headed.
>> >
>> > On Wed, Apr 3, 2019 at 2:21 PM Huadong Liu <h...@yelp.com.invalid> wrote:
>> >
>> > > Hi,
>> > >
>> > > Nvidia GPU support in Mesos/Marathon mandates the mesos containerizer
>> > > <https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/state/AppDefinition.scala#L557>
>> > > which "mimics" nvidia-docker.
>> > > <http://mesos.apache.org/documentation/latest/gpu-support/> Can someone
>> > > help me understand why docker containerizer with agent option
>> > > "--docker=nvidia-docker" wasn't the choice? Thank you!
>> > >
>> > > --
>> > > Huadong
>> > >
>> >
>>
>
