Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Stephan Ewen Fri, 13 Mar 2020 08:59:00 -0700

> > Can we somehow keep this out of the TaskManager services?
> I fear that we could not. IMO, the GPUManager (or
> ExternalServicesManagers in future) is conceptually one of the task
> manager services, just like MemoryManager before 1.10.
> - It maintains/holds the GPU resource at TM level and all of the
> operators allocate the GPU resources from it. So, it should be
> exclusive to a single TaskExecutor.
> - We could add a collection called ExternalResourceManagers to hold
> all managers of other external resources in the future.
>


Can you help me understand why this needs the addition in TaskManagerServices
or in the RuntimeContext?
Are you worried about the case when multiple Task Executors run in the same
JVM? That's not common, but wouldn't it actually be good in that case to
share the GPU Manager, given that the GPU is shared?
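
To make the question concrete, here is a minimal sketch of a manager that
is shared per JVM rather than held per TaskExecutor. All names in it
(SharedGpuManager, initialize, get, allocate, release) are hypothetical and
are not part of FLIP-108 or of Flink. It assumes the discovery step has
already produced a list of GPU indexes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: one GPU manager per JVM, shared by all task executors.
// None of these names come from FLIP-108 or from Flink itself.
final class SharedGpuManager {
    private static SharedGpuManager instance;

    // Indexes of GPUs that are currently not handed out to any operator.
    private final Set<String> freeGpus = new LinkedHashSet<>();

    private SharedGpuManager(List<String> discoveredGpus) {
        freeGpus.addAll(discoveredGpus);
    }

    // The first caller supplies the discovery result; later calls reuse it.
    static synchronized SharedGpuManager initialize(List<String> discoveredGpus) {
        if (instance == null) {
            instance = new SharedGpuManager(discoveredGpus);
        }
        return instance;
    }

    static synchronized SharedGpuManager get() {
        if (instance == null) {
            throw new IllegalStateException("SharedGpuManager is not initialized");
        }
        return instance;
    }

    // Hands out `amount` free GPUs in discovery order, or fails if too few are left.
    synchronized List<String> allocate(int amount) {
        if (amount < 0 || amount > freeGpus.size()) {
            throw new IllegalArgumentException("cannot allocate " + amount + " GPU(s)");
        }
        List<String> allocated = new ArrayList<>(amount);
        Iterator<String> it = freeGpus.iterator();
        while (allocated.size() < amount) {
            allocated.add(it.next());
            it.remove();
        }
        return allocated;
    }

    // Returns previously allocated GPUs to the free pool.
    synchronized void release(List<String> gpus) {
        freeGpus.addAll(gpus);
    }
}
```

With this shape, two TaskExecutors in one JVM that both call
SharedGpuManager.get().allocate(1) would receive different GPUs, and nothing
would need to be threaded through the TM services.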

Thanks,
Stephan

---------------------------


> > What parts need information about this?
> In this FLIP, operators need the information. Thus, we expose GPU
> information to the RuntimeContext/FunctionContext. The slot profile is
> not aware of GPU resources as GPU is a TM-level resource now.
>
> > Can the GPU Manager be a "self contained" thing that simply takes the
> > configuration, and then abstracts everything internally?
> Yes, we just pass the path/args of the discovery script and the number
> of GPUs per TM to it. It takes the responsibility to get the GPU
> information and expose it to the RuntimeContext/FunctionContext of
> operators. Meanwhile, we'd better not allow operators to directly
> access the GPUManager; they should get what they need from the Context.
> We could then decouple the interface/implementation of GPUManager from
> the public API.
>
> Best,
> Yangze Guo
>
> On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <se...@apache.org> wrote:
> >
> > It sounds fine to initially start with GPU specific support and think
> > about generalizing this once we better understand the space.
> >
> > About the implementation suggested in FLIP-108:
> >   - Can we somehow keep this out of the TaskManager services? Anything we
> >     have to pull through all layers of the TM makes the TM components yet
> >     more complex and harder to maintain.
> >
> >   - What parts need information about this?
> >     -> do the slot profiles need information about the GPU?
> >     -> Can the GPU Manager be a "self contained" thing that simply takes
> >        the configuration, and then abstracts everything internally?
> >        Operators can access it via "GPUManager.get()" or so?
> >
> >
> >
> > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <karma...@gmail.com> wrote:
> >
> > > Thanks for all the feedback.
> > >
> > > @Becket
> > > Regarding the WebUI and GPUInfo, you're right, I'll add them to the
> > > Public API section.
> > >
> > >
> > > @Stephan @Becket
> > > Regarding the general extended resource mechanism, I second Xintong's
> > > suggestion.
> > > - It's better to leverage ResourceProfile and ResourceSpec after we
> > > support fine-grained GPU scheduling. As a first step proposal, I
> > > prefer not to include it in the scope of this FLIP.
> > > - Regarding the "Extended Resource Manager", if I understand
> > > correctly, it is just a code refactoring atm; we could extract the
> > > open/close/allocateExtendResources of GPUManager to that interface. If
> > > that is the case, +1 to do it during implementation.
> > >
> > > @Xingbo
> > > As Xintong said, we looked into how Spark supports a general "Custom
> > > Resource Scheduling" before and decided to introduce a common resource
> > > configuration schema
> > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > to make it more extensible. I think the "resource" is a proper level
> > > to contain all the configs of extended resources.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <hxbks...@gmail.com> wrote:
> > > >
> > > > Thanks a lot for the FLIP, Yangze.
> > > >
> > > > There is no doubt that GPU resource management support will greatly
> > > > facilitate the development of AI-related applications by PyFlink
> > > > users.
> > > >
> > > > I have only one comment about this wiki:
> > > >
> > > > Regarding the names of several GPU configurations, I think it is
> > > > better to delete the resource field, which makes them consistent with
> > > > the names of other resource-related configurations in
> > > > TaskManagerOption.
> > > >
> > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > taskmanager.gpu.discovery-script.path
> > > >
> > > > Best,
> > > >
> > > > Xingbo
> > > >
> > > >
> > > > Xintong Song <tonysong...@gmail.com> wrote on Wed, Mar 4, 2020 at 10:39 AM:
> > > >
> > > > > @Stephan, @Becket,
> > > > >
> > > > > Actually, Yangze, Yang and I also had an offline discussion about
> > > > > making the "GPU Support" a general "Extended Resource Support". We
> > > > > believe supporting extended resources through a general mechanism
> > > > > is definitely a good and extensible way. The reason we propose
> > > > > this FLIP with its scope narrowed down to GPU alone is mainly the
> > > > > concern about the extra effort and review capacity needed for a
> > > > > general mechanism.
> > > > >
> > > > > To come up with a good design for a general extended resource
> > > > > management mechanism, we would need to investigate more on how
> > > > > people use different kinds of resources in practice. For GPU, we
> > > > > learnt such knowledge from the experts, Becket and his team
> > > > > members. But for FPGA, or other potential extended resources, we
> > > > > don't have such convenient information sources, so the
> > > > > investigation would require more effort, which I tend to think is
> > > > > not necessary atm.
> > > > >
> > > > > On the other hand, we also looked into how Spark supports a general
> > > > > "Custom Resource Scheduling". Assuming we want to have a similar
> > > > > general extended resource mechanism in the future, we believe that
> > > > > the current GPU support design can be easily extended, in an
> > > > > incremental way without too much rework.
> > > > >
> > > > >    - The most important part is probably user interfaces. Spark
> > > > >    offers configuration options to define the amount, discovery
> > > > >    script and vendor (on k8s) on a per resource type basis [1],
> > > > >    which is very similar to what we proposed in this FLIP. I think
> > > > >    it's not necessary to expose config options in the general way
> > > > >    atm, since we do not have support for other resource types now.
> > > > >    If we later decide to have per resource type config options, we
> > > > >    can have backwards compatibility with the currently proposed
> > > > >    options through simple key mapping.
> > > > >    - For the GPU Manager, if later needed we can change it to an
> > > > >    "Extended Resource Manager" (or whatever it is called). That
> > > > >    should be a pure component-internal refactoring.
> > > > >    - For ResourceProfile and ResourceSpec, there are already fields
> > > > >    for general extended resources. We can of course leverage them
> > > > >    when supporting fine-grained GPU scheduling. That is also not in
> > > > >    the scope of this first step proposal, and would require FLIP-56
> > > > >    to be finished first.
> > > > >
> > > > > To sum up, I agree with Becket that we should have a separate FLIP
> > > > > for the general extended resource mechanism, and keep it in mind
> > > > > when discussing and implementing the current one.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > > [1]
> > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > >
> > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <becket....@gmail.com> wrote:
> > > > >
> > > > > > That's a good point, Stephan. It makes total sense to generalize
> > > > > > the resource management to support custom resources. Having that
> > > > > > allows users to add new resources by themselves. The general
> > > > > > resource management may involve two different aspects:
> > > > > >
> > > > > > 1. The custom resource type definition. It is supported by the
> > > > > > extended resources in ResourceProfile and ResourceSpec. This will
> > > > > > likely cover the majority of the cases.
> > > > > >
> > > > > > 2. The custom resource allocation logic, i.e. how to assign the
> > > > > > resources to different tasks, operators, and so on. This may
> > > > > > require two levels / steps:
> > > > > >     a. Subtask level - make sure the subtasks are put into
> > > > > >     suitable slots. It is done by the global RM and is not
> > > > > >     customizable right now.
> > > > > >     b. Operator level - map the exact resource to the operators
> > > > > >     in TM, e.g. GPU 1 for operator A, GPU 2 for operator B. This
> > > > > >     step is needed assuming the global RM does not distinguish
> > > > > >     individual resources of the same type. It is true for memory,
> > > > > >     but not for GPU.
> > > > > >
> > > > > > The GPU manager is designed to do 2.b here. So it should discover
> > > > > > the physical GPU information and bind/match them to each
> > > > > > operator. Making this general will fill in the missing piece to
> > > > > > support custom resource type definition. But I'd avoid calling it
> > > > > > an "External Resource Manager" to avoid confusion with RM; maybe
> > > > > > something like "Operator Resource Assigner" would be more
> > > > > > accurate. So for each resource type users can have an optional
> > > > > > "Operator Resource Assigner" in the TM. For memory, users don't
> > > > > > need this, but for other extended resources, users may need that.
> > > > > >
> > > > > > Personally I think a pluggable "Operator Resource Assigner" is
> > > > > > achievable in this FLIP. But I am also OK with having that in a
> > > > > > separate FLIP because the interface between the "Operator
> > > > > > Resource Assigner" and operator may take a while to settle down
> > > > > > if we want to make it generic. But I think our implementation
> > > > > > should take this future work into consideration so that we don't
> > > > > > need to break backwards compatibility once we have that.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <se...@apache.org> wrote:
> > > > > >
> > > > > > > Thank you for writing this FLIP.
> > > > > > >
> > > > > > > I cannot really give much input into the mechanics of GPU-aware
> > > > > > > scheduling and GPU allocation, as I have no experience with
> > > > > > > that.
> > > > > > >
> > > > > > > One thought I had when reading the proposal is if it makes
> > > > > > > sense to look at the "GPU Manager" as an "External Resource
> > > > > > > Manager", and GPU is one such resource.
> > > > > > > The way I understand the ResourceProfile and ResourceSpec, that
> > > > > > > is how it is done there.
> > > > > > > It has the advantage that it looks more extensible. Maybe there
> > > > > > > is a GPU Resource, a specialized NVIDIA GPU Resource, an FPGA
> > > > > > > Resource, an Alibaba TPU Resource, etc.
> > > > > > >
> > > > > > > Best,
> > > > > > > Stephan
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <becket....@gmail.com> wrote:
> > > > > > >
> > > > > > > > Thanks for the FLIP Yangze. GPU resource management support
> > > > > > > > is a must-have for machine learning use cases. Actually it is
> > > > > > > > one of the most frequently asked questions from the users who
> > > > > > > > are interested in using Flink for ML.
> > > > > > > >
> > > > > > > > Some quick comments / questions on the wiki.
> > > > > > > > 1. The WebUI / REST API should probably also be mentioned in
> > > > > > > > the public interface section.
> > > > > > > > 2. Is the data structure that holds GPU info also a public
> > > > > > > > API?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song
> > > > > > > > <tonysong...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Thanks for drafting the FLIP and kicking off the
> > > > > > > > > discussion, Yangze.
> > > > > > > > >
> > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in
> > > > > > > > > Flink is significant, especially for the ML scenarios.
> > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I
> > > > > > > > > think it's a very good first step for Flink's GPU support.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo
> > > > > > > > > <karma...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > We would like to start a discussion thread on "FLIP-108:
> > > > > > > > > > Add GPU support in Flink"[1].
> > > > > > > > > >
> > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > >
> > > > > > > > > > - Enable users to configure how many GPUs are in a task
> > > > > > > > > > executor and forward such requirements to the external
> > > > > > > > > > resource managers (for Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > - Provide information on available GPU resources to
> > > > > > > > > > operators.
> > > > > > > > > >
> > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > >
> > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > - Introduce GPUManager as one of the task manager
> > > > > > > > > > services to discover and expose GPU resource information
> > > > > > > > > > to the context of functions.
> > > > > > > > > > - Introduce the default script for GPU discovery, in
> > > > > > > > > > which we provide the privilege mode to help users
> > > > > > > > > > achieve worker-level isolation in standalone mode.
> > > > > > > > > >
> > > > > > > > > > Please find more details in the FLIP wiki document [1].
> > > > > > > > > > Looking forward to your feedback.
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
>
