[ https://issues.apache.org/jira/browse/SPARK-29151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934419#comment-16934419 ]
Thomas Graves commented on SPARK-29151:
---------------------------------------

To keep this simple for a design, I think we change the .amount config to be a Double and then make it behave like tasks per GPU. So we only allow 0-0.5 or whole numbers 1, 2, 3, 4. We don't allow 1.25, for instance, because we have no way to tell the user which GPU they get 1/4 of. We only do 0-0.5 because anything larger than 0.5 essentially just gives you 1 task per GPU.

For the math for the scheduler, I think we can do floor(1/amount). This should give us a nice multiple for tasks per GPU for the scheduler to track, e.g. floor(1/0.333) = 3. Internally the scheduler treats it as an int at that point, so we don't have issues with weird precision math.

I think this will be OK if we document it clearly and have log messages and such that show what it is really using.

> Support fraction resources for resource scheduling
> --------------------------------------------------
>
> Key: SPARK-29151
> URL: https://issues.apache.org/jira/browse/SPARK-29151
> Project: Spark
> Issue Type: Story
> Components: Scheduler
> Affects Versions: 3.0.0
> Reporter: Thomas Graves
> Priority: Major
>
> The current resource scheduling code for GPU/FPGA, etc. only supports amounts
> as integers, so you can only schedule whole resources. There are cases where
> you may want to share the resources and schedule multiple tasks to run on the
> same resources (GPU). It would be nice to support fractional resources.
> Somehow say we want a task to have 1/4 of a GPU for instance. I think we
> only want to support fractional when the resources amount is < 1. Otherwise
> you run into issues where someone asks for 2 1/8 GPU, which doesn't really
> make sense to me and makes assigning addresses very complicated.
> Need to think about implementation details, for instance using a float can be
> troublesome here due to floating point math precision issues.
> Another thing to consider, depending on implementation, is limiting the
> precision - go down to tenths, hundredths, thousandths, etc.
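
For illustration, the floor(1/amount) rule proposed in the comment could be sketched roughly as follows. This is only a sketch under assumptions, not Spark's actual scheduler code: the function name `slots_per_address` and the exact error messages are hypothetical, and it assumes the valid fractional range is greater than 0 and up to and including 0.5.

```python
import math


def slots_per_address(amount: float) -> int:
    """Return how many tasks may share one resource address (e.g. one GPU).

    Hypothetical helper illustrating the proposal in the comment:
      * 0 < amount <= 0.5   -> floor(1 / amount) tasks share one address
      * whole numbers >= 1  -> each address is held by exactly one task
      * anything else       -> rejected (e.g. 0.75 or 1.25)
    """
    if amount <= 0:
        raise ValueError(f"resource amount must be positive, got {amount}")
    if amount <= 0.5:
        # Convert to an integer once, so later bookkeeping avoids float math.
        return math.floor(1 / amount)
    if float(amount).is_integer():
        # A task asking for N whole resources takes N addresses, unshared.
        return 1
    raise ValueError(
        f"resource amount {amount} must be <= 0.5 or a whole number"
    )


if __name__ == "__main__":
    for amount in (0.25, 0.333, 0.5, 2):
        print(amount, "->", slots_per_address(amount), "task(s) per address")
```

The point of converting to an int up front is that the scheduler would only ever count whole slots per address afterwards, so the precision concerns raised in the issue description apply to a single conversion rather than to ongoing accounting.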