[ 
https://issues.apache.org/jira/browse/SPARK-29151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934419#comment-16934419
 ] 

Thomas Graves commented on SPARK-29151:
---------------------------------------

To keep the design simple, I think we change the .amount config to be a 
Double and then have it work roughly like tasks per GPU.

So we only allow 0-0.5 or whole numbers 1,2,3,4.  We don't allow 1.25, for 
instance, because we have no way to tell the user which GPU they get 1/4 of. We 
only do 0-0.5 because anything larger than 0.5 essentially just gives you 1 task 
per GPU.
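
A rough sketch of that validation rule, written in Python for brevity (Spark 
itself is Scala). The name validate_amount is made up for illustration and is 
not a Spark API. Rejecting 0 is also an assumption on my part, since the rule 
above only says "0-0.5".

```python
def validate_amount(amount: float) -> float:
    """Accept a value in (0, 0.5] or a positive whole number.

    Hypothetical helper, not part of Spark. Treating 0 as invalid is an
    assumption; the rule being sketched only says "0-0.5".
    """
    if amount <= 0:
        raise ValueError(f"amount must be positive, got {amount}")
    if amount <= 0.5:
        return amount  # fractional share of a single GPU
    if float(amount).is_integer():
        return amount  # whole GPUs: 1, 2, 3, ...
    # Covers 0.5 < amount < 1 as well as values like 1.25.
    raise ValueError(f"amount must be <= 0.5 or a whole number, got {amount}")
```

With this, 0.25 and 2 pass, while 0.75 and 1.25 are rejected.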

For the scheduler math, I think we can do floor(1/amount). This should 
give us a nice whole number of tasks per GPU for the scheduler to track, e.g. 
floor(1/0.333) = 3. Basically the scheduler internally treats it as an int at 
that point, so we don't run into floating point precision issues.
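
To make the floor(1/amount) idea concrete, here is a minimal sketch in Python 
(chosen for brevity, Spark itself is Scala). The function name tasks_per_gpu 
is hypothetical and only illustrates the arithmetic, not how the scheduler 
would actually be implemented.

```python
import math


def tasks_per_gpu(amount: float) -> int:
    """Convert a fractional amount in (0, 0.5] to whole task slots per GPU.

    Hypothetical helper for illustration only, not a Spark API.
    """
    if not 0 < amount <= 0.5:
        raise ValueError(f"expected 0 < amount <= 0.5, got {amount}")
    # After this point only the int is used, so no float comparisons remain.
    return math.floor(1 / amount)


for amount in (0.5, 0.333, 0.25, 0.3):
    print(amount, "->", tasks_per_gpu(amount))
```

Note that 0.3 also comes out as 3 tasks per GPU, so the share each task 
effectively gets can differ slightly from the configured amount.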

I think this will be ok if we document it clearly and have log messages and 
such showing what value it is really using.

> Support fraction resources for resource scheduling
> --------------------------------------------------
>
>                 Key: SPARK-29151
>                 URL: https://issues.apache.org/jira/browse/SPARK-29151
>             Project: Spark
>          Issue Type: Story
>          Components: Scheduler
>    Affects Versions: 3.0.0
>            Reporter: Thomas Graves
>            Priority: Major
>
> The current resource scheduling code for GPU/FPGA, etc only supports amounts 
> as integers, so you can only schedule whole resources.  There are cases where 
> you may want to share the resources and schedule multiple tasks to run on the 
> same resources (GPU).  It would be nice to support fractional resources.  
> Somehow say we want a task to have 1/4 of a GPU for instance.  I think we 
> only want to support fractional when the resources amount is < 1.  Otherwise 
> you run into issues where someone asks for 2 1/8 GPU, which doesn't really 
> make sense to me and makes assigning addresses very complicated.
> Need to think about implementation details, for instance using a float can be 
> troublesome here due to floating point math precision issues.
> Another thing to consider, depending on implementation, is limiting the 
> precision - go down to tenths, hundredths, thousandths, etc.



