[ https://issues.apache.org/jira/browse/YARN-9099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751126#comment-16751126 ]
Peter Bacsko commented on YARN-9099:
------------------------------------

[~snemeth] as [~tangzhankun] pointed out, is it possible to add a unit test for this?

> GpuResourceAllocator.getReleasingGpus calculates number of GPUs in a wrong way
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9099
>                 URL: https://issues.apache.org/jira/browse/YARN-9099
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-9099.001.patch, YARN-9099.002.patch
>
>
> getReleasingGpus plays an important role in the calculation that happens
> when GpuAllocator assigns GPUs to a container, see:
> GpuResourceAllocator#internalAssignGpus.
> If multiple GPUs are assigned to the same container, getReleasingGpus
> returns an invalid number.
> The iterator goes over the (GPU device, container ID) mappings and
> retrieves the container by its ID once for every device the container ID
> is mapped to.
> Then, for every container retrieved, the resource value for the GPU
> resource is added to a running sum.
> As a result, if a container is mapped to 2 or more devices, the
> container's GPU resource value is added to the running sum as many times
> as the number of GPU devices the container has.
> Example:
> Let's suppose {{usedDevices}} contains these mappings:
> - (GPU1, container1)
> - (GPU2, container1)
> - (GPU3, container2)
> GPU resource value is 2 for container1 and
> GPU resource value is 1 for container2.
> Then, if container1 is in a running state, getReleasingGpus will return 4
> instead of 2.
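The double counting described in the issue can be reproduced outside Hadoop with a minimal Java sketch. Everything in it is a hypothetical stand-in: `ContainerInfo`, `sumPerMapping`, and `sumPerContainer` are illustration names, not Hadoop classes or methods, and the `counted` flag simply marks a container as being in whatever state getReleasingGpus counts. Deduplicating by container ID is one possible way to avoid the problem and is not necessarily what the attached patches do.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Hadoop code base.
final class ReleasingGpusSketch {

    // Stand-in for a container: its GPU resource value and whether it is
    // in the state that the releasing-GPU calculation counts (assumption).
    static final class ContainerInfo {
        final long gpuResourceValue;
        final boolean counted;

        ContainerInfo(long gpuResourceValue, boolean counted) {
            this.gpuResourceValue = gpuResourceValue;
            this.counted = counted;
        }
    }

    // The (GPU device, container ID) mappings from the issue's example.
    static Map<String, String> exampleUsedDevices() {
        Map<String, String> usedDevices = new LinkedHashMap<>();
        usedDevices.put("GPU1", "container1");
        usedDevices.put("GPU2", "container1");
        usedDevices.put("GPU3", "container2");
        return usedDevices;
    }

    // container1 has GPU resource value 2 and is counted;
    // container2 has GPU resource value 1 and is not counted.
    static Map<String, ContainerInfo> exampleContainers() {
        Map<String, ContainerInfo> containers = new LinkedHashMap<>();
        containers.put("container1", new ContainerInfo(2, true));
        containers.put("container2", new ContainerInfo(1, false));
        return containers;
    }

    // Mirrors the behaviour the issue describes: the container's full GPU
    // resource value is added once per device mapping.
    static long sumPerMapping(Map<String, String> usedDevices,
                              Map<String, ContainerInfo> containers) {
        long sum = 0;
        for (String containerId : usedDevices.values()) {
            ContainerInfo container = containers.get(containerId);
            if (container != null && container.counted) {
                sum += container.gpuResourceValue;
            }
        }
        return sum;
    }

    // One possible correction: visit each distinct container ID only once.
    static long sumPerContainer(Map<String, String> usedDevices,
                                Map<String, ContainerInfo> containers) {
        long sum = 0;
        Set<String> distinctIds = new HashSet<>(usedDevices.values());
        for (String containerId : distinctIds) {
            ContainerInfo container = containers.get(containerId);
            if (container != null && container.counted) {
                sum += container.gpuResourceValue;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, String> usedDevices = exampleUsedDevices();
        Map<String, ContainerInfo> containers = exampleContainers();
        System.out.println(sumPerMapping(usedDevices, containers));   // prints 4
        System.out.println(sumPerContainer(usedDevices, containers)); // prints 2
    }
}
```

A unit test along the lines requested in the comment could assert exactly these two values for the three-mapping example: 4 for the per-mapping sum and 2 once each container is counted a single time.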