[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907951#comment-16907951
 ] 

Szilard Nemeth commented on YARN-9100:
--------------------------------------

The patch looks good overall. I found only one minor oddity, in the test 
code, in TestGpuResourceAllocator#assertAllocatedGpu: 

We have:
{code:java}
assertEquals(1, allocation.getAllowedGPUs().size());
assertEquals(0, allocation.getDeniedGPUs().size());

Set<GpuDevice> allowedGPUs = allocation.getAllowedGPUs();
assertEquals(1, allowedGPUs.size());

GpuDevice allocatedGpu = (GpuDevice) allowedGPUs.toArray()[0];
assertEquals(expectedGpu, allocatedGpu);
assertAssignmentInStateStore(expectedGpu, container);
{code}

I think the code block of 

{code:java}
Set<GpuDevice> allowedGPUs = allocation.getAllowedGPUs();
assertEquals(1, allowedGPUs.size());
{code}
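is superfluous, as the size of the allowed GPUs set is already checked above 
for the same allocation (strictly, only the repeated size assertion is 
redundant; the allowedGPUs variable is still used by the toArray() call below).
Please fix this minor issue, and I think we are good to go!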
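
For illustration only, the deduplicated helper could look roughly like the sketch below. GpuDevice and GpuAllocation here are hypothetical minimal stand-ins, not the real YARN classes, and assertEquals is a local substitute for the JUnit method so that the example compiles on its own. The assertAssignmentInStateStore call from the original test is left out because it depends on the test fixture.

```java
import java.util.Collections;
import java.util.Objects;
import java.util.Set;

public class AssertAllocatedGpuSketch {

  // Hypothetical stand-in for YARN's GpuDevice (assumption, not the real class).
  static final class GpuDevice {
    final int index;
    final int minorNumber;

    GpuDevice(int index, int minorNumber) {
      this.index = index;
      this.minorNumber = minorNumber;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof GpuDevice)) {
        return false;
      }
      GpuDevice other = (GpuDevice) o;
      return index == other.index && minorNumber == other.minorNumber;
    }

    @Override
    public int hashCode() {
      return Objects.hash(index, minorNumber);
    }
  }

  // Hypothetical stand-in for the allocation result used by the test.
  static final class GpuAllocation {
    private final Set<GpuDevice> allowed;
    private final Set<GpuDevice> denied;

    GpuAllocation(Set<GpuDevice> allowed, Set<GpuDevice> denied) {
      this.allowed = allowed;
      this.denied = denied;
    }

    Set<GpuDevice> getAllowedGPUs() {
      return allowed;
    }

    Set<GpuDevice> getDeniedGPUs() {
      return denied;
    }
  }

  // Local substitute for org.junit.Assert.assertEquals.
  static void assertEquals(Object expected, Object actual) {
    if (!Objects.equals(expected, actual)) {
      throw new AssertionError("expected " + expected + " but was " + actual);
    }
  }

  static GpuDevice assertAllocatedGpu(GpuDevice expectedGpu,
      GpuAllocation allocation) {
    // Fetch the set once and assert its size once.
    Set<GpuDevice> allowedGPUs = allocation.getAllowedGPUs();
    assertEquals(1, allowedGPUs.size());
    assertEquals(0, allocation.getDeniedGPUs().size());

    // iterator().next() avoids the cast that toArray()[0] requires.
    GpuDevice allocatedGpu = allowedGPUs.iterator().next();
    assertEquals(expectedGpu, allocatedGpu);
    return allocatedGpu;
  }

  public static void main(String[] args) {
    GpuDevice gpu = new GpuDevice(0, 0);
    GpuAllocation allocation = new GpuAllocation(
        Collections.singleton(gpu), Collections.<GpuDevice>emptySet());
    assertAllocatedGpu(new GpuDevice(0, 0), allocation);
    System.out.println("assertions passed");
  }
}
```

Whether to keep toArray()[0] or switch to iterator().next() is a matter of taste; the only change actually requested is dropping the repeated size check.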

Thanks! 

> Add tests for GpuResourceAllocator and do minor code cleanup
> ------------------------------------------------------------
>
>                 Key: YARN-9100
>                 URL: https://issues.apache.org/jira/browse/YARN-9100
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100.001.patch, YARN-9100.002.patch, YARN-9100.003.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods were named like *Copy because they return copies of internal 
> data structures. The word "copy" is just noise in their names, so they have 
> been renamed. Additionally, the copied data structures were made 
> immutable.
> - The waiting loop in the assignGpus method was extracted into a new class, 
> RetryCommand. 
> Some more words about the new class RetryCommand: 
> There are similar waiting loops elsewhere in the code: in AMRMClient, 
> AMRMClientAsync and even in GenericTestUtils (see the waitFor method). 
> RetryCommand could be a future replacement for this duplicated code, as it 
> solves the waiting loop problem in a generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling part, but that's 
> solely because of how Java deals with checked exceptions vs. lambdas. If 
> there's a cleaner way to solve the exception handling, I'm open to any 
> suggestions.
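
On the checked-exceptions-vs-lambdas point in the description above, one commonly used approach is a functional interface whose method declares a generic exception type, so that a lambda's checked exception propagates to the caller without wrapping. The sketch below is only an illustration of that idea: RetrySketch, CheckedSupplier and retryUntilNonNull are made-up names, and this is not the RetryCommand API from the patch.

```java
import java.io.IOException;

public class RetrySketch {

  // Hypothetical functional interface: like Supplier, but it may throw a
  // checked exception of type E.
  @FunctionalInterface
  interface CheckedSupplier<T, E extends Exception> {
    T get() throws E;
  }

  /**
   * Calls the attempt until it returns a non-null value or maxAttempts is
   * reached, sleeping between attempts. Returns null if every attempt
   * returned null or if the thread was interrupted while waiting.
   */
  static <T, E extends Exception> T retryUntilNonNull(
      CheckedSupplier<T, E> attempt, int maxAttempts, long sleepMillis)
      throws E {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    for (int i = 1; i <= maxAttempts; i++) {
      T result = attempt.get();
      if (result != null) {
        return result;
      }
      if (i < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
          // Restore the interrupt flag and give up (a simplification).
          Thread.currentThread().interrupt();
          return null;
        }
      }
    }
    return null;
  }

  public static void main(String[] args) {
    // A lambda without checked exceptions: no try/catch needed at the call.
    int[] calls = {0};
    String gpu = retryUntilNonNull(
        () -> ++calls[0] < 3 ? null : "gpu-0", 5, 1L);
    System.out.println(gpu + " after " + calls[0] + " attempts");

    // A lambda with a checked exception: it surfaces as IOException,
    // not as a wrapped RuntimeException.
    try {
      RetrySketch.<String, IOException>retryUntilNonNull(
          () -> { throw new IOException("device query failed"); }, 3, 1L);
    } catch (IOException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

The trade-off is that callers see one extra type parameter, and a lambda that can throw several unrelated checked exceptions still needs a common supertype or a wrapper.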
