[ 
https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256541#comment-15256541
 ] 

Sunil G commented on YARN-4390:
-------------------------------

HI [~leftnoteasy]

Thanks for sharing patch here, Few comments/doubts from my side.

1. In 
{{CapacitySchedulerPreemptionUtils#deductPreemptableResourcesBasedSelectedCandidates}},
 partition is retrieved from {{SchedulerNode}}. I think we can get this from 
{{RMContainer#getNodeLabelExpression}} itself considering we will update 
RMContainer labelExpression (via a set call) while we invoke 
{{replaceLabelsOnNode}}.
2. {{context.getQueueByPartition}} has a code path flow to return NULL 
eventhough its not possible to happen. With this assumption, null check is not 
handled in caller side. So is it better to raise an exception and skip that 
round of preemption to add more clarify?
3. From {{FifoCandidatesSelector}},
{code}
CapacitySchedulerPreemptionUtils
72              .deductPreemptableResourcesBasedSelectedCandidates(
{code}
Ideally this code snippet helps to deduct already selected containers resource 
from its corresponding {{TempQueuePartition}}. On that note, I think this code 
need not have to be inside one such selector. Rather it may be better suited 
from the caller end and ensure that after each round of selection, possible 
deduction is done.

4. One doubt about the newly changed {{PreemptableResourceCalculator}} ctor. 
Could we get the information about 
{{considersReservedResourceWhenCalculateIdeal}} from {{preemptionContext}} 
itself. We could add this as a getter api so that code may be more clean.

5. {{getNodesForPreemption}} in {{ReservedContainerCandidatesSelector}} has to 
scan all nodes in cluster to get whether any node has a reserved container or 
not. For recent UI work, I thought having a metric to know the count of 
reserved nodes in cluster. Is it costlier to keep a reserved nodes list in 
scheduler? If so, could that be used here. Its just a thought, I havent really 
checked the cost of keeping such list. So its fine to scrap this idea also :)
6. In {{ReservedContainerCandidatesSelector}}, containers are sorted against 
launch time. Is it possible that we may select a high priority latest container 
against some low priority slightly old one. 
7. As  I see {{RMContainer}} is read-only interface. I feel {{setQueueName}} 
can be moved out from this interface and place in {{RMContainerImpl}}. I think 
we could downcast and use the same.

Minor nits:
1. {{ReservedContainerCandidatesSelector#getPreemptionCandidatesOnNode}} has 
some commented code. It can be removed.
2. in {{TempQueuePerPartition}}, could we add {{reserved}} to {{toString}} for 
logging.


I think test class looks very clean and readable after new framework change. It 
is very good and really useful :).
 I thought of trying this framework sometime tomorrow,and will share if I have 
comments.

> Do surgical preemption based on reserved container in CapacityScheduler
> -----------------------------------------------------------------------
>
>                 Key: YARN-4390
>                 URL: https://issues.apache.org/jira/browse/YARN-4390
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.0.0, 2.8.0, 2.7.3
>            Reporter: Eric Payne
>            Assignee: Wangda Tan
>         Attachments: YARN-4390-design.1.pdf, YARN-4390-test-results.pdf, 
> YARN-4390.1.patch, YARN-4390.2.patch, YARN-4390.3.branch-2.patch, 
> YARN-4390.3.patch, YARN-4390.4.patch, YARN-4390.5.patch
>
>
> There are multiple reasons why preemption could unnecessarily preempt 
> containers. One is that an app could be requesting a large container (say 
> 8-GB), and the preemption monitor could conceivably preempt multiple 
> containers (say 8, 1-GB containers) in order to fill the large container 
> request. These smaller containers would then be rejected by the requesting AM 
> and potentially given right back to the preempted app.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to