[ 
https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176278#comment-15176278
 ] 

Wangda Tan commented on YARN-4719:
----------------------------------

[~kasha],

bq. Not sure I understand the suggestion. Elaborate?
In the ver.2 patch, getAllNodes uses shallowCopy. What I meant is that, instead 
of copying the entire HashMap, you can use a ConcurrentMap.
In the ver.3 patch, you removed shallowCopy and return HashMap.values() 
directly. If a node is removed while someone is iterating over values(), the 
behavior is undefined. See: 
[javadoc|https://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html#values()]
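
To illustrate the difference, here is a minimal single-threaded sketch. It is not taken from any of the patches: the class name, method name, and the node ID to host map are made up for the example, and the in-loop remove() only stands in for a removal done by another thread.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of YARN.
class NodeIterationDemo {

  /**
   * Iterates map.values() and removes a different entry part-way through.
   * Returns true if the iteration completed, false if it failed fast.
   */
  static boolean survivesRemovalDuringIteration(Map<String, String> nodes) {
    nodes.put("node1", "host1");
    nodes.put("node2", "host2");
    nodes.put("node3", "host3");
    boolean removed = false;
    try {
      for (String host : nodes.values()) {
        if (!removed) {
          // Stand-in for a node being removed by another thread.
          nodes.remove(host.equals("host1") ? "node2" : "node1");
          removed = true;
        }
      }
      return true;
    } catch (ConcurrentModificationException e) {
      // HashMap iterators are fail-fast on a best-effort basis.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("HashMap survives: "
        + survivesRemovalDuringIteration(new HashMap<>()));
    System.out.println("ConcurrentHashMap survives: "
        + survivesRemovalDuringIteration(new ConcurrentHashMap<>()));
  }
}
```

With a plain HashMap the loop typically aborts with a ConcurrentModificationException, and with real multi-threaded access even that is not guaranteed. ConcurrentHashMap iterators are weakly consistent, so the loop completes, though it may or may not observe the removal.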

bq. I feel any logic that has to iterate through all nodes should go through 
ClusterNodeTracker - that way, we don't run into cases where we access the list 
of nodes without a lock.
As I commented above, we can use a ConcurrentMap instead of locking 
ClusterNodeTracker. Do you need strong consistency for 
addBlacklistedNodeIdsToList? (The node list could be updated while we are 
updating blacklistedNodes.)
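
If weak consistency is acceptable there, a lock-free lookup could look roughly like the sketch below. This is an assumption-laden illustration, not the real addBlacklistedNodeIdsToList: the class, the static NODES map (node ID to host name), and the method signature are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not part of YARN.
class BlacklistSketch {

  // Assumed stand-in for the scheduler's node map: node ID -> host name.
  static final ConcurrentMap<String, String> NODES = new ConcurrentHashMap<>();

  /**
   * Collects the IDs of nodes whose host is blacklisted, without taking a lock.
   * The iteration is weakly consistent: nodes added or removed concurrently
   * may or may not be reflected in the result.
   */
  static List<String> blacklistedNodeIds(Set<String> blacklistedHosts) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, String> entry : NODES.entrySet()) {
      if (blacklistedHosts.contains(entry.getValue())) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
```

If the caller needs an exact snapshot of the node list at one point in time, this is not enough, and copying under a lock would still be required.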

bq. Any particular reason you think this doesn't belong here?
I would prefer to keep the responsibility of ClusterNodeTracker clean. If we 
add application logic here, we could end up adding any logic related to 
SchedulerNode to this class as well. To me, this refactoring patch is mainly 
about code cleanup, so I think it's better to keep it clean from the beginning.


> Add a helper library to maintain node state and allows common queries
> ---------------------------------------------------------------------
>
>                 Key: YARN-4719
>                 URL: https://issues.apache.org/jira/browse/YARN-4719
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 2.8.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-4719-1.patch, yarn-4719-2.patch, yarn-4719-3.patch
>
>
> The scheduler could use a helper library to maintain node state and allow 
> matching/sorting queries. Several reasons for this:
> # Today, a lot of the node state management is done separately in each 
> scheduler. Having a single library will take us that much closer to reducing 
> duplication among schedulers.
> # Adding a filtering/matching API would simplify node labels and locality 
> significantly. 
> # An API that returns a sorted list for a custom comparator would help 
> YARN-1011 where we want to sort by allocation and utilization for 
> continuous/asynchronous and opportunistic scheduling respectively. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
