[jira] [Comment Edited] (YARN-3139) Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler

2016-09-28 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15529409#comment-15529409
 ] 

Jian He edited comment on YARN-3139 at 9/28/16 12:10 PM:
-

Patch looks good to me. 

bq. Instead of taking readLock, I think we can check if 
getNode(container.getNodeId()) == null, and stop if it is.
In the latest patch, do you intend to not check "getNode(container.getNodeId()) 
== null" ? I think this is possible if a node is removed while the container on 
that is being released by AM.






was (Author: jianhe):
Patch looks good to me. 

bq. Instead of taking readLock, I think we can check if 
getNode(container.getNodeId()) == null, and stop if it is.
In the latest patch, do you intend to not check "getNode(container.getNodeId()) 
== null" ?





> Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler
> --
>
> Key: YARN-3139
> URL: https://issues.apache.org/jira/browse/YARN-3139
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3139.0.patch, YARN-3139.1.patch, YARN-3139.2.patch, 
> YARN-3139.3.patch, YARN-3139.4.patch, YARN-3139.5.patch
>
>
> Enhance locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler, as 
> mentioned in YARN-3091, a possible solution is using read/write lock. Other 
> fine-graind locks for specific purposes / bugs should be addressed in 
> separated tickets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-3139) Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler

2016-09-22 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513402#comment-15513402
 ] 

Jian He edited comment on YARN-3139 at 9/22/16 2:20 PM:


There are also similar comments like this... I just couldn't remember why these 
comments were added.. Anyway, may be it's fine..
{code}
  /**
   * Validate increase/decrease request. This function must be called under
   * the queue lock to make sure that the access to container resource is
   * atomic. Refer to LeafQueue.decreaseContainer() and
   * CapacityScheduelr.updateIncreaseRequests()
   * 
   * - Throw exception when any other error happens
   * 
   */
  public static void checkSchedContainerChangeRequest(
{code}
May be it's related to some consistency issues with respect to queue stats?


was (Author: jianhe):
There are also similar comments like this... I just couldn't remember why these 
comments were added.. Anyway, may be it's fine..
{code}
  /**
   * Validate increase/decrease request. This function must be called under
   * the queue lock to make sure that the access to container resource is
   * atomic. Refer to LeafQueue.decreaseContainer() and
   * CapacityScheduelr.updateIncreaseRequests()
   * 
   * - Throw exception when any other error happens
   * 
   */
  public static void checkSchedContainerChangeRequest(
{code}

> Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler
> --
>
> Key: YARN-3139
> URL: https://issues.apache.org/jira/browse/YARN-3139
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3139.0.patch, YARN-3139.1.patch, YARN-3139.2.patch
>
>
> Enhance locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler, as 
> mentioned in YARN-3091, a possible solution is using read/write lock. Other 
> fine-graind locks for specific purposes / bugs should be addressed in 
> separated tickets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-3139) Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler

2016-09-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510954#comment-15510954
 ] 

Wangda Tan edited comment on YARN-3139 at 9/21/16 8:42 PM:
---

Thanks [~jianhe] for review:

bq. Is removing these locks ok?..

The original lock is added by YARN-4519, but I found instead of locking on leaf 
queue, we can simply lock the application. When an application is locked, we 
cannot release/increase/decrease a container while another 
release/increase/decrease container is in progress. So we will be safe if all 
these operations are protected by application's writeLock.

And here is a summary of how scheduler/queue/app locks being used after this 
patch:
{code}
rollback/decreaseContainer/completedContainer: 
LeafQueue.writeLock -> app.writeLock
updateRequest/updateIncreaseRequest :
app.writeLock
new allocation/new reservation/remove or add app/remove or add 
node/reinitialize :
CS.writeLock -> PQ.writeLock -> LQ.writeLock -> app.writeLock
{code}

So I think existing model looks fine.

bq. Also, here.. the queue lock is removed
Same as above

bq. In fairscheduler, here, locking on application is changed to locking on 
scheduler ?
Fixed

bq. the lock is outside try block...
Actually this is a workaround for the previously reported findbugs warning: 
https://builds.apache.org/job/PreCommit-YARN-Build/13166/artifact/patchprocess/new-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.html
This is safe and it is suggested, because when writeLock.lock() fails (like 
throw any exception), the writeLock.unlock() will not be invoked. However in 
practice, putting writelock inside try or out of try are both safe, because 
writeLock.lock() will not likely throw any exception.

Uploading ver.2 patch, fixed a couple of unnecessary readlocks.


was (Author: leftnoteasy):
Thanks [~jianhe] for review:

bq. Is removing these locks ok?..

The original lock is added by YARN-4519, but I found instead of locking on leaf 
queue, we can simply lock the application. When an application is locked, we 
cannot release/increase/decrease a container while another 
release/increase/decrease container is in progress. So we will be safe if all 
these operations are protected by application's writeLock.

And here is a summary of how scheduler/queue/app locks being used after this 
patch:
{code}
rollback/decreaseContainer/completedContainer: LeafQueue.writeLock -> 
app.writeLock
updateRequest/updateIncreaseRequest : app.writeLock
new allocation/new reservation/remove or add app/remove or add 
node/reinitialize : CS.writeLock -> PQ.writeLock -> LQ.writeLock -> 
app.writeLock
{code}

So I think existing model looks fine.

bq. Also, here.. the queue lock is removed
Same as above

bq. In fairscheduler, here, locking on application is changed to locking on 
scheduler ?
Fixed

bq. the lock is outside try block...
Actually this is a workaround for the previously reported findbugs warning: 
https://builds.apache.org/job/PreCommit-YARN-Build/13166/artifact/patchprocess/new-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.html
This is safe and it is suggested, because when writeLock.lock() fails (like 
throw any exception), the writeLock.unlock() will not be invoked. However in 
practice, putting writelock inside try or out of try are both safe, because 
writeLock.lock() will not likely throw any exception.

Uploading ver.2 patch, fixed a couple of unnecessary readlocks.

> Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler
> --
>
> Key: YARN-3139
> URL: https://issues.apache.org/jira/browse/YARN-3139
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3139.0.patch, YARN-3139.1.patch, YARN-3139.2.patch
>
>
> Enhance locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler, as 
> mentioned in YARN-3091, a possible solution is using read/write lock. Other 
> fine-graind locks for specific purposes / bugs should be addressed in 
> separated tickets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org