[jira] [Updated] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-01 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-370:
---

Priority: Blocker  (was: Critical)

> CapacityScheduler app submission fails when min alloc size not multiple of AM 
> size
> --
>
> Key: YARN-370
> URL: https://issues.apache.org/jira/browse/YARN-370
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Thomas Graves
>Priority: Blocker
>
> I was running 2.0.3-SNAPSHOT with the capacity scheduler configured with 
> minimum allocation size 1G. The AM size was set to 1.5G. I didn't specify 
> resource calculator so it was using DefaultResourceCalculator.  The am launch 
> failed with the error below:
> Application application_1359688216672_0001 failed 1 times due to Error 
> launching appattempt_1359688216672_0001_01. Got exception: RemoteTrace: 
> at LocalTrace: 
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
> RemoteTrace: at LocalTrace: 
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
> Unauthorized request to start container. Expected resource  vCores:1> but found 
>   at org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:39)
>   at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:47)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.authorizeRequest(ContainerManagerImpl.java:383)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainer(ContainerManagerImpl.java:400)
>   at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:68)
>   at org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>   at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>   at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
>   at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:123)
>   at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:109)
>   at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:111)
>   at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:255)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> . Failing the application. 
> It looks like the launchcontext for the app didn't have the resources rounded 
> up.
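
The arithmetic behind the report can be illustrated with a minimal sketch. It assumes the scheduler rounds a memory request up to the next multiple of the minimum allocation, as the phrase "rounded up" suggests. The class and method names (RoundUpSketch, roundUp) are hypothetical and are not Hadoop APIs; the values are the 1G minimum and 1.5G AM size from the description, expressed in MB.

```java
// Hypothetical illustration only; this is not the Hadoop implementation.
final class RoundUpSketch {

    /** Rounds memoryMb up to the nearest multiple of minAllocMb (at least one multiple). */
    static int roundUp(int memoryMb, int minAllocMb) {
        if (minAllocMb <= 0) {
            throw new IllegalArgumentException("minAllocMb must be positive");
        }
        // Ceiling division, treating non-positive requests as a request for 1 MB.
        int multiples = (Math.max(memoryMb, 1) + minAllocMb - 1) / minAllocMb;
        return multiples * minAllocMb;
    }

    public static void main(String[] args) {
        int minAllocMb = 1024;   // minimum allocation size 1G
        int requestedMb = 1536;  // AM size 1.5G
        int grantedMb = roundUp(requestedMb, minAllocMb);
        // The request (1536) and the rounded-up grant (2048) differ, which is
        // the kind of mismatch the report attributes to the launch context.
        System.out.println("requested=" + requestedMb + " granted=" + grantedMb);
    }
}
```

Under this assumption a 1536 MB request becomes a 2048 MB allocation, so any component that still carries the unrounded 1536 MB value disagrees with the allocation.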

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-01 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-370:
---

Target Version/s: 2.0.3-alpha  (was: 3.0.0, 2.0.3-alpha)



[jira] [Updated] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-01 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-370:
---

Affects Version/s: (was: 3.0.0)



[jira] [Updated] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-01 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-370:
---

Assignee: Zhijie Shen



[jira] [Updated] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-05 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-370:
-

Attachment: YARN-370-branch-2.patch

I've tested the change: with it, ContainerManagerImpl sees the updated 
resource, i.e., 2048 mem, which should fix the exception.

The user defined 1.5G for the AM container, so we need to update the resource 
in ApplicationSubmissionContext to match the size that was actually allocated. 
One remaining question is whether other containers created automatically by 
the system can be assigned a memory size that is not a multiple of the min 
alloc size. If so, the same problem will occur for non-AM containers as well.
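
As a rough sketch of the idea in this comment (not the attached patch, whose contents are not shown here), the snippet below models a start-container check that rejects a request whose memory differs from what was granted. It then shows that the check passes once the launch request carries the allocated size. LaunchCheckSketch, authorize, and launches are hypothetical names and the error text only paraphrases the one in the report; 2048 is the value mentioned above and 1536 is 1.5G in MB.

```java
// Hypothetical model of the mismatch; names and structure are assumptions.
final class LaunchCheckSketch {

    /** Stand-in for the NodeManager-side check: the launch request must match the grant. */
    static void authorize(int grantedMb, int launchContextMb) {
        if (grantedMb != launchContextMb) {
            throw new IllegalStateException(
                "Unauthorized request to start container. Expected memory "
                    + grantedMb + " but found " + launchContextMb);
        }
    }

    /** Returns true if a launch with the given launch-context memory would be accepted. */
    static boolean launches(int grantedMb, int launchContextMb) {
        try {
            authorize(grantedMb, launchContextMb);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int grantedMb = 2048;    // size actually allocated for the AM container
        int submittedMb = 1536;  // size the user put in the submission context

        // Before: the launch request still carries the user's 1.5G value.
        System.out.println("stale context accepted: " + launches(grantedMb, submittedMb));

        // After: the context is updated to the allocated size before launch.
        int updatedMb = grantedMb;
        System.out.println("updated context accepted: " + launches(grantedMb, updatedMb));
    }
}
```

The same check would apply to any container, which is why the open question about non-AM containers matters: any request that is not already a multiple of the minimum allocation could hit it.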



[jira] [Updated] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-05 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-370:
-

Attachment: YARN-370-branch-2_1.patch

Sure, I've attached the newest patch, which adopts Thomas' fix. And I've also 
updated the related test cases. Thanks!
