[jira] [Commented] (SLIDER-939) flex down does not cancel the outstanding request

2015-09-28 Thread Youjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934575#comment-14934575
 ] 

Youjie Chen commented on SLIDER-939:


Update on this:
We did more testing with the Slider HBase app, using the same scenario as before. We have 
6 nodes: 1 for the HBase master and 5 for HBase regionservers (set to 51% of each 
node's memory to ensure this). We ran the following steps as the yarn user and as the hbase user 
respectively:
1) Flex up the HBase regionservers from 5 to 6. One outstanding 
request stays pending and some nodes are reserved for the container request, as expected.
2) Flex back down to 5 regionservers.

The behavior then differs between the 2 users:
1) If the HBase instance is deployed and run as the yarn user:
The outstanding request is cancelled and the reserved nodes are 
unreserved on the YARN side. The YARN log is as below:
  (AppSchedulingInfo.java:updateResourceRequests(148)) - update: 
application=application_1443174197731_0003 request={Priority: 1073741826, 
Capability: , # Containers: 0, Location: *, Relax 
Locality: true}
2) If the HBase instance is deployed and run as the hbase user:
The reserved nodes are not unreserved, and the YARN log looks like:
 scheduler.AppSchedulingInfo 
(AppSchedulingInfo.java:updateResourceRequests(148)) - update: 
application=application_1443170550563_0003 request={Priority: 1073741826, 
Capability: , # Containers: 5, Location: *, Relax 
Locality: true}

Comparing the 2 YARN outputs shows the difference. In the first case, with the 
yarn user, the resource update sets the container request count to 0, so the 
reserved nodes are unreserved and new requests can be accepted. In the 
second case, with the hbase user, the container request count is still 5, so the 
reserved nodes stay reserved and block other jobs' container requests.
The issue appears to be related to the user that deploys and runs the 
instance. Any ideas on this? Thanks!
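
The comparison of the two log entries can also be done mechanically. The following is a minimal sketch, assuming each `request={...}` entry has been rejoined onto a single line (the email wraps them). The function name `requested_containers` is hypothetical and is not part of YARN or Slider.

```python
import re

# Matches the "# Containers: N" field of a YARN resource request log entry.
CONTAINERS_FIELD = re.compile(r"#\s*Containers:\s*(\d+)")


def requested_containers(log_entry: str) -> int:
    """Return the container count asked for in one updateResourceRequests entry."""
    match = CONTAINERS_FIELD.search(log_entry)
    if match is None:
        raise ValueError("no '# Containers' field in log entry")
    return int(match.group(1))


# The two entries quoted in this comment, rejoined onto single lines.
yarn_user_entry = (
    "application=application_1443174197731_0003 request={Priority: 1073741826, "
    "Capability: , # Containers: 0, Location: *, Relax Locality: true}"
)
hbase_user_entry = (
    "application=application_1443170550563_0003 request={Priority: 1073741826, "
    "Capability: , # Containers: 5, Location: *, Relax Locality: true}"
)

for user, entry in (("yarn", yarn_user_entry), ("hbase", hbase_user_entry)):
    count = requested_containers(entry)
    state = "request cleared" if count == 0 else "request still present"
    print(f"{user}: {count} containers requested ({state})")
```

A count of 0 corresponds to the case where the reservation is released; any other value means the ask is still registered with the scheduler.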

> flex down does not cancel the outstanding request
> -
>
> Key: SLIDER-939
> URL: https://issues.apache.org/jira/browse/SLIDER-939
> Project: Slider
> Issue Type: Bug
> Components: core
> Affects Versions: Slider 0.80
> Environment: Hadoop 2.7.1, Slider 0.80.0
> Reporter: Youjie Chen
> Assignee: Steve Loughran
> Labels: patch
> Fix For: Slider 0.81
>
>
> I ran a Slider app on a 6-node cluster. To ensure there is only one 
> component (worker) instance on each node, I set yarn.memory to 51% of the total 
> memory. 
> Then I flexed up to 7 workers. One (outstanding) worker request 
> can never be met, which is expected.
> Then I flexed back down to 6 workers, and any container request for any job 
> was blocked even though there was plenty of memory and cores for the job. The RM 
> log shows this output continuously:
> capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainersToNode(1240)) - Skipping scheduling 
> since node test.example.com:45454 is reserved by application 
> appattempt_1442384698868_0008_01
> It seems the outstanding requests are not actually cancelled in the 
> container request queue but keep being requested.
> After I flexed down to 5 workers, the other blocked jobs could run.
> This is related to JIRA https://issues.apache.org/jira/browse/SLIDER-490
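
To see which application attempt is holding reservations, the "Skipping scheduling" messages in the RM log can be tallied. This is a rough sketch, assuming each message has been rejoined onto one line. The helper name `reservations_by_attempt` is hypothetical, and the three-line sample is illustrative only, built by repeating the single message quoted in the description.

```python
import re
from collections import Counter

# Captures the node and the application attempt from a CapacityScheduler message.
RESERVED = re.compile(
    r"Skipping scheduling since node (\S+) is reserved by application (\S+)"
)


def reservations_by_attempt(log_lines):
    """Count 'Skipping scheduling' messages per (application attempt, node)."""
    counts = Counter()
    for line in log_lines:
        match = RESERVED.search(line)
        if match:
            node, attempt = match.groups()
            counts[(attempt, node)] += 1
    return counts


# Illustrative input: the quoted message repeated three times.
sample = [
    "capacity.CapacityScheduler "
    "(CapacityScheduler.java:allocateContainersToNode(1240)) - "
    "Skipping scheduling since node test.example.com:45454 is reserved by "
    "application appattempt_1442384698868_0008_01"
] * 3

for (attempt, node), count in reservations_by_attempt(sample).items():
    print(f"{attempt} holds {node}: seen {count} times")
```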



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SLIDER-663) Make it easy to develop and deploy application packages that are essentially shell commands

2015-09-28 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933495#comment-14933495
 ] 

Gour Saha commented on SLIDER-663:
--

I think this is what you are looking for: 
https://slider.incubator.apache.org/docs/slider_specs/simple_pkg.html

It is linked from the [Creating App 
Packages|https://slider.incubator.apache.org/docs/slider_specs/index.html] page.

> Make it easy to develop and deploy application packages that are essentially 
> shell commands
> ---
>
> Key: SLIDER-663
> URL: https://issues.apache.org/jira/browse/SLIDER-663
> Project: Slider
> Issue Type: New Feature
> Components: agent-provider, app-package
> Affects Versions: Slider 0.60
> Reporter: Sumit Mohanty
> Assignee: Sumit Mohanty
> Priority: Critical
> Fix For: Slider 0.80
> Attachments: PackagingSimplificationandCreateEnhancements.pdf
>
>
> Slider app packages require several artifacts for completeness such as a 
> metainfo.xml, a python script to read config and a python script for life 
> cycle commands, a tarball, etc.
> A simple application can be modeled as a shell command or as a java 
> application that just needs a jar and some system properties. So the 
> application requirement can be summarized as:
> * a jar
> * a command
> * a small set of properties
> While it is possible to model these as an application package (e.g. 
> memcached) it is also evident that there are a lot of common patterns that 
> need to be duplicated.
> Slider should provide a way to pass these parameters in the create call 
> itself rather than having to create a full application package.





[jira] [Commented] (SLIDER-663) Make it easy to develop and deploy application packages that are essentially shell commands

2015-09-28 Thread YONG FENG (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1492#comment-1492
 ] 

YONG FENG commented on SLIDER-663:
--

Hi Sumit,

Is there a user doc for SLIDER-663, like the one for Docker-based applications at 
https://slider.incubator.apache.org/docs/slider_specs/application_pkg_docker.html?
Or should users treat the design doc as the user doc?

Thanks



Slider-develop - Build # 711 - Still Failing

2015-09-28 Thread Apache Jenkins Server
The Apache Jenkins build system has built Slider-develop (build #711)

Status: Still Failing

Check console output at https://builds.apache.org/job/Slider-develop/711/ to 
view the results.

[jira] [Commented] (SLIDER-663) Make it easy to develop and deploy application packages that are essentially shell commands

2015-09-28 Thread YONG FENG (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933519#comment-14933519
 ] 

YONG FENG commented on SLIDER-663:
--

Hi Gour,

The design doc mentions that users can customize the commands that start, stop, and check the 
status of application components, as follows. However, I did not find the 
related information in the user doc. Is it missing from the user doc, or has it not been implemented 
yet?

"commands": [
    {
        "command": {
            "exec": "script for configure",
            "name": "CONFIGURE",
            "type": "PYTHON"
        }
    },
    {
        "command": {
            "exec": "command for Start",
            "name": "START",
            "type": "SHELL"
        }
    }
]

Thanks,
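
The structure of the quoted fragment can be read as in the following sketch. It only illustrates the shape of the data as quoted from the design doc. Whether Slider accepts this shape is the open question in this comment, so nothing here should be taken as Slider's behavior. The fragment is wrapped in braces and has no trailing commas so that it parses as JSON, and the helper name `commands_by_name` is hypothetical.

```python
import json

# The quoted fragment, wrapped in braces and without trailing commas
# so that it is valid JSON.
fragment = """
{
  "commands": [
    {"command": {"exec": "script for configure", "name": "CONFIGURE", "type": "PYTHON"}},
    {"command": {"exec": "command for Start", "name": "START", "type": "SHELL"}}
  ]
}
"""


def commands_by_name(document: str) -> dict:
    """Map each command name to its (type, exec) pair."""
    parsed = json.loads(document)
    return {
        entry["command"]["name"]: (entry["command"]["type"], entry["command"]["exec"])
        for entry in parsed["commands"]
    }


for name, (kind, exec_line) in commands_by_name(fragment).items():
    print(f"{name}: {kind} -> {exec_line}")
```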



[jira] [Commented] (SLIDER-663) Make it easy to develop and deploy application packages that are essentially shell commands

2015-09-28 Thread YONG FENG (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933501#comment-14933501
 ] 

YONG FENG commented on SLIDER-663:
--

Thanks Gour. 

I don't know why I missed it :-)




[jira] [Commented] (SLIDER-943) Container Escalation failing

2015-09-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933242#comment-14933242
 ] 

ASF subversion and git services commented on SLIDER-943:


Commit 4083cffea0e8e4a69b97f820cd6b2b0a3dea039e in incubator-slider's branch 
refs/heads/develop from [~ste...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-slider.git;h=4083cff ]

SLIDER-943 Container Escalation failing


> Container Escalation failing
> 
>
> Key: SLIDER-943
> URL: https://issues.apache.org/jira/browse/SLIDER-943
> Project: Slider
> Issue Type: Bug
> Components: appmaster
> Affects Versions: Slider 0.80
> Environment: real YARN Cluster
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
>
> {code}
> 2015-09-26 18:25:05,533 [AmExecutor-006] ERROR actions.QueueExecutor - Exception processing org.apache.slider.server.appmaster.actions.RenewingAction@475c6ebf name='renewing EscalateOutstandingRequests', delay=0, attrs=0, sequenceNumber=5}: org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: Cannot submit a ContainerRequest asking for location * with locality relaxation true when it has already been requested with locality relaxation false
> org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: Cannot submit a ContainerRequest asking for location * with locality relaxation true when it has already been requested with locality relaxation false
>   at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.checkLocalityRelaxationConflict(AMRMClientImpl.java:582)
>   at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.addContainerRequest(AMRMClientImpl.java:415)
>   at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.addContainerRequest(AMRMClientAsyncImpl.java:166)
>   at org.apache.slider.server.appmaster.operations.AsyncRMOperationHandler.addContainerRequest(AsyncRMOperationHandler.java:106)
>   at org.apache.slider.server.appmaster.operations.ContainerRequestOperation.execute(ContainerRequestOperation.java:38)
>   at org.apache.slider.server.appmaster.operations.RMOperationHandler.execute(RMOperationHandler.java:28)
>   at org.apache.slider.server.appmaster.SliderAppMaster.execute(SliderAppMaster.java:1889)
>   at org.apache.slider.server.appmaster.SliderAppMaster.escalateOutstandingRequests(SliderAppMaster.java:1824)
>   at org.apache.slider.server.appmaster.actions.EscalateOutstandingRequests.execute(EscalateOutstandingRequests.java:43)
>   at org.apache.slider.server.appmaster.actions.RenewingAction.execute(RenewingAction.java:88)
>   at org.apache.slider.server.appmaster.actions.QueueExecutor.run(QueueExecutor.java:73)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Commented] (SLIDER-943) Container Escalation failing

2015-09-28 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933080#comment-14933080
 ] 

Steve Loughran commented on SLIDER-943:
---

Better formatted
{code}
2015-09-26 18:25:05,533 [AmExecutor-006] ERROR actions.QueueExecutor - Exception processing org.apache.slider.server.appmaster.actions.RenewingAction@475c6ebf name='renewing EscalateOutstandingRequests', delay=0, attrs=0, sequenceNumber=5}: org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: Cannot submit a ContainerRequest asking for location * with locality relaxation true when it has already been requested with locality relaxation false
org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: Cannot submit a ContainerRequest asking for location * with locality relaxation true when it has already been requested with locality relaxation false
  at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.checkLocalityRelaxationConflict(AMRMClientImpl.java:582)
  at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.addContainerRequest(AMRMClientImpl.java:415)
  at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.addContainerRequest(AMRMClientAsyncImpl.java:166)
  at org.apache.slider.server.appmaster.operations.AsyncRMOperationHandler.addContainerRequest(AsyncRMOperationHandler.java:106)
  at org.apache.slider.server.appmaster.operations.ContainerRequestOperation.execute(ContainerRequestOperation.java:38)
  at org.apache.slider.server.appmaster.operations.RMOperationHandler.execute(RMOperationHandler.java:28)
  at org.apache.slider.server.appmaster.SliderAppMaster.execute(SliderAppMaster.java:1889)
  at org.apache.slider.server.appmaster.SliderAppMaster.escalateOutstandingRequests(SliderAppMaster.java:1824)
  at org.apache.slider.server.appmaster.actions.EscalateOutstandingRequests.execute(EscalateOutstandingRequests.java:43)
  at org.apache.slider.server.appmaster.actions.RenewingAction.execute(RenewingAction.java:88)
  at org.apache.slider.server.appmaster.actions.QueueExecutor.run(QueueExecutor.java:73)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
{code}
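
The conflict named in the exception can be illustrated with a toy model. This is a simplified sketch inferred from the exception message and the method name `checkLocalityRelaxationConflict`. It is not Hadoop's implementation, and the class names `RequestTable` and `InvalidContainerRequest` are hypothetical stand-ins. It assumes the check is scoped per priority and location, which is consistent with the "update outstanding request priority" commit note elsewhere in this thread but is not confirmed by it.

```python
class InvalidContainerRequest(Exception):
    """Hypothetical stand-in for YARN's InvalidContainerRequestException."""


class RequestTable:
    """Toy model: remembers the relax-locality flag per (priority, location)."""

    def __init__(self):
        self._relax = {}  # (priority, location) -> bool

    def add_request(self, priority: int, location: str, relax_locality: bool) -> None:
        key = (priority, location)
        previous = self._relax.get(key)
        # Reject a request whose flag contradicts one already recorded for this key.
        if previous is not None and previous != relax_locality:
            raise InvalidContainerRequest(
                f"Cannot submit a ContainerRequest asking for location {location} "
                f"with locality relaxation {str(relax_locality).lower()} when it has "
                f"already been requested with locality relaxation {str(previous).lower()}"
            )
        self._relax[key] = relax_locality


table = RequestTable()
table.add_request(priority=1, location="*", relax_locality=False)  # strict request
try:
    # Escalating at the same priority flips the flag for "*" and is rejected.
    table.add_request(priority=1, location="*", relax_locality=True)
except InvalidContainerRequest as err:
    print("rejected:", err)

# Under this model, the relaxed request is accepted at a different priority.
table.add_request(priority=2, location="*", relax_locality=True)
```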


[jira] [Commented] (SLIDER-943) Container Escalation failing

2015-09-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SLIDER-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933309#comment-14933309
 ] 

ASF subversion and git services commented on SLIDER-943:


Commit 733bd5632c10f1a470ec63f0147ef3b0076dc2e6 in incubator-slider's branch 
refs/heads/develop from [~ste...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-slider.git;h=733bd56 ]

SLIDER-943 Container Escalation failing: update outstanding request priority

