Is this a multi-node cluster?
If yes, can you check the time across all the nodes and make sure they are
in sync?
If not, did you override any timeout properties via the app config or
resources? If you could share the JSON files that you used to start the
app, it would help to debug further.
-
Thanks! That was helpful. Strangely, as it turns out, the container
is released (and cleaned up) even before the STOP command is queued.
Some more logs:
Node Manager:
-
2016-07-07 15:50:14,148 [AmExecutor-006] INFO state.AppState - Role
ConnectD flexed from 2 to 1
2016-07-07
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gour Saha resolved SLIDER-1151.
---
Resolution: Fixed
> Don't log Invalid port range values when there are no invalid ports specified
> -
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gour Saha updated SLIDER-1151:
--
Assignee: Manoj Samel
> Don't log Invalid port range values when there are no invalid ports specified
>
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manoj Samel updated SLIDER-1151:
Attachment: SLIDER-1151.1.patch
> Don't log Invalid port range values when there are no invalid por
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manoj Samel updated SLIDER-1151:
Attachment: (was: check_empty.patch)
> Don't log Invalid port range values when there are no in
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366908#comment-15366908
]
Gour Saha commented on SLIDER-1151:
---
Thanks [~manojsamel]. I committed it just now for
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366907#comment-15366907
]
ASF subversion and git services commented on SLIDER-1151:
-
Commit
If you look for the container ID in the nodemanager log on the host where
the container was running, you should be able to see when the container
stopped and was cleaned up. Looks like it even logs when it deletes the
container directories.
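
A minimal sketch of that search, in case it helps: it filters a log for lines that mention one container ID so the stop and cleanup entries can be read in order. The function name lines_for_container and the command-line usage are hypothetical, and the log path is whatever your NodeManager is configured to write to; nothing here is specific to Slider or YARN APIs.

```python
from typing import Iterable, List


def lines_for_container(log_lines: Iterable[str], container_id: str) -> List[str]:
    """Return the log lines that mention container_id, in their original order."""
    return [line.rstrip("\n") for line in log_lines if container_id in line]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python find_container.py <nodemanager-log-path> <container-id>
    log_path, container_id = sys.argv[1], sys.argv[2]
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line in lines_for_container(handle, container_id):
            print(line)
```

This is a plain substring match, so a shorter ID that is a prefix of another container's ID would match both.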
On Thu, Jul 7, 2016 at 2:04 PM, Sarthak Kukreti wrote:
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366824#comment-15366824
]
Manoj Samel commented on SLIDER-1151:
-
[~gsaha], patch attached
> Don't log Invalid
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manoj Samel updated SLIDER-1151:
Attachment: check_empty.patch
> Don't log Invalid port range values when there are no invalid ports
[
https://issues.apache.org/jira/browse/SLIDER-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366781#comment-15366781
]
Gour Saha commented on SLIDER-1151:
---
[~manojsamel] feel free to submit a patch. Seems l
kafka.py is still present in the filecache directory: it's just the
"container_1467829690678_0022_01_03" directory that seems to be
deleted before the runCommand() call
- Sarthak
On Thu, Jul 7, 2016 at 12:35 PM, Billie Rinaldi
wrote:
> I think that
> /private/tmp/hdfs/nm-local-dir/usercache/s
I think that
/private/tmp/hdfs/nm-local-dir/usercache/sarthakk/appcache/application_1467829690678_0022/container_1467829690678_0022_01_03/app/definition
is linked to
/private/tmp/hdfs/nm-local-dir/usercache/sarthakk/appcache/application_1467829690678_0022/filecache/113/slider-kafka-package-1.0.
Hello!
I am trying to use Slider to distribute an application over a YARN
cluster. While attempting to use "slider flex" to decrease the number
of containers allocated for the application (using the kafka
app-package as reference), I came across the following error:
ERROR 2016-07-07 10:57:36,461