[jira] [Commented] (YARN-3735) Retain JRE Fatal error logs upon container failure

2015-05-28 Thread Srikanth Sundarrajan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562734#comment-14562734
 ] 

Srikanth Sundarrajan commented on YARN-3735:


When the JRE fails with a fatal error in any launched container, it creates an 
error file in the container's working directory, and the node manager removes 
that file during container cleanup. This makes such failures difficult to 
debug. It might be useful to collect this file and append it to the container's 
stderr when such errors occur.
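
To make the proposal concrete, here is a minimal sketch of the "collect and 
append" step. It is an illustration only, not node manager code: the function 
name and both parameters are hypothetical, and it assumes the JVM used 
HotSpot's default report name (hs_err_pid<pid>.log) and that the step runs 
before the working directory is cleaned up.

```python
import glob
import os


def append_fatal_error_logs(work_dir, stderr_path):
    """Append every hs_err_pid*.log found in work_dir to stderr_path.

    Hypothetical helper: returns the list of error files that were appended.
    """
    appended = []
    # hs_err_pid<pid>.log is HotSpot's default name for a fatal error report.
    pattern = os.path.join(work_dir, "hs_err_pid*.log")
    for err_file in sorted(glob.glob(pattern)):
        with open(err_file, "r", encoding="utf-8", errors="replace") as src, \
                open(stderr_path, "a", encoding="utf-8") as dst:
            # Label each report so it can be told apart from regular stderr.
            dst.write("\n==> %s <==\n" % os.path.basename(err_file))
            dst.write(src.read())
        appended.append(err_file)
    return appended
```

Appending rather than overwriting keeps whatever the container already wrote 
to stderr, and a container with no error report leaves stderr untouched.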

> Retain JRE Fatal error logs upon container failure
> --
>
> Key: YARN-3735
> URL: https://issues.apache.org/jira/browse/YARN-3735
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Srikanth Sundarrajan
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3735) Retain JRE Fatal error logs upon container failure

2015-05-28 Thread Srikanth Sundarrajan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562803#comment-14562803
 ] 

Srikanth Sundarrajan commented on YARN-3735:


For instance, MRAppMaster has sporadically failed with SIGBUS errors in our 
cluster.

{noformat}
sudo -u UUU yarn logs -applicationId application_1432020518439_802161
Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.


Container: container_1432020518439_802161_02_01 on host.grid.com_45454
===
LogType:stderr
Log Upload Time:26-May-2015 16:42:53
LogLength:0
Log Contents:

LogType:stdout
Log Upload Time:26-May-2015 16:42:53
LogLength:954
Log Contents:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x7fce44882aad, pid=8391, tid=140523938055936
#
# JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 1.7.0_67-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libzip.so+0x5aad]  readCEN+0x79d
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /data/d1/yarn/local/usercache/UUU/appcache/application_1432020518439_802161/container_1432020518439_802161_02_01/hs_err_pid8391.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

Container: container_1432020518439_802161_01_01 on host.grid.com_45454
===
LogType:stderr
Log Upload Time:26-May-2015 16:42:53
LogLength:0
Log Contents:

LogType:stdout
Log Upload Time:26-May-2015 16:42:53
LogLength:954
Log Contents:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x7f2360144aad, pid=8077, tid=139789960816384
#
# JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 1.7.0_67-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libzip.so+0x5aad]  readCEN+0x79d
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /data/d1/yarn/local/usercache/UUU/appcache/application_1432020518439_802161/container_1432020518439_802161_01_01/hs_err_pid8077.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
{noformat}




[jira] [Commented] (YARN-3735) Retain JRE Fatal error logs upon container failure

2015-05-28 Thread Srikanth Sundarrajan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562814#comment-14562814
 ] 

Srikanth Sundarrajan commented on YARN-3735:


Sorry about the multiple posts; I had issues with JIRA.



[jira] [Commented] (YARN-3735) Retain JRE Fatal error logs upon container failure

2015-07-14 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627048#comment-14627048
 ] 

Vinod Kumar Vavilapalli commented on YARN-3735:
---

YARN doesn't assume that containers are JVMs. Can we instead change the apps so 
that they write this file to the log directory rather than the working 
directory? That way, log aggregation will archive it automatically.
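
One way an app could do this, sketched under assumptions: HotSpot's 
-XX:ErrorFile option sets where the fatal error report is written (%p expands 
to the pid), and YARN expands the <LOG_DIR> placeholder in container launch 
commands to the container's log directory. The helper name below is 
hypothetical; it only builds the option string and does not touch any Hadoop 
API.

```python
# Placeholder that YARN replaces with the container log directory at launch.
LOG_DIR_PLACEHOLDER = "<LOG_DIR>"


def with_error_file_in_log_dir(java_opts):
    """Return java_opts with -XX:ErrorFile pointing into the log directory.

    Hypothetical helper: an existing -XX:ErrorFile setting is left as-is.
    """
    if "-XX:ErrorFile=" in java_opts:
        return java_opts
    # %p is expanded by HotSpot itself to the pid of the crashing JVM.
    error_file_opt = "-XX:ErrorFile=" + LOG_DIR_PLACEHOLDER + "/hs_err_pid%p.log"
    return (java_opts + " " + error_file_opt).strip()


print(with_error_file_in_log_dir("-Xmx1024m"))
# prints: -Xmx1024m -XX:ErrorFile=<LOG_DIR>/hs_err_pid%p.log
```

For MapReduce, the resulting string would presumably go into 
yarn.app.mapreduce.am.command-opts (for MRAppMaster) and 
mapreduce.map.java.opts / mapreduce.reduce.java.opts (for tasks); whether this 
works on a given cluster and Hadoop version should be verified.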



[jira] [Commented] (YARN-3735) Retain JRE Fatal error logs upon container failure

2016-11-03 Thread JESSE CHEN (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634224#comment-15634224
 ] 

JESSE CHEN commented on YARN-3735:
--

Is there even a workaround?

