[jira] [Assigned] (SPARK-25869) Spark on YARN: the original diagnostics is missing when job failed maxAppAttempts times
[ https://issues.apache.org/jira/browse/SPARK-25869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin reassigned SPARK-25869:
--------------------------------------
    Assignee: (was: Marcelo Vanzin)

                 Key: SPARK-25869
                 URL: https://issues.apache.org/jira/browse/SPARK-25869
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 2.1.1
            Reporter: Yeliang Cang
            Priority: Major

Running Spark on YARN, I submitted a job with the following command:

{code}
spark-submit --class org.apache.spark.examples.SparkPi --master yarn \
  --deploy-mode cluster --driver-memory 127m --driver-cores 1 \
  --executor-memory 2048m --executor-cores 1 --num-executors 10 \
  --queue root.mr --conf spark.testing.reservedMemory=1048576 \
  --conf spark.yarn.executor.memoryOverhead=50 \
  --conf spark.yarn.driver.memoryOverhead=50 \
  /opt/ZDH/parcels/lib/spark/examples/jars/spark-examples* 1
{code}
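For a sense of how tight this allocation is, here is a back-of-the-envelope estimate of the AM container limit YARN enforces for this submission. This is a sketch assuming Spark 2.1's cluster-mode sizing rule (driver memory plus driver memory overhead); the numbers are derived from the flags above, not quoted from the report:

{code}
// Rough estimate of the YARN AM container size (assumption: Spark 2.1
// cluster-mode sizing; YARN may round up to its minimum allocation).
val driverMemoryMiB = 127  // --driver-memory 127m
val overheadMiB     = 50   // spark.yarn.driver.memoryOverhead=50
                           // (default would be max(384, 0.10 * driverMemory))
val amContainerMiB  = driverMemoryMiB + overheadMiB
println(s"AM container limit: about $amContainerMiB MiB")
// prints: AM container limit: about 177 MiB
{code}

Any attempt whose driver JVM plus overhead exceeds that limit is killed by YARN, which is consistent with the -104 exit code in the patched log further below.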
The driver memory is clearly insufficient, but the cause cannot be seen in the Spark client log:

{code}
2018-10-29 19:28:34,658 INFO org.apache.spark.deploy.yarn.Client: Application report for application_1540536615315_0013 (state: ACCEPTED)
2018-10-29 19:28:35,660 INFO org.apache.spark.deploy.yarn.Client: Application report for application_1540536615315_0013 (state: RUNNING)
2018-10-29 19:28:35,660 INFO org.apache.spark.deploy.yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 10.43.183.143
     ApplicationMaster RPC port: 0
     queue: root.mr
     start time: 1540812501560
     final status: UNDEFINED
     tracking URL: http://zdh141:8088/proxy/application_1540536615315_0013/
     user: mr
2018-10-29 19:28:36,663 INFO org.apache.spark.deploy.yarn.Client: Application report for application_1540536615315_0013 (state: FINISHED)
2018-10-29 19:28:36,663 INFO org.apache.spark.deploy.yarn.Client:
     client token: N/A
     diagnostics: Shutdown hook called before final status was reported.
     ApplicationMaster host: 10.43.183.143
     ApplicationMaster RPC port: 0
     queue: root.mr
     start time: 1540812501560
     final status: FAILED
     tracking URL: http://zdh141:8088/proxy/application_1540536615315_0013/
     user: mr
Exception in thread "main" org.apache.spark.SparkException: Application application_1540536615315_0013 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1137)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1183)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-10-29 19:28:36,694 INFO org.apache.spark.util.ShutdownHookManager: Shutdown hook called
2018-10-29 19:28:36,695 INFO org.apache.spark.util.ShutdownHookManager: Deleting directory /tmp/spark-96077be5-0dfa-496d-a6a0-96e83393a8d9
{code}

Solution: after applying the patch, the Spark client log shows the original diagnostics:

{code}
2018-10-29 19:27:32,962 INFO org.apache.spark.deploy.yarn.Client: Application report for application_1540536615315_0012 (state: RUNNING)
2018-10-29 19:27:32,962 INFO org.apache.spark.deploy.yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 10.43.183.143
     ApplicationMaster RPC port: 0
     queue: root.mr
     start time: 1540812436656
     final status: UNDEFINED
     tracking URL: http://zdh141:8088/proxy/application_1540536615315_0012/
     user: mr
2018-10-29 19:27:33,964 INFO org.apache.spark.deploy.yarn.Client: Application report for application_1540536615315_0012 (state: FAILED)
2018-10-29 19:27:33,964 INFO org.apache.spark.deploy.yarn.Client:
     client token: N/A
     diagnostics: Application application_1540536615315_0012 failed 2 times due to AM Container for appattempt_1540536615315_0012_02 exited with exitCode: -104
For more detailed output, check application tracking page: http://zdh141:8088/cluster/app/application_1540536615315_0012 Then, click on links to logs of each attempt.
Diagnostics: virtual memory used. Killing container.
{code}
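The patch itself is not reproduced in this notification. As a purely illustrative sketch (not the actual SPARK-25869 change), the information that goes missing is retained by the ResourceManager as per-attempt diagnostics, which can be read back through the stock YarnClient API even after the application-level diagnostics has been overwritten by the shutdown-hook message:

{code}
// Hypothetical standalone helper, NOT the SPARK-25869 patch: print the
// per-attempt diagnostics that survive even when the application-level
// diagnostics only says "Shutdown hook called before final status was
// reported."
import org.apache.hadoop.yarn.api.records.ApplicationId
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

import scala.collection.JavaConverters._

object PrintAttemptDiagnostics {
  def main(args: Array[String]): Unit = {
    // e.g. "application_1540536615315_0013"; fromString needs Hadoop 2.8+
    val appId = ApplicationId.fromString(args(0))

    val yarn = YarnClient.createYarnClient()
    yarn.init(new YarnConfiguration())
    yarn.start()
    try {
      // What spark-submit surfaces today: the application-level diagnostics.
      println(s"app-level: ${yarn.getApplicationReport(appId).getDiagnostics}")

      // What the client log above lost: each attempt's own failure reason.
      for (attempt <- yarn.getApplicationAttempts(appId).asScala) {
        println(s"${attempt.getApplicationAttemptId}: ${attempt.getDiagnostics}")
      }
    } finally {
      yarn.stop()
    }
  }
}
{code}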
[jira] [Assigned] (SPARK-25869) Spark on YARN: the original diagnostics is missing when job failed maxAppAttempts times
[ https://issues.apache.org/jira/browse/SPARK-25869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin reassigned SPARK-25869:
--------------------------------------
    Assignee: Marcelo Vanzin
[jira] [Assigned] (SPARK-25869) Spark on YARN: the original diagnostics is missing when job failed maxAppAttempts times
[ https://issues.apache.org/jira/browse/SPARK-25869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-25869:
------------------------------------
    Assignee: (was: Apache Spark)
[jira] [Assigned] (SPARK-25869) Spark on YARN: the original diagnostics is missing when job failed maxAppAttempts times
[ https://issues.apache.org/jira/browse/SPARK-25869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-25869:
------------------------------------
    Assignee: Apache Spark