[ https://issues.apache.org/jira/browse/SPARK-23941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16431337#comment-16431337 ]
Apache Spark commented on SPARK-23941:
--------------------------------------

User 'tiboun' has created a pull request for this issue:
https://github.com/apache/spark/pull/21014

> Mesos task failed on specific spark app name
> --------------------------------------------
>
> Key: SPARK-23941
> URL: https://issues.apache.org/jira/browse/SPARK-23941
> Project: Spark
> Issue Type: Bug
> Components: Mesos, Spark Submit
> Affects Versions: 2.2.1, 2.3.0
> Environment: OS: Ubuntu 16.0.4
> Spark: 2.3.0
> Mesos: 1.5.0
> Reporter: bounkong khamphousone
> Priority: Major
>
> This seems to be a bug in Spark's MesosClusterDispatcher. To reproduce it,
> you need Mesos and the Mesos cluster dispatcher running.
> I'm currently running Mesos 1.5 and Spark 2.3.0 (I tried 2.2.1 as well).
> If you launch the following command:
>
> {code:java}
> spark-submit --master mesos://127.0.1.1:7077 --deploy-mode cluster --class \
>   org.apache.spark.examples.SparkPi --name "my favorite task (myId = 123-456)" \
>   /home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar 100
> {code}
> the task fails with the following output:
>
> {code:java}
> I0409 11:00:35.360352 22726 fetcher.cpp:551] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/tiboun","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"extract":true,"value":"\/home\/tiboun\/tools\/spark\/examples\/jars\/spark-examples_2.11-2.3.0.jar"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/0262246c-14a3-4408-9b74-5e3b65dc1344-S0\/frameworks\/edff1a6f-38c6-46e0-a3c1-62a8fbfc2b5d-0014\/executors\/driver-20180409110035-0004\/runs\/8ac20902-74e1-45c4-9ab6-c52a79940189","user":"tiboun"}
> I0409 11:00:35.363119 22726 fetcher.cpp:450] Fetching URI '/home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
> I0409 11:00:35.363143 22726 fetcher.cpp:291] Fetching directly into the sandbox directory
> I0409 11:00:35.363168 22726 fetcher.cpp:225] Fetching URI '/home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
> W0409 11:00:35.366839 22726 fetcher.cpp:330] Copying instead of extracting resource from URI with 'extract' flag, because it does not seem to be an archive: /home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar
> I0409 11:00:35.366873 22726 fetcher.cpp:603] Fetched '/home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar' to '/var/lib/mesos/slaves/0262246c-14a3-4408-9b74-5e3b65dc1344-S0/frameworks/edff1a6f-38c6-46e0-a3c1-62a8fbfc2b5d-0014/executors/driver-20180409110035-0004/runs/8ac20902-74e1-45c4-9ab6-c52a79940189/spark-examples_2.11-2.3.0.jar'
> I0409 11:00:35.366878 22726 fetcher.cpp:608] Successfully fetched all URIs into '/var/lib/mesos/slaves/0262246c-14a3-4408-9b74-5e3b65dc1344-S0/frameworks/edff1a6f-38c6-46e0-a3c1-62a8fbfc2b5d-0014/executors/driver-20180409110035-0004/runs/8ac20902-74e1-45c4-9ab6-c52a79940189'
> I0409 11:00:35.438725 22733 exec.cpp:162] Version: 1.5.0
> I0409 11:00:35.440770 22734 exec.cpp:236] Executor registered on agent 0262246c-14a3-4408-9b74-5e3b65dc1344-S0
> I0409 11:00:35.441388 22733 executor.cpp:171] Received SUBSCRIBED event
> I0409 11:00:35.441586 22733 executor.cpp:175] Subscribed executor on tiboun-Dell-Precision-M3800
> I0409 11:00:35.441643 22733 executor.cpp:171] Received LAUNCH event
> I0409 11:00:35.441767 22733 executor.cpp:638] Starting task driver-20180409110035-0004
> I0409 11:00:35.445050 22733 executor.cpp:478] Running '/usr/libexec/mesos/mesos-containerizer launch <POSSIBLY-SENSITIVE-DATA>'
> I0409 11:00:35.445770 22733 executor.cpp:651] Forked command at 22743
> sh: 1: Syntax error: "(" unexpected
> I0409 11:00:35.538661 22736 executor.cpp:938] Command exited with status 2 (pid: 22743)
> I0409 11:00:36.541016 22739 process.cpp:887] Failed to accept socket: future discarded
> {code}
> If you remove the parentheses from the app name, you get the following result:
>
> {code:java}
> I0409 11:03:02.023701 23085 fetcher.cpp:551] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/tiboun","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"extract":true,"value":"\/home\/tiboun\/tools\/spark\/examples\/jars\/spark-examples_2.11-2.3.0.jar"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/0262246c-14a3-4408-9b74-5e3b65dc1344-S0\/frameworks\/edff1a6f-38c6-46e0-a3c1-62a8fbfc2b5d-0014\/executors\/driver-20180409110301-0006\/runs\/f887c0ab-b48f-4382-850c-383c1c944269","user":"tiboun"}
> I0409 11:03:02.028268 23085 fetcher.cpp:450] Fetching URI '/home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
> I0409 11:03:02.028302 23085 fetcher.cpp:291] Fetching directly into the sandbox directory
> I0409 11:03:02.028336 23085 fetcher.cpp:225] Fetching URI '/home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
> W0409 11:03:02.031209 23085 fetcher.cpp:330] Copying instead of extracting resource from URI with 'extract' flag, because it does not seem to be an archive: /home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar
> I0409 11:03:02.031250 23085 fetcher.cpp:603] Fetched '/home/tiboun/tools/spark/examples/jars/spark-examples_2.11-2.3.0.jar' to '/var/lib/mesos/slaves/0262246c-14a3-4408-9b74-5e3b65dc1344-S0/frameworks/edff1a6f-38c6-46e0-a3c1-62a8fbfc2b5d-0014/executors/driver-20180409110301-0006/runs/f887c0ab-b48f-4382-850c-383c1c944269/spark-examples_2.11-2.3.0.jar'
> I0409 11:03:02.031258 23085 fetcher.cpp:608] Successfully fetched all URIs into '/var/lib/mesos/slaves/0262246c-14a3-4408-9b74-5e3b65dc1344-S0/frameworks/edff1a6f-38c6-46e0-a3c1-62a8fbfc2b5d-0014/executors/driver-20180409110301-0006/runs/f887c0ab-b48f-4382-850c-383c1c944269'
> I0409 11:03:02.090797 23095 exec.cpp:162] Version: 1.5.0
> I0409 11:03:02.095283 23092 exec.cpp:236] Executor registered on agent 0262246c-14a3-4408-9b74-5e3b65dc1344-S0
> I0409 11:03:02.096693 23095 executor.cpp:171] Received SUBSCRIBED event
> I0409 11:03:02.097040 23095 executor.cpp:175] Subscribed executor on tiboun-Dell-Precision-M3800
> I0409 11:03:02.097141 23095 executor.cpp:171] Received LAUNCH event
> I0409 11:03:02.097357 23095 executor.cpp:638] Starting task driver-20180409110301-0006
> I0409 11:03:02.101521 23095 executor.cpp:478] Running '/usr/libexec/mesos/mesos-containerizer launch <POSSIBLY-SENSITIVE-DATA>'
> I0409 11:03:02.102332 23095 executor.cpp:651] Forked command at 23100
> Error: Cannot load main class from JAR file:/var/lib/mesos/slaves/0262246c-14a3-4408-9b74-5e3b65dc1344-S0/frameworks/edff1a6f-38c6-46e0-a3c1-62a8fbfc2b5d-0014/executors/driver-20180409110301-0006/runs/f887c0ab-b48f-4382-850c-383c1c944269/favorite
> Run with --help for usage help or --verbose for debug output
> I0409 11:03:02.792325 23090 executor.cpp:938] Command exited with status 1 (pid: 23100)
> I0409 11:03:03.794505 23098 process.cpp:887] Failed to accept socket: future discarded
> {code}
> Interestingly, the launched command tries to load the main class from a file
> called "favorite", which is the second word of the app name. This suggests
> that the name reaches the shell unquoted.
>
> I've tried launching spark-shell with the same name and it works fine. There,
> task names take the driver's name and append a sequence number to it.
>
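The two failures reported above are consistent with the app name being placed on a shell command line without quoting. The following is a minimal Python sketch of that tokenization, not the dispatcher's actual code: `build_command` is a hypothetical helper and `app.jar` is a placeholder path. Only `shlex.split` and `shlex.quote` from the standard library are real APIs here.

```python
import shlex

# App names taken from the report: as submitted, and with parentheses removed.
NAME_WITH_PARENS = "my favorite task (myId = 123-456)"
NAME_NO_PARENS = "my favorite task myId = 123-456"


def build_command(name: str, quote: bool) -> str:
    """Hypothetical helper: assemble a spark-submit command as a single string.

    'app.jar' is a placeholder, not a path from the report.
    """
    shown = shlex.quote(name) if quote else name
    return f"spark-submit --name {shown} app.jar 100"


# Unquoted: the name is split on whitespace into separate arguments.
# (shlex does not model sh's handling of "(", so the parenthesis-free
# name is used for this case.)
unquoted = shlex.split(build_command(NAME_NO_PARENS, quote=False))

# Quoted: the whole name, parentheses included, stays one argument.
quoted = shlex.split(build_command(NAME_WITH_PARENS, quote=True))

print(unquoted[2:4])
print(quoted[2])
```

In the unquoted case `--name` receives only `my`, and the next word, `favorite`, would land where spark-submit expects the application JAR, which matches the "Cannot load main class from JAR .../favorite" error. With the parentheses present, an unquoted `(` is a syntax error for `sh` before anything is executed.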