[jira] [Updated] (SPARK-12176) SparkLauncher's setConf() does not support configs containing spaces
[ https://issues.apache.org/jira/browse/SPARK-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuhang Chen updated SPARK-12176: Flags: Important

> SparkLauncher's setConf() does not support configs containing spaces
> Key: SPARK-12176
> URL: https://issues.apache.org/jira/browse/SPARK-12176
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2
> Environment: All
> Reporter: Yuhang Chen
> Priority: Minor
>
> spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes.
> However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself.
> I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
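The suggested fix can be sketched as below. This is a hypothetical illustration, not Spark's actual implementation; the helper names (`quoteIfNeeded`, `buildConfArgs`) are mine. It wraps any 'K=V' pair whose value contains whitespace in double quotes while the '--conf' argument list is being built. Note that whether quoting is actually required depends on whether the generated command is later re-parsed by a shell; Java's ProcessBuilder itself passes each argument to the child process verbatim.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the suggested fix: quote conf values containing
// whitespace while the '--conf' arguments are built from the conf Map.
public class ConfQuoting {

    // Wrap "K=V" in double quotes when the value contains whitespace.
    static String quoteIfNeeded(String key, String value) {
        String kv = key + "=" + value;
        return value.matches(".*\\s.*") ? "\"" + kv + "\"" : kv;
    }

    // Turn a conf map into the '--conf K=V' argument list.
    static List<String> buildConfArgs(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("--conf");
            args.add(quoteIfNeeded(e.getKey(), e.getValue()));
        }
        return args;
    }
}
```

A value without whitespace is left untouched, so existing configs would be unaffected by such a change.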
[jira] [Updated] (SPARK-12176) SparkLauncher's setConf() does not support configs containing spaces
[ https://issues.apache.org/jira/browse/SPARK-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuhang Chen updated SPARK-12176:

Description: spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.

was: spark-submit uses the '--conf K=V' pattern for setting configs. And according to the docs, if the value V contains spaces, you should wrap the whole 'K=V' part in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not wrap the 'K=V' part for you, and its API offers no way to do the wrapping yourself. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.
[jira] [Updated] (SPARK-12176) SparkLauncher's setConf() does not support configs containing spaces
[ https://issues.apache.org/jira/browse/SPARK-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuhang Chen updated SPARK-12176:

Description:
spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself.
For example, I want to add {{-XX:+PrintGCDetails -XX:+PrintGCTimeStamps}} for executors (spark.executor.extraJavaOptions), and the value contains a space. For spark-submit, I would wrap the conf in quotes like this:
bq. --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
But when I use the setConf() API of SparkLauncher, I write code like this:
{{launcher.setConf("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps");}}
SparkLauncher uses Java's ProcessBuilder to start a sub-process, which ultimately executes spark-submit, and it turns out the final command looks like this:
bq. --conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
The quotes are gone, and the job could not be launched with this command. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.

was: spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself. For example, I want to add "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" for executors (spark.executor.extraJavaOptions), and the value contains a space. For spark-submit, I would wrap the conf in quotes like this: --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps". But when I use the setConf() API of SparkLauncher, I write code like this: launcher.setConf("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"). SparkLauncher uses Java's ProcessBuilder to start a sub-process, which ultimately executes spark-submit, and it turns out the final command looks like this: --conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps. The quotes are gone, and the job could not be launched with this command. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.
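The "quotes are gone" failure can be reproduced in miniature: once the wrapping quotes are lost, any later whitespace re-tokenization of the command line (for example by a shell) splits the single conf value into two arguments. The sketch below is illustrative only; the helper is mine, not Spark code.

```java
import java.util.Arrays;
import java.util.List;

// Models what a shell does to an already-unquoted command line: a naive
// whitespace split that ignores quoting, since the quotes are gone.
public class ArgSplitDemo {

    static List<String> shellLikeSplit(String commandLine) {
        return Arrays.asList(commandLine.trim().split("\\s+"));
    }
}
```

With the quotes missing, `--conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps` tokenizes into three arguments, and the trailing `-XX:+PrintGCTimeStamps` is no longer part of the conf value.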
[jira] [Updated] (SPARK-12176) SparkLauncher's setConf() does not support configs containing spaces
[ https://issues.apache.org/jira/browse/SPARK-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuhang Chen updated SPARK-12176:

Description:
spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself.
For example, I want to add {{-XX:+PrintGCDetails -XX:+PrintGCTimeStamps}} for executors (spark.executor.extraJavaOptions), and the value contains a space. For spark-submit, I would wrap the conf in quotes like this:
{code}
--conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
{code}
But when I use the setConf() API of SparkLauncher, I write code like this:
{code}
launcher.setConf("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps");
{code}
SparkLauncher uses Java's ProcessBuilder to start a sub-process, which ultimately executes spark-submit, and it turns out the final command looks like this:
{code}
--conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
{code}
The quotes are gone, and the job could not be launched with this command. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.

was: spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself. For example, I want to add {{-XX:+PrintGCDetails -XX:+PrintGCTimeStamps}} for executors (spark.executor.extraJavaOptions), and the value contains a space. For spark-submit, I would wrap the conf in quotes like this: bq. --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" But when I use the setConf() API of SparkLauncher, I write code like this: {{launcher.setConf("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps");}} SparkLauncher uses Java's ProcessBuilder to start a sub-process, which ultimately executes spark-submit, and it turns out the final command looks like this: bq. --conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps The quotes are gone, and the job could not be launched with this command. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.
[jira] [Updated] (SPARK-12176) SparkLauncher's setConf() does not support configs containing spaces
[ https://issues.apache.org/jira/browse/SPARK-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuhang Chen updated SPARK-12176:

Description: spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself. For example, I want to add "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" for executors (spark.executor.extraJavaOptions), and the value contains a space. For spark-submit, I would wrap the conf in quotes like this: --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps". But when I use the setConf() API of SparkLauncher, I write code like this: launcher.setConf("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"). SparkLauncher uses Java's ProcessBuilder to start a sub-process, which ultimately executes spark-submit, and it turns out the final command looks like this: --conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps. The quotes are gone, and the job could not be launched with this command. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.

was: spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building. For example, I want to add "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" for executors (spark.executor.extraJavaOptions), and the value contains a space. For spark-submit, I would wrap the conf in quotes like this: --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps". But when I use the setConf() API of SparkLauncher, I write code like this: launcher.setConf("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"). SparkLauncher uses Java's ProcessBuilder to start a sub-process, which ultimately executes spark-submit. I debugged the source, and the resulting command looks like this: --conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps. The quotes are gone, and the job could not be launched with this command.
[jira] [Commented] (SPARK-12176) SparkLauncher's setConf() does not support configs containing spaces
[ https://issues.apache.org/jira/browse/SPARK-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046162#comment-15046162 ] Yuhang Chen commented on SPARK-12176: I have updated the description; I hope it is clearer now. Contact me if you need further information.
[jira] [Updated] (SPARK-12176) SparkLauncher's setConf() does not support configs containing spaces
[ https://issues.apache.org/jira/browse/SPARK-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuhang Chen updated SPARK-12176:

Description: spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building. For example, I want to add "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" for executors (spark.executor.extraJavaOptions), and the value contains a space. For spark-submit, I would wrap the conf in quotes like this: --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps". But when I use the setConf() API of SparkLauncher, I write code like this: launcher.setConf("spark.executor.extraJavaOptions", "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"). SparkLauncher uses Java's ProcessBuilder to start a sub-process, which ultimately executes spark-submit. I debugged the source, and the resulting command looks like this: --conf spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps. The quotes are gone, and the job could not be launched with this command.

was: spark-submit uses the '--conf K=V' pattern for setting configs. According to the docs, if the value V contains spaces, the whole 'K=V' part should be wrapped in quotes. However, SparkLauncher (org.apache.spark.launcher.SparkLauncher) does not do that wrapping for you, and its API offers no way to do the wrapping yourself. I checked the source: all confs are stored in a Map before the launch command is generated. My suggestion is to check all values in the conf Map and wrap them during command building.
[jira] [Commented] (SPARK-9523) Receiver for Spark Streaming does not naturally support kryo serializer
[ https://issues.apache.org/jira/browse/SPARK-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968628#comment-14968628 ] Yuhang Chen commented on SPARK-9523: So you mean closures also support kryo? But I never added any kryo code to them, and they worked fine when the kryo serializer was set in SparkConf, while the receivers did not. That confused me.

> Receiver for Spark Streaming does not naturally support kryo serializer
> Key: SPARK-9523
> URL: https://issues.apache.org/jira/browse/SPARK-9523
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 1.3.1
> Environment: Windows 7 local mode
> Reporter: Yuhang Chen
> Priority: Minor
> Labels: kryo, serialization
> Original Estimate: 120h
> Remaining Estimate: 120h
>
> In some cases an attribute of a class is not serializable, but you still want to use it after deserializing the whole object, so you have to customize the serialization code. For example, you can declare such attributes as transient, which makes them ignored during serialization, and then reassign their values during deserialization.
> With Java serialization, you implement Serializable and write that code in the readObject() and writeObject() methods; with kryo serialization, you implement KryoSerializable and write it in the read() and write() methods.
> In Spark and Spark Streaming, you can set kryo as the serializer for speed. However, the functions taken by RDD or DStream operations are still serialized by Java serialization, which means you only need to write the custom serialization code in readObject() and writeObject().
> But Spark Streaming's Receiver is different. When you wish to customize an InputDStream, you must extend Receiver. It turns out the Receiver is serialized by kryo if you set the kryo serializer in SparkConf, and falls back to Java serialization if you do not.
> So here is the problem: if you want to switch the serializer by configuration and have the Receiver work for both Java and kryo, you have to write all four methods above. First, this is redundant, since you write the serialization/deserialization code almost twice; second, nothing in the docs or the code tells users to implement the KryoSerializable interface.
> Since all other function parameters are serialized by Java only, I suggest doing the same for the Receiver. It may be slower, but since the serialization is only executed once per interval, that is tolerable. More importantly, it causes less trouble.
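The Java-serialization half of that duplication looks roughly like the sketch below (a self-contained stand-in class, not Spark's actual Receiver): a transient, non-serializable field is skipped by writeObject() and re-created in readObject(). Under kryo, the same re-creation logic would have to be repeated in KryoSerializable's write() and read(), which is exactly the redundancy the issue complains about.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CustomSerDemo {

    // A receiver-like class holding a non-serializable resource; the field
    // is transient and must be re-created on deserialization.
    static class MyReceiver implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String host;
        private transient StringBuilder buffer; // stand-in for a non-serializable resource

        MyReceiver(String host) {
            this.host = host;
            this.buffer = new StringBuilder();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // host is written, transient buffer is skipped
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            this.buffer = new StringBuilder(); // re-create the transient resource
        }

        String getHost() { return host; }
        boolean hasBuffer() { return buffer != null; }
    }

    // Serialize and deserialize through an in-memory byte stream.
    static MyReceiver roundTrip(MyReceiver r) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(r);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (MyReceiver) ois.readObject();
        }
    }
}
```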
[jira] [Commented] (SPARK-9595) Adding API to SparkConf for kryo serializers registration
[ https://issues.apache.org/jira/browse/SPARK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968615#comment-14968615 ] Yuhang Chen commented on SPARK-9595: Sorry, I replied to the wrong person; the question was meant for another issue. Please just ignore it. > Adding API to SparkConf for kryo serializers registration > - > > Key: SPARK-9595 > URL: https://issues.apache.org/jira/browse/SPARK-9595 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.3.1, 1.4.1 >Reporter: Yuhang Chen >Priority: Minor > Original Estimate: 168h > Remaining Estimate: 168h > > Currently SparkConf has a registerKryoClasses API for kryo registration. However, this only works for registering classes. If you want to register customized kryo serializers, you'll have to extend the KryoSerializer class and write some code. > This is not only inconvenient, but also requires the registration to be done at compile time, which is not always possible. Thus, I suggest adding another API to SparkConf for registering customized kryo serializers. It could look like this: > def registerKryoSerializers(serializers: Map[Class[_], Serializer]): SparkConf
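The shape of the proposed API can be sketched as below. Everything here is hypothetical: SparkConf has no such method, and the stub types only model the idea of the suggestion, i.e. a chainable setter carrying a class-to-serializer map in the conf that the kryo instance on each executor could later consume.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for SparkConf and kryo's Serializer; these are
// NOT real Spark classes. They only illustrate what a runtime
// registerKryoSerializers() API might look like.
interface KryoSerializerStub<T> {
    byte[] write(T value);
}

class SparkConfStub {
    private final Map<Class<?>, KryoSerializerStub<?>> kryoSerializers = new HashMap<>();

    // The proposed API: register custom serializers at runtime, with no
    // compile-time KryoSerializer subclass required.
    SparkConfStub registerKryoSerializers(Map<Class<?>, KryoSerializerStub<?>> serializers) {
        kryoSerializers.putAll(serializers);
        return this; // chainable, like the real SparkConf setters
    }

    boolean hasSerializerFor(Class<?> cls) {
        return kryoSerializers.containsKey(cls);
    }
}

public class Main {
    public static void main(String[] args) {
        Map<Class<?>, KryoSerializerStub<?>> m = new HashMap<>();
        KryoSerializerStub<String> strSer = v -> v.getBytes();
        m.put(String.class, strSer);
        SparkConfStub conf = new SparkConfStub().registerKryoSerializers(m);
        System.out.println(conf.hasSerializerFor(String.class));
    }
}
```

In real Spark, the closest existing mechanism is setting `spark.kryo.registrator` to a `KryoRegistrator` implementation, which is the compile-time step the issue wants to avoid.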
[jira] [Commented] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967488#comment-14967488 ] Yuhang Chen commented on SPARK-11144: - Thanks for your advice, and sorry for the inconvenience. > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
[jira] [Commented] (SPARK-9595) Adding API to SparkConf for kryo serializers registration
[ https://issues.apache.org/jira/browse/SPARK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967503#comment-14967503 ] Yuhang Chen commented on SPARK-9595: So you mean closures also support kryo? But I never added any kryo code to them, and they worked just fine when the kryo serializer was set in SparkConf, while the receivers didn't. I got confused by that. > Adding API to SparkConf for kryo serializers registration > - > > Key: SPARK-9595 > URL: https://issues.apache.org/jira/browse/SPARK-9595 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.3.1, 1.4.1 >Reporter: Yuhang Chen >Priority: Minor > Original Estimate: 168h > Remaining Estimate: 168h > > Currently SparkConf has a registerKryoClasses API for kryo registration. However, this only works for registering classes. If you want to register customized kryo serializers, you'll have to extend the KryoSerializer class and write some code. > This is not only inconvenient, but also requires the registration to be done at compile time, which is not always possible. Thus, I suggest adding another API to SparkConf for registering customized kryo serializers. It could look like this: > def registerKryoSerializers(serializers: Map[Class[_], Serializer]): SparkConf
[jira] [Comment Edited] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966320#comment-14966320 ] Yuhang Chen edited comment on SPARK-11144 at 10/21/15 6:33 AM: --- Yes, you were right, I did not make myself clear. The SparkLauncher works just as you said, but it's still not what I'm looking for. Please close this issue; I will write another one to express my idea. was (Author: fish748): But the SparkLauncher does not support Spark Streaming, right? > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
[jira] [Commented] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966320#comment-14966320 ] Yuhang Chen commented on SPARK-11144: - But the SparkLauncher does not support Spark Streaming, right? > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
[jira] [Commented] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966317#comment-14966317 ] Yuhang Chen commented on SPARK-11144: - It is possible, but it's not exactly what I'm looking for. For both spark-submit and org.apache.spark.deploy.SparkSubmit, if you want to submit a Spark job from code, you need to start a new process using Java's ProcessBuilder or something similar, then execute shell commands with arguments. But what I hope for is something I can use purely from code. I may not have expressed myself clearly; I would be happy to write you an email with a more detailed explanation if that's OK with you. > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
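The shell-out approach the comment refers to can be sketched like this. The jar path and main-class name are hypothetical placeholders, and the process is only constructed, never started, since actually running it would require a local Spark installation.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of submitting a Spark job by shelling out to spark-submit via
// ProcessBuilder -- the approach the comment describes as inconvenient
// compared to a pure-code launcher API.
public class Main {
    static ProcessBuilder buildSubmit(String master, String mainClass, String appJar) {
        List<String> cmd = Arrays.asList(
                "spark-submit",
                "--master", master,
                "--class", mainClass,
                appJar);
        return new ProcessBuilder(cmd).redirectErrorStream(true);
    }

    public static void main(String[] args) {
        ProcessBuilder pb = buildSubmit("yarn", "com.example.MyStreamingApp", "/path/app.jar");
        // pb.start() would actually fork spark-submit; here we only print
        // the command line that would run.
        System.out.println(String.join(" ", pb.command()));
    }
}
```

With this approach the caller must manage the child process, its environment, and its output streams by hand, which is exactly the bookkeeping a "spark-submit in code" API would absorb.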
[jira] [Issue Comment Deleted] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuhang Chen updated SPARK-11144: Comment: was deleted (was: Yes.) > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
[jira] [Commented] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966102#comment-14966102 ] Yuhang Chen commented on SPARK-11144: - Yes. > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
[jira] [Commented] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966101#comment-14966101 ] Yuhang Chen commented on SPARK-11144: - Yes. > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
[jira] [Commented] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
[ https://issues.apache.org/jira/browse/SPARK-11144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966303#comment-14966303 ] Yuhang Chen commented on SPARK-11144: - Yes, spark-submit can be used to launch Spark, Spark Streaming, Spark SQL, etc. However, org.apache.spark.launcher.SparkLauncher only supports Spark itself. I hope you guys can make the SparkLauncher something like 'spark-submit in code'. > Add SparkLauncher for Spark Streaming, Spark SQL, etc > - > > Key: SPARK-11144 > URL: https://issues.apache.org/jira/browse/SPARK-11144 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL, Streaming >Affects Versions: 1.5.1 > Environment: Linux x64 >Reporter: Yuhang Chen >Priority: Minor > Labels: launcher > > Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. > What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.
[jira] [Created] (SPARK-11144) Add SparkLauncher for Spark Streaming, Spark SQL, etc
Yuhang Chen created SPARK-11144: --- Summary: Add SparkLauncher for Spark Streaming, Spark SQL, etc Key: SPARK-11144 URL: https://issues.apache.org/jira/browse/SPARK-11144 Project: Spark Issue Type: Improvement Components: Spark Core, SQL, Streaming Affects Versions: 1.5.1 Environment: Linux x64 Reporter: Yuhang Chen Priority: Minor Now we have org.apache.spark.launcher.SparkLauncher to launch Spark as a child process. However, it does not support other libs, such as Spark Streaming and Spark SQL. What I'm looking for is a utility like spark-submit, with which you can submit any Spark lib job to all supported resource managers (Standalone, YARN, Mesos, etc.) in Java/Scala code.