[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20925

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r180193625

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -154,9 +165,17 @@ List<String> buildSparkSubmitArgs() {
     List<String> args = new ArrayList<>();
-    SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
+    OptionParser parser = new OptionParser(false);
```

This is explained in the public API (`addSparkArg`, which is declared in `AbstractLauncher`).
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r180193382

Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala

```diff
@@ -775,17 +781,17 @@ class SparkSubmitSuite
   test("SPARK_CONF_DIR overrides spark-defaults.conf") {
-    forConfDir(Map("spark.executor.memory" -> "2.3g")) { path =>
+    forConfDir(Map("spark.executor.memory" -> "3g")) { path =>
```

It's just that now, instead of just printing an error to the output, an exception is actually thrown:

```
[info] SparkSubmitSuite:
[info] - SPARK_CONF_DIR overrides spark-defaults.conf *** FAILED *** (144 milliseconds)
[info]   org.apache.spark.SparkException: Executor Memory cores must be a positive number
[info]   at org.apache.spark.deploy.SparkSubmitArguments.error(SparkSubmitArguments.scala:652)
[info]   at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:267)
```

That is because:

```
scala> c.set("spark.abcde", "2.3g")
res0: org.apache.spark.SparkConf = org.apache.spark.SparkConf@cd5ff55

scala> c.getSizeAsBytes("spark.abcde")
java.lang.NumberFormatException: Size must be specified as bytes (b), kibibytes (k), mebibytes (m), gibibytes (g), tebibytes (t), or pebibytes(p). E.g. 50b, 100k, or 250m. Fractional values are not supported. Input was: 2.3
```

Just noticed that the error message is kinda wrong, but this whole validation function (`validateSubmitArguments`) also leaves a lot to be desired...
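As an aside, the behavior vanzin demonstrates can be reproduced with a minimal, self-contained sketch of integral-value-plus-suffix size parsing. This is illustrative only and is not Spark's actual `getSizeAsBytes` implementation; the class and pattern are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: sizes are an integral number plus a unit suffix, so a
// fractional value such as "2.3g" never matches and is rejected outright.
class SizeParser {
  private static final Pattern SIZE = Pattern.compile("([0-9]+)([bkmgtp])");

  static long toBytes(String s) {
    Matcher m = SIZE.matcher(s.trim().toLowerCase());
    if (!m.matches()) {
      throw new NumberFormatException(
          "Fractional values are not supported. Input was: " + s);
    }
    long value = Long.parseLong(m.group(1));
    // Shift by 10 bits per step up the unit ladder (k = 2^10, m = 2^20, ...).
    int shift = "bkmgtp".indexOf(m.group(2)) * 10;
    return value << shift;
  }
}
```

With a parser shaped like this, `"3g"` converts cleanly while `"2.3g"` fails the match and raises `NumberFormatException`, which mirrors the REPL session above.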
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179987224

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -154,9 +165,17 @@ List<String> buildSparkSubmitArgs() {
     List<String> args = new ArrayList<>();
-    SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
+    OptionParser parser = new OptionParser(false);
```

What's the reason for allowing unknown args here? Is it so an old launcher can start a newer Spark, which may accept more args? A comment would be helpful.
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179987409

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -154,9 +165,17 @@ List<String> buildSparkSubmitArgs() {
     List<String> args = new ArrayList<>();
-    SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
+    OptionParser parser = new OptionParser(false);
+    boolean isStartingApp = isAppResourceReq;
+
+    // If the user args array is not empty, we need to parse it to detect exactly what
+    // the user is trying to run, so that checks below are correct.
+    if (!userArgs.isEmpty()) {
+      parser.parse(userArgs);
+      isStartingApp = parser.isAppResourceReq;
```

I don't really care whether the name is `isStartingApp` or `isAppResourceReq`, but it seems the name should be the same here and in `OptionParser`, unless there is some difference I'm missing.
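To make the question in this thread concrete, here is a hypothetical sketch of the flag under discussion: a parser that either rejects unrecognized arguments or collects them for pass-through, so an older launcher can forward options that only a newer spark-submit understands. All names (`SketchParser`, `KNOWN`, `passthrough`) are illustrative; this is not the real `SparkSubmitOptionParser`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only. With errorOnUnknownArgs=false, unrecognized arguments
// are collected and forwarded instead of failing the build, letting an old
// launcher hand newer options through to the spark-submit it starts.
class SketchParser {
  private static final List<String> KNOWN = Arrays.asList("--master", "--class");

  private final boolean errorOnUnknownArgs;
  final List<String> recognized = new ArrayList<>();
  final List<String> passthrough = new ArrayList<>();

  SketchParser(boolean errorOnUnknownArgs) {
    this.errorOnUnknownArgs = errorOnUnknownArgs;
  }

  void parse(List<String> args) {
    for (String arg : args) {
      if (KNOWN.contains(arg)) {
        recognized.add(arg);
      } else if (errorOnUnknownArgs) {
        // Strict mode: fail fast on anything this launcher does not know.
        throw new IllegalArgumentException("Unknown argument: " + arg);
      } else {
        // Lenient mode: keep the argument verbatim for the child process.
        passthrough.add(arg);
      }
    }
  }
}
```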
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179952361

Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala

```diff
@@ -775,17 +781,17 @@ class SparkSubmitSuite
   test("SPARK_CONF_DIR overrides spark-defaults.conf") {
-    forConfDir(Map("spark.executor.memory" -> "2.3g")) { path =>
+    forConfDir(Map("spark.executor.memory" -> "3g")) { path =>
```

Why this change? Do you no longer support fractional values?
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179892764

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -99,17 +100,27 @@
    */
   private boolean allowsMixedArguments;
 
+  /**
+   * This constructor is used when creating a user-configurable launcher. It allows the
+   * spark-submit argument list to be modified after creation.
+   */
   SparkSubmitCommandBuilder() {
-    this.sparkArgs = new ArrayList<>();
     this.isAppResourceReq = true;
     this.isExample = false;
+    this.parsedArgs = new ArrayList<>();
+    this.userArgs = new ArrayList<>();
   }
 
+  /**
+   * This constructor is used when invoking spark-submit; it parses and validates arguments
+   * provided by the user on the command line.
+   */
   SparkSubmitCommandBuilder(List<String> args) {
     this.allowsMixedArguments = false;
-    this.sparkArgs = new ArrayList<>();
+    this.parsedArgs = new ArrayList<>();
     boolean isExample = false;
     List<String> submitArgs = args;
+    this.userArgs = null;
```

If you want to take a stab at refactoring... I'm not so sure you'd be able to make things much better, though, since the parameters just control shared logic that is applied later.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179892170

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -88,7 +88,8 @@
     SparkLauncher.NO_RESOURCE);
   }
 
-  final List<String> sparkArgs;
+  final List<String> userArgs;
```

That's overkill for final fields. Even more so when those fields are package-private.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179892080

Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala

```diff
@@ -499,20 +497,18 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
   }
 
   private def printUsageAndExit(exitCode: Int, unknownParam: Any = null): Unit = {
```

The intent is to "exit" the submission process (even if there's no "exit" in some cases). The different name would also feel weird given the `exitCode` parameter. So even if not optimal, I prefer the current name.
Github user attilapiros commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179816905

Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala

```diff
@@ -499,20 +497,18 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
   }
 
   private def printUsageAndExit(exitCode: Int, unknownParam: Any = null): Unit = {
```

Consider renaming the method. What about `printUsageAndThrowException`?
Github user attilapiros commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179832847

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -99,17 +100,27 @@
    */
   private boolean allowsMixedArguments;
 
+  /**
+   * This constructor is used when creating a user-configurable launcher. It allows the
+   * spark-submit argument list to be modified after creation.
+   */
   SparkSubmitCommandBuilder() {
-    this.sparkArgs = new ArrayList<>();
     this.isAppResourceReq = true;
     this.isExample = false;
+    this.parsedArgs = new ArrayList<>();
+    this.userArgs = new ArrayList<>();
   }
 
+  /**
+   * This constructor is used when invoking spark-submit; it parses and validates arguments
+   * provided by the user on the command line.
+   */
   SparkSubmitCommandBuilder(List<String> args) {
     this.allowsMixedArguments = false;
-    this.sparkArgs = new ArrayList<>();
+    this.parsedArgs = new ArrayList<>();
     boolean isExample = false;
     List<String> submitArgs = args;
+    this.userArgs = null;
```

Consider `Collections.emptyList()`. I see that these two constructors cover two different use cases. An abstract base class with two derived classes could express these two use cases better, but I know that is out of scope for now. Does it make sense to create a Jira ticket for refactoring this?
Github user attilapiros commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179814761

Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala

```diff
@@ -289,27 +288,26 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
   }
```

This might be a good candidate for using your new `error` method instead of throwing the exception directly. It might happen that there is a client catching both `Exception` and `SparkException` and doing very different things, but I guess that is a very unlikely case.
Github user attilapiros commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179825806

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -88,7 +88,8 @@
     SparkLauncher.NO_RESOURCE);
   }
 
-  final List<String> sparkArgs;
+  final List<String> userArgs;
```

Consider making it private and accessing it via methods.
Github user attilapiros commented on a diff in the pull request: https://github.com/apache/spark/pull/20925#discussion_r179834098

Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java

```diff
@@ -400,6 +419,11 @@ private boolean isThriftServer(String mainClass) {
   private class OptionParser extends SparkSubmitOptionParser {
 
     boolean isAppResourceReq = true;
+    boolean errorOnUnknownArgs;
```

private
GitHub user vanzin opened a pull request: https://github.com/apache/spark/pull/20925

[SPARK-22941][core] Do not exit JVM when submit fails with in-process launcher.

The current in-process launcher implementation just calls the SparkSubmit object, which, in case of errors, will more often than not exit the JVM. This is not desirable since this launcher is meant to be used inside other applications, and that would kill the application.

The change turns SparkSubmit into a class, and abstracts away some of the functionality used to print error messages and abort the submission process. The default implementation uses the logging system for messages, and throws exceptions for errors. As part of that I also moved some code that doesn't really belong in SparkSubmit to a better location.

The command line invocation of spark-submit now uses a special implementation of the SparkSubmit class that overrides those behaviors to do what is expected from the command line version (print to the terminal, exit the JVM, etc.).

A lot of the changes are to replace calls to methods such as "printErrorAndExit" with the new API.

As part of adding tests for this, I had to fix some small things in the launcher option parser so that things like "--version" can work when used in the launcher library.

There is still code that prints directly to the terminal, like all the Ivy-related code in SparkSubmitUtils, and other areas where some refactoring would help, like the CommandLineUtils class, but I chose to leave those alone to keep this change more focused.

Aside from existing and added unit tests, I ran the command line tools with a bunch of different arguments to make sure messages and errors behave like before.
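The error-handling change described above can be pictured as a small template-method sketch: a base class that reports problems by throwing (safe for in-process callers), and a command-line subclass that overrides that behavior to print and exit. The class and method names here (`SubmitSketch`, `error`) are illustrative assumptions, not Spark's actual code.

```java
// Illustrative only: the base class reports problems by throwing, so an
// in-process caller survives a bad submission; the command-line subclass
// restores the old behavior of printing to stderr and exiting the JVM.
class SubmitSketch {
  void run(String[] args) {
    if (args.length == 0) {
      error("Missing application resource.");
    }
    // ... build and launch the submission ...
  }

  // Default (in-process) behavior: surface the problem to the caller.
  protected void error(String msg) {
    throw new IllegalArgumentException(msg);
  }
}

class CommandLineSubmitSketch extends SubmitSketch {
  @Override
  protected void error(String msg) {
    // Acceptable only in the standalone spark-submit CLI, where the
    // process exists solely to run this one submission.
    System.err.println("Error: " + msg);
    System.exit(1);
  }
}
```

With this shape, a host application embedding the launcher catches the exception and keeps running, while the CLI still terminates with a nonzero exit code.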
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-22941

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20925.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20925