[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20925


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-09 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r180193625
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -154,9 +165,17 @@

   List<String> buildSparkSubmitArgs() {
     List<String> args = new ArrayList<>();
-    SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
+    OptionParser parser = new OptionParser(false);
--- End diff --

This is explained in the public API (`addSparkArg`, which is declared in `AbstractLauncher`).


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-09 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r180193382
  
--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -775,17 +781,17 @@ class SparkSubmitSuite
   }

   test("SPARK_CONF_DIR overrides spark-defaults.conf") {
-    forConfDir(Map("spark.executor.memory" -> "2.3g")) { path =>
+    forConfDir(Map("spark.executor.memory" -> "3g")) { path =>
--- End diff --

It's just that now, instead of printing an error to the output, an exception is actually thrown.

```
[info] SparkSubmitSuite:
[info] - SPARK_CONF_DIR overrides spark-defaults.conf *** FAILED *** (144 milliseconds)
[info]   org.apache.spark.SparkException: Executor Memory cores must be a positive number
[info]   at org.apache.spark.deploy.SparkSubmitArguments.error(SparkSubmitArguments.scala:652)
[info]   at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:267)
```

That is because:

```
scala> c.set("spark.abcde", "2.3g")
res0: org.apache.spark.SparkConf = org.apache.spark.SparkConf@cd5ff55

scala> c.getSizeAsBytes("spark.abcde")
java.lang.NumberFormatException: Size must be specified as bytes (b), kibibytes (k), mebibytes (m), gibibytes (g), tebibytes (t), or pebibytes(p). E.g. 50b, 100k, or 250m.
Fractional values are not supported. Input was: 2.3
```

Just noticed that the error message is kinda wrong, but also this whole 
validation function (`validateSubmitArguments`) leaves a lot to be desired...
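To illustrate why the fractional value blows up only at parse time, here is a rough, self-contained sketch of suffix-based size parsing. This is not Spark's actual `JavaUtils`/`getSizeAsBytes` code; the class and method names are illustrative. The key point is that the numeric part goes through `Long.parseLong`, which rejects anything with a decimal point:

```java
import java.util.Map;

public class SizeParseSketch {

  // Binary size suffixes, as in the quoted error message (b, k, m, g, ...).
  private static final Map<String, Long> SUFFIXES = Map.of(
      "b", 1L,
      "k", 1024L,
      "m", 1024L * 1024L,
      "g", 1024L * 1024L * 1024L);

  /** Parses e.g. "3g" to bytes; rejects fractional values like "2.3g". */
  public static long sizeAsBytes(String input) {
    String s = input.trim().toLowerCase();
    String suffix = s.substring(s.length() - 1);
    boolean hasSuffix = SUFFIXES.containsKey(suffix);
    String number = hasSuffix ? s.substring(0, s.length() - 1) : s;
    // Long.parseLong("2.3") throws NumberFormatException: fractional sizes
    // were never supported, the old code just printed the error and exited.
    return Long.parseLong(number) * (hasSuffix ? SUFFIXES.get(suffix) : 1L);
  }

  public static void main(String[] args) {
    System.out.println(sizeAsBytes("3g"));
    try {
      sizeAsBytes("2.3g");
    } catch (NumberFormatException e) {
      System.out.println("rejected fractional size: 2.3g");
    }
  }
}
```

With the PR's change, this failure surfaces as a thrown `SparkException` rather than a printed message, which is why the test fixture had to switch from `2.3g` to `3g`.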


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-08 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179987224
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -154,9 +165,17 @@

   List<String> buildSparkSubmitArgs() {
     List<String> args = new ArrayList<>();
-    SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
+    OptionParser parser = new OptionParser(false);
--- End diff --

What's the reason for allowing unknown args here? Is it so an old launcher can start a newer Spark, which may accept more args? A comment would be helpful.
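The two modes in question can be sketched as follows. This is a hypothetical illustration of what an `errorOnUnknownArgs`-style flag does, not the actual `OptionParser` implementation; the option names and class are made up:

```java
import java.util.ArrayList;
import java.util.List;

public class LenientParserSketch {
  private final boolean errorOnUnknownArgs;
  // In lenient mode, unknown options are collected and forwarded untouched.
  final List<String> unknown = new ArrayList<>();

  LenientParserSketch(boolean errorOnUnknownArgs) {
    this.errorOnUnknownArgs = errorOnUnknownArgs;
  }

  void parse(List<String> args) {
    for (String arg : args) {
      if (arg.equals("--master") || arg.equals("--class")) {
        continue; // a known option; real code would record its value
      }
      if (errorOnUnknownArgs) {
        // Strict mode: fail fast, as when validating a real command line.
        throw new IllegalArgumentException("Unrecognized option: " + arg);
      }
      unknown.add(arg); // lenient mode: let spark-submit itself decide later
    }
  }
}
```

In lenient mode the builder only needs to recognize the options that affect how the command is assembled, and can pass everything else through to spark-submit for final validation.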


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-08 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179987409
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -154,9 +165,17 @@

   List<String> buildSparkSubmitArgs() {
     List<String> args = new ArrayList<>();
-    SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
+    OptionParser parser = new OptionParser(false);
+    boolean isStartingApp = isAppResourceReq;
+
+    // If the user args array is not empty, we need to parse it to detect exactly what
+    // the user is trying to run, so that checks below are correct.
+    if (!userArgs.isEmpty()) {
+      parser.parse(userArgs);
+      isStartingApp = parser.isAppResourceReq;
--- End diff --

I don't really care whether the name is `isStartingApp` or `isAppResourceReq`, but it seems the name should be the same here and in `OptionParser`, unless there is some difference I'm missing.


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-08 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179952361
  
--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -775,17 +781,17 @@ class SparkSubmitSuite
   }

   test("SPARK_CONF_DIR overrides spark-defaults.conf") {
-    forConfDir(Map("spark.executor.memory" -> "2.3g")) { path =>
+    forConfDir(Map("spark.executor.memory" -> "3g")) { path =>
--- End diff --

Why this change? Do you no longer support fractional values?


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179892764
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -99,17 +100,27 @@
    */
   private boolean allowsMixedArguments;

+  /**
+   * This constructor is used when creating a user-configurable launcher. It allows the
+   * spark-submit argument list to be modified after creation.
+   */
   SparkSubmitCommandBuilder() {
-    this.sparkArgs = new ArrayList<>();
     this.isAppResourceReq = true;
     this.isExample = false;
+    this.parsedArgs = new ArrayList<>();
+    this.userArgs = new ArrayList<>();
   }

+  /**
+   * This constructor is used when invoking spark-submit; it parses and validates arguments
+   * provided by the user on the command line.
+   */
   SparkSubmitCommandBuilder(List<String> args) {
     this.allowsMixedArguments = false;
-    this.sparkArgs = new ArrayList<>();
+    this.parsedArgs = new ArrayList<>();
     boolean isExample = false;
     List<String> submitArgs = args;
+    this.userArgs = null;
--- End diff --

If you want to take a stab at refactoring... I'm not so sure you'd be able to make things much better, though, since the parameters just control shared logic that is applied later.


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179892170
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -88,7 +88,8 @@
       SparkLauncher.NO_RESOURCE);
   }

-  final List<String> sparkArgs;
+  final List<String> userArgs;
--- End diff --

That's overkill for final fields. Even more so if those fields are package-private.


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179892080
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -499,20 +497,18 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
   }

   private def printUsageAndExit(exitCode: Int, unknownParam: Any = null): Unit = {
--- End diff --

The intent is to "exit" the submission process (even if there's no "exit" in some cases). A different name would also feel weird given the `exitCode` parameter. So even if not optimal, I prefer the current name.


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread attilapiros
Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179816905
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -499,20 +497,18 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
   }

   private def printUsageAndExit(exitCode: Int, unknownParam: Any = null): Unit = {
--- End diff --

Consider renaming the method. What about `printUsageAndThrowException`?


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread attilapiros
Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179832847
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -99,17 +100,27 @@
    */
   private boolean allowsMixedArguments;

+  /**
+   * This constructor is used when creating a user-configurable launcher. It allows the
+   * spark-submit argument list to be modified after creation.
+   */
   SparkSubmitCommandBuilder() {
-    this.sparkArgs = new ArrayList<>();
     this.isAppResourceReq = true;
     this.isExample = false;
+    this.parsedArgs = new ArrayList<>();
+    this.userArgs = new ArrayList<>();
   }

+  /**
+   * This constructor is used when invoking spark-submit; it parses and validates arguments
+   * provided by the user on the command line.
+   */
   SparkSubmitCommandBuilder(List<String> args) {
     this.allowsMixedArguments = false;
-    this.sparkArgs = new ArrayList<>();
+    this.parsedArgs = new ArrayList<>();
     boolean isExample = false;
     List<String> submitArgs = args;
+    this.userArgs = null;
--- End diff --

Consider `Collections.emptyList()`. I see these two constructors cover two different use cases. An abstract base class with two derived classes could express these two use cases better, but I know it is out of scope for now. Does it make sense to create a Jira ticket for refactoring this?
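The `Collections.emptyList()` suggestion can be sketched as below. This is purely illustrative (the class and field names mirror the discussion but are not the PR's actual code): using an immutable empty list as the "no user args" sentinel lets callers iterate or call `isEmpty()` without a null check.

```java
import java.util.Collections;
import java.util.List;

public class EmptyListSketch {
  final List<String> userArgs;

  EmptyListSketch(List<String> userArgs) {
    // An immutable empty list is a safe sentinel for "no user args".
    this.userArgs = (userArgs == null) ? Collections.emptyList() : userArgs;
  }

  boolean hasUserArgs() {
    return !userArgs.isEmpty(); // no NullPointerException possible
  }
}
```

The trade-off in the actual code is that `null` doubles as a marker for "command-line mode" vs. "programmatic mode", which is part of why the reviewer suggests a refactoring ticket.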


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread attilapiros
Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179814761
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -289,27 +288,26 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
 }
--- End diff --

This might be a good candidate to use your new error method instead of throwing the Exception directly. It might happen that a client is catching both Exception and SparkException and doing very different things, but I guess that is a very unlikely case.


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread attilapiros
Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179825806
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -88,7 +88,8 @@
       SparkLauncher.NO_RESOURCE);
   }

-  final List<String> sparkArgs;
+  final List<String> userArgs;
--- End diff --

Consider making it private and accessing it via methods.


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-04-06 Thread attilapiros
Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/20925#discussion_r179834098
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java ---
@@ -400,6 +419,11 @@ private boolean isThriftServer(String mainClass) {
   private class OptionParser extends SparkSubmitOptionParser {

     boolean isAppResourceReq = true;
+    boolean errorOnUnknownArgs;
--- End diff --

Make this `private`.


---




[GitHub] spark pull request #20925: [SPARK-22941][core] Do not exit JVM when submit f...

2018-03-28 Thread vanzin
GitHub user vanzin opened a pull request:

https://github.com/apache/spark/pull/20925

[SPARK-22941][core] Do not exit JVM when submit fails with in-process 
launcher.

The current in-process launcher implementation just calls the SparkSubmit
object, which, in case of errors, will more often than not exit the JVM.
This is not desirable since this launcher is meant to be used inside other
applications, and that would kill the application.

The change turns SparkSubmit into a class, and abstracts away some of
the functionality used to print error messages and abort the submission
process. The default implementation uses the logging system for messages,
and throws exceptions for errors. As part of that I also moved some code
that doesn't really belong in SparkSubmit to a better location.

The command line invocation of spark-submit now uses a special 
implementation
of the SparkSubmit class that overrides those behaviors to do what is 
expected
from the command line version (print to the terminal, exit the JVM, etc).

A lot of the changes are to replace calls to methods such as 
"printErrorAndExit"
with the new API.

As part of adding tests for this, I had to fix some small things in the
launcher option parser so that things like "--version" can work when
used in the launcher library.

There is still code that prints directly to the terminal, like all the
Ivy-related code in SparkSubmitUtils, and other areas where some refactoring
would help, like the CommandLineUtils class, but I chose to leave those
alone to keep this change more focused.

Aside from existing and added unit tests, I ran command line tools with
a bunch of different arguments to make sure messages and errors behave
like before.
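The core pattern the description outlines can be sketched in a few lines. This is a minimal illustration, not Spark's actual classes (the names `InProcessSubmit`, `CommandLineSubmit`, and `error` are stand-ins): the base class reports errors by throwing, so a host application survives a failed submission, while the command-line entry point overrides that behavior to print and exit the JVM:

```java
public class SubmitSketch {

  static class InProcessSubmit {
    /** Default behavior: throw, never kill the host JVM. */
    protected void error(String msg) {
      throw new IllegalStateException(msg);
    }

    void submit(String[] args) {
      if (args.length == 0) {
        error("Missing application resource");
      }
      // ... build and run the submission ...
    }
  }

  static class CommandLineSubmit extends InProcessSubmit {
    /** CLI behavior: print to the terminal and exit with a status code. */
    @Override
    protected void error(String msg) {
      System.err.println("Error: " + msg);
      System.exit(1);
    }
  }

  public static void main(String[] args) {
    try {
      new InProcessSubmit().submit(new String[0]);
    } catch (IllegalStateException e) {
      System.out.println("submission failed, JVM still alive: " + e.getMessage());
    }
  }
}
```

The bulk of the diff then consists of routing the old `printErrorAndExit`-style call sites through this overridable error hook.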

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vanzin/spark SPARK-22941

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20925






---
