[GitHub] spark issue #21849: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-08-30 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/21849
  
@vanzin addressed your comments and updated the PR


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21849: [SPARK-24243][CORE] Expose exceptions from InProc...

2018-08-30 Thread sahilTakiar
Github user sahilTakiar commented on a diff in the pull request:

https://github.com/apache/spark/pull/21849#discussion_r214110048
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/OutputRedirector.java ---
@@ -17,6 +17,8 @@
 
 package org.apache.spark.launcher;
 
+import org.apache.commons.lang.StringUtils;
--- End diff --

copied the code into a helper method


---




[GitHub] spark pull request #21849: [SPARK-24243][CORE] Expose exceptions from InProc...

2018-08-30 Thread sahilTakiar
Github user sahilTakiar commented on a diff in the pull request:

https://github.com/apache/spark/pull/21849#discussion_r214109949
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java ---
@@ -46,6 +47,18 @@ public synchronized void disconnect() {
 }
   }
 
+  /**
+   * Since {@link SparkLauncher} runs {@code spark-submit} in a 
sub-process, its difficult to
--- End diff --

updated the javadocs


---




[GitHub] spark pull request #21849: [SPARK-24243][CORE] Expose exceptions from InProc...

2018-08-30 Thread sahilTakiar
Github user sahilTakiar commented on a diff in the pull request:

https://github.com/apache/spark/pull/21849#discussion_r214109901
  
--- Diff: 
core/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java ---
@@ -227,6 +220,82 @@ public void testInProcessLauncherDoesNotKillJvm() throws Exception {
     assertEquals(SparkAppHandle.State.LOST, handle.getState());
   }
 
+  @Test
+  public void testInProcessLauncherGetError() throws Exception {
+    // Because this test runs SparkLauncher in process and in client mode, it pollutes the system
+    // properties, and that can cause test failures down the test pipeline. So restore the original
+    // system properties after this test runs.
+    Map<Object, Object> properties = new HashMap<>(System.getProperties());
+
+    SparkAppHandle handle = null;
+    try {
+      handle = new InProcessLauncher()
+        .setMaster("local")
+        .setAppResource(SparkLauncher.NO_RESOURCE)
+        .setMainClass(ErrorInProcessTestApp.class.getName())
+        .addAppArgs("hello")
+        .startApplication();
+
+      final SparkAppHandle _handle = handle;
+      eventually(Duration.ofSeconds(60), Duration.ofMillis(1000), () -> {
+        assertEquals(SparkAppHandle.State.FAILED, _handle.getState());
+      });
+
+      assertNotNull(handle.getError());
+      assertTrue(handle.getError().isPresent());
+      assertTrue(handle.getError().get().getCause() == DUMMY_EXCEPTION);
--- End diff --

changed to `assertSame`


---




[GitHub] spark issue #21849: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-07-23 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/21849
  
@vanzin could you take a look? I'm not sure whether the race conditions 
present in the other unit tests apply to the new ones, since no `SparkContext` 
is being created. For now I didn't add any synchronization, but I can add some.


---




[GitHub] spark pull request #21849: [SPARK-24243][CORE] Expose exceptions from InProc...

2018-07-23 Thread sahilTakiar
GitHub user sahilTakiar opened a pull request:

https://github.com/apache/spark/pull/21849

[SPARK-24243][CORE] Expose exceptions from InProcessAppHandle

## What changes were proposed in this pull request?

Adds a new method to `SparkAppHandle` called `getError` which returns
the exception (if present) that caused the underlying Spark app to
fail.

## How was this patch tested?

New tests added to `SparkLauncherSuite` for the new method.
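
The `getError` method described above can be pictured with a small standalone sketch. Nothing below is Spark code: `AppHandleSketch` and `reportFailure` are hypothetical stand-ins that only illustrate the `Optional`-based error-reporting shape the PR adds to `SparkAppHandle`.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

// Minimal sketch of the getError pattern: the handle records the failure
// cause when the app fails, and exposes it as an Optional so callers can
// distinguish "failed with a known exception" from "failed, cause unknown".
class AppHandleSketch {
  private final AtomicReference<Throwable> error = new AtomicReference<>();

  void reportFailure(Throwable t) {
    error.compareAndSet(null, t);  // keep only the first reported failure
  }

  Optional<Throwable> getError() {
    return Optional.ofNullable(error.get());
  }
}

public class GetErrorDemo {
  public static void main(String[] args) {
    AppHandleSketch handle = new AppHandleSketch();
    System.out.println(handle.getError().isPresent()); // no failure yet
    handle.reportFailure(new IllegalStateException("app failed"));
    System.out.println(handle.getError().get().getMessage());
  }
}
```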


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sahilTakiar/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21849.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21849


commit 7e3100ed148f98078c04e5aba8a41ead8776f89b
Author: Sahil Takiar 
Date:   2018-07-23T17:31:24Z

[SPARK-24243][CORE] Expose exceptions from InProcessAppHandle

Adds a new method to `SparkAppHandle` called `getError` which returns
the exception (if present) that caused the underlying Spark app to
fail.

New tests added to `SparkLauncherSuite` for the new method.




---




[GitHub] spark issue #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't check st...

2018-03-27 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/20893
  
Wrote a test in `SparkLauncherSuite` and was able to replicate the error 
from HIVE-18533, then realized the exception is only logged and then 
swallowed. From `SparkContext`:

```
if (_dagScheduler != null) {
  Utils.tryLogNonFatalError {
_dagScheduler.stop()
  }
  _dagScheduler = null
}
```

Seems the failures I was seeing in HIVE-18533 are due to something else.

Regardless, this is still probably a good fix, since you don't want to write 
to the connection unless it's open; but given that the exception is only 
logged and not thrown, I don't see an easy way to write a test for this.
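
The quoted `SparkContext` snippet depends on `Utils.tryLogNonFatalError` swallowing the failure. A minimal stand-alone sketch of that log-and-swallow pattern (not Spark's actual `Utils` implementation, which handles the full `NonFatal` set) shows why the caller can never observe the exception:

```java
// Sketch of the log-and-swallow pattern the quoted SparkContext code relies
// on: cleanup failures are logged and never rethrown, so a stop()-time
// exception cannot be asserted on from a test.
public class TryLogDemo {
  static void tryLogNonFatalError(Runnable block) {
    try {
      block.run();
    } catch (RuntimeException e) {  // treated as non-fatal: log and swallow
      System.err.println("Ignoring error: " + e.getMessage());
    }
  }

  public static void main(String[] args) {
    tryLogNonFatalError(() -> { throw new IllegalStateException("stop failed"); });
    System.out.println("still running");  // the caller never sees the exception
  }
}
```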


---




[GitHub] spark pull request #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't c...

2018-03-27 Thread sahilTakiar
Github user sahilTakiar commented on a diff in the pull request:

https://github.com/apache/spark/pull/20893#discussion_r177577783
  
--- Diff: 
core/src/main/scala/org/apache/spark/launcher/LauncherBackend.scala ---
@@ -114,10 +114,10 @@ private[spark] abstract class LauncherBackend {
 
 override def close(): Unit = {
   try {
+_isConnected = false
 super.close()
   } finally {
 onDisconnected()
--- End diff --

Yeah, you're right: it doesn't look like `onDisconnected` is being used 
anywhere, but it's marked as `protected`, so I guess it's meant to be used by a 
sub-class (although no sub-class uses it).

So agreed, this change should be safe; it doesn't change the semantics of 
any use of `onDisconnected`.


---




[GitHub] spark issue #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't check st...

2018-03-26 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/20893
  
Ok, I'll work on writing a test for `SparkLauncherSuite`.

The test added here was meant to cover the race condition mentioned 
[here](https://spark.apache.org/docs/2.3.0/sql-programming-guide.html#bucketing-sorting-and-partitioning).


---




[GitHub] spark pull request #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't c...

2018-03-26 Thread sahilTakiar
Github user sahilTakiar commented on a diff in the pull request:

https://github.com/apache/spark/pull/20893#discussion_r177208050
  
--- Diff: 
core/src/main/scala/org/apache/spark/launcher/LauncherBackend.scala ---
@@ -114,10 +114,10 @@ private[spark] abstract class LauncherBackend {
 
 override def close(): Unit = {
   try {
+_isConnected = false
 super.close()
   } finally {
 onDisconnected()
--- End diff --

`_isConnected` is used in `def isConnected()`. Moving it from the `finally` 
block to before the call to `super.close()` avoids a race condition where a 
client tries to write to the connection after it has been closed.
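
A sketch of the ordering fix, using a hypothetical `BackendSketch` class rather than the real `LauncherBackend`: flipping the connected flag before tearing the connection down means a writer that honors `isConnected()` cannot write to an already-closed connection.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the ordering fix: flip the connected flag *before* tearing down
// the connection, so a concurrent writer that checks isConnected() cannot
// slip in between the close and the flag update.
public class BackendSketch {
  private final AtomicBoolean connected = new AtomicBoolean(true);

  boolean isConnected() { return connected.get(); }

  void setState(String state) {
    if (isConnected()) {       // check before writing, as the PR proposes
      System.out.println("sent " + state);
    } else {
      System.out.println("dropped " + state);
    }
  }

  void close() {
    connected.set(false);      // moved ahead of the actual close
    // the real close of the underlying connection would go here
  }

  public static void main(String[] args) {
    BackendSketch b = new BackendSketch();
    b.setState("RUNNING");
    b.close();
    b.setState("FINISHED");    // dropped instead of hitting a closed socket
  }
}
```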


---




[GitHub] spark pull request #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't c...

2018-03-23 Thread sahilTakiar
GitHub user sahilTakiar opened a pull request:

https://github.com/apache/spark/pull/20893

[SPARK-23785][LAUNCHER] LauncherBackend doesn't check state of connection 
before setting state

## What changes were proposed in this pull request?

Changed `LauncherBackend` `set` method so that it checks if the connection 
is open or not before writing to it (uses `isConnected`).

## How was this patch tested?

None

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sahilTakiar/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20893.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20893


commit 1896236c9b79975e8add4017eb7fafc1cec59d70
Author: Sahil Takiar <stakiar@...>
Date:   2018-03-23T20:24:21Z

[SPARK-23785][LAUNCHER] LauncherBackend doesn't check state of connection 
before setting state




---




[GitHub] spark pull request #20815: [SPARK-23658][LAUNCHER] InProcessAppHandle uses t...

2018-03-13 Thread sahilTakiar
GitHub user sahilTakiar opened a pull request:

https://github.com/apache/spark/pull/20815

[SPARK-23658][LAUNCHER] InProcessAppHandle uses the wrong class in getLogger

## What changes were proposed in this pull request?

Changed `Logger` in `InProcessAppHandle` to use `InProcessAppHandle` 
instead of `ChildProcAppHandle`
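
A small self-contained illustration of why the class passed to `getLogger` matters (the classes here are empty stand-ins, not the real launcher classes): the logger is named after whatever class is supplied, so using the wrong class attributes every `InProcessAppHandle` message to `ChildProcAppHandle` in log output and configuration.

```java
import java.util.logging.Logger;

// Stand-in classes; the logger's name (and therefore its configuration and
// the class shown in log output) comes from the class passed to getLogger.
class ChildProcAppHandle {}

class InProcessAppHandle {
  // correct: name the logger after the class that owns it
  static final Logger LOG = Logger.getLogger(InProcessAppHandle.class.getName());
}

public class LoggerDemo {
  public static void main(String[] args) {
    // prints the owning class's name, not ChildProcAppHandle
    System.out.println(InProcessAppHandle.LOG.getName());
  }
}
```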

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sahilTakiar/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20815.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20815


commit 1518a5af591b2254e947a60e0ec107551f2155a4
Author: Sahil Takiar <stakiar@...>
Date:   2018-03-13T18:24:20Z

[SPARK-23658][LAUNCHER] InProcessAppHandle uses the wrong class in getLogger




---




[GitHub] spark pull request #19413: [SPARK-20466][CORE] HadoopRDD#addLocalConfigurati...

2017-10-03 Thread sahilTakiar
Github user sahilTakiar commented on a diff in the pull request:

https://github.com/apache/spark/pull/19413#discussion_r142499531
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -157,20 +157,23 @@ class HadoopRDD[K, V](
       if (conf.isInstanceOf[JobConf]) {
         logDebug("Re-using user-broadcasted JobConf")
         conf.asInstanceOf[JobConf]
-      } else if (HadoopRDD.containsCachedMetadata(jobConfCacheKey)) {
-        logDebug("Re-using cached JobConf")
-        HadoopRDD.getCachedMetadata(jobConfCacheKey).asInstanceOf[JobConf]
       } else {
-        // Create a JobConf that will be cached and used across this RDD's getJobConf() calls in the
-        // local process. The local cache is accessed through HadoopRDD.putCachedMetadata().
-        // The caching helps minimize GC, since a JobConf can contain ~10KB of temporary objects.
-        // Synchronize to prevent ConcurrentModificationException (SPARK-1097, HADOOP-10456).
-        HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
-          logDebug("Creating new JobConf and caching it for later re-use")
-          val newJobConf = new JobConf(conf)
-          initLocalJobConfFuncOpt.foreach(f => f(newJobConf))
-          HadoopRDD.putCachedMetadata(jobConfCacheKey, newJobConf)
-          newJobConf
+        val newJobConf = HadoopRDD.getCachedMetadata(jobConfCacheKey)
--- End diff --

Thanks for the tip. Updated the patch.


---




[GitHub] spark pull request #19413: [SPARK-20466][CORE] HadoopRDD#addLocalConfigurati...

2017-10-03 Thread sahilTakiar
Github user sahilTakiar commented on a diff in the pull request:

https://github.com/apache/spark/pull/19413#discussion_r142466295
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -157,20 +157,23 @@ class HadoopRDD[K, V](
       if (conf.isInstanceOf[JobConf]) {
         logDebug("Re-using user-broadcasted JobConf")
         conf.asInstanceOf[JobConf]
-      } else if (HadoopRDD.containsCachedMetadata(jobConfCacheKey)) {
-        logDebug("Re-using cached JobConf")
-        HadoopRDD.getCachedMetadata(jobConfCacheKey).asInstanceOf[JobConf]
       } else {
-        // Create a JobConf that will be cached and used across this RDD's getJobConf() calls in the
-        // local process. The local cache is accessed through HadoopRDD.putCachedMetadata().
-        // The caching helps minimize GC, since a JobConf can contain ~10KB of temporary objects.
-        // Synchronize to prevent ConcurrentModificationException (SPARK-1097, HADOOP-10456).
-        HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
-          logDebug("Creating new JobConf and caching it for later re-use")
-          val newJobConf = new JobConf(conf)
-          initLocalJobConfFuncOpt.foreach(f => f(newJobConf))
-          HadoopRDD.putCachedMetadata(jobConfCacheKey, newJobConf)
-          newJobConf
+        val newJobConf = HadoopRDD.getCachedMetadata(jobConfCacheKey)
--- End diff --

I could do that, but then I would have to remove the `logDebug("Re-using 
cached JobConf")` statement, unless there is some way to get around that in 
Scala.

Otherwise, I can rename the val to `cachedJobConf`.


---




[GitHub] spark issue #19413: [SPARK-20466][CORE] HadoopRDD#addLocalConfiguration thro...

2017-10-02 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/19413
  
Fixed the style check.


---




[GitHub] spark pull request #19413: [SPARK-20466][CORE] HadoopRDD#addLocalConfigurati...

2017-10-02 Thread sahilTakiar
GitHub user sahilTakiar opened a pull request:

https://github.com/apache/spark/pull/19413

[SPARK-20466][CORE] HadoopRDD#addLocalConfiguration throws NPE

## What changes were proposed in this pull request?

Fix for SPARK-20466; a full description of the issue is in the JIRA. To 
summarize, `HadoopRDD` uses a metadata cache to cache `JobConf` objects. The 
cache uses soft references, which means the JVM can delete entries from the 
cache whenever there is GC pressure. `HadoopRDD#getJobConf` had a bug where it 
would check whether the cache contained the `JobConf`; if it did, it would get 
the `JobConf` from the cache and return it. This doesn't work when soft 
references are used, as the JVM can delete the entry between the existence 
check and the get call.
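
The check-then-get race can be reproduced without soft references by clearing a plain map between the two calls. This sketch (hypothetical names, not the `HadoopRDD` code) contrasts the buggy shape with the single-lookup shape the patch uses:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the bug: with a cache whose entries can vanish at any time (here
// simulated by removing the entry; in HadoopRDD it is the GC clearing soft
// references), contains() followed by get() can NPE, while a single
// get()-and-null-check cannot.
public class CacheRaceDemo {
  static final Map<String, Object> cache = new ConcurrentHashMap<>();

  // buggy shape: the entry may disappear between the two calls
  static Object getBuggy(String key) {
    if (cache.containsKey(key)) {
      cache.remove(key);                 // simulate the entry being cleared
      return cache.get(key).toString();  // NullPointerException here
    }
    return create(key);
  }

  // fixed shape (what the PR does): one lookup, then a null check
  static Object getFixed(String key) {
    Object cached = cache.get(key);
    return cached != null ? cached : create(key);
  }

  static Object create(String key) {
    Object conf = "conf-for-" + key;
    cache.put(key, conf);
    return conf;
  }

  public static void main(String[] args) {
    cache.put("job", "conf-for-job");
    try {
      getBuggy("job");
    } catch (NullPointerException e) {
      System.out.println("check-then-get raced");
    }
    System.out.println(getFixed("job")); // always safe
  }
}
```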

## How was this patch tested?

Haven't thought of a good way to test this yet given the issue only occurs 
sometimes, and happens during high GC pressure. Was thinking of using mocks to 
verify `#getJobConf` is doing the right thing. I deleted the method 
`HadoopRDD#containsCachedMetadata` so that we don't hit this issue again.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sahilTakiar/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19413


commit 680f32c311e33784e11763c109488d528178efc8
Author: Sahil Takiar <stak...@cloudera.com>
Date:   2017-10-02T20:44:23Z

[SPARK-20466][CORE] HadoopRDD#addLocalConfiguration throws NPE




---




[GitHub] spark pull request #17499: [SPARK-20161][CORE] Default log4j properties file...

2017-04-04 Thread sahilTakiar
Github user sahilTakiar closed the pull request at:

https://github.com/apache/spark/pull/17499


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[GitHub] spark issue #17499: [SPARK-20161][CORE] Default log4j properties file should...

2017-04-04 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/17499
  
Sounds like we agree that if this should be done, it should be done in 
Hive. We have an open JIRA (HIVE-13517) to address this.

If there are no other comments, I'll close this PR and the corresponding 
JIRA.


---




[GitHub] spark issue #17499: [SPARK-20161][CORE] Default log4j properties file should...

2017-03-31 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/17499
  
Thanks for taking a look, everyone. The original motivation for this PR comes 
from [HIVE-13517](https://issues.apache.org/jira/browse/HIVE-13517). It was said 
to be useful for debugging HoS applications that run on YARN. I personally can't 
point to a specific example where knowing the thread-id has helped debug Spark 
applications. The HIVE JIRA mentions that the logs from different threads get 
mixed, which makes it difficult to debug. The idea is that adding the thread-id 
to the logs will help other Spark applications too, but if that's not the case 
then I think we can close this. I've tagged some HoS experts in SPARK-20161 to 
see if they can provide more details.

As Marcelo said, there is still a workaround on the Hive side for this 
issue: a custom log4j.properties file can always be provided.


---




[GitHub] spark issue #17499: [SPARK-20161][CORE] Default log4j properties file should...

2017-03-31 Thread sahilTakiar
Github user sahilTakiar commented on the issue:

https://github.com/apache/spark/pull/17499
  
I'm not sure how the community feels about adding this to the default log4j 
files, so posting this as a reference for now. Some more details are in the 
JIRA, but this can improve debuggability, especially for Spark-on-YARN.


---




[GitHub] spark pull request #17499: [SPARK-20161][CORE] Default log4j properties file...

2017-03-31 Thread sahilTakiar
GitHub user sahilTakiar opened a pull request:

https://github.com/apache/spark/pull/17499

[SPARK-20161][CORE] Default log4j properties file should print thread-id in 
ConversionPattern

## What changes were proposed in this pull request?

Change the default log4j properties files so that they display the thread 
name in the log output.
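
For reference, a hypothetical properties file showing the proposed change: adding `[%t]` to the `ConversionPattern` makes each line carry the thread name. The surrounding keys mirror a typical log4j 1.x console setup and may differ from Spark's shipped defaults.

```properties
# Hypothetical default log4j.properties with the thread name added.
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# %t is the conversion character for the thread name
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p [%t] %c{1}: %m%n
```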

## How was this patch tested?

No testing done yet

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sahilTakiar/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17499.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17499


commit 417a0ddc6f7fef8af007d13562f3681e72715bc0
Author: Sahil Takiar <stak...@cloudera.com>
Date:   2017-03-31T20:23:44Z

[SPARK-20161][CORE] Default log4j properties file should print thread-id in 
ConversionPattern




---
