[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2017-02-08 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13326
  
@tnachen Can you check this?





[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...

2017-02-08 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13077
  
@srowen /@tnachen Can you check this?





[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...

2017-02-08 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13072
  
@srowen Can you check this?





[GitHub] spark issue #13143: [SPARK-15359] [Mesos] Mesos dispatcher should handle DRI...

2017-02-08 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13143
  
@tnachen Can you check this?





[GitHub] spark pull request #16801: [SPARK-13619] [WEBUI] [CORE] Jobs page UI shows w...

2017-02-03 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/16801

[SPARK-13619] [WEBUI] [CORE] Jobs page UI shows wrong number of failed tasks

## What changes were proposed in this pull request?

When Failed/Killed Task End events arrive after the Job End event, they are simply ignored and never reflected in JobUIData. This happens because the jobId information is removed from stageIdToActiveJobIds during the Job End event, so the later Task End events cannot find the job information to update.
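
For illustration only (this is not the actual patch), a minimal Scala sketch of the idea described above: when a Task End event arrives for a stage whose job has already ended, fall back to a secondary mapping instead of dropping the event. The names `JobUIData` and `stageIdToActiveJobIds` come from the description; `stageIdToCompletedJobIds` and the simplified types are assumptions.

```
// Simplified stand-ins for the listener's bookkeeping; names are illustrative.
case class JobUIData(var numFailedTasks: Int = 0, var numKilledTasks: Int = 0)

class JobProgressSketch {
  val stageIdToActiveJobIds = scala.collection.mutable.Map[Int, Set[Int]]()
  val jobIdToData = scala.collection.mutable.Map[Int, JobUIData]()
  // Hypothetical helper map: job ids remembered for stages whose job already ended.
  val stageIdToCompletedJobIds = scala.collection.mutable.Map[Int, Set[Int]]()

  def onTaskEnd(stageId: Int, reason: String): Unit = {
    // Previously: when the stage no longer mapped to an active job (the job had
    // already ended), the event was dropped. The idea is to still find the job data.
    val jobIds = stageIdToActiveJobIds.getOrElse(stageId,
      stageIdToCompletedJobIds.getOrElse(stageId, Set.empty))
    for (jobId <- jobIds; jobData <- jobIdToData.get(jobId)) {
      reason match {
        case "FAILED" => jobData.numFailedTasks += 1
        case "KILLED" => jobData.numKilledTasks += 1
        case _ =>
      }
    }
  }
}
```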

## How was this patch tested?

### Current behaviour of Spark Jobs page for Running Application and History page,

Completed Jobs (1)

| Job Id | Description | Submitted | Duration | Stages: Succeeded/Total | Tasks (for all stages): Succeeded/Total |
| --- | --- | --- | --- | --- | --- |
| 0 | saveAsTextFile at JavaWordCountWithSlowTask.java:49 | 2017/01/25 09:03:14 | 1.4 min | 2/2 | 400/400 (17 killed) |

Completed Stages (2)

| Stage Id | Description | Submitted | Duration | Tasks: Succeeded/Total | Input | Output | Shuffle Read | Shuffle Write |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | saveAsTextFile at JavaWordCountWithSlowTask.java:49 +details | 2017/01/25 09:04:34 | 5 s | 200/200 (2 failed) (1 killed) | | 6.8 KB | 2.3 MB | |
| 0 | mapToPair at JavaWordCountWithSlowTask.java:33 +details | 2017/01/25 09:03:15 | 1.3 min | 200/200 (16 killed) | 1915.5 MB | | | 2.3 MB |


### Behaviour of the Web Pages after applying the patch,

Completed Jobs (1)

| Job Id | Description | Submitted | Duration | Stages: Succeeded/Total | Tasks (for all stages): Succeeded/Total |
| --- | --- | --- | --- | --- | --- |
| 0 | saveAsTextFile at JavaWordCountWithSlowTask.java:49 | 2017/01/25 09:03:14 | 1.4 min | 2/2 | 400/400 (2 failed) (17 killed) |

Completed Stages (2)

| Stage Id | Description | Submitted | Duration | Tasks: Succeeded/Total | Input | Output | Shuffle Read | Shuffle Write |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | saveAsTextFile at JavaWordCountWithSlowTask.java:49 +details | 2017/01/25 09:04:34 | 5 s | 200/200 (2 failed) (1 killed) | | 6.8 KB | 2.3 MB | |
| 0 | mapToPair at JavaWordCountWithSlowTask.java:33 +details | 2017/01/25 09:03:15 | 1.3 min | 200/200 (16 killed) | 1915.5 MB | | | 2.3 MB |


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-13619

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16801.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16801


commit 3b51ef0e5ddd58e0bd8f90a52ca08145e5cdef4d
Author: Devaraj K 
Date:   2017-02-04T01:40:35Z

[SPARK-13619] [WEBUI] [CORE] Jobs page UI shows wrong number of failed
tasks







[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...

2017-01-27 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13072
  
Like the Executor, the MesosClusterDispatcher runs multiple threads; when any one of them terminates due to an error or exception, the MesosClusterDispatcher process keeps running without that thread's functionality. I think we need to handle those uncaught exceptions from the MesosClusterDispatcher threads using the UncaughtExceptionHandler and take action, instead of letting the MesosClusterDispatcher keep running in a degraded state without notifying the user.
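
A minimal sketch of the approach, using only the plain JDK API (Spark's own SparkUncaughtExceptionHandler plays this role; the logging and exit policy below are assumptions, not the actual change):

```
object DispatcherExceptionHandlerSketch {
  def install(): Unit = {
    // Fail fast: if any dispatcher thread dies with an uncaught exception, log it and
    // exit, instead of the process limping along without that thread's functionality.
    Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
      override def uncaughtException(t: Thread, e: Throwable): Unit = {
        System.err.println(s"Uncaught exception in thread ${t.getName}: $e")
        System.exit(1)
      }
    })
  }
}
```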





[GitHub] spark pull request #16725: [SPARK-19377] [WEBUI] [CORE] Killed tasks should ...

2017-01-27 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/16725

[SPARK-19377] [WEBUI] [CORE] Killed tasks should have the status as KILLED

## What changes were proposed in this pull request?

The killed status was not copied when building the newTaskInfo object, which drops unnecessary details to reduce memory usage. This patch copies the killed status into the newTaskInfo object, so the Web UI displays KILLED instead of an incorrect status.
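
To make the omission concrete, a simplified sketch with invented types (this is not Spark's actual TaskInfo): when a trimmed copy of the task info is built to save memory, the killed flag has to be copied along with the other fields.

```
// Illustrative stand-in for the UI's task info; not Spark's TaskInfo class.
case class TaskInfoLite(taskId: Long, finished: Boolean, failed: Boolean, killed: Boolean)

def trimTaskInfo(info: TaskInfoLite): TaskInfoLite = {
  // The trimmed copy must carry the killed flag over; if it is dropped, the UI
  // later derives SUCCESS/FAILED instead of KILLED for a killed task.
  TaskInfoLite(
    taskId = info.taskId,
    finished = info.finished,
    failed = info.failed,
    killed = info.killed)
}
```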

## How was this patch tested?

Current behaviour of displaying tasks in the stage UI page:

| Index | ID | Attempt | Status | Locality Level | Executor ID / Host | Launch Time | Duration | GC Time | Input Size / Records | Write Time | Shuffle Write Size / Records | Errors |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 143 | 10 | 0 | SUCCESS | NODE_LOCAL | 6 / x.xx.x.x stdout stderr | 2017/01/25 07:49:27 | 0 ms | | 0.0 B / 0 | | 0.0 B / 0 | TaskKilled (killed intentionally) |
| 156 | 11 | 0 | SUCCESS | NODE_LOCAL | 5 / x.xx.x.x stdout stderr | 2017/01/25 07:49:27 | 0 ms | | 0.0 B / 0 | | 0.0 B / 0 | TaskKilled (killed intentionally) |

Web UI display after applying the patch:

| Index | ID | Attempt | Status | Locality Level | Executor ID / Host | Launch Time | Duration | GC Time | Input Size / Records | Write Time | Shuffle Write Size / Records | Errors |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 143 | 10 | 0 | KILLED | NODE_LOCAL | 6 / x.xx.x.x stdout stderr | 2017/01/25 07:49:27 | 0 ms | | 0.0 B / 0 | | 0.0 B / 0 | TaskKilled (killed intentionally) |
| 156 | 11 | 0 | KILLED | NODE_LOCAL | 5 / x.xx.x.x stdout stderr | 2017/01/25 07:49:27 | 0 ms | | 0.0 B / 0 | | 0.0 B / 0 | TaskKilled (killed intentionally) |


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-19377

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16725.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16725


commit 6206d109b646e55223a4b162a37e70f42f4570a1
Author: Devaraj K 
Date:   2017-01-28T05:53:21Z

[SPARK-19377] [WEBUI] [CORE] Killed tasks should have the status as KILLED







[GitHub] spark pull request #16705: [SPARK-19354] [Core] Killed tasks are getting mar...

2017-01-25 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/16705

[SPARK-19354] [Core] Killed tasks are getting marked as FAILED

## What changes were proposed in this pull request?

The exception that occurs during the kill is now caught and logged instead of being re-thrown, which previously caused the task to be marked as FAILED.
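
A hedged sketch of the general pattern described (the function names are placeholders, not the PR's code): catch and log the failure during the kill so the task keeps its KILLED state.

```
import scala.util.control.NonFatal

def killQuietly(taskId: Long, kill: Long => Unit, logError: String => Unit): Unit = {
  try {
    kill(taskId)
  } catch {
    case NonFatal(e) =>
      // Log and continue; re-throwing here would flip the task's final state
      // from KILLED to FAILED.
      logError(s"Error killing task $taskId: ${e.getMessage}")
  }
}
```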

## How was this patch tested?

I verified this manually by running multiple applications. With the patch, when an exception occurs during the kill, it is logged and the kill process continues; the task is shown as KILLED in the 'Details for Job' and 'Aggregated Metrics by Executor' sections of the Web UI.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-19354

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16705.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16705


commit d472245bb7392db4dc1b260eeafba1470448ef03
Author: Devaraj K 
Date:   2017-01-25T21:33:09Z

[SPARK-19354] [Core] Killed tasks are getting marked as FAILED







[GitHub] spark pull request #13077: [SPARK-10748] [Mesos] Log error instead of crashi...

2016-12-29 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/13077#discussion_r94205390
  
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -559,15 +560,29 @@ private[spark] class MesosClusterScheduler(
       } else {
         val offer = offerOption.get
         val queuedTasks = tasks.getOrElseUpdate(offer.offerId, new ArrayBuffer[TaskInfo])
-        val task = createTaskInfo(submission, offer)
-        queuedTasks += task
-        logTrace(s"Using offer ${offer.offerId.getValue} to launch driver " + submission.submissionId)
-        val newState = new MesosClusterSubmissionState(submission, task.getTaskId, offer.slaveId,
-          None, new Date(), None, getDriverFrameworkID(submission))
-        launchedDrivers(submission.submissionId) = newState
-        launchedDriversState.persist(submission.submissionId, newState)
-        afterLaunchCallback(submission.submissionId)
+        breakable {
--- End diff --

Here the catch block needs to continue the for loop with the next set of drivers. It cannot simply return on the exception, because the remaining candidates still need to be launched. I can take the other suggestion, i.e. moving the following code into the try clause; I will update the PR accordingly. Please let me know if that doesn't make sense.
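
To illustrate the control flow being discussed (a sketch under assumed names, not the actual diff): keeping the per-driver launch inside a try/catch within the loop lets one bad submission be marked failed while the remaining candidates are still launched.

```
import scala.util.control.NonFatal

// Hypothetical driver description; only what the sketch needs.
case class DriverDesc(submissionId: String)

def launchAll(candidates: Seq[DriverDesc],
              launch: DriverDesc => Unit,
              markFailed: (DriverDesc, Throwable) => Unit): Unit = {
  for (d <- candidates) {
    try {
      launch(d) // may throw for a misconfigured submission
    } catch {
      case NonFatal(e) =>
        markFailed(d, e)
        // fall through to the next candidate instead of returning from the method
    }
  }
}
```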





[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...

2016-10-10 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13077
  
@tnachen, sorry for the delay, I will update the patch. Thanks





[GitHub] spark issue #12753: [SPARK-3767] [CORE] Support wildcard in Spark properties

2016-08-17 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/12753
  
I will update this PR with the ConfigReader and reopen the jira.





[GitHub] spark issue #12753: [SPARK-3767] [CORE] Support wildcard in Spark properties

2016-08-17 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/12753
  
@vanzin, SPARK-3767 was resolved as 'Won't Fix' by @srowen. I was under the assumption that SPARK-16671 covers this as well.





[GitHub] spark issue #12753: [SPARK-3767] [CORE] Support wildcard in Spark properties

2016-08-04 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/12753
  
@vanzin Thanks for looking into this, I have resolved the conflicts.





[GitHub] spark issue #13143: [SPARK-15359] [Mesos] Mesos dispatcher should handle DRI...

2016-06-22 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13143
  
mesosDriver.run() doesn't throw any exception; it just returns with the value Status.DRIVER_ABORTED.

```
  registerLatch.await()

  // propagate any error to the calling thread. This ensures that SparkContext creation fails
  // without leaving a broken context that won't be able to schedule any tasks
  error.foreach(throw _)
```

This code handles errors and throws if it gets Status.DRIVER_ABORTED during registration, but once the registration completes there is no code that handles the status, so it is skipped and the thread simply dies.





[GitHub] spark issue #13077: [SPARK-10748] [Mesos] Log error instead of crashing Spar...

2016-06-22 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13077
  
Thanks @tnachen for looking into this, I will update this with the changes.





[GitHub] spark issue #11996: [SPARK-10530] [CORE] Kill other task attempts when one t...

2016-06-14 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/11996
  
@lw-lin I think it will release the resources and then throw TaskKilledException at [Executor.scala#L307](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L307). If you are facing the issue, please file a separate ticket with the details and we can discuss it there.





[GitHub] spark issue #13323: [SPARK-15555] [Mesos] Driver with --supervise option can...

2016-06-09 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13323
  
@tnachen Thanks for your review, I have added a test for this, can you have 
a look into it?





[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2016-06-06 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13326
  

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59989/testReport/

`org.apache.spark.scheduler.BlacklistIntegrationSuite.Bad node with 
multiple executors, job will still succeed with the right confs`

This test passes in my local environment and also doesn't seem to be related to this change.

@tnachen, can we retest it?





[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2016-06-04 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13326
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59989/
Test FAILed.





[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2016-06-04 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13326
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2016-06-04 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13326
  
**[Test build #59989 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59989/consoleFull)**
 for PR 13326 at commit 
[`7f4f34b`](https://github.com/apache/spark/commit/7f4f34b1dd8ec20297f1295610e11c8fed860652).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2016-06-04 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13326
  
**[Test build #59989 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59989/consoleFull)**
 for PR 13326 at commit 
[`7f4f34b`](https://github.com/apache/spark/commit/7f4f34b1dd8ec20297f1295610e11c8fed860652).





[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...

2016-06-04 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13326
  
ok to test





[GitHub] spark issue #13407: [SPARK-15665] [CORE] spark-submit --kill and --status ar...

2016-06-03 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13407
  
Thanks @vanzin for review and merging.





[GitHub] spark pull request #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers wa...

2016-06-03 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/13326#discussion_r65748851
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
@@ -188,10 +188,10 @@ private[spark] class MesosClusterScheduler(
         mesosDriver.killTask(task.taskId)
         k.success = true
         k.message = "Killing running driver"
-      } else if (removeFromQueuedDrivers(submissionId)) {
--- End diff --

Thanks @tnachen for looking into this, I see it is being used in other 
places.

```
  queuedDrivers
.filter(d => launchedDrivers.contains(d.submissionId))
.foreach(d => removeFromQueuedDrivers(d.submissionId))

```

```
  // Then we walk through the queued drivers and try to schedule them.
  scheduleTasks(
copyBuffer(queuedDrivers),
removeFromQueuedDrivers,
currentOffers,
tasks)
```





[GitHub] spark issue #13407: [SPARK-15665] [CORE] spark-submit --kill and --status ar...

2016-06-02 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/13407
  
Thanks @vanzin and @andrewor14 for looking into this, sorry for the delay.

> If SparkSubmit can still process --kill and --status with those, then that's fine too (just use SparkLauncher.NO_RESOURCE).

I tried this but it doesn't work; it fails with the error below:
```
[devaraj@server2 spark-master]$ ./bin/spark-submit --kill driver-20160531171222-
Error: Cannot load main class from JAR spark-internal with URI null. Please specify a class through --class.
Run with --help for usage help or --verbose for debug output
```

I have renamed the printInfo flag to isAppResourceReq and used it for the kill and status cases as well.

Please review and let me know your feedback.





[GitHub] spark pull request: [SPARK-15665] [CORE] spark-submit --kill and -...

2016-05-31 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/13407

[SPARK-15665] [CORE] spark-submit --kill and --status are not working

## What changes were proposed in this pull request?
--kill and --status were not handled in OptionParser, which caused them to fail. They are now handled as part of OptionParser.handle.
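
For illustration, a hedged sketch of the kind of change described, handling the two options inside a handle(opt, value)-style callback; the enum and field names below are invented for the sketch, not SparkSubmit's actual ones.

```
object SubmitAction extends Enumeration { val SUBMIT, KILL, REQUEST_STATUS = Value }

class SubmitArgsSketch {
  var action: SubmitAction.Value = SubmitAction.SUBMIT
  var submissionToOperateOn: Option[String] = None

  // Returns true when the option is recognized and consumed.
  def handle(opt: String, value: String): Boolean = opt match {
    case "--kill" =>
      action = SubmitAction.KILL; submissionToOperateOn = Some(value); true
    case "--status" =>
      action = SubmitAction.REQUEST_STATUS; submissionToOperateOn = Some(value); true
    case _ => false
  }
}
```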


## How was this patch tested?

I have verified these manually by running --kill and --status commands.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-15665

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13407.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13407


commit 62ebc7a22fde3f0974f381a89c48c8e5d43a1ce4
Author: Devaraj K 
Date:   2016-05-31T09:38:50Z

[SPARK-15665] [CORE] spark-submit --kill and --status are not working







[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-30 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11996#issuecomment-222589679
  
Thanks @kayousterhout.





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-27 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11996#issuecomment-222104830
  
@kayousterhout, I have added inline comments and the build is also fine 
now, please have a look into it. Thanks





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-26 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11996#issuecomment-221968586
  
@kayousterhout Thanks a lot for your review and comments. I have fixed 
them, please have a look into this.





[GitHub] spark pull request: [SPARK-15560] [Mesos] Queued/Supervise drivers...

2016-05-26 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/13326

[SPARK-15560] [Mesos] Queued/Supervise drivers waiting for retry drivers 
disappear for kill command in Mesos mode

## What changes were proposed in this pull request?

With the patch, killed drivers are moved from the 'Queued Drivers'/'Supervise drivers waiting for retry' sections to the 'Finished Drivers' section.
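
A rough sketch of the behaviour described, with invented names (the real scheduler also tracks supervised retries and persists state, which is omitted here): a killed queued driver is recorded as finished instead of vanishing from the UI.

```
case class Submission(submissionId: String)

class DispatcherStateSketch {
  val queuedDrivers = scala.collection.mutable.ArrayBuffer[Submission]()
  val finishedDrivers = scala.collection.mutable.ArrayBuffer[Submission]()

  def kill(submissionId: String): Boolean = {
    val idx = queuedDrivers.indexWhere(_.submissionId == submissionId)
    if (idx >= 0) {
      // Move the killed driver to the finished list so it still shows up in the UI.
      finishedDrivers += queuedDrivers.remove(idx)
      true
    } else {
      false
    }
  }
}
```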


## How was this patch tested?
I have verified it manually by checking the Mesos Dispatcher UI while 
simulating this scenario.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-15560

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13326.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13326


commit 7f4f34b1dd8ec20297f1295610e11c8fed860652
Author: Devaraj K 
Date:   2016-05-26T10:12:34Z

[SPARK-15560] [Mesos] Queued/Supervise drivers waiting for retry drivers
disappear for kill command in Mesos mode







[GitHub] spark pull request: [SPARK-15555] [Mesos] Driver with --supervise ...

2016-05-26 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/13323

[SPARK-15555] [Mesos] Driver with --supervise option cannot be killed in Mesos mode

## What changes were proposed in this pull request?
Killed applications are no longer added for retry.
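
A one-line sketch of the rule described above (the flag names are assumptions about a finished driver's state): a supervised driver is only re-queued for retry when it was not explicitly killed.

```
// Sketch only: `supervise`, `wasKilled`, and `exitedCleanly` are assumed flags.
def shouldRetry(supervise: Boolean, wasKilled: Boolean, exitedCleanly: Boolean): Boolean =
  supervise && !wasKilled && !exitedCleanly
```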


## How was this patch tested?

I have verified this manually in the Mesos cluster; with the changes, killed applications move to the Finished Drivers section and are not retried.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-15555

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13323.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13323


commit 2e5664c63416fed1a9954fd2ed6c71773eed34ed
Author: Devaraj K 
Date:   2016-05-26T07:48:00Z

[SPARK-15555] [Mesos] Driver with --supervise option cannot be killed in Mesos mode







[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-25 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11996#issuecomment-221531103
  
@kayousterhout, can you have a look into this?





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-18 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11996#discussion_r63738214
  
--- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -789,6 +791,51 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logg
     assert(TaskLocation("executor_host1_3") === ExecutorCacheTaskLocation("host1", "3"))
   }
 
+  test("Kill other task attempts when one attempt belonging to the same task succeeds") {
+    sc = new SparkContext("local", "test")
+    val sched = new FakeTaskScheduler(sc, ("exec1", "host1"), ("exec2", "host2"))
+    val taskSet = FakeTask.createTaskSet(4)
+    val manager = new TaskSetManager(sched, taskSet, MAX_TASK_FAILURES)
+    val accumUpdatesByTask: Array[Seq[AccumulableInfo]] = taskSet.tasks.map { task =>
+      task.initialAccumulators.map { a => a.toInfo(Some(0L), None) }
+    }
+    // Offer resources for 4 tasks to start
+    for ((k, v) <- List(
+        "exec1" -> "host1",
+        "exec1" -> "host1",
+        "exec2" -> "host2",
+        "exec2" -> "host2")) {
+      val taskOption = manager.resourceOffer(k, v, NO_PREF)
+      assert(taskOption.isDefined)
+      val task = taskOption.get
+      assert(task.executorId === k)
+    }
+    assert(sched.startedTasks.toSet === Set(0, 1, 2, 3))
+    // Complete the 3 tasks and leave 1 task in running
+    for (id <- Set(0, 1, 2)) {
+      manager.handleSuccessfulTask(id, createTaskResult(id, accumUpdatesByTask(id)))
+      assert(sched.endedTasks(id) === Success)
+    }
+
+    // Wait for the threshold time to start speculative attempt for the running task
+    Thread.sleep(100)
--- End diff --

I feel that adding an argument to **checkSpeculatableTasks()** would require changing the signature of the method in the Schedulable interface and, correspondingly, all of its implementations. Instead, I am thinking of moving the code in **TaskSetManager.checkSpeculatableTasks()** into another method that takes an argument (i.e. minTimeToSpeculation: Int), so the same method can be used in the test. Please give your opinion on this.
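
A sketch of the proposed refactor with simplified names (the default value and class name are assumptions): keep the no-arg entry point required by the Schedulable interface and move the logic into an overload that takes the threshold, so a test can pass a tiny value instead of sleeping.

```
class SpeculationSketch {
  private val defaultMinTimeToSpeculation = 100 // ms, illustrative default

  def checkSpeculatableTasks(): Boolean =
    checkSpeculatableTasks(defaultMinTimeToSpeculation)

  def checkSpeculatableTasks(minTimeToSpeculation: Int): Boolean = {
    // ... decide whether any running task has exceeded minTimeToSpeculation ...
    false
  }
}
```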





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-18 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11996#discussion_r63736986
  
--- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -789,6 +791,51 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logg
     assert(TaskLocation("executor_host1_3") === ExecutorCacheTaskLocation("host1", "3"))
   }
 
+  test("Kill other task attempts when one attempt belonging to the same task succeeds") {
+    sc = new SparkContext("local", "test")
+    val sched = new FakeTaskScheduler(sc, ("exec1", "host1"), ("exec2", "host2"))
+    val taskSet = FakeTask.createTaskSet(4)
+    val manager = new TaskSetManager(sched, taskSet, MAX_TASK_FAILURES)
+    val accumUpdatesByTask: Array[Seq[AccumulableInfo]] = taskSet.tasks.map { task =>
+      task.initialAccumulators.map { a => a.toInfo(Some(0L), None) }
+    }
+    // Offer resources for 4 tasks to start
+    for ((k, v) <- List(
+        "exec1" -> "host1",
+        "exec1" -> "host1",
+        "exec2" -> "host2",
+        "exec2" -> "host2")) {
+      val taskOption = manager.resourceOffer(k, v, NO_PREF)
+      assert(taskOption.isDefined)
+      val task = taskOption.get
+      assert(task.executorId === k)
+    }
+    assert(sched.startedTasks.toSet === Set(0, 1, 2, 3))
+    // Complete the 3 tasks and leave 1 task in running
+    for (id <- Set(0, 1, 2)) {
+      manager.handleSuccessfulTask(id, createTaskResult(id, accumUpdatesByTask(id)))
+      assert(sched.endedTasks(id) === Success)
+    }
+
+    // Wait for the threshold time to start speculative attempt for the running task
+    Thread.sleep(100)
+    val speculation = manager.checkSpeculatableTasks
+    assert(speculation === true)
+    // Offer resource to start the speculative attempt for the running task
+    val taskOption5 = manager.resourceOffer("exec1", "host1", NO_PREF)
+    assert(taskOption5.isDefined)
+    val task5 = taskOption5.get
+    assert(task5.taskId === 4)
+    assert(task5.executorId === "exec1")
+    assert(task5.attemptNumber === 1)
+    sched.backend = mock(classOf[SchedulerBackend])
+    // Complete the speculative attempt for the running task
+    manager.handleSuccessfulTask(4, createTaskResult(3, accumUpdatesByTask(3)))
+    assert(sched.endedTasks(3) === Success)
--- End diff --

Here **sched.backend** is **mock(classOf[SchedulerBackend])**, and as part of **manager.handleSuccessfulTask()** it issues **sched.backend.killTask()** for the other attempts. Since it is a mock invocation, this only ensures that the kill of the other attempts is actually invoked. I have added the same in the comment.
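
For clarity, a hedged sketch of how such a mock could be asserted on explicitly with Mockito's `verify` (the trait below is an assumed stand-in, the argument values are placeholders, and the matcher import path depends on the Mockito version):

```
import org.mockito.Mockito.{mock, verify}
import org.mockito.Matchers.{anyBoolean, anyLong, anyString} // ArgumentMatchers in newer Mockito

// Assumed backend interface; Spark's SchedulerBackend exposes a similar killTask operation.
trait BackendSketch {
  def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit
}

object VerifyKillSketch {
  def main(args: Array[String]): Unit = {
    val backend = mock(classOf[BackendSketch])
    // ... the code under test would call backend.killTask(...) for the other attempt ...
    backend.killTask(4L, "exec2", true)
    // Explicit assertion that the kill for the other attempt was actually issued.
    verify(backend).killTask(anyLong(), anyString(), anyBoolean())
  }
}
```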





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-18 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11996#issuecomment-220082195
  
Thanks a lot @kayousterhout for the review.





[GitHub] spark pull request: [SPARK-15359] [Mesos] Mesos dispatcher should ...

2016-05-17 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/13143

[SPARK-15359] [Mesos] Mesos dispatcher should handle DRIVER_ABORTED status 
from mesosDriver.run()

## What changes were proposed in this pull request?

When mesosDriver.run() returns with the status DRIVER_ABORTED, an exception is now thrown, which can be handled by SparkUncaughtExceptionHandler to shut down the dispatcher.
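
A self-contained sketch of the shape of this check (the enum below is a stand-in for the Mesos status type and the exception choice is an assumption, not the PR's code):

```
// Stand-in for the Mesos driver status, to keep the sketch self-contained.
object MesosStatus extends Enumeration { val DRIVER_ABORTED, DRIVER_STOPPED = Value }

def runSchedulerDriver(run: () => MesosStatus.Value): Unit = {
  val ret = run()
  if (ret == MesosStatus.DRIVER_ABORTED) {
    // Surface the aborted status instead of letting the thread exit silently,
    // so an uncaught-exception handler can shut the dispatcher down.
    throw new IllegalStateException(s"Mesos scheduler driver aborted with status $ret")
  }
}
```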


## How was this patch tested?

I verified it manually; the driver thread throws an exception when mesosDriver.run() returns with the DRIVER_ABORTED status.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-15359

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13143


commit c16fb5ff62a943d2c17524f6e8a328acfc8dfd82
Author: Devaraj K 
Date:   2016-05-17T08:32:13Z

[SPARK-15359] [Mesos] Mesos dispatcher should handle DRIVER_ABORTED status
from mesosDriver.run()







[GitHub] spark pull request: [SPARK-10748] [Mesos] Log error instead of cra...

2016-05-12 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/13077

[SPARK-10748] [Mesos] Log error instead of crashing Spark Mesos dispatcher 
when a job is misconfigured

## What changes were proposed in this pull request?

The Spark exception thrown for an invalid job configuration is now handled: the job is marked as failed and the dispatcher continues launching the other drivers instead of propagating the exception.

## How was this patch tested?

I verified this manually; misconfigured jobs now move to the Finished Drivers section in the UI and the other jobs continue to launch.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-10748

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13077.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13077


commit ae85f8154e506c223a018d9c04c58967a82fa580
Author: Devaraj K 
Date:   2016-05-12T10:10:50Z

[SPARK-10748] [Mesos] Log error instead of crashing Spark Mesos dispatcher
when a job is misconfigured







[GitHub] spark pull request: [SPARK-15288] [SQL] Support old table schema c...

2016-05-12 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/13073#issuecomment-218703662
  
@clockfly, it seems the JIRA number mentioned in the title is wrong; I think it should be SPARK-15253.





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-05-11 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11996#issuecomment-218671842
  
@kayousterhout, @markhamstra, any comments, please?





[GitHub] spark pull request: [SPARK-15288] [Mesos] Mesos dispatcher should ...

2016-05-11 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/13072

[SPARK-15288] [Mesos] Mesos dispatcher should handle gracefully when any 
thread gets UncaughtException

## What changes were proposed in this pull request?

Adding the default UncaughtExceptionHandler to the MesosClusterDispatcher.


## How was this patch tested?
I verified it manually; when any of the dispatcher threads gets an uncaught exception, the default UncaughtExceptionHandler handles it.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-15288

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13072.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13072


commit 27de4fb65b7400d7d2e76843cb9eb1c55c9d69d4
Author: Devaraj K 
Date:   2016-05-12T06:25:38Z

[SPARK-15288] [Mesos] Mesos dispatcher should handle gracefully when any
thread gets UncaughtException







[GitHub] spark pull request: [SPARK-14234] [CORE] Executor crashes for Task...

2016-05-04 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12031#issuecomment-216774818
  
Thanks a lot @zsxwing for pushing this.





[GitHub] spark pull request: [SPARK-3767] [CORE] Support wildcard in Spark ...

2016-05-03 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12753#issuecomment-216760992
  
@rxin, please have a look at this and let me know if anything else needs to be done here. Regarding @: MapReduce also uses @ for the task-id wildcard in java opts, and there is no problem with @ on Windows or in other places.





[GitHub] spark pull request: [SPARK-3767] [CORE] Support wildcard in Spark ...

2016-05-03 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/12753#discussion_r61995355
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala ---
@@ -166,14 +166,15 @@ private[spark] class CoarseMesosSchedulerBackend(
       environment.addVariables(
         Environment.Variable.newBuilder().setName("SPARK_CLASSPATH").setValue(cp).build())
     }
-    val extraJavaOpts = conf.get("spark.executor.extraJavaOptions", "")
+    var extraJavaOpts = conf.get("spark.executor.extraJavaOptions", "")
--- End diff --

Thanks @BryanCutler for the suggestion, I have addressed it in the latest update.





[GitHub] spark pull request: [SPARK-1989] [CORE] Exit executors faster if t...

2016-04-29 Thread devaraj-kavali
Github user devaraj-kavali closed the pull request at:

https://github.com/apache/spark/pull/12571





[GitHub] spark pull request: [SPARK-3767] [CORE] Support wildcard in Spark ...

2016-04-28 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12753#issuecomment-215619844
  
Thanks @rxin for checking this; I don't think @ is used anywhere else. Here we only replace @execid@ in the 'spark.executor.extraJavaOptions' value and leave any other @ symbols as they are, so I don't think this causes any problem.





[GitHub] spark pull request: [SPARK-3767] [CORE] Support wildcard in Spark ...

2016-04-28 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/12753

[SPARK-3767] [CORE] Support wildcard in Spark properties

## What changes were proposed in this pull request?

Added a provision to specify the 'spark.executor.extraJavaOptions' value in terms of the Executor Id (i.e. @execid@). @execid@ is replaced with the Executor Id when starting the executor.
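
A minimal sketch of the wildcard substitution described above; only the replacement step is shown, and the surrounding option plumbing is omitted.

```
def expandExecutorOpts(extraJavaOpts: String, executorId: String): String =
  extraJavaOpts.replace("@execid@", executorId)

// e.g. "-Xloggc:/tmp/gc-@execid@.log" becomes "-Xloggc:/tmp/gc-7.log" for executor 7.
```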

## How was this patch tested?

I have verified this by checking the executor process command and GC logs. I verified the same in different deployment modes (Standalone, YARN, Mesos).




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-3767

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12753.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12753


commit fc3bbd0d72d9319885b877490f57ed4f1b870fa2
Author: Devaraj K 
Date:   2016-04-28T10:46:51Z

[SPARK-3767] [CORE] Support wildcard in Spark properties







[GitHub] spark pull request: [SPARK-13965] [CORE] TaskSetManager should kil...

2016-04-27 Thread devaraj-kavali
Github user devaraj-kavali closed the pull request at:

https://github.com/apache/spark/pull/11778





[GitHub] spark pull request: [SPARK-1989] [CORE] Exit executors faster if t...

2016-04-26 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/12571#discussion_r61209327
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
 ---
@@ -66,12 +66,20 @@ private[spark] class SparkDeploySchedulerBackend(
   "--cores", "{{CORES}}",
   "--app-id", "{{APP_ID}}",
   "--worker-url", "{{WORKER_URL}}")
-val extraJavaOpts = 
sc.conf.getOption("spark.executor.extraJavaOptions")
+var extraJavaOpts = 
sc.conf.getOption("spark.executor.extraJavaOptions")
   .map(Utils.splitCommandString).getOrElse(Seq.empty)
 val classPathEntries = 
sc.conf.getOption("spark.executor.extraClassPath")
   .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
 val libraryPathEntries = 
sc.conf.getOption("spark.executor.extraLibraryPath")
   .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
+// Add GC Limit options if they are not present
+val extraJavaOptsAsStr = extraJavaOpts.mkString(" ")
+if (!extraJavaOptsAsStr.contains("-XX:GCTimeLimit")) {
+  extraJavaOpts :+= Utils.getGCTimeLimitOption
--- End diff --

Thanks @vanzin for your comments, I will update with the comments fix and 
also will verify with non-Oracle JVMs.





[GitHub] spark pull request: [SPARK-1989] [CORE] Exit executors faster if t...

2016-04-26 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12571#issuecomment-214982150
  
Thanks @tgravescs for the comment; users can still specify these GC params
as part of the java opts. Only when the user doesn't specify these GC params
do we add them with default values for the executors, instead of relying on
the JVM default values.





[GitHub] spark pull request: [SPARK-1989] [CORE] Exit executors faster if t...

2016-04-22 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12571#issuecomment-213522339
  
@srowen I have made the changes, please have a look into this. Thanks





[GitHub] spark pull request: [SPARK-1989] [CORE] Exit executors faster if t...

2016-04-21 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12571#issuecomment-212841070
  
Thanks @srowen for checking this immediately, I will make the changes as 
per your explanation.





[GitHub] spark pull request: [SPARK-14234] [CORE] Executor crashes for Task...

2016-04-21 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12031#issuecomment-212826864
  
ping @andrewor14, @zsxwing 





[GitHub] spark pull request: [SPARK-1989] [CORE] Exit executors faster if t...

2016-04-21 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/12571

[SPARK-1989] [CORE] Exit executors faster if they get into a cycle of heavy 
GC

## What changes were proposed in this pull request?

Added the spark.executor.gcTimeLimit config for supplying the value of the GC
option -XX:GCTimeLimit and the spark.executor.gcHeapFreeLimit config for
supplying the value of the GC option -XX:GCHeapFreeLimit. The GC time limit
and heap free limit options now need to be set using these configs and are
not allowed as part of spark.executor.extraJavaOptions.
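
A minimal sketch of how the proposed configs might be set (the values shown
are just the usual JVM defaults, for illustration):

```scala
import org.apache.spark.SparkConf

// Sketch only: config names as proposed in this PR; values are illustrative.
// They would end up on the executor command line as
// -XX:GCTimeLimit=98 -XX:GCHeapFreeLimit=2.
val conf = new SparkConf()
  .set("spark.executor.gcTimeLimit", "98")     // % of total time spent in GC
  .set("spark.executor.gcHeapFreeLimit", "2")  // % of heap freed by a GC
```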


## How was this patch tested?

I have verified this by checking the executor process command when I ran
different Spark applications. I verified the same in different deployment
modes (Standalone, YARN, Mesos).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-1989

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12571.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12571


commit 4f715623487f95a71b6c38c2e50c5e1b6ec7a1b3
Author: Devaraj K 
Date:   2016-04-21T09:15:07Z

[SPARK-1989] [CORE] Exit executors faster if they get into a cycle of
heavy GC







[GitHub] spark pull request: [SPARK-14234] [CORE] Executor crashes for Task...

2016-04-11 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12031#issuecomment-208725033
  
@andrewor14, can you have a look into this when you find some time? Thanks





[GitHub] spark pull request: [SPARK-14234] [CORE] Executor crashes for Task...

2016-04-06 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12031#issuecomment-206215752
  
Thanks @zsxwing for your comments. I have addressed them; please have a
look into this.





[GitHub] spark pull request: [SPARK-13063] [YARN] Make the SPARK YARN STAGI...

2016-04-05 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/12082#discussion_r58590060
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1444,4 +1444,19 @@ object Client extends Logging {
 uri.startsWith(s"$LOCAL_SCHEME:")
   }
 
+  /**
+   *  Returns the app staging dir.
+   */
+  private def getAppStagingDirPath(
+  conf: SparkConf,
+  fs: FileSystem,
+  appStagingDir: String): Path = {
+val stagingRootDir = conf.get(STAGING_DIR).orNull
--- End diff --

Thanks @tgravescs for the suggestion. I have addressed it in the latest.





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-04-05 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11996#discussion_r58563479
  
--- Diff: 
core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -789,6 +791,51 @@ class TaskSetManagerSuite extends SparkFunSuite with 
LocalSparkContext with Logg
 assert(TaskLocation("executor_host1_3") === 
ExecutorCacheTaskLocation("host1", "3"))
   }
 
+  test("Kill other task attempts when one attempt belonging to the same 
task succeeds") {
+sc = new SparkContext("local", "test")
+val sched = new FakeTaskScheduler(sc, ("exec1", "host1"), ("exec2", 
"host2"))
+val taskSet = FakeTask.createTaskSet(4)
+val manager = new TaskSetManager(sched, taskSet, MAX_TASK_FAILURES)
+val accumUpdatesByTask: Array[Seq[AccumulableInfo]] = 
taskSet.tasks.map { task =>
+  task.initialAccumulators.map { a => a.toInfo(Some(0L), None) }
+}
+// Offer resources for 4 tasks to start
+for ((k, v) <- List(
+"exec1" -> "host1",
+"exec1" -> "host1",
+"exec2" -> "host2",
+"exec2" -> "host2")) {
+  val taskOption = manager.resourceOffer(k, v, NO_PREF)
+  assert(taskOption.isDefined)
+  val task = taskOption.get
+  assert(task.executorId === k)
+}
+assert(sched.startedTasks.toSet === Set(0, 1, 2, 3))
+// Complete the 3 tasks and leave 1 task in running
+for (id <- Set(0, 1, 2)) {
+  manager.handleSuccessfulTask(id, createTaskResult(id, 
accumUpdatesByTask(id)))
+  assert(sched.endedTasks(id) === Success)
+}
+
+// Wait for the threshold time to start speculative attempt for the 
running task
+Thread.sleep(100)
--- End diff --

Thanks @tgravescs for your quick response.

Here Thread.sleep(100) is to match the threshold value used in
TaskSetManager.checkSpeculatableTasks(): it is the minimum time a task needs
to run before it becomes eligible for a speculative attempt to be launched. I
don't see any way to change this default value.

> val medianDuration = durations(min((0.5 * tasksSuccessful).round.toInt, durations.length - 1))
> val threshold = max(SPECULATION_MULTIPLIER * medianDuration, 100)

I don't think this threshold value is related to the config
'spark.speculation.interval' here.
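
As a small standalone illustration of that computation (the durations and the
1.5 multiplier are assumed values, not taken from the test):

```scala
import scala.math.{max, min}

// Standalone sketch of the quoted threshold computation; the durations and
// the 1.5 multiplier (the spark.speculation.multiplier default) are assumed.
val SPECULATION_MULTIPLIER = 1.5
val durations = Array(40L, 60L, 80L).sorted   // successful task durations in ms
val tasksSuccessful = durations.length
val medianDuration =
  durations(min((0.5 * tasksSuccessful).round.toInt, durations.length - 1))
val threshold = max(SPECULATION_MULTIPLIER * medianDuration, 100)
// threshold == 120.0 here; with very short tasks the max(..., 100) clamp wins,
// which is why the running task has to live at least ~100 ms in the test.
```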






[GitHub] spark pull request: [SPARK-14234] [CORE] Executor crashes for Task...

2016-04-04 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/12031#discussion_r58342616
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -319,10 +319,14 @@ private[spark] class Executor(
 
 case _: TaskKilledException | _: InterruptedException if 
task.killed =>
   logInfo(s"Executor killed $taskName (TID $taskId)")
+  // Reset the interrupted status of the thread to update the 
status
+  Thread.interrupted()
   execBackend.statusUpdate(taskId, TaskState.KILLED, 
ser.serialize(TaskKilled))
--- End diff --

Thanks @zsxwing for looking into the patch.

> What will happen if the thread is interrupted when 
execBackend.statusUpdate is running? I think the executor will still crash.


I do think this is a problem. I have handled it in the latest update; can
you look into the changes?
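
Roughly, the idea can be sketched like this (not the actual Executor code,
just the shape of the change):

```scala
// Sketch only: clear the pending interrupt before the final status update so
// the update itself cannot be broken by it, and clear a late interrupt too.
def reportTaskKilled(taskId: Long)(sendUpdate: Long => Unit): Unit = {
  Thread.interrupted()        // resets (and returns) this thread's interrupt flag
  try {
    sendUpdate(taskId)        // e.g. tell the backend the task state is KILLED
  } catch {
    case _: InterruptedException =>
      Thread.interrupted()    // interrupted again mid-update; clear it and move on
  }
}
```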





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-04-01 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11996#issuecomment-204313542
  
Thanks @tgravescs for checking this, I will add tests for these changes.





[GitHub] spark pull request: [SPARK-13063] [YARN] Make the SPARK YARN STAGI...

2016-04-01 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/12082#discussion_r58177914
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1444,4 +1444,19 @@ object Client extends Logging {
 uri.startsWith(s"$LOCAL_SCHEME:")
   }
 
+  /**
+   *  Returns the app staging dir.
+   */
+  private def getAppStagingDirPath(
+  conf: SparkConf,
+  fs: FileSystem,
+  appStagingDir: String): Path = {
+val stagingRootDir = conf.get(STAGING_DIR).orNull
--- End diff --

`conf.get(STAGING_DIR).orElse(fs.getHomeDirectory)` gives a type mismatch
compilation error since the return type of `fs.getHomeDirectory` is Path
while `conf.get(STAGING_DIR)` expects a String.
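
A sketch of one way around that mismatch (using the 'spark.yarn.staging-dir'
name from this PR instead of the STAGING_DIR config entry):

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.SparkConf

// Sketch: convert the configured staging dir (a String, if any) into a Path
// and only fall back to the file system home directory when it is absent.
def appStagingBaseDir(conf: SparkConf, fs: FileSystem): Path =
  conf.getOption("spark.yarn.staging-dir")
    .map(new Path(_))
    .getOrElse(fs.getHomeDirectory)
```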





[GitHub] spark pull request: [SPARK-13063] [YARN] Make the SPARK YARN STAGI...

2016-04-01 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/12082#issuecomment-204309628
  
Thanks @tgravescs for looking into the patch.





[GitHub] spark pull request: [SPARK-13063] [YARN] Make the SPARK YARN STAGI...

2016-03-31 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/12082

[SPARK-13063] [YARN] Make the SPARK YARN STAGING DIR as configurable

## What changes were proposed in this pull request?
Made the SPARK YARN STAGING DIR configurable via the configuration
'spark.yarn.staging-dir'.

## How was this patch tested?

I have verified it manually by running applications on YARN. If
'spark.yarn.staging-dir' is configured then its value is used as the staging
directory; otherwise the default value, i.e. the file system's home directory
for the user, is used.
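
For example (the HDFS path is made up), the staging directory could be
pointed at a shared location like this:

```scala
import org.apache.spark.SparkConf

// Hypothetical usage of the new config; the HDFS path is only an example.
val conf = new SparkConf()
  .set("spark.yarn.staging-dir", "hdfs:///tmp/spark-staging")
```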




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-13063

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12082.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12082


commit c3f02fdbdeb9c9dbe3d2a7361414005eed987509
Author: Devaraj K 
Date:   2016-03-31T09:41:22Z

[SPARK-13063] [YARN] Make the SPARK YARN STAGING DIR as configurable







[GitHub] spark pull request: [SPARK-14234] [CORE] Executor crashes for Task...

2016-03-29 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/12031

[SPARK-14234] [CORE] Executor crashes for TaskRunner thread interruption

## What changes were proposed in this pull request?
Resetting the task interruption status before updating the task status.

## How was this patch tested?
I have verified it manually by running multiple applications; with the patch
changes the Executor doesn't crash and updates the status to the driver
without any exceptions.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-14234

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12031.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12031


commit 94b31fa52ce283c2c3de838bfd32dd2cc918c50d
Author: Devaraj K 
Date:   2016-03-29T08:23:59Z

[SPARK-14234] [CORE] Executor crashes for TaskRunner thread interruption







[GitHub] spark pull request: [SPARK-13343] [CORE] speculative tasks that di...

2016-03-28 Thread devaraj-kavali
Github user devaraj-kavali closed the pull request at:

https://github.com/apache/spark/pull/11916





[GitHub] spark pull request: [SPARK-13343] [CORE] speculative tasks that di...

2016-03-28 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11916#issuecomment-202318900
  
I have moved these changes to the PR 
https://github.com/apache/spark/pull/11996 for SPARK-10530. @tgravescs, please 
have a look into https://github.com/apache/spark/pull/11996 when you have some 
time. Thanks





[GitHub] spark pull request: [SPARK-10530] [CORE] Kill other task attempts ...

2016-03-28 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/11996

[SPARK-10530] [CORE] Kill other task attempts when one taskattempt 
belonging the same task is succeeded in speculation

## What changes were proposed in this pull request?

With this patch, TaskSetManager kills the other running attempts when any one
attempt of the same task succeeds. Killed tasks are no longer counted as
failed tasks; they are listed separately in the UI and their task state is
shown as KILLED instead of FAILED.


## How was this patch tested?

core\src\test\scala\org\apache\spark\ui\jobs\JobProgressListenerSuite.scala
core\src\test\scala\org\apache\spark\util\JsonProtocolSuite.scala


I have verified this patch manually by enabling spark.speculation: when any
attempt succeeds, the other running attempts for the same task get killed and
other pending tasks get assigned in their place. Also, when an attempt gets
killed it is counted as a KILLED task and not as a FAILED task. Please find
the attached screenshots for reference.


![stage-tasks-table](https://cloud.githubusercontent.com/assets/3174804/14075132/394c6a12-f4f4-11e5-8638-20ff7b8cc9bc.png)

![stages-table](https://cloud.githubusercontent.com/assets/3174804/14075134/3b60f412-f4f4-11e5-9ea6-dd0dcc86eb03.png)


Ref : https://github.com/apache/spark/pull/11916

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-10530

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11996.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11996


commit 1a9e36516e9016f43a605abce0ee49e1262363a6
Author: Devaraj K 
Date:   2016-03-28T09:03:07Z

[SPARK-10530] [CORE] Kill other task attempts when one taskattempt
belonging the same task is succeeded in speculation







[GitHub] spark pull request: [SPARK-13343] [CORE] speculative tasks that di...

2016-03-25 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11916#discussion_r57459040
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -620,6 +620,14 @@ private[spark] class TaskSetManager(
 // Note: "result.value()" only deserializes the value when it's called 
at the first time, so
 // here "result.value()" just returns the value and won't block other 
threads.
 sched.dagScheduler.taskEnded(tasks(index), Success, result.value(), 
result.accumUpdates, info)
+// Kill other task attempts if any as the one attempt succeeded
+for (attemptInfo <- taskAttempts(index) if attemptInfo.attemptNumber 
!= info.attemptNumber
--- End diff --

Thanks @tgravescs.

I would be happy to fix the issue of more than one attempt succeeding, as you
explained, as part of this PR, but I think it would be better to handle it
separately without mixing it with the current PR changes.

I will move the current changes to a PR for
[SPARK-10530](https://issues.apache.org/jira/browse/SPARK-10530) and we can
continue to fix the multiple-attempt-success issue as part of
[SPARK-13343](https://issues.apache.org/jira/browse/SPARK-13343).
Please let me know if that doesn't make sense.





[GitHub] spark pull request: [SPARK-13343] [CORE] speculative tasks that di...

2016-03-24 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11916#discussion_r57349258
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -620,6 +620,14 @@ private[spark] class TaskSetManager(
 // Note: "result.value()" only deserializes the value when it's called 
at the first time, so
 // here "result.value()" just returns the value and won't block other 
threads.
 sched.dagScheduler.taskEnded(tasks(index), Success, result.value(), 
result.accumUpdates, info)
+// Kill other task attempts if any as the one attempt succeeded
+for (attemptInfo <- taskAttempts(index) if attemptInfo.attemptNumber 
!= info.attemptNumber
--- End diff --

I can see that during the map phase (which doesn't write to Hadoop) there is
a chance of two attempts succeeding, as you explained. But for tasks in the
final phase (which write to Hadoop), if two attempts try to rename
taskAttemptPath to committedTaskPath during commitTask(), only one attempt
would succeed and the other would fail with a rename failure.





[GitHub] spark pull request: [SPARK-13343] [CORE] speculative tasks that di...

2016-03-24 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11916#discussion_r57340394
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -620,6 +620,14 @@ private[spark] class TaskSetManager(
 // Note: "result.value()" only deserializes the value when it's called 
at the first time, so
 // here "result.value()" just returns the value and won't block other 
threads.
 sched.dagScheduler.taskEnded(tasks(index), Success, result.value(), 
result.accumUpdates, info)
+// Kill other task attempts if any as the one attempt succeeded
+for (attemptInfo <- taskAttempts(index) if attemptInfo.attemptNumber 
!= info.attemptNumber
--- End diff --

Thanks @tgravescs for the comment.

If any one attempt has actually completed (succeeded) but has not yet reached
the success event here, and during that time another attempt tries to commit
the output, then SparkHadoopMapRedUtil.commitTask would prevent it from doing
so. The other case is that a task attempt completes in the Executor before
getting the kill signal from TaskSetManager.handleSuccessfulTask; then the
Executor ignores the kill request and there is no problem. I don't see a case
where two attempts can both become successful when the task attempts use the
commit coordination; please help me understand if there are any.

The major issue here is that other task attempts keep running and do not
release the executor threads even when an attempt for the same task has
already succeeded; sometimes these unnecessary task attempts keep running
until job/application completion (if the worker nodes running them are very
slow), which makes the application performance worse.






[GitHub] spark pull request: [SPARK-13343] [CORE] speculative tasks that di...

2016-03-24 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11916#issuecomment-200740334
  
Thanks @rxin and @andrewor14 for looking into the patch.

These failed tests in the latest build are not related to this patch and 
they have been failing in the previous builds as well.





[GitHub] spark pull request: [SPARK-13343] [CORE] speculative tasks that di...

2016-03-23 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/11916

[SPARK-13343] [CORE] speculative tasks that didn't commit shouldn't be 
marked as success

## What changes were proposed in this pull request?

With this patch, killed tasks are no longer considered failed tasks; they are
listed separately in the UI and their task state is shown as KILLED instead
of FAILED.

## How was this patch tested?

I have verified this patch manually: when any attempt gets killed it is
counted as a KILLED task and not as a FAILED task. Please find the attached
screenshots for reference.
[SPARK-13965](https://issues.apache.org/jira/browse/SPARK-13965)/https://github.com/apache/spark/pull/11778
kills the running task attempts immediately when any one attempt of the task
succeeds, and this patch will consider and show those attempts as KILLED.

![stage-tasks-table](https://cloud.githubusercontent.com/assets/3174804/13984882/1e8deb66-f11f-11e5-9a89-e571dc5f1eef.png)

![stages-table](https://cloud.githubusercontent.com/assets/3174804/13984881/1e8d8216-f11f-11e5-9d29-22a7aca94938.png)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-13343

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11916.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11916


commit 7c033b6d6dd7eb1d9296d82a965facec95dd6757
Author: Devaraj K 
Date:   2016-03-23T12:11:30Z

[SPARK-13343] [CORE] speculative tasks that didn't commit shouldn't be
marked as success







[GitHub] spark pull request: [SPARK-913] [CORE] log the size of each shuffl...

2016-03-19 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/11819

[SPARK-913] [CORE] log the size of each shuffle block in block manager

## What changes were proposed in this pull request?
Added a log message which shows the size of the block.


## How was this patch tested?
Verified manually that this log message appears in the executor log.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-913

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11819.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11819


commit e738937d557230e5dacee7f7f913e37e54255a8e
Author: Devaraj K 
Date:   2016-03-18T10:17:09Z

[SPARK-913] [CORE] log the size of each shuffle block in block manager







[GitHub] spark pull request: [SPARK-13965] [CORE] Driver should kill the ot...

2016-03-18 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/11778

[SPARK-13965] [CORE] Driver should kill the other running task attempts if 
any one task attempt succeeds for the same task

## What changes were proposed in this pull request?

core\src\main\scala\org\apache\spark\scheduler\TaskSetManager.scala

TaskSetManager kills the other running attempts when any one attempt of the
same task succeeds.

## How was this patch tested?

I have verified this patch manually by enabling spark.speculation: when any
attempt succeeds, the other running attempts for the same task get killed and
other pending tasks get assigned in their place.
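
A rough sketch of the configuration used for the manual test (the interval
and multiplier values are assumptions, not part of the patch):

```scala
import org.apache.spark.SparkConf

// Enable speculative execution so slow attempts get re-launched; once any
// attempt succeeds, the patch kills the remaining attempts of that task.
val conf = new SparkConf()
  .set("spark.speculation", "true")
  .set("spark.speculation.interval", "100ms")   // assumed tuning value
  .set("spark.speculation.multiplier", "1.5")   // assumed tuning value
```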

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-13965

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11778.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11778


commit 10cf58e10ffb93961db41d3269fd48dab8ecf711
Author: Devaraj K 
Date:   2016-03-17T09:10:14Z

[SPARK-13965] [CORE] Driver should kill the other running task attempts if
any one task attempt succeeds for the same task







[GitHub] spark pull request: [SPARK-913] [CORE] log the size of each shuffl...

2016-03-18 Thread devaraj-kavali
Github user devaraj-kavali closed the pull request at:

https://github.com/apache/spark/pull/11819





[GitHub] spark pull request: [SPARK-913] [CORE] log the size of each shuffl...

2016-03-18 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11819#issuecomment-198434010
  
Thanks @srowen and @JoshRosen for the details, I am closing this since the
BlockManager no longer handles shuffle blocks.





[GitHub] spark pull request: [SPARK-13117] [Web UI] WebUI should use the lo...

2016-03-07 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11490#issuecomment-193282292
  
Sounds fine @srowen, I will update with the change.





[GitHub] spark pull request: [SPARK-13117] [Web UI] WebUI should use the lo...

2016-03-07 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11490#discussion_r5519
  
--- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
@@ -134,7 +134,8 @@ private[spark] abstract class WebUI(
   def bind() {
 assert(!serverInfo.isDefined, "Attempted to bind %s more than 
once!".format(className))
 try {
-  serverInfo = Some(startJettyServer("0.0.0.0", port, sslOptions, 
handlers, conf, name))
+  var host = Option(conf.getenv("SPARK_LOCAL_IP")).getOrElse("0.0.0.0")
+  serverInfo = Some(startJettyServer(host, port, sslOptions, handlers, 
conf, name))
   logInfo("Started %s at http://%s:%d".format(className, 
publicHostName, boundPort))
--- End diff --

I was under the assumption that we need to consider the SPARK_PUBLIC_DNS
value for showing the URL in the log, as per our previous conversation. Don't
we need to consider SPARK_PUBLIC_DNS while logging here?





[GitHub] spark pull request: [SPARK-13117] [Web UI] WebUI should use the lo...

2016-03-06 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11490#issuecomment-192878456
  
Thanks @srowen and @zsxwing for the confirmation. I have updated the 
description and fixed the review comment.





[GitHub] spark pull request: [SPARK-13117] [Web UI] WebUI should use the lo...

2016-03-03 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11490#issuecomment-192118319
  
I agree @srowen, I see that SPARK_PUBLIC_DNS is not meant for binding
purposes. I have changed the env var to SPARK_LOCAL_IP.





[GitHub] spark pull request: [SPARK-13117] [Web UI] WebUI should use the lo...

2016-03-03 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11490#issuecomment-191869171
  
I had overlooked that; it was my mistake. I think we need to consider both
env variables, something like:
```
  serverInfo = 
Some(startJettyServer(Option(conf.getenv("SPARK_PUBLIC_DNS"))
.getOrElse(Option(conf.getenv("SPARK_LOCAL_IP"))
  .getOrElse("0.0.0.0")), port, sslOptions, handlers, conf, name))
```






[GitHub] spark pull request: [SPARK-13117] [Web UI] WebUI should use the lo...

2016-03-03 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/11490

[SPARK-13117] [Web UI] WebUI should use the local ip not 0.0.0.0

## What changes were proposed in this pull request?

In WebUI, the Jetty Server now starts with the SPARK_PUBLIC_DNS config value
if it is configured; otherwise it starts with the default value '0.0.0.0'.

It is a continuation of the closed PR
https://github.com/apache/spark/pull/11133 for the JIRA SPARK-13117 and the
discussion in SPARK-13117.

## How was this patch tested?

This has been verified using the command 'netstat -tnlp | grep <pid>' to
check which IP/hostname each process is binding to, with the steps below.

In the results below, the PID mentioned in the command is the corresponding
process id.

Without the patch changes, the Web UI (Jetty Server) does not take the value
configured for SPARK_PUBLIC_DNS and listens on all interfaces.
## Master
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 3930
tcp6   0  0 :::8080 :::*LISTEN  
3930/java
```


## Worker
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 4090
tcp6   0  0 :::8081 :::*LISTEN  
4090/java
```

## History Server Process, 
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 2471
tcp6   0  0 :::18080:::*LISTEN  
2471/java
```
## Driver
```
[devaraj@stobdtserver2 spark-master]$ netstat -tnlp | grep 6556
tcp6   0  0 :::4040 :::*LISTEN  
6556/java
```


 With the patch changes

# i. With SPARK_PUBLIC_DNS configured
If SPARK_PUBLIC_DNS is configured then the Web UI (Jetty Server) of all the
processes gets bound to the configured value.
## Master
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 1561
tcp6   0  0 x.x.x.x:8080   :::*LISTEN  
1561/java
```
## Worker 
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 2229
tcp6   0  0 x.x.x.x:8081   :::*LISTEN  
2229/java
```
## History Server
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 3747
tcp6   0  0 x.x.x.x:18080  :::*LISTEN  
3747/java
```
## Driver
```
[devaraj@stobdtserver2 spark-master]$ netstat -tnlp | grep 6013
tcp6   0  0 x.x.x.x:4040   :::*LISTEN  
6013/java
```

# ii. Without SPARK_PUBLIC_DNS configured
If SPARK_PUBLIC_DNS is not configured then the Web UI (Jetty Server) of all
the processes starts with the default value '0.0.0.0'.
## Master
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 4573
tcp6   0  0 :::8080 :::*LISTEN  
4573/java
```


## Worker
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 4703
tcp6   0  0 :::8081 :::*LISTEN  
4703/java
```

## History Server
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 4846
tcp6   0  0 :::18080:::*LISTEN  
4846/java
```

## Driver
```
[devaraj@stobdtserver2 sbin]$ netstat -tnlp | grep 5437
tcp6   0  0 :::4040 :::*LISTEN  
5437/java
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-13117-v1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11490.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11490


commit 1d736ff7053b0df7b42b34ae738b7a2873e718a7
Author: Devaraj K 
Date:   2016-03-03T09:00:24Z

[SPARK-13117] [Web UI] WebUI should use the local ip not 0.0.0.0

In WebUI, now Jetty Server starts with SPARK_PUBLIC_DNS config value if it
is configured otherwise it starts with default value as '0.0.0.0'.







[GitHub] spark pull request: [SPARK-13621] [CORE] TestExecutor.scala needs ...

2016-03-02 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/11474

[SPARK-13621] [CORE] TestExecutor.scala needs to be moved to test package

Moved TestExecutor.scala from src to test package and removed the unused 
file TestClient.scala.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-13621

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11474.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11474


commit 894638b208cfbec911161093f14f2e05ed31c2a9
Author: Devaraj K 
Date:   2016-03-02T17:37:03Z

[SPARK-13621] [CORE] TestExecutor.scala needs to be moved to test package

Moved TestExecutor.scala from src to test package and removed unused file
TestClient.scala.







[GitHub] spark pull request: [SPARK-13117][Web UI] WebUI should use the loc...

2016-02-26 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11133#issuecomment-189320534
  
@srowen 
Would it be OK if we start the Jetty server with the default value "0.0.0.0"
instead of the local host name, and have it honour the configured value of
SPARK_PUBLIC_DNS if it is set? It would change only the Web UI and doesn't
impact anything else.






[GitHub] spark pull request: [SPARK-13117][Web UI] WebUI should use the loc...

2016-02-25 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11133#issuecomment-188678997
  
Earlier there was no problem in the test because the Jetty server was started
with '0.0.0.0' and did not honour the value configured for SPARK_PUBLIC_DNS;
the test assertions check the host names of the URLs, and those URLs are
derived from SPARK_PUBLIC_DNS.





[GitHub] spark pull request: [SPARK-13012] [Documentation] Replace example ...

2016-02-23 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11053#issuecomment-188062361
  
@yinxusen I will look into the issue SPARK-13462, thanks for creating it.





[GitHub] spark pull request: [SPARK-13117][Web UI] WebUI should use the loc...

2016-02-22 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11133#issuecomment-187520503
  
@srowen, I have fixed the test failure, can you have a look into this?
Thanks





[GitHub] spark pull request: [SPARK-13117][Web UI] WebUI should use the loc...

2016-02-22 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11133#issuecomment-187098980
  
It does not give clear details about the failure; exiting with exit code 1 is
because of ***System.exit(1)***. I think we can skip this
***System.exit(1)*** while running tests, to avoid terminating the JVM for
these kinds of exceptions and to show them as test failures instead.

```scala
try {
  serverInfo = Some(startJettyServer(publicHostName, port, sslOptions, handlers, conf, name))
  logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
} catch {
  case e: Exception =>
    logError("Failed to bind %s".format(className), e)
    System.exit(1)
}
```
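
As a rough sketch of that idea (not actual Spark code; the isTesting flag is
only a stand-in for however test mode would be detected):

```scala
// Sketch: surface the bind failure as an exception under tests instead of
// killing the JVM, so it is reported as a test failure; keep the old
// log-and-exit behaviour in production.
def handleBindFailure(className: String, e: Exception, isTesting: Boolean): Unit = {
  if (isTesting) {
    throw new RuntimeException(s"Failed to bind $className", e)
  } else {
    System.exit(1)
  }
}
```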





[GitHub] spark pull request: [SPARK-13117][Web UI] WebUI should use the loc...

2016-02-22 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11133#issuecomment-187096991
  
I see the test/Jenkins failure is due to the PR change.

Here org.apache.spark.deploy.LogUrlsStandaloneSuite is failing because of 
the below exception,

```
16/02/22 17:38:32.257 dispatcher-event-loop-5 ERROR Worker: Connection to master failed! Waiting for master to reconnect...
16/02/22 17:38:42.441 ScalaTest-main-running-LogUrlsStandaloneSuite ERROR SparkUI: Failed to bind SparkUI
java.net.SocketException: Unresolved address
at sun.nio.ch.Net.translateToSocketException(Net.java:157)
at sun.nio.ch.Net.translateException(Net.java:183)
at sun.nio.ch.Net.translateException(Net.java:189)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.eclipse.jetty.server.Server.doStart(Server.java:293)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:283)
at org.apache.spark.ui.JettyUtils$$anonfun$5.apply(JettyUtils.scala:293)
at org.apache.spark.ui.JettyUtils$$anonfun$5.apply(JettyUtils.scala:293)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1973)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:166)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1964)
at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:293)
at org.apache.spark.ui.WebUI.bind(WebUI.scala:137)
at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:458)
at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:458)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:458)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:133)
at org.apache.spark.deploy.LogUrlsStandaloneSuite$$anonfun$2.apply$mcV$sp(LogUrlsStandaloneSuite.scala:59)
at org.apache.spark.deploy.LogUrlsStandaloneSuite$$anonfun$2.apply(LogUrlsStandaloneSuite.scala:55)
at org.apache.spark.deploy.LogUrlsStandaloneSuite$$anonfun$2.apply(LogUrlsStandaloneSuite.scala:55)
at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
```



In LogUrlsStandaloneSuite.scala, SPARK_PUBLIC_DNS is set to "public_dns"; the
same value is then used while starting the Jetty server in WebUI.scala, and
it fails to resolve "public_dns".

```scala
test("verify that log urls reflect SPARK_PUBLIC_DNS (SPARK-6175)") {
  val SPARK_PUBLIC_DNS = "public_dns"
  val conf = new SparkConfWithEnv(Map("SPARK_PUBLIC_DNS" -> SPARK_PUBLIC_DNS)).set(
    "spark.extraListeners", classOf[SaveExecutorInfo].getName)
  sc = new SparkContext("local-cluster[2,1,1024]", "test", conf)
```


```scala
  protected val publicHostName = Option(conf.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)

  def bind() {
    assert(!serverInfo.isDefined, "Attempted to bind %s more than once!".format(className))
    try {
      serverInfo = Some(startJettyServer(publicHostName, port, sslOptions, handlers, conf, name))
      logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
```
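
To make the failure mode concrete, here is a small standalone sketch (my own illustration, not code from the PR or the test suite) showing that binding a server socket to an unresolvable host name fails with the same "Unresolved address" SocketException seen in the stack trace above, while a resolvable local address binds fine:

```scala
import java.net.InetSocketAddress
import java.nio.channels.ServerSocketChannel

// Minimal sketch: "public_dns" is assumed to be unresolvable on the machine
// running this, just as it is on the Jenkins executors.
object BindHostSketch {
  def tryBind(host: String): Unit = {
    val channel = ServerSocketChannel.open()
    try {
      // Port 0 asks the OS for any free port.
      channel.socket().bind(new InetSocketAddress(host, 0))
      println(s"bound to $host:${channel.socket().getLocalPort}")
    } catch {
      case e: Exception => println(s"failed to bind to $host: $e")
    } finally {
      channel.close()
    }
  }

  def main(args: Array[String]): Unit = {
    tryBind("public_dns") // fails: java.net.SocketException: Unresolved address
    tryBind("localhost")  // succeeds
  }
}
```

So as long as the test sets SPARK_PUBLIC_DNS to a name that does not resolve, any code path that passes publicHostName straight into the Jetty bind will hit this exception.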




[GitHub] spark pull request: [SPARK-13012] [Documentation] Replace example ...

2016-02-21 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11053#issuecomment-186855380
  
Thanks @yinxusen for the good suggestion; I have addressed it.

> ModelSelectionViaTrainValidationSplitExample and 
JavaModelSelectionViaTrainValidationSplitExample still have a problem of Vector 
serialization. But I think we can add follow-up JIRA to locate the bug and fix 
it.

Yes, we can create another follow-up JIRA to fix that problem. Thank you.




[GitHub] spark pull request: [SPARK-13117][Web UI] WebUI should use the loc...

2016-02-19 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11133#issuecomment-186185243
  
Thanks @srowen for triggering the Jenkins test to check this.

```
[info] - verify that correct log urls get propagated from workers (2 seconds, 508 milliseconds)
Exception in thread "Thread-46" Exception in thread "Thread-53" java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:196)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.net.SocketInputStream.read(SocketInputStream.java:210)
    at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293)
    at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586)
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at sbt.React.react(ForkTests.scala:114)
    at sbt.ForkTests$$anonfun$mainTestTask$1$Acceptor$2$.run(ForkTests.scala:74)
    at java.lang.Thread.run(Thread.java:745)
java.io.EOFException
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2598)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1$React.react(Framework.scala:945)
    at org.scalatest.tools.Framework$ScalaTestRunner$Skeleton$1.run(Framework.scala:934)
    at java.lang.Thread.run(Thread.java:745)
[info] ScalaTest
[info] Run completed in 10 minutes, 41 seconds.
[info] Total number of tests run: 1378
[info] Suites: completed 152, aborted 0
[info] Tests: succeeded 1378, failed 0, canceled 0, ignored 5, pending 0
[info] All tests passed.
[error] Error: Total 0, Failed 0, Errors 0, Passed 0
[error] Error during tests:
[error] Running java with options -classpath /home/jenkins/workspace/SparkPullRequestBuilder/core/target/scala-2.11/test-classes:/home/jenkins/workspace/SparkPullRequestBuilder/core/target/scala-2.11/classes:/home/jenkins/workspace/SparkPullRequestBuilder/launcher/target/scala-2.11/classes:/home/jenkins/workspace/SparkPullRequestBuilder/network/common/target/scala-2.11/classes:/home/jenkins/workspace/SparkPullRequestBuilder/network/shuffle/target/scala-2.11/classes:/home/jenkins/workspace/SparkPullRequestBuilder/unsafe/target/scala-2.11/classes:/home/jenkins/wor:/home/sparkivy/per-executor-caches/7/.sbt/boot/scala-2.10.5/org.scala-sbt/sbt/0.13.9/test-agent-0.13.9.jar:/home/sparkivy/per-executor-caches/7/.sbt/boot/scala-2.10.5/org.scala-sbt/sbt/0.13.9/test-interface-1.0.jar sbt.ForkMain 55745 failed with exit code 1
[info] MQTTStreamSuite:
```
```
[info] Passed: Total 975, Failed 0, Errors 0, Passed 975, Ignored 12
[error] (core/test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 4484 s, completed Feb 17, 2016 5:23:51 AM
[error] running /home/jenkins/workspace/SparkPullRequestBuilder/build/sbt -Pyarn -Phadoop-2.3 -Phive -Pkinesis-asl -Phive-thriftserver -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test ; received return code 1
```

I think Jenkins is reporting the build as failed because of these errors; I don't see any actual test-case failure here. Can anyone give me some pointers on how to investigate these errors?




[GitHub] spark pull request: [SPARK-13012] [Documentation] Replace example ...

2016-02-18 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11053#issuecomment-186082803
  
Thanks again @yinxusen for the review; I have addressed the comments.




[GitHub] spark pull request: [SPARK-13016] [Documentation] Replace example ...

2016-02-18 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11132#issuecomment-186070514
  
Thanks again @yinxusen for the review; I have addressed your comments.




[GitHub] spark pull request: [SPARK-13016] [Documentation] Replace example ...

2016-02-18 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11132#issuecomment-185629805
  
Thanks @yinxusen for the review; I have addressed your comments.




[GitHub] spark pull request: [SPARK-13012] [Documentation] Replace example ...

2016-02-18 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11053#issuecomment-185611856
  
Thanks @yinxusen for your detailed review and comments. I have addressed them.




[GitHub] spark pull request: [SPARK-13016] [Documentation] Replace example ...

2016-02-17 Thread devaraj-kavali
Github user devaraj-kavali commented on a diff in the pull request:

https://github.com/apache/spark/pull/11132#discussion_r53277719
  
--- Diff: examples/src/main/java/org/apache/spark/examples/mllib/JavaSVDExample.java ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.mllib;
+
+// $example on$
+import java.util.LinkedList;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.SparkContext;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.mllib.linalg.Matrix;
+import org.apache.spark.mllib.linalg.SingularValueDecomposition;
+import org.apache.spark.mllib.linalg.Vector;
+import org.apache.spark.mllib.linalg.Vectors;
+import org.apache.spark.mllib.linalg.distributed.RowMatrix;
+// $example off$
+
+/**
+ * Example for SingularValueDecomposition.
+ */
+public class JavaSVDExample {
+  public static void main(String[] args) {
+    SparkConf conf = new SparkConf().setAppName("SVD Example");
+    SparkContext sc = new SparkContext(conf);
+
+    // $example on$
+    double[][] array = { { 1.12, 2.05, 3.12 }, { 5.56, 6.28, 8.94 }, { 10.2, 8.0, 20.5 } };
+    LinkedList<Vector> rowsList = new LinkedList<>();
+    for (int i = 0; i < array.length; i++) {
+      Vector currentRow = Vectors.dense(array[i]);
+      rowsList.add(currentRow);
+    }
+    JavaRDD<Vector> rows = JavaSparkContext.fromSparkContext(sc).parallelize(rowsList);
+
+    // Create a RowMatrix from JavaRDD<Vector>.
+    RowMatrix mat = new RowMatrix(rows.rdd());
+
+    // Compute the top 3 singular values and corresponding singular vectors.
+    SingularValueDecomposition<RowMatrix, Matrix> svd = mat.computeSVD(3, true, 1.0E-9d);
+    RowMatrix U = svd.U();
+    Vector s = svd.s();
+    Matrix V = svd.V();
+    Vector[] collectPartitions = (Vector[]) U.rows().collect();
--- End diff --

It gives a compilation error if we remove the typecast here, since from the Java side the return type of collect() is Object.
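
To spell out why the cast is there, below is a small self-contained sketch (my own illustration, not part of this diff; the class name and the RDD contents are made up) showing that the Scala RDD's collect() is seen from Java as returning Object, while JavaRDD#collect() returns a typed List and needs no cast:

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

// Hypothetical demo class, not from the PR.
public class CollectReturnTypeDemo {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("collect-return-type-demo").setMaster("local");
    JavaSparkContext jsc = new JavaSparkContext(conf);

    JavaRDD<Vector> rows = jsc.parallelize(
        Arrays.asList(Vectors.dense(1.0, 2.0), Vectors.dense(3.0, 4.0)));

    // On the underlying Scala RDD, collect() is declared as Array[T], which the
    // Java compiler sees as Object, so the result is untyped without a cast.
    Object collected = rows.rdd().collect();
    // Vector[] vectors = rows.rdd().collect(); // does not compile without a cast

    // The Java API side-steps the issue: JavaRDD#collect() returns a typed List.
    List<Vector> typed = rows.collect();

    System.out.println(collected.getClass().getSimpleName() + ", " + typed.size() + " vectors");
    jsc.stop();
  }
}
```

In the example under review, though, the surrounding code works with arrays, so keeping the cast to Vector[] seems like the simpler option.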




[GitHub] spark pull request: [SPARK-13012] [Documentation] Replace example ...

2016-02-17 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11053#issuecomment-185280685
  
Thanks @srowen for the review and comments. I have removed serialVersionUID and the setters from the Java beans, and also removed the unnecessary spaces inside the braces in the imports.




[GitHub] spark pull request: [SPARK-13016] [Documentation] Replace example ...

2016-02-16 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11132#issuecomment-185064091
  
@yinxusen Thanks for reviewing. I have addressed the comments; please take another look.




[GitHub] spark pull request: [SPARK-13117][Web UI] WebUI should use the loc...

2016-02-16 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11133#issuecomment-185016457
  
@srowen I am investigating it and will update here. Thanks.




[GitHub] spark pull request: [SPARK-13012] [Documentation] Replace example ...

2016-02-15 Thread devaraj-kavali
Github user devaraj-kavali commented on the pull request:

https://github.com/apache/spark/pull/11053#issuecomment-184538336
  
Thanks for the review, @yinxusen. I have configured the code formatter in my IDE and am using it to format the code. I will address these comments and update.



