[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec

2014-12-02 Thread roxchkplusony
Github user roxchkplusony commented on the pull request:

https://github.com/apache/spark/pull/3540#issuecomment-65200691
  
:)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec

2014-12-02 Thread roxchkplusony
Github user roxchkplusony commented on the pull request:

https://github.com/apache/spark/pull/3540#issuecomment-65196744
  
You're absolutely right. I was simply not familiar with 
spark.shuffle.compress. I saw spark.broadcast.compress easily enough, 
hilariously... This PR and Jira should be closed. I can do the honors :)
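For context on the exchange above: compression can already be turned off per subsystem through existing settings, which is why a "none" codec was deemed unnecessary. A configuration sketch (not code from the PR; `spark.shuffle.compress` and `spark.broadcast.compress` are the real config keys being referred to):

```scala
import org.apache.spark.SparkConf

// Disable compression per subsystem instead of installing a no-op codec.
val conf = new SparkConf()
  .set("spark.shuffle.compress", "false")   // shuffle outputs
  .set("spark.broadcast.compress", "false") // broadcast variables
```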





[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec

2014-12-02 Thread roxchkplusony
Github user roxchkplusony closed the pull request at:

https://github.com/apache/spark/pull/3540





[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec

2014-12-01 Thread roxchkplusony
GitHub user roxchkplusony opened a pull request:

https://github.com/apache/spark/pull/3540

[SPARK-4680] "none" -> NoOpCompressionCodec



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Paxata/spark feature/no-op-compression-codec

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3540.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3540


commit 1c7e928e25100251fcd551c8c6c0c6819251963b
Author: roxchkplusony 
Date:   2014-12-01T20:34:00Z

"none" -> NoOpCompressionCodec







[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...

2014-11-28 Thread roxchkplusony
Github user roxchkplusony closed the pull request at:

https://github.com/apache/spark/pull/3503





[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...

2014-11-27 Thread roxchkplusony
Github user roxchkplusony commented on the pull request:

https://github.com/apache/spark/pull/3503#issuecomment-64860545
  
@rxin here it is





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-27 Thread roxchkplusony
GitHub user roxchkplusony opened a pull request:

https://github.com/apache/spark/pull/3503

[SPARK-4626] Kill a task only if the executorId is (still) registered wi...

v1.1 backport for #3483

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Paxata/spark bugfix/4626-1.1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3503.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3503


commit 4f95baf2ce023021a1c8199707f18ac93d1b6886
Author: roxchkplusony 
Date:   2014-11-27T23:54:40Z

[SPARK-4626] Kill a task only if the executorId is (still) registered with 
the scheduler

Author: roxchkplusony 

Closes #3483 from roxchkplusony/bugfix/4626 and squashes the following 
commits:

aba9184 [roxchkplusony] replace warning message per review
5e7fdea [roxchkplusony] [SPARK-4626] Kill a task only if the executorId is 
(still) registered with the scheduler







[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...

2014-11-27 Thread roxchkplusony
Github user roxchkplusony closed the pull request at:

https://github.com/apache/spark/pull/3502





[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...

2014-11-27 Thread roxchkplusony
GitHub user roxchkplusony opened a pull request:

https://github.com/apache/spark/pull/3502

[BRANCH-1.1][SPARK-4626] Kill a task only if the executorId is (still) 
registered with the scheduler

v1.1 backport for #3483

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Paxata/spark bugfix/4626-1.1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3502.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3502


commit 90f8f3eed026e9c4f1a4b1952e284558c0e3fd23
Author: chutium 
Date:   2014-08-27T20:13:04Z

[SPARK-3138][SQL] sqlContext.parquetFile should be able to take a single 
file as parameter

```if (!fs.getFileStatus(path).isDir) throw Exception``` make no sense 
after this commit #1370

be careful if someone is working on SPARK-2551, make sure the new change 
passes test case ```test("Read a parquet file instead of a directory")```

Author: chutium 

Closes #2044 from chutium/parquet-singlefile and squashes the following 
commits:

4ae477f [chutium] [SPARK-3138][SQL] sqlContext.parquetFile should be able 
to take a single file as parameter

(cherry picked from commit 48f42781dedecd38ddcb2dcf67dead92bb4318f5)
Signed-off-by: Michael Armbrust 

commit 3cb4e1718f40a18e3d19a33fd627960687bbcb6c
Author: Vida Ha 
Date:   2014-08-27T21:26:06Z

Spark-3213 Fixes issue with spark-ec2 not detecting slaves created with 
"Launch More like this"

... copy the spark_cluster_tag from a spot instance requests over to the 
instances.

Author: Vida Ha 

Closes #2163 from vidaha/vida/spark-3213 and squashes the following commits:

5070a70 [Vida Ha] Spark-3214 Fix issue with spark-ec2 not detecting slaves 
created with 'Launch More Like This' and using Spot Requests

(cherry picked from commit 7faf755ae4f0cf510048e432340260a6e609066d)
Signed-off-by: Josh Rosen 

commit c1ffa3e4cdfbd1f84b5c8d8de5d0fb958a19e211
Author: Andrew Or 
Date:   2014-08-27T21:46:56Z

[SPARK-3243] Don't use stale spark-driver.* system properties

If we set both `spark.driver.extraClassPath` and `--driver-class-path`, 
then the latter correctly overrides the former. However, the value of the 
system property `spark.driver.extraClassPath` still uses the former, which is 
actually not added to the class path. This may cause some confusion...

Of course, this also affects other options (i.e. java options, library 
path, memory...).

Author: Andrew Or 

Closes #2154 from andrewor14/driver-submit-configs-fix and squashes the 
following commits:

17ec6fc [Andrew Or] Fix tests
0140836 [Andrew Or] Don't forget spark.driver.memory
e39d20f [Andrew Or] Also set spark.driver.extra* configs in client mode
(cherry picked from commit 63a053ab140d7bf605e8c5b7fb5a7bd52aca29b2)

Signed-off-by: Patrick Wendell 

commit b3d763b0b7fc6345dac5d222414f902e4afdee13
Author: viirya 
Date:   2014-08-27T21:55:05Z

[SPARK-3252][SQL] Add missing condition for test

According to the text message, both relations should be tested. So add the 
missing condition.

Author: viirya 

Closes #2159 from viirya/fix_test and squashes the following commits:

b1c0f52 [viirya] add missing condition.

(cherry picked from commit 28d41d627919fcb196d9d31bad65d664770bee67)
Signed-off-by: Michael Armbrust 

commit 77116875f4184e0a637d9d7fd5b1dfeaabe0c9d3
Author: Aaron Davidson 
Date:   2014-08-27T22:05:47Z

[SQL] [SPARK-3236] Reading Parquet tables from Metastore mangles location

Currently we do `relation.hiveQlTable.getDataLocation.getPath`, which 
returns the path-part of the URI (e.g., "s3n://my-bucket/my-path" => 
"/my-path"). We should do `relation.hiveQlTable.getDataLocation.toString` 
instead, as a URI's toString returns a faithful representation of the full URI, 
which can later be passed into a Hadoop Path.

Author: Aaron Davidson 

Closes #2150 from aarondav/parquet-location and squashes the following 
commits:

459f72c [Aaron Davidson] [SQL] [SPARK-3236] Reading Parquet tables from 
Metastore mangles location

(cherry picked from commit cc275f4b7910f6d0ad266a43bac2fdae58e9739e)
Signed-off-by: Michael Armbrust 
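(The getPath-versus-toString behavior described in the commit message above can be checked with plain `java.net.URI`; the bucket name is illustrative:)

```scala
import java.net.URI

object UriLocation {
  val location = new URI("s3n://my-bucket/my-path")

  // getPath drops the scheme and authority, losing the bucket.
  val pathOnly: String = location.getPath

  // toString preserves the full URI, suitable for building a Hadoop Path.
  val full: String = location.toString
}
```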

commit 5ea260ebd1acbbe9705849a16ee67758e33c65b0
Author: luogankun 
Date:   2014-08-27T22:08:22Z

[SPARK-3065][SQL] Add locale setting to fix results do not match for 
udf_unix_timestamp format " MMM dd h:mm:ss a" run with not 
"America/Los_Angeles" TimeZone in HiveCompatibilitySuite

When run the udf_unix_timestamp of 
org.apache.spark.sql.hive.execution.HiveCompatibilitySuite testcase
with 

[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-27 Thread roxchkplusony
Github user roxchkplusony commented on the pull request:

https://github.com/apache/spark/pull/3483#issuecomment-64849808
  
Gladly, after a little break and a chance to figure out upstream 
branches... lol.





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-27 Thread roxchkplusony
Github user roxchkplusony commented on the pull request:

https://github.com/apache/spark/pull/3483#issuecomment-64838944
  
Thanks @rxin! Style-wise I agree. Funny that you put up another alternative 
:-P





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-27 Thread roxchkplusony
Github user roxchkplusony commented on the pull request:

https://github.com/apache/spark/pull/3483#issuecomment-64835901
  
At this point, the merits of the change have disappeared from discussion 
and now we're onto style questions. Since this change does not diverge from 
existing patterns, can we move forward? Anyone who cares enough about the style 
question is free to make a separate PR. Is there anything left to do before 
accepting or rejecting this PR?





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-27 Thread roxchkplusony
Github user roxchkplusony commented on a diff in the pull request:

https://github.com/apache/spark/pull/3483#discussion_r21012034
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
@@ -127,7 +127,13 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val actorSyste
         makeOffers()

       case KillTask(taskId, executorId, interruptThread) =>
-        executorDataMap(executorId).executorActor ! KillTask(taskId, executorId, interruptThread)
+        executorDataMap.get(executorId) match {
+          case Some(executorInfo) =>
+            executorInfo.executorActor ! KillTask(taskId, executorId, interruptThread)
+          case None =>
+            // Ignoring the task kill since the executor is not registered.
+            logWarning(s"Attempted to kill task $taskId for unknown executor $executorId.")
+        }
--- End diff --

I like it less, but it's a close #2 next to the existing pattern. I 
wouldn't object to keeping them the same or moving to your pattern.





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-27 Thread roxchkplusony
Github user roxchkplusony commented on a diff in the pull request:

https://github.com/apache/spark/pull/3483#discussion_r21008476
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
@@ -127,7 +127,13 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val actorSyste
         makeOffers()

       case KillTask(taskId, executorId, interruptThread) =>
-        executorDataMap(executorId).executorActor ! KillTask(taskId, executorId, interruptThread)
+        executorDataMap.get(executorId) match {
+          case Some(executorInfo) =>
+            executorInfo.executorActor ! KillTask(taskId, executorId, interruptThread)
+          case None =>
+            // Ignoring the task kill since the executor is not registered.
+            logWarning(s"Attempted to kill task $taskId for unknown executor $executorId.")
+        }
--- End diff --

I understand the general objection (pattern matching is usually a cop-out 
to better functional style) but that's not the appropriate pattern here. map is 
specifically designed to apply a morphism from A -> B (in Scala, f: A => B) to 
describe Option[A] -> Option[B]. What we are doing here is applying a choice of 
side effect, not a value, depending on the concrete Option. The example is 
(Option[A] -> Option[Unit]) -> Unit with misuse of monadic operators. Also, 
this code applies patterns found consistently elsewhere in this class file. If 
you believe strongly in this pattern, would you mind opening a PR for review?
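(A standalone sketch of the distinction drawn above — not the actual Spark code; `registry` and the result strings are invented for illustration. Pattern matching directly expresses a choice of side effect on `Some`/`None`, whereas `map` is meant to transform the contained value:)

```scala
// Hypothetical stand-in for executorDataMap; not Spark code.
object OptionStyles {
  val registry: Map[String, String] = Map("exec-1" -> "actor-1")

  // Branching on presence vs. absence to pick an action: a match is direct.
  def killViaMatch(executorId: String): String =
    registry.get(executorId) match {
      case Some(actor) => s"sent KillTask to $actor"
      case None        => s"warning: unknown executor $executorId"
    }

  // The map-based alternative the comment argues against: mapping and then
  // unwrapping is a roundabout way to express the same two-way branch.
  def killViaMap(executorId: String): String =
    registry.get(executorId)
      .map(actor => s"sent KillTask to $actor")
      .getOrElse(s"warning: unknown executor $executorId")
}
```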





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-26 Thread roxchkplusony
Github user roxchkplusony commented on a diff in the pull request:

https://github.com/apache/spark/pull/3483#discussion_r20964768
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
@@ -127,7 +127,14 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val actorSyste
         makeOffers()

       case KillTask(taskId, executorId, interruptThread) =>
-        executorDataMap(executorId).executorActor ! KillTask(taskId, executorId, interruptThread)
+        executorDataMap.get(executorId) match {
+          case Some(executorInfo) =>
+            executorInfo.executorActor ! KillTask(taskId, executorId, interruptThread)
+          case None =>
+            // Ignoring the task kill since the executor is not registered.
+            logWarning(s"Ignored task kill $taskId $executorId"
+              + " for unknown executor $sender with ID $executorId")
--- End diff --

I have no problem doing that. Do you think StatusUpdate's message is clear 
as-is?
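(An editorial aside on the quoted diff: the second string literal lacks the `s` prefix, so `$sender` and `$executorId` in it would print literally rather than interpolate — the message was later replaced per review. A minimal standalone illustration of the pitfall:)

```scala
object InterpolationPitfall {
  val taskId = 7
  val executorId = "exec-1"

  // Without the s prefix, $-placeholders are plain characters.
  val literal = "kill task $taskId on $executorId"

  // With the s prefix, they are interpolated.
  val interpolated = s"kill task $taskId on $executorId"
}
```

`literal` keeps the dollar signs verbatim; `interpolated` yields `kill task 7 on exec-1`.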





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-26 Thread roxchkplusony
Github user roxchkplusony commented on the pull request:

https://github.com/apache/spark/pull/3483#issuecomment-64683263
  
I largely stole the structure from the status update message handler.





[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...

2014-11-26 Thread roxchkplusony
GitHub user roxchkplusony opened a pull request:

https://github.com/apache/spark/pull/3483

[SPARK-4626] Kill a task only if the executorId is (still) registered with 
the scheduler



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Paxata/spark bugfix/4626

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3483.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3483


commit 97c4efb4dd7972c3aae0c6f496b8d1a5984da4d7
Author: roxchkplusony 
Date:   2014-11-26T17:37:00Z

[SPARK-4626] Kill a task only if the executorId is (still) registered with 
the scheduler



