[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec
Github user roxchkplusony commented on the pull request: https://github.com/apache/spark/pull/3540#issuecomment-65200691 :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec
Github user roxchkplusony commented on the pull request: https://github.com/apache/spark/pull/3540#issuecomment-65196744 You're absolutely right. I was simply not familiar with spark.shuffle.compress. I saw spark.broadcast.compress easily enough, hilariously... This PR and JIRA should be closed. I can do the honors :)
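For context on why the PR was withdrawn: Spark already exposes per-subsystem boolean flags that disable compression outright, so a no-op codec is redundant. A sketch of the relevant spark-defaults.conf entries (the values shown are illustrative, not recommendations):

```properties
# Disable compression per subsystem instead of registering a no-op codec
spark.shuffle.compress      false
spark.broadcast.compress    false
# The codec used wherever compression remains enabled (e.g. snappy or lz4)
spark.io.compression.codec  snappy
```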
[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec
Github user roxchkplusony closed the pull request at: https://github.com/apache/spark/pull/3540
[GitHub] spark pull request: [SPARK-4680] "none" -> NoOpCompressionCodec
GitHub user roxchkplusony opened a pull request: https://github.com/apache/spark/pull/3540 [SPARK-4680] "none" -> NoOpCompressionCodec You can merge this pull request into a Git repository by running: $ git pull https://github.com/Paxata/spark feature/no-op-compression-codec Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3540.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3540 commit 1c7e928e25100251fcd551c8c6c0c6819251963b Author: roxchkplusony Date: 2014-12-01T20:34:00Z "none" -> NoOpCompressionCodec
[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...
Github user roxchkplusony closed the pull request at: https://github.com/apache/spark/pull/3503
[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...
Github user roxchkplusony commented on the pull request: https://github.com/apache/spark/pull/3503#issuecomment-64860545 @rxin here it is
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
GitHub user roxchkplusony opened a pull request: https://github.com/apache/spark/pull/3503 [SPARK-4626] Kill a task only if the executorId is (still) registered wi... v1.1 backport for #3483 You can merge this pull request into a Git repository by running: $ git pull https://github.com/Paxata/spark bugfix/4626-1.1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3503.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3503 commit 4f95baf2ce023021a1c8199707f18ac93d1b6886 Author: roxchkplusony Date: 2014-11-27T23:54:40Z [SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler Author: roxchkplusony Closes #3483 from roxchkplusony/bugfix/4626 and squashes the following commits: aba9184 [roxchkplusony] replace warning message per review 5e7fdea [roxchkplusony] [SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler
[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...
Github user roxchkplusony closed the pull request at: https://github.com/apache/spark/pull/3502
[GitHub] spark pull request: [BRANCH-1.1][SPARK-4626] Kill a task only if t...
GitHub user roxchkplusony opened a pull request: https://github.com/apache/spark/pull/3502 [BRANCH-1.1][SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler v1.1 backport for #3483 You can merge this pull request into a Git repository by running: $ git pull https://github.com/Paxata/spark bugfix/4626-1.1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3502.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3502 commit 90f8f3eed026e9c4f1a4b1952e284558c0e3fd23 Author: chutium Date: 2014-08-27T20:13:04Z [SPARK-3138][SQL] sqlContext.parquetFile should be able to take a single file as parameter ```if (!fs.getFileStatus(path).isDir) throw Exception``` make no sense after this commit #1370 be careful if someone is working on SPARK-2551, make sure the new change passes test case ```test("Read a parquet file instead of a directory")``` Author: chutium Closes #2044 from chutium/parquet-singlefile and squashes the following commits: 4ae477f [chutium] [SPARK-3138][SQL] sqlContext.parquetFile should be able to take a single file as parameter (cherry picked from commit 48f42781dedecd38ddcb2dcf67dead92bb4318f5) Signed-off-by: Michael Armbrust commit 3cb4e1718f40a18e3d19a33fd627960687bbcb6c Author: Vida Ha Date: 2014-08-27T21:26:06Z Spark-3213 Fixes issue with spark-ec2 not detecting slaves created with "Launch More like this" ... copy the spark_cluster_tag from a spot instance requests over to the instances. 
Author: Vida Ha Closes #2163 from vidaha/vida/spark-3213 and squashes the following commits: 5070a70 [Vida Ha] Spark-3214 Fix issue with spark-ec2 not detecting slaves created with 'Launch More Like This' and using Spot Requests (cherry picked from commit 7faf755ae4f0cf510048e432340260a6e609066d) Signed-off-by: Josh Rosen commit c1ffa3e4cdfbd1f84b5c8d8de5d0fb958a19e211 Author: Andrew Or Date: 2014-08-27T21:46:56Z [SPARK-3243] Don't use stale spark-driver.* system properties If we set both `spark.driver.extraClassPath` and `--driver-class-path`, then the latter correctly overrides the former. However, the value of the system property `spark.driver.extraClassPath` still uses the former, which is actually not added to the class path. This may cause some confusion... Of course, this also affects other options (i.e. java options, library path, memory...). Author: Andrew Or Closes #2154 from andrewor14/driver-submit-configs-fix and squashes the following commits: 17ec6fc [Andrew Or] Fix tests 0140836 [Andrew Or] Don't forget spark.driver.memory e39d20f [Andrew Or] Also set spark.driver.extra* configs in client mode (cherry picked from commit 63a053ab140d7bf605e8c5b7fb5a7bd52aca29b2) Signed-off-by: Patrick Wendell commit b3d763b0b7fc6345dac5d222414f902e4afdee13 Author: viirya Date: 2014-08-27T21:55:05Z [SPARK-3252][SQL] Add missing condition for test According to the text message, both relations should be tested. So add the missing condition. Author: viirya Closes #2159 from viirya/fix_test and squashes the following commits: b1c0f52 [viirya] add missing condition. 
(cherry picked from commit 28d41d627919fcb196d9d31bad65d664770bee67) Signed-off-by: Michael Armbrust commit 77116875f4184e0a637d9d7fd5b1dfeaabe0c9d3 Author: Aaron Davidson Date: 2014-08-27T22:05:47Z [SQL] [SPARK-3236] Reading Parquet tables from Metastore mangles location Currently we do `relation.hiveQlTable.getDataLocation.getPath`, which returns the path-part of the URI (e.g., "s3n://my-bucket/my-path" => "/my-path"). We should do `relation.hiveQlTable.getDataLocation.toString` instead, as a URI's toString returns a faithful representation of the full URI, which can later be passed into a Hadoop Path. Author: Aaron Davidson Closes #2150 from aarondav/parquet-location and squashes the following commits: 459f72c [Aaron Davidson] [SQL] [SPARK-3236] Reading Parquet tables from Metastore mangles location (cherry picked from commit cc275f4b7910f6d0ad266a43bac2fdae58e9739e) Signed-off-by: Michael Armbrust commit 5ea260ebd1acbbe9705849a16ee67758e33c65b0 Author: luogankun Date: 2014-08-27T22:08:22Z [SPARK-3065][SQL] Add locale setting to fix results do not match for udf_unix_timestamp format " MMM dd h:mm:ss a" run with not "America/Los_Angeles" TimeZone in HiveCompatibilitySuite When run the udf_unix_timestamp of org.apache.spark.sql.hive.execution.HiveCompatibilitySuite testcase with
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
Github user roxchkplusony commented on the pull request: https://github.com/apache/spark/pull/3483#issuecomment-64849808 Gladly, after a little break and a chance to figure out upstream branches... lol.
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
Github user roxchkplusony commented on the pull request: https://github.com/apache/spark/pull/3483#issuecomment-64838944 Thanks @rxin! Style-wise I agree. Funny that you put up another alternative :-P
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
Github user roxchkplusony commented on the pull request: https://github.com/apache/spark/pull/3483#issuecomment-64835901 At this point, the merits of the change have disappeared from discussion and now we're onto style questions. Since this change does not diverge from existing patterns, can we move forward? Anyone who cares enough about the style question is free to make a separate PR. Is there anything left to do before accepting or rejecting this PR?
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
Github user roxchkplusony commented on a diff in the pull request: https://github.com/apache/spark/pull/3483#discussion_r21012034 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -127,7 +127,13 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val actorSyste makeOffers() case KillTask(taskId, executorId, interruptThread) => -executorDataMap(executorId).executorActor ! KillTask(taskId, executorId, interruptThread) +executorDataMap.get(executorId) match { + case Some(executorInfo) => +executorInfo.executorActor ! KillTask(taskId, executorId, interruptThread) + case None => +// Ignoring the task kill since the executor is not registered. +logWarning(s"Attempted to kill task $taskId for unknown executor $executorId.") +} --- End diff -- I like it less, but it's a close #2 next to the existing pattern. I wouldn't object to keeping them the same or moving to your pattern.
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
Github user roxchkplusony commented on a diff in the pull request: https://github.com/apache/spark/pull/3483#discussion_r21008476 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -127,7 +127,13 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val actorSyste makeOffers() case KillTask(taskId, executorId, interruptThread) => -executorDataMap(executorId).executorActor ! KillTask(taskId, executorId, interruptThread) +executorDataMap.get(executorId) match { + case Some(executorInfo) => +executorInfo.executorActor ! KillTask(taskId, executorId, interruptThread) + case None => +// Ignoring the task kill since the executor is not registered. +logWarning(s"Attempted to kill task $taskId for unknown executor $executorId.") +} --- End diff -- I understand the general objection (pattern matching is usually a cop-out to better functional style) but that's not the appropriate pattern here. map is specifically designed to apply a morphism from A -> B (in Scala, f: A => B) to describe Option[A] -> Option[B]. What we are doing here is applying a choice of side effect, not a value, depending on the concrete Option. The example is (Option[A] -> Option[Unit]) -> Unit with misuse of monadic operators. Also, this code applies patterns found consistently elsewhere in this class file. If you believe strongly in this pattern, would you mind opening a PR for review?
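The distinction drawn in the comment above — map transforms a value, while choosing a side effect based on presence is what match expresses directly — can be sketched outside Spark. Everything below is a simplified stand-in: the ExecutorInfo class, the map, and the warning buffer only mirror the names in the quoted diff and are not Spark code.

```scala
// Simplified stand-in for the registered-executor table in the diff.
case class ExecutorInfo(id: String)

object KillTaskSketch {
  val executorDataMap: Map[String, ExecutorInfo] =
    Map("exec-1" -> ExecutorInfo("exec-1"))

  // Stand-in for logWarning: collect messages so behavior is observable.
  val warnings = scala.collection.mutable.Buffer[String]()

  def killTask(taskId: Long, executorId: String): Unit =
    executorDataMap.get(executorId) match {
      case Some(executorInfo) =>
        // In Spark this branch would send KillTask to executorInfo.executorActor.
        ()
      case None =>
        // Ignore the kill request: the executor is no longer registered.
        warnings += s"Attempted to kill task $taskId for unknown executor $executorId."
    }
}
```

The match makes both outcomes explicit; an `executorDataMap.get(executorId).map(...)` chain would silently produce an unused `Option[Unit]`, which is the monadic misuse the comment objects to.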
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
Github user roxchkplusony commented on a diff in the pull request: https://github.com/apache/spark/pull/3483#discussion_r20964768 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -127,7 +127,14 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val actorSyste makeOffers() case KillTask(taskId, executorId, interruptThread) => -executorDataMap(executorId).executorActor ! KillTask(taskId, executorId, interruptThread) +executorDataMap.get(executorId) match { + case Some(executorInfo) => +executorInfo.executorActor ! KillTask(taskId, executorId, interruptThread) + case None => +// Ignoring the task kill since the executor is not registered. +logWarning(s"Ignored task kill $taskId $executorId" + + " for unknown executor $sender with ID $executorId") --- End diff -- I have no problem doing that. Do you think StatusUpdate's message is clear as-is?
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
Github user roxchkplusony commented on the pull request: https://github.com/apache/spark/pull/3483#issuecomment-64683263 I largely stole the structure from the status update message handler.
[GitHub] spark pull request: [SPARK-4626] Kill a task only if the executorI...
GitHub user roxchkplusony opened a pull request: https://github.com/apache/spark/pull/3483 [SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler You can merge this pull request into a Git repository by running: $ git pull https://github.com/Paxata/spark bugfix/4626 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3483.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3483 commit 97c4efb4dd7972c3aae0c6f496b8d1a5984da4d7 Author: roxchkplusony Date: 2014-11-26T17:37:00Z [SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler