[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-11-06 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 Thanks for the reviews and feedback @tgravescs , @squito ! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-11-01 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 @squito I have tested it again with both scenarios and I was able to verify the expected behavior. For the cases that are not covered in the PR, I will mention them in the JIRA

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-31 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 @tgravescs I have fixed a nit and it's good to be reviewed. @squito I have updated the comment, let me know if it's okay. Thanks for the reviews

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-26 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r228637254 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -415,9 +420,54 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-26 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r228636999 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -415,9 +420,54 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-26 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r228636880 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -415,9 +420,54 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-23 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 @squito for the locality wait, it would be the same as the condition where it is not completely blacklisted. I have added a test for this. If we want to ensure the sequence for the timeout expiring

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r227080534 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala --- @@ -503,6 +507,145 @@ class TaskSchedulerImplSuite extends

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r227082382 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -415,9 +420,55 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r227095389 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -415,9 +420,55 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r227080844 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala --- @@ -503,6 +507,145 @@ class TaskSchedulerImplSuite extends

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r227077071 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -453,6 +504,25 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-22 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 It applies to both DA and SA. I have updated the description.

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-22 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 @squito I have made the changes and updated the description.

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-19 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 test this please

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-19 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r226754849 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -415,9 +420,65 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-18 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 Failure is unrelated.

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-18 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 retest this please

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-12 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r224892495 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -146,21 +146,31 @@ private[scheduler] class BlacklistTracker

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-12 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r224873268 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -146,21 +146,31 @@ private[scheduler] class BlacklistTracker

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-10-10 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r224167756 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -415,9 +419,61 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-09 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 @tgravescs I have addressed the review comments.

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-09 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r223741374 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,272 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222783753 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222739514 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,274 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222784195 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222785629 --- Diff: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala --- @@ -59,6 +60,43 @@ case object JVMOffHeapMemory extends

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222783588 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222781707 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222739822 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #22612: [SPARK-24958] Add executors' process tree total m...

2018-10-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22612#discussion_r222779045 --- Diff: core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-09-28 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 The failures seem to be unrelated. I wasn't able to reproduce them.

[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-09-28 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 retest this please

[GitHub] spark pull request #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new...

2018-09-11 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216795021 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,48 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216788096 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,48 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216788079 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,48 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-11 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r216788016 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -623,8 +623,9 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r215036162 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,54 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #22288: [SPARK-22148][Scheduler] Acquire new executors to avoid ...

2018-08-31 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22288 @squito @tgravescs Can you review this PR? Thanks.

[GitHub] spark pull request #22288: [SPARK-22148] Acquire new executors to avoid hang...

2018-08-30 Thread dhruve
GitHub user dhruve opened a pull request: https://github.com/apache/spark/pull/22288 [SPARK-22148] Acquire new executors to avoid hang because of blacklisting ## What changes were proposed in this pull request? Every time a task is unschedulable because of the condition where

[GitHub] spark pull request #22121: [SPARK-25133][SQL][Doc]Avro data source guide

2018-08-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22121#discussion_r212075677 --- Diff: docs/avro-data-source-guide.md --- @@ -0,0 +1,377 @@ +--- +layout: global +title: Apache Avro Data Source Guide

[GitHub] spark pull request #22121: [SPARK-25133][SQL][Doc]Avro data source guide

2018-08-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22121#discussion_r212032223 --- Diff: docs/avro-data-source-guide.md --- @@ -0,0 +1,377 @@ +--- +layout: global +title: Apache Avro Data Source Guide

[GitHub] spark issue #22015: [SPARK-20286][SPARK-24786][Core][DynamicAllocation] Rele...

2018-08-07 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/22015 This modifies the SparkListenerUnpersistRDD case class, which fails the MiMa tests. If this is something heavily used by developers, I can add either another ListenerBus message or modify the PR

[GitHub] spark pull request #22015: [SPARK-20286] Release executors on unpersisting R...

2018-08-06 Thread dhruve
GitHub user dhruve opened a pull request: https://github.com/apache/spark/pull/22015 [SPARK-20286] Release executors on unpersisting RDD ## What changes were proposed in this pull request? Currently, the executors acquired using dynamic allocation are not released when

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2018-07-20 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19194 @squito I have made the changes as requested. Can you have a look at this again? Thanks.

[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/21601 @attilapiros I will modify the test to add a check/assert which makes it easy to follow and validate what we are trying to achieve in the test. For the rest of the cases, since these are hadoop

[GitHub] spark pull request #21601: [SPARK-24610] fix reading small files via wholeTe...

2018-07-02 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/21601#discussion_r199597945 --- Diff: core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala --- @@ -53,6 +53,19 @@ private[spark] class WholeTextFileInputFormat

[GitHub] spark pull request #21601: [SPARK-24610] fix reading small files via wholeTe...

2018-07-02 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/21601#discussion_r199602993 --- Diff: core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala --- @@ -53,6 +53,19 @@ private[spark] class WholeTextFileInputFormat

[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-21 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/21601 @vanzin Can you review this PR?

[GitHub] spark pull request #21601: [SPARK-24610] fix reading small files via wholeTe...

2018-06-20 Thread dhruve
GitHub user dhruve opened a pull request: https://github.com/apache/spark/pull/21601 [SPARK-24610] fix reading small files via wholeTextFiles ## What changes were proposed in this pull request? The `WholeTextFileInputFormat` determines the `maxSplitSize` for the file/s being

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-10-26 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19194 Hi @squito. Sorry for the late response. I want to get back on this. As soon as I get a chance I will work on it and update the PR

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-27 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r141482741 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -512,6 +535,9 @@ private[spark] class TaskSetManager

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-27 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19194 Configuring at the stage level seems to be the appropriate and more deterministic choice. If we agree on changing the API, we can start another effort looking in that direction. Till then we can

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-27 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19194 @squito Thanks for pointing that out. What you mentioned makes sense and I did dig some more on the `DAGScheduler` and `activeJobForStage` to gather more context. We could take into account

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-21 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r140361162 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -619,6 +625,47 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-21 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r140321640 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -619,6 +625,47 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-21 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r140293577 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -619,6 +625,47 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r140122744 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -1255,6 +1255,97 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r140124886 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -619,6 +625,47 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r140122769 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -1255,6 +1255,97 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r140123047 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -758,11 +825,52 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-18 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19194 @tgraves I have addressed the comments and tried to cover the possible cases in the existing test for job groups and speculation. Kindly let me know if we need to add or address more use cases

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-15 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r139163830 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -1255,6 +1255,97 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-15 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r139163463 --- Diff: core/src/test/scala/org/apache/spark/ExecutorAllocationManagerSuite.scala --- @@ -210,22 +216,282 @@ class ExecutorAllocationManagerSuite

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-15 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r139163529 --- Diff: core/src/test/scala/org/apache/spark/ExecutorAllocationManagerSuite.scala --- @@ -210,22 +216,282 @@ class ExecutorAllocationManagerSuite

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-15 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r139163109 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -758,11 +812,58 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-15 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/19194#discussion_r139162871 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -758,11 +812,58 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-11 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19194 Rebased this PR with current master and have squashed the earlier commits.

[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-11 Thread dhruve
GitHub user dhruve opened a pull request: https://github.com/apache/spark/pull/19194 [SPARK-20589] Allow limiting task concurrency per stage ## What changes were proposed in this pull request? This change allows the user to specify the maximum no. of tasks running in a given

[GitHub] spark pull request #19157: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-09-11 Thread dhruve
Github user dhruve closed the pull request at: https://github.com/apache/spark/pull/19157

[GitHub] spark issue #19157: [SPARK-20589][Core][Scheduler] Allow limiting task concu...

2017-09-08 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19157 @HyukjinKwon Thanks for pointing this out. I will do a rebase and then do a push. The message from appveyor wasn't very obvious so I didn't realize

[GitHub] spark issue #19157: [SPARK-20589][Core][Scheduler] Allow limiting task concu...

2017-09-07 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19157 Reopened this because CI was having issues with the previous PR. [18950](https://github.com/apache/spark/pull/18950

[GitHub] spark issue #19157: [SPARK-20589][Core][Scheduler] Allow limiting task concu...

2017-09-07 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/19157 @squito @markhamstra @tgravescs Can you review this PR? Thanks.

[GitHub] spark pull request #19157: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-09-07 Thread dhruve
GitHub user dhruve opened a pull request: https://github.com/apache/spark/pull/19157 [SPARK-20589][Core][Scheduler] Allow limiting task concurrency per job group ## What changes were proposed in this pull request? This change allows the user to specify the maximum no. of tasks

[GitHub] spark issue #18950: [SPARK-20589][Core][Scheduler] Allow limiting task concu...

2017-09-07 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18950 CI is having issues downloading my repo. Closing this PR and opening a new one.

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-09-07 Thread dhruve
Github user dhruve closed the pull request at: https://github.com/apache/spark/pull/18950

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18950#discussion_r134608269 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -598,13 +600,58 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-22 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18950#discussion_r134608326 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -727,6 +780,68 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark issue #18950: [SPARK-20589][Core][Scheduler] Allow limiting task concu...

2017-08-22 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18950 @squito I will pull the test from the latest master and update it with the changes we made. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark issue #18950: [SPARK-20589][Core][Scheduler] Allow limiting task concu...

2017-08-21 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18950 @squito @markhamstra I addressed the comments and have made the changes to account for running different job groups concurrently.

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-21 Thread dhruve
GitHub user dhruve reopened a pull request: https://github.com/apache/spark/pull/18950 [SPARK-20589][Core][Scheduler] Allow limiting task concurrency per job group ## What changes were proposed in this pull request? This change allows the user to specify the maximum no. of tasks

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-21 Thread dhruve
Github user dhruve closed the pull request at: https://github.com/apache/spark/pull/18950

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-16 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18950#discussion_r133547976 --- Diff: core/src/test/scala/org/apache/spark/ExecutorAllocationManagerSuite.scala --- @@ -188,6 +188,125 @@ class ExecutorAllocationManagerSuite

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-16 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18950#discussion_r133547780 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -454,64 +477,68 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-16 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18950#discussion_r133547673 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -602,6 +604,21 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-15 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18950#discussion_r133321125 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -602,6 +604,21 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark issue #18950: [SPARK-20589][Core][Scheduler] Allow limiting task concu...

2017-08-15 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18950 @kayousterhout @squito Can you review this PR? Thanks.

[GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...

2017-08-15 Thread dhruve
GitHub user dhruve opened a pull request: https://github.com/apache/spark/pull/18950 [SPARK-20589][Core][Scheduler] Allow limiting task concurrency per job group ## What changes were proposed in this pull request? This change allows the user to specify the maximum no. of tasks

[GitHub] spark pull request #18691: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-22 Thread dhruve
Github user dhruve closed the pull request at: https://github.com/apache/spark/pull/18691

[GitHub] spark issue #18691: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-21 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18691 Thanks @tgravescs. Closing the PR.

[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-21 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128793623 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -375,6 +390,7 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark issue #18691: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-21 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18691 @cloud-fan @tgravescs I have resolved the merge conflicts for 2.2. This was just related to removing extra configs.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-20 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @tgravescs Thanks for merging this. I have created a PR for 2.2 https://github.com/apache/spark/pull/18691 I had to remove a couple of newer config entries which landed while resolving

[GitHub] spark pull request #18691: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-20 Thread dhruve
GitHub user dhruve opened a pull request: https://github.com/apache/spark/pull/18691 [SPARK-21243][Core] Limit no. of map outputs in a shuffle fetch For configurations with external shuffle enabled, we have observed that if a very large no. of blocks are being fetched from a remote

[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128572125 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -375,6 +390,7 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-20 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @cloud-fan replied to your comments.

[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128573651 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -321,6 +321,17 @@ package object config { .intConf

[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128559318 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -443,12 +459,57 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-20 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r128557269 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -443,12 +459,57 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request #18487: [SPARK-21243][Core] Limit no. of map outputs in a...

2017-07-18 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/18487#discussion_r127998722 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -277,11 +290,13 @@ final class ShuffleBlockFetcherIterator
