[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter's readingItera...

2018-11-24 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r236060023 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala --- @@ -727,9 +727,10 @@ private[spark] class ExternalSorter[K, V, C

[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-23 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r235955118 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala --- @@ -727,9 +727,10 @@ private[spark] class ExternalSorter[K, V, C

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-23 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/23083 > For the task completion listener, I think it's an overkill to introduce a new API, do you know where exactly we leak the memory? and can we null it out when the ShuffleBlockFetcherItera

[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r235357628 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -72,7 +73,8 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r235357328 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -99,6 +99,13 @@ private[spark] class TaskContextImpl

[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-21 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/23083 And another thing: > P.S. This PR does not cover cases with CoGroupedRDDs which use ExternalAppendOnlyMap internally, which itself can lead to OutOfMemoryErrors in many places. So do

[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r235300436 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -72,7 +73,8 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r235299353 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -99,6 +99,13 @@ private[spark] class TaskContextImpl

[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r235299656 --- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala --- @@ -103,11 +116,26 @@ private[spark] class

[GitHub] spark pull request #23083: [SPARK-26114][CORE] ExternalSorter Leak

2018-11-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/23083#discussion_r235297396 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -127,9 +127,21 @@ abstract class TaskContext extends Serializable

[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-14 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/23040 cc @ericl and @JoshRosen, this bug was introduced by https://github.com/apache/spark/pull/14099/files After loosing empty chunk check, the ChunkedByteBufferInputStream doesn't handle

[GitHub] spark issue #21445: [SPARK-24404][SS] Increase currentEpoch when meet a Epoc...

2018-05-30 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21445 > I think the best way to do it is to make the shuffle writer responsible for incrementing the epoch within its task, the same way the data source writer does currently. Y

[GitHub] spark pull request #21397: [SPARK-24334] Fix race condition in ArrowPythonRu...

2018-05-22 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21397#discussion_r189973566 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -94,8 +88,18 @@ class ArrowPythonRunner

[GitHub] spark issue #21369: [SPARK-22713][CORE] ExternalAppendOnlyMap leaks when spi...

2018-05-22 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21369 > @advancedxy , using jvisualvm+heap dump I could see that the second introduced test case ("drop all references to the underlying map once the iterator is exhausted") eliminated a

[GitHub] spark pull request #21399: [SPARK-22269][BUILD] Run Java linter via SBT for ...

2018-05-22 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21399#discussion_r189963174 --- Diff: project/SparkBuild.scala --- @@ -740,6 +741,16 @@ object Unidoc { ) } +object CheckStyle { --- End diff

[GitHub] spark pull request #21399: [SPARK-22269][BUILD] Run Java linter via SBT for ...

2018-05-22 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21399#discussion_r189960217 --- Diff: project/SparkBuild.scala --- @@ -740,6 +741,16 @@ object Unidoc { ) } +object CheckStyle { --- End diff

[GitHub] spark pull request #21369: [SPARK-22713][CORE] ExternalAppendOnlyMap leaks w...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21369#discussion_r189768075 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -305,8 +310,8 @@ class ExternalAppendOnlyMap[K, V, C

[GitHub] spark pull request #21369: [SPARK-22713][CORE] ExternalAppendOnlyMap leaks w...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21369#discussion_r189768116 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -585,17 +591,24 @@ class ExternalAppendOnlyMap[K, V

[GitHub] spark pull request #21369: [SPARK-22713][CORE] ExternalAppendOnlyMap leaks w...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21369#discussion_r189768301 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -630,7 +643,7 @@ private[spark] object

[GitHub] spark pull request #21369: [SPARK-22713][CORE] ExternalAppendOnlyMap leaks w...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21369#discussion_r189768335 --- Diff: core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala --- @@ -23,8 +23,9 @@ import org.apache.spark

[GitHub] spark pull request #21369: [SPARK-22713][CORE] ExternalAppendOnlyMap leaks w...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21369#discussion_r189768271 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala --- @@ -585,17 +591,24 @@ class ExternalAppendOnlyMap[K, V

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-05-21 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r189530671 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -1868,15 +1868,26 @@ class DAGSchedulerSuite extends

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r189530510 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -1868,15 +1868,26 @@ class DAGSchedulerSuite extends

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-21 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r189525864 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -1868,15 +1868,26 @@ class DAGSchedulerSuite extends

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-05-16 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 Gently ping @cloud-fan again. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...

2018-05-16 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21327#discussion_r188532109 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks

[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...

2018-05-16 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21327#discussion_r188525716 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-14 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r188005420 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -287,6 +287,28 @@ private[spark] class Executor( notifyAll

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-05-14 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 Looks like that simply add fields with default values into case class will break binary compatibility. How should we deal with that? Add to MimaExcludes or add missing methods? @cloud-fan

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-14 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r187984186 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -287,6 +287,28 @@ private[spark] class Executor( notifyAll

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-09 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r187074691 --- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala --- @@ -212,9 +212,15 @@ case object TaskResultLost extends TaskFailedReason

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-05-08 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 ping @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-07 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r186468071 --- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala --- @@ -212,9 +212,15 @@ case object TaskResultLost extends TaskFailedReason

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-02 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r185690094 --- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala --- @@ -212,9 +212,15 @@ case object TaskResultLost extends TaskFailedReason

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-05-01 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r185226136 --- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala --- @@ -212,9 +212,15 @@ case object TaskResultLost extends TaskFailedReason

[GitHub] spark pull request #21165: [Spark-20087][CORE] Attach accumulators / metrics...

2018-04-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r184840246 --- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala --- @@ -212,9 +212,15 @@ case object TaskResultLost extends TaskFailedReason

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-04-27 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 @jiangxb1987 @cloud-fan I think it's ready for review. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > We should not do these 2 things together, and to me the second one is way simpler to get in and we should do it first. Agreed. For the scope of this pr, let's get killed task

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > However, I don't agree user side accumulators should get updates from killed tasks, that changes the semantic of accumulators. And I don't think end-users need to care about killed ta

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 I add a note for accumulator update. Please comment if more document is needed. --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #21165: [Spark-20087][CORE] Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > It should be [Spark-20087] instead of [Spark 20087] in the title. --- - To unsubscribe, e-mail: reviews-unsub

[GitHub] spark issue #21165: [Spark 20087][CORE] Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 > we should document the changes in a migration document or something, I think documentation is necessary, will update the documentation tomorrow (Beijing t

[GitHub] spark pull request #21165: Spark 20087: Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r184441076 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1417,10 +1417,13 @@ class DAGScheduler( case

[GitHub] spark pull request #21165: Spark 20087: Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r184440128 --- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala --- @@ -212,9 +212,15 @@ case object TaskResultLost extends TaskFailedReason

[GitHub] spark pull request #21165: Spark 20087: Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21165#discussion_r184429082 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -287,6 +287,28 @@ private[spark] class Executor( notifyAll

[GitHub] spark issue #21165: Spark 20087: Attach accumulators / metrics to 'TaskKille...

2018-04-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/21165 cc @squito @jiangxb1987 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #21165: Spark 20087: Attach accumulators / metrics to 'Ta...

2018-04-26 Thread advancedxy
GitHub user advancedxy opened a pull request: https://github.com/apache/spark/pull/21165 Spark 20087: Attach accumulators / metrics to 'TaskKilled' end reason ## What changes were proposed in this pull request? The ultimate goal is for listeners to onTaskEnd to receive metrics

[GitHub] spark pull request #18492: [SPARK-19326] Speculated task attempts do not get...

2018-04-16 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/18492#discussion_r181706342 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -373,8 +373,14 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request #20993: [SPARK-23881][CORE][TEST] Fix flaky test JobCance...

2018-04-06 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20993#discussion_r179794288 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -349,36 +350,39 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark issue #20993: [SPARK-23881][CORE][TEST] Fix flaky test JobCancellation...

2018-04-06 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20993 @jiangxb1987 can you post a Jenkins job link referring the flaky test? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-03-05 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r172416019 --- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala --- @@ -104,9 +104,16 @@ private[spark] class

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-03-05 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 @cloud-fan is it possible that we also merge this into branch-2.3, so this fix could be released in the Spark-2.3.1

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-03-03 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 ping @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-03-01 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r171489737 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -40,6 +41,10 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-02-28 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 > I'm not sure, let's just try it :) All right, I finally tracked down why it's hanging on Jenkins. The global semaphores used by `interruptible iterator of shuffle rea

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r171152501 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,63 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170847138 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,58 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170844335 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,58 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170844519 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,58 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170844540 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,58 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170844554 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,58 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170844986 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,58 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170822587 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,63 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170821804 --- Diff: core/src/test/scala/org/apache/spark/JobCancellationSuite.scala --- @@ -320,6 +321,63 @@ class JobCancellationSuite extends SparkFunSuite

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-02-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 @cloud-fan I have update the comments and fixed style issues(previously was auto formatted by IntelliJ) --- - To unsubscribe

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-25 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r170457215 --- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala --- @@ -104,9 +104,16 @@ private[spark] class

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-02-08 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 @jerryshao @cloud-fan I have updated my code. Do you have any other concerns? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-02-08 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 > I was wondering do we actually meet this issue in production envs, @jerryshao I met this issue in our production when I was debugging a Spark job. I noticed the aborted stage's t

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-02-08 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 Hi, @jerryshao I didn't see exception. But the issue is: When the stage is abort and all the remaining tasks are killed, those tasks are not cancelled but rather continue running which

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-02-08 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r166864792 --- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala --- @@ -104,9 +104,18 @@ private[spark] class

[GitHub] spark issue #17422: [SPARK-20087] Attach accumulators / metrics to 'TaskKill...

2018-02-04 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/17422 All right then, I will take it over. Of course the credit should go to @noodle-fb. We can discuss whether this behaviour is desirable or not in the JIRA or the new PR

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-02-04 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 ping @cloud-fan and @jiangxb1987. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-01-31 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20449#discussion_r165066153 --- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala --- @@ -104,9 +104,18 @@ private[spark] class

[GitHub] spark issue #20449: [SPARK-23040][CORE]: Returns interruptible iterator for ...

2018-01-31 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20449 ping @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #20449: [SPARK-23040][CORE]: Returns interruptible iterat...

2018-01-30 Thread advancedxy
GitHub user advancedxy opened a pull request: https://github.com/apache/spark/pull/20449 [SPARK-23040][CORE]: Returns interruptible iterator for shuffle reader ## What changes were proposed in this pull request? Before this commit, a non-interruptible iterator is returned

[GitHub] spark issue #17422: [SPARK-20087] Attach accumulators / metrics to 'TaskKill...

2018-01-25 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/17422 @noodle-fb are you still working on this? If not, I may work on it based on your current impl. I am facing same issue here. The accumulator updates are lost for killed tasks

[GitHub] spark pull request #20178: [Spark-22952][CORE] Deprecate stageAttemptId in f...

2018-01-09 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20178#discussion_r160587943 --- Diff: core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala --- @@ -56,6 +56,8 @@ class StageInfo( completionTime = Some

[GitHub] spark pull request #20178: [Spark-22952][CORE] Deprecate stageAttemptId in f...

2018-01-08 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20178#discussion_r160170753 --- Diff: core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala --- @@ -56,6 +56,8 @@ class StageInfo( completionTime = Some

[GitHub] spark pull request #20178: [Spark-22952][CORE] Deprecate stageAttemptId in f...

2018-01-07 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20178#discussion_r160077159 --- Diff: core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala --- @@ -56,6 +56,8 @@ class StageInfo( completionTime = Some

[GitHub] spark pull request #20178: [Spark-22952][CORE] Deprecate stageAttemptId in f...

2018-01-07 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20178#discussion_r160076462 --- Diff: core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala --- @@ -56,6 +56,8 @@ class StageInfo( completionTime = Some

[GitHub] spark issue #20178: [Spark-22952][CORE] Deprecate stageAttemptId in favour o...

2018-01-07 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20178 cc @cloud-fan @zsxwing and @squito I only included the changes related to StageInfo's deprecated getter. It would involve too much changes if we want to replace attemptId

[GitHub] spark issue #20178: [Spark-22952][CORE] Deprecate stageAttemptId in favour o...

2018-01-07 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20178 Please add me to whitelist... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #20178: [Spark-22952][CORE] Deprecate stageAttemptId in f...

2018-01-07 Thread advancedxy
GitHub user advancedxy opened a pull request: https://github.com/apache/spark/pull/20178 [Spark-22952][CORE] Deprecate stageAttemptId in favour of stageAttemptNumber ## What changes were proposed in this pull request? 1. Deprecate attemptId in StageInfo and add `def

[GitHub] spark issue #20082: [SPARK-22897][CORE]: Expose stageAttemptId in TaskContex...

2018-01-03 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20082 Of course. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2018-01-02 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r159227953 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -79,6 +79,7 @@ private[spark] abstract class Task[T

[GitHub] spark issue #20082: [SPARK-22897][CORE]: Expose stageAttemptId in TaskContex...

2018-01-01 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20082 @cloud-fan Please take another look. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-29 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r159038326 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -150,6 +150,13 @@ abstract class TaskContext extends Serializable

[GitHub] spark issue #20082: [SPARK-22897][CORE]: Expose stageAttemptId in TaskContex...

2017-12-28 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20082 ping @cloud-fan @jiangxb1987 @zsxwing, I think it's ready for merging. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158898080 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -150,6 +150,11 @@ abstract class TaskContext extends Serializable

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158897767 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -42,6 +42,7 @@ import org.apache.spark.util._ */ private[spark

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158894808 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -42,6 +42,7 @@ import org.apache.spark.util._ */ private[spark

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158774364 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskContextSuite.scala --- @@ -158,6 +159,28 @@ class TaskContextSuite extends SparkFunSuite

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158773445 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskContextSuite.scala --- @@ -158,6 +159,28 @@ class TaskContextSuite extends SparkFunSuite

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158721239 --- Diff: project/MimaExcludes.scala --- @@ -95,7 +95,10 @@ object MimaExcludes { // [SPARK-21087] CrossValidator, TrainValidationSplit

[GitHub] spark issue #20082: [SPARK-22897][CORE]: Expose stageAttemptId in TaskContex...

2017-12-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20082 >Please explain in the JIRA/PR description about your use case that need to access the stageAttemptId in the executor - we must provide a very good reason to make such a change to a pub

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-26 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158699877 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -150,6 +150,10 @@ abstract class TaskContext extends Serializable

[GitHub] spark issue #20082: [SPARK-22897][CORE]: Expose stageAttemptId in TaskContex...

2017-12-26 Thread advancedxy
Github user advancedxy commented on the issue: https://github.com/apache/spark/pull/20082 cc @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-26 Thread advancedxy
GitHub user advancedxy opened a pull request: https://github.com/apache/spark/pull/20082 [SPARK-22897][CORE]: Expose stageAttemptId in TaskContext ## What changes were proposed in this pull request? stageAttemptId added in TaskContext and corresponding construction modification

[GitHub] spark pull request: [SPARK-6030][CORE] Using simulated field layou...

2015-05-01 Thread advancedxy
Github user advancedxy commented on the pull request: https://github.com/apache/spark/pull/4783#issuecomment-98257433 @JoshRosen @srowen, I already tested in the test suite. Just want to make sure if I don't miss anything obvious. Here is the standalone reproduce sample. https

  1   2   >