[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 Thanks a lot for comments. I refined accordingly : ) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @mridulm Thanks a lot for comments. I refined accordingly. (btw time complexity of the `rebalance` in `MedianHeap` is O(1)).

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @mridulm Thanks a lot for your comments. I previously ran a test with `TreeSet` on 100k tasks and measured the time spent on insertion. The results are: 372ms, 362ms, 458ms, 429ms, 363ms

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-05 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r104344524 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -754,7 +743,6 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-05 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r104344529 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -754,7 +743,6 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-05 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r104344274 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-05 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r104344072 --- Diff: core/src/test/scala/org/apache/spark/util/collection/MedianHeapSuite.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @squito Yes, some of the machine learning jobs in my cluster that do a cartesian product have more than 100k tasks in the `TaskSetManager`.

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @kayousterhout @squito Thanks a lot for your comments, really helpful :) I really think median heap is a good idea. `slice` is `O(n)` and is not the most efficient. I'm doing
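
A minimal sketch of the median-heap idea mentioned above, assuming task durations are `Long` milliseconds. The class name `MedianHeapSketch` and its methods are hypothetical and are not the `MedianHeap.scala` from the pull request. Two heaps hold the smaller and larger halves of the values, so each insert costs O(log n) and reading the median is O(1), in contrast to sorting or slicing the durations on every check.

```scala
import scala.collection.mutable.PriorityQueue

// Hypothetical sketch; not the MedianHeap.scala from the pull request.
class MedianHeapSketch {
  // Max-heap: its head is the largest value of the smaller half.
  private val smallerHalf = PriorityQueue.empty[Long]
  // Min-heap: its head is the smallest value of the larger half.
  private val largerHalf = PriorityQueue.empty[Long](Ordering[Long].reverse)

  def size: Int = smallerHalf.size + largerHalf.size

  def insert(x: Long): Unit = {
    if (smallerHalf.isEmpty || x <= smallerHalf.head) smallerHalf.enqueue(x)
    else largerHalf.enqueue(x)
    rebalance()
  }

  // Moves at most one element so the two halves differ in size by at most 1.
  private def rebalance(): Unit = {
    if (smallerHalf.size - largerHalf.size > 1) {
      largerHalf.enqueue(smallerHalf.dequeue())
    } else if (largerHalf.size - smallerHalf.size > 1) {
      smallerHalf.enqueue(largerHalf.dequeue())
    }
  }

  def median: Double = {
    require(size > 0, "median of an empty heap is undefined")
    if (smallerHalf.size == largerHalf.size) (smallerHalf.head + largerHalf.head) / 2.0
    else if (smallerHalf.size > largerHalf.size) smallerHalf.head.toDouble
    else largerHalf.head.toDouble
  }
}
```

Usage under these assumptions: inserting the durations 5, 1 and 3 and then calling `median` reads the middle value directly from the heap heads, without re-sorting the inserted values.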

[GitHub] spark pull request #17133: [SPARK-19793] Use clock.getTimeMillis when mark t...

2017-03-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/17133#discussion_r104273530 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -75,6 +75,8 @@ class TaskInfo( } private[spark] def

[GitHub] spark pull request #17133: [SPARK-19793] Use clock.getTimeMillis when mark t...

2017-03-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/17133#discussion_r104161512 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala --- @@ -75,6 +75,8 @@ class TaskInfo( } private[spark] def

[GitHub] spark pull request #17133: [SPARK-19793] Use clock.getTimeMillis when mark t...

2017-03-02 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/17133#discussion_r104066996 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -695,7 +695,8 @@ private[spark] class TaskSetManager( def

[GitHub] spark issue #17133: [SPARK-19793] Use clock.getTimeMillis when mark task as ...

2017-03-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17133 I found this while working on https://github.com/apache/spark/pull/17112, which is for measuring the approach I proposed in https://github.com/apache/spark/pull/16867.

[GitHub] spark pull request #17133: [SPARK-19793] Use clock.getTimeMillis when mark t...

2017-03-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/17133 [SPARK-19793] Use clock.getTimeMillis when mark task as finished in TaskSetManager. ## What changes were proposed in this pull request? TaskSetManager is now using
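
A minimal sketch of why reading time through an injected clock helps, assuming a test wants to control task durations. All names here (`ClockSketch`, `ManualClockSketch`, `TaskTimerSketch`) are hypothetical illustrations and are not the classes changed in the pull request. When both the launch time and the finish time come from the same clock, a manually advanced clock makes the measured duration deterministic.

```scala
// Hypothetical clock abstraction for illustration only.
trait ClockSketch {
  def getTimeMillis(): Long
}

// Production-style clock backed by the wall clock.
class SystemClockSketch extends ClockSketch {
  def getTimeMillis(): Long = System.currentTimeMillis()
}

// Test clock whose time only moves when advance() is called.
class ManualClockSketch(private var now: Long = 0L) extends ClockSketch {
  def getTimeMillis(): Long = now
  def advance(ms: Long): Unit = now += ms
}

// Records launch and finish times from the same injected clock.
class TaskTimerSketch(clock: ClockSketch) {
  private val launchTime: Long = clock.getTimeMillis()
  private var finishTime: Long = 0L

  def markFinished(): Unit = finishTime = clock.getTimeMillis()

  // Only meaningful after markFinished() has been called.
  def duration: Long = finishTime - launchTime
}
```

If `markFinished` read `System.currentTimeMillis()` while the launch time came from a manual clock, the computed duration would mix two unrelated time sources.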

[GitHub] spark issue #17111: [SPARK-19777] Scan runningTasksSet when check speculatab...

2017-03-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17111 @kayousterhout Thanks for merging. (btw, I made some measurements for https://github.com/apache/spark/pull/16867 SPARK-16929, please take a look when you have time :) )

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 I added a measurement for this PR in #17112. Results are as below; newAlgorithm indicates whether we use `TreeSet` to get the median duration or not, and `time cost` is the time used when get

[GitHub] spark issue #17112: Measurement for SPARK-16929.

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17112 The added unit test "Measurement for SPARK-16929." is the measurement. In TaskSetManagerSuite.scala line 1049, if `newAlgorithm=true`, `successfulTaskIdsSet` will be used to get

[GitHub] spark pull request #17112: Measurement for SPARK-16929.

2017-02-28 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/17112 Measurement for SPARK-16929. ## What changes were proposed in this pull request? This PR is not intended for merging. It's a measurement for https://github.com/apache/spark/pull/16867

[GitHub] spark issue #17111: [SPARK-19777] Scan runningTasksSet when check speculatab...

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17111 @squito Thanks a lot :)

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @kayousterhout @squito It's great to open a new jira for this change. Please take a look at https://github.com/apache/spark/pull/17111.

[GitHub] spark issue #17111: [SPARK-19777] Scan runningTasksSet when check speculatab...

2017-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17111 cc @kayousterhout @squito

[GitHub] spark pull request #17111: [SPARK-19777] Scan runningTasksSet when check spe...

2017-02-28 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/17111 [SPARK-19777] Scan runningTasksSet when check speculatable tasks in TaskSetManager. ## What changes were proposed in this pull request? When check speculatable tasks

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-02-27 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r103391138 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -911,14 +916,14 @@ private[spark] class TaskSetManager

[GitHub] spark issue #16989: [WIP][SPARK-19659] Fetch big blocks to disk when shuffle...

2017-02-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16989 @squito I've uploaded a design doc to jira, please take a look when you have time :)

[GitHub] spark issue #16867: [WIP][SPARK-16929] Improve performance when check specul...

2017-02-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @squito Thanks a lot for your comments : ) >When check speculatable tasks in TaskSetManager, current code scan all task infos and sort durations of successful tasks in O(NlogN) t

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-02-21 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16989 @squito Thanks a lot for your comments : ) Yes, there must be a design doc for discussion. I will prepare and post a pdf to jira.

[GitHub] spark pull request #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-20 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16901

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @kayousterhout I'll close since this functionality is already tested.

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16867 @kayousterhout @squito Would you mind taking a look at this when you have time?

[GitHub] spark issue #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16790 @srowen @vanzin Thanks a lot for the work on this ~

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-02-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16989 @vanzin @squito Would you mind taking a look at this when you have time?

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-02-19 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16989 [SPARK-19659] Fetch big blocks to disk when shuffle-read. ## What changes were proposed in this pull request? Currently the whole block is fetched into memory (off heap by default) when

[GitHub] spark issue #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16790 Both `askSync` and `askWithRetry` are blocking; the only difference is the "retry" (3 times by default) when the RPC fails. Callers of this method do not necessarily rely on t
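
A minimal sketch of the difference described above, assuming a blocking RPC is modelled as a plain function `() => T`. The object `AskSketch` and its methods `askOnce` and `askRetrying` are hypothetical stand-ins and are not Spark's `RpcEndpointRef` API. Both variants block the caller; the retrying one simply invokes the call again on failure, up to a default of 3 attempts.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Hypothetical stand-ins; not Spark's RpcEndpointRef methods.
object AskSketch {
  // Blocking call with no retry: a failure propagates to the caller at once.
  def askOnce[T](rpc: () => T): T = rpc()

  // Blocking call that re-invokes the RPC on (non-fatal) failure.
  // attemptsLeft counts total attempts, defaulting to 3.
  @tailrec
  def askRetrying[T](rpc: () => T, attemptsLeft: Int = 3): T =
    Try(rpc()) match {
      case Success(value) => value
      case Failure(e) if attemptsLeft <= 1 => throw e
      case Failure(_) => askRetrying(rpc, attemptsLeft - 1)
    }
}
```

One consequence visible in this sketch is that the retrying variant may run the same call several times, which only matters when the call has side effects on the receiving end.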

[GitHub] spark issue #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16790 https://github.com/apache/spark/pull/16690#discussion_r101616883 causes the build to produce lots of deprecation warnings. @srowen @vanzin What do you think about this?

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-02-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @srowen What do you think about https://github.com/apache/spark/pull/16790?

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout @squito @markhamstra Thanks for all of your work for this patch. Really appreciate your help : )

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 Yes, refined : )

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot. I've refined the comment, please take another look.

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @squito Thanks a lot for your comments. I've refined the comment.

[GitHub] spark pull request #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16901#discussion_r100968529 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2161,6 +2161,48 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @kayousterhout I've refined accordingly. Sorry for the stupid mistake I made. Please take another look at this : )

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout I've refined accordingly, please take another look : )

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-02-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r100953546 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2161,6 +2161,58 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout Thanks a lot for the clear explanation. It makes great sense to me and helps me understand the logic a lot. Also I think the way of testing is very good and makes the code very

[GitHub] spark issue #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16901 @kayousterhout @squito @markhamstra As mentioned in #16620, I think it might make sense to make this pr. Please take a look. If you think it is too trivial, I will close.

[GitHub] spark pull request #16901: [SPARK-19565] Improve DAGScheduler tests.

2017-02-12 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16901 [SPARK-19565] Improve DAGScheduler tests. ## What changes were proposed in this pull request? This is related to #16620. When a fetch fails, the stage will be resubmitted. There can

[GitHub] spark issue #16876: [SPARK-19537] Move pendingPartitions to ShuffleMapStage.

2017-02-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16876 @kayousterhout It's great to give a definition of `pendingPartitions` in `ShuffleMapStage`. May I ask a question to clarify my understanding of `pendingPartitions`? It means

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout Thanks a lot. Sorry for this and I'll be careful in the future.

[GitHub] spark issue #16876: [SPARK-19537] Move pendingPartitions to ShuffleMapStage.

2017-02-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16876 It's great to have pendingPartitions in ShuffleMapStage.

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-02-08 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16867 [SPARK-16929] Improve performance when check speculatable tasks. ## What changes were proposed in this pull request? When check speculatable tasks in `TaskSetManager`, current code scan

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @squito Many thanks for your help. You are such a kind person : )

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout thanks a lot. I'm not sure how to start the unit test automatically, do I have the right to do that? BTW, may I ask a question, what is the proper way to run the unit test

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @markhamstra @squito @kayousterhout It would be great if you could give more comments on the above so that I can continue working on this : )

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 As @squito mentioned: >Before this, the DAGScheduler didn't really know anything about taskSetManagers. (In its current form, this pr uses a "leaked" handle via rootPool.getSorte

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout @squito @markhamstra Thanks a lot for reviewing this pr thus far. I do think the approach, which throws away task results from earlier attempts that were running on executors

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout Thanks a lot for review. I've already refined.

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @squito Thanks a lot for review.

[GitHub] spark issue #16831: [SPARK-19263] Fix race in SchedulerIntegrationSuite.

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16831 @kayousterhout @squito This was originally raised by @squito when reviewing https://github.com/apache/spark/pull/16620. Sorry for my eagerness to make this small PR.

[GitHub] spark pull request #16831: [SPARK-19263] Fix race in SchedulerIntegrationSui...

2017-02-07 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16831 [SPARK-19263] Fix race in SchedulerIntegrationSuite. ## What changes were proposed in this pull request? All the process of offering resource and generating `TaskDescription` should

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @kayousterhout @squito @markhamstra Thanks a lot for the comments. I've already refined accordingly. I still have one concern: > If this is a correct description, I’d ar

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @kayousterhout Thanks a lot again : )

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @kayousterhout Thanks a lot for helping this pr thus far. I think the proposal is quite clear. I've already refined. Please take another look.

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot for helping this PR thus far. I've added a unit test in `DAGSchedulerSuite`, but I'm not sure if it is exactly what you suggested. I created a `mockTaskSchedulerImpl

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 Thanks a lot for the comments. Actually I'm still not sure how to change this log or whether to just remove it. I just think the log is confusing. It is printed out on every FetchFailed. Please give some

[GitHub] spark pull request #16808: [SPARK-19461] Remove some unused imports.

2017-02-05 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16808

[GitHub] spark pull request #16808: [SPARK-19461] Remove some unused imports.

2017-02-05 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16808 [SPARK-19461] Remove some unused imports. ## What changes were proposed in this pull request? Remove some unused imports in `CoarseGrainedSchedulerBackend` and `YarnSchedulerBackend

[GitHub] spark issue #16738: [SPARK-19398] Change one misleading log in TaskSetManage...

2017-02-04 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 I just changed the log message, but I'm not sure if it is clear enough.

[GitHub] spark pull request #16807: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16807

[GitHub] spark pull request #16738: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
GitHub user jinxing64 reopened a pull request: https://github.com/apache/spark/pull/16738 [SPARK-19398] Change one misleading log in TaskSetManager. ## What changes were proposed in this pull request? Log below is misleading: ``` if (successful(index

[GitHub] spark pull request #16807: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16807 [SPARK-19398] Change one misleading log in TaskSetManager. ## What changes were proposed in this pull request? Log below is misleading: ``` if (successful(index

[GitHub] spark pull request #16738: [SPARK-19398] Change one misleading log in TaskSe...

2017-02-04 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16738

[GitHub] spark pull request #16790: [SPARK-19450] Replace askWithRetry with askSync.

2017-02-03 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16790 [SPARK-19450] Replace askWithRetry with askSync. ## What changes were proposed in this pull request? `askSync` is already added in `RpcEndpointRef` (see SPARK-19347 and https

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-02-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @kayousterhout Would you please take a look at this? It would be great if you could help review this : )

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Would you please take another look at this? Please give some advice if possible and I can continue working on this : )

[GitHub] spark issue #16779: [SPARK-19437] Rectify spark executor id in HeartbeatRece...

2017-02-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16779 Thanks a lot for reviewing this.

[GitHub] spark issue #16779: [SPARK-19437] Rectify spark executor id in HeartbeatRece...

2017-02-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16779 @zsxwing Thanks a lot for reviewing this. Not sure why the test doesn't start automatically.

[GitHub] spark pull request #16780: [SPARK-19438] Both reading and updating executorD...

2017-02-02 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/16780

[GitHub] spark issue #16780: [SPARK-19438] Both reading and updating executorDataMap ...

2017-02-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16780 Thanks a lot for looking into this~ @zsxwing You are right. My understanding about this is incorrect. `CoarseGrainedSchedulerBackend: DriverEndpoint` is a `ThreadSafeRpcEndpoint`, thus

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-02-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @srowen Thanks a lot. I'll refine : )

[GitHub] spark pull request #16780: [SPARK-19438] Both reading and updating executorD...

2017-02-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16780 [SPARK-19438] Both reading and updating executorDataMap should be guarded by CoarseGrainedSchedulerBackend.this.synchronized when handle RegisterExecutor. ## What changes were proposed

[GitHub] spark pull request #16779: [SPARK-19437] Rectify spark executor id in Heartb...

2017-02-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16779 [SPARK-19437] Rectify spark executor id in HeartbeatReceiverSuite. ## What changes were proposed in this pull request? The current code in `HeartbeatReceiverSuite`, executorId is set

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 Thanks a lot for reviewing this PR~

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 @srowen @jasonmoore2k Thanks a lot for reviewing this PR~ >Should successful and tasksSuccessful renamed to be completed and tasksCompleted? How do you think about ab

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @vanzin Thanks a lot for helping with this PR~ I've already refined it~ Please take another look~

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot for continuing to review this~ Your comments are very helpful ~ Thank you so much for your help ~~ -when we encounter the condition where there are no pending

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-01-31 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98819685 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1212,8 +1223,9 @@ class DAGScheduler

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 I'm very sorry if this is disturbing : ) @vanzin Thanks a lot for continuing to review this, and I'll be more patient : ) Sorry again~~

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98488916 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -718,6 +703,21 @@ private[spark] class TaskSetManager

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito ping for review~~

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-01-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @vanzin ping for review

[GitHub] spark issue #16738: [SPARK-19398] remove one misleading log in TaskSetManage...

2017-01-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16738 Should `successful` and `tasksSuccessful` be renamed to `completed` and `tasksCompleted`? I think that makes more sense.

[GitHub] spark pull request #16738: [SPARK-19398] remove one misleading log in TaskSe...

2017-01-29 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16738 [SPARK-19398] remove one misleading log in TaskSetManager. ## What changes were proposed in this pull request? Log below is misleading: ``` if (successful(index

[GitHub] spark issue #16690: [SPARK-19347] ReceiverSupervisorImpl can add block to Re...

2017-01-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16690 @vanzin @zsxwing ping for review~

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Could you please take another look at this? : )

[GitHub] spark pull request #16620: [SPARK-19263] DAGScheduler should avoid sending c...

2017-01-26 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16620#discussion_r98043010 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -718,6 +703,21 @@ private[spark] class TaskSetManager

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 Failed to pass the unit tests. I will keep working on this.

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 >hmm, this is a nuisance. I don't see any good way to get rid of this sleep ... but now that I think about it, why can't you do this in DAGSchedulerSuite? it seems like this can be entir

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito Thanks a lot for your comments, they are very helpful. I've already refined the code, please take another look : ) When handling `Success` of `ShuffleMapTask`, what I want

[GitHub] spark pull request #16690: [SPARK-19347] ReceiverSupervisorImpl can add bloc...

2017-01-24 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/16690 [SPARK-19347] ReceiverSupervisorImpl can add block to ReceiverTracker multiple times because of askWithRetry. ## What changes were proposed in this pull request? `ReceiverSupervisorImpl

[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should avoid sending conflict...

2017-01-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @markhamstra Thanks a lot for your comment. I've already refined it, please take another look ~
