Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18231
@vanzin
Thanks a lot for reviewing this. I refined according to your comments.
Please take another look at this when you have time :)
---
If your project is set up for it, you can reply
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18231#discussion_r120809215
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java
---
@@ -209,4 +190,47 @@ private
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18231#discussion_r120808962
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java
---
@@ -209,4 +190,47 @@ private
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18231
@srowen
Thanks a lot for looking into this :)
For example: blockId="shuffle_20_1000_2000" is stored as a `String`,
which costs more than 20 bytes. In this change, it will c
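A rough back-of-the-envelope for that cost, assuming a 64-bit JVM with compressed oops and a UTF-16 `char[]` backing array (Java 8 layout); the constants are illustrative approximations, not part of the PR:

```java
// Rough JVM memory estimate for a blockId kept as a String versus the
// same information packed as three ints. Sizes approximate a 64-bit JVM
// with compressed oops (Java 8, UTF-16 char[]); exact numbers vary by JVM.
public class BlockIdCost {

    // String object: ~16-byte header + hash field + char[] reference, padded
    static final long STRING_OBJECT = 24;
    // char[] object header, before the 2-bytes-per-char payload
    static final long ARRAY_HEADER = 16;

    static long stringBytes(String blockId) {
        return STRING_OBJECT + ARRAY_HEADER + 2L * blockId.length();
    }

    // shuffleId, mapId and reduceId as three plain ints
    static long packedBytes() {
        return 3L * 4;
    }

    public static void main(String[] args) {
        System.out.println(stringBytes("shuffle_20_1000_2000") + " vs " + packedBytes());
    }
}
```

Even with approximate constants, the string form is several times larger than three packed ints, which is the saving the PR is after when many OpenBlocks messages pile up in the shuffle service.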
GitHub user jinxing64 opened a pull request:
https://github.com/apache/spark/pull/18231
[WIP][SPARK-20994] Remove redundant characters in OpenBlocks to save memory
for shuffle service.
## What changes were proposed in this pull request?
In current code, blockIds
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18231
In my cluster, we are suffering from OOM of shuffle-service.
We found that a lot of executors are fetching blocks from a single
shuffle-service. Analyzing the memory, we found
Github user jinxing64 closed the pull request at:
https://github.com/apache/spark/pull/18211
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18211
@vanzin
Thanks a lot for comment. I will close this pr and think if there is other
solution.
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18211
Jenkins, retest this please.
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18211#discussion_r120388659
--- Diff:
common/network-common/src/test/java/org/apache/spark/network/server/OneForOneStreamManagerSuite.java
---
@@ -1,50 +0,0
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18204
Thanks for merging
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18211
In this pr:
1. Instead of `chunkIndex`, fetch chunk by `String chunkId`. Server doesn't
cache the blocks list.
2. In `OpenBlocks`, only metadata (e.g. appId, executorId) of the stream
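The two points above can be sketched as a tiny codec: everything the shuffle service needs rides in the chunk identifier itself, so the server need not cache a per-stream block list. The names and the `/`-separated layout here are hypothetical, not Spark's actual wire format:

```java
// Hypothetical sketch: pack appId, executorId and blockId into the chunk
// identifier so the server can resolve the file lazily per fetch instead
// of caching a blocks list per stream. Not Spark's actual API.
public class ChunkIdCodec {

    static String encode(String appId, String execId, String blockId) {
        return appId + "/" + execId + "/" + blockId;
    }

    // Split back into exactly three parts: {appId, execId, blockId}.
    // The limit of 3 keeps any '/' inside the blockId intact.
    static String[] decode(String chunkId) {
        return chunkId.split("/", 3);
    }

    public static void main(String[] args) {
        String id = encode("app-1", "exec-2", "shuffle_1_2_3");
        System.out.println(java.util.Arrays.toString(decode(id)));
    }
}
```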
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18211
In my cluster, we are suffering from OOM of shuffle-service.
We found that a lot of executors are fetching blocks from a single
shuffle-service. Analyzing the memory, we found
GitHub user jinxing64 opened a pull request:
https://github.com/apache/spark/pull/18211
[WIP][SPARK-20994] Alleviate memory pressure in StreamManager
## What changes were proposed in this pull request?
In current code, chunks are fetched from shuffle service in two steps
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18204
@srowen
Thanks for approving! :-)
GitHub user jinxing64 opened a pull request:
https://github.com/apache/spark/pull/18204
[SPARK-20985] sc.stop should be encapsulated in finally
## What changes were proposed in this pull request?
Stop `SparkContext` in `finally`, so that other tests won't complain that
there's
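The pattern behind SPARK-20985 can be illustrated with a stand-in resource instead of a real `SparkContext` (the names below are invented for this sketch):

```java
// Minimal illustration of the fix: stop() moved into finally, so a
// failing test body cannot leak a running context. FakeContext is a
// stand-in invented for this sketch, not a real SparkContext.
public class StopInFinally {

    static class FakeContext {
        boolean stopped = false;
        void stop() { stopped = true; }
    }

    // Returns the context so callers can verify it was stopped either way.
    static FakeContext runJob(boolean fail) {
        FakeContext sc = new FakeContext();
        try {
            if (fail) {
                throw new RuntimeException("job failed");
            }
        } catch (RuntimeException e) {
            // swallowed for the demo; a real test would let it propagate
        } finally {
            sc.stop();  // always runs, so later tests find no active context
        }
        return sc;
    }

    public static void main(String[] args) {
        System.out.println(runJob(true).stopped && runJob(false).stopped);
    }
}
```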
Github user jinxing64 closed the pull request at:
https://github.com/apache/spark/pull/17533
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/17533
@HyukjinKwon
Sorry, I will close this for now and make another pr if there's progress.
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/17603
Jenkins, retest this please.
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/17603
@squito
Thank you so much :-) :-)
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/17312
Sorry, I will close it for now.
Github user jinxing64 closed the pull request at:
https://github.com/apache/spark/pull/17312
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/17634
@squito
Thanks a lot for merging :)
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18117
@cloud-fan
Thanks a lot for the notification. I think it's a really good change here
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18117#discussion_r118731980
--- Diff:
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
---
@@ -443,34 +445,34 @@ class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18117#discussion_r118731891
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -214,11 +214,12 @@ final class
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan @JoshRosen @mridulm @squito @viirya
Thanks a lot for taking so much time reviewing this patch!
Sorry for the stupid mistakes I made. I will be more careful next time
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r118482160
--- Diff:
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
---
@@ -401,4 +413,64 @@ class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r118424375
--- Diff:
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
---
@@ -401,4 +411,61 @@ class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r118424377
--- Diff: docs/configuration.md ---
@@ -520,6 +520,14 @@ Apart from these, the following properties are also
available, and may be useful
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/17634
Yes, it doesn't fail for sure.
I just think it's fairly straightforward that the partitioner should be
compatible with the number of the child RDD's partitions. I find no reason the
num of partitions
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r118272761
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -175,33 +187,49 @@ final class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r118272605
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -163,6 +170,11 @@ final class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r118272653
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -175,33 +187,49 @@ final class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r118272532
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -287,4 +287,10 @@ package object config {
.bytesConf
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
In current change:
1) remove the partially written file when failing
2) remove all shuffle files in `cleanup()` (this is registered as a task
completion callback)
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan
In current change, the shuffle files are deleted twice:
1). After `ManagedBuffer.release`
2). In `cleanup()`, which is already registered as a task
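Deleting the same file from two paths is only safe if the delete is idempotent; a minimal sketch of such a guard (illustrative names, not Spark's internals; a concurrent version would use `AtomicBoolean.compareAndSet`):

```java
// Guarded delete: the buffer-release path and the task-completion
// callback may both call delete(), but only the first call should touch
// the file system.
public class ShuffleFileCleanup {

    boolean deleted = false;
    int physicalDeletes = 0;  // stand-in counter for actual File.delete()

    void delete() {
        if (!deleted) {        // idempotent: the second caller is a no-op
            deleted = true;
            physicalDeletes++;
        }
    }

    public static void main(String[] args) {
        ShuffleFileCleanup c = new ShuffleFileCleanup();
        c.delete();  // e.g. from ManagedBuffer release
        c.delete();  // e.g. from the task-completion callback
        System.out.println(c.physicalDeletes);
    }
}
```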
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/17634
@jiangxb1987
Thank you so much for taking time looking into this.
Yes, it is not failing in existing code. But I think it's quite
straightforward that the partitioner should be compatible
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
In current change:
1. `ShuffleBlockFetcherIterator` is not a `MemoryConsumer`
2. Name of shuffle file becomes: ${context.taskAttemptId()}-remote-$bId
3. Try to delete all shuffle files
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18031
@cloud-fan
Thanks for merging!
@mridulm @JoshRosen @viirya @HyukjinKwon @wzhfy Thanks a lot for taking
time reviewing this pr !
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan
Yes, thanks a lot for merging #18031
I will update soon !
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18031
In current change:
1. there's only one config: spark.shuffle.accurateBlockThreshold
2. I remove the huge blocks from the numerator in that calculation for
average size
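The averaging change can be sketched as follows; this is a simplified stand-in for the logic in `HighlyCompressedMapStatus`, not the actual Spark code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: blocks at or above spark.shuffle.accurateBlockThreshold are
// recorded exactly, and excluded from the average used for the rest, so
// a few giant blocks cannot inflate the estimate for the small ones.
public class MapStatusSketch {

    // Exact sizes for "huge" blocks, keyed by reduce partition id.
    static Map<Integer, Long> hugeBlocks(long[] sizes, long threshold) {
        Map<Integer, Long> huge = new HashMap<>();
        for (int i = 0; i < sizes.length; i++) {
            if (sizes[i] >= threshold) {
                huge.put(i, sizes[i]);
            }
        }
        return huge;
    }

    // Average over non-empty, non-huge blocks only.
    static long avgOfSmallBlocks(long[] sizes, long threshold) {
        long total = 0;
        int count = 0;
        for (long s : sizes) {
            if (s > 0 && s < threshold) {
                total += s;
                count++;
            }
        }
        return count > 0 ? total / count : 0;
    }
}
```

With sizes {100, 300, 100000} and a threshold of 1000, the huge block is tracked exactly while the average stays at 200 instead of being dragged up to ~33466.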
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18031#discussion_r117620188
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18031#discussion_r117610423
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -121,48 +126,69 @@ private[spark] class CompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18031#discussion_r117610285
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18031#discussion_r117510051
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18031#discussion_r117461089
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18031#discussion_r117460170
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18031
@HyukjinKwon
Thank you so much! Really helpful
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18031
Gentle ping to @JoshRosen @cloud-fan @mridulm
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18031
Jenkins, retest this please
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18031#discussion_r117293425
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18031
I try to give the user a way to control the memory strictly so that no blocks are
underestimated (setting spark.shuffle.accurateBlockThreshold=0 and
spark.shuffle.accurateBlockThresholdByTimesAverage=1
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18031
To resolve the comments in https://github.com/apache/spark/pull/16989 :
>minimum size before we consider something a large block: if average is
10kb, and some blocks are > 20kb, spillin
GitHub user jinxing64 opened a pull request:
https://github.com/apache/spark/pull/18031
Record accurate size of blocks in MapStatus when it's above threshold.
## What changes were proposed in this pull request?
Currently, when the number of reducers is above 2000
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@JoshRosen
Thanks a lot for taking time looking into this pr. I'm reading your
comments carefully.
Yes, I think it's good to integrate with memory manager later.
I will break
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
Checking the code:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/config/ConfigProvider.scala#L59
`SparkConfigProvider` just checks if the key
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
It seems like `SparkConfigProvider` is not checking alternatives in
`SparkConf`. That's why spark.memory.offHeap.enabled is not set (still the
default value), though we've already set
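The missing fallback the comment describes would look roughly like this (a hypothetical helper, not `SparkConfigProvider`'s real implementation):

```java
import java.util.Map;

// Sketch of alternative-key lookup: try the canonical key first, then
// fall back to each deprecated alternative name in order.
public class AltConfig {

    static String get(Map<String, String> conf, String key, String... alts) {
        if (conf.containsKey(key)) {
            return conf.get(key);
        }
        for (String alt : alts) {
            // first matching alternative wins
            if (conf.containsKey(alt)) {
                return conf.get(alt);
            }
        }
        return null;  // caller applies the default value
    }
}
```

Without the fallback loop, a value set only under an old key is silently ignored and the default wins, which matches the symptom described for spark.memory.offHeap.enabled.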
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r117152091
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -278,4 +278,39 @@ package object config
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
Jenkins, retest this please.
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan Thanks, I will refine the documents.
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116929126
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -278,4 +278,39 @@ package object config
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116929076
--- Diff:
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
---
@@ -401,4 +429,146 @@ class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116919856
--- Diff:
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
---
@@ -401,4 +429,146 @@ class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116911964
--- Diff:
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
---
@@ -401,4 +429,146 @@ class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116911537
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ---
@@ -51,7 +59,10 @@ private[spark] class BlockStoreShuffleReader
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116907421
--- Diff:
core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala
---
@@ -401,4 +429,139 @@ class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116907437
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -278,4 +278,10 @@ package object config
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116907401
--- Diff: docs/configuration.md ---
@@ -954,16 +970,16 @@ Apart from these, the following properties are also
available, and may be useful
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116907199
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/MapStatusSuite.scala ---
@@ -128,4 +138,27 @@ class MapStatusSuite extends SparkFunSuite
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan Thanks a lot. I will refine :)
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116674324
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +216,21 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
In current code, `spark.memory.offHeap.enabled` is used when deciding
`tungstenMemoryMode`.
`spark.memory.offHeap.enabled` doesn't decide whether remote blocks are
shuffled to onHeap or offHeap
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116521474
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ---
@@ -51,7 +59,10 @@ private[spark] class BlockStoreShuffleReader
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r116520507
--- Diff: core/src/main/scala/org/apache/spark/memory/MemoryManager.scala
---
@@ -54,7 +54,8 @@ private[spark] abstract class MemoryManager
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
Very gentle ping to @cloud-fan and @mridulm
What do you think about the current change? :)
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan @mridulm
I think it's a good idea to make 2000 configurable. But checking the code,
I'm a little bit hesitant to do that in this pr. I think it's a bigger change and
some related code
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115933206
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -175,33 +193,54 @@ final class
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan
Yes, I think it's a good idea to make `2000` configurable. I will refine.
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
As @mridulm mentioned, in `HighlyCompressedMapStatus` it can be configured
in two respects:
>1. minimum size before we consider something a large block.
>2. The fraction '2' shoul
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115884613
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java
---
@@ -126,4 +150,50 @@ private void
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
Yes, I will refine :)
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115638943
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -128,41 +131,60 @@ private[spark] class CompressedMapStatus
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@mridulm
Really thankful for taking time to look into this pr. Really helpful. I
refined according to your comments. Please take another look when you have time
and give more comments
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115525769
--- Diff:
common/network-common/src/main/java/org/apache/spark/network/buffer/FileSegmentManagedBuffer.java
---
@@ -36,7 +36,7
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115525399
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -128,41 +131,60 @@ private[spark] class CompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115523971
--- Diff:
common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java
---
@@ -95,6 +95,14 @@ public ManagedBuffer
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115523324
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java
---
@@ -126,4 +149,38 @@ private void
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115523103
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -154,15 +164,24 @@ final class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115522973
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -154,15 +164,24 @@ final class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115522902
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -137,6 +146,7 @@ final class ShuffleBlockFetcherIterator
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115522781
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -193,8 +206,18 @@ private[spark] object HighlyCompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115519863
--- Diff: core/src/main/scala/org/apache/spark/memory/MemoryManager.scala
---
@@ -54,7 +54,8 @@ private[spark] abstract class MemoryManager
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115519964
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -128,41 +130,52 @@ private[spark] class CompressedMapStatus
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115518541
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java
---
@@ -126,4 +151,39 @@ private void
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
Jenkins, test this please
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/16989
@cloud-fan
Thank you very much for reviewing this thus far :)
>How about we always fetch to disk if the block size is over
maxBytesInFlight?
I totally agree with this. It's to
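The quoted suggestion reduces to a simple size check per block; a sketch under invented names (Spark's real `ShuffleBlockFetcherIterator` logic is more involved):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the policy quoted above: any single block larger than
// maxBytesInFlight is streamed to disk instead of buffered in memory.
public class FetchPolicy {

    static List<Long> toDisk(long[] blockSizes, long maxBytesInFlight) {
        List<Long> disk = new ArrayList<>();
        for (long size : blockSizes) {
            if (size > maxBytesInFlight) {
                disk.add(size);  // too big to hold in flight: spill to disk
            }
        }
        return disk;
    }
}
```

The appeal of this rule is that it needs no extra configuration: a block that alone exceeds the in-flight byte budget can never be fetched to memory safely anyway.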
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115230064
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -163,6 +173,8 @@ final class ShuffleBlockFetcherIterator
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115229995
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
---
@@ -175,33 +187,45 @@ final class
Github user jinxing64 commented on a diff in the pull request:
https://github.com/apache/spark/pull/16989#discussion_r115229968
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/MapStatusSuite.scala ---
@@ -128,4 +130,22 @@ class MapStatusSuite extends SparkFunSuite