[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-06-30 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r124995034 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -126,4 +150,38 @@ private void

[GitHub] spark pull request #18482: [SPARK-21262][WIP] Stop sending 'stream request' ...

2017-06-30 Thread jinxing64
GitHub user jinxing64 reopened a pull request: https://github.com/apache/spark/pull/18482 [SPARK-21262][WIP] Stop sending 'stream request' when shuffle blocks. ## What changes were proposed in this pull request? In current code, client will fetch the remote huge

[GitHub] spark pull request #18482: [SPARK-21262][WIP] Stop sending 'stream request' ...

2017-06-30 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18482 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request #18482: Stop sending 'stream request' when shuffle blocks...

2017-06-30 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18482 Stop sending 'stream request' when shuffle blocks. ## What changes were proposed in this pull request? In current code, client will fetch the remote huge blocks to d

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r124957411 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -95,6 +97,25 @@ public ManagedBuffer

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r124957319 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -126,4 +150,38 @@ private void

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r124952655 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -126,4 +150,38 @@ private void

[GitHub] spark issue #18466: [SPARK-21253][CORE] Disable use DownloadCallback fetch b...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18466 It's a great job! 👍

[GitHub] spark issue #18472: [SPARK-21253][Core]Fix a bug that StreamCallback may not...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18472 Nice catch! LGTM !

[GitHub] spark issue #18466: [SPARK-21253][CORE] Disable use DownloadCallback fetch b...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18466 @wangyum Can you reproduce this? It would be great if you could give more details. I cannot reproduce it locally.

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124795947 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,31 @@ public Properties cryptoConf

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Jenkins, retest this please.

[GitHub] spark issue #18454: [SPARK-21240] Fix code style for constructing and stoppi...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18454 Thanks for merging.

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124749907 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,31 @@ public Properties cryptoConf

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124749254 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,31 @@ public Properties cryptoConf

[GitHub] spark pull request #18446: [SPARK-21236] Make the threshold of using HighlyC...

2017-06-29 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18446

[GitHub] spark issue #18446: [SPARK-21236] Make the threshold of using HighlyCompress...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18446 Sure :)

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark issue #18446: [SPARK-21236] Make the threshold of using HighlyCompress...

2017-06-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18446 True. I am just trying to make it more complete and remove the hardcoded value. `spark.shuffle.accurateBlockThreshold` is useful, but sometimes it's hard for users to tune it ("how low is enough?

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124723237 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,31 @@ public Properties cryptoConf

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124722541 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/PooledByteBufAllocatorWithMetrics.java --- @@ -0,0 +1,69

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark issue #18454: [SPARK-21240] Fix code style for constructing and stoppi...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18454 Jenkins, retest this please.

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Thanks a lot for quick reply. I hope good luck tomorrow :)

[GitHub] spark issue #18454: [SPARK-21240] Fix code style for constructing and stoppi...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18454 Jenkins, retest this please.

[GitHub] spark issue #18454: [SPARK-21240] Fix code style for constructing and stoppi...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18454 @srowen Thanks for the review. And yes, I shouldn't have opened a separate JIRA (I will be careful next time). I checked the code and think all the instances are included. Hope I didn'

[GitHub] spark issue #18446: [SPARK-21236] Make the threshold of using HighlyCompress...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18446 Jenkins, retest this please.

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark pull request #18454: [SPARK-21240] Fix code style for constructing and...

2017-06-28 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18454 [SPARK-21240] Fix code style for constructing and stopping a SparkContext in UT. ## What changes were proposed in this pull request? Same as SPARK-20985. Fix code style for

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Thanks for taking the time to help with this. Maybe I should retry this :)

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 @HyukjinKwon I'm not sure what's going on with the test. I think the failure is unrelated to this PR. Would you mind helping with this?

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124510372 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,31 @@ public Properties cryptoConf

[GitHub] spark issue #18446: [SPARK-21236] Make the threshold of using HighlyCompress...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18446 Jenkins, retest this please.

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @jiangxb1987 Thanks a lot for taking the time to review this PR. I will read your comments very carefully and refine it.

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124502658 --- Diff: core/src/main/scala/org/apache/spark/deploy/ExternalShuffleService.scala --- @@ -54,13 +54,16 @@ class ExternalShuffleService(sparkConf

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124502044 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -90,16 +96,28 @@ protected void

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r124501943 --- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java --- @@ -257,4 +257,31 @@ public Properties cryptoConf

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark issue #18446: [SPARK-21236] Make the threshold of using HighlyCompress...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18446 Yes, this was discussed before in https://github.com/apache/spark/pull/16989 . Only the average size of blocks is stored in `HighlyCompressedMapStatus`. If the driver has enough memory, we can store more

[GitHub] spark issue #18446: [SPARK-21236] Make the threshold of using HighlyCompress...

2017-06-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18446 Jenkins, retest this please.

[GitHub] spark issue #18446: [SPARK-21236] Make the threshold of using HighlyCompress...

2017-06-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18446 Sure, thanks for review :)

[GitHub] spark pull request #18446: [SPARK-21236] Make the threshold of using HighlyC...

2017-06-27 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18446 [SPARK-21236] Make the threshold of using HighlyCompressedStatus configurable. ## What changes were proposed in this pull request? Currently the threshold of using

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-06-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 https://user-images.githubusercontent.com/4058918/27620790-a0a27c8e-5bfe-11e7-8a78-5ebbf4839437.png As the screenshot shows, there are tons of `io.netty.channel.ChannelOutboundBuffer$Entr

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-06-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 >So an alternative to this is limiting the number of blocks each reducer is fetching at once Is it related to `spark.reducer.maxSizeInFlight`? Breaking `OpenBlocks` into m

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-06-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Thanks a lot for the quick reply :) Yes, this patch doesn't guarantee avoiding the OOM on the shuffle service when all reducers are opening the blocks at the same time. But we can alleviate

[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-06-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @vanzin @tgravescs What do you think about this idea?

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 The test fails with the error below, which I think is unrelated. Fetching package metadata: ..SSL verification error: hostname 'conda.binstar.org' doesn't match either o

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 @kiszk Thanks a lot for taking the time to review this. I have no idea why the test failed :(

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark issue #18405: [SPARK-21194][SQL] Fail the putNullmethod when containsN...

2017-06-25 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 Jenkins, retest this please.

[GitHub] spark issue #18405: [SPARK-21194][SQL][WIP] Fail the putNullmethod when cont...

2017-06-24 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 cc @kiszk

[GitHub] spark pull request #18405: [SPARK-21194][SQL][WIP] Fail the putNullmethod wh...

2017-06-23 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18405 [SPARK-21194][SQL][WIP] Fail the putNullmethod when containsNull=false. ## What changes were proposed in this pull request? Currently there's no check for putting null into a `Arra

[GitHub] spark issue #18327: [SPARK-21047] Add test suites for complicated cases in C...

2017-06-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18327 @cloud-fan Thanks for merging :)

[GitHub] spark pull request #18327: [SPARK-21047] Add test suites for complicated cas...

2017-06-23 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18327#discussion_r123690164 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala --- @@ -739,6 +739,157 @@ class ColumnarBatchSuite

[GitHub] spark issue #18327: [SPARK-21047] Add test suites for complicated cases in C...

2017-06-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18327 @cloud-fan Thanks a lot for taking the time to review this :)

[GitHub] spark pull request #18327: [SPARK-21047] Add test suites for complicated cas...

2017-06-22 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18327#discussion_r123499668 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala --- @@ -739,6 +739,157 @@ class ColumnarBatchSuite

[GitHub] spark pull request #18327: [SPARK-21047] Add test suites for complicated cas...

2017-06-22 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18327#discussion_r123473623 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnarBatch.java --- @@ -241,7 +241,40 @@ public MapData getMap(int ordinal

[GitHub] spark issue #18327: [SPARK-21047] Add test suites for complicated cases in C...

2017-06-22 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18327 @kiszk I tried to add a test `Nest Array(containing null) in Array.`. Please take a look when you have time and I will continue working on this :)

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-06-22 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18388 [SPARK-21175] Reject OpenBlocks when memory shortage on shuffle service. ## What changes were proposed in this pull request? A shuffle service can serve blocks from multiple apps/tasks

[GitHub] spark pull request #18327: [SPARK-21047] Add test suites for complicated cas...

2017-06-21 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18327#discussion_r123258919 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala --- @@ -739,6 +739,123 @@ class ColumnarBatchSuite

[GitHub] spark pull request #18239: [SPARK-19462] fix bug in Exchange--pass in a tmp ...

2017-06-21 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18239

[GitHub] spark issue #18249: [SPARK-19937] Collect metrics for remote bytes read to d...

2017-06-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18249 @vanzin Would you mind giving more comments when you have time? Then I can continue working on this :)

[GitHub] spark issue #18327: [SPARK-21047] Add test suites for complicated cases in C...

2017-06-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18327 @kiszk Thank you so much! I will read your comments carefully and refine this PR :)

[GitHub] spark issue #14085: [SPARK-16408][SQL] SparkSQL Added file get Exception: is...

2017-06-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/14085 @zenglinxi0615 This pr is about adding all files in a directory recursively, thus no need to enumerate all the filenames? I think this can be pretty useful especially in production env

[GitHub] spark issue #18343: [SPARK-21133][CORE] Fix HighlyCompressedMapStatus#writeE...

2017-06-18 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18343 Thanks for the ping. If I understand correctly, `HighlyCompressedStatus` is initialized in 2 situations: 1. Creating `MapStatus` during shuffle write when the number of reduce partitions is over 2000

[GitHub] spark issue #18231: [SPARK-20994] Remove redundant characters in OpenBlocks ...

2017-06-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @cloud-fan Thanks for merging !

[GitHub] spark issue #18327: [WIP][SPARK-21047] Add test suites for complicated cases...

2017-06-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18327 @kiszk Would you mind if I gave this JIRA a try?

[GitHub] spark pull request #18327: [SPARK-21047] Add test suites for complicated cas...

2017-06-16 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18327 [SPARK-21047] Add test suites for complicated cases in ColumnarBatchSuite ## What changes were proposed in this pull request? Current ColumnarBatchSuite has very simple test cases for `Array

[GitHub] spark issue #18231: [SPARK-20994] Remove redundant characters in OpenBlocks ...

2017-06-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 Jenkins, retest this please

[GitHub] spark pull request #18231: [SPARK-20994] Remove redundant characters in Open...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r122367121 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,51 @@ private

[GitHub] spark issue #18239: [SPARK-19462] fix bug in Exchange--pass in a tmp "newPar...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18239 @cloud-fan Thanks a lot for the reply. Yes, I'm also hesitant to backport to branch-1.6, but I think this bug is too obvious -- with `spark.sql.adaptive.enabled=true`, any rerunni

[GitHub] spark issue #18231: [SPARK-20994] Remove redundant characters in OpenBlocks ...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @cloud-fan Thanks a lot for taking the time to review this. I refined accordingly :)

[GitHub] spark issue #18231: [SPARK-20994] Remove redundant characters in OpenBlocks ...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @jiangxb1987 Thanks a lot for taking the time to review this PR. More comments are welcome.

[GitHub] spark pull request #18231: [SPARK-20994] Remove redundant characters in Open...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r122240746 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java --- @@ -150,27 +150,20 @@ public void

[GitHub] spark pull request #18231: [SPARK-20994] Remove redundant characters in Open...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r122240056 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,51 @@ private

[GitHub] spark pull request #18231: [SPARK-20994] Remove redundant characters in Open...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r122238486 --- Diff: common/network-shuffle/src/test/java/org/apache/spark/network/sasl/SaslIntegrationSuite.java --- @@ -202,7 +202,7 @@ public void

[GitHub] spark pull request #18231: [SPARK-20994] Remove redundant characters in Open...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r122238244 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,51 @@ private

[GitHub] spark pull request #18231: [SPARK-20994] Remove redundant characters in Open...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r122237095 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,51 @@ private

[GitHub] spark issue #18239: [SPARK-19462] fix bug in Exchange--pass in a tmp "newPar...

2017-06-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18239 In the master branch, there's no such issue. I think the scenario described in the JIRA is a good case, and I will add a test case in the PR. Our production env is based on spark-1.6. So I made th

[GitHub] spark issue #18239: [SPARK-19462] fix bug in Exchange--pass in a tmp "newPar...

2017-06-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18239 Very gentle ping @jiangxb1987 It would be great if you can take a look when you have time.

[GitHub] spark issue #18231: [SPARK-20994] Remove reduant characters in OpenBlocks to...

2017-06-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @vanzin Thanks again for comments :) I refined accordingly, please take another look when you have time.

[GitHub] spark pull request #18231: [SPARK-20994] Remove reduant characters in OpenBl...

2017-06-09 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r121242495 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,51 @@ private

[GitHub] spark pull request #18231: [WIP][SPARK-20994] Remove reduant characters in O...

2017-06-09 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r121241362 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,51 @@ private

[GitHub] spark issue #18239: [SPARK-19462] fix bug in Exchange--pass in a tmp "newPar...

2017-06-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18239 @jiangxb1987 would you mind taking a look at this?

[GitHub] spark pull request #18249: [WIP][SPARK-19937] Collect metrics for remote byt...

2017-06-09 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18249 [WIP][SPARK-19937] Collect metrics for remote bytes read to disk during shuffle. In the current code (https://github.com/apache/spark/pull/16989), big blocks are shuffled to disk. This pr

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @srowen I ran a test to verify this patch. I wrapped a number of blocks inside `OpenBlocks` and sent it to `ExternalShuffleBlockHandler`. With this change, it cost about 133M in the

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 Yes, I think it would be great to run some tests and provide good evidence.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 Nothing references `msg` anywhere, right? I guess `msg` will be garbage collected without trouble.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 The blockIds cannot be freed because they are referenced in the iterator. In current change they are not. We reference the mapIdAndReduceIds instead. Thus the blockIds in OpenBlocks can be
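The idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class name `CompactBlockIds` is hypothetical, and it assumes every block ID has the form `shuffle_<shuffleId>_<mapId>_<reduceId>` with a single shared shuffle ID. The iterator keeps only a flat `int[]` of map and reduce IDs, so the original `String` block IDs are no longer referenced once the constructor returns.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical sketch: an iterator that holds ints instead of block ID strings. */
public class CompactBlockIds implements Iterator<String> {
  private final int shuffleId;
  // Flattened pairs: [mapId0, reduceId0, mapId1, reduceId1, ...]
  private final int[] mapIdAndReduceIds;
  private int index = 0;

  public CompactBlockIds(String[] blockIds) {
    if (blockIds.length == 0) {
      throw new IllegalArgumentException("At least one block ID is required");
    }
    int firstShuffleId = -1;
    mapIdAndReduceIds = new int[2 * blockIds.length];
    for (int i = 0; i < blockIds.length; i++) {
      // Assumed format: shuffle_<shuffleId>_<mapId>_<reduceId>
      String[] parts = blockIds[i].split("_");
      if (parts.length != 4 || !parts[0].equals("shuffle")) {
        throw new IllegalArgumentException("Unexpected block ID: " + blockIds[i]);
      }
      int currentShuffleId = Integer.parseInt(parts[1]);
      if (i == 0) {
        firstShuffleId = currentShuffleId;
      } else if (currentShuffleId != firstShuffleId) {
        throw new IllegalArgumentException("Mixed shuffle IDs: " + blockIds[i]);
      }
      mapIdAndReduceIds[2 * i] = Integer.parseInt(parts[2]);
      mapIdAndReduceIds[2 * i + 1] = Integer.parseInt(parts[3]);
    }
    shuffleId = firstShuffleId;
    // From here on, only the int array is retained; the strings can be collected.
  }

  @Override
  public boolean hasNext() {
    return index < mapIdAndReduceIds.length;
  }

  @Override
  public String next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    // Rebuild the block ID on demand from the compact representation.
    String id = "shuffle_" + shuffleId + "_"
        + mapIdAndReduceIds[index] + "_" + mapIdAndReduceIds[index + 1];
    index += 2;
    return id;
  }
}
```

Each pending block then costs two ints (8 bytes) in the array instead of a retained `String` object, which is where a saving would come from when millions of block IDs are outstanding.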

[GitHub] spark pull request #18231: [WIP][SPARK-20994] Remove reduant characters in O...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r120845706 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,52 @@ private

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 I mean the blockIds in `OpenBlocks`; they are referenced by the iterator.

[GitHub] spark pull request #18231: [WIP][SPARK-20994] Remove reduant characters in O...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18231#discussion_r120844431 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockHandler.java --- @@ -209,4 +190,52 @@ private

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @srowen Sorry, I didn't make it clear. 1. In current code, all blockIds are stored in the iterator. They are released only when the iterator is traversed. 2. Now I change the `Strin

[GitHub] spark pull request #17276: [WIP][SPARK-19937] Collect metrics of block sizes...

2017-06-08 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/17276

[GitHub] spark issue #17276: [WIP][SPARK-19937] Collect metrics of block sizes when s...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17276 @mridulm @squito Thanks a lot for taking the time to review this PR. I will close it for now and open another one if there is progress.

[GitHub] spark issue #18239: [SPARK-19462] fix bug in Exchange--pass in a tmp "newPar...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18239 I'm not sure if it is appropriate to make this PR and backport it to 1.6. It would be great if someone could take some time to review this.

[GitHub] spark pull request #18239: [SPARK-19462] fix bug in Exchange--pass in a tmp ...

2017-06-08 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/18239 [SPARK-19462] fix bug in Exchange--pass in a tmp "newPartitioning" in "prepareShuffleDependency" When `spark.sql.adaptive.enabled` is true, any rerunning of ancestors of `

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 Actually it's more than 12 bytes. Yes, there are millions of these. In my heap dump, it's 1.5 G
