[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-11 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/22371 OK, thanks everyone for the help. Close it --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #22371: [SPARK-25386][CORE] Don't need to synchronize the...

2018-09-11 Thread ConeyLiu
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/22371 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/22371 @squito, thanks for the review. I previously intended to use `ConcurrentHashMap[Int, AtomicReferenceArray]`. After rethinking the code, I see that the lock here is used to prevent the same

[GitHub] spark pull request #22371: [SPARK-25386][CORE] Don't need to synchronize the...

2018-09-10 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/22371#discussion_r216304910 --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala --- @@ -51,6 +52,8 @@ private[spark] class

[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/22371 Thanks @felixcheung, @srowen, @cloud-fan for your time. There is only one instance of `IndexShuffleBlockResolver` per executor, and the synchronization is there to protect modifications when

[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-09 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/22371 @cloud-fan @jiangxb1987 Could you help to review this? Thanks a lot.

[GitHub] spark pull request #22371: [SPARK-25386][CORE] Don't need to synchronize the...

2018-09-09 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/22371 [SPARK-25386][CORE] Don't need to synchronize the IndexShuffleBlockResolver for each writeIndexFileAndCommit ## What changes were proposed in this pull request? Now, we need

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Don't need shuffle exchange wi...

2018-08-26 Thread ConeyLiu
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/20844

[GitHub] spark issue #20844: [SPARK-23707][SQL] Don't need shuffle exchange with sing...

2018-08-26 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20844 Thanks, all. Closing it.

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Don't need shuffle exchange wi...

2018-03-22 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r176327636 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -348,6 +348,13 @@ case class RangeExec(range

[GitHub] spark issue #20844: [SPARK-23707][SQL] Don't need shuffle exchange with sing...

2018-03-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20844 This change is very simple and just makes it consistent with other `LeafNode`s.

[GitHub] spark issue #20844: [SPARK-23707][SQL] No shuffle exchange with single parti...

2018-03-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20844 @cloud-fan, pls take a look, thanks a lot.

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Fresh 'initRange' name to avoi...

2018-03-20 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r175966108 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -396,9 +396,11 @@ case class RangeExec(range

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Fresh 'initRange' name to avoi...

2018-03-19 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r175658889 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -396,9 +396,11 @@ case class RangeExec(range

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Fresh 'initRange' name to avoi...

2018-03-19 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r175634287 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -396,9 +396,11 @@ case class RangeExec(range

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Fresh 'initRange' name to avoi...

2018-03-18 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20844#discussion_r175315224 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -396,9 +396,11 @@ case class RangeExec(range

[GitHub] spark issue #20844: [SPARK-23707][SQL] Fresh 'initRange' name to avoid metho...

2018-03-16 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20844 @cloud-fan pls take a look, this is a small change. Thanks a lot.

[GitHub] spark pull request #20844: [SPARK-23707][SQL] Fresh 'initRange' name to avoi...

2018-03-16 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/20844 [SPARK-23707][SQL] Fresh 'initRange' name to avoid method name conflicts ## What changes were proposed in this pull request? We should call `ctx.freshName` to get the `initRange` to avoid name

[GitHub] spark issue #20676: [SPARK-23516][CORE] It is unnecessary to transfer unroll...

2018-02-27 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20676 Yeah, I see that. I'm not sure it's OK to change. But I think we should follow the interface design, not the underlying implementation

[GitHub] spark pull request #20676: [SPARK-23516][CORE] It is unnecessary to transfer...

2018-02-27 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20676#discussion_r171115071 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -246,18 +246,18 @@ private[spark] class MemoryStore

[GitHub] spark issue #20676: [SPARK-23516][CORE] It is unnecessary to transfer unroll...

2018-02-27 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20676 This is for compatibility reasons. Memory management also supports the legacy mode (`StaticMemoryManager`). In `StaticMemoryManager`, the storage memory and unroll memory are managed

[GitHub] spark pull request #20461: [SPARK-23289][CORE]OneForOneBlockFetcher.Download...

2018-01-31 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20461#discussion_r165246022 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -171,7 +171,9 @@ private void

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-26 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19285 thanks all.

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-24 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163768689 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -232,78 +236,93 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-24 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163743072 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -346,85 +350,24 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-24 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163551992 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -346,85 +350,24 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-24 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163551817 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -702,6 +645,83 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-23 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163462053 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -233,17 +235,13 @@ private[spark] class MemoryStore

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-23 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19285 Thanks for your valuable suggestion, the code has been updated.

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-22 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163131741 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -162,26 +162,29 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-22 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163131519 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -162,26 +162,29 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-22 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r163131383 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -233,17 +235,13 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162840896 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -162,26 +162,29 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162840776 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -162,26 +162,29 @@ private[spark] class MemoryStore

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/20026

[GitHub] spark issue #20026: [SPARK-22838][Core] Avoid unnecessary copying of data

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20026 Closing it. Thanks, everyone.

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r162810684 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -152,7 +153,7 @@ private class DiskBlockData( file: File

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r162803175 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -152,7 +153,7 @@ private class DiskBlockData( file: File

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19285 Thanks for reviewing. The code has been updated; please help review it. Thanks again.

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162802949 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -261,37 +263,93 @@ private[spark] class MemoryStore

[GitHub] spark issue #20026: [SPARK-22838][Core] Avoid unnecessary copying of data

2017-12-26 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20026 cc @jiangxb1987 any comments on this?

[GitHub] spark issue #20026: [SPARK-22838][Core] Avoid unnecessary copying of data

2017-12-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20026 I'll update it tomorrow.

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2017-12-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r158279099 --- Diff: core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java --- @@ -61,6 +61,7 @@ private boolean refill() throws IOException

[GitHub] spark issue #20026: [SPARK-22838][Core] Avoid unnecessary copying of data

2017-12-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20026 It seems the error is not related. Also, can you add me to the whitelist?

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2017-12-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r158243377 --- Diff: core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java --- @@ -61,6 +61,7 @@ private boolean refill() throws IOException

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2017-12-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r158220107 --- Diff: core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java --- @@ -91,7 +92,12 @@ public synchronized int read(byte[] b, int offset

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2017-12-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r158220052 --- Diff: core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java --- @@ -61,6 +61,7 @@ private boolean refill() throws IOException

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2017-12-20 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r158181658 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -208,7 +209,7 @@ private class EncryptedBlockData( conf: SparkConf

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2017-12-19 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/20026#discussion_r157938250 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -208,7 +209,7 @@ private class EncryptedBlockData( conf: SparkConf

[GitHub] spark issue #20026: [SPARK-22838][Core] Avoid unnecessary copying of data

2017-12-19 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/20026 @cloud-fan Please take a look, thanks a lot.

[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...

2017-12-19 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/20026 [SPARK-22838][Core] Avoid unnecessary copying of data ## What changes were proposed in this pull request? If we read data from FileChannel to HeapByteBuffer, there is a need to copy

[GitHub] spark issue #19735: [MINOR][CORE] Using bufferedInputStream for dataDeserial...

2017-11-13 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19735 thanks @jerryshao @srowen.

[GitHub] spark issue #19735: [MINOR][CORE] Using bufferedInputStream for dataDeserial...

2017-11-12 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19735 @srowen Could you take a look?

[GitHub] spark pull request #19735: [MINOR][CORE] Using bufferedInputStream for dataD...

2017-11-12 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/19735 [MINOR][CORE] Using bufferedInputStream for dataDeserializeStream ## What changes were proposed in this pull request? Small fix. Using bufferedInputStream for dataDeserializeStream

[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...

2017-11-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19661 thanks everyone.

[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...

2017-11-10 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19661#discussion_r150173067 --- Diff: core/src/test/scala/org/apache/spark/serializer/KryoSerializerSuite.scala --- @@ -108,6 +108,27 @@ class KryoSerializerSuite extends

[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...

2017-11-10 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19661#discussion_r150172493 --- Diff: core/src/test/scala/org/apache/spark/serializer/KryoSerializerSuite.scala --- @@ -108,6 +108,27 @@ class KryoSerializerSuite extends

[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...

2017-11-09 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19661#discussion_r150157116 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -178,6 +179,28 @@ class KryoSerializer(conf: SparkConf

[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...

2017-11-08 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19661#discussion_r149846210 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -178,10 +178,40 @@ class KryoSerializer(conf: SparkConf

[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...

2017-11-07 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19661#discussion_r149553694 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -178,10 +178,40 @@ class KryoSerializer(conf: SparkConf

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2017-11-06 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19285 It's updated. Thanks a lot.

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-11-06 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 OK, thanks a lot.

[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...

2017-11-06 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19661 Thanks for reviewing. The code is updated.

[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...

2017-11-05 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19661 >So why don't you include some classes such as org.apache.spark.ml.feature.Instance ? I'm not familiar with those algorithms; I can add classes such as `org.apache.spark.ml.feature.Insta

[GitHub] spark issue #19586: [SPARK-22367][WIP][CORE] Separate the serialization of c...

2017-11-05 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 Thanks for the suggestion. I opened a new PR to solve this problem, so I am closing this one now.

[GitHub] spark pull request #19586: [SPARK-22367][WIP][CORE] Separate the serializati...

2017-11-05 Thread ConeyLiu
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/19586

[GitHub] spark issue #19661: [SPARK-22450][Core][Mllib]safely register class for mlli...

2017-11-05 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19661 #19586

[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...

2017-11-05 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/19661 [SPARK-22450][Core][Mllib]safely register class for mllib ## What changes were proposed in this pull request? There are still some algorithms based on mllib, such as KMeans. For now

[GitHub] spark issue #19586: [SPARK-22367][WIP][CORE] Separate the serialization of c...

2017-11-03 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 Hi @cloud-fan, @jerryshao. The problem with `writeClass` and `readClass` can be solved by registering the classes: Vector, DenseVector, SparseVector. The following are the test results: ```scala val

[GitHub] spark issue #19586: [SPARK-22367][WIP][CORE] Separate the serialization of c...

2017-11-02 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 OK, I can understand your concern. There is a huge GC problem for the K-means workload; it accounts for about 10-20%. The source data is cached in memory, and performance is even worse when

[GitHub] spark issue #19586: [SPARK-22367][WIP][CORE] Separate the serialization of c...

2017-11-01 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 Hi @cloud-fan, in most cases the data type should be the same, so I think this optimization is valuable because it can save considerable space and CPU resources. What about setting a flag

[GitHub] spark issue #19586: [SPARK-22367][WIP][CORE] Separate the serialization of c...

2017-11-01 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 Currently, I use it directly. Maybe this is suitable for special cases where all the data has the same type, such as ML or similar

[GitHub] spark issue #19586: [SPARK-22367][WIP][CORE] Separate the serialization of c...

2017-10-31 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 Hi @jerryshao, Thanks for the reminder, it doesn't support it. I'm sorry I did not take that into account. How about using configuration to determine whether we should use `SerializerInstance

[GitHub] spark issue #19586: [SPARK-22367][WIP][CORE] Separate the serialization of c...

2017-10-30 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 Hi @cloud-fan, thanks for reviewing. There are some errors related to `UnsafeShuffleWrite` that still need to be fixed. I am not familiar with this code, so I need some time

[GitHub] spark pull request #19586: [SPARK-22367][WIP][CORE] Separate the serializati...

2017-10-30 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19586#discussion_r147709649 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -205,11 +205,45 @@ class KryoSerializationStream

[GitHub] spark pull request #19586: [SPARK-22367][CORE] Separate the serialization of...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19586#discussion_r147371400 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -376,7 +382,17 @@ private[spark] class MemoryStore

[GitHub] spark issue #19586: [SPARK-22367][CORE] Separate the serialization of class ...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 @srowen Thanks for reviewing. What do you mean here? > I'm trying to think if there's any case where we intend to support kryo/java serialized objects from 2.x in

[GitHub] spark pull request #19586: [SPARK-22367][CORE] Separate the serialization of...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19586#discussion_r147368368 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -860,9 +876,26 @@ private[storage] class PartiallySerializedBlock[T

[GitHub] spark pull request #19586: [SPARK-22367][CORE] Separate the serialization of...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19586#discussion_r147368002 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -205,11 +205,45 @@ class KryoSerializationStream

[GitHub] spark pull request #19586: [SPARK-22367][CORE] Separate the serialization of...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19586#discussion_r147367241 --- Diff: pom.xml --- @@ -133,7 +133,7 @@ 1.6.0 9.3.20.v20170531 3.1.0 -0.8.4 +0.9.2 --- End diff

[GitHub] spark issue #19586: [SPARK-22367][CORE] Separate the serialization of class ...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 One executor, the configuration as follows: the script: ```shell ${SPARK_HOME}/bin/spark-submit \ --class com.intel.KryoTest \ --master yarn

[GitHub] spark pull request #19586: [SPARK-22367][CORE] Separate the serialization of...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19586#discussion_r147346131 --- Diff: pom.xml --- @@ -133,7 +133,7 @@ 1.6.0 9.3.20.v20170531 3.1.0 -0.8.4 +0.9.2 --- End diff

[GitHub] spark issue #19586: [SPARK-22367][CORE] Separate the serialization of class ...

2017-10-27 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19586 Hi, @cloud-fan @jiangxb1987 @chenghao-intel. Would you mind taking a look? Thanks a lot.

[GitHub] spark pull request #19586: [SPARK-22367][CORE] Separate the serialization of...

2017-10-27 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/19586 [SPARK-22367][CORE] Separate the serialization of class and object for iteraor ## What changes were proposed in this pull request? Because they are all the same class for an iterator

[GitHub] spark pull request #19511: [SPARK-22293][SQL] Avoid unnecessary traversal in...

2017-10-26 Thread ConeyLiu
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/19511

[GitHub] spark issue #19511: [SPARK-22293][SQL] Avoid unnecessary traversal in Resolv...

2017-10-17 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19511 OK, close it.

[GitHub] spark pull request #19317: [SPARK-22098][CORE] Add new method aggregateByKey...

2017-10-17 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19317#discussion_r145294297 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -180,6 +180,56 @@ class PairRDDFunctions[K, V](self: RDD[(K, V

[GitHub] spark pull request #19317: [SPARK-22098][CORE] Add new method aggregateByKey...

2017-10-17 Thread ConeyLiu
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/19317

[GitHub] spark issue #19511: [SPARK-22293][SQL] Avoid unnecessary traversal in Resolv...

2017-10-17 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19511 Hi @gatorsmile, if we can combine the two traversals, this should simplify the code, not complicate it. However, it won't yield a big performance improvement. And I can close it if this change

[GitHub] spark pull request #19511: [SPARK-22293][SQL] Avoid unnecessary traversal in...

2017-10-17 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19511#discussion_r145041400 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -890,32 +890,39 @@ class Analyzer

[GitHub] spark issue #19511: [SPARK-22293][SQL] Avoid unnecessary traversal in Resolv...

2017-10-17 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19511 Hi, @cloud-fan @gatorsmile. Would you mind taking a look?

[GitHub] spark pull request #19511: [SPARK-22293][SQL] Avoid unnecessary traversal in...

2017-10-17 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/19511 [SPARK-22293][SQL] Avoid unnecessary traversal in ResolveReferences ## What changes were proposed in this pull request? We don't need to traverse the children expressions to determine whether

[GitHub] spark pull request #19317: [SPARK-22098][CORE] Add new method aggregateByKey...

2017-10-16 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19317#discussion_r144760426 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -180,6 +180,56 @@ class PairRDDFunctions[K, V](self: RDD[(K, V

[GitHub] spark pull request #19317: [SPARK-22098][CORE] Add new method aggregateByKey...

2017-10-15 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19317#discussion_r144755666 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -180,6 +180,56 @@ class PairRDDFunctions[K, V](self: RDD[(K, V

[GitHub] spark issue #19317: [SPARK-22098][CORE] Add new method aggregateByKeyLocally...

2017-10-15 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19317 Hi @WeichenXu123, any comments on this?

[GitHub] spark pull request #19316: [SPARK-22097][CORE]Request an accurate memory aft...

2017-10-11 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19316#discussion_r144169752 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -388,7 +388,13 @@ private[spark] class MemoryStore

[GitHub] spark pull request #19316: [SPARK-22097][CORE]Request an accurate memory aft...

2017-10-10 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19316#discussion_r143901552 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -388,7 +388,13 @@ private[spark] class MemoryStore

[GitHub] spark issue #19316: [SPARK-22097][CORE]Request an accurate memory after we u...

2017-10-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19316 Hi @cloud-fan @jiangxb1987 Do you have time to check this? Thanks a lot.

[GitHub] spark issue #19317: [SPARK-22098][CORE] Add new method aggregateByKeyLocally...

2017-09-24 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19317 Test case: ```scala test("performance of aggregateByKeyLocally ") { val random = new Random(1) val pairs = sc.parallelize(0 until 1000, 20)

[GitHub] spark issue #19317: [SPARK-22098][CORE] Add new method aggregateByKeyLocally...

2017-09-24 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19317 OK, just keep it. Does this need more tests or more improvements?
