[GitHub] spark issue #23228: [MINOR][DOC] Update the condition description of seriali...

2018-12-10 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23228 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23251: [SPARK-26300][SS] Remove a redundant `checkForStreaming`...

2018-12-10 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23251 cc @cloud-fan

[GitHub] spark issue #23228: [MINOR][DOC] Update the condition description of seriali...

2018-12-09 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23228 I have updated, thanks all.

[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-06 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23228 cc @JoshRosen @cloud-fan

[GitHub] spark pull request #23251: [SPARK-26300][SS] The `checkForStreaming` method ...

2018-12-06 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/23251 [SPARK-26300][SS] The `checkForStreaming` method may be called twice in `createQuery` ## What changes were proposed in this pull request? If `checkForContinuous` is called

[GitHub] spark pull request #23228: [MINOR][DOC]The condition description of serializ...

2018-12-05 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/23228 [MINOR][DOC]The condition description of serialized shuffle is not very accurate ## What changes were proposed in this pull request? `1. The shuffle dependency specifies no aggregation

[GitHub] spark pull request #23216: [SPARK-26264][CORE]It is better to add @transient...

2018-12-04 Thread 10110346
Github user 10110346 closed the pull request at: https://github.com/apache/spark/pull/23216

[GitHub] spark issue #23216: [SPARK-26264][CORE]It is better to add @transient to fie...

2018-12-04 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23216 OK, I will close this PR. Thank you very much.

[GitHub] spark issue #23216: [SPARK-26264][CORE]It is better to add @transient to fie...

2018-12-04 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23216 > > > Are you sure it's even a field in the class? it looks like it's only used to define this: > > ``` > @transient private[this] val preferredLocs

[GitHub] spark pull request #23216: [SPARK-26264][CORE]It is better to add @transient...

2018-12-04 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/23216 [SPARK-26264][CORE]It is better to add @transient to field 'locs' for class `ResultTask`. ## What changes were proposed in this pull request? The field 'locs' is only used on the driver side

[GitHub] spark issue #23162: [MINOR][DOC] Correct some document description errors

2018-11-30 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23162 retest this please

[GitHub] spark pull request #23162: [MINOR][DOC] Correct some document description er...

2018-11-29 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/23162#discussion_r237713245 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -430,8 +430,8 @@ package object config { .doc("The

[GitHub] spark issue #23162: [MINOR][DOC] Correct some document description errors

2018-11-28 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23162 retest this please

[GitHub] spark pull request #21957: [SPARK-24994][SQL] When the data type of the fiel...

2018-11-27 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/21957#discussion_r236965962 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -269,7 +269,8 @@ case class FileSourceScanExec

[GitHub] spark issue #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for parsing...

2018-11-27 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22590 @HyukjinKwon I think it is not important, but our customers need this feature. Yeah, it is better to find a way to set arbitrary parse settings options

[GitHub] spark pull request #23162: [MINOR][DOC] Correct some document description er...

2018-11-27 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/23162 [MINOR][DOC] Correct some document description errors ## What changes were proposed in this pull request? Correct some document description errors. ## How was this patch tested

[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22163 cc @kiszk @maropu

[GitHub] spark issue #23154: [SPARK-26195][SQL] Correct exception messages in some cl...

2018-11-27 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/23154 LGTM, thanks

[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...

2018-11-27 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/23154#discussion_r236920634 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java --- @@ -510,42 +510,42 @@ public void

[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-26 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22163 retest this please

[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...

2018-11-17 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22779 @srowen Thanks. I am sorry, I am on holiday and replying from my phone; I will update it next week.

[GitHub] spark pull request #22723: [SPARK-25729][CORE]It is better to replace `minPa...

2018-11-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22723#discussion_r230579427 --- Diff: core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala --- @@ -48,11 +50,11 @@ private[spark] class WholeTextFileInputFormat

[GitHub] spark pull request #22723: [SPARK-25729][CORE]It is better to replace `minPa...

2018-11-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22723#discussion_r230579423 --- Diff: core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala --- @@ -48,11 +50,11 @@ private[spark] class WholeTextFileInputFormat

[GitHub] spark pull request #22723: [SPARK-25729][CORE]It is better to replace `minPa...

2018-11-04 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22723#discussion_r230579084 --- Diff: core/src/main/scala/org/apache/spark/rdd/WholeTextFileRDD.scala --- @@ -51,7 +51,7 @@ private[spark] class WholeTextFileRDD( case

[GitHub] spark pull request #22855: [SPARK-25839] [Core] Implement use of KryoPool in...

2018-10-29 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22855#discussion_r228844982 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -298,30 +312,40 @@ class KryoDeserializationStream

[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-28 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22723 retest this please

[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-28 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22723 Thanks, yes, you are right. After your reminder, I realized there were other places, such as `HadoopRDD`. But I wonder if it's better to just modify `WholeTextFileInputFormat`, like

[GitHub] spark issue #22802: [SPARK-25806][SQL]The instance of FileSplit is redundant

2018-10-28 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22802 @srowen Thanks, I have checked them all and updated it

[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-25 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22723 In fact, `BinaryFileRDD` uses `max(defaultParallelism, minPartitions)`: `BinaryFileRDD -> setMinPartitions -> Math.max(sc.defaultParallelism, minPartitions)`. In ad

[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-24 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22723 What you say is reasonable. But from the perspective of resource utilization, I think it is better to replace `minPartitions` with `defaultParallelism`, we can see `BinaryFileRDD

[GitHub] spark pull request #22754: [SPARK-25776][CORE]The disk write buffer size mus...

2018-10-24 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22754#discussion_r228009755 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -495,8 +495,8 @@ package object config { ConfigBuilder

[GitHub] spark issue #22754: [SPARK-25776][CORE]The disk write buffer size must be gr...

2018-10-23 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22754 retest this please

[GitHub] spark issue #22754: [SPARK-25776][CORE]The disk write buffer size must be gr...

2018-10-23 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22754 retest this please

[GitHub] spark pull request #22754: [SPARK-25776][CORE]The disk write buffer size mus...

2018-10-23 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22754#discussion_r227392626 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -495,8 +495,8 @@ package object config { ConfigBuilder

[GitHub] spark pull request #22802: [SPARK-25806][SQL][MINOR]The instanceof FileSplit...

2018-10-23 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22802 [SPARK-25806][SQL][MINOR]The instanceof FileSplit is redundant for ParquetFileFormat ## What changes were proposed in this pull request? The instance of `FileSplit` is redundant

[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-22 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22754 retest this please

[GitHub] spark pull request #22754: [SPARK-25776][CORE][MINOR]The disk write buffer s...

2018-10-22 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22754#discussion_r227200492 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillWriter.java --- @@ -62,6 +62,8 @@ public

[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-22 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22723 retest this please

[GitHub] spark pull request #22754: [SPARK-25776][CORE][MINOR]The disk write buffer s...

2018-10-22 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22754#discussion_r227191411 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillWriter.java --- @@ -62,6 +62,8 @@ public

[GitHub] spark issue #22754: [SPARK-25776][CORE][MINOR]The disk write buffer size mus...

2018-10-21 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22754 Thank you for your review, I will update it @kiszk

[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...

2018-10-20 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22779 retest this please

[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...

2018-10-20 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22779 retest this please

[GitHub] spark issue #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is false ,...

2018-10-20 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22779 retest this please

[GitHub] spark pull request #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is ...

2018-10-19 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22779 [SPARK-25786][CORE]If the ByteBuffer.hasArray is false , it will throw UnsupportedOperationException for Kryo ## What changes were proposed in this pull request? `deserialize` for kryo
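
The failure mode named in this PR title can be reproduced with plain `java.nio`: calling `array()` on a buffer whose `hasArray()` is false (for example, a direct buffer) throws `UnsupportedOperationException`. The following is a minimal sketch of a guard, not the change made in the PR; the class name `ByteBufferBytes` and the method `toBytes` are hypothetical and do not come from Spark or Kryo.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not Spark or Kryo code.
final class ByteBufferBytes {
    private ByteBufferBytes() {}

    /** Copies the readable bytes of buf without changing its position. */
    static byte[] toBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        if (buf.hasArray()) {
            // Accessible backing array: honor arrayOffset() and position().
            System.arraycopy(buf.array(), buf.arrayOffset() + buf.position(),
                             out, 0, out.length);
        } else {
            // No accessible backing array (e.g. direct or read-only buffer):
            // array() would throw here, so read through a duplicate instead.
            buf.duplicate().get(out);
        }
        return out;
    }
}
```

Reading through `duplicate()` keeps the caller's position and limit untouched, which matters if the same buffer is consumed again later.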

[GitHub] spark issue #22725: [SPARK-25753][[CORE][FOLLOW-UP]fix reading small files v...

2018-10-19 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22725 OK, thanks @dongjoon-hyun

[GitHub] spark issue #22774: [SPARK-25780][CORE]Scheduling the tasks which have no hi...

2018-10-19 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22774 retest this please

[GitHub] spark pull request #22774: [SPARK-25780][CORE]Scheduling the tasks which hav...

2018-10-19 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22774 [SPARK-25780][CORE]Scheduling the tasks which have no higher level locality first ## What changes were proposed in this pull request? For example: An application has two executors: (exe1

[GitHub] spark issue #22754: [MINOR][CORE]The disk write buffer size must be greater ...

2018-10-18 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22754 @kiszk Thanks, I will create a JIRA.

[GitHub] spark issue #22754: [MINOR][CORE]The disk write buffer size must be greater ...

2018-10-18 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22754 If we set this property to 12, `freeSpaceInWriteBuffer` will be 0, and the `copyMemory` length will always be 0, so setting this property to 12 should not be allowed. https://github.com
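
The arithmetic behind this comment can be sketched as follows. It assumes the write buffer first holds a 12-byte record header (a 4-byte length plus an 8-byte key prefix); that layout is an assumption made for illustration and is not stated in the comment. The class and method names are hypothetical and are not the code in `UnsafeSorterSpillWriter.java`.

```java
// Hypothetical sketch of the buffer-size arithmetic; not Spark code.
final class SpillWriteBufferSketch {
    private SpillWriteBufferSketch() {}

    // Assumed header: 4-byte record length + 8-byte key prefix.
    static final int HEADER_BYTES = 4 + 8;

    /** Space left for record data in the first chunk, after the header. */
    static int freeSpaceAfterHeader(int diskWriteBufferSize) {
        return diskWriteBufferSize - HEADER_BYTES;
    }

    /** Number of record bytes that fit into the first chunk. */
    static int firstChunkSize(int diskWriteBufferSize, int recordLength) {
        return Math.min(freeSpaceAfterHeader(diskWriteBufferSize), recordLength);
    }
}
```

Under that assumption, a buffer size of 12 leaves no room for record data next to the header, which is the zero-length copy the comment describes; any size above 12 leaves at least one byte.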

[GitHub] spark pull request #22754: [CORE][MINOR]The disk write buffer size must be g...

2018-10-17 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22754 [CORE][MINOR]The disk write buffer size must be greater than 12 ## What changes were proposed in this pull request? In `UnsafeSorterSpillWriter.java`, when we write a record to a spill

[GitHub] spark issue #22725: [SPARK-24610][[CORE][FOLLOW-UP]fix reading small files v...

2018-10-16 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22725 @tgravescs OK, I will do it, thanks

[GitHub] spark issue #22723: [SPARK-25729][CORE]It is better to replace `minPartition...

2018-10-15 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22723 cc @gatorsmile

[GitHub] spark issue #22725: [SPARK-24610][[CORE][FOLLOW-UP]fix reading small files v...

2018-10-15 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22725 cc @dhruve @tgravescs

[GitHub] spark pull request #22731: [SPARK-25674][FOLLOW-UP] Update the stats for eac...

2018-10-15 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22731#discussion_r225362470 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala --- @@ -106,15 +106,16 @@ class FileScanRDD

[GitHub] spark pull request #22725: [SPARK-24610][[CORE][FOLLOW-UP]fix reading small ...

2018-10-15 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22725 [SPARK-24610][[CORE][FOLLOW-UP]fix reading small files via BinaryFileRDD ## What changes were proposed in this pull request? This is a follow up of #21601, `StreamFileInputFormat

[GitHub] spark pull request #22723: [SPARK-25729][CORE]It is better to replace `minPa...

2018-10-15 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22723 [SPARK-25729][CORE]It is better to replace `minPartitions` with `defaultParallelism` , when `minPartitions` is less than `defaultParallelism` ## What changes were proposed in this pull request

[GitHub] spark pull request #22594: [SPARK-25674][SQL] If the records are incremented...

2018-10-10 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22594#discussion_r224286853 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala --- @@ -70,6 +70,8 @@ class FileScanRDD

[GitHub] spark issue #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for parsing...

2018-10-09 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22590 Normally, it's better to have no quotes, but in our production environment, the user requests quotes to be displayed, so we need this option

[GitHub] spark pull request #22594: [SPARK-25674][SQL] If the records are incremented...

2018-10-09 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22594#discussion_r223612070 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -570,4 +572,33 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for ...

2018-10-09 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22590#discussion_r223590113 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -194,6 +195,22 @@ class CSVSuite extends QueryTest

[GitHub] spark issue #22594: [MINOR][SQL] When batch reading, the number of bytes can...

2018-10-04 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22594 @srowen Yes, I will update, thanks

[GitHub] spark pull request #22594: [MINOR][SQL] When batch reading, the number of by...

2018-09-30 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22594 [MINOR][SQL] When batch reading, the number of bytes can not be updated as expected. ## What changes were proposed in this pull request? When batch reading, the number of bytes can

[GitHub] spark pull request #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for ...

2018-09-29 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22590 [SPARK-25574][SQL]Add an option `keepQuotes` for parsing csv file ## What changes were proposed in this pull request? In this PR, I added a new option for CSV files, `keepQuotes`. In our

[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-09-20 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22163 retest this please

[GitHub] spark pull request #22358: [SPARK-25366][SQL]Zstd and brotli CompressionCode...

2018-09-09 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22358#discussion_r216180370 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -398,10 +398,10 @@ object SQLConf

[GitHub] spark pull request #22358: [SPARK-25366][SQL]Zstd and brotil CompressionCode...

2018-09-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22358#discussion_r215901781 --- Diff: docs/sql-programming-guide.md --- @@ -964,7 +964,7 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

[GitHub] spark issue #22358: [SPARK-25366][SQL]Zstd and brotil CompressionCodec are n...

2018-09-07 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22358 Yeah, the error message comes from an external jar (parquet-common-1.10.0.jar). I think Spark + Parquet should avoid the Hadoop dependencies

[GitHub] spark issue #22358: [SPARK-25366][SQL]Zstd and brotil CompressionCodec are n...

2018-09-07 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22358 Thanks. If the codecs are found, we support those compressions, but how do I find them? @HyukjinKwon

[GitHub] spark pull request #22358: [SPARK-25366][SQL]Zstd and brotil CompressionCode...

2018-09-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22358#discussion_r215887803 --- Diff: docs/sql-programming-guide.md --- @@ -964,7 +964,7 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

[GitHub] spark issue #22358: [SPARK-25366][SQL]Zstd and brotil CompressionCodec are n...

2018-09-07 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22358 It uses reflection to acquire Hadoop classes for compression that are not in the supplied dependencies (hadoop-common-2.6.5.jar, hadoop-common-2.7.0.jar, hadoop-common-3.1.0.jar

[GitHub] spark pull request #22358: [SPARK-25366][SQL]Zstd and brotil CompressionCode...

2018-09-07 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22358 [SPARK-25366][SQL]Zstd and brotil CompressionCodec are not supported for parquet files ## What changes were proposed in this pull request? Hadoop2.6 and hadoop2.7 do not contain zstd

[GitHub] spark pull request #22350: [SPARK-25356][SQL]Add Parquet block size option t...

2018-09-06 Thread 10110346
Github user 10110346 closed the pull request at: https://github.com/apache/spark/pull/22350

[GitHub] spark pull request #22350: [SPARK-25356][SQL]Add Parquet block size option t...

2018-09-06 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22350#discussion_r215819785 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -123,6 +123,9 @@ class

[GitHub] spark pull request #22350: [SPARK-25356][SQL]Add Parquet block size option t...

2018-09-06 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22350#discussion_r215598798 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -123,6 +123,9 @@ class

[GitHub] spark pull request #22350: [SPARK-25356][SQL]Add Parquet block size option t...

2018-09-06 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22350 [SPARK-25356][SQL]Add Parquet block size option to SparkSQL configuration ## What changes were proposed in this pull request? I think we should configure the Parquet buffer size

[GitHub] spark issue #22306: [SPARK-25300][CORE]Unified the configuration parameter `...

2018-09-03 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22306 retest this please

[GitHub] spark issue #22306: [SPARK-25300][CORE]Unified the configuration parameter `...

2018-09-02 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22306 Thanks, I will apply them to the test cases @kiszk

[GitHub] spark pull request #22306: [SPARK-25300][CORE]Unified the configuration para...

2018-08-31 Thread 10110346
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22306 [SPARK-25300][CORE]Unified the configuration parameter `spark.shuffle.service.enabled` ## What changes were proposed in this pull request? The configuration parameter

[GitHub] spark pull request #22241: [SPARK-25249][CORE][TEST]add a unit test for Open...

2018-08-27 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/22241#discussion_r212902991 --- Diff: core/src/test/scala/org/apache/spark/util/collection/OpenHashMapSuite.scala --- @@ -194,4 +194,42 @@ class OpenHashMapSuite extends
