Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23228
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23251
cc @cloud-fan
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23228
I have updated, thanks all.
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23228
cc @JoshRosen @cloud-fan
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/23251
[SPARK-26300][SS] The `checkForStreaming` method may be called twice in
`createQuery`
## What changes were proposed in this pull request?
If `checkForContinuous` is called
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/23228
[MINOR][DOC]The condition description of serialized shuffle is not very
accurate
## What changes were proposed in this pull request?
`1. The shuffle dependency specifies no aggregation
Github user 10110346 closed the pull request at:
https://github.com/apache/spark/pull/23216
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23216
Ok, I will close this PR, thank you very much
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23216
>
>
> Are you sure it's even a field in the class? it looks like it's only used
to define this:
>
> ```
> @transient private[this] val preferredLocs
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/23216
[SPARK-26264][CORE]It is better to add @transient to field 'locs' for class
`ResultTask`.
## What changes were proposed in this pull request?
The field 'locs' is only used in driver side
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23162
retest this please
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/23162#discussion_r237713245
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -430,8 +430,8 @@ package object config {
.doc("The
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23162
retest this please
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21957#discussion_r236965962
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -269,7 +269,8 @@ case class FileSourceScanExec
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22590
@HyukjinKwon I think it is not important, but our customers need this
feature.
Yeah, it is better to find a way to set the arbitrary parse settings options
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/23162
[MINOR][DOC] Correct some document description errors
## What changes were proposed in this pull request?
Correct some document description errors.
## How was this patch tested
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22163
cc @kiszk @maropu
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/23154
LGTM, thanks
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/23154#discussion_r236920634
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java
---
@@ -510,42 +510,42 @@ public void
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22163
retest this please
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22779
@srowen Thanks. I am sorry, I am on holiday and replying on my phone. I will
update it next week.
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22723#discussion_r230579427
--- Diff:
core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala ---
@@ -48,11 +50,11 @@ private[spark] class WholeTextFileInputFormat
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22723#discussion_r230579423
--- Diff:
core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala ---
@@ -48,11 +50,11 @@ private[spark] class WholeTextFileInputFormat
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22723#discussion_r230579084
--- Diff: core/src/main/scala/org/apache/spark/rdd/WholeTextFileRDD.scala
---
@@ -51,7 +51,7 @@ private[spark] class WholeTextFileRDD(
case
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22855#discussion_r228844982
--- Diff:
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala ---
@@ -298,30 +312,40 @@ class KryoDeserializationStream
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22723
retest this please
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22723
Thanks, yes, you are right.
After your reminder, I realized there were other places, such as `HadoopRDD`.
But I wonder if it's better to just modify `WholeTextFileInputFormat`, like
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22802
@srowen Thanks, I have checked all and updated it
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22723
In fact, `BinaryFileRDD` uses `max(defaultParallelism, minPartitions)`:
`BinaryFileRDD` -> `setMinPartitions` -> `Math.max(sc.defaultParallelism,
minPartitions)`.
In ad
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22723
What you say is reasonable.
But from the perspective of resource utilization, I think it is better to
replace `minPartitions` with `defaultParallelism`, we can see `BinaryFileRDD
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22754#discussion_r228009755
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -495,8 +495,8 @@ package object config {
ConfigBuilder
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22754
retest this please
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22754
retest this please
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22754#discussion_r227392626
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -495,8 +495,8 @@ package object config {
ConfigBuilder
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22802
[SPARK-25806][SQL][MINOR]The instanceof FileSplit is redundant for
ParquetFileFormat
## What changes were proposed in this pull request?
The instance of `FileSplit` is redundant
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22754
retest this please
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22754#discussion_r227200492
--- Diff:
core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillWriter.java
---
@@ -62,6 +62,8 @@ public
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22723
retest this please
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22754#discussion_r227191411
--- Diff:
core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillWriter.java
---
@@ -62,6 +62,8 @@ public
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22754
Thank you for your review, I will update it @kiszk
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22779
retest this please
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22779
retest this please
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22779
retest this please
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22779
[SPARK-25786][CORE]If the ByteBuffer.hasArray is false, it will throw
UnsupportedOperationException for Kryo
## What changes were proposed in this pull request?
`deserialize` for kryo
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22725
OK, thanks @dongjoon-hyun
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22774
retest this please
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22774
[SPARK-25780][CORE]Scheduling the tasks which have no higher level locality
first
## What changes were proposed in this pull request?
For example:
An application has two executors: (exe1
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22754
@kiszk Thanks, I will create a JIRA.
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22754
If we set this property to 12, `freeSpaceInWriteBuffer` will be 0, and the
length of `copyMemory` will always be 0, so setting this property to 12 should
not be allowed.
https://github.com
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22754
[CORE][MINOR]The disk write buffer size must be greater than 12
## What changes were proposed in this pull request?
In `UnsafeSorterSpillWriter.java`, when we write a record to a spill
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22725
@tgravescs OK, I will do it, thanks
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22723
cc @gatorsmile
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22725
cc @dhruve @tgravescs
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22731#discussion_r225362470
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala
---
@@ -106,15 +106,16 @@ class FileScanRDD
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22725
[SPARK-24610][CORE][FOLLOW-UP]fix reading small files via BinaryFileRDD
## What changes were proposed in this pull request?
This is a follow up of #21601, `StreamFileInputFormat
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22723
[SPARK-25729][CORE]It is better to replace `minPartitions` with
`defaultParallelism`, when `minPartitions` is less than `defaultParallelism`
## What changes were proposed in this pull request
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22594#discussion_r224286853
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala
---
@@ -70,6 +70,8 @@ class FileScanRDD
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22590
Normally, it's better to have no quotes, but in our production environment,
the user requests quotes to be displayed, so we need this option
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22594#discussion_r223612070
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
---
@@ -570,4 +572,33 @@ class SQLMetricsSuite extends
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22590#discussion_r223590113
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -194,6 +195,22 @@ class CSVSuite extends QueryTest
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22594
@srowen Yes, I will update, thanks
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22594
[MINOR][SQL] When batch reading, the number of bytes can not be updated as
expected.
## What changes were proposed in this pull request?
When batch reading, the number of bytes can
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22590
[SPARK-25574][SQL]Add an option `keepQuotes` for parsing csv file
## What changes were proposed in this pull request?
In this PR, I added a new option for CSV files: `keepQuotes`.
In our
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22163
retest this please
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22358#discussion_r216180370
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -398,10 +398,10 @@ object SQLConf
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22358#discussion_r215901781
--- Diff: docs/sql-programming-guide.md ---
@@ -964,7 +964,7 @@ Configuration of Parquet can be done using the
`setConf` method on `SparkSession
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22358
Yeah, the error message is output from an external
jar (parquet-common-1.10.0.jar).
I think spark + parquet should avoid the hadoop dependencies
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22358
Thanks. If the codecs are found, we support those compressions, but
how do I find them? @HyukjinKwon
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22358#discussion_r215887803
--- Diff: docs/sql-programming-guide.md ---
@@ -964,7 +964,7 @@ Configuration of Parquet can be done using the
`setConf` method on `SparkSession
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22358
It is using reflection to acquire hadoop classes for compression which are
not in the supplied dependencies (hadoop-common-2.6.5.jar,
hadoop-common-2.7.0.jar, hadoop-common-3.1.0.jar
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22358
[SPARK-25366][SQL]Zstd and brotli CompressionCodec are not supported for
parquet files
## What changes were proposed in this pull request?
Hadoop2.6 and hadoop2.7 do not contain zstd
Github user 10110346 closed the pull request at:
https://github.com/apache/spark/pull/22350
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22350#discussion_r215819785
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -123,6 +123,9 @@ class
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22350#discussion_r215598798
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -123,6 +123,9 @@ class
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22350
[SPARK-25356][SQL]Add Parquet block size option to SparkSQL configuration
## What changes were proposed in this pull request?
I think we should configure the Parquet buffer size
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22306
retest this please
Github user 10110346 commented on the issue:
https://github.com/apache/spark/pull/22306
Thanks, I will apply them to test cases @kiszk
GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22306
[SPARK-25300][CORE]Unified the configuration parameter
`spark.shuffle.service.enabled`
## What changes were proposed in this pull request?
The configuration parameter
Github user 10110346 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22241#discussion_r212902991
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/OpenHashMapSuite.scala ---
@@ -194,4 +194,42 @@ class OpenHashMapSuite extends