Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22612#discussion_r237737064
--- Diff:
core/src/main/scala/org/apache/spark/executor/ProcfsMetricsGetter.scala ---
@@ -0,0 +1,228 @@
+/*
+ * Licensed to the Apache Software
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22612#discussion_r237736770
--- Diff:
core/src/main/scala/org/apache/spark/executor/ProcfsMetricsGetter.scala ---
@@ -0,0 +1,228 @@
+/*
+ * Licensed to the Apache Software
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22965#discussion_r231908870
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala ---
@@ -32,9 +32,11 @@ import org.apache.spark.sql.types
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22823#discussion_r231771399
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/WideTableBenchmark.scala
---
@@ -0,0 +1,52 @@
+/*
+ * Licensed
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22823#discussion_r231198094
--- Diff: sql/core/benchmarks/WideTableBenchmark-results.txt ---
@@ -0,0 +1,17
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22823
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22823#discussion_r231116690
--- Diff: sql/core/benchmarks/WideTableBenchmark-results.txt ---
@@ -0,0 +1,17
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22823
@dongjoon-hyun Just pushed the rebased version, thanks!
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22847
retest this please
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22847
@cloud-fan @rednaxelafx I missed that! Please help review.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22847
@cloud-fan @gatorsmile How about merging this PR first? Then we can
discuss those performance issues in other PRs:
1. One PR to improve WideTableBenchmark #22823 WIP.
2. One PR to add more
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22847#discussion_r230248773
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
.intConf
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22847
I used the WideTableBenchmark to test this configuration.
Four scenarios were tested; `2048` is always better than `1024`. Overall it
also looks safer for avoiding the 8KB limitation
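For context, the threshold being benchmarked here landed as the `spark.sql.codegen.methodSplitThreshold` configuration (SPARK-25850). The snippet below is a minimal sketch of exercising both values, assuming a running `SparkSession` named `spark`; the wide projection mimics WideTableBenchmark, where many expressions per operator force the generated code to be split into methods around this byte-size threshold.

```scala
// Sketch, assuming a SparkSession `spark` is available.
for (threshold <- Seq(1024, 2048)) {
  spark.conf.set("spark.sql.codegen.methodSplitThreshold", threshold)
  // A wide projection: hundreds of expressions per operator trigger
  // codegen method splitting at roughly `threshold` bytes of source.
  val wide = spark.range(1000).selectExpr((1 to 400).map(i => s"id + $i AS c$i"): _*)
  wide.selectExpr((1 to 400).map(i => s"c$i + 1"): _*).count()
}
```

The 8KB limitation mentioned above refers to the JIT's default 8000-byte bytecode cap for compiling a method, which is why a split size comfortably below it matters.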
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22847#discussion_r229919857
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
.intConf
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22847#discussion_r229638917
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
.intConf
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22879
`tpcdsQueries` and `tpcdsQueriesV2_7` duplicate
`TPCDSQueryBenchmark`'s; should we maintain them together
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22847
@cloud-fan @dongjoon-hyun @kiszk I just added a negative check; maybe we need
another PR to figure out a better value later if it is not easy to decide now
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22861
@dongjoon-hyun Tests have passed.
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22861#discussion_r229162191
--- Diff:
external/avro/src/test/scala/org/apache/spark/sql/execution/benchmark/AvroWriteBenchmark.scala
---
@@ -19,22 +19,17 @@ package
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22845#discussion_r229011040
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVBenchmarks.scala
---
@@ -137,22 +124,15 @@ object CSVBenchmarks extends
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22844#discussion_r229010476
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmarks.scala
---
@@ -195,23 +170,16 @@ object JSONBenchmarks
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22861
@dongjoon-hyun I used #22872 to make main args accessible for
`BenchmarkBase`'s subclasses; this PR is mainly for refactoring
`DataSourceWriteBenchmark` and `BuiltInDataSourceWriteBenchmark
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22872
@HyukjinKwon @cloud-fan My previous tests have passed :).
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22847
retest this please
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22861
The current implementation misses main args, but some suites would need them
anyway.
Let's discuss in #22872.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22872
@gengliangwang @dongjoon-hyun @cloud-fan
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22872
[SPARK-25864][SQL][TEST] Make main args accessible for BenchmarkBase's
subclass
## What changes were proposed in this pull request?
Set main args correctly in BenchmarkBase, to make
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22847#discussion_r228788123
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
.intConf
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22861
@dongjoon-hyun Originally, I wanted to do two things in this PR.
1. Make `mainArgs` correctly set in `BenchmarkBase`.
2. Include an example to use `mainArgs`: refactor
`DataSourceWriteBenchmark
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22861
@gengliangwang @wangyum @dongjoon-hyun Please help review.
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22861
[SPARK-25663][SPARK-25661][SQL][TEST] Refactor
BuiltInDataSourceWriteBenchmark, DataSourceWriteBenchmark and
AvroWriteBenchmark to use main method
## What changes were proposed in this pull request
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22823#discussion_r228715829
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/BenchmarkWideTable.scala
---
@@ -1,52 +0,0 @@
-/*
- * Licensed
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22847
@cloud-fan @dongjoon-hyun @gengliangwang Kindly help review.
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22847
[SPARK-25850][SQL] Make the split threshold for the code generated method
configurable
## What changes were proposed in this pull request?
As per the
[discussion](https://github.com/apache
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22823#discussion_r228387605
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
---
@@ -910,12 +910,14 @@ class CodegenContext
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22823
Thanks @wangyum for the good suggestion!
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22823
[SPARK-25676][SQL][TEST] Refactor BenchmarkWideTable to use main method
## What changes were proposed in this pull request?
Refactor BenchmarkWideTable to use main method.
Generate
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/19788
@cloud-fan, exception screenshot attached. Let me know if you want any changes.
![image](https://user-images.githubusercontent.com/2989575/47471258-1793ce00-d83c-11e8-90bf-107865fc9032.png
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/21156#discussion_r227268021
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/JoinUtils.scala ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22580
[SPARK-25508][SQL] Refactor OrcReadBenchmark to use main method
## What changes were proposed in this pull request?
Refactor OrcReadBenchmark to use main method.
Generate benchmark
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22493
retest this please
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22493
@dongjoon-hyun could you help review?
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22495#discussion_r219731873
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SortBenchmark.scala
---
@@ -28,12 +28,15 @@ import
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22490#discussion_r219548887
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchBenchmark.scala
---
@@ -30,8 +30,13 @@ import
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22495
[SPARK-25486][TEST] Refactor SortBenchmark to use main method
## What changes were proposed in this pull request?
Refactor SortBenchmark to use main method.
Generate benchmark result
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22493
[SPARK-25485][TEST] Refactor UnsafeProjectionBenchmark to use main method
## What changes were proposed in this pull request?
Refactor `UnsafeProjectionBenchmark` to use main method
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22490
@wangyum Could you help review?
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22490
[SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to use main method
## What changes were proposed in this pull request?
Refactor `ColumnarBatchBenchmark` to use main method.
Generate
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/21791
@kiszk I feel it is hard to add a UT; do you have any ideas?
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/21791
@kiszk I see #22324 has been solved, which is actually one of my PR's
dependencies (see
https://issues.apache.org/jira/browse/SPARK-24925?focusedCommentId=16556818
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
@cloud-fan, tests have passed. And I will use a follow-up PR to make it
cleaner.
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r213912108
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
---
@@ -1021,6 +1022,113 @@ class
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
retest this please
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
@cloud-fan I reverted to the previous version.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
@dongjoon-hyun Sorry for the late response; the description is changed to:
> Although the filter "ID < 100L" is generated by Spark, it actually fails
to be pushed down into Parquet,
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
I will treat the above case as acceptable and will add a duplicated-field
check for the Parquet schema.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
Both `catalystRequestedSchema` and `parquetSchema` are recursive structures;
is there an easy way to find duplicated fields? Thanks
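One way to answer the question above can be sketched on a toy recursive schema rather than the actual Catalyst or Parquet types (`Struct` and `Primitive` here are hypothetical stand-ins, not Spark classes): recurse through the structure, group sibling names case-insensitively, and report any group with more than one member.

```scala
// Toy stand-ins for a recursive schema; not the real Catalyst/Parquet classes.
sealed trait FieldType
case object Primitive extends FieldType
case class Struct(fields: Seq[(String, FieldType)]) extends FieldType

// Return the full paths of fields whose names collide case-insensitively
// with a sibling, recursing into nested structs.
def duplicatedFields(schema: Struct, path: String = ""): Seq[String] = {
  val here = schema.fields
    .groupBy { case (name, _) => name.toLowerCase }
    .values
    .filter(_.size > 1)
    .flatten
    .map { case (name, _) => path + name }
    .toSeq
  val nested = schema.fields.flatMap {
    case (name, s: Struct) => duplicatedFields(s, path + name + ".")
    case _                 => Nil
  }
  here ++ nested
}
```

For example, a schema with top-level fields `id` and `ID` and a nested struct `a` containing `x` and `X` would report `id`, `ID`, `a.x`, and `a.X`.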
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
@cloud-fan I also think my way changes too much in this PR.
> go through the parquet schema and find duplicated field names
If the user's query only touches non-duplicated fields, this
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22184#discussion_r213581563
--- Diff: docs/sql-programming-guide.md ---
@@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the
best performance, see
- Since
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r213551202
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
---
@@ -350,25 +356,38 @@ private[parquet] class
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22184#discussion_r213519348
--- Diff: docs/sql-programming-guide.md ---
@@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the
best performance, see
- Since
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22184#discussion_r213386126
--- Diff: docs/sql-programming-guide.md ---
@@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the
best performance, see
- Since
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
@dongjoon-hyun In the **schema matched case** you listed, it is the expected
behavior in the current master.
```
spark.sparkContext.hadoopConfiguration.setInt("parquet.block.size", 8 *
1
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
retest this please
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
@gatorsmile I can help check `spark.sql.caseSensitive` for all the built-in
data sources.
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r212814524
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
---
@@ -350,25 +356,38 @@ private[parquet] class
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r212813302
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
---
@@ -1021,6 +1022,116 @@ class
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r212813240
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
---
@@ -1021,6 +1022,116 @@ class
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r212801680
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
---
@@ -1021,6 +1021,88 @@ class
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22197
@cloud-fan @HyukjinKwon It seems we cannot simply add `originalName` into
`ParquetSchemaType`, because we need exact `ParquetSchemaType` info for type
matching, like:
```
private val
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r212798970
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
---
@@ -350,25 +352,43 @@ private[parquet] class
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22197#discussion_r212788978
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -377,7 +378,7 @@ class ParquetFileFormat
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22183
We need to backport it. Without this PR, we cannot solve the data issue in
[SPARK-25206]: wrong data may be returned when pushdown is enabled
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22197
[SPARK-25207][SQL] Case-insensitive field resolution for filter pushdown
when reading Parquet
## What changes were proposed in this pull request?
Currently, filter pushdown will not work
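The core idea of the fix can be sketched without the real Parquet classes (the names and signatures below are illustrative, not Spark's actual code): resolve a predicate's column against the Parquet field names, lowercasing in case-insensitive mode, and refuse to push down when resolution is missing or ambiguous.

```scala
// Illustrative sketch, not Spark's actual implementation.
// Group physical parquet field names by their lookup key.
def resolve(parquetFields: Seq[String], caseSensitive: Boolean): Map[String, Seq[String]] =
  if (caseSensitive) parquetFields.map(f => f -> Seq(f)).toMap
  else parquetFields.groupBy(_.toLowerCase)

// Return the physical field to push a predicate down to, or None when the
// column is missing or matches several fields case-insensitively
// (ambiguous resolution: skip pushdown rather than risk wrong results).
def pushDownTarget(column: String, parquetFields: Seq[String], caseSensitive: Boolean): Option[String] = {
  val key = if (caseSensitive) column else column.toLowerCase
  resolve(parquetFields, caseSensitive).get(key).collect { case Seq(single) => single }
}
```

For example, with fields `Seq("id", "name")` and case-insensitive mode, a filter on `ID` resolves to `id`; with fields `Seq("id", "ID")` the match is ambiguous, so no pushdown happens.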
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22148
I triggered it 3 hours ago, but I see many Jenkins submissions in the queue.
And it says "Jenkins is about to shut down"?
![image](https://user-images.githubusercontent.com/298957
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22148
retest this please
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22148
LGTM.
@cloud-fan @gatorsmile Could you kindly help trigger Jenkins and review?
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/19788
**Summary**
The one-disk-IO solution's performance seems not as good as the current
PR19877 implementation.
**Benchmark**
```scala
spark.range(1, 512000L, 1, 1280).selectExpr
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22077
@LantaoJin KafkaSourceStressForDontFailOnDataLossSuite fails occasionally;
thanks @wangyum for retesting.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22077
ok to test.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22077
@cloud-fan Kindly help review.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22077
LGTM.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22077
@LantaoJin, could you modify the title and description?
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22066
@cloud-fan Synced with @LantaoJin; he will help port it to 2.3 soon and I
will review it.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22066
@cloud-fan @jerryshao sure, I will do it.
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22066#discussion_r209426886
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala
---
@@ -404,21 +404,26 @@ abstract class HashExpression[E
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22066
@LantaoJin I realized the initial approach had some issues, so I marked it as
WIP to refine it and add tests. It is different from your original
implementation, so I would like to use this one
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22066
@cloud-fan Jira and 1st is from this one. It is critical to our 2.3
migration.
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22066#discussion_r209267827
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala
---
@@ -778,21 +783,22 @@ case class HiveHash(children: Seq
Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/22066#discussion_r209265323
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2831,4 +2831,17 @@ class SQLQuerySuite extends QueryTest
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22066
@cloud-fan @gatorsmile PR has been ready, kindly help review.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/22066
@cloud-fan I am refining and adding tests.
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/22066
[SPARK-25084][SQL] "distribute by" on multiple columns may lead to codegen
issue
## What changes were proposed in this pull request?
"distribute by" on multiple columns
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/19788
@cloud-fan @gatorsmile I am trying the new method as suggested and I have a
question.
If we make it a **purely server-side** optimization, the external shuffle
service has no idea how
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/19788
@gatorsmile @cloud-fan @carsonwang I will update it soon.
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/21156
Closed by mistake; reopening it.
GitHub user yucai reopened a pull request:
https://github.com/apache/spark/pull/21156
[SPARK-24087][SQL] Avoid shuffle when join keys are a superset of bucket
keys
## What changes were proposed in this pull request?
To improve the bucket join, when join keys are a super
Github user yucai closed the pull request at:
https://github.com/apache/spark/pull/21156
GitHub user yucai opened a pull request:
https://github.com/apache/spark/pull/21791
[SPARK-24832][SQL] Improve inputMetrics's bytesRead update for ColumnarBatch
## What changes were proposed in this pull request?
Currently, ColumnarBatch's bytesRead needs to be updated every
Github user yucai commented on the issue:
https://github.com/apache/spark/pull/21156
@maryannxue how about this way? Any better idea?