Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210890027
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
.intConf
Github user habren commented on the issue:
https://github.com/apache/spark/pull/21868
@HyukjinKwon Yes, this is to handle it dynamically.
For ad-hoc queries, the selected columns differ from query to query,
and it's not convenient or even impossible for users to set
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210887543
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
.intConf
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210887308
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
.intConf
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210886442
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -25,17 +25,16 @@ import java.util.zip.Deflater
import
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210876154
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
.intConf
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210871055
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -425,12 +426,44 @@ case class FileSourceScanExec
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210793717
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -425,12 +426,44 @@ case class FileSourceScanExec
Github user habren commented on the issue:
https://github.com/apache/spark/pull/21868
Hi @HyukjinKwon, I moved the change to the master branch just now. Please help
to review
---
-
To unsubscribe, e-mail: reviews
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210456342
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -401,12 +399,41 @@ case class FileSourceScanExec
Github user habren commented on the issue:
https://github.com/apache/spark/pull/21868
@HyukjinKwon Thanks for your comments. I will submit it to master soon
Github user habren commented on the issue:
https://github.com/apache/spark/pull/21868
@maropu Thanks for your comments. ORC can also benefit from this change
since ORC is also a columnar file format. Do you think I should add ORC support
by changing the line below
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/22018#discussion_r208788059
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -297,7 +297,7 @@ object InMemoryFileIndex
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/22018#discussion_r208787523
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -297,7 +297,7 @@ object InMemoryFileIndex
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/22018#discussion_r208784609
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -297,7 +297,7 @@ object InMemoryFileIndex
Github user habren commented on the issue:
https://github.com/apache/spark/pull/22018
Hi Takeshi Yamamuro, Hyukjin Kwon, and @viirya, can you take a look at this
patch?
Github user habren commented on the issue:
https://github.com/apache/spark/pull/21868
Hi @maropu and @viirya, do you agree with the basic idea that we should take
column pruning into consideration when splitting the input files
GitHub user habren opened a pull request:
https://github.com/apache/spark/pull/22018
[SPARK-25038][SQL] Accelerate Spark Plan generation when Spark SQL re…
https://issues.apache.org/jira/browse/SPARK-25038
When Spark SQL reads a large amount of data, it takes a long time
Github user habren commented on the issue:
https://github.com/apache/spark/pull/21868
Hi @maropu and @viirya, do you agree with the basic idea that we should
take column pruning into consideration when splitting the input files
GitHub user habren reopened a pull request:
https://github.com/apache/spark/pull/21868
[SPARK-24906][SQL] Adaptively enlarge split / partition size for Parq…
Please refer to https://issues.apache.org/jira/browse/SPARK-24906 for more
detail and test
For columnar file
Github user habren commented on the issue:
https://github.com/apache/spark/pull/21868
@maropu If I understand correctly, your concern is about how to calculate
Github user habren closed the pull request at:
https://github.com/apache/spark/pull/21868
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r205356861
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -401,12 +399,41 @@ case class FileSourceScanExec
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r205288000
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -401,12 +399,41 @@ case class FileSourceScanExec
Github user habren commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r205287123
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -381,6 +381,26 @@ object SQLConf {
.booleanConf
GitHub user habren opened a pull request:
https://github.com/apache/spark/pull/21868
[SPARK-24906][SQL] Adaptively enlarge split / partition size for Parq…
Please refer to https://issues.apache.org/jira/browse/SPARK-24906 for more
detail and test
For columnar file
26 matches