Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r208423190
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
---
@@ -49,4 +51,11 @@ object DataSourceUtils
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r208295767
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
---
@@ -49,4 +51,11 @@ object DataSourceUtils
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@maropu @gatorsmile, are any other changes required?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
retest this please.
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@maropu @gatorsmile The tests are failing in places unrelated to my
changes. How can I resolve this?
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r207090805
--- Diff: docs/sql-programming-guide.md ---
@@ -1872,6 +1872,8 @@ working with timestamps in `pandas_udf`s to get the
best performance, see
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r207090543
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1449,6 +1449,15 @@ object SQLConf {
.intConf
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r207090467
--- Diff: docs/sql-programming-guide.md ---
@@ -1872,6 +1872,8 @@ working with timestamps in `pandas_udf`s to get the
best performance, see
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@gatorsmile, I have addressed the comments. Are any other fixes required?
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r207035320
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -78,7 +93,8 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r207035227
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1449,6 +1449,13 @@ object SQLConf {
.intConf
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r207035140
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1449,6 +1449,13 @@ object SQLConf {
.intConf
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@gatorsmile, I have made the changes.
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
retest this please.
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r205916878
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,23 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r205916762
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,23 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r205546260
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,23 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r205545098
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,23 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r205544840
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,23 @@ object CommandUtils extends Logging
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@gatorsmile Can you review this please? Thanks.
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204823567
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,26 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204823533
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,26 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204823422
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -148,6 +148,19 @@ class StatisticsSuite extends
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204823485
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala
---
@@ -59,14 +59,15 @@ abstract class
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204823137
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -148,6 +149,25 @@ class StatisticsSuite extends
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
retest this please.
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204618663
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -148,6 +148,19 @@ class StatisticsSuite extends
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204618589
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Utils.scala
---
@@ -55,4 +57,11 @@ private[sql] object
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@gatorsmile @maropu ping.
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204200662
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,27 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r204121451
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,27 @@ object CommandUtils extends Logging
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
retest this please.
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r203613041
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -148,6 +148,19 @@ class StatisticsSuite extends
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r203456566
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,27 @@ object CommandUtils extends Logging
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@maropu @gatorsmile Can you review this? Thanks!
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/18193
This fix is useful. Is there any update on it?
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
Ping @gatorsmile
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@maropu I have added a simple test. Can you check it out?
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@maropu, the method `calculateTotalSize` is already tested in
`StatisticsSuite` in a few places. One such test is "analyze Hive serde tables"
where we check to make sure the calcu
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r201820832
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
---
@@ -162,7 +162,7 @@ object InMemoryFileIndex
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r201820630
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,29 @@ object CommandUtils extends Logging
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@maropu Regarding tests: since I didn't make any changes to
`InMemoryFileIndex`, wouldn't it be better to add the tests in
`StatisticsCollectionSuite`/`StatisticsSuite`?
Also, since we just
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r200824025
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,26 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r200666542
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,26 @@ object CommandUtils extends Logging
Github user Achuth17 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r200544127
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
---
@@ -47,15 +48,26 @@ object CommandUtils extends Logging
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
@maropu I have made the changes. What are the next steps?
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
Yes, in the case where the data is stored in S3, I noticed a significant
difference.
Some rough numbers: when done serially for a table in S3 with 1000
partitions, the `calculateTotalSize`
GitHub user Achuth17 opened a pull request:
https://github.com/apache/spark/pull/21608
[SPARK-24626] [SQL] Improve Analyze Table command
## What changes were proposed in this pull request?
Currently, Analyze table calculates table size sequentially for each
partition. We