szehon-ho commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-2043308510
Thanks @sunchao and all for the review!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
sunchao closed pull request #45267: [SPARK-47094][SQL] SPJ : Dynamically
rebalance number of buckets when they are not equal
URL: https://github.com/apache/spark/pull/45267
sunchao commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-2040912874
Thanks, merged to master!
szehon-ho commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-2035996293
@sunchao thanks, can you take another look? I think the failed test is not
related (looks like an error in pyspark uploading the test report).
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1538630610
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,106 @@
szehon-ho commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-2019159327
@sunchao if you have time for another look: reverted to use a specific
argument type and method for bucket; we can worry about other parameterized
transforms later.
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1538195196
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,106 @@
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1536551512
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,106 @@
szehon-ho commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-2016084578
@sunchao thanks! Addressed review comments. Lmk if supporting a generic
function parameter is too messy and we want to switch back to just bucket for
the first cut.
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1536299907
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,106 @@
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1536300235
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,106 @@
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1535145417
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -846,6 +879,20 @@ case class KeyGroupedShuffleSpec(
}
}
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1530840650
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,72 @@
szehon-ho commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-2008733057
@sunchao can you take another look? I guess the open question is how to support
other potential parameterized transforms in the future; let me know your
thoughts.
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1529561359
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,72 @@
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1529562057
##
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala:
##
@@ -505,11 +506,28 @@ case class EnsureRequirements(
}
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1529047571
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/Reducer.java:
##
@@ -0,0 +1,33 @@
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1523710658
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -829,20 +846,59 @@ case class KeyGroupedShuffleSpec(
}
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1523707057
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -829,20 +846,59 @@ case class KeyGroupedShuffleSpec(
}
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1523706702
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,42 @@
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1521800234
##
sql/core/src/test/scala/org/apache/spark/sql/connector/KeyGroupedPartitioningSuite.scala:
##
@@ -1310,6 +1314,312 @@ class KeyGroupedPartitioningSuite extends
advancedxy commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1521751659
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -635,6 +636,22 @@ trait ShuffleSpec {
*/
def
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1521729517
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -635,6 +636,22 @@ trait ShuffleSpec {
*/
def
advancedxy commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1521439073
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -635,6 +636,22 @@ trait ShuffleSpec {
*/
def
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1520267117
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -635,6 +636,22 @@ trait ShuffleSpec {
*/
def
sunchao commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1520187304
##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/functions/ReducibleFunction.java:
##
@@ -0,0 +1,42 @@
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1506339765
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -635,6 +636,22 @@ trait ShuffleSpec {
*/
def
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1506338476
##
sql/core/src/test/scala/org/apache/spark/sql/connector/KeyGroupedPartitioningSuite.scala:
##
@@ -1310,6 +1314,312 @@ class KeyGroupedPartitioningSuite extends
advancedxy commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1505984397
##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##
@@ -635,6 +636,22 @@ trait ShuffleSpec {
*/
def
advancedxy commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1505905928
##
sql/core/src/test/scala/org/apache/spark/sql/connector/KeyGroupedPartitioningSuite.scala:
##
@@ -1310,6 +1314,312 @@ class KeyGroupedPartitioningSuite extends
szehon-ho commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-1967323627
Thanks @himadripal, I was able to modify one of my tests to also reproduce
it, and it should be fixed now.
sunchao commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-1965888375
Thanks @szehon-ho! Will take a look in a few days.
himadripal commented on PR #45267:
URL: https://github.com/apache/spark/pull/45267#issuecomment-1965681861
was trying this test case:
`val partition1 = Array(Expressions.years("ts"), bucket(2, "id"))`
`val partition2 = Array(Expressions.days("ts"), bucket(4, "id"))`
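The mismatched bucket counts above (2 vs. 4) are exactly the case this PR targets: instead of shuffling, the side with more buckets can be coalesced onto the other when one count divides the other. A minimal standalone sketch of that reduction idea, assuming a simple hash-modulo bucketing; the `bucketId` and `reduce` helpers here are illustrative, not Spark's actual `ReducibleFunction`/`Reducer` API:

```java
// Illustrative sketch of bucket-count reduction (not Spark's actual API).
public class BucketReduceSketch {

    // Analogous to bucket(n, col): assign a key to one of n buckets by hash.
    static int bucketId(int numBuckets, int key) {
        return Math.floorMod(Integer.hashCode(key), numBuckets);
    }

    // When the larger bucket count is a multiple of the smaller one, bucket ids
    // on the larger side can be coalesced by taking them modulo the smaller count.
    static int reduce(int bucketId, int smallerBuckets) {
        return bucketId % smallerBuckets;
    }

    public static void main(String[] args) {
        boolean consistent = true;
        for (int key = 0; key < 100; key++) {
            // Reducing 4-bucket ids reproduces the 2-bucket assignment for each
            // key, so rows group identically on both sides without a shuffle.
            consistent &= reduce(bucketId(4, key), 2) == bucketId(2, key);
        }
        System.out.println(consistent); // prints true
    }
}
```

Presumably this is the role a `Reducer` obtained from `ReducibleFunction` plays in the PR, letting `KeyGroupedShuffleSpec` treat the two partitionings as compatible.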
szehon-ho commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1503537598
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -1537,6 +1537,18 @@ object SQLConf {
.booleanConf
himadripal commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1503514183
##
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##
@@ -1537,6 +1537,18 @@ object SQLConf {
.booleanConf