[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6371: Spark 3.3: Support storage-partitioned joins

GitBox Tue, 06 Dec 2022 16:25:36 -0800


aokolnychyi commented on code in PR #6371:
URL: https://github.com/apache/iceberg/pull/6371#discussion_r1041611507



##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##########
@@ -42,4 +42,9 @@ private SparkSQLProperties() {}
   // Controls whether to check the order of fields during writes
   public static final String CHECK_ORDERING = "spark.sql.iceberg.check-ordering";
   public static final boolean CHECK_ORDERING_DEFAULT = true;
+
+  // Controls whether to preserve the existing grouping of data while planning splits
+  public static final String PRESERVE_DATA_GROUPING =

Review Comment:
   One alternative name was
   `spark.sql.iceberg.split.preserve-partition-boundaries`. I discarded it
   because we are not really preserving partition boundaries if there are
   multiple partition specs. Also, Spark will push down join keys in the
   future, so we won't really respect partition boundaries as such.
   
   That being said, I'll think more tomorrow. Ideas are always welcome.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

