rdblue commented on code in PR #6371:
URL: https://github.com/apache/iceberg/pull/6371#discussion_r1055860782
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java:
##########
@@ -42,4 +42,9 @@ private SparkSQLProperties() {}
// Controls whether to check the order of fields during writes
public static final String CHECK_ORDERING =
"spark.sql.iceberg.check-ordering";
public static final boolean CHECK_ORDERING_DEFAULT = true;
+
+ // Controls whether to preserve the existing grouping of data while planning splits
+ public static final String PRESERVE_DATA_GROUPING =
+ "spark.sql.iceberg.split.preserve-data-grouping";
+ public static final boolean PRESERVE_DATA_GROUPING_DEFAULT = false;
Review Comment:
I think we will want to remove this property in the long term, but that will
require some help from Spark. Ideally, Spark would tell the source whether it
cares about preserving grouping, and on which columns it matters. With that
information, we wouldn't need this property at all.
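
For context, here is a minimal sketch of how a reader of this property might resolve it until Spark can supply that information itself. The property name and default are copied from the diff above; the class `PreserveGroupingSketch`, the helper `preserveDataGrouping`, and the plain `Map` standing in for the Spark session configuration are assumptions for illustration, not the code in this PR.

```java
import java.util.HashMap;
import java.util.Map;

public class PreserveGroupingSketch {
  // Name and default copied from the diff under review.
  static final String PRESERVE_DATA_GROUPING =
      "spark.sql.iceberg.split.preserve-data-grouping";
  static final boolean PRESERVE_DATA_GROUPING_DEFAULT = false;

  // Hypothetical helper: sessionConf is a stand-in for the Spark session
  // configuration. Falls back to the default when the property is unset.
  static boolean preserveDataGrouping(Map<String, String> sessionConf) {
    String value = sessionConf.get(PRESERVE_DATA_GROUPING);
    return value == null ? PRESERVE_DATA_GROUPING_DEFAULT : Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Unset: the default (false) applies.
    System.out.println(preserveDataGrouping(conf));

    // Explicit opt-in by the user.
    conf.put(PRESERVE_DATA_GROUPING, "true");
    System.out.println(preserveDataGrouping(conf));
  }
}
```

With a default of `false`, grouping is preserved only when the user opts in explicitly, which is the manual step that a hint from Spark would make unnecessary.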
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]