Re: [PR] Spark 3.5: Fix Incorrect Spec Used With AddFiles Procedure [iceberg]

via GitHub Wed, 19 Feb 2025 13:32:11 -0800


RussellSpitzer commented on code in PR #12319:
URL: https://github.com/apache/iceberg/pull/12319#discussion_r1962400715



##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java:
##########
@@ -1085,4 +1088,43 @@ private ExecutorService getService() {
       return service;
     }
   }
+
+  /**
+   * Returns the first partition spec in an IcebergTable that shares the same names and ordering as
+   * the partition columns in a given Spark Table. Throws an error if not found
+   */
+  private static PartitionSpec findCompatibleSpec(
+      Table icebergTable, SparkSession spark, String sparkTable) throws AnalysisException {
+    List<String> parts = Lists.newArrayList(Splitter.on('.').limit(2).split(sparkTable));
+    String db = parts.size() == 1 ? "default" : parts.get(0);
+    String table = parts.get(parts.size() == 1 ? 0 : 1);
+
+    List<String> sparkPartNames =
+        spark.catalog().listColumns(db, table).collectAsList().stream()
+            .filter(org.apache.spark.sql.catalog.Column::isPartition)
+            .map(org.apache.spark.sql.catalog.Column::name)
+            .map(name -> name.toLowerCase(Locale.ROOT))
+            .collect(Collectors.toList());
+
+    for (PartitionSpec icebergSpec : icebergTable.specs().values()) {

Review Comment:
   I don't think we have this in the spec, but the implementation here will
always reuse an identical spec rather than create a new, identical one. That
said, either would be valid if duplicates did exist, but I don't think they
should.
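
   To illustrate the point about duplicates, here is a minimal, self-contained
sketch of a first-match lookup. It does not use Iceberg classes: the map of
spec ID to ordered field names is a hypothetical stand-in for
`icebergTable.specs()`, and `SpecReuseSketch` and `findCompatibleSpecId` are
made-up names. Spec 2 is an artificial duplicate of spec 1, added only to show
that the loop settles on whichever matching spec it encounters first.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class SpecReuseSketch {

  // Hypothetical model: spec ID -> ordered partition field names.
  // Returns the ID of the first spec whose names match, ignoring case.
  static Optional<Integer> findCompatibleSpecId(
      Map<Integer, List<String>> specsById, List<String> sparkPartNames) {
    List<String> wanted = lower(sparkPartNames);
    for (Map.Entry<Integer, List<String>> entry : specsById.entrySet()) {
      // List.equals compares element by element, so ordering matters.
      if (lower(entry.getValue()).equals(wanted)) {
        return Optional.of(entry.getKey());
      }
    }
    return Optional.empty();
  }

  // Lower-cases every name with a fixed locale, as the diff above does.
  static List<String> lower(List<String> names) {
    return names.stream()
        .map(name -> name.toLowerCase(Locale.ROOT))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order, so "first" is well defined here.
    Map<Integer, List<String>> specs = new LinkedHashMap<>();
    specs.put(0, List.of("id"));
    specs.put(1, List.of("id", "dep"));
    specs.put(2, List.of("id", "dep")); // artificial duplicate of spec 1

    // Matches spec 1 and never reaches the duplicate.
    System.out.println(findCompatibleSpecId(specs, List.of("ID", "Dep")));
    // Same names in a different order: no compatible spec.
    System.out.println(findCompatibleSpecId(specs, List.of("dep", "id")));
  }
}
```

   In this model, returning spec 2 instead would be just as valid, since the
two entries describe the same partitioning; the sketch only fixes an order to
make the result deterministic.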



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.