Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

via GitHub Wed, 31 Jul 2024 17:00:24 -0700


rdblue commented on code in PR #10678:
URL: https://github.com/apache/iceberg/pull/10678#discussion_r1699225863



##########
api/src/main/java/org/apache/iceberg/PartitionSpec.java:
##########
@@ -427,13 +429,21 @@ private void checkForRedundantPartitions(PartitionField field) {
       dedupFields.put(dedupKey, field);
     }
 
+    public Builder caseSensitive(boolean sensitive) {
+      this.caseSensitive = sensitive;
+      return this;
+    }
+
     public Builder withSpecId(int newSpecId) {
       this.specId = newSpecId;
       return this;
     }
 
     private Types.NestedField findSourceColumn(String sourceName) {
-      Types.NestedField sourceColumn = schema.findField(sourceName);
+      Types.NestedField sourceColumn =
+          this.caseSensitive
+              ? schema.findField(sourceName)
+              : schema.caseInsensitiveFindField(sourceName);

Review Comment:
   The current behavior is to fail when a schema that requires case sensitivity 
is used in a case-insensitive way. Right now that failure is an exception 
thrown when [creating the case-insensitive 
index](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/TypeUtil.java#L184-L190).
   
   Because Iceberg needs to work in contexts that are both case sensitive and 
case insensitive, I think it is the right strategy to fail at runtime. Iceberg 
can't impose case insensitivity on engines that are case sensitive or vice 
versa.
   
   In nearly all situations, this works fine and is not noticed because it is 
very uncommon to have both `data` and `DATA` in a schema.
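
   To illustrate the fail-at-runtime strategy described above, here is a minimal, self-contained sketch. It is not Iceberg's implementation: the class `CaseInsensitiveIndexSketch` and the helper `buildLowerCaseIndex` are hypothetical names, and the field IDs are simply list positions. The sketch assumes the index is keyed by lower-cased column names and that a collision should raise an exception rather than silently pick one column.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class CaseInsensitiveIndexSketch {
  // Hypothetical helper (not an Iceberg API): maps lower-cased column names to
  // their position in the list, and rejects names that collide once case is ignored.
  static Map<String, Integer> buildLowerCaseIndex(List<String> names) {
    Map<String, Integer> index = new HashMap<>();
    for (int id = 0; id < names.size(); id++) {
      String key = names.get(id).toLowerCase(Locale.ROOT);
      // putIfAbsent returns the previous mapping, if any, without overwriting it.
      Integer existing = index.putIfAbsent(key, id);
      if (existing != null) {
        throw new IllegalArgumentException(
            "Ambiguous column name when ignoring case: " + names.get(id));
      }
    }
    return index;
  }

  public static void main(String[] args) {
    // A typical schema: no two names differ only by case, so lookup works.
    Map<String, Integer> index = buildLowerCaseIndex(List.of("id", "Data"));
    System.out.println(index.get("data"));

    // A schema that requires case sensitivity fails when indexed without it.
    try {
      buildLowerCaseIndex(List.of("data", "DATA"));
    } catch (IllegalArgumentException e) {
      System.out.println("failed: " + e.getMessage());
    }
  }
}
```

   The design point is that the failure is deferred until a case-insensitive lookup is actually requested, so a schema containing both `data` and `DATA` remains fully usable in case-sensitive contexts.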



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

