[GitHub] [iceberg] edgarRd commented on a change in pull request #1140: ORC: skip non-iceberg columns when converting schema to Iceberg

GitBox Mon, 29 Jun 2020 10:23:59 -0700


edgarRd commented on a change in pull request #1140:
URL: https://github.com/apache/iceberg/pull/1140#discussion_r447132136




##########
File path: 
orc/src/main/java/org/apache/iceberg/orc/OrcSchemaWithTypeVisitor.java
##########
@@ -63,12 +68,20 @@
     List<TypeDescription> fields = record.getChildren();
     List<String> names = record.getFieldNames();
     List<T> results = Lists.newArrayListWithExpectedSize(fields.size());
-    for (TypeDescription field : fields) {
-      int fieldId = ORCSchemaUtil.fieldId(field);
-      Types.NestedField iField = struct != null ? struct.field(fieldId) : null;
+    List<String> includedNames = 
Lists.newArrayListWithExpectedSize(names.size());
+    for (int i = 0; i < fields.size(); ++i) {
+      TypeDescription field = fields.get(i);
+      String name = names.get(i);
+      Optional<Integer> fieldId = ORCSchemaUtil.icebergID(field);

Review comment:
       I generally agree that the visitor should only be concerned with the 
traversal. However, in this case I think the requirement of having knowledge of 
the `fieldId ` was already embedded in this visitor due to traversing the 
Iceberg schema as well - which seems like it's assumed to be a full conversion 
of the `TypeDescription`, if a field was not found instead it would just fail 
the process. 
   
   I think it'd be better to have a generic `OrcSchemaVisitor` without context 
on the Iceberg `Schema`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] edgarRd commented on a change in pull request #1140: ORC: skip non-iceberg columns when converting schema to Iceberg

Reply via email to