Re: [PR] Hive: Return new scan after applying column project parameter [iceberg]

via GitHub Tue, 11 Jun 2024 19:59:47 -0700


zhangbutao commented on code in PR #10449:
URL: https://github.com/apache/iceberg/pull/10449#discussion_r1635749793



##########
mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java:
##########
@@ -125,11 +125,9 @@ public List<InputSplit> getSplits(JobContext context) {
     }
     String schemaStr = conf.get(InputFormatConfig.READ_SCHEMA);
     if (schemaStr != null) {
-      scan.project(SchemaParser.fromJson(schemaStr));
-    }
-    String[] selectedColumns = conf.getStrings(InputFormatConfig.SELECTED_COLUMNS);
-    if (selectedColumns != null) {
-      scan.select(selectedColumns);
+      scan = scan.project(SchemaParser.fromJson(schemaStr));
+    } else if (conf.getStrings(InputFormatConfig.SELECTED_COLUMNS) != null) {

Review Comment:
   BTW, it makes no difference whether the scan has projected columns, 
because on the Hive side we use the conf to propagate the projected columns. 
See this comment: 
https://github.com/apache/hive/pull/5282#issuecomment-2154202895
   
   But we need to take care if we allow users to set the projected columns 
(mainly `iceberg.mr.read.schema`), since that value is propagated through the 
conf and used by the Iceberg reader.
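
   For context on the diff itself, here is a minimal sketch of why the 
reassignment (`scan = scan.project(...)`) matters when refinement methods 
return a new object instead of mutating the receiver. `FakeScan` and the helper 
methods are hypothetical stand-ins written for illustration; they are not 
Iceberg's `TableScan` API, and the column names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ScanReassignmentSketch {

  // Hypothetical immutable scan used only for illustration (not Iceberg's TableScan).
  static final class FakeScan {
    private final List<String> columns;

    FakeScan(List<String> columns) {
      this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    }

    // Returns a NEW scan restricted to the given columns; the receiver is unchanged.
    FakeScan select(List<String> selected) {
      return new FakeScan(selected);
    }

    List<String> columns() {
      return columns;
    }
  }

  // Pattern removed by the diff: the returned scan is dropped, so the selection is lost.
  static List<String> discardResult(FakeScan scan, List<String> selected) {
    scan.select(selected);
    return scan.columns();
  }

  // Pattern introduced by the diff: keep the returned scan.
  static List<String> reassignResult(FakeScan scan, List<String> selected) {
    scan = scan.select(selected);
    return scan.columns();
  }

  public static void main(String[] args) {
    FakeScan scan = new FakeScan(Arrays.asList("id", "name", "ts"));
    List<String> selected = Arrays.asList("id");

    System.out.println(discardResult(scan, selected));  // prints [id, name, ts]
    System.out.println(reassignResult(scan, selected)); // prints [id]
  }
}
```

   In this sketch, only the reassigned variant carries the column selection 
forward, which is the behavior the `+` lines in the hunk aim for.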



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

