cdmikechen commented on code in PR #3391:
URL: https://github.com/apache/hudi/pull/3391#discussion_r1006287106


##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieColumnProjectionUtils.java:
##########
@@ -109,4 +109,22 @@ public static List<Pair<String,String>> getIOColumnNameAndTypes(Configuration co
         .collect(Collectors.toList());
   }
 
+  /**
+   * if schema contains timestamp columns, this method is used for compatibility when there is no timestamp fields
+   * We expect 3 cases to use parquet-avro reader to read timestamp column:
+   *  1. read columns contain timestamp type
+   *  2. no read columns and exists original columns contain timestamp type
+   *  3. no read columns and no original columns, but avro schema contains type
+   */
+  public static boolean supportTimestamp(Configuration conf) {
+    List<String> reads = Arrays.asList(getReadColumnNames(conf));
+    if (reads.isEmpty()) {
+      return getIOColumnTypes(conf).contains("timestamp");
+    }
+    List<String> names = getIOColumns(conf);
+    List<String> types = getIOColumnTypes(conf);
+    return types.isEmpty() || IntStream.range(0, names.size()).filter(i -> reads.contains(names.get(i)))

Review Comment:
   @xushiyan 
   The `types.isEmpty()` check guards against the worst case, where the 
columns cannot be found. I ran into a similar problem with some of the test 
cases when running the Azure test cases, so I added this handling here.
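
For readers following the thread, the decision logic in the hunk above can be sketched in isolation. This is only an illustration, not the PR's code: the class `TimestampSupportSketch` and its three-list signature are hypothetical stand-ins for the `Configuration`-based helpers, and because the quoted line is cut off after `filter(...)`, the final `anyMatch` on an exact `"timestamp"` type string is an assumption about how the expression ends.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical sketch: mirrors the visible part of supportTimestamp(Configuration),
// but takes the already-extracted lists instead of a Hadoop Configuration.
class TimestampSupportSketch {

  static boolean supportTimestamp(List<String> reads, List<String> names, List<String> types) {
    // No projected read columns: fall back to checking every declared column type.
    if (reads.isEmpty()) {
      return types.contains("timestamp");
    }
    // Worst case from the review comment: column types could not be found,
    // so conservatively report that timestamps may be present.
    if (types.isEmpty()) {
      return true;
    }
    // Assumption: the truncated expression ends by checking whether any
    // projected column has the type "timestamp".
    // The min() bound is an extra guard added in this sketch only.
    int n = Math.min(names.size(), types.size());
    return IntStream.range(0, n)
        .filter(i -> reads.contains(names.get(i)))
        .anyMatch(i -> "timestamp".equals(types.get(i)));
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("id", "ts");
    List<String> types = Arrays.asList("bigint", "timestamp");
    List<String> none = Collections.emptyList();

    System.out.println(supportTimestamp(none, names, types));
    System.out.println(supportTimestamp(Arrays.asList("ts"), names, types));
    System.out.println(supportTimestamp(Arrays.asList("id"), names, types));
    System.out.println(supportTimestamp(Arrays.asList("id"), none, none));
  }
}
```

In this sketch the last call is the fallback under discussion: read columns are requested but no column types are available, so the method answers `true` rather than risk skipping the timestamp-compatible reader.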


