[GitHub] [hudi] cdmikechen commented on a diff in pull request #3391: [HUDI-83] Fix Timestamp/Date type read by Hive3

GitBox Fri, 20 May 2022 06:21:18 -0700


cdmikechen commented on code in PR #3391:
URL: https://github.com/apache/hudi/pull/3391#discussion_r878137673



##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieColumnProjectionUtils.java:
##########
@@ -109,4 +109,25 @@ public static List<Pair<String,String>> getIOColumnNameAndTypes(Configuration co
         .collect(Collectors.toList());
   }
 
+  /**
+   * if schema contains timestamp columns, this method is used for compatibility when there is no timestamp fields
+   * We expect 3 cases to use parquet-avro reader to read timestamp column:
+   *  1. read columns contain timestamp type
+   *  2. no read columns and exists original columns contain timestamp type
+   *  3. no read columns and no original columns, but avro schema contains type
+   */
+  public static boolean supportTimestamp(Configuration conf) {
+    List<String> reads = Arrays.asList(getReadColumnNames(conf));
+    if (reads.isEmpty()) {
+      return getIOColumnTypes(conf).contains("timestamp");
+    }
+    List<String> names = getIOColumns(conf);
+    if (names.isEmpty()) {
+      return true;

Review Comment:
   @xiarixiaoyao
   My intent is to keep Hive-related test cases that do not explicitly set
   these configurations from failing. I ran into similar failures on the
   Azure pipeline, so I added this check: if the column list is empty, the
   method defaults to true.
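
   A minimal sketch of the fallback discussed above, for illustration only.
   It assumes the three lists that the real method reads from the Hadoop
   `Configuration` are passed in directly, and the class name
   `SupportTimestampSketch` is hypothetical. The first two branches mirror
   the diff; the final loop is an assumption about the part of the method
   that the hunk does not show.

```java
import java.util.List;

class SupportTimestampSketch {

  /**
   * Hypothetical stand-in for supportTimestamp(Configuration).
   *
   * @param reads columns the query asks to read (may be empty)
   * @param names all declared column names (may be empty, e.g. in tests)
   * @param types declared column types, positionally aligned with names
   */
  static boolean supportTimestamp(List<String> reads, List<String> names, List<String> types) {
    // No projected columns: fall back to the declared column types.
    if (reads.isEmpty()) {
      return types.contains("timestamp");
    }
    // Projected columns but no declared column names (configuration not set,
    // as in some Hive tests): default to true rather than failing.
    if (names.isEmpty()) {
      return true;
    }
    // ASSUMPTION: the remainder of the real method is not shown in the diff.
    // Here we check whether any projected column is declared as a timestamp.
    for (String column : reads) {
      int index = names.indexOf(column);
      if (index >= 0 && index < types.size() && "timestamp".equals(types.get(index))) {
        return true;
      }
    }
    return false;
  }
}
```

   Under these assumptions, a call with projected columns and an empty
   `names` list returns true regardless of `types`, which is the default
   that keeps tests without explicit configuration from failing.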



