[ https://issues.apache.org/jira/browse/HUDI-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1662:
---------------------------------
    Labels: pull-request-available  (was: )

> Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType
> ----------------------------------------------------------------------------------------
>
>                 Key: HUDI-1662
>                 URL: https://issues.apache.org/jira/browse/HUDI-1662
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Hive Integration
>   Affects Versions: 0.7.0
>         Environment: hive 3.1.1
>                      spark 2.4.5
>                      hadoop 3.1.1
>                      suse os
>            Reporter: tao meng
>            Priority: Major
>              Labels: pull-request-available
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Step 1: prepare a raw DataFrame with a DateType column and insert it into the Hudi MOR table:
> df_raw.withColumn("date", lit(Date.valueOf("2020-11-10")))
> merge(df_raw, "bulk_insert", "huditest.bulkinsert_mor_10g")
> Step 2: prepare an update DataFrame with a DateType column and upsert it into the Hudi MOR table:
> df_update = sql("select * from huditest.bulkinsert_mor_10g_rt").withColumn("date", lit(Date.valueOf("2020-11-11")))
> merge(df_update, "upsert", "huditest.bulkinsert_mor_10g")
>
> Step 3: query the mor_rt table with Hive beeline or spark-sql by executing:
> select * from huditest.bulkinsert_mor_10g_rt where primary_key = 10000000;
> The following error then occurs:
> _java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritableV2_
>
>
> Root cause analysis:
> Hudi uses the Avro format to store log files, and Avro stores DateType as INT (the type is INT but the logicalType is date).
> When Hudi reads a log file and converts an Avro INT record to a Writable, the logicalType is not respected, so the DateType value is converted to an IntWritable.
> See:
> [https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeRecordReaderUtils.java#L169]
>
> Modification plan: when converting an Avro INT type to a Writable, the logicalType must be considered:
>
> case INT:
>   if (schema.getLogicalType() != null && schema.getLogicalType().getName().equals("date")) {
>     return new DateWritable((Integer) value);
>   } else {
>     return new IntWritable((Integer) value);
>   }
>
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
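
As an illustration of the root cause described in the issue: Avro's date logical type annotates an int that holds the number of days since the Unix epoch, so the stored value only becomes a date when the reader checks the logical type. The following is a minimal, dependency-free sketch of that dispatch and is not Hudi's code. The class AvroDateSketch, the method convertInt, and the logicalTypeName parameter are hypothetical names chosen for this example, and java.time.LocalDate stands in for Hive's date Writable.

```java
import java.time.LocalDate;

public class AvroDateSketch {

    // Hypothetical stand-in for the INT branch of the record converter.
    // logicalTypeName plays the role of schema.getLogicalType().getName();
    // null means the Avro schema carries no logical type.
    static Object convertInt(int value, String logicalTypeName) {
        if ("date".equals(logicalTypeName)) {
            // Avro "date": the int is a count of days since 1970-01-01.
            return LocalDate.ofEpochDay(value);
        }
        // Plain INT: keep the numeric value as-is.
        return Integer.valueOf(value);
    }

    public static void main(String[] args) {
        // The value Avro would store for the date used in step 1 of the report.
        int stored = (int) LocalDate.of(2020, 11, 10).toEpochDay();

        System.out.println(convertInt(stored, "date")); // prints 2020-11-10
        System.out.println(convertInt(stored, null));   // prints 18576
    }
}
```

The second call shows what happens when the logical type is ignored: the reader hands back a bare integer, which matches the IntWritable seen in the ClassCastException above.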