xiarixiaoyao commented on code in PR #8026: URL: https://github.com/apache/hudi/pull/8026#discussion_r1116437743
########## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala ##########

```diff
@@ -155,12 +158,13 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
       }
     }

+    val avroNameAndSpace = AvroConversionUtils.getAvroRecordNameAndNamespace(tableName)
     val avroSchema = internalSchemaOpt.map { is =>
-      AvroInternalSchemaConverter.convert(is, "schema")
+      AvroInternalSchemaConverter.convert(is, avroNameAndSpace._2 + "." + avroNameAndSpace._1)
```

Review Comment:
@alexeykudinkin thanks for your review.
1) Schema evolution has nothing to do with this scenario, since schema evolution calls HoodieAvroUtils.rewriteRecordWithNewSchema to unify the namespace. I changed this line only to ensure that the namespaces of the read schema and the write schema are consistent.
2) The namespace of the schema Hudi uses when writing the log is derived from tableName, but the namespace of the read schema is "schema".
3) When schema evolution is not enabled, for decimal types, different namespaces produce different names, and Avro is name-sensitive.
We should keep the read schema and the write schema in the same namespace, just as previous versions of Hudi did. E.g. for a column `ff decimal(38, 10)`, the Hudi log write schema for `ff` will be:

```json
{"name":"ff","type":[{"type":"fixed","name":"fixed","namespace":"hoodie.h0.h0_record.ff","size":16,"logicalType":"decimal","precision":38,"scale":10},"null"]}
```

while the Spark read schema for `ff` will be:

```json
{"name":"ff","type":[{"type":"fixed","name":"fixed","namespace":"Record.ff","size":16,"logicalType":"decimal","precision":38,"scale":10},"null"]}
```

The read schema and the write schema are incompatible, so we cannot use the read schema to read the log. Previous versions of Hudi did not have this problem:

```
Caused by: org.apache.avro.AvroTypeException: Found hoodie.h0.h0_record.ff.fixed, expecting union
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:308)
    at org.apache.avro.io.parsing.Parser.advance(Parser.java:86)
    at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:275)
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:188)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161)
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:260)
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:248)
    at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:154)
    at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:201)
    at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock$RecordIterator.next(HoodieAvroDataBlock.java:149)
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
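The namespace mismatch described in the comment can be demonstrated without the Avro library. This is a minimal sketch: `record_name_and_namespace` is a hypothetical stand-in for `AvroConversionUtils.getAvroRecordNameAndNamespace`, and its `"<table>_record"` / `"hoodie.<table>"` convention is an assumption inferred from the `hoodie.h0.h0_record` namespace in the write schema above.

```python
# Avro matches named types (fixed/record/enum) by FULL name, i.e.
# namespace + "." + name, so the same fixed schema under two different
# namespaces is treated as two unrelated types.

def record_name_and_namespace(table_name: str):
    # Hypothetical stand-in for AvroConversionUtils.getAvroRecordNameAndNamespace;
    # the convention below is inferred from the "hoodie.h0.h0_record" example.
    return f"{table_name}_record", f"hoodie.{table_name}"

name, namespace = record_name_and_namespace("h0")

# Write path: full name of the decimal's fixed type inside the record.
write_fixed_fullname = f"{namespace}.{name}.ff.fixed"
# Old read path: the converter was handed a generic name, yielding "Record.ff".
read_fixed_fullname = "Record.ff.fixed"

print(write_fixed_fullname)  # hoodie.h0.h0_record.ff.fixed
print(read_fixed_fullname)   # Record.ff.fixed

# The mismatch is exactly what the AvroTypeException reports:
# "Found hoodie.h0.h0_record.ff.fixed, expecting union".
assert write_fixed_fullname != read_fixed_fullname

# The patched line passes namespace + "." + name as the reader record's
# full name, so both sides now derive the same fixed full name.
patched_read_fullname = f"{namespace}.{name}.ff.fixed"
assert patched_read_fullname == write_fixed_fullname
```

This illustrates why the fix only needs to change the name passed to `AvroInternalSchemaConverter.convert`: once the reader record's full name matches the writer's, every nested named type (including the decimal's `fixed`) resolves identically on both sides.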