Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22037#discussion_r209152562

--- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala ---
@@ -138,10 +142,21 @@ class AvroDeserializer(rootAvroType: Schema, rootCatalystType: DataType) {
           bytes
         case b: Array[Byte] => b
         case other => throw new RuntimeException(s"$other is not a valid avro binary.")
-      }
       updater.set(ordinal, bytes)

+    case (FIXED, d: DecimalType) => (updater, ordinal, value) =>
+      val bigDecimal = decimalConversions.fromFixed(value.asInstanceOf[GenericFixed], avroType,
+        LogicalTypes.decimal(d.precision, d.scale))
--- End diff --

Compared to `binaryToUnscaledLong`, I think using the method from the Avro library makes more sense here. Also, `binaryToUnscaledLong` works on the underlying byte array of the Parquet `Binary` without copying it; if we created a shared util method for both, the Parquet data source would lose this optimization.

For performance, we could instead create a similar method on the Avro side. I tried the `binaryToUnscaledLong` approach with Avro and it works. I can change it if you insist.
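As a rough illustration, a minimal sketch of what such an Avro-side helper could look like, mirroring Parquet's `binaryToUnscaledLong`. The name `fixedToUnscaledLong` is hypothetical, and it assumes the fixed value fits in 8 bytes (i.e. decimal precision at most 18), the same limit the Parquet fast path has:

```scala
import org.apache.avro.generic.GenericFixed

// Hypothetical helper mirroring Parquet's binaryToUnscaledLong. GenericFixed.bytes()
// returns the backing array without copying, so the big-endian two's-complement
// value can be accumulated directly, with no intermediate BigDecimal allocation
// (unlike Conversions.DecimalConversion.fromFixed).
// Assumes bytes.length <= 8, i.e. the unscaled value fits in a Long.
private def fixedToUnscaledLong(fixed: GenericFixed): Long = {
  val bytes = fixed.bytes()
  var unscaled = 0L
  var i = 0
  while (i < bytes.length) {
    unscaled = (unscaled << 8) | (bytes(i) & 0xff)
    i += 1
  }
  // Sign-extend from the most significant bit of the fixed-length value.
  val bits = 8 * bytes.length
  (unscaled << (64 - bits)) >> (64 - bits)
}
```

The result could then be wrapped as a Catalyst decimal via `Decimal(fixedToUnscaledLong(fixed), d.precision, d.scale)`; wider precisions would still have to fall back to the `fromFixed`/`BigDecimal` path.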