Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22037#discussion_r209152562
  
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala ---
    @@ -138,10 +142,21 @@ class AvroDeserializer(rootAvroType: Schema, rootCatalystType: DataType) {
                 bytes
               case b: Array[Byte] => b
              case other => throw new RuntimeException(s"$other is not a valid avro binary.")
    -
             }
             updater.set(ordinal, bytes)
     
    +      case (FIXED, d: DecimalType) => (updater, ordinal, value) =>
    +        val bigDecimal = decimalConversions.fromFixed(value.asInstanceOf[GenericFixed], avroType,
    +          LogicalTypes.decimal(d.precision, d.scale))
    --- End diff ---
    
    Compared to `binaryToUnscaledLong`, I think using the method from the Avro library makes more sense.
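
    For reference, here is a minimal sketch of that Avro-library conversion path (Avro 1.8+). The `fixedSchema` name and the 16-byte size are assumptions for illustration, standing in for the real `avroType`:

    ```scala
    import java.math.BigDecimal
    import org.apache.avro.{Conversions, LogicalTypes, Schema, SchemaBuilder}
    import org.apache.avro.generic.GenericData

    val decimalConversions = new Conversions.DecimalConversion()
    // Hypothetical fixed schema; 16 bytes is enough for precision up to 38.
    val fixedSchema: Schema = SchemaBuilder.fixed("dec").size(16)
    val fixed = new GenericData.Fixed(fixedSchema, new Array[Byte](16))
    // fromFixed reads the fixed's bytes as a big-endian unscaled value and
    // applies the scale from the decimal logical type.
    val bigDecimal: BigDecimal =
      decimalConversions.fromFixed(fixed, fixedSchema, LogicalTypes.decimal(38, 18))
    ```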
    
    Also, the method `binaryToUnscaledLong` uses the underlying byte array of the Parquet `Binary` without copying it. (If we create a shared util method for both, the Parquet data source will lose this optimization.)
    
    If performance is a concern, we can create a similar method for Avro. I tried the `binaryToUnscaledLong` approach with Avro and it works. I can change it if you insist.


---
