Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21847#discussion_r208841399
  
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala ---
    @@ -87,10 +87,18 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable:
             (getter, ordinal) => getter.getDouble(ordinal)
           case d: DecimalType =>
             (getter, ordinal) => getter.getDecimal(ordinal, d.precision, d.scale).toString
    -      case StringType =>
    -        (getter, ordinal) => new Utf8(getter.getUTF8String(ordinal).getBytes)
    -      case BinaryType =>
    -        (getter, ordinal) => ByteBuffer.wrap(getter.getBinary(ordinal))
    +      case StringType => avroType.getType match {
    +        case Type.ENUM =>
    +          (getter, ordinal) => new EnumSymbol(avroType, getter.getUTF8String(ordinal).toString)
    +        case _ =>
    +          (getter, ordinal) => new Utf8(getter.getUTF8String(ordinal).getBytes)
    +      }
    +      case BinaryType => avroType.getType match {
    +        case Type.FIXED =>
    --- End diff --
    
    FIXED has a "size" attribute; shall we take it into account when preparing the 
    bytes? E.g. should we throw an exception if the bytes from Spark exceed the size, 
    and should we pad the bytes when their length is smaller than the size?
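    One possible shape of the check being suggested (a sketch only, not code from this PR; `toAvroFixed` is a hypothetical helper, and whether to zero-pad or fail on short input is exactly the open question):
    
    ```scala
    import org.apache.avro.{Schema, SchemaBuilder}
    import org.apache.avro.generic.GenericData
    
    // Hypothetical helper: reject byte arrays longer than the FIXED size,
    // and zero-pad (on the right) arrays that are shorter than it.
    def toAvroFixed(avroType: Schema, bytes: Array[Byte]): GenericData.Fixed = {
      val size = avroType.getFixedSize
      if (bytes.length > size) {
        throw new IllegalArgumentException(
          s"Cannot write ${bytes.length} bytes of binary data into FIXED type with size $size")
      }
      val padded = if (bytes.length < size) bytes.padTo(size, 0.toByte) else bytes
      new GenericData.Fixed(avroType, padded)
    }
    
    val fixedSchema = SchemaBuilder.fixed("f").size(4)
    val result = toAvroFixed(fixedSchema, Array[Byte](1, 2))
    ```
    
    Silently zero-padding may hide data problems, so throwing in both directions (too long and too short) is arguably the safer default.
    
    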


---
