[ https://issues.apache.org/jira/browse/SPARK-27388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Taoufik DACHRAOUI updated SPARK-27388:
--------------------------------------
    Description:
*What changes were proposed in this pull request?*

This PR adds expression encoders for beans, java.util.List, java.util.Map and Java enums.

Beans are objects defined by properties. A property is defined by a getter and a setter function, where the getter return type is equal to the type of the setter's single parameter and the getter and setter have the same name; if the getter name is prefixed by "get", then the setter name must be prefixed by "set". See the tests for bean examples.

Avro objects are beans, and thus we can create an expression encoder for Avro objects as follows:
{code:java}
implicit val exprEncoder = ExpressionEncoder[Foo]()
{code}
All Avro types, including fixed types and excluding complex union types, are supported by this addition.

Avro fixed types are beans with exactly one property: bytes.

Complex Avro unions are currently not supported because a complex union is declared as Object, and there cannot be an expression encoder for the Object type (a custom serializer such as Kryo is needed instead).

*How was this patch tested?*

Currently only one encodeDecodeTest was added to ExpressionEncoderSuite; the test uses the following Avro schema:
{code:java}
{"namespace": "org.apache.spark.sql.catalyst.encoders", "type": "record", "name": "AvroExample1", "fields": [
 {"name":"mymoney","type":["null",{"type":"record","name":"Money","namespace":"org.apache.spark.sql.catalyst.encoders","fields":[
 {"name":"amount","type":"float","default":0},
 {"name":"currency","type":{"type":"enum","name":"Currency","symbols":["EUR","USD","BRL"]},"default":"EUR"}]}], "default":null},
 {"name": "myfloat", "type": "float"},
 {"name": "mylong", "type": "long"},
 {"name": "myint", "type": "int"},
 {"name": "mydouble", "type": "double"},
 {"name": "myboolean", "type": "boolean"},
 {"name": "mystring", "type": "string"},
 {"name": "mybytes", "type": "bytes"},
 {"name": "myfixed", "type": {"type": "fixed", "name": "Magic", "size": 4}},
 {"name": "myarray", "type": {"type": "array", "items": "string"}},
 {"name": "mymap", "type": {"type": "map", "values": "int"}}
] }{code}

  was:
*What changes were proposed in this pull request?*

This PR adds expression encoders for beans, java.util.List, java.util.Map and Java enums.

Beans are objects defined by properties. A property is defined by a getter and a setter function, where the getter return type is equal to the type of the setter's single parameter and the getter and setter have the same name; if the getter name is prefixed by "get", then the setter name must be prefixed by "set". See the tests for bean examples.

Avro objects are beans, and thus we can create an expression encoder for Avro objects as follows:
{code:java}
implicit val exprEncoder = ExpressionEncoder[Foo]()
{code}
All Avro types, including fixed types and excluding complex union types, are supported by this addition.

Avro fixed types are beans with exactly one property: bytes.

Complex Avro unions are currently not supported because a complex union is declared as Object, and there cannot be an expression encoder for the Object type (a custom serializer such as Kryo is needed instead).

*How was this patch tested?*

Currently only one encodeDecodeTest was added to ExpressionEncoderSuite; the test uses the following Avro schema:


> expression encoder for avro like objects
> ----------------------------------------
>
>                 Key: SPARK-27388
>                 URL: https://issues.apache.org/jira/browse/SPARK-27388
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.4.1
>            Reporter: Taoufik DACHRAOUI
>            Priority: Major
>
> *What changes were proposed in this pull request?*
> This PR adds expression encoders for beans, java.util.List, java.util.Map and
> Java enums.
> Beans are objects defined by properties. A property is defined by a getter
> and a setter function, where the getter return type is equal to the type of
> the setter's single parameter and the getter and setter have the same name;
> if the getter name is prefixed by "get", then the setter name must be
> prefixed by "set". See the tests for bean examples.
> Avro objects are beans, and thus we can create an expression encoder for Avro
> objects as follows:
> {code:java}
> implicit val exprEncoder = ExpressionEncoder[Foo]()
> {code}
> All Avro types, including fixed types and excluding complex union types, are
> supported by this addition.
> Avro fixed types are beans with exactly one property: bytes.
> Complex Avro unions are currently not supported because a complex union is
> declared as Object, and there cannot be an expression encoder for the Object
> type (a custom serializer such as Kryo is needed instead).
> *How was this patch tested?*
> Currently only one encodeDecodeTest was added to ExpressionEncoderSuite; the
> test uses the following Avro schema:
> {code:java}
> {"namespace": "org.apache.spark.sql.catalyst.encoders", "type": "record", "name": "AvroExample1", "fields": [
>  {"name":"mymoney","type":["null",{"type":"record","name":"Money","namespace":"org.apache.spark.sql.catalyst.encoders","fields":[
>  {"name":"amount","type":"float","default":0},
>  {"name":"currency","type":{"type":"enum","name":"Currency","symbols":["EUR","USD","BRL"]},"default":"EUR"}]}], "default":null},
>  {"name": "myfloat", "type": "float"},
>  {"name": "mylong", "type": "long"},
>  {"name": "myint", "type": "int"},
>  {"name": "mydouble", "type": "double"},
>  {"name": "myboolean", "type": "boolean"},
>  {"name": "mystring", "type": "string"},
>  {"name": "mybytes", "type": "bytes"},
>  {"name": "myfixed", "type": {"type": "fixed", "name": "Magic", "size": 4}},
>  {"name": "myarray", "type": {"type": "array", "items": "string"}},
>  {"name": "mymap", "type": {"type": "map", "values": "int"}}
> ] }{code}