[GitHub] [spark] Fokko commented on a change in pull request #24405: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas

2019-12-03 Thread GitBox
Fokko commented on a change in pull request #24405: [SPARK-27506][SQL] Allow 
deserialization of Avro data using compatible schemas
URL: https://github.com/apache/spark/pull/24405#discussion_r353590959
 
 

 ##
 File path: docs/sql-data-sources-avro.md
 ##
 @@ -240,6 +240,14 @@ Data source options of Avro can be set via:
 
 function from_avro
   
+  
+writerSchema
 
 Review comment:
   I would stick to `writerSchema`, mostly because this is also the term used 
in Avro itself: 
https://avro.apache.org/docs/1.9.1/api/java/org/apache/avro/hadoop/io/AvroValueDeserializer.html


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] Fokko commented on a change in pull request #24405: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas

2019-10-16 Thread GitBox
Fokko commented on a change in pull request #24405: [SPARK-27506][SQL] Allow 
deserialization of Avro data using compatible schemas
URL: https://github.com/apache/spark/pull/24405#discussion_r335344093
 
 

 ##
 File path: docs/sql-data-sources-avro.md
 ##
 @@ -240,6 +240,14 @@ Data source options of Avro can be set via:
 
 function from_avro
   
+  
+writerSchema
+None
+Optional Avro schema (in JSON format) that was used to serialize the data. This should be set if the schema provided
+  for deserialization is compatible with - but not the same as - the one used to originally convert the data to Avro.
+
 
 Review comment:
   Would it be possible to link to the Confluent documentation? They have an 
excellent document on schema compatibility and evolution: 
https://docs.confluent.io/current/schema-registry/avro.html





[GitHub] [spark] Fokko commented on a change in pull request #24405: [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas

2019-10-16 Thread GitBox
Fokko commented on a change in pull request #24405: [SPARK-27506][SQL] Allow 
deserialization of Avro data using compatible schemas
URL: https://github.com/apache/spark/pull/24405#discussion_r335344326
 
 

 ##
 File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroFunctionsSuite.scala
 ##
 @@ -153,4 +153,45 @@ class AvroFunctionsSuite extends QueryTest with SharedSparkSession {
   assert(df.collect().map(_.get(0)) === Seq(Row("one"), Row("two"), Row("three"), Row("four")))
 }
   }
+
+  test("SPARK-27506: roundtrip in to_avro and from_avro with different compatible schemas") {
 
 Review comment:
   I would also add a test with an incompatible schema, for example, changing a 
`string` to an `int`.
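
To illustrate why `string` to `int` counts as incompatible, here is a minimal sketch, assuming only the primitive type promotions listed in the schema resolution rules of the Avro specification. `PrimitivePromotion` and `canRead` are hypothetical names used for illustration, not Avro or Spark API, and the sketch ignores records, unions, and default values.

```scala
// Hypothetical helper, not part of Avro or Spark. It models only the
// primitive promotions from the Avro specification's schema resolution
// rules: the writer's type may be promoted to the reader's type.
object PrimitivePromotion {
  // Writer type -> reader types it may be promoted to.
  private val promotions: Map[String, Set[String]] = Map(
    "int"    -> Set("long", "float", "double"),
    "long"   -> Set("float", "double"),
    "float"  -> Set("double"),
    "string" -> Set("bytes"),
    "bytes"  -> Set("string")
  )

  // True if data written as writerType can be read as readerType.
  def canRead(writerType: String, readerType: String): Boolean =
    writerType == readerType ||
      promotions.getOrElse(writerType, Set.empty[String]).contains(readerType)
}
```

Under these rules `PrimitivePromotion.canRead("int", "long")` is true, while `PrimitivePromotion.canRead("string", "int")` is false, which is the kind of case the incompatible-schema test should cover.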

