Re: spark-avro aliases incompatible

2017-11-07 Thread Gaspar Muñoz
In the doc you refer:

```scala
// The Avro records get converted to Spark types, filtered, and
// then written back out as Avro records
val df = spark.read.avro("/tmp/episodes.avro")
df.filter("doctor > 5").write.avro("/tmp/output")
```

Alternatively you can specify the format to use instead:
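For reference, the "specify the format" alternative that the documentation goes on to describe can be sketched like this (a minimal sketch assuming spark-avro 4.0.0 on the classpath and an existing `spark` session; the paths are illustrative):

```scala
// Equivalent read/write using the fully qualified data source name
// instead of the .avro() convenience methods
val df = spark.read
  .format("com.databricks.spark.avro")
  .load("/tmp/episodes.avro")

df.filter("doctor > 5")
  .write
  .format("com.databricks.spark.avro")
  .save("/tmp/output")
```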

Re: spark-avro aliases incompatible

2017-11-06 Thread Gourav Sengupta
Hi, I may be wrong about this, but when you are using format("") you are basically using old SPARK classes, which still exist because of backward compatibility. Please refer to the following documentation to take advantage of the recent changes in SPARK:

Re: spark-avro aliases incompatible

2017-11-06 Thread Gaspar Muñoz
Of course. Right now I'm trying locally with Spark 2.2.0 and spark-avro 4.0.0. I've just uploaded a snippet: https://gist.github.com/gasparms/5d0740bd61a500357e0230756be963e1 Basically, my avro schema has a field with an alias, and in the last part of the code spark-avro is not able to read old data
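For context, Avro aliases are declared on the field in the reader schema so that data written under the old field name can still be resolved. A minimal sketch of such a schema (the record and field names here are invented for illustration, not taken from the gist):

```json
{
  "type": "record",
  "name": "Episode",
  "fields": [
    {"name": "doctor_id", "aliases": ["doctor"], "type": "int"}
  ]
}
```

Per the Avro specification, a reader using this schema should match a writer's `doctor` field to `doctor_id` during schema resolution; the issue reported in this thread is that spark-avro does not appear to honor that.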

Re: spark-avro aliases incompatible

2017-11-05 Thread Gourav Sengupta
Hi Gaspar, can you please provide details regarding the environment, versions, libraries and code snippets? For example: SPARK version, OS, distribution, running on YARN, etc. and all other details. Regards, Gourav Sengupta On Sun, Nov 5, 2017 at 9:03 AM, Gaspar Muñoz

spark-avro aliases incompatible

2017-11-05 Thread Gaspar Muñoz
Hi there, I use the avro format to store historical data because of avro schema evolution. I manage external schemas and read them using the avroSchema option, so we have been able to add and delete columns. The problem is that when I introduced aliases the Spark process didn't work as expected, and then I read in
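The external-schema reading pattern described above can be sketched as follows (a sketch assuming spark-avro's `avroSchema` option, an existing `spark` session, and illustrative file paths):

```scala
import scala.io.Source

// Load the externally managed Avro schema as a plain JSON string
val schemaJson = Source.fromFile("/schemas/episode.avsc").mkString

// Pass it to spark-avro so the externally managed reader schema
// (with added/deleted columns, and potentially aliases) drives
// how the stored Avro files are resolved
val df = spark.read
  .format("com.databricks.spark.avro")
  .option("avroSchema", schemaJson)
  .load("/data/historical")
```

With added or deleted columns this works as expected; the thread's report is that alias-based renames in the reader schema are the case that breaks.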