In the doc you refer to:
// The Avro records get converted to Spark types, filtered, and
// then written back out as Avro records
val df = spark.read.avro("/tmp/episodes.avro")
df.filter("doctor > 5").write.avro("/tmp/output")
Alternatively you can specify the format to use instead:
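Roughly, the format-based equivalent looks like this (reusing the paths
from the snippet above):

// Same read/filter/write as above, but naming the data source
// explicitly instead of using the .avro() shorthand
val df = spark.read.format("com.databricks.spark.avro")
  .load("/tmp/episodes.avro")
df.filter("doctor > 5").write.format("com.databricks.spark.avro")
  .save("/tmp/output")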
Hi,
I may be wrong about this, but when you are using format("") you are
basically using old Spark classes, which still exist for backward
compatibility.
Please refer to the following documentation to take advantage of the recent
changes in Spark:
Of course. Right now I'm testing locally with Spark 2.2.0 and spark-avro
4.0.0. I've just uploaded a snippet:
https://gist.github.com/gasparms/5d0740bd61a500357e0230756be963e1
Basically, my Avro schema has a field with an alias, and in the last part of
the code spark-avro is not able to read the old data.
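For anyone who doesn't want to open the gist, the failing case has roughly
this shape (the schema and field names here are illustrative, not copied
from the gist):

// Old files were written with a field named "number"; the new reader
// schema renames it to "episode_number" and lists the old name as an
// alias, as the Avro spec allows.
val readerSchema = """{
  "type": "record",
  "name": "Episode",
  "fields": [
    {"name": "title", "type": "string"},
    {"name": "episode_number", "type": "int", "aliases": ["number"]}
  ]
}"""

// The alias should let this read the old files, but spark-avro fails here.
val oldData = spark.read
  .format("com.databricks.spark.avro")
  .option("avroSchema", readerSchema)
  .load("/tmp/old_episodes.avro")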
Hi Gaspar,
can you please provide details about your environment, versions, libraries,
and code snippets?
For example: Spark version, OS, distribution, whether you are running on
YARN, and any other details.
Regards,
Gourav Sengupta
On Sun, Nov 5, 2017 at 9:03 AM, Gaspar Muñoz wrote:
Hi there,
I use the Avro format to store historical data because of Avro's schema
evolution. I manage external schemas and read them using the avroSchema
option, so we have been able to add and delete columns.
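A minimal sketch of that pattern (the file paths here are made up):

// Load the current schema from an external .avsc file and use it as
// the reader schema; columns added in the current schema (with
// defaults) appear for old files, and deleted columns are dropped.
val currentSchema = scala.io.Source.fromFile("/schemas/episode.avsc").mkString
val history = spark.read
  .format("com.databricks.spark.avro")
  .option("avroSchema", currentSchema)
  .load("/data/episodes/*.avro")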
The problem is that when I introduced aliases, the Spark process didn't work
as expected, and then I read in