Thanks for the pointer, Michael. I've downloaded the Spark 1.2.0 snapshot from https://people.apache.org/~pwendell/spark-1.2.0-snapshot1/ and cloned and built the spark-avro repo you linked to.
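
For reference, here is roughly how I'm loading it in the spark-shell (a minimal sketch based on the spark-avro README; "sample.avro" is a placeholder for my actual file path):

    import org.apache.spark.sql.SQLContext
    import com.databricks.spark.avro._

    val sqlContext = new SQLContext(sc)

    // spark-avro's implicit helper on SQLContext reads the Avro schema from
    // the file and converts it to a Spark SQL schema; for my file the
    // exception below is thrown at this point.
    val records = sqlContext.avroFile("sample.avro")
    records.printSchema()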
When I run it against the example Avro file linked in the documentation it works. However, when I try to load my Avro file (linked in my original question) I get the following error:

java.lang.RuntimeException: Unsupported type LONG
        at scala.sys.package$.error(package.scala:27)
        at com.databricks.spark.avro.AvroRelation.com$databricks$spark$avro$AvroRelation$$toSqlType(AvroRelation.scala:116)
        at com.databricks.spark.avro.AvroRelation$$anonfun$5.apply(AvroRelation.scala:97)
        at com.databricks.spark.avro.AvroRelation$$anonfun$5.apply(AvroRelation.scala:96)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        ...

If it would be useful, I'm happy to try loading the other Avro files I have to help battle-test spark-avro.

Thanks

On Thu, Nov 20, 2014 at 6:30 PM, Michael Armbrust <mich...@databricks.com> wrote:

> One option (starting with Spark 1.2, which is currently in preview) is to
> use the Avro library for Spark SQL. This is very new, but we would love to
> get feedback: https://github.com/databricks/spark-avro
>
> On Thu, Nov 20, 2014 at 10:19 AM, al b <beanb...@googlemail.com> wrote:
>
>> I've read several posts from people struggling to read Avro in Spark, and
>> the examples I've tried don't work. When I try this solution
>> (https://stackoverflow.com/questions/23944615/how-can-i-load-avros-in-spark-using-the-schema-on-board-the-avro-files)
>> I get this error:
>>
>> spark java.io.NotSerializableException:
>> org.apache.avro.mapred.AvroWrapper
>>
>> How can I read the following sample file in Spark using Scala?
>>
>> http://www.4shared.com/file/SxnYcdgJce/sample.html
>>
>> Thomas
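
P.S. For completeness, the pre-spark-avro approach from the StackOverflow answer quoted above, which gives me the NotSerializableException, looks roughly like this (a sketch from memory; the "id" field name is just a placeholder):

    import org.apache.avro.generic.GenericRecord
    import org.apache.avro.mapred.{AvroInputFormat, AvroWrapper}
    import org.apache.hadoop.io.NullWritable

    // Read the file with the old Avro MapReduce API; the keys are AvroWrapper
    // objects, which are not Java-serializable.
    val avroRdd = sc.hadoopFile[AvroWrapper[GenericRecord], NullWritable,
      AvroInputFormat[GenericRecord]]("sample.avro")

    // Extracting plain values before anything has to serialize the wrappers
    // (collect, shuffle, etc.) is the usual workaround; "id" is a placeholder.
    val ids = avroRdd.map { case (wrapper, _) => wrapper.datum.get("id").toString }
    ids.take(10).foreach(println)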