Yeah, I have the same problem with 1.1.0, but not 1.0.0.
--
Add these to your dependencies:
  "io.netty" % "netty" % "3.6.6.Final"
and append exclude("io.netty", "netty-all") to the end of the Spark and Hadoop dependencies (see the sketch below).
Reference: https://spark-project.atlassian.net/browse/SPARK-1138
I am using Spark 1.1, so the Akka issue is already fixed.
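
For reference, this is roughly what it looks like in build.sbt; the Spark and Hadoop artifact names and version numbers below are only examples, so keep whatever your build already uses:

  // build.sbt sketch -- versions and artifact names are examples only
  libraryDependencies ++= Seq(
    // pin the old Netty artifact explicitly
    "io.netty" % "netty" % "3.6.6.Final",
    // keep netty-all out of the Spark and Hadoop artifacts so the two
    // Netty versions do not collide on the classpath
    "org.apache.spark" %% "spark-core" % "1.1.0" exclude("io.netty", "netty-all"),
    "org.apache.hadoop" % "hadoop-client" % "2.4.0" exclude("io.netty", "netty-all")
  )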
--
YES! This worked! Thanks!
--
Hi, I am facing a similar dilemma. I am trying to aggregate a bunch of small
Avro files into one Avro file. I read them in with:
sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
  AvroKeyInputFormat[GenericRecord]](path)
but I can't find saveAsHadoopFile or saveAsNewAPIHadoopFile. Can you help?
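They should show up once the pair-RDD operations are in scope; in Spark 1.x that means importing org.apache.spark.SparkContext._ so the implicit conversion to PairRDDFunctions applies. Something like this (a sketch, assuming spark-shell-style code with sc and path already defined):

  import org.apache.avro.generic.GenericRecord
  import org.apache.avro.mapred.AvroKey
  import org.apache.avro.mapreduce.AvroKeyInputFormat
  import org.apache.hadoop.io.NullWritable
  import org.apache.spark.SparkContext._  // brings the RDD -> PairRDDFunctions implicit into scope

  // newAPIHadoopFile already returns a pair RDD of (AvroKey, NullWritable),
  // so saveAsNewAPIHadoopFile becomes available on it after the import.
  val records = sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
    AvroKeyInputFormat[GenericRecord]](path)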
Yes, I saw that after I looked at it more closely. Thanks! But now I am running
into a "schema not set" error:
Writer schema for output key was not set. Use AvroJob.setOutputKeySchema()
I am in the process of figuring out how to set the schema for an AvroJob from an
HDFS file, but any pointer is much appreciated!
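In case it helps, here is a rough sketch of one way to do that. someInputFile and outputPath are placeholders, records is the pair RDD from the read above, and it assumes Hadoop 2 (Job.getInstance); on Hadoop 1 you would construct the Job differently:

  import org.apache.avro.Schema
  import org.apache.avro.file.DataFileStream
  import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
  import org.apache.avro.mapred.AvroKey
  import org.apache.avro.mapreduce.{AvroJob, AvroKeyOutputFormat}
  import org.apache.hadoop.fs.{FileSystem, Path}
  import org.apache.hadoop.io.NullWritable
  import org.apache.hadoop.mapreduce.Job

  // Read the writer schema from one of the existing small Avro files on HDFS.
  val fs = FileSystem.get(sc.hadoopConfiguration)
  val in = fs.open(new Path(someInputFile))
  val fileReader = new DataFileStream[GenericRecord](in, new GenericDatumReader[GenericRecord]())
  val schema: Schema = fileReader.getSchema
  fileReader.close()

  // Attach the schema to a Hadoop Job so AvroKeyOutputFormat knows what to write.
  val job = Job.getInstance(sc.hadoopConfiguration)
  AvroJob.setOutputKeySchema(job, schema)

  // Write the records back out; this produces one part-file per partition,
  // so coalesce(1) beforehand if you literally need a single output file.
  records.saveAsNewAPIHadoopFile(
    outputPath,
    classOf[AvroKey[GenericRecord]],
    classOf[NullWritable],
    classOf[AvroKeyOutputFormat[GenericRecord]],
    job.getConfiguration)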