Hadoop 2.6.0 included? spark-assembly-1.5.2-hadoop2.6.0.jar

On Feb 24, 2016, at 4:08 PM, Koert Kuipers <ko...@tresata.com> wrote:
Does your Spark version come with batteries (Hadoop included), or is it built with Hadoop provided, with you adding the Hadoop binaries to the classpath?

On Wed, Feb 24, 2016 at 3:08 PM, <ross.cramb...@thomsonreuters.com> wrote:

I’m trying to save a data frame in Avro format but am getting the following error:

java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;

I found the following workaround, https://github.com/databricks/spark-avro/issues/91, which seems to say that this comes from a mismatch in Avro versions. I have tried both of the solutions detailed there, to no avail:

- Manually downloading avro-1.7.7.jar and including it in /usr/lib/hadoop-mapreduce/
- Adding avro-1.7.7.jar to spark.driver.extraClassPath and spark.executor.extraClassPath
- The same with avro-1.6.6

I am still getting the same error, and now I am just stabbing in the dark. Is anyone else still running into this issue? I am using PySpark 1.5.2 on EMR.
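
When chasing a NoSuchMethodError like the one above, it can help to first list every jar that carries its own copy of the Avro class in question, since an assembly jar may bundle Avro alongside any standalone avro-*.jar. Below is a minimal sketch using only the Python standard library. The helper name jars_containing is hypothetical, and the directory /usr/lib/spark/lib is an assumption about where the Spark assembly lives on the cluster (only /usr/lib/hadoop-mapreduce comes from the message above), so adjust both paths to your install.

```python
import glob
import os
import zipfile

# Path of the class named in the NoSuchMethodError, as stored inside a jar.
AVRO_CLASS = "org/apache/avro/generic/GenericData.class"


def jars_containing(class_entry, search_dirs):
    """Return sorted paths of jars under search_dirs that bundle class_entry."""
    hits = []
    for directory in search_dirs:
        # "**" with recursive=True also matches jars directly in `directory`.
        pattern = os.path.join(directory, "**", "*.jar")
        for jar_path in glob.glob(pattern, recursive=True):
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    if class_entry in jar.namelist():
                        hits.append(jar_path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt archives instead of aborting.
                continue
    return sorted(hits)


if __name__ == "__main__":
    # Assumed locations; /usr/lib/spark/lib is a guess, not from the thread.
    dirs = ["/usr/lib/spark/lib", "/usr/lib/hadoop-mapreduce"]
    for path in jars_containing(AVRO_CLASS, dirs):
        print(path)
```

If more than one jar shows up, the copy that ends up first on the JVM classpath is the likely source of the mismatch. This only reports which jars contain the class, not which Avro version each one carries.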