[ https://issues.apache.org/jira/browse/SPARK-19424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Herman van Hovell closed SPARK-19424.
-------------------------------------
    Resolution: Not A Problem

> Wrong runtime type in RDD when reading from avro with custom serializer
> -----------------------------------------------------------------------
>
>                 Key: SPARK-19424
>                 URL: https://issues.apache.org/jira/browse/SPARK-19424
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 2.0.2
>         Environment: Ubuntu, spark 2.0.2 prebuilt for hadoop 2.7
>            Reporter: Nira Amit
>
> I am trying to read data from Avro files into an RDD using Kryo. My code
> compiles fine, but at runtime I'm getting a ClassCastException. Here is what
> my code does:
> {code}
> SparkConf conf = new SparkConf()...
> conf.set("spark.serializer", KryoSerializer.class.getCanonicalName());
> conf.set("spark.kryo.registrator", MyKryoRegistrator.class.getName());
> JavaSparkContext sc = new JavaSparkContext(conf);
> {code}
> Where MyKryoRegistrator registers a Serializer for MyCustomClass:
> {code}
> public void registerClasses(Kryo kryo) {
>     kryo.register(MyCustomClass.class, new MyCustomClassSerializer());
> }
> {code}
> Then I read my data file:
> {code}
> JavaPairRDD<MyCustomClass, NullWritable> records =
>     sc.newAPIHadoopFile("file:/path/to/datafile.avro",
>         AvroKeyInputFormat.class, MyCustomClass.class, NullWritable.class,
>         sc.hadoopConfiguration());
> Tuple2<MyCustomClass, NullWritable> first = records.first();
> {code}
> This seems to work fine, but using a debugger I can see that while the RDD
> has a kClassTag of my.package.containing.MyCustomClass, the variable first
> contains a Tuple2<AvroKey, NullWritable>, not a Tuple2<MyCustomClass,
> NullWritable>! And indeed, when the following line executes:
> {code}
> System.out.println("Got a result, custom field is: " +
>     first._1.getSomeCustomField());
> {code}
> I get an exception:
> {code}
> java.lang.ClassCastException: org.apache.avro.mapred.AvroKey cannot be cast
> to my.package.containing.MyCustomClass
> {code}
> Am I doing something wrong? And even if so, shouldn't I get a compilation
> error rather than a runtime error?
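The "Not A Problem" resolution appears consistent with how the Hadoop-based input formats behave: AvroKeyInputFormat produces AvroKey records at runtime regardless of the kClass passed to newAPIHadoopFile (that argument only fixes the compile-time generic type of the RDD, which Java's type erasure removes, so no compile error is possible), and a Kryo registrator only affects how Spark serializes objects for shuffles and caching, not what the input format yields. Below is a minimal sketch of one common workaround, reading the keys as AvroKey and mapping each GenericRecord into the domain class explicitly. It assumes Spark 2.x's Java API with avro-mapred on the classpath; the MyCustomClass stub, its someCustomField accessors, and the class/app names are stand-ins for the reporter's code, not anything from the issue itself.

{code}
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapreduce.AvroKeyInputFormat;
import org.apache.hadoop.io.NullWritable;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class AvroKeyReadSketch {

    // Minimal stand-in for the reporter's MyCustomClass; the field is assumed.
    public static class MyCustomClass implements java.io.Serializable {
        private String someCustomField;
        public String getSomeCustomField() { return someCustomField; }
        public void setSomeCustomField(String v) { this.someCustomField = v; }
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("avro-key-read-sketch");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // AvroKeyInputFormat materializes AvroKey records no matter which
        // kClass is supplied; declaring the key type as AvroKey keeps the
        // compile-time type in line with what actually arrives at runtime.
        JavaPairRDD<AvroKey, NullWritable> records =
                sc.newAPIHadoopFile("file:/path/to/datafile.avro",
                        AvroKeyInputFormat.class,
                        AvroKey.class,
                        NullWritable.class,
                        sc.hadoopConfiguration());

        // Unwrap each AvroKey and convert the GenericRecord into the
        // domain class explicitly (field name is assumed).
        JavaRDD<MyCustomClass> converted = records.map(t -> {
            GenericRecord rec = (GenericRecord) t._1().datum();
            MyCustomClass obj = new MyCustomClass();
            obj.setSomeCustomField(String.valueOf(rec.get("someCustomField")));
            return obj;
        });

        System.out.println("Got a result, custom field is: "
                + converted.first().getSomeCustomField());
        sc.stop();
    }
}
{code}

With the key type declared as AvroKey, the RDD's kClassTag and the runtime record type agree, and any registered Kryo serializer for MyCustomClass would still apply wherever Spark serializes those objects (shuffles, caching), just not to the deserialization done by the input format itself.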