So I've been futzing with Crunch a bit, and trying to understand how to
build a pipeline that outputs Avro data files. Roughly, I'm doing something
along these lines:

    Schema.Parser schemaParser = new Schema.Parser();
    final Schema avroObjSchema = schemaParser.parse(
schemaJsonString);

    AvroType avroType = new AvroType<MyAvroObject>(MyAvroObject.class,
        avroObjSchema, new
AvroDeepCopier.AvroReflectDeepCopier<MyAvroObject>(
        MyAvroObject.class, avroObjSchema));

    PCollection<MyAvroObject> words = logs.parallelDo(new DoFn<String,
MyAvroObject>() {
      public void process(String line, Emitter<MyAvroObject> emitter) {
        emitter.emit(convertStringToAvroObj(line));
      }
    }, avroType);

However, this results in a class cast exception:

Exception in thread "main" java.lang.ClassCastException: class
com.company.MyAvroObject
    at java.lang.Class.asSubclass(Class.java:3039)
    at
org.apache.crunch.types.writable.Writables.records(Writables.java:250)
    at
org.apache.crunch.types.writable.WritableTypeFamily.records(WritableTypeFamily.java:86)
    at org.apache.crunch.types.PTypeUtils.convert(PTypeUtils.java:61)
    at org.apache.crunch.types.writable.WritableTypeFamily.as
(WritableTypeFamily.java:135)
    at
org.apache.crunch.impl.mr.MRPipeline.writeTextFile(MRPipeline.java:319)

Anybody have any thoughts? There's got to be a magical incantation that I
have slightly off.

Reply via email to