[ https://issues.apache.org/jira/browse/CRUNCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276901#comment-15276901 ]

Micah Whitacre commented on CRUNCH-606:
---------------------------------------

Found another disconnect I'm running into as well.

{noformat}
java.lang.ClassCastException: org.apache.avro.mapred.AvroWrapper cannot be cast to org.apache.hadoop.io.NullWritable
        at org.apache.crunch.types.avro.AvroKeyConverter.convertInput(AvroKeyConverter.java:25)
{noformat}

Since a normal AvroInputFormat returns <AvroWrapper<T>, NullWritable>, that 
bled into the AvroKeyConverter, which expects the same.  So while right now the 
KafkaRecordReader is returning <K, V>, to actually fit the AvroKeyConverter it 
should be returning Pair<K, V>.  Or more specifically, it should be 
<AvroWrapper<Pair<K,V>>, NullWritable>.  It's not great that the converter is 
putting restrictions on the input format, but I can possibly work around it.
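
To make the mismatch concrete, here is a minimal, self-contained sketch. It does not use the real Avro, Hadoop, or Crunch classes: AvroWrapper and NullWritable below are small stand-ins with the same names, java.util.Map.Entry stands in for Crunch's Pair, and convertInput is only a simplified model of what a key-only converter does (the datum lives in the key, and the value slot must be a NullWritable). It also assumes the reader hands over a wrapped value in the value slot, which is what the exception message suggests.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

class ConverterMismatchSketch {

    /** Stand-in for org.apache.avro.mapred.AvroWrapper (not the real class). */
    static final class AvroWrapper<T> {
        private final T datum;
        AvroWrapper(T datum) { this.datum = datum; }
        T datum() { return datum; }
    }

    /** Stand-in for org.apache.hadoop.io.NullWritable (not the real class). */
    static final class NullWritable {
        private static final NullWritable INSTANCE = new NullWritable();
        private NullWritable() { }
        static NullWritable get() { return INSTANCE; }
    }

    /** Simplified model of a key-only converter: datum in the key, value must be NullWritable. */
    static <T> T convertInput(Object key, Object value) {
        // Throws ClassCastException for any value that is not a NullWritable.
        NullWritable unused = (NullWritable) value;
        @SuppressWarnings("unchecked")
        AvroWrapper<T> wrapper = (AvroWrapper<T>) key;
        return wrapper.datum();
    }

    /** Assumed current shape: the reader emits a separate key and a separate value. */
    static boolean currentShapeFails() {
        try {
            convertInput(new AvroWrapper<>("k"), new AvroWrapper<>("v"));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** Proposed shape: key and value packed into one wrapped pair, value slot is NullWritable. */
    static Map.Entry<String, String> proposedShape() {
        Map.Entry<String, String> pair = new SimpleImmutableEntry<>("k", "v");
        return convertInput(new AvroWrapper<>(pair), NullWritable.get());
    }

    public static void main(String[] args) {
        System.out.println("current shape fails: " + currentShapeFails());
        Map.Entry<String, String> pair = proposedShape();
        System.out.println("proposed shape yields: " + pair.getKey() + " -> " + pair.getValue());
    }
}
```

In this model the <K, V> shape fails on the cast of the value, while the wrapped-pair shape passes through and the pair can be unpacked afterwards. Whether the real converter behaves exactly this way is an assumption of the sketch.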

> Create a KafkaSource
> --------------------
>
>                 Key: CRUNCH-606
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-606
>             Project: Crunch
>          Issue Type: New Feature
>          Components: IO
>            Reporter: Micah Whitacre
>            Assignee: Micah Whitacre
>         Attachments: CRUNCH-606.diff, CRUNCH-606.patch
>
>
> Pulling data out of Kafka is a common use case, and some of the ways to do it 
> (Kafka Connect, Camus, Gobblin) do not integrate nicely with existing 
> processing pipelines like Crunch.  With Kafka 0.9, the consuming API is a lot 
> easier, so we should build a Source implementation that can read from Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
