[
https://issues.apache.org/jira/browse/CRUNCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276901#comment-15276901
]
Micah Whitacre commented on CRUNCH-606:
---------------------------------------
Found another disconnect I'm running into as well.
{noformat}
java.lang.ClassCastException: org.apache.avro.mapred.AvroWrapper cannot be cast to org.apache.hadoop.io.NullWritable
    at org.apache.crunch.types.avro.AvroKeyConverter.convertInput(AvroKeyConverter.java:25)
{noformat}
Since a normal AvroInputFormat returns <AvroWrapper<T>, NullWritable>, that
assumption bled into the AvroKeyConverter, which expects the same. So while
right now the KafkaRecordReader is returning <K, V>, for it to actually fit
the AvroKeyConverter it should be returning Pair<K, V>. Or more specifically
it should be <AvroWrapper<Pair<K,V>>, NullWritable>. Not great that the
converter is putting restrictions on the input format, but I can possibly work
around it.
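
To make the mismatch concrete, here is a minimal, self-contained sketch. It
does not use the real Crunch, Avro, or Hadoop classes: Wrapper, NullValue,
convertInput, and wrap are hypothetical stand-ins for AvroWrapper,
NullWritable, AvroKeyConverter.convertInput, and whatever the record reader
would do to pair up the key and value, and Map.Entry stands in for Crunch's
Pair. The explicit cast is an assumption used to reproduce the same kind of
failure, not the actual converter code.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

public class ConverterShapeSketch {

    // Hypothetical stand-in for org.apache.avro.mapred.AvroWrapper.
    static final class Wrapper<T> {
        private final T datum;
        Wrapper(T datum) { this.datum = datum; }
        T datum() { return datum; }
    }

    // Hypothetical stand-in for org.apache.hadoop.io.NullWritable.
    static final class NullValue {
        static final NullValue INSTANCE = new NullValue();
        private NullValue() {}
    }

    // Mimics a converter that assumes its input is (Wrapper<T>, NullValue).
    // Assumption: the real failure comes from an equivalent cast on the value.
    @SuppressWarnings("unchecked")
    static <T> T convertInput(Object key, Object value) {
        NullValue ignored = (NullValue) value; // ClassCastException if value is not a NullValue
        return ((Wrapper<T>) key).datum();
    }

    // Packs key and value into one wrapped pair, so the reader can emit
    // (Wrapper<pair>, NullValue) instead of (key, value).
    static <K, V> Wrapper<Map.Entry<K, V>> wrap(K key, V value) {
        return new Wrapper<>(new SimpleImmutableEntry<>(key, value));
    }

    public static void main(String[] args) {
        // Shape the reader emits today: key and value as separate objects.
        try {
            convertInput(new Wrapper<>("k"), new Wrapper<>("v"));
            System.out.println("separate key/value: accepted");
        } catch (ClassCastException e) {
            System.out.println("separate key/value: rejected with ClassCastException");
        }

        // Shape the converter expects: a wrapped pair as key, null marker as value.
        Map.Entry<String, String> pair = convertInput(wrap("k", "v"), NullValue.INSTANCE);
        System.out.println("wrapped pair: " + pair.getKey() + " -> " + pair.getValue());
    }
}
```

The point of the sketch is only that the fix lives on the reader side: the
key and value have to travel together in the key slot, with the null marker
in the value slot, before a converter with this assumption will accept them.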
> Create a KafkaSource
> --------------------
>
> Key: CRUNCH-606
> URL: https://issues.apache.org/jira/browse/CRUNCH-606
> Project: Crunch
> Issue Type: New Feature
> Components: IO
> Reporter: Micah Whitacre
> Assignee: Micah Whitacre
> Attachments: CRUNCH-606.diff, CRUNCH-606.patch
>
>
> Pulling data out of Kafka is a common use case, and some of the ways to do it
> (Kafka Connect, Camus, Gobblin) do not integrate nicely with existing
> processing pipelines like Crunch. With Kafka 0.9, the consuming API is a lot
> easier, so we should build a Source implementation that can read from Kafka.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)