[
https://issues.apache.org/jira/browse/CRUNCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275734#comment-15275734
]
Micah Whitacre commented on CRUNCH-606:
---------------------------------------
Thanks for the hint. I can easily simplify to do what you are proposing. The
one thing we might miss out on is that Kafka's Serializer/Deserializer takes
a "topic" argument and a boolean "isKey" flag as well as configuration
properties. By the time a record leaves the InputFormat/RecordReader it has
lost that info, so we'd lose a little flexibility. We don't actually use that
right now, but it'd be nice to support it. I'll play around with some of what
you proposed and other options. I currently have the source implemented aside
from this conversion piece.
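
To make the concern concrete, here is a rough, self-contained sketch. The
interface below is a hypothetical stand-in that only mirrors the shape of
Kafka's Deserializer (configure(configs, isKey) and deserialize(topic, data));
it is not the Kafka class itself, and the class names and the
"example.prefix" property are made up for illustration. It shows a
deserializer whose output depends on the topic and the key/value role, which
is exactly the information that raw bytes no longer carry once they have left
the RecordReader.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in mirroring the shape of Kafka's Deserializer.
interface TopicAwareDeserializer<T> {
    void configure(Map<String, ?> configs, boolean isKey);
    T deserialize(String topic, byte[] data);
}

// Illustrative deserializer whose result depends on topic, isKey and config.
class TaggingStringDeserializer implements TopicAwareDeserializer<String> {
    private boolean isKey;
    private String prefix = "";

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        this.isKey = isKey;
        // "example.prefix" is an invented property name for this sketch.
        Object p = configs.get("example.prefix");
        if (p != null) {
            prefix = p.toString();
        }
    }

    @Override
    public String deserialize(String topic, byte[] data) {
        if (data == null) {
            return null;
        }
        String role = isKey ? "key" : "value";
        return prefix + topic + "/" + role + ":"
                + new String(data, StandardCharsets.UTF_8);
    }
}

class DeserializerSketch {
    public static void main(String[] args) {
        Map<String, Object> configs = new HashMap<>();
        configs.put("example.prefix", "p-");

        TaggingStringDeserializer keyDeserializer = new TaggingStringDeserializer();
        keyDeserializer.configure(configs, true);

        byte[] raw = "abc".getBytes(StandardCharsets.UTF_8);
        // The same bytes decode differently depending on the topic passed in.
        System.out.println(keyDeserializer.deserialize("events", raw));
        System.out.println(keyDeserializer.deserialize("audit", raw));
    }
}
```

If the conversion happens after the RecordReader with only the bytes in hand,
a deserializer like this one could not be supported without carrying the topic
and role along with each record.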
> Create a KafkaSource
> --------------------
>
> Key: CRUNCH-606
> URL: https://issues.apache.org/jira/browse/CRUNCH-606
> Project: Crunch
> Issue Type: New Feature
> Components: IO
> Reporter: Micah Whitacre
> Assignee: Micah Whitacre
> Attachments: CRUNCH-606.patch
>
>
> Pulling data out of Kafka is a common use case, and some of the ways to do
> it (Kafka Connect, Camus, Gobblin) do not integrate nicely with existing
> processing pipelines like Crunch. With Kafka 0.9, the consuming API is a lot
> easier, so we should build a Source implementation that can read from Kafka.