[ 
https://issues.apache.org/jira/browse/CRUNCH-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275734#comment-15275734
 ] 

Micah Whitacre commented on CRUNCH-606:
---------------------------------------

Thanks for the hint.  I can easily simplify to do what you are proposing.  The 
one bit we might be missing out on is that Kafka's Serializer/Deserializer 
takes in a "topic" argument and a boolean "isKey" flag as well as configuration 
properties.  By the time a record leaves the InputFormat/RecordReader it has 
lost that info, so we'd lose a little flexibility.  We don't actually use that 
right now, but it'd be nice to support it.  I'll play around with some of what 
you proposed and other options.  I currently have the source implemented aside 
from this conversion piece.
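
To make the concern concrete, here is a rough sketch of why the topic and 
isKey information matters at conversion time.  It is not the code in the 
patch.  The nested Deserializer interface is a local stand-in that only 
mirrors the shape of Kafka's interface (configure(configs, isKey) and 
deserialize(topic, data)) so the example runs without Kafka on the classpath.  
TaggingDeserializer, convert, and the "sketch.prefix" property are 
hypothetical names made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class TopicAwareSketch {

    // Local stand-in mirroring the shape of Kafka's Deserializer interface.
    // Assumption: real code would use the Kafka client's interface instead.
    interface Deserializer<T> {
        void configure(Map<String, ?> configs, boolean isKey);
        T deserialize(String topic, byte[] data);
    }

    // Hypothetical deserializer whose output depends on topic, isKey, and config.
    static class TaggingDeserializer implements Deserializer<String> {
        private boolean isKey;
        private String prefix = "";

        @Override
        public void configure(Map<String, ?> configs, boolean isKey) {
            this.isKey = isKey;
            Object p = configs.get("sketch.prefix"); // made-up property name
            if (p != null) {
                prefix = p.toString();
            }
        }

        @Override
        public String deserialize(String topic, byte[] data) {
            String role = isKey ? "key" : "value";
            return prefix + topic + "/" + role + ":"
                    + new String(data, StandardCharsets.UTF_8);
        }
    }

    // The conversion step needs the topic, the key/value role, and the
    // properties in hand; raw bytes alone are not enough to call the deserializer.
    static String convert(String topic, boolean isKey,
                          Map<String, ?> props, byte[] raw) {
        Deserializer<String> deserializer = new TaggingDeserializer();
        deserializer.configure(props, isKey);
        return deserializer.deserialize(topic, raw);
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(convert("events", false, props, raw));
    }
}
```

If the RecordReader hands back only bytes, the later conversion step has 
nothing to pass for the topic or isKey, which is the flexibility we would give 
up.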

> Create a KafkaSource
> --------------------
>
>                 Key: CRUNCH-606
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-606
>             Project: Crunch
>          Issue Type: New Feature
>          Components: IO
>            Reporter: Micah Whitacre
>            Assignee: Micah Whitacre
>         Attachments: CRUNCH-606.patch
>
>
> Pulling data out of Kafka is a common use case, and some of the ways to do it 
> (Kafka Connect, Camus, Gobblin) do not integrate nicely with existing 
> processing pipelines like Crunch.  With Kafka 0.9, the consuming API is a lot 
> easier to use, so we should build a Source implementation that can read from 
> Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
