[ 
https://issues.apache.org/jira/browse/KAFKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642323#comment-14642323
 ] 

Gwen Shapira commented on KAFKA-2365:
-------------------------------------

I keep waiting for [~ewencp] to create the "log connector" JIRA so I can add 
specific requirements, but it's getting late, so I'm adding them here for 
reference and will move the comment to the right place tomorrow :)

Top log features missing from Flume (more or less in order of importance):
* Ability to safely tail a log (vs. only ingest completed files)
* rename / move file after done ingesting
* recursively ingest from sub-directories
* handling gzipped files (i.e. unzip and split into records)
* pulling from FTP / HTTP / SFTP
* some intelligence regarding record splitting (support for CSV and JSON is 
pretty high on the list)
* conversion to Avro (duh!), bonus if the schema can be magically inferred and 
later edited.
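
To make the gzip and record-splitting items a bit more concrete, here is a 
rough Python sketch of what "unzip and split into records" could look like. 
This is only an illustration under my own assumptions: the function name 
read_records and its fmt option are hypothetical and not part of Copycat or 
Flume, compressed files are detected purely by a ".gz" suffix, and "json" is 
assumed to mean one JSON object per line.

```python
import csv
import gzip
import json
from typing import Iterator


def read_records(path: str, fmt: str = "lines") -> Iterator[object]:
    """Yield records from a plain or gzipped text file.

    fmt is a hypothetical option: "lines", "csv", or "json"
    (assumed here to mean one JSON object per line).
    """
    # Pick the opener from the file suffix; gzip.open in "rt" mode
    # decompresses and decodes to text in one step.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", newline="") as handle:
        if fmt == "csv":
            # csv.reader handles quoting and embedded commas.
            yield from csv.reader(handle)
        elif fmt == "json":
            for line in handle:
                if line.strip():  # skip blank lines
                    yield json.loads(line)
        elif fmt == "lines":
            for line in handle:
                yield line.rstrip("\r\n")
        else:
            raise ValueError(f"unsupported format: {fmt}")
```

A real connector would also need to track offsets so it can resume after a 
restart, which this sketch does not attempt.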

> Copycat checklist
> -----------------
>
>                 Key: KAFKA-2365
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2365
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Ewen Cheslack-Postava
>              Labels: feature
>             Fix For: 0.8.3
>
>
> This covers the development plan for 
> [KIP-26|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=58851767].
>  There are a number of features that can be developed in sequence to make 
> incremental progress, and often in parallel:
> * Initial patch - connector API and core implementation
> * Runtime data API
> * Standalone CLI
> * REST API
> * Distributed copycat - CLI
> * Distributed copycat - coordinator
> * Distributed copycat - config storage
> * Distributed copycat - offset storage
> * Log/file connector (sample source/sink connector)
> * Elasticsearch sink connector (sample sink connector for full log -> Kafka 
> -> Elasticsearch sample pipeline)
> * Copycat metrics
> * System tests (including connector tests)
> * MirrorMaker connector
> * Copycat documentation
> This is an initial list, but it might need refinement to allow for more 
> incremental progress and may be missing features we find we want before the 
> initial release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
