[ 
https://issues.apache.org/jira/browse/KAFKA-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642333#comment-14642333
 ] 

Gwen Shapira commented on KAFKA-2374:
-------------------------------------

Top log features missing from Flume (more or less in order of importance):

* ability to safely tail a log (vs. only ingesting completed files)
* renaming / moving a file once it has been ingested
* recursively ingesting from sub-directories
* handling gzipped files (i.e. unzip and split into records)
* pulling from FTP / HTTP / SFTP
* some intelligence regarding record splitting (support for CSV and JSON is
  pretty high on the list)
* conversion to Avro (duh!), bonus if the schema can be magically inferred and
  later edited

(no need to implement them all, just mentioned for reference / inspiration)
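
To make two of the items above concrete (recursive ingestion from
sub-directories and handling gzipped files), here is a minimal Python sketch.
It is not Copycat or Flume code: the function name iter_records is
hypothetical, and it assumes UTF-8 text, newline-delimited records, and that
gzipped files can be recognised by a ".gz" suffix.

```python
import gzip
import os
from typing import Iterator, Tuple


def iter_records(root: str) -> Iterator[Tuple[str, str]]:
    """Yield (path, record) for each line of each file under root.

    Hypothetical helper for illustration only. Assumes one record per
    line and treats any file whose name ends in ".gz" as gzipped.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place so sub-directories are visited in a stable order.
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Pick a decompressing opener for gzipped files.
            opener = gzip.open if name.endswith(".gz") else open
            with opener(path, "rt", encoding="utf-8") as handle:
                for line in handle:
                    yield path, line.rstrip("\n")
```

A real connector would also need to track offsets per file and cope with
files that are still being written, which this sketch ignores.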



> Implement Copycat log/file connector
> ------------------------------------
>
>                 Key: KAFKA-2374
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2374
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Ewen Cheslack-Postava
>             Fix For: 0.8.3
>
>
> This is a good baseline connector that has zero dependencies and works well 
> as both a demonstration and a practical use case for standalone mode.
> Two key features it should ideally support: multiple files and rolling
> log files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
