[ 
https://issues.apache.org/jira/browse/FLUME-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097170#comment-15097170
 ] 

Jarek Jarcec Cecho commented on FLUME-2703:
-------------------------------------------

I'm a bit concerned about introducing this functionality to Flume. Flume has 
been designed as an event-based system, not necessarily a file-based one. Trying 
to preserve the original filename might make it seem like we're transferring 
whole files, which is not the case. Even with SpoolDirectorySource, which reads 
whole files, we can change the order of events or generate duplicates at the 
end, so the resulting file on HDFS might not end up having the same checksum. 
This can also lead to a lot of issues when two independent Flume agents try to 
write to the same output file on HDFS.
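
To make the last point concrete, here is a minimal sketch of the naming 
collision. It is a simplified model, not Flume's actual code: the class 
FileNameSketch, the helper fileName, and the counter values are all 
hypothetical, and it assumes each agent seeds its counter independently.

```java
public class FileNameSketch {
    // Hypothetical helper, not part of Flume: builds an output file name
    // from the original basename and, optionally, a numeric counter suffix.
    static String fileName(String baseName, long counter, boolean appendCounter) {
        return appendCounter ? baseName + "." + counter : baseName;
    }

    public static void main(String[] args) {
        // Made-up counter values for two independent agents.
        long counterA = 1452700000000L;
        long counterB = 1452700000123L;

        // With the counter appended, the two agents produce different names.
        String withA = fileName("access.log", counterA, true);
        String withB = fileName("access.log", counterB, true);
        System.out.println(withA.equals(withB));

        // Without the counter, both agents target the same name.
        String plainA = fileName("access.log", counterA, false);
        String plainB = fileName("access.log", counterB, false);
        System.out.println(plainA.equals(plainB));
    }
}
```

In this model, dropping the suffix removes the only thing that distinguishes 
the two agents' output files, so both would try to write to the same path.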

> HDFS sink: Ability to exclude time counter in fileName via sink configuration 
> ------------------------------------------------------------------------------
>
>                 Key: FLUME-2703
>                 URL: https://issues.apache.org/jira/browse/FLUME-2703
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.7.0
>            Reporter: Hari
>            Priority: Minor
>         Attachments: FLUME-2703-0.patch
>
>
> HDFS sinks always append a time counter to filenames, and this is not 
> configurable. In some use cases, it is desirable to retain the original 
> filename. For example, while ingesting a blob using the spool directory 
> source, it's desirable to retain the original filename (basename) in HDFS.  
> This patch allows an HDFS sink to be configured to override this behavior, 
> while retaining the backward-compatible file naming convention by default. 
> The override is set with:
> hdfs.appendTimeCounter = false



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
