More details: Flume 1.6 (core Apache version). The pipeline is KafkaSource (0.8.2) -> File Channel -> HDFS Sink (CDH 5.5.2).
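
For readers following the thread, here is a minimal sketch of an agent configuration matching that description. It is an illustration only, not the actual config from this setup: the agent and component names (a1, r1, c1, k1), the ZooKeeper host, topic, regex, date pattern, HDFS path, and all numeric values are placeholders. It also assumes the extracted date is written to the "timestamp" header through the millis serializer; the original message does not say which header is used.

```properties
# Hypothetical names and values throughout; adjust to the real environment.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Kafka source (Flume 1.6 property names)
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.zookeeperConnect = zk-host:2181
a1.sources.r1.topic = events
a1.sources.r1.groupId = flume
a1.sources.r1.channels = c1

# Pull the event date out of the record body and store it as epoch millis
# in the "timestamp" header (assumed), which the HDFS sink reads for %Y-%m-%d.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
a1.sources.r1.interceptors.i1.regex = (\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})
a1.sources.r1.interceptors.i1.serializers = s1
a1.sources.r1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
a1.sources.r1.interceptors.i1.serializers.s1.name = timestamp
a1.sources.r1.interceptors.i1.serializers.s1.pattern = yyyy-MM-dd HH:mm:ss

a1.channels.c1.type = file

# HDFS sink: roll on size only, close files idle for 60 seconds
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.idleTimeout = 60
```

With hdfs.idleTimeout set to a non-zero number of seconds, the expectation is that a bucket file receiving no writes for that long is closed and loses its .tmp suffix, which is the behavior reported as missing below.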
On Thu, Jan 12, 2017 at 12:20 PM, Justin Workman <justinjwork...@gmail.com> wrote:
> Sorry for cross-posting to user and dev. I have recently set up a Flume
> configuration where we use the regex_extractor interceptor to parse
> the actual event date from the record flowing through the Flume source,
> then use that date to build the HDFS sink bucket path. However, it
> appears that the hdfs.idleTimeout value is not honored in this
> configuration. It does work when using the timestamp interceptor to build
> the output path.
>
> I have set the hdfs.idleTimeout value for the HDFS sink, but the files are
> never closed or renamed until I restart or shut down Flume. Our Flume is
> configured to roll based on size or output path, and the files
> rename/close/roll fine based on size. However, the last file in each output
> path is always left with the .tmp extension until we restart Flume. I would
> expect the file to be renamed and closed if no records are written to it
> before the idleTimeout is reached.
>
> Could I be missing something, or is this a known bug with the
> regex_extractor interceptor?
>
> Thanks
> Justin