Absolutely, see below. Just to reiterate: when using the timestamp
interceptor to build the output path from the timestamp in the Flume
header, files roll correctly. Files also roll just fine based on file
size. However, when using the regex_extractor interceptor to pull the actual
event timestamp for the output path, the last file in each directory
is never renamed or closed until Flume is restarted.
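To make the path construction concrete, here is a minimal Python sketch of what the regex_extractor setup in the config below is meant to do. It only mimics the behaviour and is not Flume's own code: the sample record is hypothetical (only the shape of the entryId.date field is assumed), and the helper names extract_headers and bucket_path are made up for illustration.

```python
import re

# Regex from interceptor i3 after .properties unescaping (\\ becomes \).
PATTERN = re.compile(
    r'^.*\"entryId\":\{\"date\":\"(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):.*\"\}.*$'
)
HEADER_NAMES = ("year", "month", "day", "hour")  # serializers s1..s4


def extract_headers(body: str) -> dict:
    """Return the headers the interceptor would set, or {} if no match."""
    match = PATTERN.match(body)
    return dict(zip(HEADER_NAMES, match.groups())) if match else {}


def bucket_path(headers: dict) -> str:
    """Fill the %{name} escapes used in the hdfs.path template."""
    template = (
        "/flumedata/processed/first-pass-stream/"
        "%{year}/%{month}/%{day}/%{hour}-00"
    )
    return re.sub(r"%\{(\w+)\}", lambda m: headers[m.group(1)], template)


# Hypothetical record; the real payload layout is not shown in this thread.
sample = '{"entryId":{"date":"2017-01-13T10:55:35.000Z"},"payload":"x"}'
headers = extract_headers(sample)
print(headers)
print(bucket_path(headers))
```

With this sample the event lands in the 2017/01/13/10-00 bucket regardless of when Flume actually processes it, which is the behaviour we want from the event-time path.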
*flume-conf.properties*
agent1.sources = fpssKafkaTopic
agent1.channels = fpssHdfsFileChannel
agent1.sinks = fpssHdfsSink
agent1.sources.fpssKafkaTopic.type = org.apache.flume.source.kafka.KafkaSource
agent1.sources.fpssKafkaTopic.zookeeperConnect = zk-host:2181
agent1.sources.fpssKafkaTopic.topic = first-pass-stream-sessionized
agent1.sources.fpssKafkaTopic.groupId = flume-first-pass-stream-sessionized
agent1.sources.fpssKafkaTopic.kafka.auto.offset.reset = smallest
agent1.sources.fpssKafkaTopic.channels = fpssHdfsFileChannel
agent1.sources.fpssKafkaTopic.interceptors = i1 i2 i3
agent1.sources.fpssKafkaTopic.interceptors.i1.type = timestamp
agent1.sources.fpssKafkaTopic.interceptors.i1.preserveExisting = false
agent1.sources.fpssKafkaTopic.interceptors.i2.type = org.apache.flume.interceptor.HostInterceptor$Builder
agent1.sources.fpssKafkaTopic.interceptors.i2.hostHeader = hostname
agent1.sources.fpssKafkaTopic.interceptors.i2.useIP = false
agent1.sources.fpssKafkaTopic.interceptors.i2.preserveExisting = true
agent1.sources.fpssKafkaTopic.interceptors.i3.type = regex_extractor
agent1.sources.fpssKafkaTopic.interceptors.i3.regex = ^.*\\"entryId\\":\\{\\"date\\":\\"(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d)T(\\d\\d):.*\\"\\}.*$
agent1.sources.fpssKafkaTopic.interceptors.i3.serializers = s1 s2 s3 s4
agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s1.name = year
agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s2.name = month
agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s3.name = day
agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s4.name = hour
agent1.sources.fpssKafkaTopic.kafka.consumer.timeout.ms = 100
agent1.channels.fpssHdfsFileChannel.type = file
agent1.channels.fpssHdfsFileChannel.checkpointDir = /opt/flume/file-channel/fpss/checkpoint
agent1.channels.fpssHdfsFileChannel.dataDirs = /opt/flume/file-channel/fpss/data
agent1.sinks.fpssHdfsSink.type = hdfs
agent1.sinks.fpssHdfsSink.hdfs.filePrefix = %{hostname}-log
agent1.sinks.fpssHdfsSink.hdfs.inUseSuffix = .tmp
agent1.sinks.fpssHdfsSink.hdfs.path = hdfs://prodcluster/flumedata/processed/first-pass-stream/%{year}/%{month}/%{day}/%{hour}-00
agent1.sinks.fpssHdfsSink.hdfs.kerberosPrincipal = [email protected]
agent1.sinks.fpssHdfsSink.hdfs.kerberosKeytab = <keytab path removed for privacy>
agent1.sinks.fpssHdfsSink.hdfs.rollInterval = 0
agent1.sinks.fpssHdfsSink.hdfs.rollCount = 0
## Account for compression. See flume-2128
## My calculation: 512 * 1024 * 1024 * 2.75
agent1.sinks.fpssHdfsSink.hdfs.rollSize = 1476395008
# Close file if idle more than 300 seconds
agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
agent1.sinks.fpssHdfsSink.hdfs.fileType = CompressedStream
agent1.sinks.fpssHdfsSink.hdfs.codeC = snappy
agent1.sinks.fpssHdfsSink.hdfs.writeFormat = Text
agent1.sinks.fpssHdfsSink.channel = fpssHdfsFileChannel
agent1.sinks.fpssHdfsSink.hdfs.batchSize = 10000
agent1.sinks.fpssHdfsSink.hdfs.threadsPoolSize = 20
agent1.sinks.fpssHdfsSink.hdfs.callTimeout = 20000
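One sanity check worth running on a config like this is to confirm that every agent1.sinks.<name>.* key refers to a sink listed in agent1.sinks. In the config above, the idleTimeout and useLocalTimeStamp keys are written under agent1.sinks.hdfsSink, while the declared sink is fpssHdfsSink. Whether that explains the unclosed files is an assumption to verify, not a confirmed diagnosis. A minimal Python sketch of the check follows; the function name undeclared_sinks is hypothetical and the parsing is deliberately simplified (no line continuations).

```python
def undeclared_sinks(properties_text: str, agent: str = "agent1") -> set:
    """Return sink names used in '<agent>.sinks.<name>.*' keys that are
    not listed in the '<agent>.sinks = ...' declaration."""
    declared, used = set(), set()
    prefix = f"{agent}.sinks"
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == prefix:
            declared.update(value.split())
        elif key.startswith(prefix + "."):
            # First path segment after '<agent>.sinks.' is the sink name.
            used.add(key[len(prefix) + 1:].split(".", 1)[0])
    return used - declared


# Excerpt of the sink section from the config above.
config = """
agent1.sinks = fpssHdfsSink
agent1.sinks.fpssHdfsSink.type = hdfs
agent1.sinks.fpssHdfsSink.hdfs.rollSize = 1476395008
# Close file if idle more than 300 seconds
agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
"""
print(undeclared_sinks(config))
```

Any name this reports is a key prefix that does not match a declared sink, so the settings under it may not reach the sink they were intended for.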
*HDFS Output Since Midnight (Notice the last file is never closed/renamed)*
hdfs dfs -ls /flumedata/processed/first-pass-stream/2017/01/13/*/
17/01/13 12:38:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 7 items
-rw-r--r-- 3 b2c_runtime hadoop 513710580 2017-01-13 00:09
/flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815397.snappy
-rw-r--r-- 3 b2c_runtime hadoop 514439844 2017-01-13 00:18
/flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815398.snappy
-rw-r--r-- 3 b2c_runtime hadoop 515125962 2017-01-13 00:28
/flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815399.snappy
-rw-r--r-- 3 b2c_runtime hadoop 513010837 2017-01-13 00:38
/flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815400.snappy
-rw-r--r-- 3 b2c_runtime hadoop 511315467 2017-01-13 00:49
/flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815401.snappy
-rw-r--r-- 3 b2c_runtime hadoop 508420966 2017-01-13 00:59
/flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815402.snappy
-rw-r--r-- 3 b2c_runtime hadoop 2503353 2017-01-13 00:59
/flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815403.snappy.tmp
Found 6 items
-rw-r--r-- 3 b2c_runtime hadoop 509116221 2017-01-13 01:10
/flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415705.snappy
-rw-r--r-- 3 b2c_runtime hadoop 507800675 2017-01-13 01:21
/flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415706.snappy
-rw-r--r-- 3 b2c_runtime hadoop 504432110 2017-01-13 01:32
/flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415707.snappy
-rw-r--r-- 3 b2c_runtime hadoop 501932914 2017-01-13 01:42
/flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415708.snappy
-rw-r--r-- 3 b2c_runtime hadoop 498136257 2017-01-13 01:50
/flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415709.snappy
-rw-r--r-- 3 b2c_runtime hadoop 60539 2017-01-13 01:50
/flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415710.snappy.tmp
Found 6 items
-rw-r--r-- 3 b2c_runtime hadoop 500879399 2017-01-13 02:11
/flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016017.snappy
-rw-r--r-- 3 b2c_runtime hadoop 501827071 2017-01-13 02:21
/flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016018.snappy
-rw-r--r-- 3 b2c_runtime hadoop 501489101 2017-01-13 02:32
/flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016019.snappy
-rw-r--r-- 3 b2c_runtime hadoop 501527838 2017-01-13 02:43
/flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016020.snappy
-rw-r--r-- 3 b2c_runtime hadoop 499393977 2017-01-13 02:54
/flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016021.snappy
-rw-r--r-- 3 b2c_runtime hadoop 1282327 2017-01-13 02:54
/flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016022.snappy.tmp
Found 6 items
-rw-r--r-- 3 b2c_runtime hadoop 501033294 2017-01-13 03:10
/flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615579.snappy
-rw-r--r-- 3 b2c_runtime hadoop 500933906 2017-01-13 03:20
/flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615580.snappy
-rw-r--r-- 3 b2c_runtime hadoop 505869233 2017-01-13 03:31
/flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615581.snappy
-rw-r--r-- 3 b2c_runtime hadoop 502910608 2017-01-13 03:41
/flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615582.snappy
-rw-r--r-- 3 b2c_runtime hadoop 499561080 2017-01-13 03:52
/flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615583.snappy
-rw-r--r-- 3 b2c_runtime hadoop 3616826 2017-01-13 03:52
/flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615584.snappy.tmp
Found 6 items
-rw-r--r-- 3 b2c_runtime hadoop 502243204 2017-01-13 04:11
/flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215893.snappy
-rw-r--r-- 3 b2c_runtime hadoop 508966498 2017-01-13 04:22
/flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215894.snappy
-rw-r--r-- 3 b2c_runtime hadoop 510972236 2017-01-13 04:34
/flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215895.snappy
-rw-r--r-- 3 b2c_runtime hadoop 513225577 2017-01-13 04:46
/flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215896.snappy
-rw-r--r-- 3 b2c_runtime hadoop 512743679 2017-01-13 04:57
/flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215897.snappy
-rw-r--r-- 3 b2c_runtime hadoop 3888775 2017-01-13 04:57
/flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215898.snappy.tmp
Found 7 items
-rw-r--r-- 3 b2c_runtime hadoop 515832251 2017-01-13 05:11
/flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811983.snappy
-rw-r--r-- 3 b2c_runtime hadoop 518077964 2017-01-13 05:20
/flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811984.snappy
-rw-r--r-- 3 b2c_runtime hadoop 519490676 2017-01-13 05:29
/flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811985.snappy
-rw-r--r-- 3 b2c_runtime hadoop 519105563 2017-01-13 05:37
/flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811986.snappy
-rw-r--r-- 3 b2c_runtime hadoop 518672209 2017-01-13 05:46
/flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811987.snappy
-rw-r--r-- 3 b2c_runtime hadoop 520019853 2017-01-13 05:53
/flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811988.snappy
-rw-r--r-- 3 b2c_runtime hadoop 1574211 2017-01-13 05:53
/flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811989.snappy.tmp
Found 9 items
-rw-r--r-- 3 b2c_runtime hadoop 521428204 2017-01-13 06:07
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413743.snappy
-rw-r--r-- 3 b2c_runtime hadoop 519885769 2017-01-13 06:15
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413744.snappy
-rw-r--r-- 3 b2c_runtime hadoop 519050891 2017-01-13 06:21
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413745.snappy
-rw-r--r-- 3 b2c_runtime hadoop 520691322 2017-01-13 06:29
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413746.snappy
-rw-r--r-- 3 b2c_runtime hadoop 520902319 2017-01-13 06:36
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413747.snappy
-rw-r--r-- 3 b2c_runtime hadoop 520831873 2017-01-13 06:42
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413748.snappy
-rw-r--r-- 3 b2c_runtime hadoop 519785647 2017-01-13 06:49
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413749.snappy
-rw-r--r-- 3 b2c_runtime hadoop 520590143 2017-01-13 06:55
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413750.snappy
-rw-r--r-- 3 b2c_runtime hadoop 4621367 2017-01-13 06:55
/flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413751.snappy.tmp
Found 11 items
-rw-r--r-- 3 b2c_runtime hadoop 522623760 2017-01-13 07:06
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015214.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523065112 2017-01-13 07:12
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015215.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523445533 2017-01-13 07:18
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015216.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523084945 2017-01-13 07:24
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015217.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524283976 2017-01-13 07:30
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015218.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523923379 2017-01-13 07:36
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015219.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523910723 2017-01-13 07:42
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015220.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524266095 2017-01-13 07:47
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015221.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523002505 2017-01-13 07:53
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015222.snappy
-rw-r--r-- 3 b2c_runtime hadoop 520706211 2017-01-13 07:58
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015223.snappy
-rw-r--r-- 3 b2c_runtime hadoop 8051588 2017-01-13 07:58
/flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015224.snappy.tmp
Found 11 items
-rw-r--r-- 3 b2c_runtime hadoop 520528155 2017-01-13 08:05
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618433.snappy
-rw-r--r-- 3 b2c_runtime hadoop 521761390 2017-01-13 08:11
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618434.snappy
-rw-r--r-- 3 b2c_runtime hadoop 522548272 2017-01-13 08:16
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618435.snappy
-rw-r--r-- 3 b2c_runtime hadoop 522616117 2017-01-13 08:22
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618436.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525953759 2017-01-13 08:28
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618437.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524475009 2017-01-13 08:34
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618438.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523995339 2017-01-13 08:40
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618439.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524188832 2017-01-13 08:47
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618440.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525303001 2017-01-13 08:53
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618441.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525606532 2017-01-13 08:59
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618442.snappy
-rw-r--r-- 3 b2c_runtime hadoop 4486982 2017-01-13 08:59
/flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618443.snappy.tmp
Found 11 items
-rw-r--r-- 3 b2c_runtime hadoop 525207364 2017-01-13 09:06
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216987.snappy
-rw-r--r-- 3 b2c_runtime hadoop 526105891 2017-01-13 09:12
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216988.snappy
-rw-r--r-- 3 b2c_runtime hadoop 526426735 2017-01-13 09:18
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216989.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525298099 2017-01-13 09:24
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216990.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525282945 2017-01-13 09:30
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216991.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523921005 2017-01-13 09:36
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216992.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524827705 2017-01-13 09:42
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216993.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524203463 2017-01-13 09:47
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216994.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524678485 2017-01-13 09:53
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216995.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524598220 2017-01-13 09:59
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216996.snappy
-rw-r--r-- 3 b2c_runtime hadoop 3877959 2017-01-13 09:59
/flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216997.snappy.tmp
Found 10 items
-rw-r--r-- 3 b2c_runtime hadoop 523000460 2017-01-13 10:06
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813831.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523455154 2017-01-13 10:12
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813832.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525465618 2017-01-13 10:18
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813833.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524630955 2017-01-13 10:24
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813834.snappy
-rw-r--r-- 3 b2c_runtime hadoop 527780298 2017-01-13 10:30
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813835.snappy
-rw-r--r-- 3 b2c_runtime hadoop 526565562 2017-01-13 10:37
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813836.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524936336 2017-01-13 10:43
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813837.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524565610 2017-01-13 10:49
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813838.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524276950 2017-01-13 10:55
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813839.snappy
-rw-r--r-- 3 b2c_runtime hadoop 654810 2017-01-13 10:55
/flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813840.snappy.tmp
Found 11 items
-rw-r--r-- 3 b2c_runtime hadoop 524174553 2017-01-13 11:06
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415712.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524127864 2017-01-13 11:12
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415713.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524778919 2017-01-13 11:18
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415714.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524851182 2017-01-13 11:24
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415715.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525156750 2017-01-13 11:30
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415716.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525334538 2017-01-13 11:35
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415717.snappy
-rw-r--r-- 3 b2c_runtime hadoop 527346578 2017-01-13 11:41
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415718.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525592734 2017-01-13 11:47
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415719.snappy
-rw-r--r-- 3 b2c_runtime hadoop 525502291 2017-01-13 11:53
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415720.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523135186 2017-01-13 11:58
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415721.snappy
-rw-r--r-- 3 b2c_runtime hadoop 9967141 2017-01-13 11:58
/flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415722.snappy.tmp
Found 7 items
-rw-r--r-- 3 b2c_runtime hadoop 520881970 2017-01-13 12:05
/flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016849.snappy
-rw-r--r-- 3 b2c_runtime hadoop 522340745 2017-01-13 12:11
/flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016850.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524156495 2017-01-13 12:17
/flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016851.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523482390 2017-01-13 12:23
/flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016852.snappy
-rw-r--r-- 3 b2c_runtime hadoop 524096591 2017-01-13 12:29
/flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016853.snappy
-rw-r--r-- 3 b2c_runtime hadoop 523184628 2017-01-13 12:35
/flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016854.snappy
-rw-r--r-- 3 b2c_runtime hadoop 10981218 2017-01-13 12:35
/flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016855.snappy.tmp
*HDFS Stat On One Of The Files (keep in mind the output bucket is based on
event time, which is MDT/MST, whereas the stat date is GMT)*
hadoop fs -stat "%y %n" /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813840.snappy.tmp
17/01/13 12:57:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-01-13 17:55:35 flumeload100-log.1484326813840.snappy.tmp
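To put the GMT stat time next to the MST bucket, here is a small Python sketch of the arithmetic. It assumes the host clock (and so the 12:57:07 prefix on the log line) is MST at UTC-7, and that the number in the file name is an epoch-millisecond counter. Both are assumptions, labelled in the comments.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the host clock is MST (UTC-7), as it would be in January.
MST = timezone(timedelta(hours=-7), "MST")

# Assumption: the number in the .tmp file name is an epoch-ms counter,
# so it approximates when this hourly bucket was first written.
bucket_counter_ms = 1484326813840
bucket_started_utc = datetime.fromtimestamp(
    bucket_counter_ms // 1000, tz=timezone.utc
)

# Modification time reported by `hadoop fs -stat "%y"` (GMT).
last_modified_utc = datetime(2017, 1, 13, 17, 55, 35, tzinfo=timezone.utc)

# When the stat command was run, taken from the WARN line prefix.
checked_local = datetime(2017, 1, 13, 12, 57, 7, tzinfo=MST)

idle_seconds = int((checked_local - last_modified_utc).total_seconds())
idle_timeout = 300  # hdfs.idleTimeout value from the config

fmt = "%Y-%m-%d %H:%M:%S %Z"
print("bucket started:", bucket_started_utc.astimezone(MST).strftime(fmt))
print("last modified: ", last_modified_utc.astimezone(MST).strftime(fmt))
print("idle seconds:  ", idle_seconds, "exceeds timeout:", idle_seconds > idle_timeout)
```

Under those assumptions the file was last touched at 10:55:35 MST and had been idle for about two hours (7292 seconds) when I ran the stat, far past the 300 second idleTimeout.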
Thanks
Justin
On Thu, Jan 12, 2017 at 11:56 PM, Denes Arvay <[email protected]> wrote:
> Hi Justin,
>
> Could you please share your config file with us?
>
> Thanks,
> Denes
>
>
> On Thu, Jan 12, 2017, 20:20 Justin Workman <[email protected]> wrote:
>
>> Sorry for cross-posting to user and dev. I have recently set up a Flume
>> configuration where we are using the regex_extractor interceptor to parse
>> the actual event date from the record flowing through the Flume source,
>> then using that date to build the HDFS sink bucket path. However, it
>> appears that the hdfs.idleTimeout value is not honored in this
>> configuration. It does work when using the timestamp interceptor to build
>> the output path.
>>
>> I have set the hdfs.idleTimeout value for the HDFS sink, but the files
>> are never closed or renamed until I restart or shut down Flume. Our Flume is
>> configured to roll based on size or output path, and the files
>> rename/close/roll fine based on size; however, the last file in each output
>> path is always left with the .tmp extension until we restart Flume. I would
>> expect the file to be renamed and closed if no records are
>> written to it after the idleTimeout is reached.
>>
>> Could I be missing something, or is this a known bug with the
>> regex_extractor interceptor?
>>
>> Thanks
>> Justin
>>
>