Sorry for wasting anyone's time. In reviewing my configuration, I found a typo in the hdfs.idleTimeout configuration.
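For anyone who hits the same symptom: in the config quoted below, the only declared sink is fpssHdfsSink, but the idleTimeout and useLocalTimeStamp lines are written against hdfsSink, so they never apply to the sink that is actually running. Here is a minimal sketch of a check that would catch this kind of mismatch. It is not part of Flume; the function name `undeclared_components` and the `SAMPLE` excerpt are just for illustration, and it assumes the usual `<agent>.<sources|channels|sinks>.<name>.<property>` key layout.

```python
KINDS = ("sources", "channels", "sinks")

def undeclared_components(properties_text):
    """Return sorted (agent, kind, name) triples that have properties set
    but are missing from the matching '<agent>.<kind> = ...' declaration."""
    declared = {}  # (agent, kind) -> set of declared component names
    used = set()   # (agent, kind, name) seen in property keys
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if len(parts) == 2 and parts[1] in KINDS:
            # Declaration line, e.g. "agent1.sinks = fpssHdfsSink"
            declared.setdefault((parts[0], parts[1]), set()).update(value.split())
        elif len(parts) >= 4 and parts[1] in KINDS:
            # Property line, e.g. "agent1.sinks.hdfsSink.hdfs.idleTimeout = 300"
            used.add((parts[0], parts[1], parts[2]))
    return sorted(
        u for u in used if u[2] not in declared.get((u[0], u[1]), set())
    )

# Excerpt of the sink section from the config in this thread.
SAMPLE = """
agent1.sinks = fpssHdfsSink
agent1.sinks.fpssHdfsSink.type = hdfs
agent1.sinks.fpssHdfsSink.hdfs.rollSize = 1476395008
# Close file if idle more than 300 seconds
agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
"""

if __name__ == "__main__":
    for agent, kind, name in undeclared_components(SAMPLE):
        print(f"{agent}.{kind}.{name} is configured but not declared")
```

Running it against the full properties file before deploying should flag hdfsSink; renaming those two keys to fpssHdfsSink makes the report come back empty.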

On Fri, Jan 13, 2017 at 2:14 PM, Justin Workman <[email protected]> wrote:

> I'll try debug again. The output/regex seems to be fine, but I never see
> a call to close/rename the last files in each directory until Flume shuts
> down or restarts.
>
> I would expect to see this call when the idleTimeout value is reached.
>
> Sent from my iPhone
>
> On Jan 13, 2017, at 2:05 PM, iain wright <[email protected]> wrote:
>
> Might be worth trying the debug output (I forget the exact sink name) to
> just log the headers being attached to events after the interceptor, to
> validate that the regex is working correctly, and for all events.
>
> I set up this exact config at a previous company, so I know it works.
>
> I also remember needing to escape the regex in an odd way due to how Java
> was loading/parsing the config.
>
> Best,
> Iain
>
> Sent from my iPhone
>
> On Jan 13, 2017, at 12:00 PM, Justin Workman <[email protected]> wrote:
>
> Absolutely, see below. Just to reiterate, when using the timestamp
> interceptor values to build the output path based on the timestamp in the
> Flume header, things roll correctly. The files also roll just fine based
> on file size. However, when using the regex_extractor interceptor to get
> the actual event's timestamp to use in the output path, the last file in
> each directory is never renamed/closed until Flume is restarted.
>
> *flume-conf.properties*
> agent1.sources = fpssKafkaTopic
> agent1.channels = fpssHdfsFileChannel
> agent1.sinks = fpssHdfsSink
>
> agent1.sources.fpssKafkaTopic.type = org.apache.flume.source.kafka.KafkaSource
> agent1.sources.fpssKafkaTopic.zookeeperConnect = zk-host:2181
> agent1.sources.fpssKafkaTopic.topic = first-pass-stream-sessionized
> agent1.sources.fpssKafkaTopic.groupId = flume-first-pass-stream-sessionized
> agent1.sources.fpssKafkaTopic.kafka.auto.offset.reset = smallest
> agent1.sources.fpssKafkaTopic.channels = fpssHdfsFileChannel
> agent1.sources.fpssKafkaTopic.interceptors = i1 i2 i3
> agent1.sources.fpssKafkaTopic.interceptors.i1.type = timestamp
> agent1.sources.fpssKafkaTopic.interceptors.i1.preserveExisting = false
> agent1.sources.fpssKafkaTopic.interceptors.i2.type = org.apache.flume.interceptor.HostInterceptor$Builder
> agent1.sources.fpssKafkaTopic.interceptors.i2.hostHeader = hostname
> agent1.sources.fpssKafkaTopic.interceptors.i2.useIP = false
> agent1.sources.fpssKafkaTopic.interceptors.i2.preserveExisting = true
> agent1.sources.fpssKafkaTopic.interceptors.i3.type = regex_extractor
> agent1.sources.fpssKafkaTopic.interceptors.i3.regex = ^.*\\"entryId\\":\\{\\"date\\":\\"(\\d\\d\\d\\d)-(\\d\\d)-(\\d\\d)T(\\d\\d):.*\\"\\}.*$
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers = s1 s2 s3 s4
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s1.name = year
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s2.name = month
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s3.name = day
> agent1.sources.fpssKafkaTopic.interceptors.i3.serializers.s4.name = hour
> agent1.sources.fpssKafkaTopic.kafka.consumer.timeout.ms = 100
>
> agent1.channels.fpssHdfsFileChannel.type = file
> agent1.channels.fpssHdfsFileChannel.checkpointDir = /opt/flume/file-channel/fpss/checkpoint
> agent1.channels.fpssHdfsFileChannel.dataDirs = /opt/flume/file-channel/fpss/data
>
> agent1.sinks.fpssHdfsSink.type = hdfs
> agent1.sinks.fpssHdfsSink.hdfs.filePrefix = %{hostname}-log
> agent1.sinks.fpssHdfsSink.hdfs.inUseSuffix = .tmp
> agent1.sinks.fpssHdfsSink.hdfs.path = hdfs://prodcluster/flumedata/processed/first-pass-stream/%{year}/%{month}/%{day}/%{hour}-00
> agent1.sinks.fpssHdfsSink.hdfs.kerberosPrincipal = [email protected]
> agent1.sinks.fpssHdfsSink.hdfs.kerberosKeytab = <keytab path removed for privacy>
> agent1.sinks.fpssHdfsSink.hdfs.rollInterval = 0
> agent1.sinks.fpssHdfsSink.hdfs.rollCount = 0
> ## Account for compression. See flume-2128
> ## My calculation: 512 * 1024 * 1024 * 2.75
> agent1.sinks.fpssHdfsSink.hdfs.rollSize = 1476395008
> # Close file if idle more than 300 seconds
> agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
> agent1.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
> agent1.sinks.fpssHdfsSink.hdfs.fileType = CompressedStream
> agent1.sinks.fpssHdfsSink.hdfs.codeC = snappy
> agent1.sinks.fpssHdfsSink.hdfs.writeFormat = Text
> agent1.sinks.fpssHdfsSink.channel = fpssHdfsFileChannel
> agent1.sinks.fpssHdfsSink.hdfs.batchSize = 10000
> agent1.sinks.fpssHdfsSink.hdfs.threadsPoolSize = 20
> agent1.sinks.fpssHdfsSink.hdfs.callTimeout = 20000
>
> *HDFS Output Since Midnight (Notice the last file is never closed/renamed)*
> hdfs dfs -ls /flumedata/processed/first-pass-stream/2017/01/13/*/
> 17/01/13 12:38:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Found 7 items
> -rw-r--r-- 3 b2c_runtime hadoop 513710580 2017-01-13 00:09 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815397.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 514439844 2017-01-13 00:18 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815398.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 515125962 2017-01-13 00:28 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815399.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 513010837 2017-01-13 00:38 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815400.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 511315467 2017-01-13 00:49 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815401.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 508420966 2017-01-13 00:59 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815402.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 2503353 2017-01-13 00:59 /flumedata/processed/first-pass-stream/2017/01/13/00-00/flumeload100-log.1484290815403.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 509116221 2017-01-13 01:10 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415705.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 507800675 2017-01-13 01:21 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415706.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 504432110 2017-01-13 01:32 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415707.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501932914 2017-01-13 01:42 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415708.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 498136257 2017-01-13 01:50 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415709.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 60539 2017-01-13 01:50 /flumedata/processed/first-pass-stream/2017/01/13/01-00/flumeload100-log.1484294415710.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 500879399 2017-01-13 02:11 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016017.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501827071 2017-01-13 02:21 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016018.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501489101 2017-01-13 02:32 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016019.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 501527838 2017-01-13 02:43 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016020.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 499393977 2017-01-13 02:54 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016021.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 1282327 2017-01-13 02:54 /flumedata/processed/first-pass-stream/2017/01/13/02-00/flumeload100-log.1484298016022.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 501033294 2017-01-13 03:10 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615579.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 500933906 2017-01-13 03:20 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615580.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 505869233 2017-01-13 03:31 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615581.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 502910608 2017-01-13 03:41 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615582.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 499561080 2017-01-13 03:52 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615583.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 3616826 2017-01-13 03:52 /flumedata/processed/first-pass-stream/2017/01/13/03-00/flumeload100-log.1484301615584.snappy.tmp
> Found 6 items
> -rw-r--r-- 3 b2c_runtime hadoop 502243204 2017-01-13 04:11 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215893.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 508966498 2017-01-13 04:22 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215894.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 510972236 2017-01-13 04:34 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215895.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 513225577 2017-01-13 04:46 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215896.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 512743679 2017-01-13 04:57 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215897.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 3888775 2017-01-13 04:57 /flumedata/processed/first-pass-stream/2017/01/13/04-00/flumeload100-log.1484305215898.snappy.tmp
> Found 7 items
> -rw-r--r-- 3 b2c_runtime hadoop 515832251 2017-01-13 05:11 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811983.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 518077964 2017-01-13 05:20 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811984.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519490676 2017-01-13 05:29 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811985.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519105563 2017-01-13 05:37 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811986.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 518672209 2017-01-13 05:46 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811987.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520019853 2017-01-13 05:53 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811988.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 1574211 2017-01-13 05:53 /flumedata/processed/first-pass-stream/2017/01/13/05-00/flumeload100-log.1484308811989.snappy.tmp
> Found 9 items
> -rw-r--r-- 3 b2c_runtime hadoop 521428204 2017-01-13 06:07 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413743.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519885769 2017-01-13 06:15 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413744.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519050891 2017-01-13 06:21 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413745.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520691322 2017-01-13 06:29 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413746.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520902319 2017-01-13 06:36 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413747.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520831873 2017-01-13 06:42 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413748.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 519785647 2017-01-13 06:49 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413749.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520590143 2017-01-13 06:55 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413750.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 4621367 2017-01-13 06:55 /flumedata/processed/first-pass-stream/2017/01/13/06-00/flumeload100-log.1484312413751.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 522623760 2017-01-13 07:06 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015214.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523065112 2017-01-13 07:12 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015215.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523445533 2017-01-13 07:18 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015216.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523084945 2017-01-13 07:24 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015217.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524283976 2017-01-13 07:30 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015218.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523923379 2017-01-13 07:36 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015219.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523910723 2017-01-13 07:42 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015220.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524266095 2017-01-13 07:47 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015221.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523002505 2017-01-13 07:53 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015222.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 520706211 2017-01-13 07:58 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015223.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 8051588 2017-01-13 07:58 /flumedata/processed/first-pass-stream/2017/01/13/07-00/flumeload100-log.1484316015224.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 520528155 2017-01-13 08:05 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618433.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 521761390 2017-01-13 08:11 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618434.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 522548272 2017-01-13 08:16 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618435.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 522616117 2017-01-13 08:22 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618436.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525953759 2017-01-13 08:28 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618437.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524475009 2017-01-13 08:34 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618438.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523995339 2017-01-13 08:40 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618439.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524188832 2017-01-13 08:47 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618440.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525303001 2017-01-13 08:53 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618441.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525606532 2017-01-13 08:59 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618442.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 4486982 2017-01-13 08:59 /flumedata/processed/first-pass-stream/2017/01/13/08-00/flumeload100-log.1484319618443.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 525207364 2017-01-13 09:06 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216987.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 526105891 2017-01-13 09:12 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216988.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 526426735 2017-01-13 09:18 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216989.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525298099 2017-01-13 09:24 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216990.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525282945 2017-01-13 09:30 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216991.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523921005 2017-01-13 09:36 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216992.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524827705 2017-01-13 09:42 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216993.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524203463 2017-01-13 09:47 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216994.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524678485 2017-01-13 09:53 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216995.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524598220 2017-01-13 09:59 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216996.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 3877959 2017-01-13 09:59 /flumedata/processed/first-pass-stream/2017/01/13/09-00/flumeload100-log.1484323216997.snappy.tmp
> Found 10 items
> -rw-r--r-- 3 b2c_runtime hadoop 523000460 2017-01-13 10:06 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813831.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523455154 2017-01-13 10:12 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813832.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525465618 2017-01-13 10:18 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813833.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524630955 2017-01-13 10:24 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813834.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 527780298 2017-01-13 10:30 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813835.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 526565562 2017-01-13 10:37 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813836.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524936336 2017-01-13 10:43 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813837.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524565610 2017-01-13 10:49 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813838.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524276950 2017-01-13 10:55 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813839.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 654810 2017-01-13 10:55 /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813840.snappy.tmp
> Found 11 items
> -rw-r--r-- 3 b2c_runtime hadoop 524174553 2017-01-13 11:06 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415712.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524127864 2017-01-13 11:12 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415713.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524778919 2017-01-13 11:18 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415714.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524851182 2017-01-13 11:24 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415715.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525156750 2017-01-13 11:30 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415716.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525334538 2017-01-13 11:35 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415717.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 527346578 2017-01-13 11:41 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415718.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525592734 2017-01-13 11:47 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415719.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 525502291 2017-01-13 11:53 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415720.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523135186 2017-01-13 11:58 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415721.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 9967141 2017-01-13 11:58 /flumedata/processed/first-pass-stream/2017/01/13/11-00/flumeload100-log.1484330415722.snappy.tmp
> Found 7 items
> -rw-r--r-- 3 b2c_runtime hadoop 520881970 2017-01-13 12:05 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016849.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 522340745 2017-01-13 12:11 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016850.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524156495 2017-01-13 12:17 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016851.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523482390 2017-01-13 12:23 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016852.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 524096591 2017-01-13 12:29 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016853.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 523184628 2017-01-13 12:35 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016854.snappy
> -rw-r--r-- 3 b2c_runtime hadoop 10981218 2017-01-13 12:35 /flumedata/processed/first-pass-stream/2017/01/13/12-00/flumeload100-log.1484334016855.snappy.tmp
>
> *HDFS Stat On One Of The Files (Keep in mind the output bucket is based on
> event time, which is MDT/MST, vs. the stat date, which is GMT)*
> hadoop fs -stat "%y %n" /flumedata/processed/first-pass-stream/2017/01/13/10-00/flumeload100-log.1484326813840.snappy.tmp
> 17/01/13 12:57:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-01-13 17:55:35 flumeload100-log.1484326813840.snappy.tmp
>
> Thanks
> Justin
>
> On Thu, Jan 12, 2017 at 11:56 PM, Denes Arvay <[email protected]> wrote:
>
>> Hi Justin,
>>
>> Could you please share your config file with us?
>>
>> Thanks,
>> Denes
>>
>>
>> On Thu, Jan 12, 2017, 20:20 Justin Workman <[email protected]> wrote:
>>
>>> Sorry for cross-posting to user and dev. I have recently set up a Flume
>>> configuration where we are using the regex_extractor interceptor to parse
>>> the actual event date from the record flowing through the Flume source,
>>> then using that date to build the HDFS sink bucket path. However, it
>>> appears that the hdfs.idleTimeout value is not honored in this
>>> configuration. It does work when using the timestamp interceptor to build
>>> the output path.
>>>
>>> I have set the hdfs.idleTimeout value for the HDFS sink, but the files
>>> are never closed or renamed until I restart or shut down Flume. Our Flume
>>> is configured to roll based on size or output path, and the files
>>> rename/close/roll fine based on size; however, the last file in each
>>> output path is always left with the .tmp extension until we restart
>>> Flume. I would expect the file to be renamed and closed if no records
>>> are written to it before the idleTimeout is reached.
>>>
>>> Could I be missing something, or is this a known bug with the
>>> regex_extractor interceptor?
>>>
>>> Thanks
>>> Justin
>>>
>>
>
