I believe that is supported as of Flume 1.5.0: http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
See hdfs.retryInterval. If you think there is a problem with that behavior, please file a bug.
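For reference, the relevant knobs in your sink config would look something
like this. This is a sketch only: the property names are from the user
guide, but the values are illustrative, not recommendations:

    # allow slow HDFS calls more headroom than the 10000 ms default
    hadoop2.sinks.hdfs2.hdfs.callTimeout = 30000
    # 0 = keep retrying the close/rename until it succeeds
    hadoop2.sinks.hdfs2.hdfs.closeTries = 0
    # seconds between consecutive close attempts (180 is the default)
    hadoop2.sinks.hdfs2.hdfs.retryInterval = 180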
Regards,
Mike

On Wed, Jul 20, 2016 at 1:43 AM, no jihun <jees...@gmail.com> wrote:

>> In fact looking at your error the timeout looks like the hdfs.callTimeout,
>> so that's where I'd focus. Is your HDFS cluster particularly unperformant?
>> 10s to respond to a call is pretty slow.
>
> You are right.
>
> At that time the HDFS disks were fully utilized by MapReduce jobs.
> I expected that even though Flume failed to close the files, a while
> later, once the disks were less utilized, the close would be retried by
> Flume and the files closed successfully.
>
> 2016-07-20 17:36 GMT+09:00 no jihun <jees...@gmail.com>:
>
>> I know about idleTimeout, rollSize, and rollCount (which control rolling
>> over the file being written).
>>
>> I didn't set callTimeout, so the default of 10s applies.
>> closeTries and retryInterval haven't been set either.
>>
>> So I think that even if a close fails once, it should be retried after
>> 180s (the default retryInterval). But as you can see in the logs quoted
>> below, the close retry never happens.
>>
>> Am I wrong?
>>
>> 2016-07-20 17:25 GMT+09:00 Chris Horrocks <chris@hor.rocks>:
>>
>>> You could look at tuning either hdfs.idleTimeout, hdfs.callTimeout, or
>>> hdfs.retryInterval, which can all be found at:
>>> http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
>>>
>>> --
>>> Chris Horrocks
>>>
>>> On Wed, Jul 20, 2016 at 9:01 am, no jihun <'jees...@gmail.com'> wrote:
>>>
>>> @Chris If you meant hdfs.callTimeout, I am testing that now.
>>>
>>> I can increase the value, but when a timeout occurs during close, is the
>>> close never retried (as in the logs below)?
>>>
>>> 2016-07-20 16:50 GMT+09:00 Chris Horrocks <chris@hor.rocks>:
>>>
>>>> Have you tried increasing the HDFS sink timeouts?
>>>>
>>>> --
>>>> Chris Horrocks
>>>>
>>>> On Wed, Jul 20, 2016 at 8:03 am, no jihun <'jees...@gmail.com'> wrote:
>>>>
>>>> Hi.
>>>>
>>>> I found some files on HDFS left in OPENFORWRITE state.
>>>>
>>>> *This is Flume's log about the file:*
>>>>
>>>> 18 7 2016 16:12:02,765 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor]
>>>> (org.apache.flume.sink.hdfs.BucketWriter.open:234)
>>>> - Creating 1468825922758.avro.tmp
>>>> 18 7 2016 16:22:39,812 INFO [hdfs-hdfs2-roll-timer-0]
>>>> (org.apache.flume.sink.hdfs.BucketWriter$5.call:429)
>>>> - Closing idle bucketWriter 1468825922758.avro.tmp at 1468826559812
>>>> 18 7 2016 16:22:39,812 INFO [hdfs-hdfs2-roll-timer-0]
>>>> (org.apache.flume.sink.hdfs.BucketWriter.close:363)
>>>> - Closing 1468825922758.avro.tmp
>>>> 18 7 2016 16:22:49,813 WARN [hdfs-hdfs2-roll-timer-0]
>>>> (org.apache.flume.sink.hdfs.BucketWriter.close:370)
>>>> - failed to close() HDFSWriter for file (1468825922758.avro.tmp).
>>>> Exception follows.
>>>> java.io.IOException: Callable timed out after 10000 ms on file:
>>>> 1468825922758.avro.tmp
>>>> 18 7 2016 16:22:49,816 INFO [hdfs-hdfs2-call-runner-7]
>>>> (org.apache.flume.sink.hdfs.BucketWriter$8.call:629)
>>>> - Renaming 1468825922758.avro.tmp to 1468825922758.avro
>>>>
>>>> - it seems the close was never retried
>>>> - Flume renamed the file while it was still open
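>>>>
>>>> To spot these failures I grep the Flume log for that close warning
>>>> (the log path here is just from my setup; adjust it to yours):
>>>>
>>>>     grep "failed to close() HDFSWriter" /var/log/flume/flume.log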
>>>>
>>>> *Two days later I found the file with this command:*
>>>>
>>>> hdfs fsck /data/flume -openforwrite | grep "OPENFORWRITE" |
>>>> grep "2016/07/18" | grep -v ".avro.tmp" |
>>>> sed -n 's/.*\(\/data\/flume\/.*avro\).*/\1/p'
>>>>
>>>> *So I ran recoverLease on it:*
>>>>
>>>> hdfs debug recoverLease -path 1468825922758.avro -retries 3
>>>> recoverLease returned false.
>>>> Retrying in 5000 ms...
>>>> Retry #1
>>>> recoverLease SUCCEEDED on 1468825922758.avro
>>>>
>>>> *My HDFS sink configuration:*
>>>>
>>>> hadoop2.sinks.hdfs2.type = hdfs
>>>> hadoop2.sinks.hdfs2.channel = fileCh1
>>>> hadoop2.sinks.hdfs2.hdfs.fileType = DataStream
>>>> hadoop2.sinks.hdfs2.serializer = ....
>>>> hadoop2.sinks.hdfs2.serializer.compressionCodec = snappy
>>>> hadoop2.sinks.hdfs2.hdfs.filePrefix = %{type}_%Y-%m-%d_%{host}
>>>> hadoop2.sinks.hdfs2.hdfs.fileSuffix = .avro
>>>> hadoop2.sinks.hdfs2.hdfs.rollInterval = 3700
>>>> #hadoop2.sinks.hdfs2.hdfs.rollSize = 67000000
>>>> hadoop2.sinks.hdfs2.hdfs.rollSize = 800000000
>>>> hadoop2.sinks.hdfs2.hdfs.rollCount = 0
>>>> hadoop2.sinks.hdfs2.hdfs.batchSize = 10000
>>>> hadoop2.sinks.hdfs2.hdfs.idleTimeout = 300
>>>>
>>>> hdfs.closeTries and hdfs.retryInterval are both unset.
>>>>
>>>> *My question is:*
>>>> Why was '1468825922758.avro' left OPENFORWRITE even though the rename
>>>> to .avro succeeded? Is this expected behavior? If so, what should I do
>>>> to eliminate these anomalous OPENFORWRITE files?
>>>>
>>>> Regards,
>>>> Jihun.
>>>
>>> --
>>> ----------------------------------------------
>>> Jihun No ( 노지훈 )
>>> ----------------------------------------------
>>> Twitter : @nozisim
>>> Facebook : nozisim
>>> Website : http://jeesim2.godohosting.com
>>> Market Apps : android market products.
>>> <https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>
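If you want to clean up the files that are already stuck, you can script the
two commands you already used. A rough, untested sketch, assuming the same
/data/flume layout and .avro suffix from your setup:

    #!/bin/sh
    # List files still open for write under /data/flume, skip the
    # in-progress .tmp files, and ask the NameNode to recover the
    # lease on each remaining one.
    hdfs fsck /data/flume -openforwrite 2>/dev/null \
      | grep OPENFORWRITE \
      | grep -v '\.avro\.tmp' \
      | grep -o '/data/flume/[^ ]*\.avro' \
      | while read -r f; do
          hdfs debug recoverLease -path "$f" -retries 3
        done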