Hi.

I found some files on HDFS left in OPEN_FOR_WRITE state.

*This is Flume's log for the file:*


18 7 2016 16:12:02,765 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:234) - Creating 1468825922758.avro.tmp

18 7 2016 16:22:39,812 INFO  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$5.call:429) - Closing idle bucketWriter 1468825922758.avro.tmp at 1468826559812

18 7 2016 16:22:39,812 INFO  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter.close:363) - Closing 1468825922758.avro.tmp

18 7 2016 16:22:49,813 WARN  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter.close:370) - failed to close() HDFSWriter for file (1468825922758.avro.tmp). Exception follows.
java.io.IOException: Callable timed out after 10000 ms on file: 1468825922758.avro.tmp

18 7 2016 16:22:49,816 INFO  [hdfs-hdfs2-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter$8.call:629) - Renaming 1468825922758.avro.tmp to 1468825922758.avro

- It seems the close() was never retried (the 10000 ms timeout above looks like the default hdfs.callTimeout).
- Flume then renamed the .tmp file even though it was still open.


*Two days later I found that file with this command:*

hdfs fsck /data/flume -openforwrite | grep "OPENFORWRITE" | grep "2016/07/18" \
  | sed 's/\/data\/flume\//\n\/data\/flume\//g' | grep -v ".avro.tmp" \
  | sed -n 's/.*\(\/data\/flume\/.*avro\).*/\1/p'



*So I recoverLease-ed it:*

hdfs debug recoverLease -path 1468825922758.avro -retries 3
recoverLease returned false.
Retrying in 5000 ms...
Retry #1
recoverLease SUCCEEDED on 1468825922758.avro
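
(In case it is useful context: the same fsck pipeline above can be fed straight into recoverLease to clean up every stuck file in one go. This is just a sketch reusing the commands above, not a verified workflow; the sed pattern may need adjusting for your fsck output.)

# Sketch: run recoverLease on every .avro file still reported OPENFORWRITE.
hdfs fsck /data/flume -openforwrite | grep "OPENFORWRITE" \
  | sed 's/\/data\/flume\//\n\/data\/flume\//g' | grep -v ".avro.tmp" \
  | sed -n 's/.*\(\/data\/flume\/.*avro\).*/\1/p' \
  | while read -r f; do
      hdfs debug recoverLease -path "$f" -retries 3
    done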



*My HDFS sink configuration:*

hadoop2.sinks.hdfs2.type = hdfs
hadoop2.sinks.hdfs2.channel = fileCh1
hadoop2.sinks.hdfs2.hdfs.fileType = DataStream
hadoop2.sinks.hdfs2.serializer = ....
hadoop2.sinks.hdfs2.serializer.compressionCodec = snappy
hadoop2.sinks.hdfs2.hdfs.filePrefix = %{type}_%Y-%m-%d_%{host}
hadoop2.sinks.hdfs2.hdfs.fileSuffix = .avro
hadoop2.sinks.hdfs2.hdfs.rollInterval = 3700
#hadoop2.sinks.hdfs2.hdfs.rollSize = 67000000
hadoop2.sinks.hdfs2.hdfs.rollSize = 800000000
hadoop2.sinks.hdfs2.hdfs.rollCount = 0
hadoop2.sinks.hdfs2.hdfs.batchSize = 10000
hadoop2.sinks.hdfs2.hdfs.idleTimeout = 300


hdfs.closeTries and hdfs.retryInterval are both left unset (defaults).
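
If I read the sink docs correctly, something along these lines should give close() more time and keep retrying it instead of renaming a still-open file. These are example values only; I have not verified that this actually prevents the OPENFORWRITE files:

# sketch with example values, not verified against this problem
# raise callTimeout from its 10000 ms default (the timeout seen in the log)
hadoop2.sinks.hdfs2.hdfs.callTimeout = 60000
# 0 = keep retrying the close/rename until it succeeds
hadoop2.sinks.hdfs2.hdfs.closeTries = 0
# seconds between consecutive close attempts
hadoop2.sinks.hdfs2.hdfs.retryInterval = 180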


*My questions*
Why was '1468825922758.avro' left OPEN_FOR_WRITE even though the rename to .avro succeeded?
Is this expected behavior, and what should I do to eliminate these anomalous
OPENFORWRITE files?

Regards,
Jihun.
