My implementation synchronizes on the writer map, and the append and
close operations on the BucketWriter are synchronized as well. In rare
cases a writer can still be closed just before a thread is about to
append to it, but that is harmless: the append just backs off and gets
a fresh writer on the next cycle. Also, if possible, please add comments
to the JIRA thread when the mail is generated from there :)
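
To make the locking scheme above concrete, here is a minimal sketch. It is
not the actual patch: SketchWriter and IdleCloseSketch are hypothetical
stand-ins, the "file" is just an in-memory buffer, and timestamps are
passed in explicitly so the race can be reproduced deterministically.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical stand-in for a bucket writer; not Flume's BucketWriter. */
class SketchWriter {
    private final StringBuilder data = new StringBuilder();
    private boolean closed = false;
    private long lastAppendMillis;

    SketchWriter(long nowMillis) {
        this.lastAppendMillis = nowMillis;
    }

    /** Returns false if the writer is already closed, so the caller can back off. */
    synchronized boolean append(String event, long nowMillis) {
        if (closed) {
            return false;
        }
        data.append(event).append('\n');
        lastAppendMillis = nowMillis;
        return true;
    }

    /** Closes the writer only if it has been idle long enough; true if this call closed it. */
    synchronized boolean closeIfIdle(long nowMillis, long idleMillis) {
        if (closed || nowMillis - lastAppendMillis < idleMillis) {
            return false;
        }
        closed = true;
        return true;
    }
}

/** Hypothetical sink-side holder of the writer map (an assumption, not Flume code). */
class IdleCloseSketch {
    private final Map<String, SketchWriter> writers = new HashMap<>();
    private final long idleMillis;

    IdleCloseSketch(long idleMillis) {
        this.idleMillis = idleMillis;
    }

    /** Looks up or creates the writer for a bucket while holding the map lock. */
    SketchWriter lookup(String bucketPath, long nowMillis) {
        synchronized (writers) {
            return writers.computeIfAbsent(bucketPath, p -> new SketchWriter(nowMillis));
        }
    }

    /** Sink side: a false return means the writer was closed first; retry next cycle. */
    boolean write(String bucketPath, String event, long nowMillis) {
        // The watcher may close the writer between lookup() and append().
        return lookup(bucketPath, nowMillis).append(event, nowMillis);
    }

    /** Watcher side: closes idle writers and removes them from the map. */
    int closeIdle(long nowMillis) {
        int closedCount = 0;
        synchronized (writers) {
            Iterator<Map.Entry<String, SketchWriter>> it = writers.entrySet().iterator();
            while (it.hasNext()) {
                if (it.next().getValue().closeIfIdle(nowMillis, idleMillis)) {
                    it.remove();
                    closedCount++;
                }
            }
        }
        return closedCount;
    }

    int openWriters() {
        synchronized (writers) {
            return writers.size();
        }
    }
}
```

If a thread holds a writer reference from lookup() and the watcher closes
it first, append() returns false rather than writing to a closed handle;
the next write() for that bucket creates a fresh writer because the closed
one was removed from the map.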
On 10/19/2012 05:13 AM, Roshan Naik wrote:
Will need to handle race conditions, e.g. a thread resumes writing
immediately after the watcher thread decides to close the file handle. In
that sense a deterministic close is nicer than a timeout-based 'garbage
collection'.
-roshan
On Thu, Oct 18, 2012 at 12:04 PM, Mike Percy (JIRA) <[email protected]> wrote:
[
https://issues.apache.org/jira/browse/FLUME-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479255#comment-13479255]
Mike Percy commented on FLUME-1350:
-----------------------------------
Hi Juhani, something like a close-on-idle timeout makes sense. I'd be
happy to review it if you want to work on it.
HDFS file handle not closed properly when date bucketing
---------------------------------------------------------
Key: FLUME-1350
URL: https://issues.apache.org/jira/browse/FLUME-1350
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Affects Versions: v1.1.0, v1.2.0
Reporter: Robert Mroczkowski
Attachments: HDFSEventSink.java.patch
With configuration:
agent.sinks.hdfs-cafe-access.type = hdfs
agent.sinks.hdfs-cafe-access.hdfs.path = hdfs://nga/nga/apache/access/%y-%m-%d/
agent.sinks.hdfs-cafe-access.hdfs.fileType = DataStream
agent.sinks.hdfs-cafe-access.hdfs.filePrefix = cafe_access
agent.sinks.hdfs-cafe-access.hdfs.rollInterval = 21600
agent.sinks.hdfs-cafe-access.hdfs.rollSize = 10485760
agent.sinks.hdfs-cafe-access.hdfs.rollCount = 0
agent.sinks.hdfs-cafe-access.hdfs.txnEventMax = 1000
agent.sinks.hdfs-cafe-access.hdfs.batchSize = 1000
#agent.sinks.hdfs-cafe-access.hdfs.codeC = snappy
agent.sinks.hdfs-cafe-access.hdfs.hdfs.maxOpenFiles = 5000
agent.sinks.hdfs-cafe-access.channel = memo-1
When a new directory is created, the previous file handle remains open.
The rollInterval setting is applied only to files in the current date bucket.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA
administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira