[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466363#comment-13466363 ]
Mike Percy commented on FLUME-1573:
-----------------------------------

After a 2nd look at the code, it seems fine to me. HDFSEventSink keeps a hash
of bucket writers keyed by unique path. Can you please clarify how a collision
could occur in the same sink?

> Duplicated HDFS file name when multiple SinkRunner was existing
> ---------------------------------------------------------------
>
>                 Key: FLUME-1573
>                 URL: https://issues.apache.org/jira/browse/FLUME-1573
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.2.0
>            Reporter: Denny Ye
>            Assignee: Denny Ye
>             Fix For: v1.3.0
>
>         Attachments: FLUME-1573.patch
>
>
> Multiple HDFS sinks write events into storage, and a timeout exception keeps
> occurring:
> {code:xml}
> 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error
> java.io.IOException: Callable timed out after 10000 ms
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335)
>     ... 5 more
> {code}
> I suspected an HDFS timeout or a slow response. As expected, I found a
> duplicate-creation exception for the same file on the HDFS side. Flume also
> logged the same duplicated file name:
> {code:xml}
> 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> {code}
> Different threads tried to create the same file, about one second apart
> rather than at the same instant.
> I believe the root cause is incorrect use of the AtomicLong field named
> 'fileExtensionCounter' in BucketWriter. The threads should share a single
> counter protected by CAS, rather than each holding its own private counter.
> As it stands, the counter does nothing to prevent HDFS path conflicts.
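
The reporter's hypothesis can be illustrated with a minimal sketch. This is not Flume's code: the `Writer` class, both helper methods, and the seed value are hypothetical, and the sketch simply assumes that two private counters happen to hold the same value. Under that assumption, two writers with the same path prefix produce an identical file name, while writers that share one `AtomicLong` do not.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CounterCollisionSketch {

    // Hypothetical stand-in for a bucket writer; NOT Flume's BucketWriter.
    static final class Writer {
        private final String prefix;
        private final AtomicLong counter;

        Writer(String prefix, AtomicLong counter) {
            this.prefix = prefix;
            this.counter = counter;
        }

        // Builds the next temporary file path from the counter value.
        String nextPath() {
            return prefix + "." + counter.incrementAndGet() + ".tmp";
        }
    }

    // Each writer owns a private counter; assumed here to start at the same seed.
    static String[] pathsWithPrivateCounters(String prefix, long seed) {
        Writer a = new Writer(prefix, new AtomicLong(seed));
        Writer b = new Writer(prefix, new AtomicLong(seed));
        return new String[] { a.nextPath(), b.nextPath() };
    }

    // Both writers increment one shared counter, so the values never repeat.
    static String[] pathsWithSharedCounter(String prefix, long seed) {
        AtomicLong shared = new AtomicLong(seed);
        Writer a = new Writer(prefix, shared);
        Writer b = new Writer(prefix, shared);
        return new String[] { a.nextPath(), b.nextPath() };
    }

    public static void main(String[] args) {
        String prefix = "/FLUME/dt=2012-09-13/02-host";
        long seed = 1347501924110L; // illustrative value only

        for (String p : pathsWithPrivateCounters(prefix, seed)) {
            System.out.println("private: " + p);
        }
        for (String p : pathsWithSharedCounter(prefix, seed)) {
            System.out.println("shared:  " + p);
        }
    }
}
```

Note that a map of writers keyed by path inside one sink, as described in the comment above, would not by itself rule out this situation if the two writers belong to different sink instances, which is what the `hdfsSink-3` and `hdfsSink-4` thread names in the log suggest.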