[ https://issues.apache.org/jira/browse/FLUME-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478295#comment-13478295 ]
Brock Noland commented on FLUME-1573:
-------------------------------------
What route are we going to go here? IMHO, let's just use a UUID and then be
done with it. That way users won't trip over this.
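
For illustration, a minimal sketch of what UUID-based naming could look like. The class and the helper `tmpFileName` are hypothetical and are not taken from Flume or from the attached patch; the `02-host` prefix and `.tmp` suffix are borrowed from the log lines quoted below.

```java
import java.util.UUID;

// Hypothetical sketch, not Flume code: name in-progress files with a random UUID
// so that two writers never need to coordinate on a counter.
class UuidFileName {

    // Builds "<prefix>.<random UUID>.tmp". The prefix is supplied by the caller.
    static String tmpFileName(String prefix) {
        return prefix + "." + UUID.randomUUID() + ".tmp";
    }

    public static void main(String[] args) {
        // Two writers using the same prefix still get different names.
        System.out.println(tmpFileName("02-host"));
        System.out.println(tmpFileName("02-host"));
    }
}
```

The trade-off is that such names no longer sort by creation order the way counter-based names do.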
> Duplicated HDFS file name when multiple SinkRunner was existing
> ---------------------------------------------------------------
>
> Key: FLUME-1573
> URL: https://issues.apache.org/jira/browse/FLUME-1573
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Reporter: Denny Ye
> Assignee: Denny Ye
> Fix For: v1.3.0
>
> Attachments: FLUME-1573.patch
>
>
> Multiple HDFS sinks write events into storage. A timeout exception keeps
> occurring:
> {code:xml}
> 11 Sep 2012 07:04:53,478 WARN [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:442) - HDFS IO error
> java.io.IOException: Callable timed out after 10000 ms
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:342)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.append(HDFSEventSink.java:713)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:412)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.callWithTimeout(HDFSEventSink.java:335)
>     ... 5 more
> {code}
> I suspected an HDFS timeout or a slow response. As expected, I found a
> duplicate-creation exception for the same path on the HDFS side. Flume also
> logged the same duplicated file name:
> {code:xml}
> 13 Sep 2012 02:09:35,432 INFO [hdfs-hdfsSink-3-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> 13 Sep 2012 02:09:36,425 INFO [hdfs-hdfsSink-4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:189) - Creating /FLUME/dt=2012-09-13/02-host.1347501924111.tmp
> {code}
> Different threads tried to create the same file, even though they did not run
> at the same moment. I think the root cause is incorrect use of the AtomicLong
> field named 'fileExtensionCounter' in BucketWriter. All threads should share
> one counter protected by CAS, rather than each thread holding its own private
> counter. Separate counters do nothing to prevent HDFS path conflicts.
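
The counter problem described in the report can be sketched as follows. This is an illustration only and not Flume's actual BucketWriter code: both writer classes are hypothetical stand-ins, and the seed value is an assumption chosen so that the first generated name matches the one in the quoted log.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Flume code: private counters versus one shared counter.
class CounterDemo {

    // Each writer owns its own AtomicLong. The atomicity only protects
    // that one instance, so two writers with the same seed collide.
    static class PrivateCounterWriter {
        private final AtomicLong counter;

        PrivateCounterWriter(long seed) {
            counter = new AtomicLong(seed);
        }

        String nextPath() {
            return "02-host." + counter.incrementAndGet() + ".tmp";
        }
    }

    // All writers draw from one static AtomicLong, so incrementAndGet()
    // (a CAS loop internally) hands out each value exactly once.
    static class SharedCounterWriter {
        private static final AtomicLong SHARED = new AtomicLong(1347501924110L);

        String nextPath() {
            return "02-host." + SHARED.incrementAndGet() + ".tmp";
        }
    }

    public static void main(String[] args) {
        long seed = 1347501924110L; // assumed seed, e.g. a timestamp both writers happened to share

        PrivateCounterWriter p1 = new PrivateCounterWriter(seed);
        PrivateCounterWriter p2 = new PrivateCounterWriter(seed);
        System.out.println(p1.nextPath()); // 02-host.1347501924111.tmp
        System.out.println(p2.nextPath()); // 02-host.1347501924111.tmp (duplicate)

        SharedCounterWriter s1 = new SharedCounterWriter();
        SharedCounterWriter s2 = new SharedCounterWriter();
        System.out.println(s1.nextPath()); // 02-host.1347501924111.tmp
        System.out.println(s2.nextPath()); // 02-host.1347501924112.tmp
    }
}
```

A shared counter only guarantees uniqueness within one JVM; separate agents writing to the same directory would still need something else, such as a host name or UUID, in the path.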