[
https://issues.apache.org/jira/browse/FLUME-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Li Junjun updated FLUME-1258:
-----------------------------
Description:
When a file-delete event fires, DirWatcher calls tail.removeCursor(c).
DirWatcher:
1. (success) gets the lock on tail --> synchronized public void removeCursor
2. tries to get the lock on rmCursors --> synchronized (rmCursors) {
rmCursors.add(cursor);}
If at this time the TailThread has some rmCursors to handle:
TailThread:
1. (success) gets the lock on rmCursors
2. for each cursor in rmCursors, calls c.flush()
c.flush() puts an event into a SynchronousQueue and blocks until
another thread (PumperThread) polls it.
PumperThread:
1. (success) polls from the SynchronousQueue
2. tries to get the lock on tail to call updateEventProcessingStats, but it
can't, because DirWatcher holds it! So the PumperThread blocks here.
Because the PumperThread is blocked, nothing drains the SynchronousQueue,
so the TailThread blocks in step 2 and does not release the lock on
rmCursors, and so DirWatcher blocks in step 2.
It's a chain of blocked threads, i.e. a deadlock.
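
One possible way to break the cycle described above is for the TailThread to
copy rmCursors while holding its lock and to call the blocking flush only
after releasing it. The sketch below is not the Flume source and not a
committed fix. TailSketch, flushRemovedCursors, pumpOnce and demo are
hypothetical names, cursors are plain strings, and take() stands in for the
pumper's poll. Only removeCursor, rmCursors and updateEventProcessingStats
are taken from the report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.SynchronousQueue;

// Hypothetical stand-in for the tail source; bodies are assumptions.
class TailSketch {
    private final List<String> rmCursors = new ArrayList<>();
    private final SynchronousQueue<String> sync = new SynchronousQueue<>();
    private long eventsProcessed = 0;

    // DirWatcher side: takes the tail lock, then briefly the rmCursors lock.
    synchronized void removeCursor(String cursor) {
        synchronized (rmCursors) {
            rmCursors.add(cursor);
        }
    }

    // TailThread side: snapshot under the lock, flush with no lock held.
    int flushRemovedCursors() throws InterruptedException {
        List<String> pending;
        synchronized (rmCursors) {
            pending = new ArrayList<>(rmCursors);
            rmCursors.clear();
        }
        for (String c : pending) {
            // May block until the pumper takes the event, but DirWatcher
            // can still enter synchronized (rmCursors) in the meantime.
            sync.put("flush:" + c);
        }
        return pending.size();
    }

    // PumperThread side: take one event, then update stats under the tail lock.
    void pumpOnce() throws InterruptedException {
        sync.take();
        updateEventProcessingStats();
    }

    private synchronized void updateEventProcessingStats() {
        eventsProcessed++;
    }

    synchronized long getEventsProcessed() {
        return eventsProcessed;
    }

    // Runs the three roles concurrently and returns the processed count.
    static long demo() {
        TailSketch tail = new TailSketch();
        Thread pumper = new Thread(() -> {
            try {
                while (true) {
                    tail.pumpOnce();
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // asked to stop
            }
        });
        Thread watcher = new Thread(() -> {
            for (String f : new String[] {"a.log", "b.log", "c.log"}) {
                tail.removeCursor(f);
            }
        });
        pumper.start();
        watcher.start();
        try {
            int flushed = 0;
            while (flushed < 3) {
                flushed += tail.flushRemovedCursors();
            }
            watcher.join();
            pumper.interrupt();
            pumper.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return tail.getEventsProcessed();
    }
}
```

With this ordering no thread waits on the SynchronousQueue while holding
rmCursors, so the wait of DirWatcher on rmCursors is always short and the
PumperThread eventually gets the tail lock. Whether this fits the real
TailSource code depends on details the report does not show.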
> TailDirSource might get dead lock when delete file in dir.
> -------------------------------------------------------------
>
> Key: FLUME-1258
> URL: https://issues.apache.org/jira/browse/FLUME-1258
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v0.9.3
> Reporter: Li Junjun
> Priority: Critical
>