[ 
https://issues.apache.org/jira/browse/FLUME-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Junjun updated FLUME-1258:
-----------------------------

    Description: 
  When a fileDelete event fires, DirWatcher calls tail.removeCursor(c).

DirWatcher:
           1. (success) Gets the lock on tail --> synchronized public void removeCursor
           2. Tries to get the lock on rmCursors --> synchronized (rmCursors) { rmCursors.add(cursor); }


  If, at this moment, TailThread has some rmCursors to handle:

TailThread:
           1. (success) Gets the lock on rmCursors.
           2. For each cursor in rmCursors, calls c.flush().
              c.flush() puts an event into a SynchronousQueue and blocks until
              another thread (PumperThread) polls it.


PumperThread:
           1. (success) Polls from the SynchronousQueue.
           2. Tries to get the lock on tail to call updateEventProcessingStats,
              but it can't, because DirWatcher holds it! So the PumperThread
              blocks here.


Because the PumperThread is blocked, the SynchronousQueue never frees its
capacity, so TailThread blocks at its step 2 without releasing the lock on
rmCursors, and DirWatcher blocks at its step 2.

It's a chain of blocked threads.
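
The chain above can be reproduced outside Flume with a small stand-alone
sketch. This is not the Flume source: the class name BlockChainSketch, the
reproduce() and awaitQuietly() helpers, the two latches, and the plain Object
standing in for the tail monitor are all assumptions made for illustration.
The latches only force the interleaving described in the report. Note that
with a SynchronousQueue the first put() completes as soon as the pumper takes
it; it is the next put() that blocks, because the pumper is by then stuck
waiting for the tail lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;

class BlockChainSketch {

    /** Starts the three threads; returns true if all are still stuck after waitMillis. */
    static boolean reproduce(long waitMillis) throws InterruptedException {
        final Object tail = new Object();                  // hypothetical stand-in for the tail monitor
        final List<String> rmCursors = new ArrayList<>();  // cursors queued for removal
        final SynchronousQueue<String> queue = new SynchronousQueue<>();
        final CountDownLatch watcherHoldsTail = new CountDownLatch(1);
        final CountDownLatch tailThreadHoldsRmCursors = new CountDownLatch(1);

        Thread dirWatcher = new Thread(() -> {
            synchronized (tail) {                          // step 1: removeCursor is synchronized
                watcherHoldsTail.countDown();
                awaitQuietly(tailThreadHoldsRmCursors);
                synchronized (rmCursors) {                 // step 2: blocks, TailThread owns this lock
                    rmCursors.add("cursor");
                }
            }
        }, "DirWatcher");

        Thread tailThread = new Thread(() -> {
            awaitQuietly(watcherHoldsTail);
            synchronized (rmCursors) {                     // step 1: succeeds
                tailThreadHoldsRmCursors.countDown();
                try {
                    queue.put("event-1");                  // handed over to the pumper
                    queue.put("event-2");                  // step 2: blocks, nobody is taking any more
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "TailThread");

        Thread pumper = new Thread(() -> {
            try {
                while (true) {
                    queue.take();                          // step 1: succeeds for event-1
                    synchronized (tail) {                  // step 2: blocks, DirWatcher owns tail
                        // updateEventProcessingStats would run here
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "PumperThread");

        for (Thread t : new Thread[] {dirWatcher, tailThread, pumper}) {
            t.setDaemon(true);                             // let the JVM exit despite the stuck threads
            t.start();
        }
        Thread.sleep(waitMillis);
        return dirWatcher.isAlive() && tailThread.isAlive() && pumper.isAlive();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all three threads stuck: " + reproduce(500));
    }
}
```

In this sketch none of the three threads can make progress: DirWatcher waits
for rmCursors, TailThread waits for a consumer, and PumperThread waits for
tail. A thread dump taken at that point should show the same three-way wait
that the report describes.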






  was:
  When a fileDelete event fires, DirWatcher calls tail.removeCursor(c).
DirWatcher:
           <br/>1. (success) Gets the lock on tail --> synchronized public void removeCursor
           2. Tries to get the lock on rmCursors --> synchronized (rmCursors) { rmCursors.add(cursor); }
  If, at this moment, TailThread has some rmCursors to handle:
TailThread:
           1. (success) Gets the lock on rmCursors.
           2. For each cursor in rmCursors, calls c.flush().
              c.flush() puts an event into a SynchronousQueue and blocks until
              another thread (PumperThread) polls it.
PumperThread:
           1. (success) Polls from the SynchronousQueue.
           2. Tries to get the lock on tail to call updateEventProcessingStats,
              but it can't, because DirWatcher holds it! So the PumperThread
              blocks here.

Because the PumperThread is blocked, the SynchronousQueue never frees its
capacity, so TailThread blocks at its step 2 without releasing the lock on
rmCursors, and DirWatcher blocks at its step 2.

It's a chain of blocked threads.

    
> TailDirSource might deadlock when a file is deleted in the directory.
> ---------------------------------------------------------------------
>
>                 Key: FLUME-1258
>                 URL: https://issues.apache.org/jira/browse/FLUME-1258
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v0.9.3
>            Reporter: Li Junjun
>            Priority: Critical

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
