GitHub user Ethanlm opened a pull request:

    https://github.com/apache/storm/pull/2315

    [STORM-2732] Close the eldest writer before removing it from WritersMap

    [JIRA-STORM-2732](https://issues.apache.org/jira/browse/STORM-2732)
    
    `WritersMap` serves as an LRU cache for `maxWriters` number of `Writer`s. 
But it doesn't close the removed writer, which could lead to potential data 
loss.
    
    The exception handling is really hard. The bolt behaviors depend on later 
on situations. I don't know how to address the exception handling currently so 
I chose to just log the error which is also what we have done elsewhere in this 
class.
    
    --------
    
    I wanted to discuss about what would happen. Please correct me if I am 
wrong.
    #### Case 1
    If `eldest.getValue().close()` failed, assume next tuple invokes SyncPolicy 
but the sync failed, then the bolt will call `collector.fail()` on all the 
tuples in `tupleBatch`. In this case, all those tuples will be transferred 
again, which gives the bolt another chance to write those tuples later.  
However, if some tuples has been written to hdfs before 
`eldest.getValue().close()` failed, these tuples could be written to hdfs 
**twice**. 
    
    #### Case 2
    If `eldest.getValue().close()` failed, assume next tuple invokes SyncPolicy 
and the sync succeed, then the bolt will call `collector.ack()` on all the 
tuples in `tupleBatch`. Then the lost data because of  
`eldest.getValue().close()` failure will not be transferred to the bolt again, 
which means they are **lost** permanently .
    
    #### Case 3
    If `eldest.getValue().close()` succeeded, assume next tuple invokes 
SyncPolicy but the sync failed, then the bolt will call `collector.fail()` on 
all the tuples in `tupleBatch`. Then the tuples have been written to hdfs by 
`eldest.getValue().close()`  will be transferred to the bolt again. Potentially 
they could be written **twice**.
    
    The same issues apply to the calling to `doRotationAndRemoveWriter()`. The 
behaviors are not consistent.  We might want to have a better way to deal with 
exceptions.
    
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Ethanlm/storm STORM-2732

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2315.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2315
    
----
commit 56fefea364a1fdcf92352d873ac56d4ba6dc4243
Author: Ethan Li <ethanopensou...@gmail.com>
Date:   2017-09-08T19:17:50Z

    [STORM-2732] Close the eldest writer before removing it from WritersMap

----


---

Reply via email to