[ https://issues.apache.org/jira/browse/STORM-108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-108:
-------------------------------
    Component/s: storm-core

> Add commit support to Trident Transactional Spouts
> --------------------------------------------------
>
>                 Key: STORM-108
>                 URL: https://issues.apache.org/jira/browse/STORM-108
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/559
> There is no notice from Trident back to the Spout when a batch is 
> successfully completed (for a specific transaction id). When building a 
> Transactional Spout it would be useful to have a success method on the 
> Coordinator to know the batch was completed.
> Looking at the code:
> On completion of a batch, PartitionedTridentSpoutExecutor's success method is 
> called on the Coordinator, but the Coordinator doesn't do anything. And the 
> ITridentSpout.BatchCoordinator interface doesn't even define a 'success' 
> method.
> It looks like what I need to do to complete this code is to:
> - change the IPartitionedTridentSpout.Coordinator to have a success(long txid)
>   method
> - change PartitionedTridentSpoutExecutor's success to call the coordinator's
>   success method
> Within my own IPartitionedTridentSpout-derived Spout:
> - have a common state object in the Spout accessible by both my Emitter and the
>   Coordinator
> - implement the success() method on the Coordinator
> - when a batch is emitted via emitPartitionBatchNew, write information about
>   which messages were included in that batch to the shared state object, along
>   with the transaction id
> - when the Coordinator success() method is called, find the transaction and
>   then 'acknowledge' the messages in that batch back to the source
> - to handle failures, have the emitPartitionBatch method check a counter in the
>   shared state for the transaction id and fail after 'x' retries. By 'fail' I
>   mean execute my own logic, such as writing to a dead-letter queue, and then
>   emit no tuples, thus allowing Trident to advance to the next transaction.
> I understand that some messages in the batch may have succeeded when I give 
> up, but I have no way of knowing which ones, so we'll have to handle that in 
> our recovery logic outside of Trident.
> Am I missing anything?
> Is there something in the TridentSpout lifecycle I haven't figured out by
> looking at the code? I see a 'success' method on the Coordinator, but should
> there be a complementary 'failed' method as well? I didn't see any retry
> logic on the calls to emitPartitionBatch either, so I'm not sure my failure
> handling above is correct.
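
The shared state object proposed in the issue could look roughly like the sketch below. It is only an illustration of the bookkeeping: BatchTracker and all of its method names are hypothetical and are not part of Storm, and message ids are assumed to be plain strings. It also assumes the Emitter and the Coordinator can see the same object, which may not hold if they run in different worker processes; in that case the same bookkeeping would have to live in an external store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical shared state between an Emitter and a Coordinator (not a Storm class). */
final class BatchTracker {
    private final int maxAttempts;
    // txid -> ids of the source messages emitted in that batch
    private final Map<Long, List<String>> pending = new ConcurrentHashMap<>();
    // txid -> number of times the batch has been emitted so far
    private final Map<Long, Integer> attempts = new ConcurrentHashMap<>();

    BatchTracker(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Would be called when a new batch is emitted (e.g. from emitPartitionBatchNew). */
    void recordBatch(long txid, List<String> messageIds) {
        pending.put(txid, new ArrayList<>(messageIds));
        attempts.put(txid, 1);
    }

    /**
     * Would be called when a batch is re-emitted (e.g. from emitPartitionBatch).
     * Returns true if the batch should be replayed, false once the attempt
     * count exceeds maxAttempts and the caller should give up.
     */
    boolean recordReplay(long txid) {
        int n = attempts.merge(txid, 1, Integer::sum);
        return n <= maxAttempts;
    }

    /** Would be called from a Coordinator success(txid): returns the ids to acknowledge. */
    List<String> complete(long txid) {
        return remove(txid);
    }

    /** Would be called when giving up: returns the ids to send to a dead-letter queue. */
    List<String> abandon(long txid) {
        return remove(txid);
    }

    // Forget the transaction and hand back its message ids (empty if unknown).
    private List<String> remove(long txid) {
        attempts.remove(txid);
        List<String> ids = pending.remove(txid);
        return ids == null ? Collections.<String>emptyList() : ids;
    }
}
```

With this shape, the emitter would call recordBatch on first emission and recordReplay on each re-emission, emitting no tuples and calling abandon once recordReplay returns false; the coordinator's success(txid) would call complete and acknowledge the returned ids to the source.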



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
