[ 
https://issues.apache.org/jira/browse/STORM-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-3292:
----------------------------------
    Labels: pull-request-available  (was: )

> Trident HiveState must flush writers when the batch commits
> -----------------------------------------------------------
>
>                 Key: STORM-3292
>                 URL: https://issues.apache.org/jira/browse/STORM-3292
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: Arun Mahadevan
>            Priority: Major
>              Labels: pull-request-available
>
> In Trident, the Hive writer is flushed only once it reaches the batch size.
> See:
> https://github.com/apache/storm/blob/master/external/storm-hive/src/main/java/org/apache/storm/hive/trident/HiveState.java#L108
> Trident HiveState does not flush during the batch commit, which appears to be 
> an oversight. Without that flush, the Trident state cannot guarantee 
> at-least-once processing. (E.g. if the transaction is still open when Trident 
> moves on to the next txid and a failure occurs later, the data in the open 
> transaction is lost.)
> So I think that, for at-least-once, HiveState must flush all the writers, 
> irrespective of the batch size, when Trident invokes "commit(txid)".
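
The proposed behaviour can be illustrated with a minimal, self-contained sketch. It does not use the storm-hive classes: SketchWriter and SketchHiveState are hypothetical stand-ins, and only the commit(txid) and beginCommit(txid) method names mirror Trident's State interface. The point it shows is that commit flushes every writer even when the batch-size threshold has not been reached.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a Hive writer: buffers records until flushed. */
class SketchWriter {
    private final List<String> pending = new ArrayList<>();
    private final List<String> durable = new ArrayList<>();

    void write(String record) {
        pending.add(record);
    }

    /** Moves all buffered records to the "durable" list. */
    void flush() {
        durable.addAll(pending);
        pending.clear();
    }

    int pendingCount() {
        return pending.size();
    }

    int durableCount() {
        return durable.size();
    }
}

/** Hypothetical state object; not the real HiveState implementation. */
class SketchHiveState {
    private final int batchSize;
    private final Map<String, SketchWriter> writers = new HashMap<>();
    private int currentBatchSize = 0;

    SketchHiveState(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Buffers records and flushes only once the batch size is reached. */
    void update(String partition, List<String> records) {
        SketchWriter writer = writers.computeIfAbsent(partition, p -> new SketchWriter());
        for (String record : records) {
            writer.write(record);
            currentBatchSize++;
        }
        if (currentBatchSize >= batchSize) {
            flushAllWriters();
        }
    }

    void beginCommit(Long txid) {
        // Nothing to prepare in this sketch.
    }

    /** The proposed change: flush every writer regardless of batch size. */
    void commit(Long txid) {
        flushAllWriters();
    }

    private void flushAllWriters() {
        for (SketchWriter writer : writers.values()) {
            writer.flush();
        }
        currentBatchSize = 0;
    }

    int pendingCount() {
        int total = 0;
        for (SketchWriter writer : writers.values()) {
            total += writer.pendingCount();
        }
        return total;
    }

    int durableCount() {
        int total = 0;
        for (SketchWriter writer : writers.values()) {
            total += writer.durableCount();
        }
        return total;
    }
}
```

With a batch size of 100, three buffered records stay pending after update() and become durable only once commit(txid) runs. Without the flush in commit they would remain buffered into the next txid, which is the loss scenario described above.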



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
