Arun Mahadevan created STORM-3292:
-------------------------------------

    Summary: Trident HiveState must flush writers when the batch commits
        Key: STORM-3292
        URL: https://issues.apache.org/jira/browse/STORM-3292
    Project: Apache Storm
 Issue Type: Improvement
   Reporter: Arun Mahadevan

For Trident, the Hive writer is flushed only once it reaches the batch size. See:
https://github.com/apache/storm/blob/master/external/storm-hive/src/main/java/org/apache/storm/hive/trident/HiveState.java#L108

Trident HiveState does not flush during the batch commit, which appears to be an oversight. Without that flush, the Trident state cannot guarantee at-least-once processing. (For example, if the transaction is still open when Trident moves on to the next txid and a failure occurs later, the data in the open transaction is lost.)

So for at-least-once, I think HiveState must flush all the writers, irrespective of the batch size, when Trident invokes "commit(txid)".

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
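
The flush-on-commit behaviour proposed in the description could be sketched roughly as below. This is a simplified, self-contained model and not the actual HiveState code: `FlushOnCommitSketch`, `SketchWriter`, `SketchState`, `flushAllWriters`, `pendingCount` and `committedCount` are hypothetical names, and the writer is an in-memory stand-in rather than a real Hive writer. Only the shape of `commit(Long txId)` mirrors the Trident state callback named in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FlushOnCommitSketch {

    // Hypothetical stand-in for a Hive writer: records stay "pending" until flushed.
    static class SketchWriter {
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        void write(String record) {
            pending.add(record);
        }

        void flush() {
            committed.addAll(pending);
            pending.clear();
        }
    }

    // Hypothetical simplification of a Trident state holding one writer per partition.
    static class SketchState {
        private final int batchSize;
        private final Map<String, SketchWriter> writers = new HashMap<>();
        private int currentBatchSize = 0;

        SketchState(int batchSize) {
            this.batchSize = batchSize;
        }

        void updateState(String partition, List<String> records) {
            SketchWriter writer = writers.computeIfAbsent(partition, p -> new SketchWriter());
            for (String record : records) {
                writer.write(record);
                currentBatchSize++;
            }
            // Behaviour described in the issue: flush only once the batch size is reached.
            if (currentBatchSize >= batchSize) {
                flushAllWriters();
            }
        }

        // Proposed change: flush every writer when Trident commits the txid,
        // regardless of how many records have been buffered so far.
        void commit(Long txId) {
            flushAllWriters();
        }

        void flushAllWriters() {
            for (SketchWriter writer : writers.values()) {
                writer.flush();
            }
            currentBatchSize = 0;
        }

        int pendingCount() {
            int n = 0;
            for (SketchWriter writer : writers.values()) {
                n += writer.pending.size();
            }
            return n;
        }

        int committedCount() {
            int n = 0;
            for (SketchWriter writer : writers.values()) {
                n += writer.committed.size();
            }
            return n;
        }
    }

    public static void main(String[] args) {
        SketchState state = new SketchState(100);
        state.updateState("p1", Arrays.asList("a", "b"));
        // Two records are buffered, well below the batch size of 100.
        System.out.println("pending before commit: " + state.pendingCount());
        state.commit(1L);
        System.out.println("pending after commit: " + state.pendingCount());
        System.out.println("committed after commit: " + state.committedCount());
    }
}
```

In this model, removing the call in `commit` reproduces the reported problem: the two records stay pending after txid 1 completes and would be lost if a later failure discarded the writer.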