GitHub user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/359
  
    @mmiklavc I completely agree that we should ack every tuple that comes 
through in the `JoinBolt`.  I think, however, that they should be acked only on 
a successful join.  
    
    The failure scenarios are currently handled as follows:
    * A failure in an enrichment (e.g. an exception, HBase or MySQL going down, 
etc.) is handled by the enrichment adapter being robust enough to send an empty 
message even on failure.
    * A failure in an enrichment bolt's infrastructure (e.g. the process dying 
or a machine failure) should be handled by the join failing and the original 
message replaying.  We anchor the message coming on the `message` channel in 
the `SplitBolt`, so it should get replayed.
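
    To make the first case concrete, here is a minimal sketch of an adapter 
call that falls back to an empty message on failure.  `SafeEnrichment`, 
`enrichOrEmpty` and the `lookup` function are hypothetical names for 
illustration, not the actual Metron adapter API.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper; not part of Metron. */
public class SafeEnrichment {

    /**
     * Runs the enrichment lookup and returns its result, or an empty map if
     * the lookup throws or returns null, so the join still receives a
     * (possibly empty) segment for this enrichment.
     */
    public static Map<String, Object> enrichOrEmpty(
            Function<String, Map<String, Object>> lookup, String value) {
        try {
            Map<String, Object> result = lookup.apply(value);
            return result == null ? Collections.<String, Object>emptyMap() : result;
        } catch (RuntimeException e) {
            // Assumption: a backing-store outage surfaces as a runtime exception.
            return Collections.emptyMap();
        }
    }
}
```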
    
    If we ack every message as it comes down the pipe, we could end up having 
acked a message that never joins and never gets through, and so never gets 
replayed.  Instead, I think we should ack every message on a successful join.
    
    As of now, the tuple that delivers the last enrichment segment is the one 
that gets acked, which is definitely wrong.  What I would like to see is the 
`streamMessageMap` augmented to track the tuples as well as the messages, so 
that a successful join results in all of the tuples in the map getting acked.  
It may be time for that map to become a proper class; it really only has to 
keep track of the tuple from the `message` stream.
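
    A rough sketch of what such a class could look like, under stated 
assumptions: `JoinTracker` and everything in it is hypothetical, the tuple type 
is left generic, and the `acker` callback stands in for whatever the bolt uses 
to ack (in Storm, presumably `OutputCollector.ack`).  It is not the existing 
`JoinBolt` code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical stand-in for a bare streamMessageMap: tracks tuples with messages. */
public class JoinTracker<T> {

    /** Everything received so far for one join key. */
    private static final class Pending<T> {
        final Map<String, Map<String, Object>> messagesByStream = new HashMap<>();
        final List<T> tuples = new ArrayList<>();
    }

    private final Set<String> requiredStreams;
    private final Consumer<T> acker; // assumption: wraps the collector's ack call
    private final Map<String, Pending<T>> pendingByKey = new HashMap<>();

    public JoinTracker(Set<String> requiredStreams, Consumer<T> acker) {
        this.requiredStreams = new HashSet<>(requiredStreams);
        this.acker = acker;
    }

    /**
     * Records one segment and the tuple that carried it. Returns the joined
     * message once every required stream has reported, acking all tracked
     * tuples at that point; returns null (and acks nothing) otherwise.
     */
    public Map<String, Object> add(String key, String stream,
                                   Map<String, Object> message, T tuple) {
        Pending<T> pending = pendingByKey.computeIfAbsent(key, k -> new Pending<>());
        pending.messagesByStream.put(stream, message);
        pending.tuples.add(tuple);

        if (!pending.messagesByStream.keySet().containsAll(requiredStreams)) {
            return null; // join incomplete: tuples stay un-acked
        }

        pendingByKey.remove(key);
        Map<String, Object> joined = new HashMap<>();
        for (Map<String, Object> segment : pending.messagesByStream.values()) {
            joined.putAll(segment);
        }
        // Ack only now that the join has succeeded.
        pending.tuples.forEach(acker);
        return joined;
    }
}
```

    This version tracks every tuple for simplicity; if only the tuple from the 
`message` stream needs tracking, `Pending` could hold that single tuple 
instead of a list.  A tuple whose join never completes is never acked here, 
which is what should let the original message time out and replay.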
    
    Thoughts?

