Thomas Groh created BEAM-3646:
---------------------------------

             Summary: Add comments about appropriate use of DoFn.Teardown
                 Key: BEAM-3646
                 URL: https://issues.apache.org/jira/browse/BEAM-3646
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-core
            Reporter: Thomas Groh
            Assignee: Thomas Groh


Because the {{Teardown}} method has no relation to the atomicity of processing
and committing output, it is EXTREMELY DANGEROUS to use it to flush outputs:
data buffered until {{Teardown}} is extremely likely never to be flushed. If a
DoFn instance with buffered data is lost (for example, via worker/machine
failure), and the runner has committed the result of processing that input,
the data is lost.
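
The failure mode described above can be sketched with a toy model. This is not
Beam code: {{Sink}}, {{BufferingFn}} and {{runWithWorkerLoss}} are hypothetical
names that only mirror the DoFn lifecycle ({{ProcessElement}}, {{FinishBundle}},
{{Teardown}}), and the sketch assumes a runner that commits a bundle once
finishBundle returns and then loses the worker before teardown is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type here is hypothetical and is NOT a Beam class.
class TeardownFlushSketch {

    /** Stand-in for an external system that the function writes to. */
    static final class Sink {
        final List<String> written = new ArrayList<>();
    }

    /** Stand-in for a DoFn that buffers elements before writing them out. */
    static final class BufferingFn {
        private final Sink sink;
        private final boolean flushInFinishBundle;
        private final List<String> buffer = new ArrayList<>();

        BufferingFn(Sink sink, boolean flushInFinishBundle) {
            this.sink = sink;
            this.flushInFinishBundle = flushInFinishBundle;
        }

        // Mirrors ProcessElement: only buffers, writes nothing yet.
        void processElement(String element) {
            buffer.add(element);
        }

        // Mirrors FinishBundle: runs before the bundle is committed.
        void finishBundle() {
            if (flushInFinishBundle) {
                flush();
            }
        }

        // Mirrors Teardown: in this model, not guaranteed to run at all.
        void teardown() {
            flush();
        }

        private void flush() {
            sink.written.addAll(buffer);
            buffer.clear();
        }
    }

    /**
     * Processes one bundle, "commits" it, then simulates losing the worker
     * before teardown runs. Returns what actually reached the sink.
     */
    static List<String> runWithWorkerLoss(boolean flushInFinishBundle) {
        Sink sink = new Sink();
        BufferingFn fn = new BufferingFn(sink, flushInFinishBundle);
        for (String element : Arrays.asList("a", "b", "c")) {
            fn.processElement(element);
        }
        fn.finishBundle();
        // Assumed commit point: these inputs will not be retried after this.
        // The worker is lost here, so fn.teardown() is never called.
        return sink.written;
    }

    public static void main(String[] args) {
        System.out.println("flush in finishBundle: " + runWithWorkerLoss(true));
        System.out.println("flush only in teardown: " + runWithWorkerLoss(false));
    }
}
```

In this model, flushing in finishBundle delivers all three elements, while
flushing only in teardown delivers none, even though the inputs count as
processed. Real runners differ in when and whether they call {{Teardown}}, so
treat this as an illustration of the risk, not a description of any runner.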

 

Because we do not comment on this, users may believe (especially when running
a batch pipeline) that their data will be flushed on pipeline completion. This
is very dangerous behavior that we do not sufficiently warn about.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
