je-ik commented on pull request #14013:
URL: https://github.com/apache/beam/pull/14013#issuecomment-783209882


   > The reason why we want to call 
`watermarkEstimator.setWatermark(currentRestriction.getWatermark());` when 
`tryClaim()` returns `false` is for tracking watermark when returning 
ProcessContinuation.resume(). It could happen when there is no output records 
from reader and we want to read again later.
   
   That seems incorrect. When tryClaim returns `false`, it is part of the 
contract to return `ProcessContinuation.done()`:
   
https://github.com/apache/beam/blob/aaad864c9acb22e35050f974a7ac74fb7638f085/runners/core-java/src/main/java/org/apache/beam/runners/core/OutputAndTimeBoundedSplittableProcessElementInvoker.java#L221
   When the reader returns false() we do not fail clam, instead we go though 
`out[0] == null` in `processElement`.
   I think there should be no reason to enforce:
    a) returning ProcessContinuation.done(), and
    b) manually setting the watermark to BoundedWindow.TIMESTAMP_MAX_VALUE
   because, that seems redundant.
   
   You are right, that the watermark is read *before* call to trySplit (I 
overlooked that), that probably means, we *must* set watermark both *before* 
and *after* the tryClaim loop. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to