Eliaaazzz commented on issue #21521:
URL: https://github.com/apache/beam/issues/21521#issuecomment-4055812109

   I put together a focused PoC for the Python `Watch` transform to validate 
the core execution model before trying to build a full implementation.
   
   PoC branch:
   `https://github.com/Eliaaazzz/beam/tree/gsoc/poc`
   
   For the `Watch` side specifically, the PoC currently has **56 passing 
tests**.
   
   The current prototype is mainly exploring whether the main `Watch` lifecycle 
can be represented cleanly in Python using an SDF-based implementation shape. 
So far, it includes:
   
   - a builder-style `Watch.growth_of(...)` API with poll interval / 
termination configuration
   - a `PollResult` abstraction carrying outputs, optional watermark, and 
completion
   - composable per-input termination conditions
   - a `GrowthState` hierarchy separating active polling state from replay state
   - a custom `GrowthStateCoder` with structured encoding and 
backward-compatibility decode paths
   - a `GrowthStateTracker` with explicit three-case `try_split()` behavior 
modeled after Java `Watch.GrowthTracker`
   - an SDF `process()` flow that distinguishes replay from active polling
   - output dedup via `output_key_fn` and stable hashing
   - watermark / completion handling in the polling path
   
   The goal of this first step was not full parity or production hardening yet, 
but to de-risk the design and get executable validation around the hardest 
parts first: state evolution, resumption, split behavior, termination behavior, 
and watermark-related semantics.
   
   If this direction looks reasonable, my next step would be to tighten it 
further against the Java semantics and then expand it toward a more complete 
implementation.
   
   Feedback would be very helpful.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to