I see. Thank you Kenn and Lukasz.

Best,
Shen
On Thu, Mar 8, 2018 at 7:46 PM, Kenneth Knowles <k...@google.com> wrote:

> I think the description of when a side input is ready vs expired is the
> trouble here.
>
> - You know that W is expired only when you can be sure that no main input
> element could reference it.
> - You know that W is ready *even if it got no data* if the input that
> would end up in W would be dropped (aka when W expires according to the
> *side input* watermark).
>
> So for your scenario: you push back the elements, which holds W from being
> collected; when W expires on the side input, you make it ready; you process
> the elements with empty contents on the side input; then you GC the side
> input.
>
> Kenn
>
> On Thu, Mar 8, 2018 at 4:32 PM Shen Li <cs.she...@gmail.com> wrote:
>
>> Hi Lukasz,
>>
>> Let me explain this problem using a specific example.
>>
>> Say I have a main input element X, which accesses side input window W.
>> When X arrives at a ParDo operator, W is neither ready nor expired.
>> So, in this case, the ParDo should push back X and wait for W to become
>> ready. Say, after two minutes, W is still not ready but has expired because
>> the main input watermark has advanced. In this situation, how does Beam
>> expect runners/engines to handle the pushed-back value X? Discard X or
>> throw an error?
>>
>> Thanks,
>> Shen
>>
>> On Thu, Mar 8, 2018 at 6:35 PM, Lukasz Cwik <lc...@google.com> wrote:
>>
>>> I believe you're missing this point: "and also to not expire the
>>> side input till the main input watermark advances beyond the garbage
>>> collection hold of the side input."
>>>
>>> On Thu, Mar 8, 2018 at 3:33 PM, Shen Li <cs.she...@gmail.com> wrote:
>>>
>>>> Hi Lukasz,
>>>>
>>>> Thanks again.
>>>>
>>>> > the runner is required to hold back the main input till the side
>>>> input is ready
>>>>
>>>> Yes, I understand these requirements. But what if the side input
>>>> expires before it becomes ready?
>>>>
>>>> Shen
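
For readers of the archive, the ready/expired rule Kenn describes above can be sketched as a toy model. This is only an illustration under stated assumptions, not Beam's runner code: SideWindow, PushbackParDo, the expiry and gc_hold fields, and every method name are hypothetical. It models a single side input window W with integer timestamps.

```python
from dataclasses import dataclass, field


@dataclass
class SideWindow:
    """Hypothetical model of one side input window W."""
    expiry: int    # side input watermark at which late data for W would be dropped
    gc_hold: int   # main input watermark beyond which W may be collected
    contents: list = field(default_factory=list)
    has_data: bool = False
    collected: bool = False


class PushbackParDo:
    """Toy ParDo that pushes back main input elements until W is ready."""

    def __init__(self, window):
        self.window = window
        self.side_watermark = float("-inf")
        self.main_watermark = float("-inf")
        self.pushed_back = []
        self.outputs = []

    def is_ready(self):
        # W is ready if it has data, or if any data destined for W would now
        # be dropped (W has expired according to the *side input* watermark).
        w = self.window
        return w.has_data or self.side_watermark >= w.expiry

    def on_main_element(self, x):
        if self.is_ready():
            self._process(x)
        else:
            self.pushed_back.append(x)  # a pending element keeps W from being GC'd

    def on_side_data(self, value):
        if self.side_watermark >= self.window.expiry:
            return  # too late: dropped
        self.window.contents.append(value)
        self.window.has_data = True
        self._drain()

    def on_side_watermark(self, t):
        self.side_watermark = max(self.side_watermark, t)
        self._drain()

    def on_main_watermark(self, t):
        self.main_watermark = max(self.main_watermark, t)
        self._try_gc()

    def _drain(self):
        if self.is_ready():
            pending, self.pushed_back = self.pushed_back, []
            for x in pending:
                self._process(x)
            self._try_gc()

    def _process(self, x):
        # With no side data, the element sees empty side input contents.
        self.outputs.append((x, list(self.window.contents)))

    def _try_gc(self):
        # Collect W only when no pushed-back element can still reference it
        # and the main input watermark has passed the GC hold.
        if not self.pushed_back and self.main_watermark >= self.window.gc_hold:
            self.window.collected = True
```

In this model, the scenario from the thread plays out as follows: X is pushed back, advancing the main input watermark alone does not collect W, and once the side input watermark reaches W's expiry, X is processed with empty side input contents and W is then collected. X is neither discarded nor treated as an error.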