Re: Unbounded input join Unbounded input then write to Bounded Sink

2020-02-25 Thread Ismaël Mejía
Hello, Sinks are not bounded or unbounded, they are just normal ParDos (DoFns) that behave consistently with the pipeline data, so if your pipeline deals with unbounded data the sink will write this data correspondingly (when windows close, triggers match, etc so data is ready to be out). One

Re: Unbounded input join Unbounded input then write to Bounded Sink

2020-02-24 Thread rahul patwari
Hi Kenn, Rui, The pipeline that we are trying is exactly what Kenn has mentioned above i.e. Read From Kafka => Apply Fixed Windows of 1 Min => SqlTransform => Write to Hive using HcatalogIO We are interested in understanding the behaviour when the source is Unbounded and Sink is bounded as this

Re: Unbounded input join Unbounded input then write to Bounded Sink

2020-02-24 Thread Rui Wang
I see. So I guess I wasn't fully understand the requirement: Do you want to have a 1-min window join on two unbounded sources and write to sink when the window closes ? Or there is an extra requirement such that you also want to write to sink every minute per window? For the former, you can do

Re: Unbounded input join Unbounded input then write to Bounded Sink

2020-02-24 Thread Rui Wang
Sorry please remove " .apply(Window.into(FixedWindows.of(1 minute))" from the query above. -Rui On Mon, Feb 24, 2020 at 5:26 PM Rui Wang wrote: > I see. So I guess I wasn't fully understand the requirement: > > Do you want to have a 1-min window join on two unbounded sources and write > to

Re: Unbounded input join Unbounded input then write to Bounded Sink

2020-02-24 Thread Rui Wang
SQL does not support such joins with your requirement: write to sink after every 1 min after window closes. You might can use state and timer API to achieve your goal. -Rui On Mon, Feb 24, 2020 at 9:50 AM shanta chakpram wrote: > Hi, > > I am trying to join inputs from Unbounded Sources

Unbounded input join Unbounded input then write to Bounded Sink

2020-02-24 Thread shanta chakpram
Hi, I am trying to join inputs from Unbounded Sources then write to Bounded Sink. The pipeline I'm trying is: Kafka Sources -> SqlTransform -> HCatalogIO Sink And, a FixedWindow of 1 minute duration is applied. I'm expecting the inputs from unbounded sources joined within the current window to