Hello folks,

I would love to hear back your feedback on this.

Regards,
Satyam

On Wed, Mar 10, 2021 at 6:53 PM Satyam Shekhar <satyamshek...@gmail.com>
wrote:

> Hello folks,
>
> I am looking to enrich rows from an unbounded streaming table by
> joining it with a bounded static table while preserving rowtime for the
> streaming table. For example, let's consider table two tables F and D,
> where F is unbounded and D is bounded. The schema for the two tables is the
> following -
>
> F:
>  |-- C0: BIGINT
>  |-- C1: STRING
>  |-- R: TIMESTAMP(3) **rowtime**
>  |-- WATERMARK FOR R: TIMESTAMP(3) AS `R` - INTERVAL '0' SECONDS
>
> D:
>  |-- C0: BIGINT
>  |-- C1: STRING NOT NULL
>
> I'd like to run the following query on this schema -
>
> select sum(F.C0), D.C1, tumble_start(F.R, interval '1' second)
>     from F join D ON F.C1 = D.C1
>     group by D.C1, tumble(F.R, interval '1' second)
>
> However, I run into the following error while running the above query -
>
> "Rowtime attributes must not be in the input rows of a regular join. As a
> workaround you can cast the time attributes of input tables to TIMESTAMP
> before."
>
> My understanding reading the docs is that Time Temporal Join is meant to
> solve this problem. So I model table D as the following -
>
> D:
>  |-- C0: BIGINT
>  |-- C1: STRING NOT NULL
>  |-- R: TIMESTAMP(3)
>  |-- WATERMARK FOR R: TIMESTAMP(3) AS `R` - INTERVAL '0' SECONDS
>  |-- CONSTRAINT 2da2dd2e-9937-48cb-9dec-4f6055713004 PRIMARY KEY (C1)
>
> With column D.R always set to 0 and modify the query as follows -
>
> select sum(F.C0), D.C1, tumble_start(F.R, interval '1' second)
>     from F join D FOR SYSTEM_TIME AS OF F.R ON F.C1 = D.C1
>     group by D.C1, tumble(F.R, interval '1' second)
>
> The above query runs but does not return any result. I have the following
> data in D initially -
> Emit D row=+I(0,"0",1970-01-01T00:00)@time=0
> Emit D row=+I(1,"1",1970-01-01T00:00)@time=0
> Emit D row=+I(2,"2",1970-01-01T00:00)@time=0
> Emit D watermark=0
>
> And F streams the following rows -
> Emit F row=+I(0,"0",1970-01-01T00:00)@time=0
> Emit F row=+I(1,"1",1970-01-01T00:00:10)@time=1000
> Emit F watermark=1000
>
> I expect that two rows in F will join with matching rows (on C1) in D and
> produce some output. But I do not see anything in the output.
>
> So I have the following questions -
>
> 1. Is time temporal join the correct tool to solve this problem?
> 2. What could be the reason for not getting any output rows in the result?
>
> Thanks,
> Satyam
>
>

Reply via email to