scenario:

kafka stream enriched with tableS in postgresql

Let's pretend that the postgres has an organizations, departments, and
persons table, and we want to join the full name of the kafka table
that has the person id.  I also want to determine if the person id is
missing.

This requires a left join.

SELECT o.id, d.id, p.fullname, k.ssn, SUM(k.amount)
FROM purchases k
JOIN organizations o ON o.code = k.organization
JOIN departmentS d ON d.code = k.department
LEFT JOIN persons FOR SYSTEM_TIME AS OF k.procTime AS p ON
p.department_id = d.id
WHERE p.ssn = k.ssn
GROUP BY
TUMBLE(s.procTime, INTERVAL '3' MINUTE), o.id, d.id, p.fullname, k.ssn

Let's say that the TTL for organizations and departments is 12 months,
but for persons
it is 1 month.

observations:

If six unique people enter the kafka topic, then that will issue six
separate queries to the database of the form:

SELECT id, ssn, fullname, dob FROM persons WHERE.deparment_id = $1 AND ssn = $2

However, since this is a tumbling, it would be more efficient to do
one query with six parameters in an IN clause.  Example:

SELECT id, ssn, fullname, dob FROM persons WHERE.(deparment_id, ssn)
IN (($1,$2), ($3,$4),($5,$6),($7,$8),($9,$10)($11,$12))

Q: Is there a way to control that? I don't want the N + 1 query problem.

Q: Are these queries performed asynchronously?  If there were 200000
unique persons, I would not want 20000 synchronous queries.

Q: Is there a way to preload persons table, since it changes only
about once every two weeks and then do a LEFT JOIN on it?

Let's say that the person does not exist. I am impressed that Flink
caches that a person does not exist.  However, I want to cache if a
person exists for a month, but if the person does not exist, I only
want to remember that for a day.

Q: Is there a way to control a shorter cache time for non-existent items?

I really like the expressiveness and succinctness of the SQL api in
Flink, however, I am worried that I need use the data-stream API in
order to control the scenarios above.

I appreciate any advice, thank you.

Reply via email to