On 7/28/23 13:02, Tatsuo Ishii wrote:
Attached is the v3 patch. In this patch following changes are made.

- PATTERN variables do not have to exist in the DEFINE clause.  They are
- considered TRUE if not present.

Do you think we really need this? I found a criticism regarding this.

https://link.springer.com/article/10.1007/s13222-022-00404-3
"3.2 Explicit Definition of All Row Pattern Variables"

What do you think?

I think that a large part of obeying the standard is to allow queries from other engines to run the same on ours. The standard does not require the pattern variables to be defined and so there are certainly queries out there without them, and that hurts migrating to PostgreSQL.

- I am working on making window aggregates RPR aware now. The
    implementation is in progress and far from completeness. An example
    is below. I think row 2, 3, 4 of "count" column should be NULL
    instead of 3, 2, 0, 0. Same thing can be said to other
    rows. Probably this is an effect of moving aggregate but I still
    studying the window aggregation code.

This tells me again that RPR is not being run in the right place.  All
windowed aggregates and frame-level window functions should Just Work
with no modification.

I am not touching each aggregate function. I am modifying
eval_windowaggregates() in nodeWindowAgg.c, which calls each aggregate
function. Do you think it's not the right place to make window
aggregates RPR aware?

Oh, okay.

SELECT company, tdate, first_value(price) OVER W, count(*) OVER w FROM
stock
   WINDOW w AS (
   PARTITION BY company
   ORDER BY tdate
   ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
   AFTER MATCH SKIP PAST LAST ROW
   INITIAL
   PATTERN (START UP+ DOWN+)
   DEFINE
    START AS TRUE,
    UP AS price > PREV(price),
    DOWN AS price < PREV(price)
);
   company  |   tdate    | first_value | count
----------+------------+-------------+-------
   company1 | 2023-07-01 |         100 |     4
   company1 | 2023-07-02 |             |     3
   company1 | 2023-07-03 |             |     2
   company1 | 2023-07-04 |             |     0
   company1 | 2023-07-05 |             |     0
   company1 | 2023-07-06 |          90 |     4
   company1 | 2023-07-07 |             |     3
   company1 | 2023-07-08 |             |     2
   company1 | 2023-07-09 |             |     0
   company1 | 2023-07-10 |             |     0
   company2 | 2023-07-01 |          50 |     4
   company2 | 2023-07-02 |             |     3
   company2 | 2023-07-03 |             |     2
   company2 | 2023-07-04 |             |     0
   company2 | 2023-07-05 |             |     0
   company2 | 2023-07-06 |          60 |     4
   company2 | 2023-07-07 |             |     3
   company2 | 2023-07-08 |             |     2
   company2 | 2023-07-09 |             |     0
   company2 | 2023-07-10 |             |     0

In this scenario, row 1's frame is the first 5 rows and specified SKIP
PAST LAST ROW, so rows 2-5 don't have *any* frame (because they are
skipped) and the result of the outer count should be 0 for all of them
because there are no rows in the frame.

Ok. Just I want to make sure. If it's other aggregates like sum or
avg, the result of the outer aggregates should be NULL.

They all behave the same way as in a normal query when they receive no rows as input.

When we get to adding count in the MEASURES clause, there will be a
difference between no match and empty match, but that does not apply
here.

Can you elaborate more? I understand that "no match" and "empty match"
are different things. But I do not understand how the difference
affects the result of count.

This query:

SELECT v.a, wcnt OVER w, count(*) OVER w
FROM (VALUES ('A')) AS v (a)
WINDOW w AS (
  ORDER BY v.a
  MEASURES count(*) AS wcnt
  ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  PATTERN (B)
  DEFINE B AS B.a = 'B'
)

produces this result:

 a | wcnt | count
---+------+-------
 A |      |     0
(1 row)

Inside the window specification, *no match* was found and so all of the MEASURES are null. The count(*) in the target list however, still exists and operates over zero rows.

This very similar query:

SELECT v.a, wcnt OVER w, count(*) OVER w
FROM (VALUES ('A')) AS v (a)
WINDOW w AS (
  ORDER BY v.a
  MEASURES count(*) AS wcnt
  ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  PATTERN (B?)
  DEFINE B AS B.a = 'B'
)

produces this result:

 a | wcnt | count
---+------+-------
 A |    0 |     0
(1 row)

In this case, the pattern is B? instead of just B, which produces an *empty match* for the MEASURES to be applied over.
--
Vik Fearing



Reply via email to