Hi all,

I am playing around with the Table API, and I have a question about the 
temporal OVERLAPS operator. In particular, a test in 
scalarFunctionsTest.testOverlaps expects false for the following intervals:
testAllApis(
      temporalOverlaps("2011-03-10 05:02:02".toTimestamp, 0.second,
        "2011-03-10 05:02:02".toTimestamp, "2011-03-10 05:02:01".toTimestamp),
      "temporalOverlaps(toTimestamp('2011-03-10 05:02:02'), 0.second, " +
        "'2011-03-10 05:02:02'.toTimestamp, '2011-03-10 05:02:01'.toTimestamp)",
      "(TIMESTAMP '2011-03-10 05:02:02', INTERVAL '0' SECOND) OVERLAPS " +
        "(TIMESTAMP '2011-03-10 05:02:02', TIMESTAMP '2011-03-10 05:02:01')",
      "false")

Basically, the compared intervals overlap only at one of their endpoints. The 
time.scala implementation translates the expression to:
AND(
  >=(DATETIME_PLUS(CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL, 0),
     CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL),
  >=(CAST('2011-03-10 05:02:01'):TIMESTAMP(3) NOT NULL,
     CAST('2011-03-10 05:02:02'):TIMESTAMP(3) NOT NULL)
),

The result is false because the second clause is not satisfied.

However, the latest Calcite master compiles OVERLAPS as follows:
[AND(
  >=(CASE(<=(2011-03-10 05:02:02, DATETIME_PLUS(2011-03-10 05:02:02, 0)),
          DATETIME_PLUS(2011-03-10 05:02:02, 0),
          2011-03-10 05:02:02),
     CASE(<=(2011-03-10 05:02:02, 2011-03-10 05:02:01),
          2011-03-10 05:02:02,
          2011-03-10 05:02:01)),
  >=(CASE(<=(2011-03-10 05:02:02, 2011-03-10 05:02:01),
          2011-03-10 05:02:01,
          2011-03-10 05:02:02),
     CASE(<=(2011-03-10 05:02:02, DATETIME_PLUS(2011-03-10 05:02:02, 0)),
          2011-03-10 05:02:02,
          DATETIME_PLUS(2011-03-10 05:02:02, 0)))
)]

Here the result is true.

I believe the issue is whether the endpoints count as part of the 
overlapping intervals or not. Flink does not consider these intervals as 
overlapping (as the test shows), whereas Calcite's implementation includes 
them.
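
To make the comparison easier to reproduce, here is a minimal sketch of the 
two predicates as I read them from the plans above. It uses plain 
java.time values rather than Flink or Calcite classes, and the names 
OverlapsSketch, overlapsAsWritten and overlapsNormalized are hypothetical, 
chosen only for this illustration. In this sketch both predicates compare 
with >=, so they agree on an input whose endpoints are already in order; on 
the test input the different result comes from the CASE expressions, which 
reorder the second pair (its end is earlier than its start) before comparing.

```scala
import java.time.LocalDateTime

// Illustration only: hypothetical helpers, not Flink or Calcite code.
object OverlapsSketch {

  // Mirrors the first plan: leftEnd >= rightStart AND rightEnd >= leftStart,
  // with the endpoints used in the order they were written.
  def overlapsAsWritten(ls: LocalDateTime, le: LocalDateTime,
                        rs: LocalDateTime, re: LocalDateTime): Boolean =
    !le.isBefore(rs) && !re.isBefore(ls)

  // Mirrors the second plan: each CASE picks the earlier or the later
  // endpoint of a pair, so a reversed pair is reordered before comparing.
  def overlapsNormalized(ls: LocalDateTime, le: LocalDateTime,
                         rs: LocalDateTime, re: LocalDateTime): Boolean = {
    // CASE(a <= b, a, b)
    def earlier(a: LocalDateTime, b: LocalDateTime) = if (!a.isAfter(b)) a else b
    // CASE(a <= b, b, a)
    def later(a: LocalDateTime, b: LocalDateTime) = if (!a.isAfter(b)) b else a
    !later(ls, le).isBefore(earlier(rs, re)) &&
      !later(rs, re).isBefore(earlier(ls, le))
  }

  def main(args: Array[String]): Unit = {
    val t01 = LocalDateTime.parse("2011-03-10T05:02:01")
    val t02 = LocalDateTime.parse("2011-03-10T05:02:02")
    val t03 = LocalDateTime.parse("2011-03-10T05:02:03")

    // Test input: (05:02:02, +0 seconds) vs (05:02:02, 05:02:01).
    val leftEnd = t02.plusSeconds(0)
    println(overlapsAsWritten(t02, leftEnd, t02, t01))   // false
    println(overlapsNormalized(t02, leftEnd, t02, t01))  // true

    // Ordered intervals that only touch at 05:02:02.
    println(overlapsAsWritten(t01, t02, t02, t03))       // true
    println(overlapsNormalized(t01, t02, t02, t03))      // true
  }
}
```

If this reading of the plans is right, the choice is also about whether a 
pair whose end precedes its start should be reordered before the comparison.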

Which one should be preserved?

I think that the Calcite implementation is correct, and overlapping endpoints 
should be considered. What do you think?

Best,
Stefano