[
https://issues.apache.org/jira/browse/CALCITE-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076655#comment-18076655
]
Mihai Budiu commented on CALCITE-7308:
--------------------------------------
So the question is: "what is the semantics of comparing CHAR(5) with
VARCHAR(10)?"
In Calcite comparisons cast to a common type. Casting to VARCHAR(10) does not
provide the expected result.
According to Claude, the SQL standard says (I did not verify this):
The SQL standard (ISO/IEC 9075) says:
When comparing {{CHAR(N)}} and {{{}VARCHAR(M){}}}, the *shorter string is
padded with spaces* on the right until both strings are the same length, then
compared character by character. So:
{{CHAR(10) 'hello' → 'hello '
VARCHAR(5) 'hello' → 'hello ' (padded to match)
result → EQUAL}}
This is called *blank padding semantics* and is the default comparison behavior
for character strings in the standard.
The standard also defines two comparison modes:
* *Blank-padded comparison (BPC)* — the default, pads shorter string with
spaces before comparing
* *Non-padded comparison (NPC)* — compares strings as-is, so {{'hello'}} ≠
{{'hello}}
{{{}{}}}Which mode applies depends on the *collation* in use. The default
collation uses BPC, but you can explicitly use a non-padded collation to get
strict comparison.{{{}{}}}
> Incorrect Result (Missing Row) When Joining a CTE (Temporary Table) with a
> Persistent Table
> -------------------------------------------------------------------------------------------
>
> Key: CALCITE-7308
> URL: https://issues.apache.org/jira/browse/CALCITE-7308
> Project: Calcite
> Issue Type: Bug
> Reporter: weihua zhang
> Priority: Major
>
> When executing a JOIN operation between a CTE (Common Table Expression,
> acting as a temporary table) and a persistent table, Calcite returns
> incorrect results with one row missing. The expected result should include
> all matching rows from both tables, but the actual output omits one valid
> matching row.
> {code:sql}
> !use scott
> WITH bonus1(ENAME, JOB, SAL, COMM) AS (
> VALUES
> ('ALLEN', 'SALESMAN', 1600.00, 300.00),
> ('WARD', 'SALESMAN', 1250.00, 500.00)
> )
> SELECT * FROM bonus1
> INNER JOIN emp -- `emp` is a persistent table (predefined in the database)
> ON bonus1.ename = emp.ename;
> {code}
> {code:java}
> ENAME, JOB, SAL, COMM, EMPNO, ENAME0, JOB0, MGR, HIREDATE, SAL0, COMM0, DEPTNO
> ALLEN, SALESMAN, 1600.00, 300.00, 7499, ALLEN, SALESMAN, 7698, 1981-02-20,
> 1600.00, 300.00, 30
> !ok
> {code}
> but right result is:
> {code:java}
> ename | job | sal | comm | empno | ename | job | mgr |
> hiredate | sal | comm | deptno
> -------+----------+---------+--------+-------+-------+----------+------+------------+---------+--------+--------+
> WARD | SALESMAN | 1250.00 | 500.00 | 7521 | WARD | SALESMAN | 7698 |
> 1981-02-22 | 1250.00 | 500.00 | 30 |
> ALLEN | SALESMAN | 1600.00 | 300.00 | 7499 | ALLEN | SALESMAN | 7698 |
> 1981-02-20 | 1600.00 | 300.00 | 30 |
> (2 rows)
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)