Yu Xu created CALCITE-6988:
------------------------------
Summary: Support DuckDB Dialect
Key: CALCITE-6988
URL: https://issues.apache.org/jira/browse/CALCITE-6988
Project: Calcite
Issue Type: New Feature
Components: core
Affects Versions: 1.39.0
Reporter: Yu Xu
Assignee: Yu Xu
Fix For: 1.40.0
DuckDB is a very popular high-performance analytical database
system(https://github.com/duckdb/duckdb), many users are using DuckDB in their
work but Calcite had not supported with it yet.
I had relevant materials about mainly syntax difference between default dialect
and DuckDB (from document : https://duckdb.org/docs/stable/ ):
*1. NullCollation is LAST mode, Nulls last for DESC and ASC, test as follow:*
D select * from t3 order by data asc;
┌───────┬─────────┐
│ id │ data │
│ int32 │ varchar │
├───────┼─────────┤
│ 1 │ aa │
│ 1 │ bb │
│ 1 │ │
└───────┴─────────┘
D select * from t3 order by data desc;
┌───────┬─────────┐
│ id │ data │
│ int32 │ varchar │
├───────┼─────────┤
│ 1 │ bb │
│ 1 │ aa │
│ 1 │ │
└───────┴─────────┘
*2. max precision of decimal is 38(we can refer to
https://duckdb.org/docs/stable/sql/data_types/numeric#fixed-point-decimals)*
*3. IdentifierQuoteString is ", test for it:*
test is OK:
D SELECT DATETRUNC('second', "d") from t5;
┌────────────────────────┐
│ datetrunc('second', d) │
│ timestamp │
├────────────────────────┤
│ 2025-01-22 12:21:23 │
│ 2025-01-22 12:21:25 │
└────────────────────────┘
ERROR result:
D SELECT DATETRUNC('second', 'd') from t5;
Binder Error: Could not choose a best candidate function for the function call
"datetrunc(STRING_LITERAL, STRING_LITERAL)". In order to select one, please add
explicit type casts.
Candidate functions:
datetrunc(VARCHAR, DATE) -> TIMESTAMP
datetrunc(VARCHAR, INTERVAL) -> INTERVAL
datetrunc(VARCHAR, TIMESTAMP WITH TIME ZONE) -> TIMESTAMP WITH TIME ZONE
datetrunc(VARCHAR, TIMESTAMP) -> TIMESTAMP
LINE 1: SELECT DATETRUNC('second', 'd') from t5;
^
D SELECT DATETRUNC('second', `d`) from t5;
Catalog Error: Scalar Function with name `__postfix does not exist!
Did you mean "!__postfix"?
LINE 1: SELECT DATETRUNC('second', `d`) from t5;
*4. support approx_count_distinct function:*
D select approx_count_distinct("id") from t3 ;
┌───────────────────────────┐
│ approx_count_distinct(id) │
│ int64 │
├───────────────────────────┤
│ 1 │
└───────────────────────────┘
*5. Not support character_length/char_length, need use length function to
support it:*
D select length('aa');
┌──────────────┐
│ length('aa') │
│ int64 │
├──────────────┤
│ 2 │
└──────────────┘
D select character_length('aa');
Catalog Error: Scalar Function with name character_length does not exist!
Did you mean "array_length"?
LINE 1: select character_length('aa');
^
D select char_length('aa');
Catalog Error: Scalar Function with name char_length does not exist!
Did you mean "array_length"?
LINE 1: select char_length('aa');
^
*6. Map construncor method need convert () to {}, test for it:*
result is OK:
D select MAP \{'a':1, 'b':2};
┌────────────────────────────────────────────────────────────┐
│ main.map(main.list_value('a', 'b'), main.list_value(1, 2)) │
│ map(varchar, integer) │
├────────────────────────────────────────────────────────────┤
│ \{a=1, b=2} │
└────────────────────────────────────────────────────────────┘
Error result:
D select MAP ('a':1, 'b':2);
Parser Error: syntax error at or near ":"
LINE 1: select MAP ('a':1, 'b':2);
*7. floor date should convert datetrunc function(can refer to:
https://duckdb.org/docs/stable/sql/functions/date#date_truncpart-date)*
D SELECT DATETRUNC('year', "d") from t5;
┌──────────────────────┐
│ datetrunc('year', d) │
│ date │
├──────────────────────┤
│ 2025-01-01 │
│ 2025-01-01 │
└──────────────────────┘
D SELECT DATETRUNC('month', "d") from t5;
┌───────────────────────┐
│ datetrunc('month', d) │
│ date │
├───────────────────────┤
│ 2025-01-01 │
│ 2025-01-01 │
└───────────────────────┘
D SELECT DATETRUNC('day', "d") from t5;
┌─────────────────────┐
│ datetrunc('day', d) │
│ date │
├─────────────────────┤
│ 2025-01-22 │
│ 2025-01-22 │
└─────────────────────┘
D SELECT DATETRUNC('week', "d") from t5;
┌──────────────────────┐
│ datetrunc('week', d) │
│ date │
├──────────────────────┤
│ 2025-01-20 │
│ 2025-01-20 │
└──────────────────────┘
D SELECT DATETRUNC('millisecond', "d") from t5;
┌─────────────────────────────┐
│ datetrunc('millisecond', d) │
│ timestamp │
├─────────────────────────────┤
│ 2025-01-22 12:21:23.123 │
│ 2025-01-22 12:21:25.123 │
└─────────────────────────────┘
D SELECT DATETRUNC('microseconds', "d") from t5;
┌──────────────────────────────┐
│ datetrunc('microseconds', d) │
│ timestamp │
├──────────────────────────────┤
│ 2025-01-22 12:21:23.123 │
│ 2025-01-22 12:21:25.123223 │
└──────────────────────────────┘
for these differences we should make a new DuckDB dialect.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)