codeant-ai-for-open-source[bot] commented on code in PR #37590: URL: https://github.com/apache/superset/pull/37590#discussion_r2749583865
########## superset/db_engine_specs/phoenix.py: ########## @@ -0,0 +1,90 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +from datetime import datetime +from typing import Any, Optional + +from sqlalchemy import types + +from superset.constants import TimeGrain +from superset.db_engine_specs.base import BaseEngineSpec, DatabaseCategory + + +class PhoenixEngineSpec(BaseEngineSpec): # pylint: disable=abstract-method + """Dialect for Apache Phoenix""" + + engine = "phoenix" + engine_name = "Apache Phoenix" + + metadata = { + "description": ( + "Apache Phoenix is a relational database layer over Apache HBase, " + "providing low-latency SQL queries over HBase data." + ), + "logo": "apache-phoenix.png", + "homepage_url": "https://phoenix.apache.org/", + "categories": [ + DatabaseCategory.APACHE_PROJECTS, + DatabaseCategory.ANALYTICAL_DATABASES, + DatabaseCategory.OPEN_SOURCE, + ], + "pypi_packages": ["phoenixdb"], + "connection_string": "phoenix://{hostname}:{port}/", + "default_port": 8765, + "notes": ( + "Phoenix provides a SQL interface to Apache HBase. " + "The phoenixdb driver connects via the Phoenix Query Server " + "and supports a subset of SQLAlchemy." + ), + } + + _time_grain_expressions = { + None: "{col}", + TimeGrain.SECOND: ( + "CAST(DATE_TRUNC('SECOND', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)" + ), + TimeGrain.MINUTE: ( + "CAST(DATE_TRUNC('MINUTE', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)" + ), + TimeGrain.HOUR: ( + "CAST(DATE_TRUNC('HOUR', CAST({col} AS TIMESTAMP)) AS TIMESTAMP)" + ), + TimeGrain.DAY: "CAST(DATE_TRUNC('DAY', CAST({col} AS TIMESTAMP)) AS DATE)", + TimeGrain.WEEK: "CAST(DATE_TRUNC('WEEK', CAST({col} AS TIMESTAMP)) AS DATE)", + TimeGrain.MONTH: ( + "CAST(DATE_TRUNC('MONTH', CAST({col} AS TIMESTAMP)) AS DATE)" + ), + TimeGrain.QUARTER: ( + "CAST(DATE_TRUNC('QUARTER', CAST({col} AS TIMESTAMP)) AS DATE)" + ), + TimeGrain.YEAR: "CAST(DATE_TRUNC('YEAR', CAST({col} AS TIMESTAMP)) AS DATE)", Review Comment: **Suggestion:** The time grain expressions are using the PostgreSQL-specific function `DATE_TRUNC`, which is not supported by Apache Phoenix, so any query that applies a time grain will generate invalid SQL and fail at runtime; use Phoenix's `TRUNC` function instead. [logic error] <details> <summary><b>Severity Level:</b> Major ⚠️</summary> ```mdx - ❌ Time-grain grouping fails for Phoenix-backed charts. - ❌ SQL Lab queries with time grains error on Phoenix. - ⚠️ Affects users connecting to Phoenix databases. ``` </details> ```suggestion "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'SECOND') AS TIMESTAMP)" ), TimeGrain.MINUTE: ( "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'MINUTE') AS TIMESTAMP)" ), TimeGrain.HOUR: ( "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'HOUR') AS TIMESTAMP)" ), TimeGrain.DAY: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'DAY') AS DATE)", TimeGrain.WEEK: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'WEEK') AS DATE)", TimeGrain.MONTH: ( "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'MONTH') AS DATE)" ), TimeGrain.QUARTER: ( "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'QUARTER') AS DATE)" ), TimeGrain.YEAR: "CAST(TRUNC(CAST({col} AS TIMESTAMP), 'YEAR') AS DATE)", ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. Open a Python REPL in the repo and import the Phoenix spec: - Run: python -c "from superset.db_engine_specs.phoenix import PhoenixEngineSpec; print(PhoenixEngineSpec.engine_name)" - This imports the class defined at `superset/db_engine_specs/phoenix.py:26` and loads its attributes. 2. Inspect the time-grain templates defined on the class at `superset/db_engine_specs/phoenix.py:54-74`: - In the REPL evaluate: ``` from superset.db_engine_specs.phoenix import PhoenixEngineSpec from superset.constants import TimeGrain print(PhoenixEngineSpec._time_grain_expressions[TimeGrain.MINUTE]) ``` - This returns the string containing "DATE_TRUNC(...)" located in the lines shown at `phoenix.py:56,59,63,65-73`. 3. Use the produced template to build a query that groups by time grain (the same templates are consumed by the query generator that uses engine specs): - Substitute a timestamp column name, e.g. PhoenixEngineSpec._time_grain_expressions[TimeGrain.HOUR].format(col='event_ts') - Embed the resulting expression into a SELECT/GROUP BY and execute it against a running Apache Phoenix instance (default port 8765 from `phoenix.py:46`) via SQL Lab or a DB client. 4. Observe the Phoenix server returns a SQL error (function or syntax error) because the generated SQL uses DATE_TRUNC (Postgres-style) instead of Phoenix-supported syntax: - Error arises at query execution time in SQL Lab or when the driver (phoenixdb) forwards the SQL to Phoenix. Notes: - The reproduction is concrete: the DATE_TRUNC strings exist in `phoenix.py:54-74`. Any code path that renders time-grain expressions from PhoenixEngineSpec (the Superset query generator consuming engine specs) will produce these invalid fragments. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/db_engine_specs/phoenix.py **Line:** 57:73 **Comment:** *Logic Error: The time grain expressions are using the PostgreSQL-specific function `DATE_TRUNC`, which is not supported by Apache Phoenix, so any query that applies a time grain will generate invalid SQL and fail at runtime; use Phoenix's `TRUNC` function instead. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
