GitHub user 0eu created a discussion: Refactor Pinot sqlglot dialect tests to
use AST comparison instead of sub-string matching
**Background**
Currently, many of Pinot dialect-specific tests rely on direct string
comparisons or substring checks to verify that SQL is being parsed and
generated correctly.
Example:
```python
def test_cast_to_string() -> None:
"""
Test that CAST to STRING is preserved (not converted to CHAR).
"""
sql = "SELECT CAST(cohort_size AS STRING) FROM table"
ast = sqlglot.parse_one(sql, Pinot)
generated = Pinot().generate(expression=ast)
assert "STRING" in generated
assert "CHAR" not in generated
```
**Problem**
String-based assertions are highly sensitive to:
- Small changes in the sqlglot generator's "pretty print" logic can break the
entire test suite.
- Dialect-specific normalization (upper vs. lower case) can lead to false
negatives.
- We are often testing sqlglot's formatting engine rather than the semantic
correctness of the dialect translation.
- Last, but not least, these tests are a bit more difficult to read because
these substring checks are difficult to reason about.
**Proposed Solution**
I propose refactoring these tests to leverage sqlglot’s ability to compare
expressions directly. By comparing the resulting AST of the generated SQL
against an expected SQL string's AST, we ensure semantic equivalence without
being tied to a specific string format.
Example:
```python
# Instead of comparing strings:
# assert generated_string == expected_string
# We compare ASTs:
expected_ast = sqlglot.parse_one(expected_sql, read=Pinot)
generated_ast = sqlglot.parse_one(generated_sql, read=Pinot)
assert generated_ast == expected_ast
```
**I’m volunteering to work on this!**
I have already analyzed the existing
[test_pinot](https://github.com/apache/superset/blob/28c802fb6c7f6cef7d22d4793f4e5bfe0d16cbd5/tests/unit_tests/sql/dialects/pinot_tests.py#L505)
and am ready to submit a PR implementing this pattern.
GitHub link: https://github.com/apache/superset/discussions/36833
----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]