codeant-ai-for-open-source[bot] commented on code in PR #37576:
URL: https://github.com/apache/superset/pull/37576#discussion_r2755852545
##########
superset/explorables/base.py:
##########
@@ -106,6 +106,18 @@ def get_query_str(self, query_obj: QueryObjectDict) -> str:
# Identity & Metadata
# =========================================================================
+ @property
+ def id(self) -> int:
+ """
+ Primary key identifier for this explorable.
+
+ Used for database lookups such as row-level security filter resolution.
+ Must be accessible without triggering expensive operations like
+ database engine connections.
+
+ :return: Integer primary key
Review Comment:
**Suggestion:** The `Explorable.id` protocol property is typed as returning
`int`, but elsewhere the explorable metadata explicitly allows non-integer
identifiers (e.g., `ExplorableData["id"]` can be `int` or `str`, and guest RLS
compares dataset IDs as strings), so this overly narrow type will cause type
checking errors for valid implementations and encourages incorrect assumptions
that IDs must be integers. Widening the return type to a hashable value (e.g.,
`Hashable`) better matches existing contracts and avoids type mismatches while
still constraining the type appropriately. [type error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Type-checking failures in IDEs and CI linting.
- ⚠️ Developer friction when implementing Explorable types.
- ⚠️ Mypy errors for implementations using string IDs.
- ⚠️ Suggestion does not alter runtime behavior.
```
</details>
```suggestion
def id(self) -> Hashable:
"""
Primary key identifier for this explorable.
Used for database lookups such as row-level security filter
resolution.
Must be accessible without triggering expensive operations like
database engine connections.
:return: Hashable primary key (typically int or str)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Trigger dashboard serialization path by requesting a dashboard: GET
/api/v1/dashboard/{id}. The call chain reaches row-level security code that
resolves
datasources (see superset/security/manager.py).
2. The security manager entry is defined at
superset/security/manager.py:2692 in function
get_rls_filters(self, table: "BaseDatasource | Explorable") — this function
is called
during dashboard serialization (evidence: grepping shows calls at
superset/connectors/sqla/models.py:1956 and superset/jinja_context.py:229).
3. Inside get_rls_filters, the code reads datasource.metadata via
datasource.data["id"]
(evidence: superset/security/manager.py:818 and lines around 2570/2688 in
the Grep
results). Those lines show direct usage of datasource.data["id"] as the
canonical dataset
identifier.
4. The submitted PR changed the Explorable protocol to type id -> int
(superset/explorables/base.py lines ~109-119). However, existing codebase
treats the
explorable id in metadata as int or str in many places (see
superset/superset_typing.py
referenced by the Grep results which defines ExplorableData typing used by
.data), and
security code sometimes compares rule.dataset using string conversions
(superset/security/manager.py:2688 uses str(rule.get("dataset")) ==
str(dataset.data["id"])).
5. Practical consequence: static type checkers (mypy/IDE) will flag
implementations that
return str ids or other hashable identifiers as mismatched against the new
int return
annotation on Explorable.id. This is a developer-level/type-check issue
rather than a
runtime bug — it will not change runtime behavior, because Python's runtime
does not
enforce the declared return type.
6. Reproduction of the type problem locally:
a. Open any concrete Explorable implementation, e.g.,
superset/connectors/sqla/models.py (SqlaTable) which participates in
.data and metadata
usage.
b. Run the project's type checker (mypy/flake) or observe IDE linting
after the PR: the
implementation returning or exposing non-int values for id (or code that
treats
Explorable.id as Hashable) will produce type-mismatch diagnostics
referencing
superset/explorables/base.py:109 def id(self) -> int.
7. Note: This suggestion is not addressing the runtime regression described
in the PR
(that was caused by reading table.data["id"] triggering expensive .data
computation). The
proposed change is a typing refinement to better match existing contracts
and avoid
static-analysis errors. The runtime code paths (datasource.data access)
remain unchanged.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/explorables/base.py
**Line:** 110:118
**Comment:**
*Type Error: The `Explorable.id` protocol property is typed as
returning `int`, but elsewhere the explorable metadata explicitly allows
non-integer identifiers (e.g., `ExplorableData["id"]` can be `int` or `str`,
and guest RLS compares dataset IDs as strings), so this overly narrow type will
cause type checking errors for valid implementations and encourages incorrect
assumptions that IDs must be integers. Widening the return type to a hashable
value (e.g., `Hashable`) better matches existing contracts and avoids type
mismatches while still constraining the type appropriately.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]