Xiao-zhen-Liu commented on code in PR #4488:
URL: https://github.com/apache/texera/pull/4488#discussion_r3170310892
##########
amber/src/main/python/core/models/state.py:
##########
@@ -15,58 +15,56 @@
# specific language governing permissions and limitations
# under the License.
-from dataclasses import dataclass
-from pandas import DataFrame
-from pyarrow import Table
-from typing import Optional
+import base64
+import json
+from typing import Any, Dict, TypeAlias
-from .schema import Schema, AttributeType
-from .schema.attribute_type import FROM_PYOBJECT_MAPPING
+from .schema import Schema
+from .tuple import Tuple
+State: TypeAlias = Dict[str, Any]
Review Comment:
`State` being a type alias is fine for annotations, but here it is used as a
runtime match pattern in `main_loop.py`:
```
match(
element,
Tuple,
self._process_tuple,
State,
self._process_state,
)
```
Since `State: TypeAlias = Dict[str, Any]`, it is not a concrete runtime
class like `dict`; it is a subscripted `typing.Dict`.
That means runtime checks against it can fail, e.g.:
```
from typing import Any, Dict, TypeAlias
State: TypeAlias = Dict[str, Any]
isinstance({"x": 1}, State)
# TypeError: Subscripted generics cannot be used with class and instance
checks
```
So an incoming `StateFrame` payload can reach this dispatch path and fail
before `_process_state` is called. Could we match `dict` in `main_loop.py`
instead, or make `State` a real runtime class if we need to distinguish state
from ordinary dictionaries?
##########
amber/src/main/python/core/architecture/packaging/output_manager.py:
##########
@@ -248,7 +248,7 @@ def emit_state(
receiver,
(
StateFrame(payload)
- if isinstance(payload, State)
+ if isinstance(payload, dict)
Review Comment:
Using `dict` here avoids the `State` type-alias runtime issue, but it also
makes the dispatch very broad: any dictionary returned from `flush_state`
becomes a `StateFrame`. Is `flush_state` guaranteed to return only either a
State dictionary or a list of Tuples? If so, could we make that contract
explicit, or consider a concrete `State` runtime type/wrapper so this branch
does not silently classify arbitrary dictionaries as state?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]