bogao007 commented on code in PR #47933:
URL: https://github.com/apache/spark/pull/47933#discussion_r1737296085
##########
python/pyspark/sql/streaming/stateful_processor.py:
##########
@@ -77,6 +78,58 @@ def clear(self) -> None:
         self._value_state_client.clear(self._state_name)
+class ListState:
+    """
+    Class used for arbitrary stateful operations with transformWithState to capture single value
+    state.
+
+    .. versionadded:: 4.0.0
+    """
+
+    def __init__(
+        self, list_state_client: ListStateClient, state_name: str, schema: Union[StructType, str]
+    ) -> None:
+        self._list_state_client = list_state_client
+        self._state_name = state_name
+        self.schema = schema
+
+    def exists(self) -> bool:
+        """
+        Whether list state exists or not.
+        """
+        return self._list_state_client.exists(self._state_name)
+
+    def get(self) -> Iterator["PandasDataFrameLike"]:

Review Comment:
   @HeartSaVioR I'm keeping the `Iterator["PandasDataFrameLike"]` for `get()` for now since I feel it's easier for users to use `PandasDataFrameLike` as input parameter for `put()` and `appendList()`. It's better to keep it consistent for `get()` as well. But let me know if you have different ideas, thanks!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: reviews-h...@spark.apache.org
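The consistency argument in the review comment can be illustrated with a small sketch. This is not the `ListState` class from the PR: `InMemoryListState` is a hypothetical in-memory stand-in, the replace-on-`put()` behaviour is an assumption, and a plain list of row tuples stands in for `PandasDataFrameLike` so the example runs without pandas. It only shows why having `get()` yield the same type that `put()` and `appendList()` accept keeps a read-modify-write free of conversions.

```python
from typing import Generic, Iterator, List, Tuple, TypeVar

# Stand-in for PandasDataFrameLike (assumption): any batch type works here.
Batch = TypeVar("Batch")


class InMemoryListState(Generic[Batch]):
    """Hypothetical stand-in for illustration; not the PR's ListState."""

    def __init__(self) -> None:
        self._batches: List[Batch] = []

    def exists(self) -> bool:
        return len(self._batches) > 0

    def get(self) -> Iterator[Batch]:
        # Yields batches of the same type that put()/appendList() accept.
        return iter(list(self._batches))

    def put(self, batch: Batch) -> None:
        # Assumed semantics: replace whatever was stored before.
        self._batches = [batch]

    def appendList(self, batch: Batch) -> None:
        self._batches.append(batch)

    def clear(self) -> None:
        self._batches = []


# A "batch" in this sketch is a list of (key, value) rows.
Rows = List[Tuple[str, int]]

state: InMemoryListState[Rows] = InMemoryListState()
state.put([("a", 1)])
state.appendList([("b", 2), ("c", 3)])

# Read, filter, and write back: the output of get() feeds put() directly.
kept: Rows = [row for batch in state.get() for row in batch if row[1] >= 2]
state.put(kept)
```

If `get()` instead yielded a different type (for example individual rows), the write-back step would need an extra conversion into whatever `put()` expects, which is the inconvenience the comment is trying to avoid.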