[
https://issues.apache.org/jira/browse/SPARK-53743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jungtaek Lim updated SPARK-53743:
---------------------------------
Description:
We got a report that TWS PySpark with the Row type API failed when calling
ListState.put(); oddly, it ran fine for a while and eventually failed.
From the stack trace in the report, we figured out that it took the
fetchWithArrow code path (which is only triggered when the list size is
exactly 100, which was a bug), and the conversion somehow failed with the
stack trace below:
was:
We got a report that TWS PySpark with the Row type API failed when calling
ListState.put(); oddly, it ran fine for a while and eventually failed.
From the stack trace in the report, we figured out that it took the
fetchWithArrow code path (which is only triggered when the list size is
exactly 100, which was a bug), and the conversion only considers the Pandas
type API.
We need to either fix the Arrow code path to handle both the Pandas type and
the Row type, or just remove that code path.
> ListState fetchWithArrow option does not work with PySpark Row type API
> -----------------------------------------------------------------------
>
> Key: SPARK-53743
> URL: https://issues.apache.org/jira/browse/SPARK-53743
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 4.1.0
> Reporter: Jungtaek Lim
> Priority: Major
>
> We got a report that TWS PySpark with the Row type API failed when calling
> ListState.put(); oddly, it ran fine for a while and eventually failed.
> From the stack trace in the report, we figured out that it took the
> fetchWithArrow code path (which is only triggered when the list size is
> exactly 100, which was a bug), and the conversion somehow failed with the
> stack trace below:
>