[ https://issues.apache.org/jira/browse/SPARK-45671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim updated SPARK-45671:
---------------------------------
        Parent:     (was: SPARK-45511)
    Issue Type: Improvement  (was: Sub-task)

> Implement an option similar to corrupt record column in State Data Source
> Reader
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-45671
>                 URL: https://issues.apache.org/jira/browse/SPARK-45671
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 4.0.0
>            Reporter: Jungtaek Lim
>            Priority: Major
>
> Queries against the state will most likely fail if the underlying state
> file is corrupted. There may also be cases where the raw binary data that
> the state store reads from the state file does not fit the state schema,
> which ends in an exception or fatal error at runtime.
> (We can't catch the case where data is loaded with an incorrect schema if
> no exception is thrown. We cannot attach the schema to every piece of data.)
> To handle the above cases without failure, we want to return state rows for
> valid rows and also provide the binary data for corrupted rows (as we do
> for CSV/JSON), if users specify an option.
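
The behavior requested in the description can be illustrated with a small, self-contained sketch. This is not Spark code and not the proposed implementation: the row layout (a 32-bit key followed by a 64-bit value), the function name read_permissive, and the corrupt_record field are all hypothetical, chosen only to show the idea of returning decoded rows for valid data and the raw bytes for rows that fail to decode.

```python
import struct
from typing import Dict, Iterable, Iterator, Optional, Union

# Hypothetical fixed schema, for illustration only:
# big-endian int32 key followed by int64 value (12 bytes per row).
ROW_FORMAT = ">iq"

Row = Dict[str, Optional[Union[int, bytes]]]


def read_permissive(raw_rows: Iterable[bytes]) -> Iterator[Row]:
    """Decode each raw row; if decoding raises, keep the raw bytes instead."""
    for raw in raw_rows:
        try:
            key, value = struct.unpack(ROW_FORMAT, raw)
        except struct.error:
            # The bytes do not fit the schema: emit null columns plus the
            # raw binary, instead of failing the whole read.
            yield {"key": None, "value": None, "corrupt_record": raw}
        else:
            yield {"key": key, "value": value, "corrupt_record": None}


if __name__ == "__main__":
    good = struct.pack(ROW_FORMAT, 1, 100)
    bad = b"\xde\xad"  # too short for the schema, so unpack raises
    for row in read_permissive([good, bad]):
        print(row)
```

The sketch also reflects the limitation noted in the description: any 12-byte input decodes without error under this layout, so data written with a different schema of the same size would be returned as a valid row rather than flagged as corrupt.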