Patrick Wendell created SPARK-4737:
--------------------------------------

             Summary: Prevent serialization errors from ever crashing the DAG scheduler
                 Key: SPARK-4737
                 URL: https://issues.apache.org/jira/browse/SPARK-4737
             Project: Spark
          Issue Type: Bug
            Reporter: Patrick Wendell
            Assignee: Matthew Cheah
            Priority: Blocker
Currently in Spark we assume that when tasks are serialized in the TaskSetManager, the serialization cannot fail. We assume this because, upstream in the DAGScheduler, we attempt to catch any serialization errors by serializing a single partition. However, in some cases this upstream test is not accurate: an RDD can have one partition that serializes cleanly while other partitions do not. To do this properly, we need to catch and propagate the exception at the time of serialization. The tricky bit is making sure it gets propagated in the right way.
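A minimal sketch of why the single-partition check is insufficient (plain Java serialization, not Spark internals; the `Task` class and `trySerialize` helper here are hypothetical illustrations): one "partition" carries a serializable payload while another carries a non-serializable one, so a check that serializes only the first partition passes even though serializing the second would throw.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationCheck {
    // Hypothetical task payload: serializable only if its field is serializable.
    static class Task implements Serializable {
        final Object payload;
        Task(Object payload) { this.payload = payload; }
    }

    // Attempt serialization; return the failure instead of letting it crash
    // the caller, so it can be propagated as a task-set failure.
    static NotSerializableException trySerialize(Task t) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(t);
            return null; // serialized cleanly
        } catch (NotSerializableException e) {
            return e;    // caught at serialization time, not swallowed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "Partition 0" is clean; "partition 1" holds a Thread, which is
        // not Serializable. Testing only partition 0, as the upstream
        // DAGScheduler check does, would miss the failure in partition 1.
        Task p0 = new Task("clean");
        Task p1 = new Task(new Thread());
        System.out.println(trySerialize(p0) == null); // true
        System.out.println(trySerialize(p1) == null); // false
    }
}
```

The key point is that the failure is only observable when each task is actually serialized, which is why the catch belongs at serialization time in the TaskSetManager rather than in a one-partition probe upstream.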