GitHub user ilganeli commented on the pull request:

https://github.com/apache/spark/pull/3518#issuecomment-65001161

Thanks for the quick review Josh, I'll look into refactoring the search step into a separate component.

With regard to your first comment, I display two types of debugging information for RDDs. The first is the serialization trace we are discussing here (the one that shows which specific dependency was at fault). The second is the dependencyTrace, which is printed using the toDebugString() function of the RDD class. This secondary output prints the dependency graph I believe we are talking about (the items in the graph carry the same labels as the items in the serializationTrace). Thus, the serialization trace tells us which specific RDD is unserializable, and the secondary output shows how it is related to the parent RDD. Would this address your concern?
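To make the distinction between the two outputs concrete, here is a minimal toy sketch in Python. It is not the Spark code from this pull request: `Node`, `serialization_trace`, and `debug_string` are hypothetical names, `pickle` stands in for Java serialization, and the three-node lineage is made up for illustration. The first function reports which node fails to serialize; the second prints the labeled dependency graph so the failing node can be located relative to its parents.

```python
import pickle
import threading
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Node:
    """Hypothetical stand-in for an RDD: a label, a payload, and parent dependencies."""
    name: str
    payload: Any = None
    parents: List["Node"] = field(default_factory=list)


def is_serializable(obj: Any) -> bool:
    """Return True if pickle can serialize obj (pickle is an assumption, not Spark's serializer)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


def serialization_trace(node: Node) -> List[str]:
    """Walk the lineage and collect labels of nodes whose own payload fails to serialize."""
    failures = []
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        if not is_serializable(current.payload):
            failures.append(current.name)
        stack.extend(current.parents)
    return failures


def debug_string(node: Node, depth: int = 0) -> str:
    """Render the dependency graph as an indented tree, using the same labels as the trace."""
    lines = ["  " * depth + node.name]
    for parent in node.parents:
        lines.append(debug_string(parent, depth + 1))
    return "\n".join(lines)


# Made-up lineage: result -> mapped -> source, where "mapped" holds a lock that cannot be pickled.
source = Node("source", payload=[1, 2, 3])
mapped = Node("mapped", payload=threading.Lock(), parents=[source])
result = Node("result", payload="ok", parents=[mapped])

trace = serialization_trace(result)
graph = debug_string(result)
print(trace)
print(graph)
```

Because both outputs use the same labels, a name that appears in the trace can be looked up directly in the printed graph to see which parent it hangs off.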