GitHub user ilganeli opened a pull request: https://github.com/apache/spark/pull/3518
[SPARK-3694] RDD and Task serialization debugging output

Hi all - in addition to what was explicitly requested in the original JIRA, I also added a trace of the serialization for RDDs, so that you can see which specific dependency is unserializable. For debugging task serialization, I added debug log output that shows the file and jar dependencies, but I am unsure whether I can add more functionality there.

For the RDD, it is possible to attempt to serialize each dependency in turn, which is how I can identify which component fails. For task debugging, I did not see a straightforward way to do the same thing. If anyone can suggest an approach here, I would be happy to implement it.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilganeli/spark SPARK-3694B

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3518

----
commit 6c997629e4d3bf9bccfbe9c3fa65aa1afa4bfca0
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-10-30T15:02:04Z

    Created class to traverse dependency graph of RDD

commit 47ccc227e5bdf14a1db20edfcf1b8f9c77b3b64a
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-10-30T22:06:04Z

    Started walker code

commit a8d5332a71fbad4cca0aa1a7ca73db8e1386e15f
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-06T18:40:38Z

    RDD WAlker updates

commit a63652f8240e0c370100ab05a11c95beaf47faa5
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-06T18:42:48Z

    Added debug output to task serialization.
    Added debug output to RDD serialization.

commit 05f2cc0665af3ca297936c8c4c5f6128be5a1ddc
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-06T18:51:50Z

    Rebase

commit cbb1d771f4576c6ba981252cd8b7490722317ddf
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-14T19:03:25Z

    Style errors

commit 183100019a0866e515edd0164db9c4c7fdf3ee5f
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-29T16:21:43Z

    Merge remote-tracking branch 'upstream/master'

commit 916a31c57d89bc6fb83b33fdf70dfc1b94192cc5
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-29T23:52:00Z

    Manual merge of updates

commit bfb723de65e60aabb9cccc3b45ccc4638f12583d
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-29T23:55:40Z

    Added helper files

commit e0a81537d5962f8bc79b8b9193a30b46827246ed
Author: Ilya Ganelin <ilya.gane...@capitalone.com>
Date:   2014-11-30T00:45:52Z

    Fixed whitespace errors

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
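
For readers following the description above, the idea of serializing each dependency in turn can be sketched roughly as follows. This is not the code in the pull request and does not use any Spark classes: `SerializationTrace`, `Node`, `isSerializable`, and `findUnserializable` are hypothetical names made up for illustration, and `Node` is only a stand-in for an RDD with parent dependencies. The sketch assumes plain Java serialization and an acyclic dependency graph.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SerializationTrace {

    /** Hypothetical stand-in for an RDD: a name, some state, and parent dependencies. */
    static final class Node {
        final String name;
        final Object state;
        final List<Node> dependencies;

        Node(String name, Object state, Node... dependencies) {
            this.name = name;
            this.state = state;
            this.dependencies = Arrays.asList(dependencies);
        }
    }

    /** Returns true if Java serialization of the object succeeds. */
    static boolean isSerializable(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    /** Returns the path from the root to every node whose state fails to serialize. */
    static List<String> findUnserializable(Node root) {
        List<String> failures = new ArrayList<>();
        walk(root, root.name, failures);
        return failures;
    }

    // Depth-first walk: test this node's own state in isolation, then each dependency.
    // A node shared by several parents is visited once per path.
    private static void walk(Node node, String path, List<String> failures) {
        if (!isSerializable(node.state)) {
            failures.add(path);
        }
        for (Node dep : node.dependencies) {
            walk(dep, path + " <- " + dep.name, failures);
        }
    }

    public static void main(String[] args) {
        // Illustrative graph only; the names and values are made up.
        Node source = new Node("source", "input-path");
        Node mapped = new Node("mapped", new Object(), source); // Object is not Serializable
        Node result = new Node("result", 42, mapped);
        for (String path : findUnserializable(result)) {
            System.out.println("Not serializable: " + path);
        }
    }
}
```

Testing each node's own state separately, rather than serializing the root and letting the failure surface somewhere inside the object graph, is what makes it possible to name the failing component. Whether something similar can be done for tasks is the open question raised in the description.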