[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

ilganeli Sun, 30 Nov 2014 13:31:33 -0800

Github user ilganeli commented on the pull request:

    https://github.com/apache/spark/pull/3518#issuecomment-65001161
  
    Thanks for the quick review, Josh. I'll look into refactoring the search 
step into a separate component. With regard to your first comment, I display 
two types of debugging information for RDDs.
    
    The first is the serialization trace we are discussing here (the one that 
shows which specific dependency was at fault). The second is the 
dependencyTrace, which is printed using the toDebugString() function of the RDD 
class. This secondary output prints the dependency graph I believe we are 
talking about, with each item in the graph labeled the same way as in the 
serializationTrace. Thus, the serialization trace tells us which specific RDD 
is unserializable, and the secondary output shows how it is related to the 
parent RDD. Would this address your concern?
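
    To make the two outputs concrete, here is a rough, Spark-free sketch of the 
idea. It is not the code in this pull request: the Node class, the method 
names, and the output format are all hypothetical stand-ins, and the only real 
API used is plain Java serialization (java.io.ObjectOutputStream). Each node 
plays the role of an RDD, its payload plays the role of whatever the RDD 
captured, and its parents are its dependencies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SerializationTraceSketch {

    // Hypothetical stand-in for an RDD: a label, a captured payload, and parents.
    static final class Node {
        final String name;
        final Object payload;
        final List<Node> parents;

        Node(String name, Object payload, Node... parents) {
            this.name = name;
            this.payload = payload;
            this.parents = Arrays.asList(parents);
        }
    }

    // Attempts Java serialization into a throwaway buffer.
    // NotSerializableException is an IOException, so it is caught here.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // "Serialization trace": one line per node, depth-first from the root,
    // saying whether that node's own payload can be serialized.
    static List<String> serializationTrace(Node root) {
        List<String> lines = new ArrayList<>();
        collect(root, lines);
        return lines;
    }

    private static void collect(Node n, List<String> lines) {
        String status = isSerializable(n.payload) ? "serializable" : "NOT serializable";
        lines.add(n.name + ": " + status);
        for (Node p : n.parents) {
            collect(p, lines);
        }
    }

    // "Dependency trace": an indented tree using the same labels, so a node
    // flagged in the serialization trace can be located in the graph.
    static String dependencyTrace(Node root) {
        StringBuilder sb = new StringBuilder();
        render(root, 0, sb);
        return sb.toString();
    }

    private static void render(Node n, int depth, StringBuilder sb) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(n.name).append('\n');
        for (Node p : n.parents) {
            render(p, depth + 1, sb);
        }
    }

    // Small example graph: filtered -> mapped -> source.
    // java.lang.Object does not implement Serializable, so "mapped" fails.
    static Node sample() {
        Node source = new Node("source", "plain string");
        Node mapped = new Node("mapped", new Object(), source);
        return new Node("filtered", Integer.valueOf(1), mapped);
    }

    public static void main(String[] args) {
        Node root = sample();
        serializationTrace(root).forEach(System.out::println);
        System.out.print(dependencyTrace(root));
    }
}
```

    In this sketch the first listing flags "mapped" as the unserializable 
node, and the second listing shows where "mapped" sits between "filtered" and 
"source". Sharing the labels between the two listings is what lets a reader 
go from "which one failed" to "how it relates to its parent".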



