GitHub user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/106#discussion_r10417424
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -1031,8 +1026,10 @@ abstract class RDD[T: ClassTag](
     
       private var storageLevel: StorageLevel = StorageLevel.NONE
     
    -  /** Record user function generating this RDD. */
    -  @transient private[spark] val origin = sc.getCallSite()
    +  /** Info about the function call site where this was created (e.g. `textFile`, `parallelize`). */
    +  @transient private[spark] val callSite = Utils.getCallSiteInfo
    +
    +  private[spark] def getCallSiteString = Utils.formatCallSiteInfo(callSite)
    --- End diff --
    
    At present, none of these are public or returned to users (including the old 
`origin`); they are only consumed internally.
    
    It seemed to me that if one thing is a complex struct, it should be the 
`callSite`, and the string representation you sometimes build from it should 
be called `callSiteString`. For instance, I also changed the user override to 
`callSiteString`, because that is really what the user is overriding.
    
    Anyway, I don't feel strongly. We could rename `val callSite` here 
to `val callSiteInfo` and then rename all of the `callSiteString` things to 
`callSite`, which basically reverts it to how it was before. Would you prefer 
that?
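
    To make the two naming options concrete, here is a rough, self-contained 
sketch. It is not the code in this PR: the `CallSiteInfo` fields, the `Naming.format` 
helper, and the `RddA`/`RddB` classes are hypothetical stand-ins, and the output 
format is only an assumption about what `Utils.formatCallSiteInfo` might produce.

```scala
// Hypothetical stand-in for the struct returned by Utils.getCallSiteInfo.
// The field names are assumptions made for illustration only.
case class CallSiteInfo(lastSparkMethod: String, firstUserFile: String, firstUserLine: Int)

object Naming {
  // Assumed formatting; the real Utils.formatCallSiteInfo may differ.
  def format(info: CallSiteInfo): String =
    s"${info.lastSparkMethod} at ${info.firstUserFile}:${info.firstUserLine}"
}

// Option A (as in this diff): the struct is `callSite`,
// and the derived string is `callSiteString`.
class RddA(info: CallSiteInfo) {
  val callSite: CallSiteInfo = info
  def callSiteString: String = Naming.format(callSite)
}

// Option B (the rename floated above): the struct is `callSiteInfo`,
// and the derived string takes the shorter name `callSite`.
class RddB(info: CallSiteInfo) {
  val callSiteInfo: CallSiteInfo = info
  def callSite: String = Naming.format(callSiteInfo)
}
```

    Either way the two values carry the same information; the only difference is 
which one gets the shorter name.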

