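You could also enable lineage logging with --conf spark.logLineage=true if you do not want to change any code. Spark then logs each RDD's lineage (the same text toDebugString produces) whenever a job is run.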
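
In case a concrete example helps, here is a minimal PySpark sketch of both approaches. Treat it as an illustration under assumptions rather than a tested recipe: the helper name lineage_text, the demo function, the app name and the local[2] master are all made up for this example. It relies on RDD.toDebugString() returning bytes (or None) in PySpark, which is why the helper decodes the result. As far as I know the spark.logLineage output is written at INFO level, so the sketch raises the log level to make it visible.

```python
from typing import Any


def lineage_text(rdd: Any) -> str:
    """Return an RDD's lineage as text (hypothetical helper).

    PySpark's RDD.toDebugString() returns bytes or None, so decode it.
    """
    raw = rdd.toDebugString()
    if raw is None:
        return ""
    return raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)


def demo() -> None:
    """Hypothetical driver; requires a local PySpark installation."""
    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setMaster("local[2]")            # assumption: local test run
        .setAppName("lineage-demo")       # hypothetical app name
        .set("spark.logLineage", "true")  # same as --conf spark.logLineage=true
    )
    sc = SparkContext.getOrCreate(conf)
    sc.setLogLevel("INFO")  # lineage log lines are emitted at INFO level
    try:
        sums = (
            sc.parallelize(range(10))
            .map(lambda x: (x % 3, x))
            .reduceByKey(lambda a, b: a + b)
        )
        # Option 1: dump the lineage explicitly from code.
        print(lineage_text(sums))
        # Option 2: running any job makes Spark log the lineage itself.
        sums.collect()
    finally:
        sc.stop()
```

Calling demo() from a script (with pyspark installed) should print the lineage once from the explicit call and log it again when collect() runs. The explicit call only needs the RDD, so it works at any point after the RDD is defined, even before a job has been submitted.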

Regards,
Keith.

http://keith-chapman.com

On Fri, Jul 21, 2017 at 7:57 PM, Keith Chapman <keithgchap...@gmail.com> wrote:

> Hi Ron,
>
> You can try using the toDebugString method on the RDD, this will print
> the RDD lineage.
>
> Regards,
> Keith.
>
> http://keith-chapman.com
>
> On Fri, Jul 21, 2017 at 11:24 AM, Ron Gonzalez
> <zlgonza...@yahoo.com.invalid> wrote:
>
>> Hi,
>> Can someone point me to a test case or share sample code that is able
>> to extract the RDD graph from a Spark job anywhere during its lifecycle? I
>> understand that Spark has UI that can show the graph of the execution so
>> I'm hoping that is using some API somewhere that I could use.
>> I know RDD is the actual execution graph, so if there is also a more
>> logical abstraction API closer to calls like map, filter, aggregate, etc.,
>> that would even be better.
>> Appreciate any help...
>>
>> Thanks,
>> Ron