Re: Understanding Spark UI DAGs

C. Josephson Thu, 21 Jul 2016 12:44:45 -0700

Ok, so those line numbers in our DAG don't refer to our code. Is there any
way to display (or calculate) line numbers that refer to code we actually
wrote, or is that only possible in Scala Spark?


On Thu, Jul 21, 2016 at 12:24 PM, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> My limited understanding of the Python-Spark bridge is that at some point
> the Python code communicates over the wire with Spark's backbone, which
> includes PythonRDD [1].
>
> When the CallSite can't be computed, it is shown as null:-1 to denote
> that "nothing could be referred to".
>
> [1]
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
>
> Regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Thu, Jul 21, 2016 at 8:36 PM, C. Josephson <cjos...@uhana.io> wrote:
> >> It's called a CallSite that shows where the line comes from. You can see
> >> the code yourself given the python file and the line number.
> >
> >
> > But that's what I don't understand. Which python file? We spark submit one
> > file called ctr_parsing.py, but it only has 150 lines. So what is
> > MapPartitions at PythonRDD.scala:374 referring to? ctr_parsing.py imports a
> > number of support functions we wrote, but how do we know which python file
> > to look at?
> >
> > Furthermore, what on earth is null:-1 referring to?
>



-- 
Colleen Josephson
Engineering Researcher
Uhana, Inc.
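
The call-site idea discussed in this thread can be illustrated with a small
pure-Python sketch. It is not Spark's implementation, only an assumption-laden
model of the behaviour described above: walk the call stack, report the
innermost frame that lies outside the library, and fall back to null:-1 when
no such frame exists. The function name first_user_frame, the directory
/opt/spark/, the file support.py and all line numbers are hypothetical; only
ctr_parsing.py comes from the thread.

```python
import traceback


def first_user_frame(stack, library_dirs):
    """Describe the innermost frame in `stack` that lies outside `library_dirs`.

    `stack` is ordered oldest call first, as traceback.extract_stack() returns it.
    Returns "null:-1" when every frame belongs to the library (or the stack is
    empty), mirroring the placeholder seen in the Spark UI.
    """
    for frame in reversed(stack):  # innermost call first
        in_library = any(frame.filename.startswith(d) for d in library_dirs)
        if not in_library:
            return "%s at %s:%d" % (frame.name, frame.filename, frame.lineno)
    return "null:-1"


# Hypothetical stack: ctr_parsing.py calls a helper, which calls into the library.
example_stack = [
    traceback.FrameSummary("/app/ctr_parsing.py", 42, "main", lookup_line=False),
    traceback.FrameSummary("/app/support.py", 17, "parse", lookup_line=False),
    traceback.FrameSummary("/opt/spark/python/pyspark/rdd.py", 100,
                           "mapPartitions", lookup_line=False),
]

print(first_user_frame(example_stack, ["/opt/spark/"]))      # parse at /app/support.py:17
print(first_user_frame(example_stack[2:], ["/opt/spark/"]))  # null:-1
```

In a live process the stack could come from traceback.extract_stack() instead
of the hand-built list used here. The second call shows why a label such as
null:-1 carries no file information: when only library frames are visible,
there is nothing in user code to point at.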
