[ 
https://issues.apache.org/jira/browse/SPARK-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058159#comment-14058159
 ] 

Tathagata Das commented on SPARK-1853:
--------------------------------------

I don't think it is a bug in that. This is an artifact of Spark Streaming's 
execution model -- all the DStreams are initially defined in one thread, and 
then RDDs and jobs are created and executed in another thread. So when the 
thread creating RDDs calls Utils#getCallSite (which looks at the stack trace to 
figure out the user code in the call stack), it cannot find any reference 
to the user program, as it is not called from the user program. The correct 
solution will probably require each DStream to store its own call site info 
(where it was defined), and set it explicitly on every RDD that gets created 
by it. 
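
To make the threading issue concrete, here is a minimal toy model in plain Python. It is not Spark code and not the actual fix: `get_call_site`, `ToyStream`, and the "function name starts with `user_`" rule are hypothetical stand-ins for Utils#getCallSite, a DStream, and the real user-code detection. It only illustrates why a stack walk on the job-generating thread finds no user frame, and how storing the call site at definition time avoids that.

```python
import threading
import traceback


def get_call_site():
    # Toy stand-in for a stack-based call site lookup: walk the current
    # thread's stack from the innermost frame outward and return the first
    # frame that counts as "user code". Assumption for this sketch: user
    # code is any function whose name starts with "user_".
    for frame in reversed(traceback.extract_stack()):
        if frame.name.startswith("user_"):
            return f"{frame.name}:{frame.lineno}"
    return "<no user frame>"


class ToyStream:
    def __init__(self):
        # Captured on the defining thread, while user code is on the stack.
        self.creation_site = get_call_site()

    def make_rdd(self):
        # Runs later on a different thread. A fresh stack walk here cannot
        # see the user program, but the stored site is still available.
        return {
            "inferred_site": get_call_site(),
            "stored_site": self.creation_site,
        }


def user_program():
    # Plays the role of the user application defining a stream.
    return ToyStream()


stream = user_program()
result = {}

# Plays the role of the thread that creates RDDs and jobs.
worker = threading.Thread(target=lambda: result.update(stream.make_rdd()))
worker.start()
worker.join()

print(result)
```

In this sketch `inferred_site` comes back as the placeholder because the worker thread's stack contains only threading internals and `make_rdd`, while `stored_site` still points at `user_program`.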



> Show Streaming application code context (file, line number) in Spark Stages UI
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-1853
>                 URL: https://issues.apache.org/jira/browse/SPARK-1853
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.0.0
>            Reporter: Tathagata Das
>            Assignee: Mubarak Seyed
>             Fix For: 1.1.0
>
>         Attachments: Screen Shot 2014-07-03 at 2.54.05 PM.png
>
>
> Right now, the code context (file and line number) shown for streaming jobs 
> in the stages UI is meaningless, as it refers to internal DStream:<random line> 
> rather than the user application file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
