nirav patel created SPARK-15845:
-----------------------------------

             Summary: Expose metrics for sub-task steps 
                 Key: SPARK-15845
                 URL: https://issues.apache.org/jira/browse/SPARK-15845
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 1.5.2
            Reporter: nirav patel


Spark optimizes DAG processing by efficiently selecting stage boundaries. This 
makes a Spark stage a sequence of multiple transformations and zero or one 
action. As a result, the stage Spark is currently running can internally be a 
series of operations such as (map -> shuffle -> map -> map -> collect). Notice 
that it goes past the shuffle dependency and folds the following 
transformations and actions into the same stage. So any task of this stage 
essentially performs all of those transformations/actions as a single unit, 
and there is no further visibility inside it. Network read, populating 
partitions, compute, shuffle write, shuffle read, compute, and writing the 
final partitions to disk all happen within one stage. In other words, every 
task of that stage performs all of those operations on a single partition as a 
unit. This takes away significant visibility into the user's transformations 
and actions: which one is taking longer, which one is the resource bottleneck, 
and which one is failing.
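To illustrate the visibility problem and one possible remedy, here is a 
minimal, Spark-free Python sketch. It is only an analogy under stated 
assumptions: StepTimer, run_pipelined_task, and the step names are hypothetical 
and not part of any Spark API. It mimics how pipelining runs every record 
through all steps inside one task (so only whole-task time is observable), and 
shows how wrapping each step with a timer, much like a per-step accumulator 
would, recovers the per-step metrics this issue asks for.

```python
import time
from collections import defaultdict

class StepTimer:
    """Hypothetical helper: accumulates wall-clock time per named step."""

    def __init__(self):
        self.totals = defaultdict(float)  # step name -> cumulative seconds

    def wrap(self, name, fn):
        # Return a wrapped version of fn that records its execution time.
        def timed(record):
            start = time.perf_counter()
            result = fn(record)
            self.totals[name] += time.perf_counter() - start
            return result
        return timed

def run_pipelined_task(partition, steps):
    # Spark-style pipelining: each record flows through ALL steps in one
    # pass, so from the outside only the total task time is visible.
    out = []
    for record in partition:
        for step in steps:
            record = step(record)
        out.append(record)
    return out

timer = StepTimer()
steps = [
    timer.wrap("parse",  lambda s: int(s)),
    timer.wrap("square", lambda n: n * n),
    timer.wrap("format", lambda n: "val=%d" % n),
]
result = run_pipelined_task(["1", "2", "3"], steps)
# result == ["val=1", "val=4", "val=9"]
# timer.totals now holds a per-step time breakdown for this task,
# the kind of sub-task metric the Spark UI does not expose today.
```

In real Spark, the analogous mechanism would be per-operation accumulators or 
listener events reported back to the driver, so the UI could break a task's 
time down by transformation instead of showing one opaque total.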

The Spark UI only shows that it is currently running some action stage. If the 
job fails at that point, the UI just says the action failed, but in fact the 
failure could come from any step in that lazy chain of evaluation. Looking at 
executor logs gives some insight, but that is not always straightforward. 

I think we need more visibility into what is happening underneath a task (the 
series of Spark transformations/actions that comprise a stage) so we can 
troubleshoot more easily, find bottlenecks, and optimize our DAGs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
