Andrew Ash created SPARK-21962:
----------------------------------

             Summary: Distributed Tracing in Spark
                 Key: SPARK-21962
                 URL: https://issues.apache.org/jira/browse/SPARK-21962
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Andrew Ash


Spark should support distributed tracing, the mechanism popularized by Google 
in the [Dapper 
paper|https://research.google.com/pubs/pub36356.html], in which network requests 
carry additional metadata used to trace requests across services.
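
To make the mechanism concrete, here is a minimal sketch of Dapper-style context propagation using the B3 header names from OpenZipkin. The `TraceContext` class and the `new_id`, `child_of`, `inject`, and `extract` helpers are hypothetical names chosen for illustration; they are not part of Spark, Brave, or HTrace, and the plain dict stands in for whatever metadata a real request would carry.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TraceContext:
    trace_id: str                  # shared by every span in one request tree
    span_id: str                   # identifies this unit of work
    parent_span_id: Optional[str]  # None for the root span


def new_id() -> str:
    # 64-bit identifier rendered as 16 lower-hex characters
    return secrets.token_hex(8)


def child_of(parent: TraceContext) -> TraceContext:
    # A child keeps the trace ID and records the caller's span as its parent
    return TraceContext(parent.trace_id, new_id(), parent.span_id)


def inject(ctx: TraceContext, headers: Dict[str, str]) -> None:
    # Caller side: copy the context into the outgoing request metadata
    headers["X-B3-TraceId"] = ctx.trace_id
    headers["X-B3-SpanId"] = ctx.span_id
    if ctx.parent_span_id is not None:
        headers["X-B3-ParentSpanId"] = ctx.parent_span_id


def extract(headers: Dict[str, str]) -> Optional[TraceContext]:
    # Callee side: rebuild the context, or return None if the request was untraced
    if "X-B3-TraceId" not in headers or "X-B3-SpanId" not in headers:
        return None
    return TraceContext(
        headers["X-B3-TraceId"],
        headers["X-B3-SpanId"],
        headers.get("X-B3-ParentSpanId"),
    )
```

A caller would create a child context for each outgoing request and `inject` it; the receiving service would `extract` it and use it as the parent of its own spans, which is what lets a tracing backend stitch the spans into one tree.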

This would be useful to me because I have OpenZipkin-style tracing in my 
distributed application up to the Spark driver, and from the executors out to 
my other services, but the trace is broken inside Spark between driver and 
executor because span IDs aren't propagated across that link.

An initial implementation could instrument the most important network calls 
with trace IDs (such as launching and finishing tasks), then incrementally add 
tracing to other calls (torrent block distribution, external shuffle service, 
etc.) as the feature matures.
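
As a rough illustration of the first step, the sketch below shows trace IDs riding along with a task-launch message and being picked up again on the executor side. Everything here is an assumption made for the example: `TaskDescription`, `launch_task`, `run_task`, and the `trace.*` property names are hypothetical stand-ins, not Spark's actual RPC types or configuration keys.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical property names, chosen only for this sketch
TRACE_ID_KEY = "trace.traceId"
PARENT_SPAN_KEY = "trace.parentSpanId"


@dataclass
class TaskDescription:
    # Hypothetical stand-in for the message a driver sends to launch a task
    task_id: int
    properties: Dict[str, str] = field(default_factory=dict)


def launch_task(task_id: int, trace_id: str, driver_span_id: str) -> TaskDescription:
    # Driver side: attach the current trace and span IDs to the launch message
    return TaskDescription(
        task_id,
        {TRACE_ID_KEY: trace_id, PARENT_SPAN_KEY: driver_span_id},
    )


def run_task(task: TaskDescription, reported: List[Tuple[str, str, str]]) -> None:
    # Executor side: continue the driver's trace if IDs were sent, else start a new one
    trace_id = task.properties.get(TRACE_ID_KEY) or secrets.token_hex(8)
    parent_span_id = task.properties.get(PARENT_SPAN_KEY, "")
    span_id = secrets.token_hex(8)
    # ... the task body would run here; outbound calls would carry trace_id and span_id ...
    # Record the finished span as (trace ID, span ID, parent span ID)
    reported.append((trace_id, span_id, parent_span_id))
```

With this shape, a span reported by the executor shares the driver's trace ID and names the driver's span as its parent, which is the link that is currently missing. Tasks launched without trace properties still run and simply start a fresh trace.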

Search keywords: Dapper, Brave, OpenZipkin, HTrace


