Andrew Ash created SPARK-21962:
----------------------------------

             Summary: Distributed Tracing in Spark
                 Key: SPARK-21962
                 URL: https://issues.apache.org/jira/browse/SPARK-21962
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Andrew Ash

Spark should support distributed tracing, the mechanism popularized by Google in the [Dapper Paper|https://research.google.com/pubs/pub36356.html], in which network requests carry additional metadata (trace and span IDs) so that a single request can be followed across services.

This would be useful for me because I have OpenZipkin-style tracing in my distributed application up to the Spark driver, and from the executors out to my other services. The trace is broken inside Spark, between driver and executor, because the span IDs aren't propagated across that link.

An initial implementation could instrument the most important network calls with trace IDs (such as launching and finishing tasks), and incrementally add tracing to other calls (torrent block distribution, external shuffle service, etc.) as the feature matures.

Search keywords: Dapper, Brave, OpenZipkin, HTrace
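
To make the request concrete, here is a minimal sketch of the driver-to-executor propagation described above. It is an illustration under stated assumptions, not a proposed Spark API: `SpanContext`, `new_id`, `inject`, and `extract` are hypothetical names, and a plain dict stands in for whatever per-task metadata Spark would carry. The key names follow OpenZipkin's B3 propagation convention; reusing them as task metadata keys is an assumption.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

# B3 propagation key names from OpenZipkin. Using them as keys in
# per-task metadata is an assumption of this sketch.
TRACE_ID = "X-B3-TraceId"
SPAN_ID = "X-B3-SpanId"
PARENT_SPAN_ID = "X-B3-ParentSpanId"


def new_id() -> str:
    """Return a random 64-bit ID as 16 lowercase hex characters."""
    return secrets.token_hex(8)


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical holder for the IDs that must cross the link."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None

    def child(self) -> "SpanContext":
        """Same trace, fresh span ID, with this span as the parent."""
        return SpanContext(self.trace_id, new_id(), self.span_id)


def inject(ctx: SpanContext, props: Dict[str, str]) -> None:
    """Driver side: copy the span context into the task metadata."""
    props[TRACE_ID] = ctx.trace_id
    props[SPAN_ID] = ctx.span_id
    if ctx.parent_span_id is not None:
        props[PARENT_SPAN_ID] = ctx.parent_span_id


def extract(props: Dict[str, str]) -> Optional[SpanContext]:
    """Executor side: rebuild the span context, or None if untraced."""
    trace_id = props.get(TRACE_ID)
    span_id = props.get(SPAN_ID)
    if trace_id is None or span_id is None:
        return None
    return SpanContext(trace_id, span_id, props.get(PARENT_SPAN_ID))


# Driver: the incoming request already has a span; open a child span
# for the task launch and attach it to the task metadata.
driver_span = SpanContext(new_id(), new_id())
task_props: Dict[str, str] = {}
inject(driver_span.child(), task_props)

# Executor: recover the context, then open child spans for outgoing
# calls so they join the same trace as the driver's request.
task_span = extract(task_props)
outgoing_span = task_span.child() if task_span else None
```

With something like this in place, calls made from the executor would share the driver's trace ID, which is the link that is currently missing. The other calls mentioned (block distribution, shuffle service) could be instrumented the same way later.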