[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561193#comment-16561193 ]
Ben Sigelman commented on HADOOP-15566:
---------------------------------------

Re the actual technical issue (there's a PS below about the more FUD-oriented points): rather than expecting the maintainers of every ASF storage system *and* the maintainers of every distributed tracing system to (a) decide on the nuances of a data model, then (b) write bindings from "Storage System X's" tracing hooks to "Tracing System Y's" client library (for all combinations of X's and Y's), we can instrument the ASF storage systems with a single API that has been specifically designed to be portable.

To address [~stack]'s question about performance: the noop implementation of an OpenTracing tracer amounts to an empty function call, but avoids the costs and/or lock contention of generating random numbers, context objects, and so forth.

Another point that [~stack] made:
{quote}For me, the hard part is not which tracing lib to use – if a tracing lib discussion, lets do it out on dev?
{quote}
I 90% agree with this. Certainly as a response to [~michaelsembwever], in any case, I would be glad to see a side-by-side comparison of OpenTracing vs "something custom" to understand the amount of *additional* work required to actually get end-to-end tracing to work.

That said, doing the tracing lib analysis "on dev" should also take the application developer experience into account... whatever we decide to do must require a minimum of configuration work (or educational work) for application developers, and that means we should think hard about being agnostic about the tracing system "above" the storage systems under consideration here – ideally we are able to plug into any of them without forcing the application developer / operator to write new code or go on a yak-shaving mission.

As a concrete next step, I would be curious to see the code / branch that [~jojochuang] used to generate the OT+Jaeger screenshots above.
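
To make the portability and noop-overhead points above a bit more concrete, here is a minimal sketch of the pattern. It is deliberately *not* the real OpenTracing Java API: {{Tracer}}, {{Span}}, {{NoopTracer}}, {{RecordingTracer}} and {{BlockReader}} are all hypothetical, simplified stand-ins invented for illustration. The assumption being illustrated is only that the storage code is written once against a small tracing interface, and that the backend (or a noop) is chosen elsewhere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for a portable tracing API.
// These are NOT the real io.opentracing interfaces.
class TracingSketch {
    public static void main(String[] args) {
        // Tracing disabled: the instrumentation reduces to near-empty calls.
        BlockReader untraced = new BlockReader(new NoopTracer());
        untraced.read("blk_1");

        // A different backend is swapped in without touching BlockReader.
        RecordingTracer recording = new RecordingTracer();
        BlockReader traced = new BlockReader(recording);
        traced.read("blk_1");
        System.out.println(recording.finished()); // prints [readBlock block=blk_1]
    }
}

interface Span extends AutoCloseable {
    Span tag(String key, String value);

    @Override
    void close(); // narrowed so that close() throws no checked exception
}

interface Tracer {
    Span start(String operationName);
}

// Noop backend: one shared span, no ids generated, no allocation, no locks.
final class NoopTracer implements Tracer {
    private static final Span NOOP = new Span() {
        public Span tag(String key, String value) { return this; }
        public void close() { }
    };

    public Span start(String operationName) { return NOOP; }
}

// Toy "real" backend: records each finished span as a string.
final class RecordingTracer implements Tracer {
    private final List<String> finished = new ArrayList<>();

    public Span start(String operationName) {
        StringBuilder description = new StringBuilder(operationName);
        return new Span() {
            public Span tag(String key, String value) {
                description.append(' ').append(key).append('=').append(value);
                return this;
            }
            public void close() { finished.add(description.toString()); }
        };
    }

    public List<String> finished() { return finished; }
}

// Hypothetical storage-side code path, instrumented once against Tracer.
final class BlockReader {
    private final Tracer tracer;

    BlockReader(Tracer tracer) { this.tracer = tracer; }

    public int read(String blockId) {
        try (Span span = tracer.start("readBlock")) {
            span.tag("block", blockId);
            return blockId.length(); // placeholder for the real I/O
        }
    }
}
```

In this sketch {{BlockReader}} never learns which backend is in use, which is the property that would let an operator choose a tracing system without new storage-side code. How closely a real binding would match this shape is something a dev-branch experiment would have to show.
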
I would also like to create a dev branch of HDFS or Cassandra that adds "native" OpenTracing instrumentation to a distributed code path that the HDFS devs think would be instructive/representative... I just think we're going to be hard-pressed to make an informed decision without pairings of trace visualizations (ideally in many tracing systems to illustrate portability) *and* the respective instrumentation code to illustrate non-bloat / maintainability. Would that be useful? [~stack], you were suggesting we try this on dev – any pointers for a non-HDFS / non-HBase expert on where to focus such an exercise?

{color:#707070}PS: {color}[~michaelsembwever]{color:#707070}, that was a lot of FUD to pack into one message ("bloat its API with vendor concerns", "hostile to the ASF", "hostile ... to those tracing solutions those vendors see as competition", etc). These concerns were also presented without any evidence – unsurprisingly, as I doubt that evidence exists. OpenTracing's two most common "pairings" are Zipkin and Jaeger, neither of which is a commercial solution. Contrary to what you suggest, the API is intentionally – if not primarily – designed to focus on _describing system behavior_ rather than the concerns of any downstream tracing system (OSS or commercial). All OpenTracing meetings are recorded and the notes are public if people here would like to judge for themselves about the openness and intent of the actual decision process (as opposed to the one you described/imagined).
For those who want a primer on what we're up to, I would recommend reading either [this doc that I wrote when we were just getting started|https://medium.com/opentracing/towards-turnkey-distributed-tracing-5f4297d1736], or [this more recent doc explaining how OT fits into the larger ecosystem|https://medium.com/opentracing/the-difference-between-tracing-tracing-and-tracing-84b49b2d54ea] that's developed in the interim.{color}

> Remove HTrace support
> ---------------------
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 3.1.0
> Reporter: Todd Lipcon
> Priority: Major
> Labels: security
> Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png,
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making
> further releases. The Hadoop project currently has various hooks with HTrace.
> It seems in some cases (eg HDFS-13702) these hooks have had measurable
> performance overhead. Given these two factors, I think we should consider
> removing the HTrace integration. If there is someone willing to do the work,
> replacing it with OpenTracing might be a better choice since there is an
> active community.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)