[ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530144#comment-16530144
 ] 

Wei-Chiu Chuang commented on HADOOP-15566:
------------------------------------------

Hi Ben!
With help from [~tlipcon], I worked with [~fabbri] and [~rizaon] and spent 
a day or two porting htrace to opentracing. It turned out to be quite a fun 
exercise.

Most of the porting is mechanical: changing htrace spans to opentracing spans. 
It took me a while to figure out how to pass the trace id in opentracing, but 
it is doable. I was even able to add some tracing code that was missing before.
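
To illustrate what "passing the trace id" involves, here is a minimal sketch of 
the inject/extract pattern that opentracing uses for propagation (its Tracer 
exposes inject and extract operations over a carrier). This is not the actual 
porting code and does not use the opentracing library: the TraceContext class, 
the key names and the ids are all hypothetical stand-ins, and a plain map plays 
the role of the RPC header.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a span context; NOT an opentracing type. */
final class TraceContext {
    // Hypothetical header keys; a real tracer defines its own wire format.
    static final String TRACE_ID_KEY = "x-trace-id";
    static final String SPAN_ID_KEY = "x-span-id";

    final String traceId;
    final String spanId;

    TraceContext(String traceId, String spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }

    /** Client side: copy the ids into the outgoing request's carrier. */
    void inject(Map<String, String> carrier) {
        carrier.put(TRACE_ID_KEY, traceId);
        carrier.put(SPAN_ID_KEY, spanId);
    }

    /** Server side: rebuild the caller's context, if the request carried one. */
    static Optional<TraceContext> extract(Map<String, String> carrier) {
        String traceId = carrier.get(TRACE_ID_KEY);
        String spanId = carrier.get(SPAN_ID_KEY);
        if (traceId == null || spanId == null) {
            return Optional.empty(); // untraced request
        }
        return Optional.of(new TraceContext(traceId, spanId));
    }

    /** A child span keeps the trace id and gets its own span id. */
    TraceContext newChild(String childSpanId) {
        return new TraceContext(traceId, childSpanId);
    }
}

final class TraceContextDemo {
    public static void main(String[] args) {
        TraceContext clientSpan = new TraceContext("trace-1", "span-1");

        // The map stands in for the RPC header sent to the server.
        Map<String, String> rpcHeader = new HashMap<>();
        clientSpan.inject(rpcHeader);

        // The server continues the same trace under a new span id.
        TraceContext serverSpan = TraceContext.extract(rpcHeader)
                .map(parent -> parent.newChild("span-2"))
                .orElseGet(() -> new TraceContext("trace-new", "span-root"));

        System.out.println(serverSpan.traceId + " / " + serverSpan.spanId);
    }
}
```

The point of the pattern is that only the carrier crosses the process boundary, 
so the server can continue the caller's trace without sharing any tracer state.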

Some observations:
# Porting the code in Hadoop seems straightforward.
# I am not aware of anyone using htrace in production, so I don't expect too 
much resistance to replacing it. (Shout out if this is not the case.)
# Opentracing is becoming the de facto tracing standard, so embracing it makes 
it possible to trace end-to-end, from non-Hadoop applications into Hadoop.

Some possible hurdles:
# To pass the trace id around, we'll need to update the client -> namenode RPC 
messages, as well as the client -> datanode RPC and the KMS REST API, so wire 
compatibility needs to be considered. (Some messages already carry an htrace 
trace id. Would it make sense to replace the htrace trace id field with an 
opentracing trace id field? Or should the opentracing trace id be appended? 
Hopefully there's not much overhead.)
# Opentracing is just a set of APIs. We used Jaeger as the implementation. I 
can see that people might want a more neutral implementation. For example, 
Jaeger comes from Uber, and people might not want to use it (hey, any Lyft 
developers here? :))
# Community adoption: I am aware that HBase uses HTrace. So if we switch to 
opentracing, some coordination will be needed to convince the HBase community 
to switch too (I'd be happy to contribute). I am also hoping to convince other 
communities to adopt opentracing. It's not too interesting if opentracing is 
adopted in Hadoop but not in Hive, Spark, or Kafka.
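
On the first hurdle, a rough sketch of what the "append" option could look like 
on the receiving side: the message keeps its existing trace id field and gains 
a new optional one, and the receiver prefers the new field when present. This 
is only an illustration under assumptions; RpcTraceHeader, its field names and 
the hex rendering of the legacy id are hypothetical and do not reflect the 
actual Hadoop message definitions.

```java
import java.util.Optional;

/** Hypothetical RPC header carrying both a legacy and a new trace field. */
final class RpcTraceHeader {
    // Stand-in for the existing htrace trace id field; null when absent.
    private final Long legacyTraceId;
    // Stand-in for an appended opentracing trace id field; null when absent.
    private final String newTraceId;

    RpcTraceHeader(Long legacyTraceId, String newTraceId) {
        this.legacyTraceId = legacyTraceId;
        this.newTraceId = newTraceId;
    }

    /**
     * Receiver-side resolution: prefer the new field, fall back to the
     * legacy one (rendered as hex, an arbitrary choice for this sketch),
     * and report "untraced" when neither field is set.
     */
    Optional<String> resolveTraceId() {
        if (newTraceId != null) {
            return Optional.of(newTraceId);
        }
        if (legacyTraceId != null) {
            return Optional.of(Long.toHexString(legacyTraceId));
        }
        return Optional.empty();
    }
}
```

With this shape, an old client that only sets the legacy field and a new client 
that sets the appended field can both talk to the same server. Replacing the 
field outright would avoid carrying two ids but would break that mix.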

Thoughts?

> Remove HTrace support
> ---------------------
>
>                 Key: HADOOP-15566
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15566
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 3.1.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.



