[ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562025#comment-16562025 ]

BOGDAN DRUTU commented on HADOOP-15566:
---------------------------------------

Hello all,

First, sorry for jumping into this issue. I will try to be short (edited 
after I finished the comment: I was wrong) and as project independent as 
possible (for the record, I am one of the main contributors to OpenCensus, 
and in my previous life I debugged a lot of BigTable issues using the same 
technology as OpenCensus).

Some comments about other comments in this issue:

[~bensigelman] - FYI: OpenCensus does not enforce any wire format. The format 
is configurable and we are adding support for the w3c standard.

[~elek] - About OT vs OC, my personal opinion is that the difference is the 
philosophy behind these projects. OT was designed with the mindset of being an 
open-source API for vendors to implement, and because of this certain tradeoffs 
were made to help some vendors (as [~michaelsembwever] mentioned). OC was 
designed to be a fully implemented library that supports multiple different 
backends (Zipkin, Jaeger, Stackdriver, AppInsight, etc.) as well as in-process 
debugging capabilities. For example, one of the key features that I used a lot 
when I debugged BigTable issues is what OpenCensus calls z-pages (in-process 
handlers to track active requests, in-memory latency-based sampled spans, 
stats, etc.). You can take a look here: 
[https://opencensus.io/core-concepts/z-pages/#1].
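
To give an idea of what that looks like in practice, here is a minimal sketch 
of turning on the z-pages in a Java process. It assumes the 
io.opencensus:opencensus-contrib-zpages artifact is on the classpath; the class 
and method names are from my memory of the OpenCensus Java API, so please check 
them against whatever version you would pick up.

{code:java}
// Minimal sketch: start the OpenCensus z-pages so active requests and
// latency-sampled spans can be inspected from a browser while debugging.
// Assumes io.opencensus:opencensus-contrib-zpages is on the classpath.
import io.opencensus.contrib.zpages.ZPageHandlers;

public class ZPagesExample {
  public static void main(String[] args) throws Exception {
    // Serves /tracez, /traceconfigz, /rpcz and /statsz on the given port.
    ZPageHandlers.startHttpServerAndRegisterAll(8080);

    // ... run the instrumented service; spans recorded in this process show up
    // at http://localhost:8080/tracez while it is running.
    Thread.currentThread().join();
  }
}
{code}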

Based on my limited experience, there are 3 components that are critical in the 
instrumentation of a service:
 # Wire propagation (I saw a previous discussion about this). 
[https://github.com/w3c/distributed-tracing] is a w3c standard proposed by a 
couple of APM vendors and cloud providers. Even though the format is mostly 
focused on HTTP requests, HBase can define its own format if needed; the only 
requirement is the ability to propagate all fields defined in the format 
(trace-id, span-id, trace-options and tracestate). This part is critical when 
HBase is used as a service (e.g. something like Google Bigtable, which works 
with the HBase client): having standard fields that are propagated allows 
service owners to correlate incoming requests from a customer with the internal 
trace. A similar issue may occur when only HDFS is used as a service. (See the 
traceparent sketch after this list.)
 # APIs to start/end a span, record tracing events, etc. There are multiple 
open-source APIs, including OpenCensus, OpenTracing, Zipkin, etc. (See the span 
API sketch after this list.)
 # In-process propagation. This can be implemented in two ways: explicitly 
propagating the current "Span" between function calls, runnables, callables, 
etc., or implicitly, usually using a thread-local mechanism. Regarding the 
previous comment from [~stack] about keeping this working, my personal 
experience is that you can achieve it with the "implicit" mechanism described 
above by having a clean context API (for an example of a context API that works 
well I can recommend [https://grpc.io/grpc-java/javadoc/io/grpc/Context.html]) 
and ensuring that all async calls are wrapped accordingly (e.g. wrapping all 
Executors); the "explicit" mechanism can be very hard to maintain and, in my 
experience, annoying for developers. This part is very important when 
instrumenting the HBase client (which I think should be instrumented in order 
to debug more complex issues) because the client is used as a library, and a 
standard way to propagate the current Span is essential in order to continue 
the same trace between the client application and the Bigtable client. (See the 
context propagation sketch after this list.)
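
To make 1) more concrete, here is a rough sketch of encoding/decoding the four 
fields into a single traceparent-style value, following how the current w3c 
trace-context draft lays them out (version-traceid-spanid-flags, with 
tracestate carried alongside as a separate key/value list). Everything here is 
illustrative, not an HBase or HTrace API; the hex values are the examples from 
the draft.

{code:java}
// Rough sketch of a traceparent-style wire encoding of the four fields above.
// Layout follows the w3c trace-context draft: version-traceid-spanid-flags.
import java.util.Arrays;
import java.util.Locale;

public final class TraceContextCodec {

  /** Encodes trace-id (32 hex chars), span-id (16 hex chars) and flags. */
  static String encode(String traceIdHex, String spanIdHex, byte flags) {
    return String.format(Locale.ROOT, "00-%s-%s-%02x", traceIdHex, spanIdHex, flags);
  }

  /** Splits a traceparent-style value back into its fields; null if malformed. */
  static String[] decode(String value) {
    String[] parts = value.split("-");
    if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) {
      return null;
    }
    return parts; // {version, trace-id, span-id, trace-flags}
  }

  public static void main(String[] args) {
    String header = encode(
        "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", (byte) 0x01);
    System.out.println(header);
    System.out.println(Arrays.toString(decode(header)));
  }
}
{code}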
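
For 2), this is roughly what starting/ending a span and recording an event 
looks like with the OpenCensus Java API (io.opencensus:opencensus-api plus an 
exporter of your choice). The span name is made up, and the calls should be 
double-checked against the release you use:

{code:java}
// Sketch of component 2: start/end a span and record an event with OpenCensus.
import io.opencensus.common.Scope;
import io.opencensus.trace.Tracer;
import io.opencensus.trace.Tracing;

public final class SpanApiExample {
  private static final Tracer tracer = Tracing.getTracer();

  static void handleGet() {
    // startScopedSpan() starts the span and makes it current until the scope closes.
    try (Scope scope = tracer.spanBuilder("HRegion.get").startScopedSpan()) {
      tracer.getCurrentSpan().addAnnotation("looking up row");
      // ... do the actual work here ...
    } // the span is ended automatically when the scope is closed
  }

  public static void main(String[] args) {
    handleGet();
  }
}
{code}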
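
And for 3), a minimal sketch of the "implicit" propagation using the 
io.grpc.Context API linked above (the io.grpc:grpc-context artifact). The Span 
type and key name are placeholders, and a real integration would hook into the 
tracing library's own context support, but it shows why wrapping Executors is 
what keeps the implicit model working across async boundaries.

{code:java}
// Sketch of component 3: implicit in-process propagation with io.grpc.Context.
import io.grpc.Context;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class ImplicitPropagation {
  // Placeholder span representation, just for the sketch.
  static final class Span {
    final String name;
    Span(String name) { this.name = name; }
  }

  // The key under which the current Span travels with the Context.
  static final Context.Key<Span> SPAN_KEY = Context.key("current-span");

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    // Wrapping the executor propagates the caller's Context into submitted tasks.
    Executor tracingPool = Context.currentContextExecutor(pool);

    Context withSpan = Context.current().withValue(SPAN_KEY, new Span("client.put"));
    Context previous = withSpan.attach();
    try {
      // The async task sees the same current Span as the submitting thread.
      tracingPool.execute(() ->
          System.out.println("async task sees span: " + SPAN_KEY.get().name));
    } finally {
      withSpan.detach(previous);
      pool.shutdown();
    }
  }
}
{code}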

When OpenCensus was designed, I thought it was very important that the library 
ensures all 3 of these components are covered. Some may say that 1) is not 
important when deployed internally, but with the new cloud providers this setup 
becomes more common; others may say that 3) is not important, but when 
instrumenting client libraries (like the HBase client) it becomes very 
important in my opinion. FYI, there are other libraries that solve these issues 
as well, like Zipkin, etc., but I am not here to suggest one particular 
library, just to explain the concepts, the issues, and what is important to 
think about.

 

In my personal opinion OpenTracing does not deal very well with 1) and 3) 
(probably on purpose), but I am not an expert in OpenTracing, nor one of its 
owners/authors/co-authors, so I cannot comment on what is good or bad in their 
design choices.

 

These are my thoughts about what you should consider when you pick one library 
over another. Regarding OpenCensus, we are happy to help if you have any 
questions about our design choices, or about the stats/metrics support in 
OpenCensus and why we think it is very important as well.

 

PS: I hope the comment makes sense; it became longer than expected, but I tried 
to give an overview of the whole instrumentation problem.

> Remove HTrace support
> ---------------------
>
>                 Key: HADOOP-15566
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15566
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 3.1.0
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: security
>         Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, 
> ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.


