[jira] [Commented] (HTRACE-214) De-globalize Tracer.java

Colin Patrick McCabe (JIRA) Fri, 21 Aug 2015 11:47:56 -0700

    [ 
https://issues.apache.org/jira/browse/HTRACE-214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707246#comment-14707246
 ]


Colin Patrick McCabe commented on HTRACE-214:
---------------------------------------------

Thanks for the reviews!

bq. Pity we loose Interfaces and now have Abstracts instead.

I know, but if we want to add more functions in the future, we'll be glad 
they're abstract base classes.

bq. Why can we remove the check for trace id? Because traceid has been purged?

TraceID is still around.  It's just that TraceID is now set by the client API, 
rather than being set by the SpanReceiver.  This change was needed because 
different Tracers will produce spans with different TraceIDs, which will all go 
to the same SpanReceivers.  It's part of the multi-tenancy stuff.

bq. In MilliSpan I see this added but is it ever set or used? "private Tracer 
tracer;"

Good catch, we don't need that.  It was left over from some earlier editing.  
Fixed

bq. NullScope probably needs a test to ensure it not mutable.

Hmm, does TestNullScope cover this, or should we add more?

bq. Why Sampler change from Interface to abstract when there is nothing added 
to the Sampler interface that would warrent it becoming a Class.

I was thinking that in the future we may want to add more sampling strategies.  
For example, we could have a filtering method that operates when tracing is 
already active to implement HTRACE-69.  By default, nothing would be filtered 
(the base class version could do no filtering) but certain samplers might.  Or 
we could have a function that passes the Tracer name or potential span name 
into the sampler, and have it have an effect on the sampling rate.  These ideas 
probably need more polish, but the basic idea is... we have the freedom to add 
more functions if it's an ABC.  If it's an interface then our hands are tied 
forever (or at least until HTrace 5.0) :)

bq. Could move SamplerBuilder into the abstract Sampler as inner class. This 
would tie the two. Sampler.Builder rather than have to hunt around to figure, 
oh, there is a Builder needed to make Samplers.

Yeah, that makes sense.  Same for SpanReceiverBuilder.

bq. We have to have jackson in htrace-core? It is a versioning PITA. I suppose 
we have shading to insulate....

Yeah.  It's needed for the JSON stuff.  That didn't change in this patch.  
Shading seems to have worked well

bq. So, in a JVM, spanreceivers are distingushed by an id

Hadoop has been using an ID for span receivers for a while.  It's used in the 
"add receiver / remove receiver" RPC stuff to uniquely identify a receiver in 
the RPC.  It's a lot cleaner to just put the ID into the SpanReceiver object 
rather than have to maintain a side index inside Hadoop, which may get out of 
date.

bq. \[Trace\] was \[a primary\] entry point. Is it as clear now after this 
refactoring where to go to start 'tracing'?

It's a lot easier to add tracing to an app after the refactoring.  It's just as 
simple as:

{code}
Tracer tracer = new TracerBuilder(conf).name("MyApp").build()
TraceScope scope = tracer.newScope();
try {
  doStuff();
} finally {
  scope.close();
}
{code}

The TracerBuilder takes care of setting up Samplers and SpanReceivers according 
to the configuration.

Previously, you would have had to:
1. Parse your configuration to determine what sampler was configured.
2. Use SamplerBuilder to build that Sampler
3. Parse your configuration to determine what SpanReceiver was configured.
4. Use SpanReceiverBuilder to build an appropriate SpanReceiver
5. Manually set up cleanup hooks to close the SpanReceiver when the application 
exited
6. Use Trace#addReceiver() to set the new SpanReceiver

Trace only looked like a primary entry point if you ignored the complexity in 
SpanReceiverHost, which every app would have had to copy/paste.  Tracer is now 
much more of a primary entry point than Trace was.

bq. What is a TraceExecutorService for. It is 'public'.

This is an existing class which is just getting moved around.  It is basically 
just an ExecutorService that wraps the Runnables in trace scopes.

bq. I used to start a span but now I say there is a newScope?

I never thought {{newSpan}} was a good function name, since what it really 
created was a {{TraceScope}} object.  Sometimes a {{Span}} was created, but 
only when tracing.  So the name was actually kind of a lie.

bq. Shutdown hooks are PITA. Doesn't HDFS install one. Will this interfere? 
HBase does. Does order in which hooks run matter?

This is more for short-running utilities like FsShell.  HDFS and HBase 
themselves don't tend to shut down unless someone kills the process, and if 
they do, tracing isn't so important.  We could make this shutdown hook optional 
if there are cases where it doesn't work or should be tied into something else.

bq. Should TracerPool be private?

Hadoop uses it to add and remove SpanReceivers to implement its tracing RPC 
protocol.

> De-globalize Tracer.java
> ------------------------
>
>                 Key: HTRACE-214
>                 URL: https://issues.apache.org/jira/browse/HTRACE-214
>             Project: HTrace
>          Issue Type: Sub-task
>    Affects Versions: 4.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HTRACE-214.001.patch, HTRACE-214.002.patch, 
> HTRACE-214.003.patch, HTRACE-214.004.patch, HTRACE-214.005.patch, 
> HTRACE-214.006.patch, HTRACE-214.007.patch
>
>
> De-globalize Tracer.java.
> Currently, Tracer is a Singleton managed by TracerHolder.  Instead, Tracer 
> objects should be created by each process or library that needs to use 
> HTrace.  This enables a few things:
> * When the Tracer object is created, we can give it a name.  Then we can use 
> this name in the "process id" of all spans created by that tracer, rather 
> than trying to scrape the JVM name using "questionable" methods.
> * SpanReceivers can be shared between multiple Tracer objects in the same 
> process.  The span receivers are reference counted.  This should eliminate 
> the "double tracing" issues we have had when tracing client libraries inside 
> processes which also want tracing.
> * Tracers can be closed by calling Tracer#close.  If the Tracer being closed 
> is the last tracer in the process, it will close all the span receivers.
> * We will have a TracerFactory that takes care of the details of creating the 
> right span receivers based on the configuration.  This removes some 
> boilerplate that is currently needed to enable HTrace in an application or 
> library.  We can also make SpanReceiverFactory package-private since it will 
> no longer need to be publicly visible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HTRACE-214) De-globalize Tracer.java

Reply via email to