[jira] [Commented] (HTRACE-20) Add after the fact sampler

Colin Patrick McCabe (JIRA) Tue, 16 Dec 2014 11:47:19 -0800

    [ 
https://issues.apache.org/jira/browse/HTRACE-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248786#comment-14248786
 ]

Colin Patrick McCabe commented on HTRACE-20:
--------------------------------------------

Elliott wrote:
bq. Lots of traces are really useful if things are anomalous. However it's 
sometimes the case that they are only anomalous once or very infrequently. So 
the probabilistic sampler can miss all the interesting traces. We should 
provide the ability to keep a trace 100% of the time if something interesting 
happened.

Interesting idea.  Could we just use {{AlwaysSampler}} and aggressively "age 
out" old trace spans in {{htraced}}?  Or were you thinking of having the client 
hold on to the trace spans and send them only if some API call is made later?

In the second case, I wouldn't call this a "sampler" since sampling, by 
definition, implies taking a subset of the traces and discarding the rest.  It 
could be a decorator on the span receiver, perhaps?
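
A rough sketch of what such a decorator might look like, written in Python for brevity rather than against HTrace's Java interfaces.  Everything here is hypothetical: {{BufferingReceiver}}, {{ListReceiver}}, {{receive_span}}, {{keep}} and {{max_traces}} are illustrative names, not HTrace APIs.  The idea is that spans are buffered per trace, forwarded only once the trace is marked as interesting, and aged out otherwise.

```python
from collections import OrderedDict


class ListReceiver:
    """Stand-in for a real span receiver; it just collects spans in a list."""

    def __init__(self):
        self.spans = []

    def receive_span(self, span):
        self.spans.append(span)


class BufferingReceiver:
    """Hypothetical decorator around a span receiver (not an HTrace API).

    Spans are held in memory per trace and forwarded to the wrapped
    receiver only if keep() is called for that trace.  Once more than
    max_traces traces are pending, the oldest pending trace is dropped.
    """

    def __init__(self, wrapped, max_traces=1000):
        self.wrapped = wrapped        # any object with receive_span(span)
        self.max_traces = max_traces
        self.pending = OrderedDict()  # trace_id -> list of buffered spans
        self.kept = set()             # trace ids marked interesting (unbounded in this sketch)

    def receive_span(self, trace_id, span):
        if trace_id in self.kept:
            # The trace was already marked interesting: pass spans straight through.
            self.wrapped.receive_span(span)
            return
        self.pending.setdefault(trace_id, []).append(span)
        if len(self.pending) > self.max_traces:
            # Age out the trace whose first span arrived earliest.
            self.pending.popitem(last=False)

    def keep(self, trace_id):
        """Mark a trace as interesting and flush whatever is still buffered for it."""
        self.kept.add(trace_id)
        for span in self.pending.pop(trace_id, []):
            self.wrapped.receive_span(span)
```

With this shape the client decides after the fact: calling {{keep(trace_id)}} when something anomalous is detected sends the buffered spans, and anything never marked is eventually discarded.  The buffer bound trades memory on the client against how late the "interesting" decision can be made.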

Note that Google's Dapper paper talks about probabilistic tracing:
http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/36356.pdf
They seem to find sampling to be adequate.  Quote:

bq. In practice, we have found that there is still an adequate amount of trace 
data for high-volume services when using a sampling rate as low as 1/1024.

So maybe we'll find the same thing: on a big enough cluster, even infrequent 
things happen "often enough."  My instinct would be to get the web GUI, 
htraced, and so on finished and deployed on a big cluster or two and see 
whether this is a problem in practice.
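
To make "often enough" concrete, here is a small back-of-the-envelope sketch in Python.  It assumes each trace is sampled independently at a fixed rate; the function name and the independence assumption are illustrative and are not taken from HTrace or the Dapper paper.

```python
def p_at_least_one_sampled(occurrences, sample_rate=1.0 / 1024):
    """Probability that at least one of `occurrences` traces is sampled.

    Assumes each trace is sampled independently with probability sample_rate.
    """
    if occurrences < 0:
        raise ValueError("occurrences must be non-negative")
    # P(miss every occurrence) = (1 - rate) ** occurrences
    return 1.0 - (1.0 - sample_rate) ** occurrences
```

Under that assumption, an anomaly that happens exactly once is captured with probability 1/1024, while one that recurs 1024 times across the cluster is captured at least once with probability of about 63%.  That is the gap between the two positions in this thread: sampling works well for recurring events and poorly for one-off events.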

> Add after the fact sampler
> --------------------------
>
>                 Key: HTRACE-20
>                 URL: https://issues.apache.org/jira/browse/HTRACE-20
>             Project: HTrace
>          Issue Type: Bug
>            Reporter: Elliott Clark
>
> Lots of traces are really useful if things are anomalous. However it's 
> sometimes the case that they are only anomalous once or very infrequently. So 
> the probabilistic sampler can miss all the interesting traces. We should 
> provide the ability to keep a trace 100% of the time if something interesting 
> happened.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
