I put this on the jira too, but the algo I found whittled a stream of 10 million items down to ~19.5k samples. With each sample at ~36B, that's ~685KiB. There's a bit more overhead from using a LinkedList and general bookkeeping.
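To make the footprint concrete, here's a rough sketch of what one retained sample costs, assuming a GK/CKMS-style summary (the field names are mine, not a committed API):

    /** Rough sketch of one retained sample in a GK/CKMS-style
     *  streaming-quantile summary; illustrative only. */
    class SampleItem {
      final long value; // observed latency value
      int g;            // rank gap to the previous retained sample
      int delta;        // allowed rank error for this sample

      SampleItem(long value, int g, int delta) {
        this.value = value;
        this.g = g;
        this.delta = delta;
      }
    }
    // On a 64-bit JVM, object header + one 8B long + two 4B ints
    // lands in the low-to-mid 30s of bytes after padding, hence ~36B.
    // Worst case observed: ~19,500 samples * ~36B ≈ 702,000B ≈ 685KiB
    // per metric, all discarded whenever the window's estimator resets.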
Since the estimator is reset every O(minutes) window, and I doubt many metrics see more than 10 million items in O(minutes), it seems lightweight enough to keep going.

I'm planning to do this in hadoop-common's metrics2, since HDFS is also interested, and then backport to 1.x and 2.x. This would thus depend on the metrics2 conversion (HBASE-4050) going through too.

Thanks,
Andrew

On Thu, Jun 28, 2012 at 3:31 PM, Stack <st...@duboce.net> wrote:
> On Tue, Jun 26, 2012 at 6:35 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
> > I wanted to ask off JIRA though about what would be useful in practice.
> > I think it'd be nice to see, for example, accurate 90th and 99th
> > percentile latency over recent 10s, 1m, 5m, and 15m time windows. I
> > found some nice algos to do this, I think at the cost of MBs of memory.
>
> Agree.
>
> How many MBs?
>
> > So, is the "full" solution compelling enough to proceed? Anything
> > missing/extraneous?
>
> What's going on is a critical focus going forward, so I'd say 'full'
> unless the cost is obscene.
>
> St.Ack