Once histograms are implemented in the SDK(s) (Alex, you're tackling
this, right?), it shouldn't be much work to update the Samza worker code
to publish these via the Samza runner APIs (in parallel with Alex's
work to do the same on Dataflow).

On Fri, Aug 14, 2020 at 5:35 PM Alex Amato <[email protected]> wrote:
>
> No one currently has plans to work on adding a generic histogram metric.
>
> But I will be actively working on adding it for a specific set of metrics in
> the next quarter or so:
> https://s.apache.org/beam-gcp-debuggability
>
> After that work, one could look at my PRs for reference when creating new
> metrics that use the same histogram. One may wish to implement the
> UserHistogram use case and use that in the Samza Runner.
>
>
>
>
> On Fri, Aug 14, 2020 at 5:25 PM Ke Wu <[email protected]> wrote:
>>
>> Thank you Robert and Alex. I am not running a Beam job in Google Cloud but
>> with the Samza Runner, so I am wondering if there is any ETA for adding
>> Histogram metrics to the Metrics class so that they can be mapped to the
>> SamzaHistogram metric for the actual emitting.
>>
>> Best,
>> Ke
>>
>> On Aug 14, 2020, at 4:44 PM, Alex Amato <[email protected]> wrote:
>>
>> One of the plans to use the histogram data is to send it to Google 
>> Monitoring to compute estimates of percentiles. This is done using the 
>> bucket counts and bucket boundaries.
>>
>> Here is a description of roughly how it's calculated:
>> https://stackoverflow.com/questions/59635115/gcp-console-how-are-percentile-charts-calculated
>> This is an inexact estimate, but plotting the estimated percentiles over
>> time is often easier to understand and sufficient.
>> (An alternative is a heatmap chart representing histograms over time, i.e. a
>> histogram for each window of time.)
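
Inline, for illustration: a minimal sketch of estimating a percentile from
bucket counts and bucket boundaries. It assumes linear interpolation inside
the bucket that contains the target rank; that is one common approach and not
necessarily the exact method the monitoring backend uses. The function name
and the example bucket layout are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def estimate_percentile(boundaries, counts, p):
    """Estimate the p-th percentile (0 < p <= 100) from histogram buckets.

    boundaries: sorted bucket edges, one more entry than counts.
    counts: number of observations in [boundaries[i], boundaries[i + 1]).
    """
    if len(boundaries) != len(counts) + 1:
        raise ValueError("need exactly one more boundary than bucket counts")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")

    target = total * p / 100.0             # rank of the requested percentile
    cumulative = list(accumulate(counts))  # running totals per bucket
    i = bisect_left(cumulative, target)    # first bucket reaching that rank
    below = cumulative[i - 1] if i > 0 else 0

    # Assumption: values are spread uniformly inside the bucket.
    fraction = (target - below) / counts[i]
    lo, hi = boundaries[i], boundaries[i + 1]
    return lo + fraction * (hi - lo)


# Hypothetical latency buckets in milliseconds.
p50 = estimate_percentile([0, 10, 20, 50, 100], [5, 10, 4, 1], 50)
p90 = estimate_percentile([0, 10, 20, 50, 100], [5, 10, 4, 1], 90)
```

The estimate can never be more precise than the bucket widths, so the choice
of boundaries bounds the error.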
>>
>>
>> On Fri, Aug 14, 2020 at 4:16 PM Robert Bradshaw <[email protected]> wrote:
>>>
>>> You may be interested in the proposed histogram metrics:
>>> https://docs.google.com/document/d/1kiNG2BAR-51pRdBCK4-XFmc0WuIkSuBzeb__Zv8owbU/edit
>>>
>>> I think it'd be reasonable to add percentiles as its own metric type
>>> as well. The tricky bit (though there are lots of resources on this)
>>> is that one would have to publish more than just the percentiles from
>>> each worker to be able to compute the final percentiles across all
>>> workers.
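
Inline, for illustration: a small sketch of why per-worker percentiles are not
enough. Percentiles from different workers cannot be combined directly, but
bucket counts over a shared bucket layout can simply be summed, and the final
percentile can then be computed from the merged counts. The function name and
the sample values are hypothetical; this is not Beam's implementation.

```python
from statistics import median


def merge_bucket_counts(per_worker_counts):
    """Sum bucket counts element-wise across workers.

    Assumption: every worker uses the same bucket boundaries, so position i
    means the same bucket everywhere.
    """
    per_worker_counts = list(per_worker_counts)
    if not per_worker_counts:
        return []
    width = len(per_worker_counts[0])
    if any(len(counts) != width for counts in per_worker_counts):
        raise ValueError("all workers must use the same bucket layout")
    return [sum(column) for column in zip(*per_worker_counts)]


# Combining per-worker medians gives the wrong answer.
worker_a = [1, 1, 1, 1, 9]
worker_b = [10]
average_of_medians = (median(worker_a) + median(worker_b)) / 2  # 5.5
true_median = median(worker_a + worker_b)                       # 1.0

# Bucket counts, by contrast, merge without losing information.
merged = merge_bucket_counts([[4, 1, 0], [0, 0, 1]])            # [4, 1, 1]
```

Summing counts is associative and commutative, so the merge can happen in any
order and at any level of aggregation.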
>>>
>>> On Fri, Aug 14, 2020 at 4:05 PM Ke Wu <[email protected]> wrote:
>>> >
>>> > Hi everyone,
>>> >
>>> > I am looking to add percentile metrics (p50, p90, etc.) to my Beam job, but
>>> > I can only find Counter, Gauge and Distribution metrics. I understand that I
>>> > can calculate percentile metrics in my job itself and use Gauge to emit them;
>>> > however, this is not an easy approach. On the other hand, the Distribution
>>> > metric sounds like the one to use according to its documentation: "A
>>> > metric that reports information about the distribution of reported
>>> > values." However, it seems that it is intended only for SUM, COUNT, MIN, MAX.
>>> >
>>> > The questions are:
>>> >
>>> > 1. Is the Distribution metric only intended for sum, count, min, max?
>>> > 2. If yes, can the documentation be updated to be more specific?
>>> > 3. Can we add percentile metric support, such as Histogram, with a
>>> > configurable list of percentiles to emit?
>>> >
>>> > Best,
>>> > Ke
>>
>>
