[ 
https://issues.apache.org/jira/browse/BEAM-7528?focusedWorklogId=278193&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-278193
 ]

ASF GitHub Bot logged work on BEAM-7528:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Jul/19 11:54
            Start Date: 17/Jul/19 11:54
    Worklog Time Spent: 10m 
      Work Description: kkucharc commented on pull request #8941: [BEAM-7528] 
Save load test metrics according to distribution name
URL: https://github.com/apache/beam/pull/8941#discussion_r304360938
 
 

 ##########
 File path: 
sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py
 ##########
 @@ -138,8 +143,25 @@ def as_dict(self):
 class CounterMetric(Metric):
   def __init__(self, counter_dict, submit_timestamp, metric_id):
     super(CounterMetric, self).__init__(submit_timestamp, metric_id)
-    self.value = counter_dict.committed
     self.label = str(counter_dict.key.metric.name)
+    self.value = counter_dict.committed
+
+
+class DistributionMetrics(Metric):
 
 Review comment:
   I understand your point: calculating a single value might "save" some space in 
BigQuery. Unfortunately, distribution metrics don't support median or other 
percentiles out of the box (maybe that's a good idea for a feature request?), 
and I'm not sure it's a good idea to introduce such logic in a util for reading 
metrics. Mainly because, when we want to save some external metrics (ones not 
expected by MetricsReader) whose meaning we don't know (but which may still be 
valuable to others, e.g. the tfx team), calculating a median or any other 
aggregate could lose meaningful information (maybe whoever defined those 
external metrics needs the max, min, sum or mean). Do you agree?
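   
   To illustrate what keeping things raw could look like (just a rough sketch, 
assuming `DistributionMetrics` receives a metric result whose `committed` value 
is a `DistributionResult`, which in the Python SDK exposes `min`, `max`, `sum`, 
`count` and `mean`):
   
   ```python
   class DistributionMetrics(Metric):
     def __init__(self, dist_dict, submit_timestamp, metric_id):
       super(DistributionMetrics, self).__init__(submit_timestamp, metric_id)
       self.label = str(dist_dict.key.metric.name)
       # Keep every raw component so whoever defined the metric can
       # aggregate it however they need later.
       dist_result = dist_dict.committed
       self.min = dist_result.min
       self.max = dist_result.max
       self.sum = dist_result.sum
       self.count = dist_result.count
       self.mean = dist_result.mean
   ```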
   
   As for time windows, a distribution doesn't store such a thing either. It is 
possible to retrieve the name of the step in which the metric was collected, 
and I can add that to the metric label we save, along the lines of the snippet 
below.
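   
   Something like this (a sketch; `key.step` is available on the metric result):
   
   ```python
   # Include the step name in the label that gets saved to BigQuery.
   self.label = '{}_{}'.format(
       dist_dict.key.step, dist_dict.key.metric.name)
   ```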
   
   This is a confusing problem. To sum everything up:
   - What should we do with metrics that weren't collected in our pipeline but 
somewhere deeper? It has been decided to save them, but I'd suggest that a 
pipeline option called `save_external_metrics` would help (see the sketch 
below). WDYT?
   - In what shape should we save metrics that we know nothing about? IMO as 
raw as possible, because we don't know what is useful for the user who 
collects them.
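   
   For the option itself, a minimal sketch (the class name 
`LoadTestMetricsOptions` is made up here; only `save_external_metrics` is the 
actual proposal):
   
   ```python
   from apache_beam.options.pipeline_options import PipelineOptions
   
   class LoadTestMetricsOptions(PipelineOptions):
     @classmethod
     def _add_argparse_args(cls, parser):
       parser.add_argument(
           '--save_external_metrics',
           default=False,
           action='store_true',
           help='Also save metrics that were not defined by the load test '
                'itself; they are kept as raw as possible.')
   ```
   
   MetricsReader could then check 
`options.view_as(LoadTestMetricsOptions).save_external_metrics` before saving 
anything it doesn't recognize.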
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 278193)
    Time Spent: 5h 20m  (was: 5h 10m)

> Save Python Load Tests metrics correctly according to their namespace
> ---------------------------------------------------------------------
>
>                 Key: BEAM-7528
>                 URL: https://issues.apache.org/jira/browse/BEAM-7528
>             Project: Beam
>          Issue Type: Bug
>          Components: testing
>            Reporter: Kasia Kucharczyk
>            Assignee: Kasia Kucharczyk
>            Priority: Major
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Bug discovered when metrics monitored more than one distribution and saved 
> them all as `runtime`.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
