[ 
https://issues.apache.org/jira/browse/FLINK-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636212#comment-15636212
 ] 

ASF GitHub Bot commented on FLINK-4840:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2753
  
    I think we need a different way to solve this.
    
    This pull request adds a very high overhead to the processing of each 
record:
      - two calls to `System.nanoTime()`
      - Maintaining a Dropwizard Histogram
    
    Without having benchmarked this, I would expect it to significantly 
reduce performance for typical operations like filters or lightweight map 
functions.
    
    Flink is building a streaming runtime that is performance competitive with 
a batch runtime, so the base runtime overhead per record needs to be minimal.
    
    All metrics so far have been designed with that paradigm in mind: Metrics 
must not add any cost to the processing.
      - Metrics are gathered by asynchronous threads
      - The core uses only non-synchronized counters and gauges because they 
are nearly free
      - We consciously decided not to use, in the data paths, any metric type 
that has the overhead of creating objects or maintaining a data structure.
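
    As a rough illustration of the points above, here is a minimal Java 
sketch, not Flink's actual classes. `SimpleCounter` and the reporter setup are 
hypothetical names chosen for this example. The hot path only increments a 
plain `long` field, and a separate thread reads it from time to time, accepting 
that it may see a slightly stale value.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CheapCounterSketch {

    /** Hypothetical counter: a plain long field, no locks, no atomics, no allocation. */
    static final class SimpleCounter {
        private long count;

        /** Hot path: called by the single processing thread for each record. */
        void inc() {
            count++;
        }

        /** Read by the reporter thread; the value may be slightly stale. */
        long getCount() {
            return count;
        }
    }

    public static void main(String[] args) {
        SimpleCounter recordsIn = new SimpleCounter();

        // Asynchronous reporter: samples the counter off the processing thread.
        ScheduledExecutorService reporter = Executors.newSingleThreadScheduledExecutor();
        reporter.scheduleAtFixedRate(
                () -> System.out.println("numRecordsIn=" + recordsIn.getCount()),
                10, 10, TimeUnit.MILLISECONDS);

        // Processing loop: the only per-record metric cost is one field increment.
        for (int i = 0; i < 1_000_000; i++) {
            recordsIn.inc();
        }

        reporter.shutdownNow();
        System.out.println("final=" + recordsIn.getCount());
    }
}
```

    The trade-off in this sketch is precision of the reported value against 
cost on the data path: the reporter may lag behind, but the processing thread 
never synchronizes or allocates for the metric.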
    
    I would suggest first having a design discussion about whether we want to 
measure this and how we can do it for free.
    For example, have a look at the "end to end" latency measurements #2386 via 
latency markers, for an idea of how to measure with minimal impact on the data 
processing.
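
    To make the marker idea concrete, here is a simplified Java sketch. It is 
an assumption-laden illustration, not the implementation in #2386: the `Marker` 
class, the `source` and `sink` methods, and the count-based emission interval 
are all hypothetical. The point is only that regular records are never timed; 
the clock is read once when a marker is created and once when it arrives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class LatencyMarkerSketch {

    /** Hypothetical marker element that travels in-band with the records. */
    static final class Marker {
        final long markedTimeNanos;

        Marker(long markedTimeNanos) {
            this.markedTimeNanos = markedTimeNanos;
        }
    }

    /** Source: emits records and, after every `interval` records, one timestamped marker. */
    static List<Object> source(int numRecords, int interval, LongSupplier clock) {
        List<Object> out = new ArrayList<>();
        for (int i = 1; i <= numRecords; i++) {
            out.add(i);
            if (i % interval == 0) {
                out.add(new Marker(clock.getAsLong()));
            }
        }
        return out;
    }

    /** Sink: regular records pass untimed; only markers trigger a clock read. */
    static List<Long> sink(List<Object> in, LongSupplier clock) {
        List<Long> latencies = new ArrayList<>();
        for (Object element : in) {
            if (element instanceof Marker) {
                latencies.add(clock.getAsLong() - ((Marker) element).markedTimeNanos);
            }
            // Regular records would be processed here without any timing calls.
        }
        return latencies;
    }

    public static void main(String[] args) {
        List<Object> stream = source(1000, 100, System::nanoTime);
        List<Long> latencies = sink(stream, System::nanoTime);
        System.out.println("markers seen: " + latencies.size());
    }
}
```

    With an interval of 100, this sketch reads the clock for roughly 1% of the 
elements instead of twice per record, which is where the saving comes from.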


> Measure latency of record processing and expose it as a metric
> --------------------------------------------------------------
>
>                 Key: FLINK-4840
>                 URL: https://issues.apache.org/jira/browse/FLINK-4840
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics
>            Reporter: zhuhaifeng
>            Assignee: zhuhaifeng
>            Priority: Minor
>             Fix For: 1.2.0
>
>
> We should expose the following Metrics on the TaskIOMetricGroup:
> 1. recordProcessLatency(ms): Histogram measuring the processing time per 
> record of a task. For a chained task, it is the processing time of the chain.  
> 2. recordProcTimeProportion(ms): Meter measuring the proportion of record 
> processing time for infor whether the main cost



