[ 
https://issues.apache.org/jira/browse/BEAM-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-2812:
----------------------------------

    Assignee:     (was: Amit Sela)

> Dropped windows counters / log prints no longer working
> -------------------------------------------------------
>
>                 Key: BEAM-2812
>                 URL: https://issues.apache.org/jira/browse/BEAM-2812
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Aviem Zur
>            Priority: Major
>
> In https://github.com/apache/beam/pull/2838 aggregators were removed from the 
> Spark runner. This caused a regression in the dropped windows counters and logs.
> {{CounterCell}} instances are created ad hoc instead of using the {{Metrics}} 
> class static factory methods: 
> [SparkGroupAlsoByWindowViaWindowSet.java#L213-L219|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L213-L219]
> The context in which the metrics are reported isn't taken into account. Since 
> these counters are passed to a lazily evaluated iterator 
> [SparkGroupAlsoByWindowViaWindowSet.java#L221-L223|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L221-L223]
> the subsequent code always reads the counters immediately after 
> initialization, before they are populated. The log prints therefore never 
> happen, because the conditional statements never see populated counter 
> values 
> [SparkGroupAlsoByWindowViaWindowSet.java#L323-L333|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L323-L333].
> What we want is for these counts to be exposed as metrics as well as logs.
> Additionally, 
> {{org.apache.beam.runners.core.LateDataUtils#dropExpiredWindows}} now takes a 
> {{CounterCell}} as a parameter. {{CounterCell}} is a metrics implementation 
> class and should generally not be used elsewhere (this is also mentioned in 
> its Javadoc). We should look into changing this method to use something else, 
> and perhaps make {{CounterCell}} and similar classes package-private (and move 
> the runner code that uses them into the same package).
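
The lazy-evaluation problem described in the issue can be reproduced without Beam or Spark. The following is a minimal sketch using only the JDK, not the runner's actual code: the class `LazyCounterSketch`, its methods, and the use of `AtomicLong` as a stand-in for `CounterCell` are all hypothetical names chosen for illustration. It shows that a counter handed to a lazily evaluated iterator still reads zero right after the iterator is created, and only holds the real count once the iterator has been consumed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;

public class LazyCounterSketch {

    /**
     * Returns a lazy iterator that skips negative values and increments
     * {@code dropped} for each skipped value. Nothing is counted until the
     * returned iterator is actually advanced.
     */
    static Iterator<Integer> dropNegatives(Iterator<Integer> input, AtomicLong dropped) {
        return new Iterator<Integer>() {
            private Integer pending; // next value to hand out, if already found

            @Override
            public boolean hasNext() {
                while (pending == null && input.hasNext()) {
                    Integer candidate = input.next();
                    if (candidate < 0) {
                        dropped.incrementAndGet(); // side effect happens lazily
                    } else {
                        pending = candidate;
                    }
                }
                return pending != null;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                Integer result = pending;
                pending = null;
                return result;
            }
        };
    }

    /** Mirrors the pattern in the issue: read the counter right after wiring it up. */
    static long countSeenBeforeConsumption(List<Integer> values) {
        AtomicLong dropped = new AtomicLong();
        dropNegatives(values.iterator(), dropped); // iterator created, never advanced
        return dropped.get();
    }

    /** Reads the counter only after the lazy iterator has been fully consumed. */
    static long countSeenAfterConsumption(List<Integer> values) {
        AtomicLong dropped = new AtomicLong();
        Iterator<Integer> kept = dropNegatives(values.iterator(), dropped);
        while (kept.hasNext()) {
            kept.next();
        }
        return dropped.get();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, -2, 3, -4, -5);
        System.out.println(countSeenBeforeConsumption(values)); // prints 0
        System.out.println(countSeenAfterConsumption(values));  // prints 3
    }
}
```

Any check of the form "if the dropped count is greater than zero, log it" placed where `countSeenBeforeConsumption` reads the counter can never fire, which matches the symptom reported above.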



--
This message was sent by Atlassian Jira
(v8.3.4#803005)