[ 
https://issues.apache.org/jira/browse/BEAM-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742473#comment-15742473
 ] 

Daniel Halperin commented on BEAM-1126:
---------------------------------------

This [thread on the dev 
list|https://lists.apache.org/thread.html/03792d43e94b7d1c342617e64511a62a681b7c2c6797055394ff22a8@%3Cdev.beam.apache.org%3E]
 has the additional context Davor is presumably asking for.

I think the confusion is between human-comprehensible and 
machine-comprehensible measures. {{bytes}} was not chosen as the measure of 
backlog with PubSub in mind; it was chosen because bytes are more directly 
related to overhead than events are. Using bytes also allows comparison between 
sources of different types, so {{bytes}} is generally a pretty good signal 
for runners, and a better one than {{events}}.
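
As a rough illustration of why bytes compare across source types, here is a 
minimal, self-contained sketch. It does not use the Beam SDK: 
{{BacklogReporter}}, {{QueueSplit}}, {{LogSplit}} and {{BacklogSketch}} are 
hypothetical stand-ins, and only the method name {{getSplitBacklogBytes()}} 
and the idea of an "unknown" sentinel are borrowed from the real reader.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the backlog part of an unbounded reader; not the Beam class. */
interface BacklogReporter {
    /** Assumed sentinel meaning "backlog cannot be estimated". */
    long BACKLOG_UNKNOWN = -1L;

    /** Backlog of this split in bytes, or BACKLOG_UNKNOWN. */
    long getSplitBacklogBytes();
}

/** Hypothetical queue-like split that knows the size of each pending message. */
final class QueueSplit implements BacklogReporter {
    private final long[] pendingMessageSizes;

    QueueSplit(long... pendingMessageSizes) {
        this.pendingMessageSizes = pendingMessageSizes.clone();
    }

    @Override
    public long getSplitBacklogBytes() {
        long total = 0;
        for (long size : pendingMessageSizes) {
            total += size;
        }
        return total;
    }

    /** Event count, shown only to contrast with the byte measure. */
    long pendingEvents() {
        return pendingMessageSizes.length;
    }
}

/** Hypothetical log-like split that only knows byte offsets, not event counts. */
final class LogSplit implements BacklogReporter {
    private final long currentOffset;
    private final long endOffset;

    LogSplit(long currentOffset, long endOffset) {
        this.currentOffset = currentOffset;
        this.endOffset = endOffset;
    }

    @Override
    public long getSplitBacklogBytes() {
        return Math.max(0L, endOffset - currentOffset);
    }
}

final class BacklogSketch {
    /** Sums byte backlog across splits; unknown if any split cannot report. */
    static long totalBacklogBytes(List<? extends BacklogReporter> splits) {
        long total = 0;
        for (BacklogReporter split : splits) {
            long bytes = split.getSplitBacklogBytes();
            if (bytes == BacklogReporter.BACKLOG_UNKNOWN) {
                return BacklogReporter.BACKLOG_UNKNOWN;
            }
            total += bytes;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two pending events, one tiny and one large: the event count hides the load.
        QueueSplit queue = new QueueSplit(10L, 2_000_000L);
        LogSplit log = new LogSplit(500L, 1_500L);
        List<BacklogReporter> splits = Arrays.asList(queue, log);
        System.out.println("events in queue split: " + queue.pendingEvents());
        System.out.println("total backlog bytes:   " + totalBacklogBytes(splits));
    }
}
```

The log-like split has no natural notion of "events" until its records are 
parsed, yet its backlog still adds up with the queue-like split's when both 
report bytes.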

If the purpose of exposing {{events}} is purely human visibility, this is 
probably better done through metric or aggregator reporting. [~bchambers] 
has been thinking the most about metrics recently; maybe he has additional 
thoughts?

> Expose UnboundedSource split backlog in number of events
> --------------------------------------------------------
>
>                 Key: BEAM-1126
>                 URL: https://issues.apache.org/jira/browse/BEAM-1126
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Aviem Zur
>            Assignee: Daniel Halperin
>            Priority: Minor
>
> Today {{UnboundedSource}} exposes split backlog in bytes via 
> {{getSplitBacklogBytes()}}.
> There is value in exposing backlog in number of events as well, since this 
> number can be more human-comprehensible than bytes, for example via 
> {{getSplitBacklogEvents()}} or {{getSplitBacklogCount()}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
