Hi Aviem,

Another good question. There's no strong reason not to have Count in addition to Bytes.
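
To make the idea concrete, here is a rough sketch of a count signal sitting next to the bytes one. It is only an illustration: BacklogSketch, BacklogReporter, QueueReader and the -1 "unknown" sentinel are all hypothetical names and assumptions made for this example, not the actual Beam classes, and getSplitBacklogCount() is just the name proposed in this thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch only: this does not use or reproduce the real Beam classes.
public class BacklogSketch {
    // Assumed sentinel meaning "backlog not known".
    static final long BACKLOG_UNKNOWN = -1L;

    // Hypothetical reader-side interface exposing both signals.
    interface BacklogReporter {
        long getSplitBacklogBytes();

        // Hypothetical count-based signal. It defaults to "unknown" so that
        // implementations that only report bytes keep compiling unchanged.
        default long getSplitBacklogCount() {
            return BACKLOG_UNKNOWN;
        }
    }

    // Toy in-memory reader: the backlog is whatever is still queued.
    static final class QueueReader implements BacklogReporter {
        private final Deque<byte[]> pending = new ArrayDeque<>();

        void enqueue(byte[] record) {
            pending.addLast(record);
        }

        @Override
        public long getSplitBacklogBytes() {
            long total = 0;
            for (byte[] record : pending) {
                total += record.length; // sum of serialized sizes
            }
            return total;
        }

        @Override
        public long getSplitBacklogCount() {
            return pending.size(); // number of unread records
        }
    }

    public static void main(String[] args) {
        QueueReader reader = new QueueReader();
        reader.enqueue(new byte[10]);
        reader.enqueue(new byte[5]);
        System.out.println("bytes=" + reader.getSplitBacklogBytes()
                + " count=" + reader.getSplitBacklogCount());
    }
}
```

Giving the count method a default that reports "unknown" is one possible design choice: sources that cannot cheaply count their unread records would not be forced to, and a runner could fall back to bytes whenever the count is unavailable.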

Practically, in the Dataflow runner we found bytes to be the best signal here. I won't go deeply into why, but here are two intuitions:

* Beam is designed to enable runners to minimize the per-element overhead; that's why we have multi-element bundles in the first place.
* If serialization is one of the main overheads in your system, then bytes is often what you are going to care about.

That said, other runners may (and surely do) work very differently from Dataflow. It's totally reasonable to add these signals to the APIs if there is a runner that would benefit from using them!

Dan

On Tue, Nov 29, 2016 at 12:42 AM, Aviem Zur <aviem...@gmail.com> wrote:

> Hi,
>
> Today UnboundedSource exposes split backlog in bytes via
> getSplitBacklogBytes()
>
> I think there is much value in exposing backlog in number of events as
> well, since this number can be more human-comprehensible than bytes.
> Something like getSplitBacklogEvents() or getSplitBacklogCount().
> Thoughts?