damccorm commented on code in PR #37565:
URL: https://github.com/apache/beam/pull/37565#discussion_r2850304115
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -178,6 +178,8 @@ def __init__(
max_batch_duration_secs: Optional[int] = None,
max_batch_weight: Optional[int] = None,
element_size_fn: Optional[Callable[[Any], int]] = None,
+ length_fn: Optional[Callable[[Any], int]] = None,
+ bucket_boundaries: Optional[list[int]] = None,
Review Comment:
Can we make it clear these are batching parameters? e.g. `batch_length_fn`
and `batch_bucket_boundaries`?
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -190,6 +192,11 @@ def __init__(
before emitting; used in streaming contexts.
max_batch_weight: the maximum weight of a batch. Requires
element_size_fn.
element_size_fn: a function that returns the size (weight) of an element.
+ length_fn: a callable mapping an element to its length. When set with
+ max_batch_duration_secs, enables length-aware bucketed keying so
+ elements of similar length are batched together.
+ bucket_boundaries: sorted list of positive boundary values for length
Review Comment:
Could we add more detail to this description, similar to the one below?
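
A minimal sketch of how length-aware bucketed keying could work, for illustration only. The helper `bucket_key` and the example boundaries below are hypothetical, not Beam's actual implementation; they just mirror the `length_fn` / `bucket_boundaries` parameters under review, assigning each element a bucket index so that similar-length elements end up batched together.

```python
import bisect

def bucket_key(element, length_fn, bucket_boundaries):
    """Return the bucket index for an element's length.

    bucket_boundaries must be a sorted list of positive ints; an element
    whose length falls below the first boundary gets bucket 0, between
    the first and second gets bucket 1, and so on. Hypothetical helper,
    not part of apache_beam.ml.inference.base.
    """
    return bisect.bisect_left(bucket_boundaries, length_fn(element))

# Example: bucket strings by character length.
boundaries = [16, 64, 256]
texts = ["hi", "a" * 100, "b" * 20, "c" * 300]
keys = [bucket_key(t, len, boundaries) for t in texts]
# lengths 2, 100, 20, 300 -> buckets 0, 2, 1, 3
```

Keying on the bucket index (rather than the raw length) keeps the number of distinct batching keys small while still grouping elements of similar length, which limits padding waste when batches are tensorized.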
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]