Thanks all for your feedback. I've been looking into the Batched DoFns, and will have a follow-up on how we can best interact with them.
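
In the meantime, here is a rough sketch of the direction I'm exploring, along the lines of Brian's suggestion below. It leans on the NumpyArray typehint and the N batch-dimension placeholder from the batch typehints module he linked ([1] in his mail); the DoFn itself and its shapes are purely illustrative, not the actual RunInference implementation:

    from typing import Iterator

    import numpy as np
    import apache_beam as beam
    # Assumes N (batch-dimension placeholder) and NumpyArray are exposed by the
    # batch typehints module Brian links below; adjust the import if the names differ.
    from apache_beam.typehints.batch import N, NumpyArray


    class BatchedModelDoFn(beam.DoFn):
      """Toy stand-in for an inference DoFn that consumes whole (N, 10) batches."""

      def process_batch(
          self, batch: NumpyArray[np.int64, (N, 10)]
      ) -> Iterator[NumpyArray[np.int64, (N, 10)]]:
        # The BatchConverter stacks the individual (10,) elements into a single
        # (N, 10) array before this method is called, so the stacking policy is
        # expressed through the typehint rather than a user-supplied function.
        yield batch * 2  # placeholder for a real model call

      def infer_output_type(self, input_element_type):
        # Declare the element-wise output type (same as the input here).
        return input_element_type

A pytorch version would presumably need its own size-parameterized typehints, as Brian notes, since torch types aren't parameterized by shape today. More in the follow-up.
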
On Mon, Aug 15, 2022 at 7:16 PM Robert Bradshaw <rober...@google.com> wrote:
> Thanks. I added some comments to the doc.
>
> I agree with Brian that it makes sense to figure out how this
> interacts with batched DoFns, as we'll want to migrate to that.
> (Perhaps they're already ready to migrate to as a first step?)
>
> On Fri, Aug 12, 2022 at 1:03 PM Brian Hulette via dev
> <dev@beam.apache.org> wrote:
> >
> > Hi Andy,
> >
> > Thanks for writing this up! This seems like something that Batched DoFns
> > could help with. Could we make a BatchConverter [1] that represents the
> > necessary transformations here, and define RunInference as a Batched DoFn?
> > Note that the Numpy BatchConverter already enables users to specify a batch
> > dimension using a custom typehint, like NumpyArray[np.int64, (N, 10)] (the
> > N identifies the batch dimension) [2]. I think we could do something
> > similar, but with pytorch types. It's likely we'd need to define our own
> > typehints though, I suspect pytorch typehints aren't already parameterized
> > by size.
> >
> > Brian
> >
> >
> > [1] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/batch.py
> > [2] https://github.com/apache/beam/blob/3173b503beaf30c4d32a4a39c709fd81e8161907/sdks/python/apache_beam/typehints/batch_test.py#L42
> >
> > On Fri, Aug 12, 2022 at 12:36 PM Andy Ye via dev <dev@beam.apache.org> wrote:
> >>
> >> Hi everyone,
> >>
> >> I've written up a design doc [1] on controlling batching in
> >> RunInference. I'd appreciate any feedback. Thanks!
> >>
> >> Summary:
> >> Add a custom stacking function to RunInference to enable users to
> >> control how they want their data to be stacked. This addresses issues
> >> regarding data that have existing batching dimensions, or different sizes.
> >>
> >> Best,
> >> Andy
> >>
> >> [1] https://docs.google.com/document/d/1l40rOTOEqrQAkto3r_AYq8S_L06dDgoZu-4RLKAE6bo/edit#