I think for now it's fine to just require Singleton partitioning for this.
In the future we could add a couple of optimizations:
- Recognize elementwise np.ufunc implementations. I think we can do this by
looking at the ufunc's signature attribute [1], which is None for plain
elementwise ufuncs.
- Allow the user to indicate their function is elementwise with a
beam-specific argument (as Robert suggested).

[1]
https://numpy.org/doc/stable/reference/generated/numpy.ufunc.signature.html#numpy.ufunc.signature
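
To make the two ideas above concrete, here is a rough sketch, not existing Beam code. It assumes that a ufunc whose signature attribute is None can be treated as elementwise. The helper names is_elementwise_ufunc and needs_singleton, and the elementwise argument, are made up for illustration.

```python
import numpy as np


def is_elementwise_ufunc(func):
    """Return True if func is a NumPy ufunc with no core signature.

    For a plain (non-generalized) ufunc such as np.add, `signature` is None,
    meaning it operates element by element. Generalized ufuncs such as
    np.matmul carry a core signature string instead.
    """
    return isinstance(func, np.ufunc) and func.signature is None


def needs_singleton(func, elementwise=False):
    """Hypothetical helper: does `func` need all of the data in one place?

    `elementwise` stands in for the user-supplied, Beam-specific argument
    discussed in this thread; it is not an existing pandas or Beam parameter.
    """
    return not (elementwise or is_elementwise_ufunc(func))


# A plain ufunc is recognized automatically.
print(needs_singleton(np.minimum))
# A generalized ufunc has a core signature, so it is not treated as elementwise.
print(needs_singleton(np.matmul))
# An arbitrary callable may look at whole columns, so we stay conservative.
print(needs_singleton(lambda a, b: a if a.sum() < b.sum() else b))
# The user can vouch for their own function.
print(needs_singleton(lambda a, b: a + b, elementwise=True))
```

Anything that is neither a recognized ufunc nor flagged by the user falls back to Singleton partitioning, which matches the conservative default proposed above. Even in the elementwise case the two inputs still have to be aligned on their index, so this would relax the requirement to Index partitioning rather than remove it.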

On Fri, Apr 30, 2021 at 11:52 AM Robert Bradshaw <[email protected]>
wrote:

> On Fri, Apr 30, 2021 at 7:04 AM Irwin Alejandro Rodriguez Ramirez
> <[email protected]> wrote:
> >
> > Awesome, thanks! It helps me a lot,
>
> You're welcome. Looking forward to a PR :).
>
> > Now I don't know how to tell whether the callable would act on a full
> > column or be purely elementwise. Are there any examples of this?
>
> I don't think it's possible to figure this out in general, which is
> why we'd have to take it as explicit user input or use Singleton
> partitioning (which brings everything to the same machine, where it
> doesn't matter because the full columns are then available).
>
> > On Wed, Apr 28, 2021 at 7:57 PM Robert Bradshaw <[email protected]>
> > wrote:
> >>
> >> Hi Irwin,
> >>
> >> Looking forward to your first contribution!
> >>
> >> combine_first, going by the documentation, is completely elementwise.
> >> One could implement it as
> >>
> >>
> >> https://github.com/apache/beam/blob/release-2.28.0/sdks/python/apache_beam/dataframe/frames.py#L182
> >>
> >> and then update the tests to allow this
> >>
> >>
> >> https://github.com/apache/beam/blob/release-2.28.0/sdks/python/apache_beam/dataframe/pandas_doctests_test.py#L98
> >>
> >> The plain old combine has the unfortunate property that the passed
> >> callable may act on a full column, but in practice it is often
> >> elementwise. It could be implemented similarly to the non-pearson
> >> variant of corr:
> >>
> >>
> >> https://github.com/apache/beam/blob/release-2.29.0/sdks/python/apache_beam/dataframe/frames.py#L636
> >>
> >> requiring Singleton partitioning. One could consider adding an extra
> >> flag "elementwise" which would allow one to only require Index
> >> partitioning.
> >>
> >>
> >>
> >>
> >> On Wed, Apr 28, 2021 at 5:00 PM Irwin Alejandro Rodriguez Ramirez
> >> <[email protected]> wrote:
> >> >
> >> > Hi team,
> >> >
> >> > I'm a new contributor to Beam, and I'm trying to implement the
> >> > combine and combine_first methods from BEAM-12017. I haven't been able
> >> > to solve it yet, so I'm looking for suggestions on how to implement them.
> >> > I would appreciate any help you can provide.
> >> >
> >> >
> >> > --
> >> >
> >> > Irwin Alejandro Rodríguez Ramírez | WIZELINE
> >> >
> >> > Software Engineer
> >> >
> >> > [email protected] | +52 1(55) 6694 6649
> >> >
> >> > Paseo de la Reforma #296, Piso 32, Col. Juárez, Del. Cuauhtémoc, 06600 CDMX.
> >> >
> >> > This email and its contents (including any attachments) are being sent to
> >> > you on the condition of confidentiality and may be protected by legal
> >> > privilege. Access to this email by anyone other than the intended recipient
> >> > is unauthorized. If you are not the intended recipient, please immediately
> >> > notify the sender by replying to this message and delete the material
> >> > immediately from your system. Any further use, dissemination, distribution
> >> > or reproduction of this email is strictly prohibited. Further, no
> >> > representation is made with respect to any content contained in this email.
> >
> >
>
