Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
It's fair. If we change the default value, we can perhaps add error-handling logic so that (pcoll) | beam.Flatten() fails with an error that recommends (pcoll) | beam.FlatMap(), instead of saying that the input is not an iterable. On Thu, Mar 21, 2024 at 3:41 PM Joey Tran wrote: > +1 > > On Thu,
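The kind of error handling suggested here could look roughly like the sketch below. It is a toy model in plain Python, not Beam code: `FakePCollection` and `flatten` are hypothetical stand-ins, and the error wording is only an assumption about what a friendlier message might say.

```python
class FakePCollection(list):
    """Hypothetical stand-in for a PCollection (just a list subclass)."""


def flatten(inputs):
    """Toy model of Flatten: merge an iterable of collections into one."""
    # A bare collection (not wrapped in a tuple or list) is most likely a
    # mistake, so point the user at FlatMap rather than failing obscurely.
    if isinstance(inputs, FakePCollection):
        raise TypeError(
            "Flatten expects an iterable of PCollections but got a single "
            "PCollection; to unpack iterable elements, use FlatMap() instead."
        )
    return FakePCollection(x for pc in inputs for x in pc)


pcoll = FakePCollection([[1, 2], [3]])
try:
    flatten(pcoll)
    message = None
except TypeError as err:
    message = str(err)

# Wrapping collections in a tuple is the supported form in this model.
merged = flatten((FakePCollection([1]), FakePCollection([2])))
```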

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
+1 On Thu, Mar 21, 2024 at 6:30 PM Robert Bradshaw via dev wrote: > I would be more comfortable with a default for FlatMap than overloading > Flatten in this way. Distinguishing between > > (pcoll,) | beam.Flatten() > > and > > (pcoll) | beam.Flatten() > > seems a bit error prone. > > >

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Robert Bradshaw via dev
I would be more comfortable with a default for FlatMap than overloading Flatten in this way. Distinguishing between (pcoll,) | beam.Flatten() and (pcoll) | beam.Flatten() seems a bit error prone. On Thu, Mar 21, 2024 at 2:23 PM Joey Tran wrote: > Ah, I misunderstood your original
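The two spellings differ only by a trailing comma, which is why they are easy to confuse. A short plain-Python illustration (the list `pcoll` is just a stand-in for a PCollection):

```python
pcoll = ["a", "b"]  # stand-in for a PCollection

one_element_tuple = (pcoll,)   # trailing comma: a tuple containing pcoll
just_parenthesized = (pcoll)   # no comma: the parentheses do nothing

is_tuple = isinstance(one_element_tuple, tuple)
same_object = just_parenthesized is pcoll
```

Here `is_tuple` and `same_object` are both true, so overloading Flatten on this difference would make the behavior depend on a single comma.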

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
Ah, I misunderstood your original suggestion. That makes sense then. I have already seen someone get a little confused about the names and surprised that Flatten doesn't do what FlatMap does. On Thu, Mar 21, 2024 at 5:20 PM Valentyn Tymofieiev wrote: > Beam throws an error at submission

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
Beam throws an error at submission time in Python if you pass a single PCollection to Flatten. The scenario you describe concerns a one-element list. On Thu, Mar 21, 2024, 13:43 Joey Tran wrote: > I think it'd be quite surprising if beam.Flatten would become equivalent > to FlatMap if passed

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
I think it'd be quite surprising if beam.Flatten became equivalent to FlatMap when passed only a single pcollection. One use case that would break is flattening a variable number of pcollections, possibly only one. In that case,

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
One possible alternative is to define beam.Flatten for a single collection to be functionally equivalent to beam.FlatMap(lambda x: x), but that would be a larger change and such behavior might need to be consistent across SDKs and documented. Adding a default value is a simpler change. I can also

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Robert Bradshaw via dev
IIRC, Java has Flatten.iterables() and Flatten.collections(), the first of which does what you want. Giving FlatMap a default arg of lambda x: x is an interesting idea. The only downside I see is a less clear error if one forgets to provide this (now mandatory) parameter, but maybe that's low

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
That's not really the same thing, is it? `beam.Flatten` combines two or more pcollections into a single pcollection, while `beam.FlatMap` unpacks iterables of elements (i.e. PCollection<Iterable<T>> -> PCollection<T>) On Thu, Mar 21, 2024 at 2:57 PM Valentyn Tymofieiev via dev < dev@beam.apache.org> wrote: > Hi,
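The distinction can be modeled with plain Python lists standing in for pcollections. This is only a sketch of the semantics; `flatten` and `flat_map` are hypothetical helpers, not Beam's API.

```python
from itertools import chain


def flatten(collections):
    """Toy Flatten: merge several collections into one collection."""
    return list(chain.from_iterable(collections))


def flat_map(collection, fn):
    """Toy FlatMap: apply fn to each element and unpack the results."""
    return [out for element in collection for out in fn(element)]


pc1, pc2 = [1, 2], [3]
nested = [[1, 2], [3]]  # one collection whose elements are lists

merged = flatten((pc1, pc2))              # combines two collections
unpacked = flat_map(nested, lambda x: x)  # unpacks each element
single = flatten((nested,))               # one input: elements stay lists
```

In this model, flattening a one-element tuple leaves the elements untouched, which is the behavior a variable-length list of inputs relies on.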

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
Actually, disregard that, Flatten is used in a different context to flatten multiple collections. On Thu, Mar 21, 2024 at 11:55 AM Valentyn Tymofieiev wrote: > Hi, you can use beam.Flatten() instead. > > On Thu, Mar 21, 2024 at 10:55 AM Joey Tran > wrote: > >> Hey all, >> >> Using an identity

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
Hi, you can use beam.Flatten() instead. On Thu, Mar 21, 2024 at 10:55 AM Joey Tran wrote: > Hey all, > > Using an identity function for FlatMap comes up more often than using > FlatMap without an identity function. Would it make sense to use the > identity function as a default? > > > >

Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
Hey all, Using an identity function for FlatMap comes up more often than using FlatMap without an identity function. Would it make sense to use the identity function as a default?
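A minimal sketch of what the proposed default would mean, again using plain lists in place of pcollections. `flat_map`, `identity`, and the `fn` parameter name are hypothetical and chosen for illustration; in Beam today the explicit spelling is `beam.FlatMap(lambda x: x)`.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def identity(x):
    """Return the argument unchanged."""
    return x


def flat_map(
    elements: Iterable[T],
    fn: Callable[[T], Iterable[U]] = identity,
) -> List[U]:
    """Toy FlatMap: call fn on each element and concatenate the results."""
    out: List[U] = []
    for element in elements:
        out.extend(fn(element))
    return out


flat = flat_map([[1, 2], [3], []])         # identity default unpacks elements
words = flat_map(["a b", "c"], str.split)  # an explicit function still works
```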

Re: [VOTE] Release 2.55.0, release candidate #3

2024-03-21 Thread Danny McCormick via dev
+1 - validated some ML examples with the interactive runner Thanks, Danny On Thu, Mar 21, 2024 at 9:21 AM Jan Lukavský wrote: > +1 (binding) > > Tested Java SDK with FlinkRunner. > > Jan > On 3/20/24 22:40, Chamikara Jayalath via dev wrote: > > +1 (binding) > > Tested multi-lang Java/Python

Re: [VOTE] Release 2.55.0, release candidate #3

2024-03-21 Thread Jan Lukavský
+1 (binding) Tested Java SDK with FlinkRunner.  Jan On 3/20/24 22:40, Chamikara Jayalath via dev wrote: +1 (binding) Tested multi-lang Java/Python pipelines and upgrading BQ/Kafka transforms from 2.53.0 to 2.55.0 using the Transform Service. Thanks, Cham On Tue, Mar 19, 2024 at 2:10 PM

Beam High Priority Issue Report (57)

2024-03-21 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/30683 The PreCommit Java