Ah, I misunderstood your original suggestion then. That makes sense then. I have already seen someone get a little confused about the names and surprised that Flatten doesn't do what FlatMap does.
On Thu, Mar 21, 2024 at 5:20 PM Valentyn Tymofieiev <valen...@google.com> wrote: > Beam throws an error at submission time in Python if you pass a single > PCollection to Flatten. The scenario you describe concerns a one-element > list. > > On Thu, Mar 21, 2024, 13:43 Joey Tran <joey.t...@schrodinger.com> wrote: > >> I think it'd be quite surprising if beam.Flatten would become equivalent >> to FlatMap if passed only a single pcollection. One use case that would be >> broken from that is cases where someone might be flattening a variable >> number of pcollections, including possibly only one pcollection. In that >> case, that single pcollection suddenly get FlatMapped. >> >> >> >> On Thu, Mar 21, 2024 at 4:36 PM Valentyn Tymofieiev via dev < >> dev@beam.apache.org> wrote: >> >>> One possible alternative is to define beam.Flatten for a single >>> collection to be functionally equivalent to beam.FlatMap(lambda x: x), but >>> that would be a larger change and such behavior might need to be >>> consistent across SDKs and documented. Adding a default value is a simpler >>> change. >>> >>> I can also confirm that the usage >>> >>> | 'Flatten' >> beam.FlatMap(lambda x: x) >>> >>> is fairly common by inspecting uses of Beam internally. >>> On Thu, Mar 21, 2024 at 1:30 PM Robert Bradshaw via dev < >>> dev@beam.apache.org> wrote: >>> >>>> IIRC, Java has Flatten.iterables() and Flatten.collections(), the first >>>> of which does what you want. >>>> >>>> Giving FlatMap a default arg of lambda x: x is an interesting idea. The >>>> only downside I see is a less clear error if one forgets to provide this >>>> (now mandatory) parameter, but maybe that's low enough to be worth the >>>> convenience? >>>> >>>> On Thu, Mar 21, 2024 at 12:02 PM Joey Tran <joey.t...@schrodinger.com> >>>> wrote: >>>> >>>>> That's not really the same thing, is it? `beam.Flatten` combines two >>>>> or more pcollections into a single pcollection while beam.FlatMap unpacks >>>>> iterables of elements (i.e. PCollection<Iterable<T>> -> PCollection<T>) >>>>> >>>>> On Thu, Mar 21, 2024 at 2:57 PM Valentyn Tymofieiev via dev < >>>>> dev@beam.apache.org> wrote: >>>>> >>>>>> Hi, you can use beam.Flatten() instead. >>>>>> >>>>>> On Thu, Mar 21, 2024 at 10:55 AM Joey Tran <joey.t...@schrodinger.com> >>>>>> wrote: >>>>>> >>>>>>> Hey all, >>>>>>> >>>>>>> Using an identity function for FlatMap comes up more often than >>>>>>> using FlatMap without an identity function. Would it make sense to use >>>>>>> the >>>>>>> identity function as a default? >>>>>>> >>>>>>> >>>>>>> >>>>>>>