Ah, I misunderstood your original suggestion then. That makes sense. I
have already seen someone get a little confused about the names and
surprised that Flatten doesn't do what FlatMap does.
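
For anyone skimming the thread, here is a rough sketch of the distinction in
plain Python. Lists stand in for PCollections, and `flatten` and `flat_map`
are hypothetical helpers written for illustration, not Beam APIs. The identity
default on `flat_map` mirrors the proposal discussed below.

```python
from itertools import chain


def flatten(pcolls):
    """Model of beam.Flatten: merge several collections into one.

    Elements are passed through untouched; only the collections are merged.
    """
    return list(chain.from_iterable(pcolls))


def flat_map(pcoll, fn=lambda x: x):
    """Model of beam.FlatMap: apply fn to each element and unpack the result.

    The identity default is the idea proposed in this thread.
    """
    return [out for element in pcoll for out in fn(element)]


a = [[1, 2], [3]]  # a "PCollection" whose elements are lists
b = [[4]]

merged = flatten([a, b])  # elements stay as lists
unpacked = flat_map(a)    # each element is unpacked
single = flatten([a])     # one-element list: contents come back unchanged
```

In this model, flattening a one-element list leaves the elements alone, which
is the behavior that would change if Flatten were redefined to act like
FlatMap.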

On Thu, Mar 21, 2024 at 5:20 PM Valentyn Tymofieiev <valen...@google.com>
wrote:

> Beam throws an error at submission time in Python if you pass a single
> PCollection to Flatten. The scenario you describe concerns a one-element
> list.
>
> On Thu, Mar 21, 2024, 13:43 Joey Tran <joey.t...@schrodinger.com> wrote:
>
>> I think it'd be quite surprising if beam.Flatten would become equivalent
>> to FlatMap if passed only a single pcollection. One use case that would be
>> broken from that is cases where someone might be flattening a variable
>> number of pcollections, including possibly only one pcollection. In that
>> case, that single pcollection suddenly gets FlatMapped.
>>
>>
>>
>> On Thu, Mar 21, 2024 at 4:36 PM Valentyn Tymofieiev via dev <
>> dev@beam.apache.org> wrote:
>>
>>> One possible alternative is to define beam.Flatten for a single
>>> collection to be functionally equivalent to beam.FlatMap(lambda x: x), but
>>> that would be a larger change and such behavior might need to be
>>> consistent across SDKs and documented. Adding a default value is a simpler
>>> change.
>>>
>>> I can also confirm that the usage
>>>
>>>     |  'Flatten' >> beam.FlatMap(lambda x: x)
>>>
>>> is fairly common, judging by uses of Beam internally.
>>>
>>> On Thu, Mar 21, 2024 at 1:30 PM Robert Bradshaw via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> IIRC, Java has Flatten.iterables() and Flatten.collections(), the first
>>>> of which does what you want.
>>>>
>>>> Giving FlatMap a default arg of lambda x: x is an interesting idea. The
>>>> only downside I see is a less clear error if one forgets to provide this
>>>> (now mandatory) parameter, but maybe that's low enough to be worth the
>>>> convenience?
>>>>
>>>> On Thu, Mar 21, 2024 at 12:02 PM Joey Tran <joey.t...@schrodinger.com>
>>>> wrote:
>>>>
>>>>> That's not really the same thing, is it? `beam.Flatten` combines two
>>>>> or more pcollections into a single pcollection while beam.FlatMap unpacks
>>>>> iterables of elements (i.e. PCollection<Iterable<T>> -> PCollection<T>)
>>>>>
>>>>> On Thu, Mar 21, 2024 at 2:57 PM Valentyn Tymofieiev via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Hi, you can use beam.Flatten() instead.
>>>>>>
>>>>>> On Thu, Mar 21, 2024 at 10:55 AM Joey Tran <joey.t...@schrodinger.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey all,
>>>>>>>
>>>>>>> Using an identity function for FlatMap comes up more often than
>>>>>>> using FlatMap without an identity function. Would it make sense to use
>>>>>>> the identity function as a default?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
