subject:"\[JAVA\] Handling repeated elements when merging two pcollections"

Re: [JAVA] Handling repeated elements when merging two pcollections

2022-08-10 Thread Luke Cwik via user

Sorry, I should have said that you should Flatten and do a GroupByKey, not a CoGroupByKey, making the pipeline like:

PCollectionA -> Flatten -> GroupByKey -> ParDo(EmitOnlyFirstElementPerKey)
PCollectionB -/

The CoGroupByKey will have one iterable per PCollection containing zero or more elements
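One caveat with the Flatten and GroupByKey shape: the values grouped under a key are not delivered in a guaranteed order, so "first element per key" does not by itself mean "the element from PCollectionA". A minimal sketch of one way around this is to tag each value with its source before flattening and have the final step prefer the tagged value. The code below is a plain-Java model of that idea, not Beam code; the class `FlattenGroupKeepA`, the `Tagged` wrapper, and the method names are all hypothetical, and String keys and values are assumed only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of: tag -> Flatten -> GroupByKey -> keep one value per key.
// Every name here is hypothetical; none comes from the thread or the Beam SDK.
class FlattenGroupKeepA {

    // A value plus a marker recording which input collection it came from.
    static final class Tagged {
        final boolean fromA;
        final String value;

        Tagged(boolean fromA, String value) {
            this.fromA = fromA;
            this.value = value;
        }
    }

    // Adds one input's entries to the grouped map, tagging each value.
    private static void addAll(Map<String, List<Tagged>> grouped,
                               List<Map.Entry<String, String>> input,
                               boolean fromA) {
        for (Map.Entry<String, String> e : input) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                   .add(new Tagged(fromA, e.getValue()));
        }
    }

    // Models Flatten followed by GroupByKey: all tagged values under their key.
    static Map<String, List<Tagged>> flattenAndGroup(
            List<Map.Entry<String, String>> a,
            List<Map.Entry<String, String>> b) {
        Map<String, List<Tagged>> grouped = new LinkedHashMap<>();
        addAll(grouped, a, true);
        addAll(grouped, b, false);
        return grouped;
    }

    // Per-key choice: prefer a value tagged as coming from A wherever it sits
    // in the group; otherwise fall back to the first value present.
    static String pick(List<Tagged> group) {
        for (Tagged t : group) {
            if (t.fromA) {
                return t.value;
            }
        }
        return group.get(0).value;
    }

    // Runs the whole model and returns one value per key.
    static Map<String, String> merge(List<Map.Entry<String, String>> a,
                                     List<Map.Entry<String, String>> b) {
        Map<String, String> out = new LinkedHashMap<>();
        flattenAndGroup(a, b).forEach((key, group) -> out.put(key, pick(group)));
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> a =
                List.of(Map.entry("k1", "a1"), Map.entry("k2", "a2"));
        List<Map.Entry<String, String>> b =
                List.of(Map.entry("k2", "b2"), Map.entry("k3", "b3"));
        System.out.println(merge(a, b)); // prints {k1=a1, k2=a2, k3=b3}
    }
}
```

In an actual pipeline the tagging step would presumably be a small mapping transform applied to each input before Flatten, and `pick` would be the body of the final ParDo.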

Re: [JAVA] Handling repeated elements when merging two pcollections

2022-08-10 Thread Shivam Singhal

I think this should solve my problem. Thanks Evan and Luke!

On Thu, 11 Aug 2022 at 1:49 AM, Luke Cwik via user wrote:
> Use CoGroupByKey to join the two PCollections and emit only the first
> value of each iterable with the key.
>
> Duplicates will appear as iterables with more than one value

Re: [JAVA] Handling repeated elements when merging two pcollections

2022-08-10 Thread Luke Cwik via user

Use CoGroupByKey to join the two PCollections and emit only the first value of each iterable with the key. Duplicates will appear as iterables with more than one value, while keys without duplicates will have iterables containing exactly one value.

On Wed, Aug 10, 2022 at 12:25 PM Shivam Singhal
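With a keyed join, each key arrives with one iterable of values per input (in Beam's Java SDK these are read from the `CoGbkResult` with `getAll(tag)`), so "keep A's element, otherwise B's" reduces to a small per-key decision. The sketch below shows only that decision as plain Java so it runs without Beam on the classpath; the class `PreferAFromJoin` and method `preferA` are hypothetical names, and non-null values are assumed.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper modelling the per-key choice made after a keyed join.
class PreferAFromJoin {

    // Returns A's first value if A has any for this key, otherwise B's first
    // value, otherwise empty. Assumes values are non-null.
    static <V> Optional<V> preferA(Iterable<V> fromA, Iterable<V> fromB) {
        Iterator<V> a = fromA.iterator();
        if (a.hasNext()) {
            return Optional.of(a.next());
        }
        Iterator<V> b = fromB.iterator();
        return b.hasNext() ? Optional.of(b.next()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Key present in both inputs: A wins.
        System.out.println(preferA(List.of("a1"), List.of("b1")));
        // Key present only in B: B's value is kept.
        System.out.println(preferA(List.<String>of(), List.of("b1")));
    }
}
```

Because the two inputs stay in separate iterables, this variant does not depend on the order in which values are grouped.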

Re: [JAVA] Handling repeated elements when merging two pcollections

2022-08-10 Thread Evan Galpin

Hi Shivam,

When you say "merge the PCollections" do you mean Flatten, or somehow join? CoGroupByKey[1] would be a good choice if you need to join based on key. You would then be able to implement application logic to keep 1 of the 2 records if there is a way to decipher an element from

[JAVA] Handling repeated elements when merging two pcollections

2022-08-10 Thread Shivam Singhal

I have two PCollections, CollectionA & CollectionB of type KV. I would like to merge them into one PCollection but CollectionA & CollectionB might have some elements with the same key. In those repeated cases, I would like to keep the element from CollectionA & drop the repeated element from

5 matches
