Hi Beam devs,

I'm working on the Euphoria DSL, where we implemented `BroadcastHashJoin` using side inputs, but our tests show some missing data. We use `View.asMultimap()` to get the small side of the join as a view of the form `PCollectionView<Map<K, Iterable<T>>>`. With that, duplicate key-value pairs (an element with the same key and value as some other element) get lost, which is of course unfortunate behavior when doing joins. I believe it all boils down to:
https://github.com/apache/beam/blob/05fb694f265dda0254d7256e938e508fec9ba098/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionViews.java#L293

There, a `HashMultimap` is used to gather all the elements into a `Multimap<K, V>`, and `HashMultimap` does not allow duplicate key-value pairs.

Do you also feel this is a bug? If so, we would like to fix it by replacing `HashMultimap` with `ArrayListMultimap`, which allows duplicate key-value pairs. We can think of some workarounds, but we would prefer to do the fix, if possible.

So what are your opinions, and how should we proceed?

Thank you.
Vaclav Plajt
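
P.S. To make the effect concrete, here is a minimal sketch of the behavior described above. It deliberately uses only `java.util` collections as stand-ins: a `HashSet` per key approximates the set semantics of `HashMultimap`, and an `ArrayList` per key approximates `ArrayListMultimap`. The class name, the `group` helper, and the sample data are hypothetical and are not taken from Beam or Euphoria.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class MultimapDuplicates {

  /** Hypothetical small side of a join: the pair ("k1", "a") occurs twice. */
  static List<String[]> sampleSmallSide() {
    return Arrays.asList(
        new String[] {"k1", "a"},
        new String[] {"k1", "a"},
        new String[] {"k1", "b"});
  }

  /**
   * Groups (key, value) pairs by key. The supplied factory decides which
   * collection holds the values of each key: a set drops equal values,
   * a list keeps every occurrence.
   */
  static Map<String, Collection<String>> group(
      List<String[]> pairs, Supplier<Collection<String>> valuesFactory) {
    Map<String, Collection<String>> grouped = new HashMap<>();
    for (String[] pair : pairs) {
      grouped.computeIfAbsent(pair[0], k -> valuesFactory.get()).add(pair[1]);
    }
    return grouped;
  }

  public static void main(String[] args) {
    // Set-valued grouping (stand-in for HashMultimap): the duplicate is lost.
    System.out.println(group(sampleSmallSide(), HashSet::new).get("k1").size()); // prints 2

    // List-valued grouping (stand-in for ArrayListMultimap): the duplicate is kept.
    System.out.println(group(sampleSmallSide(), ArrayList::new).get("k1").size()); // prints 3
  }
}
```

For a join, the list-valued variant is the one that produces one output row per matching small-side element, which is why we would like the side-input view to keep duplicates.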
