[ 
https://issues.apache.org/jira/browse/BEAM-5184?focusedWorklogId=136990&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136990
 ]

ASF GitHub Bot logged work on BEAM-5184:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Aug/18 15:29
            Start Date: 22/Aug/18 15:29
    Worklog Time Spent: 10m 
      Work Description: lukecwik commented on issue #6257: [BEAM-5184] Multimap 
side inputs with duplicate keys and values are being lost
URL: https://github.com/apache/beam/pull/6257#issuecomment-415074170
 
 
   This seems to break Dataflow; its side input handling is different.
   
   In the Jenkins logs, 
org.apache.beam.sdk.transforms.ViewTest.testMultimapSideInputWithNonDeterministicKeyCoder
 fails with:
   
   Expected: iterable over [<KV{apple, 1}>, <KV{apple, 1}>, <KV{apple, 2}>, 
<KV{banana, 3}>, <KV{blackberry, 3}>] in any order
        but: No item matches: <KV{apple, 1}> in [<KV{apple, 2}>, <KV{apple, 
1}>, <KV{blackberry, 3}>, <KV{banana, 3}>]
   
   I'll try to look into why this is failing. The error message implies a 
comparison issue, since all the values do exist in the actual output.
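
   One way to read the failure message is as a multiset comparison: count how 
often each element occurs on both sides and report what is missing. The sketch 
below does that for the two lists quoted above. It treats each KV as a plain 
string and uses only java.util; the class and method names are hypothetical, 
and it is not how the test's matcher is implemented.

```java
import java.util.*;

// Illustration only: names are hypothetical, elements are compared as strings.
class MultisetDiff {
    // Count how many times each element occurs.
    static Map<String, Integer> counts(List<String> items) {
        Map<String, Integer> c = new TreeMap<>();
        for (String s : items) {
            c.merge(s, 1, Integer::sum);
        }
        return c;
    }

    // Elements that occur more often in `expected` than in `actual`,
    // mapped to the number of missing occurrences.
    static Map<String, Integer> missing(List<String> expected, List<String> actual) {
        Map<String, Integer> actualCounts = counts(actual);
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts(expected).entrySet()) {
            int diff = e.getValue() - actualCounts.getOrDefault(e.getKey(), 0);
            if (diff > 0) {
                out.put(e.getKey(), diff);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Both lists are copied from the failure message above.
        List<String> expected = Arrays.asList(
            "KV{apple, 1}", "KV{apple, 1}", "KV{apple, 2}",
            "KV{banana, 3}", "KV{blackberry, 3}");
        List<String> actual = Arrays.asList(
            "KV{apple, 2}", "KV{apple, 1}",
            "KV{blackberry, 3}", "KV{banana, 3}");
        System.out.println(missing(expected, actual)); // prints {KV{apple, 1}=1}
    }
}
```

   Under this string-level reading, every distinct value is present in the 
actual output, and the two sides differ only in how many times KV{apple, 1} 
occurs.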

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 136990)
    Time Spent: 1.5h  (was: 1h 20m)

> Multimap side inputs with duplicate keys and values are being lost
> ------------------------------------------------------------------
>
>                 Key: BEAM-5184
>                 URL: https://issues.apache.org/jira/browse/BEAM-5184
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Luke Cwik
>            Assignee: Vaclav Plajt
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Side inputs with duplicate values are being lost due to the usage of a 
> set-based multimap.
> [https://github.com/apache/beam/blob/05fb694f265dda0254d7256e938e508fec9ba098/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionViews.java#L293]
>  
> Originating thread: 
> [https://lists.apache.org/thread.html/48bae7cf71bf6851622cdee0e8bc8619c79c4c2273ed63f288202169@%3Cdev.beam.apache.org%3E]
>  
> Please update the existing tests to exercise this scenario as well: 
> https://github.com/apache/beam/blob/9f23ffc97535e7255245f3852b9d2f0939df5a0a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ViewTest.java#L507
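
The description above attributes the loss to a set-based multimap. A minimal 
sketch of that effect, assuming nothing about Beam's actual implementation: it 
uses plain java.util collections instead of the types in PCollectionViews, and 
all class and method names are hypothetical. The sample pairs are the ones from 
the test failure quoted earlier in this message.

```java
import java.util.*;

// Illustration only: class and method names are hypothetical, not Beam APIs.
class MultimapDuplicates {
    // Set-backed multimap: a repeated (key, value) pair is stored once.
    static Map<String, Collection<Integer>> setBacked(List<Map.Entry<String, Integer>> kvs) {
        Map<String, Collection<Integer>> m = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : kvs) {
            m.computeIfAbsent(kv.getKey(), k -> new LinkedHashSet<>()).add(kv.getValue());
        }
        return m;
    }

    // List-backed multimap: every occurrence of a pair is kept.
    static Map<String, Collection<Integer>> listBacked(List<Map.Entry<String, Integer>> kvs) {
        Map<String, Collection<Integer>> m = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : kvs) {
            m.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return m;
    }

    // Total number of values across all keys.
    static int totalValues(Map<String, Collection<Integer>> m) {
        int n = 0;
        for (Collection<Integer> vs : m.values()) {
            n += vs.size();
        }
        return n;
    }

    // The five pairs expected by the test, including the duplicate (apple, 1).
    static List<Map.Entry<String, Integer>> sample() {
        List<Map.Entry<String, Integer>> kvs = new ArrayList<>();
        kvs.add(new AbstractMap.SimpleEntry<>("apple", 1));
        kvs.add(new AbstractMap.SimpleEntry<>("apple", 1));
        kvs.add(new AbstractMap.SimpleEntry<>("apple", 2));
        kvs.add(new AbstractMap.SimpleEntry<>("banana", 3));
        kvs.add(new AbstractMap.SimpleEntry<>("blackberry", 3));
        return kvs;
    }

    public static void main(String[] args) {
        System.out.println(setBacked(sample()));  // {apple=[1, 2], banana=[3], blackberry=[3]}
        System.out.println(listBacked(sample())); // {apple=[1, 1, 2], banana=[3], blackberry=[3]}
    }
}
```

With the set-backed variant, the five input pairs yield four stored values; the 
list-backed variant keeps all five.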



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
