[ 
https://issues.apache.org/jira/browse/BEAM-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464147#comment-17464147
 ] 

Valentyn Tymofieiev commented on BEAM-13009:
--------------------------------------------

Some DynamoDBIO unit tests consistently fail for me when I run the 
:sdks:java:io:amazon-web-services:test suite on a VM that I created from the 
apache-beam Ubuntu-16-based Jenkins image. These tests don't fail when I run 
the same suite on my other (Debian-based) machine, and they also don't fail in 
regular Jenkins cron runs 
([example|https://scans.gradle.com/s/osiae2n3k2tv6/tests/:sdks:java:io:amazon-web-services:test/org.apache.beam.sdk.io.aws.dynamodb.DynamoDBIOWriteTest/testWritePutItemsWithDuplicates?top-execution=1]).

Repro:
{noformat}
gcloud compute instances create valentyn-test-image \
  --project=apache-beam-testing --zone us-central1-b \
  --image-family=jenkins-worker-boot-image --machine-type=n1-highmem-4
gcloud compute ssh valentyn-test-image --zone us-central1-b \
  --project=apache-beam-testing

# lsb_release -a
# No LSB modules are available.
# Distributor ID: Ubuntu
# Description:    Ubuntu 16.04.6 LTS

git clone https://github.com/apache/beam.git
cd beam
sudo su  # two more tests fail without this
./gradlew :sdks:java:io:amazon-web-services:test --scan
{noformat}
Gradle scan: https://gradle.com/s/othuni7yzvgcs
Sample error:
{noformat}
testWriteDeleteItemsWithDuplicates
FAILED
:sdks:java:io:amazon-web-services:test  
org.apache.beam.sdk.io.aws.dynamodb.DynamoDBIOWriteTest
105 tests executed in build, 4 failed
Execution 1 of 1: FAILED, 0.273s, today at 2:38:59 PM PST
Exception
java.lang.AssertionError:       
Expected size:<100> but was:<103> in:   
<[Item{id=97},  
    Item{id=85},        
    Item{id=88},        
    Item{id=90},        
    Item{id=94},        
    Item{id=87},        
    Item{id=93},        
    Item{id=84},        
    Item{id=91},        
    Item{id=96},        
    Item{id=99},        
    Item{id=92},        
    Item{id=98},        
    Item{id=95},        
    Item{id=86},        
    Item{id=89},        
    Item{id=72},        
    Item{id=66},        
    Item{id=57},        
    Item{id=61},        
    Item{id=75},        
    Item{id=78},        
    Item{id=77},        
    Item{id=60},        
    Item{id=63},        
    Item{id=69},        
    Item{id=56},        
    Item{id=80},        
    Item{id=79},        
    Item{id=62},        
    Item{id=58},        
    Item{id=65},        
    Item{id=71},        
    Item{id=74},        
    Item{id=68},        
    Item{id=59},        
    Item{id=67},        
    Item{id=73},        
    Item{id=70},        
    Item{id=64},        
    Item{id=76},        
    Item{id=81},        
    Item{id=83},        
    Item{id=80},        
    Item{id=82},        
    Item{id=11},        
    Item{id=20},        
    Item{id=19},        
    Item{id=13},        
    Item{id=23},        
    Item{id=9}, 
    Item{id=16},        
    Item{id=17},        
    Item{id=3}, 
    Item{id=14},        
    Item{id=10},        
    Item{id=22},        
    Item{id=0}, 
    Item{id=8}, 
    Item{id=21},        
    Item{id=5}, 
    Item{id=6}, 
    Item{id=2}, 
    Item{id=1}, 
    Item{id=12},        
    Item{id=4}, 
    Item{id=24},        
    Item{id=7}, 
    Item{id=15},        
    Item{id=18},        
    Item{id=25},        
    Item{id=26},        
    Item{id=27},        
    Item{id=24},        
    Item{id=28},        
    Item{id=40},        
    Item{id=43},        
    Item{id=34},        
    Item{id=52},        
    Item{id=48},        
    Item{id=39},        
    Item{id=49},        
    Item{id=31},        
    Item{id=46},        
    Item{id=45},        
    Item{id=51},        
    Item{id=37},        
    Item{id=38},        
    Item{id=30},        
    Item{id=42},        
    Item{id=29},        
    Item{id=35},        
    Item{id=33},        
    Item{id=32},        
    Item{id=36},        
    Item{id=41},        
    Item{id=44},        
    Item{id=47},        
    Item{id=50},        
    Item{id=55},        
    Item{id=52},        
    Item{id=54},        
    Item{id=53}]>       
{noformat}
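For reference, the 103 ids in the output above contain three repeats: ids 24, 52, and 80 each appear twice, which accounts for the 3 extra items. The sketch below is only a helper for reading the failure, not part of the Beam test; the class name `DuplicateIds` and its members are made up for illustration, and the id list is transcribed from the assertion message in the order shown.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (not Beam code): finds ids that occur more than once.
class DuplicateIds {
  // Ids copied from the AssertionError output above, in the order shown.
  static final List<Integer> WRITTEN_IDS = Arrays.asList(
      97, 85, 88, 90, 94, 87, 93, 84, 91, 96, 99, 92, 98, 95, 86, 89,
      72, 66, 57, 61, 75, 78, 77, 60, 63, 69, 56, 80, 79, 62, 58, 65, 71,
      74, 68, 59, 67, 73, 70, 64, 76,
      81, 83, 80, 82,
      11, 20, 19, 13, 23, 9, 16, 17, 3, 14, 10, 22, 0, 8, 21, 5, 6, 2, 1,
      12, 4, 24, 7, 15, 18,
      25, 26, 27, 24, 28,
      40, 43, 34, 52, 48, 39, 49, 31, 46, 45, 51, 37, 38, 30, 42, 29, 35,
      33, 32, 36, 41, 44, 47, 50,
      55, 52, 54, 53);

  // Returns, in ascending order, every id seen at least twice.
  static Set<Integer> findDuplicates(List<Integer> ids) {
    Set<Integer> seen = new HashSet<>();
    Set<Integer> duplicates = new TreeSet<>();
    for (Integer id : ids) {
      if (!seen.add(id)) { // add() returns false when the id was already seen
        duplicates.add(id);
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    System.out.println(findDuplicates(WRITTEN_IDS)); // prints [24, 52, 80]
  }
}
```

This only identifies which ids are repeated; it does not explain why the duplicates survive on this VM and not on the other machines.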

> DynamoDBIO misses writing items if `withDeduplicateKeys` is not set
> -------------------------------------------------------------------
>
>                 Key: BEAM-13009
>                 URL: https://issues.apache.org/jira/browse/BEAM-13009
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-aws
>    Affects Versions: 2.27.0
>            Reporter: Lei Li
>            Assignee: Moritz Mack
>            Priority: P1
>              Labels: aws, data-loss, dynamodb
>             Fix For: 2.35.0
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> A new method `withDeduplicateKeys` was added to DynamoDBIO in 2.27.0. It 
> appears to be optional according to the 
> [doc|https://beam.apache.org/releases/javadoc/2.27.0/index.html?org/apache/beam/sdk/io/aws/dynamodb/DynamoDBIO.html],
>  and it is not shown in the examples either. But if no key name is set 
> through it, [the deduplication 
> logic|https://github.com/apache/beam/pull/12583/files#diff-0b5f7a7c1ee0ec890eef82e05e08ef1152421d2c8dcef11fca107f6af0d22e87R479-R492]
>  still takes effect but uses an empty map as the 
> `Map<String, AttributeValue>` part of the deduplication key. As a result, all 
> items have the same key and are deduplicated, so only the last item is 
> written to DynamoDB.
> I think we need to add a check on DeduplicateKeys in 
> `extractDeduplicateKeyValues` and skip the deduplication logic if it's empty.
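
The behaviour described in the issue can be illustrated with a minimal, self-contained sketch. This is not the DynamoDBIO implementation: the class `DedupSketch` and its methods are hypothetical, and plain `Map<String, String>` items stand in for `Map<String, AttributeValue>`. It shows why projecting every item onto an empty key list collapses all items into one, and how skipping deduplication for an empty key list (the fix proposed in the description) avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the issue (not Beam code).
class DedupSketch {
  // Projects an item onto the configured key attributes.
  // With an empty keyNames list this always returns an empty map.
  static Map<String, String> extractKey(Map<String, String> item, List<String> keyNames) {
    Map<String, String> key = new LinkedHashMap<>();
    for (String name : keyNames) {
      key.put(name, item.get(name));
    }
    return key;
  }

  // Always deduplicates: later items replace earlier ones with an equal key.
  static List<Map<String, String>> dedupAlways(
      List<Map<String, String>> items, List<String> keyNames) {
    Map<Map<String, String>, Map<String, String>> byKey = new LinkedHashMap<>();
    for (Map<String, String> item : items) {
      byKey.put(extractKey(item, keyNames), item);
    }
    return new ArrayList<>(byKey.values());
  }

  // Proposed behaviour: skip deduplication when no key names are configured.
  static List<Map<String, String>> dedupUnlessEmpty(
      List<Map<String, String>> items, List<String> keyNames) {
    if (keyNames.isEmpty()) {
      return new ArrayList<>(items);
    }
    return dedupAlways(items, keyNames);
  }

  // Three sample items with distinct "id" values.
  static List<Map<String, String>> sampleItems() {
    List<Map<String, String>> items = new ArrayList<>();
    for (String id : Arrays.asList("1", "2", "3")) {
      Map<String, String> item = new LinkedHashMap<>();
      item.put("id", id);
      items.add(item);
    }
    return items;
  }

  public static void main(String[] args) {
    List<Map<String, String>> items = sampleItems();
    List<String> noKeys = Collections.emptyList();
    System.out.println(dedupAlways(items, noKeys).size());      // prints 1
    System.out.println(dedupUnlessEmpty(items, noKeys).size()); // prints 3
  }
}
```

With a non-empty key list such as `["id"]`, both methods behave the same and keep one item per distinct id.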



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
