[ https://issues.apache.org/jira/browse/KUDU-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke updated KUDU-2809:
------------------------------
    Labels: backup  (was: )

> Incremental backup / diff scan does not handle rows that are inserted and deleted between two incrementals correctly
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2809
>                 URL: https://issues.apache.org/jira/browse/KUDU-2809
>             Project: Kudu
>          Issue Type: Bug
>          Components: backup
>    Affects Versions: 1.9.0
>            Reporter: Will Berkeley
>            Priority: Major
>              Labels: backup
>
> I did the following sequence of operations:
> # Insert 100 million rows
> # Update 1 out of every 11 rows
> # Make a full backup
> # Insert 100 million more rows, after the original rows in keyspace
> # Delete 1 out of every 23 rows
> # Make an incremental backup
> Restore failed to apply the incremental backup, failing with an error like
> {noformat}
> java.lang.RuntimeException: failed to write 1000 rows from DataFrame to Kudu; sample errors:
> {noformat}
> Due to another bug, there are no sample errors, but after hacking around that
> bug, I found that the incremental contained a row with a DELETE action for a
> key that is not present in the full backup. That's because the row was
> inserted in step 4 and deleted in step 5, between backups.
> We could fix this by
> # Making diff scan not return a DELETE for such a row
> # Implementing and using DELETE IGNORE in the restore job
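
The failure and the two proposed fixes can be sketched with a toy model. This is an illustration only, not Kudu code: a Python dict stands in for the restored table, and the (action, key, value) tuples, the "UPSERT" action name, and the helpers apply_incremental and drop_transient_deletes are all hypothetical names invented for this sketch. It assumes the incremental carries only the final DELETE for a row that was inserted and then deleted between backups, as described in the report.

```python
# Toy model only: a dict stands in for a Kudu table, and the tuples stand in
# for the row actions stored in an incremental backup. None of these names
# come from Kudu.


class KeyNotFound(Exception):
    """Raised when a strict DELETE targets a key that is not in the table."""


def apply_incremental(table, actions, ignore_missing_deletes=False):
    """Apply (action, key, value) tuples to `table` in order and return it."""
    for action, key, value in actions:
        if action == "UPSERT":
            table[key] = value
        elif action == "DELETE":
            if key in table:
                del table[key]
            elif not ignore_missing_deletes:
                # Strict DELETE: a missing key is an error (the reported bug).
                raise KeyNotFound(f"DELETE for missing key {key!r}")
            # Otherwise the missing key is skipped (DELETE IGNORE semantics).
        else:
            raise ValueError(f"unknown action {action!r}")
    return table


def drop_transient_deletes(actions, keys_at_start):
    """Fix 1: omit DELETEs for keys that did not exist when the interval began."""
    return [
        a for a in actions
        if not (a[0] == "DELETE" and a[1] not in keys_at_start)
    ]


# State captured by the full backup (step 3).
full_backup = {1: "a", 2: "b"}

# Incremental backup (step 6): key 2 existed in the full backup and was
# deleted; key 3 was inserted in step 4 and deleted in step 5, so only its
# DELETE appears; key 4 was inserted in step 4 and survived.
incremental = [("DELETE", 2, None), ("DELETE", 3, None), ("UPSERT", 4, "d")]

# A strict restore reproduces the failure on key 3.
try:
    apply_incremental(dict(full_backup), incremental)
    strict_failed = False
except KeyNotFound:
    strict_failed = True

# Fix 2: the restore job tolerates DELETEs for missing keys.
restored_ignore = apply_incremental(
    dict(full_backup), incremental, ignore_missing_deletes=True
)

# Fix 1: the diff scan never emits the transient DELETE in the first place.
filtered = drop_transient_deletes(incremental, set(full_backup))
restored_filtered = apply_incremental(dict(full_backup), filtered)
```

In this model both fixes restore the same final table. They differ in where the work happens: fix 1 keeps the incremental free of transient rows, while fix 2 leaves the backup contents unchanged and makes the restore tolerant of them.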