[ https://issues.apache.org/jira/browse/KUDU-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709366#comment-16709366 ]
Brock Noland commented on KUDU-1563:
------------------------------------

Hey all,

I've got a use case which could really benefit from {{INSERT IGNORE DUPLICATE KEY}}, since we will have duplicates at a ratio of 3x, so I am trying to revive this work.

I am not sold on creating an extremely generic approach to server-side error ignoring because I think it would be really easy to abuse. I feel like Kudu contributors should have some control over when ignoring errors is allowed, so that we understand and validate the use case. Furthermore, {{INSERT IGNORE ALL ERRORS}} won't work for my use case: we are generating so many duplicates precisely because we are so concerned about data loss, so silently dropping every other kind of error is not acceptable.

Therefore I am suggesting we add a session-level property that allows the user to ignore certain server-side errors for {{INSERT IGNORE}}, {{UPDATE IGNORE}}, and {{DELETE IGNORE}} operations. Below is a lightly edited summary from [~adar] of my proposal:

* Move forward with a new operation, {{INSERT IGNORE}}, with the understanding that {{UPDATE IGNORE}} and {{DELETE IGNORE}} would be good additions in the future. Together they comprise a new set of write operations that may ignore certain errors.
* Document that {{INSERT IGNORE}} isn't just about duplicate primary keys; the precise set of errors ignored by all of these new write operations is configurable.
* Add new {{KuduSession}} properties that control the set of errors ignored by write operations. This set will initially just be "duplicate primary key on insert". The properties should be combinable (i.e. I should be able to ignore duplicate primary keys AND missing partitions), but the granularity will be session-level, not operation-level. By default no errors are ignored, so that the user is forced to configure the precise set they want to ignore.
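
To make the proposed semantics concrete, here is a rough, self-contained model of them in Python. It is only a sketch of one reading of the proposal, not Kudu code: {{Session}}, {{WriteError}}, {{ignore}} and {{apply_batch}} are hypothetical names invented for illustration, and the "table" is just a dict keyed by primary key. It assumes that only the {{IGNORE}} variants consult the session's set, and that the set starts out empty.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class WriteError(Enum):
    """Hypothetical error kinds; the names are illustrative, not Kudu's."""
    DUPLICATE_PRIMARY_KEY = auto()
    MISSING_PARTITION = auto()


@dataclass
class Session:
    """Toy stand-in for a session. By default no errors are ignored."""
    ignored_errors: set = field(default_factory=set)

    def ignore(self, *errors):
        # Properties are combinable: each call adds to the session-level set.
        self.ignored_errors.update(errors)
        return self


def apply_batch(session, table, ops):
    """Apply (kind, key, value) ops to a dict-backed table.

    Returns the list of (key, error) pairs that are reported back.
    Plain inserts always report duplicates; insert_ignore reports them
    only if the session has not opted in to ignoring that error.
    """
    reported = []
    for kind, key, value in ops:
        if kind not in ("insert", "insert_ignore"):
            raise ValueError(f"unsupported operation: {kind}")
        if key in table:
            ignorable = (
                kind == "insert_ignore"
                and WriteError.DUPLICATE_PRIMARY_KEY in session.ignored_errors
            )
            if not ignorable:
                reported.append((key, WriteError.DUPLICATE_PRIMARY_KEY))
            continue  # the existing row is never overwritten
        table[key] = value
    return reported


if __name__ == "__main__":
    table = {1: "a"}
    session = Session().ignore(WriteError.DUPLICATE_PRIMARY_KEY)
    # INSERT IGNORE and plain INSERT mixed in the same batch.
    errors = apply_batch(
        session, table, [("insert_ignore", 1, "b"), ("insert", 1, "c")]
    )
    print(errors)
```

In this model the granularity is per session rather than per operation: two {{insert_ignore}} ops in the same session always ignore the same set of errors, and a session that never calls {{ignore}} behaves exactly like one using plain inserts.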

> Add support for INSERT IGNORE
> -----------------------------
>
>                 Key: KUDU-1563
>                 URL: https://issues.apache.org/jira/browse/KUDU-1563
>             Project: Kudu
>          Issue Type: New Feature
>            Reporter: Dan Burkert
>            Assignee: Brock Noland
>            Priority: Major
>              Labels: newbie
>
> The Java client currently has an [option to ignore duplicate row key errors|https://kudu.apache.org/apidocs/org/kududb/client/AsyncKuduSession.html#setIgnoreAllDuplicateRows-boolean-],
> which is implemented by filtering the errors on the client side. If we are
> going to continue to support this feature (and the consensus seems to be that
> we probably should), we should promote it to a first class operation type
> that is handled on the server side. This would have a modest perf.
> improvement since less errors are returned, and it would allow INSERT IGNORE
> ops to be mixed in the same batch as other INSERT, DELETE, UPSERT, etc. ops.