[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833508#comment-16833508 ]

Noble Paul commented on SOLR-13320:
-----------------------------------

I've updated the patch to use `failOnVersionConflicts=false` and the ref guide is updated as well.

> add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
> ---------------------------------------------------------------------------------
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
> Issue Type: New Feature
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Assignee: Noble Paul
> Priority: Major
> Attachments: SOLR-13320.patch, SOLR-13320.patch, SOLR-13320.patch
>
> Updates should have an option to ignore duplicate documents and drop them if
> an option {{ignoreDuplicates=true}} is specified

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
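
As an illustration of the renamed parameter, a batch update request might be assembled roughly as follows. This is only a sketch: the host, the collection name `mycollection`, and the document fields are hypothetical, and it assumes `failOnVersionConflicts` is passed as an ordinary request parameter to the `/update` handler. It relies on Solr's optimistic-concurrency convention that a negative `_version_` means "this document must not already exist". Nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url, collection, docs):
    """Return (url, body) for a batch add that should skip docs that already exist."""
    # _version_ = -1 asks Solr to treat an already-indexed id as a version conflict.
    payload = [dict(doc, _version_=-1) for doc in docs]
    # Assumption: the parameter from the patch is sent as a plain request parameter.
    params = urlencode({"failOnVersionConflicts": "false", "commit": "true"})
    url = f"{base_url}/solr/{collection}/update?{params}"
    return url, json.dumps(payload)


# Hypothetical host, collection, and documents.
url, body = build_update_request(
    "http://localhost:8983",
    "mycollection",
    [{"id": "1", "title": "a"}, {"id": "2", "title": "b"}],
)
print(url)
print(body)
```

The body can then be POSTed with `Content-Type: application/json` by whatever HTTP client is already in use.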
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833464#comment-16833464 ]

Noble Paul commented on SOLR-13320:
-----------------------------------

Yeah, the name can be misleading. Even {{continueOnVersionConflict}} suggests that you are continuing the operation irrespective of the version conflict. Continue what? Continue adding the doc? We are really just skipping the docs that have version conflicts. Maybe {{failOnVersionConflicts=false}} is a better name?
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833460#comment-16833460 ]

Yonik Seeley commented on SOLR-13320:
-------------------------------------

Hmmm, when I read "ignoreVersionConflicts" I assumed the wrong behavior: go ahead and add even if there is a version conflict. We aren't really ignoring the conflict, but rather continuing on to the next update/doc in the batch after it happened? I'm not sure I can think of a better name though. Thinking along the lines of [~gus_heck], maybe something like "continueOnVersionConflict" (or "continueOnError" for the general case)?
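
The distinction discussed above (skip the conflicting doc and continue with the rest of the batch, rather than add it anyway) can be sketched with a toy in-memory model. This is not Solr code: `add_batch`, `VersionConflict`, and the dict-based index are hypothetical names used only to illustrate the intended semantics.

```python
class VersionConflict(Exception):
    """Raised when a doc that must not exist is already in the index."""


def add_batch(index, docs, fail_on_version_conflicts=True):
    """Toy model: add docs that must not already exist in `index` (a dict keyed by id).

    Returns the ids that were skipped because of a conflict.
    """
    skipped = []
    for doc in docs:
        if doc["id"] in index:
            if fail_on_version_conflicts:
                # Default behavior in this model: the conflict aborts the batch.
                raise VersionConflict(f"version conflict for id={doc['id']}")
            # Conflict is not an error: drop this doc, keep the existing one,
            # and move on to the next doc in the batch.
            skipped.append(doc["id"])
            continue
        index[doc["id"]] = doc
    return skipped


index = {"1": {"id": "1", "title": "existing"}}
batch = [{"id": "1", "title": "new"}, {"id": "2", "title": "b"}]
skipped = add_batch(index, batch, fail_on_version_conflicts=False)
print(skipped)              # ids dropped because they already existed
print(index["1"]["title"])  # the existing doc is left untouched
```

In this model the existing document is never overwritten in either mode; the flag only decides whether a conflict stops the batch or is passed over.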
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833165#comment-16833165 ]

Tomás Fernández Löbbe commented on SOLR-13320:
----------------------------------------------

LGTM
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832974#comment-16832974 ]

Noble Paul commented on SOLR-13320:
-----------------------------------

I plan to commit this in a day or two
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829856#comment-16829856 ]

Noble Paul commented on SOLR-13320:
-----------------------------------

The version comparison logic is already handled by DUH2. Using a URP is much more complex: we would have to explain to people how to use two URPs, and I still have no idea how it works.
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829853#comment-16829853 ]

Tomás Fernández Löbbe commented on SOLR-13320:
----------------------------------------------

You can combine the {{DocBasedVersionConstraintsProcessor}} with a {{DefaultValueUpdateProcessor}}? I'm just trying to avoid adding yet another way to do something that's already possible with Solr, another random parameter that needs to be tested, maintained and documented (and explained over and over on the users list). Solr already has too many of those, IMO. I'm just advocating for a simpler/smaller API and trying to prevent adding more complexity to already complex code like DistributedUpdateProcessor or DUH2.
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829602#comment-16829602 ]

Noble Paul commented on SOLR-13320:
-----------------------------------

Modifying the payload is not always a good idea. Users usually already have the data in some format, or there are tools that generate that data. This seems purely like a requirement added by Solr, and it does not belong to the data. We shouldn't ask users to generate data in our format; instead we should work with whatever format they have. I'll check it in non-cloud mode as well.
[jira] [Commented] (SOLR-13320) add a param ignoreVersionConflicts=true to updates to not overwrite existing docs
[ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829375#comment-16829375 ]

Tomás Fernández Löbbe commented on SOLR-13320:
----------------------------------------------

Yes, it's not a new parameter, but my point is that it achieves the use case Scott proposed. You just need to have a version field (I don't remember if it can be _version_ or if it needs to be a new version field), which could actually be a constant number in your case, since you don't really care about versioning AFAICT. You could then define an UpdateRequestProcessorChain that uses DocBasedVersionConstraintsProcessor and use that chain for your backfill, knowing that it'll skip any docs that already exist in the index. DocBasedVersionConstraintsProcessor also has the advantage that it works with deletes, by creating tombstones (though, in that case, you'd also have to use the DBVCP for regular updates).

I'm not sure it's a good idea to add a new global parameter to do something that can already be done. Also, looking at your patch, it seems to only work in Cloud mode?
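
The constant-version idea in the comment above can be illustrated with a small toy model. It is not the processor's actual code: `apply_update`, the field name `my_version_l`, and the dict-based index are hypothetical, and the sketch assumes the rule that an update replaces an existing doc only when its version value is strictly greater, with rejected updates silently dropped.

```python
def apply_update(index, doc, version_field="my_version_l", ignore_old_updates=True):
    """Toy model of a doc-based version check. Returns True if the doc was written."""
    existing = index.get(doc["id"])
    if existing is not None and doc[version_field] <= existing[version_field]:
        # The incoming version is not newer than what is already indexed.
        if ignore_old_updates:
            return False  # silently drop the update
        raise ValueError(f"version conflict for id={doc['id']}")
    index[doc["id"]] = doc
    return True


# Backfill with a constant version: a doc that already exists always carries an
# equal version, so the incoming copy is never considered newer and is skipped.
index = {"a": {"id": "a", "my_version_l": 1, "title": "live"}}
backfill = [
    {"id": "a", "my_version_l": 1, "title": "stale copy"},
    {"id": "b", "my_version_l": 1, "title": "missing doc"},
]
written = [doc["id"] for doc in backfill if apply_update(index, doc)]
print(written)
print(index["a"]["title"])
```

Because every document carries the same version value, the check reduces to "does this id already exist?", which is the skip-existing behavior wanted for the backfill.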