[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-28 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828873#comment-16828873
 ] 

Noble Paul commented on SOLR-13320:
---

[~tomasflobbe]
 Well, no. IIRC, {{DocBasedVersionConstraintsProcessor}} can skip docs based 
on the {{_version_}} field in the document (not on a request param).

> add a param ignoreDuplicates=true to updates to not overwrite existing docs
> ---
>
> Key: SOLR-13320
> URL: https://issues.apache.org/jira/browse/SOLR-13320
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> Updates should have an option to ignore duplicate documents and drop them if 
> an option  {{ignoreDuplicates=true}} is specified



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-28 Thread Tomás Fernández Löbbe (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828861#comment-16828861
 ] 

Tomás Fernández Löbbe commented on SOLR-13320:
--

With {{DocBasedVersionConstraintsProcessor}} you can tell Solr to skip 
documents that have a higher (or equal) version than the one you are trying to 
add (see {{ignoreOldUpdates}}). Isn't that what you need?
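
The behavior described in this comment can be sketched as a toy in-memory model. This is not Solr code: the function name accept_update, the dict-based index, and the use of ValueError for a rejected update are assumptions made for illustration. Only the processor name and the ignoreOldUpdates option come from the comment itself.

```python
from typing import Dict


def accept_update(index: Dict[str, int], doc_id: str, new_version: int,
                  ignore_old_updates: bool) -> bool:
    """Toy model of a document-based version constraint.

    `index` maps a document id to the version carried by the indexed document.
    Returns True if the update was applied and False if it was silently
    dropped. Raises ValueError (a stand-in for a rejected request) when the
    update is stale and ignore_old_updates is False.
    """
    current = index.get(doc_id)
    # An indexed version that is higher than or equal to the incoming one
    # means the incoming document is treated as an old update.
    if current is not None and current >= new_version:
        if ignore_old_updates:
            return False  # skip quietly, keep the indexed document
        raise ValueError(
            f"version conflict for {doc_id}: {new_version} <= {current}")
    index[doc_id] = new_version
    return True
```

In this model the version travels with the document, so a plain "do not overwrite if the id already exists" request only works if every document carries a usable version value.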



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-27 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827496#comment-16827496
 ] 

Noble Paul commented on SOLR-13320:
---

How does {{DocBasedVersionConstraintsProcessor}} solve this [~tomasflobbe] ?



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-26 Thread Tomás Fernández Löbbe (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827402#comment-16827402
 ] 

Tomás Fernández Löbbe commented on SOLR-13320:
--

Isn’t this what {{DocBasedVersionConstraintsProcessor}} does? 



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-25 Thread Scott Blum (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826483#comment-16826483
 ] 

Scott Blum commented on SOLR-13320:
---

+1!



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-25 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825990#comment-16825990
 ] 

Noble Paul commented on SOLR-13320:
---

{{ignoreVersionConflicts=true}} makes more sense



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-25 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825968#comment-16825968
 ] 

Shalin Shekhar Mangar commented on SOLR-13320:
--

Thanks [~dragonsinth] for explaining the use-case and the problem.

These are conflicts -- a document was not at the version we wanted it to be. Here 
{{-1}} is just a special version that means the document should not already 
exist. So I think {{ignoreConflicts}} or {{ignoreVersionConflicts}} is more 
appropriate than {{ignoreDuplicates}}. Regardless of what we call the param, 
returning a list of doc IDs that were skipped would be nice to have, as Gus 
noted. {{haltBatchOnError}} is definitely too broad, and it is not always 
possible to recover from errors, e.g. if there is malformed JSON in the middle 
of a batch.



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-24 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825692#comment-16825692
 ] 

Noble Paul commented on SOLR-13320:
---

That just sounds very complex. We will have a tough time explaining it to 
people.



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-24 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825667#comment-16825667
 ] 

Gus Heck commented on SOLR-13320:
-

It would be an error if you sent _version_=-1 as suggested by Shalin. So 
haltBatchOnError=false plus the existing functionality with _version_=-1 covers 
your case, right?



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-24 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825658#comment-16825658
 ] 

Noble Paul commented on SOLR-13320:
---

Well, it's not an error in the strictest sense. 

* Basically, what we want is to ignore a document if it already exists, and
* the response should list the ids of the discarded docs.



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-24 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825630#comment-16825630
 ] 

Gus Heck commented on SOLR-13320:
-

Maybe this could be broadened a bit? An option to continue with a batch even if 
one document has an error. A return response enumerating failed docs and their 
associated messages would also make sense. That would be a generally useful 
feature I think. Call it haltBatchOnError... defaults to true.
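
A minimal sketch of what such an option could look like, written as a toy batch loop rather than Solr code. Everything here is hypothetical: the names process_batch and index_one, the dict returned for failures, and the choice of ValueError as the per-document error are assumptions used only to illustrate the proposal.

```python
from typing import Callable, Dict, List

Doc = Dict[str, str]


def process_batch(docs: List[Doc], index_one: Callable[[Doc], None],
                  halt_batch_on_error: bool = True) -> Dict[str, str]:
    """Index each document, optionally continuing past per-document errors.

    Returns a mapping of document id to error message for every document that
    failed. With halt_batch_on_error=True (the proposed default) the first
    failure is re-raised and the rest of the batch is not processed.
    """
    failures: Dict[str, str] = {}
    for doc in docs:
        try:
            index_one(doc)
        except ValueError as exc:
            if halt_batch_on_error:
                raise
            # Record the failure and keep going with the remaining documents.
            failures[doc.get("id", "<no id>")] = str(exc)
    return failures
```

A caller passing halt_batch_on_error=False would inspect the returned mapping to decide which documents to fix and resend.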



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-04-24 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825485#comment-16825485
 ] 

Noble Paul commented on SOLR-13320:
---

[~shalinmangar] I guess we are good to go, right?



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-03-18 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794790#comment-16794790
 ] 

Noble Paul commented on SOLR-13320:
---

bq. "ignoreConflicts" might be a better name.

These are not really "conflicts", right?



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-03-15 Thread Scott Blum (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794015#comment-16794015
 ] 

Scott Blum commented on SOLR-13320:
---

Shalin lemme break this down a bit...

Imagine you're restoring a collection from a backup, but you want to be able to 
accept writes while this is in progress.  You start accepting writes (of new 
data) on the new, empty collection, then in the background you want to backfill 
from your backup copy, but you don't want to overwrite anything that has been 
written recently.

Setting "_version_:-1" on all the incoming backfill docs is almost what you 
want -- add any documents that don't exist, but don't overwrite any documents 
that do exist.  The problem is that the entire batch gets rejected if even one 
document already exists.  We just want a way to be able to ignore conflicts and 
quietly drop the offending documents rather than rejecting the entire batch.
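
The difference between the two behaviors can be sketched with a toy in-memory model of the backfill scenario. This is not Solr's implementation: add_batch, VersionConflict, and the ignore_conflicts flag are hypothetical names, and the all-or-nothing rejection simply mirrors the behavior described in the paragraph above.

```python
from typing import Dict, List

Doc = Dict[str, str]


class VersionConflict(Exception):
    """Stand-in for a batch rejected because a document already exists."""


def add_batch(index: Dict[str, Doc], batch: List[Doc],
              ignore_conflicts: bool = False) -> List[str]:
    """Add documents that must not already exist (the "-1" version case).

    Without ignore_conflicts, one existing id rejects the whole batch and
    nothing is written. With ignore_conflicts, existing documents are left
    untouched and their ids are returned as the list of skipped documents.
    """
    existing = [doc["id"] for doc in batch if doc["id"] in index]
    if existing and not ignore_conflicts:
        raise VersionConflict(f"documents already exist: {existing}")
    skipped: List[str] = []
    for doc in batch:
        if doc["id"] in index:
            skipped.append(doc["id"])  # keep the newer, live write
        else:
            index[doc["id"]] = doc
    return skipped
```

Returning the skipped ids lets the backfill job log or audit what was dropped instead of having to retry the batch one document at a time.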

"ignoreConflicts" might be a better name.



[jira] [Commented] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

2019-03-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791318#comment-16791318
 ] 

Shalin Shekhar Mangar commented on SOLR-13320:
--

If the definition of a duplicate is just having the same id, then that can also be 
done today using optimistic concurrency: use `_version_` with a negative value. 
See 
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-OptimisticConcurrency

If a duplicate is defined by the content of the document, then you need to use the 
SignatureUpdateProcessorFactory
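
For reference, the optimistic concurrency rules can be sketched as a small predicate. This is a simplified reading of the linked Reference Guide section, not Solr's implementation; the function name version_check_passes and the use of None for "no such document" are assumptions for illustration.

```python
from typing import Optional


def version_check_passes(requested: int, existing: Optional[int]) -> bool:
    """Decide whether an update with the given `_version_` may proceed.

    `requested` is the `_version_` value sent with the document and
    `existing` is the version of the indexed document with the same id,
    or None if no such document exists.
    """
    if requested == 0:
        return True                  # no concurrency check, plain overwrite
    if requested < 0:
        return existing is None      # the document must not exist
    if requested == 1:
        return existing is not None  # the document must exist, any version
    return existing == requested     # the versions must match exactly
```

A negative value such as -1 therefore acts as an "insert only if absent" check keyed on the document id.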