quick allowDups questions
Normally this is the type of thing I'd just scour the online docs or the source code for, but I'm under the gun a bit.

Anyway, I need to update some docs in my index because my client program wasn't accurately putting these docs in (values for one of the fields were missing). I'm hoping I won't have to write additional code to go through and delete each existing doc before I add the new one, and I think setting allowDups on the add command to false will allow me to do this. I seem to recall something in the update handler code that goes through and deletes all but the last copy of the doc if allowDups is false - does that sound accurate?

If so, I just need to make sure that solrj properly sets that flag, which leads me to my next question. Does solrj default allowDups to false? If not, what do I need to do to make sure allowDups is set to false when I'm adding these docs?
Re: quick allowDups questions
On 10-Oct-07, at 1:11 PM, Charlie Jackson wrote:

> Anyway, I need to update some docs in my index because my client program wasn't accurately putting these docs in (values for one of the fields were missing). I'm hoping I won't have to write additional code to go through and delete each existing doc before I add the new one, and I think setting allowDups on the add command to false will allow me to do this. I seem to recall something in the update handler code that goes through and deletes all but the last copy of the doc if allowDups is false - does that sound accurate?

Yes. But you need to define a uniqueKey in the schema and make sure it is the same for the docs you want overwritten. This is how Solr detects dups.

> If so, I just need to make sure that solrj properly sets that flag, which leads me to my next question. Does solrj default allowDups to false? If not, what do I need to do to make sure allowDups is set to false when I'm adding these docs?

It is the normal mode of operation for Solr, so I'd be surprised if it weren't the default in solrj (but I don't actually know).

-Mike
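To make the overwrite behaviour described above concrete, here is a toy in-memory model. It is only a sketch of the semantics (same uniqueKey value plus allowDups=false means the older copies go away); it is not Solr's update handler, and the class name `ToyIndex` and the field names `id` and `title` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of "add with allowDups=false"; NOT how Solr is implemented.
final class ToyIndex {
    // Stands in for the <uniqueKey> field declared in the schema (assumed name).
    private final String uniqueKeyField;
    private final List<Map<String, String>> docs = new ArrayList<>();

    ToyIndex(String uniqueKeyField) {
        this.uniqueKeyField = uniqueKeyField;
    }

    void add(Map<String, String> doc, boolean allowDups) {
        String key = doc.get(uniqueKeyField);
        if (!allowDups && key != null) {
            // Remove every earlier doc with the same key so only the newest copy remains.
            docs.removeIf(existing -> key.equals(existing.get(uniqueKeyField)));
        }
        docs.add(new LinkedHashMap<>(doc));
    }

    List<Map<String, String>> docs() {
        return docs;
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex("id");
        index.add(Map.of("id", "42"), false);                      // first, incomplete copy
        index.add(Map.of("id", "42", "title", "fixed"), false);    // re-add replaces it
        System.out.println(index.size() + " doc(s): " + index.docs());
    }
}
```

The point of the model is that the key comparison only works if both copies carry the same value in the uniqueKey field; a doc without that field can never be matched and is simply appended.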
RE: quick allowDups questions
Thanks for the response, Mike. A quick test using the example app confirms your statement.

As for Solrj, you're probably right, but I'm not going to take any chances for the time being. The server.add method has an optional Boolean flag named overwrite that defaults to true. Without knowing for sure what it does, I'm not going to mess with it.

For the purposes of my problem, I've got an upper and lower bound of affected docs, so I'm just going to delete them all and then initiate a re-index of those specific ids from my source.

Thanks again for the help!

-----Original Message-----
From: Mike Klaas [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 10, 2007 3:58 PM
To: solr-user@lucene.apache.org
Subject: Re: quick allowDups questions
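The delete-then-reindex plan above could be expressed as a single delete-by-query message for Solr's XML update handler. The sketch below only builds that message as a string. The field name `id`, the numeric bounds, and the class name are assumptions for illustration, and whether a range query matches the intended docs depends on the field type of the key (a plain string field compares lexicographically, not numerically).

```java
// Builds a Solr XML delete-by-query message for an inclusive range.
// Sketch only: field name and bounds are hypothetical, and nothing is sent to a server.
final class DeleteRangeSketch {

    static String deleteByRangeXml(String field, long lower, long upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower bound exceeds upper bound");
        }
        // [a TO b] is an inclusive range in the Lucene query syntax.
        return "<delete><query>" + field + ":[" + lower + " TO " + upper + "]</query></delete>";
    }

    public static void main(String[] args) {
        // prints <delete><query>id:[1000 TO 2000]</query></delete>
        System.out.println(deleteByRangeXml("id", 1000, 2000));
    }
}
```

The message would be posted to the update handler and followed by a `<commit/>` before re-adding the affected ids from the source system.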
Re: quick allowDups questions
The default solrj implementation should do what you need.

> As for Solrj, you're probably right, but I'm not going to take any chances for the time being. The server.add method has an optional Boolean flag named overwrite that defaults to true. Without knowing for sure what it does, I'm not going to mess with it.

Direct Solr update allows a few extra fields: allowDups, overwritePending, and overwriteCommitted. The future of overwritePending and overwriteCommitted is in doubt (SOLR-60), so I did not want to bake that into the solrj API.

Internally:

    allowDups = !overwrite;    (the one field you can set)
    overwritePending = !allowDups;
    overwriteCommitted = !allowDups;

ryan
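The flag derivation ryan describes can be written out as a small standalone class. This is a sketch of the mapping only, not the solrj source; the class name `AddFlags`, the spelling `overwriteCommitted`, and rendering the flags as attributes on the `<add>` element are assumptions made here for illustration.

```java
// Derives the three add-command flags from the single overwrite flag.
// Sketch of the mapping described in the thread; not copied from solrj.
final class AddFlags {
    final boolean allowDups;
    final boolean overwritePending;
    final boolean overwriteCommitted;

    AddFlags(boolean overwrite) {
        this.allowDups = !overwrite;          // overwrite is the one value the caller sets
        this.overwritePending = !allowDups;   // so both of these always equal overwrite
        this.overwriteCommitted = !allowDups;
    }

    // Renders the flags as attributes on an <add> opening tag (assumed wire format).
    String toAddOpenTag() {
        return "<add allowDups=\"" + allowDups
                + "\" overwritePending=\"" + overwritePending
                + "\" overwriteCommitted=\"" + overwriteCommitted + "\">";
    }

    public static void main(String[] args) {
        // prints <add allowDups="false" overwritePending="true" overwriteCommitted="true">
        System.out.println(new AddFlags(true).toAddOpenTag());
    }
}
```

With overwrite left at its default of true, allowDups comes out false, which is the behaviour the original question was asking for.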