quick allowDups questions

2007-10-10 Thread Charlie Jackson
Normally this is the type of thing I'd just scour through the online
docs or the source code for, but I'm under the gun a bit. 

 

Anyway, I need to update some docs in my index because my client program
wasn't accurately putting these docs in (values for one of the fields
were missing). I'm hoping I won't have to write additional code to go
through and delete each existing doc before I add the new one, and I
think setting allowDups on the add command to false will allow me to do
this. I seem to recall something in the update handler code that goes
through and deletes all but the last copy of the doc if allowDups is
false - does that sound accurate?

 

If so, I just need to make sure that solrj properly sets that flag,
which leads me to my next question. Does solrj default allowDups to
false? If not, what do I need to do to make sure allowDups is set to
false when I'm adding these docs? 



Re: quick allowDups questions

2007-10-10 Thread Mike Klaas

On 10-Oct-07, at 1:11 PM, Charlie Jackson wrote:

Anyway, I need to update some docs in my index because my client program
wasn't accurately putting these docs in (values for one of the fields
were missing). I'm hoping I won't have to write additional code to go
through and delete each existing doc before I add the new one, and I
think setting allowDups on the add command to false will allow me to do
this. I seem to recall something in the update handler code that goes
through and deletes all but the last copy of the doc if allowDups is
false - does that sound accurate?


Yes. But you need to define a uniqueKey in your schema and make sure its
value is the same for the docs you want overwritten. This is how Solr
detects dups.
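
To make the uniqueKey point concrete, here is a minimal sketch of the overwrite behaviour described above. It is a toy in-memory model, not Solr's update handler code; the class name, the "id" key field, and the sample field values are all assumptions made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: shows why a shared uniqueKey value lets a new doc replace an old one.
public class UniqueKeyOverwriteSketch {
    // Stand-in for an index: uniqueKey value -> document fields.
    private final Map<String, Map<String, String>> docsByKey = new LinkedHashMap<>();
    private final String uniqueKeyField;

    public UniqueKeyOverwriteSketch(String uniqueKeyField) {
        this.uniqueKeyField = uniqueKeyField;
    }

    // Adding a doc whose uniqueKey value is already present replaces the earlier copy.
    public void add(Map<String, String> doc) {
        String key = doc.get(uniqueKeyField);
        if (key == null) {
            throw new IllegalArgumentException("document is missing " + uniqueKeyField);
        }
        docsByKey.put(key, doc);
    }

    public int size() {
        return docsByKey.size();
    }

    public Map<String, String> get(String key) {
        return docsByKey.get(key);
    }

    public static void main(String[] args) {
        UniqueKeyOverwriteSketch index = new UniqueKeyOverwriteSketch("id"); // "id" is assumed
        index.add(Map.of("id", "42", "title", "report"));
        // Same id, now with the previously missing field filled in.
        index.add(Map.of("id", "42", "title", "report", "author", "someone"));
        System.out.println(index.size());                   // prints 1
        System.out.println(index.get("42").get("author"));  // prints someone
    }
}
```

If the two docs carried different id values, both would be kept, which is why the key has to match for the docs you want replaced.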




If so, I just need to make sure that solrj properly sets that flag,
which leads me to my next question. Does solrj default allowDups to
false? If not, what do I need to do to make sure allowDups is set to
false when I'm adding these docs?


It is the normal mode of operation for Solr, so I'd be surprised if  
it wasn't the default in solrj (but I don't actually know).


-Mike


RE: quick allowDups questions

2007-10-10 Thread Charlie Jackson
Thanks for the response, Mike. A quick test using the example app
confirms your statement. 

As for Solrj, you're probably right, but I'm not going to take any
chances for the time being. The server.add method has an optional
Boolean flag named overwrite that defaults to true. Without knowing
for sure what it does, I'm not going to mess with it. 

For the purposes of my problem, I've got an upper and lower bound of
affected docs, so I'm just going to delete them all and then initiate a
re-index of those specific ids from my source. 
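
For reference, one way the delete step could be expressed is as a delete-by-query message posted to Solr's XML update handler. The sketch below only builds the message text. The field name "id", the bounds 1000 and 2000, and the helper names are hypothetical, and whether a range query selects exactly the intended docs depends on the type of the field in the schema.

```java
// Sketch under stated assumptions: builds a Solr XML delete-by-query message as a string.
public class RangeDeleteSketch {
    // Inclusive Lucene-style range query, e.g. id:[1000 TO 2000].
    public static String rangeQuery(String field, long lower, long upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower bound exceeds upper bound");
        }
        return field + ":[" + lower + " TO " + upper + "]";
    }

    // Escapes the characters that would break the XML element content.
    public static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wraps a query in the <delete><query>...</query></delete> update message.
    public static String deleteMessage(String query) {
        return "<delete><query>" + escapeXml(query) + "</query></delete>";
    }

    public static void main(String[] args) {
        // Hypothetical field name and bounds.
        System.out.println(deleteMessage(rangeQuery("id", 1000, 2000)));
        // prints <delete><query>id:[1000 TO 2000]</query></delete>
        System.out.println("<commit/>"); // a commit makes the deletes visible before re-indexing
    }
}
```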

Thanks again for the help!




Re: quick allowDups questions

2007-10-10 Thread Ryan McKinley

The default solrj implementation should do what you need.



As for Solrj, you're probably right, but I'm not going to take any
chances for the time being. The server.add method has an optional
Boolean flag named overwrite that defaults to true. Without knowing
for sure what it does, I'm not going to mess with it. 



A direct Solr update allows a few extra flags: allowDups,
overwritePending, and overwriteCommitted. The future of overwritePending
and overwriteCommitted is in doubt (SOLR-60), so I did not want to bake
them into the solrj API.


Internally,

 allowDups = !overwrite;            // overwrite is the one flag you can set
 overwritePending = !allowDups;
 overwriteCommitted = !allowDups;
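
The derivation above can be written out as a small self-contained sketch. It mirrors the three assignments and renders them as attributes of an XML add tag; the class and method names are hypothetical and this is not the actual solrj source.

```java
// Sketch only: derives the three update flags from the single overwrite flag.
public class UpdateFlagsSketch {
    public final boolean allowDups;
    public final boolean overwritePending;
    public final boolean overwriteCommitted;

    public UpdateFlagsSketch(boolean overwrite) {
        // Same three assignments as in the message above.
        this.allowDups = !overwrite;
        this.overwritePending = !allowDups;
        this.overwriteCommitted = !allowDups;
    }

    // Renders the flags as the opening tag of an XML add message.
    public String toAddTag() {
        return "<add allowDups=\"" + allowDups
                + "\" overwritePending=\"" + overwritePending
                + "\" overwriteCommitted=\"" + overwriteCommitted + "\">";
    }

    public static void main(String[] args) {
        // overwrite = true is the default mentioned earlier in the thread.
        System.out.println(new UpdateFlagsSketch(true).toAddTag());
        // prints <add allowDups="false" overwritePending="true" overwriteCommitted="true">
    }
}
```

With overwrite left at true, allowDups comes out false and both overwrite flags come out true, which is the behaviour asked about at the start of the thread.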


ryan