Personally, although I understand the rationale and the performance ramifications of the current approach of including deleted documents, I agree that DF and IDF should be accurate despite deletions. So, if they aren't, I'd suggest filing a Jira bug. Granted, it might be rejected as "by design", "won't fix", or "improvement", but it's worth having the discussion.

Maybe one theory from the old days is that the "batch update" model would, by definition, include an optimize step. But now, with Solr considered by some to be a "NoSQL database" and offering (near) real-time updates, that model is clearly obsolete.

-- Jack Krupansky

-----Original Message----- From: Apoorva Gaurav
Sent: Tuesday, June 17, 2014 11:15 AM
To: solr-user ; Ahmet Arslan
Subject: Re: docFreq coming to be more than 1 for unique id field

Yes, we have updates on these. We didn't try optimizing; we will. But isn't
the unique field supposed to be unique?


On Tue, Jun 17, 2014 at 8:37 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
wrote:

Hi,

Just a guess: do you have deletions? What happens when you optimize and
retry?
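For reference, a sketch of triggering an optimize from the command line. The host, port, and core name (collection1) are assumptions for a default single-core setup; adjust to your deployment. Optimize forces a full merge, which physically expunges deleted documents so docFreq reflects only live ones:

```shell
# Hypothetical default host/core; waitSearcher=true blocks until
# a new searcher sees the merged (deletion-free) index.
curl "http://localhost:8983/solr/collection1/update?optimize=true&waitSearcher=true"
```

Note that optimize rewrites the whole index and is expensive; on large indexes it is usually reserved for off-peak or batch windows.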



On Tuesday, June 17, 2014 5:58 PM, Apoorva Gaurav <
apoorva.gau...@myntra.com> wrote:
Hello All,

We are using Solr 4.4.0. We have a uniqueKey of type solr.StrField. We need
to extract docs in a pre-defined order if they match a certain condition.
Our query is of the format

uniqueField:(id1 ^ weight1 OR id2 ^ weight2 ..... OR idN ^ weightN)
where weight1 > weight2 > ........ > weightN

But the results are not in the desired order. On debugging the query, we've
found that for some of the documents docFreq is higher than 1, and hence their tf-idf-based score is lower than others'. What can be the reason behind
a unique id field having docFreq greater than 1? How can we prevent it?
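To illustrate why an inflated docFreq flips the ordering, here is a simplified sketch (not Solr's exact code path: query norm, tf, coord, and field norms are all omitted). Lucene 4.x's classic similarity computes idf as log(numDocs / (docFreq + 1)) + 1, and the idf enters the boosted term's score roughly twice (once in the query weight, once in the field weight). So an old, deleted-but-unmerged copy of a document that bumps its id's docFreq from 1 to 2 shrinks idf² enough that a lower-boosted id can outscore a higher-boosted one:

```python
import math

def idf(num_docs, doc_freq):
    # Lucene classic (TFIDFSimilarity) idf: log(numDocs / (docFreq + 1)) + 1
    return math.log(num_docs / (doc_freq + 1)) + 1

def score(boost, num_docs, doc_freq):
    # Simplified: idf appears in both query weight and field weight,
    # so the boosted term score is roughly boost * idf^2
    # (tf, query norm, coord, and field norms omitted).
    return boost * idf(num_docs, doc_freq) ** 2

num_docs = 1_000_000
# id1 carries the higher boost, but a stale deleted copy makes its
# docFreq 2 instead of 1; id2 has the expected docFreq of 1.
s1 = score(10.0, num_docs, 2)
s2 = score(9.5, num_docs, 1)

print(s1 < s2)  # True: the lower-boosted id2 outscores id1
```

This is exactly why optimizing (which purges deleted documents and restores docFreq to 1) fixes the ordering; alternatively, a query that bypasses tf-idf scoring for the id field avoids the problem entirely.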

--
Thanks & Regards,
Apoorva




--
Thanks & Regards,
Apoorva
