That's what I think too; glad I am not going mad. I've spent half a day comparing the config files, checking out from SVN again, and ensuring the databases are identical. I cannot see what else I can do to make them equivalent. Both servers check out directly from SVN, so I am convinced the files are the same. The database is definitely the same.

Not sure what you mean about having identical indices - that's my problem - I don't - or do you mean something else I've missed? But yes, everything else you mention is identical; I am as certain as I can be. I too think there must be a difference I have missed, but I have run out of ideas for what to check! Frustrating :)

On Mar 9, 2011, at 4:38 PM, Jonathan Rochkind wrote:

> Yes, but the identical index with the identical solrconfig.xml and the
> identical query and the identical version of Solr on two different machines
> should produce identical results.
>
> So it's a legitimate question why it's not. But perhaps queryNorm isn't
> enough to answer that. Sorry, it's out of my league to try and figure it
> out.
>
> But are you absolutely sure you have identical indexes, identical
> solrconfig.xml, identical queries, and identical versions of Solr and any
> other installed Java libraries... on both machines? One of these being
> different seems more likely than a bug in Solr, although that's possible.
>
> On 3/9/2011 4:34 PM, Jayendra Patil wrote:
>> queryNorm is just a normalizing factor and is the same value across
>> all the results for a query, to just make the scores comparable.
>> So even if it varies between environments, you should not worry about it.
>>
>> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html#formula_queryNorm
>> -
>> Definition - queryNorm(q) is just a normalizing factor used to make
>> scores between queries comparable. This factor does not affect
>> document ranking (since all ranked documents are multiplied by the
>> same factor), but rather just attempts to make scores from different
>> queries (or even different indexes) comparable
>>
>> Regards,
>> Jayendra
>>
>> On Wed, Mar 9, 2011 at 4:22 PM, Allistair Crossley<a...@roxxor.co.uk> wrote:
>>> Hi,
>>>
>>> I am seeing an issue I do not understand and hope that someone can shed
>>> some light on this.
>>> The issue is that for a particular search we are seeing
>>> a particular result rank in position 3 on one machine and position 8 on the
>>> production machine. The position 3 is our desired and roughly expected
>>> ranking.
>>>
>>> I have a local machine with solr and a version deployed on a production
>>> server. My local machine's solr and the production version are both checked
>>> out from our project's SVN trunk. They are identical files except for the
>>> data files (not in SVN) and database connection settings.
>>>
>>> The index is populated exclusively via data import handler queries to a
>>> database.
>>>
>>> I have exported the production database as-is to my local development
>>> machine so that my local machine and production have access to the self
>>> same data.
>>>
>>> I execute a total full-import on both.
>>>
>>> Still, I see a different position for this document that should surely rank
>>> in the same location, all else being equal.
>>>
>>> I ran debugQuery diff to see how the scores were being computed. See
>>> appendix at foot of this email.
>>>
>>> As far as I can tell every single query normalisation block of the debug is
>>> different, e.g.
>>>
>>> - 0.021368012 = queryNorm (local)
>>> + 0.009944122 = queryNorm (production)
>>>
>>> Which leads to a final score of roughly 2.29 versus 1.07, which is enough
>>> to skew the results from correct to incorrect (in terms of what we expect
>>> to see).
>>>
>>> - 2.286596 = final score (local)
>>> + 1.0651637 = final score (production)
>>>
>>> I cannot explain this difference. The database is the same. The
>>> configuration is the same. I have fully imported from scratch on both
>>> servers. What am I missing?
>>>
>>> Thank you for your time
>>>
>>> Allistair
>>>
>>> ----- snip
>>>
>>> APPENDIX - debugQuery=on DIFF
>>>
>>> --- untitled
>>> +++ (clipboard)
>>> @@ -1,51 +1,49 @@
>>> -<str name="L12411p">
>>> +<str name="L12411">
>>>
>>> -2.286596 = (MATCH) sum of:
>>> -  1.6891675 = (MATCH) sum of:
>>> -    1.3198489 = (MATCH) max plus 0.01 times others of:
>>> -      0.023022119 = (MATCH) weight(text:dubai^0.1 in 1551), product of:
>>> -        0.011795795 = queryWeight(text:dubai^0.1), product of:
>>> -          0.1 = boost
>>> +1.0651637 = (MATCH) sum of:
>>> +  0.7871359 = (MATCH) sum of:
>>> +    0.6151879 = (MATCH) max plus 0.01 times others of:
>>> +      0.10713901 = (MATCH) weight(text:dubai in 1551), product of:
>>> +        0.05489459 = queryWeight(text:dubai), product of:
>>>            5.520305 = idf(docFreq=65, maxDocs=6063)
>>> -          0.021368012 = queryNorm
>>> +          0.009944122 = queryNorm
>>>          1.9517226 = (MATCH) fieldWeight(text:dubai in 1551), product of:
>>>            1.4142135 = tf(termFreq(text:dubai)=2)
>>>            5.520305 = idf(docFreq=65, maxDocs=6063)
>>>            0.25 = fieldNorm(field=text, doc=1551)
>>> -      1.3196187 = (MATCH) weight(profile:dubai^2.0 in 1551), product of:
>>> -        0.32609802 = queryWeight(profile:dubai^2.0), product of:
>>> +      0.6141165 = (MATCH) weight(profile:dubai^2.0 in 1551), product of:
>>> +        0.15175761 = queryWeight(profile:dubai^2.0), product of:
>>>            2.0 = boost
>>>            7.6305184 = idf(docFreq=7, maxDocs=6063)
>>> -          0.021368012 = queryNorm
>>> +          0.009944122 = queryNorm
>>>          4.0466933 = (MATCH) fieldWeight(profile:dubai in 1551), product of:
>>>            1.4142135 = tf(termFreq(profile:dubai)=2)
>>>            7.6305184 = idf(docFreq=7, maxDocs=6063)
>>>            0.375 = fieldNorm(field=profile, doc=1551)
>>> -    0.36931866 = (MATCH) max plus 0.01 times others of:
>>> -      0.0018293816 = (MATCH) weight(text:product^0.1 in 1551), product of:
>>> -        0.003954251 = queryWeight(text:product^0.1), product of:
>>> -          0.1 = boost
>>> +    0.17194802 = (MATCH) max plus 0.01 times others of:
>>> +      0.00851347 = (MATCH) weight(text:product in 1551), product of:
>>> +        0.018402064 = queryWeight(text:product), product of:
>>>            1.8505468 = idf(docFreq=2589, maxDocs=6063)
>>> -          0.021368012 = queryNorm
>>> +          0.009944122 = queryNorm
>>>          0.4626367 = (MATCH) fieldWeight(text:product in 1551), product of:
>>>            1.0 = tf(termFreq(text:product)=1)
>>>            1.8505468 = idf(docFreq=2589, maxDocs=6063)
>>>            0.25 = fieldNorm(field=text, doc=1551)
>>> -      0.36930037 = (MATCH) weight(profile:product^2.0 in 1551), product of:
>>> -        0.1725098 = queryWeight(profile:product^2.0), product of:
>>> +      0.17186289 = (MATCH) weight(profile:product^2.0 in 1551), product of:
>>> +        0.08028162 = queryWeight(profile:product^2.0), product of:
>>>            2.0 = boost
>>>            4.036637 = idf(docFreq=290, maxDocs=6063)
>>> -          0.021368012 = queryNorm
>>> +          0.009944122 = queryNorm
>>>          2.14075 = (MATCH) fieldWeight(profile:product in 1551), product of:
>>>            1.4142135 = tf(termFreq(profile:product)=2)
>>>            4.036637 = idf(docFreq=290, maxDocs=6063)
>>>            0.375 = fieldNorm(field=profile, doc=1551)
>>> -  0.59742856 = (MATCH) max plus 0.01 times others of:
>>> -    0.59742856 = weight(profile:"dubai product"~10^0.5 in 1551), product of:
>>> -      0.12465195 = queryWeight(profile:"dubai product"~10^0.5), product of:
>>> +  0.27802786 = (MATCH) max plus 0.01 times others of:
>>> +    0.27802786 = weight(profile:"dubai product"~10^0.5 in 1551), product of:
>>> +      0.05800981 = queryWeight(profile:"dubai product"~10^0.5), product of:
>>>          0.5 = boost
>>>          11.667155 = idf(profile: dubai=7 product=290)
>>> -        0.021368012 = queryNorm
>>> +        0.009944122 = queryNorm
>>>        4.7927732 = fieldWeight(profile:"dubai product" in 1551), product of:
>>>          1.0954452 = tf(phraseFreq=1.2)
>>>          11.667155 = idf(profile: dubai=7 product=290)
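
The queryWeight lines in the appendix can be re-derived by hand, which helps tell a pure queryNorm difference apart from a difference in the query itself. Below is a minimal Python sketch that uses only numbers copied from the diff and does not talk to Solr. The helper name query_weight is made up for illustration, and the production boost of 1.0 for text:dubai is an assumption based on the absence of a boost line on that side.

```python
# Sanity check of the debugQuery diff: every value below is copied from the
# appendix; nothing here queries Solr or Lucene.

def query_weight(boost: float, idf: float, query_norm: float) -> float:
    """Hypothetical helper: queryWeight as the explain output lists it,
    i.e. the product boost * idf * queryNorm."""
    return boost * idf * query_norm

LOCAL_NORM = 0.021368012   # queryNorm reported on the local machine
PROD_NORM = 0.009944122    # queryNorm reported on production

# Re-derive the queryWeight lines for the term "dubai".
local_text = query_weight(0.1, 5.520305, LOCAL_NORM)      # text:dubai^0.1
prod_text = query_weight(1.0, 5.520305, PROD_NORM)        # text:dubai (boost 1.0 assumed)
local_profile = query_weight(2.0, 7.6305184, LOCAL_NORM)  # profile:dubai^2.0
prod_profile = query_weight(2.0, 7.6305184, PROD_NORM)    # profile:dubai^2.0

# Compare how each clause's weight changed between the two machines with
# how queryNorm itself changed.
norm_ratio = LOCAL_NORM / PROD_NORM
text_ratio = 0.023022119 / 0.10713901      # weight(text:dubai), local / production
profile_ratio = 1.3196187 / 0.6141165      # weight(profile:dubai), local / production

print("queryNorm ratio:          ", norm_ratio)
print("profile clause / queryNorm:", profile_ratio / norm_ratio)
print("text clause / queryNorm:   ", text_ratio / norm_ratio)
```

If the profile clause tracks the queryNorm ratio while the text clause comes out at about a tenth of it, the two machines are not scoring the same parsed query: the local side applies ^0.1 to the text field and the production side shows no boost there. Comparing wherever that text-field boost is configured on each machine (for example a dismax qf setting, if that is where it lives) would be a reasonable next check.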