Yonik Seeley wrote:
On 1/23/07, Walter Lewis <[EMAIL PROTECTED]> wrote:
This is quite possibly a Lucene question rather than a Solr one, so my
apologies if you think it's out of scope.

Underlying the Solr search are some very useful Lucene constructs.

One of the most powerful, imho, is the tilde number combination for a
"fuzzy" search.

In one of my data sets
    q=Sutherland returns 41 results
    q=Sutherland~0.75 returns 275
    q=Sutherland~0.70 returns 484
etc., all of which fits a pattern. Add a first name and
    q=(James Sutherland) returns 13
    q=(James~0.75 Sutherland~0.75) returns 1
    q=(James~0.70 Sutherland~0.70) returns 97
Qualify only one term and there is a consistent pattern.  But routinely
qualifying two terms yields a smaller number than a plain string match.
Trying
    q=(James~0.75 AND Sutherland~0.75)
returns the same record (the schema has the default operator set to AND).

Why would the ~0.75 *narrow* rather than broaden a search? Is there some
pattern in the solr syntax I'm overlooking?

That's a great question... that doesn't make sense.
Could you post your debug-query output (add debugQuery=on)?
My apologies for the delay and for the generally excessive top quoting here. I thought it might save a bit of time to keep the alternatives together. I should also note that I simplified the queries above: each ran with a searchSet constraint, which had the same value every time. The "normal" queries also carry significant baggage in the form of fields and facets, which is likewise consistent across the whole set.

I ran the debug against the two following queries:

  q=(James Sutherland) returns 13
  q=(James~0.75 Sutherland~0.75) returns 1

I have attached the debug fragments below.

Walter
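
For anyone reproducing this, the two requests can be built along the following lines. This is only a sketch: the host, port and /solr/select path are assumptions about a default install, not details from this thread. Only the q values and debugQuery=on come from the messages above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust to the actual Solr install.
SOLR_SELECT = "http://localhost:8983/solr/select"


def debug_url(query: str) -> str:
    """Return a select URL for `query` with debugQuery=on appended."""
    return SOLR_SELECT + "?" + urlencode({"q": query, "debugQuery": "on"})


plain_url = debug_url("(james sutherland) searchSet:testSet")
fuzzy_url = debug_url("(james~0.75 AND sutherland~0.75) searchSet:testSet")
print(plain_url)
print(fuzzy_url)
```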


====
<lst name="debug">
<str name="rawquerystring">(james sutherland) searchSet:testSet</str>
<str name="querystring">(james sutherland) searchSet:testSet</str>
   <str name="parsedquery">
+(+text:jame +text:sutherland) +searchSet:testSet
</str>
   <str name="parsedquery_toString">
+(+text:jame +text:sutherland) +searchSet:testSet
</str>
   <lst name="explain">
   <str name="id=MHGL.502,internal_docid=80313">

2.2928324 = (MATCH) sum of:
 2.2204013 = (MATCH) sum of:
   0.444597 = (MATCH) weight(text:jame in 80313), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.94623077 = (MATCH) fieldWeight(text:jame in 80313), product of:
       1.7320508 = tf(termFreq(text:jame)=3)
       4.370453 = idf(docFreq=3085)
       0.125 = fieldNorm(field=text, doc=80313)
   1.7758043 = (MATCH) weight(text:sutherland in 80313), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     2.0321045 = (MATCH) fieldWeight(text:sutherland in 80313), product of:
       2.0 = tf(termFreq(text:sutherland)=4)
       8.128418 = idf(docFreq=71)
       0.125 = fieldNorm(field=text, doc=80313)
 0.072431125 = (MATCH) weight(searchSet:testSet in 80313), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.58039826 = (MATCH) fieldWeight(searchSet:testSet in 80313), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.5 = fieldNorm(field=searchSet, doc=80313)
</str>
   <str name="id=MHGL.503,internal_docid=80314">

2.1340907 = (MATCH) sum of:
 2.0616596 = (MATCH) sum of:
   0.43047923 = (MATCH) weight(text:jame in 80314), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.91618407 = (MATCH) fieldWeight(text:jame in 80314), product of:
       2.236068 = tf(termFreq(text:jame)=5)
       4.370453 = idf(docFreq=3085)
       0.09375 = fieldNorm(field=text, doc=80314)
   1.6311804 = (MATCH) weight(text:sutherland in 80314), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     1.8666072 = (MATCH) fieldWeight(text:sutherland in 80314), product of:
       2.4494898 = tf(termFreq(text:sutherland)=6)
       8.128418 = idf(docFreq=71)
       0.09375 = fieldNorm(field=text, doc=80314)
 0.072431125 = (MATCH) weight(searchSet:testSet in 80314), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.58039826 = (MATCH) fieldWeight(searchSet:testSet in 80314), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.5 = fieldNorm(field=searchSet, doc=80314)
</str>
   <str name="id=MHGL.501,internal_docid=80312">

1.5031691 = (MATCH) sum of:
 1.430738 = (MATCH) sum of:
   0.32086027 = (MATCH) weight(text:jame in 80312), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.68288326 = (MATCH) fieldWeight(text:jame in 80312), product of:
       1.0 = tf(termFreq(text:jame)=1)
       4.370453 = idf(docFreq=3085)
       0.15625 = fieldNorm(field=text, doc=80312)
   1.1098777 = (MATCH) weight(text:sutherland in 80312), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     1.2700653 = (MATCH) fieldWeight(text:sutherland in 80312), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.15625 = fieldNorm(field=text, doc=80312)
 0.072431125 = (MATCH) weight(searchSet:testSet in 80312), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.58039826 = (MATCH) fieldWeight(searchSet:testSet in 80312), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.5 = fieldNorm(field=searchSet, doc=80312)
</str>
<str name="id=http://archeion-aao.fis.utoronto.ca/cgi-bin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=E:\Documents\archeion\/ON00313f/ON00313-f0000358.xml,internal_docid=12073">

0.6628341 = (MATCH) sum of:
 0.5722952 = (MATCH) sum of:
   0.1283441 = (MATCH) weight(text:jame in 12073), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.2731533 = (MATCH) fieldWeight(text:jame in 12073), product of:
       1.0 = tf(termFreq(text:jame)=1)
       4.370453 = idf(docFreq=3085)
       0.0625 = fieldNorm(field=text, doc=12073)
   0.44395107 = (MATCH) weight(text:sutherland in 12073), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     0.5080261 = (MATCH) fieldWeight(text:sutherland in 12073), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.0625 = fieldNorm(field=text, doc=12073)
 0.090538904 = (MATCH) weight(searchSet:testSet in 12073), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 12073), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=12073)
</str>
<str name="id=http://archeion-aao.fis.utoronto.ca/cgi-bin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=ON00313f/ON00313-f0000358.xml,internal_docid=60185">

0.6628341 = (MATCH) sum of:
 0.5722952 = (MATCH) sum of:
   0.1283441 = (MATCH) weight(text:jame in 60185), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.2731533 = (MATCH) fieldWeight(text:jame in 60185), product of:
       1.0 = tf(termFreq(text:jame)=1)
       4.370453 = idf(docFreq=3085)
       0.0625 = fieldNorm(field=text, doc=60185)
   0.44395107 = (MATCH) weight(text:sutherland in 60185), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     0.5080261 = (MATCH) fieldWeight(text:sutherland in 60185), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.0625 = fieldNorm(field=text, doc=60185)
 0.090538904 = (MATCH) weight(searchSet:testSet in 60185), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 60185), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=60185)
</str>
<str name="id=http://archeion-aao.fis.utoronto.ca/cgi-bin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=E:\Documents\archeion\/ON00093f/ON00093-f93-9.xml,internal_docid=10564">

0.48144954 = (MATCH) sum of:
 0.39091066 = (MATCH) sum of:
   0.11344123 = (MATCH) weight(text:jame in 10564), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.24143569 = (MATCH) fieldWeight(text:jame in 10564), product of:
       1.4142135 = tf(termFreq(text:jame)=2)
       4.370453 = idf(docFreq=3085)
       0.0390625 = fieldNorm(field=text, doc=10564)
   0.27746943 = (MATCH) weight(text:sutherland in 10564), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     0.31751633 = (MATCH) fieldWeight(text:sutherland in 10564), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.0390625 = fieldNorm(field=text, doc=10564)
 0.090538904 = (MATCH) weight(searchSet:testSet in 10564), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 10564), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=10564)
</str>
<str name="id=http://archeion-aao.fis.utoronto.ca/cgi-bin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=ON00093f/ON00093-f93-9.xml,internal_docid=58676">

0.48144954 = (MATCH) sum of:
 0.39091066 = (MATCH) sum of:
   0.11344123 = (MATCH) weight(text:jame in 58676), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.24143569 = (MATCH) fieldWeight(text:jame in 58676), product of:
       1.4142135 = tf(termFreq(text:jame)=2)
       4.370453 = idf(docFreq=3085)
       0.0390625 = fieldNorm(field=text, doc=58676)
   0.27746943 = (MATCH) weight(text:sutherland in 58676), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     0.31751633 = (MATCH) fieldWeight(text:sutherland in 58676), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.0390625 = fieldNorm(field=text, doc=58676)
 0.090538904 = (MATCH) weight(searchSet:testSet in 58676), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 58676), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=58676)
</str>
   <str name="id=ECF.873,internal_docid=18553">

0.25359273 = (MATCH) sum of:
 0.16305381 = (MATCH) sum of:
   0.07981298 = (MATCH) weight(text:jame in 18553), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.16986507 = (MATCH) fieldWeight(text:jame in 18553), product of:
       3.3166249 = tf(termFreq(text:jame)=11)
       4.370453 = idf(docFreq=3085)
       0.01171875 = fieldNorm(field=text, doc=18553)
   0.08324082 = (MATCH) weight(text:sutherland in 18553), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     0.0952549 = (MATCH) fieldWeight(text:sutherland in 18553), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.01171875 = fieldNorm(field=text, doc=18553)
 0.090538904 = (MATCH) weight(searchSet:testSet in 18553), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 18553), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=18553)
</str>
   <str name="id=ECF.373,internal_docid=18055">

0.2336127 = (MATCH) sum of:
 0.1430738 = (MATCH) sum of:
   0.032086026 = (MATCH) weight(text:jame in 18055), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.068288326 = (MATCH) fieldWeight(text:jame in 18055), product of:
       1.0 = tf(termFreq(text:jame)=1)
       4.370453 = idf(docFreq=3085)
       0.015625 = fieldNorm(field=text, doc=18055)
   0.11098777 = (MATCH) weight(text:sutherland in 18055), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     0.12700653 = (MATCH) fieldWeight(text:sutherland in 18055), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.015625 = fieldNorm(field=text, doc=18055)
 0.090538904 = (MATCH) weight(searchSet:testSet in 18055), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 18055), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=18055)
</str>
   <str name="id=ECF.2476,internal_docid=20148">

0.2336127 = (MATCH) sum of:
 0.1430738 = (MATCH) sum of:
   0.032086026 = (MATCH) weight(text:jame in 20148), product of:
     0.46986106 = queryWeight(text:jame), product of:
       4.370453 = idf(docFreq=3085)
       0.107508555 = queryNorm
     0.068288326 = (MATCH) fieldWeight(text:jame in 20148), product of:
       1.0 = tf(termFreq(text:jame)=1)
       4.370453 = idf(docFreq=3085)
       0.015625 = fieldNorm(field=text, doc=20148)
   0.11098777 = (MATCH) weight(text:sutherland in 20148), product of:
     0.8738745 = queryWeight(text:sutherland), product of:
       8.128418 = idf(docFreq=71)
       0.107508555 = queryNorm
     0.12700653 = (MATCH) fieldWeight(text:sutherland in 20148), product of:
       1.0 = tf(termFreq(text:sutherland)=1)
       8.128418 = idf(docFreq=71)
       0.015625 = fieldNorm(field=text, doc=20148)
 0.090538904 = (MATCH) weight(searchSet:testSet in 20148), product of:
   0.124795556 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.107508555 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 20148), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=20148)
</str>
</lst>
</lst>
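
The arithmetic in the listing above can be checked by hand. The sketch below recomputes the top hit (id=MHGL.502) from the factors the explain output itself prints: each weight is queryWeight times fieldWeight, and the tf values shown match the square root of termFreq. The function and variable names are made up for illustration; the numbers are copied from the listing.

```python
import math


def term_weight(term_freq, idf, query_norm, field_norm, boost=1.0):
    """Recompute one weight(...) node from the factors listed under it."""
    # queryWeight = boost * idf * queryNorm (no boost appears in this listing)
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq) as printed
    field_weight = math.sqrt(term_freq) * idf * field_norm
    return query_weight * field_weight


QUERY_NORM = 0.107508555  # copied from the explain output

# Factors for id=MHGL.502 (internal_docid=80313)
jame = term_weight(3, 4.370453, QUERY_NORM, 0.125)
sutherland = term_weight(4, 8.128418, QUERY_NORM, 0.125)
search_set = term_weight(1, 1.1607965, QUERY_NORM, 0.5)
total = jame + sutherland + search_set

print(round(jame, 6), round(sutherland, 6), round(search_set, 6), round(total, 6))
```

The three parts agree with the listed 0.444597, 1.7758043 and 0.072431125 to within rounding, so the ranking of the 13 hits is driven by the usual tf, idf and fieldNorm factors and nothing in the plain query looks out of the ordinary.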

=========

   <lst name="debug">
   <str name="rawquerystring">
(james~0.75 AND sutherland~0.75) searchSet:testSet
</str>
   <str name="querystring">
(james~0.75 AND sutherland~0.75) searchSet:testSet
</str>
   <str name="parsedquery">
+(+text:james~0.75 +text:sutherland~0.75) +searchSet:testSet
</str>
   <str name="parsedquery_toString">
+(+text:james~0.75 +text:sutherland~0.75) +searchSet:testSet
</str>
   <lst name="explain">
   <str name="id=ECF.2227,internal_docid=19900">

0.10142321 = (MATCH) sum of:
 0.04733514 = (MATCH) sum of:
   0.03207182 = (MATCH) sum of:
     0.03207182 = (MATCH) weight(text:rames^0.20000005 in 19900), product of:
       0.1452334 = queryWeight(text:rames^0.20000005), product of:
         0.20000005 = boost
         11.306472 = idf(docFreq=2)
         0.06422576 = queryNorm
       0.22082953 = (MATCH) fieldWeight(text:rames in 19900), product of:
         1.0 = tf(termFreq(text:rames)=1)
         11.306472 = idf(docFreq=2)
         0.01953125 = fieldNorm(field=text, doc=19900)
   0.015263321 = (MATCH) sum of:
     0.015263321 = (MATCH) weight(text:netherland^0.20000005 in 19900), product of:
       0.10019111 = queryWeight(text:netherland^0.20000005), product of:
         0.20000005 = boost
         7.799914 = idf(docFreq=99)
         0.06422576 = queryNorm
       0.15234207 = (MATCH) fieldWeight(text:netherland in 19900), product of:
         1.0 = tf(termFreq(text:netherland)=1)
         7.799914 = idf(docFreq=99)
         0.01953125 = fieldNorm(field=text, doc=19900)
 0.05408807 = (MATCH) weight(searchSet:testSet in 19900), product of:
   0.07455304 = queryWeight(searchSet:testSet), product of:
     1.1607965 = idf(docFreq=76441)
     0.06422576 = queryNorm
   0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 19900), product of:
     1.0 = tf(termFreq(searchSet:testSet)=1)
     1.1607965 = idf(docFreq=76441)
     0.625 = fieldNorm(field=searchSet, doc=19900)
</str>
</lst>
</lst>
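
One possible reading of the two fragments, offered as a guess rather than a confirmed diagnosis: the plain query is parsed to the stemmed term text:jame, while the fuzzy query keeps text:james unstemmed, so the fuzzy match is made against the stemmed terms in the index. The sketch below assumes that the fuzzy matcher scores a candidate as 1 - editDistance / min(length) and keeps it only when that score is strictly greater than the threshold, then rescales the surplus into a boost. That formula is an assumption about the Lucene version in use, not something stated in this thread, and the function names are invented for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def similarity(query_term: str, indexed_term: str) -> float:
    """Assumed fuzzy score: 1 - distance / length of the shorter term."""
    shorter = min(len(query_term), len(indexed_term))
    return 1.0 - edit_distance(query_term, indexed_term) / shorter


def fuzzy_boost(query_term: str, indexed_term: str, min_sim: float):
    """Return the rescaled boost, or None if the indexed term is rejected."""
    sim = similarity(query_term, indexed_term)
    if sim <= min_sim:  # assumed strict "greater than" comparison
        return None
    return (sim - min_sim) / (1.0 - min_sim)


for query_term, indexed_term in [("james", "jame"),
                                 ("james", "rames"),
                                 ("sutherland", "netherland")]:
    for min_sim in (0.75, 0.70):
        print(query_term, indexed_term, min_sim,
              fuzzy_boost(query_term, indexed_term, min_sim))
```

Under these assumptions the indexed term jame scores exactly 0.75 against james, so it would be rejected at ~0.75 but accepted at ~0.70, which is consistent with 1 hit versus 97. The same formula gives a boost of about 0.2 for rames and netherland, in line with the ^0.20000005 shown in the explain output.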
