The 204 came from the default quality-weight of 1.0: it is multiplied by the 
document quality of 100, and the result is added to the score.  With a 
quality-weight of 0 the score would be 104, so the log(tf/idf) score is 
actually 104.

If you have qualities on documents that are set as negative numbers, then the 
score can get lower (because you will "add" a negative number for the quality 
part of the score).  Similarly, if you put a negative number for quality-weight 
it will subtract from the score (or add if it is a negative quality on the 
document...).
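
As a rough illustration of the arithmetic above, here is a minimal sketch in 
plain XQuery.  The function local:final-score is a hypothetical helper written 
only for this example, not part of any MarkLogic API; the base score of 104 
and the quality of 100 are the values from this thread.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: models Score = base + (quality-weight * quality).
   It only reproduces the arithmetic; it does not call the search engine. :)
declare function local:final-score(
  $base as xs:double,
  $quality-weight as xs:double,
  $quality as xs:double
) as xs:double
{
  $base + ($quality-weight * $quality)
};

(
  (: quality 100 with weights 0, 1, 2, 3: the cases discussed in this thread :)
  for $qw in (0, 1.0, 2.0, 3.0)
  return local:final-score(104, $qw, 100),

  (: a negative document quality lowers the score :)
  local:final-score(104, 1.0, -50),

  (: a negative weight with a positive quality also lowers it :)
  local:final-score(104, -1.0, 100)
)
```

Under these assumptions the first four values are 104, 204, 304 and 404, and 
the last two fall below the base score of 104.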

Also, remember that the number of fragments in the database, including deleted 
fragments, is part of the score calculation, so it makes a big difference in 
scores when you have tiny databases (like 1 or 2 documents).  If you update a 
document and the database has not yet merged out the deleted fragments, that 
will affect the score.  If you have hundreds of millions of documents in the 
database, it will not make much of a difference.

The nice thing about the operator approach you took is that it lets the user 
choose at runtime whether to apply the quality weight.  It should give the same 
score as the direct child element, though (assuming you use the same quality 
weight and your databases are exactly the same).

-Danny

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Stewart Shelline
Sent: Monday, October 18, 2010 2:02 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Quality weight

"If you make it 3.0, it will be 404 [  104 + (3.0 * 100) ]."

Shouldn't it be 504? If the log(tf/idf) score is 204 and the quality-weight is 
set to 3, wouldn't it be

   204 + ( 3 * 100 ) = 504

When I used <quality-weight> as a direct element child of the <options> node, I 
actually got scores lower than the original log(tf/idf) score, which drew my 
suspicion. At any rate, I have discovered that using the following actually 
gets me what I wanted, which is for the document quality to be multiplied by my 
quality weight, then added to the log(tf/idf) score:

            search:search(
                "score:qw some-terms",
                <options xmlns="http://marklogic.com/appservices/search">
                          ...
                    <operator name="score">
                        <state name="qw">
                            <quality-weight>{ $some-quality-weight }</quality-weight>
                        </state>
                    </operator>
                </options>, 1, 10 )


-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: Monday, October 18, 2010 2:52 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Quality weight

Hi Stewart,

Maybe I am not understanding what you are asking, but I think <quality-weight> 
will do what you want using the search API.  It should behave the same way as 
passing a quality-weight to cts:search (and in fact, I think that is what the 
search API is doing with it).
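
For comparison, here is a minimal sketch of the cts:search form, assuming the 
same test document and search term used in the example below.  The 
quality-weight is passed as the fourth argument; the empty sequence in the 
third position means no search options.  The weight of 2.0 is only an 
illustrative value.

```xquery
xquery version "1.0-ml";

(: Sketch: pass a quality-weight of 2.0 directly to cts:search
   and report the score of each matching document. :)
for $doc in cts:search(fn:doc(), cts:word-query("hello"), (), 2.0)
return cts:score($doc)
```

If the search API really does hand the weight through to cts:search, the score 
returned here should match what search:search reports for the same 
quality-weight against the same database.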

Here is the tricky thing about quality:  the quality on a document is 0 by 
default, so unless you create a document with some quality or update with some 
quality, any quality-weight you add to a search will have no effect.

As an example, I created a single document in an empty db with a quality of 100:

xdmp:document-insert("/test.xml", <a>hello</a>, (), (), 100)

Next I ran search:search on it with no options:

search:search("hello")

The score was 204.

Next, I ran search:search with a quality-weight of 1.0 (the default):

xquery version "1.0-ml";
import module namespace search =
  "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";
 
let $options :=
<options xmlns="http://marklogic.com/appservices/search">
  <quality-weight>1.0</quality-weight>
</options>
return
 
search:search("hello", $options)

The score is still 204.

Now change the options node as follows:

<quality-weight>2.0</quality-weight>

The score is now 304.

If you make it 3.0, it will be 404 [  104 + (3.0 * 100) ].

Does that make sense?
-Danny





-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Stewart Shelline
Sent: Monday, October 18, 2010 12:51 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Quality weight

According to the Search Developer's Guide, a document's score is calculated as 
follows when using cts:search:

    Score = Score + (QualityWeight * Quality)

How can I get this same effect when using search:search? It appears the 
<quality-weight> parameter is not the right answer, since it applies the weight 
to the overall score for the document, not as a multiplier on the document's 
quality, i.e.:

    Score = (Score + Quality) * QualityWeight



_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general