The 204 number came from a default quality-weight of 1.0, which is multiplied by the quality of 100, then added to the score. If the quality-weight is 0, then it would be 104. So the log(tf/idf) score is actually 104.
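To make the arithmetic concrete, here is a minimal sketch. It assumes the final score is simply the relevance part plus the quality-weight times the document quality, as described above. The helper name local:expected-score is made up for illustration and is not part of any MarkLogic API; the values 104 and 100 come from the single-document example in this thread and will differ in any other database.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper, not a MarkLogic function: it only restates the
   arithmetic from this thread, score = base + (quality-weight * quality). :)
declare function local:expected-score(
  $base as xs:double,
  $quality as xs:double,
  $quality-weight as xs:double
) as xs:double
{
  $base + ($quality-weight * $quality)
};

(: 104 is the relevance part of the score and 100 is the document quality
   in the example from this thread; both are assumptions elsewhere. :)
for $qw in (0, 1, 2, 3)
return fn:concat("quality-weight ", $qw, ": ",
                 local:expected-score(104, 100, $qw))
```

For quality-weights 0, 1, 2 and 3 this reproduces the figures quoted in the thread: 104, 204, 304 and 404.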
If you have qualities on documents that are set as negative numbers, then the score can get lower (because you will "add" a negative number for the quality part of the score). Similarly, if you put a negative number for quality-weight, it will subtract from the score (or add, if the document also has a negative quality).

Also, remember that the number of fragments in the database, including deleted fragments, is part of the score calculation, so with tiny databases (like 1 or 2 documents) it will make a big difference in scores. So if you update a document and the database has not merged out the deleted fragments yet, that will have an effect on the score. If you have hundreds of millions of documents in the database, it will not really make much of a difference.

The nice thing about the operator approach you took is that it lets the user choose the quality weight at runtime. The score you get should be the same as with the direct child, though (assuming you use the same quality weight and your databases are exactly the same).

-Danny

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stewart Shelline
Sent: Monday, October 18, 2010 2:02 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Quality weight

"If you make it 3.0, it will be 404 [ 104 + (3.0 * 100) ]."

Shouldn't it be 504? If the log(tf/idf) score is 204 and the quality-weight is set to 3, wouldn't it be 204 + ( 3 * 100 ) = 504?

When I used <quality-weight> as a direct element child of the <options> node, I actually got scores lower than the original log(tf/idf) score, which aroused my suspicion. At any rate, I have discovered that the following gets me what I wanted, which is for the document quality to be multiplied by my quality weight, then added to the log(tf/idf) score:

    search:search(
      "score:qw some-terms",
      <options xmlns="http://marklogic.com/appservices/search">
        ...
        <operator name="score">
          <state name="qw">
            <quality-weight>{ $some-quality-weight }</quality-weight>
          </state>
        </operator>
      </options>,
      1, 10 )

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: Monday, October 18, 2010 2:52 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Quality weight

Hi Stewart,

Maybe I am not understanding what you are asking, but I think <quality-weight> will do what you want using the search API. It should behave the same way as passing a quality-weight to cts:search (and in fact, I think that is what the search API is doing with it).

Here is the tricky thing about quality: the quality on a document is 0 by default, so unless you create a document with some quality or update it with some quality, any quality-weight you add to a search will have no effect.

As an example, I created a single document in an empty db with a quality of 100:

    xdmp:document-insert("/test.xml", <a>hello</a>, (), (), 100)

Next I ran search:search on it with no options:

    search:search("hello")

The score was 204.

Next, I ran search:search with a quality-weight of 1.0 (the default):

    xquery version "1.0-ml";
    import module namespace search = "http://marklogic.com/appservices/search"
      at "/MarkLogic/appservices/search/search.xqy";

    let $options :=
      <options xmlns="http://marklogic.com/appservices/search">
        <quality-weight>1.0</quality-weight>
      </options>
    return search:search("hello", $options)

The score is still 204. Now change the options node as follows:

    <quality-weight>2.0</quality-weight>

The score is now 304. If you make it 3.0, it will be 404 [ 104 + (3.0 * 100) ].

Does that make sense?
-Danny

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stewart Shelline
Sent: Monday, October 18, 2010 12:51 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] Quality weight

According to the Search Developer's Guide, a document's score is calculated as follows when using cts:search:

    Score = Score + (QualityWeight * Quality)

How can I get this same effect when using search:search? It appears the <quality-weight> parameter is not the right answer, since it applies the weight to the overall score for the document, not as a multiplier on the document's quality, i.e.:

    Score = (Score + Quality) * QualityWeight

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
