One interesting thing to talk about is when you need to create a new Query subclass, and how to do it.
For example, let's say you want something between a BooleanQuery and a PhraseQuery: it matches documents containing some of the query words (like the normal BooleanQuery), but gives a higher score to documents which contain these words near each other (there was a discussion about this idea about a month ago, when we discussed short documents).

In that case, what do I need to do? I suppose I need to write a new Query subclass, but what does doing this take? Do I need to write a "Scorer"? A "Similarity"? Or what?

I think this is an interesting topic.

--
Nadav Har'El
[EMAIL PROTECTED]
+972-4-829-6326

Grant Ingersoll <[EMAIL PROTECTED]>
15/06/2006 03:01 AM
To: java-dev@lucene.apache.org
cc:
Subject: Re: Scoring
Please respond to [EMAIL PROTECTED] pache.org

Karl,

This is a great start. I have also started a scoring.xml document under the xdocs directory (in my sandbox). So far, I have the following sections (some even have content under them!):

1. Introduction // Intro about Vector Space Model, some references to theory, links to the Similarity scoring Formula
2. Scoring and the Index // How scoring relates to what is in the index (i.e. how it takes advantage of precomputed info such as norms, etc.)
3. Understanding Similarity // How the Similarity class fits into Scoring and what it means to override the Similarity (Greek Kung Fu!)
4. Changing Your Scoring -- Expert // A discussion of overriding/creating Scorer/Query/Whatever else
5. Class Diagrams // Links to your cool pictures
6. Sequence Diagrams // More cool pictures

What else is needed/useful? Anyone want to volunteer on a section?

-Grant

karl wettin wrote:
> On Wed, 2006-06-07 at 08:27 -0400, Grant Ingersoll wrote:
>
>> I have started something in my sandbox that goes in the xdocs directory
>> that is going to cover the scoring and how it works (something parallel
>> in spirit to the file formats documentation). Adding in sequence
>> diagrams and whatever you have would be a perfect fit.
>> I would be happy to coordinate with you, as you may end up getting to it
>> before me.
>>
>> I would also like to see, possibly, some package level documentation and
>> more javadocs.
>>
>
> Day (night) one of me getting to know the finding and scoring of the
> documents matching a query ended up with an initial class diagram.
>
> <http://wiki.apache.org/jakarta-lucene/KarlWettin?action=AttachFile&do=view&target=search_uml_1.jpg>
> <http://shorl.com/hynulymolijo>
>
> Feel free to let me know what I got wrong.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

--
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886
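
To make the idea raised at the top of this thread a bit more concrete (match any of the query words, as a BooleanQuery does, but score documents higher when those words occur near each other), here is a minimal toy sketch in plain Java. It deliberately does not use Lucene's Query, Scorer, or Similarity classes and does not answer which of them must be subclassed; the class name ProximityBoostSketch, the method score, and the 1/gap bonus are all assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy scorer; none of these names come from Lucene.
public class ProximityBoostSketch {

    // Base score: one point per distinct query term found (BooleanQuery-like).
    // Bonus: 1/gap for each pair of neighbouring matches of different terms,
    // so matches that sit close together contribute more (PhraseQuery-like).
    static double score(String[] docTokens, Set<String> queryTerms) {
        Set<String> seen = new HashSet<>();
        double bonus = 0.0;
        int lastPos = -1;
        String lastTerm = null;
        for (int pos = 0; pos < docTokens.length; pos++) {
            String token = docTokens[pos];
            if (!queryTerms.contains(token)) {
                continue; // not a query word, ignore
            }
            seen.add(token);
            if (lastTerm != null && !lastTerm.equals(token)) {
                bonus += 1.0 / (pos - lastPos); // closer matches, bigger bonus
            }
            lastPos = pos;
            lastTerm = token;
        }
        return seen.size() + bonus;
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<>(Arrays.asList("short", "documents"));
        String[] near = "scoring short documents fairly".split(" ");
        String[] far = "short notes about scoring long documents".split(" ");
        String[] one = "only short text here".split(" ");
        System.out.println("near: " + score(near, query));
        System.out.println("far:  " + score(far, query));
        System.out.println("one:  " + score(one, query));
    }
}
```

In this sketch a document with both words adjacent scores above one with both words far apart, which in turn scores above a document containing only one of the words. A real implementation would have to work from the term positions stored in the index rather than from raw token arrays.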