Just my two cents, I think what he meant by "single field" is the following:
If the concept of a "field" was introduced to differentiate the significance of term occurrences in different regions of a document (e.g., an occurrence in the title is more important than one in the body), that significance can alternatively be represented by (or at least encoded in) an "impact", which is assigned per occurrence of a term. For example, if the base impact value for a term occurrence is 1.0, you can assign an additional 0.5 to an occurrence in the title. You would then have an impact of 1.5 for the title occurrence of that term and 1.0 for the body occurrence.

Does this make sense to you?

I feel that people should take a look at the theoretical retrieval model first. It is not clear to me that Lucene is a full vector-space model. It seems that many assumptions have been made about the context of the discussion.

Michael

--- jian chen <[EMAIL PROTECTED]> wrote:

> Hi, Jeff,
>
> I like the idea of impact based scoring. However, could you elaborate more
> on why we only need to use single field at search time?
>
> In Lucene, the indexed terms are field specific, and two terms, even if they
> are the same, are still different terms if they are of different fields.
>
> So, I think the multiple field scenario is still needed right? What if the
> user wants to search on both subject and content for emails, for example,
> and sometimes, only wants to search on subject, this type of tasks, without
> multiple fields, how this would be handled.
>
> I got lost on this, could any one educate?
>
> Thanks,
>
> Jian
>
> On 1/9/07, Dalton, Jeffery <[EMAIL PROTECTED]> wrote:
> >
> > I'm not sure we fully understand one another, but I'll try to explain
> > what I am thinking.
> >
> > Yes, it has use after sorting. It is used at query time for document
> > scoring in place of the TF and length norm components (new scorers
> > would need to be created).
> >
> > Using an impact based index moves most of the scoring from query time to
> > index time (trades query time flexibility for greatly improved query
> > search performance). Because the field boosts, length norm, position
> > boosts, etc... are incorporated into a single document-term-score, you
> > can use a single field at search time. It allows one posting list per
> > query term instead of the current one posting list per field per query
> > term (MultiFieldQueryParser wouldn't be necessary in most cases). In
> > addition to having fewer posting lists to examine, you often don't need
> > to read to the end of long posting lists when processing with a
> > score-at-a-time approach (see Anh/Moffat's Pruned Query Evaluation Using
> > Pre-Computed Impacts, SIGIR 2006) for details on one potential
> > algorithm.
> >
> > I'm not quite sure what you mean when mention leaving them out and
> > re-calculating them at merge time.
> >
> > - Jeff
> >
> > > -----Original Message-----
> > > From: Marvin Humphrey [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, January 09, 2007 2:58 PM
> > > To: java-dev@lucene.apache.org
> > > Subject: Re: Beyond Lucene 2.0 Index Design
> > >
> > > On Jan 9, 2007, at 6:25 AM, Dalton, Jeffery wrote:
> > >
> > > > e. <impact, num_docs, (doc1,...docN)>
> > > > f. <impact, num_docs, ([doc1, freq ,<positions>],...[docN, freq
> > > > ,<positions>])
> > >
> > > Does the impact have any use after it's used to sort the postings?
> > > Can we leave it out of the index format and recalculate at merge-time?
> > >
> > > Marvin Humphrey
> > > Rectangular Research
> > > http://www.rectangular.com/
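
P.S. To make the impact idea from the top of this message a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Lucene works today: the names occurrence_impact and build_impact_index are hypothetical, the 1.0 base and 0.5 title bonus are just the numbers from my example, and summing the occurrence impacts into one document-term score is one possible choice among many.

```python
from collections import defaultdict

# Hypothetical values taken from the example in the message: every
# occurrence starts at 1.0 and a title occurrence gets an extra 0.5.
BASE_IMPACT = 1.0
REGION_BONUS = {"title": 0.5, "body": 0.0}


def occurrence_impact(region):
    """Impact of one term occurrence in the given document region."""
    return BASE_IMPACT + REGION_BONUS.get(region, 0.0)


def build_impact_index(docs):
    """Build one posting list per term, with no per-field split.

    docs maps doc_id -> {region: text}. Summing occurrence impacts into a
    single document-term score is an assumption of this sketch.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for doc_id, regions in docs.items():
        for region, text in regions.items():
            for term in text.lower().split():
                scores[term][doc_id] += occurrence_impact(region)
    # Sort each posting list by descending impact, ties broken by doc id.
    return {
        term: sorted(postings.items(), key=lambda p: (-p[1], p[0]))
        for term, postings in scores.items()
    }


docs = {
    1: {"title": "lucene index design", "body": "notes on the index format"},
    2: {"title": "search basics", "body": "how lucene scores a query"},
}
index = build_impact_index(docs)
print(index["lucene"])
```

With the region weight folded into the stored score, a query for a term reads a single posting list, and the title-versus-body distinction survives only as a difference in impact. The cost is that a subject-only search, as in Jian's question, can no longer be answered from this list alone.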