Hello everybody, As far as I know Lucene processes documents DAAT. Depending on the query either the intersection or union is calculated. For the intersection only documents occurring in all posting lists are scored. In the union case every document is scored which makes it a more expensive operation.
Lucene stores its index into several files. Depending on the query different files might be accessed for scoring. For example a payload query needs to read paylods from .pos. What is not clear for me how term frequencies or payloads are processed. Assuming I store term frequencies I need to set setOmitTermFreqAndPositions(false). 1) Which queries include term frequencies? I assume all queries if term frequencies are stored? 2) Why is fetching payloads so much more expensive than getting term frequencies. Both are stored in seperated files and therefore demand a disk seek. 3) What for a value contains tf if I set setOmitTermFreqAndPositions(true)? Allways 1? 4) How are term freqs, payloads read from disk? In bulk for all remaining docs at once or every time a document gets scored? Regards Alex -- View this message in context: http://lucene.472066.n3.nabble.com/Lucene-query-processing-tp2868144p2868144.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org