[ https://issues.apache.org/jira/browse/LUCENE-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638888#action_12638888 ]
Michael McCandless commented on LUCENE-1410:
--------------------------------------------

bq. It should really make a difference for stop words and disjunction queries depending on DocIdSetIterator.next().

Yes.

bq. Conjunctions that depend on skipTo(docNum) will probably make it necessary to impose an upper bound on the size of the compressed arrays.

Yes.

Though, I think when optimizing search performance we should focus entirely on the high-latency queries: TermQuery on very frequent terms, disjunction queries involving common terms, phrase/span queries that have many matches, etc.

EG if PFOR speeds up high-latency queries by, say, 20% (say 10 sec -> 8 sec), but causes queries that are already fast (say 30 msec) to get a bit slower (say 40 msec), I think that's fine. It's the high-latency queries that kill us, because those are the ones that limit how large a collection you can put on one box before you're forced to shard your index.

At some point we should make use of concurrency when iterating over large result sets. EG if the estimated # of total hits is > X docs, use multiple threads where each thread skips to its own "chunk" and iterates over it, and then merge the results. Then we should be able to cut down the latency of the slowest queries and handle more documents on a single machine. Computers are very quickly becoming very concurrent.

bq. I'm wondering whether it would make sense to add skip info to the term positions of very large documents. Any ideas on that?

Probably we should -- yet another issue :)

> PFOR implementation
> -------------------
>
>                 Key: LUCENE-1410
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1410
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: autogen.tgz, LUCENE-1410b.patch, LUCENE-1410c.patch, TestPFor2.java, TestPFor2.java, TestPFor2.java
>
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
>
> Implementation of Patched Frame of Reference.
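
The "each thread skips to its own chunk, then merge" idea from the comment above could look roughly like the sketch below. This is only an illustration and not Lucene code: the postings list is modeled as a strictly increasing int[] of doc IDs, the "skip" is a plain binary search, and the merge is just a sum of per-chunk hit counts. The names ChunkedPostingsScan, PARALLEL_THRESHOLD, countChunk and countHits are made up for the example, and the threshold value merely stands in for the unspecified "X docs". It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ChunkedPostingsScan {
    /** Hypothetical stand-in for the "X docs" threshold; the value is arbitrary. */
    static final int PARALLEL_THRESHOLD = 1_000;

    /** "Skips to" start with a binary search, then counts doc IDs in [start, end). */
    static int countChunk(int[] sortedDocIds, int start, int end) {
        int i = Arrays.binarySearch(sortedDocIds, start);
        if (i < 0) {
            i = -i - 1; // not found: continue from the insertion point
        }
        int count = 0;
        while (i < sortedDocIds.length && sortedDocIds[i] < end) {
            count++;
            i++;
        }
        return count;
    }

    /** Counts all hits below maxDoc, splitting the doc ID space across threads for large lists. */
    static int countHits(int[] sortedDocIds, int maxDoc, int numThreads) {
        if (sortedDocIds.length <= PARALLEL_THRESHOLD || numThreads <= 1) {
            return countChunk(sortedDocIds, 0, maxDoc); // small result set: stay single-threaded
        }
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            int chunkSize = (maxDoc + numThreads - 1) / numThreads; // ceiling division
            List<Future<Integer>> parts = new ArrayList<>();
            for (int t = 0; t < numThreads; t++) {
                final int start = t * chunkSize;
                final int end = Math.min(maxDoc, start + chunkSize);
                Callable<Integer> task = () -> countChunk(sortedDocIds, start, end);
                parts.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> part : parts) {
                total += part.get(); // merge step: sum the per-chunk counts
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while scanning", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("chunk scan failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A real collector would have to merge scored hits (for example per-thread priority queues) rather than counts, and would want a shared thread pool instead of creating one per query; both are left out to keep the sketch short.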