Hi Mike and Martin,

We have a similar use-case. Is there a scalability/performance issue with the getDocIdSet having to iterate through hundreds of thousands of docIDs?
Tom Burton-West
http://www.hathitrust.org/blogs/large-scale-search

-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Thursday, July 22, 2010 5:20 AM
To: java-user@lucene.apache.org
Subject: Re: on-the-fly "filters" from docID lists

It sounds like you should implement a custom Filter?

Its getDocIdSet would consult your foreign key-value store and iterate
through the allowed docIDs, per segment.

Mike

On Wed, Jul 21, 2010 at 8:37 AM, Martin J <martinj.eng...@gmail.com> wrote:
> Hello, we are trying to implement a query type for Lucene (with eventual
> target being Solr) where the query string passed in needs to be "filtered"
> through a large list of document IDs per user. We can't store the user ID
> information in the lucene index per document so we were planning to pull the
> list of documents owned by user X from a key-value store at query time and
> then build some sort of filter in memory before doing the Lucene/Solr query.
> For example:
>
> content:"cars" user_id:X567
>
> would first pull the list of docIDs that user_id:X567 has "access" to from a
> keyvalue store and then we'd query the main index with content:"cars" but
> only allow the docIDs that came back to be part of the response. The list of
> docIDs can near the hundreds of thousands.
>
> What should I be looking at to implement such a feature?
>
> Thank you
> Martin
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
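For anyone following the thread, here is a rough sketch of the "iterate through the allowed docIDs, per segment" idea. It is plain Java on java.util.BitSet, not Lucene's Filter/DocIdSet API, and it rests on assumptions the thread does not confirm: that the key-value store returns index-wide integer docIDs, and that each segment covers a contiguous range starting at a known docBase. The names SegmentDocIdFilter and bitsForSegment are made up for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative stand-in for a per-segment docID filter; not Lucene API. */
final class SegmentDocIdFilter {
    // Allowed docIDs (assumed index-wide), kept sorted ascending.
    private final int[] sortedDocIds;

    SegmentDocIdFilter(int[] allowedDocIds) {
        int[] copy = allowedDocIds.clone();
        Arrays.sort(copy); // sort once, when the list comes back from the store
        this.sortedDocIds = copy;
    }

    /**
     * Builds the bit set for one segment that covers the index-wide range
     * [docBase, docBase + maxDoc). Bits are segment-relative.
     */
    BitSet bitsForSegment(int docBase, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        long end = (long) docBase + maxDoc; // long avoids int overflow
        // Jump to the first allowed ID in this segment, then walk forward
        // only while the IDs still belong to it.
        for (int i = lowerBound(sortedDocIds, docBase);
             i < sortedDocIds.length && sortedDocIds[i] < end; i++) {
            bits.set(sortedDocIds[i] - docBase);
        }
        return bits;
    }

    /** Index of the first element >= key (binary search). */
    private static int lowerBound(int[] a, int key) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        SegmentDocIdFilter filter =
            new SegmentDocIdFilter(new int[] {7, 2, 12, 15, 3});
        System.out.println(filter.bitsForSegment(0, 10));  // prints {2, 3, 7}
        System.out.println(filter.bitsForSegment(10, 10)); // prints {2, 5}
    }
}
```

In this sketch each segment call does one binary search and then touches only the IDs that fall inside that segment, so the work across all segments is roughly linear in the length of the list, on top of the one-time sort. Whether that is fast enough for hundreds of thousands of docIDs would need measuring against a real index and key-value store.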