DocValues looks interesting, since it gives a non-inverted field. I'll play with it a bit and see how it works. Thanks for the suggestion.
I don't know how many total terms we've got, but each "document" is only 2-5 words/terms on average, and there is a TON of overlap between docs.

-Greg

On Tue, Aug 20, 2013 at 11:38 AM, Jack Krupansky <j...@basetechnology.com> wrote:
> Sounds like a problem for DocValues - assuming the number of unique values
> fits reasonably in memory to avoid I/O.
>
> How many unique values do you have or contemplate for your two billion
> documents?
>
> Two possibilities:
>
> 1. You need a lot more hardware.
> 2. You need to scale back your ambitions.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Greg Preston
> Sent: Tuesday, August 20, 2013 2:00 PM
> To: solr-user@lucene.apache.org
> Subject: Autosuggest on very large index
>
> Using 4.4.0 -
>
> I would like to be able to do an autosuggest query against one of the
> fields in our index and have the results be limited by an fq.
>
> I can get exactly the results I want with a facet query using a
> facet.prefix, but the first query takes ~5 minutes to run on our QA
> env (~240M docs). I'm afraid to attempt it on prod (~2B docs).
> Subsequent queries are sufficiently fast (~500ms).
>
> I'm assuming the first query is uninverting the field. Is there any
> way to mark that field so that an uninverted copy is maintained as
> updates come in? We plan to soft commit every 5 minutes, and we'd
> prefer to not be continuously uninverting this one field.
>
> Or is there a better way to do what I'm trying to do? I've looked at
> the spellcheck component a little bit, but it looks like I can't
> filter results by fq. The fq I'm using is based on which client is
> logged in, and we can't autosuggest terms from one client to another.
>
> Thanks.
>
> -Greg
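
For anyone following the thread, here is a rough sketch of the kind of request described in the original message: a facet query that uses facet.prefix for the typed characters and an fq to restrict suggestions to one client. The request parameters are standard Solr ones, but the host, core name (collection1), field name (suggest_terms), and filter (client_id:42) are made up for illustration and are not from the thread.

```python
from urllib.parse import urlencode


def build_suggest_url(base_url, field, prefix, client_filter, limit=10):
    """Build a Solr select URL that returns facet counts as suggestions.

    All argument values are supplied by the caller; the names used in the
    example below are hypothetical, not taken from the original thread.
    """
    params = [
        ("q", "*:*"),                 # match everything; fq does the narrowing
        ("rows", "0"),                # only facet counts are needed, no documents
        ("fq", client_filter),        # restrict suggestions to one client's docs
        ("facet", "true"),
        ("facet.field", field),       # field whose terms are suggested
        ("facet.prefix", prefix),     # what the user has typed so far
        ("facet.limit", str(limit)),  # cap the number of suggestions
        ("facet.mincount", "1"),      # skip terms with no matching docs
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host, core, field, and filter values.
    print(build_suggest_url(
        "http://localhost:8983/solr/collection1/select",
        "suggest_terms",
        "sol",
        "client_id:42",
    ))
```

This only builds the URL; it does nothing about the slow first query, which is the cost the thread is actually asking about.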