Antony Bowesman <[EMAIL PROTECTED]> wrote on 27/02/2007 17:37:41: > Doron Cohen wrote: > > The collect() method is going to be invoked once for each document that > > matches the query (having nonzero score). If the index is very large, that > > may turn to be a very large number of calls. Often, search applications > > only fetch additional data (doc fields) for only a small subset of the > > entire set of documents matching a query - e.g. first page (0-9), second > > page (10-19), etc. But if your application is going to fetch in an > > exhaustive manner, and especially for a short field like DB_ID, it > > sometimes makes sense to cache in memory the entire field (its values for > > all the docs), for the entire life of the index reader/searcher, and use > > that cached data. The collect method can then use that cached data. > > That's an excellent idea! We cannot easily change our client > implementation, so > have to support the exhaustive retrieval for now, although I do limit the
> absolute max hits that will be returned. We are hoping to implement > paging in a > later client version. > > I'm not sure I can cache all the GUIDs though. A GUID is 20 bytes > and there are > two that need to be cached. The document count could be up to 100M,though in > most cases <20M. I am keeping a BitSet filter cache for a searcher for each > user's mail, so I could extend that to cache all the IDs for that > user and give > that cache a shortish life and/or limit the total cache available. > That would > really help. > > I'll have a play - thanks for the input. > Antony If you decide to cache stored field value in memory, FieldCache may be useful for this - so you don't have to implement your own cache - you can access the field values with something like: FieldCache fieldCache = FieldCache.DEFAULT; String db_id_field[] = fieldCache.getStrings(indexReader,"DB_ID_FIELD_NAME"); Those values are valid for the lifetime of the index-reader. Once a new index reader is opened, when GC collects the unused old index reader object, it would also be able to collect (from the cache) unused field values. See also http://www.gossamer-threads.com/lists/lucene/java-user/39352 Doron --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]