That is exactly what I did when I started to realize the effects of using Lucene sorting with millions of documents in the index. I used STORED fields and sorted the results with a generic Comparator, which is configured for a field and a search order. I only do this if the query did not return too many search results.
Thanks a lot, Tom -----Ursprüngliche Nachricht----- Von: Chris Hostetter [mailto:[EMAIL PROTECTED] Gesendet: Thursday, June 29, 2006 6:40 AM An: java-user@lucene.apache.org Betreff: Re: MemoryUsage of sorting : some OutOfMemory errors. If I understand it correctly, each unique term : in a field is read into a cache, when I use Searcher.search(Query query, : Sort sort) with one SortField. So even if my query only finds 5 Minor clarification: if the sort type is one of the numeric types, then an array of that type is created of the same size as the number of docs in your index -- regardless of how many unique terms there are. if the sort type is String, then a String[] of all the unique Term values is created, *and* and array of ints (one per document) is created to use as an index into that String[] : documents, Lucene would start to build a cache of maybe a few millionen : unique field entries, which would then be re-used for a further queries. correct - the assumption is that if you are sorting on field "foo" in this query, there will probably be another query you want to sort on field "foo" in the near future. : Is this correct? It would probably be best practice, to do sorting : yourself, if many unique terms are concerned for only a few search : results, especially since we have lots of index updates, which makes : reusing IndexSearcher a little lot harder. it might make sense to do that ... but it you would have to use "STORED" fields to do that -- typically searching is done on INDEXED fields, because it's very easy/fast to build up the FieldCache for every document by walking the TermEnum of the indexed fields; but if you don't want to build the FieldCache for every document you'll need some way to get a value for each of hte matching documents -- STORED fields seem like the most logical. You could probably write a pretty generic SortComparatorSource that would do this. -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]