On Thursday 09 March 2006 15:54, Yonik Seeley wrote:
> On 3/9/06, Øyvind Stegard <[EMAIL PROTECTED]> wrote:
> > - How do many stored fields affect indexing/query performance
> >   compared to having no stored fields (only indexed ones)?
>
> Additional stored fields should have no effect on querying (the
> internal information about a field is looked up in a hashmap).
>
> Additional stored fields that are used have an impact on indexing,
> since that data must be copied every time segments are merged.
>
> Additional stored fields that are not used in most documents (sparse)
> should have very little performance impact on indexing. The field
> list is walked a few times linearly (in-memory) during a segment
> merge, which should be very fast, but it's still O(n), so don't go
> crazy and have a million stored field types.
>
> > - Are there any known scalability issues with a large number of
> >   distinct fields in an index (not necessarily the same set of
> >   fields for every doc)?
>
> If they are indexed fields, yes.
> Each indexed field has a 1 byte norm *per document*, regardless of
> whether the document contains that field. In the current version of
> Lucene, there is a way to omit these norms on a per-field basis (see
> Field.setOmitNorms()) if you don't need length normalization or
> index-time field boosting.

Thanks for the quick and informative reply!

I will look further into omitting norm data from fields (we typically
do very exact searches and don't make much use of score data yet).

Øyvind

-- 
< Øyvind Stegard < oyviste at usit uio no >>
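
A rough sketch of the norm cost Yonik describes above, assuming exactly 1 byte per indexed field per document, whether or not the document contains the field. The class NormCost, the method normBytes, and the document and field counts are hypothetical and chosen only for illustration; no Lucene API is called here.

```java
// Back-of-the-envelope estimate only; not part of Lucene.
class NormCost {
    // Assumption taken from the thread: each indexed field that keeps
    // norms costs 1 byte per document, even for documents that do not
    // contain the field.
    static long normBytes(long numDocs, long fieldsWithNorms) {
        return numDocs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        long docs = 10_000_000L;    // hypothetical index size
        long indexedFields = 200L;  // hypothetical count of distinct indexed fields
        long keptNorms = 10L;       // hypothetical: norms omitted on the other 190

        System.out.println("norms on all fields: "
                + normBytes(docs, indexedFields) + " bytes");
        System.out.println("norms on 10 fields:  "
                + normBytes(docs, keptNorms) + " bytes");
    }
}
```

With these made-up numbers, 200 indexed fields across 10 million documents come to roughly 2 GB of norm data, while keeping norms on only 10 of them brings that down to roughly 100 MB. That difference is why Field.setOmitNorms() may be worth a look when an index has many distinct indexed fields and scoring matters little.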