Thanks everyone for your help so far. I'm still trying to get to the bottom of whether switching to index-time boosts will give me a performance improvement and, if so, whether it will be noticeable. This all assumes that I can achieve the scoring behavior I need with either index-time or search-time boosting (given the loss of precision). I can always dust off the old profiler to see what's going on with the search-time boosts, but testing the index-time boosts will require a full reindex, which could take days with our dataset.
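
To get a feel for the precision loss before committing to a reindex, here is a rough sketch of what squeezing a boost into a single byte could do to nearby values. It is only an illustration: the class and method names (BoostPrecisionSketch, encode, decode) are made up, and the bit layout is modeled on my understanding of Lucene's SmallFloat.floatToByte315 rather than copied from the actual indexing code path. Also keep in mind that the stored byte holds the whole norm (boost combined with length normalization), not the boost alone, so real results may differ.

```java
public class BoostPrecisionSketch {

    // Hypothetical helper: keep only the top bits of a float so that it
    // fits in one byte. Layout assumed to follow SmallFloat.floatToByte315.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;              // drop the low 21 bits of the float
        int zero = (63 - 15) << 3;           // offset of the smallest exponent kept
        if (small <= zero) {
            // Zero and negatives map to 0; tiny positives map to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= zero + 0x100) {
            return (byte) -1;                // too large: clamp to the largest code
        }
        return (byte) (small - zero);
    }

    // Hypothetical helper: expand the byte back into a float.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        // Boosts that differ slightly, as date-based boosts for nearby dates might.
        float[] boosts = {1.0f, 1.1f, 1.2f, 1.25f, 1.3f};
        for (float boost : boosts) {
            System.out.println(boost + " -> " + decode(encode(boost)));
        }
    }
}
```

Under these assumptions, 1.0, 1.1 and 1.2 all come back as 1.0, while 1.25 and 1.3 both come back as 1.25. If the real encoding behaves similarly, documents whose dates are close together would end up with identical boosts.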

On Sat, Jun 5, 2010 at 9:17 AM, Robert Muir <rcm...@gmail.com> wrote:

> On Fri, Jun 4, 2010 at 7:50 PM, Asif Rahman <a...@newscred.com> wrote:
>
> > Perhaps I should have been more specific in my initial post. I'm doing
> > date-based boosting on the documents in my index, so as to assign a higher
> > score to more recent documents. Currently I'm using a boost function to
> > achieve this. I'm wondering if there would be a performance improvement if
> > instead of using the boost function at search time, I indexed the documents
> > with a date-based boost.
>
> Asif, without knowing more details, before you look at performance you might
> want to consider the relevance impacts of switching to index-time boosting
> for your use case too.
>
> You can read more about the differences here:
> http://lucene.apache.org/java/3_0_1/scoring.html
>
> But I think the most important for this date-influenced use case is:
>
> "Indexing time boosts are preprocessed for storage efficiency and written to
> the directory (when writing the document) in a single byte (!)"
>
> If you do this as an index-time boost, your boosts will lose lots of
> precision for this reason.
>
> --
> Robert Muir
> rcm...@gmail.com

--
Asif Rahman
Lead Engineer - NewsCred
a...@newscred.com
http://platform.newscred.com