Thanks everyone for your help so far.  I'm still trying to get to the bottom
of whether switching over to index-time boosts will give me a performance
improvement and, if so, whether it will be noticeable.  This all assumes that
I can achieve the scoring functionality that I need with either index-time or
search-time boosting, given the loss of precision.  I can always dust off the
old profiler to see what's going on with the search-time boosts, but testing
the index-time boosts will require a full reindex, which could take days with
our dataset.
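
To get a rough feel for the precision loss without a full reindex, here is a
minimal sketch of a one-byte float encoding, modeled on the scheme Lucene 3.x
is understood to use when it stores norms.  The class name, method names and
bit-layout constants are assumptions for illustration, not taken from this
thread; check them against the Lucene version in use.

```java
// Sketch only: a one-byte float encoding modeled on the scheme Lucene 3.x
// is understood to use for norms. The constants are an assumption; verify
// them against your Lucene version before relying on exact values.
class BoostPrecisionSketch {

    // Pack a float into one byte by keeping only its high-order bits.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;        // drop the 21 low-order mantissa bits
        int zero = (63 - 15) << 3;     // offset that shifts the usable range
        if (small <= zero) {
            // zero or negative input -> 0, tiny positive input -> 1
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= zero + 0x100) {
            return (byte) -1;          // too large -> largest code (0xFF)
        }
        return (byte) (small - zero);
    }

    // Expand the byte back into a float; the dropped bits come back as zeros.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // What a boost looks like after being written to and read from one byte.
    static float roundTrip(float boost) {
        return decode(encode(boost));
    }

    public static void main(String[] args) {
        float[] boosts = {1.0f, 1.1f, 1.2f, 1.25f, 1.3f, 1.5f};
        for (float b : boosts) {
            System.out.println(b + " -> " + roundTrip(b));
        }
    }
}
```

Under these assumptions, boosts of 1.0, 1.1 and 1.2 all come back as 1.0, and
1.3 comes back as 1.25, so closely spaced date-based boosts would collapse
into the same stored value.  The stored byte in a real index also folds in
other factors such as length normalization, so the actual loss may differ.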

On Sat, Jun 5, 2010 at 9:17 AM, Robert Muir <rcm...@gmail.com> wrote:

> On Fri, Jun 4, 2010 at 7:50 PM, Asif Rahman <a...@newscred.com> wrote:
>
> > Perhaps I should have been more specific in my initial post.  I'm doing
> > date-based boosting on the documents in my index, so as to assign a higher
> > score to more recent documents.  Currently I'm using a boost function to
> > achieve this.  I'm wondering if there would be a performance improvement if
> > instead of using the boost function at search time, I indexed the documents
> > with a date-based boost.
> >
> >
> Asif, without knowing more details, before you look at performance you might
> want to consider the relevance impacts of switching to index-time boosting
> for your use case too.
>
> You can read more about the differences here:
> http://lucene.apache.org/java/3_0_1/scoring.html
>
> But I think the most important for this date-influenced use case is:
>
> "Indexing time boosts are preprocessed for storage efficiency and written to
> the directory (when writing the document) in a single byte (!)"
>
> If you do this as an index-time boost, your boosts will lose lots of
> precision for this reason.
>
> --
> Robert Muir
> rcm...@gmail.com
>



-- 
Asif Rahman
Lead Engineer - NewsCred
a...@newscred.com
http://platform.newscred.com
