As an alternative approach, I've been looking into attaching a float/int Payload to each Term, and then providing a custom Similarity class so that Terms with a higher Payload score do better in the query. There are several tutorials online about this, and I think I've got it worked out. But I have one question: the docs say that if I override Similarity, I must provide it to IndexWriter, not just the Searcher. This article does the same:
http://edwarddrapkin.com/2011/04/14/an-introduction-to-lucene-payloads/

But if my custom Similarity just extends DefaultSimilarity and only overrides scorePayload, I don't see why IndexWriter needs it, because nothing from that method is written into the index. Or am I mistaken about that?

Thanks,
Paul

On Thu, Oct 25, 2012 at 5:18 PM, Paul Jungwirth <[email protected]> wrote:
> Okay, thanks! This is my first time using Lucene, and what I want to
> do with it seems just slightly off the beaten path, so I'm glad to get
> some confirmation from an expert.
>
> Yours,
> Paul
>
>
> On Thu, Oct 25, 2012 at 11:51 AM, Upayavira <[email protected]> wrote:
>> This would seem a pretty reasonable way to go. It would just require
>> that you know the boost for each category at indexing time, and would
>> likely require some experimentation to identify the best boosts for each
>> of your categories.
>>
>> Other than that, it seems perfectly reasonable to me.
>>
>> Upayavira
>>
>> On Thu, Oct 25, 2012, at 07:18 PM, Paul Jungwirth wrote:
>>> Thank you for your help! Just to be clear: I wasn't asking for the
>>> syntax, but I was wondering if in your judgment this approach is
>>> appropriate. Will it give sensible results? Are there drawbacks in
>>> performance, flexibility, etc.? Is there a better way to do it?
>>>
>>> Thanks,
>>> Paul
>>>
>>>
>>> On Thu, Oct 25, 2012 at 11:13 AM, Upayavira <[email protected]> wrote:
>>> > In Solr syntax:
>>> > <field name="category" boost="8">entertainment</field>
>>> > <field name="category" boost="4">tv</field>
>>> > <field name="category" boost="20">sports</field>
>>> > <field name="category" boost="5">entertainment</field>
>>> >
>>> > That way: category(football tv) would do as you require, and would boost
>>> > football above TV.
>>> >
>>> > That is - use index time boosts on your fields when you add them.
>>> >
>>> > Upayavira
>>> >
>>> > On Thu, Oct 25, 2012, at 06:16 PM, Paul Jungwirth wrote:
>>> >> Hello,
>>> >>
>>> >> I have documents with various tags, and each tag has a numeric score,
>>> >> so one document might be tagged "sports:20, entertainment:5,
>>> >> football:10", and another "entertainment:8, tv:4". I'd like to let
>>> >> people search by one or more tags, e.g. "football tv", and have the
>>> >> results sorted with higher-scored tags first. I thought I could do
>>> >> this by adding a separate Field for each tag (all named "tag" or
>>> >> whatever), and then boosting the fields according to their score. Does
>>> >> that seem like a good approach, or is there some cleaner way? I've
>>> >> been reading the Lucene in Action book and looking through the online
>>> >> docs, but I haven't found this usage scenario anywhere.
>>> >>
>>> >> Thanks,
>>> >> Paul
>>> >>
>>> >> --
>>> >> _________________________________
>>> >> Pulchritudo splendor veritatis.
>>>
>>>
>>>
>>> --
>>> _________________________________
>>> Pulchritudo splendor veritatis.
>
>
>
> --
> _________________________________
> Pulchritudo splendor veritatis.

--
_________________________________
Pulchritudo splendor veritatis.
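
To make the question at the top of this message concrete, here is a minimal sketch in plain Java of the split I have in mind. It deliberately does not use the Lucene API: the class PayloadSketch and both of its methods are hypothetical stand-ins. It assumes that the payload bytes are the only thing written at index time, and that a scorePayload-style method merely decodes those bytes at search time. Whether the real IndexWriter needs the Similarity for other reasons is exactly what I'm unsure about.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, NOT the Lucene API. It only mimics the split between
// writing payload bytes at index time and reading them back at search time.
class PayloadSketch {

    // "Index time": pack a tag's float score into 4 big-endian bytes.
    static byte[] encodeFloat(float score) {
        int bits = Float.floatToIntBits(score);
        return new byte[] {
            (byte) (bits >>> 24), (byte) (bits >>> 16), (byte) (bits >>> 8), (byte) bits
        };
    }

    // "Search time": what a scorePayload-style override might do with the
    // stored bytes. It reads them and writes nothing back.
    static float scorePayload(byte[] payload) {
        if (payload == null || payload.length < 4) {
            return 1.0f; // neutral factor when a term carries no payload
        }
        int bits = ((payload[0] & 0xFF) << 24)
                 | ((payload[1] & 0xFF) << 16)
                 | ((payload[2] & 0xFF) << 8)
                 | (payload[3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        // Pretend to index one document tagged sports:20, entertainment:5, football:10.
        Map<String, byte[]> indexedTags = new LinkedHashMap<>();
        indexedTags.put("sports", encodeFloat(20f));
        indexedTags.put("entertainment", encodeFloat(5f));
        indexedTags.put("football", encodeFloat(10f));

        // Pretend to search: decode each tag's payload into a score factor.
        for (Map.Entry<String, byte[]> tag : indexedTags.entrySet()) {
            System.out.println(tag.getKey() + " -> " + scorePayload(tag.getValue()));
        }
    }
}
```

In this toy version the scoring method could change freely without re-indexing, since only the encoded bytes are stored.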
