David Spencer wrote:

  > I'm on JDK 1.4.2_06 and Tomcat 4+. Had issues w/ the Tomcat 5.5+/JDK
  > 1.5 combo so I rolled back.

There have been issues with Tomcat 5.5, although supposedly the latest
version has them resolved.  I'm using Tomcat 5.0.28 with JDK 1.5.0_01,
which has been solid -- no problems at all.  But your combo should be
fine; I just want to verify Miles Barr's changes to remove the 1.5
dependencies from my classes.

  > The baseline will presumably use the default Lucene Similarity and
  > Query Parser.

I think the baseline should use Lucene's MultiFieldQueryParser to expand
the query to search both title and body fields, as this is presumably
the current "out-of-the-box" solution.  Similarly, it should use Lucene
1.4.3, the current official release; is this what you are using?  Some
may prefer CVS HEAD instead, but I have never run with it.

  > For "your case" you'll need to tell me if I need to call
  > DistributingMultiFieldQueryParser or whatnot.

Yes, you need something like this:

  private static final String[] DEFAULT_FIELDS = {"title", "body"};
  private static final float[] DEFAULT_BOOSTS = {3.0f, 1.0f};

  DistributingMultiFieldQueryParser.parse(
    queryString, DEFAULT_FIELDS, DEFAULT_BOOSTS, new StandardAnalyzer());

DEFAULT_FIELDS should list the fields you want searched for simple query
terms that carry no explicit field spec.  DEFAULT_BOOSTS must be in 1:1
correspondence with DEFAULT_FIELDS.
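
The 1:1 requirement is easy to check up front.  Here is a minimal sketch
in plain Java with no Lucene dependency.  The FieldBoosts class and its
methods are hypothetical helpers for illustration only; they are not part
of DistributingMultiFieldQueryParser, and the string they build just shows
the field/boost pairing, not the query the parser actually produces.

```java
final class FieldBoosts {
    static final String[] DEFAULT_FIELDS = {"title", "body"};
    static final float[] DEFAULT_BOOSTS = {3.0f, 1.0f};

    /** Throws if boosts is not in 1:1 correspondence with fields. */
    static void check(String[] fields, float[] boosts) {
        if (fields == null || boosts == null
                || fields.length != boosts.length) {
            throw new IllegalArgumentException(
                "fields and boosts must have the same length");
        }
    }

    /** Pairs each field with its boost, in array order. */
    static String describe(String[] fields, float[] boosts) {
        check(fields, boosts);
        // StringBuffer rather than StringBuilder so this also compiles on JDK 1.4.
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(fields[i]).append('^').append(boosts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(DEFAULT_FIELDS, DEFAULT_BOOSTS));
    }
}
```

Calling check() once at startup catches a mismatched pair of arrays
before any query is parsed.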

  > Also, dumb question...do I need to build an index for every impl of
  > Similarity? Thus there will be "n" indexes and wikipedia-sim.jsp will
  > search in each one with the corresponding Similarity?

Yes, you will need a separate index for each Similarity, because some
values computed by the Similarity at indexing time (the field length
norms) are stored in the index.
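
To illustrate with the length norm: Lucene asks the Similarity for a
per-field norm while indexing and stores the result, so a searcher using
a different Similarity would be reading norms it did not compute.  Below
is a minimal sketch in plain Java with no Lucene dependency.
defaultLengthNorm mirrors the 1/sqrt(numTerms) formula of Lucene's
DefaultSimilarity; flatLengthNorm and the NormDemo class are made up
here purely for contrast.

```java
final class NormDemo {
    /** Same formula as DefaultSimilarity.lengthNorm: 1/sqrt(numTerms). */
    static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** Hypothetical alternative Similarity that ignores field length. */
    static float flatLengthNorm(int numTerms) {
        return 1.0f;
    }

    public static void main(String[] args) {
        int numTerms = 100; // number of terms in one field of one document
        // The two values differ, so an index written with one norm
        // cannot simply be searched as if it had been written with the other.
        System.out.println("default: " + defaultLengthNorm(numTerms));
        System.out.println("flat:    " + flatLengthNorm(numTerms));
    }
}
```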

I'll send you a Similarity and an initial value for DEFAULT_BOOSTS later
today or tomorrow.

Can you put up an explain mechanism to support tuning?  I'll want to
tune the DEFAULT_BOOSTS and various Similarity factors based on the
collection.

Thanks,

Chuck

