RE: [Lucene-dev] Allowing an Analyzer to choose a parsing strategy based on context

Doug Cutting Tue, 19 Jun 2001 15:29:02 -0700

> From: Scott Ganyo [mailto:[EMAIL PROTECTED]]
> 
> I've made the following simple and backward-compatible changes to a couple
> of classes in order to allow an Analyzer to choose a parsing strategy based
> on Document and/or Field:
> 
> I changed DocumentWriter.java, line 123 from:
> TokenStream stream = analyzer.tokenStream(reader);
> 
> To:
> TokenStream stream = analyzer.tokenStream(doc, field, reader);
> 
> 
> ...and I changed Analyzer.java implementation to add 
> tokenStream(Document, Field, Reader) method:

I've thought a bit more about this.  The new method should also be usable by
the query parser, right?  But the query parser doesn't have a Document or a
Field.  So I think the new method should instead be:

  public TokenStream tokenStream(String fieldName, Reader text);

That way the query parser can, after having parsed out field names, apply
the appropriate analysis to the tokens.

A utility Analyzer class like the following would also be useful:

  import java.io.Reader;
  import java.util.Hashtable;

  public class FieldAnalyzers extends Analyzer {
    // Maps a field name to the Analyzer used for that field.
    private Hashtable fieldToAnalyzer = new Hashtable();
    public void add(String fieldName, Analyzer analyzer) {
      fieldToAnalyzer.put(fieldName, analyzer);
    }
    public TokenStream tokenStream(String field, Reader reader) {
      return ((Analyzer)fieldToAnalyzer.get(field)).tokenStream(field, reader);
    }
  }

Probably needs a little more error checking, and maybe a default analyzer,
but you get the idea...
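
To make the error checking and default analyzer concrete, here is a rough
sketch of one way it could go.  The Analyzer and TokenStream declarations
below are minimal stand-ins so the example compiles on its own; they are not
the real Lucene classes.  The constructor that takes a default analyzer and
the String-returning next() method are assumptions made for illustration.

```java
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

// Stand-in for Lucene's TokenStream (assumption: simplified to return
// the next token's text, or null when the stream is exhausted).
interface TokenStream {
    String next();
}

// Stand-in for Lucene's Analyzer, with the proposed field-aware method.
abstract class Analyzer {
    abstract TokenStream tokenStream(String fieldName, Reader reader);
}

// Dispatches to a per-field Analyzer, falling back to a default.
class FieldAnalyzers extends Analyzer {
    private final Map<String, Analyzer> fieldToAnalyzer = new HashMap<>();
    private final Analyzer defaultAnalyzer;

    // Hypothetical constructor: the default is used for unregistered fields.
    FieldAnalyzers(Analyzer defaultAnalyzer) {
        if (defaultAnalyzer == null) {
            throw new IllegalArgumentException("defaultAnalyzer must not be null");
        }
        this.defaultAnalyzer = defaultAnalyzer;
    }

    void add(String fieldName, Analyzer analyzer) {
        if (fieldName == null || analyzer == null) {
            throw new IllegalArgumentException("fieldName and analyzer must not be null");
        }
        fieldToAnalyzer.put(fieldName, analyzer);
    }

    @Override
    TokenStream tokenStream(String fieldName, Reader reader) {
        // No analyzer registered for this field: use the default instead
        // of failing with a NullPointerException.
        Analyzer analyzer = fieldToAnalyzer.get(fieldName);
        if (analyzer == null) {
            analyzer = defaultAnalyzer;
        }
        return analyzer.tokenStream(fieldName, reader);
    }
}
```

With something like this, a field that was never passed to add() is analyzed
by the default analyzer rather than throwing.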

Doug

_______________________________________________
Lucene-dev mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/lucene-dev
