--- Harpreet S Walia <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Are there any resources available which explain how the simple
> analyser processes the data given to it?
> What I want to know is this: suppose I have a set of words, what
> exact rules are applied to tokenize and index these words, and how can
> I customize them?
>
> My requirement is that the words be broken only by spaces and not at
> any other character. I understand that this can be done by writing
> a parser in JavaCC, but is there any simpler way of achieving this?

Actually, this can be done by writing your own custom Analyzer; no
JavaCC parser is needed. Take a look at these source files:

  ./org/apache/lucene/analysis/standard/StandardAnalyzer.java
  ./org/apache/lucene/analysis/Analyzer.java
  ./org/apache/lucene/analysis/de/GermanAnalyzer.java
  ./org/apache/lucene/analysis/SimpleAnalyzer.java
  ./org/apache/lucene/analysis/StopAnalyzer.java
  ./org/apache/lucene/analysis/WhitespaceAnalyzer.java

The last one, WhitespaceAnalyzer, is probably what you are looking for:
it breaks text into tokens at whitespace only and leaves everything
else untouched.

Otis
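
To make the difference between SimpleAnalyzer and WhitespaceAnalyzer concrete, here is a small plain-Java sketch. It does not call Lucene at all; it only imitates the two splitting rules as I understand them (SimpleAnalyzer breaks at every non-letter and lowercases, WhitespaceAnalyzer breaks at whitespace only). The class name TokenizeDemo and its methods are made up for this illustration, so treat it as a rough model rather than the library's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene.
class TokenizeDemo {

    // Whitespace rule: a token is any maximal run of non-whitespace characters.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                flush(current, tokens);
            } else {
                current.append(c);
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Letter rule: a token is any maximal run of letters, lowercased.
    // Digits and punctuation act as separators and are dropped.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else {
                flush(current, tokens);
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Emit the pending token, if any, and reset the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String text = "e-mail foo_bar 3.14 O'Reilly";
        System.out.println(whitespaceTokens(text)); // [e-mail, foo_bar, 3.14, O'Reilly]
        System.out.println(letterTokens(text));     // [e, mail, foo, bar, o, reilly]
    }
}
```

With the whitespace rule, "e-mail" and "3.14" survive as single tokens, which matches the requirement in the question. With the letter rule they are split apart or dropped entirely.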