Re: Problem with tokenizing/stemming in GermanAnalyzer

2003-02-17 Thread Christoph Kiehl
Hi Gerhard,

> I promise I will check the stemmer in the next few days... hm... not before this
> weekend, I have a martial arts challenge on Sunday. Mentally I'm not
> prepared to _fix_ anything. :)

Cool, I just started reading about stemmers etc. I'm very interested in your solution. And good luck for your c

Re: Problem with tokenizing/stemming in GermanAnalyzer

2003-02-17 Thread Gerhard Schwarz
Christoph Kiehl wrote:

Hi Volker,

I have noticed a strange problem with capitalization. Search for "computer" results in the token "compu". Search for "Computer", however, results in "comput". The search is supposed to be case-insensitive, so this must be a bug, right?

This problem was already

Re: Problem with tokenizing/stemming in GermanAnalyzer

2003-02-17 Thread Christoph Kiehl
> For now you could check out the current lucene version from cvs and
> just comment out the following line:
>
> uppercase = Character.isUpperCase( term.charAt( 0 ) );

In GermanStemmer.java of course ;))

> Regards
> Christoph
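
To illustrate why that one line matters, here is a minimal, self-contained sketch. It is not the code of Lucene's GermanStemmer: the class name, the `honorCase` parameter and the two suffix rules are invented for this example and chosen only so that the toy reproduces the tokens reported in this thread. The idea it shows is that a stemmer which remembers whether the first letter was upper case, and then applies different rules, cannot be case-insensitive; once the flag is disabled, both spellings end up as the same token.

```java
import java.util.Locale;

public class CaseSensitiveStemDemo {

    /**
     * Hypothetical toy stemmer (NOT Lucene's GermanStemmer).
     * honorCase = true  mimics a stemmer that keeps an "uppercase" flag.
     * honorCase = false mimics the workaround of commenting that flag out.
     */
    static String stem(String term, boolean honorCase) {
        // Same shape as the line quoted above: remember the case of the first letter.
        boolean uppercase = honorCase
                && !term.isEmpty()
                && Character.isUpperCase(term.charAt(0));

        String s = term.toLowerCase(Locale.ROOT);

        // Toy rule 1 (assumption): strip a trailing "er" from longer words.
        if (s.length() > 4 && s.endsWith("er")) {
            s = s.substring(0, s.length() - 2);
        }
        // Toy rule 2 (assumption): strip a trailing "t", but only when the
        // original term was not capitalised. This is the case-dependent step.
        if (!uppercase && s.length() > 4 && s.endsWith("t")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        // With the flag, the two spellings produce different tokens.
        System.out.println(stem("computer", true));
        System.out.println(stem("Computer", true));
        // Without the flag, both spellings produce the same token.
        System.out.println(stem("computer", false));
        System.out.println(stem("Computer", false));
    }
}
```

Whatever the real rules are, an index built with one capitalisation and a query typed with another will only match if both go through the stemmer with the same result, which is what disabling the flag achieves in this sketch.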

Re: Problem with tokenizing/stemming in GermanAnalyzer

2003-02-17 Thread Christoph Kiehl
Hi Volker,

> I have noticed a strange problem with capitalization. Search for
> "computer" results in the token "compu". Search for "Computer",
> however, results in "comput". The search is supposed to be
> case-insensitive, so this must be a bug, right?

This problem was already mentioned on the

Problem with tokenizing/stemming in GermanAnalyzer

2003-02-17 Thread Volker Luedeling
Hi,

my application uses a GermanAnalyzer for tokenizing a search string and constructing Query classes:

    Analyzer an = new org.apache.lucene.analysis.de.GermanAnalyzer();
    TokenStream ts = an.tokenStream(fieldName, new StringReader(fieldText));

I have noticed a strange problem with