Hi Rafael,
What is your scenario?
Maybe it was defined this way so that it does not filter uppercase stop words.
For example, the lowercase word "se" is a stop word, but the uppercase
"SE" stands for "Sergipe", a Brazilian state, so it should not be filtered.
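
To make the trade-off concrete, here is a minimal sketch in plain Java. It is not the Lucene API: the class name, the method names and the three-word stop list are made up for illustration, and stemming is left out. It only shows how the order of the two steps changes what happens to "SE".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopOrderSketch {
    // Hypothetical lowercase stop list; the analyzer's real list is much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("se", "em", "de"));

    // Case-sensitive stop word removal, standing in for a stop filter.
    static List<String> removeStopWords(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    // Lowercases every token, standing in for a lowercase filter.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Order before the patch: stop words first, lowercase afterwards.
    static List<String> stopThenLower(List<String> tokens) {
        return lowerCase(removeStopWords(tokens));
    }

    // Order after the patch: lowercase first, stop words afterwards.
    static List<String> lowerThenStop(List<String> tokens) {
        return removeStopWords(lowerCase(tokens));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Moro", "em", "Aracaju", "SE");
        System.out.println(stopThenLower(tokens)); // [moro, aracaju, se]
        System.out.println(lowerThenStop(tokens)); // [moro, aracaju]
    }
}
```

With the current order "SE" survives, but so would a capitalized "Em" at the start of a sentence. With the patched order both are dropped. Which one is right depends on your scenario.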
Best Regards,
Adriano Crestani
On Mon, Nov 17, 2008 at 3:39 PM, Rafael Cunha de Almeida <
[EMAIL PROTECTED]> wrote:
> Following is the patch for what I think is a bug on the
> BrazilianAnalyzer. The default stopwords list is all in lowercase, so
> it will only work if the LowerCaseFilter comes first or if the
> StopFilter is set to ignore case.
>
> Since the LowerCaseFilter is instantiated anyway I just changed its
> order. If there's some problem with that order, then please consider
> setting StopFilter to ignore case.
>
> Index: BrazilianAnalyzer.java
> ===================================================================
> --- BrazilianAnalyzer.java (revision 718407)
> +++ BrazilianAnalyzer.java (working copy)
> @@ -131,10 +131,9 @@
> public final TokenStream tokenStream(String fieldName, Reader reader) {
> TokenStream result = new StandardTokenizer( reader );
> result = new StandardFilter( result );
> + result = new LowerCaseFilter( result );
> result = new StopFilter( result, stoptable );
> result = new BrazilianStemFilter( result, excltable );
> - // Convert to lowercase after stemming!
> - result = new LowerCaseFilter( result );
> return result;
> }
> }
>