Re: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency

2014-08-07 Thread Bianca Pereira
he analyzer > yourself. The stemming is very likely the culprit here. > > -- Jack Krupansky > > -Original Message- From: Uwe Schindler > Sent: Thursday, August 7, 2014 9:00 AM > To: java-user@lucene.apache.org > Subject: RE: EnglishAnalyzer vs WhiteSpaceAnalyzer in

Re: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency

2014-08-07 Thread Jack Krupansky
rsday, August 7, 2014 9:00 AM To: java-user@lucene.apache.org Subject: RE: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency Hi, if you create the term yourself, it is not going through the analyzer: public int getTermFrequency(String term, String id) (you create a BytesRef out of it).

RE: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency

2014-08-07 Thread Uwe Schindler
rom: Bianca Pereira [mailto:aivykar...@gmail.com] > Sent: Thursday, August 07, 2014 2:47 PM > To: java-user > Subject: Re: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term > Frequency > > Hi Jack, > > Thank you very much. I just changed for the StandardAnalyzer and i

Re: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency

2014-08-07 Thread Bianca Pereira
Hi Jack, Thank you very much. I just switched to the StandardAnalyzer and it is working as I would like. But there is something I still cannot understand. If I use the same analyzer for indexing and for searching, the same term should be parsed in the same way at both stages, shouldn't it? It i

Re: EnglishAnalyzer vs WhiteSpaceAnalyzer in getting Term Frequency

2014-08-07 Thread Jack Krupansky
Generally, the standard analyzer will be a better choice, unless you have some special need. A language-specific analyzer will include stemming. The English analyzer includes the Porter stemmer. Generally, you need to apply a compatible analyzer to query terms to match the index, or you need
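
The mismatch discussed in this thread (a stemmed index looked up with an unanalyzed term) can be illustrated with a small, self-contained sketch. It deliberately does not use the Lucene API: `analyze`, `stem`, `index`, `rawTermFrequency` and `analyzedTermFrequency` are hypothetical names invented for this illustration, and the suffix-stripping rule is a toy stand-in, not the Porter stemmer that the English analyzer applies. The only point is that a term built by hand skips analysis and therefore may not match what was indexed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class AnalyzerMismatchSketch {

    // Toy stand-in for a stemming analyzer (assumption: NOT Lucene's
    // EnglishAnalyzer): lowercases, splits on whitespace, strips a suffix.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // leading whitespace yields an empty first piece
            }
            tokens.add(stem(raw));
        }
        return tokens;
    }

    // Toy suffix stripping, only to mimic the effect of stemming.
    static String stem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // "Index" one document: analyzed term -> frequency.
    static Map<String, Integer> index(String document) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String term : analyze(document)) {
            freqs.merge(term, 1, Integer::sum);
        }
        return freqs;
    }

    // Lookup with the term exactly as typed: no analysis is applied,
    // which mirrors building the term by hand.
    static int rawTermFrequency(Map<String, Integer> idx, String term) {
        return idx.getOrDefault(term, 0);
    }

    // Lookup after running the term through the same analysis as the index.
    static int analyzedTermFrequency(Map<String, Integer> idx, String term) {
        List<String> tokens = analyze(term);
        return tokens.size() == 1 ? idx.getOrDefault(tokens.get(0), 0) : 0;
    }

    public static void main(String[] args) {
        Map<String, Integer> idx = index("Indexing documents and searching documents");
        System.out.println(rawTermFrequency(idx, "documents"));      // prints 0
        System.out.println(analyzedTermFrequency(idx, "documents")); // prints 2
    }
}
```

In this sketch the index holds `document`, not `documents`, so the raw lookup misses while the analyzed lookup finds both occurrences. An analyzer without stemming would leave the term unchanged, so both lookups would agree for an already lowercased term.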