Problem solved, but now another problem has come up.
I want to use Highlighter in my system, but the token offsets are incorrect once MappingCharFilter is used. Koji, do you know how to fix the offset problem?

On Sun, Dec 13, 2009 at 11:12 AM, Weiwei Wang <ww.wang...@gmail.com> wrote:

> I used Luke to check the result and found that only c exists as a term; no
> cplusplus was found in the index.
>
> On Sun, Dec 13, 2009 at 10:34 AM, Weiwei Wang <ww.wang...@gmail.com> wrote:
>
>> Thanks, Koji. I followed your advice and changed my analyzer as shown
>> below:
>>
>> NormalizeCharMap RECOVERY_MAP = new NormalizeCharMap();
>> RECOVERY_MAP.add("c++","cplusplus$");
>> CharFilter filter = new LowercaseCharFilter(reader);
>> filter = new MappingCharFilter(RECOVERY_MAP,filter);
>> StandardTokenizer tokenStream = new StandardTokenizer(Version.LUCENE_30, filter);
>> tokenStream.setMaxTokenLength(maxTokenLength);
>> TokenStream result = new StandardFilter(tokenStream);
>> result = new LowerCaseFilter(result);
>> result = new StopFilter(enableStopPositionIncrements, result, stopSet);
>> result = new SnowballFilter(result, STEMMER);
>>
>> I use the same analyzer on the search side. As you know, this analyzer
>> tokenizes c++ as cplusplus, so it seems I should be able to search for c++
>> with the same analyzer, because the query is also tokenized as cplusplus.
>>
>> I tested it on the string c++c++; however, when I search for c++ in the
>> built index, nothing is returned.
>>
>> I do not know what's wrong with my code. Waiting for your reply.
>>
>> On Fri, Dec 11, 2009 at 9:43 PM, Weiwei Wang <ww.wang...@gmail.com> wrote:
>>
>>> Thanks, Koji
>>>
>>> On Fri, Dec 11, 2009 at 7:59 PM, Koji Sekiguchi <k...@r.email.ne.jp> wrote:
>>>
>>>> MappingCharFilter can be used to convert c++ to cplusplus.
>>>>
>>>> Koji
>>>>
>>>> --
>>>> http://www.rondhuit.com/en/
>>>>
>>>> Anshum wrote:
>>>>
>>>>> How about getting the original token stream and then converting c++ to
>>>>> cplusplus, or any other such transform?
>>>>> Or perhaps you might look at using/extending (in the non-Java sense)
>>>>> some other tokenizer!
>>>>>
>>>>> --
>>>>> Anshum Gupta
>>>>> Naukri Labs!
>>>>> http://ai-cafe.blogspot.com
>>>>>
>>>>> The facts expressed here belong to everybody, the opinions to me. The
>>>>> distinction is yours to draw............
>>>>>
>>>>> On Fri, Dec 11, 2009 at 11:00 AM, Weiwei Wang <ww.wang...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi, all,
>>>>>> I designed an FTP search engine based on Lucene. I made a few
>>>>>> modifications to the StandardTokenizer.
>>>>>> My problem is:
>>>>>> C++ is tokenized as c by StandardTokenizer, and I want to recover it
>>>>>> from the TokenStream produced by StandardTokenizer.
>>>>>>
>>>>>> What should I do?
>>>>>>
>>>>>> --
>>>>>> Weiwei Wang
>>>>>> Alex Wang
>>>>>> 王巍巍
>>>>>> Room 403, Mengmin Wei Building
>>>>>> Computer Science Department
>>>>>> Gulou Campus of Nanjing University
>>>>>> Nanjing, P.R.China, 210093
>>>>>>
>>>>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org

--
Weiwei Wang
Alex Wang
王巍巍
Room 403, Mengmin Wei Building
Computer Science Department
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Homepage: http://cs.nju.edu.cn/rl/weiweiwang
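
A note on the offset question at the top of this message: Highlighter marks up the original text using each token's start and end offsets, so offsets measured in the filtered text (where c++ has become cplusplus$) have to be translated back to positions in the original text. The plain-Java sketch below only illustrates that translation. It does not use any Lucene classes, and OffsetCorrectionSketch, Mapped, and map are hypothetical names made up for the illustration; it is not how MappingCharFilter is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: no Lucene classes are used here.
final class OffsetCorrectionSketch {

    /** Filtered text plus a table from each filtered offset to an original offset. */
    static final class Mapped {
        final String text;
        final int[] originalOffset; // one entry per offset 0..text.length()

        Mapped(String text, int[] originalOffset) {
            this.text = text;
            this.originalOffset = originalOffset;
        }
    }

    /** Replaces every occurrence of 'from' with 'to' and records where each output char came from. */
    static Mapped map(String input, String from, String to) {
        if (from.isEmpty()) {
            throw new IllegalArgumentException("'from' must not be empty");
        }
        StringBuilder out = new StringBuilder();
        List<Integer> offsets = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith(from, i)) {
                int origEnd = i + from.length();
                // Characters of the replacement map into the matched span;
                // anything past the span's length is clamped to its end.
                for (int k = 0; k < to.length(); k++) {
                    out.append(to.charAt(k));
                    offsets.add(Math.min(i + k, origEnd));
                }
                i = origEnd;
            } else {
                out.append(input.charAt(i));
                offsets.add(i);
                i++;
            }
        }
        offsets.add(input.length()); // offset just past the last character
        int[] table = new int[offsets.size()];
        for (int j = 0; j < table.length; j++) {
            table[j] = offsets.get(j);
        }
        return new Mapped(out.toString(), table);
    }

    public static void main(String[] args) {
        String original = "I like c++ a lot";
        Mapped m = map(original, "c++", "cplusplus$");
        // Offsets of the token "cplusplus" as a tokenizer would see them in the filtered text.
        int start = m.text.indexOf("cplusplus");
        int end = start + "cplusplus".length();
        // Translate back before highlighting the original text.
        System.out.println(original.substring(m.originalOffset[start], m.originalOffset[end]));
    }
}
```

With the input "I like c++ a lot", the token cplusplus spans filtered offsets 7 to 16, and the table maps those back to 7 and 10, which is exactly the c++ in the original text. If the highlighted region is shifted or too long, it is worth checking whether some stage in the chain reports filtered offsets without translating them back.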