Thanks, Ian. I will try your idea.

2011-10-24
janwen | China
website : http://www.qianpin.com/

From: Ian Lea
Date: 2011-10-24 18:01
Subject: Re: custome index rule
To: java-user
Cc:

You can achieve pretty much anything by customizing parsers and
tokenizers, but for your simple case I'd just use String.split() and
add the phrases one by one. Something like

    Document d = ...
    String[] phrases = sentence.split(",");
    for (String phrase : phrases) {
        d.add(new Field("phrase", phrase, ...));
    }

I think that would achieve what you want.

On special characters, see
http://lucene.apache.org/java/3_4_0/queryparsersyntax.html#Escaping Special Characters
and QueryParser.escape(String s).

--
Ian.

On Mon, Oct 24, 2011 at 10:12 AM, janwen <tom.grade1...@163.com> wrote:
> Hi,
> I want to implement a custom index rule.
> Assume a sentence like the following (note the commas):
> I am in China,I am in USA,I am in UK
>
> I hope Lucene can index the above sentence based on this rule:
> 1) split the sentence on the comma (,), so we get
>    (I am in China)(I am in USA)(I am in UK)
> 2) then Lucene just stores the short sentences from step 1, NOT_ANALYZED
>
> P.S. How many characters does Lucene not support, and what are they?
> I input a^b and get an exception:
> org.apache.lucene.queryParser.ParseException: Cannot parse 'a^b: Lexical
> error at line 1, column 4. Encountered: "\u671d" (26397), after : ""
>
> thanks
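
For anyone finding this thread later, here is a minimal, Lucene-free sketch of the two ideas above: splitting the sentence on commas so that each phrase can be added as its own field value, and backslash-escaping query syntax characters in the spirit of QueryParser.escape(String s). The class name PhraseSketch, both method names, and the list of special characters are assumptions made for illustration; they are not part of the Lucene API, and the trimming of each phrase is an extra step that the suggestion above does not include. In real indexing code each phrase would go into d.add(new Field("phrase", phrase, ...)), with the NOT_ANALYZED requirement from the question presumably expressed through the elided Field arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only; not part of Lucene.
final class PhraseSketch {

    // Assumed list of characters with special meaning in the Lucene 3.x
    // query syntax: \ + - ! ( ) : ^ [ ] " { } ~ * ? | &
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Split a comma-separated sentence into phrases, one per future field value.
    // Trimming and skipping empty pieces are additions made in this sketch.
    static List<String> splitPhrases(String sentence) {
        List<String> phrases = new ArrayList<>();
        for (String part : sentence.split(",")) {
            String phrase = part.trim();
            if (!phrase.isEmpty()) {
                phrases.add(phrase);
            }
        }
        return phrases;
    }

    // Put a backslash in front of every character from SPECIAL so that a
    // query parser would treat it as a literal. A stand-in for
    // QueryParser.escape, not a copy of it.
    static String escapeSpecials(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Each printed phrase is what would be passed to new Field("phrase", phrase, ...).
        for (String phrase : splitPhrases("I am in China,I am in USA,I am in UK")) {
            System.out.println(phrase);
        }
        // The caret in a^b is query syntax (boost), so it is escaped before parsing.
        System.out.println(escapeSpecials("a^b"));
    }
}
```

When Lucene is on the classpath, QueryParser.escape(String s) is the call to use on user input before it reaches the parser; escapeSpecials above is only there to show what that kind of escaping does to a string such as a^b.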