Thanks for the suggestion. We're going to go over all of these suggestions next week and decide what we want to do.
-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Saturday, February 21, 2009 11:52 AM
To: java-user@lucene.apache.org
Subject: Re: 2.3.2 -> 2.4.0 StandardTokenizer issue

That was just a suggestion as a quick hack... it still won't really fix the problem, because some character + accent combinations don't have composed forms. Even if you added the entire Combining Diacritical Marks block to the JFlex grammar, it's still wrong... what needs to be supported is the \p{Word_Break = Extend} property, etc.
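
To illustrate the point about composed forms, here is a minimal sketch using only the JDK's java.text.Normalizer. The class name ComposedFormDemo and its helper methods are made up for this example and are not part of Lucene. It shows that NFC composition folds "e" + U+0301 into a single code point, while "q" + U+0301 stays as two code points because Unicode defines no precomposed form for it. A tokenizer that relies on normalization alone would therefore still see a bare combining mark in the second case.

```java
import java.text.Normalizer;

public class ComposedFormDemo {

    // Normalize to NFC (canonical decomposition followed by composition).
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Count Unicode code points rather than UTF-16 chars.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // True if the code point is a nonspacing combining mark (general category Mn).
    // This is only a rough stand-in for Word_Break = Extend, which covers more.
    static boolean isNonSpacingMark(int codePoint) {
        return Character.getType(codePoint) == Character.NON_SPACING_MARK;
    }

    public static void main(String[] args) {
        String eAcute = "e\u0301"; // has a composed form, U+00E9
        String qAcute = "q\u0301"; // has no composed form

        System.out.println(codePoints(nfc(eAcute))); // prints 1
        System.out.println(codePoints(nfc(qAcute))); // prints 2
        System.out.println(isNonSpacingMark(0x0301)); // prints true
    }
}
```

The general-category check is only an approximation: the Word_Break = Extend property is defined in UAX #29 and is broader than category Mn, so treat this as a way to see the problem rather than a fix for the grammar.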