On Apr 15, 2010, at 5:28 PM, Shai Erera wrote:

> DM I think ICU is great. But currently we use JFlex and you can run Java 10
> if you want, but as long as JFlex is compiled w/ Java 1.4, that's what you'll
> get. Luckily Uwe and Robert recently bumped it up to Java 1.5. Such a change
> should be clearly documented in CHANGES so people are aware of this, and at
> least until they figure out what they want to do with it, they should take
> the pre-3.1 analyzers (assuming that's the next release w/ JFlex 1.5
> tokenizers) and use them.

I'm not sure I understand. Is JFlex used by every tokenizer?

>
> Alternatively, we can think of writing an ICU analyzer/tokenizer, but we're
> still using JFlex, so I don't know how much control we have on that ...

Robert has already started one. (1488 I think).

>
> Shai
>
> On Fri, Apr 16, 2010 at 12:21 AM, DM Smith <dmsmith...@gmail.com> wrote:
>
> On Apr 15, 2010, at 4:50 PM, Shai Erera wrote:
>
> > Robert ... I'm sorry but changes to Analyzers don't *force* people to
> > reindex. They can simply choose not to use the latest version. They can
> > choose not to upgrade a Unicode version. They can copy the entire Analyzer
> > code to match their needs. Index format changes is what I'm worried about
> > because that *forces* people to reindex.
>
> In several threads and issues it has been pointed out that upgrading Unicode
> versions is not an obvious choice or even controllable. It is dictated by the
> version of Java, the version of the OS and any Unicode specific libraries.
>
> A desktop application which internally uses lucene has no control over the
> automatic update of Java (yes it can detect the version change and refuse to
> run or force an upgrade) or when the user feels like upgrading the OS (not
> sure how to detect the Unicode version of an arbitrary OS. Not sure I want
> to).
>
> Even with server applications, some shared servers have one version of Java
> that all use. And the owner of an individual application might have no say in
> if or when that is upgraded.
>
> This is to say that one needs to be ready to re-index at all times unless it
> can be controlled.
>
> One way to handle the Java/Unicode is to use ICU at a specific version and
> control its upgrade.
>
> One way to handle the OS problem (which really is one of user input) is to
> keep up with the changes to Unicode and create a filter that handles the
> differences normalizing to the Unicode version of the index (if that's even
> possible).
>
> Still goes to your point.
> The onus is on the application, not on Lucene.
>
> -- DM
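
For what it's worth, the "detect the version change" idea mentioned above could look roughly like the following. This is only a sketch under assumptions: `IndexEnvironmentCheck`, the marker file, and the `java.version` key are hypothetical names, nothing here is part of Lucene, and it uses `java.nio.file`, which is newer than the Java 1.5 discussed in this thread. It conservatively treats any change of JVM version as a possible Unicode change, which will sometimes ask for a reindex that is not strictly needed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical application-side helper; not a Lucene API. */
final class IndexEnvironmentCheck {
    // Key under which the JVM version is stored in the marker file.
    static final String JAVA_VERSION_KEY = "java.version";

    private IndexEnvironmentCheck() {}

    /** Writes the running JVM's version to a marker file kept next to the index. */
    static void record(Path markerFile) throws IOException {
        Properties props = new Properties();
        props.setProperty(JAVA_VERSION_KEY, System.getProperty("java.version"));
        try (OutputStream out = Files.newOutputStream(markerFile)) {
            props.store(out, "environment the index was built with");
        }
    }

    /** Pure comparison: an unknown or different recorded version means "reindex". */
    static boolean needsReindex(String recordedVersion, String currentVersion) {
        return recordedVersion == null || !recordedVersion.equals(currentVersion);
    }

    /** Compares the marker file, if present, against the running JVM. */
    static boolean needsReindex(Path markerFile) throws IOException {
        if (!Files.exists(markerFile)) {
            return true; // no record of how the index was built
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(markerFile)) {
            props.load(in);
        }
        return needsReindex(props.getProperty(JAVA_VERSION_KEY),
                            System.getProperty("java.version"));
    }
}
```

An application would call `record(...)` after building the index and `needsReindex(...)` at startup, then decide for itself whether to reindex, warn, or refuse to run. The same marker file could also carry the version of a pinned ICU jar if the application ships one.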