On Apr 15, 2010, at 5:28 PM, Shai Erera wrote:

> DM, I think ICU is great. But currently we use JFlex, and you can run Java 10 
> if you want, but as long as JFlex is compiled w/ Java 1.4, that's what you'll 
> get. Luckily Uwe and Robert recently bumped it up to Java 1.5. Such a change 
> should be clearly documented in CHANGES so people are aware of this, and at 
> least until they figure out what they want to do with it, they should take 
> the pre-3.1 analyzers (assuming that's the next release w/ JFlex 1.5 
> tokenizers) and use them.

I'm not sure I understand. Is JFlex used by every tokenizer?

> 
> Alternatively, we can think of writing an ICU analyzer/tokenizer, but we're 
> still using JFlex, so I don't know how much control we have over that ...

Robert has already started one (LUCENE-1488, I think).

> 
> Shai
> 
> On Fri, Apr 16, 2010 at 12:21 AM, DM Smith <dmsmith...@gmail.com> wrote:
> 
> On Apr 15, 2010, at 4:50 PM, Shai Erera wrote:
> 
> > Robert ... I'm sorry but changes to Analyzers don't *force* people to 
> > reindex. They can simply choose not to use the latest version. They can 
> > choose not to upgrade a Unicode version. They can copy the entire Analyzer 
> > code to match their needs. Index format changes are what I'm worried about 
> > because they *force* people to reindex.
> 
> In several threads and issues it has been pointed out that upgrading Unicode 
> versions is not an obvious choice or even controllable. It is dictated by the 
> version of Java, the version of the OS and any Unicode specific libraries.
> 
> A desktop application which internally uses Lucene has no control over the 
> automatic update of Java (yes, it can detect the version change and refuse to 
> run or force an upgrade) or over when the user feels like upgrading the OS 
> (I'm not sure how to detect the Unicode version of an arbitrary OS, and I'm 
> not sure I want to).
> 
> Even with server applications, some shared servers have one version of Java 
> that all applications use. And the owner of an individual application might 
> have no say in whether or when that is upgraded.
> 
> This is to say that one needs to be ready to re-index at all times unless the 
> Unicode version can be controlled.
> 
> One way to handle the Java/Unicode problem is to use ICU at a specific 
> version and control its upgrade.
> 
> One way to handle the OS problem (which really is one of user input) is to 
> keep up with the changes to Unicode and create a filter that handles the 
> differences by normalizing input to the Unicode version of the index (if 
> that's even possible).
> 
> Still, this goes to your point: the onus is on the application, not on Lucene.
> 
> -- DM
> 
> 
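The quoted message notes that an application can detect a Java version change and refuse to run or force an upgrade. A minimal sketch of that idea follows, assuming the application keeps a small stamp file next to its index. The class name, the file name and the property keys are all hypothetical, and none of this is a Lucene API. The count of defined code points is only a rough proxy for the Unicode tables shipped with the running JRE; it says nothing about the OS or about ICU.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical helper: remembers the runtime an index was built with. */
class IndexEnvironmentCheck {
    // Hypothetical file name, stored in the same directory as the index.
    static final String STAMP_FILE = "index-environment.properties";

    /** Rough proxy for the JRE's Unicode data: how many code points it considers defined. */
    static int countDefinedCodePoints() {
        int count = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (Character.isDefined(cp)) {
                count++;
            }
        }
        return count;
    }

    /** Values that can change underneath the application without its consent. */
    static Properties currentEnvironment() {
        Properties p = new Properties();
        p.setProperty("java.version", System.getProperty("java.version"));
        p.setProperty("defined.code.points", Integer.toString(countDefinedCodePoints()));
        return p;
    }

    /** Call after (re)building the index. */
    static void writeStamp(Path indexDir) {
        try (OutputStream out = Files.newOutputStream(indexDir.resolve(STAMP_FILE))) {
            currentEnvironment().store(out, "Environment at index time");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when there is no stamp, or the stamp differs from the current runtime. */
    static boolean needsReindex(Path indexDir) {
        Path stamp = indexDir.resolve(STAMP_FILE);
        if (!Files.exists(stamp)) {
            return true;
        }
        Properties stored = new Properties();
        try (InputStream in = Files.newInputStream(stamp)) {
            stored.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return !stored.equals(currentEnvironment());
    }

    public static void main(String[] args) {
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("re-index needed: " + needsReindex(indexDir));
    }
}
```

Whether a mismatch should trigger a re-index, a warning, or a refusal to start is the application's call, which is the point made above about where the onus lies.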
