On 26/03/2016 11:56, Uwe Schindler wrote:
Hi,

after also testing the separate "Jigsaw" build on jdk9.java.net I see the same 
problems. So both builds 111 are wrong.

To me it looks like the Unicode data files are missing some information - which 
could again be a packaging bug. As said before, build 110 does not have this 
problem, so it seems to be a side-effect of Jigsaw merging.

The following stuff does not work:

(1) The Thai locale does not have a working dictionary-based BreakIterator 
available. The following "check" in Lucene fails, because it cannot detect a 
boundary correctly:

   /**
    * True if the JRE supports a working dictionary-based breakiterator for Thai.
    * If this is false, this tokenizer will not work at all!
    */
   public static final boolean DBBI_AVAILABLE;
   private static final BreakIterator proto = BreakIterator.getWordInstance(new Locale("th"));
   static {
     // check that we have a working dictionary-based break iterator for thai
     proto.setText("ภาษาไทย");
     DBBI_AVAILABLE = proto.isBoundary(4);
   }

After this static initializer, DBBI_AVAILABLE is false. This causes some tests 
to be ignored, but 2 fail because of this (which might be an oversight on our 
side). Nevertheless, this is a bug in build 111.

I just tried to duplicate this on OS X and Linux without success. The log you linked to suggests this is Linux, is that right? Is this the JDK bundle? I haven't checked the JRE bundle but would be surprised if anything is missing. The JDK has several tests for Thai, so if it were completely broken I would have expected it to have been seen. I have no doubt that it is not working in your environment; we just need to figure out what is different.
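
To help compare environments, the check quoted above could be run outside Lucene as a small standalone program. The following is only a sketch: the class name ThaiBreakCheck and the helper hasBoundaryAt are made up for illustration, and the Thai text is written with Unicode escapes so that source-file encoding cannot affect the result.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical standalone reproducer; not part of Lucene or the JDK tests.
public class ThaiBreakCheck {

    // Same text as in the Lucene check, written with Unicode escapes.
    static final String THAI_TEXT = "\u0e20\u0e32\u0e29\u0e32\u0e44\u0e17\u0e22";

    // Returns true if the Thai word BreakIterator reports a boundary at offset.
    static boolean hasBoundaryAt(String text, int offset) {
        BreakIterator it = BreakIterator.getWordInstance(new Locale("th"));
        it.setText(text);
        return it.isBoundary(offset);
    }

    public static void main(String[] args) {
        // Print enough context to compare two environments side by side.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("os.name      = " + System.getProperty("os.name"));
        System.out.println("iterator     = "
                + BreakIterator.getWordInstance(new Locale("th")).getClass().getName());
        // The Lucene check treats a boundary at offset 4 as proof of dictionary support.
        System.out.println("boundary at 4: " + hasBoundaryAt(THAI_TEXT, 4));
    }
}
```

Running this with both build 110 and build 111 on the same machine should show whether the difference is in the runtime image or somewhere in the test setup.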


(2) The collator for Arabic (Farsi) language fails to work correctly. This also 
looks like missing data.

Collator collator = Collator.getInstance(new Locale("ar"));

Are there any exceptions or anything here? Or maybe it tests the collator with compare?
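
A small standalone program that exercises the collator with compare might make it easier to see what differs. This is only a sketch: the class name ArabicCollatorCheck and the sample strings are assumptions for illustration and are not taken from the Lucene test.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical standalone check; the sample strings are arbitrary.
public class ArabicCollatorCheck {

    // Compares two strings with the collator obtained for the "ar" locale.
    static int compare(String a, String b) {
        Collator collator = Collator.getInstance(new Locale("ar"));
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(new Locale("ar"));
        System.out.println("java.version   = " + System.getProperty("java.version"));
        System.out.println("collator class = " + collator.getClass().getName());

        // Two single Arabic letters (alef and beh), written as Unicode escapes.
        String alef = "\u0627";
        String beh = "\u0628";
        try {
            System.out.println("compare(alef, beh) = " + compare(alef, beh));
            System.out.println("compare(beh, alef) = " + compare(beh, alef));
        } catch (RuntimeException e) {
            // Any exception here would answer the question about failures directly.
            e.printStackTrace();
        }
    }
}
```

Comparing the output from build 110 and build 111 would show whether the collator itself changed or whether an exception is thrown.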

-Alan
