In fact I think the 2 test cases of the "modified UTF8 null bytes" are
just bogus, because they are using Java's UTF8 charset decoder to
construct a String when (as Ken points out) the byte sequence 0xC0
0x80 is illegal UTF8.
I'll remove those 2 test cases.
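
For anyone who wants to check this locally, here is a minimal sketch of the decoder behaviour described above. It assumes a current JDK (it uses `StandardCharsets`, which is Java 7+), and the class and method names are made up for illustration; this is not code from the Lucene test. Older JVMs may behave differently, which would fit the failure showing up on some boxes only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class OverlongNulCheck {
    // Returns true only if the bytes decode cleanly as standard UTF-8.
    // REPORT makes the decoder throw instead of silently substituting U+FFFD.
    static boolean isWellFormedUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] overlongNul = { (byte) 0xC0, (byte) 0x80 }; // two-byte form of U+0000
        byte[] plainNul = { 0x00 };                        // shortest form of U+0000

        System.out.println(isWellFormedUtf8(overlongNul)); // false on a current JDK
        System.out.println(isWellFormedUtf8(plainNul));    // true

        // The String constructor does not throw; it substitutes replacement
        // characters, so the result is not the single NUL char the test expects.
        String s = new String(overlongNul, StandardCharsets.UTF_8);
        System.out.println(s.equals("\u0000"));            // false on a current JDK
    }
}
```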
Mike
Mark Miller wrote:
I think you're on the right track, Ken. I don't know enough about Unicode
myself, but I was looking at this this morning, and what you say
somewhat jibes with what I saw.
I don't think you can just flip that switch though - the index
format will not match what it's trying to read (having been written
in the new format). Which is why that couldn't have been intended to
test reading the old format, unless it was a mistake (it was also
added when the format changed, so it's not like it was left around).
Possibly a mistake that works when your Unicode support is older?
(I'm on Ubuntu 8.10 - not sure what that means for my Unicode level.)
That Unicode version comment looks very interesting - no one in
America has really noticed this problem (no one that's mentioned it,
anyway), and I think Sami is in Europe.
McCandless knows what's wrong, I'm sure (he did that patch), but he's
either busy fixing it, or ordering another margarita in Tijuana.
In either case, I'm sure the issue will be resolved soon.
- Mark
Ken Krugler wrote:
OK, it's not a Java 1.6 thing; it's something else. I also found a
box that runs that test OK.
From what I can tell, this is the test that's failing:
http://www.krugle.org/kse/entfiles/lucene/apache.org/java/trunk/src/test/org/apache/lucene/index/TestIndexInput.java#89
This is verifying that the "Modified UTF-8 null bytes" sequence is
handled properly, from line 63 in the same file.
I think this is the old, deprecated format for pre-2.4 indexes.
So shouldn't there be a call to setModifiedUTF8StringsMode()? And
since this is a one-way setting of the preUTF8Strings flag, it
feels like this should be in a separate test.
Without this call, you'll get the result of calling the String
class's default constructor with an ill-formed UTF-8 sequence (for
Unicode 3.1 or later), since 0xC0 0x80 isn't the shortest form for
the U+0000 code point.
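
To make the shortest-form point concrete, here is a rough sketch that compares the two encodings of U+0000. It uses the JDK's `DataOutputStream.writeUTF` / `DataInputStream.readUTF` as a stand-in for a modified UTF-8 codec; that is an assumption for illustration, not Lucene's own pre-2.4 reading code, and the class and helper names are invented here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Nul {
    // Encodes with writeUTF (modified UTF-8) and strips its 2-byte length prefix.
    static byte[] encodeModified(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(s);
            out.flush();
            byte[] all = buf.toByteArray();
            return Arrays.copyOfRange(all, 2, all.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Re-adds the 2-byte length prefix and decodes with readUTF.
    static String decodeModified(byte[] payload) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            buf.write(payload.length >>> 8);
            buf.write(payload.length);
            buf.write(payload, 0, payload.length);
            return new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())).readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String nul = "\u0000";
        // Standard UTF-8 uses the shortest form: a single zero byte.
        System.out.println(hex(nul.getBytes(StandardCharsets.UTF_8))); // 00
        // Modified UTF-8 deliberately avoids embedded zero bytes.
        System.out.println(hex(encodeModified(nul)));                  // C0 80
        // A modified UTF-8 reader turns C0 80 back into U+0000.
        byte[] overlong = { (byte) 0xC0, (byte) 0x80 };
        System.out.println(decodeModified(overlong).equals(nul));      // true
    }
}
```

So the bytes only round-trip when the reader knows it is looking at the modified form; a standard UTF-8 decoder has no obligation to accept them.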
-- Ken
Mark Miller wrote:
Hey Sami, I've been running tests quite a bit recently with
Ubuntu 8.10 and OpenJDK 6 on a 64-bit machine, and I have not
seen it once.
Just tried again with Sun JDK 6 and 5 32-bit as well, and I am
still not seeing it.
Odd.
- Mark
Sami Siren wrote:
I am constantly seeing the following error when running "ant test":
[junit] Testcase: testRead(org.apache.lucene.index.TestIndexInput): FAILED
[junit] expected:<[]> but was:<[??]>
[junit] junit.framework.ComparisonFailure: expected:<[]> but was:<[??]>
[junit] at org.apache.lucene.index.TestIndexInput.testRead(TestIndexInput.java:89)
on both intel and amd architectures running linux.
java on AMD:
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)
java on Intel:
java version "1.6.0_0"
IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12)
OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode)
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) Server VM (build 11.0-b16, mixed mode)
Anyone else seeing this?
--
Sami Siren
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org