On Apr 11, 2006, at 12:18 PM, Doug Cutting wrote:
Marvin Humphrey wrote:
I'm back working on converting Lucene to use a byte count
instead of a char count as the prefix at the head of each
String. Three tests are failing: TestIndexModifier,
TestConstantScoreRangeQuery, and TestRangeFilter.
Why those and not others?
-  private static final int compareChars(char[] v1, int len1,
-                                        char[] v2, int len2) {
+  private static final int compareBytes(byte[] bytes1, int len1,
+                                        byte[] bytes2, int len2) {
     int end = Math.min(len1, len2);
     for (int k = 0; k < end; k++) {
-      char c1 = v1[k];
-      char c2 = v2[k];
-      if (c1 != c2) {
-        return c1 - c2;
+      if (bytes1[k] != bytes2[k]) {
+        return bytes1[k] - bytes2[k];
       }
     }
     return len1 - len2;
   }
Since char is unsigned and byte is signed, this changes the result of
the comparisons, no? I've frequently found that using (int)(byteValue
& 0xFF) in place of byteValue gives me what I want.
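To make the signed/unsigned point concrete, here is a minimal sketch. The class and method names are hypothetical and not part of the patch; it only compares one ASCII byte against a byte above 0x7F, such as the first byte of a multi-byte UTF-8 sequence.

```java
public class SignedByteCompare {
    // Naive comparison: Java bytes are signed, so 0x80..0xFF are negative.
    public static int compareSigned(byte a, byte b) {
        return a - b;
    }

    // Masking with 0xFF widens each byte to an int in 0..255 before subtracting.
    public static int compareUnsigned(byte a, byte b) {
        return (a & 0xFF) - (b & 0xFF);
    }

    public static void main(String[] args) {
        byte ascii = (byte) 0x7A; // 'z'
        byte lead = (byte) 0xC3;  // first UTF-8 byte of U+00E9

        // Signed: 122 - (-61) = 183, so 'z' wrongly sorts after the high byte.
        System.out.println(compareSigned(ascii, lead));   // prints 183
        // Unsigned: 122 - 195 = -73, the same sign as comparing the chars.
        System.out.println(compareUnsigned(ascii, lead)); // prints -73
    }
}
```

For pure ASCII terms the two versions agree, so any difference would only show up for terms containing bytes of 0x80 and above.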
Hmm... while that's surely a good change to make, it seems to have
had no impact on the failing tests.
  private static final int compareBytes(byte[] bytes1, int len1,
                                        byte[] bytes2, int len2) {
    int end = Math.min(len1, len2);
    for (int k = 0; k < end; k++) {
      int b1 = (bytes1[k] & 0xFF);
      int b2 = (bytes2[k] & 0xFF);
      if (b1 != b2) {
        return b1 - b2;
      }
    }
    return len1 - len2;
  }
What do the failing tests have in common?
On TestIndexModifier, only a small portion of the deletions fail, and
they're all for fairly high values of delId -- sometimes the highest,
but not always. For RangeFilter and ConstantScoreRangeQuery, it's
the "find all" tests, and only those, that fail. They find 0 docs
instead of 10001.
Still scratching my head,
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/