Marvin Humphrey wrote:
I'm back working on converting Lucene to use a byte count instead of a char count as a prefix at the head of each String. Three tests are failing: TestIndexModifier, TestConstantScoreRangeQuery, and TestRangeFilter.

Why those and not others?
-  private static final int compareChars(char[] v1, int len1,
-                                        char[] v2, int len2) {
+  private static final int compareBytes(byte[] bytes1, int len1,
+                                        byte[] bytes2, int len2) {
     int end = Math.min(len1, len2);
     for (int k = 0; k < end; k++) {
-      char c1 = v1[k];
-      char c2 = v2[k];
-      if (c1 != c2) {
-        return c1 - c2;
+      if (bytes1[k] != bytes2[k]) {
+        return bytes1[k] - bytes2[k];
       }
     }
     return len1 - len2;
   }

Since char is unsigned and byte is signed, this changes the result of comparisons, no? I've frequently found that using (int)(byteValue & 0xFF) in place of byteValue gives me what I want.
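
To make the sign issue concrete, here is a minimal sketch. The class and method names are hypothetical, invented for this illustration; they are not Lucene or Hadoop code. It assumes the terms are UTF-8 encoded, so any non-ASCII character produces bytes of 0x80 or above, which Java treats as negative values.

```java
// Hypothetical illustration class; not part of Lucene or Hadoop.
final class UnsignedByteCompare {

    // Same logic as the patched method above: bytes are compared as signed
    // values, so 0x80..0xFF sort below 0x00..0x7F.
    static int compareSigned(byte[] bytes1, int len1, byte[] bytes2, int len2) {
        int end = Math.min(len1, len2);
        for (int k = 0; k < end; k++) {
            if (bytes1[k] != bytes2[k]) {
                return bytes1[k] - bytes2[k];
            }
        }
        return len1 - len2;
    }

    // Masking with 0xFF widens each byte to an int in the range 0..255,
    // so the comparison treats the bytes as unsigned.
    static int compareUnsigned(byte[] bytes1, int len1, byte[] bytes2, int len2) {
        int end = Math.min(len1, len2);
        for (int k = 0; k < end; k++) {
            int b1 = bytes1[k] & 0xFF;
            int b2 = bytes2[k] & 0xFF;
            if (b1 != b2) {
                return b1 - b2;
            }
        }
        return len1 - len2;
    }
}
```

For example, "z" is the single byte 0x7A, and "é" is the two bytes 0xC3 0xA9 in UTF-8. As chars, 'z' sorts before 'é'. The signed version sorts "z" after "é", because (byte) 0xC3 is negative, while the unsigned version keeps "z" first. That might explain why only tests touching non-ASCII or range-ordered terms fail.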

See, e.g., compareBytes in:

http://svn.apache.org/viewcvs.cgi/lucene/hadoop/trunk/src/java/org/apache/hadoop/io/WritableComparator.java?view=markup

Doug
