[ https://issues.apache.org/jira/browse/PIG-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883557#action_12883557 ]
Gianmarco De Francisci Morales commented on PIG-1468:
-----------------------------------------------------

I ran some tests. I see a ~1% decrease in performance overall.
I also searched the codebase for references to the method, and no caller appears to rely on the specific ordering.

Here is the code I used:

{code}
import java.util.Random;

public class TestSpeed {

    private static final int TIMES = (int) 10e6;      // 10,000,000 outer iterations
    private static final int NUM_ARRAYS = (int) 10e5; // 1,000,000 arrays per batch
    private static final int ARRAY_LENGTH = 50;       // bytes per array

    // Compares byte by byte using Java's signed byte values (-128..127).
    private static int compareSigned(byte[] b1, byte[] b2) {
        int i;
        for (i = 0; i < b1.length; i++) {
            if (i >= b2.length)
                return 1;
            int a = b1[i];
            int b = b2[i];
            if (a < b)
                return -1;
            else if (a > b)
                return 1;
        }
        if (i < b2.length)
            return -1;
        return 0;
    }

    // Compares byte by byte after masking each byte to its unsigned value (0..255).
    private static int compareUnsigned(byte[] b1, byte[] b2) {
        int i;
        for (i = 0; i < b1.length; i++) {
            if (i >= b2.length)
                return 1;
            int a = b1[i] & 0xff;
            int b = b2[i] & 0xff;
            if (a < b)
                return -1;
            else if (a > b)
                return 1;
        }
        if (i < b2.length)
            return -1;
        return 0;
    }

    public static void main(String[] args) {
        long before, after;
        Random rand = new Random(123456789); // fixed seed for repeatable input

        // Fill two batches with random byte arrays.
        byte[][] batch1 = new byte[NUM_ARRAYS][];
        byte[][] batch2 = new byte[NUM_ARRAYS][];
        for (int i = 0; i < NUM_ARRAYS; i++) {
            batch1[i] = new byte[ARRAY_LENGTH];
            batch2[i] = new byte[ARRAY_LENGTH];
            rand.nextBytes(batch1[i]);
            rand.nextBytes(batch2[i]);
        }

        // Compare the first ARRAY_LENGTH array pairs, TIMES times over.
        before = System.currentTimeMillis();
        for (int i = 0; i < TIMES; i++)
            for (int j = 0; j < ARRAY_LENGTH; j++)
                compareSigned(batch1[j], batch2[j]);
        after = System.currentTimeMillis();
        System.out.println("Time for signed comparison (ms): " + (after - before));

        // Same workload with the unsigned comparison.
        before = System.currentTimeMillis();
        for (int i = 0; i < TIMES; i++)
            for (int j = 0; j < ARRAY_LENGTH; j++)
                compareUnsigned(batch1[j], batch2[j]);
        after = System.currentTimeMillis();
        System.out.println("Time for UNsigned comparison (ms): " + (after - before));
    }
}
{code}

> DataByteArray.compareTo() does not compare in lexicographic order
> -----------------------------------------------------------------
>
>                 Key: PIG-1468
>                 URL: https://issues.apache.org/jira/browse/PIG-1468
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Gianmarco De Francisci Morales
>         Attachments: PIG-1468.patch
>
>
> The compareTo() method of org.apache.pig.data.DataByteArray does not compare
> items in lexicographic order.
> Instead, it takes into account the sign of the bytes that compose the
> DataByteArray.
> So, for example, 0xff compares as less than 0x00.
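
The ordering problem described in the issue can be reproduced in isolation. The following is a minimal sketch, not the code from PIG-1468.patch: the class name ByteOrderDemo is hypothetical, and the two methods are condensed versions of the signed and unsigned comparisons from the benchmark in this comment. A Java byte is signed, so 0xff widens to -1 and sorts before 0x00 unless it is first masked with 0xff.

```java
final class ByteOrderDemo {

    // Signed comparison: each byte is used as-is, so (byte) 0xff is -1.
    static int compareSigned(byte[] b1, byte[] b2) {
        int n = Math.min(b1.length, b2.length);
        for (int i = 0; i < n; i++) {
            if (b1[i] != b2[i])
                return b1[i] < b2[i] ? -1 : 1;
        }
        // All shared positions are equal: the shorter array sorts first.
        return Integer.compare(b1.length, b2.length);
    }

    // Unsigned comparison: masking with 0xff maps each byte to 0..255,
    // which gives lexicographic order on the raw byte values.
    static int compareUnsigned(byte[] b1, byte[] b2) {
        int n = Math.min(b1.length, b2.length);
        for (int i = 0; i < n; i++) {
            int a = b1[i] & 0xff;
            int b = b2[i] & 0xff;
            if (a != b)
                return a < b ? -1 : 1;
        }
        return Integer.compare(b1.length, b2.length);
    }

    public static void main(String[] args) {
        byte[] high = {(byte) 0xff};
        byte[] low = {0x00};
        System.out.println("signed:   " + compareSigned(high, low));
        System.out.println("unsigned: " + compareUnsigned(high, low));
    }
}
```

Running main prints -1 for the signed comparison (0xff sorts before 0x00) and 1 for the unsigned one, which is the lexicographic order the issue asks for.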