[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class
[ https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908639#action_12908639 ]

Ning Zhang commented on HIVE-1629:

Good question, John. I think this patch doesn't affect bucketing, which is implemented using ObjectInspectorUtils.hashCode(). Actually, the hash function used there for Double is the same as the one provided in this patch. But I'll double-check with Zheng/Namit tomorrow.

Patch to fix hashCode method in DoubleWritable class

Key: HIVE-1629
URL: https://issues.apache.org/jira/browse/HIVE-1629
Project: Hadoop Hive
Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
Fix For: 0.7.0
Attachments: HIVE-1629.patch

A patch to fix the hashCode() method of DoubleWritable class of Hive. It prevents the HashMap (of type DoubleWritable) from behaving as LinkedList.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class
[ https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908599#action_12908599 ]

Ning Zhang commented on HIVE-1629:

+1. Will commit if tests pass.
[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class
[ https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908622#action_12908622 ]

John Sichi commented on HIVE-1629:

Ning, does this change introduce incompatibility with any persistent storage (e.g. bucketing)?
[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class
[ https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908194#action_12908194 ]

Ning Zhang commented on HIVE-1629:

+long v = Double.doubleToLongBits(value);
+return (int) (v ^ (v >>> 32));

Won't this return 0 for all long values less than 2^32? I searched the web, and the following 64-bit to 32-bit hash seems to be a good one:
http://www.cris.com/~ttwang/tech/inthash.htm
[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class
[ https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908210#action_12908210 ]

Vaibhav Aggarwal commented on HIVE-1629:

Hi,

doubleToLongBits converts the double value into its IEEE 754 double-precision bit layout. Furthermore, the XOR operator prevents the function from returning 0 for values less than 2^32. This is the hashCode function used by the standard Java implementation.

I was noticing an unexpected delay in one of the operations related to double data types. After some debugging I realized that the HashMap puts and gets were extremely slow. That pointed me to the hashCode implementation in DoubleWritable, which turned out to be the cause of the slow HashMap I/O.

That is why I propose to use the standard Java implementation of hashCode for the double type.

Thanks,
Vaibhav
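The points in this reply can be checked with a small standalone sketch. The class name DoubleHashDemo and the helper names are hypothetical, and truncatedHash is only an assumed stand-in for a weak hash that keeps the low 32 bits; the thread does not show the pre-patch code. foldedHash is the expression quoted in the review comment, which also matches what java.lang.Double.hashCode() computes.

```java
import java.util.HashSet;
import java.util.Set;

public class DoubleHashDemo {
    // ASSUMPTION: illustrative weak hash that keeps only the low 32 bits.
    // This is not taken from the thread; it just shows how collisions can arise.
    public static int truncatedHash(double value) {
        return (int) Double.doubleToLongBits(value);
    }

    // The hash from the patch excerpt: XOR the high 32 bits into the low 32 bits.
    public static int foldedHash(double value) {
        long v = Double.doubleToLongBits(value);
        return (int) (v ^ (v >>> 32));
    }

    // Counts how many distinct hash codes the doubles 0.0, 1.0, ..., n-1 produce.
    public static int distinctHashes(int n, boolean folded) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(folded ? foldedHash(i) : truncatedHash(i));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("truncated: " + distinctHashes(1000, false));
        System.out.println("folded:    " + distinctHashes(1000, true));

        // A long below 2^32 has all-zero high bits, so (v >>> 32) is 0
        // and v ^ 0 is v itself rather than 0.
        long small = 12345L;
        System.out.println((int) (small ^ (small >>> 32)));
    }
}
```

Small integer-valued doubles have all-zero low mantissa bits, so the truncating variate maps every one of them to the same hash code, and a HashMap keyed on such values puts every entry in a single bucket. The folded variant gives each of the 1000 inputs its own hash code.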