[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class

2010-09-13 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908639#action_12908639
 ] 

Ning Zhang commented on HIVE-1629:
--

Good question, John. I think this patch doesn't affect bucketing, which is 
implemented using ObjectInspectorUtils.hashCode(). In fact, the hash function 
used there for Double is the same as the one provided in this patch. But I'll 
double-check with Zheng/Namit tomorrow. 

 Patch to fix hashCode method in DoubleWritable class
 

 Key: HIVE-1629
 URL: https://issues.apache.org/jira/browse/HIVE-1629
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.7.0

 Attachments: HIVE-1629.patch


 A patch to fix the hashCode() method of Hive's DoubleWritable class.
 It prevents a HashMap keyed by DoubleWritable from degrading into a LinkedList.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class

2010-09-12 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908599#action_12908599
 ] 

Ning Zhang commented on HIVE-1629:
--

+1 Will commit if tests pass.




[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class

2010-09-12 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908622#action_12908622
 ] 

John Sichi commented on HIVE-1629:
--

Ning, does this change introduce incompatibility with any persistent storage 
(e.g. bucketing)?
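
The concern can be illustrated with a minimal sketch. It assumes a hypothetical bucketing rule (non-negative hash modulo the bucket count) and a hypothetical truncating hash as the "old" function; neither is taken from the Hive source, and all names below are made up for illustration. If stored data had been bucketed with one hash and later read with another, the same value could map to a different bucket.

```java
class BucketDemo {
    // Hypothetical bucket rule (an assumption, not Hive's code):
    // clear the sign bit, then take the remainder by the bucket count.
    static int bucketFor(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    // Hypothetical "old" hash: keeps only the low 32 bits of the bit pattern.
    static int truncatingHash(double value) {
        return (int) Double.doubleToLongBits(value);
    }

    // Fold used by java.lang.Double.hashCode(): XOR of the two 32-bit halves.
    static int foldingHash(double value) {
        long v = Double.doubleToLongBits(value);
        return (int) (v ^ (v >>> 32));
    }

    public static void main(String[] args) {
        int numBuckets = 31; // arbitrary bucket count for the illustration
        double key = 42.0;
        System.out.println("bucket with truncating hash: "
                + bucketFor(truncatingHash(key), numBuckets));
        System.out.println("bucket with folding hash:    "
                + bucketFor(foldingHash(key), numBuckets));
    }
}
```

Under these assumptions the two hashes place 42.0 in different buckets, which is why it matters which hash function a persistent layout actually depends on.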





[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class

2010-09-10 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908194#action_12908194
 ] 

Ning Zhang commented on HIVE-1629:
--

+long v = Double.doubleToLongBits(value);
+return (int) (v ^ (v >>> 32));

Won't this return 0 for all long values less than 2^32?

I searched the web, and the following 64-bit to 32-bit hash seems to be a good one:

http://www.cris.com/~ttwang/tech/inthash.htm
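
A small sketch can check the question directly. It assumes the shift in the quoted patch lines is the unsigned right shift `>>>`, as in java.lang.Double.hashCode(); the class and method names are made up for illustration. For a long below 2^32 the high half is zero, so the XOR leaves the low 32 bits unchanged rather than zeroing them.

```java
class FoldDemo {
    // XOR the high 32 bits into the low 32 bits, then keep the low 32 bits.
    static int fold(long v) {
        return (int) (v ^ (v >>> 32));
    }

    // Hash a double through its IEEE 754 bit pattern.
    static int hash(double value) {
        return fold(Double.doubleToLongBits(value));
    }

    public static void main(String[] args) {
        // A long below 2^32: the high half is 0, so the result is the value itself.
        System.out.println(fold(5L));
        // Compare against the JDK's own hash for the same double.
        System.out.println(hash(1.0) == Double.valueOf(1.0).hashCode());
    }
}
```

Only an input whose two halves are equal (for example 0) folds to 0.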




[jira] Commented: (HIVE-1629) Patch to fix hashCode method in DoubleWritable class

2010-09-10 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908210#action_12908210
 ] 

Vaibhav Aggarwal commented on HIVE-1629:


Hi,

Double.doubleToLongBits converts the double value into its IEEE 754 
double-precision bit layout.
Furthermore, the XOR operator prevents returning 0 for values less than 2^32.

This is the hashCode function used by the standard Java implementation.

I noticed an unexpected delay in one of the operations related to double data 
types.
After some debugging I realized that the HashMap puts and gets were extremely 
slow.
That pointed me to the hashCode implementation in DoubleWritable, which turned 
out to be the cause of the slow HashMap operations.

That is why I propose using the standard Java implementation of hashCode for 
the double type.

Thanks
Vaibhav
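
To make the slowdown concrete, here is a minimal sketch comparing two hash functions over whole-number doubles. The truncating hash is a hypothetical weak hash chosen for illustration (it is not quoted from DoubleWritable), and the class and method names are assumptions. Whole-number doubles of modest size have all-zero low mantissa bits, so a hash that keeps only the low 32 bits sends them all to the same value, and a HashMap keyed on them degenerates into one long chain.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // Hypothetical weak hash: keeps only the low 32 bits of the bit pattern.
    static int truncatingHash(double value) {
        return (int) Double.doubleToLongBits(value);
    }

    // Fold used by java.lang.Double.hashCode(): XOR of the two 32-bit halves.
    static int foldingHash(double value) {
        long v = Double.doubleToLongBits(value);
        return (int) (v ^ (v >>> 32));
    }

    // Count distinct hash codes produced for the doubles 1.0, 2.0, ..., n.
    static int distinctHashes(boolean folding, int n) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 1; i <= n; i++) {
            double d = i;
            seen.add(folding ? foldingHash(d) : truncatingHash(d));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("truncating: " + distinctHashes(false, 1000));
        System.out.println("folding:    " + distinctHashes(true, 1000));
    }
}
```

For this input the truncating hash yields a single distinct code, while the folding hash yields one code per key.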
