[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705401#comment-13705401 ]
Brock Noland commented on HIVE-4838:
------------------------------------

Hey, thanks for the feedback! Yes, I thought about those items as well. I have a patch just about ready, which I'd like to get in before the optimizations since it fixes some correctness bugs, but I'd love to pursue those two items in a follow-up JIRA. For example, the following code produces unexpected results :)

{noformat}
public static void main(String[] args) {
  // Two keys with identical contents, including a null field.
  MapJoinDoubleKeys leftDouble = new MapJoinDoubleKeys(148, null);
  MapJoinDoubleKeys rightDouble = new MapJoinDoubleKeys(148, null);
  System.out.println(leftDouble.equals(rightDouble));

  // Two keys that differ only in their second, non-null field.
  MapJoinObjectKey leftObject = new MapJoinObjectKey(new Object[]{null, "left"});
  MapJoinObjectKey rightObject = new MapJoinObjectKey(new Object[]{null, "right"});
  System.out.println(leftObject.equals(rightObject));
}
{noformat}

> Refactor MapJoin HashMap code to improve testability and readability
> --------------------------------------------------------------------
>
>                 Key: HIVE-4838
>                 URL: https://issues.apache.org/jira/browse/HIVE-4838
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Brock Noland
>            Assignee: Brock Noland
>
> MapJoin is an essential component for high performance joins in Hive and the
> current code has done great service for many years. However, the code is
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization
> metadata to the Key, Row classes.
> * The API of a logical "Table Container" is not defined and therefore it's
> unclear what APIs HashMapWrapper
> needs to publicize. Additionally HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and
> implement the table container.
> Ideally these logical units could be separated.
> * HashTableSinkObjectCtx has unused fields and unused methods
> * CommonJoinOperator and children use ArrayList on left hand side when only
> List is required
> * There are unused classes MRU, DCLLItem and classes which duplicate
> functionality MapJoinSingleKey and MapJoinDoubleKeys

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
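
The example in the comment compares keys that contain null fields. As a rough illustration of what null-safe Java equality for such keys could look like, here is a minimal sketch. The classes TwoFieldKey and ArrayKey are hypothetical stand-ins written for this note; they are not Hive's MapJoinDoubleKeys or MapJoinObjectKey, and the sketch covers plain Java object equality only, not SQL join semantics for NULL.

```java
import java.util.Arrays;
import java.util.Objects;

/** Hypothetical two-field key (illustrative only, not a Hive class). */
final class TwoFieldKey {
    private final Object first;
    private final Object second;

    TwoFieldKey(Object first, Object second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof TwoFieldKey)) {
            return false;
        }
        TwoFieldKey that = (TwoFieldKey) other;
        // Objects.equals treats two nulls as equal and never throws on null.
        return Objects.equals(first, that.first)
            && Objects.equals(second, that.second);
    }

    @Override
    public int hashCode() {
        // Objects.hash tolerates null elements.
        return Objects.hash(first, second);
    }
}

/** Hypothetical array-backed key (illustrative only, not a Hive class). */
final class ArrayKey {
    private final Object[] fields;

    ArrayKey(Object[] fields) {
        // Defensive copy so later changes by the caller do not alter the key.
        this.fields = fields.clone();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ArrayKey)) {
            return false;
        }
        // Arrays.equals compares element by element and handles null elements.
        return Arrays.equals(fields, ((ArrayKey) other).fields);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(fields);
    }
}
```

With these stand-ins, two TwoFieldKey instances built from (148, null) compare equal, while ArrayKey instances built from {null, "left"} and {null, "right"} do not.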