[
https://issues.apache.org/jira/browse/TAJO-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943440#comment-13943440
]
hyoungjunkim commented on TAJO-691:
-----------------------------------
HashMap is not the cause of the poor performance. VTuple.hashCode() returns the
same hash value for tuples that hold the same values in a different order, as in
the following case.
{code}
VTuple v1 = new VTuple(new Datum[]{new Int4Datum(1), new Int4Datum(2)});
VTuple v2 = new VTuple(new Datum[]{new Int4Datum(2), new Int4Datum(1)});
System.out.println(v1.hashCode());
System.out.println(v2.hashCode());
{code}
This code prints the same hash code for both tuples.
{noformat}
94
94
{noformat}
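The effect can be reproduced outside Tajo with a minimal sketch. The VTuple.hashCode() implementation is not shown in this comment, so the commutative sum below is only an assumption standing in for any order-insensitive combination; HashOrderDemo and both method names are hypothetical. java.util.Arrays.hashCode is shown as one order-sensitive alternative, not as the fix in the attached patch.

```java
import java.util.Arrays;

final class HashOrderDemo {
    // Assumed stand-in for an order-insensitive tuple hash: addition is
    // commutative, so any permutation of the same values collides.
    static int commutativeHash(int[] values) {
        int h = 0;
        for (int v : values) {
            h += Integer.hashCode(v);
        }
        return h;
    }

    // Order-sensitive alternative: Arrays.hashCode folds each element in as
    // 31 * h + element, so the position of a value changes the result.
    static int orderSensitiveHash(int[] values) {
        return Arrays.hashCode(values);
    }

    public static void main(String[] args) {
        int[] a = {1, 2};
        int[] b = {2, 1};
        // Same values, different order: the commutative hash collides.
        System.out.println(commutativeHash(a) == commutativeHash(b));
        // The order-sensitive hash tells the two tuples apart.
        System.out.println(orderSensitiveHash(a) == orderSensitiveHash(b));
    }
}
```

With a hash like the first one, join or aggregation keys that are permutations of each other land in the same HashMap bucket, which would be consistent with the slowdown reported for many unique keys.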
> HashJoin or HashAggregation is too slow if there is many unique keys
> --------------------------------------------------------------------
>
> Key: TAJO-691
> URL: https://issues.apache.org/jira/browse/TAJO-691
> Project: Tajo
> Issue Type: Improvement
> Reporter: hyoungjunkim
> Assignee: hyoungjunkim
> Attachments: TAJO-691.patch
>
>
> HashJoin or HashAggregation is too slow if there are many unique keys.
> Java's native Map handles large numbers of items inefficiently. Once a HashMap
> holds more than 1 million items, adding 10000 more items takes 7 to 10 seconds.
>
> This should be improved.