[
https://issues.apache.org/jira/browse/HIVE-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769049#comment-13769049
]
Mohammad Kamrul Islam commented on HIVE-4732:
---------------------------------------------
[~appodictic]: I can see your point. Indeed a very informative link.
As the link mentions, ID collisions are extremely rare.
Pasted from Wikipedia:
"To put these numbers into perspective, the annual risk of someone being hit by
a meteorite is estimated to be one chance in 17 billion,[38] which means the
probability is about 0.00000000006 (6 × 10^-11), equivalent to the odds of
creating a few tens of trillions of UUIDs in a year and having one duplicate.
In other words, only after generating 1 billion UUIDs every second for the next
100 years, the probability of creating just one duplicate would be about 50%.
The probability of one duplicate would be about 50% if every person on earth
owns 600 million UUIDs."
With these probabilities, is it really necessary to make things more complex?
Moreover, usually only a few of these IDs are generated in one Hive session.
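
For anyone who wants to check the order of magnitude, here is a rough sketch
of the usual birthday-bound approximation, p ~ 1 - exp(-n^2 / (2N)). It
assumes random (version 4) UUIDs with 122 random bits, so N = 2^122. The class
and method names are made up for illustration and are not part of any patch.

```java
public class UuidCollisionEstimate {
    // Assumption: version-4 UUIDs, which carry 122 random bits
    // (the remaining 6 bits encode the version and variant).
    public static final double RANDOM_UUID_SPACE = Math.pow(2.0, 122);

    /**
     * Birthday-bound approximation of the probability that at least two
     * of n randomly generated UUIDs collide: p ~ 1 - exp(-n^2 / (2N)).
     */
    public static double collisionProbability(double n) {
        double exponent = -(n * n) / (2.0 * RANDOM_UUID_SPACE);
        // expm1 computes exp(x) - 1 accurately when x is close to zero,
        // which is the case for any realistic number of IDs.
        return -Math.expm1(exponent);
    }

    public static void main(String[] args) {
        // Hypothetical ID counts, from "a few per session" up to the
        // scale at which the approximation reaches roughly one half.
        double[] counts = {1e3, 1e9, 1e15, 2.71e18};
        for (double n : counts) {
            System.out.printf("n = %.2e  p(collision) ~ %.2e%n",
                    n, collisionProbability(n));
        }
    }
}
```

For the handful of IDs created in one Hive session, the estimate is far below
anything that could be observed in practice.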
> Reduce or eliminate the expensive Schema equals() check for AvroSerde
> ---------------------------------------------------------------------
>
> Key: HIVE-4732
> URL: https://issues.apache.org/jira/browse/HIVE-4732
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Reporter: Mark Wagner
> Assignee: Mohammad Kamrul Islam
> Attachments: HIVE-4732.1.patch, HIVE-4732.4.patch,
> HIVE-4732.v1.patch, HIVE-4732.v4.patch
>
>
> The AvroSerde spends a significant amount of time checking schema equality.
> Changing to compare hashcodes (which can be computed once then reused) will
> improve performance.
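
The idea in the description (compute a hash once, then reuse it) could be
sketched as below. This is only an illustration, not the attached patch: the
wrapper class HashCachedKey is hypothetical, a generic type stands in for the
Avro schema, and the sketch falls back to a full equals() when the cached
hashes match, so it can never report two different values as equal. It also
assumes the wrapped object is not mutated after wrapping.

```java
import java.util.Objects;

public final class HashCachedKey<T> {
    private final T value;
    private final int cachedHash;

    public HashCachedKey(T value) {
        this.value = Objects.requireNonNull(value);
        // The potentially expensive hash is computed exactly once here.
        this.cachedHash = value.hashCode();
    }

    public T get() {
        return value;
    }

    @Override
    public int hashCode() {
        return cachedHash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof HashCachedKey)) {
            return false;
        }
        HashCachedKey<?> other = (HashCachedKey<?>) o;
        // Cheap rejection: different cached hashes mean different values.
        if (cachedHash != other.cachedHash) {
            return false;
        }
        // Same hash: confirm with identity first, then the full comparison.
        return value == other.value || value.equals(other.value);
    }
}
```

With this shape, the expensive comparison only runs when the hashes already
agree; comparing hashes alone would be faster still, at the cost of accepting
the small collision risk discussed in the comments.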