[ https://issues.apache.org/jira/browse/HIVE-29165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016926#comment-18016926 ]
Zhihua Deng commented on HIVE-29165:
------------------------------------
Thank you [~dkuzmenko] for the review!
> PartColNameInfo could introduce high hash collision due to the wide table
> -------------------------------------------------------------------------
>
> Key: HIVE-29165
> URL: https://issues.apache.org/jira/browse/HIVE-29165
> Project: Hive
> Issue Type: Improvement
> Components: Standalone Metastore
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.1.0
>
>
> In {{DirectSqlUpdatePart}}, {{PartColNameInfo}} is used as a map key that
> identifies the statistics to be updated or inserted. If the current table
> has many columns, say 1000, then more than 1000 {{PartColNameInfo}} keys are
> expected to hash into the same map bucket. If the table also has thousands
> of partitions, looking up or inserting a {{PartColNameInfo}} could be very
> slow.
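
To illustrate the collision pattern described above, here is a minimal Java sketch. It is not Hive's actual PartColNameInfo: the classes IdOnlyKey and IdAndColKey, their fields, and the assumption that the weak key hashes on the partition id alone are all hypothetical, chosen only to show how a key's hashCode affects how widely the keys of a wide table spread.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class PartColKeyDemo {
    // Hypothetical key: equality uses both fields, but the hash uses the partition id only.
    static final class IdOnlyKey {
        final long partitionId;
        final String colName;
        IdOnlyKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof IdOnlyKey)) return false;
            IdOnlyKey k = (IdOnlyKey) o;
            return partitionId == k.partitionId && colName.equals(k.colName);
        }
        // Every column of one partition gets the same hash code.
        @Override public int hashCode() { return Long.hashCode(partitionId); }
    }

    // Hypothetical key: the column name is mixed into the hash as well.
    static final class IdAndColKey {
        final long partitionId;
        final String colName;
        IdAndColKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof IdAndColKey)) return false;
            IdAndColKey k = (IdAndColKey) o;
            return partitionId == k.partitionId && colName.equals(k.colName);
        }
        @Override public int hashCode() { return Objects.hash(partitionId, colName); }
    }

    // Count distinct hash codes over partitions x columns keys (id-only hash).
    static int distinctIdOnlyHashes(int partitions, int columns) {
        Set<Integer> hashes = new HashSet<>();
        for (long p = 0; p < partitions; p++)
            for (int c = 0; c < columns; c++)
                hashes.add(new IdOnlyKey(p, "col" + c).hashCode());
        return hashes.size();
    }

    // Same count, with the column name included in the hash.
    static int distinctIdAndColHashes(int partitions, int columns) {
        Set<Integer> hashes = new HashSet<>();
        for (long p = 0; p < partitions; p++)
            for (int c = 0; c < columns; c++)
                hashes.add(new IdAndColKey(p, "col" + c).hashCode());
        return hashes.size();
    }

    public static void main(String[] args) {
        System.out.println("id-only hash codes:   " + distinctIdOnlyHashes(4, 1000));
        System.out.println("id+column hash codes: " + distinctIdAndColHashes(4, 1000));
    }
}
```

With 4 partitions and 1000 columns, the id-only key produces just 4 distinct hash codes for 4000 keys, so 1000 keys share each hash code and every lookup has to compare against them with equals. Mixing the column name into the hash spreads the same keys over far more hash codes.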
--
This message was sent by Atlassian Jira
(v8.20.10#820010)