[
https://issues.apache.org/jira/browse/HIVE-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brock Noland updated HIVE-5454:
-------------------------------
Resolution: Fixed
Fix Version/s: 0.13.0
Assignee: Brock Noland
Status: Resolved (was: Patch Available)
Thank you for the contribution Harsh! I have committed this to trunk and will
attribute it to you when you are added as a contributor.
Note: I am assigning it to myself in the interim so I don't forget.
> HCatalog runs a partition listing with an empty filter
> ------------------------------------------------------
>
> Key: HIVE-5454
> URL: https://issues.apache.org/jira/browse/HIVE-5454
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Affects Versions: 0.12.0
> Reporter: Harsh J
> Assignee: Brock Noland
> Fix For: 0.13.0
>
> Attachments: D13317.1.patch, D13317.2.patch, D13317.3.patch
>
>
> This is a regression caused by HCATALOG-527, wherein the way HCatLoader
> calls HCatInputFormat causes it to do 2x partition lookups - once without
> the filter, and then again with the filter.
> For tables with a large number of partitions (100000, say), the non-filter
> lookup proves fatal both to the client ("Read timed out" errors from
> ThriftMetaStoreClient because the server doesn't respond) and to the server
> (too much data loaded into the cache, OOME, or slowdown).
> The fix would be to use a single call that also passes the partition filter
> information, as was the case in the HCatalog 0.4 sources before HCATALOG-527.
> (HCatalog-release-wise, this affects all 0.5.x users)
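
To make the cost difference in the description concrete, here is a minimal toy sketch. It is not the committed patch and it does not use any Hive or HCatalog API: FakeMetastore, listPartitions, loadTwice and loadOnce are hypothetical names invented for illustration, and the in-memory list only stands in for a real metastore. It simply counts how many partition entries get returned when a loader does an unfiltered lookup followed by a filtered one, versus a single filtered lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class PartitionLookupSketch {

    /** Hypothetical in-memory stand-in for a metastore; NOT a Hive class. */
    public static final class FakeMetastore {
        private final List<String> partitions = new ArrayList<>();
        public int lookups = 0;             // number of listing calls made
        public long partitionsReturned = 0; // total partition entries returned

        public FakeMetastore(int count) {
            for (int i = 0; i < count; i++) {
                partitions.add(String.format("ds=%05d", i));
            }
        }

        /** Lists partitions; a null filter means "return everything". */
        public List<String> listPartitions(Predicate<String> filter) {
            lookups++;
            List<String> out = new ArrayList<>();
            for (String p : partitions) {
                if (filter == null || filter.test(p)) {
                    out.add(p);
                }
            }
            partitionsReturned += out.size();
            return out;
        }
    }

    /** Regressed pattern: an unfiltered listing first, then the filtered one. */
    public static List<String> loadTwice(FakeMetastore ms, Predicate<String> filter) {
        ms.listPartitions(null); // result is discarded, but the full listing is still paid for
        return ms.listPartitions(filter);
    }

    /** Proposed pattern: a single listing that carries the filter. */
    public static List<String> loadOnce(FakeMetastore ms, Predicate<String> filter) {
        return ms.listPartitions(filter);
    }

    public static void main(String[] args) {
        Predicate<String> filter = p -> p.equals("ds=00042");

        FakeMetastore regressed = new FakeMetastore(100000);
        loadTwice(regressed, filter);
        System.out.println("two calls: lookups=" + regressed.lookups
                + " partitionsReturned=" + regressed.partitionsReturned);

        FakeMetastore fixed = new FakeMetastore(100000);
        loadOnce(fixed, filter);
        System.out.println("one call:  lookups=" + fixed.lookups
                + " partitionsReturned=" + fixed.partitionsReturned);
    }
}
```

Both patterns hand the same filtered result to the caller; the only difference is the extra full listing, which in this toy model scales with the total number of partitions rather than with the number that match.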