Harsh J created HIVE-5454:
-----------------------------
Summary: HCatalog runs a partition listing with an empty filter
Key: HIVE-5454
URL: https://issues.apache.org/jira/browse/HIVE-5454
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Harsh J
This is a regression caused by HCATALOG-527: the way HCatLoader calls
HCatInputFormat makes it perform the partition lookup twice, once without the
filter and then again with the filter.
For tables with a large number of partitions (100000, say), the unfiltered
lookup proves fatal both to the client ("Read timed out" errors from
ThriftMetaStoreClient because the server doesn't respond) and to the server (too
much data loaded into the cache, OOME, or slowdown).
The fix would be to use a single call that also passes the partition filter
information, as was the case in the HCatalog 0.4 sources before HCATALOG-527.
(In terms of HCatalog releases, this affects all 0.5.x users.)
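
To illustrate the cost difference described above, here is a minimal,
self-contained sketch. It does not use the real HCatalog or metastore client
code: FakeMetastore, lookupTwice and lookupOnce are hypothetical names made up
for this example, and the "filter" is a plain Java predicate rather than a
metastore filter string. It only counts how many partition entries each
pattern would pull back to the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustration only: every name here is hypothetical, not a Hive/HCatalog API.
class PartitionLookupSketch {

    /** Toy in-memory stand-in for a metastore; counts what it returns to the caller. */
    static class FakeMetastore {
        private final List<String> partitions = new ArrayList<>();
        long partitionsReturned = 0; // total partition entries handed back
        int calls = 0;               // number of listing calls made

        FakeMetastore(int count) {
            for (int i = 0; i < count; i++) {
                partitions.add("day=" + i);
            }
        }

        /** Unfiltered listing: every partition is returned. */
        List<String> listAll() {
            calls++;
            partitionsReturned += partitions.size();
            return new ArrayList<>(partitions);
        }

        /** Filtered listing: only matching partitions are returned. */
        List<String> listByFilter(Predicate<String> filter) {
            calls++;
            List<String> matches = new ArrayList<>();
            for (String p : partitions) {
                if (filter.test(p)) {
                    matches.add(p);
                }
            }
            partitionsReturned += matches.size();
            return matches;
        }
    }

    /** The pattern reported in this issue: an unfiltered lookup, then a filtered one. */
    static List<String> lookupTwice(FakeMetastore ms, Predicate<String> filter) {
        ms.listAll(); // result is not needed, but the full listing is still fetched
        return ms.listByFilter(filter);
    }

    /** The proposed pattern: a single lookup that carries the filter. */
    static List<String> lookupOnce(FakeMetastore ms, Predicate<String> filter) {
        return ms.listByFilter(filter);
    }

    public static void main(String[] args) {
        Predicate<String> filter = p -> p.equals("day=42");

        FakeMetastore regressed = new FakeMetastore(100000);
        lookupTwice(regressed, filter);
        System.out.println("two lookups: calls=" + regressed.calls
                + " partitionsReturned=" + regressed.partitionsReturned);

        FakeMetastore fixed = new FakeMetastore(100000);
        lookupOnce(fixed, filter);
        System.out.println("one lookup:  calls=" + fixed.calls
                + " partitionsReturned=" + fixed.partitionsReturned);
    }
}
```

Both patterns return the same matching partitions; the difference is that the
two-lookup pattern also fetches the whole partition list, which is the part
that grows with table size.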