[ https://issues.apache.org/jira/browse/HIVE-19821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sahil Takiar reassigned HIVE-19821:
---
Assignee: (was: Sahil Takiar)
> Distributed HiveServer2
> ---
>
> Key: HIVE-19821
> URL: https://issues.apache.org/jira/browse/HIVE-19821
> Project: Hive
> Issue Type: New Feature
> Components: HiveServer2
> Reporter: Sahil Takiar
> Priority: Major
> Attachments: HIVE-19821.1.WIP.patch, HIVE-19821.2.WIP.patch,
> HIVE-19821_ Distributed HiveServer2.pdf
>
>
> HS2 deployments often hit OOM issues due to a number of factors: (1) too many
> concurrent connections, (2) queries that scan a large number of partitions
> have to pull a lot of metadata into memory (e.g. a query reading thousands of
> partitions requires loading metadata for thousands of partitions into
> memory), (3) very large queries can take up a lot of heap space, especially
> during query parsing. There are a number of other factors that can cause
> HiveServer2 to run out of memory; these are just some of the more common ones.
> Distributed HS2 proposes to do all query parsing, compilation, planning, and
> execution coordination inside a dedicated container. This should
> significantly decrease memory pressure on HS2 and allow HS2 to scale to a
> larger number of concurrent users.
> For Hive-on-Spark (HoS), and I think Hive-on-Tez as well, this just requires
> moving all query compilation, planning, etc. inside the application master
> for the corresponding Hive session.
> The main benefit here is isolation. A poorly written Hive query cannot bring
> down an entire HiveServer2 instance and force all other queries to fail.