[
https://issues.apache.org/jira/browse/HIVE-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stamatis Zampetakis updated HIVE-17814:
---------------------------------------
Fix Version/s: (was: 3.2.0)
I cleared the fixVersion field since this ticket is still open. Please review
this ticket and, if the fix is already committed to a specific version, please
set the version accordingly and mark the ticket as RESOLVED.
According to the [JIRA
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute]
the fixVersion should be set only when the issue is resolved/closed.
> Reduce Memory footprint for large database bootstrap replication load
> ----------------------------------------------------------------------
>
> Key: HIVE-17814
> URL: https://issues.apache.org/jira/browse/HIVE-17814
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 3.0.0
> Reporter: Anishek Agarwal
> Assignee: Anishek Agarwal
> Priority: Major
>
> As part of HIVE-16896 we generate Query Tasks dynamically for bootstrap repl
> load. This was done because the tasks for a large database form a very large
> graph with hundreds of thousands of objects, which puts additional memory
> pressure on Hive.
> The execution hooks, however, still keep a reference to the query plan, which
> is modified dynamically, so by the end of all task execution Hive holds the
> whole DAG in memory, which is what we have to prevent. Additionally, for
> PostExecution Hive hooks we store the TaskRunner objects for each task that
> is executed.
> We have to handle these issues to prevent excessive memory usage for
> replication, specifically bootstrap replication.
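
The retention problem described above can be illustrated with a minimal,
self-contained sketch. Task, Plan, Hook and the dropFinished switch are
hypothetical names invented for this illustration; they are not Hive classes,
and dropping finished tasks is only one assumed way to bound memory, not
necessarily the fix adopted for this ticket.

```java
import java.util.ArrayList;
import java.util.List;

public class HookRetentionSketch {

    /** Hypothetical stand-in for one unit of work; not a Hive class. */
    static final class Task {
        final int id;
        Task(int id) { this.id = id; }
    }

    /** Hypothetical plan whose task list is extended while the load runs. */
    static final class Plan {
        final List<Task> tasks = new ArrayList<>();
    }

    /** Hypothetical hook that keeps a strong reference to the plan. */
    static final class Hook {
        final Plan plan;
        Hook(Plan plan) { this.plan = plan; }
        int reachableTasks() { return plan.tasks.size(); }
    }

    /**
     * Generates `batches` rounds of `batchSize` tasks and returns how many
     * Task objects are still reachable from the hook afterwards.
     * When dropFinished is true, the previous batch is removed from the plan
     * before the next one is generated (an assumed mitigation).
     */
    static int run(int batches, int batchSize, boolean dropFinished) {
        Plan plan = new Plan();
        Hook hook = new Hook(plan);
        int nextId = 0;
        for (int b = 0; b < batches; b++) {
            if (dropFinished) {
                plan.tasks.clear(); // release tasks from the finished batch
            }
            for (int i = 0; i < batchSize; i++) {
                plan.tasks.add(new Task(nextId++));
            }
            // The tasks of this batch would execute here.
        }
        return hook.reachableTasks();
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 100, false)); // whole graph stays reachable
        System.out.println(run(1000, 100, true));  // only the last batch remains
    }
}
```

With 1000 batches of 100 tasks, the first call reports 100000 reachable tasks
and the second reports 100, which is the difference between holding the whole
DAG and holding only the batch currently being executed.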