[
https://issues.apache.org/jira/browse/HIVE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890108#action_12890108
]
Joydeep Sen Sarma commented on HIVE-1408:
-----------------------------------------
summarizing comments from internal review:
- log why local mode was not chosen (not clear whether this should be printed
all the way to the console)
- turn it on by default in trunk
- use mapred.child.java.opts for child jvm memory for local mode (as opposed to
the current policy of passing down HADOOP_HEAPMAX). this will let the
map-reduce engine run with more memory and allow us to differentiate between
compiler and execution memory requirements
- set auto-local reducer threshold to 1. local mode doesn't run more than one
reducer.
follow-on jiras:
1. don't scan all partitions for determining local mode (may apply to
estimateReducers as well)
2. use # of splits instead of # of files for determining local mode.
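
a rough sketch of how the checks above could hand back a reason for logging
when local mode is not chosen. this is only an illustration under stated
assumptions, not the code in the patch: the names LocalModeConf and
why_not_local are hypothetical, the reducer limit field is an assumed stand-in
for the "auto-local reducer threshold", and the option names in the comments
are taken from the issue description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalModeConf:
    """Hypothetical holder for the proposed options (defaults are illustrative)."""
    auto: bool = True              # hive.exec.mode.local.auto
    input_size_max: int = 1 << 30  # hive.exec.mode.local.auto.input.size.max (1G)
    script_enable: bool = True     # hive.exec.mode.local.auto.script.enable
    reducers_max: int = 1          # assumed name for the auto-local reducer threshold


def why_not_local(conf: LocalModeConf, input_bytes: int,
                  num_reducers: int, uses_script: bool) -> Optional[str]:
    """Return None if local mode may be chosen, otherwise a reason to log."""
    if not conf.auto:
        return "hive.exec.mode.local.auto is false"
    if input_bytes > conf.input_size_max:
        return "input size %d exceeds limit %d" % (input_bytes, conf.input_size_max)
    if num_reducers > conf.reducers_max:
        return "number of reducers %d exceeds limit %d" % (num_reducers, conf.reducers_max)
    if uses_script and not conf.script_enable:
        return "query uses a user script and script.enable is false"
    return None


if __name__ == "__main__":
    reason = why_not_local(LocalModeConf(), input_bytes=2 << 30,
                           num_reducers=1, uses_script=False)
    print("local mode not chosen: " + reason if reason else "running in local mode")
```

returning the reason as a string leaves it to the caller to decide whether it
goes only to the log or all the way to the console.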
> add option to let hive automatically run in local mode based on tunable
> heuristics
> ----------------------------------------------------------------------------------
>
> Key: HIVE-1408
> URL: https://issues.apache.org/jira/browse/HIVE-1408
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Joydeep Sen Sarma
> Assignee: Joydeep Sen Sarma
> Attachments: 1408.1.patch, 1408.2.patch, 1408.2.q.out.patch,
> 1408.3.patch
>
>
> as a followup to HIVE-543 - we should have a simple option (enabled by
> default) to let hive run in local mode if possible.
> two levels of options are desirable:
> 1. hive.exec.mode.local.auto=true/false // control whether local mode is
> automatically chosen
> 2. Options to control different heuristics, some naive examples:
> hive.exec.mode.local.auto.input.size.max=1G // don't choose local mode
> if data > 1G
> hive.exec.mode.local.auto.script.enable=true/false // choose if local
> mode is enabled for queries with user scripts
> this can be implemented as a pre/post execution hook. It makes sense to
> provide this as a standard hook in the hive codebase since it's likely to
> improve response time for many users (especially for test queries).
> the initial proposal is to choose this at the query level and not at the per
> hive-task (i.e. hadoop job) level. choosing per job requires more changes to
> compilation (to not pre-commit to hdfs or local scratch directories at
> compile time).