[ 
https://issues.apache.org/jira/browse/HIVE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joydeep Sen Sarma updated HIVE-1408:
------------------------------------

    Attachment: 1408.2.patch
                1408.2.q.out.patch

v2. this is ready for review.

added tests:
- the tests now use 'pfile:///' namespace as the default warehouse filesystem. 
This is served by a proxy filesystem class that passes requests to the local 
file system
- this comprehensively tests the file system issues that arise when running in 
local mode (where the intermediate data's file system can now differ from the 
warehouse's file system). the patch includes several small fixes for bugs 
discovered through this test mode.
- a lot of test results change because of the new namespace and because of 
the changes in tmp file naming. i am attaching an extra diff (.q.out.patch) 
that shows only the interesting changes.
- some tests have been modified to run with a non-local setting for the 
jobtracker and with auto-local-mode turned on. this tests the new functionality.
- there is one test (archive.q) that's still breaking because of the filesystem 
issues. i am waiting for a fix from pyang, but this should not block the review.
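
To illustrate the proxy idea from the first bullet, here is a minimal sketch of mapping a 'pfile' URI onto the local 'file' scheme. It is not the class in the patch: the name ProxySchemeMapper and its method are hypothetical, and a real proxy filesystem would also have to forward the actual file operations. It only shows the scheme translation, using java.net.URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of the patch: translates a proxy-scheme URI
// to the local file scheme while leaving the path untouched.
final class ProxySchemeMapper {
    static final String PROXY_SCHEME = "pfile"; // scheme named in the comment above
    static final String LOCAL_SCHEME = "file";  // local file system scheme

    private ProxySchemeMapper() {}

    static URI toLocal(URI proxied) {
        if (!PROXY_SCHEME.equals(proxied.getScheme())) {
            throw new IllegalArgumentException("not a " + PROXY_SCHEME + " URI: " + proxied);
        }
        try {
            // Rebuild the URI with only the scheme swapped.
            return new URI(LOCAL_SCHEME, proxied.getAuthority(), proxied.getPath(),
                           proxied.getQuery(), proxied.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Because only the scheme changes, the warehouse and the intermediate data can end up on the same local disk while still appearing to Hive as two different file systems, which is what the tests rely on.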

> add option to let hive automatically run in local mode based on tunable 
> heuristics
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-1408
>                 URL: https://issues.apache.org/jira/browse/HIVE-1408
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>         Attachments: 1408.1.patch, 1408.2.patch, 1408.2.q.out.patch
>
>
> as a followup to HIVE-543 - we should have a simple option (enabled by 
> default) to let hive run in local mode if possible.
> two levels of options are desirable:
> 1. hive.exec.mode.local.auto=true/false // control whether local mode is 
> automatically chosen
> 2. Options to control different heuristics, some naive examples:
>      hive.exec.mode.local.auto.input.size.max=1G // don't choose local mode 
> if data > 1G
>      hive.exec.mode.local.auto.script.enable=true/false // choose if local 
> mode is enabled for queries with user scripts
> this can be implemented as a pre/post execution hook. It makes sense to 
> provide this as a standard hook in the hive codebase since it's likely to 
> improve response time for many users (especially for test queries).
> the initial proposal is to choose this at the query level and not at the per 
> hive-task (i.e. hadoop job) level. choosing per job requires more changes to 
> compilation (to not pre-commit to hdfs or local scratch directories at 
> compile time).
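
The two levels of options in the description above can be sketched as a single query-level decision. This is an illustration only, not the logic in the attached patch: the class LocalModeHeuristic and its method are hypothetical, the size limit is assumed to be a plain byte count (the description writes it as "1G"), and the default for the script option is an assumption. Only the three property names come from the description.

```java
import java.util.Map;

// Hypothetical sketch of a query-level "run locally?" decision.
final class LocalModeHeuristic {
    // Property names as written in the issue description.
    static final String AUTO = "hive.exec.mode.local.auto";
    static final String MAX_INPUT = "hive.exec.mode.local.auto.input.size.max";
    static final String SCRIPT_ENABLE = "hive.exec.mode.local.auto.script.enable";

    // Assumed default: 1G expressed in bytes.
    static final long DEFAULT_MAX_INPUT_BYTES = 1L << 30;

    private LocalModeHeuristic() {}

    static boolean shouldRunLocally(Map<String, String> conf,
                                    long totalInputBytes,
                                    boolean usesUserScript) {
        // Level 1: the master switch (the description proposes enabled by default).
        if (!Boolean.parseBoolean(conf.getOrDefault(AUTO, "true"))) {
            return false;
        }
        // Level 2a: do not choose local mode if the input exceeds the limit.
        long maxBytes = Long.parseLong(
            conf.getOrDefault(MAX_INPUT, String.valueOf(DEFAULT_MAX_INPUT_BYTES)));
        if (totalInputBytes > maxBytes) {
            return false;
        }
        // Level 2b: queries with user scripts qualify only if explicitly allowed
        // (defaulting to false here is an assumption).
        boolean scriptsAllowed =
            Boolean.parseBoolean(conf.getOrDefault(SCRIPT_ENABLE, "false"));
        return !usesUserScript || scriptsAllowed;
    }
}
```

With an empty configuration, a script-free query over a few kilobytes would qualify for local mode under these assumptions, while the same query over more than 1G of input would not.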

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
