[
https://issues.apache.org/jira/browse/HIVE-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270390#comment-14270390
]
Adam Kramer commented on HIVE-836:
----------------------------------
Oh hey there, five-year-old task.
Workaround: Use CLUSTER BY to force a reduce phase, and write to a staging
table to force a new map phase. Hive writes all the data to disk in every phase
anyway, so the staging table isn't actually a performance hit.
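A rough sketch of that workaround, applied to the query from the issue description. The staging table name `staging_joined` is made up for illustration, and whether the planner splits the phases exactly this way depends on the Hive version and settings, so treat it as an outline rather than a guaranteed plan:

```sql
-- Stage 1: the cheap join, written out to a staging table.
-- "staging_joined" is a hypothetical name, not something from the issue.
CREATE TABLE staging_joined AS
SELECT b.val AS val1, c.val AS val2
FROM tblb b JOIN tblc c ON (b.key = c.key);

-- Stage 2: a separate query over the staging table starts a new job,
-- so the expensive script is expected to run in that job's mappers.
SELECT TRANSFORM(s.val1, s.val2)
  USING './niftyscript'
  AS part1, part2, part3
FROM staging_joined s;

-- Variant: if the script should run in a reduce phase instead,
-- CLUSTER BY in the subquery forces a shuffle before the TRANSFORM.
SELECT TRANSFORM(m.val1, m.val2)
  USING './niftyscript'
  AS part1, part2, part3
FROM (
  SELECT val1, val2
  FROM staging_joined
  CLUSTER BY val1
) m;
```

The number of mappers in stage 2 then follows the size and split settings of the staging table rather than those of the join.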
Also protip: DON'T get distracted by the Hive keywords "MAP" and "REDUCE". They
are just synonyms for TRANSFORM and do not force the script into a map or
reduce phase, which is not what anybody expects.
> Add syntax to force a new mapreduce job / transform subquery in mapper
> ----------------------------------------------------------------------
>
> Key: HIVE-836
> URL: https://issues.apache.org/jira/browse/HIVE-836
> Project: Hive
> Issue Type: Wish
> Reporter: Adam Kramer
>
> Hive currently does a lot of awesome work to figure out when my transformers
> should be used in the mapper and when they should be used in the reducer.
> However, sometimes I have a different plan.
> For example, consider this:
> {code:title=foo.sql}
> SELECT TRANSFORM(a.val1, a.val2)
>   USING './niftyscript'
>   AS part1, part2, part3
> FROM (
>   SELECT b.val AS val1, c.val AS val2
>   FROM tblb b JOIN tblc c ON (b.key = c.key)
> ) a
> {code}
> ...now, assume that the join step is very easy and 'niftyscript' is really
> processor intensive. The ideal plan for this is an MR task with few mappers
> and few reducers, and then a second MR task with lots of mappers.
> Currently, there is no way to even require that the outer TRANSFORM statement
> occur in a separate map phase. Implementing a "hint" such as /*+ MAP */, akin
> to /*+ MAPJOIN(x) */, would be awesome.
> The current workaround is to dump everything to a temporary table and then
> start over, but that is not easy to scale--the subquery structure effectively
> (and easily) "locks" the mid-points so no other job can touch the table.