Benjamin Kim created HIVE-6230:
----------------------------------

             Summary: Hive UDAF with subquery runs all logic on reducers
                 Key: HIVE-6230
                 URL: https://issues.apache.org/jira/browse/HIVE-6230
             Project: Hive
          Issue Type: Bug
          Components: UDF
    Affects Versions: 0.10.0
            Reporter: Benjamin Kim


When I run my custom-built UDAF over a subquery, the iterate, terminatePartial,
merge, and terminate methods all run on reducers only, whereas iterate and
terminatePartial should run on mappers.

I don't know whether this is by design, but it leads to very long execution
times on the reducers and makes them create large temporary files.

This happened to me with SimpleUDAF. I haven't tested it with GenericUDAF.

Here is an example:
SELECT MyUDAF(col1) FROM (
  SELECT * FROM test) t
GROUP BY col2
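
A possible way to confirm where each phase runs is to look at the query plan
with map-side aggregation explicitly requested. This is only a sketch and has
not been verified on 0.10.0: MyUDAF, test, col1 and col2 are the names from the
example, the alias t is added for illustration, and hive.map.aggr is the
standard Hive setting that enables map-side partial aggregation.

```sql
-- Sketch only: names come from the example in this report;
-- the subquery alias t is an assumption added for illustration.

-- Request map-side (partial) aggregation.
SET hive.map.aggr=true;

-- Print the plan instead of running the query.
EXPLAIN
SELECT MyUDAF(col1) FROM (
  SELECT * FROM test) t
GROUP BY col2;
```

If iterate and terminatePartial were running on mappers, the plan should list
a Group By Operator in the map-side operator tree. If a Group By Operator
appears only in the reduce-side operator tree, all four methods run on
reducers.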




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
