Kevin Wilfong created HIVE-3915:
-----------------------------------

             Summary: Union with map-only query on one side and two MR job 
query on the other produces wrong results
                 Key: HIVE-3915
                 URL: https://issues.apache.org/jira/browse/HIVE-3915
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.11.0
            Reporter: Kevin Wilfong
            Assignee: Kevin Wilfong


When a query contains a union with a map-only subquery on one side and a 
subquery involving two sequential MapReduce jobs on the other, it can produce 
wrong results.  It appears that if the map-only subquery's TableScan operator 
is processed first, the task containing the union is made a root task.  Then, 
when the other subquery is processed, the second MapReduce job gains the task 
containing the union as a child and is itself made a root task.  As a result, 
both the first and second MapReduce jobs are root tasks, so the dependency 
between the two is ignored.  If they are run in parallel (i.e. the cluster has 
more than one node), the side of the union with the two MapReduce jobs 
produces no results, and only the results of the other side of the union are 
returned.
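
The end state described above can be illustrated with a small toy model.  This 
is only a sketch, not Hive's actual planner code: the Task class, the names 
mr1, mr2 and union, and the helper functions are all hypothetical and exist 
only to show why a task that has a parent must not appear in the root list.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Task:
    """Toy stand-in for one job in the plan (hypothetical, not Hive's class)."""
    name: str
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def add_child(self, child):
        # Record the dependency in both directions.
        self.children.append(child)
        child.parents.append(self)


def first_wave(root_tasks):
    """Names of tasks a scheduler launches at once if it trusts the root list."""
    return [t.name for t in root_tasks]


def misplaced_roots(root_tasks):
    """Names of roots that still have a parent, so their dependency is ignored."""
    return [t.name for t in root_tasks if t.parents]


def build_plan(buggy):
    """Build the three-task plan and return its root list.

    Assumption: only the end state from the report is modelled, not the
    planner logic that leads to it.
    """
    mr1 = Task("mr1")      # first MapReduce job of the two-job subquery
    mr2 = Task("mr2")      # second MapReduce job, consumes mr1's output
    union = Task("union")  # task containing the union
    mr1.add_child(mr2)
    mr2.add_child(union)
    # Buggy end state: mr2 sits in the root list next to its own parent mr1.
    return [mr1, mr2] if buggy else [mr1]


buggy_roots = build_plan(buggy=True)
print(first_wave(buggy_roots))       # tasks started together
print(misplaced_roots(buggy_roots))  # roots that should have waited
```

In the buggy plan, mr2 is launched in the same wave as mr1 even though it still 
lists mr1 as a parent, which matches the symptom that the two-job side of the 
union contributes no rows.  A check along the lines of misplaced_roots is one 
possible way to detect such a plan in a test.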

The order in which TableScan operators are processed is crucial to reproducing 
this bug.  That order is determined by the order in which values are retrieved 
from a map, which is hard to predict, so the bug does not always reproduce.
