[jira] [Updated] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

2013-01-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3915:
-----------------------------

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

Committed. Thanks, Kevin.

> Union with map-only query on one side and two MR job query on the other 
> produces wrong results
> ------------------------------------------------------------------------
>
>                 Key: HIVE-3915
>                 URL: https://issues.apache.org/jira/browse/HIVE-3915
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.11.0
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>             Fix For: 0.11.0
>
>         Attachments: HIVE-3915.1.patch.txt
>
>
> When a query contains a union with a map-only subquery on one side and a
> subquery involving two sequential map reduce jobs on the other, it can
> produce wrong results.  It appears that if the map-only query's TableScan
> operator is processed first, the task involving the union is made a root
> task.  Then, when the other subquery is processed, the second map reduce job
> gains the task involving the union as a child and is itself made a root
> task.  This means that both the first and second map reduce jobs are root
> tasks, so the dependency between the two is ignored.  If they are run in
> parallel (i.e. the cluster has more than one node), the side of the union
> with the two map reduce jobs produces no results, and only the results of
> the other side of the union are returned.
> The order in which TableScan operators are processed is crucial to
> reproducing this bug.  That order is determined by the order in which values
> are retrieved from a map and is hence hard to predict, so the bug doesn't
> always reproduce.
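
The failure mode described above can be illustrated with a small toy model.
This is only a sketch under stated assumptions: Task, promote_buggy,
promote_checked and build_plan are hypothetical names invented for this
illustration, not Hive classes or methods, and the "checked" variant is one
plausible guard, not necessarily what HIVE-3915.1.patch.txt does. It models
the case where the map-only side is processed first, so the union task is
already a root when the second map reduce job adopts it as a child.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: tasks are compared by identity
class Task:
    """Hypothetical stand-in for a job in the plan (not a Hive class)."""
    name: str
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def add_child(self, child):
        # Record the dependency in both directions.
        if child not in self.children:
            self.children.append(child)
            child.parents.append(self)


def promote_buggy(roots, old_root, new_parent):
    """Swap old_root for its new parent without checking the parent's parents."""
    roots = [t for t in roots if t is not old_root]
    if new_parent not in roots:
        roots.append(new_parent)
    return roots


def promote_checked(roots, old_root, new_parent):
    """Same swap, but a task that has parents is never made a root."""
    roots = [t for t in roots if t is not old_root]
    if not new_parent.parents and new_parent not in roots:
        roots.append(new_parent)
    return roots


def build_plan(promote):
    """Return the names of the root tasks, i.e. the jobs launched immediately."""
    mr1, mr2, union = Task("MR1"), Task("MR2"), Task("UNION")
    roots = [union]        # map-only side processed first: union task is a root
    roots.append(mr1)      # first job of the two-job side is a genuine root
    mr1.add_child(mr2)     # second job depends on the first
    mr2.add_child(union)   # second job gains the union task as a child
    roots = promote(roots, union, mr2)
    return [t.name for t in roots]


print(build_plan(promote_buggy))    # ['MR1', 'MR2']
print(build_plan(promote_checked))  # ['MR1']
```

In the unchecked variant both MR1 and MR2 end up as roots, so a parallel
executor may start MR2 before MR1 has written its output, which matches the
empty result described for that side of the union. In the checked variant
MR2 is reachable only through MR1, so the dependency is preserved.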

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

2013-01-18 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3915:


Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

2013-01-17 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3915:


Attachment: HIVE-3915.1.patch.txt
