J. Tipan Verella created HIVE-7555:
--------------------------------------

             Summary: inner join is being resolved as cartesian product
                 Key: HIVE-7555
                 URL: https://issues.apache.org/jira/browse/HIVE-7555
             Project: Hive
          Issue Type: Bug
         Environment: CentOS
            Reporter: J. Tipan Verella


I believe this is a bug, because I have not been able to find a workaround for
the problem described in the following Stack Overflow question:

http://stackoverflow.com/questions/25020190/hive-query-returns-cartesian-product-instead-of-inner-join


The issue is as follows (repeated from SO for convenience).
This is the type of query I am sending to Hive:

    SELECT BigTable.nicefield,LargeTable.* 
    FROM LargeTable INNER JOIN BigTable 
        ON (
            LargeTable.joinfield1of4 = BigTable.joinfield1of4 
            AND LargeTable.joinfield2of4 = BigTable.joinfield2of4 
        )   
    WHERE LargeTable.joinfield3of4=20140726 AND LargeTable.joinfield4of4=15
        AND BigTable.joinfield3of4=20140726 AND BigTable.joinfield4of4=15
        AND LargeTable.filterfiled1of2=123456
        AND LargeTable.filterfiled2of2=98765
        AND LargeTable.joinfield2of4=12 
        AND LargeTable.joinfield1of4='iwanttolikehive'       

It returns `2418025` rows.  The issue is that 

    SELECT *  
    FROM LargeTable 
    WHERE joinfield3of4=20140726 AND joinfield4of4=15
        AND filterfiled1of2=123456 
        AND filterfiled2of2=98765
        AND joinfield2of4=12 
        AND joinfield1of4='iwanttolikehive'

returns `1555` rows, and so does:

    SELECT *  
    FROM BigTable 
    WHERE joinfield3of4=20140726 AND joinfield4of4=15
        AND joinfield2of4=12 
        AND joinfield1of4='iwanttolikehive'


Note that **1555^2 = 2418025**.
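
To make the arithmetic above easier to check, here is a minimal sketch of what
standard inner-join semantics give when every filtered row on both sides
carries the same join-key values, as the WHERE clauses above require
(joinfield1of4='iwanttolikehive', joinfield2of4=12). It uses SQLite through
Python as a stand-in for Hive, and the table contents, the `payload` column,
and the `join_count` helper are hypothetical. It is only an assumption that
the real tables look like this, so treat it as a way to test the reasoning,
not as a diagnosis.

```python
import sqlite3


def join_count(n_large: int, n_big: int) -> int:
    """Count inner-join rows when all rows on both sides share one join key."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Hive
    conn.execute(
        "CREATE TABLE LargeTable "
        "(joinfield1of4 TEXT, joinfield2of4 INTEGER, payload INTEGER)"
    )
    conn.execute(
        "CREATE TABLE BigTable "
        "(joinfield1of4 TEXT, joinfield2of4 INTEGER, nicefield INTEGER)"
    )
    # Hypothetical data: every row has the same two join-key values,
    # mirroring the constants fixed by the WHERE clauses in the report.
    conn.executemany(
        "INSERT INTO LargeTable VALUES ('iwanttolikehive', 12, ?)",
        [(i,) for i in range(n_large)],
    )
    conn.executemany(
        "INSERT INTO BigTable VALUES ('iwanttolikehive', 12, ?)",
        [(i,) for i in range(n_big)],
    )
    (count,) = conn.execute(
        """
        SELECT COUNT(*)
        FROM LargeTable INNER JOIN BigTable
            ON (
                LargeTable.joinfield1of4 = BigTable.joinfield1of4
                AND LargeTable.joinfield2of4 = BigTable.joinfield2of4
            )
        """
    ).fetchone()
    conn.close()
    return count


if __name__ == "__main__":
    print(join_count(5, 5))  # prints 25
```

In this sketch the join condition matches every LargeTable row with every
BigTable row, so the count is the product of the two row counts. If the two
join columns do not uniquely identify rows in the real tables, the same
effect would be expected there.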

Feel free to discard this issue if it is not a bug, but please provide a 
solution on SO.

Thank you.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
