Searching my computer, I find Namit quoting: "ansi sql semantics are that the 
filter is executed after the join."

So there u go ..

In the same mail he suggested putting the filter condition for the table inside 
the ON clause for execution before the join. So I guess u might want to try:

SELECT communications.sender_id, conversions.actor_id,
communications.version, conversions.reference_id FROM communications
LEFT OUTER JOIN conversions ON (communications.reference_id =
conversions.reference_id AND conversions.application='reference') WHERE 
communications.hour >= 343013 AND
communications.hour < 343014



________________________________
From: Josh Ferguson [mailto:j...@besquared.net]
Sent: Monday, February 16, 2009 10:53 PM
To: hive-user@hadoop.apache.org
Subject: Re: Yet another join issue

The thing I want the WHERE clause to do here is filter the two sets *before* I 
join them not after. This is what Prasad's query was doing unfortunately it 
wasn't working properly  inside of a SELECT TRANSFORM FROM(...) block. Anyone 
know the SQL standard involving WHERE clauses and joins?

Josh Ferguson

On Feb 16, 2009, at 10:28 PM, Zheng Shao wrote:


The following will also work. In the case of left outer join, the value of 
columns from the missing table will be NULL, so checking whether the value IS 
NULL should solve the problem.

SELECT communications.sender_id, conversions.actor_id, communications.version, 
conversions.reference_id FROM communications LEFT OUTER JOIN conversions ON 
communications.reference_id = conversions.reference_id WHERE 
communications.hour >= 343013 AND communications.hour < 343014 AND ( 
conversions.application = 'reference' OR conversions.application IS NULL);

I am not sure whether SQL standard says something different from what Hive 
implements. It will be great if you could point us to some references.




Zheng
On Mon, Feb 16, 2009 at 10:21 PM, Josh Ferguson 
<j...@besquared.net<mailto:j...@besquared.net>> wrote:
Will this work if this select statement is already nested in a FROM?

Josh Ferguson

On Feb 16, 2009, at 10:14 PM, Prasad Chakka wrote:


Try below,

SELECT a.sender_id,
             b.actor_id,
             a.version,
             b.reference_id
FROM communications a
            LEFT OUTER JOIN (select conversions.actor_id, 
conversions.reference_id from conversions where 
conversions.application='reference')   b
            ON a.reference_id = b.reference_id
WHERE a.hour >= 343013
               AND a.hour < 343014

________________________________
From: Josh Ferguson <j...@besquared.net<http://j...@besquared.net>>
Reply-To: <hive-user@hadoop.apache.org<http://hive-user@hadoop.apache.org>>
Date: Mon, 16 Feb 2009 22:04:14 -0800
To: <hive-user@hadoop.apache.org<http://hive-user@hadoop.apache.org>>
Subject: Yet another join issue

Ok so this one I just ran into. It appears when doing a join of two
tables that, when using a LEFT OUTER JOIN and specifying a WHERE
condition on both tables, the resulting join is an INNER JOIN. Example
follows:

LEFT OUTER JOIN with WHERE condition on the LEFT table

SELECT communications.sender_id, conversions.actor_id,
communications.version, conversions.reference_id FROM communications
LEFT OUTER JOIN conversions ON communications.reference_id =
conversions.reference_id WHERE communications.hour >= 343013 AND
communications.hour < 343014

1       NULL    A       NULL
1       2       A       2
2       NULL    A       NULL
3       NULL    B       NULL

correct results, as expected

LEFT OUTER JOIN with WHERE conditions on both tables

SELECT communications.sender_id, conversions.actor_id,
communications.version, conversions.reference_id FROM communications
LEFT OUTER JOIN conversions ON communications.reference_id =
conversions.reference_id WHERE communications.hour >= 343013 AND
communications.hour < 343014 AND conversions.application = 'reference'

1       2       A       2

incorrect results, it simply displays the row that was matched in both
tables.

Any ideas what's going on here? Is this intended? How can I get a LEFT
OUTER JOIN and put conditions on both tables?

Josh Ferguson




--
Yours,
Zheng

Reply via email to