Wow!  Thanks, JKL.  You sent me back to school on this one. ;-)  I had to go 
back and refresh the Venn diagram. :-) Thanks.  Also, thanks for the time 
explaining it.  I have to read it a few more times to really get it into my 
brains so that I don't keep asking the same questions.  It would be nice to do 
a compilation of all the questions done in here and any pertinent 
answer/response.



From: James K. Lowden
Sent: Saturday, April 6, 2019 01:17 PM
To: [email protected]
Subject: Re: [sqlite] Returning NULL or empty values when the SELECT does not 
satisfy all of the query

On Fri, 5 Apr 2019 18:54:18 +0000
Jose Isaias Cabrera <[email protected]> wrote:

> Why does this work

I don't know what "work" means, but I can explain the difference.  With
an outer join, JOIN and WHERE are not the same.  In analyzing the
query, we consider JOIN before WHERE.

> select
>  a.*, b.* from t as a LEFT JOIN z as b on a.a = b.f
> AND
>   a.idate = (select max(idate) from t where a = a.a)
> AND
>   b.idate = (select max(idate) from z where f = a.a)
> where a.a = 'p006'

In an outer join, the outer table -- think "outer" as Venn diagram --
is the "preserved" table.  All rows match, join criteria
notwithstanding. This component has no effect:

>   a.idate = (select max(idate) from t where a = a.a)

because "a" is the outer table, and so all rows match, join criteria
notwithstanding.

The inner table is a little different, and also different from WHERE in
an inner join.  On the inner table, JOIN restrictions can cause a row
not to match that otherwise would.  In your query:

>   b.idate = (select max(idate) from z where f = a.a)

"b" rows that don't pass that test are rejected from the join.  The "a"
columns will appear (because all "a" rows do) and the "b" columns will
be NULL.

Only once the join is completed do we consider WHERE:

> where a.a = 'p006'

This restricts rows in the outer table.

Now let's look at your other query.

> select
>  a.*, b.* from t as a LEFT JOIN z as b on a.a = b.f

The *only* join criterion is a.a = b.f.  All rows passing that test are
subjected to WHERE: all "a" rows (because outer table) and "b"
rows with a matching "f".

> where a.a = 'p006'

OK, same as #1.

> AND
>   a.idate = (select max(idate) from t where a = a.a)

This further restricts the "a" table rows, unlike #1.

> AND
>   b.idate = (select max(idate) from z where f = a.a)

This restricts produced rows to those passing the test.  Different from
#1, no row passes this test for which b.idate is NULL.

Whenever you disallow NULL on the inner table of an outer join, you
effectively convert the join from outer to inner.  You're asking for:

1.      all rows in "a", whether or not they match a "b" row, provided
2.      they do match a "b" row (because b.idate cannot be NULL)

You can vote for anyone in either party from this list of Democrats.

Because both AND clauses restrict the output, we can expect the 2nd
query to produce fewer rows.  If it doesn't, there's no need for an
outer query for the data in their current state.

By the way, the distinction of JOIN and WHERE is not a relational
concept.  It was added to SQL-92 in part to provide exactly the
separation your question illustrates: how to express an outer join.

Relationally, that's not needed.  An outer join is nothing but an inner
join + a union with the outer table.  That's cumbersome to express
in SQL, and anything cumbersome to express is hard to optimize, and
query optimization was and is the unsolved technical challenge of SQL.

--jkl
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to