[sqlite] UNION with results distinct on a particular column?

2009-03-24 Thread Matthew L. Creech
Hi,

I'm hoping someone here can help me out with a query.  I have multiple
tables, each with the same schema.  For example:

=
Table A:
=
1|"xxx"
2|"yyy"
3|"zzz"
=

=
Table B:
=
1|"xxx222"
3|"zzz222"
5|"www"
=

I'd like a SELECT statement that yields:

=
Result:
=
1|"xxx"
2|"yyy"
3|"zzz"
5|"www"
=

In other words, I want the UNION of all the input tables, but if there
are multiple results that have the same value in the first column, the
first table's value should take precedence.

This seems like a common scenario, so I'm probably missing something
trivial.  :)  But so far, the only way I've figured out to do this is
with something like:

SELECT * FROM
(SELECT 1 AS precedence, col1, col2 FROM A UNION
 SELECT 2 AS precedence, col1, col2 FROM B
 ORDER BY col1 ASC, precedence DESC)
GROUP BY precedence
ORDER BY col1 ASC;

(Just an example, I've got several other columns that have to be
sorted on, and there can be any number of tables).  This seems to do
what I want, but it takes an order of magnitude longer than the inner
SELECTs do on their own (i.e. without the GROUP BY which eliminates
rows with duplicate 'col1' values).  Any ideas on how I could do this
more efficiently?

Thanks!

-- 
Matthew L. Creech
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] UNION with results distinct on a particular column?

2009-03-24 Thread David Westbrook
Two possible solutions:

A) pure sql ... (warning: untested)  Start with all the possible col1
values, and then left join to the other tables, and pick the first
col2 found.

select col1,
  coalesce( A.col2, B.col2, C.col2 ) as col2
from (
  select distinct col1 from (
select col1 from A
union
select col1 from B
union
select col1 from C
  ) as u
) as tmp
left join A using (col1)
left join B using (col1)
left join C using (col1)
order by col1
;

B) This is potentially very easy at the application level ... here's a
perl/DBI example (also untested):
my %pairs = map {
  %{ $dbh->selectall_hashref("select col1, col2 from $_", 'col1') }
} reverse qw/ A B C /;
# This next line is optional, if you want the hash values to be col2's
instead of hashrefs:
$_=$_->{col2} for values %pairs;

--david

On Tue, Mar 24, 2009 at 5:36 PM, Matthew L. Creech  wrote:
> Hi,
>
> I'm hoping someone here can help me out with a query.  I have multiple
> tables, each with the same schema.  For example:
>
> =
> Table A:
> =
> 1|"xxx"
> 2|"yyy"
> 3|"zzz"
> =
>
> =
> Table B:
> =
> 1|"xxx222"
> 3|"zzz222"
> 5|"www"
> =
>
> I'd like a SELECT statement that yields:
>
> =
> Result:
> =
> 1|"xxx"
> 2|"yyy"
> 3|"zzz"
> 5|"www"
> =
>
> In other words, I want the UNION of all the input tables, but if there
> are multiple results that have the same value in the first column, the
> first table's value should take precedence.
>
> This seems like a common scenario, so I'm probably missing something
> trivial.  :)  But so far, the only way I've figured out to do this is
> with something like:
>
> SELECT * FROM
> (SELECT 1 AS precedence, col1, col2 FROM A UNION
>  SELECT 2 AS precedence, col1, col2 FROM B
>  ORDER BY col1 ASC, precedence DESC)
> GROUP BY precedence
> ORDER BY col1 ASC;
>
> (Just an example, I've got several other columns that have to be
> sorted on, and there can be any number of tables).  This seems to do
> what I want, but it takes an order of magnitude longer than the inner
> SELECTs do on their own (i.e. without the GROUP BY which eliminates
> rows with duplicate 'col1' values).  Any ideas on how I could do this
> more efficiently?
>
> Thanks!
>
> --
> Matthew L. Creech
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users