[GENERAL] Unexpected results when joining on date fields

2011-07-12 Thread Tim Uckun
I have two tables, traffic and sales. Each one has a date field and
lists the traffic and sales broken down by various parameters
(multiple rows for each date).

If I run  select (select count(*) from traffic) as traffic, (select
count(*) from sales) as sales; I get the following  49383;167807

if I run   select count(*) from traffic t inner join sales s on t.date
= s.date  I get 24836841.

If I change the join to a left join, right join, full join I get the
same number of records.

So I created a data table which just has the dates in it and ran this query.

select count(d.date) from dates d
inner join traffic t on t.date = d.date
inner join sales s on s.date = d.date

And I get the same number 24836841

Same goes for right joins on the above query. Left joins of course
give a different answer as there are more dates in the date table than
there are in the other tables.

I am a bit perplexed by what is happening here.

Cheers

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unexpected results when joining on date fields

2011-07-12 Thread Rick Genter
I don't think you understand what JOIN does. Think of it as a double-nested
FOR loop: for each record that has the value on the left side of the JOIN,
it will match all records on the right side of the JOIN that meet the ON
criteria. For example, if I have two tables:

A (i int, j int):

i   j
1 1
2 1
3 2
4 2
5 3
6 3


and

B (k int, j int)
k j
10   1
11   1
12   2
13   2
14   3
15   3

Then if I do

SELECT COUNT(*) FROM A JOIN B ON A.j = B.j

I'll get 12. Each record in A matches 2 records in B on the value of j.
Study the following transcript:

bash-3.2$ bin/psql -d g2_master
Password:
psql (8.4.4)
Type help for help.

g2_master=# CREATE TABLE A (i int, j int);
CREATE TABLE
g2_master=# CREATE TABLE B (k int, j int);
CREATE TABLE
g2_master=# INSERT INTO A VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6,
3);
INSERT 0 6
g2_master=# INSERT INTO B VALUES (10, 1), (11, 1), (12, 2), (13, 2), (14,
3), (15, 3);
INSERT 0 6
g2_master=# SELECT COUNT(*) FROM A JOIN B ON A.j = B.j;
 count
---
12
(1 row)

g2_master=# SELECT * FROM A JOIN B ON A.j = B.j;
 i | j | k  | j
---+---++---
 1 | 1 | 10 | 1
 1 | 1 | 11 | 1
 2 | 1 | 10 | 1
 2 | 1 | 11 | 1
 3 | 2 | 12 | 2
 3 | 2 | 13 | 2
 4 | 2 | 12 | 2
 4 | 2 | 13 | 2
 5 | 3 | 14 | 3
 5 | 3 | 15 | 3
 6 | 3 | 14 | 3
 6 | 3 | 15 | 3
(12 rows)

g2_master=#


On Sun, Jul 10, 2011 at 4:58 PM, Tim Uckun t...@basediary.com wrote:

 I have two tables, traffic and sales. Each one has a date field and
 lists the traffic and sales broken down by various parameters
 (multiple rows for each date).

 If I run  select (select count(*) from traffic) as traffic, (select
 count(*) from sales) as sales; I get the following  49383;167807

 if I run   select count(*) from traffic t inner join sales s on t.date
 = s.date  I get 24836841.

 If I change the join to a left join, right join, full join I get the
 same number of records.

 So I created a data table which just has the dates in it and ran this
 query.

 select count(d.date) from dates d
 inner join traffic t on t.date = d.date
 inner join sales s on s.date = d.date

 And I get the same number 24836841

 Same goes for right joins on the above query. Left joins of course
 give a different answer as there are more dates in the date table than
 there are in the other tables.

 I am a bit perplexed by what is happening here.

 Cheers

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general




-- 
Rick Genter
rick.gen...@gmail.com


Re: [GENERAL] Unexpected results when joining on date fields

2011-07-12 Thread Alban Hertroys
On 11 Jul 2011, at 1:58, Tim Uckun wrote:

 I have two tables, traffic and sales. Each one has a date field and
 lists the traffic and sales broken down by various parameters
 (multiple rows for each date).
 
 If I run  select (select count(*) from traffic) as traffic, (select
 count(*) from sales) as sales; I get the following  49383;167807
 
 if I run   select count(*) from traffic t inner join sales s on t.date
 = s.date  I get 24836841.


Perhaps you also want to filter that join so that you don't get matches for 
different articles sold at the same date? Just joining on date doesn't seem to 
make a whole lot of sense.

Alban Hertroys

--
The size of a problem often equals the size of an ego.



!DSPAM:737,4e1c7f0c12096580658153!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general