[SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour

2009-05-22 Thread Frank Jördens
Almost all of the 8k queries per second that are hitting our database
are generated by a web development framework's ORM (Django), and some
of the syntax there is quite mad. Possibly the worst tic exhibited by
the translation layer is that if you filter a query on an object by
more than one property on a related object it will create a join for
each property:

woome_video=# EXPLAIN ANALYZE SELECT * FROM "webapp_person" INNER JOIN
"auth_user" ON ("webapp_person"."user_id" = "auth_user"."id") INNER
JOIN "auth_user" T3 ON ("webapp_person"."user_id" = T3."id") WHERE
"webapp_person"."is_suspended" = false  AND
"webapp_person"."is_banned" = false  AND ("webapp_person"."is_human" =
true  OR "webapp_person"."is_human" IS NULL) AND
(LOWER("auth_user"."username") = LOWER('d00ditsnicole')  OR
LOWER(T3."first_name") = LOWER('d00ditsnicole') ) AND
"webapp_person"."dob" >= '1910-01-01'  AND "webapp_person"."dob" <=
'1991-01-01'  ORDER BY "auth_user"."last_login" DESC LIMIT 30;




QUERY PLAN
-
 Limit  (cost=0.00..87160.10 rows=30 width=496) (actual
time=17641.618..17641.618 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..1095311.97 rows=377 width=496) (actual
time=17641.616..17641.616 rows=0 loops=1)
 ->  Nested Loop  (cost=0.00..1090710.92 rows=554 width=230)
(actual time=17641.614..17641.614 rows=0 loops=1)
   Join Filter: ((lower((auth_user.username)::text) =
'd00ditsnicole'::text) OR (lower((t3.first_name)::text) =
'd00ditsnicole'::text))
   ->  Index Scan Backward using auth_user_last_login_idx
on auth_user  (cost=0.00..141657.02 rows=1684525 width=115) (actual
time=0.039..3315.272 rows=1685757 loops=1)
   ->  Index Scan using auth_user_pkey on auth_user t3
(cost=0.00..0.54 rows=1 width=115) (actual time=0.004..0.005 rows=1
loops=1685757)
 Index Cond: (t3.id = auth_user.id)
 ->  Index Scan using webapp_person_user_id on webapp_person
(cost=0.00..8.29 rows=1 width=266) (never executed)
   Index Cond: (webapp_person.user_id = auth_user.id)
   Filter: ((NOT webapp_person.is_suspended) AND (NOT
webapp_person.is_banned) AND (webapp_person.is_human OR
(webapp_person.is_human IS NULL)) AND (webapp_person.dob >=
'1910-01-01'::date) AND (webapp_person.dob <= '1991-01-01'::date))
 Total runtime: 17641.871 ms
(11 rows)


There LIMIT and dob filters there are silly as well but they don't
seem to impact query performance; the trouble is the spurious 2nd join
with the T3 alias for auth_user. If I just remove that, we get:

woome_video=# EXPLAIN ANALYZE SELECT * FROM "webapp_person" INNER JOIN
"auth_user" ON ("webapp_person"."user_id" = "auth_user"."id") WHERE
"webapp_person"."is_suspended" = false  AND
"webapp_person"."is_banned" = false  AND ("webapp_person"."is_human" =
true  OR "webapp_person"."is_human" IS NULL) AND
(LOWER("auth_user"."username") = LOWER('d00ditsnicole')  OR
LOWER("auth_user"."first_name") = LOWER('d00ditsnicole') ) AND
"webapp_person"."dob" >= '1910-01-01'  AND "webapp_person"."dob" <=
'1991-01-01'  ORDER BY "auth_user"."last_login" DESC LIMIT 30;


QUERY PLAN
---
 Limit  (cost=6637.36..6637.43 rows=30 width=381) (actual
time=0.230..0.230 rows=0 loops=1)
   ->  Sort  (cost=6637.36..6638.30 rows=377 width=381) (actual
time=0.228..0.228 rows=0 loops=1)
 Sort Key: auth_user.last_login
 Sort Method:  quicksort  Memory: 25kB
 ->  Nested Loop  (cost=29.88..6626.22 rows=377 width=381)
(actual time=0.162..0.162 rows=0 loops=1)
   ->  Bitmap Heap Scan on auth_user  (cost=29.88..2025.17
rows=554 width=115) (actual time=0.161..0.161 rows=0 loops=1)
 Recheck Cond: ((lower((username)::text) =
'd00ditsnicole'::text) OR (lower((first_name)::text) =
'd00ditsnicole'::text))
 ->  BitmapOr  (cost=29.88..29.88 rows=554
width=0) (actual time=0.158..0.158 rows=0 loops=1)
   ->  Bitmap Index Scan on
woome_username_lower  (cost=0.00..4.60 rows=1 width=0) (actual
time=0.096..0.096 rows=0 loops=1)
 Index Cond: (lower((username)::text)
= 'd00ditsnicole'::text)
   ->  Bitmap Index Scan on
auth_user_firstname_idx  (cost=0.00..25.00 rows=553 width=0) (actual
time=0.060..0.060 rows=0 loops=1)
 Index Cond:
(lower((first_name)::text) = 'd00ditsnicole'::text)
   ->  Index Scan using webapp_person_user_id on
webap

Re: [SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour

2009-05-22 Thread Richard Huxton

Frank Jördens wrote:

Our Django experts are telling me that it is probably not practical to
fix in the ORM, as it seems to be structural (anyway not fixable for
us in the near term). Hence I am wondering if anyone has an idea as to
how to make the planner smarter about such weirdness (or brokenness);
you might argue that the 2nd join there is merely syntactic bloat
which the planner might just recognize as such?


Even if you have funding to hire a developer to adapt PG's planner, it's 
going to be an uphill struggle to get patches accepted unless there is a 
simple, quick can-merge-two-joins test someone can come up with. Time 
spent planning to deal with badly written queries costs every 
well-written query too of course.


Even with a patch and acceptance from core, 8.4 is in beta at the moment 
so you'll have a long wait before 8.5 comes out with your patch.


Are you sure it wouldn't be easier to hire a Python guru for a couple of 
days and have him/her hack the ORM to make it less, um, "simplistic"? 
There must be an "assemble references into JOINs" point in the code you 
could rationalise this at.


--
  Richard Huxton
  Archonet Ltd

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour

2009-05-22 Thread Richard Huxton

Frank Joerdens wrote:

On Fri, May 22, 2009 at 10:56 AM, Richard Huxton  wrote:
[...]

Are you sure it wouldn't be easier to hire a Python guru for a couple of
days and have him/her hack the ORM to make it less, um, "simplistic"?


Not sure. :) Your arguments make eminent sense to me. I am not exactly
a fan of ORMs, they make my job much harder ... but there seems to be
no avoiding them these days.


I either like:
1. Really simple, clean ORM but makes it simple to override with custom 
SQL when you need to.
2. Very clever, sophisticated ORM but makes it simple to override with 
custom SQL when you need to.


Unfortunately I'm not a Python guy so I can't comment on Django's ORM.

--
  Richard Huxton
  Archonet Ltd

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour

2009-05-22 Thread Simon Riggs

On Fri, 2009-05-22 at 10:33 +0100, Frank Jördens wrote:
> Almost all of the 8k queries per second that are hitting our database
> are generated by a web development framework's ORM (Django), and some
> of the syntax there is quite mad. Possibly the worst tic exhibited by
> the translation layer is that if you filter a query on an object by
> more than one property on a related object it will create a join for
> each property:

It's a known issue that we're looking to work on in the next release.

Work started in 8.4, though was more complex than first appeared and we
didn't complete it in time for the dev deadline.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] help with pg_hba.conf

2009-05-22 Thread Isaac Dover
possibly this answers my question, and what i am doing is indeed the most
appropriate?

"Databases are physically separated and access control is managed at the
connection level."

from 8.3 docs, section 20.1

thanks,
Isaac

On Fri, May 22, 2009 at 2:37 AM, Isaac Dover  wrote:

> Hello, to this point i've been working with pg_hba.conf authentication
> defaults as installed with PostgreSQL 8.3. I'm trying to better understand
> "best practice" for managing connections to databases (I've grown accustomed
> to the MSSQL EM method of assigning user privileges). As far as i can tell,
> pg_hba.conf is the only manner in which to prevent users from connecting to
> other users' databases. I've restricted roles to connecting only using
> sameuser:
>
> host sameuser all 192.168.168.0/24 md5
>
> this works fine until a user connects and creates a new database. Pg shows
> that the owner of the database is the currently connected user, but the user
> can't connect to it, as the hba.conf file has the sameuser restriction. I
> was hoping that (somehow, magically) the owner of the database could always
> connect to the databases he/she owns.
>
> Is hba.conf the only way to restrict users connections to specific
> databases? Are there privileges I can grant without having to maintain this
> file?
>
> I've spent quite some time researching this, even with the documentation,
> but I'm wondering what I'm missing.
>
> Thanks,
> Isaac
>


[SQL] Need help combining 2 tables together

2009-05-22 Thread Richard Ekblom

Hello

I have frequently encountered the need of combining two tables into one. 
First, please take a look at the following table setups...


CREATE TABLE topics (
  id SERIAL PRIMARY KEY,
  topic TEXT NOT NULL
);

CREATE TABLE messages (
  id SERIAL PRIMARY KEY,
  topic INTEGER REFERENCES topics(id),
  message TEXT NOT NULL
);

Example of a topics table:
IDTOPIC
1 Are squares better then circles?
2 My favorite food

Example of a messages table:
IDTOPICMESSAGE
1 2I like lasagna!
2 2Pizza is also a favorite
3 1I like circles, they remind me of pizza

Notice that the number of topics may differ from the number of messages.

Now I want to combine these tables with a single SELECT to get...

Combined table:
ID   TOPIC   MESSAGE
1My favorite foodI like lasagna!
2My favorite foodPizza is also a favorite
3Are squares better then circles?I like circles, they remind me 
of pizza


I have seen different examples of this with something called JOIN but 
they always give me only two rows. How can I do this when the two tables 
may have different sizes to produce exactly the combined table above???



Some SQL for Postgres if you want to set up this example...

CREATE TABLE topics (id SERIAL PRIMARY KEY,topic TEXT NOT NULL);
CREATE TABLE messages (id SERIAL PRIMARY KEY,topic INTEGER REFERENCES 
topics(id),message TEXT NOT NULL);

INSERT INTO topics(topic) VALUES('Are squares better then circles?');
INSERT INTO topics(topic) VALUES('My favorite food');
INSERT INTO messages(topic,message) VALUES(2, 'I like lasagna!');
INSERT INTO messages(topic,message) VALUES(2, 'Pizza is also a favorite');
INSERT INTO messages(topic,message) VALUES(1, 'I like circles, they 
remind me of pizza');

SELECT * FROM topics;
SELECT * FROM messages;


Thanks in advance
/RE


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Need help combining 2 tables together

2009-05-22 Thread Adrian Klaver
On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote:
> Hello
>
> I have frequently encountered the need of combining two tables into one.
> First, please take a look at the following table setups...
>
> CREATE TABLE topics (
>id SERIAL PRIMARY KEY,
>topic TEXT NOT NULL
> );
>
> CREATE TABLE messages (
>id SERIAL PRIMARY KEY,
>topic INTEGER REFERENCES topics(id),
>message TEXT NOT NULL
> );
>
> Example of a topics table:
> IDTOPIC
> 1 Are squares better then circles?
> 2 My favorite food
>
> Example of a messages table:
> IDTOPICMESSAGE
> 1 2I like lasagna!
> 2 2Pizza is also a favorite
> 3 1I like circles, they remind me of pizza
>
> Notice that the number of topics may differ from the number of messages.
>
> Now I want to combine these tables with a single SELECT to get...
>
> Combined table:
> ID   TOPIC   MESSAGE
> 1My favorite foodI like lasagna!
> 2My favorite foodPizza is also a favorite
> 3Are squares better then circles?I like circles, they remind me
> of pizza
>
> I have seen different examples of this with something called JOIN but
> they always give me only two rows. How can I do this when the two tables
> may have different sizes to produce exactly the combined table above???
>
>
> Some SQL for Postgres if you want to set up this example...
>
> CREATE TABLE topics (id SERIAL PRIMARY KEY,topic TEXT NOT NULL);
> CREATE TABLE messages (id SERIAL PRIMARY KEY,topic INTEGER REFERENCES
> topics(id),message TEXT NOT NULL);
> INSERT INTO topics(topic) VALUES('Are squares better then circles?');
> INSERT INTO topics(topic) VALUES('My favorite food');
> INSERT INTO messages(topic,message) VALUES(2, 'I like lasagna!');
> INSERT INTO messages(topic,message) VALUES(2, 'Pizza is also a favorite');
> INSERT INTO messages(topic,message) VALUES(1, 'I like circles, they
> remind me of pizza');
> SELECT * FROM topics;
> SELECT * FROM messages;
>
>
> Thanks in advance
> /RE

test=# SELECT m.id,t.topic,m.message from topics as t,messages as m where 
m.topic=t.id order by m.id;
 id |  topic   | message
+--+--
  1 | My favorite food | I like lasagna!
  2 | My favorite food | Pizza is also a favorite
  3 | Are squares better then circles? | I like circles, they
   : remind me of pizza


-- 
Adrian Klaver
akla...@comcast.net

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Need help combining 2 tables together

2009-05-22 Thread James Kitambara
Dear Richard Ekblom,

I think Mr. Adrian Klaver gave you the solution. Mine is the similar solution
SELECT message.id,topic.topic,message.message 
FROM topics, messages 
WHERE message.topic=topic.id order by message.id;
 
After executing this query you will get the following:

id |  topic   | message
+--+--
  1 | My favorite food | I like lasagna!
  2 | My favorite food | Pizza is also a favorite
  3 | Are squares better then circles? | I like circles, they
   : remind me of pizza

 
Best Regards,
 
Muhoji James Kitambara
Database Administrator,
B.Sc. With Computer Science and Statistics (Hons),
National Bureau of Statistics,
P.O. Box 796, 
Tel : +255 22 2122722/3    Fax: +255 22 2130852,
Mobile : +255 71 3307632,
Dar es Salaam,
Tanzania.
 


-ORGINAL 
MESSAGE

On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote:
> Hello
>
> I have frequently encountered the need of combining two tables into one.
> First, please take a look at the following table setups...
>
> CREATE TABLE topics (
>    id SERIAL PRIMARY KEY,
>    topic TEXT NOT NULL
> );
>
> CREATE TABLE messages (
>    id SERIAL PRIMARY KEY,
>    topic INTEGER REFERENCES topics(id),
>    message TEXT NOT NULL
> );
>
> Example of a topics table:
> ID    TOPIC
> 1     Are squares better then circles?
> 2     My favorite food
>
> Example of a messages table:
> ID    TOPIC    MESSAGE
> 1     2        I like lasagna!
> 2     2        Pizza is also a favorite
> 3     1        I like circles, they remind me of pizza
>
> Notice that the number of topics may differ from the number of messages.
>
> Now I want to combine these tables with a single SELECT to get...
>
> Combined table:
> ID   TOPIC                               MESSAGE
> 1    My favorite food                    I like lasagna!
> 2    My favorite food                    Pizza is also a favorite
> 3    Are squares better then circles?    I like circles, they remind me
> of pizza
>
> I have seen different examples of this with something called JOIN but
> they always give me only two rows. How can I do this when the two tables
> may have different sizes to produce exactly the combined table above???
>
>


  

[SQL] Allow column type to change without worrying about view dependencies

2009-05-22 Thread Emi Lu

Good morning,

When there are lots of views and complicated dependencies, it is not 
easy to alter column from varchar(a) to varchar(b).


Is it possible when defining a view, adding cascade or some other key 
word(s) to allow column type change?


When column1 is changed, all related views' column type is changed as well.

Thanks,
---
Lu Ying


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Need help combining 2 tables together

2009-05-22 Thread Oliveiros Cristina
I guess this is pretty much the same 
as doing 
 SELECT message.id,topic.topic,message.message 
FROM topics 
JOIN messages 
ON topics.id = message.topic
ORDER BY message.ID

Ain't I right?

Best,
Oliveiros

  - Original Message - 
  From: James Kitambara 
  To: Richard Ekblom 
  Cc: pgsql-sql@postgresql.org 
  Sent: Friday, May 22, 2009 3:47 PM
  Subject: Re: [SQL] Need help combining 2 tables together


Dear Richard Ekblom,

I think Mr. Adrian Klaver gave you the solution. Mine is the similar 
solution
SELECT message.id,topic.topic,message.message 
FROM topics, messages 
WHERE message.topic=topic.id order by message.id;

After executing this query you will get the following:

id |  topic   | message
+--+--
  1 | My favorite food | I like lasagna!
  2 | My favorite food | Pizza is also a favorite
  3 | Are squares better then circles? | I like circles, they
   : remind me of pizza


Best Regards,

Muhoji James Kitambara
Database Administrator,
B.Sc. With Computer Science and Statistics (Hons),
National Bureau of Statistics,
P.O. Box 796, 
Tel : +255 22 2122722/3Fax: +255 22 2130852,
Mobile : +255 71 3307632,
Dar es Salaam,
Tanzania.


  -ORGINAL 
MESSAGE

  On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote:
  > Hello
  >
  > I have frequently encountered the need of combining two tables into 
one.
  > First, please take a look at the following table setups...
  >
  > CREATE TABLE topics (
  >id SERIAL PRIMARY KEY,
  >topic TEXT NOT NULL
  > );
  >
  > CREATE TABLE messages (
  >id SERIAL PRIMARY KEY,
  >topic INTEGER REFERENCES topics(id),
  >message TEXT NOT NULL
  > );
  >
  > Example of a topics table:
  > IDTOPIC
  > 1 Are squares better then circles?
  > 2 My favorite food
  >
  > Example of a messages table:
  > IDTOPICMESSAGE
  > 1 2I like lasagna!
  > 2 2Pizza is also a favorite
  > 3 1I like circles, they remind me of pizza
  >
  > Notice that the number of topics may differ from the number of 
messages.
  >
  > Now I want to combine these tables with a single SELECT to get...
  >
  > Combined table:
  > ID   TOPIC   MESSAGE
  > 1My favorite foodI like lasagna!
  > 2My favorite foodPizza is also a favorite
  > 3Are squares better then circles?I like circles, they 
remind me
  > of pizza
  >
  > I have seen different examples of this with something called JOIN 
but
  > they always give me only two rows. How can I do this when the two 
tables
  > may have different sizes to produce exactly the combined table 
above???
  >
  > 



Re: [SQL] Need help combining 2 tables together

2009-05-22 Thread Rob Sargent
if you want topics listed which don't yet have messages try

select t.id, t.topic, m.id, m.message from topics t left join messages m on
m.topic = t.id;

On Fri, May 22, 2009 at 8:47 AM, James Kitambara  wrote:

> Dear Richard Ekblom,
>
> I think Mr. Adrian Klaver gave you the solution. Mine is the similar
> solution
> SELECT message.id,topic.topic,message.message
> FROM topics, messages
> WHERE message.topic=topic.id order by message.id;
>
> After executing this query you will get the following:
>
> id |  topic   | message
> +--+--
>   1 | My favorite food | I like lasagna!
>   2 | My favorite food | Pizza is also a favorite
>   3 | Are squares better then circles? | I like circles, they
>: remind me of pizza
>
> Best Regards,
>
> *Muhoji James Kitambara*
> *Database Administrator,*
> *B.Sc. With Computer Science and Statistics (Hons),*
> *National Bureau of Statistics,*
> *P.O. Box 796, *
> *Tel : +255 22 2122722/3Fax: +255 22 2130852,*
> *Mobile : +255 71 3307632,*
> *Dar es Salaam,*
> *Tanzania.*
>
>
> -ORGINAL
> MESSAGE
> On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote:
> > Hello
> >
> > I have frequently encountered the need of combining two tables into one.
> > First, please take a look at the following table setups...
> >
> > CREATE TABLE topics (
> >id SERIAL PRIMARY KEY,
> >topic TEXT NOT NULL
> > );
> >
> > CREATE TABLE messages (
> >id SERIAL PRIMARY KEY,
> >topic INTEGER REFERENCES topics(id),
> >message TEXT NOT NULL
> > );
> >
> > Example of a topics table:
> > IDTOPIC
> > 1 Are squares better then circles?
> > 2 My favorite food
> >
> > Example of a messages table:
> > IDTOPICMESSAGE
> > 1 2I like lasagna!
> > 2 2Pizza is also a favorite
> > 3 1I like circles, they remind me of pizza
> >
> > Notice that the number of topics may differ from the number of messages.
> >
> > Now I want to combine these tables with a single SELECT to get...
> >
> > Combined table:
> > ID   TOPIC   MESSAGE
> > 1My favorite foodI like lasagna!
> > 2My favorite foodPizza is also a favorite
> > 3Are squares better then circles?I like circles, they remind me
> > of pizza
> >
> > I have seen different examples of this with something called JOIN but
> > they always give me only two rows. How can I do this when the two tables
> > may have different sizes to produce exactly the combined table above???
> >
> >
>
>
>


Re: [SQL] help with pg_hba.conf

2009-05-22 Thread Tom Lane
Isaac Dover  writes:
>> As far as i can tell,
>> pg_hba.conf is the only manner in which to prevent users from connecting to
>> other users' databases. I've restricted roles to connecting only using
>> sameuser:
>> 
>> host sameuser all 192.168.168.0/24 md5

In recent releases (certainly 8.3) the better approach is probably to
use CONNECT privilege to grant or deny access.  However that does have
some drawbacks if you intend to let users create their own databases
--- they have to remember to set the privileges properly on new DBs.

regards, tom lane

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] help with pg_hba.conf

2009-05-22 Thread Isaac Dover
thank you _SO_ much. I'm not sure how I overlooked that, but it is far
easier to manage using connect. I'm assuming that even if tables have public
privileges granted, that they are not visible to people not granted connect
privileges to the database?

Thanks,
Isaac

On Fri, May 22, 2009 at 12:31 PM, Tom Lane  wrote:

> Isaac Dover  writes:
> >> As far as i can tell,
> >> pg_hba.conf is the only manner in which to prevent users from connecting
> to
> >> other users' databases. I've restricted roles to connecting only using
> >> sameuser:
> >>
> >> host sameuser all 192.168.168.0/24 md5
>
> In recent releases (certainly 8.3) the better approach is probably to
> use CONNECT privilege to grant or deny access.  However that does have
> some drawbacks if you intend to let users create their own databases
> --- they have to remember to set the privileges properly on new DBs.
>
>regards, tom lane
>


Re: [SQL] help with pg_hba.conf

2009-05-22 Thread Tom Lane
Isaac Dover  writes:
> thank you _SO_ much. I'm not sure how I overlooked that, but it is far
> easier to manage using connect. I'm assuming that even if tables have public
> privileges granted, that they are not visible to people not granted connect
> privileges to the database?

Right, if you can't get into the database then the permissions of
objects within it don't matter...

regards, tom lane

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


[SQL] left join where not null vs. inner join

2009-05-22 Thread Emi Lu
Two tables, each contains more than hundreds of thousands records. Is 
there any efficiency differences between (1) and (2)?


(1) T1 inner join T2 using (c1, c2)


(2) T1 left join T2 using (c1, c2) where c2 is not null


Thanks,
--
Lu Ying

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] left join where not null vs. inner join

2009-05-22 Thread Erik Jones


On May 22, 2009, at 1:51 PM, Emi Lu wrote:

Two tables, each contains more than hundreds of thousands records.  
Is there any efficiency differences between (1) and (2)?


(1) T1 inner join T2 using (c1, c2)


(2) T1 left join T2 using (c1, c2) where c2 is not null


Yes, stick with the first.  In the second you're asking the db to  
generate a result set with tuples for every row in T1 and then filter  
it down to where there are only matching T2 rows whereas in the first  
it does the filtering as it goes.  The LEFT JOIN ... WHERE X NOT NULL  
construct is typically used as an alternative to a NOT IN or NOT  
EXISTS ().  So, this:


SELECT *
FROM t1
WHERE id NOT IN (SELECT some_id
FROM T2);

becomes

SELECT t1.*
FROM t1 LEFT JOIN t2 ON (t1.id = t2.some_id)
WHERE t2.id IS NULL;

Basically, it's used in the opposite case of what you're asking about.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql