[SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour
Almost all of the 8k queries per second that are hitting our database are generated by a web development framework's ORM (Django), and some of the syntax there is quite mad. Possibly the worst tic exhibited by the translation layer is that if you filter a query on an object by more than one property on a related object it will create a join for each property: woome_video=# EXPLAIN ANALYZE SELECT * FROM "webapp_person" INNER JOIN "auth_user" ON ("webapp_person"."user_id" = "auth_user"."id") INNER JOIN "auth_user" T3 ON ("webapp_person"."user_id" = T3."id") WHERE "webapp_person"."is_suspended" = false AND "webapp_person"."is_banned" = false AND ("webapp_person"."is_human" = true OR "webapp_person"."is_human" IS NULL) AND (LOWER("auth_user"."username") = LOWER('d00ditsnicole') OR LOWER(T3."first_name") = LOWER('d00ditsnicole') ) AND "webapp_person"."dob" >= '1910-01-01' AND "webapp_person"."dob" <= '1991-01-01' ORDER BY "auth_user"."last_login" DESC LIMIT 30; QUERY PLAN - Limit (cost=0.00..87160.10 rows=30 width=496) (actual time=17641.618..17641.618 rows=0 loops=1) -> Nested Loop (cost=0.00..1095311.97 rows=377 width=496) (actual time=17641.616..17641.616 rows=0 loops=1) -> Nested Loop (cost=0.00..1090710.92 rows=554 width=230) (actual time=17641.614..17641.614 rows=0 loops=1) Join Filter: ((lower((auth_user.username)::text) = 'd00ditsnicole'::text) OR (lower((t3.first_name)::text) = 'd00ditsnicole'::text)) -> Index Scan Backward using auth_user_last_login_idx on auth_user (cost=0.00..141657.02 rows=1684525 width=115) (actual time=0.039..3315.272 rows=1685757 loops=1) -> Index Scan using auth_user_pkey on auth_user t3 (cost=0.00..0.54 rows=1 width=115) (actual time=0.004..0.005 rows=1 loops=1685757) Index Cond: (t3.id = auth_user.id) -> Index Scan using webapp_person_user_id on webapp_person (cost=0.00..8.29 rows=1 width=266) (never executed) Index Cond: (webapp_person.user_id = auth_user.id) Filter: ((NOT webapp_person.is_suspended) AND (NOT webapp_person.is_banned) AND (webapp_person.is_human OR (webapp_person.is_human IS NULL)) AND (webapp_person.dob >= '1910-01-01'::date) AND (webapp_person.dob <= '1991-01-01'::date)) Total runtime: 17641.871 ms (11 rows) There LIMIT and dob filters there are silly as well but they don't seem to impact query performance; the trouble is the spurious 2nd join with the T3 alias for auth_user. If I just remove that, we get: woome_video=# EXPLAIN ANALYZE SELECT * FROM "webapp_person" INNER JOIN "auth_user" ON ("webapp_person"."user_id" = "auth_user"."id") WHERE "webapp_person"."is_suspended" = false AND "webapp_person"."is_banned" = false AND ("webapp_person"."is_human" = true OR "webapp_person"."is_human" IS NULL) AND (LOWER("auth_user"."username") = LOWER('d00ditsnicole') OR LOWER("auth_user"."first_name") = LOWER('d00ditsnicole') ) AND "webapp_person"."dob" >= '1910-01-01' AND "webapp_person"."dob" <= '1991-01-01' ORDER BY "auth_user"."last_login" DESC LIMIT 30; QUERY PLAN --- Limit (cost=6637.36..6637.43 rows=30 width=381) (actual time=0.230..0.230 rows=0 loops=1) -> Sort (cost=6637.36..6638.30 rows=377 width=381) (actual time=0.228..0.228 rows=0 loops=1) Sort Key: auth_user.last_login Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=29.88..6626.22 rows=377 width=381) (actual time=0.162..0.162 rows=0 loops=1) -> Bitmap Heap Scan on auth_user (cost=29.88..2025.17 rows=554 width=115) (actual time=0.161..0.161 rows=0 loops=1) Recheck Cond: ((lower((username)::text) = 'd00ditsnicole'::text) OR (lower((first_name)::text) = 'd00ditsnicole'::text)) -> BitmapOr (cost=29.88..29.88 rows=554 width=0) (actual time=0.158..0.158 rows=0 loops=1) -> Bitmap Index Scan on woome_username_lower (cost=0.00..4.60 rows=1 width=0) (actual time=0.096..0.096 rows=0 loops=1) Index Cond: (lower((username)::text) = 'd00ditsnicole'::text) -> Bitmap Index Scan on auth_user_firstname_idx (cost=0.00..25.00 rows=553 width=0) (actual time=0.060..0.060 rows=0 loops=1) Index Cond: (lower((first_name)::text) = 'd00ditsnicole'::text) -> Index Scan using webapp_person_user_id on webap
Re: [SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour
Frank Jördens wrote: Our Django experts are telling me that it is probably not practical to fix in the ORM, as it seems to be structural (anyway not fixable for us in the near term). Hence I am wondering if anyone has an idea as to how to make the planner smarter about such weirdness (or brokenness); you might argue that the 2nd join there is merely syntactic bloat which the planner might just recognize as such? Even if you have funding to hire a developer to adapt PG's planner, it's going to be an uphill struggle to get patches accepted unless there is a simple, quick can-merge-two-joins test someone can come up with. Time spent planning to deal with badly written queries costs every well-written query too of course. Even with a patch and acceptance from core, 8.4 is in beta at the moment so you'll have a long wait before 8.5 comes out with your patch. Are you sure it wouldn't be easier to hire a Python guru for a couple of days and have him/her hack the ORM to make it less, um, "simplistic"? There must be an "assemble references into JOINs" point in the code you could rationalise this at. -- Richard Huxton Archonet Ltd -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour
Frank Joerdens wrote: On Fri, May 22, 2009 at 10:56 AM, Richard Huxton wrote: [...] Are you sure it wouldn't be easier to hire a Python guru for a couple of days and have him/her hack the ORM to make it less, um, "simplistic"? Not sure. :) Your arguments make eminent sense to me. I am not exactly a fan of ORMs, they make my job much harder ... but there seems to be no avoiding them these days. I either like: 1. Really simple, clean ORM but makes it simple to override with custom SQL when you need to. 2. Very clever, sophisticated ORM but makes it simple to override with custom SQL when you need to. Unfortunately I'm not a Python guy so I can't comment on Django's ORM. -- Richard Huxton Archonet Ltd -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Make the planner smarter about idiosyncratic (or broken) ORM behaviour
On Fri, 2009-05-22 at 10:33 +0100, Frank Jördens wrote: > Almost all of the 8k queries per second that are hitting our database > are generated by a web development framework's ORM (Django), and some > of the syntax there is quite mad. Possibly the worst tic exhibited by > the translation layer is that if you filter a query on an object by > more than one property on a related object it will create a join for > each property: It's a known issue that we're looking to work on in the next release. Work started in 8.4, though was more complex than first appeared and we didn't complete it in time for the dev deadline. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] help with pg_hba.conf
possibly this answers my question, and what i am doing is indeed the most appropriate? "Databases are physically separated and access control is managed at the connection level." from 8.3 docs, section 20.1 thanks, Isaac On Fri, May 22, 2009 at 2:37 AM, Isaac Dover wrote: > Hello, to this point i've been working with pg_hba.conf authentication > defaults as installed with PostgreSQL 8.3. I'm trying to better understand > "best practice" for managing connections to databases (I've grown accustomed > to the MSSQL EM method of assigning user privileges). As far as i can tell, > pg_hba.conf is the only manner in which to prevent users from connecting to > other users' databases. I've restricted roles to connecting only using > sameuser: > > host sameuser all 192.168.168.0/24 md5 > > this works fine until a user connects and creates a new database. Pg shows > that the owner of the database is the currently connected user, but the user > can't connect to it, as the hba.conf file has the sameuser restriction. I > was hoping that (somehow, magically) the owner of the database could always > connect to the databases he/she owns. > > Is hba.conf the only way to restrict users connections to specific > databases? Are there privileges I can grant without having to maintain this > file? > > I've spent quite some time researching this, even with the documentation, > but I'm wondering what I'm missing. > > Thanks, > Isaac >
[SQL] Need help combining 2 tables together
Hello I have frequently encountered the need of combining two tables into one. First, please take a look at the following table setups... CREATE TABLE topics ( id SERIAL PRIMARY KEY, topic TEXT NOT NULL ); CREATE TABLE messages ( id SERIAL PRIMARY KEY, topic INTEGER REFERENCES topics(id), message TEXT NOT NULL ); Example of a topics table: IDTOPIC 1 Are squares better then circles? 2 My favorite food Example of a messages table: IDTOPICMESSAGE 1 2I like lasagna! 2 2Pizza is also a favorite 3 1I like circles, they remind me of pizza Notice that the number of topics may differ from the number of messages. Now I want to combine these tables with a single SELECT to get... Combined table: ID TOPIC MESSAGE 1My favorite foodI like lasagna! 2My favorite foodPizza is also a favorite 3Are squares better then circles?I like circles, they remind me of pizza I have seen different examples of this with something called JOIN but they always give me only two rows. How can I do this when the two tables may have different sizes to produce exactly the combined table above??? Some SQL for Postgres if you want to set up this example... CREATE TABLE topics (id SERIAL PRIMARY KEY,topic TEXT NOT NULL); CREATE TABLE messages (id SERIAL PRIMARY KEY,topic INTEGER REFERENCES topics(id),message TEXT NOT NULL); INSERT INTO topics(topic) VALUES('Are squares better then circles?'); INSERT INTO topics(topic) VALUES('My favorite food'); INSERT INTO messages(topic,message) VALUES(2, 'I like lasagna!'); INSERT INTO messages(topic,message) VALUES(2, 'Pizza is also a favorite'); INSERT INTO messages(topic,message) VALUES(1, 'I like circles, they remind me of pizza'); SELECT * FROM topics; SELECT * FROM messages; Thanks in advance /RE -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Need help combining 2 tables together
On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote: > Hello > > I have frequently encountered the need of combining two tables into one. > First, please take a look at the following table setups... > > CREATE TABLE topics ( >id SERIAL PRIMARY KEY, >topic TEXT NOT NULL > ); > > CREATE TABLE messages ( >id SERIAL PRIMARY KEY, >topic INTEGER REFERENCES topics(id), >message TEXT NOT NULL > ); > > Example of a topics table: > IDTOPIC > 1 Are squares better then circles? > 2 My favorite food > > Example of a messages table: > IDTOPICMESSAGE > 1 2I like lasagna! > 2 2Pizza is also a favorite > 3 1I like circles, they remind me of pizza > > Notice that the number of topics may differ from the number of messages. > > Now I want to combine these tables with a single SELECT to get... > > Combined table: > ID TOPIC MESSAGE > 1My favorite foodI like lasagna! > 2My favorite foodPizza is also a favorite > 3Are squares better then circles?I like circles, they remind me > of pizza > > I have seen different examples of this with something called JOIN but > they always give me only two rows. How can I do this when the two tables > may have different sizes to produce exactly the combined table above??? > > > Some SQL for Postgres if you want to set up this example... > > CREATE TABLE topics (id SERIAL PRIMARY KEY,topic TEXT NOT NULL); > CREATE TABLE messages (id SERIAL PRIMARY KEY,topic INTEGER REFERENCES > topics(id),message TEXT NOT NULL); > INSERT INTO topics(topic) VALUES('Are squares better then circles?'); > INSERT INTO topics(topic) VALUES('My favorite food'); > INSERT INTO messages(topic,message) VALUES(2, 'I like lasagna!'); > INSERT INTO messages(topic,message) VALUES(2, 'Pizza is also a favorite'); > INSERT INTO messages(topic,message) VALUES(1, 'I like circles, they > remind me of pizza'); > SELECT * FROM topics; > SELECT * FROM messages; > > > Thanks in advance > /RE test=# SELECT m.id,t.topic,m.message from topics as t,messages as m where m.topic=t.id order by m.id; id | topic | message +--+-- 1 | My favorite food | I like lasagna! 2 | My favorite food | Pizza is also a favorite 3 | Are squares better then circles? | I like circles, they : remind me of pizza -- Adrian Klaver akla...@comcast.net -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Need help combining 2 tables together
Dear Richard Ekblom, I think Mr. Adrian Klaver gave you the solution. Mine is the similar solution SELECT message.id,topic.topic,message.message FROM topics, messages WHERE message.topic=topic.id order by message.id; After executing this query you will get the following: id | topic | message +--+-- 1 | My favorite food | I like lasagna! 2 | My favorite food | Pizza is also a favorite 3 | Are squares better then circles? | I like circles, they : remind me of pizza Best Regards, Muhoji James Kitambara Database Administrator, B.Sc. With Computer Science and Statistics (Hons), National Bureau of Statistics, P.O. Box 796, Tel : +255 22 2122722/3 Fax: +255 22 2130852, Mobile : +255 71 3307632, Dar es Salaam, Tanzania. -ORGINAL MESSAGE On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote: > Hello > > I have frequently encountered the need of combining two tables into one. > First, please take a look at the following table setups... > > CREATE TABLE topics ( > id SERIAL PRIMARY KEY, > topic TEXT NOT NULL > ); > > CREATE TABLE messages ( > id SERIAL PRIMARY KEY, > topic INTEGER REFERENCES topics(id), > message TEXT NOT NULL > ); > > Example of a topics table: > ID TOPIC > 1 Are squares better then circles? > 2 My favorite food > > Example of a messages table: > ID TOPIC MESSAGE > 1 2 I like lasagna! > 2 2 Pizza is also a favorite > 3 1 I like circles, they remind me of pizza > > Notice that the number of topics may differ from the number of messages. > > Now I want to combine these tables with a single SELECT to get... > > Combined table: > ID TOPIC MESSAGE > 1 My favorite food I like lasagna! > 2 My favorite food Pizza is also a favorite > 3 Are squares better then circles? I like circles, they remind me > of pizza > > I have seen different examples of this with something called JOIN but > they always give me only two rows. How can I do this when the two tables > may have different sizes to produce exactly the combined table above??? > >
[SQL] Allow column type to change without worrying about view dependencies
Good morning, When there are lots of views and complicated dependencies, it is not easy to alter column from varchar(a) to varchar(b). Is it possible when defining a view, adding cascade or some other key word(s) to allow column type change? When column1 is changed, all related views' column type is changed as well. Thanks, --- Lu Ying -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Need help combining 2 tables together
I guess this is pretty much the same as doing SELECT message.id,topic.topic,message.message FROM topics JOIN messages ON topics.id = message.topic ORDER BY message.ID Ain't I right? Best, Oliveiros - Original Message - From: James Kitambara To: Richard Ekblom Cc: pgsql-sql@postgresql.org Sent: Friday, May 22, 2009 3:47 PM Subject: Re: [SQL] Need help combining 2 tables together Dear Richard Ekblom, I think Mr. Adrian Klaver gave you the solution. Mine is the similar solution SELECT message.id,topic.topic,message.message FROM topics, messages WHERE message.topic=topic.id order by message.id; After executing this query you will get the following: id | topic | message +--+-- 1 | My favorite food | I like lasagna! 2 | My favorite food | Pizza is also a favorite 3 | Are squares better then circles? | I like circles, they : remind me of pizza Best Regards, Muhoji James Kitambara Database Administrator, B.Sc. With Computer Science and Statistics (Hons), National Bureau of Statistics, P.O. Box 796, Tel : +255 22 2122722/3Fax: +255 22 2130852, Mobile : +255 71 3307632, Dar es Salaam, Tanzania. -ORGINAL MESSAGE On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote: > Hello > > I have frequently encountered the need of combining two tables into one. > First, please take a look at the following table setups... > > CREATE TABLE topics ( >id SERIAL PRIMARY KEY, >topic TEXT NOT NULL > ); > > CREATE TABLE messages ( >id SERIAL PRIMARY KEY, >topic INTEGER REFERENCES topics(id), >message TEXT NOT NULL > ); > > Example of a topics table: > IDTOPIC > 1 Are squares better then circles? > 2 My favorite food > > Example of a messages table: > IDTOPICMESSAGE > 1 2I like lasagna! > 2 2Pizza is also a favorite > 3 1I like circles, they remind me of pizza > > Notice that the number of topics may differ from the number of messages. > > Now I want to combine these tables with a single SELECT to get... > > Combined table: > ID TOPIC MESSAGE > 1My favorite foodI like lasagna! > 2My favorite foodPizza is also a favorite > 3Are squares better then circles?I like circles, they remind me > of pizza > > I have seen different examples of this with something called JOIN but > they always give me only two rows. How can I do this when the two tables > may have different sizes to produce exactly the combined table above??? > >
Re: [SQL] Need help combining 2 tables together
if you want topics listed which don't yet have messages try select t.id, t.topic, m.id, m.message from topics t left join messages m on m.topic = t.id; On Fri, May 22, 2009 at 8:47 AM, James Kitambara wrote: > Dear Richard Ekblom, > > I think Mr. Adrian Klaver gave you the solution. Mine is the similar > solution > SELECT message.id,topic.topic,message.message > FROM topics, messages > WHERE message.topic=topic.id order by message.id; > > After executing this query you will get the following: > > id | topic | message > +--+-- > 1 | My favorite food | I like lasagna! > 2 | My favorite food | Pizza is also a favorite > 3 | Are squares better then circles? | I like circles, they >: remind me of pizza > > Best Regards, > > *Muhoji James Kitambara* > *Database Administrator,* > *B.Sc. With Computer Science and Statistics (Hons),* > *National Bureau of Statistics,* > *P.O. Box 796, * > *Tel : +255 22 2122722/3Fax: +255 22 2130852,* > *Mobile : +255 71 3307632,* > *Dar es Salaam,* > *Tanzania.* > > > -ORGINAL > MESSAGE > On Friday 22 May 2009 6:48:43 am Richard Ekblom wrote: > > Hello > > > > I have frequently encountered the need of combining two tables into one. > > First, please take a look at the following table setups... > > > > CREATE TABLE topics ( > >id SERIAL PRIMARY KEY, > >topic TEXT NOT NULL > > ); > > > > CREATE TABLE messages ( > >id SERIAL PRIMARY KEY, > >topic INTEGER REFERENCES topics(id), > >message TEXT NOT NULL > > ); > > > > Example of a topics table: > > IDTOPIC > > 1 Are squares better then circles? > > 2 My favorite food > > > > Example of a messages table: > > IDTOPICMESSAGE > > 1 2I like lasagna! > > 2 2Pizza is also a favorite > > 3 1I like circles, they remind me of pizza > > > > Notice that the number of topics may differ from the number of messages. > > > > Now I want to combine these tables with a single SELECT to get... > > > > Combined table: > > ID TOPIC MESSAGE > > 1My favorite foodI like lasagna! > > 2My favorite foodPizza is also a favorite > > 3Are squares better then circles?I like circles, they remind me > > of pizza > > > > I have seen different examples of this with something called JOIN but > > they always give me only two rows. How can I do this when the two tables > > may have different sizes to produce exactly the combined table above??? > > > > > > >
Re: [SQL] help with pg_hba.conf
Isaac Dover writes: >> As far as i can tell, >> pg_hba.conf is the only manner in which to prevent users from connecting to >> other users' databases. I've restricted roles to connecting only using >> sameuser: >> >> host sameuser all 192.168.168.0/24 md5 In recent releases (certainly 8.3) the better approach is probably to use CONNECT privilege to grant or deny access. However that does have some drawbacks if you intend to let users create their own databases --- they have to remember to set the privileges properly on new DBs. regards, tom lane -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] help with pg_hba.conf
thank you _SO_ much. I'm not sure how I overlooked that, but it is far easier to manage using connect. I'm assuming that even if tables have public privileges granted, that they are not visible to people not granted connect privileges to the database? Thanks, Isaac On Fri, May 22, 2009 at 12:31 PM, Tom Lane wrote: > Isaac Dover writes: > >> As far as i can tell, > >> pg_hba.conf is the only manner in which to prevent users from connecting > to > >> other users' databases. I've restricted roles to connecting only using > >> sameuser: > >> > >> host sameuser all 192.168.168.0/24 md5 > > In recent releases (certainly 8.3) the better approach is probably to > use CONNECT privilege to grant or deny access. However that does have > some drawbacks if you intend to let users create their own databases > --- they have to remember to set the privileges properly on new DBs. > >regards, tom lane >
Re: [SQL] help with pg_hba.conf
Isaac Dover writes: > thank you _SO_ much. I'm not sure how I overlooked that, but it is far > easier to manage using connect. I'm assuming that even if tables have public > privileges granted, that they are not visible to people not granted connect > privileges to the database? Right, if you can't get into the database then the permissions of objects within it don't matter... regards, tom lane -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] left join where not null vs. inner join
Two tables, each contains more than hundreds of thousands records. Is there any efficiency differences between (1) and (2)? (1) T1 inner join T2 using (c1, c2) (2) T1 left join T2 using (c1, c2) where c2 is not null Thanks, -- Lu Ying -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] left join where not null vs. inner join
On May 22, 2009, at 1:51 PM, Emi Lu wrote: Two tables, each contains more than hundreds of thousands records. Is there any efficiency differences between (1) and (2)? (1) T1 inner join T2 using (c1, c2) (2) T1 left join T2 using (c1, c2) where c2 is not null Yes, stick with the first. In the second you're asking the db to generate a result set with tuples for every row in T1 and then filter it down to where there are only matching T2 rows whereas in the first it does the filtering as it goes. The LEFT JOIN ... WHERE X NOT NULL construct is typically used as an alternative to a NOT IN or NOT EXISTS (). So, this: SELECT * FROM t1 WHERE id NOT IN (SELECT some_id FROM T2); becomes SELECT t1.* FROM t1 LEFT JOIN t2 ON (t1.id = t2.some_id) WHERE t2.id IS NULL; Basically, it's used in the opposite case of what you're asking about. Erik Jones, Database Administrator Engine Yard Support, Scalability, Reliability 866.518.9273 x 260 Location: US/Pacific IRC: mage2k -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql