[SQL] Can someone explain the problem with this select
Allow me to demonstrate my pitiful SQL knowledge I have tables documents and comments If I run join and list doc_nums the query is quite fast If I run join and use subselect the query is extremely slow Can someone offer analysis Thanks Richard dcc=# EXPLAIN ANALYZE select doc_num from documents limit 10; QUERY PLAN -- Limit (cost=0.00..1.01 rows=10 width=13) (actual time=0.013..0.061 rows=10 loops=1) -> Seq Scan on documents (cost=0.00..909333.85 rows=9014885 width=13) (actual time=0.008..0.027 rows=10 loops=1) Total runtime: 0.125 ms (3 rows) dcc=# dcc=# EXPLAIN ANALYZE select * from documents left outer join comments on (documents.doc_num = comments.doc_num) where documents.doc_num in ('105364107','105513059','105513095','105513112','105585627','102933195','014650340','014650361','014650362','105419865'); QUERY PLAN Hash Left Join (cost=21.23..61.54 rows=10 width=444) (actual time=0.507..0.574 rows=10 loops=1) Hash Cond: ("outer".doc_num = "inner".doc_num) -> Bitmap Heap Scan on documents (cost=20.03..60.28 rows=10 width=361) (actual time=0.397..0.432 rows=10 loops=1) Recheck Cond: ((doc_num = '105364107'::bpchar) OR (doc_num = '105513059'::bpchar) OR (doc_num = '105513095'::bpchar) OR (doc_num = '105513112'::bpchar) OR (doc_num = '105585627'::bpchar) OR (doc_num = '102933195'::bpchar) OR (doc_num = '014650340'::bpchar) OR (doc_num = '014650361'::bpchar) OR (doc_num = '014650362'::bpchar) OR (doc_num = '105419865'::bpchar)) -> BitmapOr (cost=20.03..20.03 rows=10 width=0) (actual time=0.383..0.383 rows=0 loops=1) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.059..0.059 rows=1 loops=1) Index Cond: (doc_num = '105364107'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.039..0.039 rows=1 loops=1) Index Cond: (doc_num = '105513059'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1) Index Cond: (doc_num = '105513095'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1) Index Cond: (doc_num = '105513112'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1) Index Cond: (doc_num = '105585627'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.036..0.036 rows=1 loops=1) Index Cond: (doc_num = '102933195'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.036..0.036 rows=1 loops=1) Index Cond: (doc_num = '014650340'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1) Index Cond: (doc_num = '014650361'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1) Index Cond: (doc_num = '014650362'::bpchar) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.035..0.035 rows=1 loops=1) Index Cond: (doc_num = '105419865'::bpchar) -> Hash (cost=1.16..1.16 rows=16 width=83) (actual time=0.080..0.080 rows=16 loops=1) -> Seq Scan on comments (cost=0.00..1.16 rows=16 width=83) (actual time=0.005..0.037 rows=16 loops=1) Total runtime: 0.775 ms (28 rows) dcc=# dcc=# EXPLAIN ANALYZE select * from documents left outer join comments on (documents.doc_num = comments.doc_num) where documents.doc_num in (select doc_num from documents limit 10); QUERY PLAN -- Merge IN Join (cost=10002.19..136154797.93 rows=10 width=654) (actual time =23.534..2216180.550 rows=10 loops=1) Merge Cond: ("outer".doc_num = "inner".doc_num) -> Merge Left Join (cost=0.00..36129585.92 rows=10083868 width=654) (actual tim
Re: [SQL] Can someone explain the problem with this select
I've been foolin with this for a couple of days Sometimes you just have to ask Thanks gentlemen On Tue, 5 Dec 2006, Tom Lane wrote: Richard Ray <[EMAIL PROTECTED]> writes: dcc=# EXPLAIN ANALYZE select * from documents left outer join comments on (documents.doc_num = comments.doc_num) where documents.doc_num in (select doc_num from documents limit 10); [ is slow ] This isn't your fault, it's an optimizer limitation: PG < 8.2 can't reorder outer joins relative to regular joins, and the IN with a sub-select is a kind of regular join. So it's forming the whole outer-join result and then joining to the sub-select :-( This is fixed in 8.2, released today, so perhaps upgrading is the thing for you to do. Alternatively, you can contort the query to get the IN restriction inside the outer join: select * from (select * from documents where documents.doc_num in (select doc_num from documents limit 10)) ss left outer join comments on (ss.doc_num = comments.doc_num); regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[SQL] Query is fast and function is slow
The query select count(*) from documents where doc_num = '106973821' and (select bit_or(group_access) from mda_groups where group_name in (select groname from pg_user,pg_group where usename = 'bbob' and usesysid = any(grolist) and (groname ~ '.*owner$' or groname = 'admin'))) & access > '0'::bit(100); returns very fast If I create function create or replace function check_for_update_permission(text,text) returns boolean as ' declare doc_number alias for $1; user alias for $2; doc_count integer; begin select count(*) into doc_count from documents where doc_num = doc_number and (select bit_or(group_access) from mda_groups where group_name in (select groname from pg_user,pg_group where usename = user and usesysid = any(grolist) and (groname ~ ''.*owner$'' or groname = ''admin''))) & access > ''0''::bit(100); if doc_count > 0 then return(true); end if; return(false); end; ' language 'plpgsql'; and run "select check_for_update_permission('106973821','bbob');" it returns the correct info but takes several minutes Would someone please enlighten me. Can you do something like explain analyze on a function Thanks Richard ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] Query is fast and function is slow
On Thu, 7 Dec 2006, Thomas Pundt wrote: Hi, On Wednesday 06 December 2006 16:44, Richard Ray wrote: | select count(*) from documents where doc_num = '106973821' and (select | bit_or(group_access) from mda_groups where group_name in (select groname | from pg_user,pg_group where usename = 'bbob' and usesysid = any(grolist) | and (groname ~ '.*owner$' or groname = 'admin'))) & access > | '0'::bit(100); | | returns very fast | | If I create function | | create or replace function check_for_update_permission(text,text) returns | boolean as ' | declare |doc_number alias for $1; |user alias for $2; |doc_count integer; | begin ... | end; | ' language 'plpgsql'; | | | and run "select check_for_update_permission('106973821','bbob');" | it returns the correct info but takes several minutes | Would someone please enlighten me. | Can you do something like explain analyze on a function Just a guess: is the column "doc_num" really of type text? Maybe using "text" in the function lets the planner choose a sequential scan? Actually "doc_num" is char(9) I changed text to char(9) and got same slow results I'd try putting a "raise notice '%', explain analyze ..." statement into the function and check the log file. It appears that the function is not using the index The table documents has a index on doc_num and doc_num is a unique value dcc=# explain analyze select doc_num from documents where doc_num = '106973821'; QUERY PLAN - Index Scan using documents_pkey on documents (cost=0.00..5.48 rows=1 width=13) (actual time=37.475..37.481 rows=1 loops=1) Index Cond: (doc_num = '106973821'::bpchar) Total runtime: 37.535 ms (3 rows) dcc=# But this same statement in a function takes several minutes; My SQL knowledge is pitiful so would you explain how to use "explain analyze" in the function I get errors when I try to load the file with raise notice ''%'',explain analyze select doc_num from documents where doc_num = doc_number; dcc=# \i /src/check_for_update_permission psql:/src/check_for_update_permission:52: ERROR: syntax error at or near "analyze" at character 16 QUERY: SELECT explain analyze select doc_num from documents where doc_num = $1 CONTEXT: SQL statement in PL/PgSQL function "check_for_update_permission" near line 18 psql:/src/check_for_update_permission:52: LINE 1: SELECT explain analyze select doc_num from documents where d... psql:/src/check_for_update_permission:52: ^ dcc=# Ciao, Thomas ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [SQL] Query is fast and function is slow
On Thu, 7 Dec 2006, Tom Lane wrote: Richard Ray <[EMAIL PROTECTED]> writes: On Thu, 7 Dec 2006, Thomas Pundt wrote: Just a guess: is the column "doc_num" really of type text? Maybe using "text" in the function lets the planner choose a sequential scan? Actually "doc_num" is char(9) I changed text to char(9) and got same slow results You need to make the second argument type "name", too, if you have a lot of users. I said my knowledge was pitiful Changing both parameters to char(9) and name fixed the problem It appears to be using the index If time allows could you explain this a bit Thanks Richard regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[SQL] Help with quotes in plpgsql
How should this be properly quoted create or replace function test(integer) returns setof text as $$ declare a record; begin select into a now() - interval '$1 day'; return next a; return; end $$ language 'plpgsql'; I'm not having a lot of luck Thanks Richard ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [SQL] Help with quotes in plpgsql
On Tue, 19 Dec 2006, Hector Villarreal wrote: Hi Not sure about $1 parms : you may need to use a variable to set $1 to then cast it as interval : create or replace function test(integer) returns setof text as $$ declare a record; begin select into a now() - ($1::text||'days')::interval; return next a; return; end $$ language 'plpgsql'; This works quite well I never looked at it that way Thanks -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard Ray Sent: Tuesday, December 19, 2006 12:10 PM To: pgsql-sql@postgresql.org Subject: [SQL] Help with quotes in plpgsql How should this be properly quoted create or replace function test(integer) returns setof text as $$ declare a record; begin select into a now() - interval '$1 day'; return next a; return; end $$ language 'plpgsql'; I'm not having a lot of luck Thanks Richard ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] Help with quotes in plpgsql
On Tue, 19 Dec 2006, Milen A. Radev wrote: Richard Ray : How should this be properly quoted create or replace function test(integer) returns setof text as $$ declare a record; begin select into a now() - interval '$1 day'; return next a; return; end $$ language 'plpgsql'; I'm not having a lot of luck Welcome to psql 8.2.0, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit milen=> create or replace function test(integer) returns setof text as $$ milen$> declare milen$> a record; milen$> begin milen$> select into a now() - interval '$1 day'; milen$> return next a; milen$> return; milen$> end milen$> $$ language 'plpgsql'; CREATE FUNCTION milen=> No problems here. What version are you using? I'm using 8.1.0 but I don't think that's the problem I have no problem creating the function but it will only substract 1 day ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [SQL] Help with quotes in plpgsql
It makes sense when ya'll explain it It never sounds that good when I'm talkin to myself That solves my problem but not my ignorance I'm still curious about how would I properly quote create or replace function test(integer) returns setof text as $$ declare a record; begin select into a now() - interval '$1 day'; return next a; return; end $$ language 'plpgsql'; I got the usage example for interval from http://www.postgresql.org/docs/8.1/interactive/functions-datetime.html Thanks Richard On Tue, 19 Dec 2006, Tom Lane wrote: "Hector Villarreal" <[EMAIL PROTECTED]> writes: select into a now() - ($1::text||'days')::interval; People keep suggesting variants of that as ways to convert numeric values to intervals, but it's really extremely bad practice. Much better is to use number-times-interval multiplication: select into a now() - $1 * '1 day'::interval; This is less typing, at least as easy to understand, more flexible (you can use any scale factor you want), and considerably more efficient. The first way involves coercing the integer to text, then text-concatenating that with a constant, then applying interval_in which does a fairly nontrivial parsing process. The second way is basically just a multiplication, because '1 day'::interval is already a constant value of type interval. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[SQL] How to query by column names
This may be a simple but can I create a query such as select (select attname from pg_attribute where attrelid = (select relfilenode from pg_class where relname = 't1') and attisdropped = false and attnum > 0) from t1; I get ERROR: more than one row returned by a subquery used as an expression Thanks Richard ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [SQL] How to query by column names
This is not exactly what I need I want to return the data in t1 On Mon, 22 Jan 2007, Jeff Frost wrote: I think this is what you're looking for Richard: SELECT attname FROM pg_attribute pa, pg_class pc WHERE pc.relname = 't1' AND pa.attrelid = pc.relfilenode AND pa.attisdropped IS FALSE AND pa.attnum > 0; Let me know if it doesn't do what you intended. On Mon, 22 Jan 2007, Richard Ray wrote: This may be a simple but can I create a query such as select (select attname from pg_attribute where attrelid = (select relfilenode from pg_class where relname = 't1') and attisdropped = false and attnum > 0) from t1; I get ERROR: more than one row returned by a subquery used as an expression Thanks Richard ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] How to query by column names
All attributes of t1 Where (select attname from pg_attribute where attrelid = (select relfilenode from pg_class where relname = 't1') and attisdropped = false and attnum > 0) is a substitute for * On Mon, 22 Jan 2007, Jeff Frost wrote: Perhaps I should have asked this earlier. What information are you trying to extract? On Mon, 22 Jan 2007, Richard Ray wrote: This is not exactly what I need I want to return the data in t1 On Mon, 22 Jan 2007, Jeff Frost wrote: I think this is what you're looking for Richard: SELECT attname FROM pg_attribute pa, pg_class pc WHERE pc.relname = 't1' AND pa.attrelid = pc.relfilenode AND pa.attisdropped IS FALSE AND pa.attnum > 0; Let me know if it doesn't do what you intended. On Mon, 22 Jan 2007, Richard Ray wrote: This may be a simple but can I create a query such as select (select attname from pg_attribute where attrelid = (select relfilenode from pg_class where relname = 't1') and attisdropped = false and attnum > 0) from t1; I get ERROR: more than one row returned by a subquery used as an expression Thanks Richard ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954 ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [SQL] How to query by column names
On Mon, 22 Jan 2007, Jeff Frost wrote: So why are you avoiding "SELECT * FROM t1;" ? I was affeared that if I brought my total ignorance to light I would be band from the list but here goes. I work in UNIX/Linux environments. It's my habit to record my scripts. A simple example: #!/bin/bash CMD="psql -d test \"select * from t1\"" echo $CMD >> my_log eval $CMD | while read x; do do_something_with_x done In this example * expands to all files in the current working directory. I was attempting to get around this by enumerating the table attributes. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[SQL] How to influence the planner
I have a table, t1, with about 12 million rows The column foo is unique and indexed The column bar is not unique and is indexed "select * from t1 where length(bar) = 0;" runs about 2 minutes "select * from t1 where length(bar) = 0 order by foo ;" ran until I stopped it after about 20 minutes My simple solution is "select * into t2 from t1 where length(bar) = 0;" and "select * from t2 order by foo ;" Is there a way to make "select * from t1 where length(bar) = 0 order by foo ;" or something similar work Thanks Richard Ray ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] How to influence the planner
On Fri, 31 Aug 2007, Michael Glaesemann wrote: On Aug 31, 2007, at 13:32 , Richard Ray wrote: "select * from t1 where length(bar) = 0;" runs about 2 minutes "select * from t1 where length(bar) = 0 order by foo ;" ran until I stopped it after about 20 minutes EXPLAIN ANALYZE will help you see what the planner is doing to produce the results. mda=# EXPLAIN ANALYZE select * from t1 where length(bar) = 0 order by foo; QUERY PLAN --- Index Scan using t1_pkey on t1 (cost=0.00..46698478.18 rows=60038 width=334) (actual time=4612.740..2349582.520 rows=32705 loops=1) Filter: (length(bar) = 0) Total runtime: 2349614.258 ms (3 rows) mda=# EXPLAIN ANALYZE select * from t1 where length(bar) = 0; QUERY PLAN --- Seq Scan on t1 (cost=1.00..102020349.17 rows=60038 width=334) (actual time=39.065..108645.233 rows=32705 loops=1) Filter: (length(bar) = 0) Total runtime: 108677.759 ms (3 rows) mda=# The index for foo on t1 is the primary index t1_pkey Why is it slower using the index Have you recently ANALYZEd t1? I run vacuum analyze nightly If length(bar) = 0 is a common operation on this table, you might consider using an expression index on t1: create index t1_length_bar_idx on t1 (length(bar)); This is a one time procedure to fix some data but I've had this problem before I'm running PostgreSQL 8.1.0 on Fedora Core 6 You might want to ask on the performance list as well, as this is right up their alley. Hope this gets you started on the right track. Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] How to influence the planner
On Fri, 31 Aug 2007, Tom Lane wrote: Richard Ray <[EMAIL PROTECTED]> writes: On Fri, 31 Aug 2007, Michael Glaesemann wrote: EXPLAIN ANALYZE will help you see what the planner is doing to produce the results. mda=# EXPLAIN ANALYZE select * from t1 where length(bar) = 0 order by foo; QUERY PLAN --- Index Scan using t1_pkey on t1 (cost=0.00..46698478.18 rows=60038 width=334) (actual time=4612.740..2349582.520 rows=32705 loops=1) Filter: (length(bar) = 0) Total runtime: 2349614.258 ms (3 rows) mda=# EXPLAIN ANALYZE select * from t1 where length(bar) = 0; QUERY PLAN --- Seq Scan on t1 (cost=1.00..102020349.17 rows=60038 width=334) (actual time=39.065..108645.233 rows=32705 loops=1) Filter: (length(bar) = 0) Total runtime: 108677.759 ms (3 rows) The problem here is you've got enable_seqscan = off. Don't do that. That will make it use an index if it possibly can, whether using one is a good idea or not. In this case, since the useful condition on length(bar) is not indexable, the best available index-using scan uses the index to implement order by foo ... which is pointless here in terms of saving runtime. I'm running PostgreSQL 8.1.0 on Fedora Core 6 Please update. There are a *lot* of bugs fixed in the 8.1.x series since then. Changing to enable_seqscan = on does solve this problem, thanks Is there some method of crafting a query that will assert my wishes to the planner mda=# EXPLAIN ANALYZE select * from (select * from t1 where length(bar) = 0) a order by foo; QUERY PLAN Index Scan using t1_pkey on t1 (cost=0.00..46698482.18 rows=60038 width=334) (actual time=4784.869..2317363.298 rows=32705 loops=1) Filter: (length(bar) = 0) Total runtime: 2317395.137 ms (3 rows) When is enable_seqscan = off appropriate regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [SQL] How to influence the planner
Thanks guys Lesson learned On Tue, 4 Sep 2007, Scott Marlowe wrote: On 8/31/07, Richard Ray <[EMAIL PROTECTED]> wrote: Changing to enable_seqscan = on does solve this problem, thanks Is there some method of crafting a query that will assert my wishes to the planner When is enable_seqscan = off appropriate enable_xxx = off are troubleshooting tools. They override the query planner. For instance, I had a query that was running slow, and using set enable_nestloop=off allowed the query to run fast. However, using explain analyze I could see that the estimated number of rows was off. Analyze didn't fix it, so I increased the stats target for the column I was working with, reanalyzed, and voila, the query ran fine with nestloop=on. So, enable_xxx=off is normally only appropriate when troubleshooting an issue, not as a fix all. That's doubly true for enable_seqscan=off. If you do have a query that nothing else seems to work on it, you can set one of the enable_xxx settings off for that connection only and not worry about messing up all the other sessions connecting to your db. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly