[GENERAL] Why is this query running slowly?
I have a simple query that's been running for a while, which is fine, but it seems to be running very slowly, which is a problem: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp = '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp = '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: mydb=# select count(*) from user_mappings; count --- 36780 (1 row) The client is on the same LAN, but only sees packets trickling in: $ sudo tcpdump -i lo port 5434 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes 00:44:05.475584 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 3757958746:3757966938, ack 3697417382, win 265, options [nop,nop,TS val 194791416 ecr 194791044], length 8192 00:44:05.475684 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 8192, win 771, options [nop,nop,TS val 194791416 ecr 194791416], length 0 00:44:08.867976 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 8192:15779, ack 1, win 265, options [nop,nop,TS val 194791756 ecr 194791416], length 7587 00:44:08.868019 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756], length 0 00:44:08.870336 IP nuthouse.40349 nuthouse.5434: Flags [P.], seq 1:19, ack 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756], length 18 00:44:08.870343 IP nuthouse.5434 nuthouse.40349: Flags [.], ack 19, win 265, options [nop,nop,TS val 194791756 ecr 194791756], length 0 00:44:12.578220 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 15779:23971, ack 19, win 265, options [nop,nop,TS val 194792127 ecr 194791756], length 8192 00:44:12.618116 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 23971, win 771, options [nop,nop,TS val 194792131 ecr 194792127], length 0 00:44:16.664645 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 23971:32163, ack 19, win 265, options [nop,nop,TS val 194792535 ecr 194792131], length 8192 00:44:16.664755 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 32163, win 771, options [nop,nop,TS val 194792535 ecr 194792535], length 0 00:44:20.465773 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 32163:40355, ack 19, win 265, options [nop,nop,TS val 194792915 ecr 194792535], length 8192 00:44:20.465878 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 40355, win 720, options [nop,nop,TS val 194792915 ecr 194792915], length 0 00:44:24.115273 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 40355:48547, ack 19, win 265, options [nop,nop,TS val 194793280 ecr 194792915], length 8192 00:44:24.115380 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 48547, win 720, options [nop,nop,TS val 194793280 ecr 194793280], length 0 00:44:27.749084 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 48547:56739, ack 19, win 265, options [nop,nop,TS val 194793644 ecr 194793280], length 8192 00:44:27.749192 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 56739, win 720, options [nop,nop,TS val 194793644 ecr 194793644], length 0 00:44:31.618124 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 56739:64931, ack 19, win 265, options [nop,nop,TS val 194794031 ecr 194793644], length 8192 00:44:31.618166 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 64931, win 720, options [nop,nop,TS val 194794031 ecr 194794031], length 0 00:44:35.349958 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 64931:73123, ack 19, win 265, options [nop,nop,TS val 194794404 ecr 194794031], length 8192 00:44:35.350054 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 73123, win 720, options [nop,nop,TS val 194794404 ecr 194794404], length 0 00:44:39.311627 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 73123:81315, ack 19, win 265, options [nop,nop,TS val 194794800 ecr 194794404], length 8192 00:44:39.311734 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 81315, win 771, options [nop,nop,TS val 194794800 ecr 194794800], length 0 00:44:43.178319 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 81315:89507, ack 19, win 265, options [nop,nop,TS val 194795187 ecr 194794800], length 8192 00:44:43.178429 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 89507, win 771, options [nop,nop,TS val 194795187 ecr 194795187], length 0 00:44:46.798511 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 89507:97699, ack 19, win 265, options [nop,nop,TS val 194795549 ecr 194795187], length 8192 00:44:46.798622 IP nuthouse.40349
Re: [GENERAL] Why is this query running slowly?
If it matters, the client is a JDBC app doing: con = DriverManager.getConnection(jdbc:postgresql://localhost:5434/mydb, yang, password) con.setAutoCommit(false) val st = con.prepareStatement( select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01'::date and '2010-04-01'::date; , ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY) st.setFetchSize(8192) rs = st.executeQuery() On Thu, Sep 15, 2011 at 12:53 AM, Yang Zhang yanghates...@gmail.com wrote: I have a simple query that's been running for a while, which is fine, but it seems to be running very slowly, which is a problem: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp = '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp = '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: mydb=# select count(*) from user_mappings; count --- 36780 (1 row) The client is on the same LAN, but only sees packets trickling in: $ sudo tcpdump -i lo port 5434 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes 00:44:05.475584 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 3757958746:3757966938, ack 3697417382, win 265, options [nop,nop,TS val 194791416 ecr 194791044], length 8192 00:44:05.475684 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 8192, win 771, options [nop,nop,TS val 194791416 ecr 194791416], length 0 00:44:08.867976 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 8192:15779, ack 1, win 265, options [nop,nop,TS val 194791756 ecr 194791416], length 7587 00:44:08.868019 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756], length 0 00:44:08.870336 IP nuthouse.40349 nuthouse.5434: Flags [P.], seq 1:19, ack 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756], length 18 00:44:08.870343 IP nuthouse.5434 nuthouse.40349: Flags [.], ack 19, win 265, options [nop,nop,TS val 194791756 ecr 194791756], length 0 00:44:12.578220 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 15779:23971, ack 19, win 265, options [nop,nop,TS val 194792127 ecr 194791756], length 8192 00:44:12.618116 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 23971, win 771, options [nop,nop,TS val 194792131 ecr 194792127], length 0 00:44:16.664645 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 23971:32163, ack 19, win 265, options [nop,nop,TS val 194792535 ecr 194792131], length 8192 00:44:16.664755 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 32163, win 771, options [nop,nop,TS val 194792535 ecr 194792535], length 0 00:44:20.465773 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 32163:40355, ack 19, win 265, options [nop,nop,TS val 194792915 ecr 194792535], length 8192 00:44:20.465878 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 40355, win 720, options [nop,nop,TS val 194792915 ecr 194792915], length 0 00:44:24.115273 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 40355:48547, ack 19, win 265, options [nop,nop,TS val 194793280 ecr 194792915], length 8192 00:44:24.115380 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 48547, win 720, options [nop,nop,TS val 194793280 ecr 194793280], length 0 00:44:27.749084 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 48547:56739, ack 19, win 265, options [nop,nop,TS val 194793644 ecr 194793280], length 8192 00:44:27.749192 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 56739, win 720, options [nop,nop,TS val 194793644 ecr 194793644], length 0 00:44:31.618124 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 56739:64931, ack 19, win 265, options [nop,nop,TS val 194794031 ecr 194793644], length 8192 00:44:31.618166 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 64931, win 720, options [nop,nop,TS val 194794031 ecr 194794031], length 0 00:44:35.349958 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 64931:73123, ack 19, win 265, options [nop,nop,TS val 194794404 ecr 194794031], length 8192 00:44:35.350054 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 73123, win 720, options [nop,nop,TS val 194794404 ecr 194794404], length 0 00:44:39.311627 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 73123:81315, ack 19, win 265, options [nop,nop,TS val 194794800 ecr 194794404], length 8192
Re: [GENERAL] Why is this query running slowly?
Should also add that while the client is under no load (CPU1%, load ~0.1, mem ~20%), the server looks pretty busy (CPU 90-100% of 1 core, load ~1.5, mem ~70-80%), but PG is the only thing using resources. On Thu, Sep 15, 2011 at 12:58 AM, Yang Zhang yanghates...@gmail.com wrote: If it matters, the client is a JDBC app doing: con = DriverManager.getConnection(jdbc:postgresql://localhost:5434/mydb, yang, password) con.setAutoCommit(false) val st = con.prepareStatement( select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01'::date and '2010-04-01'::date; , ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY) st.setFetchSize(8192) rs = st.executeQuery() On Thu, Sep 15, 2011 at 12:53 AM, Yang Zhang yanghates...@gmail.com wrote: I have a simple query that's been running for a while, which is fine, but it seems to be running very slowly, which is a problem: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp = '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp = '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: mydb=# select count(*) from user_mappings; count --- 36780 (1 row) The client is on the same LAN, but only sees packets trickling in: $ sudo tcpdump -i lo port 5434 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes 00:44:05.475584 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 3757958746:3757966938, ack 3697417382, win 265, options [nop,nop,TS val 194791416 ecr 194791044], length 8192 00:44:05.475684 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 8192, win 771, options [nop,nop,TS val 194791416 ecr 194791416], length 0 00:44:08.867976 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 8192:15779, ack 1, win 265, options [nop,nop,TS val 194791756 ecr 194791416], length 7587 00:44:08.868019 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756], length 0 00:44:08.870336 IP nuthouse.40349 nuthouse.5434: Flags [P.], seq 1:19, ack 15779, win 771, options [nop,nop,TS val 194791756 ecr 194791756], length 18 00:44:08.870343 IP nuthouse.5434 nuthouse.40349: Flags [.], ack 19, win 265, options [nop,nop,TS val 194791756 ecr 194791756], length 0 00:44:12.578220 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 15779:23971, ack 19, win 265, options [nop,nop,TS val 194792127 ecr 194791756], length 8192 00:44:12.618116 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 23971, win 771, options [nop,nop,TS val 194792131 ecr 194792127], length 0 00:44:16.664645 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 23971:32163, ack 19, win 265, options [nop,nop,TS val 194792535 ecr 194792131], length 8192 00:44:16.664755 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 32163, win 771, options [nop,nop,TS val 194792535 ecr 194792535], length 0 00:44:20.465773 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 32163:40355, ack 19, win 265, options [nop,nop,TS val 194792915 ecr 194792535], length 8192 00:44:20.465878 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 40355, win 720, options [nop,nop,TS val 194792915 ecr 194792915], length 0 00:44:24.115273 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 40355:48547, ack 19, win 265, options [nop,nop,TS val 194793280 ecr 194792915], length 8192 00:44:24.115380 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 48547, win 720, options [nop,nop,TS val 194793280 ecr 194793280], length 0 00:44:27.749084 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 48547:56739, ack 19, win 265, options [nop,nop,TS val 194793644 ecr 194793280], length 8192 00:44:27.749192 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 56739, win 720, options [nop,nop,TS val 194793644 ecr 194793644], length 0 00:44:31.618124 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 56739:64931, ack 19, win 265, options [nop,nop,TS val 194794031 ecr 194793644], length 8192 00:44:31.618166 IP nuthouse.40349 nuthouse.5434: Flags [.], ack 64931, win 720, options [nop,nop,TS val 194794031 ecr 194794031], length 0 00:44:35.349958 IP nuthouse.5434 nuthouse.40349: Flags [P.], seq 64931:73123, ack 19, win 265, options [nop,nop,TS val 194794404 ecr 194794031], length 8192 00:44:35.350054
Re: [GENERAL] Why is this query running slowly?
On 09/15/11 12:53 AM, Yang Zhang wrote: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp= '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp= '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: so you have to check if every one of the 22 million rows of 'den' in that time range and is not one of the 36000 rows of user_mappings? how many rows are in that time range? are timestamp and duid indexed? have the tables been analyzed any time recently? -- john r pierceN 37, W 122 santa cruz ca mid-left coast -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query running slowly?
On 15 Září 2011, 9:53, Yang Zhang wrote: I have a simple query that's been running for a while, which is fine, but it seems to be running very slowly, which is a problem: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp = '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp = '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: The problem is that for each of the 22634720 rows in den a separate uncorrelated subquery (a seq scan on user_mappings) has to be executed. Althogh the subquery is not very expensive, multiplied by the number of rows in den the total cost is extreme. The only solution is to get rid of the not in subquery - try to turn it to a join like this: SELECT user_id FROM den LEFT JOIN user_mappings ON (user_id = duid) WHERE (timestamp BETWEEN '2009-04-01' AND '2010-04-01') AND (duid IS NULL) That should give the same result I guess. Tomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query running slowly?
On 15/09/11 17:53, Yang Zhang wrote: I have a simple query that's been running for a while, which is fine, but it seems to be running very slowly, which is a problem: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; Do you have indexes on user_id and timestamp? QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp= '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp= '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: mydb=# select count(*) from user_mappings; count --- 36780 (1 row) I know that `den` is large but things seem to be running much slower than I'd expect. Nothing in the logs. Is there any way to inspect what's going on? Hesitant to kill the query in case it's almost done, though I doubt it (didn't have the foresight to expose this information in the client process - wasn't expecting to run into this). Thanks in advance. -- .signature -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query running slowly?
On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra t...@fuzzy.cz wrote: On 15 Září 2011, 9:53, Yang Zhang wrote: I have a simple query that's been running for a while, which is fine, but it seems to be running very slowly, which is a problem: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp = '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp = '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: The problem is that for each of the 22634720 rows in den a separate uncorrelated subquery (a seq scan on user_mappings) has to be executed. Althogh the subquery is not very expensive, multiplied by the number of rows in den the total cost is extreme. The only solution is to get rid of the not in subquery - try to turn it to a join like this: SELECT user_id FROM den LEFT JOIN user_mappings ON (user_id = duid) WHERE (timestamp BETWEEN '2009-04-01' AND '2010-04-01') AND (duid IS NULL) That should give the same result I guess. This worked great, thank you. Too bad the planner isn't smart enough to do this yet! Tomas -- Yang Zhang http://yz.mit.edu/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query running slowly?
On 15 Září 2011, 11:07, Yang Zhang wrote: On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra t...@fuzzy.cz wrote: On 15 Září 2011, 9:53, Yang Zhang wrote: I have a simple query that's been running for a while, which is fine, but it seems to be running very slowly, which is a problem: mydb=# explain select user_id from den where user_id not in (select duid from user_mappings) and timestamp between '2009-04-01' and '2010-04-01'; QUERY PLAN --- Seq Scan on den (cost=711.58..66062724212.74 rows=22634720 width=4) Filter: ((timestamp = '2009-04-01 00:00:00'::timestamp without time zone) AND (timestamp = '2010-04-01 00:00:00'::timestamp without time zone) AND (NOT (SubPlan 1))) SubPlan 1 - Materialize (cost=711.58..1223.38 rows=36780 width=4) - Seq Scan on user_mappings (cost=0.00..530.80 rows=36780 width=4) user_mappings is fairly small: The problem is that for each of the 22634720 rows in den a separate uncorrelated subquery (a seq scan on user_mappings) has to be executed. Althogh the subquery is not very expensive, multiplied by the number of rows in den the total cost is extreme. The only solution is to get rid of the not in subquery - try to turn it to a join like this: SELECT user_id FROM den LEFT JOIN user_mappings ON (user_id = duid) WHERE (timestamp BETWEEN '2009-04-01' AND '2010-04-01') AND (duid IS NULL) That should give the same result I guess. This worked great, thank you. Too bad the planner isn't smart enough to do this yet! It's probably a bit more complicated I guess - there are probably cases when this would not work this great. Anyway, you should consider Toby Corkindale's recommendation and check if there's an index on that timestamp column - that might further improve the performance. (I don't think an index on user_id might be useful in this case). Tomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query running slowly?
Yang Zhang yanghates...@gmail.com writes: On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra t...@fuzzy.cz wrote: [ use a left join instead of NOT IN ] This worked great, thank you. Too bad the planner isn't smart enough to do this yet! It never will be, because they're not equivalent queries. NOT IN has different (and very hard-to-optimize) behavior in the face of NULLs in the right-hand table. The moral of the story is avoid NOT IN. You could use NOT EXISTS, which is less of a notational change than recasting as a left join. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query running slowly?
On 15 Září 2011, 16:26, Tom Lane wrote: Yang Zhang yanghates...@gmail.com writes: On Thu, Sep 15, 2011 at 1:22 AM, Tomas Vondra t...@fuzzy.cz wrote: [ use a left join instead of NOT IN ] This worked great, thank you. Too bad the planner isn't smart enough to do this yet! It never will be, because they're not equivalent queries. NOT IN has different (and very hard-to-optimize) behavior in the face of NULLs in the right-hand table. True, but in this case the subquery selects a PK column (at least I believe so, we haven't seen the structure), so there should be no NULLs. So the result should be the same. Could the planner do this reasoning, i.e. see if there's a NOT NULL constraint on the column and change the plan accordingly? Tomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query running slowly?
Tomas Vondra t...@fuzzy.cz writes: Could the planner do this reasoning, i.e. see if there's a NOT NULL constraint on the column and change the plan accordingly? Possibly. I'm not sure how expensive it would be. Right now, a structural change of that sort would have to be done pretty early in planning, and we don't have a lot of info collected about the tables at that point. I'm afraid it would involve a special-purpose catalog probe, which would add a lot of overhead. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general