Re: [GENERAL] forcing use of a specific (expression) index?
Thanks to you and Tom. The partial index solution is working splendidly. On 5/3/07, Richard Huxton <[EMAIL PROTECTED]> wrote: Dan Weber wrote: > I made an expression index specifically for that where clause: > > CREATE INDEX special_testing_idx on my_table (((bool_1 or > int_1 = 0) AND (int_2 IS NULL) AND (int_3 IS > NULL) AND (protocol = 2))); No, you haven't. What you've done here is create an index *for that expression*. And it's not terribly useful because your expression will only have two possible values: true, false. What you were after is something like: CREATE INDEX special_idx2 ON my_table (id) WHERE (...long expression...) -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] forcing use of a specific (expression) index?
I've got a table with about 15 columns and 200,000 rows. I have indexes on a lot of my columns, but postgres doesn't seem to be grabbing the ideal index -- in this case, an expression index that exactly matches my WHERE clause. I have the following query: SELECT columns FROM my_table WHERE (bool_1 or int_1 = 0) AND (int_2 IS NULL) AND (int_3 IS NULL) AND (protocol = 2) ORDER BY id LIMIT 1; Here is an explain analyze: Limit (cost=0.00..8.43 rows=1 width=111) (actual time=17511.939..17511.940 rows=1 loops=1) -> Index Scan using my_table_id_key on my_table (cost=0.00..384000.58 rows=45562 width=111) (actual time=17511.935..17511.935 rows=1 loops=1) Filter: ((bool_1 OR (int_1 = 0)) AND (int_2 IS NULL) AND (int_3 IS NULL) AND (protocol = 2)) Total runtime: 17512.031 ms I made an expression index specifically for that where clause: CREATE INDEX special_testing_idx on my_table (((bool_1 or int_1 = 0) AND (int_2 IS NULL) AND (int_3 IS NULL) AND (protocol = 2))); It looks like the database is letting the ORDER BY id LIMIT 1 override the above index. I suppose I could live with that, except removing both the ORDER and LIMIT clauses doesn't make it use that index. Here is the EXPLAIN ANALYZE without the ORDER or LIMIT: Bitmap Heap Scan on my_table (cost=1558.71..10154.01 rows=45562 width=111) (actual time=7442.835..14391.969 rows=678 loops=1) Filter: ((bool_1 OR (int_1 = 0)) AND (int_2 IS NULL) AND (int_3 IS NULL) AND (protocol = 2)) -> Bitmap Index Scan on my_table_int_2_null_idx (cost=0.00..1558.71 rows=122487 width=0) (actual time=7081.952..7081.952 rows=123001 loops=1) Index Cond: ((int_2 IS NULL) = true) Total runtime: 14392.966 ms I've done lots of ANALYZEs of my_table, as well as tried setting enable_seqscan to off. Have I just created too many indexes? Is there some syntax I can use to make the database recognize my big hairy WHERE clause? Would I be better of making a special boolean column that resolves to that expression, and then indexing that column? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] query takes 65 times longer if I add 1 column (explain attached)
What have you got work_mem set to?The default, 1024. The fact that the estimates diverge so far from reality can no doubt beblamed on the horridly bad rowcount estimates: 3 rows estimated vs 179 actual for table_one, and 128592 estimated vs 4202 actual fortable_two. Have you ANALYZEd these tables lately?Nope. Doing that caused the first operation to respond very quickly.(And the second as well, of course.) Thanks for the assistance.
[GENERAL] query takes 65 times longer if I add 1 column (explain attached)
I've got a query that takes a very long time to execute if I add anadditional column to the SELECT clause. Here is Query A, the slow one. I've numbered the rowsof the EXPLAIN for convenience.psql=# explain analyze SELECT timestamp, src_ip, dst_ip, id, string_one, string_two FROM table_one LEFT JOIN table_two USING(id) WHERE TRUE AND src_ip = '192.168.10.128' ORDER BY timestamp DESC LIMIT 10; QUERY PLAN ---a1. Limit (cost=18954.93..18954.95 rows=10 width=136) (actual time=169038.823..169038.851 rows=10 loops=1)a2. -> Sort (cost= 18954.93..18959.75 rows=1929 width=136) (actual time=169038.816..169038.825 rows=10 loops=1)a3. Sort Key: table_one."timestamp"a4. -> Nested Loop Left Join (cost= 0.00..18849.66 rows=1929 width=136) (actual time=862.661..169036.636 rows=179 loops=1)a5. Join Filter: ("outer".id = "inner".id)a6. -> Index Scan using table_one_type_key on table_one (cost=0.00..17.70 rows=3 width=72) (actual time=0.083..864.386 rows=179 loops=1)a7. Index Cond: (src_ip = '192.168.10.128 '::inet)a8. -> Seq Scan on table_two (cost=0.00..4669.92 rows=128592 width=68) (actual time=842.166..933.496 rows=4202 loops=179) Total runtime: 169039.368 ms(9 rows)Time: 169041.495 msIt takes almost 3 minutes to run. If I knock off either string_one orstring_two, the query runs in <3 seconds. Here is Query B:psql=# explain analyze SELECT timestamp, src_ip, dst_ip, id, string_one FROM table_one LEFT JOIN table_two USING(id) WHERE TRUE AND src_ip = '192.168.10.128' ORDER BY timestamp DESC LIMIT 10; QUERY PLAN--- b1. Limit (cost=17617.45..17617.47 rows=10 width=104 (actual time=2644.975..2645.003 rows=10 loops=1)b2. -> Sort (cost=17617.45..17622.27 rows=1929 width=104) (actual time=2644.968..2644.978 rows=10 loops=1)b3. Sort Key: table_one."timestamp"b4. -> Nested Loop Left Join (cost=5810.75..17512.19 rows=1929 width=104) (actual time=956.129..2643.458 rows=179 loops=1)b5. Join Filter: ("outer".id = "inner".id)b6. -> Bitmap Heap Scan on table_one (cost=7.24..13.72 rows=3 width=72) (actual time=39.745..67.818 rows=179 loops=1)b7. Recheck Cond: (src_ip = '192.168.10.128'::inet)b8. -> Bitmap Index Scan on table_one_type_key (cost=0.00..7.24 rows=3 width=0) (actual time=29.399..29.399 rows=179 loops=1)b9. Index Cond: (src_ip = '192.168.10.128 '::inet)b10. -> Materialize (cost=5803.51..8094.43 rows=128592 width=36) (actual time=4.573..9.462 rows=4202 loops=179)b11. -> Seq Scan on table_two (cost=0.00..4669.92 rows=128592 width=36) (actual time=818.216..895.127 rows=4202 loops=1) Total runtime: 2646.899 ms(12 rows)If we compare each EXPLAIN, they're about the same down through line 5, the Join Filter. But Query A joins things from an "Index Scan" anda "Seq Scan," while Query B joins things from a "Bitmap Index Scan"and a "Materialize."1. Why is the additional column causing such a big change in the searching technique, given that the column is not involved in the WHERE at all?2. Besides indices, how can I coerce Postgres to act like Query B instead of like Query A?