On May 27, 2013, at 6:35 PM, John Mudd <johnbm...@gmail.com> wrote:

> Thanks, that's easy enough to test. Didn't seem to help though.
>
Ok. And if you CLUSTER the table USING the PK? (See the sketch at the bottom of this mail.)

> test=# REINDEX index test_select_pkey;
> REINDEX
> test=# VACUUM ANALYZE test_select;
> VACUUM
>
> (stopped postgres; reset O/S cache; started postgres)
>
> test=# explain analyze SELECT * FROM test_select WHERE key1 >= 500000 ORDER BY key1, key2, key3, id LIMIT 1;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Limit  (cost=0.00..0.08 rows=1 width=21) (actual time=16.368..16.369 rows=1 loops=1)
>    ->  Index Scan using my_key on test_select  (cost=0.00..41981.16 rows=501333 width=21) (actual time=16.366..16.366 rows=1 loops=1)
>          Index Cond: (key1 >= 500000)
>  Total runtime: 16.444 ms
>
> (stopped postgres; reset O/S cache; started postgres)
>
> test=# explain analyze SELECT * FROM test_select WHERE id = 500000;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Index Scan using test_select_pkey on test_select  (cost=0.00..8.36 rows=1 width=21) (actual time=23.072..23.074 rows=1 loops=1)
>    Index Cond: (id = 500000)
>  Total runtime: 23.192 ms
>
> On Mon, May 27, 2013 at 10:21 AM, Evgeny Shishkin <itparan...@gmail.com> wrote:
>
> > On May 27, 2013, at 6:02 PM, John Mudd <johnbm...@gmail.com> wrote:
> >
> > > Postgres 9.1.2 on Ubuntu 12.04
> > >
> > > Any reason why a select by primary key would be slower than a select that includes an ORDER BY? I was really hoping using the primary key would give me a boost.
> >
> > You created my_key after data loading, and the PK was there all the time.
> > If you REINDEX the PK, I bet it will be as fast.
> >
> > > I stopped the server and cleared the O/S cache using "sync; echo 3 > /proc/sys/vm/drop_caches" between the runs.
> > >
> > > test=# VACUUM ANALYZE test_select;
> > > VACUUM
> > >
> > > (stopped postgres; reset O/S cache; started postgres)
> > >
> > > test=# explain analyze SELECT * FROM test_select WHERE key1 >= 500000 ORDER BY key1, key2, key3, id LIMIT 1;
> > >                                QUERY PLAN
> > > ------------------------------------------------------------------------
> > >  Limit  (cost=0.00..0.08 rows=1 width=21) (actual time=12.599..12.600 rows=1 loops=1)
> > >    ->  Index Scan using my_key on test_select  (cost=0.00..41895.49 rows=498724 width=21) (actual time=12.597..12.597 rows=1 loops=1)
> > >          Index Cond: (key1 >= 500000)
> > >  Total runtime: 12.678 ms
> > >
> > > (stopped postgres; reset O/S cache; started postgres)
> > >
> > > test=# explain analyze SELECT * FROM test_select WHERE id = 500000;
> > >                                QUERY PLAN
> > > ------------------------------------------------------------------------
> > >  Index Scan using test_select_pkey on test_select  (cost=0.00..8.36 rows=1 width=21) (actual time=31.396..31.398 rows=1 loops=1)
> > >    Index Cond: (id = 500000)
> > >  Total runtime: 31.504 ms
> > >
> > > Schema:
> > >
> > > test=# \d test_select
> > >                                 Table "public.test_select"
> > >  Column |     Type     |                         Modifiers
> > > --------+--------------+-----------------------------------------------------------
> > >  id     | integer      | not null default nextval('test_select_id_seq'::regclass)
> > >  key1   | integer      |
> > >  key2   | integer      |
> > >  key3   | integer      |
> > >  data   | character(4) |
> > > Indexes:
> > >     "test_select_pkey" PRIMARY KEY, btree (id)
> > >     "my_key" btree (key1, key2, key3, id)
> > >
> > > Sample data:
> > >
> > > test=# SELECT * FROM test_select LIMIT 10;
> > >  id |  key1  |  key2  |  key3  | data
> > > ----+--------+--------+--------+------
> > >   1 | 984966 | 283954 | 772063 | x
> > >   2 | 817668 | 393533 | 924888 | x
> > >   3 | 751039 | 798753 | 454309 | x
> > >   4 | 128505 | 329643 | 280553 | x
> > >   5 | 105600 | 257225 | 710015 | x
> > >   6 | 323891 | 615614 |  83206 | x
> > >   7 | 194054 |  63506 | 353171 | x
> > >   8 | 212068 | 881225 | 271804 | x
> > >   9 | 644180 |  26693 | 200738 | x
> > >  10 | 136586 | 498699 | 554417 | x
> > > (10 rows)
> > >
> > > Here's how I populated the table:
> > >
> > > import random
> > > import psycopg2
> > >
> > > conn = psycopg2.connect('dbname=test')
> > > cur = conn.cursor()
> > >
> > > def random_int():
> > >     n = 1000000
> > >     return random.randint(0, n)
> > >
> > > def random_key():
> > >     return random_int(), random_int(), random_int()
> > >
> > > def create_table():
> > >     cur.execute('''
> > >         DROP TABLE IF EXISTS test_select;
> > >
> > >         CREATE TABLE test_select (
> > >             id SERIAL PRIMARY KEY,
> > >             key1 INTEGER,
> > >             key2 INTEGER,
> > >             key3 INTEGER,
> > >             data char(4)
> > >         );
> > >     ''')
> > >     conn.commit()
> > >
> > >     n = 1000000
> > >     for i in range(n):
> > >         cur.execute("INSERT INTO test_select(key1, key2, key3, data) VALUES (%s, %s, %s, 'x')", random_key())
> > >     conn.commit()
> > >
> > >     cur.execute('CREATE INDEX my_key ON test_select (key1, key2, key3, id)')
> > >     conn.commit()
> > >
> > > create_table()
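For reference, the CLUSTER suggestion at the top would look something like this (untested sketch; CLUSTER takes an exclusive lock and rewrites the table, so only run it on a test box):

    CLUSTER test_select USING test_select_pkey;
    ANALYZE test_select;

One way to check whether test_select_pkey is simply bloated relative to my_key is to compare the on-disk sizes of the two indexes:

    SELECT relname, pg_size_pretty(pg_relation_size(oid))
    FROM pg_class
    WHERE relname IN ('test_select_pkey', 'my_key');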