Re: [GENERAL] Rules, Windows and ORDER BY
Martijn van Oosterhout writes: > On Fri, Aug 24, 2012 at 09:32:32AM +, Jason Dusek wrote: >> Why are the individual indices not useful? The tests that the >> query does -- equality on key and realm and ordering on at -- >> are each supported by indices. Does it have to do with the cost >> of loading the three indices? > I'm not entirely sure, but I'll take a stab at it. I think it has to do > with the fact that you want order. Combining multiple indexes so you > use them at the same time works as an BitmapAnd. That is, it uses each > index to determine blocks that are interesting and then find the blocks > that are listed by all tindexes, and then it loads the blocks and chcks > them. Yeah. While you *can* in principle solve the problem with the individual indexes, it's much less efficient than a single index. In particular, BitmapAnd plans are far from being a magic bullet for combining two individually-not-very-selective conditions. (That realm constraint is surely not very selective; dunno about the key one.) That implies reading a large number of entries from each index, forming a rather large bitmap for each one, and then ANDing those bitmaps to get a smaller one. And even after all that work, you're still not done, because you have no idea which bit in the bitmap represents the row with largest "at" value. > In theory you could BitmapAnd the 'k' and 'realm' indexes and then scan > the 'at' index only checking rows that the bitmap shows are > interesting. But I'm not sure if postgres can do that. No, it can't, and that likely wouldn't be a very effective plan anyway; you could end up scanning a very large fraction of the "at" index, since you'd have to start at the end (the latest entry anywhere in the table). Even if you didn't make many trips to the heap, that's not cheap. In constrast, given a three-column btree index organized with the equality-constrained columns first, the btree code can descend the index tree straight to the entry you want. We've expended a lot of sweat on optimizing that case, and it will absolutely blow the doors off anything involving a bitmap scan. Of course the downside is that the three-column index might be relatively useless for queries of forms other than this one. So it's a tradeoff between flexibility and performance. But since the OP is asking, I'm assuming he cares a lot about performance of queries of this exact form. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rules, Windows and ORDER BY
On Fri, Aug 24, 2012 at 09:32:32AM +, Jason Dusek wrote: > 2012/8/23 Tom Lane : > > Jason Dusek writes: > >> CREATE TABLE kv > >> ( k bytea NOT NULL, > >> at timestamptz NOT NULL, > >> realm bytea NOT NULL, > >> v bytea NOT NULL ); > >> CREATE INDEX ON kv USING hash(k); > >> CREATE INDEX ON kv (t); > >> CREATE INDEX ON kv USING hash(realm); > > > >> SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1; > > > > If you want to make that fast, an index on (k,realm,at) would > > help. Those indexes that you did create are next to useless > > for this, and furthermore hash indexes are quite unsafe for > > production. > > Why are the individual indices not useful? The tests that the > query does -- equality on key and realm and ordering on at -- > are each supported by indices. Does it have to do with the cost > of loading the three indices? I'm not entirely sure, but I'll take a stab at it. I think it has to do with the fact that you want order. Combining multiple indexes so you use them at the same time works as an BitmapAnd. That is, it uses each index to determine blocks that are interesting and then find the blocks that are listed by all tindexes, and then it loads the blocks and chcks them. The problem here is that you want ORDER BY at, which makes the above scheme fall apart, because order is not preversed. So it falls back on either scanning the 'at' index and probing checking the rows to see if they match, or using all indexes, and then sorting the result. In theory you could BitmapAnd the 'k' and 'realm' indexes and then scan the 'at' index only checking rows that the bitmap shows are interesting. But I'm not sure if postgres can do that. Anyway, the suggested three column index will match your query in a single lookup and hence be much faster than any of the above suggestions, so if this is a really important query then it may be worth it here. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
Re: [GENERAL] Rules, Windows and ORDER BY
2012/8/23 Tom Lane : > Jason Dusek writes: >> I have a simple table of keys and values which periodically >> receives updated values. It's desirable to keep older values >> but, most of the time, we query only for the latest value of a >> particular key. > >> CREATE TABLE kv >> ( k bytea NOT NULL, >> at timestamptz NOT NULL, >> realm bytea NOT NULL, >> v bytea NOT NULL ); >> CREATE INDEX ON kv USING hash(k); >> CREATE INDEX ON kv (t); >> CREATE INDEX ON kv USING hash(realm); > >> SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1; > > If you want to make that fast, an index on (k,realm,at) would > help. Those indexes that you did create are next to useless > for this, and furthermore hash indexes are quite unsafe for > production. Thanks for pointing out the unsafety of hash indexes. I think I got in the habit of using them for a project with large, temporary data sets. Why are the individual indices not useful? The tests that the query does -- equality on key and realm and ordering on at -- are each supported by indices. Does it have to do with the cost of loading the three indices? -- Jason Dusek pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Rules, Windows and ORDER BY
Jason Dusek writes: > I have a simple table of keys and values which periodically > receives updated values. It's desirable to keep older values > but, most of the time, we query only for the latest value of a > particular key. > CREATE TABLE kv > ( k bytea NOT NULL, > at timestamptz NOT NULL, > realm bytea NOT NULL, > v bytea NOT NULL ); > CREATE INDEX ON kv USING hash(k); > CREATE INDEX ON kv (t); > CREATE INDEX ON kv USING hash(realm); > SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1; If you want to make that fast, an index on (k,realm,at) would help. Those indexes that you did create are next to useless for this, and furthermore hash indexes are quite unsafe for production. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general