On 10/17/14, 4:49 PM, Marko Tiikkaja wrote:
On 10/17/14, 11:47 PM, Tom Lane wrote:
Marko Tiikkaja <ma...@joh.to> writes:
This week we had one of the most annoying problems I've ever encountered
with postgres.  We had a big index on multiple columns, say,  foo(a, b,
c).  According to pg_stat_all_indexes the index was being used *all the
time*.  However, after looking into our queries more closely, it turns
out that it was only being used to look up statistics for the foo.a
column to estimate merge scan viability during planning.  But this took
hours for two people to track down.

So what I'd like to have is a way to be able to distinguish between
indexes being used to answer queries, and ones being only used for stats
lookups during planning.

Why?  Used is used.

Because I don't need a 30GB index on foo(a,b,c) to look up statistics. If I 
ever have a problem, I can replace it with a 5GB one on foo(a).

That problem can exist with user queries too. Perhaps it would be better to 
find a way to count scans that didn't use all the fields in the index.

I do also see value in differentiating planning use from real query processing; 
not doing that can certainly cause confusion. What I don't know is if the added 
stats bloat is worth it. If we do go down that road, I think it'd be better to 
add an indicator to EState. Aside from allowing stats for all planning access, 
it should make it less likely that someone adds a new access path and forgets 
to mark it as internal (especially if the added field defaults to an invalid 
value).
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to