Re: [GENERAL] Queries for unused/useless indexes

2015-05-26 Thread William Dunn
The query I previously sent was table level. Here is an index level one:
SELECT pg_stat_user_indexes.schemaname,
   pg_stat_user_indexes.relname,
   pg_stat_user_indexes.indexrelid,
   pg_stat_user_indexes.indexrelname,
   pg_stat_user_indexes.idx_scan,
   pg_stat_user_tables.seq_scan,
   (100 * pg_stat_user_indexes.idx_scan / (pg_stat_user_tables.seq_scan
+ pg_stat_user_indexes.idx_scan)) AS perc_idx_used
FROM pg_stat_user_indexes INNER JOIN pg_stat_user_tables ON
pg_stat_user_indexes.relid = pg_stat_user_tables.relid
WHERE pg_relation_size(pg_stat_user_indexes.relid)(5*8192)
  AND NOT ((pg_stat_user_indexes.idx_scan=0
OR pg_stat_user_indexes.idx_scan=NULL)
   AND pg_stat_user_tables.seq_scan=0)
ORDER BY perc_idx_used;

*Will J. Dunn*
*willjdunn.com http://willjdunn.com*

On Tue, May 26, 2015 at 10:31 AM, William Dunn dunn...@gmail.com wrote:

 Melvin - thanks for sharing.

 Here is the query I use which lists the percent of queries against the
 table which use the index ordered by least used first.

 The 'pg_relation_size(relid)(5*8192)' is used to remove any tables that
 would be so small the optimizer would just choose a table scan.

 SELECT schemaname,
relname,
idx_scan,
seq_scan,
(100 * idx_scan / (seq_scan + idx_scan)) AS perc_idx_used
 FROM pg_stat_user_tables
 WHERE pg_relation_size(relid)(5*8192)
   AND NOT ((idx_scan=0
 OR idx_scan=NULL)
AND seq_scan=0)
 ORDER BY perc_idx_used;

 *Will J. Dunn*
 *willjdunn.com http://willjdunn.com*

 On Mon, May 25, 2015 at 12:39 PM, Peter J. Holzer h...@hjp.at wrote:

 On 2015-05-25 12:25:01 -0400, Melvin Davidson wrote:
  I'm not sure why you are using pg_stat_user_indexes.

 Because you did. I didn't change that.

  My original query below
  uses pg_stat_all_indexes and the schema names are joined and it does
 work.

 I'm not sure what you mean by original, but this:

  SELECT n.nspname as schema,
 i.relname as table,
 i.indexrelname as index,
 i.idx_scan,
 i.idx_tup_read,
 i.idx_tup_fetch,
 pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
  quote_ident(i.relname))) AS table_size,
 pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
  quote_ident(i.indexrelname))) AS index_size,
 pg_get_indexdef(idx.indexrelid) as idx_definition
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
   WHERE i.idx_scan  200
 AND NOT idx.indisprimary
 AND NOT idx.indisunique
   ORDER BY 1, 2, 3;

 is not the query you posted in your original message.

 Here is what you posted:

  On Mon, May 25, 2015 at 10:41 AM, Peter J. Holzer hjp-pg...@hjp.at
 wrote:
 
  On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
   I'd like to share those queries with the community, as I know
 there must
  be
   others out there with the same problem.
  
   /* useless_indexes.sql */
   SELECT
  idstat.schemaname AS schema,
  idstat.relname AS table_name,
  indexrelname AS index_name,
  idstat.idx_scan AS times_used,
  
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
  '.' ||
   quote_ident(idstat.relname))) AS table_size,
  
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
  '.' ||
   quote_ident(indexrelname))) AS index_size,
  n_tup_upd + n_tup_ins + n_tup_del as num_writes,
  indexdef AS definition
   FROM pg_stat_user_indexes AS idstat
   JOIN pg_indexes ON indexrelname = indexname
   JOIN pg_stat_user_tables AS tabstat ON idstat.relname =
 tabstat.relname
   WHERE idstat.idx_scan  200
   AND indexdef !~* 'unique'
   ORDER BY idstat.schemaname,
idstat.relname,
indexrelname;

 --
_  | Peter J. Holzer| I want to forget all about both belts and
 |_|_) || suspenders; instead, I want to buy pants
 | |   | h...@hjp.at | that actually fit.
 __/   | http://www.hjp.at/ |   -- http://noncombatant.org/





Re: [GENERAL] Queries for unused/useless indexes

2015-05-26 Thread William Dunn
Melvin - thanks for sharing.

Here is the query I use which lists the percent of queries against the
table which use the index ordered by least used first.

The 'pg_relation_size(relid)(5*8192)' is used to remove any tables that
would be so small the optimizer would just choose a table scan.

SELECT schemaname,
   relname,
   idx_scan,
   seq_scan,
   (100 * idx_scan / (seq_scan + idx_scan)) AS perc_idx_used
FROM pg_stat_user_tables
WHERE pg_relation_size(relid)(5*8192)
  AND NOT ((idx_scan=0
OR idx_scan=NULL)
   AND seq_scan=0)
ORDER BY perc_idx_used;

*Will J. Dunn*
*willjdunn.com http://willjdunn.com*

On Mon, May 25, 2015 at 12:39 PM, Peter J. Holzer h...@hjp.at wrote:

 On 2015-05-25 12:25:01 -0400, Melvin Davidson wrote:
  I'm not sure why you are using pg_stat_user_indexes.

 Because you did. I didn't change that.

  My original query below
  uses pg_stat_all_indexes and the schema names are joined and it does
 work.

 I'm not sure what you mean by original, but this:

  SELECT n.nspname as schema,
 i.relname as table,
 i.indexrelname as index,
 i.idx_scan,
 i.idx_tup_read,
 i.idx_tup_fetch,
 pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
  quote_ident(i.relname))) AS table_size,
 pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
  quote_ident(i.indexrelname))) AS index_size,
 pg_get_indexdef(idx.indexrelid) as idx_definition
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
   WHERE i.idx_scan  200
 AND NOT idx.indisprimary
 AND NOT idx.indisunique
   ORDER BY 1, 2, 3;

 is not the query you posted in your original message.

 Here is what you posted:

  On Mon, May 25, 2015 at 10:41 AM, Peter J. Holzer hjp-pg...@hjp.at
 wrote:
 
  On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
   I'd like to share those queries with the community, as I know
 there must
  be
   others out there with the same problem.
  
   /* useless_indexes.sql */
   SELECT
  idstat.schemaname AS schema,
  idstat.relname AS table_name,
  indexrelname AS index_name,
  idstat.idx_scan AS times_used,
  
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
  '.' ||
   quote_ident(idstat.relname))) AS table_size,
  
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
  '.' ||
   quote_ident(indexrelname))) AS index_size,
  n_tup_upd + n_tup_ins + n_tup_del as num_writes,
  indexdef AS definition
   FROM pg_stat_user_indexes AS idstat
   JOIN pg_indexes ON indexrelname = indexname
   JOIN pg_stat_user_tables AS tabstat ON idstat.relname =
 tabstat.relname
   WHERE idstat.idx_scan  200
   AND indexdef !~* 'unique'
   ORDER BY idstat.schemaname,
idstat.relname,
indexrelname;

 --
_  | Peter J. Holzer| I want to forget all about both belts and
 |_|_) || suspenders; instead, I want to buy pants
 | |   | h...@hjp.at | that actually fit.
 __/   | http://www.hjp.at/ |   -- http://noncombatant.org/



Re: [GENERAL] Queries for unused/useless indexes

2015-05-26 Thread Melvin Davidson
Will,

Much thanks. Let's keep up the sharing with the community.

On Tue, May 26, 2015 at 11:32 AM, William Dunn dunn...@gmail.com wrote:

 The query I previously sent was table level. Here is an index level one:
 SELECT pg_stat_user_indexes.schemaname,
pg_stat_user_indexes.relname,
pg_stat_user_indexes.indexrelid,
pg_stat_user_indexes.indexrelname,
pg_stat_user_indexes.idx_scan,
pg_stat_user_tables.seq_scan,
(100 * pg_stat_user_indexes.idx_scan /
 (pg_stat_user_tables.seq_scan + pg_stat_user_indexes.idx_scan)) AS
 perc_idx_used
 FROM pg_stat_user_indexes INNER JOIN pg_stat_user_tables ON
 pg_stat_user_indexes.relid = pg_stat_user_tables.relid
 WHERE pg_relation_size(pg_stat_user_indexes.relid)(5*8192)
   AND NOT ((pg_stat_user_indexes.idx_scan=0
 OR pg_stat_user_indexes.idx_scan=NULL)
AND pg_stat_user_tables.seq_scan=0)
 ORDER BY perc_idx_used;

 *Will J. Dunn*
 *willjdunn.com http://willjdunn.com*

 On Tue, May 26, 2015 at 10:31 AM, William Dunn dunn...@gmail.com wrote:

 Melvin - thanks for sharing.

 Here is the query I use which lists the percent of queries against the
 table which use the index ordered by least used first.

 The 'pg_relation_size(relid)(5*8192)' is used to remove any tables that
 would be so small the optimizer would just choose a table scan.

 SELECT schemaname,
relname,
idx_scan,
seq_scan,
(100 * idx_scan / (seq_scan + idx_scan)) AS perc_idx_used
 FROM pg_stat_user_tables
 WHERE pg_relation_size(relid)(5*8192)
   AND NOT ((idx_scan=0
 OR idx_scan=NULL)
AND seq_scan=0)
 ORDER BY perc_idx_used;

 *Will J. Dunn*
 *willjdunn.com http://willjdunn.com*

 On Mon, May 25, 2015 at 12:39 PM, Peter J. Holzer h...@hjp.at wrote:

 On 2015-05-25 12:25:01 -0400, Melvin Davidson wrote:
  I'm not sure why you are using pg_stat_user_indexes.

 Because you did. I didn't change that.

  My original query below
  uses pg_stat_all_indexes and the schema names are joined and it does
 work.

 I'm not sure what you mean by original, but this:

  SELECT n.nspname as schema,
 i.relname as table,
 i.indexrelname as index,
 i.idx_scan,
 i.idx_tup_read,
 i.idx_tup_fetch,
 pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
  quote_ident(i.relname))) AS table_size,
 pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
  quote_ident(i.indexrelname))) AS index_size,
 pg_get_indexdef(idx.indexrelid) as idx_definition
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
   WHERE i.idx_scan  200
 AND NOT idx.indisprimary
 AND NOT idx.indisunique
   ORDER BY 1, 2, 3;

 is not the query you posted in your original message.

 Here is what you posted:

  On Mon, May 25, 2015 at 10:41 AM, Peter J. Holzer hjp-pg...@hjp.at
 wrote:
 
  On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
   I'd like to share those queries with the community, as I know
 there must
  be
   others out there with the same problem.
  
   /* useless_indexes.sql */
   SELECT
  idstat.schemaname AS schema,
  idstat.relname AS table_name,
  indexrelname AS index_name,
  idstat.idx_scan AS times_used,
  
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
  '.' ||
   quote_ident(idstat.relname))) AS table_size,
  
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
  '.' ||
   quote_ident(indexrelname))) AS index_size,
  n_tup_upd + n_tup_ins + n_tup_del as num_writes,
  indexdef AS definition
   FROM pg_stat_user_indexes AS idstat
   JOIN pg_indexes ON indexrelname = indexname
   JOIN pg_stat_user_tables AS tabstat ON idstat.relname =
 tabstat.relname
   WHERE idstat.idx_scan  200
   AND indexdef !~* 'unique'
   ORDER BY idstat.schemaname,
idstat.relname,
indexrelname;

 --
_  | Peter J. Holzer| I want to forget all about both belts and
 |_|_) || suspenders; instead, I want to buy pants
 | |   | h...@hjp.at | that actually fit.
 __/   | http://www.hjp.at/ |   -- http://noncombatant.org/






-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Queries for unused/useless indexes

2015-05-25 Thread Peter J. Holzer
On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
 I'd like to share those queries with the community, as I know there must be
 others out there with the same problem.
 
 /* useless_indexes.sql */
 SELECT
    idstat.schemaname AS schema,
    idstat.relname AS table_name,
    indexrelname AS index_name,
    idstat.idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' 
 ||
 quote_ident(idstat.relname))) AS table_size,
    pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) || '.' 
 ||
 quote_ident(indexrelname))) AS index_size,
    n_tup_upd + n_tup_ins + n_tup_del as num_writes,
    indexdef AS definition
 FROM pg_stat_user_indexes AS idstat
 JOIN pg_indexes ON indexrelname = indexname
 JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
 WHERE idstat.idx_scan  200
 AND indexdef !~* 'unique'
 ORDER BY idstat.schemaname,
  idstat.relname,
  indexrelname;

Thanks, that's useful.

However, it doesn't quite work if there are indexes with the same name
in different schemas. Better join on the schemaname, too:

FROM pg_stat_user_indexes AS idstat
JOIN pg_indexes AS idx ON indexrelname = indexname and idstat.schemaname = 
idx.schemaname
JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname and 
idstat.schemaname = tabstat.schemaname

(for some reason that makes it a lot slower, though)

hp

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Queries for unused/useless indexes

2015-05-25 Thread Melvin Davidson
I'm not sure why you are using pg_stat_user_indexes. My original query
below uses pg_stat_all_indexes and the schema names are joined and it
does work.


SELECT n.nspname as schema,
   i.relname as table,
   i.indexrelname as index,
   i.idx_scan,
   i.idx_tup_read,
   i.idx_tup_fetch,
   pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
quote_ident(i.relname))) AS table_size,
   pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
quote_ident(i.indexrelname))) AS index_size,
   pg_get_indexdef(idx.indexrelid) as idx_definition
  FROM pg_stat_all_indexes i
  JOIN pg_class c ON (c.oid = i.relid)
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
  JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
 WHERE i.idx_scan  200
   AND NOT idx.indisprimary
   AND NOT idx.indisunique
 ORDER BY 1, 2, 3;

On Mon, May 25, 2015 at 10:41 AM, Peter J. Holzer hjp-pg...@hjp.at wrote:

 On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
  I'd like to share those queries with the community, as I know there must
 be
  others out there with the same problem.
 
  /* useless_indexes.sql */
  SELECT
 idstat.schemaname AS schema,
 idstat.relname AS table_name,
 indexrelname AS index_name,
 idstat.idx_scan AS times_used,
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' ||
  quote_ident(idstat.relname))) AS table_size,
 pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' ||
  quote_ident(indexrelname))) AS index_size,
 n_tup_upd + n_tup_ins + n_tup_del as num_writes,
 indexdef AS definition
  FROM pg_stat_user_indexes AS idstat
  JOIN pg_indexes ON indexrelname = indexname
  JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
  WHERE idstat.idx_scan  200
  AND indexdef !~* 'unique'
  ORDER BY idstat.schemaname,
   idstat.relname,
   indexrelname;

 Thanks, that's useful.

 However, it doesn't quite work if there are indexes with the same name
 in different schemas. Better join on the schemaname, too:

 FROM pg_stat_user_indexes AS idstat
 JOIN pg_indexes AS idx ON indexrelname = indexname and
 idstat.schemaname = idx.schemaname
 JOIN pg_stat_user_tables AS tabstat ON idstat.relname =
 tabstat.relname and idstat.schemaname = tabstat.schemaname

 (for some reason that makes it a lot slower, though)

 hp

 --
_  | Peter J. Holzer| I want to forget all about both belts and
 |_|_) || suspenders; instead, I want to buy pants
 | |   | h...@hjp.at | that actually fit.
 __/   | http://www.hjp.at/ |   -- http://noncombatant.org/




-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Queries for unused/useless indexes

2015-05-25 Thread Peter J. Holzer
On 2015-05-25 12:25:01 -0400, Melvin Davidson wrote:
 I'm not sure why you are using pg_stat_user_indexes.

Because you did. I didn't change that.

 My original query below
 uses pg_stat_all_indexes and the schema names are joined and it does work.

I'm not sure what you mean by original, but this:

 SELECT n.nspname as schema,
    i.relname as table,
    i.indexrelname as index,
    i.idx_scan,
    i.idx_tup_read,
    i.idx_tup_fetch,
    pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
 quote_ident(i.relname))) AS table_size,
    pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
 quote_ident(i.indexrelname))) AS index_size,
    pg_get_indexdef(idx.indexrelid) as idx_definition
   FROM pg_stat_all_indexes i
   JOIN pg_class c ON (c.oid = i.relid)
   JOIN pg_namespace n ON (n.oid = c.relnamespace)
   JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
  WHERE i.idx_scan  200
    AND NOT idx.indisprimary
    AND NOT idx.indisunique
  ORDER BY 1, 2, 3;

is not the query you posted in your original message. 

Here is what you posted:

 On Mon, May 25, 2015 at 10:41 AM, Peter J. Holzer hjp-pg...@hjp.at wrote:
 
 On 2015-05-22 09:41:57 -0400, Melvin Davidson wrote:
  I'd like to share those queries with the community, as I know there must
 be
  others out there with the same problem.
 
  /* useless_indexes.sql */
  SELECT
     idstat.schemaname AS schema,
     idstat.relname AS table_name,
     indexrelname AS index_name,
     idstat.idx_scan AS times_used,
     pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' ||
  quote_ident(idstat.relname))) AS table_size,
     pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' ||
  quote_ident(indexrelname))) AS index_size,
     n_tup_upd + n_tup_ins + n_tup_del as num_writes,
     indexdef AS definition
  FROM pg_stat_user_indexes AS idstat
  JOIN pg_indexes ON indexrelname = indexname
  JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
  WHERE idstat.idx_scan  200
  AND indexdef !~* 'unique'
  ORDER BY idstat.schemaname,
   idstat.relname,
   indexrelname;

-- 
   _  | Peter J. Holzer| I want to forget all about both belts and
|_|_) || suspenders; instead, I want to buy pants 
| |   | h...@hjp.at | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/


signature.asc
Description: Digital signature


Re: [GENERAL] Queries for unused/useless indexes

2015-05-22 Thread Melvin Davidson
200 is a completely arbitrary value. At the time, I wanted to find indexes
that were sufficiently less used than most others in a highly queried
system. To find indexes that were never used, just change the value to 0.

On Fri, May 22, 2015 at 9:12 PM, Venkata Balaji N nag1...@gmail.com wrote:

 On Fri, May 22, 2015 at 11:41 PM, Melvin Davidson melvin6...@gmail.com
 wrote:


 Over the years I've wrote many scripts and queries to track the database
 status. Recently I've had to convince a client who thought it was a good
 idea to create indexes for every column on every table that it is really a
 bad idea. To do so, I wrote useless_indexes2.sql, which shows every index
 that has never been scanned. They still didn't believe me, so I wrote
 wasted_index_space.sql. That shows how much space is wasted by all the
 unused indexes.

 I'd like to share those queries with the community, as I know there must
 be others out there with the same problem.


 I had a similar problem a few times in the past with some of our clients.
 I completely agree that it is not at all a good idea and we are simply
 inviting an extra over-head from maintenance and performance perspective.

 Indexing every column of the table does not make sense as it is almost
 impossible that every column of the table can have rows with high
 cardinality. That's not typical RDBMS design.

 Generally, most of them believe that, if an Index is unused, though its
 not beneficial at-least its not a harm. That is not correct.

 - Depending on the data-types and cardinality of the columns, Indexes can
 occupy a lot of space and remain unused. This invites maintenance over-head
   (ex: backups and vacuum operations)
 - The biggest problem is, if the table is a heavy-write table, even though
 Indexes are not picked during SELECT, they cannot escape
   WRITES (INSERTS/UPDATES). This is purely an extra and unnecessary I/O.

 /* useless_indexes.sql */
 SELECT
idstat.schemaname AS schema,
idstat.relname AS table_name,
indexrelname AS index_name,
idstat.idx_scan AS times_used,
pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' || quote_ident(idstat.relname))) AS table_size,
pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' || quote_ident(indexrelname))) AS index_size,
n_tup_upd + n_tup_ins + n_tup_del as num_writes,
indexdef AS definition
 FROM pg_stat_user_indexes AS idstat
 JOIN pg_indexes ON indexrelname = indexname
 JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
 WHERE idstat.idx_scan  200
 AND indexdef !~* 'unique'
 ORDER BY idstat.schemaname,
  idstat.relname,
  indexrelname;


 Not sure why do you have 200

 Regards,
 Venkata Balaji N

 Fujitsu Australia




-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Queries for unused/useless indexes

2015-05-22 Thread Venkata Balaji N
On Fri, May 22, 2015 at 11:41 PM, Melvin Davidson melvin6...@gmail.com
wrote:


 Over the years I've wrote many scripts and queries to track the database
 status. Recently I've had to convince a client who thought it was a good
 idea to create indexes for every column on every table that it is really a
 bad idea. To do so, I wrote useless_indexes2.sql, which shows every index
 that has never been scanned. They still didn't believe me, so I wrote
 wasted_index_space.sql. That shows how much space is wasted by all the
 unused indexes.

 I'd like to share those queries with the community, as I know there must
 be others out there with the same problem.


I had a similar problem a few times in the past with some of our clients. I
completely agree that it is not at all a good idea and we are simply
inviting an extra over-head from maintenance and performance perspective.

Indexing every column of the table does not make sense as it is almost
impossible that every column of the table can have rows with high
cardinality. That's not typical RDBMS design.

Generally, most of them believe that, if an Index is unused, though its not
beneficial at-least its not a harm. That is not correct.

- Depending on the data-types and cardinality of the columns, Indexes can
occupy a lot of space and remain unused. This invites maintenance over-head
  (ex: backups and vacuum operations)
- The biggest problem is, if the table is a heavy-write table, even though
Indexes are not picked during SELECT, they cannot escape
  WRITES (INSERTS/UPDATES). This is purely an extra and unnecessary I/O.

/* useless_indexes.sql */
 SELECT
idstat.schemaname AS schema,
idstat.relname AS table_name,
indexrelname AS index_name,
idstat.idx_scan AS times_used,
pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' || quote_ident(idstat.relname))) AS table_size,
pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
 '.' || quote_ident(indexrelname))) AS index_size,
n_tup_upd + n_tup_ins + n_tup_del as num_writes,
indexdef AS definition
 FROM pg_stat_user_indexes AS idstat
 JOIN pg_indexes ON indexrelname = indexname
 JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
 WHERE idstat.idx_scan  200
 AND indexdef !~* 'unique'
 ORDER BY idstat.schemaname,
  idstat.relname,
  indexrelname;


Not sure why do you have 200

Regards,
Venkata Balaji N

Fujitsu Australia


[GENERAL] Queries for unused/useless indexes

2015-05-22 Thread Melvin Davidson
Over the years I've wrote many scripts and queries to track the database
status. Recently I've had to convince a client who thought it was a good
idea to create indexes for every column on every table that it is really a
bad idea. To do so, I wrote useless_indexes2.sql, which shows every index
that has never been scanned. They still didn't believe me, so I wrote
wasted_index_space.sql. That shows how much space is wasted by all the
unused indexes.

I'd like to share those queries with the community, as I know there must be
others out there with the same problem.

/* useless_indexes.sql */
SELECT
   idstat.schemaname AS schema,
   idstat.relname AS table_name,
   indexrelname AS index_name,
   idstat.idx_scan AS times_used,
   pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
'.' || quote_ident(idstat.relname))) AS table_size,
   pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
'.' || quote_ident(indexrelname))) AS index_size,
   n_tup_upd + n_tup_ins + n_tup_del as num_writes,
   indexdef AS definition
FROM pg_stat_user_indexes AS idstat
JOIN pg_indexes ON indexrelname = indexname
JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
WHERE idstat.idx_scan  200
AND indexdef !~* 'unique'
ORDER BY idstat.schemaname,
 idstat.relname,
 indexrelname;

/*wasted_index_space.sql
 Requires PostgreSQL 8.4 or greater */
WITH s AS(
  SELECT SUM(pg_relation_size(quote_ident(n.nspname) || '.' ||
quote_ident(c.relname))::bigint) AS table_size,
   pg_size_pretty(SUM(pg_relation_size(quote_ident(n.nspname) || '.' ||
quote_ident(c.relname))::bigint)::bigint) AS table_size_pretty
  FROM pg_class c
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
   WHERE c.relkind = 'r'
   AND c.relname NOT LIKE 'pg_%'
   AND c.relname NOT LIKE 'sql%'
)
SELECT s.table_size,
   s.table_size_pretty,
   SUM(pg_relation_size(quote_ident(n.nspname) || '.' ||
quote_ident(i.indexrelname))::bigint) AS unused_idx_size,
   pg_size_pretty(SUM(pg_relation_size(quote_ident(n.nspname) || '.' ||
quote_ident(i.indexrelname))::bigint)::bigint) AS unused_idx_size_pretty,
   pg_database_size(current_database()) as db_size,
   pg_size_pretty(pg_database_size(current_database()))as
db_size_pretty,
   pg_size_pretty(pg_database_size(current_database()) -
SUM(pg_relation_size(quote_ident(n.nspname) || '.' ||
quote_ident(i.indexrelname))::bigint)::bigint) as db_minus_wasted_space

  FROM s, pg_stat_all_indexes i
  JOIN pg_class c ON (c.oid = i.relid)
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
  JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
 WHERE i.idx_scan = 0
   AND NOT idx.indisprimary
   AND NOT idx.indisunique
  GROUP BY table_size, table_size_pretty;


*Melvin Davidson*