Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Torsten Förtsch
On 20/04/14 03:02, Sergey Konoplev wrote:

Thanks for your reply.

>> an index can be INVALID (pg_index.indisvalid=false).
>>
>> I want to temporarily disable an index so that it won't be used to
>> access data but will still be updated.
>>
>> Can I simply set pg_index.indisvalid=false and later turn it true again?
>
> It works on a quick test, but I'm not sure how safe it is.
>
> If you need to test a query without the index, use a transaction:
>
> BEGIN;
> DROP INDEX ...;
> EXPLAIN ... SELECT ...;
> ROLLBACK;
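For the record, the quick test mentioned above amounts to flipping the catalog flag directly. This is an unsupported, superuser-only catalog hack; the sketch below is illustrative only, and the index name is made up:

```sql
-- Unsupported catalog hack: the planner stops considering the index,
-- but it is still maintained on writes.
UPDATE pg_index SET indisvalid = false
WHERE indexrelid = 'some_index'::regclass;

-- Later, make it visible to the planner again:
UPDATE pg_index SET indisvalid = true
WHERE indexrelid = 'some_index'::regclass;
```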

I know about that.

The problem is I have a number of indexes in a large system that are
very similar. And I suspect some of them are superfluous.

Example:

btree (fmb_id, action_type)
btree (fmb_id)

Action_type in this case is one out of a handful of values (should
perhaps be an ENUM but is TEXT) and for most of the table the
combination of (fmb_id, action_type) is unique. The table itself has
~2E8 rows. So it takes a while to build these indexes from scratch.

Now, we have several performance counters in place. I want to disable
these indexes one by one and see what happens. I am probably not able to
find all of the queries that use them. But I believe that nothing much
happens if I drop one of them (preferably the former?).

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Gavin Flower

On 20/04/14 20:09, Torsten Förtsch wrote:

> On 20/04/14 03:02, Sergey Konoplev wrote:
>
> Thanks for your reply.
>
>>> an index can be INVALID (pg_index.indisvalid=false).
>>>
>>> I want to temporarily disable an index so that it won't be used to
>>> access data but will still be updated.
>>>
>>> Can I simply set pg_index.indisvalid=false and later turn it true again?
>>
>> It works on a quick test, but I'm not sure how safe it is.
>>
>> If you need to test a query without the index, use a transaction:
>>
>> BEGIN;
>> DROP INDEX ...;
>> EXPLAIN ... SELECT ...;
>> ROLLBACK;
>
> I know about that.
>
> The problem is I have a number of indexes in a large system that are
> very similar. And I suspect some of them are superfluous.
>
> Example:
>
>   btree (fmb_id, action_type)
>   btree (fmb_id)
>
> Action_type in this case is one out of a handful of values (should
> perhaps be an ENUM but is TEXT) and for most of the table the
> combination of (fmb_id, action_type) is unique. The table itself has
> ~2E8 rows. So it takes a while to build these indexes from scratch.
>
> Now, we have several performance counters in place. I want to disable
> these indexes one by one and see what happens. I am probably not able to
> find all of the queries that use them. But I believe that nothing much
> happens if I drop one of them (preferably the former?).
>
> Torsten


You might be best off dropping all indexes, then initially adding them 
back individually, then in pairs, etc.  Each index that is used has to 
be read in from disk and consumes RAM. The more RAM that is used to hold 
indexes, the less RAM there is to hold table data and other stuff 
related to your queries.


Also, PostgreSQL can use multiple indexes in one query.  For example, if you 
have an index on column A and another on column B, then a query with 
restrictions on columns A, B, and C might (if the values searched for in 
A & B each hit a very low fraction of pages) result in the indexes for A 
& B being used to select which pages to read in to check on the 
values for C.


So sometimes an index on A & B plus an index on C & D could be used for 
a query that has restrictions on A, B, C, & E - as the 2 indexes can be 
used to search on the values of A, B, & C, so restricting the pages 
needing to be read to check on E.  So possibly, depending on your data 
and your query mix, an index on A, B, C, & E may not be needed in 
addition to the other 2 indexes - or just the latter index would be 
useful - or possibly all 3 indexes.


So in your example, you might be better off with the two indexes: btree 
(fmb_id) & btree (action_type) - or just the one index: btree (fmb_id, 
action_type) - or all 3!
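The index-combination behaviour described above can be checked with EXPLAIN; a sketch with made-up table, index, and predicate values:

```sql
-- Two single-column indexes on a hypothetical table:
CREATE INDEX ON some_table (fmb_id);
CREATE INDEX ON some_table (action_type);

-- The planner may AND both together (look for a BitmapAnd node in the plan):
EXPLAIN SELECT * FROM some_table WHERE fmb_id = 42 AND action_type = 'buy';
```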


How much effort you should put in depends on the trade-off between the 
cost of testing and the time saved!


Can you get any insights from looking at the existing queries?


Cheers,
Gavin




Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Thomas Kellerer

Torsten Förtsch wrote on 20.04.2014 10:09:

The problem is I have a number of indexes in a large system that are
very similar. And I suspect some of them are superfluous.

Example:

 btree (fmb_id, action_type)
 btree (fmb_id)

Action_type in this case is one out of a handful of values (should
perhaps be an ENUM but is TEXT) and for most of the table the
combination of (fmb_id, action_type) is unique. The table itself has
~2E8 rows. So it takes a while to build these indexes from scratch.

Now, we have several performance counters in place. I want to disable
these indexes one by one and see what happens. I am probably not able to
find all of the queries that use them. But I believe that nothing much
happens if I drop one of them (preferably the former?).


What about monitoring pg_stat_all_indexes to see if an index is used:

http://www.postgresql.org/docs/current/static/monitoring-stats.html#PG-STAT-ALL-INDEXES-VIEW

Btw: in the above example the second one is definitely not needed.
Any query that uses the second one (single column) can also use the first one.
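For example, a query along these lines (a sketch; adjust the schema filter to taste) lists the least-used indexes first:

```sql
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_all_indexes
WHERE schemaname NOT IN ('pg_catalog', 'pg_toast')
ORDER BY idx_scan ASC;
```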






Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Torsten Förtsch
On 20/04/14 12:08, Thomas Kellerer wrote:
>> Example:
>>
>>  btree (fmb_id, action_type)
>>  btree (fmb_id)
>
> [...]
>
> Btw: in the above example the second one is definitely not needed.
> Any query that uses the second one (single column) can also use the
> first one.

I know.

But the single-column index is 3534 MB, the 2-column one 4963 MB. The
number of rows per distinct fmb_id is very small, usually 1 or 2. So, if
a query looks for fmb_id=A and action_type=B, it has to filter out
only one row if the first index is not available. Hence, I thought maybe
the lower size of the index and the higher compactness per page could
outweigh the more direct access provided by the 2-column index.
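Sizes like these can be read straight from the statistics views; a sketch (the table name is made up):

```sql
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_all_indexes
WHERE relname = 'some_table';
```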

I am quite sure there is no query that qualifies for an index-only scan
on the 2-column index.

Torsten




[GENERAL] Altering array(composite-types) without breaking code when inserting them and similar questions

2014-04-20 Thread Dorian Hoxha
Hi list,

I have a
CREATE TYPE thetype AS (width integer, height integer);
CREATE TABLE mytable (thetype thetype[]);

How can I make an INSERT statement so that if I later add fields to the
composite type, the code/query doesn't break?
Maybe by specifying the fields of the composite type in the query?

This can be done for normal (non-array) inserts:
CREATE TABLE mytable (t thetype);
INSERT INTO mytable (t.width, t.height) VALUES (11, 22);


Also, how do I update a whole element of an array of composites?
And how do I update an attribute of a specific element in an array of
composites?

(so when I add columns later to the composite, my old code doesn't break)

How much overhead do composite types have besides the values and nulls?

Thanks


[GENERAL] Re: Altering array(composite-types) without breaking code when inserting them and similar questions

2014-04-20 Thread David G Johnston
Dorian Hoxha wrote
 Hi list,
 
 I have a
 create type thetype(width integer, height integer);
 create table mytable(thetype thetype[]);
 
 How can i make an insert statement so if i later add fields to the
 composite type, the code/query doesn't break ?
 Maybe by specifying the fields of the composite type in the query ?
 
 This can be done for normal inserts(non arrays):
 CREATE TABLE mytable (t thetype);
 INSERT INTO mytable(t.width, t.height) VALUES (11,22);
 
 
 Also how to update an whole element of an array of composites ?
 Also, how to update an attribute in a specific element in an array of
 composites?
 
 (so when i add columns later to the composite, my old code doesn't break)
 
 How much overhead have the composite types beside the values and nulls?
 
 Thanks

You should probably just use a table...

I'm doubtful you can alter the type without breaking code.

You can replace an entire single array element as described in the
documentation.  You cannot update single attributes.
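Using the definitions from the original question, the documented element replacement looks roughly like this (a sketch; note that composite literals must be cast to the type):

```sql
-- Insert a one-element array of composites:
INSERT INTO mytable (thetype) VALUES (ARRAY[ROW(11, 22)::thetype]);

-- Replace an entire element (array subscripts are 1-based):
UPDATE mytable SET thetype[1] = ROW(30, 40)::thetype;
```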

As to the overhead question what are you comparing against and what do you
want to measure?

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Altering-array-composite-types-without-breaking-code-when-inserting-them-and-similar-questions-tp5800847p5800848.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.




Re: [GENERAL] Altering array(composite-types) without breaking code when inserting them and similar questions

2014-04-20 Thread Fede Martinez
If you don't know the columns your type will have, you could consider using
json or hstore if the data is unstructured.
On 20/04/2014 14:04, Dorian Hoxha dorian.ho...@gmail.com wrote:

 Hi list,

 I have a
 create type thetype(width integer, height integer);
 create table mytable(thetype thetype[]);

 How can i make an insert statement so if i later add fields to the
 composite type, the code/query doesn't break ?
 Maybe by specifying the fields of the composite type in the query ?

 This can be done for normal inserts(non arrays):
 CREATE TABLE mytable (t thetype);
 INSERT INTO mytable(t.width, t.height) VALUES (11,22);


 Also how to update an whole element of an array of composites ?
 Also, how to update an attribute in a specific element in an array of
 composites?

 (so when i add columns later to the composite, my old code doesn't break)

 How much overhead have the composite types beside the values and nulls?

 Thanks



Re: [GENERAL] Altering array(composite-types) without breaking code when inserting them and similar questions

2014-04-20 Thread Dorian Hoxha
Was just curious about the overhead.

I know the columns, but I may need to add other columns in the future.
Yeah, json is the alternative if this doesn't work.



On Sun, Apr 20, 2014 at 7:54 PM, Fede Martinez
federicoemarti...@gmail.comwrote:

 If you don't know the columns your type will have, you could consider
 using json or hstore if the data is unstructured.
 On 20/04/2014 14:04, Dorian Hoxha dorian.ho...@gmail.com wrote:

 Hi list,

 I have a
 create type thetype(width integer, height integer);
 create table mytable(thetype thetype[]);

 How can i make an insert statement so if i later add fields to the
 composite type, the code/query doesn't break ?
 Maybe by specifying the fields of the composite type in the query ?

 This can be done for normal inserts(non arrays):
 CREATE TABLE mytable (t thetype);
 INSERT INTO mytable(t.width, t.height) VALUES (11,22);


 Also how to update an whole element of an array of composites ?
 Also, how to update an attribute in a specific element in an array of
 composites?

 (so when i add columns later to the composite, my old code doesn't break)

 How much overhead have the composite types beside the values and nulls?

 Thanks




Re: [GENERAL] Altering array(composite-types) without breaking code when inserting them and similar questions

2014-04-20 Thread Rob Sargent
Why do you think you need an array of theType vs. a dependent table of theType?
This tack is of course immune to most future type changes.

Sent from my iPhone

 On Apr 20, 2014, at 11:57 AM, Dorian Hoxha dorian.ho...@gmail.com wrote:
 
 Was just curious about the overhead.
 
 I know the columns, but i may need to add other columns in the future.
 Yeah, json is the alternative if this doesn't work.
 
 
 
 On Sun, Apr 20, 2014 at 7:54 PM, Fede Martinez federicoemarti...@gmail.com 
 wrote:
 If you don't know the columns your type will have, you could consider using 
 json or hstore if the data is unstructured.
 
 On 20/04/2014 14:04, Dorian Hoxha dorian.ho...@gmail.com wrote:
 
 Hi list,
 
 I have a 
 create type thetype(width integer, height integer);
 create table mytable(thetype thetype[]);
 
 How can i make an insert statement so if i later add fields to the 
 composite type, the code/query doesn't break ? 
 Maybe by specifying the fields of the composite type in the query ?
 
 This can be done for normal inserts(non arrays):
 CREATE TABLE mytable (t thetype);
 INSERT INTO mytable(t.width, t.height) VALUES (11,22);
 
 
 Also how to update an whole element of an array of composites ? 
 Also, how to update an attribute in a specific element in an array of 
 composites?
 
 (so when i add columns later to the composite, my old code doesn't break) 
 
 How much overhead have the composite types beside the values and nulls?
 
 Thanks
 


Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Jeff Janes
On Apr 20, 2014 4:21 AM, Torsten Förtsch torsten.foert...@gmx.net wrote:

 On 20/04/14 12:08, Thomas Kellerer wrote:
  Example:
 
   btree (fmb_id, action_type)
   btree (fmb_id)
 
 [...]
 
  Btw: in the above example the second one is definitely not needed.
  Any query that uses the second one (single column) can also use the
  first one.

 I know.

 But the single column index is 3534 MB the 2-column one 4963 MB. The
 number of rows per distinct fmb_id is very small, usually 1 or 2. So, if
 a query looks for fmb_id=A and action_type=B, it has to filter out
 only one row if the first index is not available. Hence, I thought maybe
 the lower size of the index and the higher compactness per page could
 outweigh the more direct access provided by the 2-column index.

I don't think there is a supported way to do that. And since the quasi-dropped
index still has to be maintained (and so probably kept in cache),
the benefits of dropping it may not be observable anyway by your proposed
experiment, unless the table is read-only.

Cheers,

Jeff


Re: [GENERAL] Altering array(composite-types) without breaking code when inserting them and similar questions

2014-04-20 Thread Dorian Hoxha
Because I always query the whole row, and the other way (many tables) I
would always have to join + have other indexes.


On Sun, Apr 20, 2014 at 8:56 PM, Rob Sargent robjsarg...@gmail.com wrote:

 Why do you think you need an array of theType vs. a dependent table of
 theType? This tack is of course immune to most future type changes.

 Sent from my iPhone

 On Apr 20, 2014, at 11:57 AM, Dorian Hoxha dorian.ho...@gmail.com wrote:

 Was just curious about the overhead.

 I know the columns, but i may need to add other columns in the future.
 Yeah, json is the alternative if this doesn't work.



 On Sun, Apr 20, 2014 at 7:54 PM, Fede Martinez 
 federicoemarti...@gmail.com wrote:

 If you don't know the columns your type will have, you could consider
 using json or hstore if the data is unstructured.
 On 20/04/2014 14:04, Dorian Hoxha dorian.ho...@gmail.com wrote:

 Hi list,

 I have a
 create type thetype(width integer, height integer);
 create table mytable(thetype thetype[]);

 How can i make an insert statement so if i later add fields to the
 composite type, the code/query doesn't break ?
 Maybe by specifying the fields of the composite type in the query ?

 This can be done for normal inserts(non arrays):
 CREATE TABLE mytable (t thetype);
 INSERT INTO mytable(t.width, t.height) VALUES (11,22);


 Also how to update an whole element of an array of composites ?
 Also, how to update an attribute in a specific element in an array of
 composites?

 (so when i add columns later to the composite, my old code doesn't
 break)

 How much overhead have the composite types beside the values and nulls?

 Thanks





Re: [GENERAL] Altering array(composite-types) without breaking code when inserting them and similar questions

2014-04-20 Thread Rob Sargent

Sorry, I should not have top-posted (Dang iPhone).  Continued below:
On 04/20/2014 05:54 PM, Dorian Hoxha wrote:
Because i always query the whole row, and in the other way(many 
tables) i will always join + have other indexes.



On Sun, Apr 20, 2014 at 8:56 PM, Rob Sargent robjsarg...@gmail.com 
mailto:robjsarg...@gmail.com wrote:


Why do you think you need an array of theType vs. a dependent table
of theType? This tack is of course immune to most future type
changes.

Sent from my iPhone

Interesting.  Of course any decent mapper will return the whole row. 
And would it be less disk-intensive as an array of structs (where a 
struct is implemented as an array)?  From other threads [1] [2] I've 
come to understand that the datatype overhead per native type is applied 
per type instance, per array element.


[1] 30K floats 
http://postgresql.1045698.n5.nabble.com/Is-it-reasonable-to-store-double-arrays-of-30K-elements-td5790562.html
[2] char array 
http://postgresql.1045698.n5.nabble.com/COPY-v-java-performance-comparison-tc5798389.html


Re: [GENERAL] Non-deterministic 100% CPU hang on postgres 9.3

2014-04-20 Thread Fenn Bailey
Hi all,

I was wondering if anyone had any further insight as to why the query plan
would have changed so dramatically between queries, even when isolated by a
transaction?

Is it possible for database statistics to change after a query has been run
(independent of transactions)?

Thanks in advance,

 Fenn.


On Wed, Apr 16, 2014 at 2:25 PM, Fenn Bailey fenn.bai...@gmail.com wrote:

 Some more information on this:


  I say hang, where really I've given up after ~12 hours execution. The
  exact same query can then be terminated and run in 90 seconds, with
 none
  of the underlying data changing.

 But does the plan change?


 As it turns out it does. Some further examination shows that the plan
 changes, at least partially due to the fact that I now see this query
 attempts to modify the data that it's querying (ie: if any updates have
 occurred, the subsequent query is not the same).

 That said, all queries were performed in a transaction, so when they're
 aborted, no modification should be made to the underlying data.

 This makes it even more odd that on a subsequent run of the same query
 (after the rollback) it will complete in ~140 seconds (vs the several hours
 it was taking on the first run).

 Is it possible that the initial run is changing the statistics available
 on the table, hence providing a different query plan (even on the same
 data)?

 Query plan on first/second run follows -

 First run (failure):
 ---
  Update on ad_events e  (cost=1426714795.64..1426749795.64 rows=100 width=237)
    ->  Hash Join  (cost=1426714795.64..1426749795.64 rows=100 width=237)
          Hash Cond: (x.ad_event_id = e.ad_event_id)
          ->  Subquery Scan on x  (cost=1426638856.33..1426651356.33 rows=100 width=144)
                ->  Sort  (cost=1426638856.33..1426641356.33 rows=100 width=108)
                      Sort Key: s.ad_user_id, s.ad_session_id, t.timestamp
                      ->  Nested Loop Left Join  (cost=12046.09..1426539198.49 rows=100 width=108)
                            Join Filter: ((t.timestamp >= s.session_start) AND (t.timestamp <= s.session_end) AND ((s.ad_user_id = u.ad_user_id) OR (s.ad_user_id = u.merged_id)))
                            ->  Hash Join  (cost=12046.09..203878.09 rows=100 width=92)
                                  Hash Cond: (t.user_id = u.orig_distinct_id)
                                  ->  Seq Scan on ad_events_mv t  (cost=0.00..173082.00 rows=100 width=112)
                                        Filter: (NOT processed)
                                  ->  Hash  (cost=7932.15..7932.15 rows=329115 width=67)
                                        ->  Seq Scan on ad_users u  (cost=0.00..7932.15 rows=329115 width=67)
                            ->  Materialize  (cost=0.00..5478.88 rows=63392 width=32)
                                  ->  Seq Scan on ad_sessions s  (cost=0.00..5161.92 rows=63392 width=32)
          ->  Hash  (cost=50938.58..50938.58 rows=258 width=101)
                ->  Seq Scan on ad_events e  (cost=0.00..50938.58 rows=258 width=101)

 Second run (completed in ~148 seconds):
 --
  Update on ad_events e  (cost=39730372.76..39765372.76 rows=100 width=237)
    ->  Hash Join  (cost=39730372.76..39765372.76 rows=100 width=237)
          Hash Cond: (x.ad_event_id = e.ad_event_id)
          ->  Subquery Scan on x  (cost=39654433.45..39666933.45 rows=100 width=144)
                ->  Sort  (cost=39654433.45..39656933.45 rows=100 width=108)
                      Sort Key: s.ad_user_id, s.ad_session_id, t.timestamp
                      ->  Nested Loop Left Join  (cost=12054.20..39554775.61 rows=100 width=108)
                            Join Filter: ((t.timestamp >= s.session_start) AND (t.timestamp <= s.session_end))
                            ->  Hash Join  (cost=12046.09..203878.09 rows=100 width=92)
                                  Hash Cond: (t.user_id = u.orig_distinct_id)
                                  ->  Seq Scan on ad_events_mv t  (cost=0.00..173082.00 rows=100 width=112)
                                        Filter: (NOT processed)
                                  ->  Hash  (cost=7932.15..7932.15 rows=329115 width=67)
                                        ->  Seq Scan on ad_users u  (cost=0.00..7932.15 rows=329115 width=67)
                            ->  Bitmap Heap Scan on ad_sessions s  (cost=8.11..39.22 rows=8 width=32)
                                  Recheck Cond: ((ad_user_id = u.ad_user_id) OR (ad_user_id = u.merged_id))
                                  ->  BitmapOr  (cost=8.11..8.11 rows=8 width=0)
                                        ->  Bitmap Index Scan on ad_sessions_ad_user_id_idx  (cost=0.00..4.05 rows=4 width=0)
                                              Index Cond: (ad_user_id = u.ad_user_id)
                                        ->  Bitmap Index Scan on

Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Rajeev rastogi
On 20 April 2014 02:21, Torsten wrote:
 
 Hi,
 
 an index can be INVALID (pg_index.indisvalid=false).
 
 I want to temporarily disable an index so that it won't be used to
 access data but will still be updated.
 
 Can I simply set pg_index.indisvalid=false and later turn it true again?

I don't think there is any way to do so.

But if your intention is just to avoid index scans for some time, then you can 
use the following command, which disables index scans:

set enable_indexscan to off;

Once you are done with the experiment, you can re-enable index scans:

set enable_indexscan to on;
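Note that this setting is per-session, and the planner can still reach an index through a bitmap scan, so enable_bitmapscan may also need to be turned off. A sketch, scoped to one transaction:

```sql
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;  -- indexes are otherwise still usable via bitmap scans
EXPLAIN SELECT ...;                 -- run the query under test here
ROLLBACK;                           -- SET LOCAL settings revert automatically
```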

Thanks and Regards,
Kumar Rajeev Rastogi




Re: [GENERAL] any way for a transaction to see inserts done earlier in the transaction?

2014-04-20 Thread Rajeev rastogi
Was the function doing the INSERT operation successful? I suspect the INSERT 
function may have failed and hence the INSERT got aborted.
Also, you can try to commit the whole transaction and see if you are able to 
find the new id (this will prove whether the new id was really inserted or not).

Thanks and Regards,
Kumar Rajeev Rastogi
From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Susan Cassidy
Sent: 17 April 2014 05:36
To: Steven Schlansker
Cc: Tom Lane; pgsql-general@postgresql.org
Subject: Re: [GENERAL] any way for a transaction to see inserts done earlier 
in the transaction?

It is a fairly large and complex Perl program, so no, not really.
I do an insert via a function, which returns the new id, then later I try to 
SELECT on that id, and it doesn't find it.
Could it be because the insert is done inside a function?
Susan

On Wed, Apr 16, 2014 at 4:58 PM, Steven Schlansker ste...@likeness.com wrote:


 On Wed, Apr 16, 2014 at 4:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Susan Cassidy susan.cass...@decisionsciencescorp.com writes:
  Is there any way to let a transaction see the inserts that were done
  earlier in the transaction?

 It works that way automatically, as long as you're talking about separate
 statements within one transaction.

 regards, tom lane

 On Apr 16, 2014, at 4:53 PM, Susan Cassidy 
 susan.cass...@decisionsciencescorp.com wrote:
 Well, it isn't working for me right now.  It can't see a row that was 
 inserted earlier in the transaction.  It is a new primary key, and when I 
 SELECT it, it isn't found.


Can you share the code that does not work with us?  Preferably as a small 
self-contained example.
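Tom's point, that later statements in the same transaction do see earlier inserts, can be demonstrated with a minimal sketch (hypothetical table):

```sql
BEGIN;
CREATE TABLE demo (id integer PRIMARY KEY);
INSERT INTO demo (id) VALUES (1);
SELECT * FROM demo WHERE id = 1;  -- returns the row inserted above
ROLLBACK;
```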