Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scara Maccai
> You might be better off > with a CLUSTER on > some index.  I can't: table is too big, can't lock it for minutes; that's why I wanted to cluster it "one day at a time". -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http

Re: [PERFORM] Performance comparison between Postgres and Greenplum

2009-07-16 Thread Scott Marlowe
On Wed, Jul 15, 2009 at 7:02 AM, Suvankar Roy wrote: > > Hi Alex, > > Yes, I have got 2 segments and a master host. So, in a way processing should > be faster in Greenplum. > > Actually this is only a sort of Proof of Concept trial that I am carrying > out to notice differences between greenplum an

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Justin Pitts
ISTR that is the approach that MSSQL follows. Storing the full tuple in an index and not even having a data only page would also be an interesting approach to this (and perhaps simpler than a separate index file and data file if trying to keep the data in the order of the index). -- S

Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down systemperformance

2009-07-16 Thread Lauris Ulmanis
Yes, it seems problem in pg_auth flat file. We are using db users to manage access rights to db tables and data, that way we have two layer security - application and DB. Each system user has it's own group role and groups have different access levels. So we cannot use one login role for all user

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Justin Pitts
According to the books online http://msdn.microsoft.com/en-us/library/ms177443.aspx : "In a clustered index, the leaf nodes contain the data pages of the underlying table." Which agrees with your assertion. From a performance perspective, it DOES work very well. Which is why I keep h

Re: [PERFORM] [GENERAL] Concurrency issue under very heay loads

2009-07-16 Thread Greenhorn
2009/7/16 Raji Sridar (raji) : > Hi, > > We use a typical counter within a transaction to generate order sequence > number and update the next sequence number. This is a simple next counter - > nothing fancy about it.  When multiple clients are concurrently accessing > this table and updating it, u

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Justin Pitts
Is there any interest in adding that (continual/automatic cluster order maintenance) to a future release? On Wed, Jul 15, 2009 at 8:33 PM, Scott Carey wrote: > If you have a lot of insert/update/delete activity on a table fillfactor can > help. > > I don’t believe that postgres will try and mainta

Re: [PERFORM] Poor overall performance unless regular VACUUM FULL

2009-07-16 Thread Wayne Conrad
On Tue, 14 Jul 2009, Scott Marlowe wrote: Are you guys doing anything that could be deemed pathological, like full table updates on big tables over and over? Had an issue last year where a dev left a where clause off an update to a field in one of our biggest tables and in a few weeks the databa

[PERFORM] Re: [BUGS] BUG #4919: CREATE USER command slows down system performance

2009-07-16 Thread toruvinn
On Wed, 15 Jul 2009 16:02:09 +0200, Alvaro Herrera wrote: My bet is on the pg_auth flat file. I doubt we have ever tested the behavior of that code with 1 billion users ... I've noticed this behaviour some time ago, on a cluster with 50k+ roles (not sure about the number now). Restoring the

Re: [PERFORM] Performance comparison between Postgres and Greenplum

2009-07-16 Thread Suvankar Roy
Hi Alex, Yes, I have got 2 segments and a master host. So, in a way processing should be faster in Greenplum. Actually this is only a sort of Proof of Concept trial that I am carrying out to notice differences between greenplum and postgres, if any. For other queries though, results are satis

Re: [PERFORM] Poor overall performance unless regular VACUUM FULL

2009-07-16 Thread Wayne Conrad
On Tue, 14 Jul 2009, Scott Marlowe wrote: Just wondering, which pgsql version, and also, do you have autovacuum turned on? Dang, I should have said in my initial message. 8.3.6, and autovacuum is turned on and has plenty of log activity. -- Sent via pgsql-performance mailing list (pgsql-perfo

Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance

2009-07-16 Thread Lauris Ulmanis
Hello again! I did test on my local test server I created up 500 000 users in function loop very quickly - within 48 seconds. I did again this script reaching up to 1 billion users - results was the same - 48 seconds. It is very quickly. But problem seems is with transaction preparation because

Re: [PERFORM] Performance comparison between Postgres and Greenplum

2009-07-16 Thread Suvankar Roy
Hi Scott, Thanks for your input Scott. But, then being a Massively Parallel Processing Database, is Greenplum not expected to outperform versions of Postgres higher than on which it is based. My notion was that GP 3.3 (based on PostgreSQL 8.2.13) would exceed PG 8.3.7. It seems that I was wr

Re: [PERFORM] Performance comparison between Postgres and Greenplum

2009-07-16 Thread Suvankar Roy
Hi Scott, This is what I have got - In Greenplum, the following query returns: test_db1=# select version(); version ---

[PERFORM] Strange memory behavior with rails - caching in connection?

2009-07-16 Thread Alex
I am using Postgres with Rails. Each rails application "thread" is actually a separate process (mongrel) with it's own connection. Normally, the db connection processes (?) look something like this in top: 15772 postgres 15 0 229m 13m 12m S0 0.8 0:00.09 postgres: db db [local] idle

Re: [PERFORM] Poor query performance

2009-07-16 Thread Alex
Thanks. That's very helpful. I'll take your suggestions and see if things improve. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Poor query performance

2009-07-16 Thread Alex
> > How is the index  sl_city_etc defined? > Index "public.sl_city_etc" Column|Type --+- city | text listing_type | text post_time| timestamp without time zone bedrooms | integer region | text geo_lat |

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
On 7/16/09 5:27 PM, "Greg Stark" wrote: > On Fri, Jul 17, 2009 at 1:02 AM, Scott Carey wrote: >> Indexes would point to a heap page for normal tables and clustered index >> pages for clustered tables.  When new versions of data come in, it may point >> to new clustered index pages, just like the

Re: [PERFORM] Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1

2009-07-16 Thread Mike Ivanov
Interesting. It's quite a hairy plan even though all the branches are cut off by conditions ("never executed") so the query yields 0 rows. 0.018 is not a bad timing for that. However, if you run this query with different parameters, the result could be quite sad. There are some deeply neste

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
On Fri, Jul 17, 2009 at 1:02 AM, Scott Carey wrote: > Indexes would point to a heap page for normal tables and clustered index > pages for clustered tables.  When new versions of data come in, it may point > to new clustered index pages, just like they currently get modified to point > to new heap

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
On 7/16/09 1:49 PM, "Greg Stark" wrote: > On Thu, Jul 16, 2009 at 9:06 PM, Scott Carey wrote: >> Keep the old page around or a copy of it that old transactions reference? >> Just more Copy on Write. >> How is that different from a nested loop on an index scan/seek currently? >> Doesn't an old tr

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Kevin Grittner
Marc Cousin wrote: > As mentionned in another mail from the thread (from Richard Huxton), > I felt this message in the documentation a bit misleading : > > effective_cache_size (integer) > Sets the planner's assumption about the effective size of the disk > cache that is available to a single

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Devin Ben-Hur
Marc Cousin wrote: Le Thursday 16 July 2009 22:07:25, Kevin Grittner a écrit : Marc Cousin wrote: the hot parts of these 2 tables are extremely likely to be in the database or linux cache (buffer hit rate was 97% in the example provided). Moreover, the first two queries of the insert procedure

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Marc Cousin
Le Thursday 16 July 2009 23:54:54, Kevin Grittner a écrit : > Marc Cousin wrote: > > to sum it up, should I keep these values (I hate doing this :) ) ? > > Many people need to set the random_page_cost and/or seq_page_cost to > reflect the overall affect of caching on the active portion of the > da

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Kevin Grittner
Marc Cousin wrote: > to sum it up, should I keep these values (I hate doing this :) ) ? Many people need to set the random_page_cost and/or seq_page_cost to reflect the overall affect of caching on the active portion of the data. We set our fully-cached databases to 0.1 for both. Databases w

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Ibrahim Harrani
Hi Scott, Which fillfactor is better 70, 80 or another value? Thanks. Thanks On Thu, Jul 16, 2009 at 3:33 AM, Scott Carey wrote: > If you have a lot of insert/update/delete activity on a table fillfactor can > help. > > I don’t believe that postgres will try and maintain the table in the cluste

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
> Scara Maccai wrote: > >> What am I doing wrong? I didn't exactly follow the full sequence but it sounded like what's happening is that Postgres is noticing all these empty pages from earlier deletes and reusing that space. That's what it's designed to do. As Kevin said, there's no guarantee tha

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Marc Cousin
Le Thursday 16 July 2009 22:07:25, Kevin Grittner a écrit : > Marc Cousin wrote: > > the hot parts of these 2 tables are extremely likely to be in the > > database or linux cache (buffer hit rate was 97% in the example > > provided). Moreover, the first two queries of the insert procedure > > fill

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 9:06 PM, Scott Carey wrote: > Keep the old page around or a copy of it that old transactions reference? > Just more Copy on Write. > How is that different from a nested loop on an index scan/seek currently? > Doesn't an old transaction have to reference an old heap page thro

Re: [PERFORM] Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1

2009-07-16 Thread Dimitri Fontaine
Hi, Le 16 juil. 09 à 11:52, Andres Freund a écrit : If I interpret those findings correcty the execution is approx. as fast as DB2, only DB2 is doing automated plan caching while pg is not. If it _really_ is necessary that this is that fast, you can prepare the query like I showed. A for

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Kevin Grittner
Scara Maccai wrote: > What am I doing wrong? > [function which uses INSERT/UPDATE/DELETE statements to try to force > order of rows in heap] You seem to be assuming that the rows will be in the table in the sequence of your inserts. You might be better off with a CLUSTER on some index. (The

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Kevin Grittner
Marc Cousin wrote: > the hot parts of these 2 tables are extremely likely to be in the > database or linux cache (buffer hit rate was 97% in the example > provided). Moreover, the first two queries of the insert procedure > fill the cache for us... This would be why the optimizer does the be

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
On 7/16/09 12:46 PM, "Greg Stark" wrote: > On Thu, Jul 16, 2009 at 8:18 PM, Scott Carey wrote: >> " Each index row in the nonclustered index contains the nonclustered key >> value and a row locator. This locator points to the data row in the >> clustered index or heap having the key value." >>

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Greg Stark
On Thu, Jul 16, 2009 at 8:18 PM, Scott Carey wrote: > " Each index row in the nonclustered index contains the nonclustered key > value and a row locator. This locator points to the data row in the > clustered index or heap having the key value." > > That sort of model should work with MVCC and even

Re: [PERFORM] Incr/Decr Integer

2009-07-16 Thread Kevin Grittner
William Scott Jordan wrote: > We seem to be getting a lot of deadlocks using this method under > heavy load. Could you post the exact message from one of these? (Copy and paste if possible.) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make chang

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
Yes, it seems as though the whole tuple is entirely in the index if it is clustered. From : http://msdn.microsoft.com/en-us/library/ms177484.aspx " Each index row in the nonclustered index contains the nonclustered key value and a row locator. This locator points to the data row in the clustered

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Kevin Grittner
Scott Carey wrote: > I could be wrong, but I think MSSQL only keeps the data specified in > the index in the index, and the remaining columns in the data. Unless it has changed recently, an MS SQL Server clustered index is the same as the Sybase implementation: all data for the tuple is stored

Re: [PERFORM] Incr/Decr Integer

2009-07-16 Thread Richard Huxton
William Scott Jordan wrote: Hey all! Is there a better way to increase or decrease the value of an integer than doing something like: UPDATE the_table SET the_int = the_int + 1 WHERE the_id = 123 ; No. We seem to be getting a lot of deadlocks using this method under heavy load. Just won

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
I could be wrong, but I think MSSQL only keeps the data specified in the index in the index, and the remaining columns in the data. That is, if there is a clustered index on a table on three columns out of five, those three columns in the index are stored in the index, while the other two are in a

[PERFORM] Incr/Decr Integer

2009-07-16 Thread William Scott Jordan
Hey all! Is there a better way to increase or decrease the value of an integer than doing something like: --- UPDATE the_table SET the_int = the_int + 1 WHERE the_id = 123 ; --- We seem to be getting a lot of deadlocks using this method under heavy load. Just wondering if we should be doing

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Carey
Either true Index Organized Tables or Clustered Indexes would be very useful for a variety of table/query types. The latter seems more difficult with Postgres' MVCC model since it requires data to be stored in the index that is authoritative. Storing the full tuple in an index and not even having

Re: [PERFORM] Poor overall performance unless regular VACUUM FULL

2009-07-16 Thread Wayne Conrad
Ouch hurts my eyes :) Can you see something like table_len, dead_tuple_percent, free_percent order by dead_tuple_percent desc limit 10 or something like that maybe? Sorry about the pain. Didn't know what you needed to see. Ordering by dead_tuple_percent: db.production=> select table_name, ta

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scara Maccai
>   +1 for index organized tables  +1 I have a table: CREATE TABLE mytab ( "time" timestamp without time zone NOT NULL, ne_id integer NOT NULL, values integer, CONSTRAINT mytab_pk PRIMARY KEY (ne_id, "time"), CONSTRAINT mytab_ne_id_key UNIQUE ("time", ne_id) } The table is written eve

Re: [PERFORM] cluster index on a table

2009-07-16 Thread Scott Mead
On Wed, Jul 15, 2009 at 10:36 PM, Scott Marlowe wrote: > I'd love to see it. +1 for index organized tables --Scott

Re: [PERFORM] Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 11:46:18 ning wrote: > I'm sorry, they are in milliseconds, not seconds. > The time used is quite same to the result of "explain analyze select > " I pasted above, > which was " Total runtime: 0.479 ms". Yea. Unfortunately that time does not including planning time. If

Re: [PERFORM] Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1

2009-07-16 Thread ning
I'm sorry, they are in milliseconds, not seconds. The time used is quite same to the result of "explain analyze select " I pasted above, which was " Total runtime: 0.479 ms". Greetings, Ning On Thu, Jul 16, 2009 at 6:33 PM, Andres Freund wrote: > On Thursday 16 July 2009 11:30:00 ning wrote:

Re: [PERFORM] Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1

2009-07-16 Thread ning
Hi Andres, By executing #explain analyze execute test_query; the first execution cost 0.389 seconds the second cost 0.285 seconds Greetings, Ning On Thu, Jul 16, 2009 at 4:45 PM, Andres Freund wrote: > On Thursday 16 July 2009 03:11:29 ning wrote: >> Hi Andres, >> >> The log for the test you s

Re: [PERFORM] Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 11:30:00 ning wrote: > Hi Andres, > > By executing > #explain analyze execute test_query; > > the first execution cost 0.389 seconds > the second cost 0.285 seconds Seconds or milliseconds? If seconds that would be by far slower than the plain SELECT, right? Andres -- S

Re: [PERFORM] Concurrency issue under very heay loads

2009-07-16 Thread Albe Laurenz
Raji Sridar wrote: > We use a typical counter within a transaction to generate > order sequence number and update the next sequence number. > This is a simple next counter - nothing fancy about it. When > multiple clients are concurrently accessing this table and > updating it, under extermely

Re: [PERFORM] Repeated Query is much slower in PostgreSQL8.2.4 than DB2 9.1

2009-07-16 Thread Andres Freund
On Thursday 16 July 2009 03:11:29 ning wrote: > Hi Andres, > > The log for the test you suggested is as follows in PostgreSQL8.2.4, > but I cannot find a clue to prove or prove not PostgreSQL is doing > plan caching. Well. How long is the PREPARE and the EXECUTEs taking? Andres -- Sent via pgsq

Re: [PERFORM] Very big insert/join performance problem (bacula)

2009-07-16 Thread Marc Cousin
On Thursday 16 July 2009 07:20:18 Marc Cousin wrote: > Le Thursday 16 July 2009 01:56:37, Devin Ben-Hur a écrit : > > Marc Cousin wrote: > > > This mail contains the asked plans : > > > Plan 1 > > > around 1 million records to insert, seq_page_cost 1, random_page_cost 4 > > > > > > -> Has