Re: [PERFORM] Building multiple indexes concurrently
On Mar 16, 2010, at 6:04 PM, Rob Wultsch wrote:

> Let's say I have a large table bigTable to which I would like to add
> two btree indexes. Is there a more efficient way to create indexes than:
>
> CREATE INDEX idx_foo on bigTable (foo);
> CREATE INDEX idx_baz on bigTable (baz);
>
> Or
>
> CREATE INDEX CONCURRENTLY idx_foo on bigTable (foo);
> CREATE INDEX CONCURRENTLY idx_baz on bigTable (baz);
>
> Are there any particular performance optimizations that would be in
> play in such a scenario?
>
> At a minimum I assume that if both of the commands were started at
> about the same time they would each scan the table in the same
> direction, and whichever creation was slower would benefit from most of
> the table data it needed being prepopulated in shared buffers. Is this
> the case?

That sounds reasonable to me. You might also look at upping
maintenance_work_mem for your session.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
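A minimal sketch of the two-session approach being discussed, with maintenance_work_mem raised per session; the memory value is illustrative, not a recommendation, and note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block:

```sql
-- Session 1:
SET maintenance_work_mem = '1GB';  -- affects only this session's index sort
CREATE INDEX CONCURRENTLY idx_foo ON bigTable (foo);

-- Session 2, started at roughly the same time:
SET maintenance_work_mem = '1GB';
CREATE INDEX CONCURRENTLY idx_baz ON bigTable (baz);
```

The hoped-for effect is the one Rob describes: both builds scan the heap in the same direction, so the slower one finds much of the table already in shared buffers.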
[PERFORM] Building multiple indexes concurrently
Let's say I have a large table bigTable to which I would like to add two btree indexes. Is there a more efficient way to create indexes than:

CREATE INDEX idx_foo on bigTable (foo);
CREATE INDEX idx_baz on bigTable (baz);

Or

CREATE INDEX CONCURRENTLY idx_foo on bigTable (foo);
CREATE INDEX CONCURRENTLY idx_baz on bigTable (baz);

Are there any particular performance optimizations that would be in play in such a scenario?

At a minimum I assume that if both of the commands were started at about the same time they would each scan the table in the same direction, and whichever creation was slower would benefit from most of the table data it needed being prepopulated in shared buffers. Is this the case?

--
Rob Wultsch
wult...@gmail.com
Re: [PERFORM] Block at a time ...
Dave Crooke wrote:

> An awesomely simple alternative is to just specify the extension as e.g. 5%
> of the existing table size: it starts by adding one block at a time for
> tiny tables, and once your table is over 20GB, it ends up adding a whole 1GB
> file and pre-allocating it. Very little wastage.

I was thinking of something like that, except that the factor I'd use
would be something like 50% or 100% of current size, capped at (say) 1 GB.

--
Alvaro Herrera            http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[PERFORM] Block at a time ...
I agree with Tom: any reordering attempt is at best second-guessing the filesystem and underlying storage. However, having the ability to control the extent size would be a worthwhile improvement for systems that walk and chew gum (write to lots of tables) concurrently. I'm thinking of Oracle's AUTOEXTEND settings for tablespace datafiles.

I think the ideal way to do it for PG would be to make the equivalent configurable in postgresql.conf system-wide, and allow specific per-table settings in the SQL metadata, similar to auto-vacuum.

An awesomely simple alternative is to just specify the extension as e.g. 5% of the existing table size: it starts by adding one block at a time for tiny tables, and once your table is over 20GB, it ends up adding a whole 1GB file and pre-allocating it. Very little wastage.

Cheers
Dave

On Tue, Mar 16, 2010 at 4:49 PM, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Alvaro Herrera writes:
> > > Maybe it would make more sense to try to reorder the fsync calls
> > > instead.
> >
> > Reorder to what, though?  You still have the problem that we don't know
> > much about the physical layout on-disk.
>
> Well, to block numbers as a first step.
>
> However, this reminds me that sometimes we take the block-at-a-time
> extension policy too seriously. We had a customer that had a
> performance problem because they were inserting lots of data to TOAST
> tables, causing very frequent extensions. I kept wondering whether an
> allocation policy that allocated several new blocks at a time could be
> useful (but I didn't try it). This would also alleviate fragmentation,
> thus helping the physical layout be more similar to logical block
> numbers.
>
> --
> Alvaro Herrera            http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
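The proportional extension policy can be sketched in a few lines. The 5% factor and 1 GB cap are the numbers from the discussion; the 8 kB block size is PostgreSQL's default, and the function name is illustrative:

```python
BLOCK_SIZE = 8192        # PostgreSQL's default block size in bytes
MAX_EXTENSION = 1 << 30  # cap growth at 1 GB, as proposed

def extension_bytes(table_bytes: int, factor: float = 0.05) -> int:
    """Bytes to pre-allocate when extending a table of the given size."""
    # Grow by a fixed fraction of the current size, but always at least
    # one block, and never more than the cap.
    raw = int(table_bytes * factor)
    clamped = max(BLOCK_SIZE, min(raw, MAX_EXTENSION))
    # Round down to a whole number of blocks.
    return (clamped // BLOCK_SIZE) * BLOCK_SIZE
```

With these numbers a tiny table grows one block at a time, and a 20 GB table gets the full 1 GB extension, matching the behavior Dave describes.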
Re: [PERFORM] shared_buffers advice
Alvaro Herrera wrote:
> Maybe it would make more sense to try to reorder the fsync calls
> instead.

The pretty obvious idea left behind from the 8.3 spread checkpoint development was to similarly spread the fsync calls around. Given that we know, for example, that Linux with ext3 is going to dump the whole filesystem write cache out when the fsync call comes in, the way they're currently scheduled has considerable potential for improvement.

Unfortunately, since the tuning on that is going to be very platform dependent and require a lot of benchmarking work, I think we need a performance farm up and running as a prerequisite to finishing that work off. The spread checkpoint stuff was a much more obvious improvement, and that was hard enough to quantify usefully and test.

Returning to the idea of the sorted checkpoints patch as a simple example: if it were possible to just push that patch to a test repo and see how that changed typical throughput/latency against a well-established history, it would be a lot easier to figure out whether something like that is sensible to consider or not. I'm not sure how to make progress on similar ideas about tuning closer to the filesystem level without having something automated that takes over the actual benchmark running and data recording steps; it's just way too time consuming to do those right now with every tool that's available for PostgreSQL so far. That's the problem I work on; there are easily a half dozen good ideas for improvements here floating around where coding time is dwarfed by the required performance validation time.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] shared_buffers advice
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> > > That's not going to do anything towards reducing the actual I/O volume.
> > > Although I suppose it might be useful if it just cuts the number of
> > > seeks.
> >
> > Oh, they had no problems with I/O volume.  It was relation extension
> > lock that was heavily contended for them.
>
> Really?  I guess that serialized all the I/O ... I'll bet if we got rid
> of that locking somehow, they *would* have a problem with I/O volume.

Well, that would solve the problem as far as I'm concerned, and they'd have to start talking to their storage provider ;-)

--
Alvaro Herrera            http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] shared_buffers advice
Alvaro Herrera writes:
> Tom Lane wrote:
> > That's not going to do anything towards reducing the actual I/O volume.
> > Although I suppose it might be useful if it just cuts the number of
> > seeks.
>
> Oh, they had no problems with I/O volume.  It was relation extension
> lock that was heavily contended for them.

Really? I guess that serialized all the I/O ... I'll bet if we got rid of that locking somehow, they *would* have a problem with I/O volume.

regards, tom lane
Re: [PERFORM] shared_buffers advice
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> > > Reorder to what, though?  You still have the problem that we don't know
> > > much about the physical layout on-disk.
> >
> > Well, to block numbers as a first step.
>
> fsync is a file-based operation, and we know exactly zip about the
> relative positions of different files on the disk.

Doh, right, I was thinking of the sync_file_range kind of API.

> > We had a customer that had a
> > performance problem because they were inserting lots of data to TOAST
> > tables, causing very frequent extensions. I kept wondering whether an
> > allocation policy that allocated several new blocks at a time could be
> > useful (but I didn't try it). This would also alleviate fragmentation,
> > thus helping the physical layout be more similar to logical block
> > numbers.
>
> That's not going to do anything towards reducing the actual I/O volume.
> Although I suppose it might be useful if it just cuts the number of
> seeks.

Oh, they had no problems with I/O volume. It was relation extension lock that was heavily contended for them.

--
Alvaro Herrera            http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] shared_buffers advice
Alvaro Herrera writes:
> Tom Lane wrote:
> > Reorder to what, though?  You still have the problem that we don't know
> > much about the physical layout on-disk.
>
> Well, to block numbers as a first step.

fsync is a file-based operation, and we know exactly zip about the relative positions of different files on the disk.

> However, this reminds me that sometimes we take the block-at-a-time
> extension policy too seriously.

Yeah, that's a huge performance penalty in some circumstances.

> We had a customer that had a
> performance problem because they were inserting lots of data to TOAST
> tables, causing very frequent extensions. I kept wondering whether an
> allocation policy that allocated several new blocks at a time could be
> useful (but I didn't try it). This would also alleviate fragmentation,
> thus helping the physical layout be more similar to logical block
> numbers.

That's not going to do anything towards reducing the actual I/O volume. Although I suppose it might be useful if it just cuts the number of seeks.

regards, tom lane
Re: [PERFORM] shared_buffers advice
Tom Lane wrote:
> Alvaro Herrera writes:
> > Maybe it would make more sense to try to reorder the fsync calls
> > instead.
>
> Reorder to what, though?  You still have the problem that we don't know
> much about the physical layout on-disk.

Well, to block numbers as a first step.

However, this reminds me that sometimes we take the block-at-a-time extension policy too seriously. We had a customer that had a performance problem because they were inserting lots of data to TOAST tables, causing very frequent extensions. I kept wondering whether an allocation policy that allocated several new blocks at a time could be useful (but I didn't try it). This would also alleviate fragmentation, thus helping the physical layout be more similar to logical block numbers.

--
Alvaro Herrera            http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] shared_buffers advice
Alvaro Herrera writes:
> Maybe it would make more sense to try to reorder the fsync calls
> instead.

Reorder to what, though? You still have the problem that we don't know much about the physical layout on-disk.

regards, tom lane
Re: [PERFORM] shared_buffers advice
Greg Stark wrote:
> On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane wrote:
> > "Pierre C" writes:
> > > Does PG issue checkpoint writes in "sorted" order ?
> >
> > No.  IIRC, a patch for that was submitted, and rejected because no
> > significant performance improvement could be demonstrated.  We don't
> > have enough information about the actual on-disk layout to be very
> > intelligent about this, so it's better to just issue the writes and
> > let the OS sort them.
>
> Keep in mind that postgres is issuing writes to the OS buffer cache.
> It defers fsyncing the files as late as it can in the hopes that most
> of those buffers will be written out by the OS before then. That gives
> the OS a long time window in which to flush them out in whatever order
> and whatever schedule is most convenient.

Maybe it would make more sense to try to reorder the fsync calls instead.

--
Alvaro Herrera            http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] shared_buffers advice
Greg Stark wrote:
> On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane wrote:
> > "Pierre C" writes:
> > > Does PG issue checkpoint writes in "sorted" order ?
> >
> > No.  IIRC, a patch for that was submitted, and rejected because no
> > significant performance improvement could be demonstrated.
>
> If the OS filesystem buffer cache is really small then that might not
> work so well. It might be worth rerunning those benchmarks on a machine
> with shared buffers taking up all of RAM.

Here's the original patch again:
http://archives.postgresql.org/message-id/20080415181742.6c97.52131...@oss.ntt.co.jp

I was the person who tried to reproduce the suggested 10% pgbench speedup on a similar system and couldn't replicate any improvement. I was never sure what was going on to show such a difference on the reference system used to develop the patch versus mine, since they were pretty similar. Possibly some positive interaction with LVM in the test case I didn't have. Maybe the actual reason sorting helped was a limitation in the HP P400 controller used there that I wasn't running into with the Areca card I used. And the always popular "didn't account fully for all pgbench run-to-run variation" possibility crossed my mind too: that the originally observed speedup wasn't caused by the patch but by something else.

I did not go out of my way to find test conditions where the patch would be more likely to help, like the situation you describe where shared_buffers is really large relative to the OS cache. Since the patch complicates the checkpoint code and requires some working memory to operate, it would have to be an unquestionable win using standard practices before it was worth applying. If it only helps in a situation people are unlikely to use in the field, and is a net negative for everyone else, that's still going to end up on the interesting-but-rejected idea scrapheap at the end of the day.
--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] shared_buffers advice
Pierre C wrote:
> Actually, I meant that in the case of a seq scan, PG will try to use
> just a few buffers (a ring) in shared_buffers instead of thrashing the
> whole buffers. But if there was actually a lot of free space in
> shared_buffers, do the pages stay, or do they not ?

Pages inserted into the ring buffer and later re-used for new data do not stay behind, even if there is room for them. There's a potential improvement possible in that code involving better management of the situation where the buffer cache hasn't actually reached full capacity yet, but as it's an unusual case it's hard to justify optimizing for. Besides, the hope is that in this case the OS cache will end up caching everything anyway until it has a reason to evict it. So if you follow the rest of the data suggesting you should not give all the memory to PostgreSQL to manage, you end up with a reasonable solution to this problem anyway. Those pages will just live in the OS cache instead of the database's, with only a few trickling in and staying behind each time you do a sequential scan.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
Re: [PERFORM] Postgres DB maintainenance - vacuum and reindex
Meena_Ramkumar wrote:
> How to run vacuumdb and reindex for Postgres DB in a non-stop server? Will it
> be made without shutting the server? If so, then what will be performance
> degradation percentage?

You don't need to stop the server to run VACUUM; it runs as just another backend process. If you are using a recent version of PostgreSQL, you can enable autovacuum on the server and let it take care of this, or schedule plain VACUUM runs yourself. You should avoid VACUUM FULL: it is very slow, it takes exclusive locks on the tables it processes, and while it reduces the table size on disk it doesn't reduce index size; it can actually make indexes larger. With autovacuum = on, you can avoid running VACUUM manually very frequently.

The performance degradation depends on the number of tables and databases you have on your server.

REINDEX is another task that you can execute periodically on your server, but if you don't want to affect production work, the best approach is to build a replacement index with CREATE INDEX CONCURRENTLY and then drop the old index.

Regards

--
Ing. Marcos Luís Ortíz Valmaseda
Database Architect/Administrator, PostgreSQL RDBMS
Comunidad Técnica Cubana de PostgreSQL
http://postgresql.uci.cu
Universidad de las Ciencias Informáticas -- http://www.uci.cu
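The rebuild-then-swap pattern can be sketched as follows; the index and table names are illustrative, and CREATE INDEX CONCURRENTLY cannot run inside a transaction block:

```sql
-- Build a replacement index without blocking writes to the table.
CREATE INDEX CONCURRENTLY idx_orders_customer_new ON orders (customer_id);

-- Once the new index is built and valid, swap it in for the bloated one.
DROP INDEX idx_orders_customer;
ALTER INDEX idx_orders_customer_new RENAME TO idx_orders_customer;
```

If the concurrent build fails partway, it can leave an invalid index behind that must be dropped manually before retrying.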
Re: [PERFORM] Postgres DB maintainenance - vacuum and reindex
Autovacuum is your friend for minimal downtime. It is configurable to let you adjust how invasive it will be, and you can have different settings per table if you wish. As for the reindex, why do you think you will be reindexing regularly?

On Mar 15, 2010, at 10:30 PM, Meena_Ramkumar wrote:

> How to run vacuumdb and reindex for Postgres DB in a non-stop server? Will it
> be made without shutting the server? If so, then what will be performance
> degradation percentage?
> --
> View this message in context:
> http://old.nabble.com/Postgres-DB-maintainenance---vacuum-and-reindex-tp27913694p27913694.html
> Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Re: [PERFORM] Postgres DB maintainenance - vacuum and reindex
On Mon, Mar 15, 2010 at 11:30 PM, Meena_Ramkumar wrote:
> How to run vacuumdb and reindex for Postgres DB in a non-stop server? Will it
> be made without shutting the server? If so, then what will be performance
> degradation percentage?

vacuum can be tuned by the various vacuum_* parameters in the postgresql.conf file to have little or no impact on other processes running. Depending on your IO subsystem, you can tune it up or down to fit your needs (speed versus impact on other processes).

reindex, however, tends to be more intrusive to the system, and may cause some performance degradation, which will be very dependent on your IO subsystem (i.e. a single 7200RPM SATA drive is more likely to notice and be slowed down by reindexing than a 48-disk 15krpm SAS RAID-10 array).

The more important question is what problem you are trying to solve, and whether there are other, better approaches than the ones you're trying. Without more info, no one can really say.
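The knobs referred to are the cost-based vacuum delay settings; a sketch of a deliberately gentle configuration (the values here are illustrative starting points, not recommendations):

```
# postgresql.conf: throttle (auto)vacuum I/O impact
vacuum_cost_delay = 20ms             # sleep once the cost budget is spent
vacuum_cost_limit = 200              # page-access budget between sleeps
autovacuum_vacuum_cost_delay = 20ms  # same throttling for autovacuum workers
```

Larger delays and smaller limits make vacuum slower but gentler on concurrent queries; on a fast array you can usually afford to loosen them.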
Re: [PERFORM] GiST index performance
Matthew Wakeling wrote:
> Matthew Wakeling wrote:
> > A second quite distinct issue is the general performance of GiST
> > indexes, which is also mentioned in the old thread linked from Open
> > Items. For that, we have a test case at
> > http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php
> > for btree_gist indexes. I have a similar example with the bioseg GiST
> > index. I have completely reimplemented the same algorithms in Java for
> > algorithm investigation and instrumentation purposes, and it runs
> > about a hundred times faster than in Postgres. I think this is a
> > problem, and I'm willing to do some investigation to try and solve it.
>
> I have not made any progress on this issue. I think Oleg and Teodor
> would be better placed working it out. All I can say is that I
> implemented the exact same indexing algorithm in Java, and it performed
> 100 times faster than Postgres. Now, Postgres has to do a lot of
> additional work, like mapping the index onto disc, locking pages, and
> abstracting to plugin user functions, so I would expect some
> difference; I'm not sure 100 times is reasonable, though. I tried to do
> some profiling, but couldn't see any one section of code that was
> taking too much time. Not sure what I can do further.

Hello Matthew and list,

A lot of time is spent in the gistget.c code, and in a lot of FunctionCall5 invocations of the GiST consistent function, which are out of sight for gprof.

Something different but related, since it is also GiST: we noticed before that GiST indexes that use a compressed form for index entries suffer from repeated compress calls on query operands (see http://archives.postgresql.org/pgsql-hackers/2009-05/msg00078.php).

The btree_gist int4 compress function calls the generic gbt_num_compress, which does a palloc. Maybe this palloc is also hit a lot when scanning the index, because the constants being queried with are repeatedly compressed and palloc'd.
regards,
Yeb Havinga
[PERFORM] Postgres DB maintainenance - vacuum and reindex
How to run vacuumdb and reindex for Postgres DB in a non-stop server? Will it be made without shutting the server? If so, then what will be performance degradation percentage?

--
View this message in context:
http://old.nabble.com/Postgres-DB-maintainenance---vacuum-and-reindex-tp27913694p27913694.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Re: [PERFORM] shared_buffers advice
On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane wrote:
> "Pierre C" writes:
> > Does PG issue checkpoint writes in "sorted" order ?
>
> No.  IIRC, a patch for that was submitted, and rejected because no
> significant performance improvement could be demonstrated.  We don't
> have enough information about the actual on-disk layout to be very
> intelligent about this, so it's better to just issue the writes and
> let the OS sort them.

Keep in mind that postgres is issuing writes to the OS buffer cache. It defers fsyncing the files as late as it can in the hopes that most of those buffers will be written out by the OS before then. That gives the OS a long time window in which to flush them out in whatever order and whatever schedule is most convenient.

If the OS filesystem buffer cache is really small then that might not work so well. It might be worth rerunning those benchmarks on a machine with shared buffers taking up all of RAM.

--
greg
Re: [PERFORM] shared_buffers advice
On Tue, Mar 16, 2010 at 1:48 PM, Pierre C wrote:
> Actually, I meant that in the case of a seq scan, PG will try to use just a
> few buffers (a ring) in shared_buffers instead of thrashing the whole
> buffers. But if there was actually a lot of free space in shared_buffers, do
> the pages stay, or do they not ?

They don't. The logic only kicks in if the table is expected to be > 1/4 of shared buffers, though.

--
greg
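The threshold Greg describes can be sketched as below; the one-quarter cutoff is the number from the thread (in the backend it lives in the buffer access strategy code), while the function and parameter names are illustrative:

```python
def use_ring_buffer(table_blocks: int, shared_buffer_blocks: int) -> bool:
    """Decide whether a sequential scan should read through a small ring
    of buffers instead of flooding the whole shared buffer cache."""
    # Only scans of tables expected to exceed a quarter of shared_buffers
    # fall back to the ring strategy; smaller tables are cached normally.
    return table_blocks > shared_buffer_blocks // 4
```

So with the default-ish 128 MB of shared_buffers (16384 blocks), any table over about 32 MB would be scanned through the ring and leave little behind in the cache.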
Re: [PERFORM] shared_buffers advice
"Pierre C" writes:
> Does PG issue checkpoint writes in "sorted" order ?

No. IIRC, a patch for that was submitted, and rejected because no significant performance improvement could be demonstrated. We don't have enough information about the actual on-disk layout to be very intelligent about this, so it's better to just issue the writes and let the OS sort them.

regards, tom lane
Re: [PERFORM] shared_buffers advice
> I think the logic you are referring to is the clock sweep buffer
> accounting scheme. That just makes sure that the most popular pages
> stay in the buffers. If your entire db fits in the buffer pool then
> it'll all get in there real fast.

Actually, I meant that in the case of a seq scan, PG will try to use just a few buffers (a ring) in shared_buffers instead of thrashing the whole buffers. But if there was actually a lot of free space in shared_buffers, do the pages stay, or do they not?
Re: [PERFORM] shared_buffers advice
On Tue, Mar 16, 2010 at 7:24 AM, Pierre C wrote:
> I wonder about something, too: if your DB size is smaller than RAM, you
> could in theory set shared_buffers to a size larger than your DB provided
> you still have enough free RAM left for work_mem and OS writes management.
> How does this interact with the logic which prevents seq-scans hogging
> shared_buffers?

I think the logic you are referring to is the clock sweep buffer accounting scheme. That just makes sure that the most popular pages stay in the buffers. If your entire db fits in the buffer pool then it'll all get in there real fast.

Two things to consider, though:

1. The checkpoint issue still stands.

2. You should really mess around with your cost estimates if this is the case. If you make random IO cost the same as sequential IO, postgres will prefer index scans over bitmap index scans and table scans, which makes sense if everything is in memory.
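A sketch of that cost-estimate adjustment; these are real planner settings, but the values are illustrative, and whether a session-level SET or a postgresql.conf change is appropriate depends on your setup:

```sql
-- Tell the planner that random page reads cost the same as sequential
-- ones, which is roughly true when the whole database fits in RAM.
SET random_page_cost = 1.0;
SET seq_page_cost = 1.0;
```

With equal page costs the planner stops penalizing index scans for their random access pattern, which is the behavior change Greg describes.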
Re: [PERFORM] shared_buffers advice
> -My warnings about downsides related to checkpoint issues with larger
> buffer pools isn't an opinion at all; that's a fact based on
> limitations in how Postgres does its checkpoints. If we get something
> more like Oracle's incremental checkpoint logic, this particular
> concern might go away.

Does PG issue checkpoint writes in "sorted" order ?

I wonder about something, too: if your DB size is smaller than RAM, you could in theory set shared_buffers to a size larger than your DB, provided you still have enough free RAM left for work_mem and OS writes management. How does this interact with the logic which prevents seq-scans hogging shared_buffers?
Re: [PERFORM] Is DBLINK transactional
cr...@postnewspapers.com.au (Craig Ringer) writes:
> On 13/03/2010 5:54 AM, Jeff Davis wrote:
> > On Fri, 2010-03-12 at 12:07 -0500, Merlin Moncure wrote:
> > > of course.  You can always explicitly open a transaction on the remote
> > > side over dblink, do work, and commit it at the last possible moment.
> > > Your transactions aren't perfectly synchronized... if you crash in the
> > > precise moment between committing the remote and the local you can get
> > > in trouble.  The chances of this are extremely remote though.
> >
> > If you want a better guarantee than that, consider using 2PC.
>
> Translation in case you don't know: 2PC = two phase commit.
>
> Note that you have to monitor "lost" transactions that were prepared
> for commit then abandoned by the controlling app and periodically get
> rid of them or you'll start having issues.

There can be issues even if they're not abandoned...

Note that prepared transactions establish, and maintain until removed, all the appropriate locks on the underlying tables and tuples. As a consequence, maintenance-related activities may be somewhat surprisingly affected.

foo=# begin; set transaction isolation level serializable;
BEGIN
SET
foo=# insert into my_table (date_time, hostname, duration, diag)
      values (now(), 'foo', 1, 2);
INSERT 0 1
foo=# prepare transaction 'foo';
PREPARE TRANSACTION

[then, I quit the psql session...]

foo=# select * from pg_locks where relation =
      (select oid from pg_class where relname = 'my_table');
-[ RECORD 1 ]------+-----------------
locktype           | relation
database           | 308021
relation           | 308380
page               |
tuple              |
virtualxid         |
transactionid      |
classid            |
objid              |
objsubid           |
virtualtransaction | -1/433653
pid                |
mode               | RowExclusiveLock
granted            | t

If I try to truncate the table...

foo=# truncate my_table;
[hangs, waiting on the lock...]

[looking at another session...]
foo=# select * from pg_locks where relation =
      (select oid from pg_class where relname = 'my_table');
-[ RECORD 1 ]------+--------------------
locktype           | relation
database           | 308021
relation           | 308380
page               |
tuple              |
virtualxid         |
transactionid      |
classid            |
objid              |
objsubid           |
virtualtransaction | -1/433653
pid                |
mode               | RowExclusiveLock
granted            | t
-[ RECORD 2 ]------+--------------------
locktype           | relation
database           | 308021
relation           | 308380
page               |
tuple              |
virtualxid         |
transactionid      |
classid            |
objid              |
objsubid           |
virtualtransaction | 2/13
pid                | 3749
mode               | AccessExclusiveLock
granted            | f

Immediately upon submitting "commit prepared 'foo';", both locks are resolved quite quickly.

> > The problem with things that are "extremely remote" possibilities is
> > that they tend to be less remote than we expect ;)
>
> ... and they know just when they can happen despite all the odds to
> maximise the pain and chaos caused.

A lot of these kinds of things only come up as race conditions. The trouble is that a lot of races do wind up synchronizing themselves. In sporting events, this is intended and desired; an official fires the starter pistol or activates the horn, or what have you, with the intended result that athletes begin very nearly simultaneously. And at the end of Olympic races, their times frequently differ only by miniscule intervals.

In my example up above, there's a possibly unexpected synchronization point; the interweaving of the PREPARE TRANSACTION and TRUNCATE requests leads to a complete lock against the table. Supposing 15 processes then try accessing that table, they'll be blocked until the existing locks get closed out, which takes place the very instant after the COMMIT PREPARED request comes in. At that moment, 15 "racers" are released very nearly simultaneously. If there is any further mischief to be had in the race, well, they're set up to tickle it...

--
let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/nonrdbms.html
"Barf, what is all this prissy pedantry?
Groups, modules, rings, ufds, patent-office algebra. Barf!"
-- R. William Gosper
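The two-phase pattern under discussion can be sketched with dblink as below; the connection string and transaction identifier are illustrative, and a real deployment also needs the cleanup of orphaned prepared transactions that Craig warns about:

```sql
-- Open a connection to the remote database (connection details are
-- placeholders).
SELECT dblink_connect('rconn', 'dbname=remote_db host=remote.example');

BEGIN;  -- local transaction
SELECT dblink_exec('rconn', 'BEGIN');
-- ... do local work here, and remote work via dblink_exec('rconn', ...) ...
SELECT dblink_exec('rconn', $$PREPARE TRANSACTION 'tx_42'$$);
COMMIT;  -- if this local commit fails, issue ROLLBACK PREPARED remotely
SELECT dblink_exec('rconn', $$COMMIT PREPARED 'tx_42'$$);
SELECT dblink_disconnect('rconn');
```

Until the final COMMIT PREPARED (or ROLLBACK PREPARED) runs, the remote prepared transaction holds its locks, which is exactly the behavior the pg_locks session above demonstrates.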
Re: [PERFORM] shared_buffers advice
Dave Crooke wrote:
> There seems to be a wide range of opinion on this ... I am new to PG and
> grew up on Oracle, where more SGA is always a good thing ... I know
> people who run Oracle on 2TB Superdomes with titanic SGA sizes to keep
> the whole DB in RAM. I'd be using a 40GB+ Oracle SGA on that box of
> yours.

I wouldn't call it opinion so much as a series of anecdotes all suggesting the same thing: that you cannot translate SGA practice into PostgreSQL and expect it to work the same way. Some data points:

- An academic study at Duke suggested 40% of RAM was optimal for their mixed workload, but that was a fairly small amount of RAM: http://www.cs.duke.edu/~shivnath/papers/ituned.pdf

- Tests done by Jignesh Shah at Sun not too long ago put diminishing returns on a system with a bunch of RAM at 10GB, probably due to buffer lock contention issues (details beyond that number not in the slides; recalling from memory of the talk itself): http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_best

- My warnings about downsides related to checkpoint issues with larger buffer pools aren't an opinion at all; that's a fact based on limitations in how Postgres does its checkpoints. If we get something more like Oracle's incremental checkpoint logic, this particular concern might go away.

- Concerns about swapping, work_mem, etc. are all very real. All of us who have had the database server process killed by the Linux OOM killer at least once know that's one OS you absolutely cannot push this too hard on. This is not unique to Postgres; the issue exists in Oracle+SGA land as well: http://lkml.indiana.edu/hypermail/linux/kernel/0103.3/0906.html

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us