[GENERAL] 9.6 Beta 2 Performance Regression on Recursive CTE
Hi, I'm seeing a performance regression on 9.6 Beta 2 compared to 9.5.3. The query in question is a recursive query on graph data stored as an adjacency list. While this example is fairly contrived, it mimics the behavior I am seeing on real data with more realistic queries. The example below uses the same data set, same DB configuration, same query plan, and same host, but the 9.6 Beta 2 query runs in 26 seconds while 9.5.3 runs in 13.4 seconds. This pattern holds up on repeated runs. It seems to be related to the citext data type: if I re-run the test below with varchar as the data type, performance is similar between 9.5 and 9.6. Both are running on CentOS 6.6 using the PG community RPMs.

Version details:

PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
PostgreSQL 9.6beta2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

Relevant config options (same on both):

cpu_index_tuple_cost=0.005
cpu_operator_cost=0.0025
cpu_tuple_cost=0.05
effective_cache_size=4GB
random_page_cost=4
seq_page_cost=1
shared_buffers=2GB
temp_buffers=8MB
work_mem=4MB

Very basic table structure:

\d adj_edge_citext
Table "public.adj_edge_citext"
 Column | Type   | Modifiers
--------+--------+-----------
 parent | citext |
 child  | citext |

Import the Graph of Thrones data set from: https://www.macalester.edu/~abeverid/data/stormofswords.csv (ignore the weight column).
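For anyone trying to reproduce this, the setup can be sketched roughly as follows. The CSV column names and the \copy options are assumptions based on the data set described above, not details from the original post:

```sql
CREATE EXTENSION IF NOT EXISTS citext;

CREATE TABLE adj_edge_citext (
    parent citext,
    child  citext
);

-- The CSV has a header row and a weight column we discard; load it
-- through a staging table (column names here are assumptions).
CREATE TEMP TABLE edge_import (source text, target text, weight int);
\copy edge_import FROM 'stormofswords.csv' WITH (FORMAT csv, HEADER true)

INSERT INTO adj_edge_citext (parent, child)
SELECT source, target FROM edge_import;

VACUUM ANALYZE adj_edge_citext;
```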
Run VACUUM ANALYZE on the table, and then run the following:

EXPLAIN (analyze, buffers, verbose)
WITH RECURSIVE d AS (
    SELECT a.parent, a.child, 1 AS LEVEL, ARRAY[a.parent] AS path, FALSE AS CYCLE
    FROM adj_edge_citext a
    WHERE a.parent in (SELECT distinct(parent) FROM adj_edge_citext)
    UNION ALL
    SELECT a1.parent, a1.child, d.LEVEL+1, path || a1.parent, a1.parent=ANY(path)
    FROM adj_edge_citext a1
    JOIN d ON d.child=a1.parent
    WHERE NOT CYCLE
)
SELECT parent, child, level, path FROM d;

9.5.3: https://explain.depesz.com/s/HOO7

QUERY PLAN
CTE Scan on d  (cost=9766.98..12900.18 rows=31332 width=100) (actual time=0.810..13023.860 rows=1427997 loops=1)
  Output: d.parent, d.child, d.level, d.path
  Buffers: shared hit=6, temp read=31714 written=64240
  CTE d
    ->  Recursive Union  (cost=31.31..9766.98 rows=31332 width=48) (actual time=0.805..9613.617 rows=1427997 loops=1)
          Buffers: shared hit=6, temp read=31714 written=31701
          ->  Hash Semi Join  (cost=31.31..69.83 rows=352 width=12) (actual time=0.803..1.879 rows=352 loops=1)
                Output: a.parent, a.child, 1, ARRAY[a.parent], false
                Hash Cond: (a.parent = adj_edge_citext.parent)
                Buffers: shared hit=4
                ->  Seq Scan on public.adj_edge_citext a  (cost=0.00..19.60 rows=352 width=12) (actual time=0.021..0.177 rows=352 loops=1)
                      Output: a.parent, a.child
                      Buffers: shared hit=2
                ->  Hash  (cost=27.58..27.58 rows=71 width=6) (actual time=0.760..0.760 rows=71 loops=1)
                      Output: adj_edge_citext.parent
                      Buckets: 1024  Batches: 1  Memory Usage: 11kB
                      Buffers: shared hit=2
                      ->  HashAggregate  (cost=20.48..24.03 rows=71 width=6) (actual time=0.638..0.681 rows=71 loops=1)
                            Output: adj_edge_citext.parent
                            Group Key: adj_edge_citext.parent
                            Buffers: shared hit=2
[GENERAL] pgbench - prevent client from aborting on ERROR
Hi, Is there any way to do this? For context, I want to write a custom script that runs at the repeatable read isolation level. If I hit a serialization error, I don't want the client to abort; I want it to continue running transactions. Is that possible? thanks, Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
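As an illustration of the kind of script involved (the table and variable names here are assumptions, not from the original post), a custom pgbench file might look like:

```sql
-- custom.sql, run with: pgbench -f custom.sql ...
-- \setrandom is the 9.x-era syntax; newer pgbench uses \set aid random(1, 100000)
\setrandom aid 1 100000
BEGIN ISOLATION LEVEL REPEATABLE READ;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = :aid;
END;
```

When two such transactions conflict, one fails with a serialization error at COMMIT, and pgbench aborts that client rather than retrying — which is the behavior the question is about.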
[GENERAL] Documentation Inaccuracy – Transaction Isolation
Hi, I noticed an inaccuracy in the transaction isolation docs. Under the Repeatable Read Isolation Level section it states: “The Repeatable Read isolation level only sees data committed before the transaction began; it never sees either uncommitted data or changes committed during transaction execution by concurrent transactions.” That is not entirely accurate. The snapshot starts with the first SQL statement in the transaction, not at the start of the transaction. Any change that is committed in another transaction after the start of the transaction but before the first SQL statement will be seen. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
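The behavior described can be demonstrated with two concurrent sessions (the table name is illustrative):

```sql
-- Session A
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- ... no statement has been executed yet, so no snapshot has been taken ...

-- Session B, meanwhile
INSERT INTO t VALUES (1);
COMMIT;

-- Session A: the first statement takes the snapshot now, so session B's
-- row is visible even though it committed after session A's BEGIN.
SELECT * FROM t;
```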
[GENERAL] Stat estimates off - why?
Hi, I'm seeing something odd in my DB stats. This is PG 9.2.4 on CentOS. I know the stats collector can be off under heavy load, but this system was barely doing anything (CPU ~90% idle), and PG didn't shut down at all either. I'm curious how the table ended up with 191 live tuples when only 17 inserts were done against it. In actuality, the table was empty at the time I ran this query; manually analyzing it updated n_live_tup back down to 0.

select n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd, n_live_tup, n_dead_tup from pg_stat_user_tables where relname = 'foo';

-[ RECORD 1 ]-+----
n_tup_ins     | 17
n_tup_upd     | 305
n_tup_del     | 17
n_tup_hot_upd | 297
n_live_tup    | 191
n_dead_tup    | 11

Thanks, Brad.
[GENERAL] Temp files on Commit
Hi, I'm seeing cases where I have temp files being written on commit, such as:

2013-08-16 12:09:02 GMT [14480]: [588-1] user=dbuser,db=dbname STATEMENT: COMMIT
2013-08-16 12:09:02 GMT [14480]: [589-1] user=dbuser,db=dbname LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp14480.263", size 814822

Is this a case of having work_mem set too low, or something else? I haven't seen temp files on commit before. Thanks, Brad.
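For diagnosing this, a sketch of the relevant settings (the values are illustrative, not recommendations). Temp files logged at COMMIT can come from work performed at commit time — for example, a WITH HOLD cursor being materialized — rather than from the statements themselves:

```sql
-- Log every temporary file (0 = no size threshold); useful for
-- correlating temp file writes with the statements that caused them.
SET log_temp_files = 0;

-- Raising work_mem per session, as a test, shows whether the
-- spill simply exceeds the current 4MB-ish setting.
SET work_mem = '64MB';
```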
[GENERAL] Source code and database object identifiers
Hi all, We are planning to add PostgreSQL support to our application. We have run into the issue where PostgreSQL converts all unquoted database object identifiers to lower case. I understand that's how PostgreSQL works, and that I can double-quote the identifiers to preserve the camel case. Unfortunately I cannot double-quote the identifiers, and I need to preserve the mixed-case identifiers for our application to work. I wouldn't mind changing the source code to help with this issue. I have set up the debug environment in Eclipse and am able to compile and debug PostgreSQL. Can someone please provide some guidance on where I should make the changes to preserve mixed case for identifiers? Thank you Brad.
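To illustrate the folding behavior in question (table and column names are made up for the example):

```sql
CREATE TABLE MyTable (MyColumn int);   -- stored as mytable / mycolumn

SELECT MyColumn FROM MyTable;          -- works: both fold to lower case
SELECT "MyColumn" FROM "MyTable";      -- fails: no mixed-case object exists

CREATE TABLE "MyQuoted" (id int);      -- quoted: stored as MyQuoted
SELECT * FROM MyQuoted;                -- fails: folds to myquoted
```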
Re: [GENERAL] Hope for a new PostgreSQL era?
On Thu, Dec 8, 2011 at 9:11 AM, Marc Cousin > > wrote: > > I wish it was the same (I use and like both pgbouncer and pgpool too, > and they do a good job, I'm not arguing on that). But unfortunately it > isn't: you still have the notion of session for each connected client > in Oracle when using the shared servers model. > > It means you keep your session variables, your prepared statements, > your running transaction, etc… in each individual session while having > the multiplexing equivalent of a 'statement level' from pgbouncer. In Oracle - can the pool share connections between DB users and/or databases on the instance? If the answer is yes to either, that is a fair bit better than what we can achieve today. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] psql HTML mode - quoting HTML characters
> -Original Message- > From: Josh Kupershmidt [mailto:schmi...@gmail.com] > Sent: Wednesday, October 26, 2011 5:04 PM > To: Nicholson, Brad (Toronto, ON, CA) > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] psql HTML mode - quoting HTML characters > > On Wed, Oct 26, 2011 at 3:51 PM, Nicholson, Brad (Toronto, ON, CA) > wrote: > > How do I quote characters like < and > in psql's HTML mode? > > From a brief look at print_html_text() and html_escaped_print() in > psql's print.c, I don't see any way to tell psql not to escape some > block of text in HTML print mode. Darn. Thanks though. > Out of curiosity, what do you find psql's HTML mode useful for? Quick and dirty reporting. Set the HTML mode, run a query, and dump the output into Apache's document directory and you have a very quick, albeit crude way to present results. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
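The quick-and-dirty reporting workflow described amounts to something like the following psql session (the output path is illustrative):

```sql
-- inside psql:
\H
\o /var/www/html/report.html
SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;
\o
\H
```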
[GENERAL] psql HTML mode - quoting HTML characters
Hi, How do I quote characters like < and > in psql's HTML mode? For example:

Regular mode:

postgres=# select '<a href=http://www.postgresql.org>Postgres</a>';
                    ?column?
------------------------------------------------
 <a href=http://www.postgresql.org>Postgres</a>
(1 row)

HTML mode:

postgres=# \H
Output format is html.
postgres=# select '<a href=http://www.postgresql.org>Postgres</a>';

The table cell in the output contains the escaped text:

&lt;a href=http://www.postgresql.org&gt;Postgres&lt;/a&gt;
(1 row)

I would like that line to be:

<a href=http://www.postgresql.org>Postgres</a>

Thanks, Brad
Re: [GENERAL] max_stack_depth error, need suggestion
> -Original Message- > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- > ow...@postgresql.org] On Behalf Of Leif Biberg Kristensen > Sent: Thursday, August 18, 2011 6:49 AM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] max_stack_depth error, need suggestion > > On Thursday 18. August 2011 12.39.31 AI Rumman wrote: > > I am using Postgresql 9.0.1 in Centos 5. > > > > Yesterday, I got the error inlog: > > > > 2011-08-16 10:57:34 EDT [3868]: [1-1] user=vcrm,db=vtigercrm504 > ERROR: > > stack depth limit exceeded > > 2011-08-16 10:57:34 EDT [3868]: [2-1] user=vcrm,db=vtigercrm504 HINT: > > Increase the configuration parameter "max_stack_depth", after > ensuring the > > platform's stack depth limit is adequate. > > > > > > I found that I need to increase max_stack_depth. But doc says that it > is a > > bit risky increasing it. > > > > Could any one please suggest me what the maximum safe value I may set > in my > > environment? > > > > My Server RAM is 32 GB. > > That error message is usually caused by an infinite recursion. Slony can also cause this to happen (at least it could - I'm not sure if it still does) - it wasn't from infinite recursion though. I used to have to set that higher for some of my clusters. They may have fixed the query that was causing that to happen though. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
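For checking the values involved, a sketch (the safe ceiling is the OS stack limit for the postgres user, minus a safety margin of a megabyte or two):

```sql
-- Current PostgreSQL setting (the default is typically 2MB):
SHOW max_stack_depth;

-- Compare against the OS limit, e.g. `ulimit -s` from the postgres
-- user's shell (commonly 8192 or 10240 kB); max_stack_depth should
-- stay at least ~1MB below that.
SET max_stack_depth = '7MB';  -- illustrative, assuming an 8MB OS limit
```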
Re: [GENERAL] PostgreSQL 9.0 or 9.1 ?
> -Original Message-
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
> ow...@postgresql.org] On Behalf Of Grzegorz Jaskiewicz
> Sent: Thursday, June 16, 2011 11:05 AM
> To: Achilleas Mantzios
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] PostgreSQL 9.0 or 9.1 ?
>
> It could be worth considering 9.1. Probably by the time you get a
> production-ready version, 9.1 will already be stable (a few months I
> guess).
> The usual answer to that question is - it will be ready when it's ready.

I would also ask: what is your (and your management's) tolerance for risk, and do you actually need any of the new features and/or performance benefits in 9.1? Postgres does have an excellent track record for quality and stability with new releases, but a couple of months in the field isn't really considered stable in most places. Brad.
Re: [GENERAL] Inspecting a DB - psql or system tables ?
> -Original Message- > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- > ow...@postgresql.org] On Behalf Of Andrew Sullivan > Sent: Friday, May 27, 2011 2:32 PM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Inspecting a DB - psql or system tables ? > > On Fri, May 27, 2011 at 08:26:33PM +0200, Tomas Vondra wrote: > > > While parsing the output of psql is cumbersome, accessing the > > > system tables seems more likely to break whenever a new version > > > of PostgreSQL comes out. > > > > Really? Those catalogs are pretty stable, and when changed they're > > usually extended (new columns are added). So well written queries > won't > > break very often. Actually I'd expect the psql output to change much > > more often. > > The whole point of the information_schema is that it's well-defined by > the standard. The system tables themselves do sometimes change > between versions -- that's why you get warnings from psql when you > start up a client with a different major version number than the > server. (If you want to see this in action, try using a 7.4-era > client with 9.0, and do some tab completion or something like that.) > There is a sharp edge to watch out for when querying for this data between the system catalogs and the information schema, and it's not mentioned in our docs anywhere. The information schema queries will only return rows back for objects that the user issuing the query has permissions on. This is the correct behavior as per the SQL spec I believe, but very different from the way the pg_catalog queries work - which will return you all objects back regardless of permissions on them. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
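A quick way to see the difference described above (object and role names are illustrative):

```sql
-- As a superuser: create a table the unprivileged role cannot access.
CREATE TABLE secret_stuff (id int);
REVOKE ALL ON secret_stuff FROM PUBLIC;

-- As the unprivileged role:
SELECT count(*) FROM information_schema.tables
WHERE table_name = 'secret_stuff';   -- 0: filtered by privilege

SELECT count(*) FROM pg_catalog.pg_class
WHERE relname = 'secret_stuff';      -- 1: visible regardless of privilege
```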
Re: [GENERAL] 10 missing features
> -Original Message-
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
> ow...@postgresql.org] On Behalf Of Radoslaw Smogura
> Sent: Tuesday, April 26, 2011 9:55 AM
> To: Leif Biberg Kristensen
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 10 missing features
>
> For example, having knowledge when particular query stuck may be great
> advantage for administrators and developers as well. Ofcourse each
> functionality gives some overhead, but from other hand, if you create
> important systems (like financials) "stability" and how it's easy to
> track errors is required.

For those types of systems, lack of a true audit log is probably a bigger barrier. The biggest "missing features" are going to depend on your problem space.

> Form this what I was interested and saw:
> * I think he good pointed that logging indices, may be unneeded, as
> those can be recreated.

Whether this is acceptable depends on your system. Yes, they can be recreated, with a number of caveats:

- Performance for many systems will be poor until some (or all) indexes are back. If you have SLAs based around performance, this can extend your outage until the indexes get rebuilt.
- Indexes are used to enforce primary keys. Are you comfortable running temporarily without your primary keys?
- Some replication engines rely on primary keys or unique indexes. Losing these could break replication for you.

I think if you could control this on a per-index basis, though, it could be a win. Brad
Re: [GENERAL] 10 missing features
> -Original Message- > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- > ow...@postgresql.org] On Behalf Of Greg Smith > Sent: Monday, April 25, 2011 4:23 PM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] 10 missing features > > On 04/25/2011 10:48 AM, Andrew Sullivan wrote: > > You can see this in certain items in the top 10. Three, four, five, > > seven, maybe 8 eight, and ten all seemed to me to be things I've > > actually done before, but not using something directly inside > > Postgres. > > > > The idea that something must ship in the database to be useful is > really > engrained in some people. I do this talk nowadays about common > mistakes > people make when deploying PostgreSQL, and one of the top items I put > on > there is not actively investigating external tools. The problem is that there is a lot of noise in the add-on space. There are lots of things out there that are no longer supported or partially supported. There is a fairly high barrier of entry into figuring out which tools to use, how to put them together and what you can and can't do with them. If (and I stress the word if) the target is winning over DBA's from the commercial crowd this is an important point, as those DBA's are going to be used to getting most of what they need in one package along with the DB. > None of the items on this list would be on my own top list of missing > things in PostgreSQL. I see "Better fragmentation management" as a > footnote and there's an intro discussion to that on the blog at > http://blog.kimiensoftware.com/2011/04/compacting-postgresql-tables/ > Apparently the struggles required to sort out a little 25GB table > apparently didn't make enough of an impression to put that into its > proper place, which is way ahead of every item listed on the suggested > missing feature set. Query progress is #1? It's annoying, yes, but so > not even close to pole position to me. 
From reading the blog a bit, it > sounds like the author is managing lots of smallish (to me) databases, > so putting so much emphasis on making each individual one easier to > troubleshoot makes more sense. I think you touch on this here - a lot of what the "most needed" things are will depend on your problem set. Lack of differential backups used to be a huge pain when I had multi-terabyte data warehouses to look after. Ditto for query progress when I had managers asking me when ad-hoc OLAP-style queries would complete. I do think the areas where PG is lacking come down to finer-grained profiling of tasks. The ability to isolate the CPU and IO utilization of particular queries or sets of queries is something I find very useful in the commercial DB space that I'd love to see in Postgres. The same goes for troubleshooting locking conflicts if you aren't looking at the system when they are happening, and tracing the causes of those locks down to finer-grained details (i.e. am I waiting on buffer eviction or xlog writes?). I do realize that there are ways to get at some of this stuff or work around it - but the barrier to entry is often pretty high, it can involve high-volume logging, and it is often a far more time-consuming task than it could be. Brad.
Re: [GENERAL] Disk space usage analyzer?
> -Original Message- > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- > ow...@postgresql.org] On Behalf Of Steve Crawford > Sent: Monday, March 28, 2011 12:22 PM > To: Yang Zhang > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Disk space usage analyzer? > > On 03/25/2011 07:58 PM, Yang Zhang wrote: > > Is there any tool for breaking down how much disk space is used by > > (could be freed by removing) various tables, indexes, selected rows, > > etc.? Thanks! > > > > You can use the pg_class table and the pg_relation_size (and optionally > the pg_size_pretty) functions to get that info. This query gives table > sizes and percent of overall usage. > > BUT! It is only looking at tables, not indexes. If you want to know how > much space will be freed by dropping a table, you will have to modify > this query to total up the index space used for all the indexes > associated with each table. pg_total_relation_size() will give you the size of the table and the indexes on it. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
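A sketch of a per-table breakdown using pg_total_relation_size(), which includes indexes and TOAST data alongside the heap:

```sql
SELECT c.relname AS table_name,
       pg_size_pretty(pg_relation_size(c.oid))       AS table_only,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS with_indexes_and_toast
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 20;
```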
Re: [GENERAL] multi-tenant vs. multi-cluster
> -Original Message- > From: Ben Chobot [mailto:be...@silentmedia.com] > Sent: Friday, March 18, 2011 3:45 PM > To: Nicholson, Brad (Toronto, ON, CA) > Cc: pgsql-general General > Subject: Re: [GENERAL] multi-tenant vs. multi-cluster > > > On Mar 18, 2011, at 12:34 PM, Nicholson, Brad (Toronto, ON, CA) wrote: > > >>> b) its own postgresql processes (many of them) running in memory > >> > >> I believe this is entirely a function of client connections. > > > > With a single instance, you can use connection pooling to reduce the > overall number of backend connections which will reduce your memory > footprint. > > Er, right, for some reason I was thinking I could use connection > pooling against multiple clusters, but now that I think about it that > doesn't make much sense, does it? Not for reducing overall numbers of connections on the server. > >> > >>> c) its own shared_buffers in memory. > >> > >> Given that each application will be independent, I don't see a > >> different between clusters and schemas here either. > > > > The difference is that in a single cluster, a single instance is > going to make decisions about what data to cache or not. This is an > overly simplified example - but illustrates the point. Say you have > 4GB of RAM available to dedicate to a shared buffers on a server, and > two databases (DB A and DB B) to run. You either set up a single > instance with a 4GB pool, or two instances with 2GB pools each. Let's > say that DB A gets really busy, and DB B is not. In the shared > instance approach, the instance can evict buffers cached for DB B in > order to load buffers needed for DB A. In the split instance, you > can't. > > Ah, that's an illustrative example. Thanks. > > OK, so are there any good ways to keep a bad/clueless user from gumming > up a whole cluster? Something like statement_timeout, but for > transactions, seems like it would be idle. statement_timeout will only time out SQL queries, not DB transactions. 
There is nothing internal for that. It's a fairly easy query to terminate all IDLE transactions, but you have to be careful that you aren't terminating active sessions. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
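A sketch of such a query. Note the pg_stat_activity columns changed across versions: before 9.2 the markers are procpid and current_query = '&lt;IDLE&gt; in transaction'; from 9.2 on they are pid and state = 'idle in transaction'. The 10-minute threshold is illustrative:

```sql
-- 9.2+ form: terminate sessions idle in a transaction for too long.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - xact_start > interval '10 minutes';
```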
Re: [GENERAL] multi-tenant vs. multi-cluster
> -Original Message-
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
> ow...@postgresql.org] On Behalf Of Ben Chobot
> Sent: Friday, March 18, 2011 3:10 PM
> To: Ivan Voras
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] multi-tenant vs. multi-cluster
>
> On Mar 18, 2011, at 11:47 AM, Ivan Voras wrote:
>
> > On 18/03/2011 19:17, Ben Chobot wrote:
> >
> >> if we're talking an extra 50MB of memory per cluster, that will
> start to add up.
> >
> > Consider this: each such cluster will have:
> >
> > a) its own database files on the drives (WAL, data - increasing IO)
>
> Oh, I hadn't thought about WAL. Good point.
> But data files are a function of tables and indexes, right? Having them
> in different schemas or different clusters isn't going to change that.
> I guess there are system tables but those are relatively trivial - I
> think?

Correct, but with different clusters you are going to have different backends handling writes without regard to each other. How this unfolds will depend on your underlying disk structure and filesystems. I've had bad experiences in the past having multiple Postgres instances fighting for the same disk.

> > b) its own postgresql processes (many of them) running in memory
>
> I believe this is entirely a function of client connections.

With a single instance, you can use connection pooling to reduce the overall number of backend connections, which will reduce your memory footprint.

> > c) its own shared_buffers in memory.
>
> Given that each application will be independent, I don't see a
> different between clusters and schemas here either.

The difference is that in a single cluster, a single instance is going to make decisions about what data to cache or not. This is an overly simplified example - but it illustrates the point. Say you have 4GB of RAM available to dedicate to shared buffers on a server, and two databases (DB A and DB B) to run.
You either set up a single instance with a 4GB pool, or two instances with 2GB pools each. Let's say that DB A gets really busy, and DB B is not. In the shared instance approach, the instance can evict buffers cached for DB B in order to load buffers needed for DB A. In the split instance, you can't. Brad. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How do you change the size of the WAL files?
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of runner
Sent: Friday, March 11, 2011 2:31 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] How do you change the size of the WAL files?

>> > Our WAL files are 16 Mb in size each. I cannot find where you
>> > configure them to make them larger. 16 Mb is too small for our
>> > installation. If necessary, I can rebuild the entire cluster. Our
>> > DBA initialized a new cluster and specified a larger WAL size but it
>> > stayed at 16 Mb. Any info on this?
>>
>> Why is 16MB too small? Most people increase checkpoint_segments for
>> larger installs and it works fine.

> To answer the question, it is possible to change the wal log size. It is
> a setting in pg_config.h: #define XLOG_SEG_SIZE (16 * 1024 * 1024)
>
> If you change that and recompile, your wal segment size will be
> different. I have heard of people running 32meg sizes. We have tried
> with smaller sizes but it ended up thrashing the disk.
>
> Further keep in mind you will have to do a dump and restore to the new
> cluster. Your existing cluster will not be usable once you do this.

> Probably won't change then. The boss was worried that a 16 Mb file size
> would cause performance issues on a busy database server.

I would be more concerned about encountering performance problems by changing to a non-standard setting for something like WAL segment size, and I'd want to do very intensive performance testing of such a change before deploying it. Be very aware that just because something works in one fashion on another database like Oracle, it does not mean that it will work in the same fashion for Postgres (or any other DB for that matter). Brad.
Re: [GENERAL] Binary Replication and Slony
On 10-09-20 12:49 PM, Bruce Momjian wrote:

John Cheng wrote:
Congrats on the 9.0 release of PostgreSQL. One of the features I am really interested in is the built-in binary replication. Our production environment has been using PostgreSQL for more than 5 years (since this project started). We have been using Slony-I as our replication mechanism. I am interested to find out the pros and cons of Slony vs the built-in replication in 9.0. Based on what I understand:

* Slony has a higher overhead than the binary replication in 9.0
* When using Slony, schema changes must be applied via slonik (in most cases)
* Unfortunately, IMO it is easy to make a mistake when applying schema changes in Slony; fortunately, it is easy to drop and recreate the replication sets
* Slony is an asynchronous replication mechanism
* Slony allows you to replicate some tables while ignoring others
* PostgreSQL 9.0 with hot standby & streaming replication is an asynchronous replication mechanism
* Overhead is low compared to Slony

Are there some cases where it is better to use Slony, for example, when you must specifically exclude tables from replication? I believe our system will be better off using the built-in replication mechanism of 9.0, and I am guessing most people will be in the same boat.

You have summarized the differences well. Streaming replication has lower overhead, but doesn't allow per-table granularity or allow replication between different versions of Postgres.

Slony will also allow you to:

- run custom schema (like extra indexes) on replicas
- replicate between different hardware architectures and OS's
- run lengthy queries against replicas without having to worry about trade-offs surrounding query cancellation vs standby lagging
- switch roles of two nodes without entering a degraded state or worrying about STONITH. If you switch roles in a controlled manner, both nodes remain in the cluster. Slony prevents writes against the replica.
I do agree that for most, Slony is overkill and streaming replication and hot standby will be the better choice. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Need help understanding vacuum verbose output
On 10-08-06 11:45 AM, Gordon Shannon wrote: OK, so if it knew that all vacuumable tuples could be found in 492 pages, and it scanned only those pages, then how could it be that it reports 16558 removable tuples from those 492 pages, when it has already reported earlier that it removed 45878 tuples -- a number we know in fact to be correct? How could both statements be correct? It found 45878 dead tuples in 396 pages for the index authors_archive_pkey. It found 16558 dead tuples in 492 pages for the table authors_archive. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] idle process & autovacuum
On 10-07-30 10:52 AM, Gauthier, Dave wrote:

Hi: I have a perl/dbi script that's loading a lot of data into a DB (probably about 8,000,000 records). pg_stat_activity shows the script's activity alongside another process with current_query as... autovacuum: ANALYZE public.instance_pins The total instance_pins table cardinality is about 60,000,000. The script inserts records but never updates or deletes anything. And before the script runs, the table is truncated. The script's process seems to be stalled in the meantime.

The analyze that is running on that table will block the truncate until the analyze completes. You can check pg_locks to see if there are any ungranted locks relating to the script you are running.

This is a new phenomenon as of yesterday. The only change is that I added an index on that table. But I created the new index yesterday (create index...) and it completed OK. What does this message mean in that context? Is this indicative of something being misconfigured in the DB?

The message means that the data in instance_pins has changed enough for autovacuum to run an analyze on the table. It's doing this to re-generate statistics for the table for the query planner to use.

-- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
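One way to run the pg_locks check suggested above:

```sql
-- Show lock requests that have not been granted, and which
-- backend is waiting on them.
SELECT locktype, relation::regclass AS relation, mode, granted, pid
FROM pg_locks
WHERE NOT granted;
```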
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On 10-07-29 08:54 PM, Greg Smith wrote:

Brad Nicholson wrote:
Postgres also had a reputation of being slow compared to MySQL. This was due to a lot of really poor MySQL vs Postgres benchmarks floating around in the early 2000's.

I think more of those were fair than you're giving them credit for. For many common loads, up until PG 8.1 came out (November 2005), MySQL really was faster. That was the release with the killer read scalability improvements, then 8.3 piled on again with all the write-heavy stuff too. MySQL 4 vs. PG 8.0? MySQL won that fair and square sometimes.

Oh, btw - I'm talking about MySQL 3.x w/MyISAM vs Postgres 7.1/7.2 days. By the time MySQL 4.0/PG 8.0 was around, I was long off MySQL.

-- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On 10-07-29 08:54 PM, Greg Smith wrote: Brad Nicholson wrote: Postgres also had a reputation of being slow compared to MySQL. This was due to a lot of really poor MySQL vs Postgres benchmarks floating around in the early 2000's. I think more of those were fair than you're giving them credit for. I'm sure some were, but I recall a lot that were not. The main problem I recall is that they took the stock postgresql.conf (which was far too restrictive) and measured it against a much better MySQL config. They then measured some test unrealistic for most applications and declared MySQL the clear winner for everything and Postgres slow as a dog. It's one thing for database folks to look at that and see the problems and/or limitations with those sorts of tests. But a lot of developers were taking these to heart and siding with MySQL and slagging Postgres as being slow - often unjustly. For many common loads, up until PG 8.1 came out--November 2005--MySQL really was faster. That was the release with the killer read scalability improvements, then 8.3 piled on again with all the write-heavy stuff too. MySQL 4 vs. PG 8.0? MySQL won that fair and square sometimes. Oh, I agree that MySQL was faster for some stuff, but not everything. Back in those days, I routinely saw web sites backed by MySQL 3.x (forget the exact version) grind to an absolute halt under concurrent access due to table level locking in MyISAM. Moving those over to the earlier branches of 7.x improved things drastically. That said, I also saw the opposite, where MySQL was a lot faster than Postgres. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On 10-07-29 02:57 PM, Tom Lane wrote: Samantha Atkins writes: Why is MySQL so much more popular right now, especially in the OpenSource community? I think it's strictly historical. The mysql bias you see in so many web tools was established in the late 90s, a time when mysql worked reasonably well (at least according to the mysql developers' notion of "reasonably well") whereas postgres was still pretty slow and buggy. It took us a long time to get from the original academically-oriented code to something of real production quality. We're definitely competitive now, but I don't know if we'll ever fully overcome that historical disadvantage. regards, tom lane Postgres also had a reputation of being slow compared to MySQL. This was due to a lot of really poor MySQL vs Postgres benchmarks floating around in the early 2000's. They generally tested stock configurations (MySQL had a less restrictive out of the box configuration) and they tended to test things like how fast can a single client insert/update/delete data from a table. Unsurprisingly, MySQL won, as Postgres imposed all sorts of pesky behind the scenes protection for your data that MySQL didn't worry about. No one really tested it in a way that mattered, which was how the two databases performed under concurrent load, where Postgres won hands down. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Configure Postgres From SQL
On Mon, 2010-07-12 at 14:57 +0100, Thom Brown wrote: > On 12 July 2010 14:50, Tom Wilcox wrote: > > Hi Thom, > > > > I am performing update statements that are applied to a single table that is > > about 96GB in size. These updates are grouped together in a single > > transaction. This transaction runs until the machine runs out of disk space. As you are updating this table, you are leaving dead tuples behind for each of the updates that are not hot updates and the table is getting bloated. That is most likely why you are running out of disk space. Turning off fsync will not help you with this. What will help you is trying to get the database to use hot updates instead, or batching the updates and letting the table get vacuumed often enough so that the dead tuples can get marked for re-use. Hot updates would be very beneficial, even if batch updating. They will happen if there is no index on the updated column and there is enough space in the physical page to keep the tuple on the same page. You can adjust the fillfactor to try and favour this. You can check if you are doing hot updates by looking at pg_stat_user_tables for the number of hot updates. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
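As a sketch of the two checks described above (the HOT counters live in pg_stat_user_tables; the table name in the ALTER is hypothetical):

```sql
-- How many updates were HOT vs. regular, per table.
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC;

-- Leave 30% free space in each page to make room for HOT updates.
-- Takes effect for newly written pages; existing pages need a rewrite
-- (e.g. CLUSTER or VACUUM FULL) to pick up the new fillfactor.
ALTER TABLE mytable SET (fillfactor = 70);
```

A high n_tup_upd with a low n_tup_hot_upd suggests either an index on the updated column or pages too full to hold the new tuple version.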
Re: [GENERAL] A thought about other open source projects
Scott Marlowe wrote: As with phrases like, "the quickest way to grill a unicorn steak," that it can be stated in a few words does not make it possible. Exactly. The big issue here is that nobody's saying what kind of app they want to write. Or what sort of performance requirements are tied to that app. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] pgcrypto in separate schema
I want to put the functions from pgcrypto into a separate schema, but pgcrypto.sql is explicitly setting the search path to public. Is there a reason it does this that I should be aware of? Is it fine to change that and install the functions in a separate schema? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] ERROR: unrecognized time zone name: "UTC"
On Tue, May 18, 2010 at 8:15 PM, Brad Ediger wrote: > I am experiencing this error trying to set the connection time zone to > UTC on PostgreSQL 8.4.4: > > postgres=# set time zone 'UTC'; > ERROR: unrecognized time zone name: "UTC" Answered my own question -- it was my Prefix Portage install missing the zoneinfo database ($EPREFIX/usr/share/zoneinfo/UTC). Working great now. -be -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] ERROR: unrecognized time zone name: "UTC"
I am experiencing this error trying to set the connection time zone to UTC on PostgreSQL 8.4.4: postgres=# set time zone 'UTC'; ERROR: unrecognized time zone name: "UTC" I have read the documentation for timezone_abbreviations, and everything looked in order, at least as far as my limited knowledge takes me: postgres=# show timezone_abbreviations; timezone_abbreviations Default (1 row) $ grep UTC `pg_config --sharedir`/timezonesets/Default UTC 0# Coordinated Universal Time Setting the time zone to 'CST6CDT' appears to work fine: postgres=# set time zone 'CST6CDT'; SET Is there something I'm doing wrong here? I have a slightly odd configuration (Gentoo Prefix Portage on Mac OS X, with a custom ebuild) so the problem may well be with my system setup. But I'd appreciate any pointers folks may have as to where to troubleshoot. Thanks! -be -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Lifekeeper
Hi, Is anyone using Lifekeeper for Linux availability with Postgres? If so, what are your thoughts on it? Work as advertised? Any dangerous gotchas? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Solid State Drives with PG
On Fri, 2010-03-26 at 15:27 -0400, Merlin Moncure wrote: > On Fri, Mar 26, 2010 at 2:32 PM, Greg Smith wrote: > > Merlin Moncure wrote: > >> > >> So flash isn't yet a general purpose database solution, and wont be until > >> the write performance problem is fixed in a way that doesn't > >> compromise on volatility. > > > > Flash drives that ship with a supercapacitor large enough to ensure orderly > > write cache flushing in the event of power loss seem to be the only solution > > anyone is making progress on for this right now. That would turn them into > > something even better even than the traditional approach of using regular > > disk with a battery-backed write caching controller. Given the relatively > > small write cache involved and the fast write speed, it's certainly feasible > > to just flush at power loss every time rather than what the BBWC products > > do--recover once power comes back. > > right -- unfortunately there is likely going to be a fairly high cost > premium on these devices for a good while yet. right now afaik you > only see this stuff on boutique type devices...yeech. TMS RamSan products have more than adequate capacitor power to handle failure cases. They look like a very solid product. In addition to this, they have internal RAID across the chips to protect against chip failure. Wear-leveling is controlled on the board instead of offloaded to the host. I haven't gotten my hands on one yet, but should at some point in the not too distant future. I'm not sure what the price point is though. But when you factor in the cost of the products they are competing against from a performance perspective, I'd be surprised if they aren't a lot cheaper. Especially when figuring in all the other costs that go along with disk arrays - power, cooling, rack space costs. Depends on your vantage point, I guess. I'm looking at these as potential alternatives to some high end, expensive storage products, not a cheap way to get really fast disk. 
-- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] 8.3.10 Changes
Could someone please point me towards the changes for 8.3.10 that were mentioned on -announce this morning? Also, any idea when this is going to be released? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Work Mem Allocation Questions
1: Is the default work_mem pre-allocated to the Postgres processes - or does it get allocated when needed? Say I have work_mem at 30MB - will Postgres allocate that 30MB on connection, or only when it is needed by a query? 2: If work_mem is, say, 30MB, and my query needs 10MB - will Postgres allocate all 30MB, or just the 10MB I need? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cacti + PostgreSQL Graphing
On Mon, 2010-03-01 at 14:43 -0400, Marc G. Fournier wrote: > First, thanks for the pointer to the Check_postgres stuff, definitely alot > of good stuff in there ... but ... that is a totally different beast then > I'm looking at from Cacti (or, so I believe) ... for instance, > Check_posgres:database_size will alert me if I go over X size, which is > useful but with Cacti, I get a graph of waht the size is over a 5 min > period of time, so I can watch growth ... Check out Hyperic. We're currently evaluating it for profiling our systems, and I'm pretty impressed with it so far. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] checkpoints/bgwriter tuning verification
On Fri, 2009-10-30 at 07:15 -0400, Steve Clark wrote: > On 10/29/2009 04:42 PM, Scott Marlowe wrote: > > On Thu, Oct 29, 2009 at 12:46 PM, Vick Khera wrote: > >> On my primary DB I'm observing random slowness which just doesn't make > >> sense to me. The I/O system can easily do 40MB/sec writes, but I'm > >> only seeing a sustained 5MB/sec, even as the application is stalling > >> waiting on the DB. > > > > Just one point on top of everything else you'll hear. 40 MB/sec > > sequential throughput does not equal 40MB/sec random PLUS checkpoint > > throughput. Random access is gonna lower that 40MB/sec way down real > > fast. > > > > First step to speed things up is putting pg_xlog on its own disk(s). > Hi Scott, > > How exactly do you do this? By creating a link to the new location or > is there a config option somewhere that says where the pg_xlog resides? There is an option to do this during initdb. If you want to do it after the DB is created, move the contents of pg_xlog/ (when the DB is shut down) and make a symlink to the new directory. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
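The after-the-fact procedure described above looks roughly like this (a sketch only -- the paths are examples, and the cluster must be cleanly shut down before anything is moved):

```
pg_ctl -D /var/lib/pgsql/data stop -m fast
mv /var/lib/pgsql/data/pg_xlog /wal_disk/pg_xlog
ln -s /wal_disk/pg_xlog /var/lib/pgsql/data/pg_xlog
pg_ctl -D /var/lib/pgsql/data start
```

The symlink makes the relocation transparent to the server; the equivalent at cluster creation time is initdb's WAL-directory option.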
Re: [GENERAL] checkpoints/bgwriter tuning verification
On Thu, 2009-10-29 at 14:46 -0400, Vick Khera wrote: > On my primary DB I'm observing random slowness which just doesn't make > sense to me. The I/O system can easily do 40MB/sec writes, but I'm > only seeing a sustained 5MB/sec, even as the application is stalling > waiting on the DB. > > My only guess is that I'm getting hit by checkpoints too often, and > this is causing some delays. I did a lot of reading and came across > this most excellent article by Greg Smith > (http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm). > Reading that and the Pg manual and looking at the statistics, here is > what I think I need to do. (Greg, do you have a performance tuning > book? If so, I really want to buy it! Your articles are awesome.) > > current settings for checkpoints: > checkpoint_segments = 32 > checkpoint_timeout = 900 > checkpoint_completion_target = 0.9 > log_checkpoints = on > > In my logs I see this pretty consistently during times of slowness: > > Oct 29 14:17:38 d04 postgres[54902]: [10990-1] LOG: checkpoint starting: xlog > Oct 29 14:22:04 d04 postgres[54902]: [10991-1] LOG: checkpoint > complete: wrote 52828 buffers (24.3%); 0 transaction log file(s) > added, 0 removed, 32 recycled; write=265.881 > Oct 29 14:22:57 d04 postgres[54902]: [10992-1] LOG: checkpoint starting: xlog > Oct 29 14:26:56 d04 postgres[54902]: [10993-1] LOG: checkpoint > complete: wrote 52773 buffers (24.3%); 0 transaction log file(s) > added, 0 removed, 32 recycled; write=234.846 > Oct 29 14:28:32 d04 postgres[54902]: [10994-1] LOG: checkpoint starting: xlog > Oct 29 14:33:32 d04 postgres[54902]: [10995-1] LOG: checkpoint > complete: wrote 53807 buffers (24.7%); 0 transaction log file(s) > added, 0 removed, 32 recycled; write=299.170 > > > basically, the next checkpoint starts within a few seconds of the > prior one completing. That's most likely due to having checkpoint_completion_target set near the maximum. 
You are telling it to stretch the checkpoint out as long as possible. > > The stats show this: > > # select * from pg_stat_bgwriter; > -[ RECORD 1 ]--+--- > checkpoints_timed | 3236 > checkpoints_req| 83044 > buffers_checkpoint | 1376460896 > buffers_clean | 59124159 > maxwritten_clean | 304410 > buffers_backend| 285595787 > buffers_alloc | 6643047623 > > Based on Greg's article and the above number showing that most > checkpoints are triggered by running out of WAL segments, I should > increase my checkpoint_segments. Also, based on the article, I should > increase the bgwriter_lru_maxpages (currently at default 100). > > My plan is to bump checkpoint_segments to 48 and reduce > checkpoint_completion_target to 0.7, That's what I would do. I think you're getting hit by checkpointing too frequently or stretching the checkpoints out too long (or a combo of both). I'd also keep bumping checkpoint_segments up until you are checkpointing on the timeout. This will give you a longer window between checkpoints - which gives more time to smooth the checkpoint out. > and bump the > bgwriter_lru_maxpages to 500. I would probably start by increasing the number of WAL segments and decreasing checkpoint_completion_target and see if that solves the problem. I wouldn't change the bgwriter_lru_maxpages at the same time, and wouldn't necessarily increase it by a factor of 5 when doing so. It does look like you need to increase it though. > Can the checkpoint operation actually cause the DB to stop responding > for a few seconds at a time? That seems to be what I observe. > Sometimes for 5 or more seconds one transaction will just stall. Absolutely. The smoothing is supposed to alleviate that, but I'm not sure how well it works with aggressive activity driven checkpoints like you have. Keep us posted, I'm curious. > Thanks for any ideas. > -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. 
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
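Put concretely, the starting point suggested in this exchange would look like the following in postgresql.conf (values are illustrative, taken from the figures discussed above, and should be re-checked against pg_stat_bgwriter after the change):

```
checkpoint_segments = 48            # keep raising until checkpoints are timed, not xlog-driven
checkpoint_timeout = 900
checkpoint_completion_target = 0.7  # down from 0.9, so one checkpoint finishes well before the next starts
log_checkpoints = on
```

The ratio of checkpoints_req to checkpoints_timed in pg_stat_bgwriter is the quickest way to see whether the segment count is still too low.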
Re: [GENERAL] autovacuum and immediate shutdown issues
On Mon, 2009-10-19 at 15:09 -0400, Brad Nicholson wrote: > On Mon, 2009-10-19 at 15:01 -0400, Tom Lane wrote: > > Brad Nicholson writes: > > > autoanalyze will automatically analyze new tables when they don't have > > > stats. It seems logical that it should handle this case where the table > > > also does not have stats. > > > > It will autoanalyze once a sufficient number of inserts have occurred. > > The effect of a crash is just to reset the inserts-since-last-analyze > > counters. You really haven't made the case why that's so awful that > > we need to do things that are unpleasant along other dimensions in order > > to avoid it. (The only ways of avoiding it that I can see would result > > in very significant overhead added to the stats collection mechanism.) > > What about the question that Scott and I both asked - what about query > plans. Will they be affected? > > If so, you could have a table that does not get updated, that no longer > generates relevant query plans, and won't until manual intervention. > Actually, isn't there also a condition (although unlikely to occur) here that could result in transaction wraparound? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] autovacuum and immediate shutdown issues
On Mon, 2009-10-19 at 15:01 -0400, Tom Lane wrote: > Brad Nicholson writes: > > autoanalyze will automatically analyze new tables when they don't have > > stats. It seems logical that it should handle this case where the table > > also does not have stats. > > It will autoanalyze once a sufficient number of inserts have occurred. > The effect of a crash is just to reset the inserts-since-last-analyze > counters. You really haven't made the case why that's so awful that > we need to do things that are unpleasant along other dimensions in order > to avoid it. (The only ways of avoiding it that I can see would result > in very significant overhead added to the stats collection mechanism.) What about the question that Scott and I both asked - what about query plans. Will they be affected? If so, you could have a table that does not get updated, that no longer generates relevant query plans, and won't until manual intervention. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] autovacuum and immediate shutdown issues
On Mon, 2009-10-19 at 11:35 -0700, Christophe Pettus wrote: > On Oct 19, 2009, at 11:26 AM, Scott Marlowe wrote: > > That could be a pretty bad worst case scenario for certain types of > > tables / usage patterns. > > Given that (presumably) the database server is not failing repeatedly > without some kind of operator notification, isn't it at least somewhat > reasonable to just make "start an vacuum upon recovery from an > immediate shutdown" an operational procedure, rather than something PG > does automatically? > It's not a vacuum you want, it's an analyze. Once the stats are back, autovacuum will vacuum accordingly. autoanalyze will automatically analyze new tables when they don't have stats. It seems logical that it should handle this case where the table also does not have stats. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] autovacuum and immediate shutdown issues
On Mon, 2009-10-19 at 11:16 -0600, Scott Marlowe wrote: > On Mon, Oct 19, 2009 at 11:06 AM, Brad Nicholson > wrote: > > On Mon, 2009-10-19 at 10:53 -0600, Scott Marlowe wrote: > >> On Mon, Oct 19, 2009 at 10:44 AM, Tom Lane wrote: > >> > Brad Nicholson writes: > >> >> On Mon, 2009-10-19 at 12:07 -0400, Tom Lane wrote: > >> >>> That seems like a fundamentally stupid idea, unless you are unconcerned > >> >>> with the time and cost of getting the DB running again, which seemingly > >> >>> you are. > >> > > >> >> I disagree that this is fundamentally stupid. We are talking about a > >> >> situation where the server is about to die, HA solution kicks in and > >> >> moves it to standby. > >> > > >> > Moving it to standby immediately is a good idea, but it does not follow > >> > that you need to hit the DB over the head with a hammer. A fast-mode > >> > shutdown seems perfectly adequate. If it isn't, you're going to need > >> > nontrivial recovery effort anyhow. > >> > >> All of this is completely besides the point that a database that's > >> been shutdown immediately / had the power cord yanked comes back up > >> and doesn't start autovacuuming automatically, which seems a > >> non-optimal behaviour. > > > > It's also not going to endear us very much to the VLDB crowd - it will > > amount to a multi-hour crash recovery for those folks while analyze > > regenerates statistics. > > But this would be AOK behaviour for small transactional databases? Definitely not. > Again, besides the point, but important. The real point is a database > that doesn't run autovac after an emergency shutdown is broken by > design, and not just for one use case. This behaviour is also undocumented AFAIK. I would bet that a lot of users would have no idea that they are in this state post crash-recovery. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. 
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] autovacuum and immediate shutdown issues
On Mon, 2009-10-19 at 10:53 -0600, Scott Marlowe wrote: > On Mon, Oct 19, 2009 at 10:44 AM, Tom Lane wrote: > > Brad Nicholson writes: > >> On Mon, 2009-10-19 at 12:07 -0400, Tom Lane wrote: > >>> That seems like a fundamentally stupid idea, unless you are unconcerned > >>> with the time and cost of getting the DB running again, which seemingly > >>> you are. > > > >> I disagree that this is fundamentally stupid. We are talking about a > >> situation where the server is about to die, HA solution kicks in and > >> moves it to standby. > > > > Moving it to standby immediately is a good idea, but it does not follow > > that you need to hit the DB over the head with a hammer. A fast-mode > > shutdown seems perfectly adequate. If it isn't, you're going to need > > nontrivial recovery effort anyhow. > > All of this is completely besides the point that a database that's > been shutdown immediately / had the power cord yanked comes back up > and doesn't start autovacuuming automatically, which seems a > non-optimal behaviour. It's also not going to endear us very much to the VLDB crowd - it will amount to a multi-hour crash recovery for those folks while analyze regenerates statistics. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] autovacuum and immediate shutdown issues
On Mon, 2009-10-19 at 12:07 -0400, Tom Lane wrote: > Brad Nicholson writes: > > If you issue an immediate shutdown to the database, autovacuum will not > > process tables that should be vacuumed until manually re-analyzed. > > AFAICS this is an unsurprising consequence of flushing stats on a crash. > If you don't like it, avoid immediate shutdowns --- they are not > especially good practice in any case. > > > 3: What is the best work around for this? When our HA solution triggers > > a DB shutdown, we want it to be immediate. > > That seems like a fundamentally stupid idea, unless you are unconcerned > with the time and cost of getting the DB running again, which seemingly > you are. > I disagree that this is fundamentally stupid. We are talking about a situation where the server is about to die, HA solution kicks in and moves it to standby. If we wait for a clean shutdown instead, and the server dies before it completes (which is entirely possible), Postgres crashes and the exact same behaviour will happen. It also means that if any server crashes (HA aside, shutdown method aside), the database will come up, but functionality may be impacted until manual intervention. At the very least, shouldn't autoanalyze correct the lack of statistics? To me, this looks like the database will not come up cleanly after crashing. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] autovacuum and immediate shutdown issues
If you issue an immediate shutdown to the database, autovacuum will not process tables that should be vacuumed until manually re-analyzed. PG 8.3.8 Relevant settings: autovacuum = on log_autovacuum_min_duration = 0 autovacuum_max_workers = 3 autovacuum_naptime = 1min autovacuum_vacuum_threshold = 10 autovacuum_analyze_threshold = 10 autovacuum_vacuum_scale_factor = 0.05 autovacuum_analyze_scale_factor = 0.05 test=# CREATE TABLE foo (id int); test=# INSERT INTO foo SELECT generate_series(1,1000); auto-analyze runs as expected test=# SELECT last_autovacuum, last_autoanalyze from pg_stat_user_tables where relname ='foo'; -[ RECORD 1 ]+-- last_autovacuum | last_autoanalyze | 2009-10-19 14:14:47.791658+00 test=# DELETE FROM foo WHERE id IN (SELECT id FROM foo LIMIT 100); test=# SELECT * from pgstattuple('foo'); -[ RECORD 1 ]--+-- table_len | 32768 tuple_count| 900 tuple_len | 25200 tuple_percent | 76.9 dead_tuple_count | 100 dead_tuple_len | 2800 dead_tuple_percent | 8.54 free_space | 656 free_percent | 2 dead_tuple_percent > 5% - autovacuum runs as expected on next pass: test=# SELECT last_autovacuum, last_autoanalyze from pg_stat_user_tables where relname ='foo'; -[ RECORD 1 ]+-- last_autovacuum | 2009-10-19 14:16:47.910177+00 last_autoanalyze | 2009-10-19 14:14:47.791658+00 --repopulate table test=# TRUNCATE foo; test=# INSERT INTO foo SELECT generate_series(1,1000); Wait for autoanalyze, then before next run of autovacuum test=# DELETE FROM foo WHERE id IN (SELECT id FROM foo LIMIT 100); pg_ctl -D data -m immediate stop restart the postmaster stats are gone due to the immediate shutdown. 
test=# SELECT last_autovacuum, last_autoanalyze from pg_stat_user_tables where relname ='foo'; -[ RECORD 1 ]+- last_autovacuum | last_autoanalyze | test=# SELECT * from pgstattuple('foo'); -[ RECORD 1 ]--+-- table_len | 32768 tuple_count| 900 tuple_len | 25200 tuple_percent | 76.9 dead_tuple_count | 100 dead_tuple_len | 2800 dead_tuple_percent | 8.54 free_space | 656 free_percent | 2 dead_tuple_percent > 5% of table is dead, autovacuum should pick it up, but it doesn't (yes, I have waited long enough). autoanalyze does not process the table. Autovacuum will not process this table again until one of the following two conditions is met: 1: Manually analyze the table 2: an additional 5% (autovacuum_vacuum_scale_factor) of the tuples in the table are dirtied - which amounts to 10% (autovacuum_vacuum_scale_factor * 2) of the actual table being dirtied before it gets vacuumed. In case 2 - reports from pgstattuple are odd. After deleting one row. test=# DELETE FROM foo WHERE id IN (SELECT id FROM foo LIMIT 1); DELETE 1 test=# SELECT * from pgstattuple('foo'); -[ RECORD 1 ]--+-- table_len | 32768 tuple_count| 899 tuple_len | 25172 tuple_percent | 76.82 dead_tuple_count | 1 dead_tuple_len | 28 dead_tuple_percent | 0.09 free_space | 3456 free_percent | 10.55 Stats start over. The DB should be able to recover cleanly from an immediate shutdown IMHO. If the stats are no longer there, I would expect autoanalyze to run and regenerate them. This is the same behaviour as when a new table is created and populated. A few questions 1: Is this expected behaviour, or a bug? 2: If not a bug, why does autoanalyze not process these tables? It will process newly loaded tables when they do not have any stats. 3: What is the best work around for this? When our HA solution triggers a DB shutdown, we want it to be immediate. Currently we run a manual analyze post recovery, which is really bad for large databases. 
4: after restart, why does pgstattuple show dead_tuple_percent = 8.54, but after deleting one row, it shows dead_tuple_percent = 0.09? 5: on the missing stats - does this mean my query plans are potentially bad until the stats are regenerated? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
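For reference, the condition autovacuum evaluates is dead tuples > autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples, which is why resetting the counters effectively doubles the wait described in case 2. A small sketch of that arithmetic, using the settings from this post:

```python
def dead_tuples_needed(reltuples, threshold=10, scale_factor=0.05):
    """Dead tuples required before autovacuum will vacuum a table,
    per the autovacuum_vacuum_threshold / scale_factor formula."""
    return threshold + scale_factor * reltuples

# With 1000 rows and the settings above: 10 + 0.05 * 1000 = 60 dead tuples.
# After an immediate shutdown zeroes the counters, the 100 already-dead
# tuples no longer count, so another 60 deletes are needed from scratch.
print(dead_tuples_needed(1000))
```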
Re: [GENERAL] pg_stat_reset() not resetting all statistics counters
On Wed, 2009-07-15 at 14:13 +0200, Rafael Martinez wrote: > Hello > > Should not the execution of pg_stat_reset() reset *all* statistics > counters everywhere in the database? It only resets the stats for the current database, not the cluster wide stats - pg_stat_database is cluster wide. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Maintenance database SQL_ASCII
After a new pgsql installation the "postgres" maintenance database has an encoding of SQL_ASCII. pgAdmin III gave me a warning about that, and I may want to create users or databases that are not restricted to 7-bit ASCII. I was going to back up and recreate this database, but it can't be dropped. I guess I could backup all of the other databases and start over from "initdb", but that would be a hassle. Do I have any other options? The strange thing is that I have two very similarly configured Ubuntu machines and the other created the "postgres" database with UTF8 encoding. The locale on both machines is the same: $ locale LANG=en_US.UTF-8 LC_CTYPE="en_US.UTF-8" LC_NUMERIC="en_US.UTF-8" LC_TIME="en_US.UTF-8" LC_COLLATE="en_US.UTF-8" LC_MONETARY="en_US.UTF-8" LC_MESSAGES="en_US.UTF-8" LC_PAPER="en_US.UTF-8" LC_NAME="en_US.UTF-8" LC_ADDRESS="en_US.UTF-8" LC_TELEPHONE="en_US.UTF-8" LC_MEASUREMENT="en_US.UTF-8" LC_IDENTIFICATION="en_US.UTF-8" LC_ALL= -Brad -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
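One middle ground short of a full initdb (a sketch -- verify on a scratch cluster first): the postgres database can be dropped and recreated while connected to a different database, since only a database with no open connections can be dropped:

```sql
-- Run these while connected to template1, not to postgres itself.
DROP DATABASE postgres;
CREATE DATABASE postgres TEMPLATE template0 ENCODING 'UTF8';
```

Using template0 as the template is what allows an encoding different from the cluster default to be specified.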
Re: [GENERAL] Average connections
On Wed, 2009-06-10 at 15:29 +0100, Pedro Doria Meunier wrote: > Thank you Bill for your tip. > > As far as the table's name is concerned the only one I can find is > 'pg_statistic' (under pg_catalog). I'm using PGSQL 8.2.9 on x86_64. > > BR, http://www.postgresql.org/docs/8.2/interactive/monitoring-stats.html#MONITORING-STATS-VIEWS-TABLE pg_stat_database also has the number of active server processes connected to a database. > > > Bill Moran wrote: > > In response to Pedro Doria Meunier : > > > >> Hi all > >> > >> I've googled for an answer but couldn't find one. Is there anyway > >> to monitor the average connections to a database? > >> > >> Thank you in advance for any insights. > > > > The pg_stat_activity table holds 1 row for each connection with > > information on what that connection is doing. > > > > It wouldn't be very difficult to write a passthrough script for > > something like MRTG to graph this data. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
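A quick sketch of both approaches mentioned above (no assumptions beyond the standard statistics views):

```sql
-- One row per backend: count current connections per database
SELECT datname, count(*) AS connections
FROM pg_stat_activity
GROUP BY datname;

-- Or read the maintained counter directly
SELECT datname, numbackends FROM pg_stat_database;
```

Sampling either query on a schedule (e.g. from MRTG or cron) gives the data needed to compute an average over time.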
Re: [GENERAL] Bloated Table
On Wed, 2009-05-27 at 11:15 -0400, Tom Lane wrote: > Alexander Schöcke writes: > > I'm using a view > > (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to > > display the bloat (unused disk space) of the tables in a PostgreSQL > > database. > > I wouldn't trust the calculations that view does in the least. > You might look at contrib/pgstattuple if you want numbers that > have some relationship to reality (and are correspondingly more > expensive to get :-() > > regards, tom lane Is the referenced query reliable for even estimating, or is it flat out wrong? Co-workers who were at PGCon are saying that this is becoming a popular/accepted way to check for bloated tables. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
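For reference, the contrib/pgstattuple approach Tom suggests looks like this ('my_table' is a placeholder name):

```sql
-- Requires contrib/pgstattuple to be installed; it scans the whole
-- table, so it is accurate but expensive on large relations.
SELECT table_len, dead_tuple_percent, free_percent
FROM pgstattuple('public.my_table');
```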
[GENERAL] Prepare / Execute Method Error
I'm having problems passing in variables into my execute statement. Any ideas?

Table:
CREATE TABLE cat.case (
vari character varying(50), -- contains value 'BDD1'
ht numeric(4,1) -- contains value 30.0
)

Statements:
PREPARE insert_plan (character varying, numeric) AS SELECT vari FROM cat.case WHERE vari = $1 AND ht = $2;

EXECUTE insert_plan(vari_string, case[1]); -- fails with errors
vari_string = 'BDD1' -- character varying
case[1] = 30.0 -- numeric(4,1)[]

Errors:
NOTICE: vari_string: BDD1
NOTICE: ht: 30.0
ERROR: function insert_plan(character varying, numeric) does not exist
LINE 1: SELECT insert_plan( $1 , $2 [ $3 ])
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
QUERY: SELECT insert_plan( $1 , $2 [ $3 ])

Hard coded, it will work! It just doesn't like using variables inside:
EXECUTE insert_plan('BDD1', 30.0);

Another variation of the execution statement, which also failed:
EXECUTE insert_plan(vari_string::character varying(50), case[1]::numeric(4,1));
Re: [GENERAL] Selling an 8.1 to 8.3 upgrade
On Fri, 2009-04-24 at 12:01 -0500, Josh Trutwin wrote: > I've been asked to put together a list of reasons to upgrade a db > from 8.1 to 8.3 and I've looked over the changelog, but they want a > bullet list of 4-5 top things. I'm curious what others would say the > 5 most important updates from 8.1 to 8.3 are. What is the reason that you want to upgrade? Do you have a particular problem you are trying to solve? Are there improvements for developers that you think will be beneficial, or are you simply wanting to upgrade from a 4.5 year old DB platform? You'll probably have the best luck if you sell it based on pains that you are facing, improvements that can be made, or money that can be saved. > I can say "performance improvements" but I'm not sure how to sell > that better other than listing what's in the release notes. I also > think the autovac was greatly improved in these releases as well? Or > maybe I'm thinking 8.0 to 8.1? Are you experiencing performance issues, or do you have reason to suspect you will? Performance improvements are great if you need them. It can be a hard sell to business folks though when things are performing just fine. > Sorry this is so vague, I'm frustrated with this request as I figured > just the amount of bug-fixes alone would be adequate reasoning. Unfortunately, what seems adequate to us technical folks seldom is to the business folks. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
[GENERAL] Time intersect query
I'm wanting to optimize and improve a query to get the maximum number of users over a period of time. What I'm trying to accomplish is to get graphable data points of the maximum number of simultaneous users at a specified interval over a period of time, preferably with only a single pass through the data.

-- streamlog table (I've only included relevant fields and indexes):
id bigint
ts timestamp
viewtime integer
client integer
-- primary key on id field
-- streamlog_tsrange_index btree (client, ts, startts(ts, viewtime))

\df+ startts shows an immutable SQL function startts(timestamp without time zone, integer) returning timestamp without time zone, with source: select $1 - ($2::varchar || ' seconds')::interval;

The ts timestamp is the time at which the data was logged. The viewtime is the amount of time the user was online in seconds. The startts function determines when the session started by subtracting viewtime from ts and returning a timestamp.

My current procedure:

1) Create a temporary table with each possible data point. This example uses recursive queries from pgsql 8.4 but was originally implemented by using large numbers of queries from php. My knowledge of the recursive functions is pretty weak, but I was able to put this together without too much trouble.

create temp table timerange as with recursive f as ( select '2009-03-21 18:20:00'::timestamp as a union all select a+'30 seconds'::interval as a from f where a < '2009-03-21 20:20:00'::timestamp ) select a from f;

2) Update the table with record counts:

alter table timerange add column mycount integer;
explain analyze update timerange set mycount = (select count(*) from streamlogfoo where client = 3041 and a between startts(ts,viewtime) and ts);

This seems to work reasonably well, with the following exceptions:

1) The number reported is the number at the set time period, not the highest value between each data point. With a 30 second interval, this isn't a big problem, but with larger intervals it gives results that do not match what I'm looking for (maximum users). 2) This does not scale well for large numbers of points, as internally each data point is a complete scan through the data, even though most of the data points will be common for the entire range. I'm thinking this would be a good use for the new window functions, but I'm not sure where to begin. Any ideas?
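[For the archive: one hedged sketch that addresses point 1 above, though not point 2. It keeps the fine-grained 30-second sampling but reports the maximum per coarser graphing bucket; the table name streamlog is taken from the schema shown, and the 5-minute bucket width is an assumption.]

```sql
-- Sample concurrency every 30 seconds, then report the max within each
-- 5-minute graphing interval (so large intervals no longer miss peaks).
SELECT date_trunc('hour', a)
       + interval '5 min' * floor(date_part('minute', a) / 5) AS bucket,
       max(mycount) AS max_users
FROM (
  SELECT g.a,
         (SELECT count(*)
          FROM streamlog s
          WHERE s.client = 3041
            AND g.a BETWEEN startts(s.ts, s.viewtime) AND s.ts) AS mycount
  FROM generate_series('2009-03-21 18:20:00'::timestamp,
                       '2009-03-21 20:20:00'::timestamp,
                       '30 seconds'::interval) AS g(a)
) t
GROUP BY 1
ORDER BY 1;
```

This still performs one index probe per sample point, so the scaling concern in point 2 remains.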
Re: [GENERAL] clearing the buffer cache
On Fri, 2009-02-13 at 17:49 +, Sam Mason wrote: > Hi, > > I was doing some performance checks and wondered what the best way to > clear out the shared buffers is? With the recent improvements in buffer > management it appears a simple SELECT * FROM large_table; doesn't help > here! > > I was hoping for a function I could call, or maybe some variable I write > to, that would cause the contents to be invalidated. Restart the database. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
[GENERAL] XPath to search for elements in a sequence
With XML similar to:

<a><b><c>1</c><c>2</c><c>3</c></b></a>

I'm trying to create an XPath expression (for a postgresql query) that will return a match if a single <c> is a particular value, not only when the set of <c> values matches all three. What I currently have (which does not work) is: select * from someTable where xpath ('//uim:a/text()', job, ARRAY[ ARRAY['uim','http://www.cmpy.com/uim'] ])::text[] IN (ARRAY['1','3']); If I try with ARRAY['1'] this will not return any values but with ARRAY['1','2','3'] it will return all three. How can I select based on a single element in a sequence? Thanks.
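[Archive note: one hedged approach, assuming the element names and the someTable/job identifiers from the question, is to push the value test into the XPath predicate itself rather than comparing the whole result array:]

```sql
-- Matches rows where at least one <c> element equals '1'; a non-empty
-- result array from xpath() means the predicate matched somewhere.
SELECT *
FROM someTable
WHERE array_upper(
        xpath('//uim:c[text()=''1'']', job,
              ARRAY[ARRAY['uim', 'http://www.cmpy.com/uim']]),
        1) >= 1;
```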
[GENERAL] pg_stat_reset not resetting all stats
Is there a reason that pg_stat_reset doesn't reset the stats in pg_stat_bgwriter and pg_stat_database? PG 8.3 (obviously). The call to pg_stat_reset works, as my other stats tables are clear. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] "Healing" a table after massive updates
On Thu, 2008-09-11 at 07:01 -0700, Gauthier, Dave wrote: > Hi: > > > > I have a job that loads a large table, but then has to "update" about > half the records for various reasons. My perception of what happens > on update for a particular record is... > > - a new record will be inserted with the updated value(s). > > - The old record is marked as being obsolete. > > - Not sure what happens to index elements that pointed to the original > (now obsolete) record. Is it updated to point directly at the newly > inserted record? Or does it use the obsolete record as a "link" to > the newly inserted record? Depends on the version of Postgres. Prior to 8.3, the obsolete tuples and index entries are dead. In 8.3, the updates are HOT updates, which will not leave dead tuples or dead index entries. 8.3 might be a big help for you. It could remove the need to vacuum this table entirely. > My concern is that the resulting table is not in optimal shape for > queries. I would like to get rid of the obsoleted records (vacuum I > believe) but also "heal" the table in terms of filling in the holes > left where those deleted records used to be (will gather more records > per disk block read if record density on disk is greater). Is there a > way to do this? Regular VACUUM is the correct operation to get rid of the dead tuples. If you want to compact the table, you either need to use CLUSTER or VACUUM FULL + REINDEX. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
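[Archive note: a sketch of the two maintenance paths above, with my_table and my_table_pkey as placeholder names; both compaction options take exclusive locks.]

```sql
-- Routine maintenance: marks dead tuples reusable, no exclusive lock
VACUUM my_table;

-- Full compaction, option 1: rewrite the table in index order
CLUSTER my_table USING my_table_pkey;

-- Full compaction, option 2: VACUUM FULL, then rebuild the indexes,
-- since VACUUM FULL tends to bloat them
VACUUM FULL my_table;
REINDEX TABLE my_table;
```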
Re: [GENERAL] dbsize command
On Mon, 2008-06-23 at 10:50 -0700, Reynolds, Gregory A wrote: > In past versions there was a contrib/dbsize command that would tell > you how much storage space was being taken by a table. Is that > still a supported feature in 8.3, and if so what is the syntax or > where is the documentation on how to use it? There are a series of functions in the database core that will tell you this now. http://www.postgresql.org/docs/8.3/interactive/functions-admin.html -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
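For example ('my_table' is a placeholder):

```sql
SELECT pg_size_pretty(pg_relation_size('my_table'));        -- heap only
SELECT pg_size_pretty(pg_total_relation_size('my_table'));  -- plus indexes and TOAST
SELECT pg_size_pretty(pg_database_size(current_database()));
```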
Re: [GENERAL] page is uninitialized --- fixing
On Thu, 2008-03-27 at 10:37 -0400, Tom Lane wrote: > > What do you mean by "two separate SAN switches pulled out" --- is the > DB spread across multiple SAN controllers? > It's using IO multipath through 2 HBAs. Both of those were taken down. Brad.
Re: [GENERAL] page is uninitialized --- fixing
On Thu, 2008-03-27 at 10:29 -0300, Alvaro Herrera wrote: > Brad Nicholson wrote: > > On Wed, 2008-03-26 at 15:31 -0400, Tom Lane wrote: > > > Brad Nicholson <[EMAIL PROTECTED]> writes: > > > > We just took a test database down (PG 8.1.11) fairly hard (pulled a SAN > > > It would be easier to believe that if the uninitialized pages were all > > > contiguous though. Do you know that this table was under heavy insert > > > load at the time? > > > > It was. This table is an insert only log table that was being heavily > > written to at the time of the crash. > > Is it possible that there were *two* crashes? There was only one crash. However, there were two separate SAN switches that were pulled out from under the DB, not sure if that would matter. Brad.
Re: [GENERAL] page is uninitialized --- fixing
On Wed, 2008-03-26 at 15:31 -0400, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > We just took a test database down (PG 8.1.11) fairly hard (pulled a SAN > It could be that but not necessarily. These could be pages that were > allocated to put new tuples into, but the crash happened before the > inserting transaction committed (and, in fact, before any of the > associated WAL entries had made their way to disk -- else the empty > pages would've been initialized during WAL replay). > > It would be easier to believe that if the uninitialized pages were all > contiguous though. Do you know that this table was under heavy insert > load at the time? It was. This table is an insert only log table that was being heavily written to at the time of the crash. Brad.
[GENERAL] page is uninitialized --- fixing
We just took a test database down (PG 8.1.11) fairly hard (pulled a SAN switch out while it was under load), which caused the DB to crash. It started up fine; when I vacuumed the DB, I saw the following messages. WARNING: relation "my_table" page 652139 is uninitialized --- fixing WARNING: relation "my_table" page 652140 is uninitialized --- fixing WARNING: relation "my_table" page 652940 is uninitialized --- fixing WARNING: relation "my_table" page 652941 is uninitialized --- fixing That sort of looks like it could be data loss, can someone explain what happened? Brad.
Re: [GENERAL] How to copy tables between databases?
On Tue, 2008-02-26 at 15:19 -0500, Kynn Jones wrote: > > Is there a simple way to copy a table from one database to another > without generating an intermediate dump file? > pg_dump -t <table> <source_db> | psql -d <target_db> -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Perceived weaknesses of postgres
Dawid Kuroczko wrote: Slony is good as long as there are no DDLs issued. And its easy to shoot oneself in the foot if one is not careful (some time ago I have lost all the triggers while upgrading from 8.1 to 8.2; it was my fault since I did pg_dump -s on a slave database, not on the master...). It can also be a very time consuming process. It often involves having to juggle resources around, find extra disk space floating around somewhere to build extra replicas, or abandon your failover target for a period of time while you upgrade it. BTW, the trigger issue goes away with 8.3 and Slony 2.0. Brad.
Re: [GENERAL] postgresql storage and performance questions
On Tue, 2007-11-20 at 13:04 -0500, Josh Harrison wrote: > On Nov 20, 2007 11:13 AM, Brad Nicholson <[EMAIL PROTECTED]> wrote: > > On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote: > > > > > There were a couple of things we noted. > > > 1. Tablesize twice as much as oracle-- I'm not sure if postgres null > > > columns have any overhead since we have lots of null columns in our > > > tables. Does postgresql have lots of overhead for null columns? > > > > Did you by any chance have an aborted load of the data? If you load in > > a table, and that load fails or does not commit, it will still occupy > > the space until you vacuum. If you try to load again, the table will be > > twice the size. > > > > If you want to compact the physical space the table occupies, you can > > try running VACUUM FULL on it, and possibly a reindex afterwards. This > > will bring the physical space down to the minimum. Both of these > > operations will lock out access to the tables though. > I ran vacuum full on this table already. I haven't re-indexed it. But > this will not affect the table size...right...since indexes are stored > separately? You are correct about the indexes. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] postgresql storage and performance questions
On Tue, 2007-11-20 at 07:22 -0500, Josh Harrison wrote: > There were a couple of things we noted. > 1. Tablesize twice as much as oracle-- I'm not sure if postgres null > columns have any overhead since we have lots of null columns in our > tables. Does postgresql have lots of overhead for null columns? Did you by any chance have an aborted load of the data? If you load in a table, and that load fails or does not commit, it will still occupy the space until you vacuum. If you try to load again, the table will be twice the size. If you want to compact the physical space the table occupies, you can try running VACUUM FULL on it, and possibly a reindex afterwards. This will bring the physical space down to the minimum. Both of these operations will lock out access to the tables though. > 2. Oracle seems to be reading larger blocks than postgresql (when we > examined the iostat and vmstat) (we had set postgres' db block size as > 8kb and oracle's is 16kb...) > Do you have any comments on this? 8kb is the default. You can change the block size if you need to. You need to modify src/include/pg_config_manual.h, recompile, and re-initdb. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] young guy wanting (Postgres DBA) ammo
On Fri, 2007-11-02 at 00:03 -0400, Kevin Hunter wrote: > - More in line with the conversation with my friend, what/why is it that > Postgres needs a DBA while MySQL doesn't? I highly suspect that the > assumption that MySQL doesn't need a DBA is incorrect, but that's what > was posed to me and I couldn't agree or disagree. Before I was a DBA I worked as a developer in a shop that had both PG and MySQL running. We had no DBA, or anyone at that time that really understood databases from a DBA side. The stuff that we had running in PG just worked. Period, no problems (and this was on PG 7.0). The stuff we had in MySQL, well, that "mysteriously" ground to a halt every night at the same time, making several customers applications unavailable. Without anyone on staff that could actually diagnose the issue, the only solution that they came up with (and I emphasise the word they, as I had no part in this :-)) was a cron job that restarted the MySQL server every night. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] subversion support?
You could set up a subversion commit hook to export the functions to the database. Then you adjust your development mentality to: 1) Edit the files on the disk 2) Commit to Subversion Then the hook takes over and runs the drop/create automatically; you could even have it email the developer if the create failed. Roberts, Jon wrote: Robert, that does sound better. It keeps the names of the files in svn consistent with the database object names which is essential. It also makes it automatic. Unfortunately, it doesn't tell you who did the changes. Do you want to share that code? Thanks! Jon -Original Message- From: Robert Treat [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 24, 2007 10:24 PM To: pgsql-general@postgresql.org Cc: Roberts, Jon Subject: Re: [GENERAL] subversion support? On Wednesday 24 October 2007 15:11, Roberts, Jon wrote: Yeah. I think having to save the function to disk and then leave pgAdmin to execute subversion commands is going through hoops. Also, pgAdmin should be integrated so that you are notified if the function in the database is different from the last committed version. A visual diff should be there so you can see what the differences are. We have a script that runs nightly that dumps tables / functions to file, and then checks it in automagically to svn, which sends an email of the diffs. Perhaps that would work for you? -- Brad Lhotsky<[EMAIL PROTECTED]> NCTS Computer SpecialistPhone: 410.558.8006 "Darkness is a state of mind, I can go where you would stumble." -Wolfsheim, 'Blind'
Re: [GENERAL] Missing files under pg_data following power failure
On Fri, 2007-10-12 at 10:54 -0500, Scott Marlowe wrote: > On 10/12/07, Jack Orenstein <[EMAIL PROTECTED]> wrote: > > Our testing involves cutting power to machines running postgres 7.4, > > while under load (lots of reads and writes). When we do this, we often > > lose some files under pg_data storing table content. I.e., the file > > named for a table's pg_class.oid value is simply gone. This can affect > > many tables following a power outage. We know this problem has > > occurred when we are unable to access a table, e.g. > > > > ERROR: relation "xxx" does not exist > > > > The table is present in the system tables, but the file storing table > > content under pg_data is missing. > > > > Can anyone provide insight on this phenomenon? Why are these files > > lost? Are they really lost, or have they simply moved somewhere? What > > happens to the disk blocks formerly occupied by the files? > > > > Getting back in service following this file loss is not a problem; I'm > > just trying to understand how postgres gets into this state. > > First of all, this should not happen on a machine with proper > fsyncing. The possible causes are generally either fsync is off in > postgresql.conf or the drive array <--> OS layer is lying about fsync > operations. What filesystem are you using? I've seen similar problems on JFS2, it was something with logredo or fsck being busted on a Big Blue OS. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] lowering impact of checkpoints
On Tue, 2007-09-25 at 11:58 +0200, hubert depesz lubaczewski wrote: > hi, > our system is handling between 600 and 2000 transactions per second. all > of them are very small, very fast. typical query runs in under 1ms. > yes - sometimes we get queries that take longer than then should get. > simple check shows that we have a very visible pattern of > every-5-minutes peak. > in the minute that there is checkpoint - we get usually 15-20 times more > queries "over 500 ms" than in other minutes. > > we are using 8.2.4 (upgrade will be soon), with these settings: > # select name, setting, unit from pg_settings where name ~* > 'bgwriter|wal|checkpoint'; > name | setting | unit > ---+---+ > bgwriter_all_maxpages | 5 | > bgwriter_all_percent | 0.333 | [null] > bgwriter_delay| 200 | ms > bgwriter_lru_maxpages | 5 | > bgwriter_lru_percent | 1 | [null] > checkpoint_segments | 32| > checkpoint_timeout| 300 | s > checkpoint_warning| 30| s > wal_buffers | 128 | 8kB > wal_sync_method | fdatasync | [null] > (10 rows) > > is there anything i can change to make it "smoother"? Sounds like bgwriter is not flushing dirty pages quickly enough, so there is still a lot of work to do at checkpoint time. You probably need to tune it. This can be a tough thing to do properly though. There are no magic values to suggest, as what will work is highly dependent on your hardware and your applications pattern of use. If possible, up the settings for bgwriter_all_percent a *little* and perhaps bgwriter_all_maxpages and see if it helps. You can change these with a reload. If you are doing this on a production system as opposed to a test system, keep a close eye on what is going on, as it is possible that you can make things worse. I would start with something like 2% for bgwriter_all_maxpages and see if that helps things out. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
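[Archive note: an illustrative postgresql.conf fragment for the nudge described above; the specific values are placeholders, not recommendations, since the right numbers depend entirely on the hardware and workload.]

```
# postgresql.conf -- 8.2 background writer knobs; reload to apply
bgwriter_delay = 200ms
bgwriter_all_percent = 0.5     # was 0.333; increase in small steps
bgwriter_all_maxpages = 10     # was 5; increase in small steps
```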
Re: [GENERAL] I get a error msg when i try to connect to the DB
On Thu, 2007-08-16 at 21:16 +0530, Rajaram J wrote: > Hi > > I have installed PostGRESql 7.4.17 and when i try to connect to the db > i get the msg > > psql: FATAL: no pg_hba.conf entry for host "???", user "sfmdb", > database "LOGDB", SSL off > You are allowing local, non-TCP/IP connections, and connections through the loopback, but you are not allowing remote connections. You need an entry of some form that is host all all ??? 255.255.255.255 trust btw, trust is usually bad, you should use some form of authentication. > My pg_hba.conf entry is as below > > > > # TYPE DATABASE USER IP-ADDRESS IP-MASK METHOD > > # Disable default configuration > > local all all trust > # IPv4-style local connections: > host all all 127.0.0.1 255.255.255.255 trust > # IPv6-style local connections: > host all all ::1 ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff trust > > > can you please guide me on what the problem might be. > > Regards > > Rajaram > > -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
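[Archive note: a hypothetical entry of the kind suggested above, using password authentication rather than trust; 192.0.2.10 is a placeholder for the client address shown as "???" in the error message.]

```
# pg_hba.conf (7.4-era columns: TYPE DATABASE USER IP-ADDRESS IP-MASK METHOD)
host    LOGDB    sfmdb    192.0.2.10    255.255.255.255    md5
```

After editing, the server needs a reload (pg_ctl reload) for the change to take effect.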
[GENERAL] Cluster and MVCC
I just want to confirm that the cluster/MVCC issues are due to transaction visibility. Assuming that no concurrent access is happening to a given table when the cluster command is issued (when it takes its visibility snapshot), it is safe to cluster that table. Correct? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
[GENERAL] Time for Vacuum vs. Vacuum Full
I have a couple of database clusters that need a vacuum full, and I would like to estimate how long it will take, as it will need to be in a maintenance window. I have the times that it takes to do a regular vacuum on the clusters; will vacuum full take longer? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] Vacuum Stalling
On Tue, 2007-07-10 at 11:31 -0400, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > On Tue, 2007-07-10 at 11:19 -0400, Tom Lane wrote: > >> Oh, I forgot to mention --- you did check that vacuum_mem is set to > >> a pretty high value, no? Else you might be doing a lot more > >> btbulkdelete scans than you need to. > > > What would you define as high for 7.4? I bumped it up to ~245MB > > That sounds like plenty --- you only need 6 bytes per dead tuple, > so that should be enough to handle all your 15-20M dead tuples in > one scan. > > How big is this index again? Not sure which one it's working on - there are 6 of them, each ~2.5GB -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] Vacuum Stalling
On Tue, 2007-07-10 at 11:19 -0400, Tom Lane wrote: > Oh, I forgot to mention --- you did check that vacuum_mem is set to > a pretty high value, no? Else you might be doing a lot more > btbulkdelete scans than you need to. > > regards, tom lane What would you define as high for 7.4? I bumped it up to ~245MB -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
[GENERAL] Vacuum Stalling
Version 7.4.12 AIX 5.3 Scenario - a large table was not being vacuumed correctly, there now ~ 15 million dead tuples that account for approximately 20%-25% of the table. Vacuum appears to be stalling - ran for approximately 10 hours before I killed it. I hooked up to the process with gdb and this looks a bit suspicious to me. gdb attach 979354 GNU gdb 6.0 Copyright 2003 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "powerpc-ibm-aix5.1.0.0"...attach: No such file or directory. Attaching to process 979354 [Switching to Thread 1] 0xd033ce1c in read () from /usr/lib/libc.a(shr.o) (gdb) bt #0 0xd033ce1c in read () from /usr/lib/libc.a(shr.o) #1 0x10034398 in FileRead (file=805585096, buffer=0xb38d0e40 "", amount=8192) at fd.c:973 #2 0x1009eea8 in mdread (reln=0x301aa618, blocknum=160246, buffer=0xb38d0e40 "") at md.c:434 #3 0x1009d3d0 in smgrread (which=8192, reln=0x301aa618, blocknum=160246, buffer=0x ) at smgr.c:316 #4 0x1002c0a8 in ReadBufferInternal (reln=0x301aa618, blockNum=160246, bufferLockHeld=84 'T') at bufmgr.c:230 #5 0x101e5080 in _bt_getbuf (rel=0x301aa618, blkno=4294967295, access=1) at nbtpage.c:401 #6 0x101e41c4 in btbulkdelete (fcinfo=0x) at nbtree.c:692 #7 0x10048670 in OidFunctionCall3 (functionId=8192, arg1=807052824, arg2=805496980, arg3=806803160) at fmgr.c:1260 #8 0x100dad04 in index_bulk_delete (indexRelation=0x301aa618, [EMAIL PROTECTED]: 0x10194150 , callback_state=0x3016d6d8) at indexam.c:618 #9 0x10193fcc in lazy_vacuum_index (indrel=0x301aa618, vacrelstats=0x3016d6d8) at vacuumlazy.c:712 #10 0x10194b04 in lazy_vacuum_rel (onerel=0x302357b8, vacstmt=0x) at vacuumlazy.c:279 #11 0x10192da4 in vacuum_rel (relid=807622584, vacstmt=0x3016b558, expected_relkind=114 
'r') at vacuum.c:855 #12 0x10192fb0 in vacuum (vacstmt=0x3016b558) at vacuum.c:290 #13 0x10141670 in ProcessUtility (parsetree=0x3016b558, dest=0x3016b5a8, completionTag=0x2ff1d0c4 "") at utility.c:831 #14 0x10121b20 in PortalRunUtility (portal=0x301f3768, query=0x3016b5f8, dest=0x3016b5a8, completionTag=0x2ff1d0c4 "") at pquery.c:772 #15 0x1012204c in PortalRun (portal=0x301f3768, count=806794920, dest=0x3016b5a8, altdest=0x3016b5a8, completionTag=0x2ff1d0c4 "") at pquery.c:836 #16 0x10010034 in exec_simple_query (query_string=0x3016b278 "VACUUM ANALYZE verbose;") at postgres.c:914 ---Type to continue, or q to quit--- #17 0x100119f8 in PostgresMain (argc=4, argv=0x30086e78, username=0x30086db8 "slony") at postgres.c:2973 #18 0x101b79e4 in ServerLoop () at postmaster.c:2564 #19 0x101b95a0 in PostmasterMain (argc=1, argv=0x300853c8) at postmaster.c:897 #20 0x153c in main (argc=1, argv=0x2ff22c40) at main.c:222 #21 0x1204 in __start () Ideas? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp.
Re: [GENERAL] Dangers of fsync = off
On Wed, 2007-05-09 at 08:26 -0600, Scott Ribe wrote:
> > I still wouldn't trust Slony with fsync off. Another scenario would be
> > the Slony trigger writes a change to the Slony DB, the db crashes before
> > it gets committed to disk. When the DB is started, no errors prevent
> > startup, but that transaction is lost.
>
> I'm not sure, but I think the questioner was proposing a policy of "if it
> crashes, we go to the standby, no attempt at recovery, ever", and I think
> that would be safe.

Just make sure that there is no way that the database would come back up after the crash. If it did, the slons could pick up and cause you trouble. If you disable all startup scripts, and operate under the assumption that crash = corruption = failover to the Slony replica, you should be okay. You will lose whatever transactions were not replicated to the subscriber, but that's inherent to async replication.

> And, personally, given my experience with pg, I think that's reasonable.
> Because the day I see pg crash I'm going to assume I have a hardware problem
> ;-)

If you care about your data, leave fsync on. Period. If you can accept the potential for data loss, and you've proven that there is a worthwhile performance benefit from turning it off (which there may not be), and you've gotten your boss/clients/stakeholders to sign off (preferably in writing) that data loss is acceptable if the db crashes, then go ahead and turn it off.

-- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] Dynamically Allocated System Resources
Running PG8.1 - will it recognize CPU and memory that are added dynamically to the server when the postmaster is running? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] [ADMIN] Regarding WAL
On Tue, 2007-04-24 at 09:02 +0530, Mageshwaran wrote: > Hi , > > I want to do replication using WAL , please tell the methods by which > log shipping is done ie moving the wal files to slaves and executing it. http://www.postgresql.org/docs/8.2/interactive/continuous-archiving.html -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 6: explain analyze is your friend
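The continuous-archiving setup in that link comes down to a couple of settings; a minimal sketch follows (the archive directory path is a placeholder, and on 8.x simply setting archive_command is what enables archiving):

```
# postgresql.conf on the primary
archive_command = 'cp %p /mnt/server/archivedir/%f'

# recovery.conf on the standby, used when replaying the shipped WAL
restore_command = 'cp /mnt/server/archivedir/%f %p'
```

%p expands to the path of the WAL segment and %f to its file name; the command must return zero only when the copy actually succeeded, since the server recycles the segment afterwards.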
Re: [GENERAL] Postgres and geographically diverse replication
On Wed, 2007-04-18 at 16:43 -0400, Bill Moran wrote: > In response to "Drew Myers" <[EMAIL PROTECTED]>: > > > > I've been given a task to build a couple of geographically separate > > servers, which are capable of replicating data between each other. > > > > I've surfed through various google results, and most of what I've found > > seems to be a bit dated, so I thought I'd pose my question here, perhaps > > for more detailed and more up-to-date info. > > > > Is this normally done in a push/pull scenario within the postgres > > installations themselves, or is additional software required? What are > > the various replication capabilities? > > > > I apologize for the general nature of my questions, I'm new to postgres > > and to geographically separate replication. Any tips, books, whitepapers > > or other resources you might be able to point me to is most appreciated. > > Generally speaking, when you're talking geographically separate, Slony > is your best bet. We're using it to maintain data on opposites sides of > the US with good success. Successfully using slony over a wide area is going to depend on how much data you are replicating, how fast the connection between the two sites is, and how stable the connection between the two sites is. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Missing magic block
Hi Martijn, Thank you very much for the suggestion: > > CREATE FUNCTION add_one(IN int) > > RETURNS int > > AS 'add_one' > > LANGUAGE C; I corrected this to say: AS 'Project1', 'add_one' And restarted psql (rebooted for that matter as well) and am still getting the same error. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] Missing magic block
I'm trying to learn how to write custom extensions to Postgres, so I wrote a basic C function to see how it works. However, I keep getting the following error, "Missing magic block", when I try to add the function to the database. According to the documentation in the manual, all I need to do is add the following:

#include "server/fmgr.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

to my C file and it will work. However, I guess it is not working? Below is the source code:

#include "server/postgres.h"
#include
#include "server/fmgr.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

int add_one(int arg)
{
    return arg + 1;
}

And the SQL statement I am using is:

CREATE FUNCTION add_one(IN int) RETURNS int AS 'add_one' LANGUAGE C;

Any feedback as to how to correct it is appreciated! Thanks, Brad ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
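For what it's worth, the usual shape of a working version (a sketch, not tested against this poster's setup) compiles against the server include directory reported by pg_config --includedir-server, so the headers are included without a "server/" prefix, and uses the version-1 calling convention rather than a plain C int function:

```c
/* add_one.c - illustrative sketch; build with the server headers, e.g.
 *   gcc -I$(pg_config --includedir-server) -fPIC -shared -o add_one.so add_one.c
 */
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;                 /* the "magic block" the error complains about */

PG_FUNCTION_INFO_V1(add_one);    /* version-1 calling convention */

Datum
add_one(PG_FUNCTION_ARGS)
{
    int32 arg = PG_GETARG_INT32(0);
    PG_RETURN_INT32(arg + 1);
}
```

Then, with the library path pointing at wherever the .so was installed: CREATE FUNCTION add_one(int) RETURNS int AS 'add_one', 'add_one' LANGUAGE C STRICT; The magic block has to end up in the shared library that is actually loaded, so the error will persist until the module is rebuilt and re-installed, no matter how often psql is restarted.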
[GENERAL] pg_dump and buffer usage
Question about pg_dump and Postgres 8.1. Assuming you've let your buffers settle and then you dump your database, will this clobber your shared buffers the way a seq scan against a large table will? -- Brad Nicholson 416-673-4106 [EMAIL PROTECTED] Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Recording insert, updates, and deletes
On Wed, 2007-01-10 at 16:51 +0100, Andy Dale wrote: > If anyone can help or offer advice on how to achieve my objective it > would be greatly appreciated. Slony log shipping will do this -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Unrecognized time zone name error.
On Tue, 2006-12-12 at 11:13 -0500, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > This seems odd. Any idea what's going on here? > > > template1=# SET TimeZone TO 'GMT'; > > ERROR: unrecognized time zone name: "GMT" > > Worksforme. Perhaps you are missing the /usr/share/pgsql/timezone/ > directory (your path might vary)? > > regards, tom lane So it's documented in the archives, the GMT file was missing from the directory you mentioned. Thx. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Unrecognized time zone name error.
This seems odd. Any idea what's going on here?

template1=# SET TimeZone TO 'GMT';
ERROR: unrecognized time zone name: "GMT"

template1=# SELECT version();
version
-
PostgreSQL 8.1.5 on powerpc-ibm-aix5.3.0.0, compiled by GCC gcc (GCC) 3.3.2
(1 row)

-- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Request for replication advice
On Fri, 2006-11-10 at 15:16 -0500, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote: > >> Those are two different methods: you'd use one or the other, not both. > > > Slony has its own log shipping, I think that was what he was referring > > to. > > Oh, OK, I was thinking of the trigger-driven version. Same thing, actually. There's an option that tells the slon daemon to write the data syncs to files instead of applying directly to a subscriber. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Request for replication advice
On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote: > "Brendan Jurd" <[EMAIL PROTECTED]> writes: > > So, my question for the list is: is Slony + log shipping the direction > > I should be investigating, or is there something else out that I ought > > to consider? > > Those are two different methods: you'd use one or the other, not both. Slony has its own log shipping, I think that was what he was referring to. > Slony-I is much the more battle-tested of the two at the moment. In > theory WAL log shipping should be higher performance for heavy-update > scenarios, but its latency is variable (low update rate = higher > latency), and not easy to put a bound on pre-8.2. I'm not entirely sure how battle tested the Slony log shipping stuff actually is. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Request for replication advice
On Sat, 2006-11-11 at 06:34 +1100, Brendan Jurd wrote:
> So, my question for the list is: is Slony + log shipping the direction
> I should be investigating, or is there something else out that I ought
> to consider? My understanding of WAL-based replication is that the

This is certainly the direction to look at. The speed at which replication keeps up on a log shipped node is going to depend on the frequency of inserts/updates/deletes, your class of hardware, and network connectivity. You also need to be absolutely certain that long running queries and 'idle in transaction' transactions are minimized as much as possible (or completely eliminated). Anything that prevents vacuums from doing their job properly is going to hurt a Slony replica.

Oh, and regarding log shipping itself, I recommend waiting for the release of 1.1.6 or 1.2.1; there is a fairly serious bug in the current version that causes log shipping to fall over if you have more than 2 nodes in your config (not just log shipped nodes). If you have more questions, please sign up for the Slony list.

-- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] log_duration and JDBC V3 Preparded Statements
On Mon, 2006-10-30 at 10:27 -0500, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > Will do. Is this strictly an 8.2 patch, or will it be back-ported to > > 8.1 and 7.4? > > We aren't going to change the behavior of logging that much in existing > releases --- or were you just thinking of that core-dump condition in > 8.1? That's fixed in 8.1.5. Nope, I was specifically thinking of the issue with prepared statements with JDBC V3 not logging their duration. Thanks. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] log_duration and JDBC V3 Preparded Statements
On Mon, 2006-10-30 at 10:14 -0500, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > I'm wondering what that status of the fix for this is. > > AFAIK it all works ... grab beta2 and try it. > Will do. Is this strictly an 8.2 patch, or will it be back-ported to 8.1 and 7.4? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] log_duration and JDBC V3 Preparded Statements
I'm wondering what the status of the fix for this is. Looking at the archives, it looks like Bruce had a patch: http://beta.linuxports.com/pgsql-jdbc/2006-08/msg00036.php I don't see anything in the release notes though. What's the status on this? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Stats Collector Won't Start
On Wed, 2006-10-18 at 15:59 -0400, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > On Wed, 2006-10-18 at 14:31 -0400, Tom Lane wrote: > >> Would you try strace'ing postmaster start to see what gets passed to the > >> socket() and bind() calls just before this message comes out? > > > Here is the relative output from truss. > > > socket(2, 2, 0) = 7 > > bind(7, 0x0001100E3BF0, 16) Err#13 EACCES > > Well, that's pretty odd --- I'd have expected to see something involving > 127.0.0.1 (ie, 7F01 in one byte order or another). Does your > /etc/hosts file map "localhost" to something other than 127.0.0.1? Nope, it maps localhost to 127.0.0.1 > Also, do regular connections to this postmaster work across TCP/IP? > If getaddrinfo() is broken I'd expect there to be problems binding > to the postmaster's listen socket too ... Yes, TCP/IP connections work fine. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Stats Collector Won't Start
On Wed, 2006-10-18 at 14:31 -0400, Tom Lane wrote:
> Brad Nicholson <[EMAIL PROTECTED]> writes:
> > On Wed, 2006-10-18 at 13:00 -0400, Tom Lane wrote:
> >> That's bizarre. What error conditions does your man page for bind(2)
> >> document as yielding EACCES? The only one mentioned on my systems is
> >> "protected address", but we aren't requesting a reserved port number ...
>
> > "The requested address is protected, and the current user does not have
> > permission to access it."
>
> That's what the Linux man page says, too. Could getaddrinfo somehow be
> returning bogus data that includes a reserved port number?
>
> Would you try strace'ing postmaster start to see what gets passed to the
> socket() and bind() calls just before this message comes out?

Here is the relevant output from truss:

open("/etc/hosts", O_RDONLY)= 7
kioctl(7, 22528, 0x, 0x) Err#25 ENOTTY
kfcntl(7, F_SETFD, 0x0001) = 0
kioctl(7, 22528, 0x, 0x) Err#25 ENOTTY
kread(7, " # @ ( # ) 4 7\t 1 . 1".., 4096)= 2770
kread(7, " # @ ( # ) 4 7\t 1 . 1".., 4096)= 0
close(7)= 0
socket(2, 2, 0) = 7
bind(7, 0x0001100E3BF0, 16) Err#13 EACCES
kwrite(2, " 2 0 0 6 - 1 0 -", 8)= 8
kwrite(2, " 1 8 1 9 : 0 1", 8)= 8
kwrite(2, " : 0 1 . 7 2 5 ", 8)= 8
kwrite(2, " U T C [ 2 8 2", 8)= 8
kwrite(2, " 9 3 2 ]", 8)= 8
kwrite(2, " L O G : c o", 8)= 8
kwrite(2, " u l d n o t ", 8)= 8
kwrite(2, " b i n d s o c", 8)= 8
kwrite(2, " k e t f o r ", 8)= 8
kwrite(2, " s t a t i s t i", 8)= 8
kwrite(2, " c s c o l l e", 8)= 8
kwrite(2, " c t o r : P e", 8)= 8
kwrite(2, " r m i s s i o n", 8)= 8
kwrite(2, " d e n i e d\n", 8)= 8
close(7)= 0
kwrite(2, " 2 0 0 6 - 1 0 -", 8)= 8
kwrite(2, " 1 8 1 9 : 0 1", 8)= 8
kwrite(2, " : 0 1 . 7 2 8 ", 8)= 8
kwrite(2, " U T C [ 2 8 2", 8)= 8
kwrite(2, " 9 3 2 ]", 8)= 8
kwrite(2, " L O G : d i", 8)= 8
kwrite(2, " s a b l i n g ", 8)= 8
kwrite(2, " s t a t i s t i", 8)= 8
kwrite(2, " c s c o l l e", 8)= 8
kwrite(2, " c t o r f o r", 8)= 8
kwrite(2, " l a c k o f", 8)= 8
kwrite(2, " w o r k i n g", 8)= 8
kwrite(2, " s o c k e t\n", 8)= 8

-- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Stats Collector Won't Start
On Wed, 2006-10-18 at 13:00 -0400, Tom Lane wrote: > Brad Nicholson <[EMAIL PROTECTED]> writes: > > Can someone please provide a bit of information where the following > > error is coming from? This is PG 8.1.3 on AIX 5.3 > > > LOG: could not bind socket for statistics collector: Permission denied > > That's bizarre. What error conditions does your man page for bind(2) > document as yielding EACCES? The only one mentioned on my systems is > "protected address", but we aren't requesting a reserved port number ... > > regards, tom lane "The requested address is protected, and the current user does not have permission to access it." -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
[GENERAL] Stats Collector Won't Start
Can someone please provide a bit of information about where the following error is coming from? This is PG 8.1.3 on AIX 5.3.

LOG: could not bind socket for statistics collector: Permission denied
LOG: disabling statistics collector for lack of working socket

What exactly does the PG user not have appropriate permissions to? -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] How does PG access wal files?
Is it by file name or by inode? Brad. ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Select Cast Error
I have a field that is varchar(15) type, and an example of the data I'm working with is PROJ-0001-06. I can make these two select statements work, but not together:

select cast((max(substring(test.test from 6 for 4))) AS INTEGER) + 1 FROM test;
select max(substring(test.test from 11 for 2)) FROM test;

I want this to work like this:

select cast((max(substring(test.test from 6 for 4))) as integer) + 1 FROM test where max(substring(test.test from 11 for 2));

The list below gives a better idea of what my table looks like and the result I need:

PROJ-0004-05
PROJ-0001-06
PROJ-0002-06
PROJ-0003-06

When I run my select statement I want to return the number 4. The idea is that I need the next highest number in the middle, but taking into consideration that the highest trailing numbers take precedence. The error I get is this:

ERROR: argument of WHERE must be type boolean, not type text
Re: [GENERAL] postgresql rising
On Wed, 2006-09-20 at 16:38 -0500, Philip Hallstrom wrote: > > On Wed, Sep 20, 2006 at 10:10:56AM -0500, Tony Caduto wrote: > >> For a high level corp manager all they ever hear about is MS SQL Server, > >> Oracle and DB2, and the more it costs the more they think it is what > >> they need :-) > > > > I think that description is false. At a certain point in the > > management hierarchy, the only way anyone has the ability to evaluate > > something is on the basis of reputation. > > I think that description is false. At a certain point in the management > hierarchy, the only way anyone has the ability to evaluate something is on > the basis of > > - if there is someone they can sue. Good luck attempting to sue Microsoft, Oracle or IBM for deficiencies in their database products. Brad. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Best approach for a "gap-less" sequence
On Thu, 2006-08-17 at 15:13 -0500, Scott Marlowe wrote: > On Thu, 2006-08-17 at 15:07, Merlin Moncure wrote: > > On 8/17/06, Brad Nicholson <[EMAIL PROTECTED]> wrote: > > > > > > > Hmm, I think you are wrong. There is a SELECT ... FOR UPDATE; > > > > > The first-to-obtain the gapless sequence transaction will establish > > > > > a lock onthe "tax_id" row. The other transaction will block until > > > > > the first transaction finishes (and the row is updated) and will > > > > > establish the row lock on it. > > > > > > > > yes, you are right...i didnt think the problem through properly. > > > > > > Lets just hope the performance on a concurrent system is not a > > > requirement of such a system... > > > > > > > right, if the transations are long running, there is a big problem as > > they are serialized around access to the sequence. however this is > > better than the control record approach because control record have > > problems with mvcc bloat. concurrent performance will of course be > > awful. > > > > a good compomise in some cases is to save off canceled transactions > > ids' in a free list you would still have to deal with transactions > > that were not gracefully cancelled though. > > Is it not possible in some circumstances to create the invoice first, > THEN assign a sequential ID after creation? If speed of access was an issue, that's how I'd look at doing it - batch assign them after the fact. Brad. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Best approach for a "gap-less" sequence
On Thu, 2006-08-17 at 16:07 -0400, Merlin Moncure wrote:
> On 8/17/06, Brad Nicholson <[EMAIL PROTECTED]> wrote:
> > > > Hmm, I think you are wrong. There is a SELECT ... FOR UPDATE;
> > > > The first-to-obtain the gapless sequence transaction will establish
> > > > a lock on the "tax_id" row. The other transaction will block until
> > > > the first transaction finishes (and the row is updated) and will
> > > > establish the row lock on it.
> > >
> > > yes, you are right...i didnt think the problem through properly.
> >
> > Lets just hope the performance on a concurrent system is not a
> > requirement of such a system...
>
> right, if the transations are long running, there is a big problem as
> they are serialized around access to the sequence. however this is
> better than the control record approach because control record have
> problems with mvcc bloat. concurrent performance will of course be
> awful.

This effect will be magnified if there are other long running transactions (pg_dump and pre-8.2 vacuum, I'm looking at you), as the dead tuples from the updates will start to pile up, reads to the table slow down, and locks persist for longer... ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Best approach for a "gap-less" sequence
On Thu, 2006-08-17 at 12:12 -0400, Merlin Moncure wrote:
> On 8/17/06, Dawid Kuroczko <[EMAIL PROTECTED]> wrote:
> > On 8/17/06, Merlin Moncure <[EMAIL PROTECTED]> wrote:
> > > On 8/16/06, Dawid Kuroczko <[EMAIL PROTECTED]> wrote:
> > > > -- then create a function to retrieve the values:
> > > > CREATE FUNCTION gseq_nextval(t text) RETURNS integer AS $$
> > > > DECLARE
> > > >    n integer;
> > > > BEGIN
> > > >    SELECT INTO n gseq_value+1 FROM gapless_seq WHERE gseq_name = t
> > > >    FOR UPDATE;
> > > >    UPDATE gapless_seq SET gapless_value = n WHERE gseq_name = t;
> > > >    RETURN n;
> > > > END;
> > > > $$ STABLE LANGUAGE PLpgsql;
> > >
> > > the problem here is if you have two concurrent transactions which call
> > > this funtion, it is possible for them both to return the same sequence
> > > number in read comitted mode. Using this funtion outside of
> > > transactions is no different that using a sequence except that it is
> > > slower.
> >
> > Hmm, I think you are wrong. There is a SELECT ... FOR UPDATE;
> > The first-to-obtain the gapless sequence transaction will establish
> > a lock onthe "tax_id" row. The other transaction will block until
> > the first transaction finishes (and the row is updated) and will
> > establish the row lock on it.
>
> yes, you are right...i didnt think the problem through properly.

Lets just hope the performance on a concurrent system is not a requirement of such a system... Brad. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Best approach for a "gap-less" sequence
On Mon, 2006-08-14 at 16:08 -0400, Berend Tober wrote:
> Jorge Godoy wrote:
> > Chris <[EMAIL PROTECTED]> writes:
> >
> >>I'm not sure what type of lock you'd need to make sure no other transactions
> >>updated the table (see
> >>http://www.postgresql.org/docs/8.1/interactive/sql-lock.html) but "in theory"
> >>something like this should work:
> >>
> >>begin;
> >>select id from table order by id desc limit 1;
> >>insert into table (id, blah) values (id+1, 'blah');
> >>commit;
> >
> > This is part of the solution, yes. But I would still need locking this table
> > so that no other concurrent transaction gets another "id". I don't want to
> > lock the main table --
>
> Wouldn't SELECT ... FOR UPDATE give you the row lock you need without
> locking the table?

Nope, concurrent transactions won't work.

Let the current max id = x.

Transaction 1 (t1) does a select max(id) for update, gets a lock on the last tuple at the time of the select, and gets x as a value for max id.

Transaction 2 (t2) does a select max(id) for update, and has to wait for t1 to release its lock.

t1 inserts (x+1) as the new max id of the table. t1 releases its lock.

t2 is granted the lock on the tuple it has been waiting for, which contains the max id of x.

t2 tries to insert a value of x+1; the insert fails (if it doesn't, you really want to have a close look at your constraints :-)

Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly