[HACKERS] W3C Specs: Web SQL Revisit
Revisiting my original post: http://archives.postgresql.org/pgsql-hackers/2010-11/msg00318.php The Web SQL spec is still in its abandoned state. Mozilla has stated that they do not want to support the API, though they do allow extensions to send calls to SQLite directly. Many posters responding to my OP noted that exposing SQL directly, even with permissions, is not something they're comfortable with. IndexedDB has gained acceptance across Mozilla, WebKit and Microsoft. SQL is not exposed directly. It's a simple system. IndexedDB is currently implemented in WebKit and Mozilla browsers using the SQLite library. MS recently implemented a .Net prototype. I'm going to compile libpq as a browser extension to prototype IndexedDB with Postgres, then work on patches to WebKit to develop a libpq flag [default: false] for WebKit build scripts. I will post back when I've got something to demonstrate (hoping to get to it in a few months). -Charles -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] ToDo List Item - System Table Index Clustering
Hello Postgres Hackers, In reference to this todo item about clustering system table indexes ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php ), I have been studying the system tables to see which would benefit from clustering. I have some index suggestions and a question if you have a moment.

Cluster Candidates:

pg_attribute: Make the existing index ( attrelid, attnum ) clustered to order it by table and column.
pg_attrdef: Existing index ( adrelid, adnum ) clustered to order it by table and column.
pg_constraint: Existing index ( conrelid ) clustered to get table constraints contiguous.
pg_depend: Existing index ( refclassid, refobjid, refobjsubid ) clustered so that when the referenced object is changed its dependencies are contiguous.
pg_description: Make the existing index ( objoid, classoid, objsubid ) clustered to order it by entity, catalog, and optional column. * Reversing the first two columns makes more sense to me ... catalog, object, column; or, since object implies catalog ( right? ), just dispensing with catalog altogether, but that would mean creating a new index.
pg_shdepend: Existing index ( refclassid, refobjid ) clustered for the same reason as pg_depend.
pg_statistic: Existing index ( starelid, staattnum ) clustered to order it by table and column.
pg_trigger: Make the existing index ( tgrelid, tgname ) clustered to order it by table then name, getting all the triggers on a table together.

Maybe Cluster:

pg_rewrite: Not sure about this one ... The existing index ( ev_class, rulename ) seems logical to cluster to get all the rewrite rules for a given table contiguous, but in the DBs available to me virtually every table has only one rewrite rule.
pg_auth_members: We could order it by role or by member of that role. Not sure which would be more valuable.

Stupid newbie question: is there a way to make queries on the system tables show me what is actually there when I'm poking around? 
So for example: Select * from pg_type limit 1; tells me that the typoutput is 'boolout'. An English string rather than a number. So even though the documentation says that column maps to pg_proc.oid I can't then write: Select * from pg_proc where oid = 'boolout'; It would be very helpful if I weren't learning the system, but since I am I'd like to turn it off for now. Fewer layers of abstraction. Thanks, Simone Aiken 303-956-7188 Quietly Competent Consulting
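[As a sketch of what the candidate list above amounts to in commands: clustering real system catalogs is normally disallowed without special settings, so this illustration uses a hypothetical stand-in table named my_attribute mirroring pg_attribute's key columns.]

```sql
-- Hypothetical stand-in table with pg_attribute's key columns.
CREATE TABLE my_attribute (attrelid oid, attnum int2, attname name);
CREATE UNIQUE INDEX my_attribute_relid_attnum ON my_attribute (attrelid, attnum);

-- Physically reorder the table by the index, as proposed for pg_attribute:
CLUSTER my_attribute USING my_attribute_relid_attnum;

-- Record the choice so a later bare CLUSTER re-sorts it again:
ALTER TABLE my_attribute CLUSTER ON my_attribute_relid_attnum;
```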
Re: [HACKERS] ToDo List Item - System Table Index Clustering
2011/1/16 Simone Aiken sai...@ulfheim.net: is there a way to make queries on the system tables show me what is actually there when I'm poking around? So for example: Select * from pg_type limit 1; tells me that the typoutput is 'boolout'. An English string rather than a number. So even though the documentation says that column maps to pg_proc.oid I can't then write: Select * from pg_proc where oid = 'boolout'; The type of typoutput is regproc, which is really an oid with a different output function. To get the numeric value, do: Select typoutput::oid from pg_type limit 1; Nicolas
Re: [HACKERS] psql: Add \dL to show languages
On Sun, Jan 16, 2011 at 02:26, Andreas Karlsson andr...@proxel.se wrote: Hi Josh, Contents and Purpose This patch adds the \dL command in psql to list the procedural languages. snip Some things I noticed when using it though. I do not like the use of parentheses in the usage description list (procedural) languages. Why not have it simply as list procedural languages? Because it lists non-procedural languages as well? (I didn't check it, that's just a guess) Should we include a column in \dL+ for the laninline function (DO blocks)? +1 for doing that. Remember it has to be version-dependent though. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Maintenance downtime for commitfest.postgresql.org and media.postgresql.org
On 01/14/2011 05:23 PM, Stefan Kaltenbrunner wrote: Hi all! In preparation of some further improvements to our infrastructure, the sysadmin team needs to move coridan.postgresql.org (aka commitfest.postgresql.org) and mintaka.postgresql.org (media.postgresql.org) to a different host within the same datacenter. We are planning to do that move starting Sunday January 16th 12:00 to 13:00 (GMT). The expected downtime is ~30min per VM and we will send an "all is good" notice after the work is done - it would be good to not make any changes to the commitfest interface until you see that notice. In preparation for the move we also have reduced the TTLs of the affected records down to 5 minutes to make the DNS migration go as fast as possible. Move completed and dns updated - if something does not work as expected just tell us :) For reference the new IPs are: commitfest.postgresql.org/coridan.postgresql.org - 67.192.136.130 media.postgresql.org/mintaka.postgresql.org - 67.192.136.129 Stefan
[HACKERS] replication and pg_hba.conf
In 9.0, we specifically require using "replication" as the database name to start a replication session. In 9.1 we will have the REPLICATION attribute on a role - should we change it so that "all" in the database column includes replication connections? It certainly goes along the principle-of-least-surprise path.. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] Replication logging
Currently, replication connections *always* log something like: LOG: replication connection authorized: user=mha host=[local] There's no way to turn that off. I can't find the reasoning behind this - why is this one not controlled by log_connections like normal ones? There's a comment in the code that says this is intentional, but I can't figure out why... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_basebackup for streaming base backups
On Sat, Jan 15, 2011 at 23:10, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: That should be -D --pgdata, for consistency with pg_dump. pg_dump doesn't have a -D. I assume you mean pg_ctl / initdb? Yes, sorry, been too fast. Ok. Updated patch that includes this change attached. I also changed the tar directory from -t to -T, for consistency. It also includes the change to take -h host, -U user, -w/-W for password -p port instead of a conninfo string. Another option, I think Heikki mentioned this on IM at some point, is to do something like name it oid-name.tar. That would give us best of both worlds? Well I'd think we know the pg_tablespace columns encoding, so the problem might be the filesystem encodings, right? Well there's also the Do we really? That's one of the global catalogs that don't really have an encoding, isn't it? option of creating oid.tar and have a symlink to it called name.tar but that's pushing it. I don't think naming after OIDs is a good service for users, but if that's all we can reasonably do… Yeah, symlink seems to be making things way too complex. oid-name seems is perhaps a reasonable compromise? Will continue reviewing and post something more polished and comprehensive next week — mainly wanted to see if you wanted to include pg_ctl command in the patch already. Ok, thanks. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index db7c834..c14ae43 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -813,6 +813,16 @@ SELECT pg_stop_backup(); /para para +You can also use the xref linkend=app-pgbasebackup tool to take +the backup, instead of manually copying the files. 
This tool will take +care of the functionpg_start_backup()/, copy and +functionpg_stop_backup()/ steps automatically, and transfers the +backup over a regular productnamePostgreSQL/productname connection +using the replication protocol, instead of requiring filesystem level +access. + /para + + para Some file system backup tools emit warnings or errors if the files they are trying to copy change while the copy proceeds. When taking a base backup of an active database, this situation is normal diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index f40fa9d..c44d11e 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -160,6 +160,7 @@ Complete list of usable sgml source files in this directory. !entity dropuser system dropuser.sgml !entity ecpgRefsystem ecpg-ref.sgml !entity initdb system initdb.sgml +!entity pgBasebackup system pg_basebackup.sgml !entity pgConfig system pg_config-ref.sgml !entity pgControldata system pg_controldata.sgml !entity pgCtl system pg_ctl-ref.sgml diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml new file mode 100644 index 000..dc8e2f4 --- /dev/null +++ b/doc/src/sgml/ref/pg_basebackup.sgml @@ -0,0 +1,313 @@ +!-- +doc/src/sgml/ref/pg_basebackup.sgml +PostgreSQL documentation +-- + +refentry id=app-pgbasebackup + refmeta + refentrytitlepg_basebackup/refentrytitle + manvolnum1/manvolnum + refmiscinfoApplication/refmiscinfo + /refmeta + + refnamediv + refnamepg_basebackup/refname + refpurposetake a base backup of a productnamePostgreSQL/productname cluster/refpurpose + /refnamediv + + indexterm zone=app-pgbasebackup + primarypg_basebackup/primary + /indexterm + + refsynopsisdiv + cmdsynopsis + commandpg_basebackup/command + arg rep=repeatreplaceableoption//arg + /cmdsynopsis + /refsynopsisdiv + + refsect1 + title + Description + /title + para + applicationpg_basebackup/application is used to take base backups of + a running 
productnamePostgreSQL/productname database cluster. These + are taken without affecting other clients to the database, and can be used + both for point-in-time recovery (see xref linkend=continuous-archiving) + and as the starting point for a log shipping or streaming replication standby + server (see xref linkend=warm-standby). + /para + + para + applicationpg_basebackup/application makes a binary copy of the database + cluster files, while making sure the system is automatically put in and + out of backup mode automatically. Backups are always taken of the entire + database cluster, it is not possible to back up individual databases or + database objects. For individual database backups, a tool such as + xref linkend=APP-PGDUMP must be used. + /para + + para + The backup is made over a regular productnamePostgreSQL/productname + connection, and uses the replication protocol. The connection must be + made with a user having literalREPLICATION/literal permissions (see + xref linkend=role-attributes). + /para + + para + Only one backup can be
[HACKERS] pg_stat_replication security
pg_stat_replication shows all replication information to all users, no requirement to be a superuser or anything. That leaks a bunch of information that regular pg_stat_activity doesn't - such as clients' IP addresses. And also of course all the replication info itself, which may or may not be a problem. I suggest pg_stat_replication do just like pg_stat_activity, which is to return NULL in most fields if the user isn't (superuser||same_user_as_that_session). -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
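[A sketch of the proposed filtering, as a query; this is illustrative only - the real change would go in the system view definition, and the column names here follow 9.1's pg_stat_replication.]

```sql
-- Show sensitive fields only to superusers or the matching user,
-- mirroring pg_stat_activity's behavior (sketch, not the actual view).
SELECT usename,
       CASE WHEN current_setting('is_superuser') = 'on'
                 OR usename = current_user
            THEN client_addr END AS client_addr,
       CASE WHEN current_setting('is_superuser') = 'on'
                 OR usename = current_user
            THEN sent_location END AS sent_location
FROM pg_stat_replication;
```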
[HACKERS] walreceiver fallback_application_name
Since we now show the application name in pg_stat_replication, I think it would make sense to have the walreceiver set fallback_application_name on the connection string, like so:

diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiv
index c052df2..962ee04 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -92,7 +92,7 @@ libpqrcv_connect(char *conninfo, XLogRecPtr startpoint)
 	 * replication for .pgpass lookup.
 	 */
 	snprintf(conninfo_repl, sizeof(conninfo_repl),
-			 "%s dbname=replication replication=true",
+			 "%s dbname=replication replication=true fallback_application_name=postgres",
 			 conninfo);
 	streamConn = PQconnectdb(conninfo_repl);

Using fallback_application_name, it can still be overridden in primary_conninfo if the user wants to use it to separate different instances. Any objections to that? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Recovery control functions
On Sat, Jan 15, 2011 at 12:17, Simon Riggs si...@2ndquadrant.com wrote: On Sat, 2011-01-15 at 20:11 +0900, Fujii Masao wrote: On Fri, Jan 14, 2011 at 9:00 PM, Simon Riggs si...@2ndquadrant.com wrote: How hard would it be to have a pg_xlog_replay_until(xlog location or timestamp), to have it resume recovery up to that point and then pause again? You can already do that for timestamps. You mean using recovery_target_time and pause_at_recovery_target? The problem is that we cannot continue recovery after the pause by them. If we resume recovery after the pause, recovery ends immediately. Shutdown while paused, alter parameter, restart. That's something I'd very much like to avoid - being able to say continue-until using the function would be very nice. Consider for example doing this from pgadmin. So I'm back to my original question which is, how much work would this be? I don't know my way around that part so I can't estimate, and what's there so far is certainly a lot better than nothing, but if it's not a huge amount of work it would be a great improvement. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] We need to log aborted autovacuums
Robert Haas wrote: On Sat, Jan 15, 2011 at 11:14 AM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Smith g...@2ndquadrant.com writes: Does try_relation_open need to have a lock acquisition timeout when AV is calling it? Hmm. I think when looking at the AV code, I've always subconsciously assumed that try_relation_open would fail immediately if it couldn't get the lock. That certainly seems like it would be a more appropriate way to behave than delaying indefinitely. I'm confused how that's not happening already. What does "try" mean, otherwise? Apparently "try" means acquire the requested lock on the oid of the relation, waiting for any amount of time for that part, and then check if the relation really exists or not once it's got it. In this context, it means it will try to open the relation, but might fail if it doesn't actually exist anymore. The relation not existing when it makes the check after the lock is acquired is the only way it will return a failure. try_relation_open calls LockRelationOid, which blocks. There is also a ConditionalLockRelationOid, which does the same basic thing except it exits immediately, with a false return code, if it can't acquire the lock. I think we just need to nail down in what existing cases it's acceptable to have try_relation_open use ConditionalLockRelationOid instead, which would make it do what all of us reading the code thought it did all along. 
There are four callers of try_relation_open to be concerned about here:

src/backend/commands/vacuum.c: vacuum_rel: onerel = try_relation_open(relid, lmode);
src/backend/commands/analyze.c: analyze_rel: onerel = try_relation_open(relid, ShareUpdateExclusiveLock);
src/backend/commands/cluster.c: cluster_rel: OldHeap = try_relation_open(tableOid, AccessExclusiveLock);
src/backend/commands/lockcmds.c: LockTableRecurse (Apply LOCK TABLE recursively over an inheritance tree): rel = try_relation_open(reloid, NoLock);

I think that both the vacuum_rel and analyze_rel cases are capable of figuring out if they are the autovacuum process, and if so calling the fast non-blocking version of this. I wouldn't want to mess with the other two, which rely upon the current behavior as far as I can see. Probably took me longer to write this e-mail than the patch will take. Since I've got trivial patch fever this weekend and already have the test case, I'll do this one next. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books
Re: [HACKERS] replication and pg_hba.conf
Magnus Hagander mag...@hagander.net writes: In 9.0, we specifically require using replication as database name to start a replication session. In 9.1 we will have the REPLICATION attribute to a role - should we change it so that all in database includes replication connections? It certainly goes in the principle of least surprise path.. No, not at all. If we had set things up so that roles with replication bit could *only* do replication, it might be sensible to think about that, but we didn't. regards, tom lane
Re: [HACKERS] walreceiver fallback_application_name
Magnus Hagander mag...@hagander.net writes: Since we now show the application name in pg_stat_replication, I think it would make sense to have the walreceiver set fallback_application_name on the connection string, like so: Seems reasonable, but postgres is a mighty poor choice of name for that, no? I don't have any really great substitute suggestion --- best I can do offhand is walreceiver --- but postgres seems uselessly generic, not to mention potentially confusing compared to the default superuser name for instance. regards, tom lane
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Nicolas Barbier nicolas.barb...@gmail.com writes: 2011/1/16 Simone Aiken sai...@ulfheim.net: ... So even though the documentation says that column maps to pg_proc.oid I can't then write: Select * from pg_proc where oid = 'boolout'; The type of typoutput is regproc, which is really an oid with a different output function. To get the numeric value, do: Select typoutput::oid from pg_type limit 1; Also, you *can* go back the other way. It's very common to write Select * from pg_proc where oid = 'boolout'::regproc rather than looking up the OID first. There are similar pseudotypes for relation and operator names; see Object Identifier Types in the manual. regards, tom lane
Re: [HACKERS] We need to log aborted autovacuums
Greg Smith g...@2ndquadrant.com writes: try_relation_open calls LockRelationOid, which blocks. There is also a ConditionalLockRelationOid, which does the same basic thing except it exits immediately, with a false return code, if it can't acquire the lock. I think we just need to nail down in what existing cases it's acceptable to have try_relation_open use ConditionalLockRelationOid instead, which would make it do what all of us reading the code thought it did all along. No, I don't believe we should be messing with the semantics of try_relation_open. It is what it is. The right way to fix this is similar to what LockTableRecurse does, ie call ConditionalLockRelationOid itself. I tried changing vacuum_rel that way yesterday, but the idea crashed when I realized that vacuum_rel doesn't have the name of the target relation, only its OID, so it can't log any very useful message ... and according to the original point of this thread, we're surely going to want an elog(LOG) when we can't get the lock. I think the best thing to do is probably to have autovacuum.c do the ConditionalLockRelationOid call before entering vacuum.c (making the later acquisition of the lock by try_relation_open redundant, but it will be cheap enough to not matter). But I haven't chased the details. regards, tom lane
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
Robert Haas robertmh...@gmail.com writes: On Sat, Jan 15, 2011 at 10:25 AM, Noah Misch n...@leadboat.com wrote: Do you value test coverage so little? If you're asking whether I think real-world usability is more important than test coverage, then yes. Quite honestly, I'd be inclined to rip out most of the DEBUG messages I see in that regression test altogether. They are useless, and so is the regression test itself. An appropriate regression test would involve something more like checking that the relfilenode changed and then checking that the contained data is still sane. regards, tom lane
Re: [HACKERS] walreceiver fallback_application_name
On Sun, Jan 16, 2011 at 17:29, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: Since we now show the application name in pg_stat_replication, I think it would make sense to have the walreceiver set fallback_application_name on the connection string, like so: Seems reasonable, but postgres is a mighty poor choice of name for that, no? I don't have any really great substitute suggestion --- best I can do offhand is walreceiver --- but postgres seems uselessly generic, not to mention potentially confusing compared to the default superuser name for instance. I agree it's not a great name. Is walreceiver something that the average DBA is going to realize what it is? Perhaps go for something like replication slave? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_basebackup for streaming base backups
Magnus Hagander mag...@hagander.net writes: + * The file will be named base.tar[.gz] if it's for the main data directory + * or tablespaceoid.tar[.gz] if it's for another tablespace. Well we have UNIQUE, btree (spcname), so maybe we can use that here? We could, but that would make it more likely to run into encoding issues and such - do we restrict what can be in a tablespace name? No. Don't even think of going there --- we got rid of user-accessible names in the filesystem years ago and we're not going back. Consider CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; regards, tom lane
Re: [HACKERS] We need to log aborted autovacuums
On Sun, 2011-01-16 at 11:47 -0500, Tom Lane wrote: Greg Smith g...@2ndquadrant.com writes: try_relation_open calls LockRelationOid, which blocks. There is also a ConditionalLockRelationOid, which does the same basic thing except it exits immediately, with a false return code, if it can't acquire the lock. I think we just need to nail down in what existing cases it's acceptable to have try_relation_open use ConditionalLockRelationOid instead, which would make it do what all of us reading the code thought it did all along. No, I don't believe we should be messing with the semantics of try_relation_open. It is what it is. The right way to fix this is similar to what LockTableRecurse does, ie call ConditionalLockRelationOid itself. I tried changing vacuum_rel that way yesterday, but the idea crashed when I realized that vacuum_rel doesn't have the name of the target relation, only its OID, so it can't log any very useful message ... and according to the original point of this thread, we're surely going to want an elog(LOG) when we can't get the lock. I think the best thing to do is probably to have autovacuum.c do the ConditionalLockRelationOid call before entering vacuum.c (making the later acquisition of the lock by try_relation_open redundant, but it will be cheap enough to not matter). But I haven't chased the details. I'm fairly confused by this thread. We *do* emit a message when we cancel an autovacuum task. We went to a lot of trouble to do that. The message is DEBUG2, and says "sending cancel to blocking autovacuum pid =". We just need to make that LOG, not DEBUG2. The autovacuum itself then says "canceling autovacuum task" when canceled. It doesn't say what table the autovacuum was running on when canceled, but that seems like an easy thing to add since the AV does know that. I can't see any reason to differentiate between manually canceled AVs and automatically canceled AVs. It's all the same thing. 
Not really sure what it is you're talking about above or how that relates to log messages for AV. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] pg_basebackup for streaming base backups
On Sun, Jan 16, 2011 at 18:18, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: + * The file will be named base.tar[.gz] if it's for the main data directory + * or tablespaceoid.tar[.gz] if it's for another tablespace. Well we have UNIQUE, btree (spcname), so maybe we can use that here? We could, but that would make it more likely to run into encoding issues and such - do we restrict what can be in a tablespace name? No. Don't even think of going there --- we got rid of user-accessible names in the filesystem years ago and we're not going back. Consider CREATE TABLESPACE "/foo/bar" LOCATION '/foo/bar'; Well, we'd try to name the file for that oid-/foo/bar.tar, which I guess would break badly, yes. I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, which would still be useful for the majority of cases, I think? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
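[Such a normalization could look like the following sketch; note two distinct names can collapse to the same string after stripping, so the OID prefix is what keeps the file names unique.]

```sql
-- Strip everything outside [a-zA-Z0-9] from the tablespace name and
-- prefix the OID: a tablespace "/foo/bar" becomes e.g. '16385-foobar.tar'.
SELECT oid,
       oid || '-' || regexp_replace(spcname, '[^a-zA-Z0-9]', '', 'g') || '.tar'
           AS tar_name
FROM pg_tablespace;
```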
Re: [HACKERS] LOCK for non-tables
Robert Haas robertmh...@gmail.com writes: Do we wish to officially document LOCK without TABLE as a good idea to start avoiding, in case we decide to do something about that in the future? I'm still not for fixing the ambiguity that way. TABLE is an optional noise word in other contexts, notably GRANT/REVOKE where that syntax is dictated by the SQL standard. It would be inconsistent to have it be required in LOCK. I think we should deprecate using NOWAIT without an IN...MODE clause. Another possibility is to disallow just the single case LOCK tablename NOWAIT; i.e., you can write NOWAIT if you include *either* the object type or the IN...MODE clause. This is not too hard as far as the grammar is concerned, but I'm not exactly sure how to document it. regards, tom lane
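[Concretely, the latter proposal would reject only the one bare form; a sketch of the intended surface:]

```sql
LOCK foo NOWAIT;                             -- the ambiguous bare form: would be disallowed
LOCK TABLE foo NOWAIT;                       -- object type given: still accepted
LOCK foo IN ACCESS EXCLUSIVE MODE NOWAIT;    -- mode clause given: still accepted
```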
Re: [HACKERS] auto-sizing wal_buffers
Josh Berkus j...@agliodbs.com writes: I think we can be more specific on that last sentence; is there even any *theoretical* benefit to settings above 16MB, the size of a WAL segment? IIRC there's a forced fsync at WAL segment switch, so no. regards, tom lane
Re: [HACKERS] textarray option for file FDW
On 01/15/2011 07:41 PM, Andrew Dunstan wrote: On 01/15/2011 12:29 PM, Andrew Dunstan wrote: I've been waiting for the latest FDW patches as patiently as I can, and I've been reviewing them this morning, in particular the file_fdw patch and how it interacts with the newly exposed COPY API. Overall it seems to be a whole lot cleaner, and the wholesale duplication of the copy code is gone, so it's much nicer and cleaner. So now I'd like to add a new option to it: textarray. This option would require that the foreign table have exactly one field, of type text[], and would compose all the field strings read from the file for each record into the array (however many there are). This would require a few changes to contrib/file_fdw/file_fdw.c and a few changes to src/backend/commands/copy.c, which I can probably have done in fairly short order, Deo Volente. This will allow something like: CREATE FOREIGN TABLE arr_text ( t text[] ) SERVER file_server OPTIONS (format 'csv', filename '/path/to/ragged.csv', textarray 'true'); SELECT t[3]::int as a, t[1]::timestamptz as b, t[99] as not_there FROM arr_text; A WIP patch is attached. It's against Shigeru Hanada's latest FDW patches. It's surprisingly tiny. Right now it probably leaks memory like a sieve, and that's the next thing I'm going to chase down. Updated patch attached, that should use memory better. cheers andrew *** a/contrib/file_fdw/file_fdw.c --- b/contrib/file_fdw/file_fdw.c *** *** 58,67 static struct FileFdwOption valid_options[] = { { quote, ForeignTableRelationId }, { escape, ForeignTableRelationId }, { null, ForeignTableRelationId }, /* FIXME: implement force_not_null option */ ! /* Centinel */ { NULL, InvalidOid } }; --- 58,68 { quote, ForeignTableRelationId }, { escape, ForeignTableRelationId }, { null, ForeignTableRelationId }, + { textarray, ForeignTableRelationId }, /* FIXME: implement force_not_null option */ ! 
/* Sentinel */ { NULL, InvalidOid } }; *** *** 134,139 file_fdw_validator(PG_FUNCTION_ARGS) --- 135,141 char *escape = NULL; char *null = NULL; boolheader; + booltextarray; /* Only superuser can change generic options of the foreign table */ if (catalog == ForeignTableRelationId !superuser()) *** *** 220,225 file_fdw_validator(PG_FUNCTION_ARGS) --- 222,231 errmsg(null representation cannot use newline or carriage return))); null = strVal(def-arg); } + else if (strcmp(def-defname, textarray) == 0) + { + textarray = defGetBoolean(def); + } } /* Check options which depend on the file format. */ *** a/src/backend/commands/copy.c --- b/src/backend/commands/copy.c *** *** 44,49 --- 44,51 #include utils/memutils.h #include utils/snapmgr.h + /* initial size for arrays in textarray mode */ + #define TEXTARRAY_SIZE 64 #define ISOCTAL(c) (((c) = '0') ((c) = '7')) #define OCTVALUE(c) ((c) - '0') *** *** 117,122 typedef struct CopyStateData --- 119,127 bool *force_quote_flags; /* per-column CSV FQ flags */ bool *force_notnull_flags;/* per-column CSV FNN flags */ + /* param from FDW */ + bool text_array; /* scan to a single text array field */ + /* these are just for error messages, see CopyFromErrorCallback */ const char *cur_relname;/* table name for error messages */ int cur_lineno; /* line number for error messages */ *** *** 970,975 BeginCopy(bool is_from, --- 975,988 errmsg(argument to option \%s\ must be a list of column names, defel-defname))); } + else if (strcmp(defel-defname, textarray) == 0) + { + if (cstate-text_array) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg(conflicting or redundant options))); + cstate-text_array = defGetBoolean(defel); + } else ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), *** *** 1109,1114
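The semantics the textarray option gives file_fdw — every field string of a (possibly ragged) CSV record collected into a single text[] value, however many fields there are — can be sketched outside the server. A hedged illustration in Python (the csv module stands in for the COPY parser; `read_as_textarrays` is a made-up name, not part of the patch):

```python
import csv
import io

def read_as_textarrays(data):
    """Model of the proposed textarray mode: each CSV record becomes one
    array of its field strings, rather than being mapped onto a fixed
    column list, so ragged files parse without error."""
    return [row for row in csv.reader(io.StringIO(data))]

rows = read_as_textarrays("a,b,c\n1,2\n")
# Note SQL arrays are 1-based, so SELECT t[3] corresponds to row[2] here.
```

Positional access then mirrors the `SELECT t[3]::int ...` query in the message, with out-of-range subscripts (like t[99]) simply yielding NULL on the SQL side.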
Re: [HACKERS] Include WAL in base backup
Magnus Hagander mag...@hagander.net writes: What if you start a concurrent process that begins streaming the WAL segments just before you start the backup, and you stop it after having stopped the backup. I would think that then, the local received files would be complete. We would only need a program able to stream the WAL segments and build WAL files from that… do you know about one? :) Sure, if you stream the backups on the side, then you don't need this feature. This is for very simple filesystem backups, without the need to set up streaming of archiving. What I meant is: why don't we provide an option to automate just that behavior in pg_basebackup? It looks like a fork() then calling code you already wrote. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: patch: fix performance problems with repeated decompression of varlena values in plpgsql
Hello 2011/1/15 Noah Misch n...@leadboat.com: Hello Pavel, I'm reviewing this patch for CommitFest 2011-01. Thank you very much, I am sending a updated version with little bit more comments. But I am sure, so somebody with good English have to edit my comments. Minimally I hope, so your questions will be answered. The patch seems fully desirable. It appropriately contains no documentation updates. It contains no new tests, and that's probably fine, too; I can't think of any corner cases where this would do something other than work correctly or break things comprehensively. Using your test case from here: http://archives.postgresql.org/message-id/aanlktikdcw+c-c4u4ngaobhpfszkb5uy_zuatdzfp...@mail.gmail.com I observed a 28x speedup, similar to Álvaro's report. I also made my own test case, attached, to evaluate this from a somewhat different angle and also to consider the worst case. With a 100 KiB string (good case), I see a 4.8x speedup. With a 1 KiB string (worst case - never toasted), I see no statistically significant change. This is to be expected. A few specific questions and comments follow, all minor. Go ahead and mark the patch ready for committer when you've acted on them, or declined to do so, to your satisfaction. I don't think a re-review will be needed. Thanks, nm On Mon, Nov 22, 2010 at 01:46:51PM +0100, Pavel Stehule wrote: *** ./pl_exec.c.orig 2010-11-16 10:28:42.0 +0100 --- ./pl_exec.c 2010-11-22 13:33:01.597726809 +0100 The patch applies cleanly, but the format is slightly nonstandard (-p0 when applied from src/pl/plpgsql/src, rather than -p1 from the root). *** *** 3944,3949 --- 3965,3993 *typeid = var-datatype-typoid; *typetypmod = var-datatype-atttypmod; + + /* + * explicit deTOAST and decomprim for varlena types decompress, perhaps? 
fixed + */ + if (var-should_be_detoasted) + { + Datum dvalue; + + Assert(!var-isnull); + + oldcontext = MemoryContextSwitchTo(estate-fn_mcxt); + dvalue = PointerGetDatum(PG_DETOAST_DATUM(var-value)); + if (dvalue != var-value) + { + if (var-freeval) + free_var(var); + var-value = dvalue; + var-freeval = true; + } + MemoryContextSwitchTo(oldcontext); This line adds trailing white space. + var-should_be_detoasted = false; + } + *value = var-value; *isnull = var-isnull; break; *** ./plpgsql.h.orig 2010-11-16 10:28:42.0 +0100 --- ./plpgsql.h 2010-11-22 13:12:38.897851879 +0100 *** *** 644,649 --- 645,651 bool fn_is_trigger; PLpgSQL_func_hashkey *fn_hashkey; /* back-link to hashtable key */ MemoryContext fn_cxt; + MemoryContext fn_mcxt; /* link to function's memory context */ Oid fn_rettype; int fn_rettyplen; *** *** 692,697 --- 694,701 Oid rettype; /* type of current retval */ Oid fn_rettype; /* info about declared function rettype */ + MemoryContext fn_mcxt; /* link to function's memory context */ + bool retistuple; bool retisset; I only see PLpgSQL_execstate.fn_mcxt getting populated in this patch. Is the PLpgSQL_function.fn_mcxt leftover from an earlier design? I have to access to top execution context from exec_eval_datum function. This function can be called from parser's context, and without explicit switch to top execution context a variables are detoasted in wrong context. I could not readily tell the difference between fn_cxt and fn_mcxt. As I understand it, fn_mcxt is the SPI procCxt, and fn_cxt is the long-lived context used to cache facts across many transactions. Perhaps name the member something like top_cxt, exec_cxt or proc_cxt and comment it like /* SPI Proc memory context */ to make this clearer. I used a top_exec_cxt name Pavel Stehule Regards *** ./src/pl/plpgsql/src/pl_exec.c.orig 2011-01-16 14:18:59.0 +0100 ---
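The core idea of the patch — detoast/decompress a plpgsql variable once, in the function's memory context, then clear its should_be_detoasted flag so repeated reads reuse the plain value — amounts to a one-shot cache. A toy model (the class and field names are illustrative, not the real PLpgSQL_var struct; upper-casing stands in for PG_DETOAST_DATUM):

```python
class PlpgsqlVar:
    """Sketch of the patch's caching: detoast on first read only."""
    def __init__(self, stored):
        self.value = stored
        self.should_be_detoasted = True   # set when a toasted value is assigned
        self.detoast_calls = 0

    def _detoast(self, v):
        self.detoast_calls += 1           # stand-in for PG_DETOAST_DATUM()
        return v.upper()

    def read(self):
        # The pre-patch behavior is equivalent to detoasting on *every*
        # read; here the flag is cleared so later reads skip the work.
        if self.should_be_detoasted:
            self.value = self._detoast(self.value)
            self.should_be_detoasted = False
        return self.value

var = PlpgsqlVar("abc")
results = [var.read() for _ in range(3)]
```

Three reads trigger exactly one detoast, which is where the reported ~28x speedup on repeated access to large values comes from.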
Re: [HACKERS] pg_basebackup for streaming base backups
Magnus Hagander mag...@hagander.net writes: On Sun, Jan 16, 2011 at 18:18, Tom Lane t...@sss.pgh.pa.us wrote: No. Don't even think of going there --- we got rid of user-accessible names in the filesystem years ago and we're not going back. Consider CREATE TABLESPACE /foo/bar LOCATION '/foo/bar'; Well, we'd try to name the file for that oid-/foo/bar.tar, which I guess would break badly, yes. I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, which would still be useful for the majority of cases, I think? Well if we're not using user names, there's no good choice except for system name, and the one you're making up here isn't the true one… Now I think the unfriendliness is around the fact that you need to prepare (untar, unzip) and start a cluster from the backup to be able to know what file contains what. Is it possible to offer a tool that lists the logical objects contained into each tar file? Maybe adding a special section at the beginning of each. That would be logically like pg_dump catalog, but implemented as a simple noise file that you simply `cat` with some command. Once more, I'm still unclear how important that is, but it's scratching. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
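Magnus's normalize-into-[a-zA-Z0-9] idea might look like the following hypothetical helper (not actual pg_basebackup code). Note that distinct tablespace names can collide after normalization, which is one more argument for the OID-based naming the thread settles on:

```python
import re

def tar_name_for_tablespace(name):
    """Map an arbitrary tablespace name to a filesystem-safe tar name by
    collapsing everything outside [a-zA-Z0-9] to underscores, roughly as
    suggested in the thread. Purely illustrative."""
    safe = re.sub(r"[^a-zA-Z0-9]+", "_", name).strip("_")
    return (safe or "tablespace") + ".tar"
```

For example, a tablespace located at /foo/bar would come out as foo_bar.tar, which is "useful for the majority of cases" but not unambiguous.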
Re: [HACKERS] pg_basebackup for streaming base backups
Magnus Hagander mag...@hagander.net writes: Well, we'd try to name the file for that oid-/foo/bar.tar, which I guess would break badly, yes. I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, which would still be useful for the majority of cases, I think? Just stick with the OID. There's no reason that I can see to have friendly names for these tarfiles --- in most cases, the DBA will never even deal with them, no? regards, tom lane
Re: [HACKERS] pg_basebackup for streaming base backups
On Sun, Jan 16, 2011 at 18:59, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: Well, we'd try to name the file for that oid-/foo/bar.tar, which I guess would break badly, yes. I guess we could normalize the tablespace name into [a-zA-Z0-9] or so, which would still be useful for the majority of cases, I think? Just stick with the OID. There's no reason that I can see to have friendly names for these tarfiles --- in most cases, the DBA will never even deal with them, no? No, this is the output mode where the DBA chooses to get the output in the form of tarfiles. So if chosen, he will definitely deal with it. When we unpack the tars right away to a directory, they have no name, so that doesn't apply here. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_basebackup for streaming base backups
Magnus Hagander mag...@hagander.net writes: On Sun, Jan 16, 2011 at 18:59, Tom Lane t...@sss.pgh.pa.us wrote: Just stick with the OID. There's no reason that I can see to have friendly names for these tarfiles --- in most cases, the DBA will never even deal with them, no? No, this is the output mode where the DBA chooses to get the output in the form of tarfiles. So if chosen, he will definitely deal with it. Mph. How big a use-case has that got? Offhand I can't see a reason to use it at all, ever. If you're trying to set up a clone you want the files unpacked. regards, tom lane
Re: [HACKERS] pg_basebackup for streaming base backups
On Sun, Jan 16, 2011 at 19:03, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: On Sun, Jan 16, 2011 at 18:59, Tom Lane t...@sss.pgh.pa.us wrote: Just stick with the OID. There's no reason that I can see to have friendly names for these tarfiles --- in most cases, the DBA will never even deal with them, no? No, this is the output mode where the DBA chooses to get the output in the form of tarfiles. So if chosen, he will definitely deal with it. Mph. How big a use-case has that got? Offhand I can't see a reason to use it at all, ever. If you're trying to set up a clone you want the files unpacked. Yes, but the tool isn't just for setting up a clone. If you're doing a regular base backup, that's *not* for replication, you might want them in files. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] We need to log aborted autovacuums
Simon Riggs si...@2ndquadrant.com writes: I'm fairly confused by this thread. We *do* emit a message when we cancel an autovacuum task. We went to a lot of trouble to do that. The message is DEBUG2, and says "sending cancel to blocking autovacuum PID ...". That doesn't necessarily match one-to-one with actual cancellations, nor does it cover the case Greg is on about at the moment, of an AV worker being blocked indefinitely because it can't get the table lock in the first place. It might be an adequate substitute, but on the whole I agree with the idea that it'd be better to have autovacuum log when it actually cancels an operation, not when someone tries to cancel one. regards, tom lane
Re: [HACKERS] We need to log aborted autovacuums
Simon Riggs wrote: I'm fairly confused by this thread. That's because you think it has something to do with cancellation, which it doesn't. The original report here noted a real problem but got the theorized cause wrong. It turns out the code that acquires a lock when autovacuum decides it is going to process something will wait forever for that lock to be obtained. It cannot be the case that other locks on the table are causing it to cancel, or, as you say, it would be visible in the logs. Instead the AV worker will just queue up and wait for its turn as long as it takes. That means there are all sorts of ways that your AV workers can all get stuck waiting for a table, and there's no way to know when that's happening from the logs; you'll only see it in ps or pg_stat_activity. Given that I think it's actually a mild denial-of-service attack vector, this really needs an improvement. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books
Re: [HACKERS] Include WAL in base backup
On Sun, Jan 16, 2011 at 18:45, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: What if you start a concurrent process that begins streaming the WAL segments just before you start the backup, and you stop it after having stopped the backup. I would think that then, the local received files would be complete. We would only need a program able to stream the WAL segments and build WAL files from that… do you know about one? :) Sure, if you stream the backups on the side, then you don't need this feature. This is for very simple filesystem backups, without the need to set up streaming of archiving. What I meant is: why don't we provide an option to automate just that behavior in pg_basebackup? It looks like a fork() then calling code you already wrote. Ah, I see. That's a good idea. However, it's not quite that simple. just adding a fork() doesn't work on all our platforms, and you need to deal with feedback and such between them as well. I think it's definitely something worth doing in the long run, but I think we should start with the simpler way. Oh, and this might be the use-case for integrating the streaming log code as well :-) But if we plan to do that, perhaps we should pick a different name for the binary? Or maybe just share code with another one later.. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
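Dimitri's proposal — start a WAL streamer just before the base backup begins and stop it just after the backup finishes — can be sketched with a worker thread rather than fork(), which is exactly the portability point Magnus raises. This is a toy model, not pg_basebackup code; the "segments" are stand-ins for streamed WAL:

```python
import threading
import queue
import time

def stream_wal(stop, received):
    """Stand-in for the WAL streaming side: collect segments until told
    to stop, as a concurrent companion to the base backup."""
    seg = 0
    while not stop.is_set():
        received.put("segment-%d" % seg)
        seg += 1
        time.sleep(0.01)

def base_backup_with_wal():
    stop = threading.Event()
    received = queue.Queue()
    streamer = threading.Thread(target=stream_wal, args=(stop, received))
    streamer.start()            # begin streaming just before the backup
    time.sleep(0.05)            # ... the base backup itself runs here ...
    stop.set()                  # stop streaming after the backup finishes
    streamer.join()
    return [received.get() for _ in range(received.qsize())]

segments = base_backup_with_wal()
```

Everything streamed while the backup ran is then locally available, which is the property that makes the resulting filesystem backup self-contained.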
Re: [HACKERS] We need to log aborted autovacuums
Greg Smith g...@2ndquadrant.com writes: Simon Riggs wrote: I'm fairly confused by this thread. That's becuase you think it has something to do with cancellation, which it doesn't. The original report here noted a real problem but got the theorized cause wrong. I think that cancellations are also a potentially important issue that could do with being logged. The issue that I see is that if an application has a use-pattern that involves obtaining exclusive lock on a large table fairly frequently, then AV will never be able to complete on that table, leading to bloat and eventual XID wraparound issues. Once we get scared about XID wraparound, AV will refuse to let itself be canceled, so it will prevent the wraparound ... at the cost of denying service to the application code path that wants the lock. So this is the sort of thing that it'd be good to have some bleating about in the log, giving the DBA a chance to rethink the app's locking behavior before the consequences get nasty. But, having said that, false alarms in the log are not nice. So I'd rather have the LOG messages coming out from actual transaction aborts in AV, not from the remote end of the signal. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup for streaming base backups
Magnus Hagander mag...@hagander.net writes: If you're doing a regular base backup, that's *not* for replication, you might want them in files. +1 So, is that pg_restore -l idea feasible with your current tar format? I guess that would translate to pg_basebackup -l directory|oid.tar. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] pg_basebackup for streaming base backups
On Sun, Jan 16, 2011 at 19:21, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: If you're doing a regular base backup, that's *not* for replication, you might want them in files. +1 So, is that pg_restore -l idea feasible with your current tar format? I guess that would translate to pg_basebackup -l directory|oid.tar. Um, not easily if you want to translate it to names. Just like you don't have access to the oid-name mapping without the server started. The walsender can't read pg_class for example, so it can't generate that mapping file. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] We need to log aborted autovacuums
Tom Lane wrote: No, I don't believe we should be messing with the semantics of try_relation_open. It is what it is. With only four pretty simple callers to the thing, and two of them needing the alternate behavior, it seemed a reasonable place to modify to me. I thought the nowait boolean idea was used in enough places that it was reasonable to attach it to try_relation_open. The attached patch solves the wait-forever-for-the-lock problem, and introduces a new log message when AV or auto-analyze fail to obtain a lock on something that needs to be cleaned up:

DEBUG: autovacuum: processing database gsmith
INFO: autovacuum skipping relation 65563 --- cannot open or obtain lock
INFO: autoanalyze skipping relation 65563 --- cannot open or obtain lock

My main concern is that this may cause AV to constantly fail to get access to a busy table, where in the current code it would queue up and eventually get the lock it needs. A secondary issue is that while the autovacuum messages only show up if you have log_autovacuum_min_duration set to something other than -1, the autoanalyze ones can't be suppressed. If you don't like the way I structured the code, you can certainly do it some other way instead. I thought this approach was really simple and not unlike similar code elsewhere. Here's the test case that worked for me, again:

psql
SHOW log_autovacuum_min_duration;
DROP TABLE t;
CREATE TABLE t(s serial, i integer);
INSERT INTO t(i) SELECT generate_series(1,10);
SELECT relname,last_autovacuum,autovacuum_count FROM pg_stat_user_tables WHERE relname='t';
DELETE FROM t WHERE s < 5;
\q
psql
BEGIN;
LOCK t;

Leave that open, then go to another session with a tail -f on the logs to wait for the errors to show up.
-- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 25d9fde..4193fff 100644 *** a/src/backend/access/heap/heapam.c --- b/src/backend/access/heap/heapam.c *** relation_open(Oid relationId, LOCKMODE l *** 918,936 * * Same as relation_open, except return NULL instead of failing * if the relation does not exist. * */ Relation ! try_relation_open(Oid relationId, LOCKMODE lockmode) { Relation r; ! Assert(lockmode = NoLock lockmode MAX_LOCKMODES); /* Get the lock first */ if (lockmode != NoLock) ! LockRelationOid(relationId, lockmode); ! /* * Now that we have the lock, probe to see if the relation really exists * or not. --- 918,951 * * Same as relation_open, except return NULL instead of failing * if the relation does not exist. + * + * If called with nowait enabled, this will immediately exit + * if a lock is requested and it can't be acquired. The + * return code in this case doesn't distinguish between this + * situation and the one where the relation was locked, but + * doesn't exist. Callers using nowait must not care that + * they won't be able to tell the difference. * */ Relation ! try_relation_open(Oid relationId, LOCKMODE lockmode, bool nowait) { Relation r; ! bool locked; Assert(lockmode = NoLock lockmode MAX_LOCKMODES); /* Get the lock first */ if (lockmode != NoLock) ! { ! if (nowait) ! { ! locked=ConditionalLockRelationOid(relationId, lockmode); ! if (!locked) ! return NULL; ! } ! else ! LockRelationOid(relationId, lockmode); ! } /* * Now that we have the lock, probe to see if the relation really exists * or not. 
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index 7bc5f11..24bfb16 100644 *** a/src/backend/commands/analyze.c --- b/src/backend/commands/analyze.c *** analyze_rel(Oid relid, VacuumStmt *vacst *** 147,156 * concurrent VACUUM, which doesn't matter much at the moment but might * matter if we ever try to accumulate stats on dead tuples.) If the rel * has been dropped since we last saw it, we don't need to process it. */ ! onerel = try_relation_open(relid, ShareUpdateExclusiveLock); if (!onerel) ! return; /* * Check permissions --- this should match vacuum's check! --- 147,168 * concurrent VACUUM, which doesn't matter much at the moment but might * matter if we ever try to accumulate stats on dead tuples.) If the rel * has been dropped since we last saw it, we don't need to process it. + * + * If this is a manual analyze, opening will wait forever to acquire + * the requested lock on the relation. Autovacuum will just give up + * immediately if it can't get the lock. This prevents a series of locked + * relations from potentially hanging all of the AV workers waiting + * for locks. */ ! onerel = try_relation_open(relid,
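The behavior the patch gives autovacuum — attempt the lock without blocking (ConditionalLockRelationOid) and skip the relation with a log message when it is unavailable, instead of queueing forever — has a direct analogue in a non-blocking lock acquire. A minimal sketch, not PostgreSQL code:

```python
import threading

table_lock = threading.Lock()

def autovacuum_worker():
    """Model of try_relation_open(..., nowait=True): a conditional
    acquire, skipping the table rather than waiting indefinitely."""
    if table_lock.acquire(blocking=False):
        try:
            return "vacuumed"
        finally:
            table_lock.release()
    return "skipping relation --- cannot obtain lock"

table_lock.acquire()            # the user session's BEGIN; LOCK t;
blocked = autovacuum_worker()   # worker skips instead of hanging
table_lock.release()
unblocked = autovacuum_worker() # with the lock free, work proceeds
```

This also makes Greg's trade-off visible: a persistently busy table is now skipped on every pass rather than eventually vacuumed.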
Re: [HACKERS] We need to log aborted autovacuums
Greg Smith g...@2ndquadrant.com writes: Tom Lane wrote: No, I don't believe we should be messing with the semantics of try_relation_open. It is what it is. With only four pretty simple callers to the thing, and two of them needing the alternate behavior, it seemed a reasonable place to modify to me. I thought the nowait boolean idea was in enough places that it was reasonable to attach to try_relation_open. I would be willing to do that if it actually fixed the problem, but it doesn't. Logging only the table OID and not the table name is entirely inadequate IMO, so the fix has to be at a different place anyway. regards, tom lane
[HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
This is a review of: https://commitfest.postgresql.org/action/patch_view?id=468

Purpose: Equal and not-equal _may_ be quickly determined if the lengths are different. This _may_ be a huge speedup if we don't have to detoast.

The Patch:
==
I was able to read and understand the patch; it's a simple change and looked correct to me (a non-PG hacker). It applies cleanly to git head, compiles, and runs fine with debug enabled. make check passes.

Usability:
==
I used _may_ above. The benchmark included with the patch, showing huge speedups, is really contrived. It uses a where clause with a thousand-character constant: (where c = 'long...long...long...long...ConstantText...etc'). In my opinion this is very uncommon (the author does note this is a best case). If you have a field large enough to be toasted, you are not going to be using it to search on; you are going to have an ID field that is indexed. (select c where id = 7) This also only touches = and not-equal; comparisons and LIKE won't be touched. So I think the scope of this is limited. THAT being said, the patch is simple, and if you do happen to hit the code, it will speed things up. As a user of PG I'd like to have this included. It's a corner case, but a big corner, and it's a small, simple change, and it won't slow anything else down.

Performance: I created myself a more real-world test, with a table with indexes and IDs and a large toasted field:

create table junk(id serial primary key, xgroup integer, c text);
create index junk_group on junk(xgroup);

I filled it full of junk:

do $$
declare i integer;
declare j integer;
begin
  for i in 1..100 loop
    for j in 1..500 loop
      insert into junk(xgroup, c) values (j, 'c'||i);
      insert into junk(xgroup, c) select j, repeat('abc', 2000) || n from generate_series(1, 5) n;
    end loop;
  end loop;
end$$;

This will make about 600 records within the same xgroup, as well as a simple 'c15'-type value in c we can search for.
My thinking is you may not know the exact unique id, but you do know what group it's in, so that'll cut out 90% of the records, and then you'll have to add "and c = 'c15'" to get the exact one you want. I still saw a nice performance boost.

Old PG:
$ psql < bench3.sql
Timing is on.
DO
Time: 2010.412 ms

Patched:
$ psql < bench3.sql
Timing is on.
DO
Time: 184.602 ms

bench3.sql:
do $$
declare i integer;
begin
  for i in 1..400 loop
    perform count(*) from junk where xgroup = i and c like 'c' || i;
  end loop;
end$$;

Summary: Performance speed-up: Oh yeah! If you just happen to hit it. And if you do hit it, you might want to re-think your layout a little bit. Do I want it? Yes please.
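The optimization under review fits in one line: for = and not-equal, different stored lengths decide the answer without ever fetching (detoasting) the full values. A hedged sketch, with `fetch_a`/`fetch_b` standing in for detoasting (names are illustrative, not the C functions):

```python
def texteq(a_len, fetch_a, b_len, fetch_b):
    """Sketch of the patched texteq/byteaeq: lengths alone can prove
    inequality, so the expensive fetch only happens when they match."""
    if a_len != b_len:
        return False            # no detoast needed at all
    return fetch_a() == fetch_b()

fetches = []
def fetch(value):
    fetches.append(value)       # counts how often we had to "detoast"
    return value

ne = texteq(4, lambda: fetch("c15x"), 3, lambda: fetch("c15"))
fetches_during_ne = len(fetches)  # inequality cost zero fetches
eq = texteq(3, lambda: fetch("c15"), 3, lambda: fetch("c15"))
```

That zero-fetch path is exactly the "corner" the review describes: rarely hit, but a large win when it is.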
Re: [HACKERS] Include WAL in base backup
Magnus Hagander mag...@hagander.net writes: However, it's not quite that simple. just adding a fork() doesn't work on all our platforms, and you need to deal with feedback and such between them as well. I'd think client-side, we have an existing implementation with the parallel pg_restore option. Don't know (yet) how easy it is to reuse that code… Oh, and this might be the use-case for integrating the streaming log code as well :-) But if we plan to do that, perhaps we should pick a different name for the binary? Or maybe just share code with another one later.. You're talking about the pg_streamrecv binary? Then yes, my idea about it is that it should become the default archive client we ship with PostgreSQL. And grow into offering a way to be the default restore command too. I'd see the current way that the standby can switch between restoring from archive and from a live stream as a first step into that direction. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] We need to log aborted autovacuums
On Sun, 2011-01-16 at 13:08 -0500, Greg Smith wrote: Simon Riggs wrote: I'm fairly confused by this thread. That's because you think it has something to do with cancellation, which it doesn't. The original report here noted a real problem but got the theorized cause wrong. It turns out the code that acquires a lock when autovacuum decides it is going to process something will wait forever for that lock to be obtained. It cannot be the case that other locks on the table are causing it to cancel, or as you say it would be visible in the logs. Instead the AV worker will just queue up and wait for its turn as long as it takes. OK, thanks for explaining. So my proposed solution is to set a statement_timeout on autovacuum tasks, so that the AV does eventually get canceled if the above-mentioned case occurs. We can scale that timeout to the size of the table. Now every issue is a cancellation, and we can handle it the way I suggested in my last post. I would prefer it if we had a settable lock timeout, as suggested many moons ago. When that was discussed before, it was said there was no difference between a statement timeout and a lock timeout, but I think there clearly is, this case being just one example. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
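The distinction Simon draws can be shown concretely: a lock timeout bounds only the wait for the lock, while a statement timeout would also cancel a vacuum that legitimately runs a long time after acquiring it. A sketch of the lock-timeout side (illustrative only; PostgreSQL had no such setting at the time):

```python
import threading

def vacuum_with_lock_timeout(lock, timeout):
    """Give up only if the lock cannot be acquired within `timeout`;
    once acquired, the work itself is not time-limited -- unlike a
    statement timeout, which would cap lock wait and work together."""
    if not lock.acquire(timeout=timeout):
        return "canceling autovacuum: lock not available"
    try:
        return "vacuumed"       # arbitrarily long work is fine here
    finally:
        lock.release()

lock = threading.Lock()
lock.acquire()                  # the application holds its exclusive lock
gave_up = vacuum_with_lock_timeout(lock, 0.01)
lock.release()
ran = vacuum_with_lock_timeout(lock, 0.01)
```

A statement timeout scaled to table size approximates this, but only a true lock timeout avoids cancelling a slow-but-progressing vacuum.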
Re: [HACKERS] Bug in pg_describe_object, patch v2
Andreas Karlsson andr...@proxel.se writes:
> On Sat, 2011-01-15 at 10:36 -0500, Tom Lane wrote:
>> But I can read the handwriting on the wall: if I want this done right,
>> I'm going to have to do it myself.
> Do I understand you correctly if I interpret what you would like to see
> is the same format used now in these cases?

Attached is a patch that does what I would consider an acceptable job of printing these datatypes only when they're interesting. I still think that this is largely a waste of code, but if people want it, this is what to do.

Testing this in the regression database, it fires on (a) the entries where a binary-compatible hash function is used, and (b) all the entries associated with the GIN operator family array_ops. The latter happens because we've more or less arbitrarily crammed a bunch of opclasses into the same opfamily.

One other point here is that I find messages like this a mite unreadable:

	function 1 (oidvector[], oidvector[]) btoidvectorcmp(oidvector,oidvector) of operator family array_ops for access method gin

If we were to go with this, I'd be strongly tempted to rearrange all four of the messages involved to put the operator or function name at the end, eg

	function 1 (oidvector[], oidvector[]) of operator family array_ops for access method gin: btoidvectorcmp(oidvector,oidvector)

Comments?

			regards, tom lane

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index ec8eb74954a9cc0ec3623dc42dfed462dc1a3533..d02b58dcc2933a278959727f55d93e1e9101c1f1 100644
*** a/src/backend/catalog/dependency.c
--- b/src/backend/catalog/dependency.c
*** getObjectDescription(const ObjectAddress
*** 2337,2351 ****
  			initStringInfo(&opfam);
  			getOpFamilyDescription(&opfam, amopForm->amopfamily);

! 			/*
! 			 * translator: %d is the operator strategy (a number), the
! 			 * first %s is the textual form of the operator, and the
! 			 * second %s is the description of the operator family.
! 			 */
! 			appendStringInfo(&buffer, _("operator %d %s of %s"),
! 							 amopForm->amopstrategy,
! 							 format_operator(amopForm->amopopr),
! 							 opfam.data);

  			pfree(opfam.data);
  			systable_endscan(amscan);
--- 2337,2372 ----
  			initStringInfo(&opfam);
  			getOpFamilyDescription(&opfam, amopForm->amopfamily);

! 			if (opfamily_op_has_default_types(amopForm->amopfamily,
! 											  amopForm->amoplefttype,
! 											  amopForm->amoprighttype,
! 											  amopForm->amopopr))
! 			{
! 				/*
! 				 * translator: %d is the operator strategy (a number), the
! 				 * first %s is the textual form of the operator, and the
! 				 * second %s is the description of the operator family.
! 				 */
! 				appendStringInfo(&buffer, _("operator %d %s of %s"),
! 								 amopForm->amopstrategy,
! 								 format_operator(amopForm->amopopr),
! 								 opfam.data);
! 			}
! 			else
! 			{
! 				/*
! 				 * translator: %d is the operator strategy (a number), the
! 				 * first two %s's are data type names, the third %s is the
! 				 * textual form of the operator, and the last %s is the
! 				 * description of the operator family.
! 				 */
! 				appendStringInfo(&buffer, _("operator %d (%s, %s) %s of %s"),
! 								 amopForm->amopstrategy,
! 								 format_type_be(amopForm->amoplefttype),
! 								 format_type_be(amopForm->amoprighttype),
! 								 format_operator(amopForm->amopopr),
! 								 opfam.data);
! 			}

  			pfree(opfam.data);
  			systable_endscan(amscan);
*** getObjectDescription(const ObjectAddress
*** 2384,2398 ****
  			initStringInfo(&opfam);
  			getOpFamilyDescription(&opfam, amprocForm->amprocfamily);

! 			/*
! 			 * translator: %d is the function number, the first %s is the
! 			 * textual form of the function with arguments, and the second
! 			 * %s is the description of the operator family.
! 			 */
! 			appendStringInfo(&buffer, _("function %d %s of %s"),
! 							 amprocForm->amprocnum,
! 							 format_procedure(amprocForm->amproc),
! 							 opfam.data);

  			pfree(opfam.data);
  			systable_endscan(amscan);
--- 2405,2441 ----
  			initStringInfo(&opfam);
  			getOpFamilyDescription(&opfam, amprocForm->amprocfamily);

! 			if (opfamily_proc_has_default_types(amprocForm->amprocfamily,
! 												amprocForm->amproclefttype,
! 												amprocForm->amprocrighttype,
! 												amprocForm->amproc))
! 			{
! 				/*
! 				 * translator: %d is the function number, the first %s is
! 				 * the textual form of the function with arguments, and
! 				 * the second %s is the description of the operator
! 				 * family.
! 				 */
! 				appendStringInfo(&buffer, _("function %d %s of %s"),
! 								 amprocForm->amprocnum,
! 								 format_procedure(amprocForm->amproc),
! 								 opfam.data);
! 			}
! 			else
! 			{
! 				/*
! 				 * translator: %d is the function number, the first two
! 				 * %s's are data type names, the third %s is
Re: [HACKERS] reviewers needed!
I reviewed a couple of patches, and I added my reviews to the commitfest page. If I find a problem, it's obvious I should mark the patch as returned with feedback. But what if I'm happy with it? I'm not a hacker, so I cannot do a C code review; should I leave it alone? Mark it as ready for committer? I marked my two reviews as ready for committer, but I feel like I've overstepped my bounds.

-Andy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Select typoutput::oid from pg_type limit 1;

Also, you *can* go back the other way. It's very common to write

	Select * from pg_proc where oid = 'boolout'::regproc

rather than looking up the OID first. See "Object Identifier Types" in the manual.

Many thanks to you both, that helps tremendously.

- Simone Aiken
Re: [HACKERS] What happened to open_sync_without_odirect?
On 1/15/11 4:30 PM, Bruce Momjian wrote:
> Josh Berkus wrote:
>> Last I remember, we were going to add this as an option. But I don't
>> see a patch in the queue. Am I missing it? Was I supposed to write it?
> I don't know, but let me add that I am confused how this would look to
> users. In many cases, kernels don't even support O_DIRECT, so what
> would we do to specify this? What about just auto-disabling O_DIRECT if
> the filesystem does not support it; maybe issue a log message about it.

Yes, you *are* confused. The problem isn't auto-disabling; we already do that. The problem is *auto-enabling*: ages ago we made the assumption that if O_SYNC was supported, so was O_DIRECT. We've now found out that's not true on all platforms. Also, test results show that even when supported, O_DIRECT isn't necessarily a win. Hence the additional wal_sync_method options.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
Re: [HACKERS] We need to log aborted autovacuums
On 1/16/11 11:19 AM, Simon Riggs wrote:
> I would prefer it if we had a settable lock timeout, as suggested many
> moons ago. When that was discussed before it was said there was no
> difference between a statement timeout and a lock timeout, but I think
> there clearly is, this case being just one example.

Whatever happened to lock timeouts, anyway? We even had some patches floating around for 9.0, and they disappeared.

However, we'd want a separate lock timeout for autovac, of course. I'm not at all keen on a *statement* timeout on autovacuum; as long as autovacuum is doing work, I don't want to cancel it. Also, WTF would we set it to?

Going the statement timeout route seems like a way to create a LOT of extra work, troubleshooting, getting it wrong, and releasing patch updates. Please let's just create a lock timeout.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
Re: [HACKERS] pg_stat_replication security
> I suggest pg_stat_replication do just like pg_stat_activity, which is
> return NULL in most fields if the user isn't
> (superuser || same_user_as_that_session).

What session would that be, exactly?

I suggest instead either superuser or replication permissions.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
Re: [HACKERS] walreceiver fallback_application_name
Magnus Hagander mag...@hagander.net writes:
> Is walreceiver something that the average DBA is going to realize what
> it is? Perhaps go for something like replication slave?

I think walreceiver is very good here, and the user is already confronted with such phrasing:

  http://www.postgresql.org/docs/9.0/interactive/runtime-config-wal.html#GUC-MAX-WAL-SENDERS

Also, we're about to extend the technique's usage in some other places, such as an integrated base backup facility and a default archiving solution, so let's talk about what it's doing, not what it's for.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] replication and pg_hba.conf
> In 9.0, we specifically require using replication as database name to
> start a replication session. In 9.1 we will have the REPLICATION
> attribute to a role - should we change it so that all in database
> includes replication connections? It certainly goes in the principle of
> least surprise path..

+1. It'll eliminate an entire file to edit for replication setup, so it does a lot to make initial replication setup easier.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
Re: [HACKERS] pg_stat_replication security
On Sun, Jan 16, 2011 at 21:51, Josh Berkus j...@agliodbs.com wrote:
>> I suggest pg_stat_replication do just like pg_stat_activity, which is
>> return NULL in most fields if the user isn't
>> (superuser || same_user_as_that_session).
> What session would that be, exactly?

The user doing the query on pg_stat_replication being the same as the user running the replication.

> I suggest instead either superuser or replication permissions.

That's another idea.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_stat_replication security
>> I suggest instead either superuser or replication permissions.
> That's another idea.

Oh, wait. I take that back ... we're trying to encourage users NOT to use the replication user as a login, yes?

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
Re: [HACKERS] LOCK for non-tables
Tom Lane t...@sss.pgh.pa.us writes:
> Another possibility is to disallow just the single case
> 	LOCK tablename NOWAIT
> ie, you can write NOWAIT if you include *either* the object type or the
> IN...MODE clause. This is not too hard as far as the grammar is
> concerned, but I'm not exactly sure how to document it.

I don't see anything better than documenting it using 2 extra lines:

  LOCK [ TABLE ] [ ONLY ] name [, ...] [ IN lockmode MODE ]
  LOCK TABLE tablename [ IN lockmode MODE ] [ NOWAIT ]
  LOCK [ TABLE ] [ ONLY ] tablename IN lockmode MODE [ NOWAIT ]

Ok, it looks like a mess, but that's what it is :)

And every user with LOCK tablename NOWAIT in their code would have to change that to LOCK TABLE tablename NOWAIT. Is there no way to reduce that to only be a problem with tables named the same as the new objects we want to add support for?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
Hello

I looked at this patch too. It's a good idea. I think we could have a function or macro that compares varlena sizes, something like:

  Datum
  texteq(..)
  {
      if (!datumsHasSameLength(PG_GETARG_DATUM(0), PG_GETARG_DATUM(1)))
          PG_RETURN_FALSE();
      .. actual code ..
  }

Regards
Pavel Stehule

2011/1/16 Andy Colson a...@squeakycode.net:

This is a review of: https://commitfest.postgresql.org/action/patch_view?id=468

Purpose:
========
Equal and not-equal _may_ be quickly determined if their lengths are different. This _may_ be a huge speedup if we don't have to detoast.

The Patch:
==========
I was able to read and understand the patch. It's a simple change and looked correct to me (a non PG hacker). It applies clean to git head, compiles and runs fine with debug enabled. make check passes.

Usability:
==========
I used _may_ above. The benchmark included with the patch, showing huge speedups, is really contrived. It uses a where clause with a thousand-character constant: (where c = 'long...long...long...long...ConstantText...etc'). In my opinion this is very uncommon (the author does note this is a best case). If you have a field large enough to be toasted, you are not going to be using that to search on; you are going to have an ID field that is indexed. (select c where id = 7)

This also only touches = and <>; other comparisons and LIKE won't be touched. So I think the scope of this is limited.

THAT being said, the patch is simple, and if you do happen to hit the code, it will speed things up. As a user of PG I'd like to have this included. It's a corner case, but a big corner, and it's a small, simple change, and it won't slow anything else down.

Performance:
============
I created myself a more real-world test, with a table with indexes and id's and a large toasted field.
create table junk(id serial primary key, xgroup integer, c text);
create index junk_group on junk(xgroup);

I filled it full of junk:

do $$
declare i integer;
declare j integer;
begin
	for i in 1..100 loop
		for j in 1..500 loop
			insert into junk(xgroup, c) values (j, 'c'||i);
			insert into junk (xgroup, c)
				select j, repeat('abc', 2000) || n from generate_series(1, 5) n;
		end loop;
	end loop;
end$$;

This will make about 600 records within the same xgroup, as well as a simple 'c15' type of value in c we can search for. My thinking is you may not know the exact unique id, but you do know what group it's in, so that'll cut out 90% of the records, and then you'll have to add "and c = 'c15'" to get the exact one you want.

I still saw a nice performance boost.

Old PG:
$ psql < bench3.sql
Timing is on.
DO
Time: 2010.412 ms

Patched:
$ psql < bench3.sql
Timing is on.
DO
Time: 184.602 ms

bench3.sql:
do $$
declare i integer;
begin
	for i in 1..400 loop
		perform count(*) from junk where xgroup = i and c like 'c' || i;
	end loop;
end$$;

Summary:
========
Performance speed-up: Oh yeah! If you just happen to hit it. And if you do hit it, you might want to re-think your layout a little bit.

Do I want it? Yes please.
Re: [HACKERS] reviewers needed!
On 16-01-2011 16:30, Andy Colson wrote:
> I reviewed a couple of patches, and I added my review to the commitfest
> page. If I find a problem, it's obvious I should mark the patch as
> returned with feedback. But what if I'm happy with it? I'm not a hacker
> so cannot do C code review, should I leave it alone? Mark it as ready
> for committer?

Did you take a look at [1]? If the patch involves C code and you're not proficient in C, then there must be another reviewer to give his/her opinion (of course, the other person could be the committer). I wouldn't mark it ready for committer; instead, leave it as is (needs review). Just be sure to add your comments in the commitfest app.

[1] http://wiki.postgresql.org/wiki/RRReviewers

--
 Euler Taveira de Oliveira
 http://www.timbira.com/
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Sun, Jan 16, 2011 at 12:07:44PM -0500, Tom Lane wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Sat, Jan 15, 2011 at 10:25 AM, Noah Misch n...@leadboat.com wrote:
>>> Do you value test coverage so little?
>> If you're asking whether I think real-world usability is more important
>> than test coverage, then yes.
> Quite honestly, I'd be inclined to rip out most of the DEBUG messages I
> see in that regression test altogether. They are useless, and so is the
> regression test itself. An appropriate regression test would involve
> something more like checking that the relfilenode changed and then
> checking that the contained data is still sane.

This patch is the first of a series. Divorced from the other patches, many of the test cases exercise the same code path, making them redundant. Even so, the tests revealed a defect we released with 9.0; that seems sufficient to promote them out of the useless bucket.

One can easily confirm by inspection that the relfilenode will change if and only if the rewriting DEBUG message appears. Your proposed direct comparison of the relfilenode in the regression tests adds negligible sensitivity. If that were all, I'd call it a question of style. However, a relfilenode comparison does not distinguish no-op changes from changes entailing a verification scan. A similar ambiguity would arise without the foreign key DEBUG message.

As for checking that the contained data is still sane, what do you have in mind? After the test cases, I SELECT the table-under-test and choice catalog entries. If later patches in the series leave these expected outputs unchanged, that confirms the continued sanity of the data. Perhaps I should do this after every test, or also test forced index scans.

nm
Re: [HACKERS] reviewers needed!
On Sun, Jan 16, 2011 at 2:30 PM, Andy Colson a...@squeakycode.net wrote:
> I reviewed a couple of patches, and I added my review to the commitfest
> page. If I find a problem, it's obvious I should mark the patch as
> returned with feedback.

Only if it's got sufficiently serious flaws that getting it committed during this CommitFest is not practical. If it just needs some revision, Waiting on Author is the right place.

> But what if I'm happy with it? I'm not a hacker so cannot do C code
> review, should I leave it alone? Mark it as ready for committer?

Yep, that's fine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] limiting hint bit I/O
On Sat, Jan 15, 2011 at 6:28 PM, Josh Berkus j...@agliodbs.com wrote:
>> If the problem is that all the freezing happens at once, then ISTM the
>> solution is to add a random factor. Say, when a tuple just passes the
>> lower threshold it has a 1% chance of being frozen. The chance grows
>> until it is 100% as it reaches the upper threshold.
> Doesn't have to be random; it could be determinative. That is, we could
> have a vacuum_freeze_max_size parameter ... and accompanying autovacuum
> parameter ... which allowed the user to limit freezing scans to, say,
> 1GB of the table at a time. If I could, say, call a manual freeze of
> 10% of the largest tables every night, then I might actually be able to
> schedule it. It's a full scan of the whole table which is fatal.

I think this is worth pursuing at some point, though of course one needs to devise an algorithm that spreads out the freezing enough but not too much. But it's fairly off-topic from the original subject of this thread, which was a quick-and-dirty attempt to limit the amount of I/O caused by hint bits. I'm still very interested in knowing what people think about that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER TYPE 0: Introduction; test cases
On Sun, Jan 16, 2011 at 12:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Sat, Jan 15, 2011 at 10:25 AM, Noah Misch n...@leadboat.com wrote:
>>> Do you value test coverage so little?
>> If you're asking whether I think real-world usability is more important
>> than test coverage, then yes.
> Quite honestly, I'd be inclined to rip out most of the DEBUG messages I
> see in that regression test altogether. They are useless, and so is the
> regression test itself. An appropriate regression test would involve
> something more like checking that the relfilenode changed and then
> checking that the contained data is still sane.

From my point of view, the value of those messages is that if someone is altering or clustering a large table, they might like to get a series of messages: rewriting the table, rebuilding this index, rebuilding that index, rewriting the toast table index, as a crude sort of progress indication.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql
On Sun, Jan 16, 2011 at 06:49:27PM +0100, Pavel Stehule wrote:
> I am sending a updated version with little bit more comments. But I am
> sure, so somebody with good English have to edit my comments. Minimally
> I hope, so your questions will be answered.

Thanks. I edited the comments and white space somewhat, as attached. I'll go ahead and mark it Ready for Committer.

*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***************
*** 255,260 **** plpgsql_exec_function(PLpgSQL_function *func, FunctionCallInfo fcinfo)
--- 255,264 ----
  				var->value = fcinfo->arg[i];
  				var->isnull = fcinfo->argnull[i];
  				var->freeval = false;
+
+ 				/* only varlena types should be detoasted */
+ 				var->should_be_detoasted = !var->isnull && !var->datatype->typbyval &&
+ 					var->datatype->typlen == -1;
  			}
  			break;
***************
*** 570,581 **** plpgsql_exec_trigger(PLpgSQL_function *func,
--- 574,587 ----
  		elog(ERROR, "unrecognized trigger action: not INSERT, DELETE, UPDATE, or TRUNCATE");
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_name_varno]);
  	var->value = DirectFunctionCall1(namein,
  									 CStringGetDatum(trigdata->tg_trigger->tgname));
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_when_varno]);
  	if (TRIGGER_FIRED_BEFORE(trigdata->tg_event))
***************
*** 588,593 **** plpgsql_exec_trigger(PLpgSQL_function *func,
--- 594,600 ----
  		elog(ERROR, "unrecognized trigger execution time: not BEFORE, AFTER, or INSTEAD OF");
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_level_varno]);
  	if (TRIGGER_FIRED_FOR_ROW(trigdata->tg_event))
***************
*** 598,620 **** plpgsql_exec_trigger(PLpgSQL_function *func,
--- 605,631 ----
  		elog(ERROR, "unrecognized trigger event type: not ROW or STATEMENT");
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_relid_varno]);
  	var->value = ObjectIdGetDatum(trigdata->tg_relation->rd_id);
  	var->isnull = false;
  	var->freeval = false;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_relname_varno]);
  	var->value = DirectFunctionCall1(namein,
  									 CStringGetDatum(RelationGetRelationName(trigdata->tg_relation)));
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_table_name_varno]);
  	var->value = DirectFunctionCall1(namein,
  									 CStringGetDatum(RelationGetRelationName(trigdata->tg_relation)));
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_table_schema_varno]);
  	var->value = DirectFunctionCall1(namein,
***************
*** 624,634 **** plpgsql_exec_trigger(PLpgSQL_function *func,
--- 635,647 ----
  									 trigdata->tg_relation;
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_nargs_varno]);
  	var->value = Int16GetDatum(trigdata->tg_trigger->tgnargs);
  	var->isnull = false;
  	var->freeval = false;
+ 	var->should_be_detoasted = false;

  	var = (PLpgSQL_var *) (estate.datums[func->tg_argv_varno]);
  	if (trigdata->tg_trigger->tgnargs > 0)
***************
*** 654,665 **** plpgsql_exec_trigger(PLpgSQL_function *func,
--- 667,680 ----
  											  -1, false, 'i'));
  		var->isnull = false;
  		var->freeval = true;
+ 		var->should_be_detoasted = false;
  	}
  	else
  	{
  		var->value = (Datum) 0;
  		var->isnull = true;
  		var->freeval = false;
+ 		var->should_be_detoasted = false;
  	}

  	estate.err_text = gettext_noop("during function entry");
***************
*** 841,846 **** copy_plpgsql_datum(PLpgSQL_datum *datum)
--- 856,862 ----
  				new->value = 0;
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
On Sun, Jan 16, 2011 at 01:05:11PM -0600, Andy Colson wrote:
> This is a review of: https://commitfest.postgresql.org/action/patch_view?id=468

Thanks!

> I created myself a more real world test, with a table with indexes and
> id's and a large toasted field.
> This will make about 600 records within the same xgroup. As well as a
> simple 'c15' type of value in c we can search for. My thinking is you
> may not know the exact unique id, but you do know what group its in, so
> that'll cut out 90% of the records, and then you'll have to add and
> c = 'c15' to get the exact one you want.

Good to have a benchmark like that, rather than just looking at the extrema.
Re: [HACKERS] Patch to add a primary key using an existing index
I've taken a look at this version of the patch.

Submission Review
-----------------
This version of the patch applies cleanly to master. It matches your git repo and includes tests + docs.

Usability Review
----------------
The command syntax now matches what was discussed during the last cf. The text of the notice:

  test=# alter table a add constraint acons unique using index aind2;
  NOTICE:  ALTER TABLE / ADD UNIQUE USING INDEX will rename index "aind2" to "acons"

Documentation
-------------
I've attached a patch (to be applied on top of your latest patch) with some editorial changes I'd recommend to your documentation. I feel it reads a bit clearer (but others should chime in if they disagree or have better wordings). A git tree with changes rebased to master + this change is available at https://github.com/ssinger/postgres/tree/ssinger/constraint_with_index

Code Review
-----------
src/backend/parser/parse_utilcmd.c: 1452

You're calling strdup on the attribute name. I don't have a good enough grasp on the code to be able to trace this through to where the memory gets free'd. Does it get freed? Should/could this be a call to pstrdup?

Feature Test
------------
I wasn't able to find any issues in my testing.

I'm marking this as returned with feedback pending your answer on the possible memory leak above, but I think the patch is very close to being ready.

Steve Singer

diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 83d2fbb..0b486ab 100644
*** a/doc/src/sgml/ref/alter_table.sgml
--- b/doc/src/sgml/ref/alter_table.sgml
*** ALTER TABLE <replaceable class="PARAMETE
*** 242,268 ****
   <term><literal>ADD <replaceable class="PARAMETER">table_constraint_using_index</replaceable></literal></term>
   <listitem>
    <para>
!    This form adds a new <literal>PRIMARY KEY</> or <literal>UNIQUE</>
     constraint to the table using an existing index. All the columns of the
     index will be included in the constraint.
    </para>
    <para>
!    The index should be UNIQUE, and should not be a <firstterm>partial index</>
!    or an <firstterm>expressional index</>.
    </para>
    <para>
!    This can be helpful in situations where one wishes to recreate or
!    <literal>REINDEX</> the index of a <literal>PRIMARY KEY</> or a
!    <literal>UNIQUE</> constraint, but dropping and recreating the constraint
!    to acheive the effect is not desirable. See the illustrative example below.
    </para>
    <note>
     <para>
      If a constraint name is provided then the index will be renamed to that
!     name, else the constraint will be named to match the index name.
     </para>
    </note>
--- 242,270 ----
   <term><literal>ADD <replaceable class="PARAMETER">table_constraint_using_index</replaceable></literal></term>
   <listitem>
    <para>
!    This form adds a new <literal>PRIMARY KEY</> or <literal>UNIQUE</literal>
     constraint to the table using an existing index. All the columns of the
     index will be included in the constraint.
    </para>
    <para>
!    The index should be a UNIQUE index. A <firstterm>partial index</firstterm>
!    or an <firstterm>expressional index</firstterm> is not allowed.
    </para>
    <para>
!    Adding a constraint using an existing index can be helpful in situations
!    where you wish to rebuild an index used for a
!    <literal>PRIMARY KEY</literal> or a <literal>UNIQUE</literal> constraint,
!    but dropping and recreating the constraint
!    is not desirable. See the illustrative example below.
    </para>
    <note>
     <para>
      If a constraint name is provided then the index will be renamed to that
!     name of the constraint. Otherwise the constraint will be named to match
!     the index name.
     </para>
    </note>
Re: [HACKERS] limiting hint bit I/O
Robert Haas wrote:
> a quick-and-dirty attempt to limit the amount of I/O caused by hint
> bits. I'm still very interested in knowing what people think about
> that.

I found the elimination of the response-time spike promising. I don't think I've seen enough data yet to feel comfortable endorsing it, though. I guess the question in my head is: how much of the lingering performance hit was due to having to go to clog, and how much was due to competition with the deferred writes? If much of it is due to repeated recalculation of visibility based on clog info, I think there would need to be some way to limit how many times that happened before the hint bits were saved.

-Kevin
Re: [HACKERS] auto-sizing wal_buffers
On Sun, Jan 16, 2011 at 00:34, Josh Berkus j...@agliodbs.com wrote:
> I think we can be more specific on that last sentence; is there even
> any *theoretical* benefit to settings above 16MB, the size of a WAL
> segment? Certainly there have been no test results to show any.

I don't know if it's applicable to real workloads in any way, but it did make a measurable difference in one of my tests.

Back when benchmarking different wal_sync_methods, I found that when doing massive INSERTs from generate_series, the INSERT time kept improving even after increasing wal_buffers from 16MB to 32, 64 and 128MB, especially with wal_sync_method=open_datasync. The total INSERT+COMMIT time remained constant, however.

More details here:
http://archives.postgresql.org/pgsql-performance/2010-11/msg00094.php

Regards,
Marti
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
On Sun, Jan 16, 2011 at 10:07:13PM +0100, Pavel Stehule wrote:
> I think, so we can have a function or macro that compare a varlena
> sizes. Some like
>
> Datum texteq(..)
> {
>     if (!datumsHasSameLength(PG_GETARG_DATUM(0), PG_GETARG_DATUM(1)))
>         PG_RETURN_FALSE();
>     .. actual code ..
> }

Good point. Is this something that would be useful in many places? One thing that bugged me slightly writing this patch is that texteq, textne, byteaeq and byteane all follow the same pattern rather tightly. (Indeed, I think one could easily implement texteq and byteaeq with the exact same C function.) I like how we handle this for tsvector (see TSVECTORCMPFUNC in tsvector_op.c) to avoid the redundancy. If datumsHasSameLength would mainly apply to these four functions or ones very similar to them, maybe we should abstract out the entire function body like we do for tsvector? A topic for a different patch in any case, I think.
Re: [HACKERS] limiting hint bit I/O
On Sun, Jan 16, 2011 at 5:37 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
> Robert Haas wrote:
>> a quick-and-dirty attempt to limit the amount of I/O caused by hint
>> bits. I'm still very interested in knowing what people think about
>> that.
> I found the elimination of the response-time spike promising. I don't
> think I've seen enough data yet to feel comfortable endorsing it,
> though. I guess the question in my head is: how much of the lingering
> performance hit was due to having to go to clog and how much was due to
> competition with the deferred writes? If much of it is due to repeated
> recalculation of visibility based on clog info, I think there would
> need to be some way to limit how many times that happened before the
> hint bits were saved.

I think you may be confused about what the patch does - currently, pages with hint bit changes are considered dirty, period. Therefore, they are written whenever any other dirty page would be written: by the background writer cleaning scan, at checkpoints, and when a backend must write a dirty buffer before reallocating it to hold a different page. The patch keeps the first of these and changes the second two: pages with only hint bit changes are dirty for purposes of the background writer, but are considered clean for checkpoint purposes and buffer recycling. IOW, I'm not adding any new mechanism for these pages to get written.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
2011/1/16 Noah Misch n...@leadboat.com:
> Good point. Is this something that would be useful in many places? One
> thing that bugged me slightly writing this patch is that texteq,
> textne, byteaeq and byteane all follow the same pattern rather tightly.
> (Indeed, I think one could easily implement texteq and byteaeq with the
> exact same C function.)

It isn't a good idea. Theoretically, there can be differences between text and bytea in the future - there can be important collations. Now these types are distinct, and some basic methods should be distinct too. The situation is different at the varlena level.

Regards
Pavel Stehule

> I like how we handle this for tsvector (see TSVECTORCMPFUNC in
> tsvector_op.c) to avoid the redundancy. If datumsHasSameLength would
> mainly apply to these four functions or ones very similar to them,
> maybe we should abstract out the entire function body like we do for
> tsvector? A topic for a different patch in any case, I think.
Re: [HACKERS] READ ONLY fixes
On Mon, Jan 10, 2011 at 8:27 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Attached is a rebased roll-up of the 3 and 3a patches from last month. -Kevin Hi Kevin, A review: The main motivation for the patch is to allow future optimization of read-only transactions, by preventing them from changing back to read write once their read-onliness has been noticed. This will also probably bring closer compliance with SQL standards. The patch was mostly reviewed by others at the time it was proposed and altered. The patch applies and builds cleanly, and passes make check. It does what it says. I found the following message somewhat confusing: ERROR: read-only property must be set before any query I was not setting read-only, but rather READ WRITE. This message is understandable from the perspective of the code (and the SET transaction_read_only=... command), but I think it should be framed in the context of the SET TRANSACTION command in which read-only is not the name of a boolean, but one label of a binary switch. Maybe something like: ERROR: transaction modes READ WRITE or READ ONLY must be set before any query. It seems a bit strange that you can do dummy changes (setting the mode to the same value it currently has) as much as you want in a subtransaction, but not in a top-level transaction. But this was discussed previously and not objected to. The old behavior was not described in the docs. This patch does not include a doc change, but considering the parallelism between this and ISOLATION LEVEL, perhaps a parallel sentence should be added to the docs about this aspect as well. There are probably many people around who are abusing the current laxity, so a mention in the release notes is probably warranted. When a subtransaction has set the mode more stringent than the top-level transaction did, that setting is reversed when the subtransaction ends (whether by success or by rollback), which was discussed as the desired behavior. 
But the included regression tests do not exercise that case: there is no test in which a SAVEPOINT is either rolled back or released. Should such tests be included? The changes made to the isolation level code to get rid of some spurious warnings are not covered by the regression tests--if I excise that part of the patch, the code still passes make check. Just looking at that code, it appears to do what it is supposed to, but I can't figure out how to test it myself. I poked at the patched code a bit and I could not break it, but I don't know enough about this part of the system to design truly devilish tests to apply to it. I did not do any performance testing, as I don't see how this patch could have performance implications. None of the issues I raise above are severe. Does that mean I should change the status to ready for committer? Cheers, Jeff
[HACKERS] REVIEW: PL/Python validator function
This is a review for the patch sent as https://commitfest.postgresql.org/action/patch_view?id=456 == Submission == The patch applied cleanly atop the plpython refactor patches. The format is git diff (though the refactor patches are in format-patch form); I did patch -p1. It includes an adequate amount of tests. I found a regression test failure in plpython_error. *** *** 8,14 '.syntaxerror' LANGUAGE plpythonu; ERROR: could not compile PL/Python function python_syntax_error ! DETAIL: SyntaxError: invalid syntax (string, line 2) /* With check_function_bodies = false the function should get defined * and the error reported when called */ --- 8,14 '.syntaxerror' LANGUAGE plpythonu; ERROR: could not compile PL/Python function python_syntax_error ! DETAIL: SyntaxError: invalid syntax (line 2) /* With check_function_bodies = false the function should get defined * and the error reported when called */ *** *** 19,29 LANGUAGE plpythonu; SELECT python_syntax_error(); ERROR: could not compile PL/Python function python_syntax_error ! DETAIL: SyntaxError: invalid syntax (string, line 2) /* Run the function twice to check if the hashtable entry gets cleaned up */ SELECT python_syntax_error(); ERROR: could not compile PL/Python function python_syntax_error ! DETAIL: SyntaxError: invalid syntax (string, line 2) RESET check_function_bodies; /* Flat out syntax error */ --- 19,29 LANGUAGE plpythonu; SELECT python_syntax_error(); ERROR: could not compile PL/Python function python_syntax_error ! DETAIL: SyntaxError: invalid syntax (line 2) /* Run the function twice to check if the hashtable entry gets cleaned up */ SELECT python_syntax_error(); ERROR: could not compile PL/Python function python_syntax_error ! DETAIL: SyntaxError: invalid syntax (line 2) RESET check_function_bodies; /* Flat out syntax error */ My environment is CentOS release 5.4 (Final) with the default python 2.4.3 installed. It includes no additional doc, which seems sane, as no relevant parts were found in the existing doc.
== Usability and Feature == The patch works as advertised: CREATE FUNCTION checks the function body syntax, while before the patch it didn't. It seems to follow plperl's approach. I think a catversion update should be included in the patch, since it contains pg_pltemplate catalog changes. == Performance == Performance is out of scope; the validator function is called by the system only once, at function creation. == Code == It looks fine overall. The only thing that I came up with is the trigger check logic in PLy_procedure_is_trigger. Although it seems to follow plperl's corresponding function, the check of whether prorettype is a pseudo type looks redundant, since it later checks whether prorettype is TRIGGEROID or OPAQUEOID. But that is not critical. I mark this Waiting on Author for the regression test issue. The other points are trivial. Regards, -- Hitoshi Harada
Re: [HACKERS] READ ONLY fixes
On Sun, Jan 16, 2011 at 6:58 PM, Jeff Janes jeff.ja...@gmail.com wrote: None of the issues I raise above are severe. Does that mean I should change the status to ready for committer? Sounds right to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spread checkpoint sync
On Tue, Jan 11, 2011 at 5:27 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 30, 2010 at 3:29 PM, Greg Smith g...@2ndquadrant.com wrote: One of the ideas Simon and I had been considering at one point was adding some better de-duplication logic to the fsync absorb code, which I'm reminded by the pattern here might be helpful independently of other improvements. Hopefully I'm not stepping on any toes here, but I thought this was an awfully good idea and had a chance to take a look at how hard it would be today while en route from point A to point B. The answer turned out to be not very, so PFA a patch that seems to work. I tested it by attaching gdb to the background writer while running pgbench, and it eliminated the backend fsyncs without even breaking a sweat. I had been concerned about how long the lock would be held, and I was pondering ways to do only partial deduplication to reduce the time. But since you already wrote a patch to do the whole thing, I figured I'd time it. I arranged to test an instrumented version of your patch under a large shared_buffers of 4GB, conditions that would maximize the opportunity for it to take a long time. Running your compaction to go from 524288 entries to a handful (14 to 29, depending on run) took between 36 and 39 milliseconds. For comparison, doing just the memcpy part of AbsorbFsyncRequest on a full queue took from 24 to 27 milliseconds. They are close enough to each other that I am no longer interested in partial deduplication. But both are long enough that I wonder if having a hash table in shared memory that is kept unique automatically at each update might not be worthwhile. Cheers, Jeff
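For readers following along, here is a hedged, standalone sketch of what compacting the fsync request queue amounts to: dropping duplicate (relation, segment) requests while preserving the order of first occurrences. The struct and names are invented for the example; the real patch operates on the background writer's shared-memory queue and uses a hash table rather than this O(n^2) scan:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a pending fsync request. */
typedef struct FsyncRequest
{
    int rnode;          /* which relation file */
    int segno;          /* which segment of it */
} FsyncRequest;

/* Compact the queue in place, dropping duplicate (rnode, segno) pairs
 * and returning the new length.  First occurrences keep their order. */
static size_t compact_fsync_queue(FsyncRequest *q, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++)
    {
        bool seen = false;

        for (size_t j = 0; j < out; j++)
        {
            if (q[j].rnode == q[i].rnode && q[j].segno == q[i].segno)
            {
                seen = true;
                break;
            }
        }
        if (!seen)
            q[out++] = q[i];    /* keep the first occurrence only */
    }
    return out;
}
```

A workload that repeatedly dirties the same segments is exactly the case Jeff measured: 524288 queued requests collapsing to a couple dozen distinct ones.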
Re: [HACKERS] LOCK for non-tables
Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: It's a major undertaking trying to write software that runs against PostgreSQL for more than one release. We should be making that easier, not harder. None of the proposals would make it impossible to write a LOCK statement that works on all available releases, [] +1 for this as a nice guideline/philosophy for syntax changes in general. Personally I don't mind changing a few SQL statements when I upgrade to a new release; but it sure is nice if there's at least some syntax that works on both a current and previous release. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] auto-sizing wal_buffers
On Sun, Jan 16, 2011 at 1:52 AM, Greg Smith g...@2ndquadrant.com wrote: Fujii Masao wrote: +int XLOGbuffersMin = 8; XLOGbuffersMin is a fixed value. I think that defining it as a macro rather than a variable seems better. + if (XLOGbuffers > 2048) + XLOGbuffers = 2048; Using XLOG_SEG_SIZE/XLOG_BLCKSZ rather than 2048 seems better. +#wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers Typo: s/32kB/64kB Thanks, I've fixed all these issues and attached a new full patch, pushed to github, etc. Tests give the same results back, and it's nice that it scales to reasonable behavior if someone changes their XLOG segment size. Thanks for the update. +/* Minimum setting used for a lower bound on wal_buffers */ +#define XLOG_BUFFER_MIN 4 Why didn't you use XLOG_BUFFER_MIN instead of XLOGbuffersMin? XLOG_BUFFER_MIN is not used anywhere for now. + if (XLOGbuffers < (XLOGbuffersMin * 2)) + XLOGbuffers = XLOGbuffersMin * 2; + } Why is the minimum value 64kB only when wal_buffers is set to -1? This seems confusing for users. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
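The clamping being discussed can be illustrated with a standalone sketch. The constants and the 1/32 ratio here are assumptions for the example, not the committed values:

```c
#include <assert.h>

/* Stand-ins for the real build-time constants. */
#define XLOG_SEG_SIZE    (16 * 1024 * 1024)   /* one 16MB WAL segment */
#define XLOG_BLCKSZ      8192                 /* WAL block size */
#define WAL_BUFFERS_MIN  8                    /* lower bound, in XLOG blocks */

/* If wal_buffers is -1, derive it from shared_buffers (1/32 here, an
 * assumed ratio) and clamp it to [WAL_BUFFERS_MIN, one WAL segment],
 * the XLOG_SEG_SIZE/XLOG_BLCKSZ cap suggested in the thread.
 * All values are in XLOG blocks. */
static int auto_wal_buffers(int wal_buffers, int shared_buffers)
{
    if (wal_buffers != -1)
        return wal_buffers;                   /* explicit setting wins */

    wal_buffers = shared_buffers / 32;
    if (wal_buffers < WAL_BUFFERS_MIN)
        wal_buffers = WAL_BUFFERS_MIN;
    if (wal_buffers > XLOG_SEG_SIZE / XLOG_BLCKSZ)
        wal_buffers = XLOG_SEG_SIZE / XLOG_BLCKSZ;
    return wal_buffers;
}
```

Using the segment-size cap rather than a literal 2048 is exactly Fujii-san's point: the upper bound then tracks any non-default XLOG segment size automatically.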
Re: [HACKERS] LOCK for non-tables
On Sun, 2011-01-16 at 16:30 -0800, Ron Mayer wrote: Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: It's a major undertaking trying to write software that runs against PostgreSQL for more than one release. We should be making that easier, not harder. None of the proposals would make it impossible to write a LOCK statement that works on all available releases, [] +1 for this as a nice guideline/philosophy for syntax changes in general. Thanks Tom, appreciate it. Personally I don't mind changing a few SQL statements when I upgrade to a new release; but it sure is nice if there's at least some syntax that works on both a current and previous release. Yes, changing to a form of syntax that is then both backwards and forwards compatible is a good solution. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plperlu problem with utf8 [REVIEW]
On Sat, Jan 15, 2011 at 14:20, Andy Colson a...@squeakycode.net wrote: This is a review of the plperl encoding issues patch https://commitfest.postgresql.org/action/patch_view?id=452 Thanks for taking the time to review! [...] The Patch: == Applies clean to git head as of January 15 2011. PG built with --enable-cassert and --enable-debug seems to run fine with no errors. I don't think regression tests cover plperl, so it's understandable there are no tests in the patch. FYI, there are plperl tests; you can do 'make installcheck' from the plperl dir or installcheck-world from the top. However I did not add any, as AFAIK there is not a way to handle multiple locales with them (at least for the automated case). There are no manual updates in the patch either, and I think there should be. I think it should be made clear that data (varchar, text, etc., but not bytea) will be passed to perl as UTF-8, regardless of database encoding I don't disagree, but I don't see where to put it either. Maybe it's only release note material? Also that use utf8; is always loaded and in use. Sorry, I probably mis-worded that in my original description. It's that we always do the 'utf8fix' for plperl, not that utf8 is loaded and in use. This fix basically makes sure the unicode database and associated modules are loaded. This is needed because perl will try to dynamically load these when you need them. As we restrict 'require' in the plperl case, things that depended on that would fail. Previously we only did the utf8fix when we were a PG_UTF8 database. I don't really think it's worth documenting; it's more a bug fix than anything else.
Re: [HACKERS] auto-sizing wal_buffers
On Sun, Jan 16, 2011 at 7:34 AM, Josh Berkus j...@agliodbs.com wrote: I think we can be more specific on that last sentence; is there even any *theoretical* benefit to settings above 16MB, the size of a WAL segment? Certainly there have been no test results to show any. If the workload generates 16MB or more of WAL within one wal_writer_delay interval, then wal_buffers of 16MB or more would be effective. In that case, wal_buffers is likely to be filled up with unwritten WAL, and then you have to write buffers while holding the WALInsert lock. This is obviously not good. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] We need to log aborted autovacuums
On Sun, 2011-01-16 at 12:50 -0800, Josh Berkus wrote: On 1/16/11 11:19 AM, Simon Riggs wrote: I would prefer it if we had a settable lock timeout, as suggested many moons ago. When that was discussed before it was said there was no difference between a statement timeout and a lock timeout, but I think there clearly is, this case being just one example. Whatever happened to lock timeouts, anyway? We even had some patches floating around for 9.0 and they disappeared. However, we'd want a separate lock timeout for autovac, of course. I'm not at all keen on a *statement* timeout on autovacuum; as long as autovacuum is doing work, I don't want to cancel it. Also, WTF would we set it to? Going the statement timeout route seems like a way to create a LOT of extra work, troubleshooting, getting it wrong, and releasing patch updates. Please let's just create a lock timeout. I agree with you, but if we want it *this* release, on top of all the other features we have queued, then I suggest we compromise. If you hold out for more features, you may get less. Statement timeout = 2 * (100ms + autovacuum_vacuum_cost_delay) * tablesize / (autovacuum_vacuum_cost_limit / vacuum_cost_page_dirty) -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
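Plugging in stock defaults makes the shape of that formula concrete. This is only a worked example of the proposal; the table size is made up, and the units (milliseconds and pages) are my reading of it:

```c
#include <assert.h>

/* Simon's proposed bound:
 *   2 * (100ms + autovacuum_vacuum_cost_delay)
 *     * tablesize / (autovacuum_vacuum_cost_limit / vacuum_cost_page_dirty)
 * Times in milliseconds, table size in pages. */
static long autovac_timeout_ms(long table_pages,
                               long cost_delay_ms,    /* autovacuum_vacuum_cost_delay */
                               long cost_limit,       /* vacuum cost limit in effect */
                               long page_dirty_cost)  /* vacuum_cost_page_dirty */
{
    /* pages that can be dirtied per cost-delay round */
    long pages_per_round = cost_limit / page_dirty_cost;

    return 2 * (100 + cost_delay_ms) * table_pages / pages_per_round;
}
```

With a 20ms cost delay, an effective cost limit of 200 and a page-dirty cost of 20, a hypothetical 100,000-page table yields a 2,400,000ms (40-minute) timeout, which shows how fast this bound grows with table size and why Robert calls the compromise dangerous.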
Re: [HACKERS] limiting hint bit I/O
Robert Haas wrote: I think you may be confused about what the patch does - currently, pages with hint bit changes are considered dirty, period. Therefore, they are written whenever any other dirty page would be written: by the background writer cleaning scan, at checkpoints, and when a backend must write a dirty buffer before reallocating it to hold a different page. The patch keeps the first of these and changes the second two No, I understood that. I'm just concerned that if you eliminate the other two, we may be recomputing visibility based on clog often enough to kill performance. In other words, I'm asking that you show that the other two methods aren't really needed for decent overall performance. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spread checkpoint sync
On Sun, Jan 16, 2011 at 7:32 PM, Jeff Janes jeff.ja...@gmail.com wrote: But since you already wrote a patch to do the whole thing, I figured I'd time it. Thanks! I arranged to test an instrumented version of your patch under large shared_buffers of 4GB, conditions that would maximize the opportunity for it to take a long time. Running your compaction to go from 524288 to a handful (14 to 29, depending on run) took between 36 and 39 milliseconds. For comparison, doing just the memcpy part of AbsorbFsyncRequest on a full queue took from 24 to 27 milliseconds. They are close enough to each other that I am no longer interested in partial deduplication. But both are long enough that I wonder if having a hash table in shared memory that is kept unique automatically at each update might not be worthwhile. There are basically three operations that we care about here: (1) time to add an fsync request to the queue, (2) time to absorb requests from the queue, and (3) time to compact the queue. The first is by far the most common, and at least in any situation that anyone's analyzed so far, the second will be far more common than the third. Therefore, it seems unwise to accept any slowdown in #1 to speed up either #2 or #3, and a hash table probe is definitely going to be slower than what's required to add an element under the status quo. We could perhaps mitigate this by partitioning the hash table. Alternatively, we could split the queue in half and maintain a global variable - protected by the same lock - indicating which half is currently open for insertions. The background writer would grab the lock, flip the global, release the lock, and then drain the half not currently open to insertions; the next iteration would flush the other half. However, it's unclear to me that either of these things has any value. I can't remember any reports of contention on the BgWriterCommLock, so changing the logic as little as possible seems like the way to go.
(In contrast, note that the WAL insert lock, proc array lock, and lock manager/buffer manager partition locks are all known to be heavily contended in certain workloads.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
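The split-queue idea Robert floats can be sketched in miniature. Locking is omitted here; in the real design the flag flip and each insert would happen under BgWriterCommLock. All names are invented for the example:

```c
#include <assert.h>
#include <stddef.h>

#define HALF_CAPACITY 8

/* Two half-queues: backends insert into whichever half is open, and the
 * background writer flips the flag, then drains the now-closed half. */
typedef struct SplitQueue
{
    int    item[2][HALF_CAPACITY];
    size_t len[2];
    int    open;                /* index of the half accepting insertions */
} SplitQueue;

static void queue_insert(SplitQueue *q, int v)
{
    /* a full half would be handled here; omitted for brevity */
    q->item[q->open][q->len[q->open]++] = v;
}

/* Flip which half is open; return the index of the half to drain. */
static int queue_flip(SplitQueue *q)
{
    int closed = q->open;

    q->open = 1 - q->open;
    return closed;
}

static void queue_drain(SplitQueue *q, int half)
{
    /* q->item[half][0..len-1] would be processed here, then reset */
    q->len[half] = 0;
}
```

The benefit is that draining never blocks inserters for the whole scan; the cost, as the message notes, is extra complexity that only pays off if the lock is actually contended.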
Re: [HACKERS] limiting hint bit I/O
On Sun, Jan 16, 2011 at 8:41 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas wrote: I think you may be confused about what the patch does - currently, pages with hint bit changes are considered dirty, period. Therefore, they are written whenever any other dirty page would be written: by the background writer cleaning scan, at checkpoints, and when a backend must write a dirty buffer before reallocating it to hold a different page. The patch keeps the first of these and changes the second two No, I understood that. I'm just concerned that if you eliminate the other two, we may be recomputing visibility based on clog often enough to kill performance. In other words, I'm asking that you show that the other two methods aren't really needed for decent overall performance. Admittedly I've only done one test, but on the basis of that test I'd say the other two methods ARE really needed for decent overall performance. I think it'd be interesting to see this tested on a machine with large shared buffers, where the background writer might succeed in cleaning a higher fraction of the pages before the bulk read buffer access strategy starts recycling buffers. But I'm not very optimistic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Patch to add a primary key using an existing index
On Sun, Jan 16, 2011 at 5:34 PM, Steve Singer ssinger...@sympatico.ca wrote: I'm marking this as returned with feedback pending your answer on the possible memory leak above but I think the patch is very close to being ready. Please use Waiting on Author if the patch is to be considered further for this CommitFest, and Returned with Feedback only if it will not be further considered for this CommitFest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] We need to log aborted autovacuums
On Sun, Jan 16, 2011 at 8:36 PM, Simon Riggs si...@2ndquadrant.com wrote: I agree with you, but if we want it *this* release, on top of all the other features we have queued, then I suggest we compromise. If you hold out for more feature, you may get less. Statement timeout = 2 * (100ms + autovacuum_vacuum_cost_delay) * tablesize/(autovacuum_vacuum_cost_limit / vacuum_cost_page_dirty) I'm inclined to think that would be a very dangerous compromise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql: Add \dL to show languages
On Sun, Jan 16, 2011 at 7:04 AM, Magnus Hagander mag...@hagander.net wrote: I do not like the use of parentheses in the usage description list (procedural) languages. Why not have it simply as list procedural languages? Because it lists non-procedural languages as well? (I didn't check it, that's just a guess) There are many places in our code and documentation where procedural language or language are treated as synonyms. There's no semantic difference; procedural is simply a noise word. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Recovery control functions
On Sun, Jan 16, 2011 at 11:52 PM, Magnus Hagander mag...@hagander.net wrote: So I'm back to my original question which is, how much work would this be? I don't know my way around that part so I can't estimate, and what's there so far is certainly a lot better than nothing, but if it's not a huge amount of work it would be a great improvement. I don't think it's a huge amount of work. Though I'm not sure Simon has time to do that since he would be very busy with SyncRep. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication logging
On Sun, Jan 16, 2011 at 9:19 AM, Magnus Hagander mag...@hagander.net wrote: Currently, replication connections *always* logs something like: LOG: replication connection authorized: user=mha host=[local] There's no way to turn that off. I can't find the reasoning behind this - why is this one not controlled by log_connections like normal ones? There's a comment in the code that says this is intentional, but I can't figure out why... Because it's reasonably likely that you'd want to log replication connections but not regular ones? On the theory that replication is more important than an ordinary login? What do you have in mind? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming base backups
On Sat, Jan 15, 2011 at 8:33 PM, Tatsuo Ishii is...@postgresql.org wrote: When do the standby launch its walreceiver? It would be extra-nice for the base backup tool to optionally continue streaming WALs until the standby starts doing it itself, so that wal_keep_segments is really deprecated. No idea how feasible that is, though. Good point. I have been always wondering why we can't use the existing WAL transporting infrastructure for sending/receiving WAL archive segments in streaming replication. If my memory serves, Fujii has already proposed such an idea but it was rejected for some reason I don't understand. I must be confused, because you can use backup_command/restore_command to transport WAL segments, in conjunction with streaming replication. What Fujii-san unsuccessfully proposed was to have the master restore segments from the archive and stream them to clients, on request. It was deemed better to have the slave obtain them from the archive directly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] walreceiver fallback_application_name
On Sun, Jan 16, 2011 at 3:53 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: Is walreceiver something that the average DBA is going to realize what it is? Perhaps go for something like replication slave? I think walreceiver is very good here, and the user is already confronted to such phrasing. http://www.postgresql.org/docs/9.0/interactive/runtime-config-wal.html#GUC-MAX-WAL-SENDERS I agree that walreceiver is a reasonable default to supply in this case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Warning compiling pg_dump (MinGW, Windows XP)
2011/1/13 Pavel Golub pa...@microolap.com: Hello, Pgsql-hackers. I'm getting such warnings: pg_dump.c: In function 'dumpSequence': pg_dump.c:11449:2: warning: unknown conversion type character 'l' in format pg_dump.c:11449:2: warning: too many arguments for format pg_dump.c:11450:2: warning: unknown conversion type character 'l' in format pg_dump.c:11450:2: warning: too many arguments for format Line numbers may not be the same in the official sources, because I've made some changes. But the lines are: snprintf(bufm, sizeof(bufm), INT64_FORMAT, SEQ_MINVALUE); snprintf(bufx, sizeof(bufx), INT64_FORMAT, SEQ_MAXVALUE); In my opinion configure failed for MinGW+Windows in this case. Am I right? Can someone give me a hint how to avoid this? It seems like PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT is getting the wrong answer on your machine, though I'm not sure why. The easiest workaround is probably to run configure and then edit src/include/pg_config.h before compiling. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Streaming base backups
Good point. I have been always wondering why we can't use the existing WAL transporting infrastructure for sending/receiving WAL archive segments in streaming replication. If my memory serves, Fujii has already proposed such an idea but it was rejected for some reason I don't understand. I must be confused, because you can use backup_command/restore_command to transport WAL segments, in conjunction with streaming replication. Yes, but using restore_command is not terribly convenient. On Linux/UNIX systems you have to enable ssh access, which is extremely hard on Windows. IMO streaming replication is not yet easy enough to set up for ordinary users. Making base backups easier has already been proposed, and I think that's good. Why don't we go a step beyond that? What Fujii-san unsuccessfully proposed was to have the master restore segments from the archive and stream them to clients, on request. It was deemed better to have the slave obtain them from the archive directly. Did Fujii-san agree with the conclusion? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Streaming base backups
On Mon, Jan 17, 2011 at 11:32 AM, Tatsuo Ishii is...@postgresql.org wrote: Good point. I have been always wondering why we can't use the existing WAL transporting infrastructure for sending/receiving WAL archive segments in streaming replication. If my memory serves, Fujii has already proposed such an idea but it was rejected for some reason I don't understand. I must be confused, because you can use backup_command/restore_command to transport WAL segments, in conjunction with streaming replication. Yes, but using restore_command is not terribly convenient. On Linux/UNIX systems you have to enable ssh access, which is extremely hard on Windows. Agreed. IMO streaming replication is not yet easy enough to set up for ordinary users. Making base backups easier has already been proposed, and I think that's good. Why don't we go a step beyond that? What Fujii-san unsuccessfully proposed was to have the master restore segments from the archive and stream them to clients, on request. It was deemed better to have the slave obtain them from the archive directly. Did Fujii-san agree with the conclusion? No. If the conclusion were true, we would not need a streaming backup feature. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] walreceiver fallback_application_name
On Mon, Jan 17, 2011 at 11:16 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jan 16, 2011 at 3:53 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Magnus Hagander mag...@hagander.net writes: Is walreceiver something that the average DBA is going to realize what it is? Perhaps go for something like replication slave? I think walreceiver is very good here, and the user is already confronted with such phrasing. http://www.postgresql.org/docs/9.0/interactive/runtime-config-wal.html#GUC-MAX-WAL-SENDERS I agree that walreceiver is a reasonable default to supply in this case. +1, though I could not find the mention of walreceiver in the doc. diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c index c052df2..962ee04 100644 --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c +++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c @@ -92,7 +92,7 @@ libpqrcv_connect(char *conninfo, XLogRecPtr startpoint) * replication for .pgpass lookup. */ snprintf(conninfo_repl, sizeof(conninfo_repl), - "%s dbname=replication replication=true", + "%s dbname=replication replication=true fallback_application_name=postgres", conninfo); Also the size of conninfo_repl needs to be enlarged. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
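Fujii-san's sizing point generalizes: when a fixed-size buffer gets a longer format string, snprintf's return value is the easy way to check that nothing was truncated. A standalone sketch, with invented names and using the walreceiver name discussed in the thread (not necessarily the committed value):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the replication conninfo into buf; return 1 if it fit, 0 if it
 * was truncated.  snprintf returns the length the full string would
 * have needed, regardless of buflen. */
static int build_conninfo(char *buf, size_t buflen, const char *conninfo)
{
    int needed = snprintf(buf, buflen,
                          "%s dbname=replication replication=true "
                          "fallback_application_name=walreceiver",
                          conninfo);

    return needed >= 0 && (size_t) needed < buflen;
}
```

This is exactly why appending a new option to the format string means the fixed buffer (conninfo_repl in the patch) must grow with it.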
Re: [HACKERS] Spread checkpoint sync
I have finished a first run of benchmarking the current 9.1 code at various sizes. See http://www.2ndquadrant.us/pgbench-results/index.htm for many details. The interesting stuff is in Test Set 3, near the bottom. That's the first one that includes buffer_backend_fsync data. This is all on ext3 so far, but using a newer 2.6.32 kernel, the one from Ubuntu 10.04. The results are classic Linux in 2010: latency pauses from checkpoint sync will easily leave the system at a dead halt for a minute, with the worst one observed this time lasting 108 seconds. That one is weird, but these two are completely average cases: http://www.2ndquadrant.us/pgbench-results/210/index.html http://www.2ndquadrant.us/pgbench-results/215/index.html I think a helpful next step here would be to put Robert's fsync compaction patch into here and see if that helps. There are enough backend syncs showing up in the difficult workloads (scale=1000, clients >= 32) that its impact should be obvious. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books
Re: [HACKERS] psql: Add \dL to show languages
On Sat, Jan 15, 2011 at 8:26 PM, Andreas Karlsson andr...@proxel.se wrote: Hi Josh, Here is my review of this patch for the commitfest. Review of https://commitfest.postgresql.org/action/patch_view?id=439 Thanks a lot for the review! Contents and Purpose This patch adds the \dL command in psql to list the procedural languages. To me this seems like a useful addition to the commands in psql, and \dL is also a quite sane name for it which follows the established conventions. Documentation of the new command is included and looks good. Running it = The patch did not apply cleanly against HEAD, so I fixed it. I tested the commands \dL, \dLS, \dL+, \dLS+ and they worked as I expected. Support for patterns worked too, and while not that useful it was not much code either. The psql completion worked fine too. Yeah, IIRC Fernando included pattern-completion in the original patch, and a reviewer said roughly the same thing -- it's probably overkill, but since it's just a small amount of code and it's already in there, no big deal. Some things I noticed when using it though. I do not like the use of parentheses in the usage description list (procedural) languages. Why not have it simply as list procedural languages? I agree. Should we include a column in \dL+ for the laninline function (DO blocks)? Hrm, I guess that could be useful for the verbose output at least. Performance === Quite irrelevant to this patch. :) Coding == In general the code looks good and follows conventions except for some whitespace errors that I cleaned up. * Trailing whitespace in src/bin/psql/describe.c. * Incorrect indentation, spaces vs tabs, in psql/describe.c and psql/tab-complete.c. * Removed empty line after return in listLanguages to match other functions. The comment for the function is not that descriptive but it is enough and fits with the other functions. Another thing is that the generated SQL needs its formatting fixed up, in my opinion. I recommend looking at the query built by \dL by using psql -E.
Here is an example of how the query looks for \dL+:

SELECT l.lanname AS "Procedural Language",
       pg_catalog.pg_get_userbyid(l.lanowner) AS "Owner",
       l.lanpltrusted AS "Trusted",
       l.lanplcallfoid::regprocedure AS "Call Handler",
       l.lanvalidator::regprocedure AS "Validator",
       NOT l.lanispl AS "System Language",
       pg_catalog.array_to_string(l.lanacl, E'\n') AS "Access privileges"
FROM pg_catalog.pg_language l
ORDER BY 1;

Sorry, I don't understand... what's wrong with the formatting of this query? In terms of whitespace, it looks pretty similar to listDomains() directly below listLanguages() in describe.c.

Conclusion
==========

The patch looks useful, worked, and there were no bugs obvious to me. The code also looks good and in line with other functions doing similar things. There are just some small issues that I think should be resolved before committing it: laninline, the format of the built query, and the usage string. Attached is a version that applies to current HEAD and with the whitespace fixed.

Regards, Andreas

Thanks for the cleaned-up patch.

Josh

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spread checkpoint sync
On Sun, Jan 16, 2011 at 10:13 PM, Greg Smith g...@2ndquadrant.com wrote: I have finished a first run of benchmarking the current 9.1 code at various sizes. See http://www.2ndquadrant.us/pgbench-results/index.htm for many details. The interesting stuff is in Test Set 3, near the bottom. That's the first one that includes buffer_backend_fsync data. This is all on ext3 so far, but using a newer 2.6.32 kernel, the one from Ubuntu 10.04. The results are classic Linux in 2010: latency pauses from checkpoint sync will easily leave the system at a dead halt for a minute, with the worst one observed this time standing still for 108 seconds.

I wish I understood better what makes Linux systems freeze up under heavy I/O load. Linux - like other UNIX-like systems - generally has reasonably effective mechanisms for preventing a single task from monopolizing the (or a) CPU in the presence of other processes that also wish to be time-sliced, but the same thing doesn't appear to be true of I/O.

I think a helpful next step here would be to put Robert's fsync compaction patch into this and see if that helps. There are enough backend syncs showing up in the difficult workloads (scale=1000, clients >= 32) that its impact should be obvious.

Thanks for doing this work. I look forward to the results.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] psql: Add \dL to show languages
On Sun, Jan 16, 2011 at 8:52 PM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jan 16, 2011 at 7:04 AM, Magnus Hagander mag...@hagander.net wrote: I do not like the use of parentheses in the usage description "list (procedural) languages". Why not have it simply as "list procedural languages"?

Because it lists non-procedural languages as well? (I didn't check it, that's just a guess)

There are many places in our code and documentation where "procedural language" and "language" are treated as synonyms. There's no semantic difference; "procedural" is simply a noise word.

[bikeshedding] I agree with Andreas' suggestion that the help string be "list procedural languages", even though the \dLS output looks something like this:

        List of languages
 Procedural Language | Owner | Trusted
---------------------+-------+---------
 c                   | josh  | f
 internal            | josh  | f
 plpgsql             | josh  | t
 sql                 | josh  | t
(4 rows)

which, as Magnus points out, includes non-procedural languages (SQL). I think that "list languages" could be confusing to newcomers -- the very people who might be reading through the help output of psql for the first time -- who might suppose that "languages" has something to do with the character sets supported by PostgreSQL, and might not even be aware that a variety of procedural languages can be used inside the database.

Josh
Re: [HACKERS] auto-sizing wal_buffers
On Sat, Jan 15, 2011 at 2:34 PM, Josh Berkus j...@agliodbs.com wrote: On 1/14/11 10:51 PM, Greg Smith wrote:

! Since the data is written out to disk at every transaction commit,
! the setting may only need to be large enough to hold the amount
! of WAL data generated by one typical transaction. Larger values,
! typically at least a few megabytes, can improve write performance
! on a busy server where many clients are committing at once.
! Extremely large settings are unlikely to provide additional benefit.

I think we can be more specific on that last sentence; is there even any *theoretical* benefit to settings above 16MB, the size of a WAL segment?

I would turn it around and ask if there is any theoretical reason it would not benefit? (And if so, can they be cured soon?) Certainly there have been no test results to show any. Did the tests show steady improvement up to 16MB and then suddenly hit a wall? (And in that case, were they recompiled at a larger segment size and repeated?) Or did improvement just peter out because 16MB is really quite a bit and there was just no need for it to be larger, independent of segment size?

Cheers, Jeff
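For reference, the knobs under discussion look like this in postgresql.conf; the values shown are illustrative only, not recommendations from this thread:

```
# postgresql.conf (illustrative values, not recommendations)
wal_buffers = 16MB      # one full WAL segment's worth of buffers; the
                        # auto-sizing patch being discussed picks a value
                        # automatically when this is left at -1

# The WAL segment size itself is fixed at build time (16MB by default);
# testing larger segments, as Jeff suggests, means recompiling with
#   ./configure --with-wal-segsize=<size-in-MB>
```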
Re: [HACKERS] plperlu problem with utf8 [REVIEW]
On 01/16/2011 07:14 PM, Alex Hunsaker wrote: On Sat, Jan 15, 2011 at 14:20, Andy Colson a...@squeakycode.net wrote: This is a review of the plperl encoding issues patch: https://commitfest.postgresql.org/action/patch_view?id=452

Thanks for taking the time to review!

[...]

The Patch:
==========

Applies cleanly to git head as of January 15, 2011. PG built with --enable-cassert and --enable-debug seems to run fine with no errors. I don't think the regression tests cover plperl, so it's understandable that there are no tests in the patch.

FWIW, there are plperl tests; you can do 'make installcheck' from the plperl dir or installcheck-world from the top. However, I did not add any, as AFAIK there is not a way to handle multiple locales with them (at least for the automated case).

Oh, cool. I'd kinda thought 'make check' was the one to run. I'll have to check out 'make check' vs. 'make installcheck'.

There are no manual updates in the patch either, and I think there should be. I think it should be made clear that data (varchar, text, etc., but not bytea) will be passed to perl as UTF-8, regardless of database encoding.

I don't disagree, but I don't see where to put it either. Maybe it's only release note material?

I think this page: http://www.postgresql.org/docs/current/static/plperl-funcs.html Right after: "Arguments and results are handled as in any other Perl subroutine: arguments are passed in @_, and a result value is returned with return or as the last expression evaluated in the function." Add: "Arguments will be converted from the database's encoding to UTF-8 for use inside plperl, and then converted from UTF-8 back to the database encoding upon return." OR, that same sentence could be added to the next page: http://www.postgresql.org/docs/current/static/plperl-data.html

However, this patch brings back DWIM to plperl. It should just work without having to worry about it.

I'd be ok either way.

Also, that use utf8; is always loaded and in use.

Sorry, I probably mis-worded that in my original description.
It's that we always do the 'utf8fix' for plperl, not that utf8 is loaded and in use. This fix basically makes sure the unicode database and associated modules are loaded. This is needed because perl will try to dynamically load these when you need them; as we restrict 'require' in the plperl case, things that depended on that would fail. Previously we only did the utf8fix when we were in a PG_UTF8 database. I don't really think it's worth documenting; it's more a bug fix than anything else.

Agreed.

-Andy
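A minimal sketch of the behavior the proposed doc sentence describes, assuming the patch is applied; the function name here is made up for illustration:

```sql
-- The text argument should arrive in the Perl code as a decoded UTF-8
-- string regardless of the database encoding, and the return value is
-- converted back to the database encoding on the way out.
CREATE OR REPLACE FUNCTION perl_upper(t text) RETURNS text AS $$
    my ($t) = @_;
    return uc($t);   -- uc() operates on characters, so non-ASCII input
                     -- is uppercased correctly when $t is decoded UTF-8
$$ LANGUAGE plperl;

SELECT perl_upper('héllo');  -- expected: 'HÉLLO', in any encoding that can store it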
Re: [HACKERS] auto-sizing wal_buffers
On Sun, Jan 16, 2011 at 9:32 AM, Tom Lane t...@sss.pgh.pa.us wrote: Josh Berkus j...@agliodbs.com writes: I think we can be more specific on that last sentence; is there even any *theoretical* benefit to settings above 16MB, the size of a WAL segment?

IIRC there's a forced fsync at WAL segment switch, so no.

However, other backends can still do WAL inserts while that fsync takes place, as long as they can find available buffers to write into. So that should not be too limiting -- a larger wal_buffers makes it more likely they will find available buffers. However, if the background writer does not keep up under bulk loading conditions, then the end-of-segment fsync will probably happen via AdvanceXLInsertBuffer, which will be sitting on the WALInsertLock. So that is obviously bad news.

Cheers, Jeff
Re: [HACKERS] psql: Add \dL to show languages
On Sun, Jan 16, 2011 at 10:40 PM, Josh Kupershmidt schmi...@gmail.com wrote: On Sun, Jan 16, 2011 at 8:52 PM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jan 16, 2011 at 7:04 AM, Magnus Hagander mag...@hagander.net wrote: I do not like the use of parentheses in the usage description "list (procedural) languages". Why not have it simply as "list procedural languages"?

Because it lists non-procedural languages as well? (I didn't check it, that's just a guess)

There are many places in our code and documentation where "procedural language" and "language" are treated as synonyms. There's no semantic difference; "procedural" is simply a noise word.

[bikeshedding] I agree with Andreas' suggestion that the help string be "list procedural languages", even though the \dLS output looks something like this:

        List of languages
 Procedural Language | Owner | Trusted
---------------------+-------+---------
 c                   | josh  | f
 internal            | josh  | f
 plpgsql             | josh  | t
 sql                 | josh  | t
(4 rows)

By the by, in the output of \df, \dt, \db, etc., that first column is called simply "Name".

which, as Magnus points out, includes non-procedural languages (SQL). I think that "list languages" could be confusing to newcomers -- the very people who might be reading through the help output of psql for the first time -- who might suppose that "languages" has something to do with the character sets supported by PostgreSQL, and might not even be aware that a variety of procedural languages can be used inside the database.

Fair point.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Fwd: [JDBC] Weird issues when reading UDT from stored function
On Wed, Jan 12, 2011 at 5:12 AM, rsmogura rsmog...@softperience.eu wrote: Dear hackers :) Could you look at this thread from General. --- I say: if you have a one-row-type output result, the backend treats it as the full output result, which is really bad if you use STRUCT types (in your example you see a few columns, but this should be one column!). I think the backend should return ROWDESC(1), then per row describe this row type's data. In other words, the result should be as in my example, but without the last column. Because this funny behaviour is visible in psql and in JDBC, I think it's a backend problem or some deeper inconsistency. I don't see this described for the select statement.

I've read this report over a few times now, and I'm still not understanding exactly what is happening that you're unhappy about.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
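If I'm reading the report right, the question may come down to how a composite result is surfaced: as a single column of row type, or expanded into its fields. A sketch of the two call forms whose output differs (the type and function names here are invented for illustration):

```sql
-- A composite type and a function returning it:
CREATE TYPE pair AS (a int, b int);

CREATE FUNCTION make_pair() RETURNS pair AS $$
    SELECT 1, 2;
$$ LANGUAGE sql;

SELECT make_pair();         -- one column, of composite type "pair"
SELECT * FROM make_pair();  -- two columns, a and b, expanded
```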