[HACKERS] automatically updating security barrier views
Hello list, I am posting here since Craig Ringer suggested that user feedback on this feature at this time may make a difference for 9.4 inclusion. I apologize if I am in the wrong place. Row-level security is probably THE major area where the FOSS databases lag behind proprietary databases like Oracle or DB2. I understand that RLS was originally targeted for 9.3, slipped, was retargeted for 9.4, and will slip again. This is unfortunate, but I realize it's a hard problem. It is possible for me to get about 90% of the way there by clever use of views. However views are not intrinsically secure, and well-crafted queries can leak the underlying table information. It is not immediately clear to me, as a user, if there is a way to lock down the user account hard enough to prevent this, but I suspect the answer is no. In theory security-barrier views solve this problem, but since you cannot INSERT or UPDATE a security-barrier view, this only works for read-only applications. For read-write scenarios, some other approach must be taken for the write path. If UPDATE/INSERT on security-barrier views slips to 9.5, the impact to me personally is that I will have to create some kind of MITM proxy or write some kind of privilege-elevated process to handle the write path. I am willing to take on some tasks in support of solving this problem in 9.4, but my unfamiliarity with the codebase means that realistically I'm not of much help. Certainly, if there is something I could be doing, please let me know. Drew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
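To illustrate the kind of leak I mean, here is a minimal sketch (the table and function names are hypothetical, just for illustration):

```sql
CREATE TABLE accounts (id int, owner text, balance numeric);
CREATE VIEW my_accounts AS
    SELECT * FROM accounts WHERE owner = current_user;

-- A very cheap function may be pushed down below the view's filter,
-- so it can observe rows the view was supposed to hide:
CREATE FUNCTION leak(accounts) RETURNS boolean
    COST 0.0000001 LANGUAGE plpgsql AS
$$ BEGIN RAISE NOTICE '%', $1; RETURN true; END $$;

SELECT * FROM my_accounts a WHERE leak(a);  -- may NOTICE other users' rows

-- Marking the view as a security barrier forces the view's qual to be
-- applied first, but today that makes the view effectively read-only:
ALTER VIEW my_accounts SET (security_barrier = true);
```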
Re: [HACKERS] Autonomous Transaction (WIP)
Hello

+1 for the feature

-1 for the Oracle syntax - it is highly inconsistent with Postgres. Autonomous transactions should be usable everywhere - not only in plpgsql.

Regards Pavel

2014-04-07 6:06 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com:

I would like to propose the "Autonomous Transaction" feature for 9.5. Details are mentioned below:

What is Autonomous Transaction?

An autonomous transaction has its own COMMIT and ROLLBACK scope to ensure that its outcome does not affect the caller's uncommitted changes. Additionally, COMMITs and ROLLBACKs in the calling transaction do not affect the changes that were finalized on completion of the autonomous transaction itself. Below are the properties of an autonomous transaction:

1. The autonomous transaction does not see uncommitted changes made by the main transaction and does not share locks or resources with the main transaction.
2. Changes made in autonomous transactions are visible to other transactions upon commit of the autonomous transaction. Thus, users can access the updated information without having to wait for the main transaction to commit.
3. Autonomous transactions can start other autonomous transactions. There is no limit, other than resource limits, on how many levels of autonomous transactions can be started.

Use-case:

There are many use-cases for this feature. One of them is illustrated below. Say a procedure is defined which does some operation on the database, and in case of any failure of the operation on the main table, it records the failure information in a separate relation. But because of the current transaction behavior, once the main table operation fails, the whole transaction is rolled back, and hence the error logged in the error relation is also lost, even though it might have been required for future analysis.
In order to solve this issue, we can use an autonomous transaction as shown below:

CREATE OR REPLACE FUNCTION operation(err_msg IN VARCHAR) RETURNS void AS $$
BEGIN
    INSERT INTO at_test(id, description) VALUES (998, 'Description for 998');
    INSERT INTO at_test(id, description) VALUES (999, NULL);
EXCEPTION
    WHEN OTHERS THEN
        PRAGMA AUTONOMOUS TRANSACTION;
        INSERT INTO error_logs(id, timestamp, err_msg)
            VALUES (nextval('errno'), timenow(), err_msg);
        COMMIT;
        RAISE not_null_violation;
END;
$$ LANGUAGE plpgsql;

So once we execute the above procedure, the second INSERT fails, and then within the exception handler an autonomous transaction is started, the error information is logged in a separate table, and it gets committed. So although the operation on table at_test fails and rolls back, the error information persists in the error_logs table. After execution of the procedure, the records in the two tables will be as below:

postgres=# select * from error_logs;
 id |      log_time       | err_msg
----+---------------------+---------
  5 | 2014-01-17 19:57:11 | error

postgres=# select * from at_test;
 id | description
----+-------------
(0 rows)

Syntax:

The syntax to create an autonomous transaction can be:

    PRAGMA AUTONOMOUS TRANSACTION;

This can be used with independent SQL commands, from procedures, and from triggers.

Implementation:

The implementation of autonomous transactions is based on the existing sub-transaction and main transaction code. Most of it is re-used for autonomous transactions as well.
Below are brief details about the same:

Autonomous Transaction Storage:

As for the main transaction, structure PGXACT is used to store main transactions, which are created in shared memory of size:

    (Number of processes) * sizeof(struct PGXACT)

Similarly a new structure will be defined to store autonomous transactions:

typedef struct PGAutonomousXACT
{
    TransactionId   xid;
    TransactionId   xmin;
    /* Store the level below main transaction as stored for sub-transaction */
    int             nestingLevel;
    struct XidCache subxids;
    bool            overflowed;
    bool            delayChkpt;
    uint8           nxids;
} PGAutonomousXACT;

All structure members of PGAutonomousXACT are the same as used in PGXACT except nestingLevel, which is new and stores the level of the transaction. Similar to the main transaction, the memory allocated to store autonomous transactions will be:

    (Number of processes) * sizeof(struct PGAutonomousXACT) * MAX_AUTO_TX_LEVEL

where MAX_AUTO_TX_LEVEL is the maximum number of nested autonomous transaction levels. Unlike the main transaction, an autonomous transaction cannot be accessed directly; it is accessed using an offset computed as:

    (Process number) * MAX_AUTO_TX_LEVEL + (current auto tx level)
Re: [HACKERS] Autonomous Transaction (WIP)
On 04/07/2014 12:06 PM, Rajeev rastogi wrote: Syntax to create autonomous transaction can be as: PRAGMA AUTONOMOUS TRANSACTION; Wouldn't you want to use SET TRANSACTION for this? Or a suffix on BEGIN, like BEGIN AUTONOMOUS TRANSACTION ? What's the logic behind introducing PRAGMA ? If you wanted to use that syntax for Oracle compatibility you'd need to use: PRAGMA AUTONOMOUS_TRANSACTION; (note underscore). But really, this would not be a pragma at all; PostgreSQL doesn't really have the concept. Calling it that would just be misleading. Starting of Autonomous Transaction: Starting of autonomous transaction will be exactly same as starting sub-transaction. If you don't want it to dirty read data from the parent tx, or inherit parent locks, then it cannot be the same at all. 2. Freeing of all resource and popping of previous transaction happens in the same way as sub-transaction. I'm not sure what you mean here. Overall, this looks like a HUGE job to make work well. I know some others have been doing work along the same lines, so hopefully you'll be able to collaborate and share ideas. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Pending 9.4 patches
On 04/05/2014 03:57 AM, Andres Freund wrote: c07) Updatable security barrier views. This needs a serious look by a committer. I've been exercising it via row security and it's been looking pretty solid. It isn't a huge or intrusive patch, and it's seen several rounds of discussion during its development and refinement. (Thanks Dean). In some ways it'd be nicer to do it by splitting resultRelation into two (row read source, and relation to write modified tuples into), but this turns out to be rather complex and exceedingly intrusive. We might need to do it someday - at that point, it wouldn't be overly hard to change updatable s.b. views over to working that way, but only once the major surgery required to remove the assumptions about resultRelation was done. Having this in place in 9.4 would allow people to build row-security like features in applications, and ease the path of row security into 9.5. r04) Row-security based on Updatable security barrier views This one's fate seems to be hard to judge without c07. Open issues remain with this patch, and resources for working on it in 9.4 have run out. It is not ready for commit. A core bugfix with locking in security barrier views is required before the regression tests can be fixed up properly, for one thing. Tom also expressed concerns about how plan invalidation works, though it's not yet clear whether that was just miscommunication about how it works on my part or whether there's a concrete problem there. I'd really love to finish this off for 9.4, but other projects have to come first. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
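For context, the sort of row-security-like setup this would let people build in applications looks roughly like this (a sketch only; the table, view, and role names are hypothetical):

```sql
-- A security_barrier view that filters rows per user. Today such a
-- view is read-only; with the updatable security barrier views patch,
-- writes through it would be rewritten against the base table while
-- still applying the barrier qual first.
CREATE TABLE documents (id int PRIMARY KEY, owner text, body text);

CREATE VIEW my_documents WITH (security_barrier) AS
    SELECT * FROM documents WHERE owner = current_user;

GRANT SELECT, INSERT, UPDATE, DELETE ON my_documents TO app_user;

-- With the patch applied, a statement like this would work,
-- and only rows the barrier qual admits could be touched:
UPDATE my_documents SET body = 'revised' WHERE id = 42;
```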
Re: [HACKERS] Autonomous Transaction (WIP)
On 07/04/14 15:50, Craig Ringer wrote: On 04/07/2014 12:06 PM, Rajeev rastogi wrote: Syntax to create autonomous transaction can be as: PRAGMA AUTONOMOUS TRANSACTION; Wouldn't you want to use SET TRANSACTION for this? Or a suffix on BEGIN, like BEGIN AUTONOMOUS TRANSACTION ? What's the logic behind introducing PRAGMA ? If you wanted to use that syntax for Oracle compatibility you'd need to use: PRAGMA AUTONOMOUS_TRANSACTION; (note underscore). FWIW the implementation in the patch uses PRAGMA AUTONOMOUS_TRANSACTION, the space is presumably a typo. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Including replication slot data in base backups
On Fri, Apr 4, 2014 at 9:57 PM, Michael Paquier michael.paqu...@gmail.com wrote: For 9.4, clearly yes, this would change the semantics of recovery and this is not something wise to do at the end of a development cycle. For 9.5 though, this is a different story. It clearly depends on whether this is thought useful enough to change how recovery fetches WAL files (in this case by scanning existing repslots). There are other things to consider as well, for example: do we reset the restart_lsn of a repslot if the needed WAL files are not there anymore, or abort recovery? I haven't worked much with repslots though... Coming back to that, I am still wondering if for the time being it would not be better to add in the pg_basebackup documentation that replication slot information is not added in a backup, as in the patch attached. Regards, -- Michael

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 6ce0c8c..4305788 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -590,6 +590,13 @@ PostgreSQL documentation
     or an older major version, down to 9.1. However, WAL streaming mode
     (-X stream) only works with server version 9.3.
    </para>
+
+   <para>
+    The backup will not include information about replication slots
+    (see <xref linkend="streaming-replication-slots">) as it is not
+    guaranteed that a node in recovery will have WAL files required for
+    a given slot.
+   </para>
  </refsect1>

  <refsect1>
Re: [HACKERS] polymorphic SQL functions has a problem with domains
2014-04-02 17:19 GMT+02:00 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: I was informed about the impossibility of using polymorphic functions together with domain types, see create domain xx as numeric(15); create or replace function g(anyelement, anyelement) returns anyelement as $$ select $1 + $2 $$ language sql immutable; postgres=# select g(1::xx, 2::xx); ERROR: return type mismatch in function declared to return xx DETAIL: Actual return type is numeric. CONTEXT: SQL function g during inlining That example doesn't say you can't use polymorphic functions with domains. It says that this particular polymorphic function definition is wrong: it is not making sure its result is of the expected data type. I don't recall right now whether SQL functions will apply an implicit cast on the result for you, but even if they do, an upcast from numeric to some domain over numeric wouldn't be implicit. I thought about this issue again, and I think it is a PostgreSQL bug: we can do a safe transformation from the parent type to the domain, and returning the result requires the same transformation (in this case) - so enforcing casting (not only binary casting) should be safe. Otherwise - CAST(var AS var) could be useful and could help too. Regards Pavel Stehule regards, tom lane
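For what it's worth, one call-site workaround (a sketch, not a fix for the underlying behavior) is to resolve the polymorphic type at the base type and cast the result back up to the domain explicitly:

```sql
CREATE DOMAIN xx AS numeric(15);
CREATE OR REPLACE FUNCTION g(anyelement, anyelement) RETURNS anyelement
    AS $$ SELECT $1 + $2 $$ LANGUAGE sql IMMUTABLE;

SELECT g(1::xx, 2::xx);                -- ERROR: return type mismatch
-- Workaround: let anyelement resolve to numeric, then cast the
-- numeric result back up to the domain at the call site:
SELECT g(1::numeric, 2::numeric)::xx;
```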
Re: [HACKERS] Get more from indices.
Hi Horiguchi-san, Sorry for not reviewing this patch in the last CF. (2014/03/10 16:21), Kyotaro HORIGUCHI wrote: Oops! I found a bug in this patch. The previous v8 patch missed the case that build_index_pathkeys() could build partial pathkeys from the index tlist. This causes the following situation:

===
=# \d cu11
       Table "public.cu11"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
 c      | integer |
 d      | text    |
Indexes:
    "cu11_a_b_idx" UNIQUE, btree (a, b)

=# explain (costs off) select * from cu11 order by a, c, d;
              QUERY PLAN
---------------------------------------
 Index Scan using cu11_a_b_idx on cu11
(1 row)
===

Here a simple ORDER BY a, c, d on a table with an index on columns (a, b) results in a plain index scan, which cannot produce the requested order. The attached v9 patch fixes this by adding a check for the case into index_pathkeys_are_extensible(), and is rebased to current HEAD. Good catch! ISTM that the biggest concern about this patch would be whether it's worth complicating the code, because the range of application of the patch is not so wide as is. So, all we need to do is show important use cases that prove the effectiveness of the patch. Sorry, I can't come up with a good idea, though. Thanks, Best regards, Etsuro Fujita
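As I understand it, the case the patch does target looks like this (a sketch): because cu11_a_b_idx is unique over NOT NULL columns, the prefix (a, b) determines the whole row, so any ORDER BY that starts with a, b is satisfied by the index order alone.

```sql
CREATE TABLE cu11 (a int NOT NULL, b int NOT NULL, c int, d text);
CREATE UNIQUE INDEX cu11_a_b_idx ON cu11 (a, b);

-- With the patch, this can be a plain index scan with no Sort node:
-- rows equal on (a, b) cannot exist, so they cannot differ on c or d.
EXPLAIN (COSTS OFF) SELECT * FROM cu11 ORDER BY a, b, c, d;

-- By contrast, ORDER BY a, c, d (skipping b) must still sort,
-- since a alone does not determine the row - the bug fixed in v9.
```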
Re: [HACKERS] PQputCopyData dont signal error
On Tue, Apr 01, 2014 at 01:53:13PM -0400, Robert Haas wrote: One of the things you mentioned is I often find it necessary to refer to existing examples of code when trying to figure out how to do things correctly. I couldn't agree more. Haven't seen one yet, but found plenty of discussion that tap danced around one or more of the components of the copy, put, end paradigm. Maybe I should have just asked for a sample code snippet but didn't after a day or so of frustration and trying to piece together other people's incomplete samples. FWIW, I've generally found that the best examples are what's in the core distribution. I'd go and look at a tool like psql or pg_restore and find the code that handles this, and then copy it and cut it down to what you need. To move the conversation along: https://github.com/postgres/postgres/blob/master/src/bin/psql/copy.c#L664 Seems possibly even more robust than most people will code, but it's had a lot of real world testing. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer
Re: [HACKERS] Autonomous Transaction (WIP)
On 07 April 2014 12:20, Craig Ringer wrote: Syntax to create autonomous transaction can be as: PRAGMA AUTONOMOUS TRANSACTION; Wouldn't you want to use SET TRANSACTION for this? Or a suffix on BEGIN, like BEGIN AUTONOMOUS TRANSACTION ? What's the logic behind introducing PRAGMA ? If you wanted to use that syntax for Oracle compatibility you'd need to use: PRAGMA AUTONOMOUS_TRANSACTION; (note underscore). But really, this would not be a pragma at all; PostgreSQL doesn't really have the concept. Calling it that would just be misleading. Actually it is the same as Oracle (i.e. PRAGMA AUTONOMOUS_TRANSACTION); it was just a typo in the previous mail. But if this is also not accepted, then we can discuss and come out with a syntax based on everyone's agreement. Starting of Autonomous Transaction: Starting of autonomous transaction will be exactly same as starting sub-transaction. If you don't want it to dirty read data from the parent tx, or inherit parent locks, then it cannot be the same at all. While starting a sub-transaction, we just initialize the required resources and link them to the parent transaction, which we require for an autonomous transaction also. I am not able to see the issue you mentioned above with this. Please let me know if I am missing something or have misunderstood your concern. 2. Freeing of all resource and popping of previous transaction happens in the same way as sub-transaction. I'm not sure what you mean here. It means that, during commit of an autonomous transaction, freeing of all resources is done in the same way as for a sub-transaction. Also the current autonomous transaction gets popped and points back to the parent transaction in the same way as for a sub-transaction. Overall, this looks like a HUGE job to make work well. I know some others have been doing work along the same lines, so hopefully you'll be able to collaborate and share ideas.
Yes, it is a huge job, so I have proposed it at the beginning of 9.5 so that we can have multiple rounds of discussion and hence address all concerns. Also I have proposed to finish this feature in multiple rounds, i.e. in the first patch we can try to support autonomous transactions from standalone SQL commands only, which will set up the infrastructure for future work in this area. Using the WIP patch sent, I have done basic testing and it works fine. Any comments? Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On 07 April 2014 12:12, Pavel Stehule wrote: +1 for feature Thanks -1 for Oracle syntax - it is highly inconsistent with Postgres We can discuss and come out with a syntax based on everyone's agreement. Autonomous transactions should be used everywhere - not only in plpgsql Yes, you are right. I am not planning to support it only in plpgsql. Initially we can support this using standalone SQL commands, and later we can enhance this infrastructure to be used from plpgsql and triggers. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
2014-04-07 11:59 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com: On 07 April 2014 12:12, Pavel Stehule wrote: +1 for feature Thanks -1 for Oracle syntax - it is highly inconsistent with Postgres We can discuss and come out with a syntax based on everyone's agreement. Autonomous transactions should be used everywhere - not only in plpgsql Yes, you are right. I am not planning to support it only in plpgsql. Initially we can support this using standalone SQL commands, and later we can enhance this infrastructure to be used from plpgsql and triggers. ok, I have thought about this feature for a long time. I think this should be a fully isolated transaction - it should not be a subtransaction, because then you can break database consistency - RI. I am happy that someone is doing this job. Regards Pavel Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
On Mon, Apr 7, 2014 at 3:41 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-04-07 11:59 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com: On 07 April 2014 12:12, Pavel Stehule wrote: +1 for feature Thanks -1 for Oracle syntax - it is highly inconsistent with Postgres We can discuss and come out with a syntax based on everyone's agreement. Autonomous transactions should be used everywhere - not only in plpgsql Yes, you are right. I am not planning to support it only in plpgsql. Initially we can support this using standalone SQL commands, and later we can enhance this infrastructure to be used from plpgsql and triggers. ok, I have thought about this feature for a long time. I think this should be a fully isolated transaction - it should not be a subtransaction, because then you can break database consistency - RI I am missing something here, but how does making it a subtransaction break consistency? Isn't that what should actually be happening, so that the autonomous transaction's changes are actually visible till the parent transaction commits? What am I missing here? Regards, Atri
Re: [HACKERS] Autonomous Transaction (WIP)
On 2014-04-07 15:46:42 +0530, Atri Sharma wrote: On Mon, Apr 7, 2014 at 3:41 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am missing something here, but how does making it a subtransaction break consistency? Isn't that what should actually be happening, so that the autonomous transaction's changes are actually visible till the parent transaction commits? What am I missing here?

START TRANSACTION;
INSERT INTO referenced_to_table ... id = 1;
START AUTONOMOUS SUBTRANSACTION;
INSERT INTO referencing_table id = 1 ...;
COMMIT AUTONOMOUS SUBTRANSACTION;
ROLLBACK;

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
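Spelled out a bit more (the autonomous syntax is hypothetical - it does not exist yet - and this assumes the autonomous transaction can see the parent's uncommitted rows, as subtransaction semantics would imply):

```sql
CREATE TABLE referenced_to_table (id int PRIMARY KEY);
CREATE TABLE referencing_table (id int REFERENCES referenced_to_table);

START TRANSACTION;
INSERT INTO referenced_to_table VALUES (1);  -- uncommitted in outer tx
-- START AUTONOMOUS SUBTRANSACTION;          -- hypothetical
INSERT INTO referencing_table VALUES (1);    -- FK check passes against
                                             -- the parent's uncommitted row
-- COMMIT AUTONOMOUS SUBTRANSACTION;         -- this insert is now permanent
ROLLBACK;                                    -- outer row vanishes: the
                                             -- committed row references a
                                             -- nonexistent id, breaking RI
```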
Re: [HACKERS] Autonomous Transaction (WIP)
2014-04-07 12:16 GMT+02:00 Atri Sharma atri.j...@gmail.com: On Mon, Apr 7, 2014 at 3:41 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-04-07 11:59 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com: On 07 April 2014 12:12, Pavel Stehule wrote: +1 for feature Thanks -1 for Oracle syntax - it is highly inconsistent with Postgres We can discuss and come out with a syntax based on everyone's agreement. Autonomous transactions should be used everywhere - not only in plpgsql Yes, you are right. I am not planning to support it only in plpgsql. Initially we can support this using standalone SQL commands, and later we can enhance this infrastructure to be used from plpgsql and triggers. ok, I have thought about this feature for a long time. I think this should be a fully isolated transaction - it should not be a subtransaction, because then you can break database consistency - RI I am missing something here, but how does making it a subtransaction break consistency? Isn't that what should actually be happening, so that the autonomous transaction's changes are actually visible till the parent transaction commits? The commit of an autonomous transaction doesn't depend on the outer transaction. So anything you do there should be independent of the outer transaction. Pavel What am I missing here? Regards, Atri
Re: [HACKERS] automatically updating security barrier views
* Drew Crawford (d...@sealedabstract.com) wrote: I am willing to take on some tasks in support of solving this problem in 9.4, but my unfamiliarity with the codebase means that realistically I'm not of much help. Certainly, if there is something I could be doing, please let me know. Download the patch, build PG with it, test it and try to break it. Review the docs. Thanks, Stephen
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr 5, 2014, at 09:55, Dean Rasheed dean.a.rash...@gmail.com wrote: On 5 April 2014 08:38, Dean Rasheed dean.a.rash...@gmail.com wrote: [snip] releasing it in this state feels a little half-baked to me. I regret writing that almost as soon as I sent it. The last of those queries is now over 10 times faster than HEAD, which is certainly worthwhile. What bugs me is that there is so much more potential in this patch. Well, the perfect is the enemy of the good, as they say. By all means, let's get rid of the O(n^2) behaviour in 9.5, but let's get a basic version into 9.4. I think it's pretty close to being committable, but I fear that time is running out for 9.4. I'm not aware of any open issues in the basic patch, other than scaling the reported calling statistics by nloops, which should be trivial. I'll post an updated patch later today. I don't really expect all the add-on patches to make it into 9.4 - they don't seem to have gotten much attention yet - but at least the inverse transition functions for the basic arithmetic aggregates should be doable, I hope. best regards, Florian Pflug
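For anyone following along, the sort of query inverse transition functions speed up is a moving-window aggregate (a sketch; the table name is hypothetical):

```sql
-- Without an inverse transition function, each window frame is
-- re-aggregated from scratch as it slides (roughly O(n * frame size));
-- with one, the departing row's value is subtracted as the frame
-- moves, so the whole window pass is O(n).
SELECT ts,
       sum(amount) OVER (ORDER BY ts
                         ROWS BETWEEN 99 PRECEDING AND CURRENT ROW)
           AS moving_sum
FROM payments;
```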
Re: [HACKERS] Proposal: COUNT(*) (and related) speedup
On 04 April 2014 18:09, Joshua Yanovski wrote: The counter would be updated only by VACUUM, which would perform the same computation performed by the COUNT operation but add it permanently to the counter just before it set the visible_map bit to 1 (I am not certain whether this would require unusual changes to WAL replay or not). I think there is one more disadvantage with this (or maybe I have missed something): a user may not use count(*), or may use it just once after creating the proposed new kind of index, but VACUUM will have to keep performing an operation equivalent to an iterative count(*), which might degrade VACUUM performance. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] [bug fix] pg_ctl always uses the same event source
Amit Kapila amit.kapil...@gmail.com writes: On Sat, Apr 5, 2014 at 8:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: How's that going to work during pg_ctl stop? There's no -o switch provided. As there's no -o switch, there won't be a problem of getting a wrong event source name from the server due to command line options, which you mentioned upthread - or is there something I am missing about it? AFAICS, the sole argument for trying to do things this way is to log to the same event_source the running postmaster is using. But -C will not provide that; it will only tell you what the postmaster *would* use if it were freshly started right now. If -o had been used at postmaster start time, an inquiry done by pg_ctl stop (or reload, etc) won't match. Another, possibly more plausible, failure mode is if the entry in postgresql.conf has been edited since the running postmaster looked at it. Admittedly, all of these cases are kind of unusual. But that's all the more reason, IMO, to be wary of sending our error messages to an unexpected place in such cases. Somebody who's trying to debug a configuration problem doesn't need the added complication of trying to figure out where pg_ctl's error messages might have gone. You are right that with the current patch approach we will miss many failure cases, and the way suggested by you below (a new switch) is more appropriate to fix this issue. Another thought that occurred to me: why not change the failures that occur before the appropriate event_source is set to report on the console? The main reason for using the event log to report errors in pg_ctl is that it can be used for services (register/unregister/..) on Windows, and all the work we do before setting the event_source is not related to its usage as a service. AFAICS, pg_ctl already reports to stderr if stderr is a tty. This whole issue only comes up when pg_ctl itself is running as a service (which I guess primarily means starting/stopping the postgresql service?).
So that puts extra weight on trying to be sure that error messages go to a predictable place; the user's going to be hard-pressed to debug background failures without that. regards, tom lane
Re: [HACKERS] PQputCopyData dont signal error
Martijn van Oosterhout klep...@svana.org writes: To move the conversation along: https://github.com/postgres/postgres/blob/master/src/bin/psql/copy.c#L664 Seems possibly even more robust than most people will code, but it's had a lot of real world testing. Note that the looping behavior there is actually rather new, and is probably unnecessary for 99% of applications; it would depend on what your ambitions are for dealing gracefully with connection loss (and even then, I think we were forced into this by pre-existing decisions elsewhere in psql about where and how we handle connection loss). The important thing for this question is the error reporting that occurs at lines 692-694. regards, tom lane
Re: [HACKERS] Pending 9.4 patches
Craig Ringer cr...@2ndquadrant.com writes: On 04/05/2014 03:57 AM, Andres Freund wrote: r04) Row-security based on Updatable security barrier views This one's fate seems to be hard to judge without c07. Open issues remain with this patch, and resources for working on it in 9.4 have run out. It is not ready for commit. A core bugfix with locking in security barrier views is required before the regression tests can be fixed up properly, for one thing. Tom also expressed concerns about how plan invalidation works, though it's not yet clear whether that was just miscommunication about how it works on my part or whether there's a concrete problem there. I'd really love to finish this off for 9.4, but other projects have to come first. Given that, I think we should go ahead and mark this one Returned With Feedback. It's past time to be punting anything that doesn't have a serious chance of getting committed for 9.4. regards, tom lane
Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD
On Sun, Apr 6, 2014 at 1:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2014-04-06%2017%3A04%3A00 TRAP: FailedAssertion(!(FastPathStrongRelationLocks->count[fasthashcode] > 0), File: lock.c, Line: 1240) [53418a51.6a08:2] LOG: server process (PID 27631) was terminated by signal 6 [53418a51.6a08:3] DETAIL: Failed process was running: create table gc1() inherits (c1); Uggh. That's unfortunate, but not terribly surprising: I didn't think that missing volatile was very likely to be the cause of this. Have we been getting random failures of this type since the fastlock stuff went in, and we're only just now noticing? Or did some recent change expose this problem? I'm a bit suspicious of the patches to static-ify stuff, since that might cause the compiler to think it could move things across function calls that it hadn't thought move-able before, but FastPathStrongLocks references would seem to be the obvious candidate for that, and volatile-izing it ought to have fixed it. I would think. One thing I noticed, looking at this particular failure, is that at the time that create table gc1() inherits (c1) failed an assertion, another backend was inside select blockme(), and specifically inside of select count(*) from tenk1 a, tenk1 b, tenk1 c. I can't help but suspect that the bug is somehow concurrency-related, so the presence of concurrent activity seems like a clue, but I can't figure out the relationship. blockme() shouldn't be taking any strong relation locks. Unless AV decided to truncate a table just then, the process that failed should be the only one in the system with any strong relation lock, so if there's a race, what is it racing against? In the failure on prairiedog back on March 25th... http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-03-25%2003%3A15%3A03 ...there's a lot more concurrent activity.
The process that failed the assertion was running CREATE TABLE t3 (name TEXT, n INTEGER); concurrently, the following other queries were running: SELECT oid AS clsoid, relname, relnatts + 10 AS x INTO selinto_schema.tmp2 FROM pg_class WHERE relname like '%b%'; CREATE TEMP TABLE foo (id integer); create temp table t3 as select generate_series(-1000,1000) as x; CREATE TEMPORARY TABLE bitwise_test( i2 INT2, i4 INT4, i8 INT8, i INTEGER, x INT2, y BIT(4) ); DROP TABLE savepoints; create temp table tt1(f1 int); CREATE TEMP TABLE arrtest2 (i integer ARRAY[4], f float8[], n numeric[], t text[], d timestamp[]); COMMIT PREPARED 'regress-one'; All but the first and last of those take a strong relation lock, so some kind of race could certainly account for that failure. It's also interesting that COMMIT PREPARED is shown as being involved here; that code is presumably much more rarely executed than the code for the regular commit or abort paths, and might therefore be thought more likely to harbor a bug. In particular, there's this code in LockRefindAndRelease: /* * Decrement strong lock count. This logic is needed only for 2PC. */ if (decrement_strong_lock_count && ConflictsWithRelationFastPath(&lock->tag, lockmode)) { uint32 fasthashcode = FastPathStrongLockHashPartition(hashcode); SpinLockAcquire(&FastPathStrongRelationLocks->mutex); FastPathStrongRelationLocks->count[fasthashcode]--; SpinLockRelease(&FastPathStrongRelationLocks->mutex); } I notice that this code lacks an Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0). I think we should add one. If this code is somehow managing to decrement one of the counts when it's already zero, the next process whose lock gets mapped to this partition would increment the count from the maximum value that can be stored back to zero. Then, when it goes to release the strong lock, it finds that the count is already zero and goes boom. 
This theory could even explain the new crash, since the COMMIT PREPARED stuff has already happened by the point where rover_firefly failed; since the lock tags are hashed to create fasthashcode, variation in which objects got which OIDs due to concurrency in the regression test could cause the failure to move around or even escape detection altogether. Now, even if the 2PC code is the problem, that doesn't explain exactly what's wrong with the above logic, but it would help narrow down where to look. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] tds_fdw for Sybase and MS SQL Server
Excellent! I have an application for this. I'll give it a look. Thanks! Mike -- Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley 1750 Wallace Ave | St Charles, IL 60174-3401 Office: 630.313.7818 mike.blackw...@rrd.com http://www.rrdonnelley.com On Sun, Apr 6, 2014 at 9:03 AM, Geoff Montee geoff.mon...@gmail.com wrote: If anyone is interested, I've developed a foreign data wrapper that can be used to connect to Sybase databases and Microsoft SQL Server. You can get it on GitHub here: https://github.com/GeoffMontee/tds_fdw I did my testing with FreeTDS, an open source TDS library. I've talked to Greg Smith about this in the past. I don't really have the expertise or resources to develop and test this much more than I already have. If anyone likes it and would like to take over ownership of the code, feel free to do so. I actually developed this over a year ago, so it doesn't support write operations added in PostgreSQL 9.3. By the way, thanks to everyone who spoke at PGConf NYC 2014. It was very informative. Geoff Montee
Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD
On 2014-04-07 10:06:07 -0400, Robert Haas wrote: I'm a bit suspicious of the patches to static-ify stuff, since that might cause the compiler to think it could move things across function calls that it hadn't thought move-able before, but FastPathStrongLocks references would seem to be the obvious candidate for that, and volatile-izing it ought to have fixed it. I would think. Hm. It generally might be interesting to get a few !X86 buildfarms running builds with LTO enabled. That might expose some dangerous assumptions more easily. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windows exit code 128 ... it's baaack
On 2014-04-05 11:05:09 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-02-27 19:14:13 -0500, Tom Lane wrote: I looked at the postmaster log for the ongoing issue on narwhal (to wit, that the contrib/dblink test dies the moment it tries to do anything dblink-y), and looky here what the postmaster has logged: One interesting bit about this is that it seems to work in 9.3... Well, yeah, it seems to have been broken somehow by the Windows linking changes we did awhile back to try to ensure that missing PGDLLIMPORT markers would be detected reliably. Which we did not back-patch. Hard to say, since there have been no working builds for HEAD for so long on narwhal :(. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ipc_test
On Fri, Apr 4, 2014 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-04-04 09:31:01 -0400, Robert Haas wrote: Does anybody care about being able to compile ipc_test as a standalone binary any more? I don't. I can't remember the last time I had use for it either. +1 for removal. OK, done. One less thing to worry about when committing! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD
On Mon, Apr 7, 2014 at 10:20 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-04-07 10:06:07 -0400, Robert Haas wrote: I'm a bit suspicious of the patches to static-ify stuff, since that might cause the compiler to think it could move things across function calls that it hadn't thought move-able before, but FastPathStrongLocks references would seem to be the obvious candidate for that, and volatile-izing it ought to have fixed it. I would think. Hm. It generally might be interesting to get a few !X86 buildfarms running builds with LTO enabled. That might expose some dangerous assumptions more easily. I strongly suspect that will break stuff all over the place. We can either get compiler barriers working for real, or we can start volatile-izing every reference in an LWLock-protected critical section. Hint: the second one is insane. That might be off-topic for this issue at hand, though... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bug fix] pg_ctl always uses the same event source
On Sat, Apr 5, 2014 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: So basically, I think having pg_ctl try to do what this patch proposes is a bad idea. I'm not a Windows person either, but I tend to agree. I can't think that this is going to be very robust ... and if it's not going to be robust, what's the point? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD
On 2014-04-07 10:45:51 -0400, Robert Haas wrote: On Mon, Apr 7, 2014 at 10:20 AM, Andres Freund and...@2ndquadrant.com wrote: Hm. It generally might be interesting to get a few !X86 buildfarms running builds with LTO enabled. That might expose some dangerous assumptions more easily. I strongly suspect that will break stuff all over the place. We can either get compiler barriers working for real, or we can start volatile-izing every reference in an LWLock-protected critical section. Hint: the second one is insane. You don't have to convince me. How to get there is where I am not sure we're agreeing. It didn't break anything a few months back for x86 on light loads, btw. Not that that's saying much. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] gsoc knn spgist
On Fri, Apr 4, 2014 at 11:56 AM, Костя Кузнецов chapae...@yandex.ru wrote: I want to implement KNN for SP-GiST. I don't have questions about KNN itself, but I do have questions about the implementation of the interface. I modified pg_am.h (set amcanorderbyop to true for the spgist index). I also modified pg_amop.h (added DATA(insert (4015 600 600 15 o 517 4000 1970 )); ) explain SELECT * FROM quad_point_tbl ORDER BY p <-> '-2,50' Sort (cost=1219.31..1246.82 rows=11003 width=16) Sort Key: ((p <-> '(-2,50)'::point)) -> Index Only Scan using sp_quad_ind on quad_point_tbl (cost=0.15..480.70 rows=11003 width=16) what am I doing wrong? I don't know. I think there are a lot of possibilities based on the information provided. A simple explanation would be that the planner considered the plan you want and found the other one cheaper, but the real cause could well be something else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD
Robert Haas robertmh...@gmail.com writes: On Sun, Apr 6, 2014 at 1:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2014-04-06%2017%3A04%3A00 Uggh. That's unfortunate, but not terribly surprising: I didn't think that missing volatile was very likely to be the cause of this. Yeah. That was a bug, but evidently it's not the bug we're looking for. Have we been getting random failures of this type since the fastlock stuff went in, and we're only just now noticing? Or did some recent change expose this problem? Not sure. I used to rely on the pgbuildfarm-status-green daily digests to cue me to look at transient buildfarm failures, but that list has been AWOL for months. However, I'm pretty sure this has not been happening ever since 9.2, so yeah, it's at least become more probable in the last few months. I'm a bit suspicious of the patches to static-ify stuff, since that might cause the compiler to think it could move things across function calls that it hadn't thought move-able before, but FastPathStrongLocks references would seem to be the obvious candidate for that, and volatile-izing it ought to have fixed it. I would think. Keep in mind also that prairiedog is running a pretty old gcc (4.0.1 if memory serves), so I'd not expect it to be doing any crazy optimizations. I suspect we are looking at some plain old logic bug, but as you say it's hard to guess where exactly. [ LockRefindAndRelease ] lacks an Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0). I think we should add one. Absolutely. regards, tom lane
Re: [HACKERS] GSoC 2014: Implementing clustering algorithms in MADlib
Hi Maxence, This is really really good. Hopefully we will have a fruitful summer, and make MADlib a better product. Thanks Hai -- Pivotal http://www.gopivotal.com/ A new platform for a new era On Sat, Apr 5, 2014 at 7:14 AM, Maxence Ahlouche maxence.ahlou...@gmail.com wrote: Hi all, 2014-04-03 9:04 GMT+02:00 Hai Qian hq...@gopivotal.com: Hi Maxence, We are very glad that you are enthusiastic about MADlib. Your proposal looks very nice. The algorithms that you proposed will be very useful, and can serve as a good start. After learning about the C++ abstraction layer and testing suite, you should be able to quickly grasp how to program for MADlib. We can fill in the details of the plan later. The conflict in your time schedule shouldn't be a problem. It would be nice if you could provide a little bit more introduction to these algorithms and their importance in the proposal. Overall it is very good. And thank you again. Hai As you requested, I've described the clustering algorithms on my github [0]. I'm copying it at the end of this email, just in case. In the case of OPTICS, I still need some time to grasp the details of how it works, so I've only included a basic description of this one. Regards, Maxence [0] https://github.com/viodlen/gsoc_2014/blob/master/algorithm_details.rst Details about the different clustering algorithms = This file describes the k-means algorithm, which is currently the only clustering algorithm implemented in MADlib, and the k-medoids and OPTICS algorithms, which I plan to implement this summer as a GSoC project. I'll also explain the DBSCAN algorithm, as OPTICS is only an extension of this one. The goal of all these algorithms is to determine clusters in a set of points. While this problem is NP-hard, it can be solved efficiently by these algorithms, even though the result is usually not perfect. I've tried not to add too many details, and taken some shortcuts, so there may be some inaccuracies. 
I wanted to keep this readable (not sure that goal was achieved, though), but if anyone wants more detail, I'll gladly add it. k-means --- The k-means algorithm is the one implemented in MADlib. It selects *k* points, either randomly from the dataset or set by the user, to use as initial centroids (centers of clusters). *k* must be defined by the user; the algorithm is unable to guess the number of clusters. All the points in the dataset are then assigned to the nearest centroid, thus making *k* clusters. The next step is to compute the new centroids as the mean of all the points in a cluster. Then reassign all the points to the new centroids, and start all over again until there is no change in cluster assignment. This algorithm is usually pretty fast (even though it can be made to take exponential time to converge). Still, it is easy to get a wrong result because of local convergence, e.g. having one cluster split in two parts by the algorithm, or two clusters merged. It is also pretty sensitive to outliers (points that don't obviously belong to any cluster), and the final result depends greatly on the initial centroids. This algorithm doesn't work well with non-euclidean distances, in general. Another thing to note is that k-means will result in Voronoi-cell-shaped clusters, which is not always what we want. k-medoids - The k-medoids algorithm works the same way as k-means, save for a few exceptions. With k-medoids, the centroids are always points of the dataset (and are then called medoids). The new medoids are the points in clusters which minimize the sum of pairwise dissimilarities (in other terms, the point for which the average distance to other points in the cluster is minimal). This makes the algorithm less sensitive to outliers than k-means. This algorithm is computationally more intensive than k-means, but still fast. 
As with k-means, it is possible to run the algorithm several times with different initial centroids, and keep the best result (i.e. the one that minimizes the sum of distances from points to their centroids). DBSCAN --- DBSCAN (*Density-Based Spatial Clustering of Applications with Noise*) is a clustering algorithm based on the density of clusters. This addresses several limitations of k-means and k-medoids: it does not assign a cluster to outliers, and it allows the detection of weird-shaped clusters. Moreover, it doesn't need to be told the number of clusters. A point is called dense if "enough other points" are "near enough" to it, where "enough other points" and "near enough" are defined by the parameters *min_points* and *epsilon*. *min_points* therefore represents the minimum number of points required to make a cluster, and *epsilon* is the maximum distance a point can be from a cluster to be considered part of this cluster. For every unassigned point, count
Re: [HACKERS] FastPathStrongRelationLocks still has an issue in HEAD
On Mon, Apr 7, 2014 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: [ LockRefindAndRelease ] lacks an Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0). I think we should add one. Absolutely. Turns out there were two places missing such an assertion: the 2PC path, and the abort-strong-lock-acquire path. I added an assertion to both. In theory, if the problem is coming from either of those places, this might even increase the frequency of buildfarm failures, since it removes the necessity for another normal-path release to hit the same partition afterwards. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Fri, Apr 4, 2014 at 4:29 PM, Greg Stark st...@mit.edu wrote: On Fri, Apr 4, 2014 at 12:15 PM, Robert Haas robertmh...@gmail.com wrote: Perhaps I shouldn't lay my own guilt trip on other committers --- but I think it would be a bad precedent to not deal with the existing patch queue first. +1 +1 I don't think we have to promise a strict priority queue and preempt all other development. But I agree loosely that processing patches that have been around should be a higher priority. I've been meaning to do more review for a while and just took a skim through the queue. There are only a couple I feel I can contribute with so I'm going to work on those and then if it's still before the feature freeze I would like to go ahead with Peter's patch. I think it's generally a good patch. To be honest, I think that's just flat-out inappropriate. There were over 100 patches in this CommitFest and there's not a single committed patch that has your name on it even as a reviewer, let alone a committer. When a committer says, hey, I'm going to commit XYZ, that basically forces anybody who might have an objection to it to drop what they're doing and object fast, before it's too late. In other words, the people who just said that they are too busy reviewing patches that were timely submitted and don't want to divert effort from that to handle patches that weren't are going to have to do that anyway, or lose their right to object. I think that's unfair. You're essentially leveraging a commit bit that you haven't used in more than three years to try to push a patch that was submitted months too late to the head of the review queue - and, just to put icing on the cake, it just so happens that you and the patch author work for the same employer. 
I have no objection to people committing patches written by others who work at the same company, but only if those patches have gone through a full, fair, and public review, with ample opportunity for other people to complain if they don't like it. That is obviously not the case here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On 7 April 2014 14:09, Florian Pflug f...@phlo.org wrote: On Apr 5, 2014, at 09:55, Dean Rasheed dean.a.rash...@gmail.com wrote: On 5 April 2014 08:38, Dean Rasheed dean.a.rash...@gmail.com wrote: [snip] releasing it in this state feels a little half-baked to me. I regret writing that almost as soon as I sent it. The last of those queries is now over 10 times faster than HEAD, which is certainly worthwhile. What bugs me is that there is so much more potential in this patch. Well, the perfect is the enemy of the good as they say. By all means, let's get rid of the O(n^2) behaviour in 9.5, but let's get a basic version into 9.4. I think it's pretty close to being committable, but I fear that time is running out for 9.4. I'm not aware of any open issues in the basic patch, other than scaling the reported calling statistics by nloops, which should be trivial. I'll post an updated patch later today. I don't really expect all the add-on patches to make it into 9.4 - they don't seem to have gotten much attention yet - but at least the inverse transition functions for the basic arithmetic aggregates should be doable, I hope. I've just finished reading through all the other patches, and they all look OK to me. It's mostly straightforward stuff, so despite the size it's hopefully all committable once the base patch goes in. I think that you're probably right that optimising window_gettupleslot() to eliminate the O(n^2) behaviour can be left to a later patch --- the existing performance benefits of this patch are enough to justify its inclusion IMO. It would be nice to include the trivial optimisation to window_gettupleslot() that I posted upthread, since it gives such a big improvement for only a few lines of code changed. If you post a new patch set, I'll mark it as ready for committer attention. Regards, Dean
Re: Fwd: [HACKERS] Proposal: variant of regclass
On Sat, Apr 5, 2014 at 1:10 AM, Amit Kapila amit.kapil...@gmail.com wrote: The reason for this behavior is that in out functions (regclassout), we return the OID as-is in case it doesn't exist. One way to fix this is, in case of OID input parameters, we check whether the passed OID exists in the to_* functions and return NULL if it doesn't. I think this way we can retain the similarity of input parameters between the types and the to_* functions, and making the to_* functions return NULL when the OID doesn't exist makes it similar to the case when the user passed a name. We could do that, but that's not my preferred fix. My suggestion is to restructure the code so that to_regclass() only accepts a name, not an OID, and make its charter precisely to perform a name lookup and return an OID (as regclass) or NULL if there's no match. How will we restrict users from passing some number in string form? Do you mean that in case the user has passed an OID, the to_* functions should return NULL, or that if the input is found to be an OID the to_* functions should return an error? Each of the _guts functions first handles the case where the input is exactly -; then it checks for an all-numeric value, which is interpreted as an OID; then it has a lengthy chunk of logic to handle the case where we're in bootstrap mode; and if none of those cases handle the situation, then it ends by doing the lookup in the normal way, in each case marked with a comment that says Normal case. I think that we should do only the last part - what in each case follows the normal case comment - for the to_reg* functions. In other words, let's revert the whole refactoring of this file to create reg*_guts functions, and instead just copy the relevant logic for the name lookups into the new functions. 
For to_regproc(), for example, it would look like this (untested): names = stringToQualifiedNameList(pro_name_or_oid); clist = FuncnameGetCandidates(names, -1, NIL, false, false, false); if (clist == NULL || clist->next != NULL) result = InvalidOid; else result = clist->oid; With that change, this patch will actually get a whole lot smaller, change less already-existing code, and deliver cleaner behavior. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Useless Replica Identity: NOTHING noise from psql \d
On Sat, Apr 5, 2014 at 7:12 AM, Andres Freund and...@anarazel.de wrote: On 2014-04-03 14:49:54 -0400, Andrew Dunstan wrote: I've been kind of hoping that someone would step up on both these items, but the trail seems to have gone cold. I'm going to put out the new buildfarm release with the new module to run test_decoding check. But of course it won't have MSVC support. These can go on my long TODO list unless someone else gets there first. So, I was thinking about how we can improve this situation. There's basically one bigger remaining problem besides make check vs. make installcheck: Currently contrib modules can't easily run isolationtester in a general fashion. That's why test_decoding's Makefile has to rig that all itself. How about simply improving the contrib support to recognize tests in ISOLATION_REGRESS in addition to the current REGRESS? Then we can simply train vcregress.pl to pick them up generally, without special support for test_decoding. IMHO, that's a fine idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Including replication slot data in base backups
On Mon, Apr 7, 2014 at 3:04 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Apr 4, 2014 at 9:57 PM, Michael Paquier michael.paqu...@gmail.com wrote: For 9.4, clearly yes, this would change the semantics of recovery and this is not something wise to do at the end of a development cycle. For 9.5 though, this is a different story. It clearly depends on whether this is thought useful enough to change how recovery fetches WAL files (in this case by scanning existing repslots). There are other things to consider as well, for example: do we reset the restart_lsn of a repslot if the needed WAL files are not there anymore, or abort recovery? I haven't worked much with repslots though... Coming back to that, I am still wondering if for the time being it would not be better to note in the pg_basebackup documentation that replication slot information is not included in a backup, as in the attached patch. Not sure if this is exactly the right way to do it, but I agree that something along those lines is a good idea. I also think, maybe even more importantly, that we should probably document that people using file-copy based hot backups should strongly consider removing the replication slots by hand before using the backup. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Fwd: [HACKERS] Proposal: variant of regclass
Robert Haas robertmh...@gmail.com writes: In other words, let's revert the whole refactoring of this file to create reg*_guts functions, and instead just copy the relevant logic for the name lookups into the new functions. The main discomfort I'd had with this patch was the amount of refactoring it did; that made it hard to verify and I wasn't feeling like it made for much net improvement in cleanliness. If we can do it without that, and have only as much duplicate code as you suggest, then +1. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] Proposal: variant of regclass
On 2014-04-04 11:18:10 -0400, Robert Haas wrote: On Wed, Apr 2, 2014 at 11:27 PM, Amit Kapila amit.kapil...@gmail.com wrote: Right, it will get reset in error. However we still need to free for the missing_ok case and when it is successful in getting the typeid. So don't you think it is better to just free once before calling LookupTypeName()? The code is right in its current form as well; it's just a minor suggestion for improvement, so if you think the code is okay as currently written, then ignore this suggestion. I see. Here's an updated patch with a bit of minor refactoring to clean that up, and some improvements to the documentation. I was all ready to commit this when I got cold feet. What's bothering me is that the patch, as written, mimics the exact behavior of the text-regproc cast, including the fact that supplying an OID, written as a number, will always return that OID, whether it exists or not: rhaas=# select to_regclass('1259'), to_regclass('pg_class'); to_regclass | to_regclass -+- pg_class | pg_class (1 row) rhaas=# select to_regclass('12590'), to_regclass('does_not_exist'); to_regclass | to_regclass -+- 12590 | (1 row) I think that's unacceptably weird behavior. My suggestion is to restructure the code so that to_regclass() only accepts a name, not an OID, and make its charter precisely to perform a name lookup and return an OID (as regclass) or NULL if there's no match. There's actually another good reason to not copy regclass's behaviour: postgres=# CREATE TABLE "123"(); CREATE TABLE postgres=# SELECT '123'::regclass; regclass -- "123" (1 row) I don't think that's fixable for ::regclass, but we shouldn't copy it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Including replication slot data in base backups
On Monday, April 7, 2014, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 7, 2014 at 3:04 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Apr 4, 2014 at 9:57 PM, Michael Paquier michael.paqu...@gmail.com wrote: For 9.4, clearly yes, this would change the semantics of recovery and this is not something wise to do at the end of a development cycle. For 9.5 though, this is a different story. It clearly depends on whether this is thought useful enough to change how recovery fetches WAL files (in this case by scanning existing repslots). There are other things to consider as well, for example: do we reset the restart_lsn of a repslot if the needed WAL files are not there anymore, or abort recovery? I haven't worked much with repslots though... Coming back to that, I am still wondering if for the time being it would not be better to note in the pg_basebackup documentation that replication slot information is not included in a backup, as in the attached patch. Not sure if this is exactly the right way to do it, but I agree that something along those lines is a good idea. I also think, maybe even more importantly, that we should probably document that people using file-copy based hot backups should strongly consider removing the replication slots by hand before using the backup. +1 Fabrízio -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Robert Haas (robertmh...@gmail.com) wrote: On Fri, Apr 4, 2014 at 4:29 PM, Greg Stark st...@mit.edu wrote: I've been meaning to do more review for a while and just took a skim through the queue. There are only a couple I feel I can contribute with so I'm going to work on those and then if it's still before the feature freeze I would like to go ahead with Peter's patch. I think it's generally a good patch. To be honest, I think that's just flat-out inappropriate. There were over 100 patches in this CommitFest and there's not a single committed patch that has your name on it even as a reviewer, let alone a committer. I haven't got any either (except for my little one), which frustrates me greatly. Not because I'm looking for credit on the time that I've spent in discussions, doing reviews, and I could have sworn there was some patch that I did commit, but because I've not been able to find the larger chunks of time required to get the more complex patches in. When a committer says, hey, I'm going to commit XYZ, that basically forces anybody who might have an objection to it to drop what they're doing and object fast, before it's too late. In other words, the people who just said that they are too busy reviewing patches that were timely submitted and don't want to divert effort from that to handle patches that weren't are going to have to do that anyway, or lose their right to object. I don't agree that this is the case. We do revert patches from time to time, when necessary, and issues with this particular patch seem likely to be found during testing, well in advance of any release, and it's self contained enough to be reverted pretty easily. Perhaps it's not fair to expect everyone to have realized that, but at least any committer looking at it would, I believe, weigh those against it being a late patch. I think that's unfair. 
You're essentially leveraging a commit bit that you haven't used in more than three years to try to push a patch that was submitted months too late to the head of the review queue - and, just to put icing on the cake, it just so happens that you and the patch author work for the same employer. I have no objection to people committing patches written by others who work at the same company, but only if those patches have gone through a full, fair, and public review, with ample opportunity for other people to complain if they don't like it. That is obviously not the case here. As for this- it's disingenuous at best and outright accusatory at worst. The reason for this discussion is entirely because of PGConf.NYC, imv, where Peter brought this patch up with at least Greg, Magnus and I. Also during PGConf.NYC, Greg was specifically asking me about patches which were in the commitfest that could be reviewed, ideally without having to go back through threads hundreds of messages long and dealing with complex parts of the code. Sadly, as is often the case, the easy ones get handled pretty quickly and the difficult ones get left behind. One bad part of the commitfest, imv, is that when a clearly good, small patch does manage to show up, we don't formally take any of that into consideration when we are prioritizing patches. They end up being informally prioritized and get in quickly at the beginning, but that doesn't address the reality that those smaller patches likely would get in without any kind of commitfest process and not letting them in because of the commitfest process doesn't generally mean that the larger patches are any more likely to get in- iow, it's not a zero-sum game. 
We have quite a few part-time committers (or at least, committers who have disproportionate amounts of time, or perhaps the difference is in the size of the blocks of time which can be dedicated to PG) and saying no, you can't help unless you tackle the hard problems doesn't particularly move us forward. All that said, I don't have any particularly good idea of how to fix any of this- it's not fair to tell the committers who have more time (or larger blocks of time, etc) you must work the hard problems only either. I don't feel Greg's interest in this patch has anything to do with his current employment and everything to do with the side-talks in NYC, and to that point, I'm very tempted to go look at this patch myself because it sounds like an exciting improvement with minimal effort. Would I feel bad for doing so, with the CustomScan API and Updatable Security Barrier Views patches still pending? Sure, but it's the difference between finding an hour and finding 8. The hour will come pretty easily (probably spent half that on this email..), while an 8-hour block, which would likely turn into more, is nigh-on impossible til at least this weekend. And, no, 8x one-hour blocks would not be worthwhile; I've tried that before. Thanks, Stephen
Re: Fwd: [HACKERS] Proposal: variant of regclass
Andres Freund and...@2ndquadrant.com writes: There's actually another good reason not to copy regclass's behaviour:

postgres=# CREATE TABLE "123"();
CREATE TABLE
postgres=# SELECT '123'::regclass;
 regclass
----------
 123
(1 row)

I don't think that's fixable for ::regclass, but we shouldn't copy it. I think that's not proving what you thought; the case is correctly handled if you quote:

regression=# create table "123" (z int);
CREATE TABLE
regression=# select '123'::regclass;
 regclass
----------
 123
(1 row)

regression=# select '"123"'::regclass;
 regclass
----------
 "123"
(1 row)

But I agree that we don't want these functions accepting numeric OIDs, even though ::regclass must. regards, tom lane
Re: Fwd: [HACKERS] Proposal: variant of regclass
On 2014-04-07 12:59:36 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: There's actually another good reason not to copy regclass's behaviour:

postgres=# CREATE TABLE "123"();
CREATE TABLE
postgres=# SELECT '123'::regclass;
 regclass
----------
 123
(1 row)

I don't think that's fixable for ::regclass, but we shouldn't copy it. I think that's not proving what you thought; the case is correctly handled if you quote: I know, but it's a pretty easy mistake to make. Many if not most of the usages of regclass I have seen were wrong in that way. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 1:01 PM, Stephen Frost sfr...@snowman.net wrote: All that said, I don't have any particularly good idea of how to fix any of this- it's not fair to tell the committers who have more time (or larger blocks of time, etc) you must work the hard problems only either. I don't feel Greg's interest in this patch has anything to do with his current employment and everything to do with the side-talks in NYC, and to that point, I'm very tempted to go look at this patch myself because it sounds like an exciting improvement with minimal effort. Would I feel bad for doing so, with the CustomScan API and Updatable Security Barrier Views patches still pending? Sure, but it's the difference between finding an hour and finding 8. The hour will come pretty easily (probably spent half that on this email..), while an 8-hour block, which would likely turn into more, is nigh-on impossible til at least this weekend. And, no, 8x one-hour blocks would not be worthwhile; I've tried that before. If it's only going to take you an hour to address this patch (or 8 to address those other ones) then you spend a heck of a lot less time on review for a patch of a given complexity level than I do. I agree that it's desirable to slip things in, from time to time, when they're uncontroversial and obviously meritorious, but I'm not completely convinced that this is such a case. As an utterly trivial point, I find the naming to be less than ideal: poorman is not a term I want to enshrine in our code. That's not very descriptive of what the patch is actually doing even if you know what the idiom means, and people whose first language is not English - many of whom do significant work on our code - may not. Now the point is not that that's a serious flaw in and of itself. The point is that these kinds of issues deserve to be discussed and agreed on, and the process should be structured in a way that permits that.
And that discussion will require the time not only of the people who find this patch more interesting than any other, but also of the people who just said that they're busy with other things right now, unless those people want to forfeit their right to an opinion. Experience has shown that it's a whole lot easier for anyone here to get a patch changed before it's committed than after it's committed, so I don't buy your argument that the timing there doesn't matter. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 10:23 AM, Robert Haas robertmh...@gmail.com wrote: As an utterly trivial point, I find the naming to be less than ideal: poorman is not a term I want to enshrine in our code. That's not very descriptive of what the patch is actually doing even if you know what the idiom means, and people whose first language is not English - many of whom do significant work on our code - may not. I didn't come up with the idea, or the name. -- Peter Geoghegan
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
KONDO Mitsumasa wrote: Hi all, I think this patch is completely forgotten, and feel very unfortunate:( Min, max, and stddev are basic statistics in general monitoring tools, so I'd like to push it. I just noticed that this patch not only adds min, max, and stddev, but it also adds the ability to reset an entry's counters. This hasn't been mentioned in this thread at all; there has been no discussion on whether this is something we want to have, nor on whether this is the right API. What it does is add a new function pg_stat_statements_reset_time() which resets the min and max values of all entries, without touching the stddev state variables or the number of calls. Note there's no ability to reset the values for a single entry. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
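For reference, the reset interface Álvaro describes would look like this in use (function name as it appears in the patch under discussion, which was never committed in this form; a sketch, not the pg_stat_statements API that ships with PostgreSQL):

```sql
-- Per the description above: resets only the min/max times, for all
-- entries at once; stddev state and call counts are left untouched,
-- and there is no per-entry variant.
SELECT pg_stat_statements_reset_time();
```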
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 10:42 AM, Andres Freund and...@2ndquadrant.com wrote: I didn't come up with the idea, or the name. Doesn't mean it needs to be enshrined everywhere. I don't think Robert's against putting it in some comments. That seems reasonable. If someone wants to call what I have here semi-reliable normalized keys, for example, I have no objection whatsoever. In fact, I think that's probably a good idea. -- Peter Geoghegan
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 1:23 PM, Robert Haas robertmh...@gmail.com wrote: As an utterly trivial point, I find the naming to be less than ideal: poorman is not a term I want to enshrine in our code. That's not very descriptive of what the patch is actually doing even if you know what the idiom means, and people whose first language is not English - many of whom do significant work on our code - may not. To throw out one more point that I think is problematic, Peter's original email on this thread gives a bunch of examples of strxfrm() normalization that all differ in the first few bytes - but so do the underlying strings. I *think* (but don't have time to check right now) that on my MacOS X box, strxfrm() spits out 3 bytes of header junk and then 8 bytes per character in the input string - so comparing the first 8 bytes of the strxfrm()'d representation would amount to comparing part of the first character. If for any reason the first byte is the same (or similar enough) on many of the input strings, then this will probably work out to be slower rather than faster. Even if other platforms are more space-efficient (and I think at least some of them are), I think it's unlikely that this optimization will ever pay off for strings that don't differ in the first 8 bytes. And there are many cases where that could be true a large percentage of the time throughout the input, e.g. YYYY-MM-DD HH:MM:SS timestamps stored as text. It seems likely that the patch pessimizes those cases, though of course there's no way to know without testing. Now it *may well be* that after doing some research and performance testing we will conclude that either no commonly-used platforms show any regressions or that the regressions that do occur are discountable in view of the benefits to more common cases. I just don't think mid-April is the right time to start those discussions with the goal of a 9.4 commit; and I also don't think committing without having those discussions is very prudent.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
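Robert's timestamp worst case is easy to set up. The sketch below (table and data invented here purely for illustration) builds text keys that agree in their first 14 bytes, so any scheme that compares only a short strxfrm() prefix can never decide the order from the prefix alone and must fall back to full comparisons for every pair:

```sql
-- Hypothetical illustration of the pessimal case described above:
-- timestamps stored as text share a long common prefix, so the
-- leading bytes of their strxfrm() blobs are identical too.
CREATE TABLE ts_text (ts text);

INSERT INTO ts_text
SELECT '2014-04-07 10:' || lpad(g::text, 2, '0') || ':00'
FROM generate_series(0, 59) AS g;

-- All 60 keys agree through '2014-04-07 10:'; sorting them exercises
-- the tie-breaking path on every comparison.
SELECT ts FROM ts_text ORDER BY ts;
```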
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On 2014-04-07 10:29:53 -0700, Peter Geoghegan wrote: On Mon, Apr 7, 2014 at 10:23 AM, Robert Haas robertmh...@gmail.com wrote: As an utterly trivial point, I find the naming to be less than ideal: poorman is not a term I want to enshrine in our code. That's not very descriptive of what the patch is actually doing even if you know what the idiom means, and people whose first language is not English - many of whom do significant work on our code - may not. I didn't come up with the idea, or the name. Doesn't mean it needs to be enshrined everywhere. I don't think Robert's against putting it in some comments. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On 2014-04-07 13:01:52 -0400, Stephen Frost wrote: I haven't got any either (except for my little one), which frustrates me greatly. Not because I'm looking for credit on the time that I've spent in discussions, doing reviews, and I could have sworn there was some patch that I did commit, but because I've not been able to find the larger chunks of time required to get the more complex patches in. I am a bit confused. To my eyes there's been a huge number of actually trivial patches in this commitfest. Even now, there are some:

* Bugfix for timeout in LDAP connection parameter resolution
* Problem with displaying wide tables in psql
* Enable CREATE FOREIGN TABLE (... LIKE ... )
* Add min, max, and stddev execute statement time in pg_stat_statement
* variant of regclass etc.
* vacuumdb: Add option --analyze-in-stages

All are small patches that don't need major changes before getting committed. That's after three months, and after a high number of smaller patches committed by Tom on Friday. When a committer says, hey, I'm going to commit XYZ, that basically forces anybody who might have an objection to it to drop what they're doing and object fast, before it's too late. In other words, the people who just said that they are too busy reviewing patches that were timely submitted and don't want to divert effort from that to handle patches that weren't are going to have to do that anyway, or lose their right to object. I don't agree that this is the case. We do revert patches from time to time, when necessary, and issues with this particular patch seem likely to be found during testing, well in advance of any release, and it's self-contained enough to be reverted pretty easily. Given the track record the project seems to have with testing, I don't have much faith in that claim. But even if it does, it'll only get you testing on 2-3 platforms, without noticing portability issues. I think it'd be a different discussion if this were CF-1 or so.
But we're nearly *2* months after the *end* of the last CF. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Robert Haas (robertmh...@gmail.com) wrote: If it's only going to take you an hour to address this patch (or 8 to address those other ones) then you spend a heck of a lot less time on review for a patch of a given complexity level than I do. Eh, I don't really time it and I'm probably completely off in reality, it's more of a relative feeling thing wrt how long it'd take. I was also thinking about it from a 'initial review and provide feedback' standpoint, actually getting to a point of committing something certainly takes much longer, but I can also kick off tests and do individual testing in those smaller time blocks. Reading the code and understanding it, writing up the feedback email, etc, is what requires the larger single block. Perhaps I can work on improving that for myself and maybe find a way to do it in smaller chunks, but that hasn't happened in the however-many-years, and there's the whole 'old dog, new tricks' issue. I agree that it's desirable to slip things in, from time to time, when they're uncontroversial and obviously meritorious, but I'm not completely convinced that this is such a case. As an utterly trivial point, I find the naming to be less than ideal: poorman is not a term I want to enshrine in our code. That's not very descriptive of what the patch is actually doing even if you know what the idiom means, and people whose first language - many of whom do significant work on our code - may not. Fair enough. Now the point is not that that's a serious flaw in and of itself. The point is that these kinds of issues deserve to be discussed and agreed on, and the process should be structured in a way that permits that. The issue on it being called poorman? That doesn't exactly strike me as needing a particularly long discussion, nor that it would be difficult to change later. I agree that there may be other issues, and it'd be great to get buy-in from everyone before anything goes in, but there's really zero hope of that being a reality. 
And that discussion will require the time not only of the people who find this patch more interesting than any other, but also of the people who just said that they're busy with other things right now, unless those people want to forfeit their right to an opinion. I certainly hope that no committer feels that they forfeit their right to an opinion about a piece of code because they didn't object to it before it was committed. My experience is that committed code gets reviewed and concerns are raised, at which point it's usually on the original committer to go back and fix it; which I'm certainly glad for. Experience has shown that it's a whole lot easier for anyone here to get a patch changed before it's committed than after it's committed, so I don't buy your argument that the timing there doesn't matter. Once code has been released and there are external dependencies on it, that's obviously an issue. Ahead of that, I feel like the issue is really more one of interest- everyone is very interested when code is about to go in, but once it's in, for whatever reason, the interest becomes much less to review and comment on it. I feel like that's a very long-standing issue as it relates to our 'beta' period because doing code review just simply isn't fun and the motivation is reduced once it's been committed. That makes me wonder about having beta review-fests (in fact, I feel like that may have been proposed before...), where commits are assigned out to be reviewed by someone other than the committer/author/original reviewer, as a way to motivate individuals to go review what has gone in before we get to release. That might help us ensure that more of the committed code *gets* another review before release and reduce the issues we have post-release. Just a thought. Thanks, Stephen
Re: [HACKERS] Firing trigger if only
On Thu, Apr 3, 2014 at 8:50 AM, Gabriel yu1...@gmail.com wrote: Good afternoon all. I have some problem with triggers on PostgreSQL 8.4. I have a trigger on a specific table (articles) that fires on update statements:

CREATE OR REPLACE FUNCTION trigger_articles_update() RETURNS trigger AS
$BODY$
BEGIN
  INSERT INTO article_change (article_id, change_type)
  VALUES (OLD.article_id, 2);
  RETURN NULL;
END
$BODY$ LANGUAGE 'plpgsql' VOLATILE COST 100;

ALTER FUNCTION trigger_articles_update() OWNER TO postgres;

I have 2 different applications that perform updates on table articles (written in Delphi and using ZeosDB). My problem is that I want the trigger to fire only when performing an update with the first application, but not with the second. I know that triggers are supposed to fire on every change on a table, but this is a specific problem that I have. Any hint appreciated. 8) Since version 9.0 of PostgreSQL you can set 'application_name' in the connection [1] and test it in your trigger using a query like:

regress=# SELECT application_name FROM pg_stat_activity WHERE pid = pg_backend_pid();
 application_name
------------------
 psql
(1 row)

I strongly recommend you upgrade your 8.4 because its EOL [2] is July 2014. But if an upgrade isn't an option for now then you can use the old custom_variable_classes mechanism [3] to set your application name, i.e.:

1) Add to your postgresql.conf:

custom_variable_classes = 'foo'
foo.application_name = 'undefined'

2) Reload your PostgreSQL

3) Use the following functions to get/set the 'foo.application_name' custom variable:

-- get
SELECT current_setting('foo.application_name');
-- set (set_config takes a third argument: false = keep for the session)
SELECT set_config('foo.application_name', 'myapp', false);

4) Now you can use these functions to register the name of your application and use it in your trigger.
Regards, Fabrízio [1] http://www.postgresql.org/docs/current/static/runtime-config-logging.html#GUC-APPLICATION-NAME [2] http://www.postgresql.org/support/versioning/ [3] http://www.postgresql.org/docs/8.4/static/runtime-config-custom.html
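Putting the advice above together, the trigger from the question could guard on application_name like this on 9.0+ (the name 'app2' is a placeholder for whatever the second Delphi application sets in its connection string; a sketch, not tested against the original schema):

```sql
-- Sketch: log article changes only when the update does NOT come
-- from the second application.
CREATE OR REPLACE FUNCTION trigger_articles_update() RETURNS trigger AS
$BODY$
BEGIN
  IF current_setting('application_name') <> 'app2' THEN
    INSERT INTO article_change (article_id, change_type)
    VALUES (OLD.article_id, 2);
  END IF;
  RETURN NULL;
END
$BODY$ LANGUAGE plpgsql;
```

On 8.4 with custom_variable_classes, the same idea works with current_setting('foo.application_name') <> 'app2' as the condition.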
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Andres Freund (and...@2ndquadrant.com) wrote: On 2014-04-07 13:01:52 -0400, Stephen Frost wrote: I haven't got any either (except for my little one), which frustrates me greatly. Not because I'm looking for credit on the time that I've spent in discussions, doing reviews, and I could have sworn there was some patch that I did commit, but because I've not been able to find the larger chunks of time required to get the more complex patches in. I am a bit confused. To my eyes there's been a huge number of actually trivial patches in this commitfest. Even now, there are some:

* Bugfix for timeout in LDAP connection parameter resolution

I can take a look at that (if no one else wants to speak up about it).

* Problem with displaying wide tables in psql

That's not without controversy, as I understand it, but I admit that I haven't been following it terribly closely.

* Enable CREATE FOREIGN TABLE (... LIKE ... )

This has definitely got issues which are not trivial; see Tom's recent email on the subject.

* Add min, max, and stddev execute statement time in pg_stat_statement

This was also quite controversial. If we've finally settled on this as being acceptable then perhaps it can get in pretty easily.

* variant of regclass etc.

This was recently being discussed also.

* vacuumdb: Add option --analyze-in-stages

Haven't looked at this at all. All are small patches that don't need major changes before getting committed. That strikes me as optimistic. I do plan to go do another pass through the commitfest patches before looking at other things (as Greg also said he would do); thanks for bringing up the ones you feel are more manageable- it'll help me focus on them. Given the track record the project seems to have with testing, I don't have much faith in that claim. But even if it does, it'll only get you testing on 2-3 platforms, without noticing portability issues.
This would be another case where it'd be nice if we could give people access to the buildfarm w/o having to actually commit something. I think it'd be a different discussion if this were CF-1 or so. But we're nearly *2* months after the *end* of the last CF. There wouldn't be any discussion if it was CF-1 as I doubt anyone would object to it going in (or at least not as strongly..), even if it was submitted after CF-1 was supposed to be over with remaining patches. It's the threat of getting punted to the next release that really makes the difference here, imv. Thanks, Stephen
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 1:58 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: If it's only going to take you an hour to address this patch (or 8 to address those other ones) then you spend a heck of a lot less time on review for a patch of a given complexity level than I do. Eh, I don't really time it and I'm probably completely off in reality, it's more of a relative feeling thing wrt how long it'd take. I was also thinking about it from a 'initial review and provide feedback' standpoint, actually getting to a point of committing something certainly takes much longer, but I can also kick off tests and do individual testing in those smaller time blocks. Reading the code and understanding it, writing up the feedback email, etc, is what requires the larger single block. Perhaps I can work on improving that for myself and maybe find a way to do it in smaller chunks, but that hasn't happened in the however-many-years, and there's the whole 'old dog, new tricks' issue. It takes me a big block of time, too, at least for the initial review. The issue on it being called poorman? That doesn't exactly strike me as needing a particularly long discussion, nor that it would be difficult to change later. I agree that there may be other issues, and it'd be great to get buy-in from everyone before anything goes in, but there's really zero hope of that being a reality. Really? I think there have been just about zero patches that have gone in in the last (ugh) three months that have had significant unaddressed objections from anyone at the time they were committed. There has certainly been an enormous amount of work by a whole lot of people to address objections that have been raised, and generally that has gone well. I will not pretend that every patch that has gone in is completely well-liked by everyone and I am sure that is not the case. Nevertheless I think we've done pretty well. 
Now the people whose stuff has not got in - and may not get in - may well feel that their stuff got short shrift, and I'm not going to deny that there's a problem there. But breaking from our usual procedure for this patch is just adding to that unfairness, not making anything better. And that discussion will require the time not only of the people who find this patch more interesting than any other, but also of the people who just said that they're busy with other things right now, unless those people want to forfeit their right to an opinion. I certainly hope that no committer feels that they forfeit their right to an opinion about a piece of code because they didn't object to it before it was committed. My experience is that committed code gets reviewed and concerns are raised, at which point it's usually on the original committer to go back and fix it; which I'm certainly glad for. Sure, people can object to anything whenever they like. But it becomes an uphill battle once it goes in, unless the breakage is rather flagrant. Regardless, if this patch had had multiple, detailed reviews finding lots of issues that were then addressed, I'd probably be keeping my trap shut and hoping for the best. But it hasn't. Any patch of this size is going to have nitpicky stuff that needs to be addressed, if nothing else, and points requiring some modicum of public discussion, and there hasn't been any of that. That suggests to me that it just hasn't been thoroughly reviewed yet, at least not in public, and that should happen long before anyone talks about picking it up for commit. If this patch had been timely submitted and I'd picked it up right away, I would have expected at least three weeks to elapse between the time my first review was posted and the time all the details were nailed down, even assuming no more serious problems were found, and starting that ball rolling now means looking forward to a commit around the end of April assuming things go great. 
That seems way too late to me; IMHO, we should already be mostly in mop-up mode now (hence my recent DSM patch cleanup patch, which no one has commented on...). And I think it's likely that, in fact, we'll find cases where this regresses performance rather badly, for reasons sketched in an email I posted a bit ago. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Robert Haas (robertmh...@gmail.com) wrote: To throw out one more point that I think is problematic, Peter's original email on this thread gives a bunch of examples of strxfrm() normalization that all differ in the first few bytes - but so do the underlying strings. I *think* (but don't have time to check right now) that on my MacOS X box, strxfrm() spits out 3 bytes of header junk and then 8 bytes per character in the input string - so comparing the first 8 bytes of the strxfrm()'d representation would amount to comparing part of the first character. If for any reason the first byte is the same (or similar enough) on many of the input strings, then this will probably work out to be slower rather than faster. Even if other platforms are more space-efficient (and I think at least some of them are), I think it's unlikely that this optimization will ever pay off for strings that don't differ in the first 8 bytes. And there are many cases where that could be true a large percentage of the time throughout the input, e.g. YYYY-MM-DD HH:MM:SS timestamps stored as text. It seems likely that the patch pessimizes those cases, though of course there's no way to know without testing. Portability and performance concerns were exactly what worried me as well. It was my hope/understanding that this was a clear win which was vetted by other large projects across multiple platforms. If that's actually in doubt and it isn't a clear win then I agree that we can't be trying to squeeze it in at this late date. Now it *may well be* that after doing some research and performance testing we will conclude that either no commonly-used platforms show any regressions or that the regressions that do occur are discountable in view of the benefits to more common cases. I just don't think mid-April is the right time to start those discussions with the goal of a 9.4 commit; and I also don't think committing without having those discussions is very prudent.
I agree with this in concept, but I'd be willing to spend a bit of time researching it, given that it's from a well-known and respected author who I trust has done much of this research already. Thanks, Stephen
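Robert's worry - that an 8-byte prefix of the strxfrm() output may carry almost no distinguishing information - is easy to demonstrate. Below is a hedged sketch, not anything from the patch: the helper name, the 256-byte buffers, and the fixed 8-byte prefix width are all illustrative, and the C locale is forced so that strxfrm() degenerates to a byte copy (real collations produce larger, platform-dependent blobs, which only makes the prefix less informative).

```c
#include <string.h>
#include <locale.h>

/* Return 1 if the first 8 strxfrm() bytes of a and b are identical,
 * i.e. an 8-byte abbreviated key cannot tell the two strings apart.
 * The C locale is forced so strxfrm() degenerates to a byte copy. */
static int abbrev_tie(const char *a, const char *b)
{
    char xa[256] = {0};   /* zeroed so the 8-byte memcmp never reads junk */
    char xb[256] = {0};

    setlocale(LC_COLLATE, "C");
    strxfrm(xa, a, sizeof xa);
    strxfrm(xb, b, sizeof xb);
    return memcmp(xa, xb, 8) == 0;
}
```

Under this sketch, abbrev_tie("2014-04-07 10:47:00", "2014-04-07 10:47:01") reports a tie - exactly the timestamps-stored-as-text case Robert raises - while abbrev_tie("alpha", "bravo") is resolved on the very first byte.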
[HACKERS] WAL replay bugs
I've been playing with a little hack that records a before and after image of every page modification that is WAL-logged, and writes the images to a file along with the LSN of the corresponding WAL record. I set up a master-standby replication with that hack in place in both servers, and ran the regression suite. Then I compared the after images after every WAL record, as written on master, and as replayed by the standby. The idea is that the page content in the standby after replaying a WAL record should be identical to the page in the master, when the WAL record was generated. There are some known cases where that doesn't hold, but it's a useful sanity check.

To reduce noise, I've been focusing on one access method at a time, filtering out others. I did that for GIN first, and indeed found a bug in my new incomplete-split code, see commit 594bac42. After fixing that, and zeroing some padding bytes (38a2b95c), I'm now getting a clean run with that.

Next, I took on GiST, and lo-and-behold found a bug there pretty quickly as well. This one has been there ever since we got Hot Standby: the redo of a page update (e.g. an insertion) resets the right-link of the page. If there is a concurrent scan, in a hot standby server, that scan might still need the rightlink, and will hence miss some tuples. This can be reproduced like this:

1. in master, create test table:

CREATE TABLE gisttest (id int4);
CREATE INDEX gisttest_idx ON gisttest USING gist (id);
INSERT INTO gisttest SELECT g * 1000 from generate_series(1, 10) g;

-- Test function. Starts a scan, fetches one row from it, then waits 10
-- seconds until fetching the rest of the rows.
-- Returns the number of rows scanned. Should be 10 if you follow
-- these test instructions.
CREATE OR REPLACE FUNCTION gisttestfunc() RETURNS int AS $$
declare
  i int4;
  t text;
  cur CURSOR FOR SELECT 'foo' FROM gisttest WHERE id = 0;
begin
  set enable_seqscan=off;
  set enable_bitmapscan=off;
  i = 0;
  OPEN cur;
  FETCH cur INTO t;
  perform pg_sleep(10);
  LOOP
    EXIT WHEN NOT FOUND; -- this is bogus on first iteration
    i = i + 1;
    FETCH cur INTO t;
  END LOOP;
  CLOSE cur;
  RETURN i;
END;
$$ LANGUAGE plpgsql;

2. in standby: SELECT gisttestfunc(); -- blocks

3. Quickly, before the scan in standby continues, cause some page splits:

INSERT INTO gisttest SELECT g * 1000+1 from generate_series(1, 10) g;

4. The scan in standby finishes. It should return 10, but will return a lower number if you hit the bug.

At a quick glance, I think fixing that is just a matter of not resetting the right-link. I'll take a closer look tomorrow, but for now I just wanted to report what I've been doing. I'll post the scripts I've been using later too - nag me if I don't.

- Heikki

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
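The consistency check Heikki describes - the master's after-image versus the standby's replayed page, compared per WAL record - ultimately reduces to a bytewise diff of two page images. A minimal sketch of that core comparison, with an illustrative function name (this is not Heikki's actual tool, which also has to mask the known-to-differ fields he alludes to):

```c
#include <stddef.h>

/* Return the offset of the first byte where the standby's replayed page
 * differs from the master's after-image, or -1 if they are identical.
 * In PostgreSQL, len would be BLCKSZ (8192 by default). */
static long first_diff(const unsigned char *master,
                       const unsigned char *standby, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (master[i] != standby[i])
            return (long) i;
    return -1;
}
```

A checker built on this would iterate over the (LSN, page image) records dumped on each side, pairing them by LSN and reporting the first offending record; the masking of legitimately-divergent fields (hint bits, unused space) is the part that makes the real tool non-trivial.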
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 10:47 AM, Robert Haas robertmh...@gmail.com wrote: To throw out one more point that I think is problematic, Peter's original email on this thread gives a bunch of examples of strxfrm() normalization that all differ in the first few bytes - but so do the underlying strings. I *think* (but don't have time to check right now) that on my MacOS X box, strxfrm() spits out 3 bytes of header junk and then 8 bytes per character in the input string - so comparing the first 8 bytes of the strxfrm()'d representation would amount to comparing part of the first byte. If for any reason the first byte is the same (or similar enough) on many of the input strings, then this will probably work out to be slower rather than faster. Even if other platforms are more space-efficient (and I think at least some of them are), I think it's unlikely that this optimization will ever pay off for strings that don't differ in the first 8 bytes.

Why would any platform have header bytes in the resulting binary strings? That doesn't make any sense. Are you sure you aren't thinking of the homogeneous trailing bytes that you can also see in my example? The only case that this patch could possibly regress is where there are strings that differ beyond about the first 8 bytes, but are not identical (we chance a memcmp() == 0 before doing a full strcoll() when tie-breaking on the semi-reliable initial comparison). We still always avoid fmgr-overhead (and shim overhead, which I've measured), as in your original patch - you put that at adding 7% at the time, which is likely to make up for otherwise-regressed cases. There is nothing at all contrived about my test-case. You have to have an awfully large number of significantly similar but not identical strings in order to possibly lose out.
Even if you have such a case, and the fmgr-trampoline-elision doesn't make up for it (doesn't make up for having to do a separate heapattr lookup on the minimal tuple, an optimization not too relevant for pass-by-reference types), which is quite a stretch, it seems likely that you have other cases that do benefit, which in aggregate makes up for it. The benefits I've shown, on the first test case I picked, are absolutely enormous.

Now, let's assume that I'm wrong about all this, and that in fact there is a plausible case where all of those tricks don't work out, and someone has a complaint about a regression. What are we going to do about that? Not accelerate text sorting by at least a factor of 3 for the benefit of some very narrow use-case? That's the only measure I can see that you could take to not regress that case.

-- Peter Geoghegan
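Peter's description amounts to a three-step comparator: the abbreviated keys decide whenever they differ; on a tie, a cheap memcmp() detects outright equality (skipping strcoll() entirely); only genuinely distinct strings with equal abbreviated keys pay for a full collation-aware comparison. The sketch below is an illustrative reconstruction of that scheme, not the patch's actual code - names are invented, and strcmp() stands in for the locale-aware strcoll():

```c
#include <string.h>

/* strcmp() stands in here for the locale-aware strcoll(). */
static int full_cmp(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* ka/kb are the abbreviated keys (e.g. the first strxfrm() bytes packed
 * into an integer); a/b are the full strings with lengths la/lb. */
static int abbrev_key_cmp(unsigned long ka, unsigned long kb,
                          const char *a, size_t la,
                          const char *b, size_t lb)
{
    if (ka != kb)
        return (ka < kb) ? -1 : 1;    /* abbreviated keys decide outright */
    if (la == lb && memcmp(a, b, la) == 0)
        return 0;                     /* bytewise equal: no strcoll() needed */
    return full_cmp(a, b);            /* genuine tie-break, the slow path */
}
```

The regression Robert worries about is the third branch: inputs that routinely survive both fast paths pay for the abbreviated comparison and the memcmp() on top of the full comparison they would have done anyway.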
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On 2014-04-07 14:12:09 -0400, Stephen Frost wrote: I can take a look at that (if no one else wants to speak up about it). * Problem with displaying wide tables in psql That's not without controversy, as I understand it, but I admit that I haven't been following it terribly closely. There didn't seem to be any conflicts here? I am talking about http://archives.postgresql.org/message-id/CAJTaR32A1_d0DqP25T4%3DLwE3RpmhNf3oY%3Dr0-ksejepfPv6O%3Dw%40mail.gmail.com

* Enable CREATE FOREIGN TABLE (... LIKE ... ) This has definitely got issues which are not trivial, see Tom's recent email on the subject.. Yea. Besides others, he confirmed my comments. The issue there was basically that I didn't like something, others disagreed. Still needed a committer's input.

* Add min, max, and stdev execute statement time in pg_stat_statement This was also quite controversial. If we've finally settled on this as being acceptable then perhaps it can get in pretty easily. The minimal variant (just stddev) didn't seem to be all that controversial.

I think it'd be a different discussion if this were CF-1 or so. But we're nearly *2* months after the *end* of the last CF. There wouldn't be any discussion if it was CF-1 as I doubt anyone would object to it going in (or at least not as strongly..), even if it was submitted after CF-1 was supposed to be over with remaining patches. It's the threat of getting punted to the next release that really makes the difference here, imv. I can understand that for feature patches which your company/client needs, but for a localized performance improvement? Unconvinced. And sorry, if the threat of getting punted to the next release plays a significant role in here, the patch *needs* to be punted. The only reason I can see at this point is ah, trivial enough, let's just do this now.
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: I think it'd be a different discussion if this were CF-1 or so. But we're nearly *2* months after the *end* of the last CF. There wouldn't be any discussion if it was CF-1 as I doubt anyone would object to it going in (or at least not as strongly..), even if it was submitted after CF-1 was supposed to be over with remaining patches. It's the threat of getting punted to the next release that really makes the difference here, imv.

That's why we have this rule that CF4 should only receive patches that were already reviewed in previous commitfests. I, too, find the fast-tracking of this patch completely outside of the CF process to be distasteful. We summarily reject much smaller patches at the end of each cycle, even when the gain is as obvious as is claimed to be for this patch. TBH I don't see why we're even discussing this.

-- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 2:12 PM, Stephen Frost sfr...@snowman.net wrote: I think it'd be a different discussion if this were CF-1 or so. But we're nearly *2* months after the *end* of the last CF. There wouldn't be any discussion if it was CF-1 as I doubt anyone would object to it going in (or at least not as strongly..), even if it was submitted after CF-1 was supposed to be over with remaining patches. It's the threat of getting punted to the next release that really makes the difference here, imv.

The point is that if this had been submitted for CF-1, CF-2, CF-3, or CF-4, and I had concerns about it (which I do), then I would have budgeted time to record those concerns so that they could be discussed and, if necessary, addressed. Since it wasn't, I assumed I didn't need to worry about studying the patch, figuring out which of my concerns were actually legitimate, searching for other areas of potential concern, and putting together a nice write-up until June - and maybe not even then, because at that point other people might also be studying the patch and might cover those areas in sufficient detail as to obviate my own concerns.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Robert Haas (robertmh...@gmail.com) wrote: On Mon, Apr 7, 2014 at 1:58 PM, Stephen Frost sfr...@snowman.net wrote: The issue on it being called poorman? That doesn't exactly strike me as needing a particularly long discussion, nor that it would be difficult to change later. I agree that there may be other issues, and it'd be great to get buy-in from everyone before anything goes in, but there's really zero hope of that being a reality.

Really? I think there have been just about zero patches that have gone in in the last (ugh) three months that have had significant unaddressed objections from anyone at the time they were committed.

That implies that everyone has been reviewing everything, which is certainly not entirely the case... If that's happening then I'm not sure I understand the point of the commitfest, and feel I must be falling down on the job here. Certainly there are patches which have been picked up by committers and committed that I've not reviewed. I also didn't object, but if I find issue with them in the future, I'd certainly bring them up, even post-commit.

There has certainly been an enormous amount of work by a whole lot of people to address objections that have been raised, and generally that has gone well. I will not pretend that every patch that has gone in is completely well-liked by everyone and I am sure that is not the case. Nevertheless I think we've done pretty well. Now the people whose stuff has not got in - and may not get in - may well feel that their stuff got short shrift, and I'm not going to deny that there's a problem there. But breaking from our usual procedure for this patch is just adding to that unfairness, not making anything better.

This goes back to the point which I was trying to make earlier: prioritization. Small, clearly-good patches should be prioritized higher, but identifying those is no small matter.
Of course, we would need to address the starvation risk when it comes to larger patches, which I've got no answer for.

Regardless, if this patch had had multiple, detailed reviews finding lots of issues that were then addressed, I'd probably be keeping my trap shut and hoping for the best. But it hasn't. Any patch of this size is going to have nitpicky stuff that needs to be addressed, if nothing else, and points requiring some modicum of public discussion, and there hasn't been any of that. That suggests to me that it just hasn't been thoroughly reviewed yet, at least not in public, and that should happen long before anyone talks about picking it up for commit.

Perhaps my memory is foggy, but I recall at least some amount of discussion and review of this, along with fixes going in, over the past couple of weeks. More may be needed, of course, but if so, I'd expect any committer looking at it would realize that and bounce it at this point, given the feature freeze deadline which is like next week...

If this patch had been timely submitted and I'd picked it up right away, I would have expected at least three weeks to elapse between the time my first review was posted and the time all the details were nailed down, even assuming no more serious problems were found, and starting that ball rolling now means looking forward to a commit around the end of April assuming things go great. That seems way too late to me; IMHO, we should already be mostly in mop-up mode now (hence my recent DSM patch cleanup patch, which no one has commented on...).

I agree with this. And I think it's likely that, in fact, we'll find cases where this regresses performance rather badly, for reasons sketched in an email I posted a bit ago. I've not researched it enough myself to say. Thanks, Stephen
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: That's why we have this rule that CF4 should only receive patches that were already reviewed in previous commitfests. I, at least, always understood that rule to apply to 'large' patches, which this didn't strike me as. I, too, find the fast-tracking of this patch completely outside of the CF process to be distasteful. We summarily reject much smaller patches at the end of each cycle, even when the gain is as obvious as is claimed to be for this patch. In the past, we've also committed large patches which were submitted for the first time to CF-4. TBH I don't see why we're even discussing this. Think I'm about done, personally. I can't comment more without actually looking at it and doing some research on it myself and I don't know that I'll be able to do that any time soon, as I told Peter when he asked me about it in NYC. That said, for my part, I don't like telling Greg that he either has to review something else which was submitted but that he's got no interest in (or which would take much longer), or not do anything. Thanks, Stephen
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On 2014-04-07 14:35:23 -0400, Stephen Frost wrote: That said, for my part, I don't like telling Greg that he either has to review something else which was submitted but that he's got no interest in (or which would take much longer), or not do anything. Reviewing and committing are two very different shoes imo. This discussion wasn't about it getting reviewed before the next CF, but about committing it into 9.4. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 2:19 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Apr 7, 2014 at 10:47 AM, Robert Haas robertmh...@gmail.com wrote: To throw out one more point that I think is problematic, Peter's original email on this thread gives a bunch of examples of strxfrm() normalization that all differ in the first few bytes - but so do the underlying strings. I *think* (but don't have time to check right now) that on my MacOS X box, strxfrm() spits out 3 bytes of header junk and then 8 bytes per character in the input string - so comparing the first 8 bytes of the strxfrm()'d representation would amount to comparing part of the first byte. If for any reason the first byte is the same (or similar enough) on many of the input strings, then this will probably work out to be slower rather than faster. Even if other platforms are more space-efficient (and I think at least some of them are), I think it's unlikely that this optimization will ever pay off for strings that don't differ in the first 8 bytes. Why would any platform have header bytes in the resulting binary strings? That doesn't make any sense. Are you sure you aren't thinking of the homogeneous trailing bytes that you can also see in my example?

No, I'm not sure of that at all. I haven't looked at this topic in a while, but I'm happy to budget some time to do so - for the June CommitFest.

The only case that this patch could possibly regress is where there are strings that differ beyond about the first 8 bytes, but are not identical (we chance a memcmp() == 0 before doing a full strcoll() when tie-breaking on the semi-reliable initial comparison). We still always avoid fmgr-overhead (and shim overhead, which I've measured), as in your original patch - you put that at adding 7% at the time, which is likely to make up for otherwise-regressed cases. There is nothing at all contrived about my test-case.

It's not a question of whether your test case is contrived.
Your test case can be (and likely is) extremely realistic and still not account for other cases when the patch regresses performance. If I understand correctly, and I may not because I wasn't planning to spend time on this patch until the next CommitFest, the patch basically uses the bytes available in datum1 to cache what you refer to as a normalized or poor man's sort key which, it is hoped, will break most ties. However, on any workload where it fails to break ties, you're going to incur the additional overheads of (1) comparing the poor-man's sort key, (2) memcmping the strings (based on what you just said), and then (3) digging the correct datum back out of the tuple. I note that somebody thought #3 was important enough to be worth creating datum1 for in the first place, so I don't think it can or should be assumed that undoing that optimization in certain cases will turn out to be cheap enough not to matter. In short, I don't see any evidence that you've made an attempt to construct a worst-case scenario for this patch, and that's a basic part of performance testing. I had to endure having Andres beat the snot out of me over cases where the MVCC snapshot patch regressed performance, and as painful as that was, it led to a better patch. If I'd committed the first thing that did well on my own tests, which *did* include attempts (much less successful than Andres's) to find regressions, we'd be significantly worse off today than we are. You have to have an awfully large number of significantly similar but not identical strings in order to possibly lose out. Even if you have such a case, and the fmgr-trampoline-elision doesn't make up for it (doesn't make up for having to do a separate heapattr lookup on the minimal tuple, and optimization not too relevant for pass by reference types), which is quite a stretch, it seems likely that you have other cases that do benefit, which in aggregate makes up for it. 
The benefits I've shown, on the first test case I picked, are absolutely enormous. Testing the cases where your patch wins and hand-waving that the losses won't be that bad in other cases - without actually testing - is not the right methodology. And I think it would be quite a large error to assume that tables never contain large numbers of similar but not identical strings. I bet there are many people who have just that. That's also quite an inaccurate depiction of the cases where this will regress things; a bunch of strings that share the first few characters might not be similar in any normal human sense of that term, while still slipping through the proposed sieve. In your OP, a six-byte string blew up to 20 bytes after being strxfrm()'d, which means that you're only comparing the first 2-3 bytes. On a platform where Datums are only 4 bytes, you're only comparing the first 1-2 bytes. Arguing that nobody anywhere has a table where not even the first character or two are identical across most or all of the strings in the
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Andres Freund (and...@2ndquadrant.com) wrote: On 2014-04-07 14:35:23 -0400, Stephen Frost wrote: That said, for my part, I don't like telling Greg that he either has to review something else which was submitted but that he's got no interest in (or which would take much longer), or not do anything. Reviewing and committing are two very different shoes imo. This discussion wasn't about it getting reviewed before the next CF, but about committing it into 9.4.

I'm not entirely sure why. No one is giving grief to the people bringing up new things which are clearly for 9.5 at this point, yet we have a ton of patches that still need to be reviewed. I don't think we want to tell individuals who are volunteering how to spend their time, but what we're saying is that they can do anything except actually commit stuff, because we must have ordering to what gets committed; an ordering which hasn't really got any prioritization to it based on the patch itself but entirely depends on when it was submitted. Thanks, Stephen
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 2:35 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-04-07 14:35:23 -0400, Stephen Frost wrote: That said, for my part, I don't like telling Greg that he either has to review something else which was submitted but that he's got no interest in (or which would take much longer), or not do anything. Reviewing and committing are two very different shoes imo. This discussion wasn't about it getting reviewed before the next CF, but about committing it into 9.4.

Yes. I did not object to this patch being posted in the midst of trying to nail down this release, and I certainly do not object to Greg, or Stephen, or anyone else reviewing it. My note was specifically prompted not by someone saying they intended to *review* the patch, but that they intended to *commit* it when it hasn't even really been reviewed yet. There are patches that are trivial enough that it's fine for someone to commit them without a public review first, but this isn't remotely close to being in that category. If nothing else, the fact that it extends the definition of the btree opclass is sufficient reason to merit a public review.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Robert Haas (robertmh...@gmail.com) wrote: There are patches that are trivial enough that it's fine for someone to commit them without a public review first, but this isn't remotely close to being in that category. If nothing else, the fact that it extends the definition of the btree opclass is sufficient reason to merit a public review. hrmpf. You have a good point about that- I admit that I didn't consider that as much as I should have. Thanks, Stephen
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 11:37 AM, Robert Haas robertmh...@gmail.com wrote: You're essentially leveraging a commit bit that you haven't used in more than three years to try to push a patch that was submitted months too late I'm not leveraging anything and I'm not going to push something unless people are on board. That's *why* I sent that message. And I started the email by saying I was going to go work on patches from the commitfest first. -- greg
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 11:59 AM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: There are patches that are trivial enough that it's fine for someone to commit them without a public review first, but this isn't remotely close to being in that category. If nothing else, the fact that it extends the definition of the btree opclass is sufficient reason to merit a public review. hrmpf. You have a good point about that- I admit that I didn't consider that as much as I should have. Actually, contrary to the original subject of this thread, that isn't the case. I have not added a support function 3, which I ultimately concluded was a bad idea. This is all sort support. -- Peter Geoghegan
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 11:56 AM, Greg Stark st...@mit.edu wrote: On Mon, Apr 7, 2014 at 11:37 AM, Robert Haas robertmh...@gmail.com wrote: You're essentially leveraging a commit bit that you haven't used in more than three years to try to push a patch that was submitted months too late I'm not leveraging anything and I'm not going to push something unless people are on board. That's *why* I sent that message. And I started the email by saying I was going to go work on patches from the commitfest first.

Exactly. I was of the opinion, as someone familiar with the subject matter, that this rose to the level of deserving special consideration. I'm glad that there does seem to be a general recognition that such a category exists. Given the reservations of Robert and others, this isn't going to happen for 9.4. It was never going to happen under a cloud of controversy. I only broached the idea. Special consideration is not something I ask for lightly. I must admit that it's hard to see things as I do if you aren't as familiar with the problem. I happen to think that this is the wrong decision, but I'll leave it at that. I'm sure that whatever we come up with for 9.5 will be a lot better than what I have here, because it will probably be generalized to other important cases. -- Peter Geoghegan
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 2:56 PM, Greg Stark st...@mit.edu wrote: On Mon, Apr 7, 2014 at 11:37 AM, Robert Haas robertmh...@gmail.com wrote: You're essentially leveraging a commit bit that you haven't used in more than three years to try to push a patch that was submitted months too late I'm not leveraging anything and I'm not going to push something unless people are on board. That's *why* I sent that message. And I started the email by saying I was going to go work on patches from the commitfest first.

I said that a lot more harshly than I should have, and I impugned you unfairly. Sorry. I'm going to try again: I don't doubt that your desire to move this patch forward is motivated by anything other than the best of possible motivations. However, whether you intend it or not, trying to move this patch toward a 9.4 commit, or even trying to get people to express an opinion on whether this is suitable for a 9.4 commit, is inevitably going to cause senior reviewers who think they might have concerns about it to need to spend time on it. Inevitably, that time will come at the expense of patches that were timely submitted, and that is unfair to the people who submitted those patches. Of course, if you want to review this patch now, I'm 100% OK with that. If you want to review other pending patches, for 9.4 or 9.5, I think that's great, too. But if there's talk of committing this patch, I think that seems both quite a bit too late (relative to the timing of CF4) and quite a bit too early (relative to the amount of review and testing done thus far). Thanks,

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
* Peter Geoghegan (p...@heroku.com) wrote: Actually, contrary to the original subject of this thread, that isn't the case. I have not added a support function 3, which I ultimately concluded was a bad idea. This is all sort support. Well, as apparently no one is objecting to Greg reviewing it, I'd suggest he do that and actually articulate his feelings on the patch post-review and exactly what it is changing and if he feels it needs public comment, rather than all this speculation by folks who aren't looking at the patch. In other words, in hindsight, Greg was rather premature with his suggestion that he might commit it and rather than suggesting such, he should have just said he was going to review it and then come back with a detailed email arguing the case for it to go in. I don't particularly fault Greg for that, but perhaps some of this could be avoided in the future. Thanks, Stephen
[HACKERS] Transaction local statistics are incorrect at speed
My Salesforce colleague Teja Mupparti found an interesting bug. Consider the following example:

drop table if exists test;
create table test(i int);
insert into test values(1);
select pg_sleep(1);
begin;
insert into test values(2);
insert into test values(3);
select pg_stat_get_xact_tuples_inserted('test'::regclass);
commit;
select pg_sleep(1);
begin;
insert into test values(4);
insert into test values(5);
select pg_stat_get_xact_tuples_inserted('test'::regclass);
commit;

If you do this by hand, or with the above script verbatim, the pg_stat_get_xact_tuples_inserted() calls both report 2, which is what you'd expect: the counts are supposed to reflect rows inserted in the current transaction. However, if you take out the pg_sleep calls, you get entirely different results, and soon realize that the counts are including the previous transactions! The reason for this is that pgstat_report_stat() includes a delay check, such that it doesn't ship off statistics counts to the collector unless at least 500 ms have elapsed since the last report. Without the sleeps, the later transactions execute while the previous transactions' counts are still being held locally, *and those counts get included into the reported totals*.

This seems like a pretty clear bug to me; does anyone want to argue that it isn't? In the case of pg_stat_get_xact_tuples_inserted and a couple of other routines, it would be entirely trivial to fix: just ignore tabentry->t_counts.t_tuples_inserted (which is the count held over from previous transactions) and only total the trans->tuples_inserted counters. However, for a number of other counters such as blocks_fetched, we don't store transaction-local counts separately, and would have to start doing so if we wanted to make these functions work as documented.

Thoughts? I have other things to do right now than fix this myself.
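The trivial fix Tom sketches - total only the per-(sub)transaction counters and ignore the carried-over table-entry count - can be illustrated roughly as follows. The struct and field names below are simplified stand-ins for the real pgstat structures (PgStat_TableStatus and its subtransaction chain), not the actual PostgreSQL code:

```c
#include <stddef.h>

/* Simplified stand-ins for the pgstat per-table structures. */
typedef struct XactCounts
{
    long               tuples_inserted; /* rows inserted at this (sub)xact level */
    struct XactCounts *upper;           /* enclosing subtransaction, if any */
} XactCounts;

typedef struct TableStatus
{
    long        held_over_inserted; /* counts from *previous* transactions,
                                     * held locally, not yet shipped to the
                                     * collector (the 500 ms delay) */
    XactCounts *trans;              /* current transaction's counter chain */
} TableStatus;

/* Buggy behavior: the held-over counts leak into the "xact" total. */
static long xact_inserted_buggy(const TableStatus *tab)
{
    long total = tab->held_over_inserted;
    for (const XactCounts *t = tab->trans; t != NULL; t = t->upper)
        total += t->tuples_inserted;
    return total;
}

/* Fixed behavior: total only the per-transaction counters. */
static long xact_inserted_fixed(const TableStatus *tab)
{
    long total = 0;
    for (const XactCounts *t = tab->trans; t != NULL; t = t->upper)
        total += t->tuples_inserted;
    return total;
}
```

With 2 rows inserted in the current transaction and 2 more still held locally from a prior transaction, the buggy version reports 4 where the fixed one reports 2 - matching the behavior seen when the pg_sleep calls are removed from the script above.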
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers
Hi, I need to persist the result of an index_getnext() in a tuple slot. I don't want to unnecessarily duplicate the tuple data itself, so I'd like to use the ExecStoreTuple(buffer = real_buffer) notion. But since the next index_getnext()/index_endscan() will overwrite/release the heaptuple I need to copy the HeapTupleData. It'd be rather useful to be able to do ExecStoreTuple(tuple, slot, some_buffer, true), i.e. have the HeapTupleData struct freed *and* the buffer pinned. There's an Assert preventing that though. Why? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Transaction local statistics are incorrect at speed
* Tom Lane (t...@sss.pgh.pa.us) wrote: This seems like a pretty clear bug to me; does anyone want to argue that it isn't? I'd agree that it's a bug. In the case of pg_stat_get_xact_tuples_inserted and a couple of other routines, it would be entirely trivial to fix: just ignore tabentry->t_counts.t_tuples_inserted (which is the count held over from previous transactions) and only total the trans->tuples_inserted counters. However, for a number of other counters such as blocks_fetched, we don't store transaction-local counts separately, and would have to start doing so if we wanted to make these functions work as documented. I've been thinking for a while that we need more of this (storing and aggregating statistics data locally first) or we're never going to be able to add more information like the stddev, etc, that's been requested. I'd really like to see us collecting a great deal more regarding stats and that's not going to happen unless we can cache and aggregate locally first. Thoughts? I have other things to do right now than fix this myself. While a bug, I don't feel that it's a particularly high priority one. What worries me more is the risk of breaking things if we do backpatch a fix that postpones updating some information that used to be reflected immediately. It seems like we'd need/want some kind of generalized structure for this too, which would be even more work and which has zero chance for 9.4, meaning that it'll need to be fixed one way now and then whacked around in 9.5 (assuming someone has the time, etc..). Not a good situation. :/ Thanks, Stephen
Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers
Andres Freund and...@2ndquadrant.com writes: I need to persist the result of an index_getnext() in a tuple slot. I don't want to unnecessarily duplicate the tuple data itself, so I'd like to use the ExecStoreTuple(buffer = real_buffer) notion. But since the next index_getnext()/index_endscan() will overwrite/release the heaptuple I need to copy the HeapTupleData. It'd be rather useful to be able to do ExecStoreTuple(tuple, slot, some_buffer, true), i.e. have the HeapTupleData struct freed *and* the buffer pinned. There's an Assert preventing that though. There's an assumption that if you are asking to pin a buffer, the tuple pointer you're passing is pointing into that buffer (and is therefore not something that could be pfree'd). If it isn't pointing into a buffer, the tuple slot is not the place to be keeping the buffer reference. I'm disinclined to remove that Assert because failing to pin a buffer when you *are* passing a pointer into it is a very bad, hard-to-find bug. Admittedly the Assert is only a partial defense against that problem, but it's better than nothing. regards, tom lane
Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers
On 2014-04-07 15:58:31 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I need to persist the result of an index_getnext() in a tuple slot. I don't want to unnecessarily duplicate the tuple data itself, so I'd like to use the ExecStoreTuple(buffer = real_buffer) notion. But since the next index_getnext()/index_endscan() will overwrite/release the heaptuple I need to copy the HeapTupleData. It'd be rather useful to be able to do ExecStoreTuple(tuple, slot, some_buffer, true), i.e. have the HeapTupleData struct freed *and* the buffer pinned. There's an Assert preventing that though. There's an assumption that if you are asking to pin a buffer, the tuple pointer you're passing is pointing into that buffer (and is therefore not something that could be pfree'd). If it isn't pointing into a buffer, the tuple slot is not the place to be keeping the buffer reference. Yea. I *have* a HeapTupleData struct pointing into the buffer. It's just that the lifetime of the indexscan's xs_ctup isn't sufficient for my case. So I have to allocate a new HeapTupleData struct, *again* pointing into the buffer. I'd like to manage the lifetime of that HeapTupleData struct (*not* the entire HeapTupleHeader blob) via the slot. That doesn't sound too crazy to me? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
Alvaro Herrera alvhe...@2ndquadrant.com writes: I just noticed that this patch not only adds min,max,stddev, but it also adds the ability to reset an entry's counters. This hasn't been mentioned in this thread at all; there has been no discussion on whether this is something we want to have, nor on whether this is the right API. What it does is add a new function pg_stat_statements_reset_time() which resets the min and max values from all functions' entries, without touching the stddev state variables nor the number of calls. Note there's no ability to reset the values for a single function. That seems pretty bizarre. At this late date, the best thing would probably be to strip out the undiscussed functionality. It can get resubmitted for 9.5 if there's a real use-case for it. regards, tom lane
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
Andres Freund and...@2ndquadrant.com writes: I am a bit confused. To my eyes there's been a huge number of actually trivial patches in this commitfest? Even now, there's some: * Bugfix for timeout in LDAP connection parameter resolution * Problem with displaying wide tables in psql * Enable CREATE FOREIGN TABLE (... LIKE ... ) * Add min, max, and stdev execute statement time in pg_stat_statement * variant of regclass etc. * vacuumdb: Add option --analyze-in-stages Are all small patches that don't need major changes before getting committed. FWIW, I think the reason most of those aren't in is that there's not consensus that it's a feature we want. Triviality of the patch itself doesn't make it easier to get past that. (Indeed, when a feature patch is actually trivial, that usually suggests to me that it's not been thought through fully ...) The LDAP one requires both LDAP and Windows expertise, which means the pool of qualified committers for it is pretty durn small. I think Magnus promised to deal with it, though. I think it'd be a different discussion if this were CF-1 or so. But we're nearly *2* months after the *end* of the last CF. Yeah. At this point the default decision has to be to reject (or more accurately, punt to 9.5). I think we can still get a few of these in and meet the mid-April target date, but many of them will not make it. regards, tom lane
Re: [HACKERS] CREATE FOREIGN TABLE ( ... LIKE ... )
On 2014-04-05 11:46:16 -0400, Tom Lane wrote: ISTM this is because the proposed feature is wrongheaded. The basic concept of CREATE TABLE LIKE is that you're copying properties from another object of the same type. You might or might not want every property, but there's no question of whether you *could* copy every property. In contrast, what this is proposing to do is copy properties from (what might be) a plain table to a foreign table, and those things aren't even remotely the same kind of object. It would make sense to me to restrict LIKE to copy from another foreign table, and then there would be a different set of INCLUDING/EXCLUDING options that would be relevant (options yes, indexes no, for example). I actually think it's quite useful to create a foreign table that's the same shape as a local table. And the patch's approach of refusing to copy things that aren't supported sounds sane to me. Consider e.g. moving older partitioned data off to an archiving server. New local partitions are often created using CREATE TABLE LIKE, but that's not possible for the foreign ones. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers
Andres Freund and...@2ndquadrant.com writes: On 2014-04-07 15:58:31 -0400, Tom Lane wrote: There's an assumption that if you are asking to pin a buffer, the tuple pointer you're passing is pointing into that buffer (and is therefore not something that could be pfree'd). If it isn't pointing into a buffer, the tuple slot is not the place to be keeping the buffer reference. Yea. I *have* a HeapTupleData struct pointing into the buffer. It's just that the lifetime of the indexscan's xs_ctup isn't sufficient for my case. So I have to allocate a new HeapTupleData struct, *again* pointing into the buffer. I'd like to manage the lifetime of that HeapTupleData struct (*not* the entire HeapTupleHeader blob) via the slot. That doesn't sound too crazy to me? In that case you should have another tuple slot of your own and let it keep the tuple (and buffer pin). regards, tom lane
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 11:47 AM, Robert Haas robertmh...@gmail.com wrote: It's not a question of whether your test case is contrived. Your test case can be (and likely is) extremely realistic and still not account for other cases when the patch regresses performance. If I understand correctly, and I may not because I wasn't planning to spend time on this patch until the next CommitFest, the patch basically uses the bytes available in datum1 to cache what you refer to as a normalized or poor man's sort key which, it is hoped, will break most ties. However, on any workload where it fails to break ties, you're going to incur the additional overheads of (1) comparing the poor-man's sort key, (2) memcmping the strings (based on what you just said), and then (3) digging the correct datum back out of the tuple. I note that somebody thought #3 was important enough to be worth creating datum1 for in the first place, so I don't think it can or should be assumed that undoing that optimization in certain cases will turn out to be cheap enough not to matter. The much earlier datum1 optimization is mostly compelling for pass-by-value types, for reasons that prominently involve cache/locality considerations. That's probably why this patch is so compelling - it makes those advantages apply to pass-by-reference types too. Testing the cases where your patch wins and hand-waving that the losses won't be that bad in other cases - without actually testing - is not the right methodology. Okay. 
Here is a worst case, with the pgbench script the same as my original test case, but with almost maximally unsympathetic data to sort:

[local]/postgres=# update customers set firstname = 'padding-padding-padding-padding' || firstname;
UPDATE 2

Master:

pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 323
latency average: 309.598 ms
tps = 3.227745 (including connections establishing)
tps = 3.227784 (excluding connections establishing)

Patch:

pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 307
latency average: 325.733 ms
tps = 3.066256 (including connections establishing)
tps = 3.066313 (excluding connections establishing)

That seems like a pretty modest regression for a case where 100% of what I've done goes to waste. If something that I'd done worked out 10% of the time, we'd still be well ahead. -- Peter Geoghegan
Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers
On 2014-04-07 16:29:57 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-04-07 15:58:31 -0400, Tom Lane wrote: There's an assumption that if you are asking to pin a buffer, the tuple pointer you're passing is pointing into that buffer (and is therefore not something that could be pfree'd). If it isn't pointing into a buffer, the tuple slot is not the place to be keeping the buffer reference. Yea. I *have* a HeapTupleData struct pointing into the buffer. It's just that the lifetime of the indexscan's xs_ctup isn't sufficient for my case. So I have to allocate a new HeapTupleData struct, *again* pointing into the buffer. I'd like to manage the lifetime of that HeapTupleData struct (*not* the entire HeapTupleHeader blob) via the slot. That doesn't sound too crazy to me? In that case you should have another tuple slot of your own and let it keep the tuple (and buffer pin). Maybe I am just being dense here and misunderstanding tuple slots. The executor isn't exactly my strong suit. But say I am doing something like:

slot = ExecInitExtraTupleSlot(estate);
ExecSetSlotDescriptor(slot, RelationGetDescr(rel));
...
while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
{
    /* some check */
    ...
    if (blurb)
        ExecStoreTuple(scantuple, slot, scan->xs_cbuf, true);
}
...
index_endscan(scan); /* possibly */
/* now use the stored tuple via slot->tts_tuple */

that's not going to work, because scantuple might be free'd or pointing to another tuple, from the next index_getnext() call. Right? So what I now do is essentially:

while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
{
    ...
    ht = palloc(sizeof(HeapTupleData)); /* in the right context */
    memcpy(ht, scantuple, sizeof(HeapTupleData));
    ExecStoreTuple(ht, slot, scan->xs_cbuf, false);
    slot->tts_shouldFree = true;
    ...
}

That will a) keep the buffer pinned long enough, b) keep the HeapTupleData struct around long enough. That works, but seems pretty ugly. 
Thus I am wondering a) if there's a way to avoid the outside manipulation of tts_shouldFree and b) why there's no builtin operation doing the palloc, memcpy and store... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 4:35 PM, Peter Geoghegan p...@heroku.com wrote: The much earlier datum1 optimization is mostly compelling for pass-by-value types, for reasons that prominently involve cache/locality considerations. I agree. That's probably why this patch is so compelling - it makes those advantages apply to pass-by-reference types too. Well, whether the patch is compelling is actually precisely what we need to figure out. I feel like you're asserting your hoped-for conclusion prematurely. Testing the cases where your patch wins and hand-waving that the losses won't be that bad in other cases - without actually testing - is not the right methodology. Okay. Here is a worst case, with the pgbench script the same as my original test case, but with almost maximally unsympathetic data to sort:

[local]/postgres=# update customers set firstname = 'padding-padding-padding-padding' || firstname;
UPDATE 2

Master:

pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 323
latency average: 309.598 ms
tps = 3.227745 (including connections establishing)
tps = 3.227784 (excluding connections establishing)

Patch:

pg@hamster:~/sort-tests$ pgbench -f sort.sql -n -T 100
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 100 s
number of transactions actually processed: 307
latency average: 325.733 ms
tps = 3.066256 (including connections establishing)
tps = 3.066313 (excluding connections establishing)

That seems like a pretty modest regression for a case where 100% of what I've done goes to waste. If something that I'd done worked out 10% of the time, we'd still be well ahead. 
Now that is definitely interesting, and it does seem to demonstrate that the worst case for this patch might not be as bad as I had feared - it's about a 5% regression: not great, but perhaps tolerable. It's not actually a worst case unless firstname is a fair ways into a tuple with lots of variable-length columns before it, because part of what the datum1 thing saves you is the cost of repeatedly walking through the tuple's column list. But I still think that a lot more could be done - and I'd challenge you (or others) to do it - to look for cases where this might be a pessimization. I get that the patch has an upside, but nearly every patch that anybody proposes does, and the author usually points out those cases quite prominently, as you have, and rightly so. But what makes for a much more compelling submission is when the author *also* tries really hard to break the patch, and hopefully fails. I agree that the test result shown above is good news for this patch's future prospects, but I *don't* agree that it nails the door shut. What about other locales? Other operating systems? Other versions of libc? Longer strings? Wider tuples? Systems where datums are only 32-bits? Sure, you can leave those things to the reviewer and/or committer to worry about, but that's not the way to expedite the patch's trip into the tree. I have to admit, I didn't really view the original postings on this topic to be anything more than, hey, we've got something promising here, it's worth more study. That's part of why I was so taken aback by Greg's email. There's certainly good potential here, but I think there's quite a bit of work left to do before you can declare victory... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Apr 7, 2014 at 3:49 PM, Robert Haas robertmh...@gmail.com wrote: Now that is definitely interesting, and it does seem to demonstrate that the worst case for this patch might not be as bad as I had feared - it's about a 5% regression: not great, but perhaps tolerable. It's not actually a worst case unless firstname is a fair ways into a tuple with lots of variable-length columns before it, because part of what the datum1 thing saves you is the cost of repeatedly walking through the tuple's column list. I only use strxfrm() when the leading key is text. Otherwise, it's pretty much just what your original sort support for text patch from 2012 does, and we don't bother with anything other than fmgr-elision. The only thing that stops the above test case being perfectly pessimal is that we don't always chance a memcmp(), due to some comparisons realizing that len1 != len2. Also, exactly one comparison actually benefits from that same chance-a-memcmp() trick in practice (there is one duplicate). But it really is vanishingly close to perfectly pessimal on my dev system, or that is at least my sincere belief. I think that we're going to find ourselves with more and more cases where risking lots of compute bandwidth to save a little memory bandwidth is, counter-intuitively, a useful optimization. My opportunistic memcmp() is probably an example of this. I'm aware of others. This is perhaps a contributing factor to how inexpensive it is to waste all the effort that goes into strxfrm() and so on. Having said that, this is an idea from the 1960s, which might explain something about the name of the technique. But I still think that a lot more could be done - and I'd challenge you (or others) to do it - to look for cases where this might be a pessimization. I get that the patch has an upside, but nearly every patch that anybody proposes does, and the author usually points out those cases quite prominently, as you have, and rightly so. 
But what makes for a much more compelling submission is when the author *also* tries really hard to break the patch, and hopefully fails. I agree that the test result shown above is good news for this patch's future prospects, but I *don't* agree that it nails the door shut. What about other locales? Other operating systems? Other versions of libc? Longer strings? Wider tuples? Systems where datums are only 32-bits? Sure, you can leave those things to the reviewer and/or committer to worry about, but that's not the way to expedite the patch's trip into the tree. I don't think that 32-bit systems are all that high a priority for performance work. Cellphones will come out with 64-bit processors this year. Even still, it's hard to reason about how much difference that will make, except to say that the worst case cannot be that bad. As for other locales, it only gets better for this patch, because I believe en_US.UTF-8 is one of the cheapest-to-compare locales (which is to say, it requires the fewest passes). If you come up with some test case with complicated collation rules (I'm thinking of hu_HU, Hungarian), it surely makes the patch look much better. Having said that, I still don't do anything special with the C locale (just provide a non-fmgr-accessed comparator), which should probably be fixed, on the grounds that sorting using the C locale is probably now more expensive than with the en_US.UTF-8 collation. I have to admit, I didn't really view the original postings on this topic to be anything more than, hey, we've got something promising here, it's worth more study. That's part of why I was so taken aback by Greg's email. There's certainly good potential here, but I think there's quite a bit of work left to do before you can declare victory... I think that Greg's choice of words was a little imprudent, but must be viewed in the context of an offline discussion during the hall track of pgConf NYC. Clearly Greg wasn't about to go off and unilaterally commit this. 
FWIW, I think I put him off the idea a few hours after he made those remarks, without intending for what I'd said to have that specific effect (or the opposite effect). -- Peter Geoghegan
Re: [HACKERS] CREATE FOREIGN TABLE ( ... LIKE ... )
On Tue, Apr 8, 2014 at 5:24 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-04-05 11:46:16 -0400, Tom Lane wrote: ISTM this is because the proposed feature is wrongheaded. The basic concept of CREATE TABLE LIKE is that you're copying properties from another object of the same type. You might or might not want every property, but there's no question of whether you *could* copy every property. In contrast, what this is proposing to do is copy properties from (what might be) a plain table to a foreign table, and those things aren't even remotely the same kind of object. It would make sense to me to restrict LIKE to copy from another foreign table, and then there would be a different set of INCLUDING/EXCLUDING options that would be relevant (options yes, indexes no, for example). I actually think it's quite useful to create a foreign table that's the same shape as a local table. And the patch's approach of refusing to copy things that aren't supported sounds sane to me. This could be improved as well: it would be useful to be able to copy the column options of another foreign table. Consider e.g. moving older partitioned data off to an archiving server. New local partitions are often created using CREATE TABLE LIKE, but that's not possible for the foreign ones. Definitely a use case. -- Michael
Re: [HACKERS] Why is it not sane to pass ExecStoreTuple(shouldFree=true) for tuples pointing into buffers
Andres Freund and...@2ndquadrant.com writes: On 2014-04-07 16:29:57 -0400, Tom Lane wrote: In that case you should have another tuple slot of your own and let it keep the tuple (and buffer pin). that's not going to work, because scantuple might be free'd or pointing to another tuple, from the next index_getnext() call. Right? It's true that scantuple is probably pointing at the xs_ctup field of the IndexScanDesc, so you need to put that data somewhere else if you want to hold onto it past the current loop iteration. So what I now do is essentially:

while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
{
    ...
    ht = palloc(sizeof(HeapTupleData)); /* in the right context */
    memcpy(ht, scantuple, sizeof(HeapTupleData));
    ExecStoreTuple(ht, slot, scan->xs_cbuf, false);
    slot->tts_shouldFree = true;
    ...
}

Well, that is certainly messy. I think you could just use a local HeapTupleData variable instead of palloc'ing every time, where local means has lifespan similar to the slot pointer. That is

HeapTupleData my_htup;

while ((scantuple = index_getnext(scan, ForwardScanDirection)) != NULL)
{
    memcpy(&my_htup, scantuple, sizeof(HeapTupleData));
    ExecStoreTuple(&my_htup, slot, scan->xs_cbuf, false);
}

If my_htup is just a local, you'd want to clear the slot before my_htup goes out of scope, just to be sure it doesn't try to dereference a dangling pointer. There's some vaguely similar hacking near the end of ExecDelete. regards, tom lane
Re: [HACKERS] WAL replay bugs
On 04/07/2014 02:16 PM, Heikki Linnakangas wrote: I've been playing with a little hack that records a before and after image of every page modification that is WAL-logged, and writes the images to a file along with the LSN of the corresponding WAL record. I set up a master-standby replication with that hack in place in both servers, and ran the regression suite. Then I compared the after images after every WAL record, as written on master, and as replayed by the standby. This is awesome ... thank you for doing this. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
[HACKERS] Default gin operator class of jsonb failing with index row size maximum reached
Hi all, While doing some tests with jsonb, I found a failure as told in $subject:

=# create table data_jsonb (data jsonb);
CREATE TABLE
=# insert into data_jsonb ... tuple in the script attached
INSERT 1
=# create index data_index on data_jsonb using gin(data);
ERROR: 54000: index row size 1808 exceeds maximum 1352 for index data_index
LOCATION: GinFormTuple, ginentrypage.c:110
=# create index data_index2 on data_jsonb using gin (data jsonb_hash_ops);
CREATE INDEX

The data creating the failure is a tuple in a dump of geonames (http://www.geonames.org/export/), listing some geographical data, and it is caused, it seems, by some Arabic characters used to provide translations for a given geographical location. Encoding of the database on which I have done the tests is UTF-8. Japanese and Chinese equivalents were working fine with this operator class, btw. Documentation of jsonb tells that jsonb documents should be kept at a reasonable size to reduce lock contention, but there is no mention of size limitation for indexes: http://www.postgresql.org/docs/devel/static/datatype-json.html A test case is attached; note as well that only the default gin operator class is failing, jsonb_hash_ops worked well. Regards, -- Michael
Re: [HACKERS] WAL replay bugs
On Tue, Apr 8, 2014 at 3:16 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I've been playing with a little hack that records a before and after image of every page modification that is WAL-logged, and writes the images to a file along with the LSN of the corresponding WAL record. I set up a master-standby replication with that hack in place in both servers, and ran the regression suite. Then I compared the after images after every WAL record, as written on master, and as replayed by the standby. Assuming that adding some dedicated hooks in the core able to do actions before and after a page modification occurs is not *that* costly (well, I imagine that it is not acceptable in terms of performance), could it be possible to get that in the shape of an extension that could be used to test WAL record consistency? This may be an idea to think about... -- Michael
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Mon, Apr 7, 2014 at 4:19 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: I just noticed that this patch not only adds min,max,stddev, but it also adds the ability to reset an entry's counters. This hasn't been mentioned in this thread at all; there has been no discussion on whether this is something we want to have, nor on whether this is the right API. What it does is add a new function pg_stat_statements_reset_time() which resets the min and max values from all functions' entries, without touching the stddev state variables nor the number of calls. Note there's no ability to reset the values for a single function. That seems pretty bizarre. At this late date, the best thing would probably be to strip out the undiscussed functionality. It can get resubmitted for 9.5 if there's a real use-case for it. If I'm understanding correctly, that actually seems reasonably sensible. I mean, the big problem with min/max is that you might be taking averages and standard deviations over a period of months, but the maximum and minimum are not that meaningful over such a long period. You might well want to keep a longer-term average while seeing the maximum and minimum for, say, today. I don't know exactly how useful that is, but it doesn't seem obviously dumb. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
Robert Haas robertmh...@gmail.com writes:
> On Mon, Apr 7, 2014 at 4:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Alvaro Herrera alvhe...@2ndquadrant.com writes:
>>> What it does is add a new function, pg_stat_statements_reset_time(), which
>>> resets the min and max values of all entries, without touching the stddev
>>> state variables or the number of calls. Note there's no ability to reset
>>> the values for a single entry.
>>
>> That seems pretty bizarre. At this late date, the best thing would probably
>> be to strip out the undiscussed functionality. It can get resubmitted for
>> 9.5 if there's a real use-case for it.
>
> If I'm understanding correctly, that actually seems reasonably sensible. I
> mean, the big problem with min/max is that you might be taking averages and
> standard deviations over a period of months, but the maximum and minimum
> are not that meaningful over such a long period. You might well want to
> keep a longer-term average while seeing the maximum and minimum for, say,
> today. I don't know exactly how useful that is, but it doesn't seem
> obviously dumb.

Well, maybe it's okay, but not being able to reset the stddev state seems odd, and the chosen function name seems odder. In any case, the time to be discussing the design of such functionality was several months ago. I'm still for ripping it out for 9.4, rather than being stuck with behavior that's not been adequately discussed.

regards, tom lane
Re: [HACKERS] four minor proposals for 9.5
On Mon, Apr 7, 2014 at 12:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
> 2014-04-04 6:51 GMT+02:00 Amit Kapila amit.kapil...@gmail.com:
>> On Tue, Apr 1, 2014 at 11:42 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>>> 2014-03-27 17:56 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:
>>>> So I'll prepare some prototypes in the next month for 1. logging
>>>> execution time for cancelled queries, 2. tracking query lock time.
>>>
>>> Yes. Initially I thought only about cancelled queries, but now I am
>>> thinking some wider solution can be better. Sometimes long queries can be
>>> stopped by Postgres, or by the system, and info about duration can be
>>> just as useful.
>>
>> Right. One more thing: currently we can also find that in a crude way
>> (pg_stat_activity has query_start time and log_line_prefix has end time),
>> but I think having it in one place, the log, will be more useful.
>
> ??

I just wanted to say that if someone wants to calculate the duration of a cancelled query (or other error'd query), you can do that by checking the start time from pg_stat_activity and the end time from the log (using log_line_prefix), but this is of course cumbersome.

> The same technique I would use for printing lock time - it can be printed
> instead of the symbol %L.

Above you have mentioned that you are planning to have three different lock times (table, tuple, and others), so will this one symbol (%L) enable all three lock times?

> My idea is to start with %L as total lock time, which is useful for a wide
> group of users, and next (or at the same time) we can enhance it with
> two-char prefix symbols

So do you want to just print lock time for error'd statements? Won't it be better to do it for non-error'd statements as well? Or rather, I feel it can be more useful for non-error statements. Do we already have some easy way to get wait-time for non-error statements?

> so:
>
> %L  .. total lock time
> %Lr .. lock relation
> %Lt .. lock tuples
> %Lo .. lock others
>
> L = Lr + Lt + Lo

Are you planning to add new logs for logging this, or planning to use existing infrastructure?
> I have not a prototype yet, so I don't know what will be necessary.

Okay, I think then it's better to discuss after your initial analysis/prototype, but I think you might need to add some infrastructure code to make it possible: since locks on database objects (relation, tuple, ...) and other locks (buffers, ...) have different locking strategies, to get the total wait time for a statement across the different kinds of locks you need to accumulate the different wait times.

One general point: won't it be a bit difficult to parse a log line having so many times? Should we consider logging with some special marker for each time (for example: tup_lock_wait_time:1000ms)?

> We should optimize log_line_prefix processing instead.
>
> The proposed options are interesting for enterprise use, when you have
> smarter tools for log entry processing, and when you need a complex view of
> the performance of billions of queries - when cancel time and lock time are
> important pieces in the mosaic of a server's fitness.

Exactly, though this might not be directly related to this patch, but having it would be useful.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
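The accumulation Amit describes - summing per-category wait times so that %L equals %Lr + %Lt + %Lo, and emitting explicit markers for easy parsing - could be sketched as below. All names here are hypothetical illustrations of the proposal, not actual Postgres code; a real implementation would hook into the lock manager's wait paths.

```c
#include <stdio.h>
#include <string.h>

/* Categories from Pavel's proposal: relation locks (%Lr), tuple locks
 * (%Lt), and other waits such as buffer locks (%Lo). %L is their total. */
typedef enum { WAIT_RELATION, WAIT_TUPLE, WAIT_OTHER, WAIT_NKINDS } WaitKind;

typedef struct StmtWaitStats {
    long wait_ms[WAIT_NKINDS];   /* accumulated per-statement wait time */
} StmtWaitStats;

/* Called whenever a wait of the given kind finishes; different locking
 * strategies all funnel their measured wait time through here. */
void wait_accumulate(StmtWaitStats *w, WaitKind kind, long ms) {
    w->wait_ms[kind] += ms;
}

long wait_total(const StmtWaitStats *w) {
    long total = 0;
    for (int i = 0; i < WAIT_NKINDS; i++)
        total += w->wait_ms[i];
    return total;
}

/* Render the times with explicit markers, per Amit's parsing suggestion,
 * rather than as bare numbers in log_line_prefix. */
void wait_format(const StmtWaitStats *w, char *buf, size_t len) {
    snprintf(buf, len,
             "lock_wait_total:%ldms rel:%ldms tup:%ldms other:%ldms",
             wait_total(w),
             w->wait_ms[WAIT_RELATION],
             w->wait_ms[WAIT_TUPLE],
             w->wait_ms[WAIT_OTHER]);
}
```

With named markers, a log-processing tool can pick out each time with a simple pattern match instead of depending on field position in log_line_prefix.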