Re: [HACKERS] inherit support for foreign tables
If we are going to change that portion of the code, we may as well go a bit further and allow any expressions to be fetched from a foreign server (obviously, if that server is capable of doing so). It will help when we come to aggregate push-down or whole-query push-down (whenever that happens). So, instead of attr_needed, which restricts only the attributes to be fetched, why not use the targetlist itself? On Mon, Jun 30, 2014 at 7:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes: Done. I think this is because create_foreignscan_plan() makes reference to attr_needed, which isn't computed for inheritance children. I wonder whether it isn't time to change that. It was coded like that originally only because calculating the values would've been a waste of cycles at the time. But this is at least the third place where it'd be useful to have attr_needed for child rels. regards, tom lane -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] inherit support for foreign tables
On Tue, Jul 1, 2014 at 7:39 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/06/30 20:17), Ashutosh Bapat wrote: On Mon, Jun 30, 2014 at 4:17 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/06/30 17:47), Ashutosh Bapat wrote: BTW, why aren't you using the tlist passed to this function? I guess create_scan_plan() passes the tlist after processing it, so that should be used rather than rel->reltargetlist. I think that that would be maybe OK, but I think that it would not be efficient to use the list to compute attrs_used, because the tlist would have more information than rel->reltargetlist in cases where the tlist is built through build_physical_tlist(). In that case, we can call build_relation_tlist() for foreign tables. Do you mean build_physical_tlist()? Sorry, I meant build_path_tlist() or disuse_physical_tlist(), whichever is appropriate. We may want to modify use_physical_tlist() to return false in the case of foreign tables. BTW, it does return false for inheritance trees.

486     /*
487      * Can't do it with inheritance cases either (mainly because Append
488      * doesn't project).
489      */
490     if (rel->reloptkind != RELOPT_BASEREL)
491         return false;

Yeah, we can call build_physical_tlist() (and do that in some cases), but if we call the function, it would generate a tlist that contains all Vars in the relation, not only those Vars actually needed by the query (ie, Vars in reltargetlist), and thus it would take more cycles to compute attrs_used from the tlist than from reltargetlist. That's what I wanted to say. Thanks, Best regards, Etsuro Fujita -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Autonomous Transaction (WIP)
On 30 June 2014 22:50, Pavel Stehule Wrote: 2014-06-30 12:38 GMT+02:00 Abhijit Menon-Sen a...@2ndquadrant.com: If I understand correctly, the design of this patch has already been considered earlier and rejected. So I guess the patch should also be marked rejected? I didn't find a related message. ? I think there has been some confusion; the design idea was never rejected, but yes, there was some feedback/concern, which I had clarified. Also, some of the other concerns are already fixed in the latest patch. So I wanted to have this patch in the commitfest application, so that we can have a healthy discussion and rectify all the issues. But now I see that this patch has already been moved to the rejected category, which will put a brake on further review. So is there any way to bring back and continue reviewing this patch? Please let me know if there is any issue or I am missing something. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
2014-07-01 8:16 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com: On 30 June 2014 22:50, Pavel Stehule Wrote: 2014-06-30 12:38 GMT+02:00 Abhijit Menon-Sen a...@2ndquadrant.com: If I understand correctly, the design of this patch has already been considered earlier and rejected. So I guess the patch should also be marked rejected? I didn't find a related message. ? I think there has been some confusion; the design idea was never rejected, but yes, there was some feedback/concern, which I had clarified. Also, some of the other concerns are already fixed in the latest patch. So I wanted to have this patch in the commitfest application, so that we can have a healthy discussion and rectify all the issues. But now I see that this patch has already been moved to the rejected category, which will put a brake on further review. So is there any way to bring back and continue reviewing this patch? Please let me know if there is any issue or I am missing something. I didn't follow the discussion about the internal implementation, but now, when I am testing this feature, it works well. Surely, this feature is important but has a relatively large impact and should be extremely well tested. Right now there are no special tests. Probably we can reuse the tests for nested transactions. I would prefer this feature to be part of the first commitfest due to its high complexity. Regards Pavel Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] inherit support for foreign tables
Hello, Sorry, this had no relation to this patch. ForeignNext materializes the slot, which may hold either a physical or a virtual tuple, when system columns are requested. If it is a virtual one, as file_fdw produces, heap_form_tuple generates the tuple header as DatumTupleFields. The result is a jumble of virtual and physical: the returned slot has tts_tuple set, so the caller interprets it as a physical tuple. Anyway, since requests for xmin/xmax cannot be prevented beforehand in the current framework, I rewrote the physical tuple header so that it really is a physical one. This would be another patch, so I will put it into the next CF if it doesn't get any immediate objection. By the way, I tried xmin and xmax on the file_fdw tables.

postgres=# select tableoid, xmin, xmax, * from passwd1;
 tableoid | xmin |    xmax    | uname  | pass | uid | gid | ..
    16396 |  244 | 4294967295 | root   | x    |   0 |   0 | root...
    16396 |  252 | 4294967295 | bin    | x    |   1 |   1 | bin...
    16396 |  284 | 4294967295 | daemon | x    |   2 |   2 | daemon...

The xmin and xmax apparently don't look sane.
After some investigation, I found that they came from the following place in heap_form_tuple() (the call stack is shown below). regards, -- Kyotaro Horiguchi NTT Open Source Software Center

diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 9cc5345..728db14 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -22,6 +22,8 @@
  */
 #include "postgres.h"
 
+#include "access/transam.h"
+#include "access/htup_details.h"
 #include "executor/executor.h"
 #include "executor/nodeForeignscan.h"
 #include "foreign/fdwapi.h"
@@ -58,8 +60,21 @@ ForeignNext(ForeignScanState *node)
 	 */
 	if (plan->fsSystemCol && !TupIsNull(slot))
 	{
+		bool		was_virtual_tuple = (slot->tts_tuple == NULL);
 		HeapTuple	tup = ExecMaterializeSlot(slot);
 
+		if (was_virtual_tuple)
+		{
+			/*
+			 * ExecMaterializeSlot fills the tuple header as a
+			 * DatumTupleFields if the slot was a virtual tuple, but a
+			 * physical one is needed here.  So rewrite the tuple header as
+			 * HeapTupleFields.
+			 */
+			HeapTupleHeaderSetXmin(tup->t_data, FrozenTransactionId);
+			HeapTupleHeaderSetXmax(tup->t_data, InvalidTransactionId);
+			HeapTupleHeaderSetCmin(tup->t_data, FirstCommandId);
+		}
 		tup->t_tableOid = RelationGetRelid(node->ss.ss_currentRelation);
 	}

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autonomous Transaction (WIP)
On Tue, Jul 1, 2014 at 11:46 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 30 June 2014 22:50, Pavel Stehule Wrote: I didn't find a related message. ? I think there has been some confusion; the design idea was never rejected, but yes, there was some feedback/concern, which I had clarified. Also, some of the other concerns are already fixed in the latest patch. Simon has mentioned that exactly this idea was rejected at PGCon 2 years back. Please refer to it in the mail below: http://www.postgresql.org/message-id/bf2827dcce55594c8d7a8f7ffd3ab7713dde1...@szxeml508-mbx.china.huawei.com As far as I can see, you never came back with a different solution. Have you checked the discussion in the developer meeting notes? Please check it at the link below: http://wiki.postgresql.org/wiki/PgCon_2012_Developer_Meeting#Autonomous_Transactions So I wanted to have this patch in the commitfest application, so that we can have a healthy discussion and rectify all the issues. But now I see that this patch has already been moved to the rejected category, which will put a brake on further review. I believe ideally this patch should have been marked as Returned with feedback, as you already got feedback long back and never came up with a solution for it. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Long paths for tablespace leads to uninterruptible hang in Windows
On Sat, Feb 15, 2014 at 1:26 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Fri, Feb 14, 2014 at 8:27 AM, Bruce Momjian br...@momjian.us wrote: On Tue, Jan 7, 2014 at 12:33:33PM +0530, Amit Kapila wrote: Still they have not told us anything about the other APIs (rmdir, RemoveDirectory) which have the same problem. Where are we on this? So far we have not received from Microsoft any workaround that can fix this problem. From the discussion over the support ticket with them, it seems this problem is in their kernel, and changing the code for it might not be straightforward for them, nor do they have any clear alternative. The reply from Microsoft is as below:

This is regarding a long pending case where stat() was failing on hard links and causing an infinite loop. We have discussed this multiple times internally and unfortunately do not have a commercially viable solution to this issue. Currently there are no workarounds available for this issue, but this has been marked for triage in future OSes. Since we have out run the maximum time that can be spent on this Professional Level Service request, I have been asked to move ahead and mark this as a won't fix. We would need to close this case out as a won't fix, and you would not be charged for this incident.

Is there a check we should add in our code? We can possibly solve this problem in one of the ways below:

1. Resolve the symbolic link to the actual path in code whenever we try to access it.
2. Check in code (initdb and create tablespace) to not allow paths longer than ~120 characters on Windows.

Approach-1 has the benefit that it can support the actual MAX_PATH, and even if MS doesn't resolve the problem, PostgreSQL will not face it. Approach-2 is straightforward to fix. If we go with Approach-2, then we might raise the MaxPath limit for Windows in the future whenever there is a fix for it. From the reply above, it is clear that there is neither a workaround nor a fix for this issue in Windows.
I think now we need to decide which solution we want to pursue for PostgreSQL. Does either of the above approaches seem sensible, or let me know if you have any other idea to solve this problem. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] inherit support for foreign tables
(2014/07/01 15:13), Ashutosh Bapat wrote: On Tue, Jul 1, 2014 at 7:39 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: We may want to modify use_physical_tlist() to return false in the case of foreign tables. BTW, it does return false for inheritance trees. Yeah, but please consider cases where foreign tables are not inheritance child rels (and any system columns are requested).

486     /*
487      * Can't do it with inheritance cases either (mainly because Append
488      * doesn't project).
489      */
490     if (rel->reloptkind != RELOPT_BASEREL)
491         return false;

Yeah, we can call build_physical_tlist() (and do that in some cases), but if we call the function, it would generate a tlist that contains all Vars in the relation, not only those Vars actually needed by the query (ie, Vars in reltargetlist), and thus it would take more cycles to compute attrs_used from the tlist than from reltargetlist. That's what I wanted to say. Maybe I'm missing something, but what's the point of using the tlist, not reltargetlist? Thanks, Best regards, Etsuro Fujita
Re: [HACKERS] Autonomous Transaction (WIP)
2014-07-01 8:29 GMT+02:00 Amit Kapila amit.kapil...@gmail.com: On Tue, Jul 1, 2014 at 11:46 AM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 30 June 2014 22:50, Pavel Stehule Wrote: I didn't find a related message. ? I think there has been some confusion; the design idea was never rejected, but yes, there was some feedback/concern, which I had clarified. Also, some of the other concerns are already fixed in the latest patch. Simon has mentioned that exactly this idea was rejected at PGCon 2 years back. Please refer to it in the mail below: http://www.postgresql.org/message-id/bf2827dcce55594c8d7a8f7ffd3ab7713dde1...@szxeml508-mbx.china.huawei.com As far as I can see, you never came back with a different solution. Have you checked the discussion in the developer meeting notes? Please check it at the link below: http://wiki.postgresql.org/wiki/PgCon_2012_Developer_Meeting#Autonomous_Transactions Are these notes still valid?

* Why should autonomous transactions be tied closely to functions? We can implement AT as a first step, and the next step can be the integration of AT into stored procedures.
* When an autonomous transaction is independent of the parent transaction, then locks of the parent and of the autonomous transaction should be in conflict.

I thought about integration into PL/pgSQL, and I don't think such close integration between autonomous transactions and procedures is optimal. More practical is a design where an autonomous transaction is similar to a subtransaction. Then we can simply write some code like:

BEGIN
  .. some code
  WHEN OTHERS THEN
    .. I would like to write permanently to the log
    BEGIN AUTONOMOUS
      INSERT INTO log VALUES(..);
      WHEN OTHERS
        RAISE WARNING 'Cannot write to log ..';
        RAISE EXCEPTION '...';  -- forward the exception from the autonomous transaction up to the parent transaction
    END
END;

Now I am thinking that the PL/SQL design of autonomous transactions is relatively limited and is not the best to follow.

Regards Pavel

So I wanted to have this patch in the commitfest application, so that we can have a healthy discussion and rectify all the issues. But now I see that this patch has already been moved to the rejected category, which will put a brake on further review. I believe ideally this patch should have been marked as Returned with feedback, as you already got feedback long back and never came up with a solution for it. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] IMPORT FOREIGN SCHEMA statement
Michael Paquier wrote: After sleeping on it, I have put my hands on the postgres_fdw portion and came up with a largely simplified flow, resulting in the patch attached. [...] Ronan, what do you think of those patches? I have nothing more to add, and I think that they should be looked at by a committer. Particularly the FDW API, which is perhaps not the best fit, but let's see some extra opinions about that. I looked at the API and its documentation, and while I saw no problem with the API, I think that the documentation still needs some attention: It mentions a local_schema, which doesn't exist (any more?). It should be mentioned that the CreateForeignTableStmt's base.relation->schemaname should be set to NULL. Also, it would be nice to find a few words for options, maybe explaining a potential application. Yours, Laurenz Albe
Re: [HACKERS] inherit support for foreign tables
On Tue, Jul 1, 2014 at 12:25 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/07/01 15:13), Ashutosh Bapat wrote: On Tue, Jul 1, 2014 at 7:39 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: We may want to modify use_physical_tlist() to return false in the case of foreign tables. BTW, it does return false for inheritance trees. Yeah, but please consider cases where foreign tables are not inheritance child rels (and any system columns are requested).

486     /*
487      * Can't do it with inheritance cases either (mainly because Append
488      * doesn't project).
489      */
490     if (rel->reloptkind != RELOPT_BASEREL)
491         return false;

Yeah, we can call build_physical_tlist() (and do that in some cases), but if we call the function, it would generate a tlist that contains all Vars in the relation, not only those Vars actually needed by the query (ie, Vars in reltargetlist), and thus it would take more cycles to compute attrs_used from the tlist than from reltargetlist. That's what I wanted to say. Maybe I'm missing something, but what's the point of using the tlist, not reltargetlist? Compliance with the other create_*scan_plan() functions. The tlist passed to those functions is sometimes preprocessed in create_scan_plan() and some of the functions it calls. If we use reltargetlist directly, we lose that preprocessing. I have not seen any of the create_*scan_plan() functions fetch the targetlist directly from RelOptInfo. It is always the one supplied by build_path_tlist() or disuse_physical_tlist() (which in turn calls build_path_tlist()) or build_physical_tlist(). Thanks, Best regards, Etsuro Fujita -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
[HACKERS] debugging with child process
hi, can anyone help me with the 4th and 5th steps of the following link, which is about debugging a child process in postgres? https://wiki.postgresql.org/wiki/Working_with_Eclipse#Debugging_with_child_processes When I give the following fields in the debug configuration

C/C++ Application: postgres
Project: psql
Build Configuration: Use Active

I am getting the following error:

No source available for __kernel_vsyscall() at 0xb726a424

Is there any better tool than Eclipse for editing or learning about the postgres source code? Thank you
Re: [HACKERS] IMPORT FOREIGN SCHEMA statement
On Tuesday, 1 July 2014, 06:59:49, Albe Laurenz wrote: Michael Paquier wrote: After sleeping on it, I have put my hands on the postgres_fdw portion and came up with a largely simplified flow, resulting in the patch attached. [...] Ronan, what do you think of those patches? I have nothing more to add, and I think that they should be looked at by a committer. Particularly the FDW API, which is perhaps not the best fit, but let's see some extra opinions about that. The remote_schema parameter can be used for SQL injection. Either we should go back to using parameters, or be extra careful. Since the remote schema is parsed as a name, it is limited to 64 characters, which is not that useful for an SQL injection, but still. The new query as you wrote it returns the typname (it was cast to regtype before). This is not schema-qualified and will fail when importing tables with columns of a type not in search_path. The regression tests don't pass: a user name is hard-coded in the result of DROP USER MAPPING. Should we expect the tests to be run as postgres? I looked at the API and its documentation, and while I saw no problem with the API, I think that the documentation still needs some attention: It mentions a local_schema, which doesn't exist (any more?). It should be mentioned that the CreateForeignTableStmt's base.relation->schemaname should be set to NULL. Also, it would be nice to find a few words for options, maybe explaining a potential application. Yours, Laurenz Albe -- Ronan Dunklau http://dalibo.com - http://dalibo.org
Re: [HACKERS] inherit support for foreign tables
(2014/07/01 16:04), Ashutosh Bapat wrote: On Tue, Jul 1, 2014 at 12:25 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: Maybe I'm missing something, but what's the point of using the tlist, not reltargetlist? Compliance with the other create_*scan_plan() functions. The tlist passed to those functions is sometimes preprocessed in create_scan_plan() and some of the functions it calls. If we use reltargetlist directly, we lose that preprocessing. I have not seen any of the create_*scan_plan() functions fetch the targetlist directly from RelOptInfo. It is always the one supplied by build_path_tlist() or disuse_physical_tlist() (which in turn calls build_path_tlist()) or build_physical_tlist(). I've got the point. As I said upthread, I'll work on calculating attr_needed for child rels, and I hope that that will eliminate your concern. Thanks, Best regards, Etsuro Fujita
Re: [HACKERS] RLS Design
On 29 June 2014 20:42, Stephen Frost sfr...@snowman.net wrote: To try and clarify what this distinction is- Dean's approach with GRANT allows specifying the policy to be used when a given role queries a given table. Through this mechanism, one role might have access to many different tables, possibly with a different policy granting that access for each table. Robert's approach defines a policy for a user and that policy is used for all tables that user accesses. This ties the policy very closely to the role. Actually I think they were both originally Robert's ideas in one form or another, but at this point I'm losing track a bit :-) With either approach, I wonder how we are going to address the role membership question. Do you inherit policies through role membership? What happens on 'set role'? Robert points out that we should be using OR for these situations of overlapping policies and I tend to agree. Therefore, we would look at the RLS policies for a table and extract out all of them for all of the roles which the current user is a member of, OR them together and that would be the set of quals used. Yes I think that's right. I had hoped to avoid overlapping policies, but maybe they're more-or-less inevitable and we should just allow them. It seems justifiable in terms of GRANTs --- one GRANT gives you permission to access one set of rows from a table, another GRANT gives you permission to access another set of rows, so in the end you have access to the union of both sets. I'm leaning towards Dean's approach. With Robert's approach, one could emulate Dean's approach but I suspect it would devolve quickly into one policy per user with that policy simply being a proxy for the role instead of being useful on its own. With Dean's approach though, I don't think there's a need for a policy to be a stand-alone object. The policy is simply a proxy for the set of quals to be added and therefore the policy could really live as a per-table object. 
Yes I think that's right too. I had thought that stand-alone policies would be useful for selecting which policies to apply to each role, but maybe that's not necessary if you rely entirely on GRANTs to decide which policies apply. That means the syntax I proposed earlier is wrong/misleading. Instead of

GRANT SELECT ON TABLE tbl TO role USING polname;

it should really be

GRANT SELECT USING polname ON TABLE tbl TO role;

This would work, though the 'polname' could be a per-table object, no? Right. This could even be:

GRANT SELECT USING (sec_level=manager) ON TABLE tbl TO role;

Maybe. The important thing is that it's granting the role access to a {table,command,policy} set or equivalently a {table,command,quals} set --- i.e., the right to access a sub-set of the table's rows with a particular command. Let's explore this further to see where it leads. In some ways, I think it has ended up even simpler than I thought. To set up RLS, you would just need to do 2 things:

1). Add a bunch of RLS policies to your tables (not connected to any particular commands, since that is done using GRANTs). This could use Robert's earlier syntax:

ALTER TABLE t1 ADD POLICY p1 WHERE p1_quals;
ALTER TABLE t1 ADD POLICY p2 WHERE p2_quals;
...

(or some similar syntax) where the policy names p1 and p2 need only be unique within the table. For maintenance purposes you'd also need to be able to do

ALTER TABLE t1 DROP POLICY pol;

and maybe in the future we'd support

ALTER TABLE t1 ALTER POLICY pol TO new_quals;

2). Once each table has the required set of policies, grant each role permissions, specifying the allowed commands and policies together:

GRANT SELECT USING p1 ON TABLE t1 TO role1;
GRANT SELECT USING p2 ON TABLE t1 TO role1;
GRANT UPDATE USING p3 ON TABLE t1 TO role1;
...

(or some similar syntax) So in this example, if role1 SELECTed from t1, the system would automatically apply the combined quals (p1_quals OR p2_quals), whereas when role1 UPDATEd t1, it would apply p3_quals.
So that takes care of the different-quals-for-different-commands requirement without even needing any special syntax for it in ALTER TABLE. A straight GRANT SELECT ON TABLE .. TO .. would grant access to the whole table without any RLS quals, as it always has done, which is good because it means nothing changes for users who aren't interested in RLS. Finally, pg_dump would require a GUC to ensure that RLS was not in effect. Perhaps something like SET require_direct_table_access = true, which would cause an error to be thrown if the user hadn't been granted straight select permissions on the tables in question. That all seems relatively easy to understand, whilst giving a lot of flexibility. An annoying complication, however, is how this interacts with column privileges. Right now GRANT SELECT(col1) ON t1 TO role1 gives role1 access to every row in col1, and I think that has to remain the case, since GRANTs only ever give you more access. But that
Re: [HACKERS] inherit support for foreign tables
Hi, At Tue, 01 Jul 2014 16:30:41 +0900, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote in 53b263a1.3060...@lab.ntt.co.jp I've got the point. As I said upthread, I'll work on calculating attr_needed for child rels, and I hope that that will eliminate your concern. The inheritance tree is expanded long after attr_needed for the parent is built, in set_base_rel_sizes() in make_one_rel(). I'm afraid that building attr_needed for the whole inheritance tree all at once would be somewhat painful. I have wanted the inheritance expansion to happen earlier for another patch. Do you think rearranging that is feasible? Or should the values be generated individually in create_foreign_plan()? Anyway, I'm looking forward to your next patch, so no answer is needed. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Autonomous Transaction (WIP)
On 01 July 2014 12:26, Pavel Stehule Wrote: Have you checked the discussion in the developer meeting notes? Please check it at the link below: http://wiki.postgresql.org/wiki/PgCon_2012_Developer_Meeting#Autonomous_Transactions Are these notes still valid? * Why autonomous transaction should be close to functions? We can implement AT as first step and next step can be implementation of integration AT to stored procedures. We have implemented AT along the lines of subtransactions. Also, we have integrated AT with stored procedures, i.e. we can create an autonomous transaction inside a stored procedure, which can also be committed. * When autonomous transaction is independent on parent transaction, then locks parent and autonomous transaction should be in conflict Yes, our implementation makes the autonomous transaction independent of the main transaction, and hence, as per our design, the parent (main) transaction and the autonomous transaction may conflict. For this we have implemented a deadlock detection mechanism between an autonomous transaction and its parent transaction. I though about integration to PL/pgSQL and I don't think so close integration between autonomous transaction and procedure is optimal. More practical is design so autonomous transaction is similar to subtransaction. Yes, as mentioned above, our implementation of autonomous transactions is along the lines of subtransactions. Then we can simply wrote some code like BEGIN .. some code WHEN OTHERS THEN .. I would to write permanently to log BEGIN AUTONOMOUS INSERT INTO log VALUES(..); WHEN OTHERS RAISE WARNING 'Cannot to write to log ..'; RAISE EXCEPTION ' ...' forward up exception from autonomous transaction to parent transaction END END; Now I am thinking so PL/SQL design of autonomous transactions is relatively limited and is not best to follow it. With our approach, we can use an autonomous transaction in a procedure as given below:

BEGIN
  .. some code
  WHEN OTHERS THEN
    .. I would like to write permanently to the log
    START AUTONOMOUS TRANSACTION
      INSERT INTO log VALUES(..);
    COMMIT;
    WHEN OTHERS
      RAISE WARNING 'Cannot write to log ..';
      RAISE EXCEPTION '...';  -- forward the exception from the autonomous transaction up to the parent transaction
END
END;

Please let me know if I have missed answering any of your queries. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Autonomous Transaction (WIP)
2014-07-01 10:38 GMT+02:00 Rajeev rastogi rajeev.rast...@huawei.com: On 01 July 2014 12:26, Pavel Stehule Wrote: Have you checked the discussion in Developer meeting notes. Please check the same at below link: http://wiki.postgresql.org/wiki/PgCon_2012_Developer_Meeting#Autonomous_Transactions Are these notes still valid? * Why autonomous transaction should be close to functions? We can implement AT as first step and next step can be implementation of integration AT to stored procedures. We have implemented AT on the line of sub-transaction. Also we have integrated AT with stored procedure i.e. we can create an autonomous transaction inside the store procedure, which can be also committed. * When autonomous transaction is independent on parent transaction, then locks parent and autonomous transaction should be in conflict Yes our implementation makes the autonomous transaction independent of main transaction and hence as per our design parent (main) transaction and autonomous may get conflicted. For which we have implemented deadlock detection mechanism between autonomous transaction and its parent transaction. I though about integration to PL/pgSQL and I don't think so close integration between autonomous transaction and procedure is optimal. More practical is design so autonomous transaction is similar to subtransaction. Yes as mentioned above, our implementation of autonomous transaction is on track of subtransaction. ok Then we can simply wrote some code like BEGIN .. some code WHEN OTHERS THEN .. I would to write permanently to log BEGIN AUTONOMOUS INSERT INTO log VALUES(..); WHEN OTHERS RAISE WARNING 'Cannot to write to log ..'; RAISE EXCEPTION ' ...' forward up exception from autonomous transaction to parent transaction END END; Now I am thinking so PL/SQL design of autonomous transactions is relatively limited and is not best to follow it. With our approach, we can use autonomous transaction in procedure as given below: BEGIN .. 
some code
  WHEN OTHERS THEN
    .. I would like to write permanently to the log
    START AUTONOMOUS TRANSACTION
      INSERT INTO log VALUES(..);
    COMMIT;
    WHEN OTHERS
      RAISE WARNING 'Cannot write to log ..';
      RAISE EXCEPTION '...';  -- forward the exception from the autonomous transaction up to the parent transaction
END
END;

I don't like this design (really) - it can be used in a later implementation of procedures - but I don't like explicit transaction manipulation in functions. It is an Oracleism (and this is a part I don't want to follow, because Oracle's design is not a happy one) - and it is one reason why Oracle stored procedures are significantly more complex than PostgreSQL's. After all, I am thinking that PostgreSQL's relationship between transactions and procedures is better: simpler to use, simpler to learn. But that is a slightly different topic. Regards Pavel Please let me know if I have missed answering any of your queries. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] NUMA packaging and patch
Re: Kevin Grittner 2014-06-09 1402267501.4.yahoomail...@web122304.mail.ne1.yahoo.com

@@ -536,6 +539,24 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port,
 	 */
 	}

+#ifdef USE_LIBNUMA
+	/*
+	 * If this is not a private segment and we are using libnuma, make the
+	 * large memory segment interleaved.
+	 */
+	if (!makePrivate && numa_available())
+	{
+		void	   *start;
+
+		if (AnonymousShmem == NULL)
+			start = memAddress;
+		else
+			start = AnonymousShmem;
+
+		numa_interleave_memory(start, size, numa_all_nodes_ptr);
+	}
+#endif

How much difference would it make if numactl --interleave=all was used instead of using numa_interleave_memory() on the shared memory segments? I guess that would make backend-local memory also interleaved, but it would avoid having a dependency on libnuma in the packages. The numactl manpage even has this example:

    numactl --interleave=all bigdatabase arguments
    Run big database with its memory interleaved on all CPUs.

It is probably better to have native support in the postmaster, though this could be mentioned as an alternative in the documentation.

Christoph -- c...@df7cb.de | http://www.df7cb.de/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Array of composite types returned from python
On Sunday, June 29, 2014 16:54:03, Tom Lane wrote: Abhijit Menon-Sen a...@2ndquadrant.com writes: 3) This is such a simple change with no new infrastructure code (PLyObject_ToComposite already exists). Can you think of a reason why this wasn't done until now? Was it a simple miss or purposefully excluded? This is not an authoritative answer: I think the infrastructure was originally missing, but was later added in #bc411f25 for OUT parameters. Perhaps it was overlooked at the time that the same code would suffice for this earlier-missing case. (I've Cc:ed Peter E. in case he has any comments.) I think the patch is ready for committer. Sorry for being this late. I've tested the patch, and everything seems to work as expected, including complex nesting of composite and array types. No documentation changes are needed, since the limitation wasn't even mentioned before. Regression tests are ok, and the patch seems simple enough. Formatting looks OK too. I took a quick look at this; not really a review either, but I have a couple of comments. 1. While I think the patch does what it intends to, it's a bit distressing that it will invoke the information lookups in PLyObject_ToComposite over again for *each element* of the array. We probably ought to quantify that overhead to see if it's bad enough that we need to do something about improving caching, as speculated in the comment in PLyObject_ToComposite. I don't know how to do that without implementing the cache itself. 2. I wonder whether the no-composites restriction in plpy.prepare (see lines 133ff in plpy_spi.c) could be removed as easily. Hum, I tried that, but it's not that easy: lifting the restriction results in a SEGFAULT when trying to pfree the parameters given to SPI_ExecutePlan (line 320 in plpy_spi.c). Correct me if I'm wrong, but I think the problem is that HeapTupleGetDatum returns the t_data field, whereas heap_form_tuple returns the address of the HeapTuple itself.
Then, the datum itself has not been palloced. Changing the HeapTupleGetDatum call for an heap_copy_tuple_as_datum fixes this issue, but I'm not sure this the best way to do that. The attached patch implements this. regards, tom lane -- Ronan Dunklau http://dalibo.com - http://dalibo.orgdiff --git a/src/pl/plpython/expected/plpython_composite.out b/src/pl/plpython/expected/plpython_composite.out index 61b3046..e7a7edb 100644 --- a/src/pl/plpython/expected/plpython_composite.out +++ b/src/pl/plpython/expected/plpython_composite.out @@ -270,9 +270,13 @@ yield {'tab': [['first', 1], ['second', 2]], {'first': 'fourth', 'second': 4}]} $$ LANGUAGE plpythonu; SELECT * FROM composite_types_table(); -ERROR: PL/Python functions cannot return type table_record[] -DETAIL: PL/Python does not support conversion to arrays of row types. -CONTEXT: PL/Python function composite_types_table +tab |typ ++ + {(first,1),(second,2)} | {(third,3),(fourth,4)} + {(first,1),(second,2)} | {(third,3),(fourth,4)} + {(first,1),(second,2)} | {(third,3),(fourth,4)} +(3 rows) + -- check what happens if the output record descriptor changes CREATE FUNCTION return_record(t text) RETURNS record AS $$ return {'t': t, 'val': 10} diff --git a/src/pl/plpython/expected/plpython_spi.out b/src/pl/plpython/expected/plpython_spi.out index 5175794..de547d2 100644 --- a/src/pl/plpython/expected/plpython_spi.out +++ b/src/pl/plpython/expected/plpython_spi.out @@ -376,6 +376,15 @@ plan = plpy.prepare(select fname, lname from users where fname like $1 || '%', [text]) c = plpy.cursor(plan, [a, b]) $$ LANGUAGE plpythonu; +CREATE TYPE test_composite_type AS ( + a1 int, + a2 varchar +); +CREATE OR REPLACE FUNCTION plan_composite_args() RETURNS test_composite_type AS $$ +plan = plpy.prepare(select $1::test_composite_type as c1, [test_composite_type]) +res = plpy.execute(plan, [{a1: 3, a2: label}]) +return res[0][c1] +$$ LANGUAGE plpythonu; SELECT simple_cursor_test(); simple_cursor_test @@ -432,3 +441,9 @@ CONTEXT: 
Traceback (most recent call last): PL/Python function cursor_plan_wrong_args, line 4, in module c = plpy.cursor(plan, [a, b]) PL/Python function cursor_plan_wrong_args +SELECT plan_composite_args(); + plan_composite_args +- + (3,label) +(1 row) + diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out index b98318c..9576905 100644 --- a/src/pl/plpython/expected/plpython_types.out +++ b/src/pl/plpython/expected/plpython_types.out @@ -632,13 +632,12 @@ PL/Python function test_type_conversion_array_mixed2 CREATE FUNCTION test_type_conversion_array_record() RETURNS type_record[] AS $$ return [None] $$ LANGUAGE plpythonu; -ERROR: PL/Python functions cannot return type type_record[] -DETAIL: PL/Python does not
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-30 22:44:58 -0400, Robert Haas wrote: On Mon, Jun 30, 2014 at 6:28 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-30 19:22:59 +0200, Andres Freund wrote: On 2014-06-30 12:46:29 -0400, Robert Haas wrote: , which if I understand you correctly are ARM without GCC atomics, Sparc, and MIPS. I've to revise my statement on MIPS, it actually looks safe. I seem to have missed that it has its own S_UNLOCK. So, here's my first blind attempt at fixing these. Too tired to think much more about it. Sparc's configurable cache coherency drives me nuts. Linux apparently switched somewhere in 2011 from RMO (relaxed memory order) to TSO (total store order), Solaris always used TSO, but the BSDs don't. Man. Accordingly there's a somewhat ugly thingy attached. I don't think this is really ready, but it's what I can come up with today. You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. On the other hand, the swpb release thing isn't too complicated and is just the inverse of existing code. I'm quite in favor of doing what we can to support obscure architectures, but I think this might be beyond what's reasonable. Yea, I felt like I was going mad doing it. Perhaps I should have stopped. Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous, btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c had been dropped from Solaris 8, if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not doing anything else there.
That means we also should replace the ldstub with cas in the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Wait free LW_SHARED acquisition
Hi,

Over at -performance Mark Kirkwood tested a recent version of this (http://archives.postgresql.org/message-id/53B283F3.7020005%40catalyst.net.nz). I thought it's interesting to add the numbers to this thread:

Test: pgbench
Options: scale 500, read only
OS: Ubuntu 14.04
Pg: 9.3.4
Pg Options:
  max_connections = 200
  shared_buffers = 10GB
  maintenance_work_mem = 1GB
  effective_io_concurrency = 10
  wal_buffers = 32MB
  checkpoint_segments = 192
  checkpoint_completion_target = 0.8

Results:

Clients | 9.3 tps 32 cores | 9.3 tps 60 cores
--------+------------------+------------------
      6 |            70400 |            71028
     12 |            98918 |           129140
     24 |           230345 |           240631
     48 |           324042 |           409510
     96 |           346929 |           120464
    192 |           312621 |            92663

So we have anti-scaling with 60 cores as we increase the client connections. Ouch! A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1] - cherry-picking the last 5 commits into the 9.4 branch, building a package from that, and retesting:

Clients | 9.4 tps 60 cores (rwlock)
--------+---------------------------
      6 |  70189
     12 | 128894
     24 | 233542
     48 | 422754
     96 | 590796
    192 | 630672

Now, this is a bit of a skewed comparison due to 9.4 vs. 9.3, but still interesting.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL replay bugs
Hello, I had a look at this patch. Let me show you some comments about the README, Makefile and buffer_capture_cmp of the second part for now. A continuation of these comments will follow later.

- contrib/buffer_capture_cmp/README
  - 'contains' seems duplicated in the first paragraph.
  - The second paragraph says that 'This script can use the node number of the master node available as the first argument of the script when it is run within the test suite.' But test.sh does not seem to pass such a parameter.

- contrib/buffer_capture_cmp/Makefile
  make check does nothing when BUFFER_CAPTURE is not defined, as described in the Makefile itself. But I was trapped by that after building the server with 'make CFLAGS=-DBUFFER_CAPTURE' :( It would be better if 'make check' without defining it printed some message.

- buffer_capture_cmp.c
  This source generates a void executable when BUFFER_CAPTURE is not defined. The restriction seems to be there only to use BUFFER_CAPTURE_FILE in bufcapt.h. If so, changing the parameters of the executable as described in the following comment for main() would remove the need for the restriction.

- buffer_capture_cmp.c/main()
  The parameters for this command are the parent directories of each capture file. This is a bit inconvenient for separate use. For example, when I want to gather the capture files from multiple servers and then compare them, I have to unwillingly make a separate directory for each capture file. If there is no particular reason for this design, I suppose it would be more convenient if the parameters were the names of the capture files themselves.

regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] better atomics - v0.5
On 2014-07-01 10:44:19 +0530, Amit Kapila wrote: I think for such usage, we need to rely on barriers wherever there is a need for synchronisation; a recent example I have noticed is in your patch where we have to use pg_write_barrier() during wakeup. However, if we go by the atomic ops definition, then no such dependency is required; for example, if somewhere we need to use pg_atomic_compare_exchange_u32(), after that we don't need to worry about barriers, because this atomic API adheres to full barrier semantics. I think defining barrier support for some atomic ops and not for others will lead to inconsistency with the specs, and we would need to additionally define rules for the usage of such atomic ops. Meh. It's quite common that read/load operations do not have barrier semantics. The majority of read/write users won't need barriers, and it'll be expensive to provide them. b) It's only supported from Vista onwards. Afaik we still support XP. Well, with barrier.h as it stands they'd get a compiler error if it indeed is unsupported? But I think there's something else going on - msft might just be removing XP from its documentation. Okay, but why would they remove it for MemoryBarrier() and not for InterlockedCompareExchange()? I think it is a bit difficult to predict the reason, and if we want to use anything that does not match the msft docs, we should do so carefully and have some way to ensure that it will work fine. Well, we have quite good evidence for it working by knowing that postgres has in the past worked on XP? If you have a better idea, send in a patch for today's barrier.h and I'll adopt that. In this case, I have a question for you. Unpatched usage in barrier.h is as follows: .. If I understand the current define mechanism in barrier.h correctly, it will have a different definition for Itanium processors even for Windows.
Either no one has ever tested postgres on Itanium Windows (quite possible), because afaik _Asm_sched_fence() doesn't work on windows/msvc, or windows doesn't define __ia64__/__ia64. Those macros are for HP's acc on HPUX.

Hmm.. Do you think in such a case it would have gone into the define below?

#elif defined(__INTEL_COMPILER)
/*
 * icc defines __GNUC__, but doesn't support gcc's inline asm syntax
 */
#if defined(__ia64__) || defined(__ia64)
#define pg_memory_barrier() __mf()
#elif defined(__i386__) || defined(__x86_64__)
#define pg_memory_barrier() _mm_mfence()
#endif

Hm? Since _Asm_sched_fence isn't for the Intel compiler but for HP's acc, I don't see why? But they should go into atomics-generic-acc.h.

-#define pg_compiler_barrier() __memory_barrier()

Currently this is considered a compiler barrier for __INTEL_COMPILER, but after the patch I don't see this define. I think after the patch it will be compiler_impl specific, but why such a change?

#if defined(__INTEL_COMPILER)
#define pg_compiler_barrier_impl() __memory_barrier()
#else
#define pg_compiler_barrier_impl() __asm__ __volatile__("" ::: "memory")
#endif

There earlier was a typo defining pg_compiler_barrier instead of pg_compiler_barrier_impl for Intel; maybe that's what you were referring to?

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NUMA packaging and patch
On 2014-07-01 11:01:04 +0200, Christoph Berg wrote: Re: Kevin Grittner 2014-06-09 1402267501.4.yahoomail...@web122304.mail.ne1.yahoo.com

@@ -536,6 +539,24 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port,
 	 */
 	}

+#ifdef USE_LIBNUMA
+	/*
+	 * If this is not a private segment and we are using libnuma, make the
+	 * large memory segment interleaved.
+	 */
+	if (!makePrivate && numa_available())
+	{
+		void	   *start;
+
+		if (AnonymousShmem == NULL)
+			start = memAddress;
+		else
+			start = AnonymousShmem;
+
+		numa_interleave_memory(start, size, numa_all_nodes_ptr);
+	}
+#endif

How much difference would it make if numactl --interleave=all was used instead of using numa_interleave_memory() on the shared memory segments? I guess that would make backend-local memory also interleaved, but it would avoid having a dependency on libnuma in the packages.

I've tested this a while ago, and it's rather painful if you have an OLAP workload with lots of backend private memory.

The numactl manpage even has this example:

    numactl --interleave=all bigdatabase arguments
    Run big database with its memory interleaved on all CPUs.

It is probably better to have native support in the postmaster, though this could be mentioned as an alternative in the documentation.

I wonder if we shouldn't backpatch such a notice.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NUMA packaging and patch
Andres Freund and...@2ndquadrant.com wrote: On 2014-07-01 11:01:04 +0200, Christoph Berg wrote: How much difference would it make if numactl --interleave=all was used instead of using numa_interleave_memory() on the shared memory segments? I guess that would make backend-local memory also interleaved, but it would avoid having a dependency on libnuma in the packages. I've tested this a while ago, and it's rather painful if you have an OLAP workload with lots of backend private memory. I'm not surprised; I would expect it to generally have a negative effect, which would be most pronounced with an OLAP workload. The numactl manpage even has this example: numactl --interleave=all bigdatabase arguments Run big database with its memory interleaved on all CPUs. It is probably better to have native support in the postmaster, though this could be mentioned as an alternative in the documentation. I wonder if we shouldn't backpatch such a notice. I would want to see some evidence that it was useful first. In most of my tests the benefit of interleaving just the OS cache and PostgreSQL shared_buffers was about 2%. That could easily be erased if work_mem allocations and other process-local memory were not allocated close to the process which was using them. I expect that the main benefit of this proposed patch isn't the 2% typical benefit I was seeing, but that it will be insurance against occasional, much larger hits. I haven't had much luck making these worst-case episodes reproducible, though. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Wait free LW_SHARED acquisition
On 07/01/2014 01:08 PM, Andres Freund wrote: Hi, Over at -performance Mark Kirkwood tested a recent version of this (http://archives.postgresql.org/message-id/53B283F3.7020005%40catalyst.net.nz). I thought it's interesting to add the numbers to this thread:

Test: pgbench
Options: scale 500, read only
OS: Ubuntu 14.04
Pg: 9.3.4
Pg Options:
  max_connections = 200
  shared_buffers = 10GB
  maintenance_work_mem = 1GB
  effective_io_concurrency = 10
  wal_buffers = 32MB
  checkpoint_segments = 192
  checkpoint_completion_target = 0.8

Results:

Clients | 9.3 tps 32 cores | 9.3 tps 60 cores
--------+------------------+------------------
      6 |            70400 |            71028
     12 |            98918 |           129140
     24 |           230345 |           240631
     48 |           324042 |           409510
     96 |           346929 |           120464
    192 |           312621 |            92663

So we have anti-scaling with 60 cores as we increase the client connections. Ouch! A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1] - cherry-picking the last 5 commits into the 9.4 branch, building a package from that, and retesting:

Clients | 9.4 tps 60 cores (rwlock)
--------+---------------------------
      6 |  70189
     12 | 128894
     24 | 233542
     48 | 422754
     96 | 590796
    192 | 630672

Now, this is a bit of a skewed comparison due to 9.4 vs. 9.3, but still interesting.

It looks like the issue I reported here: http://www.postgresql.org/message-id/5190e17b.9060...@vmware.com which was fixed by this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b03d196be055450c7260749f17347c2d066b4254. So, we definitely need to compare plain 9.4 vs. patched 9.4, not 9.3.

- Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RETURNING PRIMARY KEY syntax extension
I spent some more time on the patch and here are my review comments.

.) The patch applies with patch -p1 (git apply fails).
.) Trailing whitespace in the patch at various places.
.) Unnecessary newline + and - changes in the patch (src/backend/rewrite/rewriteManip.c::getInsertSelectQuery(), src/include/rewrite/rewriteManip.h).
.) The patch has proper test coverage and the regression tests run cleanly.
.) In map_primary_key_to_list() the patch uses the INDEX_ATTR_BITMAP_IDENTITY_KEY bitmap to get the key columns. In IndexAttrBitmapKind there is also INDEX_ATTR_BITMAP_KEY, so I was confused between INDEX_ATTR_BITMAP_KEY and INDEX_ATTR_BITMAP_IDENTITY_KEY, and I also haven't found related comments in the code. Later, with test cases and debugging, I confirmed that INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the primary key.
.) At present, the patch throws an error when RETURNING PRIMARY KEY is used on a table having no primary key. If I am not wrong, JDBC will be using this in getGeneratedKeys(). Currently this feature is implemented in the JDBC driver by appending RETURNING * to the supplied statement. With this implementation it will replace RETURNING * with RETURNING PRIMARY KEY, right? So I am just wondering what JDBC expects getGeneratedKeys() to return when the query doesn't have a primary key and the user called executeUpdate() with Statement.RETURN_GENERATED_KEYS. I looked at the JDBC specification, but it's not clear what it should return when the table doesn't have keys. Can you please let us know your findings on this?
.) I have already raised my concern regarding the syntax limiting the current RETURNING clause implementation.
On Fri, Jun 27, 2014 at 6:22 AM, Ian Barwick i...@2ndquadrant.com wrote: On 27/06/14 09:09, Tom Dunstan wrote: On 27 June 2014 06:14, Gavin Flower gavinflo...@archidevsys.co.nz mailto:gavinflo...@archidevsys.co.nz wrote: On 27/06/14 00:12, Rushabh Lathia wrote: INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK') returning primary key, dname; I think allowing other columns with PRIMARY KEY would be more useful syntax. Even in later versions if we want to extend this syntax to return UNIQUE KEY, SEQUENCE VALUES, etc.. comma separation syntax will be more handy. I agree 100%. If the query is being hand-crafted, what's to stop the query writer from just listing the id columns in the returning clause? And someone specifying RETURNING * is getting all the columns anyway. The target use-case for this feature is a database driver that has just done an insert and doesn't know what the primary key columns are - in that case mixing them with any other columns is actually counter-productive as the driver won't know which columns are which. What use cases are there where the writer of the query knows enough to write specific columns in the RETURNING clause but not enough to know which column is the id column? Consistency is nice, and I can understand wanting to treat the PRIMARY KEY bit as just another set of columns in the list to return, but I'd hate to see this feature put on the back-burner to support use-cases that are already handled by the current RETURNING feature. Maybe it's easy to do, though.. I haven't looked into the implementation at all. Normal columns are injected into the query's returning list at parse time, whereas this version of the patch handles expansion of PRIMARY KEY at the rewrite stage, which would make handling a mix of PRIMARY KEY and normal output expressions somewhat tricky to handle. 
(In order to maintain the columns in their expected position you'd have to add some sort of placeholder/dummy TargetEntry to the returning list at parse time, then rewrite it later with the expanded primary key columns, or something equally messy). On the other hand, it should be fairly straightforward to handle a list of keywords for expansion (e.g. RETURNING PRIMARY KEY, UNIQUE KEYS, SEQUENCE VALUES) should the need arise. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Rushabh Lathia
Re: [HACKERS] NUMA packaging and patch
Re: Kevin Grittner 2014-07-01 1404213492.98740.yahoomail...@web122306.mail.ne1.yahoo.com Andres Freund and...@2ndquadrant.com wrote: On 2014-07-01 11:01:04 +0200, Christoph Berg wrote: How much difference would it make if numactl --interleave=all was used instead of using numa_interleave_memory() on the shared memory segments? I guess that would make backend-local memory also interleaved, but it would avoid having a dependency on libnuma in the packages. I've tested this a while ago, and it's rather painful if you have a OLAP workload with lots of backend private memory. I'm not surprised; I would expect it to generally have a negative effect, which would be most pronounced with an OLAP workload. Ok, then +1 on having this in core, even if it buys us a dependency on something that isn't in the usual base system after OS install. I wonder if we shouldn't backpatch such a notice. I would want to see some evidence that it was useful first. In most of my tests the benefit of interleaving just the OS cache and PostgreSQL shared_buffers was about 2%. That could easily be erased if work_mem allocations and other process-local memory were not allocated close to the process which was using it. I expect that the main benefit of this proposed patch isn't the 2% typical benefit I was seeing, but that it will be insurance against occasional, much larger hits. I haven't had much luck making these worst case episodes reproducible, though. Afaict, the numactl notice will only be useful as a postscriptum to the --with-libnuma docs, with the caveats mentioned. Or we backpatch (something like) the full docs of the feature, with a note that it's only 9.5+. (Or the full feature gets backpatched...) Christoph -- c...@df7cb.de | http://www.df7cb.de/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IMPORT FOREIGN SCHEMA statement
On Tue, Jul 1, 2014 at 4:23 PM, Ronan Dunklau ronan.dunk...@dalibo.com wrote: The remote_schema parameter can be used for SQL injection. Either we should go back to using parameters, or be extra careful. Since the remote schema is parsed as a name, it is limited to 64 characters, which is not that useful for an SQL injection, but still. Yes, right, I completely forgot that. IMPORT SCHEMA could be used by a non-superuser, so controlling only the size of the relation name is not enough. The new query as you wrote it returns the typname (it was cast to regtype before). This is not schema-qualified, and it will fail when importing tables with columns of a type not in search_path. Hm. The current solution with simply LookupTypeNameOid is more elegant than relying on InputFunctionCall to fetch a Datum that is then converted back into a TypeName... In any case, I am wondering about the use of regtype where the same type name is used across multiple schemas. We should perhaps document clearly that search_path influences the custom type chosen when rebuilding foreign tables, and that postgres_fdw does its best but may not be compatible with the type on the foreign server. The regression tests don't pass: a user name is hard-coded in the result of DROP USER MAPPING. Should we expect the tests to be run as postgres? I think that we need a cleaner solution for this test case. I've left it alone for the time being, but a make installcheck would leave an extra database behind, as it cannot be removed in postgres_fdw.sql (an extra test case just to clean up would work, but I don't think that it is worth the complication). We could abuse search_path not tracking types located in certain schemas to trigger this error :) Want to give the following things a shot? I wouldn't mind changing back the query construction part :) -- Michael
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
Re: Tom Lane 2014-06-23 17054.1403542...@sss.pgh.pa.us While I'd love to reduce the number of future installations without this fix in place, I respect the decision to honor project policy. At the same time, this change does not break anything. It introduces new environment variables which change the behaviour, but behaves the old way in the absence of those variables. Uh, no, it doesn't. We removed the dependence on -DLINUX_OOM_SCORE_ADJ. If a packager is expecting that to still work in 9.4, he's going to be unpleasantly surprised, because the system will silently fail to do what he's expecting: it will run all the backend processes at no-OOM-kill priority, which is likely to be bad. OK, I'm late to the party, but the reason I'm still joining is that we have proper unit tests, which just told me the 9.5 packages have changed OOM handling. So it wouldn't just slip through if you changed that in 9.4 as well, but would get fixed. I have two comments on the patch: The choice to make the behavior depend first on PG_OOM_ADJUST_FILE and only secondly on PG_OOM_ADJUST_VALUE seems the wrong way round to me. On every reasonably recent kernel, oom_score_adj is present, so PG_OOM_ADJUST_FILE should default to using it. On the other hand, what people really want to achieve (or tune) with this feature is to set the OOM adjustment back to 0 (or some other value), so to me it would be much more natural to set PG_OOM_ADJUST_VALUE=0 to activate the feature, and only have to mess with PG_OOM_ADJUST_FILE if my kernel is older. (Which it isn't on any kernel supported by Debian and Ubuntu LTS.) The other bit is the non-deprecation of OOM_ADJ in contrib/start-scripts/linux. It took me a while to understand the logic of the variables used there - the interface would be much clearer if it simply were like this: ... set default PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj ...
and then this in the configurable part of the script:

PG_MASTER_OOM_SCORE_ADJ=-1000
PG_OOM_SCORE_ADJ=0

# Older Linux kernels may not have /proc/self/oom_score_adj, but instead
# /proc/self/oom_adj, which works similarly except the disable value is -17.
# For such a system, uncomment these three lines instead.
#PG_OOM_ADJUST_FILE=/proc/self/oom_adj
#PG_MASTER_OOM_SCORE_ADJ=-17
#PG_OOM_SCORE_ADJ=0

... and then use PG_OOM_ADJUST_FILE below instead of manually figuring out which proc file to write to by looking at OOM.*_ADJ.

Christoph -- c...@df7cb.de | http://www.df7cb.de/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On 6/29/14, 12:20 PM, Tom Lane wrote: I'm tempted to just rip out all the useless code rather than fix the logic bug as such. OTOH, that might complicate updating to more recent versions of the original Autoconf macro. On the third hand, we've not bothered to do that in ten years either. Our version of the tests are already hacked up enough that we probably won't ever upgrade. I say keep hacking away. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
I'm sorry to interrupt you, but I feel strong sympathy with Stephen-san. From: Robert Haas robertmh...@gmail.com I don't think that's a valid objection. If we someday have auditing in core, and if it subsumes what pgaudit does, then whatever interfaces pgaudit implements can be replaced with wrappers around the core functionality, just as we did for text search. Won't it be a burden and a headache to maintain the pgaudit code when it becomes obsolete in the near future? But personally, I think this patch deserves to be reviewed on its own merits, and not the extent to which it satisfies your requirements, or those of NIST 800-53. As I said before, I think auditing is a complicated topic and there's no guarantee that one solution will be right for everyone. As long as we keep those solutions out of core, there's no reason that multiple solutions can't coexist; people can pick the one that best meets their requirements. As soon as we start talking about putting something into core, the bar is a lot higher, because we're not going to put two auditing solutions into core, so if we do put one in, it had better be the right thing for everybody. I don't even think we should be considering that at this point; I think the interesting (and under-discussed) question on this thread is whether it even makes sense to put this into contrib. That means we need some review of the patch for what it is, which there hasn't been much of, yet. Then, what is this auditing capability for? I don't know whether various regulations place such different requirements on auditing, but how about targeting some real requirements? What would make many people happy? PCI DSS? From my experience, Japanese customers are demanding, and I'm afraid they would be disappointed if PostgreSQL provided auditing functionality which does not conform to any real regulation like PCI DSS, NIST, etc., now that other major vendors have provided auditing for years.
They wouldn't want to customize contrib code because DBMS development is difficult. I wish for serious in-core auditing functionality. Regards, MauMau
Re: [HACKERS] Fresh initdb contains a few deleted B-Tree pages
Why is this even a small concern? Initdb runs the bootstrap SQL, which does various SQL operations, so it's not surprising there are some updates creating garbage. IIRC we don't even rely on template0 being frozen any more. -- greg
Re: [HACKERS] pg_receivexlog add synchronous mode
On Mon, Jun 30, 2014 at 7:09 PM, furu...@pm.nttdata.co.jp wrote: Thanks for the review! +if (secs <= 0) +secs = 1;/* Always sleep at least 1 sec */ + +sleeptime = secs * 1000 + usecs / 1000; The above is the code which caused that problem. 'usecs' should have been reset to zero when 'secs' is rounded up to 1 second. But it was not. Attached is the updated version of the patch. Thank you for the refactoring v2 patch. I did a review of the patch. 1. applied cleanly and compiled without warnings and errors 2. all regression tests passed OK 3. sleeptime is OK when --status-interval is set to 1 Thanks for reviewing the patch! I think that this refactoring patch is useful for improving source code readability and making the future patches simpler, whether we adopt your patch or not. So, barring any objections, I'm thinking of committing this refactoring patch. Regards, -- Fujii Masao
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 30 June 2014 16:20, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Jun 30, 2014 at 9:39 AM, Stephen Frost sfr...@snowman.net wrote: I think the fact that pgaudit does X and you think it should do Y is a perfect example of why we're nowhere close to being ready to push anything into core. We may very well want to do that someday, but not yet. That's fine- but don't push something in which will make it difficult to add these capabilities later (and, to be clear, I'm not asking out of pipe dreams and wishes but rather after having very specific discussions with users and reviewing documents like NIST 800-53, which is publically available for anyone to peruse). I don't think that's a valid objection. If we someday have auditing in core, and if it subsumes what pgaudit does, then whatever interfaces pgaudit implements can be replaced with wrappers around the core functionality, just as we did for text search. I actually doubt that we'd be willing to do what we did for text search again. Perhaps I'm wrong, which would be great, but I doubt it. But personally, I think this patch deserves to be reviewed on its own merits, and not the extent to which it satisfies your requirements, or those of NIST 800-53. eh? The focus of this patch is to add auditing to PG and having very clearly drawn auditing requirements of a very large customer isn't relevant? I don't follow that logic at all. In fact, I thought the point of pgaudit was specifically to help address the issues which are being raised from this section of our user base (perhaps not those who follow NIST, but at least those organizations with similar interests..). It's not clear to me how NIST 800-53 mandates the use of SQL (or any other) language statements for auditing. I strongly doubt that it does, but I am happy to hear summaries of large documents, or specific references to pages and paragraphs, just as we would do for SQL Standard. 
There is no other difference between a function and a statement - both may create catalog entries; I agree we need catalog entries so that auditing config can itself be audited. The rest of this argument seems to revolve around the idea that in-core and extension/contrib are different things. That is much more of a general point and one that is basically about development style. We've built logical replication around the idea that the various plugins can be the in-core version sometime in the future, so anything you say along those lines affects that as well as sepgsql, pgcrypto, pg_stat_statements, etc. Adding things to the grammar, to pg_proc.h and other hardcoded locations is seriously painful and time-consuming, one of the reasons why extensions and background workers exist. But if you missed it before, I will say it again: we have a patch now and 2ndQuadrant has a limit on further funding. If you want to reject pgaudit and move on to something else, let's start discussing that design and consider who has the time (and implicitly the money) to make that happen. We can pick the best one for inclusion in 9.5 later in the cycle. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] pg_xlogdump --stats
On Wed, Jun 4, 2014 at 1:47 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Here's a patch to make pg_xlogdump print summary statistics instead of individual records. Thanks! I had a use for this feature so I backported the (first) patch to PostgreSQL 9.3. It's a rush job so it's ugly and may have bugs, but it works for me. Just posting here, maybe someone else will find it useful too. Regards, Marti xlogdiff_stats_93.patch.gz Description: GNU Zip compressed data
Re: [HACKERS] Patch to send transaction commit/rollback stats to the stats collector unconditionally.
Tom Lane t...@sss.pgh.pa.us wrote: If we're going to do it like this, then I think the force flag should be considered to do nothing except override the clock check, which probably means it shouldn't be tested in the initial if() at all. That makes sense, and is easily done. The only question left is how far back to take the patch. I'm inclined to only apply it to master and 9.4. Does anyone think otherwise? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] RETURNING PRIMARY KEY syntax extension
On Tue, Jul 1, 2014 at 8:00 AM, Rushabh Lathia rushabh.lat...@gmail.com wrote: .) In map_primary_key_to_list() the patch uses the INDEX_ATTR_BITMAP_IDENTITY_KEY bitmap to get the keycols. In IndexAttrBitmapKind there is also INDEX_ATTR_BITMAP_KEY, so I was confused between INDEX_ATTR_BITMAP_KEY and INDEX_ATTR_BITMAP_IDENTITY_KEY, and also didn't find related comments in the code. Later, using a test case and debugging, I confirmed that INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the primary key. Actually, that depends on how REPLICA IDENTITY is set. IOW, you can't assume that will give you the primary key. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] avoiding tuple copying in btree index builds
On Tue, Jun 17, 2014 at 10:08 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jun 16, 2014 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: On a micro-optimization level, it might be worth passing the TID as ItemPointer not ItemPointerData (ie, pass a pointer until we get to the point of actually inserting the TID into the index tuple). I'm not sure that copying odd-size structs should be assumed to be efficient. Yeah, true. Checking existing precedent, it looks like we usually pass ItemPointer rather than ItemPointerData, so it's probably a good idea to do it that way here too, for reasons of style if nothing else. I kind of wonder whether it's really more efficient to pass an 8-byte pointer to a 6-byte structure than to just pass the structure itself, but it might be. The pointer will certainly be passed in a register, or whatever passes for registers on the particular machine architecture. Weird-size structs, though, tend to have arcane and not-so-efficient rules for being passed by value. It's not unlikely that what the compiler will do under the hood is pass a pointer anyhow, and then do a memcpy to make a local copy in the called function. OK, committed with the suggested changes. (Thanks to Abhijit for pinging me about this, and more generally for his active and effective management of this CommitFest!) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WAL format and API changes (9.5)
On Mon, Jun 30, 2014 at 4:51 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Another big change in the attached patch version is that XLogRegisterData() now *copies* the data to a working area, instead of merely adding a XLogRecData entry to the chain. This simplifies things in xlog.c, now that the XLogRecData concept is not visible to the rest of the system. This is a subtle semantic change for the callers: you can no longer register a struct first, and fill the fields afterwards. That adds a lot of memcpys to the critical path, which could be bad for performance. In practice, I think it's OK, or even a win, because a typical WAL record payload is very small. But I haven't measured that. If it becomes an issue, we could try to optimize, and link larger data blocks to the rdata chain, but memcpy small chunks of data. There are a few WAL record types that don't fit in a small statically allocated buffer, so I provided a new function, XLogEnsureSpace(), that can be called before XLogBegin() to allocate a large-enough working area. On the other hand, the patch I just committed (9f03ca915196dfc871804a1f8aad26207f601fd6) demonstrates that even copying relatively small amounts of data can be expensive if you do it frequently, and certainly WAL insertions are frequent. Of course, some of the overhead there is from palloc() and friends rather than from the copying itself, but the copying itself isn't free, either. And some WAL records can be really big. Of course, it could still work out to a win if the simplifications in xlog.c save enough. I don't know whether that's the case or not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Set new system identifier using pg_resetxlog
On Mon, Jun 30, 2014 at 12:46 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I think it's pretty much a given that pg_resetxlog is a tool that can have disastrous effects if used lightly. If people change their sysid wrongly, they're not any worse off than if they change their multixact counters and start getting failures because the old values stored in data cannot be resolved anymore (it's already been wrapped around). Or if they remove all the XLOG they have since the latest crash. From that POV, I don't think the objection that but this can be used to corrupt data! has any value. After thinking about this a little more, I guess I don't really think it's a big problem either - so consider my objection withdrawn. I am, however, kind of frustrated, still, that the pg_computemaxlsn patch, which I thought was rather a good idea, was scuttled by essentially that same objection: let's not extend pg_resetxlog and friends because people might use the new functionality to do bad things and then blame us. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Set new system identifier using pg_resetxlog
On 2014-07-01 11:11:12 -0400, Robert Haas wrote: On Mon, Jun 30, 2014 at 12:46 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I think it's pretty much a given that pg_resetxlog is a tool that can have disastrous effects if used lightly. If people change their sysid wrongly, they're not any worse off than if they change their multixact counters and start getting failures because the old values stored in data cannot be resolved anymore (it's already been wrapped around). Or if they remove all the XLOG they have since the latest crash. From that POV, I don't think the objection that but this can be used to corrupt data! has any value. After thinking about this a little more, I guess I don't really think it's a big problem either - so consider my objection withdrawn. Thanks! I am, however, kind of frustrated, still, that the pg_computemaxlsn patch, which I thought was rather a good idea, was scuttled by essentially that same objection: let's not extend pg_resetxlog and friends because people might use the new functionality to do bad things and then blame us. Well, the reasons were a bit different. Senior community members repeatedly suggested that it'd be usable for failback - and it's not unreasonable to think it is. Even though it'd fail subtly because of hint bit and related reasons. In contrast, you have to be pretty desperate to think that you could make two clusters replicate from each other by just fudging pg_control long enough, even if the clusters aren't actually related. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Set new system identifier using pg_resetxlog
Robert Haas wrote: On Mon, Jun 30, 2014 at 12:46 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I think it's pretty much a given that pg_resetxlog is a tool that can have disastrous effects if used lightly. If people change their sysid wrongly, they're not any worse off than if they change their multixact counters and start getting failures because the old values stored in data cannot be resolved anymore (it's already been wrapped around). Or if they remove all the XLOG they have since the latest crash. From that POV, I don't think the objection that but this can be used to corrupt data! has any value. After thinking about this a little more, I guess I don't really think it's a big problem either - so consider my objection withdrawn. Great. I am, however, kind of frustrated, still, that the pg_computemaxlsn patch, which I thought was rather a good idea, was scuttled by essentially that same objection: let's not extend pg_resetxlog and friends because people might use the new functionality to do bad things and then blame us. Uh, I thought you killed that patch yourself: http://www.postgresql.org/message-id/CA+TgmoZqbYHWYbi18FUk-+dGHig=icv+pj-nnav3miy4daj...@mail.gmail.com -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Escaping from blocked send() reprised.
On Mon, Jun 30, 2014 at 11:26 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: 2. I think it would be reasonable to try to kill off the connection without notifying the client if we're unable to send the data to the client in a reasonable period of time. But I'm unsure what a reasonable period of time means. This patch would basically do it after no delay at all, which seems like it might be too aggressive. However, I'm not sure. I don't think there is such a reasonable time; the behavior should perhaps be determined from another angle. One alternative would be to let pg_terminate_backend() take a parameter requesting a forced shutdown (how would it be propagated?), or to make a duplicate invocation of pg_terminate_backend() perform a forced shutdown. Well, I think that when people call pg_terminate_backend() just once, they expect it to kill the target backend. I think people will tolerate a short delay, like a few seconds; after all, there's no guarantee, even today, that the backend will hit a CHECK_FOR_INTERRUPTS() in less than a few hundred milliseconds. But they are not going to want to have to take a second action to kill the backend - killing it once should be sufficient. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Can simplify 'limit 1' with slow function?
The simplified case: select slowfunction(s) from a order by b limit 1; is slower than select slowfunction(s) from (select s from a order by b limit 1) as z; if there are many records in table 'a'. The real case: the function ST_Distance_Sphere is slow, and the query: SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road order by c limit 1; is slower than: select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s from road order by c limit 1) as a; There are about 7000 records in 'road'.
Re: [HACKERS] Set new system identifier using pg_resetxlog
On Tue, Jul 1, 2014 at 11:19 AM, Andres Freund and...@2ndquadrant.com wrote: I am, however, kind of frustrated, still, that the pg_computemaxlsn patch, which I thought was rather a good idea, was scuttled by essentially that same objection: let's not extend pg_resetxlog and friends because people might use the new functionality to do bad things and then blame us. Well, the reasons were a bit different. Senior community members repeatedly suggested that it'd be usable for failback - and it's not unreasonable to think it is. Even though it'd fail subtly because of hint bit and related reasons. In contrast, you have to be pretty desperate to think that you could make two clusters replicate from each other by just fudging pg_control long enough, even if the clusters aren't actually related. Well, I still think it's pretty likely someone could make that mistake here, too. Maybe not a senior community member, but somebody. However, I think the right answer in that case and this one is to tell the person who has made the mistake "you screwed up" and move on. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On 06/29/2014 02:58 PM, Patrick Finnegan wrote: Last I checked, NetBSD doesn't support any sort of multiprocessor VAX. Multiprocessor VAXes exist, but you're stuck with either Ultrix or VMS on them. Hi Pat, it's good to see your name in my inbox. NetBSD ran on multiprocessor BI-bus VAXen many, many years ago. Is that support broken? -Dave -- Dave McGuire, AK4HZ/3 New Kensington, PA
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On 06/29/2014 10:54 AM, Andres Freund wrote: Is there anyone in the NetBSD/VAX community who would be willing to host a PG buildfarm member? I could put together a simh-based machine (i.e., fast) on a vm, if nobody else has stepped up for this. No other volunteers have emerged, so if you'd do that it'd be great. Maybe I'm just not playful enough, but keeping a platform alive so we can run postgres in a simulator seems a bit, well, pointless. There are a couple of points. First and foremost is portability. Using as many architectures as possible as test platforms keeps us honest and can be a highly valuable tool for early discovery of portability issues or iffy code constructs. The VAX in particular is an extreme example of many aspects of processor architecture, and as such, is an excellent tool for that sort of thing. Next, some people actually want to *run* it on a VAX. Maybe for hobby reasons, maybe for approved platform reasons, whatever... We don't know (and don't care) why. On the "in a simulator" matter: It's important to keep in mind that there are more VAXen out there than just simulated ones. I'm offering up a simulated one here because I can spin it up in a dedicated VM on a VMware host that's already running and I already have power budget for. I could just as easily run it on real hardware... there are, at last count, close to forty real-iron VAXen here, but only a few of those are running 24/7. I'd happily bring up another one to do Postgres builds and testing, if someone will send me the bucks to pay for the additional power and cooling. (that is a real offer) -Dave -- Dave McGuire, AK4HZ/3 New Kensington, PA
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On Sun, Jun 29, 2014 at 10:24:22AM -0400, Tom Lane wrote: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypudt=2014-06-29%2012%3A33%3A12 so I'm a bit confused as to what we need to change for VAX. Dave did use NetBSD 6.1 (IIRC), which uses an ancient gcc version. I would suggest going for NetBSD-current (if someone sets it up before the netbsd-7 branch, or 7 post-branch), which uses gcc 4.8.3 on vax. All other architectures already used a more modern gcc on 6. However, there is still a bit of fallout to fix in -current. Martin
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On 06/29/2014 03:35 PM, Tom Lane wrote: Hey, right up the river from here! Come on up and hack! There's always something neat going on around here. Ever run a PDP-11? B-) There were so many PDP-11s around CMU when I was an undergrad that I remember seeing spare ones being used as doorstops ;-). I even got paid to help hack on this: http://en.wikipedia.org/wiki/C.mmp So nah, don't want to do it any more. Been there, done that. Ah well, more for me then. I cut my teeth on PDP-11s 30 years ago, and I always find it therapeutic to fire one up today. I have a biggish commercial building and am putting together a computer museum up here. Right now it's only privately open, and we have hack workshops and stuff to get things running. There are about sixty racks in here. It's awesome that you worked on C.mmp. I saw that machine in person a few weeks ago. I'd love to discuss that with you a bit off-list if you can spare the time. -Dave -- Dave McGuire, AK4HZ/3 New Kensington, PA
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On 2014-06-29 12:12, Dave McGuire wrote: On 06/29/2014 03:10 PM, Patrick Finnegan wrote: And it also runs on the 11/780 which can have multiple CPUs... but I've never seen support for using more than one CPU (and the NetBSD page still says NetBSD/vax can only make use of one CPU on multi-CPU machines). If that has changed, I'd love to hear about it. Support for my VAX 6000 would also be nice... It changed well over a decade ago, if memory serves. The specific work was done on a VAX-8300 or -8350. I'm pretty sure the 11/780's specific flavor of SMP is not supported. (though I do have a pair of 11/785s here...wanna come hack? ;)) Well, VAX-11/78x do not support SMP, they have (had) ASMP only. Johnny
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On 06/25/2014 01:57 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jun 25, 2014 at 1:05 PM, John Klos j...@ziaspace.com wrote: While I wouldn't be surprised if you remove the VAX code because not many people are going to be running PostgreSQL, I'd disagree with the assessment that this port is broken. It compiles, it initializes databases, it runs, et cetera, albeit not with the default postgresql.conf. Well, the fact that initdb didn't produce a working configuration and that make installcheck failed to work properly are bad. But, yeah, it's not totally broken. I'm actually rather impressed at how well PostgreSQL can be adjusted to lower memory systems. I deploy a lot of embedded systems with 128 megs (a lot for an embedded system, but nothing compared with what everyone else assumes), so I'll be checking out PostgreSQL for other uses. I agree, that's cool. I think we'd be happy to keep the VAX port of PG going as long as we get regular feedback on it, ie closed-loop maintenance not open-loop ;-) Is there anyone in the NetBSD/VAX community who would be willing to host a PG buildfarm member? http://buildfarm.postgresql.org/index.html The requirements for this beyond what it takes to build from source are basically just working git and Perl (ccache helps a lot too), and enough cycles to build the code at least once a day or so. Once you've got the thing set up it seldom needs human attention. If we had a buildfarm member to tell us when we break things, it would be a lot easier to promise continued support. I could put together a simh-based machine (i.e., fast) on a vm, if nobody else has stepped up for this. -Dave -- Dave McGuire, AK4HZ/3 New Kensington, PA
Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD
On 06/29/2014 02:06 PM, Tom Lane wrote: Well, the issue from our point of view is that a lot of what we care about testing is extremely low-level hardware behavior, like whether spinlocks work as expected across processors. It's not clear that a simulator would provide a sufficiently accurate emulation. Oh ok, I understand. Thank you for the clarification. OTOH, the really nasty issues like cache coherency rules don't arise in single-processor systems. So unless you have a multiprocessor VAX available to spin up, a simulator may tell us as much as we'd learn anyway. (If you have got one, maybe some cash could be found --- we do have project funds available, and I think they'd be well spent on testing purposes. I don't make those decisions though.) I have several multiprocessor VAXen, but only one of them is capable of running NetBSD, and I only (currently) have a single processor in that machine. I can (and want to) fix that, but not right away. -Dave -- Dave McGuire, AK4HZ/3 New Kensington, PA
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote: You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. Yeah, I didn't realize at the time that you were making that argument that the existing code was thought to be broken on its own terms. Removing probably-working code that we just can't test easily is, in my mind, quite different from removing probably-broken code for which we can't test a fix. By the time PostgreSQL 9.5 is released, GCC 4.4 will be six years old, and telling people on an obscure platform that few operating system distributions support that they can't use a brand-new PostgreSQL with a seven-year-old compiler doesn't seem like a serious problem, especially since the only alternative we can offer is compiling against completely-untested code. Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous, btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c had been dropped from Solaris 8, if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. If you want to do that, it's fine with me. What I would do is: - Back-patch the addition of the sparcv8+ stuff all the way. If anyone's running anything older, let them complain... - Remove the special case for MIPS without gcc intrinsics only in master, leaving the back-branches broken.
If anyone cares, let them complain... - Nothing else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] RLS Design
On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: An annoying complication, however, is how this interacts with column privileges. Right now GRANT SELECT(col1) ON t1 TO role1 gives role1 access to every row in col1, and I think that has to remain the case, since GRANTs only ever give you more access. But that leads to a situation where the RLS quals applied would depend on the columns selected. Wow, that seems pretty horrible to me. That means that if I do: SELECT a FROM tab; and then: SELECT a, b FROM tab; ...the second one might return fewer rows than the first one. I think there's a good argument that RLS is unlike other grantable privileges, and that it really ought to be defined as something which is imposed rather than a kind of access grant. If RLS is merely a modifier to an access grant, then every access grant has to make sure to include that modifier, or you have a security hole. But if it's a separate constraint on access, then you just do it once, and exempt people from it only as needed. That seems less error-prone to me -- it's sort of a default-deny policy, which is generally viewed as good for security -- and it avoids weird cases like the above, which I think could easily break application logic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 11:22 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote: You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. Yeah, I didn't realize at the time that you were making that argument that the existing code was thought to be broken on its own terms. Removing probably-working code that we just can't test easily is, in my mind, quite different from removing probably-broken code for which we can't test a fix. By the time PostgreSQL 9.5 is released, GCC 4.4 will be six years old, and telling people on an obscure platform that few operating system distributions support that they can't use a brand-new PostgeSQL with a seven-year-old compiler doesn't seem like a serious problem, especially since the only alternative we can offer is compiling against completely-untested code. A few years back I ported the postgresql client libraries and a few other pieces of software (in particular subversion) to a lot of obscure platforms (old sparc, hpux, irix, older aix, etc etc). Getting a modern gcc working on those platforms (with the possible exception of aix) is in many cases difficult or impossible. So requiring a new gcc is exactly equivalent to desupporting. merlin
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On Tue, Jul 1, 2014 at 1:25 PM, Dilip kumar dilip.ku...@huawei.com wrote: On 01 July 2014 03:48, Alvaro Wrote, In particular, pgpipe is almost an exact duplicate between them, except the copy in vac_parallel.c has fallen behind changes made to parallel.c. (Those changes would have fixed the Windows warnings). I think that this function (and perhaps other parts as well--exit_horribly for example) need to be refactored into a common file that both files can include. I don't know where the best place for that would be, though. (I haven't done this type of refactoring myself.) I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos. Maybe we should move pgpipe back to src/port and have pg_dump and this new thing use that. I'm not sure about the rest of duplication in vac_parallel.c; there might be a lot in common with what pg_dump/parallel.c does too. Having two copies of code is frowned upon for good reasons. This patch introduces 1200 lines of new code in vac_parallel.c, ugh. If we really require 1200 lines to get parallel vacuum working for vacuumdb, I would question the wisdom of this effort. To me, it seems better spent improving autovacuum to cover whatever it is that this patch is supposed to be good for --- or maybe just enable having a shell script that launches multiple vacuumdb instances in parallel ... Thanks for looking into the patch. I think if we use a shell script to launch parallel vacuumdb instances, we cannot fully control how the work is divided. If we statically divide the tables between multiple processes, some processes may get very big tables, and then it is as bad as one process doing all the work. In this patch we assign only one table to each process at a time, and whichever process finishes first is assigned a new table; this way all processes get an equal share of the work. Thanks Regards, Dilip Kumar I have executed the latest patch. One question: is this use of the --jobs option correct? $ vacuumdb -d postgres --jobs=30 I got the following error. 
vacuumdb: unrecognized option '--jobs=30' Try vacuumdb --help for more information. Regards, --- Sawada Masahiko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] buildfarm and rolling release distros
I've always been a bit reluctant to accept buildfarm members that are constantly being updated, because it seemed to me that it created something with too many variables. However, we occasionally get requests from people who want to run on such platforms, and I'm also a bit reluctant to turn away willing volunteers. We have one such application now in hand. What do people think about this? Is it valuable to have? Do we have enough stability from the buildfarm members that are not auto-updated that we can accept a certain number of auto-updating members, where, if something breaks, and it doesn't break elsewhere, then we suspect that something that got upgraded broke the build? I'm also not sure how to designate these machines. The buildfarm server metadata isn't designed for auto-updating build platforms. But no doubt if necessary we can come up with something. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-07-01 11:46:19 -0500, Merlin Moncure wrote: On Tue, Jul 1, 2014 at 11:22 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 1, 2014 at 6:04 AM, Andres Freund and...@2ndquadrant.com wrote: You know, looking at this, I wonder if we shouldn't just remove support for ARMv5 instead of making a blind stab at a fix. Well, I argued that way for a while ;). We don't even need to really desupport it, but just say it's not supported for gcc 4.4. Yeah, I didn't realize at the time that you were making that argument that the existing code was thought to be broken on its own terms. Removing probably-working code that we just can't test easily is, in my mind, quite different from removing probably-broken code for which we can't test a fix. By the time PostgreSQL 9.5 is released, GCC 4.4 will be six years old, and telling people on an obscure platform that few operating system distributions support that they can't use a brand-new PostgreSQL with a seven-year-old compiler doesn't seem like a serious problem, especially since the only alternative we can offer is compiling against completely-untested code. A few years back I ported the postgresql client libraries and a few other pieces of software (in particular subversion) to a lot of obscure platforms (old sparc, hpux, irix, older aix, etc etc). Getting a modern gcc working on those platforms (with the possible exception of aix) is in many cases difficult or impossible. Well, we're talking about a case where the current code is *broken*. Subtly so. Where we have no way of testing potential fixes. If you/somebody steps up to fix and test that, cool. But I don't see it as a service to anyone to ship broken stuff. So requiring new gcc is exactly equivalent to desupporting. Yea, right. The case we're talking about here is armv5 and some armv6's (the current fallback inline assembly doesn't work on all v6s). If postgres were indeed in use and updated on such a platform it'd be cross compiled with near certainty. 
The other case is sparcv7 and sparcv8 (not v8+). The former has been dropped from solaris8, the latter from solaris9. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
Simon, * Simon Riggs (si...@2ndquadrant.com) wrote: On 30 June 2014 16:20, Stephen Frost sfr...@snowman.net wrote: eh? The focus of this patch is to add auditing to PG and having very clearly drawn auditing requirements of a very large customer isn't relevant? I don't follow that logic at all. In fact, I thought the point of pgaudit was specifically to help address the issues which are being raised from this section of our user base (perhaps not those who follow NIST, but at least those organizations with similar interests..). It's not clear to me how NIST 800-53 mandates the use of SQL (or any other) language statements for auditing. 800-53 doesn't mandate SQL. If that was implied by me somewhere then I apologize for the confusion. Having SQL support *has* been requested by multiple users and in some of those cases has been considered a requirement for their specific case; perhaps that was the confusion. What 800-53 *does* discuss (under AU-9, control 4, from here: http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-53r4.pdf) is the auditor role as being independent and able to control auditing. Further, it discusses under AU-2 that organizations must determine the set of events which are to be audited and then under AU-12 (in general, but also specifically under AU-12 control 3), that authorized individuals can change what is being audited. Another point is a practical matter- it's extremely important to control where the audit logs go as well, and ideally it'd be possible to send audit logs for certain RLS policies to one location while other audit logs are sent to a different (perhaps less secure) location. That's an issue which I don't believe has to be addressed immediately, but until it is addressed the audit log has to be at the highest level of control, which makes it more difficult to manage and monitor. 
I strongly doubt that it does, but I am happy to hear summaries of large documents, or specific references to pages and paragraphs, just as we would do for SQL Standard. There is no other difference between a function and a statement - both may create catalog entries; I agree we need catalog entries so that auditing config can itself be audited. Having functions to control the auditing would work, but it's not exactly the ideal approach, imv, and the only reason it's being discussed here is because it might be a way to allow an extension to provide the auditing- not because it's actually a benefit to anyone. However, if we have such functions in a contrib extension, I worry what the pg_upgrade path is from that extension to an in-core solution. Robert feels that's manageable and perhaps he's right but we don't have either the function-based design with custom reloptions nor a concrete idea of what the in-core solution would be and so I'm skeptical that we'd be able to easily provide a path from one to the other with pg_upgrade. The rest of this argument seems to revolve around the idea that in-core and extension/contrib are different things. That is much more of a general point and one that is basically about development style. The downside of extensions for this case is the inability to create and integrate new catalog tables into the pg_catalog environment, and the inability for extensions to extend the SQL grammar, as you mention. I'd love to see both of those issues solved in a way which would allow us to build the auditing system that I've been dreaming about as an extension or feature. I don't feel we should be holding off on projects like auditing while we wait for that to happen though. But if you missed it before, I will say it again: we have a patch now and 2ndQuadrant has a limit on further funding. 
If you want to reject pgaudit and move on to something else, let's start discussing that design and consider who has the time (and implicitly the money) to make that happen. We can pick the best one for inclusion in 9.5 later in the cycle. What I'd love to see is progress on both fronts, really. pgaudit is great and it'd be good to have it for the older releases and for those releases up until we get real auditing in PG- I just don't want to make adding that in-PG-auditing later more difficult than it already is by having to address upgrades from pgaudit. That's not an issue if pgaudit lives outside of PG, but it could be an issue if it's in -contrib. If we're willing to say that we won't worry about upgrades from pgaudit to an in-core solution (or at least that the lack of such an upgrade path won't prevent adding an in-core auditing system) then my concerns about that would be addressed, but we don't get to change our minds about that after pgaudit has been released as part of 9.5.. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] PATCH: Allow empty targets in unaccent dictionary
At 2014-06-30 22:06:30 -0400, t...@sss.pgh.pa.us wrote: I went ahead and committed this patch, and also some further work to fix the multicharacter-source problem. I took it on myself to make the code issue warnings about misformatted lines, too. Thanks, looks good. I found the multicharacter-source diff an instructive read. -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 12:46 PM, Merlin Moncure mmonc...@gmail.com wrote: A few years back I ported the postgresql client libraries and a few other pieces of software (in particular subversion) to a lot of obscure platforms (old sparc, hpux, irix, older aix, etc etc). Getting a modern gcc working on those platforms (with the possible exception of aix) is in many cases difficult or impossible. So requiring new gcc is exactly equivalent to desupporting. I took a look around to see which operating systems support which hardware platforms. It's a little hard to tell who is supporting which operating systems, because the terminology is different on different project web sites, and it's hard to tell which things are actually ARMv5, as opposed to something else. But it seems that NetBSD has by far the largest list of supported platforms, and it appears that they compile their current releases with either gcc 4.5 or gcc 4.8, depending on the particular port: http://www.netbsd.org/developers/features/ According to http://www.netbsd.org/ports/ the ports that run on some form of ARM are acorn26, acorn32, cats, epoc32, evbarm, hpcarm, iyonix, netwinder, shark, and zaurus. epoc32 is not listed at all on the features link above, but the others are all listed as using gcc 4.8.x. Even if they were using gcc 4.5.x, though, that would be good enough, and most platforms that are now on 4.8.x were on 4.5.x before the 4.8.x import got done: http://mail-index.netbsd.org/tech-userlevel/2014/02/18/msg008484.html Apparently, the last holdout preventing removal of gcc 4.1.x from NetBSD was the vax port, which has happily now been upgraded to 4.8.x. Debian also supports ARMv5; actually, they support ARMv4t and higher: https://wiki.debian.org/ArmEabiPort Debian has shipped a sufficiently-new compiler for our purposes since 6.0 (squeeze), released in 2009: https://packages.debian.org/squeeze/gcc So it's clearly *possible* to get newer gcc versions running on ARMv5. 
I will grant you that it may not be the easiest thing to do on an existing installation. But I don't think having us continue to ship either known-broken code or completely-untested code is any better. If someone needs to get PostgreSQL 9.5 running on ARMv5 using an older gcc, they can test Andres's already-posted patch and, if it works, we can commit it. What I think doesn't make sense is to ship it untested and claim we have support when that really might or might not be true. Also, none of this would affect the PostgreSQL client libraries that you are talking about. s_lock.h is only used by the backend. The bottom line is that I love supporting obscure platforms as much as anyone here, and several other committers are already telling me that I love it too much. We've got to draw the line somewhere, and I think refusing to ship newly-written code that we have exactly zero means of testing is a pretty good place to draw it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fresh initdb contains a few deleted B-Tree pages
On Tue, Jul 1, 2014 at 6:02 AM, Greg Stark st...@mit.edu wrote: Why is this even a small concern? Initdb runs the bootstrap SQL which does various SQL operations so it's not surprising there are some updates creating garbage. Iirc we don't even rely on template0 being frozen any more. It's not surprising that some initdb updates create garbage, but the extent to which they do did slightly surprise me. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] buildfarm and rolling release distros
On Tue, Jul 1, 2014 at 12:49 PM, Andrew Dunstan and...@dunslane.net wrote: I've always been a bit reluctant to accept buildfarm members that are constantly being updated, because it seemed to me that it created something with too many variables. However, we occasionally get requests from people who want to run on such platforms, and I'm also a bit reluctant to turn away willing volunteers. We have one such application now in hand. What do people think about this. Is it valuable to have? Do we have enough stability from the buildfarm members that are not auto-updated that we can accept a certain number of auto-updating members, where, if something breaks, and it doesn't break elsewhere, then we suspect that something that got upgraded broke the build? I'm also not sure how to designate these machines. The buildfarm server metadata isn't designed for auto-updating build platforms. But no doubt if necessary we can come up with something. Off-hand, it seems like we could give it a try, and abandon the effort if it proves too problematic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cluster name in ps output
On 06/29/2014 02:25 PM, Andres Freund wrote: On 2014-06-29 11:11:14 +0100, Thomas Munro wrote: On 29 June 2014 10:55, Andres Freund and...@2ndquadrant.com wrote: So, I'd looked at it with an eye towards committing it and found some more things. I've now * added the restriction that the cluster name can only be ASCII. It's shown from several clusters with differing encodings, and it's shown in process titles, so that seems wise. * moved everything to the LOGGING_WHAT category * added minimal documention to monitoring.sgml * expanded the config.sgml entry to mention the restriction on the name. * Changed the GUCs short description Thank you. I also think, but haven't done so, we should add a double colon after the cluster name, so it's not: postgres: server1 stats collector process but postgres: server1: stats collector process +1 Committed with the above changes. Thanks for the contribution! Is there a reason for not using this in synchronous_standby_names, perhaps falling back to application_name if not set? -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
Robert Haas robertmh...@gmail.com writes: The bottom line is that I love supporting obscure platforms as much as anyone here, and several other committers are already telling me that I love it too much. We've got to draw the line somewhere, and I think refusing to ship newly-written code that we have exactly zero means of testing is a pretty good place to draw it. I'm good with the concept of expecting anyone who complains about lack of support for $platform to provide resources for testing/debugging PG on that platform; and that pending arrival of such resources it's okay to consider the platform desupported. What concerns me here is what level of support we could provide even with adequate resources. That is, are we going to be buying into a scenario where platforms with poor atomics support take a significant performance hit compared to the current state of affairs? I don't think that would be good. Another way of framing the problem is in response to Andres' upthread comment that relying on emulated atomics makes things much easier to reason about. That may be true as far as correctness is concerned but it's patently false for reasoning about performance. This ties into Robert's worry about how many different hardware performance profiles we're going to have to concern ourselves with. Basically the future that concerns me is that we perform well on x86_64 hardware (which I believe is pretty much all that any active developers are using) and poorly on other hardware. I don't want to end up there, but I think the current direction of this patch pretty much guarantees that outcome. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Can simplify 'limit 1' with slow function?
On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote: The simplified case: select slowfunction(s) from a order by b limit 1; is slower than select slowfunction(s) from (select s from a order by b limit 1) as z; if there are many records in table 'a'. The real case: Function ST_Distance_Sphere is slow, so the query: SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road order by c limit 1; is slower than: select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s from road order by c limit 1) as a; There are about 7000 records in 'road'. To help here I think we need the EXPLAIN ANALYSE output for both queries. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
Re: [HACKERS] RLS Design
On 1 July 2014 17:42, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: An annoying complication, however, is how this interacts with column privileges. Right now GRANT SELECT(col1) ON t1 TO role1 gives role1 access to every row in col1, and I think that has to remain the case, since GRANTs only ever give you more access. But that leads to a situation where the RLS quals applied would depend on the columns selected. Wow, that seems pretty horrible to me. That means that if I do: SELECT a FROM tab; and then: SELECT a, b FROM tab; ...the second one might return fewer rows than the first one. I think there's a good argument that RLS is unlike other grantable privileges, and that it really ought to be defined as something which is imposed rather than a kind of access grant. If RLS is merely a modifier to an access grant, then every access grant has to make sure to include that modifier, or you have a security hole. But if it's a separate constraint on access, then you just do it once, and exempt people from it only as needed. That seems less error-prone to me -- it's sort of a default-deny policy, which is generally viewed as good for security -- and it avoids weird cases like the above, which I think could easily break application logic. That seems like a pretty strong argument. If RLS quals are instead regarded as constraints on access, and multiple policies apply, then it seems that the quals should now be combined with AND rather than OR, right? Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Tue, Jul 01, 2014 at 12:26:43PM +0900, Kyotaro HORIGUCHI wrote: 1. I think it's the case that there are platforms around where a signal won't cause send() to return EINTR and I'd be entirely unsurprised if SSL_write() doesn't necessarily return EINTR in that case. I'm not sure what, if anything, we can do about that. man 2 send on FreeBSD has no description of EINTR.. And even on linux, send won't return EINTR for most cases, at least I haven't seen that. So send()=-1,EINTR seems to me as only an equivalent of send() = 0. I have no idea about what the implementer thought the difference is. Whether send() returns EINTR or not depends on whether the signal has been marked restartable or not. This is configurable per signal, see sigaction(). If the signal is marked to restart, the kernel returns ERESTARTHAND (IIRC) and the libc will redo the call internally. Default BSD does not return EINTR normally, but supports sigaction(). Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
Re: [HACKERS] replication commands and log_statements
Hi. Do we have any consensus about what to do with these two patches? 1. Introduce a log_replication_command setting. 2. Change log_statement to be a list of tokens. If I understand correctly, there weren't any strong objections to the former, but the situation is less clear when it comes to the second. -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Can simplify 'limit 1' with slow function?
On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout klep...@svana.org wrote: On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote: The simplified case: select slowfunction(s) from a order by b limit 1; is slower than select slowfunction(s) from (select s from a order by b limit 1) as z; if there are many records in table 'a'. The real case: Function ST_Distance_Sphere is slow, so the query: SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road order by c limit 1; is slower than: select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s from road order by c limit 1) as a; There are about 7000 records in 'road'. I think to help here we need the EXPLAIN ANALYSE output for both queries. Well, I think the problem is a well-understood one: there is no guarantee that functions-in-select-list are called exactly once per output row. This is documented -- for example see here: http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS. In short, if you want very precise control of function evaluation use a subquery, or, if you're really paranoid, a CTE. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump reporing version of server pg_dump as comments in the output
At 2014-06-17 13:21:34 +0530, jeevan.cha...@enterprisedb.com wrote: Anyone has any other views ? I guess nobody has strong feelings either way. I've marked this (i.e. your slightly-revised patch) ready for committer. -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 2:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The bottom line is that I love supporting obscure platforms as much as anyone here, and several other committers are already telling me that I love it too much. We've got to draw the line somewhere, and I think refusing to ship newly-written code that we have exactly zero means of testing is a pretty good place to draw it. I'm good with the concept of expecting anyone who complains about lack of support for $platform to provide resources for testing/debugging PG on that platform; and that pending arrival of such resources it's okay to consider the platform desupported. What concerns me here is what level of support we could provide even with adequate resources. That is, are we going to be buying into a scenario where platforms with poor atomics support take a significant performance hit compared to the current state of affairs? I don't think that would be good. Another way of framing the problem is in response to Andres' upthread comment that relying on emulated atomics makes things much easier to reason about. That may be true as far as correctness is concerned but it's patently false for reasoning about performance. This ties into Robert's worry about how many different hardware performance profiles we're going to have to concern ourselves with. I have to admit that my concerns in that area were ameliorated to a significant degree by the performance results that Andres posted yesterday. On top of the atomics patch (which is not this thread), he's got a rewritten lwlock patch that uses those atomics. His testing showed that, even when that patch uses the atomics emulation layer instead of real atomics, it's still better than the status quo. Maybe that won't be true for every patch that uses atomics, but it's certainly good as far as it goes. 
But I think the issue on this sub-thread is a different one altogether: in studying s_lock.h, Andres found clear evidence that the ARMv5 and Sun Studio spinlock implementations are in fact *buggy*. Even if the programmer uses volatile pointers until they turn blue in the face, nothing in that patch prevents the CPU from changing the apparent order of execution such that instructions within the spinlock-protected critical section appear to execute after lock release. That is no good. It means that if somebody runs PostgreSQL on those platforms and tries to do non-trivial things with it, it's probably going to break. Now, if we want, we can simply ignore that problem. No actual users have complained about it, and it's not entirely outside the bounds of plausibility that Andres is wrong, and those implementations are not buggy after all. But I don't think he's wrong. If we assume he's right, then we've got two choices: we can either blindly install fixes for those platforms that we can't test, or we can drop support. Basically the future that concerns me is that we perform well on x86_64 hardware (which I believe is pretty much all that any active developers are using) and poorly on other hardware. I don't want to end up there, but I think the current direction of this patch pretty much guarantees that outcome. I think this concern is more germane to the atomics patch than what we're talking about here, because what we're talking about here, at the moment, is either attempting to repair, or just rip out completely, S_UNLOCK() implementations that appear to be buggy. I have access to several PPC64 systems, use them regularly for benchmarking and performance-testing, and can provide access to one of those to others who may need it for testing. For that reason, I think at least x64/amd64 and PPC64 will get tested regularly, at least as long as IBM keeps that hardware on loan to us. 
I previously had access to an Itanium server and used that for a lot of testing during the 9.2 cycle, but that eventually went away. I think it would be awfully nice if some hardware vendor (or interested community member) could provide us with access to some top-quality SPARC, ARM, and maybe zSeries machines for testing, but nobody's offered that I know of. We can't test hardware we don't have. Despite my concerns about keeping the list of supported atomics short, and I do have concerns in that area, I'm not really sure that we have much choice but to go in that direction. We can't accept a 5x performance hit in the name of portability, and that's literally what we're talking about in some cases. I definitely want to think carefully about how we proceed in this area but doing nothing doesn't seem like an option. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Array of composite types returned from python
Hi Ronan. Based on your review, I'm marking this as ready for committer. The attached patch implements this. Your patch looks sensible enough (thanks for adding tests), but I guess we'll let the reviewer sort out whether to commit the original or your extended version. Thanks. -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RLS Design
On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed dean.a.rash...@gmail.com wrote: On 1 July 2014 17:42, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 1, 2014 at 3:33 AM, Dean Rasheed dean.a.rash...@gmail.com wrote: An annoying complication, however, is how this interacts with column privileges. Right now GRANT SELECT(col1) ON t1 TO role1 gives role1 access to every row in col1, and I think that has to remain the case, since GRANTs only ever give you more access. But that leads to a situation where the RLS quals applied would depend on the columns selected. Wow, that seems pretty horrible to me. That means that if I do: SELECT a FROM tab; and then: SELECT a, b FROM tab; ...the second one might return fewer rows than the first one. I think there's a good argument that RLS is unlike other grantable privileges, and that it really ought to be defined as something which is imposed rather than a kind of access grant. If RLS is merely a modifier to an access grant, then every access grant has to make sure to include that modifier, or you have a security hole. But if it's a separate constraint on access, then you just do it once, and exempt people from it only as needed. That seems less error-prone to me -- it's sort of a default-deny policy, which is generally viewed as good for security -- and it avoids weird cases like the above, which I think could easily break application logic. That seems like a pretty strong argument. If RLS quals are instead regarded as constraints on access, and multiple policies apply, then it seems that the quals should now be combined with AND rather than OR, right? Yeah, maybe. I intuitively feel that OR would be more useful, so it would be nice to find a design where that makes sense. But it depends a lot, in my view, on what syntax we end up with. 
For example, suppose we add just one command:

    ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;

If the given role inherits from multiple roles that have different filters, I think the user will naturally expect all of the filters to be applied. But you could do it other ways. For example:

    ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
    ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;

If a table is set to NO ROW LEVEL SECURITY then it behaves just like it does now: anyone who accesses it sees all the rows, restricted to those columns for which they have permission. If the table is set to ROW LEVEL SECURITY then the default is to show no rows. The second command then allows access to a subset of the rows for a given role name. In this case, it is probably logical for access to be combined via OR. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
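To make the OR-vs-AND question concrete, here is a small Python sketch (my illustration, not anything from a patch; the rows and quals are invented) of how row visibility differs when several policy quals apply to the same user:

```python
# Simulate combining row-security quals with OR vs AND. Each qual is a
# predicate over a row; under OR, any matching policy grants access,
# while under AND every policy must pass.
def visible_rows(rows, quals, combine="OR"):
    if not quals:
        return list(rows)  # no policies: table behaves as it does today
    if combine == "OR":
        return [r for r in rows if any(q(r) for q in quals)]
    return [r for r in rows if all(q(r) for q in quals)]

rows = [{"owner": "alice", "dept": 10},
        {"owner": "bob",   "dept": 10},
        {"owner": "bob",   "dept": 20}]
quals = [lambda r: r["owner"] == "bob",   # hypothetical policy from role1
         lambda r: r["dept"] == 10]       # hypothetical policy from role2

print(len(visible_rows(rows, quals, "OR")))   # 3: any matching policy grants access
print(len(visible_rows(rows, quals, "AND")))  # 1: only rows passing every policy
```

The OR interpretation behaves like an access grant (each policy widens visibility); the AND interpretation behaves like an imposed constraint (each policy narrows it), which is the distinction Robert draws above.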
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
On Tue, Jul 1, 2014 at 8:22 AM, Christoph Berg c...@df7cb.de wrote: Re: Tom Lane 2014-06-23 17054.1403542...@sss.pgh.pa.us While I'd love to reduce the number of future installations without this fix in place, I respect the decision to honor project policy. At the same time, this change does not break anything. It introduces new environment variables which change the behaviour, but behaves the old way in the absence of those variables. Uh, no, it doesn't. We removed the dependence on -DLINUX_OOM_SCORE_ADJ. If a packager is expecting that to still work in 9.4, he's going to be unpleasantly surprised, because the system will silently fail to do what he's expecting: it will run all the backend processes at no-OOM-kill priority, which is likely to be bad. Ok I'm late to the party, but the reason I'm still joining is we have proper unit tests which just told me the 9.5 packages have changed OOM handling. So it wouldn't just slip through if you changed that in 9.4 as well, but would get fixed. I have two comments on the patch: The choice to make the behavior depend first on PG_OOM_ADJUST_FILE and only secondly on PG_OOM_ADJUST_VALUE seems the wrong way round to me. On every modestly new kernel oom_score_adj is present, so PG_OOM_ADJUST_FILE should default to using it. On the other hand, what people really want to achieve (or tune) with this feature is to set the OOM adjustment back to 0 (or some other value), so to me, it would be much more natural to set PG_OOM_ADJUST_VALUE=0 to activate the feature, and only have to mess with PG_OOM_ADJUST_FILE if my kernel is older. (Which it isn't on any kernel supported by Debian and Ubuntu LTS.) Of course, we have no guarantee that the Linux kernel guys won't change this again. Apparently we don't break userspace is a somewhat selectively-enforced principle. The other bit is the non-deprecation of OOM_ADJ in contrib/start-scripts/linux. 
It took me a while to understand the logic of the variables used there - the interface would be much clearer if it just was like this:

    ... set default PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj ...

and then this in the configurable part of the script:

    PG_MASTER_OOM_SCORE_ADJ=-1000
    PG_OOM_SCORE_ADJ=0
    # Older Linux kernels may not have /proc/self/oom_score_adj, but instead
    # /proc/self/oom_adj, which works similarly except the disable value is -17.
    # For such a system, uncomment these three lines instead.
    #PG_OOM_ADJUST_FILE=/proc/self/oom_adj
    #PG_MASTER_OOM_SCORE_ADJ=-17
    #PG_OOM_SCORE_ADJ=0

... and then use PG_OOM_ADJUST_FILE below instead of manually figuring out which proc file to write to by looking at OOM.*_ADJ. I can't help but agree with this, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
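The fall-back logic being debated (prefer the newer /proc/self/oom_score_adj, use the older /proc/self/oom_adj only on old kernels) can be sketched in Python. This is a hypothetical illustration, not the actual patch; the paths are parameters so the logic can be exercised against scratch files instead of the real /proc entries:

```python
import os
import tempfile

# Prefer the newer interface when it exists; fall back to the older one,
# whose "disable OOM kill" value is -17 rather than -1000.
def choose_oom_file(score_adj_path="/proc/self/oom_score_adj",
                    adj_path="/proc/self/oom_adj"):
    if os.path.exists(score_adj_path):
        return score_adj_path
    return adj_path

def reset_oom_adjustment(path, value=0):
    # Child processes write the configured value back, so only the
    # postmaster keeps its no-OOM-kill setting.
    with open(path, "w") as f:
        f.write(str(value))

# Demo against a scratch file instead of the real /proc entry.
with tempfile.TemporaryDirectory() as d:
    fake = os.path.join(d, "oom_score_adj")
    open(fake, "w").close()
    path = choose_oom_file(score_adj_path=fake)
    reset_oom_adjustment(path, 0)
    print(open(fake).read())  # 0
```

Christoph's point is about which variable activates the feature: in his view setting the value (PG_OOM_SCORE_ADJ=0) should be the trigger, with the file path only overridden for old kernels.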
Re: [HACKERS] Can simplify 'limit 1' with slow function?
Merlin Moncure-2 wrote On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout lt; kleptog@ gt; wrote: On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote: The simplified scene: select slowfunction(s) from a order by b limit 1; is slow than select slowfunction(s) from (select s from a order by b limit 1) as z; if there are many records in table 'a'. The real scene. Function ST_Distance_Sphere is slow, the query: SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road order by c limit 1; is slow than: select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s from road order by c limit 1) as a; There are about 7000 records in 'road'. I think to help here I think we need the EXPLAIN ANALYSE output for both queries. Well, I think the problem is a well understood one: there is no guarantee that functions-in-select-list are called exactly once per output row. This is documented -- for example see here: http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS. In short, if you want very precise control of function evaluation use a subquery, or, if you're really paranoid, a CTE. merlin I would have to disagree on the this is documented comment - the linked section on advisory locks does not constitute documentation of the fact that limit can be applied after expressions in the select-list are evaluated. http://www.postgresql.org/docs/9.3/static/sql-select.html In the select command documentation item 5 covers select-list evaluation while item 9 covers limit thus implying what we are saying - though keep in mind each select statement gets processed independently and possibly in a correlated fashion (i.e. potentially multiple times). David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Can-simplify-limit-1-with-slow-function-tp5809997p5810061.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. 
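The behaviour under discussion can be mimicked outside the database. This Python sketch (my illustration, not planner code; the row shape and counts are invented to match the 7000-row example) counts how often the expensive function runs under each query shape:

```python
# Why "slowfunction(s) ... ORDER BY b LIMIT 1" can be slower than pushing
# the sort/limit into a subquery: the select list may be evaluated for
# every row before the limit is applied.
calls = 0

def slow_function(x):
    global calls
    calls += 1
    return x * x

rows = [(i, 100 - i) for i in range(7000)]  # (s, b) pairs, like table "a"

# Plan A: compute the select list for all rows, then sort by b, then LIMIT 1.
calls = 0
result_a = sorted(((slow_function(s), b) for s, b in rows),
                  key=lambda t: t[1])[:1]
print("plan A calls:", calls)   # 7000

# Plan B: the subquery sorts and limits first; the outer query then
# evaluates the function on the single surviving row.
calls = 0
top = sorted(rows, key=lambda t: t[1])[:1]
result_b = [(slow_function(s), b) for s, b in top]
print("plan B calls:", calls)   # 1
```

Both plans return the same row; only the number of function evaluations differs, which is exactly why Merlin recommends the subquery (or CTE) form when evaluation count matters.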
Re: [HACKERS] pg_xlogdump --stats
At 2014-07-01 16:39:57 +0300, ma...@juffo.org wrote: Here's a patch to make pg_xlogdump print summary statistics instead of individual records. Thanks! I had a use for this feature so I backported the (first) patch to PostgreSQL 9.3. It's a rush job so it's ugly and may have bugs, but it works for me. Just posting here, maybe someone else will find it useful too. Thanks for taking the trouble. In CF terms, did you form any opinion while porting the patch I posted about whether it's sensible/ready for inclusion in 9.5? -- Abhijit P.S. In your patch, the documentation changes are duplicated.
Re: [HACKERS] change alter user to be a true alias for alter role
At 2014-06-27 16:11:21 +0200, vik.fear...@dalibo.com wrote: After a week of silence from Jov, I decided to do this myself since it didn't seem very hard. Many frustrating hours of trying to understand why I'm getting shift/reduce conflicts by the hundreds later, I've decided to give up for now. Jov, do you expect to be able to work on the patch along the lines Tom suggested and resubmit during this CF? If not, since Vik gave up on it, it seems to me that it would be best to mark it returned with feedback (which I'll do in a couple of days if I don't hear anything to the contrary). -- Abhijit
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
Robert Haas robertmh...@gmail.com writes: On Tue, Jul 1, 2014 at 8:22 AM, Christoph Berg c...@df7cb.de wrote: I have two comments on the patch: The choice to make the behavior depend first on PG_OOM_ADJUST_FILE and only secondly on PG_OOM_ADJUST_VALUE seems the wrong way round to me. Of course, we have no guarantee that the Linux kernel guys won't change this again. Apparently "we don't break userspace" is a somewhat selectively-enforced principle. Yeah, I'm unexcited about this proposal. In any case, given the two existing APIs we have to deal with, allowing PG_OOM_ADJUST_VALUE to default to 0 is sane in both APIs but a default for the file name can work for only one. The other bit is the non-deprecation of OOM_ADJ in contrib/start-scripts/linux. It took me a while to understand the logic of the variables used there - the interface would be much clearer if it just was like this: ... and then use PG_OOM_ADJUST_FILE below instead of manually figuring out which proc file to write to by looking at OOM.*_ADJ. I can't help but agree with this, though. Fair enough. I went for a minimum-change approach when hacking that script, but we could change it some more in the name of readability. Will do something about that. regards, tom lane
Re: [HACKERS] Spinlocks and compiler/memory barriers
Robert Haas robertmh...@gmail.com writes: Despite my concerns about keeping the list of supported atomics short, and I do have concerns in that area, I'm not really sure that we have much choice but to go in that direction. We can't accept a 5x performance hit in the name of portability, and that's literally what we're talking about in some cases. I definitely want to think carefully about how we proceed in this area but doing nothing doesn't seem like an option. To be clear, I'm not advocating doing nothing (and I don't think anyone else is). It's obvious based on Andres' results that we want to use atomics on platforms where they're well-supported. The argument is around what we're going to do for other platforms. regards, tom lane
Re: [HACKERS] buildfarm and rolling release distros
On 02/07/14 06:02, Robert Haas wrote: On Tue, Jul 1, 2014 at 12:49 PM, Andrew Dunstan and...@dunslane.net wrote: I've always been a bit reluctant to accept buildfarm members that are constantly being updated, because it seemed to me that it created something with too many variables. However, we occasionally get requests from people who want to run on such platforms, and I'm also a bit reluctant to turn away willing volunteers. We have one such application now in hand. What do people think about this? Is it valuable to have? Do we have enough stability from the buildfarm members that are not auto-updated that we can accept a certain number of auto-updating members, where, if something breaks, and it doesn't break elsewhere, then we suspect that something that got upgraded broke the build? I'm also not sure how to designate these machines. The buildfarm server metadata isn't designed for auto-updating build platforms. But no doubt if necessary we can come up with something. Off-hand, it seems like we could give it a try, and abandon the effort if it proves too problematic. How about prefixing the names of auto-updating buildfarm members with 'au_'? Cheers, Gavin
Re: [HACKERS] Can simplify 'limit 1' with slow function?
On Tue, Jul 1, 2014 at 3:06 PM, David G Johnston david.g.johns...@gmail.com wrote: Merlin Moncure-2 wrote On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout lt; kleptog@ gt; wrote: On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote: The simplified scene: select slowfunction(s) from a order by b limit 1; is slow than select slowfunction(s) from (select s from a order by b limit 1) as z; if there are many records in table 'a'. The real scene. Function ST_Distance_Sphere is slow, the query: SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road order by c limit 1; is slow than: select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s from road order by c limit 1) as a; There are about 7000 records in 'road'. I think to help here I think we need the EXPLAIN ANALYSE output for both queries. Well, I think the problem is a well understood one: there is no guarantee that functions-in-select-list are called exactly once per output row. This is documented -- for example see here: http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS. In short, if you want very precise control of function evaluation use a subquery, or, if you're really paranoid, a CTE. merlin I would have to disagree on the this is documented comment - the linked section on advisory locks does not constitute documentation of the fact that limit can be applied after expressions in the select-list are evaluated. http://www.postgresql.org/docs/9.3/static/sql-select.html In the select command documentation item 5 covers select-list evaluation while item 9 covers limit thus implying what we are saying - though keep in mind each select statement gets processed independently and possibly in a correlated fashion (i.e. potentially multiple times). 
Sure, although I did not claim that... the select documentation *does* cover this behavior, but I find the syntax-driven doc pages to be fairly arcane and unhelpful -- they don't say (for the most part) avoid this or do that. I pointed out this particular section because it provided an example that matched the OP's problem case. merlin
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Tue, Jul 1, 2014 at 4:54 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Despite my concerns about keeping the list of supported atomics short, and I do have concerns in that area, I'm not really sure that we have much choice but to go in that direction. We can't accept a 5x performance hit in the name of portability, and that's literally what we're talking about in some cases. I definitely want to think carefully about how we proceed in this area but doing nothing doesn't seem like an option. To be clear, I'm not advocating doing nothing (and I don't think anyone else is). It's obvious based on Andres' results that we want to use atomics on platforms where they're well-supported. The argument is around what we're going to do for other platforms. OK, but that still seems like the issue on the other thread, not what's being discussed here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 01/07/14 11:04, Andres Freund wrote: Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c has been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. Being involved in QEMU SPARC development, I can tell you that patches are still actively being received for SPARCv8. The last set of CPU patches were related to fixing bugs in the LEON3, a 32-bit SPARC CPU which is available in hardened versions certified for extreme environments such as military and space. I'd hate to find out that they switched to another database because they couldn't upgrade PostgreSQL on the ISS ;) Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. ATB, Mark.
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote: On 01/07/14 11:04, Andres Freund wrote: Since we have a Sun Studio machine in the buildfarm, we shouldn't give up on SPARC completely, but maybe we should only add the cases for sparcv8+ and above? That at least has some chance of getting tested. That we have code for sparcv7 is ridiculous btw. Sparcv8 (not 8+/9) is from 1992. v7/sun4c been dropped from solaris 8 if I see that correctly. I agree that v8 (in contrast to v8+ aka v9 in 32bit mode) support has to go as well. I'd personally vote for backpatching a note to installation.sgml saying that it's probably not working, and not do anything else there. That means we also should replace the ldstub by cas in the the gcc inline assembly - but we have buildfarm members for that, so it's not too bad. Being involved in QEMU SPARC development, I can tell you that patches are still actively being received for SPARCv8. The last set of CPU patches were related to fixing bugs in the LEON3, a 32-bit SPARC CPU which is available in hardened versions certified for extreme environments such as military and space. I'd hate to find out that they switched to another database because they couldn't upgrade PostgreSQL on the ISS ;) LEON3 isn't really sparcv8. It's more like sparcv8 with v9 stuff cherry picked... It e.g. has CAS :) Your point stands though - we should probably backpatch the v8 version of my patch as well, since apparently LEON3 does *not* have membar. But I don't think this is a rat race we can win. We can't keep up with all the variants of sparc and even worse arm. We should add a intrinsics based version for sparc as well. That'll work just fine on any recent gcc. Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. Well. 
I have to admit I'm really not interested in investing that much time in something I've no stake in. If postgres developers have to set up emulated machines to develop features, something imo went seriously wrong. That's more effort than at least I'm willing to spend. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.
On Sun, Jun 29, 2014 at 11:58 AM, Kevin Grittner kgri...@ymail.com wrote: I have reviewed this patch, and think we should do what the patch is trying to do, but I don't think the submitted patch would actually work. Just curious, why do you think it won't work? Although the discussion is a bit old, I'm sure I would've tested the patch before submitting. I have attached a suggested patch which I think would work. Gurjeet, could you take a look at it? The patch, when considered together with Tom's suggestion upthread, looks good to me. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB : www.EnterpriseDB.com : The Enterprise PostgreSQL Company
Re: [HACKERS] Wait free LW_SHARED acquisition
On 01/07/14 23:25, Heikki Linnakangas wrote: On 07/01/2014 01:08 PM, Andres Freund wrote: Hi, Over at -performance Mark Kirkwood tested a recent version of this (http://archives.postgresql.org/message-id/53B283F3.7020005%40catalyst.net.nz). I thought it's interesting to add the numbers to this thread:

Test: pgbench
Options: scale 500, read only
OS: Ubuntu 14.04
Pg: 9.3.4
Pg options: max_connections = 200, shared_buffers = 10GB, maintenance_work_mem = 1GB, effective_io_concurrency = 10, wal_buffers = 32MB, checkpoint_segments = 192, checkpoint_completion_target = 0.8

Results:

Clients | 9.3 tps 32 cores | 9.3 tps 60 cores
--------+------------------+-----------------
      6 |            70400 |            71028
     12 |            98918 |           129140
     24 |           230345 |           240631
     48 |           324042 |           409510
     96 |           346929 |           120464
    192 |           312621 |            92663

So we have anti scaling with 60 cores as we increase the client connections. Ouch! A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1] - cherry picking the last 5 commits into 9.4 branch and building a package from that and retesting:

Clients | 9.4 tps 60 cores (rwlock)
--------+--------------------------
      6 |  70189
     12 | 128894
     24 | 233542
     48 | 422754
     96 | 590796
    192 | 630672

Now, this is a bit of a skewed comparison due to 9.4 vs. 9.3 but still interesting. It looks like the issue I reported here: http://www.postgresql.org/message-id/5190e17b.9060...@vmware.com was fixed by this commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b03d196be055450c7260749f17347c2d066b4254. So, definitely need to compare plain 9.4 vs patched 9.4, not 9.3. Here's plain 9.4 vs patched 9.4:

Clients | 9.4 tps 60 cores | 9.4 tps 60 cores (rwlock)
--------+------------------+--------------------------
      6 |            69490 |  70189
     12 |           128200 | 128894
     24 |           232243 | 233542
     48 |           417689 | 422754
     96 |           464037 | 590796
    192 |           418252 | 630672

It appears that plain 9.4 does not exhibit the dramatic anti scaling that 9.3 showed, but there is still evidence of some contention in the higher client numbers, and we peak at the 96 client mark.
The patched variant looks pretty much free from this, still scaling at 192 connections (might have been interesting to try more, but had max_connections set to 200)! Cheers Mark
[HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.
On Tue, Jul 1, 2014 at 10:05 AM, Kevin Grittner kgri...@ymail.com wrote: Tom Lane t...@sss.pgh.pa.us wrote: If we're going to do it like this, then I think the force flag should be considered to do nothing except override the clock check, which probably means it shouldn't be tested in the initial if() at all. That makes sense, and is easily done. Attached is the patch to save you a few key strokes :) The only question left is how far back to take the patch. I'm inclined to only apply it to master and 9.4. Does anyone think otherwise? Considering this as a bug-fix, I'd vote for it to be applied to all supported releases. But since this may cause an unforeseen performance penalty, I think it should be applied only as far back as the introduction of the PGSTAT_STAT_INTERVAL throttle. The throttle was implemented in 641912b, which AFAICT was part of 8.3. So I guess all the supported releases it is. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB : www.EnterpriseDB.com : The Enterprise PostgreSQL Company

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 3ab1428..c7f41a5 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -753,7 +753,8 @@ pgstat_report_stat(bool force)
 	/* Don't expend a clock check if nothing to do */
 	if ((pgStatTabList == NULL || pgStatTabList->tsa_used == 0) &&
-		!have_function_stats && !force)
+		pgStatXactCommit == 0 && pgStatXactRollback == 0 &&
+		!have_function_stats)
 		return;
 	/*
@@ -817,11 +818,11 @@
 	}
 	/*
-	 * Send partial messages.  If force is true, make sure that any pending
-	 * xact commit/abort gets counted, even if no table stats to send.
+	 * Send partial messages.  Make sure that any pending xact commit/abort
+	 * gets counted, even if there are no table stats to send.
 	 */
 	if (regular_msg.m_nentries > 0 ||
-		(force && (pgStatXactCommit > 0 || pgStatXactRollback > 0)))
+		pgStatXactCommit > 0 || pgStatXactRollback > 0)
 		pgstat_send_tabstat(&regular_msg);
 	if (shared_msg.m_nentries > 0)
 		pgstat_send_tabstat(&shared_msg);
Re: [HACKERS] SQL access to database attributes
Vik Fearing vik.fear...@dalibo.com writes: Okay, here is version two of the refactoring patch that documents that the with-space version is deprecated but still accepted. The feature patch is not affected by this and so I am not attaching a new version of that. I've committed this without the changes to expose the CONNECTION_LIMIT spelling, and with some other minor fixes --- the only one of substance being that you'd broken the foo = DEFAULT variants of the options by removing the checks on whether defel->arg was provided. regards, tom lane
Re: [HACKERS] Proposing pg_hibernate
On Sun, Jun 15, 2014 at 2:51 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Thu, Jun 12, 2014 at 9:31 AM, Gurjeet Singh gurj...@singh.im wrote: I don't have intimate knowledge of recovery but I think the above assessment of recovery's operations holds true. If you still think this is a concern, can you please provide a firmer example using which I can visualize the problem you're talking about. Okay, let me try with an example. Assume the number of shared buffers is 5.

Before crash:
1. Pages in shared buffers numbered 3, 4, 5 have write operations performed on them, followed by lots of reads which make their usage_count 5.
2. Write operations happen on pages in shared buffers numbered 1, 2. Usage_count of these buffers is 1.
3. Now one read operation needs some different pages, evicts the pages in shared buffers numbered 1 and 2, and reads the required pages, so buffers 1 and 2 will have usage_count 1.
4. At this moment shutdown is initiated.
5. Bgwriter saved just buffers 1 and 2 and crashed.

After crash:
1. Recovery will read in the pages on which operations happened in steps 1 and 2 above.
2. The buffer loader (pg_hibernator) will load the buffers on which operations happened in step 3, so it might need to evict buffers corresponding to the buffers of steps 1 and 2 before the crash.

So what this essentially means is that pg_hibernator can lead to eviction of more useful pages. Granted, you have demonstrated that the blocks restored by pg_hibernator can cause eviction of loaded-by-recovery blocks. But one can argue that pg_hibernator brought the shared-buffer contents to a state that is much closer to the pre-shutdown state than recovery would have restored them to. I think this supports the case for pg_hibernator, that is, it is doing what it is supposed to do: restore shared buffers to the pre-shutdown state. I agree that there's uncertainty as to which buffers will be cleared, and hence which blocks will be evicted.
So pg_hibernator may cause eviction of blocks that had higher usage count before the shutdown, because they may have a lower/same usage count as other blocks' buffers after recovery. There's not much that can be done for that, because usage count information is not saved anywhere on disk, and I don't think it's worth saving just for pg_hibernator's sake. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB : www.EnterpriseDB.com : The Enterprise PostgreSQL Company
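The usage_count mechanics behind the example can be sketched with a toy clock sweep in Python (my illustration; PostgreSQL's real algorithm lives in freelist.c and differs in detail, e.g. in how the hand advances between calls):

```python
# Clock-sweep eviction: the hand decrements usage_count as it passes
# each pinned-free buffer and evicts the first one found at zero, so a
# freshly loaded buffer with usage_count 1 is evicted long before a hot
# buffer sitting at the cap of 5.
class Buffer:
    def __init__(self, page, usage):
        self.page, self.usage = page, usage

def clock_sweep_evict(buffers, hand=0):
    while True:
        b = buffers[hand]
        if b.usage == 0:
            return hand          # victim found
        b.usage -= 1             # give it another trip around the clock
        hand = (hand + 1) % len(buffers)

pool = [Buffer("p1", 1), Buffer("p2", 1),
        Buffer("p3", 5), Buffer("p4", 5), Buffer("p5", 5)]
victim = clock_sweep_evict(pool)
print(pool[victim].page)  # p1: the low-usage buffer goes first
```

This is why the ordering of loads matters in Amit's scenario: whatever was loaded last (by recovery or by pg_hibernator) starts at usage_count 1 and is the first candidate for eviction, regardless of how hot the page was before shutdown.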
Re: [HACKERS] Spinlocks and compiler/memory barriers
Andres Freund and...@2ndquadrant.com writes: On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote: Also if you're struggling for Sun buildfarm animals, recent versions of QEMU will quite happily install and run later versions of 32-bit Solaris over serial, and 2.0 even manages to give you a cgthree framebuffer for the full experience. Well. I have to admit I'm really not interested in investing that much time in something I've no stake in. If postgres developers have to put emulated machines to develop features something imo went seriously wrong. That's more effort than at least I'm willing to spend. Perhaps more to the point, I have no faith at all that an emulator will mimic multiprocessor timing behavior to the level of detail needed to tell whether memory-barrier-related logic works. See the VAX discussion just a couple days ago. regards, tom lane
Re: [HACKERS] buildfarm and rolling release distros
Robert Haas robertmh...@gmail.com writes: On Tue, Jul 1, 2014 at 12:49 PM, Andrew Dunstan and...@dunslane.net wrote: I've always been a bit reluctant to accept buildfarm members that are constantly being updated, because it seemed to me that it created something with too many variables. However, we occasionally get requests from people who want to run on such platforms, and I'm also a bit reluctant to turn away willing volunteers. We have one such application now in hand. What do people think about this? Is it valuable to have? Do we have enough stability from the buildfarm members that are not auto-updated that we can accept a certain number of auto-updating members, where, if something breaks, and it doesn't break elsewhere, then we suspect that something that got upgraded broke the build? I'm also not sure how to designate these machines. The buildfarm server metadata isn't designed for auto-updating build platforms. But no doubt if necessary we can come up with something. Off-hand, it seems like we could give it a try, and abandon the effort if it proves too problematic. If a majority of buildfarm critters were like that, it'd be too confusing. But as long as they are few, not all following the same update stream, and well labeled in the buildfarm status page, I think we could cope. regards, tom lane
Re: [HACKERS] pg_xlogdump --stats
On Tue, Jul 1, 2014 at 11:13 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: In CF terms, did you form any opinion while porting the patch I posted about whether it's sensible/ready for inclusion in 9.5? I didn't look at the code more than necessary to make the build work. As far as functionality goes, it does exactly what I needed it to do; the output is very clear. I wanted to see how much WAL is being generated from SELECT FOR UPDATE locks. Here are some thoughts: The FPI acronym made me wonder at first. Perhaps Full page image size would be clearer (exactly 20 characters long, so it fits). But on the other hand, I find that the table is too wide, I always need to stretch my terminal window to fit it. I don't think we need to accommodate for 10^20 bytes ~ 87 exabytes of WAL files. :) You might also add units (kB/MB) to the table like pg_size_pretty, although that would make the magnitudes harder to gauge. P.S. In your patch, the documentation changes are duplicated. Oh right you are, I don't know how that happened. I also missed some record types (Btree/0x80, Btree/0xb0). Regards, Marti
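The pg_size_pretty-style units Marti suggests can be sketched quickly; this Python illustration uses the rough "switch units above ~10x the base" idea but not PostgreSQL's exact half-unit rounding rules, so treat the thresholds as assumptions:

```python
# Scale a byte count to the largest unit that still leaves a readable
# number, in the spirit of pg_size_pretty (thresholds are illustrative).
def size_pretty(n):
    for unit in ("bytes", "kB", "MB", "GB", "TB"):
        if n < 10240 or unit == "TB":
            return f"{n} {unit}"
        n //= 1024

print(size_pretty(512))          # 512 bytes
print(size_pretty(2048 * 1024))  # 2048 kB
```

As noted above, this kind of formatting narrows the table but makes magnitudes across rows harder to compare at a glance, since adjacent rows may use different units.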
Re: [HACKERS] inherit support for foreign tables
On Fri, Jun 20, 2014 at 05:04:06PM +0900, Kyotaro HORIGUCHI wrote: Attached is the rebased patch of v11 up to the current master. I've been studying this patch. SELECT FOR UPDATE on the inheritance parent fails with a can't-happen error condition, even when SELECT FOR UPDATE on the child foreign table alone would have succeeded. The patch adds zero tests. Add principal tests to postgres_fdw.sql. Also, create a child foreign table in foreign_data.sql; this will make dump/reload tests of the regression database exercise an inheritance tree that includes a foreign table. The inheritance section of ddl.sgml should mention child foreign tables, at least briefly. Consider mentioning it in the partitioning section, too. Your chosen ANALYZE behavior is fair, but the messaging from a database-wide ANALYZE VERBOSE needs work:

INFO:  analyzing "test_foreign_inherit.parent"
INFO:  "parent": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
INFO:  analyzing "test_foreign_inherit.parent" inheritance tree
WARNING:  relcache reference leak: relation "child" not closed
WARNING:  relcache reference leak: relation "tchild" not closed
WARNING:  relcache reference leak: relation "parent" not closed

Please arrange to omit the "analyzing tablename inheritance tree" message, since this ANALYZE actually skipped that task. (The warnings obviously need a fix, too.) I do find it awkward that adding a foreign table to an inheritance tree will make autovacuum stop collecting statistics on that inheritance tree, but I can't think of a better policy. The rest of these review comments are strictly observations; they're not requests for you to change the patch, but they may deserve more discussion. We use the term "child table" in many messages. Should that change to something more inclusive, now that the child may be a foreign table? Perhaps one of "child relation", "plain child", or "child foreign table"/"child table" depending on the actual object?
"child relation" is robust technically, but it might add more confusion than it removes. Varying the message depending on the actual object feels like a waste. Opinions? LOCK TABLE on the inheritance parent locks child foreign tables, but LOCK TABLE fails when given a foreign table directly. That's odd, but I see no cause to change it. A partition root only accepts an UPDATE command if every child is updatable. Similarly, UPDATE ... WHERE CURRENT OF cursor_name fails if any child does not support it. That seems fine. Incidentally, does anyone have an FDW that supports WHERE CURRENT OF? The longstanding behavior of CREATE TABLE ... INHERITS is to reorder local columns to match the order found in parents. That is, both of these tables actually have columns in the order (a,b):

create table parent (a int, b int);
create table child (b int, a int) inherits (parent);

Ordinary dump/reload always uses CREATE TABLE ... INHERITS, thereby changing column order in this way. (pg_dump --binary-upgrade uses ALTER TABLE ... INHERIT and some catalog hacks to avoid doing so.) I've never liked that dump/reload can change column order, but it's tolerable if you follow the best practice of always writing out column lists. The stakes rise for foreign tables, because column order is inherently significant to file_fdw and probably to certain other non-RDBMS FDWs. If pg_dump changes column order in your file_fdw foreign table, the table breaks. I would heartily support making pg_dump preserve column order for all inheritance children. That doesn't rise to the level of being a blocker for this patch, though. I am attaching rough-hewn tests I used during this review.
Thanks, nm
-- Noah Misch EnterpriseDB http://www.enterprisedb.com

drop schema test_foreign_inherit cascade;
create schema test_foreign_inherit;
set search_path = test_foreign_inherit;

create extension postgres_fdw;
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'test');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
create extension file_fdw;
CREATE SERVER f FOREIGN DATA WRAPPER file_fdw;

create table parent ( a int, b text, c date, d path );
insert into parent values (4, 'foo', current_date, '0,0,1,2');

-- regular child
create table tchild () inherits (parent);
insert into tchild values (5, 'oof', current_date - 2, '-1,-1,0,0');

-- cursor update works
begin;
declare c cursor for select 1 from parent where b = 'oof';
fetch from c;
explain analyze update parent set b = null where current of c;
select * from parent;
rollback;

-- foreign side
create table _child ( c date, b text, extra numeric, d path, a int );

-- Foreign child.  No column reorder.
create foreign table child ( c date, b text, extra numeric, d path, a int )
	SERVER loopback OPTIONS (table_name '_child');
alter table child inherit parent;
insert into
Re: [HACKERS] Proposing pg_hibernate
On Sat, Jun 7, 2014 at 6:48 AM, Cédric Villemain ced...@2ndquadrant.com wrote: Le lundi 3 février 2014 19:18:54 Gurjeet Singh a écrit : Possible enhancements: - Ability to save/restore only specific databases. - Control how many BlockReaders are active at a time, to avoid I/O storms. FWIW, this has been implemented. By default, only one database is restored at a time. - Be smart about lowered shared_buffers across the restart. - Different modes of reading, like pg_prewarm does. - Include pgfincore functionality, at least for Linux platforms. Please note that pgfincore works on any system where PostgreSQL prefetch works, exactly like pg_prewarm. This includes Linux, BSD, and many Unix-likes. It *is not* limited to Linux. I never had a single request for Windows, but Windows does provide an API for that too (however, I have no Windows machine at hand to test with). Another side note is that currently BSD (at least FreeBSD) has a more advanced mincore() syscall than Linux, offering a better analysis (dirty status is known), and they implemented posix_fadvise... I have never used pgfincore, and have analyzed it solely based on the examples provided, second-hand info, and some code reading, so the following may be wrong; feel free to correct. The UI of pgfincore suggests that to save a snapshot of an object, pgfincore reads all the segments of the object and queries the OS cache. This may take a lot of time on big databases. If this is done at shutdown time, the time to finish shutdown will be proportional to the size of the database, rather than proportional to the amount of the data files resident in the OS cache. There is a previous thread about that hibernation feature. Mitsuru IWASAKI did a patch, and it triggered some interesting discussions.
Some notes in this thread are outdated now, but it's worth having a look at it: http://www.postgresql.org/message-id/20110504.231048.113741617.iwas...@jp.freebsd.org https://commitfest.postgresql.org/action/patch_view?id=549 Thanks for sharing these. I agree with Greg's observations there that shared buffers are becoming an ever-smaller subset of the RAM available on modern machines. But until it can be done in a platform-independent way, I doubt it will ever be accepted in Postgres. Even when it's accepted, it would have to be off by default because of the slow shutdown mentioned above. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB : www.EnterpriseDB.com : The Enterprise PostgreSQL Company
Re: [HACKERS] Audit of logout
On Mon, Jun 23, 2014 at 5:42 PM, Fujii Masao masao.fu...@gmail.com wrote: On Sat, Jun 21, 2014 at 12:59 PM, Joe Conway m...@joeconway.com wrote: On 06/13/2014 07:29 AM, Tom Lane wrote: Fujii Masao masao.fu...@gmail.com writes: On Thu, Jun 12, 2014 at 8:51 PM, Fujii Masao masao.fu...@gmail.com wrote: Some users enable log_disconnections in postgresql.conf to audit all logouts. But since log_disconnections is defined with PGC_BACKEND, it can be changed at connection start. This means that any client (even a non-superuser) can freely disable log_disconnections so that his or her logout is not logged, even when the system admin enables it in postgresql.conf. Isn't this problematic for audit? That's harmful for audit purposes. I think that we should make log_disconnections PGC_SUSET rather than PGC_BACKEND in order to forbid non-superusers from changing its setting. Attached patch does this. This whole argument seems wrong unless I'm missing something:

test=# set log_connections = on;
ERROR:  parameter "log_connections" cannot be set after connection start
test=# set log_disconnections = off;
ERROR:  parameter "log_disconnections" cannot be set after connection start

You can change log_connections/log_disconnections via a connection option, as follows:

$ grep log_disconnections $PGDATA/postgresql.conf
log_disconnections = on
$ psql -U hoge -d "options='-c log_disconnections=off'"
=> show log_disconnections;
 log_disconnections
--------------------
 off
(1 row)

=> \du
                             List of roles
 Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
 hoge      |                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication | {}

I wonder whether we should just get rid of log_disconnections as a separate variable, instead logging disconnections when log_connections is set. That might be a good idea, though David pointed out the merit of keeping those two parameters separate upthread, and I understand his reasoning.
http://www.postgresql.org/message-id/1402675662004-5807224.p...@n5.nabble.com Let me explain the problem which I'd like to address here, again. The problem is that any client (even a non-superuser) can change the setting of log_disconnections when it connects to the server. For example, you can do so by using the "options" connection parameter as follows:

$ psql -U hoge -d "options='-c log_disconnections=off'"
=> show log_disconnections;
 log_disconnections
--------------------
 off
(1 row)

This means that, even when the DBA enables log_disconnections in postgresql.conf in order to audit all logouts, a client can freely avoid logging of his or her logout. I think this situation is problematic, especially for audit purposes. You may think that logging logins is enough for audit purposes and logging logouts is not so important. But imagine the case where all logins are logged but some corresponding logouts are not. This would make it difficult to check and verify the validity of the audit logs; it's not easy to handle users for whom only the login is logged. Any client can enable/disable log_disconnections because it's defined with PGC_BACKEND context. By definition, GUC parameters with PGC_BACKEND can be changed at connection startup, but not after the connection is established. BTW, log_connections is also defined with PGC_BACKEND, so any client can change its setting at connection startup. But *fortunately* it's used by the server before the changed value supplied by the client takes effect at connection startup. So the server always uses the setting value in postgresql.conf, whether a client tries to change log_connections or not. Therefore log_connections doesn't cause the problem described above at all. That's good for audit purposes. OTOH, ISTM that defining log_connections with PGC_BACKEND is a bit strange, since a client cannot effectively change it at all. To address the problem, I'm thinking of changing the GUC context of log_disconnections from PGC_BACKEND to PGC_SIGHUP.
This prevents a client from changing the setting. At the top of the thread, I posted a patch which changed the context to PGC_SUSET, but now I'm thinking PGC_SIGHUP is better. Since log_disconnections is used (if it's enabled, the callback function which logs the logout is registered) just after the connection is established, using PGC_SUSET and allowing a superuser to change the setting by SET would be useless. Changing the GUC context of log_disconnections causes two behavior changes. (1) As explained above, it prevents a client from changing the setting. Yes, this is what I want. (2) It allows the DBA to change log_disconnections of running backends by reloading the configuration file. But such a changed value has no effect on whether logout is logged or not
Re: [HACKERS] buildfarm and rolling release distros
On Tue, Jul 01, 2014 at 08:35:16PM -0400, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Jul 1, 2014 at 12:49 PM, Andrew Dunstan and...@dunslane.net wrote: I'm also not sure how to designate these machines. The buildfarm server metadata isn't designed for auto-updating build platforms. But no doubt if necessary we can come up with something. Off-hand, it seems like we could give it a try, and abandon the effort if it proves too problematic. If a majority of buildfarm critters were like that, it'd be too confusing. But as long as they are few, not all following the same update stream, and well labeled in the buildfarm status page, I think we could cope. +1. The buildfarm has one such member already, anchovy, and I recall it having given at least one helpful forewarning. It shows as Arch Linux testing [updated daily], which is sufficient annotation for me. Its failure rate has been low; member-caused failures due to ENOSPC and other miscellany are a good deal more common. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] pg_xlogdump --stats
At 2014-07-02 04:20:31 +0300, ma...@juffo.org wrote: As far as functionality goes, it does exactly what I needed it to do; the output is very clear. Good to hear. You might also add units (kB/MB) to the table like pg_size_pretty, although that would make the magnitudes harder to gauge. I think df-style -k/m/g switches might be useful to scale all printed values. If we did that, we could also shrink the columns accordingly. But that would also complicate the code a bit. I think it's better to start with the simplest thing, and add more tweaks later. -- Abhijit