Re: [HACKERS] Enabling Checksums
On Thu, 2013-03-07 at 13:45 -0800, Jeff Davis wrote: I need to do another self-review after these changes and some more extensive testing, so I might have missed a couple things.

New patch attached. Aside from rebasing, I also found a problem with temp tables. At first I was going to fix it by continuing to exclude temp tables from checksums entirely. But then I re-thought it and decided to just checksum temp tables, too. Excluding temp tables from checksums means more special cases in the code, and more documentation. After thinking about it, there is no huge benefit to excluding temp tables:

* small temp tables will be in memory only, and never checksummed
* no WAL for temp tables, so the biggest cost of checksums is non-existent
* there are good reasons to want to checksum temp tables, because they can be used to stage data for permanent tables

However, I'm willing to be convinced to exclude temp tables again.

Regards, Jeff Davis

Attachments: checksums-20130312.patch.gz, replace-tli-with-checksums-20130312.patch.gz
Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 03/12/2013 06:27 AM, Craig Ringer wrote: Think also about the case where someone wants to change multiple values together and having just some set and not others would be inconsistent. Yeah, that's a killer. The reload would need to be scheduled for COMMIT time, it can't be done by `SET PERSISTENT` directly.

Thinking about this some more, I'm not sure this is a good idea. Right now, SET takes effect immediately. Always, without exception. Delaying SET PERSISTENT's effects until commit would make it inconsistent with SET's normal behaviour. However, *not* delaying it would make it another quirky not-transactional not-atomic command. That's OK, but if it's not going to follow transactional semantics it should not be allowed to run within a transaction, like VACUUM.

Writing the changes immediately but deferring the reload until commit seems to be the worst of those two worlds.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
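[Editor's note: for reference, a minimal psql sketch of the existing plain-SET semantics the thread keeps comparing against — the change is visible immediately, before commit, but it is still undone if the transaction aborts:]

    BEGIN;
    SET work_mem = '64MB';
    SHOW work_mem;   -- '64MB': the change is visible immediately
    ROLLBACK;
    SHOW work_mem;   -- previous value: a plain SET is reverted on abort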
Re: [HACKERS] Statistics and selectivity estimation for ranges
Hi! Thanks for your work on this patch!

On Tue, Mar 12, 2013 at 8:03 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So, after some hacking, I ended up with this version. I find it more readable, I hope I didn't miss anything. This seems to produce results that are close, but not identical, to the original patch. I'm not sure where the discrepancy is coming from, or which patch is more correct in that respect. I'll continue from this tomorrow, but if you have the time, please take a look and let me know what you think.

I've read your explanation and version of the patch. In general it seems correct to me. The one reason I broke up the abstraction in some functions is infinities. For example, in calc_length_hist_frac one or both of length1 and length2 can be infinity. In the line

    frac = area / (length2 - length1);

you can get a NaN result. I've specifically adjusted the code to get a more or less correct result in this case.

Another minor note about this line:

    bin_width *= get_position(typcache, lower, hist_lower[i], hist_lower[i + 1]);

ISTM it should look like

    bin_width -= 1.0 - get_position(typcache, lower, hist_lower[i], hist_lower[i + 1]);

Imagine the lower and upper bounds fall into the same histogram bin. In this case we should subtract the lengths of both parts that were cut off at the left and at the right.

-- With best regards, Alexander Korotkov.
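[Editor's note: to illustrate the last point with made-up numbers (not from the patch): take a bin of width 1 where the query's bounds cut off a piece of length 0.2 at the left and 0.3 at the right. The covered fraction comes from subtracting both cut-off lengths:

    1 - 0.2 - 0.3 = 0.5

whereas composing the two cuts multiplicatively, e.g. 0.7 \times 0.8 = 0.56, treats the second cut as a fraction of the already-shrunk bin and overestimates the covered width. The exact sign convention depends on what get_position returns; this only shows why the two cuts should combine by subtraction, not multiplication.]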
Re: [HACKERS] leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume
On Mar 13, 2013 3:04 AM, Tom Lane t...@sss.pgh.pa.us wrote: Palle Girgensohn gir...@freebsd.org writes: ... I got lots of space freed up, but it seems that after that the disk usage grows linearly (it seems to leave many inodes unreferenced). Hm. We've seen issues in the past with PG processes failing to close no-longer-useful files promptly, but ... Strange thing is I cannot find any open files. ... that suggests there's something else going on. The unreferenced inodes are almost exclusively around 16 MB in size, i.e. they would most probably all be pg_xlog files. Have you got any sort of WAL archiving active, and if so maybe that's holding onto WAL files? Not that it's clear how come lsof wouldn't tattle on an archiving process either. Stopping postgresql briefly did not help, I tried that. That seems to point the finger at some non-postgres cause. I confess I can't guess what.

Yeah, unreferenced inodes with no open files, and only discoverable with fsck, sounds like a filesystem bug to me. Particularly since it showed up just after an operating system upgrade, and doesn't go away with a postgres restart...

/Magnus
Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)
Tom Lane wrote: Yeah, I'm drifting towards the position that we should just define the defaults as being whatever they are locally, rather than trying to be cute about supporting remotely-executed defaults. It looks to me like if we try to do the latter, we're going to have pitfalls and weird corner cases that will never be quite transparent. There's also the argument that this'd be a lot of work that benefits only some FDWs, since the whole concept of remote column defaults doesn't apply when the FDW's data source isn't a traditional RDBMS.

That was my first thought on the topic, to have a solution that is simple (if not perfect). Your argument that it would be unpleasant to lose the ability to use sequence-generated remote default values made me reconsider. But there is a workaround, namely to use a trigger before insert to generate an automatic primary key (e.g. if the inserted value is NULL). Maybe it would be good to add a few hints at workarounds like that to the documentation if it's going to be local defaults.

Yours, Laurenz Albe
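[Editor's note: a minimal sketch of the trigger workaround Laurenz describes, run on the remote server; the table remote_tab, its id column, and the sequence name are hypothetical, not from the thread:]

    CREATE SEQUENCE remote_tab_id_seq;

    CREATE FUNCTION remote_tab_fill_id() RETURNS trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        -- generate a key only when the incoming row left it NULL
        IF NEW.id IS NULL THEN
            NEW.id := nextval('remote_tab_id_seq');
        END IF;
        RETURN NEW;
    END;
    $$;

    CREATE TRIGGER remote_tab_fill_id
        BEFORE INSERT ON remote_tab
        FOR EACH ROW EXECUTE PROCEDURE remote_tab_fill_id();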
Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
2013-03-13 07:42 keltezéssel, Craig Ringer írta: On 03/12/2013 06:27 AM, Craig Ringer wrote: Think also about the case where someone wants to change multiple values together and having just some set and not others would be inconsistent. Yeah, that's a killer. The reload would need to be scheduled for COMMIT time, it can't be done by `SET PERSISTENT` directly. Thinking about this some more, I'm not sure this is a good idea. Right now, SET takes effect immediately. Always, without exception. Delaying SET PERSISTENT's effects until commit would make it inconsistent with SET's normal behaviour. However, *not* delaying it would make it another quirky not-transactional not-atomic command. That's OK, but if it's not going to follow transactional semantics it should not be allowed to run within a transaction, like VACUUM. Writing the changes immediately but deferring the reload until commit seems to be the worst of those two worlds.

I was thinking about it a little. There is a hook that runs at the end of (sub-)transactions. It can be abused for this purpose to make SET PERSISTENT transactional. The subtransactions can also stack these settings, by forgetting settings upon ROLLBACK [ TO SAVEPOINT ] and overwriting previous settings upon COMMIT and RELEASE SAVEPOINT. All it needs is a list and maintaining intermediate pointers when entering into a new level of SAVEPOINT. The functions to register such hooks are in src/include/access/xact.h:

    extern void RegisterXactCallback(XactCallback callback, void *arg);
    extern void UnregisterXactCallback(XactCallback callback, void *arg);
    extern void RegisterSubXactCallback(SubXactCallback callback, void *arg);
    extern void UnregisterSubXactCallback(SubXactCallback callback, void *arg);

The per-session local effect can also be immediate, it just needs the normal SET and SET PERSISTENT code paths to be unified.

On the other hand, it would make a lot of sense to implement it as one setting per file or extending the code to allow modifying settings in bulk. The one setting per file should be easier and it would also allow extensions to drop their settings automatically into the automatic config directory. I don't know who mentioned this idea about extensions but it also came up a while back.

Best regards, Zoltán Böszörményi

-- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
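[Editor's note: in SQL terms, the savepoint-stacking behaviour Zoltán describes would look something like the following. SET PERSISTENT is only a proposal at this point, so this is a sketch of intended semantics, not working syntax:]

    BEGIN;
    SET PERSISTENT work_mem = '64MB';
    SAVEPOINT s1;
    SET PERSISTENT work_mem = '128MB';
    ROLLBACK TO SAVEPOINT s1;   -- the '128MB' entry is forgotten
    COMMIT;                     -- only '64MB' is written to the config file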
Re: [HACKERS] Add some regression tests for SEQUENCE
robins wrote: Attached is a small patch to test corner cases related to Sequences (basically aimed at increasing code-coverage of sequence.sql in regression tests). Look forward to any and all feedback.

Looks ok, except that the patch is backwards (all added lines start with -). I found a typo: "exit" instead of "exist". You should add the patch to the next commitfest (http://wiki.postgresql.org/wiki/Submitting_a_Patch).

Yours, Laurenz Albe
Re: [HACKERS] scanner/parser minimization
On 2 March 2013 18:47, Heikki Linnakangas hlinnakan...@vmware.com wrote: Uh ... no. I haven't looked into why the flex tables are so large, but this theory is just wrong. See ScanKeywordLookup(). Interestingly, the yy_transition array generated by flex used to be much smaller:

    8.3:    22072 elements
    8.4:    62623 elements
    master: 64535 elements

The big jump between 8.3 and 8.4 was caused by the introduction of the unicode escapes: U&'foo' [UESCAPE 'x']. And in particular, the error rule for the UESCAPE, which we use to avoid backtracking. I experimented with a patch that uses two extra flex states to shorten the error rules, see attached. The idea is that after lexing a unicode literal like U&'foo', you enter a new state, in which you check whether an UESCAPE 'x' follows. This slashes the size of the array to 36581 elements.

+1 to do this sooner rather than later

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
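[Editor's note: for reference, the construct being lexed here looks like this (example adapted from the Unicode-escape syntax in the PostgreSQL docs):]

    SELECT U&'d!0061t!+000061' UESCAPE '!';   -- evaluates to 'data'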
Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
2013-03-13 09:09 keltezéssel, Boszormenyi Zoltan írta: 2013-03-13 07:42 keltezéssel, Craig Ringer írta: On 03/12/2013 06:27 AM, Craig Ringer wrote: Think also about the case where someone wants to change multiple values together and having just some set and not others would be inconsistent. Yeah, that's a killer. The reload would need to be scheduled for COMMIT time, it can't be done by `SET PERSISTENT` directly. Thinking about this some more, I'm not sure this is a good idea. Right now, SET takes effect immediately. Always, without exception. Delaying SET PERSISTENT's effects until commit would make it inconsistent with SET's normal behaviour. However, *not* delaying it would make it another quirky not-transactional not-atomic command. That's OK, but if it's not going to follow transactional semantics it should not be allowed to run within a transaction, like VACUUM. Writing the changes immediately but deferring the reload until commit seems to be the worst of those two worlds.

I was thinking about it a little. There is a hook that runs at the end of (sub-)transactions. It can be abused for this purpose to make SET PERSISTENT transactional. The subtransactions can also stack these settings, by forgetting settings upon ROLLBACK [ TO SAVEPOINT ] and overwriting previous settings upon COMMIT and RELEASE SAVEPOINT.

By overwriting at RELEASE SAVEPOINT I meant only the memory copy, so the value that was set last is remembered at the same level of subxact. The final settings make it to the configuration file at COMMIT time.

All it needs is a list and maintaining intermediate pointers when entering into a new level of SAVEPOINT. The functions to register such hooks are in src/include/access/xact.h:

    extern void RegisterXactCallback(XactCallback callback, void *arg);
    extern void UnregisterXactCallback(XactCallback callback, void *arg);
    extern void RegisterSubXactCallback(SubXactCallback callback, void *arg);
    extern void UnregisterSubXactCallback(SubXactCallback callback, void *arg);

The per-session local effect can also be immediate, it just needs the normal SET and SET PERSISTENT code paths to be unified.

On the other hand, it would make a lot of sense to implement it as one setting per file or extending the code to allow modifying settings in bulk. The one setting per file should be easier and it would also allow extensions to drop their settings automatically into the automatic config directory. I don't know who mentioned this idea about extensions but it also came up a while back.

Best regards, Zoltán Böszörményi

-- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] Add some regression tests for SEQUENCE
Thanks Laurenz. Would correct these (and read up) before submitting the next patch.

-- Robins Tharakan

On 13 March 2013 13:49, Albe Laurenz laurenz.a...@wien.gv.at wrote: robins wrote: Attached is a small patch to test corner cases related to Sequences (basically aimed at increasing code-coverage of sequence.sql in regression tests). Look forward to any and all feedback. Looks ok, except that the patch is backwards (all added lines start with -). I found a typo: "exit" instead of "exist". You should add the patch to the next commitfest (http://wiki.postgresql.org/wiki/Submitting_a_Patch). Yours, Laurenz Albe
Re: [HACKERS] Duplicate JSON Object Keys
On Fri, Mar 8, 2013 at 4:42 PM, Andrew Dunstan and...@dunslane.net wrote: So my order of preference for the options would be: 1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored 2. Throw an error when a JSON type has duplicate keys 3. Have the accessors find the last instance of a key and return that value 4. Let things remain as they are now. On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid.

Why not? Because it's the wrong thing to do, or because it would be slower?

What I think is tricky here is that there's more than one way to conceptualize what the JSON data type really is. Is it a key-value store of sorts, or just a way to store text values that meet certain minimalist syntactic criteria? I had imagined it as the latter, in which case normalization isn't sensible. But if you think of it the first way, then normalization is not only sensible, but almost obligatory. For example, we don't feel bad about this:

    rhaas=# select '1e1'::numeric;
     numeric
    ---------
          10
    (1 row)

I think Andrew and I had envisioned this as basically a text data type that enforces some syntax checking on its input, hence the current design. But I'm not sure that's the ONLY sensible design.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Writable foreign tables: how to identify rows
On Wed, Mar 6, 2013 at 12:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: On the other hand, I don't have a problem with decreeing that non-Postgres FDWs need to use PK row identification in the first release; which would be the consequence if we don't do anything about allowing new system columns in 9.3. We will certainly need that style of row identification to be written and tested anyway. It won't stop us from extending things later.

Oh, I didn't realize that was how it was going to work out. That seems very reasonable to me.

There is a performance problem with forcing DELETE FROM ft WHERE nonkey = 5 to be pushed to the remote side as SELECT pk FROM ft WHERE nonkey = 5 followed by DELETE FROM ft WHERE pk = $1 for each pk value returned by the SELECT, which sounds like it's what will happen under this system. But I don't have any problem leaving that as future work.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 2013-03-12 10:46:53 +0530, Amit Kapila wrote: Do you mean to say that because some variables can only be set after restart can lead to inconsistency, or is it because of asynchronous nature of pg_reload_conf()?

As long as SET PERSISTENT cannot be executed inside a transaction - or only takes effect after its end - there doesn't seem to be any problem executing ProcessConfigFile() directly. The reason it's not executed directly is that it is normally called during query execution, and it wouldn't be nice changing stuff after half of a query has been processed.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 2013-03-13 09:09:24 +0100, Boszormenyi Zoltan wrote: I was thinking about it a little. There is a hook that runs at the end of (sub-)transactions. It can be abused for this purpose to make SET PERSISTENT transactional. The subtransactions can also stack these settings, by forgetting settings upon ROLLBACK [ TO SAVEPOINT ] and overwriting previous settings upon COMMIT and RELEASE SAVEPOINT. All it needs is a list and maintaining intermediate pointers when entering into a new level of SAVEPOINT. The functions to register such hooks are in src/include/access/xact.h:

    extern void RegisterXactCallback(XactCallback callback, void *arg);
    extern void UnregisterXactCallback(XactCallback callback, void *arg);
    extern void RegisterSubXactCallback(SubXactCallback callback, void *arg);
    extern void UnregisterSubXactCallback(SubXactCallback callback, void *arg);

(sub)xact commit/abort already calls AtEOXact_GUC(commit, nestlevel). So you wouldn't even need that. It seems we could add another value to enum GucStackState, like GUC_SET_PERSISTENT - and process those only if commit && nestlevel == 1. Every time you see one with commit && nestlevel > 1 you put it into the stack one level up. This seems like it's somewhat in line with the way SET LOCAL is implemented?

On the other hand, it would make a lot of sense to implement it as one setting per file or extending the code to allow modifying settings in bulk. The one setting per file should be easier and it would also allow extensions to drop their settings automatically into the automatic config directory. I don't know who mentioned this idea about extensions but it also came up a while back.

I still think one-setting-per-file is the right way to go, yes.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Duplicate JSON Object Keys
On 03/13/2013 08:17 PM, Robert Haas wrote: I think Andrew and I had envisioned this as basically a text data type that enforces some syntax checking on its input, hence the current design. But I'm not sure that's the ONLY sensible design.

We're probably stuck with it at this point, but it may well be worth considering the later introduction of a compatible `jsonobj` that stores parsed and normalized json objects in some internal format the client doesn't have to care about, like serialized V8 JS VM objects.

I suspect that such a type is better offered by a contrib until/unless PL/V8 or something similar becomes a core language. It'd be nuts to try to re-implement all of the JSON and javascript object functionality in a javascript engine when we can just plug an existing one in and use its JSON and javascript object manipulation.

The minimalist approach makes sense for the json type precisely because it's just validated text, but I don't think it makes sense to continually extend it and slowly reinvent a whole javascript engine in Pg. If we're going to do things like normalizing json I think that's a job for a real JavaScript engine that understands Javascript objects.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Duplicate JSON Object Keys
On 03/13/2013 08:17 AM, Robert Haas wrote: On Fri, Mar 8, 2013 at 4:42 PM, Andrew Dunstan and...@dunslane.net wrote: So my order of preference for the options would be: 1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored 2. Throw an error when a JSON type has duplicate keys 3. Have the accessors find the last instance of a key and return that value 4. Let things remain as they are now. On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid. Why not? Because it's the wrong thing to do, or because it would be slower? What I think is tricky here is that there's more than one way to conceptualize what the JSON data type really is. Is it a key-value store of sorts, or just a way to store text values that meet certain minimalist syntactic criteria? I had imagined it as the latter, in which case normalization isn't sensible. But if you think of it the first way, then normalization is not only sensible, but almost obligatory. For example, we don't feel bad about this:

    rhaas=# select '1e1'::numeric;
     numeric
    ---------
          10
    (1 row)

I think Andrew and I had envisioned this as basically a text data type that enforces some syntax checking on its input, hence the current design. But I'm not sure that's the ONLY sensible design.

I think we've moved on from this point, because a) other implementations allow duplicate keys, b) it's trivially easy to make Postgres generate such json, and c) there is some dispute about exactly what the spec mandates.

I'll be posting a revised patch shortly that doesn't error out but simply uses the value for the later key lexically.

cheers andrew
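[Editor's note: to make the intended last-key-wins behaviour concrete, a sketch using the 9.3-era json field accessor against the revised patch Andrew describes; treat the exact output as illustrative:]

    select '{"a": 1, "a": 2}'::json -> 'a';
     ?column?
    ----------
     2
    (1 row)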
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On Wednesday, March 13, 2013 6:10 PM Andres Freund wrote: On 2013-03-12 10:46:53 +0530, Amit Kapila wrote: Do you mean to say that because some variables can only be set after restart can lead to inconsistency, or is it because of asynchronous nature of pg_reload_conf()? As long as SET PERSISTENT cannot be executed inside a transaction - or only takes effect after its end - there doesn't seem to be any problem executing ProcessConfigFile() directly.

Do you mean to say we call ProcessConfigFile() directly at the end of SET PERSISTENT instead of pg_reload_conf()? But in that case, would it load the variables for other backends?

With Regards, Amit Kapila.
Re: [HACKERS] Duplicate JSON Object Keys
On 2013-03-13 20:48:57 +0800, Craig Ringer wrote: On 03/13/2013 08:17 PM, Robert Haas wrote: I think Andrew and I had envisioned this as basically a text data type that enforces some syntax checking on its input, hence the current design. But I'm not sure that's the ONLY sensible design. We're probably stuck with it at this point, but it may well be worth considering the later introduction of a compatible `jsonobj` that stores parsed and normalized json objects in some internal format the client doesn't have to care about, like serialized V8 JS VM objects. I suspect that such a type is better offered by a contrib until/unless PL/V8 or something similar becomes a core language. It'd be nuts to try to re-implement all of the JSON and javascript object functionality in a javascript engine when we can just plug an existing one in and use its JSON and javascript object manipulation. The minimalist approach makes sense for the json type precisely because it's just validated text, but I don't think it makes sense to continually extend it and slowly reinvent a whole javascript engine in Pg.

While I am not convinced - but not the contrary either - that using something like V8 is a good idea, I wish the patch adding json had reserved the first byte in the varlena for the 'json encoding' or something similar. That would have left the road open for easily adding different encodings in the future. Now you're left with marking it with a null byte at the beginning, or similar atrocities...

Just wanted to say that we might want to think about such stuff the next time we add a type, given that we preserve cross-version compatibility of on-disk data.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 2013-03-13 18:38:12 +0530, Amit Kapila wrote: On Wednesday, March 13, 2013 6:10 PM Andres Freund wrote: On 2013-03-12 10:46:53 +0530, Amit Kapila wrote: Do you mean to say that because some variables can only be set after restart can lead to inconsistency, or is it because of asynchronous nature of pg_reload_conf()? As long as SET PERSISTENT cannot be executed inside a transaction - or only takes effect after its end - there doesn't seem to be any problem executing ProcessConfigFile() directly. Do you mean to say we call ProcessConfigFile() directly at the end of SET PERSISTENT instead of pg_reload_conf()? But in that case, would it load the variables for other backends?

I'd say do both. Yes, we would evaluate config potentially twice. Who cares. Messages inside non-postmaster environments are only output at DEBUG2 anyway.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On Wednesday, March 13, 2013 6:44 PM Andres Freund wrote: On 2013-03-13 18:38:12 +0530, Amit Kapila wrote: Do you mean to say we call ProcessConfigFile() directly at the end of SET PERSISTENT instead of pg_reload_conf()? But in that case, would it load the variables for other backends? I'd say do both. Yes, we would evaluate config potentially twice. Who cares. Messages inside non-postmaster environments are only output at DEBUG2 anyway.

I can see your point: when you say "do both", you want the values to be immediately available in the current session, which can make the user happy. However, if there is any error during ProcessConfigFile(), it could be a little inconvenient for the user, as the setting would have been written to the file but the in-memory processing has failed. Ideally there should not be any error in ProcessConfigFile(), but it might be that some palloc or similar call would be enough to make the user unhappy.

With Regards, Amit Kapila.
Re: [HACKERS] Writable foreign tables: how to identify rows
On Wed, Mar 6, 2013 at 11:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: Shigeru Hanada shigeru.han...@gmail.com writes: I'm not sure whether postgres_fdw should support it, but updatable views have no system column including ctid. So, we need a magic identifier, perhaps a set of primary key value(s), to support updating remote updatable views via foreign tables. Yeah, I considered that. I thought seriously about proposing that we forget magic row identifiers altogether, and instead make postgres_fdw require a remote primary key for a foreign table to be updatable.

IMO, utilizing anything but this for remote record identification is an implementation-specific optimization. Aren't the semantics different though? If you go:

    update foo set id = 1 where id = 1;

the primary key would not change, but the ctid would. Or is that already handled?

merlin
Re: [HACKERS] Writable foreign tables: how to identify rows
Robert Haas robertmh...@gmail.com writes: On Wed, Mar 6, 2013 at 12:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: On the other hand, I don't have a problem with decreeing that non-Postgres FDWs need to use PK row identification in the first release; which would be the consequence if we don't do anything about allowing new system columns in 9.3. We will certainly need that style of row identification to be written and tested anyway. It won't stop us from extending things later. Oh, I didn't realize that was how it was going to work out. That seems very reasonable to me. There is a performance problem with forcing DELETE FROM ft WHERE nonkey = 5 to be pushed to the remote side as SELECT pk FROM ft WHERE nonkey = 5 followed by DELETE FROM ft WHERE pk = $1 for each pk value returned by the SELECT, which sounds like it's what will happen under this system. But I don't have any problem leaving that as future work.

That performance issue exists for *all* foreign updates/deletes at the moment, with or without a ctid-ish column. As you say, fixing it is material for 9.4 or later. (It's possible that an FDW could dodge this by itself without any additional help from the core code, but I'm not sure. Anyway postgres_fdw hasn't tried yet, and I think there are a number of things that are higher priority to worry about than bulk update performance.)

regards, tom lane
Re: [HACKERS] Writable foreign tables: how to identify rows
Merlin Moncure mmonc...@gmail.com writes: On Wed, Mar 6, 2013 at 11:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: Yeah, I considered that. I thought seriously about proposing that we forget magic row identifiers altogether, and instead make postgres_fdw require a remote primary key for a foreign table to be updatable. IMO, utilizing anything but this for remote record identification is an implementation-specific optimization. Aren't the semantics different though? If you go: update foo set id = 1 where id = 1; the primary key would not change, but the ctid would. Or is that already handled?

In postgres_fdw as it currently stands, the remote ctid would change. I'm not sure we should posit that as a universal property of FDWs though. It's not even a meaningful question for FDWs with no underlying rowid concept.

regards, tom lane
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 2013-03-13 18:52:48 +0530, Amit Kapila wrote: On Wednesday, March 13, 2013 6:44 PM Andres Freund wrote: I'd say do both. Yes, we would evaluate config potentially twice. Who cares. Messages inside non-postmaster environments are only output at DEBUG2 anyway. I can see your point: when you say "do both", you want the values to be immediately available in the current session, which can make the user happy. However, if there is any error during ProcessConfigFile(), it could be a little inconvenient for the user, as the setting would have been written to the file but the in-memory processing has failed. Ideally there should not be any error in ProcessConfigFile(), but it might be that some palloc or similar call would be enough to make the user unhappy.

But that's pretty independent from this? If anything it allows for *better* reporting of problems, since you could convert the log level to WARNING if ProcessConfigFile() is executed in the foreground - which will then normally be displayed for at least interactive sessions. If you don't do it immediately you're in the same situation after the pg_reload_config(), just that the user won't see any error messages.

There is something I am more worried about, which is that it might be bad if a postmaster child adopts new values before the postmaster does. Right now I can't think of any new dangers, since the reverse is already true...

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Writable foreign tables: how to identify rows
On Wed, Mar 13, 2013 at 9:30 AM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: Aren't the semantics different though? If you go: update foo set id = 1 where id = 1; the primary key would not change, but the ctid would. Or is that already handled? In postgres_fdw as it currently stands, the remote ctid would change. I'm not sure we should posit that as a universal property of FDWs though. It's not even a meaningful question for FDWs with no underlying rowid concept.

I just find it odd that the rowid concept is used at all without a strong guarantee that the record you are referencing is the one you are supposed to be referencing. Basically I'm saying PKEY semantics are the correct ones, and that ctid is ok to use iff they match the pkey ones. I don't think this is possible unless you maintain a remote lock on the ctid between when you fetch it and do some other operation.

merlin
Re: [HACKERS] Writable foreign tables: how to identify rows
Merlin Moncure mmonc...@gmail.com writes: I just find it odd that the rowid concept is used at all without a strong guarantee that the record you are referencing is the one you are supposed to be referencing. Basically I'm saying PKEY semantics are the correct ones, and that ctid is ok to use iff they match the pkey ones. I don't think this is possible unless you maintain a remote lock on the ctid between when you fetch it and do some other operation.

postgres_fdw does maintain such a lock (it does SELECT FOR UPDATE when scanning an update/delete target table). However, I'm not exactly sure why you think a pkey-based design would be free of such hazards. pkeys aren't immutable either ...

regards, tom lane
Re: [HACKERS] Writable foreign tables: how to identify rows
On 2013-03-13 09:51:59 -0500, Merlin Moncure wrote: On Wed, Mar 13, 2013 at 9:30 AM, Tom Lane t...@sss.pgh.pa.us wrote: In postgres_fdw as it currently stands, the remote ctid would change. I'm not sure we should posit that as a universal property of FDWs though. It's not even a meaningful question for FDWs with no underlying rowid concept. I just find it odd that the rowid concept is used at all without a strong guarantee that the record you are referencing is the one you are supposed to be referencing. Basically I'm saying PKEY semantics are the correct ones, and that ctid is ok to use iff they match the pkey ones. I don't think this is possible unless you maintain a remote lock on the ctid between when you fetch it and do some other operation.

ISTM that's mostly the same semantics as you have locally. If you locally do an UPDATE it will use the ctid for location of the updated row. If it got updated since (which you can detect by checking whether xmax has committed) the behaviour depends on the transaction isolation level. On REPEATABLE READ upwards it errors out; otherwise it will do the EvalPlanQual magic, which is to make sure the WHERE condition still fits the row. That's obviously where the cases start to differ. But that's also the case if you use the primary key, since you obviously can have more quals than just that on the origin side.

Perhaps pgsql-fdw should make sure the update was performed *without* following the ctid chain to a new valid tuple?

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume
We're seeing similar behaviour on several of our FreeBSD servers too. It doesn't look like open files, or filesystem snapshots. Rebooting does reset it, but restarting PG makes no difference. We've got an assortment of different versions of both FreeBSD and PostgreSQL, some of which are demonstrating this behaviour, some aren't. Here's a quick breakdown of versions and what we've got running:

    FreeBSD   PostgreSQL   Leaking?
    8.0       8.4.4        no
    8.2       9.0.4        no
    8.3       9.1.4        yes
    8.3       9.2.3        yes
    9.1       9.2.3        yes

All of these machines are under similar load patterns and (apart from the version differences) are set up essentially the same and are doing the same job. They all have hot standbys, yet this problem doesn't exist on any of the standby servers. We haven't done anything with tablespaces; the database has its own dedicated partition (although pg_log/pg_xlog are both symlinked out to /usr). However (just to throw a spanner in the works) we do have another server running fbsd8.3/pg9.1.4 which ISN'T showing this behaviour - although its load patterns are quite different.

I'm not sure if this is going to help, but here's a graph of this disk space disparity over the last few days (Y axis is in gigabytes). The flat-ish part in the middle is the weekend where we have little traffic, so we can at least say it's not constant: http://i.imgur.com/jlbgzNI.png

Up until now we've been upgrading things in the hope that the problem will go away, but since we've got one server up to fbsd9.1/pg9.2.3 and still seeing the problem we're a little stumped. Any ideas about how we can go about debugging this would be appreciated.

Thanks, Dan

On 13 March 2013 07:39, Magnus Hagander mag...@hagander.net wrote: Yeah, unreferenced inodes with no open files, and only discoverable with fsck, sounds like a filesystem bug to me. Particularly since it showed up just after an operating system upgrade, and doesn't go away with a postgres restart... /Magnus
Re: [HACKERS] Writable foreign tables: how to identify rows
Andres Freund and...@2ndquadrant.com writes: Perhaps pgsql-fdw should make sure the update was performed *without* following the ctid chain to a new valid tuple?

I did think about these issues before committing the patch ;-)

The basic theory in PG's existing design is to postpone locking rows as long as possible; which means that when we do finally lock a target row, we have to check if it's changed since we scanned it, and that leads into the whole EvalPlanQual mess. I considered trying to make FDWs duplicate that behavior, but gave up on it. In the first place, it's hard to see how you even define "did the row change" unless you have something exactly like ctids (including forward update chains). And in the second place, this would mandate yet another round trip to the remote server for each row to be updated.

In the patch as committed, the expectation (which is satisfied by postgres_fdw) is that FDWs should lock rows that are candidates for update/delete during the initial scan. This avoids an extra round trip and justifies leaving EvalPlanQual out of the picture altogether. The cost is that we may lock rows that we end up not updating, because they fail locally-checked restriction or join conditions. I think on balance that's a good trade-off.

regards, tom lane
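[Editor's note: concretely, the remote conversation for an UPDATE on a postgres_fdw table then looks roughly like this — a simplified sketch with hypothetical names; the actual deparsed SQL and parameter handling differ in detail:]

    -- scan phase: fetch and lock candidate rows, remembering their ctids
    SELECT ctid, nonkey FROM remote_tab WHERE nonkey = 5 FOR UPDATE;

    -- modify phase: one statement per locked row that passes local quals
    UPDATE remote_tab SET nonkey = 6 WHERE ctid = $1;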
Re: [HACKERS] Writable foreign tables: how to identify rows
On Wed, Mar 13, 2013 at 10:15 AM, Tom Lane t...@sss.pgh.pa.us wrote: In the patch as committed, the expectation (which is satisfied by postgres_fdw) is that FDWs should lock rows that are candidates for update/delete during the initial scan. This avoids an extra round trip and justifies leaving EvalPlanQual out of the picture altogether. The cost is that we may lock rows that we end up not updating, because they fail locally-checked restriction or join conditions. I think on balance that's a good trade-off.

agreed. As long as a lock is held between ctid examination and row modification you are ok. didn't read the patch, just pointing that out (history is full of client side drivers that did not do this properly).

I also might have missed some of the finer contextual points of the discussion here: I was thinking that you were identifying rows on the client over fetch in transaction A to write back in transaction B. If that is the case, ctid-based identification to me is full of issues and probably best used to handle awkward cases like "there is no identified primary key". But sure, if you are holding locks you can base it on physical position.

with pkey row identification, yes, rows can change from under you, but now you can employ application tricks to guard for that just as you would with any optimistic locking strategy in SQL. This is an expected and well-understood problem (use transactions etc).

merlin
Re: [HACKERS] Writable foreign tables: how to identify rows
Merlin Moncure mmonc...@gmail.com writes: As long as a lock is held between ctid examination and row modification you are ok. didn't read the patch, just pointing that out (history is full of client side drivers that did not do this properly). I also might have missed some of the finer contextual points of the discussion here: I was thinking that you were identifying rows on the client over fetch in transaction A to write back in transaction B. If that is the case, ctid-based identification to me is full of issues

Absolutely --- you can't rely on ctid across transactions. postgres_fdw isn't doing that though, just using it to update or delete rows that it locked earlier in the same remote transaction.

regards, tom lane
Re: [HACKERS] Duplicate JSON Object Keys
On Mar 13, 2013, at 5:17 AM, Robert Haas robertmh...@gmail.com wrote: What I think is tricky here is that there's more than one way to conceptualize what the JSON data type really is. Is it a key-value store of sorts, or just a way to store text values that meet certain minimalist syntactic criteria? I had imagined it as the latter, in which case normalization isn't sensible. But if you think of it the first way, then normalization is not only sensible, but almost obligatory.

That makes a lot of sense. Given the restrictions I tend to prefer in my database data types, I had imagined it as the former. And since I'm using it now to store key/value pairs (killing off some awful EAV implementations in the process, BTW), I certainly think of it more formally as an object. But I can live with the other interpretation, as long as the differences are clearly understood and documented. Perhaps a note could be added to the docs explaining this difference, and what one can do to adapt for it. A normalizing function would certainly help.

Best, David
Re: [HACKERS] Duplicate JSON Object Keys
On 14/03/13 02:02, Andrew Dunstan wrote: On 03/13/2013 08:17 AM, Robert Haas wrote: On Fri, Mar 8, 2013 at 4:42 PM, Andrew Dunstan and...@dunslane.net wrote: So my order of preference for the options would be: 1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored 2. Throw an error when a JSON type has duplicate keys 3. Have the accessors find the last instance of a key and return that value 4. Let things remain as they are now. On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid. Why not? Because it's the wrong thing to do, or because it would be slower? What I think is tricky here is that there's more than one way to conceptualize what the JSON data type really is. Is it a key-value store of sorts, or just a way to store text values that meet certain minimalist syntactic criteria? I had imagined it as the latter, in which case normalization isn't sensible. But if you think of it the first way, then normalization is not only sensible, but almost obligatory. For example, we don't feel bad about this:

    rhaas=# select '1e1'::numeric;
     numeric
    ---------
          10
    (1 row)

I think Andrew and I had envisioned this as basically a text data type that enforces some syntax checking on its input, hence the current design. But I'm not sure that's the ONLY sensible design. I think we've moved on from this point, because a) other implementations allow duplicate keys, b) it's trivially easy to make Postgres generate such json, and c) there is some dispute about exactly what the spec mandates. I'll be posting a revised patch shortly that doesn't error out but simply uses the value for the later key lexically. cheers andrew

How about adding a new function with '_strict' added to the existing name, with an extra parameter 'coalesce' - or using other names, if considered more appropriate! That way slower, more stringent functionality can be added where required. This way, the existing function need not be changed. If coalesce = true, then the last duplicate is used; else an error is returned when the new key is a duplicate.

Cheers, Gavin
Re: [HACKERS] transforms
Peter Eisentraut pete...@gmx.net writes: At run time, this will sort itself out, because all the required modules will be loaded. The problem is when you create the glue extension and haven't invoked any code from any of the dependent extensions in the session. Abstractly, the possible solutions are either not to check the functions when the extension is created (possibly settable by a flag) or to somehow force a load of all dependent extensions when the new extension is created. (I say extension here even though dynamically loadable modules are attached to functions, which makes this even more confusing.)

I think CREATE EXTENSION should be LOADing all distinct modules referenced from all required extensions' functions. It does sound easy enough to code, and I can't imagine why we would want to not do that.

Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Enabling Checksums
Jeff,

However, I'm willing to be convinced to exclude temp tables again.

Those reasons sound persuasive. Let's leave them in for 9.3.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Duplicate JSON Object Keys
On 03/13/2013 12:51 PM, Gavin Flower wrote: On 14/03/13 02:02, Andrew Dunstan wrote: On 03/13/2013 08:17 AM, Robert Haas wrote: On Fri, Mar 8, 2013 at 4:42 PM, Andrew Dunstan and...@dunslane.net wrote: So my order of preference for the options would be: 1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored 2. Throw an error when a JSON type has duplicate keys 3. Have the accessors find the last instance of a key and return that value 4. Let things remain as they are now. On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid. Why not? Because it's the wrong thing to do, or because it would be slower? What I think is tricky here is that there's more than one way to conceptualize what the JSON data type really is. Is it a key-value store of sorts, or just a way to store text values that meet certain minimalist syntactic criteria? I had imagined it as the latter, in which case normalization isn't sensible. But if you think of it the first way, then normalization is not only sensible, but almost obligatory. For example, we don't feel bad about this:

    rhaas=# select '1e1'::numeric;
     numeric
    ---------
          10
    (1 row)

I think Andrew and I had envisioned this as basically a text data type that enforces some syntax checking on its input, hence the current design. But I'm not sure that's the ONLY sensible design. I think we've moved on from this point, because a) other implementations allow duplicate keys, b) it's trivially easy to make Postgres generate such json, and c) there is some dispute about exactly what the spec mandates. I'll be posting a revised patch shortly that doesn't error out but simply uses the value for the later key lexically. cheers andrew How about adding a new function with '_strict' added to the existing name, with an extra parameter 'coalesce' - or using other names, if considered more appropriate! That way slower, more stringent functionality can be added where required. This way, the existing function need not be changed. If coalesce = true, then the last duplicate is used; else an error is returned when the new key is a duplicate.

For good or ill, we now already have a json type that will accept strings with duplicate keys, and generator functions which can now generate such strings. If someone wants functions to enforce a stricter validity check (e.g. via a check constraint on a domain), or to convert json to a canonical version which strips out prior keys of the same name and their associated values, then these should be relatively simple to implement given the parser API in the current patch. But they aren't part of the current patch, and I think it's way too late to be adding such things.

I have been persuaded by arguments made upthread that the best thing to do is exactly what other well known json-accepting implementations do (e.g. V8), which is to accept json with duplicate keys and to treat the later key/value as overriding the former key/value. If I'd done that from the start nobody would now be talking about this at all.

cheers andrew
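[Editor's note: for instance, the stricter check Andrew alludes to could live in a domain. A minimal sketch, assuming a user-written helper — json_no_dup_keys is hypothetical, not part of the patch — and assuming json_each reports duplicate top-level keys as separate rows:]

    -- hypothetical helper: true when the top-level object has no duplicate keys
    CREATE FUNCTION json_no_dup_keys(j json) RETURNS boolean
    LANGUAGE sql IMMUTABLE AS $$
        SELECT count(*) = count(DISTINCT key) FROM json_each($1)
    $$;

    CREATE DOMAIN strict_json AS json CHECK (json_no_dup_keys(VALUE));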
Re: [HACKERS] Duplicate JSON Object Keys
On Mar 13, 2013, at 10:45 AM, Andrew Dunstan and...@dunslane.net wrote: If someone wants functions to enforce a stricter validity check (e.g. via a check constraint on a domain), or to convert json to a canonical version which strips out prior keys of the same name and their associated values, then these should be relatively simple to implement given the parser API in the current patch. But they aren't part of the current patch, and I think it's way too late to be adding such things. I think it might be good to get something like this into core eventually, otherwise I suspect that there will be a different version of it for every JSON-using project out there. And my first cut at it won’t descend into sub-objects. I have been persuaded by arguments made upthread that the best thing to do is exactly what other well known json-accepting implementations do (e.g. V8), which is to accept json with duplicate keys and to treat the later key/value as overriding the former key/value. If I'd done that from the start nobody would now be talking about this at all. That’s true, though I might have started thinking about a canonicalizing function before long. :-) Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Duplicate JSON Object Keys
On 03/13/2013 01:50 PM, David E. Wheeler wrote: On Mar 13, 2013, at 10:45 AM, Andrew Dunstan and...@dunslane.net wrote: If someone wants functions to enforce a stricter validity check (e.g. via a check constraint on a domain), or to convert json to a canonical version which strips out prior keys of the same name and their associated values, then these should be relatively simple to implement given the parser API in the current patch. But they aren't part of the current patch, and I think it's way too late to be adding such things. I think it might be good to get something like this into core eventually, otherwise I suspect that there will be a different version of it for every JSON-using project out there. And my first cut at it won’t descend into sub-objects. Then you wouldn't be doing it right. The whole thing about a recursive descent parser is that it's, well, recursive. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Duplicate JSON Object Keys
On Mar 13, 2013, at 10:59 AM, Andrew Dunstan and...@dunslane.net wrote: And my first cut at it won’t descend into sub-objects. Then you wouldn't be doing it right. The whole thing about a recursive descent parser is that it's, well, recursive. Right, but it would serve my immediate needs. I have a column that just stores key/value pairs with no nesting. So I know I can write something like this and have it be good enough:

create or replace function json_smash( json ) RETURNS JSON
language SQL STRICT IMMUTABLE AS $$
    SELECT format('{%s}', array_to_string(ARRAY(
        SELECT format('%s: %s', to_json(key), value)
        FROM (
            SELECT key, value,
                   row_number() OVER ( partition by key order by rnum desc ) AS rnum
            FROM (
                SELECT key, value,
                       row_number() OVER ( partition by key ) AS rnum
                FROM json_each($1)
            ) a
        ) b
        WHERE rnum = 1
    ), ','))::json;
$$;

And do you really want to see that unloosed on the world? :-P (Yes, I know there is no guarantee on the order of rows returned by json_each()). Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
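A hypothetical run of the sketch above (the key order in the output is not guaranteed, as David notes; the later duplicate wins):

select json_smash('{"a": 1, "b": 2, "a": 3}');
    json_smash
------------------
 {"b": 2, "a": 3}
(1 row)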
Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)
Albe Laurenz laurenz.a...@wien.gv.at writes: Yeah, I'm drifting towards the position that we should just define the defaults as being whatever they are locally, rather than trying to be cute about supporting remotely-executed defaults. That was my first thought on the topic, to have a solution that is simple (if not perfect). Your argument that it would be unpleasant to lose the ability to use sequence-generated remote default values made me reconsider. But there is a workaround, namely to use a trigger before insert to generate an automatic primary key (e.g. if the inserted value is NULL). Another attack is to set up a different foreign table pointing to the same remote table, but lacking the sequence column. When you insert via that table, you'll get the remote's default for the hidden column. This doesn't require any weird triggers on the remote side, but it could be hard to persuade existing apps to use the second foreign table. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
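For illustration, a minimal sketch of the before-insert trigger workaround on the remote side; the table, column, and sequence names here are hypothetical:

-- on the remote server (names hypothetical)
create function remote_tab_fill_id() returns trigger language plpgsql as $$
begin
    -- only generate a key when the inserting side sent NULL
    if new.id is null then
        new.id := nextval('remote_tab_id_seq');
    end if;
    return new;
end;
$$;

create trigger remote_tab_fill_id
    before insert on remote_tab
    for each row execute procedure remote_tab_fill_id();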
Re: [HACKERS] Statistics and selectivity estimation for ranges
On 01.03.2013 16:22, Alexander Korotkov wrote: On Tue, Mar 12, 2013 at 8:03 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So, after some hacking, I ended up with this version. I find it more readable, I hope I didn't miss anything. This seems to produce results that are close, but not identical, to the original patch. I'm not sure where the discrepancy is coming from, or which patch is more correct in that respect. I'll continue from this tomorrow, but if you have the time, please take a look and let me know what you think. I've read your explanation and version of the patch. In general it seems correct to me. There is one point where I have broken up the abstraction in some functions: infinities. For example, in calc_length_hist_frac one or both of length1 and length2 can be infinity. In the line frac = area / (length2 - length1); you can get a NaN result. I've especially adjusted the code to get a more or less correct result in this case. Hmm, good point. I think I managed to fix those cases in the attached version. Is there any other corner case that I missed? Another minor note about this line bin_width *= get_position(typcache, lower, hist_lower[i], hist_lower[i + 1]); ITSM it should look like bin_width -= 1.0 - get_position(typcache, lower, hist_lower[i], hist_lower[i + 1]); Imagine lower and upper bounds fall into the same histogram bin. In this case we should subtract the lengths of both parts which were cut on the left and on the right. Yes, true. There's one negation too many above, though; should be: bin_width -= get_position(typcache, lower, hist_lower[i], hist_lower[i + 1]); Fixed that. Barring any more issues, I'll read through this once more tomorrow and commit. - Heikki

[Attachment: context diff against src/backend/utils/adt/rangetypes_selfuncs.c, adding a range-length histogram (STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM) and the calc_hist_selectivity_contained/calc_hist_selectivity_contains machinery; the patch text is truncated in the archive.]
Re: [HACKERS] matview join view error
Erikjan Rijkers e...@xs4all.nl wrote: With 9.3devel, I can't seem to join a matview to a view; surely that should be allowed? Yes. Here is an example: I expect to add a regression test based on that, once the bug is fixed. Thanks! -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized View patch broke pg_dump
Andrew Dunstan and...@dunslane.net wrote: On 03/11/2013 12:30 PM, Fujii Masao wrote: Andrew Dunstan and...@dunslane.net wrote: On 03/11/2013 10:43 AM, Andrew Dunstan wrote: I noticed this morning that I am still getting failures on 9.0, 9.1 and 9.2 which cause my cross-version upgrade testing to fail for git tip. For all I know this might apply to all back branches, but these are the only ones tested for upgrade, so that's all I can report on reliably. I'm chasing it up to find out exactly what's going on, but figured some extra eyeballs would help. The problem is that pg_dump is sending an empty query in versions less than 9.3, and choking on that. Suggested fix attached - there's really no reason to be doing anything re mat views in versions < 9.3. This is the same problem that I reported in another thread. http://www.postgresql.org/message-id/CAHGQGwH+4vtyq==l6hrupxtggfqrnlf0mwj75bfisoske28...@mail.gmail.com Oh, I missed that. Yes, either of these would work. I've been out of town and not able to keep up here for a few days. Will push a fix today unless I find that someone has beaten me to it as I work through the rest of the messages. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump selectively ignores extension configuration tables
On Fri, Mar 8, 2013 at 5:07 PM, Joe Conway m...@joeconway.com wrote: (reposting; apparently I used a verboten word the first time around, sigh. Sorry for any duplicates.) The -t and -n options of pg_dump do not dump anything from an extension configuration table, whereas normal pg_dump will dump the user data. To see what I mean, in psql do (tested on pg9.2):

8<----------
create extension postgis;
insert into spatial_ref_sys values(42,'test',42,'foo','bar');
8<----------

Then in bash do:

8<----------
pg_dump test|grep spatial_ref_sys
pg_dump -t spatial_ref_sys test|grep spatial_ref_sys
pg_dump -n public test|grep spatial_ref_sys
8<----------

Is this intentional, or oversight, or missing feature? Hmm. It doesn't seem right to me. It seems like it should either dump everything, or dump just the user data portion, when the name matches. Not entirely sure which - probably the latter? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump selectively ignores extension configuration tables
On 03/13/2013 03:07 PM, Robert Haas wrote: Is this intentional, or oversight, or missing feature? Hmm. It doesn't seem right to me. It seems like it should either dump everything, or dump just the user data portion, when the name matches. Not entirely sure which - probably the latter? +1 I think it should dump the user data portion, especially since that matches what pg_dump would do if you did not specify the table or schema. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)
On 13 March 2013 19:04, Tom Lane t...@sss.pgh.pa.us wrote: Albe Laurenz laurenz.a...@wien.gv.at writes: Yeah, I'm drifting towards the position that we should just define the defaults as being whatever they are locally, rather than trying to be cute about supporting remotely-executed defaults. That was my first thought on the topic, to have a solution that is simple (if not perfect). Your argument that it would be unpleasant to lose the ability to use sequence-generated remote default values made me reconsider. But there is a workaround, namely to use a trigger before insert to generate an automatic primary key (e.g. if the inserted value is NULL). Another attack is to set up a different foreign table pointing to the same remote table, but lacking the sequence column. When you insert via that table, you'll get the remote's default for the hidden column. This doesn't require any weird triggers on the remote side, but it could be hard to persuade existing apps to use the second foreign table. How about: CREATE FOREIGN TABLE tablename (id int DEFAULT PASSTHROUGH) SERVER pg_server; That way it will pass DEFAULT through to the remote table as it's defined on the table. Users can then explicitly insert values, or select the default, which will be configured to ensure the default on the remote server is used... although I suspect I'm overlooking something here. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]
On 2013-03-13 13:45, Andres Freund wrote: On 2013-03-13 09:09:24 +0100, Boszormenyi Zoltan wrote: On 2013-03-13 07:42, Craig Ringer wrote: On 03/12/2013 06:27 AM, Craig Ringer wrote: Think also about the case where someone wants to change multiple values together and having just some set and not others would be inconsistent. Yeah, that's a killer. The reload would need to be scheduled for COMMIT time, it can't be done by `SET PERSISTENT` directly. Thinking about this some more, I'm not sure this is a good idea. Right now, SET takes effect immediately. Always, without exception. Delaying SET PERSISTENT's effects until commit would make it inconsistent with SET's normal behaviour. However, *not* delaying it would make it another quirky not-transactional not-atomic command. That's OK, but if it's not going to follow transactional semantics it should not be allowed to run within a transaction, like VACUUM. Writing the changes immediately but deferring the reload until commit seems to be the worst of those two worlds. I was thinking about it a little. There is a hook that runs at the end of (sub-)transactions. It can be abused for this purpose to make SET PERSISTENT transactional. The subtransactions can also stack these settings, by forgetting settings upon ROLLBACK [ TO SAVEPOINT ] and overwriting previous settings upon COMMIT and RELEASE SAVEPOINT. All it needs is a list and maintaining intermediate pointers when entering into a new level of SAVEPOINT. The functions to register such hooks are in src/include/access/xact.h:

extern void RegisterXactCallback(XactCallback callback, void *arg);
extern void UnregisterXactCallback(XactCallback callback, void *arg);
extern void RegisterSubXactCallback(SubXactCallback callback, void *arg);
extern void UnregisterSubXactCallback(SubXactCallback callback, void *arg);

(sub)xact commit/abort already calls AtEOXact_GUC(commit, nestlevel). So you wouldn't even need that. Yes, thanks. It seems we could add another value to enum GucStackState, like GUC_SET_PERSISTENT - and process those only if commit nestlevel == 1. Maybe it's not needed; only enum GucAction needs a new GUC_ACTION_PERSISTENT value, since that's what has any business in push_old_value(). Adding two new members to GucStack like these is enough:

bool has_persistent;
config_var_value persistent;

and SET PERSISTENT can be treated as GUC_SET. push_old_value() can merge GUC values set in the same transaction level. Every time you see one with commit nestlevel > 1 you put them into the stack one level up. This seems like it's somewhat in line with the way SET LOCAL is implemented? On the other hand, it would make a lot of sense to implement it as one setting per file or extending the code to allow modifying settings in bulk. The one setting per file should be easier and it would also allow extensions to drop their settings automatically into the automatic config directory. I don't know who mentioned this idea about extensions but it also came up a while back. I still think one-setting-per-file is the right way to go, yes. Greetings, Andres Freund -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
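For reference, SET LOCAL already ties a setting's lifetime to the transaction, which is roughly the stacking behavior being sketched above for SET PERSISTENT; a minimal illustration:

begin;
set local work_mem = '64MB';
show work_mem;   -- 64MB until this transaction ends
rollback;
show work_mem;   -- back to the previous value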
Re: [HACKERS] Statistics and selectivity estimation for ranges
On Wed, Mar 13, 2013 at 11:10 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01.03.2013 16:22, Alexander Korotkov wrote: On Tue, Mar 12, 2013 at 8:03 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So, after some hacking, I ended up with this version. I find it more readable, I hope I didn't miss anything. This seems to produce results that are close, but not identical, to the original patch. I'm not sure where the discrepancy is coming from, or which patch is more correct in that respect. I'll continue from this tomorrow, but if you have the time, please take a look and let me know what you think. I've read your explanation and version of the patch. In general it seems correct to me. There is one point where I have broken up the abstraction in some functions: infinities. For example, in calc_length_hist_frac one or both of length1 and length2 can be infinity. In the line frac = area / (length2 - length1); you can get a NaN result. I've especially adjusted the code to get a more or less correct result in this case. Hmm, good point. I think I managed to fix those cases in the attached version. Is there any other corner case that I missed? Did you try the test case by Jeff Davis in this thread? http://www.postgresql.org/message-id/1355167304.3896.37.camel@jdavis I tried it with the attached version of the patch and got a NaN estimate. -- With best regards, Alexander Korotkov.
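To make the failure mode concrete: when both lengths are infinite, the divisor in frac = area / (length2 - length1) is infinity minus infinity, which is NaN under IEEE float semantics, and the NaN then propagates into the estimate. A quick SQL illustration:

select float8 'infinity' - float8 'infinity' as divisor,
       float8 'infinity' / (float8 'infinity' - float8 'infinity') as frac;
 divisor | frac
---------+------
     NaN |  NaN
(1 row)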
Re: [HACKERS] Display output file name in psql prompt?
Josh Berkus j...@agliodbs.com writes: Tom, If you're proposing changing the contents of the default prompt, I think that has very little chance of passing. A new option for something to add into a custom prompt might get accepted. I'm not sure that that approach would do much for the scenario you describe, since it's unlikely that anyone would add it to their prompt (at least not before they'd gotten burnt...). However, I don't recall hearing of anyone getting confused like this before, and you say you figured it out pretty quickly, so it doesn't seem like a big problem. Well, I think having a \ command to show the current \o setting would be fine. Actually, it might be better to show *all* psql settings with a general command, including things like \timing and \pset. Actually in my case, I've also used the \t and \a psql commands to produce undecorated machine-readable output, and now I think it would be nice to also show the status of these in the prompt. What if we add a '%\' substitute that expands to the list of all settings with non-default values? Like this:

postgres=# \set PROMPT1 '%/%\%R%# '
postgres=# -- Nothing is set, so '%\' expands to an empty string.
postgres=# \t
Showing only tuples.
postgres\t=# \a
Output format is unaligned.
postgres\a\t=# \o /path/to/output/file.out
postgres\a\o=/path/to/output/file.out\t=#

Not sure how \pset could fit there; it might be a bit excessive. Another idea is we could add \O and \G commands that will act like the plain \o and \g, but will set \a and \t automatically. I mean, it must be quite often that you don't need all the decoration when you save query results to a file, so instead of doing \a, \t, \g (then setting \a and \t back) you can just do \G and move on. -- Alex -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
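For context, the sequence of toggles that a combined \G would collapse looks something like this (the file name and query are hypothetical):

\a
\t
\o /tmp/result.out
select * from pg_stat_activity;
\o
\a
\t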
Re: [HACKERS] Duplicate JSON Object Keys
On 03/13/2013 12:40 PM, David E. Wheeler wrote: On Mar 13, 2013, at 5:17 AM, Robert Haas robertmh...@gmail.com wrote: What I think is tricky here is that there's more than one way to conceptualize what the JSON data type really is. Is it a key-value store of sorts, or just a way to store text values that meet certain minimalist syntactic criteria? I had imagined it as the latter, in which case normalization isn't sensible. But if you think of it the first way, then normalization is not only sensible, but almost obligatory. That makes a lot of sense. Given the restrictions I tend to prefer in my database data types, I had imagined it as the former. And since I'm using it now to store key/value pairs (killing off some awful EAV implementations in the process, BTW), I certainly think of it more formally as an object. But I can live with the other interpretation, as long as the differences are clearly understood and documented. Perhaps a note could be added to the docs explaining this difference, and what one can do to adapt for it. A normalizing function would certainly help. I guess the easiest and most generic way to normalize is to actually convert to some internal representation and back. In pl/python this would look like this:

hannu=# create function normalize(IN ij json, OUT oj json)
        language plpythonu as $$
import json
return json.dumps(json.loads(ij))
$$;
CREATE FUNCTION
hannu=# select normalize('{"a":1, "a":"b", "a":true}');
  normalize
-------------
 {"a": true}
(1 row)

If we wanted to be really fancy we could start storing our json in some format which is faster to parse, like tnetstrings, but probably it is too late in the release cycle to change this now. Hannu Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump selectively ignores extension configuration tables
Joe Conway m...@joeconway.com writes: I think it should dump the user data portion, especially since that matches what pg_dump would do if you did not specify the table or schema. +1 If you don't have time slots to fix that by then, I will have a look at fixing that while in beta. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] matview patch readability/correctness gripe
Tom Lane t...@sss.pgh.pa.us wrote:

diff --git a/src/backend/rewrite/rewriteDefine.c b/src/backend/rewrite/rewriteDefine.c
index a1a9808e5d94959218b415ed34c46579c478c177..896326615753f2344b466eb180080174ddeda31d 100644
*** a/src/backend/rewrite/rewriteDefine.c
--- b/src/backend/rewrite/rewriteDefine.c
*** DefineQueryRewrite(char *rulename,
*** 356,362 ****
  	 */
  	checkRuleResultList(query->targetList,
  	                    RelationGetDescr(event_relation),
! 	                    true);
  
  	/*
  	 * ... there must not be another ON SELECT rule already ...
--- 357,364 ----
  	 */
  	checkRuleResultList(query->targetList,
  	                    RelationGetDescr(event_relation),
! 	                    event_relation->rd_rel->relkind !=
! 	                    RELKIND_MATVIEW);
  
  	/*
  	 * ... there must not be another ON SELECT rule already ...

IMO this is either flat-out wrong, or an unmaintainable crock, or both. So this either needs to be reverted, or refactored into two arguments not one, with checkRuleResultList being updated to account honestly and readably for whatever it's supposed to be doing here. Will review in light of your comments. Thanks! -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Enabling Checksums
On 3/7/13 9:31 PM, Bruce Momjian wrote:

1 storage
2 storage controller
3 file system
4 RAM
5 CPU

I would add 2.5 in there: storage interconnect. iSCSI, FC, what-have-you. Obviously not everyone has that. My guess is that storage checksums only cover layer 1, while our patch covers layers 1-3, and probably not 4-5 because we only compute the checksum on write. Actually, it depends. In our case, we run 512GB servers and 8GB shared buffers (previous testing has shown that anything much bigger than 8G hurts performance). So in our case, PG checksums protect a very significant portion of #4. If that is correct, the open question is what percentage of corruption happens in layers 1-3? The last bout of corruption we had was entirely coincident with memory failures. IIRC we had 3-4 corruption events on more than one server. Everything was running standard ECC (sadly, not 4-bit ECC). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using indexes for partial index builds
On 3/12/13 9:10 AM, Ants Aasma wrote: I have a feeling this is an increasingly widespread pattern with a proliferation of mobile devices that need syncing. If you're doing that with timestamps you're asking for a slew of problems, not all of which can be solved by just adding some random amount of fluff to your criteria. A queue-based solution is often a more robust solution, even if it is harder to implement. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
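A minimal sketch of the queue-based alternative, with hypothetical names: one queue row per consumer and change, and each device drains its own rows atomically, so correctness never depends on a clock:

create table sync_queue (
    id       bigserial primary key,
    consumer text not null,
    payload  text not null
);

-- a writer (e.g. a trigger on the synced table) enqueues one row per consumer;
-- the "devices" table here is hypothetical
insert into sync_queue (consumer, payload)
select d.name, 'row 42 changed' from devices d;

-- a device drains its queue in a single atomic statement
with drained as (
    delete from sync_queue
    where consumer = 'device42'
    returning id, payload
)
select * from drained order by id;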
Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)
Thom Brown t...@linux.com writes: How about: CREATE FOREIGN TABLE tablename (id int DEFAULT PASSTHROUGH) SERVER pg_server; That way it will pass DEFAULT through to the remote table as it's defined on the table. Users can then explicitly insert values, or select the default, which will configured to ensure the default on the remote server is used... although I suspect I'm overlooking something here. Yeah ... how to implement it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Temporal features in PostgreSQL
On 2/13/13 10:06 PM, Vlad Arkhipov wrote: - I don't need to deal with update conflicts, because I use clock_timestamp() instead of current_timestamp. You can still come across a conflict even with clock_timestamp(). What if clocks go back during time synchronization? Even if you have absolutely precise clocks, there may be clock skew on different CPUs, low system clock time resolution, etc. Sorry for the late reply, catching up on email... If you want to track the history of something, measured time is absolutely NOT the way to do it. I use the term measured time to differentiate from the real-world concept of time that is forever flowing forward from one instant to the next. The problem with measured time is that it's incredibly easy to screw up. Clocks out of sync, clocks running backwards, etc. Heck, it's not even clear what time you should actually use: transaction start, wallclock, or transaction end. For any kind of history tracking to actually be robust you have no choice but to link one history record to another so that you can actually walk down a chain. Of course you might want to capture a timestamp as part of your history metadata, but you'd better be ready to deal with a history record with a timestamp that is *earlier* than the prior history record. BTW, we've been working on a generic history implementation at work; hopefully we'll be able to release it relatively soon. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
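A minimal sketch of the chained approach, with hypothetical names; the prev_id link, not the timestamp, defines the order:

create table item_history (
    hist_id  bigserial primary key,
    item_id  integer not null,
    prev_id  bigint references item_history (hist_id),
    data     text,
    recorded timestamptz default clock_timestamp()  -- metadata only, never used for ordering
);

-- walk the chain backwards from a current record (hist_id = 42 is hypothetical)
with recursive chain as (
    select * from item_history where hist_id = 42
    union all
    select h.* from item_history h join chain c on h.hist_id = c.prev_id
)
select * from chain;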
Re: [HACKERS] Using indexes for partial index builds
On Thu, Mar 14, 2013 at 1:51 AM, Jim Nasby j...@nasby.net wrote: On 3/12/13 9:10 AM, Ants Aasma wrote: I have a feeling this is an increasingly widespread pattern with a proliferation of mobile devices that need syncing. If you're doing that with timestamps you're asking for a slew of problems, not all of which can be solved by just adding some random amount of fluff to your criteria. A queue-based solution is often a more robust solution, even if it is harder to implement. Do you know of anything else besides the obvious issues with having to use one clocksource and ensure that it produces monotonic timestamps? My first reaction was also that this is what queues are meant for, but the proposed solution seems to work surprisingly well. Unless you can point at some glaring hole that I'm missing I would say that it is good enough for a rather wide range of syncing problems. Ants Aasma -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Increasing code-coverage of 'make check'
Thanks Alvaro! The thought of psql_help was purely because it was the easiest at that time. Since I've just begun, my understanding of the code is negligible. I began working on SEQUENCE-related tests thereafter and hopefully will move to more complicated tests in time. Peter's link is obviously helpful, but since I end up doing make check ~100 times a day, for now it's useful only to cross-check how much code is uncommitted :) Robins On 11 March 2013 09:16, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I think increasing coverage is a good thing. But psql help? *shrug* Backend code is far more interesting and useful. Another thing to keep in mind is that there are some corner cases that are interesting to test that might not necessarily show up in a coverage chart -- for example how stuff behaves in the face of concurrent processes, or when various counters wrap around. Peter Eisentraut has set up a Jenkins instance that publishes coverage info. http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/ (I think he only has it running make check; doing the isolation tests probably raises percentages a bit).
[HACKERS] TupleTable like data structure
I'm writing my own /Group by/ operator (non-hashed implementation) and I'm currently looking for a good data structure to store the result groups. There are two requirements in my mind: the table needs to be efficiently expandable, and the tuples within each group need to be sorted. I carefully examined the implementation of the standard /Group by/ (both hashed and sorted implementations); the closest I got is to leverage parts of the /TupleHashTable/, but before going through with this plan, I need to make sure I'm not overlooking an existing data structure. Thanks -- View this message in context: http://postgresql.1045698.n5.nabble.com/TupleTable-like-data-structure-tp5748432.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized View patch broke pg_dump
Kevin Grittner kgri...@ymail.com wrote: Andrew Dunstan and...@dunslane.net wrote: On 03/11/2013 12:30 PM, Fujii Masao wrote: Andrew Dunstan and...@dunslane.net wrote: The problem is that pg_dump is sending an empty query in versions less than 9.3, and choking on that. Suggested fix attached - there's really no reason to be doing anything re mat views in versions 9.3. This is the same problem that I reported in another thread. http://www.postgresql.org/message-id/CAHGQGwH+4vtyq==l6hrupxtggfqrnlf0mwj75bfisoske28...@mail.gmail.com Oh, I missed that. Yes, either of these would work. Will push a fix today unless I find that someone has beaten me to it as I work through the rest of the messages. I agree that either would work. I preferred Andrew's patch because it kept knowledge of this issue more localized. Pushed. Thanks to you both, and apologies for the error. Next time I do anything with pg_dump I will know better what constitutes decent testing. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] is it bug? - printing boolean domains in sql/xml function
On Mon, 2013-03-04 at 08:35 +0100, Pavel Stehule wrote: in this use case I am think so some regression test is important - It should not be mine, but missing more explicit regression test is reason, why this bug was not detected some years. I've added the tests. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers