Re: [HACKERS] Triggers on foreign tables
On Wednesday, March 5, 2014, at 22:36:44, Noah Misch wrote:

Agreed. More specifically, I see only two scenarios for retrieving tuples from the tuplestore. Scenario one is that we need the next tuple (or pair of tuples, depending on the TriggerEvent). Scenario two is that we need the tuple(s) most recently retrieved. If that's correct, I'm inclined to rearrange afterTriggerInvokeEvents() and AfterTriggerExecute() to remember the tuple or pair of tuples most recently retrieved. They'll then never call tuplestore_advance() just to reposition. Do you see a problem with that?

I don't see any problem with that. I don't know how this would be implemented, but it would make sense to avoid those scans, as long as a fresh copy is passed to the trigger: modifications to a tuple performed in an after trigger should not be visible to the next one.

I was again somewhat tempted to remove ate_tupleindex, perhaps by defining the four flag bits this way:

#define AFTER_TRIGGER_DONE          0x1000
#define AFTER_TRIGGER_IN_PROGRESS   0x2000
/* two bits describing the size of and tuple sources for this event */
#define AFTER_TRIGGER_TUP_BITS      0xC000
#define AFTER_TRIGGER_FDW_REUSE     0x0000
#define AFTER_TRIGGER_FDW_FETCH     0x4000
#define AFTER_TRIGGER_1CTID         0x8000
#define AFTER_TRIGGER_2CTID         0xC000

AFTER_TRIGGER_FDW_FETCH and AFTER_TRIGGER_FDW_REUSE correspond to the aforementioned scenarios one and two, respectively. I think, though, I'll rate this as needless micro-optimization and not bother; opinions welcome. (The savings is four bytes per foreign table trigger event.)

I was already happy with having a lower footprint for foreign table trigger events than for regular trigger events, but if we remove the need for seeking in the tuplestore entirely, it would make sense to get rid of the index.

Thanks, nm

Thanks to you. -- Ronan Dunklau http://dalibo.com - http://dalibo.org
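The two-bit scheme sketched above packs four mutually exclusive tuple-source states under a single mask. A quick illustrative decode in Python (constant values taken from the quoted proposal; the helper function is a hypothetical illustration, not PostgreSQL code):

```python
# Constants from the proposal quoted above; tuple_source() is a
# hypothetical illustration of how the AFTER_TRIGGER_TUP_BITS mask
# selects one of four tuple-source states.
AFTER_TRIGGER_DONE        = 0x1000
AFTER_TRIGGER_IN_PROGRESS = 0x2000
AFTER_TRIGGER_TUP_BITS    = 0xC000  # two-bit mask covering the source states
AFTER_TRIGGER_FDW_REUSE   = 0x0000  # reuse the tuple(s) most recently retrieved
AFTER_TRIGGER_FDW_FETCH   = 0x4000  # fetch the next tuple(s) from the tuplestore
AFTER_TRIGGER_1CTID       = 0x8000  # event stores one heap ctid
AFTER_TRIGGER_2CTID       = 0xC000  # event stores two heap ctids

def tuple_source(flags):
    """Return which of the four source states an event's flags encode."""
    return flags & AFTER_TRIGGER_TUP_BITS
```

Because FDW_REUSE is the all-zero state, the space savings mentioned above would come from dropping the separate ate_tupleindex field, not from the bits themselves.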
Re: [HACKERS] Row-security on updatable s.b. views
On 06/03/14 02:56, Craig Ringer wrote: On 03/06/2014 04:56 AM, Yeb Havinga wrote: If you state it like that, it sounds like a POLA violation. But the complete story is: a user is allowed to UPDATE a set of rows from a table that is not a subset of the set of rows he can SELECT from the table; in other words, he can UPDATE rows he is not allowed to SELECT. Is there a compelling use case for this? Where it really makes sense to let users update/delete rows they cannot see via row security? We support it in the table based permissions model, but it's possible to do it with much saner semantics there. And with row security, it'll be possible with security definer functions. I intend to add a row security exempt flag for functions down the track, too. Use case: https://en.wikipedia.org/wiki/Bell-La_Padula_model - being able to write up and read down access levels. So for instance in healthcare, a data enterer may enter sensitive data from handwritten notes (such as the subject having AIDS) into the electronic health record, without having general read access at the sensitivity level of an AIDS diagnosis. I think what is important in use cases like this is that at data entry time the actual data is still at the desk, so having data returned for inserts in the running transaction might not be problematic. As most EHRs today are additive in nature, future additions about the AIDS condition would be inserts as well; no updates required. For updates, my best guess would be that allowing the command to run with rls permissions different from the select is not required. regards, Yeb -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Thank you for checking that. Teodor's goal was that new-hstore be 100% backwards-compatible with old-hstore. If we're breaking APIs, then it

That's true. The binary format is fully compatible unless an old hstore value has more than 2^28 key-value pairs (256 million, which is far from reachable given memory requirements). The single issue is the GiST index; a GIN index should be recreated to utilize the new features.

doesn't really work to force users to upgrade the type, no? Teodor, are these output changes things that can be made consistent, or do we need separate hstore and hstore2 datatypes?

Introducing types in hstore causes this incompatibility - but I don't think that's a huge or even a big problem. In most cases the application does quoting (sets "1" instead of just 1) to prevent SQL injection and to protect hstore-forbidden characters in hstore. Keys are left untouched - a key can only be a string. It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches.

-- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
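The quoting point above can be illustrated with a toy serializer (a hedged Python sketch, not the actual hstore I/O code): quoting every value, e.g. "1" instead of 1, keeps hstore-reserved characters such as =>, commas, and quotes from breaking the literal.

```python
# Toy illustration (not PostgreSQL code) of why applications quote
# hstore values defensively: the key=>value syntax reserves =>, commas,
# spaces and quotes, so always emitting "1" instead of 1 is safe.
def hstore_literal(pairs):
    def quote(s):
        # escape backslashes and double quotes, then wrap in quotes
        return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'
    return ', '.join(quote(k) + '=>' + quote(v) for k, v in pairs)
```

An application that already quotes everything this way is unaffected by an i/o change that only alters how unquoted scalars are printed, which is the point being made above.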
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. -- Peter Geoghegan
Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)
On Tue, Mar 4, 2014 at 3:07 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

4.
+ cchunk = ccache_vacuum_tuple(ccache, ccache->root_chunk, ctid);
+ if (pchunk != NULL && pchunk != cchunk)
+     ccache_merge_chunk(ccache, pchunk);
+ pchunk = cchunk;

The merge_chunk is called only when the heap tuples are spread across two cache chunks. Actually one cache chunk can accommodate one or more heap pages, so it needs some other way of handling. I adjusted the logic to merge the chunks as follows: Once a tuple is vacuumed from a chunk, it also checks whether it can be merged with its child leafs. A chunk has up to two child leafs; the left one has a smaller ctid than the parent, and the right one has a greater ctid. It means a chunk without a right child in the left sub-tree, or a chunk without a left child in the right sub-tree, is a neighbor of the chunk being vacuumed. In addition, if the vacuumed chunk does not have either (or both) of its children, it can be merged with the parent node. I modified ccache_vacuum_tuple() to merge chunks during the t-tree walk-down, if the vacuumed chunk has enough free space. Patch looks good. Regarding merging of the nodes, instead of checking whether a merge is possible or not for every tuple which is vacuumed, can we put in some kind of threshold: whenever the node is 50% free, try to merge it from the leaf nodes until it is 90% full? The remaining 10% will be left for the next inserts on the node. What do you say? I will update you later regarding the performance test results. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] jsonb and nested hstore
In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. It's true for Perl. The syntax of hstore is close to its hash/array syntax and it's easy to serialize/deserialize hstore to/from Perl. The syntax of hstore was inspired by Perl. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 5 March 2014 09:28, Simon Riggs si...@2ndquadrant.com wrote: So that returns us to solving the catalog consistency problem in pg_dump and similar applications. No answer, guys, and time is ticking away here. I'd like to get a communal solution to this rather than just punting the whole patch. If we have to strip it down to the bare essentials, so be it. For me, the biggest need here is to make VALIDATE CONSTRAINT take only a ShareUpdateExclusiveLock while it runs. Almost everything else needs a full AccessExclusiveLock anyway, or doesn't run for very long so isn't a critical problem. (Perhaps we can then wrap ADD CONSTRAINT ... NOT VALID and VALIDATE into a single command using the CONCURRENTLY keyword so it runs two transactions to complete the task.) Validating FKs on big tables can take hours and it really isn't acceptable for us to lock out access while we do that. FKs are *supposed* to be a major reason people use an RDBMS, so keeping them in a state where they are effectively unusable is a major debilitating point against adoption of PostgreSQL. If there are issues with pg_dump we can just document them. Guide me with your thoughts. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 1:32 AM, Teodor Sigaev teo...@sigaev.ru wrote: It's true for Perl. The syntax of hstore is close to its hash/array syntax and it's easy to serialize/deserialize hstore to/from Perl. The syntax of hstore was inspired by Perl. I understand that. There is a module on CPAN called Pg::hstore that will do this; it appears to have been around since 2011. I don't use Perl, so I don't know a lot about it. Perhaps David Wheeler has an opinion on the value of Perl-like syntax, as a long-time Perl enthusiast? In any case, Perl has excellent support for JSON, just like every other language - you are at no particular advantage in Perl by having a format that happens to more closely resemble the format of Perl hashes and arrays. I really feel that we should concentrate our efforts on one standardized format here. It makes the effort of integrating your good work in a way that makes it available to everyone so much easier. -- Peter Geoghegan
Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)
2014-03-06 18:17 GMT+09:00 Haribabu Kommi kommi.harib...@gmail.com: On Tue, Mar 4, 2014 at 3:07 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

4.
+ cchunk = ccache_vacuum_tuple(ccache, ccache->root_chunk, ctid);
+ if (pchunk != NULL && pchunk != cchunk)
+     ccache_merge_chunk(ccache, pchunk);
+ pchunk = cchunk;

The merge_chunk is called only when the heap tuples are spread across two cache chunks. Actually one cache chunk can accommodate one or more heap pages, so it needs some other way of handling. I adjusted the logic to merge the chunks as follows: Once a tuple is vacuumed from a chunk, it also checks whether it can be merged with its child leafs. A chunk has up to two child leafs; the left one has a smaller ctid than the parent, and the right one has a greater ctid. It means a chunk without a right child in the left sub-tree, or a chunk without a left child in the right sub-tree, is a neighbor of the chunk being vacuumed. In addition, if the vacuumed chunk does not have either (or both) of its children, it can be merged with the parent node. I modified ccache_vacuum_tuple() to merge chunks during the t-tree walk-down, if the vacuumed chunk has enough free space. Patch looks good. Thanks for volunteering. Regarding merging of the nodes, instead of checking whether a merge is possible or not for every tuple which is vacuumed, can we put in some kind of threshold: whenever the node is 50% free, try to merge it from the leaf nodes until it is 90% full? The remaining 10% will be left for the next inserts on the node. What do you say? Hmm. Indeed, it makes sense. How about an idea that kicks off chunk merging if the expected free space of the merged chunk is less than 50%? If the threshold depends on the (expected) usage of the merged chunk, it can avoid over-merging. I will update you later regarding the performance test results. Thanks. Also, I'll rebase the patch on top of the new custom-scan interfaces according to Tom's suggestion, even though the main logic of cache_scan is not changed.
Best regards, -- KaiGai Kohei kai...@kaigai.gr.jp
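The threshold under discussion (kick off a merge only when the merged chunk's expected free space is under 50%) can be sketched as a simple predicate. Everything here - the capacity argument, the helper name - is an illustrative assumption, not the cache_scan patch itself:

```python
# Illustrative sketch of the merge heuristic from the thread: merge two
# T-tree chunks only when the merged chunk would be at least half full,
# i.e. its expected free space is less than 50%. Names are assumptions.
MERGE_FILL_THRESHOLD = 0.50

def should_merge(used_a, used_b, capacity):
    """Return True if merging two chunks (with used_a and used_b tuples)
    into one chunk of the given capacity is worthwhile."""
    total = used_a + used_b
    if total > capacity:
        return False  # the combined tuples would not fit in one chunk
    return total / capacity >= MERGE_FILL_THRESHOLD
```

Evaluating a predicate like this once per vacuumed chunk, rather than attempting a merge for every vacuumed tuple, is exactly the over-merging concern raised above.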
[HACKERS] CREATE TYPE similar CHAR type
I want to use CREATE TYPE to create a type similar to char. When I create the type, I want it to behave like char: CREATE TABLE test (oneChar char); When I insert a value with length > 1 into that column, I get this error: ERROR: value too long for type character(1) I want my type to behave like this, but it behaves like the varchar type instead.
Re: [HACKERS] CREATE TYPE similar CHAR type
On 6 March 2014 11:24, Mohsen SM mohsensoodk...@gmail.com wrote: I want to use CREATE TYPE to create a type similar to char. When I create the type, I want it to behave like char: CREATE TABLE test (oneChar char); When I insert a value with length > 1 into that column, I get this error: ERROR: value too long for type character(1) I want my type to behave like this, but it behaves like the varchar type instead. If you did that, you'd have a char field padded out to fill up 10 megabytes-worth of characters. I doubt that's what you want. It's not clear what you want this for, though. If you want an arbitrary number of characters, just use the text data type. -- Thom
[HACKERS] Next CommitFest Deadlines
Hi all, Is there some place with the next commitfest deadlines (2014/06 and 2014/09)? Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] GSoC on WAL-logging hash indexes
(de-CC'ing pgsql-advocacy) On 03/06/2014 04:03 AM, Greg Stark wrote: On Mon, Mar 3, 2014 at 4:12 PM, Robert Haas robertmh...@gmail.com wrote: Unfortunately, I don't believe that it's possible to do this easily today because of the way bucket splits are handled. I wrote about this previously here, with an idea for solving the problem: We could just tackle this in the same incomplete, buggy way that btrees tackled it for years until Heikki fixed them, and the way gin and gist still do, I believe. Namely, just emit xlog records for each page individually and during replay remember when you have an incomplete split and complain if recovery ends with any still incomplete. It would be unfortunate to be adding new cases of this just as Heikki and company are making progress eliminating the ones we already had, but that's surely better than having no recovery at all. Grmph. Indeed. Looking at Robert's idea from November: I have thought about doing some work on this, although I don't know when I'll ever have the time. As far as I can see, the basic problem is that we need a better way of splitting buckets. Right now, splitting a bucket means locking it, scanning the entire bucket chain and moving ~1/2 the tuples to the new bucket, and then unlocking it. Since this is a long operation, we have to use heavyweight locks to protect the buckets, which is bad for concurrency. Since it involves a modification to an arbitrarily large number of pages that has to be atomic, it's not possible to WAL-log it sensibly -- and in fact, I think this is really the major obstacle to implementing WAL-logging for hash indexes. I don't think it's necessary to improve concurrency just to get WAL-logging. Better concurrency is a worthy goal of its own, of course, but it's a separate concern.
For crash safety, the key is to make sure you perform the whole operation in small atomic steps, and you have a way of knowing where to continue after crash (this is the same whether you do the cleanup immediately at the end of recovery, which I want to avoid, or lazily afterwards). But you can hold locks across those small atomic steps, to ensure concurrency-safety. I think we could work around this problem as follows. Suppose we have buckets 1..N and the split will create bucket N+1. We need a flag that can be set and cleared on the metapage, call it split-in-progress, and a flag that can optionally be set on particular index tuples, call it moved-by-split. We will allow scans of all buckets and insertions into all buckets while the split is in progress, but (as now) we will not allow more than one split to be in progress at the same time. Hmm, unless I'm reading the code wrong, it *does* allow more than one split to be in progress at the same time today. [description of how to use the moved-by-split to allow scans to run concurrently with the split] I guess that would work, although you didn't actually describe how to continue a split after a crash. But it's a lot simpler if you don't also try to make it more concurrent: --- When you start splitting a bucket, first acquire a heavy-weight lock on the old and new buckets. Allocate the required number of pages, before changing anything on-disk, so that you can easily back out if you run out of disk space. So far, this is how splitting works today. Then you update the metapage to show that the bucket has been split and initialize the new bucket's primary page (as one atomic WAL-logged operation). Also mark the new bucket's primary page with a split-in-progress flag. That's used later by scans to detect if the split was interrupted. Now you scan the old bucket, and move all the tuples belonging to the new bucket.
That needs to be done as a series of small atomic WAL-logged operations, each involving a small number of old and new pages (one WAL record for each moved tuple is the simplest, but you can do some batching for efficiency). After you're done, clear the split-in-progress flag in the new bucket's primary page (WAL-log that), and release the locks. In a scan, if the bucket you're about to scan has the split-in-progress flag set, that indicates that the split was interrupted by a crash. You won't see the flag as set if a concurrent split is in progress, because you will block on the lock it's holding on the bucket. If you see the flag as set, you share-lock and scan both buckets, the old and the new. If you see the split-in-progress flag in the bucket you're about to insert to, you don't need to do anything special. Just insert the tuple to the new bucket as normal. Before splitting the new bucket again, however, the previous split needs to be finished, or things will get complicated. To do that, acquire the locks on the old and the new bucket, and scan the old bucket for any remaining tuples that belong to the new bucket and move them, and finally clear the split-in-progress flag. --- This is similar to
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not make an impulsive decision about hstore2. I agree that jsonb has a lot of facilities, but don't forget that json(b) has to follow the standard, and in that sense it's more constrained than hstore, which we could further develop to support some interesting features which will never be implemented in json(b). Also, it'd be a bit awkward, after working on nested hstore and presenting it at several conferences (Engine Yard has sponsored part of our hstore work), to suddenly break people's expectations and say that our work has moved to core to provide json some very cool features - good bye, hstore users :( I'm afraid people will not understand us.
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 09:54:57AM +0530, Amit Kapila wrote: If they haven't passed us a data directory, we don't really know if the server is running or not, so the patch just returns '1'. But for such cases, isn't the status 4 more appropriate? As per the above link: 4 - program or service status is unknown; 1 - program is dead and /var/run pid file exists. Going by this definition, it seems status 1 means someone has forcefully killed the server and the pid file still remains. Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. I don't see how specifying a non-existent or non-cluster directory would somehow be a case that would be special. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 08:16 AM, Oleg Bartunov wrote: On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It would be possible to introduce a GUC variable for the i/o functions to control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not make an impulsive decision about hstore2. I agree that jsonb has a lot of facilities, but don't forget that json(b) has to follow the standard, and in that sense it's more constrained than hstore, which we could further develop to support some interesting features which will never be implemented in json(b). Also, it'd be a bit awkward, after working on nested hstore and presenting it at several conferences (Engine Yard has sponsored part of our hstore work), to suddenly break people's expectations and say that our work has moved to core to provide json some very cool features - good bye, hstore users :( I'm afraid people will not understand us. Oleg, I hear you, and largely agree, as long as the compatibility issue is solved. If it's not, I think inventing a new hstore2 type is probably a lousy way to go.
For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. Whatever we do, rest assured your work won't go to waste. cheers andrew
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 09:33:18AM -0500, Andrew Dunstan wrote: I hear you, and largely agree, as long as the compatibility issue is solved. If it's not, I think inventing a new hstore2 type is probably a lousy way to go. For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. Whatever we do, rest assured your work won't go to waste. OK, just to summarize:

- JSONB and everything it shares with hstore will be in core
- hstore-specific code stays in contrib
- the hstore contrib will create an hstore type to call contrib and core code
- 9.4 hstore has some differences from pre-9.4

The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful lot like the dreaded VARCHAR2. What can we do to help people migrate to an hstore type that supports data types? Is there a function we can give them to flag possible problem data, or give them some function to format things the old way for migrations, etc.? If they are going to have to rewrite all their old data, why bother with a backward-compatible binary format? Is it only the client applications that will need to be changed? How would we instruct users on the necessary changes? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Mar 6, 2014, at 00:08, Bruce Momjian br...@momjian.us wrote: I have addressed this issue with the attached patch:

$ pg_ctl -D /lkjasdf status
pg_ctl: directory /lkjasdf does not exist
$ pg_ctl -D / status
pg_ctl: directory / is not a database cluster directory

One odd question is that pg_ctl status has this comment for reporting the exit code for non-running servers:

printf(_("%s: no server running\n"), progname);
/*
 * The Linux Standard Base Core Specification 3.1 says this should return '3'
 * https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
 */
exit(3);

If they haven't passed us a data directory, we don't really know if the server is running or not, so the patch just returns '1'. Why change the exit code at all in the ENOENT case? If the directory does not exist, it's fairly certain that the server is not running, so 3 seems fine. Technically, changing the return value is an API change and might break things, so why do it if there's no clear benefit? In the EPERM case (or, rather, the non-ENOENT case), I agree with Amit that 4 (meaning "program or service status is unknown") fits much better than 1 (meaning "program is dead and /var/run pid file exists"). So *if* we change it at all, we should change it to 4, not to some other, equally arbitrary value. best regards, Florian Pflug
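For reference, the LSB status exit codes under discussion, plus a sketch of the mapping argued for above. The helper function is illustrative of that argument, not pg_ctl's actual behavior:

```python
# LSB 3.1 init-script "status" exit codes referenced in this thread.
LSB_RUNNING = 0        # program is running
LSB_DEAD_PIDFILE = 1   # program is dead and /var/run pid file exists
LSB_NOT_RUNNING = 3    # program is not running
LSB_UNKNOWN = 4        # program or service status is unknown

def pg_ctl_status_code(dir_exists, is_cluster, running):
    """Illustrative mapping following the argument above: a missing
    directory means the server certainly isn't running (3), while an
    unreadable or non-cluster directory leaves the status unknown (4)."""
    if not dir_exists:
        return LSB_NOT_RUNNING
    if not is_cluster:
        return LSB_UNKNOWN
    return LSB_RUNNING if running else LSB_NOT_RUNNING
```

Under this mapping, code 1 is never returned for a bad directory argument, since nothing is known about any pid file in that case.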
Re: [HACKERS] pg_ctl status with nonexistent data directory
Bruce Momjian wrote: Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. That sounds odd. Do you have a link? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] jsonb and nested hstore
Bruce Momjian br...@momjian.us writes: OK, just to summarize:

- JSONB and everything it shares with hstore will be in core
- hstore-specific code stays in contrib
- the hstore contrib will create an hstore type to call contrib and core code
- 9.4 hstore has some differences from pre-9.4

I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application-visible behavior. If you've got to adapt your application anyway, why not switch to JSONB, which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful lot like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. regards, tom lane
Re: [HACKERS] pg_ctl status with nonexistent data directory
Alvaro Herrera alvhe...@2ndquadrant.com writes: Bruce Momjian wrote: Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. That sounds odd. Do you have a link? FWIW, I recall that in my former life at Red Hat, I had to deal several times with patching pg_ctl to make its exit codes more LSB-compliant. And I think Debian does the same. So Linux packagers, at least, would like to see us pay more attention to that standard. On the other hand, there is no market for changes that make the exit codes more meaningful unless the changes can be justified by chapter and verse in LSB. regards, tom lane
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Wed, Mar 5, 2014 at 10:49 PM, Tom Lane t...@sss.pgh.pa.us wrote: [ shrug... ] They can see whether the Window plan node is where the time is going. It's not apparent to me that the extra numbers you propose to report will edify anybody. Perhaps just saying Incremental Window Function versus Iterated Window Function or something like that would be sufficient? At least that way query tuning guidelines have a keyword they can say to watch out for. And someone trying to figure out *why* the time is being spent in this node has something they might notice a correlation with. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thursday, March 6, 2014 09:33:18, Andrew Dunstan wrote: On 03/06/2014 08:16 AM, Oleg Bartunov wrote: On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: It's possible to introduce a GUC variable for the i/o functions which would control the old bug-for-bug behavior. IMHO, this is a much better option than stopping hstore development or splitting hstore into two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean break from having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not make an impulsive decision about hstore2. I agree that jsonb has a lot of facilities, but don't forget that json(b) has to follow the standard, and in that sense it's more constrained than hstore, which we could further develop to support some interesting features which will never be implemented in json(b). Also, it'd be a bit awkward, after working on nested hstore and announcing it at several conferences (Engine Yard has sponsored part of our hstore work), to suddenly break people's expectations and say that our work has moved to core to provide json some very cool features: good bye, hstore users :( I'm afraid people will not understand us. Oleg, I hear you, and largely agree, as long as the compatibility issue is solved. 
If it's not, I think inventing a new hstore2 type is probably a lousy way to go. For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. I haven't followed the whole thread, but json is really restrictive on the supported types: a hierarchical hstore could maybe support more types (timestamp comes to mind) as its values, which is not a valid data type in the json spec. Whatever we do, rest assured your work won't go to waste. cheers andrew -- Ronan Dunklau http://dalibo.com - http://dalibo.org
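Ronan's point about the json spec's type restrictions is easy to demonstrate. A minimal sketch using Python's standard json module (datetime is just one example of a type json cannot represent natively):

```python
import json
from datetime import datetime

# The JSON spec only defines strings, numbers, booleans, null, arrays
# and objects, so a timestamp has no native representation:
try:
    json.dumps({"seen": datetime(2014, 3, 6)})
except TypeError:
    print("datetime is not JSON-serializable")

# Applications typically fall back to strings, losing the type:
print(json.dumps({"seen": datetime(2014, 3, 6).isoformat()}))
```

A hierarchical hstore, not being bound by the json spec, could in principle carry such values with their types intact.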
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 4:25 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application- visible behavior. If you've got to adapt your application anyway, why not switch to JSONB which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. (not read up on the full details of the thread, sorry if I'm re-iterating something) I think we definitely want/need to maintain hstore compatibility. A completely separate hstore2 type that's not backwards compatible makes very little sense. 
However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] jsonb and nested hstore
Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; I'd have said it was wasted effort, but if the work is already done then that's moot. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 10:46 AM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; I'd have said it was wasted effort, but if the work is already done then that's moot. Clearly there are people who want it, or else they would not have sponsored the work. We seem to have an emerging consensus on the compatibility issue. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE TYPE similar CHAR type
mohsencs wrote: I want to use CREATE TYPE to create a type similar to char, so that when I create the type it behaves like char: CREATE TABLE test (oneChar char); when I insert a value with length > 1 into it, it gives this error: ERROR: value too long for type character(1). I want my type to behave like this, but it behaves like the varchar type. If you can get over the need for using CREATE TYPE, you'll find that using CREATE DOMAIN with a check constraint will probably meet your needs perfectly. David J. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
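A sketch of the CREATE DOMAIN approach David suggests; the domain and table names are illustrative, not taken from the original message:

```sql
-- A domain over text that rejects values longer than one character,
-- giving behavior close to char(1)'s "value too long" error:
CREATE DOMAIN onechar AS text
    CHECK (char_length(VALUE) <= 1);

CREATE TABLE test (onecol onechar);
INSERT INTO test VALUES ('a');   -- ok
INSERT INTO test VALUES ('ab');  -- ERROR: violates check constraint
```

The error message differs from char(1)'s, but the rejection behavior is what the original poster asked for.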
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 05:46 PM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; There are reasons for *not* wanting the new hstore2 functionality. If you don't want nesting, for example, with the new type you're going to need to add a constraint to forbid that. Ugh. Many applications are happy with the current functionality, a simple string key/value dictionary, and for them the new features are not an improvement. As an analogy, adding significant new functionality like nesting to the existing hstore type is like suddenly adding the time of day to the date datatype. It might be useful in many cases. And an existing application can leave the hour and minute fields zero, so it's backwards-compatible. But as soon as someone inserts a datum that uses the hour and minute fields, it will confuse the application that doesn't know about that. I haven't been following these discussions closely, but for those reasons, I thought hstore2 was going to be a separate type. I don't think there are very many applications that would be interested in upgrading from the current hstore to the new hstore2 type. More likely, the new data type is useful for many applications that couldn't have used hstore before because it didn't support nesting or was too loosely typed. And old applications that are already using hstore are perfectly happy with the status quo. Let's not mess with the existing hstore datatype. For what it does, it works great. 
Likewise, jsonb is significantly different from hstore2, so it should be a separate data type. Frankly, I don't understand what the problem is with doing that. I don't have a problem with copy-pasting the common parts. BTW, now that I look at the nested hstore patch, I'm disappointed to see that it only supports a few hardcoded datatypes. Call me naive, but somehow I thought it would support *all* PostgreSQL datatypes, built-in or user-defined. I realize that's a tall order, but that's what I thought it did. Since it doesn't, color me unimpressed. It's really not any better than json; I don't see why anyone would prefer it over json. Not that I particularly like json, but it's a format a lot of people are familiar with. So here's my opinion on what we should do: 1. Forget about hstore2. 2. Add GIN and GiST operator classes to jsonb, if they're ready for commit pretty darn soon. If not, punt them to the next release. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Securing make check (CVE-2014-0067)
Noah Misch n...@leadboat.com writes: Thanks. To avoid socket path length limitations, I lean toward placing the socket temporary directory under /tmp rather than placing under the CWD: I'm not thrilled with that; it's totally insecure on platforms where /tmp isn't sticky, so it doesn't seem like an appropriate solution given that this discussion is now being driven by security concerns. http://www.postgresql.org/message-id/flat/20121129223632.ga15...@tornado.leadboat.com I re-read that thread. While we did fix the reporting end of it, ie the postmaster will now give you a clear failure message if your socket path is too long, that's going to be cold comfort to anyone who has to build in an environment they don't have much control over (such as my still-hypothetical-I-hope scenario about Red Hat package updates). I'm inclined to suggest that we should put the socket under $CWD by default, but provide some way for the user to override that choice. If they want to put it in /tmp, it's on their head as to how secure that is. On most modern platforms it'd be fine. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 11:28 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So here my opinion on what we should do: 1. Forget about hstore2 2. Add GIN and GIST operator classes to jsonb, if they're ready for commit pretty darn soon. If not, punt them to next release. For #2, would we maintain the hstore syntax for the searching operators? For example, SELECT count(*) FROM jsonb_schema WHERE tabledata @> 'columns => {{column_name=>total_time}}'; Note the hstore-ish => in the searching operator. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. As for the old hstore I think it'd be nice to add gin_hstore_hash_ops, so hstore users will benefit from the 9.4 release. There is no compatibility issue, so I think this could be harmless. Oleg On Thu, Mar 6, 2014 at 7:25 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application-visible behavior. If you've got to adapt your application anyway, why not switch to JSONB which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful lot like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. 
contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
Kouhei Kaigai kai...@ak.jp.nec.com writes: I expected to include simple function pointers for copying and text-output as follows: typedef struct { Plan plan; : NodeCopy_function node_copy; NodeTextOut_function node_textout; } Custom; I was thinking more like typedef struct CustomPathFuncs { const char *name; /* used for debugging purposes only */ NodeCopy_function node_copy; NodeTextOut_function node_textout; ... etc etc etc ... } CustomPathFuncs; typedef struct CustomPath { Path path; const CustomPathFuncs *funcs; ... maybe a few more fields here, but not too darn many ... } CustomPath; and similarly for CustomPlan. The advantage of this way is it's very cheap for (what I expect will be) the common case where an extension has a fixed set of support functions for its custom paths and plans. It just declares a static constant CustomPathFuncs struct, and puts a pointer to that into its paths. If an extension really needs to set the support functions on a per-object basis, it can do this: typedef struct MyCustomPath { CustomPath cpath; CustomPathFuncs funcs; ... more fields ... } MyCustomPath; and then initialization of a MyCustomPath would include mypath->cpath.funcs = &mypath->funcs; mypath->funcs.node_copy = MyCustomPathCopy; ... etc etc ... In this case we're arguably wasting one pointer worth of space in the path, but considering the number of function pointers such a path will be carrying, I don't think that's much of an objection. So? If you did that, then you wouldn't have renumbered the Vars as INNER/OUTER. I don't believe that CUSTOM_VAR is necessary at all; if it is needed, then there would also be a need for an additional tuple slot in executor contexts, which you haven't provided. For example, the enhanced postgres_fdw fetches the result set of a remote join query, thus a tuple contains fields that come from both sides. In this case, what varno would be suitable? Not sure what we'd do for the general case, but CUSTOM_VAR isn't the solution. 
Consider for example a join where both tables supply columns named id --- if you put them both in one tupledesc then there's no non-kluge way to identify them. Possibly the route to a solution involves adding another plan-node callback function that ruleutils.c would use for printing Vars in custom join nodes. Or maybe we could let the Vars keep their original RTE numbers, though that would complicate life at execution time. Anyway, if we're going to punt on add_join_path_hook for the time being, this problem can probably be left to solve later. It won't arise for simple table-scan cases, nor for single-input plan nodes such as sorts. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 12:50 PM, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't normally credit sponsors in commits, but if I'm doing the commit I promise you guys would certainly get major credit as authors. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
I meant in Release Notes for 9.4 On Thu, Mar 6, 2014 at 10:26 PM, Andrew Dunstan and...@dunslane.net wrote: On 03/06/2014 12:50 PM, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't normally credit sponsors in commits, but if I'm doing the commit I promise you guys would certainly get major credit as authors. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 07:00 AM, Bruce Momjian wrote: What can we do to help people migrate to an hstore type that supports data types? Is there a function we can give them to flag possible problem data, or give them some function to format things the old way for migrations, etc. If they are going to have to rewrite all their old data, why bother with a backward-compatible binary format? Is it only the client applications that will need to be changed? How would we instruct users on the necessary changes? So, from what I've been able to check: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hstore2 supports the same limited data types as JSON does, and not any additional ones. This makes an hstore2 of dubious value unless the compatibility issues can be solved conclusively. Is that all correct? Have I missed something? On 03/06/2014 09:50 AM, Oleg Bartunov wrote: Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't generally credit companies in the release notes, since if we started, where would we stop? However, we *do* credit them in the press release, and I'll make a note of the EY sponsorship, especially since it's also good PR. -- Josh Berkus PostgreSQL Experts Inc. 
http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
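The driver-breakage risk Josh describes comes from exactly this kind of code. Below is a deliberately minimal sketch of parsing hstore's classic text format, the way a driver such as psycopg2 must; the real parser is more careful, but the point is that any change to the output format silently breaks this sort of string matching:

```python
import re

# One "key"=>"value" (or "key"=>NULL) pair in hstore's classic text output.
PAIR_RE = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=>\s*(?:"((?:[^"\\]|\\.)*)"|NULL)')

def unescape(s):
    # Undo the backslash escaping hstore applies to quotes and backslashes.
    return s.replace('\\"', '"').replace('\\\\', '\\')

def parse_hstore(text):
    """Parse hstore text output into a dict (minimal sketch, not a full parser)."""
    result = {}
    for m in PAIR_RE.finditer(text):
        value = m.group(2)
        result[unescape(m.group(1))] = None if value is None else unescape(value)
    return result

print(parse_hstore('"a"=>"1", "b"=>NULL'))  # {'a': '1', 'b': None}
```

A regression test running a real driver's parser against the new server output would directly answer the question Josh raises.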
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 10:54 PM, Josh Berkus j...@agliodbs.com wrote: On 03/06/2014 09:50 AM, Oleg Bartunov wrote: Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't generally credit companies in the release notes, since if we started, where would we stop? However, we *do* credit them in the press release, and I'll make a note of the EY sponsorship, especially since it's also good PR. I think the press release is fine. We waited a long time for sponsorship of our work, and EY's help was crucial. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 8:11 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I don't think it's necessary to improve concurrency just to get WAL-logging. Better concurrency is a worthy goal of its own, of course, but it's a separate concern. To some extent, I agree, but only to some extent. To make hash indexes generally usable, we really need to solve both problems. When I got rid of just some of the heavyweight locking in commit 76837c1507cb5a5a0048046233568447729e66dd, the results were pretty dramatic at higher levels of concurrency: http://www.postgresql.org/message-id/CA+Tgmoaf=nojxlyzgcbrry+pe-0vll0vfhi6tjdm3fftvws...@mail.gmail.com But there was still an awful lot of contention inside the heavyweight lock manager, and I don't think hash indexes are going to be ready for prime time until we solve that problem. This is similar to your description, as you scan both buckets if you see an interrupted split, but doesn't require the per-tuple moved-by-split flag, or waiting out scans. Also, I put the split-in-progress flag in the new bucket's primary page instead of the metapage. That allows multiple splits to be in progress at the same time. Putting the split-in-progress flag in the new bucket's primary page makes a lot of sense. I don't have any problem with dumping the rest of it for a first cut if we have a different long-term plan for how to improve concurrency, but I don't see much point in going to a lot of work to implement a system for WAL logging if we're going to end up having to afterwards throw it out and start from scratch to get rid of the heavyweight locks - and it's not obvious to me how what you have here could be extended to do that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
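The interrupted-split scan Robert and Heikki are discussing can be modeled in miniature. The classes below are toy stand-ins for hash-index pages, nothing here is the real hash AM:

```python
class Bucket:
    """Toy model of a hash bucket's primary page plus its tuples."""
    def __init__(self, tuples=(), split_in_progress=False, split_from=None):
        self.tuples = list(tuples)
        self.split_in_progress = split_in_progress  # flag on the NEW bucket's primary page
        self.split_from = split_from                # old bucket still holding unmoved tuples

def bucket_scan(buckets, bucket_no, key):
    # Normal case: a single bucket to scan.
    bucket = buckets[bucket_no]
    matches = [t for t in bucket.tuples if t[0] == key]
    # Interrupted split: tuples not yet moved may still sit in the old
    # bucket, so scan it too -- no per-tuple moved-by-split flag needed.
    if bucket.split_in_progress:
        matches += [t for t in buckets[bucket.split_from].tuples if t[0] == key]
    return matches
```

Keeping the flag on the new bucket's primary page, rather than on the metapage, is what lets several such splits be in flight at once.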
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Wed, Mar 5, 2014 at 7:42 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 5:00 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 12:54:02 -0500, Robert Haas wrote: On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. I think it'd already be much better than today's situation, and it's a required codepath for wal_level logical anyway. So even if somebody wants to make this work without the full copy for minimal, it'd still be a required codepath. So I am perfectly ok with a patch just adding that. Then is this a good idea for a GSoC project? I don't know these internals very well, but I am willing to learn, and I think GSoC is a good opportunity. Are any of you willing to mentor this project? I've written the proposal for this feature, so I would like to know if someone can review it. I think this isn't a good design. Per the discussion between Andres and me, I think that what you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On 6 March 2014 19:42, Robert Haas robertmh...@gmail.com wrote: On Wed, Mar 5, 2014 at 7:42 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 5:00 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Mar 4, 2014 at 3:29 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 12:54:02 -0500, Robert Haas wrote: On Tue, Mar 4, 2014 at 9:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-04 09:47:08 -0500, Robert Haas wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Yes, that would work. I've tended to view optimizing away the relfilenode copy as an indispensable part of this work, but that might be wrongheaded. It would certainly be a lot easier to make this happen if we didn't insist on that. I think it'd already much better than today's situation, and it's a required codepath for wal_level logical anyway. So even if somebody wants to make this work without the full copy for minimal, it'd still be a required codepath. So I am perfectly ok with a patch just adding that. Then is this a good idea for a GSoC project ? I don't know very well this internals, but I am willing to learn and I think the GSoC is a good opportunity. Any of you are willing to mentoring this project? I written the proposal to this feature, so I would like to know if someone can review. I think this isn't a good design. Per the discussion between Andres and I, I think that I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Sounds like a plan. Would there be any stretch-goals for this work, or is there not really anything else that could be done? -- Thom
Re: [HACKERS] walsender doesn't send keepalives when writes are pending
On 03/05/2014 10:57 PM, Andres Freund wrote: On 2014-03-05 18:26:13 +0200, Heikki Linnakangas wrote: The logic was the same before the patch, but I added the XXX comment above. Why do we sleep in increments of 1/10 of wal_sender_timeout? Originally, the code calculated when the next wakeup should happen, by adding wal_sender_timeout (or replication_timeout, as it was called back then) to the time of the last reply. Why don't we do that? [ archeology ] It imo makes sense to wakeup after last_reply + wal_sender_timeout/2, so a requested reply actually has time to arrive, but otherwise I agree. I think your patch makes sense. Additionally imo the timeout checking should be moved outside the if (caughtup || pq_is_send_pending()), but that's probably a separate patch. Any chance you could apply your patch soon? I've a patch pending that'll surely conflict with this and it seems better to fix it first. Ok, pushed. I left the polling-style sleep in place for now. Thanks! - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
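The scheduling change discussed above (wake at last_reply + timeout/2 rather than polling in timeout/10 increments) is simple arithmetic. A sketch; the function name is illustrative and the 60 s value mirrors wal_sender_timeout's default:

```python
WAL_SENDER_TIMEOUT = 60.0  # seconds; mirrors the GUC's default

def next_wakeup(last_reply_ts, timeout=WAL_SENDER_TIMEOUT):
    # Wake halfway through the timeout window: a keepalive requested then
    # still leaves the standby timeout/2 seconds to get a reply through,
    # instead of waking ten times per timeout interval.
    return last_reply_ts + timeout / 2.0

print(next_wakeup(100.0))  # 130.0
```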
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and I, I think that I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and I, I think that I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Yep. Anything else appears to be a research problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] patch: make_timestamp function
On Thu, Mar 6, 2014 at 1:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-03-05 16:22 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule escribió: Hi I hope that this patch fixes it. wtf? I tried to fix http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f1ba94bcd9717b94b36868d6905547e313f3a359 Tom did it better than me. The patch you attached was one from Heikki, not anything you wrote for yourself, and utterly unrelated to the topic of this thread. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and me, what I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Yep. Anything else appears to be a research problem. I'll change the proposal. Thanks a lot! Greetings, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: Fwd: [HACKERS] patch: make_timestamp function
2014-03-06 21:06 GMT+01:00 Robert Haas robertmh...@gmail.com: On Thu, Mar 6, 2014 at 1:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-03-05 16:22 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule escribió: Hi I hope that this patch fixes it. wtf? I tried to fix http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f1ba94bcd9717b94b36868d6905547e313f3a359 Tom did it better than me. The patch you attached was one from Heikki, not anything you wrote for yourself, and utterly unrelated to the topic of this thread. Yes, sorry - it was a git mix-up on my side (I must have used the wrong hash). I made changes similar to Tom's fix, but the attached patch was a different one from the one I wrote. Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 11:34 AM, Robert Haas robertmh...@gmail.com wrote: Putting the split-in-progress flag in the new bucket's primary page makes a lot of sense. I don't have any problem with dumping the rest of it for a first cut if we have a different long-term plan for how to improve concurrency, but I don't see much point in going to a lot of work to implement a system for WAL logging if we're going to end up having to afterwards throw it out and start from scratch to get rid of the heavyweight locks - and it's not obvious to me how what you have here could be extended to do that. +1 I don't think we have to improve concurrency at the same time as WAL logging, but we at least have to implement WAL logging in a way that doesn't foreclose future improvements to concurrency. I've been tempted to implement a new type of hash index that allows both WAL and high concurrency, simply by disallowing bucket splits. At the index creation time you use a storage parameter to specify the number of buckets, and that is that. If you mis-planned, build a new index with more buckets, possibly concurrently, and drop the too-small one. I think it would be easier to add bucket splitting to something like this than it would be to add WAL logging and concurrency to what we already have--mostly because I think that route would be more amenable to incremental programming efforts, and also because if people had an actually useful hash index type they would use it and see that it worked well (assuming of course that it does work well), and then be motivated to improve it. If this could be done as an extension I would just go (attempt to) do it. But since WAL isn't pluggable, it can't go in as an extension. Cheers, Jeff
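Jeff's no-split design boils down to a hash index whose bucket count is fixed at creation time; "resizing" means building a new, bigger index and dropping the too-small one. A minimal in-memory Python sketch of that idea (not PostgreSQL code; all names here are invented for illustration):

```python
class FixedBucketIndex:
    """Hash index with a bucket count fixed at creation time.
    Because buckets never split, an insert or lookup only ever touches
    one bucket, which is what keeps both concurrency and WAL-logging
    simple in Jeff's proposal."""

    def __init__(self, nbuckets):
        self.nbuckets = nbuckets
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, tid):
        self.buckets[hash(key) % self.nbuckets].append((key, tid))

    def lookup(self, key):
        bucket = self.buckets[hash(key) % self.nbuckets]
        return [tid for k, tid in bucket if k == key]

    def rebuild(self, nbuckets):
        """The only way to 'resize': build a new index with more buckets
        and repopulate it, analogous to creating a larger index
        (possibly concurrently) and dropping the old one."""
        new = FixedBucketIndex(nbuckets)
        for bucket in self.buckets:
            for key, tid in bucket:
                new.insert(key, tid)
        return new
```

The trade-off Robert points out below is visible here: if nbuckets is mis-planned, chains grow without bound until someone pays for a full rebuild.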
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 6:54 PM, Josh Berkus j...@agliodbs.com wrote: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. FYI I have a trigger that highlights me the -hackers messages mentioning psycopg, so just mentioning it is enough for me to take a better look. But if you want a more active collaboration just ask. Thank you, -- Daniele
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo daniele.varra...@gmail.com wrote: Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. I don't think that'll be necessary. Any break in compatibility in the hstore format has been ruled a non-starter for having hstore support nested data structures. I believe on balance we're content to let hstore continue to be hstore. jsonb support would certainly be interesting, though. -- Peter Geoghegan
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 9:10 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo daniele.varra...@gmail.com wrote: Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. I don't think that'll be necessary. Any break in compatibility in the hstore format has been ruled a non-starter for having hstore support nested data structures. I believe on balance we're content to let hstore continue to be hstore. jsonb support would certainly be interesting, though. Cool, just let me know how you would expect a well-behaved client library to behave. -- Daniele
Re: [HACKERS] Rowcounts marked by create_foreignscan_path()
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes: Maybe I'm missing something. But I don't think postgresGetForeignPaths() marks the parameterized path with the correct row estimate. Also, that function doesn't seem to estimate the cost of the parameterized path correctly. Please find attached a patch. [ looks at that... ] Oh, I see what you're worried about: we select a specific safe join clause to drive creation of the parameterized path, but then there might be additional join clauses that come along with that because they're also movable to the parameterized path. You're right, the current postgres_fdw code is not accounting for that. I experimented with this patch using this test case in the database created by the postgres_fdw regression test:

explain verbose SELECT * FROM ft2 a, ft2 b
  WHERE a.c2 = 47 AND b.c1 = a.c1 and a.c6 = 'fooo' and b.c7=a.c7;

What I saw in this example was that with the patch, the join clause got counted *twice* in the cost estimate, because the clause returned by generate_implied_equalities_for_column() isn't pointer-equal to the one in the ppi_clauses list (note the comment about such cases in postgresGetForeignPlan). So that's not right. However, we've got bigger problems than that: with or without the patch,

explain verbose SELECT * FROM ft2 a, ft2 b
  WHERE a.c2 = 47 AND b.c1 = a.c1 and a.c6 = 'fooo' and b.c7=upper(a.c7);

fails with

TRAP: FailedAssertion(!(is_foreign_expr(root, baserel, rinfo->clause)), File: postgres_fdw.c, Line: 759)

This happens because we select the safe joinclause b.c1=a.c1, and then the PPI machinery adds all the other movable join clauses for that outer relation, including the unsafe one involving an upper() call. postgresGetForeignPlan believes that the only joinclauses it can find in scan_clauses are the safe one(s) selected by postgresGetForeignPaths, but that's completely wrong.
I think this is probably relatively simple to fix: in postgresGetForeignPlan, we need to actually test whether a join clause is safe or not, and shove it into the appropriate list. So far as the original issue goes, I think what this consideration shows is that postgresGetForeignPaths is going about things incorrectly. I thought that considering only one join clause per path would be valid, if perhaps sometimes dumb. But the way the movable-clause machinery works, we don't get to consider only one join clause; it's all-or-nothing for a given outer relation. So what we really ought to be doing here is generating one parameterized path per valid outer relation. What we could possibly do to resolve this without much extra code is to keep using the same logic as a guide to which outer relations are interesting, but for each such relation, do get_baserel_parampathinfo() and then use the ppi_clauses list as the quals-to-apply for purposes of cost/selectivity estimation. postgresGetForeignPaths would have to check each of those clauses for safety rather than assuming anything. One possible benefit is that we could record the results in the path and not have to repeat the is_foreign_expr tests in postgresGetForeignPlan. I'm not entirely sure if that's worth the trouble though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
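Tom's fix for postgresGetForeignPlan amounts to testing every join clause for shippability and shoving it into the appropriate list, instead of assuming all of them are safe. Schematically, in a Python stand-in with a stubbed safety test (not the actual C code; the helper name and the string clauses are invented for illustration):

```python
def split_clauses(clauses, is_foreign_expr):
    """Partition qual clauses into ones safe to ship to the remote
    server and ones that must be evaluated locally, rather than
    asserting that every join clause is remote-safe."""
    remote_conds, local_conds = [], []
    for clause in clauses:
        (remote_conds if is_foreign_expr(clause) else local_conds).append(clause)
    return remote_conds, local_conds

# "b.c7 = upper(a.c7)" stands in for a clause using a non-shippable function.
safe = lambda c: "upper(" not in c
print(split_clauses(["b.c1 = a.c1", "b.c7 = upper(a.c7)"], safe))
# prints (['b.c1 = a.c1'], ['b.c7 = upper(a.c7)'])
```

With this shape, the unsafe upper() clause from the failing test case above would land in the local list instead of tripping an assertion.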
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On Tue, Mar 04, 2014 at 06:50:17PM +0100, Andres Freund wrote: On 2014-03-04 11:40:10 -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: I think this is all too late for 9.4, though. I agree with the feeling that a meaningful fix for pg_dump isn't going to get done for 9.4. So that leaves us with the alternatives of (1) put off the lock-strength-reduction patch for another year; (2) push it anyway and accept a reduction in pg_dump reliability. I don't care for (2). I'd like to have lock strength reduction as much as anybody, but it can't come at the price of reduction of reliability. I am sorry, but I think this is vastly overstating the scope of the pg_dump problem. CREATE INDEX *already* doesn't require an AEL, and the amount of problems that has caused in the past is surprisingly low. If such a frequently used command doesn't cause problems, why are you assuming other commands to be that problematic? And I think it's hard to argue that the proposed changes are more likely to cause problems. Let's try to go at this a bit more methodically. The commands that - afaics - change their locklevel due to latest patch (v21) are: [snip] Good analysis. The hazards arise when pg_dump uses one of the ruleutils.c deparse worker functions. As a cross-check to your study, I looked briefly at the use of those functions in pg_dump and how this patch might affect them:

-- pg_get_constraintdef()

pg_dump reads the constraint OID with its transaction snapshot, so we will never see a too-new constraint. Dropping a constraint still requires AccessExclusiveLock. Concerning VALIDATE CONSTRAINT, pg_dump reads convalidated with its transaction snapshot and uses that to decide whether to dump the CHECK constraint as part of the CREATE TABLE or as a separate ALTER TABLE ADD CONSTRAINT following the data load. However, pg_get_constraintdef() reads the latest convalidated to decide whether to emit NOT VALID.
Consequently, one can get a dump in which the dumped table data did not yet conform to the constraint, and the ALTER TABLE ADD CONSTRAINT (w/o NOT VALID) fails. (Suppose you deleted the last invalid rows just before executing the VALIDATE CONSTRAINT. I tested this by committing the DELETE + VALIDATE CONSTRAINT with pg_dump stopped at getTableAttrs().) One hacky, but maintainable and effective, solution to the VALIDATE CONSTRAINT problem is to have pg_dump tack on a NOT VALID if pg_get_constraintdef() did not do so. It's, conveniently, the last part of the definition. I would tend to choose this. We could also just decide this isn't bad enough to worry about. The consequence is that an ALTER TABLE ADD CONSTRAINT fails. Assuming no --single-transaction for the original restoral, you just add NOT VALID to the command and rerun. Like most of the potential new pg_dump problems, this can already happen today if the relevant database changes happen between taking the pg_dump transaction snapshot and locking the tables.

-- pg_get_expr() for default expressions

pg_dump reads pg_attrdef.adbin using its transaction snapshot, so it will never see a too-new default. This does allow us to read a dropped default. That's not a problem directly. However, suppose the default references a function dropped at the same time as the default. pg_dump could fail in pg_get_expr().

-- pg_get_indexdef()

As you explained elsewhere, new indexes are no problem. DROP INDEX still requires AccessExclusiveLock. Overall, no problems here.

-- pg_get_ruledef()

The patch changes lock requirements for enabling and disabling of rules, but that is all separate from the rule expression handling. No problems.

-- pg_get_triggerdef()

The patch reduces CREATE TRIGGER and DROP TRIGGER to ShareUpdateExclusiveLock. The implications for pg_dump are similar to those for pg_get_expr().

-- pg_get_viewdef()

Untamed: pg_dump does not lock views at all.
One thing not to forget is that you can always get the old mutual exclusion back by issuing LOCK TABLE just before a DDL operation. If some unlucky user regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a workaround. There's no comparable way for someone who would not experience that problem to weaken the now-hardcoded AccessExclusiveLock. Many consequences of insufficient locking are too severe for that workaround to bring comfort, but the pg_dump failure scenarios around pg_get_expr() and pg_get_triggerdef() seem mild enough. Restore-time failures are more serious, hence my recommendation to put a hack in pg_dump around VALIDATE CONSTRAINT. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com
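The NOT VALID tack-on hack Noah recommends is pure string surgery on pg_get_constraintdef()'s output, since NOT VALID is conveniently the last part of the definition. A rough sketch of the transformation in Python (pg_dump itself is C; the function name here is invented):

```python
def force_not_valid(constraint_def):
    """Append NOT VALID to a constraint definition unless
    pg_get_constraintdef() already emitted it, so that the restore-time
    ALTER TABLE ADD CONSTRAINT cannot fail on data that did not yet
    conform to the constraint when it was dumped."""
    if constraint_def.rstrip().upper().endswith("NOT VALID"):
        return constraint_def
    return constraint_def.rstrip() + " NOT VALID"

print(force_not_valid("CHECK ((x > 0))"))
# prints CHECK ((x > 0)) NOT VALID
```

The operation is idempotent, which is what makes the hack maintainable: it cannot double-append if the server already printed NOT VALID.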
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 3:44 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Thu, Mar 6, 2014 at 11:34 AM, Robert Haas robertmh...@gmail.com wrote: Putting the split-in-progress flag in the new bucket's primary page makes a lot of sense. I don't have any problem with dumping the rest of it for a first cut if we have a different long-term plan for how to improve concurrency, but I don't see much point in going to a lot of work to implement a system for WAL logging if we're going to end up having to afterwards throw it out and start from scratch to get rid of the heavyweight locks - and it's not obvious to me how what you have here could be extended to do that. +1 I don't think we have to improve concurrency at the same time as WAL logging, but we at least have to implement WAL logging in a way that doesn't foreclose future improvements to concurrency. I've been tempted to implement a new type of hash index that allows both WAL and high concurrency, simply by disallowing bucket splits. At the index creation time you use a storage parameter to specify the number of buckets, and that is that. If you mis-planned, build a new index with more buckets, possibly concurrently, and drop the too-small one. Yeah, we could certainly do something like that. It sort of sucks, though. I mean, it's probably pretty easy to know that starting with the default 2 buckets is not going to be enough; most people will at least be smart enough to start with, say, 1024. But are you going to know whether you need 32768 or 1048576 or 33554432? A lot of people won't, and we have more than enough reasons for performance to degrade over time as it is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe
On 6 March 2014 22:43, Noah Misch n...@leadboat.com wrote: Good analysis. The hazards arise when pg_dump uses one of the ruleutils.c deparse worker functions. Ah, good. We're thinking along the same lines. I was already working on this analysis also. I'll complete mine and then compare notes. One thing not to forget is that you can always get the old mutual exclusion back by issuing LOCK TABLE just before a DDL operation. If some unlucky user regularly gets pg_dump failures due to concurrent DROP TRIGGER, he has a workaround. There's no comparable way for someone who would not experience that problem to weaken the now-hardcoded AccessExclusiveLock. Good point. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] GSoC on WAL-logging hash indexes
On Thu, Mar 6, 2014 at 11:14 PM, Robert Haas robertmh...@gmail.com wrote: I've been tempted to implement a new type of hash index that allows both WAL and high concurrency, simply by disallowing bucket splits. At the index creation time you use a storage parameter to specify the number of buckets, and that is that. If you mis-planned, build a new index with more buckets, possibly concurrently, and drop the too-small one. Yeah, we could certainly do something like that. It sort of sucks, though. I mean, it's probably pretty easy to know that starting with the default 2 buckets is not going to be enough; most people will at least be smart enough to start with, say, 1024. But are you going to know whether you need 32768 or 1048576 or 33554432? A lot of people won't, and we have more than enough reasons for performance to degrade over time as it is. The other thought I had was that you can do things lazily in vacuum. So when you probe you need to check multiple pages until vacuum comes along and rehashes everything. -- greg
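Greg's lazy-rehash idea implies that, until vacuum rehashes everything, a probe must check the bucket chosen by the old bucket count as well as the one chosen by the new. A toy in-memory sketch (invented names, no relation to the on-disk hash AM; it assumes only one outstanding doubling at a time):

```python
class LazyHash:
    """Hash table that doubles its bucket count as a metadata-only
    operation; un-moved keys stay where the old modulus put them, so a
    lookup probes both candidate buckets until a vacuum-style pass
    rehashes everything."""

    def __init__(self, nbuckets):
        self.old_n = self.new_n = nbuckets
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, tid):
        self.buckets[hash(key) % self.new_n].append((key, tid))

    def lookup(self, key):
        # Until vacuum finishes, a key may sit in the bucket chosen by
        # the old modulus, so probe both candidate "pages".
        slots = {hash(key) % self.new_n, hash(key) % self.old_n}
        return [tid for s in slots for k, tid in self.buckets[s] if k == key]

    def grow(self):
        # Doubling is just metadata; no entries move yet.
        self.buckets.extend([] for _ in range(self.new_n))
        self.new_n *= 2

    def vacuum(self):
        # Rehash everything under the new modulus; afterwards lookups
        # are back to a single-bucket probe.
        entries = [e for b in self.buckets for e in b]
        self.buckets = [[] for _ in range(self.new_n)]
        for key, tid in entries:
            self.buckets[hash(key) % self.new_n].append((key, tid))
        self.old_n = self.new_n
```

The cost model is visible in lookup(): reads pay for up to two bucket probes between the doubling and the vacuum pass, in exchange for splits that need no immediate tuple movement.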
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 12:58 PM, Daniele Varrazzo wrote: On Thu, Mar 6, 2014 at 6:54 PM, Josh Berkus j...@agliodbs.com wrote: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. Looks like that won't be necessary, Daniele. But thanks for speaking up! -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
I was thinking more like

typedef struct CustomPathFuncs
{
    const char *name;       /* used for debugging purposes only */
    NodeCopy_function node_copy;
    NodeTextOut_function node_textout;
    ... etc etc etc ...
} CustomPathFuncs;

typedef struct CustomPath
{
    Path path;
    const CustomPathFuncs *funcs;
    ... maybe a few more fields here, but not too darn many ...
} CustomPath;

and similarly for CustomPlan. The advantage of this way is it's very cheap for (what I expect will be) the common case where an extension has a fixed set of support functions for its custom paths and plans. It just declares a static constant CustomPathFuncs struct, and puts a pointer to that into its paths. If an extension really needs to set the support functions on a per-object basis, it can do this:

typedef struct MyCustomPath
{
    CustomPath cpath;
    CustomPathFuncs funcs;
    ... more fields ...
} MyCustomPath;

and then initialization of a MyCustomPath would include

mypath->cpath.funcs = &mypath->funcs;
mypath->funcs.node_copy = MyCustomPathCopy;
... etc etc ...

In this case we're arguably wasting one pointer worth of space in the path, but considering the number of function pointers such a path will be carrying, I don't think that's much of an objection. That is exactly the same as my expectation, and no objection here. Thanks for your clarification. So? If you did that, then you wouldn't have renumbered the Vars as INNER/OUTER. I don't believe that CUSTOM_VAR is necessary at all; if it is needed, then there would also be a need for an additional tuple slot in executor contexts, which you haven't provided. For example, the enhanced postgres_fdw fetches the result set of a remote join query, thus a tuple contains fields coming from both sides. In this case, what varno shall be suitable to put? Not sure what we'd do for the general case, but CUSTOM_VAR isn't the solution.
Consider for example a join where both tables supply columns named id --- if you put them both in one tupledesc then there's no non-kluge way to identify them. Possibly the route to a solution involves adding another plan-node callback function that ruleutils.c would use for printing Vars in custom join nodes. Or maybe we could let the Vars keep their original RTE numbers, though that would complicate life at execution time. My preference is the former, because extra complication at execution time may cause performance degradation. Once two tuples get joined in a custom node, only the extension can know which relation is the origin of a particular attribute in the unified tuple. So it seems reasonable to me that the extension has to provide a hint to resolve the Var naming: probably another callback that provides a translation table for a Var node that references the custom plan but originated from either subtree. (It would look like translated_vars in prepunion.c.) For example, assume there is a Var node with INDEX_VAR in the tlist of the custom plan. It eventually references ecxt_scantuple at execution time, and this tuple slot will hold a joined tuple originating from two relations. If its varattno=9 came from the column varno=1/varattno=3, we would like to print its original name. If we have a translation table like translated_vars, it allows translating an attribute number on the custom plan into its original one. Even if it might be an abuse of INDEX_VAR, it seems a good idea to me. Also, I don't want to redefine the meaning of INNER_VAR/OUTER_VAR, because a custom plan may have both left and right subtrees, so it makes sense to support the case where both tuple slots are available. Anyway, if we're going to punt on add_join_path_hook for the time being, this problem can probably be left to solve later. It won't arise for simple table-scan cases, nor for single-input plan nodes such as sorts. Yes, it is a problem if the number of input plans is larger than 1.
Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 5:05 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote: I think this isn't a good design. Per the discussion between Andres and me, what I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer. Do it like 'VACUUM FULL' for any wal_level? Yep. Anything else appears to be a research problem. I'll change the proposal. Thanks a lot! One last question: do you think it would be difficult to implement ALTER TABLE ... SET UNLOGGED too? Thinking within the scope of one GSoC, of course. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
[HACKERS] Clustering and PostgresXC meetings at pgCon 2014
Clustering Summit

We will be holding the fourth annual Cluster-Hacker meeting on Tuesday May 20th, just before pgCon 2014. Everyone with a serious interest in clustering, replication, and scale-out technologies is welcome and encouraged to attend. This meeting will take place from 2pm to 6pm, in a meeting room on the University of Ottawa campus. The agenda will be set by the participants using the Clustering Summit Wiki Page: https://wiki.postgresql.org/wiki/PgCon2014ClusterSummit RSVP on the wiki page if you plan to attend. We need a headcount for the room size and for lunch (see below).

PostgresXC Meeting

The Cluster-Hackers meeting will be preceded by the PostgresXC Developer's Meeting from 9:30am to 1pm, in the same location. This meeting is for all contributors and beta-testers of PostgresXC to discuss the current state of development and plans for its future. The agenda will be set by the participants using the PostgresXC Meeting Wiki Page: https://wiki.postgresql.org/wiki/Pgcon2014PostgresXCmeeting For those attending both meetings who RSVP to both, box lunches will be provided.

PostgresXC Pizza Demo

The evening of Tuesday, May 20th, we will follow up our meetings with a demo of PostgresXC's current capabilities over pizza. All pgCon attendees are invited to join us. Location and exact time TBD; check the pgCon schedule for updates. All of the above is sponsored by NTT Open Source. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Comment - uniqueness of relfilenode
On Thu, Mar 6, 2014 at 08:34:29AM +0100, Antonin Houska wrote: You're right. I missed the fact that Postgres (unlike another DBMS that I worked with) allows a tablespace to be shared across databases. I have updated the C comment:

  * Notice that relNode is only unique within a particular database.
---
  * Notice that relNode is only unique within a particular tablespace.

Yep. But the new text is no more correct than the old text. Did you read what I wrote upthread? Perhaps "... unique within a particular (tablespace, database) combination."? Oh, I thought people just wanted the text to be consistent with other mentions, rather than improving it. Attached is a patch which updates comments about the tablespace/database issue. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +

diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index e3002eb..282e0b9
*** a/src/backend/catalog/catalog.c
--- b/src/backend/catalog/catalog.c
*** GetNewOidWithIndex(Relation relation, Oi
*** 408,415 ****
  /*
   * GetNewRelFileNode
!  *	Generate a new relfilenode number that is unique within the given
!  *	tablespace.
   *
   * If the relfilenode will also be used as the relation's OID, pass the
   * opened pg_class catalog, and this routine will guarantee that the result
--- 408,415 ----
  /*
   * GetNewRelFileNode
!  *	Generate a new relfilenode number that is unique within the
!  *	database of the given tablespace.
   *
   * If the relfilenode will also be used as the relation's OID, pass the
   * opened pg_class catalog, and this routine will guarantee that the result
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1de3170..25f01e5
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*** ATExecSetTableSpace(Oid tableOid, Oid ne
*** 9078,9085 ****
  	FlushRelationBuffers(rel);
  	/*
! 	 * Relfilenodes are not unique across tablespaces, so we need to allocate
! 	 * a new one in the new tablespace.
  	 */
  	newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
  									   rel->rd_rel->relpersistence);
--- 9078,9085 ----
  	FlushRelationBuffers(rel);
  	/*
! 	 * Relfilenodes are not unique in databases across tablespaces, so we
! 	 * need to allocate a new one in the new tablespace.
  	 */
  	newrelfilenode = GetNewRelFileNode(newTableSpace, NULL,
  									   rel->rd_rel->relpersistence);
diff --git a/src/include/storage/relfilenode.h b/src/include/storage/relfilenode.h
index c995c10..8616bd3
*** a/src/include/storage/relfilenode.h
--- b/src/include/storage/relfilenode.h
*** typedef enum ForkNumber
*** 55,61 ****
   * relNode identifies the specific relation. relNode corresponds to
   * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
   * to assign new physical files to relations in some situations).
!  * Notice that relNode is only unique within a particular tablespace.
   *
   * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
   * zero. We support shared relations only in the global tablespace.
--- 55,62 ----
   * relNode identifies the specific relation. relNode corresponds to
   * pg_class.relfilenode (NOT pg_class.oid, because we need to be able
   * to assign new physical files to relations in some situations).
!  * Notice that relNode is only unique within a database in a particular
!  * tablespace.
   *
   * Note: spcNode must be GLOBALTABLESPACE_OID if and only if dbNode is
   * zero. We support shared relations only in the global tablespace.
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote:

Bruce Momjian escribió:
Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that.

That sounds odd. Do you have a link?

Sure, the patch is here:

	http://www.postgresql.org/message-id/20130629025033.gi13...@momjian.us

and the idea of keeping what we have is stated here:

	http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas robertmh...@gmail.com wrote:

On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:
On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas robertmh...@gmail.com wrote:
I think this isn't a good design. Per the discussion between Andres and I, what I think you should do is make ALTER TABLE .. SET LOGGED work just like VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode. If you do it that way, this will be less efficient, but much simpler, and you might actually finish it in one summer.

Do it like 'VACUUM FULL' for any wal_level?

Yep. Anything else appears to be a research problem.

Updated proposal:

proposal

** Add to PostgreSQL the capacity to make an unlogged table logged **

Introduction

This project will allow changing an unlogged table (one that does not generate transaction log, i.e. WAL) and its dependencies into a logged table, in other words, a regular table that generates WAL. To make this happen we'll introduce a new SQL syntax:

	ALTER TABLE name SET LOGGED;

Benefits to the PostgreSQL Community

The unlogged tables feature was introduced in version 9.1 and provides better write performance than regular (logged) tables, but unlogged tables are not crash-safe. Their contents are automatically discarded (cleared) after a server crash, and their contents do not propagate to replication standbys, either. The ability to turn an unlogged table into a logged one gives us the best of both features: we can use an unlogged table to run a bulk load of thousands of rows (ETL scripts) with better performance, and then change it to a logged table to get durability for the loaded data.

Deliverables

This project has just one deliverable at the end: the implementation of the routines that transform an unlogged table into a logged one, using the same algorithm as VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode.

Project Schedule

until May 19:
* create a website for the project (wiki.postgresql.org)
* create a public repository for the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project (http://wiki.postgresql.org/wiki/Todo)
* as already discussed on the pgsql-hackers mailing list, this feature will be implemented similarly to VACUUM FULL, with the exception that it will set a different relpersistence for the new relfilenode
* learn about some PostgreSQL internals:
  . grammar (src/backend/parser/gram.y)
  . vacuum full (src/backend/commands/[vacuum.c | cluster.c])

May 19 - June 23:
* implementation of the first prototype:
  . change the PostgreSQL grammar to support ALTER TABLE ... SET LOGGED
  . implement and/or adapt the routines that change an unlogged table to logged (similar to VACUUM FULL)
* write documentation and test cases
* submit the first prototype to the 2014/06 commitfest (https://commitfest.postgresql.org/action/commitfest_view?id=22)

June 23 - June 27:
* review with the mentor the work done so far

June 27 - August 18:
* make adjustments based on the community feedback from the 2014/06 commitfest
* submit to the 2014/09 commitfest for final evaluation; maybe it will be committed for the 9.5 version (webpage not created yet)

August 18 - August 22:
* final review with the mentor of all work done

About the proponent

Fabrízio de Royes Mello
e-mail: fabriziome...@gmail.com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello

Currently I help people and teams get the full potential out of relational databases, especially PostgreSQL: helping teams design the structure of the database (modeling), build the physical architecture (database schema), do programming (procedural languages), SQL (usage, tuning, best practices), optimization, and orchestration of production instances. I do volunteer work for the Brazilian PostgreSQL community (www.postgresql.org.br), supporting mailing lists, organizing events (pgbr.postgresql.org.br), and some admin tasks. I also help the PostgreSQL Global Development Group (PGDG) a little with the implementation of some features and review of patches (git.postgresql.org).

/proposal

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
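A usage sketch of the syntax the proposal introduces (hypothetical only: at the time of this proposal ALTER TABLE ... SET LOGGED does not exist, and the table and file names are made-up examples):

```sql
-- Bulk-load into an unlogged table (fast: no WAL is written), then
-- make it durable.  SET LOGGED would rewrite the table into a new
-- relfilenode, VACUUM FULL style, with logged relpersistence.
CREATE UNLOGGED TABLE etl_staging (id int, payload text);
COPY etl_staging FROM '/tmp/batch.csv' (FORMAT csv);
ALTER TABLE etl_staging SET LOGGED;  -- proposed syntax
```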
Re: [HACKERS] GSoC proposal - make an unlogged table logged
Fabrízio de Royes Mello fabriziome...@gmail.com writes:

Do you think it is difficult to implement ALTER TABLE ... SET UNLOGGED too? Thinking in the scope of one GSoC, of course.

I think it's basically the same thing. You might hope to optimize it; but you have to create (rather than remove) an init fork, and there's no way to do that in exact sync with the commit. So for safety I think you have to copy the data into a new relfilenode.

			regards, tom lane
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 7:46 PM, Bruce Momjian br...@momjian.us wrote:

On Thu, Mar 6, 2014 at 09:54:57AM +0530, Amit Kapila wrote:
If they haven't passed us a data directory, we don't really know if the server is running or not, so the patch just returns '1'.

But for such cases, isn't status 4 more appropriate? As per the above link:
	4	program or service status is unknown

status 1:
	1	program is dead and /var/run pid file exists

Going by this definition, status 1 seems to mean someone has forcefully killed the server and the pid file still remains.

Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that. I don't see how specifying a non-existent or non-cluster directory would somehow be a case that would be special.

One reason could be that, as we are already returning a special exit code for the 'status' option of pg_ctl (exit(3)), this seems to be in line with it, since this code path also gets called during the status command. Also, in the link you sent in another mail, there is a statement indicating it is more important to return correct codes in the status case, which sounds like a good reason to me.

Link and statement:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net
"The status case is different, because there the exit code can be passed out by the init script directly."

If we just want to handle correct exit codes for the status option, then maybe we need to refactor the code a bit to ensure that.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_ctl status with nonexistent data directory
Bruce Momjian br...@momjian.us writes:

On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote:
Bruce Momjian escribió:
Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that.

That sounds odd. Do you have a link?

Sure, the patch is here:
http://www.postgresql.org/message-id/20130629025033.gi13...@momjian.us
and the idea of keeping what we have is stated here:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net

Perhaps I shouldn't be putting words in Peter's mouth, but my reading of his complaint was that he didn't think you'd mapped the pg_ctl failure conditions to LSB status codes very well. That's not necessarily a vote against the abstract idea of making it more LSB-compliant. But it seems like we might have to go through it case-by-case to argue out what's the right error code for each case ... and I'm not sure anybody thinks it's worth that much effort.

			regards, tom lane
Re: [HACKERS] GSoC proposal - make an unlogged table logged
On Fri, Mar 7, 2014 at 12:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Fabrízio de Royes Mello fabriziome...@gmail.com writes:
Do you think it is difficult to implement ALTER TABLE ... SET UNLOGGED too? Thinking in the scope of one GSoC, of course.

I think it's basically the same thing. You might hope to optimize it; but you have to create (rather than remove) an init fork, and there's no way to do that in exact sync with the commit. So for safety I think you have to copy the data into a new relfilenode.

Well, the same thing that VACUUM FULL does, but changing relpersistence to RELPERSISTENCE_UNLOGGED for the new relfilenode. Is that it?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Thu, Mar 6, 2014 at 10:43:01PM -0500, Tom Lane wrote:

Bruce Momjian br...@momjian.us writes:
On Thu, Mar 6, 2014 at 12:17:55PM -0300, Alvaro Herrera wrote:
Bruce Momjian escribió:
Technically, you are right, but I tried a while ago to assign meaningful values to all the exit locations and the community feedback I got was that we didn't want that.

That sounds odd. Do you have a link?

Sure, the patch is here:
http://www.postgresql.org/message-id/20130629025033.gi13...@momjian.us
and the idea of keeping what we have is stated here:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net

Perhaps I shouldn't be putting words in Peter's mouth, but my reading of his complaint was that he didn't think you'd mapped the pg_ctl failure conditions to LSB status codes very well. That's not necessarily a vote against the abstract idea of making it more LSB-compliant. But it seems like we might have to go through it case-by-case to argue out what's the right error code for each case ... and I'm not sure anybody thinks it's worth that much effort.

Yes, I think the question was whether the effort was worth it.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] pg_ctl status with nonexistent data directory
On Fri, Mar 7, 2014 at 09:07:24AM +0530, Amit Kapila wrote:

One reason could be that, as we are already returning a special exit code for the 'status' option of pg_ctl (exit(3)), this seems to be in line with it, since this code path also gets called during the status command. Also, in the link you sent in another mail, there is a statement indicating it is more important to return correct codes in the status case, which sounds like a good reason to me.

Link and statement:
http://www.postgresql.org/message-id/51d1e482.5090...@gmx.net
"The status case is different, because there the exit code can be passed out by the init script directly."

If we just want to handle correct exit codes for the status option, then maybe we need to refactor the code a bit to ensure that.

OK, done with the attached patch. Three is returned for status, but one for other cases.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + Everyone has their own god. +

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
new file mode 100644
index 0dbaa6e..8c9970d
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*** static bool allow_core_files = false;
*** 97,102 ****
--- 97,103 ----
  static time_t start_time;

  static char postopts_file[MAXPGPATH];
+ static char version_file[MAXPGPATH];
  static char pid_file[MAXPGPATH];
  static char backup_file[MAXPGPATH];
  static char recovery_file[MAXPGPATH];
*** static void pgwin32_doRunAsService(void)
*** 152,158 ****
  static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
  #endif
! static pgpid_t get_pgpid(void);
  static char **readfile(const char *path);
  static void free_readfile(char **optlines);
  static int	start_postmaster(void);
--- 153,159 ----
  static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
  #endif
! static pgpid_t get_pgpid(bool is_status_request);
  static char **readfile(const char *path);
  static void free_readfile(char **optlines);
  static int	start_postmaster(void);
*** print_msg(const char *msg)
*** 246,255 ****
  }

  static pgpid_t
! get_pgpid(void)
  {
  	FILE	   *pidf;
  	long		pid;

  	pidf = fopen(pid_file, "r");
  	if (pidf == NULL)
--- 247,275 ----
  }

  static pgpid_t
! get_pgpid(bool is_status_request)
  {
  	FILE	   *pidf;
  	long		pid;
+ 	struct stat statbuf;
+
+ 	if (stat(pg_data, &statbuf) != 0)
+ 	{
+ 		if (errno == ENOENT)
+ 			printf(_("%s: directory \"%s\" does not exist\n"), progname,
+ 				   pg_data);
+ 		else
+ 			printf(_("%s: cannot access directory \"%s\"\n"), progname,
+ 				   pg_data);
+ 		exit(is_status_request ? 3 : 1);
+ 	}
+
+ 	if (stat(version_file, &statbuf) != 0 && errno == ENOENT)
+ 	{
+ 		printf(_("%s: directory \"%s\" is not a database cluster directory\n"),
+ 			   progname, pg_data);
+ 		exit(is_status_request ? 3 : 1);
+ 	}

  	pidf = fopen(pid_file, "r");
  	if (pidf == NULL)
*** do_start(void)
*** 810,816 ****
  	if (ctl_command != RESTART_COMMAND)
  	{
! 		old_pid = get_pgpid();
  		if (old_pid != 0)
  			write_stderr(_("%s: another server might be running; trying to start server anyway\n"),
--- 830,836 ----
  	if (ctl_command != RESTART_COMMAND)
  	{
! 		old_pid = get_pgpid(false);
  		if (old_pid != 0)
  			write_stderr(_("%s: another server might be running; trying to start server anyway\n"),
*** do_stop(void)
*** 894,900 ****
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid();

  	if (pid == 0)				/* no pid file */
  	{
--- 914,920 ----
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid(false);

  	if (pid == 0)				/* no pid file */
  	{
*** do_stop(void)
*** 943,949 ****
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid()) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
--- 963,969 ----
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid(false)) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
*** do_restart(void)
*** 980,986 ****
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid();

  	if (pid == 0)				/* no pid file */
  	{
--- 1000,1006 ----
  	pgpid_t		pid;
  	struct stat statbuf;

! 	pid = get_pgpid(false);

  	if (pid == 0)				/* no pid file */
  	{
*** do_restart(void)
*** 1033,1039 ****
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid()) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
--- 1053,1059 ----
  	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
! 		if ((pid = get_pgpid(false)) != 0)
  		{
  			print_msg(".");
  			pg_usleep(1000000);	/* 1 sec */
*** do_reload(void)
*** 1071,1077 ****
  {
  	pgpid_t		pid;

! 	pid = get_pgpid();

  	if (pid == 0)				/* no pid file */
  	{
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 09:50:56PM +0400, Oleg Bartunov wrote:

Hi there,

Looks like consensus is done. I and Teodor are not happy with it, but what can we do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard.

OK, if we are going with an unchanged hstore in contrib and a new JSONB, there is no reason to whack around JSONB to be binary compatible with the old hstore format. What sacrifices did we need to make to have JSONB be binary compatible with hstore? Can those sacrifices be removed, and can that be done in time for 9.4?

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 04:33:08PM +0100, Ronan Dunklau wrote:

I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify.

I haven't followed the whole thread, but json is really restrictive on the supported types: a hierarchical hstore could maybe support more types (timestamp comes to mind) as its values, which is not a valid data type in the json spec.

Yes, I can see this as an idea for a new data type that allows hierarchical storage of key/value pairs where the value can be any Postgres data type. It wouldn't be called hstore, or hstore2, but something else.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com
Re: [HACKERS] Securing make check (CVE-2014-0067)
On Thu, Mar 06, 2014 at 12:44:34PM -0500, Tom Lane wrote:

Noah Misch n...@leadboat.com writes:
Thanks. To avoid socket path length limitations, I lean toward placing the socket temporary directory under /tmp rather than placing it under the CWD:

I'm not thrilled with that; it's totally insecure on platforms where /tmp isn't sticky, so it doesn't seem like an appropriate solution given that this discussion is now being driven by security concerns.

http://www.postgresql.org/message-id/flat/20121129223632.ga15...@tornado.leadboat.com

I re-read that thread. While we did fix the reporting end of it, i.e. the postmaster will now give you a clear failure message if your socket path is too long, that's going to be cold comfort to anyone who has to build in an environment they don't have much control over (such as my still-hypothetical-I-hope scenario about Red Hat package updates). I'm inclined to suggest that we should put the socket under $CWD by default, but provide some way for the user to override that choice. If they want to put it in /tmp, it's on their head as to how secure that is. On most modern platforms it'd be fine.

I am skeptical about the value of protecting systems with a non-sticky /tmp, but a long $CWD isn't of great importance, either. I'm fine with your suggestion. Though the $CWD or one of its parents could be world-writable, that would typically mean an attacker could just replace the test cases directly.

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
[HACKERS] atexit_callback can be a net negative
Back in commit 249a899f7, we introduced an atexit callback that forced backend process cleanup to occur if some random backend plugin (I'm looking at you, plperl and plpython) executed exit(). While that seemed like a great safety feature at the time, bug #9464
http://www.postgresql.org/message-id/flat/20140307005623.1918.22...@wrigleys.postgresql.org
shows that this can actually destabilize things rather than improve them. The issue is that code that does fork() and then does an exit() in the subprocess will run the backend's atexit callback, which is The Wrong Thing since the parent backend is (probably) still running.

In the bug thread I proposed making atexit_callback check whether getpid() still matches MyProcPid. If it doesn't, then presumably we inherited the atexit callback list, along with the value of MyProcPid, from some parent backend process whose elbow we should not joggle. Can anyone see a flaw in that?

There's an interesting connection here to the existing coding rule that child processes of the postmaster have to do on_exit_reset() immediately after being forked, to avoid running the postmaster's on_exit routines unintentionally. We've not seen any reported breakdowns in that scheme, but I wonder if we ought to handle it differently, seeing as how people keep coming up with new randomness that they want to load into the postmaster.

			regards, tom lane
Re: [HACKERS] atexit_callback can be a net negative
On Fri, Mar 7, 2014 at 2:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:

In the bug thread I proposed making atexit_callback check whether getpid() still matches MyProcPid. If it doesn't, then presumably we inherited the atexit callback list, along with the value of MyProcPid, from some parent backend process whose elbow we should not joggle. Can anyone see a flaw in that?

While my answer would be "not really" (lots of python libraries do the same to handle forks), there's an optional path: pthread_atfork.
Re: [HACKERS] Securing make check (CVE-2014-0067)
Noah Misch n...@leadboat.com writes:

On Thu, Mar 06, 2014 at 12:44:34PM -0500, Tom Lane wrote:
I'm inclined to suggest that we should put the socket under $CWD by default, but provide some way for the user to override that choice. If they want to put it in /tmp, it's on their head as to how secure that is. On most modern platforms it'd be fine.

I am skeptical about the value of protecting systems with a non-sticky /tmp, but a long $CWD isn't of great importance, either. I'm fine with your suggestion. Though the $CWD or one of its parents could be world-writable, that would typically mean an attacker could just replace the test cases directly.

If the build tree is world-writable, that is clearly Not Our Fault.

			regards, tom lane
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On 03/06/2014 02:58 AM, Tom Lane wrote:

Craig Ringer cr...@hobby.2ndquadrant.com writes:
One of the remaining issues with row security is how to pass plan invalidation information generated in the rewriter back into the planner. With row security, it's necessary to set a field in PlannerGlobal, tracking the user ID of the user the query was planned for if row security was applied. It is also necessary to add a PlanInvalItem for the user ID.

TBH I'd just add a user OID field in struct Query and not hack up a bunch of existing function APIs. It's not much worse than the existing constraintDeps field.

If you're happy with that, I certainly won't complain. It's much simpler and less intrusive. I should be able to post an update using this later today.

The PlanInvalItem could perfectly well be generated by the planner, no, if it has the user OID? But I'm not real sure why you need it. I don't see the reason for an invalidation triggered by user ID. What exactly about the *user*, and not something else, would trigger plan invalidation?

It's only that the plan depends on the user ID. There's no point keeping the plan around if the user no longer exists. You're quite right that this can be done in the planner when a dependency on the user ID is found, though. So there's no need to pass a PlanInvalItem down, which is a lot nicer.

What we do need is a notion that a plan cache entry might only be valid for a specific calling user ID; but that's a matter for cache entry lookup, not for subsequent invalidation.

Yes, that would be good, but is IMO more of a separate optimization. I'm currently using KaiGai's code to invalidate and re-plan when a user ID change is detected.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] syslog_ident mentioned as syslog_identify in the docs
Hi all,

In the documentation, particularly the doc index, syslog_ident is incorrectly mentioned as syslog_identity. The attached patch fixes that. This error has been in the docs since 8.0.

Regards,
--
Michael

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 3743,3749 ****
  local0.*    /var/log/postgresql

      <varlistentry id="guc-syslog-ident" xreflabel="syslog_ident">
       <term><varname>syslog_ident</varname> (<type>string</type>)</term>
       <indexterm>
!       <primary><varname>syslog_identity</> configuration parameter</primary>
       </indexterm>
       <listitem>
        <para>
--- 3743,3749 ----
      <varlistentry id="guc-syslog-ident" xreflabel="syslog_ident">
       <term><varname>syslog_ident</varname> (<type>string</type>)</term>
       <indexterm>
!       <primary><varname>syslog_ident</> configuration parameter</primary>
       </indexterm>
       <listitem>
        <para>
Re: [HACKERS] gaussian distribution pgbench
Hi,

(2014/03/04 17:42), KONDO Mitsumasa wrote:
(2014/03/04 17:28), Fabien COELHO wrote:
OK. I'm not sure which idea is the best. So I'll wait for comments from the community :)

Hmmm. Maybe you can do what Tom voted for, he is the committer :-)

Yeah, but he might change his mind based on our discussion. So I'll wait until tomorrow, and if there are no comments, I will start to implement what Tom voted for.

I have created a patch with the fixed UI. If we agree on this interface, I will also start to fix the documentation. The new \setrandom interface is:

	\setrandom var min max [gaussian threshold | exponential threshold]

The attached patch implements this interface, but it has slightly ugly coding in executeStatement() and process_commands(), along these lines:

	if (argc == 4)
	{
		... /* uniform */
	}
	else if (argv[4] == "gaussian" or "exponential")
	{
		... /* gaussian or exponential */
	}
	else
	{
		... /* uniform with extra arguments */
	}

This is because a pgbench custom script allows extra comments or extra arguments in its file. For example, the following cases are all fine:

	\setrandom var min max #hoge                      -- uniform random
	\setrandom var min max #hoge1 #hoge2              -- uniform random
	\setrandom var min max gaussian threshold #hoge   -- gaussian random

And other cases are classified as follows:

	\setrandom var min max gaussian #hoge             -- uniform
	\setrandom var min max max2 gaussian threshold    -- uniform
	\setrandom var min gaussian #hoge                 -- ERROR

However, if we get the grammar wrong in a pgbench custom script, pgbench prints an error log on the user's terminal, so I think this is not really a problem. What do you think?
Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** static int pthread_join(pthread_t th, void **thread_return);
*** 98,103 ****
--- 98,106 ----
  #define LOG_STEP_SECONDS	5	/* seconds between log messages */
  #define DEFAULT_NXACTS	10		/* default nxacts */

+ #define MIN_GAUSSIAN_THRESHOLD		2.0	/* minimum threshold for gauss */
+ #define MIN_EXPONENTIAL_THRESHOLD	2.0	/* minimum threshold for exp */
+
  int			nxacts = 0;			/* number of transactions per client */
  int			duration = 0;		/* duration in seconds */
***************
*** 169,174 ****
--- 172,185 ----
  bool		is_connect;			/* establish connection for each transaction */
  bool		is_latencies;		/* report per-command latencies */
  int			main_pid;			/* main process id used in log filename */

+ /* gaussian distribution tests: */
+ double		stdev_threshold;	/* standard deviation threshold */
+ bool		use_gaussian = false;
+
+ /* exponential distribution tests: */
+ double		exp_threshold;		/* threshold for exponential */
+ bool		use_exponential = false;
+
  char	   *pghost = "";
  char	   *pgport = "";
  char	   *login = NULL;
*** static char *select_only = {
*** 330,335 ****
--- 341,428 ----
  	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
  };

+ /* --exponential case */
+ static char *exponential_tpc_b = {
+ 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ 	"\\setrandom bid 1 :nbranches\n"
+ 	"\\setrandom tid 1 :ntellers\n"
+ 	"\\setrandom delta -5000 5000\n"
+ 	"BEGIN;\n"
+ 	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ 	"UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n"
+ 	"UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n"
+ 	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ 	"END;\n"
+ };
+
+ /* --exponential with -N case */
+ static char *exponential_simple_update = {
+ 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ 	"\\setrandom bid 1 :nbranches\n"
+ 	"\\setrandom tid 1 :ntellers\n"
+ 	"\\setrandom delta -5000 5000\n"
+ 	"BEGIN;\n"
+ 	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ 	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ 	"END;\n"
+ };
+
+ /* --exponential with -S case */
+ static char *exponential_select_only = {
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ };
+
+ /* --gaussian case */
+ static char *gaussian_tpc_b = {
+ 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ 	"\\setrandom aid
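Under the proposed interface, a custom script exercising both new distributions might look like this (a sketch of the proposed syntax only — the feature is not in released pgbench, and the threshold values here are arbitrary examples):

```
-- hypothetical pgbench custom script using the proposed \setrandom forms
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts gaussian 5.0       -- values cluster around the middle of the range
\setrandom bid 1 :naccounts exponential 2.0    -- values skew toward the low end
\setrandom tid 1 10                            -- plain uniform, unchanged syntax
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
```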
Re: [HACKERS] gaussian distribution pgbench
(2014/03/07 16:02), KONDO Mitsumasa wrote:
And other cases are classified as follows:
	\setrandom var min max gaussian #hoge    -- uniform

Oh, that's wrong... It should be:

	\setrandom var min max gaussian #hoge    -- ERROR

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center