Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format
On 11/16/2012 03:05 PM, Andres Freund wrote: I'd like to provide a comparison of the proposed change set format to the one used in Postgres-R. Uh, sorry to interrupt you right here, but thats not the proposed format ;) Understood. Sorry, I didn't mean to imply that. It's pretty obvious to me that this is more of a human readable format and that others, including binary formats, can be implemented. I apologize for the bad wording of a proposed format, which doesn't make that clear. The Postgres-R approach is independent of WAL and its format, where as the approach proposed here clearly is not. Either way, there is a certain overhead - however minimal it is - which the former adds to the transaction processing itself, while the later postpones it to a separate XLogReader process. If there's any noticeable difference, it might reduce latency in case of asynchronous replication, but can only increase latency in the synchronous case. As far as I understood Andres, it was easier to collect the additional meta data from within the separate process. There also is the point that if you do the processing inside heap_* you need to make sure the replication targeted data is safely received fsynced away, in our case thats not necessary as WAL already provides crash safety, so should the replication connection break you can simply start from the location last confirmed as being safely sent. In the case of Postgres-R, the safely received part isn't really handled at the change set level at all. And regarding the fsync guarantee: you can well use the WAL to provide that, without basing change set generation on in. In that regard, Postgres-R is probably the more general approach: you can run Postgres-R with WAL turned off entirely - which may well make sense if you take into account the vast amount of cloud resources available, which don't have a BBWC. Instead of WAL, you can add more nodes at more different locations. And no, you don't want your database to ever go down, anyway :-) In summary, I'd say that Postgres-R is an approach specifically targeting and optimized for multi-master replication between Postgres nodes, where as the proposed patches are kept more general. One major aim definitely was optionally be able to replicate into just about any target system, so yes, I certainly agree. I'm glad I got that correct ;-) Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format
On 11/16/2012 03:14 PM, Andres Freund wrote: Whats the data type of the COID in -R? It's short for CommitOrderId, a 32bit global transaction identifier, being wrapped-around, very much like TransactionIds are. (In that sense, it's global, but unique only for a certain amount of time). In the patchset the output plugin has enough data to get the old xid and the new xid in the case of updates (not in the case of deletes, but thats a small bug and should be fixable with a single line of code), and it has enough information to extract the primary key without problems. It's the xmin of the old tuple that Postgres-R needs to get the COID. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 05/14] Add a new relmapper.c function RelationMapFilenodeToOid that acts as a reverse of RelationMapOidToFilenode
On Fri, Nov 16, 2012 at 7:58 PM, Andres Freund and...@2ndquadrant.comwrote: Hi, On 2012-11-16 13:44:45 +0900, Michael Paquier wrote: This patch looks OK. I got 3 comments: 1) Why changing the OID of pg_class_tblspc_relfilenode_index from 3171 to 3455? It does not look necessary. Its a mismerge and should have happened in Add a new RELFILENODE syscache to fetch a pg_class entry via (reltablespace, relfilenode) but it seems I squashed the wrong two commits. I had originally used 3171 but that since got used up for lo_tell64... 2) You should perhaps change the header of RelationMapFilenodeToOid so as not mentionning it as the opposite operation of RelationMapOidToFilenode but as an operation that looks for the OID of a relation based on its relfilenode. Both functions are opposite but independent. I described it as the opposite because RelationMapOidToFilenode is the relmappers stated goal and RelationMapFilenodeToOid is just some side-business. 3) Both functions are doing similar operations. Could it be possible to wrap them in the same central function? I don't really see how without making both quite a bit more complicated. The amount of if's needed seems to be too large to me. OK thanks for your answers. As this patch only adds a new function and is not that much complicated, I think there is no problem in committing it. The only thing to remove is the diff in indexing.h. Could someone take care of that? If other people have additional comments on the ability to perform a relfileoid-reloid operation for cached maps, of course go ahead. -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] Do we need so many hint bits?
On 2012-11-16 17:19:23 -0800, Jeff Davis wrote: On Fri, 2012-11-16 at 16:09 +0100, Andres Freund wrote: As far as I understand the code the crash-safety aspects of the visibilitymap currently rely on on having the knowledge that ALL_VISIBLE has been cleared during a heap_(insert|update|delete). That allows management of the visibilitymap without it being xlogged itself which seems pretty important to me. It looks like the xlrec does contain a cleared all visible flag in it, and it uses that to clear the VM as well as PD_ALL_VISIBLE. I think the point is that to check whether the visibilitymap bit needs to be unset - and thus locked exlusively - no locks have to be acquired but those heap_* already has. Given that in a the large amount of cases ALL_VISIBLE does *not* need to be reset I think that this is a very important property for concurrency purposes. If you consider the large amount of pages that are covered by a single visibilitymap page we don't want more locking in that path... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL
On Friday, November 16, 2012 7:52 PM Cédric Villemain wrote: Le vendredi 16 novembre 2012 15:08:30, Amit Kapila a écrit : On Thursday, November 15, 2012 8:18 PM Amit kapila wrote: On Wednesday, November 14, 2012 12:24 AM Robert Haas wrote: On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila amit.kap...@huawei.com wrote: Uh, no, I don't think that's a good idea. IMHO, what we should do is: 1. Read postgresql.conf.auto and remember all the settings we saw. If we see something funky like an include directive, barf. 2. Forget the value we remembered for the particular setting being changed. Instead, remember the user-supplied new value for that parameter. 3. Write a new postgresql.conf.auto based on the information remembered in steps 1 and 2. Attached patch contains implementaion for above concept. It can be changed to adapt the write file based on GUC variables as described by me yesterday or in some other better way. Currenty to remember and forget, I have used below concept: 1. Read postgresql.auto.conf in memory. 2. parse the GUC_file for exact loction of changed variable 3. update the changed variable in memory at location found in step-2 4. Write the postgresql.auto.conf Overall changes: 1. include dir in postgresql.conf at time of initdb 2. new built-in function pg_change_conf to change configuration settings 3. write file as per logic described above. Some more things left are: 1. Transactional capability to command, so that rename of .lock file to .auto.conf can be done at commit time. About transaction capability, I think it will be difficult to implement it in transaction block, because during Rollback to savepoint it is difficult to rollback (delete the file), as there is no track of changes w.r.t Savepoint. not a problem. consider that pseudo code: begin serializable; update pg_settings; -- or whatever the name of the object (can include creation of a table, etc...) savepoint... update pg_settings; rollback to savepoint; commit; -- here the deferred trigger FOR STATEMENT on pg_settings is fired and is responsible to write/mv the/a file. Is there something obvious I'm not seeing ? I think transaction handling is better with the way you are mentioning. Here is what I am able to think about your idea: 1. have a system table pg_global_system_settings(key,value) 2. On SQL command execution, insert if the value doesn't exist or update if already existing. 3. On commit, a deffered trigger will read from table and put all the rows in a .auto flat file 4. In the deffered trigger, may be we need to use lock for writting to file, so that 2 backends writting same time may not garbled the file. I am not sure if lock is required or not? Advantages of this approach: 1. transaction handling can be better. 2. updation of config value row can be easier Problem which needs to be thought Sychronization between flat file .auto.conf and system table Case-1 a. During commit, we write into file (deffered trigger execution) before marking transaction as commit. b. after writting to file, any error or system crash, then table and file will have different contents. Case-2 a. During commit, we write into file (deffered trigger execution) after marking transaction as commit. b. any error or system crash before write into file can lead to different contents in table and flat file. Resolution May be during recovery, we can try to make table and file consistent, but it can be tricky. Comparison with Approach I have implemented 1. Because it will be done in serializable isolation, 2 users trying to modify the same value, will get error. However in current approach, user will not get this error. 2. The lock time will be lesser in system table approach but don't think it will matter because it is a rarely used command. I think, some other people thoughts are also required to see if there is any deeper design issue which I could not see in this approach and whether it can clearly score over approach with which currently it is implemented(directly operate on a file). Suggestions/Thoughts? With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] another idea for changing global configuration settings from SQL
On Saturday, November 17, 2012 3:35 AM Dimitri Fontaine wrote: Tom Lane t...@sss.pgh.pa.us writes: Have you considered ALTER SYSTEM SET ... ? We'd talked about that in the context of the other patch, but it seems to fit much more naturally with this one. Or maybe ALTER GLOBAL SET or ALTER ALL SET. I would paint that one ALTER SYSTEM SET and the file based one ALTER CONFIGURATION SET. No new keyword were armed in that proposal. One more could be to have built-in function pg_change_config(level,key,value) level - PG_NEW_CONNECTION - PG_SYTEM_LEVEL Level will distinguish how and when the value will be used. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format
On 11/16/2012 02:46 PM, Markus Wanner wrote: Andres, On 11/15/2012 01:27 AM, Andres Freund wrote: In response to this you will soon find the 14 patches that currently implement $subject. Congratulations on that piece of work. I'd like to provide a comparison of the proposed change set format to the one used in Postgres-R. I hope for this comparison to shed some light on the similarities and differences of the two projects. As the author of Postgres-R, I'm obviously biased, but I try to be as neutral as I can. ... Let's compare by example: table replication_example: INSERT: id[int4]:1 somedata[int4]:1 text[varchar]:1 table replication_example: UPDATE: id[int4]:1 somedata[int4]:-1 text[varchar]:1 table replication_example: DELETE (pkey): id[int4]:1 In Postgres-R, the change sets for these same operations would carry the following information, in a binary representation: table replication_example: INSERT: VALUES (1, 1, '1') table replication_example: UPDATE: PKEY(1) COID(77) MODS('nyn') VALUES(-1) table replication_example: DELETE: PKEY(1) COID(78) Is it possible to replicate UPDATEs and DELETEs without a primary key in PostgreSQL-R Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format
On 11/17/2012 02:30 PM, Hannu Krosing wrote: Is it possible to replicate UPDATEs and DELETEs without a primary key in PostgreSQL-R No. There must be some way to logically identify the tuple. Note, though, that theoretically any (unconditional) unique key would suffice. In practice, that usually doesn't matter, as you rarely have one or more unique keys without a primary. Also note that the underlying index is useful for remote application of change sets (except perhaps for very small tables). In some cases, for example for n:m linking tables, you need to add a uniqueness key that spans all columns (as opposed to a simple index on one of the columns that's usually required, anyway). I hope for index-only scans eventually mitigating this issue. Alternatively, I've been thinking about the ability to add a hidden column, which can then be used as a PRIMARY KEY without breaking legacy applications that rely on SELECT * not returning that primary key. Are there other reasons to want tables without primary keys that I'm missing? Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Parser - Query Analyser
Hello guys, My name is Michail Giannakopoulos and I am a graduate student at University of Toronto. I have no previous experience in developing a system like postgreSQL before. What I am trying to explore is if it is possible to extend postgreSQL in order to accept queries of the form: Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM [database_name]; where att1, att2, att3 are attributes of the relation [database_name] while output(out1, out2, out3) expresses the output that comes from 'function' and the fields that this output should have are (out1, out2, out3). What I mean is that this kind of query should return tuples that obay to the 'output(...)' schema and are produced by processing the original tuples of 'database_name' relation in attributes att1, att2 and att3. From what I have seen I need to teak both the parser and the query analyser in order to accept this form of query. However, I do not know how to tweak these systems. Even worse I do not know where to begin from. The manual of extending SQL did not helped me and currently I am debugging postgreSQL source code. I would appreciate if you could provide me with more resources and correct ways of how to hack inside postgreSQL files and system. For example, where do I add a new parse node, where should I change the the grammar. Thank you very much for all your time and all your help, Michail
Re: [HACKERS] foreign key locks
On 2012-11-16 22:31:51 -0500, Noah Misch wrote: On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote: On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote: Andres is on the verge of convincing me that we need to support singleton FOR SHARE without multixacts due to performance concerns. I don't really see any arguments against doing so. We aren't in a that big shortage of flags and if we need more than available I think we can free some (e.g. XMAX/XMIN_INVALID). The patch currently leaves two unallocated bits. Reclaiming currently-used bits means a binary compatibility break. Until we plan out such a thing, reclaimable bits are not as handy as never-allocated bits. I don't think we should lightly expend one of the final two. Not sure what you mean with a binary compatibility break? pg_upgrade'ability? What I previously suggested somewhere was: #define HEAP_XMAX_SHR_LOCK0x0010 #define HEAP_XMAX_EXCL_LOCK 0x0040 #define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK) /* * Different from _LOCK_BITS because it doesn't include LOCK_ONLY */ #define HEAP_LOCK_MASK(HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK) #define HEAP_XMAX_IS_SHR_LOCKED(tup) \ (((tup)-t_infomask HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK) #define HEAP_XMAX_IS_EXCL_LOCKED(tup) \ (((tup)-t_infomask HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK) #define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \ (((tup)-t_infomask HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK) It makes checking for locks sightly more more complicated, but its not too bad... It would be useful for more people to chime in here: is FOR SHARE an important case to cater for? I wonder if using FOR KEY SHARE (keep performance characteristics, but would need to revise application code) would satisfy Andres' users, for example. It definitely wouldn't help in the cases I have seen because the point is to protect against actual content changes of the rows, not just the keys. Note that you actually need to use explicit FOR SHARE/UPDATE for correctness purposes in many scenarios unless youre running in 9.1+ serializable mode. And that cannot be used in some cases (try queuing for example) because the rollback rates would be excessive. I agree that tripling FOR SHARE cost is risky. Where is the added cost concentrated? Perchance that multiple belies optimization opportunities. Good question, let me play a bit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format
On 11/17/2012 03:00 PM, Markus Wanner wrote: On 11/17/2012 02:30 PM, Hannu Krosing wrote: Is it possible to replicate UPDATEs and DELETEs without a primary key in PostgreSQL-R No. There must be some way to logically identify the tuple. It can be done as selecting on _all_ attributes and updating/deleting just the first matching row create cursor ... select from t ... where t.* = () fetch one ... delete where current of ... This is on distant (round 3 or 4) roadmap for this work, just was interested if you had found any better way of doing this :) Hannu Note, though, that theoretically any (unconditional) unique key would suffice. In practice, that usually doesn't matter, as you rarely have one or more unique keys without a primary. Also note that the underlying index is useful for remote application of change sets (except perhaps for very small tables). In some cases, for example for n:m linking tables, you need to add a uniqueness key that spans all columns (as opposed to a simple index on one of the columns that's usually required, anyway). I hope for index-only scans eventually mitigating this issue. Alternatively, I've been thinking about the ability to add a hidden column, which can then be used as a PRIMARY KEY without breaking legacy applications that rely on SELECT * not returning that primary key. Are there other reasons to want tables without primary keys that I'm missing? Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parser - Query Analyser
On Nov 17, 2012, at 9:18, Michael Giannakopoulos miccagi...@gmail.com wrote: Hello guys, My name is Michail Giannakopoulos and I am a graduate student at University of Toronto. I have no previous experience in developing a system like postgreSQL before. What I am trying to explore is if it is possible to extend postgreSQL in order to accept queries of the form: Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM [database_name]; Anything is possible but what you are trying to do makes little sense generally and would take a tremendous amount of work to be done in PostgreSQL. The two main limitations are that you are creating a entirely new query language format and that the name of the database is constant and determined at the time of connection to the database. From a practical perspective I do not believe it (as written exactly above) can done without breaking existing functionality and/or introducing ambiguities. As I am not a PostgreSQL developer myself I cannot be of much more help but ISTM that providing more why and less what would get you better advice. As to learning how to contribute to the project I will let others point you to the existing resources that are out there. It would, however, probably help to explain what skills and background you already posses. David J. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doc patch, put pg_temp into the documentation's index
Peter Eisentraut pete...@gmx.net writes: On Fri, 2012-09-28 at 11:10 -0500, Karl O. Pinc wrote: pg_temp-toindex.patch Puts pg_temp into the index of the docs. But there is no object called pg_temp. It always pg_temp_ something. How should that be indexed? We do replaceable/ in a lot of places, and that seems serviceable enough, at least in output formats where the can be rendered differently from plain text. I don't remember though whether the sgml index infrastructure allows markup in an index item. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign key locks
On Sat, Nov 17, 2012 at 03:20:20PM +0100, Andres Freund wrote: On 2012-11-16 22:31:51 -0500, Noah Misch wrote: On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote: On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote: Andres is on the verge of convincing me that we need to support singleton FOR SHARE without multixacts due to performance concerns. I don't really see any arguments against doing so. We aren't in a that big shortage of flags and if we need more than available I think we can free some (e.g. XMAX/XMIN_INVALID). The patch currently leaves two unallocated bits. Reclaiming currently-used bits means a binary compatibility break. Until we plan out such a thing, reclaimable bits are not as handy as never-allocated bits. I don't think we should lightly expend one of the final two. Not sure what you mean with a binary compatibility break? pg_upgrade'ability? Yes. If we decide HEAP_XMIN_INVALID isn't helping, we can stop adding it to tuples anytime. Old tuples may continue to carry the bit, with no particular deadline for their demise. To reuse that bit in the mean time, we'll need to prove that no tuple writable by PostgreSQL 8.3+ will get an unacceptable interpretation under the new meaning of the bit. Alternately, build the mechanism to prove that all such old bits are gone before using the bit in the new way. This keeps HEAP_MOVED_IN and HEAP_MOVED_OFF unavailable today. What I previously suggested somewhere was: #define HEAP_XMAX_SHR_LOCK0x0010 #define HEAP_XMAX_EXCL_LOCK 0x0040 #define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK) /* * Different from _LOCK_BITS because it doesn't include LOCK_ONLY */ #define HEAP_LOCK_MASK(HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK) #define HEAP_XMAX_IS_SHR_LOCKED(tup) \ (((tup)-t_infomask HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK) #define HEAP_XMAX_IS_EXCL_LOCKED(tup) \ (((tup)-t_infomask HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK) #define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \ (((tup)-t_infomask HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK) It makes checking for locks sightly more more complicated, but its not too bad... Agreed; that seems reasonable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format
On 11/17/2012 03:00 PM, Markus Wanner wrote: On 11/17/2012 02:30 PM, Hannu Krosing wrote: Is it possible to replicate UPDATEs and DELETEs without a primary key in PostgreSQL-R No. There must be some way to logically identify the tuple. Note, though, that theoretically any (unconditional) unique key would suffice. In practice, that usually doesn't matter, as you rarely have one or more unique keys without a primary. ... Are there other reasons to want tables without primary keys that I'm missing? Nope. The only place a table without a primary key would be needed is a log table, but as these are (supposed to be) INSERT-only this is not a problem for them. Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parser - Query Analyser
Michael Giannakopoulos miccagi...@gmail.com writes: What I am trying to explore is if it is possible to extend postgreSQL in order to accept queries of the form: Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM [database_name]; where att1, att2, att3 are attributes of the relation [database_name] while output(out1, out2, out3) expresses the output that comes from 'function' and the fields that this output should have are (out1, out2, out3). You're not being terribly clear about what you intend this to mean, but the standard interpretation of AS is that it just provides a column renaming and doesn't for instance change datatypes. If that's what you have in mind then it can be done today using AS in the FROM clause: select * from foo() AS output(out1, out2, ...); That doesn't allow passing data from a table to the function, but as of HEAD we have LATERAL, so you could do select output.* from tab, LATERAL foo(att1, att2) AS output(out1, out2, ...); If you really insist on doing the renaming within a single composite column in the SELECT output list then you're going to have a lot of issues. Column name aliases are normally only associated with RTEs (FROM-list entries) and SELECT output columns. Column names for columns of a composite data type are properties of the type and so are out of the reach of AS-renaming in the current system design. I think you'd have to cons up an anonymous record type and treat the AS as an implicit cast to that type. Seems like an awful lot of work in order to have a nonstandard way to do something that can be done already. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
Josh Berkus j...@agliodbs.com writes: You could make that same claim about plain views, but in point of fact the demand for making them work in COPY has been minimal. So I'm not convinced this is an essential first-cut feature. We can always add it later. Of course. I just had the impression that we could support COPY FROM by *deleting* a couple lines from Kevin's patch, rather than it being extra work. Even if it happens to be trivial in the current patch, it's an added functional requirement that we might later regret having cavalierly signed up for. And, as noted upthread, relations that support only one direction of COPY don't exist at the moment; that would be adding an asymmetry that we might later regret, too. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parser - Query Analyser
- Цитат от Michael Giannakopoulos (miccagi...@gmail.com), на 17.11.2012 в 16:18 - Hello guys, My name is Michail Giannakopoulos and I am a graduate student at University of Toronto. I have no previous experience in developing a system like postgreSQL before. What I am trying to explore is if it is possible to extend postgreSQL in order to accept queries of the form: Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM [database_name]; Why invent non-standard syntax for something that you could do in SQL. You could try something like this: SELECT (m.f).f_out_name1 AS out1, (m.f).f_out_name2 AS out2, (m.f).f_out_name3 AS out3 FROM ( SELECT f(att1,att2,att3) FROM input_table_name ) AS m; Best regards -- Luben Karavelov
Re: [HACKERS] foreign key locks
I agree that tripling FOR SHARE cost is risky. Where is the added cost concentrated? Perchance that multiple belies optimization opportunities. Good question, let me play a bit. Ok, I benchmarked around and from what I see there is no single easy target. The biggest culprits I could find are: 1. higher amount of XLogInsert calls per transaction (visible in pgbench -t instead of -T mode while watching the WAL volume) 2. Memory allocations in GetMultiXactIdMembers 3. Memory allocations in mXactCachePut a) cache entry itself b) the cache context 4. More lwlocks acquisitions We can possibly optimize a bit with 2) by using a static buffer for common member sizes, but thats not going to buy us too much... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doc patch, put pg_temp into the documentation's index
On 11/17/2012 12:19:02 AM, Peter Eisentraut wrote: On Fri, 2012-09-28 at 11:10 -0500, Karl O. Pinc wrote: pg_temp-toindex.patch Puts pg_temp into the index of the docs. But there is no object called pg_temp. It always pg_temp_ something. How should that be indexed? My thought is not to index the db object; it isn't particularly interesting to a user. Instead what's indexed is the token pg_temp, used when setting search_path. The utility of the token is explained in several places in the docs. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format
Hannu, On 11/17/2012 03:40 PM, Hannu Krosing wrote: On 11/17/2012 03:00 PM, Markus Wanner wrote: On 11/17/2012 02:30 PM, Hannu Krosing wrote: Is it possible to replicate UPDATEs and DELETEs without a primary key in PostgreSQL-R No. There must be some way to logically identify the tuple. It can be done as selecting on _all_ attributes and updating/deleting just the first matching row create cursor ... select from t ... where t.* = () fetch one ... delete where current of ... That doesn't sound like it could possibly work for Postgres-R. At least not when there can be multiple rows with all the same attributes, i.e. without a unique key constraint over all columns. Otherwise, some nodes could detect two concurrent UPDATES as a conflict, while other nodes select different rows and don't handle it as a conflict. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add big fat caution to pg_restore docs regards partial db restores
On 11/17/2012 12:27:14 AM, Peter Eisentraut wrote: On Sun, 2012-09-23 at 21:22 -0500, Karl O. Pinc wrote: Hi, Adds a caution to the pg_restore docs Against git master. I'm not sure what you are trying to get at here. It's basically saying, if you make an incomplete database restore, you might get an incomplete database. Is there any specific failure scenario that we should address? Basically, no. It's a reminder of all the various sorts of inconsistencies that might arise from a partial restore, not just referential integrity but other integrity constraints that might be enforced by triggers or the application. Possibly even manual procedural checks. The idea is that such a list might direct the attention of the person doing data recovery to overlooked integrity issues. I agree, there's no point in a generic warning. The warning is only useful if it leads the reader to do a better job of data recovery. I would summarize slightly differently, an incomplete restore can lead to an inconsistent database. --- One criticism of this patch: Suggestion for --truncate-tables to pg_restore https://commitfest.postgresql.org/action/patch_view?id=944 was that because it allowed an incomplete restore you might get an inconsistent database. If that's a problem, and at some level it is, then it's a problem right now and the only way to address the problem is to help the data recovery person. I thought some sort of brief checklist of kinds of data integrity might help, put someplace where it would be seen when needed. Feel free to reject. I sent in the patch to try out the idea. It's a bit crazy, but I didn't think too crazy to share with the list. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] array exclusion constraint
CREATE TABLE foo ( x CHAR(32) PRIMARY KEY, y CHAR(32) NOT NULL, EXCLUDE USING gist ((ARRAY[x, y]) WITH ) ); ERROR: data type character[] has no default operator class for access method gist HINT: You must specify an operator class for the index or define a default operator class for the data type. Neither gist nor gin work. You can do that with integers using the intarray extension module. Could you please suggest me a clean way to achieve the same result (using the char data type, not integers)? Probably I could create a specific operator class or some other workaround, but the intarray implementation looks a bit complex. Someone so kind to point me in the right direction? Of course the following produces the same error: CREATE TABLE bar ( x VARCHAR[] NOT NULL, EXCLUDE USING gist (x WITH ) );
Re: [HACKERS] foreign key locks
On Sat, Nov 17, 2012 at 05:07:18PM +0100, Andres Freund wrote: I agree that tripling FOR SHARE cost is risky. Where is the added cost concentrated? Perchance that multiple belies optimization opportunities. Good question, let me play a bit. Ok, I benchmarked around and from what I see there is no single easy target. The biggest culprits I could find are: 1. higher amount of XLogInsert calls per transaction (visible in pgbench -t instead of -T mode while watching the WAL volume) 2. Memory allocations in GetMultiXactIdMembers 3. Memory allocations in mXactCachePut a) cache entry itself b) the cache context 4. More lwlocks acquisitions We can possibly optimize a bit with 2) by using a static buffer for common member sizes, but thats not going to buy us too much... In that case, +1 for your proposal to prop up FOR SHARE. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] another idea for changing global configuration settings from SQL
On Fri, Nov 16, 2012 at 2:53 AM, Peter Eisentraut pete...@gmx.net wrote: Independent of the discussion of how to edit configuration files from SQL, I had another idea how many of the use cases for this could be handled. We already have the ability to store in pg_db_role_setting configuration settings for specific user, specific database specific user, any database any user, specific database The existing infrastructure would also support any user, any database (= all the time) All you'd need is to add ApplySetting(InvalidOid, InvalidOid, relsetting, PGC_S_$SOMETHING); in postinit.c, and have some SQL command to modify this setting. The only thing you couldn't handle that way are SIGHUP settings, but the often-cited use cases work_mem, logging, etc. would work. There would also be the advantage that pg_dumpall would save these settings. Thoughts? In this approach, we cannot change the settings in the standby? If yes, I don't like this approach. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need so many hint bits?
On Sat, 2012-11-17 at 14:24 +0100, Andres Freund wrote: I think the point is that to check whether the visibilitymap bit needs to be unset What's the problem with that? If you already have the VM buffer pinned (which should be possible if we keep the VM buffer in a longer-lived structure), then doing the test is almost as cheap as checking PD_ALL_VISIBLE, because you don't need any locks. So, the proposal is: 1. Keep the VM buffer around in a longer-lived structure for scans and nodeModifyTable. 2. Replace all tests of PD_ALL_VISIBLE with tests directly against the VM, hopefully using a buffer that we already have a pin on. I haven't really dug into this yet, but I don't see any obvious problem. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL
On Sat, Nov 17, 2012 at 10:25 PM, Amit Kapila amit.kap...@huawei.com wrote: 1. have a system table pg_global_system_settings(key,value) Do we really need to store the settings in a system table? Since WAL would be generated when storing the settings in a system table, this approach seems to prevent us from changing the settings in the standby. 2. On SQL command execution, insert if the value doesn't exist or update if already existing. This means that we should implement something like MERGE command first? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need so many hint bits?
Jeff Davis pg...@j-davis.com writes: What's the problem with that? If you already have the VM buffer pinned (which should be possible if we keep the VM buffer in a longer-lived structure), then doing the test is almost as cheap as checking PD_ALL_VISIBLE, because you don't need any locks. Really? What about race conditions? Specifically, I think what you suggest is likely to be unreliable on machines with weak memory ordering. Consider possibility that someone else just changed the VM bit. Getting a lock ensures synchronization. (Yeah, it's possible that we could use some primitive cheaper than a lock ... but it's not going to be free.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL
Fujii Masao masao.fu...@gmail.com writes: Do we really need to store the settings in a system table? Since WAL would be generated when storing the settings in a system table, this approach seems to prevent us from changing the settings in the standby. That's a really good point: if we try to move all GUCs into a system table, there's no way for a standby to have different values; and for some of them different values are *necessary*. I think that shoots down this line of thought entirely. Can we go back to the plain write a file approach now? I think a SET PERSISTENT command that's disallowed in transaction blocks and just writes the file immediately is perfectly sensible. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doc patch, put pg_temp into the documentation's index
On Sat, 2012-11-17 at 11:33 -0600, Karl O. Pinc wrote: On 11/17/2012 12:19:02 AM, Peter Eisentraut wrote: On Fri, 2012-09-28 at 11:10 -0500, Karl O. Pinc wrote: pg_temp-toindex.patch Puts pg_temp into the index of the docs. But there is no object called pg_temp. It always pg_temp_ something. How should that be indexed? My thought is not to index the db object; it isn't particularly interesting to a user. Instead what's indexed is the token pg_temp, used when setting search_path. The utility of the token is explained in several places in the docs. Actually, since this is the pg_temp alias for the search path, it is appropriate. So committed as is. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need so many hint bits?
On 16 November 2012 19:58, Jeff Davis pg...@j-davis.com wrote: On Fri, 2012-11-16 at 11:58 -0500, Robert Haas wrote: Also, I am wondering about PD_ALL_VISIBLE. It was originally introduced in the visibility map patch, apparently as a way to know when to clear the VM bit when doing an update. It was then also used for scans, which showed a significant speedup. But I wonder: why not just use the visibilitymap directly from those places? Well, you'd have to look up, lock and pin the page to do that. I suspect that overhead is pretty significant. The benefit of noticing that the flag is set is that you need not call HeapTupleSatisfiesMVCC for each tuple on the page: checking one bit in the page header is a lot cheaper than calling that function for every tuple. However, if you had to lock and pin a second page in order to check whether the page is all-visible, I suspect it wouldn't be a win; you'd probably be better off just doing the HeapTupleSatisfiesMVCC checks for each tuple. That's pretty easy to test. Here's what I got on a 10M record table (Some runs got some strangely high numbers around 1700ms, which I assume is because it's difficult to keep the data in shared buffers, so I took the lower numbers.): PD_ALL_VISIBLE: 661ms VM Lookup: 667ms Neither: 740ms Even if pinning the vm buffer were slow, we could keep the pin longer during a scan (it seems like the VM API is designed for that kind of a use case), so I don't think scans are a problem at all, even if there is a lot of concurrency. The biggest problem with hint bits is SeqScans on a table that ends up dirtying many pages. Repeated checks against clog and hint bit setting are massive overheads for the user that hits that, plus it generates an unexpected surge of database writes. Even without checksums that is annoying. ISTM that we should tune that specifically by performing a VM lookup for next 32 pages (or more), so we reduce the lookups well below 1 per page. That way the overhead of using the VM will be similar to using the PD_ALL_VISIBLE. Also, if we pass through a flag to HeapTupleSateisfies indicating we are not interested in setting hints on a SeqScan then we can skip individual tuple hints also. If the whole page becomes visible then we can set the VM. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doc patch, put pg_temp into the documentation's index
On 11/17/2012 05:10:12 PM, Peter Eisentraut wrote: On Sat, 2012-11-17 at 11:33 -0600, Karl O. Pinc wrote: what's indexed is the token pg_temp, used when setting search_path. Actually, since this is the pg_temp alias for the search path, it is appropriate. So committed as is. Thanks for the work on this and the other patches you've helped me out with, and for the larger work on PG of course. I would feel like I was cluttering the channel if I sent a thanks each time but I do want to acknowledge both your help and the work the other Postgres people do. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need so many hint bits?
On Sat, 2012-11-17 at 16:53 -0500, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: What's the problem with that? If you already have the VM buffer pinned (which should be possible if we keep the VM buffer in a longer-lived structure), then doing the test is almost as cheap as checking PD_ALL_VISIBLE, because you don't need any locks. Really? What about race conditions? Specifically, I think what you suggest is likely to be unreliable on machines with weak memory ordering. Consider possibility that someone else just changed the VM bit. Getting a lock ensures synchronization. (Yeah, it's possible that we could use some primitive cheaper than a lock ... but it's not going to be free.) There's already a similar precedent in IndexOnlyNext, which calls visibilitymap_test with no lock. I am not authoritative on these kinds of lockless accesses, but it looks like we can satisfy those memory barrier requirements in the places we need to. Here is my analysis: Process A (process that clears a VM bit for a data page): 1. Acquires exclusive lock on data buffer 2. Acquires exclusive lock on VM buffer 3. clears VM bit 4. Releases VM buffer lock 5. Releases data buffer lock Process B (process that tests the VM bit for the same data page): 1. Acquires shared lock (if it's a scan doing a visibility test) or an exclusive lock (if it's an I/U/D that wants to know whether to clear the bit or not) on the data buffer. 2. Tests bit using an already-pinned VM buffer. 3. Releases data buffer lock. Process A and B must be serialized, because A takes an exclusive lock on the data buffer and B takes at least a shared lock on the data buffer. The only dangerous case is when A happens right before B. So, the question is: are there enough memory barriers between A-3 and B-2? And I think the answer is yes. A-4 should act as a write barrier after clearing the bit, and B-1 should act as a read barrier before reading the bit. Let me know if there is a flaw with this analysis. If not, then I still agree with you that it's not as cheap as testing PD_ALL_VISIBLE, but I am skeptical that memory-ordering constraints we're imposing on the CPU are expensive enough to matter in these cases. If you have a test case in mind that might exercise this, then I will try to run it (although my workstation is only 4 cores, and the most I can probably get access to is 16 cores). Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need so many hint bits?
On Sat, 2012-11-17 at 19:35 -0500, Simon Riggs wrote: The biggest problem with hint bits is SeqScans on a table that ends up dirtying many pages. Repeated checks against clog and hint bit setting are massive overheads for the user that hits that, plus it generates an unexpected surge of database writes. Even without checksums that is annoying. Yeah. I am nowhere close to a general solution for that, but I am targeting the PD_ALL_VISIBLE hint for removal (which is one part of the problem), and I think I am close to an approach with no measurable downside. ISTM that we should tune that specifically by performing a VM lookup for next 32 pages (or more), so we reduce the lookups well below 1 per page. That way the overhead of using the VM will be similar to using the PD_ALL_VISIBLE. That's another potential way to mitigate the effects during a scan, but it does add a little complexity. Right now, it share locks a buffer, and uses an array with one element for each tuple in the page. If PD_ALL_VISIBLE is set, then it marks all of the tuples *currently present* on the page as visible in the array, and then releases the share lock. Then, when reading the page, if another tuple is added (because we released the share lock and only have a pin), it doesn't matter because it's already invisible according to the array. With this approach, we'd need to keep a larger array to represent many pages. And it sounds like we'd need to share lock the pages ahead, and find out which items are currently present, in order to properly fill in the array. Not quite sure what to do there, but would require some more thought. I'm inclined to avoid going down this path unless there is some performance reason to do so. We can keep a VM buffer pinned and do some lockless testing (similar to that in IndexOnlyNext; see my response to Tom), which will hopefully be fast enough that we don't need anything else. Also, if we pass through a flag to HeapTupleSateisfies indicating we are not interested in setting hints on a SeqScan then we can skip individual tuple hints also. If the whole page becomes visible then we can set the VM. Hmm, that's an idea. Maybe we shouldn't bother setting the hints if it's already all-visible in the VM? Something else to think about. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Do we need so many hint bits?
On Fri, Nov 16, 2012 at 5:35 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2012-11-16 at 17:04 -0800, Jeff Janes wrote: Your question prompts me to post something I had been wondering. Might it be worthwhile to break the PD_ALL_VISIBLE / vm equivalence? Should the vm bit get cleared by a HOT update? To clarify: are you saying that a hot update should clear the PD_ALL_VISIBLE bit, but not the VM bit? Yes. And anyone can vacuum a block that has only had HOT updates, you don't need to be dedicated vacuum worker to do that. And obviously this would be incompatible with removing the PD_ALL_VISIBLE, unless we also wanted to eliminate the ability to short-cut hint bit checks. I'm still a little unclear on what the benefit is. The benefit would be that index only scans would be more likely to not need to visit the heap page, if it only had HOT updates done to it since the last time it was all-visible. Also some reduced vacuuming, but I don't know if that benefit would be beneficial. It sounds like a slightly different kind of hint, so maybe we should just treat it as a completely different thing after removing PD_ALL_VISIBLE. If it's related to HOT updates, then the page will probably be dirty anyway, so that removes my primary complaint about PD_ALL_VISIBLE. Right, but if the HOT-only bit was on the page itself, it would no longer help out index-only-scans. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers