Re: [HACKERS] proposal: simple date constructor from numeric values
On Wed, Sep 18, 2013 at 9:54 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, thank you, I have no comments. Thanks. Assigned it to committer. Regards Pavel -- Jeevan B Chalke
Re: [HACKERS] pg_stat_statements: calls under-estimation propagation
On Thu, Sep 19, 2013 at 2:25 PM, samthakur74 samthaku...@gmail.com wrote: I got a segmentation fault when I tested the case where the least-executed query statistics are discarded, i.e., when I executed different queries more than pg_stat_statements.max times. I guess that the patch might have a bug. Thanks, will try to fix it. pg_stat_statements--1.1.sql should be removed. Yes, will do that.

+ <entry><structfield>queryid</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>Unique value of each representative statement for the current statistics session.
+  This value will change for each new statistics session.</entry>

What does statistics session mean? The time period when statistics are gathered by the statistics collector without being reset. So the statistics session continues across normal shutdowns, but in case of abnormal situations like crashes, format upgrades or statistics being reset for any other reason, a new time period of statistics collection starts, i.e. a new statistics session. The queryid value generation is linked to the statistics session to emphasize the fact that in case of crashes, format upgrades or any situation of statistics reset, the queryid for the same queries will also change. I'm afraid that this behavior narrows down the use case of queryid very much. For example, since the queryid of the same query would not be the same on the master and the standby servers, we cannot associate those two statistics by using the queryid. The queryid changes through crash recovery, so we cannot associate the query statistics generated before the crash with those generated after the crash recovery even if the query is the same. This is not directly related to the patch itself, but why does the queryid need to be calculated based also on the statistics session? Will update the documentation to clearly explain the term statistics session in this context. Yep, that's helpful!
Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
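The coupling Fujii objects to can be sketched as follows. This is a hypothetical model for illustration, not the actual pg_stat_statements code: if the queryid mixes the query fingerprint with a per-statistics-session salt, the same query hashes differently after any reset or crash recovery, which is exactly why master/standby or pre-/post-crash statistics cannot be matched up.

```python
import hashlib

def query_fingerprint(normalized_query: str) -> int:
    """Stable hash of the normalized query text alone (survives resets)."""
    return int.from_bytes(hashlib.sha256(normalized_query.encode()).digest()[:8], "big")

def session_queryid(normalized_query: str, stats_session_salt: int) -> int:
    """queryid as described in the thread: the fingerprint mixed with a
    statistics-session identifier, so a reset/crash changes every queryid."""
    h = hashlib.sha256()
    h.update(stats_session_salt.to_bytes(8, "big"))
    h.update(normalized_query.encode())
    return int.from_bytes(h.digest()[:8], "big")

q = "SELECT * FROM t WHERE id = ?"
# Same query, same statistics session: stable.
assert session_queryid(q, 1234) == session_queryid(q, 1234)
# Same query, new statistics session (e.g. after a crash): different id --
# the objection raised in the review.
assert session_queryid(q, 1234) != session_queryid(q, 5678)
# A fingerprint computed from the query alone would survive resets.
assert query_fingerprint(q) == query_fingerprint(q)
```

The names and the salt mechanism are invented for illustration; the point is only that any id derived from session state is unstable across resets, while one derived from the query text alone is not.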
Re: [HACKERS] record identical operator
Kevin Grittner kgri...@ymail.com writes: There are examples in the patch and this thread, but rather than reference back to those I'll add a new one. Without the patch: Thanks much for doing that. The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case, so that the two relations actually yield the same results for the right definition of same here: the citext one. In other words, the results only look different in ways that don't matter for the datatype involved, and I think that if it matters to the user then he needs to review his datatype choices or view definition. So my position on that is that your patch is only adding confusion for no benefits that I'm able to understand. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] [RFC] Extend namespace of valid guc names
On 2013-09-18 11:02:50 +0200, Andres Freund wrote: On 2013-09-18 11:55:24 +0530, Amit Kapila wrote: I think that ship has long since sailed. postgresql.conf has allowed foo.bar style GUCs via custom_variable_classes for a long time, and these days we don't even require that but allow them generally. Also, SET, postgres -c, and SELECT set_config() already don't have the restriction to one dot in the variable name. It's even explained in the documentation that a two-part name is allowed for Customized Options at this link: http://www.postgresql.org/docs/devel/static/runtime-config-custom.html Oh, I somehow missed that. I'll need to patch that as well. Updated patch attached. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

From 2ab86c1cd230b4187ee994447075bd6a72b408dc Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Sat, 14 Sep 2013 23:15:10 +0200
Subject: [PATCH] Allow custom GUCs to be nested more than one level in config files

Doing so was allowed via SET, postgres -c and set_config() before.
Allowing this is useful for grouping together variables inside an
extension's toplevel namespace.

There still are differences in the accepted variable names between the
different methods of setting GUCs after this commit, but the consensus
is that those are acceptable.
---
 doc/src/sgml/config.sgml          | 29 ++++++++++++++++++++---------
 src/backend/utils/misc/guc-file.l |  2 +-
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 370aa09..ae9ef3b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6375,21 +6375,32 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
 </para>
 <para>
-    Custom options have two-part names: an extension name, then a dot, then
-    the parameter name proper, much like qualified names in SQL.  An example
-    is <literal>plpgsql.variable_conflict</>.
+    Custom options consist of an extension name, then a dot, zero
+    or more extension-internal namespaces each with a trailing dot,
+    then the parameter name proper, much like qualified names in
+    SQL.  An example without extension-internal namespaces
+    is <literal>plpgsql.variable_conflict</>, one with
+    is <literal>foo.a.connection</>.
 </para>
 <para>
     Because custom options may need to be set in processes that have not
     loaded the relevant extension module, <productname>PostgreSQL</>
-    will accept a setting for any two-part parameter name.  Such variables
-    are treated as placeholders and have no function until the module that
-    defines them is loaded.  When an extension module is loaded, it will add
-    its variable definitions, convert any placeholder values according to
-    those definitions, and issue warnings for any unrecognized placeholders
-    that begin with its extension name.
+    will accept a setting for any parameter name containing at least one dot.
+    Such variables are treated as placeholders and have no function until the
+    module that defines them is loaded.  When an extension module is loaded,
+    it will add its variable definitions, convert any placeholder values
+    according to those definitions, and issue warnings for any unrecognized
+    placeholders that begin with its extension name.
 </para>
+
+<para>
+    Note that options set in <filename>postgresql.conf</> and files
+    it includes have different restrictions than names set
+    with <command>SET</>, <command>SELECT set_config(...)</>, or ones
+    passed directly to <command>postgres</> and <command>pg_ctl</>.
+</para>
 </sect1>

 <sect1 id="runtime-config-developer">
diff --git a/src/backend/utils/misc/guc-file.l b/src/backend/utils/misc/guc-file.l
index b730a12..ff4202d 100644
--- a/src/backend/utils/misc/guc-file.l
+++ b/src/backend/utils/misc/guc-file.l
@@ -78,7 +78,7 @@
 LETTER          [A-Za-z_\200-\377]
 LETTER_OR_DIGIT [A-Za-z_0-9\200-\377]
 ID              {LETTER}{LETTER_OR_DIGIT}*
-QUALIFIED_ID    {ID}"."{ID}
+QUALIFIED_ID    {ID}("."{ID})+
 UNQUOTED_STRING {LETTER}({LETTER_OR_DIGIT}|[-._:/])*
 STRING          \'([^'\\\n]|\\.|\'\')*\'
--
1.8.4.21.g992c386.dirty
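The lexer change above can be mimicked with Python regular expressions. This is only an illustration of the flex rule's effect, not the rule itself (the real rule also accepts high-bit bytes \200-\377 in identifiers): the old pattern accepts exactly one dot, the new one accepts one or more.

```python
import re

# Simplified identifier: the real flex ID also allows bytes \200-\377.
ID = r"[A-Za-z_][A-Za-z_0-9]*"
old_qualified = re.compile(rf"^{ID}\.{ID}$")      # QUALIFIED_ID {ID}"."{ID}
new_qualified = re.compile(rf"^{ID}(\.{ID})+$")   # QUALIFIED_ID {ID}("."{ID})+

assert old_qualified.match("plpgsql.variable_conflict")
assert not old_qualified.match("foo.a.connection")   # rejected before the patch
assert new_qualified.match("plpgsql.variable_conflict")
assert new_qualified.match("foo.a.connection")       # accepted after the patch
assert not new_qualified.match("work_mem")           # unqualified names use a different rule
```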
Re: [HACKERS] Patch for fail-back without fresh backup
On Thu, Sep 19, 2013 at 12:25 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Sep 19, 2013 at 11:48 AM, Sawada Masahiko sawada.m...@gmail.com wrote: I attached the patch which I have modified. Thanks for updating the patch! Here are the review comments: Thank you for reviewing! I got the compiler warning: syncrep.c:112: warning: unused variable 'i' How does synchronous_transfer work with synchronous_commit? Currently, synchronous_transfer doesn't work when synchronous_commit is set to 'off' or 'local'. If the user changes the synchronous_commit value in a transaction, the checkpointer process can't see it. Due to that, even if synchronous_commit is changed from 'on' to 'off', synchronous_transfer doesn't work. I'm planning to modify the patch so that synchronous_transfer is not affected by synchronous_commit.

+ * accept all the likely variants of off.

This comment should be removed because synchronous_transfer doesn't accept the value off.

+ {commit, SYNCHRONOUS_TRANSFER_COMMIT, true},

ISTM the third value true should be false.

+ {0, SYNCHRONOUS_TRANSFER_COMMIT, true},

Why is this needed?

+ elog(WARNING, "XLogSend sendTimeLineValidUpto(%X/%X) <= sentPtr(%X/%X) AND sendTimeLine",
+      (uint32) (sendTimeLineValidUpto >> 32), (uint32) sendTimeLineValidUpto,
+      (uint32) (sentPtr >> 32), (uint32) sentPtr);

Why is this needed? They are unnecessary. I had forgotten to remove unnecessary code.

+ #define SYNC_REP_WAIT_FLUSH 1
+ #define SYNC_REP_WAIT_DATA_FLUSH 2

Why do we need to separate the wait-queue for wait-data-flush from that for wait-flush? ISTM that wait-data-flush could also wait for the replication on the wait-queue for wait-flush, which would simplify the patch. Yes, it seems unnecessary to add a new queue. I will delete SYNC_REP_WAIT_DATA_FLUSH and the related code. Regards, --- Sawada Masahiko
Re: [HACKERS] proposal: lob conversion functionality
Hi Pavel, I have reviewed your patch. -- Patch applies cleanly (using patch -p1) -- Make and make install work fine -- make check looks good. I did a code-walk and it looks good. Also did some manual testing and haven't found any issue with the implementation. The patch introduces two new APIs, load_lo() and make_lo(), for loading and saving from/to large objects. When it comes to dropping an lo object created using make_lo(), this still depends on the older API lo_unlink(). I think we should add that to the documentation for clarification. As a user of the lo object functions, when I started testing this new API, the first question that came to mind was why a delete_lo() or destroy_lo() API is missing. Later I realized that one needs to use the older lo_unlink() API for that functionality. So I feel it's good to document that. Do let me know what you think? Otherwise the patch looks nice and clean. Regards, Rushabh Lathia www.EnterpriseDB.com On Sun, Aug 25, 2013 at 8:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello here is a patch; it introduces load_lo and make_lo functions postgres=# select make_lo(decode('ff00','hex')); make_lo ─ 24629 (1 row) Time: 40.724 ms postgres=# select load_lo(24628); load_lo \xff00 (1 row) postgres=# \lo_import ~/avatar.png lo_import 24630 postgres=# select md5(load_lo(24630)); md5 ── 513f60836f3b625713acaf1c19b6ea78 (1 row) postgres=# \q bash-4.1$ md5sum ~/avatar.png 513f60836f3b625713acaf1c19b6ea78 /home/pavel/avatar.png Regards Pavel Stehule 2013/8/22 Jov am...@amutu.com +1 We badly need the large object to bytea conversion functions. Once I had to use the ugly pg_read_file() to put some text into pg; I tried to use large objects but found them useless without a function to convert a large object to bytea. Jov blog: http://amutu.com/blog 2013/8/10 Pavel Stehule pavel.steh...@gmail.com Hello I had to enhance my older project, where XML documents are parsed and created on the server side - in PLpgSQL and PLPerl procedures.
We would like to use the LO API for client-server communication, but we have to parse/serialize LOs on the server side. I found that there is no simple API for working with LOs from PL without access to the file system. I had to use ugly hacks:

CREATE OR REPLACE FUNCTION parser.save_as_lob(bytea)
RETURNS oid AS $$
DECLARE
  _loid oid;
  _substr bytea;
BEGIN
  _loid := lo_creat(-1);
  FOR i IN 0..length($1)/2048 LOOP
    _substr := substring($1 FROM i * 2048 + 1 FOR 2048);
    IF _substr <> '' THEN
      INSERT INTO pg_largeobject(loid, pageno, data) VALUES(_loid, i, _substr);
    END IF;
  END LOOP;
  EXECUTE format('GRANT SELECT ON LARGE OBJECT %s TO ohs', _loid);
  RETURN _loid;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER STRICT SET search_path = 'pg_catalog';

and

CREATE OR REPLACE FUNCTION fbuilder.attachment_to_xml(attachment oid)
RETURNS xml AS $$
DECLARE
  b_cum bytea = '';
  b bytea;
BEGIN
  FOR b IN SELECT l.data FROM pg_largeobject l
            WHERE l.loid = attachment_to_xml.attachment
            ORDER BY l.pageno
  LOOP
    b_cum := b_cum || b;
  END LOOP;
  IF NOT FOUND THEN
    RETURN NULL;
  ELSE
    RETURN xmlelement(NAME attachment, encode(b_cum, 'base64'));
  END IF;
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER SET search_path = 'pg_catalog';

These functions could be simplified if we supported functions like encode and decode for LOs. So my proposal is creating the functions: * lo_encode(loid oid) .. returns bytea * lo_encode(loid oid, encoding text) .. returns text * lo_make(loid oid, data bytea) * lo_make(loid oid, data text, encoding text) This would simplify all transformations between LO and VARLENA. A known limit is 1G for varlena, but that is still relatively high. Notes, comments? Regards Pavel -- Rushabh Lathia
Re: [HACKERS] Patch for fail-back without fresh backup
On Thu, Sep 19, 2013 at 7:07 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Thu, Sep 19, 2013 at 12:25 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Sep 19, 2013 at 11:48 AM, Sawada Masahiko sawada.m...@gmail.com wrote: I attached the patch which I have modified. Thanks for updating the patch! Here are the review comments: Thank you for reviewing! I got the compiler warning: syncrep.c:112: warning: unused variable 'i' How does synchronous_transfer work with synchronous_commit? Currently, synchronous_transfer doesn't work when synchronous_commit is set to 'off' or 'local'. If the user changes the synchronous_commit value in a transaction, the checkpointer process can't see it. Due to that, even if synchronous_commit is changed from 'on' to 'off', synchronous_transfer doesn't work. I'm planning to modify the patch so that synchronous_transfer is not affected by synchronous_commit. Hmm... when synchronous_transfer is set to data_flush, IMO the intuitive behaviors are:

(1) synchronous_commit = on: a data flush should wait for the corresponding WAL to be flushed in the standby.
(2) synchronous_commit = remote_write: a data flush should wait for the corresponding WAL to be written to the OS in the standby.
(3) synchronous_commit = local and (4) synchronous_commit = off: a data flush should wait for the corresponding WAL to be written locally in the master.

Regards, -- Fujii Masao
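Fujii's proposed behavior can be restated as a lookup table. This is a sketch of the intent of the proposal above, not patch code; the descriptive strings are invented for illustration:

```python
# What a data flush should wait for, per synchronous_commit setting,
# when synchronous_transfer = data_flush (per Fujii's matrix above).
WAIT_FOR = {
    "on":           "standby WAL flush",
    "remote_write": "standby WAL write to OS",
    "local":        "local WAL write",
    "off":          "local WAL write",
}

def data_flush_wait(synchronous_commit: str) -> str:
    """Return what a data flush waits for under the proposed semantics."""
    return WAIT_FOR[synchronous_commit]

assert data_flush_wait("on") == "standby WAL flush"
assert data_flush_wait("remote_write") == "standby WAL write to OS"
# 'local' and 'off' degrade identically: only the local WAL write is awaited.
assert data_flush_wait("local") == data_flush_wait("off")
```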
[HACKERS] FW: REVIEW: Allow formatting in log_line_prefix
(Forgot to copy the list) From: David Rowley [mailto:dgrowle...@gmail.com] Sent: 19 September 2013 22:35 To: Albe Laurenz Subject: Re: REVIEW: Allow formatting in log_line_prefix Hi Laurenz, Thanks for the review. Please see my comments below and the updated patch attached. On Thu, Sep 19, 2013 at 2:18 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote: This is a review for patch caaphdvpagtypzb2kwa0mmtksayg9+vashyjmjfatngxr1ad...@mail.gmail.com The patch is readable, applies fine and builds without warnings. It contains sufficient documentation. It works as it should, no crashes or errors. It is well written; in fact it improves the readability of the log_line_prefix function in backend/utils/error/elog.c. I think that this is a useful feature, as it can help make stderr logging more readable for humans. I have a few gripes with the English: - In the documentation patch:

+ numeric literal after the % and before the option. A negative
+ padding will cause the status information to be padded on the
+ right with spaces to give it a minimum width. Whereas a positive
+ padding will pad on the left. Padding can be useful keep log
+ files more human readable.

I think that there should be a comma, not a period, before whereas. I've made the change you requested here and also made a change to my last sentence, which hopefully is an improvement. - The description of the function process_log_prefix_padding should probably mention that it returns NULL if it encounters bad syntax. I've added a comment before the function for this. - This comment:

+ /* Format is invalid if it ends with the padding number */

should begin with lower case. Fixed, but not quite sure of the reason for the lack of caps at the moment. - In this comment:

+ /*
+  * Process any formatting which may exist after the '%'.
+  * Note that this moves p passed the padding number
+  * if it exists.
+  */

It should be past, not passed. Fixed. If these details get fixed, I'll mark the patch ready for committer. Yours, Laurenz Albe

log_line_formatting_v0.2.patch Description: Binary data
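The padding semantics under review can be sketched in Python: a positive width pads on the left (right-aligns), a negative width pads on the right (left-aligns), mirroring printf-style %*s. This is an illustration of the documented behavior, not the elog.c code:

```python
def apply_padding(value: str, padding: int) -> str:
    """Pad value to a minimum width: positive -> pad on the left
    (right-aligned), negative -> pad on the right (left-aligned)."""
    if padding >= 0:
        return value.rjust(padding)
    return value.ljust(-padding)

# e.g. a log_line_prefix of '%-5p ' would left-align the PID in a 5-char field
assert apply_padding("123", 5) == "  123"
assert apply_padding("123", -5) == "123  "
assert apply_padding("123456", 5) == "123456"  # minimum width only, never truncates
```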
Re: [HACKERS] record identical operator
On 09/19/2013 10:54 AM, Dimitri Fontaine wrote: Kevin Grittner kgri...@ymail.com writes: There are examples in the patch and this thread, but rather than reference back to those I'll add a new one. Without the patch: Thanks much for doing that. The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case, so that the two relations actually yield the same results for the right definition of same here: the citext one. In other words, the results only look different in ways that don't matter for the datatype involved, and I think that if it matters to the user then he needs to review his datatype choices or view definition. So my position on that is that your patch is only adding confusion for no benefits that I'm able to understand. The aim is to have a view and a materialized view return the same results. If they do not, the user is guaranteed to be confused and to consider the matview implementation broken. The patch solves the general problem of when the table changes, refresh. After saying it like this, the problem could also be solved by including the xmin(s) of rows from the underlying table(s) in the matview. Would this be a better approach? -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
Re: [HACKERS] Freezing without write I/O
On 18.09.2013 16:22, Andres Freund wrote: On 2013-09-16 16:59:28 +0300, Heikki Linnakangas wrote: Here's a rebased version of the patch, including the above-mentioned fixes. Nothing else new. * We need some higher-level description of the algorithm somewhere in the source. I don't think I would have understood the concept from the patch alone without having read the thread previously. * Why do we need to do the PageUpdateNeedsFreezing() dance in heap_page_prune? No xids should change during it. Ah, but its LSN does change. Moving the LSN forward might move the LSN from one xid-lsn range to another, requiring any XIDs on the page that fall outside the xid range of the new xid-lsn range to be frozen. * Why can we do a GetOldestXmin(allDbs = false) in BeginXidLSNRangeSwitch()? Looks like a bug. I think I got the arguments backwards; it was supposed to be allDbs = true and ignoreVacuum = false. I'm not sure if vacuums could be ignored, but 'false' is the safe option. * Is there any concrete reasoning behind the current values for XID_LSN_RANGE_INTERVAL and NUM_XID_LSN_RANGES or just gut feeling? The values in the patch are:

#define NUM_XID_LSN_RANGES 100
#define XID_LSN_RANGE_INTERVAL 100

The interval is probably way too small for production, but a small value is handy for testing, as you can get through ranges faster. Something like 100 million would probably be a more suitable value for production. With a smaller interval, you need to freeze more often, while with a large interval, you can't truncate the clog as early. NUM_XID_LSN_RANGES is also quite arbitrary. I don't have a good feeling for what the appropriate sizing would be. Something like 5 or 10 would probably be enough. Although at the moment, the patch doesn't handle the situation that you run out of slots very well. You could merge some old ranges to make room, but ATM, it just stops creating new ones.
* The lsn-ranges file can possibly become bigger than 512 bytes (the size we assume to be written atomically) and you write it in place. If we fail halfway through writing, we seem to be able to recover by using the pageMatureLSN from the last checkpoint, but it seems better to do the fsync(), rename(), fsync() dance either way. The lsn-range array is also written to the xlog in toto whenever it changes, so on recovery, the ranges will be read from the WAL, and the ranges file will be recreated at the end-of-recovery checkpoint. * Should we preemptively freeze tuples on a page in lazy_scan_heap if we have already dirtied the page? That would make future modifications cheaper. Hmm, dunno. Any future modification will also dirty the page, so you're not actually saving any I/O by freezing it earlier. You're just choosing to do the CPU work and WAL insertion earlier than you have to. And if the page is not modified later, it is a waste of cycles. That said, maybe it would indeed be better to do it in vacuum when you have a chance to reduce latency in the critical path, even if it might be a waste. * lazy_scan_heap now blocks acquiring a cleanup lock on every buffer that contains dead tuples. Shouldn't we use some kind of cutoff xid there? That might block progress too heavily. Also the comment above it still refers to the old logic. Hmm, you mean skip the page even if there are some dead tuples on it, as long as the dead tuples are not older than X xids? I guess we could do that. Or the other thing mentioned in the comments, i.e. remember the page and come back to it later. For now though, I think it's good enough as it is. * There's no way to force a full-table vacuum anymore; that seems problematic to me. Yeah, it probably would be good to have that. An ignore visibility map option. * I wonder if CheckPointVarsup() doesn't need to update minRecoveryPoint. Hmm, I guess it should.
StartupVarsup() should be ok, because we should only read one from the future during a basebackup? You might read one from the future during recovery, whether it's crash or archive recovery, but as soon as you've replayed up to the min recovery point, or the end of WAL on crash recovery, the XID ranges array in memory should be consistent with the rest of the system, because changes to the array are WAL-logged. * xidlsnranges_recently[_dirtied] are not obvious at first glance. Why can't we just reset dirty before the WriteXidLSNRangesFile() call? There's only one process doing the writeout. Just because the checkpointing process could be killed? Right, if the write fails, you need to retry on the next checkpoint. * I think we should either not require consuming a multixactid or use a function that doesn't need MultiXactIdSetOldestMember(). If the transaction doing so lives for long, it will unnecessarily prevent truncation of mxacts. Agreed, that was just a quick kludge to get it working.
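The central idea discussed in this thread — mapping a page's LSN to the XID range that was current when the page was last stamped — can be sketched as follows. The structure and names are invented for illustration; the patch's actual array is C, WAL-logged, and bounded by NUM_XID_LSN_RANGES:

```python
import bisect

# Each entry: (start_lsn, oldest_xid_in_range), sorted by start_lsn.
ranges = [(0, 100), (5000, 200), (9000, 300)]

def range_for_lsn(lsn: int) -> tuple[int, int]:
    """Find the xid-lsn range covering a page LSN: a page stamped with an
    LSN in [start, next_start) may only contain XIDs from that range on."""
    starts = [start for start, _ in ranges]
    i = bisect.bisect_right(starts, lsn) - 1
    return ranges[i]

assert range_for_lsn(4999) == (0, 100)
assert range_for_lsn(5000) == (5000, 200)
# Moving a page's LSN forward (e.g. by pruning) can move it into a later
# range, which is why heap_page_prune must re-check whether XIDs on the
# page now fall outside the new range and need freezing, as Heikki notes.
assert range_for_lsn(9500) == (9000, 300)
```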
Re: [HACKERS] proposal: lob conversion functionality
2013/9/19 Rushabh Lathia rushabh.lat...@gmail.com Hi Pavel, I have reviewed your patch. -- Patch applies cleanly (using patch -p1) -- Make and make install work fine -- make check looks good. I did a code-walk and it looks good. Also did some manual testing and haven't found any issue with the implementation. The patch introduces two new APIs, load_lo() and make_lo(), for loading and saving from/to large objects. When it comes to dropping an lo object created using make_lo(), this still depends on the older API lo_unlink(). I think we should add that to the documentation for clarification. As a user of the lo object functions, when I started testing this new API, the first question that came to mind was why a delete_lo() or destroy_lo() API is missing. Later I realized that one needs to use the older lo_unlink() API for that functionality. So I feel it's good to document that. Do let me know what you think? Good idea; I'll send an updated patch this evening. Otherwise the patch looks nice and clean. Thank you :) Regards Pavel Regards, Rushabh Lathia www.EnterpriseDB.com On Sun, Aug 25, 2013 at 8:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello here is a patch; it introduces load_lo and make_lo functions postgres=# select make_lo(decode('ff00','hex')); make_lo ─ 24629 (1 row) Time: 40.724 ms postgres=# select load_lo(24628); load_lo \xff00 (1 row) postgres=# \lo_import ~/avatar.png lo_import 24630 postgres=# select md5(load_lo(24630)); md5 ── 513f60836f3b625713acaf1c19b6ea78 (1 row) postgres=# \q bash-4.1$ md5sum ~/avatar.png 513f60836f3b625713acaf1c19b6ea78 /home/pavel/avatar.png Regards Pavel Stehule 2013/8/22 Jov am...@amutu.com +1 We badly need the large object to bytea conversion functions. Once I had to use the ugly pg_read_file() to put some text into pg; I tried to use large objects but found them useless without a function to convert a large object to bytea.
Jov blog: http://amutu.com/blog 2013/8/10 Pavel Stehule pavel.steh...@gmail.com Hello I had to enhance my older project, where XML documents are parsed and created on the server side - in PLpgSQL and PLPerl procedures. We would like to use the LO API for client-server communication, but we have to parse/serialize LOs on the server side. I found that there is no simple API for working with LOs from PL without access to the file system. I had to use ugly hacks:

CREATE OR REPLACE FUNCTION parser.save_as_lob(bytea)
RETURNS oid AS $$
DECLARE
  _loid oid;
  _substr bytea;
BEGIN
  _loid := lo_creat(-1);
  FOR i IN 0..length($1)/2048 LOOP
    _substr := substring($1 FROM i * 2048 + 1 FOR 2048);
    IF _substr <> '' THEN
      INSERT INTO pg_largeobject(loid, pageno, data) VALUES(_loid, i, _substr);
    END IF;
  END LOOP;
  EXECUTE format('GRANT SELECT ON LARGE OBJECT %s TO ohs', _loid);
  RETURN _loid;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER STRICT SET search_path = 'pg_catalog';

and

CREATE OR REPLACE FUNCTION fbuilder.attachment_to_xml(attachment oid)
RETURNS xml AS $$
DECLARE
  b_cum bytea = '';
  b bytea;
BEGIN
  FOR b IN SELECT l.data FROM pg_largeobject l
            WHERE l.loid = attachment_to_xml.attachment
            ORDER BY l.pageno
  LOOP
    b_cum := b_cum || b;
  END LOOP;
  IF NOT FOUND THEN
    RETURN NULL;
  ELSE
    RETURN xmlelement(NAME attachment, encode(b_cum, 'base64'));
  END IF;
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER SET search_path = 'pg_catalog';

These functions could be simplified if we supported functions like encode and decode for LOs. So my proposal is creating the functions: * lo_encode(loid oid) .. returns bytea * lo_encode(loid oid, encoding text) .. returns text * lo_make(loid oid, data bytea) * lo_make(loid oid, data text, encoding text) This would simplify all transformations between LO and VARLENA. A known limit is 1G for varlena, but that is still relatively high. Notes, comments?
Regards Pavel -- Rushabh Lathia
Re: [HACKERS] Assertions in PL/PgSQL
2013/9/18 Jim Nasby j...@nasby.net On 9/14/13 11:55 PM, Pavel Stehule wrote: 2013/9/15 Marko Tiikkaja ma...@joh.to On 2013-09-15 00:09, Pavel Stehule wrote: this is a possibility to introduce a new hook and the possibility to implement assertions and similar tasks in a generic form (as an extension). It can be assertions, tracing, profiling. You can already do tracing and profiling in an extension. I don't see what you would put inside the function body for these two, either. You cannot mark tracing points explicitly in the current (unsupported now) extensions. These functions share the same pattern:

CREATE OR REPLACE FUNCTION assert(boolean)
RETURNS void AS $$
BEGIN
  IF current_setting('plpgsql.assertions') = 'on' THEN
    IF NOT $1 THEN
      RAISE EXCEPTION 'Assert fails';
    END IF;
  END IF;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION trace(text)
RETURNS void AS $$
BEGIN
  IF current_setting('plpgsql.trace') = 'on' THEN
    RAISE WARNING 'trace: %', $1;
  END IF;
END;
$$ LANGUAGE plpgsql;

Depending on usage, these functions will not be extremely slow compared to a builtin solution - they can be faster if we implement them in C, and a little bit faster if we implement them as internal PLpgSQL statements. But if you use them with non-trivial queries, then the overhead is probably not significant. You have to watch some global state variable and then execute (or not) some functionality. FWIW, we've written a framework (currently available in the EnovaTools project on pgFoundry) that allows for very, very fine-grained control over asserts. - Every assert has a name (and an optional sub-name) as well as a level - You can globally set the minimum level that will trigger an assert. This is useful for some debugging stuff; have an assert with a negative level and normally it won't fire unless you set the minimum level to be less than zero.
- You can disable an assert globally (across all backends) - You can disable an assert only within your session. We should eventually allow for disabling an assert only for your transaction; we just haven't gotten around to it yet. The reason for all this flexibility is the concept of it should be very difficult but not impossible for the code to do X. We use it for sanity-checking things. I think similar frameworks exist (we have some similar functionality in orafce too) - and that is not a reason why we should not merge some functions into core. I am with Marko that some simple, user-friendly statement for assertions would be a very nice plpgsql feature. I differ in opinion about how to implement it and about the syntax. I prefer the possibility (not necessarily implemented right away) to enhance this feature for similar tasks (as a builtin or external feature). Probably you and I share the opinion that a simple and very primitive assert is not enough. I see the following as useful features for assertions: a) the possibility to specify a message (two-parameter assert) b) the possibility to specify some threshold c) the possibility to specify some level (exception, warning, notice) .. default should be exception d) the possibility to specify a handled/unhandled exception Regards Pavel -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
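Jim's fine-grained assert control (named asserts, levels, global and per-session disabling) can be sketched like this. This is a hypothetical model of the described behavior, not the EnovaTools implementation; all names are invented for illustration:

```python
class Asserts:
    """Named asserts with a level, a global minimum level, and
    global/per-session disable sets, as described in the thread."""
    def __init__(self, min_level: int = 0):
        self.min_level = min_level
        self.disabled_globally: set[str] = set()
        self.disabled_in_session: set[str] = set()

    def check(self, name: str, level: int, condition: bool) -> None:
        """Raise only if the assert is enabled and its level clears the bar."""
        if level < self.min_level:
            return  # e.g. negative-level debug asserts stay quiet by default
        if name in self.disabled_globally or name in self.disabled_in_session:
            return
        if not condition:
            raise AssertionError(f"assert {name!r} (level {level}) failed")

a = Asserts(min_level=0)
a.check("balance_nonneg", 1, True)      # passes silently
a.disabled_in_session.add("balance_nonneg")
a.check("balance_nonneg", 1, False)     # disabled in session: no exception
a.check("debug_only", -1, False)        # below min_level: no exception
try:
    a.check("other", 1, False)
    raise SystemExit("should have raised")
except AssertionError:
    pass
```

Disabling per transaction, which Jim says is not done yet, would just be a third disable set scoped to the transaction.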
Re: [HACKERS] Assertions in PL/PgSQL
On 9/18/13 5:11 PM, Pavel Stehule wrote: In this code an assert failure can be lost in the app log. Or it can be knowingly handled and ignored - which is wrong, and should not be allowed. When I wrote some more complex procedures, I had to use an EXCEPTION WHEN OTHERS clause - because I did not want to lose the transaction. It worked, but finding syntax errors was significantly harder - so based on this experience I think some errors can be handled (related to database usage) and others not - like syntax errors in PL/pgSQL or possibly assertions (although we can handle syntax errors, I don't think it is practical). It significantly increases the work necessary for error identification. I think that's a fair point. Regards, Marko Tiikkaja
Re: [HACKERS] Assertions in PL/PgSQL
On 9/19/13 2:08 PM, Pavel Stehule wrote: I think similar frameworks will exist (we have some similar functionality). You and I probably share the opinion that a simple and very primitive assert is not enough. Features I see as useful for assertions: a) the possibility to specify a message (a two-parameter assert) b) the possibility to specify some threshold c) the possibility to specify a level (exception, warning, notice) - the default should be exception d) the possibility to specify a handled/unhandled exception I think these are all neat ideas on how to further improve this feature. I'd like to see at least a) in 9.4, but I haven't yet looked at how it could be implemented. Regards, Marko Tiikkaja
Re: [HACKERS] record identical operator
Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Kevin Grittner kgri...@ymail.com writes: The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case That's not my understanding. If that was what citext was for it would be much simpler to force the case in creating each value. It *preserves* the case for display, but ignores it for comparisons. That's the contract of the type, like it or not. Equal does not mean the same. They clearly want to preserve and display differences among equal values. Streaming replication would preserve the differences. pg_dump and restore of the data would preserve the differences. SELECTing the data shows the differences. suppress_redundant_updates_trigger() will not suppress an UPDATE setting an equal value unless it is also *the same*. A normal VIEW shows the differences. RMV without CONCURRENTLY will show the difference. I'm suggesting that RMVC and the upcoming incremental update of matviews should join this crowd. A matview which is being refreshed with the CONCURRENTLY option, or maintained by incremental maintenance should not look different from a regular VIEW, and should not suddenly look completely different after a non-concurrent REFRESH. If things are left that way, people will quite justifiably feel that matviews are broken. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
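The distinction Kevin draws - citext values that are "equal" for comparison purposes yet not "the same" for display - can be illustrated outside SQL. A minimal Python analogue (the class and its behavior are an illustrative sketch of the citext contract, not PostgreSQL code):

```python
class CIText:
    """Case-insensitive for comparison, case-preserving for display,
    loosely mimicking the contract of PostgreSQL's citext type."""

    def __init__(self, s):
        self.s = s  # the original spelling is preserved

    def __eq__(self, other):
        # comparison ignores case, like citext's equality operator
        return self.s.lower() == other.s.lower()

    def __hash__(self):
        return hash(self.s.lower())

    def __str__(self):
        return self.s  # display keeps the stored case

a, b = CIText("PostgreSQL"), CIText("POSTGRESQL")
print(a == b)            # True: the values are "equal"
print(str(a) == str(b))  # False: they are not "the same"
```

This is exactly why REFRESH MATERIALIZED VIEW CONCURRENTLY needs an identity (byte-wise) comparison rather than the type's equality operator: equality alone would consider the stale spelling up to date.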
Re: [HACKERS] record identical operator
On 2013-09-19 05:33:22 -0700, Kevin Grittner wrote: Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Kevin Grittner kgri...@ymail.com writes: The problem, as I see it, is that the view and the concurrently refreshed materialized view don't yield the same results for the same query. The rows are equal, but they are not the same. With the patch the matview, after RMVC, looks just the same as the view. My understanding is that if you choose citext then you don't care at all about the case That's not my understanding. If that was what citext was for it would be much simpler to force the case in creating each value. It *preserves* the case for display, but ignores it for comparisons. That's the contract of the type, like it or not. Equal does not mean the same. They clearly want to preserve and display differences among equal values. I agree. I am not 100% sure if the can of worms this opens is worth the trouble, but from my POV it's definitely an understandable and sensible goal. My complaints about this subfeature were never about trying to get that right for matviews... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assertions in PL/PgSQL
2013/9/19 Marko Tiikkaja ma...@joh.to On 9/19/13 2:08 PM, Pavel Stehule wrote: I think similar frameworks will exist (we have some similar functionality). You and I probably share the opinion that a simple and very primitive assert is not enough. Features I see as useful for assertions: a) the possibility to specify a message (a two-parameter assert) b) the possibility to specify some threshold c) the possibility to specify a level (exception, warning, notice) - the default should be exception d) the possibility to specify a handled/unhandled exception I think these are all neat ideas on how to further improve this feature. I'd like to see at least a) in 9.4, but I haven't yet looked at how it could be implemented. Not everything must be implemented in 9.4 - it is more or less just exception parametrization, which is not hard to implement. But the syntax should be prepared for this functionality beforehand (or be extensible at a minimum). The Bison parser is not friendly to later extension :( - with a bad design now, we can break future extensibility purely at the syntax level. That is the reason I am making noise here; I remember that extending the RAISE statement was relatively difficult. Regards Pavel Regards, Marko Tiikkaja
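The features Pavel lists - a message, and a severity level with exception as the default - map onto a small helper. Here is a sketch in Python of the semantics he describes; the function name and error class are hypothetical, and the level set (exception/warning/notice) is taken from his list:

```python
import warnings

class AssertError(Exception):
    """Raised for level='exception', mirroring an unhandled PL/pgSQL error."""

def plpgsql_style_assert(condition, message="assertion failed", level="exception"):
    """Sketch of a parameterized assert: condition, message, severity level."""
    if condition:
        return
    if level == "exception":
        raise AssertError(message)       # default: abort, like an error
    elif level == "warning":
        warnings.warn(message)           # report but continue
    else:
        print(f"NOTICE: {message}")      # informational only

# The default level aborts, like an unhandled PL/pgSQL exception:
try:
    plpgsql_style_assert(1 == 2, "balance must not be negative")
except AssertError as e:
    print(f"ERROR: {e}")
```

The design question in the thread is precisely whether the eventual PL/pgSQL syntax leaves room for these extra parameters without a painful grammar change later.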
Re: [HACKERS] Freezing without write I/O
Hi, On 2013-09-19 14:42:19 +0300, Heikki Linnakangas wrote: On 18.09.2013 16:22, Andres Freund wrote: * Why can we do a GetOldestXmin(allDbs = false) in BeginXidLSNRangeSwitch()? Looks like a bug. I think I got the arguments backwards, was supposed to be allDbs = true and ignoreVacuum = false. I'm not sure if vacuums could be ignored, but 'false' is the safe option. Not sure either... * the lsn ranges file can possibly become bigger than 512bytes (the size we assume to be written atomically) and you write it inplace. If we fail halfway through writing, we seem to be able to recover by using the pageMatureLSN from the last checkpoint, but it seems better to do the fsync(),rename(),fsync() dance either way. The lsn-range array is also written to the xlog in toto whenever it changes, so on recovery, the ranges will be read from the WAL, and the ranges file will be recreated at the end-of-recovery checkpoint. But we will read the file before we read the WAL record covering it, won't we? * Should we preemptively freeze tuples on a page in lazy_scan_heap if we already have dirtied the page? That would make future modifcations cheaper. Hmm, dunno. Any future modification will also dirty the page, so you're not actually saving any I/O by freezing it earlier. You're just choosing to do the CPU work and WAL insertion earlier than you have to. And if the page is not modified later, it is a waste of cycles. That said, maybe it would indeed be better to do it in vacuum when you have a chance to reduce latency in the critical path, even if it might be a waste. Well, if we already dirtied the page (as in vacuum itself, to remove dead tuples) we already have to do WAL. True it's a separate record, but that should probably be fixed. * lazy_scan_heap now blocks acquiring a cleanup lock on every buffer that contains dead tuples. Shouldn't we use some kind of cutoff xid there? That might block progress too heavily. Also the comment above it still refers to the old logic. 
Hmm, you mean like skip the page even if there are some dead tuples on it, as long as the dead tuples are not older than X xids? I guess we could do that. Or the other thing mentioned in the comments, i.e. remember the page and come back to it later. For now though, I think it's good enough as it is. I have seen vacuum being slowed down considerably because we couldn't acquire a cleanup lock. It's not that infrequent to have pages that are pinned most of the time. And that was when we only acquired the cleanup lock when necessary for freezing. Not processing the page will not mark it as all-visible, which basically is remembering it... * switchFinishXmin and nextSwitchXid should probably be either volatile or have a compiler barrier between accessing shared memory and checking them. The compiler very well could optimize them away and access shmem all the time, which could lead to weird results. Hmm, that would be a weird optimization. Is that really legal for the compiler to do? Better safe than sorry, I guess. I think it is. The compiler doesn't know anything about shared memory or threads, so it doesn't see that those parameters can change. It will be forced to load them as soon as a non-inlined function is called, which actually might make it safe in this case - unless you compile in LTO mode, where it recognizes that TransactionIdFollowsOrEquals() won't modify things... * I wonder whether the fact that we're doing the range switches after acquiring an xid could be problematic if we're preventing xid allocation due to the checks earlier in that function? You mean that you might get into a situation where you cannot assign a new XID because you've reached xidStopLimit, and you cannot advance xidStopLimit because you can't switch to a new xid-lsn range, because no new XIDs are being assigned? Yeah, that would be nasty. The range management stuff should really be moved out of GetNewTransactionId(); maybe do it in the walwriter or bgwriter as Alvaro suggested.
Yes, I think we should do that. I am not sure yet where to put it best though. Doing it in the wal writer doesn't seem to be a good idea, doing more there will increase latency for normal backends. * I think heap_lock_tuple() needs to unset all-visible, otherwise we won't vacuum that page again which can lead to problems since we don't do full-table vacuums again? It's OK if the page is never vacuumed again, the whole point of the patch is that old XIDs can be just left lingering in the table. The next time the page is updated, perhaps to lock a tuple again, the page will be frozen and the old xmax will be cleared. Yes, you're right, it should be possible to make it work that way. But currently, we look at xmax and infomask of a tuple in heap_lock_tuple() *before* the PageUpdateNeedsFreezing() call. Currently we would thus create a new multixact with long dead xids as members and
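The xid-lsn range machinery under discussion can be modeled conceptually: each range entry says "XIDs from this value onward were assigned at or after this LSN", and a page whose LSN predates the maturity boundary is treated as all-frozen. This is a conceptual toy only - the patch keeps the array in shared memory and WAL-logs it - and all names and values below are made up:

```python
import bisect

# Hypothetical xid-lsn switch points, ascending by LSN: (lsn, first_xid).
ranges = [(0, 1000), (5000, 2000), (9000, 3000)]

def xid_range_for_lsn(page_lsn):
    """Find which xid range a page's tuples belong to, from the page LSN."""
    lsns = [lsn for lsn, _ in ranges]
    i = bisect.bisect_right(lsns, page_lsn) - 1  # last range starting <= page_lsn
    return ranges[i]

def page_is_mature(page_lsn, mature_lsn):
    # A page last modified before the maturity boundary contains only
    # tuples old enough to be considered frozen - no write I/O needed.
    return page_lsn < mature_lsn

print(xid_range_for_lsn(6000))     # (5000, 2000)
print(page_is_mature(4000, 5000))  # True
```

The deadlock scenario quoted above is then easy to state: if advancing the boundary itself requires a new XID, and new XIDs are blocked on the boundary advancing, the lookup structure can never gain a new entry.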
Re: [HACKERS] record identical operator
Hannu Krosing ha...@2ndquadrant.com wrote: the patch solves the general problem of when the table changes, refresh After saying it like this, the problem could also be solved by including xmin(s) for rows from underlying table(s)in the matview. Would this be a better approach ? Now you're moving from REFRESH territory into incremental maintenance. There have been a very large number of papers written on that topic, I have reviewed the literature, and I have picked a technique which looks like it will be fast and reliable -- based on relational algebra. Let's save discussion of alternatives such as you're suggesting here, for when I get past the easy stuff ... like just refreshing a view so that the new contents are the same as what you would see if you re-ran the query defining the matview. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FW: REVIEW: Allow formatting in log_line_prefix
Something is weird in your latest patch. The header is: diff -u -r b/postgresql/doc/src/sgml/config.sgml a/postgresql/doc/src/sgml/config.sgml --- b/postgresql/doc/src/sgml/config.sgml 2013-09-09 23:40:52.356371191 +1200 +++ a/postgresql/doc/src/sgml/config.sgml 2013-09-19 22:13:26.236681468 +1200 This doesn't apply with patch -p1 or -p0. (You need -p2.) Your previous patch in this thread seemed OK. You might want to check what you did there.
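For reference, `-pN` tells patch(1) to strip N leading path components from the file names in the diff header, which is why a header path of `a/postgresql/doc/src/sgml/config.sgml` only applies with `-p2` from the source tree root. A quick sketch of the stripping rule:

```python
def strip_components(path, n):
    """Mimic patch(1)'s -pN option: drop the first n slash-separated components."""
    return "/".join(path.split("/")[n:])

header_path = "a/postgresql/doc/src/sgml/config.sgml"
print(strip_components(header_path, 1))  # postgresql/doc/src/sgml/config.sgml
print(strip_components(header_path, 2))  # doc/src/sgml/config.sgml  <- matches the tree
```

With the conventional `a/doc/src/sgml/...` header that `git diff` produces, `-p1` would have sufficed; the extra `postgresql/` component is what broke it.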
[HACKERS] information schema parameter_default implementation
On 15 September 2013 01:35, Peter Eisentraut pete...@gmx.net wrote: Here is an updated patch which fixes the bug you have pointed out. On Thu, 2013-01-31 at 18:59 +0500, Ali Dar wrote: I checked out your patch. There seems to be an issue when we have OUT parameters after the DEFAULT values. Fixed. Some other minor observations: 1) Some variables are not aligned in pg_get_function_arg_default(). Are you referring to indentation issues? I think the indentation is good, but pgindent will fix it anyway. I find only the proc variable not aligned. Adding one more tab for all the variables should help. 2) I found the following check a bit confusing, maybe you can make it better: if (!argmodes || argmodes[i] == PROARGMODE_IN || argmodes[i] == PROARGMODE_INOUT || argmodes[i] == PROARGMODE_VARIADIC) Factored that out into a separate helper function. The statement needs to be wrapped to 80-column width. 2) inputargn can be assigned in the declaration. I'd prefer to initialize it close to where it is used. Me too. 3) A function-level comment for pg_get_function_arg_default() is missing. I think the purpose of the function is clear. Unless a reader goes through the definition, it is not obvious whether the second function argument represents the parameter position among the input parameters or the position among *all* the function parameters (i.e. proargtypes or proallargtypes). I think at least a one-line description of what this function does should be present. 4) You can also add comments inside the function, for example a comment for the line: nth = inputargn - 1 - (proc->pronargs - proc->pronargdefaults); Suggestion? Yes, it is difficult to understand the logic behind this calculation until one reads the documentation for pg_proc.proargdefaults. I think this should be sufficient: /* * proargdefaults elements always correspond to the last N input arguments, * where N = pronargdefaults. So calculate the nth_default accordingly.
*/ 5) I think the line added in the documentation (information_schema.sgml) is very long. Consider revising. Maybe change from: The default expression of the parameter, or null if none or if the function is not owned by a currently enabled role. TO The default expression of the parameter, or null if none was specified. It will also be null if the function is not owned by a currently enabled role. I don't know what exactly you mean by: function is not owned by a currently enabled role? I think this style is used throughout the documentation of the information schema. We need to keep the descriptions reasonably compact, but I'm willing to entertain other opinions. This requires first an answer to my earlier question about why the role-related privilege is needed. --- There should be an initial check to see if nth_arg is passed a negative value. It is used as an index into the argmodes array, so a negative array index can cause a crash. Because pg_get_function_arg_default() can now be called by a user also, we need to validate the input values. I am OK with returning an Invalid argument error. --- Instead of: deparse_expression_pretty(node, NIL, false, false, 0, 0) you can use: deparse_expression(node, NIL, false, false) We are not using pretty printing anyway. --- Other than this, I have no more issues. --- After the final review is over, catversion.h requires the appropriate date value.
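The calculation being discussed - proargdefaults holds expressions only for the last pronargdefaults input arguments - can be checked with a toy model. Field names follow pg_proc; the helper function itself is illustrative, not the patch's code:

```python
def nth_default_index(inputargn, pronargs, pronargdefaults):
    """Map a 1-based input-argument position to an index into
    proargdefaults, which covers only the last N input arguments
    (N = pronargdefaults). Returns None if the argument has no default."""
    nth = inputargn - 1 - (pronargs - pronargdefaults)
    return nth if nth >= 0 else None

# f(a, b, c DEFAULT 1, d DEFAULT 2): pronargs=4, pronargdefaults=2
print(nth_default_index(3, 4, 2))  # 0 -> first stored default (for c)
print(nth_default_index(4, 4, 2))  # 1 -> second stored default (for d)
print(nth_default_index(1, 4, 2))  # None -> a has no default
```

The `None` branch also shows why the negative-index check requested above matters: positions before the first defaulted argument must be rejected, not used as an array index.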
Re: [HACKERS] record identical operator
On 09/19/2013 02:41 PM, Kevin Grittner wrote: Hannu Krosing ha...@2ndquadrant.com wrote: the patch solves the general problem of when the table changes, refresh After saying it like this, the problem could also be solved by including xmin(s) for rows from underlying table(s)in the matview. Would this be a better approach ? Now you're moving from REFRESH territory into incremental maintenance. There have been a very large number of papers written on that topic, I have reviewed the literature, and I have picked a technique which looks like it will be fast and reliable -- based on relational algebra. Let's save discussion of alternatives such as you're suggesting here, for when I get past the easy stuff ... like just refreshing a view so that the new contents are the same as what you would see if you re-ran the query defining the matview. I'm sure that comparing xmin records would work exactly similar to binary comparisons of records for detecting possible change in 99.999% of real-world cases. I am also pretty sure it would have its own cans of worms all over again when trying to actually implement this in matviews :) Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Freezing without write I/O
On 2013-09-19 14:40:35 +0200, Andres Freund wrote: * I think heap_lock_tuple() needs to unset all-visible, otherwise we won't vacuum that page again which can lead to problems since we don't do full-table vacuums again? It's OK if the page is never vacuumed again, the whole point of the patch is that old XIDs can be just left lingering in the table. The next time the page is updated, perhaps to lock a tuple again, the page will be frozen and the old xmax will be cleared. Yes, you're right, it should be possible to make it work that way. But currently, we look at xmax and infomask of a tuple in heap_lock_tuple() *before* the PageUpdateNeedsFreezing() call. Currently we would thus create a new multixact with long dead xids as members and such. That reminds me of something: There are lots of checks sprayed around that unconditionally look at xmin, xmax or infomask. Things I noticed in a quick look after thinking about the previous statment: * AlterEnum() looks at xmin * The PLs look at xmin * So does afair VACUUM FREEZE, COPY and such. * Several HeapTupleSatisfiesVacuum() callers look at xmin and stuff without checking for maturity or freezing the page first. * lazy_scan_heap() itself if the page isn't already marked all_visible * heap_page_is_all_visible() * rewrite_heap_tuple() does an unconditional heap_freeze_tuple() without considering maturity/passing in Invalid* * heap_page_prune_opt() shouldn't do anything on a mature (or even all visible) page. * heap_page_prune() marks a buffer dirty, writes a new xid without changing the lsn. * heap_prune_chain() looks at tuples without considering page maturity, but the current implementation actually looks safe. * CheckForSerializableConflictOut() looks at xmins. * pgrowlocks() looks at xmax unconditionally. * heap_update() computes, just like heap_lock_tuple(), a new xmax/infomask before freezing. * Same for heap_lock_updated_tuple(). Not sure if that's an actual concern, but it might if PageMatureLSN advances or such. 
* heap_getsysattr() should probably return different things for mature pages. There's probably more. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for fail-back without fresh backup
On Thu, Sep 19, 2013 at 7:32 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Sep 19, 2013 at 7:07 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Thu, Sep 19, 2013 at 12:25 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Sep 19, 2013 at 11:48 AM, Sawada Masahiko sawada.m...@gmail.com wrote: I attached the patch which I have modified. Thanks for updating the patch! Here are the review comments: Thank you for reviewing! I got the compiler warning: syncrep.c:112: warning: unused variable 'i' How does synchronous_transfer work with synchronous_commit? Currently synchronous_transfer doesn't work when synchronous_commit is set to 'off' or 'local'. If a user changes the synchronous_commit value within a transaction, the checkpointer process can't see it. Because of that, even if synchronous_commit is changed from 'on' to 'off', synchronous_transfer doesn't work. I'm planning to modify the patch so that synchronous_transfer is not affected by synchronous_commit. Hmm... when synchronous_transfer is set to data_flush, IMO the intuitive behaviors are: (1) synchronous_commit = on A data flush should wait for the corresponding WAL to be flushed in the standby. (2) synchronous_commit = remote_write A data flush should wait for the corresponding WAL to be written to the OS in the standby. (3) synchronous_commit = local (4) synchronous_commit = off A data flush should wait for the corresponding WAL to be written locally in the master. That is a good idea. So the synchronous_commit value needs to be visible to other processes. To share it, I will try to put the synchronous_commit value into shared memory. Is there already a GUC parameter that is shared with other processes? I tried to find such a parameter, but could not. Regards, --- Sawada Masahiko
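Fujii's four cases amount to a lookup from the synchronous_commit setting to the point a data flush must wait for. As a table-in-code, purely illustrative of the proposal rather than actual PostgreSQL behavior:

```python
# What a data flush should wait for under synchronous_transfer=data_flush,
# per Fujii Masao's cases (1)-(4) in the thread. Illustrative mapping only.
wait_point_for_data_flush = {
    "on":           "WAL flushed on the standby",
    "remote_write": "WAL written to the OS on the standby",
    "local":        "WAL written locally on the master",
    "off":          "WAL written locally on the master",
}

print(wait_point_for_data_flush["remote_write"])
```

Note that 'local' and 'off' collapse to the same (master-only) wait point, which is why the checkpointer needs visibility into each backend's effective synchronous_commit value to pick the right row.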
Re: [HACKERS] logical changeset generation v6
On Tue, Sep 17, 2013 at 10:31 AM, Andres Freund and...@2ndquadrant.com wrote: Rebased patches attached. I spent a bit of time looking at these patches yesterday and today. It seems to me that there's a fair amount of stylistic cleanup that is still needed, and some pretty bad naming choices, and some FIXMEs that should probably be fixed, but for an initial review email it seems more helpful to focus on high-level design, so here goes. - Looking specifically at the 0004 patch, I think that the RecentGlobalDataXmin changes are logically separate from the rest of the patch, and that if we're going to commit them at all, it should be separate from the rest of this. I think this is basically a performance optimization. AFAICS, there's no correctness problem with the idea of continuing to maintain a single RecentGlobalXmin; it's just that you'll potentially end up with quite a lot of bloat. But that's not an argument that those two things HAVE to be committed together; either could be done first, and independently of the other. Also, these changes are quite complex and it's difficult to form a judgement as to whether this idea is safe when they are intermingled with the rest of the logical replication stuff. More generally, the thing that bugs me about this approach is that logical replication is not really special, but what you've done here MAKES it special. There are plenty of other situations where we are too aggressive about holding back xmin. A single-statement transaction that selects from a single table holds back xmin for the entire cluster, and that is Very Bad. It would probably be unfair to say, well, you have to solve that problem first. But on the other hand, how do we know that the approach you've adopted here isn't going to make the more general problem harder to solve? It seems invasive at a pretty low level.
I think we should at least spend some time thinking about what *general* solutions to this problem would look like and then decide whether this approach is likely to be forward-compatible with those solutions. - There are no changes to the doc directory. Obviously, if you're going to add a new value for the wal_level GUC, it's gonna need to be documented. Similarly, pg_receivellog needs to be documented. In all likelihood, you also need a whole chapter providing general background on this technology. A couple of README files are not going to do it, and those files aren't suitable for check-in anyway (e.g. DESIGN.txt makes reference to a URL where the current version of some patch can be found; that's not appropriate for permanent documentation). But aside from that, what we really need here is user documentation, not developer documentation. I can perhaps pass judgement on whether the guts of this functionality do things that are fundamentally unsafe, but whether the user interface is good or bad is a question that deserves broader input, and without documentation, most people aren't going to understand it well enough to know whether they like it. And TBH, *I* don't really want to reverse-engineer what pg_receivellog does from a read-through of the code, either. - Given that we don't reassemble transactions until commit time, why do we need to ensure that XIDs are logged before their sub-XIDs appear in WAL? As I've said previously, I actually think that on-the-fly reassembly is probably going to turn out to be very important. But if we're not getting that, do we really need this? Also, while I'm trying to keep this email focused on high-level concerns, I have to say that guaranteedlyLogged has got to be one of the worst variable names I've ever seen, starting (but not ending) with the fact that guaranteedly is not a word. I'm also tempted to say that all of the wal_level=logical changes should be split out as their own patch, separate from the decoding stuff.
Personally, I would find that much easier to review, although I admit it's less critical here than for the RecentGlobalDataXmin stuff. - If XLOG_HEAP_INPLACE is not decoded, doesn't that mean that this facility can't be used to replicate a system catalog table? Is that a restriction we should enforce/document somehow? - The new code is rather light on comments. decode.c is extremely light. For example, consider the function DecodeAbort(), which contains the following comment: + /* +* this is a bit grotty, but if we're faking an abort we've already gone +* through +*/ Well, I have no idea what that means. I'm sure you do, but I bet the next person who isn't you that tries to understand this probably won't. It's also probably true that I could figure it out if I spent more time on it, but I think the point of comments is to keep the amount of time that must be spent trying to understand code to a manageable level. Generally, I'd suggest that any non-trivial functions in these files should have a header comment explaining what their job is; e.g. for DecodeStandbyOp you
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Tue, Sep 17, 2013 at 2:27 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote: Robert == Robert Haas robertmh...@gmail.com writes: Someone should do the same in WaitForBackgroundWorkerStartup so that building with -Werror works. Robert I don't get a warning there. Can you be more specific about Robert the problem? bgworker.c: In function 'WaitForBackgroundWorkerStartup': bgworker.c:866: warning: 'pid' may be used uninitialized in this function Does the attached patch fix it for you? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company bgworker-wait-fix.patch (attachment)
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
On Wed, Sep 18, 2013 at 10:09 PM, Daniel Farina dan...@heroku.com wrote: I'm not sure how many of you have been tracking this but courtesy of lwn.net I have learned that it seems that the OOM killer behavior in Linux 3.12 will be significantly different. And by description, it sounds like an improvement. I thought some people reading -hackers might be interested. Based on the description at lwn, excerpted below, it sounds like the news might be that systems with overcommit on might return OOM when a non-outlandish request for memory is made from the kernel. Johannes Weiner has posted a set of patches aimed at improving this situation. Following a bunch of cleanup work, these patches make two fundamental changes to how OOM conditions are handled in the kernel. The first of those is perhaps the most visible: it causes the kernel to avoid calling the OOM killer altogether for most memory allocation failures. In particular, if the allocation is being made in response to a system call, the kernel will just cause the system call to fail with an ENOMEMerror rather than trying to find a process to kill. That may cause system call failures to happen more often and in different contexts than they used to. But, naturally, that will not be a problem since all user-space code diligently checks the return status of every system call and responds with well-tested error-handling code when things go wrong. Subject to experiment, this may be some good news, as many programs, libraries, and runtime environments that may run parallel to Postgres on a machine are pretty lackadaisical about limiting the amount of virtual memory charged to them, and overcommit off is somewhat punishing in those situations if one really needed a large hash table from Postgres or whatever. I've seen some cases here where a good amount of VM has been reserved and caused apparent memory pressure that cut throughput short of what should ought to be possible. Yes, that does sound good. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical changeset generation v6
Hi Robert, On 2013-09-19 10:02:31 -0400, Robert Haas wrote: On Tue, Sep 17, 2013 at 10:31 AM, Andres Freund and...@2ndquadrant.com wrote: Rebased patches attached. I spent a bit of time looking at these patches yesterday and today. It seems to me that there's a fair amount of stylistic cleanup that is still needed, and some pretty bad naming choices, and some FIXMEs that should probably be fixed, but for an initial review email it seems more helpful to focus on high-level design, so here goes. Thanks for looking at it. Yes, I think the highlevel stuff is the important bit. As you note, the documentation needs to be written and that's certainly not a small task. But doing so before the highlevel design is agreed upon makes it too likely that it will need to be entirely scrapped. - Looking specifically at the 0004 patch, I think that the RecentGlobalDataXmin changes are logically separate from the rest of the patch, and that if we're going to commit them at all, it should be separate from the rest of this. I think this is basically a performance optimization. AFAICS, there's no correctness problem with the idea of continuing to maintain a single RecentGlobalXmin; it's just that you'll potentially end up with quite a lot of bloat. But that's not an argument that those two things HAVE to be committed together; either could be done first, and independently of the other. Also, these changes are quite complex and it's different to form a judgement as to whether this idea is safe when they are intermingled with the rest of the logical replication stuff. Up until v3 the RecentGlobalDataXmin stuff wasn't included and reviewers (primarily Peter G. on -hackers and Greg Stark at pgconf.eu) remarked on that and considered it critical. I argued for a considerable amount of time that it shouldn't be done in an initial patch and then gave in. They have a point though, if you e.g. 
replicate a pgbench -j16 workload the addition of RecentGlobalDataXmin reduces the performance impact of replication from about 60% to less than 5% in my measurements. Turns out heap pruning is damn critical for that kind of workload. More generally, the thing that bugs me about this approach is that logical replication is not really special, but what you've done here MAKES it special. There are plenty of other situations where we are too aggressive about holding back xmin. A single-statement transaction that selects from a single table holds back xmin for the entire cluster, and that is Very Bad. It would probably be unfair to say, well, you have to solve that problem first. But on the other hand, how do we know that the approach you've adopted here isn't going to make the more general problem harder to solve? It seems invasive at a pretty low level. The reason why I think it's actually different is that the user actually has control over how long transactions are running on the primary. They don't really control how fast a replication consumer consumes and how often it sends feedback messages. I think we should at least spend some time thinking about what *general* solutions to this problem would like like and then decide whether this is approach is likely to be forward-compatible with those solutions. I thought about the general case for a good bit and decided that all solutions that work in a more general scenario are complex enough that I don't want to implement them. And I don't really see any backward compatibility concerns here - removing the logic of using a separate horizon for user tables in contrast to system tables is pretty trivial and shouldn't have any external effect. Except pegging the horizon more, but that's what the new approach would fix, right? - Given that we don't reassemble transactions until commit time, why do we need to to ensure that XIDs are logged before their sub-XIDs appear in WAL? 
Currently it's important to know where the oldest transaction that is still alive started, to determine from where we need to restart decoding. That's done by keeping a lsn-ordered list of in-progress toplevel transactions. The above invariant makes it cheap to maintain that list. As I've said previously, I actually think that on-the-fly reassembly is probably going to turn out to be very important. But if we're not getting that, do we really need this? It's also preparatory for supporting that. I agree that it's pretty important, but after actually having implemented a replication solution using this, I still think that most use cases won't use it when available. I plan to work on implementing that. Also, while I'm trying to keep this email focused on high-level concerns, I have to say that guaranteedlyLogged has got to be one of the worst variable names I've ever seen, starting (but not ending) with the fact that guaranteedly is not a word. I'm also tempted to say that all of the wal_level=logical changes should be split out as their own patch, separate from the
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
On Thu, Sep 19, 2013 at 9:12 AM, Robert Haas robertmh...@gmail.com wrote: But, naturally, that will not be a problem since all user-space code diligently checks the return status of every system call and responds with well-tested error-handling code when things go wrong. That just short circuited my sarcasm detector. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
On Thu, Sep 19, 2013 at 11:30 AM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Sep 19, 2013 at 9:12 AM, Robert Haas robertmh...@gmail.com wrote: But, naturally, that will not be a problem since all user-space code diligently checks the return status of every system call and responds with well-tested error-handling code when things go wrong. That just short circuited my sarcasm detector. I laughed, too, but the reality is that at least as far as PG is concerned it's probably a truthful statement, and if it isn't, nobody here is likely to complain about having to fix it. Yeah, there's a lot of other code out there not as well written or maintained as PG, but using SIGKILL as a substitute for ENOMEM because people might not be checking the return value for malloc() is extremely heavy-handed nannyism. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
On 2013-09-19 18:23:07 +0200, Dimitri Fontaine wrote: I've been told at several instances that this has been made for the JVM and other such programs that want to allocate huge amounts of memory even if they don't really intend to use it. That's not really related - what you describe is memory overcommitting (which has lots of uses besides JVMs). That's not removed by the changes referenced upthread. What has changed is how to react to situations where memory has been overcommitted but is now actually needed. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
On Thu, Sep 19, 2013 at 12:02 PM, Andres Freund and...@2ndquadrant.com wrote: The problem is that it's not just about malloc() (aka brk() and mmap()) and friends. It's about many of the other system calls. Like e.g. send() to name one of the more likely ones. *shrug* If you're using send() and not testing for a -1 return value, you're writing amazingly bad code anyway. And if you ARE testing for -1, you'll probably do something at least mildly sensible with a not-specifically-foreseen errno value, like print a message that includes %m. That's about what we'd probably do, and I have to imagine what most people would do. I'm not saying it won't break anything to return a proper error code; I'm just saying that sending SIGKILL is worse. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
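The check-for--1-then-report-%m pattern Robert describes can be sketched in a few lines of C. This is a hypothetical helper written for illustration, not PostgreSQL's actual socket code; the function name is invented:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Minimal sketch of the error-handling pattern under discussion:
 * test send() for -1, and report any not-specifically-foreseen
 * errno value with its strerror() text (the equivalent of %m).
 */
static ssize_t
send_or_report(int sockfd, const void *buf, size_t len)
{
    ssize_t n = send(sockfd, buf, len, 0);

    if (n == -1)
    {
        /* unforeseen errno values land here and still get a message */
        fprintf(stderr, "send failed: %s\n", strerror(errno));
    }
    return n;
}
```

Under the kernel change being discussed, an allocation failure inside send() would surface through this generic branch as an errno with a printable message, rather than the process being SIGKILLed.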
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
On 2013-09-19 11:49:05 -0400, Robert Haas wrote: On Thu, Sep 19, 2013 at 11:30 AM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Sep 19, 2013 at 9:12 AM, Robert Haas robertmh...@gmail.com wrote: But, naturally, that will not be a problem since all user-space code diligently checks the return status of every system call and responds with well-tested error-handling code when things go wrong. That just short circuited my sarcasm detector. I laughed, too, but the reality is that at least as far as PG is concerned it's probably a truthful statement, and if it isn't, nobody here is likely to complain about having to fix it. Yeah, there's a lot of other code out there not as well written or maintained as PG, but using SIGKILL as a substitute for ENOMEM because people might not be checking the return value for malloc() is extremely heavy-handed nannyism. The problem is that it's not just about malloc() (aka brk() and mmap()) and friends. It's about many of the other system calls. Like e.g. send() to name one of the more likely ones. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] logical changeset generation v6
On Thu, Sep 19, 2013 at 10:43 AM, Andres Freund and...@2ndquadrant.com wrote: - Looking specifically at the 0004 patch, I think that the RecentGlobalDataXmin changes are logically separate from the rest of the patch, and that if we're going to commit them at all, it should be separate from the rest of this. I think this is basically a performance optimization. AFAICS, there's no correctness problem with the idea of continuing to maintain a single RecentGlobalXmin; it's just that you'll potentially end up with quite a lot of bloat. But that's not an argument that those two things HAVE to be committed together; either could be done first, and independently of the other. Also, these changes are quite complex and it's difficult to form a judgement as to whether this idea is safe when they are intermingled with the rest of the logical replication stuff. Up until v3 the RecentGlobalDataXmin stuff wasn't included and reviewers (primarily Peter G. on -hackers and Greg Stark at pgconf.eu) remarked on that and considered it critical. I argued for a considerable amount of time that it shouldn't be done in an initial patch and then gave in. They have a point though, if you e.g. replicate a pgbench -j16 workload the addition of RecentGlobalDataXmin reduces the performance impact of replication from about 60% to less than 5% in my measurements. Turns out heap pruning is damn critical for that kind of workload. No question. I'm not saying that that optimization shouldn't go in right after the main patch does, but IMHO right now there are too many things going in the 0004 patch to discuss them all simultaneously. I'd like to find a way of splitting this up that will let us block-and-tackle individual pieces of it, even if we end up committing them all one right after the other. But that raises an interesting question: why is the overhead so bad?
I mean, this shouldn't be any worse than having a series of transactions running concurrently with pgbench that take a snapshot and hold it for as long as it takes the decoding process to decode the most-recently committed transaction. Is the issue here that we can't advance xmin until we've fsync'd the fruits of decoding down to disk? If so, that's mighty painful. But we'd really only need to hold back xmin in that situation when some catalog change has occurred meanwhile, which for pgbench means never. So something seems fishy here. I thought about the general case for a good bit and decided that all solutions that work in a more general scenario are complex enough that I don't want to implement them. And I don't really see any backward compatibility concerns here - removing the logic of using a separate horizon for user tables in contrast to system tables is pretty trivial and shouldn't have any external effect. Except pegging the horizon more, but that's what the new approach would fix, right? Hmm, maybe. Also, while I'm trying to keep this email focused on high-level concerns, I have to say that guaranteedlyLogged has got to be one of the worst variable names I've ever seen, starting (but not ending) with the fact that guaranteedly is not a word. I'm also tempted to say that all of the wal_level=logical changes should be split out as their own patch, separate from the decoding stuff. Personally, I would find that much easier to review, although I admit it's less critical here than for the RecentGlobalDataXmin stuff. I can do that again and it actually was that way in the past. But there's no user for it before the later patch and it's hard to understand the reasoning for the changed wal logging separately, that's why I merged it at some point. OK. If I'm committing it, I'd prefer to handle that piece separately, if possible. - If XLOG_HEAP_INPLACE is not decoded, doesn't that mean that this facility can't be used to replicate a system catalog table? 
Is that a restriction we should enforce/document somehow? Currently catalog tables aren't replicated, yes. They simply are skipped during decoding. XLOG_HEAP_INPLACE isn't the primary reason for that though. Do you see a use case for it? I can imagine someone wanting to do it, but I think we can live with it not being supported. - It still bothers me that we're going to have mandatory slots for logical replication and no slots for physical replication. Why are slots mandatory in one case and not even allowable in the other? Is it sensible to think about slotless logical replication - or maybe I should say, is it any LESS sensible than slotless physical replication? Well, as you know, I do want to have slots for physical replication as well. But there actually is a fundamental difference why we need it for logical rep and not for physical: In physical replication, if the xmin progresses too far, client queries will be cancelled. Annoying but not fatal. In logical replication we will not be able to continue replicating since we cannot decode the WAL
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
Andres Freund and...@2ndquadrant.com writes: What has changed is how to react to situations where memory has been overcommitted but is now actually needed. Sure. You either have a failure at malloc() or at usage; overcommit is all about never failing at malloc(), but now you have to deal with OOM conditions in creative ways, like with the OOM Killer. Anyways, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Performance problem in PLPgSQL
We installed the PostgreSQL 9.3.0 server from the FreeBSD ports http://svnweb.freebsd.org/ports/head/databases/postgresql93-server/ and the administrator says that it already contains this patch.
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Tue, Sep 17, 2013 at 7:04 PM, Andres Freund and...@2ndquadrant.com wrote: 1. We're not in a huge hurry to ensure that sinval notifications are delivered in a timely fashion. We know that sinval resets are bad, so if a backend is getting close to needing a sinval reset, we kick it in an attempt to get it to AcceptInvalidationMessages(). But if the sinval queue isn't filling up, there's no upper bound on the amount of time that can pass before a particular sinval is read. Therefore, the amount of time that passes before an idle backend is forced to drain the sinval queue can vary widely, from a fraction of a second to minutes, hours, or days. So it's kind of unappealing to think about making user-visible behavior dependent on how long it ends up taking. Well, when we're signalling it's certainly faster than waiting for the other's snapshot to vanish which can take ages for normal backends. And we can signal when we wait for consumption without too many problems. Also, I think in most of the use cases we can simply not wait for any of the idle backends, those don't use the old definition anyway. Possibly. It would need some thought, though. I am pretty sure there's quite a bit to improve around sinvals but I think any replacement would look surprisingly similar to what we have. So I think doing it incrementally is more realistic. And I am certainly scared by the thought of having to replace it without breaking corner cases all over. I guess I was more thinking that we might want some parallel mechanism with somewhat different semantics. But that might be a bad idea anyway. On the flip side, if I had any clear idea how to adapt the current mechanism to suck less, I would have done it already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Some interesting news about Linux 3.12 OOM
Robert Haas robertmh...@gmail.com writes: I laughed, too, but the reality is that at least as far as PG is concerned it's probably a truthful statement, and if it isn't, nobody here is likely to complain about having to fix it. Yeah, there's a lot of other code out there not as well written or maintained as PG, but using SIGKILL as a substitute for ENOMEM because people might not be checking the return value for malloc() is extremely heavy-handed nannyism. I've been told at several instances that this has been made for the JVM and other such programs that want to allocate huge amounts of memory even if they don't really intend to use it. Back in the day that amount could well be greater than the actual amount of physical memory available. So the only way to allow Java applications on Linux was, as I've been told, to implement OOM. And as the target was the desktop, well, have it turned on by default. Now, I liked that story enough to never actually try and check about it, so if someone knows for real why the linux kernel appears so stupid in its choice of implementing OOM and turning it on by default… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Not In Foreign Key Constraint
2013/9/19 David Johnston pol...@yahoo.com Misa Simic wrote I guess that rule can be achieved with triggers on TableA and TableC - but the same is true for FK (and an FK constraint is more effective than a trigger - that is why I wonder whether it would be useful/achievable to create that kind of constraint) Thoughts, ideas? You create a common "keys in use" table and only insert a record into the main tables if you can successfully add the desired key to the shared keys table (as a unique value). Set up a normal FK to that table to help enforce that valid records must exist on the keys table. Not fool-proof but you only need to worry about insertions - delete from the pk table to remove the record from the main table and free up the key. David J. Thanks David, Yes, that is one of the ways that goal can be achieved via triggers (or by letting someone else worry that the key is inserted/updated/deleted in the master table first...) A constraint should be a more effective way... (It shouldn't be confused with an FK constraint - even if it is its opposite of a kind... - it was just the simplest way to describe the feature) And it should ensure that every row in the table is valid from the moment it is created (which a trigger can't ensure - a constraint does, or the constraint can't be created, etc.) Thanks, Misa
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
On Thu, Sep 19, 2013 at 12:52 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote: It compiles without error and looks ok... Thanks for checking. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Robert == Robert Haas robertmh...@gmail.com writes: bgworker.c: In function 'WaitForBackgroundWorkerStartup': bgworker.c:866: warning: 'pid' may be used uninitialized in this function Robert Does the attached patch fix it for you? It compiles without error and looks ok... -- Andrew (irc:RhodiumToad)
Re: [HACKERS] proposal: lob conversion functionality
Hello here is the patch Regards Pavel 2013/9/19 Pavel Stehule pavel.steh...@gmail.com 2013/9/19 Rushabh Lathia rushabh.lat...@gmail.com Hi Pavel, I have reviewed your patch. -- Patch got applied cleanly (using patch -p1) -- Make and Make install work fine -- make check looks good I did a code-walk and it looks good. Also did some manual testing and haven't found any issue with the implementation. The patch introduces two new APIs, load_lo() and make_lo(), for loading and saving from/to large objects. When it comes to dropping an lo object created using make_lo(), this still depends on the older API lo_unlink(). I think we should add that into the documentation for clarification. As a user of the lo functions, when I started testing this new API, the first question that came to mind was why a delete_lo() or destroy_lo() API is missing. Later I realized that you need to use the older lo_unlink() API for that functionality. So I feel it's good to document that. Do let me know what you think? good idea I'll send an updated patch this evening Otherwise the patch looks nice and clean. Thank you :) Regards Pavel Regards, Rushabh Lathia www.EnterpriseDB.com On Sun, Aug 25, 2013 at 8:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello here is a patch it introduces load_lo and make_lo functions

postgres=# select make_lo(decode('ff00','hex'));
 make_lo
─────────
   24629
(1 row)

Time: 40.724 ms
postgres=# select load_lo(24628);
 load_lo
─────────
 \xff00
(1 row)

postgres=# \lo_import ~/avatar.png
lo_import 24630
postgres=# select md5(load_lo(24630));
               md5
──────────────────────────────────
 513f60836f3b625713acaf1c19b6ea78
(1 row)

postgres=# \q
bash-4.1$ md5sum ~/avatar.png
513f60836f3b625713acaf1c19b6ea78  /home/pavel/avatar.png

Regards Pavel Stehule 2013/8/22 Jov am...@amutu.com +1 badly need the large object and bytea conversion functions. Once I had to use the ugly pg_read_file() to put some text into pg; I tried to use large objects but found them useless without a function to convert a large object to bytea.
Jov blog: http://amutu.com/blog 2013/8/10 Pavel Stehule pavel.steh...@gmail.com Hello I had to enhance my older project, where XML documents are parsed and created on the server side - in PLpgSQL and PLPerl procedures. We would like to use the LO API for client/server communication, but we have to parse/serialize LOs on the server side. I found that there is no simple API for working with LOs from PL without access to the file system. I had to use ugly hacks:

CREATE OR REPLACE FUNCTION parser.save_as_lob(bytea)
RETURNS oid AS $$
DECLARE
  _loid oid;
  _substr bytea;
BEGIN
  _loid := lo_creat(-1);
  FOR i IN 0..length($1)/2048
  LOOP
    _substr := substring($1 FROM i * 2048 + 1 FOR 2048);
    IF _substr <> '' THEN
      INSERT INTO pg_largeobject(loid, pageno, data)
        VALUES(_loid, i, _substr);
    END IF;
  END LOOP;
  EXECUTE format('GRANT SELECT ON LARGE OBJECT %s TO ohs', _loid);
  RETURN _loid;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER STRICT SET search_path = 'pg_catalog';

and

CREATE OR REPLACE FUNCTION fbuilder.attachment_to_xml(attachment oid)
RETURNS xml AS $$
DECLARE
  b_cum bytea = '';
  b bytea;
BEGIN
  FOR b IN SELECT l.data
             FROM pg_largeobject l
            WHERE l.loid = attachment_to_xml.attachment
            ORDER BY l.pageno
  LOOP
    b_cum := b_cum || b;
  END LOOP;
  IF NOT FOUND THEN
    RETURN NULL;
  ELSE
    RETURN xmlelement(NAME attachment, encode(b_cum, 'base64'));
  END IF;
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER SET search_path = 'pg_catalog';

These functions could be simplified if we supported functions like encode and decode for LOs. So my proposal is creating functions: * lo_encode(loid oid) .. returns bytea * lo_encode(loid oid, encoding text) .. returns text * lo_make(loid oid, data bytea) * lo_make(loid oid, data text, encoding text) This can simplify all transformations between LOs and VARLENA. A known limit is 1G for varlena, but that is still relatively high. Notes, comments?
Regards Pavel -- Rushabh Lathia load_lo_v2.patch Description: Binary data
Re: [HACKERS] Minor inheritance/check bug: Inconsistent behavior
Marking this as Ready for committer. Thanks a ton for reviewing the patch. I rewrote the comments for this patch and fixed the incorrect formatting in parse_relation.c. It used spaces instead of tabs, which is why if you look at the patch file you'll see that the alignment looked off. See new version, attached. However, I have a few residual questions: 1. Does copy.c also need to be fixed? I see that it also calls ExecConstraints(), and I don't see it setting tableOid anywhere. We certainly want COPY and INSERT to behave the same way. 2. Should constraints also allow access to the oid column? Right now you can do that but it doesn't seem to work, e.g.:

rhaas=# create table foo (a int, check (oid::integer % 2 = 0)) with oids;
CREATE TABLE
rhaas=# insert into foo values (1);
INSERT 16404 1
rhaas=# insert into foo values (1);
INSERT 16405 1
rhaas=# insert into foo values (1);
INSERT 16406 1
rhaas=# select oid, * from foo;
  oid  | a
-------+---
 16404 | 1
 16405 | 1
 16406 | 1
(3 rows)

Just prohibiting that might be fine; it doesn't seem that useful anyway. But if we're setting the table OID, maybe we should set the OID too, and then just allow this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company sys_col_constr_v3.patch Description: Binary data
Re: [HACKERS] Not In Foreign Key Constraint
Misa Simic wrote Hi hackers, I just wonder how hard it would be to implement something like a Not In FK Constraint, or the opposite of an FK... A more useful couple of next sentences would be along the lines of: I have this problem ... I've approached it by doing ... but it seems that an actual database-enforced constraint would be a better solution. Is that something that has been considered? Are there other ways of attacking this problem I have not considered? You took quite a bit of time to try and start a discussion, and I get that you don't necessarily know where it is going to lead, but Not In FK constraint, with a descriptive sentence or two, likely would have been enough to get the ball rolling. Instead you devoted more space to technical clarification that would have been better spent expounding on what the problem is and how current approaches to dealing with it are limited. A more specific end-question would also help solicit better responses. I say all this because 3 days later nothing more substantial than "why is this feature necessary" has been put forth. The general idea likely has some merit but you've not provided anything for people to hook their teeth into. David J.
Re: [HACKERS] Where to load modules from?
On Wed, Sep 18, 2013 at 12:53 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-09-18 08:46:08 -0400, Robert Haas wrote: Here's another idea. At initdb time, create an empty directory called pg_you_can_load_stuff_from_here (pick a better name) inside $PGDATA. Allow it to be replaced with a symlink. This would be similar to what we do today with pg_xlog. In fact, you can imagine an equivalent of initdb -X that does something precisely analogous. This feels a bit more natural to me than a GUC. I think I'd prefer a GUC that allows specifying multiple directories that are searched in order to a single symlinked directory. Why? I ask because I have the opposite preference, based on the precedent of pg_xlog. Also, aren't symlinks an absolute PITA to manipulate by hand on windows? Maybe so, but if that's an issue here it's a preexisting issue also. I think we shouldn't burden this patch with fixing it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Dead code or buggy code?
On Wed, Sep 18, 2013 at 6:20 PM, Greg Stark st...@mit.edu wrote: The following code is in ProcSleep at proc.c:1138. GetBlockingAutoVacuumPgproc() should presumably always return a vacuum pgproc entry since the deadlock state says it's blocked by autovacuum. But I'm not really familiar enough with this codepath to know whether there's a race condition here where it can sometimes return null. The following code checks autovac != NULL, but the PGXACT initializer would have seg faulted if it returned NULL, if that's possible.

if (deadlock_state == DS_BLOCKED_BY_AUTOVACUUM &&
    allow_autovacuum_cancel)
{
    PGPROC     *autovac = GetBlockingAutoVacuumPgproc();
    PGXACT     *autovac_pgxact = &ProcGlobal->allPgXact[autovac->pgprocno];

    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

    /*
     * Only do it if the worker is not working to protect against Xid
     * wraparound.
     */
    if ((autovac != NULL) &&
        (autovac_pgxact->vacuumFlags & PROC_IS_AUTOVACUUM) &&
        !(autovac_pgxact->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
    {

Hmm, yeah. I remember noticing this some time ago but never got around to fixing it. +1 for rearranging things there somehow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
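The rearrangement itself is mechanical: don't compute the PGXACT pointer until after the NULL test, so the later check is no longer dead code. A self-contained illustration with hypothetical stand-in types and flag values (not the real proc.c structures):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for PGPROC/PGXACT, just to show the ordering. */
typedef struct { int pgprocno; } FakeProc;
typedef struct { int vacuumFlags; } FakeXact;

#define FLAG_IS_AUTOVACUUM   0x01
#define FLAG_FOR_WRAPAROUND  0x02

/*
 * The buggy shape computes xact = &all[proc->pgprocno] before testing
 * proc != NULL, which faults when proc is NULL and makes the later
 * NULL test dead code.  The rearranged shape tests first, then
 * dereferences.
 */
static bool
should_cancel(FakeProc *proc, FakeXact *all)
{
    FakeXact *xact;

    if (proc == NULL)        /* test before any dereference */
        return false;

    xact = &all[proc->pgprocno];

    return (xact->vacuumFlags & FLAG_IS_AUTOVACUUM) &&
           !(xact->vacuumFlags & FLAG_FOR_WRAPAROUND);
}
```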
Re: [HACKERS] [PERFORM] encouraging index-only scans
On Tue, Sep 17, 2013 at 7:10 PM, Andres Freund and...@2ndquadrant.com wrote: I generally think the current logic for triggering VACUUMs via autovacuum doesn't really make all that much sense in the days where we have the visibility map. Right now, whether or not to autovacuum is the result of a two-pronged test. The first prong is based on the number of updates and deletes relative to table size; that triggers a regular autovacuum. The second prong is based on age(relfrozenxid) and triggers a non-page-skipping vacuum (colloquially, an anti-wraparound vacuum). The typical case in which this doesn't work out well is when the table has a lot of inserts but few or no updates and deletes. So I propose that we change the first prong to count inserts as well as updates and deletes when deciding whether it needs to vacuum the table. We already use that calculation to decide whether to auto-analyze, so it wouldn't be very novel. We know that the work of marking pages all-visible will need to be done at some point, and doing it sooner will result in doing it in smaller batches, which seems generally good. However, I do have one concern: it might lead to excessive index-vacuuming. Right now, we skip the index vac step only if zero dead tuples are found during the heap scan. Even one dead tuple (or line pointer) will cause an index vac cycle, which may easily be excessive. So I further propose that we introduce a threshold for index-vac, so that we only do an index vac cycle if the number of dead tuples exceeds, say, 0.1% of the table size. Thoughts? Let the hurling of rotten tomatoes begin. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
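The two proposed changes can be sketched as C-style pseudologic. The function names, base threshold, and scale factor below are illustrative stand-ins, not the actual autovacuum.c code or GUC values:

```c
#include <stdbool.h>

/*
 * Sketch of the proposed triggers.  Today the "vacuum" prong counts
 * only updates + deletes; the proposal is to count inserts as well
 * (as the auto-analyze calculation already does), and to skip the
 * index-vacuum phase unless dead tuples exceed a small fraction of
 * the table.  Numbers here are illustrative.
 */
static bool
needs_autovacuum(double inserts, double updates, double deletes,
                 double reltuples)
{
    /* illustrative base threshold + scale factor */
    double threshold = 50.0 + 0.2 * reltuples;

    /* proposed: inserts count toward triggering a vacuum, too */
    return (inserts + updates + deletes) > threshold;
}

static bool
needs_index_vacuum(double dead_tuples, double reltuples)
{
    /* proposed: index vac only past a small dead-tuple threshold (0.1%) */
    return dead_tuples > 0.001 * reltuples;
}
```

With this shape, an insert-only table eventually triggers a vacuum (marking pages all-visible in small batches), while a heap scan that finds only a handful of dead line pointers no longer forces a full index-vacuum cycle.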
Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.
On Wed, Sep 18, 2013 at 6:42 PM, MauMau maumau...@gmail.com wrote: It seems to me that these two points here are the real core of your proposal. The rest is just syntactic sugar. No, those are desirable if possible features. What's important is to declare in the manual that PostgreSQL officially supports national character types, as I stated below. That may be what's important to you, but it's not what's important to me. I am not keen to introduce support for nchar and nvarchar as differently-named types with identical semantics. And I think it's an even worse idea to introduce them now, making them work one way, and then later change the behavior in a backward-incompatible fashion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Assertions in PL/PgSQL
On Sat, Sep 14, 2013 at 6:09 PM, Marko Tiikkaja ma...@joh.to wrote: On 2013-09-14 23:55, Pavel Stehule wrote: but introducing a reserved keyword for one very special purpose (without extensibility) is not prudent. How about using an existing keyword then? ASSERTION used to be reserved in the SQL standard but is unreserved in postgres. CHECK might work and is fully reserved. CONSTRAINT? IS? Personally, I'm pretty skeptical about the value of adding dedicated syntax for this. I mean, I'll be the first to admit that PL/pgsql is a clunky and awkward language. But somebody's always proposing something that they think will make it better, and I feel somehow that if we accept all of those proposals at face value, we'll just end up with a mess. So IMHO the bar for adding new syntax to PL/pgsql should be reasonably high. YMMV, of course, and probably does. The issue of how this is spelled is somewhat secondary for me. I think ASSERT is probably as good as anything. But let's not kid ourselves: even reserving this word only in PL/pgsql WILL break things for some users, and there are LOTS of people out there with LOTS of procedural code. Every tiny backward-incompatibility reduces by just a little bit the percentage of those people who can upgrade and increases the delay before they do. This is an area where past experience has made me quite wary. Maybe I'm worrying over nothing; this really is a pretty small change. But once bitten, twice shy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] record identical operator
On Wed, Sep 18, 2013 at 7:29 PM, Vik Fearing vik.fear...@dalibo.com wrote: On 09/19/2013 12:55 AM, Dimitri Fontaine wrote: We have, as a community, gone to a fair amount of trouble to make the concept of equality pluggable and allow multiple types of equality per type. To me it seems the perfect tool to solve this problem. Why the fuss? Because I don't understand why you need another equality than the default btree one, certainly. Basically because 'word'::citext and 'Word'::citext are the same to your brain but not to your eyeballs. Unique indexes, for example, only need to please your brain. Matviews need to please your eyeballs. Right, and well said. It's perfectly reasonable to want a unique index that doesn't allow both Kevin O'leary and Kevin O'Leary to be listed in the irish_names table, but it's also perfectly reasonable to want to remember that the user who entered the data spelled it the second way and not the first. And it's also reasonable to want any corrections made to the table to propagate to any materialized views defined over it. The fact that we have multiple concepts of equality can be confusing, but it's not for no reason, and we're not the only database or programming language to have such a concept. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Extend namespace of valid guc names
On Tue, Sep 17, 2013 at 6:56 PM, Andres Freund and...@2ndquadrant.com wrote: That's more or less what I figured. One thing I'm uncomfortable with is that, while this is useful for extensions, what do we do when we have a similar requirement for core? The proposed GUC name is of the format extension.user_defined_object_name.property_name; but omitting the extension name would lead to user_defined_object_name.property_name, which doesn't feel good as a method for naming core GUCs. The user defined object name might just so happen to be an extension name. I think that ship has long since sailed. postgresql.conf has allowed foo.bar style GUCs via custom_variable_classes for a long time, and these days we don't even require that but allow them generally. Also, SET, postgres -c, and SELECT set_config() already don't have the restriction to one dot in the variable name. Well, we should make it consistent, but I'm not sure that settles the question of which direction to go. If it's not a good fit for pg_hba.conf, why is it a good fit for anything else we want to do? Well, for now it's primarily there for extensions, not for core code. Although somebody has already asked when I'd submit the patch because they could use it for their patch... I don't really find the pg_hba.conf comparison terribly convincing. As an extension, how would you prefer to formulate the configuration in the example? Well, to me it looks like what you've got there is a table. And I don't have a clear feeling of how that ought to be represented in configuration-land, but trying to jam it into the GUC machinery seems awkward at best. I think we need a way of handling tabular configuration data, but I'm not convinced this is it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Range types do not display in pg_stats
create table tstztest( trange tstzrange );
postgres=# insert into tstztest select tstzrange(t, t + interval '1 month') from generate_series('2012-01-01'::timestamptz,'2018-01-01','1 month') as gs(t);
INSERT 0 73
postgres=# analyze tstztest;
ANALYZE
postgres=# select * from pg_stats where tablename = 'tstztest';
 schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation | most_common_elems | most_common_elem_freqs | elem_count_histogram
------------+-----------+---------+-----------+-----------+-----------+------------+------------------+-------------------+------------------+-------------+-------------------+------------------------+----------------------
 public     | tstztest  | trange  | f         | 0         | 22        | -1         |                  |                   |                  |             |                   |                        |
(1 row)

Now, there actually *is* a histogram for the column, which you can find via pg_statistic. But it shows up as NULL in the pg_stats view. If this is a known issue, we ought to at least add it to the docs. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Dump/Reload broken with relocatable extensions
I don't know if this has been discussed before; a cursory search of the archives didn't turn up anything interesting. I perhaps didn't put in the right keywords. Unfortunately, the easiest way that I've found to reproduce this bug is to apply the attached patch to the unaccent extension. This isn't a bug with that extension, it was just the simplest. With that patch applied, the following steps will put a database in a state which cannot be dumped/restored.

create extension unaccent with schema public;
create schema s;
create table s.t (v text check (public.no_accents(v)));
insert into s.t values ('a');

The problem is that the no_accents(text) function, which belongs to the unaccent extension, calls unaccent(text), also belonging to the unaccent extension. If a table living in a different schema from that of the extension has a CHECK constraint using the function, the dump/restore behavior of setting the search_path will cause restoration to fail. At first I thought the solution would be to have all functions of an extension have a custom search_path equal to that of the extension, but that doesn't really work because it would cause too many undesirable side effects. Another solution could be to postpone adding constraints until after everything's been set up, but that seems a bit unwieldy. My preferred solution at the moment is to invent a special $extension schema analogous to the $user schema. It wouldn't be implicit like the $user one, but an extension could call one of its own functions as $extension.funcname(). I'm not sure what should happen if the caller isn't part of an extension. I'm leaning towards a "schema does not exist" error. This has grammar issues, though, that $user doesn't have. The original problem came from a CHECK constraint on the PostGIS _raster_constraint_pixel_types(raster) function which calls st_bandmetadata(raster, int[]). I thought that a simple patch to the in-core extension unaccent would be more practical. 
I am not proposing adding this patch to the unaccent extension. -- Vik

*** a/contrib/unaccent/unaccent--1.0.sql
--- b/contrib/unaccent/unaccent--1.0.sql
***************
*** 32,34 ****
  CREATE TEXT SEARCH DICTIONARY unaccent (
--- 32,39 ----
  TEMPLATE = unaccent,
  RULES    = 'unaccent'
  );
+
+ CREATE FUNCTION no_accents(text)
+ RETURNS boolean
+ AS 'select $1 = unaccent($1);'
+ LANGUAGE sql STABLE STRICT;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Where to load modules from?
Robert Haas robertmh...@gmail.com writes: I think I'd prefer a GUC that allows specifying multiple directories that are searched in order to a single symlinked directory. Why? I ask because I have the opposite preference, based on the precedent of pg_xlog. I understand Andres preference, as it would allow a management somewhat comparable to PATH or LD_LIBRARY_PATH here. In an effort to reach consensus, what about having both, with the GUC being empty by default? That way you have a default per-cluster place where to stuff binaries to be loaded, and a GUC to manage finer settings if needs be. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PERFORM] encouraging index-only scans
Robert Haas robertmh...@gmail.com wrote: Right now, whether or not to autovacuum is the result of a two-pronged test. The first prong is based on the number of updates and deletes relative to table size; that triggers a regular autovacuum. The second prong is based on age(relfrozenxid) and triggers a non-page-skipping vacuum (colloquially, an anti-wraparound vacuum). The typical case in which this doesn't work out well is when the table has a lot of inserts but few or no updates and deletes. So I propose that we change the first prong to count inserts as well as updates and deletes when deciding whether it needs to vacuum the table. We already use that calculation to decide whether to auto-analyze, so it wouldn't be very novel. We know that the work of marking pages all-visible will need to be done at some point, and doing it sooner will result in doing it in smaller batches, which seems generally good. However, I do have one concern: it might lead to excessive index-vacuuming. Right now, we skip the index vac step only if ZERO dead tuples are found during the heap scan. Even one dead tuple (or line pointer) will cause an index vac cycle, which may easily be excessive. So I further propose that we introduce a threshold for index-vac, so that we only do an index vac cycle if the number of dead tuples exceeds, say, 0.1% of the table size. +1 I've been thinking of suggesting something along the same lines, for the same reasons. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
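The two-pronged trigger test and the proposed index-vac threshold can be sketched roughly as follows. This is only an illustration of the logic under discussion: the struct, field names, and parameter names are invented here (the real machinery lives in autovacuum.c and uses autovacuum_vacuum_threshold/autovacuum_vacuum_scale_factor), and the 0.1% figure is the "say, 0.1%" from the proposal, not a settled value.

```c
#include <stdbool.h>

/* Invented stand-in for the stats the autovacuum daemon consults. */
typedef struct TableStats
{
    double reltuples;   /* estimated live tuples in the table */
    double n_ins;       /* tuples inserted since last vacuum */
    double n_upd;       /* tuples updated since last vacuum */
    double n_del;       /* tuples deleted since last vacuum */
    double n_dead;      /* dead tuples seen during the heap scan */
} TableStats;

/* First prong, amended as proposed: count inserts too, not just
 * updates and deletes, when deciding whether to vacuum. */
static bool
needs_autovacuum(const TableStats *s, double base_thresh, double scale_factor)
{
    double thresh = base_thresh + scale_factor * s->reltuples;

    return (s->n_ins + s->n_upd + s->n_del) > thresh;
}

/* Proposed index-vac threshold: only run the index pass when dead
 * tuples exceed 0.1% of the table size, instead of running it as
 * soon as a single dead tuple is found. */
static bool
needs_index_vacuum(const TableStats *s)
{
    return s->n_dead > 0.001 * s->reltuples;
}
```

With these rules, an insert-only table still gets vacuumed (and its pages marked all-visible) once enough inserts accumulate, while a heap scan that finds only a handful of dead line pointers skips the index pass.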
Re: [HACKERS] Dump/Reload broken with relocatable extensions
Vik Fearing vik.fear...@dalibo.com writes: I don't know if this has been discussed before, a cursory search of the archives didn't turn up anything interesting. I perhaps didn't put in the right keywords. For others not to spend too much time on this: it seems like a problem with the extension not abiding by the rules about its relocatable property and the @extschema@ thingy. http://www.postgresql.org/docs/9.3/static/extend-extensions.html#AEN54999 Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Where to load modules from?
On 2013-09-19 22:56:52 +0200, Dimitri Fontaine wrote: Robert Haas robertmh...@gmail.com writes: I think I'd prefer a GUC that allows specifying multiple directories that are searched in order to a single symlinked directory. Why? I ask because I have the opposite preference, based on the precedent of pg_xlog. Because I want to specify multiple paths. E.g. one with modules for a specific postgres version, one for the cluster and one for my development directory. Now we could recursively search a directory that contains symlinks to directories, but that seems ugly. I understand Andres preference, as it would allow a management somewhat comparable to PATH or LD_LIBRARY_PATH here. In an effort to reach consensus, what about having both, with the GUC being empty by default? That way you have a default per-cluster place where to stuff binaries to be loaded, and a GUC to manage finer settings if needs be. Well, we can have the guc have a default value of $datadir/pg_lib or such. But using two independent mechanisms seems like a bad idea to me. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
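The PATH-style semantics Andres wants — try each configured directory in order, load the first match — could look something like the sketch below. Everything here is invented for illustration (the GUC, the function names, the colon separator); the lookup for existing files is stubbed out with a list so the sketch stays self-contained.

```c
#include <stdbool.h>
#include <string.h>

/* Is 'path' among the files we pretend exist? Stands in for a
 * stat() call so the sketch stays self-contained. */
static bool
path_exists(const char *path, const char *const *existing, int n)
{
    for (int i = 0; i < n; i++)
        if (strcmp(path, existing[i]) == 0)
            return true;
    return false;
}

/*
 * Try each directory of a colon-separated search path in order,
 * PATH-style, writing the first hit into 'out'. A per-version
 * directory listed before a per-cluster one thus wins, which is
 * the layering described in the mail above.
 */
static bool
find_module(const char *search_path, const char *modname,
            const char *const *existing, int n_existing,
            char *out, size_t outlen)
{
    const char *p = search_path;

    while (*p)
    {
        const char *colon = strchr(p, ':');
        size_t dirlen = colon ? (size_t) (colon - p) : strlen(p);

        if (dirlen > 0 && dirlen + 1 + strlen(modname) + 1 <= outlen)
        {
            memcpy(out, p, dirlen);
            out[dirlen] = '/';
            strcpy(out + dirlen + 1, modname);
            if (path_exists(out, existing, n_existing))
                return true;
        }
        p = colon ? colon + 1 : p + dirlen;
    }
    return false;
}
```

A single symlinked directory, by contrast, collapses this ordering decision into filesystem state, which is what the multiple-paths camp objects to.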
Re: [HACKERS] Freezing without write I/O
On Thu, Sep 19, 2013 at 2:42 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: * switchFinishXmin and nextSwitchXid should probably be either volatile or have a compiler barrier between accessing shared memory and checking them. The compiler very well could optimize them away and access shmem all the time which could lead to weird results. Hmm, that would be a weird optimization. Is that really legal for the compiler to do? Better safe than sorry, I guess.

C doesn't really have a multi-threaded memory model before C11, so the compiler makers have just made one up that suits them best. For unlocked memory reads the compiler is free to assume that no one else is accessing the variable, so yes, that optimization would be legal according to their rules. I'm tackling similar issues in my patch. What I'm thinking is that we should change our existing supposedly atomic accesses to be more explicit and make the API compatible with C11 atomics [1]. For now I think the changes should be something like this:

* Have separate typedefs for atomic variants of datatypes (I don't think we have a whole lot of them). With C11 atomics support, this would change to:

typedef _Atomic TransactionId AtomicTransactionId;

* Require that pg_atomic_init(type, var, val) be used to init atomic variables. Initially it would just pass through to assignment, as all supported platforms can do 32bit atomic ops. C11 compatible compilers will delegate to atomic_init().

* Create atomic read and write macros that look something like:

#define pg_atomic_load(type, var) (*((volatile type*) (var)))

and

#define pg_atomic_store(type, var, val) do { \
    *((volatile type*) (var)) = (val); \
} while(0)

C11 would pass through to the compiler's implementation with relaxed memory ordering.

* Rename pg_read_barrier()/pg_write_barrier()/pg_memory_barrier() to pg_acquire_fence()/pg_release_fence()/pg_acq_rel_fence(). Herb Sutter makes a convincing argument why loosening up the barrier semantics is the sane choice here. 
[2] C11 support can then pass through to atomic_thread_fence(). This way we have a relatively future proof baseline for lockfree programming, with options to expand with other primitives. We could also do only the volatile access macros part; at least it would make the intention clearer than the current approach of sprinkling around volatile pointers. Regards, Ants Aasma [1] http://en.cppreference.com/w/c/atomic [2] (long video about atomics) http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2 -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
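To make the shape of the proposal concrete, here is a minimal compilable sketch of the pre-C11 fallback variants of the proposed macros. The pg_atomic_* names follow Ants's mail above, not any existing PostgreSQL API, and a C11 build would instead map them onto atomic_load_explicit()/atomic_store_explicit() with memory_order_relaxed.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/* Pre-C11 fallback: force a real load/store through a
 * volatile-qualified pointer so the compiler cannot cache the value
 * in a register or elide the shared-memory access. Note this gives
 * no ordering guarantees by itself; fences are still separate. */
#define pg_atomic_load(type, var)  (*((volatile type *) (var)))
#define pg_atomic_store(type, var, val) \
    do { *((volatile type *) (var)) = (val); } while (0)
```

With this, code reading a shared xid would write something like `TransactionId x = pg_atomic_load(TransactionId, &shared_xid);` rather than dereferencing a plain pointer, making the intended atomicity explicit at the call site.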
Re: [HACKERS] [RFC] Extend namespace of valid guc names
On 2013-09-19 14:57:53 -0400, Robert Haas wrote: On Tue, Sep 17, 2013 at 6:56 PM, Andres Freund and...@2ndquadrant.com wrote: I think that ship has long since sailed. postgresql.conf has allowed foo.bar style GUCs via custom_variable_classes for a long time, and these days we don't even require that but allow them generally. Also, SET, postgres -c, and SELECT set_config() already don't have the restriction to one dot in the variable name. Well, we should make it consistent, but I'm not sure that settles the question of which direction to go. I suggested making it consistent in either form elsewhere in this thread, Tom wasn't convinced. I think because of backward compatibility concerns we hardly can restrict the valid namespace in all forms to the most restrictive form (i.e. guc-file.l). Quite some people use GUCs as variables. Maybe we can forbid the more absurd variants (SET ... = 3; SELECT set_config('...', '1', true)), but restricting it entirely seems to cause pain for no reason. If it's not a good fit for pg_hba.conf, why is it a good fit for anything else we want to do? Well, for now it's primarily there for extensions, not for core code. Although somebody has already asked when I'd submit the patch because they could use it for their patch... I don't really find the pg_hba.conf comparison terribly convincing. As an extension, how would you prefer to formulate the configuration in the example? Well, to me it looks like what you've got there is a table. And I don't have a clear feeling of how that ought to be represented in configuration-land, but trying to jam it into the GUC machinery seems awkward at best. I think we need a way of handling tabular configuration data, but I'm not convinced this is it. Well, it has two major advantages from my pov: a) we already have it from SQL and people already use it that way b) it's dirt simple and it doesn't allow anything not allowed before via other means. 
Maybe you can invent something nicer to look at that's also parsed before the server starts, but I find it hard to see the need TBH. Especially as you will want most of the infrastructure GUCs provide... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
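The relaxed rule under discussion — accept any dotted sequence of identifiers for a custom variable, rather than exactly one qualifier — could be expressed as something like this sketch. It is purely illustrative: the real check lives in the GUC parsing code (guc-file.l and friends) and its exact token rules differ.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Accept names of the form ident(.ident)+, i.e. identifiers
 * separated by single dots, with at least one dot (custom/extension
 * variables are always qualified; unqualified names are reserved
 * for core GUCs).
 */
static bool
valid_custom_guc_name(const char *name)
{
    int parts = 0;
    const char *p = name;

    while (*p)
    {
        /* each component must start with a letter or underscore */
        if (!(isalpha((unsigned char) *p) || *p == '_'))
            return false;
        while (isalnum((unsigned char) *p) || *p == '_')
            p++;
        parts++;

        if (*p == '.')
        {
            p++;
            if (*p == '\0')     /* trailing dot */
                return false;
        }
        else if (*p != '\0')    /* stray character */
            return false;
    }
    return parts >= 2;
}
```

Under this rule, extension.user_defined_object_name.property_name is accepted the same way SET and set_config() already accept it, while foo..bar and trailing dots are rejected.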
Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
2013/9/16 Satoshi Nagayasu sn...@uptime.jp (2013/07/06 1:16), Pavel Stehule wrote: I am sending a patch that removes strict requirements for DROP IF EXISTS statements. This behavior is similar to our ALTER IF EXISTS behavior now.

postgres=# DROP CAST IF EXISTS (sss AS public.casttesttype);
NOTICE: types sss and public.casttesttype does not exist, skipping
DROP CAST
postgres=# DROP FUNCTION IF EXISTS public.pt_in_widget(point, widget);
NOTICE: function public.pt_in_widget(point, widget) does not exist, skipping
DROP FUNCTION
postgres=# DROP OPERATOR IF EXISTS public.% (point, widget);
NOTICE: operator public.% does not exist, skipping
DROP OPERATOR
postgres=# DROP TRIGGER test_trigger_exists ON no_such_table;
ERROR: relation no_such_table does not exist
postgres=# DROP TRIGGER IF EXISTS test_trigger_exists ON no_such_table;
NOTICE: trigger test_trigger_exists for table no_such_table does not exist, skipping
DROP TRIGGER

This functionality is necessary for a correct, quiet reload from a dump without warnings. I'm looking at this patch, and I have a question here. Should DROP TRIGGER IF EXISTS ignore the error for a non-existing trigger and a non-existing table? Or only for a non-existing trigger? My opinion is that both variants should be ignored - it should be fully fault tolerant in this use case. Regards Pavel I've not understood the pg_restore issue precisely so far, but IMHO DROP TRIGGER IF EXISTS means if the _trigger_ exists, not if the _table_ exists. Is this a correct and/or an expected behavior? Sorry if I missed some consensus which we already made. Any comments? Regards, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp
Re: [HACKERS] record identical operator - Review
On 09/12/2013 06:27 PM, Kevin Grittner wrote: Attached is a patch for a bit of infrastructure I believe to be necessary for correct behavior of REFRESH MATERIALIZED VIEW CONCURRENTLY as well as incremental maintenance of matviews. Here is an attempt at a review of the v1 patch. There has been a heated discussion on list about whether we even want this type of operator. I will try to summarize it here.

The arguments against it are:
* a record_image_identical call on two records that returns false can still return true under the equality operator, and the records might (or might not) appear to be the same to users
* once we expose this as an operator via SQL, someone will find it and use it, and then complain when we change things such as the internal representation of a datatype, which might break their queries
* the differences between = and record_image_identical might not be understood by everyone who finds the operator, leading to confusion
* using a criterion other than equality (the = operator provided by the datatype) might cause problems if we later provide 'on change' triggers for materialized views

The arguments for this patch are:
* we want the materialized view to return the same value as would be returned if the query were executed directly. This not only means that the values should be the same according to a datatype's = operator but that they should appear the same 'to the eyeball'.
* supporting the materialized view refresh check via SQL makes a lot of sense, but doing that requires exposing something via SQL
* if we adequately document what we mean by record_image_identical and the operator we pick for this, then users shouldn't be surprised at what they get if they use this

I think there is agreement that better (as in more obscure) operators than === and !== need to be picked, and we need to find a place in the user docs to warn users of the behaviour of these operators. 
Hannu has proposed *==* (binary equal, surely very equal by any other definition as well) and !==* (maybe not equal -- binary inequal, may still be equal by other comparison methods), and no one has yet objected to this. My vote would be to update the patch to use those operator names and then figure out where we can document them. If it means we have to add a section describing the equality operator for records as well, then maybe we should do that so we can get on with things.

Code Review

The record_image_eq and record_image_cmp functions are VERY similar. It seems to me that you could merge the meat of these functions into a common function (that isn't exposed via SQL) that then behaves differently in 2 or 3 spots, based on a parameter that controls whether it detoasts the tuple for the compare or returns false on the equals. Beyond that I don't see any issues with the code. You call out a question about whether two records being compared that have a different number of columns should always be an error, or only an error when they match on the columns they do have in common. The current behaviour is:

select * FROM r1,r2 where r1==r2;
 a | b | a | b | c
---+---+---+---+---
 1 | 1 | 1 | 2 | 1
(1 row)

update r2 set b=1;
UPDATE 1
test=# select * FROM r1,r2 where r2==r1;
ERROR: cannot compare record types with different numbers of columns

This seems like a bad idea to me. I don't like that I get a type comparison error only sometimes, based on the values of the data in the column. If I'm a user testing code that compares two records with this operator, and I test my query with one value pair and it works, then I'd naively expect to get a true or false on all other value sets of the same record type. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
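As a rough illustration of the merge suggested in the code review above — one common comparison walk with a parameter controlling eq-vs-cmp behavior — consider this sketch. The field layout and names are invented; the real functions operate on heap tuples and must additionally handle nulls, toast, and per-column type checks.

```c
#include <stdbool.h>
#include <string.h>

/* Invented flat representation of one record field's byte image. */
typedef struct FieldImage
{
    const void *data;
    int         len;
} FieldImage;

/*
 * Common byte-image comparison walk. With want_cmp = true it returns
 * -1/0/1 like record_image_cmp would; with want_cmp = false it only
 * has to report equal (0) or not equal (nonzero), like
 * record_image_eq, so it can stop at the first difference without
 * establishing an ordering.
 */
static int
record_image_walk(const FieldImage *a, const FieldImage *b,
                  int nfields, bool want_cmp)
{
    for (int i = 0; i < nfields; i++)
    {
        int shorter = a[i].len < b[i].len ? a[i].len : b[i].len;
        int c = memcmp(a[i].data, b[i].data, shorter);

        /* equal prefixes: the shorter image sorts first */
        if (c == 0 && a[i].len != b[i].len)
            c = (a[i].len < b[i].len) ? -1 : 1;
        if (c != 0)
            return want_cmp ? (c < 0 ? -1 : 1) : 1;
    }
    return 0;
}
```

The SQL-visible eq and cmp entry points would then be thin wrappers over this walk, differing only in the want_cmp flag and in where detoasting happens.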
Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.
From: Robert Haas robertmh...@gmail.com That may be what's important to you, but it's not what's important to me. National character types support may be important to some potential users of PostgreSQL and the popularity of PostgreSQL, not me. That's why national character support is listed in the PostgreSQL TODO wiki. We might be losing potential users just because their selection criteria include national character support. I am not keen to introduce support for nchar and nvarchar as differently-named types with identical semantics. Similar examples already exist:

- varchar and text: the only difference is the existence of an explicit length limit
- numeric and decimal
- int and int4, smallint and int2, bigint and int8
- real/double precision and float

In addition, the SQL standard itself admits: "The key words NATIONAL CHARACTER are used to specify the character type with an implementation-defined character set. Special syntax (N'string') is provided for representing literals in that character set. ... NATIONAL CHARACTER is equivalent to the corresponding character string type with a specification of CHARACTER SET CSN, where CSN is an implementation-defined character set name. A national character string literal is equivalent to a character string literal with the N replaced by introducer character set specification, where character set specification is an implementation-defined character set name." And I think it's an even worse idea to introduce them now, making them work one way, and then later change the behavior in a backward-incompatible fashion. I understand your feeling. The concern about incompatibility can be eliminated by thinking the following way. How about this?

- NCHAR can be used with any database encoding.
- At first, NCHAR is exactly the same as CHAR. That is, the implementation-defined character set described in the SQL standard is the database character set. 
- In the future, the character set for NCHAR can be selected at database creation, like Oracle's CREATE DATABASE ... NATIONAL CHARACTER SET AL16UTF16. The default is the database character set.

Could you tell me what kind of specification we should implement if we officially support national character types? Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PERFORM] encouraging index-only scans
On 2013-09-19 14:39:43 -0400, Robert Haas wrote: On Tue, Sep 17, 2013 at 7:10 PM, Andres Freund and...@2ndquadrant.com wrote: I generally think the current logic for triggering VACUUMs via autovacuum doesn't really make all that much sense in the days where we have the visibility map. Right now, whether or not to autovacuum is the result of a two-pronged test. The first prong is based on the number of updates and deletes relative to table size; that triggers a regular autovacuum. The second prong is based on age(relfrozenxid) and triggers a non-page-skipping vacuum (colloquially, an anti-wraparound vacuum). And I have some hopes we can get rid of that in 9.4 (that alone would be worth a bump to 10.0 ;)). I really like Heikki's patch, even if I am envious that I didn't have the idea :P. Although it needs quite a bit of work to be ready. The typical case in which this doesn't work out well is when the table has a lot of inserts but few or no updates and deletes. So I propose that we change the first prong to count inserts as well as updates and deletes when deciding whether it needs to vacuum the table. We already use that calculation to decide whether to auto-analyze, so it wouldn't be very novel. We know that the work of marking pages all-visible will need to be done at some point, and doing it sooner will result in doing it in smaller batches, which seems generally good. Yes, that's a desperately needed change. The reason I suggested keeping track of the xids of unremovable tuples is that the current logic doesn't handle that at all. We just unconditionally set n_dead_tuples to zero after a vacuum, even if not a single row could actually be cleaned out. Which has the effect that we will not start a vacuum until enough bloat (or, after changing this, new inserts) has collected to start a vacuum anew. Which then will do twice the work. 
Resetting n_dead_tuples to the actual remaining dead tuples wouldn't do much good either - we would just immediately trigger a new vacuum the next time we check, even if the xmin horizon is still the same. However, I do have one concern: it might lead to excessive index-vacuuming. Right now, we skip the index vac step only if ZERO dead tuples are found during the heap scan. Even one dead tuple (or line pointer) will cause an index vac cycle, which may easily be excessive. So I further propose that we introduce a threshold for index-vac, so that we only do an index vac cycle if the number of dead tuples exceeds, say, 0.1% of the table size. Yes, that's a pretty valid concern. But we can't really do it that easily.

a) We can only remove dead line pointers when we know there's no index pointing to them anymore. Which we only know after the index entries have been removed.

b) We cannot check the validity of an index pointer if there's no heap tuple for it. Sure, we could check whether we're pointing to a dead line pointer, but the random I/O costs of that are prohibitive.

Now, we could just mark line pointers as dead, not mark that page as all-visible, and pick it up again on the next vacuum cycle. But that would suck long-term. I think the only real solution here is to store the removed tuples' tids (i.e. items we've marked as dead) somewhere. Whenever we've found sufficient tuples to be removed from the indexes, we do phase 2. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Dead code or buggy code?
So I'm just going to make the code defensive and assume NULL is possible even if maybe it isn't. In case it's not clear, this is one of the things Xi Wang's tool picked up. There are not too many, but it turns out they are indeed not all in the adt code, so I might wait until after the commit fest to commit it to avoid causing bit churn. -- greg On 19 Sep 2013 12:52, Robert Haas robertmh...@gmail.com wrote: On Wed, Sep 18, 2013 at 6:20 PM, Greg Stark st...@mit.edu wrote: The following code is in ProcSleep at proc.c:1138. GetBlockingAutoVacuumPgproc() should presumably always return a vacuum pgproc entry, since the deadlock state says it's blocked by autovacuum. But I'm not really familiar enough with this codepath to know whether there's a race condition here where it can sometimes return null. The following code checks autovac != NULL, but the PGXACT initializer would have seg faulted if it returned NULL, if that's possible.

if (deadlock_state == DS_BLOCKED_BY_AUTOVACUUM &&
    allow_autovacuum_cancel)
{
    PGPROC *autovac = GetBlockingAutoVacuumPgproc();
    PGXACT *autovac_pgxact = &ProcGlobal->allPgXact[autovac->pgprocno];

    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

    /*
     * Only do it if the worker is not working to protect against Xid
     * wraparound.
     */
    if ((autovac != NULL) &&
        (autovac_pgxact->vacuumFlags & PROC_IS_AUTOVACUUM) &&
        !(autovac_pgxact->vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
    {

Hmm, yeah. I remember noticing this some time ago but never got around to fixing it. +1 for rearranging things there somehow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Looking for information on our elephant
Hi, I'm Looking for information on our elephant: especially how/why we chose elephant as our mascot. According to: http://wiki.postgresql.org/wiki/Logo The discussion was back to 1997 on the hackers list. Unfortunately the official archive for 1997 was lost. Mine seems to be gone too. Any information will be appreciated. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Looking for information on our elephant
Tatsuo, I have emails even from 1995! You can see that historical message here: http://www.pgsql.ru/db/mw/msg.html?mid=1238939

Author: yang( at )sjuphil( dot )sju( dot )edu
Date: 1997-04-03 20:36:33
Subject: Re: [HACKERS] PostgreSQL logo.

Some other ideas:
derivative: a sword (derivative of the Dragon book cover -- Postgres as a tool)
illustrative: a bowl of Alphabet Soup, with letters spelling out POSTGRESQL
obscure: a revolver/hit man (Grosse Pt is an anagram of Postgres, and an abbreviation of the title of the new John Cusack movie)
but if you want an animal-based logo, how about some sort of elephant? After all, as the Agatha Christie title read, elephants can remember ... David yangy...@sju.edu

Oleg

On Fri, Sep 20, 2013 at 3:16 AM, Tatsuo Ishii is...@postgresql.org wrote: Hi, I'm looking for information on our elephant: especially how/why we chose the elephant as our mascot. According to http://wiki.postgresql.org/wiki/Logo the discussion was back in 1997 on the hackers list. Unfortunately the official archive for 1997 was lost. Mine seems to be gone too. Any information will be appreciated. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.
On Mon, Sep 16, 2013 at 8:49 AM, MauMau maumau...@gmail.com wrote: 2. NCHAR/NVARCHAR columns can be used in non-UTF-8 databases and always contain Unicode data. ... 3. Store strings in UTF-16 encoding in NCHAR/NVARCHAR columns. Fixed-width encoding may allow faster string manipulation as described in Oracle's manual. But I'm not sure about this, because UTF-16 is not a real fixed-width encoding due to supplementary characters. It seems to me that these two points here are the real core of your proposal. The rest is just syntactic sugar. Let me start with the second one: I don't think there's likely to be any benefit in using UTF-16 as the internal encoding. In fact, I think it's likely to make things quite a bit more complicated, because we have a lot of code that assumes that server encodings have certain properties that UTF-16 doesn't - specifically, that any byte with the high-bit clear represents the corresponding ASCII character. Agreed. As to the first one, if we're going to go to the (substantial) trouble of building infrastructure to allow a database to store data in multiple encodings, why limit it to storing UTF-8 in non-UTF-8 databases? What about storing SHIFT-JIS in UTF-8 databases, or Windows-yourfavoriteM$codepagehere in UTF-8 databases, or any other combination you might care to name? Whether we go that way or not, I think storing data in one encoding in a database with a different encoding is going to be pretty tricky and require far-reaching changes. You haven't mentioned any of those issues or discussed how you would solve them. What about limiting NCHAR to databases whose encoding is the same as, or compatible with (i.e. has a defined encoding conversion to), the NCHAR encoding? That way, NCHAR text can be automatically converted from NCHAR to the database encoding on the server side, and we can treat NCHAR exactly the same as CHAR afterward.
I suppose the encoding used for NCHAR should be defined at initdb time or at creation of the database (if we allow the latter, we need to add a new column to record which encoding is used for NCHAR). For example, CREATE TABLE t1(t NCHAR(10)) will succeed if NCHAR is UTF-8 and the database encoding is UTF-8. It will even succeed if NCHAR is SHIFT-JIS and the database encoding is UTF-8, because there is a conversion between UTF-8 and SHIFT-JIS. However, it will not succeed if NCHAR is SHIFT-JIS and the database encoding is ISO-8859-1, because there is no conversion between them. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
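The compatibility check sketched above could plausibly be built on the existing pg_conversion catalog, which already records which encoding conversions are defined. This is only a sketch of the idea, not the proposed implementation; pg_conversion and pg_char_to_encoding() exist today, but the NCHAR behavior is hypothetical:

```sql
-- Sketch: does a conversion from SJIS to LATIN1 exist? Under the proposal,
-- an empty result here would mean CREATE TABLE t1(t NCHAR(10)) should fail
-- when NCHAR is SHIFT-JIS and the database encoding is ISO-8859-1.
SELECT c.conname
FROM pg_conversion c
WHERE c.conforencoding = pg_char_to_encoding('SJIS')
  AND c.contoencoding  = pg_char_to_encoding('LATIN1');
```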
Re: [HACKERS] Looking for information on our elephant
Oh great! Thank you Oleg! -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Range types do not display in pg_stats
On Thu, Sep 19, 2013 at 4:54 PM, Josh Berkus j...@agliodbs.com wrote:

create table tstztest( trange tstzrange );

postgres=# insert into tstztest
             select tstzrange(t, t + interval '1 month')
             from generate_series('2012-01-01'::timestamptz,'2018-01-01','1 month') as gs(t);
INSERT 0 73
postgres=# analyze tstztest;
ANALYZE
postgres=# select * from pg_stats where tablename = 'tstztest';
-[ RECORD 1 ]----------+---------
schemaname             | public
tablename              | tstztest
attname                | trange
inherited              | f
null_frac              | 0
avg_width              | 22
n_distinct             | -1
most_common_vals       |
most_common_freqs      |
histogram_bounds       |
correlation            |
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |

Now, there actually *is* a histogram for the column, which you can find via pg_statistic. But it shows up as NULL in the pg_stats view. If this is a known issue, we ought to at least add it to the docs.

It probably has to do with the CASE stakind stuff in the definition of the pg_stats view. Try \d+ pg_stats to see what I mean. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
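To see the histogram the pg_stats view hides, one can query pg_statistic directly and inspect the raw statistic slots; the view's CASE expressions only surface the stakind values they know about, so range-type slot kinds fall through to NULL. A sketch (the slot layout is the real pg_statistic catalog; which stakind numbers the range histograms use should be checked against STATISTIC_KIND_* in pg_statistic.h rather than taken from here):

```sql
-- Show all five statistic slot kinds for the trange column of tstztest.
-- Non-zero stakindN values not handled by pg_stats explain the NULLs above.
SELECT s.stakind1, s.stakind2, s.stakind3, s.stakind4, s.stakind5,
       s.stavalues1, s.stavalues2
FROM pg_statistic s
JOIN pg_class     c ON c.oid = s.starelid
JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum = s.staattnum
WHERE c.relname = 'tstztest' AND a.attname = 'trange';
```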
Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.
Hi, That may be what's important to you, but it's not what's important to me. National character type support may be important to some potential users of PostgreSQL and to the popularity of PostgreSQL, not to me. That's why national character support is listed in the PostgreSQL TODO wiki. We might be losing potential users just because their selection criteria include national character support. The whole NCHAR concept appeared as a hack for systems that did not have Unicode support from the beginning. It would not be needed if all text had been stored in UNICODE or UTF from the beginning and the idea of a character were the same as the idea of a rune rather than a byte. PostgreSQL has very powerful facilities for storing any kind of encoding. So maybe it makes sense to add ENCODING as another column property, the same way COLLATION was added? It would make it possible to have a database that talks to the clients in UTF8 and stores text and varchar data in the encoding that is most appropriate for the situation. It would make it impossible (or complicated) for the database to have a non-UTF8 default encoding (I wonder who would need that in this case), as conversions would not be possible from the broader charsets into the default database encoding. One could define an additional DATABASE property like LC_ENCODING that would work for the ENCODING property of a column like LC_COLLATE does for the COLLATE property of a column. Text operations should work automatically, as in memory all strings would be converted to the database encoding. This approach would also open the possibility of implementing custom ENCODINGs for the column data storage, like snappy compression or even BSON, gobs or protobufs for much more compact type storage. Regards, -- Valentine Gogichashvili
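The proposal above might look something like the following. To be clear, this syntax is entirely hypothetical: per-column ENCODING does not exist in PostgreSQL; the sketch only mirrors how COLLATE already attaches to a column:

```sql
-- HYPOTHETICAL syntax sketch for the per-column ENCODING proposal.
-- Database-level encoding stays UTF8 for client communication.
CREATE TABLE messages (
    id      serial PRIMARY KEY,
    body_en text,                              -- stored in the database encoding
    body_jp text ENCODING 'SJIS',              -- hypothetical: SHIFT-JIS on disk
    body_de text ENCODING 'LATIN1' COLLATE "de_DE"  -- hypothetical: LATIN1 on disk
);
-- In memory all values would be converted to the database encoding (UTF8),
-- so existing text operators and functions would work unchanged.
```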
Re: [HACKERS] [RFC] Extend namespace of valid guc names
On Thu, Sep 19, 2013 at 2:48 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-09-18 11:02:50 +0200, Andres Freund wrote: On 2013-09-18 11:55:24 +0530, Amit Kapila wrote: I think that ship has long since sailed. postgresql.conf has allowed foo.bar style GUCs via custom_variable_classes for a long time, and these days we don't even require that but allow them generally. Also, SET, postgres -c, and SELECT set_config() already don't have the restriction to one dot in the variable name. It's even explained in the documentation that a two-part name is allowed for Customized Options at this link: http://www.postgresql.org/docs/devel/static/runtime-config-custom.html Oh, I somehow missed that. I'll need to patch that as well. Updated patch attached. Old part of line: - PostgreSQL will accept a setting for any two-part parameter name. New part of line: + PostgreSQL will accept a setting for any parameter name containing at least one dot. If I read the new line just in the context of custom options, it makes sense, but when I compare it with the old line I think the lines below, or a variant of them, might make more sense, as this line is not restricted to just custom options: a. PostgreSQL will accept a setting for any two-or-more-part parameter name. b. PostgreSQL will accept a setting for any parameter name containing one or more dots. It's just a matter of how users interpret the line, so your wording may be more meaningful to some users. If you don't think there is any need to change it, then keep it as it is and let the committer decide. I don't have any big problem with the current wording. I think Robert is still not fully convinced about this patch, but from my side the review of the patch with the current scope is complete, so I will mark it as Ready for Committer if nobody objects. With Regards, Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
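For context, the customized-option behavior the thread describes can be exercised today; per the discussion above, SET and set_config() accept dotted names, including names with more than one dot:

```sql
-- Customized (dotted) GUC names, as described in runtime-config-custom.
SET myapp.current_user_id = '42';
SELECT current_setting('myapp.current_user_id');          -- '42'
SELECT set_config('myapp.trace.level', 'debug', false);   -- more than one dot
SHOW myapp.current_user_id;
```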
Re: [HACKERS] proposal: lob conversion functionality
On Thu, Sep 19, 2013 at 10:19 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, here is the patch. Looks good. Marking it as Ready for Committer. Regards Pavel 2013/9/19 Pavel Stehule pavel.steh...@gmail.com 2013/9/19 Rushabh Lathia rushabh.lat...@gmail.com Hi Pavel, I have reviewed your patch. -- Patch applied cleanly (using patch -p1) -- make and make install work fine -- make check looks good I did a code walk-through and it looks good. I also did some manual testing and haven't found any issue with the implementation. The patch introduces two new APIs, load_lo() and make_lo(), for loading and saving from/to large objects. When it comes to dropping an lo object created using make_lo(), this still depends on the older API lo_unlink(). I think we should add that to the documentation for clarification. As a user of the lo object functions, when I started testing this new API, the first question that came to mind was why a delete_lo() or destroy_lo() API is missing. Later I realized that the older lo_unlink() API is needed for that functionality. So I feel it's good to document that. Do let me know what you think. good idea, I'll send an updated patch this evening Otherwise the patch looks nice and clean. Thank you :) Regards Pavel Regards, Rushabh Lathia www.EnterpriseDB.com On Sun, Aug 25, 2013 at 8:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, here is a patch. It introduces load_lo and make_lo functions:

postgres=# select make_lo(decode('ff00','hex'));
 make_lo
─────────
   24629
(1 row)

Time: 40.724 ms
postgres=# select load_lo(24628);
 load_lo
─────────
 \xff00
(1 row)

postgres=# \lo_import ~/avatar.png
lo_import 24630
postgres=# select md5(load_lo(24630));
              md5
──────────────────────────────────
 513f60836f3b625713acaf1c19b6ea78
(1 row)

postgres=# \q
bash-4.1$ md5sum ~/avatar.png
513f60836f3b625713acaf1c19b6ea78  /home/pavel/avatar.png

Regards Pavel Stehule 2013/8/22 Jov am...@amutu.com +1 We badly need large object to bytea conversion functions.
Once I had to use the ugly pg_read_file() to put some text into pg; I tried to use large objects but found them useless without a function to convert a large object to bytea. Jov blog: http://amutu.com/blog 2013/8/10 Pavel Stehule pavel.steh...@gmail.com Hello, I had to enhance my older project, where XML documents are parsed and created on the server side - in PLpgSQL and PLPerl procedures. We would like to use the LO API for client-server communication, but we have to parse/serialize LOs on the server side. I found that there is no simple API for working with LOs from PL without access to the file system. I had to use ugly hacks:

CREATE OR REPLACE FUNCTION parser.save_as_lob(bytea)
RETURNS oid AS $$
DECLARE
  _loid oid;
  _substr bytea;
BEGIN
  _loid := lo_creat(-1);
  FOR i IN 0..length($1)/2048 LOOP
    _substr := substring($1 FROM i * 2048 + 1 FOR 2048);
    IF _substr <> '' THEN
      INSERT INTO pg_largeobject(loid, pageno, data)
        VALUES(_loid, i, _substr);
    END IF;
  END LOOP;
  EXECUTE format('GRANT SELECT ON LARGE OBJECT %s TO ohs', _loid);
  RETURN _loid;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER STRICT SET search_path = 'pg_catalog';

and

CREATE OR REPLACE FUNCTION fbuilder.attachment_to_xml(attachment oid)
RETURNS xml AS $$
DECLARE
  b_cum bytea = '';
  b bytea;
BEGIN
  FOR b IN SELECT l.data
             FROM pg_largeobject l
            WHERE l.loid = attachment_to_xml.attachment
            ORDER BY l.pageno
  LOOP
    b_cum := b_cum || b;
  END LOOP;
  IF NOT FOUND THEN
    RETURN NULL;
  ELSE
    RETURN xmlelement(NAME attachment, encode(b_cum, 'base64'));
  END IF;
END;
$$ LANGUAGE plpgsql STRICT SECURITY DEFINER SET search_path = 'pg_catalog';

These functions could be simplified if we supported functions like encode and decode for LOs. So my proposal is to create the functions:

* lo_encode(loid oid) .. returns bytea
* lo_encode(loid oid, encoding text) .. returns text
* lo_make(loid oid, data bytea)
* lo_make(loid oid, data text, encoding text)

This would simplify all transformations between LO and VARLENA.
The known limit is 1G for varlena, but that is still relatively high. Notes, comments? Regards Pavel -- Rushabh Lathia
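Putting the thread's pieces together, a typical session with the patched functions might look like the sketch below. Note that make_lo and load_lo are the names from the patch under review, not released PostgreSQL functions, and, as the review points out, removal still goes through the pre-existing lo_unlink() API:

```sql
-- Sketch of the proposed API (make_lo/load_lo come from the patch in this
-- thread; only lo_unlink is existing PostgreSQL functionality).
SELECT make_lo(decode('ff00', 'hex')) AS loid \gset
SELECT load_lo(:loid);        -- returns the original bytea, \xff00
SELECT lo_unlink(:loid);      -- cleanup uses the older, existing API
```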