Re: [HACKERS] 9.2.3 crashes during archive recovery
Sorry, let me correct a bit. I wrote "I tried to postpone smgrtruncate after the next checkpoint"; that should read "I tried to postpone smgrtruncate TO the next checkpoint". This is similar to what hot standby feedback does to vacuum. It seems to be working fine, but I worry that it might also bloat the table. I haven't found a way to postpone only the smgrtruncate in question. The patch below is an immediate verification patch for this solution:

- CreateCheckPoint records the oldest xmin at that point. Let's call it the 'checkpoint xmin'.
- vacuum skips modifications made by transactions at or after the checkpoint xmin.

-- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Materialized views WIP patch
Kevin Grittner kgri...@ymail.com writes: When I went to do this, I hit a shift/reduce conflict, because with TABLE being optional it couldn't tell whether: TRUNCATE MATERIALIZED VIEW x, y, z; ... was looking for five relations or three. That goes away with MATERIALIZED escalated to TYPE_FUNC_NAME_KEYWORD. Is that OK? Not really. I would much rather see us not bother with this pedantic syntax than introduce an even-partially-reserved word. Having said that, I don't think I believe your analysis of why this doesn't work. The presence or absence of commas ought to make the syntax non-ambiguous, I would think. Maybe you just factored the grammar wrong. regards, tom lane
Re: [HACKERS] 9.2.3 crashes during archive recovery
On 20.02.2013 10:01, Kyotaro HORIGUCHI wrote: Sorry, let me correct a bit. I wrote "I tried to postpone smgrtruncate after the next checkpoint"; that should read "I tried to postpone smgrtruncate TO the next checkpoint". Umm, why? I don't understand this patch at all. - Heikki
Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq
Tuesday, February 19, 2013 6:23 PM Amit Kapila wrote: On Monday, February 18, 2013 1:41 PM Heikki Linnakangas wrote: On 18.02.2013 06:07, Amit Kapila wrote: On Sunday, February 17, 2013 8:44 PM Phil Sorber wrote: On Sun, Feb 17, 2013 at 1:35 AM, Amit Kapila amit.kap...@huawei.com wrote: Now the patch of Phil Sorber provides 2 new APIs, PQconninfoParseParams() and PQconninfodefaultsMerge(); using these APIs I can think of the below way for the patch to pass a connection string to pg_basebackup, ...

1. Call the existing function PQconninfoParse() with the connection string input by the user and get a PQconninfoOption.
2. Now take the existing keywords (individual options specified by the user), extract the keywords from the PQconninfoOption structure, and call the new API PQconninfoParseParams(), which will return a PQconninfoOption. The PQconninfoOption structure returned in this step will contain all keywords.
3. Call PQconninfodefaultsMerge() to merge in any default values, if they exist. Not sure if this step is required?
4. Extract the individual keywords from the PQconninfoOption structure and call PQconnectdbParams.

Is this in line with what you have in mind, or have you thought of some other simpler way of using the new APIs? Yep, that's roughly what I had in mind. I don't think it's necessary to merge defaults in step 3, but it needs to add the replication=true and dbname=replication options. The advantage I can see of calling PQconninfoParseParams() in step 2 is that it will remove duplicate values by overriding the values for conflicting keywords. This is done in the function conninfo_array_parse(), which is called from PQconninfoParseParams(). Am I right, or is there any other advantage of calling PQconninfoParseParams()? If there is no other advantage, then this is done in PQconnectdbParams() also, so can't we avoid calling PQconninfoParseParams()? I note that pg_dumpall also has a similar issue as pg_basebackup and pg_receivexlog; there's no way to pass a connection string to it either. I think not only pg_dumpall, but we need to add it to pg_dump as well. As -C is an already-used option in pg_dump, I need to use something different. I am planning to use -K as the new option (the available letters were d, g, j, k, l, m, p, q, y). I am planning to keep the option the same for pg_dumpall, as pg_dumpall internally calls pg_dump with the options supplied by the user. In fact, I think we could hack the string passed to pg_dump to change the option from -C to -K, but I am not able to see how that would be any better than using -K for both. The patch providing a connection string for pg_basebackup, pg_receivexlog, pg_dump and pg_dumpall is attached to this mail. With Regards, Amit Kapila.

Attachment: pg_basebkup_recvxlog_dump_conn_str_v2.patch
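For illustration, here is a minimal sketch of steps 1, 2 and 4 using only libpq calls that already exist (PQconninfoParse, PQconnectdbParams). It relies on the behavior noted in the thread that later duplicate keywords override earlier ones, so the proposed PQconninfoParseParams() step is folded into the final call; the helper name and the fixed-size arrays are illustrative assumptions, not anything from the patch.

    #include <stdio.h>
    #include <libpq-fe.h>

    /*
     * Sketch: connect for WAL streaming given a user-supplied connection
     * string.  connect_for_replication() is a hypothetical helper name.
     */
    static PGconn *
    connect_for_replication(const char *conninfo)
    {
        char       *errmsg = NULL;
        PQconninfoOption *opts = PQconninfoParse(conninfo, &errmsg);
        PQconninfoOption *opt;
        const char *keywords[64];   /* assumed big enough for a sketch */
        const char *values[64];
        int         n = 0;
        PGconn     *conn;

        if (opts == NULL)
        {
            fprintf(stderr, "invalid connection string: %s\n",
                    errmsg ? errmsg : "out of memory");
            PQfreemem(errmsg);
            return NULL;
        }

        /* steps 1-2: carry over every option the user actually set */
        for (opt = opts; opt->keyword != NULL && n < 60; opt++)
        {
            if (opt->val != NULL)
            {
                keywords[n] = opt->keyword;
                values[n] = opt->val;
                n++;
            }
        }

        /*
         * Append the walsender-specific options last; per the discussion
         * above, conflicting keywords are resolved in favor of the later
         * value when the arrays are parsed.
         */
        keywords[n] = "dbname";
        values[n] = "replication";
        n++;
        keywords[n] = "replication";
        values[n] = "true";
        n++;
        keywords[n] = NULL;
        values[n] = NULL;

        /* step 4: hand the merged keyword/value arrays to libpq */
        conn = PQconnectdbParams(keywords, values, 0);

        PQconninfoFree(opts);
        return conn;
    }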
Re: [HACKERS] streaming header too small
On 20.02.2013 02:11, Selena Deckelmann wrote: So, I just ran into a similar issue backing up a 9.2.1 server using pg_basebackup version 9.2.3: pg_basebackup: starting background WAL receiver pg_basebackup: streaming header too small: 25 I've had it happen two times in a row. I'm going to try again... But -- what would be helpful here? I can recompile pg_basebackup with more debugging... Hmm, 25 bytes would be the size of the WAL data packet if it contains just the header and no actual WAL data. I think pg_basebackup should accept that - it's not unreasonable that the server might send such a packet sometimes. Looking at the walsender code, it's not supposed to ever send such a packet. But I suspect there's one corner case where it might: if the current send location is at an xlogid boundary, so that we previously sent the last byte from the last WAL segment in the previous logical xlog file, and the WAL flush position points to byte 0 at the beginning of the new WAL file. Both of those positions are in fact the same thing, but we have two different ways to represent the same position. For example, if we've already sent up to WAL position (sentPtr in walsender.c):

xlogid = 4
xrecoff = XLogFileSize

and GetFlushRecPtr() returns:

xlogid = 5
xrecoff = 0

Those both point to the same position. But the check in XLogSend that decides if there is any work to do uses XLByteLE() to check if they are equal, and XLByteLE() treats the latter as greater than the former. So, in that situation, XLogSend() would decide that it has work to do, but there actually isn't, so it would send 0 bytes of WAL data. I'm not sure how GetFlushRecPtr() could return such a position, though. But I'm also not convinced that it can't happen. It would be fairly easy to fix walsender to not send anything in that situation. It would also be easy to fix pg_basebackup to not treat it as an error. We probably should do both. In 9.3, the XLogRecPtr representation changed so that there is only one value for a boundary position like that, so this is a 9.2-only issue. - Heikki
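To make the boundary case concrete, here is a tiny self-contained sketch using the 9.2-style two-part WAL position; the struct and XLByteLE follow the 9.2 definitions, and XLogFileSize is hard-coded to its usual value with 16 MB segments (an assumption for the demo only).

    #include <stdio.h>
    #include <stdint.h>

    /* 9.2-style WAL position: two 32-bit halves rather than a single uint64 */
    typedef struct XLogRecPtr
    {
        uint32_t    xlogid;     /* log file #, 0 based */
        uint32_t    xrecoff;    /* byte offset within log file */
    } XLogRecPtr;

    /* comparison macro as in 9.2's access/xlogdefs.h */
    #define XLByteLE(a, b) \
        ((a).xlogid < (b).xlogid || \
         ((a).xlogid == (b).xlogid && (a).xrecoff <= (b).xrecoff))

    #define XLogFileSize 0xFF000000u    /* 255 segments of 16 MB */

    int
    main(void)
    {
        XLogRecPtr  sentPtr = {4, XLogFileSize};    /* end of logical file 4 */
        XLogRecPtr  flushPtr = {5, 0};              /* start of file 5: same place */

        /*
         * XLogSend()'s "caught up" test is effectively
         * XLByteLE(flushPtr, sentPtr).  For these two spellings of the same
         * boundary position it evaluates to false, so walsender thinks there
         * is work to do and emits a WAL-data message with zero payload bytes.
         */
        printf("caught up? %s\n", XLByteLE(flushPtr, sentPtr) ? "yes" : "no");
        return 0;
    }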
Re: [HACKERS] Comment typo
On 20.02.2013 05:07, Etsuro Fujita wrote: Sorry, I found one more typo. Attached is a patch. Thanks, applied. - Heikki
[HACKERS] PostgreSql - access modified rows in prepare transaction command
Hello, I have a problem accessing modified data (updates and inserts) in a prepared transaction before commit/rollback. For example, consider the following block:

BEGIN;
do some update;
do some insert;
PREPARE TRANSACTION 'transaction1';

After executing the 'prepare' command (and before executing the 'commit' command), I need to fetch the data modified by the transaction 'transaction1'. I can fetch the rows with the old values that are modified by the transaction (using the xmax field); however, I also need the values that the transaction will write into these rows when it commits. Postgres needs to store these values somewhere in order to commit the transaction when it is required, so my question is: how can I access these values? Thank you in advance. Pierpaolo Cincilla
Re: [HACKERS] PostgreSql - access modified rows in prepare transaction command
On 20.02.2013 12:48, pierpaolo.cincilla wrote: I have a problem accessing modified data (updates and inserts) in a prepared transaction before commit/rollback. For example, consider the following block: BEGIN; do some update; do some insert; PREPARE TRANSACTION 'transaction1'; After executing the 'prepare' command (and before executing the 'commit' command), I need to fetch the data modified by the transaction 'transaction1'. I can fetch the rows with the old values that are modified by the transaction (using the xmax field); however, I also need the values that the transaction will write into these rows when it commits. Postgres needs to store these values somewhere in order to commit the transaction when it is required, so my question is: how can I access these values? Thank you in advance. In short, you can't. PostgreSQL stores the values in the tables, but they are invisible to other transactions until the prepared transaction is committed. From this point of view, a prepared transaction behaves the same as a transaction that's still in progress in another backend. If you explain a bit more what you're trying to accomplish, someone can probably suggest a better solution. - Heikki
Re: [HACKERS] Materialized views WIP patch
On Tue, Feb 19, 2013 at 11:01 AM, Peter Eisentraut pete...@gmx.net wrote: On 2/19/13 8:54 AM, Robert Haas wrote: In the department of crazy ideas, what about having pg_dump NEVER refresh ANY materialized views? It might be useful to have an option for this, but I don't think it should be the default. The default should be that the new database is ready to go. Then again, when would you ever actually use that option? You'd use that option if you'd rather get the database mostly up as soon as possible, and then worry about the materialized views afterwards. This might be different if there were a command to refresh all materialized views, because you don't want to have to go around and type separate commands 47 times after a restore. Well, it's pretty easy to do:

SELECT 'LOAD MATERIALIZED VIEW ' || p.oid::regclass || ';'
FROM pg_class p WHERE relkind = 'm';

...but we could also add explicit syntax for it, perhaps along the lines of what we have for CLUSTER and VACUUM. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: PostgreSql - access modified rows in prepare transaction command
Thank you Heikki for your reply. As you suggest, I will explain better what I'm trying to accomplish. I'm writing a distributed two-phase-commit termination protocol that works in this manner:

1) Each site has a replica of the database. A site A performs a transaction t1 and prepares it (PREPARE TRANSACTION 't1'). Then it atomically broadcasts a certification request for the transaction t1 *along with its writeset* (values updated by t1) to the other sites.

2) When a site receives the certification request for transaction t1, it does the certification (checks that there are no concurrent conflicting transactions). If the certification succeeds, then 2a) if the transaction is local (i.e. originated at that site) it commits the transaction (COMMIT PREPARED 't1'); 2b) if the transaction is remote (i.e. prepared at another site) *it applies the writeset of transaction t1 locally* to reflect the modifications in its local replica of the database (UPDATE command).

The problem is that if I can't fetch the writeset of a transaction in phase 1 (before the commit request), then when I certify the transaction at another site I can't apply the updates performed by the remote transaction right away; I have to wait for the originating site to commit the transaction and send back its writeset (now visible). This would be very bad because it adds an extra round to the algorithm.
Re: [HACKERS] Materialized views WIP patch
Robert Haas robertmh...@gmail.com writes: On Tue, Feb 19, 2013 at 11:01 AM, Peter Eisentraut pete...@gmx.net wrote: This might be different if there were a command to refresh all materialized views, because you don't want to have to go around and type separate commands 47 times after a restore. Well, it's pretty easy to do:

SELECT 'LOAD MATERIALIZED VIEW ' || p.oid::regclass || ';'
FROM pg_class p WHERE relkind = 'm';

...but we could also add explicit syntax for it, perhaps along the lines of what we have for CLUSTER and VACUUM. It's not really that easy, because of the likelihood that MVs have to be refreshed in a specific order. The SELECT you suggest definitely seems too simplistic. A dedicated command could perhaps be built to pay attention to dependencies ... but if we're still coding such things now, it seems a bit late for 9.3. regards, tom lane
Re: [HACKERS] Review : Add hooks for pre- and post-processor executables for COPY and \copy
Hi Amit, Thank you for the review. From: Amit Kapila [mailto:amit.kap...@huawei.com] Test case issues: 1. Broken pipe is not handled in the case of the psql \copy command; the issues are as follows (verified on SuSE Linux 10.2):

1) psql exits when a \copy xxx TO command is issued and the command/script is not found. When popen is called in write mode it creates a valid file descriptor, and when psql then tries to write to the pipe a broken-pipe error occurs which is not handled.

psql# \copy pgbench_accounts TO PROGRAM '../compress.sh pgbench_accounts4.txt'

2) When a \copy command is in progress and the program/command is killed or crashes due to some problem, psql exits.

This is a headache. I have no idea how to solve this. I think we can keep it for the committer to take a call on this issue. Agreed. I have found a few more minor issues, as below:

1. The comment above do_copy can be modified to address the new functionality it can handle.

/*
 * Execute a \copy command (frontend copy). We have to open a file, then
 * submit a COPY query to the backend and either feed it data from the
 * file or route its response into the file.
 */
bool do_copy(const char *args)

Done.

2. @@ -256,8 +273,14 @@ do_copy(const char *args)

+	if (options->file == NULL && options->program)
+	{
+		psql_error("program is not supported to stdout/pstdout or from stdin/pstdin\n");
+		return false;
+	}

should call free_copy_options(options); before return false; Good catch! Done.

3. The \copy command doesn't need a semicolon at the end; it worked with one previous to your patch, but now it gives an error.

postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt';
e:/pg_git_code/Data/t1_Data.txt';: No such file or directory
e:/pg_git_code/Data/t1_Data.txt';: No such file or directory

Sorry, I've fixed the bug.

4. Please check whether OpenPipeStream() needs to call ReleaseLruFile(). OpenPipeStream() calls ReleaseLruFile() by itself if necessary.

5. The following in copy.sgml can be changed to be more meaningful, as the first line looks a little ad hoc:

+ <para>
+ The command that input comes from or that output goes to.
+ The command for COPY FROM, which input comes from, must write its output
+ to standard output. The command for COPY TO, which output goes to, must
+ read its input from standard input.
+ </para>

I've struggled to make the documentation more meaningful.

6. Can we have one example of this new syntax? It would make it more meaningful. Done.

Sorry for the long delay. Best regards, Etsuro Fujita With Regards, Amit Kapila.

Attachment: copy-popen-20130220.patch
Re: [HACKERS] BUG #7493: Postmaster messages unreadable in a Windows console
Hello, 15.02.2013 02:59, Noah Misch wrote: With your proposed change, the problem will resurface in an actual SQL_ASCII database. At the problem's root is write_console()'s assumption that messages are in the database encoding. pg_bind_textdomain_codeset() tries to make that so, but it only works for encodings with a pg_enc2gettext_tbl entry. That excludes SQL_ASCII, MULE_INTERNAL, and others. write_console() needs to behave differently in such cases. Thank you for the notice. So it seems that the DatabaseEncoding variable alone can't represent both the database encoding (for communication with a client) and the current process's message encoding (for logging messages) at once. There should be another variable, something like CurrentProcessEncoding, that is set to the OS encoding at start and can be changed to the encoding of a connected database (if bind_textdomain_codeset succeeded). I'd call it MessageEncoding unless it corresponds with similar rigor to a broader concept. Please look at the next version of the patch. Thanks, Alexander

From 5bce21326d48761c6f86be8797432a69b2533dcd Mon Sep 17 00:00:00 2001
From: Alexander Lakhin exclus...@gmail.com
Date: Wed, 20 Feb 2013 15:34:05 +0400
Subject: Fix postmaster messages encoding

---
 src/backend/main/main.c        |  2 ++
 src/backend/utils/error/elog.c |  4 ++--
 src/backend/utils/mb/mbutils.c | 24 ++++++++++++++++++++++--
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 1173bda..ed4067e 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -100,6 +100,8 @@ main(int argc, char *argv[])
 
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("postgres"));
 
+	SetMessageEncoding(GetPlatformEncoding());
+
 #ifdef WIN32
 
 	/*
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 3a211bf..40f20f3 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1868,7 +1868,7 @@ write_eventlog(int level, const char *line, int len)
 	 * Also verify that we are not on our way into error recursion trouble due
 	 * to error messages thrown deep inside pgwin32_toUTF16().
 	 */
-	if (GetDatabaseEncoding() != GetPlatformEncoding() &&
+	if (GetMessageEncoding() != GetPlatformEncoding() &&
 		!in_error_recursion_trouble())
 	{
 		utf16 = pgwin32_toUTF16(line, len, NULL);
@@ -1915,7 +1915,7 @@ write_console(const char *line, int len)
 	 * through to writing unconverted if we have not yet set up
 	 * CurrentMemoryContext.
 	 */
-	if (GetDatabaseEncoding() != GetPlatformEncoding() &&
+	if (GetMessageEncoding() != GetPlatformEncoding() &&
 		!in_error_recursion_trouble() &&
 		!redirection_done &&
 		CurrentMemoryContext != NULL)
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 287ff80..8b51b78 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -57,6 +57,7 @@ static FmgrInfo *ToClientConvProc = NULL;
  */
 static pg_enc2name *ClientEncoding = &pg_enc2name_tbl[PG_SQL_ASCII];
 static pg_enc2name *DatabaseEncoding = &pg_enc2name_tbl[PG_SQL_ASCII];
+static pg_enc2name *MessageEncoding = &pg_enc2name_tbl[PG_SQL_ASCII];
 static pg_enc2name *PlatformEncoding = NULL;
 
 /*
@@ -881,6 +882,16 @@ SetDatabaseEncoding(int encoding)
 	Assert(DatabaseEncoding->encoding == encoding);
 }
 
+void
+SetMessageEncoding(int encoding)
+{
+	if (!PG_VALID_BE_ENCODING(encoding))
+		elog(ERROR, "invalid message encoding: %d", encoding);
+
+	MessageEncoding = &pg_enc2name_tbl[encoding];
+	Assert(MessageEncoding->encoding == encoding);
+}
+
 /*
  * Bind gettext to the codeset equivalent with the database encoding.
  */
@@ -915,6 +926,8 @@ pg_bind_textdomain_codeset(const char *domainname)
 			if (bind_textdomain_codeset(domainname,
 										pg_enc2gettext_tbl[i].name) == NULL)
 				elog(LOG, "bind_textdomain_codeset failed");
+			else
+				SetMessageEncoding(encoding);
 
 			break;
 		}
 	}
@@ -964,6 +977,13 @@ GetPlatformEncoding(void)
 	return PlatformEncoding->encoding;
 }
 
+int
+GetMessageEncoding(void)
+{
+	Assert(MessageEncoding);
+	return MessageEncoding->encoding;
+}
+
 #ifdef WIN32
 
 /*
@@ -977,7 +997,7 @@ pgwin32_toUTF16(const char *str, int len, int *utf16len)
 	int			dstlen;
 	UINT		codepage;
 
-	codepage = pg_enc2name_tbl[GetDatabaseEncoding()].codepage;
+	codepage = pg_enc2name_tbl[GetMessageEncoding()].codepage;
 
 	/*
 	 * Use MultiByteToWideChar directly if there is a corresponding codepage,
@@ -994,7 +1014,7 @@ pgwin32_toUTF16(const char *str, int len, int *utf16len)
 		char	   *utf8;
 
 		utf8 = (char *) pg_do_encoding_conversion((unsigned char *) str,
-												  len, GetDatabaseEncoding(), PG_UTF8);
+												  len, GetMessageEncoding(), PG_UTF8);
 		if (utf8 != str)
 			len = strlen(utf8);
-- 
1.7.10.4
Re: [HACKERS] FDW for PostgreSQL
On Fri, Feb 15, 2013 at 12:58 AM, Tom Lane t...@sss.pgh.pa.us wrote: [ rereads that... ] Hm, I did make some good points. But having seen the end result of this way, I'm still not very happy; it still looks like a maintenance problem. Maybe some additional flags in ruleutils.c is the least evil way after all. Needs more thought. I'm working on revising the deparser so that it uses ruleutils routines to construct the remote query, and re-found an FDW-specific problem which I encountered some months ago. So far the ruleutils routines require a deparse context, which is a list of namespace information. Currently deparse_context_for() seems to fit postgres_fdw's purpose, but it always uses the names stored in the catalogs (pg_class, pg_attribute and pg_namespace), whereas postgres_fdw wants to replace a column/table/schema name with the name specified in the relevant FDW option, if any. A proper remote query could be generated if postgres_fdw could modify the deparse context, but deparse_context is a hidden detail of ruleutils.c, and IMO disclosing it is a bad idea. Given these, I'm thinking of adding a new deparse-context generator which basically constructs namespaces from the catalogs, but replaces a name if the FDW option *_name was specified for that object. With this context, the existing ruleutils code would generate expression strings with proper names, without any change. Is this idea acceptable? -- Shigeru HANADA
Re: [HACKERS] FDW for PostgreSQL
Shigeru Hanada shigeru.han...@gmail.com writes: On Fri, Feb 15, 2013 at 12:58 AM, Tom Lane t...@sss.pgh.pa.us wrote: [ rereads that... ] Hm, I did make some good points. But having seen the end result of this way, I'm still not very happy; it still looks like a maintenance problem. Maybe some additional flags in ruleutils.c is the least evil way after all. Needs more thought. I'm working on revising deparser so that it uses ruleutils routines to construct remote query, and re-found an FDW-specific problem which I encountered some months ago. After further review I'm unconvinced that we can really do much better than what's there now --- the idea of sharing code with ruleutils sounds attractive, but once you look at all the specific issues that ruleutils would have to be taught about, it gets much less so. (In particular I fear we'll find that we have to do some weird stuff to deal with cross-server-version issues.) I've been revising the patch on the assumption that we'll keep deparse.c more or less as is. Having said that, I remain pretty unhappy with the namespace handling in deparse.c. I don't think it serves much purpose to schema-qualify everything when we're restricting what we can access to built-in operators and functions --- the loss of readability outweighs the benefits IMO. Also, there is very little point in schema-qualifying almost everything rather than everything; if you're not 100% then you have no safety against search_path issues. But that's what we've got because the code still relies on format_type to print type names. Now we could get around that complaint by duplicating format_type as well as ruleutils, but I don't think that's the right direction to proceed. I still think it might be a good idea to set search_path to pg_catalog on the remote side, and then schema-qualify only what is not in pg_catalog (which would be nothing, in the current code, so far as types/functions/operators are concerned). regards, tom lane
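For concreteness, the naming rule Tom suggests could look something like the sketch below inside the deparser: with the remote session's search_path pinned to pg_catalog, only non-pg_catalog objects get a schema qualifier. The helper name deparseObjectName() is hypothetical, not part of postgres_fdw; the backend functions it calls (quote_identifier, get_namespace_name, appendStringInfo) are existing ones.

    #include "postgres.h"
    #include "catalog/pg_namespace.h"
    #include "lib/stringinfo.h"
    #include "utils/builtins.h"
    #include "utils/lsyscache.h"

    /*
     * Sketch: emit "schema.name" only when the object lives outside
     * pg_catalog; everything in pg_catalog is findable via the pinned
     * remote search_path and can stay unqualified.
     */
    static void
    deparseObjectName(StringInfo buf, Oid namespaceOid, const char *objname)
    {
        if (namespaceOid != PG_CATALOG_NAMESPACE)
            appendStringInfo(buf, "%s.",
                             quote_identifier(get_namespace_name(namespaceOid)));
        appendStringInfoString(buf, quote_identifier(objname));
    }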
Re: [HACKERS] streaming header too small
On Feb 20, 2013 11:29 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 20.02.2013 02:11, Selena Deckelmann wrote: So, I just ran into a similar issue backing up a 9.2.1 server using pg_basebackup version 9.2.3: pg_basebackup: starting background WAL receiver pg_basebackup: streaming header too small: 25 I've had it happen two times in a row. I'm going to try again... But -- what would be helpful here? I can recompile pg_basebackup with more debugging... Hmm, 25 bytes would be the size of the WAL data packet if it contains just the header and no actual WAL data. I think pg_basebackup should accept that - it's not unreasonable that the server might send such a packet sometimes. Looking at the walsender code, it's not supposed to ever send such a packet. But I suspect there's one corner case where it might: if the current send location is at an xlogid boundary, so that we previously sent the last byte from the last WAL segment in the previous logical xlog file, and the WAL flush position points to byte 0 at the beginning of the new WAL file. Both of those positions are in fact the same thing, but we have two different ways to represent the same position. For example, if we've already sent up to WAL position (sentPtr in walsender.c):

xlogid = 4
xrecoff = XLogFileSize

and GetFlushRecPtr() returns:

xlogid = 5
xrecoff = 0

Those both point to the same position. But the check in XLogSend that decides if there is any work to do uses XLByteLE() to check if they are equal, and XLByteLE() treats the latter as greater than the former. So, in that situation, XLogSend() would decide that it has work to do, but there actually isn't, so it would send 0 bytes of WAL data. I'm not sure how GetFlushRecPtr() could return such a position, though. But I'm also not convinced that it can't happen. It would be fairly easy to fix walsender to not send anything in that situation. It would also be easy to fix pg_basebackup to not treat it as an error. We probably should do both. In 9.3, the XLogRecPtr representation changed so that there is only one value for a boundary position like that, so this is a 9.2-only issue. That does sound like a reasonable explanation and fix. Heck, it's probably enough to just put the fix in pg_basebackup since it's gone in 9.3 anyway. But I'd really like to confirm this is the actual situation before considering it fixed, since it's clearly very intermittent. Selena, was this reasonably reproducible for you? Would it be possible to get a network trace of it to show if that's the kind of packet coming across, or to hack up pg_basebackup to print the exact position it was at when the problem occurred? /Magnus
Re: [HACKERS] Materialized views WIP patch
On 2/19/13 5:22 PM, David Fetter wrote: On Tue, Feb 19, 2013 at 11:09:13PM +0100, Erik Rijkers wrote: On Sat, February 16, 2013 02:01, Kevin Grittner wrote: matview-v4.patch.gz Hi, I was wondering if materialized views should not go into information_schema. I was thinking either .views or .tables. Have you considered this? I'm guessing it'd be .views if anything. Haven't been able to decipher from section 11 of the standard (Schemata) whether the standard has anything to say on the matter. I suppose one should be able to expect that if one finds a view in the information schema, then one should be able to use DROP VIEW to remove it. Which in this case wouldn't work. So I don't think including a materialized view under views or tables is appropriate.
Re: [HACKERS] CREATE RULE _RETURN and toast tables
On 2013-02-14 20:47:11 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: The current behaviour doesn't seem to be a terribly good idea. I propose to drop the toast table and reset the relfrozenxid in DefineQueryRewrite in the RelisBecomingView case. Yeah, probably worth doing. At the time we thought that that code path was just a short-term legacy thing for loading ancient pg_dump files. However, given that even modern pg_dumps will use this syntax if necessary to break circular dependencies for views, we're probably never going to be rid of it completely. What about the attached patch? I chose to move the update of relkind from SetRelationRuleStatus to the RelisBecomingView part of DefineQueryRewrite. As we're updating pg_class in there anyway, there doesn't seem to be any reason to spread knowledge of that any further. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/src/backend/rewrite/rewriteDefine.c b/src/backend/rewrite/rewriteDefine.c
index b37f36b..7e7b16a 100644
--- a/src/backend/rewrite/rewriteDefine.c
+++ b/src/backend/rewrite/rewriteDefine.c
@@ -16,6 +16,9 @@
 
 #include "access/heapam.h"
 #include "access/htup_details.h"
+#include "access/transam.h"
+#include "access/multixact.h"
+#include "access/xact.h"
 #include "catalog/catalog.h"
 #include "catalog/dependency.h"
 #include "catalog/heap.h"
@@ -500,30 +503,101 @@ DefineQueryRewrite(char *rulename,
 						replace);
 
 		/*
-		 * Set pg_class 'relhasrules' field TRUE for event relation. If
-		 * appropriate, also modify the 'relkind' field to show that the
-		 * relation is now a view.
+		 * Set pg_class 'relhasrules' field TRUE for event relation.
 		 *
 		 * Important side effect: an SI notice is broadcast to force all
 		 * backends (including me!) to update relcache entries with the new
 		 * rule.
 		 */
-		SetRelationRuleStatus(event_relid, true, RelisBecomingView);
+		SetRelationRuleStatus(event_relid, true);
 	}
 
-	/*
-	 * If the relation is becoming a view, delete the storage files associated
-	 * with it. Also, get rid of any system attribute entries in pg_attribute,
-	 * because a view shouldn't have any of those.
+	/* ---------
+	 * If the relation is becoming a view
+	 * - delete the associated storage files
+	 * - get rid of any system attributes in pg_attribute, a view shouldn't
+	 *   have any of those
+	 * - remove the toast table, there is no need for it anymore, and its
+	 *   presence would make vacuum slightly more complicated
+	 * - set relkind to RELKIND_VIEW
+	 * - adjust other pg_class attributes to be appropriate for a view
 	 *
 	 * NB: we had better have AccessExclusiveLock to do this ...
-	 *
-	 * XXX what about getting rid of its TOAST table? For now, we don't.
+	 * ---------
 	 */
 	if (RelisBecomingView)
 	{
+		Relation	relationRelation;
+		Oid			toastrelid;
+		HeapTuple	classTup;
+		Form_pg_class classForm;
+
+		relationRelation = heap_open(RelationRelationId, RowExclusiveLock);
+		toastrelid = event_relation->rd_rel->reltoastrelid;
+
+		/* drop storage while table still looks like a table */
 		RelationDropStorage(event_relation);
 		DeleteSystemAttributeTuples(event_relid);
+
+		/*
+		 * Now drop the toast table which is not needed anymore, the pg_class
+		 * entry is adapted below.
+		 */
+		if (toastrelid != InvalidOid)
+		{
+			ObjectAddress toastobject;
+
+			/*
+			 * delete the dependency of the main relation to the toast relation
+			 * so we can delete the toast relation without also deleting what
+			 * is becoming the view.
+			 */
+			deleteDependencyRecordsFor(RelationRelationId, toastrelid,
+									   false);
+
+			/* make deletion of dependency record visible */
+			CommandCounterIncrement();
+
+			/* now drop toast table, including index */
+			toastobject.classId = RelationRelationId;
+			toastobject.objectId = toastrelid;
+			toastobject.objectSubId = 0;
+			performDeletion(&toastobject, DROP_RESTRICT,
+							PERFORM_DELETION_INTERNAL);
+		}
+
+		/*
+		 * Fixup pg_class entry to look like a normal view's, including setting
+		 * the correct relkind and removal of reltoastrelid, reltoastidxid of
+		 * the toast table we potentially removed above.
+		 */
+
+		/*
+		 * SetRelationRuleStatus may have updated the pg_class row, so make
+		 * current version visible before we fetch the current tuple.
+		 */
+		CommandCounterIncrement();
+
+		classTup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(event_relid));
+		if (!HeapTupleIsValid(classTup))
+			elog(ERROR, "cache lookup failed for relation %u", event_relid);
+
+		classForm = (Form_pg_class) GETSTRUCT(classTup);
+		classForm->reltablespace = InvalidOid;
+		classForm->relpages = 0;
+		classForm->reltuples = 0;
+		classForm->relallvisible = 0;
+		classForm->reltoastrelid = InvalidOid;
+		classForm->reltoastidxid = InvalidOid;
Re: [HACKERS] Unarchived WALs deleted after crash
Just a quick top-post to thank you all for this fix, guys! Cheers, On 15/02/2013 18:43, Heikki Linnakangas wrote: On 15.02.2013 19:16, Fujii Masao wrote: On Sat, Feb 16, 2013 at 2:07 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 15.02.2013 18:10, Fujii Masao wrote: At least in 9.2, when the archived file is restored into pg_xlog, its xxx.done archive status file is created. So we don't need to check InArchiveRecovery when deleting old WAL files. Checking whether xxx.done exists is enough. Hmm, what about streamed WAL files? I guess we could go back to the pre-9.2 coding, and check WalRcvInProgress(). But I didn't actually like that too much; it seems rather random that old streamed files are recycled when the wal receiver is running at the time of the restartpoint, and otherwise not, because whether the wal receiver is running at the time the restartpoint happens has little to do with which files were created by streaming replication. With the right pattern of streaming files from the master, but always being temporarily disconnected when the restartpoint runs, you could still accumulate WAL files infinitely. Walreceiver always creates a .done file when it closes the already-flushed WAL file and switches to the next WAL file. So we also don't need to check WalRcvInProgress(). Ah, I missed that part of the patch. Okay, agreed, that's a better fix. I committed your forward-port of the 9.2 patch to master, reverted my earlier fix for this bug, and simply removed the InArchiveRecovery/ArchiveRecoveryInProgress()/RecoveryInProgress() condition from RemoveOldXlogFiles(). - Heikki -- Jehan-Guillaume de Rorthais http://www.dalibo.com
Re: [HACKERS] [RFC] indirect toast tuple support
On Tue, Feb 19, 2013 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-02-19 09:12:02 -0500, Robert Haas wrote: On Tue, Feb 19, 2013 at 9:00 AM, Andres Freund and...@2ndquadrant.com wrote: So the other way that we could do this is to use something that's the same size as a TOAST pointer but has different content - the seemingly obvious choice being va_toastrelid == 0. Unfortunately that would mean you need to copy the varatt_external (or whatever it would be called) to aligned storage to check what it is. That's why I went the other way. How big a problem is that, though? There are quite some places where we test the actual type of a Datum inside tuptoaster.c. Copying it to local storage every time might actually be noticeable performance-wise. Besides the ugliness of needing a local variable, copying the data and only testing afterwards... Hrm, OK. I'd be a little reluctant to do it the way you propose because we might, at some point, want to try to reduce the size of toast pointers. If you have a tuple with many attributes, the size of the TOAST pointers themselves starts to add up. It would be nice to be able to have 8-byte or even 4-byte toast pointers to handle those situations. If we steal one or both of those lengths to mean the data is cached in memory somewhere, then we can't use those lengths in a smaller on-disk representation, which would seem a shame. I agree. As I said above, having the type overlaid into the length was and is a bad idea, I just haven't found a better one that's compatible yet. Except inventing typlen=-3 aka toast2 or something. But even that wouldn't help getting rid of existing pg_upgraded tables. Besides being a maintenance nightmare. The only reasonable thing I can see us doing is renaming varattrib_1b_e.va_len_1be into va_type and redefining VARSIZE_1B_E into a switch that maps types into lengths. But I think I would put this off, except for placing a comment somewhere, until it gets necessary. I guess I wonder how hard we think it would be to insert such a thing when it becomes necessary. How much stuff is there out there that cares about the fact that that length is a byte? You mean whether we could store the length in 6 bytes and use two for the type? That should probably work as well. But I don't see much advantage in that given that all those sizes ought to be static. Redefining VARSIZE_1B_E as indicated above should be fairly easy; there aren't many call sites that touch stuff at such a low level. /me blinks. No, that's not what I meant. I meant: how hard would it be to redefine VARSIZE_1B_E along the lines you suggest? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC] indirect toast tuple support
On 2013-02-20 10:16:45 -0500, Robert Haas wrote: On Tue, Feb 19, 2013 at 9:26 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-02-19 09:12:02 -0500, Robert Haas wrote: On Tue, Feb 19, 2013 at 9:00 AM, Andres Freund and...@2ndquadrant.com wrote: I'd be a little reluctant to do it the way you propose because we might, at some point, want to try to reduce the size of toast pointers. If you have a tuple with many attributes, the size of the TOAST pointers themselves starts to add up. It would be nice to be able to have 8-byte or even 4-byte toast pointers to handle those situations. If we steal one or both of those lengths to mean the data is cached in memory somewhere, then we can't use those lengths in a smaller on-disk representation, which would seem a shame. I agree. As I said above, having the type overlaid into the length was and is a bad idea, I just haven't found a better one that's compatible yet. Except inventing typlen=-3 aka toast2 or something. But even that wouldn't help getting rid of existing pg_upgraded tables. Besides being a maintenance nightmare. The only reasonable thing I can see us doing is renaming varattrib_1b_e.va_len_1be into va_type and redefining VARSIZE_1B_E into a switch that maps types into lengths. But I think I would put this off, except for placing a comment somewhere, until it gets necessary. I guess I wonder how hard we think it would be to insert such a thing when it becomes necessary. How much stuff is there out there that cares about the fact that that length is a byte? You mean whether we could store the length in 6 bytes and use two for the type? That should probably work as well. But I don't see much advantage in that given that all those sizes ought to be static. Redefining VARSIZE_1B_E as indicated above should be fairly easy; there aren't many call sites that touch stuff at such a low level. /me blinks. No, that's not what I meant. I meant: how hard would it be to redefine VARSIZE_1B_E along the lines you suggest? Should be pretty easy. Will do so for the next revision. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
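To make the suggestion concrete, here is a sketch of what such a redefinition could look like. Every name below (va_tag, VARTAG_*, varatt_indirect) is an illustrative placeholder rather than a settled interface, and the tag values are chosen only to show one possible compatibility trick.

    #include <stddef.h>
    #include <stdint.h>

    typedef uint8_t uint8;
    typedef uint32_t uint32;

    /* classic on-disk TOAST pointer body, as in the existing postgres.h */
    typedef struct varatt_external
    {
        uint32      va_rawsize;     /* original data size (includes header) */
        uint32      va_extsize;     /* external saved size (doesn't) */
        uint32      va_valueid;     /* unique ID of value within TOAST table */
        uint32      va_toastrelid;  /* RelID of TOAST table containing it */
    } varatt_external;

    /* hypothetical in-memory "indirect" datum proposed in this thread */
    typedef struct varatt_indirect
    {
        uintptr_t   pointer;        /* pointer to an in-memory varlena */
    } varatt_indirect;

    /*
     * Reinterpret the old length byte as a type tag.  Choosing
     * VARTAG_ONDISK == 18 == 2 header bytes + sizeof(varatt_external) makes
     * the tag byte-identical to the length byte that existing on-disk toast
     * pointers already carry.
     */
    typedef enum vartag_external
    {
        VARTAG_INDIRECT = 1,
        VARTAG_ONDISK = 18
    } vartag_external;

    typedef struct
    {
        uint8       va_header;      /* always 0x80 or 0x01 */
        uint8       va_tag;         /* was va_len_1be; now the datum's type */
        char        va_data[1];     /* type-specific data follows */
    } varattrib_1b_e;

    #define VARTAG_1B_E(PTR)    (((varattrib_1b_e *) (PTR))->va_tag)

    /* the "switch that maps types into lengths" mentioned above */
    #define VARTAG_SIZE(tag) \
        ((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \
         (tag) == VARTAG_ONDISK ? sizeof(varatt_external) : 0)

    #define VARSIZE_1B_E(PTR) \
        (offsetof(varattrib_1b_e, va_data) + VARTAG_SIZE(VARTAG_1B_E(PTR)))

With the on-disk tag value equal to the old stored length, pg_upgraded toast pointers read back unchanged, which would address the compatibility worry raised above.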
Re: [HACKERS] Materialized views WIP patch
Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kgri...@ymail.com writes: When I went to do this, I hit a shift/reduce conflict, because with TABLE being optional it couldn't tell whether: TRUNCATE MATERIALIZED VIEW x, y, z; ... was looking for five relations or three. That goes away with MATERIALIZED escalated to TYPE_FUNC_NAME_KEYWORD. Is that OK? Not really. I would much rather see us not bother with this pedantic syntax than introduce an even-partially-reserved word. I'm not sure it's worth it either; but two people requested it and I didn't foresee this shift/reduce conflict, so I took a shot at it. If we can't eliminate the conflict, I'm fine with leaving things as they are in the latest posted patch. Having said that, I don't think I believe your analysis of why this doesn't work. The presence or absence of commas ought to make the syntax non-ambiguous, I would think. Maybe you just factored the grammar wrong. Well, it wouldn't be the first time you've seen a better way to do something in flex than I was able to see. Taking just the gram.y part of the change which implemented this, and omitting the change in reservedness of MATERIALIZED, I have:

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 820cb41..1d393c5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -394,6 +394,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query)
 %type <ival>	opt_column event cursor_options opt_hold opt_set_data
 %type <objtype>	reindex_type drop_type comment_type security_label_type
+				trunc_type
 %type <node>	fetch_args limit_clause select_limit_value
 				offset_clause select_offset_value
@@ -5172,9 +5173,10 @@ attrs:		'.' attr_name
  */
 
 TruncateStmt:
-			TRUNCATE opt_table relation_expr_list opt_restart_seqs opt_drop_behavior
+			TRUNCATE trunc_type relation_expr_list opt_restart_seqs opt_drop_behavior
 				{
 					TruncateStmt *n = makeNode(TruncateStmt);
+					n->objtype = $2;
 					n->relations = $3;
 					n->restart_seqs = $4;
 					n->behavior = $5;
@@ -5182,6 +5184,12 @@ TruncateStmt:
 				}
 		;
 
+trunc_type:
+			TABLE					{ $$ = OBJECT_TABLE; }
+			| MATERIALIZED VIEW		{ $$ = OBJECT_MATVIEW; }
+			| /*EMPTY*/				{ $$ = OBJECT_UNSPECIFIED; }
+		;
+
 opt_restart_seqs:
 			CONTINUE_P IDENTITY_P	{ $$ = false; }
 			| RESTART IDENTITY_P	{ $$ = true; }

I'm open to suggestions on a better way. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Materialized views WIP patch
Peter Eisentraut pete...@gmx.net wrote: I suppose one should be able to expect that if one finds a view in the information schema, then one should be able to use DROP VIEW to remove it. Which in this case wouldn't work. So I don't think including a materialized view under views or tables is appropriate. Right. I think adding pg_matviews covers the stated use-case enough to answer Erik's concern. I'm not going to mess with adding non-standard stuff to the standard views. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] JSON Function Bike Shedding
On Tue, Feb 19, 2013 at 10:00 AM, Merlin Moncure mmonc...@gmail.com wrote: Anyways, as to overloading in general, well, SQL is heavily overloaded. We don't have int_max, float_max, etc., and it would be a usability reduction if we did. That's true, but max(int) and max(float) are doing pretty much the same logical operation - they are taking the maximum of a group of numbers. Overloading in cases where the semantics vary - e.g. + for both integer addition and string concatenation - is something else altogether, and I have not generally observed it to be a very good idea. Sometimes it works in cases where it's part of the core language design, but we don't have the luxury of knowing what other data types we'll want to add in the future, and I'm very wary of allowing JSON to engage in uncontrolled namespace pollution. But that's not even the point; the driving philosophy of SQL is that your data structures (and types) are to be strongly decoupled from the manipulation you do -- this keeps the language very general. That philosophy, while not perfect, should be adhered to when possible. Perhaps, but that goal seems unlikely to be met in this case. The JSON functions and operators are being named by one group of people with one set of sensibilities, and the hstore functions and operators were named by a different group of people with a different set of sensibilities (and therefore don't match), and the next type that comes along will be named according to yet another group of people with another set of sensibilities. So we're unlikely to end up with a coherent set of primitives that operate on underlying data of a variety of types; we are instead likely to end up with an incoherent jumble. Although we now have a JSON type in core, we should not pretend that it's in the same league as text or int4. If those data types claim common function names like max and abs and common operator names like + and ||, it can be justified on the grounds that the appeal of those data types is pretty near universal. JSON is a very popular format right now and I completely support adding more support for it, but I cheerfully submit that if you think it falls into the same category as text or int4, you've gotten way too caught up in the hype. It's completely appropriate to apply stricter criteria for namespace pollution to JSON than to a basic data type whose semantics are dictated by the SQL standard, the behavior of other database products, and fourth-grade math class. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] streaming header too small
On Wed, Feb 20, 2013 at 6:23 AM, Magnus Hagander mag...@hagander.net wrote: On Feb 20, 2013 11:29 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 20.02.2013 02:11, Selena Deckelmann wrote: So, I just ran into a similar issue backing up a 9.2.1 server using pg_basebackup version 9.2.3: pg_basebackup: starting background WAL receiver pg_basebackup: streaming header too small: 25 I've had it happen two times in a row. I'm going to try again... But -- what would be helpful here? I can recompile pg_basebackup with more debugging... Hmm, 25 bytes would be the size of the WAL data packet if it contains just the header and no actual WAL data. I think pg_basebackup should accept that - it's not unreasonable that the server might send such a packet sometimes. Looking at the walsender code, it's not supposed to ever send such a packet. But I suspect there's one corner case where it might: if the current send location is at an xlogid boundary, so that we previously sent the last byte from the last WAL segment in the previous logical xlog file, and the WAL flush position points to byte 0 at the beginning of the new WAL file. Both of those positions are in fact the same thing, but we have two different ways to represent the same position. For example, if we've already sent up to WAL position (sentPtr in walsender.c):

xlogid = 4
xrecoff = XLogFileSize

and GetFlushRecPtr() returns:

xlogid = 5
xrecoff = 0

Those both point to the same position. But the check in XLogSend that decides if there is any work to do uses XLByteLE() to check if they are equal, and XLByteLE() treats the latter as greater than the former. So, in that situation, XLogSend() would decide that it has work to do, but there actually isn't, so it would send 0 bytes of WAL data. I'm not sure how GetFlushRecPtr() could return such a position, though. But I'm also not convinced that it can't happen. It would be fairly easy to fix walsender to not send anything in that situation. It would also be easy to fix pg_basebackup to not treat it as an error. We probably should do both. In 9.3, the XLogRecPtr representation changed so that there is only one value for a boundary position like that, so this is a 9.2-only issue. Thanks for the debugging, Heikki! That does sound like a reasonable explanation and fix. Heck, it's probably enough to just put the fix in pg_basebackup since it's gone in 9.3 anyway. But I'd really like to confirm this is the actual situation before considering it fixed, since it's clearly very intermittent. Selena, was this reasonably reproducible for you? Would it be possible to get a network trace of it to show if that's the kind of packet coming across, or to hack up pg_basebackup to print the exact position it was at when the problem occurred? This is happening with a very busy 700 GB system, so I'm going to rule out a network trace for the moment. The error is occurring sometime in the middle of the backup. Last time it was at least 30-40 minutes into a 2 hr backup. I'll see about hacking up pg_basebackup. I'm doing nightly backups, so hopefully (?) it will happen again in the next few days. -selena -- http://chesnok.com
Re: [HACKERS] posix_fadvise missing in the walsender
On Tue, Feb 19, 2013 at 5:48 PM, Simon Riggs si...@2ndquadrant.com wrote: I agree with Merlin and Joachim - if we have the call in one place, we should have it in both. We might want to assess whether we even want to have it in one place. I've seen cases where the existing call hurts performance, because of WAL file recycling. If we don't flush the WAL file blocks out of cache, then they're still there when we recycle the WAL file and we can overwrite them without further I/O. But if we tell the OS to blow them away, then it has to reread them when we try to overwrite the old files, and so we stall waiting for the I/O. I was able to clearly measure this problem back when I was hacking on write scalability, so it's not a purely hypothetical risk. As for the proposed optimization, I tend to doubt that it's a good idea. We're talking about doing extra work to give the OS cache a hint that may not be right anyway. Color me skeptical... but like Heikki, I'm certainly willing to be proven wrong by some actual benchmark results. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
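For reference, the hint being debated is the posix_fadvise(POSIX_FADV_DONTNEED) pattern used when a WAL segment is closed. The sketch below shows that pattern in isolation; the helper name is hypothetical and the guard condition is abbreviated, so treat it as illustrative rather than the exact xlog.c code.

    #include <fcntl.h>

    /*
     * Sketch: when a just-written WAL segment won't be read again locally
     * (e.g. no archiving or streaming needs it), tell the kernel it can
     * drop the segment's pages from cache.
     */
    static void
    hint_wal_segment_done(int fd)
    {
    #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
        /*
         * Note the trade-off discussed above: evicting the pages here means
         * they must be read back in when the file is later recycled and
         * overwritten, which can stall a busy system.
         */
        (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    #endif
    }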
Re: [HACKERS] streaming header too small
On Wed, Feb 20, 2013 at 4:53 PM, Selena Deckelmann sel...@chesnok.com wrote: On Wed, Feb 20, 2013 at 6:23 AM, Magnus Hagander mag...@hagander.net wrote: On Feb 20, 2013 11:29 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 20.02.2013 02:11, Selena Deckelmann wrote: So, I just ran into a similar issue backing up a 9.2.1 server using pg_basebackup version 9.2.3: pg_basebackup: starting background WAL receiver pg_basebackup: streaming header too small: 25 I've had it happen two times in a row. I'm going to try again... But -- what would be helpful here? I can recompile pg_basebackup with more debugging... Hmm, 25 bytes would be the size of the WAL data packet if it contains just the header and no actual WAL data. I think pg_basebackup should accept that - it's not unreasonable that the server might send such a packet sometimes. Looking at the walsender code, it's not supposed to ever send such a packet. But I suspect there's one corner case where it might: if the current send location is at an xlogid boundary, so that we previously sent the last byte from the last WAL segment in the previous logical xlog file, and the WAL flush position points to byte 0 at the beginning of the new WAL file. Both of those positions are in fact the same thing, but we have two different ways to represent the same position. For example, if we've already sent up to WAL position (sentPtr in walsender.c):

xlogid = 4
xrecoff = XLogFileSize

and GetFlushRecPtr() returns:

xlogid = 5
xrecoff = 0

Those both point to the same position. But the check in XLogSend that decides if there is any work to do uses XLByteLE() to check if they are equal, and XLByteLE() treats the latter as greater than the former. So, in that situation, XLogSend() would decide that it has work to do, but there actually isn't, so it would send 0 bytes of WAL data. I'm not sure how GetFlushRecPtr() could return such a position, though. But I'm also not convinced that it can't happen. It would be fairly easy to fix walsender to not send anything in that situation. It would also be easy to fix pg_basebackup to not treat it as an error. We probably should do both. In 9.3, the XLogRecPtr representation changed so that there is only one value for a boundary position like that, so this is a 9.2-only issue. Thanks for the debugging, Heikki! That does sound like a reasonable explanation and fix. Heck, it's probably enough to just put the fix in pg_basebackup since it's gone in 9.3 anyway. But I'd really like to confirm this is the actual situation before considering it fixed, since it's clearly very intermittent. Selena, was this reasonably reproducible for you? Would it be possible to get a network trace of it to show if that's the kind of packet coming across, or to hack up pg_basebackup to print the exact position it was at when the problem occurred? This is happening with a very busy 700 GB system, so I'm going to rule out a network trace for the moment. The error is occurring sometime in the middle of the backup. Last time it was at least 30-40 minutes into a 2 hr backup. Yikes. Yeah, that doesn't seem like the best choice of method :) I'll see about hacking up pg_basebackup. I'm doing nightly backups, so hopefully (?) it will happen again in the next few days. Thanks! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] streaming header too small
On 20.02.2013 17:53, Selena Deckelmann wrote: On Wed, Feb 20, 2013 at 6:23 AM, Magnus Hagander mag...@hagander.net wrote: Selena, was this reasonably reproducible for you? Would it be possible to get a network trace of it to show if that's the kind of packet coming across, or to hack up pg_basebackup to print the exact position it was at when the problem occurred? This is happening with a very busy 700 GB system, so I'm going to rule out a network trace for the moment. The error is occurring sometime in the middle of the backup. Last time it was at least 30-40 minutes into a 2 hr backup. If you could pinpoint the WAL position where the error happens, that would already help somewhat. For starters, put pg_receivexlog into verbose mode, so that it will print a line after each WAL segment. If my theory is correct, the error should happen at xlogid boundaries, i.e. just after finishing a WAL segment whose filename ends with FE. - Heikki
Re: [HACKERS] Materialized views WIP patch
Kevin Grittner kgri...@ymail.com writes: Tom Lane t...@sss.pgh.pa.us wrote: Having said that, I don't think I believe your analysis of why this doesn't work. Well, it wouldn't be the first time you've seen a better way to do something in bison than I was able to see. Taking just the gram.y part of the change which implemented this, and omitting the change in reservedness of MATERIALIZED, I have:

- TRUNCATE opt_table relation_expr_list opt_restart_seqs opt_drop_behavior
+ TRUNCATE trunc_type relation_expr_list opt_restart_seqs opt_drop_behavior

+trunc_type:
+      TABLE               { $$ = OBJECT_TABLE; }
+      | MATERIALIZED VIEW { $$ = OBJECT_MATVIEW; }
+      | /*EMPTY*/         { $$ = OBJECT_UNSPECIFIED; }
+      ;

Yeah, this is a standard gotcha when working with unreserved keywords. You can't factor it like that because then the parser is required to make a shift-reduce decision (on whether to reduce trunc_type to empty) before it can see past the first word. So for instance given

TRUNCATE MATERIALIZED ...
         ^

the parser has to make that decision when it can't see past the word MATERIALIZED and so doesn't know what comes after it. The way to fix it is to not try to use the sub-production but spell it all out:

TRUNCATE TABLE relation_expr_list ...
| TRUNCATE MATERIALIZED VIEW relation_expr_list ...
| TRUNCATE relation_expr_list ...

Now the parser doesn't have to make any shift-reduce decision until after it can see past the first identifier. It's a bit tedious but beats making a word more reserved than it has to be. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Altering Views
Hi, If we want to add a new column to a view, the only way we have found is to drop the view and create it again (with the new column). Now, if some other view depends on the view we want to change, it will not allow us to drop the view - which is fine, of course - but it does not allow us to change it either (add a column), even though our change has no effect on the other views that depend on this one... So what we are doing is:

1. Take the create scripts of all other views that depend on this one
2. Take the create scripts of all other views that depend on any of the views in point 1 (recursively)
3. Drop the view (cascaded)
4. Create the view with the new column
5. Run the create scripts taken from points 1 and 2...

A lot of hassle - just because of adding one more column to the view... Is there any better way to alter a view without the hassle? If not - are there any plans to allow it? Many Thanks, Misa
Re: [HACKERS] Materialized views WIP patch
Tom Lane t...@sss.pgh.pa.us wrote: The way to fix it is to not try to use the sub-production but spell it all out: TRUNCATE TABLE relation_expr_list ... | TRUNCATE MATERIALIZED VIEW relation_expr_list ... | TRUNCATE relation_expr_list ... Now the parser doesn't have to make any shift-reduce decision until after it can see past the first identifier. It's a bit tedious but beats making a word more reserved than it has to be. Thanks! Will do. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Altering Views
On 2013-02-20 17:25:41 +0100, Misa Simic wrote: Hi, If we want to add new column to the view - the only one way (we have found) is: -drop view -create it again ( with new column ) Since 8.4 you can add new columns to the end of a view definition using CREATE OR REPLACE VIEW. http://www.postgresql.org/docs/current/interactive/sql-createview.html : CREATE OR REPLACE VIEW is similar, but if a view of the same name already exists, it is replaced. The new query must generate the same columns that were generated by the existing view query (that is, the same column names in the same order and with the same data types), but it may add additional columns to the end of the list. The calculations giving rise to the output columns may be completely different. Btw, this is more of a pgsql-general type of question... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
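[To make Andres's answer concrete, here is a minimal sketch with hypothetical table and view names; the restriction on existing columns is the documented behavior quoted above:]

CREATE TABLE t (a int, b text, c date);
CREATE VIEW v AS SELECT a, b FROM t;

-- Allowed since 8.4: append a new column at the end of the view's list.
-- The existing columns must keep the same names, order, and types.
CREATE OR REPLACE VIEW v AS SELECT a, b, c FROM t;

-- Not allowed: renaming, reordering, or dropping an existing column this
-- way still requires DROP VIEW and recreation, as in the original question.

Note this sidesteps the dependency problem described above only for the append-a-column case.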
Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system
Jeff Janes jeff.ja...@gmail.com wrote: Alvaro Herrera alvhe...@2ndquadrant.com wrote: So here's v11. I intend to commit this shortly. (I wanted to get it out before lunch, but I introduced a silly bug that took me a bit to fix.) On Windows with Mingw I get this: pgstat.c:4389:8: warning: variable 'found' set but not used [-Wunused-but-set-variable] I don't get that on Linux, but I bet that is just the gcc version (4.6.2 vs 4.4.6) rather than the OS. I get it on Linux with gcc version 4.7.2. It looks like it is just a useless variable. Agreed. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On 2/20/13 6:13 AM, Robert Haas wrote: It might be useful to have an option for this, but I don't think it should be the default. The default should be that the new database is ready to go. Then again, when would you ever actually use that option? You'd use that option if you'd rather get the database mostly-up as soon as possible, and then worry about the materialized views afterwards. Since the proposed materialized views are not available for implicit use in query optimization, the only way an application would make use of them is to access them directly. And if it accesses an unpopulated materialized view, it would fail. So I don't think in the current state a database is mostly-up without the materialized views filled in. I can see the value in having a restore mode that postpones certain nonessential operations, such as creating indexes or certain constraints or even materialized views. But I think the boundaries and expectations for that need to be defined more precisely. For example, a database without constraints might be considered ready for read-only use, without secondary indexes it might be ready for use but slow. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Contrib module xml2 status
Hi I'm not sure if this is a documentation or hackers issue, but the documentation page for contrib module xml2 refers to PostgreSQL 8.4 in the future tense: It is planned that this module will be removed in PostgreSQL 8.4 in favor of the newer standard API http://www.postgresql.org/docs/devel/static/xml2.html Are there any plans to remove this module by a foreseeable date? Regards Ian Barwick
Re: [HACKERS] Materialized views WIP patch
On Wed, February 20, 2013 16:28, Kevin Grittner wrote: Peter Eisentraut pete...@gmx.net wrote: I suppose one should be able to expect that if one finds a view in the information schema, then one should be able to use DROP VIEW to remove it. Which in this case wouldn't work. So I don't think including a materialized view under views or tables is appropriate. Right. I think adding pg_matviews covers the stated use-case enough to answer Erik's concern. Absolutely - I agree pg_matviews is much better than adding deviating information_schema stuff. Thank you, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On Wed, Feb 20, 2013 at 4:26 PM, Kevin Grittner kgri...@ymail.com wrote: Tom Lane t...@sss.pgh.pa.us wrote: The way to fix it is to not try to use the sub-production but spell it all out: TRUNCATE TABLE relation_expr_list ... | TRUNCATE MATERIALIZED VIEW relation_expr_list ... | TRUNCATE relation_expr_list ... Now the parser doesn't have to make any shift-reduce decision until after it can see past the first identifier. It's a bit tedious but beats making a word more reserved than it has to be. Thanks! Will do.

Fwiw I think worrying about stuff like this at this point is probably a waste of time. There'll be a period of bike-shedding where people debate what the command should be called, so worrying about parser conflicts before there's a consensus is kind of pointless.

I would like to know what operations you plan to support independently of the command names. I may have missed this much earlier in the discussion, but then I suspect things have evolved since then. It sounds like you want to support:

1) Selecting from materialized views
2) Manually refreshing materialized views
3) Manually truncating materialized views

And explicitly not support:

1) Automatically rewriting queries to select from matching views
2) Incrementally refreshing materialized views
3) Manual DML against data in materialized views (except truncate, which is kind of DDL)
4) Keeping track of whether the data in the materialized view is up to date

I have to say I find this model a bit odd. It seems the UI you're presenting is that they're basically read-only tables that the database will fill in the data for automatically. My mental model of materialized views is that they're basically views that the database guarantees a different performance characteristic for.

I would expect a materialized view to be up to date all the time. If we don't support incremental updates (which seems like a fine thing not to support in a first cut) then I would expect any DML against the table to mark the view invalid and any queries against it to produce an error (or possibly go to the source tables using the view definition, but that's probably a bad idea for most use cases). I.e. they should behave like a view at all times and have up-to-date information or fail entirely.

I would expect a command like TRUNCATE MATERIALIZED VIEW to exist, but I would expect it to be called something like INVALIDATE rather than TRUNCATE, with dropping the storage a side effect of simply telling the database that it doesn't need to maintain this materialized view. Though I could be convinced truncate is a good name as long as it's documented well. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
Greg Stark st...@mit.edu wrote: Kevin Grittner kgri...@ymail.com wrote: Tom Lane t...@sss.pgh.pa.us wrote: The way to fix it is to not try to use the sub-production but spell it all out: TRUNCATE TABLE relation_expr_list ... | TRUNCATE MATERIALIZED VIEW relation_expr_list ... | TRUNCATE relation_expr_list ... Now the parser doesn't have to make any shift-reduce decision until after it can see past the first identifier. It's a bit tedious but beats making a word more reserved than it has to be. Thanks! Will do.

Fwiw I think worrying about stuff like this at this point is probably a waste of time. There'll be a period of bike-shedding where people debate what the command should be called, so worrying about parser conflicts before there's a consensus is kind of pointless.

That sort of bikeshedding already happened three months ago. Too late now.

I would like to know what operations you plan to support independently of the command names. I may have missed this much earlier in the discussion, but then I suspect things have evolved since then. It sounds like you want to support: 1) Selecting from materialized views 2) Manually refreshing materialized views 3) Manually truncating materialized views And explicitly not support: 1) Automatically rewriting queries to select from matching views 2) Incrementally refreshing materialized views

Those are material for later releases, building on the base of what goes into this release.

3) Manual DML against data in materialized views (except truncate, which is kind of DDL)

There is quite a lot of DDL allowed -- changing tablespace, changing schema, changing the name of the MV or of individual columns in it, changing statistics targets, creating indexes, and other operations are supported.

4) Keeping track of whether the data in the materialized view is up to date

Only keeping track of whether data has been populated or not, for now. There has been agreement that one or more timestamps relating to freshness will make sense, but these are not in the initial patch.

I have to say I find this model a bit odd.

It's not a model, it's a starting point. Several people have already said that even this much is useful and they expect to take advantage of it. I'm doing what I can to not paint us into a corner where it's hard to extend to all the features everyone dreams of, but if we waited for that to commit something, it would never happen.

I would expect a materialized view to be up to date all the time.

I expect that this will eventually be an option, but I expect it will be a seldom-used one. In most cases that I've seen, people want summary data that is reasonably up-to-date without unduly affecting the performance of incremental changes to the underlying data. I've sketched out the roadmap from this patch to all of these options in a vague, hand-wavy fashion, and don't have a lot of interest in taking it farther until we're past 9.3 beta.

If we don't support incremental updates (which seems like a fine thing not to support in a first cut) then I would expect any DML against the table to mark the view invalid and any queries against it to produce an error (or possibly go to the source tables using the view definition, but that's probably a bad idea for most use cases). I.e. they should behave like a view at all times and have up-to-date information or fail entirely.

That would render them completely useless for the use-cases I've seen. If you want to offer a patch to do that as an option, feel free, but I will strongly argue against that as unconditional behavior.

I would expect a command like TRUNCATE MATERIALIZED VIEW to exist, but I would expect it to be called something like INVALIDATE rather than TRUNCATE, with dropping the storage a side effect of simply telling the database that it doesn't need to maintain this materialized view. Though I could be convinced truncate is a good name as long as it's documented well.

I'm trying to minimize the number of new keywords. The initial patch only added MATERIALIZED. I added REFRESH due to near-universal demand for something other than the LOAD MATERIALIZED VIEW I initially used. Have you seen the statistics Tom gave out on how much the size of the executable bloats with every new keyword? Until now nobody has expressed concern about TRUNCATE MATERIALIZED VIEW, so it would take quite a groundswell of concern at this point to even consider a new keyword for this functionality this late in the game. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
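[For readers skimming the thread, the operations Kevin describes look roughly like this in SQL, following the syntax in the patch as posted. The names are hypothetical, and TRUNCATE MATERIALIZED VIEW is exactly the spelling being debated here, so treat this as a sketch of the proposal rather than final syntax:]

CREATE MATERIALIZED VIEW order_totals AS
    SELECT customer_id, sum(amount) AS total
    FROM orders
    GROUP BY customer_id;

-- Re-run the stored query and replace the contents.
REFRESH MATERIALIZED VIEW order_totals;

-- Discard the contents; the MV becomes non-scannable until refreshed.
TRUNCATE MATERIALIZED VIEW order_totals;

-- Examples of the DDL-style maintenance Kevin mentions:
CREATE INDEX ON order_totals (customer_id);
ALTER MATERIALIZED VIEW order_totals RENAME TO customer_totals;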
Re: [HACKERS] Materialized views WIP patch
And explicitly not support: 1) Automatically rewriting queries to select from matching views 2) Incrementally refreshing materialized views 3) Manual DML against data in materialized views (except truncate, which is kind of DDL) 4) Keeping track of whether the data in the materialized view is up to date

The idea is to add the above features over the next few versions of Postgres.

I have to say I find this model a bit odd. It seems the UI you're presenting is that they're basically read-only tables that the database will fill in the data for automatically.

This is what matviews are in other DBMSes.

My mental model of materialized views is that they're basically views that the database guarantees a different performance characteristic for.

How would we do that, exactly? That would be lovely, but unless you have a way to accomplish it ...

I would expect a materialized view to be up to date all the time.

Actually, there's a huge use case for asynchronously updated matviews, so we would not want an implementation which ruled them out. Also there's the argument that synchronously updated matviews have little actual performance advantage over regular dynamic views. Or to put it another way: I could use this feature, as it is, in about 8 different projects I'm currently supporting. I personally can't think of a single project where I currently need synchronously updated matviews. I have in the past, but it's a LOT less frequent than the desire for async, just as the desire for async replication is more common than the desire for sync rep.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On 2/19/13 5:47 PM, Kevin Grittner wrote: When I went to do this, I hit a shift/reduce conflict, because with TABLE being optional it couldn't tell whether: TRUNCATE MATERIALIZED VIEW x, y, z; ... was looking for five relations or three. That goes away with MATERIALIZED escalated to TYPE_FUNC_NAME_KEYWORD. Is that OK? Is TRUNCATE even the right command here? For regular tables TRUNCATE is a fast DELETE, which logically empties the table. For materialized views, there is no deleting, so this command (I suppose?) just invalidates the materialized view. That's not the same thing. Are there TRUNCATE triggers on materialized views? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq
On 20.02.2013 11:42, Amit Kapila wrote: The patch for providing connection string for pg_basebackup, pg_receivexlog, pg_dump and pg_dumpall is attached with this mail.

Thanks. Now that I look at this patch, I realize that we don't actually need these new functions for pg_basebackup and friends after all. We already have PQconninfoParse(), we can just use that.

pg_dump can already take a connection string:

pg_dump "dbname=postgres port=5432"

For consistency with psql and other tools, perhaps we should add a -d option to pg_dump, so that you could do:

pg_dump -d "dbname=postgres port=5432"

It'd be nice to call the option -d or --dbname in all the tools. That's a bit confusing for pg_basebackup and pg_receivexlog, as it can *not* actually be a database name, but it would otherwise be consistent with the other commands.

I came up with the attached three patches. The first adds a -d/--dbname option to pg_basebackup and pg_receivexlog. The second adds it to pg_dump, per above. The third adds it to pg_dumpall.

The third patch is a bit complicated. It first parses the user-specified connection string using PQconninfoParse, so that it can merge in some extra keywords: user, host, password, dbname and fallback_application_name. It then calls PQconnectdbParams with the keyword/value pairs. After making the initial connection to the postgres or template1 database, it calls PQconninfo() to again extract the keyword/value pairs in effect in the connection, and constructs a new connection string from them. That new connection string is then passed to pg_dump on the command line, with the database name appended to it. That seems to work, although it's perhaps a bit Rube Goldbergian. One step of deparsing and parsing could be avoided by keeping the keyword/value pairs from the first PQconninfoParse() call, instead of constructing them again with PQconninfo(). I'll experiment with that tomorrow.

The docs need some improvement. In those commands where you can't pass a database name to the -d/--dbname option, only a connection string, I kept your wording in the docs. But it ought to explain the seemingly strange name for the option, and more. I'll take another whack at that tomorrow as well.

Where does this leave the PQconninfoParseParams/PQconninfodefaultsMerge patch? I'm not sure. Somehow I thought it would be necessary for this work, but it wasn't. I didn't remember that we already have the PQconninfoParse() function, which was enough. So, what's the use case for those functions?

- Heikki

From ebfcff54ae934b38f3bfecfbe8dbe0cbe0573c95 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas heikki.linnakan...@iki.fi
Date: Wed, 20 Feb 2013 20:49:24 +0200
Subject: [PATCH 1/3] Add -d option, for specifying a connection string, to pg_basebackup and pg_receivexlog.

It's a bit strange that the option is called -d/--dbname, when in fact you can *not* pass a database name to it. But it's consistent with other client tools, where you can pass a connection string using the -d option.
---
 doc/src/sgml/ref/pg_basebackup.sgml    | 12 +
 doc/src/sgml/ref/pg_receivexlog.sgml   | 12 +
 src/bin/pg_basebackup/pg_basebackup.c  |  7 ++-
 src/bin/pg_basebackup/pg_receivexlog.c |  7 ++-
 src/bin/pg_basebackup/streamutil.c     | 87 +++-
 src/bin/pg_basebackup/streamutil.h     |  1 +
 6 files changed, 101 insertions(+), 25 deletions(-)

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 2f89f2c..3ab460a 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -359,6 +359,18 @@ PostgreSQL documentation
   <variablelist>
    <varlistentry>
+    <term><option>-d <replaceable class="parameter">connstr</replaceable></option></term>
+    <term><option>--dbname=<replaceable class="parameter">connstr</replaceable></option></term>
+    <listitem>
+     <para>
+      Specifies connection string options, used for connecting to server.
+      These options can be used along with other user supplied options.
+      In case of conflicting options, the user supplied option is used.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><option>-h <replaceable class="parameter">host</replaceable></option></term>
     <term><option>--host=<replaceable class="parameter">host</replaceable></option></term>
     <listitem>
diff --git a/doc/src/sgml/ref/pg_receivexlog.sgml b/doc/src/sgml/ref/pg_receivexlog.sgml
index d06dd1f..314da0e 100644
--- a/doc/src/sgml/ref/pg_receivexlog.sgml
+++ b/doc/src/sgml/ref/pg_receivexlog.sgml
@@ -123,6 +123,18 @@ PostgreSQL documentation
   <variablelist>
    <varlistentry>
+    <term><option>-d <replaceable class="parameter">connstr</replaceable></option></term>
+    <term><option>--dbname=<replaceable class="parameter">connstr</replaceable></option></term>
+    <listitem>
+     <para>
+      Specifies connection string options, used for connecting to server.
+      These options can be used along with
Re: [HACKERS] Materialized views WIP patch
Peter Eisentraut pete...@gmx.net wrote: On 2/19/13 5:47 PM, Kevin Grittner wrote: When I went to do this, I hit a shift/reduce conflict, because with TABLE being optional it couldn't tell whether: TRUNCATE MATERIALIZED VIEW x, y, z; ... was looking for five relations or three. That goes away with MATERIALIZED escalated to TYPE_FUNC_NAME_KEYWORD. Is that OK? Is TRUNCATE even the right command here? For regular tables TRUNCATE is a fast DELETE, which logically empties the table. For materialized views, there is no deleting, so this command (I suppose?) just invalidates the materialized view. That's not the same thing.

Hmm. That's what Greg Stark just said, and I replied that nobody else had raised the issue in over three months. With Greg, that's two now.

TRUNCATE MATERIALIZED VIEW discards any data which has been loaded into the MV, rendering it unavailable for scanning. Internally, it does do a truncate, exactly as for truncating a table. The resulting zero-length heap file is what is used to determine whether a materialized view is scannable. When a CREATE WITH DATA or a REFRESH generates zero rows, a single empty page is created to indicate that the view is scannable (valid to use in queries) but contains no rows.

I agree that INVALIDATE is probably more descriptive, although it seems that there might be some even better word if we bikeshed enough. The question is, is it worth creating a new keyword to call the internal truncate function for materialized views, versus documenting that truncating a materialized view renders it invalid? Again, given the numbers that Tom presented a while back about the space requirements of every new keyword, I don't think this is enough of a gain to justify that. I still squirm a little about having used REFRESH, even though demand for that was overwhelming.

Are there TRUNCATE triggers on materialized views?

No. Nor SELECT, INSERT, UPDATE, or DELETE triggers. You can't create a trigger of any type on a materialized view. I don't think that would interfere with event triggers, though. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
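[A quick sketch of the scannability behavior Kevin describes, with a hypothetical view name; the error text is illustrative, not quoted from the patch:]

CREATE MATERIALIZED VIEW m AS SELECT 1 AS x WITH NO DATA;

SELECT * FROM m;
-- fails, e.g.: ERROR: materialized view "m" has not been populated

REFRESH MATERIALIZED VIEW m;

SELECT * FROM m;   -- now succeeds and returns one row

[The TRUNCATE (or INVALIDATE) under discussion would put a populated view back into the first, non-scannable state.]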
Re: [HACKERS] Materialized views WIP patch
* Kevin Grittner (kgri...@ymail.com) wrote: Peter Eisentraut pete...@gmx.net wrote: Is TRUNCATE even the right command here? For regular tables TRUNCATE is a fast DELETE, which logically empties the table. For materialized views, there is no deleting, so this command (I suppose?) just invalidates the materalized view. That's not the same thing. Hmm. That's what Greg Stark just said, and I replied that nobody else had raised the issue in over three months. With Greg, that's two now. TRUNCATE MAT VIEW seems like the right command to me. Just my 2c. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq
On Wed, Feb 20, 2013 at 2:16 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Where does this leave the PQconninfoParseParams/PQconninfodefaultsMerge patch? I'm not sure. Somehow I thought it would be necessary for this work, but it wasn't. I didn't remember that we already have PQconninfoParse() function, which was enough. So, what's the use case for those functions? I don't think that there is an immediate case. I still think they are useful, and would be more useful if we had some other functions that took PQconninfoOption. But the original reason for their being has been circumvented and I think we should just push them off to next release commit fest and discuss them then. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
On 2/20/13 2:30 PM, Kevin Grittner wrote: Are there TRUNCATE triggers on materialized views? No. Nor SELECT, INSERT, UPDATE, or DELETE triggers. You can't create a trigger of any type on a materialized view. I don't think that would interfere with event triggers, though. More generally, I would consider the invalidation of a materialized view a DDL command, whereas truncating a table is a DML command. This has various implications with triggers, logging, permissions. I think it's not good to mix those two. Also note that un-invalidating==refreshing a materialized view is already a DDL command. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system
Jeff Janes wrote: On Mon, Feb 18, 2013 at 7:50 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: So here's v11. I intend to commit this shortly. (I wanted to get it out before lunch, but I introduced a silly bug that took me a bit to fix.) On Windows with Mingw I get this: pgstat.c:4389:8: warning: variable 'found' set but not used [-Wunused-but-set-variable] I don't get that on Linux, but I bet that is just the gcc version (4.6.2 vs 4.4.6) rather than the OS. It looks like it is just a useless variable, rather than any possible cause of the Windows make check failure (which I can't reproduce).

Hm, I remember looking at that code and thinking that the return there might not be the best idea, because it'd miss running the code that checks for clock skew; the 'found' variable was necessary because the return was to be taken out. But on second thought, a database for which the loop terminates early has already run the clock-skew detection code recently, so that's probably not worth worrying about. IOW I will just remove that variable. Thanks for the notice. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
Peter Eisentraut pete...@gmx.net wrote: On 2/20/13 2:30 PM, Kevin Grittner wrote: Are there TRUNCATE triggers on materialized views? No. Nor SELECT, INSERT, UPDATE, or DELETE triggers. You can't create a trigger of any type on a materialized view. I don't think that would interfere with event triggers, though. More generally, I would consider the invalidation of a materialized view a DDL command, whereas truncating a table is a DML command. The force of that assertion is somewhat undercut by the fact that the ExecuteTruncate() function does exactly what needs to be done to discard the data in a materialized view and make it appear as non-scannable. Even if we dress it up with different syntax, it's not clear that we wouldn't build a TruncateStmt in the parser and pass it through exactly the same execution path. We would just need to look at the relkind to generate a different command tag. This has various implications with triggers, logging, permissions. I think it's not good to mix those two. Could you give a more concrete example of where you see a problem? -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Materialized views WIP patch
I would like to know what operations you plan to support independently of the command names. I may have missed much earlier in the discussion but then I suspect things have evolved since then. It sounds like you want to support: 1) Selecting from materialized views 2) Manually refreshing materialized views 3) Manually truncating materialized views

Maybe plus:

4) Automatically dropping materialized views if underlying table(s) are dropped/altered

Or does this have to be done manually?

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
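[For what it's worth, materialized views in the patch are defined by a stored query much like regular views, so the usual dependency machinery should answer 4). A sketch with hypothetical names, assuming matviews participate in pg_depend the way views do:]

CREATE TABLE t (a int);
CREATE MATERIALIZED VIEW m AS SELECT a FROM t;

DROP TABLE t;
-- should fail: other objects (m) depend on table t

DROP TABLE t CASCADE;
-- drops m along with t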
Re: [HACKERS] Unarchived WALs deleted after crash
On Fri, Feb 15, 2013 at 9:29 AM, Simon Riggs si...@2ndquadrant.com wrote: On 15 February 2013 17:07, Heikki Linnakangas hlinnakan...@vmware.com wrote: Unfortunately in HEAD, the xxx.done file is not created when restoring an archived file because of the absence of the patch. We need to implement that first. Ah yeah, that thing again.. (http://www.postgresql.org/message-id/50df5ba7.6070...@vmware.com) I'm going to forward-port that patch now, before it's forgotten again. It's not clear to me what the holdup was on this, but whatever the bigger patch we've been waiting for is, it can just as well be done on top of the forward-port. Agreed. I wouldn't wait for a better version now.

Related to this, how is this going to affect point releases, and are there any lingering doubts about the mechanism of the fix? This is quite serious given my reliance on archiving, so unless the thinking for point releases is 'real soon' I must backpatch and release it of my own accord until then. Thanks for the attention paid to the bug report, as always. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] indirect toast tuple support
On Tue, Feb 19, 2013 at 2:00 PM, Andres Freund and...@2ndquadrant.com wrote: The only reasonable thing I can see us doing is renaming varattrib_1b_e.va_len_1be into va_type and redefining VARSIZE_1B_E into a switch that maps types into lengths. But I think I would put this off, except placing a comment somewhere, until it gets necessary.

Is there any reason to make it a switch before we actually have two types that happen to have the same length? It might make the code clearer if there were an enum with the (one) type listed, but as long as all the enum values happen to equal the length of the struct, heap_form_tuple and heap_deform_tuple stay marginally faster. (Unless gcc can optimize away the whole switch statement, which might be plausible, especially if it's just a few ?: expressions.)

-- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] indirect toast tuple support
On Thu, Feb 21, 2013 at 2:32 AM, Greg Stark st...@mit.edu wrote: On Tue, Feb 19, 2013 at 2:00 PM, Andres Freund and...@2ndquadrant.com wrote: The only reasonable thing I can see us doing is renaming varattrib_1b_e.va_len_1be into va_type and redefine VARSIZE_1B_E into a switch that maps types into lengths. But I think I would put this off, except placing a comment somewhere, until its gets necessary. Is there any reason to make it a switch before we actually have two types that happen to have the same length? It might make the code clearer if there was an enum with the (one) type listed but as long as all the enum values happen to have the value of the length of the struct then it makes heap_form_tuple and heap_deform_tuple marginally faster. (unless gcc can optimize away the whole switch statement which might be plausible, especially if it's just a few ?: expressions) For what it's worth much of this was discussed at the time. I originally wrote it as an enum and Tom changed it to a length byte, specifically for performance reasons, and said we could always change it back to an enum where some of the values just happened to be equal to their length if we needed it: http://www.postgresql.org/message-id/flat/82tzp7bbbh@mid.bfk.de#82tzp7bbbh@mid.bfk.de -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] posix_fadvise missing in the walsender
On Wed, Feb 20, 2013 at 4:54 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Feb 19, 2013 at 5:48 PM, Simon Riggs si...@2ndquadrant.com wrote: I agree with Merlin and Joachim - if we have the call in one place, we should have it in both. We might want to assess whether we even want to have it in one place. I've seen cases where the existing call hurts performance, because of WAL file recycling.

That's interesting, I hadn't thought about WAL recycling. I now agree that this whole thing is even more complicated: you might have an archive_command set as well, cp for instance, that reads the WAL file in again, possibly right after we called posix_fadvise on it.

It appears to me that the right strategy depends on a few factors:

a) what ratio of your active dataset fits into RAM?
b) how many WAL files do you have?
c) how long does it take for them to get recycled?
d) archive_command set / wal_senders active?

And recommendations for the two extremes would be: if your dataset fits mostly into RAM and you have only a few WAL files that get recycled quickly, then you don't want to evict the WAL file from the buffer cache. On the other hand, if your dataset doesn't fit into RAM and you have many WAL files that take a while to get recycled, then you should consider hinting to the OS.

If you're in that second category (I am) and you're also using archive_command, you could just piggyback the posix_fadvise call onto your archive_command, assuming that the walsender is already done with the file at that moment. And I'm also pretty certain that Robert's setup for the write scalability tests fell into the first category. So given the above, I think it's possible to come up with benchmarks that prove whatever you want to prove :-) Joachim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PostgreSql - access modified rows in prepare transaction command
On Wednesday, February 20, 2013 5:10 PM pierpaolo.cincilla wrote: Thank you Heikki for your reply. As you suggest, I will explain better what I'm trying to accomplish. I'm writing a distributed two-phase-commit termination protocol that works in this manner:

1) Each site has a replica of the database. A site A performs a transaction t1 and prepares it (PREPARE TRANSACTION 't1'). Then it atomically broadcasts a certification request for transaction t1 *along with its writeset* (values updated by t1) to the other sites.

2) When a site receives the certification request for transaction t1, it does the certification (checks that there are no concurrent conflicting transactions). If the certification succeeds, then 2a) if the transaction is local (i.e. originated at that site), it commits the transaction (COMMIT PREPARED 't1'). 2b) If the transaction is remote (i.e. prepared at another site), *it applies the writeset of transaction t1 locally* (UPDATE command) to reflect the modifications in its local replica of the database.

The problem is that if I can't fetch the writeset of a transaction in phase 1 (before the commit request), then when I certify the transaction at another site I can't apply the updates performed by the remote transaction right away; I have to wait for the originating site to commit the transaction and send back its writeset (now visible). That would be very bad, because it adds an extra round to the algorithm.

I think the one possible way to get the transaction data at the point you need it will be through the WAL, but that will also not be straightforward: you would need to decode the WAL and find the corresponding data there. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
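[For reference, the two-phase-commit commands involved look like this; a minimal sketch with a hypothetical table, where the open problem above is exactly that the UPDATE's effects are invisible to other sessions between PREPARE and COMMIT PREPARED:]

-- requires max_prepared_transactions > 0 on the server

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Phase 1: dissociate the transaction from the session and persist it.
-- Its changes are durable but not yet visible to other transactions.
PREPARE TRANSACTION 't1';

-- ... certification request broadcast to the other sites here ...

-- Phase 2: finalize, possibly from a different session.
COMMIT PREPARED 't1';
-- or: ROLLBACK PREPARED 't1';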
Re: [HACKERS] Materialized views WIP patch
On Wed, Feb 20, 2013 at 9:25 PM, Peter Eisentraut pete...@gmx.net wrote: More generally, I would consider the invalidation of a materialized view a DDL command, whereas truncating a table is a DML command. That's not entirely true. From the database's point of view, TRUNCATE is in many ways actually DDL. I actually don't really dislike using TRUNCATE for this command. I was more asking about whether this meant people were thinking of the view as a thing where you could control the data in it by hand and could have the view be empty rather than just not valid. The way I was thinking about it, whatever the command is named, you might be able to tell the database to drop the storage associated with the view but that would make the view invalid until it was refreshed. It wouldn't make it appear to be empty. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers